21 Dec, 2012

1 commit


18 Dec, 2012

1 commit


17 Dec, 2012

1 commit

  • Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
    "There are three implementations for NUMA balancing, this tree
    (balancenuma), numacore which has been developed in tip/master and
    autonuma which is in aa.git.

    In almost all respects balancenuma is the dumbest of the three because
    its main impact is on the VM side with no attempt to be smart about
    scheduling. In the interest of getting the ball rolling, it would be
    desirable to see this much merged for 3.8 with the view to building
    scheduler smarts on top and adapting the VM where required for 3.9.

    The most recent set of comparisons available from different people are

    mel: https://lkml.org/lkml/2012/12/9/108
    mingo: https://lkml.org/lkml/2012/12/7/331
    tglx: https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

    The results are a mixed bag. In my own tests, balancenuma does
    reasonably well. It's dumb as rocks and does not regress against
    mainline. On the other hand, Ingo's tests show that balancenuma is
    incapable of converging for the workloads driven by perf which is bad
    but is potentially explained by the lack of scheduler smarts. Thomas'
    results show balancenuma improves on mainline but falls far short of
    numacore or autonuma. Srikar's results indicate we all suffer on a
    large machine with imbalanced node sizes.

    My own testing showed that recent numacore results have improved
    dramatically, particularly in the last week but not universally.
    We've butted heads heavily on system CPU usage and high levels of
    migration even when it shows that overall performance is better.
    There are also cases where it regresses. Of interest is that for
    specjbb in some configurations it will regress for lower numbers of
    warehouses and show gains for higher numbers which is not reported by
    the tool by default and sometimes missed in reports. Recently I
    reported for numacore that the JVM was crashing with
    NullPointerExceptions but currently it's unclear what the source of
    this problem is. Initially I thought it was in how numacore batch
    handles PTEs but I no longer think this is the case. It's possible
    numacore is just able to trigger it due to higher rates of migration.

    These reports were quite late in the cycle so I/we would like to start
    with this tree as it contains much of the code we can agree on and has
    not changed significantly over the last 2-3 weeks."

    * tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
    mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
    mm/rmap: Convert the struct anon_vma::mutex to an rwsem
    mm: migrate: Account a transhuge page properly when rate limiting
    mm: numa: Account for failed allocations and isolations as migration failures
    mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
    mm: numa: Add THP migration for the NUMA working set scanning fault case.
    mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
    mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
    mm: sched: numa: Control enabling and disabling of NUMA balancing
    mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
    mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely tasknode relationships
    mm: numa: migrate: Set last_nid on newly allocated page
    mm: numa: split_huge_page: Transfer last_nid on tail page
    mm: numa: Introduce last_nid to the page frame
    sched: numa: Slowly increase the scanning period as NUMA faults are handled
    mm: numa: Rate limit setting of pte_numa if node is saturated
    mm: numa: Rate limit the amount of memory that is migrated between nodes
    mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
    mm: numa: Migrate pages handled during a pmd_numa hinting fault
    mm: numa: Migrate on reference policy
    ...

    Linus Torvalds
     

13 Dec, 2012

1 commit

  • Pull networking changes from David Miller:

    1) Allow to dump, monitor, and change the bridge multicast database
    using netlink. From Cong Wang.

    2) RFC 5961 TCP blind data injection attack mitigation, from Eric
    Dumazet.

    3) Networking user namespace support from Eric W. Biederman.

    4) tuntap/virtio-net multiqueue support by Jason Wang.

    5) Support for checksum offload of encapsulated packets (basically,
    tunneled traffic can still be checksummed by HW). From Joseph
    Gasparakis.

    6) Allow BPF filter access to VLAN tags, from Eric Dumazet and
    Daniel Borkmann.

    7) Bridge port parameters over netlink and BPDU blocking support
    from Stephen Hemminger.

    8) Improve data access patterns during inet socket demux by rearranging
    socket layout, from Eric Dumazet.

    9) TIPC protocol updates and cleanups from Ying Xue, Paul Gortmaker, and
    Jon Maloy.

    10) Update TCP socket hash sizing to be more in line with current day
    realities. The existing heuristics were chosen a decade ago.
    From Eric Dumazet.

    11) Fix races, queue bloat, and excessive wakeups in ATM and
    associated drivers, from Krzysztof Mazur and David Woodhouse.

    12) Support DOVE (Distributed Overlay Virtual Ethernet) extensions
    in VXLAN driver, from David Stevens.

    13) Add "oops_only" mode to netconsole, from Amerigo Wang.

    14) Support set and query of VEB/VEPA bridge mode via PF_BRIDGE, also
    allow DCB netlink to work on namespaces other than the initial
    namespace. From John Fastabend.

    15) Support PTP in the Tigon3 driver, from Matt Carlson.

    16) tun/vhost zero copy fixes and improvements, plus turn it on
    by default, from Michael S. Tsirkin.

    17) Support per-association statistics in SCTP, from Michele
    Baldessari.

    And many, many, driver updates, cleanups, and improvements. Too
    numerous to mention individually.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1722 commits)
    net/mlx4_en: Add support for destination MAC in steering rules
    net/mlx4_en: Use generic etherdevice.h functions.
    net: ethtool: Add destination MAC address to flow steering API
    bridge: add support of adding and deleting mdb entries
    bridge: notify mdb changes via netlink
    ndisc: Unexport ndisc_{build,send}_skb().
    uapi: add missing netconf.h to export list
    pkt_sched: avoid requeues if possible
    solos-pci: fix double-free of TX skb in DMA mode
    bnx2: Fix accidental reversions.
    bna: Driver Version Updated to 3.1.2.1
    bna: Firmware update
    bna: Add RX State
    bna: Rx Page Based Allocation
    bna: TX Intr Coalescing Fix
    bna: Tx and Rx Optimizations
    bna: Code Cleanup and Enhancements
    ath9k: check pdata variable before dereferencing it
    ath5k: RX timestamp is reported at end of frame
    ath9k_htc: RX timestamp is reported at end of frame
    ...

    Linus Torvalds
     

12 Dec, 2012

3 commits

  • Pull x86 timer update from Ingo Molnar:
    "This tree includes HPET fixes and also implements a calibration-free,
    TSC match driven APIC timer interrupt mode: 'TSC deadline mode'
    supported in SandyBridge and later CPUs."

    * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: hpet: Fix inverted return value check in arch_setup_hpet_msi()
    x86: hpet: Fix masking of MSI interrupts
    x86: apic: Use tsc deadline for oneshot when available

    Linus Torvalds
     
  • Pull x86 BSP hotplug changes from Ingo Molnar:
    "This tree enables CPU#0 (the boot processor) to be onlined/offlined on
    x86, just like any other CPU. Enabled on Intel CPUs for now.

    Allowing this required the identification and fixing of latent CPU#0
    assumptions (such as CPU#0 initializations, etc.) in the x86
    architecture code, plus the identification of barriers to
    BSP-offlining, such as active PIC interrupts which can only be
    serviced on the BSP.

    It's behind a default-off option, and there's a debug option that
    allows the automatic testing of this feature.

    The motivation of this feature is to allow and prepare for true
    CPU-hotplug hardware support: recent changes to MCE support enable us
    to detect a deteriorating but not yet hard-failing L1/L2 cache on a
    CPU that could be soft-unplugged - or a failing L3 cache on a
    multi-socket system.

    Note that true hardware hot-plug is not yet fully enabled by this,
    because that requires a special platform wakeup sequence to be sent to
    the freshly powered up CPU#0. Future patches for this are planned,
    once such a platform exists. Chicken and egg"

    * 'x86-bsp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, topology: Debug CPU0 hotplug
    x86/i387.c: Initialize thread xstate only on CPU0 only once
    x86, hotplug: Handle retrigger irq by the first available CPU
    x86, hotplug: The first online processor saves the MTRR state
    x86, hotplug: During CPU0 online, enable x2apic, set_numa_node.
    x86, hotplug: Wake up CPU0 via NMI instead of INIT, SIPI, SIPI
    x86-32, hotplug: Add start_cpu0() entry point to head_32.S
    x86-64, hotplug: Add start_cpu0() entry point to head_64.S
    kernel/cpu.c: Add comment for priority in cpu_hotplug_pm_callback
    x86, hotplug, suspend: Online CPU0 for suspend or hibernate
    x86, hotplug: Support functions for CPU0 online/offline
    x86, topology: Don't offline CPU0 if any PIC irq can not be migrated out of it
    x86, Kconfig: Add config switch for CPU0 hotplug
    doc: Add x86 CPU0 online/offline feature

    Linus Torvalds
     
  • Pull perf updates from Ingo Molnar:
    "Lots of activity:

    211 files changed, 8328 insertions(+), 4116 deletions(-)

    most of it on the tooling side.

    Main changes:

    * ftrace enhancements and fixes from Steve Rostedt.

    * uprobes fixes, cleanups and preparation for the ARM port from Oleg
    Nesterov.

    * UAPI fixes, from David Howells - prepares the arch/x86 UAPI
    transition

    * Separate perf tests into multiple objects, one per test, from Jiri
    Olsa.

    * Make hardware event translations available in sysfs, from Jiri
    Olsa.

    * Fixes to /proc/pid/maps parsing, preparatory to supporting data
    maps, from Namhyung Kim

    * Implement ui_progress for GTK, from Namhyung Kim

    * Add framework for automated perf_event_attr tests, where tools with
    different command line options will be run from a 'perf test', via
    python glue, and the perf syscall will be intercepted to verify
    that the perf_event_attr fields set by the tool are those expected,
    from Jiri Olsa

    * Add a 'link' method for hists, so that we can have the leader with
    buckets for all the entries in all the hists. This new method is
    now used in the default 'diff' output, making the sum of the
    'baseline' column be 100%, eliminating blind spots.

    * libtraceevent fixes for compiler warnings, trying to make perf
    build on some distros, like fedora 14, 32-bit; some of the warnings
    really pointed to real bugs.

    * Add a browser for 'perf script' and make it available from the
    report and annotate browsers. It does filtering to find the
    scripts that handle events found in the perf.data file used. From
    Feng Tang

    * perf inject changes to allow showing where a task sleeps, from
    Andrew Vagin.

    * Makefile improvements from Namhyung Kim.

    * Add --pre and --post command hooks in 'stat', from Peter Zijlstra.

    * Don't stop synthesizing threads when one vanishes, this is for the
    existing threads when we start a tool like trace.

    * Use sched:sched_stat_runtime to provide a thread summary, this
    produces the same output as the 'trace summary' subcommand of
    tglx's original "trace" tool.

    * Support interrupted syscalls in 'trace'

    * Add an event duration column and filter in 'trace'.

    * There are references to the man pages in some tools, so try to
    build Documentation when installing, warning the user if that is
    not possible, from Borislav Petkov.

    * Give user better message if precise is not supported, from David
    Ahern.

    * Try to find cross-built objdump path by using the session
    environment information in the perf.data file header, from Irina
    Tirdea, original patch and idea by Namhyung Kim.

    * Displays more output on features check for make V=1, so that one can
    figure out what is happening by looking at gcc output, etc. From
    Jiri Olsa.

    * Add on_exit implementation for systems without one, e.g. Android,
    from Bernhard Rosenkraenzer.

    * Only process events for vcpus of interest, helps handling large
    number of events, from David Ahern.

    * Cross compilation fixes for Android, from Irina Tirdea.

    * Add documentation on compiling for Android, from Irina Tirdea.

    * perf diff improvements from Jiri Olsa.

    * Target (task/user/cpu/syswide) handling improvements, from Namhyung
    Kim.

    * Add support in 'trace' for tracing workload given by command line,
    from Namhyung Kim.

    * ... and much more."

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (194 commits)
    uprobes: Use percpu_rw_semaphore to fix register/unregister vs dup_mmap() race
    perf evsel: Introduce is_group_member method
    perf powerpc: Use uapi/unistd.h to fix build error
    tools: Pass the target in descend
    tools: Honour the O= flag when tool build called from a higher Makefile
    tools: Define a Makefile function to do subdir processing
    perf ui: Always compile browser setup code
    perf ui: Add ui_progress__finish()
    perf ui gtk: Implement ui_progress functions
    perf ui: Introduce generic ui_progress helper
    perf ui tui: Move progress.c under ui/tui directory
    perf tools: Add basic event modifier sanity check
    perf tools: Omit group members from perf_evlist__disable/enable
    perf tools: Ensure single disable call per event in record comand
    perf tools: Fix 'disabled' attribute config for record command
    perf tools: Fix attributes for '{}' defined event groups
    perf tools: Use sscanf for parsing /proc/pid/maps
    perf tools: Add gtk. config option for launching GTK browser
    perf tools: Fix compile error on NO_NEWT=1 build
    perf hists: Initialize all of he->stat with zeroes
    ...

    Linus Torvalds
     

11 Dec, 2012

1 commit


17 Nov, 2012

1 commit

  • RCU callback execution can add significant OS jitter and also can
    degrade both scheduling latency and, in asymmetric multiprocessors,
    energy efficiency. This commit therefore adds the ability for selected
    CPUs ("rcu_nocbs=" boot parameter) to have their callbacks offloaded
    to kthreads. If the "rcu_nocb_poll" boot parameter is also specified,
    these kthreads will do polling, removing the need for the offloaded
    CPUs to do wakeups. At least one CPU must be doing normal callback
    processing: currently CPU 0 cannot be selected as a no-CBs CPU.
    In addition, attempts to offline the last normal-CBs CPU will fail.

    This feature was inspired by Jim Houston's and Joe Korty's JRCU, and
    this commit includes fixes to problems located by Fengguang Wu's
    kbuild test robot.

    [ paulmck: Added gfp.h include file as suggested by Fengguang Wu. ]

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
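
    The selection rules described in this commit message can be sketched as
    follows. This is only an illustrative model, not kernel code: the function
    names and the "1-3,5" list syntax handling are assumptions made for the
    example. It shows that excluding CPU 0 is enough to guarantee that at
    least one CPU keeps doing normal callback processing.

```python
def parse_cpu_list(spec):
    """Parse a CPU list such as "1-3,5" into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def check_nocb_selection(spec, nr_cpus):
    """Return the set of no-CBs CPUs, or raise ValueError if the
    selection breaks the rules stated in the commit message."""
    nocb = parse_cpu_list(spec)
    # CPU 0 must keep processing callbacks normally, so it can never be
    # offloaded; this also leaves at least one normal-CBs CPU.
    if 0 in nocb:
        raise ValueError("CPU 0 cannot be selected as a no-CBs CPU")
    # Hypothetical sanity check: every selected CPU must exist.
    if not nocb <= set(range(nr_cpus)):
        raise ValueError("CPU number out of range")
    return nocb


if __name__ == "__main__":
    print(sorted(check_nocb_selection("1-3,5", 8)))
```

    Under these assumptions a selection such as "1-3,5" on an 8-CPU machine
    is accepted, while "0-3" is rejected because it includes CPU 0.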
     

16 Nov, 2012

1 commit


15 Nov, 2012

1 commit

  • If CONFIG_BOOTPARAM_HOTPLUG_CPU0 is turned on, CPU0 is hotpluggable. Otherwise,
    CPU0 is not hotpluggable by default, and the kernel parameter cpu0_hotplug
    enables the CPU0 online/offline feature.

    The documentation points out two known CPU0 dependencies. First, resume from
    hibernate or suspend always starts from CPU0, so hibernate and suspend are
    prevented if CPU0 is offline. Second, PIC interrupts always go to CPU0.

    It's said that some machines may depend on CPU0 to poweroff/reboot. But I
    haven't seen such a dependency on the few machines I tested.

    Please let me know if you see any CPU0 dependencies on your machine.

    Signed-off-by: Fenghua Yu
    Link: http://lkml.kernel.org/r/1352835171-3958-2-git-send-email-fenghua.yu@intel.com
    Signed-off-by: H. Peter Anvin

    Fenghua Yu
     

02 Nov, 2012

2 commits

  • Add trace_options to the kernel command line parameter to be able to
    set options at early boot. For example, to enable stack dumps of
    events, add the following:

    trace_options=stacktrace

    Combined with the trace_event option, this gives you not only
    traces of the events but also the stack dumps with them.

    Requested-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • If the TSC deadline mode is supported, LAPIC timer one-shot mode can be
    implemented using IA32_TSC_DEADLINE MSR. An interrupt will be generated
    when the TSC value equals or exceeds the value in the IA32_TSC_DEADLINE
    MSR.

    This enables us to skip the APIC calibration during boot. Also, in
    xapic mode, this enables us to skip the uncached apic access to re-arm
    the APIC timer.

    As this timer ticks at the high frequency TSC rate, we use the
    TSC_DIVISOR (32) to work with the 32-bit restrictions in the
    clockevent APIs to avoid 64-bit divides etc (frequency is u32 and
    "unsigned long" in the set_next_event(), max_delta limits the next
    event to 32-bit for 32-bit kernel).

    Signed-off-by: Suresh Siddha
    Cc: venki@google.com
    Cc: len.brown@intel.com
    Link: http://lkml.kernel.org/r/1350941878.6017.31.camel@sbsiddha-desk.sc.intel.com
    Signed-off-by: Thomas Gleixner

    Suresh Siddha
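
    The arithmetic behind TSC_DIVISOR can be sketched as below. This is a
    rough model under stated assumptions, not the kernel implementation: the
    function names are invented for the example, and the 3 GHz TSC rate is a
    hypothetical value. Only the divisor of 32 and the 32-bit limits come
    from the commit message.

```python
TSC_DIVISOR = 32        # divisor named in the commit message
U32_MAX = 2**32 - 1     # largest value a 32-bit field can hold


def clockevent_freq_hz(tsc_khz):
    """Frequency presented to the clockevent layer: TSC rate / TSC_DIVISOR."""
    return (tsc_khz // TSC_DIVISOR) * 1000


def deadline_for(now_tsc, delta_ticks):
    """TSC value to program as the deadline. Each clockevent tick is
    assumed to be worth TSC_DIVISOR raw TSC cycles."""
    return now_tsc + delta_ticks * TSC_DIVISOR


def max_sleep_seconds(tsc_khz, max_delta=U32_MAX):
    """Longest one-shot interval expressible with a 32-bit delta."""
    return max_delta / clockevent_freq_hz(tsc_khz)


if __name__ == "__main__":
    tsc_khz = 3_000_000  # hypothetical 3 GHz TSC
    print(clockevent_freq_hz(tsc_khz))
    print(deadline_for(1000, 10))
    print(max_sleep_seconds(tsc_khz))
```

    With these assumed numbers the scaled frequency fits comfortably in a
    u32, and a 32-bit delta covers tens of seconds instead of the roughly
    1.4 seconds it would cover at the raw TSC rate.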
     

15 Oct, 2012

1 commit

  • Pull module signing support from Rusty Russell:
    "module signing is the highlight, but it's an all-over David Howells frenzy..."

    Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

    * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
    X.509: Fix indefinite length element skip error handling
    X.509: Convert some printk calls to pr_devel
    asymmetric keys: fix printk format warning
    MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
    MODSIGN: Make mrproper should remove generated files.
    MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
    MODSIGN: Use the same digest for the autogen key sig as for the module sig
    MODSIGN: Sign modules during the build process
    MODSIGN: Provide a script for generating a key ID from an X.509 cert
    MODSIGN: Implement module signature checking
    MODSIGN: Provide module signing public keys to the kernel
    MODSIGN: Automatically generate module signing keys if missing
    MODSIGN: Provide Kconfig options
    MODSIGN: Provide gitignore and make clean rules for extra files
    MODSIGN: Add FIPS policy
    module: signature checking hook
    X.509: Add a crypto key parser for binary (DER) X.509 certificates
    MPILIB: Provide a function to read raw data into an MPI
    X.509: Add an ASN.1 decoder
    X.509: Add simple ASN.1 grammar compiler
    ...

    Linus Torvalds
     

10 Oct, 2012

2 commits

  • Pull NFS client updates from Trond Myklebust:
    "Features include:

    - Remove CONFIG_EXPERIMENTAL dependency from NFSv4.1
    Aside from the issues discussed at the LKS, distros are shipping
    NFSv4.1 with all the trimmings.
    - Fix fdatasync()/fsync() for the corner case of a server reboot.
    - NFSv4 OPEN access fix: finally distinguish correctly between
    open-for-read and open-for-execute permissions in all situations.
    - Ensure that the TCP socket is closed when we're in CLOSE_WAIT
    - More idmapper bugfixes
    - Lots of pNFS bugfixes and cleanups to remove unnecessary state and
    make the code easier to read.
    - In cases where a pNFS read or write fails, allow the client to
    resume trying layoutgets after two minutes of read/write-
    through-mds.
    - More net namespace fixes to the NFSv4 callback code.
    - More net namespace fixes to the NFSv3 locking code.
    - More NFSv4 migration preparatory patches.
    Including patches to detect network trunking in both NFSv4 and
    NFSv4.1
    - pNFS block updates to optimise LAYOUTGET calls."

    * tag 'nfs-for-3.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (113 commits)
    pnfsblock: cleanup nfs4_blkdev_get
    NFS41: send real read size in layoutget
    NFS41: send real write size in layoutget
    NFS: track direct IO left bytes
    NFSv4.1: Cleanup ugliness in pnfs_layoutgets_blocked()
    NFSv4.1: Ensure that the layout sequence id stays 'close' to the current
    NFSv4.1: Deal with seqid wraparound in the pNFS return-on-close code
    NFSv4 set open access operation call flag in nfs4_init_opendata_res
    NFSv4.1: Remove the dependency on CONFIG_EXPERIMENTAL
    NFSv4 reduce attribute requests for open reclaim
    NFSv4: nfs4_open_done first must check that GETATTR decoded a file type
    NFSv4.1: Deal with wraparound when updating the layout "barrier" seqid
    NFSv4.1: Deal with wraparound issues when updating the layout stateid
    NFSv4.1: Always set the layout stateid if this is the first layoutget
    NFSv4.1: Fix another refcount issue in pnfs_find_alloc_layout
    NFSv4: don't put ACCESS in OPEN compound if O_EXCL
    NFSv4: don't check MAY_WRITE access bit in OPEN
    NFS: Set key construction data for the legacy upcall
    NFSv4.1: don't do two EXCHANGE_IDs on mount
    NFS: nfs41_walk_client_list(): re-lock before iterating
    ...

    Linus Torvalds
     
  • We do a very simple search for a particular string appended to the module
    (which is cache-hot and about to be SHA'd anyway). There's both a config
    option and a boot parameter which control whether we accept or fail with
    unsigned modules and modules that are signed with an unknown key.

    If module signing is enabled, the kernel will be tainted if a module is
    loaded that is unsigned or has a signature for which we don't have the
    key.

    (Useful feedback and tweaks by David Howells)

    Signed-off-by: Rusty Russell
    Signed-off-by: David Howells
    Signed-off-by: Rusty Russell

    Rusty Russell
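
    The check described in this commit message can be modelled roughly as
    below. This is a sketch under assumptions, not the kernel code: the
    marker text, the layout (signature bytes following the marker), the
    key_known callback and the returned strings are all hypothetical choices
    made for illustration.

```python
# Assumed marker text; the commit message only says "a particular string".
MARKER = b"~Module signature appended~\n"


def split_signed_module(blob):
    """Return (module_bytes, signature_bytes), with signature_bytes set
    to None when the marker is not found."""
    pos = blob.find(MARKER)  # the "very simple search"
    if pos < 0:
        return blob, None
    return blob[:pos], blob[pos + len(MARKER):]


def load_decision(blob, key_known, enforce):
    """Decide what to do with a module image.

    key_known: callable taking the signature bytes and returning True if
               the signing key is one we have (hypothetical stand-in for
               real signature verification).
    enforce:   models the config option / boot parameter that makes
               unsigned or unknown-key modules fail to load.
    """
    _module, sig = split_signed_module(blob)
    if sig is not None and key_known(sig):
        return "load"
    return "reject" if enforce else "load+taint"


if __name__ == "__main__":
    image = b"\x7fELF...payload" + MARKER + b"SIGNATURE"
    print(load_decision(image, lambda s: True, enforce=True))
    print(load_decision(b"\x7fELF...payload", lambda s: True, enforce=False))
```

    The two outcomes for an untrusted module mirror the text above: with
    enforcement the load fails, without it the module loads and the kernel
    is tainted.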
     

03 Oct, 2012

2 commits

  • Pull security subsystem updates from James Morris:
    "Highlights:

    - Integrity: add local fs integrity verification to detect offline
    attacks
    - Integrity: add digital signature verification
    - Simple stacking of Yama with other LSMs (per LSS discussions)
    - IBM vTPM support on ppc64
    - Add new driver for Infineon I2C TIS TPM
    - Smack: add rule revocation for subject labels"

    Fixed conflicts with the user namespace support in kernel/auditsc.c and
    security/integrity/ima/ima_policy.c.

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits)
    Documentation: Update git repository URL for Smack userland tools
    ima: change flags container data type
    Smack: setprocattr memory leak fix
    Smack: implement revoking all rules for a subject label
    Smack: remove task_wait() hook.
    ima: audit log hashes
    ima: generic IMA action flag handling
    ima: rename ima_must_appraise_or_measure
    audit: export audit_log_task_info
    tpm: fix tpm_acpi sparse warning on different address spaces
    samples/seccomp: fix 31 bit build on s390
    ima: digital signature verification support
    ima: add support for different security.ima data types
    ima: add ima_inode_setxattr/removexattr function and calls
    ima: add inode_post_setattr call
    ima: replace iint spinblock with rwlock/read_lock
    ima: allocating iint improvements
    ima: add appraise action keywords and default rules
    ima: integrity appraisal extension
    vfs: move ima_file_free before releasing the file
    ...

    Linus Torvalds
     
  • Pull first round of SCSI updates from James Bottomley:
    "This is a large set of updates, mostly for drivers (qla2xxx [including
    support for new 83xx based card], qla4xxx, mpt2sas, bfa, zfcp, hpsa,
    be2iscsi, isci, lpfc, ipr, ibmvfc, ibmvscsi, megaraid_sas).

    There's also a rework for tape adding virtually unlimited numbers of
    tape drives plus a set of dif fixes for sd and a fix for a live lock
    on hot remove of SCSI devices.

    This round includes a signed tag pull of isci-for-3.6

    Signed-off-by: James Bottomley"

    Fix up trivial conflict in drivers/scsi/qla2xxx/qla_nx.c due to new PCI
    helper function use in a function that was removed by this pull.

    * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (198 commits)
    [SCSI] st: remove st_mutex
    [SCSI] sd: Ensure we correctly disable devices with unknown protection type
    [SCSI] hpsa: gen8plus Smart Array IDs
    [SCSI] qla4xxx: Update driver version to 5.03.00-k1
    [SCSI] qla4xxx: Disable generating pause frames for ISP83XX
    [SCSI] qla4xxx: Fix double clearing of risc_intr for ISP83XX
    [SCSI] qla4xxx: IDC implementation for Loopback
    [SCSI] qla4xxx: update copyrights in LICENSE.qla4xxx
    [SCSI] qla4xxx: Fix panic while rmmod
    [SCSI] qla4xxx: Fail probe_adapter if IRQ allocation fails
    [SCSI] qla4xxx: Prevent MSI/MSI-X falling back to INTx for ISP82XX
    [SCSI] qla4xxx: Update idc reg in case of PCI AER
    [SCSI] qla4xxx: Fix double IDC locking in qla4_8xxx_error_recovery
    [SCSI] qla4xxx: Clear interrupt while unloading driver for ISP83XX
    [SCSI] qla4xxx: Print correct IDC version
    [SCSI] qla4xxx: Added new mbox cmd to pass driver version to FW
    [SCSI] scsi_dh_alua: Enable STPG for unavailable ports
    [SCSI] scsi_remove_target: fix softlockup regression on hot remove
    [SCSI] ibmvscsi: Fix host config length field overflow
    [SCSI] ibmvscsi: Remove backend abstraction
    ...

    Linus Torvalds
     

02 Oct, 2012

4 commits

  • An optional boot parameter is introduced to allow client
    administrators to specify a string that the Linux NFS client can
    insert into its nfs_client_id4 id string, to make it both more
    globally unique, and to ensure that it doesn't change even if the
    client's nodename changes.

    If this boot parameter is not specified, the client's nodename is
    used, as before.

    Client installation procedures can create a unique string (typically,
    a UUID) which remains unchanged during the lifetime of that client
    instance. This works just like creating a UUID for the label of the
    system's root and boot volumes.

    Signed-off-by: Chuck Lever
    Signed-off-by: Trond Myklebust

    Chuck Lever
     
  • Pull x86/smap support from Ingo Molnar:
    "This adds support for the SMAP (Supervisor Mode Access Prevention) CPU
    feature on Intel CPUs: a hardware feature that prevents unintended
    user-space data access from kernel privileged code.

    It's turned on automatically when possible.

    This, in combination with SMEP, makes it even harder to exploit kernel
    bugs such as NULL pointer dereferences."

    Fix up trivial conflict in arch/x86/kernel/entry_64.S due to newly added
    includes right next to each other.

    * 'x86-smap-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, smep, smap: Make the switching functions one-way
    x86, suspend: On wakeup always initialize cr4 and EFER
    x86-32: Start out eflags and cr4 clean
    x86, smap: Do not abuse the [f][x]rstor_checking() functions for user space
    x86-32, smap: Add STAC/CLAC instructions to 32-bit kernel entry
    x86, smap: Reduce the SMAP overhead for signal handling
    x86, smap: A page fault due to SMAP is an oops
    x86, smap: Turn on Supervisor Mode Access Prevention
    x86, smap: Add STAC and CLAC instructions to control user space access
    x86, uaccess: Merge prototypes for clear_user/__clear_user
    x86, smap: Add a header file with macros for STAC/CLAC
    x86, alternative: Add header guards to
    x86, alternative: Use .pushsection/.popsection
    x86, smap: Add CR4 bit for SMAP
    x86-32, mm: The WP test should be done on a kernel page

    Linus Torvalds
     
  • Pull x86/fpu update from Ingo Molnar:
    "The biggest change is the addition of the non-lazy (eager) FPU saving
    support model and enabling it on CPUs with optimized xsaveopt/xrstor
    FPU state saving instructions.

    There are also various Sparse fixes"

    Fix up trivial add-add conflict in arch/x86/kernel/traps.c

    * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, kvm: fix kvm's usage of kernel_fpu_begin/end()
    x86, fpu: remove cpu_has_xmm check in the fx_finit()
    x86, fpu: make eagerfpu= boot param tri-state
    x86, fpu: enable eagerfpu by default for xsaveopt
    x86, fpu: decouple non-lazy/eager fpu restore from xsave
    x86, fpu: use non-lazy fpu restore for processors supporting xsave
    lguest, x86: handle guest TS bit for lazy/non-lazy fpu host models
    x86, fpu: always use kernel_fpu_begin/end() for in-kernel FPU usage
    x86, kvm: use kernel_fpu_begin/end() in kvm_load/put_guest_fpu()
    x86, fpu: remove unnecessary user_fpu_end() in save_xstate_sig()
    x86, fpu: drop_fpu() before restoring new state from sigframe
    x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels
    x86, fpu: Consolidate inline asm routines for saving/restoring fpu state
    x86, signal: Cleanup ifdefs and is_ia32, is_x32

    Linus Torvalds
     
  • Pull x86/asm changes from Ingo Molnar:
    "The one change that stands out is the alternatives patching change
    that prevents us from ever patching back instructions from SMP to UP:
    this simplifies things and speeds up CPU hotplug.

    Other than that it's smaller fixes, cleanups and improvements."

    * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: Unspaghettize do_trap()
    x86_64: Work around old GAS bug
    x86: Use REP BSF unconditionally
    x86: Prefer TZCNT over BFS
    x86/64: Adjust types of temporaries used by ffs()/fls()/fls64()
    x86: Drop unnecessary kernel_eflags variable on 64-bit
    x86/smp: Don't ever patch back to UP if we unplug cpus

    Linus Torvalds
     

23 Sep, 2012

1 commit

  • Although almost everyone is well-served by the defaults, some uses of RCU
    benefit from shorter grace periods, while others benefit more from the
    greater efficiency provided by longer grace periods. Situations requiring
    a large number of grace periods to elapse (and wireshark startup has
    been called out as an example of this) are helped by lower-latency
    grace periods. Furthermore, in some embedded applications, people are
    willing to accept a small degradation in update efficiency (due to there
    being more of the shorter grace-period operations) in order to gain the
    lower latency.

    In contrast, those few systems with thousands of CPUs need longer grace
    periods because the CPU overhead of a grace period rises roughly
    linearly with the number of CPUs. Such systems normally do not make
    much use of facilities that require large numbers of grace periods to
    elapse, so this is a good tradeoff.

    Therefore, this commit allows the durations to be controlled from sysfs.
    There are two sysfs parameters, one named "jiffies_till_first_fqs" that
    specifies the delay in jiffies from the end of grace-period initialization
    until the first attempt to force quiescent states, and the other named
    "jiffies_till_next_fqs" that specifies the delay (again in jiffies)
    between subsequent attempts to force quiescent states. They both default
    to three jiffies, which is compatible with the old hard-coded behavior.

    At some future time, it may be possible to automatically increase the
    grace-period length with the number of CPUs, but we do not yet have
    sufficient data to do a good job. Preliminary data indicates that we
    should add an additional jiffy to each of the delays for every 200 CPUs
    in the system, but more experimentation is needed. For now, the number
    of systems with more than 1,000 CPUs is small enough that this can be
    relegated to boot-time hand tuning.
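
    As a rough illustration of the preliminary rule of thumb above (three
    jiffies by default, plus one jiffy for every 200 CPUs), the following
    Python sketch computes a candidate value for hand tuning. The function
    name, its parameters, and the use of floor division are assumptions made
    for illustration; the rule itself is described above as needing more
    experimentation.

```python
def suggested_fqs_delay(ncpus, base_jiffies=3, cpus_per_extra_jiffy=200):
    """Return a candidate delay (in jiffies) for hand tuning.

    Hypothetical helper: it applies the preliminary "one additional jiffy
    per 200 CPUs" rule on top of the three-jiffy default.
    """
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    # Assumption: partial blocks of CPUs do not add a jiffy (floor division).
    return base_jiffies + ncpus // cpus_per_extra_jiffy


if __name__ == "__main__":
    for ncpus in (4, 200, 1024, 4096):
        print(ncpus, suggested_fqs_delay(ncpus))
```

    The same candidate value would apply to both "jiffies_till_first_fqs"
    and "jiffies_till_next_fqs", since the rule adds a jiffy to each delay.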

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

22 Sep, 2012

2 commits

  • Reason for merge:
    x86/fpu changed the structure of some of the code that x86/smap
    changes; mostly fpu-internal.h but also minor changes to the
    signal code.

    Signed-off-by: H. Peter Anvin

    Resolved Conflicts:
    arch/x86/ia32/ia32_signal.c
    arch/x86/include/asm/fpu-internal.h
    arch/x86/kernel/signal.c

    H. Peter Anvin
     
  • If Supervisor Mode Access Prevention is available and not disabled by
    the user, turn it on. Also fix the expansion of SMEP (Supervisor Mode
    Execution Prevention.)

    Signed-off-by: H. Peter Anvin
    Link: http://lkml.kernel.org/r/1348256595-29119-10-git-send-email-hpa@linux.intel.com

    H. Peter Anvin
     

19 Sep, 2012

3 commits

  • Add the "eagerfpu=auto" boot parameter (which selects the default scheme
    for enabling eagerfpu). It can override compiled-in boot parameters
    like "eagerfpu=on/off" (which force-enable or force-disable eagerfpu).
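
    A minimal sketch of how the three settings might resolve to a policy,
    written in Python for illustration only. The function name is
    hypothetical, and the "last occurrence wins" handling of repeated
    parameters is an assumption about how the override works; the default of
    enabling eagerfpu on processors with xsaveopt comes from the related
    commit in this series.

```python
def resolve_eagerfpu(cmdline, cpu_has_xsaveopt):
    """Return True if the eager FPU policy would be enabled (sketch only)."""
    setting = "auto"  # assumed default when no parameter is given
    for token in cmdline.split():
        if token.startswith("eagerfpu="):
            value = token.split("=", 1)[1]
            if value in ("on", "off", "auto"):
                # Assumption: a later occurrence overrides an earlier one,
                # e.g. a compiled-in "eagerfpu=off" followed by "auto".
                setting = value
    if setting == "auto":
        # Default scheme: eager on processors that support xsaveopt.
        return cpu_has_xsaveopt
    return setting == "on"


if __name__ == "__main__":
    print(resolve_eagerfpu("eagerfpu=off eagerfpu=auto", cpu_has_xsaveopt=True))
```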

    Signed-off-by: Suresh Siddha
    Link: http://lkml.kernel.org/r/1347300665-6209-5-git-send-email-suresh.b.siddha@intel.com
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
     
  • xsaveopt/xrstor support optimized state save/restore by tracking the
    INIT state and MODIFIED state during context-switch.

    Enable eagerfpu by default for processors supporting xsaveopt.
    Can be disabled by passing "eagerfpu=off" boot parameter.

    Signed-off-by: Suresh Siddha
    Link: http://lkml.kernel.org/r/1347300665-6209-3-git-send-email-suresh.b.siddha@intel.com
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
     
  • Decouple non-lazy/eager fpu restore policy from the existence of the xsave
    feature. Introduce a synthetic CPUID flag to represent the eagerfpu
    policy. "eagerfpu=on" boot parameter will enable the policy.

    Requested-by: H. Peter Anvin
    Requested-by: Linus Torvalds
    Signed-off-by: Suresh Siddha
    Link: http://lkml.kernel.org/r/1347300665-6209-2-git-send-email-suresh.b.siddha@intel.com
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
     

08 Sep, 2012

2 commits

  • Unlike the IMA measurement policy, the appraise policy can not be dependent
    on runtime process information, such as the task uid, as the 'security.ima'
    xattr is written on file close and must be updated each time the file changes,
    regardless of the current task uid.

    This patch extends the policy language with 'fowner', defines an appraise
    policy, which appraises all files owned by root, and defines 'ima_appraise_tcb',
    a new boot command line option, to enable the appraise policy.

    Changelog v3:
    - separate the measure from the appraise rules in order to support measuring
    without appraising and appraising without measuring.
    - change appraisal default for filesystems without xattr support to fail
    - update default appraise policy for cgroups

    Changelog v1:
    - don't appraise RAMFS (Dmitry Kasatkin)
    - merged rest of "ima: ima_must_appraise_or_measure API change" commit
    (Dmitry Kasatkin)

    ima_must_appraise_or_measure() called ima_match_policy twice, which
    searched the policy for a matching rule. Once for a matching measurement
    rule and subsequently for an appraisal rule. Searching the policy twice
    is unnecessary overhead, which could be noticeable with a large policy.

    The new version of ima_must_appraise_or_measure() does everything in a
    single iteration using a new version of ima_match_policy(). It returns
    a mask of IMA_MEASURE and IMA_APPRAISE.

    With the use of an action mask, a single efficient matching function
    is enough. The other, more specific versions of the matching functions
    are removed.
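
    The single-pass idea can be sketched in Python as follows. This is a
    simplified model, not the kernel code: the flag values, the rule layout
    (a dict with "action", "do" and "matches") and the example rules are all
    assumptions made for illustration. Only the "don't appraise RAMFS" and
    "appraise files owned by root" rules are taken from this commit message.

```python
IMA_MEASURE = 0x1    # illustrative flag values, not taken from the source
IMA_APPRAISE = 0x4


def match_policy(rules, file_info, wanted=IMA_MEASURE | IMA_APPRAISE):
    """Walk the rule list once and return the mask of actions that apply."""
    action = 0
    pending = wanted              # actions that are still undecided
    for rule in rules:
        if not pending:
            break                 # every requested action has been decided
        if not (rule["action"] & pending):
            continue              # this action was decided by an earlier rule
        if not rule["matches"](file_info):
            continue
        if rule["do"]:            # False models a "dont_..." rule
            action |= rule["action"]
        pending &= ~rule["action"]
    return action


# Hypothetical policy: the first matching rule for each action decides.
EXAMPLE_RULES = [
    {"action": IMA_APPRAISE, "do": False,
     "matches": lambda f: f["fs"] == "ramfs"},
    {"action": IMA_MEASURE, "do": True,
     "matches": lambda f: f["exec"]},
    {"action": IMA_APPRAISE, "do": True,
     "matches": lambda f: f["fowner"] == 0},
]

if __name__ == "__main__":
    info = {"fs": "ext4", "exec": True, "fowner": 0}
    print(hex(match_policy(EXAMPLE_RULES, info)))
```

    A caller can then test the returned mask for each action instead of
    searching the policy once per action.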

    Changelog:
    - change 'owner' to 'fowner' to conform to the new LSM conditions posted by
    Roberto Sassu.
    - fix calls to ima_log_string()

    Signed-off-by: Mimi Zohar
    Signed-off-by: Dmitry Kasatkin

    Mimi Zohar
     
  • IMA currently maintains an integrity measurement list used to assert the
    integrity of the running system to a third party. The IMA-appraisal
    extension adds local integrity validation and enforcement of the
    measurement against a "good" value stored as an extended attribute
    'security.ima'. The initial methods for validating 'security.ima' are
    hash based, which provides file data integrity, and digital signature
    based, which in addition to providing file data integrity, provides
    authenticity.

    This patch creates and maintains the 'security.ima' xattr, containing
    the file data hash measurement. Protection of the xattr is provided by
    EVM, if enabled and configured.

    Based on policy, IMA calls evm_verifyxattr() to verify a file's metadata
    integrity and, assuming success, compares the file's current hash value
    with the one stored as an extended attribute in 'security.ima'.
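
    The order of checks described above can be modelled in a few lines of
    Python. This is only a sketch: the function names are hypothetical, the
    boolean metadata_ok stands in for the result of evm_verifyxattr(), SHA-1
    is an assumed hash algorithm, and the real 'security.ima' value format is
    not modelled.

```python
import hashlib


def file_digest(file_data):
    """Hash the file contents (SHA-1 is an assumption for this sketch)."""
    return hashlib.sha1(file_data).digest()


def appraise(file_data, stored_digest, metadata_ok):
    """Return True if the file passes this simplified appraisal."""
    if not metadata_ok:
        # Metadata integrity is checked first; on failure nothing else runs.
        return False
    if stored_digest is None:
        # Simplification: no stored "good" value means appraisal fails.
        return False
    # Compare the file's current hash with the stored value.
    return file_digest(file_data) == stored_digest


if __name__ == "__main__":
    good = file_digest(b"hello")
    print(appraise(b"hello", good, metadata_ok=True))
    print(appraise(b"hello, modified", good, metadata_ok=True))
```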

    Changelog v4:
    - changed iint cache flags to hex values

    Changelog v3:
    - change appraisal default for filesystems without xattr support to fail

    Changelog v2:
    - fix audit msg 'res' value
    - removed unused 'ima_appraise=' values

    Changelog v1:
    - removed unused iint mutex (Dmitry Kasatkin)
    - setattr hook must not reset appraised (Dmitry Kasatkin)
    - evm_verifyxattr() now differentiates between no 'security.evm' xattr
    (INTEGRITY_NOLABEL) and no EVM 'protected' xattrs included in the
    'security.evm' (INTEGRITY_NOXATTRS).
    - replace hash_status with ima_status (Dmitry Kasatkin)
    - re-initialize slab element ima_status on free (Dmitry Kasatkin)
    - include 'security.ima' in EVM if CONFIG_IMA_APPRAISE, not CONFIG_IMA
    - merged half "ima: ima_must_appraise_or_measure API change" (Dmitry Kasatkin)
    - removed unnecessary error variable in process_measurement() (Dmitry Kasatkin)
    - use ima_inode_post_setattr() stub function, if IMA_APPRAISE not configured
    (moved ima_inode_post_setattr() to ima_appraise.c)
    - make sure ima_collect_measurement() can read file

    Changelog:
    - add 'iint' to evm_verifyxattr() call (Dmitry Kasatkin)
    - fix the race condition between chmod, which takes the i_mutex and then
    iint->mutex, and ima_file_free() and process_measurement(), which take
    the locks in the reverse order, by eliminating iint->mutex. (Dmitry Kasatkin)
    - cleanup of ima_appraise_measurement() (Dmitry Kasatkin)
    - changes as a result of the iint not allocated for all regular files, but
    only for those measured/appraised.
    - don't try to appraise new/empty files
    - expanded ima_appraisal description in ima/Kconfig
    - IMA appraise definitions required even if IMA_APPRAISE not enabled
    - add return value to ima_must_appraise() stub
    - unconditionally set status = INTEGRITY_PASS *after* testing status,
    not before. (Found by Joe Perches)

    Signed-off-by: Mimi Zohar
    Signed-off-by: Dmitry Kasatkin

    Mimi Zohar
     

24 Aug, 2012

1 commit

  • Hotplug testing with libsas currently encounters a 55 second wait for
    link recovery to give up. In the case where the user trusts the
    response time of their devices, permit the recovery attempts to be
    limited to one.

    Signed-off-by: Dan Williams
    Acked-by: Jeff Garzik
    Acked-by: Tejun Heo
    Signed-off-by: James Bottomley

    Dan Williams
     

23 Aug, 2012

1 commit

  • We still patch SMP instructions to UP variants if we boot with a
    single CPU, but not at any other time. In particular, not if we
    unplug CPUs to return to a single cpu.

    Paul McKenney points out:

    mean offline overhead is 6251/48=130.2 milliseconds.

    If I remove the alternatives_smp_switch() from the offline
    path [...] the mean offline overhead is 550/42=13.1 milliseconds

    Basically, we're never going to get those 120ms back, and the
    code is pretty messy.
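
    The quoted figures can be checked with a short calculation. The Python
    below is only a worked example; the variable names are made up, and
    reading each fraction as "total milliseconds over number of offline
    operations" is an assumption about the measurement.

```python
# Mean offline overhead with alternatives_smp_switch() in the offline path.
with_switch_ms = 6251 / 48
# Mean offline overhead with the call removed from the offline path.
without_switch_ms = 550 / 42
# Time saved per CPU offline operation.
saved_ms = with_switch_ms - without_switch_ms

if __name__ == "__main__":
    print(f"with switch:    {with_switch_ms:.1f} ms")
    print(f"without switch: {without_switch_ms:.1f} ms")
    print(f"saved:          {saved_ms:.1f} ms")
```

    The difference comes to roughly 117 ms per offline operation, which the
    message above rounds to 120ms.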

    We get rid of:

    1) The "smp-alt-once" boot option. It's actually "smp-alt-boot", the
    documentation is wrong. It's now the default.

    2) The skip_smp_alternatives flag used by suspend.

    3) arch_disable_nonboot_cpus_begin() and arch_disable_nonboot_cpus_end()
    which were only used to set this one flag.

    Signed-off-by: Rusty Russell
    Cc: Paul McKenney
    Cc: Suresh Siddha
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/87vcgwwive.fsf@rustcorp.com.au
    Signed-off-by: Ingo Molnar

    Rusty Russell
     

31 Jul, 2012

1 commit

  • Pull DMA-mapping updates from Marek Szyprowski:
    "Those patches are continuation of my earlier work.

    They contain extensions to the DMA-mapping framework to remove limitations
    of the current ARM implementation (like the limited total size of DMA
    coherent/write combine buffers), improve performance of buffer sharing
    between devices (attributes to skip cpu cache operations or creation
    of additional kernel mapping for some specific use cases) as well as
    some unification of the common code for dma_mmap_attrs() and
    dma_mmap_coherent() functions. All extensions have been implemented
    and tested for ARM architecture."

    * 'for-linus-for-3.6-rc1' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
    ARM: dma-mapping: add support for DMA_ATTR_SKIP_CPU_SYNC attribute
    common: DMA-mapping: add DMA_ATTR_SKIP_CPU_SYNC attribute
    ARM: dma-mapping: add support for dma_get_sgtable()
    common: dma-mapping: introduce dma_get_sgtable() function
    ARM: dma-mapping: add support for DMA_ATTR_NO_KERNEL_MAPPING attribute
    common: DMA-mapping: add DMA_ATTR_NO_KERNEL_MAPPING attribute
    common: dma-mapping: add support for generic dma_mmap_* calls
    ARM: dma-mapping: fix error path for memory allocation failure
    ARM: dma-mapping: add more sanity checks in arm_dma_mmap()
    ARM: dma-mapping: remove custom consistent dma region
    mm: vmalloc: use const void * for caller argument
    scatterlist: add sg_alloc_table_from_pages function

    Linus Torvalds
     

30 Jul, 2012

1 commit

  • This patch changes the dma-mapping subsystem to use generic vmalloc areas
    for all consistent dma allocations. This increases the total size limit
    of the consistent allocations and removes platform hacks and a lot of
    duplicated code.

    Atomic allocations are served from a special pool preallocated on boot,
    because vmalloc areas cannot be reliably created in atomic context.

    Signed-off-by: Marek Szyprowski
    Reviewed-by: Kyungmin Park
    Reviewed-by: Minchan Kim

    Marek Szyprowski
     

25 Jul, 2012

2 commits

  • Pull first round of SCSI updates from James Bottomley:
    "The most important feature of this patch set is the new async
    infrastructure that makes sure async_synchronize_full() synchronizes
    all domains and allows us to remove all the hacks (like having
    scsi_complete_async_scans() in the device base code) and means that
    the async infrastructure will "just work" in future.

    The rest is assorted driver updates (aacraid, bnx2fc, virtio-scsi,
    megaraid, bfa, lpfc, qla2xxx, qla4xxx) plus a lot of infrastructure
    work in sas and FC.

    Signed-off-by: James Bottomley "

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (97 commits)
    [SCSI] Revert "[SCSI] fix async probe regression"
    [SCSI] cleanup usages of scsi_complete_async_scans
    [SCSI] queue async scan work to an async_schedule domain
    [SCSI] async: make async_synchronize_full() flush all work regardless of domain
    [SCSI] async: introduce 'async_domain' type
    [SCSI] bfa: Fix to set correct return error codes and misc cleanup.
    [SCSI] aacraid: Series 7 Async. (performance) mode support
    [SCSI] aha152x: Allow use on 64bit systems
    [SCSI] virtio-scsi: Add vdrv->scan for post VIRTIO_CONFIG_S_DRIVER_OK LUN scanning
    [SCSI] bfa: squelch lockdep complaint with a spin_lock_init
    [SCSI] qla2xxx: remove unnecessary reads of PCI_CAP_ID_EXP
    [SCSI] qla4xxx: remove unnecessary read of PCI_CAP_ID_EXP
    [SCSI] ufs: fix incorrect return value about SUCCESS and FAILED
    [SCSI] ufs: reverse the ufshcd_is_device_present logic
    [SCSI] ufs: use module_pci_driver
    [SCSI] usb-storage: update usb devices for write cache quirk in quirk list.
    [SCSI] usb-storage: add support for write cache quirk
    [SCSI] set to WCE if usb cache quirk is present.
    [SCSI] virtio-scsi: hotplug support for virtio-scsi
    [SCSI] virtio-scsi: split scatterlist per target
    ...

    Linus Torvalds
     
  • Pull IOMMU updates from Joerg Roedel:
    "The most important part of these updates is the IOMMU groups code
    enhancement written by Alex Williamson. It abstracts the problem that
    a given hardware IOMMU can't isolate any given device from any other
    device (e.g. 32 bit PCI devices can't usually be isolated). Devices
    that can't be isolated are grouped together. This code is required
    for the upcoming VFIO framework.

    Another IOMMU-API change written by me is the introduction of domain
    attributes. This makes it easier to handle GART-like IOMMUs with the
    IOMMU-API because now the start-address and the size of the domain
    address space can be queried.

    Besides that there are a few cleanups and fixes for the NVidia Tegra
    IOMMU drivers and the reworked init-code for the AMD IOMMU. The
    latter is from my patch-set to support interrupt remapping. The rest
    of this patch-set requires x86 changes which are not mergeable yet. So
    full support for interrupt remapping with AMD IOMMUs will come in a
    future merge window."

    * tag 'iommu-updates-v3.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (33 commits)
    iommu/amd: Fix hotplug with iommu=pt
    iommu/amd: Add missing spin_lock initialization
    iommu/amd: Convert iommu initialization to state machine
    iommu/amd: Introduce amd_iommu_init_dma routine
    iommu/amd: Move unmap_flush message to amd_iommu_init_dma_ops()
    iommu/amd: Split enable_iommus() routine
    iommu/amd: Introduce early_amd_iommu_init routine
    iommu/amd: Move informational prinks out of iommu_enable
    iommu/amd: Split out PCI related parts of IOMMU initialization
    iommu/amd: Use acpi_get_table instead of acpi_table_parse
    iommu/amd: Fix sparse warnings
    iommu/tegra: Don't call alloc_pdir with as->lock
    iommu/tegra: smmu: Fix unsleepable memory allocation at alloc_pdir()
    iommu/tegra: smmu: Remove unnecessary sanity check at alloc_pdir()
    iommu/exynos: Implement DOMAIN_ATTR_GEOMETRY attribute
    iommu/tegra: Implement DOMAIN_ATTR_GEOMETRY attribute
    iommu/msm: Implement DOMAIN_ATTR_GEOMETRY attribute
    iommu/omap: Implement DOMAIN_ATTR_GEOMETRY attribute
    iommu/vt-d: Implement DOMAIN_ATTR_GEOMETRY attribute
    iommu/amd: Implement DOMAIN_ATTR_GEOMETRY attribute
    ...

    Linus Torvalds
     

20 Jul, 2012

1 commit


03 Jul, 2012

1 commit

  • Although making RCU_FANOUT_LEAF a kernel configuration parameter rather
    than a fixed constant makes it easier for people to decrease cache-miss
    overhead for large systems, it is of little help for people who must
    run a single pre-built kernel binary.

    This commit therefore allows the value of RCU_FANOUT_LEAF to be
    increased (but not decreased!) via a boot-time parameter named
    rcutree.rcu_fanout_leaf.
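
    The "increased but not decreased" rule can be expressed as a small
    Python sketch. The function name is hypothetical, and any upper bound or
    other validation the kernel applies is deliberately left out, since this
    message does not describe it.

```python
def effective_fanout_leaf(built_in, requested=None):
    """Return the leaf fanout that would be used (sketch only).

    built_in models the compiled-in RCU_FANOUT_LEAF value; requested models
    the value passed at boot, or None when the parameter is absent.
    """
    if requested is None or requested <= built_in:
        # Requests to decrease (or keep) the value leave the default alone.
        return built_in
    return requested


if __name__ == "__main__":
    print(effective_fanout_leaf(16, 32))
    print(effective_fanout_leaf(16, 8))
```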

    Reported-by: Mike Galbraith
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     

25 Jun, 2012

1 commit

  • The iommu=group_mf is really no longer needed with the addition of ACS
    support in IOMMU drivers creating groups. Most multifunction devices
    will now be grouped already. If a device has gone to the trouble of
    exposing ACS, trust that it works. We can use the device specific ACS
    function for fixing devices we trust individually. This largely
    reverts bcb71abe.

    Signed-off-by: Alex Williamson
    Signed-off-by: Joerg Roedel

    Alex Williamson
     

05 Jun, 2012

1 commit

  • Pull timer updates from Thomas Gleixner:
    "The clocksource driver is pure hardware enablement and the skew option
    is default off, well tested and non dangerous."

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    tick: Move skew_tick option into the HIGH_RES_TIMER section
    clocksource: em_sti: Add DT support
    clocksource: em_sti: Emma Mobile STI driver
    clockevents: Make clockevents_config() a global symbol
    tick: Add tick skew boot option

    Linus Torvalds