08 May, 2015

7 commits

  • When we detect a hypervisor (!paravirt, see qspinlock paravirt support
    patches), revert to a simple test-and-set lock to avoid the horrors
    of queue preemption.
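
    A minimal user-space sketch of the kind of test-and-set lock the
    message refers to, written with C11 atomics. The names tas_lock_t,
    tas_trylock(), tas_lock() and tas_unlock() are hypothetical and this
    is not the kernel's code; it only illustrates that no waiter queue
    is involved.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space lock word: 0 = unlocked, 1 = locked. */
typedef struct { atomic_int val; } tas_lock_t;

static bool tas_trylock(tas_lock_t *l)
{
    int expected = 0;
    /* Succeeds only if the word was 0; there is no queue to join. */
    return atomic_compare_exchange_strong_explicit(&l->val, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

static void tas_lock(tas_lock_t *l)
{
    while (!tas_trylock(l)) {
        /* Spin on plain loads until the lock looks free, then retry. */
        while (atomic_load_explicit(&l->val, memory_order_relaxed))
            ;
    }
}

static void tas_unlock(tas_lock_t *l)
{
    atomic_store_explicit(&l->val, 0, memory_order_release);
}
```

    Because waiters are unordered, whichever vCPU happens to be running
    can take the lock; a preempted waiter does not stall the others the
    way a preempted queue node would.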

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-8-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra (Intel)
     
  • When we allow for a max NR_CPUS < 2^14 we can optimize the pending
    wait-acquire and the xchg_tail() operations.

    By growing the pending bit to a byte, we reduce the tail to 16bit.
    This means we can use xchg16 for the tail part and do away with all
    the repeated cmpxchg() operations.

    This in turn allows us to unconditionally acquire; the locked state
    as observed by the wait loops cannot change. And because both locked
    and pending are now a full byte we can use simple stores for the
    state transition, obviating one atomic operation entirely.

    This optimization is needed to make the qspinlock achieve performance
    parity with ticket spinlock at light load.

    All this is horribly broken on Alpha pre EV56 (and any other arch that
    cannot do single-copy atomic byte stores).
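
    The effect of the byte-sized pending field can be sketched in user
    space with C11 atomics. The struct qlock_sketch layout, the Q_*
    constants and the *16() helper names below are assumptions made for
    illustration (locked in bits 0-7 and pending in bits 8-15 of one
    halfword, tail in the other); they are not taken from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: one halfword for locked+pending, one for the tail. */
struct qlock_sketch {
    _Atomic uint16_t locked_pending;  /* bits 0-7 locked, bits 8-15 pending */
    _Atomic uint16_t tail;            /* 2 bits node index + 14 bits CPU */
};

#define Q_LOCKED_VAL  0x0001u
#define Q_PENDING_VAL 0x0100u

/* Publish our tail and fetch the previous one with a single 16-bit exchange. */
static uint16_t xchg_tail16(struct qlock_sketch *l, uint16_t tail)
{
    return atomic_exchange_explicit(&l->tail, tail, memory_order_acq_rel);
}

/* pending -> locked as one halfword store, with no read-modify-write. */
static void clear_pending_set_locked16(struct qlock_sketch *l)
{
    atomic_store_explicit(&l->locked_pending, Q_LOCKED_VAL,
                          memory_order_relaxed);
}
```

    The store in clear_pending_set_locked16() is only safe because the
    tail lives in a separate halfword, which is why the scheme depends
    on the architecture being able to store sub-word quantities
    atomically.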

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-6-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra (Intel)
     
  • This is a preparatory patch that extracts out the following 2 code
    snippets to prepare for the next performance optimization patch.

    1) the logic for the exchange of new and previous tail code words
    into a new xchg_tail() function.
    2) the logic for clearing the pending bit and setting the locked bit
    into a new clear_pending_set_locked() function.
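
    As a rough illustration of what the two extracted helpers do on a
    single 32-bit lock word, here is a user-space sketch with C11
    atomics. The bit positions (LOCKED_VAL, PENDING_VAL, TAIL_MASK) are
    assumptions chosen for the example, and the bodies are not copied
    from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

#define LOCKED_VAL  0x00000001u  /* assumed: locked flag in bit 0 */
#define PENDING_VAL 0x00000100u  /* assumed: pending flag in bit 8 */
#define TAIL_MASK   0xFFFFFE00u  /* assumed: tail code word in bits 9-31 */

/* Replace only the tail bits and return the previous tail bits. */
static uint32_t xchg_tail(_Atomic uint32_t *lock, uint32_t tail)
{
    uint32_t old = atomic_load_explicit(lock, memory_order_relaxed);
    uint32_t new_val;

    do {
        new_val = (old & ~TAIL_MASK) | tail;
        /* On failure, 'old' is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak_explicit(lock, &old, new_val,
                                                    memory_order_acq_rel,
                                                    memory_order_relaxed));
    return old & TAIL_MASK;
}

/* Precondition: pending is set and locked is clear. */
static void clear_pending_set_locked(_Atomic uint32_t *lock)
{
    /* Adding (LOCKED - PENDING) clears bit 8 and sets bit 0 in one step. */
    atomic_fetch_add_explicit(lock, LOCKED_VAL - PENDING_VAL,
                              memory_order_acquire);
}
```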

    This patch also simplifies the trylock operation before queuing by
    calling queued_spin_trylock() directly.

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-5-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Because the qspinlock needs to touch a second cacheline (the per-cpu
    mcs_nodes[]); add a pending bit and allow a single in-word spinner
    before we punt to the second cacheline.

    It is possible to observe the pending bit without the locked bit when
    the last owner has just released but the pending owner has not yet
    taken ownership.

    In this case we would normally queue -- because the pending bit is
    already taken. However, in this case the pending bit is guaranteed
    to be released 'soon', therefore wait for it and avoid queueing.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-4-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Peter Zijlstra (Intel)
     
  • This patch introduces a new generic queued spinlock implementation that
    can serve as an alternative to the default ticket spinlock. Compared
    with the ticket spinlock, this queued spinlock should be almost as fair
    as the ticket spinlock. It has about the same speed in single-thread
    and it can be much faster in high contention situations especially when
    the spinlock is embedded within the data structure to be protected.

    Only in light to moderate contention where the average queue depth
    is around 1-3 will this queued spinlock be potentially a bit slower
    due to the higher slowpath overhead.

    This queued spinlock is especially suited to NUMA machines with a large
    number of cores as the chance of spinlock contention is much higher
    in those machines. The cost of contention is also higher because of
    slower inter-node memory traffic.

    Due to the fact that spinlocks are acquired with preemption disabled,
    the process will not be migrated to another CPU while it is trying
    to get a spinlock. Ignoring interrupt handling, a CPU can only be
    contending in one spinlock at any one time. Counting soft IRQ, hard
    IRQ and NMI, a CPU can only have a maximum of 4 concurrent lock waiting
    activities. By allocating a set of per-cpu queue nodes and using them
    to form a waiting queue, we can encode the queue node address into a
    much smaller 24-bit size (including CPU number and queue node index)
    leaving one byte for the lock.

    Please note that the queue node is only needed when waiting for the
    lock. Once the lock is acquired, the queue node can be released to
    be used later.
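
    The 24-bit encoding described above can be sketched as follows. The
    offsets (2 bits of node index directly above the lock byte, CPU
    number above that) and the "+1" bias on the CPU number are
    assumptions made for this example; the helper names are
    hypothetical.

```c
#include <stdint.h>

#define TAIL_IDX_OFFSET 8   /* assumed: node index starts above the lock byte */
#define TAIL_IDX_BITS   2   /* 4 nesting levels: task, softirq, hardirq, NMI */
#define TAIL_CPU_OFFSET (TAIL_IDX_OFFSET + TAIL_IDX_BITS)

/* The CPU number is stored as cpu + 1 so that a tail of 0 means "no queue". */
static uint32_t encode_tail(unsigned int cpu, unsigned int idx)
{
    return ((uint32_t)(cpu + 1) << TAIL_CPU_OFFSET) |
           ((uint32_t)idx << TAIL_IDX_OFFSET);
}

static unsigned int tail_cpu(uint32_t tail)
{
    return (tail >> TAIL_CPU_OFFSET) - 1;
}

static unsigned int tail_idx(uint32_t tail)
{
    return (tail >> TAIL_IDX_OFFSET) & ((1u << TAIL_IDX_BITS) - 1);
}
```

    With this layout the low byte of every encoded tail is zero, so the
    tail can be combined with the lock byte in one 32-bit word.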

    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Daniel J Blueman
    Cc: David Vrabel
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Konrad Rzeszutek Wilk
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Raghavendra K T
    Cc: Rik van Riel
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/1429901803-29771-2-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     
  • Looks like commit:

    43239cbe79fc ("kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)")

    left behind a reference to ASSIGN_ONCE(). Update this to WRITE_ONCE().
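
    The rename also swaps the argument order: the destination comes
    first. A simplified user-space stand-in (not the kernel's actual
    macro, and relying on the GNU __typeof__ extension) shows the
    calling convention:

```c
/*
 * Simplified stand-in for illustration only: store 'val' into 'x'
 * through a volatile-qualified lvalue so the compiler emits the store.
 */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
```

    A call then reads WRITE_ONCE(flag, 1), mirroring "flag = 1".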

    Signed-off-by: Preeti U Murthy
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Borislav Petkov
    Cc: H. Peter Anvin
    Cc: Thomas Gleixner
    Cc: borntraeger@de.ibm.com
    Cc: dave@stgolabs.net
    Cc: paulmck@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/20150430115721.22278.94082.stgit@preeti.in.ibm.com
    Signed-off-by: Ingo Molnar

    Preeti U Murthy
     
  • In up_write()/up_read(), rwsem_wake() will be called whenever it
    detects that some writers/readers are waiting. The rwsem_wake()
    function will take the wait_lock and call __rwsem_do_wake() to do the
    real wakeup. For a heavily contended rwsem, doing a spin_lock() on
    wait_lock will cause further contention on the heavily contended rwsem
    cacheline resulting in delay in the completion of the up_read/up_write
    operations.

    This patch makes the wait_lock taking and the call to __rwsem_do_wake()
    optional if at least one spinning writer is present. The spinning
    writer will be able to take the rwsem and call rwsem_wake() later
    when it calls up_write(). With the presence of a spinning writer,
    rwsem_wake() will now try to acquire the lock using trylock. If that
    fails, it will just quit.
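
    The control flow described above can be sketched in user space as
    follows. struct sem_sketch, its fields and wake_sketch() are
    hypothetical stand-ins; an atomic_flag plays the role of wait_lock
    and a counter stands in for the real wakeup work.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sem_sketch {
    atomic_flag wait_lock;  /* stand-in for the rwsem's wait_lock */
    bool has_spinner;       /* true while a writer is spinning on the rwsem */
    int wakeups;            /* counts how often the wakeup path ran */
};

/* Returns true if the wakeup path ran, false if it was skipped. */
static bool wake_sketch(struct sem_sketch *sem)
{
    if (sem->has_spinner) {
        /* One attempt only: the spinner will do the wakeup later. */
        if (atomic_flag_test_and_set_explicit(&sem->wait_lock,
                                              memory_order_acquire))
            return false;
    } else {
        /* No spinner: wait for the lock as before. */
        while (atomic_flag_test_and_set_explicit(&sem->wait_lock,
                                                 memory_order_acquire))
            ;
    }

    sem->wakeups++;  /* placeholder for the real wakeup */
    atomic_flag_clear_explicit(&sem->wait_lock, memory_order_release);
    return true;
}
```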

    Suggested-by: Peter Zijlstra (Intel)
    Signed-off-by: Waiman Long
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Davidlohr Bueso
    Acked-by: Jason Low
    Cc: Andrew Morton
    Cc: Borislav Petkov
    Cc: Douglas Hatch
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1430428337-16802-2-git-send-email-Waiman.Long@hp.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

07 May, 2015

3 commits

  • Pull infiniband updates from Doug Ledford:
    "Minor updates for 4.1-rc

    Most of the changes are fairly small and well confined. The iWARP
    address reporting changes are the only ones that are a medium size. I
    had these queued up prior to rc1, but due to the shuffle in
    maintainers, they did not get submitted when I expected. My apologies
    for that. I feel comfortable with them however due to the testing
    they've received, so I left them in this submission"

    * tag 'for-linus' of git://github.com/dledford/linux:
    MAINTAINERS: Update InfiniBand subsystem maintainer
    MAINTAINERS: add include/rdma/ to InfiniBand subsystem
    IPoIB/CM: Fix indentation level
    iw_cxgb4: Remove negative advice dmesg warnings
    IB/core: Fix unaligned accesses
    IB/core: change rdma_gid2ip into void function as it always return zero
    IB/qib: use arch_phys_wc_add()
    IB/qib: add acounting for MTRR
    IB/core: dma unmap optimizations
    IB/core: dma map/unmap locking optimizations
    RDMA/cxgb4: Report the actual address of the remote connecting peer
    RDMA/nes: Report the actual address of the remote connecting peer
    RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecting peer to its clients
    iw_cxgb4: enforce qp/cq id requirements
    iw_cxgb4: use BAR2 GTS register for T5 kernel mode CQs
    iw_cxgb4: 32b platform fixes
    iw_cxgb4: Cleanup register defines/MACROS
    RDMA/CMA: Canonize IPv4 on IPV6 sockets properly

    Linus Torvalds
     
  • Pull xen bug fixes from David Vrabel:

    - fix blkback regression if using persistent grants

    - fix various event channel related suspend/resume bugs

    - fix AMD x86 regression with X86_BUG_SYSRET_SS_ATTRS

    - SWIOTLB on ARM now uses frames evtchn before binding the channel to CPU in __startup_pirq()
    xen/console: Update console event channel on resume
    xen/xenbus: Update xenbus event channel on resume
    xen/events: Clear cpu_evtchn_mask before resuming
    xen-pciback: Add name prefix to global 'permissive' variable
    xen: Suspend ticks on all CPUs during suspend
    xen/grant: introduce func gnttab_unmap_refs_sync()
    xen/blkback: safely unmap purge persistent grants

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "Mostly tooling fixes, but also an uncore PMU driver fix and an uncore
    PMU driver hardware-enablement addition"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf probe: Fix segfault if passed with ''.
    perf report: Fix -T/--threads option to work again
    perf bench numa: Fix immediate meeting of convergence condition
    perf bench numa: Fixes of --quiet argument
    perf bench futex: Fix hung wakeup tasks after requeueing
    perf probe: Fix bug with global variables handling
    perf top: Fix a segfault when kernel map is restricted.
    tools lib traceevent: Fix build failure on 32-bit arch
    perf kmem: Fix compiles on RHEL6/OL6
    tools lib api: Undefine _FORTIFY_SOURCE before setting it
    perf kmem: Consistently use PRIu64 for printing u64 values
    perf trace: Disable events and drain events when forked workload ends
    perf trace: Enable events when doing system wide tracing and starting a workload
    perf/x86/intel/uncore: Move PCI IDs for IMC to uncore driver
    perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs
    perf/x86/intel: Add cpu_(prepare|starting|dying) for core_pmu

    Linus Torvalds
     

06 May, 2015

5 commits

  • The range check for b-tree level parameter in nilfs_btree_root_broken()
    is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
    though the level is limited to values in the range of 0 to
    (NILFS_BTREE_LEVEL_MAX - 1).

    Since the level parameter is read from storage device and used to index
    nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
    can cause memory overrun during btree operations if the boundary value
    is set to the level parameter on device.

    This fixes the broken sanity check and adds a comment to clarify that
    the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.
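
    A minimal sketch of an exclusive upper-bound check of this kind.
    PATH_LEVEL_MAX and level_is_broken() are hypothetical names and the
    value 14 is a placeholder, not the nilfs2 constant.

```c
#include <stdbool.h>

#define PATH_LEVEL_MAX 14  /* hypothetical element count of the path array */

/* Valid levels are 0 .. PATH_LEVEL_MAX - 1; the upper bound is exclusive. */
static bool level_is_broken(int level)
{
    return level < 0 || level >= PATH_LEVEL_MAX;
}
```

    Writing the comparison as ">" instead of ">=" would accept
    level == PATH_LEVEL_MAX, which indexes one element past the array.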

    Signed-off-by: Ryusuke Konishi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • Using the new find_closest() macro can result in the following sparse
    warnings.

    drivers/hwmon/lm85.c:194:16: warning:
    incorrect type in initializer (different modifiers)
    drivers/hwmon/lm85.c:194:16: expected int *__fc_a
    drivers/hwmon/lm85.c:194:16: got int static const [toplevel] *
    drivers/hwmon/lm85.c:210:16: warning:
    incorrect type in initializer (different modifiers)
    drivers/hwmon/lm85.c:210:16: expected int *__fc_a
    drivers/hwmon/lm85.c:210:16: got int const *map

    This is because the array passed to find_closest() will typically be
    declared as array of constants, but the macro declares a non-constant
    pointer to it.
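
    The qualifier issue is easier to see in function form. The sketch
    below is not the kernel macro; find_closest_sketch() is a
    hypothetical helper that takes a const-qualified pointer, so both
    const and non-const arrays can be passed without discarding
    modifiers.

```c
#include <stddef.h>

/*
 * Index of the element closest to x in a[0..n-1], which must be sorted
 * in ascending order and contain at least one element.
 */
static size_t find_closest_sketch(int x, const int *a, size_t n)
{
    size_t i;

    for (i = 0; i + 1 < n; i++) {
        /* Stop at the first slot whose midpoint to the next is >= x. */
        if (x <= (a[i] + a[i + 1]) / 2)
            break;
    }
    return i;
}
```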

    Signed-off-by: Guenter Roeck
    Cc: Bartosz Golaszewski

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Guenter Roeck
     
  • Addresses the following kernel logs seen during boot of sparc systems:

    Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
    Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
    Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
    Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
    Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]

    Signed-off-by: David Ahern
    Signed-off-by: Doug Ledford

    David Ahern
     
  • Signed-off-by: Honggang Li
    Acked-by: Sean Hefty
    Signed-off-by: Doug Ledford

    Honggang LI
     
  • Pull crypto fixes from Herbert Xu:
    "This fixes a build problem with bcm63xx and yet another fix to the
    memzero_explicit function to ensure that the memset is not elided"

    * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    hwrng: bcm63xx - Fix driver compilation
    lib: make memzero_explicit more robust against dead store elimination

    Linus Torvalds
     

05 May, 2015

1 commit

  • …necting peer to its clients

    Add functionality to enable the port mapper on the passive side to provide to its
    clients the actual (non-mapped) ip/tcp address information of the connecting peer

    1) Adding remote_info_cb() to process the address info of the connecting peer
    The address info is provided by the user space port mapper service when
    the connection is initiated by the peer
    2) Adding a hash list to store the remote address info
    3) Adding functionality to add/remove the remote address info
    After the info has been provided to the port mapper client,
    it is removed from the hash list

    Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
    Reviewed-by: Steve Wise <swise@opengridcomputing.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>

    Tatyana Nikolova
     

04 May, 2015

2 commits

  • In commit 0b053c951829 ("lib: memzero_explicit: use barrier instead
    of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
    case LTO would decide to inline memzero_explicit() and eventually
    find out it could be eliminated as dead store.

    While using barrier() works well for the case of gcc, recent efforts
    from LLVMLinux people suggest to use llvm as an alternative to gcc,
    and there, Stephan found in a simple stand-alone user space example
    that llvm could nevertheless optimize and thus eliminate the memset().
    A similar issue has been observed in the referenced llvm bug report,
    which is regarded as not-a-bug.

    Based on some experiments, icc is a bit special on its own, while it
    doesn't seem to eliminate the memset(), it could do so with an own
    implementation, and then result in similar findings as with llvm.

    The fix in this patch now works for all three compilers (also tested
    with more aggressive optimization levels). Arguably, in the current
    kernel tree it's more of a theoretical issue, but imho, it's better
    to be pedantic about it.

    It's clearly visible with gcc/llvm though, with the below code: if we
    would have used barrier() only here, llvm would have omitted clearing,
    not so with barrier_data() variant:

    static inline void memzero_explicit(void *s, size_t count)
    {
            memset(s, 0, count);
            barrier_data(s);
    }

    int main(void)
    {
            char buff[20];
            memzero_explicit(buff, sizeof(buff));
            return 0;
    }

    $ gcc -O2 test.c
    $ gdb a.out
    (gdb) disassemble main
    Dump of assembler code for function main:
    0x0000000000400400 : lea -0x28(%rsp),%rax
    0x0000000000400405 : movq $0x0,-0x28(%rsp)
    0x000000000040040e : movq $0x0,-0x20(%rsp)
    0x0000000000400417 : movl $0x0,-0x18(%rsp)
    0x000000000040041f : xor %eax,%eax
    0x0000000000400421 : retq
    End of assembler dump.

    $ clang -O2 test.c
    $ gdb a.out
    (gdb) disassemble main
    Dump of assembler code for function main:
    0x00000000004004f0 : xorps %xmm0,%xmm0
    0x00000000004004f3 : movaps %xmm0,-0x18(%rsp)
    0x00000000004004f8 : movl $0x0,-0x8(%rsp)
    0x0000000000400500 : lea -0x18(%rsp),%rax
    0x0000000000400505 : xor %eax,%eax
    0x0000000000400507 : retq
    End of assembler dump.

    As gcc, clang, but also icc defines __GNUC__, it's sufficient to define
    this in compiler-gcc.h only to be picked up. For a fallback or otherwise
    unsupported compiler, we define it as a barrier. Similarly, for ecc which
    does not support gcc inline asm.
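
    For experimenting outside the kernel tree, a self-contained sketch
    of the idea follows. The barrier_data() body shown here is an
    assumption about how such a macro can be written with GCC-style
    inline asm (an empty asm statement that takes the pointer as an
    input and clobbers memory); check compiler-gcc.h for the definition
    actually used.

```c
#include <stddef.h>
#include <string.h>

/*
 * Assumed GCC-style definition: the empty asm "uses" ptr and clobbers
 * memory, so the compiler must treat the buffer contents as observed.
 */
#define barrier_data(ptr) __asm__ __volatile__("" : : "r"(ptr) : "memory")

static inline void memzero_explicit(void *s, size_t count)
{
    memset(s, 0, count);
    barrier_data(s);  /* keeps the memset() from being a dead store */
}
```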

    Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
    Reported-by: Stephan Mueller
    Tested-by: Stephan Mueller
    Signed-off-by: Daniel Borkmann
    Cc: Theodore Ts'o
    Cc: Stephan Mueller
    Cc: Hannes Frederic Sowa
    Cc: mancha security
    Cc: Mark Charlebois
    Cc: Behan Webster
    Signed-off-by: Herbert Xu

    Daniel Borkmann
     
  • Pull SCSI fixes from James Bottomley:
    "This is three logical fixes (as 5 patches).

    The 3ware class of drivers were causing an oops with multiqueue by
    tearing down the command mappings after completing the command (where
    the variables in the command used to tear down the mapping were
    no-longer valid). There's also a fix for the qnap iscsi target which
    was choking on us sending it commands that were too long and a fix for
    the reworked aha1542 allocating GFP_KERNEL under a lock"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    3w-9xxx: fix command completion race
    3w-xxxx: fix command completion race
    3w-sas: fix command completion race
    aha1542: Allocate memory before taking a lock
    SCSI: add 1024 max sectors black list flag

    Linus Torvalds
     

02 May, 2015

2 commits

  • Pull networking fixes from David Miller:

    1) Receive packet length needs to be adjusted by 2 on RX to accommodate
    the two padding bytes in altera_tse driver. From Vlastimil Setka.

    2) If rx frame is dropped due to out of memory in macb driver, we leave
    the receive ring descriptors in an undefined state. From Punnaiah
    Choudary Kalluri

    3) Some netlink subsystems erroneously signal NLM_F_MULTI. That is
    only for dumps. Fix from Nicolas Dichtel.

    4) Fix mis-use of raw rt->rt_pmtu value in ipv4, one must always go via
    the ipv4_mtu() helper. From Herbert Xu.

    5) Fix null deref in bridge netfilter, and miscalculated lengths in
    jump/goto nf_tables verdicts. From Florian Westphal.

    6) Unhash ping sockets properly.

    7) Software implementation of BPF divide did 64/32 rather than 64/64
    bit divide. The JITs got it right. Fix from Alexei Starovoitov.
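
    The difference in item 7 can be reproduced in plain C. Both helper
    names are hypothetical; the first truncates the divisor to 32 bits,
    the second divides by the full 64-bit value. Callers are assumed to
    have rejected a zero divisor.

```c
#include <stdint.h>

/* 64/32: the divisor silently loses its upper 32 bits. */
static uint64_t div_64_by_32(uint64_t dst, uint64_t src)
{
    return dst / (uint32_t)src;
}

/* 64/64: the full divisor is used. */
static uint64_t div_64_by_64(uint64_t dst, uint64_t src)
{
    return dst / src;
}
```

    For dst = 10 and src = 0x100000002 the truncated divisor is 2, so
    the first helper yields 5 while the second yields 0.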

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
    ipv4: Missing sk_nulls_node_init() in ping_unhash().
    net: fec: Fix RGMII-ID mode
    net/mlx4_en: Schedule napi when RX buffers allocation fails
    netxen_nic: use spin_[un]lock_bh around tx_clean_lock
    net/mlx4_core: Fix unaligned accesses
    mlx4_en: Use correct loop cursor in error path.
    cxgb4: Fix MC1 memory offset calculation
    bnx2x: Delay during kdump load
    net: Fix Kernel Panic in bonding driver debugfs file: rlb_hash_table
    net: dsa: Fix scope of eeprom-length property
    net: macb: Fix race condition in driver when Rx frame is dropped
    hv_netvsc: Fix a bug in netvsc_start_xmit()
    altera_tse: Correct rx packet length
    mlx4: Fix tx ring affinity_mask creation
    tipc: fix problem with parallel link synchronization mechanism
    tipc: remove wrong use of NLM_F_MULTI
    bridge/nl: remove wrong use of NLM_F_MULTI
    bridge/mdb: remove wrong use of NLM_F_MULTI
    net: sched: act_connmark: don't zap skb->nfct
    trivial: net: systemport: bcmsysport.h: fix 0x0x prefix
    ...

    Linus Torvalds
     
  • Here the "other side" refers to the guest or host.

    Signed-off-by: Stefan Hajnoczi
    Signed-off-by: Rusty Russell
    Signed-off-by: Linus Torvalds

    Stefan Hajnoczi
     

01 May, 2015

6 commits

  • Pull power management and ACPI fixes from Rafael Wysocki:
    "Three regression fixes this time, one for a recent regression in the
    cpuidle core affecting multiple systems, one for an inadvertently
    added duplicate typedef in ACPICA that breaks compilation with GCC 4.5
    and one for an ACPI Smart Battery Subsystem driver regression
    introduced during the 3.18 cycle (stable-candidate).

    Specifics:

    - Fix for a regression in the cpuidle core introduced by one of the
    recent commits in the clockevents_notify() removal series that put
    a call to a function which had to be executed with disabled
    interrupts into a code path running with enabled interrupts (Rafael
    J Wysocki)

    - Fix for a build problem in ACPICA (with GCC 4.5) introduced by one
    of the recent ACPICA tools commits that added a duplicate typedef
    to one of the ACPICA's header files by mistake (Olaf Hering)

    - Fix for a regression in the ACPI SBS (Smart Battery Subsystem)
    driver introduced during the 3.18 development cycle causing the
    smart battery manager to be marked as not present when it should be
    marked as present (Chris Bainbridge)"

    * tag 'pm+acpi-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    cpuidle: Run tick_broadcast_exit() with disabled interrupts
    ACPI / SBS: Enable battery manager when present
    ACPICA: remove duplicate u8 typedef

    Linus Torvalds
     
  • Pull sound fixes from Takashi Iwai:
    "One nice fix is Peter's patch to make the old good SB Audigy PCI to
    work with 32bit DMA instead of 31bit. This allows the MIDI synth
    running on modern machines again. Along with it, a few fixes for
    emu10k1 have merged.

    In ASoC side, there is one fix in the common code, but it's just
    trivial additions of static inline functions for CONFIG_PM=n. The
    rest are various device-specific small fixes.

    Last but not least, a few HD-audio fixes are included, as usual, too"

    * tag 'sound-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
    ASoC: rt5677: fixed wrong DMIC ref clock
    ALSA: emu10k1: Emu10k2 32 bit DMA mode
    ALSA: emux: Fix mutex deadlock in OSS emulation
    ASoC: Update email-id of Rajeev Kumar
    ASoC: rt5645: Fix mask for setting RT5645_DMIC_2_DP_GPIO12 bit
    ALSA: hda - Fix missing va_end() call in snd_hda_codec_pcm_new()
    ALSA: emux: Fix mutex deadlock at unloading
    ALSA: emu10k1: Fix card shortname string buffer overflow
    ALSA: hda - Add mute-LED mode control to Thinkpad
    ALSA: hda - Fix mute-LED fixed mode
    ALSA: hda - Fix click noise at start on Dell XPS13
    ASoC: rt5645: Add ACPI match ID
    ASoC: rt5677: add register patch for PLL
    ASoC: Intel: fix the makefile for atom code
    ASoC: dapm: Enable autodisable on SOC_DAPM_SINGLE_TLV_AUTODISABLE
    ASoC: add static inline funcs to fix a compiling issue
    ASoC: Intel: sst_byt: remove kfree for memory allocated with devm_kzalloc
    ASoC: samsung: s3c24xx-i2s: Fix return value check in s3c24xx_iis_dev_probe()
    ASoC: tfa9879: Fix return value check in tfa9879_i2c_probe()
    ASoC: fsl_ssi: Fix platform_get_irq() error handling
    ...

    Linus Torvalds
     
  • …ie/sound into for-linus

    ASoC: Fixes for v4.1

    A few fixes for v4.1, none earth shattering and mostly driver related
    except for one change to fix !PM builds for Intel platforms which is
    done by adding stubs in the core so other platforms don't run into the
    same issue.

    Takashi Iwai
     
  • Pull kvm changes from Paolo Bonzini:
    "Remove from guest code the handling of task migration during a pvclock
    read; instead use the correct protocol in KVM.

    This removes the need for task migration notifiers in core scheduler
    code"

    [ The scheduler people really hated the migration notifiers, so this was
    kind of required - Linus ]

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    x86: pvclock: Really remove the sched notifier for cross-cpu migrations
    kvm: x86: fix kvmclock update protocol

    Linus Torvalds
     
  • Pull tty/serial fixes from Greg KH:
    "Here are some small tty/serial driver fixes for 4.1-rc2.

    They include some minor fixes that resolve reported issues, and a new
    device quirk.

    All have been in linux-next successfully"

    * tag 'tty-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
    serial: 8250_pci: Add support for 16 port Exar boards
    serial: samsung: fix serial console break
    tty/serial: at91: maxburst was missing for dma transfers
    serial: of-serial: Remove device_type = "serial" registration
    serial: xilinx: Use platform_get_irq to get irq description structure
    serial: core: Fix kernel-doc build warnings
    tty: Re-add external interface for tty_set_termios()

    Linus Torvalds
     
  • Pull USB fixes from Greg KH:
    "Here are a number of small USB fixes for 4.1-rc2. They revert one
    problem patch, fix some minor things, and add some new quirks for
    "broken" devices.

    All have been in linux-next successfully"

    * tag 'usb-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
    cdc-acm: prevent infinite loop when parsing CDC headers.
    Revert "usb: host: ehci-msm: Use devm_ioremap_resource instead of devm_ioremap"
    usb: chipidea: otg: remove mutex unlock and lock while stop and start role
    uas: Set max_sectors_240 quirk for ASM1053 devices
    uas: Add US_FL_MAX_SECTORS_240 flag
    uas: Allow uas_use_uas_driver to return usb-storage flags

    Linus Torvalds
     

30 Apr, 2015

2 commits

  • NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
    it is sent only at the end of a dump.

    Libraries like libnl will wait forever for NLMSG_DONE.
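
    Why a stray NLM_F_MULTI makes a receiver hang can be seen from a
    typical receive loop. The sketch below uses the real struct
    nlmsghdr, NLMSG_OK(), NLMSG_NEXT(), NLM_F_MULTI and NLMSG_DONE from
    <linux/netlink.h>; expects_more() itself is a hypothetical helper
    and not libnl code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Scan one received buffer. Returns true if the caller has to keep
 * reading because a multipart message was seen without NLMSG_DONE.
 */
static bool expects_more(void *buf, size_t len)
{
    struct nlmsghdr *nh;
    int remaining = (int)len;
    bool multi = false;

    for (nh = buf; NLMSG_OK(nh, remaining); nh = NLMSG_NEXT(nh, remaining)) {
        if (nh->nlmsg_type == NLMSG_DONE)
            return false;           /* dump finished */
        if (nh->nlmsg_flags & NLM_F_MULTI)
            multi = true;           /* sender promised an NLMSG_DONE */
    }
    return multi;
}
```

    If the kernel sets NLM_F_MULTI on a single reply and never sends
    NLMSG_DONE, a loop built on this check keeps waiting.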

    Fixes: e5a55a898720 ("net: create generic bridge ops")
    Fixes: 815cccbf10b2 ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
    CC: John Fastabend
    CC: Sathya Perla
    CC: Subbu Seetharaman
    CC: Ajit Khaparde
    CC: Jeff Kirsher
    CC: intel-wired-lan@lists.osuosl.org
    CC: Jiri Pirko
    CC: Scott Feldman
    CC: Stephen Hemminger
    CC: bridge@lists.linux-foundation.org
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • Commit 77e32c89a711 ("clockevents: Manage device's state separately for
    the core") decouples clockevent device's modes from states. With this
    change when a Xen guest tries to resume, it won't be calling its
    set_mode op which needs to be done on each VCPU in order to make the
    hypervisor aware that we are in oneshot mode.

    This happens because clockevents_tick_resume() (which is an intermediate
    step of resuming ticks on a processor) doesn't call clockevents_set_state()
    anymore and because during suspend clockevent devices on all VCPUs (except
    for the one doing the suspend) are left in ONESHOT state. As a result, during
    resume the clockevents state machine will assume that device is already
    where it should be and doesn't need to be updated.

    To avoid this problem we should suspend ticks on all VCPUs during
    suspend.

    Signed-off-by: Boris Ostrovsky
    Signed-off-by: David Vrabel

    Boris Ostrovsky
     

29 Apr, 2015

5 commits

  • …m', 'asoc/fix/qcom' and 'asoc/fix/rcar' into asoc-linus

    Mark Brown
     
  • Mark Brown
     
  • Looks like audigy emu10k2 (probably emu10k1 - sb live too) support two
    modes for DMA. Second mode is useful for 64 bit os with more than 2 GB
    of ram (fixes problems with big soundfont loading)

    1) 32MB from 2 GB address space using 8192 pages (used now as default)
    2) 16MB from 4 GB address space using 4096 pages

    Mode is set using HCFG_EXPANDED_MEM flag in HCFG register.
    Also format of emu10k2 page table is then different.
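
    The two window sizes follow from the page counts if each page-table
    entry maps 4 KiB; that page size is an assumption of this sketch,
    and dma_window_bytes() is a hypothetical helper.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed size mapped by one page-table entry */

/* Total DMA window covered by a page table with 'pages' entries. */
static uint64_t dma_window_bytes(uint32_t pages)
{
    return (uint64_t)pages * SKETCH_PAGE_SIZE;
}
```

    Under that assumption 8192 pages cover 32 MiB and 4096 pages cover
    16 MiB, matching the two modes listed above.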

    Signed-off-by: Peter Zubaj
    Tested-by: Takashi Iwai
    Cc:
    Signed-off-by: Takashi Iwai

    Peter Zubaj
     
  • During commit e252652fb266 ("ACPICA: acpidump: Remove integer types
    translation protection.") two 'unsigned char' types got converted to 'u8'.

    The result does not compile with gcc-4.5, it can not cope with duplicate
    typedefs.

    Signed-off-by: Olaf Hering
    Signed-off-by: Rafael J. Wysocki

    Olaf Hering
     
  • Pull s390 updates from Martin Schwidefsky:
    "One additional new feature for 4.1, a new PRNG based on SHA-512 for
    the zcrypt driver.

    Two memory management related changes, the page table reallocation for
    KVM is removed, and with file ptes gone the encoding of page table
    entries is improved.

    And three bug fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator.
    s390/mm: change swap pte encoding and pgtable cleanup
    s390/mm: correct transfer of dirty & young bits in __pmd_to_pte
    s390/bpf: add dependency to z196 features
    s390/3215: free memory in error path
    s390/kvm: remove delayed reallocation of page tables for KVM
    kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP

    Linus Torvalds
     

28 Apr, 2015

6 commits

  • rajeev-dlh.kumar@st.com email-id doesn't exist anymore as I have left the
    company. Replace ST's id with Rajeev Kumar

    Signed-off-by: Rajeev Kumar
    Signed-off-by: Mark Brown

    Rajeev Kumar
     
  • This is needed by Bluetooth hci_uart module to be able to change speed
    of Bluetooth controller and local UART.

    Signed-off-by: Frederic Danis
    Reviewed-by: Peter Hurley
    Cc: Marcel Holtmann
    Signed-off-by: Greg Kroah-Hartman

    Frederic Danis
     
  • The usb-storage driver sets max_sectors = 240 in its scsi-host template.
    For uas we do not want to do that for all devices, but testing has shown
    that some devices need it.

    This commit adds a US_FL_MAX_SECTORS_240 flag for such devices and
    implements support for it in uas.c. While at it, it also adds support
    for US_FL_MAX_SECTORS_64 to uas.c.
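
    The per-device selection can be pictured as below. This is a minimal
    sketch, not the code in uas.c: the EX_FL_* bit values and the
    ex_pick_max_sectors() helper are hypothetical stand-ins for the real
    US_FL_* quirk flags, and the order in which the flags are tested is an
    assumption.

```c
/* Hypothetical stand-ins for the US_FL_MAX_SECTORS_* quirk flags;
 * the bit positions here are made up for the example. */
#define EX_FL_MAX_SECTORS_64  (1ul << 0)
#define EX_FL_MAX_SECTORS_240 (1ul << 1)

/* Return the transfer limit, in sectors, for a device with the given
 * quirk flags. Devices without a quirk keep the caller's default. */
unsigned int ex_pick_max_sectors(unsigned long flags, unsigned int dflt)
{
    if (flags & EX_FL_MAX_SECTORS_64)
        return 64;      /* the most restrictive quirk wins */
    if (flags & EX_FL_MAX_SECTORS_240)
        return 240;     /* same limit usb-storage applies to everything */
    return dflt;
}
```

    Only devices that carry a quirk flag are limited; every other uas device
    keeps the larger default.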

    Cc: stable@vger.kernel.org # 3.16
    Signed-off-by: Hans de Goede
    Acked-by: Alan Stern
    Signed-off-by: Greg Kroah-Hartman

    Hans de Goede
     
  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net

    The following patchset contains Netfilter fixes for your net tree,
    they are:

    1) Fix a crash in nf_tables when dictionaries are used from the ruleset,
    due to memory corruption, from Florian Westphal.

    2) Fix another crash in nf_queue when used with br_netfilter. Also from
    Florian.

    Both fixes are related to new stuff that got in 4.0-rc.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pull networking fixes from David Miller:

    1) mlx4 doesn't check fully for supported valid RSS hash function, fix
    from Amir Vadai

    2) Off by one in ibmveth_change_mtu(), from David Gibson

    3) Prevent altera chip from reporting false error interrupts in some
    circumstances, from Chee Nouk Phoon

    4) Get rid of that stupid endless loop trying to allocate a FIN packet
    in TCP, and in the process kill deadlocks. From Eric Dumazet

    5) Fix get_rps_cpus() crash due to wrong invalid-cpu value, also from
    Eric Dumazet

    6) Fix two bugs in async rhashtable resizing, from Thomas Graf

    7) Fix topology server listener socket namespace bug in TIPC, from Ying
    Xue

    8) Add some missing HAS_DMA kconfig dependencies, from Geert
    Uytterhoeven

    9) bgmac driver intends to force re-polling but does so by returning
    the wrong value from its ->poll() handler. Fix from Rafał Miłecki

    10) When the creator of an rhashtable configures a max size for it,
    don't bark in the logs and drop insertions when that is exceeded.
    Fix from Johannes Berg

    11) Recover from out of order packets in ppp mppe properly, from Sylvain
    Rochet

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
    bnx2x: really disable TPA if 'disable_tpa' option is set
    net:treewide: Fix typo in drivers/net
    net/mlx4_en: Prevent setting invalid RSS hash function
    mdio-mux-gpio: use new gpiod_get_array and gpiod_put_array functions
    netfilter; Add some missing default cases to switch statements in nft_reject.
    ppp: mppe: discard late packet in stateless mode
    ppp: mppe: sanity error path rework
    net/bonding: Make DRV macros private
    net: rfs: fix crash in get_rps_cpus()
    altera tse: add support for fixed-links.
    pxa168: fix double deallocation of managed resources
    net: fix crash in build_skb()
    net: eth: altera: Resolve false errors from MSGDMA to TSE
    ehea: Fix memory hook reference counting crashes
    net/tg3: Release IRQs on permanent error
    net: mdio-gpio: support access that may sleep
    inet: fix possible panic in reqsk_queue_unlink()
    rhashtable: don't attempt to grow when at max_size
    bgmac: fix requests for extra polling calls from NAPI
    tcp: avoid looping in tcp_send_fin()
    ...

    Linus Torvalds
     
  • This works around an issue with QNAP iSCSI targets not handling large IOs
    very well.

    The target returns:

    VPD INQUIRY: Block limits page (SBC)
    Maximum compare and write length: 1 blocks
    Optimal transfer length granularity: 1 blocks
    Maximum transfer length: 4294967295 blocks
    Optimal transfer length: 4294967295 blocks
    Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
    Maximum unmap LBA count: 8388607
    Maximum unmap block descriptor count: 1
    Optimal unmap granularity: 16383
    Unmap granularity alignment valid: 0
    Unmap granularity alignment: 0
    Maximum write same length: 0xffffffff blocks
    Maximum atomic transfer length: 0
    Atomic alignment: 0
    Atomic transfer length granularity: 0

    and it is *sometimes* able to handle at least one IO of up to 8 MB. Traces
    show that it sometimes works, but at other times it appears to fail, and it
    sometimes appears to return failures when we send multiple large IOs. It
    can also return two different errors: sometimes an iSCSI reject indicating
    out of resources, and sometimes a check condition reporting an illegal
    request with an invalid CDB. When it sends iSCSI rejects, it does not seem
    to handle retries when there are command sequence holes, so I could not
    just add code to try to handle that error gracefully.

    The problem is that we do not have a good contact for the company,
    so we are not able to determine under what conditions it returns
    which error and why it sometimes works.

    So, this patch just adds a new blacklist flag that limits targets like this
    to the old safe maximum of 1024 sectors. The max_hw_sectors changes added
    in 3.19 caused this regression, so I am also CCing stable.
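
    To put the numbers in perspective, the sketch below clamps the limit the
    target advertises and counts how many requests a large IO is split into.
    It is illustrative only: the ex_* names are hypothetical, and it assumes
    512-byte sectors, which the commit message does not state.

```c
#include <stdint.h>

#define EX_SECTOR_SIZE      512u   /* assumed sector size in bytes */
#define EX_SAFE_MAX_SECTORS 1024u  /* the old safe maximum named above */

/* Limit actually used: a target marked as limited is capped at the
 * safe maximum, any other target keeps what it reported. */
uint32_t ex_effective_max_sectors(uint32_t reported, int limited)
{
    if (limited && reported > EX_SAFE_MAX_SECTORS)
        return EX_SAFE_MAX_SECTORS;
    return reported;
}

/* Number of requests needed to carry io_bytes when each request may
 * transfer at most max_sectors sectors (max_sectors must be nonzero). */
uint64_t ex_requests_needed(uint64_t io_bytes, uint32_t max_sectors)
{
    uint64_t max_bytes = (uint64_t)max_sectors * EX_SECTOR_SIZE;

    return (io_bytes + max_bytes - 1) / max_bytes;
}
```

    Under these assumptions, 1024 sectors is 512 KB per request, so an 8 MB IO
    to a limited target goes out as 16 requests instead of one.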

    Reported-by: Christian Hesse
    Signed-off-by: Mike Christie
    Cc: stable@vger.kernel.org
    Reviewed-by: Christoph Hellwig
    Signed-off-by: James Bottomley

    Mike Christie
     

27 Apr, 2015

1 commit

  • This reverts commits 0a4e6be9ca17c54817cf814b4b5aa60478c6df27
    and 80f7fdb1c7f0f9266421f823964fd1962681f6ce.

    The task migration notifier was originally introduced in order to support
    the pvclock vsyscall with non-synchronized TSC, but KVM only supports it
    with synchronized TSC. Hence, on KVM the race condition only needs handling
    because of a bad implementation on the host side, and even then it's so
    rare that it's mostly theoretical.

    As far as KVM is concerned it's possible to fix the host, avoiding the
    additional complexity in the vDSO and the (re)introduction of the task
    migration notifier.

    Xen, on the other hand, hasn't yet implemented vsyscall support at
    all, so we do not care about its plans for non-synchronized TSC.

    Reported-by: Peter Zijlstra
    Suggested-by: Marcelo Tosatti
    Signed-off-by: Paolo Bonzini

    Paolo Bonzini