Eric Lee / smarc-fsl-linux-kernel

08 May, 2015

7 commits

2aa79af64 locking/qspinlock: Revert to test-and-set on hypervisors ... Browse Code »

When we detect a hypervisor (!paravirt, see qspinlock paravirt support
patches), revert to a simple test-and-set lock to avoid the horrors
of queue preemption.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Boris Ostrovsky
Cc: Borislav Petkov
Cc: Daniel J Blueman
Cc: David Vrabel
Cc: Douglas Hatch
Cc: H. Peter Anvin
Cc: Konrad Rzeszutek Wilk
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Paolo Bonzini
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Raghavendra K T
Cc: Rik van Riel
Cc: Scott J Norton
Cc: Thomas Gleixner
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-8-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar

Peter Zijlstra (Intel)
2015-05-08 18:36:58 +0800
69f9cae90 locking/qspinlock: Optimize for smaller NR_CPUS ... Browse Code »

When we allow for a max NR_CPUS < 2^14 we can optimize the pending
wait-acquire and the xchg_tail() operations.

By growing the pending bit to a byte, we reduce the tail to 16bit.
This means we can use xchg16 for the tail part and do away with all
the repeated compxchg() operations.

This in turn allows us to unconditionally acquire; the locked state
as observed by the wait loops cannot change. And because both locked
and pending are now a full byte we can use simple stores for the
state transition, obviating one atomic operation entirely.

This optimization is needed to make the qspinlock achieve performance
parity with ticket spinlock at light load.

All this is horribly broken on Alpha pre EV56 (and any other arch that
cannot do single-copy atomic byte stores).

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Boris Ostrovsky
Cc: Borislav Petkov
Cc: Daniel J Blueman
Cc: David Vrabel
Cc: Douglas Hatch
Cc: H. Peter Anvin
Cc: Konrad Rzeszutek Wilk
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Paolo Bonzini
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Raghavendra K T
Cc: Rik van Riel
Cc: Scott J Norton
Cc: Thomas Gleixner
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-6-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar

Peter Zijlstra (Intel)
2015-05-08 18:36:48 +0800
6403bd7d0 locking/qspinlock: Extract out code snippets for the next patch ... Browse Code »

This is a preparatory patch that extracts out the following 2 code
snippets to prepare for the next performance optimization patch.

1) the logic for the exchange of new and previous tail code words
into a new xchg_tail() function.
2) the logic for clearing the pending bit and setting the locked bit
into a new clear_pending_set_locked() function.

This patch also simplifies the trylock operation before queuing by
calling queued_spin_trylock() directly.

Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Boris Ostrovsky
Cc: Borislav Petkov
Cc: Daniel J Blueman
Cc: David Vrabel
Cc: Douglas Hatch
Cc: H. Peter Anvin
Cc: Konrad Rzeszutek Wilk
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Paolo Bonzini
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Raghavendra K T
Cc: Rik van Riel
Cc: Scott J Norton
Cc: Thomas Gleixner
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-5-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar

Waiman Long
2015-05-08 18:36:41 +0800
c1fb159db locking/qspinlock: Add pending bit ... Browse Code »

Because the qspinlock needs to touch a second cacheline (the per-cpu
mcs_nodes[]); add a pending bit and allow a single in-word spinner
before we punt to the second cacheline.

It is possible so observe the pending bit without the locked bit when
the last owner has just released but the pending owner has not yet
taken ownership.

In this case we would normally queue -- because the pending bit is
already taken. However, in this case the pending bit is guaranteed
to be released 'soon', therefore wait for it and avoid queueing.

Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Boris Ostrovsky
Cc: Borislav Petkov
Cc: Daniel J Blueman
Cc: David Vrabel
Cc: Douglas Hatch
Cc: H. Peter Anvin
Cc: Konrad Rzeszutek Wilk
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Paolo Bonzini
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Raghavendra K T
Cc: Rik van Riel
Cc: Scott J Norton
Cc: Thomas Gleixner
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-4-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar

Peter Zijlstra (Intel)
2015-05-08 18:36:32 +0800
a33fda35e locking/qspinlock: Introduce a simple generic 4-byte queued spinlock ... Browse Code »

This patch introduces a new generic queued spinlock implementation that
can serve as an alternative to the default ticket spinlock. Compared
with the ticket spinlock, this queued spinlock should be almost as fair
as the ticket spinlock. It has about the same speed in single-thread
and it can be much faster in high contention situations especially when
the spinlock is embedded within the data structure to be protected.

Only in light to moderate contention where the average queue depth
is around 1-3 will this queued spinlock be potentially a bit slower
due to the higher slowpath overhead.

This queued spinlock is especially suit to NUMA machines with a large
number of cores as the chance of spinlock contention is much higher
in those machines. The cost of contention is also higher because of
slower inter-node memory traffic.

Due to the fact that spinlocks are acquired with preemption disabled,
the process will not be migrated to another CPU while it is trying
to get a spinlock. Ignoring interrupt handling, a CPU can only be
contending in one spinlock at any one time. Counting soft IRQ, hard
IRQ and NMI, a CPU can only have a maximum of 4 concurrent lock waiting
activities. By allocating a set of per-cpu queue nodes and used them
to form a waiting queue, we can encode the queue node address into a
much smaller 24-bit size (including CPU number and queue node index)
leaving one byte for the lock.

Please note that the queue node is only needed when waiting for the
lock. Once the lock is acquired, the queue node can be released to
be used later.

Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Boris Ostrovsky
Cc: Borislav Petkov
Cc: Daniel J Blueman
Cc: David Vrabel
Cc: Douglas Hatch
Cc: H. Peter Anvin
Cc: Konrad Rzeszutek Wilk
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Paolo Bonzini
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Raghavendra K T
Cc: Rik van Riel
Cc: Scott J Norton
Cc: Thomas Gleixner
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1429901803-29771-2-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar

Waiman Long
2015-05-08 18:36:25 +0800
663fdcbee kernel: Replace reference to ASSIGN_ONCE() with WRITE_ONCE() in comment ... Browse Code »

Looks like commit :

43239cbe79fc ("kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)")

left behind a reference to ASSIGN_ONCE(). Update this to WRITE_ONCE().

Signed-off-by: Preeti U Murthy
Signed-off-by: Peter Zijlstra (Intel)
Cc: Borislav Petkov
Cc: H. Peter Anvin
Cc: Thomas Gleixner
Cc: borntraeger@de.ibm.com
Cc: dave@stgolabs.net
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20150430115721.22278.94082.stgit@preeti.in.ibm.com
Signed-off-by: Ingo Molnar

Preeti U Murthy
2015-05-08 18:28:53 +0800
59aabfc7e locking/rwsem: Reduce spinlock contention in wakeup after up_read()/up_write() ... Browse Code »

In up_write()/up_read(), rwsem_wake() will be called whenever it
detects that some writers/readers are waiting. The rwsem_wake()
function will take the wait_lock and call __rwsem_do_wake() to do the
real wakeup. For a heavily contended rwsem, doing a spin_lock() on
wait_lock will cause further contention on the heavily contended rwsem
cacheline resulting in delay in the completion of the up_read/up_write
operations.

This patch makes the wait_lock taking and the call to __rwsem_do_wake()
optional if at least one spinning writer is present. The spinning
writer will be able to take the rwsem and call rwsem_wake() later
when it calls up_write(). With the presence of a spinning writer,
rwsem_wake() will now try to acquire the lock using trylock. If that
fails, it will just quit.

Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Waiman Long
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Davidlohr Bueso
Acked-by: Jason Low
Cc: Andrew Morton
Cc: Borislav Petkov
Cc: Douglas Hatch
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Scott J Norton
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1430428337-16802-2-git-send-email-Waiman.Long@hp.com
Signed-off-by: Ingo Molnar

Waiman Long
2015-05-08 18:27:59 +0800

07 May, 2015

3 commits

8cb7c15b3 Merge tag 'for-linus' of git://github.com/dledford/linux ... Browse Code »

Pull infiniband updates from Doug Ledford:
"Minor updates for 4.1-rc

Most of the changes are fairly small and well confined. The iWARP
address reporting changes are the only ones that are a medium size. I
had these queued up prior to rc1, but due to the shuffle in
maintainers, they did not get submitted when I expected. My apologies
for that. I feel comfortable with them however due to the testing
they've received, so I left them in this submission"

* tag 'for-linus' of git://github.com/dledford/linux:
MAINTAINERS: Update InfiniBand subsystem maintainer
MAINTAINERS: add include/rdma/ to InfiniBand subsystem
IPoIB/CM: Fix indentation level
iw_cxgb4: Remove negative advice dmesg warnings
IB/core: Fix unaligned accesses
IB/core: change rdma_gid2ip into void function as it always return zero
IB/qib: use arch_phys_wc_add()
IB/qib: add acounting for MTRR
IB/core: dma unmap optimizations
IB/core: dma map/unmap locking optimizations
RDMA/cxgb4: Report the actual address of the remote connecting peer
RDMA/nes: Report the actual address of the remote connecting peer
RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the connecting peer to its clients
iw_cxgb4: enforce qp/cq id requirements
iw_cxgb4: use BAR2 GTS register for T5 kernel mode CQs
iw_cxgb4: 32b platform fixes
iw_cxgb4: Cleanup register defines/MACROS
RDMA/CMA: Canonize IPv4 on IPV6 sockets properly

Linus Torvalds
2015-05-07 22:04:33 +0800
0e1dc4274 Merge tag 'for-linus-4.1b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip ... Browse Code »

Pull xen bug fixes from David Vrabel:

- fix blkback regression if using persistent grants

- fix various event channel related suspend/resume bugs

- fix AMD x86 regression with X86_BUG_SYSRET_SS_ATTRS

- SWIOTLB on ARM now uses frames evtchn before binding the channel to CPU in __startup_pirq()
xen/console: Update console event channel on resume
xen/xenbus: Update xenbus event channel on resume
xen/events: Clear cpu_evtchn_mask before resuming
xen-pciback: Add name prefix to global 'permissive' variable
xen: Suspend ticks on all CPUs during suspend
xen/grant: introduce func gnttab_unmap_refs_sync()
xen/blkback: safely unmap purge persistent grants

Linus Torvalds
2015-05-07 06:58:06 +0800
d8fce2db7 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also an uncore PMU driver fix and an uncore
PMU driver hardware-enablement addition"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf probe: Fix segfault if passed with ''.
perf report: Fix -T/--threads option to work again
perf bench numa: Fix immediate meeting of convergence condition
perf bench numa: Fixes of --quiet argument
perf bench futex: Fix hung wakeup tasks after requeueing
perf probe: Fix bug with global variables handling
perf top: Fix a segfault when kernel map is restricted.
tools lib traceevent: Fix build failure on 32-bit arch
perf kmem: Fix compiles on RHEL6/OL6
tools lib api: Undefine _FORTIFY_SOURCE before setting it
perf kmem: Consistently use PRIu64 for printing u64 values
perf trace: Disable events and drain events when forked workload ends
perf trace: Enable events when doing system wide tracing and starting a workload
perf/x86/intel/uncore: Move PCI IDs for IMC to uncore driver
perf/x86/intel/uncore: Add support for Intel Haswell ULT (lower power Mobile Processor) IMC uncore PMUs
perf/x86/intel: Add cpu_(prepare|starting|dying) for core_pmu

Linus Torvalds
2015-05-07 01:47:25 +0800

06 May, 2015

5 commits

d8fd150fe nilfs2: fix sanity check of btree level in nilfs_btree_root_broken() ... Browse Code »

The range check for b-tree level parameter in nilfs_btree_root_broken()
is wrong; it accepts the case of "level == NILFS_BTREE_LEVEL_MAX" even
though the level is limited to values in the range of 0 to
(NILFS_BTREE_LEVEL_MAX - 1).

Since the level parameter is read from storage device and used to index
nilfs_btree_path array whose element count is NILFS_BTREE_LEVEL_MAX, it
can cause memory overrun during btree operations if the boundary value
is set to the level parameter on device.

This fixes the broken sanity check and adds a comment to clarify that
the upper bound NILFS_BTREE_LEVEL_MAX is exclusive.

Signed-off-by: Ryusuke Konishi
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ryusuke Konishi
2015-05-06 08:10:11 +0800
05836c378 util_macros.h: have array pointer point to array of constants ... Browse Code »

Using the new find_closest() macro can result in the following sparse
warnings.

drivers/hwmon/lm85.c:194:16: warning:
incorrect type in initializer (different modifiers)
drivers/hwmon/lm85.c:194:16: expected int *__fc_a
drivers/hwmon/lm85.c:194:16: got int static const [toplevel] *
drivers/hwmon/lm85.c:210:16: warning:
incorrect type in initializer (different modifiers)
drivers/hwmon/lm85.c:210:16: expected int *__fc_a
drivers/hwmon/lm85.c:210:16: got int const *map

This is because the array passed to find_closest() will typically be
declared as array of constants, but the macro declares a non-constant
pointer to it.

Signed-off-by: Guenter Roeck
Cc: Bartosz Golaszewski

Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Guenter Roeck
2015-05-06 08:10:11 +0800
0d0f738f6 IB/core: Fix unaligned accesses ... Browse Code »

Addresses the following kernel logs seen during boot of sparc systems:

Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]
Kernel unaligned access at TPC[103bce50] cm_find_listen+0x34/0xf8 [ib_cm]

Signed-off-by: David Ahern
Signed-off-by: Doug Ledford

David Ahern
2015-05-06 01:21:27 +0800
471e70583 IB/core: change rdma_gid2ip into void function as it always return zero ... Browse Code »

Signed-off-by: Honggang Li
Acked-by: Sean Hefty
Signed-off-by: Doug Ledford

Honggang LI
2015-05-06 01:21:27 +0800
d9cee5d4f Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 ... Browse Code »

Pull crypto fixes from Herbert Xu:
"This fixes a build problem with bcm63xx and yet another fix to the
memzero_explicit function to ensure that the memset is not elided"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: bcm63xx - Fix driver compilation
lib: make memzero_explicit more robust against dead store elimination

Linus Torvalds
2015-05-06 00:03:52 +0800

05 May, 2015

1 commit

6eec17746 RDMA/core: Enable the iWarp Port Mapper to provide the actual address of the con… ... Browse Code »

…necting peer to its clients

Add functionality to enable the port mapper on the passive side to provide to its
clients the actual (non-mapped) ip/tcp address information of the connecting peer

1) Adding remote_info_cb() to process the address info of the connecting peer
The address info is provided by the user space port mapper service when
the connection is initiated by the peer
2) Adding a hash list to store the remote address info
3) Adding functionality to add/remove the remote address info
After the info has been provided to the port mapper client,
it is removed from the hash list

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>

Tatyana Nikolova
2015-05-05 21:18:01 +0800

04 May, 2015

2 commits

7829fb09a lib: make memzero_explicit more robust against dead store elimination ... Browse Code »

In commit 0b053c951829 ("lib: memzero_explicit: use barrier instead
of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
case LTO would decide to inline memzero_explicit() and eventually
find out it could be elimiated as dead store.

While using barrier() works well for the case of gcc, recent efforts
from LLVMLinux people suggest to use llvm as an alternative to gcc,
and there, Stephan found in a simple stand-alone user space example
that llvm could nevertheless optimize and thus elimitate the memset().
A similar issue has been observed in the referenced llvm bug report,
which is regarded as not-a-bug.

Based on some experiments, icc is a bit special on its own, while it
doesn't seem to eliminate the memset(), it could do so with an own
implementation, and then result in similar findings as with llvm.

The fix in this patch now works for all three compilers (also tested
with more aggressive optimization levels). Arguably, in the current
kernel tree it's more of a theoretical issue, but imho, it's better
to be pedantic about it.

It's clearly visible with gcc/llvm though, with the below code: if we
would have used barrier() only here, llvm would have omitted clearing,
not so with barrier_data() variant:

static inline void memzero_explicit(void *s, size_t count)
{
memset(s, 0, count);
barrier_data(s);
}

int main(void)
{
char buff[20];
memzero_explicit(buff, sizeof(buff));
return 0;
}

$ gcc -O2 test.c
$ gdb a.out
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400400 : lea -0x28(%rsp),%rax
0x0000000000400405 : movq $0x0,-0x28(%rsp)
0x000000000040040e : movq $0x0,-0x20(%rsp)
0x0000000000400417 : movl $0x0,-0x18(%rsp)
0x000000000040041f : xor %eax,%eax
0x0000000000400421 : retq
End of assembler dump.

$ clang -O2 test.c
$ gdb a.out
(gdb) disassemble main
Dump of assembler code for function main:
0x00000000004004f0 : xorps %xmm0,%xmm0
0x00000000004004f3 : movaps %xmm0,-0x18(%rsp)
0x00000000004004f8 : movl $0x0,-0x8(%rsp)
0x0000000000400500 : lea -0x18(%rsp),%rax
0x0000000000400505 : xor %eax,%eax
0x0000000000400507 : retq
End of assembler dump.

As gcc, clang, but also icc defines __GNUC__, it's sufficient to define
this in compiler-gcc.h only to be picked up. For a fallback or otherwise
unsupported compiler, we define it as a barrier. Similarly, for ecc which
does not support gcc inline asm.

Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
Reported-by: Stephan Mueller
Tested-by: Stephan Mueller
Signed-off-by: Daniel Borkmann
Cc: Theodore Ts'o
Cc: Stephan Mueller
Cc: Hannes Frederic Sowa
Cc: mancha security
Cc: Mark Charlebois
Cc: Behan Webster
Signed-off-by: Herbert Xu

Daniel Borkmann
2015-05-04 17:49:51 +0800
61f06db00 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi ... Browse Code »

Pull SCSI fixes from James Bottomley:
"This is three logical fixes (as 5 patches).

The 3ware class of drivers were causing an oops with multiqueue by
tearing down the command mappings after completing the command (where
the variables in the command used to tear down the mapping were
no-longer valid). There's also a fix for the qnap iscsi target which
was choking on us sending it commands that were too long and a fix for
the reworked aha1542 allocating GFP_KERNEL under a lock"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
3w-9xxx: fix command completion race
3w-xxxx: fix command completion race
3w-sas: fix command completion race
aha1542: Allocate memory before taking a lock
SCSI: add 1024 max sectors black list flag

Linus Torvalds
2015-05-04 04:22:32 +0800

02 May, 2015

2 commits

6c3c1eb3c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) Receive packet length needs to be adjust by 2 on RX to accomodate
the two padding bytes in altera_tse driver. From Vlastimil Setka.

2) If rx frame is dropped due to out of memory in macb driver, we leave
the receive ring descriptors in an undefined state. From Punnaiah
Choudary Kalluri

3) Some netlink subsystems erroneously signal NLM_F_MULTI. That is
only for dumps. Fix from Nicolas Dichtel.

4) Fix mis-use of raw rt->rt_pmtu value in ipv4, one must always go via
the ipv4_mtu() helper. From Herbert Xu.

5) Fix null deref in bridge netfilter, and miscalculated lengths in
jump/goto nf_tables verdicts. From Florian Westphal.

6) Unhash ping sockets properly.

7) Software implementation of BPF divide did 64/32 rather than 64/64
bit divide. The JITs got it right. Fix from Alexei Starovoitov.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
ipv4: Missing sk_nulls_node_init() in ping_unhash().
net: fec: Fix RGMII-ID mode
net/mlx4_en: Schedule napi when RX buffers allocation fails
netxen_nic: use spin_[un]lock_bh around tx_clean_lock
net/mlx4_core: Fix unaligned accesses
mlx4_en: Use correct loop cursor in error path.
cxgb4: Fix MC1 memory offset calculation
bnx2x: Delay during kdump load
net: Fix Kernel Panic in bonding driver debugfs file: rlb_hash_table
net: dsa: Fix scope of eeprom-length property
net: macb: Fix race condition in driver when Rx frame is dropped
hv_netvsc: Fix a bug in netvsc_start_xmit()
altera_tse: Correct rx packet length
mlx4: Fix tx ring affinity_mask creation
tipc: fix problem with parallel link synchronization mechanism
tipc: remove wrong use of NLM_F_MULTI
bridge/nl: remove wrong use of NLM_F_MULTI
bridge/mdb: remove wrong use of NLM_F_MULTI
net: sched: act_connmark: don't zap skb->nfct
trivial: net: systemport: bcmsysport.h: fix 0x0x prefix
...

Linus Torvalds
2015-05-02 11:51:04 +0800
e412d3a32 virtio: fix typo in vring_need_event() doc comment ... Browse Code »

Here the "other side" refers to the guest or host.

Signed-off-by: Stefan Hajnoczi
Signed-off-by: Rusty Russell
Signed-off-by: Linus Torvalds

Stefan Hajnoczi
2015-05-02 11:46:32 +0800

01 May, 2015

6 commits

4a152c391 Merge tag 'pm+acpi-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull power management and ACPI fixes from Rafael Wysocki:
"Three regression fixes this time, one for a recent regression in the
cpuidle core affecting multiple systems, one for an inadvertently
added duplicate typedef in ACPICA that breaks compilation with GCC 4.5
and one for an ACPI Smart Battery Subsystem driver regression
introduced during the 3.18 cycle (stable-candidate).

Specifics:

- Fix for a regression in the cpuidle core introduced by one of the
recent commits in the clockevents_notify() removal series that put
a call to a function which had to be executed with disabled
interrupts into a code path running with enabled interrupts (Rafael
J Wysocki)

- Fix for a build problem in ACPICA (with GCC 4.5) introduced by one
of the recent ACPICA tools commits that added a duplicate typedef
to one of the ACPICA's header files by mistake (Olaf Hering)

- Fix for a regression in the ACPI SBS (Smart Battery Subsystem)
driver introduced during the 3.18 development cycle causing the
smart battery manager to be marked as not present when it should be
marked as present (Chris Bainbridge)"

* tag 'pm+acpi-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: Run tick_broadcast_exit() with disabled interrupts
ACPI / SBS: Enable battery manager when present
ACPICA: remove duplicate u8 typedef

Linus Torvalds
2015-05-01 05:23:31 +0800
5a2e73b28 Merge tag 'sound-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound ... Browse Code »

Pull sound fixes from Takashi Iwai:
"One nice fix is Peter's patch to make the old good SB Audigy PCI to
work with 32bit DMA instead of 31bit. This allows the MIDI synth
running on modern machines again. Along with it, a few fixes for
emu10k1 have merged.

In ASoC side, there is one fix in the common code, but it's just
trivial additions of static inline functions for CONFIG_PM=n. The
rest are various device-specific small fixes.

Last but not least, a few HD-audio fixes are included, as usual, too"

* tag 'sound-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
ASoC: rt5677: fixed wrong DMIC ref clock
ALSA: emu10k1: Emu10k2 32 bit DMA mode
ALSA: emux: Fix mutex deadlock in OSS emulation
ASoC: Update email-id of Rajeev Kumar
ASoC: rt5645: Fix mask for setting RT5645_DMIC_2_DP_GPIO12 bit
ALSA: hda - Fix missing va_end() call in snd_hda_codec_pcm_new()
ALSA: emux: Fix mutex deadlock at unloading
ALSA: emu10k1: Fix card shortname string buffer overflow
ALSA: hda - Add mute-LED mode control to Thinkpad
ALSA: hda - Fix mute-LED fixed mode
ALSA: hda - Fix click noise at start on Dell XPS13
ASoC: rt5645: Add ACPI match ID
ASoC: rt5677: add register patch for PLL
ASoC: Intel: fix the makefile for atom code
ASoC: dapm: Enable autodisable on SOC_DAPM_SINGLE_TLV_AUTODISABLE
ASoC: add static inline funcs to fix a compiling issue
ASoC: Intel: sst_byt: remove kfree for memory allocated with devm_kzalloc
ASoC: samsung: s3c24xx-i2s: Fix return value check in s3c24xx_iis_dev_probe()
ASoC: tfa9879: Fix return value check in tfa9879_i2c_probe()
ASoC: fsl_ssi: Fix platform_get_irq() error handling
...

Linus Torvalds
2015-05-01 05:00:18 +0800
0ae3aba28 Merge tag 'asoc-v4.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broon… ... Browse Code »

…ie/sound into for-linus

ASoC: Fixes for v4.1

A few fixes for v4.1, none earth shattering and mostly driver related
except for one change to fix !PM builds for Intel platforms which is
done by adding stubs in the core so other platforms don't run into the
same issue.

Takashi Iwai
2015-05-01 01:08:06 +0800
9dbbe3cfc Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm ... Browse Code »

Pull kvm changes from Paolo Bonzini:
"Remove from guest code the handling of task migration during a pvclock
read; instead use the correct protocol in KVM.

This removes the need for task migration notifiers in core scheduler
code"

[ The scheduler people really hated the migration notifiers, so this was
kind of required - Linus ]

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86: pvclock: Really remove the sched notifier for cross-cpu migrations
kvm: x86: fix kvmclock update protocol

Linus Torvalds
2015-05-01 00:44:04 +0800
9263a06a5 Merge tag 'tty-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty ... Browse Code »

Pull tty/serial fixes from Greg KH:
"Here are some small tty/serial driver fixes for 4.1-rc2.

They include some minor fixes that resolve reported issues, and a new
device quirk.

All have been in linux-next succesfully"

* tag 'tty-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: 8250_pci: Add support for 16 port Exar boards
serial: samsung: fix serial console break
tty/serial: at91: maxburst was missing for dma transfers
serial: of-serial: Remove device_type = "serial" registration
serial: xilinx: Use platform_get_irq to get irq description structure
serial: core: Fix kernel-doc build warnings
tty: Re-add external interface for tty_set_termios()

Linus Torvalds
2015-05-01 00:30:07 +0800
dcca8de0a Merge tag 'usb-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb ... Browse Code »

Pull USB fixes from Greg KH:
"Here are a number of small USB fixes for 4.2-rc2. They revert one
problem patch, fix some minor things, and add some new quirks for
"broken" devices.

All have been in linux-next successfully"

* tag 'usb-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
cdc-acm: prevent infinite loop when parsing CDC headers.
Revert "usb: host: ehci-msm: Use devm_ioremap_resource instead of devm_ioremap"
usb: chipidea: otg: remove mutex unlock and lock while stop and start role
uas: Set max_sectors_240 quirk for ASM1053 devices
uas: Add US_FL_MAX_SECTORS_240 flag
uas: Allow uas_use_uas_driver to return usb-storage flags

Linus Torvalds
2015-05-01 00:08:53 +0800

30 Apr, 2015

2 commits

46c264daa bridge/nl: remove wrong use of NLM_F_MULTI ... Browse Code »

NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
it is sent only at the end of a dump.

Libraries like libnl will wait forever for NLMSG_DONE.

Fixes: e5a55a898720 ("net: create generic bridge ops")
Fixes: 815cccbf10b2 ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
CC: John Fastabend
CC: Sathya Perla
CC: Subbu Seetharaman
CC: Ajit Khaparde
CC: Jeff Kirsher
CC: intel-wired-lan@lists.osuosl.org
CC: Jiri Pirko
CC: Scott Feldman
CC: Stephen Hemminger
CC: bridge@lists.linux-foundation.org
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller

Nicolas Dichtel
2015-04-30 02:59:16 +0800
2b953a5e9 xen: Suspend ticks on all CPUs during suspend ... Browse Code »

Commit 77e32c89a711 ("clockevents: Manage device's state separately for
the core") decouples clockevent device's modes from states. With this
change when a Xen guest tries to resume, it won't be calling its
set_mode op which needs to be done on each VCPU in order to make the
hypervisor aware that we are in oneshot mode.

This happens because clockevents_tick_resume() (which is an intermediate
step of resuming ticks on a processor) doesn't call clockevents_set_state()
anymore and because during suspend clockevent devices on all VCPUs (except
for the one doing the suspend) are left in ONESHOT state. As result, during
resume the clockevents state machine will assume that device is already
where it should be and doesn't need to be updated.

To avoid this problem we should suspend ticks on all VCPUs during
suspend.

Signed-off-by: Boris Ostrovsky
Signed-off-by: David Vrabel

Boris Ostrovsky
2015-04-30 00:10:05 +0800

29 Apr, 2015

5 commits

a78001b01 Merge remote-tracking branches 'asoc/fix/email', 'asoc/fix/fsl-ssi', 'asoc/fix/p… ... Browse Code »

…m', 'asoc/fix/qcom' and 'asoc/fix/rcar' into asoc-linus

Mark Brown
2015-04-29 20:37:28 +0800
449f1ca62 Merge remote-tracking branch 'asoc/fix/dapm' into asoc-linus Browse Code »

Mark Brown
2015-04-29 20:37:26 +0800
7241ea558 ALSA: emu10k1: Emu10k2 32 bit DMA mode ... Browse Code »

Looks like audigy emu10k2 (probably emu10k1 - sb live too) support two
modes for DMA. Second mode is useful for 64 bit os with more then 2 GB
of ram (fixes problems with big soundfont loading)

1) 32MB from 2 GB address space using 8192 pages (used now as default)
2) 16MB from 4 GB address space using 4096 pages

Mode is set using HCFG_EXPANDED_MEM flag in HCFG register.
Also format of emu10k2 page table is then different.

Signed-off-by: Peter Zubaj
Tested-by: Takashi Iwai
Cc:
Signed-off-by: Takashi Iwai

Peter Zubaj
2015-04-29 13:27:30 +0800
9e9d55e69 ACPICA: remove duplicate u8 typedef ... Browse Code »

During commit e252652fb266 ("ACPICA: acpidump: Remove integer types
translation protection.") two 'unsigned char' types got converted to 'u8'.

The result does not compile with gcc-4.5, it can not cope with duplicate
typedefs.

Signed-off-by: Olaf Hering
Signed-off-by: Rafael J. Wysocki

Olaf Hering
2015-04-29 05:58:54 +0800
14bc84ce0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux ... Browse Code »

Pull s390 updates from Martin Schwidefsky:
"One additional new feature for 4.1, a new PRNG based on SHA-512 for
the zcrypt driver.

Two memory management related changes, the page table reallocation for
KVM is removed, and with file ptes gone the encoding of page table
entries is improved.

And three bug fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator.
s390/mm: change swap pte encoding and pgtable cleanup
s390/mm: correct transfer of dirty & young bits in __pmd_to_pte
s390/bpf: add dependency to z196 features
s390/3215: free memory in error path
s390/kvm: remove delayed reallocation of page tables for KVM
kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP

Linus Torvalds
2015-04-29 00:58:46 +0800

28 Apr, 2015

6 commits

9d7dd6cd2 ASoC: Update email-id of Rajeev Kumar ... Browse Code »

rajeev-dlh.kumar@st.com email-id doesn't exist anymore as I have left the
company. Replace ST's id with Rajeev Kumar

Signed-off-by: Rajeev Kumar
Signed-off-by: Mark Brown

Rajeev Kumar
2015-04-28 23:31:01 +0800
b00f5c2dc tty: Re-add external interface for tty_set_termios() ... Browse Code »

This is needed by Bluetooth hci_uart module to be able to change speed
of Bluetooth controller and local UART.

Signed-off-by: Frederic Danis
Reviewed-by: Peter Hurley
Cc: Marcel Holtmann
Signed-off-by: Greg Kroah-Hartman

Frederic Danis
2015-04-28 20:26:20 +0800
ee136af4a uas: Add US_FL_MAX_SECTORS_240 flag ... Browse Code »

The usb-storage driver sets max_sectors = 240 in its scsi-host template,
for uas we do not want to do that for all devices, but testing has shown
that some devices need it.

This commit adds a US_FL_MAX_SECTORS_240 flag for such devices, and
implements support for it in uas.c, while at it it also adds support
for US_FL_MAX_SECTORS_64 to uas.c.

Cc: stable@vger.kernel.org # 3.16
Signed-off-by: Hans de Goede
Acked-by: Alan Stern
Signed-off-by: Greg Kroah-Hartman

Hans de Goede
2015-04-28 18:48:57 +0800
39376ccb1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for your net tree,
they are:

1) Fix a crash in nf_tables when dictionaries are used from the ruleset,
due to memory corruption, from Florian Westphal.

2) Fix another crash in nf_queue when used with br_netfilter. Also from
Florian.

Both fixes are related to new stuff that got in 4.0-rc.
====================

Signed-off-by: David S. Miller

David S. Miller
2015-04-28 11:12:34 +0800
2decb2682 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) mlx4 doesn't check fully for supported valid RSS hash function, fix
from Amir Vadai

2) Off by one in ibmveth_change_mtu(), from David Gibson

3) Prevent altera chip from reporting false error interrupts in some
circumstances, from Chee Nouk Phoon

4) Get rid of that stupid endless loop trying to allocate a FIN packet
in TCP, and in the process kill deadlocks. From Eric Dumazet

5) Fix get_rps_cpus() crash due to wrong invalid-cpu value, also from
Eric Dumazet

6) Fix two bugs in async rhashtable resizing, from Thomas Graf

7) Fix topology server listener socket namespace bug in TIPC, from Ying
Xue

8) Add some missing HAS_DMA kconfig dependencies, from Geert
Uytterhoeven

9) bgmac driver intends to force re-polling but does so by returning
the wrong value from it's ->poll() handler. Fix from Rafał Miłecki

10) When the creater of an rhashtable configures a max size for it,
don't bark in the logs and drop insertions when that is exceeded.
Fix from Johannes Berg

11) Recover from out of order packets in ppp mppe properly, from Sylvain
Rochet

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
bnx2x: really disable TPA if 'disable_tpa' option is set
net:treewide: Fix typo in drivers/net
net/mlx4_en: Prevent setting invalid RSS hash function
mdio-mux-gpio: use new gpiod_get_array and gpiod_put_array functions
netfilter; Add some missing default cases to switch statements in nft_reject.
ppp: mppe: discard late packet in stateless mode
ppp: mppe: sanity error path rework
net/bonding: Make DRV macros private
net: rfs: fix crash in get_rps_cpus()
altera tse: add support for fixed-links.
pxa168: fix double deallocation of managed resources
net: fix crash in build_skb()
net: eth: altera: Resolve false errors from MSGDMA to TSE
ehea: Fix memory hook reference counting crashes
net/tg3: Release IRQs on permanent error
net: mdio-gpio: support access that may sleep
inet: fix possible panic in reqsk_queue_unlink()
rhashtable: don't attempt to grow when at max_size
bgmac: fix requests for extra polling calls from NAPI
tcp: avoid looping in tcp_send_fin()
...

Linus Torvalds
2015-04-28 05:05:19 +0800
35e9a9f93 SCSI: add 1024 max sectors black list flag ... Browse Code »

This works around a issue with qnap iscsi targets not handling large IOs
very well.

The target returns:

VPD INQUIRY: Block limits page (SBC)
Maximum compare and write length: 1 blocks
Optimal transfer length granularity: 1 blocks
Maximum transfer length: 4294967295 blocks
Optimal transfer length: 4294967295 blocks
Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
Maximum unmap LBA count: 8388607
Maximum unmap block descriptor count: 1
Optimal unmap granularity: 16383
Unmap granularity alignment valid: 0
Unmap granularity alignment: 0
Maximum write same length: 0xffffffff blocks
Maximum atomic transfer length: 0
Atomic alignment: 0
Atomic transfer length granularity: 0

and it is *sometimes* able to handle at least one IO of size up to 8 MB. We
have seen in traces where it will sometimes work, but other times it
looks like it fails and it looks like it returns failures if we send
multiple large IOs sometimes. Also it looks like it can return 2 different
errors. It will sometimes send iscsi reject errors indicating out of
resources or it will send invalid cdb illegal requests check conditions.
And then when it sends iscsi rejects it does not seem to handle retries
when there are command sequence holes, so I could not just add code to
try and gracefully handle that error code.

The problem is that we do not have a good contact for the company,
so we are not able to determine under what conditions it returns
which error and why it sometimes works.

So, this patch just adds a new black list flag to set targets like this to
the old max safe sectors of 1024. The max_hw_sectors changes added in 3.19
caused this regression, so I also ccing stable.

Reported-by: Christian Hesse
Signed-off-by: Mike Christie
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig
Signed-off-by: James Bottomley

Mike Christie
2015-04-28 00:38:06 +0800

27 Apr, 2015

1 commit

73459e2a1 x86: pvclock: Really remove the sched notifier for cross-cpu migrations ... Browse Code »

This reverts commits 0a4e6be9ca17c54817cf814b4b5aa60478c6df27
and 80f7fdb1c7f0f9266421f823964fd1962681f6ce.

The task migration notifier was originally introduced in order to support
the pvclock vsyscall with non-synchronized TSC, but KVM only supports it
with synchronized TSC. Hence, on KVM the race condition is only needed
due to a bad implementation on the host side, and even then it's so rare
that it's mostly theoretical.

As far as KVM is concerned it's possible to fix the host, avoiding the
additional complexity in the vDSO and the (re)introduction of the task
migration notifier.

Xen, on the other hand, hasn't yet implemented vsyscall support at
all, so we do not care about its plans for non-synchronized TSC.

Reported-by: Peter Zijlstra
Suggested-by: Marcelo Tosatti
Signed-off-by: Paolo Bonzini

Paolo Bonzini
2015-04-27 21:49:30 +0800