Eric Lee / smarc-fsl-linux-kernel

23 Oct, 2015

40 commits

e3f916d20 mm/slab: fix unexpected index mapping result of kmalloc_size(INDEX_NODE+1)

commit 03a2d2a3eafe4015412cf4e9675ca0e2d9204074 upstream.

Commit description is copied from the original post of this bug:

http://comments.gmane.org/gmane.linux.kernel.mm/135349

Kernels after v3.9 use kmalloc_size(INDEX_NODE + 1) to get the next
larger cache size than the size index INDEX_NODE mapping. In kernels
3.9 and earlier we used malloc_sizes[INDEX_L3 + 1].cs_size.

However, kmalloc_size(INDEX_NODE + 1) does not always return the size we
expect, which causes a BUG().

The mapping table in the latest kernel is like:
    index = {0,   1,   2,   3,   4,   5,   6,   n}
    size  = {0,  96, 192,   8,  16,  32,  64, 2^n}
The mapping table before 3.10 is like this:
    index = { 0,  1,  2,   3,   4,   5,   6,       n}
    size  = {32, 64, 96, 128, 192, 256, 512, 2^(n+3)}

The problem on my mips64 machine is as follows:

(1) When configured with DEBUG_SLAB && DEBUG_PAGEALLOC && DEBUG_LOCK_ALLOC
&& DEBUG_SPINLOCK, sizeof(struct kmem_cache_node) is 150, so the macro
INDEX_NODE evaluates to 2:
#define INDEX_NODE kmalloc_index(sizeof(struct kmem_cache_node))

(2) The result of kmalloc_size(INDEX_NODE + 1) is then 8.

(3) The test "if (size >= kmalloc_size(INDEX_NODE + 1))" then leads to
"size = PAGE_SIZE".

(4) The test "if (size >= (PAGE_SIZE >> 3))" is then satisfied and
"flags |= CFLGS_OFF_SLAB" is executed.

(5) The test "if (flags & CFLGS_OFF_SLAB)" is satisfied, so execution
reaches "cachep->slabp_cache = kmalloc_slab(slab_size, 0u)", whose result
may be NULL during kernel boot.

(6) Finally, "BUG_ON(ZERO_OR_NULL_PTR(cachep->slabp_cache));" causes the
BUG info as the following shows (possibly only mips64 has this problem):

This patch fixes the problem of kmalloc_size(INDEX_NODE + 1) and removes
the BUG by adding a 'size >= 256' check to guarantee that all necessary
small-sized slabs are initialized regardless of the order of slab sizes
in the mapping table.
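
As a rough illustration of steps (1) and (2), the following userspace model
reproduces the post-v3.9 mapping table quoted above. The names
model_kmalloc_index() and model_kmalloc_size() are hypothetical stand-ins,
not the kernel functions, and the model assumes a minimum kmalloc size of
8 bytes.

```c
#include <stddef.h>

/* Index -> size, following the post-v3.9 table: {0, 96, 192, 8, 16, ...}. */
static size_t model_kmalloc_size(int index)
{
	if (index == 0)
		return 0;
	if (index == 1)
		return 96;
	if (index == 2)
		return 192;
	return (size_t)1 << index;
}

/* Size -> index: smallest cache that fits, with 96 and 192 special-cased. */
static int model_kmalloc_index(size_t size)
{
	int index = 3;

	if (size == 0)
		return 0;
	if (size > 64 && size <= 96)
		return 1;
	if (size > 128 && size <= 192)
		return 2;
	while (((size_t)1 << index) < size)
		index++;
	return index;
}
```

For a 150-byte structure the model returns index 2 (the 192-byte cache), and
looking up index 2 + 1 yields 8 rather than the next larger cache size.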

Fixes: e33660165c90 ("slab: Use common kmalloc_index/kmalloc_size...")
Signed-off-by: Joonsoo Kim
Reported-by: Liuhailong
Acked-by: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Joonsoo Kim
2015-10-23 05:43:27 +0800
13bc967d6 intel_pstate: Fix overflow in busy_scaled due to long delay

commit 7180dddf7c32c49975c7e7babf2b60ed450cb760 upstream.

The kernel may delay interrupts for a long time which can result in timers
being delayed. If this occurs the intel_pstate driver will crash with a
divide by zero error:

divide error: 0000 [#1] SMP
Modules linked in: btrfs zlib_deflate raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc arc4 md4 nls_utf8 cifs dns_resolver tcp_lp bnep bluetooth rfkill fuse dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ftp ip6t_rpfilter ip6t_REJECT ipt_REJECT xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables intel_powerclamp coretemp vfat fat kvm_intel iTCO_wdt iTCO_vendor_support ipmi_devintf sr_mod kvm crct10dif_pclmul
crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel cdc_ether lrw usbnet cdrom mii gf128mul glue_helper ablk_helper cryptd lpc_ich mfd_core pcspkr sb_edac edac_core ipmi_si ipmi_msghandler ioatdma wmi shpchp acpi_pad nfsd auth_rpcgss nfs_acl lockd uinput dm_multipath sunrpc xfs libcrc32c usb_storage sd_mod crc_t10dif crct10dif_common ixgbe mgag200 syscopyarea sysfillrect sysimgblt mdio drm_kms_helper ttm igb drm ptp pps_core dca i2c_algo_bit megaraid_sas i2c_core dm_mirror dm_region_hash dm_log dm_mod
CPU: 113 PID: 0 Comm: swapper/113 Tainted: G W -------------- 3.10.0-229.1.2.el7.x86_64 #1
Hardware name: IBM x3950 X6 -[3837AC2]-/00FN827, BIOS -[A8E112BUS-1.00]- 08/27/2014
task: ffff880fe8abe660 ti: ffff880fe8ae4000 task.ti: ffff880fe8ae4000
RIP: 0010:[] [] intel_pstate_timer_func+0x179/0x3d0
RSP: 0018:ffff883fff4e3db8 EFLAGS: 00010206
RAX: 0000000027100000 RBX: ffff883fe6965100 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000010 RDI: 000000002e53632d
RBP: ffff883fff4e3e20 R08: 000e6f69a5a125c0 R09: ffff883fe84ec001
R10: 0000000000000002 R11: 0000000000000005 R12: 00000000000049f5
R13: 0000000000271000 R14: 00000000000049f5 R15: 0000000000000246
FS: 0000000000000000(0000) GS:ffff883fff4e0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7668601000 CR3: 000000000190a000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff883fff4e3e58 ffffffff81099dc1 0000000000000086 0000000000000071
ffff883fff4f3680 0000000000000071 fbdc8a965e33afee ffffffff810b69dd
ffff883fe84ec000 ffff883fe6965108 0000000000000100 ffffffff814a9100
Call Trace:

[] ? run_posix_cpu_timers+0x51/0x840
[] ? trigger_load_balance+0x5d/0x200
[] ? pid_param_set+0x130/0x130
[] call_timer_fn+0x36/0x110
[] ? pid_param_set+0x130/0x130
[] run_timer_softirq+0x21f/0x320
[] __do_softirq+0xef/0x280
[] call_softirq+0x1c/0x30
[] do_softirq+0x65/0xa0
[] irq_exit+0x115/0x120
[] smp_apic_timer_interrupt+0x45/0x60
[] apic_timer_interrupt+0x6d/0x80

[] ? cpuidle_enter_state+0x52/0xc0
[] ? cpuidle_enter_state+0x48/0xc0
[] cpuidle_idle_call+0xc5/0x200
[] arch_cpu_idle+0xe/0x30
[] cpu_startup_entry+0xf1/0x290
[] start_secondary+0x1ba/0x230
Code: 42 0f 00 45 89 e6 48 01 c2 43 8d 44 6d 00 39 d0 73 26 49 c1 e5 08 89 d2 4d 63 f4 49 63 c5 48 c1 e2 08 48 c1 e0 08 48 63 ca 48 99 f7 f9 48 98 4c 0f af f0 49 c1 ee 08 8b 43 78 c1 e0 08 44 29
RIP [] intel_pstate_timer_func+0x179/0x3d0
RSP

The kernel values for cpudata for CPU 113 were:

struct cpudata {
  cpu = 113,
  timer = {
    entry = {
      next = 0x0,
      prev = 0xdead000000200200
    },
    expires = 8357799745,
    base = 0xffff883fe84ec001,
    function = 0xffffffff814a9100,
    data = 18446612406765768960,

    i_gain = 0,
    d_gain = 0,
    deadband = 0,
    last_err = 22489
  },
  last_sample_time = {
    tv64 = 4063132438017305
  },
  prev_aperf = 287326796397463,
  prev_mperf = 251427432090198,
  sample = {
    core_pct_busy = 23081,
    aperf = 2937407,
    mperf = 3257884,
    freq = 2524484,
    time = {
      tv64 = 4063149215234118
    }
  }
}

which results in the time between samples = sample.time - last_sample_time
= 4063149215234118 - 4063132438017305 = 16777216813 ns, which is 16.777 seconds.

The duration between reads of the APERF and MPERF registers overflowed an
s32-sized integer in intel_pstate_get_scaled_busy()'s call to div_fp(). The
result is that int_tofp(duration_us) == 0, and the kernel attempts to divide by 0.

While the kernel shouldn't be delaying for a long time, it can and does
happen and the intel_pstate driver should not panic in this situation. This
patch changes the div_fp() function to use div64_s64() to allow for "long"
division. This will avoid the overflow condition on long delays.
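
The arithmetic behind the overflow can be sketched in userspace. This is only
a model: it assumes 8 fractional bits (MODEL_FRAC_BITS) and a divisor that is
narrowed to 32 bits, and the function names are hypothetical rather than the
driver's own.

```c
#include <stdint.h>

#define MODEL_FRAC_BITS 8	/* assumed number of fractional bits */

/* Convert an integer to fixed point, keeping the full 64-bit result. */
static int64_t model_int_tofp(int64_t x)
{
	return x * ((int64_t)1 << MODEL_FRAC_BITS);
}

/* What a 32-bit parameter keeps of a 64-bit value: the low 32 bits only. */
static uint32_t model_low32(int64_t y)
{
	return (uint32_t)y;
}

/* Fixed-point division carried out entirely in 64 bits. */
static int64_t model_div_fp64(int64_t x, int64_t y)
{
	return (x * ((int64_t)1 << MODEL_FRAC_BITS)) / y;
}
```

With the 16777216813 ns gap above, the duration is 16777216 us, which is 2^24.
Shifting that by 8 bits gives 2^32, whose low 32 bits are all zero, so a
32-bit divisor becomes 0. The 64-bit variant keeps the full value.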

[v2]: use div64_s64() in div_fp()

Signed-off-by: Prarit Bhargava
Signed-off-by: Rafael J. Wysocki
Cc: Thomas Renninger
Signed-off-by: Greg Kroah-Hartman

Prarit Bhargava
2015-10-23 05:43:27 +0800
8a1d5ab82 serial: atmel: fix error path of probe function

commit 8f1bd8f2ad2358d6a88c115481ff3e69817d1bde upstream.

If atmel_init_gpios() fails, the port has already been marked as busy (in
line 2629), so this must be undone in the error path.
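
The general shape of such an error path can be sketched as follows. This is a
userspace model with hypothetical names (model_probe(), model_init_gpios(),
model_ports_in_use), not the driver's code.

```c
#include <errno.h>

#define MODEL_MAX_PORTS 8	/* hypothetical port count */

static unsigned long model_ports_in_use;	/* one busy bit per port */

/* Stand-in for a GPIO setup step that may fail. */
static int model_init_gpios(int should_fail)
{
	return should_fail ? -ENODEV : 0;
}

/* Probe sketch: a busy bit set early must be cleared on any later failure. */
static int model_probe(int line, int gpio_should_fail)
{
	int ret;

	if (line < 0 || line >= MODEL_MAX_PORTS)
		return -EINVAL;
	if (model_ports_in_use & (1UL << line))
		return -EBUSY;
	model_ports_in_use |= 1UL << line;	/* port marked busy */

	ret = model_init_gpios(gpio_should_fail);
	if (ret < 0)
		goto err_clear_bit;

	return 0;

err_clear_bit:
	model_ports_in_use &= ~(1UL << line);	/* undo the busy marking */
	return ret;
}
```

Without the err_clear_bit step, a failed probe would leave the port marked
busy and a later probe of the same line would be refused.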

This bug was introduced because I created the patch that finally
became 722ccf416ac2 ("serial: atmel: fix error handling when
mctrl_gpio_init fails") on top of 3.19 which didn't have commit
6fbb9bdf0f3f ("tty/serial: at91: fix error handling in
atmel_serial_probe()") yet.

Signed-off-by: Uwe Kleine-König
Fixes: 722ccf416ac2 ("serial: atmel: fix error handling when mctrl_gpio_init fails")
Acked-by: Nicolas Ferre
Signed-off-by: Greg Kroah-Hartman

Uwe Kleine-König
2015-10-23 05:43:27 +0800
5e2b2e1c4 serial: 8250: add uart_config entry for PORT_RT2880

commit 3c5a0357fdb3a9116a48dbdb0abb91fd23fbff80 upstream.

This adds an entry to the uart_config table for PORT_RT2880
enabling rx/tx FIFOs. The UART is actually a Palmchip BK-3103
which is found in several devices from Alchemy/RMI, Ralink, and
Sigma Designs.

Signed-off-by: Mans Rullgard
Signed-off-by: Greg Kroah-Hartman

Mans Rullgard
2015-10-23 05:43:27 +0800
9f98531e2 drivers/tty: require read access for controlling terminal

commit 0c55627167870255158db1cde0d28366f91c8872 upstream.

This is mostly a hardening fix, given that write-only access to other
users' ttys is usually only given through setgid tty executables.

Signed-off-by: Jann Horn
Signed-off-by: Greg Kroah-Hartman

Jann Horn
2015-10-23 05:43:26 +0800
614ea4ea2 tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c

commit e81107d4c6bd098878af9796b24edc8d4a9524fd upstream.

My colleague ran into a program stall on an x86_64 server, where
n_tty_read() was waiting for data even though there was data in the buffer
in the pty. The kernel stack for the stuck process looks like this:
#0 [ffff88303d107b58] __schedule at ffffffff815c4b20
#1 [ffff88303d107bd0] schedule at ffffffff815c513e
#2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
#3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
#4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
#5 [ffff88303d107dd0] tty_read at ffffffff81368013
#6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
#7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
#8 [ffff88303d107f00] sys_read at ffffffff811a4306
#9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7

There seem to be two problems causing this issue.

First, in drivers/tty/n_tty.c, __receive_buf() stores the data and
updates ldata->commit_head using smp_store_release() and then checks
the wait queue using waitqueue_active(). However, since there is no
memory barrier, __receive_buf() could return without calling
wake_up_interruptible_poll(), and at the same time, n_tty_read() could
start to wait in wait_woken() as in the following chart.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
if (waitqueue_active(&tty->read_wait))
/* Memory operations issued after the
   RELEASE may be completed before the
   RELEASE operation has completed */
                                        add_wait_queue(&tty->read_wait, &wait);
                                        ...
                                        if (!input_available_p(tty, 0)) {
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

The second problem is that n_tty_read() also lacks a memory barrier
call and could also cause __receive_buf() to return without calling
wake_up_interruptible_poll(), and n_tty_read() to wait in wait_woken()
as in the chart below.

        __receive_buf()                         n_tty_read()
------------------------------------------------------------------------
                                        spin_lock_irqsave(&q->lock, flags);
                                        /* from add_wait_queue() */
                                        ...
                                        if (!input_available_p(tty, 0)) {
                                        /* Memory operations issued after the
                                           RELEASE may be completed before the
                                           RELEASE operation has completed */
smp_store_release(&ldata->commit_head,
                  ldata->read_head);
if (waitqueue_active(&tty->read_wait))
                                        __add_wait_queue(q, wait);
                                        spin_unlock_irqrestore(&q->lock,flags);
                                        /* from add_wait_queue() */
                                        ...
                                        timeout = wait_woken(&wait,
                                          TASK_INTERRUPTIBLE, timeout);
------------------------------------------------------------------------

There are also other places in drivers/tty/n_tty.c which have similar
calls to waitqueue_active(), so instead of adding many memory barrier
calls, this patch simply removes the call to waitqueue_active(),
leaving just wake_up*() behind.

This fixes both problems because, even though the memory access before
or after the spinlocks in both wake_up*() and add_wait_queue() can
sneak into the critical section, it cannot go past it and the critical
section assures that they will be serialized (please see "INTER-CPU
ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a
better explanation). Moreover, the resulting code is much simpler.
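
Both charts are instances of a store-then-load pattern running on two CPUs.
The following userspace sketch uses C11 atomics, not kernel primitives, and
hypothetical names throughout. It shows the ordering that a lockless check
would need: a full fence between each side's store and its following load.
This is not the approach the patch takes, which relies on the wait queue
spinlock instead.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int model_commit_head;	/* data published by the producer */
static atomic_bool model_waiter_queued;	/* set once the reader has queued itself */

/* Producer: publish data, then decide whether a wakeup is needed. */
static bool model_producer_should_wake(int new_head)
{
	atomic_store_explicit(&model_commit_head, new_head, memory_order_release);
	/* Full fence: the store above must not be reordered after the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&model_waiter_queued, memory_order_relaxed);
}

/* Reader: queue itself, then re-check for data before going to sleep. */
static bool model_reader_should_sleep(int read_tail)
{
	atomic_store_explicit(&model_waiter_queued, true, memory_order_relaxed);
	/* Full fence: pairs with the fence in the producer. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&model_commit_head, memory_order_acquire) == read_tail;
}
```

With both fences in place, it cannot happen that the producer sees no waiter
while the reader sees no data. Dropping either fence allows that outcome,
which corresponds to the stall described above.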

Latency measurement using a ping-pong test over a pty doesn't show any
visible performance drop.

Signed-off-by: Kosuke Tatsukawa
Signed-off-by: Greg Kroah-Hartman

Kosuke Tatsukawa
2015-10-23 05:43:26 +0800
a0533fb8c staging: speakup: fix speakup-r regression

commit b1d562acc78f0af46de0dfe447410bc40bdb7ece upstream.

Here is a patch to make speakup-r work again.

It broke in 3.6 due to commit 4369c64c79a22b98d3b7eff9d089196cd878a10a
("Input: Send events one packet at a time").

The problem was that the fakekey.c routine to fake a down arrow no
longer functioned properly, and adding a call to input_sync() fixed it.

Fixes: 4369c64c79a22b98d3b7eff9d089196cd878a10a
Acked-by: Samuel Thibault
Signed-off-by: John Covici
Signed-off-by: Greg Kroah-Hartman

covici@ccs.covici.com
2015-10-23 05:43:26 +0800
383f72c17 dm cache: fix NULL pointer when switching from cleaner policy

commit 2bffa1503c5c06192eb1459180fac4416575a966 upstream.

The cleaner policy doesn't make use of the per cache block hint space in
the metadata (unlike the other policies). When switching from the
cleaner policy to mq or smq a NULL pointer crash (in dm_tm_new_block)
was observed. The crash was caused by bugs in dm-cache-metadata.c
when trying to skip creation of the hint btree.

The minimal fix is to change the hint size for the cleaner policy to 4 bytes
(the only hint size supported).

Signed-off-by: Joe Thornber
Signed-off-by: Mike Snitzer
Signed-off-by: Greg Kroah-Hartman

Joe Thornber
2015-10-23 05:43:26 +0800
16d4c27cb dm: fix AB-BA deadlock in __dm_destroy()

commit 2a708cff93f1845b9239bc7d6310aef54e716c6a upstream.

__dm_destroy() takes the io_barrier SRCU lock (dm_get_live_table) and
suspend_lock in reverse order. Doing so can cause an AB-BA deadlock:

  __dm_destroy                dm_swap_table
  ---------------------------------------------------
                              mutex_lock(suspend_lock)
  dm_get_live_table()
    srcu_read_lock(io_barrier)
                              dm_sync_table()
                                synchronize_srcu(io_barrier)
                                  .. waiting for dm_put_live_table()
  mutex_lock(suspend_lock)
    .. waiting for suspend_lock

Fix this by taking the locks in proper order.
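
One way to picture a fixed lock order is a small rank checker. This is a
hypothetical userspace sketch, not dm code, and it assumes that suspend_lock
is the lock to be taken first.

```c
#include <stdbool.h>

/* Hypothetical ranks: a lock may only be taken while holding lower ranks. */
enum model_lock {
	MODEL_SUSPEND_LOCK = 1,
	MODEL_IO_BARRIER = 2
};

static int model_highest_held;	/* rank of the highest lock currently held */

/* Returns false if taking this lock would invert the agreed order. */
static bool model_acquire(enum model_lock lock)
{
	if ((int)lock <= model_highest_held)
		return false;
	model_highest_held = (int)lock;
	return true;
}

/* Drop everything, as at the end of a critical section. */
static void model_release_all(void)
{
	model_highest_held = 0;
}
```

Taking MODEL_SUSPEND_LOCK and then MODEL_IO_BARRIER is accepted by the
checker. Taking MODEL_IO_BARRIER first and then MODEL_SUSPEND_LOCK is
rejected, which is the ordering shown in the __dm_destroy column above.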

Signed-off-by: Jun'ichi Nomura
Fixes: ab7c7bb6f4ab ("dm: hold suspend_lock while suspending device during device deletion")
Acked-by: Mikulas Patocka
Signed-off-by: Mike Snitzer
Signed-off-by: Greg Kroah-Hartman

Junichi Nomura
2015-10-23 05:43:26 +0800
2058efbcb namei: results of d_is_negative() should be checked after dentry revalidation

commit daf3761c9fcde0f4ca64321cbed6c1c86d304193 upstream.

Leandro Awa writes:
"After switching to version 4.1.6, our parallelized and distributed
workflows now fail consistently with errors of the form:

T34: ./regex.c:39:22: error: config.h: No such file or directory

From our 'git bisect' testing, the following commit appears to be the
possible cause of the behavior we've been seeing: commit 766c4cbfacd8"

Al Viro says:
"What happens is that 766c4cbfacd8 got the things subtly wrong.

We used to treat d_is_negative() after lookup_fast() as "fail with
ENOENT". That was wrong - checking ->d_flags outside of ->d_seq
protection is unreliable and failing with hard error on what should've
fallen back to non-RCU pathname resolution is a bug.

Unfortunately, we'd pulled the test too far up and ran afoul of
another kind of staleness. The dentry might have been absolutely
stable from the RCU point of view (and we might be on UP, etc), but
stale from the remote fs point of view. If ->d_revalidate() returns
"it's actually stale", dentry gets thrown away and the original code
wouldn't even have looked at its ->d_flags.

What we need is to check ->d_flags where 766c4cbfacd8 does (prior to
->d_seq validation) but only use the result in cases where we do not
discard this dentry outright"

Reported-by: Leandro Awa
Link: https://bugzilla.kernel.org/show_bug.cgi?id=104911
Fixes: 766c4cbfacd8 ("namei: d_is_negative() should be checked...")
Tested-by: Leandro Awa
Signed-off-by: Trond Myklebust
Acked-by: Al Viro
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Trond Myklebust
2015-10-23 05:43:26 +0800
645b9d380 clk: ti: fix dual-registration of uart4_ick

commit 19e79687de22f23bcfb5e79cce3daba20af228d1 upstream.

On the OMAP AM3517 platform the uart4_ick gets registered
twice, causing any power management to /dev/ttyO3 to fail
when trying to wake the device up.

This solves the following oops:

[] Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa09e008
[] PC is at serial_omap_pm+0x48/0x15c
[] LR is at _raw_spin_unlock_irqrestore+0x30/0x5c

Fixes: aafd900cab87 ("CLK: TI: add omap3 clock init file")
Cc: mturquette@baylibre.com
Cc: sboyd@codeaurora.org
Cc: linux-clk@vger.kernel.org
Cc: linux-omap@vger.kernel.org
Cc: linux-kernel@lists.codethink.co.uk
Signed-off-by: Ben Dooks
Signed-off-by: Tero Kristo
Signed-off-by: Greg Kroah-Hartman

Ben Dooks
2015-10-23 05:43:26 +0800
863e9b4f5 nfs/filelayout: Fix NULL reference caused by double freeing of fh_array

commit 3ec0c97959abff33a42db9081c22132bcff5b4f2 upstream.

If filelayout_decode_layout() fails, _filelayout_free_lseg() will cause
a double free of fh_array.

[ 1179.279800] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1179.280198] IP: [] filelayout_free_fh_array.isra.11+0x1d/0x70 [nfs_layout_nfsv41_files]
[ 1179.281010] PGD 0
[ 1179.281443] Oops: 0000 [#1]
[ 1179.281831] Modules linked in: nfs_layout_nfsv41_files(OE) nfsv4(OE) nfs(OE) fscache(E) xfs libcrc32c coretemp nfsd crct10dif_pclmul ppdev crc32_pclmul crc32c_intel auth_rpcgss ghash_clmulni_intel nfs_acl lockd vmw_balloon grace sunrpc parport_pc vmw_vmci parport shpchp i2c_piix4 vmwgfx drm_kms_helper ttm drm serio_raw mptspi scsi_transport_spi mptscsih e1000 mptbase ata_generic pata_acpi [last unloaded: fscache]
[ 1179.283891] CPU: 0 PID: 13336 Comm: cat Tainted: G OE 4.3.0-rc1-pnfs+ #244
[ 1179.284323] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
[ 1179.285206] task: ffff8800501d48c0 ti: ffff88003e3c4000 task.ti: ffff88003e3c4000
[ 1179.285668] RIP: 0010:[] [] filelayout_free_fh_array.isra.11+0x1d/0x70 [nfs_layout_nfsv41_files]
[ 1179.286612] RSP: 0018:ffff88003e3c77f8 EFLAGS: 00010202
[ 1179.287092] RAX: 0000000000000000 RBX: ffff88001fe78900 RCX: 0000000000000000
[ 1179.287731] RDX: ffffea0000f40760 RSI: ffff88001fe789c8 RDI: ffff88001fe789c0
[ 1179.288383] RBP: ffff88003e3c7810 R08: ffffea0000f40760 R09: 0000000000000000
[ 1179.289170] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88001fe789c8
[ 1179.289959] R13: ffff88001fe789c0 R14: ffff88004ec05a80 R15: ffff88004f935b88
[ 1179.290791] FS: 00007f4e66bb5700(0000) GS:ffffffff81c29000(0000) knlGS:0000000000000000
[ 1179.291580] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1179.292209] CR2: 0000000000000000 CR3: 00000000203f8000 CR4: 00000000001406f0
[ 1179.292731] Stack:
[ 1179.293195] ffff88001fe78900 00000000000000d0 ffff88001fe78178 ffff88003e3c7868
[ 1179.293676] ffffffffa0272737 0000000000000001 0000000000000001 ffff88001fe78800
[ 1179.294151] 00000000614fffce ffffffff81727671 ffff88001fe78100 ffff88001fe78100
[ 1179.294623] Call Trace:
[ 1179.295092] [] filelayout_alloc_lseg+0xa7/0x2d0 [nfs_layout_nfsv41_files]
[ 1179.295625] [] ? out_of_line_wait_on_bit+0x81/0xb0
[ 1179.296133] [] pnfs_layout_process+0xae/0x320 [nfsv4]
[ 1179.296632] [] nfs4_proc_layoutget+0x2b1/0x360 [nfsv4]
[ 1179.297134] [] pnfs_update_layout+0x853/0xb30 [nfsv4]
[ 1179.297632] [] ? nfs_get_lock_context+0x74/0x170 [nfs]
[ 1179.298158] [] filelayout_pg_init_read+0x37/0x50 [nfs_layout_nfsv41_files]
[ 1179.298834] [] __nfs_pageio_add_request+0x119/0x460 [nfs]
[ 1179.299385] [] ? nfs_create_request.part.9+0x37/0x2e0 [nfs]
[ 1179.299872] [] nfs_pageio_add_request+0xa3/0x1b0 [nfs]
[ 1179.300362] [] readpage_async_filler+0x85/0x260 [nfs]
[ 1179.300907] [] read_cache_pages+0x91/0xd0
[ 1179.301391] [] ? nfs_read_completion+0x220/0x220 [nfs]
[ 1179.301867] [] nfs_readpages+0x128/0x200 [nfs]
[ 1179.302330] [] __do_page_cache_readahead+0x203/0x280
[ 1179.302784] [] ? __do_page_cache_readahead+0xd8/0x280
[ 1179.303413] [] ondemand_readahead+0x1a6/0x2f0
[ 1179.303855] [] page_cache_sync_readahead+0x31/0x50
[ 1179.304286] [] generic_file_read_iter+0x4a6/0x5c0
[ 1179.304711] [] ? __nfs_revalidate_mapping+0x1f6/0x240 [nfs]
[ 1179.305132] [] nfs_file_read+0x52/0xa0 [nfs]
[ 1179.305540] [] __vfs_read+0xcc/0x100
[ 1179.305936] [] vfs_read+0x85/0x130
[ 1179.306326] [] SyS_read+0x58/0xd0
[ 1179.306708] [] entry_SYSCALL_64_fastpath+0x12/0x76
[ 1179.307094] Code: c4 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 8b 07 49 89 f4 85 c0 74 47 48 8b 06 49 89 fd 8b 38 48 85 ff 74 22 31 db eb 0c 48 63 d3 48 8b 3c d0 48 85
[ 1179.308357] RIP [] filelayout_free_fh_array.isra.11+0x1d/0x70 [nfs_layout_nfsv41_files]
[ 1179.309177] RSP
[ 1179.309582] CR2: 0000000000000000
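
A common way to make a cleanup helper safe to call twice is to reset the
owning fields after freeing. The sketch below is a userspace model with
hypothetical names. It is not taken from the NFS code and is not necessarily
how the upstream patch resolves the problem.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a layout segment that owns an array of handles. */
struct model_segment {
	unsigned int num_fh;
	void **fh_array;
};

/* Free the array and reset the fields so that a second call is a no-op. */
static void model_free_fh_array(struct model_segment *seg)
{
	unsigned int i;

	if (!seg->fh_array)
		return;
	for (i = 0; i < seg->num_fh; i++)
		free(seg->fh_array[i]);	/* free(NULL) is a no-op */
	free(seg->fh_array);
	seg->fh_array = NULL;
	seg->num_fh = 0;
}

/* Allocate n handles; returns 0 on success, -1 on failure (after cleanup). */
static int model_alloc_fh_array(struct model_segment *seg, unsigned int n)
{
	unsigned int i;

	seg->fh_array = calloc(n, sizeof(*seg->fh_array));
	if (!seg->fh_array)
		return -1;
	seg->num_fh = n;
	for (i = 0; i < n; i++) {
		seg->fh_array[i] = malloc(16);
		if (!seg->fh_array[i]) {
			model_free_fh_array(seg);
			return -1;
		}
	}
	return 0;
}
```

If a decode step fails and cleans up, and a later generic teardown calls
model_free_fh_array() again, the second call finds fh_array == NULL and
returns without touching freed memory.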

Signed-off-by: Kinglong Mee
Signed-off-by: Trond Myklebust
Cc: William Dauchy
Signed-off-by: Greg Kroah-Hartman

Kinglong Mee
2015-10-23 05:43:26 +0800
aaf19f122 fix a braino in ovl_d_select_inode()

commit 9391dd00d13c853ab4f2a85435288ae2202e0e43 upstream.

When opening a directory we want the overlayfs inode, not one from
the topmost layer.

Reported-By: Andrey Jr. Melnikov
Tested-By: Andrey Jr. Melnikov
Signed-off-by: Al Viro
Cc: "Kamata, Munehisa"
Signed-off-by: Greg Kroah-Hartman

Al Viro
2015-10-23 05:43:26 +0800
9abb3b810 overlayfs: Make f_path always point to the overlay and f_inode to the underlay

commit 4bacc9c9234c7c8eec44f5ed4e960d9f96fa0f01 upstream.

Make file->f_path always point to the overlay dentry so that the path in
/proc/pid/fd is correct and to ensure that label-based LSMs have access to the
overlay as well as the underlay (path-based LSMs probably don't need it).

Using my union testsuite to set things up, before the patch I see:

[root@andromeda union-testsuite]# bash 5 /a/foo107
[root@andromeda union-testsuite]# stat /mnt/a/foo107
...
Device: 23h/35d Inode: 13381 Links: 1
...
[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
...
Device: 23h/35d Inode: 13381 Links: 1
...

After the patch:

[root@andromeda union-testsuite]# bash 5 /mnt/a/foo107
[root@andromeda union-testsuite]# stat /mnt/a/foo107
...
Device: 23h/35d Inode: 40346 Links: 1
...
[root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
...
Device: 23h/35d Inode: 40346 Links: 1
...

Note the change in where /proc/$$/fd/5 points to in the ls command. It was
pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
(which is correct).

The inode accessed, however, is the lower layer. The union layer is on device
25h/37d and the upper layer on 24h/36d.

Signed-off-by: David Howells
Signed-off-by: Al Viro
Cc: "Kamata, Munehisa"
Signed-off-by: Greg Kroah-Hartman

David Howells
2015-10-23 05:43:26 +0800
0d2ea357d overlay: Call ovl_drop_write() earlier in ovl_dentry_open()

commit f25801ee4680ef1db21e15c112e6e5fe3ffe8da5 upstream.

Call ovl_drop_write() earlier in ovl_dentry_open() before we call vfs_open()
as we've done the copy up for which we needed the freeze-write lock by that
point.

Signed-off-by: David Howells
Signed-off-by: Al Viro
Cc: "Kamata, Munehisa"
Signed-off-by: Greg Kroah-Hartman

David Howells
2015-10-23 05:43:26 +0800
583c46f9c md/bitmap: don't pass -1 to bitmap_storage_alloc.

commit da6fb7a9e5bd6f04f7e15070f630bdf1ea502841 upstream.

Passing -1 to bitmap_storage_alloc() causes page->index to be set to
-1, which is quite problematic.

So only pass ->cluster_slot if mddev_is_clustered().
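
Why -1 is a problem for an index can be shown with a small model. It assumes
the page index is an unsigned long, and the structure and function names are
hypothetical.

```c
#include <limits.h>

/* Hypothetical stand-in for the fields involved. */
struct model_mddev {
	int clustered;		/* non-zero when the array is clustered */
	int cluster_slot;	/* -1 when no slot is assigned */
};

/* A signed -1 stored in an unsigned page index wraps to ULONG_MAX. */
static unsigned long model_page_index(int slot)
{
	return (unsigned long)slot;
}

/* Only pass the slot through for clustered arrays; otherwise use 0. */
static int model_slot_for_alloc(const struct model_mddev *mddev)
{
	return mddev->clustered ? mddev->cluster_slot : 0;
}
```

For a non-clustered array with cluster_slot == -1, the guarded helper returns
0, so the index never becomes ULONG_MAX.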

Fixes: b97e92574c0b ("Use separate bitmaps for each nodes in the cluster")
Signed-off-by: NeilBrown
Signed-off-by: Greg Kroah-Hartman

NeilBrown
2015-10-23 05:43:26 +0800
0cf68c236 genirq: Fix race in register_irq_proc()

commit 95c2b17534654829db428f11bcf4297c059a2a7e upstream.

Per-IRQ directories in procfs are created only when a handler is first
added to the irqdesc, not when the irqdesc is created. In the case of
a shared IRQ, multiple tasks can race to create a directory. This
race condition seems to have been present forever, but is easier to
hit with async probing.

Signed-off-by: Ben Hutchings
Link: http://lkml.kernel.org/r/1443266636.2004.2.camel@decadent.org.uk
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Ben Hutchings
2015-10-23 05:43:25 +0800
5f9611c80 igb: do not re-init SR-IOV during probe

commit 6423fc34160939142d72ffeaa2db6408317f54df upstream.

During driver probing the following code path is triggered.
igb_probe
->igb_sw_init
->igb_probe_vfs
->igb_pci_enable_sriov
->igb_sriov_reinit

Doing the SR-IOV re-init is not necessary during probing since we're
starting from scratch. Here we can call igb_enable_sriov() right away.

Running igb_sriov_reinit() during igb_probe() also seems to cause
occasional packet loss on some onboard 82576 NICs. Reproduced on
Dell and HP servers with onboard 82576 NICs.
Example:
Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
Subsystem: Dell Device [1028:0481]

Signed-off-by: Stefan Assmann
Tested-by: Aaron Brown
Signed-off-by: Jeff Kirsher
Cc: Daniel J Blueman
Signed-off-by: Greg Kroah-Hartman

Stefan Assmann
2015-10-23 05:43:25 +0800
9373e7b42 net/xen-netfront: only napi_synchronize() if running

commit 274b045509175db0405c784be85e8cce116e6f7d upstream.

If an interface isn't running, napi_synchronize() will hang forever.

[ 392.248403] rmmod R running task 0 359 343 0x00000000
[ 392.257671] ffff88003760fc88 ffff880037193b40 ffff880037193160 ffff88003760fc88
[ 392.267644] ffff880037610000 ffff88003760fcd8 0000000100014c22 ffffffff81f75c40
[ 392.277524] 0000000000bc7010 ffff88003760fca8 ffffffff81796927 ffffffff81f75c40
[ 392.287323] Call Trace:
[ 392.291599] [] schedule+0x37/0x90
[ 392.298553] [] schedule_timeout+0x14b/0x280
[ 392.306421] [] ? irq_free_descs+0x69/0x80
[ 392.314006] [] ? internal_add_timer+0xb0/0xb0
[ 392.322125] [] msleep+0x37/0x50
[ 392.329037] [] xennet_disconnect_backend.isra.24+0xda/0x390 [xen_netfront]
[ 392.339658] [] xennet_remove+0x2c/0x80 [xen_netfront]
[ 392.348516] [] xenbus_dev_remove+0x59/0xc0
[ 392.356257] [] __device_release_driver+0x87/0x120
[ 392.364645] [] driver_detach+0xb8/0xc0
[ 392.371989] [] bus_remove_driver+0x59/0xe0
[ 392.379883] [] driver_unregister+0x30/0x70
[ 392.387495] [] xenbus_unregister_driver+0x12/0x20
[ 392.395908] [] netif_exit+0x10/0x775 [xen_netfront]
[ 392.404877] [] SyS_delete_module+0x1d8/0x230
[ 392.412804] [] system_call_fastpath+0x12/0x71

Signed-off-by: Chas Williams
Signed-off-by: David S. Miller
Cc: "Kamata, Munehisa"
Signed-off-by: Greg Kroah-Hartman

Chas Williams
2015-10-23 05:43:25 +0800
59c73a0ac m68k: Define asmlinkage_protect

commit 8474ba74193d302e8340dddd1e16c85cc4b98caf upstream.

Make sure the compiler does not modify arguments of syscall functions.
This can happen if the compiler generates a tailcall to another
function. For example, without asmlinkage_protect sys_openat is compiled
into this function:

sys_openat:
clr.l %d0
move.w 18(%sp),%d0
move.l %d0,16(%sp)
jbra do_sys_open

Note how the fourth argument is modified in place, modifying the register
%d4 that gets restored from this stack slot when the function returns to
user-space. The caller may expect the register to be unmodified across
system calls.

Signed-off-by: Andreas Schwab
Signed-off-by: Geert Uytterhoeven
Signed-off-by: Greg Kroah-Hartman

Andreas Schwab
2015-10-23 05:43:25 +0800
f01570729 arm64: readahead: fault retry breaks mmap file read random detection

commit 569ba74a7ba69f46ce2950bf085b37fea2408385 upstream.

This is the arm64 portion of commit 45cac65b0fcd ("readahead: fault
retry breaks mmap file read random detection"), which was absent from
the initial port and has since gone unnoticed. The original commit says:

> .fault now can retry. The retry can break state machine of .fault. In
> filemap_fault, if page is miss, ra->mmap_miss is increased. In the second
> try, since the page is in page cache now, ra->mmap_miss is decreased. And
> these are done in one fault, so we can't detect random mmap file access.
>
> Add a new flag to indicate .fault is tried once. In the second try, skip
> ra->mmap_miss decreasing. The filemap_fault state machine is ok with it.
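
The counter behaviour described in the quoted text can be modelled in a few
lines. This is a simplified userspace sketch with hypothetical names, not
the filemap_fault() implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-file readahead state. */
struct model_ra {
	int mmap_miss;
};

/* One pass through the fault handler; 'tried' marks a retry of the same fault. */
static void model_filemap_fault(struct model_ra *ra, bool page_cached, bool tried)
{
	if (!page_cached) {
		ra->mmap_miss++;	/* miss: the page had to be read in */
	} else if (!tried && ra->mmap_miss > 0) {
		ra->mmap_miss--;	/* genuine hit on a first attempt */
	}
}
```

A miss followed by a retry that is not flagged leaves mmap_miss at 0, hiding
the random access. When the retry is flagged with tried == true, the
decrement is skipped and mmap_miss stays at 1.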

With this change, Mark reports that:

> Random read improves by 250%, sequential read improves by 40%, and
> random write by 400% to an eMMC device with dm crypto wrapped around it.

Cc: Shaohua Li
Cc: Rik van Riel
Cc: Wu Fengguang
Signed-off-by: Mark Salyzyn
Signed-off-by: Riley Andrews
Signed-off-by: Will Deacon
Signed-off-by: Greg Kroah-Hartman

Mark Salyzyn
2015-10-23 05:43:25 +0800
249af812d arm64: ftrace: fix function_graph tracer panic

commit ee556d00cf20012e889344a0adbbf809ab5015a3 upstream.

When function graph tracer is enabled, the following operation
will trigger panic:

mount -t debugfs nodev /sys/kernel
echo next_tgid > /sys/kernel/tracing/set_ftrace_filter
echo function_graph > /sys/kernel/tracing/current_tracer
ls /proc/

------------[ cut here ]------------
[ 198.501417] Unable to handle kernel paging request at virtual address cb88537fdc8ba316
[ 198.506126] pgd = ffffffc008f79000
[ 198.509363] [cb88537fdc8ba316] *pgd=00000000488c6003, *pud=00000000488c6003, *pmd=0000000000000000
[ 198.517726] Internal error: Oops: 94000005 [#1] SMP
[ 198.518798] Modules linked in:
[ 198.520582] CPU: 1 PID: 1388 Comm: ls Tainted: G
[ 198.521800] Hardware name: linux,dummy-virt (DT)
[ 198.522852] task: ffffffc0fa9e8000 ti: ffffffc0f9ab0000 task.ti: ffffffc0f9ab0000
[ 198.524306] PC is at next_tgid+0x30/0x100
[ 198.525205] LR is at return_to_handler+0x0/0x20
[ 198.526090] pc : [] lr : [] pstate: 60000145
[ 198.527392] sp : ffffffc0f9ab3d40
[ 198.528084] x29: ffffffc0f9ab3d40 x28: ffffffc0f9ab0000
[ 198.529406] x27: ffffffc000d6a000 x26: ffffffc000b786e8
[ 198.530659] x25: ffffffc0002a1900 x24: ffffffc0faf16c00
[ 198.531942] x23: ffffffc0f9ab3ea0 x22: 0000000000000002
[ 198.533202] x21: ffffffc000d85050 x20: 0000000000000002
[ 198.534446] x19: 0000000000000002 x18: 0000000000000000
[ 198.535719] x17: 000000000049fa08 x16: ffffffc000242efc
[ 198.537030] x15: 0000007fa472b54c x14: ffffffffff000000
[ 198.538347] x13: ffffffc0fada84a0 x12: 0000000000000001
[ 198.539634] x11: ffffffc0f9ab3d70 x10: ffffffc0f9ab3d70
[ 198.540915] x9 : ffffffc0000907c0 x8 : ffffffc0f9ab3d40
[ 198.542215] x7 : 0000002e330f08f0 x6 : 0000000000000015
[ 198.543508] x5 : 0000000000000f08 x4 : ffffffc0f9835ec0
[ 198.544792] x3 : cb88537fdc8ba316 x2 : cb88537fdc8ba306
[ 198.546108] x1 : 0000000000000002 x0 : ffffffc000d85050
[ 198.547432]
[ 198.547920] Process ls (pid: 1388, stack limit = 0xffffffc0f9ab0020)
[ 198.549170] Stack: (0xffffffc0f9ab3d40 to 0xffffffc0f9ab4000)
[ 198.582568] Call trace:
[ 198.583313] [] next_tgid+0x30/0x100
[ 198.584359] [] ftrace_graph_caller+0x6c/0x70
[ 198.585503] [] ftrace_graph_caller+0x6c/0x70
[ 198.586574] [] ftrace_graph_caller+0x6c/0x70
[ 198.587660] [] ftrace_graph_caller+0x6c/0x70
[ 198.588896] Code: aa0003f5 2a0103f4 b4000102 91004043 (885f7c60)
[ 198.591092] ---[ end trace 6a346f8f20949ac8 ]---

This is because when using function graph tracer, if the traced
function return value is in multi regs ([x0-x7]), return_to_handler
may corrupt them. So in return_to_handler, the parameter regs should
be protected properly.

Signed-off-by: Li Bin
Acked-by: AKASHI Takahiro
Signed-off-by: Catalin Marinas
Signed-off-by: Greg Kroah-Hartman

Li Bin
2015-10-23 05:43:25 +0800
b23b63c22 arm64/efi: Fix boot crash by not padding between EFI_MEMORY_RUNTIME regions

commit 0ce3cc008ec04258b6a6314b09f1a6012810881a upstream.

The new Properties Table feature introduced in UEFIv2.5 may
split memory regions that cover PE/COFF memory images into
separate code and data regions. Since these regions only differ
in the type (runtime code vs runtime data) and the permission
bits, but not in the memory type attributes (UC/WC/WT/WB), the
spec does not require them to be aligned to 64 KB.

Since the relative offset of PE/COFF .text and .data segments
cannot be changed on the fly, this means that we can no longer
pad out those regions to be mappable using 64 KB pages.
Unfortunately, there is no annotation in the UEFI memory map
that identifies data regions that were split off from a code
region, so we must apply this logic to all adjacent runtime
regions whose attributes only differ in the permission bits.

So instead of rounding each memory region to 64 KB alignment at
both ends, only round down regions that are not directly
preceded by another runtime region with the same type
attributes. Since the UEFI spec does not mandate that the memory
map be sorted, this means we also need to sort it first.

Note that this change will result in all EFI_MEMORY_RUNTIME
regions whose start addresses are not aligned to the OS page
size to be mapped with executable permissions (i.e., on kernels
compiled with 64 KB pages). However, since these mappings are
only active during the time that UEFI Runtime Services are being
invoked, the window for abuse is rather small.
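
The sort-then-round-down rule described above can be sketched in plain C as
follows. This is only an illustration under stated assumptions: struct region,
follows_runtime_sibling() and compute_map_starts() are hypothetical names and
not the kernel's EFI code, and the descriptor is reduced to the fields the
rule needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SZ_64K 0x10000ULL

/* Hypothetical, simplified stand-in for a UEFI memory descriptor. */
struct region {
    uint64_t start;      /* physical start address */
    uint64_t size;       /* size in bytes */
    uint32_t cache_attr; /* memory type attributes (UC/WC/WT/WB) */
    bool     runtime;    /* true if EFI_MEMORY_RUNTIME is set */
};

/* qsort() comparator: order regions by ascending start address. */
static int cmp_region(const void *a, const void *b)
{
    const struct region *x = a, *y = b;

    return (x->start > y->start) - (x->start < y->start);
}

/* True if r starts exactly where prev ends and both are runtime regions
 * with identical memory type attributes. */
static bool follows_runtime_sibling(const struct region *prev,
                                    const struct region *r)
{
    return prev->runtime && r->runtime &&
           prev->cache_attr == r->cache_attr &&
           prev->start + prev->size == r->start;
}

/* Sort the map, then compute the address at which each region would be
 * mapped: unchanged if it directly follows a matching runtime region,
 * rounded down to 64 KB otherwise. */
static void compute_map_starts(struct region *map, size_t n,
                               uint64_t *map_start)
{
    assert(map != NULL && map_start != NULL);

    qsort(map, n, sizeof(*map), cmp_region);
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && follows_runtime_sibling(&map[i - 1], &map[i]))
            map_start[i] = map[i].start;
        else
            map_start[i] = map[i].start & ~(SZ_64K - 1);
    }
}
```

Sorting first matters because the "directly preceded by" test only looks at
the previous array element, which is meaningful only once the map is ordered
by address.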

Tested-by: Mark Salter
Tested-by: Mark Rutland [UEFI 2.4 only]
Signed-off-by: Ard Biesheuvel
Signed-off-by: Matt Fleming
Reviewed-by: Mark Salter
Reviewed-by: Mark Rutland
Cc: Catalin Marinas
Cc: Leif Lindholm
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443218539-7610-3-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

Ard Biesheuvel
2015-10-23 05:43:25 +0800
eed13ce27 vfs: Test for and handle paths that are unreachable from their mnt_root

commit 397d425dc26da728396e66d392d5dcb8dac30c37 upstream.

In rare cases a directory can be renamed out from under a bind mount.
In those cases without special handling it becomes possible to walk up
the directory tree to the root dentry of the filesystem and down
from the root dentry to every other file or directory on the filesystem.

Like division by zero, ".." from an unconnected path cannot be given
a useful semantic, as there is no predicting at which path component
the code will realize it is unconnected. We certainly cannot match
the current behavior, as the current behavior is a security hole.

Therefore, when encountering ".." while following an unconnected path,
return -ENOENT.

- Add a function path_connected to verify path->dentry is reachable
from path->mnt.mnt_root. AKA to validate that rename did not do
something nasty to the bind mount.

To avoid races, path_connected must be called after following a path
component to its next path component.
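
As a rough userspace model of the check described above (not the kernel
implementation), the sketch below walks parent pointers to decide whether a
position is still reachable from the mount root, and performs the check only
after ".." has been followed. struct toy_dentry, toy_path_connected() and
toy_follow_dotdot() are hypothetical names invented for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy dentry: only a parent pointer. The filesystem root is its own parent. */
struct toy_dentry {
    struct toy_dentry *parent;
};

/* Walk up from d; report whether mnt_root is d itself or one of its ancestors. */
static bool toy_path_connected(const struct toy_dentry *d,
                               const struct toy_dentry *mnt_root)
{
    assert(d != NULL && mnt_root != NULL);

    for (;;) {
        if (d == mnt_root)
            return true;
        if (d->parent == d)     /* hit the filesystem root first */
            return false;
        d = d->parent;
    }
}

/* Follow "..": step to the parent (unless already at the mount root),
 * then verify the new position is still below mnt_root. */
static int toy_follow_dotdot(struct toy_dentry **pos,
                             const struct toy_dentry *mnt_root)
{
    if (*pos != mnt_root)
        *pos = (*pos)->parent;
    if (!toy_path_connected(*pos, mnt_root))
        return -ENOENT;
    return 0;
}
```

In this model, renaming a directory out from under the bind mount amounts to
changing its parent pointer to a dentry that is not below mnt_root; the next
".." then fails with -ENOENT instead of walking up to the filesystem root.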

Signed-off-by: "Eric W. Biederman"
Signed-off-by: Al Viro
Signed-off-by: Greg Kroah-Hartman

Eric W. Biederman
2015-10-23 05:43:25 +0800
6f4e45e35 dcache: Handle escaped paths in prepend_path

commit cde93be45a8a90d8c264c776fab63487b5038a65 upstream.

A rename can result in a dentry that, by walking up d_parent,
will never reach its mnt_root. For lack of a better term
I call this an escaped path.

prepend_path is called by four different functions: __d_path,
d_absolute_path, d_path, and getcwd.

__d_path only wants to see paths that are connected to the root it passes
in. So __d_path needs prepend_path to return an error.

d_absolute_path similarly wants to see paths that are connected to
some root. Escaped paths are not connected to any mnt_root so
d_absolute_path needs prepend_path to return an error greater
than 1. So escaped paths will be treated like paths on lazily
unmounted mounts.

getcwd needs to prepend "(unreachable)" so getcwd also needs
prepend_path to return an error.

d_path is the interesting holdout. d_path just wants to print
something, and does not care about the weird cases. This raises
the question: what should be printed?

Given that / should result in -ENOENT I
believe it is desirable for escaped paths to be printed as empty
paths. As there are not really any meaningful path components when
considered from the perspective of a mount tree.

So tweak prepend_path to return an empty path with a new error
code of 3 when it encounters an escaped path.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: Al Viro
Signed-off-by: Greg Kroah-Hartman

Eric W. Biederman
2015-10-23 05:43:25 +0800
344fa142d mmc: core: Don't return an error for CD/WP GPIOs when GPIOLIB is unset

commit 43934ece2ea72c1dd279c0b0478c1a036d5d77ee upstream.

When CONFIG_GPIOLIB is unset, its stubs return -ENOSYS. That means
that when the mmc core parses DT for CD/WP GPIOs via mmc_of_parse(),
-ENOSYS is propagated to the caller. Typically this means that the mmc
host driver fails to probe.

As the CD/WP GPIOs are already treated as optional, let's extend that to
cover the case when CONFIG_GPIOLIB is unset.
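
A minimal sketch of the "optional GPIO" idea, in userspace C. The helper
names are hypothetical and this is not the mmc core code; it also assumes
that an absent GPIO is reported as -ENOENT, which the text above does not
state. The point is only that -ENOSYS from a GPIOLIB stub is treated like
"no such GPIO" while every other error is still propagated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPIO request stub when GPIOLIB is unset. */
static int stub_gpio_request(void)
{
    return -ENOSYS;
}

/* Hypothetical request helper that fails with a real error. */
static int broken_gpio_request(void)
{
    return -EINVAL;
}

/* Treat "GPIO not described" (-ENOENT, an assumption here) and
 * "GPIOLIB not built in" (-ENOSYS) as success without a GPIO;
 * propagate anything else. */
static int parse_optional_gpio(int (*request)(void))
{
    int ret;

    assert(request != NULL);

    ret = request();
    if (ret == -ENOENT || ret == -ENOSYS)
        return 0;
    return ret;
}
```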

Reported-by: Michal Simek
Fixes: 16b23787fc70 ("mmc: sdhci-of-arasan: Call OF parsing for MMC")
Signed-off-by: Ulf Hansson
Tested-by: Michal Simek
Acked-by: Venu Byravarasu
Signed-off-by: Greg Kroah-Hartman

Ulf Hansson
2015-10-23 05:43:25 +0800
c1d40e01a mmc: sdhci: fix dma memory leak in sdhci_pre_req()

commit d31911b9374a76560d2c8ea4aa6ce5781621e81d upstream.

Currently, dma_map_sg() may be executed twice for one mrq->data
when the mmc subsystem prepares more than one new request, and the
following log shows up:
sdhci[sdhci_pre_dma_transfer] invalid cookie: 24, next-cookie 25

In this condition, mrq->data maps DMA memory (1) in sdhci_pre_req
the first time, and maps another DMA memory (2) in sdhci_prepare_data
the second time. But the driver only unmaps DMA memory (2), and
DMA memory (1) is never unmapped, which causes the DMA memory leak.

This patch uses another method to map the DMA memory for mrq->data,
which fixes this DMA memory leak.

Fixes: 348487cb28e6 ("mmc: sdhci: use pipeline mmc requests to improve performance")
Reported-and-tested-by: Jiri Slaby
Signed-off-by: Haibo Chen
Signed-off-by: Ulf Hansson
Signed-off-by: Jiri Slaby
Signed-off-by: Greg Kroah-Hartman

Haibo Chen
2015-10-23 05:43:24 +0800
ef1108596 UBI: return ENOSPC if no enough space available

commit 7c7feb2ebfc9c0552c51f0c050db1d1a004faac5 upstream.

UBI: attaching mtd1 to ubi0
UBI: scanning is finished
UBI error: init_volumes: not enough PEBs, required 706, available 686
UBI error: ubi_wl_init: no enough physical eraseblocks (-20, need 1)
UBI error: ubi_attach_mtd_dev: failed to attach mtd1, error -12
Signed-off-by: Richard Weinberger
Reviewed-by: David Gstir
Signed-off-by: Greg Kroah-Hartman

shengyong
2015-10-23 05:43:24 +0800
189c815c3 UBI: Validate data_size

commit 281fda27673f833a01d516658a64d22a32c8e072 upstream.

Make sure that data_size is less than LEB size.
Otherwise a handcrafted UBI image is able to trigger
an out-of-bounds memory access in ubi_compare_lebs().

Signed-off-by: Richard Weinberger
Reviewed-by: David Gstir
Signed-off-by: Greg Kroah-Hartman

Richard Weinberger
2015-10-23 05:43:24 +0800
207663ca0 UBIFS: Kill unneeded locking in ubifs_init_security

commit cf6f54e3f133229f02a90c04fe0ff9dd9d3264b4 upstream.

Fixes the following lockdep splat:
[ 1.244527] =============================================
[ 1.245193] [ INFO: possible recursive locking detected ]
[ 1.245193] 4.2.0-rc1+ #37 Not tainted
[ 1.245193] ---------------------------------------------
[ 1.245193] cp/742 is trying to acquire lock:
[ 1.245193] (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [] ubifs_init_security+0x29/0xb0
[ 1.245193]
[ 1.245193] but task is already holding lock:
[ 1.245193] (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [] path_openat+0x3af/0x1280
[ 1.245193]
[ 1.245193] other info that might help us debug this:
[ 1.245193] Possible unsafe locking scenario:
[ 1.245193]
[ 1.245193] CPU0
[ 1.245193] ----
[ 1.245193] lock(&sb->s_type->i_mutex_key#9);
[ 1.245193] lock(&sb->s_type->i_mutex_key#9);
[ 1.245193]
[ 1.245193] *** DEADLOCK ***
[ 1.245193]
[ 1.245193] May be due to missing lock nesting notation
[ 1.245193]
[ 1.245193] 2 locks held by cp/742:
[ 1.245193] #0: (sb_writers#5){.+.+.+}, at: [] mnt_want_write+0x1f/0x50
[ 1.245193] #1: (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [] path_openat+0x3af/0x1280
[ 1.245193]
[ 1.245193] stack backtrace:
[ 1.245193] CPU: 2 PID: 742 Comm: cp Not tainted 4.2.0-rc1+ #37
[ 1.245193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140816_022509-build35 04/01/2014
[ 1.245193] ffffffff8252d530 ffff88007b023a38 ffffffff814f6f49 ffffffff810b56c5
[ 1.245193] ffff88007c30cc80 ffff88007b023af8 ffffffff810a150d ffff88007b023a68
[ 1.245193] 000000008101302a ffff880000000000 00000008f447e23f ffffffff8252d500
[ 1.245193] Call Trace:
[ 1.245193] [] dump_stack+0x4c/0x65
[ 1.245193] [] ? console_unlock+0x1c5/0x510
[ 1.245193] [] __lock_acquire+0x1a6d/0x1ea0
[ 1.245193] [] ? __lock_is_held+0x58/0x80
[ 1.245193] [] lock_acquire+0xd3/0x270
[ 1.245193] [] ? ubifs_init_security+0x29/0xb0
[ 1.245193] [] mutex_lock_nested+0x6b/0x3a0
[ 1.245193] [] ? ubifs_init_security+0x29/0xb0
[ 1.245193] [] ? ubifs_init_security+0x29/0xb0
[ 1.245193] [] ubifs_init_security+0x29/0xb0
[ 1.245193] [] ubifs_create+0xa6/0x1f0
[ 1.245193] [] ? path_openat+0x3af/0x1280
[ 1.245193] [] vfs_create+0x95/0xc0
[ 1.245193] [] path_openat+0x7cc/0x1280
[ 1.245193] [] ? __lock_acquire+0x543/0x1ea0
[ 1.245193] [] ? sched_clock_cpu+0x90/0xc0
[ 1.245193] [] ? calc_global_load_tick+0x60/0x90
[ 1.245193] [] ? sched_clock_cpu+0x90/0xc0
[ 1.245193] [] ? __alloc_fd+0xaf/0x180
[ 1.245193] [] do_filp_open+0x75/0xd0
[ 1.245193] [] ? _raw_spin_unlock+0x26/0x40
[ 1.245193] [] ? __alloc_fd+0xaf/0x180
[ 1.245193] [] do_sys_open+0x129/0x200
[ 1.245193] [] SyS_open+0x19/0x20
[ 1.245193] [] entry_SYSCALL_64_fastpath+0x12/0x6f

While the lockdep splat is a false positive, because path_openat holds i_mutex
of the parent directory and ubifs_init_security() tries to acquire i_mutex
of a new inode, it reveals that taking i_mutex in ubifs_init_security() is
in vain because it is only being called in the inode allocation path
and therefore nobody else can see the inode yet.

Reported-and-tested-by: Boris Brezillon
Reviewed-and-tested-by: Dongsheng Yang
Signed-off-by: Richard Weinberger
Signed-off-by: dedekind1@gmail.com
Signed-off-by: Greg Kroah-Hartman

Richard Weinberger
2015-10-23 05:43:24 +0800
d3a1196bf inet: fix potential deadlock in reqsk_queue_unlink()

commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af upstream.

When replacing del_timer() with del_timer_sync(), I introduced
a deadlock condition:

reqsk_queue_unlink() is called from inet_csk_reqsk_queue_drop().

inet_csk_reqsk_queue_drop() can be called from many contexts,
one being the timer handler itself (reqsk_timer_handler()).

In this case, del_timer_sync() loops forever.

The simple fix is to test whether the timer is pending.

Fixes: 2235f2ac75fd ("inet: fix races with reqsk timers")
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Cc: Holger Hoffstätte
Cc: Andre Tomt
Cc: Chris Caputo
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2015-10-23 05:43:24 +0800
a58897f9e rsi: Fix possible leak when loading firmware

commit a8b9774571d46506a0774b1ced3493b1245cf893 upstream.

Commit 5d5cd85ff441 ("rsi: Fix failure to load firmware after memory
leak fix and fix the leak") also added a check on the allocation of
DMA-accessible memory that may directly return. In that case the
already allocated firmware data is leaked. Make sure the data is
always freed correctly. Detected by Coverity CID 1316519.

Fixes: 5d5cd85ff441 ("rsi: Fix failure to load firmware after memory leak fix and fix the leak")
Signed-off-by: Christian Engelmayer
Signed-off-by: Kalle Valo
Signed-off-by: Greg Kroah-Hartman

Christian Engelmayer
2015-10-23 05:43:24 +0800
e6b5ff2bb powerpc/MSI: Fix race condition in tearing down MSI interrupts

commit e297c939b745e420ef0b9dc989cb87bda617b399 upstream.

This fixes a race which can result in the same virtual IRQ number
being assigned to two different MSI interrupts. The most visible
consequence of that is usually a warning and stack trace from the
sysfs code about an attempt to create a duplicate entry in sysfs.

The race happens when one CPU (say CPU 0) is disposing of an MSI
while another CPU (say CPU 1) is setting up an MSI. CPU 0 calls
(for example) pnv_teardown_msi_irqs(), which calls
msi_bitmap_free_hwirqs() to indicate that the MSI (i.e. its
hardware IRQ number) is no longer in use. Then, before CPU 0 gets
to calling irq_dispose_mapping() to free up the virtual IRQ number,
CPU 1 comes in and calls msi_bitmap_alloc_hwirqs() to allocate an
MSI, and gets the same hardware IRQ number that CPU 0 just freed.
CPU 1 then calls irq_create_mapping() to get a virtual IRQ number,
which sees that there is currently a mapping for that hardware IRQ
number and returns the corresponding virtual IRQ number (which is
the same virtual IRQ number that CPU 0 was using). CPU 0 then
calls irq_dispose_mapping() and frees that virtual IRQ number.
Now, if another CPU comes along and calls irq_create_mapping(), it
is likely to get the virtual IRQ number that was just freed,
resulting in the same virtual IRQ number apparently being used for
two different hardware interrupts.

To fix this race, we just move the call to msi_bitmap_free_hwirqs()
to after the call to irq_dispose_mapping(). Since virq_to_hw()
doesn't work for the virtual IRQ number after irq_dispose_mapping()
has been called, we need to call it before irq_dispose_mapping() and
remember the result for the msi_bitmap_free_hwirqs() call.
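
The reordering described above can be modelled with a toy allocator in
userspace C. All names here (toy_virq_to_hw(), toy_dispose_mapping(),
toy_free_hwirq(), toy_teardown_msi()) are hypothetical stand-ins, not the
powerpc code; the sketch only shows the order of the three steps: look up
the hardware IRQ number, dispose of the mapping, and only then release the
hardware IRQ number.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_HWIRQS 8

/* Toy state: which hardware IRQ numbers are allocated, and which
 * virtual IRQ number each one maps to (0 means "no mapping"). */
static bool hwirq_in_use[NR_HWIRQS];
static int  virq_of_hwirq[NR_HWIRQS];

/* Reverse lookup; returns -1 once the mapping has been disposed of. */
static int toy_virq_to_hw(int virq)
{
    for (int hw = 0; hw < NR_HWIRQS; hw++)
        if (virq_of_hwirq[hw] == virq)
            return hw;
    return -1;
}

/* Drop the virq -> hwirq mapping. */
static void toy_dispose_mapping(int virq)
{
    int hw = toy_virq_to_hw(virq);

    if (hw >= 0)
        virq_of_hwirq[hw] = 0;
}

/* Mark the hardware IRQ number as available for reallocation. */
static void toy_free_hwirq(int hw)
{
    assert(hw >= 0 && hw < NR_HWIRQS);
    hwirq_in_use[hw] = false;
}

/* Teardown in the fixed order: remember hwirq, dispose, then free. */
static void toy_teardown_msi(int virq)
{
    int hwirq;

    assert(virq > 0);

    hwirq = toy_virq_to_hw(virq);   /* lookup stops working after disposal */
    toy_dispose_mapping(virq);
    if (hwirq >= 0)
        toy_free_hwirq(hwirq);      /* only now may another CPU reuse it */
}
```

Because the hardware IRQ number stays marked as in use until the mapping is
gone, a concurrent allocation in this model cannot be handed a hardware IRQ
number that still has a stale virtual IRQ mapping.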

The pattern of calling msi_bitmap_free_hwirqs() before
irq_dispose_mapping() appears in 5 places under arch/powerpc, and
appears to have originated in commit 05af7bd2d75e ("[POWERPC] MPIC
U3/U4 MSI backend") from 2007.

Fixes: 05af7bd2d75e ("[POWERPC] MPIC U3/U4 MSI backend")
Reported-by: Alexey Kardashevskiy
Signed-off-by: Paul Mackerras
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

Paul Mackerras
2015-10-23 05:43:24 +0800
41f3fa173 tools lib traceevent: Fix string handling in heterogeneous arch environments

commit c2e4b24ff848bb180f9b9cd873a38327cd219ad2 upstream.

When a trace recorded on a 32-bit device is processed with a 64-bit
binary, the higher 32 bits of the address need to be ignored.

Without this, the 64-bit pointer value is printed in the trace output,
because the 32-bit address lookup fails in find_printk().

Before:

burn-1778 [003] 548.600305: bputs: 0xc0046db2s: 2cec5c058d98c

After:

burn-1778 [003] 548.600305: bputs: 0xc0046db2s: RT throttling activated

The problem occurs in PRINT_FIELD when the field is recognized as a
pointer to a string (of the type const char *).

The following heterogeneous architecture cases can arise and should be handled:

* Traces recorded using 32-bit addresses processed on a 64-bit machine
* Traces recorded using 64-bit addresses processed on a 32-bit machine
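
A sketch of the idea, assuming the recorded long size (4 or 8 bytes) is known
from the trace metadata and that recorder and reader share the same byte
order. read_recorded_addr() is a hypothetical helper written for this
illustration, not a libtraceevent function: it reads exactly as many bytes as
the recording machine used for a pointer, so stray upper bits never reach the
address lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a pointer-sized value from raw record data, using the long size
 * of the machine that recorded the trace rather than the size of a
 * pointer on the machine processing it. */
static uint64_t read_recorded_addr(const void *data, int long_size)
{
    assert(long_size == 4 || long_size == 8);

    if (long_size == 8) {
        uint64_t v;

        memcpy(&v, data, sizeof(v));
        return v;
    } else {
        uint32_t v;

        memcpy(&v, data, sizeof(v));
        return v;   /* zero-extended: upper 32 bits are ignored */
    }
}
```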

Reported-by: Juri Lelli
Signed-off-by: Kapileshwar Singh
Reviewed-by: Steven Rostedt
Cc: David Ahern
Cc: Javi Merino
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: http://lkml.kernel.org/r/1442928123-13824-1-git-send-email-kapileshwar.singh@arm.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Greg Kroah-Hartman

Kapileshwar Singh
2015-10-23 05:43:24 +0800
42719676d batman-adv: Fix potentially broken skb network header access

commit 53cf037bf846417fd92dc92ddf97267f69b110f4 upstream.

The two commits noted below added calls to ip_hdr() and ipv6_hdr(). They
need a correctly set skb network header.

Unfortunately we cannot rely on the device drivers to set it for us.
Therefore set it at the beginning of the corresponding ndo_start_xmit
handler.

Fixes: 1d8ab8d3c176 ("batman-adv: Modified forwarding behaviour for multicast packets")
Fixes: ab49886e3da7 ("batman-adv: Add IPv4 link-local/IPv6-ll-all-nodes multicast support")
Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli
Signed-off-by: Greg Kroah-Hartman

Linus Lüssing
2015-10-23 05:43:24 +0800
3e6263c02 batman-adv: Fix potential synchronization issues in mcast tvlv handler

commit 8a4023c5b5e30b11f1f383186f4a7222b3b823cf upstream.

So far the mcast tvlv handler did not anticipate the processing of
multiple incoming OGMs from the same originator at the same time. This
can lead to various issues:

* Broken refcounting: For instance two mcast handlers might both assume
that an originator just got multicast capabilities and will together
wrongly decrease mcast.num_disabled by two, potentially leading to
an integer underflow.

* Potential kernel panic on hlist_del_rcu(): Two mcast handlers might
one after another try to do an
hlist_del_rcu(&orig->mcast_want_all_*_node). The second one will
cause memory corruption / crashes.
(Reported by: Sven Eckelmann)

Right at the beginning, the code path makes assumptions about the current
multicast-related state of an originator and bases all updates on that. The
easiest and least error-prone way to fix the issues in this case is to
serialize multiple mcast handler invocations with a spinlock.

Fixes: 60432d756cf0 ("batman-adv: Announce new capability via multicast TVLV")
Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli
Signed-off-by: Greg Kroah-Hartman

Linus Lüssing
2015-10-23 05:43:24 +0800
8dbeac75e batman-adv: Make MCAST capability changes atomic

commit 9c936e3f4c4fad07abb6c082a89508b8f724c88f upstream.

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.
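
For illustration only, the difference can be shown in userspace with C11
atomics standing in for the kernel's set_bit()/clear_bit()/test_bit(). The
helper names cap_set(), cap_clear() and cap_test() are hypothetical. A plain
"caps |= mask" is a separate load, OR and store, so a concurrent update to
another bit can be lost between the load and the store; an atomic
read-modify-write cannot lose it.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define CAP_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set one capability bit. */
static void cap_set(atomic_ulong *caps, unsigned int bit)
{
    assert(bit < CAP_BITS);
    atomic_fetch_or(caps, 1UL << bit);
}

/* Atomically clear one capability bit. */
static void cap_clear(atomic_ulong *caps, unsigned int bit)
{
    assert(bit < CAP_BITS);
    atomic_fetch_and(caps, ~(1UL << bit));
}

/* Atomically read the word and test one capability bit. */
static bool cap_test(atomic_ulong *caps, unsigned int bit)
{
    assert(bit < CAP_BITS);
    return (atomic_load(caps) & (1UL << bit)) != 0;
}
```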

Fixes: 60432d756cf0 ("batman-adv: Announce new capability via multicast TVLV")
Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli
Signed-off-by: Greg Kroah-Hartman

Linus Lüssing
2015-10-23 05:43:24 +0800
3dd853ed3 batman-adv: Make TT capability changes atomic

commit ac4eebd48461ec993e7cb614d5afe7df8c72e6b7 upstream.

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: e17931d1a61d ("batman-adv: introduce capability initialization bitfield")
Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli
Signed-off-by: Greg Kroah-Hartman

Linus Lüssing
2015-10-23 05:43:23 +0800
505f068df batman-adv: Make NC capability changes atomic

commit 4635469f5c617282f18c69643af36cd8c0acf707 upstream.

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: 3f4841ffb336 ("batman-adv: tvlv - add network coding container")
Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli
Signed-off-by: Greg Kroah-Hartman

Linus Lüssing
2015-10-23 05:43:23 +0800
88108b384 MIPS: dma-default: Fix 32-bit fall back to GFP_DMA

commit 53960059d56ecef67d4ddd546731623641a3d2d1 upstream.

If there is a DMA zone (usually 24-bit = 16 MB, I believe), but no DMA32
zone, as is the case for some 32-bit kernels, then massage_gfp_flags()
will cause DMA memory allocated for devices with a 32..63-bit
coherent_dma_mask to fall back to using __GFP_DMA, even though there may
only be 32 bits of physical address available anyway.

Correct that case to compare against a mask the size of phys_addr_t
instead of always using a 64-bit mask.
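
The comparison can be sketched in userspace C as below. BIT_MASK_N() and
needs_dma_zone() are hypothetical names for this illustration, not the MIPS
code, and the size of phys_addr_t is passed in as a parameter so that both
the 32-bit and the 64-bit case can be exercised on any host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask with the low n bits set, for n in 1..64. */
#define BIT_MASK_N(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Decide whether an allocation must fall back to the small DMA zone:
 * only when the device cannot address all of physical memory, i.e. its
 * coherent mask is narrower than a mask the size of phys_addr_t. */
static bool needs_dma_zone(uint64_t coherent_mask, unsigned int phys_addr_bytes)
{
    assert(phys_addr_bytes == 4 || phys_addr_bytes == 8);

    return coherent_mask < BIT_MASK_N(phys_addr_bytes * 8);
}
```

With a 4-byte phys_addr_t, a device with a 32-bit coherent mask no longer
falls back, whereas comparing against a fixed 64-bit mask would force it
into the DMA zone.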

Signed-off-by: James Hogan
Fixes: a2e715a86c6d ("MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.")
Cc: Ralf Baechle
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9610/
Signed-off-by: Ralf Baechle
Signed-off-by: Greg Kroah-Hartman

James Hogan
2015-10-23 05:43:23 +0800