Eric Lee / smarc-fsl-linux-kernel

27 Oct, 2015

40 commits

c7fc0d838 locks: inline posix_lock_file_wait and flock_lock_file_wait ... Browse Code »

commit ee296d7c5709440f8abd36b5b65c6b3e388538d9 upstream.

They just call file_inode and then the corresponding *_inode_file_wait
function. Just make them static inlines instead.

Signed-off-by: Jeff Layton
Cc: William Dauchy
Signed-off-by: Greg Kroah-Hartman

Jeff Layton
2015-10-27 08:52:00 +0800
b2540f146 locks: new helpers - flock_lock_inode_wait and posix_lock_inode_wait ... Browse Code »

commit 29d01b22eaa18d8b46091d3c98c6001c49f78e4a upstream.

Allow callers to pass in an inode instead of a filp.

Signed-off-by: Jeff Layton
Reviewed-by: "J. Bruce Fields"
Tested-by: "J. Bruce Fields"
Cc: William Dauchy
Signed-off-by: Greg Kroah-Hartman

Jeff Layton
2015-10-27 08:52:00 +0800
0bdb53e1b locks: have flock_lock_file take an inode pointer instead of a filp ... Browse Code »

commit bcd7f78d078ff6197715c1ed070c92aca57ec12c upstream.

...and rename it to better describe how it works.

In order to fix a use-after-free in NFS, we need to be able to remove
locks from an inode after the filp associated with them may have already
been freed. flock_lock_file already only dereferences the filp to get to
the inode, so just change it so the callers do that.

All of the callers already pass in a lock request that has the fl_file
set properly, so we don't need to pass it in individually. With that
change it now only dereferences the filp to get to the inode, so just
push that out to the callers.

Signed-off-by: Jeff Layton
Reviewed-by: "J. Bruce Fields"
Tested-by: "J. Bruce Fields"
Cc: William Dauchy
Signed-off-by: Greg Kroah-Hartman

Jeff Layton
2015-10-27 08:51:59 +0800
23a0f8cd3 svcrdma: handle rdma read with a non-zero initial page offset ... Browse Code »

commit c91aed9896946721bb30705ea2904edb3725dd61 upstream.

The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
were not taking into account the initial page_offset when determining
the rdma read length. This resulted in a read who's starting address
and length exceeded the base/bounds of the frmr.

The server gets an async error from the rdma device and kills the
connection, and the client then reconnects and resends. This repeats
indefinitely, and the application hangs.

Most work loads don't tickle this bug apparently, but one test hit it
every time: building the linux kernel on a 16 core node with 'make -j
16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.

This bug seems to only be tripped with devices having small fastreg page
list depths. I didn't see it with mlx4, for instance.

Fixes: 0bf4828983df ('svcrdma: refactor marshalling logic')
Signed-off-by: Steve Wise
Tested-by: Chuck Lever
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Steve Wise
2015-10-27 08:51:59 +0800
ae7d50053 arm64: Fix THP protection change logic ... Browse Code »

commit 1a541b4e3cd6f5795022514114854b3e1345f24e upstream.

6910fa1 ("arm64: enable PTE type bit in the mask for pte_modify") fixes
a problem whereby a large block of PROT_NONE mapped memory is
incorrectly mapped as block descriptors when mprotect is called.

Unfortunately, a subtle bug was introduced by this fix to the THP logic.

If one mmaps a large block of memory, then faults it such that it is
collapsed into THPs; resulting calls to mprotect on this area of memory
will lead to incorrect table descriptors being written instead of block
descriptors. This is because pmd_modify calls pte_modify which is now
allowed to modify the type of the page table entry.

This patch reverts commit 6910fa16dbe142f6a0fd0fd7c249f9883ff7fc8a, and
fixes the problem it was trying to address by adjusting PAGE_NONE to
represent a table entry. Thus no change in pte type is required when
moving from PROT_NONE to a different protection.

Fixes: 6910fa16dbe1 ("arm64: enable PTE type bit in the mask for pte_modify")
Cc: # 4.0+
Cc: Feng Kan
Reported-by: Ganapatrao Kulkarni
Tested-by: Ganapatrao Kulkarni
Reviewed-by: Catalin Marinas
[SteveC: backported 1a541b4e3cd6f5795022514114854b3e1345f24e to 4.1 and
4.2 stable. Just one minor fix to second part to allow patch to apply
cleanly, no logic changed.]
Signed-off-by: Steve Capper
Signed-off-by: Catalin Marinas
Signed-off-by: Greg Kroah-Hartman

Steve Capper
2015-10-27 08:51:59 +0800
e44ddb677 pinctrl: imx25: ensure that a pin with id i is at position i in the info array ... Browse Code »

commit 9911a2d5e9d14e39692b751929a92cb5a1d9d0e0 upstream.

The code in pinctrl-imx.c only works correctly if in the
imx_pinctrl_soc_info passed to imx_pinctrl_probe we have:

info->pins[i].number = i
conf_reg(info->pins[i]) = 4 * i

(which conf_reg(pin) being the offset of the pin's configuration
register).

When the imx25 specific part was introduced in b4a87c9b966f ("pinctrl:
pinctrl-imx: add imx25 pinctrl driver") we had:

info->pins[i].number = i + 1
conf_reg(info->pins[i]) = 4 * i

. Commit 34027ca2bbc6 ("pinctrl: imx25: fix numbering for pins") tried
to fix that but made the situation:

info->pins[i-1].number = i
conf_reg(info->pins[i-1]) = 4 * i

which is hardly better but fixed the error seen back then.

So insert another reserved entry in the array to finally yield:

info->pins[i].number = i
conf_reg(info->pins[i]) = 4 * i

Fixes: 34027ca2bbc6 ("pinctrl: imx25: fix numbering for pins")
Signed-off-by: Uwe Kleine-König
Signed-off-by: Linus Walleij
Signed-off-by: Greg Kroah-Hartman

Uwe Kleine-König
2015-10-27 08:51:59 +0800
98197d3de sched/preempt: Fix cond_resched_lock() and cond_resched_softirq() ... Browse Code »

commit fe32d3cd5e8eb0f82e459763374aa80797023403 upstream.

These functions check should_resched() before unlocking spinlock/bh-enable:
preempt_count always non-zero => should_resched() always returns false.
cond_resched_lock() worked iff spin_needbreak is set.

This patch adds argument "preempt_offset" to should_resched().

preempt_count offset constants for that:

PREEMPT_DISABLE_OFFSET - offset after preempt_disable()
PREEMPT_LOCK_OFFSET - offset after spin_lock()
SOFTIRQ_DISABLE_OFFSET - offset after local_bh_distable()
SOFTIRQ_LOCK_OFFSET - offset after spin_lock_bh()

Signed-off-by: Konstantin Khlebnikov
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Graf
Cc: Boris Ostrovsky
Cc: David Vrabel
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Fixes: bdb438065890 ("sched: Extract the basic add/sub preempt_count modifiers")
Link: http://lkml.kernel.org/r/20150715095204.12246.98268.stgit@buzz
Signed-off-by: Ingo Molnar
Signed-off-by: Mike Galbraith
Signed-off-by: Greg Kroah-Hartman

Konstantin Khlebnikov
2015-10-27 08:51:58 +0800
3905f7abd sched/preempt: Rename PREEMPT_CHECK_OFFSET to PREEMPT_DISABLE_OFFSET ... Browse Code »

commit 90b62b5129d5cb50f62f40e684de7a1961e57197 upstream.

"CHECK" suggests it's only used as a comparison mask. But now it's used
further as a config-conditional preempt disabler offset. Lets
disambiguate this name.

Signed-off-by: Frederic Weisbecker
Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1431441711-29753-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar
Signed-off-by: Mike Galbraith
Signed-off-by: Greg Kroah-Hartman

Frederic Weisbecker
2015-10-27 08:51:58 +0800
dd703751f rbd: fix double free on rbd_dev->header_name ... Browse Code »

commit 3ebe138ac642a195c7f2efdb918f464734421fd6 upstream.

If rbd_dev_image_probe() in rbd_dev_probe_parent() fails, header_name
is freed twice: once in rbd_dev_probe_parent() and then in its caller
rbd_dev_image_probe() (rbd_dev_image_probe() is called recursively to
handle parent images).

rbd_dev_probe_parent() is responsible for probing the parent, so it
shouldn't muck with clone's fields.

Signed-off-by: Ilya Dryomov
Reviewed-by: Alex Elder
Signed-off-by: Greg Kroah-Hartman

Ilya Dryomov
2015-10-27 08:51:58 +0800
015ec5d44 dm thin: fix missing pool reference count decrement in pool_ctr error path ... Browse Code »

commit ba30670f4d5292c4e7f7980bbd5071f7c4794cdd upstream.

Fixes: ac8c3f3df ("dm thin: generate event when metadata threshold passed")
Signed-off-by: Mike Snitzer
Signed-off-by: Greg Kroah-Hartman

Mike Snitzer
2015-10-27 08:51:57 +0800
4e4887f08 drm/radeon: add pm sysfs files late ... Browse Code »

commit 51a4726b04e880fdd9b4e0e58b13f70b0a68a7f5 upstream.

They were added relatively early in the driver init process
which meant that in some cases the driver was not finished
initializing before external tools tried to use them which
could result in a crash depending on the timing.

Signed-off-by: Alex Deucher
Signed-off-by: Greg Kroah-Hartman

Alex Deucher
2015-10-27 08:51:57 +0800
17c5761b0 drm/radeon: attach tile property to mst connector ... Browse Code »

commit bc8c131ccdd62d4ed4f33c6b50f92907e7c32dee upstream.

This allows tiled monitors to work with radeon once mst is enabled.

Signed-off-by: Dave Airlie
Signed-off-by: Greg Kroah-Hartman

Dave Airlie
2015-10-27 08:51:57 +0800
510035b39 drm/dp/mst: make mst i2c transfer code more robust. ... Browse Code »

commit ae491542cbbbcca0ec8938c37d4079a985e58440 upstream.

This zeroes the msg so no random stack data ends up getting
sent, it also limits the function to not accepting > 4
i2c msgs.

Reviewed-by: Daniel Vetter
Signed-off-by: Dave Airlie
Signed-off-by: Greg Kroah-Hartman

Dave Airlie
2015-10-27 08:51:57 +0800
db531487d drm/nouveau/fbcon: take runpm reference when userspace has an open fd ... Browse Code »

commit f231976c2e8964ceaa9250e57d27c35ff03825c2 upstream.

We need to do this in order to prevent accesses to the device while it's
powered down. Userspace may have an mmap of the fb, and there's no good
way (that I know of) to prevent it from touching the device otherwise.

This fixes some nasty races between runpm and plymouth on some systems,
which result in the GPU getting very upset and hanging the boot.

Signed-off-by: Ben Skeggs
Signed-off-by: Greg Kroah-Hartman

Ben Skeggs
2015-10-27 08:51:56 +0800
65825ff63 workqueue: make sure delayed work run in local cpu ... Browse Code »

commit 874bbfe600a660cba9c776b3957b1ce393151b76 upstream.

My system keeps crashing with below message. vmstat_update() schedules a delayed
work in current cpu and expects the work runs in the cpu.
schedule_delayed_work() is expected to make delayed work run in local cpu. The
problem is timer can be migrated with NO_HZ. __queue_work() queues work in
timer handler, which could run in a different cpu other than where the delayed
work is scheduled. The end result is the delayed work runs in different cpu.
The patch makes __queue_delayed_work records local cpu earlier. Where the timer
runs doesn't change where the work runs with the change.

[ 28.010131] ------------[ cut here ]------------
[ 28.010609] kernel BUG at ../mm/vmstat.c:1392!
[ 28.011099] invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[ 28.011860] Modules linked in:
[ 28.012245] CPU: 0 PID: 289 Comm: kworker/0:3 Tainted: G W4.3.0-rc3+ #634
[ 28.013065] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153802- 04/01/2014
[ 28.014160] Workqueue: events vmstat_update
[ 28.014571] task: ffff880117682580 ti: ffff8800ba428000 task.ti: ffff8800ba428000
[ 28.015445] RIP: 0010:[] []vmstat_update+0x31/0x80
[ 28.016282] RSP: 0018:ffff8800ba42fd80 EFLAGS: 00010297
[ 28.016812] RAX: 0000000000000000 RBX: ffff88011a858dc0 RCX:0000000000000000
[ 28.017585] RDX: ffff880117682580 RSI: ffffffff81f14d8c RDI:ffffffff81f4df8d
[ 28.018366] RBP: ffff8800ba42fd90 R08: 0000000000000001 R09:0000000000000000
[ 28.019169] R10: 0000000000000000 R11: 0000000000000121 R12:ffff8800baa9f640
[ 28.019947] R13: ffff88011a81e340 R14: ffff88011a823700 R15:0000000000000000
[ 28.020071] FS: 0000000000000000(0000) GS:ffff88011a800000(0000)knlGS:0000000000000000
[ 28.020071] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 28.020071] CR2: 00007ff6144b01d0 CR3: 00000000b8e93000 CR4:00000000000006f0
[ 28.020071] Stack:
[ 28.020071] ffff88011a858dc0 ffff8800baa9f640 ffff8800ba42fe00ffffffff8106bd88
[ 28.020071] ffffffff8106bd0b 0000000000000096 0000000000000000ffffffff82f9b1e8
[ 28.020071] ffffffff829f0b10 0000000000000000 ffffffff81f18460ffff88011a81e340
[ 28.020071] Call Trace:
[ 28.020071] [] process_one_work+0x1c8/0x540
[ 28.020071] [] ? process_one_work+0x14b/0x540
[ 28.020071] [] worker_thread+0x114/0x460
[ 28.020071] [] ? process_one_work+0x540/0x540
[ 28.020071] [] kthread+0xf8/0x110
[ 28.020071] [] ?kthread_create_on_node+0x200/0x200
[ 28.020071] [] ret_from_fork+0x3f/0x70
[ 28.020071] [] ?kthread_create_on_node+0x200/0x200

Signed-off-by: Shaohua Li
Signed-off-by: Tejun Heo
Signed-off-by: Greg Kroah-Hartman

Shaohua Li
2015-10-27 08:51:56 +0800
90e210c2d i2c: designware-platdrv: enable RuntimePM before registering to the core ... Browse Code »

commit 36d48fb5766aee9717e429f772046696b215282d upstream.

The core may register clients attached to this master which may use
funtionality from the master. So, RuntimePM must be enabled before, otherwise
this will fail.

Signed-off-by: Wolfram Sang
Signed-off-by: Wolfram Sang
Acked-by: Mika Westerberg
Signed-off-by: Greg Kroah-Hartman

Wolfram Sang
2015-10-27 08:51:56 +0800
67e6df4ba i2c: designware: Do not use parameters from ACPI on Dell Inspiron 7348 ... Browse Code »

commit 56d4b8a24cef5d66f0d10ac778a520d3c2c68a48 upstream.

ACPI SSCN/FMCN methods were originally added because then the platform can
provide the most accurate HCNT/LCNT values to the driver. However, this
seems not to be true for Dell Inspiron 7348 where using these causes the
touchpad to fail in boot:

i2c_hid i2c-DLL0675:00: failed to retrieve report from device.
i2c_designware INT3433:00: i2c_dw_handle_tx_abort: lost arbitration
i2c_hid i2c-DLL0675:00: failed to retrieve report from device.
i2c_designware INT3433:00: controller timed out

The values received from ACPI are (in fast mode):

HCNT: 72
LCNT: 160

this translates to following timings (input clock is 100MHz on Broadwell):

tHIGH: 720 ns (spec min 600 ns)
tLOW: 1600 ns (spec min 1300 ns)
Bus period: 2920 ns (assuming 300 ns tf and tr)
Bus speed: 342.5 kHz

Both tHIGH and tLOW are within the I2C specification.

The calculated values when ACPI parameters are not used are (in fast mode):

HCNT: 87
LCNT: 159

which translates to:

tHIGH: 870 ns (spec min 600 ns)
tLOW: 1590 ns (spec min 1300 ns)
Bus period 3060 ns (assuming 300 ns tf and tr)
Bus speed 326.8 kHz

These values are also within the I2C specification.

Since both ACPI and calculated values meet the I2C specification timing
requirements it is hard to say why the touchpad does not function properly
with the ACPI values except that the bus speed is higher in this case (but
still well below the max 400kHz).

Solve this by adding DMI quirk to the driver that disables using ACPI
parameters on this particulare machine.

Reported-by: Pavel Roskin
Signed-off-by: Mika Westerberg
Tested-by: Pavel Roskin
Signed-off-by: Wolfram Sang
Signed-off-by: Greg Kroah-Hartman

Mika Westerberg
2015-10-27 08:51:56 +0800
d07e907e6 i2c: s3c2410: enable RuntimePM before registering to the core ... Browse Code »

commit eadd709f5d2e8aebb1b7bf49460e97a68d81a9b0 upstream.

The core may register clients attached to this master which may use
funtionality from the master. So, RuntimePM must be enabled before, otherwise
this will fail. While here, move drvdata, too.

Signed-off-by: Wolfram Sang
Tested-by: Krzysztof Kozlowski
Acked-by: Kukjin Kim
Signed-off-by: Wolfram Sang
Signed-off-by: Greg Kroah-Hartman

Wolfram Sang
2015-10-27 08:51:55 +0800
4689c5051 i2c: rcar: enable RuntimePM before registering to the core ... Browse Code »

commit 4f7effddf4549d57114289f273710f077c4c330a upstream.

The core may register clients attached to this master which may use
funtionality from the master. So, RuntimePM must be enabled before, otherwise
this will fail. While here, move drvdata, too.

Reported-by: Geert Uytterhoeven
Signed-off-by: Wolfram Sang
Signed-off-by: Wolfram Sang
Signed-off-by: Greg Kroah-Hartman

Wolfram Sang
2015-10-27 08:51:55 +0800
b6af04a21 mfd: max77843: Fix max77843_chg_init() return on error ... Browse Code »

commit 1b52e50f2a402a266f1ba2281f0a57e87637a047 upstream.

If i2c_new_dummy() fails in max77843_chg_init(), an PTR_ERR(NULL) is
returned which is 0. So the function was wrongly returning a success
value instead of an error code.

Fixes: c7f585fe46d8 ("mfd: max77843: Add max77843 MFD driver core driver")
Signed-off-by: Javier Martinez Canillas
Reviewed-by: Krzysztof Kozlowski
Signed-off-by: Lee Jones
Signed-off-by: Greg Kroah-Hartman

Javier Martinez Canillas
2015-10-27 08:51:55 +0800
271759afb nfsd/blocklayout: accept any minlength ... Browse Code »

commit 8c3ad9cb7343dc5f61b8cf3cdbe1016c5e7c2c8b upstream.

Recent Linux clients have started to send GETLAYOUT requests with
minlength less than blocksize.

Servers aren't really allowed to impose this kind of restriction on
layouts; see RFC 5661 section 18.43.3 for details.

This has been observed to cause indefinite hangs on fsx runs on some
clients.

Signed-off-by: Christoph Hellwig
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Christoph Hellwig
2015-10-27 08:51:54 +0800
a20f5eac8 arm64: errata: use KBUILD_CFLAGS_MODULE for erratum #843419 ... Browse Code »

commit b6dd8e0719c0d2d01429639a11b7bc2677de240c upstream.

Commit df057cc7b4fa ("arm64: errata: add module build workaround for
erratum #843419") sets CFLAGS_MODULE to ensure that the large memory
model is used by the compiler when building kernel modules.

However, CFLAGS_MODULE is an environment variable and intended to be
overridden on the command line, which appears to be the case with the
Ubuntu kernel packaging system, so use KBUILD_CFLAGS_MODULE instead.

Cc: Ard Biesheuvel
Fixes: df057cc7b4fa ("arm64: errata: add module build workaround for erratum #843419")
Reported-by: Dann Frazier
Tested-by: Dann Frazier
Signed-off-by: Will Deacon
Signed-off-by: Greg Kroah-Hartman

Will Deacon
2015-10-27 08:51:54 +0800
6780e0d1b btrfs: fix use after free iterating extrefs ... Browse Code »

commit dc6c5fb3b514221f2e9d21ee626a9d95d3418dff upstream.

The code for btrfs inode-resolve has never worked properly for
files with enough hard links to trigger extrefs. It was trying to
get the leaf out of a path after freeing the path:

btrfs_release_path(path);
leaf = path->nodes[0];
item_size = btrfs_item_size_nr(leaf, slot);

The fix here is to use the extent buffer we cloned just a little higher
up to avoid deadlocks caused by using the leaf in the path.

Signed-off-by: Chris Mason
cc: Mark Fasheh
Reviewed-by: Filipe Manana
Reviewed-by: Mark Fasheh
Signed-off-by: Greg Kroah-Hartman

Chris Mason
2015-10-27 08:51:54 +0800
2382a147e btrfs: check unsupported filters in balance arguments ... Browse Code »

commit 8eb934591f8bf584969454a658f629cd06e59f3a upstream.

We don't verify that all the balance filter arguments supplemented by
the flags are actually known to the kernel. Thus we let it silently pass
and do nothing.

At the moment this means only the 'limit' filter, but we're going to add
a few more soon so it's better to have that fixed. Also in older stable
kernels so that it works with newer userspace tools.

Signed-off-by: David Sterba
Signed-off-by: Chris Mason
Signed-off-by: Greg Kroah-Hartman

David Sterba
2015-10-27 08:51:54 +0800
2105f9aec memcg: convert threshold to bytes ... Browse Code »

commit 424cdc14138088ada1b0e407a2195b2783c6e5ef upstream.

page_counter_memparse() returns pages for the threshold, while
mem_cgroup_usage() returns bytes for memory usage. Convert the
threshold to bytes.

Fixes: 3e32cb2e0a12b6915 ("memcg: rename cgroup_event to mem_cgroup_event").
Signed-off-by: Shaohua Li
Cc: Johannes Weiner
Acked-by: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Shaohua Li
2015-10-27 08:51:53 +0800
3ae2c7951 crypto: ahash - ensure statesize is non-zero ... Browse Code »

commit 8996eafdcbad149ac0f772fb1649fbb75c482a6a upstream.

Unlike shash algorithms, ahash drivers must implement export
and import as their descriptors may contain hardware state and
cannot be exported as is. Unfortunately some ahash drivers did
not provide them and end up causing crashes with algif_hash.

This patch adds a check to prevent these drivers from registering
ahash algorithms until they are fixed.

Signed-off-by: Russell King
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Russell King
2015-10-27 08:51:53 +0800
6ebf06eb1 crypto: sparc - initialize blkcipher.ivsize ... Browse Code »

commit a66d7f724a96d6fd279bfbd2ee488def6b081bea upstream.

Some of the crypto algorithms write to the initialization vector,
but no space has been allocated for it. This clobbers adjacent memory.

Signed-off-by: Dave Kleikamp
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Dave Kleikamp
2015-10-27 08:51:53 +0800
d96105a5f drm: Fix locking for sysfs dpms file ... Browse Code »

commit 621bd0f6982badd6483acb191eb7b6226a578328 upstream.

With atomic drivers we need to make sure that (at least in general)
property reads hold the right locks. But the legacy dpms property is
special and can be read locklessly. Since userspace loves to just
randomly look at that all the time (like with "status") do that.

To make it clear that we play tricks use the READ_ONCE compiler
barrier (and also for paranoia).

Note that there's not really anything bad going on since even with the
new atomic paths we eventually end up not chasing any pointers (and
hence possibly freed memory and other fun stuff). The locking WARNING
has been added in

commit 88a48e297b3a3bac6022c03babfb038f1a886cea
Author: Rob Clark
Date: Thu Dec 18 16:01:50 2014 -0500

drm: add atomic properties

but since drivers are converting not everyone will have seen this from
the start.

Jens reported this and submitted a patch to just grab the
mode_config.connection_mutex, but we can do a bit better.

v2: Remove unused variables I failed to git add for real.

Reference: http://mid.gmane.org/20150928194822.GA3930@kernel.dk
Reported-by: Jens Axboe
Tested-by: Jens Axboe
Cc: Rob Clark
Signed-off-by: Daniel Vetter
Signed-off-by: Dave Airlie
Signed-off-by: Greg Kroah-Hartman

Daniel Vetter
2015-10-27 08:51:53 +0800
e09e88967 net/unix: fix logic about sk_peek_offset ... Browse Code »

[ Upstream commit e9193d60d363e4dff75ff6d43a48f22be26d59c7 ]

Now send with MSG_PEEK can return data from multiple SKBs.

Unfortunately we take into account the peek offset for each skb,
that is wrong. We need to apply the peek offset only once.

In addition, the peek offset should be used only if MSG_PEEK is set.

Cc: "David S. Miller" (maintainer:NETWORKING
Cc: Eric Dumazet (commit_signer:1/14=7%)
Cc: Aaron Conole
Fixes: 9f389e35674f ("af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag")
Signed-off-by: Andrey Vagin
Tested-by: Aaron Conole
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Andrey Vagin
2015-10-27 08:51:52 +0800
9bf31c538 af_unix: return data from multiple SKBs on recv() with MSG_PEEK flag ... Browse Code »

[ Upstream commit 9f389e35674f5b086edd70ed524ca0f287259725 ]

AF_UNIX sockets now return multiple skbs from recv() when MSG_PEEK flag
is set.

This is referenced in kernel bugzilla #12323 @
https://bugzilla.kernel.org/show_bug.cgi?id=12323

As described both in the BZ and lkml thread @
http://lkml.org/lkml/2008/1/8/444 calling recv() with MSG_PEEK on an
AF_UNIX socket only reads a single skb, where the desired effect is
to return as much skb data has been queued, until hitting the recv
buffer size (whichever comes first).

The modified MSG_PEEK path will now move to the next skb in the tree
and jump to the again: label, rather than following the natural loop
structure. This requires duplicating some of the loop head actions.

This was tested using the python socketpair python code attached to
the bugzilla issue.

Signed-off-by: Aaron Conole
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Aaron Conole
2015-10-27 08:51:52 +0800
a96cb1c60 af_unix: Convert the unix_sk macro to an inline function for type safety ... Browse Code »

[ Upstream commit 4613012db1d911f80897f9446a49de817b2c4c47 ]

As suggested by Eric Dumazet this change replaces the
#define with a static inline function to enjoy
complaints by the compiler when misusing the API.

Signed-off-by: Aaron Conole
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Aaron Conole
2015-10-27 08:51:52 +0800
e962218b4 netlink: Trim skb to alloc size to avoid MSG_TRUNC ... Browse Code »

[ Upstream commit db65a3aaf29ecce2e34271d52e8d2336b97bd9fe ]

netlink_dump() allocates skb based on the calculated min_dump_alloc or
a per socket max_recvmsg_len.
min_alloc_size is maximum space required for any single netdev
attributes as calculated by rtnl_calcit().
max_recvmsg_len tracks the user provided buffer to netlink_recvmsg.
It is capped at 16KiB.
The intention is to avoid small allocations and to minimize the number
of calls required to obtain dump information for all net devices.

netlink_dump packs as many small messages as could fit within an skb
that was sized for the largest single netdev information. The actual
space available within an skb is larger than what is requested. It could
be much larger and up to near 2x with align to next power of 2 approach.

Allowing netlink_dump to use all the space available within the
allocated skb increases the buffer size a user has to provide to avoid
truncaion (i.e. MSG_TRUNG flag set).

It was observed that with many VLANs configured on at least one netdev,
a larger buffer of near 64KiB was necessary to avoid "Message truncated"
error in "ip link" or "bridge [-c[ompressvlans]] vlan show" when
min_alloc_size was only little over 32KiB.

This patch trims skb to allocated size in order to allow the user to
avoid truncation with more reasonable buffer size.

Signed-off-by: Ronen Arad
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Arad, Ronen
2015-10-27 08:51:52 +0800
7b61554c2 tipc: move fragment importance field to new header position ... Browse Code »

[ Upstream commit dde4b5ae65de659b9ec64bafdde0430459fcb495 ]

In commit e3eea1eb47a ("tipc: clean up handling of message priorities")
we introduced a field in the packet header for keeping track of the
priority of fragments, since this value is not present in the specified
protocol header. Since the value so far only is used at the transmitting
end of the link, we have not yet officially defined it as part of the
protocol.

Unfortunately, the field we use for keeping this value, bits 13-15 in
in word 5, has turned out to be a poor choice; it is already used by the
broadcast protocol for carrying the 'network id' field of the sending
node. Since packet fragments also need to be transported across the
broadcast protocol, the risk of conflict is obvious, and we see this
happen when we use network identities larger than 2^13-1. This has
escaped our testing because we have so far only been using small network
id values.

We now move this field to bits 0-2 in word 9, a field that is guaranteed
to be unused by all involved protocols.

Fixes: e3eea1eb47a ("tipc: clean up handling of message priorities")
Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Jon Paul Maloy
2015-10-27 08:51:51 +0800
3f90898b4 ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings ... Browse Code »

[ Upstream commit 077cb37fcf6f00a45f375161200b5ee0cd4e937b ]

It seems that kernel memory can leak into userspace by a
kmalloc, ethtool_get_strings, then copy_to_user sequence.

Avoid this by using kcalloc to zero fill the copied buffer.

Signed-off-by: Joe Perches
Acked-by: Ben Hutchings
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Joe Perches
2015-10-27 08:51:51 +0800
3043def89 act_mirred: clear sender cpu before sending to tx ... Browse Code »

[ Upstream commit d40496a56430eac0d330378816954619899fe303 ]

Similar to commit c29390c6dfee ("xps: must clear sender_cpu before forwarding")
the skb->sender_cpu needs to be cleared when moving from Rx
Tx, otherwise kernel could crash.

Fixes: 2bd82484bb4c ("xps: fix xps for stacked devices")
Cc: Eric Dumazet
Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Signed-off-by: Cong Wang
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

WANG Cong
2015-10-27 08:51:51 +0800
bc845f677 ovs: do not allocate memory from offline numa node ... Browse Code »

[ Upstream commit 598c12d0ba6de9060f04999746eb1e015774044b ]

When openvswitch tries allocate memory from offline numa node 0:
stats = kmem_cache_alloc_node(flow_stats_cache, GFP_KERNEL | __GFP_ZERO, 0)
It catches VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
[ replaced with VM_WARN_ON(!node_online(nid)) recently ] in linux/gfp.h
This patch disables numa affinity in this case.

Signed-off-by: Konstantin Khlebnikov
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Konstantin Khlebnikov
2015-10-27 08:51:50 +0800
a54d12792 bpf: fix panic in SO_GET_FILTER with native ebpf programs ... Browse Code »

[ Upstream commit 93d08b6966cf730ea669d4d98f43627597077153 ]

When sockets have a native eBPF program attached through
setsockopt(sk, SOL_SOCKET, SO_ATTACH_BPF, ...), and then try to
dump these over getsockopt(sk, SOL_SOCKET, SO_GET_FILTER, ...),
the following panic appears:

[49904.178642] BUG: unable to handle kernel NULL pointer dereference at (null)
[49904.178762] IP: [] sk_get_filter+0x39/0x90
[49904.182000] PGD 86fc9067 PUD 531a1067 PMD 0
[49904.185196] Oops: 0000 [#1] SMP
[...]
[49904.224677] Call Trace:
[49904.226090] [] sock_getsockopt+0x319/0x740
[49904.227535] [] ? sock_has_perm+0x63/0x70
[49904.228953] [] ? release_sock+0x108/0x150
[49904.230380] [] ? selinux_socket_getsockopt+0x23/0x30
[49904.231788] [] SyS_getsockopt+0xa6/0xc0
[49904.233267] [] entry_SYSCALL_64_fastpath+0x12/0x71

The underlying issue is the very same as in commit b382c0865600
("sock, diag: fix panic in sock_diag_put_filterinfo"), that is,
native eBPF programs don't store an original program since this
is only needed in cBPF ones.

However, sk_get_filter() wasn't updated to test for this at the
time when eBPF could be attached. Just throw an error to the user
to indicate that eBPF cannot be dumped over this interface.
That way, it can also be known that a program _is_ attached (as
opposed to just return 0), and a different (future) method needs
to be consulted for a dump.

Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets")
Signed-off-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Daniel Borkmann
2015-10-27 08:51:50 +0800
8ae3dfacd inet: fix race in reqsk_queue_unlink() ... Browse Code »

[ Upstream commit 2306c704ce280c97a60d1f45333b822b40281dea ]

reqsk_timer_handler() tests if icsk_accept_queue.listen_opt
is NULL at its beginning.

By the time it calls inet_csk_reqsk_queue_drop() and
reqsk_queue_unlink(), listener might have been closed and
inet_csk_listen_stop() had called reqsk_queue_yank_acceptq()
which sets icsk_accept_queue.listen_opt to NULL

We therefore need to correctly check listen_opt being NULL
after holding syn_wait_lock for proper synchronization.

Fixes: fa76ce7328b2 ("inet: get rid of central tcp/dccp listener timer")
Fixes: b357a364c57c ("inet: fix possible panic in reqsk_queue_unlink()")
Signed-off-by: Eric Dumazet
Cc: Yuchung Cheng
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2015-10-27 08:51:50 +0800
bbc89a619 ppp: don't override sk->sk_state in pppoe_flush_dev() ... Browse Code »

[ Upstream commit e6740165b8f7f06d8caee0fceab3fb9d790a6fed ]

Since commit 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release"),
pppoe_release() calls dev_put(po->pppoe_dev) if sk is in the
PPPOX_ZOMBIE state. But pppoe_flush_dev() can set sk->sk_state to
PPPOX_ZOMBIE _and_ reset po->pppoe_dev to NULL. This leads to the
following oops:

[ 570.140800] BUG: unable to handle kernel NULL pointer dereference at 00000000000004e0
[ 570.142931] IP: [] pppoe_release+0x50/0x101 [pppoe]
[ 570.144601] PGD 3d119067 PUD 3dbc1067 PMD 0
[ 570.144601] Oops: 0000 [#1] SMP
[ 570.144601] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox ppp_generic slhc loop crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper acpi_cpufreq evdev serio_raw processor button ext4 crc16 mbcache jbd2 virtio_net virtio_blk virtio_pci virtio_ring virtio
[ 570.144601] CPU: 1 PID: 15738 Comm: ppp-apitest Not tainted 4.2.0 #1
[ 570.144601] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[ 570.144601] task: ffff88003d30d600 ti: ffff880036b60000 task.ti: ffff880036b60000
[ 570.144601] RIP: 0010:[] [] pppoe_release+0x50/0x101 [pppoe]
[ 570.144601] RSP: 0018:ffff880036b63e08 EFLAGS: 00010202
[ 570.144601] RAX: 0000000000000000 RBX: ffff880034340000 RCX: 0000000000000206
[ 570.144601] RDX: 0000000000000006 RSI: ffff88003d30dd20 RDI: ffff88003d30dd20
[ 570.144601] RBP: ffff880036b63e28 R08: 0000000000000001 R09: 0000000000000000
[ 570.144601] R10: 00007ffee9b50420 R11: ffff880034340078 R12: ffff8800387ec780
[ 570.144601] R13: ffff8800387ec7b0 R14: ffff88003e222aa0 R15: ffff8800387ec7b0
[ 570.144601] FS: 00007f5672f48700(0000) GS:ffff88003fc80000(0000) knlGS:0000000000000000
[ 570.144601] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 570.144601] CR2: 00000000000004e0 CR3: 0000000037f7e000 CR4: 00000000000406a0
[ 570.144601] Stack:
[ 570.144601] ffffffffa018f240 ffff8800387ec780 ffffffffa018f240 ffff8800387ec7b0
[ 570.144601] ffff880036b63e48 ffffffff812caabe ffff880039e4e000 0000000000000008
[ 570.144601] ffff880036b63e58 ffffffff812cabad ffff880036b63ea8 ffffffff811347f5
[ 570.144601] Call Trace:
[ 570.144601] [] sock_release+0x1a/0x75
[ 570.144601] [] sock_close+0xd/0x11
[ 570.144601] [] __fput+0xff/0x1a5
[ 570.144601] [] ____fput+0x9/0xb
[ 570.144601] [] task_work_run+0x66/0x90
[ 570.144601] [] prepare_exit_to_usermode+0x8c/0xa7
[ 570.144601] [] syscall_return_slowpath+0x16d/0x19b
[ 570.144601] [] int_ret_from_sys_call+0x25/0x9f
[ 570.144601] Code: 48 8b 83 c8 01 00 00 a8 01 74 12 48 89 df e8 8b 27 14 e1 b8 f7 ff ff ff e9 b7 00 00 00 8a 43 12 a8 0b 74 1c 48 8b 83 a8 04 00 00 8b 80 e0 04 00 00 65 ff 08 48 c7 83 a8 04 00 00 00 00 00 00
[ 570.144601] RIP [] pppoe_release+0x50/0x101 [pppoe]
[ 570.144601] RSP
[ 570.144601] CR2: 00000000000004e0
[ 570.200518] ---[ end trace 46956baf17349563 ]---

pppoe_flush_dev() has no reason to override sk->sk_state with
PPPOX_ZOMBIE. pppox_unbind_sock() already sets sk->sk_state to
PPPOX_DEAD, which is the correct state given that sk is unbound and
po->pppoe_dev is NULL.

Fixes: 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
Tested-by: Oleksii Berezhniak
Signed-off-by: Guillaume Nault
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Guillaume Nault
2015-10-27 08:51:50 +0800
fac948a1c net: add pfmemalloc check in sk_add_backlog() ... Browse Code »

[ Upstream commit c7c49b8fde26b74277188bdc6c9dca38db6fa35b ]

Greg reported crashes hitting the following check in __sk_backlog_rcv()

BUG_ON(!sock_flag(sk, SOCK_MEMALLOC));

The pfmemalloc bit is currently checked in sk_filter().

This works correctly for TCP, because sk_filter() is ran in
tcp_v[46]_rcv() before hitting the prequeue or backlog checks.

For UDP or other protocols, this does not work, because the sk_filter()
is ran from sock_queue_rcv_skb(), which might be called _after_ backlog
queuing if socket is owned by user by the time packet is processed by
softirq handler.

Fixes: b4b9e35585089 ("netvm: set PF_MEMALLOC as appropriate during SKB processing")
Signed-off-by: Eric Dumazet
Reported-by: Greg Thelen
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2015-10-27 08:51:49 +0800