Eric Lee / smarc-fsl-linux-kernel

09 Feb, 2017

28 commits

6cb0497ae base/memory, hotplug: fix a kernel oops in show_valid_zones() ... Browse Code »

commit a96dfddbcc04336bbed50dc2b24823e45e09e80c upstream.

Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().

BUG: unable to handle kernel paging request at ffffea017a000000
IP: show_valid_zones+0x6f/0x160

This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB. [1] An example of such
systems is desribed below. 0x3240000000 is only aligned by 1GiB and
this memory block starts from 0x3200000000, which is not backed by
struct page.

BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable

Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range. show_valid_zones() then proceeds with the valid range.

[1] 'Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'

Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com
Signed-off-by: Toshi Kani
Cc: Greg Kroah-Hartman
Cc: Zhang Zhen
Cc: Reza Arbab
Cc: David Rientjes
Cc: Dan Williams
Signed-off-by: Greg Kroah-Hartman

Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Toshi Kani
2017-02-09 15:08:28 +0800
72f741961 mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone() ... Browse Code »

commit deb88a2a19e85842d79ba96b05031739ec327ff4 upstream.

Patch series "fix a kernel oops when reading sysfs valid_zones", v2.

A sysfs memory file is created for each 2GiB memory block on x86-64 when
the system has 64GiB or more memory. [1] When the start address of a
memory block is not backed by struct page, i.e. a memory range is not
aligned by 2GiB, reading its 'valid_zones' attribute file leads to a
kernel oops. This issue was observed on multiple x86-64 systems with
more than 64GiB of memory. This patch-set fixes this issue.

Patch 1 first fixes an issue in test_pages_in_a_zone(), which does not
test the start section.

Patch 2 then fixes the kernel oops by extending test_pages_in_a_zone()
to return valid [start, end).

Note for stable kernels: The memory block size change was made by commit
bdee237c0343 ("x86: mm: Use 2GB memory block size on large-memory x86-64
systems"), which was accepted to 3.9. However, this patch-set depends
on (and fixes) the change to test_pages_in_a_zone() made by commit
5f0f2887f4de ("mm/memory_hotplug.c: check for missing sections in
test_pages_in_a_zone()"), which was accepted to 4.4.

So, I recommend that we backport it up to 4.4.

[1] 'Commit bdee237c0343 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'

This patch (of 2):

test_pages_in_a_zone() does not check 'start_pfn' when it is aligned by
section since 'sec_end_pfn' is set equal to 'pfn'. Since this function
is called for testing the range of a sysfs memory file, 'start_pfn' is
always aligned by section.

Fix it by properly setting 'sec_end_pfn' to the next section pfn.

Also make sure that this function returns 1 only when the range belongs
to a zone.

Link: http://lkml.kernel.org/r/20170127222149.30893-2-toshi.kani@hpe.com
Signed-off-by: Toshi Kani
Cc: Andrew Banman
Cc: Reza Arbab
Cc: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Toshi Kani
2017-02-09 15:08:27 +0800
9e255997c cifs: initialize file_info_lock ... Browse Code »

commit 81ddd8c0c5e1cb41184d66567140cb48c53eb3d1 upstream.

Reviewed-by: Jeff Layton

file_info_lock is not initalized in initiate_cifs_search(), leading to the
following splat after a simple "mount.cifs ... dir && ls dir/":

BUG: spinlock bad magic on CPU#0, ls/486
lock: 0xffff880009301110, .magic: 00000000, .owner: /-1, .owner_cpu: 0
CPU: 0 PID: 486 Comm: ls Not tainted 4.9.0 #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
ffffc900042f3db0 ffffffff81327533 0000000000000000 ffff880009301110
ffffc900042f3dd0 ffffffff810baf75 ffff880009301110 ffffffff817ae077
ffffc900042f3df0 ffffffff810baff6 ffff880009301110 ffff880008d69900
Call Trace:
[] dump_stack+0x65/0x92
[] spin_dump+0x85/0xe0
[] spin_bug+0x26/0x30
[] do_raw_spin_lock+0xe9/0x130
[] _raw_spin_lock+0x1f/0x30
[] cifs_closedir+0x4d/0x100
[] __fput+0x5d/0x160
[] ____fput+0xe/0x10
[] task_work_run+0x7e/0xa0
[] exit_to_usermode_loop+0x92/0xa0
[] syscall_return_slowpath+0x49/0x50
[] entry_SYSCALL_64_fastpath+0xa7/0xa9

Fixes: 3afca265b5f53a0 ("Clarify locking of cifs file and tcon structures and make more granular")
Signed-off-by: Rabin Vincent
Signed-off-by: Steve French
Signed-off-by: Greg Kroah-Hartman

Rabin Vincent
2017-02-09 15:08:27 +0800
f0c3a0ac3 zswap: disable changing params if init fails ... Browse Code »

commit d7b028f56a971a2e4d8d7887540a144eeefcd4ab upstream.

Add zswap_init_failed bool that prevents changing any of the module
params, if init_zswap() fails, and set zswap_enabled to false. Change
'enabled' param to a callback, and check zswap_init_failed before
allowing any change to 'enabled', 'zpool', or 'compressor' params.

Any driver that is built-in to the kernel will not be unloaded if its
init function returns error, and its module params remain accessible for
users to change via sysfs. Since zswap uses param callbacks, which
assume that zswap has been initialized, changing the zswap params after
a failed initialization will result in WARNING due to the param
callbacks expecting a pool to already exist. This prevents that by
immediately exiting any of the param callbacks if initialization failed.

This was reported here:
https://marc.info/?l=linux-mm&m=147004228125528&w=4

And fixes this WARNING:
[ 429.723476] WARNING: CPU: 0 PID: 5140 at mm/zswap.c:503 __zswap_pool_current+0x56/0x60

The warning is just noise, and not serious. However, when init fails,
zswap frees all its percpu dstmem pages and its kmem cache. The kmem
cache might be serious, if kmem_cache_alloc(NULL, gfp) has problems; but
the percpu dstmem pages are definitely a problem, as they're used as
temporary buffer for compressed pages before copying into place in the
zpool.

If the user does get zswap enabled after an init failure, then zswap
will likely Oops on the first page it tries to compress (or worse, start
corrupting memory).

Fixes: 90b0fc26d5db ("zswap: change zpool/compressor at runtime")
Link: http://lkml.kernel.org/r/20170124200259.16191-2-ddstreet@ieee.org
Signed-off-by: Dan Streetman
Reported-by: Marcin Miroslaw
Cc: Seth Jennings
Cc: Michal Hocko
Cc: Sergey Senozhatsky
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Dan Streetman
2017-02-09 15:08:27 +0800
a3d729526 svcrpc: fix oops in absence of krb5 module ... Browse Code »

commit 034dd34ff4916ec1f8f74e39ca3efb04eab2f791 upstream.

Olga Kornievskaia says: "I ran into this oops in the nfsd (below)
(4.10-rc3 kernel). To trigger this I had a client (unsuccessfully) try
to mount the server with krb5 where the server doesn't have the
rpcsec_gss_krb5 module built."

The problem is that rsci.cred is copied from a svc_cred structure that
gss_proxy didn't properly initialize. Fix that.

[120408.542387] general protection fault: 0000 [#1] SMP
...
[120408.565724] CPU: 0 PID: 3601 Comm: nfsd Not tainted 4.10.0-rc3+ #16
[120408.567037] Hardware name: VMware, Inc. VMware Virtual =
Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[120408.569225] task: ffff8800776f95c0 task.stack: ffffc90003d58000
[120408.570483] RIP: 0010:gss_mech_put+0xb/0x20 [auth_rpcgss]
...
[120408.584946] ? rsc_free+0x55/0x90 [auth_rpcgss]
[120408.585901] gss_proxy_save_rsc+0xb2/0x2a0 [auth_rpcgss]
[120408.587017] svcauth_gss_proxy_init+0x3cc/0x520 [auth_rpcgss]
[120408.588257] ? __enqueue_entity+0x6c/0x70
[120408.589101] svcauth_gss_accept+0x391/0xb90 [auth_rpcgss]
[120408.590212] ? try_to_wake_up+0x4a/0x360
[120408.591036] ? wake_up_process+0x15/0x20
[120408.592093] ? svc_xprt_do_enqueue+0x12e/0x2d0 [sunrpc]
[120408.593177] svc_authenticate+0xe1/0x100 [sunrpc]
[120408.594168] svc_process_common+0x203/0x710 [sunrpc]
[120408.595220] svc_process+0x105/0x1c0 [sunrpc]
[120408.596278] nfsd+0xe9/0x160 [nfsd]
[120408.597060] kthread+0x101/0x140
[120408.597734] ? nfsd_destroy+0x60/0x60 [nfsd]
[120408.598626] ? kthread_park+0x90/0x90
[120408.599448] ret_from_fork+0x22/0x30

Fixes: 1d658336b05f "SUNRPC: Add RPC based upcall mechanism for RPCGSS auth"
Cc: Simo Sorce
Reported-by: Olga Kornievskaia
Tested-by: Olga Kornievskaia
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

J. Bruce Fields
2017-02-09 15:08:27 +0800
743146d34 NFSD: Fix a null reference case in find_or_create_lock_stateid() ... Browse Code »

commit d19fb70dd68c4e960e2ac09b0b9c79dfdeefa726 upstream.

nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid().

If nfsd doesn't go through init_lock_stateid() and put stateid at end,
there is a NULL reference to .sc_free when calling nfs4_put_stid(ns).

This patch let the nfs4_stid.sc_free assignment to nfs4_alloc_stid().

Fixes: 356a95ece7aa "nfsd: clean up races in lock stateid searching..."
Signed-off-by: Kinglong Mee
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Kinglong Mee
2017-02-09 15:08:27 +0800
4c953848c powerpc/mm: Use the correct pointer when setting a 2MB pte ... Browse Code »

commit a0615a16f7d0ceb5804d295203c302d496d8ee91 upstream.

When setting a 2MB pte, radix__map_kernel_page() is using the address

ptep = (pte_t *)pudp;

Fix this conversion to use pmdp instead. Use pmdp_ptep() to do this
instead of casting the pointer.

Fixes: 2bfd65e45e87 ("powerpc/mm/radix: Add radix callbacks for early init routines")
Reviewed-by: Aneesh Kumar K.V
Signed-off-by: Reza Arbab
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

Reza Arbab
2017-02-09 15:08:27 +0800
8f415333b powerpc: Fix build failure with clang due to BUILD_BUG_ON() ... Browse Code »

commit b5fa0f7f88edcde37df1807fdf9ff10ec787a60e upstream.

Anton says: In commit 4db7327194db ("powerpc: Add option to use jump
label for cpu_has_feature()") and commit c12e6f24d413 ("powerpc: Add
option to use jump label for mmu_has_feature()") we added:

BUILD_BUG_ON(!__builtin_constant_p(feature))

to cpu_has_feature() and mmu_has_feature() in order to catch usage
issues (such as cpu_has_feature(cpu_has_feature(X), which has happened
once in the past). Unfortunately LLVM isn't smart enough to resolve
this, and it errors out.

I work around it in my clang/LLVM builds of the kernel, but I have just
discovered that it causes a lot of issues for the bcc (eBPF) trace tool
(which uses LLVM).

For now just #ifdef it away for clang builds.

Fixes: 4db7327194db ("powerpc: Add option to use jump label for cpu_has_feature()")
Fixes: c12e6f24d413 ("powerpc: Add option to use jump label for mmu_has_feature()")
Reported-by: Anton Blanchard
Tested-by: Naveen N. Rao
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

Michael Ellerman
2017-02-09 15:08:27 +0800
bbf69e519 powerpc: Add missing error check to prom_find_boot_cpu() ... Browse Code »

commit af2b7fa17eb92e52b65f96604448ff7a2a89ee99 upstream.

prom_init.c calls 'instance-to-package' twice, but the return
is not checked during prom_find_boot_cpu(). The result is then
passed to prom_getprop(), which could be PROM_ERROR. Add a return check
to prevent this.

This was found on a pasemi system, where CFE doesn't have a working
'instance-to package' prom call.

Before Commit 5c0484e25ec0 ('powerpc: Endian safe trampoline') the area
around addr 0 was mostly 0's and this doesn't cause a problem. Once the
macro 'FIXUP_ENDIAN' has been added to head_64.S, the low memory area
now has non-zero values, which cause the prom_getprop() call
to hang.

mpe: Also confirmed that under SLOF if 'instance-to-package' did fail
with PROM_ERROR we would crash in SLOF. So the bug is not specific to
CFE, it's just that other open firmwares don't trigger it because they
have a working 'instance-to-package'.

Fixes: 5c0484e25ec0 ("powerpc: Endian safe trampoline")
Signed-off-by: Darren Stevens
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

Darren Stevens
2017-02-09 15:08:27 +0800
73d459097 powerpc/eeh: Fix wrong flag passed to eeh_unfreeze_pe() ... Browse Code »

commit f05fea5b3574a5926c53865eea27139bb40b2f2b upstream.

In __eeh_clear_pe_frozen_state(), we should pass the flag's value
instead of its address to eeh_unfreeze_pe(). The isolated flag is
cleared if no error returned from __eeh_clear_pe_frozen_state(). We
never observed the error from the function. So the isolated flag should
have been always cleared, no real issue is caused because of the misused
@flag.

This fixes the code by passing the value of @flag to eeh_unfreeze_pe().

Fixes: 5cfb20b96f6 ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
Signed-off-by: Gavin Shan
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

Gavin Shan
2017-02-09 15:08:27 +0800
4b70d598c libata: Fix ATA request sense ... Browse Code »

commit 2dae99558e86894e9e5dbf097477baaa5eb70134 upstream.

For an ATA device supporting the sense data reporting feature set, a
failed command will trigger the execution of ata_eh_request_sense if
the result task file of the failed command has the ATA_SENSE bit set
(sense data available bit). ata_eh_request_sense executes the REQUEST
SENSE DATA EXT command to retrieve the sense data of the failed
command. On success of REQUEST SENSE DATA EXT, the ATA_SENSE bit will
NOT be set (the command succeeded) but ata_eh_request_sense
nevertheless tests the availability of sense data by testing that bit
presence in the result tf of the REQUEST SENSE DATA EXT command. This
leads us to falsely assume that request sense data failed and to the
warning message:

atax.xx: request sense failed stat 50 emask 0

Upon success of REQUEST SENSE DATA EXT, set the ATA_SENSE bit in the
result task file command so that sense data can be returned by
ata_eh_request_sense.

Signed-off-by: Damien Le Moal
Signed-off-by: Tejun Heo
Signed-off-by: Greg Kroah-Hartman

Damien Le Moal
2017-02-09 15:08:27 +0800
6d08607ef libata: apply MAX_SEC_1024 to all CX1-JB*-HP devices ... Browse Code »

commit e0edc8c546463f268d41d064d855bcff994c52fa upstream.

Marko reports that CX1-JB512-HP shows the same timeout issues as
CX1-JB256-HP. Let's apply MAX_SEC_128 to all devices in the series.

Signed-off-by: Tejun Heo
Reported-by: Marko Koski-Vähälä
Signed-off-by: Greg Kroah-Hartman

Tejun Heo
2017-02-09 15:08:27 +0800
fc794153c ata: sata_mv:- Handle return value of devm_ioremap. ... Browse Code »

commit 064c3db9c564cc5be514ac21fb4aa26cc33db746 upstream.

Here, If devm_ioremap will fail. It will return NULL.
Then hpriv->base = NULL - 0x20000; Kernel can run into
a NULL-pointer dereference. This error check will avoid
NULL pointer dereference.

Signed-off-by: Arvind Yadav
Signed-off-by: Tejun Heo
Signed-off-by: Greg Kroah-Hartman

Arvind Yadav
2017-02-09 15:08:26 +0800
b41615aa7 perf/core: Fix PERF_RECORD_MMAP2 prot/flags for anonymous memory ... Browse Code »

commit 0b3589be9b98994ce3d5aeca52445d1f5627c4ba upstream.

Andres reported that MMAP2 records for anonymous memory always have
their protection field 0.

Turns out, someone daft put the prot/flags generation code in the file
branch, leaving them unset for anonymous memory.

Reported-by: Andres Freund
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Don Zickus
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: acme@kernel.org
Cc: anton@ozlabs.org
Cc: namhyung@kernel.org
Fixes: f972eb63b100 ("perf: Pass protection and flags bits through mmap2 interface")
Link: http://lkml.kernel.org/r/20170126221508.GF6536@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

Peter Zijlstra
2017-02-09 15:08:26 +0800
3996a91e3 perf/core: Fix use-after-free bug ... Browse Code »

commit a76a82a3e38c8d3fb6499e3dfaeb0949241ab588 upstream.

Dmitry reported a KASAN use-after-free on event->group_leader.

It turns out there's a hole in perf_remove_from_context() due to
event_function_call() not calling its function when the task
associated with the event is already dead.

In this case the event will have been detached from the task, but the
grouping will have been retained, such that group operations might
still work properly while there are live child events etc.

This does however mean that we can miss a perf_group_detach() call
when the group decomposes, this in turn can then lead to
use-after-free.

Fix it by explicitly doing the group detach if its still required.

Reported-by: Dmitry Vyukov
Tested-by: Dmitry Vyukov
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Mathieu Desnoyers
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: syzkaller
Fixes: 63b6da39bb38 ("perf: Fix perf_event_exit_task() race")
Link: http://lkml.kernel.org/r/20170126153955.GD6515@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

Peter Zijlstra
2017-02-09 15:08:26 +0800
53bed1f64 crypto: arm64/aes-blk - honour iv_out requirement in CBC and CTR modes ... Browse Code »

commit 11e3b725cfc282efe9d4a354153e99d86a16af08 upstream.

Update the ARMv8 Crypto Extensions and the plain NEON AES implementations
in CBC and CTR modes to return the next IV back to the skcipher API client.
This is necessary for chaining to work correctly.

Note that for CTR, this is only done if the request is a round multiple of
the block size, since otherwise, chaining is impossible anyway.

Signed-off-by: Ard Biesheuvel
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Ard Biesheuvel
2017-02-09 15:08:26 +0800
b04a39f88 crypto: api - Clear CRYPTO_ALG_DEAD bit before registering an alg ... Browse Code »

commit d6040764adcb5cb6de1489422411d701c158bb69 upstream.

Make sure CRYPTO_ALG_DEAD bit is cleared before proceeding with
the algorithm registration. This fixes qat-dh registration when
driver is restarted

Signed-off-by: Salvatore Benedetto
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Salvatore Benedetto
2017-02-09 15:08:26 +0800
2eb8f7c42 drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrieval ... Browse Code »

commit 24bf7ae359b8cca165bb30742d2b1c03a1eb23af upstream.

Based on the xf86-video-nv code, NFORCE (NV1A) and NFORCE2 (NV1F) have a
different way of retrieving clocks. See the
nv_hw.c:nForceUpdateArbitrationSettings function in the original code
for how these clocks were accessed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54587
Signed-off-by: Ilia Mirkin
Signed-off-by: Ben Skeggs
Signed-off-by: Greg Kroah-Hartman

Ilia Mirkin
2017-02-09 15:08:26 +0800
bd5cefed1 drm/nouveau/disp/gt215: Fix HDA ELD handling (thus, HDMI audio) on gt215 ... Browse Code »

commit d347583a39e2df609a9e40c835f72d3614665b53 upstream.

Store the ELD correctly, not just enough copies of the first byte
to pad out the given ELD size.

Signed-off-by: Alastair Bridgewater
Fixes: 120b0c39c756 ("drm/nv50-/disp: audit and version SOR_HDA_ELD method")
Reviewed-by: Ilia Mirkin
Signed-off-by: Ben Skeggs
Signed-off-by: Greg Kroah-Hartman

Alastair Bridgewater
2017-02-09 15:08:26 +0800
c9fb422fd drm/amdgpu/si: fix crash on headless asics ... Browse Code »

commit 57bcd0a6364cd4eaa362d7ff1777e88ddf501602 upstream.

Missing check for crtcs present.

Fixes:
https://bugzilla.kernel.org/show_bug.cgi?id=193341
https://bugs.freedesktop.org/show_bug.cgi?id=99387

Reviewed-by: Christian König
Signed-off-by: Alex Deucher
Signed-off-by: Alex Deucher
Signed-off-by: Greg Kroah-Hartman

Alex Deucher
2017-02-09 15:08:26 +0800
20658b3df pinctrl: baytrail: Add missing spinlock usage in byt_gpio_irq_handler ... Browse Code »

commit cdca06e4e85974d8a3503ab15709dbbaf90d3dd1 upstream.

According to VLI64 Intel Atom E3800 Specification Update (#329901)
concurrent read accesses may result in returning 0xffffffff and write
accesses may be dropped silently.
To workaround all accesses must be protected by locks.

Signed-off-by: Alexander Stein
Acked-by: Mika Westerberg
Signed-off-by: Linus Walleij
Signed-off-by: Greg Kroah-Hartman

Alexander Stein
2017-02-09 15:08:25 +0800
7396685a1 HID: cp2112: fix gpio-callback error handling ... Browse Code »

commit 8e9faa15469ed7c7467423db4c62aeed3ff4cae3 upstream.

In case of a zero-length report, the gpio direction_input callback would
currently return success instead of an errno.

Fixes: 1ffb3c40ffb5 ("HID: cp2112: make transfer buffers DMA capable")
Signed-off-by: Johan Hovold
Reviewed-by: Benjamin Tissoires
Signed-off-by: Jiri Kosina
Signed-off-by: Greg Kroah-Hartman

Johan Hovold
2017-02-09 15:08:25 +0800
a18c4584a HID: cp2112: fix sleep-while-atomic ... Browse Code »

commit 7a7b5df84b6b4e5d599c7289526eed96541a0654 upstream.

A recent commit fixing DMA-buffers on stack added a shared transfer
buffer protected by a spinlock. This is broken as the USB HID request
callbacks can sleep. Fix this up by replacing the spinlock with a mutex.

Fixes: 1ffb3c40ffb5 ("HID: cp2112: make transfer buffers DMA capable")
Signed-off-by: Johan Hovold
Reviewed-by: Benjamin Tissoires
Signed-off-by: Jiri Kosina
Signed-off-by: Greg Kroah-Hartman

Johan Hovold
2017-02-09 15:08:25 +0800
dfd713307 xtensa: fix noMMU build on cores with MMU ... Browse Code »

commit 4b3e6f2ef3722f1a6a97b6034ed492c1a21fd4ae upstream.

Commit bf15f86b343ed8 ("xtensa: initialize MMU before jumping to reset
vector") calls MMU management functions even when CONFIG_MMU is not
selected. That breaks noMMU build on cores with MMU.

Don't manage MMU when CONFIG_MMU is not selected.

Signed-off-by: Max Filippov
Signed-off-by: Greg Kroah-Hartman

Max Filippov
2017-02-09 15:08:25 +0800
f2e24dd91 efi/fdt: Avoid FDT manipulation after ExitBootServices() ... Browse Code »

commit c8f325a59cfc718d13a50fbc746ed9b415c25e92 upstream.

Some AArch64 UEFI implementations disable the MMU in ExitBootServices(),
after which unaligned accesses to RAM are no longer supported.

Commit:

abfb7b686a3e ("efi/libstub/arm*: Pass latest memory map to the kernel")

fixed an issue in the memory map handling of the stub FDT code, but
inadvertently created an issue with such firmware, by moving some
of the FDT manipulation to after the invocation of ExitBootServices().

Given that the stub's libfdt implementation uses the ordinary, accelerated
string functions, which rely on hardware handling of unaligned accesses,
manipulating the FDT with the MMU off may result in alignment faults.

So fix the situation by moving the update_fdt_memmap() call into the
callback function invoked by efi_exit_boot_services() right before it
calls the ExitBootServices() UEFI service (which is arguably a better
place for it anyway)

Note that disabling the MMU in ExitBootServices() is not compliant with
the UEFI spec, and carries great risk due to the fact that switching from
cached to uncached memory accesses halfway through compiler generated code
(i.e., involving a stack) can never be done in a way that is architecturally
safe.

Fixes: abfb7b686a3e ("efi/libstub/arm*: Pass latest memory map to the kernel")
Signed-off-by: Ard Biesheuvel
Tested-by: Riku Voipio
Cc: mark.rutland@arm.com
Cc: linux-efi@vger.kernel.org
Cc: matt@codeblueprint.co.uk
Cc: leif.lindholm@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1485971102-23330-2-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

Ard Biesheuvel
2017-02-09 15:08:25 +0800
f0c7412ed x86/efi: Always map the first physical page into the EFI pagetables ... Browse Code »

commit bf29bddf0417a4783da3b24e8c9e017ac649326f upstream.

Commit:

129766708 ("x86/efi: Only map RAM into EFI page tables if in mixed-mode")

stopped creating 1:1 mappings for all RAM, when running in native 64-bit mode.

It turns out though that there are 64-bit EFI implementations in the wild
(this particular problem has been reported on a Lenovo Yoga 710-11IKB),
which still make use of the first physical page for their own private use,
even though they explicitly mark it EFI_CONVENTIONAL_MEMORY in the memory
map.

In case there is no mapping for this particular frame in the EFI pagetables,
as soon as firmware tries to make use of it, a triple fault occurs and the
system reboots (in case of the Yoga 710-11IKB this is very early during bootup).

Fix that by always mapping the first page of physical memory into the EFI
pagetables. We're free to hand this page to the BIOS, as trim_bios_range()
will reserve the first page and isolate it away from memory allocators anyway.

Note that just reverting 129766708 alone is not enough on v4.9-rc1+ to fix the
regression on affected hardware, as this commit:

ab72a27da ("x86/efi: Consolidate region mapping logic")

later made the first physical frame not to be mapped anyway.

Reported-by: Hanka Pavlikova
Signed-off-by: Jiri Kosina
Signed-off-by: Matt Fleming
Cc: Ard Biesheuvel
Cc: Borislav Petkov
Cc: Borislav Petkov
Cc: Laura Abbott
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Vojtech Pavlik
Cc: Waiman Long
Cc: linux-efi@vger.kernel.org
Fixes: 129766708 ("x86/efi: Only map RAM into EFI page tables if in mixed-mode")
Link: http://lkml.kernel.org/r/20170127222552.22336-1-matt@codeblueprint.co.uk
[ Tidied up the changelog and the comment. ]
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

Jiri Kosina
2017-02-09 15:08:25 +0800
13e6ef99d ext4: validate s_first_meta_bg at mount time ... Browse Code »

commit 3a4b77cd47bb837b8557595ec7425f281f2ca1fe upstream.

Ralf Spenneberg reported that he hit a kernel crash when mounting a
modified ext4 image. And it turns out that kernel crashed when
calculating fs overhead (ext4_calculate_overhead()), this is because
the image has very large s_first_meta_bg (debug code shows it's
842150400), and ext4 overruns the memory in count_overhead() when
setting bitmap buffer, which is PAGE_SIZE.

ext4_calculate_overhead():
buf = get_zeroed_page(GFP_NOFS); 0; j--) {
Signed-off-by: Eryu Guan
Signed-off-by: Theodore Ts'o
Reviewed-by: Andreas Dilger
Signed-off-by: Greg Kroah-Hartman

Eryu Guan
2017-02-09 15:08:25 +0800
610c2b7ff PCI/ASPM: Handle PCI-to-PCIe bridges as roots of PCIe hierarchies ... Browse Code »

commit 030305d69fc6963c16003f50d7e8d74b02d0a143 upstream.

In a struct pcie_link_state, link->root points to the pcie_link_state of
the root of the PCIe hierarchy. For the topmost link, this points to
itself (link->root = link). For others, we copy the pointer from the
parent (link->root = link->parent->root).

Previously we recognized that Root Ports originated PCIe hierarchies, but
we treated PCI/PCI-X to PCIe Bridges as being in the middle of the
hierarchy, and when we tried to copy the pointer from link->parent->root,
there was no parent, and we dereferenced a NULL pointer:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
IP: [] pcie_aspm_init_link_state+0x170/0x820

Recognize that PCI/PCI-X to PCIe Bridges originate PCIe hierarchies just
like Root Ports do, so link->root for these devices should also point to
itself.

Fixes: 51ebfc92b72b ("PCI: Enumerate switches below PCI-to-PCIe bridges")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=193411
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1022181
Tested-by: lists@ssl-mail.com
Tested-by: Jayachandran C.
Signed-off-by: Bjorn Helgaas
Signed-off-by: Greg Kroah-Hartman

Bjorn Helgaas
2017-02-09 15:08:25 +0800

04 Feb, 2017

12 commits

c8ea2f3b8 Linux 4.9.8 Browse Code »

Greg Kroah-Hartman
2017-02-04 16:47:29 +0800
b5b4d4a91 xfs: fix bmv_count confusion w/ shared extents ... Browse Code »

commit c364b6d0b6cda1cd5d9ab689489adda3e82529aa upstream.

In a bmapx call, bmv_count is the total size of the array, including the
zeroth element that userspace uses to supply the search key. The output
array starts at offset 1 so that we can set up the user for the next
invocation. Since we now can split an extent into multiple bmap records
due to shared/unshared status, we have to be careful that we don't
overflow the output array.

In the original patch f86f403794b ("xfs: teach get_bmapx about shared
extents and the CoW fork") I used cur_ext (the output index) to check
for overflows, albeit with an off-by-one error. Since nexleft no longer
describes the number of unfilled slots in the output, we can rip all
that out and use cur_ext for the overflow check directly.

Failure to do this causes heap corruption in bmapx callers such as
xfs_io and xfs_scrub. xfs/328 can reproduce this problem.

Reviewed-by: Eric Sandeen
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Darrick J. Wong
2017-02-04 16:47:13 +0800
5d44dd54b xfs: clear _XBF_PAGES from buffers when readahead page ... Browse Code »

commit 2aa6ba7b5ad3189cc27f14540aa2f57f0ed8df4b upstream.

If we try to allocate memory pages to back an xfs_buf that we're trying
to read, it's possible that we'll be so short on memory that the page
allocation fails. For a blocking read we'll just wait, but for
readahead we simply dump all the pages we've collected so far.

Unfortunately, after dumping the pages we neglect to clear the
_XBF_PAGES state, which means that the subsequent call to xfs_buf_free
thinks that b_pages still points to pages we own. It then double-frees
the b_pages pages.

This results in screaming about negative page refcounts from the memory
manager, which xfs oughtn't be triggering. To reproduce this case,
mount a filesystem where the size of the inodes far outweighs the
availalble memory (a ~500M inode filesystem on a VM with 300MB memory
did the trick here) and run bulkstat in parallel with other memory
eating processes to put a huge load on the system. The "check summary"
phase of xfs_scrub also works for this purpose.

Signed-off-by: Darrick J. Wong
Reviewed-by: Eric Sandeen
Signed-off-by: Greg Kroah-Hartman

Darrick J. Wong
2017-02-04 16:47:12 +0800
29f96b7e9 xfs: extsize hints are not unlikely in xfs_bmap_btalloc ... Browse Code »

commit 493611ebd62673f39e2f52c2561182c558a21cb6 upstream.

With COW files they are the hotpath, just like for files with the
extent size hint attribute. We really shouldn't micro-manage anything
but failure cases with unlikely.

Additionally Arnd Bergmann recently reported that one of these two
unlikely annotations causes link failures together with an upcoming
kernel instrumentation patch, so let's get rid of it ASAP.

Signed-off-by: Christoph Hellwig
Reported-by: Arnd Bergmann
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Christoph Hellwig
2017-02-04 16:47:12 +0800
aab858dab xfs: remove racy hasattr check from attr ops ... Browse Code »

commit 5a93790d4e2df73e30c965ec6e49be82fc3ccfce upstream.

xfs_attr_[get|remove]() have unlocked attribute fork checks to optimize
away a lock cycle in cases where the fork does not exist or is otherwise
empty. This check is not safe, however, because an attribute fork short
form to extent format conversion includes a transient state that causes
the xfs_inode_hasattr() check to fail. Specifically,
xfs_attr_shortform_to_leaf() creates an empty extent format attribute
fork and then adds the existing shortform attributes to it.

This means that lookup of an existing xattr can spuriously return
-ENOATTR when racing against a setxattr that causes the associated
format conversion. This was originally reproduced by an untar on a
particularly configured glusterfs volume, but can also be reproduced on
demand with properly crafted xattr requests.

The format conversion occurs under the exclusive ilock. xfs_attr_get()
and xfs_attr_remove() already have the proper locking and checks further
down in the functions to handle this situation correctly. Drop the
unlocked checks to avoid the spurious failure and rely on the existing
logic.

Signed-off-by: Brian Foster
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Brian Foster
2017-02-04 16:47:12 +0800
29094164e xfs: verify dirblocklog correctly ... Browse Code »

commit 83d230eb5c638949350f4761acdfc0af5cb1bc00 upstream.

sb_dirblklog is added to sb_blocklog to compute the directory block size
in bytes. Therefore, we must compare the sum of both those values
against XFS_MAX_BLOCKSIZE_LOG, not just dirblklog.

Signed-off-by: Darrick J. Wong
Reviewed-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Signed-off-by: Greg Kroah-Hartman

Darrick J. Wong
2017-02-04 16:47:12 +0800
214d55efa xfs: fix COW writeback race ... Browse Code »

commit d2b3964a0780d2d2994eba57f950d6c9fe489ed8 upstream.

Due to the way how xfs_iomap_write_allocate tries to convert the whole
found extents from delalloc to real space we can run into a race
condition with multiple threads doing writes to this same extent.
For the non-COW case that is harmless as the only thing that can happen
is that we call xfs_bmapi_write on an extent that has already been
converted to a real allocation. For COW writes where we move the extent
from the COW to the data fork after I/O completion the race is, however,
not quite as harmless. In the worst case we are now calling
xfs_bmapi_write on a region that contains hole in the COW work, which
will trip up an assert in debug builds or lead to file system corruption
in non-debug builds. This seems to be reproducible with workloads of
small O_DSYNC write, although so far I've not managed to come up with
a with an isolated reproducer.

The fix for the issue is relatively simple: tell xfs_bmapi_write
that we are only asked to convert delayed allocations and skip holes
in that case.

Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Christoph Hellwig
2017-02-04 16:47:12 +0800
29f319275 xfs: fix xfs_mode_to_ftype() prototype ... Browse Code »

commit fd29f7af75b7adf250beccffa63746c6a88e2b74 upstream.

A harmless warning just got introduced:

fs/xfs/libxfs/xfs_dir2.h:40:8: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]

Removing the 'const' modifier avoids the warning and has no
other effect.

Fixes: 1fc4d33fed12 ("xfs: replace xfs_mode_to_ftype table with switch statement")
Signed-off-by: Arnd Bergmann
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Arnd Bergmann
2017-02-04 16:47:12 +0800
d062d90c3 xfs: don't wrap ID in xfs_dq_get_next_id ... Browse Code »

commit 657bdfb7f5e68ca5e2ed009ab473c429b0d6af85 upstream.

The GETNEXTQOTA ioctl takes whatever ID is sent in,
and looks for the next active quota for an user
equal or higher to that ID.

But if we are at the maximum ID and then ask for the "next"
one, we may wrap back to zero. In this case, userspace
may loop forever, because it will start querying again
at zero.

We'll fix this in userspace as well, but for the kernel,
return -ENOENT if we ask for the next quota ID
past UINT_MAX so the caller knows to stop.

Signed-off-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Eric Sandeen
2017-02-04 16:47:12 +0800
d3201a14b xfs: sanity check inode di_mode ... Browse Code »

commit a324cbf10a3c67aaa10c9f47f7b5801562925bc2 upstream.

Check for invalid file type in xfs_dinode_verify()
and fail to load the inode structure from disk.

Reviewed-by: Darrick J. Wong
Signed-off-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Amir Goldstein
2017-02-04 16:47:12 +0800
43ce59217 xfs: sanity check inode mode when creating new dentry ... Browse Code »

commit fab8eef86c814c3dd46bc5d760b6e4a53d5fc5a6 upstream.

The helper xfs_dentry_to_name() is used by 2 different
classes of callers: Callers that pass zero mode and don't care
about the returned name.type field and Callers that pass
non zero mode and do care about the name.type field.

Change xfs_dentry_to_name() to not take the mode argument and
change the call sites of the first class to not pass the mode
argument.

Create a new helper xfs_dentry_mode_to_name() which does pass
the mode argument and returns -EFSCORRUPTED if mode is invalid.
Callers that translate non zero mode to on-disk file type now
check the return value and will export the error to user instead
of staging an invalid file type to be written to directory entry.

Signed-off-by: Amir Goldstein
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Amir Goldstein
2017-02-04 16:47:12 +0800
b5f68e24c xfs: replace xfs_mode_to_ftype table with switch statement ... Browse Code »

commit 1fc4d33fed124fb182e8e6c214e973a29389ae83.

The size of the xfs_mode_to_ftype[] conversion table
was too small to handle an invalid value of mode=S_IFMT.

Instead of fixing the table size, replace the conversion table
with a conversion helper that uses a switch statement.

Suggested-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Amir Goldstein
Signed-off-by: Darrick J. Wong
Signed-off-by: Greg Kroah-Hartman

Amir Goldstein
2017-02-04 16:47:12 +0800