Eric Lee / smarc-fsl-linux-kernel

27 Sep, 2016

7 commits

ebd7c72c6 nfsd: randomize SETCLIENTID reply to help distinguish servers

NFSv4.1 has built-in trunking support that allows a client to determine
whether two connections to two different IP addresses are actually to
the same server. NFSv4.0 does not, but RFC 7931 attempts to provide
clients a means to do this, basically by performing a SETCLIENTID to one
address and confirming it with a SETCLIENTID_CONFIRM to the other.

Linux clients since 05f4c350ee02 "NFS: Discover NFSv4 server trunking
when mounting" implement a variation on this suggestion. It is possible
that other clients do too.

This depends on the clientid and verifier not being accepted by an
unrelated server. Since both are 64-bit values, that would be very
unlikely if they were random numbers. But they aren't:

knfsd generates the 64-bit clientid by concatenating the 32-bit boot
time (in seconds) and a counter. This makes collisions between
clientids generated by the same server extremely unlikely. But
collisions are very likely between clientids generated by servers that
boot at the same time, and it's quite common for multiple servers to
boot at the same time. The verifier is a concatenation of the
SETCLIENTID time (in seconds) and a counter, so again collisions between
different servers are likely if multiple SETCLIENTIDs are done at the
same time, which is a common case.

Therefore recent NFSv4.0 clients may decide two different servers are
really the same, and mount a filesystem from the wrong server.

Fortunately the Linux client, since 55b9df93ddd6 "nfsv4/v4.1: Verify the
client owner id during trunking detection", only does this when given
the non-default "migration" mount option.

The fault is really with RFC 7931, and needs a client fix, but in the
meantime we can mitigate the chance of these collisions by randomizing
the starting value of the counters used to generate clientids and
verifiers.
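
The collision described above can be sketched in a few lines. This is a minimal userspace model, not the knfsd code: `Server`, `make_clientid`, `random_counter_start`, and the fixed starting value of 0 are assumptions made for illustration.

```python
import secrets

MASK32 = 0xFFFFFFFF


def make_clientid(boot_time, counter):
    """Concatenate a 32-bit boot time (seconds) and a 32-bit counter into 64 bits."""
    return ((boot_time & MASK32) << 32) | (counter & MASK32)


def random_counter_start():
    """Pick a random 32-bit starting value for the counter (the mitigation)."""
    return secrets.randbits(32)


class Server:
    """Hypothetical model of a server handing out clientids."""

    def __init__(self, boot_time, counter_start=0):
        self.boot_time = boot_time
        # Assumption: without the mitigation the counter starts at a fixed value.
        self.counter = counter_start & MASK32

    def next_clientid(self):
        clientid = make_clientid(self.boot_time, self.counter)
        self.counter = (self.counter + 1) & MASK32
        return clientid


boot = 1474917638  # two servers booting in the same second
same = Server(boot).next_clientid() == Server(boot).next_clientid()
print("fixed start collides:", same)

a = Server(boot, random_counter_start())
b = Server(boot, random_counter_start())
print("random start collides:", a.next_clientid() == b.next_clientid())
```

With a fixed start, two servers that boot in the same second hand out identical first clientids; with independent random starts, a match needs a 1-in-2^32 coincidence for any given pair of values.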

Reported-by: Frank Sorenson
Reviewed-by: Jeff Layton
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2016-09-27 03:20:38 +0800
19e4c3477 nfsd: set the MAY_NOTIFY_LOCK flag in OPEN replies

If we are using v4.1+, then we can send notification when contended
locks become free. Inform the client of that fact.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2016-09-27 03:20:37 +0800
b4c8eb037 nfs: add a new NFS4_OPEN_RESULT_MAY_NOTIFY_LOCK constant

As defined in RFC 5661, section 18.16.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2016-09-27 03:20:37 +0800
7919d0a27 nfsd: add a LRU list for blocked locks

It's possible for a client to call in on a lock that is blocked for a
long time, but discontinue polling for it. A malicious client could
even set a lock on a file, and then spam the server with failing lock
requests from different lockowners that pile up in a DoS attack.

Add the blocked lock structures to a per-net namespace LRU when hashing
them, and timestamp them. If the lock request is not revisited after a
lease period, we'll drop it under the assumption that the client is no
longer interested.

This also gives us a mechanism to clean up these objects at server
shutdown time.
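
The timestamped LRU described above can be sketched as follows. This is a simplified userspace model under stated assumptions: `BlockedLockLRU` and its method names are hypothetical, and timestamps are passed in explicitly and assumed to be non-decreasing so the oldest entry is always at the head.

```python
from collections import OrderedDict


class BlockedLockLRU:
    """Hypothetical per-namespace LRU of blocked lock requests."""

    def __init__(self, lease_seconds):
        self.lease = lease_seconds
        self._entries = OrderedDict()  # key -> timestamp, oldest first

    def touch(self, key, now):
        """Hash or revisit a blocked lock: stamp it and move it to the tail."""
        self._entries.pop(key, None)
        self._entries[key] = now

    def expire(self, now):
        """Drop entries not revisited within one lease period; return their keys."""
        dropped = []
        while self._entries:
            key, stamp = next(iter(self._entries.items()))
            if now - stamp < self.lease:
                break  # everything behind this entry is newer
            del self._entries[key]
            dropped.append(key)
        return dropped

    def shutdown(self):
        """Release every remaining entry, as would happen at server shutdown."""
        dropped = list(self._entries)
        self._entries.clear()
        return dropped
```

Because entries are kept in revisit order, the expiry pass can stop at the first entry that is still within its lease instead of scanning the whole list.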

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2016-09-27 03:20:36 +0800
76d348fad nfsd: have nfsd4_lock use blocking locks for v4.1+ locks

Create a new per-lockowner+per-inode structure that contains a
file_lock. Have nfsd4_lock add this structure to the lockowner's list
prior to setting the lock. Then call the vfs and request a blocking lock
(by setting FL_SLEEP). If we get anything besides FILE_LOCK_DEFERRED
back, then we dequeue the block structure and free it. When the next
lock request comes in, we'll look for an existing block for the same
filehandle and dequeue and reuse it if there is one.

When the lock comes free (via an lm_notify call), we dequeue it
from the lockowner's list and kick off a CB_NOTIFY_LOCK callback to
inform the client that it should retry the lock request.
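
The queue, defer, and notify flow described above can be modelled roughly as follows. This is a toy sketch, not the nfsd implementation: `LockManager`, the `"granted"`/`"deferred"` results, and the `notifications` list standing in for CB_NOTIFY_LOCK callbacks are all assumptions for illustration.

```python
class LockManager:
    """Hypothetical model: one exclusive lock per file handle."""

    def __init__(self):
        self.holder = {}         # file handle -> owner currently holding the lock
        self.blocked = {}        # (owner, file handle) -> queued block entry
        self.notifications = []  # (owner, file handle) pairs that would be notified

    def lock(self, owner, fh):
        # Reuse an existing block entry for this owner and file handle, or queue a new one
        # before attempting the lock.
        self.blocked.pop((owner, fh), None)
        self.blocked[(owner, fh)] = True
        if self.holder.get(fh, owner) == owner:
            self.holder[fh] = owner
            del self.blocked[(owner, fh)]  # not deferred: dequeue and free the block
            return "granted"
        return "deferred"  # the block stays queued until the lock comes free

    def unlock(self, owner, fh):
        if self.holder.get(fh) != owner:
            return
        del self.holder[fh]
        # The lock came free: dequeue waiters and tell them to retry.
        for waiter, waiting_fh in list(self.blocked):
            if waiting_fh == fh:
                del self.blocked[(waiter, waiting_fh)]
                self.notifications.append((waiter, waiting_fh))
```

A notified waiter is not granted the lock by the notification itself; it has to come back with a new lock request, which is why the entry is dequeued rather than converted into a held lock.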

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2016-09-27 03:20:36 +0800
a188620eb nfsd: plumb in a CB_NOTIFY_LOCK operation

Add the encoding/decoding for CB_NOTIFY_LOCK operations.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2016-09-27 03:20:35 +0800
1eca45f8a NFSD: fix corruption in notifier registration

By design, a notifier can be registered only once; however, nfsd
registers the same inetaddr notifiers once per net namespace. When this
happens, it corrupts the list of notifiers. As a result, some notifiers
may not be called on the proper event, traversal of the list can loop
forever, and a second unregister can access already freed memory.

Cc: stable@vger.kernel.org
Fixes: 36684996 ("nfsd: Register callbacks on the inetaddr_chain and inet6addr_chain")
Signed-off-by: Vasily Averin
Reviewed-by: Jeff Layton
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

Vasily Averin
2016-09-27 02:17:45 +0800

23 Sep, 2016

6 commits

25d55296d svcrdma: support Remote Invalidation

Support Remote Invalidation. A private message is exchanged with
the client upon RDMA transport connect that indicates whether
Send With Invalidation may be used by the server to send RPC
replies. The invalidate_rkey is arbitrarily chosen from among
rkeys present in the RPC-over-RDMA header's chunk lists.

Send With Invalidate improves performance only when clients can
recognize, while processing an RPC reply, that an rkey has already
been invalidated. That has been submitted as a separate change.

In the future, the RPC-over-RDMA protocol might support Remote
Invalidation properly. The protocol needs to enable signaling
between peers to indicate when Remote Invalidation can be used
for each individual RPC.

Signed-off-by: Chuck Lever
Reviewed-by: Sagi Grimberg
Signed-off-by: J. Bruce Fields

Chuck Lever
2016-09-23 22:18:54 +0800
cc9d83408 svcrdma: Server-side support for rpcrdma_connect_private

Prepare to receive an RDMA-CM private message when handling a new
connection attempt, and send a similar message as part of connection
acceptance.

Both sides can communicate their various implementation limits.
Implementations that don't support this sideband protocol ignore it.

Signed-off-by: Chuck Lever
Reviewed-by: Sagi Grimberg
Signed-off-by: J. Bruce Fields

Chuck Lever
2016-09-23 22:18:54 +0800
5d4870965 rpcrdma: RDMA/CM private message data structure

Introduce data structure used by both client and server to exchange
implementation details during RDMA/CM connection establishment.

This is an experimental out-of-band exchange between Linux
RPC-over-RDMA Version One implementations, replacing the deprecated
CCP (see RFC 5666bis). The purpose of this extension is to enable
prototyping of features that might be introduced in a subsequent
version of RPC-over-RDMA.

Suggested by Christoph Hellwig and Devesh Sharma.

Signed-off-by: Chuck Lever
Reviewed-by: Sagi Grimberg
Signed-off-by: J. Bruce Fields

Chuck Lever
2016-09-23 22:18:53 +0800
9995237bb svcrdma: Skip put_page() when send_reply() fails

Message from syslogd@klimt at Aug 18 17:00:37 ...
kernel:page:ffffea0020639b00 count:0 mapcount:0 mapping: (null) index:0x0
Aug 18 17:00:37 klimt kernel: flags: 0x2fffff80000000()
Aug 18 17:00:37 klimt kernel: page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)

Aug 18 17:00:37 klimt kernel: kernel BUG at /home/cel/src/linux/linux-2.6/include/linux/mm.h:445!
Aug 18 17:00:37 klimt kernel: RIP: 0010:[] svc_rdma_sendto+0x641/0x820 [rpcrdma]

send_reply() assigns its page argument as the first page of ctxt. On
error, send_reply() already invokes svc_rdma_put_context(ctxt, 1);
which does a put_page() on that very page. No need to do that again
as svc_rdma_sendto exits.
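
The double release described above can be illustrated with a toy reference count. This is only a model: `Page`, `send_reply_failing`, `sendto_before_fix`, and `sendto_after_fix` are hypothetical names, and the exception stands in for the VM_BUG_ON_PAGE check.

```python
class Page:
    def __init__(self):
        self.refcount = 1  # one reference held on behalf of the reply


def put_page(page):
    # Stand-in for the refcount sanity check that fired in the log above.
    if page.refcount == 0:
        raise RuntimeError("put_page() on a page whose refcount is already 0")
    page.refcount -= 1


def send_reply_failing(page):
    """Model of a failed send: the error path already releases the page."""
    put_page(page)
    return -1


def sendto_before_fix(page):
    ret = send_reply_failing(page)
    put_page(page)  # second release of the same page
    return ret


def sendto_after_fix(page):
    # The caller leaves the page alone; the callee has already released it.
    return send_reply_failing(page)
```

In the model, `sendto_before_fix` raises on the second `put_page()`, while `sendto_after_fix` returns the error with the count at zero exactly once.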

Fixes: 3e1eeb980822 ("svcrdma: Close connection when a send error occurs")
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2016-09-23 22:18:53 +0800
cace564f8 svcrdma: Tail iovec leaves an orphaned DMA mapping

The ctxt's count field is overloaded to mean the number of pages in
the ctxt->page array and the number of SGEs in the ctxt->sge array.
Typically these two numbers are the same.

However, when an inline RPC reply is constructed from an xdr_buf
with a tail iovec, the head and tail often occupy the same page,
but each are DMA mapped independently. In that case, ->count equals
the number of pages, but it does not equal the number of SGEs.
There's one more SGE, for the tail iovec. Hence there is one more
DMA mapping than there are pages in the ctxt->page array.
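
The mismatch between the two counts can be shown with a small model. It illustrates the problem only and does not reproduce the fix, which this message does not spell out; `build_reply_ctxt` and `unmap_using_page_count` are hypothetical names.

```python
def build_reply_ctxt(has_tail):
    """Return (pages, sges) for an inline reply whose head and tail share one page."""
    pages = ["page0"]
    sges = ["sge:head"]  # the head iovec is DMA mapped
    if has_tail:
        sges.append("sge:tail")  # the tail is mapped independently, same page
    return pages, sges


def unmap_using_page_count(pages, sges):
    """Unmap only as many SGEs as there are pages; return the mappings left behind."""
    count = len(pages)  # the single overloaded count field
    return sges[count:]


pages, sges = build_reply_ctxt(has_tail=True)
print(unmap_using_page_count(pages, sges))
```

With a tail iovec present, one mapping is left over per reply, which matches the orphaned DMA mapping the message describes.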

This isn't a real problem until the server's iommu is enabled. Then
each RPC reply that has content in that iovec orphans a DMA mapping
that consists of real resources.

krb5i and krb5p always populate that tail iovec. After a couple
million sent krb5i/p RPC replies, the NFS server starts behaving
erratically. Reboot is needed to clear the problem.

Fixes: 9d11b51ce7c1 ("svcrdma: Fix send_reply() scatter/gather set-up")
Signed-off-by: Chuck Lever
Signed-off-by: J. Bruce Fields

Chuck Lever
2016-09-23 22:18:52 +0800
bec782b4f nfsd: fix dprintk in nfsd4_encode_getdeviceinfo

nfserr is big-endian, so we should convert it to host-endian before
printing it.
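
A short illustration of why the conversion matters, using an arbitrary example value (10008) rather than anything taken from this commit; the helper names are hypothetical.

```python
def status_from_wire(raw):
    """Interpret a 4-byte big-endian status as a host integer."""
    return int.from_bytes(raw, "big")


def status_misread_little_endian(raw):
    """What a little-endian host prints if it skips the conversion."""
    return int.from_bytes(raw, "little")


wire = (10008).to_bytes(4, "big")
print(status_from_wire(wire))
print(hex(status_misread_little_endian(wire)))
```

On a little-endian host the unconverted value shows up as a large, meaningless number in the debug output, which is what the fix avoids.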

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2016-09-23 22:18:52 +0800

17 Sep, 2016

2 commits

89dfdc964 nfsd: eliminate cb_minorversion field

We already have that info in the client pointer. No need to pass around
a copy.

Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2016-09-17 04:15:52 +0800
1983a66f5 nfsd: don't set a FL_LAYOUT lease for flexfiles layouts

We currently can hit a deadlock (of sorts) when trying to use flexfiles
layouts with XFS. XFS will call break_layout when something wants to
write to the file. In the case of the (super-simple) flexfiles layout
driver in knfsd, the MDS and DS are the same machine.

The client can get a layout and then issue a v3 write to do its I/O. XFS
will then call xfs_break_layouts, which will cause a CB_LAYOUTRECALL to
be issued to the client. The client however can't return the layout
until the v3 WRITE completes, but XFS won't allow the write to proceed
until the layout is returned.

Christoph says:

XFS only cares about block-like layouts where the client has direct
access to the file blocks. I'd need to look how to propagate the
flag into break_layout, but in principle we don't need to do any
recalls on truncate ever for file and flexfile layouts.

If we're never going to recall the layout, then we don't even need to
set the lease at all. Just skip doing so on flexfiles layouts by
adding a new flag to struct nfsd4_layout_ops and skipping the lease
setting and removal when that flag is true.

Cc: Christoph Hellwig
Signed-off-by: Jeff Layton
Signed-off-by: J. Bruce Fields

Jeff Layton
2016-09-17 04:15:52 +0800

13 Sep, 2016

1 commit

bf2c4b6f9 svcauth_gss: Revert 64c59a3726f2 ("Remove unnecessary allocation")

rsc_lookup steals the passed-in memory to avoid doing an allocation of
its own, so we can't just pass in a pointer to memory that someone else
is using.

If we really want to avoid allocation there then maybe we should
preallocate somewhere, or reference count these handles.

For now we should revert.

On occasion I see this on my server:

kernel: kernel BUG at /home/cel/src/linux/linux-2.6/mm/slub.c:3851!
kernel: invalid opcode: 0000 [#1] SMP
kernel: Modules linked in: cts rpcsec_gss_krb5 sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd btrfs xor iTCO_wdt iTCO_vendor_support raid6_pq pcspkr i2c_i801 i2c_smbus lpc_ich mfd_core mei_me sg mei shpchp wmi ioatdma ipmi_si ipmi_msghandler acpi_pad acpi_power_meter rpcrdma ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm nfsd nfs_acl lockd grace auth_rpcgss sunrpc ip_tables xfs libcrc32c mlx4_ib mlx4_en ib_core sr_mod cdrom sd_mod ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crc32c_intel igb mlx4_core ahci libahci libata ptp pps_core dca i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod
kernel: CPU: 7 PID: 145 Comm: kworker/7:2 Not tainted 4.8.0-rc4-00006-g9d06b0b #15
kernel: Hardware name: Supermicro Super Server/X10SRL-F, BIOS 1.0c 09/09/2015
kernel: Workqueue: events do_cache_clean [sunrpc]
kernel: task: ffff8808541d8000 task.stack: ffff880854344000
kernel: RIP: 0010:[] [] kfree+0x155/0x180
kernel: RSP: 0018:ffff880854347d70 EFLAGS: 00010246
kernel: RAX: ffffea0020fe7660 RBX: ffff88083f9db064 RCX: 146ff0f9d5ec5600
kernel: RDX: 000077ff80000000 RSI: ffff880853f01500 RDI: ffff88083f9db064
kernel: RBP: ffff880854347d88 R08: ffff8808594ee000 R09: ffff88087fdd8780
kernel: R10: 0000000000000000 R11: ffffea0020fe76c0 R12: ffff880853f01500
kernel: R13: ffffffffa013cf76 R14: ffffffffa013cff0 R15: ffffffffa04253a0
kernel: FS: 0000000000000000(0000) GS:ffff88087fdc0000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007fed60b020c3 CR3: 0000000001c06000 CR4: 00000000001406e0
kernel: Stack:
kernel: ffff8808589f2f00 ffff880853f01500 0000000000000001 ffff880854347da0
kernel: ffffffffa013cf76 ffff8808589f2f00 ffff880854347db8 ffffffffa013d006
kernel: ffff8808589f2f20 ffff880854347e00 ffffffffa0406f60 0000000057c7044f
kernel: Call Trace:
kernel: [] rsc_free+0x16/0x90 [auth_rpcgss]
kernel: [] rsc_put+0x16/0x30 [auth_rpcgss]
kernel: [] cache_clean+0x2e0/0x300 [sunrpc]
kernel: [] do_cache_clean+0xe/0x70 [sunrpc]
kernel: [] process_one_work+0x1ff/0x3b0
kernel: [] worker_thread+0x2bc/0x4a0
kernel: [] ? rescuer_thread+0x3a0/0x3a0
kernel: [] kthread+0xe4/0xf0
kernel: [] ret_from_fork+0x1f/0x40
kernel: [] ? kthread_stop+0x110/0x110
kernel: Code: f7 ff ff eb 3b 65 8b 05 da 30 e2 7e 89 c0 48 0f a3 05 a0 38 b8 00 0f 92 c0 84 c0 0f 85 d1 fe ff ff 0f 1f 44 00 00 e9 f5 fe ff ff 0b 49 8b 03 31 f6 f6 c4 40 0f 85 62 ff ff ff e9 61 ff ff ff
kernel: RIP [] kfree+0x155/0x180
kernel: RSP
kernel: ---[ end trace 3fdec044969def26 ]---

It seems to be most common after a server reboot where a client has been
using a Kerberos mount, and reconnects to continue its workload.

Signed-off-by: Chuck Lever
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

Chuck Lever
2016-09-13 04:57:16 +0800

12 Sep, 2016

4 commits

9395452b4 Linux 4.8-rc6

Linus Torvalds
2016-09-12 11:02:25 +0800
bd0b841fe nvme: make NVME_RDMA depend on BLOCK

Commit aa71987472a9 ("nvme: fabrics drivers don't need the nvme-pci
driver") removed the dependency on BLK_DEV_NVME, but the code does
depend on the block layer (which used to be an implicit dependency
through BLK_DEV_NVME).

Otherwise you get various errors from the kbuild test robot random
config testing when that happens to hit a configuration with BLOCK
device support disabled.

Cc: Christoph Hellwig
Cc: Jay Freyensee
Cc: Sagi Grimberg
Signed-off-by: Linus Torvalds

Linus Torvalds
2016-09-12 05:41:49 +0800
2afe669ac Merge tag 'staging-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging

Pull IIO fixes from Greg KH:
"Here are a few small IIO fixes for 4.8-rc6.

Nothing major, full details are in the shortlog, all of these have
been in linux-next with no reported issues"

* tag 'staging-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
iio:core: fix IIO_VAL_FRACTIONAL sign handling
iio: ensure ret is initialized to zero before entering do loop
iio: accel: kxsd9: Fix scaling bug
iio: accel: bmc150: reset chip at init time
iio: fix pressure data output unit in hid-sensor-attributes
tools:iio:iio_generic_buffer: fix trigger-less mode

Linus Torvalds
2016-09-12 05:23:48 +0800
61c3dae67 Merge tag 'usb-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some small USB gadget, phy, and xhci fixes for 4.8-rc6.

All of these resolve minor issues that have been reported, and all
have been in linux-next with no reported issues"

* tag 'usb-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: chipidea: udc: fix NULL ptr dereference in isr_setup_status_phase
xhci: fix null pointer dereference in stop command timeout function
usb: dwc3: pci: fix build warning on !PM_SLEEP
usb: gadget: prevent potenial null pointer dereference on skb->len
usb: renesas_usbhs: fix clearing the {BRDY,BEMP}STS condition
usb: phy: phy-generic: Check clk_prepare_enable() error
usb: gadget: udc: renesas-usb3: clear VBOUT bit in DRD_CON
Revert "usb: dwc3: gadget: always decrement by 1"

Linus Torvalds
2016-09-12 05:10:29 +0800

11 Sep, 2016

3 commits

98ac9a608 Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
"nvdimm fixes for v4.8, two of them are tagged for -stable:

- Fix devm_memremap_pages() to use track_pfn_insert(). Otherwise,
DAX pmd mappings end up with an uncached pgprot, and unusable
performance for the device-dax interface. The device-dax interface
appeared in 4.7 so this is tagged for -stable.

- Fix a couple VM_BUG_ON() checks in the show_smaps() path to
understand DAX pmd entries. This fix is tagged for -stable.

- Fix a mis-merge of the nfit machine-check handler to flip the
polarity of an if() to match the final version of the patch that
Vishal sent for 4.8-rc1. Without this the nfit machine check
handler never detects / inserts new 'badblocks' entries which
applications use to identify lost portions of files.

- For test purposes, fix the nvdimm_clear_poison() path to operate on
legacy / simulated nvdimm memory ranges. Without this fix a test
can set badblocks, but never clear them on these ranges.

- Fix the range checking done by dax_dev_pmd_fault(). This is not
tagged for -stable since this problem is mitigated by specifying
aligned resources at device-dax setup time.

These patches have appeared in a next release over the past week. The
recent rebase you can see in the timestamps was to drop an invalid fix
as identified by the updated device-dax unit tests [1]. The -mm
touches have an ack from Andrew"

[1]: "[ndctl PATCH 0/3] device-dax test for recent kernel bugs"
https://lists.01.org/pipermail/linux-nvdimm/2016-September/006855.html

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm: allow legacy (e820) pmem region to clear bad blocks
nfit, mce: Fix SPA matching logic in MCE handler
mm: fix cache mode of dax pmd mappings
mm: fix show_smap() for zone_device-pmd ranges
dax: fix mapping size check

Linus Torvalds
2016-09-11 00:58:52 +0800
b8db3714d Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"Mostly driver bugfixes, but also a few cleanups which are nice to have
out of the way"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: rk3x: Restore clock settings at resume time
i2c: Spelling s/acknowedge/acknowledge/
i2c: designware: save the preset value of DW_IC_SDA_HOLD
Documentation: i2c: slave-interface: add note for driver development
i2c: mux: demux-pinctrl: run properly with multiple instances
i2c: bcm-kona: fix inconsistent indenting
i2c: rcar: use proper device with dma_mapping_error
i2c: sh_mobile: use proper device with dma_mapping_error
i2c: mux: demux-pinctrl: invalidate properly when switching fails

Linus Torvalds
2016-09-11 00:43:10 +0800
6905732c8 Merge tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull fscrypto fixes from Ted Ts'o:
"Fix some brown-paper-bag bugs for fscrypto, including one which
allows a malicious user to set an encryption policy on an empty
directory which they do not own"

* tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
fscrypto: require write access to mount to set encryption policy
fscrypto: only allow setting encryption policy on directories
fscrypto: add authorization check for setting encryption policy

Linus Torvalds
2016-09-11 00:18:33 +0800

10 Sep, 2016

17 commits

ba63f23d6 fscrypto: require write access to mount to set encryption policy

Since setting an encryption policy requires writing metadata to the
filesystem, it should be guarded by mnt_want_write/mnt_drop_write.
Otherwise, a user could cause a write to a frozen or readonly
filesystem. This was handled correctly by f2fs but not by ext4. Make
fscrypt_process_policy() handle it rather than relying on the filesystem
to get it right.
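
The idea of wrapping the metadata update in a want-write/drop-write pair can be sketched as a context manager. This is a toy model, not kernel code: `Mount`, `want_write`, `process_policy`, and `ReadOnlyFilesystem` are hypothetical names, and a dict stands in for the inode.

```python
from contextlib import contextmanager


class ReadOnlyFilesystem(Exception):
    pass


class Mount:
    def __init__(self, readonly=False):
        self.readonly = readonly
        self.writers = 0  # number of writers currently holding write access


@contextmanager
def want_write(mnt):
    """Take write access for the duration of the block, then drop it."""
    if mnt.readonly:
        raise ReadOnlyFilesystem("mount is read-only")
    mnt.writers += 1
    try:
        yield
    finally:
        mnt.writers -= 1


def process_policy(mnt, inode, policy):
    # The guard lives in the shared helper, so every caller gets it.
    with want_write(mnt):
        inode["policy"] = policy
```

Placing the guard in the shared helper means a filesystem cannot forget it, which is the point the message makes about ext4 versus f2fs.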

Signed-off-by: Eric Biggers
Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs}
Signed-off-by: Theodore Ts'o
Acked-by: Jaegeuk Kim

Eric Biggers
2016-09-10 13:18:57 +0800
002ced4be fscrypto: only allow setting encryption policy on directories

The FS_IOC_SET_ENCRYPTION_POLICY ioctl allowed setting an encryption
policy on nondirectory files. This was unintentional, and in the case
of nonempty regular files did not behave as expected because existing
data was not actually encrypted by the ioctl.

In the case of ext4, the user could also trigger filesystem errors in
->empty_dir(), e.g. due to mismatched "directory" checksums when the
kernel incorrectly tried to interpret a regular file as a directory.

This bug affected ext4 with kernels v4.8-rc1 or later and f2fs with
kernels v4.6 and later. It appears that older kernels only permitted
directories and that the check was accidentally lost during the
refactoring to share the file encryption code between ext4 and f2fs.

This patch restores the !S_ISDIR() check that was present in older
kernels.
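
The same file-type test is available in userspace through Python's `stat` module, which makes the check easy to illustrate. `may_set_policy` is a hypothetical helper; the error code the kernel returns for non-directories is not stated here and is not modelled.

```python
import stat


def may_set_policy(mode):
    """Allow setting a policy only when the inode mode describes a directory."""
    return stat.S_ISDIR(mode)


print(may_set_policy(stat.S_IFDIR | 0o755))  # directory
print(may_set_policy(stat.S_IFREG | 0o644))  # regular file
```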

Signed-off-by: Eric Biggers
Cc: stable@vger.kernel.org
Signed-off-by: Theodore Ts'o

Eric Biggers
2016-09-10 11:38:12 +0800
163ae1c6a fscrypto: add authorization check for setting encryption policy

On an ext4 or f2fs filesystem with file encryption supported, a user
could set an encryption policy on any empty directory(*) to which they
had readonly access. This is obviously problematic, since such a
directory might be owned by another user and the new encryption policy
would prevent that other user from creating files in their own directory
(for example).

Fix this by requiring inode_owner_or_capable() permission to set an
encryption policy. This means that either the caller must own the file,
or the caller must have the capability CAP_FOWNER.

(*) Or also on any regular file, for f2fs v4.6 and later and ext4
v4.8-rc1 and later; a separate bug fix is coming for that.
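
The rule stated above (the caller owns the file, or holds CAP_FOWNER) reduces to a simple predicate. This sketch models only that rule with a hypothetical `owner_or_capable` helper; it is not the kernel's inode_owner_or_capable().

```python
def owner_or_capable(caller_uid, inode_uid, has_cap_fowner):
    """True if the caller owns the inode or holds the CAP_FOWNER capability."""
    return caller_uid == inode_uid or has_cap_fowner


# A user with only read access to someone else's empty directory is now refused.
print(owner_or_capable(caller_uid=1001, inode_uid=1000, has_cap_fowner=False))
```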

Signed-off-by: Eric Biggers
Cc: stable@vger.kernel.org # 4.1+; check fs/{ext4,f2fs}
Signed-off-by: Theodore Ts'o

Eric Biggers
2016-09-10 11:37:14 +0800
1e8b8d961 libnvdimm: allow legacy (e820) pmem region to clear bad blocks

Bad blocks can be injected via /sys/block/pmemN/badblocks. When legacy
pmem is being used, or a pmem region is created using the memmap kernel
parameter, the injected bad blocks are not cleared because
nvdimm_clear_poison() fails for lack of an ndctl function pointer. In
this case we need to just return as handled and allow the bad blocks to
be cleared rather than fail.

Reviewed-by: Vishal Verma
Signed-off-by: Dave Jiang
Signed-off-by: Dan Williams

Dave Jiang
2016-09-10 08:34:46 +0800
2e21807d4 nfit, mce: Fix SPA matching logic in MCE handler

The check for a 'pmem' type SPA in the MCE handler was inverted due to a
merge/rebase error.

Fixes: 6839a6d nfit: do an ARS scrub on hitting a latent media error
Cc: linux-acpi@vger.kernel.org
Cc: Dan Williams
Signed-off-by: Vishal Verma
Signed-off-by: Dan Williams

Vishal Verma
2016-09-10 08:34:46 +0800
9049771f7 mm: fix cache mode of dax pmd mappings

track_pfn_insert() in vmf_insert_pfn_pmd() is marking dax mappings as
uncacheable, rendering them impractical for application usage. DAX-pte
mappings are cached and the goal of establishing DAX-pmd mappings is to
attain more performance, not dramatically less (3 orders of magnitude).

track_pfn_insert() relies on a previous call to reserve_memtype() to
establish the expected page_cache_mode for the range. While memremap()
arranges for reserve_memtype() to be called, devm_memremap_pages() does
not. So, teach track_pfn_insert() and untrack_pfn() how to handle
tracking without a vma, and arrange for devm_memremap_pages() to
establish the write-back-cache reservation in the memtype tree.

Cc:
Cc: Matthew Wilcox
Cc: Ross Zwisler
Cc: Nilesh Choudhury
Cc: Kirill A. Shutemov
Reported-by: Toshi Kani
Reported-by: Kai Zhang
Acked-by: Andrew Morton
Signed-off-by: Dan Williams

Dan Williams
2016-09-10 08:34:46 +0800
ca120cf68 mm: fix show_smap() for zone_device-pmd ranges

Attempting to dump /proc/<pid>/smaps for a process with pmd dax mappings
currently results in the following VM_BUG_ONs:

kernel BUG at mm/huge_memory.c:1105!
task: ffff88045f16b140 task.stack: ffff88045be14000
RIP: 0010:[] [] follow_trans_huge_pmd+0x2cb/0x340
[..]
Call Trace:
[] smaps_pte_range+0xa0/0x4b0
[] ? vsnprintf+0x255/0x4c0
[] __walk_page_range+0x1fe/0x4d0
[] walk_page_vma+0x62/0x80
[] show_smap+0xa6/0x2b0

kernel BUG at fs/proc/task_mmu.c:585!
RIP: 0010:[] [] smaps_pte_range+0x499/0x4b0
Call Trace:
[] ? vsnprintf+0x255/0x4c0
[] __walk_page_range+0x1fe/0x4d0
[] walk_page_vma+0x62/0x80
[] show_smap+0xa6/0x2b0

These locations are sanity checking page flags that must be set for an
anonymous transparent huge page, but are not set for the zone_device
pages associated with dax mappings.

Cc: Ross Zwisler
Cc: Kirill A. Shutemov
Acked-by: Andrew Morton
Signed-off-by: Dan Williams

Dan Williams
2016-09-10 08:34:45 +0800
d0acc7dfd Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
"This includes a couple of bugfixes for virtio.

The virtio console patch is actually also in x86/tip targeting 4.9
because it helps vmap stacks, but it also fixes IOMMU_PLATFORM which
was added in 4.8, and it seems important not to ship that in a broken
configuration"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_console: Stop doing DMA on the stack
virtio: mark vring_dma_dev() static

Linus Torvalds
2016-09-10 05:52:05 +0800
daf6b9b68 Merge tag 'pm-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"This includes a PM QoS framework fix from Tejun to prevent interrupts
from being enabled unexpectedly during early boot and a cpufreq
documentation fix.

Specifics:

- If the PM QoS framework invokes cancel_delayed_work_sync() during
early boot, it will enable interrupts which is not expected at that
point, so prevent it from happening (Tejun Heo)

- Fix cpufreq statistic documentation to follow a recent change in
behavior that forgot to update it as appropriate (Jean Delvare)"

* tag 'pm-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq-stats: Minor documentation fix
PM / QoS: avoid calling cancel_delayed_work_sync() during early boot

Linus Torvalds
2016-09-10 05:47:41 +0800
8a2a835bb Merge branches 'pm-core-fixes' and 'pm-cpufreq-fixes'

* pm-core-fixes:
PM / QoS: avoid calling cancel_delayed_work_sync() during early boot

* pm-cpufreq-fixes:
cpufreq-stats: Minor documentation fix

Rafael J. Wysocki
2016-09-10 04:34:16 +0800
c4a6c70f9 Merge tag 'gpio-v4.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"Some GPIO fixes that have been boiling the last two weeks or so.
Nothing special, I'm trying to sort out some Kconfig business and
Russell needs a fix in for his SA1100 rework.

Summary:

- Revert a pointless attempt to add an include to solve the UM allyes
compilation problem.

- Make the mcp23s08 depend on OF_GPIO as it uses it and doesn't
compile properly without it.

- Fix a probing problem for ucb1x00"

* tag 'gpio-v4.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: sa1100: fix irq probing for ucb1x00
gpio: mcp23s08: make driver depend on OF_GPIO
Revert "gpio: include in gpiolib-of"

Linus Torvalds
2016-09-10 04:09:50 +0800
6dc728ccd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse fix from Miklos Szeredi:
"This fixes a deadlock when fuse, direct I/O and loop device are
combined"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: direct-io: don't dirty ITER_BVEC pages

Linus Torvalds
2016-09-10 04:00:41 +0800
5c44ad6a3 Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fix from Miklos Szeredi:
"This fixes a regression caused by the last pull request"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix workdir creation

Linus Torvalds
2016-09-10 03:56:28 +0800
f4a9c169c Merge branch 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
"I'm not proud of how long it took me to track down that one liner in
btrfs_sync_log(), but the good news is the patches I was trying to
blame for these problems were actually fine (sorry Filipe)"

* 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: introduce tickets_id to determine whether asynchronous metadata reclaim work makes progress
btrfs: remove root_log_ctx from ctx list before btrfs_sync_log returns
btrfs: do not decrease bytes_may_use when replaying extents

Linus Torvalds
2016-09-10 03:52:31 +0800
067c2f472 Merge tag 'sound-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"We've got quite a few fixes at this time, and all are stable patches.

syzkaller strikes back again (episode 19 or so), and we had to plug
some holes in ALSA core part (mostly timer).

In addition, a couple of FireWire audio fixes for the invalid copy
user calls in locks, and a few quirks for HD-audio and USB-audio as
usual are included"

* tag 'sound-4.8-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: rawmidi: Fix possible deadlock with virmidi registration
ALSA: timer: Fix zero-division by continue of uninitialized instance
ALSA: timer: fix NULL pointer dereference in read()/ioctl() race
ALSA: fireworks: accessing to user space outside spinlock
ALSA: firewire-tascam: accessing to user space outside spinlock
ALSA: hda - Enable subwoofer on Dell Inspiron 7559
ALSA: hda - Add headset mic quirk for Dell Inspiron 5468
ALSA: usb-audio: Add sample rate inquiry quirk for B850V3 CP2114
ALSA: timer: fix NULL pointer dereference on memory allocation failure
ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE

Linus Torvalds
2016-09-10 03:02:46 +0800
5e59d9a1a virtio_console: Stop doing DMA on the stack

virtio_console uses a small DMA buffer for control requests. Move
that buffer into heap memory.

Doing virtio DMA on the stack is normally okay on non-DMA-API virtio
systems (which is currently most of them), but it breaks completely
if the stack is virtually mapped.

Tested by typing both directions using picocom aimed at /dev/hvc0.

Signed-off-by: Andy Lutomirski
Signed-off-by: Michael S. Tsirkin
Reviewed-by: Amit Shah

Andy Lutomirski
2016-09-10 02:12:45 +0800
af7c1becc virtio: mark vring_dma_dev() static

We get 1 warning when building kernel with W=1:
drivers/virtio/virtio_ring.c:170:16: warning: no previous prototype for 'vring_dma_dev' [-Wmissing-prototypes]

In fact, this function is only used in the file in which it is
defined, so it doesn't need a separate declaration and can be made
static. This patch marks the function 'static'.

Signed-off-by: Baoyou Xie
Acked-by: Arnd Bergmann
Signed-off-by: Michael S. Tsirkin

Baoyou Xie
2016-09-10 02:12:35 +0800