Eric Lee / smarc-fsl-linux-kernel

25 Dec, 2017

2 commits

482b3f92a vhost-vsock: add pkt cancel capability ... Browse Code »

[ Upstream commit 16320f363ae128d9b9c70e60f00f2a572f57c23d ]

To allow canceling all packets of a connection.

Reviewed-by: Stefan Hajnoczi
Reviewed-by: Jorgen Hansen
Signed-off-by: Peng Tao
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Peng Tao
2017-12-25 21:23:37 +0800
6f1848e77 vsock: track pkt owner vsock ... Browse Code »

[ Upstream commit 36d277bac8080202684e67162ebb157f16631581 ]

So that we can cancel a queued pkt later if necessary.

Signed-off-by: Peng Tao
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Peng Tao
2017-12-25 21:23:37 +0800

20 Dec, 2017

7 commits

04d5a2d5d IB/core: Fix calculation of maximum RoCE MTU ... Browse Code »

[ Upstream commit 99260132fde7bddc6e0132ce53da94d1c9ccabcb ]

The original code only took into consideration the largest header
possible after the IB_BTH_BYTES. This was incorrect, as the largest
possible header size is the largest possible combination of headers we
might run into. The new code accounts for all possible headers in the
largest possible combination and subtracts that from the MTU to make
sure that all packets will fit on the wire.

Link: https://www.spinics.net/lists/linux-rdma/msg54558.html
Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices")
Signed-off-by: Parav Pandit
Reviewed-by: Daniel Jurgens
Reported-by: Roland Dreier
Signed-off-by: Leon Romanovsky
Signed-off-by: Doug Ledford
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Parav Pandit
2017-12-20 17:07:32 +0800
2850c3ec0 mm: Handle 0 flags in _calc_vm_trans() macro ... Browse Code »

[ Upstream commit 592e254502041f953e84d091eae2c68cba04c10b ]

_calc_vm_trans() does not handle the situation when some of the passed
flags are 0 (which can happen if these VM flags do not make sense for
the architecture). Improve the _calc_vm_trans() macro to return 0 in
such situation. Since all passed flags are constant, this does not add
any runtime overhead.

Signed-off-by: Jan Kara
Signed-off-by: Dan Williams
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2017-12-20 17:07:29 +0800
e15628b29 Revert "x86/acpi: Set persistent cpuid <-> nodeid mapping when booting" ... Browse Code »

[ Upstream commit c962cff17dfa11f4a8227ac16de2b28aea3312e4 ]

Revert: dc6db24d2476 ("x86/acpi: Set persistent cpuid nodeid mapping when booting")

The mapping of "cpuid nodeid" is established at boot time via ACPI
tables to keep associations of workqueues and other node related items
consistent across cpu hotplug.

But, ACPI tables are unreliable and failures with that boot time mapping
have been reported on machines where the ACPI table and the physical
information which is retrieved at actual hotplug is inconsistent.

Revert the mapping implementation so it can be replaced with a less error
prone approach.

Signed-off-by: Dou Liyang
Tested-by: Xiaolong Ye
Cc: rjw@rjwysocki.net
Cc: linux-acpi@vger.kernel.org
Cc: guzheng1@huawei.com
Cc: izumi.taku@jp.fujitsu.com
Cc: lenb@kernel.org
Link: http://lkml.kernel.org/r/1488528147-2279-2-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Thomas Gleixner
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Dou Liyang
2017-12-20 17:07:26 +0800
892e4f9bc target: fix ALUA transition timeout handling ... Browse Code »

[ Upstream commit d7175373f2745ed4abe5b388d5aabd06304f801e ]

The implicit transition time tells initiators the min time
to wait before timing out a transition. We currently schedule
the transition to occur in tg_pt_gp_implicit_trans_secs
seconds so there is no room for delays. If
core_alua_do_transition_tg_pt_work->core_alua_update_tpg_primary_metadata
needs to write out info to a remote file, then the initiator can
easily time out the operation.

Signed-off-by: Mike Christie
Signed-off-by: Nicholas Bellinger
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Mike Christie
2017-12-20 17:07:26 +0800
fd27dbcae net/mlx4_core: Avoid delays during VF driver device shutdown ... Browse Code »

[ Upstream commit 4cbe4dac82e423ecc9a0ba46af24a860853259f4 ]

Some Hypervisors detach VFs from VMs by instantly causing an FLR event
to be generated for a VF.

In the mlx4 case, this will cause that VF's comm channel to be disabled
before the VM has an opportunity to invoke the VF device's "shutdown"
method.

For such Hypervisors, there is a race condition between the VF's
shutdown method and its internal-error detection/reset thread.

The internal-error detection/reset thread (which runs every 5 seconds) also
detects a disabled comm channel. If the internal-error detection/reset
flow wins the race, we still get delays (while that flow tries repeatedly
to detect comm-channel recovery).

The cited commit fixed the command timeout problem when the
internal-error detection/reset flow loses the race.

This commit avoids the unneeded delays when the internal-error
detection/reset flow wins.

Fixes: d585df1c5ccf ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown")
Signed-off-by: Jack Morgenstein
Reported-by: Simon Xiao
Signed-off-by: Tariq Toukan
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Jack Morgenstein
2017-12-20 17:07:25 +0800
b6dbace92 usb: add helper to extract bits 12:11 of wMaxPacketSize ... Browse Code »

commit 541b6fe63023f3059cf85d47ff2767a3e42a8e44 upstream.

According to USB Specification 2.0 table 9-4,
wMaxPacketSize is a bitfield. Endpoint's maxpacket
is laid out in bits 10:0. For high-speed,
high-bandwidth isochronous endpoints, bits 12:11
contain a multiplier to tell us how many
transactions we want to try per uframe.

This means that if we want an isochronous endpoint
to issue 3 transfers of 1024 bytes per uframe,
wMaxPacketSize should contain the value:

1024 | (2 << 11)

or 5120 (0x1400). In order to make Host and
Peripheral controller drivers' life easier, we're
adding a helper which returns bits 12:11. Note that
no care is made WRT to checking endpoint type and
gadget's speed. That's left for drivers to handle.

Signed-off-by: Felipe Balbi
Signed-off-by: Greg Kroah-Hartman

Felipe Balbi
2017-12-20 17:07:16 +0800
43259d07f crypto: hmac - require that the underlying hash algorithm is unkeyed ... Browse Code »

commit af3ff8045bbf3e32f1a448542e73abb4c8ceb6f1 upstream.

Because the HMAC template didn't check that its underlying hash
algorithm is unkeyed, trying to use "hmac(hmac(sha3-512-generic))"
through AF_ALG or through KEYCTL_DH_COMPUTE resulted in the inner HMAC
being used without having been keyed, resulting in sha3_update() being
called without sha3_init(), causing a stack buffer overflow.

This is a very old bug, but it seems to have only started causing real
problems when SHA-3 support was added (requires CONFIG_CRYPTO_SHA3)
because the innermost hash's state is ->import()ed from a zeroed buffer,
and it just so happens that other hash algorithms are fine with that,
but SHA-3 is not. However, there could be arch or hardware-dependent
hash algorithms also affected; I couldn't test everything.

Fix the bug by introducing a function crypto_shash_alg_has_setkey()
which tests whether a shash algorithm is keyed. Then update the HMAC
template to require that its underlying hash algorithm is unkeyed.

Here is a reproducer:

#include
#include

int main()
{
int algfd;
struct sockaddr_alg addr = {
.salg_type = "hash",
.salg_name = "hmac(hmac(sha3-512-generic))",
};
char key[4096] = { 0 };

algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(algfd, (const struct sockaddr *)&addr, sizeof(addr));
setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
}

Here was the KASAN report from syzbot:

BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:341 [inline]
BUG: KASAN: stack-out-of-bounds in sha3_update+0xdf/0x2e0 crypto/sha3_generic.c:161
Write of size 4096 at addr ffff8801cca07c40 by task syzkaller076574/3044

CPU: 1 PID: 3044 Comm: syzkaller076574 Not tainted 4.14.0-mm1+ #25
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x25b/0x340 mm/kasan/report.c:409
check_memory_region_inline mm/kasan/kasan.c:260 [inline]
check_memory_region+0x137/0x190 mm/kasan/kasan.c:267
memcpy+0x37/0x50 mm/kasan/kasan.c:303
memcpy include/linux/string.h:341 [inline]
sha3_update+0xdf/0x2e0 crypto/sha3_generic.c:161
crypto_shash_update+0xcb/0x220 crypto/shash.c:109
shash_finup_unaligned+0x2a/0x60 crypto/shash.c:151
crypto_shash_finup+0xc4/0x120 crypto/shash.c:165
hmac_finup+0x182/0x330 crypto/hmac.c:152
crypto_shash_finup+0xc4/0x120 crypto/shash.c:165
shash_digest_unaligned+0x9e/0xd0 crypto/shash.c:172
crypto_shash_digest+0xc4/0x120 crypto/shash.c:186
hmac_setkey+0x36a/0x690 crypto/hmac.c:66
crypto_shash_setkey+0xad/0x190 crypto/shash.c:64
shash_async_setkey+0x47/0x60 crypto/shash.c:207
crypto_ahash_setkey+0xaf/0x180 crypto/ahash.c:200
hash_setkey+0x40/0x90 crypto/algif_hash.c:446
alg_setkey crypto/af_alg.c:221 [inline]
alg_setsockopt+0x2a1/0x350 crypto/af_alg.c:254
SYSC_setsockopt net/socket.c:1851 [inline]
SyS_setsockopt+0x189/0x360 net/socket.c:1830
entry_SYSCALL_64_fastpath+0x1f/0x96

Reported-by: syzbot
Signed-off-by: Eric Biggers
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Eric Biggers
2017-12-20 17:07:15 +0800

16 Dec, 2017

2 commits

564fe3e0e net: remove hlist_nulls_add_tail_rcu() ... Browse Code »

[ Upstream commit d7efc6c11b277d9d80b99b1334a78bfe7d7edf10 ]

Alexander Potapenko reported use of uninitialized memory [1]

This happens when inserting a request socket into TCP ehash,
in __sk_nulls_add_node_rcu(), since sk_reuseport is not initialized.

Bug was added by commit d894ba18d4e4 ("soreuseport: fix ordering for
mixed v4/v6 sockets")

Note that d296ba60d8e2 ("soreuseport: Resolve merge conflict for v4/v6
ordering fix") missed the opportunity to get rid of
hlist_nulls_add_tail_rcu() :

Both UDP sockets and TCP/DCCP listeners no longer use
__sk_nulls_add_node_rcu() for their hash insertion.

Since all other sockets have unique 4-tuple, the reuseport status
has no special meaning, so we can always use hlist_nulls_add_head_rcu()
for them and save few cycles/instructions.

[1]

==================================================================
BUG: KMSAN: use of uninitialized memory in inet_ehash_insert+0xd40/0x1050
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0+ #3288
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:

__dump_stack lib/dump_stack.c:16
dump_stack+0x185/0x1d0 lib/dump_stack.c:52
kmsan_report+0x13f/0x1c0 mm/kmsan/kmsan.c:1016
__msan_warning_32+0x69/0xb0 mm/kmsan/kmsan_instr.c:766
__sk_nulls_add_node_rcu ./include/net/sock.h:684
inet_ehash_insert+0xd40/0x1050 net/ipv4/inet_hashtables.c:413
reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:754
inet_csk_reqsk_queue_hash_add+0x1cc/0x300 net/ipv4/inet_connection_sock.c:765
tcp_conn_request+0x31e7/0x36f0 net/ipv4/tcp_input.c:6414
tcp_v4_conn_request+0x16d/0x220 net/ipv4/tcp_ipv4.c:1314
tcp_rcv_state_process+0x42a/0x7210 net/ipv4/tcp_input.c:5917
tcp_v4_do_rcv+0xa6a/0xcd0 net/ipv4/tcp_ipv4.c:1483
tcp_v4_rcv+0x3de0/0x4ab0 net/ipv4/tcp_ipv4.c:1763
ip_local_deliver_finish+0x6bb/0xcb0 net/ipv4/ip_input.c:216
NF_HOOK ./include/linux/netfilter.h:248
ip_local_deliver+0x3fa/0x480 net/ipv4/ip_input.c:257
dst_input ./include/net/dst.h:477
ip_rcv_finish+0x6fb/0x1540 net/ipv4/ip_input.c:397
NF_HOOK ./include/linux/netfilter.h:248
ip_rcv+0x10f6/0x15c0 net/ipv4/ip_input.c:488
__netif_receive_skb_core+0x36f6/0x3f60 net/core/dev.c:4298
__netif_receive_skb net/core/dev.c:4336
netif_receive_skb_internal+0x63c/0x19c0 net/core/dev.c:4497
napi_skb_finish net/core/dev.c:4858
napi_gro_receive+0x629/0xa50 net/core/dev.c:4889
e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4018
e1000_clean_rx_irq+0x1492/0x1d30
drivers/net/ethernet/intel/e1000/e1000_main.c:4474
e1000_clean+0x43aa/0x5970 drivers/net/ethernet/intel/e1000/e1000_main.c:3819
napi_poll net/core/dev.c:5500
net_rx_action+0x73c/0x1820 net/core/dev.c:5566
__do_softirq+0x4b4/0x8dd kernel/softirq.c:284
invoke_softirq kernel/softirq.c:364
irq_exit+0x203/0x240 kernel/softirq.c:405
exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:638
do_IRQ+0x15e/0x1a0 arch/x86/kernel/irq.c:263
common_interrupt+0x86/0x86

Fixes: d894ba18d4e4 ("soreuseport: fix ordering for mixed v4/v6 sockets")
Fixes: d296ba60d8e2 ("soreuseport: Resolve merge conflict for v4/v6 ordering fix")
Signed-off-by: Eric Dumazet
Reported-by: Alexander Potapenko
Acked-by: Craig Gallek
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2017-12-16 23:25:45 +0800
80ad5bd1b usbnet: fix alignment for frames with no ethernet header ... Browse Code »

[ Upstream commit a4abd7a80addb4a9547f7dfc7812566b60ec505c ]

The qmi_wwan minidriver support a 'raw-ip' mode where frames are
received without any ethernet header. This causes alignment issues
because the skbs allocated by usbnet are "IP aligned".

Fix by allowing minidrivers to disable the additional alignment
offset. This is implemented using a per-device flag, since the same
minidriver also supports 'ethernet' mode.

Fixes: 32f7adf633b9 ("net: qmi_wwan: support "raw IP" mode")
Reported-and-tested-by: Jay Foster
Signed-off-by: Bjørn Mork
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Bjørn Mork
2017-12-16 23:25:45 +0800

14 Dec, 2017

6 commits

a77c11607 lib/genalloc.c: make the avail variable an atomic_long_t ... Browse Code »

[ Upstream commit 36a3d1dd4e16bcd0d2ddfb4a2ec7092f0ae0d931 ]

If the amount of resources allocated to a gen_pool exceeds 2^32 then the
avail atomic overflows and this causes problems when clients try and
borrow resources from the pool. This is only expected to be an issue on
64 bit systems.

Add the header to pull in atomic_long* operations. So
that 32 bit systems continue to use atomic32_t but 64 bit systems can
use atomic64_t.

Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.com
Signed-off-by: Stephen Bates
Reviewed-by: Logan Gunthorpe
Reviewed-by: Mathieu Desnoyers
Reviewed-by: Daniel Mentz
Cc: Jonathan Corbet
Cc: Andrew Morton
Cc: Will Deacon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Stephen Bates
2017-12-14 16:28:22 +0800
93247ff1f ARM: OMAP2+: gpmc-onenand: propagate error on initialization failure ... Browse Code »

[ Upstream commit 7807e086a2d1f69cc1a57958cac04fea79fc2112 ]

gpmc_probe_onenand_child returns success even on gpmc_onenand_init
failure. Fix that.

Signed-off-by: Ladislav Michl
Acked-by: Roger Quadros
Signed-off-by: Tony Lindgren
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Ladislav Michl
2017-12-14 16:28:16 +0800
6a53078b9 mm: drop unused pmdp_huge_get_and_clear_notify() ... Browse Code »

commit c0c379e2931b05facef538e53bf3b21f283d9a0b upstream.

Dave noticed that after fixing MADV_DONTNEED vs numa balancing race the
last pmdp_huge_get_and_clear_notify() user is gone.

Let's drop the helper.

Link: http://lkml.kernel.org/r/20170306112047.24809-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov
Cc: Dave Hansen
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
[jwang: adjust context for 4.9]
Signed-off-by: Jack Wang
Signed-off-by: Greg Kroah-Hartman

Kirill A. Shutemov
2017-12-14 16:28:16 +0800
29c3b7a85 efi: Move some sysfs files to be read-only by root ... Browse Code »

commit af97a77bc01ce49a466f9d4c0125479e2e2230b6 upstream.

Thanks to the scripts/leaking_addresses.pl script, it was found that
some EFI values should not be readable by non-root users.

So make them root-only, and to do that, add a __ATTR_RO_MODE() macro to
make this easier, and use it in other places at the same time.

Reported-by: Linus Torvalds
Tested-by: Dave Young
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Ard Biesheuvel
Cc: H. Peter Anvin
Cc: Matt Fleming
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20171206095010.24170-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-12-14 16:28:11 +0800
35b4bfbda scsi: libsas: align sata_device's rps_resp on a cacheline ... Browse Code »

commit c2e8fbf908afd81ad502b567a6639598f92c9b9d upstream.

The rps_resp buffer in ata_device is a DMA target, but it isn't
explicitly cacheline aligned. Due to this, adjacent fields can be
overwritten with stale data from memory on non-coherent architectures.
As a result, the kernel is sometimes unable to communicate with an SATA
device behind a SAS expander.

Fix this by ensuring that the rps_resp buffer is cacheline aligned.

This issue is similar to that fixed by Commit 84bda12af31f93 ("libata:
align ap->sector_buf") and Commit 4ee34ea3a12396f35b26 ("libata: Align
ata_device's id on a cacheline").

Signed-off-by: Huacai Chen
Signed-off-by: Christoph Hellwig
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman

Huacai Chen
2017-12-14 16:28:11 +0800
4cb4d78c5 scsi: dma-mapping: always provide dma_get_cache_alignment ... Browse Code »

commit 860dd4424f344400b491b212ee4acb3a358ba9d9 upstream.

Provide the dummy version of dma_get_cache_alignment that always returns
1 even if CONFIG_HAS_DMA is not set, so that drivers and subsystems can
use it without ifdefs.

Signed-off-by: Christoph Hellwig
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman

Christoph Hellwig
2017-12-14 16:28:11 +0800

10 Dec, 2017

6 commits

05ffc7ed5 USB: core: Add type-specific length check of BOS descriptors ... Browse Code »

commit 81cf4a45360f70528f1f64ba018d61cb5767249a upstream.

As most of BOS descriptors are longer in length than their header
'struct usb_dev_cap_header', comparing solely with it is not sufficient
to avoid out-of-bounds access to BOS descriptors.

This patch adds descriptor type specific length check in
usb_get_bos_descriptor() to fix the issue.

Signed-off-by: Masakazu Mokuno
Signed-off-by: Greg Kroah-Hartman

Masakazu Mokuno
2017-12-10 05:01:56 +0800
f5e0724e7 dma-fence: Introduce drm_fence_set_error() helper ... Browse Code »

commit a009e975da5c7d42a7f5eaadc54946eb5f76c9af upstream.

The dma_fence.error field (formerly known as dma_fence.status) is an
optional field that may be set by drivers before calling
dma_fence_signal(). The field can be used to indicate that the fence was
completed in err rather than with success, and is visible to other
consumers of the fence and to userspace via sync_file.

This patch renames the field from status to error so that its meaning is
hopefully more clear (and distinct from dma_fence_get_status() which is
a composite between the error state and signal state) and adds a helper
that validates the preconditions of when it is suitable to adjust the
error field.

Signed-off-by: Chris Wilson
Reviewed-by: Daniel Vetter
Reviewed-by: Sumit Semwal
Signed-off-by: Sumit Semwal
Link: http://patchwork.freedesktop.org/patch/msgid/20170104141222.6992-3-chris@chris-wilson.co.uk
[s/dma_fence/fence/g - gregkh]
Cc: Jisheng Zhang
Signed-off-by: Greg Kroah-Hartman

Chris Wilson
2017-12-10 05:01:56 +0800
d3b029a44 dma-fence: Wrap querying the fence->status ... Browse Code »

commit d6c99f4bf093a58d3ab47caaec74b81f18bc4e3f upstream.

The fence->status is an optional field that is only valid once the fence
has been signaled. (Driver may fill the fence->status with an error code
prior to calling dma_fence_signal().) Given the restriction upon its
validity, wrap querying of the fence->status into a helper
dma_fence_get_status().

Signed-off-by: Chris Wilson
Reviewed-by: Daniel Vetter
Reviewed-by: Sumit Semwal
Signed-off-by: Sumit Semwal
Link: http://patchwork.freedesktop.org/patch/msgid/20170104141222.6992-2-chris@chris-wilson.co.uk
[s/dma_fence/fence/g - gregkh]
Cc: Jisheng Zhang
Signed-off-by: Greg Kroah-Hartman

Chris Wilson
2017-12-10 05:01:56 +0800
b53525eaa dma-buf/dma-fence: Extract __dma_fence_is_later() ... Browse Code »

commit 8111477663813caa1a4469cfe6afaae36cd04513 upstream.

Often we have the task of comparing two seqno known to be on the same
context, so provide a common __dma_fence_is_later().

Signed-off-by: Chris Wilson
Cc: Sumit Semwal
Cc: Sean Paul
Cc: Gustavo Padovan
Reviewed-by: Sean Paul
Signed-off-by: Gustavo Padovan
Link: http://patchwork.freedesktop.org/patch/msgid/20170629125930.821-1-chris@chris-wilson.co.uk
[renamed to __fence_is_later() - gregkh]
Cc: Jisheng Zhang
Signed-off-by: Greg Kroah-Hartman

Chris Wilson
2017-12-10 05:01:55 +0800
c93c09a05 mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers ... Browse Code »

[ Upstream commit 0911d0041c22922228ca52a977d7b0b0159fee4b ]

Some ->page_mkwrite handlers may return VM_FAULT_RETRY as its return
code (GFS2 or Lustre can definitely do this). However VM_FAULT_RETRY
from ->page_mkwrite is completely unhandled by the mm code and results
in locking and writeably mapping the page which definitely is not what
the caller wanted.

Fix Lustre and block_page_mkwrite_ret() used by other filesystems
(notably GFS2) to return VM_FAULT_NOPAGE instead which results in
bailing out from the fault code, the CPU then retries the access, and we
fault again effectively doing what the handler wanted.

Link: http://lkml.kernel.org/r/20170203150729.15863-1-jack@suse.cz
Signed-off-by: Jan Kara
Reported-by: Al Viro
Reviewed-by: Jinshan Xiong
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2017-12-10 05:01:54 +0800
a88ff235e perf/x86/intel: Account interrupts for PEBS errors ... Browse Code »

[ Upstream commit 475113d937adfd150eb82b5e2c5507125a68e7af ]

It's possible to set up PEBS events to get only errors and not
any data, like on SNB-X (model 45) and IVB-EP (model 62)
via 2 perf commands running simultaneously:

taskset -c 1 ./perf record -c 4 -e branches:pp -j any -C 10

This leads to a soft lock up, because the error path of the
intel_pmu_drain_pebs_nhm() does not account event->hw.interrupt
for error PEBS interrupts, so in case you're getting ONLY
errors you don't have a way to stop the event when it's over
the max_samples_per_tick limit:

NMI watchdog: BUG: soft lockup - CPU#22 stuck for 22s! [perf_fuzzer:5816]
...
RIP: 0010:[] [] smp_call_function_single+0xe2/0x140
...
Call Trace:
? trace_hardirqs_on_caller+0xf5/0x1b0
? perf_cgroup_attach+0x70/0x70
perf_install_in_context+0x199/0x1b0
? ctx_resched+0x90/0x90
SYSC_perf_event_open+0x641/0xf90
SyS_perf_event_open+0x9/0x10
do_syscall_64+0x6c/0x1f0
entry_SYSCALL64_slow_path+0x25/0x25

Add perf_event_account_interrupt() which does the interrupt
and frequency checks and call it from intel_pmu_drain_pebs_nhm()'s
error path.

We keep the pending_kill and pending_wakeup logic only in the
__perf_event_overflow() path, because they make sense only if
there's any data to deliver.

Signed-off-by: Jiri Olsa
Signed-off-by: Peter Zijlstra (Intel)
Cc: Alexander Shishkin
Cc: Arnaldo Carvalho de Melo
Cc: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: Vince Weaver
Cc: Vince Weaver
Link: http://lkml.kernel.org/r/1482931866-6018-2-git-send-email-jolsa@kernel.org
Signed-off-by: Ingo Molnar
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Jiri Olsa
2017-12-10 05:01:52 +0800

05 Dec, 2017

2 commits

8588eb0ce bcache: Fix building error on MIPS ... Browse Code »

commit cf33c1ee5254c6a430bc1538232b49c3ea13e613 upstream.

This patch try to fix the building error on MIPS. The reason is MIPS
has already defined the PTR macro, which conflicts with the PTR macro
in include/uapi/linux/bcache.h.

[fixed by mlyle: corrected a line-length issue]

Signed-off-by: Huacai Chen
Reviewed-by: Michael Lyle
Signed-off-by: Michael Lyle
Signed-off-by: Jens Axboe
Signed-off-by: Greg Kroah-Hartman

Huacai Chen
2017-12-05 18:24:34 +0800
cebe139e5 mm, hugetlbfs: introduce ->split() to vm_operations_struct ... Browse Code »

commit 31383c6865a578834dd953d9dbc88e6b19fe3997 upstream.

Patch series "device-dax: fix unaligned munmap handling"

When device-dax is operating in huge-page mode we want it to behave like
hugetlbfs and fail attempts to split vmas into unaligned ranges. It
would be messy to teach the munmap path about device-dax alignment
constraints in the same (hstate) way that hugetlbfs communicates this
constraint. Instead, these patches introduce a new ->split() vm
operation.

This patch (of 2):

The device-dax interface has similar constraints as hugetlbfs in that it
requires the munmap path to unmap in huge page aligned units. Rather
than add more custom vma handling code in __split_vma() introduce a new
vm operation to perform this vma specific check.

Link: http://lkml.kernel.org/r/151130418135.4029.6783191281930729710.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: dee410792419 ("/dev/dax, core: file operations and dax-mmap")
Signed-off-by: Dan Williams
Cc: Jeff Moyer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Dan Williams
2017-12-05 18:24:32 +0800

30 Nov, 2017

1 commit

d45c593bb SUNRPC: Fix tracepoint storage issues with svc_recv and svc_rqst_status ... Browse Code »

commit e9d4bf219c83d09579bc62512fea2ca10f025d93 upstream.

There is no guarantee that either the request or the svc_xprt exist
by the time we get round to printing the trace message.

Signed-off-by: Trond Myklebust
Signed-off-by: J. Bruce Fields
Signed-off-by: Greg Kroah-Hartman

Trond Myklebust
2017-11-30 16:39:07 +0800

24 Nov, 2017

2 commits

9980b8278 mm/page_alloc.c: broken deferred calculation ... Browse Code »

commit d135e5750205a21a212a19dbb05aeb339e2cbea7 upstream.

In reset_deferred_meminit() we determine number of pages that must not
be deferred. We initialize pages for at least 2G of memory, but also
pages for reserved memory in this node.

The reserved memory is determined in this function:
memblock_reserved_memory_within(), which operates over physical
addresses, and returns size in bytes. However, reset_deferred_meminit()
assumes that that this function operates with pfns, and returns page
count.

The result is that in the best case machine boots slower than expected
due to initializing more pages than needed in single thread, and in the
worst case panics because fewer than needed pages are initialized early.

Link: http://lkml.kernel.org/r/20171021011707.15191-1-pasha.tatashin@oracle.com
Fixes: 864b9a393dcb ("mm: consider memblock reservations for deferred memory initialization sizing")
Signed-off-by: Pavel Tatashin
Acked-by: Michal Hocko
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Pavel Tatashin
2017-11-24 15:33:42 +0800
afd9fa661 netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed ... Browse Code »

[ Upstream commit 2b5ec1a5f9738ee7bf8f5ec0526e75e00362c48f ]

When run ipvs in two different network namespace at the same host, and one
ipvs transport network traffic to the other network namespace ipvs.
'ipvs_property' flag will make the second ipvs take no effect. So we should
clear 'ipvs_property' when SKB network namespace changed.

Fixes: 621e84d6f373 ("dev: introduce skb_scrub_packet()")
Signed-off-by: Ye Yin
Signed-off-by: Wei Zhou
Signed-off-by: Julian Anastasov
Signed-off-by: Simon Horman
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Ye Yin
2017-11-24 15:33:40 +0800

21 Nov, 2017

4 commits

f95d6058d uapi: fix linux/rds.h userspace compilation errors ... Browse Code »

[ Upstream commit feb0869d90e51ce8b6fd8a46588465b1b5a26d09 ]

Consistently use types from linux/types.h to fix the following
linux/rds.h userspace compilation errors:

/usr/include/linux/rds.h:106:2: error: unknown type name 'uint8_t'
uint8_t name[32];
/usr/include/linux/rds.h:107:2: error: unknown type name 'uint64_t'
uint64_t value;
/usr/include/linux/rds.h:117:2: error: unknown type name 'uint64_t'
uint64_t next_tx_seq;
/usr/include/linux/rds.h:118:2: error: unknown type name 'uint64_t'
uint64_t next_rx_seq;
/usr/include/linux/rds.h:121:2: error: unknown type name 'uint8_t'
uint8_t transport[TRANSNAMSIZ]; /* null term ascii */
/usr/include/linux/rds.h:122:2: error: unknown type name 'uint8_t'
uint8_t flags;
/usr/include/linux/rds.h:129:2: error: unknown type name 'uint64_t'
uint64_t seq;
/usr/include/linux/rds.h:130:2: error: unknown type name 'uint32_t'
uint32_t len;
/usr/include/linux/rds.h:135:2: error: unknown type name 'uint8_t'
uint8_t flags;
/usr/include/linux/rds.h:139:2: error: unknown type name 'uint32_t'
uint32_t sndbuf;
/usr/include/linux/rds.h:144:2: error: unknown type name 'uint32_t'
uint32_t rcvbuf;
/usr/include/linux/rds.h:145:2: error: unknown type name 'uint64_t'
uint64_t inum;
/usr/include/linux/rds.h:153:2: error: unknown type name 'uint64_t'
uint64_t hdr_rem;
/usr/include/linux/rds.h:154:2: error: unknown type name 'uint64_t'
uint64_t data_rem;
/usr/include/linux/rds.h:155:2: error: unknown type name 'uint32_t'
uint32_t last_sent_nxt;
/usr/include/linux/rds.h:156:2: error: unknown type name 'uint32_t'
uint32_t last_expected_una;
/usr/include/linux/rds.h:157:2: error: unknown type name 'uint32_t'
uint32_t last_seen_una;
/usr/include/linux/rds.h:164:2: error: unknown type name 'uint8_t'
uint8_t src_gid[RDS_IB_GID_LEN];
/usr/include/linux/rds.h:165:2: error: unknown type name 'uint8_t'
uint8_t dst_gid[RDS_IB_GID_LEN];
/usr/include/linux/rds.h:167:2: error: unknown type name 'uint32_t'
uint32_t max_send_wr;
/usr/include/linux/rds.h:168:2: error: unknown type name 'uint32_t'
uint32_t max_recv_wr;
/usr/include/linux/rds.h:169:2: error: unknown type name 'uint32_t'
uint32_t max_send_sge;
/usr/include/linux/rds.h:170:2: error: unknown type name 'uint32_t'
uint32_t rdma_mr_max;
/usr/include/linux/rds.h:171:2: error: unknown type name 'uint32_t'
uint32_t rdma_mr_size;
/usr/include/linux/rds.h:212:9: error: unknown type name 'uint64_t'
typedef uint64_t rds_rdma_cookie_t;
/usr/include/linux/rds.h:215:2: error: unknown type name 'uint64_t'
uint64_t addr;
/usr/include/linux/rds.h:216:2: error: unknown type name 'uint64_t'
uint64_t bytes;
/usr/include/linux/rds.h:221:2: error: unknown type name 'uint64_t'
uint64_t cookie_addr;
/usr/include/linux/rds.h:222:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:228:2: error: unknown type name 'uint64_t'
uint64_t cookie_addr;
/usr/include/linux/rds.h:229:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:234:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:240:2: error: unknown type name 'uint64_t'
uint64_t local_vec_addr;
/usr/include/linux/rds.h:241:2: error: unknown type name 'uint64_t'
uint64_t nr_local;
/usr/include/linux/rds.h:242:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:243:2: error: unknown type name 'uint64_t'
uint64_t user_token;
/usr/include/linux/rds.h:248:2: error: unknown type name 'uint64_t'
uint64_t local_addr;
/usr/include/linux/rds.h:249:2: error: unknown type name 'uint64_t'
uint64_t remote_addr;
/usr/include/linux/rds.h:252:4: error: unknown type name 'uint64_t'
uint64_t compare;
/usr/include/linux/rds.h:253:4: error: unknown type name 'uint64_t'
uint64_t swap;
/usr/include/linux/rds.h:256:4: error: unknown type name 'uint64_t'
uint64_t add;
/usr/include/linux/rds.h:259:4: error: unknown type name 'uint64_t'
uint64_t compare;
/usr/include/linux/rds.h:260:4: error: unknown type name 'uint64_t'
uint64_t swap;
/usr/include/linux/rds.h:261:4: error: unknown type name 'uint64_t'
uint64_t compare_mask;
/usr/include/linux/rds.h:262:4: error: unknown type name 'uint64_t'
uint64_t swap_mask;
/usr/include/linux/rds.h:265:4: error: unknown type name 'uint64_t'
uint64_t add;
/usr/include/linux/rds.h:266:4: error: unknown type name 'uint64_t'
uint64_t nocarry_mask;
/usr/include/linux/rds.h:269:2: error: unknown type name 'uint64_t'
uint64_t flags;
/usr/include/linux/rds.h:270:2: error: unknown type name 'uint64_t'
uint64_t user_token;
/usr/include/linux/rds.h:274:2: error: unknown type name 'uint64_t'
uint64_t user_token;
/usr/include/linux/rds.h:275:2: error: unknown type name 'int32_t'
int32_t status;

Signed-off-by: Dmitry V. Levin
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Dmitry V. Levin
2017-11-21 16:23:28 +0800
3323d0761 uapi: fix linux/rds.h userspace compilation error ... Browse Code »

[ Upstream commit 1786dbf3702e33ce3afd2d3dbe630bd04b1d2e58 ]

On the kernel side, sockaddr_storage is #define'd to
__kernel_sockaddr_storage. Replacing struct sockaddr_storage with
struct __kernel_sockaddr_storage defined by fixes
the following linux/rds.h userspace compilation error:

/usr/include/linux/rds.h:226:26: error: field 'dest_addr' has incomplete type
struct sockaddr_storage dest_addr;

Signed-off-by: Dmitry V. Levin
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Dmitry V. Levin
2017-11-21 16:23:28 +0800
3f0cc5422 Revert "uapi: fix linux/rds.h userspace compilation errors" ... Browse Code »

This reverts commit ad50561ba7a664bc581826c9d57d137fcf17bfa5.

There was a mixup with the commit message for two upstream commit
that have the same subject line.

This revert will be followed by the two commits with proper commit
messages.

Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Sasha Levin
2017-11-21 16:23:28 +0800
c26fa1306 ARM: dts: Fix omap3 off mode pull defines ... Browse Code »

[ Upstream commit d97556c8012015901a3ce77f46960078139cd79d ]

We need to also have OFFPULLUDENABLE bit set to use the off mode pull values.
Otherwise the line is pulled down internally if no external pull exists.

This is has some documentation at:

http://processors.wiki.ti.com/index.php/Optimizing_OMAP35x_and_AM/DM37x_OFF_mode_PAD_configuration

Note that the value is still glitchy during off mode transitions as documented
in spz319f.pdf "Advisory 1.45". It's best to use external pulls instead of
relying on the internal ones for off mode and even then anything pulled up
will get driven down momentarily on off mode restore for GPIO banks other
than bank1.

Signed-off-by: Tony Lindgren
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

Tony Lindgren
2017-11-21 16:23:22 +0800

18 Nov, 2017

6 commits

ff4927181 target/iscsi: Fix iSCSI task reassignment handling ... Browse Code »

commit 59b6986dbfcdab96a971f9663221849de79a7556 upstream.

Allocate a task management request structure for all task management
requests, including task reassignment. This change avoids that the
se_tmr->response assignment dereferences an uninitialized se_tmr
pointer.

Reported-by: Moshe David
Signed-off-by: Bart Van Assche
Reviewed-by: Hannes Reinecke
Reviewed-by: Christoph Hellwig
Cc: Moshe David
Signed-off-by: Nicholas Bellinger
Signed-off-by: Greg Kroah-Hartman

Bart Van Assche
2017-11-18 18:22:25 +0800
a23349bb9 netfilter: nat: Revert "netfilter: nat: convert nat bysrc hash to rhashtable" ... Browse Code »

commit e1bf1687740ce1a3598a1c5e452b852ff2190682 upstream.

This reverts commit 870190a9ec9075205c0fa795a09fa931694a3ff1.

It was not a good idea. The custom hash table was a much better
fit for this purpose.

A fast lookup is not essential, in fact for most cases there is no lookup
at all because original tuple is not taken and can be used as-is.
What needs to be fast is insertion and deletion.

rhlist removal however requires a rhlist walk.
We can have thousands of entries in such a list if source port/addresses
are reused for multiple flows, if this happens removal requests are so
expensive that deletions of a few thousand flows can take several
seconds(!).

The advantages that we got from rhashtable are:
1) table auto-sizing
2) multiple locks

1) would be nice to have, but it is not essential as we have at
most one lookup per new flow, so even a million flows in the bysource
table are not a problem compared to current deletion cost.
2) is easy to add to custom hash table.

I tried to add hlist_node to rhlist to speed up rhltable_remove but this
isn't doable without changing semantics. rhltable_remove_fast will
check that the to-be-deleted object is part of the table and that
requires a list walk that we want to avoid.

Furthermore, using hlist_node increases size of struct rhlist_head, which
in turn increases nf_conn size.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=196821
Reported-by: Ivan Babrou
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
Signed-off-by: Greg Kroah-Hartman

Florian Westphal
2017-11-18 18:22:24 +0800
2af59c655 tcp/dccp: fix other lockdep splats accessing ireq_opt ... Browse Code »

[ Upstream commit 06f877d613be3621604c2520ec0351d9fbdca15f ]

In my first attempt to fix the lockdep splat, I forgot we could
enter inet_csk_route_req() with a freshly allocated request socket,
for which refcount has not yet been elevated, due to complex
SLAB_TYPESAFE_BY_RCU rules.

We either are in rcu_read_lock() section _or_ we own a refcount on the
request.

Correct RCU verb to use here is rcu_dereference_check(), although it is
not possible to prove we actually own a reference on a shared
refcount :/

In v2, I added ireq_opt_deref() helper and use in three places, to fix other
possible splats.

[ 49.844590] lockdep_rcu_suspicious+0xea/0xf3
[ 49.846487] inet_csk_route_req+0x53/0x14d
[ 49.848334] tcp_v4_route_req+0xe/0x10
[ 49.850174] tcp_conn_request+0x31c/0x6a0
[ 49.851992] ? __lock_acquire+0x614/0x822
[ 49.854015] tcp_v4_conn_request+0x5a/0x79
[ 49.855957] ? tcp_v4_conn_request+0x5a/0x79
[ 49.858052] tcp_rcv_state_process+0x98/0xdcc
[ 49.859990] ? sk_filter_trim_cap+0x2f6/0x307
[ 49.862085] tcp_v4_do_rcv+0xfc/0x145
[ 49.864055] ? tcp_v4_do_rcv+0xfc/0x145
[ 49.866173] tcp_v4_rcv+0x5ab/0xaf9
[ 49.868029] ip_local_deliver_finish+0x1af/0x2e7
[ 49.870064] ip_local_deliver+0x1b2/0x1c5
[ 49.871775] ? inet_del_offload+0x45/0x45
[ 49.873916] ip_rcv_finish+0x3f7/0x471
[ 49.875476] ip_rcv+0x3f1/0x42f
[ 49.876991] ? ip_local_deliver_finish+0x2e7/0x2e7
[ 49.878791] __netif_receive_skb_core+0x6d3/0x950
[ 49.880701] ? process_backlog+0x7e/0x216
[ 49.882589] __netif_receive_skb+0x1d/0x5e
[ 49.884122] process_backlog+0x10c/0x216
[ 49.885812] net_rx_action+0x147/0x3df

Fixes: a6ca7abe53633 ("tcp/dccp: fix lockdep splat in inet_csk_route_req()")
Fixes: c92e8c02fe66 ("tcp/dccp: fix ireq->opt races")
Signed-off-by: Eric Dumazet
Reported-by: kernel test robot
Reported-by: Maciej Żenczykowski
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2017-11-18 18:22:22 +0800
2ffd26133 tcp/dccp: fix ireq->opt races ... Browse Code »

[ Upstream commit c92e8c02fe664155ac4234516e32544bec0f113d ]

syzkaller found another bug in DCCP/TCP stacks [1]

For the reasons explained in commit ce1050089c96 ("tcp/dccp: fix
ireq->pktopts race"), we need to make sure we do not access
ireq->opt unless we own the request sock.

Note the opt field is renamed to ireq_opt to ease grep games.

[1]
BUG: KASAN: use-after-free in ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474
Read of size 1 at addr ffff8801c951039c by task syz-executor5/3295

CPU: 1 PID: 3295 Comm: syz-executor5 Not tainted 4.14.0-rc4+ #80
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:52
print_address_description+0x73/0x250 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x25b/0x340 mm/kasan/report.c:409
__asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427
ip_queue_xmit+0x1687/0x18e0 net/ipv4/ip_output.c:474
tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1135
tcp_send_ack.part.37+0x3bb/0x650 net/ipv4/tcp_output.c:3587
tcp_send_ack+0x49/0x60 net/ipv4/tcp_output.c:3557
__tcp_ack_snd_check+0x2c6/0x4b0 net/ipv4/tcp_input.c:5072
tcp_ack_snd_check net/ipv4/tcp_input.c:5085 [inline]
tcp_rcv_state_process+0x2eff/0x4850 net/ipv4/tcp_input.c:6071
tcp_child_process+0x342/0x990 net/ipv4/tcp_minisocks.c:816
tcp_v4_rcv+0x1827/0x2f80 net/ipv4/tcp_ipv4.c:1682
ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
NF_HOOK include/linux/netfilter.h:249 [inline]
ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
dst_input include/net/dst.h:464 [inline]
ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
NF_HOOK include/linux/netfilter.h:249 [inline]
ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
__netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
__netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
netif_receive_skb+0xae/0x390 net/core/dev.c:4611
tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
call_write_iter include/linux/fs.h:1770 [inline]
new_sync_write fs/read_write.c:468 [inline]
__vfs_write+0x68a/0x970 fs/read_write.c:481
vfs_write+0x18f/0x510 fs/read_write.c:543
SYSC_write fs/read_write.c:588 [inline]
SyS_write+0xef/0x220 fs/read_write.c:580
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x40c341
RSP: 002b:00007f469523ec10 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000718000 RCX: 000000000040c341
RDX: 0000000000000037 RSI: 0000000020004000 RDI: 0000000000000015
RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000000f4240 R11: 0000000000000293 R12: 00000000004b7fd1
R13: 00000000ffffffff R14: 0000000020000000 R15: 0000000000025000

Allocated by task 3295:
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
save_stack+0x43/0xd0 mm/kasan/kasan.c:447
set_track mm/kasan/kasan.c:459 [inline]
kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
__do_kmalloc mm/slab.c:3725 [inline]
__kmalloc+0x162/0x760 mm/slab.c:3734
kmalloc include/linux/slab.h:498 [inline]
tcp_v4_save_options include/net/tcp.h:1962 [inline]
tcp_v4_init_req+0x2d3/0x3e0 net/ipv4/tcp_ipv4.c:1271
tcp_conn_request+0xf6d/0x3410 net/ipv4/tcp_input.c:6283
tcp_v4_conn_request+0x157/0x210 net/ipv4/tcp_ipv4.c:1313
tcp_rcv_state_process+0x8ea/0x4850 net/ipv4/tcp_input.c:5857
tcp_v4_do_rcv+0x55c/0x7d0 net/ipv4/tcp_ipv4.c:1482
tcp_v4_rcv+0x2d10/0x2f80 net/ipv4/tcp_ipv4.c:1711
ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
NF_HOOK include/linux/netfilter.h:249 [inline]
ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
dst_input include/net/dst.h:464 [inline]
ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
NF_HOOK include/linux/netfilter.h:249 [inline]
ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
__netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
__netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
netif_receive_skb+0xae/0x390 net/core/dev.c:4611
tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
call_write_iter include/linux/fs.h:1770 [inline]
new_sync_write fs/read_write.c:468 [inline]
__vfs_write+0x68a/0x970 fs/read_write.c:481
vfs_write+0x18f/0x510 fs/read_write.c:543
SYSC_write fs/read_write.c:588 [inline]
SyS_write+0xef/0x220 fs/read_write.c:580
entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 3306:
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
save_stack+0x43/0xd0 mm/kasan/kasan.c:447
set_track mm/kasan/kasan.c:459 [inline]
kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
__cache_free mm/slab.c:3503 [inline]
kfree+0xca/0x250 mm/slab.c:3820
inet_sock_destruct+0x59d/0x950 net/ipv4/af_inet.c:157
__sk_destruct+0xfd/0x910 net/core/sock.c:1560
sk_destruct+0x47/0x80 net/core/sock.c:1595
__sk_free+0x57/0x230 net/core/sock.c:1603
sk_free+0x2a/0x40 net/core/sock.c:1614
sock_put include/net/sock.h:1652 [inline]
inet_csk_complete_hashdance+0xd5/0xf0 net/ipv4/inet_connection_sock.c:959
tcp_check_req+0xf4d/0x1620 net/ipv4/tcp_minisocks.c:765
tcp_v4_rcv+0x17f6/0x2f80 net/ipv4/tcp_ipv4.c:1675
ip_local_deliver_finish+0x2e2/0xba0 net/ipv4/ip_input.c:216
NF_HOOK include/linux/netfilter.h:249 [inline]
ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257
dst_input include/net/dst.h:464 [inline]
ip_rcv_finish+0x887/0x19a0 net/ipv4/ip_input.c:397
NF_HOOK include/linux/netfilter.h:249 [inline]
ip_rcv+0xc3f/0x1820 net/ipv4/ip_input.c:493
__netif_receive_skb_core+0x1a3e/0x34b0 net/core/dev.c:4476
__netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4514
netif_receive_skb_internal+0x10b/0x670 net/core/dev.c:4587
netif_receive_skb+0xae/0x390 net/core/dev.c:4611
tun_rx_batched.isra.50+0x5ed/0x860 drivers/net/tun.c:1372
tun_get_user+0x249c/0x36d0 drivers/net/tun.c:1766
tun_chr_write_iter+0xbf/0x160 drivers/net/tun.c:1792
call_write_iter include/linux/fs.h:1770 [inline]
new_sync_write fs/read_write.c:468 [inline]
__vfs_write+0x68a/0x970 fs/read_write.c:481
vfs_write+0x18f/0x510 fs/read_write.c:543
SYSC_write fs/read_write.c:588 [inline]
SyS_write+0xef/0x220 fs/read_write.c:580
entry_SYSCALL_64_fastpath+0x1f/0xbe

Fixes: e994b2f0fb92 ("tcp: do not lock listener to process SYN packets")
Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2017-11-18 18:22:22 +0800
3e2ab0cee tun: call dev_get_valid_name() before register_netdevice() ... Browse Code »

[ Upstream commit 0ad646c81b2182f7fa67ec0c8c825e0ee165696d ]

register_netdevice() could fail early when we have an invalid
dev name, in which case ->ndo_uninit() is not called. For tun
device, this is a problem because a timer etc. are already
initialized and it expects ->ndo_uninit() to clean them up.

We could move these initializations into a ->ndo_init() so
that register_netdevice() knows better, however this is still
complicated due to the logic in tun_detach().

Therefore, I choose to just call dev_get_valid_name() before
register_netdevice(), which is quicker and much easier to audit.
And for this specific case, it is already enough.

Fixes: 96442e42429e ("tuntap: choose the txq based on rxq")
Reported-by: Dmitry Alexeev
Cc: Jason Wang
Cc: "Michael S. Tsirkin"
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Cong Wang
2017-11-18 18:22:21 +0800
e12c42c55 tcp: fix tcp_mtu_probe() vs highest_sack ... Browse Code »

[ Upstream commit 2b7cda9c35d3b940eb9ce74b30bbd5eb30db493d ]

Based on SNMP values provided by Roman, Yuchung made the observation
that some crashes in tcp_sacktag_walk() might be caused by MTU probing.

Looking at tcp_mtu_probe(), I found that when a new skb was placed
in front of the write queue, we were not updating tcp highest sack.

If one skb is freed because all its content was copied to the new skb
(for MTU probing), then tp->highest_sack could point to a now freed skb.

Bad things would then happen, including infinite loops.

This patch renames tcp_highest_sack_combine() and uses it
from tcp_mtu_probe() to fix the bug.

Note that I also removed one test against tp->sacked_out,
since we want to replace tp->highest_sack regardless of whatever
condition, since keeping a stale pointer to freed skb is a recipe
for disaster.

Fixes: a47e5a988a57 ("[TCP]: Convert highest_sack to sk_buff to allow direct access")
Signed-off-by: Eric Dumazet
Reported-by: Alexei Starovoitov
Reported-by: Roman Gushchin
Reported-by: Oleksandr Natalenko
Acked-by: Alexei Starovoitov
Acked-by: Neal Cardwell
Acked-by: Yuchung Cheng
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2017-11-18 18:22:21 +0800

15 Nov, 2017

2 commits

b35783871 ALSA: seq: Avoid invalid lockdep class warning ... Browse Code »

commit 3510c7aa069aa83a2de6dab2b41401a198317bdc upstream.

The recent fix for adding rwsem nesting annotation was using the given
"hop" argument as the lock subclass key. Although the idea itself
works, it may trigger a kernel warning like:
BUG: looking up invalid subclass: 8
....
since the lockdep has a smaller number of subclasses (8) than we
currently allow for the hops there (10).

The current definition is merely a sanity check for avoiding the too
deep delivery paths, and the 8 hops are already enough. So, as a
quick fix, just follow the max hops as same as the max lockdep
subclasses.

Fixes: 1f20f9ff57ca ("ALSA: seq: Fix nested rwsem annotation for lockdep splat")
Reported-by: syzbot
Signed-off-by: Takashi Iwai
Signed-off-by: Greg Kroah-Hartman

Takashi Iwai
2017-11-15 22:53:18 +0800
2715f6841 x86/uaccess, sched/preempt: Verify access_ok() context ... Browse Code »

commit 7c4788950ba5922fde976d80b72baf46f14dee8d upstream.

I recently encountered wreckage because access_ok() was used where it
should not be, add an explicit WARN when access_ok() is used wrongly.

Signed-off-by: Peter Zijlstra (Intel)
Cc: Andy Lutomirski
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

Peter Zijlstra
2017-11-15 22:53:17 +0800