11 Aug, 2017
12 commits
-
Commit 1203c8e6fb0a ("fault-inject: simplify access check for fail-nth")
unintentionally broke a conditional statement in should_fail(). Any
faults are not injected in the task context by the change when the
systematic fault injection is not used.This change restores to the previous correct behaviour.
Link: http://lkml.kernel.org/r/1501633700-3488-1-git-send-email-akinobu.mita@gmail.com
Fixes: 1203c8e6fb0a ("fault-inject: simplify access check for fail-nth")
Signed-off-by: Akinobu Mita
Reported-by: Lu Fengqi
Tested-by: Lu Fengqi
Cc: Dmitry Vyukov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The break was in the wrong place so file system tests don't work as
intended, leaking memory at each test switch.[mcgrof@kernel.org: massaged commit subject, noted memory leak issue without the fix]
Link: http://lkml.kernel.org/r/20170802211450.27928-6-mcgrof@kernel.org
Fixes: 39258f448d71 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Dan Carpenter
Signed-off-by: Luis R. Rodriguez
Reported-by: David Binderman
Cc: Colin Ian King
Cc: Dmitry Torokhov
Cc: Eric W. Biederman
Cc: Jessica Yu
Cc: Josh Poimboeuf
Cc: Kees Cook
Cc: Michal Marek
Cc: Miroslav Benes
Cc: Petr Mladek
Cc: Rusty Russell
Cc: Shuah Khan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We accidentally just drop the lock twice instead of taking it and then
releasing it. This isn't a big issue unless you are adding more than
one device to test on, and the kmod.sh doesn't do that yet, however this
obviously is the correct thing to do.[mcgrof@kernel.org: massaged subject, explain what happens]
Link: http://lkml.kernel.org/r/20170802211450.27928-5-mcgrof@kernel.org
Fixes: 39258f448d71 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Dan Carpenter
Signed-off-by: Luis R. Rodriguez
Cc: Colin Ian King
Cc: David Binderman
Cc: Dmitry Torokhov
Cc: Eric W. Biederman
Cc: Jessica Yu
Cc: Josh Poimboeuf
Cc: Kees Cook
Cc: Michal Marek
Cc: Miroslav Benes
Cc: Petr Mladek
Cc: Rusty Russell
Cc: Shuah Khan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Parsing with kstrtol() enables values to be negative, and we failed to
check for negative values when parsing with test_dev_config_update_uint_sync()
or test_dev_config_update_uint_range().test_dev_config_update_uint_range() has a minimum check though so an
issue is not present there. test_dev_config_update_uint_sync() is only
used for the number of threads to use (config_num_threads_store()), and
indeed this would fail with an attempt for a large allocation.Although the issue is only present in practice with the first fix both
by using kstrtoul() instead of kstrtol().Link: http://lkml.kernel.org/r/20170802211450.27928-4-mcgrof@kernel.org
Fixes: 39258f448d71 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Luis R. Rodriguez
Reported-by: Dan Carpenter
Cc: Colin Ian King
Cc: David Binderman
Cc: Dmitry Torokhov
Cc: Eric W. Biederman
Cc: Jessica Yu
Cc: Josh Poimboeuf
Cc: Kees Cook
Cc: Michal Marek
Cc: Miroslav Benes
Cc: Petr Mladek
Cc: Rusty Russell
Cc: Shuah Khan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Trivial fix to spelling mistake in snprintf text
[mcgrof@kernel.org: massaged commit message]
Link: http://lkml.kernel.org/r/20170802211450.27928-3-mcgrof@kernel.org
Fixes: 39258f448d71 ("kmod: add test driver to stress test the module loader")
Signed-off-by: Colin Ian King
Signed-off-by: Luis R. Rodriguez
Cc: Dmitry Torokhov
Cc: Kees Cook
Cc: Jessica Yu
Cc: Rusty Russell
Cc: Michal Marek
Cc: Petr Mladek
Cc: Miroslav Benes
Cc: Josh Poimboeuf
Cc: Eric W. Biederman
Cc: Shuah Khan
Cc: Dan Carpenter
Cc: David Binderman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
huge_add_to_page_cache->add_to_page_cache implicitly unlocks the page
before returning in case of errors.The error returned was -EEXIST by running UFFDIO_COPY on a non-hole
offset of a VM_SHARED hugetlbfs mapping. It was an userland bug that
triggered it and the kernel must cope with it returning -EEXIST from
ioctl(UFFDIO_COPY) as expected.page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
kernel BUG at mm/filemap.c:964!
invalid opcode: 0000 [#1] SMP
CPU: 1 PID: 22582 Comm: qemu-system-x86 Not tainted 4.11.11-300.fc26.x86_64 #1
RIP: unlock_page+0x4a/0x50
Call Trace:
hugetlb_mcopy_atomic_pte+0xc0/0x320
mcopy_atomic+0x96f/0xbe0
userfaultfd_ioctl+0x218/0xe90
do_vfs_ioctl+0xa5/0x600
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x1a/0xa9Link: http://lkml.kernel.org/r/20170802165145.22628-2-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli
Tested-by: Maxime Coquelin
Reviewed-by: Mike Kravetz
Cc: "Dr. David Alan Gilbert"
Cc: Mike Rapoport
Cc: Alexey Perevalov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The RDMA subsystem can generate several thousand of these messages per
second eventually leading to a kernel crash. Ratelimit these messages
to prevent this crash.Doug said:
"I've been carrying a version of this for several kernel versions. I
don't remember when they started, but we have one (and only one) class
of machines: Dell PE R730xd, that generate these errors. When it
happens, without a rate limit, we get rcu timeouts and kernel oopses.
With the rate limit, we just get a lot of annoying kernel messages but
the machine continues on, recovers, and eventually the memory
operations all succeed"And:
"> Well... why are all these EBUSY's occurring? It sounds inefficient
> (at least) but if it is expected, normal and unavoidable then
> perhaps we should just remove that message altogether?I don't have an answer to that question. To be honest, I haven't
looked real hard. We never had this at all, then it started out of the
blue, but only on our Dell 730xd machines (and it hits all of them),
but no other classes or brands of machines. And we have our 730xd
machines loaded up with different brands and models of cards (for
instance one dedicated to mlx4 hardware, one for qib, one for mlx5, an
ocrdma/cxgb4 combo, etc), so the fact that it hit all of the machines
meant it wasn't tied to any particular brand/model of RDMA hardware.
To me, it always smelled of a hardware oddity specific to maybe the
CPUs or mainboard chipsets in these machines, so given that I'm not an
mm expert anyway, I never chased it down.A few other relevant details: it showed up somewhere around 4.8/4.9 or
thereabouts. It never happened before, but the prinkt has been there
since the 3.18 days, so possibly the test to trigger this message was
changed, or something else in the allocator changed such that the
situation started happening on these machines?And, like I said, it is specific to our 730xd machines (but they are
all identical, so that could mean it's something like their specific
ram configuration is causing the allocator to hit this on these
machine but not on other machines in the cluster, I don't want to say
it's necessarily the model of chipset or CPU, there are other bits of
identicalness between these machines)"Link: http://lkml.kernel.org/r/499c0f6cc10d6eb829a67f2a4d75b4228a9b356e.1501695897.git.jtoppins@redhat.com
Signed-off-by: Jonathan Toppins
Reviewed-by: Doug Ledford
Tested-by: Doug Ledford
Cc: Michal Hocko
Cc: Vlastimil Babka
Cc: Mel Gorman
Cc: Hillf Danton
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As Tetsuo points out:
"Commit 385386cff4c6 ("mm: vmstat: move slab statistics from zone to
node counters") broke "Slab:" field of /proc/meminfo . It shows nearly
0kB"In addition to /proc/meminfo, this problem also affects the slab
counters OOM/allocation failure info dumps, can cause early -ENOMEM from
overcommit protection, and miscalculate image size requirements during
suspend-to-disk.This is because the patch in question switched the slab counters from
the zone level to the node level, but forgot to update the global
accessor functions to read the aggregate node data instead of the
aggregate zone data.Use global_node_page_state() to access the global slab counters.
Fixes: 385386cff4c6 ("mm: vmstat: move slab statistics from zone to node counters")
Link: http://lkml.kernel.org/r/20170801134256.5400-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner
Reported-by: Tetsuo Handa
Acked-by: Michal Hocko
Cc: Josef Bacik
Cc: Vladimir Davydov
Cc: Stefan Agner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Pull networking fixes from David Miller:
1) Fix handling of initial STATE message in TIPC, from Jon Paul Maloy.
2) Fix stats handling in bcm_sysport_get_stats(), from Florian
Fainelli.3) Reject 16777215 VNI value in geneve_validate(), from Girish
Moodalbail.4) Fix initial IGMP sysctl setting regression, from Nikolay Borisov.
5) Once a UFO fragmented frame is treated as UFO, we should continue
doing so. Likewise once a frame has been segmented, we should
continue doing that and not try to convert it to a UFO frame. From
Willem de Bruijn.6) Test the AF_PACKET RX/TX ring pg_vec state under the socket lock to
prevent races. From Willem de Bruijn.* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
packet: fix tp_reserve race in packet_set_ring
udp: consistently apply ufo or fragmentation
net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target
igmp: Fix regression caused by igmp sysctl namespace code.
geneve: maximum value of VNI cannot be used
net: systemport: Fix software statistics for SYSTEMPORT Lite
tipc: remove premature ESTABLISH FSM event at link synchronization -
Updates to tp_reserve can race with reads of the field in
packet_set_ring. Avoid this by holding the socket lock during
updates in setsockopt PACKET_RESERVE.This bug was discovered by syzkaller.
Fixes: 8913336a7e8d ("packet: add PACKET_RESERVE sockopt")
Reported-by: Andrey Konovalov
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller -
When iteratively building a UDP datagram with MSG_MORE and that
datagram exceeds MTU, consistently choose UFO or fragmentation.Once skb_is_gso, always apply ufo. Conversely, once a datagram is
split across multiple skbs, do not consider ufo.Sendpage already maintains the first invariant, only add the second.
IPv6 does not have a sendpage implementation to modify.A gso skb must have a partial checksum, do not follow sk_no_check_tx
in udp_send_skb.Found by syzkaller.
Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Reported-by: Andrey Konovalov
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller -
Pull sparc updates from David Miller:
1) Recognize M8 cpus, just basic chip ID matching, from Allen Pais.
2) Prevent crashes when bringing up sunvdc virtual block devices in
some environments. From Jim Quigley.* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sunvdc: prevent sunvdc panic when mpgroup disk added to guest domain
sparc64: Increase max_phys_bits to 51 and VA bits to 53 for M8.
sparc64: recognize and support sparc M8 cpu type
sparc64: properly name the cpu constants
10 Aug, 2017
12 commits
-
Commit 55917a21d0cc ("netfilter: x_tables: add context to know if
extension runs from nft_compat") introduced a member nft_compat to
xt_tgchk_param structure.But it didn't set it's value for ipt_init_target. With unexpected
value in par.nft_compat, it may return unexpected result in some
target's checkentry.This patch is to set all it's fields as 0 and only initialize the
non-zero fields in ipt_init_target.v1->v2:
As Wang Cong's suggestion, fix it by setting all it's fields as
0 and only initializing the non-zero fields.Fixes: 55917a21d0cc ("netfilter: x_tables: add context to know if extension runs from nft_compat")
Suggested-by: Cong Wang
Signed-off-by: Xin Long
Signed-off-by: David S. Miller -
Commit dcd87999d415 ("igmp: net: Move igmp namespace init to correct file")
moved the igmp sysctls initialization from tcp_sk_init to igmp_net_init. This
function is only called as part of per-namespace initialization, only if
CONFIG_IP_MULTICAST is defined, otherwise igmp_mc_init() call in ip_init is
compiled out, casuing the igmp pernet ops to not be registerd and those sysctl
being left initialized with 0. However, there are certain functions, such as
ip_mc_join_group which are always compiled and make use of some of those
sysctls. Let's do a partial revert of the aforementioned commit and move the
sysctl initialization into inet_init_net, that way they will always have
sane values.Fixes: dcd87999d415 ("igmp: net: Move igmp namespace init to correct file")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196595
Reported-by: Gerardo Exequiel Pozzi
Signed-off-by: Nikolay Borisov
Signed-off-by: David S. Miller -
Geneve's Virtual Network Identifier (VNI) is 24 bit long, so the range
of values for it would be from 0 to 16777215 (2^24 -1). However, one
cannot create a geneve device with VNI set to 16777215. This patch fixes
this issue.Signed-off-by: Girish Moodalbail
Signed-off-by: David S. Miller -
With SYSTEMPORT Lite we have holes in our statistics layout that make us
skip over the hardware MIB counters, bcm_sysport_get_stats() was not
taking that into account resulting in reporting 0 for all SW-maintained
statistics, fix this by skipping accordingly.Fixes: 44a4524c54af ("net: systemport: Add support for SYSTEMPORT Lite")
Signed-off-by: Florian Fainelli
Signed-off-by: David S. Miller -
When a link between two nodes come up, both endpoints will initially
send out a STATE message to the peer, to increase the probability that
the peer endpoint also is up when the first traffic message arrives.
Thereafter, if the establishing link is the second link between two
nodes, this first "traffic" message is a TUNNEL_PROTOCOL/SYNCH message,
helping the peer to perform initial synchronization between the two
links.However, the initial STATE message may be lost, in which case the SYNCH
message will be the first one arriving at the peer. This should also
work, as the SYNCH message itself will be used to take up the link
endpoint before initializing synchronization.Unfortunately the code for this case is broken. Currently, the link is
brought up through a tipc_link_fsm_evt(ESTABLISHED) when a SYNCH
arrives, whereupon __tipc_node_link_up() is called to distribute the
link slots and take the link into traffic. But, __tipc_node_link_up() is
itself starting with a test for whether the link is up, and if true,
returns without action. Clearly, the tipc_link_fsm_evt(ESTABLISHED) call
is unnecessary, since tipc_node_link_up() is itself issuing such an
event, but also harmful, since it inhibits tipc_node_link_up() to
perform the test of its tasks, and the link endpoint in question hence
is never taken into traffic.This problem has been exposed when we set up dual links between pre-
and post-4.4 kernels, because the former ones don't send out the
initial STATE message described above.We fix this by removing the unnecessary event call.
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
Using mpgroup to define multiple paths for a virtual disk causes multiple
virtual-device-port ports to be created for that virtual device.
Each virtual-device-port port then gets a vdisk created for it by the Linux
sunvdc driver. As mpgroup is not supported by the Linux sunvdc driver it
cannot handle multiple ports for a single vdisk, leading to a kernel panic
at startup.This fix prevents more than one vdisk per virtual-device-port being created
until full virtual disk multipathing (mpgroup) support is implemented.Signed-off-by: Jim Quigley
Reviewed-by: Liam Merwick
Reviewed-by: Shannon Nelson
Reviewed-by: Alexandre Chartre
Reviewed-by: Aaron Young
Signed-off-by: David S. Miller -
Pull pin control fixes from Linus Walleij:
"These are the pin control fixes I have gathered since the return from
my vacation. They boiled in -next a while so let's get them in.Apart from the documentation build it is purely driver fixes. Which is
nice. The Intel fixes seem kind of important.- Fix the documentation build as the docs were moved
- Correct the UART pin list on the Intel Merrifield
- Fix pin assignment and number of pins on the Marvell Armada 37xx
pin controller- Cover the Setzer models in the Chromebook DMI quirk in the Intel
cheryview driver so they start working- Add the missing "sim" function to the sunxi driver
- Fix USB pin definitions on Uniphier Pro4
- Smatch fix for invalid reference in the zx pin control driver"
* tag 'pinctrl-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: generic: update references to Documentation/pinctrl.txt
pinctrl: intel: merrifield: Correct UART pin lists
pinctrl: armada-37xx: Fix number of pin in south bridge
pinctrl: armada-37xx: Fix the pin 23 on south bridge
pinctrl: cherryview: Add Setzer models to the Chromebook DMI quirk
pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver
pinctrl: uniphier: fix USB3 pin assignment for Pro4
pinctrl: zte: fix dereference of 'data' in zx_set_mux() -
Commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in
get_futex_key()") removed an unnecessary lock_page() with the
side-effect that page->mapping needed to be treated very carefully.Two defensive warnings were added in case any assumption was missed and
the first warning assumed a correct application would not alter a
mapping backing a futex key. Since merging, it has not triggered for
any unexpected case but Mark Rutland reported the following bug
triggering due to the first warning.kernel BUG at kernel/futex.c:679!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3
Hardware name: linux,dummy-virt (DT)
task: ffff80001e271780 task.stack: ffff000010908000
PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
pc : [] lr : [] pstate: 80000145The fact that it's a bug instead of a warning was due to an unrelated
arm64 problem, but the warning itself triggered because the underlying
mapping changed.This is an application issue but from a kernel perspective it's a
recoverable situation and the warning is unnecessary so this patch
removes the warning. The warning may potentially be triggered with the
following test program from Mark although it may be necessary to adjust
NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the
system.#include
#include
#include
#include
#include
#include
#include
#include#define NR_FUTEX_THREADS 16
pthread_t threads[NR_FUTEX_THREADS];void *mem;
#define MEM_PROT (PROT_READ | PROT_WRITE)
#define MEM_SIZE 65536static int futex_wrapper(int *uaddr, int op, int val,
const struct timespec *timeout,
int *uaddr2, int val3)
{
syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
}void *poll_futex(void *unused)
{
for (;;) {
futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1);
}
}int main(int argc, char *argv[])
{
int i;mem = mmap(NULL, MEM_SIZE, MEM_PROT,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);printf("Mapping @ %p\n", mem);
printf("Creating futex threads...\n");
for (i = 0; i < NR_FUTEX_THREADS; i++)
pthread_create(&threads[i], NULL, poll_futex, NULL);printf("Flipping mapping...\n");
for (;;) {
mmap(mem, MEM_SIZE, MEM_PROT,
MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0);
}return 0;
}Reported-and-tested-by: Mark Rutland
Signed-off-by: Mel Gorman
Acked-by: Peter Zijlstra (Intel)
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Linus Torvalds -
Pull i2c fixes from Wolfram Sang:
"The main thing is to allow empty id_tables for ACPI to make some
drivers get probed again. It looks a bit bigger than usual because it
needs some internal renaming, too.Other than that, there is a fix for broken DSTDs, a super simple
enablement for ARM MPS, and two documentation fixes which I'd like to
see in v4.13 already"* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: rephrase explanation of I2C_CLASS_DEPRECATED
i2c: allow i2c-versatile for ARM MPS platforms
i2c: designware: Some broken DSTDs use 1MiHz instead of 1MHz
i2c: designware: Print clock freq on invalid clock freq error
i2c: core: Allow empty id_table in ACPI case as well
i2c: mux: pinctrl: mention correct module name in Kconfig help text -
Pull block fixes from Jens Axboe:
"Three patches that should go into this release.Two of them are from Paolo and fix up some corner cases with BFQ, and
the last patch is from Ming and fixes up a potential usage count
imbalance regression due to the recent NOWAIT work"* 'for-linus' of git://git.kernel.dk/linux-block:
blk-mq: don't leak preempt counter/q_usage_counter when allocating rq failed
block, bfq: consider also in_service_entity to state whether an entity is active
block, bfq: reset in_service_entity if it becomes idle -
Pull crypto fixes from Herbert Xu:
"Fix two regressions in the inside-secure driver with respect to
hmac(sha1)"* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: inside-secure - fix the sha state length in hmac_sha1_setkey
crypto: inside-secure - fix invalidation check in hmac_sha1_setkey -
Pull networking fixes from David Miller:
"The pull requests are getting smaller, that's progress I suppose :-)1) Fix infinite loop in CIPSO option parsing, from Yujuan Qi.
2) Fix remote checksum handling in VXLAN and GUE tunneling drivers,
from Koichiro Den.3) Missing u64_stats_init() calls in several drivers, from Florian
Fainelli.4) TCP can set the congestion window to an invalid ssthresh value
after congestion window reductions, from Yuchung Cheng.5) Fix BPF jit branch generation on s390, from Daniel Borkmann.
6) Correct MIPS ebpf JIT merge, from David Daney.
7) Correct byte order test in BPF test_verifier.c, from Daniel
Borkmann.8) Fix various crashes and leaks in ASIX driver, from Dean Jenkins.
9) Handle SCTP checksums properly in mlx4 driver, from Davide
Caratti.10) We can potentially enter tcp_connect() with a cached route
already, due to fastopen, so we have to explicitly invalidate it.11) skb_warn_bad_offload() can bark in legitimate situations, fix from
Willem de Bruijn"* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
net: avoid skb_warn_bad_offload false positives on UFO
qmi_wwan: fix NULL deref on disconnect
ppp: fix xmit recursion detection on ppp channels
rds: Reintroduce statistics counting
tcp: fastopen: tcp_connect() must refresh the route
net: sched: set xt_tgchk_param par.net properly in ipt_init_target
net: dsa: mediatek: add adjust link support for user ports
net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets
qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'
hysdn: fix to a race condition in put_log_buffer
s390/qeth: fix L3 next-hop in xmit qeth hdr
asix: Fix small memory leak in ax88772_unbind()
asix: Ensure asix_rx_fixup_info members are all reset
asix: Add rx->ax_skb = NULL after usbnet_skb_return()
bpf: fix selftest/bpf/test_pkt_md_access on s390x
netvsc: fix race on sub channel creation
bpf: fix byte order test in test_verifier
xgene: Always get clk source, but ignore if it's missing for SGMII ports
MIPS: Add missing file for eBPF JIT.
bpf, s390: fix build for libbpf and selftest suite
...
09 Aug, 2017
16 commits
-
skb_warn_bad_offload triggers a warning when an skb enters the GSO
stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL
checksum offload set.Commit b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise")
observed that SKB_GSO_DODGY producers can trigger the check and
that passing those packets through the GSO handlers will fix it
up. But, the software UFO handler will set ip_summed to
CHECKSUM_NONE.When __skb_gso_segment is called from the receive path, this
triggers the warning again.Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On
Tx these two are equivalent. On Rx, this better matches the
skb state (checksum computed), as CHECKSUM_NONE here means no
checksum computed.See also this thread for context:
http://patchwork.ozlabs.org/patch/799015/Fixes: b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise")
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller -
qmi_wwan_disconnect is called twice when disconnecting devices with
separate control and data interfaces. The first invocation will set
the interface data to NULL for both interfaces to flag that the
disconnect has been handled. But the matching NULL check was left
out when qmi_wwan_disconnect was added, resulting in this oops:usb 2-1.4: USB disconnect, device number 4
qmi_wwan 2-1.4:1.6 wwp0s29u1u4i6: unregister 'qmi_wwan' usb-0000:00:1d.0-1.4, WWAN/QMI device
BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
IP: qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
PGD 0
P4D 0
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 2 PID: 33 Comm: kworker/2:1 Tainted: G E 4.12.3-nr44-normandy-r1500619820+ #1
Hardware name: LENOVO 4291LR7/4291LR7, BIOS CBET4000 4.6-810-g50522254fb 07/21/2017
Workqueue: usb_hub_wq hub_event [usbcore]
task: ffff8c882b716040 task.stack: ffffb8e800d84000
RIP: 0010:qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
RSP: 0018:ffffb8e800d87b38 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffff8c8824f3f1d0 RDI: ffff8c8824ef6400
RBP: ffff8c8824ef6400 R08: 0000000000000000 R09: 0000000000000000
R10: ffffb8e800d87780 R11: 0000000000000011 R12: ffffffffc07ea0e8
R13: ffff8c8824e2e000 R14: ffff8c8824e2e098 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8c8835300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000000000e0 CR3: 0000000229ca5000 CR4: 00000000000406e0
Call Trace:
? usb_unbind_interface+0x71/0x270 [usbcore]
? device_release_driver_internal+0x154/0x210
? qmi_wwan_unbind+0x6d/0xc0 [qmi_wwan]
? usbnet_disconnect+0x6c/0xf0 [usbnet]
? qmi_wwan_disconnect+0x87/0xc0 [qmi_wwan]
? usb_unbind_interface+0x71/0x270 [usbcore]
? device_release_driver_internal+0x154/0x210Reported-and-tested-by: Nathaniel Roach
Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support")
Cc: Daniele Palmas
Signed-off-by: Bjørn Mork
Signed-off-by: David S. Miller -
Commit e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp
devices") dropped the xmit_recursion counter incrementation in
ppp_channel_push() and relied on ppp_xmit_process() for this task.
But __ppp_channel_push() can also send packets directly (using the
.start_xmit() channel callback), in which case the xmit_recursion
counter isn't incremented anymore. If such packets get routed back to
the parent ppp unit, ppp_xmit_process() won't notice the recursion and
will call ppp_channel_push() on the same channel, effectively creating
the deadlock situation that the xmit_recursion mechanism was supposed
to prevent.This patch re-introduces the xmit_recursion counter incrementation in
ppp_channel_push(). Since the xmit_recursion variable is now part of
the parent ppp unit, incrementation is skipped if the channel doesn't
have any. This is fine because only packets routed through the parent
unit may enter the channel recursively.Finally, we have to ensure that pch->ppp is not going to be modified
while executing ppp_channel_push(). Instead of taking this lock only
while calling ppp_xmit_process(), we now have to hold it for the full
ppp_channel_push() execution. This respects the ppp locks ordering
which requires locking ->upl before ->downl.Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices")
Signed-off-by: Guillaume Nault
Signed-off-by: David S. Miller -
In commit 7e3f2952eeb1 ("rds: don't let RDS shutdown a connection
while senders are present"), refilling the receive queue was removed
from rds_ib_recv(), along with the increment of
s_ib_rx_refill_from_thread.Commit 73ce4317bf98 ("RDS: make sure we post recv buffers")
re-introduces filling the receive queue from rds_ib_recv(), but does
not add the statistics counter. rds_ib_recv() was later renamed to
rds_ib_recv_path().This commit reintroduces the statistics counting of
s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq.Signed-off-by: Håkon Bugge
Reviewed-by: Knut Omang
Reviewed-by: Wei Lin Guay
Reviewed-by: Shamir Rabinovitch
Acked-by: Santosh Shilimkar
Signed-off-by: David S. Miller -
With new TCP_FASTOPEN_CONNECT socket option, there is a possibility
to call tcp_connect() while socket sk_dst_cache is either NULL
or invalid.+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
+0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
+0 connect(4, ..., ...) = 0<< sk->sk_dst_cache becomes obsolete, or even set to NULL >>
+1 sendto(4, ..., 1000, MSG_FASTOPEN, ..., ...) = 1000
We need to refresh the route otherwise bad things can happen,
especially when syzkaller is running on the host :/Fixes: 19f6d3f3c8422 ("net/tcp-fastopen: Add new API support")
Reported-by: Dmitry Vyukov
Signed-off-by: Eric Dumazet
Cc: Wei Wang
Cc: Yuchung Cheng
Acked-by: Wei Wang
Acked-by: Yuchung Cheng
Signed-off-by: David S. Miller -
Now xt_tgchk_param par in ipt_init_target is a local varibale,
par.net is not initialized there. Later when xt_check_target
calls target's checkentry in which it may access par.net, it
would cause kernel panic.Jaroslav found this panic when running:
# ip link add TestIface type dummy
# tc qd add dev TestIface ingress handle ffff:
# tc filter add dev TestIface parent ffff: u32 match u32 0 0 \
action xt -j CONNMARK --set-mark 4This patch is to pass net param into ipt_init_target and set
par.net with it properly in there.v1->v2:
As Wang Cong pointed, I missed ipt_net_id != xt_net_id, so fix
it by also passing net_id to __tcf_ipt_init.
v2->v3:
Missed the fixes tag, so add it.Fixes: ecb2421b5ddf ("netfilter: add and use nf_ct_netns_get/put")
Reported-by: Jaroslav Aster
Signed-off-by: Xin Long
Acked-by: Jiri Pirko
Signed-off-by: David S. Miller -
Manually adjust the port settings of user ports once PHY polling has
completed. This patch extends the adjust_link callback to configure the
per port PMCR register, applying the proper values polled from the PHY.
Without this patch flow control was not always getting setup properly.Signed-off-by: Shashidhar Lakkavalli
Signed-off-by: Muciri Gatimu
Signed-off-by: John Crispin
Signed-off-by: David S. Miller -
if the NIC fails to validate the checksum on TCP/UDP, and validation of IP
checksum is successful, the driver subtracts the pseudo-header checksum
from the value obtained by the hardware and sets CHECKSUM_COMPLETE. Don't
do that if protocol is IPPROTO_SCTP, otherwise CRC32c validation fails.V2: don't test MLX4_CQE_STATUS_IPV6 if MLX4_CQE_STATUS_IPV4 is set
Reported-by: Shuang Li
Fixes: f8c6455bb04b ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE")
Signed-off-by: Davide Caratti
Acked-by: Saeed Mahameed
Signed-off-by: David S. Miller -
Pull rdma fixes from Doug Ledford:
"Third set of -rc fixes for 4.13 cycle- small set of miscellanous fixes
- a reasonably sizable set of IPoIB fixes that deal with multiple
long standing issues"* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/hns: checking for IS_ERR() instead of NULL
RDMA/mlx5: Fix existence check for extended address vector
IB/uverbs: Fix device cleanup
RDMA/uverbs: Prevent leak of reserved field
IB/core: Fix race condition in resolving IP to MAC
IB/ipoib: Notify on modify QP failure only when relevant
Revert "IB/core: Allow QP state transition from reset to error"
IB/ipoib: Remove double pointer assigning
IB/ipoib: Clean error paths in add port
IB/ipoib: Add get statistics support to SRIOV VF
IB/ipoib: Add multicast packets statistics
IB/ipoib: Set IPOIB_NEIGH_TBL_FLUSH after flushed completion initialization
IB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp
IB/ipoib: Make sure no in-flight joins while leaving that mcast
IB/ipoib: Use cancel_delayed_work_sync when needed
IB/ipoib: Fix race between light events and interface restart -
Allow any number of command line arguments to match either the
section header or the section contents and create new files.Create MAINTAINERS.new and SECTION.new.
This allows scripting of the movement of various sections from
MAINTAINERS.Signed-off-by: Joe Perches
Signed-off-by: Linus Torvalds -
Instead of reading STDIN and writing STDOUT, use specific filenames of
MAINTAINERS and MAINTAINERS.new.Use hash references instead of global hash %hash so future modifications
can read and write specific hashes to split up MAINTAINERS into multiple
files using a script.Signed-off-by: Joe Perches
Signed-off-by: Linus Torvalds -
Section [A-Z]: patterns are not currently in any required sorting order.
Add a specific sorting sequence to MAINTAINERS entries.
Sort F: and X: patterns in alphabetic order.The preferred section ordering is:
SECTION HEADER
M: Maintainers
R: Reviewers
P: Named persons without email addresses
L: Mailing list addresses
S: Status of this section (Supported, Maintained, Orphan, etc...)
W: Any relevant URLs
T: Source code control type (git, quilt, etc)
Q: Patchwork patch acceptance queue site
B: Bug tracking URIs
C: Chat URIs
F: Files with wildcard patterns (alphabetic ordered)
X: Excluded files with wildcard patterns (alphabetic ordered)
N: Files with regex patterns
K: Keyword regexes in source code for maintainership identificationMiscellaneous perl neatening:
- Rename %map to %hash, map has a different meaning in perl
- Avoid using \& and local variables for function indirection
- Use return for a little c like clarity
- Use c-like function call style instead of &functionSigned-off-by: Joe Perches
Signed-off-by: Linus Torvalds -
Allow for MAINTAINERS to become a directory and if it is,
read all the files in the directory for maintained sections.Optionally look for all files named MAINTAINERS in directories
excluding the .git directory by using --find-maintainer-files.This optional feature adds ~.3 seconds of CPU on an Intel
i5-6200 with an SSD.Miscellanea:
- Create a read_maintainer_file subroutine from the existing code
- Test only the existence of MAINTAINERS, not whether it's a fileSigned-off-by: Joe Perches
Signed-off-by: Linus Torvalds -
The openbmc mailing list is moderated for non-subscribers.
Signed-off-by: Randy Dunlap
Acked-by: Brendan Higgins
Cc: Benjamin Herrenschmidt
Cc: Joel Stanley
Signed-off-by: Linus Torvalds -
Fixes: f47e07bc5f1a5c48 ("Fix up MAINTAINERS file problems")
Cc: Joe Perches
Signed-off-by: Sedat Dilek
Signed-off-by: Linus Torvalds -
Pull SCSI fixes from James Bottomley:
"Two small fixes, one re-fix of a previous fix and five patches sorting
out hotplug in the bnx2X class of drivers. The latter is rather
involved, but necessary because these drivers have started dropping
lockdep recursion warnings on the hotplug lock because of its
conversion to a percpu rwsem"* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sg: only check for dxfer_len greater than 256M
scsi: aacraid: reading out of bounds
scsi: qedf: Limit number of CQs
scsi: bnx2i: Simplify cpu hotplug code
scsi: bnx2fc: Simplify CPU hotplug code
scsi: bnx2i: Prevent recursive cpuhotplug locking
scsi: bnx2fc: Prevent recursive cpuhotplug locking
scsi: bnx2fc: Plug CPU hotplug race