16 Nov, 2017
1 commit
-
Pull networking updates from David Miller:
"Highlights:1) Maintain the TCP retransmit queue using an rbtree, with 1GB
windows at 100Gb this really has become necessary. From Eric
Dumazet.2) Multi-program support for cgroup+bpf, from Alexei Starovoitov.
3) Perform broadcast flooding in hardware in mv88e6xxx, from Andrew
Lunn.4) Add meter action support to openvswitch, from Andy Zhou.
5) Add a data meta pointer for BPF accessible packets, from Daniel
Borkmann.6) Namespace-ify almost all TCP sysctl knobs, from Eric Dumazet.
7) Turn on Broadcom Tags in b53 driver, from Florian Fainelli.
8) More work to move the RTNL mutex down, from Florian Westphal.
9) Add 'bpftool' utility, to help with bpf program introspection.
From Jakub Kicinski.10) Add new 'cpumap' type for XDP_REDIRECT action, from Jesper
Dangaard Brouer.11) Support 'blocks' of transformations in the packet scheduler which
can span multiple network devices, from Jiri Pirko.12) TC flower offload support in cxgb4, from Kumar Sanghvi.
13) Priority based stream scheduler for SCTP, from Marcelo Ricardo
Leitner.14) Thunderbolt networking driver, from Amir Levy and Mika Westerberg.
15) Add RED qdisc offloadability, and use it in mlxsw driver. From
Nogah Frankel.16) eBPF based device controller for cgroup v2, from Roman Gushchin.
17) Add some fundamental tracepoints for TCP, from Song Liu.
18) Remove garbage collection from ipv6 route layer, this is a
significant accomplishment. From Wei Wang.19) Add multicast route offload support to mlxsw, from Yotam Gigi"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2177 commits)
tcp: highest_sack fix
geneve: fix fill_info when link down
bpf: fix lockdep splat
net: cdc_ncm: GetNtbFormat endian fix
openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_start
netem: remove unnecessary 64 bit modulus
netem: use 64 bit divide by rate
tcp: Namespace-ify sysctl_tcp_default_congestion_control
net: Protect iterations over net::fib_notifier_ops in fib_seq_sum()
ipv6: set all.accept_dad to 0 by default
uapi: fix linux/tls.h userspace compilation error
usbnet: ipheth: prevent TX queue timeouts when device not ready
vhost_net: conditionally enable tx polling
uapi: fix linux/rxrpc.h userspace compilation errors
net: stmmac: fix LPI transitioning for dwmac4
atm: horizon: Fix irq release error
net-sysfs: trigger netlink notification on ifalias change via sysfs
openvswitch: Using kfree_rcu() to simplify the code
openvswitch: Make local function ovs_nsh_key_attr_size() static
openvswitch: Fix return value check in ovs_meter_cmd_features()
...
07 Nov, 2017
1 commit
-
Conflicts:
include/linux/compiler-clang.h
include/linux/compiler-gcc.h
include/linux/compiler-intel.h
include/uapi/linux/stddef.hSigned-off-by: Ingo Molnar
04 Nov, 2017
1 commit
-
Files removed in 'net-next' had their license header updated
in 'net'. We take the remove from 'net-next'.Signed-off-by: David S. Miller
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
25 Oct, 2017
2 commits
-
…READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
virtual patch
@ depends on patch @
expression E1, E2;
@@- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)@ depends on patch @
expression E;
@@- ACCESS_ONCE(E)
+ READ_ONCE(E)
----Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org> -
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.Cc: "David S. Miller"
Cc: Eric Dumazet
Cc: Hans Liljestrand
Cc: "Paul E. McKenney"
Cc: "Reshetova, Elena"
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook
Signed-off-by: David S. Miller
05 Jul, 2017
1 commit
-
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller
01 Jul, 2017
1 commit
-
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.This patch uses refcount_inc_not_zero() instead of
atomic_inc_not_zero_hint() due to absense of a _hint()
version of refcount API. If the hint() version must
be used, we might need to revisit API.Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller
27 May, 2017
1 commit
-
There is a race condition in llc_ui_bind if two or more processes/threads
try to bind a same socket.If more processes/threads bind a same socket success that will lead to
two problems, one is this action is not what we expected, another is
will lead to kernel in unstable status or oops(in my simple test case,
cause llc2.ko can't unload).The current code is test SOCK_ZAPPED bit to avoid a process to
bind a same socket twice but that is can't avoid more processes/threads
try to bind a same socket at the same time.So, add lock_sock in llc_ui_bind like others, such as llc_ui_connect.
Signed-off-by: Lin Zhang
Signed-off-by: David S. Miller
23 Apr, 2017
1 commit
-
…k/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:
- Documentation updates.
- Miscellaneous fixes.
- Parallelize SRCU callback handling (plus overlapping patches).
Signed-off-by: Ingo Molnar <mingo@kernel.org>
19 Apr, 2017
1 commit
-
A group of Linux kernel hackers reported chasing a bug that resulted
from their assumption that SLAB_DESTROY_BY_RCU provided an existence
guarantee, that is, that no block from such a slab would be reallocated
during an RCU read-side critical section. Of course, that is not the
case. Instead, SLAB_DESTROY_BY_RCU only prevents freeing of an entire
slab of blocks.However, there is a phrase for this, namely "type safety". This commit
therefore renames SLAB_DESTROY_BY_RCU to SLAB_TYPESAFE_BY_RCU in order
to avoid future instances of this sort of confusion.Signed-off-by: Paul E. McKenney
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Cc: Andrew Morton
Cc:
Acked-by: Johannes Weiner
Acked-by: Vlastimil Babka
[ paulmck: Add comments mentioning the old name, as requested by Eric
Dumazet, in order to help people familiar with the old name find
the new one. ]
Acked-by: David Rientjes
10 Mar, 2017
1 commit
-
Lockdep issues a circular dependency warning when AFS issues an operation
through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.The theory lockdep comes up with is as follows:
(1) If the pagefault handler decides it needs to read pages from AFS, it
calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
creating a call requires the socket lock:mmap_sem must be taken before sk_lock-AF_RXRPC
(2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind()
binds the underlying UDP socket whilst holding its socket lock.
inet_bind() takes its own socket lock:sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET
(3) Reading from a TCP socket into a userspace buffer might cause a fault
and thus cause the kernel to take the mmap_sem, but the TCP socket is
locked whilst doing this:sk_lock-AF_INET must be taken before mmap_sem
However, lockdep's theory is wrong in this instance because it deals only
with lock classes and not individual locks. The AF_INET lock in (2) isn't
really equivalent to the AF_INET lock in (3) as the former deals with a
socket entirely internal to the kernel that never sees userspace. This is
a limitation in the design of lockdep.Fix the general case by:
(1) Double up all the locking keys used in sockets so that one set are
used if the socket is created by userspace and the other set is used
if the socket is created by the kernel.(2) Store the kern parameter passed to sk_alloc() in a variable in the
sock struct (sk_kern_sock). This informs sock_lock_init(),
sock_init_data() and sk_clone_lock() as to the lock keys to be used.Note that the child created by sk_clone_lock() inherits the parent's
kern setting.(3) Add a 'kern' parameter to ->accept() that is analogous to the one
passed in to ->create() that distinguishes whether kernel_accept() or
sys_accept4() was the caller and can be passed to sk_alloc().Note that a lot of accept functions merely dequeue an already
allocated socket. I haven't touched these as the new socket already
exists before we get the parameter.Note also that there are a couple of places where I've made the accepted
socket unconditionally kernel-based:irda_accept()
rds_rcp_accept_one()
tcp_accept_from_sock()because they follow a sock_create_kern() and accept off of that.
Whilst creating this, I noticed that lustre and ocfs don't create sockets
through sock_create_kern() and thus they aren't marked as for-kernel,
though they appear to be internal. I wonder if these should do that so
that they use the new set of lock keys.Signed-off-by: David Howells
Signed-off-by: David S. Miller
02 Mar, 2017
1 commit
-
…hed.h> into <linux/sched/signal.h>
Fix up affected files that include this signal functionality via sched.h.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
13 Feb, 2017
1 commit
-
It seems nobody used LLC since linux-3.12.
Fortunately fuzzers like syzkaller still know how to run this code,
otherwise it would be no fun.Setting skb->sk without skb->destructor leads to all kinds of
bugs, we now prefer to be very strict about it.Ideally here we would use skb_set_owner() but this helper does not exist yet,
only CAN seems to have a private helper for that.Fixes: 376c7311bdb6 ("net: add a temporary sanity check in skb_orphan()")
Signed-off-by: Eric Dumazet
Reported-by: Andrey Konovalov
Signed-off-by: David S. Miller
15 Nov, 2016
1 commit
-
Similar to commit 14135f30e33c ("inet: fix sleeping inside inet_wait_for_connect()"),
sk_wait_event() needs to fix too, because release_sock() is blocking,
it changes the process state back to running after sleep, which breaks
the previous prepare_to_wait().Switch to the new wait API.
Cc: Eric Dumazet
Cc: Peter Zijlstra
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
17 Sep, 2016
1 commit
-
(As asked by Dave in Februrary)
Signed-off-by: Alan Cox
Signed-off-by: David S. Miller
10 May, 2016
1 commit
-
In netdevice.h we removed the structure in net-next that is being
changes in 'net'. In macsec.c and rtnetlink.c we have overlaps
between fixes in 'net' and the u64 attribute changes in 'net-next'.The mlx5 conflicts have to do with vxlan support dependencies.
Signed-off-by: David S. Miller
05 May, 2016
1 commit
-
The stack object “info” has a total size of 12 bytes. Its last byte
is padding which is not initialized and leaked via “put_cmsg”.Signed-off-by: Kangjie Lu
Signed-off-by: David S. Miller
14 Apr, 2016
1 commit
-
sock_owned_by_user should not be used without socket lock held. It seems
to be a common practice to check .owned before lock reclassification, so
provide a little help to abstract this check away.Cc: linux-cifs@vger.kernel.org
Cc: linux-bluetooth@vger.kernel.org
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
18 Feb, 2016
1 commit
-
The timeout is a long, we return it truncated if it is huge. Basically
harmless as the only caller does a boolean check, but tidy it up anyway.(64bit build tested this time. Thank you 0day)
Signed-off-by: Alan Cox
Signed-off-by: David S. Miller
27 Jul, 2015
1 commit
-
Currently, tcp_recvmsg enters a busy loop in sk_wait_data if called
with flags = MSG_WAITALL | MSG_PEEK.sk_wait_data waits for sk_receive_queue not empty, but in this case,
the receive queue is not empty, but does not contain any skb that we
can use.Add a "last skb seen on receive queue" argument to sk_wait_data, so
that it sleeps until the receive queue has new skbs.Link: https://bugzilla.kernel.org/show_bug.cgi?id=99461
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=18493
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1205258
Reported-by: Enrico Scholz
Reported-by: Dan Searle
Signed-off-by: Sabrina Dubroca
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
11 May, 2015
1 commit
-
In preparation for changing how struct net is refcounted
on kernel sockets pass the knowledge that we are creating
a kernel socket from sock_create_kern through to sk_alloc.Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
03 Mar, 2015
1 commit
-
After TIPC doesn't depend on iocb argument in its internal
implementations of sendmsg() and recvmsg() hooks defined in proto
structure, no any user is using iocb argument in them at all now.
Then we can drop the redundant iocb argument completely from kinds of
implementations of both sendmsg() and recvmsg() in the entire
networking stack.Cc: Christoph Hellwig
Suggested-by: Al Viro
Signed-off-by: Ying Xue
Signed-off-by: David S. Miller
25 Jan, 2015
1 commit
-
The timeout entries are sizeof(int) rather than sizeof(long), which
means that when they were getting read we'd also leak kernel memory
to userspace along with the timeout values.Signed-off-by: Sasha Levin
Signed-off-by: David S. Miller
11 Dec, 2014
3 commits
-
It's better when function pointer arrays aren't modifiable.
Net change:
$ size net/llc/built-in.o.*
text data bss dec hex filename
61193 12758 1344 75295 1261f net/llc/built-in.o.new
47113 27030 1344 75487 126df net/llc/built-in.o.oldSigned-off-by: Joe Perches
Signed-off-by: David S. Miller -
It's better when function pointer arrays aren't modifiable.
Net change from original:
$ size net/llc/built-in.o.*
text data bss dec hex filename
61065 12886 1344 75295 1261f net/llc/built-in.o.new
47113 27030 1344 75487 126df net/llc/built-in.o.oldSigned-off-by: Joe Perches
Signed-off-by: David S. Miller -
It's better when function pointer arrays aren't modifiable.
Signed-off-by: Joe Perches
Signed-off-by: David S. Miller
24 Nov, 2014
1 commit
-
Signed-off-by: Al Viro
06 Nov, 2014
1 commit
-
This encapsulates all of the skb_copy_datagram_iovec() callers
with call argument signature "skb, offset, msghdr->msg_iov, length".When we move to iov_iters in the networking, the iov_iter object will
sit in the msghdr.Having a helper like this means there will be less places to touch
during that transformation.Based upon descriptions and patch from Al Viro.
Signed-off-by: David S. Miller
25 Oct, 2014
1 commit
-
Signed-off-by: Fabian Frederick
Signed-off-by: David S. Miller
28 Sep, 2014
1 commit
-
Per commit "77873803363c net_dma: mark broken" net_dma is no longer used
and there is no plan to fix it.This is the mechanical removal of bits in CONFIG_NET_DMA ifdef guards.
Reverting the remainder of the net_dma induced changes is deferred to
subsequent patches.Marked for stable due to Roman's report of a memory leak in
dma_pin_iovec_pages():https://lkml.org/lkml/2014/9/3/177
Cc: Dave Jiang
Cc: Vinod Koul
Cc: David Whipple
Cc: Alexander Duyck
Cc:
Reported-by: Roman Gushchin
Acked-by: David S. Miller
Signed-off-by: Dan Williams
29 Jan, 2014
1 commit
-
Sending malformed llc packets triggers this spew, which seems excessive.
WARNING: CPU: 1 PID: 6917 at net/llc/llc_output.c:46 llc_mac_hdr_init+0x85/0x90 [llc]()
device type not supported: 0
CPU: 1 PID: 6917 Comm: trinity-c1 Not tainted 3.13.0+ #95
0000000000000009 00000000007e257d ffff88009232fbe8 ffffffffac737325
ffff88009232fc30 ffff88009232fc20 ffffffffac06d28d ffff88020e07f180
ffff88009232fec0 00000000000000c8 0000000000000000 ffff88009232fe70
Call Trace:
[] dump_stack+0x4e/0x7a
[] warn_slowpath_common+0x7d/0xa0
[] warn_slowpath_fmt+0x5c/0x80
[] llc_mac_hdr_init+0x85/0x90 [llc]
[] llc_build_and_send_ui_pkt+0x79/0x90 [llc]
[] llc_ui_sendmsg+0x23a/0x400 [llc2]
[] sock_sendmsg+0x9c/0xe0
[] ? might_fault+0x47/0x50
[] SYSC_sendto+0x121/0x1c0
[] ? syscall_trace_enter+0x207/0x270
[] SyS_sendto+0xe/0x10
[] tracesys+0xdd/0xe2Until 2009, this was a printk, when it was changed in
bf9ae5386bc: "llc: use dev_hard_header".Let userland figure out what -EINVAL means by itself.
Signed-off-by: Dave Jones
Signed-off-by: David S. Miller
19 Jan, 2014
1 commit
-
This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg
handler msg_name and msg_namelen logic").DECLARE_SOCKADDR validates that the structure we use for writing the
name information to is not larger than the buffer which is reserved
for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
consistently in sendmsg code paths.Signed-off-by: Steffen Hurrle
Suggested-by: Hannes Frederic Sowa
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
07 Jan, 2014
1 commit
-
Conflicts:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_pf.c
net/ipv6/ip6_tunnel.c
net/ipv6/ip6_vti.cipv6 tunnel statistic bug fixes conflicting with consolidation into
generic sw per-cpu net stats.qlogic conflict between queue counting bug fix and the addition
of multiple MAC address support.Signed-off-by: David S. Miller
04 Jan, 2014
1 commit
-
The llc_sap_list_lock does not need to be global, only acquired
in core.Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
03 Jan, 2014
1 commit
-
While commit 30a584d944fb fixes datagram interface in LLC, a use
after free bug has been introduced for SOCK_STREAM sockets that do
not make use of MSG_PEEK.The flow is as follow ...
if (!(flags & MSG_PEEK)) {
...
sk_eat_skb(sk, skb, false);
...
}
...
if (used + offset < skb->len)
continue;... where sk_eat_skb() calls __kfree_skb(). Therefore, cache
original length and work on skb_len to check partial reads.Fixes: 30a584d944fb ("[LLX]: SOCK_DGRAM interface fixes")
Signed-off-by: Daniel Borkmann
Cc: Stephen Hemminger
Cc: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller
28 Dec, 2013
1 commit
-
Signed-off-by: Weilong Chen
Signed-off-by: David S. Miller
21 Nov, 2013
1 commit
-
This patch now always passes msg->msg_namelen as 0. recvmsg handlers must
set msg_namelen to the proper size
Suggested-by: Eric Dumazet
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
04 Sep, 2013
1 commit
-
Convert the llc_ static inlines to the
equivalents from etherdevice.h and remove
the llc_ static inline functions.llc_mac_null -> is_zero_ether_addr
llc_mac_multicast -> is_multicast_ether_addr
llc_mac_match -> ether_addr_equalSigned-off-by: Joe Perches
Signed-off-by: David S. Miller
16 Aug, 2013
1 commit
-
UIDs are printed in the proc_fs as signed int, whereas
they are unsigned int.Signed-off-by: Francesco Fusco
Signed-off-by: David S. Miller