04 Nov, 2018
1 commit
-
[ Upstream commit b6168562c8ce2bd5a30e213021650422e08764dc ]
In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
statement to see whether it is necessary to pre-process the ethtool
structure, because, as mentioned in the comment, the structure
ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
is allocated through compat_alloc_user_space(). One thing to note here is
that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
partially determined by 'rule_cnt', which is actually acquired from the
user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
get_user(). After 'rxnfc' is allocated, the data in the original user-space
buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
including the 'rule_cnt' field. However, after this copy, no check is
re-enforced on 'rxnfc->rule_cnt'. So it is possible that a malicious user
race to change the value in the 'compat_rxnfc->rule_cnt' between these two
copies. Through this way, the attacker can bypass the previous check on
'rule_cnt' and inject malicious data. This can cause undefined behavior of
the kernel and introduce potential security risk.This patch avoids the above issue via copying the value acquired by
get_user() to 'rxnfc->rule_cn', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.Signed-off-by: Wenwen Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
06 Aug, 2018
1 commit
-
commit c8e8cd579bb4265651df8223730105341e61a2d1 upstream.
'call' is a user-controlled value, so sanitize the array index after the
bounds check to avoid speculating past the bounds of the 'nargs' array.Found with the help of Smatch:
net/socket.c:2508 __do_sys_socketcall() warn: potential spectre issue
'nargs' [r] (local cap)Cc: Josh Poimboeuf
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Cline
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
26 Jun, 2018
1 commit
-
[ Upstream commit 6d8c50dcb029872b298eea68cc6209c866fd3e14 ]
fchownat() doesn't even hold refcnt of fd until it figures out
fd is really needed (otherwise is ignored) and releases it after
it resolves the path. This means sock_close() could race with
sockfs_setattr(), which leads to a NULL pointer dereference
since typically we set sock->sk to NULL in ->release().As pointed out by Al, this is unique to sockfs. So we can fix this
in socket layer by acquiring inode_lock in sock_close() and
checking against NULL in sockfs_setattr().sock_release() is called in many places, only the sock_close()
path matters here. And fortunately, this should not affect normal
sock_close() as it is only called when the last fd refcnt is gone.
It only affects sock_close() with a parallel sockfs_setattr() in
progress, which is not common.Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Reported-by: shankarapailoor
Cc: Tetsuo Handa
Cc: Lorenzo Colitti
Cc: Al Viro
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
22 Feb, 2018
1 commit
-
commit 4950276672fce5c241857540f8561c440663673d upstream.
Patch series "kmemcheck: kill kmemcheck", v2.
As discussed at LSF/MM, kill kmemcheck.
KASan is a replacement that is able to work without the limitation of
kmemcheck (single CPU, slow). KASan is already upstream.We are also not aware of any users of kmemcheck (or users who don't
consider KASan as a suitable replacement).The only objection was that since KASAN wasn't supported by all GCC
versions provided by distros at that time we should hold off for 2
years, and try again.Now that 2 years have passed, and all distros provide gcc that supports
KASAN, kill kmemcheck again for the very same reasons.This patch (of 4):
Remove kmemcheck annotations, and calls to kmemcheck from the kernel.
[alexander.levin@verizon.com: correctly remove kmemcheck call from dma_map_sg_attrs]
Link: http://lkml.kernel.org/r/20171012192151.26531-1-alexander.levin@verizon.com
Link: http://lkml.kernel.org/r/20171007030159.22241-2-alexander.levin@verizon.com
Signed-off-by: Sasha Levin
Cc: Alexander Potapenko
Cc: Eric W. Biederman
Cc: Michal Hocko
Cc: Pekka Enberg
Cc: Steven Rostedt
Cc: Tim Hansen
Cc: Vegard Nossum
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
31 Jan, 2018
1 commit
-
[ upstream commit 290af86629b25ffd1ed6232c4e9107da031705cb ]
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715.
A quote from goolge project zero blog:
"At this point, it would normally be necessary to locate gadgets in
the host kernel code that can be used to actually leak data by reading
from an attacker-controlled location, shifting and masking the result
appropriately and then using the result of that as offset to an
attacker-controlled address for a load. But piecing gadgets together
and figuring out which ones work in a speculation context seems annoying.
So instead, we decided to use the eBPF interpreter, which is built into
the host kernel - while there is no legitimate way to invoke it from inside
a VM, the presence of the code in the host kernel's text section is sufficient
to make it usable for the attack, just like with ordinary ROP gadgets."To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
option that removes interpreter from the kernel in favor of JIT-only mode.
So far eBPF JIT is supported by:
x64, arm64, arm32, sparc64, s390, powerpc64, mips64The start of JITed program is randomized and code page is marked as read-only.
In addition "constant blinding" can be turned on with net.core.bpf_jit_hardenv2->v3:
- move __bpf_prog_ret0 under ifdef (Daniel)v1->v2:
- fix init order, test_bpf and cBPF (Daniel's feedback)
- fix offloaded bpf (Jakub's feedback)
- add 'return 0' dummy in case something can invoke prog->bpf_func
- retarget bpf tree. For bpf-next the patch would need one extra hunk.
It will be sent when the trees are merged back to net-nextConsidered doing:
int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
but it seems better to land the patch as-is and in bpf-next remove
bpf_jit_enable global variable from all JITs, consolidate in one place
and remove this jit_init() function.Signed-off-by: Alexei Starovoitov
Signed-off-by: Daniel Borkmann
Signed-off-by: Greg Kroah-Hartman
17 Aug, 2017
1 commit
-
A couple fixes to new skb_send_sock infrastructure. However, no users
currently exist for this code (adding user in next handful of patches)
so it should not be possible to trigger a panic with existing in-kernel
code.Fixes: 306b13eb3cf9 ("proto_ops: Add locked held versions of sendmsg and sendpage")
Signed-off-by: John Fastabend
Signed-off-by: David S. Miller
02 Aug, 2017
2 commits
-
Add new proto_ops sendmsg_locked and sendpage_locked that can be
called when the socket lock is already held. Correspondingly, add
kernel_sendmsg_locked and kernel_sendpage_locked as front end
functions.These functions will be used in zero proxy so that we can take
the socket lock in a ULP sendmsg/sendpage and then directly call the
backend transport proto_ops functions.Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller -
Two minor conflicts in virtio_net driver (bug fix overlapping addition
of a helper) and MAINTAINERS (new driver edit overlapping revamp of
PHY entry).Signed-off-by: David S. Miller
26 Jul, 2017
1 commit
-
The variable owned_by_user is always set, but only used
when kernel is configured with LOCKDEP enabled.Get rid of the warning by moving the code to put the call
to owned_by_user into the the rcu_protected call.Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
25 Jul, 2017
1 commit
-
The commit ffb07550c76f ("copy_msghdr_from_user(): get rid of
field-by-field copyin") introduce a new sparse warning:net/socket.c:1919:27: warning: incorrect type in assignment (different address spaces)
net/socket.c:1919:27: expected void *msg_control
net/socket.c:1919:27: got void [noderef] *[addressable] msg_controland a line above 80 chars, let's fix them
Fixes: ffb07550c76f ("copy_msghdr_from_user(): get rid of field-by-field copyin")
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
16 Jul, 2017
1 commit
-
Pull network field-by-field copy-in updates from Al Viro:
"This part of the misc compat queue was held back for review from
networking folks and since davem has jus ACKed those..."* 'misc.compat' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
get_compat_bpf_fprog(): don't copyin field-by-field
get_compat_msghdr(): get rid of field-by-field copyin
copy_msghdr_from_user(): get rid of field-by-field copyin
06 Jul, 2017
1 commit
-
Pull misc user access cleanups from Al Viro:
"The first pile is assorted getting rid of cargo-culted access_ok(),
cargo-culted set_fs() and field-by-field copyouts.The same description applies to a lot of stuff in other branches -
this is just the stuff that didn't fit into a more specific topical
branch"* 'work.misc-set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
Switch flock copyin/copyout primitives to copy_{from,to}_user()
fs/fcntl: return -ESRCH in f_setown when pid/pgid can't be found
fs/fcntl: f_setown, avoid undefined behaviour
fs/fcntl: f_setown, allow returning error
lpfc debugfs: get rid of pointless access_ok()
adb: get rid of pointless access_ok()
isdn: get rid of pointless access_ok()
compat statfs: switch to copy_to_user()
fs/locks: don't mess with the address limit in compat_fcntl64
nfsd_readlink(): switch to vfs_get_link()
drbd: ->sendpage() never needed set_fs()
fs/locks: pass kernel struct flock to fcntl_getlk/setlk
fs: locks: Fix some troubles at kernel-doc comments
05 Jul, 2017
1 commit
-
Signed-off-by: Al Viro
14 Jun, 2017
1 commit
-
Allow f_setown to return an error value. We will fail in the next patch
with EINVAL for bad input to f_setown, so tile the path for the later
patch.Signed-off-by: Jiri Slaby
Reviewed-by: Jeff Layton
Cc: Jeff Layton
Cc: "J. Bruce Fields"
Cc: Alexander Viro
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Jeff Layton
23 May, 2017
1 commit
-
This patch fixes a typo in sockfd_lookup() in net/socket.c.
Signed-off-by: Rami Rosen
Signed-off-by: David S. Miller
22 May, 2017
2 commits
-
Add SOF_TIMESTAMPING_OPT_TX_SWHW option to allow an outgoing packet to
be looped to the socket's error queue with a software timestamp even
when a hardware transmit timestamp is expected to be provided by the
driver.Applications using this option will receive two separate messages from
the error queue, one with a software timestamp and the other with a
hardware timestamp. As the hardware timestamp is saved to the shared skb
info, which may happen before the first message with software timestamp
is received by the application, the hardware timestamp is copied to the
SCM_TIMESTAMPING control message only when the skb has no software
timestamp or it is an incoming packet.While changing sw_tx_timestamp(), inline it in skb_tx_timestamp() as
there are no other users.CC: Richard Cochran
CC: Willem de Bruijn
Signed-off-by: Miroslav Lichvar
Acked-by: Willem de Bruijn
Signed-off-by: David S. Miller -
Add SOF_TIMESTAMPING_OPT_PKTINFO option to request a new control message
for incoming packets with hardware timestamps. It contains the index of
the real interface which received the packet and the length of the
packet at layer 2.The index is useful with bonding, bridges and other interfaces, where
IP_PKTINFO doesn't allow applications to determine which PHC made the
timestamp. With the L2 length (and link speed) it is possible to
transpose preamble timestamps to trailer timestamps, which are used in
the NTP protocol.While this information could be provided by two new socket options
independently from timestamping, it doesn't look like they would be very
useful. With this option any performance impact is limited to hardware
timestamping.Use dev_get_by_napi_id() to get the device and its index. On kernels
with disabled CONFIG_NET_RX_BUSY_POLL or drivers not using NAPI, a zero
index will be returned in the control message.CC: Richard Cochran
Acked-by: Willem de Bruijn
Signed-off-by: Miroslav Lichvar
Signed-off-by: David S. Miller
18 Apr, 2017
1 commit
-
The MTU overhead calculation in L2TP device set-up
merged via commit b784e7ebfce8cfb16c6f95e14e8532d0768ab7ff
needs to be adjusted to lock the tunnel socket while
referencing the sub-data structures to derive the
socket's IP overhead.Reported-by: Guillaume Nault
Tested-by: Guillaume Nault
Signed-off-by: R. Parameswaran
Signed-off-by: David S. Miller
07 Apr, 2017
1 commit
-
A new function, kernel_sock_ip_overhead(), is provided
to calculate the cumulative overhead imposed by the IP
Header and IP options, if any, on a socket's payload.
The new function returns an overhead of zero for sockets
that do not belong to the IPv4 or IPv6 address families.
This is used in the L2TP code path to compute the
total outer IP overhead on the L2TP tunnel socket when
calculating the default MTU for Ethernet pseudowires.Signed-off-by: R. Parameswaran
Signed-off-by: David S. Miller
22 Mar, 2017
2 commits
-
SOF_TIMESTAMPING_OPT_STATS can be enabled and disabled
while packets are collected on the error queue.
So, checking SOF_TIMESTAMPING_OPT_STATS in sk->sk_tsflags
is not enough to safely assume that the skb contains
OPT_STATS data.Add a bit in sock_exterr_skb to indicate whether the
skb contains opt_stats data.Fixes: 1c885808e456 ("tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPING")
Reported-by: JongHwan Kim
Signed-off-by: Soheil Hassas Yeganeh
Signed-off-by: Eric Dumazet
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller -
__sock_recv_timestamp can be called for both normal skbs (for
receive timestamps) and for skbs on the error queue (for transmit
timestamps).Commit 1c885808e456
(tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPING)
assumes any skb passed to __sock_recv_timestamp are from
the error queue, containing OPT_STATS in the content of the skb.
This results in accessing invalid memory or generating junk
data.To fix this, set skb->pkt_type to PACKET_OUTGOING for packets
on the error queue. This is safe because on the receive path
on local sockets skb->pkt_type is never set to PACKET_OUTGOING.
With that, copy OPT_STATS from a packet, only if its pkt_type
is PACKET_OUTGOING.Fixes: 1c885808e456 ("tcp: SOF_TIMESTAMPING_OPT_STATS option for SO_TIMESTAMPING")
Reported-by: JongHwan Kim
Signed-off-by: Soheil Hassas Yeganeh
Signed-off-by: Eric Dumazet
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller
10 Mar, 2017
2 commits
-
Lockdep issues a circular dependency warning when AFS issues an operation
through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.The theory lockdep comes up with is as follows:
(1) If the pagefault handler decides it needs to read pages from AFS, it
calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
creating a call requires the socket lock:mmap_sem must be taken before sk_lock-AF_RXRPC
(2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind()
binds the underlying UDP socket whilst holding its socket lock.
inet_bind() takes its own socket lock:sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET
(3) Reading from a TCP socket into a userspace buffer might cause a fault
and thus cause the kernel to take the mmap_sem, but the TCP socket is
locked whilst doing this:sk_lock-AF_INET must be taken before mmap_sem
However, lockdep's theory is wrong in this instance because it deals only
with lock classes and not individual locks. The AF_INET lock in (2) isn't
really equivalent to the AF_INET lock in (3) as the former deals with a
socket entirely internal to the kernel that never sees userspace. This is
a limitation in the design of lockdep.Fix the general case by:
(1) Double up all the locking keys used in sockets so that one set are
used if the socket is created by userspace and the other set is used
if the socket is created by the kernel.(2) Store the kern parameter passed to sk_alloc() in a variable in the
sock struct (sk_kern_sock). This informs sock_lock_init(),
sock_init_data() and sk_clone_lock() as to the lock keys to be used.Note that the child created by sk_clone_lock() inherits the parent's
kern setting.(3) Add a 'kern' parameter to ->accept() that is analogous to the one
passed in to ->create() that distinguishes whether kernel_accept() or
sys_accept4() was the caller and can be passed to sk_alloc().Note that a lot of accept functions merely dequeue an already
allocated socket. I haven't touched these as the new socket already
exists before we get the parameter.Note also that there are a couple of places where I've made the accepted
socket unconditionally kernel-based:irda_accept()
rds_rcp_accept_one()
tcp_accept_from_sock()because they follow a sock_create_kern() and accept off of that.
Whilst creating this, I noticed that lustre and ocfs don't create sockets
through sock_create_kern() and thus they aren't marked as for-kernel,
though they appear to be internal. I wonder if these should do that so
that they use the new set of lock keys.Signed-off-by: David Howells
Signed-off-by: David S. Miller -
KMSAN reports a use of uninitialized memory in put_cmsg() because
msg.msg_flags in recvfrom haven't been initialized properly.
The flag values don't affect the result on this path, but it's still a
good idea to initialize them explicitly.Signed-off-by: Alexander Potapenko
Signed-off-by: David S. Miller
22 Feb, 2017
1 commit
-
Commit 34b88a68f26a ("net: Fix use after free in the recvmmsg exit path"),
changed the exit path of recvmmsg to always return the datagrams
variable and modified the error paths to set the variable to the error
code returned by recvmsg if necessary.However in the case sock_error returned an error, the error code was
then ignored, and recvmmsg returned 0.Change the error path of recvmmsg to correctly return the error code
of sock_error.The bug was triggered by using recvmmsg on a CAN interface which was
not up. Linux 4.6 and later return 0 in this case while earlier
releases returned -ENETDOWN.Fixes: 34b88a68f26a ("net: Fix use after free in the recvmmsg exit path")
Signed-off-by: Maxime Jayat
Signed-off-by: David S. Miller
12 Jan, 2017
1 commit
-
Two AF_* families adding entries to the lockdep tables
at the same time.Signed-off-by: David S. Miller
11 Jan, 2017
1 commit
-
Make sockfs_setattr() static as it is not used outside of net/socket.c
This fixes the following GCC warning:
net/socket.c:534:5: warning: no previous prototype for ‘sockfs_setattr’ [-Wmissing-prototypes]Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Cc: Lorenzo Colitti
Signed-off-by: Tobias Klauser
Acked-by: Lorenzo Colitti
Signed-off-by: David S. Miller
10 Jan, 2017
1 commit
-
sock_init() call it but not check it's return value,
so change it to void return and add an internal BUG_ON() check.Signed-off-by: yuan linyu
Signed-off-by: David S. Miller
06 Jan, 2017
1 commit
05 Jan, 2017
1 commit
-
It must always be the case that CMSG_ALIGN(sizeof(hdr)) == sizeof(hdr).
Otherwise there are missing adjustments in the various calculations
that parse and build these things.Signed-off-by: David S. Miller
02 Jan, 2017
1 commit
-
->setattr() was recently implemented for socket files to sync the socket
inode's uid to the new 'sk_uid' member of struct sock. It does this by
copying over the ia_uid member of struct iattr. However, ia_uid is
actually only valid when ATTR_UID is set in ia_valid, indicating that
the uid is being changed, e.g. by chown. Other metadata operations such
as chmod or utimes leave ia_uid uninitialized. Therefore, sk_uid could
be set to a "garbage" value from the stack.Fix this by only copying the uid over when ATTR_UID is set.
Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
Signed-off-by: Eric Biggers
Tested-by: Lorenzo Colitti
Acked-by: Lorenzo Colitti
Signed-off-by: David S. Miller
26 Dec, 2016
1 commit
-
ktime is a union because the initial implementation stored the time in
scalar nanoseconds on 64 bit machine and in a endianess optimized timespec
variant for 32bit machines. The Y2038 cleanup removed the timespec variant
and switched everything to scalar nanoseconds. The union remained, but
become completely pointless.Get rid of the union and just keep ktime_t as simple typedef of type s64.
The conversion was done with coccinelle and some manual mopping up.
Signed-off-by: Thomas Gleixner
Cc: Peter Zijlstra
25 Dec, 2016
1 commit
-
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*'
sed -i -e "s!$PATT!#include !" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)to do the replacement at the end of the merge window.
Requested-by: Al Viro
Signed-off-by: Linus Torvalds
11 Dec, 2016
1 commit
-
This patch removes a newline which was added
in socket.c file in net-nextSigned-off-by: Amit Kushwaha
Signed-off-by: David S. Miller
09 Dec, 2016
1 commit
-
This patch cleanup checkpatch.pl warning
WARNING: __aligned(size) is preferred over __attribute__((aligned(size)))Signed-off-by: Amit Kushwaha
Signed-off-by: David S. Miller
30 Nov, 2016
1 commit
-
This patch exports the sender chronograph stats via the socket
SO_TIMESTAMPING channel. Currently we can instrument how long a
particular application unit of data was queued in TCP by tracking
SOF_TIMESTAMPING_TX_SOFTWARE and SOF_TIMESTAMPING_TX_SCHED. Having
these sender chronograph stats exported simultaneously along with
these timestamps allow further breaking down the various sender
limitation. For example, a video server can tell if a particular
chunk of video on a connection takes a long time to deliver because
TCP was experiencing small receive window. It is not possible to
tell before this patch without packet traces.To prepare these stats, the user needs to set
SOF_TIMESTAMPING_OPT_STATS and SOF_TIMESTAMPING_OPT_TSONLY flags
while requesting other SOF_TIMESTAMPING TX timestamps. When the
timestamps are available in the error queue, the stats are returned
in a separate control message of type SCM_TIMESTAMPING_OPT_STATS,
in a list of TLVs (struct nlattr) of types: TCP_NLA_BUSY_TIME,
TCP_NLA_RWND_LIMITED, TCP_NLA_SNDBUF_LIMITED. Unit is microsecond.Signed-off-by: Francis Yan
Signed-off-by: Yuchung Cheng
Signed-off-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller
23 Nov, 2016
1 commit
-
All conflicts were simple overlapping changes except perhaps
for the Thunder driver.That driver has a change_mtu method explicitly for sending
a message to the hardware. If that fails it returns an
error.Normally a driver doesn't need an ndo_change_mtu method becuase those
are usually just range changes, which are now handled generically.
But since this extra operation is needed in the Thunder driver, it has
to stay.However, if the message send fails we have to restore the original
MTU before the change because the entire call chain expects that if
an error is thrown by ndo_change_mtu then the MTU did not change.
Therefore code is added to nicvf_change_mtu to remember the original
MTU, and to restore it upon nicvf_update_hw_max_frs() failue.Signed-off-by: David S. Miller
17 Nov, 2016
1 commit
-
The IOP_XATTR flag is set on sockfs because sockfs supports getting the
"system.sockprotoname" xattr. Since commit 6c6ef9f2, this flag is checked for
setxattr support as well. This is wrong on sockfs because security xattr
support there is supposed to be provided by security_inode_setsecurity. The
smack security module relies on socket labels (xattrs).Fix this by adding a security xattr handler on sockfs that returns
-EAGAIN, and by checking for -EAGAIN in setxattr.We cannot simply check for -EOPNOTSUPP in setxattr because there are
filesystems that neither have direct security xattr support nor support
via security_inode_setsecurity. A more proper fix might be to move the
call to security_inode_setsecurity into sockfs, but it's not clear to me
if that is safe: we would end up calling security_inode_post_setxattr after
that as well.Signed-off-by: Andreas Gruenbacher
Signed-off-by: Al Viro
15 Nov, 2016
1 commit
-
Several cases of bug fixes in 'net' overlapping other changes in
'net-next-.Signed-off-by: David S. Miller
10 Nov, 2016
1 commit
-
Do not send the next message in sendmmsg for partial sendmsg
invocations.sendmmsg assumes that it can continue sending the next message
when the return value of the individual sendmsg invocations
is positive. It results in corrupting the data for TCP,
SCTP, and UNIX streams.For example, sendmmsg([["abcd"], ["efgh"]]) can result in a stream
of "aefgh" if the first sendmsg invocation sends only the first
byte while the second sendmsg goes through.Datagram sockets either send the entire datagram or fail, so
this patch affects only sockets of type SOCK_STREAM and
SOCK_SEQPACKET.Fixes: 228e548e6020 ("net: Add sendmmsg socket system call")
Signed-off-by: Soheil Hassas Yeganeh
Signed-off-by: Eric Dumazet
Signed-off-by: Willem de Bruijn
Signed-off-by: Neal Cardwell
Acked-by: Maciej Żenczykowski
Signed-off-by: David S. Miller
05 Nov, 2016
1 commit
-
Protocol sockets (struct sock) don't have UIDs, but most of the
time, they map 1:1 to userspace sockets (struct socket) which do.Various operations such as the iptables xt_owner match need
access to the "UID of a socket", and do so by following the
backpointer to the struct socket. This involves taking
sk_callback_lock and doesn't work when there is no socket
because userspace has already called close().Simplify this by adding a sk_uid field to struct sock whose value
matches the UID of the corresponding struct socket. The semantics
are as follows:1. Whenever sk_socket is non-null: sk_uid is the same as the UID
in sk_socket, i.e., matches the return value of sock_i_uid.
Specifically, the UID is set when userspace calls socket(),
fchown(), or accept().
2. When sk_socket is NULL, sk_uid is defined as follows:
- For a socket that no longer has a sk_socket because
userspace has called close(): the previous UID.
- For a cloned socket (e.g., an incoming connection that is
established but on which userspace has not yet called
accept): the UID of the socket it was cloned from.
- For a socket that has never had an sk_socket: UID 0 inside
the user namespace corresponding to the network namespace
the socket belongs to.Kernel sockets created by sock_create_kern are a special case
of #1 and sk_uid is the user that created them. For kernel
sockets created at network namespace creation time, such as the
per-processor ICMP and TCP sockets, this is the user that created
the network namespace.Signed-off-by: Lorenzo Colitti
Signed-off-by: David S. Miller