Eric Lee / smarc-fsl-linux-kernel

19 Mar, 2019

1 commit

345af5abc missing barriers in some of unix_sock ->addr and ->path accesses ... Browse Code »

[ Upstream commit ae3b564179bfd06f32d051b9e5d72ce4b2a07c37 ]

Several u->addr and u->path users are not holding any locks in
common with unix_bind(). unix_state_lock() is useless for those
purposes.

u->addr is assign-once and *(u->addr) is fully set up by the time
we set u->addr (all under unix_table_lock). u->path is also
set in the same critical area, also before setting u->addr, and
any unix_sock with ->path filled will have non-NULL ->addr.

So setting ->addr with smp_store_release() is all we need for those
"lockless" users - just have them fetch ->addr with smp_load_acquire()
and don't even bother looking at ->path if they see NULL ->addr.

Users of ->addr and ->path fall into several classes now:
1) ones that do smp_load_acquire(u->addr) and access *(u->addr)
and u->path only if smp_load_acquire() has returned non-NULL.
2) places holding unix_table_lock. These are guaranteed that
*(u->addr) is seen fully initialized. If unix_sock is in one of the
"bound" chains, so's ->path.
3) unix_sock_destructor() using ->addr is safe. All places
that set u->addr are guaranteed to have seen all stores *(u->addr)
while holding a reference to u and unix_sock_destructor() is called
when (atomic) refcount hits zero.
4) unix_release_sock() using ->path is safe. unix_bind()
is serialized wrt unix_release() (normally - by struct file
refcount), and for the instances that had ->path set by unix_bind()
unix_release_sock() comes from unix_release(), so they are fine.
Instances that had it set in unix_stream_connect() either end up
attached to a socket (in unix_accept()), in which case the call
chain to unix_release_sock() and serialization are the same as in
the previous case, or they never get accept'ed and unix_release_sock()
is called when the listener is shut down and its queue gets purged.
In that case the listener's queue lock provides the barriers needed -
unix_stream_connect() shoves our unix_sock into listener's queue
under that lock right after having set ->path and eventual
unix_release_sock() caller picks them from that queue under the
same lock right before calling unix_release_sock().
5) unix_find_other() use of ->path is pointless, but safe -
it happens with successful lookup by (abstract) name, so ->path.dentry
is guaranteed to be NULL there.

earlier-variant-reviewed-by: "Paul E. McKenney"
Signed-off-by: Al Viro
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Al Viro
2019-03-19 20:12:41 +0800

04 Nov, 2018

1 commit

fd54c188b Revert "net: simplify sock_poll_wait" ... Browse Code »

[ Upstream commit 89ab066d4229acd32e323f1569833302544a4186 ]

This reverts commit dd979b4df817e9976f18fb6f9d134d6bc4a3c317.

This broke tcp_poll for SMC fallback: An AF_SMC socket establishes an
internal TCP socket for the initial handshake with the remote peer.
Whenever the SMC connection can not be established this TCP socket is
used as a fallback. All socket operations on the SMC socket are then
forwarded to the TCP socket. In case of poll, the file->private_data
pointer references the SMC socket because the TCP socket has no file
assigned. This causes tcp_poll to wait on the wrong socket.

Signed-off-by: Karsten Graul
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Karsten Graul
2018-11-04 21:50:51 +0800

04 Aug, 2018

1 commit

51f7e9518 af_unix: ensure POLLOUT on remote close() for connected dgram socket ... Browse Code »

Applications use -ECONNREFUSED as returned from write() in order to
determine that a socket should be closed. However, when using connected
dgram unix sockets in a poll/write loop, a final POLLOUT event can be
missed when the remote end closes. Thus, the poll is stuck forever:

thread 1 (client) thread 2 (server)

connect() to server
write() returns -EAGAIN
unix_dgram_poll()
-> unix_recvq_full() is true
close()
->unix_release_sock()
->wake_up_interruptible_all()
unix_dgram_poll() (due to the
wake_up_interruptible_all)
-> unix_recvq_full() still is true
->free all skbs

Now thread 1 is stuck and will not receive anymore wakeups. In this
case, when thread 1 gets the -EAGAIN, it has not queued any skbs
otherwise the 'free all skbs' step would in fact cause a wakeup and
a POLLOUT return. So the race here is probably fairly rare because
it means there are no skbs that thread 1 queued and that thread 1
schedules before the 'free all skbs' step.

This issue was reported as a hang when /dev/log is closed.

The fix is to signal POLLOUT if the socket is marked as SOCK_DEAD, which
means a subsequent write() will get -ECONNREFUSED.

Reported-by: Ian Lance Taylor
Cc: David Rientjes
Cc: Rainer Weikusat
Cc: Eric Dumazet
Signed-off-by: Jason Baron
Signed-off-by: David S. Miller

Jason Baron
2018-08-04 07:44:19 +0800

31 Jul, 2018

1 commit

dd979b4df net: simplify sock_poll_wait ... Browse Code »

The wait_address argument is always directly derived from the filp
argument, so remove it.

Signed-off-by: Christoph Hellwig
Signed-off-by: David S. Miller

Christoph Hellwig
2018-07-31 00:10:25 +0800

29 Jun, 2018

1 commit

a11e1d432 Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLL ... Browse Code »

The poll() changes were not well thought out, and completely
unexplained. They also caused a huge performance regression, because
"->poll()" was no longer a trivial file operation that just called down
to the underlying file operations, but instead did at least two indirect
calls.

Indirect calls are sadly slow now with the Spectre mitigation, but the
performance problem could at least be largely mitigated by changing the
"->get_poll_head()" operation to just have a per-file-descriptor pointer
to the poll head instead. That gets rid of one of the new indirections.

But that doesn't fix the new complexity that is completely unwarranted
for the regular case. The (undocumented) reason for the poll() changes
was some alleged AIO poll race fixing, but we don't make the common case
slower and more complex for some uncommon special case, so this all
really needs way more explanations and most likely a fundamental
redesign.

[ This revert is a revert of about 30 different commits, not reverted
individually because that would just be unnecessarily messy - Linus ]

Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Linus Torvalds

Linus Torvalds
2018-06-29 01:40:47 +0800

05 Jun, 2018

1 commit

408afb8d7 Merge branch 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull aio updates from Al Viro:
"Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.

The only thing I'm holding back for a day or so is Adam's aio ioprio -
his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
but let it sit in -next for decency sake..."

* 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
aio: sanitize the limit checking in io_submit(2)
aio: fold do_io_submit() into callers
aio: shift copyin of iocb into io_submit_one()
aio_read_events_ring(): make a bit more readable
aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
aio: take list removal to (some) callers of aio_complete()
aio: add missing break for the IOCB_CMD_FDSYNC case
random: convert to ->poll_mask
timerfd: convert to ->poll_mask
eventfd: switch to ->poll_mask
pipe: convert to ->poll_mask
crypto: af_alg: convert to ->poll_mask
net/rxrpc: convert to ->poll_mask
net/iucv: convert to ->poll_mask
net/phonet: convert to ->poll_mask
net/nfc: convert to ->poll_mask
net/caif: convert to ->poll_mask
net/bluetooth: convert to ->poll_mask
net/sctp: convert to ->poll_mask
net/tipc: convert to ->poll_mask
...

Linus Torvalds
2018-06-05 04:57:43 +0800

26 May, 2018

1 commit

e76cd24d0 net/unix: convert to ->poll_mask ... Browse Code »

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2018-05-26 15:16:44 +0800

16 May, 2018

1 commit

c35063722 proc: introduce proc_create_net{,_data} ... Browse Code »

Variants of proc_create{,_data} that directly take a struct seq_operations
and deal with network namespaces in ->open and ->release. All callers of
proc_create + seq_open_net converted over, and seq_{open,release}_net are
removed entirely.

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2018-05-16 13:24:30 +0800

04 Apr, 2018

1 commit

3848ec5dc af_unix: remove redundant lockdep class ... Browse Code »

After commit 581319c58600 ("net/socket: use per af lockdep classes for sk queues")
sock queue locks now have per-af lockdep classes, including unix socket.
It is no longer necessary to workaround it.

I noticed this while looking at a syzbot deadlock report, this patch
itself doesn't fix it (this is why I don't add Reported-by).

Fixes: 581319c58600 ("net/socket: use per af lockdep classes for sk queues")
Cc: Paolo Abeni
Signed-off-by: Cong Wang
Acked-by: Paolo Abeni
Signed-off-by: David S. Miller

Cong Wang
2018-04-04 23:13:40 +0800

28 Mar, 2018

1 commit

2f635ceeb net: Drop pernet_operations::async ... Browse Code »

Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai
Signed-off-by: David S. Miller

Kirill Tkhai
2018-03-28 01:18:09 +0800

20 Feb, 2018

1 commit

f5c0c6f42 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Browse Code »

David S. Miller
2018-02-20 07:46:11 +0800

14 Feb, 2018

1 commit

d4e9a408e net: af_unix: fix typo in UNIX_SKB_FRAGS_SZ comment ... Browse Code »

Change "minimun" to "minimum".

Signed-off-by: Tobias Klauser
Signed-off-by: David S. Miller

Tobias Klauser
2018-02-14 01:21:45 +0800

13 Feb, 2018

2 commits

167f7ac72 net: Convert unix_net_ops ... Browse Code »

These pernet_operations are just create and destroy
/proc and sysctl entries, and are not touched by
foreign pernet_operations.

So, we are able to make them async.

Signed-off-by: Kirill Tkhai
Acked-by: Andrei Vagin
Signed-off-by: David S. Miller

Kirill Tkhai
2018-02-13 23:36:08 +0800
9b2c45d47 net: make getname() functions return length rather than use int* parameter ... Browse Code »

Changes since v1:
Added changes in these files:
drivers/infiniband/hw/usnic/usnic_transport.c
drivers/staging/lustre/lnet/lnet/lib-socket.c
drivers/target/iscsi/iscsi_target_login.c
drivers/vhost/net.c
fs/dlm/lowcomms.c
fs/ocfs2/cluster/tcp.c
security/tomoyo/network.c

Before:
All these functions either return a negative error indicator,
or store length of sockaddr into "int *socklen" parameter
and return zero on success.

"int *socklen" parameter is awkward. For example, if caller does not
care, it still needs to provide on-stack storage for the value
it does not need.

None of the many FOO_getname() functions of various protocols
ever used old value of *socklen. They always just overwrite it.

This change drops this parameter, and makes all these functions, on success,
return length of sockaddr. It's always >= 0 and can be differentiated
from an error.

Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.

rpc_sockname() lost "int buflen" parameter, since its only use was
to be passed to kernel_getsockname() as &buflen and subsequently
not used in any way.

Userspace API is not changed.

text data bss dec hex filename
30108430 2633624 873672 33615726 200ef6e vmlinux.before.o
30108109 2633612 873672 33615393 200ee21 vmlinux.o

Signed-off-by: Denys Vlasenko
CC: David S. Miller
CC: linux-kernel@vger.kernel.org
CC: netdev@vger.kernel.org
CC: linux-bluetooth@vger.kernel.org
CC: linux-decnet-user@lists.sourceforge.net
CC: linux-wireless@vger.kernel.org
CC: linux-rdma@vger.kernel.org
CC: linux-sctp@vger.kernel.org
CC: linux-nfs@vger.kernel.org
CC: linux-x25@vger.kernel.org
Signed-off-by: David S. Miller

Denys Vlasenko
2018-02-13 03:15:04 +0800

12 Feb, 2018

1 commit

a9a08845e vfs: do bulk POLL* -> EPOLL* replacement ... Browse Code »

This is the mindless scripted replacement of kernel use of POLL*
variables as described by Al, done by this script:

for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
for f in $L; do sed -i "-es/^$[^\"]*$$\$/\\1E\\2/" $f; done
done

with de-mangling cleanups yet to come.

NOTE! On almost all architectures, the EPOLL* constants have the same
values as the POLL* constants do. But they keyword here is "almost".
For various bad reasons they aren't the same, and epoll() doesn't
actually work quite correctly in some cases due to this on Sparc et al.

The next patch from Al will sort out the final differences, and we
should be all done.

Scripted-by: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2018-02-12 06:34:03 +0800

01 Feb, 2018

1 commit

b2fe5fa68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Pull networking updates from David Miller:

1) Significantly shrink the core networking routing structures. Result
of http://vger.kernel.org/~davem/seoul2017_netdev_keynote.pdf

2) Add netdevsim driver for testing various offloads, from Jakub
Kicinski.

3) Support cross-chip FDB operations in DSA, from Vivien Didelot.

4) Add a 2nd listener hash table for TCP, similar to what was done for
UDP. From Martin KaFai Lau.

5) Add eBPF based queue selection to tun, from Jason Wang.

6) Lockless qdisc support, from John Fastabend.

7) SCTP stream interleave support, from Xin Long.

8) Smoother TCP receive autotuning, from Eric Dumazet.

9) Lots of erspan tunneling enhancements, from William Tu.

10) Add true function call support to BPF, from Alexei Starovoitov.

11) Add explicit support for GRO HW offloading, from Michael Chan.

12) Support extack generation in more netlink subsystems. From Alexander
Aring, Quentin Monnet, and Jakub Kicinski.

13) Add 1000BaseX, flow control, and EEE support to mvneta driver. From
Russell King.

14) Add flow table abstraction to netfilter, from Pablo Neira Ayuso.

15) Many improvements and simplifications to the NFP driver bpf JIT,
from Jakub Kicinski.

16) Support for ipv6 non-equal cost multipath routing, from Ido
Schimmel.

17) Add resource abstration to devlink, from Arkadi Sharshevsky.

18) Packet scheduler classifier shared filter block support, from Jiri
Pirko.

19) Avoid locking in act_csum, from Davide Caratti.

20) devinet_ioctl() simplifications from Al viro.

21) More TCP bpf improvements from Lawrence Brakmo.

22) Add support for onlink ipv6 route flag, similar to ipv4, from David
Ahern.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1925 commits)
tls: Add support for encryption using async offload accelerator
ip6mr: fix stale iterator
net/sched: kconfig: Remove blank help texts
openvswitch: meter: Use 64-bit arithmetic instead of 32-bit
tcp_nv: fix potential integer overflow in tcpnv_acked
r8169: fix RTL8168EP take too long to complete driver initialization.
qmi_wwan: Add support for Quectel EP06
rtnetlink: enable IFLA_IF_NETNSID for RTM_NEWLINK
ipmr: Fix ptrdiff_t print formatting
ibmvnic: Wait for device response when changing MAC
qlcnic: fix deadlock bug
tcp: release sk_frag.page in tcp_disconnect
ipv4: Get the address of interface correctly.
net_sched: gen_estimator: fix lockdep splat
net: macb: Handle HRESP error
net/mlx5e: IPoIB, Fix copy-paste bug in flow steering refactoring
ipv6: addrconf: break critical section in addrconf_verify_rtnl()
ipv6: change route cache aging logic
i40e/i40evf: Update DESC_NEEDED value to reflect larger value
bnxt_en: cleanup DIM work on device shutdown
...

Linus Torvalds
2018-02-01 06:31:10 +0800

17 Jan, 2018

1 commit

96890d625 net: delete /proc THIS_MODULE references ... Browse Code »

/proc has been ignoring struct file_operations::owner field for 10 years.
Specifically, it started with commit 786d7e1612f0b0adb6046f19b906609e4fe8b1ba
("Fix rmmod/read/write races in /proc entries"). Notice the chunk where
inode->i_fop is initialized with proxy struct file_operations for
regular files:

- if (de->proc_fops)
- inode->i_fop = de->proc_fops;
+ if (de->proc_fops) {
+ if (S_ISREG(inode->i_mode))
+ inode->i_fop = &proc_reg_file_ops;
+ else
+ inode->i_fop = de->proc_fops;
+ }

VFS stopped pinning module at this point.

Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller

Alexey Dobriyan
2018-01-17 04:01:33 +0800

28 Nov, 2017

2 commits

ade994f4f net: annotate ->poll() instances ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2017-11-28 05:20:04 +0800
3ad6f93e9 annotate poll-related wait keys ... Browse Code »

__poll_t is also used as wait key in some waitqueues.
Verify that wait_..._poll() gets __poll_t as key and
provide a helper for wakeup functions to get back to
that __poll_t value.

Signed-off-by: Al Viro

Al Viro
2017-11-28 05:19:54 +0800

04 Nov, 2017

1 commit

2a171788b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Files removed in 'net-next' had their license header updated
in 'net'. We take the remove from 'net-next'.

Signed-off-by: David S. Miller

David S. Miller
2017-11-04 08:26:51 +0800

03 Nov, 2017

1 commit

ead751507 Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kern… ... Browse Code »

…el/git/gregkh/driver-core

Pull initial SPDX identifiers from Greg KH:
"License cleanup: add SPDX license identifiers to some files

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the
'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
binding shorthand, which can be used instead of the full boiler plate
text.

This patch is based on work done by Thomas Gleixner and Kate Stewart
and Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset
of the use cases:

- file had no licensing information it it.

- file was a */uapi/* one with no licensing information in it,

- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to
license had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied
to a file was done in a spreadsheet of side by side results from of
the output of two independent scanners (ScanCode & Windriver)
producing SPDX tag:value files created by Philippe Ombredanne.
Philippe prepared the base worksheet, and did an initial spot review
of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537
files assessed. Kate Stewart did a file by file comparison of the
scanner results in the spreadsheet to determine which SPDX license
identifier(s) to be applied to the file. She confirmed any
determination that was not immediately clear with lawyers working with
the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:

- Files considered eligible had to be source code files.

- Make and config files were included as candidates if they contained
>5 lines of source

- File already had some variant of a license header in it (even if <5
lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.

For non */uapi/* files that summary was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
was:

SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930

and resulted in the second patch in this series.

- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:

SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

and that resulted in the third patch in this series.

- when the two scanners agreed on the detected license(s), that
became the concluded license(s).

- when there was disagreement between the two scanners (one detected
a license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.

- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply
(and which scanner probably needed to revisit its heuristics).

- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.

- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases,
confirmation by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.
The Windriver scanner is based on an older version of FOSSology in
part, so they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot
checks in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect
the correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial
patch version early this week with:

- a full scancode scan run, collecting the matched texts, detected
license ids and scores

- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct

- reviewing anything where there was no detection but the patch
license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
applied SPDX license was correct

This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
License cleanup: add SPDX license identifier to uapi header files with a license
License cleanup: add SPDX license identifier to uapi header files with no license
License cleanup: add SPDX GPL-2.0 license identifier to files with no license

Linus Torvalds
2017-11-03 01:04:46 +0800

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license ... Browse Code »

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800

30 Oct, 2017

1 commit

e1ea2f985 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Several conflicts here.

NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to
nfp_fl_output() needed some adjustments because the code block is in
an else block now.

Parallel additions to net/pkt_cls.h and net/sch_generic.h

A bug fix in __tcp_retransmit_skb() conflicted with some of
the rbtree changes in net-next.

The tc action RCU callback fixes in 'net' had some overlap with some
of the recent tcf_block reworking.

Signed-off-by: David S. Miller

David S. Miller
2017-10-30 20:09:24 +0800

26 Oct, 2017

1 commit

0f5da659d net/unix: don't show information about sockets from other namespaces ... Browse Code »

socket_diag shows information only about sockets from a namespace where
a diag socket lives.

But if we request information about one unix socket, the kernel don't
check that its netns is matched with a diag socket namespace, so any
user can get information about any unix socket in a system. This looks
like a bug.

v2: add a Fixes tag

Fixes: 51d7cccf0723 ("net: make sock diag per-namespace")
Signed-off-by: Andrei Vagin
Signed-off-by: David S. Miller

Andrei Vagin
2017-10-26 09:05:59 +0800

22 Oct, 2017

1 commit

110af3acb net: af_unix: mark expected switch fall-through ... Browse Code »

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Signed-off-by: Gustavo A. R. Silva
Signed-off-by: David S. Miller

Gustavo A. R. Silva
2017-10-22 10:07:50 +0800

22 Aug, 2017

1 commit

e2a7c34fb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Browse Code »

David S. Miller
2017-08-22 08:06:42 +0800

19 Aug, 2017

1 commit

a0917e0bc datagram: When peeking datagrams with offset < 0 don't skip empty skbs ... Browse Code »

Due to commit e6afc8ace6dd5cef5e812f26c72579da8806f5ac ("udp: remove
headers from UDP packets before queueing"), when udp packets are being
peeked the requested extra offset is always 0 as there is no need to skip
the udp header. However, when the offset is 0 and the next skb is
of length 0, it is only returned once. The behaviour can be seen with
the following python script:

from socket import *;
f=socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0);
g=socket(AF_INET6, SOCK_DGRAM | SOCK_NONBLOCK, 0);
f.bind(('::', 0));
addr=('::1', f.getsockname()[1]);
g.sendto(b'', addr)
g.sendto(b'b', addr)
print(f.recvfrom(10, MSG_PEEK));
print(f.recvfrom(10, MSG_PEEK));

Where the expected output should be the empty string twice.

Instead, make sk_peek_offset return negative values, and pass those values
to __skb_try_recv_datagram/__skb_try_recv_from_queue. If the passed offset
to __skb_try_recv_from_queue is negative, the checked skb is never skipped.
__skb_try_recv_from_queue will then ensure the offset is reset back to 0
if a peek is requested without an offset, unless no packets are found.

Also simplify the if condition in __skb_try_recv_from_queue. If _off is
greater then 0, and off is greater then or equal to skb->len, then
(_off || skb->len) must always be true assuming skb->len >= 0 is always
true.

Also remove a redundant check around a call to sk_peek_offset in af_unix.c,
as it double checked if MSG_PEEK was set in the flags.

V2:
- Moved the negative fixup into __skb_try_recv_from_queue, and remove now
redundant checks
- Fix peeking in udp{,v6}_recvmsg to report the right value when the
offset is 0

V3:
- Marked new branch in __skb_try_recv_from_queue as unlikely.

Signed-off-by: Matthew Dawson
Acked-by: Willem de Bruijn
Signed-off-by: David S. Miller

Matthew Dawson
2017-08-19 06:12:54 +0800

17 Jul, 2017

1 commit

27eac47b0 net/unix: drop obsolete fd-recursion limits ... Browse Code »

All unix sockets now account inflight FDs to the respective sender.
This was introduced in:

commit 712f4aad406bb1ed67f3f98d04c044191f0ff593
Author: willy tarreau
Date: Sun Jan 10 07:54:56 2016 +0100

unix: properly account for FDs passed over unix sockets

and further refined in:

commit 415e3d3e90ce9e18727e8843ae343eda5a58fad6
Author: Hannes Frederic Sowa
Date: Wed Feb 3 02:11:03 2016 +0100

unix: correctly track in-flight fds in sending process user_struct

Hence, regardless of the stacking depth of FDs, the total number of
inflight FDs is limited, and accounted. There is no known way for a
local user to exceed those limits or exploit the accounting.

Furthermore, the GC logic is independent of the recursion/stacking depth
as well. It solely depends on the total number of inflight FDs,
regardless of their layout.

Lastly, the current `recursion_level' suffers a TOCTOU race, since it
checks and inherits depths only at queue time. If we consider `A
Cc: Simon McVittie
Signed-off-by: David Herrmann
Reviewed-by: Tom Gundersen
Signed-off-by: David S. Miller

David Herrmann
2017-07-17 23:57:59 +0800

06 Jul, 2017

1 commit

5518b69b7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Pull networking updates from David Miller:
"Reasonably busy this cycle, but perhaps not as busy as in the 4.12
merge window:

1) Several optimizations for UDP processing under high load from
Paolo Abeni.

2) Support pacing internally in TCP when using the sch_fq packet
scheduler for this is not practical. From Eric Dumazet.

3) Support mutliple filter chains per qdisc, from Jiri Pirko.

4) Move to 1ms TCP timestamp clock, from Eric Dumazet.

5) Add batch dequeueing to vhost_net, from Jason Wang.

6) Flesh out more completely SCTP checksum offload support, from
Davide Caratti.

7) More plumbing of extended netlink ACKs, from David Ahern, Pablo
Neira Ayuso, and Matthias Schiffer.

8) Add devlink support to nfp driver, from Simon Horman.

9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa
Prabhu.

10) Add stack depth tracking to BPF verifier and use this information
in the various eBPF JITs. From Alexei Starovoitov.

11) Support XDP on qed device VFs, from Yuval Mintz.

12) Introduce BPF PROG ID for better introspection of installed BPF
programs. From Martin KaFai Lau.

13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann.

14) For loads, allow narrower accesses in bpf verifier checking, from
Yonghong Song.

15) Support MIPS in the BPF selftests and samples infrastructure, the
MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David
Daney.

16) Support kernel based TLS, from Dave Watson and others.

17) Remove completely DST garbage collection, from Wei Wang.

18) Allow installing TCP MD5 rules using prefixes, from Ivan
Delalande.

19) Add XDP support to Intel i40e driver, from Björn Töpel

20) Add support for TC flower offload in nfp driver, from Simon
Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub
Kicinski, and Bert van Leeuwen.

21) IPSEC offloading support in mlx5, from Ilan Tayari.

22) Add HW PTP support to macb driver, from Rafal Ozieblo.

23) Networking refcount_t conversions, From Elena Reshetova.

24) Add sock_ops support to BPF, from Lawrence Brako. This is useful
for tuning the TCP sockopt settings of a group of applications,
currently via CGROUPs"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits)
net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap
dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap
cxgb4: Support for get_ts_info ethtool method
cxgb4: Add PTP Hardware Clock (PHC) support
cxgb4: time stamping interface for PTP
nfp: default to chained metadata prepend format
nfp: remove legacy MAC address lookup
nfp: improve order of interfaces in breakout mode
net: macb: remove extraneous return when MACB_EXT_DESC is defined
bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case
bpf: fix return in load_bpf_file
mpls: fix rtm policy in mpls_getroute
net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t
net, ax25: convert ax25_route.refcount from atomic_t to refcount_t
net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t
net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t
net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t
...

Linus Torvalds
2017-07-06 03:31:59 +0800

01 Jul, 2017

3 commits

8c9814b97 net: convert unix_address.refcnt from atomic_t to refcount_t ... Browse Code »

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller

Reshetova, Elena
2017-07-01 22:39:08 +0800
41c6d650f net: convert sock.sk_refcnt from atomic_t to refcount_t ... Browse Code »

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

This patch uses refcount_inc_not_zero() instead of
atomic_inc_not_zero_hint() due to absense of a _hint()
version of refcount API. If the hint() version must
be used, we might need to revisit API.

Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller

Reshetova, Elena
2017-07-01 22:39:08 +0800
14afee4b6 net: convert sock.sk_wmem_alloc from atomic_t to refcount_t ... Browse Code »

refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows to avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova
Signed-off-by: Hans Liljestrand
Signed-off-by: Kees Cook
Signed-off-by: David Windsor
Signed-off-by: David S. Miller

Reshetova, Elena
2017-07-01 22:39:08 +0800

20 Jun, 2017

1 commit

ac6424b98 sched/wait: Rename wait_queue_t => wait_queue_entry_t ... Browse Code »

Rename:

wait_queue_t => wait_queue_entry_t

'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
which had to carry the name.

Start sorting this out by renaming it to 'wait_queue_entry_t'.

This also allows the real structure name 'struct __wait_queue' to
lose its double underscore and become 'struct wait_queue_entry',
which is the more canonical nomenclature for such data types.

Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

Ingo Molnar
2017-06-20 18:18:27 +0800

09 Jun, 2017

1 commit

defbcf2de af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers ... Browse Code »

Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.

Signed-off-by: Mateusz Jurczyk
Signed-off-by: David S. Miller

Mateusz Jurczyk
2017-06-09 22:10:24 +0800

07 Apr, 2017

1 commit

82fe0d2b4 af_unix: Use designated initializers ... Browse Code »

Prepare to mark sensitive kernel structures for randomization by making
sure they're using designated initializers. These were identified during
allyesconfig builds of x86, arm, and arm64, and the initializer fixes
were extracted from grsecurity. In this case, NULL initialize with { }
instead of undesignated NULLs.

Signed-off-by: Kees Cook
Signed-off-by: David S. Miller

Kees Cook
2017-04-07 03:43:04 +0800

22 Mar, 2017

1 commit

7df9c2462 net: unix: properly re-increment inflight counter of GC discarded candidates ... Browse Code »

Dmitry has reported that a BUG_ON() condition in unix_notinflight()
may be triggered by a simple code that forwards unix socket in an
SCM_RIGHTS message.
That is caused by incorrect unix socket GC implementation in unix_gc().

The GC first collects list of candidates, then (a) decrements their
"children's" inflight counter, (b) checks which inflight counters are
now 0, and then (c) increments all inflight counters back.
(a) and (c) are done by calling scan_children() with inc_inflight or
dec_inflight as the second argument.

Commit 6209344f5a37 ("net: unix: fix inflight counting bug in garbage
collector") changed scan_children() such that it no longer considers
sockets that do not have UNIX_GC_CANDIDATE flag. It also added a block
of code that that unsets this flag _before_ invoking
scan_children(, dec_iflight, ). This may lead to incorrect inflight
counters for some sockets.

This change fixes this bug by changing order of operations:
UNIX_GC_CANDIDATE is now unset only after all inflight counters are
restored to the original state.

kernel BUG at net/unix/garbage.c:149!
RIP: 0010:[] []
unix_notinflight+0x3b4/0x490 net/unix/garbage.c:149
Call Trace:
[] unix_detach_fds.isra.19+0xff/0x170 net/unix/af_unix.c:1487
[] unix_destruct_scm+0xf9/0x210 net/unix/af_unix.c:1496
[] skb_release_head_state+0x101/0x200 net/core/skbuff.c:655
[] skb_release_all+0x1a/0x60 net/core/skbuff.c:668
[] __kfree_skb+0x1a/0x30 net/core/skbuff.c:684
[] kfree_skb+0x184/0x570 net/core/skbuff.c:705
[] unix_release_sock+0x5b5/0xbd0 net/unix/af_unix.c:559
[] unix_release+0x49/0x90 net/unix/af_unix.c:836
[] sock_release+0x92/0x1f0 net/socket.c:570
[] sock_close+0x1b/0x20 net/socket.c:1017
[] __fput+0x34e/0x910 fs/file_table.c:208
[] ____fput+0x1a/0x20 fs/file_table.c:244
[] task_work_run+0x1a0/0x280 kernel/task_work.c:116
[< inline >] exit_task_work include/linux/task_work.h:21
[] do_exit+0x183a/0x2640 kernel/exit.c:828
[] do_group_exit+0x14e/0x420 kernel/exit.c:931
[] get_signal+0x663/0x1880 kernel/signal.c:2307
[] do_signal+0xc5/0x2190 arch/x86/kernel/signal.c:807
[] exit_to_usermode_loop+0x1ea/0x2d0
arch/x86/entry/common.c:156
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[] syscall_return_slowpath+0x4d3/0x570
arch/x86/entry/common.c:259
[] entry_SYSCALL_64_fastpath+0xc4/0xc6

Link: https://lkml.org/lkml/2017/3/6/252
Signed-off-by: Andrey Ulanov
Reported-by: Dmitry Vyukov
Fixes: 6209344 ("net: unix: fix inflight counting bug in garbage collector")
Signed-off-by: David S. Miller

Andrey Ulanov
2017-03-22 06:25:10 +0800

10 Mar, 2017

1 commit

cdfbabfb2 net: Work around lockdep limitation in sockets that use sockets ... Browse Code »

Lockdep issues a circular dependency warning when AFS issues an operation
through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.

The theory lockdep comes up with is as follows:

(1) If the pagefault handler decides it needs to read pages from AFS, it
calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
creating a call requires the socket lock:

mmap_sem must be taken before sk_lock-AF_RXRPC

(2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind()
binds the underlying UDP socket whilst holding its socket lock.
inet_bind() takes its own socket lock:

sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET

(3) Reading from a TCP socket into a userspace buffer might cause a fault
and thus cause the kernel to take the mmap_sem, but the TCP socket is
locked whilst doing this:

sk_lock-AF_INET must be taken before mmap_sem

However, lockdep's theory is wrong in this instance because it deals only
with lock classes and not individual locks. The AF_INET lock in (2) isn't
really equivalent to the AF_INET lock in (3) as the former deals with a
socket entirely internal to the kernel that never sees userspace. This is
a limitation in the design of lockdep.

Fix the general case by:

(1) Double up all the locking keys used in sockets so that one set are
used if the socket is created by userspace and the other set is used
if the socket is created by the kernel.

(2) Store the kern parameter passed to sk_alloc() in a variable in the
sock struct (sk_kern_sock). This informs sock_lock_init(),
sock_init_data() and sk_clone_lock() as to the lock keys to be used.

Note that the child created by sk_clone_lock() inherits the parent's
kern setting.

(3) Add a 'kern' parameter to ->accept() that is analogous to the one
passed in to ->create() that distinguishes whether kernel_accept() or
sys_accept4() was the caller and can be passed to sk_alloc().

Note that a lot of accept functions merely dequeue an already
allocated socket. I haven't touched these as the new socket already
exists before we get the parameter.

Note also that there are a couple of places where I've made the accepted
socket unconditionally kernel-based:

irda_accept()
rds_rcp_accept_one()
tcp_accept_from_sock()

because they follow a sock_create_kern() and accept off of that.

Whilst creating this, I noticed that lustre and ocfs don't create sockets
through sock_create_kern() and thus they aren't marked as for-kernel,
though they appear to be internal. I wonder if these should do that so
that they use the new set of lock keys.

Signed-off-by: David Howells
Signed-off-by: David S. Miller

David Howells
2017-03-10 10:23:27 +0800

02 Mar, 2017

1 commit

3f07c0144 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/signal.h> ... Browse Code »

We are going to split out of , which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder file that just
maps to to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

Ingo Molnar
2017-03-02 15:42:29 +0800

03 Feb, 2017

1 commit

ba94f3088 unix: add ioctl to open a unix socket file with O_PATH ... Browse Code »

This ioctl opens a file to which a socket is bound and
returns a file descriptor. The caller has to have CAP_NET_ADMIN
in the socket network namespace.

Currently it is impossible to get a path and a mount point
for a socket file. socket_diag reports address, device ID and inode
number for unix sockets. An address can contain a relative path or
a file may be moved somewhere. And these properties say nothing about
a mount namespace and a mount point of a socket file.

With the introduced ioctl, we can get a path by reading
/proc/self/fd/X and get mnt_id from /proc/self/fdinfo/X.

In CRIU we are going to use this ioctl to dump and restore unix socket.

Here is an example how it can be used:

$ strace -e socket,bind,ioctl ./test /tmp/test_sock
socket(AF_UNIX, SOCK_STREAM, 0) = 3
bind(3, {sa_family=AF_UNIX, sun_path="test_sock"}, 11) = 0
ioctl(3, SIOCUNIXFILE, 0) = 4
^Z

$ ss -a | grep test_sock
u_str LISTEN 0 1 test_sock 17798 * 0

$ ls -l /proc/760/fd/{3,4}
lrwx------ 1 root root 64 Feb 1 09:41 3 -> 'socket:[17798]'
l--------- 1 root root 64 Feb 1 09:41 4 -> /tmp/test_sock

$ cat /proc/760/fdinfo/4
pos: 0
flags: 012000000
mnt_id: 40

$ cat /proc/self/mountinfo | grep "^40\s"
40 19 0:37 / /tmp rw shared:23 - tmpfs tmpfs rw

Signed-off-by: Andrei Vagin
Signed-off-by: David S. Miller

Andrey Vagin
2017-02-03 10:58:02 +0800

25 Jan, 2017

1 commit

0fb44559f af_unix: move unix_mknod() out of bindlock ... Browse Code »

Dmitry reported a deadlock scenario:

unix_bind() path:
u->bindlock ==> sb_writer

do_splice() path:
sb_writer ==> pipe->mutex ==> u->bindlock

In the unix_bind() code path, unix_mknod() does not have to
be done with u->bindlock held, since it is a pure fs operation,
so we can just move unix_mknod() out.

Reported-by: Dmitry Vyukov
Tested-by: Dmitry Vyukov
Cc: Rainer Weikusat
Cc: Al Viro
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

WANG Cong
2017-01-25 03:30:56 +0800