Eric Lee / smarc-fsl-linux-kernel

02 Nov, 2017

1 commit

b24413180 License cleanup: add SPDX GPL-2.0 license identifier to files with no license ... Browse Code »

Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2017-11-02 18:10:55 +0800

15 Sep, 2017

1 commit

cc73fee0b Merge branch 'work.ipc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull ipc compat cleanup and 64-bit time_t from Al Viro:
"IPC copyin/copyout sanitizing, including 64bit time_t work from Deepa
Dinamani"

* 'work.ipc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
utimes: Make utimes y2038 safe
ipc: shm: Make shmid_kernel timestamps y2038 safe
ipc: sem: Make sem_array timestamps y2038 safe
ipc: msg: Make msg_queue timestamps y2038 safe
ipc: mqueue: Replace timespec with timespec64
ipc: Make sys_semtimedop() y2038 safe
get rid of SYSVIPC_COMPAT on ia64
semtimedop(): move compat to native
shmat(2): move compat to native
msgrcv(2), msgsnd(2): move compat to native
ipc(2): move compat to native
ipc: make use of compat ipc_perm helpers
semctl(): move compat to native
semctl(): separate all layout-dependent copyin/copyout
msgctl(): move compat to native
msgctl(): split the actual work from copyin/copyout
ipc: move compat shmctl to native
shmctl: split the work from copyin/copyout

Linus Torvalds
2017-09-15 08:37:26 +0800

09 Sep, 2017

1 commit

0cfb6aee7 ipc: optimize semget/shmget/msgget for lots of keys ... Browse Code »

ipc_findkey() used to scan all objects to look for the wanted key. This
is slow when using a high number of keys. This change adds an rhashtable
of kern_ipc_perm objects in ipc_ids, so that one lookup cease to be O(n).

This change gives a 865% improvement of benchmark reaim.jobs_per_min on a
56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory [1]

Other (more micro) benchmark results, by the author: On an i5 laptop, the
following loop executed right after a reboot took, without and with this
change:

for (int i = 0, k=0x424242; i < KEYS; ++i)
semget(k++, 1, IPC_CREAT | 0600);

total total max single max single
KEYS without with call without call with

1 3.5 4.9 Âµs 3.5 4.9
10 7.6 8.6 Âµs 3.7 4.7
32 16.2 15.9 Âµs 4.3 5.3
100 72.9 41.8 Âµs 3.7 4.7
1000 5,630.0 502.0 Âµs * *
10000 1,340,000.0 7,240.0 Âµs * *
31900 17,600,000.0 22,200.0 Âµs * *

*: unreliable measure: high variance

The duration for a lookup-only usage was obtained by the same loop once
the keys are present:

total total max single max single
KEYS without with call without call with

1 2.1 2.5 Âµs 2.1 2.5
10 4.5 4.8 Âµs 2.2 2.3
32 13.0 10.8 Âµs 2.3 2.8
100 82.9 25.1 Âµs * 2.3
1000 5,780.0 217.0 Âµs * *
10000 1,470,000.0 2,520.0 Âµs * *
31900 17,400,000.0 7,810.0 Âµs * *

Finally, executing each semget() in a new process gave, when still
summing only the durations of these syscalls:

creation:
total total
KEYS without with

1 3.7 5.0 Âµs
10 32.9 36.7 Âµs
32 125.0 109.0 Âµs
100 523.0 353.0 Âµs
1000 20,300.0 3,280.0 Âµs
10000 2,470,000.0 46,700.0 Âµs
31900 27,800,000.0 219,000.0 Âµs

lookup-only:
total total
KEYS without with

1 2.5 2.7 Âµs
10 25.4 24.4 Âµs
32 106.0 72.6 Âµs
100 591.0 352.0 Âµs
1000 22,400.0 2,250.0 Âµs
10000 2,510,000.0 25,700.0 Âµs
31900 28,200,000.0 115,000.0 Âµs

[1] http://lkml.kernel.org/r/20170814060507.GE23258@yexl-desktop

Link: http://lkml.kernel.org/r/20170815194954.ck32ta2z35yuzpwp@debix
Signed-off-by: Guillaume Knispel
Reviewed-by: Marc Pardo
Cc: Davidlohr Bueso
Cc: Kees Cook
Cc: Manfred Spraul
Cc: Alexey Dobriyan
Cc: "Eric W. Biederman"
Cc: "Peter Zijlstra (Intel)"
Cc: Ingo Molnar
Cc: Sebastian Andrzej Siewior
Cc: Serge Hallyn
Cc: Andrey Vagin
Cc: Guillaume Knispel
Cc: Marc Pardo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Guillaume Knispel
2017-09-09 09:26:51 +0800

04 Sep, 2017

1 commit

50578ea97 ipc: msg: Make msg_queue timestamps y2038 safe ... Browse Code »

time_t is not y2038 safe. Replace all uses of
time_t by y2038 safe time64_t.

Similarly, replace the calls to get_seconds() with
y2038 safe ktime_get_real_seconds().
Note that this preserves fast access on 64 bit systems,
but 32 bit systems need sequence counters.

The syscall interfaces themselves are not changed as part of
the patch. They will be part of a different series.

Signed-off-by: Deepa Dinamani
Reviewed-by: Arnd Bergmann
Signed-off-by: Al Viro

Deepa Dinamani
2017-09-04 08:21:24 +0800

03 Aug, 2017

1 commit

ade9f91b3 ipc: add missing container_of()s for randstruct ... Browse Code »

When building with the randstruct gcc plugin, the layout of the IPC
structs will be randomized, which requires any sub-structure accesses to
use container_of(). The proc display handlers were missing the needed
container_of()s since the iterator is passing in the top-level struct
kern_ipc_perm.

This would lead to crashes when running the "lsipc" program after the
system had IPC registered (e.g. after starting up Gnome):

general protection fault: 0000 [#1] PREEMPT SMP
...
RIP: 0010:shm_add_rss_swap.isra.1+0x13/0xa0
...
Call Trace:
sysvipc_shm_proc_show+0x5e/0x150
sysvipc_proc_show+0x1a/0x30
seq_read+0x2e9/0x3f0
...

Link: http://lkml.kernel.org/r/20170730205950.GA55841@beast
Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization")
Signed-off-by: Kees Cook
Reported-by: Dominik Brodowski
Acked-by: Davidlohr Bueso
Acked-by: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2017-08-03 08:16:12 +0800

16 Jul, 2017

4 commits

9b1404c24 msgrcv(2), msgsnd(2): move compat to native ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2017-07-16 08:46:46 +0800
28327fae6 ipc: make use of compat ipc_perm helpers ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2017-07-16 08:46:45 +0800
469391684 msgctl(): move compat to native ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2017-07-16 08:46:43 +0800
156d9ed12 msgctl(): split the actual work from copyin/copyout ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2017-07-16 08:46:43 +0800

13 Jul, 2017

6 commits

fb259c310 ipc/msg: remove special msg_alloc/free ... Browse Code »

There is nothing special about the msg_alloc/free routines any more, so
remove them to make code more readable.

[manfred@colorfullife.com: Rediff to keep rcu protection for security_msg_queue_alloc()]
Link: http://lkml.kernel.org/r/20170525185107.12869-19-manfred@colorfullife.com
Signed-off-by: Kees Cook
Signed-off-by: Manfred Spraul
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2017-07-13 07:26:02 +0800
3d3653f97 ipc: move atomic_set() to where it is needed ... Browse Code »

Only after ipc_addid() has succeeded will refcounting be used, so move
initialization into ipc_addid() and remove from open-coded *_alloc()
routines.

Link: http://lkml.kernel.org/r/20170525185107.12869-17-manfred@colorfullife.com
Signed-off-by: Kees Cook
Signed-off-by: Manfred Spraul
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2017-07-13 07:26:02 +0800
51c23b7b7 ipc/msg.c: avoid ipc_rcu_putref for failed ipc_addid() ... Browse Code »

Loosely based on a patch from Kees Cook :
- id and retval can be merged
- if ipc_addid() fails, then use call_rcu() directly.

The difference is that call_rcu is used for failed ipc_addid() calls, to
continue to guaranteed an rcu delay for security_msg_queue_free().

Link: http://lkml.kernel.org/r/20170525185107.12869-16-manfred@colorfullife.com
Signed-off-by: Manfred Spraul
Cc: Kees Cook
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Manfred Spraul
2017-07-13 07:26:02 +0800
52f908904 ipc/msg: avoid ipc_rcu_alloc() ... Browse Code »

Instead of using ipc_rcu_alloc() which only performs the refcount bump,
open code it. This also allows for msg_queue structure layout to be
randomized in the future.

Link: http://lkml.kernel.org/r/20170525185107.12869-12-manfred@colorfullife.com
Signed-off-by: Kees Cook
Signed-off-by: Manfred Spraul
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2017-07-13 07:26:02 +0800
9ef5932f8 ipc/msg: do not use ipc_rcu_free() ... Browse Code »

Avoid using ipc_rcu_free, since it just re-finds the original structure
pointer. For the pre-list-init failure path, there is no RCU needed,
since it was just allocated. It can be directly freed.

Link: http://lkml.kernel.org/r/20170525185107.12869-8-manfred@colorfullife.com
Signed-off-by: Kees Cook
Signed-off-by: Manfred Spraul
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2017-07-13 07:26:01 +0800
dba4cdd39 ipc: merge ipc_rcu and kern_ipc_perm ... Browse Code »

ipc has two management structures that exist for every id:
- struct kern_ipc_perm, it contains e.g. the permissions.
- struct ipc_rcu, it contains the rcu head for rcu handling and the
refcount.

The patch merges both structures.

As a bonus, we may save one cacheline, because both structures are
cacheline aligned. In addition, it reduces the number of casts, instead
most codepaths can use container_of.

To simplify code, the ipc_rcu_alloc initializes the allocation to 0.

[manfred@colorfullife.com: really include the memset() into ipc_alloc_rcu()]
Link: http://lkml.kernel.org/r/564f8612-0601-b267-514f-a9f650ec9b32@colorfullife.com
Link: http://lkml.kernel.org/r/20170525185107.12869-3-manfred@colorfullife.com
Signed-off-by: Manfred Spraul
Cc: Davidlohr Bueso
Cc: Kees Cook
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Manfred Spraul
2017-07-13 07:26:01 +0800

02 Mar, 2017

2 commits

eb61baf69 sched/headers: Move the wake-queue types and interfaces from sched.h into <linux/sched/wake_q.h> ... Browse Code »

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar

Ingo Molnar
2017-03-02 15:42:42 +0800
84f001e15 sched/headers: Prepare for new header dependencies before moving code to <linux/sched/wake_q.h> ... Browse Code »

We are going to split out of , which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder file that just
maps to to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar

Ingo Molnar
2017-03-02 15:42:26 +0800

15 Dec, 2016

1 commit

999898355 ipc: msg, make msgrcv work with LONG_MIN ... Browse Code »

When LONG_MIN is passed to msgrcv, one would expect to recieve any
message. But convert_mode does *msgtyp = -*msgtyp and -LONG_MIN is
undefined. In particular, with my gcc -LONG_MIN produces -LONG_MIN
again.

So handle this case properly by assigning LONG_MAX to *msgtyp if
LONG_MIN was specified as msgtyp to msgrcv.

This code:
long msg[] = { 100, 200 };
int m = msgget(IPC_PRIVATE, IPC_CREAT | 0644);
msgsnd(m, &msg, sizeof(msg), 0);
msgrcv(m, &msg, sizeof(msg), LONG_MIN, 0);

produces currently nothing:

msgget(IPC_PRIVATE, IPC_CREAT|0644) = 65538
msgsnd(65538, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
msgrcv(65538, ...

Except a UBSAN warning:

UBSAN: Undefined behaviour in ipc/msg.c:745:13
negation of -9223372036854775808 cannot be represented in type 'long int':

With the patch, I see what I expect:

msgget(IPC_PRIVATE, IPC_CREAT|0644) = 0
msgsnd(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
msgrcv(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, -9223372036854775808, 0) = 16

Link: http://lkml.kernel.org/r/20161024082633.10148-1-jslaby@suse.cz
Signed-off-by: Jiri Slaby
Cc: Davidlohr Bueso
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jiri Slaby
2016-12-15 08:04:08 +0800

21 Nov, 2016

1 commit

194a6b5b9 sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q ... Browse Code »

Currently the wake_q data structure is defined by the WAKE_Q() macro.
This macro, however, looks like a function doing something as "wake" is
a verb. Even checkpatch.pl was confused as it reported warnings like

WARNING: Missing a blank line after declarations
#548: FILE: kernel/futex.c:3665:
+ int ret;
+ WAKE_Q(wake_q);

This patch renames the WAKE_Q() macro to DEFINE_WAKE_Q() which clarifies
what the macro is doing and eliminates the checkpatch.pl warnings.

Signed-off-by: Waiman Long
Acked-by: Davidlohr Bueso
Cc: Linus Torvalds
Cc: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1479401198-1765-1-git-send-email-longman@redhat.com
[ Resolved conflict and added missing rename. ]
Signed-off-by: Ingo Molnar

Waiman Long
2016-11-21 17:29:01 +0800

12 Oct, 2016

4 commits

ed27f9122 ipc/msg: avoid waking sender upon full queue ... Browse Code »

Blocked tasks queued in q_senders waiting for their message to fit in the
queue are blindly awoken every time we think there's a remote chance this
might happen. This could cause numerous (and expensive -- thundering
herd-ish) bogus wakeups if the queue is still really full. Adding to the
scheduling cost/overhead, there's also the fact that we need to take the
ipc object lock and requeue ourselves in the q_senders list.

By keeping track of the blocked sender's message size, we can know
previously if the wakeup ought to occur or not. Otherwise, to maintain
the current wakeup order we just move it to the tail. This is exactly
what occurs right now if the sender needs to go back to sleep.

The case of EIDRM is left completely untouched, as we need to wakeup all
the tasks, and shouldn't be playing games in the first place.

This patch was seen to save on the 'msgctl10' ltp testcase ~15% in context
switches (avg out of ten runs). Although these tests are really about
functionality (as opposed to performance), is does show the direct
benefits of the optimization.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/1469748819-19484-6-git-send-email-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso
Acked-by: Peter Zijlstra (Intel)
Cc: Manfred Spraul
Cc: Sebastian Andrzej Siewior
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2016-10-12 06:06:33 +0800
d0d6a2a95 ipc/msg: make ss_wakeup() kill arg boolean ... Browse Code »

... 'tis annoying.

Link: http://lkml.kernel.org/r/1469748819-19484-4-git-send-email-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso
Acked-by: Peter Zijlstra (Intel)
Cc: Manfred Spraul
Cc: Sebastian Andrzej Siewior
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2016-10-12 06:06:33 +0800
e3658538b ipc/msg: batch queue sender wakeups ... Browse Code »

Currently the use of wake_qs in sysv msg queues are only for the receiver
tasks that are blocked on the queue. But blocked sender tasks (due to
queue size constraints) still are awoken with the ipc object lock held,
which can be a problem particularly for small sized queues and far from
gracious for -rt (just like it was for the receiver side).

The paths that actually wakeup a sender are obviously related to when we
are either getting rid of the queue or after (some) space is freed-up
after a receiver takes the msg (msgrcv). Furthermore, with the exception
of msgrcv, we can always piggy-back on expunge_all that has its own tasks
lined-up for waking. Finally, upon unlinking the message, it should be no
problem delaying the wakeups a bit until after we've released the lock.

Link: http://lkml.kernel.org/r/1469748819-19484-3-git-send-email-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso
Acked-by: Peter Zijlstra (Intel)
Cc: Manfred Spraul
Cc: Sebastian Andrzej Siewior
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2016-10-12 06:06:33 +0800
ee51636ca ipc/msg: implement lockless pipelined wakeups ... Browse Code »

This patch moves the wakeup_process() invocation so it is not done under
the ipc global lock by making use of a lockless wake_q. With this change,
the waiter is woken up once the message has been assigned and it does not
need to loop on SMP if the message points to NULL. In the signal case we
still need to check the pointer under the lock to verify the state.

This change should also avoid the introduction of preempt_disable() in -RT
which avoids a busy-loop which pools for the NULL -> !NULL change if the
waiter has a higher priority compared to the waker.

By making use of wake_qs, the logic of sysv msg queues is greatly
simplified (and very well suited as we can batch lockless wakeups),
particularly around the lockless receive algorithm.

This has been tested with Manred's pmsg-shared tool on a "AMD A10-7800
Radeon R7, 12 Compute Cores 4C+8G":

test | before | after | diff
-----------------|------------|------------|----------
pmsg-shared 8 60 | 19,347,422 | 30,442,191 | + ~57.34 %
pmsg-shared 4 60 | 21,367,197 | 35,743,458 | + ~67.28 %
pmsg-shared 2 60 | 22,884,224 | 24,278,200 | + ~6.09 %

Link: http://lkml.kernel.org/r/1469748819-19484-2-git-send-email-dave@stgolabs.net
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Davidlohr Bueso
Acked-by: Peter Zijlstra (Intel)
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sebastian Andrzej Siewior
2016-10-12 06:06:33 +0800

03 Aug, 2016

1 commit

9b24fef9f sysv, ipc: fix security-layer leaking ... Browse Code »

Commit 53dad6d3a8e5 ("ipc: fix race with LSMs") updated ipc_rcu_putref()
to receive rcu freeing function but used generic ipc_rcu_free() instead
of msg_rcu_free() which does security cleaning.

Running LTP msgsnd06 with kmemleak gives the following:

cat /sys/kernel/debug/kmemleak

unreferenced object 0xffff88003c0a11f8 (size 8):
comm "msgsnd06", pid 1645, jiffies 4294672526 (age 6.549s)
hex dump (first 8 bytes):
1b 00 00 00 01 00 00 00 ........
backtrace:
kmemleak_alloc+0x23/0x40
kmem_cache_alloc_trace+0xe1/0x180
selinux_msg_queue_alloc_security+0x3f/0xd0
security_msg_queue_alloc+0x2e/0x40
newque+0x4e/0x150
ipcget+0x159/0x1b0
SyS_msgget+0x39/0x40
entry_SYSCALL_64_fastpath+0x13/0x8f

Manfred Spraul suggested to fix sem.c as well and Davidlohr Bueso to
only use ipc_rcu_free in case of security allocation failure in newary()

Fixes: 53dad6d3a8e ("ipc: fix race with LSMs")
Link: http://lkml.kernel.org/r/1470083552-22966-1-git-send-email-fabf@skynet.be
Signed-off-by: Fabian Frederick
Cc: Davidlohr Bueso
Cc: Manfred Spraul
Cc: [3.12+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2016-08-03 05:31:41 +0800

01 Oct, 2015

1 commit

b9a532277 Initialize msg/shm IPC objects before doing ipc_addid() ... Browse Code »

As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before
having initialized the IPC object state. Yes, we initialize the IPC
object in a locked state, but with all the lockless RCU lookup work,
that IPC object lock no longer means that the state cannot be seen.

We already did this for the IPC semaphore code (see commit e8577d1f0329:
"ipc/sem.c: fully initialize sem_array before making it visible") but we
clearly forgot about msg and shm.

Reported-by: Dmitry Vyukov
Cc: Manfred Spraul
Cc: Davidlohr Bueso
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds

Linus Torvalds
2015-10-01 00:48:40 +0800

01 Jul, 2015

2 commits

55b7ae501 ipc: rename ipc_obtain_object ... Browse Code »

... to ipc_obtain_object_idr, which is more meaningful and makes the code
slightly easier to follow.

Signed-off-by: Davidlohr Bueso
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2015-07-01 10:44:58 +0800
ff35e5ef8 ipc,msg: provide barrier pairings for lockless receive ... Browse Code »

We currently use a full barrier on the sender side to to avoid receiver
tasks disappearing on us while still performing on the sender side wakeup.
We lack however, the proper CPU-CPU interactions pairing on the receiver
side which busy-waits for the message. Similarly, we do not need a full
smp_mb, and can relax the semantics for the writer and reader sides of the
message. This is safe as we are only ordering loads and stores to r_msg.
And in both smp_wmb and smp_rmb, there are no stores after the calls
_anyway_.

This obviously applies for pipelined_send and expunge_all, for EIRDM when
destroying a queue.

Signed-off-by: Davidlohr Bueso
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2015-07-01 10:44:58 +0800

16 Apr, 2015

1 commit

7f032d6ef ipc: remove use of seq_printf return value ... Browse Code »

The seq_printf return value, because it's frequently misused,
will eventually be converted to void.

See: commit 1f33c41c03da ("seq_file: Rename seq_overflow() to
seq_has_overflowed() and make public")

Signed-off-by: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2015-04-16 07:35:24 +0800

14 Dec, 2014

1 commit

0050ee059 ipc/msg: increase MSGMNI, remove scaling ... Browse Code »

SysV can be abused to allocate locked kernel memory. For most systems, a
small limit doesn't make sense, see the discussion with regards to SHMMAX.

Therefore: increase MSGMNI to the maximum supported.

And: If we ignore the risk of locking too much memory, then an automatic
scaling of MSGMNI doesn't make sense. Therefore the logic can be removed.

The code preserves auto_msgmni to avoid breaking any user space applications
that expect that the value exists.

Notes:
1) If an administrator must limit the memory allocations, then he can set
MSGMNI as necessary.

Or he can disable sysv entirely (as e.g. done by Android).

2) MSGMAX and MSGMNB are intentionally not increased, as these values are used
to control latency vs. throughput:
If MSGMNB is large, then msgsnd() just returns and more messages can be queued
before a task switch to a task that calls msgrcv() is forced.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul
Cc: Davidlohr Bueso
Cc: Rafael Aquini
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Manfred Spraul
2014-12-14 04:42:52 +0800

07 Jun, 2014

6 commits

4bb6657dd ipc,msg: document volatile r_msg ... Browse Code »

The need for volatile is not obvious, document it.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Manfred Spraul
Cc: Aswin Chandramouleeswaran
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2014-06-07 07:08:15 +0800
3440a6bd1 ipc,msg: move some msgq ns code around ... Browse Code »

Nothing big and no logical changes, just get rid of some redundant
function declarations. Move msg_[init/exit]_ns down the end of the
file.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Manfred Spraul
Cc: Aswin Chandramouleeswaran
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2014-06-07 07:08:14 +0800
f75a2f358 ipc,msg: use current->state helpers ... Browse Code »

Call __set_current_state() instead of assigning the new state directly.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Manfred Spraul
Cc: Aswin Chandramouleeswaran
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2014-06-07 07:08:14 +0800
46c0a8ca3 ipc, kernel: clear whitespace ... Browse Code »

trailing whitespace

Signed-off-by: Paul McQuade
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul McQuade
2014-06-07 07:08:14 +0800
7153e4027 ipc, kernel: use Linux headers ... Browse Code »

Use #include instead of
Use #include instead of

Signed-off-by: Paul McQuade
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Paul McQuade
2014-06-07 07:08:14 +0800
eb66ec44f ipc: constify ipc_ops ... Browse Code »

There is no need to recreate the very same ipc_ops structure on every
kernel entry for msgget/semget/shmget. Just declare it static and be
done with it. While at it, constify it as we don't modify the structure
at runtime.

Found in the PaX patch, written by the PaX Team.

Signed-off-by: Mathias Krause
Cc: PaX Team
Cc: Davidlohr Bueso
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mathias Krause
2014-06-07 07:08:14 +0800

17 Mar, 2014

1 commit

4f87dac38 ipc: Fix 2 bugs in msgrcv() MSG_COPY implementation ... Browse Code »

While testing and documenting the msgrcv() MSG_COPY flag that Stanislav
Kinsbursky added in commit 4a674f34ba04 ("ipc: introduce message queue
copy feature" => kernel 3.8), I discovered a couple of bugs in the
implementation. The two bugs concern MSG_COPY interactions with other
msgrcv() flags, namely:

(A) MSG_COPY + MSG_EXCEPT
(B) MSG_COPY + !IPC_NOWAIT

The bugs are distinct (and the fix for the first one is obvious),
however my fix for both is a single-line patch, which is why I'm
combining them in a single mail, rather than writing two mails+patches.

===== (A) MSG_COPY + MSG_EXCEPT =====

With the addition of the MSG_COPY flag, there are now two msgrcv()
flags--MSG_COPY and MSG_EXCEPT--that modify the meaning of the 'msgtyp'
argument in unrelated ways. Specifying both in the same call is a
logical error that is currently permitted, with the effect that MSG_COPY
has priority and MSG_EXCEPT is ignored. The call should give an error
if both flags are specified. The patch below implements that behavior.

===== (B) (B) MSG_COPY + !IPC_NOWAIT =====

The test code that was submitted in commit 3a665531a3b7 ("selftests: IPC
message queue copy feature test") shows MSG_COPY being used in
conjunction with IPC_NOWAIT. In other words, if there is no message at
the position 'msgtyp'. return immediately with the error in ENOMSG.

What was not (fully) tested is the behavior if MSG_COPY is specified
*without* IPC_NOWAIT, and there is an odd behavior. If the queue
contains less than 'msgtyp' messages, then the call blocks until the
next message is written to the queue. At that point, the msgrcv() call
returns a copy of the newly added message, regardless of whether that
message is at the ordinal position 'msgtyp'. This is clearly bogus, and
problematic for applications that might want to make use of the MSG_COPY
flag.

I considered the following possible solutions to this problem:

(1) Force the call to block until a message *does* appear at the
position 'msgtyp'.

(2) If the MSG_COPY flag is specified, the kernel should implicitly add
IPC_NOWAIT, so that the call fails with ENOMSG for this case.

(3) If the MSG_COPY flag is specified, but IPC_NOWAIT is not, generate
an error (probably, EINVAL is the right one).

I do not know if any application would really want to have the
functionality of solution (1), especially since an application can
determine in advance the number of messages in the queue using msgctl()
IPC_STAT. Obviously, this solution would be the most work to implement.

Solution (2) would have the effect of silently fixing any applications
that tried to employ broken behavior. However, it would mean that if we
later decided to implement solution (1), then user-space could not
easily detect what the kernel supports (but, since I'm somewhat doubtful
that solution (1) is needed, I'm not sure that this is much of a
problem).

Solution (3) would have the effect of informing broken applications that
they are doing something broken. The downside is that this would cause
a ABI breakage for any applications that are currently employing the
broken behavior. However:

a) Those applications are almost certainly not getting the results they
expect.
b) Possibly, those applications don't even exist, because MSG_COPY is
currently hidden behind CONFIG_CHECKPOINT_RESTORE.

The upside of solution (3) is that if we later decided to implement
solution (1), user-space could determine what the kernel supports, via
the error return.

In my view, solution (3) is mildly preferable to solution (2), and
solution (1) could still be done later if anyone really cares. The
patch below implements solution (3).

PS. For anyone out there still listening, it's the usual story:
documenting an API (and the thinking about, and the testing of the API,
that documentation entails) is the one of the single best ways of
finding bugs in the API, as I've learned from a lot of experience. Best
to do that documentation before releasing the API.

Signed-off-by: Michael Kerrisk
Acked-by: Stanislav Kinsbursky
Cc: Stanislav Kinsbursky
Cc: stable@vger.kernel.org
Cc: Serge Hallyn
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Al Viro
Cc: KOSAKI Motohiro
Signed-off-by: Linus Torvalds

Michael Kerrisk
2014-03-17 01:41:04 +0800

28 Jan, 2014

3 commits

ffa571daf ipc,msg: document barriers ... Browse Code »

Both expunge_all() and pipeline_send() rely on both a nil msg value and
a full barrier to guarantee the correct ordering when waking up a task.

While its counterpart at the receiving end is well documented for the
lockless recv algorithm, we still need to document these specific
smp_mb() calls.

[akpm@linux-foundation.org: fix typo, per Mike]
[akpm@linux-foundation.org: mroe tpyos]
Signed-off-by: Davidlohr Bueso
Cc: Aswin Chandramouleeswaran
Cc: Rik van Riel
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2014-01-28 13:02:40 +0800
239521f31 ipc: whitespace cleanup ... Browse Code »

The ipc code does not adhere the typical linux coding style.
This patch fixes lots of simple whitespace errors.

- mostly autogenerated by
scripts/checkpatch.pl -f --fix \
--types=pointer_location,spacing,space_before_tab
- one manual fixup (keep structure members tab-aligned)
- removal of additional space_before_tab that were not found by --fix

Tested with some of my msg and sem test apps.

Andrew: Could you include it in -mm and move it towards Linus' tree?

Signed-off-by: Manfred Spraul
Suggested-by: Li Bin
Cc: Joe Perches
Acked-by: Rafael Aquini
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Manfred Spraul
2014-01-28 13:02:39 +0800
0f3d2b013 ipc: introduce ipc_valid_object() helper to sort out IPC_RMID races ... Browse Code »

After the locking semantics for the SysV IPC API got improved, a couple
of IPC_RMID race windows were opened because we ended up dropping the
'kern_ipc_perm.deleted' check performed way down in ipc_lock(). The
spotted races got sorted out by re-introducing the old test within the
racy critical sections.

This patch introduces ipc_valid_object() to consolidate the way we cope
with IPC_RMID races by using the same abstraction across the API
implementation.

Signed-off-by: Rafael Aquini
Acked-by: Rik van Riel
Acked-by: Greg Thelen
Reviewed-by: Davidlohr Bueso
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rafael Aquini
2014-01-28 13:02:39 +0800

01 Oct, 2013

1 commit

4271b05a2 ipc,msg: prevent race with rmid in msgsnd,msgrcv ... Browse Code »

This fixes a race in both msgrcv() and msgsnd() between finding the msg
and actually dealing with the queue, as another thread can delete shmid
underneath us if we are preempted before acquiring the
kern_ipc_perm.lock.

Manfred illustrates this nicely:

Assume a preemptible kernel that is preempted just after

msq = msq_obtain_object_check(ns, msqid)

in do_msgrcv(). The only lock that is held is rcu_read_lock().

Now the other thread processes IPC_RMID. When the first task is
resumed, then it will happily wait for messages on a deleted queue.

Fix this by checking for if the queue has been deleted after taking the
lock.

Signed-off-by: Davidlohr Bueso
Reported-by: Manfred Spraul
Cc: Rik van Riel
Cc: Mike Galbraith
Cc: [3.11]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-10-01 05:31:03 +0800