Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

17 Mar, 2014

1 commit

4f87dac38 ipc: Fix 2 bugs in msgrcv() MSG_COPY implementation

While testing and documenting the msgrcv() MSG_COPY flag that Stanislav
Kinsbursky added in commit 4a674f34ba04 ("ipc: introduce message queue
copy feature" => kernel 3.8), I discovered a couple of bugs in the
implementation. The two bugs concern MSG_COPY interactions with other
msgrcv() flags, namely:

(A) MSG_COPY + MSG_EXCEPT
(B) MSG_COPY + !IPC_NOWAIT

The bugs are distinct (and the fix for the first one is obvious),
however my fix for both is a single-line patch, which is why I'm
combining them in a single mail, rather than writing two mails+patches.

===== (A) MSG_COPY + MSG_EXCEPT =====

With the addition of the MSG_COPY flag, there are now two msgrcv()
flags--MSG_COPY and MSG_EXCEPT--that modify the meaning of the 'msgtyp'
argument in unrelated ways. Specifying both in the same call is a
logical error that is currently permitted, with the effect that MSG_COPY
has priority and MSG_EXCEPT is ignored. The call should give an error
if both flags are specified. The patch below implements that behavior.

===== (B) MSG_COPY + !IPC_NOWAIT =====

The test code that was submitted in commit 3a665531a3b7 ("selftests: IPC
message queue copy feature test") shows MSG_COPY being used in
conjunction with IPC_NOWAIT. In other words, if there is no message at
the position 'msgtyp', return immediately with the error ENOMSG.

What was not (fully) tested is the behavior if MSG_COPY is specified
*without* IPC_NOWAIT, and there the behavior is odd. If the queue
contains fewer than 'msgtyp' messages, then the call blocks until the
next message is written to the queue. At that point, the msgrcv() call
returns a copy of the newly added message, regardless of whether that
message is at the ordinal position 'msgtyp'. This is clearly bogus, and
problematic for applications that might want to make use of the MSG_COPY
flag.

I considered the following possible solutions to this problem:

(1) Force the call to block until a message *does* appear at the
position 'msgtyp'.

(2) If the MSG_COPY flag is specified, the kernel should implicitly add
IPC_NOWAIT, so that the call fails with ENOMSG for this case.

(3) If the MSG_COPY flag is specified, but IPC_NOWAIT is not, generate
an error (probably, EINVAL is the right one).

I do not know if any application would really want to have the
functionality of solution (1), especially since an application can
determine in advance the number of messages in the queue using msgctl()
IPC_STAT. Obviously, this solution would be the most work to implement.
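The point about msgctl() IPC_STAT can be illustrated with a small user-space sketch. The helper name queue_message_count() is hypothetical and not part of the patch; it only wraps the standard msgctl() call and reads the msg_qnum field of struct msqid_ds.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/*
 * Hypothetical helper: return the number of messages currently on the
 * queue, or -1 on error (errno is set by msgctl()).
 */
long queue_message_count(int msqid)
{
    struct msqid_ds ds;

    if (msgctl(msqid, IPC_STAT, &ds) == -1)
        return -1;
    return (long)ds.msg_qnum;
}
```

An application could compare this count with the position it intends to pass as 'msgtyp' before issuing a MSG_COPY request. The count is only a snapshot, so another process may still change the queue in between.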

Solution (2) would have the effect of silently fixing any applications
that tried to employ broken behavior. However, it would mean that if we
later decided to implement solution (1), then user-space could not
easily detect what the kernel supports (but, since I'm somewhat doubtful
that solution (1) is needed, I'm not sure that this is much of a
problem).

Solution (3) would have the effect of informing broken applications that
they are doing something broken. The downside is that this would cause
an ABI breakage for any applications that are currently employing the
broken behavior. However:

a) Those applications are almost certainly not getting the results they
expect.
b) Possibly, those applications don't even exist, because MSG_COPY is
currently hidden behind CONFIG_CHECKPOINT_RESTORE.

The upside of solution (3) is that if we later decided to implement
solution (1), user-space could determine what the kernel supports, via
the error return.

In my view, solution (3) is mildly preferable to solution (2), and
solution (1) could still be done later if anyone really cares. The
patch below implements solution (3).
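As a rough user-space model of the rule described above (not the kernel code itself), the flag check for (A) and for solution (3) could look like the following sketch. The function name check_msg_copy_flags() is hypothetical, and the fallback flag values are assumptions taken from the Linux uapi header, used only when the libc headers do not expose them.

```c
#include <errno.h>
#include <sys/ipc.h>

/* Fallback definitions; assumed to match the Linux uapi <linux/msg.h>. */
#ifndef MSG_EXCEPT
#define MSG_EXCEPT 020000
#endif
#ifndef MSG_COPY
#define MSG_COPY 040000
#endif

/*
 * Hypothetical model of the flag validation: MSG_COPY may not be
 * combined with MSG_EXCEPT, and it requires IPC_NOWAIT.
 * Returns 0 if the combination is acceptable, -EINVAL otherwise.
 */
int check_msg_copy_flags(int msgflg)
{
    if (msgflg & MSG_COPY) {
        if (msgflg & MSG_EXCEPT)
            return -EINVAL;     /* case (A) */
        if (!(msgflg & IPC_NOWAIT))
            return -EINVAL;     /* case (B), solution (3) */
    }
    return 0;
}
```

Flags that do not include MSG_COPY are left untouched by this check, so existing callers of MSG_EXCEPT or blocking msgrcv() are unaffected.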

PS. For anyone out there still listening, it's the usual story:
documenting an API (and the thinking about, and the testing of the API,
that documentation entails) is one of the single best ways of
finding bugs in the API, as I've learned from a lot of experience. Best
to do that documentation before releasing the API.

Signed-off-by: Michael Kerrisk
Acked-by: Stanislav Kinsbursky
Cc: Stanislav Kinsbursky
Cc: stable@vger.kernel.org
Cc: Serge Hallyn
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Al Viro
Cc: KOSAKI Motohiro
Signed-off-by: Linus Torvalds

Michael Kerrisk
2014-03-17 01:41:04 +0800

28 Jan, 2014

3 commits

ffa571daf ipc,msg: document barriers

Both expunge_all() and pipeline_send() rely on both a nil msg value and
a full barrier to guarantee the correct ordering when waking up a task.

While its counterpart at the receiving end is well documented for the
lockless recv algorithm, we still need to document these specific
smp_mb() calls.

[akpm@linux-foundation.org: fix typo, per Mike]
[akpm@linux-foundation.org: mroe tpyos]
Signed-off-by: Davidlohr Bueso
Cc: Aswin Chandramouleeswaran
Cc: Rik van Riel
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2014-01-28 13:02:40 +0800
239521f31 ipc: whitespace cleanup

The ipc code does not adhere to the typical Linux coding style.
This patch fixes lots of simple whitespace errors.

- mostly autogenerated by
scripts/checkpatch.pl -f --fix \
--types=pointer_location,spacing,space_before_tab
- one manual fixup (keep structure members tab-aligned)
- removal of additional space_before_tab that were not found by --fix

Tested with some of my msg and sem test apps.

Andrew: Could you include it in -mm and move it towards Linus' tree?

Signed-off-by: Manfred Spraul
Suggested-by: Li Bin
Cc: Joe Perches
Acked-by: Rafael Aquini
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Manfred Spraul
2014-01-28 13:02:39 +0800
0f3d2b013 ipc: introduce ipc_valid_object() helper to sort out IPC_RMID races

After the locking semantics for the SysV IPC API got improved, a couple
of IPC_RMID race windows were opened because we ended up dropping the
'kern_ipc_perm.deleted' check performed way down in ipc_lock(). The
spotted races got sorted out by re-introducing the old test within the
racy critical sections.

This patch introduces ipc_valid_object() to consolidate the way we cope
with IPC_RMID races by using the same abstraction across the API
implementation.

Signed-off-by: Rafael Aquini
Acked-by: Rik van Riel
Acked-by: Greg Thelen
Reviewed-by: Davidlohr Bueso
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rafael Aquini
2014-01-28 13:02:39 +0800

01 Oct, 2013

1 commit

4271b05a2 ipc,msg: prevent race with rmid in msgsnd,msgrcv

This fixes a race in both msgrcv() and msgsnd() between finding the msg
and actually dealing with the queue, as another thread can delete msqid
underneath us if we are preempted before acquiring the
kern_ipc_perm.lock.

Manfred illustrates this nicely:

Assume a preemptible kernel that is preempted just after

msq = msq_obtain_object_check(ns, msqid)

in do_msgrcv(). The only lock that is held is rcu_read_lock().

Now the other thread processes IPC_RMID. When the first task is
resumed, then it will happily wait for messages on a deleted queue.

Fix this by checking whether the queue has been deleted after taking the
lock.
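The pattern of re-checking for deletion once the lock is held can be modelled in user space. The sketch below is only an analogy, assuming a pthread mutex in place of kern_ipc_perm.lock; the names model_queue, model_lock_checked() and model_rmid() are hypothetical. It also assumes the caller keeps the structure alive, which in the kernel is what the RCU read lock provides.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a message queue object. */
struct model_queue {
    pthread_mutex_t lock;
    bool deleted;   /* set by the remover while holding the lock */
};

/*
 * Take the lock, then verify that the object was not removed between
 * the lookup and the lock acquisition. On success the lock is held;
 * on failure it has been released and -EIDRM is returned.
 */
int model_lock_checked(struct model_queue *q)
{
    pthread_mutex_lock(&q->lock);
    if (q->deleted) {
        pthread_mutex_unlock(&q->lock);
        return -EIDRM;
    }
    return 0;
}

/* Models IPC_RMID: mark the object as deleted under the lock. */
void model_rmid(struct model_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->deleted = true;
    pthread_mutex_unlock(&q->lock);
}
```

Without the check inside model_lock_checked(), a task that looked up the queue before the removal would go on to wait on an object that no longer exists from the user's point of view.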

Signed-off-by: Davidlohr Bueso
Reported-by: Manfred Spraul
Cc: Rik van Riel
Cc: Mike Galbraith
Cc: [3.11]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-10-01 05:31:03 +0800

25 Sep, 2013

1 commit

53dad6d3a ipc: fix race with LSMs

Currently, IPC mechanisms do security and auditing related checks under
RCU. However, since security modules can free the security structure,
for example, through selinux_[sem,msg_queue,shm]_free_security(), we can
race if the structure is freed before other tasks are done with it,
creating a use-after-free condition. Manfred illustrates this nicely,
for instance with shared mem and selinux:

-> do_shmat calls rcu_read_lock()
-> do_shmat calls shm_object_check().
Checks that the object is still valid - but doesn't acquire any locks.
Then it returns.
-> do_shmat calls security_shm_shmat (e.g. selinux_shm_shmat)
-> selinux_shm_shmat calls ipc_has_perm()
-> ipc_has_perm accesses ipc_perms->security

shm_close()
-> shm_close acquires rw_mutex & shm_lock
-> shm_close calls shm_destroy
-> shm_destroy calls security_shm_free (e.g. selinux_shm_free_security)
-> selinux_shm_free_security calls ipc_free_security(&shp->shm_perm)
-> ipc_free_security calls kfree(ipc_perms->security)

This patch delays the freeing of the security structures until after all RCU
readers are done. Furthermore it aligns the security life cycle with
that of the rest of IPC - freeing them based on the reference counter.
For situations where we need not free security, the current behavior is
kept. Linus states:

"... the old behavior was suspect for another reason too: having the
security blob go away from under a user sounds like it could cause
various other problems anyway, so I think the old code was at least
_prone_ to bugs even if it didn't have catastrophic behavior."

I have tested this patch with IPC testcases from LTP on both my
quad-core laptop and on a 64 core NUMA server. In both cases selinux is
enabled, and tests pass for both voluntary and forced preemption models.
While the mentioned races are theoretical (at least no one has reported
them), I wanted to make sure that this new logic doesn't break anything
we weren't aware of.

Suggested-by: Linus Torvalds
Signed-off-by: Davidlohr Bueso
Acked-by: Manfred Spraul
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-09-25 00:36:53 +0800

12 Sep, 2013

2 commits

4718787d1 ipc,msg: drop msg_unlock

There is only one user left, drop this function and just call
ipc_unlock_object() and rcu_read_unlock().

Signed-off-by: Davidlohr Bueso
Tested-by: Sedat Dilek
Cc: Rik van Riel
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-09-12 06:59:42 +0800
d9a605e40 ipc: rename ids->rw_mutex

Since in some situations the lock can be shared for readers, we shouldn't
be calling it a mutex, rename it to rwsem.

Signed-off-by: Davidlohr Bueso
Tested-by: Sedat Dilek
Cc: Rik van Riel
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-09-12 06:59:42 +0800

04 Sep, 2013

1 commit

bebcb928c ipc/msg.c: Fix lost wakeup in msgsnd().

The check if the queue is full and adding current to the wait queue of
pending msgsnd() operations (ss_add()) must be atomic.

Otherwise:
- the thread that performs msgsnd() finds a full queue and decides to
sleep.
- the thread that performs msgrcv() first reads all messages from the
queue and then sleeps, because the queue is empty.
- the msgrcv() calls do not perform any wakeups, because the msgsnd()
task has not yet called ss_add().
- then the msgsnd()-thread first calls ss_add() and then sleeps.

Net result: msgsnd() and msgrcv() both sleep forever.

Observed with msgctl08 from ltp with a preemptible kernel.

Fix: Call ipc_lock_object() before performing the check.

The patch also moves security_msg_queue_msgsnd() under ipc_lock_object:
- msgctl(IPC_SET) explicitly mentions that it tries to expunge any
pending operations that are not allowed anymore with the new
permissions. If security_msg_queue_msgsnd() is called without locks,
then there might be races.
- it makes the patch much simpler.
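The lost-wakeup scenario described above has a familiar user-space counterpart. The following sketch is an analogy only, assuming a pthread mutex and condition variables rather than the kernel's ipc_lock_object() and ss_add(); the names bounded_queue, bq_init(), bq_send() and bq_recv() are hypothetical. The key property is that the "queue full" check and the act of going to sleep happen under the same lock, so a receiver cannot slip in between them.

```c
#include <pthread.h>

#define QCAP 4  /* illustrative capacity */

struct bounded_queue {
    pthread_mutex_t lock;
    pthread_cond_t not_full;
    pthread_cond_t not_empty;
    int items[QCAP];
    int head;
    int count;
};

void bq_init(struct bounded_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_full, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    q->head = 0;
    q->count = 0;
}

void bq_send(struct bounded_queue *q, int v)
{
    pthread_mutex_lock(&q->lock);
    /* The check and the wait are atomic with respect to the lock:
     * pthread_cond_wait() releases it only once we are queued. */
    while (q->count == QCAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % QCAP] = v;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

int bq_recv(struct bounded_queue *q)
{
    int v;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    v = q->items[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_cond_signal(&q->not_full);  /* wake a sleeping sender */
    pthread_mutex_unlock(&q->lock);
    return v;
}
```

If bq_send() tested q->count before taking the lock, a receiver could drain the queue and signal not_full before the sender started waiting, which is the same sequence that left msgsnd() and msgrcv() asleep forever.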

Reported-and-tested-by: Vineet Gupta
Acked-by: Rik van Riel
Cc: stable@vger.kernel.org # for 3.11
Signed-off-by: Manfred Spraul
Signed-off-by: Linus Torvalds

Manfred Spraul
2013-09-04 01:42:56 +0800

29 Aug, 2013

1 commit

368ae537e IPC: bugfix for msgrcv with msgtyp < 0

According to 'man msgrcv': "If msgtyp is less than 0, the first message of
the lowest type that is less than or equal to the absolute value of msgtyp
shall be received."

Bug: The kernel only returns a message if its type is 1; other messages
with type < abs(msgtyp) will never get returned.

Fix: After having traversed the list to find the first message with the
lowest type, we need to actually return that message.
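The intended search for msgtyp < 0 can be sketched over a plain array of message types. This is a simplified model, not the kernel's list-based code; the function name find_lowest_type_le() is hypothetical, and it assumes msgtyp is negative and greater than LONG_MIN so that negating it does not overflow.

```c
/*
 * Hypothetical model of the msgtyp < 0 search: among the queued
 * message types (in queue order), return the index of the first
 * message whose type is the lowest one that is <= abs(msgtyp),
 * or -1 if no message qualifies.
 */
int find_lowest_type_le(const long *types, int n, long msgtyp)
{
    long limit = -msgtyp;   /* assumes LONG_MIN < msgtyp < 0 */
    int found = -1;
    int i;

    for (i = 0; i < n; i++) {
        if (types[i] <= limit) {
            found = i;
            if (types[i] == 1)
                break;              /* no lower type is possible */
            limit = types[i] - 1;   /* only strictly lower types win now */
        }
    }
    return found;   /* the candidate remembered during the scan */
}
```

The final return statement is the step the fix restores: the candidate remembered during the traversal has to be handed back even when the loop ends without meeting a message of type 1.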

This regression was introduced by commit daaf74cf0867 ("ipc: refactor
msg list search into separate function")

Signed-off-by: Svenning Soerensen
Reviewed-by: Peter Hurley
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Svenning Sørensen
2013-08-29 10:26:38 +0800

10 Jul, 2013

10 commits

9ad66ae65 ipc: remove unused functions

We can now drop the msg_lock and msg_lock_check functions along with a
bogus comment introduced previously in semctl_down.

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:27 +0800
41a0d523d ipc,msg: shorten critical region in msgrcv

do_msgrcv() is the last msg queue function that abuses the ipc lock. Take
it only when needed, when actually updating msq.

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Tested-by: Sedat Dilek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:27 +0800
3dd1f784e ipc,msg: shorten critical region in msgsnd

do_msgsnd() is another function that does too many things with the ipc
object lock acquired. Take it only when needed when actually updating
msq.

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:27 +0800
ac0ba20ea ipc,msg: make msgctl_nolock lockless

While the INFO cmd doesn't take the ipc lock, the STAT commands do
acquire it unnecessarily. We can do the permissions and security checks
only holding the rcu lock.

This function now mimics semctl_nolock().

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:27 +0800
a5001a0d9 ipc,msg: introduce lockless functions to obtain the ipc object

Add msq_obtain_object() and msq_obtain_object_check(), which will allow
us to get the ipc object without acquiring the lock. Just as with
semaphores, these functions are basically wrappers around
ipc_obtain_object*().

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:27 +0800
2cafed30f ipc,msg: introduce msgctl_nolock

Similar to semctl, when calling msgctl, the *_INFO and *_STAT commands
can be performed without acquiring the ipc object.

Add a msgctl_nolock() function and move the logic of *_INFO and *_STAT
out of msgctl(). This change still takes the lock and it will be
properly lockless in the next patch.

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:27 +0800
15724ecb7 ipc,msg: shorten critical region in msgctl_down

Instead of holding the ipc lock for the entire function, use the
ipcctl_pre_down_nolock and only acquire the lock for specific commands:
RMID and SET.

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:27 +0800
7b4cc5d84 ipc: move locking out of ipcctl_pre_down_nolock

This function currently acquires both the rw_mutex and the rcu lock on
successful lookups, leaving the callers to explicitly unlock them,
creating another two level locking situation.

Make the callers (including those that still use ipcctl_pre_down())
explicitly lock and unlock the rwsem and rcu lock.

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:27 +0800
cf9d5d78d ipc: close open coded spin lock calls

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:27 +0800
dbfcd91f0 ipc: move rcu lock out of ipc_addid

This patchset continues the work that began in the sysv ipc semaphore
scaling series, see

https://lkml.org/lkml/2013/3/20/546

Just like semaphores used to be, sysv shared memory and msg queues also
abuse the ipc lock, unnecessarily holding it for operations such as
permission and security checks.

This patchset mostly deals with mqueues, and while shared mem can be
done in a very similar way, I want to get these patches out in the open
first. It also does some pending cleanups, mostly focused on the two
level locking we have in ipc code, taking care of ipc_addid() and
ipcctl_pre_down_nolock() - yes there are still functions that need to be
updated as well.

This patch:

Make all callers explicitly take and release the RCU read lock.

This addresses the two level locking seen in newary(), newseg() and
newqueue(). For the last two, explicitly unlock the ipc object and the
rcu lock, instead of calling the custom shm_unlock and msg_unlock
functions. The next patch will deal with the open coded locking for
->perm.lock

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2013-07-10 01:33:26 +0800

01 May, 2013

6 commits

41239fe82 ipc/msg.c: use list_for_each_entry_[safe] for list traversing

The ipc/msg.c code does its list operations by hand and it open-codes the
accesses, instead of using list_for_each_entry_[safe].

Signed-off-by: Nikola Pajkovsky
Cc: Stanislav Kinsbursky
Cc: "Eric W. Biederman"
Cc: Peter Hurley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nikola Pajkovsky
2013-05-01 23:12:58 +0800
6062a8dc0 ipc,sem: fine grained locking for semtimedop

Introduce finer grained locking for semtimedop, to handle the common case
of a program wanting to manipulate one semaphore from an array with
multiple semaphores.

If the call is a semop manipulating just one semaphore in an array with
multiple semaphores, only take the lock for that semaphore itself.

If the call needs to manipulate multiple semaphores, or another caller is
in a transaction that manipulates multiple semaphores, the sem_array lock
is taken, as well as all the locks for the individual semaphores.

On a 24 CPU system, performance numbers with the semop-multi
test with N threads and N semaphores, look like this:

threads   vanilla   Davidlohr's   Davidlohr's +    Davidlohr's +
                    patches       rwlock patches   v3 patches
10        610652    726325        1783589          2142206
20        341570    365699        1520453          1977878
30        288102    307037        1498167          2037995
40        290714    305955        1612665          2256484
50        288620    312890        1733453          2650292
60        289987    306043        1649360          2388008
70        291298    306347        1723167          2717486
80        290948    305662        1729545          2763582
90        290996    306680        1736021          2757524
100       292243    306700        1773700          3059159

[davidlohr.bueso@hp.com: do not call sem_lock when bogus sma]
[davidlohr.bueso@hp.com: make refcounter atomic]
Signed-off-by: Rik van Riel
Suggested-by: Linus Torvalds
Acked-by: Davidlohr Bueso
Cc: Chegu Vinod
Cc: Jason Low
Reviewed-by: Michel Lespinasse
Cc: Peter Hurley
Cc: Stanislav Kinsbursky
Tested-by: Emmanuel Benisty
Tested-by: Sedat Dilek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rik van Riel
2013-05-01 23:12:58 +0800
daaf74cf0 ipc: refactor msg list search into separate function

[fengguang.wu@intel.com: find_msg can be static]
Signed-off-by: Peter Hurley
Cc: Fengguang Wu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Hurley
2013-05-01 23:12:57 +0800
d076ac911 ipc: simplify msg list search

Signed-off-by: Peter Hurley
Acked-by: Stanislav Kinsbursky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Hurley
2013-05-01 23:12:57 +0800
8ac6ed585 ipc: implement MSG_COPY as a new receive mode

Teach the helper routines about MSG_COPY so that msgtyp is preserved as
the message number to copy.

The security functions affected by this change were audited and no
additional changes are necessary.

Signed-off-by: Peter Hurley
Acked-by: Stanislav Kinsbursky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Hurley
2013-05-01 23:12:57 +0800
852028af8 ipc: remove msg handling from queue scan

In preparation for refactoring the queue scan into a separate
function, relocate msg copying.

Signed-off-by: Peter Hurley
Acked-by: Stanislav Kinsbursky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Hurley
2013-05-01 23:12:57 +0800

03 Apr, 2013

1 commit

2dc958fa2 ipc: set msg back to -EAGAIN if copy wasn't performed

Make sure that the msg pointer is set back to an error value if the
MSG_COPY flag is set and the desired message to copy wasn't found. This
guarantees that msg is either an error pointer or a copy address.

Otherwise the last message in queue will be freed without unlinking from
the queue (which leads to memory corruption) and the dummy allocated
copy won't be released.

Signed-off-by: Stanislav Kinsbursky
Signed-off-by: Linus Torvalds

Stanislav Kinsbursky
2013-04-03 01:09:01 +0800

09 Mar, 2013

1 commit

88b9e456b ipc: don't allocate a copy larger than max

When MSG_COPY is set, a duplicate message must be allocated for the copy
before locking the queue. However, the copy cannot be larger than what was
sent, which is limited to msg_ctlmax.

Signed-off-by: Peter Hurley
Acked-by: Stanislav Kinsbursky
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Hurley
2013-03-09 07:05:33 +0800

05 Jan, 2013

7 commits

3fcfe7865 ipc: add more comments to message copying related code

Signed-off-by: Stanislav Kinsbursky
Cc: "Eric W. Biederman"
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stanislav Kinsbursky
2013-01-05 08:11:46 +0800
51eeacaa0 ipc: simplify message copying

Remove the redundant and confusing fill_copy(). Also add a check of
copy_msg() for errors. In this case the function has to return instead
of break, because further code interprets any error as EAGAIN.

Also define copy_msg() for the case when CONFIG_CHECKPOINT_RESTORE is
disabled.

Signed-off-by: Stanislav Kinsbursky
Cc: "Eric W. Biederman"
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stanislav Kinsbursky
2013-01-05 08:11:46 +0800
b30efe277 ipc: convert prepare_copy() from macro to function

This code works if CONFIG_CHECKPOINT_RESTORE is disabled.

[akpm@linux-foundation.org: remove __maybe_unused]
Signed-off-by: Stanislav Kinsbursky
Cc: "Eric W. Biederman"
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stanislav Kinsbursky
2013-01-05 08:11:46 +0800
85398aa8d ipc: simplify free_copy() call

Passing and checking of msgflg to free_copy() is redundant. This patch
sets copy to NULL on declaration instead and checks for non-NULL in
free_copy().

Note: in case of copy allocation failure, error is returned immediately.
So no need to check for IS_ERR() in free_copy().

Signed-off-by: Stanislav Kinsbursky
Cc: "Eric W. Biederman"
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stanislav Kinsbursky
2013-01-05 08:11:45 +0800
4a674f34b ipc: introduce message queue copy feature

This patch is required for checkpoint/restore in userspace.

c/r requires some way to get all pending IPC messages without deleting
them from the queue (checkpoint can fail and in this case tasks will be
resumed, so the queue has to remain valid).

To achieve this, a new operation flag MSG_COPY for the sys_msgrcv() system
call was introduced. If this flag is specified, then msgtyp is interpreted
as the number of the message to copy.

If MSG_COPY is set, then the kernel will allocate a dummy message of the
passed size, and then use the new copy_msg() helper function to copy the
desired message (instead of unlinking it from the queue).
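From user space, the feature might be exercised roughly as in the sketch below. The names peek_buf, PEEK_TEXT_MAX and peek_message() are hypothetical, and the fallback value for MSG_COPY is an assumption taken from the Linux uapi header for cases where the libc headers do not define it. IPC_NOWAIT is passed as well, matching how the selftest for this feature uses the flag.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Fallback definition; assumed to match the Linux uapi <linux/msg.h>. */
#ifndef MSG_COPY
#define MSG_COPY 040000
#endif

#define PEEK_TEXT_MAX 64    /* illustrative payload limit */

struct peek_buf {
    long mtype;
    char mtext[PEEK_TEXT_MAX];
};

/*
 * Hypothetical helper: copy the message at ordinal position 'index'
 * (counted from 0) into 'out' without removing it from the queue.
 * Returns the number of payload bytes copied, or -1 with errno set
 * (for example ENOMSG if there is no message at that position, or
 * ENOSYS if the kernel lacks CONFIG_CHECKPOINT_RESTORE).
 */
ssize_t peek_message(int msqid, long index, struct peek_buf *out)
{
    return msgrcv(msqid, out, sizeof(out->mtext), index,
                  MSG_COPY | IPC_NOWAIT);
}
```

A checkpoint tool could call such a helper for positions 0, 1, 2, ... until it fails with ENOMSG, leaving the queue contents intact for the case where the tasks are resumed.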

Notes:

1) Return -ENOSYS if MSG_COPY is specified, but
CONFIG_CHECKPOINT_RESTORE is not set.

Signed-off-by: Stanislav Kinsbursky
Cc: Serge Hallyn
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Al Viro
Cc: KOSAKI Motohiro
Cc: Michael Kerrisk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stanislav Kinsbursky
2013-01-05 08:11:45 +0800
f9dd87f47 ipc: message queue receive cleanup

Move all message related manipulation into one function msg_fill().
Actually, two functions because of the compat one.

[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Stanislav Kinsbursky
Cc: Serge Hallyn
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Al Viro
Cc: KOSAKI Motohiro
Cc: Michael Kerrisk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stanislav Kinsbursky
2013-01-05 08:11:45 +0800
9afdacda0 ipc: remove forced assignment of selected message

This is a cleanup patch. The assignment is redundant.

Signed-off-by: Stanislav Kinsbursky
Cc: Serge Hallyn
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Al Viro
Cc: KOSAKI Motohiro
Cc: Michael Kerrisk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stanislav Kinsbursky
2013-01-05 08:11:45 +0800

07 Sep, 2012

1 commit

1efdb69b0 userns: Convert ipc to use kuid and kgid where appropriate

- Store the ipc owner and creator with a kuid
- Store the ipc group and the creator's group with a kgid.
- Add error handling to ipc_update_perms, allowing it to
fail if the uids and gids can not be converted to kuids
or kgids.
- Modify the proc files to display the ipc creator and
owner in the user namespace of the opener of the proc file.

Signed-off-by: Eric W. Biederman

Eric W. Biederman
2012-09-07 13:17:20 +0800

31 Mar, 2011

1 commit

25985edce Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi

Lucas De Marchi
2011-03-31 22:26:23 +0800

24 Mar, 2011

1 commit

b0e77598f userns: user namespaces: convert several capable() calls

CAP_IPC_OWNER and CAP_IPC_LOCK can be checked against current_user_ns(),
because the resource comes from current's own ipc namespace.

setuid/setgid are to uids in own namespace, so again checks can be against
current_user_ns().

Changelog:
Jan 11: Use task_ns_capable() in place of sched_capable().
Jan 11: Use nsown_capable() as suggested by Bastian Blank.
Jan 11: Clarify (hopefully) some logic in futex and sched.c
Feb 15: use ns_capable for ipc, not nsown_capable
Feb 23: let copy_ipcs handle setting ipc_ns->user_ns
Feb 23: pass ns down rather than taking it from current

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn
Acked-by: "Eric W. Biederman"
Acked-by: Daniel Lezcano
Acked-by: David Howells
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2011-03-24 10:47:08 +0800

25 May, 2010

1 commit

4be929be3 kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN

- C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not
USHORT_MAX/SHORT_MAX/SHORT_MIN.

- Make SHRT_MIN of type s16, not int, for consistency.

[akpm@linux-foundation.org: fix drivers/dma/timb_dma.c]
[akpm@linux-foundation.org: fix security/keys/keyring.c]
Signed-off-by: Alexey Dobriyan
Acked-by: WANG Cong
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2010-05-25 23:07:02 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800