06 Sep, 2020
1 commit
-
Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer. Adjust the
signature of proc_ipc_sem_dointvec to match ctl_table.proc_handler which
fixes the following sparse error/warning:ipc/ipc_sysctl.c:94:47: warning: incorrect type in argument 3 (different address spaces)
ipc/ipc_sysctl.c:94:47: expected void *buffer
ipc/ipc_sysctl.c:94:47: got void [noderef] __user *buffer
ipc/ipc_sysctl.c:194:35: warning: incorrect type in initializer (incompatible argument 3 (different address spaces))
ipc/ipc_sysctl.c:194:35: expected int ( [usertype] *proc_handler )( ... )
ipc/ipc_sysctl.c:194:35: got int ( * )( ... )Fixes: 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
Signed-off-by: Tobias Klauser
Signed-off-by: Andrew Morton
Cc: Christoph Hellwig
Cc: Alexander Viro
Link: https://lkml.kernel.org/r/20200825105846.5193-1-tklauser@distanz.ch
Signed-off-by: Linus Torvalds
27 Apr, 2020
1 commit
-
Instead of having all the sysctl handlers deal with user pointers, which
is rather hairy in terms of the BPF interaction, copy the input to and
from userspace in common code. This also means that the strings are
always NUL-terminated by the common code, making the API a little bit
safer.As most handler just pass through the data to one of the common handlers
a lot of the changes are mechnical.Signed-off-by: Christoph Hellwig
Acked-by: Andrey Ignatov
Signed-off-by: Al Viro
19 Jul, 2019
1 commit
-
In the sysctl code the proc_dointvec_minmax() function is often used to
validate the user supplied value between an allowed range. This
function uses the extra1 and extra2 members from struct ctl_table as
minimum and maximum allowed value.On sysctl handler declaration, in every source file there are some
readonly variables containing just an integer which address is assigned
to the extra1 and extra2 members, so the sysctl range is enforced.The special values 0, 1 and INT_MAX are very often used as range
boundary, leading duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:$ git grep -E '\.extra[12].*&(zero|one|int_max)' |wc -l
248Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:# scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
Data old new delta
sysctl_vals - 12 +12
__kstrtab_sysctl_vals - 12 +12
max 14 10 -4
int_max 16 - -16
one 68 - -68
zero 128 28 -100
Total: Before=20583249, After=20583085, chg -0.00%[mcroce@redhat.com: tipc: remove two unused variables]
Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
[akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
[arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
[akpm@linux-foundation.org: fix fs/eventpoll.c]
Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
Signed-off-by: Matteo Croce
Signed-off-by: Arnd Bergmann
Acked-by: Kees Cook
Reviewed-by: Aaron Tomlin
Cc: Matthew Wilcox
Cc: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
05 Jun, 2019
1 commit
-
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation version 2 of the licenseextracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 315 file(s).
Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Reviewed-by: Armijn Hemel
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
Signed-off-by: Greg Kroah-Hartman
15 May, 2019
2 commits
-
For ipcmni_extend mode, the sequence number space is only 7 bits. So
the chance of id reuse is relatively high compared with the non-extended
mode.To alleviate this id reuse problem, this patch enables cyclic allocation
for the index to the radix tree (idx). The disadvantage is that this
can cause a slight slow-down of the fast path, as the radix tree could
be higher than necessary.To limit the radix tree height, I have chosen the following limits:
1) The cycling is done over in_use*1.5.
2) At least, the cycling is done over
"normal" ipcnmi mode: RADIX_TREE_MAP_SIZE elements
"ipcmni_extended": 4096 elementsResult:
- for normal mode:
No change for 4095 active objects until the 3rd level
is added without cyclic allocation.For a 2-level radix tree compared to a 1-level radix tree, I have
observed < 1% performance impact.Notes:
1) Normal "x=semget();y=semget();" is unaffected: Then the idx
is e.g. a and a+1, regardless if idr_alloc() or idr_alloc_cyclic()
is used.2) The -1% happens in a microbenchmark after this situation:
x=semget();
for(i=0;i<
Acked-by: Waiman Long
Cc: "Luis R. Rodriguez"
Cc: Kees Cook
Cc: Jonathan Corbet
Cc: Al Viro
Cc: Matthew Wilcox
Cc: "Eric W . Biederman"
Cc: Takashi Iwai
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The maximum number of unique System V IPC identifiers was limited to
32k. That limit should be big enough for most use cases.However, there are some users out there requesting for more, especially
those that are migrating from Solaris which uses 24 bits for unique
identifiers. To satisfy the need of those users, a new boot time kernel
option "ipcmni_extend" is added to extend the IPCMNI value to 16M. This
is a 512X increase which should be big enough for users out there that
need a large number of unique IPC identifier.The use of this new option will change the pattern of the IPC
identifiers returned by functions like shmget(2). An application that
depends on such pattern may not work properly. So it should only be
used if the users really need more than 32k of unique IPC numbers.This new option does have the side effect of reducing the maximum number
of unique sequence numbers from 64k down to 128. So it is a trade-off.The computation of a new IPC id is not done in the performance critical
path. So a little bit of additional overhead shouldn't have any real
performance impact.Link: http://lkml.kernel.org/r/20190329204930.21620-1-longman@redhat.com
Signed-off-by: Waiman Long
Acked-by: Manfred Spraul
Cc: Al Viro
Cc: Davidlohr Bueso
Cc: "Eric W . Biederman"
Cc: Jonathan Corbet
Cc: Kees Cook
Cc: "Luis R. Rodriguez"
Cc: Matthew Wilcox
Cc: Takashi Iwai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
31 Oct, 2018
2 commits
-
For SysV semaphores, the semmni value is the last part of the 4-element
sem number array. To make semmni behave in a similar way to msgmni and
shmmni, we can't directly use the _minmax handler. Instead, a special sem
specific handler is added to check the last argument to make sure that it
is limited to the [0, IPCMNI] range. An error will be returned if this is
not the case.Link: http://lkml.kernel.org/r/1536352137-12003-3-git-send-email-longman@redhat.com
Signed-off-by: Waiman Long
Reviewed-by: Davidlohr Bueso
Cc: "Eric W. Biederman"
Cc: Jonathan Corbet
Cc: Kees Cook
Cc: Luis R. Rodriguez
Cc: Matthew Wilcox
Cc: Takashi Iwai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Patch series "ipc: IPCMNI limit check for *mni & increase that limit", v9.
The sysctl parameters msgmni, shmmni and semmni have an inherent limit of
IPC_MNI (32k). However, users may not be aware of that because they can
write a value much higher than that without getting any error or
notification. Reading the parameters back will show the newly written
values which are not real.The real IPCMNI limit is now enforced to make sure that users won't put in
an unrealistic value. The first 2 patches enforce the limits.There are also users out there requesting increase in the IPCMNI value.
The last 2 patches attempt to do that by using a boot kernel parameter
"ipcmni_extend" to increase the IPCMNI limit from 32k to 8M if the users
really want the extended value.This patch (of 4):
A user can write arbitrary integer values to msgmni and shmmni sysctl
parameters without getting error, but the actual limit is really IPCMNI
(32k). This can mislead users as they think they can get a value that is
not real.The right limits are now set for msgmni and shmmni so that the users will
become aware if they set a value outside of the acceptable range.Link: http://lkml.kernel.org/r/1536352137-12003-2-git-send-email-longman@redhat.com
Signed-off-by: Waiman Long
Acked-by: Luis R. Rodriguez
Reviewed-by: Davidlohr Bueso
Cc: Kees Cook
Cc: Jonathan Corbet
Cc: Matthew Wilcox
Cc: "Eric W. Biederman"
Cc: Takashi Iwai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Dec, 2014
1 commit
-
SysV can be abused to allocate locked kernel memory. For most systems, a
small limit doesn't make sense, see the discussion with regards to SHMMAX.Therefore: increase MSGMNI to the maximum supported.
And: If we ignore the risk of locking too much memory, then an automatic
scaling of MSGMNI doesn't make sense. Therefore the logic can be removed.The code preserves auto_msgmni to avoid breaking any user space applications
that expect that the value exists.Notes:
1) If an administrator must limit the memory allocations, then he can set
MSGMNI as necessary.Or he can disable sysv entirely (as e.g. done by Android).
2) MSGMAX and MSGMNB are intentionally not increased, as these values are used
to control latency vs. throughput:
If MSGMNB is large, then msgsnd() just returns and more messages can be queued
before a task switch to a task that calls msgrcv() is forced.[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Manfred Spraul
Cc: Davidlohr Bueso
Cc: Rafael Aquini
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Oct, 2014
1 commit
-
proc_dointvec_minmax() returns zero if a new value has been set. So we
don't need to check all charecters have been handled.Below you can find two examples. In the new value has not been handled
properly.$ strace ./a.out
open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
write(3, "0\n\0", 3) = 2
close(3) = 0
exit_group(0)
$ cat /sys/kernel/debug/tracing/trace$strace ./a.out
open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
write(3, "0\n", 2) = 2
close(3) = 0$ cat /sys/kernel/debug/tracing/trace
a.out-697 [000] .... 3280.998235: unregister_ipcns_notifier
Cc: Mathias Krause
Cc: Manfred Spraul
Cc: Joe Perches
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Jun, 2014
1 commit
-
This typedef is unnecessary and should just be removed.
Signed-off-by: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
08 Apr, 2014
1 commit
-
... since __initcall is now deprecated.
Signed-off-by: Davidlohr Bueso
Cc: Manfred Spraul
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
28 Jan, 2014
1 commit
-
The ipc code does not adhere the typical linux coding style.
This patch fixes lots of simple whitespace errors.- mostly autogenerated by
scripts/checkpatch.pl -f --fix \
--types=pointer_location,spacing,space_before_tab
- one manual fixup (keep structure members tab-aligned)
- removal of additional space_before_tab that were not found by --fixTested with some of my msg and sem test apps.
Andrew: Could you include it in -mm and move it towards Linus' tree?
Signed-off-by: Manfred Spraul
Suggested-by: Li Bin
Cc: Joe Perches
Acked-by: Rafael Aquini
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
04 Nov, 2013
1 commit
-
Negative message lengths make no sense -- so don't do negative queue
lenghts or identifier counts. Prevent them from getting negative.Also change the underlying data types to be unsigned to avoid hairy
surprises with sign extensions in cases where those variables get
evaluated in unsigned expressions with bigger data types, e.g size_t.In case a user still wants to have "unlimited" sizes she could just use
INT_MAX instead.Signed-off-by: Mathias Krause
Cc: Andrew Morton
Signed-off-by: Linus Torvalds
05 Jan, 2013
1 commit
-
Add 3 new variables and sysctls to tune them (by one "next_id" variable
for messages, semaphores and shared memory respectively). This variable
can be used to set desired id for next allocated IPC object. By default
it's equal to -1 and old behaviour is preserved. If this variable is
non-negative, then desired idr will be extracted from it and used as a
start value to search for free IDR slot.Notes:
1) this patch doesn't guarantee that the new object will have desired
id. So it's up to user space how to handle new object with wrong id.2) After a sucessful id allocation attempt, "next_id" will be set back
to -1 (if it was non-negative).[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Stanislav Kinsbursky
Cc: Serge Hallyn
Cc: "Eric W. Biederman"
Cc: Pavel Emelyanov
Cc: Al Viro
Cc: KOSAKI Motohiro
Cc: Michael Kerrisk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
27 Jul, 2011
1 commit
-
Add support for the shm_rmid_forced sysctl. If set to 1, all shared
memory objects in current ipc namespace will be automatically forced to
use IPC_RMID.The POSIX way of handling shmem allows one to create shm objects and
call shmdt(), leaving shm object associated with no process, thus
consuming memory not counted via rlimits.With shm_rmid_forced=1 the shared memory object is counted at least for
one process, so OOM killer may effectively kill the fat process holding
the shared memory.It obviously breaks POSIX - some programs relying on the feature would
stop working. So set shm_rmid_forced=1 only if you're sure nobody uses
"orphaned" memory. Use shm_rmid_forced=0 by default for compatability
reasons.The feature was previously impemented in -ow as a configure option.
[akpm@linux-foundation.org: fix documentation, per Randy]
[akpm@linux-foundation.org: fix warning]
[akpm@linux-foundation.org: readability/conventionality tweaks]
[akpm@linux-foundation.org: fix shm_rmid_forced/shm_forced_rmid confusion, use standard comment layout]
Signed-off-by: Vasiliy Kulikov
Cc: Randy Dunlap
Cc: "Eric W. Biederman"
Cc: "Serge E. Hallyn"
Cc: Daniel Lezcano
Cc: Oleg Nesterov
Cc: Tejun Heo
Cc: Ingo Molnar
Cc: Alan Cox
Cc: Solar Designer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Nov, 2009
1 commit
-
Now that sys_sysctl is a generic wrapper around /proc/sys .ctl_name
and .strategy members of sysctl tables are dead code. Remove them.Signed-off-by: Eric W. Biederman
24 Sep, 2009
1 commit
-
It's unused.
It isn't needed -- read or write flag is already passed and sysctl
shouldn't care about the rest.It _was_ used in two places at arch/frv for some reason.
Signed-off-by: Alexey Dobriyan
Cc: David Howells
Cc: "Eric W. Biederman"
Cc: Al Viro
Cc: Ralf Baechle
Cc: Martin Schwidefsky
Cc: Ingo Molnar
Cc: "David S. Miller"
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
03 Apr, 2009
1 commit
-
As pointed out by Cedric Le Goater (in response to Alexey's original
comment wrt mqns), ipc_sysctl.c and utsname_sysctl.c are using
CONFIG_PROC_FS, not CONFIG_PROC_SYSCTL, to determine whether to define
the proc_handlers. Change that.Signed-off-by: Serge E. Hallyn
Cc: Cedric Le Goater
Acked-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Jan, 2009
1 commit
-
proc_ipcauto_dointvec_minmax() is the only user of ipc_auto_callback(),
since the former function is protected by CONFIG_PROC_FS, so should be the
latter one.Just move its definition down.
Signed-off-by: WANG Cong
Cc: Eric Biederman
Cc: Nadia Derbey
Cc: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
17 Oct, 2008
1 commit
-
name and nlen parameters passed to ->strategy hook are unused, remove
them. In general ->strategy hook should know what it's doing, and don't
do something tricky for which, say, pointer to original userspace array
may be needed (name).Signed-off-by: Alexey Dobriyan
Acked-by: David S. Miller [ networking bits ]
Cc: Ralf Baechle
Cc: David Howells
Cc: Matt Mackall
Cc: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
26 Jul, 2008
1 commit
-
This patch proposes an alternative to the "magical
positive-versus-negative number trick" Andrew complained about last week
in http://lkml.org/lkml/2008/6/24/418.This had been introduced with the patches that scale msgmni to the amount
of lowmem. With these patches, msgmni has a registered notification
routine that recomputes msgmni value upon memory add/remove or ipc
namespace creation/ removal.When msgmni is changed from user space (i.e. value written to the proc
file), that notification routine is unregistered, and the way to make it
registered back is to write a negative value into the proc file. This is
the "magical positive-versus-negative number trick".To fix this, a new proc file is introduced: /proc/sys/kernel/auto_msgmni.
This file acts as ON/OFF for msgmni automatic recomputing.With this patch, the process is the following:
1) kernel boots in "automatic recomputing mode"
/proc/sys/kernel/msgmni contains the value that has been computed (depends
on lowmem)
/proc/sys/kernel/automatic_msgmni contains "1"2) echo > /proc/sys/kernel/msgmni
. sets msg_ctlmni to
. de-activates automatic recomputing (i.e. if, say, some memory is added
msgmni won't be recomputed anymore)
. /proc/sys/kernel/automatic_msgmni now contains "0"3) echo "0" > /proc/sys/kernel/automatic_msgmni
. de-activates msgmni automatic recomputing
this has the same effect as 2) except that msg_ctlmni's value stays
blocked at its current value)3) echo "1" > /proc/sys/kernel/automatic_msgmni
. recomputes msgmni's value based on the current available memory size
and number of ipc namespaces
. re-activates automatic recomputing for msgmni.Signed-off-by: Nadia Derbey
Cc: Solofo Ramangalahy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
29 Apr, 2008
2 commits
-
The enhancement as asked for by Yasunori: if msgmni is set to a negative
value, register it back into the ipcns notifier chain.A new interface has been added to the notification mechanism:
notifier_chain_cond_register() registers a notifier block only if not already
registered. With that new interface we avoid taking care of the states
changes in procfs.Signed-off-by: Nadia Derbey
Cc: Yasunori Goto
Cc: Matt Helsley
Cc: Mingming Cao
Cc: Pierre Peiffer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make msgmni not recomputed anymore upon ipc namespace creation / removal or
memory add/remove, as soon as it has been set from userland.As soon as msgmni is explicitly set via procfs or sysctl(), the associated
callback routine is unregistered from the ipc namespace notifier chain.Signed-off-by: Nadia Derbey
Cc: Yasunori Goto
Cc: Matt Helsley
Cc: Mingming Cao
Cc: Pierre Peiffer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
09 Feb, 2008
1 commit
-
Currently the IPC namespace management code is spread over the ipc/*.c files.
I moved this code into ipc/namespace.c file which is compiled out when needed.The linux/ipc_namespace.h file is used to store the prototypes of the
functions in namespace.c and the stubs for NAMESPACES=n case. This is done
so, because the stub for copy_ipc_namespace requires the knowledge of the
CLONE_NEWIPC flag, which is in sched.h. But the linux/ipc.h file itself in
included into many many .c files via the sys.h->sem.h sequence so adding the
sched.h into it will make all these .c depend on sched.h which is not that
good. On the other hand the knowledge about the namespaces stuff is required
in 4 .c files only.Besides, this patch compiles out some auxiliary functions from ipc/sem.c,
msg.c and shm.c files. It turned out that moving these functions into
namespaces.c is not that easy because they use many other calls and macros
from the original file. Moving them would make this patch complicated. On
the other hand all these functions can be consolidated, so I will send a
separate patch doing this a bit later.Signed-off-by: Pavel Emelyanov
Acked-by: Serge Hallyn
Cc: Cedric Le Goater
Cc: "Eric W. Biederman"
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Cc: Sukadev Bhattiprolu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
17 Oct, 2007
1 commit
-
Finish the work : kill all #ifdef CONFIG_IPC_NS.
Thanks Robert !
Signed-off-by: Cedric Le Goater
Cc: Eric Biederman
Cc: Robert P. J. Day
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Feb, 2007
2 commits
-
The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name. Which is
pain for caching and the proc interface never implemented.I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.Signed-off-by: Eric W. Biederman
Acked-by: Ralf Baechle
Acked-by: Martin Schwidefsky
Cc: Russell King
Cc: David Howells
Cc: "Luck, Tony"
Cc: Ralf Baechle
Cc: Paul Mackerras
Cc: Martin Schwidefsky
Cc: Andi Kleen
Cc: Jens Axboe
Cc: Corey Minyard
Cc: Neil Brown
Cc: "John W. Linville"
Cc: James Bottomley
Cc: Jan Kara
Cc: Trond Myklebust
Cc: Mark Fasheh
Cc: David Chinner
Cc: "David S. Miller"
Cc: Patrick McHardy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is just a simple cleanup to keep kernel/sysctl.c from getting to crowded
with special cases, and by keeping all of the ipc logic to together it makes
the code a little more readable.[gcoady.lk@gmail.com: build fix]
Signed-off-by: Eric W. Biederman
Cc: Serge E. Hallyn
Cc: Herbert Poetzl
Cc: Kirill Korotaev
Signed-off-by: Grant Coady
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds