14 Dec, 2014

1 commit

  • SysV can be abused to allocate locked kernel memory. For most systems, a
    small limit doesn't make sense, see the discussion with regards to SHMMAX.

    Therefore: increase MSGMNI to the maximum supported.

    And: If we ignore the risk of locking too much memory, then an automatic
    scaling of MSGMNI doesn't make sense. Therefore the logic can be removed.

    The code preserves auto_msgmni to avoid breaking any user space applications
    that expect that the value exists.

    Notes:
    1) If an administrator must limit the memory allocations, then he can set
    MSGMNI as necessary.

    Or he can disable sysv entirely (as e.g. done by Android).

    2) MSGMAX and MSGMNB are intentionally not increased, as these values are used
    to control latency vs. throughput:
    If MSGMNB is large, then msgsnd() just returns and more messages can be queued
    before a task switch to a task that calls msgrcv() is forced.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Manfred Spraul
    Cc: Davidlohr Bueso
    Cc: Rafael Aquini
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Manfred Spraul
     

14 Oct, 2014

1 commit

  • proc_dointvec_minmax() returns zero if a new value has been set. So we
    don't need to check all charecters have been handled.

    Below you can find two examples. In the new value has not been handled
    properly.

    $ strace ./a.out
    open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
    write(3, "0\n\0", 3) = 2
    close(3) = 0
    exit_group(0)
    $ cat /sys/kernel/debug/tracing/trace

    $strace ./a.out
    open("/proc/sys/kernel/auto_msgmni", O_WRONLY) = 3
    write(3, "0\n", 2) = 2
    close(3) = 0

    $ cat /sys/kernel/debug/tracing/trace
    a.out-697 [000] .... 3280.998235: unregister_ipcns_notifier
    Cc: Mathias Krause
    Cc: Manfred Spraul
    Cc: Joe Perches
    Cc: Davidlohr Bueso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Vagin
     

07 Jun, 2014

1 commit


08 Apr, 2014

1 commit


28 Jan, 2014

1 commit

  • The ipc code does not adhere the typical linux coding style.
    This patch fixes lots of simple whitespace errors.

    - mostly autogenerated by
    scripts/checkpatch.pl -f --fix \
    --types=pointer_location,spacing,space_before_tab
    - one manual fixup (keep structure members tab-aligned)
    - removal of additional space_before_tab that were not found by --fix

    Tested with some of my msg and sem test apps.

    Andrew: Could you include it in -mm and move it towards Linus' tree?

    Signed-off-by: Manfred Spraul
    Suggested-by: Li Bin
    Cc: Joe Perches
    Acked-by: Rafael Aquini
    Cc: Davidlohr Bueso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Manfred Spraul
     

04 Nov, 2013

1 commit

  • Negative message lengths make no sense -- so don't do negative queue
    lenghts or identifier counts. Prevent them from getting negative.

    Also change the underlying data types to be unsigned to avoid hairy
    surprises with sign extensions in cases where those variables get
    evaluated in unsigned expressions with bigger data types, e.g size_t.

    In case a user still wants to have "unlimited" sizes she could just use
    INT_MAX instead.

    Signed-off-by: Mathias Krause
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathias Krause
     

05 Jan, 2013

1 commit

  • Add 3 new variables and sysctls to tune them (by one "next_id" variable
    for messages, semaphores and shared memory respectively). This variable
    can be used to set desired id for next allocated IPC object. By default
    it's equal to -1 and old behaviour is preserved. If this variable is
    non-negative, then desired idr will be extracted from it and used as a
    start value to search for free IDR slot.

    Notes:

    1) this patch doesn't guarantee that the new object will have desired
    id. So it's up to user space how to handle new object with wrong id.

    2) After a sucessful id allocation attempt, "next_id" will be set back
    to -1 (if it was non-negative).

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Stanislav Kinsbursky
    Cc: Serge Hallyn
    Cc: "Eric W. Biederman"
    Cc: Pavel Emelyanov
    Cc: Al Viro
    Cc: KOSAKI Motohiro
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stanislav Kinsbursky
     

27 Jul, 2011

1 commit

  • Add support for the shm_rmid_forced sysctl. If set to 1, all shared
    memory objects in current ipc namespace will be automatically forced to
    use IPC_RMID.

    The POSIX way of handling shmem allows one to create shm objects and
    call shmdt(), leaving shm object associated with no process, thus
    consuming memory not counted via rlimits.

    With shm_rmid_forced=1 the shared memory object is counted at least for
    one process, so OOM killer may effectively kill the fat process holding
    the shared memory.

    It obviously breaks POSIX - some programs relying on the feature would
    stop working. So set shm_rmid_forced=1 only if you're sure nobody uses
    "orphaned" memory. Use shm_rmid_forced=0 by default for compatability
    reasons.

    The feature was previously impemented in -ow as a configure option.

    [akpm@linux-foundation.org: fix documentation, per Randy]
    [akpm@linux-foundation.org: fix warning]
    [akpm@linux-foundation.org: readability/conventionality tweaks]
    [akpm@linux-foundation.org: fix shm_rmid_forced/shm_forced_rmid confusion, use standard comment layout]
    Signed-off-by: Vasiliy Kulikov
    Cc: Randy Dunlap
    Cc: "Eric W. Biederman"
    Cc: "Serge E. Hallyn"
    Cc: Daniel Lezcano
    Cc: Oleg Nesterov
    Cc: Tejun Heo
    Cc: Ingo Molnar
    Cc: Alan Cox
    Cc: Solar Designer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vasiliy Kulikov
     

12 Nov, 2009

1 commit


24 Sep, 2009

1 commit

  • It's unused.

    It isn't needed -- read or write flag is already passed and sysctl
    shouldn't care about the rest.

    It _was_ used in two places at arch/frv for some reason.

    Signed-off-by: Alexey Dobriyan
    Cc: David Howells
    Cc: "Eric W. Biederman"
    Cc: Al Viro
    Cc: Ralf Baechle
    Cc: Martin Schwidefsky
    Cc: Ingo Molnar
    Cc: "David S. Miller"
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

03 Apr, 2009

1 commit


07 Jan, 2009

1 commit


17 Oct, 2008

1 commit

  • name and nlen parameters passed to ->strategy hook are unused, remove
    them. In general ->strategy hook should know what it's doing, and don't
    do something tricky for which, say, pointer to original userspace array
    may be needed (name).

    Signed-off-by: Alexey Dobriyan
    Acked-by: David S. Miller [ networking bits ]
    Cc: Ralf Baechle
    Cc: David Howells
    Cc: Matt Mackall
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

26 Jul, 2008

1 commit

  • This patch proposes an alternative to the "magical
    positive-versus-negative number trick" Andrew complained about last week
    in http://lkml.org/lkml/2008/6/24/418.

    This had been introduced with the patches that scale msgmni to the amount
    of lowmem. With these patches, msgmni has a registered notification
    routine that recomputes msgmni value upon memory add/remove or ipc
    namespace creation/ removal.

    When msgmni is changed from user space (i.e. value written to the proc
    file), that notification routine is unregistered, and the way to make it
    registered back is to write a negative value into the proc file. This is
    the "magical positive-versus-negative number trick".

    To fix this, a new proc file is introduced: /proc/sys/kernel/auto_msgmni.
    This file acts as ON/OFF for msgmni automatic recomputing.

    With this patch, the process is the following:
    1) kernel boots in "automatic recomputing mode"
    /proc/sys/kernel/msgmni contains the value that has been computed (depends
    on lowmem)
    /proc/sys/kernel/automatic_msgmni contains "1"

    2) echo > /proc/sys/kernel/msgmni
    . sets msg_ctlmni to
    . de-activates automatic recomputing (i.e. if, say, some memory is added
    msgmni won't be recomputed anymore)
    . /proc/sys/kernel/automatic_msgmni now contains "0"

    3) echo "0" > /proc/sys/kernel/automatic_msgmni
    . de-activates msgmni automatic recomputing
    this has the same effect as 2) except that msg_ctlmni's value stays
    blocked at its current value)

    3) echo "1" > /proc/sys/kernel/automatic_msgmni
    . recomputes msgmni's value based on the current available memory size
    and number of ipc namespaces
    . re-activates automatic recomputing for msgmni.

    Signed-off-by: Nadia Derbey
    Cc: Solofo Ramangalahy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     

29 Apr, 2008

2 commits

  • The enhancement as asked for by Yasunori: if msgmni is set to a negative
    value, register it back into the ipcns notifier chain.

    A new interface has been added to the notification mechanism:
    notifier_chain_cond_register() registers a notifier block only if not already
    registered. With that new interface we avoid taking care of the states
    changes in procfs.

    Signed-off-by: Nadia Derbey
    Cc: Yasunori Goto
    Cc: Matt Helsley
    Cc: Mingming Cao
    Cc: Pierre Peiffer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     
  • Make msgmni not recomputed anymore upon ipc namespace creation / removal or
    memory add/remove, as soon as it has been set from userland.

    As soon as msgmni is explicitly set via procfs or sysctl(), the associated
    callback routine is unregistered from the ipc namespace notifier chain.

    Signed-off-by: Nadia Derbey
    Cc: Yasunori Goto
    Cc: Matt Helsley
    Cc: Mingming Cao
    Cc: Pierre Peiffer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     

09 Feb, 2008

1 commit

  • Currently the IPC namespace management code is spread over the ipc/*.c files.
    I moved this code into ipc/namespace.c file which is compiled out when needed.

    The linux/ipc_namespace.h file is used to store the prototypes of the
    functions in namespace.c and the stubs for NAMESPACES=n case. This is done
    so, because the stub for copy_ipc_namespace requires the knowledge of the
    CLONE_NEWIPC flag, which is in sched.h. But the linux/ipc.h file itself in
    included into many many .c files via the sys.h->sem.h sequence so adding the
    sched.h into it will make all these .c depend on sched.h which is not that
    good. On the other hand the knowledge about the namespaces stuff is required
    in 4 .c files only.

    Besides, this patch compiles out some auxiliary functions from ipc/sem.c,
    msg.c and shm.c files. It turned out that moving these functions into
    namespaces.c is not that easy because they use many other calls and macros
    from the original file. Moving them would make this patch complicated. On
    the other hand all these functions can be consolidated, so I will send a
    separate patch doing this a bit later.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Cedric Le Goater
    Cc: "Eric W. Biederman"
    Cc: Herbert Poetzl
    Cc: Kirill Korotaev
    Cc: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

17 Oct, 2007

1 commit


15 Feb, 2007

2 commits

  • The semantic effect of insert_at_head is that it would allow new registered
    sysctl entries to override existing sysctl entries of the same name. Which is
    pain for caching and the proc interface never implemented.

    I have done an audit and discovered that none of the current users of
    register_sysctl care as (excpet for directories) they do not register
    duplicate sysctl entries.

    So this patch simply removes the support for overriding existing entries in
    the sys_sysctl interface since no one uses it or cares and it makes future
    enhancments harder.

    Signed-off-by: Eric W. Biederman
    Acked-by: Ralf Baechle
    Acked-by: Martin Schwidefsky
    Cc: Russell King
    Cc: David Howells
    Cc: "Luck, Tony"
    Cc: Ralf Baechle
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Andi Kleen
    Cc: Jens Axboe
    Cc: Corey Minyard
    Cc: Neil Brown
    Cc: "John W. Linville"
    Cc: James Bottomley
    Cc: Jan Kara
    Cc: Trond Myklebust
    Cc: Mark Fasheh
    Cc: David Chinner
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • This is just a simple cleanup to keep kernel/sysctl.c from getting to crowded
    with special cases, and by keeping all of the ipc logic to together it makes
    the code a little more readable.

    [gcoady.lk@gmail.com: build fix]
    Signed-off-by: Eric W. Biederman
    Cc: Serge E. Hallyn
    Cc: Herbert Poetzl
    Cc: Kirill Korotaev
    Signed-off-by: Grant Coady
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman