29 Apr, 2008

8 commits

  • sys_unshare(CLONE_NEWIPC) doesn't handle the undo lists properly, this can
    cause a kernel memory corruption. CLONE_NEWIPC must detach from the existing
    undo lists.

    Fix, part 1: add support for sys_unshare(CLONE_SYSVSEM)

    The original reason to not support it was the potential (inevitable?)
    confusion due to the fact that sys_unshare(CLONE_SYSVSEM) has the
    inverse meaning of clone(CLONE_SYSVSEM).

    Our two most reasonable options then appear to be (1) fully support
    CLONE_SYSVSEM, or (2) continue to refuse explicit CLONE_SYSVSEM,
    but always do it anyway on unshare(CLONE_SYSVSEM). This patch does
    (1).

    Changelog:
    Apr 16: SEH: switch to Manfred's alternative patch which
    removes the unshare_semundo() function which
    always refused CLONE_SYSVSEM.

    Signed-off-by: Manfred Spraul
    Signed-off-by: Serge E. Hallyn
    Acked-by: "Eric W. Biederman"
    Cc: Pavel Emelyanov
    Cc: Michael Kerrisk
    Cc: Pierre Peiffer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Manfred Spraul
     
  • semctl_down(), msgctl_down() and shmctl_down() are used to handle the same set
    of commands for each kind of IPC. They all start to do the same job (they
    retrieve the ipc and do some permission checks) before handling the commands
    on their own.

    This patch proposes to consolidate this by moving these same pieces of code
    into one common function called ipcctl_pre_down().

    It simplifies a little these xxxctl_down() functions and increases a little
    the maintainability.

    Signed-off-by: Pierre Peiffer
    Acked-by: Serge Hallyn
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • The IPC_SET command performs the same permission setting for all IPCs. This
    patch introduces a common ipc_update_perm() function to update these
    permissions and makes use of it for all IPCs.

    Signed-off-by: Pierre Peiffer
    Acked-by: Serge Hallyn
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • All IPCs make use of an intermetiate *_setbuf structure to handle the IPC_SET
    command. This is not really needed and, moreover, it complicates a little bit
    the code.

    This patch gets rid of the use of it and uses directly the semid64_ds/
    msgid64_ds/shmid64_ds structure.

    In addition of removing one struture declaration, it also simplifies and
    improves a little bit the common 64-bits path.

    Signed-off-by: Pierre Peiffer
    Acked-by: Serge Hallyn
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • semctl_down() takes one unused parameter: semnum. This patch proposes to get
    rid of it.

    Signed-off-by: Pierre Peiffer
    Acked-by: Serge Hallyn
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • semctl_down is called with the rwmutex (the one which protects the list of
    ipcs) taken in write mode.

    This patch moves this rwmutex taken in write-mode inside semctl_down.

    This has the advantages of reducing a little bit the window during which this
    rwmutex is taken, clarifying sys_semctl, and finally of having a coherent
    behaviour with [shm|msg]ctl_down

    Signed-off-by: Pierre Peiffer
    Acked-by: Serge Hallyn
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • Trivial patch which adds some small locking functions and makes use of them to
    factorize some part of the code and to make it cleaner.

    Signed-off-by: Pierre Peiffer
    Acked-by: Serge Hallyn
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • By continuing to consolidate a little the IPC code, each id can be built
    directly in ipc_addid() instead of having it built from each callers of
    ipc_addid()

    And I also remove shm_addid() in order to have, as much as possible, the
    same code for shm/sem/msg.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Pierre Peiffer
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     

09 Feb, 2008

4 commits

  • sem_exit_ns(), msg_exit_ns() and shm_exit_ns() are all called when an
    ipc_namespace is released to free all ipcs of each type. But in fact, they
    do the same thing: they loop around all ipcs to free them individually by
    calling a specific routine.

    This patch proposes to consolidate this by introducing a common function,
    free_ipcs(), that do the job. The specific routine to call on each
    individual ipcs is passed as parameter. For this, these ipc-specific
    'free' routines are reworked to take a generic 'struct ipc_perm' as
    parameter.

    Signed-off-by: Pierre Peiffer
    Cc: Cedric Le Goater
    Cc: Pavel Emelyanov
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • Each ipc_namespace contains a table of 3 pointers to struct ipc_ids (3 for
    msg, sem and shm, structure used to store all ipcs) These 'struct ipc_ids'
    are dynamically allocated for each icp_namespace as the ipc_namespace
    itself (for the init namespace, they are initialized with pointers to
    static variables instead)

    It is so for historical reason: in fact, before the use of idr to store the
    ipcs, the ipcs were stored in tables of variable length, depending of the
    maximum number of ipc allowed. Now, these 'struct ipc_ids' have a fixed
    size. As they are allocated in any cases for each new ipc_namespace, there
    is no gain of memory in having them allocated separately of the struct
    ipc_namespace.

    This patch proposes to make this table static in the struct ipc_namespace.
    Thus, we can allocate all in once and get rid of all the code needed to
    allocate and free these ipc_ids separately.

    Signed-off-by: Pierre Peiffer
    Acked-by: Cedric Le Goater
    Cc: Pavel Emelyanov
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • These commands (SEM_STAT and IPC_STAT) are rather doing the same things
    (only the meaning of the id given as input and the return value differ).
    However, for the semaphores, they are handled in two different places (two
    different functions).

    This patch consolidates this for clarification by handling these both
    commands in the same place in semctl_nolock(). It also removes one unused
    parameter for this function.

    Signed-off-by: Pierre Peiffer
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • Currently the IPC namespace management code is spread over the ipc/*.c files.
    I moved this code into ipc/namespace.c file which is compiled out when needed.

    The linux/ipc_namespace.h file is used to store the prototypes of the
    functions in namespace.c and the stubs for NAMESPACES=n case. This is done
    so, because the stub for copy_ipc_namespace requires the knowledge of the
    CLONE_NEWIPC flag, which is in sched.h. But the linux/ipc.h file itself in
    included into many many .c files via the sys.h->sem.h sequence so adding the
    sched.h into it will make all these .c depend on sched.h which is not that
    good. On the other hand the knowledge about the namespaces stuff is required
    in 4 .c files only.

    Besides, this patch compiles out some auxiliary functions from ipc/sem.c,
    msg.c and shm.c files. It turned out that moving these functions into
    namespaces.c is not that easy because they use many other calls and macros
    from the original file. Moving them would make this patch complicated. On
    the other hand all these functions can be consolidated, so I will send a
    separate patch doing this a bit later.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Cedric Le Goater
    Cc: "Eric W. Biederman"
    Cc: Herbert Poetzl
    Cc: Kirill Korotaev
    Cc: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

07 Feb, 2008

1 commit

  • In the new implementation of the [sem|shm|msg]_lock[_check]() routines, we
    use the return value of ipc_lock() in container_of() without any check.
    But ipc_lock may return a errcode. The use of this errcode in
    container_of() may alter this errcode, and we don't want this.

    And in xxx_exit_ns, the pointer return by idr_find is of type 'struct
    kern_ipc_per'...

    Today, the code will work as is because the member used in these
    container_of() is the first member of its container (offset == 0), the
    errcode isn't changed then. But in the general case, we can't count on
    this assumption and this may lead later to a real bug if we don't correct
    this.

    Again, the proposed solution is simple and correct. But, as pointed by
    Nadia, with this solution, the same check will be done several times (in
    all sub-callers...), what is not very funny/optimal...

    Signed-off-by: Pierre Peiffer
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     

20 Oct, 2007

10 commits

  • With the use of idr to store the ipc, the case where the idr cache is
    empty, when idr_get_new is called (this may happen even if we call
    idr_pre_get() before), is not well handled: it lets
    semget()/shmget()/msgget() return ENOSPC when this cache is empty, what 1.
    does not reflect the facts and 2. does not conform to the man(s).

    This patch fixes this by retrying the whole process of allocation in this case.

    Signed-off-by: Pierre Peiffer
    Cc: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • Some comments about sem_undo_list seem wrong.
    About the comment above unlock_semundo:
    "... If task2 now exits before task1 releases the lock (by calling
    unlock_semundo()), then task1 will never call spin_unlock(). ..."

    This is just wrong, I see no reason for which task1 will not call
    spin_unlock... The rest of this comment is also wrong... Unless I
    miss something (of course).

    Finally, (un)lock_semundo functions are useless, so remove them
    for simplification. (this avoids an useless if statement)

    Signed-off-by: Pierre Peiffer
    Cc: Nadia Derbey
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pierre Peiffer
     
  • Remvoe the unneeded parameters from ipc_checkid() and ipc_buildid()
    interfaces.

    Signed-off-by: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     
  • This is a patch that fixes the way idr_find() used to be called in ipc_lock():
    in all the paths that don't imply an update of the ipcs idr, it was called
    without the idr tree being locked.

    The changes are:
    . in ipc_ids, the mutex has been changed into a reader/writer semaphore.
    . ipc_lock() now takes the mutex as a reader during the idr_find().
    . a new routine ipc_lock_down() has been defined: it doesn't take the
    mutex, assuming that it is being held by the caller. This is the routine
    that is now called in all the update paths.

    Signed-off-by: Nadia Derbey
    Acked-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     
  • This patch fixes the wrong / obsolete comments in the ipc code. Also adds
    a missing lock around ipc_get_maxid() in shm_get_stat().

    Signed-off-by: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     
  • This patch converts casts of struct kern_ipc_perm to
    . struct msg_queue
    . struct sem_array
    . struct shmid_kernel
    into the equivalent container_of() macro. It improves code maintenance
    because the code need not change if kern_ipc_perm is no longer at the
    beginning of the containing struct.

    Signed-off-by: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     
  • This patch introduces a new ipc_lock_check() routine interface:
    . each time ipc_checkid() is called, this is done after calling ipc_lock().
    ipc_checkid() is now called from inside ipc_lock_check().

    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: fix RCU locking]
    Signed-off-by: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     
  • This patch introduces a change into the sys_msgget(), sys_semget() and
    sys_shmget() routines: they now share a common code, which is better for
    maintainability.

    Signed-off-by: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     
  • This patch introduces ipcs storage into IDRs. The main changes are:
    . This ipc_ids structure is changed: the entries array is changed into a
    root idr structure.
    . The grow_ary() routine is removed: it is not needed anymore when adding
    an ipc structure, since we are now using the IDR facility.
    . The ipc_rmid() routine interface is changed:
    . there is no need for this routine to return the pointer passed in as
    argument: it is now declared as a void
    . since the id is now part of the kern_ipc_perm structure, no need to
    have it as an argument to the routine

    Signed-off-by: Nadia Derbey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nadia Derbey
     
  • This is the largest patch in the set. Make all (I hope) the places where
    the pid is shown to or get from user operate on the virtual pids.

    The idea is:
    - all in-kernel data structures must store either struct pid itself
    or the pid's global nr, obtained with pid_nr() call;
    - when seeking the task from kernel code with the stored id one
    should use find_task_by_pid() call that works with global pids;
    - when showing pid's numerical value to the user the virtual one
    should be used, but however when one shows task's pid outside this
    task's namespace the global one is to be used;
    - when getting the pid from userspace one need to consider this as
    the virtual one and use appropriate task/pid-searching functions.

    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: nuther build fix]
    [akpm@linux-foundation.org: yet nuther build fix]
    [akpm@linux-foundation.org: remove unneeded casts]
    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Alexey Dobriyan
    Cc: Sukadev Bhattiprolu
    Cc: Oleg Nesterov
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

18 Jul, 2007

1 commit


17 Jul, 2007

1 commit

  • CONFIG_UTS_NS and CONFIG_IPC_NS have very little value as they only
    deactivate the unshare of the uts and ipc namespaces and do not improve
    performance.

    Signed-off-by: Cedric Le Goater
    Acked-by: "Serge E. Hallyn"
    Cc: Eric W. Biederman
    Cc: Herbert Poetzl
    Cc: Pavel Emelianov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cedric Le Goater
     

09 May, 2007

1 commit


08 Dec, 2006

1 commit


04 Nov, 2006

1 commit

  • Fix two issuses related to ipc_ids->entries freeing.

    1. When freeing ipc namespace we need to free entries allocated
    with ipc_init_ids().

    2. When removing old entries in grow_ary() ipc_rcu_putref()
    may be called on entries set to &ids->nullentry earlier in
    ipc_init_ids().
    This is almost impossible without namespaces, but with
    them this situation becomes possible.

    Found during OpenVZ testing after obvious leaks in beancounters.

    Signed-off-by: Pavel Emelianov
    Cc: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelianov
     

02 Oct, 2006

2 commits


01 Jul, 2006

1 commit


20 Jun, 2006

1 commit

  • The following patch addresses most of the issues with the IPC_SET_PERM
    records as described in:
    https://www.redhat.com/archives/linux-audit/2006-May/msg00010.html
    and addresses the comments I received on the record field names.

    To summarize, I made the following changes:

    1. Changed sys_msgctl() and semctl_down() so that an IPC_SET_PERM
    record is emitted in the failure case as well as the success case.
    This matches the behavior in sys_shmctl(). I could simplify the
    code in sys_msgctl() and semctl_down() slightly but it would mean
    that in some error cases we could get an IPC_SET_PERM record
    without an IPC record and that seemed odd.

    2. No change to the IPC record type, given no feedback on the backward
    compatibility question.

    3. Removed the qbytes field from the IPC record. It wasn't being
    set and when audit_ipc_obj() is called from ipcperms(), the
    information isn't available. If we want the information in the IPC
    record, more extensive changes will be necessary. Since it only
    applies to message queues and it isn't really permission related, it
    doesn't seem worth it.

    4. Removed the obj field from the IPC_SET_PERM record. This means that
    the kern_ipc_perm argument is no longer needed.

    5. Removed the spaces and renamed the IPC_SET_PERM field names. Replaced iuid and
    igid fields with ouid and ogid in the IPC record.

    I tested this with the lspp.22 kernel on an x86_64 box. I believe it
    applies cleanly on the latest kernel.

    -- ljk

    Signed-off-by: Linda Knippers
    Signed-off-by: Al Viro

    Linda Knippers
     

01 May, 2006

1 commit

  • 1) The audit_ipc_perms() function has been split into two different
    functions:
    - audit_ipc_obj()
    - audit_ipc_set_perm()

    There's a key shift here... The audit_ipc_obj() collects the uid, gid,
    mode, and SElinux context label of the current ipc object. This
    audit_ipc_obj() hook is now found in several places. Most notably, it
    is hooked in ipcperms(), which is called in various places around the
    ipc code permforming a MAC check. Additionally there are several places
    where *checkid() is used to validate that an operation is being
    performed on a valid object while not necessarily having a nearby
    ipcperms() call. In these locations, audit_ipc_obj() is called to
    ensure that the information is captured by the audit system.

    The audit_set_new_perm() function is called any time the permissions on
    the ipc object changes. In this case, the NEW permissions are recorded
    (and note that an audit_ipc_obj() call exists just a few lines before
    each instance).

    2) Support for an AUDIT_IPC_SET_PERM audit message type. This allows
    for separate auxiliary audit records for normal operations on an IPC
    object and permissions changes. Note that the same struct
    audit_aux_data_ipcctl is used and populated, however there are separate
    audit_log_format statements based on the type of the message. Finally,
    the AUDIT_IPC block of code in audit_free_aux() was extended to handle
    aux messages of this new type. No more mem leaks I hope ;-)

    Signed-off-by: Al Viro

    Steve Grubb
     

27 Mar, 2006

3 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
    drivers/char/ftape/lowlevel/fdc-io.c: Correct a comment
    Kconfig help: MTD_JEDECPROBE already supports Intel
    Remove ugly debugging stuff
    do_mounts.c: Minor ROOT_DEV comment cleanup
    BUG_ON() Conversion in drivers/s390/block/dasd_devmap.c
    BUG_ON() Conversion in mm/mempool.c
    BUG_ON() Conversion in mm/memory.c
    BUG_ON() Conversion in kernel/fork.c
    BUG_ON() Conversion in ipc/sem.c
    BUG_ON() Conversion in fs/ext2/
    BUG_ON() Conversion in fs/hfs/
    BUG_ON() Conversion in fs/dcache.c
    BUG_ON() Conversion in fs/buffer.c
    BUG_ON() Conversion in input/serio/hp_sdc_mlc.c
    BUG_ON() Conversion in md/dm-table.c
    BUG_ON() Conversion in md/dm-path-selector.c
    BUG_ON() Conversion in drivers/isdn
    BUG_ON() Conversion in drivers/char
    BUG_ON() Conversion in drivers/mtd/

    Linus Torvalds
     
  • Semaphore to mutex conversion.

    The conversion was generated via scripts, and the result was validated
    automatically via a script as well.

    Signed-off-by: Ingo Molnar
    Cc: Manfred Spraul
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • this changes if() BUG(); constructs to BUG_ON() which is
    cleaner, contains unlikely() and can better optimized away.

    Signed-off-by: Eric Sesterhenn
    Signed-off-by: Adrian Bunk

    Eric Sesterhenn
     

21 Mar, 2006

1 commit

  • This patch extends existing audit records with subject/object context
    information. Audit records associated with filesystem inodes, ipc, and
    tasks now contain SELinux label information in the field "subj" if the
    item is performing the action, or in "obj" if the item is the receiver
    of an action.

    These labels are collected via hooks in SELinux and appended to the
    appropriate record in the audit code.

    This additional information is required for Common Criteria Labeled
    Security Protection Profile (LSPP).

    [AV: fixed kmalloc flags use]
    [folded leak fixes]
    [folded cleanup from akpm (kfree(NULL)]
    [folded audit_inode_context() leak fix]
    [folded akpm's fix for audit_ipc_perm() definition in case of !CONFIG_AUDIT]

    Signed-off-by: Dustin Kirkland
    Signed-off-by: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Dustin Kirkland
     

15 Jan, 2006

1 commit


12 Jan, 2006

1 commit

  • - Move capable() from sched.h to capability.h;

    - Use where capable() is used
    (in include/, block/, ipc/, kernel/, a few drivers/,
    mm/, security/, & sound/;
    many more drivers/ to go)

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy.Dunlap
     

25 Dec, 2005

1 commit