02 Mar, 2007

2 commits

  • This patch provides the following hugetlb-related fixes to the recent stacked
    shm files changes:
    - Update is_file_hugepages() so it will recognize hugetlb shm segments.
    - get_unmapped_area must be called with the nested file struct to handle
    the sfd->file->f_ops->get_unmapped_area == NULL case.
    - The fsync f_op must be wrapped since it is specified in the hugetlbfs
    f_ops.

    This is based on proposed fixes from Eric Biederman that were debugged and
    tested by me. Without it, attempting to use hugetlb shared memory segments
    on powerpc (and likely ia64) will kill your box.

    Signed-off-by: Adam Litke
    Cc: Eric Biederman
    Cc: Andrew Morton
    Acked-by: William Irwin
    Signed-off-by: Linus Torvalds

    Adam Litke
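
    The NULL-op handling and the wrapped fsync described in this commit can be
    pictured with a small user-space model. This is only a sketch: the
    model_* types and functions are hypothetical stand-ins invented for
    illustration, not the kernel's structures or the patch's code, and the
    fallback behaviour (returning the hint address, returning -EINVAL) is an
    assumption of the sketch.

```c
#include <errno.h>
#include <stddef.h>

struct model_file;

/* Hypothetical, cut-down stand-in for the kernel's file_operations. */
struct model_fops {
    unsigned long (*get_unmapped_area)(struct model_file *f,
                                       unsigned long addr, unsigned long len);
    int (*fsync)(struct model_file *f);
};

struct model_file {
    const struct model_fops *f_op;
    void *private_data;          /* for a shm file: struct model_shm_data */
};

/* Per-attach data of the stacked shm file; "file" is the nested backing file. */
struct model_shm_data {
    struct model_file *file;
};

/* Assumed generic fallback: simply accept the caller's hint address. */
static unsigned long model_default_area(struct model_file *f,
                                        unsigned long addr, unsigned long len)
{
    (void)f;
    (void)len;
    return addr;
}

/* Resolve the nested file first, then cope with a missing operation. */
unsigned long model_shm_get_unmapped_area(struct model_file *shm,
                                          unsigned long addr, unsigned long len)
{
    struct model_shm_data *sfd = shm->private_data;
    struct model_file *backing = sfd->file;

    if (backing->f_op->get_unmapped_area)
        return backing->f_op->get_unmapped_area(backing, addr, len);
    return model_default_area(backing, addr, len);
}

/* Wrapped fsync: forward to the backing file only if it provides one. */
int model_shm_fsync(struct model_file *shm)
{
    struct model_shm_data *sfd = shm->private_data;
    struct model_file *backing = sfd->file;

    if (backing->f_op->fsync)
        return backing->f_op->fsync(backing);
    return -EINVAL;
}
```

    The point of the model is that every wrapper dereferences the nested file
    before touching its operations, so a backing file that lacks an operation
    never leads to a call through a NULL pointer.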
     
  • shm_nopage() can become static.

    Signed-off-by: Adrian Bunk
    Acked-by: Eric W. Biederman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

21 Feb, 2007

1 commit

  • The current ipc shared memory code runs into several problems because it
    does not quite use files like the rest of the kernel. With the option of
    backing ipc shared memory with either hugetlbfs or ordinary shared memory
    the problems got worse. With the added support for ipc namespaces things
    behaved so unexpectedly that we now have several bad namespace reference
    counting bugs when using what appears at first glance to be a reasonable
    idiom.

    So to attack these problems and hopefully make the code more maintainable
    this patch simply uses the files provided by other parts of the kernel and
    builds its own files out of them. The shm files are allocated in do_shmat
    and freed when their reference count drops to zero with their last unmap.
    The file and vm operations that we don't want to implement or we don't
    implement completely we just delegate to the operations of our backing
    file.

    This means that we now get an accurate shm_nattch count when we have a
    hugetlbfs inode for backing store, and the shm accounting of last attach
    and last detach time work as well.

    This means that getting a reference to the ipc namespace when we create the
    file and dropping the reference in the release method is now safe and
    correct.

    This means we no longer need a special case for clearing VM_MAYWRITE
    as our file descriptor now only has write permissions when we have
    requested write access when calling shmat. Although VM_SHARED is now
    cleared as well which I believe is harmless and is most likely a
    minor bug fix.

    By using the same set of operations for both the hugetlb case and regular
    shared memory case shmdt is now simplified and made slightly more correct
    as now the test "vma->vm_ops == &shm_vm_ops" is 100% accurate in spotting
    all shared memory regions generated from sysvipc shared memory.

    Signed-off-by: Eric W. Biederman
    Cc: Michal Piotrowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
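
    The lifetime rules described in this commit (namespace reference taken
    when the file is created and dropped on release, attach count and
    timestamps maintained on every map and unmap) can be modelled roughly as
    below. All model_* names are hypothetical and exist only for this sketch;
    they are not the kernel's identifiers.

```c
#include <stddef.h>

struct model_ns {
    int refs;                    /* references held on the ipc namespace */
};

struct model_segment {
    unsigned long nattch;        /* current number of attaches */
    long atime;                  /* time of last attach */
    long dtime;                  /* time of last detach */
};

/* The per-attach "shm file" built on top of a backing file. */
struct model_shm_file {
    struct model_ns *ns;
    struct model_segment *seg;
};

/* File creation: take the namespace reference once, up front. */
void model_file_create(struct model_shm_file *f,
                       struct model_ns *ns, struct model_segment *seg)
{
    ns->refs++;
    f->ns = ns;
    f->seg = seg;
}

/* A new mapping of the file appears. */
void model_vm_open(struct model_shm_file *f, long now)
{
    f->seg->nattch++;
    f->seg->atime = now;
}

/* A mapping of the file goes away. */
void model_vm_close(struct model_shm_file *f, long now)
{
    f->seg->nattch--;
    f->seg->dtime = now;
}

/* Release: runs when the last reference to the file is dropped. */
void model_file_release(struct model_shm_file *f)
{
    f->ns->refs--;
    f->ns = NULL;
    f->seg = NULL;
}
```

    Because the same open and close hooks run regardless of the backing
    store, the counters in this model stay accurate for both kinds of
    segment.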
     

13 Feb, 2007

1 commit

  • Many struct file_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
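
    As a small illustration of the idea, here is a user-space sketch of a
    const table of function pointers. The demo_* names are made up for this
    example; where a toolchain actually places the object is up to the
    compiler and linker, although const objects with static storage commonly
    end up in a read-only section.

```c
#include <stddef.h>

/* Hypothetical miniature operations table. */
struct demo_file_ops {
    long (*read)(char *buf, size_t len);
};

/* Fills the buffer with 'x' and reports how many bytes were produced. */
static long demo_read(char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = 'x';
    return (long)len;
}

/* const: the table is never meant to change after initialisation. */
static const struct demo_file_ops demo_fops = {
    .read = demo_read,
};

const struct demo_file_ops *demo_get_fops(void)
{
    return &demo_fops;
}

/*
 * An assignment such as
 *     demo_fops.read = NULL;
 * is rejected by the compiler because the object is const.
 */
```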
     

24 Jan, 2007

1 commit


09 Dec, 2006

1 commit


04 Nov, 2006

1 commit

  • Fix two issues related to ipc_ids->entries freeing.

    1. When freeing ipc namespace we need to free entries allocated
    with ipc_init_ids().

    2. When removing old entries in grow_ary() ipc_rcu_putref()
    may be called on entries set to &ids->nullentry earlier in
    ipc_init_ids().
    This is almost impossible without namespaces, but with
    them this situation becomes possible.

    Found during OpenVZ testing after obvious leaks in beancounters.

    Signed-off-by: Pavel Emelianov
    Cc: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelianov
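
    The second issue concerns a sentinel that must never be released like a
    heap allocation. A minimal user-space model of that pattern follows; the
    model_* names are hypothetical, and plain malloc()/free() stand in for the
    kernel's allocation and ipc_rcu_putref() machinery.

```c
#include <stdlib.h>

struct model_ary {
    int size;
};

struct model_ids {
    struct model_ary nullentry;  /* embedded sentinel, never heap allocated */
    struct model_ary *entries;   /* points at nullentry or a heap array */
};

/* After init, entries points at the embedded sentinel. */
void model_init_ids(struct model_ids *ids)
{
    ids->nullentry.size = 0;
    ids->entries = &ids->nullentry;
}

/*
 * Drop the current entries. Returns 1 if a heap block was freed and 0 if
 * the sentinel was (correctly) left alone.
 */
int model_put_entries(struct model_ids *ids)
{
    int freed = 0;

    if (ids->entries != &ids->nullentry) {
        free(ids->entries);
        freed = 1;
    }
    ids->entries = &ids->nullentry;
    return freed;
}

/* Replace the entries with a larger array, releasing the old one safely. */
int model_grow(struct model_ids *ids, int newsize)
{
    struct model_ary *fresh = malloc(sizeof(*fresh));

    if (!fresh)
        return -1;
    fresh->size = newsize;
    model_put_entries(ids);
    ids->entries = fresh;
    return 0;
}
```

    Calling model_put_entries() at teardown covers the first issue in the
    model as well: whatever was allocated gets freed exactly once.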
     

02 Oct, 2006

1 commit


01 Jul, 2006

1 commit


23 Jun, 2006

1 commit


20 Jun, 2006

1 commit

  • The following patch addresses most of the issues with the IPC_SET_PERM
    records as described in:
    https://www.redhat.com/archives/linux-audit/2006-May/msg00010.html
    and addresses the comments I received on the record field names.

    To summarize, I made the following changes:

    1. Changed sys_msgctl() and semctl_down() so that an IPC_SET_PERM
    record is emitted in the failure case as well as the success case.
    This matches the behavior in sys_shmctl(). I could simplify the
    code in sys_msgctl() and semctl_down() slightly but it would mean
    that in some error cases we could get an IPC_SET_PERM record
    without an IPC record and that seemed odd.

    2. No change to the IPC record type, given no feedback on the backward
    compatibility question.

    3. Removed the qbytes field from the IPC record. It wasn't being
    set and when audit_ipc_obj() is called from ipcperms(), the
    information isn't available. If we want the information in the IPC
    record, more extensive changes will be necessary. Since it only
    applies to message queues and it isn't really permission related, it
    doesn't seem worth it.

    4. Removed the obj field from the IPC_SET_PERM record. This means that
    the kern_ipc_perm argument is no longer needed.

    5. Removed the spaces and renamed the IPC_SET_PERM field names. Replaced iuid and
    igid fields with ouid and ogid in the IPC record.

    I tested this with the lspp.22 kernel on an x86_64 box. I believe it
    applies cleanly on the latest kernel.

    -- ljk

    Signed-off-by: Linda Knippers
    Signed-off-by: Al Viro

    Linda Knippers
     

01 May, 2006

1 commit

  • 1) The audit_ipc_perms() function has been split into two different
    functions:
    - audit_ipc_obj()
    - audit_ipc_set_perm()

    There's a key shift here... The audit_ipc_obj() collects the uid, gid,
    mode, and SELinux context label of the current ipc object. This
    audit_ipc_obj() hook is now found in several places. Most notably, it
    is hooked in ipcperms(), which is called in various places around the
    ipc code performing a MAC check. Additionally there are several places
    where *checkid() is used to validate that an operation is being
    performed on a valid object while not necessarily having a nearby
    ipcperms() call. In these locations, audit_ipc_obj() is called to
    ensure that the information is captured by the audit system.

    The audit_ipc_set_perm() function is called any time the permissions on
    the ipc object change. In this case, the NEW permissions are recorded
    (and note that an audit_ipc_obj() call exists just a few lines before
    each instance).

    2) Support for an AUDIT_IPC_SET_PERM audit message type. This allows
    for separate auxiliary audit records for normal operations on an IPC
    object and permissions changes. Note that the same struct
    audit_aux_data_ipcctl is used and populated, however there are separate
    audit_log_format statements based on the type of the message. Finally,
    the AUDIT_IPC block of code in audit_free_aux() was extended to handle
    aux messages of this new type. No more mem leaks I hope ;-)

    Signed-off-by: Al Viro

    Steve Grubb
     

18 Apr, 2006

1 commit

  • I found that all of 2.4 and 2.6 have been letting mprotect give write
    permission to a readonly attachment of shared memory, whether or not IPC
    would give the caller that permission.

    SUS says "The behaviour of this function [mprotect] is unspecified if the
    mapping was not established by a call to mmap", but I don't think we can
    interpret that as allowing it to subvert IPC permissions.

    I haven't tried 2.2, but the 2.2.26 source looks like it gets it right; and
    the patch below reproduces that behaviour - mprotect cannot be used to add
    write permission to a shared memory segment attached readonly.

    This patch is simple, and I'm sure it's what we should have done in 2.4.0:
    if you want to go on to switch write permission on and off with mprotect,
    just don't attach the segment readonly in the first place.

    However, we could have accumulated apps which attach readonly (even though
    they would be permitted to attach read/write), and which subsequently use
    mprotect to switch write permission on and off: it's not unreasonable.

    I was going to add a second ipcperms check in do_shmat, to check for
    writable when readonly, and if not writable find_vma and clear VM_MAYWRITE.
    But security_ipc_permission might do auditing, and it seems wrong to
    report an attempt for write permission when there has been none. Or we
    could flag the vma as SHM, note the shmid or shp in vm_private_data, and
    then get mprotect to check.

    But the patch below is a lot simpler: I'd rather stick with it, if we can
    convince ourselves somehow that it'll be safe.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Hugh Dickins
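
    The behaviour in question can be probed from user space with a short test
    along these lines. It is a sketch using the standard shmget(), shmat(),
    mprotect() and shmdt() calls; the function name and its return convention
    are invented for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>
#include <unistd.h>

/*
 * Attach a fresh segment read-only, then ask mprotect() for write access.
 * Returns  1 if mprotect() granted write permission,
 *          0 if mprotect() refused,
 *         -1 if the segment could not be set up at all.
 */
int try_upgrade_readonly_attach(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *addr;
    int id, rc;

    if (page <= 0)
        return -1;

    id = shmget(IPC_PRIVATE, (size_t)page, IPC_CREAT | 0600);
    if (id < 0)
        return -1;

    addr = shmat(id, NULL, SHM_RDONLY);
    /* Mark for removal now; the segment goes away after the last detach. */
    shmctl(id, IPC_RMID, NULL);
    if (addr == (void *)-1)
        return -1;

    rc = mprotect(addr, (size_t)page, PROT_READ | PROT_WRITE);
    shmdt(addr);
    return rc == 0 ? 1 : 0;
}
```

    With the behaviour this commit aims for, the function would be expected
    to return 0; a return of 1 indicates that mprotect() was able to add
    write permission to the read-only attachment.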
     

02 Apr, 2006

1 commit


27 Mar, 2006

1 commit

  • Semaphore to mutex conversion.

    The conversion was generated via scripts, and the result was validated
    automatically via a script as well.

    Signed-off-by: Ingo Molnar
    Cc: Manfred Spraul
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

26 Mar, 2006

1 commit

  • * 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits)
    [PATCH] fix audit_init failure path
    [PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format
    [PATCH] sem2mutex: audit_netlink_sem
    [PATCH] simplify audit_free() locking
    [PATCH] Fix audit operators
    [PATCH] promiscuous mode
    [PATCH] Add tty to syscall audit records
    [PATCH] add/remove rule update
    [PATCH] audit string fields interface + consumer
    [PATCH] SE Linux audit events
    [PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c
    [PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL
    [PATCH] Fix IA64 success/failure indication in syscall auditing.
    [PATCH] Miscellaneous bug and warning fixes
    [PATCH] Capture selinux subject/object context information.
    [PATCH] Exclude messages by message type
    [PATCH] Collect more inode information during syscall processing.
    [PATCH] Pass dentry, not just name, in fsnotify creation hooks.
    [PATCH] Define new range of userspace messages.
    [PATCH] Filter rule comparators
    ...

    Fixed trivial conflict in security/selinux/hooks.c

    Linus Torvalds
     

24 Mar, 2006

1 commit

  • SUSv3 says the shmdt() function shall fail with EINVAL if the value of
    shmaddr is not the data segment start address of a shared memory segment:
    our sys_shmdt needs to reject a shmaddr which is not page-aligned.

    Does it have the potential to break existing apps?

    Hugh says

    "sys_shmdt() just does the wrong (unexpected) thing with a misaligned
    address: it'll fail on what you might expect it to succeed on, and only
    succeed on what it should definitely fail on.

    "That is, I think it behaves as if shmaddr gets rounded up, when the only
    understandable behaviour would be if it rounded it down.

    "Which does mean you'd have to be devious to see anything but EINVAL from
    a misaligned shmaddr there, so it's not terribly important."

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
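
    The check this commit calls for amounts to rejecting any address with
    low-order bits set. A minimal sketch, assuming a power-of-two page size;
    model_shmdt_check() is a hypothetical helper and not the kernel function
    itself.

```c
#include <errno.h>

/*
 * Returns -EINVAL for an address that is not aligned to page_size and 0
 * otherwise. page_size must be a power of two.
 */
int model_shmdt_check(unsigned long shmaddr, unsigned long page_size)
{
    if (shmaddr & (page_size - 1))
        return -EINVAL;
    return 0;
}
```

    For a 4096-byte page, 0x1000 passes the check while 0x1001 is rejected
    with -EINVAL.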
     

21 Mar, 2006

1 commit

  • This patch extends existing audit records with subject/object context
    information. Audit records associated with filesystem inodes, ipc, and
    tasks now contain SELinux label information in the field "subj" if the
    item is performing the action, or in "obj" if the item is the receiver
    of an action.

    These labels are collected via hooks in SELinux and appended to the
    appropriate record in the audit code.

    This additional information is required for Common Criteria Labeled
    Security Protection Profile (LSPP).

    [AV: fixed kmalloc flags use]
    [folded leak fixes]
    [folded cleanup from akpm (kfree(NULL))]
    [folded audit_inode_context() leak fix]
    [folded akpm's fix for audit_ipc_perm() definition in case of !CONFIG_AUDIT]

    Signed-off-by: Dustin Kirkland
    Signed-off-by: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Dustin Kirkland
     

11 Feb, 2006

1 commit

  • sys_shmdt() can manage shm segments which are covered by multiple vmas. (This
    can happen when a user uses mprotect() after shmat().)

    This works well if shm is aligned to PAGE_SIZE, but if not, the last
    segment cannot be detached. It is because a comparison in sys_shmdt()

    (vma->vm_end - addr) < size
    addr == return address of shmat()
    size == shmsize, argument to shmget()

    size should be aligned to PAGE_SIZE before being compared with vma->vm_end,
    which is aligned.

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Manfred Spraul
    Acked-by: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
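
    A worked example of why the size has to be rounded up before the
    comparison. The helper names are hypothetical, and the use of "<=" in the
    range test is an assumption of this sketch rather than a quotation of the
    kernel code.

```c
/* Round size up to the next multiple of page_size (a power of two). */
unsigned long model_page_align(unsigned long size, unsigned long page_size)
{
    return (size + page_size - 1) & ~(page_size - 1);
}

/* Range test using the raw, possibly unaligned segment size. */
int model_in_segment_unaligned(unsigned long vm_end, unsigned long addr,
                               unsigned long size)
{
    return (vm_end - addr) <= size;
}

/* Range test using the page-aligned segment size. */
int model_in_segment_aligned(unsigned long vm_end, unsigned long addr,
                             unsigned long size, unsigned long page_size)
{
    return (vm_end - addr) <= model_page_align(size, page_size);
}
```

    Take a 5000-byte segment attached at 0x10000 with 4096-byte pages: the
    mapping ends at 0x12000, so vm_end - addr is 8192. Compared against 5000
    the last vma falls outside the segment, while compared against the
    aligned size of 8192 it is included.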
     

12 Jan, 2006

1 commit

  • - Move capable() from sched.h to capability.h;

    - Use <linux/capability.h> where capable() is used
    (in include/, block/, ipc/, kernel/, a few drivers/,
    mm/, security/, & sound/;
    many more drivers/ to go)

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy.Dunlap
     

09 Jan, 2006

1 commit


07 Jan, 2006

1 commit

  • The attached patch makes the SYSV IPC shared memory facilities use the new
    ramfs facilities on a no-MMU kernel.

    The following changes are made:

    (1) There are now shmem_mmap() and shmem_get_unmapped_area() functions to
    allow the IPC SHM facilities to commune with the tiny-shmem and shmem
    code.

    (2) ramfs files now need resizing using do_truncate() rather than by modifying
    the inode size directly (see shmem_file_setup()). This causes ramfs to
    attempt to bind a block of pages of sufficient size to the inode.

    (3) CONFIG_SYSVIPC is no longer contingent on CONFIG_MMU.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

07 Nov, 2005

1 commit

  • Add SHM_NORESERVE functionality similar to MAP_NORESERVE for shared memory
    segments.

    This is mainly to avoid abuse of OVERCOMMIT_ALWAYS and this flag is ignored
    for OVERCOMMIT_NEVER.

    Signed-off-by: Badari Pulavarty
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
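
    From user space the flag is passed to shmget() together with the usual
    creation flags. A minimal sketch follows; the helper names are invented
    for this example, and the fallback definition of SHM_NORESERVE is only
    there in case the C library headers hide the constant.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_NORESERVE
#define SHM_NORESERVE 010000     /* Linux value, used only as a fallback */
#endif

/* Create a private segment without reserving swap; returns id or -1. */
int create_noreserve_segment(size_t size)
{
    return shmget(IPC_PRIVATE, size, IPC_CREAT | SHM_NORESERVE | 0600);
}

/* Mark the segment for removal; returns 0 on success, -1 on error. */
int remove_segment(int id)
{
    return shmctl(id, IPC_RMID, NULL);
}
```

    A caller would check the returned id for -1 (with errno set) before
    attaching, and remove the segment once it is no longer needed.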
     

30 Oct, 2005

1 commit


08 Sep, 2005

1 commit


02 Aug, 2005

1 commit


01 May, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds