02 Oct, 2006

1 commit


01 Oct, 2006

2 commits

  • This is mostly included for parity with dec_nlink(), where we will have some
    more hooks. This one should stay pretty darn straightforward for now.

    Signed-off-by: Dave Hansen
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • When a filesystem decrements i_nlink to zero, it means that a write must be
    performed in order to drop the inode from the filesystem.

    We're shortly going to have keep filesystems from being remounted r/o between
    the time that this i_nlink decrement and that write occurs.

    So, add a little helper function to do the decrements. We'll tie into it in a
    bit to note when i_nlink hits zero.

    Signed-off-by: Dave Hansen
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     

27 Sep, 2006

2 commits

  • This eliminates the i_blksize field from struct inode. Filesystems that want
    to provide a per-inode st_blksize can do so by providing their own getattr
    routine instead of using the generic_fillattr() function.

    Note that some filesystems were providing pretty much random (and incorrect)
    values for i_blksize.

    [bunk@stusta.de: cleanup]
    [akpm@osdl.org: generic_fillattr() fix]
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Theodore Ts'o
     
  • * Rougly half of callers already do it by not checking return value
    * Code in drivers/acpi/osl.c does the following to be sure:

    (void)kmem_cache_destroy(cache);

    * Those who check it printk something, however, slab_error already printed
    the name of failed cache.
    * XFS BUGs on failed kmem_cache_destroy which is not the decision
    low-level filesystem driver should make. Converted to ignore.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

01 Aug, 2006

1 commit

  • Clean up ipc/msg.c to conform to Documentation/CodingStyle. (before it was
    an inconsistent hodepodge of various coding styles)

    Verified that the before/after .o's are identical.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

01 Jul, 2006

1 commit


23 Jun, 2006

3 commits

  • Pass the POSIX lock owner ID to the flush operation.

    This is useful for filesystems which don't want to store any locking state
    in inode->i_flock but want to handle locking/unlocking POSIX locks
    internally. FUSE is one such filesystem but I think it possible that some
    network filesystems would need this also.

    Also add a flag to indicate that a POSIX locking request was generated by
    close(), so filesystems using the above feature won't send an extra locking
    request in this case.

    Signed-off-by: Miklos Szeredi
    Cc: Trond Myklebust
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Remove the unused variable o_flags from do_shmat.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Extend the get_sb() filesystem operation to take an extra argument that
    permits the VFS to pass in the target vfsmount that defines the mountpoint.

    The filesystem is then required to manually set the superblock and root dentry
    pointers. For most filesystems, this should be done with simple_set_mnt()
    which will set the superblock pointer and then set the root dentry to the
    superblock's s_root (as per the old default behaviour).

    The get_sb() op now returns an integer as there's now no need to return the
    superblock pointer.

    This patch permits a superblock to be implicitly shared amongst several mount
    points, such as can be done with NFS to avoid potential inode aliasing. In
    such a case, simple_set_mnt() would not be called, and instead the mnt_root
    and mnt_sb would be set directly.

    The patch also makes the following changes:

    (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
    pointer argument and return an integer, so most filesystems have to change
    very little.

    (*) If one of the convenience function is not used, then get_sb() should
    normally call simple_set_mnt() to instantiate the vfsmount. This will
    always return 0, and so can be tail-called from get_sb().

    (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
    dcache upon superblock destruction rather than shrink_dcache_anon().

    This is required because the superblock may now have multiple trees that
    aren't actually bound to s_root, but that still need to be cleaned up. The
    currently called functions assume that the whole tree is rooted at s_root,
    and that anonymous dentries are not the roots of trees which results in
    dentries being left unculled.

    However, with the way NFS superblock sharing are currently set to be
    implemented, these assumptions are violated: the root of the filesystem is
    simply a dummy dentry and inode (the real inode for '/' may well be
    inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
    with child trees.

    [*] Anonymous until discovered from another tree.

    (*) The documentation has been adjusted, including the additional bit of
    changing ext2_* into foo_* in the documentation.

    [akpm@osdl.org: convert ipath_fs, do other stuff]
    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Nathan Scott
    Cc: Roland Dreier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

20 Jun, 2006

2 commits

  • This patch adds audit support to POSIX message queues. It applies cleanly to
    the lspp.b15 branch of Al Viro's git tree. There are new auxiliary data
    structures, and collection and emission routines in kernel/auditsc.c. New hooks
    in ipc/mqueue.c collect arguments from the syscalls.

    I tested the patch by building the examples from the POSIX MQ library tarball.
    Build them -lrt, not against the old MQ library in the tarball. Here's the URL:
    http://www.geocities.com/wronski12/posix_ipc/libmqueue-4.41.tar.gz
    Do auditctl -a exit,always -S for mq_open, mq_timedsend, mq_timedreceive,
    mq_notify, mq_getsetattr. mq_unlink has no new hooks. Please see the
    corresponding userspace patch to get correct output from auditd for the new
    record types.

    [fixes folded]

    Signed-off-by: George Wilson
    Signed-off-by: Al Viro

    George C. Wilson
     
  • The following patch addresses most of the issues with the IPC_SET_PERM
    records as described in:
    https://www.redhat.com/archives/linux-audit/2006-May/msg00010.html
    and addresses the comments I received on the record field names.

    To summarize, I made the following changes:

    1. Changed sys_msgctl() and semctl_down() so that an IPC_SET_PERM
    record is emitted in the failure case as well as the success case.
    This matches the behavior in sys_shmctl(). I could simplify the
    code in sys_msgctl() and semctl_down() slightly but it would mean
    that in some error cases we could get an IPC_SET_PERM record
    without an IPC record and that seemed odd.

    2. No change to the IPC record type, given no feedback on the backward
    compatibility question.

    3. Removed the qbytes field from the IPC record. It wasn't being
    set and when audit_ipc_obj() is called from ipcperms(), the
    information isn't available. If we want the information in the IPC
    record, more extensive changes will be necessary. Since it only
    applies to message queues and it isn't really permission related, it
    doesn't seem worth it.

    4. Removed the obj field from the IPC_SET_PERM record. This means that
    the kern_ipc_perm argument is no longer needed.

    5. Removed the spaces and renamed the IPC_SET_PERM field names. Replaced iuid and
    igid fields with ouid and ogid in the IPC record.

    I tested this with the lspp.22 kernel on an x86_64 box. I believe it
    applies cleanly on the latest kernel.

    -- ljk

    Signed-off-by: Linda Knippers
    Signed-off-by: Al Viro

    Linda Knippers
     

01 May, 2006

1 commit

  • 1) The audit_ipc_perms() function has been split into two different
    functions:
    - audit_ipc_obj()
    - audit_ipc_set_perm()

    There's a key shift here... The audit_ipc_obj() collects the uid, gid,
    mode, and SElinux context label of the current ipc object. This
    audit_ipc_obj() hook is now found in several places. Most notably, it
    is hooked in ipcperms(), which is called in various places around the
    ipc code permforming a MAC check. Additionally there are several places
    where *checkid() is used to validate that an operation is being
    performed on a valid object while not necessarily having a nearby
    ipcperms() call. In these locations, audit_ipc_obj() is called to
    ensure that the information is captured by the audit system.

    The audit_set_new_perm() function is called any time the permissions on
    the ipc object changes. In this case, the NEW permissions are recorded
    (and note that an audit_ipc_obj() call exists just a few lines before
    each instance).

    2) Support for an AUDIT_IPC_SET_PERM audit message type. This allows
    for separate auxiliary audit records for normal operations on an IPC
    object and permissions changes. Note that the same struct
    audit_aux_data_ipcctl is used and populated, however there are separate
    audit_log_format statements based on the type of the message. Finally,
    the AUDIT_IPC block of code in audit_free_aux() was extended to handle
    aux messages of this new type. No more mem leaks I hope ;-)

    Signed-off-by: Al Viro

    Steve Grubb
     

18 Apr, 2006

2 commits

  • grow_ary() should not copy struct ipc_id_ary (it copies new->p, not
    new). Due to this, memcpy() src pointer could hit unmapped vmalloc page
    when near page boundary.

    Found during OpenVZ stress testing

    Signed-off-by: Alexey Kuznetsov
    Signed-off-by: Kirill Korotaev
    Signed-off-by: Linus Torvalds

    Alexey Kuznetsov
     
  • I found that all of 2.4 and 2.6 have been letting mprotect give write
    permission to a readonly attachment of shared memory, whether or not IPC
    would give the caller that permission.

    SUS says "The behaviour of this function [mprotect] is unspecified if the
    mapping was not established by a call to mmap", but I don't think we can
    interpret that as allowing it to subvert IPC permissions.

    I haven't tried 2.2, but the 2.2.26 source looks like it gets it right; and
    the patch below reproduces that behaviour - mprotect cannot be used to add
    write permission to a shared memory segment attached readonly.

    This patch is simple, and I'm sure it's what we should have done in 2.4.0:
    if you want to go on to switch write permission on and off with mprotect,
    just don't attach the segment readonly in the first place.

    However, we could have accumulated apps which attach readonly (even though
    they would be permitted to attach read/write), and which subsequently use
    mprotect to switch write permission on and off: it's not unreasonable.

    I was going to add a second ipcperms check in do_shmat, to check for
    writable when readonly, and if not writable find_vma and clear VM_MAYWRITE.
    But security_ipc_permission might do auditing, and it seems wrong to
    report an attempt for write permission when there has been none. Or we
    could flag the vma as SHM, note the shmid or shp in vm_private_data, and
    then get mprotect to check.

    But the patch below is a lot simpler: I'd rather stick with it, if we can
    convince ourselves somehow that it'll be safe.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Hugh Dickins
     

02 Apr, 2006

1 commit


01 Apr, 2006

1 commit


29 Mar, 2006

1 commit


27 Mar, 2006

4 commits

  • Ingo's sem2mutex patch incorrectly replaced one reference to ipc/sem.c
    with ipc/mutex.c in a comment.

    Signed-off-by: Manfred Spraul
    Signed-off-by: Linus Torvalds

    Manfred Spraul
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
    drivers/char/ftape/lowlevel/fdc-io.c: Correct a comment
    Kconfig help: MTD_JEDECPROBE already supports Intel
    Remove ugly debugging stuff
    do_mounts.c: Minor ROOT_DEV comment cleanup
    BUG_ON() Conversion in drivers/s390/block/dasd_devmap.c
    BUG_ON() Conversion in mm/mempool.c
    BUG_ON() Conversion in mm/memory.c
    BUG_ON() Conversion in kernel/fork.c
    BUG_ON() Conversion in ipc/sem.c
    BUG_ON() Conversion in fs/ext2/
    BUG_ON() Conversion in fs/hfs/
    BUG_ON() Conversion in fs/dcache.c
    BUG_ON() Conversion in fs/buffer.c
    BUG_ON() Conversion in input/serio/hp_sdc_mlc.c
    BUG_ON() Conversion in md/dm-table.c
    BUG_ON() Conversion in md/dm-path-selector.c
    BUG_ON() Conversion in drivers/isdn
    BUG_ON() Conversion in drivers/char
    BUG_ON() Conversion in drivers/mtd/

    Linus Torvalds
     
  • Semaphore to mutex conversion.

    The conversion was generated via scripts, and the result was validated
    automatically via a script as well.

    Signed-off-by: Ingo Molnar
    Cc: Manfred Spraul
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • this changes if() BUG(); constructs to BUG_ON() which is
    cleaner, contains unlikely() and can better optimized away.

    Signed-off-by: Eric Sesterhenn
    Signed-off-by: Adrian Bunk

    Eric Sesterhenn
     

26 Mar, 2006

1 commit

  • * 'audit.b3' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (22 commits)
    [PATCH] fix audit_init failure path
    [PATCH] EXPORT_SYMBOL patch for audit_log, audit_log_start, audit_log_end and audit_format
    [PATCH] sem2mutex: audit_netlink_sem
    [PATCH] simplify audit_free() locking
    [PATCH] Fix audit operators
    [PATCH] promiscuous mode
    [PATCH] Add tty to syscall audit records
    [PATCH] add/remove rule update
    [PATCH] audit string fields interface + consumer
    [PATCH] SE Linux audit events
    [PATCH] Minor cosmetic cleanups to the code moved into auditfilter.c
    [PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL
    [PATCH] Fix IA64 success/failure indication in syscall auditing.
    [PATCH] Miscellaneous bug and warning fixes
    [PATCH] Capture selinux subject/object context information.
    [PATCH] Exclude messages by message type
    [PATCH] Collect more inode information during syscall processing.
    [PATCH] Pass dentry, not just name, in fsnotify creation hooks.
    [PATCH] Define new range of userspace messages.
    [PATCH] Filter rule comparators
    ...

    Fixed trivial conflict in security/selinux/hooks.c

    Linus Torvalds
     

25 Mar, 2006

1 commit


24 Mar, 2006

1 commit

  • SUSv3 says the shmdt() function shall fail with EINVAL if the value of
    shmaddr is not the data segment start address of a shared memory segment:
    our sys_shmdt needs to reject a shmaddr which is not page-aligned.

    Does it have the potential to break existing apps?

    Hugh says

    "sys_shmdt() just does the wrong (unexpected) thing with a misaligned
    address: it'll fail on what you might expect it to succeed on, and only
    succeed on what it should definitely fail on.

    "That is, I think it behaves as if shmaddr gets rounded up, when the only
    understandable behaviour would be if it rounded it down.

    "Which does mean you'd have to be devious to see anything but EINVAL from
    a misaligned shmaddr there, so it's not terribly important."

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

22 Mar, 2006

1 commit


21 Mar, 2006

1 commit

  • This patch extends existing audit records with subject/object context
    information. Audit records associated with filesystem inodes, ipc, and
    tasks now contain SELinux label information in the field "subj" if the
    item is performing the action, or in "obj" if the item is the receiver
    of an action.

    These labels are collected via hooks in SELinux and appended to the
    appropriate record in the audit code.

    This additional information is required for Common Criteria Labeled
    Security Protection Profile (LSPP).

    [AV: fixed kmalloc flags use]
    [folded leak fixes]
    [folded cleanup from akpm (kfree(NULL)]
    [folded audit_inode_context() leak fix]
    [folded akpm's fix for audit_ipc_perm() definition in case of !CONFIG_AUDIT]

    Signed-off-by: Dustin Kirkland
    Signed-off-by: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Dustin Kirkland
     

11 Feb, 2006

1 commit

  • sys_shmdt() can manage shm segments which are covered by multiple vmas. (This
    can happen when a user uses mprotect() after shmat().)

    This works well if shm is aligned to PAGE_SIZE, but if not, the last
    segment cannot be detached. It is because a comparison in sys_shmdt()

    (vma->vm_end - addr) < size
    addr == return address of shmat()
    size == shmsize, argments to shmget()

    size should be aligned to PAGE_SIZE before being compared with vma->vm_end,
    which is aligned.

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Manfred Spraul
    Acked-by: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     

10 Feb, 2006

1 commit

  • netlink overrun was broken while improvement of netlink.
    Destination socket is used in the place where it was meant to be source socket,
    so that now overrun is never sent to user netlink sockets, when it should be,
    and it even can be set on kernel socket, which results in complete deadlock
    of rtnetlink.

    Suggested fix is to restore status quo passing source socket as additional
    argument to netlink_attachskb().

    A little explanation: overrun is set on a socket, when it failed
    to receive some message and sender of this messages does not or even
    have no way to handle this error. This happens in two cases:
    1. when kernel sends something. Kernel never retransmits and cannot
    wait for buffer space.
    2. when user sends a broadcast and the message was not delivered
    to some recipients.

    Signed-off-by: Alexey Kuznetsov
    Signed-off-by: David S. Miller

    Alexey Kuznetsov
     

15 Jan, 2006

2 commits

  • I tried to send the forcedeth maintainer an email, but it came back with:

    "The mail address manfreds@colorfullife.com is not read anymore.
    Please resent your mail to manfred@ instead of manfreds@."

    This patch fixes this.

    Signed-off-by: Adrian Bunk

    Christian Kujau
     
  • Fixed the refcounting on failure exits in sys_mq_open() and
    cleaned the logics up. Rules are actually pretty simple - dentry_open()
    expects vfsmount and dentry to be pinned down and it either transfers
    them into created struct file or drops them. Old code had been very
    confused in that area - if dentry_open() had failed either in do_open()
    or do_create(), we ended up dentry and mqueue_mnt dropped twice, once
    by dentry_open() cleanup and then by sys_mq_open().

    Fix consists of making the rules for do_create() and do_open()
    same as for dentry_open() and updating the sys_mq_open() accordingly;
    that actually leads to more straightforward code and less work on
    normal path.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Alexander Viro
     

12 Jan, 2006

1 commit

  • - Move capable() from sched.h to capability.h;

    - Use where capable() is used
    (in include/, block/, ipc/, kernel/, a few drivers/,
    mm/, security/, & sound/;
    many more drivers/ to go)

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy.Dunlap
     

10 Jan, 2006

1 commit


09 Jan, 2006

1 commit


07 Jan, 2006

1 commit

  • The attached patch makes the SYSV IPC shared memory facilities use the new
    ramfs facilities on a no-MMU kernel.

    The following changes are made:

    (1) There are now shmem_mmap() and shmem_get_unmapped_area() functions to
    allow the IPC SHM facilities to commune with the tiny-shmem and shmem
    code.

    (2) ramfs files now need resizing using do_truncate() rather than by modifying
    the inode size directly (see shmem_file_setup()). This causes ramfs to
    attempt to bind a block of pages of sufficient size to the inode.

    (3) CONFIG_SYSVIPC is no longer contingent on CONFIG_MMU.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

25 Dec, 2005

2 commits


08 Nov, 2005

1 commit


07 Nov, 2005

2 commits

  • Various core kernel-doc cleanups:
    - add missing function parameters in ipc, irq/manage, kernel/sys,
    kernel/sysctl, and mm/slab;
    - move description to just above function for kernel_restart()

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Add SHM_NORESERVE functionality similar to MAP_NORESERVE for shared memory
    segments.

    This is mainly to avoid abuse of OVERCOMMIT_ALWAYS and this flag is ignored
    for OVERCOMMIT_NEVER.

    Signed-off-by: Badari Pulavarty
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty