21 Jun, 2006

3 commits

  • * 'audit.b21' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit-current: (25 commits)
    [PATCH] make set_loginuid obey audit_enabled
    [PATCH] log more info for directory entry change events
    [PATCH] fix AUDIT_FILTER_PREPEND handling
    [PATCH] validate rule fields' types
    [PATCH] audit: path-based rules
    [PATCH] Audit of POSIX Message Queue Syscalls v.2
    [PATCH] fix se_sen audit filter
    [PATCH] deprecate AUDIT_POSSIBLE
    [PATCH] inline more audit helpers
    [PATCH] proc_loginuid_write() uses simple_strtoul() on non-terminated array
    [PATCH] update of IPC audit record cleanup
    [PATCH] minor audit updates
    [PATCH] fix audit_krule_to_{rule,data} return values
    [PATCH] add filtering by ppid
    [PATCH] log ppid
    [PATCH] collect sid of those who send signals to auditd
    [PATCH] execve argument logging
    [PATCH] fix deadlocks in AUDIT_LIST/AUDIT_LIST_RULES
    [PATCH] audit_panic() is audit-internal
    [PATCH] inotify (5/5): update kernel documentation
    ...

    Manual fixup of conflict in include/linux/inotify.h

    Linus Torvalds
     
  • * git://git.infradead.org/~dwmw2/rbtree-2.6:
    [RBTREE] Switch rb_colour() et al to en_US spelling of 'color' for consistency
    Update UML kernel/physmem.c to use rb_parent() accessor macro
    [RBTREE] Update hrtimers to use rb_parent() accessor macro.
    [RBTREE] Add explicit alignment to sizeof(long) for struct rb_node.
    [RBTREE] Merge colour and parent fields of struct rb_node.
    [RBTREE] Remove dead code in rb_erase()
    [RBTREE] Update JFFS2 to use rb_parent() accessor macro.
    [RBTREE] Update eventpoll.c to use rb_parent() accessor macro.
    [RBTREE] Update key.c to use rb_parent() accessor macro.
    [RBTREE] Update ext3 to use rb_parent() accessor macro.
    [RBTREE] Change rbtree off-tree marking in I/O schedulers.
    [RBTREE] Add accessor macros for colour and parent fields of rb_node

    Linus Torvalds
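
    The "merge colour and parent fields" and "explicit alignment" entries above
    describe a common pointer-packing trick. The following is a minimal userspace
    sketch of that idea, not the kernel's code: struct node, node_parent() and the
    other names are hypothetical. It assumes nodes are aligned to at least
    sizeof(long), so the low bits of a parent address are always zero and can
    hold the colour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical node: the parent pointer and the colour share one word. */
struct node {
    uintptr_t parent_color;            /* parent address | colour bit */
    struct node *left, *right;
} __attribute__((aligned(sizeof(long))));

enum { RED = 0, BLACK = 1 };

/* Mask off the low bits to recover the parent pointer. */
static struct node *node_parent(const struct node *n)
{
    return (struct node *)(n->parent_color & ~(uintptr_t)3);
}

/* The colour lives in bit 0. */
static int node_color(const struct node *n)
{
    return (int)(n->parent_color & 1);
}

/* Replace the parent while keeping the colour bits. */
static void node_set_parent(struct node *n, struct node *p)
{
    assert(((uintptr_t)p & 3) == 0);   /* relies on the alignment above */
    n->parent_color = (n->parent_color & 3) | (uintptr_t)p;
}

/* Replace the colour while keeping the parent pointer. */
static void node_set_color(struct node *n, int color)
{
    n->parent_color = (n->parent_color & ~(uintptr_t)1) | (uintptr_t)(color & 1);
}
```

    Routing every access through accessors like these is what lets callers stay
    unaware of how the two fields are stored.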
     
  • * git://git.infradead.org/mtd-2.6: (199 commits)
    [MTD] NAND: Fix breakage all over the place
    [PATCH] NAND: fix remaining OOB length calculation
    [MTD] NAND Fixup NDFC merge brokenness
    [MTD NAND] S3C2410 driver cleanup
    [MTD NAND] s3c24x0 board: Fix clock handling, ensure proper initialisation.
    [JFFS2] Check CRC32 on dirent and data nodes each time they're read
    [JFFS2] When retiring nextblock, allocate a node_ref for the wasted space
    [JFFS2] Mark XATTR support as experimental, for now
    [JFFS2] Don't trust node headers before the CRC is checked.
    [MTD] Restore MTD_ROM and MTD_RAM types
    [MTD] assume mtd->writesize is 1 for NOR flashes
    [MTD NAND] Fix s3c2410 NAND driver so it at least _looks_ like it compiles
    [MTD] Prepare physmap for 64-bit-resources
    [JFFS2] Fix more breakage caused by janitorial meddling.
    [JFFS2] Remove stray __exit from jffs2_compressors_exit()
    [MTD] Allow alternate JFFS2 mount variant for root filesystem.
    [MTD] Disconnect struct mtd_info from ABI
    [MTD] replace MTD_RAM with MTD_GENERIC_TYPE
    [MTD] replace MTD_ROM with MTD_GENERIC_TYPE
    [MTD] remove a forgotten MTD_XIP
    ...

    Linus Torvalds
     

20 Jun, 2006

18 commits

  • Hi,

    I was doing some testing and noticed that when the audit system was disabled,
    I was still getting messages about the loginuid being set. The following patch
    makes audit_set_loginuid look at in_syscall to determine if it should create
    an audit event. The loginuid will continue to be set as long as there is a context.

    Signed-off-by: Steve Grubb
    Signed-off-by: Al Viro

    Steve Grubb
     
  • When an audit event involves changes to a directory entry, include
    a PATH record for the directory itself. A few other notable changes:

    - fixed audit_inode_child() hooks in fsnotify_move()
    - removed unused flags arg from audit_inode()
    - added audit log routines for logging a portion of a string

    Here's some sample output.

    before patch:
    type=SYSCALL msg=audit(1149821605.320:26): arch=40000003 syscall=39 success=yes exit=0 a0=bf8d3c7c a1=1ff a2=804e1b8 a3=bf8d3c7c items=1 ppid=739 pid=800 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
    type=CWD msg=audit(1149821605.320:26): cwd="/root"
    type=PATH msg=audit(1149821605.320:26): item=0 name="foo" parent=164068 inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0

    after patch:
    type=SYSCALL msg=audit(1149822032.332:24): arch=40000003 syscall=39 success=yes exit=0 a0=bfdd9c7c a1=1ff a2=804e1b8 a3=bfdd9c7c items=2 ppid=714 pid=777 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 comm="mkdir" exe="/bin/mkdir" subj=root:system_r:unconfined_t:s0-s0:c0.c255
    type=CWD msg=audit(1149822032.332:24): cwd="/root"
    type=PATH msg=audit(1149822032.332:24): item=0 name="/root" inode=164068 dev=03:00 mode=040750 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_dir_t:s0
    type=PATH msg=audit(1149822032.332:24): item=1 name="foo" inode=164010 dev=03:00 mode=040755 ouid=0 ogid=0 rdev=00:00 obj=root:object_r:user_home_t:s0

    Signed-off-by: Amy Griffis
    Signed-off-by: Al Viro

    Amy Griffis
     
  • Clear AUDIT_FILTER_PREPEND flag after adding rule to list. This
    fixes three problems when a rule is added with the -A syntax:

    - auditctl displays filter list as "(null)"
    - the rule cannot be removed using -d
    - a duplicate rule can be added with -a

    Signed-off-by: Amy Griffis
    Signed-off-by: Al Viro

    Amy Griffis
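
    A toy model may help show why a stored rule must not keep the prepend bit.
    The names, the flag value and the array-based list below are all
    illustrative assumptions, not the audit code. The sketch clears the bit
    before storing the rule, so later lookups by plain list number (as the -a
    and -d paths would do) still match it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FILTER_PREPEND 0x10u   /* illustrative flag bit, not taken from the patch */
#define MAX_RULES 8

struct rule {
    unsigned flags;            /* list number; never stored with FILTER_PREPEND */
    int id;                    /* stands in for the rule's fields */
};

static struct rule rules[MAX_RULES];
static size_t nrules;

/* Look a rule up by list number and id; returns its index or -1. */
static int find_rule(unsigned list, int id)
{
    for (size_t i = 0; i < nrules; i++)
        if (rules[i].flags == list && rules[i].id == id)
            return (int)i;
    return -1;
}

/* Add a rule; FILTER_PREPEND only selects front insertion and is then dropped. */
static int add_rule(unsigned flags, int id)
{
    unsigned list = flags & ~FILTER_PREPEND;
    struct rule r = { list, id };

    if (nrules == MAX_RULES || find_rule(list, id) >= 0)
        return -1;             /* full, or duplicate rule */
    if (flags & FILTER_PREPEND) {
        memmove(&rules[1], &rules[0], nrules * sizeof(rules[0]));
        rules[0] = r;
    } else {
        rules[nrules] = r;
    }
    nrules++;
    return 0;
}

/* Delete a rule by list number and id. */
static int del_rule(unsigned list, int id)
{
    int i = find_rule(list, id);

    if (i < 0)
        return -1;
    memmove(&rules[i], &rules[i + 1],
            (nrules - (size_t)i - 1) * sizeof(rules[0]));
    nrules--;
    return 0;
}
```

    If the stored flags still contained FILTER_PREPEND, find_rule() would miss
    the rule, which matches the "cannot be removed" and "duplicate can be added"
    symptoms listed above.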
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • In this implementation, audit registers inotify watches on the parent
    directories of paths specified in audit rules. When audit's inotify
    event handler is called, it updates any affected rules based on the
    filesystem event. If the parent directory is renamed, removed, or its
    filesystem is unmounted, audit removes all rules referencing that
    inotify watch.

    To keep things simple, this implementation limits location-based
    auditing to the directory entries in an existing directory. Given
    a path-based rule for /foo/bar/passwd, the following table applies:

    passwd modified -- audit event logged
    passwd replaced -- audit event logged, rules list updated
    bar renamed -- rule removed
    foo renamed -- untracked, meaning that the rule now applies to
    the new location

    Audit users typically want to have many rules referencing filesystem
    objects, which can significantly impact filtering performance. This
    patch also adds an inode-number-based rule hash to mitigate this
    situation.

    The patch is relative to the audit git tree:
    http://kernel.org/git/?p=linux/kernel/git/viro/audit-current.git;a=summary
    and uses the inotify kernel API:
    http://lkml.org/lkml/2006/6/1/145

    Signed-off-by: Amy Griffis
    Signed-off-by: Al Viro

    Amy Griffis
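
    The inode-number-based rule hash mentioned above can be pictured with a
    small chained hash table. This is only a sketch under assumptions: the
    bucket count, struct path_rule and the function names are hypothetical, and
    the real patch may hash and store rules differently.

```c
#include <assert.h>
#include <stddef.h>

#define HASH_BUCKETS 32        /* illustrative size; must be a power of two */

/* Hypothetical rule record keyed by inode number. */
struct path_rule {
    unsigned long ino;
    const char *name;
    struct path_rule *next;    /* chain for rules in the same bucket */
};

static struct path_rule *buckets[HASH_BUCKETS];

/* Map an inode number to a bucket index. */
static unsigned hash_ino(unsigned long ino)
{
    return (unsigned)(ino & (HASH_BUCKETS - 1));
}

/* Insert a rule at the head of its bucket. */
static void rule_insert(struct path_rule *r)
{
    unsigned h = hash_ino(r->ino);

    r->next = buckets[h];
    buckets[h] = r;
}

/* Find the rule for an inode, scanning only one bucket instead of all rules. */
static struct path_rule *rule_lookup(unsigned long ino)
{
    for (struct path_rule *r = buckets[hash_ino(ino)]; r; r = r->next)
        if (r->ino == ino)
            return r;
    return NULL;
}
```

    With many filesystem rules, a lookup then touches only the rules that share
    a bucket rather than walking the whole list for every event.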
     
  • This patch adds audit support to POSIX message queues. It applies cleanly to
    the lspp.b15 branch of Al Viro's git tree. There are new auxiliary data
    structures, and collection and emission routines in kernel/auditsc.c. New hooks
    in ipc/mqueue.c collect arguments from the syscalls.

    I tested the patch by building the examples from the POSIX MQ library tarball.
    Build them with -lrt, not against the old MQ library in the tarball. Here's the URL:
    http://www.geocities.com/wronski12/posix_ipc/libmqueue-4.41.tar.gz
    Do auditctl -a exit,always -S for mq_open, mq_timedsend, mq_timedreceive,
    mq_notify, mq_getsetattr. mq_unlink has no new hooks. Please see the
    corresponding userspace patch to get correct output from auditd for the new
    record types.

    [fixes folded]

    Signed-off-by: George Wilson
    Signed-off-by: Al Viro

    George C. Wilson
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • pull checks for ->audit_context into inlined wrappers

    Signed-off-by: Al Viro

    Al Viro
     
  • The following patch addresses most of the issues with the IPC_SET_PERM
    records as described in:
    https://www.redhat.com/archives/linux-audit/2006-May/msg00010.html
    and addresses the comments I received on the record field names.

    To summarize, I made the following changes:

    1. Changed sys_msgctl() and semctl_down() so that an IPC_SET_PERM
    record is emitted in the failure case as well as the success case.
    This matches the behavior in sys_shmctl(). I could simplify the
    code in sys_msgctl() and semctl_down() slightly but it would mean
    that in some error cases we could get an IPC_SET_PERM record
    without an IPC record and that seemed odd.

    2. No change to the IPC record type, given no feedback on the backward
    compatibility question.

    3. Removed the qbytes field from the IPC record. It wasn't being
    set and when audit_ipc_obj() is called from ipcperms(), the
    information isn't available. If we want the information in the IPC
    record, more extensive changes will be necessary. Since it only
    applies to message queues and it isn't really permission related, it
    doesn't seem worth it.

    4. Removed the obj field from the IPC_SET_PERM record. This means that
    the kern_ipc_perm argument is no longer needed.

    5. Removed the spaces and renamed the IPC_SET_PERM field names. Replaced iuid and
    igid fields with ouid and ogid in the IPC record.

    I tested this with the lspp.22 kernel on an x86_64 box. I believe it
    applies cleanly on the latest kernel.

    -- ljk

    Signed-off-by: Linda Knippers
    Signed-off-by: Al Viro

    Linda Knippers
     
  • Just a few minor proposed updates. Only the last one will
    actually affect behavior. The rest are just misleading
    code.

    Several AUDIT_SET functions return 'old' value, but only
    return value
    Signed-off-by: Al Viro

    Serge E. Hallyn
     
  • Don't return -ENOMEM when callers of these functions are checking for
    a NULL return. Bug noticed by Serge Hallyn.

    Signed-off-by: Amy Griffis
    Signed-off-by: Al Viro

    Amy Griffis
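
    The mismatch described above is easy to reproduce in userspace. The sketch
    below uses hypothetical functions (copy_rule_errptr, copy_rule_null) and a
    simulated allocation failure; it only shows why an error code encoded in a
    pointer slips past a caller that tests for NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct rule_copy { int nfields; };

/* Problematic shape: failure is reported as an encoded errno, never NULL. */
static struct rule_copy *copy_rule_errptr(int nfields, int simulate_oom)
{
    struct rule_copy *c = simulate_oom ? NULL : malloc(sizeof(*c));

    if (!c)
        return (struct rule_copy *)(intptr_t)-ENOMEM;
    c->nfields = nfields;
    return c;
}

/* Shape that matches callers checking for NULL. */
static struct rule_copy *copy_rule_null(int nfields, int simulate_oom)
{
    struct rule_copy *c = simulate_oom ? NULL : malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->nfields = nfields;
    return c;
}
```

    A caller written as "if (!ptr) handle_error();" would treat the first
    function's failure value as a valid object and dereference it.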
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • We should not send a pile of replies while holding audit_netlink_mutex
    since we hold the same mutex when we receive commands. As the result,
    we can get blocked while sending and sit there holding the mutex while
    auditctl is unable to send the next command and get around to receiving
    what we'd sent.

    Solution: create skbs and put them into a queue instead of sending;
    once we are done, send what we've got on the list. The former can
    be done synchronously while we are handling AUDIT_LIST or AUDIT_LIST_RULES;
    we are holding audit_netlink_mutex at that point. The latter is done
    asynchronously and without messing with audit_netlink_mutex.

    Signed-off-by: Al Viro

    Al Viro
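
    The two-phase approach above (queue while locked, send after unlocking) can
    be modelled in a single thread. Everything here is a stand-in: lock_held
    plays the role of audit_netlink_mutex, integers play the role of skbs, and
    the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REPLIES 16

struct reply_queue {
    int msgs[MAX_REPLIES];
    size_t len;
};

static int lock_held;          /* stands in for audit_netlink_mutex */
static int sent_under_lock;    /* counts sends made while the lock was held */
static size_t sent_total;

/* Pretend to send one reply; in the real system this may block. */
static void send_reply(int msg)
{
    (void)msg;
    if (lock_held)
        sent_under_lock++;
    sent_total++;
}

/* Phase 1: under the lock, only build the queue of replies. */
static void list_rules(const int *rules, size_t n, struct reply_queue *q)
{
    lock_held = 1;
    for (size_t i = 0; i < n && q->len < MAX_REPLIES; i++)
        q->msgs[q->len++] = rules[i];
    lock_held = 0;
}

/* Phase 2: with the lock released, drain the queue. */
static void flush_replies(struct reply_queue *q)
{
    for (size_t i = 0; i < q->len; i++)
        send_reply(q->msgs[i]);
    q->len = 0;
}
```

    Because no potentially blocking send happens in phase 1, the lock is only
    held for as long as it takes to walk the rule list.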
     
  • The following series of patches introduces a kernel API for inotify,
    making it possible for kernel modules to benefit from inotify's
    mechanism for watching inodes. With these patches, inotify will
    maintain for each caller a list of watches (via an embedded struct
    inotify_watch), where each inotify_watch is associated with a
    corresponding struct inode. The caller registers an event handler and
    specifies for which filesystem events their event handler should be
    called per inotify_watch.

    Signed-off-by: Amy Griffis
    Acked-by: Robert Love
    Acked-by: John McCutchan
    Signed-off-by: Al Viro

    Amy Griffis
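
    The "embedded struct inotify_watch" pattern described above is the usual
    container-of idiom. The sketch below is a userspace illustration with
    hypothetical types (struct watch, struct parent_watch) and a hypothetical
    handler; it does not reproduce the actual inotify kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the generic watch object. */
struct watch {
    unsigned long ino;         /* inode being watched */
    unsigned mask;             /* events the caller asked for */
};

/* Caller-specific state with the generic watch embedded in it. */
struct parent_watch {
    int nrules;                /* private data the handler needs */
    struct watch w;
};

/* Recover the enclosing structure from a pointer to its embedded watch. */
#define watch_to_parent(ptr) \
    ((struct parent_watch *)((char *)(ptr) - offsetof(struct parent_watch, w)))

static int last_event_rules = -1;

/* Event handler: receives only the generic watch, then finds its own data. */
static void handle_event(struct watch *w, unsigned event)
{
    struct parent_watch *p = watch_to_parent(w);

    if (event & w->mask)
        last_event_rules = p->nrules;
}
```

    The generic layer never needs to know about struct parent_watch; it passes
    back the embedded watch and the caller recovers its own context from it.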
     
  • Trying to suspend/resume with console messages flying all around is
    doomed to failure, when the devices that the messages are trying to
    go to are being shut down.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

18 Jun, 2006

3 commits

  • arm_timer() checks PF_EXITING to prevent BUG_ON(->exit_state)
    in run_posix_cpu_timers().

    However, for some reason it does so only for CPUCLOCK_PERTHREAD
    case (which is imho wrong).

    Also, this check is not reliable: PF_EXITING could be set on
    another cpu without any locks/barriers just after the check,
    so it can't prevent the timer from being attached to the
    exiting task.

    The previous patch makes this check unneeded.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
    do_exit() clears ->it_##clock##_expires, but nothing prevents
    another cpu from attaching the timer to the exiting process after
    that. arm_timer() tries to protect against this race, but the check
    is racy.

    After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
    before do_exit() calls 'schedule()', a local timer interrupt can find
    tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
    does sys_wait4), the interrupted task has ->signal == NULL.

    At this moment the exiting task has no pending cpu timers; they were
    cleaned up in __exit_signal()->posix_cpu_timers_exit{,_group}(),
    so we can just return from irq.

    John Stultz recently confirmed this bug, see

    http://marc.theaimsgroup.com/?l=linux-kernel&m=115015841413687

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • If the local timer interrupt happens just after do_exit() sets PF_EXITING
    (and before it clears ->it_xxx_expires) run_posix_cpu_timers() will call
    check_process_timers() with tasklist_lock + ->siglock held and

    check_process_timers:

        t = tsk;
        do {
            ....

            do {
                t = next_thread(t);
            } while (unlikely(t->flags & PF_EXITING));
        } while (t != tsk);

    the outer loop will never stop.

    Actually, the window is bigger. Another process can attach the timer
    after ->it_xxx_expires was cleared (see the next commit) and the 'if
    (PF_EXITING)' check in arm_timer() is racy (see the one after that).

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
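
    The non-terminating outer loop quoted above can be reproduced with a
    two-element ring. This is a sketch with hypothetical names (struct thread,
    walk_threads) and an illustrative flag value; an iteration limit is added
    purely so the demonstration returns instead of hanging.

```c
#include <assert.h>

#define PF_EXITING_FLAG 0x4u   /* illustrative value for this sketch */

struct thread {
    unsigned flags;
    struct thread *next;       /* circular list of threads in the group */
};

/*
 * Same loop shape as the excerpt: the inner loop skips exiting threads.
 * Returns the number of outer iterations, or -1 once 'limit' is exceeded.
 */
static int walk_threads(struct thread *tsk, int limit)
{
    struct thread *t = tsk;
    int iters = 0;

    do {
        if (++iters > limit)
            return -1;         /* would spin forever without this guard */
        do {
            t = t->next;
        } while (t->flags & PF_EXITING_FLAG);
    } while (t != tsk);
    return iters;
}
```

    When tsk itself carries the exiting flag, the inner loop always steps past
    it, so the "t != tsk" condition can never become false.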
     

01 Jun, 2006

1 commit

  • From: Stephen Hemminger

    I want to use hrtimers in the netem (Network Emulator) qdisc, but the
    necessary symbols aren't exported for module use.

    Also needed by SystemTap.

    Signed-off-by: Stephen Hemminger
    Acked-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "Stone, Joshua I"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Hemminger
     

22 May, 2006

4 commits

  • This reverts commit 5ce74abe788a26698876e66b9c9ce7e7acc25413 (and its
    dependent commit 8a5bc075b8d8cf7a87b3f08fad2fba0f5d13295e), because of
    audio underruns.

    Reported by Rene Herman, who also pinpointed
    the exact cause of the underruns:

    "Audio underruns galore, with only ogg123 and firefox (browsing the
    GIT tree online is also a nice trigger by the way).

    If I back it out, everything is fine for me again."

    Cc: Rene Herman
    Cc: Mike Galbraith
    Acked-by: Con Kolivas
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Under certain timing conditions, a race during boot occurs where timer
    ticks are being processed on remote CPUs. The remote timer ticks can
    increment jiffies, and if this happens during a window when a timeout is
    very close to expiring but a local tick has not yet been delivered, you can
    end up with

    1) No softirq pending
    2) A local timer wheel which is not synced to jiffies
    3) No high resolution timer active
    4) A local timer which is supposed to fire before the current jiffies value.

    In this circumstance, the comparison in next_timer_interrupt overflows,
    because the base of the comparison for high resolution timers is jiffies,
    but for the softirq timer wheel, it is relative to the current base of the
    wheel (jiffies_base).

    Signed-off-by: Zachary Amsden
    Cc: Martin Schwidefsky
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zachary Amsden
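
    The overflow described above comes from subtracting tick counters when the
    expiry already lies behind the reference point. The sketch below shows the
    failure mode and one conventional guard; the function names are
    hypothetical and this is not the fix applied by the patch. It assumes the
    usual two's-complement conversion from unsigned long to long.

```c
#include <assert.h>
#include <limits.h>

/* Naive distance: wraps to a huge value if 'expires' is already in the past. */
static unsigned long ticks_until_naive(unsigned long expires, unsigned long now)
{
    return expires - now;
}

/*
 * Guarded distance: interpret the difference as signed, so an expiry that is
 * at or before 'now' yields 0 instead of a wrapped value. Still correct when
 * the counter itself wraps, as long as the two values are less than half the
 * counter range apart.
 */
static unsigned long ticks_until_clamped(unsigned long expires, unsigned long now)
{
    if ((long)(expires - now) <= 0)
        return 0;
    return expires - now;
}
```

    Comparing two expiry times is only meaningful when both distances are
    measured from the same base, which is the inconsistency the message
    describes.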
     
  • It's too easy to incorrectly call cpuset_zone_allowed() in an atomic
    context without __GFP_HARDWALL set, and when done, it is not noticed until
    a tight memory situation forces allocations to be tried outside the current
    cpuset.

    Add a 'might_sleep_if()' check, to catch this earlier on, instead of
    waiting for a similar check in the mutex_lock() code, which is only rarely
    invoked.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • Update the kernel/cpuset.c:cpuset_zone_allowed() comment.

    The rule for when mm/page_alloc.c should call cpuset_zone_allowed()
    was intended to be:

    Don't call cpuset_zone_allowed() if you can't sleep, unless you
    pass in the __GFP_HARDWALL flag set in gfp_flag, which disables
    the code that might scan up ancestor cpusets and sleep.

    The explanation of this rule in the comment above cpuset_zone_allowed() was
    stale, as a result of a restructuring of some __alloc_pages() code in
    November 2005.

    Rewrite that comment ...

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     

16 May, 2006

3 commits

  • Signed-off-by: David Woodhouse

    David Woodhouse
     
  • Ever since a previous patch:

    Fix race between CONFIG_DEBUG_SLABALLOC and modules
    Sun, 27 Jun 2004 17:55:19 +0000 (17:55 +0000)
    http://www.kernel.org/git/?p=linux/kernel/git/torvalds/old-2.6-bkcvs.git;a=commit;h=92b3db26d31cf21b70e3c1eadc56c179506d8fbe

    The function symbol_put_addr() will deadlock the kernel.

    symbol_put_addr() would acquire modlist_lock, then while holding the lock call
    two functions kernel_text_address() and module_text_address() which also try
    to acquire the same lock. This deadlocks the kernel of course.

    This patch changes symbol_put_addr() to not acquire the modlist_lock, it
    doesn't need it since it never looks at the module list directly. Also, it
    now uses core_kernel_text() instead of kernel_text_address(). The latter has
    an additional check for addr inside a module, but we don't need to do that
    since we call module_text_address() (the same function kernel_text_address
    uses) ourselves.

    Signed-off-by: Trent Piepho
    Cc: Zwane Mwaikambo
    Acked-by: Rusty Russell
    Cc: Johannes Stezenbach
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Trent Piepho
     
  • With "Paul E. McKenney"

    Introduce rcu_needs_cpu() interface. This can be used to tell if there
    will be a new rcu batch on a cpu soon by looking at the curlist pointer.
    This can be used to avoid entering a tickless idle state where the cpu
    would miss that a new batch is ready when rcu_start_batch is called
    on a different cpu.

    Signed-off-by: Heiko Carstens
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     

12 May, 2006

1 commit

  • Eric Biederman points out that we can't take the task_lock while holding
    tasklist_lock for writing, because another CPU that holds the task lock
    might take an interrupt that then tries to take tasklist_lock for writing.

    Which would be a nasty deadlock, with one CPU spinning forever in an
    interrupt handler (although admittedly you need to really work at
    triggering it ;)

    Since the ptrace_attach() code is special and very unusual, just make it
    be extra careful, and use trylock+repeat to avoid the possible deadlock.

    Cc: Oleg Nesterov
    Cc: Eric W. Biederman
    Cc: Roland McGrath
    Signed-off-by: Linus Torvalds

    Linus Torvalds
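
    The trylock+repeat idea above can be sketched with two toy locks built on
    C11 atomic_flag. The lock type, the function names and the attempt limit
    are assumptions made for illustration; the real code uses kernel locking
    primitives and retries until it succeeds.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_lock { atomic_flag held; };

/* Try to take the lock without waiting; true on success. */
static bool toy_trylock(struct toy_lock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

static void toy_unlock(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

/*
 * Take 'first', then only *try* 'second'. If 'second' is busy, drop 'first'
 * and start over, so we never wait for one lock while holding the other.
 * Returns the attempt number that succeeded, or -1 after max_attempts.
 */
static int lock_both(struct toy_lock *first, struct toy_lock *second,
                     int max_attempts)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (!toy_trylock(first))
            continue;
        if (toy_trylock(second))
            return attempt;    /* both locks are now held */
        toy_unlock(first);     /* back off completely before retrying */
    }
    return -1;
}
```

    Backing off releases the first lock between attempts, which gives whoever
    holds the second lock the chance to make progress and release it.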
     

09 May, 2006

1 commit


08 May, 2006

1 commit

  • This holds the task lock (and, for ptrace_attach, the tasklist_lock)
    over the actual attach event, which closes a race between attaching to a
    thread that is either doing a PTRACE_TRACEME or getting de-threaded.

    Thanks to Oleg Nesterov for reminding me about this, and Chris Wright
    for noticing a lost return value in my first version.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

01 May, 2006

5 commits

  • While testing the watch performance, I noticed that selinux_task_ctxid()
    was creeping into the results more than it should. Investigation showed
    that the function was being called whether it was needed or not. The
    below patch fixes this.

    Signed-off-by: Steve Grubb
    Signed-off-by: Al Viro

    Steve Grubb
     
  • 1) The audit_ipc_perms() function has been split into two different
    functions:
    - audit_ipc_obj()
    - audit_ipc_set_perm()

    There's a key shift here... The audit_ipc_obj() collects the uid, gid,
    mode, and SELinux context label of the current ipc object. This
    audit_ipc_obj() hook is now found in several places. Most notably, it
    is hooked in ipcperms(), which is called in various places around the
    ipc code performing a MAC check. Additionally there are several places
    where *checkid() is used to validate that an operation is being
    performed on a valid object while not necessarily having a nearby
    ipcperms() call. In these locations, audit_ipc_obj() is called to
    ensure that the information is captured by the audit system.

    The audit_ipc_set_perm() function is called any time the permissions on
    the ipc object change. In this case, the NEW permissions are recorded
    (and note that an audit_ipc_obj() call exists just a few lines before
    each instance).

    2) Support for an AUDIT_IPC_SET_PERM audit message type. This allows
    for separate auxiliary audit records for normal operations on an IPC
    object and permissions changes. Note that the same struct
    audit_aux_data_ipcctl is used and populated, however there are separate
    audit_log_format statements based on the type of the message. Finally,
    the AUDIT_IPC block of code in audit_free_aux() was extended to handle
    aux messages of this new type. No more mem leaks I hope ;-)

    Signed-off-by: Al Viro

    Steve Grubb
     
  • Hi,

    The patch below builds upon the patch sent earlier and adds subject label to
    all audit events generated via the netlink interface. It also cleans up a few
    other minor things.

    Signed-off-by: Steve Grubb

    Signed-off-by: Al Viro

    Steve Grubb
     
  • The below patch should be applied after the inode and ipc sid patches.
    This patch is a reworking of Tim's patch that has been updated to match
    the inode and ipc patches since it's similar.

    [updated:
    > Stephen Smalley also wanted to change a variable from isec to tsec in the
    > user sid patch. ]

    Signed-off-by: Steve Grubb
    Signed-off-by: Al Viro

    Steve Grubb
     
  • Hi,

    The patch below converts IPC auditing to collect sids and convert them to a
    context string only when an audit record needs to be output. This patch depends on the
    inode audit change patch already being applied.

    Signed-off-by: Steve Grubb

    Signed-off-by: Al Viro

    Steve Grubb