30 May, 2018

1 commit

  • [ Upstream commit 23138ead270045f1b3e912e667967b6094244999 ]

    If there is a memory allocation error when trying to change an audit
    kernel feature value, the ignored allocation error will trigger a NULL
    pointer dereference oops on subsequent use of that pointer. Return
    instead.

    Passes audit-testsuite.
    See: https://github.com/linux-audit/audit-kernel/issues/76

    Signed-off-by: Richard Guy Briggs
    [PM: not necessary (other funcs check for NULL), but a good practice]
    Signed-off-by: Paul Moore
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Richard Guy Briggs
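
As a minimal userspace sketch of the pattern in the commit above (not the kernel code itself): the logging helper returns as soon as its buffer allocation fails, rather than carrying on with a NULL pointer. The names `log_buf`, `alloc_hook`, `failing_alloc` and `log_feature_change` are hypothetical, and the integer return value is an assumption added only so the early return can be observed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's audit buffer. */
struct log_buf {
    char text[64];
};

/* Allocator hook so a failure can be simulated; defaults to malloc(). */
void *(*alloc_hook)(size_t) = malloc;

/* Simulated out-of-memory allocator. */
void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/*
 * Record a feature change. Returns 0 on success, or -1 when the buffer
 * could not be allocated (the path the commit above adds).
 */
int log_feature_change(const char *feature, int old_val, int new_val)
{
    struct log_buf *ab = alloc_hook(sizeof(*ab));

    if (!ab)
        return -1; /* return early instead of dereferencing NULL */

    snprintf(ab->text, sizeof(ab->text), "feature=%s old=%d new=%d",
             feature, old_val, new_val);
    free(ab);
    return 0;
}
```

Pointing `alloc_hook` at `failing_alloc` simulates the out-of-memory case; the call then returns -1 without touching the buffer.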
     

17 Dec, 2017

2 commits

  • [ Upstream commit 173743dd99a49c956b124a74c8aacb0384739a4c ]

    Prior to this patch we enabled audit in audit_init(), which is too
    late for PID 1 as the standard initcalls are run after the PID 1 task
    is forked. This means that we never allocate an audit_context (see
    audit_alloc()) for PID 1 and therefore miss a lot of audit events
    generated by PID 1.

    This patch enables audit as early as possible to help ensure that when
    PID 1 is forked it can allocate an audit_context if required.

    Reviewed-by: Richard Guy Briggs
    Signed-off-by: Paul Moore
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Paul Moore
     
  • [ Upstream commit 33e8a907804428109ce1d12301c3365d619cc4df ]

    The API to end auditing has historically been for auditd to set the
    pid to 0. This patch restores that functionality.

    See: https://github.com/linux-audit/audit-kernel/issues/69

    Reviewed-by: Richard Guy Briggs
    Signed-off-by: Steve Grubb
    Signed-off-by: Paul Moore
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Steve Grubb
     

05 Sep, 2017

2 commits

  • Update the function comments to match the code.

    Signed-off-by: Geliang Tang
    Signed-off-by: Paul Moore

    Geliang Tang
     
  • Commit 2115bb250f26 ("audit: Use timespec64 to represent audit timestamps")
    noted that audit timestamps were not y2038 safe and used a 64-bit
    timestamp. In itself, this makes sense but the conversion was from
    CURRENT_TIME to ktime_get_real_ts64() which is a heavier call to record
    an accurate timestamp which is required in some, but not all, cases. The
    impact is that when auditd is running without any rules, all syscalls
    have higher overhead. This is visible in the sysbench-thread benchmark as
    an 11.5% performance hit. That benchmark is dumb as rocks but it's also
    visible in redis as an 8-10% hit on all operations which is of greater
    concern. It is somewhat stupid of audit to track syscalls without any
    rules related to syscalls but that is how it behaves.

    The overhead can be directly measured with perf, comparing 4.9 with 4.12:

    4.9
    7.76% sysbench [kernel.vmlinux] [k] __schedule
    7.62% sysbench [kernel.vmlinux] [k] _raw_spin_lock
    7.37% sysbench libpthread-2.22.so [.] __lll_lock_elision
    7.29% sysbench [kernel.vmlinux] [.] syscall_return_via_sysret
    6.59% sysbench [kernel.vmlinux] [k] native_sched_clock
    5.21% sysbench libc-2.22.so [.] __sched_yield
    4.38% sysbench [kernel.vmlinux] [k] entry_SYSCALL_64
    4.28% sysbench [kernel.vmlinux] [k] do_syscall_64
    3.49% sysbench libpthread-2.22.so [.] __lll_unlock_elision
    3.13% sysbench [kernel.vmlinux] [k] __audit_syscall_exit
    2.87% sysbench [kernel.vmlinux] [k] update_curr
    2.73% sysbench [kernel.vmlinux] [k] pick_next_task_fair
    2.31% sysbench [kernel.vmlinux] [k] syscall_trace_enter
    2.20% sysbench [kernel.vmlinux] [k] __audit_syscall_entry
    .....
    0.00% swapper [kernel.vmlinux] [k] read_tsc

    4.12
    7.84% sysbench [kernel.vmlinux] [k] __schedule
    7.05% sysbench [kernel.vmlinux] [k] _raw_spin_lock
    6.57% sysbench libpthread-2.22.so [.] __lll_lock_elision
    6.50% sysbench [kernel.vmlinux] [.] syscall_return_via_sysret
    5.95% sysbench [kernel.vmlinux] [k] read_tsc
    5.71% sysbench [kernel.vmlinux] [k] native_sched_clock
    4.78% sysbench libc-2.22.so [.] __sched_yield
    4.30% sysbench [kernel.vmlinux] [k] entry_SYSCALL_64
    3.94% sysbench [kernel.vmlinux] [k] do_syscall_64
    3.37% sysbench libpthread-2.22.so [.] __lll_unlock_elision
    3.32% sysbench [kernel.vmlinux] [k] __audit_syscall_exit
    2.91% sysbench [kernel.vmlinux] [k] __getnstimeofday64

    Note the additional overhead from read_tsc which goes from 0% to 5.95%.
    This is on a single-socket E3-1230 but similar overheads have been measured
    on an older machine which the patch also eliminates.

    The patch in question has no explanation as to why a fully-accurate timestamp
    is required and is likely an oversight. Using a coarser, but monotonically
    increasing, timestamp the overhead can be eliminated. While it can be
    worked around by configuring or disabling audit, it's tricky enough to
    detect that a kernel fix is justified. With this patch, we see the following:

    sysbenchthread
                         4.9.0             4.12.0             4.12.0
                       vanilla            vanilla        coarse-v1r1
    Amean 1     1.49 (  0.00%)     1.66 (-11.42%)     1.51 ( -1.34%)
    Amean 3     1.48 (  0.00%)     1.65 (-11.45%)     1.50 ( -0.96%)
    Amean 5     1.49 (  0.00%)     1.67 (-12.31%)     1.51 ( -1.83%)
    Amean 7     1.49 (  0.00%)     1.66 (-11.72%)     1.50 ( -0.67%)
    Amean 12    1.48 (  0.00%)     1.65 (-11.57%)     1.52 ( -2.89%)
    Amean 16    1.49 (  0.00%)     1.65 (-11.13%)     1.51 ( -1.73%)

    The benchmark is reporting the time required for different thread counts to
    lock/unlock a private mutex which, while dense, demonstrates the syscall
    overhead. This is showing that 4.12 took an 11-12% hit but the overhead is
    almost eliminated by the patch. While the variance is not reported here,
    it's well within the noise with the patch applied.

    Signed-off-by: Mel Gorman
    Acked-by: Arnd Bergmann
    Acked-by: Deepa Dinamani
    Signed-off-by: Paul Moore

    Mel Gorman
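
The trade-off described in the commit above has a userspace analogue on Linux: `clock_gettime()` with `CLOCK_REALTIME` reads the clocksource, while `CLOCK_REALTIME_COARSE` returns the value cached at the last timer tick and is cheaper to obtain. The sketch below only illustrates that trade-off and is not the kernel change itself; the helper names are hypothetical, and `CLOCK_REALTIME_COARSE` is Linux-specific.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Read the given clock; returns nanoseconds since the epoch, or -1 on error. */
int64_t stamp_ns(clockid_t clock)
{
    struct timespec ts;

    if (clock_gettime(clock, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Cheap stamp: tick-granular value, no clocksource read. */
int64_t coarse_stamp_ns(void)
{
    return stamp_ns(CLOCK_REALTIME_COARSE);
}

/* Precise stamp: reads the clocksource (e.g. the TSC on x86). */
int64_t precise_stamp_ns(void)
{
    return stamp_ns(CLOCK_REALTIME);
}
```

The two stamps normally differ by no more than one timer tick, which is ample resolution for a record that only needs ordering and an approximate wall-clock time.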
     

21 Jul, 2017

1 commit


19 Jul, 2017

1 commit

  • Found this issue via a kmemleak report: auditd_send_unicast_skb()
    did not free the skb if rcu_dereference(auditd_conn) returned NULL.

    unreferenced object 0xffff88082568ce00 (size 256):
    comm "auditd", pid 1119, jiffies 4294708499
    backtrace:
    [] kmemleak_alloc+0x4a/0xa0
    [] kmem_cache_alloc_node+0xcc/0x210
    [] __alloc_skb+0x5d/0x290
    [] audit_make_reply+0x54/0xd0
    [] audit_receive_msg+0x967/0xd70
    ----------------
    (gdb) list *audit_receive_msg+0x967
    0xffffffff8113dff7 is in audit_receive_msg (kernel/audit.c:1133).
    1132 skb = audit_make_reply(0, AUDIT_REPLACE, 0,
    0, &pvnr, sizeof(pvnr));
    ---------------
    [] audit_receive+0x52/0xa0
    [] netlink_unicast+0x181/0x240
    [] netlink_sendmsg+0x2c2/0x3b0
    [] sock_sendmsg+0x38/0x50
    [] SYSC_sendto+0x102/0x190
    [] SyS_sendto+0xe/0x10
    [] entry_SYSCALL_64_fastpath+0x1a/0xa5
    [] 0xffffffffffffffff

    Signed-off-by: Shu Wang
    Signed-off-by: Paul Moore

    Shu Wang
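
A small userspace model of the ownership rule behind the commit above: a send function that takes ownership of a message has to release it on every exit path, including the early exit taken when no daemon is connected. Everything here (`msg`, `conn`, `current_conn`, `live_msgs`, `send_unicast`) is a hypothetical stand-in, and the RCU handling of the real code is deliberately left out.

```c
#include <stdlib.h>

struct msg { char payload[32]; };   /* hypothetical stand-in for an sk_buff */
struct conn { int delivered; };     /* hypothetical stand-in for auditd_conn */

struct conn *current_conn = NULL;   /* NULL means no daemon is connected */
int live_msgs = 0;                  /* outstanding allocations, for leak checks */

struct msg *msg_alloc(void)
{
    struct msg *m = calloc(1, sizeof(*m));

    if (m)
        live_msgs++;
    return m;
}

void msg_free(struct msg *m)
{
    if (m) {
        free(m);
        live_msgs--;
    }
}

/* Takes ownership of m on every path. Returns 0 on delivery, -1 otherwise. */
int send_unicast(struct msg *m)
{
    struct conn *c = current_conn;

    if (!c) {
        msg_free(m);    /* the early exit must release the message too */
        return -1;
    }

    c->delivered++;     /* delivery is modelled as consuming the message */
    msg_free(m);
    return 0;
}
```

With `current_conn` left as NULL, `send_unicast(msg_alloc())` returns -1 and `live_msgs` drops back to 0; dropping the `msg_free()` call in the early-exit branch reproduces the kind of leak kmemleak flagged.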
     

06 Jul, 2017

1 commit

  • Pull audit updates from Paul Moore:
    "Things are relatively quiet on the audit front for v4.13, just five
    patches for a total diffstat of 102 lines.

    There are two patches from Richard to consistently record the POSIX
    capabilities and add the ambient capability information as well.

    I also chipped in two patches to fix a race condition with the auditd
    tracking code and ensure we don't skip sending any records to the
    audit multicast group.

    Finally a single style fix that I accepted because I must have been in
    a good mood that day.

    Everything passes our test suite, and should be relatively harmless,
    please merge for v4.13"

    * 'stable-4.13' of git://git.infradead.org/users/pcmoore/audit:
    audit: make sure we never skip the multicast broadcast
    audit: fix a race condition with the auditd tracking code
    audit: style fix
    audit: add ambient capabilities to CAPSET and BPRM_FCAPS records
    audit: unswing cap_* fields in PATH records

    Linus Torvalds
     

16 Jun, 2017

1 commit

  • When the auditd connection is reset, either intentionally or due to
    a failure, any records that were in the main backlog queue would not
    be sent in a multicast broadcast. This patch fixes this problem by
    not flushing the main backlog queue on a connection reset; the main
    kauditd_thread() will take care of that normally.

    Resolves: https://github.com/linux-audit/audit-kernel/issues/41
    Reviewed-by: Richard Guy Briggs
    Signed-off-by: Paul Moore

    Paul Moore
     

14 Jun, 2017

1 commit

  • Originally reported by Adam and Dusty, it appears we have a small
    race window in kauditd_thread(), as documented in the Fedora BZ:

    * https://bugzilla.redhat.com/show_bug.cgi?id=1459326#c35

    "This issue is partly due to the read-copy nature of RCU, and
    partly due to how we sync the auditd_connection state across
    kauditd_thread and the audit control channel. The kauditd_thread
    thread is always running so it can service the record queues and
    emit the multicast messages, if it happens to be just past the
    "main_queue" label, but before the "if (sk == NULL || ...)"
    if-statement which calls auditd_reset() when the new auditd
    connection is registered it could end up resetting the auditd
    connection, regardless of if it is valid or not. This is a rather
    small window and the variable nature of multi-core scheduling
    explains why this is proving rather difficult to reproduce."

    The fix is to have functions only call auditd_reset() when they
    believe that the kernel/auditd connection is still valid, e.g.
    non-NULL, and to have these callers pass their local copy of the
    auditd_connection pointer to auditd_reset() where it can be compared
    with the current connection state before resetting. If the caller
    has a stale state tracking pointer then the reset is ignored.

    We also make a small change to kauditd_thread() so that if the
    kernel/auditd connection is dead we skip the retry queue and send the
    records straight to the hold queue. This is necessary as we used to
    rely on auditd_reset() to occasionally purge the retry queue but we
    are going to be calling the reset function much less now and we want
    to make sure the retry queue doesn't grow unbounded.

    Reported-by: Adam Williamson
    Reported-by: Dusty Mabe
    Reviewed-by: Richard Guy Briggs
    Signed-off-by: Paul Moore

    Paul Moore
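
The compare-before-reset idea from the commit above can be sketched in a few lines. This is a single-threaded illustration under stated assumptions: `conn`, `active_conn` and `conn_reset` are hypothetical names, and the spinlock/RCU protection the real code relies on is omitted.

```c
#include <stddef.h>

struct conn { int pid; };           /* hypothetical connection record */

struct conn *active_conn = NULL;    /* the currently registered connection */

/*
 * Reset the connection only if the caller's snapshot still matches the
 * active one. Returns 1 if the reset happened, 0 if it was ignored
 * because the snapshot was NULL or stale.
 */
int conn_reset(const struct conn *seen)
{
    if (seen == NULL || seen != active_conn)
        return 0;

    active_conn = NULL;
    return 1;
}
```

A caller that captured its pointer before a new daemon registered passes a stale snapshot, so its reset request is ignored and the new connection survives.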
     

24 May, 2017

1 commit

  • The cap_* fields swing in and out of PATH records.
    If no capabilities are set, the cap_* fields are completely missing and when
    one of the cap_fi or cap_fp values is empty, that field is omitted.

    Original:
    type=PATH msg=audit(04/20/2017 12:17:11.222:193) : item=1 name=/lib64/ld-linux-x86-64.so.2 inode=787694 dev=08:03 mode=file,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:ld_so_t:s0 nametype=NORMAL
    type=PATH msg=audit(04/20/2017 12:17:11.222:193) : item=0 name=/home/sleep inode=1319469 dev=08:03 mode=file,suid,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:bin_t:s0 nametype=NORMAL cap_fp=sys_admin cap_fe=1 cap_fver=2

    Normalize the PATH record by always printing all 4 cap_* fields.

    Fixed:
    type=PATH msg=audit(04/20/2017 13:01:31.679:201) : item=1 name=/lib64/ld-linux-x86-64.so.2 inode=787694 dev=08:03 mode=file,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:ld_so_t:s0 nametype=NORMAL cap_fp=none cap_fi=none cap_fe=0 cap_fver=0
    type=PATH msg=audit(04/20/2017 13:01:31.679:201) : item=0 name=/home/sleep inode=1319469 dev=08:03 mode=file,suid,755 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:bin_t:s0 nametype=NORMAL cap_fp=sys_admin cap_fi=none cap_fe=1 cap_fver=2

    See: https://github.com/linux-audit/audit-kernel/issues/42

    Signed-off-by: Richard Guy Briggs
    Signed-off-by: Paul Moore

    Richard Guy Briggs
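
A sketch of the normalization described in the commit above: all four cap_* fields are always emitted, with a placeholder when a value is unset. The struct and function names are hypothetical, and the values are printed in the interpreted text form shown in the sample records rather than in whatever raw form the kernel emits.

```c
#include <stdio.h>

/* Hypothetical, simplified view of a file's capability data. */
struct file_caps {
    const char *permitted;      /* NULL when no permitted caps are set */
    const char *inheritable;    /* NULL when no inheritable caps are set */
    int effective;
    int version;
};

/* Always print all four fields so every PATH record has the same shape. */
int format_caps(char *out, size_t len, const struct file_caps *c)
{
    return snprintf(out, len, "cap_fp=%s cap_fi=%s cap_fe=%d cap_fver=%d",
                    c->permitted ? c->permitted : "none",
                    c->inheritable ? c->inheritable : "none",
                    c->effective, c->version);
}
```

Because no field is ever skipped, a consumer can rely on a fixed set of field names instead of handling several record layouts.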
     

04 May, 2017

1 commit

  • Pull audit updates from Paul Moore:
    "Fourteen audit patches for v4.12 that span the full range of fixes,
    new features, and internal cleanups.

    We have patches to move to 64-bit timestamps, convert refcounts from
    atomic_t to refcount_t, track PIDs using the pid struct instead of
    pid_t, convert our own private audit buffer cache to a standard
    kmem_cache, log kernel module names when they are unloaded, and
    normalize the NETFILTER_PKT to make the userspace folks happier.

    From a fixes perspective, the most important is likely the auditd
    connection tracking RCU fix; it was a rather brain dead bug that I'll
    take the blame for, but thankfully it didn't seem to affect many
    people (only one report).

    I think the patch subject lines and commit descriptions do a pretty
    good job of explaining the details and why the changes are important
    so I'll point you there instead of duplicating it here; as usual, if
    you have any questions you know where to find us.

    We also manage to take out more code than we put in this time, that
    always makes me happy :)"

    * 'stable-4.12' of git://git.infradead.org/users/pcmoore/audit:
    audit: fix the RCU locking for the auditd_connection structure
    audit: use kmem_cache to manage the audit_buffer cache
    audit: Use timespec64 to represent audit timestamps
    audit: store the auditd PID as a pid struct instead of pid_t
    audit: kernel generated netlink traffic should have a portid of 0
    audit: combine audit_receive() and audit_receive_skb()
    audit: convert audit_watch.count from atomic_t to refcount_t
    audit: convert audit_tree.count from atomic_t to refcount_t
    audit: normalize NETFILTER_PKT
    netfilter: use consistent ipv4 network offset in xt_AUDIT
    audit: log module name on delete_module
    audit: remove unnecessary semicolon in audit_watch_handle_event()
    audit: remove unnecessary semicolon in audit_mark_handle_event()
    audit: remove unnecessary semicolon in audit_field_valid()

    Linus Torvalds
     

02 May, 2017

6 commits


16 Apr, 2017

1 commit


14 Apr, 2017

1 commit

  • Add the base infrastructure and UAPI for netlink extended ACK
    reporting. All "manual" calls to netlink_ack() pass NULL for now and
    thus don't get extended ACK reporting.

    Big thanks goes to Pablo Neira Ayuso for not only bringing up the
    whole topic at netconf (again) but also coming up with the nlattr
    passing trick and various other ideas.

    Signed-off-by: Johannes Berg
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Johannes Berg
     

10 Apr, 2017

1 commit

  • The retry queue is intended to provide a temporary buffer in the case
    of transient errors when communicating with auditd, it is not meant
    as a long life queue, that functionality is provided by the hold
    queue.

    This patch fixes a problem identified by Seth where the retry queue
    could grow uncontrollably if an auditd instance did not connect to
    the kernel to drain the queues. This commit fixes this by doing the
    following:

    * Make sure we always call auditd_reset() if we decide the connection
    with audit is really dead. There were some cases in
    kauditd_hold_skb() where we did not reset the connection, this patch
    relocates the reset calls to kauditd_thread() so all the error
    conditions are caught and the connection reset. As a side effect,
    this means we could move auditd_reset() and get rid of the forward
    definition at the top of kernel/audit.c.

    * We never checked the status of the auditd connection when
    processing the main audit queue which meant that the retry queue
    could grow unchecked. This patch adds a call to auditd_reset()
    after the main queue has been processed if auditd is not connected,
    the auditd_reset() call will make sure the retry and hold queues are
    correctly managed/flushed so that the retry queue remains reasonable.

    Cc: # 4.10.x-: 5b52330bbfe6
    Reported-by: Seth Forshee
    Signed-off-by: Paul Moore

    Paul Moore
     

21 Mar, 2017

1 commit

  • What started as a rather straightforward race condition reported by
    Dmitry using the syzkaller fuzzer ended up revealing some major
    problems with how the audit subsystem managed its netlink sockets and
    its connection with the userspace audit daemon. Fixing this properly
    had quite the cascading effect and what we are left with is this rather
    large and complicated patch. My initial goal was to try and decompose
    this patch into multiple smaller patches, but the way these changes
    are intertwined makes it difficult to split these changes into
    meaningful pieces that don't break or somehow make things worse for
    the intermediate states.

    The patch makes a number of changes, but the most significant are
    highlighted below:

    * The auditd tracking variables, e.g. audit_sock, are now gone and
    replaced by a RCU/spin_lock protected variable auditd_conn which is
    a structure containing all of the auditd tracking information.

    * We no longer track the auditd sock directly, instead we track it
    via the network namespace in which it resides and we use the audit
    socket associated with that namespace. In spirit, this is what the
    code was trying to do prior to this patch (at least I think that is
    what the original authors intended), but it was done rather poorly
    and added a layer of obfuscation that only masked the underlying
    problems.

    * Big backlog queue cleanup, again. In v4.10 we made some pretty big
    changes to how the audit backlog queues work, here we haven't changed
    the queue design so much as cleaned up the implementation. Brought
    about by the locking changes, we've simplified kauditd_thread() quite
    a bit by consolidating the queue handling into a new helper function,
    kauditd_send_queue(), which allows us to eliminate a lot of very
    similar code and makes the looping logic in kauditd_thread() clearer.

    * All netlink messages sent to auditd are now sent via
    auditd_send_unicast_skb(). Other than just making sense, this makes
    the lock handling easier.

    * Change the audit_log_start() sleep behavior so that we never sleep
    on auditd events (unchanged) or if the caller is holding the
    audit_cmd_mutex (changed). Previously we didn't sleep if the caller
    was auditd or if the message type fell between a certain range; the
    type check was a poor effort of doing what the cmd_mutex check now
    does. Richard Guy Briggs originally proposed not sleeping the
    cmd_mutex owner several years ago but his patch wasn't acceptable
    at the time. At least the idea lives on here.

    * A problem with the lost record counter has been resolved. Steve
    Grubb and I both happened to notice this problem and according to
    some quick testing by Steve, this problem goes back quite some time.
    It's largely a harmless problem, although it may have left some
    careful sysadmins quite puzzled.

    Cc: # 4.10.x-
    Reported-by: Dmitry Vyukov
    Signed-off-by: Paul Moore

    Paul Moore
     

22 Feb, 2017

1 commit

  • Pull audit updates from Paul Moore:
    "The audit changes for v4.11 are relatively small compared to what we
    did for v4.10, both in terms of size and impact.

    - two patches from Steve tweak the formatting for some of the audit
    records to make them more consistent with other audit records.

    - three patches from Richard record the name of a module on module
    load, fix the logging of sockaddr information when using
    socketcall() on 32-bit systems, and add the ability to reset
    audit's lost record counter.

    - my lone patch just fixes an annoying style nit that I was reminded
    about by one of Richard's patches.

    All these patches pass our test suite"

    * 'stable-4.11' of git://git.infradead.org/users/pcmoore/audit:
    audit: remove unnecessary curly braces from switch/case statements
    audit: log module name on init_module
    audit: log 32-bit socketcalls
    audit: add feature audit_lost reset
    audit: Make AUDIT_ANOM_ABEND event normalized
    audit: Make AUDIT_KERNEL event conform to the specification

    Linus Torvalds
     

19 Jan, 2017

1 commit

  • Add a method to reset the audit_lost value.

    An AUDIT_SET message with the AUDIT_STATUS_LOST flag set by itself
    will return a positive value representing the current audit_lost value
    and reset the counter to zero. If AUDIT_STATUS_LOST is not the
    only flag set, the reset command will be ignored. The value sent with
    the command is ignored. The return value will be the +ve lost value at
    reset time.

    An AUDIT_CONFIG_CHANGE message will be queued to the listening audit
    daemon. The message will be a standard CONFIG_CHANGE message with the
    fields "lost=0" and "old=" with the latter containing the value of
    audit_lost at reset time.

    See: https://github.com/linux-audit/audit-kernel/issues/3

    Signed-off-by: Richard Guy Briggs
    Acked-by: Steve Grubb
    Signed-off-by: Paul Moore

    Richard Guy Briggs
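
The read-and-reset semantics described in the commit above map naturally onto an atomic exchange. The sketch below uses C11 atomics in userspace; the function name, the mask constants and their values are illustrative assumptions (consult linux/audit.h for the real definitions), and only the reset path is modelled.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Illustrative mask bits; see linux/audit.h for the real definitions. */
#define STATUS_ENABLED 0x0001u
#define STATUS_LOST    0x0040u

atomic_uint lost_count;     /* records dropped so far */

/*
 * Handle the reset part of a "set" request. Returns the lost count at
 * reset time, or -1 when the reset is ignored because the LOST flag is
 * not the only flag in the mask.
 */
long handle_lost_reset(uint32_t mask)
{
    if (mask != STATUS_LOST)
        return -1;

    /* Read the old value and zero the counter in one atomic step. */
    return (long)atomic_exchange(&lost_count, 0);
}
```

Using a single exchange rather than a separate read and write means an increment that races with the reset is never silently dropped; it lands either in the returned value or in the fresh counter.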
     

18 Dec, 2016

1 commit

  • Pull more vfs updates from Al Viro:
    "In this pile:

    - autofs-namespace series
    - dedupe stuff
    - more struct path constification"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (40 commits)
    ocfs2: implement the VFS clone_range, copy_range, and dedupe_range features
    ocfs2: charge quota for reflinked blocks
    ocfs2: fix bad pointer cast
    ocfs2: always unlock when completing dio writes
    ocfs2: don't eat io errors during _dio_end_io_write
    ocfs2: budget for extent tree splits when adding refcount flag
    ocfs2: prohibit refcounted swapfiles
    ocfs2: add newlines to some error messages
    ocfs2: convert inode refcount test to a helper
    simple_write_end(): don't zero in short copy into uptodate
    exofs: don't mess with simple_write_{begin,end}
    9p: saner ->write_end() on failing copy into non-uptodate page
    fix gfs2_stuffed_write_end() on short copies
    fix ceph_write_end()
    nfs_write_end(): fix handling of short copies
    vfs: refactor clone/dedupe_file_range common functions
    fs: try to clone files first in vfs_copy_file_range
    vfs: misc struct path constification
    namespace.c: constify struct path passed to a bunch of primitives
    quota: constify struct path in quota_on
    ...

    Linus Torvalds
     

15 Dec, 2016

12 commits

  • Pull audit updates from Paul Moore:
    "After the small number of patches for v4.9, we've got a much bigger
    pile for v4.10.

    The bulk of these patches involve a rework of the audit backlog queue
    to enable us to move the netlink multicasting out of the task/thread
    that generates the audit record and into the kernel thread that emits
    the record (just like we do for the audit unicast to auditd).

    While we were playing with the backlog queue(s) we fixed a number of
    other little problems with the code, and from all the testing so far
    things look to be in much better shape now. Doing this also allowed us
    to re-enable disabling IRQs for some netns operations ("netns: avoid
    disabling irq for netns id").

    The remaining patches fix some small problems that are well documented
    in the commit descriptions, as well as adding session ID filtering
    support"

    * 'stable-4.10' of git://git.infradead.org/users/pcmoore/audit:
    audit: use proper refcount locking on audit_sock
    netns: avoid disabling irq for netns id
    audit: don't ever sleep on a command record/message
    audit: handle a clean auditd shutdown with grace
    audit: wake up kauditd_thread after auditd registers
    audit: rework audit_log_start()
    audit: rework the audit queue handling
    audit: rename the queues and kauditd related functions
    audit: queue netlink multicast sends just like we do for unicast sends
    audit: fixup audit_init()
    audit: move kaudit thread start from auditd registration to kaudit init (#2)
    audit: add support for session ID user filter
    audit: fix formatting of AUDIT_CONFIG_CHANGE events
    audit: skip sessionid sentinel value when auto-incrementing
    audit: tame initialization warning len_abuf in audit_log_execve_info
    audit: less stack usage for /proc/*/loginuid

    Linus Torvalds
     
  • The AUDIT_KERNEL event is not following name=value format. This causes
    some information to get lost. The event has been reformatted to follow
    the convention. Additionally the audit_enabled value was added for
    troubleshooting purposes. The following is an example of the new event:

    type=KERNEL audit(1480621249.833:1): state=initialized
    audit_enabled=0 res=1

    Signed-off-by: Steve Grubb
    [PM: commit tweaks to make checkpatch.pl happy]
    Signed-off-by: Paul Moore

    Steve Grubb
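
To show why the name=value convention matters to consumers, here is a minimal field lookup over a record body such as the one in the commit above. It is a sketch, not the parser used by any audit tool: `get_field` is a hypothetical helper and it assumes values contain no spaces.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the value of field `name` from a space-separated name=value
 * record into `out`. Returns 1 if the field was found, 0 otherwise.
 */
int get_field(const char *record, const char *name, char *out, size_t len)
{
    size_t nlen = strlen(name);
    const char *p = record;

    while (*p) {
        const char *end = strchr(p, ' ');
        size_t tok = end ? (size_t)(end - p) : strlen(p);

        /* A match is the whole name followed directly by '='. */
        if (tok > nlen && strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            snprintf(out, len, "%.*s", (int)(tok - nlen - 1), p + nlen + 1);
            return 1;
        }

        p += tok;
        while (*p == ' ')
            p++;
    }
    return 0;
}
```

Given `"state=initialized audit_enabled=0 res=1"`, looking up `audit_enabled` yields `0`; a free-form message offers no such handle, which is how information ends up lost.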
     
  • Resetting audit_sock appears to be racy.

    audit_sock was being copied and dereferenced without using a refcount on
    the source sock.

    Bump the refcount on the underlying sock when we store a reference in
    audit_sock and release it when we reset audit_sock. audit_sock
    modification needs the audit_cmd_mutex.

    See: https://lkml.org/lkml/2016/11/26/232

    Thanks to Eric Dumazet and Cong Wang
    for ideas on how to fix it.

    Signed-off-by: Richard Guy Briggs
    Reviewed-by: Cong Wang
    [PM: fixed the comment block text formatting for auditd_reset()]
    Signed-off-by: Paul Moore

    Richard Guy Briggs
     
  • Sleeping on a command record/message in audit_log_start() could slow
    something, e.g. auditd, from doing something important, e.g. clean
    shutdown, which could present problems on a heavily loaded system.
    This patch allows tasks to bypass any queue restrictions if they are
    logging a command record/message.

    Signed-off-by: Paul Moore

    Paul Moore
     
  • When auditd stops cleanly it sets 'auditd_pid' to 0 with an
    AUDIT_SET message, in this case we should reset our backlog
    queues via the auditd_reset() function. This patch also adds
    a 'auditd_pid' check to the top of kauditd_send_unicast_skb()
    so we can fail quicker.

    Signed-off-by: Paul Moore

    Paul Moore
     
  • This patch was suggested by Richard Briggs back in 2015, see the link
    to the mail archive below. Unfortunately, that patch is no longer
    even remotely valid due to other changes to the code.

    * https://www.redhat.com/archives/linux-audit/2015-October/msg00075.html

    Suggested-by: Richard Guy Briggs
    Signed-off-by: Paul Moore

    Paul Moore
     
  • The backlog queue handling in audit_log_start() is a little odd with
    some questionable design decisions, this patch attempts to rectify
    this with the following changes:

    * Never make auditd wait, ignore any backlog limits as we need auditd
    awake so it can drain the backlog queue.

    * When we hit a backlog limit and start dropping records, don't wake
    all the tasks sleeping on the backlog, that's silly. Instead, let
    kauditd_thread() take care of waking everyone once it has had a chance
    to drain the backlog queue.

    * Don't keep a global backlog timeout countdown, make it per-task. A
    per-task timer means we won't have all the sleeping tasks waking at
    the same time and hammering on an already stressed backlog queue.

    Signed-off-by: Paul Moore

    Paul Moore
     
  • The audit record backlog queue has always been a bit of a mess, and
    moving the multicast send into kauditd_thread() from
    audit_log_end() only makes things worse. This patch attempts to fix
    the backlog queue with a better design that should hold up better
    under load and have less of a performance impact at syscall
    invocation time.

    While it looks like there is a lot going on in this patch, the main
    change is the move from a single backlog queue to three queues:

    * A queue for holding records generated from audit_log_end() that
    haven't been consumed by kauditd_thread() (audit_queue).

    * A queue for holding records that have been sent via multicast but
    had a temporary failure when sending via unicast and need a resend
    (audit_retry_queue).

    * A queue for holding records that haven't been sent via unicast
    because no one is listening (audit_hold_queue).

    Special care is taken in this patch to ensure that the proper
    record ordering is preserved, e.g. we send everything in the hold
    queue first, then the retry queue, and finally the main queue.

    Signed-off-by: Paul Moore

    Paul Moore
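
The ordering rule from the commit above (hold queue first, then retry, then main) can be sketched with three simple FIFOs. The queue type and function names below are hypothetical, the queues are fixed-size arrays rather than skb lists, and no locking is shown.

```c
#include <string.h>

#define QCAP 8

/* Hypothetical fixed-size FIFO of record strings. */
struct queue {
    const char *items[QCAP];
    int head;
    int tail;
};

struct queue hold_q, retry_q, main_q;

int q_push(struct queue *q, const char *rec)
{
    if (q->tail == QCAP)
        return -1;              /* queue full */
    q->items[q->tail++] = rec;
    return 0;
}

const char *q_pop(struct queue *q)
{
    return q->head < q->tail ? q->items[q->head++] : NULL;
}

/*
 * Drain every queue into `out` (len must be at least 1), oldest records
 * first: hold, then retry, then main. Each record is followed by a space.
 */
void drain_all(char *out, size_t len)
{
    struct queue *order[] = { &hold_q, &retry_q, &main_q };
    const char *rec;

    out[0] = '\0';
    for (size_t i = 0; i < 3; i++) {
        while ((rec = q_pop(order[i])) != NULL) {
            strncat(out, rec, len - strlen(out) - 1);
            strncat(out, " ", len - strlen(out) - 1);
        }
    }
}
```

Records parked in the hold queue were generated earliest, so draining it first keeps the stream the daemon receives in generation order.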
     
  • The audit queue names can be shortened and the record sending
    helpers associated with the kauditd task could be named better, do
    these small cleanups now to make life easier once we start reworking
    the queues and kauditd code.

    Signed-off-by: Paul Moore

    Paul Moore
     
  • Sending audit netlink multicast messages is bad for all the same
    reasons that sending audit netlink unicast messages is bad, so this
    patch reworks things so that we don't do the multicast send in
    audit_log_end(), we do it from the dedicated kauditd_thread thread just
    as we do for unicast messages.

    See the GitHub issues below for more information/history:

    * https://github.com/linux-audit/audit-kernel/issues/23
    * https://github.com/linux-audit/audit-kernel/issues/22

    Signed-off-by: Paul Moore

    Paul Moore
     
  • Make sure everything is initialized before we start the kauditd_thread
    and don't emit the "initialized" record until everything is finished.
    We also panic with a descriptive message if we can't start the
    kauditd_thread.

    Signed-off-by: Paul Moore

    Paul Moore
     
  • Richard made this change some time ago but Eric backed it out because
    the rest of the supporting code wasn't ready. In order to move the
    netlink multicast send to kauditd_thread we need to ensure the
    kauditd_thread is always running, so restore commit 6ff5e459 ("audit:
    move kaudit thread start from auditd registration to kaudit init").

    Signed-off-by: Richard Guy Briggs
    [PM: brought forward and merged based on Richard's old patch]
    Signed-off-by: Paul Moore

    Richard Guy Briggs
     

06 Dec, 2016

1 commit


02 Dec, 2016

1 commit


18 Nov, 2016

1 commit

  • Make struct pernet_operations::id unsigned.

    There are 2 reasons to do so:

    1)
    This field is really an index into a zero-based array and
    thus is an unsigned entity. Using a negative value is an
    out-of-bounds access by definition.

    2)
    On x86_64, unsigned 32-bit data which are mixed with pointers
    via array indexing or offsets added to or subtracted from pointers
    are preferred to signed 32-bit data.

    "int" being used as an array index needs to be sign-extended
    to 64-bit before being used.

    void f(long *p, int i)
    {
            g(p[i]);
    }

    roughly translates to

            movsx   rsi, esi
            mov     rdi, [rsi+...]
            call    g

    MOVSX is a 3-byte instruction which isn't necessary if the variable is
    unsigned because x86_64 zero-extends by default.

    Now, there is net_generic() function which, you guessed it right, uses
    "int" as an array index:

    static inline void *net_generic(const struct net *net, int id)
    {
            ...
            ptr = ng->ptr[id - 1];
            ...
    }

    And this function is used a lot, so those sign extensions add up.

    Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
    messing with code generation):

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

    Unfortunately some functions actually grow bigger.
    This is a seemingly random artefact of code generation with the register
    allocator being used differently. gcc decides that some variable
    needs to live in the new r8+ registers and every access now requires a REX
    prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
    used, which is longer than [r8].

    However, overall balance is in negative direction:

    add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
    function                       old    new  delta
    nfsd4_lock                    3886   3959    +73
    tipc_link_build_proto_msg     1096   1140    +44
    mac80211_hwsim_new_radio      2776   2808    +32
    tipc_mon_rcv                  1032   1058    +26
    svcauth_gss_legacy_init       1413   1429    +16
    tipc_bcbase_select_primary     379    392    +13
    nfsd4_exchange_id             1247   1260    +13
    nfsd4_setclientid_confirm      782    793    +11
    ...
    put_client_renew_locked        494    480    -14
    ip_set_sockfn_get              730    716    -14
    geneve_sock_add                829    813    -16
    nfsd4_sequence_done            721    703    -18
    nlmclnt_lookup_host            708    686    -22
    nfsd4_lockt                   1085   1063    -22
    nfs_get_client                1077   1050    -27
    tcf_bpf_init                  1106   1076    -30
    nfsd4_encode_fattr            5997   5930    -67
    Total: Before=154856051, After=154854321, chg -0.00%

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan