05 May, 2013

1 commit

  • When BSD process accounting is enabled and logs information to a
    filesystem which gets frozen, the system easily becomes unusable because
    each attempt to account process information blocks. Thus, for example,
    every task gets blocked in exit.

    It seems better to drop accounting information (which can already happen
    when the filesystem is running out of space) instead of locking the
    system up. So we just skip the write if the filesystem is frozen.

    Reported-by: Nikola Ciprich
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
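
The "skip the write instead of blocking" behaviour described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct mock_fs`, `mock_start_write_trylock()` and `acct_write_record()` are hypothetical names invented here, not the kernel's interfaces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a filesystem that can be frozen. */
struct mock_fs {
    bool   frozen;   /* true while the filesystem is frozen */
    size_t written;  /* accounting records written so far */
    size_t dropped;  /* accounting records dropped while frozen */
};

/* Non-blocking freeze protection: reports failure instead of sleeping. */
static bool mock_start_write_trylock(const struct mock_fs *fs)
{
    return !fs->frozen;
}

/* Write one accounting record, or drop it if the filesystem is frozen. */
static bool acct_write_record(struct mock_fs *fs)
{
    if (!mock_start_write_trylock(fs)) {
        fs->dropped++;   /* lose the record rather than block the exiting task */
        return false;
    }
    fs->written++;       /* the real write would happen here */
    return true;
}
```

The design point is that the caller never sleeps waiting for a thaw: a frozen filesystem costs one lost record, not a stuck exit path.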
     

10 Apr, 2013

1 commit


27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

23 Feb, 2013

1 commit


28 Jan, 2013

1 commit

  • This is in preparation for the full dynticks feature. While
    remotely reading the cputime of a task running on a full
    dynticks CPU, we'll need to do some extra computation. This
    way we can account the time it spent tickless in userspace
    since its last cputime snapshot.

    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Ingo Molnar
    Cc: Li Zhong
    Cc: Namhyung Kim
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner

    Frederic Weisbecker
     

13 Oct, 2012

2 commits

  • ...and fix up the callers. For do_file_open_root, just declare a
    struct filename on the stack and fill out the .name field. For
    do_filp_open, make it also take a struct filename pointer, and fix up its
    callers to call it appropriately.

    For filp_open, add a variant that takes a struct filename pointer and turn
    filp_open into a wrapper around it.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     
  • getname() is intended to copy pathname strings from userspace into a
    kernel buffer. The result is just a string in kernel space. It would
    however be quite helpful to be able to attach some ancillary info to
    the string.

    For instance, we could attach some audit-related info to reduce the
    amount of audit-related processing needed. When auditing is enabled,
    we could also call getname() on the string more than once and not
    need to recopy it from userspace.

    This patchset converts the getname()/putname() interfaces to return
    a struct instead of a string. For now, the struct just tracks the
    string in kernel space and the original userland pointer for it.

    Later, we'll add other information to the struct as it becomes
    convenient.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
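
A rough userspace analogue of "return a struct instead of a string" might look like the following. It is a sketch only: `struct name_ref`, `name_get()` and `name_put()` are hypothetical names chosen for illustration, and the kernel's real structure and helpers may differ in layout and behaviour.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical analogue of a pathname handle; field names are illustrative. */
struct name_ref {
    char       *name;  /* private copy of the pathname string */
    const char *uptr;  /* the pointer the caller originally passed in */
};

/* Copy the caller's string and remember where it came from. */
static struct name_ref *name_get(const char *user_path)
{
    struct name_ref *ref = malloc(sizeof(*ref));
    if (!ref)
        return NULL;

    size_t len = strlen(user_path) + 1;
    ref->name = malloc(len);
    if (!ref->name) {
        free(ref);
        return NULL;
    }
    memcpy(ref->name, user_path, len);
    ref->uptr = user_path;
    return ref;
}

/* Release both the copied string and the handle itself. */
static void name_put(struct name_ref *ref)
{
    if (!ref)
        return;
    free(ref->name);
    free(ref);
}
```

Because callers hold a handle rather than a bare string, further fields can be added to the struct later without changing every call site.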
     

12 Oct, 2012

1 commit


18 Sep, 2012

1 commit

  • BSD process accounting conveniently passes the file the accounting
    records will be written into to do_acct_process. The file credentials
    captured the user namespace of the opener of the file. Use the file
    credentials to format the uid and the gid of the current process into
    the user namespace of the user that started the bsd process
    accounting.

    Cc: Pavel Emelyanov
    Reviewed-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

09 Jan, 2012

1 commit

  • * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
    reiserfs: Properly display mount options in /proc/mounts
    vfs: prevent remount read-only if pending removes
    vfs: count unlinked inodes
    vfs: protect remounting superblock read-only
    vfs: keep list of mounts for each superblock
    vfs: switch ->show_options() to struct dentry *
    vfs: switch ->show_path() to struct dentry *
    vfs: switch ->show_devname() to struct dentry *
    vfs: switch ->show_stats to struct dentry *
    switch security_path_chmod() to struct path *
    vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
    vfs: trim includes a bit
    switch mnt_namespace ->root to struct mount
    vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
    vfs: opencode mntget() mnt_set_mountpoint()
    vfs: spread struct mount - remaining argument of next_mnt()
    vfs: move fsnotify junk to struct mount
    vfs: move mnt_devname
    vfs: move mnt_list to struct mount
    vfs: switch pnode.h macros to struct mount *
    ...

    Linus Torvalds
     

07 Jan, 2012

1 commit


04 Jan, 2012

1 commit


15 Dec, 2011

1 commit


10 Aug, 2010

1 commit

  • We'll need the path to implement the flags field for statvfs support.
    We do have it available in all callers except:

    - ecryptfs_statfs. This one doesn't actually need vfs_statfs but just
    needs to call the lower filesystem's statfs method.
    - sys_ustat. Add a non-exported statfs_by_dentry helper for it which
    won't be able to fill out the flags field later on.

    In addition rename the helpers for statfs vs fstatfs to do_*statfs instead
    of the misleading vfs prefix.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

18 May, 2010

1 commit


12 May, 2010

1 commit


12 Apr, 2010

1 commit


13 Mar, 2010

1 commit


16 Dec, 2009

1 commit

  • commit d8e180dcd5bbbab9cd3ff2e779efcf70692ef541 "bsdacct: switch
    credentials for writing to the accounting file" introduced credential
    switching during final acct data collection. However, the uid/gid pair
    continued to be collected from current, whose credentials had by then
    become those of whoever created the acct file, not of the exiting task.

    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14676

    Signed-off-by: Alexey Dobriyan
    Reported-by: Juho K. Juopperi
    Acked-by: Serge Hallyn
    Acked-by: David Howells
    Reviewed-by: Michal Schmidt
    Cc: James Morris
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

24 Aug, 2009

1 commit

  • When process accounting is enabled, every exiting process writes a log to
    the account file. In addition, every once in a while one of the exiting
    processes checks whether there's enough free space for the log.

    SELinux policy may or may not allow the exiting process to stat the fs.
    So unsuspecting processes start generating AVC denials just because
    someone enabled process accounting.

    For these filesystem operations, the exiting process's credentials should
    be temporarily switched to that of the process which enabled accounting,
    because it's really that process which wanted to have the accounting
    information logged.

    Signed-off-by: Michal Schmidt
    Acked-by: David Howells
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: James Morris

    Michal Schmidt
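
The temporary credential switch described above follows a save, override, restore pattern. The sketch below models it in userspace C with mock types; `struct mock_cred`, `current_cred_ptr`, `mock_override_creds()`, `mock_revert_creds()` and `acct_file_op()` are all hypothetical names, not the kernel's credential API.

```c
/* Hypothetical credential record; only the uid is modelled. */
struct mock_cred {
    unsigned int uid;
};

/* Stand-in for "the credentials of the currently running task". */
static const struct mock_cred *current_cred_ptr;

/* Install new credentials and hand back the previous ones. */
static const struct mock_cred *mock_override_creds(const struct mock_cred *new_cred)
{
    const struct mock_cred *old = current_cred_ptr;
    current_cred_ptr = new_cred;
    return old;
}

/* Restore the credentials saved by mock_override_creds(). */
static void mock_revert_creds(const struct mock_cred *old)
{
    current_cred_ptr = old;
}

/*
 * Run a filesystem operation as the process that enabled accounting.
 * Returns the uid that was in effect while the operation ran.
 */
static unsigned int acct_file_op(const struct mock_cred *opener)
{
    const struct mock_cred *orig = mock_override_creds(opener);
    unsigned int uid_during_io = current_cred_ptr->uid; /* the I/O would go here */
    mock_revert_creds(orig);
    return uid_during_io;
}
```

One consequence of this pattern is that anything read from the current credentials between the override and the revert describes the opener, not the exiting task, so per-task data has to be captured outside that window.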
     

01 Jul, 2009

1 commit

  • The file opened in acct_on and freshly stored in the ns->bacct struct can
    be closed in acct_file_reopen by a concurrent call after we release
    acct_lock and before we call mntput(file->f_path.mnt).

    Record file->f_path.mnt in a local variable and use this variable only.

    Signed-off-by: Renaud Lottiaux
    Signed-off-by: Louis Rilling
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Renaud Lottiaux
     

14 Jan, 2009

1 commit


14 Nov, 2008

1 commit

  • Wrap access to task credentials so that they can be separated more easily from
    the task_struct during the introduction of COW creds.

    Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

    Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
    sense to use RCU directly rather than a convenient wrapper; these will be
    addressed by later patches.

    Signed-off-by: David Howells
    Reviewed-by: James Morris
    Acked-by: Serge Hallyn
    Cc: Al Viro
    Cc: linux-audit@redhat.com
    Cc: containers@lists.linux-foundation.org
    Cc: linux-mm@kvack.org
    Signed-off-by: James Morris

    David Howells
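
Why wrapping field access helps can be shown with a small userspace sketch. The types and accessors below (`struct demo_cred`, `struct demo_task`, `demo_task_uid()`, `demo_task_gid()`) are hypothetical and only mimic the idea of the wrappers named in the message.

```c
/* Hypothetical credential block that lives outside the task structure. */
struct demo_cred {
    unsigned int uid;
    unsigned int gid;
};

/* Hypothetical task: it only points at its credentials. */
struct demo_task {
    const struct demo_cred *cred;
};

/*
 * Callers use these accessors instead of touching fields directly, so
 * moving the ids out of the task structure changes only these bodies.
 */
static inline unsigned int demo_task_uid(const struct demo_task *t)
{
    return t->cred->uid;
}

static inline unsigned int demo_task_gid(const struct demo_task *t)
{
    return t->cred->gid;
}
```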
     

14 Oct, 2008

1 commit


26 Jul, 2008

9 commits

  • Fix the comment describing what this function is and add one more, about
    the absence of locking around the pid namespaces loop.

    Signed-off-by: Pavel Emelyanov
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • This just makes acct_process walk the pid namespaces from current up to
    the top and account a task in each with the accounting turned on.

    ns->parent access is safe lockless, since current is still alive and holds
    its namespace, which in turn holds its parent.

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
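
The walk from the current namespace up to the top can be sketched as a plain parent-pointer loop. This is a userspace illustration under stated assumptions: `struct mock_pid_ns` and `account_exit()` are hypothetical names, and the real code writes a record to a file rather than bumping a counter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pid namespace: a parent link plus accounting state. */
struct mock_pid_ns {
    struct mock_pid_ns *parent;       /* NULL for the top-level namespace */
    bool                acct_enabled; /* accounting turned on in this namespace */
    unsigned int        records;      /* records written for this namespace */
};

/*
 * Account an exiting task in every namespace it is visible from,
 * starting at its own namespace and following parent links to the top.
 * Returns how many namespaces received a record.
 */
static unsigned int account_exit(struct mock_pid_ns *ns)
{
    unsigned int written = 0;

    for (; ns != NULL; ns = ns->parent) {
        if (!ns->acct_enabled)
            continue;        /* this namespace never asked for accounting */
        ns->records++;       /* stand-in for writing one accounting record */
        written++;
    }
    return written;
}
```

A task in a sub-namespace is therefore recorded once per enabled ancestor, which matches the lastcomm listing in the series summary where the init namespace also sees commands run in the sub-namespace.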
     
  • All the bsd_acct_structs with opened accounting are linked into a global
    list. So acct_auto_close(_mnt) walks it and drops the accounting
    for each.

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • Allocate the structure on the first call to sys_acct(). After this, each
    namespace that ordered the accounting will live with this structure until
    its own death.

    Two notes:
    - routines that close the accounting at fs umount time use
    the init_pid_ns's acct for now;
    - the accounting routine accounts to the dying task's namespace
    (also for now).

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • This adds the appropriate pointer to all the internal (i.e. static)
    functions that work with global acct instance. API calls pass a global
    instance to them (while we still have such).

    Mostly this is a s/acct_globals./acct->/ over the file.

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • Don't use per-bsd-acct-struct lock, but work with a global one.

    This lock is taken for short periods, so it doesn't seem it'll become a
    bottleneck, but it will allow us to easily avoid many locking difficulties
    in the future.

    So this is mostly a s/acct_globals.lock/acct_lock/ over the file.

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • We're going to have many bsd_acct_struct instances, not just one, so the
    timer (currently working with a global one) has to know which one to work
    with.

    Use a handy setup_timer macro for it (thanks to Oleg for one).

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • The acct_process does not accept any arguments actually.

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • After I fixed access to task->tgid in kernel/acct.c, Oleg pointed out some
    bad side effects with this accounting vs pid namespaces interaction. I.e.
    when some task in a pid namespace sets this accounting up, this blocks all
    the others from doing the same. Restricting this to the init namespace only
    could help, but didn't look like a graceful solution.

    So here is the approach to make this accounting work with pid namespaces
    properly.

    The idea is simple - when a task dies it accounts itself in each namespace
    it is visible from and which set the accounting up.

    For example here are the commands run and the output of lastcomm from init
    and sub namespaces:

    init_ns# accton pacct
    sub_ns# accton pacct (this is a different file - sub ns is run in
    a chroot-ed environment)
    init_ns# cat /dev/null
    sub_ns# ls /dev/null
    init_ns# accton
    sub_ns# accton

    sub_ns# lastcomm -f pacct
    ls 0 [136,0] 0.00 secs Thu May 15 10:30
    accton 0 [136,0] 0.00 secs Thu May 15 10:30

    init_ns# lastcomm -f pacct
    accton root pts/0 0.00 secs Thu May 15 14:30 << got from sub
    cat root pts/1 0.00 secs Thu May 15 14:30
    ls root pts/0 0.00 secs Thu May 15 14:30 << got from sub
    accton root pts/1 0.00 secs Thu May 15 14:30

    That was the summary, the details are in patches.

    This patch:

    It will be visible in pid_namespace.h file, so fix its name to look better
    outside the acct.c file.

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

25 Mar, 2008

2 commits

  • In case we're accounting from a sub-namespace, the tgids reported will not
    refer to the right namespace.

    Save the pid_namespace we're accounting in on the acct_glbs and use it in
    do_acct_process.

    Two less :) places using the task_struct.tgid member.

    Signed-off-by: Pavel Emelyanov
    Cc: Oleg Nesterov
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • This is minor, but dereferencing even current's real_parent is not safe on
    debug kernels, since the memory it points to can be unmapped; RCU
    protection is required.

    Besides, the tgid field is deprecated and is to be replaced with task_tgid_xxx
    call (the 2nd patch), so RCU will be required anyway.

    Signed-off-by: Pavel Emelyanov
    Cc: Oleg Nesterov
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

08 Jan, 2008

1 commit

  • The ac_ppid field reported in process accounting records
    should match what getppid() would have returned to that
    process, regardless of whether a debugger is attached.

    Signed-off-by: Roland McGrath
    Signed-off-by: Linus Torvalds

    Roland McGrath
     

27 Nov, 2007

1 commit


19 Oct, 2007

1 commit


26 Jul, 2007

1 commit

  • This avoids use of the kernel-internal "xtime" variable directly outside
    of the actual time-related functions. Instead, use the helper functions
    that we already have available to us.

    This doesn't actually change any behaviour, but this will allow us to
    fix the fact that "xtime" isn't updated very often with CONFIG_NO_HZ
    (because much of the realtime information is maintained as separate
    offsets to 'xtime'), which has caused interfaces that use xtime directly
    to get a time that is out of sync with the real-time clock by up to a
    third of a second or so.

    Signed-off-by: John Stultz
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    john stultz
     

09 Dec, 2006

1 commit