27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
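
For readers unfamiliar with the helper named above, here is a minimal userspace sketch of the "increment unless zero" semantics using C11 atomics. It is not the kernel implementation; the name inc_not_zero is hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the semantics of atomic_inc_not_zero():
 * increment *v unless it is zero, and report whether we incremented. */
static bool inc_not_zero(atomic_int *v)
{
    int old;

    assert(v != NULL);
    old = atomic_load(v);
    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}
```

A reference count that has already dropped to zero is left alone, which is what makes this pattern usable for taking a reference on an object that may be on its way out.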
     

25 Mar, 2011

3 commits

  • All that remains of the inode_lock is protecting the inode hash list
    manipulation and traversals. Rename the inode_lock to
    inode_hash_lock to reflect its actual function.

    Signed-off-by: Dave Chinner
    Signed-off-by: Al Viro

    Dave Chinner
     
  • Protect the per-sb inode list with a new global lock
    inode_sb_list_lock and use it to protect the list manipulations and
    traversals. This lock replaces the inode_lock as the inodes on the
    list can be validity checked while holding the inode->i_lock and
    hence the inode_lock is no longer needed to protect the list.

    Signed-off-by: Dave Chinner
    Signed-off-by: Al Viro

    Dave Chinner
     
  • Protect inode state transitions and validity checks with the
    inode->i_lock. This enables us to make inode state transitions
    independently of the inode_lock and is the first step to peeling
    away the inode_lock from the code.

    This requires that __iget() is done atomically with i_state checks
    during list traversals so that we don't race with another thread
    marking the inode I_FREEING between the state check and grabbing the
    reference.

    Also remove the unlock_new_inode() memory barrier optimisation
    required to avoid taking the inode_lock when clearing I_NEW.
    Simplify the code by simply taking the inode->i_lock around the
    state change and wakeup. Because the wakeup is no longer tricky,
    remove the wake_up_inode() function and open code the wakeup where
    necessary.

    Signed-off-by: Dave Chinner
    Signed-off-by: Al Viro

    Dave Chinner
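
The rule in the commit above (check i_state and take the reference in one critical section) can be sketched in userspace as follows. This is an illustration under stated assumptions, not kernel code: all sk_* names and the bit value are hypothetical, and a C11 atomic_flag spin loop stands in for the inode->i_lock spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_I_FREEING 0x1u   /* hypothetical bit value */

struct sk_inode {
    atomic_flag i_lock;     /* stand-in for the spinlock inode->i_lock */
    unsigned int i_state;
    int i_count;
};

static void sk_inode_init(struct sk_inode *inode, unsigned int state, int count)
{
    atomic_flag_clear(&inode->i_lock);
    inode->i_state = state;
    inode->i_count = count;
}

static void sk_lock(struct sk_inode *inode)
{
    while (atomic_flag_test_and_set_explicit(&inode->i_lock,
                                             memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void sk_unlock(struct sk_inode *inode)
{
    atomic_flag_clear_explicit(&inode->i_lock, memory_order_release);
}

/* Check the state and take the reference under the same lock hold, so no
 * other thread can set SK_I_FREEING between the check and the grab. */
static bool sk_grab_if_live(struct sk_inode *inode)
{
    bool grabbed = false;

    assert(inode != NULL);
    sk_lock(inode);
    if (!(inode->i_state & SK_I_FREEING)) {
        inode->i_count++;   /* the role __iget() plays in the kernel */
        grabbed = true;
    }
    sk_unlock(inode);
    return grabbed;
}
```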
     

29 Oct, 2010

1 commit

  • fanotify needs to be able to specify that some groups get events before
    others. They use this idea to make sure that a hierarchical storage
    manager gets access to files before programs which actually use them. This
    is purely infrastructure. Everything will have a priority of 0, but the
    infrastructure will exist for it to be non-zero.

    Signed-off-by: Eric Paris

    Eric Paris
     

26 Oct, 2010

1 commit

  • Pull removal of fsnotify marks into generic_shutdown_super().
    Split umount-time work into a new function - evict_inodes().
    Make sure that invalidate_inodes() will be able to cope with
    I_FREEING once we change locking in iput().

    Signed-off-by: Al Viro

    Al Viro
     

11 Aug, 2010

1 commit

  • * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits)
    fanotify: use both marks when possible
    fsnotify: pass both the vfsmount mark and inode mark
    fsnotify: walk the inode and vfsmount lists simultaneously
    fsnotify: rework ignored mark flushing
    fsnotify: remove global fsnotify groups lists
    fsnotify: remove group->mask
    fsnotify: remove the global masks
    fsnotify: cleanup should_send_event
    fanotify: use the mark in handler functions
    audit: use the mark in handler functions
    dnotify: use the mark in handler functions
    inotify: use the mark in handler functions
    fsnotify: send fsnotify_mark to groups in event handling functions
    fsnotify: Exchange list heads instead of moving elements
    fsnotify: srcu to protect read side of inode and vfsmount locks
    fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called
    fsnotify: use _rcu functions for mark list traversal
    fsnotify: place marks on object in order of group memory address
    vfs/fsnotify: fsnotify_close can delay the final work in fput
    fsnotify: store struct file not struct path
    ...

    Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.

    Linus Torvalds
     

10 Aug, 2010

1 commit

  • add I_CLEAR instead of replacing I_FREEING with it. I_CLEAR is
    equivalent to I_FREEING for almost all code looking at either;
    it's there to keep track of having called clear_inode() exactly
    once per inode lifetime, at some point after having set I_FREEING.
    I_CLEAR and I_FREEING never get set at the same time with the
    current code, so we can switch to setting i_state to I_FREEING | I_CLEAR
    instead of I_CLEAR without loss of information. As a result of
    this change, checks become simpler and the amount of code that needs
    to know about I_CLEAR shrinks a lot.

    Signed-off-by: Al Viro

    Al Viro
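
A small sketch of why the checks become simpler, assuming hypothetical bit values and sk_* helper names that do not come from the kernel source: once I_CLEAR is only ever set together with I_FREEING, a test for "this inode is going away" needs to look at one bit instead of two.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen for this sketch only. */
#define SK_I_FREEING 0x1u
#define SK_I_CLEAR   0x2u

/* Old scheme: I_CLEAR replaces I_FREEING, so callers must test both bits. */
static bool sk_going_away_old(unsigned int state)
{
    return (state & (SK_I_FREEING | SK_I_CLEAR)) != 0;
}

/* New scheme: I_CLEAR is set in addition to I_FREEING, so one bit suffices. */
static bool sk_going_away_new(unsigned int state)
{
    return (state & SK_I_FREEING) != 0;
}

/* State of a cleared inode under the new scheme. */
static unsigned int sk_mark_cleared(unsigned int state)
{
    assert(state & SK_I_FREEING);   /* clearing happens after I_FREEING is set */
    return state | SK_I_CLEAR;
}
```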
     

28 Jul, 2010

16 commits

  • Currently fsnotify checks whether mark->group is NULL to decide if
    fsnotify_destroy_mark() has already been called or not. With the upcoming
    rcu work it is a heck of a lot easier to use an explicit flag than worry
    about group being set to NULL.

    Signed-off-by: Eric Paris

    Eric Paris
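
A minimal sketch of the explicit-flag idea, assuming a hypothetical flag name and value and omitting all locking: the "already destroyed" decision is read from a flag bit, so the group pointer can stay valid for readers that may still hold the mark.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_MARK_FLAG_ALIVE 0x1u   /* hypothetical flag name and value */

struct sk_flag_group { int id; };

struct sk_flag_mark {
    struct sk_flag_group *group;
    unsigned int flags;
};

/* Returns true if this call did the teardown, false if it was already done. */
static bool sk_destroy_mark(struct sk_flag_mark *mark)
{
    assert(mark != NULL);
    if (!(mark->flags & SK_MARK_FLAG_ALIVE))
        return false;
    mark->flags &= ~SK_MARK_FLAG_ALIVE;
    /* mark->group is deliberately left untouched. */
    return true;
}
```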
     
  • In preparation for srcu locking use all appropriate _rcu functions for mark
    list addition, removal, and traversal. The operations are still done under a
    spinlock at the end of this patch.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • fsnotify_marks currently are placed on objects (inodes or vfsmounts) in
    arbitrary order. This patch places them in order of the group memory address.

    Signed-off-by: Eric Paris

    Eric Paris
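
Ordered placement can be sketched as a sorted insert into a singly linked list keyed on the group pointer. The sk_* names are hypothetical, and ascending order is an assumption of this sketch; the commit message only says the marks are ordered by group address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sk_ord_mark {
    const void *group;            /* owning group; only its address matters */
    struct sk_ord_mark *next;
};

/* Insert 'mark' so the list stays sorted by ascending group address. */
static void sk_insert_sorted(struct sk_ord_mark **head, struct sk_ord_mark *mark)
{
    struct sk_ord_mark **pos = head;

    assert(head != NULL && mark != NULL);
    while (*pos && (uintptr_t)(*pos)->group < (uintptr_t)mark->group)
        pos = &(*pos)->next;
    mark->next = *pos;
    *pos = mark;
}
```

With every object's list sorted by the same key, two lists can be walked side by side in a single pass.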
     
  • fanotify listeners may want to clear all marks. They may want to do this
    to destroy all of their inode marks which have nothing but ignores.
    Realistically this is useful for AV vendors who update policy and want to
    clear all of their cached allows.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • inotify marks must pin inodes in core. dnotify doesn't technically need to
    since they are closed when the directory is closed. fanotify also needs to
    pin inodes in core as it works today. But the next step is to introduce
    the concept of 'ignored masks', which is actually a mask of events for an
    inode of no interest. I claim that these should be liberally sent to the
    kernel and should not pin the inode in core. If the inode is brought back
    in, the listener will get an event it may have thought excluded, but this is
    not a serious situation and one any listener should deal with.

    This patch lays the groundwork for non-pinning inode marks by using lazy
    inode pinning. We do not pin a mark until it has a non-zero mask entry. If a
    listener never sets a mask we never pin the inode.

    Signed-off-by: Eric Paris

    Eric Paris
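
Lazy pinning as described above can be sketched like this, with hypothetical sk_* types and a plain counter standing in for the inode reference count: the reference is taken the first time the mask becomes non-zero, and never if it stays zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sk_pin_inode { int refcount; };

struct sk_pin_mark {
    struct sk_pin_inode *inode;
    uint32_t mask;
    bool pinned;
};

/* Set the mask; pin the inode the first time the mask becomes non-zero. */
static void sk_set_mask(struct sk_pin_mark *mark, uint32_t mask)
{
    assert(mark != NULL && mark->inode != NULL);
    mark->mask = mask;
    if (mask != 0 && !mark->pinned) {
        mark->inode->refcount++;   /* the role igrab() plays in the kernel */
        mark->pinned = true;
    }
}
```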
     
  • Currently all marking is done by functions in inode-mark.c. Some of this
    is pretty generic and should instead be done in a generic function, and we
    should only put the inode-specific code in inode-mark.c.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • All callers to fsnotify_find_mark_entry() except one take and
    release inode->i_lock around the call. Take the lock inside
    fsnotify_find_mark_entry() instead.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Eric Paris

    Andreas Gruenbacher
     
  • previously I used mark_entry when talking about marks on inodes. The
    _entry is pretty useless. Just use "mark" instead.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • the _entry portion of fsnotify functions is useless. Drop it.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • The name is long and it serves no real purpose. So rename
    fsnotify_mark_entry to just fsnotify_mark.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • To differentiate between inode and vfsmount (or other future) types of
    marks we add a flags field and set the inode bit on inode marks (the only
    currently supported type of mark)

    Signed-off-by: Eric Paris

    Eric Paris
     
  • The addition of marks on vfs mounts will be simplified if the inode
    specific parts of a mark and the vfsmnt specific parts of a mark are
    actually in a union so naming can be easy. This patch just implements the
    inode struct and the union.

    Signed-off-by: Eric Paris

    Eric Paris
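
The two commits above (a flags field to tell mark types apart, and a union for the type-specific parts) can be pictured with the following sketch. The layout, the flag value, and every sk_* name are assumptions for illustration; the mount member in particular is only a placeholder for the future type the message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_MARK_FLAG_INODE 0x1u   /* hypothetical flag value */

struct sk_u_inode { int ino; };
struct sk_u_mount { int id; };

struct sk_u_mark {
    unsigned int flags;           /* says which union member is valid */
    union {
        struct { struct sk_u_inode *inode; } i;   /* inode-specific part */
        struct { struct sk_u_mount *mnt; } m;     /* placeholder for mounts */
    };
};

static void sk_init_inode_mark(struct sk_u_mark *mark, struct sk_u_inode *inode)
{
    assert(mark != NULL);
    mark->flags = SK_MARK_FLAG_INODE;
    mark->i.inode = inode;
}

static bool sk_is_inode_mark(const struct sk_u_mark *mark)
{
    assert(mark != NULL);
    return (mark->flags & SK_MARK_FLAG_INODE) != 0;
}
```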
     
  • Currently all of the notification systems implemented select which inodes
    they care about and receive messages only about those inodes (or the
    children of those inodes). This patch begins to flesh out fsnotify support
    for the concept of listeners that want to hear notification for an inode
    accessed below a given mount point. This patch implements a second list
    of fsnotify groups to hold these types of groups and a second global mask
    to hold the events of interest for this type of group.

    The reason we want a second group list and mask is that the inode-based
    should_send_event support makes each group look for a mark on the given
    inode. With one vfsmount listener that would mean that every group would
    have to take the inode->i_lock, look for its mark, not find one, and return
    for every operation. By separating vfsmount from inode listeners, the inode
    groups only have to look for their mark and take the inode lock when there
    is an inode listener. vfsmount listeners will have to grab the lock and
    look for a mark but there should be fewer of them, and one vfsmount listener
    won't cause the i_lock to be grabbed and released for every fsnotify group
    on every io operation.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Currently all fsnotify groups are added immediately to the
    fsnotify_inode_groups list upon creation. This means, even groups with no
    watches (common for audit) will be on the global tracking list and will
    get checked for every event. This patch adds groups to the global list only
    when the first inode mark is added to the group.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • This patch allows a task to add a second fsnotify mark to an inode for the
    same group. This mark will be added to the end of the inode's list and
    will never be found by the standard fsnotify_find_mark() function. This
    is useful if a user wants to add a new mark before removing the old one.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Simply copy fsnotify information from one mark to another in preparation
    for the second mark to replace the first.

    Signed-off-by: Eric Paris

    Eric Paris
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

19 Oct, 2009

1 commit

  • fsnotify_add_mark is supposed to add a mark to the g_list and i_list and to
    set the group and inode for the mark. fsnotify_destroy_mark_by_entry uses
    the fact that ->group != NULL to know if this group should be destroyed or
    if it's already been done.

    But fsnotify_add_mark sets the group and inode before it actually adds the
    mark to the i_list and g_list. This can result in a race in inotify; it
    requires 3 threads.

    Thread 1: sys_inotify_add_watch("file")
    Thread 2: sys_inotify_add_watch("file")
    Thread 3: sys_inotify_rm_watch([a])

    Thread 1:
      inotify_update_watch()
      inotify_new_watch()
      inotify_add_to_idr()
         ^--- returns wd = [a]
    Thread 2:
      inotify_update_watch()
      inotify_new_watch()
      inotify_add_to_idr()
      fsnotify_add_mark()
         ^--- returns wd = [b]
      returns to userspace;
    Thread 3:
      inotify_idr_find([a])
         ^--- gives us the pointer from task 1
    Thread 1:
      fsnotify_add_mark()
         ^--- this is going to set the mark->group and mark->inode fields, but
              will return -EEXIST because of the race with [b].
    Thread 3:
      fsnotify_destroy_mark()
         ^--- since ->group != NULL we call back
              into inotify_freeing_mark() which calls
      inotify_remove_from_idr([a])
    Thread 1:
      since fsnotify_add_mark() failed we call:
      inotify_remove_from_idr([a])
         ^--- [a] was already removed from the idr by thread 3

    The fix is to not set mark->group until we are sure the mark is
    on the inode and fsnotify_add_mark will return success.

    Signed-off-by: Eric Paris

    Eric Paris
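
The ordering this message asks for (leave the group unset until the add is certain to succeed) can be sketched as below. The types and the sk_add_mark name are hypothetical, locking is omitted, and the duplicate check is a simplified stand-in for whatever makes the real call return -EEXIST.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sk_r_group { int id; };

struct sk_r_mark {
    struct sk_r_group *group;     /* NULL until the mark is on the list */
    struct sk_r_mark *next;
};

struct sk_r_inode { struct sk_r_mark *marks; };

/* Attach 'mark' for 'group'. On -EEXIST the mark is left untouched, so a
 * destroy path that tests mark->group still sees NULL and does nothing. */
static int sk_add_mark(struct sk_r_inode *inode, struct sk_r_group *group,
                       struct sk_r_mark *mark)
{
    struct sk_r_mark *m;

    assert(inode != NULL && group != NULL && mark != NULL);
    for (m = inode->marks; m; m = m->next)
        if (m->group == group)
            return -EEXIST;

    /* Publish the group only once success is certain. */
    mark->group = group;
    mark->next = inode->marks;
    inode->marks = mark;
    return 0;
}
```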
     

12 Jun, 2009

5 commits

  • Most fsnotify listeners (all but inotify) do not care about marks being
    freed. Allow groups to set freeing_mark to null and do not call any
    function if it is set that way.

    Signed-off-by: Eric Paris

    Eric Paris
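
An optional callback of this kind is usually handled with a NULL check at the call site. A minimal sketch, with hypothetical sk_* names and a counter added only so the effect is observable:

```c
#include <assert.h>
#include <stddef.h>

struct sk_cb_mark { int id; };

struct sk_cb_ops {
    void (*freeing_mark)(struct sk_cb_mark *mark);   /* may be NULL */
};

static int sk_freed_count;   /* bumped by the example callback below */

/* Example callback for a group that does care about marks being freed. */
static void sk_count_freeing(struct sk_cb_mark *mark)
{
    (void)mark;
    sk_freed_count++;
}

/* Invoke the callback only when the group provided one. */
static void sk_notify_freeing(const struct sk_cb_ops *ops, struct sk_cb_mark *mark)
{
    assert(ops != NULL);
    if (ops->freeing_mark)
        ops->freeing_mark(mark);
}
```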
     
  • When an fs is unmounted with an fsnotify mark entry attached to one of its
    inodes we need to destroy that mark entry and we also (like inotify) send
    an unmount event.

    Signed-off-by: Eric Paris
    Acked-by: Al Viro
    Cc: Christoph Hellwig

    Eric Paris
     
  • This patch pins any inodes with an fsnotify mark in core. The idea is that
    as soon as the mark is removed from the inode->fsnotify_mark_entries list
    the inode will be iput. In reality it doesn't quite work exactly this way.
    The igrab will happen when the mark is added to an inode, but the iput will
    happen when the inode pointer is NULL'd inside the mark.

    It's possible that 2 racing things will try to remove the mark from
    different directions. One may try to remove the mark because of an
    explicit request and one might try to remove it because the inode was
    deleted. It's possible that the removal because of inode deletion will
    remove the mark from the inode's list, but the removal by explicit request
    will actually set entry->inode == NULL; and call the iput. This is safe.

    Signed-off-by: Eric Paris
    Acked-by: Al Viro
    Cc: Christoph Hellwig

    Eric Paris
     
  • inotify and dnotify both use a similar parent notification mechanism. We
    add a generic parent notification mechanism to fsnotify for both of these
    to use. This new mechanism also adds the dentry flag optimization which
    exists for inotify to dnotify.

    Signed-off-by: Eric Paris
    Acked-by: Al Viro
    Cc: Christoph Hellwig

    Eric Paris
     
  • This patch creates a way for fsnotify groups to attach marks to inodes.
    These marks have little meaning to the generic fsnotify infrastructure
    and thus their meaning should be interpreted by the group that attached
    them to the inode's list.

    dnotify and inotify will make use of these markings to indicate which
    inodes are of interest to their respective groups. But this implementation
    has the useful property that in the future other listeners could actually
    use the marks for the exact opposite reason, i.e. to indicate which inodes
    they have NO interest in.

    Signed-off-by: Eric Paris
    Acked-by: Al Viro
    Cc: Christoph Hellwig

    Eric Paris