27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

07 Jan, 2011

1 commit


08 Dec, 2010

1 commit

  • When fanotify_release() is called, there may still be processes waiting for
    access permission. Currently only processes for which an event has already been
    queued into the group's access list will be woken up. Processes for which no
    event has been queued will continue to sleep and thus cause a deadlock when
    fsnotify_put_group() is called.
    Furthermore there is a race allowing further processes to be waiting on the
    access wait queue after wake_up (if they arrive before clear_marks_by_group()
    is called).
    This patch corrects this by setting a flag to inform processes that the group
    is about to be destroyed and thus not to wait for access permission.

    [additional changelog from eparis]
    Let's think about the 4 relevant code paths from the PoV of the
    'operator', 'listener', 'responder' and 'closer', where operator is the
    process doing an action (like open/read) which could require permission.
    Listener is the task (or in this case thread) slated with reading from
    the fanotify file descriptor. The 'responder' is the thread responsible
    for responding to access requests. 'Closer' is the thread attempting to
    close the fanotify file descriptor.

    The 'operator' is going to end up in:
    fanotify_handle_event()
    get_response_from_access()
    (THIS BLOCKS WAITING ON USERSPACE)

    The interesting 'listener' code path:
    fanotify_read()
    copy_event_to_user()
    prepare_for_access_response()
    (THIS CREATES AN fanotify_response_event)

    The 'responder' code path:
    fanotify_write()
    process_access_response()
    (REMOVE A fanotify_response_event, SET RESPONSE, WAKE UP 'operator')

    The 'closer':
    fanotify_release()
    (SUPPOSED TO CLEAN UP THE REST OF THIS MESS)

    What we have today is that in the closer we remove all of the
    fanotify_response_events and set a bit so no more response events are
    ever created in prepare_for_access_response().

    The bug is that we never wake all of the operators up and tell them to
    move along. You fix that in fanotify_get_response_from_access(). You
    also fix other operators which haven't gotten there yet. So I agree
    that's a good fix.
    [/additional changelog from eparis]

    [remove additional changes to minimize patch size]
    [move initialization so it was inside CONFIG_FANOTIFY_PERMISSION]

    Signed-off-by: Lino Sanfilippo
    Signed-off-by: Eric Paris

    Lino Sanfilippo
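
The fix described in this entry can be modelled in plain userspace C. This is a single-threaded sketch of the state transitions only, not the kernel's wait-queue code. The names `struct group`, `bypass_perm`, `operator_may_proceed()`, `get_response()` and `release_group()` are illustrative assumptions, and the sketch assumes that access is allowed when the group is going away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum response { RESP_NONE = 0, RESP_ALLOW, RESP_DENY };

/* Hypothetical stand-in for a notification group. */
struct group {
    bool bypass_perm;        /* set once the group is being torn down */
};

/* Hypothetical stand-in for one pending permission request. */
struct perm_event {
    enum response response;  /* RESP_NONE until a responder answers */
};

/* The condition an 'operator' would sleep on: stop waiting as soon as
 * a response has arrived or the group is about to be destroyed. */
static bool operator_may_proceed(const struct group *g,
                                 const struct perm_event *ev)
{
    return ev->response != RESP_NONE || g->bypass_perm;
}

/* What the operator reports once it stops waiting. */
static enum response get_response(const struct group *g,
                                  const struct perm_event *ev)
{
    assert(operator_may_proceed(g, ev));
    if (ev->response != RESP_NONE)
        return ev->response;
    return RESP_ALLOW;       /* group is closing: assumed to allow */
}

/* The 'closer': set the flag before any cleanup, so that operators
 * which have not started waiting yet also see it. A real
 * implementation would wake all sleeping operators afterwards. */
static void release_group(struct group *g)
{
    g->bypass_perm = true;
}
```

Because the wait condition tests the flag as well as the response, an operator that arrives after the wake-up never goes to sleep in the first place, which is the race the changelog mentions.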
     

29 Oct, 2010

7 commits


23 Aug, 2010

1 commit

  • When an fanotify listener is closing it may cause a deadlock between the
    listener and the original task doing an fs operation. If the original task
    is waiting for a permissions response it will be holding the srcu lock. The
    listener cannot clean up and exit until after that srcu lock is synchronized.
    Thus deadlock. The fix introduced here is to stop accepting new permissions
    events when a listener is shutting down and to grant permission for all
    outstanding events. Thus the original task will eventually release the srcu
    lock and the listener can complete shutdown.

    Reported-by: Andreas Gruenbacher
    Cc: Andreas Gruenbacher
    Signed-off-by: Eric Paris

    Eric Paris
     

13 Aug, 2010

1 commit

  • This reverts commit 3bcf3860a4ff9bbc522820b4b765e65e4deceb3e (and the
    accompanying commit c1e5c954020e "vfs/fsnotify: fsnotify_close can delay
    the final work in fput" that was a horribly ugly hack to make it work at
    all).

    The 'struct file' approach not only causes that disgusting hack, it
    somehow breaks pulseaudio, probably due to some other subtlety with
    f_count handling.

    Fix up various conflicts due to later fsnotify work.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

28 Jul, 2010

28 commits

  • fanotify currently, when given a vfsmount_mark, will look up (if it exists)
    the corresponding inode mark. This patch drops that lookup and uses the
    mark provided.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • should_send_event() and handle_event() will both need to look up the inode
    event if they get a vfsmount event. Let's just pass both at the same time
    since we have them both after walking the lists in lockstep.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • The global fsnotify groups lists were invented as a way to increase the
    performance of fsnotify by shortcutting events which were not interesting.
    With the changes to walk the object lists rather than global groups lists
    these shortcuts are not useful.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • group->mask is now useless. It was originally a shortcut for fsnotify to
    save on performance. These checks are now redundant, so we remove them.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Because we walk the object->fsnotify_marks list instead of the global
    fsnotify groups list we don't need the fsnotify_inode_mask and
    fsnotify_vfsmount_mask as these were simply shortcuts in fsnotify() for
    performance. They are now extra checks, rip them out.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • With the change of fsnotify to use srcu walking the marks list instead of
    walking the global groups list we now know the mark in question. The code can
    send the mark to the group's handling functions and the groups won't have to
    find those marks themselves.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Currently reading the inode->i_fsnotify_marks or
    vfsmount->mnt_fsnotify_marks lists are protected by a spinlock on both the
    read and the write side. This patch protects the read side of those lists
    with a new single srcu.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Currently fsnotify checks if mark->group is NULL to decide if
    fsnotify_destroy_mark() has already been called or not. With the upcoming
    rcu work it is a heck of a lot easier to use an explicit flag than worry
    about group being set to NULL.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Al explains that calling dentry_open() with a mnt/dentry pair is only
    guaranteed to be safe if they are already used in an open struct file. To
    make sure this is the case don't store and use a struct path in fsnotify,
    always use a struct file.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Rather than the horrific void ** argument and such just to pass the
    fanotify_merge event back to the caller of fsnotify_add_notify_event() have
    those things return an event if it was different than the event suggested to
    be added.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Currently fanotify fds opened for their listeners are done with f_flags
    equal to O_RDONLY | O_LARGEFILE. This patch instead takes f_flags from the
    fanotify_init syscall and uses those when opening files in the context of
    the listener.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • This patch adds a check to make sure that all fsnotify bits are unique and we
    cannot accidentally use the same bit for 2 different fsnotify event types.

    Signed-off-by: Eric Paris

    Eric Paris
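
One way to express such a uniqueness check is to count the set bits in the OR of all event bits and compare the count with the number of event types. The sketch below is not the kernel's code: the `EV_*` values, `ALL_EVENTS`, `NUM_EVENTS` and the `BITCOUNT8` macro are made up for illustration, and it uses C11 `_Static_assert` so that a clash fails the build.

```c
#include <assert.h>

/* Hypothetical event bits, for illustration only. */
#define EV_ACCESS  0x01u
#define EV_MODIFY  0x02u
#define EV_OPEN    0x04u
#define EV_CLOSE   0x08u

#define ALL_EVENTS (EV_ACCESS | EV_MODIFY | EV_OPEN | EV_CLOSE)
#define NUM_EVENTS 4

/* Count the set bits in the low 8 bits of x; usable in constant
 * expressions, so it can feed a compile-time assertion. */
#define BIT(x, n)     (((x) >> (n)) & 1u)
#define BITCOUNT8(x)  (BIT(x, 0) + BIT(x, 1) + BIT(x, 2) + BIT(x, 3) + \
                       BIT(x, 4) + BIT(x, 5) + BIT(x, 6) + BIT(x, 7))

/* If two event types shared a bit, the OR would contain fewer set
 * bits than there are event types and compilation would stop here. */
_Static_assert(BITCOUNT8(ALL_EVENTS) == NUM_EVENTS,
               "two event types share a bit");

/* Runtime equivalent of the same count, for arbitrary masks. */
static unsigned count_bits(unsigned v)
{
    unsigned n = 0;
    while (v) {
        n += v & 1u;
        v >>= 1;
    }
    return n;
}
```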
     
  • An inotify watch on a directory will send events for children even if those
    children have been unlinked. This patch adds a new inotify flag IN_EXCL_UNLINK
    which allows a watch to specify that it does not care about unlinked children.
    This should fix performance problems seen by tasks which add a watch to
    /tmp and then are overrun with events when other processes are reading and
    writing to unlinked files they created in /tmp.

    https://bugzilla.kernel.org/show_bug.cgi?id=16296

    Requested-by: Matthias Clasen
    Signed-off-by: Eric Paris

    Eric Paris
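
From userspace, the flag might be used roughly as follows. `inotify_add_watch()` and `IN_EXCL_UNLINK` are the names exposed by `<sys/inotify.h>`; the helper name and the choice of `IN_OPEN | IN_MODIFY` are assumptions made for this sketch.

```c
#include <sys/inotify.h>
#include <unistd.h>

/* Watch `dir` for opens and modifications of its children, but ask
 * the kernel not to report events for children that have already
 * been unlinked. Returns the watch descriptor, or -1 with errno set. */
static int watch_excluding_unlinked(int inotify_fd, const char *dir)
{
    return inotify_add_watch(inotify_fd, dir,
                             IN_OPEN | IN_MODIFY | IN_EXCL_UNLINK);
}
```

A caller would create the descriptor with `inotify_init1()`, pass a directory such as /tmp, and then read events from the descriptor as usual.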
     
  • The priority argument in fanotify is useless. Kill it.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • This is the backend work needed for fanotify to support the new
    FS_OPEN_PERM and FS_ACCESS_PERM fsnotify events. This is done using the
    new fsnotify secondary queue. No userspace interface is provided to actually
    respond to or request these events.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Introduce a new fsnotify hook, fsnotify_perm(), which is called from the
    security code. This hook is used to allow fsnotify groups to make access
    control decisions about events on the system. We also must change the
    generic fsnotify function to return an error code if we intend these hooks
    to be in any way useful.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • fsnotify was using char * when it passed around the d_name.name string
    internally but it is actually an unsigned char *. This patch switches
    fsnotify to use unsigned and should silence some pointer signedness warnings
    which have popped out of xfs. I do not add -Wpointer-sign to the fsnotify
    code as there are still issues with kstrdup and strlen which would pop
    out needless warnings.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Each group can define their own notification (and secondary_q) merge
    function. Inotify does tail drop, fanotify does matching and drop which
    can actually allocate a completely new event. But for fanotify to properly
    deal with permissions events it needs to know the new event which was
    ultimately added to the notification queue. This patch just implements a
    void ** argument which is passed to the merge function. fanotify can use
    this field to pass the new event back to higher layers.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • This introduces an ordering to fsnotify groups. With purely asynchronous
    notification based "things" implementing fsnotify (inotify, dnotify) ordering
    isn't particularly important. But if people want to use fsnotify for the
    basis of synchronous notification or blocking notification ordering becomes
    important.

    eg. A Hierarchical Storage Management listener would need to get its event
    before an AV scanner could get its event (since the HSM would need to
    bring the data in for the AV scanner to scan.) Typically asynchronous notification
    would want to run after the AV scanner made any relevant access decisions
    so as to not send notification about an event that was denied.

    Signed-off-by: Eric Paris

    Eric Paris
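
An ordering like this can be kept by inserting each group into a list sorted by priority. The following is a minimal sketch under assumptions: `struct group`, its `priority` field (larger values are notified first) and `add_group()` are hypothetical and not the kernel's data structures.

```c
#include <stddef.h>

/* Hypothetical group record; a larger priority is notified earlier. */
struct group {
    const char *name;
    unsigned priority;
    struct group *next;
};

/* Insert `g` so that the list stays sorted by descending priority.
 * Groups with equal priority keep their insertion order. */
static void add_group(struct group **head, struct group *g)
{
    struct group **pos = head;

    while (*pos && (*pos)->priority >= g->priority)
        pos = &(*pos)->next;
    g->next = *pos;
    *pos = g;
}
```

With an HSM listener at the highest priority, an AV scanner below it and asynchronous listeners at the lowest, walking the list from the head delivers events in the order the changelog asks for.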
     
  • fanotify listeners may want to clear all marks. They may want to do this
    to destroy all of their inode marks which have nothing but ignores.
    Realistically this is useful for av vendors who update policy and want to
    clear all of their cached allows.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • For some inodes a group may want to never hear about a set of events, even if
    the inode is modified. We add a new mark flag which indicates that these
    marks should not have their ignored_mask cleared on modification.

    Signed-off-by: Eric Paris

    Eric Paris
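
The behaviour can be sketched as a small modification hook. The flag name `MARK_IGNORED_SURV_MODIFY`, the `EV_OPEN` value, `struct mark` and `on_inode_modified()` are hypothetical; the sketch assumes that, without the flag, a modification clears the whole ignored_mask.

```c
#include <stdint.h>

#define EV_OPEN                  0x20u  /* hypothetical event bit */
#define MARK_IGNORED_SURV_MODIFY 0x1u   /* hypothetical mark flag */

/* Hypothetical mark with an event mask, an ignored mask and flags. */
struct mark {
    uint32_t mask;
    uint32_t ignored_mask;
    uint32_t flags;
};

/* Called when the marked inode is modified: drop the ignored events
 * unless the mark asked for them to survive modification. */
static void on_inode_modified(struct mark *m)
{
    if (!(m->flags & MARK_IGNORED_SURV_MODIFY))
        m->ignored_mask = 0;
}
```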
     
  • The ignored_mask is a new mask which is part of fsnotify marks. A group's
    should_send_event() function can use the ignored mask to determine that
    certain events are not of interest. In particular if a group registers a
    mask including FS_OPEN on a vfsmount they could add FS_OPEN to the
    ignored_mask for individual inodes and not send open events for those
    inodes.

    Signed-off-by: Eric Paris

    Eric Paris
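
The vfsmount example above can be written out as a small decision function. This is a sketch, not fanotify's should_send_event(): `struct mark`, the `EV_OPEN` value and the three-argument signature are assumptions for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EV_OPEN 0x20u   /* hypothetical event bit */

/* Hypothetical mark: events wanted and events to ignore. */
struct mark {
    uint32_t mask;
    uint32_t ignored_mask;
};

/* Decide whether `event` should be delivered, given the (possibly
 * absent) marks on the inode and on the mount it lives on. An event
 * is sent if some mark asks for it and no mark ignores it. */
static bool should_send_event(const struct mark *inode_mark,
                              const struct mark *mnt_mark,
                              uint32_t event)
{
    uint32_t wanted = 0, ignored = 0;

    if (inode_mark) {
        wanted |= inode_mark->mask;
        ignored |= inode_mark->ignored_mask;
    }
    if (mnt_mark) {
        wanted |= mnt_mark->mask;
        ignored |= mnt_mark->ignored_mask;
    }
    return (event & wanted & ~ignored) != 0;
}
```

A mount mark with `EV_OPEN` in its mask delivers open events for every inode on the mount, except those inodes whose own mark carries `EV_OPEN` in its ignored_mask.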
     
  • inotify marks must pin inodes in core. dnotify doesn't technically need to
    since they are closed when the directory is closed. fanotify also needs to
    pin inodes in core as it works today. But the next step is to introduce
    the concept of 'ignored masks' which is actually a mask of events for an
    inode of no interest. I claim that these should be liberally sent to the
    kernel and should not pin the inode in core. If the inode is brought back
    in the listener will get an event it may have thought excluded, but this is
    not a serious situation and one any listener should deal with.

    This patch lays the ground work for non-pinning inode marks by using lazy
    inode pinning. We do not pin a mark until it has a non-zero mask entry. If a
    listener never sets a mask we never pin the inode.

    Signed-off-by: Eric Paris

    Eric Paris
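
Lazy pinning can be sketched with a reference count that is only taken the first time a mark gets a non-zero mask. All names here (`struct inode`, `struct mark`, `mark_set_mask()`, `mark_set_ignored_mask()`) are hypothetical stand-ins, and unpinning is left out to keep the sketch short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inode with a plain reference count. */
struct inode {
    int refcount;
};

/* Hypothetical mark attached to an inode. */
struct mark {
    struct inode *inode;
    uint32_t mask;
    uint32_t ignored_mask;
    bool pinned;             /* true once the mark holds a reference */
};

/* Setting a non-zero mask pins the inode, at most once per mark. */
static void mark_set_mask(struct mark *m, uint32_t mask)
{
    m->mask = mask;
    if (mask != 0 && !m->pinned) {
        m->inode->refcount++;
        m->pinned = true;
    }
}

/* Setting only an ignored mask never pins the inode. */
static void mark_set_ignored_mask(struct mark *m, uint32_t ignored)
{
    m->ignored_mask = ignored;
}
```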
     
  • Currently should_send_event in fanotify only cares about marks on inodes.
    This patch extends that interface to indicate that it cares about events
    that happened on vfsmounts.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Per-mount watches allow groups to listen to fsnotify events on an entire
    mount. This patch simply adds and initializes the fields needed in the
    vfsmount struct to make this happen.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Eric Paris

    Andreas Gruenbacher
     
  • Much like inode-mark.c has all of the code dealing with marks on inodes
    this patch adds a vfsmount-mark.c which has similar code but is intended
    for marks on vfsmounts.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Currently all marking is done by functions in inode-mark.c. Some of this
    is pretty generic and should instead be done in a generic function, and we
    should only put the inode specific code in inode-mark.c.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Pass the process identifiers of the triggering processes to fanotify
    listeners: this information is useful for event filtering and logging.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Eric Paris

    Andreas Gruenbacher
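
On the listener side, filtering by process might look like the sketch below. It uses `struct fanotify_event_metadata` and the `FAN_EVENT_OK` / `FAN_EVENT_NEXT` macros as found in present-day `<sys/fanotify.h>`, which may differ from the interface at the time of this commit; the helper `count_events_from_pid()` is a hypothetical name.

```c
#include <stddef.h>
#include <sys/fanotify.h>
#include <sys/types.h>

/* Walk a buffer as filled by read() on a fanotify descriptor and
 * count the events that were triggered by process `pid`. */
static size_t count_events_from_pid(const void *buf, size_t len, pid_t pid)
{
    const struct fanotify_event_metadata *meta = buf;
    size_t n = 0;

    while (FAN_EVENT_OK(meta, len)) {
        if (meta->pid == pid)
            n++;
        meta = FAN_EVENT_NEXT(meta, len);  /* also shrinks len */
    }
    return n;
}
```

A logging listener could use the same loop to print `meta->pid` next to each event instead of counting.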