10 Apr, 2017

3 commits

  • The function is already mostly contained in what
    fsnotify_clear_marks_by_group() does. Just update that function to not
    select marks when all of them should be destroyed and remove
    fsnotify_detach_group_marks().

    Reviewed-by: Miklos Szeredi
    Reviewed-by: Amir Goldstein
    Signed-off-by: Jan Kara

    Jan Kara
     
  • fanotify wants to drop fsnotify_mark_srcu lock when waiting for response
    from userspace so that the whole notification subsystem is not blocked
    during that time. This patch provides a framework for safely getting
    mark reference for a mark found in the object list which pins the mark
    in that list. We can then drop fsnotify_mark_srcu, wait for userspace
    response and then safely continue iteration of the object list once we
    reaquire fsnotify_mark_srcu.

    Reviewed-by: Miklos Szeredi
    Reviewed-by: Amir Goldstein
    Signed-off-by: Jan Kara

    Jan Kara
     
  • Currently we queue all marks for destruction on group shutdown and then
    destroy them from fsnotify_destroy_group() instead from a worker thread
    which is the usual path. However worker can already be processing some
    list of marks to destroy so this does not make 100% all marks are really
    destroyed by the time group is shut down. This isn't a big problem as
    each mark holds group reference and thus group stays partially alive
    until all marks are really freed but there's no point in complicating
    our lives - just wait for the delayed work to be finished instead.

    Reviewed-by: Miklos Szeredi
    Reviewed-by: Amir Goldstein
    Signed-off-by: Jan Kara

    Jan Kara
     

08 Oct, 2016

1 commit

  • notification_mutex is used to protect the list of pending events. As such
    there's no reason to use a sleeping lock for it. Convert it to a
    spinlock.

    [jack@suse.cz: fixed version]
    Link: http://lkml.kernel.org/r/1474031567-1831-1-git-send-email-jack@suse.cz
    Link: http://lkml.kernel.org/r/1473797711-14111-5-git-send-email-jack@suse.cz
    Signed-off-by: Jan Kara
    Reviewed-by: Lino Sanfilippo
    Tested-by: Guenter Roeck
    Cc: Miklos Szeredi
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

20 Sep, 2016

1 commit

  • Implement a function that can be called when a group is being shutdown
    to stop queueing new events to the group. Fanotify will use this.

    Fixes: 5838d4442bd5 ("fanotify: fix double free of pending permission events")
    Link: http://lkml.kernel.org/r/1473797711-14111-2-git-send-email-jack@suse.cz
    Signed-off-by: Jan Kara
    Reviewed-by: Miklos Szeredi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

20 May, 2016

1 commit

  • Inotify instance is destroyed when all references to it are dropped.
    That not only means that the corresponding file descriptor needs to be
    closed but also that all corresponding instance marks are freed (as each
    mark holds a reference to the inotify instance). However marks are
    freed only after SRCU period ends which can take some time and thus if
    user rapidly creates and frees inotify instances, number of existing
    inotify instances can exceed max_user_instances limit although from user
    point of view there is always at most one existing instance. Thus
    inotify_init() returns EMFILE error which is hard to justify from user
    point of view. This problem is exposed by LTP inotify06 testcase on
    some machines.

    We fix the problem by making sure all group marks are properly freed
    while destroying inotify instance. We wait for SRCU period to end in
    that path anyway since we have to make sure there is no event being
    added to the instance while we are tearing down the instance. So it
    takes only some plumbing to allow for marks to be destroyed in that path
    as well and not from a dedicated work item.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Jan Kara
    Reported-by: Xiaoguang Wang
    Tested-by: Xiaoguang Wang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

10 Oct, 2014

1 commit


25 Feb, 2014

1 commit

  • Commit 7053aee26a35 "fsnotify: do not share events between notification
    groups" used overflow event statically allocated in a group with the
    size of the generic notification event. This causes problems because
    some code looks at type specific parts of event structure and gets
    confused by a random data it sees there and causes crashes.

    Fix the problem by allocating overflow event with type corresponding to
    the group type so code cannot get confused.

    Signed-off-by: Jan Kara

    Jan Kara
     

22 Jan, 2014

1 commit

  • Currently fsnotify framework creates one event structure for each
    notification event and links this event into all interested notification
    groups. This is done so that we save memory when several notification
    groups are interested in the event. However the need for event
    structure shared between inotify & fanotify bloats the event structure
    so the result is often higher memory consumption.

    Another problem is that fsnotify framework keeps path references with
    outstanding events so that fanotify can return open file descriptors
    with its events. This has the undesirable effect that filesystem cannot
    be unmounted while there are outstanding events - a regression for
    inotify compared to a situation before it was converted to fsnotify
    framework. For fanotify this problem is hard to avoid and users of
    fanotify should kind of expect this behavior when they ask for file
    descriptors from notified files.

    This patch changes fsnotify and its users to create separate event
    structure for each group. This allows for much simpler code (~400 lines
    removed by this patch) and also smaller event structures. For example
    on 64-bit system original struct fsnotify_event consumes 120 bytes, plus
    additional space for file name, additional 24 bytes for second and each
    subsequent group linking the event, and additional 32 bytes for each
    inotify group for private data. After the conversion inotify event
    consumes 48 bytes plus space for file name which is considerably less
    memory unless file names are long and there are several groups
    interested in the events (both of which are uncommon). Fanotify event
    fits in 56 bytes after the conversion (fanotify doesn't care about file
    names so its events don't have to have it allocated). A win unless
    there are four or more fanotify groups interested in the event.

    The conversion also solves the problem with unmount when only inotify is
    used as we don't have to grab path references for inotify events.

    [hughd@google.com: fanotify: fix corruption preventing startup]
    Signed-off-by: Jan Kara
    Reviewed-by: Christoph Hellwig
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

12 Dec, 2012

5 commits

  • inotify is supposed to support async signal notification when information
    is available on the inotify fd. This patch moves that support to generic
    fsnotify functions so it can be used by all notification mechanisms.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Replaces the groups mark_lock spinlock with a mutex. Using a mutex instead
    of a spinlock results in more flexibility (i.e it allows to sleep while the
    lock is held).

    Signed-off-by: Lino Sanfilippo
    Signed-off-by: Eric Paris

    Lino Sanfilippo
     
  • Get a group ref for each mark that is added to the groups list and release that
    ref when the mark is freed in fsnotify_put_mark().
    We also use get a group reference for duplicated marks and for private event
    data.
    Now we dont free a group any more when the number of marks becomes 0 but when
    the groups ref count does. Since this will only happen when all marks are removed
    from a groups mark list, we dont have to set the groups number of marks to 1 at
    group creation.

    Beside clearing all marks in fsnotify_destroy_group() we do also flush the
    groups event queue. This is since events may hold references to groups (due to
    private event data) and we have to put those references first before we get a
    chance to put the final ref, which will result in a call to
    fsnotify_final_destroy_group().

    Signed-off-by: Lino Sanfilippo
    Signed-off-by: Eric Paris

    Lino Sanfilippo
     
  • Introduce fsnotify_get_group() which increments the reference counter of a group.

    Signed-off-by: Lino Sanfilippo
    Signed-off-by: Eric Paris

    Lino Sanfilippo
     
  • Currently in fsnotify_put_group() the ref count of a group is decremented and if
    it becomes 0 fsnotify_destroy_group() is called. Since a groups ref count is only
    at group creation set to 1 and never increased after that a call to fsnotify_put_group()
    always results in a call to fsnotify_destroy_group().
    With this patch fsnotify_destroy_group() is called directly.

    Signed-off-by: Lino Sanfilippo
    Signed-off-by: Eric Paris

    Lino Sanfilippo
     

27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

28 Jul, 2010

17 commits

  • The global fsnotify groups lists were invented as a way to increase the
    performance of fsnotify by shortcutting events which were not interesting.
    With the changes to walk the object lists rather than global groups lists
    these shortcuts are not useful.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • group->mask is now useless. It was originally a shortcut for fsnotify to
    save on performance. These checks are now redundant, so we remove them.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Because we walk the object->fsnotify_marks list instead of the global
    fsnotify groups list we don't need the fsnotify_inode_mask and
    fsnotify_vfsmount_mask as these were simply shortcuts in fsnotify() for
    performance. They are now extra checks, rip them out.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Currently reading the inode->i_fsnotify_marks or
    vfsmount->mnt_fsnotify_marks lists are protected by a spinlock on both the
    read and the write side. This patch protects the read side of those lists
    with a new single srcu.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • The priority argument in fanotify is useless. Kill it.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • This introduces an ordering to fsnotify groups. With purely asynchronous
    notification based "things" implementing fsnotify (inotify, dnotify) ordering
    isn't particularly important. But if people want to use fsnotify for the
    basis of sycronous notification or blocking notification ordering becomes
    important.

    eg. A Hierarchical Storage Management listener would need to get its event
    before an AV scanner could get its event (since the HSM would need to
    bring the data in for the AV scanner to scan.) Typically asynchronous notification
    would want to run after the AV scanner made any relevant access decisions
    so as to not send notification about an event that was denied.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • previously I used mark_entry when talking about marks on inodes. The
    _entry is pretty useless. Just use "mark" instead.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • The name is long and it serves no real purpose. So rename
    fsnotify_mark_entry to just fsnotify_mark.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • currently all of the notification systems implemented select which inodes
    they care about and receive messages only about those inodes (or the
    children of those inodes.) This patch begins to flesh out fsnotify support
    for the concept of listeners that want to hear notification for an inode
    accessed below a given monut point. This patch implements a second list
    of fsnotify groups to hold these types of groups and a second global mask
    to hold the events of interest for this type of group.

    The reason we want a second group list and mask is because the inode based
    notification should_send_event support which makes each group look for a mark
    on the given inode. With one nfsmount listener that means that every group would
    have to take the inode->i_lock, look for their mark, not find one, and return
    for every operation. By seperating vfsmount from inode listeners only when
    there is a inode listener will the inode groups have to look for their
    mark and take the inode lock. vfsmount listeners will have to grab the lock and
    look for a mark but there should be fewer of them, and one vfsmount listener
    won't cause the i_lock to be grabbed and released for every fsnotify group
    on every io operation.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Currently all fsnotify groups are added immediately to the
    fsnotify_inode_groups list upon creation. This means, even groups with no
    watches (common for audit) will be on the global tracking list and will
    get checked for every event. This patch adds groups to the global list on
    when the first inode mark is added to the group.

    Signed-of-by: Eric Paris

    Eric Paris
     
  • Currently the comments say that group->num_marks is held because the group
    is on the fsnotify_group list. This isn't strictly the case, we really
    just hold the num_marks for the life of the group (any time group->refcnt
    is != 0) This patch moves the initialization stuff and makes it clear when
    it is really being held.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Simple renaming patch. fsnotify is about to support mount point listeners
    so I am renaming fsnotify_groups and fsnotify_mask to indicate these are lists
    used only for groups which have watches on inodes.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Nothing uses the mask argument to fsnotify_alloc_group. This patch drops
    that argument.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • fsnotify_obtain_group was intended to be able to find an already existing
    group. Nothing uses that functionality. This just renames it to
    fsnotify_alloc_group so it is clear what it is doing.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • fsnotify_obtain_group uses kzalloc but then proceedes to set things to 0.
    This patch just deletes those useless lines.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • The original fsnotify interface has a group-num which was intended to be
    able to find a group after it was added. I no longer think this is a
    necessary thing to do and so we remove the group_num.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Use kzalloc for fsnotify_groups so that none of the fields can leak any
    information accidentally.

    Signed-off-by: Eric Paris

    Eric Paris
     

12 Jun, 2009

3 commits

  • inotify needs to do asyc notification in which event information is stored
    on a queue until the listener is ready to receive it. This patch
    implements a generic notification queue for inotify (and later fanotify) to
    store events to be sent at a later time.

    Signed-off-by: Eric Paris
    Acked-by: Al Viro
    Cc: Christoph Hellwig

    Eric Paris
     
  • This patch creates a way for fsnotify groups to attach marks to inodes.
    These marks have little meaning to the generic fsnotify infrastructure
    and thus their meaning should be interpreted by the group that attached
    them to the inode's list.

    dnotify and inotify will make use of these markings to indicate which
    inodes are of interest to their respective groups. But this implementation
    has the useful property that in the future other listeners could actually
    use the marks for the exact opposite reason, aka to indicate which inodes
    it had NO interest in.

    Signed-off-by: Eric Paris
    Acked-by: Al Viro
    Cc: Christoph Hellwig

    Eric Paris
     
  • fsnotify is a backend for filesystem notification. fsnotify does
    not provide any userspace interface but does provide the basis
    needed for other notification schemes such as dnotify. fsnotify
    can be extended to be the backend for inotify or the upcoming
    fanotify. fsnotify provides a mechanism for "groups" to register for
    some set of filesystem events and to then deliver those events to
    those groups for processing.

    fsnotify has a number of benefits, the first being actually shrinking the size
    of an inode. Before fsnotify to support both dnotify and inotify an inode had

    unsigned long i_dnotify_mask; /* Directory notify events */
    struct dnotify_struct *i_dnotify; /* for directory notifications */
    struct list_head inotify_watches; /* watches on this inode */
    struct mutex inotify_mutex; /* protects the watches list

    But with fsnotify this same functionallity (and more) is done with just

    __u32 i_fsnotify_mask; /* all events for this inode */
    struct hlist_head i_fsnotify_mark_entries; /* marks on this inode */

    That's right, inotify, dnotify, and fanotify all in 64 bits. We used that
    much space just in inotify_watches alone, before this patch set.

    fsnotify object lifetime and locking is MUCH better than what we have today.
    inotify locking is incredibly complex. See 8f7b0ba1c8539 as an example of
    what's been busted since inception. inotify needs to know internal semantics
    of superblock destruction and unmounting to function. The inode pinning and
    vfs contortions are horrible.

    no fsnotify implementers do allocation under locks. This means things like
    f04b30de3 which (due to an overabundance of caution) changes GFP_KERNEL to
    GFP_NOFS can be reverted. There are no longer any allocation rules when using
    or implementing your own fsnotify listener.

    fsnotify paves the way for fanotify. In brief fanotify is a notification
    mechanism that delivers the lisener both an 'event' and an open file descriptor
    to the object in question. This means that fanotify is pathname agnostic.
    Some on lkml may not care for the original companies or users that pushed for
    TALPA, but fanotify was designed with flexibility and input for other users in
    mind. The readahead group expressed interest in fanotify as it could be used
    to profile disk access on boot without breaking the audit system. The desktop
    search groups have also expressed interest in fanotify as it solves a number
    of the race conditions and problems present with managing inotify when more
    than a limited number of specific files are of interest. fanotify can provide
    for a userspace access control system which makes it a clean interface for AV
    vendors to hook without trying to do binary patching on the syscall table,
    LSM, and everywhere else they do their things today. With this patch series
    fanotify can be implemented in less than 1200 lines of easy to review code.
    Almost all of which is the socket based user interface.

    This patch series builds fsnotify to the point that it can implement
    dnotify and inotify_user. Patches exist and will be sent soon after
    acceptance to finish the in kernel inotify conversion (audit) and implement
    fanotify.

    Signed-off-by: Eric Paris
    Acked-by: Al Viro
    Cc: Christoph Hellwig

    Eric Paris