08 Oct, 2016

4 commits

  • Use assert_spin_locked() macro instead of hand-made BUG_ON statements.

    Link: http://lkml.kernel.org/r/1474537439-18919-1-git-send-email-jack@suse.cz
    Signed-off-by: Jan Kara
    Suggested-by: Heiner Kallweit
    Reviewed-by: Jeff Layton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • When freeing permission events by fsnotify_destroy_event(), the warning
    WARN_ON(!list_empty(&event->list)); may falsely hit.

    This is because although fanotify_get_response() saw event->response
    set, there is nothing to make sure the current CPU also sees the removal
    of the event from the list. Add proper locking around the WARN_ON() to
    avoid the false warning.

    Link: http://lkml.kernel.org/r/1473797711-14111-7-git-send-email-jack@suse.cz
    Reported-by: Miklos Szeredi
    Signed-off-by: Jan Kara
    Reviewed-by: Lino Sanfilippo
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • notification_mutex is used to protect the list of pending events. As such
    there's no reason to use a sleeping lock for it. Convert it to a
    spinlock.

    [jack@suse.cz: fixed version]
    Link: http://lkml.kernel.org/r/1474031567-1831-1-git-send-email-jack@suse.cz
    Link: http://lkml.kernel.org/r/1473797711-14111-5-git-send-email-jack@suse.cz
    Signed-off-by: Jan Kara
    Reviewed-by: Lino Sanfilippo
    Tested-by: Guenter Roeck
    Cc: Miklos Szeredi
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • fsnotify_flush_notify() and fanotify_release() destroy notification
    event while holding notification_mutex.

    The destruction of fanotify event includes a path_put() call which may
    end up calling into a filesystem to delete an inode if we happen to be
    the last holders of dentry reference which happens to be the last holder
    of inode reference.

    That in turn may violate lock ordering for some filesystems since
    notification_mutex is also acquired e. g. during write when generating
    fanotify event.

    Also this is the only thing that forces notification_mutex to be a
    sleeping lock. So drop notification_mutex before destroying a
    notification event.

    Link: http://lkml.kernel.org/r/1473797711-14111-4-git-send-email-jack@suse.cz
    Signed-off-by: Jan Kara
    Cc: Miklos Szeredi
    Cc: Lino Sanfilippo
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

20 Sep, 2016

2 commits

  • fanotify_get_response() calls fsnotify_remove_event() when it finds that
    group is being released from fanotify_release() (bypass_perm is set).

    However the event it removes need not be only in the group's notification
    queue but it can have already moved to access_list (userspace read the
    event before closing the fanotify instance fd) which is protected by a
    different lock. Thus when fsnotify_remove_event() races with
    fanotify_release() operating on access_list, the list can get corrupted.

    Fix the problem by moving all the logic removing permission events from
    the lists to one place - fanotify_release().

    Fixes: 5838d4442bd5 ("fanotify: fix double free of pending permission events")
    Link: http://lkml.kernel.org/r/1473797711-14111-3-git-send-email-jack@suse.cz
    Signed-off-by: Jan Kara
    Reported-by: Miklos Szeredi
    Tested-by: Miklos Szeredi
    Reviewed-by: Miklos Szeredi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Implement a function that can be called when a group is being shutdown
    to stop queueing new events to the group. Fanotify will use this.

    Fixes: 5838d4442bd5 ("fanotify: fix double free of pending permission events")
    Link: http://lkml.kernel.org/r/1473797711-14111-2-git-send-email-jack@suse.cz
    Signed-off-by: Jan Kara
    Reviewed-by: Miklos Szeredi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

07 Aug, 2014

2 commits

  • Commit 85816794240b ("fanotify: Fix use after free for permission
    events") introduced a double free issue for permission events which are
    pending in group's notification queue while group is being destroyed.
    These events are freed from fanotify_handle_event() but they are not
    removed from groups notification queue and thus they get freed again
    from fsnotify_flush_notify().

    Fix the problem by removing permission events from notification queue
    before freeing them if we skip processing access response. Also expand
    comments in fanotify_release() to explain group shutdown in detail.

    Fixes: 85816794240b9659e66e4d9b0df7c6e814e5f603
    Signed-off-by: Jan Kara
    Reported-by: Douglas Leeder
    Tested-by: Douglas Leeder
    Reported-by: Heinrich Schuchard
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Rename fsnotify_add_notify_event() to fsnotify_add_event() since the
    "notify" part is duplicit. Rename fsnotify_remove_notify_event() and
    fsnotify_peek_notify_event() to fsnotify_remove_first_event() and
    fsnotify_peek_first_event() respectively since "notify" part is duplicit
    and they really look at the first event in the queue.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Jan Kara
    Cc: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

25 Feb, 2014

3 commits

  • Commit 7053aee26a35 "fsnotify: do not share events between notification
    groups" used overflow event statically allocated in a group with the
    size of the generic notification event. This causes problems because
    some code looks at type specific parts of event structure and gets
    confused by a random data it sees there and causes crashes.

    Fix the problem by allocating overflow event with type corresponding to
    the group type so code cannot get confused.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • If the event queue overflows when we are handling permission event, we
    will never get response from userspace. So we must avoid waiting for it.
    Change fsnotify_add_notify_event() to return whether overflow has
    happened so that we can detect it in fanotify_handle_event() and act
    accordingly.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • Currently we didn't initialize event's list head when we removed it from
    the event list. Thus a detection whether overflow event is already
    queued wasn't working. Fix it by always initializing the list head when
    deleting event from a list.

    Signed-off-by: Jan Kara

    Jan Kara
     

29 Jan, 2014

1 commit

  • The event returned from fsnotify_add_notify_event() cannot ever be used
    safely as the event may be freed by the time the function returns (after
    dropping notification_mutex). So change the prototype to just return
    whether the event was added or merged into some existing event.

    Reported-and-tested-by: Jiri Kosina
    Reported-and-tested-by: Dave Jones
    Signed-off-by: Jan Kara

    Jan Kara
     

22 Jan, 2014

1 commit

  • Currently fsnotify framework creates one event structure for each
    notification event and links this event into all interested notification
    groups. This is done so that we save memory when several notification
    groups are interested in the event. However the need for event
    structure shared between inotify & fanotify bloats the event structure
    so the result is often higher memory consumption.

    Another problem is that fsnotify framework keeps path references with
    outstanding events so that fanotify can return open file descriptors
    with its events. This has the undesirable effect that filesystem cannot
    be unmounted while there are outstanding events - a regression for
    inotify compared to a situation before it was converted to fsnotify
    framework. For fanotify this problem is hard to avoid and users of
    fanotify should kind of expect this behavior when they ask for file
    descriptors from notified files.

    This patch changes fsnotify and its users to create separate event
    structure for each group. This allows for much simpler code (~400 lines
    removed by this patch) and also smaller event structures. For example
    on 64-bit system original struct fsnotify_event consumes 120 bytes, plus
    additional space for file name, additional 24 bytes for second and each
    subsequent group linking the event, and additional 32 bytes for each
    inotify group for private data. After the conversion inotify event
    consumes 48 bytes plus space for file name which is considerably less
    memory unless file names are long and there are several groups
    interested in the events (both of which are uncommon). Fanotify event
    fits in 56 bytes after the conversion (fanotify doesn't care about file
    names so its events don't have to have it allocated). A win unless
    there are four or more fanotify groups interested in the event.

    The conversion also solves the problem with unmount when only inotify is
    used as we don't have to grab path references for inotify events.

    [hughd@google.com: fanotify: fix corruption preventing startup]
    Signed-off-by: Jan Kara
    Reviewed-by: Christoph Hellwig
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

21 Dec, 2012

1 commit

  • Pull filesystem notification updates from Eric Paris:
    "This pull mostly is about locking changes in the fsnotify system. By
    switching the group lock from a spin_lock() to a mutex() we can now
    hold the lock across things like iput(). This fixes a problem
    involving unmounting a fs and having inodes be busy, first pointed out
    by FAT, but reproducible with tmpfs.

    This also restores signal driven I/O for inotify, which has been
    broken since about 2.6.32."

    Ugh. I *hate* the timing of this. It was rebased after the merge
    window opened, and then left to sit with the pull request coming the day
    before the merge window closes. That's just crap. But apparently the
    patches themselves have been around for over a year, just gathering
    dust, so now it's suddenly critical.

    Fixed up semantic conflict in fs/notify/fdinfo.c as per Stephen
    Rothwell's fixes from -next.

    * 'for-next' of git://git.infradead.org/users/eparis/notify:
    inotify: automatically restart syscalls
    inotify: dont skip removal of watch descriptor if creation of ignored event failed
    fanotify: dont merge permission events
    fsnotify: make fasync generic for both inotify and fanotify
    fsnotify: change locking order
    fsnotify: dont put marks on temporary list when clearing marks by group
    fsnotify: introduce locked versions of fsnotify_add_mark() and fsnotify_remove_mark()
    fsnotify: pass group to fsnotify_destroy_mark()
    fsnotify: use a mutex instead of a spinlock to protect a groups mark list
    fanotify: add an extra flag to mark_remove_from_mask that indicates wheather a mark should be destroyed
    fsnotify: take groups mark_lock before mark lock
    fsnotify: use reference counting for groups
    fsnotify: introduce fsnotify_get_group()
    inotify, fanotify: replace fsnotify_put_group() with fsnotify_destroy_group()

    Linus Torvalds
     

12 Dec, 2012

1 commit


19 Nov, 2012

1 commit


24 Mar, 2012

1 commit


27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

13 Aug, 2010

1 commit

  • This reverts commit 3bcf3860a4ff9bbc522820b4b765e65e4deceb3e (and the
    accompanying commit c1e5c954020e "vfs/fsnotify: fsnotify_close can delay
    the final work in fput" that was a horribly ugly hack to make it work at
    all).

    The 'struct file' approach not only causes that disgusting hack, it
    somehow breaks pulseaudio, probably due to some other subtlety with
    f_count handling.

    Fix up various conflicts due to later fsnotify work.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

28 Jul, 2010

14 commits

  • fanotify almost works like so:

    user context calls fsnotify_* function with a struct file.
    fsnotify takes a reference on the struct path
    user context goes about it's buissiness

    at some later point in time the fsnotify listener gets the struct path
    fanotify listener calls dentry_open() to create a file which userspace can deal with
    listener drops the reference on the struct path
    at some later point the listener calls close() on it's new file

    With the switch from struct path to struct file this presents a problem for
    fput() and fsnotify_close(). fsnotify_close() is called when the filp has
    already reached 0 and __fput() wants to do it's cleanup.

    The solution presented here is a bit odd. If an event is created from a
    struct file we take a reference on the file. We check however if the f_count
    was already 0 and if so we take an EXTRA reference EVEN THOUGH IT WAS ZERO.
    In __fput() (where we know the f_count hit 0 once) we check if the f_count is
    non-zero and if so we drop that 'extra' ref and return without destroying the
    file.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Al explains that calling dentry_open() with a mnt/dentry pair is only
    garunteed to be safe if they are already used in an open struct file. To
    make sure this is the case don't store and use a struct path in fsnotify,
    always use a struct file.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Rather than the horrific void ** argument and such just to pass the
    fanotify_merge event back to the caller of fsnotify_add_notify_event() have
    those things return an event if it was different than the event suggusted to
    be added.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • It can be hard to debug fsnotify since there are so few printks. Use
    pr_debug to allow for dynamic debugging.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • fsnotify was using char * when it passed around the d_name.name string
    internally but it is actually an unsigned char *. This patch switches
    fsnotify to use unsigned and should silence some pointer signess warnings
    which have popped out of xfs. I do not add -Wpointer-sign to the fsnotify
    code as there are still issues with kstrdup and strlen which would pop
    out needless warnings.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Each group can define their own notification (and secondary_q) merge
    function. Inotify does tail drop, fanotify does matching and drop which
    can actually allocate a completely new event. But for fanotify to properly
    deal with permissions events it needs to know the new event which was
    ultimately added to the notification queue. This patch just implements a
    void ** argument which is passed to the merge function. fanotify can use
    this field to pass the new event back to higher layers.

    Signed-off-by: Eric Paris
    for fanotify to properly deal with permissions events

    Eric Paris
     
  • Pass the process identifiers of the triggering processes to fanotify
    listeners: this information is useful for event filtering and logging.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Eric Paris

    Andreas Gruenbacher
     
  • Some fsnotify operations send a struct file. This is more information than
    we technically need. We instead send a struct path in all cases instead of
    sometimes a path and sometimes a file.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Eric Paris

    Andreas Gruenbacher
     
  • fsnotify_replace_event need to lock both the old and the new event. This
    causes lockdep to get all pissed off since it dosn't know this is safe.
    It's safe in this case since the new event is impossible to be reached from
    other places in the kernel.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • fanotify would like to clone events already on its notification list, make
    changes to the new event, and then replace the old event on the list with
    the new event. This patch implements the replace functionality of that
    process.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • fsnotify_clone_event will take an event, clone it, and return the cloned
    event to the caller. Since events may be in use by multiple fsnotify
    groups simultaneously certain event entries (such as the mask) cannot be
    changed after the event was created. Since fanotify would like to merge
    events happening on the same file it needs a new clean event to work with
    so it can change any fields it wishes.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • inotify only wishes to merge a new event with the last event on the
    notification fifo. fanotify is willing to merge any events including by
    means of bitwise OR masks of multiple events together. This patch moves
    the inotify event merging logic out of the generic fsnotify notification.c
    and into the inotify code. This allows each use of fsnotify to provide
    their own merge functionality.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • fsnotify event initialization is done entry by entry with almost everything
    set to either 0 or NULL. Use kmem_cache_zalloc and only initialize things
    that need non-zero initialization. Also means we don't have to change
    initialization entries based on the config options.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • Currently fsnotify defines a static fsnotify event which is sent when a
    group overflows its allotted queue length. This patch just allocates that
    event from the event cache rather than defining it statically. There is no
    known reason that the current implementation is wrong, but this makes sure the
    event is initialized and created like any other.

    Signed-off-by: Eric Paris

    Eric Paris
     

19 Oct, 2009

1 commit

  • If we do rename a dir entry, like this:

    rename("/tmp/ino7UrgoJ.rename1", "/tmp/ino7UrgoJ.rename2")
    rename("/tmp/ino7UrgoJ.rename2", "/tmp/ino7UrgoJ")

    The duplicate events should be coalesced into a single event. But those two
    events do not be coalesced into a single event, due to some bad check in
    event_compare(). It can not match the two NULL inodes as the same event.

    Signed-off-by: Wei Yongjun
    Signed-off-by: Eric Paris

    Wei Yongjun
     

18 Aug, 2009

2 commits

  • In f44aebcc the tail drop logic of events with no file backing
    (q_overflow and in_ignored) was reversed so IN_IGNORED events would
    never be tail dropped. This now means that Q_OVERFLOW events are NOT
    tail dropped. The fix is to not tail drop IN_IGNORED, but to tail drop
    Q_OVERFLOW.

    Signed-off-by: Eric Paris
    Signed-off-by: Linus Torvalds

    Eric Paris
     
  • inotify decides if private data it passed to get added to an event was
    used by checking list_empty(). But it's possible that the event may
    have been dequeued and the private event removed so it would look empty.

    The fix is to use the return code from fsnotify_add_notify_event rather
    than looking at the list.

    Signed-off-by: Eric Paris
    Signed-off-by: Linus Torvalds

    Eric Paris
     

22 Jul, 2009

3 commits

  • inotify can have a watchs removed under filesystem reclaim.

    =================================
    [ INFO: inconsistent lock state ]
    2.6.31-rc2 #16
    ---------------------------------
    inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
    khubd/217 [HC0[0]:SC0[0]:HE1:SE1] takes:
    (iprune_mutex){+.+.?.}, at: [] invalidate_inodes+0x20/0xe3
    {IN-RECLAIM_FS-W} state was registered at:
    [] __lock_acquire+0x2c9/0xac4
    [] lock_acquire+0x9f/0xc2
    [] __mutex_lock_common+0x2d/0x323
    [] mutex_lock_nested+0x2e/0x36
    [] shrink_icache_memory+0x38/0x1b2
    [] shrink_slab+0xe2/0x13c
    [] kswapd+0x3d1/0x55d
    [] kthread+0x66/0x6b
    [] kernel_thread_helper+0x7/0x10
    [] 0xffffffff

    Two things are needed to fix this. First we need a method to tell
    fsnotify_create_event() to use GFP_NOFS and second we need to stop using
    one global IN_IGNORED event and allocate them one at a time. This solves
    current issues with multiple IN_IGNORED on a queue having tail drop
    problems and simplifies the allocations since we don't have to worry about
    two tasks opperating on the IGNORED event concurrently.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • fsnotify drops new events when they are the same as the tail event on the
    queue to be sent to userspace. The problem is that if the event comes with
    a path we forget to break out of the switch statement and fall into the
    code path which matches on events that do not have any type of file backed
    information (things like IN_UNMOUNT and IN_Q_OVERFLOW). The problem is
    that this code thinks all such events should be dropped. Fix is to add a
    break.

    Signed-off-by: Eric Paris

    Eric Paris
     
  • inotify drops events if the last event on the queue is the same as the
    current event. But it does 2 things wrong. First it is comparing old->inode
    with new->inode. But after an event if put on the queue the ->inode is no
    longer allowed to be used. It's possible between the last event and this new
    event the inode could be reused and we would falsely match the inode's memory
    address between two differing events.

    The second problem is that when a file is removed fsnotify is passed the
    negative dentry for the removed object rather than the postive dentry from
    immediately before the removal. This mean the (broken) inotify tail drop code
    was matching the NULL ->inode of differing events.

    The fix is to check the file name which is stored with events when doing the
    tail drop instead of wrongly checking the address of the stored ->inode.

    Reported-by: Scott James Remnant
    Signed-off-by: Eric Paris

    Eric Paris
     

12 Jun, 2009

1 commit