04 Apr, 2014

5 commits

  • Move code moving event structure to access_list from copy_event_to_user()
    to fanotify_read() where it is more logical (so that we can immediately
    see in the main loop that we either move the event to a different list
    or free it). Also move special error handling for permission events
    from copy_event_to_user() to the main loop to have it in one place with
    error handling for normal events. This makes copy_event_to_user()
    really only copy the event to user without any side effects.

    Signed-off-by: Jan Kara
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Swap the error / "read ok" branches in the main loop of fanotify_read().
    We will grow the "read ok" part in the next patch and this makes the
    indentation easier. Also it is more common to have error conditions
    inside an 'if' instead of the fast path.

    Signed-off-by: Jan Kara
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • access_mutex is used only to guard operations on access_list. There's
    no need for sleeping within this lock so just make a spinlock out of it.

    Signed-off-by: Jan Kara
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Currently, fanotify creates new structure to track the fact that
    permission event has been reported to userspace and someone is waiting
    for a response to it. As event structures are now completely in the
    hands of each notification framework, we can use the event structure for
    this tracking instead of allocating a new structure.

    Since this makes the event structures for normal events and permission
    events even more different and the structures have different lifetime
    rules, we split them into two separate structures (where permission
    event structure contains the structure for a normal event). This makes
    normal events 8 bytes smaller and the code a tad bit cleaner.

    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Jan Kara
    Cc: Eric Paris
    Cc: Al Viro
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
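
    The layout described in this commit can be sketched in plain C. The
    names below (sketch_event, sketch_perm_event, to_perm_event) and the
    reduced fields are hypothetical, not the kernel's definitions. The
    sketch only illustrates the idea that a permission event embeds the
    normal event as a member, so generic code can keep a pointer to the
    inner event while fanotify-specific code recovers the outer structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the normal event structure. */
struct sketch_event {
    uint32_t mask;  /* event type bits */
    int      pid;   /* placeholder payload */
};

/* Hypothetical permission event: the normal event plus the state that
 * has to live until userspace answers. */
struct sketch_perm_event {
    struct sketch_event base;  /* embedded normal event */
    int response;              /* 0 until userspace has answered */
};

/* Recover the enclosing permission event from a pointer to its embedded
 * normal event (the container_of idiom). Only valid when ev really is
 * the base member of a sketch_perm_event. */
static struct sketch_perm_event *to_perm_event(struct sketch_event *ev)
{
    return (struct sketch_perm_event *)
        ((char *)ev - offsetof(struct sketch_perm_event, base));
}
```

    Normal events allocate only the smaller structure, so they do not pay
    for the response field they never use.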
     
  • The prepare_for_access_response() function checks whether
    group->fanotify_data.bypass_perm is set. However this test can never be
    true because prepare_for_access_response() is called only from
    fanotify_read() which means fanotify group is alive with an active fd
    while bypass_perm is set from fanotify_release() when all file
    descriptors pointing to the group are closed and the group is going
    away.

    Signed-off-by: Jan Kara
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

25 Feb, 2014

3 commits

  • Commit 7053aee26a35 "fsnotify: do not share events between notification
    groups" used overflow event statically allocated in a group with the
    size of the generic notification event. This causes problems because
    some code looks at type-specific parts of the event structure, gets
    confused by the random data it sees there, and crashes.

    Fix the problem by allocating overflow event with type corresponding to
    the group type so code cannot get confused.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • If the event queue overflows when we are handling permission event, we
    will never get response from userspace. So we must avoid waiting for it.
    Change fsnotify_add_notify_event() to return whether overflow has
    happened so that we can detect it in fanotify_handle_event() and act
    accordingly.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • Currently we don't initialize an event's list head when we remove it
    from the event list. Thus detecting whether the overflow event is
    already queued doesn't work. Fix it by always initializing the list
    head when deleting an event from a list.

    Signed-off-by: Jan Kara

    Jan Kara
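
    Why the re-initialization matters can be shown with a minimal
    userspace doubly-linked list. This is a sketch with hypothetical
    helper names, not the kernel's list.h. The "is it queued?" test is
    an emptiness check on the node itself, which only works if unlinking
    also resets the node to point at itself.

```c
#include <stdbool.h>

struct list_node { struct list_node *next, *prev; };

/* A node (or list head) that points at itself is "empty" / not queued. */
static void list_init(struct list_node *n) { n->next = n; n->prev = n; }
static bool list_is_empty(const struct list_node *n) { return n->next == n; }

static void list_add_tail(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink only: n's own pointers still refer to its old neighbours, so
 * list_is_empty(n) keeps reporting "queued". */
static void list_unlink(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Unlink and reset: afterwards list_is_empty(n) is true again. */
static void list_unlink_init(struct list_node *n)
{
    list_unlink(n);
    list_init(n);
}
```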
     

18 Feb, 2014

1 commit

  • My rework of handling of notification events (namely commit 7053aee26a35
    "fsnotify: do not share events between notification groups") broke
    sending of cookies with inotify events. We didn't propagate the value
    passed to fsnotify() properly and passed 4 uninitialized bytes to
    userspace instead (so it is also an information leak). Sadly I didn't
    notice this during my testing because inotify cookies aren't used very
    much and LTP inotify tests ignore them.

    Fix the problem by passing the cookie value properly.

    Fixes: 7053aee26a3548ebaba046ae2e52396ccf56ac6c
    Reported-by: Vegard Nossum
    Signed-off-by: Jan Kara

    Jan Kara
     

29 Jan, 2014

3 commits

  • Currently struct fanotify_event_info is destroyed immediately after
    reporting its contents to userspace. However that is wrong for
    permission events because those need to stay around until userspace
    provides a response, which is filled back into fanotify_event_info.
    So change the code to free permission events only after we have got
    the response from userspace.

    Reported-and-tested-by: Jiri Kosina
    Reported-and-tested-by: Dave Jones
    Signed-off-by: Jan Kara

    Jan Kara
     
  • The event returned from fsnotify_add_notify_event() cannot ever be used
    safely as the event may be freed by the time the function returns (after
    dropping notification_mutex). So change the prototype to just return
    whether the event was added or merged into some existing event.

    Reported-and-tested-by: Jiri Kosina
    Reported-and-tested-by: Dave Jones
    Signed-off-by: Jan Kara

    Jan Kara
     
  • We cannot use the event structure returned from
    fsnotify_add_notify_event() because that event can be freed by the time
    that function returns. Use the mask argument passed into the event
    handler directly instead. This also fixes a possible problem when we
    could unnecessarily wait for permission response for a normal fanotify
    event which got merged with a permission event.

    We also disallow merging of permission event with any other event so
    that we know the permission event which we just created is the one on
    which we should wait for permission response.

    Reported-and-tested-by: Jiri Kosina
    Reported-and-tested-by: Dave Jones
    Signed-off-by: Jan Kara

    Jan Kara
     

28 Jan, 2014

1 commit

  • Commit 91c2e0bcae72 ("unify compat fanotify_mark(2), switch to
    COMPAT_SYSCALL_DEFINE") added a new unified compat fanotify_mark syscall
    to be used by all architectures.

    Unfortunately the unified version merges the split mask parameter the
    wrong way round: the lower and higher words got swapped.

    This was discovered with glibc's tst-fanotify test case.

    Signed-off-by: Heiko Carstens
    Reported-by: Andreas Krebbel
    Cc: "James E.J. Bottomley"
    Acked-by: "David S. Miller"
    Acked-by: Al Viro
    Cc: Benjamin Herrenschmidt
    Cc: Ingo Molnar
    Cc: Ralf Baechle
    Cc: [3.10+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
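
    The kind of mistake described in this commit can be illustrated with
    two small helpers. This is a sketch only: the function names are
    hypothetical, and the real compat wrapper also has to account for
    which argument carries which half on a given architecture.

```c
#include <stdint.h>

/* Intended merge: the high word lands in the upper 32 bits. */
static uint64_t merge_mask(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* Swapped merge: the low word ends up in the upper 32 bits, so a mask
 * that fits in 32 bits is shifted out of the range the caller meant. */
static uint64_t merge_mask_swapped(uint32_t low, uint32_t high)
{
    return ((uint64_t)low << 32) | high;
}
```

    With a mask of 0x3b passed as the low word and 0 as the high word,
    the first helper yields 0x3b while the second yields 0x3b00000000.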
     

22 Jan, 2014

4 commits

  • We usually rely on the fact that struct members not specified in the
    initializer are set to NULL. So do that with fsnotify function pointers
    as well.

    Signed-off-by: Jan Kara
    Reviewed-by: Christoph Hellwig
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • After removing event structure creation from the generic layer there is
    no reason for separate .should_send_event and .handle_event callbacks.
    So just remove the first one.

    Signed-off-by: Jan Kara
    Reviewed-by: Christoph Hellwig
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Currently fsnotify framework creates one event structure for each
    notification event and links this event into all interested notification
    groups. This is done so that we save memory when several notification
    groups are interested in the event. However the need for event
    structure shared between inotify & fanotify bloats the event structure
    so the result is often higher memory consumption.

    Another problem is that fsnotify framework keeps path references with
    outstanding events so that fanotify can return open file descriptors
    with its events. This has the undesirable effect that filesystem cannot
    be unmounted while there are outstanding events - a regression for
    inotify compared to a situation before it was converted to fsnotify
    framework. For fanotify this problem is hard to avoid and users of
    fanotify should kind of expect this behavior when they ask for file
    descriptors from notified files.

    This patch changes fsnotify and its users to create separate event
    structure for each group. This allows for much simpler code (~400 lines
    removed by this patch) and also smaller event structures. For example
    on 64-bit system original struct fsnotify_event consumes 120 bytes, plus
    additional space for file name, additional 24 bytes for second and each
    subsequent group linking the event, and additional 32 bytes for each
    inotify group for private data. After the conversion inotify event
    consumes 48 bytes plus space for file name which is considerably less
    memory unless file names are long and there are several groups
    interested in the events (both of which are uncommon). Fanotify event
    fits in 56 bytes after the conversion (fanotify doesn't care about file
    names so its events don't have to have it allocated). A win unless
    there are four or more fanotify groups interested in the event.

    The conversion also solves the problem with unmount when only inotify is
    used as we don't have to grab path references for inotify events.

    [hughd@google.com: fanotify: fix corruption preventing startup]
    Signed-off-by: Jan Kara
    Reviewed-by: Christoph Hellwig
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Rounding of name length when passing it to userspace was done in several
    places. Provide a function to do it and use it in all places.

    Signed-off-by: Jan Kara
    Reviewed-by: Christoph Hellwig
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
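
    A helper of the kind this commit describes might look like the
    sketch below. The function name and the 16-byte rounding unit are
    assumptions made for illustration; the commit message does not state
    the unit the kernel rounds to.

```c
#include <stddef.h>
#include <string.h>

/* Assumed rounding unit, hypothetical constant for this sketch. */
#define SKETCH_ROUND_UNIT 16u

/* Length of the name as reported to userspace: the string plus its
 * terminating NUL, rounded up to a multiple of SKETCH_ROUND_UNIT.
 * A missing name takes no space. */
static size_t sketch_round_name_len(const char *name)
{
    size_t len;

    if (name == NULL)
        return 0;
    len = strlen(name) + 1;
    return (len + SKETCH_ROUND_UNIT - 1) / SKETCH_ROUND_UNIT
           * SKETCH_ROUND_UNIT;
}
```

    Keeping the rounding in one function means every place that sizes or
    copies an event agrees on the padded length.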
     

10 Jul, 2013

6 commits

  • There have been changes in the locking scheme of fsnotify but the
    comments in the source code have not been updated yet. This patch
    corrects this.

    Signed-off-by: Lino Sanfilippo
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lino Sanfilippo
     
  • In inotify_new_watch() the number of watches for a group is compared
    against the max number of allowed watches and increased afterwards. The
    check and increment are not done atomically, so it is possible for
    multiple concurrent threads to pass the check and increment the number
    of marks above the allowed max.

    This patch uses the inotify group's mark_lock to ensure that both check
    and increment are done atomically. Furthermore we don't have to worry
    about the race that allows a concurrent thread to add a watch just after
    inotify_update_existing_watch() returned with -ENOENT anymore, since
    this is also synchronized by the group's mark mutex now.

    Signed-off-by: Lino Sanfilippo
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lino Sanfilippo
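
    The fix pattern, doing the limit check and the increment under one
    lock, can be sketched in userspace with a pthread mutex. The
    structure and function names are hypothetical and stand in for the
    group and its mark mutex.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a notification group with a watch limit. */
struct sketch_group {
    pthread_mutex_t lock;      /* plays the role of the mark mutex */
    unsigned int nr_watches;
    unsigned int max_watches;
};

/* Check the limit and take a slot in one critical section, so two
 * threads cannot both pass the check for the last free slot. */
static bool sketch_add_watch(struct sketch_group *g)
{
    bool ok = false;

    pthread_mutex_lock(&g->lock);
    if (g->nr_watches < g->max_watches) {
        g->nr_watches++;
        ok = true;
    }
    pthread_mutex_unlock(&g->lock);
    return ok;
}
```

    Without the lock, the comparison and the increment are separate
    steps and the count can end up above max_watches.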
     
  • There is no need to use a special mutex to protect against the
    fcntl/close race (see dnotify.c for a description of this race).
    Instead the dnotify_group's mark mutex can be used.

    Signed-off-by: Lino Sanfilippo
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lino Sanfilippo
     
  • The code under the group's mark_mutex in fanotify_add_inode_mark() and
    fanotify_add_vfsmount_mark() is almost identical. So put it into a
    separate function.

    Signed-off-by: Lino Sanfilippo
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lino Sanfilippo
     
  • For both adding an event to an existing mark and destroying a mark we
    first have to find it via fsnotify_find_[inode|vfsmount]_mark(). But
    getting the mark and adding an event (or destroying it) is not done
    atomically. This opens a race where a thread is about to destroy a mark
    while another thread still finds the same mark and adds an event to its
    mask although it will be destroyed.

    Another race concerns exceeding a group's limit on the number of marks:
    when a mark is added, the number of group marks is checked against
    the max number of marks per group and increased afterwards. Since check
    and increment are also not done atomically, this may result in 2 or more
    processes passing the check at the same time and increasing the number
    of group marks above the allowed limit.

    With this patch both races are avoided by doing the operations
    concerned with the group's mark mutex locked.

    Signed-off-by: Lino Sanfilippo
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lino Sanfilippo
     
  • The ->reserved field isn't cleared so we leak one byte of stack
    information to userspace.

    Signed-off-by: Dan Carpenter
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Carpenter
     

29 Jun, 2013

1 commit


10 May, 2013

1 commit


02 May, 2013

1 commit

  • Pull VFS updates from Al Viro,

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     

01 May, 2013

2 commits

  • Pull compat cleanup from Al Viro:
    "Mostly about syscall wrappers this time; there will be another pile
    with patches in the same general area from various people, but I'd
    rather push those after both that and vfs.git pile are in."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
    syscalls.h: slightly reduce the jungles of macros
    get rid of union semop in sys_semctl(2) arguments
    make do_mremap() static
    sparc: no need to sign-extend in sync_file_range() wrapper
    ppc compat wrappers for add_key(2) and request_key(2) are pointless
    x86: trim sys_ia32.h
    x86: sys32_kill and sys32_mprotect are pointless
    get rid of compat_sys_semctl() and friends in case of ARCH_WANT_OLD_COMPAT_IPC
    merge compat sys_ipc instances
    consolidate compat lookup_dcookie()
    convert vmsplice to COMPAT_SYSCALL_DEFINE
    switch getrusage() to COMPAT_SYSCALL_DEFINE
    switch epoll_pwait to COMPAT_SYSCALL_DEFINE
    convert sendfile{,64} to COMPAT_SYSCALL_DEFINE
    switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE
    make SYSCALL_DEFINE-generated wrappers do asmlinkage_protect
    make HAVE_SYSCALL_WRAPPERS unconditional
    consolidate cond_syscall and SYSCALL_ALIAS declarations
    teach SYSCALL_DEFINE how to deal with long long/unsigned long long
    get rid of duplicate logics in __SC_....[1-6] definitions

    Linus Torvalds
     
  • When we run the crackerjack testsuite, the inotify_add_watch test is
    stalled.

    This is caused by the invalid mask 0 - the task is waiting for the event
    but it never comes. inotify_add_watch() should return -EINVAL as it did
    before commit 676a0675cf92 ("inotify: remove broken mask checks causing
    unmount to be EINVAL"). That commit removes the invalid mask check, but
    that check is needed.

    Check the mask's ALL_INOTIFY_BITS before the inotify_arg_to_mask() call.
    If none are set, just return -EINVAL.

    Because IN_UNMOUNT is in ALL_INOTIFY_BITS, this change will not trigger
    the problem that above commit fixed.

    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Zhao Hongjiang
    Acked-by: Jim Somerville
    Cc: Paul Gortmaker
    Cc: Jerome Marchand
    Cc: Eric Paris
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zhao Hongjiang
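
    The restored check can be sketched as below. The SK_ constants use
    the numeric values of IN_ALL_EVENTS and IN_UNMOUNT from the inotify
    userspace header, but SK_VALID_BITS is a reduced, hypothetical
    stand-in for ALL_INOTIFY_BITS, which covers more flags than these
    two.

```c
#include <errno.h>
#include <stdint.h>

#define SK_IN_ALL_EVENTS 0x00000fffu  /* value of IN_ALL_EVENTS */
#define SK_IN_UNMOUNT    0x00002000u  /* value of IN_UNMOUNT */

/* Reduced stand-in for ALL_INOTIFY_BITS (assumption of this sketch). */
#define SK_VALID_BITS (SK_IN_ALL_EVENTS | SK_IN_UNMOUNT)

/* Validate the user-supplied mask before any conversion: a mask with
 * none of the known bits set can never produce an event, so reject it. */
static int sketch_check_mask(uint32_t arg)
{
    if ((arg & SK_VALID_BITS) == 0)
        return -EINVAL;
    return 0;
}
```

    Because the unmount bit is part of the accepted set, a mask that
    requests only unmount events still passes.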
     

30 Apr, 2013

2 commits


04 Mar, 2013

1 commit


28 Feb, 2013

3 commits

  • I'm not sure why, but the hlist for each entry iterators were conceived
    differently from the list ones. The list iterator is:

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not only
    do they not really need it, it also prevents the iterator from looking
    exactly like the list iterator, which is unfortunate.

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A small number of places were using the 'node' parameter; these
    were modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foudnation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
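
    A three-parameter hlist iterator of the shape this commit describes
    can be sketched in userspace as follows. The sk_ names are
    hypothetical, the node is simplified to a single next pointer (the
    kernel's hlist_node also carries a pprev pointer), and __typeof__ is
    a GCC/Clang extension.

```c
#include <stddef.h>

/* Simplified singly-linked stand-ins for hlist_node / hlist_head. */
struct sk_hlist_node { struct sk_hlist_node *next; };
struct sk_hlist_head { struct sk_hlist_node *first; };

/* Map a node pointer to its containing object, or NULL at list end. */
#define sk_entry_or_null(ptr, type, member) \
    ((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : (type *)NULL)

/* Same argument list as list_for_each_entry(pos, head, member): the
 * cursor is the entry pointer itself, no separate node cursor needed. */
#define sk_hlist_for_each_entry(pos, head, member)                          \
    for (pos = sk_entry_or_null((head)->first, __typeof__(*(pos)), member); \
         pos;                                                               \
         pos = sk_entry_or_null((pos)->member.next, __typeof__(*(pos)), member))

/* Example payload type with an embedded node. */
struct sk_item { int value; struct sk_hlist_node node; };

/* Sum all values on the list using the three-parameter iterator. */
static int sk_sum(struct sk_hlist_head *head)
{
    struct sk_item *it;
    int sum = 0;

    sk_hlist_for_each_entry(it, head, node)
        sum += it->value;
    return sum;
}
```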
     
  • Convert to the much saner new idr interface.

    Note that the adhoc cyclic id allocation is buggy. If wraparound
    happens, the previous code with idr_get_new_above() may segfault and
    the converted code will trigger WARN and return -EINVAL. Even if it's
    fixed to wrap to zero, the code will be prone to unnecessary -ENOSPC
    failures after the first wraparound. We probably need to implement
    proper cyclic support in idr.

    Signed-off-by: Tejun Heo
    Cc: John McCutchan
    Cc: Robert Love
    Cc: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • idr_destroy() can destroy idr by itself and idr_remove_all() is being
    deprecated. Drop its usage.

    Signed-off-by: Tejun Heo
    Cc: John McCutchan
    Cc: Robert Love
    Cc: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

23 Feb, 2013

1 commit


22 Feb, 2013

1 commit

  • Running the command:

    inotifywait -e unmount /mnt/disk

    immediately aborts with a -EINVAL return code. This is however a valid
    parameter. This abort occurs only if unmount is the sole event
    parameter. If other event parameters are supplied, then the unmount
    event wait will work.

    The problem was introduced by commit 44b350fc23e ("inotify: Fix mask
    checks"). In that commit, it states:

    The mask checks in inotify_update_existing_watch() and
    inotify_new_watch() are useless because inotify_arg_to_mask()
    sets FS_IN_IGNORED and FS_EVENT_ON_CHILD bits anyway.

    But instead of removing the useless checks, it did this:

    mask = inotify_arg_to_mask(arg);
    - if (unlikely(!mask))
    + if (unlikely(!(mask & IN_ALL_EVENTS)))
    return -EINVAL;

    The problem is that IN_ALL_EVENTS doesn't include IN_UNMOUNT, and other
    parts of the code keep IN_UNMOUNT separate from IN_ALL_EVENTS. So the
    check should be:

    if (unlikely(!(mask & (IN_ALL_EVENTS | IN_UNMOUNT))))

    But inotify_arg_to_mask(arg) always sets the IN_UNMOUNT bit in the mask
    anyway, so the check is always going to pass and thus should simply be
    removed. Also note that inotify_arg_to_mask completely controls what
    mask bits get set from arg, there's no way for invalid bits to get
    enabled there.

    Let's fix it by simply removing the useless broken checks.

    Signed-off-by: Jim Somerville
    Signed-off-by: Paul Gortmaker
    Cc: Jerome Marchand
    Cc: John McCutchan
    Cc: Robert Love
    Cc: Eric Paris
    Cc: [2.6.37+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jim Somerville
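
    The failure mode is easy to reproduce with the mask values alone.
    The sketch below uses the numeric values of IN_ALL_EVENTS and
    IN_UNMOUNT from the inotify userspace header under SK_ names; the
    two predicate functions are hypothetical and only mirror the two
    conditions quoted in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_IN_ALL_EVENTS 0x00000fffu  /* value of IN_ALL_EVENTS */
#define SK_IN_UNMOUNT    0x00002000u  /* value of IN_UNMOUNT */

/* The check as introduced by the earlier commit: IN_UNMOUNT is not part
 * of IN_ALL_EVENTS, so an unmount-only mask is rejected. */
static bool sk_broken_check_rejects(uint32_t mask)
{
    return !(mask & SK_IN_ALL_EVENTS);
}

/* The widened form mentioned in the message, which accepts it. */
static bool sk_widened_check_rejects(uint32_t mask)
{
    return !(mask & (SK_IN_ALL_EVENTS | SK_IN_UNMOUNT));
}
```

    Adding any other event bit to the mask hides the problem, which
    matches the observation that the abort occurs only when unmount is
    the sole event parameter.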
     

21 Dec, 2012

1 commit

  • Pull filesystem notification updates from Eric Paris:
    "This pull mostly is about locking changes in the fsnotify system. By
    switching the group lock from a spin_lock() to a mutex() we can now
    hold the lock across things like iput(). This fixes a problem
    involving unmounting a fs and having inodes be busy, first pointed out
    by FAT, but reproducible with tmpfs.

    This also restores signal driven I/O for inotify, which has been
    broken since about 2.6.32."

    Ugh. I *hate* the timing of this. It was rebased after the merge
    window opened, and then left to sit with the pull request coming the day
    before the merge window closes. That's just crap. But apparently the
    patches themselves have been around for over a year, just gathering
    dust, so now it's suddenly critical.

    Fixed up semantic conflict in fs/notify/fdinfo.c as per Stephen
    Rothwell's fixes from -next.

    * 'for-next' of git://git.infradead.org/users/eparis/notify:
    inotify: automatically restart syscalls
    inotify: dont skip removal of watch descriptor if creation of ignored event failed
    fanotify: dont merge permission events
    fsnotify: make fasync generic for both inotify and fanotify
    fsnotify: change locking order
    fsnotify: dont put marks on temporary list when clearing marks by group
    fsnotify: introduce locked versions of fsnotify_add_mark() and fsnotify_remove_mark()
    fsnotify: pass group to fsnotify_destroy_mark()
    fsnotify: use a mutex instead of a spinlock to protect a groups mark list
    fanotify: add an extra flag to mark_remove_from_mask that indicates wheather a mark should be destroyed
    fsnotify: take groups mark_lock before mark lock
    fsnotify: use reference counting for groups
    fsnotify: introduce fsnotify_get_group()
    inotify, fanotify: replace fsnotify_put_group() with fsnotify_destroy_group()

    Linus Torvalds
     

18 Dec, 2012

2 commits

  • The kernel keeps the FAN_MARK_IGNORED_SURV_MODIFY bit separately from
    fsnotify_mark::mask|ignored_mask, so put it in the @mflags (mark flags)
    field so that a user-space reader can detect whether this bit was
    used when the mark was created.

    | pos: 0
    | flags: 04002
    | fanotify flags:10 event-flags:0
    | fanotify mnt_id:12 mflags:40 mask:38 ignored_mask:40000003
    | fanotify ino:4f969 sdev:800013 mflags:0 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:69f90400c275b5b4

    Signed-off-by: Cyrill Gorcunov
    Cc: Pavel Emelyanov
    Cc: Oleg Nesterov
    Cc: Andrey Vagin
    Cc: Al Viro
    Cc: Alexey Dobriyan
    Cc: James Bottomley
    Cc: "Aneesh Kumar K.V"
    Cc: Matthew Helsley
    Cc: "J. Bruce Fields"
    Cc: Tvrtko Ursulin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cyrill Gorcunov
     
  • This allows us to print out fsnotify details such as watchee inode,
    device, mask and optionally a file handle.

    For inotify objects, if the kernel is compiled with exportfs support,
    the output will be

    | pos: 0
    | flags: 02000000
    | inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:7e9e0000640d1b6d
    | inotify wd:2 ino:a111 sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:11a1000020542153
    | inotify wd:1 ino:6b149 sdev:800013 mask:800afce ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:49b1060023552153

    If the kernel is compiled without exportfs support, the file handle
    won't be provided, only the inode and device.

    | pos: 0
    | flags: 02000000
    | inotify wd:3 ino:9e7e sdev:800013 mask:800afce ignored_mask:0
    | inotify wd:2 ino:a111 sdev:800013 mask:800afce ignored_mask:0
    | inotify wd:1 ino:6b149 sdev:800013 mask:800afce ignored_mask:0

    For fanotify the output is like

    | pos: 0
    | flags: 04002
    | fanotify flags:10 event-flags:0
    | fanotify mnt_id:12 mask:3b ignored_mask:0
    | fanotify ino:50205 sdev:800013 mask:3b ignored_mask:40000000 fhandle-bytes:8 fhandle-type:1 f_handle:05020500fb1d47e7

    To minimize impact on general fsnotify code the new functionality
    is gathered in fs/notify/fdinfo.c file.

    Signed-off-by: Cyrill Gorcunov
    Acked-by: Pavel Emelyanov
    Cc: Oleg Nesterov
    Cc: Andrey Vagin
    Cc: Al Viro
    Cc: Alexey Dobriyan
    Cc: James Bottomley
    Cc: "Aneesh Kumar K.V"
    Cc: Alexey Dobriyan
    Cc: Matthew Helsley
    Cc: "J. Bruce Fields"
    Cc: "Aneesh Kumar K.V"
    Cc: Tvrtko Ursulin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cyrill Gorcunov
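
    A user-space reader of the inotify lines shown in this commit could
    be sketched as below. The structure and function names are
    hypothetical, and the sketch assumes all numeric fields are
    hexadecimal, as the sample values (for example ino:9e7e) suggest.
    The optional file handle fields are ignored.

```c
#include <stdio.h>

/* Hypothetical holder for the fixed fields of one "inotify" line. */
struct sk_inotify_fdinfo {
    unsigned int  wd;
    unsigned long ino;
    unsigned int  sdev;
    unsigned int  mask;
    unsigned int  ignored_mask;
};

/* Parse one fdinfo line. Returns 0 on success, -1 if the line is not
 * an inotify entry in the expected form. Trailing fields such as
 * fhandle-bytes are left unparsed. */
static int sk_parse_inotify_line(const char *line,
                                 struct sk_inotify_fdinfo *out)
{
    int n = sscanf(line,
                   "inotify wd:%x ino:%lx sdev:%x mask:%x ignored_mask:%x",
                   &out->wd, &out->ino, &out->sdev,
                   &out->mask, &out->ignored_mask);

    return n == 5 ? 0 : -1;
}
```

    Lines that start with a different prefix, such as the fanotify
    entries, fail the literal match and are reported as -1.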