01 Jul, 2016

1 commit

  • (Another one for the f_path debacle.)

    ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask.

    The reason is that generic_add_lease() used filp->f_path.dentry->inode
    while all the others use file_inode(). This makes a difference for files
    opened on overlayfs since the former will point to the overlay inode the
    latter to the underlying inode.

    So generic_add_lease() added the lease to the overlay inode and
    generic_delete_lease() removed it from the underlying inode. When the file
    was released the lease remained on the overlay inode's lock list, resulting
    in use after free.

    Reported-by: Eryu Guan
    Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
    Cc:
    Signed-off-by: Miklos Szeredi
    Reviewed-by: Jeff Layton
    Signed-off-by: J. Bruce Fields

    Miklos Szeredi
     

23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
     

13 Jan, 2016

1 commit

  • Pull vfs copy_file_range updates from Al Viro:
    "Several series around copy_file_range/CLONE"

    * 'work.copy_file_range' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    btrfs: use new dedupe data function pointer
    vfs: hoist the btrfs deduplication ioctl to the vfs
    vfs: wire up compat ioctl for CLONE/CLONE_RANGE
    cifs: avoid unused variable and label
    nfsd: implement the NFSv4.2 CLONE operation
    nfsd: Pass filehandle to nfs4_preprocess_stateid_op()
    vfs: pull btrfs clone API to vfs layer
    locks: new locks_mandatory_area calling convention
    vfs: Add vfs_copy_file_range() support for pagecache copies
    btrfs: add .copy_file_range file operation
    x86: add sys_copy_file_range to syscall tables
    vfs: add copy_file_range syscall and vfs helper

    Linus Torvalds
     

09 Jan, 2016

5 commits


08 Jan, 2016

1 commit

  • Dmitry reported that he was able to reproduce the WARN_ON_ONCE that
    fires in locks_free_lock_context when the flc_posix list isn't empty.

    The problem turns out to be that we're basically rebuilding the
    file_lock from scratch in fcntl_setlk when we discover that the setlk
    has raced with a close. If the l_whence field is SEEK_CUR or SEEK_END,
    then we may end up with fl_start and fl_end values that differ from
    when the lock was initially set, if the file position or length of the
    file has changed in the interim.

    Fix this by just reusing the same lock request structure, and simply
    override fl_type value with F_UNLCK as appropriate. That ensures that
    we really are unlocking the lock that was initially set.

    While we're there, make sure that we do pop a WARN_ON_ONCE if the
    removal ever fails. Also return -EBADF in this event, since that's
    what we would have returned if the close had happened earlier.

    Cc: Alexander Viro
    Cc:
    Fixes: c293621bbf67 (stale POSIX lock handling)
    Reported-by: Dmitry Vyukov
    Signed-off-by: Jeff Layton
    Acked-by: "J. Bruce Fields"

    Jeff Layton
     

18 Dec, 2015

1 commit

  • The Kconfig currently controlling compilation of this code is:

    config FILE_LOCKING
    bool "Enable POSIX file locking API" if EXPERT

    ...meaning that it currently is not being built as a module by anyone.

    Lets remove the couple traces of modularity so that when reading the
    driver there is no doubt it is builtin-only.

    Since module_init translates to device_initcall in the non-modular
    case, the init ordering gets bumped to one level earlier when we
    use the more appropriate fs_initcall here. However we've made similar
    changes before without any fallout and none is expected here either.

    Cc: Jeff Layton
    Acked-by: Jeff Layton
    Cc: "J. Bruce Fields"
    Cc: Alexander Viro
    Cc: linux-fsdevel@vger.kernel.org
    Signed-off-by: Paul Gortmaker
    Signed-off-by: Jeff Layton

    Paul Gortmaker
     

08 Dec, 2015

1 commit


18 Nov, 2015

1 commit


16 Nov, 2015

1 commit

  • Mandatory locking appears to be almost unused and buggy and there
    appears no real interest in doing anything with it. Since effectively
    no one uses the code and since the code is buggy let's allow it to be
    disabled at compile time. I would just suggest removing the code but
    undoubtedly that will break some piece of userspace code somewhere.

    For the distributions that don't care about this piece of code
    this gives a nice starting point to make mandatory locking go away.

    Cc: Benjamin Coddington
    Cc: Dmitry Vyukov
    Cc: Jeff Layton
    Cc: J. Bruce Fields
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Jeff Layton

    Jeff Layton
     

23 Oct, 2015

3 commits


15 Oct, 2015

1 commit


21 Sep, 2015

1 commit

  • locks_get_lock_context() uses cmpxchg() to install i_flctx.
    cmpxchg() is a release operation which is correct. But it uses
    a plain load to load i_flctx. This is incorrect. Subsequent loads
    from i_flctx can hoist above the load of i_flctx pointer itself
    and observe uninitialized garbage there. This in turn can lead
    to corruption of ctx->flc_lock and other members.

    Documentation/memory-barriers.txt explicitly requires to use
    a barrier in such context:
    "A load-load control dependency requires a full read memory barrier".

    Use smp_load_acquire() in locks_get_lock_context() and in bunch
    of other functions that can proceed concurrently with
    locks_get_lock_context().

    The data race was found with KernelThreadSanitizer (KTSAN).

    Signed-off-by: Dmitry Vyukov
    Signed-off-by: Jeff Layton

    Dmitry Vyukov
     

01 Sep, 2015

1 commit


13 Jul, 2015

3 commits


17 Apr, 2015

1 commit

  • Let's show locks which are associated with a file descriptor in
    its fdinfo file.

    Currently we don't have a reliable way to determine who holds a lock. We
    can find some information in /proc/locks, but PID which is reported there
    can be wrong. For example, a process takes a lock, then forks a child and
    dies. In this case /proc/locks contains the parent pid, which can be
    reused by another process.

    $ cat /proc/locks
    ...
    6: FLOCK ADVISORY WRITE 324 00:13:13431 0 EOF
    ...

    $ ps -C rpcbind
    PID TTY TIME CMD
    332 ? 00:00:00 rpcbind

    $ cat /proc/332/fdinfo/4
    pos: 0
    flags: 0100000
    mnt_id: 22
    lock: 1: FLOCK ADVISORY WRITE 324 00:13:13431 0 EOF

    $ ls -l /proc/332/fd/4
    lr-x------ 1 root root 64 Mar 5 14:43 /proc/332/fd/4 -> /run/rpcbind.lock

    $ ls -l /proc/324/fd/
    total 0
    lrwx------ 1 root root 64 Feb 27 14:50 0 -> /dev/pts/0
    lrwx------ 1 root root 64 Feb 27 14:50 1 -> /dev/pts/0
    lrwx------ 1 root root 64 Feb 27 14:49 2 -> /dev/pts/0

    You can see that the process with the 324 pid doesn't hold the lock.

    This information is required for proper dumping and restoring file
    locks.

    Signed-off-by: Andrey Vagin
    Cc: Jonathan Corbet
    Cc: Alexander Viro
    Acked-by: Jeff Layton
    Acked-by: "J. Bruce Fields"
    Acked-by: Cyrill Gorcunov
    Cc: Pavel Emelyanov
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Vagin
     

03 Apr, 2015

7 commits

  • During the v3.20/v4.0 cycle, I had originally had the code manage the
    inode->i_flctx pointer using a compare-and-swap operation instead of the
    i_lock.

    Sasha Levin though hit a problem while testing with trinity that made me
    believe that that wasn't safe. At the time, changing the code to protect
    the i_flctx pointer seemed to fix the issue, but I now think that was
    just coincidence.

    The issue was likely the same race that Kirill Shutemov hit while
    testing the pre-rc1 v4.0 kernel and that Linus spotted. Due to the way
    that the spinlock was dropped in the middle of flock_lock_file, you
    could end up with multiple flock locks for the same struct file on the
    inode.

    Reinstate the use of a CAS operation to assign this pointer since it's
    likely to be more efficient and gets the i_lock completely out of the
    file locking business.

    Signed-off-by: Jeff Layton

    Jeff Layton
     
  • As Bruce points out, there's no compelling reason to change /proc/locks
    output at this point. If we did want to do this, then we'd almost
    certainly want to introduce a new file to display this info (maybe via
    debugfs?).

    Let's remove the dead WE_CAN_BREAK_LSLK_NOW ifdef here and just plan to
    stay with the legacy format.

    Reported-by: J. Bruce Fields
    Signed-off-by: Jeff Layton

    Jeff Layton
     
  • The current prototypes for these operations are somewhat awkward as they
    deal with fl_owners but take struct file_lock arguments. In the future,
    we'll want to be able to take references without necessarily dealing
    with a struct file_lock.

    Change them to take fl_owner_t arguments instead and have the callers
    deal with assigning the values to the file_lock structs.

    Signed-off-by: Jeff Layton

    Jeff Layton
     
  • In the event that we get an F_UNLCK request on an inode that has no lock
    context, there is no reason to allocate one. Change
    locks_get_lock_context to take a "type" pointer and avoid allocating a
    new context if it's F_UNLCK.

    Then, fix the callers to return appropriately if that function returns
    NULL.

    Signed-off-by: Jeff Layton

    Jeff Layton
     
  • Annonate insert, remove and iterate function that we need
    blocked_lock_lock held.

    Signed-off-by: Daniel Wagner
    Signed-off-by: Jeff Layton

    Daniel Wagner
     
  • We know that the locks being passed into this function are of the
    correct type, now that they live on their own lists.

    Signed-off-by: Jeff Layton

    Jeff Layton
     
  • Since following change

    commit bd61e0a9c852de2d705b6f1bb2cc54c5774db570
    Author: Jeff Layton
    Date: Fri Jan 16 15:05:55 2015 -0500

    locks: convert posix locks to file_lock_context

    all Posix locks are kept on their a separate list, so the test is
    redudant.

    Signed-off-by: Daniel Wagner
    Cc: Jeff Layton
    Cc: "J. Bruce Fields"
    Cc: Alexander Viro
    Signed-off-by: Jeff Layton

    Daniel Wagner
     

27 Mar, 2015

1 commit


14 Mar, 2015

1 commit


05 Mar, 2015

1 commit

  • Commit 8634b51f6ca2 (locks: convert lease handling to file_lock_context)
    introduced a regression in the handling of lease upgrade/downgrades.

    In the event that we already have a lease on a file and are going to
    either upgrade or downgrade it, we skip doing any list insertion or
    deletion and simply re-call lm_setup on the existing lease.

    As of commit 8634b51f6ca2 however, we end up calling lm_setup on the
    lease that was passed in, instead of on the existing lease. This causes
    us to leak the fasync_struct that was allocated in the event that there
    was not already an existing one (as it always appeared that there
    wasn't one).

    Fixes: 8634b51f6ca2 (locks: convert lease handling to file_lock_context)
    Reported-and-Tested-by: Daniel Wagner
    Signed-off-by: Jeff Layton

    Jeff Layton
     

18 Feb, 2015

3 commits

  • In the case where we're splitting a lock in two, the current code
    the new "left" lock in the incorrect spot. It's inserted just
    before "right" when it should instead be inserted just before the
    new lock.

    When we add a new lock, set "fl" to that value so that we can
    add "left" before it.

    Reported-by: Al Viro
    Signed-off-by: Jeff Layton

    Jeff Layton
     
  • As Linus pointed out:

    Say we have an existing flock, and now do a new one that conflicts. I
    see what looks like three separate bugs.

    - We go through the first loop, find a lock of another type, and
    delete it in preparation for replacing it

    - we *drop* the lock context spinlock.

    - BUG #1? So now there is no lock at all, and somebody can come in
    and see that unlocked state. Is that really valid?

    - another thread comes in while the first thread dropped the lock
    context lock, and wants to add its own lock. It doesn't see the
    deleted or pending locks, so it just adds it

    - the first thread gets the context spinlock again, and adds the lock
    that replaced the original

    - BUG #2? So now there are *two* locks on the thing, and the next
    time you do an unlock (or when you close the file), it will only
    remove/replace the first one.

    ...remove the "drop the spinlock" code in the middle of this function as
    it has always been suspicious. This should eliminate the potential race
    that can leave two locks for the same struct file on the list.

    He also pointed out another thing as a bug -- namely that you
    flock_lock_file removes the lock from the list unconditionally when
    doing a lock upgrade, without knowing whether it'll be able to set the
    new lock. Bruce pointed out that this is expected behavior and may help
    prevent certain deadlock situations.

    We may want to revisit that at some point, but it's probably best that
    we do so in the context of a different patchset.

    Reported-by: Linus Torvalds
    Signed-off-by: Jeff Layton

    Jeff Layton
     
  • We don't want to remove all leases just because one filp was closed.

    Signed-off-by: Jeff Layton

    Jeff Layton
     

17 Feb, 2015

1 commit


03 Feb, 2015

2 commits

  • This (ab-)uses the file locking code to allow filesystems to recall
    outstanding pNFS layouts on a file. This new lease type is similar but
    not quite the same as FL_DELEG. A FL_LAYOUT lease can always be granted,
    an a per-filesystem lock (XFS iolock for the initial implementation)
    ensures not FL_LAYOUT leases granted when we would need to recall them.

    Also included are changes that allow multiple outstanding read
    leases of different types on the same file as long as they have a
    differnt owner. This wasn't a problem until now as nfsd never set
    FL_LEASE leases, and no one else used FL_DELEG leases, but given that
    nfsd will also issues FL_LAYOUT leases we will have to handle it now.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     
  • Just like for other lock types we should allow different owners to have
    a read lease on a file. Currently this can't happen, but with the addition
    of pNFS layout leases we'll need this feature.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     

22 Jan, 2015

1 commit