13 Apr, 2014

1 commit

  • Pull vfs updates from Al Viro:
    "The first vfs pile, with deep apologies for being very late in this
    window.

    Assorted cleanups and fixes, plus a large preparatory part of iov_iter
    work. There's a lot more of that, but it'll probably go into the next
    merge window - it *does* shape up nicely, removes a lot of
    boilerplate, gets rid of locking inconsistencie between aio_write and
    splice_write and I hope to get Kent's direct-io rewrite merged into
    the same queue, but some of the stuff after this point is having
    (mostly trivial) conflicts with the things already merged into
    mainline and with some I want more testing.

    This one passes LTP and xfstests without regressions, in addition to
    usual beating. BTW, readahead02 in ltp syscalls testsuite has started
    giving failures since "mm/readahead.c: fix readahead failure for
    memoryless NUMA nodes and limit readahead pages" - might be a false
    positive, might be a real regression..."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    missing bits of "splice: fix racy pipe->buffers uses"
    cifs: fix the race in cifs_writev()
    ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
    kill generic_file_buffered_write()
    ocfs2_file_aio_write(): switch to generic_perform_write()
    ceph_aio_write(): switch to generic_perform_write()
    xfs_file_buffered_aio_write(): switch to generic_perform_write()
    export generic_perform_write(), start getting rid of generic_file_buffer_write()
    generic_file_direct_write(): get rid of ppos argument
    btrfs_file_aio_write(): get rid of ppos
    kill the 5th argument of generic_file_buffered_write()
    kill the 4th argument of __generic_file_aio_write()
    lustre: don't open-code kernel_recvmsg()
    ocfs2: don't open-code kernel_recvmsg()
    drbd: don't open-code kernel_recvmsg()
    constify blk_rq_map_user_iov() and friends
    lustre: switch to kernel_sendmsg()
    ocfs2: don't open-code kernel_sendmsg()
    take iov_iter stuff to mm/iov_iter.c
    process_vm_access: tidy up a bit
    ...

    Linus Torvalds
     

05 Apr, 2014

1 commit

  • Pull file locking updates from Jeff Layton:
    "Highlights:

    - maintainership change for fs/locks.c. Willy's not interested in
    maintaining it these days, and is OK with Bruce and I taking it.
    - fix for open vs setlease race that Al ID'ed
    - cleanup and consolidation of file locking code
    - eliminate unneeded BUG() call
    - merge of file-private lock implementation"

    * 'locks-3.15' of git://git.samba.org/jlayton/linux:
    locks: make locks_mandatory_area check for file-private locks
    locks: fix locks_mandatory_locked to respect file-private locks
    locks: require that flock->l_pid be set to 0 for file-private locks
    locks: add new fcntl cmd values for handling file private locks
    locks: skip deadlock detection on FL_FILE_PVT locks
    locks: pass the cmd value to fcntl_getlk/getlk64
    locks: report l_pid as -1 for FL_FILE_PVT locks
    locks: make /proc/locks show IS_FILE_PVT locks as type "FLPVT"
    locks: rename locks_remove_flock to locks_remove_file
    locks: consolidate checks for compatible filp->f_mode values in setlk handlers
    locks: fix posix lock range overflow handling
    locks: eliminate BUG() call when there's an unexpected lock on file close
    locks: add __acquires and __releases annotations to locks_start and locks_stop
    locks: remove "inline" qualifier from fl_link manipulation functions
    locks: clean up comment typo
    locks: close potential race between setlease and open
    MAINTAINERS: update entry for fs/locks.c

    Linus Torvalds
     

02 Apr, 2014

4 commits


31 Mar, 2014

1 commit

  • This function currently removes leases in addition to flock locks and in
    a later patch we'll have it deal with file-private locks too. Rename it
    to locks_remove_file to indicate that it removes locks that are
    associated with a particular struct file, and not just flock locks.

    Acked-by: J. Bruce Fields
    Signed-off-by: Jeff Layton

    Jeff Layton
     

10 Mar, 2014

1 commit

  • Our write() system call has always been atomic in the sense that you get
    the expected thread-safe contiguous write, but we haven't actually
    guaranteed that concurrent writes are serialized wrt f_pos accesses, so
    threads (or processes) that share a file descriptor and use "write()"
    concurrently would quite likely overwrite each others data.

    This violates POSIX.1-2008/SUSv4 Section XSI 2.9.7 that says:

    "2.9.7 Thread Interactions with Regular File Operations

    All of the following functions shall be atomic with respect to each
    other in the effects specified in POSIX.1-2008 when they operate on
    regular files or symbolic links: [...]"

    and one of the effects is the file position update.

    This unprotected file position behavior is not new behavior, and nobody
    has ever cared. Until now. Yongzhi Pan reported unexpected behavior to
    Michael Kerrisk that was due to this.

    This resolves the issue with a f_pos-specific lock that is taken by
    read/write/lseek on file descriptors that may be shared across threads
    or processes.

    Reported-by: Yongzhi Pan
    Reported-by: Michael Kerrisk
    Cc: Al Viro
    Signed-off-by: Linus Torvalds
    Signed-off-by: Al Viro

    Linus Torvalds
     

13 Nov, 2013

1 commit

  • Pull vfs updates from Al Viro:
    "All kinds of stuff this time around; some more notable parts:

    - RCU'd vfsmounts handling
    - new primitives for coredump handling
    - files_lock is gone
    - Bruce's delegations handling series
    - exportfs fixes

    plus misc stuff all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits)
    ecryptfs: ->f_op is never NULL
    locks: break delegations on any attribute modification
    locks: break delegations on link
    locks: break delegations on rename
    locks: helper functions for delegation breaking
    locks: break delegations on unlink
    namei: minor vfs_unlink cleanup
    locks: implement delegations
    locks: introduce new FL_DELEG lock flag
    vfs: take i_mutex on renamed file
    vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
    vfs: don't use PARENT/CHILD lock classes for non-directories
    vfs: pull ext4's double-i_mutex-locking into common code
    exportfs: fix quadratic behavior in filehandle lookup
    exportfs: better variable name
    exportfs: move most of reconnect_path to helper function
    exportfs: eliminate unused "noprogress" counter
    exportfs: stop retrying once we race with rename/remove
    exportfs: clear DISCONNECTED on all parents sooner
    exportfs: more detailed comment for path_reconnect
    ...

    Linus Torvalds
     

09 Nov, 2013

1 commit


25 Oct, 2013

1 commit


20 Oct, 2013

1 commit

  • Background: nfsd v[23] had throughput regression since delayed fput
    went in; every read or write ends up doing fput() and we get a pair
    of extra context switches out of that (plus quite a bit of work
    in queue_work itselfi, apparently). Use of schedule_delayed_work()
    gives it a chance to accumulate a bit before we do __fput() on all
    of them. I'm not too happy about that solution, but... on at least
    one real-world setup it reverts about 10% throughput loss we got from
    switch to delayed fput.

    Signed-off-by: Al Viro

    Al Viro
     

12 Sep, 2013

1 commit


04 Sep, 2013

1 commit


13 Jul, 2013

2 commits

  • fput() and delayed_fput() can use llist and avoid the locking.

    This is unlikely path, it is not that this change can improve
    the performance, but this way the code looks simpler.

    Signed-off-by: Oleg Nesterov
    Suggested-by: Andrew Morton
    Cc: Al Viro
    Cc: Andrey Vagin
    Cc: "Eric W. Biederman"
    Cc: David Howells
    Cc: Huang Ying
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Oleg Nesterov
     
  • A missed update to "fput: task_work_add() can fail if the caller has
    passed exit_task_work()".

    Cc: "Eric W. Biederman"
    Cc: Al Viro
    Cc: Andrey Vagin
    Cc: David Howells
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Andrew Morton
     

29 Jun, 2013

1 commit


15 Jun, 2013

1 commit

  • fput() assumes that it can't be called after exit_task_work() but
    this is not true, for example free_ipc_ns()->shm_destroy() can do
    this. In this case fput() silently leaks the file.

    Change it to fallback to delayed_fput_work if task_work_add() fails.
    The patch looks complicated but it is not, it changes the code from

    if (PF_KTHREAD) {
    schedule_work(...);
    return;
    }
    task_work_add(...)

    to
    if (!PF_KTHREAD) {
    if (!task_work_add(...))
    return;
    /* fallback */
    }
    schedule_work(...);

    As for shm_destroy() in particular, we could make another fix but I
    think this change makes sense anyway. There could be another similar
    user, it is not safe to assume that task_work_add() can't fail.

    Reported-by: Andrey Vagin
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Al Viro

    Oleg Nesterov
     

02 Mar, 2013

1 commit


23 Feb, 2013

3 commits

  • Allocating a file structure in function get_empty_filp() might fail because
    of several reasons:
    - not enough memory for file structures
    - operation is not allowed
    - user is over its limit

    Currently the function returns NULL in all cases and we loose the exact
    reason of the error. All callers of get_empty_filp() assume that the function
    can fail with ENFILE only.

    Return error through pointer. Change all callers to preserve this error code.

    [AV: cleaned up a bit, carved the get_empty_filp() part out into a separate commit
    (things remaining here deal with alloc_file()), removed pipe(2) behaviour change]

    Signed-off-by: Anatol Pomozov
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Al Viro

    Anatol Pomozov
     
  • Based on parts from Anatol's patch (the rest is the next commit).

    Signed-off-by: Al Viro

    Al Viro
     
  • Signed-off-by: Al Viro

    Al Viro
     

21 Dec, 2012

1 commit

  • File descriptors (even those for writing) do not hold freeze protection.
    Thus mark_files_ro() must call __mnt_drop_write() to only drop protection
    against remount read-only. Calling mnt_drop_write_file() as we do now
    results in:

    [ BUG: bad unlock balance detected! ]
    3.7.0-rc6-00028-g88e75b6 #101 Not tainted
    -------------------------------------
    kworker/1:2/79 is trying to release lock (sb_writers) at:
    [] mnt_drop_write+0x24/0x30
    but there are no more locks to release!

    Reported-by: Zdenek Kabelac
    CC: stable@vger.kernel.org
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     

10 Oct, 2012

1 commit


03 Oct, 2012

1 commit

  • Pull security subsystem updates from James Morris:
    "Highlights:

    - Integrity: add local fs integrity verification to detect offline
    attacks
    - Integrity: add digital signature verification
    - Simple stacking of Yama with other LSMs (per LSS discussions)
    - IBM vTPM support on ppc64
    - Add new driver for Infineon I2C TIS TPM
    - Smack: add rule revocation for subject labels"

    Fixed conflicts with the user namespace support in kernel/auditsc.c and
    security/integrity/ima/ima_policy.c.

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits)
    Documentation: Update git repository URL for Smack userland tools
    ima: change flags container data type
    Smack: setprocattr memory leak fix
    Smack: implement revoking all rules for a subject label
    Smack: remove task_wait() hook.
    ima: audit log hashes
    ima: generic IMA action flag handling
    ima: rename ima_must_appraise_or_measure
    audit: export audit_log_task_info
    tpm: fix tpm_acpi sparse warning on different address spaces
    samples/seccomp: fix 31 bit build on s390
    ima: digital signature verification support
    ima: add support for different security.ima data types
    ima: add ima_inode_setxattr/removexattr function and calls
    ima: add inode_post_setattr call
    ima: replace iint spinblock with rwlock/read_lock
    ima: allocating iint improvements
    ima: add appraise action keywords and default rules
    ima: integrity appraisal extension
    vfs: move ima_file_free before releasing the file
    ...

    Linus Torvalds
     

27 Sep, 2012

1 commit


08 Sep, 2012

1 commit

  • ima_file_free(), called on __fput(), currently flags files that have
    changed, so that the file is re-measured. For appraising a files's
    integrity, the file's hash must be re-calculated and stored in the
    'security.ima' xattr to reflect any changes.

    This patch moves the ima_file_free() call to before releasing the file
    in preparation of ima-appraisal measuring the file and updating the
    'security.ima' xattr.

    Signed-off-by: Mimi Zohar
    Acked-by: Serge Hallyn
    Acked-by: Dmitry Kasatkin

    Mimi Zohar
     

31 Jul, 2012

1 commit

  • Most of places where we want freeze protection coincides with the places where
    we also have remount-ro protection. So make mnt_want_write() and
    mnt_drop_write() (and their _file alternative) prevent freezing as well.
    For the few cases that are really interested only in remount-ro protection
    provide new function variants.

    BugLink: https://bugs.launchpad.net/bugs/897421
    Tested-by: Kamal Mostafa
    Tested-by: Peter M. Petrakis
    Tested-by: Dann Frazier
    Tested-by: Massimo Morana
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     

30 Jul, 2012

1 commit


23 Jul, 2012

1 commit

  • ... and schedule_work() for interrupt/kernel_thread callers
    (and yes, now it *is* OK to call from interrupt).

    We are guaranteed that __fput() will be done before we return
    to userland (or exit). Note that for fput() from a kernel
    thread we get an async behaviour; it's almost always OK, but
    sometimes you might need to have __fput() completed before
    you do anything else. There are two mechanisms for that -
    a general barrier (flush_delayed_fput()) and explicit
    __fput_sync(). Both should be used with care (as was the
    case for fput() from kernel threads all along). See comments
    in fs/file_table.c for details.

    Signed-off-by: Al Viro

    Al Viro
     

14 Jul, 2012

1 commit


30 May, 2012

2 commits

  • lglocks and brlocks are currently generated with some complicated macros
    in lglock.h. But there's no reason to not just use common utility
    functions and put all the data into a common data structure.

    In preparation, this patch changes the API to look more like normal
    function calls with pointers, not magic macros.

    The patch is rather large because I move over all users in one go to keep
    it bisectable. This impacts the VFS somewhat in terms of lines changed.
    But no actual behaviour change.

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Andi Kleen
    Cc: Al Viro
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Rusty Russell
    Signed-off-by: Al Viro

    Andi Kleen
     
  • lglocks and brlocks are currently generated with some complicated macros
    in lglock.h. But there's no reason to not just use common utility
    functions and put all the data into a common data structure.

    Since there are at least two users it makes sense to share this code in a
    library. This is also easier maintainable than a macro forest.

    This will also make it later possible to dynamically allocate lglocks and
    also use them in modules (this would both still need some additional, but
    now straightforward, code)

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Andi Kleen
    Cc: Al Viro
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Rusty Russell
    Signed-off-by: Al Viro

    Andi Kleen
     

21 Mar, 2012

1 commit


07 Jan, 2012

1 commit


27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

17 Mar, 2011

3 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    fix cdev leak on O_PATH final fput()

    Linus Torvalds
     
  • __fput doesn't need a cdev_put() for O_PATH handles.

    Signed-off-by: mszeredi@suse.cz
    Signed-off-by: Al Viro

    Miklos Szeredi
     
  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (33 commits)
    AppArmor: kill unused macros in lsm.c
    AppArmor: cleanup generated files correctly
    KEYS: Add an iovec version of KEYCTL_INSTANTIATE
    KEYS: Add a new keyctl op to reject a key with a specified error code
    KEYS: Add a key type op to permit the key description to be vetted
    KEYS: Add an RCU payload dereference macro
    AppArmor: Cleanup make file to remove cruft and make it easier to read
    SELinux: implement the new sb_remount LSM hook
    LSM: Pass -o remount options to the LSM
    SELinux: Compute SID for the newly created socket
    SELinux: Socket retains creator role and MLS attribute
    SELinux: Auto-generate security_is_socket_class
    TOMOYO: Fix memory leak upon file open.
    Revert "selinux: simplify ioctl checking"
    selinux: drop unused packet flow permissions
    selinux: Fix packet forwarding checks on postrouting
    selinux: Fix wrong checks for selinux_policycap_netpeer
    selinux: Fix check for xfrm selinux context algorithm
    ima: remove unnecessary call to ima_must_measure
    IMA: remove IMA imbalance checking
    ...

    Linus Torvalds
     

15 Mar, 2011

1 commit