23 Feb, 2015

11 commits

  • Pull ext4 fixes from Ted Ts'o:
    "Ext4 bug fixes.

    We also reserved code points for encryption and read-only images (for
    which the implementation is mostly just the reserved code point for a
    read-only feature :-)"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: fix indirect punch hole corruption
    ext4: ignore journal checksum on remount; don't fail
    ext4: remove duplicate remount check for JOURNAL_CHECKSUM change
    ext4: fix mmap data corruption in nodelalloc mode when blocksize < pagesize
    ext4: support read-only images
    ext4: change to use setup_timer() instead of init_timer()
    ext4: reserve codepoints used by the ext4 encryption feature
    jbd2: complain about descriptor block checksum errors

    Linus Torvalds
     
  • Pull more vfs updates from Al Viro:
    "Assorted stuff from this cycle. The big ones here are multilayer
    overlayfs from Miklos and beginning of sorting ->d_inode accesses out
    from David"

    * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits)
    autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation
    procfs: fix race between symlink removals and traversals
    debugfs: leave freeing a symlink body until inode eviction
    Documentation/filesystems/Locking: ->get_sb() is long gone
    trylock_super(): replacement for grab_super_passive()
    fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
    Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
    VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
    SELinux: Use d_is_positive() rather than testing dentry->d_inode
    Smack: Use d_is_positive() rather than testing dentry->d_inode
    TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
    Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
    Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
    VFS: Split DCACHE_FILE_TYPE into regular and special types
    VFS: Add a fallthrough flag for marking virtual dentries
    VFS: Add a whiteout dentry type
    VFS: Introduce inode-getting helpers for layered/unioned fs environments
    Infiniband: Fix potential NULL d_inode dereference
    posix_acl: fix reference leaks in posix_acl_create
    autofs4: Wrong format for printing dentry
    ...

    Linus Torvalds
     
  • X-Coverup: just ask spender
    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro

    Al Viro
     
  • use_pde()/unuse_pde() in ->follow_link()/->put_link() resp.

    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro

    Al Viro
     
  • As it is, we have debugfs_remove() racing with symlink traversals.
    Supply ->evict_inode() and do freeing there - inode will remain
    pinned until we are done with the symlink body.

    And rip the idiocy with checking if dentry is positive right after
    we'd verified debugfs_positive(), which is a stronger check...

    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro

    Al Viro
     
  • I've noticed significant locking contention in the memory reclaimer around
    sb_lock inside grab_super_passive(). grab_super_passive() is called from
    two places: the icache/dcache shrinkers (function super_cache_scan) and
    writeback (function __writeback_inodes_wb). Both are required for
    progress in the memory allocator.

    grab_super_passive() acquires sb_lock to increment sb->s_count and check
    sb->s_instances. It seems sb->s_umount locked for read is enough here:
    super-block deactivation always runs under sb->s_umount locked for write.
    Protecting the super-block itself isn't a problem: in super_cache_scan() sb
    is protected by shrinker_rwsem, so it cannot be freed while its slab
    shrinkers are still active. Inside writeback, the super-block comes from an
    inode on the bdi writeback list under wb->list_lock.

    This patch removes locking sb_lock and checks s_instances under s_umount:
    generic_shutdown_super() unlinks it under sb->s_umount locked for write.
    New variant is called trylock_super() and since it only locks semaphore,
    callers must call up_read(&sb->s_umount) instead of drop_super(sb) when
    they're done.

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Al Viro

    Konstantin Khlebnikov
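
    The calling convention described in this commit can be sketched in
    userspace. This is a minimal single-threaded model, not kernel code:
    struct model_rwsem, struct model_sb and every model_* function are
    hypothetical stand-ins for sb->s_umount, the super-block and
    trylock_super(). It only illustrates that the liveness check happens
    under the read lock and that the caller releases the lock itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded stand-in for a read-write semaphore. */
struct model_rwsem {
    int readers;   /* number of read holders */
    bool writer;   /* true while held for write */
};

/* Hypothetical stand-in for a super-block. */
struct model_sb {
    struct model_rwsem s_umount; /* models sb->s_umount */
    bool on_instances;           /* models sb still being linked on s_instances */
};

/* Non-blocking read acquire: fails if a writer holds the semaphore. */
static bool model_down_read_trylock(struct model_rwsem *sem)
{
    if (sem->writer)
        return false;
    sem->readers++;
    return true;
}

static void model_up_read(struct model_rwsem *sem)
{
    assert(sem->readers > 0);
    sem->readers--;
}

/*
 * Take s_umount for read without blocking, then check under the lock
 * that the super-block has not been unlinked. On success the caller
 * holds s_umount for read.
 */
static bool model_trylock_super(struct model_sb *sb)
{
    if (!model_down_read_trylock(&sb->s_umount))
        return false;
    if (sb->on_instances)
        return true;
    model_up_read(&sb->s_umount);
    return false;
}

/* Caller pattern: skip the super-block if it cannot be pinned. */
static int model_scan(struct model_sb *sb)
{
    if (!model_trylock_super(sb))
        return -1;
    /* ... work that needs the super-block to stay mounted ... */
    model_up_read(&sb->s_umount); /* only the lock is dropped; no count was taken */
    return 0;
}
```

    In this model, model_scan() returns 0 for a live super-block and -1
    both when a writer holds s_umount and when the super-block has
    already been unlinked; the reader count is back to zero in every case.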
     
  • Fanotify probably doesn't want to watch autodirs so make it use d_can_lookup()
    rather than d_is_dir() when checking a dir watch and give an error on fake
    directories.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Fix up the following scripted S_ISDIR/S_ISREG/S_ISLNK conversions (or lack
    thereof) in cachefiles:

    (1) Cachefiles mostly wants to use d_can_lookup() rather than d_is_dir() as
    it doesn't want to deal with automounts in its cache.

    (2) Coccinelle didn't find S_IS* expressions in ASSERT() statements in
    cachefiles.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Convert the following where appropriate:

    (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

    (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

    (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
    complicated than it appears as some calls should be converted to
    d_can_lookup() instead. The difference is whether the directory in
    question is a real dir with a ->lookup op or whether it's a fake dir with
    a ->d_automount op.

    In some circumstances, we can subsume checks for dentry->d_inode not being
    NULL into this, provided the code isn't in a filesystem that expects
    d_inode to be NULL if the dirent really *is* negative (i.e. if we're going to
    use d_inode() rather than d_backing_inode() to get the inode pointer).

    Note that the dentry type field may be set to something other than
    DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
    manages the fall-through from a negative dentry to a lower layer. In such a
    case, the dentry type of the negative union dentry is set to the same as the
    type of the lower dentry.

    However, if you know d_inode is not NULL at the call site, then you can use
    the d_is_xxx() functions even in a filesystem.

    There is one further complication: a 0,0 chardev dentry may be labelled
    DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
    intended for special directory entry types that don't have attached inodes.

    The following perl+coccinelle script was used:

    use strict;

    my @callers;
    my $fd;
    open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
        die "Can't grep for S_ISDIR and co. callers";
    @callers = <$fd>;
    close($fd);
    unless (@callers) {
        print "No matches\n";
        exit(0);
    }

    my @cocci = (
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISLNK(E->d_inode->i_mode)',
    '+ d_is_symlink(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISDIR(E->d_inode->i_mode)',
    '+ d_is_dir(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISREG(E->d_inode->i_mode)',
    '+ d_is_reg(E)' );

    my $coccifile = "tmp.sp.cocci";
    open($fd, ">$coccifile") || die $coccifile;
    print($fd "$_\n") || die $coccifile foreach (@cocci);
    close($fd);

    foreach my $file (@callers) {
        chomp $file;
        print "Processing ", $file, "\n";
        system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
            die "spatch failed";
    }

    [AV: overlayfs parts skipped]

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Split DCACHE_FILE_TYPE into DCACHE_REGULAR_TYPE (dentries representing regular
    files) and DCACHE_SPECIAL_TYPE (representing blockdev, chardev, FIFO and
    socket files).

    d_is_reg() and d_is_special() are added to detect these subtypes and
    d_is_file() is left as the union of the two.

    This allows a number of places that use S_ISREG(dentry->d_inode->i_mode) to
    use d_is_reg(dentry) instead.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Add a DCACHE_FALLTHRU flag to indicate that, in a layered filesystem, this is
    a virtual dentry that covers another one in a lower layer that should be used
    instead. This may be recorded on medium if directory integration is stored
    there.

    The flag can be set with d_set_fallthru() and tested with d_is_fallthru().

    Original-author: Valerie Aurora
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

22 Feb, 2015

3 commits

  • …rnel/git/dgc/linux-xfs

    Pull xfs pnfs block layout support from Dave Chinner:
    "This contains the changes to XFS needed to support the PNFS block
    layout server that you pulled in through Bruce's NFS server tree
    merge.

    I originally thought that I'd need to merge changes into the NFS
    server side, but Bruce had already picked them up and so this is
    purely changes to the fs/xfs/ codebase.

    Summary:

    This update contains the implementation of the PNFS server export
    methods that enable use of XFS filesystems as a block layout target"

    * tag 'xfs-pnfs-for-linus-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
    xfs: recall pNFS layouts on conflicting access
    xfs: implement pNFS export operations

    Linus Torvalds
     
  • Pull more NFS client updates from Trond Myklebust:
    "Highlights include:

    - Fix a use-after-free in decode_cb_sequence_args()
    - Fix a compile error when #undef CONFIG_PROC_FS
    - NFSv4.1 backchannel spinlocking issue
    - Cleanups in the NFS unstable write code requested by Linus
    - NFSv4.1 fix issues when the server denies our backchannel request
    - Cleanups in create_session and bind_conn_to_session"

    * tag 'nfs-for-3.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    NFSv4.1: Clean up bind_conn_to_session
    NFSv4.1: Always set up a forward channel when binding the session
    NFSv4.1: Don't set up a backchannel if the server didn't agree to do so
    NFSv4.1: Clean up create_session
    pnfs: Refactor the *_layout_mark_request_commit to use pnfs_layout_mark_request_commit
    NFSv4: Kill unused nfs_inode->delegation_state field
    NFS: struct nfs_commit_info.lock must always point to inode->i_lock
    nfs: Can call nfs_clear_page_commit() instead
    nfs: Provide and use helper functions for marking a page as unstable
    SUNRPC: Always manipulate rpc_rqst::rq_bc_pa_list under xprt->bc_pa_lock
    SUNRPC: Fix a compile error when #undef CONFIG_PROC_FS
    NFSv4.1: Convert open-coded array allocation calls to kmalloc_array()
    NFSv4.1: Fix a kfree() of uninitialised pointers in decode_cb_sequence_args

    Linus Torvalds
     
  • Pull misc x86 fixes from Ingo Molnar:
    "This contains:

    - EFI fixes
    - a boot printout fix
    - ASLR/kASLR fixes
    - intel microcode driver fixes
    - other misc fixes

    Most of the linecount comes from an EFI revert"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mm/ASLR: Avoid PAGE_SIZE redefinition for UML subarch
    x86/microcode/intel: Handle truncated microcode images more robustly
    x86/microcode/intel: Guard against stack overflow in the loader
    x86, mm/ASLR: Fix stack randomization on 64-bit systems
    x86/mm/init: Fix incorrect page size in init_memory_mapping() printks
    x86/mm/ASLR: Propagate base load address calculation
    Documentation/x86: Fix path in zero-page.txt
    x86/apic: Fix the devicetree build in certain configs
    Revert "efi/libstub: Call get_memory_map() to obtain map and desc sizes"
    x86/efi: Avoid triple faults during EFI mixed mode calls

    Linus Torvalds
     

20 Feb, 2015

9 commits

  • …szeredi/vfs into for-next

    Al Viro
     
  • get_acl gets a reference which we must release in the error cases.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Omar Sandoval
    Signed-off-by: Al Viro

    Omar Sandoval
     
  • %pD for struct file*, %pd for struct dentry*.

    Fixes: a455589f181e ("assorted conversions to %p[dD]")
    Signed-off-by: Rasmus Villemoes
    Signed-off-by: Al Viro

    Rasmus Villemoes
     
  • Signed-off-by: Bastien Nocera
    Signed-off-by: Al Viro

    Bastien Nocera
     
  • pr_fmt is already defined as below in fs/aio.c, so remove the duplicate
    function name from the pr_debug message.

    #define pr_fmt(fmt) "%s: " fmt, __func__

    Signed-off-by: Kinglong Mee
    Signed-off-by: Al Viro

    Kinglong Mee
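
    Why the function name becomes redundant can be shown with a
    userspace sketch. The pr_fmt define is the one quoted above;
    model_pr_debug() and aio_demo() are hypothetical stand-ins that
    format into a buffer rather than calling the kernel's pr_debug().
    The ##__VA_ARGS__ form is a GNU extension accepted by gcc and clang.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same shape as the define quoted above: prefix with the caller's name. */
#define pr_fmt(fmt) "%s: " fmt, __func__

/* Hypothetical stand-in for pr_debug(): format into a caller buffer. */
#define model_pr_debug(buf, size, fmt, ...) \
    snprintf(buf, size, pr_fmt(fmt), ##__VA_ARGS__)

/* Hypothetical caller; the message does not spell out its own name. */
static void aio_demo(char *buf, size_t size)
{
    model_pr_debug(buf, size, "nr_events=%d", 3);
}
```

    Because pr_fmt() already supplies __func__, a message that also
    printed the function name by hand would show it twice.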
     
  • Code that does this:

    if (!(d_unhashed(dentry) && dentry->d_inode)) {
    ...
    simple_unlink(parent->d_inode, dentry);
    }

    is broken because:

    !(d_unhashed(dentry) && dentry->d_inode)

    is equivalent to:

    !d_unhashed(dentry) || !dentry->d_inode

    so it is possible to get into simple_unlink() with dentry->d_inode == NULL.

    simple_unlink(), however, assumes dentry->d_inode cannot be NULL.

    I think that what was meant is this:

    !d_unhashed(dentry) && dentry->d_inode

    and that the logical-not operator or the final close-bracket was misplaced.

    Signed-off-by: David Howells
    cc: Joel Becker
    Signed-off-by: Al Viro

    David Howells
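
    The logic error described above can be checked exhaustively. In this
    sketch the two booleans stand for d_unhashed(dentry) and
    dentry->d_inode != NULL; the function names are hypothetical and
    exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* The condition as written in the broken code. */
static bool cond_as_written(bool unhashed, bool has_inode)
{
    return !(unhashed && has_inode);
}

/* Its De Morgan expansion. */
static bool cond_expanded(bool unhashed, bool has_inode)
{
    return !unhashed || !has_inode;
}

/* The condition the commit message suggests was intended. */
static bool cond_intended(bool unhashed, bool has_inode)
{
    return !unhashed && has_inode;
}

/* Count the cases where the branch is entered although the inode is NULL. */
static int count_null_inode_entries(bool (*cond)(bool, bool))
{
    int n = 0;
    for (int unhashed = 0; unhashed <= 1; unhashed++)
        if (cond(unhashed != 0, false))
            n++;
    return n;
}
```

    The written form enters the branch for both values of the unhashed
    flag when the inode is NULL, which is the case simple_unlink() does
    not expect; the intended form never does.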
     
  • Only ->open() should be there (always failing, of course). We never
    replace ->f_op of an already opened struct file, so there's no way
    for any of those methods to be called.

    Signed-off-by: Al Viro

    Al Viro
     
  • Pull btrfs updates from Chris Mason:
    "This pull is mostly cleanups and fixes:

    - The raid5/6 cleanups from Zhao Lei fix up some long-standing warts
    in the code and add improvements on top of the scrubbing support
    from 3.19.

    - Josef has round one of our ENOSPC fixes coming from large btrfs
    clusters here at FB.

    - Dave Sterba continues a long series of cleanups (thanks Dave), and
    Filipe continues hammering on corner cases in fsync and others

    This all was held up a little trying to track down a use-after-free in
    btrfs raid5/6. It's not clear yet if this is just made easier to
    trigger with this pull or if it's a new bug from the raid5/6 cleanups.
    Dave Sterba is the only one to trigger it so far, but he has a
    consistent way to reproduce, so we'll get it nailed shortly"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (68 commits)
    Btrfs: don't remove extents and xattrs when logging new names
    Btrfs: fix fsync data loss after adding hard link to inode
    Btrfs: fix BUG_ON in btrfs_orphan_add() when delete unused block group
    Btrfs: account for large extents with enospc
    Btrfs: don't set and clear delalloc for O_DIRECT writes
    Btrfs: only adjust outstanding_extents when we do a short write
    btrfs: Fix out-of-space bug
    Btrfs: scrub, fix sleep in atomic context
    Btrfs: fix scheduler warning when syncing log
    Btrfs: Remove unnecessary placeholder in btrfs_err_code
    btrfs: cleanup init for list in free-space-cache
    btrfs: delete chunk allocation attemp when setting block group ro
    btrfs: clear bio reference after submit_one_bio()
    Btrfs: fix scrub race leading to use-after-free
    Btrfs: add missing cleanup on sysfs init failure
    Btrfs: fix race between transaction commit and empty block group removal
    btrfs: add more checks to btrfs_read_sys_array
    btrfs: cleanup, rename a few variables in btrfs_read_sys_array
    btrfs: add checks for sys_chunk_array sizes
    btrfs: more superblock checks, lower bounds on devices and sectorsize/nodesize
    ...

    Linus Torvalds
     
  • Pull Ceph changes from Sage Weil:
    "On the RBD side, there is a conversion to blk-mq from Christoph,
    several long-standing bug fixes from Ilya, and some cleanup from
    Rickard Strandqvist.

    On the CephFS side there is a long list of fixes from Zheng, including
    improved session handling, a few IO path fixes, some dcache management
    correctness fixes, and several blocking while !TASK_RUNNING fixes.

    The core code gets a few cleanups and Chaitanya has added support for
    TCP_NODELAY (which has been used on the server side for ages but we
    somehow missed on the kernel client).

    There is also an update to MAINTAINERS to fix up some email addresses
    and reflect that Ilya and Zheng are doing most of the maintenance for
    RBD and CephFS these days. Do not be surprised to see a pull request
    come from one of them in the future if I am unavailable for some
    reason"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (27 commits)
    MAINTAINERS: update Ceph and RBD maintainers
    libceph: kfree() in put_osd() shouldn't depend on authorizer
    libceph: fix double __remove_osd() problem
    rbd: convert to blk-mq
    ceph: return error for traceless reply race
    ceph: fix dentry leaks
    ceph: re-send requests when MDS enters reconnecting stage
    ceph: show nocephx_require_signatures and notcp_nodelay options
    libceph: tcp_nodelay support
    rbd: do not treat standalone as flatten
    ceph: fix atomic_open snapdir
    ceph: properly mark empty directory as complete
    client: include kernel version in client metadata
    ceph: provide seperate {inode,file}_operations for snapdir
    ceph: fix request time stamp encoding
    ceph: fix reading inline data when i_size > PAGE_SIZE
    ceph: avoid block operation when !TASK_RUNNING (ceph_mdsc_close_sessions)
    ceph: avoid block operation when !TASK_RUNNING (ceph_get_caps)
    ceph: avoid block operation when !TASK_RUNNING (ceph_mdsc_sync)
    rbd: fix error paths in rbd_dev_refresh()
    ...

    Linus Torvalds
     

19 Feb, 2015

17 commits