11 Oct, 2016

2 commits

  • Pull more vfs updates from Al Viro:
    ">rename2() work from Miklos + current_time() from Deepa"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: Replace current_fs_time() with current_time()
    fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
    fs: Replace CURRENT_TIME with current_time() for inode timestamps
    fs: proc: Delete inode time initializations in proc_alloc_inode()
    vfs: Add current_time() api
    vfs: add note about i_op->rename changes to porting
    fs: rename "rename2" i_op to "rename"
    vfs: remove unused i_op->rename
    fs: make remaining filesystems use .rename2
    libfs: support RENAME_NOREPLACE in simple_rename()
    fs: support RENAME_NOREPLACE for local filesystems
    ncpfs: fix unused variable warning

    Linus Torvalds
     
  • Al Viro
     

06 Oct, 2016

1 commit


28 Sep, 2016

1 commit

  • CURRENT_TIME_SEC is not y2038 safe. current_time() will
    be transitioned to use 64 bit time along with vfs in a
    separate patch.
    There is no plan to transistion CURRENT_TIME_SEC to use
    y2038 safe time interfaces.

    current_time() will also be extended to use superblock
    range checking parameters when range checking is introduced.

    This works because alloc_super() fills in the the s_time_gran
    in super block to NSEC_PER_SEC.

    Signed-off-by: Deepa Dinamani
    Acked-by: Jan Kara
    Signed-off-by: Al Viro

    Deepa Dinamani
     

27 Sep, 2016

2 commits

  • Generated patch:

    sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2`
    sed -i "s/\brename2\b/rename/g" `git grep -wl rename2`

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • This is trivial to do:

    - add flags argument to foo_rename()
    - check if flags is zero
    - assign foo_rename() to .rename2 instead of .rename

    This doesn't mean it's impossible to support RENAME_NOREPLACE for these
    filesystems, but it is not trivial, like for local filesystems.
    RENAME_NOREPLACE must guarantee atomicity (i.e. it shouldn't be possible
    for a file to be created on one host while it is overwritten by rename on
    another host).

    Filesystems converted:

    9p, afs, ceph, coda, ecryptfs, kernfs, lustre, ncpfs, nfs, ocfs2, orangefs.

    After this, we can get rid of the duplicate interfaces for rename.

    Signed-off-by: Miklos Szeredi
    Acked-by: Greg Kroah-Hartman
    Acked-by: David Howells [AFS]
    Acked-by: Mike Marshall
    Cc: Eric Van Hensbergen
    Cc: Ilya Dryomov
    Cc: Jan Harkes
    Cc: Tyler Hicks
    Cc: Oleg Drokin
    Cc: Trond Myklebust
    Cc: Mark Fasheh

    Miklos Szeredi
     

30 May, 2016

1 commit


03 May, 2016

1 commit


24 Jan, 2016

1 commit

  • Pull final vfs updates from Al Viro:

    - The ->i_mutex wrappers (with small prereq in lustre)

    - a fix for too early freeing of symlink bodies on shmem (they need to
    be RCU-delayed) (-stable fodder)

    - followup to dedupe stuff merged this cycle

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: abort dedupe loop if fatal signals are pending
    make sure that freeing shmem fast symlinks is RCU-delayed
    wrappers for ->i_mutex access
    lustre: remove unused declaration

    Linus Torvalds
     

23 Jan, 2016

2 commits

  • There are many locations that do

    if (memory_was_allocated_by_vmalloc)
    vfree(ptr);
    else
    kfree(ptr);

    but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory
    using is_vmalloc_addr(). Unless callers have special reasons, we can
    replace this branch with kvfree(). Please check and reply if you found
    problems.

    Signed-off-by: Tetsuo Handa
    Acked-by: Michal Hocko
    Acked-by: Jan Kara
    Acked-by: Russell King
    Reviewed-by: Andreas Dilger
    Acked-by: "Rafael J. Wysocki"
    Acked-by: David Rientjes
    Cc: "Luck, Tony"
    Cc: Oleg Drokin
    Cc: Boris Petkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     
  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
     

15 Jan, 2016

1 commit

  • Mark those kmem allocations that are known to be easily triggered from
    userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
    memcg. For the list, see below:

    - threadinfo
    - task_struct
    - task_delay_info
    - pid
    - cred
    - mm_struct
    - vm_area_struct and vm_region (nommu)
    - anon_vma and anon_vma_chain
    - signal_struct
    - sighand_struct
    - fs_struct
    - files_struct
    - fdtable and fdtable->full_fds_bits
    - dentry and external_name
    - inode for all filesystems. This is the most tedious part, because
    most filesystems overwrite the alloc_inode method.

    The list is far from complete, so feel free to add more objects.
    Nevertheless, it should be close to "account everything" approach and
    keep most workloads within bounds. Malevolent users will be able to
    breach the limit, but this was possible even with the former "account
    everything" approach (simply because it did not account everything in
    fact).

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Vladimir Davydov
    Acked-by: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: Tejun Heo
    Cc: Greg Thelen
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

31 Dec, 2015

1 commit


09 Dec, 2015

2 commits

  • new method: ->get_link(); replacement of ->follow_link(). The differences
    are:
    * inode and dentry are passed separately
    * might be called both in RCU and non-RCU mode;
    the former is indicated by passing it a NULL dentry.
    * when called that way it isn't allowed to block
    and should return ERR_PTR(-ECHILD) if it needs to be called
    in non-RCU mode.

    It's a flagday change - the old method is gone, all in-tree instances
    converted. Conversion isn't hard; said that, so far very few instances
    do not immediately bail out when called in RCU mode. That'll change
    in the next commits.

    Signed-off-by: Al Viro

    Al Viro
     
  • kmap() in page_follow_link_light() needed to go - allowing to hold
    an arbitrary number of kmaps for long is a great way to deadlocking
    the system.

    new helper (inode_nohighmem(inode)) needs to be used for pagecache
    symlinks inodes; done for all in-tree cases. page_follow_link_light()
    instrumented to yell about anything missed.

    Signed-off-by: Al Viro

    Al Viro
     

11 Sep, 2015

1 commit

  • Dan Carpenter discovered a buffer overflow in the Coda file system
    readlink code. A userspace file system daemon can return a 4096 byte
    result which then triggers a one byte write past the allocated readlink
    result buffer.

    This does not trigger with an unmodified Coda implementation because Coda
    has a 1024 byte limit for symbolic links, however other userspace file
    systems using the Coda kernel module could be affected.

    Although this is an obvious overflow, I don't think this has to be handled
    as too sensitive from a security perspective because the overflow is on
    the Coda userspace daemon side which already needs root to open Coda's
    kernel device and to mount the file system before we get to the point that
    links can be read.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Jan Harkes
    Reported-by: Dan Carpenter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Harkes
     

24 Jun, 2015

1 commit

  • list_entry is just a wrapper for container_of, but it is arguably
    wrong (and slightly confusing) to use it when the pointed-to struct
    member is not a struct list_head. Use container_of directly instead.

    Signed-off-by: Rasmus Villemoes
    Signed-off-by: Al Viro

    Rasmus Villemoes
     

16 Apr, 2015

1 commit


12 Apr, 2015

2 commits


23 Feb, 2015

1 commit

  • Convert the following where appropriate:

    (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

    (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

    (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
    complicated than it appears as some calls should be converted to
    d_can_lookup() instead. The difference is whether the directory in
    question is a real dir with a ->lookup op or whether it's a fake dir with
    a ->d_automount op.

    In some circumstances, we can subsume checks for dentry->d_inode not being
    NULL into this, provided we the code isn't in a filesystem that expects
    d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
    use d_inode() rather than d_backing_inode() to get the inode pointer).

    Note that the dentry type field may be set to something other than
    DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
    manages the fall-through from a negative dentry to a lower layer. In such a
    case, the dentry type of the negative union dentry is set to the same as the
    type of the lower dentry.

    However, if you know d_inode is not NULL at the call site, then you can use
    the d_is_xxx() functions even in a filesystem.

    There is one further complication: a 0,0 chardev dentry may be labelled
    DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
    intended for special directory entry types that don't have attached inodes.

    The following perl+coccinelle script was used:

    use strict;

    my @callers;
    open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
    die "Can't grep for S_ISDIR and co. callers";
    @callers = ;
    close($fd);
    unless (@callers) {
    print "No matches\n";
    exit(0);
    }

    my @cocci = (
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISLNK(E->d_inode->i_mode)',
    '+ d_is_symlink(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISDIR(E->d_inode->i_mode)',
    '+ d_is_dir(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISREG(E->d_inode->i_mode)',
    '+ d_is_reg(E)' );

    my $coccifile = "tmp.sp.cocci";
    open($fd, ">$coccifile") || die $coccifile;
    print($fd "$_\n") || die $coccifile foreach (@cocci);
    close($fd);

    foreach my $file (@callers) {
    chomp $file;
    print "Processing ", $file, "\n";
    system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
    die "spatch failed";
    }

    [AV: overlayfs parts skipped]

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

18 Feb, 2015

1 commit

  • - Move operation structures to avoid forward declarations.

    - Fix some checkpatch warnings:

    WARNING: Missing a blank line after declarations
    + struct inode *host_inode = file_inode(host_file);
    + mutex_lock(&host_inode->i_mutex);

    ERROR: that open brace { should be on the previous line
    +const struct dentry_operations coda_dentry_operations =
    +{

    ERROR: that open brace { should be on the previous line
    +const struct inode_operations coda_dir_inode_operations =
    +{

    Signed-off-by: Fabian Frederick
    Cc: Jan Harkes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     

21 Jan, 2015

1 commit

  • Since "BDI: Provide backing device capability information [try #3]" the
    backing_dev_info structure also provides flags for the kind of mmap
    operation available in a nommu environment, which is entirely unrelated
    to it's original purpose.

    Introduce a new nommu-only file operation to provide this information to
    the nommu mmap code instead. Splitting this from the backing_dev_info
    structure allows to remove lots of backing_dev_info instance that aren't
    otherwise needed, and entirely gets rid of the concept of providing a
    backing_dev_info for a character device. It also removes the need for
    the mtd_inodefs filesystem.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Tejun Heo
    Acked-by: Brian Norris
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

12 Dec, 2014

1 commit


04 Nov, 2014

1 commit


01 Nov, 2014

1 commit


09 Aug, 2014

1 commit


07 Jun, 2014

4 commits


05 Apr, 2014

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "Major changes for 3.14 include support for the newly added ZERO_RANGE
    and COLLAPSE_RANGE fallocate operations, and scalability improvements
    in the jbd2 layer and in xattr handling when the extended attributes
    spill over into an external block.

    Other than that, the usual clean ups and minor bug fixes"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
    ext4: fix premature freeing of partial clusters split across leaf blocks
    ext4: remove unneeded test of ret variable
    ext4: fix comment typo
    ext4: make ext4_block_zero_page_range static
    ext4: atomically set inode->i_flags in ext4_set_inode_flags()
    ext4: optimize Hurd tests when reading/writing inodes
    ext4: kill i_version support for Hurd-castrated file systems
    ext4: each filesystem creates and uses its own mb_cache
    fs/mbcache.c: doucple the locking of local from global data
    fs/mbcache.c: change block and index hash chain to hlist_bl_node
    ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
    ext4: refactor ext4_fallocate code
    ext4: Update inode i_size after the preallocation
    ext4: fix partial cluster handling for bigalloc file systems
    ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
    ext4: only call sync_filesystm() when remounting read-only
    fs: push sync_filesystem() down to the file system's remount_fs()
    jbd2: improve error messages for inconsistent journal heads
    jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
    jbd2: minimize region locked by j_list_lock in journal_get_create_access()
    ...

    Linus Torvalds
     

04 Apr, 2014

2 commits

  • init_inodecache is only called by __init init_coda

    Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • Reclaim will be leaving shadow entries in the page cache radix tree upon
    evicting the real page. As those pages are found from the LRU, an
    iput() can lead to the inode being freed concurrently. At this point,
    reclaim must no longer install shadow pages because the inode freeing
    code needs to ensure the page tree is really empty.

    Add an address_space flag, AS_EXITING, that the inode freeing code sets
    under the tree lock before doing the final truncate. Reclaim will check
    for this flag before installing shadow pages.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Rik van Riel
    Reviewed-by: Minchan Kim
    Cc: Andrea Arcangeli
    Cc: Bob Liu
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Greg Thelen
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: KOSAKI Motohiro
    Cc: Luigi Semenzato
    Cc: Mel Gorman
    Cc: Metin Doslu
    Cc: Michel Lespinasse
    Cc: Ozgun Erdogan
    Cc: Peter Zijlstra
    Cc: Roman Gushchin
    Cc: Ryan Mallon
    Cc: Tejun Heo
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

13 Mar, 2014

1 commit

  • Previously, the no-op "mount -o mount /dev/xxx" operation when the
    file system is already mounted read-write causes an implied,
    unconditional syncfs(). This seems pretty stupid, and it's certainly
    documented or guaraunteed to do this, nor is it particularly useful,
    except in the case where the file system was mounted rw and is getting
    remounted read-only.

    However, it's possible that there might be some file systems that are
    actually depending on this behavior. In most file systems, it's
    probably fine to only call sync_filesystem() when transitioning from
    read-write to read-only, and there are some file systems where this is
    not needed at all (for example, for a pseudo-filesystem or something
    like romfs).

    Signed-off-by: "Theodore Ts'o"
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Christoph Hellwig
    Cc: Artem Bityutskiy
    Cc: Adrian Hunter
    Cc: Evgeniy Dushistov
    Cc: Jan Kara
    Cc: OGAWA Hirofumi
    Cc: Anders Larsen
    Cc: Phillip Lougher
    Cc: Kees Cook
    Cc: Mikulas Patocka
    Cc: Petr Vandrovec
    Cc: xfs@oss.sgi.com
    Cc: linux-btrfs@vger.kernel.org
    Cc: linux-cifs@vger.kernel.org
    Cc: samba-technical@lists.samba.org
    Cc: codalist@coda.cs.cmu.edu
    Cc: linux-ext4@vger.kernel.org
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: fuse-devel@lists.sourceforge.net
    Cc: cluster-devel@redhat.com
    Cc: linux-mtd@lists.infradead.org
    Cc: jfs-discussion@lists.sourceforge.net
    Cc: linux-nfs@vger.kernel.org
    Cc: linux-nilfs@vger.kernel.org
    Cc: linux-ntfs-dev@lists.sourceforge.net
    Cc: ocfs2-devel@oss.oracle.com
    Cc: reiserfs-devel@vger.kernel.org

    Theodore Ts'o
     

09 Nov, 2013

1 commit


25 Oct, 2013

1 commit


05 Jul, 2013

1 commit


29 Jun, 2013

2 commits