27 Mar, 2016

1 commit

  • This series fixes bugs in nfs and ext4 due to 4bacc9c9234c ("overlayfs:
    Make f_path always point to the overlay and f_inode to the underlay").

    Regular files opened on overlayfs will result in the file being opened on
    the underlying filesystem, while f_path points to the overlayfs
    mount/dentry.

    This confuses filesystems which get the dentry from struct file and assume
    it's theirs.

    Add a new helper, file_dentry() [*], to get the filesystem's own dentry
    from the file. This checks file->f_path.dentry->d_flags against
    DCACHE_OP_REAL, and returns file->f_path.dentry if DCACHE_OP_REAL is not
    set (this is the common, non-overlayfs case).

    In the uncommon case it will call into overlayfs's ->d_real() to get the
    underlying dentry, matching file_inode(file).

    The reason we need to check against the inode is that if the file is copied
    up while being open, d_real() would return the upper dentry, while the open
    file comes from the lower dentry.

    [*] If possible, it's better simply to use file_inode() instead.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Theodore Ts'o
    Tested-by: Goldwyn Rodrigues
    Reviewed-by: Trond Myklebust
    Cc: # v4.2
    Cc: David Howells
    Cc: Al Viro
    Cc: Daniel Axtens

    Miklos Szeredi
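
    The helper logic described above can be modelled in user space. This is
    only a sketch under stated assumptions: the structs are cut-down toy
    versions, the DCACHE_OP_REAL value is made up, and toy_ovl_d_real() is a
    hypothetical stand-in for overlayfs's ->d_real(), not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

#define DCACHE_OP_REAL 0x1u /* toy value; the kernel's flag bit differs */

struct inode { int ino; };

struct dentry {
    unsigned int d_flags;
    struct inode *d_inode;
    /* Toy stand-in for dentry->d_op->d_real(). */
    struct dentry *(*d_real)(struct dentry *dentry, struct inode *inode);
};

struct path { struct dentry *dentry; };

struct file {
    struct path f_path;    /* on overlayfs: the overlay dentry */
    struct inode *f_inode; /* on overlayfs: the underlying inode */
};

static struct inode *file_inode(const struct file *f) { return f->f_inode; }

/* Common case: no flag, so the dentry in f_path is the filesystem's own. */
static struct dentry *file_dentry(const struct file *file)
{
    struct dentry *dentry = file->f_path.dentry;

    assert(dentry != NULL);
    if (dentry->d_flags & DCACHE_OP_REAL)
        return dentry->d_real(dentry, file_inode(file));
    return dentry;
}

/* Hypothetical overlay dentry: remembers both layers. */
struct toy_ovl_dentry {
    struct dentry base;   /* must stay first so the cast below is valid */
    struct dentry *upper; /* NULL until copy-up */
    struct dentry *lower;
};

/* Pick the layer whose inode matches the one the file was opened on. */
static struct dentry *toy_ovl_d_real(struct dentry *dentry, struct inode *inode)
{
    struct toy_ovl_dentry *oe = (struct toy_ovl_dentry *)dentry;

    if (oe->upper && oe->upper->d_inode == inode)
        return oe->upper;
    if (oe->lower && oe->lower->d_inode == inode)
        return oe->lower;
    return oe->upper ? oe->upper : oe->lower;
}
```

    In this model a file opened on the lower layer keeps resolving to the
    lower dentry even after an upper dentry appears, which is the copy-up
    case the commit message calls out.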
     

22 Mar, 2016

7 commits

  • Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • The 'is_merge' name is historical, from when only a single lower layer
    could exist. With the introduction of multiple lower layers the meaning of
    this flag changed to mean only the "lowest layer" (while all lower
    layers were being merged).

    So 'is_merge' is now inaccurate; rename it to 'is_lowest'.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • This patch fixes a newline warning found by the checkpatch.pl tool

    Signed-off-by: Sohom-Bhattacharjee
    Signed-off-by: Miklos Szeredi

    Sohom Bhattacharjee
     
  • In some instances xfs has been created with ftype=0. On such a filesystem,
    if a file on the lower fs is removed, overlay leaves a whiteout in the
    upper fs, but that whiteout does not get filtered out and is visible to
    overlayfs users.

    The reason it does not get filtered out is that the upper filesystem does
    not report the file type of the whiteout as DT_CHR during iterate_dir().

    So it seems to be a requirement that the upper filesystem support d_type
    for overlayfs to work properly. Do this check during mount and fail if
    d_type is not supported.

    Suggested-by: Dave Chinner
    Signed-off-by: Vivek Goyal
    Signed-off-by: Miklos Szeredi

    Vivek Goyal
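
    The failure mode can be illustrated with a small user-space model. It is
    a sketch only: struct toy_entry and both functions are hypothetical, and
    the kernel's actual mount-time check is not reproduced here.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

struct toy_entry {
    const char *name;
    unsigned char d_type; /* as reported by the upper filesystem */
    int is_whiteout;      /* ground truth: a whiteout exists on disk */
};

/*
 * Readdir-side filter that trusts d_type: only DT_CHR entries are even
 * considered as whiteouts, so a DT_UNKNOWN whiteout slips through.
 */
static size_t count_visible(const struct toy_entry *e, size_t n)
{
    size_t visible = 0;

    assert(e != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (e[i].d_type == DT_CHR && e[i].is_whiteout)
            continue; /* filtered out, hidden from the user */
        visible++;
    }
    return visible;
}

/* Toy mount-time check: does any entry carry a real file type? */
static int d_type_supported(const struct toy_entry *e, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (e[i].d_type != DT_UNKNOWN)
            return 1;
    return 0;
}
```

    With entries reported as DT_UNKNOWN the whiteout is counted as visible,
    which is why refusing the mount up front is the safer choice.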
     
  • Print a warning when overlayfs copies up a file if the process that
    triggered the copy up has a R/O fd open to the lower file being copied up.

    This can help catch applications that do things like the following:

    fd1 = open("foo", O_RDONLY);
    fd2 = open("foo", O_RDWR);

    where they expect fd1 and fd2 to refer to the same file - which will no
    longer be the case post-copy up.

    With this patch, the following commands:

    bash 5<>/mnt/a/foo128

    assuming /mnt/a/foo128 to be an un-copied up file on an overlay will
    produce the following warning in the kernel log:

    overlayfs: Copying up foo129, but open R/O on fd 5 which will cease
    to be coherent [pid=3818 bash]

    This is enabled by setting:

    /sys/module/overlay/parameters/check_copy_up

    to 1.

    The warnings are ratelimited and are also limited to one warning per file -
    assuming the copy up completes in each case.

    Signed-off-by: David Howells
    Signed-off-by: Miklos Szeredi

    David Howells
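
    An application that opens the same path twice can check its own
    assumption from user space. The sketch below compares st_dev and st_ino
    of the two descriptors; same_file() and check_open_pair() are
    hypothetical helper names, and whether the comparison detects a copy-up
    depends on how the overlay reports inode numbers.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/* 1 if both descriptors refer to the same inode, 0 if not, -1 on error. */
static int same_file(int fd1, int fd2)
{
    struct stat a, b;

    if (fstat(fd1, &a) != 0 || fstat(fd2, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Repeat the open pattern from the commit message and compare the results. */
static int check_open_pair(const char *path)
{
    assert(path != NULL);

    int fd1 = open(path, O_RDONLY);
    if (fd1 < 0)
        return -1;

    int fd2 = open(path, O_RDWR); /* on overlayfs this may trigger copy-up */
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    int same = same_file(fd1, fd2);
    close(fd2);
    close(fd1);
    return same;
}
```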
     
  • This patch hides the error about a missing lowerdir if MS_SILENT is set.

    We use mount(NULL, "/", "overlay", MS_SILENT, NULL) for testing support of
    overlayfs: the syscall returns -ENODEV if it's not supported. Otherwise the
    kernel automatically loads the module and returns -EINVAL because lowerdir
    is missing.

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Miklos Szeredi

    Konstantin Khlebnikov
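
    The probe described above could be wrapped roughly as follows. The enum
    and function names are hypothetical, and treating every other errno
    (for example EPERM for an unprivileged caller) as "unknown" is an
    assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mount.h>

enum ovl_probe { OVL_UNSUPPORTED, OVL_SUPPORTED, OVL_UNKNOWN };

/* Map the errno of the failed probe mount onto a verdict. */
static enum ovl_probe classify_probe_errno(int err)
{
    assert(err > 0); /* errno values are positive */

    switch (err) {
    case ENODEV:
        return OVL_UNSUPPORTED; /* filesystem type not known */
    case EINVAL:
        return OVL_SUPPORTED;   /* module present, lowerdir missing */
    default:
        return OVL_UNKNOWN;     /* e.g. EPERM when not privileged */
    }
}

/* Issue the probe mount from the commit message. Linux only. */
static enum ovl_probe probe_overlayfs(void)
{
    if (mount(NULL, "/", "overlay", MS_SILENT, NULL) == 0)
        return OVL_UNKNOWN; /* not expected: no lowerdir was given */
    return classify_probe_errno(errno);
}
```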
     
  • Unlink and rename in overlayfs checked the upper dentry for staleness by
    verifying upper->d_parent against upperdir. However the dentry can go
    stale also by being unhashed, for example.

    Expand the verification to actually look up the name again (under parent
    lock) and check if it matches the upper dentry. This matches what the VFS
    does before passing the dentry to filesystem's unlink/rename methods, which
    excludes any inconsistency caused by overlayfs.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

04 Mar, 2016

4 commits

  • Overlayfs must update uid/gid after chown, otherwise functions
    like inode_owner_or_capable() will check the user against a stale uid.
    Caught by xfstests generic/087, which chowns a file and then calls utimes.

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Miklos Szeredi

    Konstantin Khlebnikov
     
  • After a rename, the file's dentry still holds a reference to the lower
    dentry from its previous location. This doesn't matter for data access
    because data comes from the upper dentry. But this stale lower dentry
    taints the dentry at the new location and turns it into non-pure upper.
    Removing such a file leaves a visible whiteout entry in a directory which
    shouldn't have whiteouts at all.

    Overlayfs already tracks pureness of file location in oe->opaque. This
    patch just uses that for detecting actual path type.

    Comment from Vivek Goyal's patch:

    Here are the details of the problem. Do the following.

    $ mkdir upper lower work merged upper/dir/
    $ touch lower/test
    $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merged
    $ mv merged/test merged/dir/
    $ rm merged/dir/test
    $ ls -l merged/dir/
    /usr/bin/ls: cannot access merged/dir/test: No such file or directory
    total 0
    c????????? ? ? ? ? ? test

    Basic problem seems to be that once a file has been unlinked, a whiteout
    has been left behind which was not needed and hence it becomes visible.

    Whiteout is visible because parent dir is not of type MERGE, hence
    od->is_real is set during ovl_dir_open(). And that means ovl_iterate()
    passes on iterate handling directly to underlying fs. Underlying fs does
    not know/filter whiteouts so it becomes visible to user.

    Why did we leave a whiteout to begin with when we should not have?
    ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave
    whiteout if file is pure upper. In this case file is not found to be pure
    upper hence whiteout is left.

    So why was the file not PURE_UPPER in this case? I think because dentry is
    still carrying some leftover state which was valid before rename. For
    example, od->numlower was set to 1 as it was a lower file. After rename,
    this state is not valid anymore as there is no such file in lower.

    Signed-off-by: Konstantin Khlebnikov
    Reported-by: Viktor Stanchev
    Suggested-by: Vivek Goyal
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611
    Acked-by: Vivek Goyal
    Signed-off-by: Miklos Szeredi

    Konstantin Khlebnikov
     
  • ovl_remove_upper() should do d_drop() only after it successfully
    removes the dir, otherwise a subsequent getcwd() system call will
    fail, breaking userspace programs.

    This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491

    Signed-off-by: Rui Wang
    Reviewed-by: Konstantin Khlebnikov
    Signed-off-by: Miklos Szeredi

    Rui Wang
     
  • This adds missing .d_select_inode into alternative dentry_operations.

    Signed-off-by: Konstantin Khlebnikov
    Fixes: 7c03b5d45b8e ("ovl: allow distributed fs as lower layer")
    Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
    Reviewed-by: Nikolay Borisov
    Tested-by: Nikolay Borisov
    Signed-off-by: Miklos Szeredi
    Cc: # 4.2+

    Konstantin Khlebnikov
     

23 Jan, 2016

1 commit

  • Add inode_{lock,unlock,trylock,is_locked,lock_nested} wrappers, parallel
    to mutex_{lock,unlock,trylock,is_locked,lock_nested}, with
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
     

22 Jan, 2016

2 commits

  • Merge third patch-bomb from Andrew Morton:
    "I'm pretty much done for -rc1 now:

    - the rest of MM, basically

    - lib/ updates

    - checkpatch, epoll, hfs, fatfs, ptrace, coredump, exit

    - cpu_mask simplifications

    - kexec, rapidio, MAINTAINERS etc, etc.

    - more dma-mapping cleanups/simplifications from hch"

    * emailed patches from Andrew Morton: (109 commits)
    MAINTAINERS: add/fix git URLs for various subsystems
    mm: memcontrol: add "sock" to cgroup2 memory.stat
    mm: memcontrol: basic memory statistics in cgroup2 memory controller
    mm: memcontrol: do not uncharge old page in page cache replacement
    Documentation: cgroup: add memory.swap.{current,max} description
    mm: free swap cache aggressively if memcg swap is full
    mm: vmscan: do not scan anon pages if memcg swap limit is hit
    swap.h: move memcg related stuff to the end of the file
    mm: memcontrol: replace mem_cgroup_lruvec_online with mem_cgroup_online
    mm: vmscan: pass memcg to get_scan_count()
    mm: memcontrol: charge swap to cgroup2
    mm: memcontrol: clean up alloc, online, offline, free functions
    mm: memcontrol: flatten struct cg_proto
    mm: memcontrol: rein in the CONFIG space madness
    net: drop tcp_memcontrol.c
    mm: memcontrol: introduce CONFIG_MEMCG_LEGACY_KMEM
    mm: memcontrol: allow to disable kmem accounting for cgroup2
    mm: memcontrol: account "kmem" consumers in cgroup2 memory controller
    mm: memcontrol: move kmem accounting code to CONFIG_MEMCG
    mm: memcontrol: separate kmem code from legacy tcp accounting code
    ...

    Linus Torvalds
     
  • Pull overlayfs updates from Miklos Szeredi:
    "This contains several bug fixes and a new mount option
    'default_permissions' that allows read-only exported NFS
    filesystems to be used as lower layer"

    * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
    ovl: check dentry positiveness in ovl_cleanup_whiteouts()
    ovl: setattr: check permissions before copy-up
    ovl: root: copy attr
    ovl: move super block magic number to magic.h
    ovl: use a minimal buffer in ovl_copy_xattr
    ovl: allow zero size xattr
    ovl: default permissions

    Linus Torvalds
     

21 Jan, 2016

1 commit

  • i386 allmodconfig:

    In file included from fs/overlayfs/super.c:10:0:
    fs/overlayfs/super.c: In function 'ovl_fill_super':
    include/linux/fs.h:898:36: error: 'PAGE_CACHE_SIZE' undeclared (first use in this function)
    #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
    ^
    fs/overlayfs/super.c:939:19: note: in expansion of macro 'MAX_LFS_FILESIZE'
    sb->s_maxbytes = MAX_LFS_FILESIZE;
    ^
    include/linux/fs.h:898:36: note: each undeclared identifier is reported only once for each function it appears in
    #define MAX_LFS_FILESIZE (((loff_t)PAGE_CACHE_SIZE << (BITS_PER_LONG-1))-1)
    ^
    fs/overlayfs/super.c:939:19: note: in expansion of macro 'MAX_LFS_FILESIZE'
    sb->s_maxbytes = MAX_LFS_FILESIZE;
    ^

    Cc: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

31 Dec, 2015

1 commit


11 Dec, 2015

2 commits

  • This patch fixes kernel crash at removing directory which contains
    whiteouts from lower layers.

    Cache of directory content passed as "list" contains entries from all
    layers, including whiteouts from lower layers. So, lookup in upper dir
    (moved into work at this stage) will return negative entry. Plus this
    cache is filled long before and we can race with external removal.

    Example:
    mkdir -p lower0/dir lower1/dir upper work overlay
    touch lower0/dir/a lower0/dir/b
    mknod lower1/dir/a c 0 0
    mount -t overlay none overlay -o lowerdir=lower1:lower0,upperdir=upper,workdir=work
    rm -fr overlay/dir

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Miklos Szeredi
    Cc: # 3.18+

    Konstantin Khlebnikov
     
  • Without this copy-up of a file can be forced, even without actually being
    allowed to do anything on the file.

    [Arnd Bergmann] include for PAGE_CACHE_SIZE (used by
    MAX_LFS_FILESIZE definition).

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

09 Dec, 2015

2 commits

  • We copy i_uid and i_gid of underlying inode into overlayfs inode. Except
    for the root inode.

    Fix this omission.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • new method: ->get_link(); replacement of ->follow_link(). The differences
    are:
    * inode and dentry are passed separately
    * might be called both in RCU and non-RCU mode;
      the former is indicated by passing it a NULL dentry.
    * when called that way it isn't allowed to block
      and should return ERR_PTR(-ECHILD) if it needs to be called
      in non-RCU mode.

    It's a flagday change - the old method is gone, all in-tree instances
    converted. Conversion isn't hard; that said, so far very few instances
    do not immediately bail out when called in RCU mode. That'll change
    in the next commits.

    Signed-off-by: Al Viro

    Al Viro
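
    The calling convention listed above can be sketched in user space. This
    is a toy model under stated assumptions: the error-pointer helpers are
    local copies modelled on the kernel's, the structs are invented, and the
    real method takes a further argument that is omitted here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* User-space copies of the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

struct inode {
    const char *cached_link; /* available without blocking */
    const char *slow_link;   /* stands in for data that needs I/O */
};
struct dentry { struct inode *d_inode; };

/* Toy ->get_link(): a NULL dentry means "RCU mode, do not block". */
static const char *toy_get_link(struct dentry *dentry, struct inode *inode)
{
    assert(inode != NULL);

    if (inode->cached_link)
        return inode->cached_link; /* fine in both modes */
    if (!dentry)
        return ERR_PTR(-ECHILD);   /* ask to be called in non-RCU mode */

    inode->cached_link = inode->slow_link; /* pretend this blocked */
    return inode->cached_link;
}

/* Caller side: try RCU mode first, fall back when told to. */
static const char *resolve_link(struct dentry *dentry)
{
    struct inode *inode = dentry->d_inode;
    const char *res = toy_get_link(NULL, inode);

    if (IS_ERR(res) && PTR_ERR(res) == -ECHILD)
        res = toy_get_link(dentry, inode);
    return res;
}
```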
     

07 Dec, 2015

2 commits

  • Signed-off-by: Al Viro

    Al Viro
     
  • [Al Viro] The bug is in being too enthusiastic about optimizing ->setattr()
    away - instead of "copy verbatim with metadata" + "chmod/chown/utimes"
    (with the former being always safe and the latter failing in case of
    insufficient permissions) it tries to combine these two. Note that copyup
    itself will have to do ->setattr() anyway; _that_ is where the elevated
    capabilities are right. Having these two ->setattr() (one to set verbatim
    copy of metadata, another to do what overlayfs ->setattr() had been asked
    to do in the first place) combined is where it breaks.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Al Viro

    Miklos Szeredi
     

11 Nov, 2015

3 commits

  • The overlayfs file system is not recognized by programs like tail
    because the magic number is not in the standard header location.

    Move it so that the value will propagate on for the GNU library
    and utilities. Needs to go in the fstatfs manual page as well.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: Miklos Szeredi

    Stephen Hemminger
     
  • Rather than always allocating the high-order XATTR_SIZE_MAX buffer
    which is costly and prone to failure, only allocate what is needed and
    realloc if necessary.

    Fixes https://github.com/coreos/bugs/issues/489

    Signed-off-by: Miklos Szeredi

    Vito Caputo
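
    The "allocate what is needed, realloc if necessary" pattern can be
    sketched as below. It is not the ovl_copy_xattr() code: toy_getter is a
    hypothetical callback with getxattr(2)-like semantics (size 0 returns
    the needed length, a too-small buffer returns -ERANGE), and
    string_getter() exists only to exercise it.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

typedef ssize_t (*toy_getter)(void *ctx, void *buf, size_t size);

/*
 * Read a value into *bufp, growing the buffer only when it is too small.
 * Returns the value length (0 is a valid, empty value) or a negative errno.
 */
static ssize_t read_value(toy_getter get, void *ctx, void **bufp, size_t *capp)
{
    assert(get && bufp && capp);

    for (;;) {
        ssize_t need = get(ctx, NULL, 0); /* ask for the size only */
        if (need <= 0)
            return need;

        if ((size_t)need > *capp) {
            void *bigger = realloc(*bufp, (size_t)need);
            if (!bigger)
                return -ENOMEM;
            *bufp = bigger;
            *capp = (size_t)need;
        }

        ssize_t len = get(ctx, *bufp, *capp);
        if (len != -ERANGE)
            return len; /* success or a hard error */
        /* The value grew between the two calls: size it again. */
    }
}

/* Example getter: ctx is a NUL-terminated string used as the value. */
static ssize_t string_getter(void *ctx, void *buf, size_t size)
{
    const char *value = ctx;
    size_t len = strlen(value);

    if (size == 0)
        return (ssize_t)len;
    if (size < len)
        return -ERANGE;
    memcpy(buf, value, len);
    return (ssize_t)len;
}
```

    The buffer is reused across calls, so a run over many small values
    allocates once and grows only when a larger value turns up.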
     
  • When ovl_copy_xattr() encountered a zero size xattr no more xattrs were
    copied and the function returned success. This is clearly not the desired
    behavior.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

01 Nov, 2015

1 commit


12 Oct, 2015

6 commits

  • Add mount option "default_permissions" to alter the way permissions are
    calculated.

    Without this option and prior to this patch permissions were calculated by
    underlying lower or upper filesystem.

    With this option the permissions are calculated by overlayfs based on the
    file owner, group and mode bits.

    This has significance for example when a read-only exported NFS filesystem
    is used as a lower layer. In this case the underlying NFS filesystem will
    reply with EROFS, in which case all we know is that the filesystem is
    read-only. But that's not what we are interested in, we are interested in
    whether the access would be allowed if the filesystem wasn't read-only; the
    server doesn't tell us that, and would need updating at various levels,
    which doesn't seem practicable.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
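
    A check "based on the file owner, group and mode bits" amounts to the
    classic owner/group/other test. The sketch below is a simplified
    user-space model, not the overlayfs code: mode_permits() is a
    hypothetical name, and capabilities, ACLs and supplementary groups are
    deliberately ignored.

```c
#include <assert.h>
#include <sys/types.h>

#define MAY_EXEC  1
#define MAY_WRITE 2
#define MAY_READ  4

/* Returns 0 if every access bit in mask is granted, -1 otherwise. */
static int mode_permits(mode_t mode, uid_t owner, gid_t group,
                        uid_t uid, gid_t gid, int mask)
{
    unsigned int bits = mode & 0777;

    assert((mask & ~7) == 0);

    /* Exactly one class applies: owner, else group, else other. */
    if (uid == owner)
        bits >>= 6;
    else if (gid == group)
        bits >>= 3;

    return ((bits & 7u & (unsigned int)mask) == (unsigned int)mask) ? 0 : -1;
}
```

    Nothing in this calculation asks the underlying filesystem, so a
    read-only lower layer cannot answer EROFS in place of the real
    permission result.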
     
  • This fixes memory leak after umount.

    Kmemleak report:

    unreferenced object 0xffff8800ba791010 (size 8):
    comm "mount", pid 2394, jiffies 4294996294 (age 53.920s)
    hex dump (first 8 bytes):
    20 1c 13 02 00 88 ff ff .......
    backtrace:
    [] create_object+0x124/0x2c0
    [] kmemleak_alloc+0x7b/0xc0
    [] __kmalloc+0x106/0x340
    [] ovl_fill_super+0x55c/0x9b0 [overlay]
    [] mount_nodev+0x54/0xa0
    [] ovl_mount+0x18/0x20 [overlay]
    [] mount_fs+0x43/0x170
    [] vfs_kern_mount+0x74/0x170
    [] do_mount+0x22d/0xdf0
    [] SyS_mount+0x7b/0xc0
    [] entry_SYSCALL_64_fastpath+0x12/0x76
    [] 0xffffffffffffffff

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Miklos Szeredi
    Fixes: dd662667e6d3 ("ovl: add mutli-layer infrastructure")
    Cc: # v4.0+

    Konstantin Khlebnikov
     
  • This fixes small memory leak after mount.

    Kmemleak report:

    unreferenced object 0xffff88003683fe00 (size 16):
    comm "mount", pid 2029, jiffies 4294909563 (age 33.380s)
    hex dump (first 16 bytes):
    20 27 1f bb 00 88 ff ff 40 4b 0f 36 02 88 ff ff '......@K.6....
    backtrace:
    [] create_object+0x124/0x2c0
    [] kmemleak_alloc+0x7b/0xc0
    [] __kmalloc+0x106/0x340
    [] ovl_fill_super+0x389/0x9a0 [overlay]
    [] mount_nodev+0x54/0xa0
    [] ovl_mount+0x18/0x20 [overlay]
    [] mount_fs+0x43/0x170
    [] vfs_kern_mount+0x74/0x170
    [] do_mount+0x22d/0xdf0
    [] SyS_mount+0x7b/0xc0
    [] entry_SYSCALL_64_fastpath+0x12/0x76
    [] 0xffffffffffffffff

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Miklos Szeredi
    Fixes: a78d9f0d5d5c ("ovl: support multiple lower layers")
    Cc: # v4.0+

    Konstantin Khlebnikov
     
  • If two overlayfs filesystems are stacked on top of each other, then we need
    recursion in ovl_d_select_inode().

    I guess d_backing_inode() is supposed to do that. But currently it doesn't
    and that functionality is open coded in vfs_open(). This is now copied
    into ovl_d_select_inode() to fix this regression.

    Reported-by: Alban Crequy
    Signed-off-by: Miklos Szeredi
    Fixes: 4bacc9c9234c ("overlayfs: Make f_path always point to the overlay...")
    Cc: David Howells
    Cc: # v4.2+

    Miklos Szeredi
     
  • In ovl_copy_up_locked(), newdentry is leaked if the function exits through
    out_cleanup as this just goes to out after calling ovl_cleanup() - which doesn't
    actually release the ref on newdentry.

    The out_cleanup segment should instead exit through out2 as certainly
    newdentry leaks - and possibly upper does also, though this isn't caught
    given the catch of newdentry.

    Without this fix, something like the following is seen:

    BUG: Dentry ffff880023e9eb20{i=f861,n=#ffff880023e82d90} still in use (1) [unmount of tmpfs tmpfs]
    BUG: Dentry ffff880023ece640{i=0,n=bigfile} still in use (1) [unmount of tmpfs tmpfs]

    when unmounting the upper layer after an error occurred in copyup.

    An error can be induced by creating a big file in a lower layer with
    something like:

    dd if=/dev/zero of=/lower/a/bigfile bs=65536 count=1 seek=$((0xf000))

    to create a large file (4.1G). Overlay an upper layer that is too small
    (on tmpfs might do) and then induce a copy up by opening it writably.

    Reported-by: Ulrich Obergfell
    Signed-off-by: David Howells
    Signed-off-by: Miklos Szeredi
    Cc: # v3.18+

    David Howells
     
  • Open the lower file with O_LARGEFILE in ovl_copy_up().

    Pass O_LARGEFILE unconditionally in ovl_copy_up_data() as it's purely for
    catching 32-bit userspace dealing with a file large enough that it'll be
    mishandled if the application isn't aware that there might be an integer
    overflow. Inside the kernel, there shouldn't be any problems.

    Reported-by: Ulrich Obergfell
    Signed-off-by: David Howells
    Signed-off-by: Miklos Szeredi
    Cc: # v3.18+

    David Howells
     

05 Sep, 2015

1 commit

  • Many file systems that implement the show_options hook fail to correctly
    escape their output which could lead to unescaped characters (e.g. new
    lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This
    could lead to confusion, spoofed entries (resulting in things like
    systemd issuing false d-bus "mount" notifications), and who knows what
    else. This looks like it would only be the root user stepping on
    themselves, but it's possible weird things could happen in containers or
    in other situations with delegated mount privileges.

    Here's an example using overlay with setuid fusermount trusting the
    contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use
    of "sudo" is something more sneaky:

    $ BASE="ovl"
    $ MNT="$BASE/mnt"
    $ LOW="$BASE/lower"
    $ UP="$BASE/upper"
    $ WORK="$BASE/work/ 0 0
    none /proc fuse.pwn user_id=1000"
    $ mkdir -p "$LOW" "$UP" "$WORK"
    $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
    $ cat /proc/mounts
    none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
    none /proc fuse.pwn user_id=1000 0 0
    $ fusermount -u /proc
    $ cat /proc/mounts
    cat: /proc/mounts: No such file or directory

    This fixes the problem by adding new seq_show_option and
    seq_show_option_n helpers, and updating the vulnerable show_option
    handlers to use them as needed. Some, like SELinux, need to be open
    coded due to unusual existing escape mechanisms.

    [akpm@linux-foundation.org: add lost chunk, per Kees]
    [keescook@chromium.org: seq_show_option should be using const parameters]
    Signed-off-by: Kees Cook
    Acked-by: Serge Hallyn
    Acked-by: Jan Kara
    Acked-by: Paul Moore
    Cc: J. R. Okajima
    Signed-off-by: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
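
    The kind of escaping that closes this hole can be sketched in user
    space. escape_option() is a hypothetical helper, not the kernel's
    seq_show_option(); the octal \ooo form and the escape set passed in by
    the caller are assumptions of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, replacing every byte listed in esc with a backslash
 * followed by three octal digits. Returns the length written, or -1 if
 * dst is too small.
 */
static int escape_option(char *dst, size_t dstlen, const char *src,
                         const char *esc)
{
    size_t out = 0;

    assert(dst && src && esc);

    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;

        if (strchr(esc, c)) {
            if (out + 4 >= dstlen)
                return -1;
            snprintf(dst + out, 5, "\\%03o", (unsigned int)c);
            out += 4;
        } else {
            if (out + 1 >= dstlen)
                return -1;
            dst[out++] = (char)c;
        }
    }
    if (out >= dstlen)
        return -1;
    dst[out] = '\0';
    return (int)out;
}
```

    Once the newline and spaces in the crafted workdir value are escaped,
    the option stays on one line and can no longer pose as a second mount
    entry.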
     

12 Jul, 2015

1 commit


05 Jul, 2015

1 commit

  • Pull more vfs updates from Al Viro:
    "Assorted VFS fixes and related cleanups (IMO the most interesting in
    that part are f_path-related things and Eric's descriptor-related
    stuff). UFS regression fixes (it got broken last cycle). 9P fixes.
    fs-cache series, DAX patches, Jan's file_remove_suid() work"

    [ I'd say this is much more than "fixes and related cleanups". The
    file_table locking rule change by Eric Dumazet is a rather big and
    fundamental update even if the patch isn't huge. - Linus ]

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
    9p: cope with bogus responses from server in p9_client_{read,write}
    p9_client_write(): avoid double p9_free_req()
    9p: forgetting to cancel request on interrupted zero-copy RPC
    dax: bdev_direct_access() may sleep
    block: Add support for DAX reads/writes to block devices
    dax: Use copy_from_iter_nocache
    dax: Add block size note to documentation
    fs/file.c: __fget() and dup2() atomicity rules
    fs/file.c: don't acquire files->file_lock in fd_install()
    fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
    vfs: avoid creation of inode number 0 in get_next_ino
    namei: make set_root_rcu() return void
    make simple_positive() public
    ufs: use dir_pages instead of ufs_dir_pages()
    pagemap.h: move dir_pages() over there
    remove the pointless include of lglock.h
    fs: cleanup slight list_entry abuse
    xfs: Correctly lock inode when removing suid and file capabilities
    fs: Call security_ops->inode_killpriv on truncate
    fs: Provide function telling whether file_remove_privs() will do anything
    ...

    Linus Torvalds
     

03 Jul, 2015

1 commit

  • Pull overlayfs updates from Miklos Szeredi:
    "This relaxes the requirements on the lower layer filesystem: now ones
    that implement .d_revalidate, such as NFS, can be used.

    Upper layer filesystems still have the "no .d_revalidate" requirement.

    Also a bad interaction with jffs2 locking has been fixed"

    * 'overlayfs-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
    ovl: lookup whiteouts outside iterate_dir()
    ovl: allow distributed fs as lower layer
    ovl: don't traverse automount points

    Linus Torvalds
     

23 Jun, 2015

1 commit

  • Pull vfs updates from Al Viro:
    "In this pile: pathname resolution rewrite.

    - recursion in link_path_walk() is gone.

    - nesting limits on symlinks are gone (the only limit remaining is
    that the total amount of symlinks is no more than 40, no matter how
    nested).

    - "fast" (inline) symlinks are handled without leaving rcuwalk mode.

    - stack footprint (independent of the nesting) is below kilobyte now,
    about on par with what it used to be with one level of nested
    symlinks and ~2.8 times lower than it used to be in the worst case.

    - struct nameidata is entirely private to fs/namei.c now (not even
    opaque pointers are being passed around).

    - ->follow_link() and ->put_link() calling conventions had been
    changed; all in-tree filesystems converted, out-of-tree should be
    able to follow reasonably easily.

    For out-of-tree conversions, see Documentation/filesystems/porting
    for details (and in-tree filesystems for examples of conversion).

    That has sat in -next since mid-May, seems to survive all testing
    without regressions and merges clean with v4.1"

    * 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (131 commits)
    turn user_{path_at,path,lpath,path_dir}() into static inlines
    namei: move saved_nd pointer into struct nameidata
    inline user_path_create()
    inline user_path_parent()
    namei: trim do_last() arguments
    namei: stash dfd and name into nameidata
    namei: fold path_cleanup() into terminate_walk()
    namei: saner calling conventions for filename_parentat()
    namei: saner calling conventions for filename_create()
    namei: shift nameidata down into filename_parentat()
    namei: make filename_lookup() reject ERR_PTR() passed as name
    namei: shift nameidata inside filename_lookup()
    namei: move putname() call into filename_lookup()
    namei: pass the struct path to store the result down into path_lookupat()
    namei: uninline set_root{,_rcu}()
    namei: be careful with mountpoint crossings in follow_dotdot_rcu()
    Documentation: remove outdated information from automount-support.txt
    get rid of assorted nameidata-related debris
    lustre: kill unused helper
    lustre: kill unused macro (LOOKUP_CONTINUE)
    ...

    Linus Torvalds
     

22 Jun, 2015

2 commits

  • jffs2 can deadlock on overlayfs readdir because it takes the same lock
    on ->iterate() as in ->lookup().

    Fix by moving whiteout checking outside iterate_dir(). Optimized by
    collecting potential whiteouts (DT_CHR) in a temporary list and, if
    non-empty, iterating through these and checking for a 0/0 chardev.

    Signed-off-by: Miklos Szeredi
    Fixes: 49c21e1cacd7 ("ovl: check whiteout while reading directory")
    Reported-by: Roman Yeryomin

    Miklos Szeredi
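
    The two-step check described above (cheap d_type filter first, full
    check later) looks roughly like this from user space. Both function
    names are hypothetical and the sketch works on struct stat rather than
    on kernel dentries.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* First pass, cheap enough to run during directory iteration. */
static int maybe_whiteout(unsigned char d_type)
{
    return d_type == DT_CHR;
}

/* Second pass, run afterwards on the collected candidates only. */
static int is_whiteout(const struct stat *st)
{
    assert(st != NULL);
    return S_ISCHR(st->st_mode) &&
           major(st->st_rdev) == 0 && minor(st->st_rdev) == 0;
}
```

    Only entries that pass maybe_whiteout() need the second, more expensive
    lookup, which is what lets that lookup move out from under the
    directory iteration.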
     
  • Allow filesystems with .d_revalidate as lower layer(s), but not as upper
    layer.

    For local filesystems the rule was that modifications on the layers
    directly while being part of the overlay results in undefined behavior.

    This can easily be extended to distributed filesystems: we assume the tree
    used as lower layer is static, which means ->d_revalidate() should always
    return "1". If that is not the case, return -ESTALE, don't try to work
    around the modification.

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi