03 Apr, 2019

1 commit

  • commit 73601ea5b7b18eb234219ae2adf77530f389da79 upstream.

    syzbot is hitting lockdep warning [1] due to trying to open a fifo
    during an execve() operation. But we don't need to open non regular
    files during an execve() operation, for all files which we will need are
    the executable file itself and the interpreter programs like /bin/sh and
    ld-linux.so.2 .

    Since the manpage for execve(2) says that execve() returns EACCES when
    the file or a script interpreter is not a regular file, and the manpage
    for uselib(2) says that uselib() can return EACCES, and we use
    FMODE_EXEC when opening for execve()/uselib(), we can bail out if a non
    regular file is requested with FMODE_EXEC set.

    Since this deadlock followed by khungtaskd warnings is trivially
    reproducible by a local unprivileged user, and syzbot's frequent crash
    due to this deadlock defers finding other bugs, let's workaround this
    deadlock until we get a chance to find a better solution.

    [1] https://syzkaller.appspot.com/bug?id=b5095bfec44ec84213bac54742a82483aad578ce

    Link: http://lkml.kernel.org/r/1552044017-7890-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
    Reported-by: syzbot
    Fixes: 8924feff66f35fe2 ("splice: lift pipe_lock out of splice_to_pipe()")
    Signed-off-by: Tetsuo Handa
    Acked-by: Kees Cook
    Cc: Al Viro
    Cc: Eric Biggers
    Cc: Dmitry Vyukov
    Cc: [4.9+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Tetsuo Handa
     

22 Aug, 2018

1 commit

  • Pull overlayfs updates from Miklos Szeredi:
    "This contains two new features:

    - Stack file operations: this allows removal of several hacks from
    the VFS, proper interaction of read-only open files with copy-up,
    possibility to implement fs modifying ioctls properly, and others.

    - Metadata only copy-up: when file is on lower layer and only
    metadata is modified (except size) then only copy up the metadata
    and continue to use the data from the lower file"

    * tag 'ovl-update-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (66 commits)
    ovl: Enable metadata only feature
    ovl: Do not do metacopy only for ioctl modifying file attr
    ovl: Do not do metadata only copy-up for truncate operation
    ovl: add helper to force data copy-up
    ovl: Check redirect on index as well
    ovl: Set redirect on upper inode when it is linked
    ovl: Set redirect on metacopy files upon rename
    ovl: Do not set dentry type ORIGIN for broken hardlinks
    ovl: Add an inode flag OVL_CONST_INO
    ovl: Treat metacopy dentries as type OVL_PATH_MERGE
    ovl: Check redirects for metacopy files
    ovl: Move some dir related ovl_lookup_single() code in else block
    ovl: Do not expose metacopy only dentry from d_real()
    ovl: Open file with data except for the case of fsync
    ovl: Add helper ovl_inode_realdata()
    ovl: Store lower data inode in ovl_inode
    ovl: Fix ovl_getattr() to get number of blocks from lower
    ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
    ovl: Copy up meta inode data from lowest data inode
    ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
    ...

    Linus Torvalds
     

18 Jul, 2018

5 commits


12 Jul, 2018

11 commits


11 Jul, 2018

2 commits

  • An ->open() instances really, really should not be doing that. There's
    a lot of places e.g. around atomic_open() that could be confused by that,
    so let's catch that early.

    Acked-by: Linus Torvalds
    Signed-off-by: Al Viro

    Al Viro
     
  • it's exactly the same thing as
    dentry_open(&file->f_path, file->f_flags, file->f_cred)

    ... and rename it to file_clone_open(), while we are at it.
    'filp' naming convention is bogus; sure, it's "file pointer",
    but we generally don't do that kind of Hungarian notation.
    Some of the instances have too many callers to touch, but this
    one has only two, so let's sanitize it while we can...

    Acked-by: Linus Torvalds
    Signed-off-by: Al Viro

    Al Viro
     

04 Jun, 2018

1 commit

  • This reverts commit cab64df194667dc5d9d786f0a895f647f5501c0d.

    Having vfs_open() in some cases drop the reference to
    struct file combined with

    error = vfs_open(path, f, cred);
    if (error) {
    put_filp(f);
    return ERR_PTR(error);
    }
    return f;

    is flat-out wrong. It used to be

    error = vfs_open(path, f, cred);
    if (!error) {
    /* from now on we need fput() to dispose of f */
    error = open_check_o_direct(f);
    if (error) {
    fput(f);
    f = ERR_PTR(error);
    }
    } else {
    put_filp(f);
    f = ERR_PTR(error);
    }

    and sure, having that open_check_o_direct() boilerplate gotten rid of is
    nice, but not that way...

    Worse, another call chain (via finish_open()) is FUBAR now wrt
    FILE_OPENED handling - in that case we get error returned, with file
    already hit by fput() *AND* FILE_OPENED not set. Guess what happens in
    path_openat(), when it hits

    if (!(opened & FILE_OPENED)) {
    BUG_ON(!error);
    put_filp(file);
    }

    The root cause of all that crap is that the callers of do_dentry_open()
    have no way to tell which way did it fail; while that could be fixed up
    (by passing something like int *opened to do_dentry_open() and have it
    marked if we'd called ->open()), it's probably much too late in the
    cycle to do so right now.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

07 Apr, 2018

1 commit

  • Pull misc vfs updates from Al Viro:
    "Assorted stuff, including Christoph's I_DIRTY patches"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: move I_DIRTY_INODE to fs.h
    ubifs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
    ntfs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
    gfs2: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) calls
    fs: fold open_check_o_direct into do_dentry_open
    vfs: Replace stray non-ASCII homoglyph characters with their ASCII equivalents
    vfs: make sure struct filename->iname is word-aligned
    get rid of pointless includes of fs_struct.h
    [poll] annotate SAA6588_CMD_POLL users

    Linus Torvalds
     

03 Apr, 2018

10 commits

  • Using the ksys_fallocate() wrapper allows us to get rid of in-kernel
    calls to the sys_fallocate() syscall. The ksys_ prefix denotes that this
    function is meant as a drop-in replacement for the syscall. In
    particular, it uses the same calling convention as sys_fallocate().

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     
  • Using the ksys_truncate() wrapper allows us to get rid of in-kernel
    calls to the sys_truncate() syscall. The ksys_ prefix denotes that this
    function is meant as a drop-in replacement for the syscall. In
    particular, it uses the same calling convention as sys_truncate().

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     
  • Using this wrapper allows us to avoid the in-kernel calls to the
    sys_open() syscall. The ksys_ prefix denotes that this function is meant
    as a drop-in replacement for the syscall. In particular, it uses the
    same calling convention as sys_open().

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     
  • Using the ksys_close() wrapper allows us to get rid of in-kernel calls
    to the sys_close() syscall. The ksys_ prefix denotes that this function
    is meant as a drop-in replacement for the syscall. In particular, it
    uses the same calling convention as sys_close(), with one subtle
    difference:

    The few places which checked the return value did not care about the return
    value re-writing in sys_close(), so simply use a wrapper around
    __close_fd().

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     
  • Using the ksys_ftruncate() wrapper allows us to get rid of in-kernel
    calls to the sys_ftruncate() syscall. The ksys_ prefix denotes that this
    function is meant as a drop-in replacement for the syscall. In
    particular, it uses the same calling convention as sys_ftruncate().

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     
  • Using the fs-interal do_fchownat() wrapper allows us to get rid of
    fs-internal calls to the sys_fchownat() syscall.

    Introducing the ksys_fchown() helper and the ksys_{,}chown() wrappers
    allows us to avoid the in-kernel calls to the sys_{,l,f}chown() syscalls.
    The ksys_ prefix denotes that these functions are meant as a drop-in
    replacement for the syscalls. In particular, they use the same calling
    convention as sys_{,l,f}chown().

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     
  • Using the fs-internal do_faccessat() helper allows us to get rid of
    fs-internal calls to the sys_faccessat() syscall.

    Introducing the ksys_access() wrapper allows us to avoid the in-kernel
    calls to the sys_access() syscall. The ksys_ prefix denotes that this
    function is meant as a drop-in replacement for the syscall. In
    particular, it uses the same calling convention as sys_access().

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     
  • … in-kernel calls to syscall

    Using the fs-internal do_fchmodat() helper allows us to get rid of
    fs-internal calls to the sys_fchmodat() syscall.

    Introducing the ksys_fchmod() helper and the ksys_chmod() wrapper allows
    us to avoid the in-kernel calls to the sys_fchmod() and sys_chmod()
    syscalls. The ksys_ prefix denotes that these functions are meant as a
    drop-in replacement for the syscalls. In particular, they use the same
    calling convention as sys_fchmod() and sys_chmod().

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro <viro@zeniv.linux.org.uk>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>

    Dominik Brodowski
     
  • Using this helper allows us to avoid the in-kernel calls to the sys_chdir()
    syscall. The ksys_ prefix denotes that this function is meant as a drop-in
    replacement for the syscall. In particular, it uses the same calling
    convention as sys_chdir().

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Al Viro
    Cc: Andrew Morton
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     
  • Using this helper allows us to avoid the in-kernel calls to the
    sys_chroot() syscall. The ksys_ prefix denotes that this function is
    meant as a drop-in replacement for the syscall. In particular, it uses the
    same calling convention as sys_chroot().

    In the near future, the fs-external callers of ksys_chroot() should be
    converted to use kern_path()/set_fs_root() directly. Then ksys_chroot()
    can be moved within sys_chroot() again.

    This patch is part of a series which removes in-kernel calls to syscalls.
    On this basis, the syscall entry path can be streamlined. For details, see
    http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

    Cc: Alexander Viro
    Signed-off-by: Dominik Brodowski

    Dominik Brodowski
     

28 Mar, 2018

1 commit


05 Sep, 2017

2 commits

  • Problem with ioctl() is that it's a file operation, yet often used as an
    inode operation (i.e. modify the inode despite the file being opened for
    read-only).

    mnt_want_write_file() is used by filesystems in such cases to get write
    access on an arbitrary open file.

    Since overlayfs lets filesystems do all file operations, including ioctl,
    this can lead to mnt_want_write_file() returning OK for a lower file and
    modification of that lower file.

    This patch prevents modification by checking if the file is from an
    overlayfs lower layer and returning EPERM in that case.

    Need to introduce a mnt_want_write_file_path() variant that still does the
    old thing for inode operations that can do the copy up + modification
    correctly in such cases (fchown, fsetxattr, fremovexattr).

    This does not address the correctness of such ioctls on overlayfs (the
    correct way would be to copy up and attempt to perform ioctl on upper
    file).

    In theory this could be a regression. We very much hope that nobody is
    relying on such a hack in any sane setup.

    While this patch meddles in VFS code, it has no effect on non-overlayfs
    filesystems.

    Reported-by: "zhangyi (F)"
    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • Add a separate flags argument (in addition to the open flags) to control
    the behavior of d_real().

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     

08 Jul, 2017

1 commit

  • Pull Writeback error handling updates from Jeff Layton:
    "This pile represents the bulk of the writeback error handling fixes
    that I have for this cycle. Some of the earlier patches in this pile
    may look trivial but they are prerequisites for later patches in the
    series.

    The aim of this set is to improve how we track and report writeback
    errors to userland. Most applications that care about data integrity
    will periodically call fsync/fdatasync/msync to ensure that their
    writes have made it to the backing store.

    For a very long time, we have tracked writeback errors using two flags
    in the address_space: AS_EIO and AS_ENOSPC. Those flags are set when a
    writeback error occurs (via mapping_set_error) and are cleared as a
    side-effect of filemap_check_errors (as you noted yesterday). This
    model really sucks for userland.

    Only the first task to call fsync (or msync or fdatasync) will see the
    error. Any subsequent task calling fsync on a file will get back 0
    (unless another writeback error occurs in the interim). If I have
    several tasks writing to a file and calling fsync to ensure that their
    writes got stored, then I need to have them coordinate with one
    another. That's difficult enough, but in a world of containerized
    setups that coordination may even not be possible.

    But wait...it gets worse!

    The calls to filemap_check_errors can be buried pretty far down in the
    call stack, and there are internal callers of filemap_write_and_wait
    and the like that also end up clearing those errors. Many of those
    callers ignore the error return from that function or return it to
    userland at nonsensical times (e.g. truncate() or stat()). If I get
    back -EIO on a truncate, there is no reason to think that it was
    because some previous writeback failed, and a subsequent fsync() will
    (incorrectly) return 0.

    This pile aims to do three things:

    1) ensure that when a writeback error occurs that that error will be
    reported to userland on a subsequent fsync/fdatasync/msync call,
    regardless of what internal callers are doing

    2) report writeback errors on all file descriptions that were open at
    the time that the error occurred. This is a user-visible change,
    but I think most applications are written to assume this behavior
    anyway. Those that aren't are unlikely to be hurt by it.

    3) document what filesystems should do when there is a writeback
    error. Today, there is very little consistency between them, and a
    lot of cargo-cult copying. We need to make it very clear what
    filesystems should do in this situation.

    To achieve this, the set adds a new data type (errseq_t) and then
    builds new writeback error tracking infrastructure around that. Once
    all of that is in place, we change the filesystems to use the new
    infrastructure for reporting wb errors to userland.

    Note that this is just the initial foray into cleaning up this mess.
    There is a lot of work remaining here:

    1) convert the rest of the filesystems in a similar fashion. Once the
    initial set is in, then I think most other fs' will be fairly
    simple to convert. Hopefully most of those can in via individual
    filesystem trees.

    2) convert internal waiters on writeback to use errseq_t for
    detecting errors instead of relying on the AS_* flags. I have some
    draft patches for this for ext4, but they are not quite ready for
    prime time yet.

    This was a discussion topic this year at LSF/MM too. If you're
    interested in the gory details, LWN has some good articles about this:

    https://lwn.net/Articles/718734/
    https://lwn.net/Articles/724307/"

    * tag 'for-linus-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
    btrfs: minimal conversion to errseq_t writeback error reporting on fsync
    xfs: minimal conversion to errseq_t writeback error reporting
    ext4: use errseq_t based error handling for reporting data writeback errors
    fs: convert __generic_file_fsync to use errseq_t based reporting
    block: convert to errseq_t based writeback error tracking
    dax: set errors in mapping when writeback fails
    Documentation: flesh out the section in vfs.txt on storing and reporting writeback errors
    mm: set both AS_EIO/AS_ENOSPC and errseq_t in mapping_set_error
    fs: new infrastructure for writeback error handling and reporting
    lib: add errseq_t type and infrastructure for handling it
    mm: don't TestClearPageError in __filemap_fdatawait_range
    mm: clear AS_EIO/AS_ENOSPC when writeback initiation fails
    jbd2: don't clear and reset errors after waiting on writeback
    buffer: set errors in mapping at the time that the error occurs
    fs: check for writeback errors after syncing out buffers in generic_file_fsync
    buffer: use mapping_set_error instead of setting the flag
    mm: fix mapping_set_error call in me_pagecache_dirty

    Linus Torvalds
     

06 Jul, 2017

1 commit

  • Most filesystems currently use mapping_set_error and
    filemap_check_errors for setting and reporting/clearing writeback errors
    at the mapping level. filemap_check_errors is indirectly called from
    most of the filemap_fdatawait_* functions and from
    filemap_write_and_wait*. These functions are called from all sorts of
    contexts to wait on writeback to finish -- e.g. mostly in fsync, but
    also in truncate calls, getattr, etc.

    The non-fsync callers are problematic. We should be reporting writeback
    errors during fsync, but many places spread over the tree clear out
    errors before they can be properly reported, or report errors at
    nonsensical times.

    If I get -EIO on a stat() call, there is no reason for me to assume that
    it is because some previous writeback failed. The fact that it also
    clears out the error such that a subsequent fsync returns 0 is a bug,
    and a nasty one since that's potentially silent data corruption.

    This patch adds a small bit of new infrastructure for setting and
    reporting errors during address_space writeback. While the above was my
    original impetus for adding this, I think it's also the case that
    current fsync semantics are just problematic for userland. Most
    applications that call fsync do so to ensure that the data they wrote
    has hit the backing store.

    In the case where there are multiple writers to the file at the same
    time, this is really hard to determine. The first one to call fsync will
    see any stored error, and the rest get back 0. The processes with open
    fds may not be associated with one another in any way. They could even
    be in different containers, so ensuring coordination between all fsync
    callers is not really an option.

    One way to remedy this would be to track what file descriptor was used
    to dirty the file, but that's rather cumbersome and would likely be
    slow. However, there is a simpler way to improve the semantics here
    without incurring too much overhead.

    This set adds an errseq_t to struct address_space, and a corresponding
    one is added to struct file. Writeback errors are recorded in the
    mapping's errseq_t, and the one in struct file is used as the "since"
    value.

    This changes the semantics of the Linux fsync implementation such that
    applications can now use it to determine whether there were any
    writeback errors since fsync(fd) was last called (or since the file was
    opened in the case of fsync having never been called).

    Note that those writeback errors may have occurred when writing data
    that was dirtied via an entirely different fd, but that's the case now
    with the current mapping_set_error/filemap_check_error infrastructure.
    This will at least prevent you from getting a false report of success.

    The new behavior is still consistent with the POSIX spec, and is more
    reliable for application developers. This patch just adds some basic
    infrastructure for doing this, and ensures that the f_wb_err "cursor"
    is properly set when a file is opened. Later patches will change the
    existing code to use this new infrastructure for reporting errors at
    fsync time.

    Signed-off-by: Jeff Layton
    Reviewed-by: Jan Kara

    Jeff Layton
     

28 Jun, 2017

1 commit

  • Define a set of write life time hints:

    RWH_WRITE_LIFE_NOT_SET No hint information set
    RWH_WRITE_LIFE_NONE No hints about write life time
    RWH_WRITE_LIFE_SHORT Data written has a short life time
    RWH_WRITE_LIFE_MEDIUM Data written has a medium life time
    RWH_WRITE_LIFE_LONG Data written has a long life time
    RWH_WRITE_LIFE_EXTREME Data written has an extremely long life time

    The intent is for these values to be relative to each other, no
    absolute meaning should be attached to these flag names.

    Add an fcntl interface for querying these flags, and also for
    setting them as well:

    F_GET_RW_HINT Returns the read/write hint set on the
    underlying inode.

    F_SET_RW_HINT Set one of the above write hints on the
    underlying inode.

    F_GET_FILE_RW_HINT Returns the read/write hint set on the
    file descriptor.

    F_SET_FILE_RW_HINT Set one of the above write hints on the
    file descriptor.

    The user passes in a 64-bit pointer to get/set these values, and
    the interface returns 0/-1 on success/error.

    Sample program testing/implementing basic setting/getting of write
    hints is below.

    Add support for storing the write life time hint in the inode flags
    and in struct file as well, and pass them to the kiocb flags. If
    both a file and its corresponding inode has a write hint, then we
    use the one in the file, if available. The file hint can be used
    for sync/direct IO, for buffered writeback only the inode hint
    is available.

    This is in preparation for utilizing these hints in the block layer,
    to guide on-media data placement.

    /*
    * writehint.c: get or set an inode write hint
    */
    #include
    #include
    #include
    #include
    #include
    #include

    #ifndef F_GET_RW_HINT
    #define F_LINUX_SPECIFIC_BASE 1024
    #define F_GET_RW_HINT (F_LINUX_SPECIFIC_BASE + 11)
    #define F_SET_RW_HINT (F_LINUX_SPECIFIC_BASE + 12)
    #endif

    static char *str[] = { "RWF_WRITE_LIFE_NOT_SET", "RWH_WRITE_LIFE_NONE",
    "RWH_WRITE_LIFE_SHORT", "RWH_WRITE_LIFE_MEDIUM",
    "RWH_WRITE_LIFE_LONG", "RWH_WRITE_LIFE_EXTREME" };

    int main(int argc, char *argv[])
    {
    uint64_t hint;
    int fd, ret;

    if (argc < 2) {
    fprintf(stderr, "%s: file \n", argv[0]);
    return 1;
    }

    fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
    perror("open");
    return 2;
    }

    if (argc > 2) {
    hint = atoi(argv[2]);
    ret = fcntl(fd, F_SET_RW_HINT, &hint);
    if (ret < 0) {
    perror("fcntl: F_SET_RW_HINT");
    return 4;
    }
    }

    ret = fcntl(fd, F_GET_RW_HINT, &hint);
    if (ret < 0) {
    perror("fcntl: F_GET_RW_HINT");
    return 3;
    }

    printf("%s: hint %s\n", argv[1], str[hint]);
    close(fd);
    return 0;
    }

    Reviewed-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Jens Axboe
     

13 May, 2017

1 commit

  • Pull misc vfs updates from Al Viro:
    "Making sure that something like a referral point won't end up as pwd
    or root.

    The main part is the last commit (fixing mntns_install()); that one
    fixes a hard-to-hit race. The fchdir() commit is making fchdir(2) a
    bit more robust - it should be impossible to get opened files (even
    O_PATH ones) for referral points in the first place, so the existing
    checks are OK, but checking the same thing as in chdir(2) is just as
    cheap.

    The path_init() commit removes a redundant check that shouldn't have
    been there in the first place"

    * 'work.sane_pwd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    make sure that mntns_install() doesn't end up with referral for root
    path_init(): don't bother with checking MAY_EXEC for LOOKUP_ROOT
    make sure that fchdir() won't accept referral points, etc.

    Linus Torvalds
     

11 May, 2017

1 commit

  • Pull overlayfs update from Miklos Szeredi:
    "The biggest part of this is making st_dev/st_ino on the overlay behave
    like a normal filesystem (i.e. st_ino doesn't change on copy up,
    st_dev is the same for all files and directories). Currently this only
    works if all layers are on the same filesystem, but future work will
    move the general case towards more sane behavior.

    There are also miscellaneous fixes, including fixes to handling
    append-only files. There's a small change in the VFS, but that only
    has an effect on overlayfs, since otherwise file->f_path.dentry->inode
    and file_inode(file) are always the same"

    * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
    ovl: update documentation w.r.t. constant inode numbers
    ovl: persistent inode numbers for upper hardlinks
    ovl: merge getattr for dir and nondir
    ovl: constant st_ino/st_dev across copy up
    ovl: persistent inode number for directories
    ovl: set the ORIGIN type flag
    ovl: lookup non-dir copy-up-origin by file handle
    ovl: use an auxiliary var for overlay root entry
    ovl: store file handle of lower inode on copy up
    ovl: check if all layers are on the same fs
    ovl: do not set overlay.opaque on non-dir create
    ovl: check IS_APPEND() on real upper inode
    vfs: ftruncate check IS_APPEND() on real upper inode
    ovl: Use designated initializers
    ovl: lockdep annotate of nested stacked overlayfs inode lock

    Linus Torvalds