24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove fall-through
    markings where they are unnecessary.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     

14 May, 2020

3 commits

  • Parsing "silent" and clearing SB_SILENT makes zero sense.

    Parsing "silent" and setting SB_SILENT would make a bit more sense, but
    apparently nobody cares.

    Signed-off-by: Miklos Szeredi
    Reviewed-by: Christoph Hellwig

    Miklos Szeredi
     
  • Unlike the others, this is _not_ a standard option accepted by mount(8).

    In fact SB_POSIXACL is an internal flag, and accepting MS_POSIXACL on the
    mount(2) interface is possibly a bug.

    The only filesystem that apparently wants to handle the "posixacl" option
    is 9p, but it has special handling of that option besides setting
    SB_POSIXACL.

    Signed-off-by: Miklos Szeredi
    Reviewed-by: Christoph Hellwig

    Miklos Szeredi
     
  • Makes little sense to keep this blacklist synced with what mount(8) parses
    and what it doesn't. E.g. it has various forms of "*atime" options, but
    not "atime"...

    Signed-off-by: Miklos Szeredi
    Reviewed-by: Christoph Hellwig

    Miklos Szeredi
     

08 Feb, 2020

3 commits


07 Feb, 2020

1 commit

  • As it is, vfs_parse_fs_string() makes "foo" and "foo=" indistinguishable;
    both get fs_value_is_string for ->type and NULL for ->string. To make
    it even more unpleasant, that combination is impossible to produce with
    fsconfig().

    Much saner rules would be
    "foo" => fs_value_is_flag, NULL
    "foo=" => fs_value_is_string, ""
    "foo=bar" => fs_value_is_string, "bar"
    All cases are distinguishable, all results are expressible by fsconfig(),
    ->has_value checks are much simpler that way (to the point of the field
    being useless) and quite a few regressions go away (gfs2 has no business
    accepting -o nodebug=, for example).
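
    The three rules above can be sketched in userspace as follows. This is
    only an illustration, not the kernel code: the enum, the struct and
    parse_opt() are hypothetical names that mirror fs_value_is_flag and
    fs_value_is_string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace mirror of the two value types named above. */
enum value_type { VALUE_IS_FLAG, VALUE_IS_STRING };

struct parsed_opt {
    enum value_type type;
    const char *key;    /* points into the caller's buffer */
    const char *string; /* NULL for "foo", "" for "foo=", "bar" for "foo=bar" */
};

/* Split "key[=value]" in place; opt must be a writable string. */
static struct parsed_opt parse_opt(char *opt)
{
    struct parsed_opt p = { VALUE_IS_FLAG, opt, NULL };
    char *eq = strchr(opt, '=');

    if (eq) {
        *eq = '\0';             /* terminate the key */
        p.type = VALUE_IS_STRING;
        p.string = eq + 1;      /* may be the empty string */
    }
    return p;
}
```

    With this shape a filesystem can reject "nodebug=" simply by checking
    that a flag-only option arrived as a flag.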

    Partially based upon patches from Miklos.

    Signed-off-by: Al Viro

    Al Viro
     

07 Sep, 2019

1 commit

  • The unused vfs code can be removed. Don't pass empty subtype (same as if
    ->parse callback isn't called).

    The bits that are left involve determining whether it's permitted to split the
    filesystem type string passed in to mount(2). Consequently, this means that we
    cannot get rid of the FS_HAS_SUBTYPE flag unless we define that a type string
    with a dot in it always indicates a subtype specification.
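
    A minimal userspace sketch of the splitting decision described above.
    fs_subtype() and its has_subtype_flag parameter are hypothetical
    stand-ins for the check against FS_HAS_SUBTYPE; the kernel code is
    organised differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the subtype part of "type.subtype", or NULL
 * when the filesystem type does not permit subtypes or the subtype is empty.
 */
static const char *fs_subtype(const char *fstype, bool has_subtype_flag)
{
    const char *dot;

    if (!has_subtype_flag)
        return NULL;            /* the dot is part of the type name */

    dot = strchr(fstype, '.');
    if (!dot || dot[1] == '\0')
        return NULL;            /* no subtype, or an empty one */

    return dot + 1;
}
```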

    Signed-off-by: David Howells
    Signed-off-by: Al Viro
    Signed-off-by: Miklos Szeredi

    David Howells
     

06 Sep, 2019

1 commit

  • fs_context::user_ns is used by fuse_parse_param(), even during remount,
    so it needs to be set to the existing value for reconfigure.

    Reproducer:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mount.h>
    #include <sys/stat.h>

    int main()
    {
        char opts[128];
        int fd = open("/dev/fuse", O_RDWR);

        sprintf(opts, "fd=%d,rootmode=040000,user_id=0,group_id=0", fd);
        mkdir("mnt", 0777);
        mount("foo", "mnt", "fuse.foo", 0, opts);
        mount("foo", "mnt", "fuse.foo", MS_REMOUNT, opts);
    }

    Crash:
    BUG: kernel NULL pointer dereference, address: 0000000000000000
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0 P4D 0
    Oops: 0000 [#1] SMP
    CPU: 0 PID: 129 Comm: syz_make_kuid Not tainted 5.3.0-rc5-next-20190821 #3
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-20181126_142135-anatol 04/01/2014
    RIP: 0010:map_id_range_down+0xb/0xc0 kernel/user_namespace.c:291
    [...]
    Call Trace:
    map_id_down kernel/user_namespace.c:312 [inline]
    make_kuid+0xe/0x10 kernel/user_namespace.c:389
    fuse_parse_param+0x116/0x210 fs/fuse/inode.c:523
    vfs_parse_fs_param+0xdb/0x1b0 fs/fs_context.c:145
    vfs_parse_fs_string+0x6a/0xa0 fs/fs_context.c:188
    generic_parse_monolithic+0x85/0xc0 fs/fs_context.c:228
    parse_monolithic_mount_data+0x1b/0x20 fs/fs_context.c:708
    do_remount fs/namespace.c:2525 [inline]
    do_mount+0x39a/0xa60 fs/namespace.c:3107
    ksys_mount+0x7d/0xd0 fs/namespace.c:3325
    __do_sys_mount fs/namespace.c:3339 [inline]
    __se_sys_mount fs/namespace.c:3336 [inline]
    __x64_sys_mount+0x20/0x30 fs/namespace.c:3336
    do_syscall_64+0x4a/0x1a0 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Reported-by: syzbot+7d6a57304857423318a5@syzkaller.appspotmail.com
    Fixes: 408cbe695350 ("vfs: Convert fuse to use the new mount API")
    Cc: David Howells
    Cc: Miklos Szeredi
    Signed-off-by: Eric Biggers
    Reviewed-by: David Howells
    Signed-off-by: Al Viro

    Eric Biggers
     

24 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public licence as published by
    the free software foundation either version 2 of the licence or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 114 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Kate Stewart
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190520170857.552531963@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 Mar, 2019

3 commits

  • Add a syscall for configuring a filesystem creation context and triggering
    actions upon it, to be used in conjunction with fsopen, fspick and fsmount.

    long fsconfig(int fs_fd, unsigned int cmd, const char *key,
    const void *value, int aux);

    Where fs_fd indicates the context, cmd indicates the action to take, key
    indicates the parameter name for parameter-setting actions and, if needed,
    value points to a buffer containing the value and aux can give more
    information for the value.

    The following command IDs are proposed:

    (*) FSCONFIG_SET_FLAG: No value is specified. The parameter must be
    boolean in nature. The key may be prefixed with "no" to invert the
    setting. value must be NULL and aux must be 0.

    (*) FSCONFIG_SET_STRING: A string value is specified. The parameter can
    be expecting boolean, integer, string or take a path. A conversion to
    an appropriate type will be attempted (which may include looking up as
    a path). value points to a NUL-terminated string and aux must be 0.

    (*) FSCONFIG_SET_BINARY: A binary blob is specified. value points to
    the blob and aux indicates its size. The parameter must be expecting
    a blob.

    (*) FSCONFIG_SET_PATH: A non-empty path is specified. The parameter must
    be expecting a path object. value points to a NUL-terminated string
    that is the path and aux is a file descriptor at which to start a
    relative lookup or AT_FDCWD.

    (*) FSCONFIG_SET_PATH_EMPTY: As FSCONFIG_SET_PATH, but with AT_EMPTY_PATH
    implied.

    (*) FSCONFIG_SET_FD: An open file descriptor is specified. value must
    be NULL and aux indicates the file descriptor.

    (*) FSCONFIG_CMD_CREATE: Trigger superblock creation.

    (*) FSCONFIG_CMD_RECONFIGURE: Trigger superblock reconfiguration.
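
    The per-command argument rules listed above can be summarised as a small
    checker. This is a sketch under stated assumptions, not the syscall's
    actual validation: the SK_* enum and args_ok() are hypothetical, the
    requirement that a blob size be positive is assumed, and the rule for
    the two CMD commands (everything NULL or 0) is inferred from how they
    are typically invoked rather than stated in the list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the proposed command IDs (values are arbitrary). */
enum sketch_cmd {
    SK_SET_FLAG, SK_SET_STRING, SK_SET_BINARY, SK_SET_PATH,
    SK_SET_PATH_EMPTY, SK_SET_FD, SK_CMD_CREATE, SK_CMD_RECONFIGURE,
};

/* Return true if (key, value, aux) fits the rules described for cmd. */
static bool args_ok(enum sketch_cmd cmd, const char *key,
                    const void *value, int aux)
{
    switch (cmd) {
    case SK_SET_FLAG:           /* no value, aux must be 0 */
        return key && !value && aux == 0;
    case SK_SET_STRING:         /* NUL-terminated string, aux must be 0 */
        return key && value && aux == 0;
    case SK_SET_BINARY:         /* blob plus its size in aux (assumed > 0) */
        return key && value && aux > 0;
    case SK_SET_PATH:           /* non-empty path, aux is a dirfd */
        return key && value && *(const char *)value != '\0';
    case SK_SET_PATH_EMPTY:     /* path may be empty */
        return key && value;
    case SK_SET_FD:             /* no value, aux is the fd */
        return key && !value && aux >= 0;
    case SK_CMD_CREATE:
    case SK_CMD_RECONFIGURE:    /* assumed: no key, no value, aux 0 */
        return !key && !value && aux == 0;
    }
    return false;
}
```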

    For the "set" command IDs, the idea is that the file_system_type will point
    to a list of parameters and the types of value that those parameters expect
    to take. The core code can then do the parse and argument conversion and
    then give the LSM and FS a cooked option or array of options to use.

    Source specification is also done the same way, using special keys
    "source", "source1", "source2", etc.

    [!] Note that, for the moment, the key and value are just glued back
    together and handed to the filesystem. Every filesystem that uses options
    uses match_token() and co. to do this, and this will need to be changed -
    but not all at once.

    Example usage:

    fd = fsopen("ext4", FSOPEN_CLOEXEC);
    fsconfig(fd, FSCONFIG_SET_PATH, "source", "/dev/sda1", AT_FDCWD);
    fsconfig(fd, FSCONFIG_SET_PATH_EMPTY, "journal_path", "", journal_fd);
    fsconfig(fd, FSCONFIG_SET_FD, "journal_fd", NULL, journal_fd);
    fsconfig(fd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
    fsconfig(fd, FSCONFIG_SET_FLAG, "noacl", NULL, 0);
    fsconfig(fd, FSCONFIG_SET_STRING, "sb", "1", 0);
    fsconfig(fd, FSCONFIG_SET_STRING, "errors", "continue", 0);
    fsconfig(fd, FSCONFIG_SET_STRING, "data", "journal", 0);
    fsconfig(fd, FSCONFIG_SET_STRING, "context", "unconfined_u:...", 0);
    fsconfig(fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
    mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC);

    or:

    fd = fsopen("ext4", FSOPEN_CLOEXEC);
    fsconfig(fd, FSCONFIG_SET_STRING, "source", "/dev/sda1", 0);
    fsconfig(fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
    mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC);

    or:

    fd = fsopen("afs", FSOPEN_CLOEXEC);
    fsconfig(fd, FSCONFIG_SET_STRING, "source", "#grand.central.org:root.cell", 0);
    fsconfig(fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
    mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC);

    or:

    fd = fsopen("jffs2", FSOPEN_CLOEXEC);
    fsconfig(fd, FSCONFIG_SET_STRING, "source", "mtd0", 0);
    fsconfig(fd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
    mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC);

    Signed-off-by: David Howells
    cc: linux-api@vger.kernel.org
    Signed-off-by: Al Viro

    David Howells
     
  • Implement the ability for filesystems to log error, warning and
    informational messages through the fs_context. These can be extracted by
    userspace by reading from an fd created by fsopen().

    Error messages are prefixed with "e ", warnings with "w " and informational
    messages with "i ".

    Inside the kernel, formatted messages are malloc'd but unformatted messages
    are not copied if they're either in the core .rodata section or in the
    .rodata section of the filesystem module pinned by fs_context::fs_type.
    The messages are only good till the fs_type is released.

    Note that the logging object is shared between duplicated fs_context
    structures. This is so that filesystems such as NFS, which do a mount
    within a mount, can get at least some of the errors from the inner mount.

    Five logging functions are provided for this:

    (1) void logfc(struct fs_context *fc, const char *fmt, ...);

    This logs a message into the context. If the buffer is full, the
    earliest message is discarded.

    (2) void errorf(fc, fmt, ...);

    This wraps logfc() to log an error.

    (3) void invalf(fc, fmt, ...);

    This wraps errorf() and returns -EINVAL for convenience.

    (4) void warnf(fc, fmt, ...);

    This wraps logfc() to log a warning.

    (5) void infof(fc, fmt, ...);

    This wraps logfc() to log an informational message.
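
    The behaviour described for logfc() (prefix each message with its level
    and drop the earliest message when the buffer is full) can be modelled
    in userspace roughly as below. Everything here is a hypothetical
    sketch: struct msg_log, log_msg(), log_oldest() and the slot sizes are
    illustrative and not taken from the kernel.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LOG_SLOTS   4   /* hypothetical capacity */
#define LOG_MSG_MAX 64  /* hypothetical per-message limit */

struct msg_log {
    char slot[LOG_SLOTS][LOG_MSG_MAX];
    unsigned int head;  /* index of the oldest message */
    unsigned int count; /* number of stored messages */
};

/* Store "<level> <formatted text>", discarding the oldest entry if full. */
static void log_msg(struct msg_log *log, char level, const char *fmt, ...)
{
    unsigned int idx;
    char *dst;
    va_list ap;

    if (log->count == LOG_SLOTS) {
        log->head = (log->head + 1) % LOG_SLOTS;
        log->count--;
    }

    idx = (log->head + log->count) % LOG_SLOTS;
    dst = log->slot[idx];
    dst[0] = level;     /* 'e', 'w' or 'i' */
    dst[1] = ' ';
    va_start(ap, fmt);
    vsnprintf(dst + 2, LOG_MSG_MAX - 2, fmt, ap);
    va_end(ap);
    log->count++;
}

/* Return the oldest stored message, or NULL if the log is empty. */
static const char *log_oldest(const struct msg_log *log)
{
    return log->count ? log->slot[log->head] : NULL;
}
```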

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Provide an fsopen() system call that starts the process of preparing to
    create a superblock that will then be mountable, using an fd as a context
    handle. fsopen() is given the name of the filesystem that will be used:

    int mfd = fsopen(const char *fsname, unsigned int flags);

    where flags can be 0 or FSOPEN_CLOEXEC.

    For example:

    sfd = fsopen("ext4", FSOPEN_CLOEXEC);
    fsconfig(sfd, FSCONFIG_SET_PATH, "source", "/dev/sda1", AT_FDCWD);
    fsconfig(sfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0);
    fsconfig(sfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
    fsconfig(sfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
    fsconfig(sfd, FSCONFIG_SET_STRING, "sb", "1", 0);
    fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
    fsinfo(sfd, NULL, ...); // query new superblock attributes
    mfd = fsmount(sfd, FSMOUNT_CLOEXEC, MS_RELATIME);
    move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);

    sfd = fsopen("afs", 0);
    fsconfig(sfd, FSCONFIG_SET_STRING, "source",
             "#grand.central.org:root.cell", 0);
    fsconfig(sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
    mfd = fsmount(sfd, 0, MS_NODEV);
    move_mount(mfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);

    If an error is reported at any step, an error message may be available to be
    read() back (ENODATA will be reported if there isn't an error available) in
    the form:

    "e <subsys>:<problem>"
    "e SELinux:Mount on mountpoint not permitted"

    Once fsmount() has been called, further fsconfig() calls will incur EBUSY,
    even if the fsmount() fails. read() is still possible to retrieve error
    information.
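
    A userspace consumer might split a message read back from the fd along
    the lines of the SELinux example above: a one-letter level, a space, a
    source name up to the first colon, then the text. This sketch assumes
    every message follows that layout, which is not guaranteed; struct
    fs_msg and parse_fs_msg() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

struct fs_msg {
    char level;         /* 'e', 'w' or 'i' */
    char subsys[32];    /* text between the prefix and the first ':' */
    const char *text;   /* remainder; points into the input line */
};

/* Return 0 on success, -1 if the line does not match the assumed layout. */
static int parse_fs_msg(const char *line, struct fs_msg *out)
{
    const char *colon;
    size_t len;

    if (!line[0] || line[1] != ' ' || !strchr("ewi", line[0]))
        return -1;

    colon = strchr(line + 2, ':');
    if (!colon)
        return -1;

    len = (size_t)(colon - (line + 2));
    if (len >= sizeof(out->subsys))
        return -1;

    memcpy(out->subsys, line + 2, len);
    out->subsys[len] = '\0';
    out->level = line[0];
    out->text = colon + 1;
    return 0;
}
```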

    The fsopen() syscall creates a mount context and hangs it off the fd that it
    returns.

    Netlink is not used because it is optional and would make the core VFS
    dependent on the networking layer and also potentially add network
    namespace issues.

    Note that, for the moment, the caller must have CAP_SYS_ADMIN to use
    fsopen().

    Signed-off-by: David Howells
    cc: linux-api@vger.kernel.org
    Signed-off-by: Al Viro

    David Howells
     

28 Feb, 2019

3 commits

  • Implement the ability for filesystems to log error, warning and
    informational messages through the fs_context. In the future, these will
    be extractable by userspace by reading from an fd created by the fsopen()
    syscall.

    Error messages are prefixed with "e ", warnings with "w " and informational
    messages with "i ".

    In the future, inside the kernel, formatted messages will be malloc'd but
    unformatted messages will not be copied if they're either in the core .rodata
    section or in the .rodata section of the filesystem module pinned by
    fs_context::fs_type. The messages will only be good till the fs_type is
    released.

    Note that the logging object will be shared between duplicated fs_context
    structures. This is so that filesystems such as NFS, which do a mount
    within a mount, can get at least some of the errors from the inner mount.

    Five logging functions are provided for this:

    (1) void logfc(struct fs_context *fc, const char *fmt, ...);

    This logs a message into the context. If the buffer is full, the
    earliest message is discarded.

    (2) void errorf(fc, fmt, ...);

    This wraps logfc() to log an error.

    (3) void invalf(fc, fmt, ...);

    This wraps errorf() and returns -EINVAL for convenience.

    (4) void warnf(fc, fmt, ...);

    This wraps logfc() to log a warning.

    (5) void infof(fc, fmt, ...);

    This wraps logfc() to log an informational message.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • new primitive: vfs_dup_fs_context(). Comes with fs_context
    method (->dup()) for copying the filesystem-specific parts
    of fs_context, along with LSM one (->fs_context_dup()) for
    doing the same to LSM parts.

    [needs better commit message, and change of Author:, anyway]

    Signed-off-by: Al Viro

    Al Viro
     
  • [AV - unfuck kern_mount_data(); we want non-NULL ->mnt_ns on long-living
    mounts]
    [AV - reordering fs/namespace.c is badly overdue, but let's keep it
    separate from that series]
    [AV - drop simple_pin_fs() change]
    [AV - clean vfs_kern_mount() failure exits up]

    Implement a filesystem context concept to be used during superblock
    creation for mount and superblock reconfiguration for remount.

    The mounting procedure then becomes:

    (1) Allocate a new fs_context.

    (2) Configure the context.

    (3) Create superblock.

    (4) Query the superblock.

    (5) Create a mount for the superblock.

    (6) Destroy the context.

    Rather than calling fs_type->mount(), an fs_context struct is created and
    fs_type->init_fs_context() is called to set it up. Pointers exist for the
    filesystem and LSM to hang their private data off.

    A set of operations has to be set by ->init_fs_context() to provide
    freeing, duplication, option parsing, binary data parsing, validation,
    mounting and superblock filling.

    Legacy filesystems are supported by the provision of a set of legacy
    fs_context operations that build up a list of mount options and then invoke
    fs_type->mount() from within the fs_context ->get_tree() operation. This
    allows all filesystems to be accessed using fs_context.
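
    The legacy arrangement described above can be modelled in userspace as
    an ops table whose parameter hook only accumulates an option string and
    whose get_tree hook hands that string to the old-style mount callback.
    This is a loose sketch: struct ctx, struct ctx_ops, legacy_mount and
    demo_mount() are hypothetical and much simpler than the kernel's
    structures.

```c
#include <stdio.h>
#include <string.h>

struct ctx;

/* Hypothetical subset of the operations a context provides. */
struct ctx_ops {
    int (*parse_param)(struct ctx *c, const char *key, const char *value);
    int (*get_tree)(struct ctx *c);
};

struct ctx {
    const struct ctx_ops *ops;
    int (*legacy_mount)(const char *opts); /* stand-in for fs_type->mount() */
    char opts[128];                        /* accumulated "a,b=c" string */
};

/* Append "key" or "key=value" to the option string, comma separated. */
static int legacy_parse_param(struct ctx *c, const char *key, const char *value)
{
    size_t used = strlen(c->opts);
    size_t room = sizeof(c->opts) - used;
    int n = snprintf(c->opts + used, room, "%s%s%s%s",
                     used ? "," : "", key,
                     value ? "=" : "", value ? value : "");

    if (n < 0 || (size_t)n >= room) {
        c->opts[used] = '\0';   /* undo the partial append */
        return -1;
    }
    return 0;
}

/* Hand the accumulated options to the old-style mount callback. */
static int legacy_get_tree(struct ctx *c)
{
    return c->legacy_mount(c->opts);
}

static const struct ctx_ops legacy_ops = { legacy_parse_param, legacy_get_tree };

/* Demo callback: remember what it was asked to mount with. */
static char last_opts[128];
static int demo_mount(const char *opts)
{
    snprintf(last_opts, sizeof(last_opts), "%s", opts);
    return 0;
}
```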

    It should be noted that, whilst this patch adds a lot of lines of code,
    there is quite a bit of duplication with existing code that can be
    eliminated should all filesystems be converted over.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

31 Jan, 2019

5 commits

  • Signed-off-by: Al Viro

    Al Viro
     
  • This is an eventual replacement for vfs_submount() uses. Unlike the
    "mount" and "remount" cases, the users of that thing are not in VFS -
    they are buried in various ->d_automount() instances and rather than
    converting them all at once we introduce the (thankfully small and
    simple) infrastructure here and deal with the prospective users in
    afs, nfs, etc. parts of the series.

    Here we just introduce a new constructor (fs_context_for_submount())
    along with the corresponding enum constant to be put into fc->purpose
    for those.

    Signed-off-by: Al Viro

    Al Viro
     
  • Replace do_remount_sb() with a function, reconfigure_super(), that's
    fs_context aware. The fs_context is expected to be parameterised already
    and have ->root pointing to the superblock to be reconfigured.

    A legacy wrapper is provided that is intended to be called from the
    fs_context ops when those appear, but for now is called directly from
    reconfigure_super(). This wrapper invokes the ->remount_fs() superblock op
    for the moment. It is intended that the remount_fs() op will be phased
    out.

    The fs_context->purpose is set to FS_CONTEXT_FOR_RECONFIGURE to indicate
    that the context is being used for reconfiguration.

    do_umount_root() is provided to consolidate remount-to-R/O for umount and
    emergency remount by creating a context and invoking reconfiguration.

    do_remount(), do_umount() and do_emergency_remount_callback() are switched
    to use the new process.

    [AV -- fold UMOUNT and EMERGENCY_REMOUNT in; fixes the
    umount / bug, gets rid of pointless complexity]
    [AV -- set ->net_ns in all cases; nfs remount will need that]
    [AV -- shift security_sb_remount() call into reconfigure_super(); the callers
    that didn't do security_sb_remount() have NULL fc->security anyway, so it's
    a no-op for them]

    Signed-off-by: David Howells
    Co-developed-by: Al Viro
    Signed-off-by: Al Viro

    David Howells
     
  • Right now vfs_get_tree() calls security_sb_kern_mount() (i.e.
    mount MAC) unless it gets MS_KERNMOUNT or MS_SUBMOUNT in flags.
    Doing it that way is both clumsy and imprecise.

    Consider the callers' tree of vfs_get_tree():
    vfs_get_tree()
    s_umount (in
    do_new_mount_fc()).

    Signed-off-by: Al Viro

    Al Viro
     
  • Introduce a filesystem context concept to be used during superblock
    creation for mount and superblock reconfiguration for remount. This is
    allocated at the beginning of the mount procedure and into it is placed:

    (1) Filesystem type.

    (2) Namespaces.

    (3) Source/Device names (there may be multiple).

    (4) Superblock flags (SB_*).

    (5) Security details.

    (6) Filesystem-specific data, as set by the mount options.

    Accessor functions are then provided to set up a context, parameterise it
    from monolithic mount data (the data page passed to mount(2)) and tear it
    down again.
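
    Parameterising a context from monolithic mount data amounts to walking
    a comma-separated option string and handing each "key" or "key=value"
    item to a per-option handler. The sketch below shows that idea in
    userspace; parse_monolithic(), opt_handler and count_opt() are
    hypothetical names, and the real parser also has to deal with security
    hooks and error reporting that are omitted here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-option callback; a negative return aborts the parse. */
typedef int (*opt_handler)(const char *key, const char *value, void *ctx);

/* Split data in place on ',' and '='; data must be a writable string. */
static int parse_monolithic(char *data, opt_handler handle, void *ctx)
{
    char *opt = data;

    while (opt) {
        char *next = strchr(opt, ',');
        char *value = NULL;
        char *eq;
        int ret;

        if (next)
            *next++ = '\0';     /* terminate this option */

        if (*opt) {             /* skip empty items such as ",," */
            eq = strchr(opt, '=');
            if (eq) {
                *eq = '\0';
                value = eq + 1;
            }
            ret = handle(opt, value, ctx);
            if (ret < 0)
                return ret;
        }
        opt = next;
    }
    return 0;
}

/* Example handler: count the options it is shown. */
static int count_opt(const char *key, const char *value, void *ctx)
{
    (void)key;
    (void)value;
    ++*(int *)ctx;
    return 0;
}
```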

    A legacy wrapper is provided that implements what will be the basic
    operations, wrapping access to filesystems that aren't yet aware of the
    fs_context.

    Finally, vfs_kern_mount() is changed to make use of the fs_context and
    mount_fs() is replaced by vfs_get_tree(), called from vfs_kern_mount().
    [AV -- add missing kstrdup()]
    [AV -- put_cred() can be unconditional - fc->cred can't be NULL]
    [AV -- take legacy_validate() contents into legacy_parse_monolithic()]
    [AV -- merge KERNEL_MOUNT and USER_MOUNT]
    [AV -- don't unlock superblock on success return from vfs_get_tree()]
    [AV -- kill 'reference' argument of init_fs_context()]

    Signed-off-by: David Howells
    Co-developed-by: Al Viro
    Signed-off-by: Al Viro

    David Howells