Eric Lee / smarc-fsl-linux-kernel

18 Jan, 2020

1 commit

3fe209c84 fs: move guard_bio_eod() after bio_set_op_attrs ... Browse Code »

commit 83c9c547168e8b914ea6398430473a4de68c52cc upstream.

Commit 85a8ce62c2ea ("block: add bio_truncate to fix guard_bio_eod")
adds bio_truncate() for handling bio EOD. However, bio_truncate()
doesn't use the passed 'op' parameter from guard_bio_eod's callers.

So bio_trunacate() may retrieve wrong 'op', and zering pages may
not be done for READ bio.

Fixes this issue by moving guard_bio_eod() after bio_set_op_attrs()
in submit_bh_wbc() so that bio_truncate() can always retrieve correct
op info.

Meantime remove the 'op' parameter from guard_bio_eod() because it isn't
used any more.

Cc: Carlos Maiolino
Cc: linux-fsdevel@vger.kernel.org
Fixes: 85a8ce62c2ea ("block: add bio_truncate to fix guard_bio_eod")
Signed-off-by: Ming Lei
Signed-off-by: Greg Kroah-Hartman

Fold in kerneldoc and bio_op() change.

Signed-off-by: Jens Axboe

Ming Lei
2020-01-18 02:48:21 +0800

21 Jul, 2019

1 commit

18253e034 Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull dcache and mountpoint updates from Al Viro:
"Saner handling of refcounts to mountpoints.

Transfer the counting reference from struct mount ->mnt_mountpoint
over to struct mountpoint ->m_dentry. That allows us to get rid of the
convoluted games with ordering of mount shutdowns.

The cost is in teaching shrink_dcache_{parent,for_umount} to cope with
mixed-filesystem shrink lists, which we'll also need for the Slab
Movable Objects patchset"

* 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
switch the remnants of releasing the mountpoint away from fs_pin
get rid of detach_mnt()
make struct mountpoint bear the dentry reference to mountpoint, not struct mount
Teach shrink_dcache_parent() to cope with mixed-filesystem shrink lists
fs/namespace.c: shift put_mountpoint() to callers of unhash_mnt()
__detach_mounts(): lookup_mountpoint() can't return ERR_PTR() anymore
nfs: dget_parent() never returns NULL
ceph: don't open-code the check for dead lockref

Linus Torvalds
2019-07-21 00:15:51 +0800

20 Jul, 2019

2 commits

26473f837 Merge tag 'iomap-5.3-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux ... Browse Code »

Pull iomap split/cleanup from Darrick Wong:
"As promised, here's the second part of the iomap merge for 5.3, in
which we break up iomap.c into smaller files grouped by functional
area so that it'll be easier in the long run to maintain cohesiveness
of code units and to review incoming patches. There are no functional
changes and fs/iomap.c split cleanly.

Summary:

- Regroup the fs/iomap.c code by major functional area so that we can
start development for 5.4 from a more stable base"

* tag 'iomap-5.3-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: move internal declarations into fs/iomap/
iomap: move the main iteration code into a separate file
iomap: move the buffered IO code into a separate file
iomap: move the direct IO code into a separate file
iomap: move the SEEK_HOLE code into a separate file
iomap: move the file mapping reporting code into a separate file
iomap: move the swapfile code into a separate file
iomap: start moving code to fs/iomap/

Linus Torvalds
2019-07-20 02:38:12 +0800
933a90bf4 Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs mount updates from Al Viro:
"The first part of mount updates.

Convert filesystems to use the new mount API"

* 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
mnt_init(): call shmem_init() unconditionally
constify ksys_mount() string arguments
don't bother with registering rootfs
init_rootfs(): don't bother with init_ramfs_fs()
vfs: Convert smackfs to use the new mount API
vfs: Convert selinuxfs to use the new mount API
vfs: Convert securityfs to use the new mount API
vfs: Convert apparmorfs to use the new mount API
vfs: Convert openpromfs to use the new mount API
vfs: Convert xenfs to use the new mount API
vfs: Convert gadgetfs to use the new mount API
vfs: Convert oprofilefs to use the new mount API
vfs: Convert ibmasmfs to use the new mount API
vfs: Convert qib_fs/ipathfs to use the new mount API
vfs: Convert efivarfs to use the new mount API
vfs: Convert configfs to use the new mount API
vfs: Convert binfmt_misc to use the new mount API
convenience helper: get_tree_single()
convenience helper get_tree_nodev()
vfs: Kill sget_userns()
...

Linus Torvalds
2019-07-20 01:42:02 +0800

17 Jul, 2019

1 commit

5d907307a iomap: move internal declarations into fs/iomap/ ... Browse Code »

Move internal function declarations out of fs/internal.h into
include/linux/iomap.h so that our transition is complete.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig

Darrick J. Wong
2019-07-17 22:21:02 +0800

10 Jul, 2019

1 commit

9bdebc2bd Teach shrink_dcache_parent() to cope with mixed-filesystem shrink lists ... Browse Code »

Currently, running into a shrink list that contains dentries from different
filesystems can cause several unpleasant things for shrink_dcache_parent()
and for umount(2).

The first problem is that there's a window during shrink_dentry_list() between
__dentry_kill() takes a victim out and dropping reference to its parent. During
that window the parent looks like a genuine busy dentry. shrink_dcache_parent()
(or, worse yet, shrink_dcache_for_umount()) coming at that time will see no
eviction candidates and no indication that it needs to wait for some
shrink_dentry_list() to proceed further.

That applies for any shrink list that might intersect with the subtree we are
trying to shrink; the only reason it does not blow on umount(2) in the mainline
is that we unregister the memory shrinker before hitting shrink_dcache_for_umount().

Another problem happens if something in a mixed-filesystem shrink list gets
be stuck in e.g. iput(), getting umount of unrelated fs to spin waiting for
the stuck shrinker to get around to our dentries.

Solution:
1) have shrink_dentry_list() decrement the parent's refcount and
make sure it's on a shrink list (ours unless it already had been on some
other) before calling __dentry_kill(). That eliminates the window when
shrink_dcache_parent() would've blown past the entire subtree without
noticing anything with zero refcount not on shrink lists.
2) when shrink_dcache_parent() has found no eviction candidates,
but some dentries are still sitting on shrink lists, rather than
repeating the scan in hope that shrinkers have progressed, scan looking
for something on shrink lists with zero refcount. If such a thing is
found, grab rcu_read_lock() and stop the scan, with caller locking
it for eviction, dropping out of RCU and doing __dentry_kill(), with
the same treatment for parent as shrink_dentry_list() would do.

Note that right now mixed-filesystem shrink lists do not occur, so this
is not a mainline bug. Howevere, there's a bunch of uses for such
beasts (e.g. the "try and evict everything we can out of given page"
patches; there are potential uses in mount-related code, considerably
simplifying the life in fs/namespace.c, etc.)

Signed-off-by: Al Viro

Al Viro
2019-07-10 19:32:22 +0800

28 Jun, 2019

1 commit

8af54f291 fs: fold __generic_write_end back into generic_write_end ... Browse Code »

This effectively reverts a6d639da63ae ("fs: factor out a
__generic_write_end helper") as we now open code what is left of that
helper in iomap.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andreas Gruenbacher
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Christoph Hellwig
2019-06-28 08:28:40 +0800

31 May, 2019

1 commit

2874c5fd2 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 ... Browse Code »

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-31 02:26:32 +0800

26 May, 2019

2 commits

20284ab74 switch mount_capable() to fs_context ... Browse Code »

now both callers of mount_capable() have access to fs_context;
the only difference is that for sget_fc() we have the possibility
of fc->global being true, while for legacy_get_tree() it's guaranteed
to be impossible. Unify to more generic variant...

Signed-off-by: Al Viro

Al Viro
2019-05-26 05:59:59 +0800
2527b284d move the capability checks from sget_userns() to legacy_get_tree() ... Browse Code »

1) all call chains leading to sget_userns() pass through ->mount()
instances.
2) none of ->mount() instances is ever called directly - the only
call site is legacy_get_tree()
3) all remaining ->mount() instances end up calling sget_userns()

IOW, we might as well do the capability checks just before calling
->mount(). As for the arguments passed to mount_capable(),
in case of call chains to sget_userns() going through sget(),
we either don't call mount_capable() at all, or pass current_user_ns()
to it. The call chains going through mount_pseudo_xattr() don't
call mount_capable() at all (SB_KERNMOUNT in flags on those).

That could've been split into smaller steps (lifting the checks
into sget(), then callers of sget(), then all the way to the
entries of every ->mount() out there, then to the sole caller),
but that would be too much churn for little benefit...

Signed-off-by: Al Viro

Al Viro
2019-05-26 05:59:58 +0800

21 May, 2019

1 commit

7e5f7bb08 unexport simple_dname() ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2019-05-21 15:23:41 +0800

08 May, 2019

3 commits

400913252 Merge branch 'work.mount-syscalls' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull mount ABI updates from Al Viro:
"The syscalls themselves, finally.

That's not all there is to that stuff, but switching individual
filesystems to new methods is fortunately independent from everything
else, so e.g. NFS series can go through NFS tree, etc.

As those conversions get done, we'll be finally able to get rid of a
bunch of duplication in fs/super.c introduced in the beginning of the
entire thing. I expect that to be finished in the next window..."

* 'work.mount-syscalls' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: Add a sample program for the new mount API
vfs: syscall: Add fspick() to select a superblock for reconfiguration
vfs: syscall: Add fsmount() to create a mount for a superblock
vfs: syscall: Add fsconfig() for configuring and managing a context
vfs: Implement logging through fs_context
vfs: syscall: Add fsopen() to prepare for superblock creation
Make anon_inodes unconditional
teach move_mount(2) to work with OPEN_TREE_CLONE
vfs: syscall: Add move_mount(2) to move mounts around
vfs: syscall: Add open_tree(2) to reference or clone a mount

Linus Torvalds
2019-05-08 11:17:51 +0800
d27fb65bc Merge branch 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull misc dcache updates from Al Viro:
"Most of this pile is putting name length into struct name_snapshot and
making use of it.

The beginning of this series ("ovl_lookup_real_one(): don't bother
with strlen()") ought to have been split in two (separate switch of
name_snapshot to struct qstr from overlayfs reaping the trivial
benefits of that), but I wanted to avoid a rebase - by the time I'd
spotted that it was (a) in -next and (b) close to 5.1-final ;-/"

* 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
audit_compare_dname_path(): switch to const struct qstr *
audit_update_watch(): switch to const struct qstr *
inotify_handle_event(): don't bother with strlen()
fsnotify: switch send_to_group() and ->handle_event to const struct qstr *
fsnotify(): switch to passing const struct qstr * for file_name
switch fsnotify_move() to passing const struct qstr * for old_name
ovl_lookup_real_one(): don't bother with strlen()
sysv: bury the broken "quietly truncate the long filenames" logics
nsfs: unobfuscate
unexport d_alloc_pseudo()

Linus Torvalds
2019-05-08 11:03:32 +0800
d8456eaf3 Merge tag 'iomap-5.2-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux ... Browse Code »

Pull iomap updates from Darrick Wong:
"Nothing particularly exciting here, just adding some callouts for gfs2
and cleaning a few things.

Summary:

- Add some extra hooks to the iomap buffered write path to enable
gfs2 journalled writes

- SPDX conversion

- Various refactoring"

* tag 'iomap-5.2-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: move iomap_read_inline_data around
iomap: Add a page_prepare callback
iomap: Fix use-after-free error in page_done callback
fs: Turn __generic_write_end into a void function
iomap: Clean up __generic_write_end calling
iomap: convert to SPDX identifier

Linus Torvalds
2019-05-08 02:43:32 +0800

01 May, 2019

1 commit

26ddb1f4f fs: Turn __generic_write_end into a void function ... Browse Code »

The VFS-internal __generic_write_end helper always returns the value of
its @copied argument. This can be confusing, and it isn't very useful
anyway, so turn __generic_write_end into a function returning void
instead.

Signed-off-by: Andreas Gruenbacher
Reviewed-by: Christoph Hellwig
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Andreas Gruenbacher
2019-05-01 22:47:37 +0800

10 Apr, 2019

1 commit

ab1152dd5 unexport d_alloc_pseudo() ... Browse Code »

No modular uses since introducion of alloc_file_pseudo(),
and the only non-modular user not in alloc_file_pseudo()
had actually been wrong - should've been d_alloc_anon().

Signed-off-by: Al Viro

Al Viro
2019-04-10 07:20:46 +0800

05 Apr, 2019

1 commit

9419a3191 acct_on(): don't mess with freeze protection ... Browse Code »

What happens there is that we are replacing file->path.mnt of
a file we'd just opened with a clone and we need the write
count contribution to be transferred from original mount to
new one. That's it. We do *NOT* want any kind of freeze
protection for the duration of switchover.

IOW, we should just use __mnt_{want,drop}_write() for that
switchover; no need to bother with mnt_{want,drop}_write()
there.

Tested-by: Amir Goldstein
Reported-by: syzbot+2a73a6ea9507b7112141@syzkaller.appspotmail.com
Signed-off-by: Al Viro

Al Viro
2019-04-05 09:04:13 +0800

21 Mar, 2019

2 commits

ecdab150f vfs: syscall: Add fsconfig() for configuring and managing a context ... Browse Code »

Add a syscall for configuring a filesystem creation context and triggering
actions upon it, to be used in conjunction with fsopen, fspick and fsmount.

long fsconfig(int fs_fd, unsigned int cmd, const char *key,
const void *value, int aux);

Where fs_fd indicates the context, cmd indicates the action to take, key
indicates the parameter name for parameter-setting actions and, if needed,
value points to a buffer containing the value and aux can give more
information for the value.

The following command IDs are proposed:

(*) FSCONFIG_SET_FLAG: No value is specified. The parameter must be
boolean in nature. The key may be prefixed with "no" to invert the
setting. value must be NULL and aux must be 0.

(*) FSCONFIG_SET_STRING: A string value is specified. The parameter can
be expecting boolean, integer, string or take a path. A conversion to
an appropriate type will be attempted (which may include looking up as
a path). value points to a NUL-terminated string and aux must be 0.

(*) FSCONFIG_SET_BINARY: A binary blob is specified. value points to
the blob and aux indicates its size. The parameter must be expecting
a blob.

(*) FSCONFIG_SET_PATH: A non-empty path is specified. The parameter must
be expecting a path object. value points to a NUL-terminated string
that is the path and aux is a file descriptor at which to start a
relative lookup or AT_FDCWD.

(*) FSCONFIG_SET_PATH_EMPTY: As fsconfig_set_path, but with AT_EMPTY_PATH
implied.

(*) FSCONFIG_SET_FD: An open file descriptor is specified. value must
be NULL and aux indicates the file descriptor.

(*) FSCONFIG_CMD_CREATE: Trigger superblock creation.

(*) FSCONFIG_CMD_RECONFIGURE: Trigger superblock reconfiguration.

For the "set" command IDs, the idea is that the file_system_type will point
to a list of parameters and the types of value that those parameters expect
to take. The core code can then do the parse and argument conversion and
then give the LSM and FS a cooked option or array of options to use.

Source specification is also done the same way same way, using special keys
"source", "source1", "source2", etc..

[!] Note that, for the moment, the key and value are just glued back
together and handed to the filesystem. Every filesystem that uses options
uses match_token() and co. to do this, and this will need to be changed -
but not all at once.

Example usage:

fd = fsopen("ext4", FSOPEN_CLOEXEC);
fsconfig(fd, fsconfig_set_path, "source", "/dev/sda1", AT_FDCWD);
fsconfig(fd, fsconfig_set_path_empty, "journal_path", "", journal_fd);
fsconfig(fd, fsconfig_set_fd, "journal_fd", "", journal_fd);
fsconfig(fd, fsconfig_set_flag, "user_xattr", NULL, 0);
fsconfig(fd, fsconfig_set_flag, "noacl", NULL, 0);
fsconfig(fd, fsconfig_set_string, "sb", "1", 0);
fsconfig(fd, fsconfig_set_string, "errors", "continue", 0);
fsconfig(fd, fsconfig_set_string, "data", "journal", 0);
fsconfig(fd, fsconfig_set_string, "context", "unconfined_u:...", 0);
fsconfig(fd, fsconfig_cmd_create, NULL, NULL, 0);
mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC);

or:

fd = fsopen("ext4", FSOPEN_CLOEXEC);
fsconfig(fd, fsconfig_set_string, "source", "/dev/sda1", 0);
fsconfig(fd, fsconfig_cmd_create, NULL, NULL, 0);
mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC);

or:

fd = fsopen("afs", FSOPEN_CLOEXEC);
fsconfig(fd, fsconfig_set_string, "source", "#grand.central.org:root.cell", 0);
fsconfig(fd, fsconfig_cmd_create, NULL, NULL, 0);
mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC);

or:

fd = fsopen("jffs2", FSOPEN_CLOEXEC);
fsconfig(fd, fsconfig_set_string, "source", "mtd0", 0);
fsconfig(fd, fsconfig_cmd_create, NULL, NULL, 0);
mfd = fsmount(fd, FSMOUNT_CLOEXEC, MS_NOEXEC);

Signed-off-by: David Howells
cc: linux-api@vger.kernel.org
Signed-off-by: Al Viro

David Howells
2019-03-21 06:49:06 +0800
a07b20004 vfs: syscall: Add open_tree(2) to reference or clone a mount ... Browse Code »

open_tree(dfd, pathname, flags)

Returns an O_PATH-opened file descriptor or an error.
dfd and pathname specify the location to open, in usual
fashion (see e.g. fstatat(2)). flags should be an OR of
some of the following:
* AT_PATH_EMPTY, AT_NO_AUTOMOUNT, AT_SYMLINK_NOFOLLOW -
same meanings as usual
* OPEN_TREE_CLOEXEC - make the resulting descriptor
close-on-exec
* OPEN_TREE_CLONE or OPEN_TREE_CLONE | AT_RECURSIVE -
instead of opening the location in question, create a detached
mount tree matching the subtree rooted at location specified by
dfd/pathname. With AT_RECURSIVE the entire subtree is cloned,
without it - only the part within in the mount containing the
location in question. In other words, the same as mount --rbind
or mount --bind would've taken. The detached tree will be
dissolved on the final close of obtained file. Creation of such
detached trees requires the same capabilities as doing mount --bind.

Signed-off-by: Al Viro
Signed-off-by: David Howells
cc: linux-api@vger.kernel.org
Signed-off-by: Al Viro

Al Viro
2019-03-21 06:49:06 +0800

28 Feb, 2019

1 commit

31d921c7f vfs: Add configuration parser helpers ... Browse Code »

Because the new API passes in key,value parameters, match_token() cannot be
used with it. Instead, provide three new helpers to aid with parsing:

(1) fs_parse(). This takes a parameter and a simple static description of
all the parameters and maps the key name to an ID. It returns 1 on a
match, 0 on no match if unknowns should be ignored and some other
negative error code on a parse error.

The parameter description includes a list of key names to IDs, desired
parameter types and a list of enumeration name -> ID mappings.

[!] Note that for the moment I've required that the key->ID mapping
array is expected to be sorted and unterminated. The size of the
array is noted in the fsconfig_parser struct. This allows me to use
bsearch(), but I'm not sure any performance gain is worth the hassle
of requiring people to keep the array sorted.

The parameter type array is sized according to the number of parameter
IDs and is indexed directly. The optional enum mapping array is an
unterminated, unsorted list and the size goes into the fsconfig_parser
struct.

The function can do some additional things:

(a) If it's not ambiguous and no value is given, the prefix "no" on
a key name is permitted to indicate that the parameter should
be considered negatory.

(b) If the desired type is a single simple integer, it will perform
an appropriate conversion and store the result in a union in
the parse result.

(c) If the desired type is an enumeration, {key ID, name} will be
looked up in the enumeration list and the matching value will
be stored in the parse result union.

(d) Optionally generate an error if the key is unrecognised.

This is called something like:

enum rdt_param {
Opt_cdp,
Opt_cdpl2,
Opt_mba_mpbs,
nr__rdt_params
};

const struct fs_parameter_spec rdt_param_specs[nr__rdt_params] = {
[Opt_cdp] = { fs_param_is_bool },
[Opt_cdpl2] = { fs_param_is_bool },
[Opt_mba_mpbs] = { fs_param_is_bool },
};

const const char *const rdt_param_keys[nr__rdt_params] = {
[Opt_cdp] = "cdp",
[Opt_cdpl2] = "cdpl2",
[Opt_mba_mpbs] = "mba_mbps",
};

const struct fs_parameter_description rdt_parser = {
.name = "rdt",
.nr_params = nr__rdt_params,
.keys = rdt_param_keys,
.specs = rdt_param_specs,
.no_source = true,
};

int rdt_parse_param(struct fs_context *fc,
struct fs_parameter *param)
{
struct fs_parse_result parse;
struct rdt_fs_context *ctx = rdt_fc2context(fc);
int ret;

ret = fs_parse(fc, &rdt_parser, param, &parse);
if (ret < 0)
return ret;

switch (parse.key) {
case Opt_cdp:
ctx->enable_cdpl3 = true;
return 0;
case Opt_cdpl2:
ctx->enable_cdpl2 = true;
return 0;
case Opt_mba_mpbs:
ctx->enable_mba_mbps = true;
return 0;
}

return -EINVAL;
}

(2) fs_lookup_param(). This takes a { dirfd, path, LOOKUP_EMPTY? } or
string value and performs an appropriate path lookup to convert it
into a path object, which it will then return.

If the desired type was a blockdev, the type of the looked up inode
will be checked to make sure it is one.

This can be used like:

enum foo_param {
Opt_source,
nr__foo_params
};

const struct fs_parameter_spec foo_param_specs[nr__foo_params] = {
[Opt_source] = { fs_param_is_blockdev },
};

const char *char foo_param_keys[nr__foo_params] = {
[Opt_source] = "source",
};

const struct constant_table foo_param_alt_keys[] = {
{ "device", Opt_source },
};

const struct fs_parameter_description foo_parser = {
.name = "foo",
.nr_params = nr__foo_params,
.nr_alt_keys = ARRAY_SIZE(foo_param_alt_keys),
.keys = foo_param_keys,
.alt_keys = foo_param_alt_keys,
.specs = foo_param_specs,
};

int foo_parse_param(struct fs_context *fc,
struct fs_parameter *param)
{
struct fs_parse_result parse;
struct foo_fs_context *ctx = foo_fc2context(fc);
int ret;

ret = fs_parse(fc, &foo_parser, param, &parse);
if (ret < 0)
return ret;

switch (parse.key) {
case Opt_source:
return fs_lookup_param(fc, &foo_parser, param,
&parse, &ctx->source);
default:
return -EINVAL;
}
}

(3) lookup_constant(). This takes a table of named constants and looks up
the given name within it. The table is expected to be sorted such
that bsearch() be used upon it.

Possibly I should require the table be terminated and just use a
for-loop to scan it instead of using bsearch() to reduce hassle.

Tables look something like:

static const struct constant_table bool_names[] = {
{ "0", false },
{ "1", true },
{ "false", false },
{ "no", false },
{ "true", true },
{ "yes", true },
};

and a lookup is done with something like:

b = lookup_constant(bool_names, param->string, -1);

Additionally, optional validation routines for the parameter description
are provided that can be enabled at compile time. A later patch will
invoke these when a filesystem is registered.

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2019-02-28 16:28:53 +0800

31 Jan, 2019

4 commits

f3a09c920 introduce fs_context methods ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2019-01-31 06:44:27 +0800
8d0347f6c convert do_remount_sb() to fs_context ... Browse Code »

Replace do_remount_sb() with a function, reconfigure_super(), that's
fs_context aware. The fs_context is expected to be parameterised already
and have ->root pointing to the superblock to be reconfigured.

A legacy wrapper is provided that is intended to be called from the
fs_context ops when those appear, but for now is called directly from
reconfigure_super(). This wrapper invokes the ->remount_fs() superblock op
for the moment. It is intended that the remount_fs() op will be phased
out.

The fs_context->purpose is set to FS_CONTEXT_FOR_RECONFIGURE to indicate
that the context is being used for reconfiguration.

do_umount_root() is provided to consolidate remount-to-R/O for umount and
emergency remount by creating a context and invoking reconfiguration.

do_remount(), do_umount() and do_emergency_remount_callback() are switched
to use the new process.

[AV -- fold UMOUNT and EMERGENCY_REMOUNT in; fixes the
umount / bug, gets rid of pointless complexity]
[AV -- set ->net_ns in all cases; nfs remount will need that]
[AV -- shift security_sb_remount() call into reconfigure_super(); the callers
that didn't do security_sb_remount() have NULL fc->security anyway, so it's
a no-op for them]

Signed-off-by: David Howells
Co-developed-by: Al Viro
Signed-off-by: Al Viro

David Howells
2019-01-31 06:44:26 +0800
c9ce29ed7 vfs_get_tree(): evict the call of security_sb_kern_mount() ... Browse Code »

Right now vfs_get_tree() calls security_sb_kern_mount() (i.e.
mount MAC) unless it gets MS_KERNMOUNT or MS_SUBMOUNT in flags.
Doing it that way is both clumsy and imprecise.

Consider the callers' tree of vfs_get_tree():
vfs_get_tree()
s_umount (in
do_new_mount_fc()).

Signed-off-by: Al Viro

Al Viro
2019-01-31 06:44:26 +0800
9bc61ab18 vfs: Introduce fs_context, switch vfs_kern_mount() to it. ... Browse Code »

Introduce a filesystem context concept to be used during superblock
creation for mount and superblock reconfiguration for remount. This is
allocated at the beginning of the mount procedure and into it is placed:

(1) Filesystem type.

(2) Namespaces.

(3) Source/Device names (there may be multiple).

(4) Superblock flags (SB_*).

(5) Security details.

(6) Filesystem-specific data, as set by the mount options.

Accessor functions are then provided to set up a context, parameterise it
from monolithic mount data (the data page passed to mount(2)) and tear it
down again.

A legacy wrapper is provided that implements what will be the basic
operations, wrapping access to filesystems that aren't yet aware of the
fs_context.

Finally, vfs_kern_mount() is changed to make use of the fs_context and
mount_fs() is replaced by vfs_get_tree(), called from vfs_kern_mount().
[AV -- add missing kstrdup()]
[AV -- put_cred() can be unconditional - fc->cred can't be NULL]
[AV -- take legacy_validate() contents into legacy_parse_monolithic()]
[AV -- merge KERNEL_MOUNT and USER_MOUNT]
[AV -- don't unlock superblock on success return from vfs_get_tree()]
[AV -- kill 'reference' argument of init_fs_context()]

Signed-off-by: David Howells
Co-developed-by: Al Viro
Signed-off-by: Al Viro

David Howells
2019-01-31 06:44:23 +0800

22 Aug, 2018

1 commit

d9a185f8b Merge tag 'ovl-update-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs ... Browse Code »

Pull overlayfs updates from Miklos Szeredi:
"This contains two new features:

- Stack file operations: this allows removal of several hacks from
the VFS, proper interaction of read-only open files with copy-up,
possibility to implement fs modifying ioctls properly, and others.

- Metadata only copy-up: when file is on lower layer and only
metadata is modified (except size) then only copy up the metadata
and continue to use the data from the lower file"

* tag 'ovl-update-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (66 commits)
ovl: Enable metadata only feature
ovl: Do not do metacopy only for ioctl modifying file attr
ovl: Do not do metadata only copy-up for truncate operation
ovl: add helper to force data copy-up
ovl: Check redirect on index as well
ovl: Set redirect on upper inode when it is linked
ovl: Set redirect on metacopy files upon rename
ovl: Do not set dentry type ORIGIN for broken hardlinks
ovl: Add an inode flag OVL_CONST_INO
ovl: Treat metacopy dentries as type OVL_PATH_MERGE
ovl: Check redirects for metacopy files
ovl: Move some dir related ovl_lookup_single() code in else block
ovl: Do not expose metacopy only dentry from d_real()
ovl: Open file with data except for the case of fsync
ovl: Add helper ovl_inode_realdata()
ovl: Store lower data inode in ovl_inode
ovl: Fix ovl_getattr() to get number of blocks from lower
ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
ovl: Copy up meta inode data from lowest data inode
ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
...

Linus Torvalds
2018-08-22 09:19:09 +0800

14 Aug, 2018

1 commit

161fa27ff Merge branch 'iomap-4.19-merge' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux ... Browse Code »

Pull fs iomap refactoring from Darrick Wong:
"This is the first part of the XFS changes for 4.19.

Christoph and Andreas coordinated some refactoring work on the iomap
code in preparation for removing buffer heads from XFS and porting
gfs2 to iomap. I'm sending this small pull request ahead of the main
XFS merge to avoid holding up gfs2 unnecessarily"

* 'iomap-4.19-merge' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
iomap: add inline data support to iomap_readpage_actor
iomap: support direct I/O to inline data
iomap: refactor iomap_dio_actor
iomap: add initial support for writes without buffer heads
iomap: add an iomap-based readpage and readpages implementation
iomap: add private pointer to struct iomap
iomap: add a page_done callback
iomap: generic inline data handling
iomap: complete partial direct I/O writes synchronously
iomap: mark newly allocated buffer heads as new
fs: factor out a __generic_write_end helper

Linus Torvalds
2018-08-14 13:29:03 +0800

18 Jul, 2018

4 commits

c67185434 Revert "vfs: update ovl inode before relatime check" ... Browse Code »

This reverts commit 598e3c8f72f5b77c84d2cb26cfd936ffb3cfdbaa.

Overlayfs no longer relies on the vfs correct atime handling.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2018-07-18 21:44:43 +0800
6742cee04 Revert "ovl: don't allow writing ioctl on lower layer" ... Browse Code »

This reverts commit 7c6893e3c9abf6a9676e060a1e35e5caca673d57.

Overlayfs no longer relies on the vfs for checking writability of files.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2018-07-18 21:44:43 +0800
9df6702ad vfs: export vfs_ioctl() to modules ... Browse Code »

This is needed by the stacked ioctl implementation in overlayfs.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2018-07-18 21:44:40 +0800
d3b1084df vfs: make open_with_fake_path() not contribute to nr_files ... Browse Code »

Stacking file operations in overlay will store an extra open file for each
overlay file opened.

The overhead is just that of "struct file" which is about 256bytes, because
overlay already pins an extra dentry and inode when the file is open, which
add up to a much larger overhead.

For fear of breaking working setups, don't start accounting the extra file.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2018-07-18 21:44:40 +0800

12 Jul, 2018

4 commits

69527c554 now we can fold open_check_o_direct() into do_dentry_open() ... Browse Code »

These checks are better off in do_dentry_open(); the reason we couldn't
put them there used to be that callers couldn't tell what kind of cleanup
would do_dentry_open() failure call for. Now that we have FMODE_OPENED,
cleanup is the same in all cases - it's simply fput(). So let's fold
that into do_dentry_open(), as Christoph's patch tried to.

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:17 +0800
ae2bb293a get rid of cred argument of vfs_open() and do_dentry_open() ... Browse Code »

always equal to ->f_cred

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:14 +0800
ea73ea727 pass ->f_flags value to alloc_empty_file() ... Browse Code »

... and have it set the f_flags-derived part of ->f_mode.

Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:13 +0800
6de37b6dc pass creds to get_empty_filp(), make sure dentry_open() passes the right creds ... Browse Code »

... and rename get_empty_filp() to alloc_empty_file().

dentry_open() gets creds as argument, but the only thing that sees those is
security_file_open() - file->f_cred still ends up with current_cred(). For
almost all callers it's the same thing, but there are several broken cases.

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:13 +0800

11 Jul, 2018

1 commit

b4e7a7a88 drm_mode_create_lease_ioctl(): fix open-coded filp_clone_open() ... Browse Code »

Failure of ->open() should *not* be followed by fput(). Fixed by
using filp_clone_open(), which gets the cleanups right.

Cc: stable@vger.kernel.org
Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-11 11:29:03 +0800

20 Jun, 2018

1 commit

a6d639da6 fs: factor out a __generic_write_end helper ... Browse Code »

Bits of the buffer.c based write_end implementations that don't know
about buffer_heads and can be reused by other implementations.

Signed-off-by: Christoph Hellwig
Reviewed-by: Brian Foster
Reviewed-by: Andreas Gruenbacher
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong

Christoph Hellwig
2018-06-20 06:10:55 +0800

04 Jun, 2018

1 commit

af04fadca Revert "fs: fold open_check_o_direct into do_dentry_open" ... Browse Code »

This reverts commit cab64df194667dc5d9d786f0a895f647f5501c0d.

Having vfs_open() in some cases drop the reference to
struct file combined with

error = vfs_open(path, f, cred);
if (error) {
put_filp(f);
return ERR_PTR(error);
}
return f;

is flat-out wrong. It used to be

error = vfs_open(path, f, cred);
if (!error) {
/* from now on we need fput() to dispose of f */
error = open_check_o_direct(f);
if (error) {
fput(f);
f = ERR_PTR(error);
}
} else {
put_filp(f);
f = ERR_PTR(error);
}

and sure, having that open_check_o_direct() boilerplate gotten rid of is
nice, but not that way...

Worse, another call chain (via finish_open()) is FUBAR now wrt
FILE_OPENED handling - in that case we get error returned, with file
already hit by fput() *AND* FILE_OPENED not set. Guess what happens in
path_openat(), when it hits

if (!(opened & FILE_OPENED)) {
BUG_ON(!error);
put_filp(file);
}

The root cause of all that crap is that the callers of do_dentry_open()
have no way to tell which way did it fail; while that could be fixed up
(by passing something like int *opened to do_dentry_open() and have it
marked if we'd called ->open()), it's probably much too late in the
cycle to do so right now.

Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2018-06-04 01:58:23 +0800

07 Apr, 2018

1 commit

9022ca6b1 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull misc vfs updates from Al Viro:
"Assorted stuff, including Christoph's I_DIRTY patches"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: move I_DIRTY_INODE to fs.h
ubifs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
ntfs: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) call
gfs2: fix bogus __mark_inode_dirty(I_DIRTY_SYNC | I_DIRTY_DATASYNC) calls
fs: fold open_check_o_direct into do_dentry_open
vfs: Replace stray non-ASCII homoglyph characters with their ASCII equivalents
vfs: make sure struct filename->iname is word-aligned
get rid of pointless includes of fs_struct.h
[poll] annotate SAA6588_CMD_POLL users

Linus Torvalds
2018-04-07 02:07:08 +0800

03 Apr, 2018

2 commits

411d9475c fs: add ksys_ftruncate() wrapper; remove in-kernel calls to sys_ftruncate() ... Browse Code »

Using the ksys_ftruncate() wrapper allows us to get rid of in-kernel
calls to the sys_ftruncate() syscall. The ksys_ prefix denotes that this
function is meant as a drop-in replacement for the syscall. In
particular, it uses the same calling convention as sys_ftruncate().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro
Cc: Andrew Morton
Signed-off-by: Dominik Brodowski

Dominik Brodowski
2018-04-03 02:16:00 +0800
55731b3cd fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappers ... Browse Code »

Using the fs-interal do_fchownat() wrapper allows us to get rid of
fs-internal calls to the sys_fchownat() syscall.

Introducing the ksys_fchown() helper and the ksys_{,}chown() wrappers
allows us to avoid the in-kernel calls to the sys_{,l,f}chown() syscalls.
The ksys_ prefix denotes that these functions are meant as a drop-in
replacement for the syscalls. In particular, they use the same calling
convention as sys_{,l,f}chown().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro
Cc: Andrew Morton
Signed-off-by: Dominik Brodowski

Dominik Brodowski
2018-04-03 02:15:59 +0800