Eric Lee / smarc-fsl-linux-kernel

20 Jun, 2019

1 commit

116b9731a fsnotify: add empty fsnotify_{unlink,rmdir}() hooks ... Browse Code »

We would like to move fsnotify_nameremove() calls from d_delete()
into a higher layer where the hook makes more sense and so we can
consider every d_delete() call site individually.

Start by creating empty hook fsnotify_{unlink,rmdir}() and place
them in the proper VFS call sites. After all d_delete() call sites
will be converted to use the new hook, the new hook will generate the
delete events and fsnotify_nameremove() hook will be removed.

Signed-off-by: Amir Goldstein
Signed-off-by: Jan Kara

Amir Goldstein
2019-06-20 20:44:55 +0800

08 May, 2019

1 commit

a9fbcd672 Merge tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt ... Browse Code »

Pull fscrypt updates from Ted Ts'o:
"Clean up fscrypt's dcache revalidation support, and other
miscellaneous cleanups"

* tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
fscrypt: cache decrypted symlink target in ->i_link
vfs: use READ_ONCE() to access ->i_link
fscrypt: fix race where ->lookup() marks plaintext dentry as ciphertext
fscrypt: only set dentry_operations on ciphertext dentries
fs, fscrypt: clear DCACHE_ENCRYPTED_NAME when unaliasing directory
fscrypt: fix race allowing rename() and link() of ciphertext dentries
fscrypt: clean up and improve dentry revalidation
fscrypt: use READ_ONCE() to access ->i_crypt_info
fscrypt: remove WARN_ON_ONCE() when decryption fails
fscrypt: drop inode argument from fscrypt_get_ctx()

Linus Torvalds
2019-05-08 12:28:04 +0800

27 Apr, 2019

2 commits

f4ec3a3d4 switch fsnotify_move() to passing const struct qstr * for old_name ... Browse Code »

note that in the second (RENAME_EXCHANGE) call of fsnotify_move() in
vfs_rename() the old_dentry->d_name is guaranteed to be unchanged
throughout the evaluation of fsnotify_move() (by the fact that the
parent directory is locked exclusive), so we don't need to fetch
old_dentry->d_name.name in the caller.

Signed-off-by: Al Viro

Al Viro
2019-04-27 01:22:05 +0800
230c6402b ovl_lookup_real_one(): don't bother with strlen() ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2019-04-27 01:13:33 +0800

18 Apr, 2019

1 commit

4c4f7c19b vfs: use READ_ONCE() to access ->i_link ... Browse Code »

Use 'READ_ONCE(inode->i_link)' to explicitly support filesystems caching
the symlink target in ->i_link later if it was unavailable at iget()
time, or wasn't easily available. I'll be doing this in fscrypt, to
improve the performance of encrypted symlinks on ext4, f2fs, and ubifs.

->i_link will start NULL and may later be set to a non-NULL value by a
smp_store_release() or cmpxchg_release(). READ_ONCE() is needed on the
read side. smp_load_acquire() is unnecessary because only a data
dependency barrier is required. (Thanks to Al for pointing this out.)

Acked-by: Al Viro
Signed-off-by: Eric Biggers
Signed-off-by: Theodore Ts'o

Eric Biggers
2019-04-18 00:43:14 +0800

13 Mar, 2019

1 commit

7b47a9e7c Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs mount infrastructure updates from Al Viro:
"The rest of core infrastructure; no new syscalls in that pile, but the
old parts are switched to new infrastructure. At that point
conversions of individual filesystems can happen independently; some
are done here (afs, cgroup, procfs, etc.), there's also a large series
outside of that pile dealing with NFS (quite a bit of option-parsing
stuff is getting used there - it's one of the most convoluted
filesystems in terms of mount-related logics), but NFS bits are the
next cycle fodder.

It got seriously simplified since the last cycle; documentation is
probably the weakest bit at the moment - I considered dropping the
commit introducing Documentation/filesystems/mount_api.txt (cutting
the size increase by quarter ;-), but decided that it would be better
to fix it up after -rc1 instead.

That pile allows to do followup work in independent branches, which
should make life much easier for the next cycle. fs/super.c size
increase is unpleasant; there's a followup series that allows to
shrink it considerably, but I decided to leave that until the next
cycle"

* 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits)
afs: Use fs_context to pass parameters over automount
afs: Add fs_context support
vfs: Add some logging to the core users of the fs_context log
vfs: Implement logging through fs_context
vfs: Provide documentation for new mount API
vfs: Remove kern_mount_data()
hugetlbfs: Convert to fs_context
cpuset: Use fs_context
kernfs, sysfs, cgroup, intel_rdt: Support fs_context
cgroup: store a reference to cgroup_ns into cgroup_fs_context
cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper
cgroup_do_mount(): massage calling conventions
cgroup: stash cgroup_root reference into cgroup_fs_context
cgroup2: switch to option-by-option parsing
cgroup1: switch to option-by-option parsing
cgroup: take options parsing into ->parse_monolithic()
cgroup: fold cgroup1_mount() into cgroup1_get_tree()
cgroup: start switching to fs_context
ipc: Convert mqueue fs to fs_context
proc: Add fs_context support to procfs
...

Linus Torvalds
2019-03-13 05:08:19 +0800

11 Mar, 2019

1 commit

c3665a6be Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/j… ... Browse Code »

…morris/linux-security

Pull integrity updates from James Morris:
"Mimi Zohar says:

'Linux 5.0 introduced the platform keyring to allow verifying the IMA
kexec kernel image signature using the pre-boot keys. This pull
request similarly makes keys on the platform keyring accessible for
verifying the PE kernel image signature.

Also included in this pull request is a new IMA hook that tags tmp
files, in policy, indicating the file hash needs to be calculated.
The remaining patches are cleanup'"

* 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
evm: Use defined constant for UUID representation
ima: define ima_post_create_tmpfile() hook and add missing call
evm: remove set but not used variable 'xattr'
encrypted-keys: fix Opt_err/Opt_error = -1
kexec, KEYS: Make use of platform keyring for signature verify
integrity, KEYS: add a reference to platform keyring

Linus Torvalds
2019-03-11 08:32:04 +0800

08 Mar, 2019

2 commits

b5dd0c658 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge more updates from Andrew Morton:

- some of the rest of MM

- various misc things

- dynamic-debug updates

- checkpatch

- some epoll speedups

- autofs

- rapidio

- lib/, lib/lzo/ updates

* emailed patches from Andrew Morton : (83 commits)
samples/mic/mpssd/mpssd.h: remove duplicate header
kernel/fork.c: remove duplicated include
include/linux/relay.h: fix percpu annotation in struct rchan
arch/nios2/mm/fault.c: remove duplicate include
unicore32: stop printing the virtual memory layout
MAINTAINERS: fix GTA02 entry and mark as orphan
mm: create the new vm_fault_t type
arm, s390, unicore32: remove oneliner wrappers for memblock_alloc()
arch: simplify several early memory allocations
openrisc: simplify pte_alloc_one_kernel()
sh: prefer memblock APIs returning virtual address
microblaze: prefer memblock API returning virtual address
powerpc: prefer memblock APIs returning virtual address
lib/lzo: separate lzo-rle from lzo
lib/lzo: implement run-length encoding
lib/lzo: fast 8-byte copy on arm64
lib/lzo: 64-bit CTZ on arm64
lib/lzo: tidy-up ifdefs
ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
ipc: annotate implicit fall through
...

Linus Torvalds
2019-03-08 11:25:37 +0800
f1fffbd44 linux/fs.h: move member alignment check next to definition of struct filename ... Browse Code »

Instead of doing this compile-time check in some slightly arbitrary user
of struct filename, put it next to the definition.

Link: http://lkml.kernel.org/r/20190208203015.29702-3-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes
Cc: Alexander Viro
Cc: Kees Cook
Cc: Luc Van Oostenryck
Cc: Masahiro Yamada
Cc: Nick Desaulniers
Cc: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rasmus Villemoes
2019-03-08 10:31:59 +0800

28 Feb, 2019

1 commit

31d921c7f vfs: Add configuration parser helpers ... Browse Code »

Because the new API passes in key,value parameters, match_token() cannot be
used with it. Instead, provide three new helpers to aid with parsing:

(1) fs_parse(). This takes a parameter and a simple static description of
all the parameters and maps the key name to an ID. It returns 1 on a
match, 0 on no match if unknowns should be ignored and some other
negative error code on a parse error.

The parameter description includes a list of key names to IDs, desired
parameter types and a list of enumeration name -> ID mappings.

[!] Note that for the moment I've required that the key->ID mapping
array is expected to be sorted and unterminated. The size of the
array is noted in the fsconfig_parser struct. This allows me to use
bsearch(), but I'm not sure any performance gain is worth the hassle
of requiring people to keep the array sorted.

The parameter type array is sized according to the number of parameter
IDs and is indexed directly. The optional enum mapping array is an
unterminated, unsorted list and the size goes into the fsconfig_parser
struct.

The function can do some additional things:

(a) If it's not ambiguous and no value is given, the prefix "no" on
a key name is permitted to indicate that the parameter should
be considered negatory.

(b) If the desired type is a single simple integer, it will perform
an appropriate conversion and store the result in a union in
the parse result.

(c) If the desired type is an enumeration, {key ID, name} will be
looked up in the enumeration list and the matching value will
be stored in the parse result union.

(d) Optionally generate an error if the key is unrecognised.

This is called something like:

enum rdt_param {
Opt_cdp,
Opt_cdpl2,
Opt_mba_mpbs,
nr__rdt_params
};

const struct fs_parameter_spec rdt_param_specs[nr__rdt_params] = {
[Opt_cdp] = { fs_param_is_bool },
[Opt_cdpl2] = { fs_param_is_bool },
[Opt_mba_mpbs] = { fs_param_is_bool },
};

const const char *const rdt_param_keys[nr__rdt_params] = {
[Opt_cdp] = "cdp",
[Opt_cdpl2] = "cdpl2",
[Opt_mba_mpbs] = "mba_mbps",
};

const struct fs_parameter_description rdt_parser = {
.name = "rdt",
.nr_params = nr__rdt_params,
.keys = rdt_param_keys,
.specs = rdt_param_specs,
.no_source = true,
};

int rdt_parse_param(struct fs_context *fc,
struct fs_parameter *param)
{
struct fs_parse_result parse;
struct rdt_fs_context *ctx = rdt_fc2context(fc);
int ret;

ret = fs_parse(fc, &rdt_parser, param, &parse);
if (ret < 0)
return ret;

switch (parse.key) {
case Opt_cdp:
ctx->enable_cdpl3 = true;
return 0;
case Opt_cdpl2:
ctx->enable_cdpl2 = true;
return 0;
case Opt_mba_mpbs:
ctx->enable_mba_mbps = true;
return 0;
}

return -EINVAL;
}

(2) fs_lookup_param(). This takes a { dirfd, path, LOOKUP_EMPTY? } or
string value and performs an appropriate path lookup to convert it
into a path object, which it will then return.

If the desired type was a blockdev, the type of the looked up inode
will be checked to make sure it is one.

This can be used like:

enum foo_param {
Opt_source,
nr__foo_params
};

const struct fs_parameter_spec foo_param_specs[nr__foo_params] = {
[Opt_source] = { fs_param_is_blockdev },
};

const char *char foo_param_keys[nr__foo_params] = {
[Opt_source] = "source",
};

const struct constant_table foo_param_alt_keys[] = {
{ "device", Opt_source },
};

const struct fs_parameter_description foo_parser = {
.name = "foo",
.nr_params = nr__foo_params,
.nr_alt_keys = ARRAY_SIZE(foo_param_alt_keys),
.keys = foo_param_keys,
.alt_keys = foo_param_alt_keys,
.specs = foo_param_specs,
};

int foo_parse_param(struct fs_context *fc,
struct fs_parameter *param)
{
struct fs_parse_result parse;
struct foo_fs_context *ctx = foo_fc2context(fc);
int ret;

ret = fs_parse(fc, &foo_parser, param, &parse);
if (ret < 0)
return ret;

switch (parse.key) {
case Opt_source:
return fs_lookup_param(fc, &foo_parser, param,
&parse, &ctx->source);
default:
return -EINVAL;
}
}

(3) lookup_constant(). This takes a table of named constants and looks up
the given name within it. The table is expected to be sorted such
that bsearch() be used upon it.

Possibly I should require the table be terminated and just use a
for-loop to scan it instead of using bsearch() to reduce hassle.

Tables look something like:

static const struct constant_table bool_names[] = {
{ "0", false },
{ "1", true },
{ "false", false },
{ "no", false },
{ "true", true },
{ "yes", true },
};

and a lookup is done with something like:

b = lookup_constant(bool_names, param->string, -1);

Additionally, optional validation routines for the parameter description
are provided that can be enabled at compile time. A later patch will
invoke these when a filesystem is registered.

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2019-02-28 16:28:53 +0800

05 Feb, 2019

1 commit

fdb2410f7 ima: define ima_post_create_tmpfile() hook and add missing call ... Browse Code »

If tmpfiles can be made persistent, then newly created tmpfiles need to
be treated like any other new files in policy.

This patch indicates which newly created tmpfiles are in policy, causing
the file hash to be calculated on __fput().

Reported-by: Ignaz Forster
[rgoldwyn@suse.com: Call ima_post_create_tmpfile() in vfs_tmpfile() as
opposed to do_tmpfile(). This will help the case for overlayfs where
copy_up is denied while overwriting a file.]
Signed-off-by: Goldwyn Rodrigues
Signed-off-by: Mimi Zohar

Mimi Zohar
2019-02-05 06:36:01 +0800

31 Jan, 2019

1 commit

57d465771 audit: ignore fcaps on umount ... Browse Code »

Don't fetch fcaps when umount2 is called to avoid a process hang while
it waits for the missing resource to (possibly never) re-appear.

Note the comment above user_path_mountpoint_at():
* A umount is a special case for path walking. We're not actually interested
* in the inode in this situation, and ESTALE errors can be a problem. We
* simply want track down the dentry and vfsmount attached at the mountpoint
* and avoid revalidating the last component.

This can happen on ceph, cifs, 9p, lustre, fuse (gluster) or NFS.

Please see the github issue tracker
https://github.com/linux-audit/audit-kernel/issues/100

Signed-off-by: Richard Guy Briggs
[PM: merge fuzz in audit_log_fcaps()]
Signed-off-by: Paul Moore

Richard Guy Briggs
2019-01-31 09:51:47 +0800

23 Dec, 2018

1 commit

94f82008c Revert "vfs: Allow userns root to call mknod on owned filesystems." ... Browse Code »

This reverts commit 55956b59df336f6738da916dbb520b6e37df9fbd.

commit 55956b59df33 ("vfs: Allow userns root to call mknod on owned filesystems.")
enabled mknod() in user namespaces for userns root if CAP_MKNOD is
available. However, these device nodes are useless since any filesystem
mounted from a non-initial user namespace will set the SB_I_NODEV flag on
the filesystem. Now, when a device node s created in a non-initial user
namespace a call to open() on said device node will fail due to:

bool may_open_dev(const struct path *path)
{
return !(path->mnt->mnt_flags & MNT_NODEV) &&
!(path->mnt->mnt_sb->s_iflags & SB_I_NODEV);
}

The problem with this is that as of the aforementioned commit mknod()
creates partially functional device nodes in non-initial user namespaces.
In particular, it has the consequence that as of the aforementioned commit
open() will be more privileged with respect to device nodes than mknod().
Before it was the other way around. Specifically, if mknod() succeeded
then it was transparent for any userspace application that a fatal error
must have occured when open() failed.

All of this breaks multiple userspace workloads and a widespread assumption
about how to handle mknod(). Basically, all container runtimes and systemd
live by the slogan "ask for forgiveness not permission" when running user
namespace workloads. For mknod() the assumption is that if the syscall
succeeds the device nodes are useable irrespective of whether it succeeds
in a non-initial user namespace or not. This logic was chosen explicitly
to allow for the glorious day when mknod() will actually be able to create
fully functional device nodes in user namespaces.
A specific problem people are already running into when running 4.18 rc
kernels are failing systemd services. For any distro that is run in a
container systemd services started with the PrivateDevices= property set
will fail to start since the device nodes in question cannot be
opened (cf. the arguments in [1]).

Full disclosure, Seth made the very sound argument that it is already
possible to end up with partially functional device nodes. Any filesystem
mounted with MS_NODEV set will allow mknod() to succeed but will not allow
open() to succeed. The difference to the case here is that the MS_NODEV
case is transparent to userspace since it is an explicitly set mount option
while the SB_I_NODEV case is an implicit property enforced by the kernel
and hence opaque to userspace.

[1]: https://github.com/systemd/systemd/pull/9483

Signed-off-by: Christian Brauner
Cc: "Eric W. Biederman"
Cc: Seth Forshee
Cc: Serge Hallyn
Signed-off-by: Linus Torvalds

Christian Brauner
2018-12-23 06:18:34 +0800

24 Aug, 2018

1 commit

30aba6656 namei: allow restricted O_CREAT of FIFOs and regular files ... Browse Code »

Disallows open of FIFOs or regular files not owned by the user in world
writable sticky directories, unless the owner is the same as that of the
directory or the file is opened without the O_CREAT flag. The purpose
is to make data spoofing attacks harder. This protection can be turned
on and off separately for FIFOs and regular files via sysctl, just like
the symlinks/hardlinks protection. This patch is based on Openwall's
"HARDEN_FIFO" feature by Solar Designer.

This is a brief list of old vulnerabilities that could have been prevented
by this feature, some of them even allow for privilege escalation:

CVE-2000-1134
CVE-2007-3852
CVE-2008-0525
CVE-2009-0416
CVE-2011-4834
CVE-2015-1838
CVE-2015-7442
CVE-2016-7489

This list is not meant to be complete. It's difficult to track down all
vulnerabilities of this kind because they were often reported without any
mention of this particular attack vector. In fact, before
hardlinks/symlinks restrictions, fifos/regular files weren't the favorite
vehicle to exploit them.

[s.mesoraca16@gmail.com: fix bug reported by Dan Carpenter]
Link: https://lkml.kernel.org/r/20180426081456.GA7060@mwanda
Link: http://lkml.kernel.org/r/1524829819-11275-1-git-send-email-s.mesoraca16@gmail.com
[keescook@chromium.org: drop pr_warn_ratelimited() in favor of audit changes in the future]
[keescook@chromium.org: adjust commit subjet]
Link: http://lkml.kernel.org/r/20180416175918.GA13494@beast
Signed-off-by: Salvatore Mesoraca
Signed-off-by: Kees Cook
Suggested-by: Solar Designer
Suggested-by: Kees Cook
Cc: Al Viro
Cc: Dan Carpenter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Salvatore Mesoraca
2018-08-24 09:48:43 +0800

22 Aug, 2018

1 commit

d9a185f8b Merge tag 'ovl-update-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs ... Browse Code »

Pull overlayfs updates from Miklos Szeredi:
"This contains two new features:

- Stack file operations: this allows removal of several hacks from
the VFS, proper interaction of read-only open files with copy-up,
possibility to implement fs modifying ioctls properly, and others.

- Metadata only copy-up: when file is on lower layer and only
metadata is modified (except size) then only copy up the metadata
and continue to use the data from the lower file"

* tag 'ovl-update-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (66 commits)
ovl: Enable metadata only feature
ovl: Do not do metacopy only for ioctl modifying file attr
ovl: Do not do metadata only copy-up for truncate operation
ovl: add helper to force data copy-up
ovl: Check redirect on index as well
ovl: Set redirect on upper inode when it is linked
ovl: Set redirect on metacopy files upon rename
ovl: Do not set dentry type ORIGIN for broken hardlinks
ovl: Add an inode flag OVL_CONST_INO
ovl: Treat metacopy dentries as type OVL_PATH_MERGE
ovl: Check redirects for metacopy files
ovl: Move some dir related ovl_lookup_single() code in else block
ovl: Do not expose metacopy only dentry from d_real()
ovl: Open file with data except for the case of fsync
ovl: Add helper ovl_inode_realdata()
ovl: Store lower data inode in ovl_inode
ovl: Fix ovl_getattr() to get number of blocks from lower
ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
ovl: Copy up meta inode data from lowest data inode
ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
...

Linus Torvalds
2018-08-22 09:19:09 +0800

14 Aug, 2018

1 commit

4591343e3 Merge branches 'work.misc' and 'work.dcache' of git://git.kernel.org/pub/scm/lin… ... Browse Code »

…ux/kernel/git/viro/vfs

Pull misc vfs updates from Al Viro:
"Misc cleanups from various folks all over the place

I expected more fs/dcache.c cleanups this cycle, so that went into a
separate branch. Said cleanups have missed the window, so in the
hindsight it could've gone into work.misc instead. Decided not to
cherry-pick, thus the 'work.dcache' branch"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: dcache: Use true and false for boolean values
fold generic_readlink() into its only caller
fs: shave 8 bytes off of struct inode
fs: Add more kernel-doc to the produced documentation
fs: Fix attr.c kernel-doc
removed extra extern file_fdatawait_range

* 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
kill dentry_update_name_case()

Linus Torvalds
2018-08-14 12:28:25 +0800

20 Jul, 2018

1 commit

f2df5da66 fold generic_readlink() into its only caller ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2018-07-20 05:35:51 +0800

18 Jul, 2018

1 commit

c67185434 Revert "vfs: update ovl inode before relatime check" ... Browse Code »

This reverts commit 598e3c8f72f5b77c84d2cb26cfd936ffb3cfdbaa.

Overlayfs no longer relies on the vfs correct atime handling.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2018-07-18 21:44:43 +0800

12 Jul, 2018

17 commits

5f336e722 few more cleanups of link_path_walk() callers ... Browse Code »

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:31 +0800
9b5858e99 allow link_path_walk() to take ERR_PTR() ... Browse Code »

There is a check for IS_ERR(name) immediately upstream of each call
of link_path_walk(name, nd), with positives treated as if link_path_walk()
failed with PTR_ERR(name). Taking that check into link_path_walk() itself
simplifies things nicely.

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:30 +0800
edc2b1da7 make path_init() unconditionally paired with terminate_walk() ... Browse Code »

including the failure exits

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:30 +0800
00a07c159 switch atomic_open() and lookup_open() to returning 0 in all success cases ... Browse Code »

caller can tell "opened" from "open it yourself" by looking at ->f_mode.

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:22 +0800
64e1ac4d4 ->atomic_open(): return 0 in all success cases ... Browse Code »

FMODE_OPENED can be used to distingusish "successful open" from the
"called finish_no_open(), do it yourself" cases. Since finish_no_open()
has been adjusted, no changes in the instances were actually needed.
The caller has been adjusted.

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:21 +0800
3ec2eef11 get rid of 'opened' in path_openat() and the helpers downstream ... Browse Code »

unused now

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:21 +0800
44907d790 get rid of 'opened' argument of ->atomic_open() - part 3 ... Browse Code »

now it can be done...

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:20 +0800
be12af3ef getting rid of 'opened' argument of ->atomic_open() - part 1 ... Browse Code »

'opened' argument of finish_open() is unused. Kill it.

Signed-off-by Al Viro

Al Viro
2018-07-12 22:04:19 +0800
6035a27b2 IMA: don't propagate opened through the entire thing ... Browse Code »

just check ->f_mode in ima_appraise_measurement()

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:19 +0800
73a09dd94 introduce FMODE_CREATED and switch to it ... Browse Code »

Parallel to FILE_CREATED, goes into ->f_mode instead of *opened.
NFS is a bit of a wart here - it doesn't have file at the point
where FILE_CREATED used to be set, so we need to propagate it
there (for now). IMA is another one (here and everywhere)...

Note that this needs do_dentry_open() to leave old bits in ->f_mode
alone - we want it to preserve FMODE_CREATED if it had been already
set (no other bit can be there).

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:18 +0800
aad888f82 switch all remaining checks for FILE_OPENED to FMODE_OPENED ... Browse Code »

... and don't bother with setting FILE_OPENED at all.

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:18 +0800
69527c554 now we can fold open_check_o_direct() into do_dentry_open() ... Browse Code »

These checks are better off in do_dentry_open(); the reason we couldn't
put them there used to be that callers couldn't tell what kind of cleanup
would do_dentry_open() failure call for. Now that we have FMODE_OPENED,
cleanup is the same in all cases - it's simply fput(). So let's fold
that into do_dentry_open(), as Christoph's patch tried to.

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:17 +0800
7c1c01ec2 lift fput() on late failures into path_openat() ... Browse Code »

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:17 +0800
4d27f3266 fold put_filp() into fput() ... Browse Code »

Just check FMODE_OPENED in __fput() and be done with that...

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:16 +0800
ae2bb293a get rid of cred argument of vfs_open() and do_dentry_open() ... Browse Code »

always equal to ->f_cred

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:14 +0800
ea73ea727 pass ->f_flags value to alloc_empty_file() ... Browse Code »

... and have it set the f_flags-derived part of ->f_mode.

Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:13 +0800
6de37b6dc pass creds to get_empty_filp(), make sure dentry_open() passes the right creds ... Browse Code »

... and rename get_empty_filp() to alloc_empty_file().

dentry_open() gets creds as argument, but the only thing that sees those is
security_file_open() - file->f_cred still ends up with current_cred(). For
almost all callers it's the same thing, but there are several broken cases.

Acked-by: Linus Torvalds
Signed-off-by: Al Viro

Al Viro
2018-07-12 22:04:13 +0800

16 Jun, 2018

1 commit

35773c938 Merge branch 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull AFS updates from Al Viro:
"Assorted AFS stuff - ended up in vfs.git since most of that consists
of David's AFS-related followups to Christoph's procfs series"

* 'afs-proc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
afs: Optimise callback breaking by not repeating volume lookup
afs: Display manually added cells in dynamic root mount
afs: Enable IPv6 DNS lookups
afs: Show all of a server's addresses in /proc/fs/afs/servers
afs: Handle CONFIG_PROC_FS=n
proc: Make inline name size calculation automatic
afs: Implement network namespacing
afs: Mark afs_net::ws_cell as __rcu and set using rcu functions
afs: Fix a Sparse warning in xdr_decode_AFSFetchStatus()
proc: Add a way to make network proc files writable
afs: Rearrange fs/afs/proc.c to remove remaining predeclarations.
afs: Rearrange fs/afs/proc.c to move the show routines up
afs: Rearrange fs/afs/proc.c by moving fops and open functions down
afs: Move /proc management functions to the end of the file

Linus Torvalds
2018-06-16 15:32:04 +0800

15 Jun, 2018

1 commit

0da0b7fd7 afs: Display manually added cells in dynamic root mount ... Browse Code »

Alter the dynroot mount so that cells created by manipulation of
/proc/fs/afs/cells and /proc/fs/afs/rootcell and by specification of a root
cell as a module parameter will cause directories for those cells to be
created in the dynamic root superblock for the network namespace[*].

To this end:

(1) Only one dynamic root superblock is now created per network namespace
and this is shared between all attempts to mount it. This makes it
easier to find the superblock to modify.

(2) When a dynamic root superblock is created, the list of cells is walked
and directories created for each cell already defined.

(3) When a new cell is added, if a dynamic root superblock exists, a
directory is created for it.

(4) When a cell is destroyed, the directory is removed.

(5) These directories are created by calling lookup_one_len() on the root
dir which automatically creates them if they don't exist.

[*] Inasmuch as network namespaces are currently supported here.

Signed-off-by: David Howells

David Howells
2018-06-15 22:27:09 +0800

13 Jun, 2018

1 commit

6da2ec560 treewide: kmalloc() -> kmalloc_array() ... Browse Code »

The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
patch replaces cases of:

kmalloc(a * b, gfp)

with:
kmalloc_array(a * b, gfp)

as well as handling cases of:

kmalloc(a * b * c, gfp)

with:

kmalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

kmalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

kmalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The tools/ directory was manually excluded, since it has its own
implementation of kmalloc().

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
kmalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
kmalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
kmalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_ID)
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_ID
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_CONST)
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_CONST
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_ID)
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_ID
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_CONST)
+ COUNT_CONST, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_CONST
+ COUNT_CONST, sizeof(THING)
, ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kmalloc
+ kmalloc_array
(
- SIZE * COUNT
+ COUNT, SIZE
, ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
kmalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
kmalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
kmalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(
- (E1) * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * (E3)
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
kmalloc(sizeof(THING) * C2, ...)
|
kmalloc(sizeof(TYPE) * C2, ...)
|
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(C1 * C2, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (E2)
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * E2
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (E2)
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * E2
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * E2
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * (E2)
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- E1 * E2
+ E1, E2
, ...)
)

Signed-off-by: Kees Cook

Kees Cook
2018-06-13 07:19:22 +0800

05 Jun, 2018

2 commits

d8aed8415 Merge branch 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull userns updates from Eric Biederman:
"This is the last couple of vfs bits to enable root in a user namespace
to mount and manipulate a filesystem with backing store (AKA not a
virtual filesystem like proc, but a filesystem where the unprivileged
user controls the content). The target filesystem for this work is
fuse, and Miklos should be sending you the pull request for the fuse
bits this merge window.

The two key patches are "evm: Don't update hmacs in user ns mounts"
and "vfs: Don't allow changing the link count of an inode with an
invalid uid or gid". Those close small gaps in the vfs that would be a
problem if an unprivileged fuse filesystem is mounted.

The rest of the changes are things that are now safe to allow a root
user in a user namespace to do with a filesystem they have mounted.
The most interesting development is that remount is now safe"

* 'userns-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
fs: Allow CAP_SYS_ADMIN in s_user_ns to freeze and thaw filesystems
capabilities: Allow privileged user in s_user_ns to set security.* xattrs
fs: Allow superblock owner to access do_remount_sb()
fs: Allow superblock owner to replace invalid owners of inodes
vfs: Allow userns root to call mknod on owned filesystems.
vfs: Don't allow changing the link count of an inode with an invalid uid or gid
evm: Don't update hmacs in user ns mounts

Linus Torvalds
2018-06-05 06:21:19 +0800
f956d08a5 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull misc vfs updates from Al Viro:
"Misc bits and pieces not fitting into anything more specific"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
vfs: delete unnecessary assignment in vfs_listxattr
Documentation: filesystems: update filesystem locking documentation
vfs: namei: use path_equal() in follow_dotdot()
fs.h: fix outdated comment about file flags
__inode_security_revalidate() never gets NULL opt_dentry
make xattr_getsecurity() static
vfat: simplify checks in vfat_lookup()
get rid of dead code in d_find_alias()
it's SB_BORN, not MS_BORN...
msdos_rmdir(): kill BS comment
remove rpc_rmdir()
fs: avoid fdput() after failed fdget() in vfs_dedupe_file_range()

Linus Torvalds
2018-06-05 01:14:28 +0800