Eric Lee / smarc-fsl-linux-kernel

23 Oct, 2015

1 commit

4f6563677 Move locks API users to locks_lock_inode_wait()

Instead of having users check for FL_POSIX or FL_FLOCK to call the correct
locks API function, use the check within locks_lock_inode_wait(). This
allows for some later cleanup.

Signed-off-by: Benjamin Coddington
Signed-off-by: Jeff Layton

Benjamin Coddington
2015-10-23 02:57:36 +0800
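
The dispatch described in this commit can be illustrated with a small userspace model. This is only a sketch: the flag values match include/linux/fs.h, but `posix_path()`, `flock_path()` and `lock_inode_wait_model()` are hypothetical stand-ins, not the kernel functions, and the error return in the default case is an assumption.

```c
/* Flag values as in include/linux/fs.h; everything else is a stand-in. */
#define FL_POSIX 1
#define FL_FLOCK 2

struct file_lock {
    unsigned int fl_flags;
};

/* Hypothetical stubs for the POSIX and flock lock paths. */
static int posix_path(struct file_lock *fl) { (void)fl; return 100; }
static int flock_path(struct file_lock *fl) { (void)fl; return 200; }

/*
 * Model of a single entry point that inspects the lock type itself,
 * so callers no longer test FL_POSIX or FL_FLOCK before calling.
 */
static int lock_inode_wait_model(struct file_lock *fl)
{
    switch (fl->fl_flags & (FL_POSIX | FL_FLOCK)) {
    case FL_POSIX:
        return posix_path(fl);
    case FL_FLOCK:
        return flock_path(fl);
    default:
        /* Neither or both flags set: treated as a caller error here. */
        return -1;
    }
}
```

With the check centralized, each caller shrinks to one call, which is what makes the later cleanup mentioned in the message possible.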

17 Aug, 2015

1 commit

8ed1f0e22 fs/fuse: fix ioctl type confusion

fuse_dev_ioctl() performed fuse_get_dev() on a user-supplied fd,
leading to a type confusion issue. Fix it by checking file->f_op.

Signed-off-by: Jann Horn
Acked-by: Miklos Szeredi
Signed-off-by: Linus Torvalds

Jann Horn
2015-08-17 03:35:44 +0800
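
The idea behind the f_op check can be sketched in userspace C. All names below (`fuse_dev_ops_model`, `other_ops_model`, `get_dev_checked()`) are hypothetical, and the structures are heavily simplified; the point is only that private data is interpreted after the operations table proves the file has the expected type.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct file_operations { int unused; };

struct file {
    const struct file_operations *f_op;
    void *private_data;
};

/* Two distinct operation tables: one for the device, one for anything else. */
static const struct file_operations fuse_dev_ops_model = { 0 };
static const struct file_operations other_ops_model = { 0 };

struct fuse_dev_model { int id; };

/*
 * Return the device only if the file really is a device file.
 * Without this check, private_data of an arbitrary file would be
 * reinterpreted as a device structure (the type confusion).
 */
static struct fuse_dev_model *get_dev_checked(const struct file *f)
{
    if (f == NULL || f->f_op != &fuse_dev_ops_model)
        return NULL;
    return (struct fuse_dev_model *)f->private_data;
}
```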

05 Jul, 2015

1 commit

1dc51b828 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull more vfs updates from Al Viro:
"Assorted VFS fixes and related cleanups (IMO the most interesting in
that part are f_path-related things and Eric's descriptor-related
stuff). UFS regression fixes (it got broken last cycle). 9P fixes.
fs-cache series, DAX patches, Jan's file_remove_suid() work"

[ I'd say this is much more than "fixes and related cleanups". The
file_table locking rule change by Eric Dumazet is a rather big and
fundamental update even if the patch isn't huge. - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
9p: cope with bogus responses from server in p9_client_{read,write}
p9_client_write(): avoid double p9_free_req()
9p: forgetting to cancel request on interrupted zero-copy RPC
dax: bdev_direct_access() may sleep
block: Add support for DAX reads/writes to block devices
dax: Use copy_from_iter_nocache
dax: Add block size note to documentation
fs/file.c: __fget() and dup2() atomicity rules
fs/file.c: don't acquire files->file_lock in fd_install()
fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
vfs: avoid creation of inode number 0 in get_next_ino
namei: make set_root_rcu() return void
make simple_positive() public
ufs: use dir_pages instead of ufs_dir_pages()
pagemap.h: move dir_pages() over there
remove the pointless include of lglock.h
fs: cleanup slight list_entry abuse
xfs: Correctly lock inode when removing suid and file capabilities
fs: Call security_ops->inode_killpriv on truncate
fs: Provide function telling whether file_remove_privs() will do anything
...

Linus Torvalds
2015-07-05 10:36:06 +0800

04 Jul, 2015

1 commit

0cbee9926 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull user namespace updates from Eric Biederman:
"Long ago and far away when user namespaces were young it was realized
that allowing fresh mounts of proc and sysfs with only user namespace
permissions could violate the basic rule that only root gets to decide
if proc or sysfs should be mounted at all.

Some hacks were put in place to reduce the worst of the damage that could
be done, and the common sense rule was adopted that fresh mounts of
proc and sysfs should allow no more than bind mounts of proc and
sysfs. Unfortunately that rule has not been fully enforced.

There are two kinds of gaps in that enforcement. Only filesystems
mounted on empty directories of proc and sysfs should be ignored but
the test for empty directories was insufficient. So in my tree
directories on proc, sysctl and sysfs that will always be empty are
created specially. Every other technique is imperfect as an ordinary
directory can have entries added even after a readdir returns and
shows that the directory is empty. Special creation of directories
for mount points makes the code in the kernel a smidge clearer about
its purpose. I asked container developers from the various container
projects to help test this and no holes were found in the set of mount
points on proc and sysfs that are created specially.

This set of changes also starts enforcing that the mount flags of fresh
mounts of proc and sysfs are consistent with the existing mount of
proc and sysfs. I expected this to be the boring part of the work but
unfortunately unprivileged userspace winds up mounting fresh copies of
proc and sysfs with noexec and nosuid clear when root set those flags
on the previous mount of proc and sysfs. So for now only the atime,
read-only and nodev attributes which userspace happens to keep
consistent are enforced. Dealing with the noexec and nosuid
attributes remains for another time.

This set of changes also addresses an issue with how open file
descriptors from /proc/<pid>/ns/* are displayed. Recently readlink of
/proc/<pid>/fd has been triggering a WARN_ON that has not been
meaningful since it was added (as all of the code in the kernel was
converted) and is now actively wrong.

There is also a short list of issues that have not been fixed yet that
I will mention briefly.

It is possible to rename a directory from below to above a bind mount.
At which point any directory pointers below the renamed directory can
be walked up to the root directory of the filesystem. With user
namespaces enabled a bind mount of the bind mount can be created
allowing the user to pick a directory whose children they can rename
to outside of the bind mount. This is challenging to fix and doubly
so because all obvious solutions must touch code that is in the
performance part of pathname resolution.

As mentioned above there is also a question of how to ensure that
developers by accident or with purpose do not introduce executable
files on sysfs and proc and in doing so introduce security regressions
in the current userspace that will not be immediately obvious and as
such are likely to require breaking userspace in painful ways once
they are recognized"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
vfs: Remove incorrect debugging WARN in prepend_path
mnt: Update fs_fully_visible to test for permanently empty directories
sysfs: Create mountpoints with sysfs_create_mount_point
sysfs: Add support for permanently empty directories to serve as mount points.
kernfs: Add support for always empty directories.
proc: Allow creating permanently empty directories that serve as mount points
sysctl: Allow creating permanently empty directories that serve as mountpoints.
fs: Add helper functions for permanently empty directories.
vfs: Ignore unlocked mounts in fs_fully_visible
mnt: Modify fs_fully_visible to deal with locked ro nodev and atime
mnt: Refactor the logic for mounting sysfs and proc in a user namespace

Linus Torvalds
2015-07-04 06:20:57 +0800

03 Jul, 2015

1 commit

a7ba4bf5e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse updates from Miklos Szeredi:
"This is the start of improving fuse scalability.

An input queue and a processing queue are split out from the monolithic
fuse connection, each with its own spinlock. The end of the patchset
adds the ability to clone a fuse connection. This means that instead of
having to read/write requests/answers on a single fuse device fd, the
fuse daemon can have multiple distinct file descriptors open. Each of
those can be used to receive requests and send answers; currently the
only constraint is that a request must be answered on the same fd as it
was read from.

This can be extended further to allow binding a device clone to a
specific CPU or NUMA node.

Based on a patchset by Srinivas Eeda and Ashish Samant. Thanks to
Ashish for the review of this series"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (40 commits)
fuse: update MAINTAINERS entry
fuse: separate pqueue for clones
fuse: introduce per-instance fuse_dev structure
fuse: device fd clone
fuse: abort: no fc->lock needed for request ending
fuse: no fc->lock for pqueue parts
fuse: no fc->lock in request_end()
fuse: cleanup request_end()
fuse: request_end(): do once
fuse: add req flag for private list
fuse: pqueue locking
fuse: abort: group pqueue accesses
fuse: cleanup fuse_dev_do_read()
fuse: move list_del_init() from request_end() into callers
fuse: duplicate ->connected in pqueue
fuse: separate out processing queue
fuse: simplify request_wait()
fuse: no fc->lock for iqueue parts
fuse: allow interrupt queuing without fc->lock
fuse: iqueue locking
...

Linus Torvalds
2015-07-03 02:21:26 +0800

01 Jul, 2015

35 commits

f9bb48825 sysfs: Create mountpoints with sysfs_create_mount_point

This allows for better documentation in the code and
it allows for a simpler and fully correct version of
fs_fully_visible to be written.

The mount points converted and their filesystems are:
/sys/hypervisor/s390/         s390_hypfs
/sys/kernel/config/           configfs
/sys/kernel/debug/            debugfs
/sys/firmware/efi/efivars/    efivarfs
/sys/fs/fuse/connections/     fusectl
/sys/fs/pstore/               pstore
/sys/kernel/tracing/          tracefs
/sys/fs/cgroup/               cgroup
/sys/kernel/security/         securityfs
/sys/fs/selinux/              selinuxfs
/sys/fs/smackfs/              smackfs

Cc: stable@vger.kernel.org
Acked-by: Greg Kroah-Hartman
Signed-off-by: "Eric W. Biederman"

Eric W. Biederman
2015-07-01 23:36:47 +0800
c3696046b fuse: separate pqueue for clones

Make each fuse device clone refer to a separate processing queue. The only
constraint on userspace code is that the request answer must be written to
the same device clone as it was read off.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2015-07-01 22:26:09 +0800
cc080e9e9 fuse: introduce per-instance fuse_dev structure

Allow fuse device clones to be distinguished. This patch just
adds the infrastructure by associating a separate "struct fuse_dev" with
each clone.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:08 +0800
00c570f4b fuse: device fd clone

Allow an open fuse device to be "cloned". Userspace can create a clone by:

newfd = open("/dev/fuse", O_RDWR);
ioctl(newfd, FUSE_DEV_IOC_CLONE, &oldfd);

At this point newfd will refer to the same fuse connection as oldfd.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:08 +0800
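
The two calls quoted in the "device fd clone" message can be wrapped with error handling roughly as follows. This is a sketch that assumes a Linux system whose `<linux/fuse.h>` defines `FUSE_DEV_IOC_CLONE`; the helper name `clone_fuse_fd()` is hypothetical.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fuse.h>   /* assumed to provide FUSE_DEV_IOC_CLONE */

/*
 * Open a fresh /dev/fuse descriptor and attach it to the connection
 * that oldfd already refers to. Returns the new fd, or -1 on failure.
 */
static int clone_fuse_fd(int oldfd)
{
    /* The ioctl argument is a pointer to a 32-bit fd value. */
    uint32_t src = (uint32_t)oldfd;
    int newfd = open("/dev/fuse", O_RDWR);

    if (newfd < 0)
        return -1;

    if (ioctl(newfd, FUSE_DEV_IOC_CLONE, &src) < 0) {
        close(newfd);
        return -1;
    }
    return newfd;
}
```

A daemon would typically call such a helper once per worker thread, giving each thread its own descriptor to read requests from and write answers to.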
ee314a870 fuse: abort: no fc->lock needed for request ending

In fuse_abort_conn() when all requests are on private lists we no longer
need fc->lock protection.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:08 +0800
46c34a348 fuse: no fc->lock for pqueue parts

Remove fc->lock protection from processing queue members, now protected by
fpq->lock.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:07 +0800
efe2800fa fuse: no fc->lock in request_end()

No longer need to call request_end() with the connection lock held. We
still protect the background counters and queue with fc->lock, so acquire
it if necessary.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:07 +0800
1e6881c36 fuse: cleanup request_end()

Now that we atomically test having already done everything we no longer
need other protection.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:07 +0800
365ae710d fuse: request_end(): do once

When the connection is aborted it is possible that request_end() will be
called twice. Use atomic test and set to do the actual ending only once.

test_and_set_bit() also provides the necessary barrier semantics so no
explicit smp_wmb() is necessary.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:06 +0800
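
The "do once" pattern from the request_end() commit can be modelled in userspace with C11 atomics. This is a sketch, not the kernel code: `test_and_set_bit_model()` imitates the kernel's test_and_set_bit() using atomic_fetch_or(), and the bit name and `req_model` structure are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FINISHED_BIT 0   /* hypothetical bit number for "already ended" */

struct req_model {
    atomic_ulong flags;
    int end_count;       /* how many times the ending work actually ran */
};

/* Atomically set bit nr and report whether it was already set. */
static bool test_and_set_bit_model(int nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << nr;
    unsigned long old = atomic_fetch_or(addr, mask);
    return (old & mask) != 0;
}

/* Only the first caller gets past the test; later callers return early. */
static void request_end_model(struct req_model *req)
{
    if (test_and_set_bit_model(FINISHED_BIT, &req->flags))
        return;
    req->end_count++;
}
```

atomic_fetch_or() uses sequentially consistent ordering by default, which plays the role of the barrier semantics the message attributes to test_and_set_bit().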
77cd9d488 fuse: add req flag for private list

When an unlocked request is aborted, it is moved from fpq->io to a private
list. Then, after unlocking fpq->lock, the private list is processed and
the requests are finished off.

To protect the private list, we need to mark the request with a flag, so if
in the meantime the request is unlocked the list is not corrupted.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:06 +0800
45a91cb1a fuse: pqueue locking

Add a fpq->lock for protecting members of struct fuse_pqueue and FR_LOCKED
request flag.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:06 +0800
24b4d33d4 fuse: abort: group pqueue accesses

Rearrange fuse_abort_conn() so that processing queue accesses are grouped
together.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:05 +0800
82cbdcd32 fuse: cleanup fuse_dev_do_read()

- locked list_add() + list_del_init() cancel out

- common handling of case when request is ended here in the read phase

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:05 +0800
f377cb799 fuse: move list_del_init() from request_end() into callers

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2015-07-01 22:26:04 +0800
e96edd94d fuse: duplicate ->connected in pqueue

This will allow checking ->connected just with the processing queue lock.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:04 +0800
3a2b5b9cd fuse: separate out processing queue

This is just two fields: fc->io and fc->processing.

This patch just rearranges the fields, no functional change.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:04 +0800
5250921bb fuse: simplify request_wait()

wait_event_interruptible_exclusive_locked() will do everything
request_wait() does, so replace it.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:03 +0800
fd22d62ed fuse: no fc->lock for iqueue parts

Remove fc->lock protection from input queue members, now protected by
fiq->waitq.lock.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:03 +0800
8f7bb368d fuse: allow interrupt queuing without fc->lock

Interrupt is only queued after the request has been sent to userspace.
This is either done in request_wait_answer() or fuse_dev_do_read()
depending on which state the request is in at the time of the interrupt.
If it's not yet sent, then queuing the interrupt is postponed until the
request is read. Otherwise (the request has already been read and is
waiting for an answer) the interrupt is queued immediately.

We want to call queue_interrupt() without fc->lock protection, in which
case there can be a race between the two functions:

- neither of them queue the interrupt (thinking the other one has already
done it).

- both of them queue the interrupt

The first one is prevented by adding memory barriers, the second is
prevented by checking (under fiq->waitq.lock) if the interrupt has already
been queued.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2015-07-01 22:26:03 +0800
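
The second half of the fix in the interrupt queuing commit, checking under the queue lock whether the interrupt is already queued, can be sketched with a pthread mutex. The structures and `queue_interrupt_model()` are hypothetical simplifications; a boolean stands in for whatever marker the real code tests, and the memory-barrier half of the fix is not modelled.

```c
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the input queue: a lock plus a count of queued interrupts. */
struct iqueue_model {
    pthread_mutex_t lock;
    int queued;
};

/* Stand-in for a request: remembers whether its interrupt is queued. */
struct intr_req_model {
    bool intr_queued;
};

/*
 * Queue the interrupt unless it is already queued. The test and the
 * update happen under the same lock, so two racing callers cannot
 * both add it. Returns true if this call did the queuing.
 */
static bool queue_interrupt_model(struct iqueue_model *q,
                                  struct intr_req_model *r)
{
    bool added = false;

    pthread_mutex_lock(&q->lock);
    if (!r->intr_queued) {
        r->intr_queued = true;
        q->queued++;
        added = true;
    }
    pthread_mutex_unlock(&q->lock);
    return added;
}
```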
4ce608126 fuse: iqueue locking

Use fiq->waitq.lock for protecting members of struct fuse_iqueue and
FR_PENDING request flag, previously protected by fc->lock.

Following patches will remove fc->lock protection from these members.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:02 +0800
ef7592588 fuse: dev read: split list_move

Different lists will need different locks.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:02 +0800
8c91189a2 fuse: abort: group iqueue accesses

Rearrange fuse_abort_conn() so that input queue accesses are grouped
together.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:02 +0800
e16714d87 fuse: duplicate ->connected in iqueue

This will allow checking ->connected just with the input queue lock.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:01 +0800
f88996a93 fuse: separate out input queue

The input queue contains normal requests (fc->pending), forgets
(fc->forget_*) and interrupts (fc->interrupts). There's also fc->waitq and
fc->fasync for waking up the readers of the fuse device when a request is
available.

The fc->reqctr is also moved to the input queue (assigned to the request
when the request is added to the input queue).

This patch just rearranges the fields, no functional change.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:01 +0800
33e14b4df fuse: req state use flags

Use flags for representing the state in fuse_req. This is needed since
req->list will be protected by different locks in different states, hence
we'll want the state itself to be split into distinct bits, each protected
with the relevant lock in that state.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2015-07-01 22:26:01 +0800
7a3b2c754 fuse: simplify req states

FUSE_REQ_INIT is actually the same state as FUSE_REQ_PENDING and
FUSE_REQ_READING and FUSE_REQ_WRITING can be merged into a common
FUSE_REQ_IO state.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:00 +0800
c47752673 fuse: don't hold lock over request_wait_answer()

Only hold fc->lock over sections of request_wait_answer() that actually
need it. If wait_event_interruptible() returns zero, it means that the
request finished. Need to add memory barriers, though, to make sure that
all relevant data in the request is synchronized.

Signed-off-by: Miklos Szeredi

Miklos Szeredi
2015-07-01 22:26:00 +0800
7d2e0a099 fuse: simplify unique ctr

Since it's a 64bit counter, it's never gonna wrap around. Remove code
dealing with that possibility.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:26:00 +0800
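
The claim in the "simplify unique ctr" message that a 64-bit counter never wraps can be checked with a little arithmetic. The helper below is a sketch with a hypothetical name; it assumes the counter advances by one per request and a constant request rate.

```c
/*
 * Years needed to exhaust a 64-bit counter that is incremented once
 * per request, at a constant rate of requests_per_second.
 */
static double years_until_wrap(double requests_per_second)
{
    const double total = 18446744073709551616.0;              /* 2^64 */
    const double seconds_per_year = 365.25 * 24.0 * 60.0 * 60.0;

    return total / requests_per_second / seconds_per_year;
}
```

Even at an assumed one billion requests per second, `years_until_wrap(1e9)` comes out at roughly 584 years, so dropping the wrap-around handling is safe in practice.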
41f982747 fuse: rework abort

Splice fc->pending and fc->processing lists into a common kill list while
holding fc->lock.

By the time we release fc->lock, pending and processing lists are empty and
the io list contains only locked requests.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:25:59 +0800
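
The splice-then-process approach from the "rework abort" commit can be sketched in userspace. This model uses a singly linked list and a pthread mutex instead of the kernel's list_head and spinlock; `conn_model`, `move_all()` and `abort_collect()` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

struct node {
    struct node *next;
};

/* Stand-in for the connection: one lock guarding two request lists. */
struct conn_model {
    pthread_mutex_t lock;
    struct node *pending;
    struct node *processing;
};

/* Move every node from *src onto the front of dst; *src ends up empty. */
static struct node *move_all(struct node *dst, struct node **src)
{
    while (*src != NULL) {
        struct node *n = *src;
        *src = n->next;
        n->next = dst;
        dst = n;
    }
    return dst;
}

/*
 * Collect both lists into a private kill list while holding the lock.
 * After the lock is dropped the shared lists are empty, so the caller
 * can finish the collected requests without further locking.
 */
static struct node *abort_collect(struct conn_model *c)
{
    struct node *kill = NULL;

    pthread_mutex_lock(&c->lock);
    kill = move_all(kill, &c->pending);
    kill = move_all(kill, &c->processing);
    pthread_mutex_unlock(&c->lock);
    return kill;
}
```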
b716d4253 fuse: fold helpers into abort

Fold end_io_requests() and end_queued_requests() into fuse_abort_conn().

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:25:59 +0800
dc00809a5 fuse: use per req lock for lock/unlock_request()

Reuse req->waitq.lock for protecting FR_ABORTED and FR_LOCKED flags.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:25:58 +0800
825d6d339 fuse: req use bitops

Finer grained locking will mean there's no single lock to protect
modification of bitfields in fuse_req.

So move to using bitops. Can use the non-atomic variants for those which
happen while the request definitely has only one reference.

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:25:58 +0800
0d8e84b04 fuse: simplify request abort

- don't end the request while req->locked is true

- make unlock_request() return an error if the connection was aborted

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:25:58 +0800
ccd0a0bd1 fuse: call fuse_abort_conn() in dev release

fuse_abort_conn() does all the work done by fuse_dev_release() and more.
"More" consists of:

end_io_requests(fc);
wake_up_all(&fc->waitq);
kill_fasync(&fc->fasync, SIGIO, POLL_IN);

All of which should be no-op (WARN_ON's added).

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:25:57 +0800
f0139aa81 fuse: fold fuse_request_send_nowait() into single caller

And the same with fuse_request_send_nowait_locked().

Signed-off-by: Miklos Szeredi
Reviewed-by: Ashish Samant

Miklos Szeredi
2015-07-01 22:25:57 +0800