Eric Lee / smarc-fsl-linux-kernel

15 Jan, 2012

1 commit

fed474857 fsnotify: don't BUG in fsnotify_destroy_mark()

Removing the parent of a watched file results in "kernel BUG at
fs/notify/mark.c:139".

To reproduce:

  add "-w /tmp/audit/dir/watched_file" to audit.rules
  rm -rf /tmp/audit/dir

This is caused by fsnotify_destroy_mark() being called without an
extra reference taken by the caller.

Reported by Francesco Cosoleto here:

https://bugzilla.novell.com/show_bug.cgi?id=689860

Fix by removing the BUG_ON and adding a comment about not accessing mark after
the iput.

Signed-off-by: Miklos Szeredi
CC: stable@vger.kernel.org
Signed-off-by: Linus Torvalds

Miklos Szeredi
2012-01-15 10:01:42 +0800

04 Jan, 2012

1 commit

c63181e6b vfs: move fsnotify junk to struct mount

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:57:12 +0800

27 Jul, 2011

1 commit

60063497a atomic: use <linux/atomic.h>

This allows us to move duplicated code in <asm/atomic.h>
(atomic_inc_not_zero() for now) to <linux/atomic.h>.

Signed-off-by: Arun Sharma
Reviewed-by: Eric Dumazet
Cc: Ingo Molnar
Cc: David Miller
Cc: Eric Dumazet
Acked-by: Mike Frysinger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arun Sharma
2011-07-27 07:49:47 +0800

08 Apr, 2011

1 commit

42933bac1 Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6

* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
Fix common misspellings

Linus Torvalds
2011-04-08 02:14:49 +0800

06 Apr, 2011

1 commit

d0de4dc58 inotify: fix double free/corruption of stuct user

On an error path in inotify_init1 a normal user can trigger a double
free of struct user. This is a regression introduced by a2ae4cc9a16e
("inotify: stop kernel memory leak on file creation failure").

We fix this by making sure that if a group exists the user reference is
dropped when the group is cleaned up. We should not explicitly drop the
reference on error and also drop the reference when the group is cleaned
up.

The new lifetime rules are that an inotify group lives from
inotify_new_group to the last fsnotify_put_group. Since the struct user
and inotify_devs are directly tied to this lifetime they are only
changed/updated in those two locations. We get rid of all special
casing of struct user or user->inotify_devs.
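
Editor's sketch (not kernel code): the lifetime rule above can be modelled in user space with two counters. All names here (model_user, model_group, model_new_group, model_put_group) are hypothetical stand-ins, and the "file allocation failure" is simulated. The point is only that the user reference is taken in one place and released in one place, so an error path that drops the group must not drop the user reference a second time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct user and an inotify group. */
struct model_user {
    int refcount;      /* references held on the user */
    int inotify_devs;  /* live inotify groups owned by the user */
};

struct model_group {
    int refcount;
    struct model_user *user;
};

/* Creation: the only place the user reference and device count are taken. */
struct model_group *model_new_group(struct model_user *user)
{
    struct model_group *group = malloc(sizeof(*group));

    if (!group)
        return NULL;
    group->refcount = 1;
    group->user = user;
    user->refcount++;
    user->inotify_devs++;
    return group;
}

/* Last put: the only place the user reference and device count are released. */
void model_put_group(struct model_group *group)
{
    assert(group->refcount > 0);
    if (--group->refcount > 0)
        return;
    group->user->inotify_devs--;
    group->user->refcount--;
    free(group);
}

/* Error path once the group exists: drop the group and nothing else. */
int model_init_failing(struct model_user *user)
{
    struct model_group *group = model_new_group(user);

    if (!group)
        return -1;
    /* Simulated failure after the group was created. */
    model_put_group(group); /* releases the user reference exactly once */
    return -1;
}
```

Dropping the user reference again after model_put_group() in the error path would reproduce the double release that the commit message describes.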

Signed-off-by: Eric Paris
Cc: stable@kernel.org (2.6.37 and up)
Signed-off-by: Linus Torvalds

Eric Paris
2011-04-06 06:27:14 +0800

31 Mar, 2011

1 commit

25985edce Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi

Lucas De Marchi
2011-03-31 22:26:23 +0800

25 Mar, 2011

3 commits

67a23c494 fs: rename inode_lock to inode_hash_lock

All that remains of the inode_lock is protecting the inode hash list
manipulation and traversals. Rename the inode_lock to
inode_hash_lock to reflect its actual function.

Signed-off-by: Dave Chinner
Signed-off-by: Al Viro

Dave Chinner
2011-03-25 09:17:51 +0800

55fa6091d fs: move i_sb_list out from under inode_lock

Protect the per-sb inode list with a new global lock
inode_sb_list_lock and use it to protect the list manipulations and
traversals. This lock replaces the inode_lock as the inodes on the
list can be validity checked while holding the inode->i_lock and
hence the inode_lock is no longer needed to protect the list.

Signed-off-by: Dave Chinner
Signed-off-by: Al Viro

Dave Chinner
2011-03-25 09:16:32 +0800

250df6ed2 fs: protect inode->i_state with inode->i_lock

Protect inode state transitions and validity checks with the
inode->i_lock. This enables us to make inode state transitions
independently of the inode_lock and is the first step to peeling
away the inode_lock from the code.

This requires that __iget() is done atomically with i_state checks
during list traversals so that we don't race with another thread
marking the inode I_FREEING between the state check and grabbing the
reference.

Also remove the unlock_new_inode() memory barrier optimisation
required to avoid taking the inode_lock when clearing I_NEW.
Simplify the code by simply taking the inode->i_lock around the
state change and wakeup. Because the wakeup is no longer tricky,
remove the wake_up_inode() function and open code the wakeup where
necessary.

Signed-off-by: Dave Chinner
Signed-off-by: Al Viro

Dave Chinner
2011-03-25 09:16:31 +0800

01 Mar, 2011

1 commit

ae0e47f02 Remove one to many n's in a word

Signed-off-by: Justin P. Mattock
Signed-off-by: Jiri Kosina

Justin P. Mattock
2011-03-01 22:47:58 +0800

14 Jan, 2011

1 commit

008d23e48 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
Documentation/trace/events.txt: Remove obsolete sched_signal_send.
writeback: fix global_dirty_limits comment runtime -> real-time
ppc: fix comment typo singal -> signal
drivers: fix comment typo diable -> disable.
m68k: fix comment typo diable -> disable.
wireless: comment typo fix diable -> disable.
media: comment typo fix diable -> disable.
remove doc for obsolete dynamic-printk kernel-parameter
remove extraneous 'is' from Documentation/iostats.txt
Fix spelling milisec -> ms in snd_ps3 module parameter description
Fix spelling mistakes in comments
Revert conflicting V4L changes
i7core_edac: fix typos in comments
mm/rmap.c: fix comment
sound, ca0106: Fix assignment to 'channel'.
hrtimer: fix a typo in comment
init/Kconfig: fix typo
anon_inodes: fix wrong function name in comment
fix comment typos concerning "consistent"
poll: fix a typo in comment
...

Fix up trivial conflicts in:
- drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
- fs/ext4/ext4.h

Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.

Linus Torvalds
2011-01-14 02:05:56 +0800

07 Jan, 2011

4 commits

873feea09 fs: dcache per-inode inode alias locking

dcache_inode_lock can be replaced with per-inode locking. Use existing
inode->i_lock for this. This is slightly non-trivial because we sometimes
need to find the inode from the dentry, which requires d_inode to be
stabilised (either with refcount or d_lock).

Signed-off-by: Nick Piggin

Nick Piggin
2011-01-07 14:50:31 +0800

b5c84bf6f fs: dcache remove dcache_lock

dcache_lock no longer protects anything. remove it.

Signed-off-by: Nick Piggin

Nick Piggin
2011-01-07 14:50:23 +0800

b23fb0a60 fs: scale inode alias list

Add a new lock, dcache_inode_lock, to protect the inode's i_dentry list
from concurrent modification. d_alias is also protected by d_lock.

Signed-off-by: Nick Piggin

Nick Piggin
2011-01-07 14:50:22 +0800

2fd6b7f50 fs: dcache scale subdirs

Protect d_subdirs and d_child with d_lock, except in filesystems that aren't
using dcache_lock for these anyway (eg. using i_mutex).

Note: if we change the locking rule in future so that ->d_child protection is
provided only with ->d_parent->d_lock, it may allow us to reduce some locking.
But it would be an exception to an otherwise regular locking scheme, so we'd
have to see some good results. Probably not worthwhile.

Signed-off-by: Nick Piggin

Nick Piggin
2011-01-07 14:50:21 +0800

23 Dec, 2010

1 commit

4b7bd3647 Merge branch 'master' into for-next

Conflicts:
MAINTAINERS
arch/arm/mach-omap2/pm24xx.c
drivers/scsi/bfa/bfa_fcpim.c

Needed to update to apply fixes for which the old branch was too
outdated.

Jiri Kosina
2010-12-23 01:57:02 +0800

16 Dec, 2010

1 commit

7d1316233 fanotify: fill in the metadata_len field on struct fanotify_event_metadata

The fanotify_event_metadata now has a field which is supposed to
indicate the length of the metadata portion of the event. Fill in that
field as well.

Based-in-part-on-patch-by: Alexey Zaytsev
Signed-off-by: Eric Paris

Eric Paris
2010-12-16 02:58:18 +0800

08 Dec, 2010

7 commits

fdbf3ceeb fanotify: Dont try to open a file descriptor for the overflow event

We should not try to open a file descriptor for the overflow event since this
will always fail.

Signed-off-by: Lino Sanfilippo
Signed-off-by: Eric Paris

Lino Sanfilippo
2010-12-08 05:14:24 +0800

263791989 fanotify: do not leak user reference on allocation failure

If fanotify_init is unable to allocate a new fsnotify group it will
return but will not drop its reference on the associated user struct.
Drop that reference on error.

Reported-by: Vegard Nossum
Signed-off-by: Eric Paris

Eric Paris
2010-12-08 05:14:23 +0800

a2ae4cc9a inotify: stop kernel memory leak on file creation failure

If inotify_init is unable to allocate a new file for the new inotify
group we leak the new group. This patch drops the reference on the
group on file allocation failure.

Reported-by: Vegard Nossum
cc: stable@kernel.org
Signed-off-by: Eric Paris

Eric Paris
2010-12-08 05:14:22 +0800

09e5f14e5 fanotify: on group destroy allow all waiters to bypass permission check

When fanotify_release() is called, there may still be processes waiting for
access permission. Currently only processes for which an event has already been
queued into the group's access list will be woken up. Processes for which no
event has been queued will continue to sleep and thus cause a deadlock when
fsnotify_put_group() is called.
Furthermore there is a race allowing further processes to be waiting on the
access wait queue after wake_up (if they arrive before clear_marks_by_group()
is called).
This patch corrects this by setting a flag to inform processes that the group
is about to be destroyed and thus not to wait for access permission.

[additional changelog from eparis]
Let's think about the 4 relevant code paths from the PoV of the
'operator', 'listener', 'responder' and 'closer'. The 'operator' is the
process doing an action (like open/read) which could require permission.
The 'listener' is the task (or in this case thread) tasked with reading from
the fanotify file descriptor. The 'responder' is the thread responsible
for responding to access requests. The 'closer' is the thread attempting to
close the fanotify file descriptor.

The 'operator' is going to end up in:
fanotify_handle_event()
   get_response_from_access()
      (THIS BLOCKS WAITING ON USERSPACE)

The 'listener' interesting code path:
fanotify_read()
   copy_event_to_user()
      prepare_for_access_response()
         (THIS CREATES AN fanotify_response_event)

The 'responder' code path:
fanotify_write()
   process_access_response()
      (REMOVE A fanotify_response_event, SET RESPONSE, WAKE UP 'operator')

The 'closer':
fanotify_release()
   (SUPPOSED TO CLEAN UP THE REST OF THIS MESS)

What we have today is that in the closer we remove all of the
fanotify_response_events and set a bit so no more response events are
ever created in prepare_for_access_response().

The bug is that we never wake all of the operators up and tell them to
move along. You fix that in fanotify_get_response_from_access(). You
also fix other operators which haven't gotten there yet. So I agree
that's a good fix.
[/additional changelog from eparis]

[remove additional changes to minimize patch size]
[move initialization so it was inside CONFIG_FANOTIFY_PERMISSION]

Signed-off-by: Lino Sanfilippo
Signed-off-by: Eric Paris

Lino Sanfilippo
2010-12-08 05:14:22 +0800

1734dee4e fanotify: Dont allow a mask of 0 if setting or removing a mark

In mark_remove_from_mask() we destroy marks that have their event mask cleared.
Thus we should not allow the creation of those marks in the first place.
With this patch we check if the mask given from user is 0 in case of FAN_MARK_ADD.
If so we return an error. Same for FAN_MARK_REMOVE since this does not have any
effect.
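
Editor's sketch (not kernel code): the check described above amounts to rejecting an empty mask before any mark is touched. The names below are hypothetical, and the use of -EINVAL is an assumption, since the message only says "an error".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the add and remove mark operations. */
enum model_mark_op { MODEL_MARK_ADD, MODEL_MARK_REMOVE };

/* Reject a zero mask up front; -EINVAL is an assumed error code. */
int model_check_mark_request(enum model_mark_op op, uint32_t mask)
{
    switch (op) {
    case MODEL_MARK_ADD:    /* would create a mark that is destroyed at once */
    case MODEL_MARK_REMOVE: /* would remove nothing */
        if (mask == 0)
            return -EINVAL;
        break;
    }
    return 0;
}

/* Small usage example. */
void model_check_mark_demo(void)
{
    assert(model_check_mark_request(MODEL_MARK_ADD, 0) == -EINVAL);
    assert(model_check_mark_request(MODEL_MARK_REMOVE, 0) == -EINVAL);
    assert(model_check_mark_request(MODEL_MARK_ADD, 0x20) == 0);
}
```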

Signed-off-by: Lino Sanfilippo
Signed-off-by: Eric Paris

Lino Sanfilippo
2010-12-08 05:14:21 +0800

fa218ab98 fanotify: correct broken ref counting in case adding a mark failed

If adding a mount or inode mark failed, fanotify_free_mark() is called explicitly.
But at this time the mark has already been put into the destroy list of the
fsnotify_mark kernel thread. If the thread is too slow, it will try to decrease
the reference count of a mark that has already been freed by fanotify_free_mark().
(If it is fast enough, it will only decrease the mark's ref counter from 2 to 1 - note
that the counter has been increased to 2 in add_mark() - which has practically no
effect.)

This patch fixes the ref counting by not calling free_mark() explicitly, but
decreasing the ref counter and relying on the fsnotify_mark thread to clean up in
case adding the mark has failed.

Signed-off-by: Lino Sanfilippo
Signed-off-by: Eric Paris

Lino Sanfilippo
2010-12-08 05:14:21 +0800

ecf6f5e7d fanotify: deny permissions when no event was sent

If no event was sent to userspace we cannot expect userspace to respond to
permissions requests. Today such requests just hang forever. This patch will
deny any permissions event which was unable to be sent to userspace.

Reported-by: Tvrtko Ursulin
Signed-off-by: Eric Paris

Eric Paris
2010-12-08 05:14:17 +0800

02 Nov, 2010

1 commit

6aaccece1 Kconfig: typo: and -> an

Signed-off-by: Michael Witten
Signed-off-by: Jiri Kosina

Michael Witten
2010-11-02 03:17:29 +0800

31 Oct, 2010

1 commit

1a5cea721 make fanotify_read() restartable across signals

In fanotify_read() return -ERESTARTSYS instead of -EINTR to
make read() restartable across signals (BSD semantic).

Signed-off-by: Eric Paris

Lino Sanfilippo
2010-10-31 02:07:35 +0800

29 Oct, 2010

14 commits

19ba54f46 fs/notify/fanotify/fanotify_user.c: fix warnings

fs/notify/fanotify/fanotify_user.c: In function 'fanotify_release':
fs/notify/fanotify/fanotify_user.c:375: warning: unused variable 'lre'
fs/notify/fanotify/fanotify_user.c:375: warning: unused variable 're'

this is really ugly.

Cc: Eric Paris
Signed-off-by: Andrew Morton
Signed-off-by: Eric Paris

Andrew Morton
2010-10-29 05:22:16 +0800

192ca4d19 fanotify: do not recalculate the mask if the ignored mask changed

If fanotify sets a new bit in the ignored mask it will cause the generic
fsnotify layer to recalculate the real mask. This is stupid since we
didn't change that part.

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:16 +0800

8fcd65280 fanotify: ignore events on directories unless specifically requested

fanotify has a very limited number of events it sends on directories. The
usefulness of these events is yet to be seen and still we send them. This
is particularly painful for mount marks where one might receive many of
these useless events. As such this patch will drop events on IS_DIR()
inodes unless they were explicitly requested with FAN_ON_DIR.

This means that a mark on a directory without FAN_EVENT_ON_CHILD or
FAN_ON_DIR is meaningless and will result in no events ever (although it
will still be allowed since detecting it is hard).
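
Editor's sketch (not kernel code): the filtering rule can be written as a small predicate. The flag values and names below are hypothetical stand-ins chosen for the example; MODEL_ON_DIR plays the role of the "events on directories" flag named in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this example. */
#define MODEL_EVENT_OPEN 0x001u
#define MODEL_ON_DIR     0x100u /* listener asked for events on directories */

/* Decide whether an event on an inode should reach a listener's mark. */
bool model_should_send(uint32_t mark_mask, uint32_t event, bool inode_is_dir)
{
    if (!(mark_mask & event))
        return false; /* listener never asked for this event type */
    if (inode_is_dir && !(mark_mask & MODEL_ON_DIR))
        return false; /* directory events are opt-in */
    return true;
}

/* Small usage example. */
void model_should_send_demo(void)
{
    /* A regular file: delivered as before. */
    assert(model_should_send(MODEL_EVENT_OPEN, MODEL_EVENT_OPEN, false));
    /* A directory without the opt-in flag: dropped. */
    assert(!model_should_send(MODEL_EVENT_OPEN, MODEL_EVENT_OPEN, true));
    /* A directory with the opt-in flag: delivered. */
    assert(model_should_send(MODEL_EVENT_OPEN | MODEL_ON_DIR,
                             MODEL_EVENT_OPEN, true));
}
```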

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:16 +0800

b29866aab fsnotify: rename FS_IN_ISDIR to FS_ISDIR

The _IN_ in the naming is reserved for flags only used by inotify. Since I
am about to use this flag for fanotify, rename it to be generic like the
rest.

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:15 +0800

e1c048ba7 fanotify: do not send events for irregular files

fanotify_should_send_event has a test to see if an object is a file or
directory and does not send an event otherwise. The problem is that the
test is actually checking if the object with a mark is a file or directory,
not if the object the event happened on is a file or directory. We should
check the latter.

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:15 +0800

4afeff850 fanotify: limit number of listeners per user

fanotify currently has no limit on the number of listeners a given user can
have open. This patch limits the total number of listeners per user to
128. This is the same as the inotify default limit.

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:15 +0800

ac7e22dcf fanotify: allow userspace to override max marks

Some fanotify groups, especially those like AV scanners, will need to place
lots of marks, particularly ignore marks. Since ignore marks do not pin
inodes in cache and are cleared if the inode is removed from core (usually
under memory pressure) we expose an interface for listeners, with
CAP_SYS_ADMIN, to override the maximum number of marks and be allowed to
set an 'unlimited' number of marks. Programs which make use of this
feature will be able to OOM a machine.

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:15 +0800

e7099d8a5 fanotify: limit the number of marks in a single fanotify group

There is currently no limit on the number of marks a given fanotify group
can have. Since fanotify is gated on CAP_SYS_ADMIN this was not seen as
a serious DoS threat. This patch implements a default of 8192, the same as
inotify to work towards removing the CAP_SYS_ADMIN gating and eliminating
the default DoS'able status.

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:14 +0800

5dd03f55f fanotify: allow userspace to override max queue depth

fanotify has a default max queue depth. This patch allows processes which
explicitly request it to have an 'unlimited' queue depth. These processes
need to be very careful to make sure they cannot fall far enough behind
that they OOM the box. Thus this flag is gated on CAP_SYS_ADMIN.

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:14 +0800

2529a0df0 fsnotify: implement a default maximum queue depth

Currently fanotify has no maximum queue depth. Since fanotify is
CAP_SYS_ADMIN only this does not pose a normal user DoS issue, but it
certainly is possible that an fanotify listener which can't keep up could
OOM the box. This patch implements a default 16k depth. This is the same
default depth used by inotify, but given fanotify's better queue merging in
many situations this queue will contain many additional useful events by
comparison.
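
Editor's sketch (not kernel code): a bounded event queue with an opt-out can be modelled as a depth counter. The structure and names are hypothetical; the 16384 limit is the "16k" default from the message, and the "unlimited" field models the override that the log describes for privileged listeners. How the real code reports an overflow to the listener is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_DEFAULT_MAX_EVENTS 16384u /* the "16k" default depth */

/* Hypothetical model of a notification queue. */
struct model_queue {
    unsigned int depth;     /* events currently queued */
    unsigned int max_depth; /* limit applied when not unlimited */
    bool unlimited;         /* listener asked for no limit */
    bool overflowed;        /* at least one event was refused */
};

/* Try to queue one event; returns false when the queue is full. */
bool model_enqueue(struct model_queue *q)
{
    if (!q->unlimited && q->depth >= q->max_depth) {
        q->overflowed = true;
        return false;
    }
    q->depth++;
    return true;
}

/* Small usage example: a slow listener hits the default limit. */
void model_queue_demo(void)
{
    struct model_queue q = { 0, MODEL_DEFAULT_MAX_EVENTS, false, false };
    unsigned int i;

    for (i = 0; i < MODEL_DEFAULT_MAX_EVENTS; i++)
        assert(model_enqueue(&q));
    assert(!model_enqueue(&q));
    assert(q.overflowed);
    assert(q.depth == MODEL_DEFAULT_MAX_EVENTS);
}
```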

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:14 +0800

5322a59f1 fanotify: ignore fanotify ignore marks if open writers

fanotify will clear ignore marks if a task changes the contents of an
inode. The problem is with the races around when userspace finishes
checking a file and when that result is actually attached to the inode.
This race was described as such:

Consider the following scenario with hostile processes A and B, and
victim process C:
1. Process A opens new file for writing. File check request is generated.
2. File check is performed in userspace. Check result is "file has no malware".
3. The "permit" response is delivered to kernel space.
4. File ignored mark set.
5. Process A writes dummy bytes to the file. File ignored flags are cleared.
6. Process B opens the same file for reading. File check request is generated.
7. File check is performed in userspace. Check result is "file has no malware".
8. Process A writes malware bytes to the file. There is no cached response yet.
9. The "permit" response is delivered to kernel space and is cached in fanotify.
10. File ignored mark set.
11. Now any process C will be permitted to open the malware file.
There is a race between steps 8 and 10.

While fanotify makes no strong guarantees about systems with hostile
processes there is no reason we cannot harden against this race. We do
that by simply ignoring any ignore marks if the inode has open writers (aka
i_writecount > 0). (We actually do not ignore ignore marks if the
FAN_MARK_SURV_MODIFY flag is set)
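
Editor's sketch (not kernel code): the hardening rule reduces to one extra condition when deciding whether an ignore mark suppresses an event. The function and parameter names are hypothetical; writecount stands for the inode's i_writecount and survives_modify for a mark created with the FAN_MARK_SURV_MODIFY behaviour mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Does the ignore mark suppress this event? */
bool model_ignore_applies(uint32_t ignored_mask, uint32_t event,
                          int writecount, bool survives_modify)
{
    if (!(ignored_mask & event))
        return false; /* event type is not ignored at all */
    if (writecount > 0 && !survives_modify)
        return false; /* open writers: the cached verdict may be stale */
    return true;
}

/* Small usage example. */
void model_ignore_demo(void)
{
    const uint32_t open_event = 0x20u; /* arbitrary bit for the example */

    /* No writers: the ignore mark is honoured. */
    assert(model_ignore_applies(open_event, open_event, 0, false));
    /* A writer is still open: the ignore mark is skipped. */
    assert(!model_ignore_applies(open_event, open_event, 1, false));
    /* A mark that survives modification is honoured regardless. */
    assert(model_ignore_applies(open_event, open_event, 1, true));
}
```

In the scenario listed above, process A still holds the file open for writing at step 10, so under this rule the freshly set ignore mark would not let process C through unchecked.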

Reported-by: Vasily Novikov
Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:14 +0800

52420392c fsnotify: call fsnotify_parent in perm events

fsnotify perm events do not call fsnotify parent. That means you cannot
register a perm event on a directory and enforce permissions on all inodes in
that directory. This patch fixes that situation.

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:13 +0800

ff8bcbd03 fsnotify: correctly handle return codes from listeners

When fsnotify groups return errors they are ignored. For permissions
events these should be passed back up the stack, but for most events these
should continue to be ignored.
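
Editor's sketch (not kernel code): one way to read the rule is a loop over the groups' results that propagates an error only for permission events. The names are hypothetical, and stopping at the first failing group is an assumption; the message does not say whether later groups are still consulted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * results[i] is the value returned by group i for one event
 * (0 on success, a negative errno value on failure).
 */
int model_collect_results(const int *results, size_t n, bool is_perm_event)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (results[i] != 0 && is_perm_event)
            return results[i]; /* pass the error back up the stack */
        /* Errors on ordinary events are deliberately ignored. */
    }
    return 0;
}

/* Small usage example. */
void model_collect_demo(void)
{
    const int results[] = { 0, -EPERM, 0 };

    assert(model_collect_results(results, 3, true) == -EPERM);
    assert(model_collect_results(results, 3, false) == 0);
}
```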

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:13 +0800

4231a2353 fanotify: implement fanotify listener ordering

fanotify listeners need to be able to specify what types of operations
they are going to perform so they can be ordered appropriately between other
listeners doing other types of operations. They need this to be able to make
sure that things like hierarchical storage managers will get access to inodes
before processes which need the data. This patch defines 3 possible uses
which groups must indicate in the fanotify_init() flags.

FAN_CLASS_PRE_CONTENT
FAN_CLASS_CONTENT
FAN_CLASS_NOTIF

Groups will receive notification in that order. The order between 2 groups in
the same class is nondeterministic.

FAN_CLASS_PRE_CONTENT is intended to be used by listeners which need access to
the inode before they are certain that the inode contains its final data. A
hierarchical storage manager should choose to use this class.

FAN_CLASS_CONTENT is intended to be used by listeners which need access to the
inode after it contains its intended contents. This would be the appropriate
level for an AV solution or document control system.

FAN_CLASS_NOTIF is intended for normal async notification about access, much the
same as inotify and dnotify. Synchronous permissions events are not permitted
at this class.
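
Editor's sketch (not kernel code): the ordering can be modelled by giving each class a rank and sorting listeners by it. The enum, struct and function names are hypothetical; the real classes are selected through fanotify_init() flags, which are not used here. qsort() is not a stable sort, which happens to match the statement that the order within one class is not defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical ranks: a lower value is notified earlier. */
enum model_class {
    MODEL_CLASS_PRE_CONTENT = 0, /* e.g. hierarchical storage manager */
    MODEL_CLASS_CONTENT     = 1, /* e.g. AV scanner */
    MODEL_CLASS_NOTIF       = 2  /* plain async notification */
};

struct model_listener {
    const char *name;
    enum model_class cls;
};

static int model_by_class(const void *a, const void *b)
{
    const struct model_listener *la = a;
    const struct model_listener *lb = b;

    return (int)la->cls - (int)lb->cls;
}

/* Put listeners into notification order. */
void model_order_listeners(struct model_listener *l, size_t n)
{
    qsort(l, n, sizeof(*l), model_by_class);
}

/* Permission events are not allowed for the notification-only class. */
bool model_may_use_perm_events(enum model_class cls)
{
    return cls != MODEL_CLASS_NOTIF;
}

/* Small usage example. */
void model_order_demo(void)
{
    struct model_listener l[] = {
        { "logger",  MODEL_CLASS_NOTIF },
        { "scanner", MODEL_CLASS_CONTENT },
        { "hsm",     MODEL_CLASS_PRE_CONTENT },
    };

    model_order_listeners(l, 3);
    assert(l[0].cls == MODEL_CLASS_PRE_CONTENT);
    assert(l[1].cls == MODEL_CLASS_CONTENT);
    assert(l[2].cls == MODEL_CLASS_NOTIF);
    assert(!model_may_use_perm_events(l[2].cls));
}
```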

Signed-off-by: Eric Paris

Eric Paris
2010-10-29 05:22:13 +0800