18 Feb, 2014
1 commit
-
My rework of handling of notification events (namely commit 7053aee26a35
"fsnotify: do not share events between notification groups") broke
sending of cookies with inotify events. We didn't propagate the value
passed to fsnotify() properly and passed 4 uninitialized bytes to
userspace instead (so it is also an information leak). Sadly I didn't
notice this during my testing because inotify cookies aren't used very
much and LTP inotify tests ignore them.Fix the problem by passing the cookie value properly.
Fixes: 7053aee26a3548ebaba046ae2e52396ccf56ac6c
Reported-by: Vegard Nossum
Signed-off-by: Jan Kara
22 Jan, 2014
3 commits
-
We usually rely on the fact that struct members not specified in the
initializer are set to NULL. So do that with fsnotify function pointers
as well.Signed-off-by: Jan Kara
Reviewed-by: Christoph Hellwig
Cc: Eric Paris
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
After removing event structure creation from the generic layer there is
no reason for separate .should_send_event and .handle_event callbacks.
So just remove the first one.Signed-off-by: Jan Kara
Reviewed-by: Christoph Hellwig
Cc: Eric Paris
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently fsnotify framework creates one event structure for each
notification event and links this event into all interested notification
groups. This is done so that we save memory when several notification
groups are interested in the event. However the need for event
structure shared between inotify & fanotify bloats the event structure
so the result is often higher memory consumption.Another problem is that fsnotify framework keeps path references with
outstanding events so that fanotify can return open file descriptors
with its events. This has the undesirable effect that filesystem cannot
be unmounted while there are outstanding events - a regression for
inotify compared to a situation before it was converted to fsnotify
framework. For fanotify this problem is hard to avoid and users of
fanotify should kind of expect this behavior when they ask for file
descriptors from notified files.This patch changes fsnotify and its users to create separate event
structure for each group. This allows for much simpler code (~400 lines
removed by this patch) and also smaller event structures. For example
on 64-bit system original struct fsnotify_event consumes 120 bytes, plus
additional space for file name, additional 24 bytes for second and each
subsequent group linking the event, and additional 32 bytes for each
inotify group for private data. After the conversion inotify event
consumes 48 bytes plus space for file name which is considerably less
memory unless file names are long and there are several groups
interested in the events (both of which are uncommon). Fanotify event
fits in 56 bytes after the conversion (fanotify doesn't care about file
names so its events don't have to have it allocated). A win unless
there are four or more fanotify groups interested in the event.The conversion also solves the problem with unmount when only inotify is
used as we don't have to grab path references for inotify events.[hughd@google.com: fanotify: fix corruption preventing startup]
Signed-off-by: Jan Kara
Reviewed-by: Christoph Hellwig
Cc: Eric Paris
Cc: Al Viro
Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
13 Jun, 2013
1 commit
-
audit_add_tree_rule() must set 'rule->tree = NULL;' firstly, to protect
the rule itself freed in kill_rules().The reason is when it is killed, the 'rule' itself may have already
released, we should not access it. one example: we add a rule to an
inode, just at the same time the other task is deleting this inode.The work flow for adding a rule:
audit_receive() -> (need audit_cmd_mutex lock)
audit_receive_skb() ->
audit_receive_msg() ->
audit_receive_filter() ->
audit_add_rule() ->
audit_add_tree_rule() -> (need audit_filter_mutex lock)
...
unlock audit_filter_mutex
get_tree()
...
iterate_mounts() -> (iterate all related inodes)
tag_mount() ->
tag_trunk() ->
create_trunk() -> (assume it is 1st rule)
fsnotify_add_mark() ->
fsnotify_add_inode_mark() -> (add mark to inode->i_fsnotify_marks)
...
get_tree(); (each inode will get one)
...
lock audit_filter_mutexThe work flow for deleting an inode:
__destroy_inode() ->
fsnotify_inode_delete() ->
__fsnotify_inode_delete() ->
fsnotify_clear_marks_by_inode() -> (get mark from inode->i_fsnotify_marks)
fsnotify_destroy_mark() ->
fsnotify_destroy_mark_locked() ->
audit_tree_freeing_mark() ->
evict_chunk() ->
...
tree->goner = 1
...
kill_rules() -> (assume current->audit_context == NULL)
call_rcu() -> (rule->tree != NULL)
audit_free_rule_rcu() ->
audit_free_rule()
...
audit_schedule_prune() -> (assume current->audit_context == NULL)
kthread_run() -> (need audit_cmd_mutex and audit_filter_mutex lock)
prune_one() -> (delete it from prue_list)
put_tree(); (match the original get_tree above)Signed-off-by: Chen Gang
Cc: Eric Paris
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
30 Apr, 2013
1 commit
-
audit_trim_trees() calls get_tree(). If a failure occurs we must call
put_tree().[akpm@linux-foundation.org: run put_tree() before mutex_lock() for small scalability improvement]
Signed-off-by: Chen Gang
Cc: Al Viro
Cc: Eric Paris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Jan, 2013
1 commit
-
It's possible for audit_log_start() to return NULL. Handle it in the
various callers.Signed-off-by: Kees Cook
Cc: Al Viro
Cc: Eric Paris
Cc: Jeff Layton
Cc: "Eric W. Biederman"
Cc: Julien Tinnes
Cc: Will Drewry
Cc: Steve Grubb
Cc: Andrea Arcangeli
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Dec, 2012
1 commit
-
In fsnotify_destroy_mark() dont get the group from the passed mark anymore,
but pass the group itself as an additional parameter to the function.Signed-off-by: Lino Sanfilippo
Signed-off-by: Eric Paris
15 Aug, 2012
3 commits
-
Drop the initial reference by fsnotify_init_mark early instead of
audit_tree_freeing_mark() at destroy time.In the cases we destroy the mark before we drop the initial reference we need to
get rid of the get_mark that balances the put_mark in audit_tree_freeing_mark().Signed-off-by: Miklos Szeredi
-
Refcounting of fsnotify_mark in audit tree is broken. E.g:
refcount
create_chunk
alloc_chunk 1
fsnotify_add_mark 2untag_chunk
fsnotify_get_mark 3
fsnotify_destroy_mark
audit_tree_freeing_mark 2
fsnotify_put_mark 1
fsnotify_put_mark 0
via destroy_list
fsnotify_mark_destroy -1This was reported by various people as triggering Oops when stopping auditd.
We could just remove the put_mark from audit_tree_freeing_mark() but that would
break freeing via inode destruction. So this patch simply omits a put_mark
after calling destroy_mark or adds a get_mark before.The additional get_mark is necessary where there's no other put_mark after
fsnotify_destroy_mark() since it assumes that the caller is holding a reference
(or the inode is keeping the mark pinned, not the case here AFAICS).Signed-off-by: Miklos Szeredi
Reported-by: Valentin Avram
Reported-by: Peter Moody
Acked-by: Eric Paris
CC: stable@vger.kernel.org -
Don't do free_chunk() after fsnotify_add_mark(). That one does a delayed unref
via the destroy list and this results in use-after-free.Signed-off-by: Miklos Szeredi
Acked-by: Eric Paris
CC: stable@vger.kernel.org
14 Jul, 2012
1 commit
-
copy_tree() can theoretically fail in a case other than ENOMEM, but always
returns NULL which is interpreted by callers as -ENOMEM. Change it to return
an explicit error.Also change clone_mnt() for consistency and because union mounts will add new
error cases.Thanks to Andreas Gruenbacher for a bug fix.
[AV: folded braino fix by Dan Carpenter]Original-author: Valerie Aurora
Signed-off-by: David Howells
Cc: Valerie Aurora
Cc: Andreas Gruenbacher
Signed-off-by: Al Viro
21 Jul, 2011
1 commit
-
The rcu callback __put_tree() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(__put_tree).Signed-off-by: Lai Jiangshan
Signed-off-by: Paul E. McKenney
Cc: Al Viro
Cc: Eric Paris
Reviewed-by: Josh Triplett
31 Mar, 2011
1 commit
-
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi
30 Oct, 2010
1 commit
-
... while we are not holding spinlocks.
Signed-off-by: Al Viro
28 Jul, 2010
16 commits
-
fanotify currently, when given a vfsmount_mark will look up (if it exists)
the corresponding inode mark. This patch drops that lookup and uses the
mark provided.Signed-off-by: Eric Paris
-
should_send_event() and handle_event() will both need to look up the inode
event if they get a vfsmount event. Lets just pass both at the same time
since we have them both after walking the lists in lockstep.Signed-off-by: Eric Paris
-
The change to use srcu and walk the object list rather than the global
fsnotify_group list means that should_send_event is no longer needed for a
number of groups and can be simplified for others. Do that.Signed-off-by: Eric Paris
-
With the change of fsnotify to use srcu walking the marks list instead of
walking the global groups list we now know the mark in question. The code can
send the mark to the group's handling functions and the groups won't have to
find those marks themselves.Signed-off-by: Eric Paris
-
currently all marking is done by functions in inode-mark.c. Some of this
is pretty generic and should be instead done in a generic function and we
should only put the inode specific code in inode-mark.cSigned-off-by: Eric Paris
-
All callers to fsnotify_find_mark_entry() except one take and
release inode->i_lock around the call. Take the lock inside
fsnotify_find_mark_entry() instead.Signed-off-by: Andreas Gruenbacher
Signed-off-by: Eric Paris -
the _entry portion of fsnotify functions is useless. Drop it.
Signed-off-by: Eric Paris
-
The name is long and it serves no real purpose. So rename
fsnotify_mark_entry to just fsnotify_mark.Signed-off-by: Eric Paris
-
The addition of marks on vfs mounts will be simplified if the inode
specific parts of a mark and the vfsmnt specific parts of a mark are
actually in a union so naming can be easy. This patch just implements the
inode struct and the union.Signed-off-by: Eric Paris
-
To ensure that a group will not duplicate events when it receives it based
on the vfsmount and the inode should_send_event test we should distinguish
those two cases. We pass a vfsmount to this function so groups can make
their own determinations.Signed-off-by: Eric Paris
-
Nothing uses the mask argument to fsnotify_alloc_group. This patch drops
that argument.Signed-off-by: Eric Paris
-
fsnotify_obtain_group was intended to be able to find an already existing
group. Nothing uses that functionality. This just renames it to
fsnotify_alloc_group so it is clear what it is doing.Signed-off-by: Eric Paris
-
The original fsnotify interface has a group-num which was intended to be
able to find a group after it was added. I no longer think this is a
necessary thing to do and so we remove the group_num.Signed-off-by: Eric Paris
-
fanotify is going to need to look at file->private_data to know if an event
should be sent or not. This passes the data (which might be a file,
dentry, inode, or none) to the should_send function calls so fanotify can
get that information when availableSigned-off-by: Eric Paris
-
fanotify is only interested in event types which contain enough information
to open the original file in the context of the fanotify listener. Since
fanotify may not want to send events if that data isn't present we pass
the data type to the should_send_event function call so fanotify can express
its lack of interest.Signed-off-by: Eric Paris
-
Simply switch audit_trees from using inotify to using fsnotify for it's
inode pinning and disappearing act information.Signed-off-by: Eric Paris
30 Mar, 2010
1 commit
-
…it slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
04 Mar, 2010
2 commits
-
apply function to vfsmounts in set returned by collect_mounts(),
stop if it returns non-zero.Signed-off-by: Al Viro
-
Analog of is_subdir for vfsmount,dentry pairs, moved from audit_tree.c
Signed-off-by: Al Viro
20 Dec, 2009
2 commits
-
Several leaks in audit_tree didn't get caught by commit
318b6d3d7ddbcad3d6867e630711b8a705d873d7, including the leak on normal
exit in case of multiple rules refering to the same chunk.Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds -
... aka "Al had badly fscked up when writing that thing and nobody
noticed until Eric had fixed leaks that used to mask the breakage".The function essentially creates a copy of old array sans one element
and replaces the references to elements of original (they are on cyclic
lists) with those to corresponding elements of new one. After that the
old one is fair game for freeing.First of all, there's a dumb braino: when we get to list_replace_init we
use indices for wrong arrays - position in new one with the old array
and vice versa.Another bug is more subtle - termination condition is wrong if the
element to be excluded happens to be the last one. We shouldn't go
until we fill the new array, we should go until we'd finished the old
one. Otherwise the element we are trying to kill will remain on the
cyclic lists...That crap used to be masked by several leaks, so it was not quite
trivial to hit. Eric had fixed some of those leaks a while ago and the
shit had hit the fan...Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds
24 Jun, 2009
2 commits
-
If syscall removes the root of subtree being watched, we
definitely do not want the rules refering that subtree
to be destroyed without the syscall in question having
a chance to match them.Signed-off-by: Al Viro
-
A number of places in the audit system we send an op= followed by a string
that includes spaces. Somehow this works but it's just wrong. This patch
moves all of those that I could find to be quoted.Example:
Change From: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1
subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op=remove rule
key="number2" list=4 res=0Change To: type=CONFIG_CHANGE msg=audit(1244666690.117:31): auid=0 ses=1
subj=unconfined_u:unconfined_r:auditctl_t:s0-s0:c0.c1023 op="remove rule"
key="number2" list=4 res=0Signed-off-by: Eric Paris
12 Jun, 2009
1 commit
-
Signed-off-by: Al Viro
21 Apr, 2009
1 commit
-
is_under() will DTRT anyway. And yes, is_subdir() behaviour
is intentional.Signed-off-by: Al Viro