Eric Lee / smarc-fsl-linux-kernel

11 Dec, 2008

1 commit

6ee5a399d inotify: fix IN_ONESHOT unmount event watcher

On umount two events will be dispatched to the watcher:

1: inotify_dev_queue_event(.., IN_UNMOUNT,..)
2: remove_watch(watch, dev)
->inotify_dev_queue_event(.., IN_IGNORED, ..)

But if the watcher has the IN_ONESHOT bit set, then the watcher will be
released inside the first event, which results in accessing an invalid
object later. IMHO it is not a pure regression: this bug wasn't triggered
during the initial inotify interface testing phase because of another bug
in the IN_ONESHOT handling logic :)

commit ac74c00e499ed276a965e5b5600667d5dc04a84a
Author: Ulisses Furquim
Date: Fri Feb 8 04:18:16 2008 -0800
inotify: fix check for one-shot watches before destroying them
As the IN_ONESHOT bit is never set when an event is sent we must check it
in the watch's mask and not in the event's mask.

TESTCASE:
mkdir mnt
mount -ttmpfs none mnt
mkdir mnt/d
./inotify mnt/d&
umount mnt ## << lockup or crash here

TESTSOURCE:
/* gcc -oinotify inotify.c */
#include <stdio.h>
#include <unistd.h>
#include <sys/inotify.h>

int main(int argc, char **argv)
{
	char buf[1024];
	struct inotify_event *ie;
	char *p;
	int i;
	ssize_t l;

	p = argv[1];
	i = inotify_init();
	/* watch for every event; the ~0 mask also sets IN_ONESHOT */
	inotify_add_watch(i, p, ~0);

	/* blocks until the first event arrives */
	l = read(i, buf, sizeof(buf));
	printf("read %zd bytes\n", l);
	ie = (struct inotify_event *) buf;
	printf("event mask: %u\n", ie->mask);
	return 0;
}

Signed-off-by: Dmitri Monakhov
Cc: John McCutchan
Cc: Al Viro
Cc: Robert Love
Cc: Ulisses Furquim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dmitri Monakhov
2008-12-11 00:01:53 +0800

16 Nov, 2008

1 commit

8f7b0ba1c Fix inotify watch removal/umount races

Inotify watch removals suck violently.

To kick the watch out we need (in this order) inode->inotify_mutex and
ih->mutex. That's fine if we have a hold on inode; however, for all
other cases we need to make damn sure we don't race with umount. We can
*NOT* just grab a reference to a watch - inotify_unmount_inodes() will
happily sail past it and we'll end with reference to inode potentially
outliving its superblock.

Ideally we just want to grab an active reference to superblock if we
can; that will make sure we won't go into inotify_unmount_inodes() until
we are done. Cleanup is just deactivate_super().

However, that leaves a messy case - what if we *are* racing with
umount() and active references to superblock can't be acquired anymore?
We can bump ->s_count, grab ->s_umount, which will almost certainly wait
until the superblock is shut down and the watch in question is pining
for fjords. That's fine, but there is a problem - we might have hit the
window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e.
the moment when superblock is past the point of no return and is heading
for shutdown) and the moment when deactivate_super() acquires
->s_umount.

We could just do drop_super(), yield() and retry, but that's rather
antisocial and this stuff is luser-triggerable. OTOH, having grabbed
->s_umount and having found that we'd got there first (i.e. that
->s_root is non-NULL) we know that we won't race with
inotify_unmount_inodes().

So we could grab a reference to watch and do the rest as above, just
with drop_super() instead of deactivate_super(), right? Wrong. We had
to drop ih->mutex before we could grab ->s_umount. So the watch
could've been gone already.

That still can be dealt with - we need to save watch->wd, do idr_find()
and compare its result with our pointer. If they match, we either have
the damn thing still alive or we'd lost not one but two races at once,
the watch had been killed and a new one got created with the same ->wd
at the same address. That couldn't have happened in inotify_destroy(),
but inotify_rm_wd() could run into that. Still, "new one got created"
is not a problem - we have every right to kill it or leave it alone,
whatever's more convenient.

So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
"grab it and kill it" check. If it's been our original watch, we are
fine, if it's a newcomer - nevermind, just pretend that we'd won the
race and kill the fscker anyway; we are safe since we know that its
superblock won't be going away.
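
The revalidation step described above can be sketched in user space. This is only an illustration under stated assumptions: the structs are simplified stand-ins, the array replaces the real idr, and toy_idr_find(), may_kill_watch() and revalidate_demo() are hypothetical names, not kernel functions.

```c
#include <stddef.h>

#define MAX_WD 16

/* Simplified stand-ins for the kernel structures (illustration only). */
struct super_block { int id; };
struct inode { struct super_block *i_sb; };
struct watch { int wd; struct inode *inode; };

/* Toy replacement for the idr: maps a watch descriptor to a watch. */
struct handle { struct watch *slots[MAX_WD]; };

struct watch *toy_idr_find(struct handle *ih, int wd)
{
    if (wd < 0 || wd >= MAX_WD)
        return NULL;
    return ih->slots[wd];
}

/*
 * Called after the lock was dropped and re-taken. 'watch' may be stale,
 * so it is only compared by address; fields are read through 'found'.
 */
int may_kill_watch(struct handle *ih, int saved_wd,
                   struct watch *watch, struct super_block *sb)
{
    struct watch *found = toy_idr_find(ih, saved_wd);

    return found != NULL && found == watch && found->inode->i_sb == sb;
}

/* Returns 1 if the check accepts a live watch and rejects a removed one. */
int revalidate_demo(void)
{
    struct super_block sb = { 1 };
    struct inode ino = { &sb };
    struct watch w = { 3, &ino };
    struct handle ih = { { NULL } };
    int saved_wd, alive, gone;

    ih.slots[w.wd] = &w;
    saved_wd = w.wd;                 /* saved before dropping the lock */

    alive = may_kill_watch(&ih, saved_wd, &w, &sb);

    ih.slots[saved_wd] = NULL;       /* simulate a concurrent removal */
    gone = may_kill_watch(&ih, saved_wd, &w, &sb);

    return alive == 1 && gone == 0;
}
```

The point of comparing the pointer before touching any field is that the saved pointer alone proves nothing once the lock has been dropped; only the lookup result is trusted.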

And yes, this is far beyond mere "not very pretty"; so's the entire
concept of inotify to start with.

Signed-off-by: Al Viro
Acked-by: Greg KH
Signed-off-by: Linus Torvalds

Al Viro
2008-11-16 04:26:44 +0800

07 Feb, 2008

2 commits

0d71bd599 inotify: remove debug code

The inotify debugging code is supposed to verify that the
DCACHE_INOTIFY_PARENT_WATCHED scalability optimisation does not result in
notifications getting lost nor extra needless locking generated.

Unfortunately there are also some races in the debugging code. And it isn't
very good at finding problems anyway. So remove it for now.

Signed-off-by: Nick Piggin
Cc: Robert Love
Cc: John McCutchan
Cc: Jan Kara
Cc: Yan Zheng
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2008-02-07 02:41:07 +0800
d599e36a9 inotify: fix race

There is a race between setting an inode's children's "parent watched" flag
when placing the first watch on a parent, and instantiating new children of
that parent: a child could miss having its flags set by
set_dentry_child_flags, but then inotify_d_instantiate might still see
!inotify_inode_watched.

The solution is to set_dentry_child_flags after adding the watch. Locking is
taken care of, because both set_dentry_child_flags and inotify_d_instantiate
hold dcache_lock and child->d_locks.

Signed-off-by: Nick Piggin
Cc: Robert Love
Cc: John McCutchan
Cc: Jan Kara
Cc: Yan Zheng
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2008-02-07 02:41:06 +0800

21 Oct, 2007

2 commits

455434d45 [PATCH] new helper - inotify_evict_watch()

Kicks the watch out without dropping it. Called under ->inotify_mutex

Signed-off-by: Al Viro

Al Viro
2007-10-21 14:37:38 +0800
b9efe8a23 [PATCH] new helper - inotify_clone_watch()

Signed-off-by: Al Viro

Al Viro
2007-10-21 14:37:32 +0800

09 May, 2007

1 commit

b5e618181 Introduce a handy list_first_entry macro

There are many places in the kernel where a construction like

foo = list_entry(head->next, struct foo_struct, list);

is used.
The code might look more descriptive and neat if using the macro

list_first_entry(head, type, member) \
list_entry((head)->next, type, member)

Here is the macro itself and examples of its usage in the generic code.
If it turns out to be useful, I can prepare a set of patches to
inject it into arch-specific code, drivers, networking, etc.
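
A minimal user-space sketch of the macro in use follows. The list_head, list_entry, list_init() and list_add_tail() definitions here are simplified stand-ins written for this example rather than the kernel's own, and first_value_demo() is a hypothetical helper.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel's list primitives. */
struct list_head {
    struct list_head *next, *prev;
};

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

#define list_first_entry(head, type, member) \
    list_entry((head)->next, type, member)

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert 'item' at the tail of the list anchored at 'head'. */
void list_add_tail(struct list_head *item, struct list_head *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Payload type used only for this illustration. */
struct foo_struct {
    int value;
    struct list_head list;
};

/* Build a two-element list and return the value of its first entry. */
int first_value_demo(void)
{
    struct list_head head;
    struct foo_struct a = { .value = 1 };
    struct foo_struct b = { .value = 2 };
    struct foo_struct *foo;

    list_init(&head);
    list_add_tail(&a.list, &head);
    list_add_tail(&b.list, &head);

    /* Same as list_entry(head.next, struct foo_struct, list). */
    foo = list_first_entry(&head, struct foo_struct, list);
    return foo->value;
}
```

The macro assumes a non-empty list: on an empty list, head->next points back at the head itself, so the resulting pointer does not refer to a real entry.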

Signed-off-by: Pavel Emelianov
Signed-off-by: Kirill Korotaev
Cc: Randy Dunlap
Cc: Andi Kleen
Cc: Zach Brown
Cc: Davide Libenzi
Cc: John McCutchan
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: john stultz
Cc: Ram Pai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelianov
2007-05-09 02:15:11 +0800

04 Dec, 2006

1 commit

914e26379 [PATCH] severing fs.h, radix-tree.h -> sched.h

Signed-off-by: Al Viro

Al Viro
2006-12-04 15:00:24 +0800

20 Jun, 2006

4 commits

3ca10067f [PATCH] inotify (4/5): allow watch removal from event handler

Allow callers to remove watches from their event handler via
inotify_remove_watch_locked(). This functionality can be used to
achieve IN_ONESHOT-like functionality for a subset of events in the
mask.

Signed-off-by: Amy Griffis
Acked-by: Robert Love
Acked-by: John McCutchan
Signed-off-by: Al Viro

Amy Griffis
2006-06-20 17:25:19 +0800
a9dc971d3 [PATCH] inotify (3/5): add interfaces to kernel API

Add inotify_init_watch() so caller can use inotify_watch refcounts
before calling inotify_add_watch().

Add inotify_find_watch() to find an existing watch for an (ih,inode)
pair. This is similar to inotify_find_update_watch(), but does not
update the watch's mask if one is found.

Add inotify_rm_watch() to remove a watch via the watch pointer instead
of the watch descriptor.

Signed-off-by: Amy Griffis
Acked-by: Robert Love
Acked-by: John McCutchan
Signed-off-by: Al Viro

Amy Griffis
2006-06-20 17:25:18 +0800
7c2977228 [PATCH] inotify (2/5): add name's inode to event handler

When an inotify event includes a dentry name, also include the inode
associated with that name.

Signed-off-by: Amy Griffis
Acked-by: Robert Love
Acked-by: John McCutchan
Signed-off-by: Al Viro

Amy Griffis
2006-06-20 17:25:18 +0800
2d9048e20 [PATCH] inotify (1/5): split kernel API from userspace support

The following series of patches introduces a kernel API for inotify,
making it possible for kernel modules to benefit from inotify's
mechanism for watching inodes. With these patches, inotify will
maintain for each caller a list of watches (via an embedded struct
inotify_watch), where each inotify_watch is associated with a
corresponding struct inode. The caller registers an event handler and
specifies for which filesystem events their event handler should be
called per inotify_watch.

Signed-off-by: Amy Griffis
Acked-by: Robert Love
Acked-by: John McCutchan
Signed-off-by: Al Viro

Amy Griffis
2006-06-20 17:25:17 +0800

22 May, 2006

2 commits

d66fd908a [PATCH] fix NULL dereference in inotify_ignore

Don't reassign to watch. If idr_find() returns NULL, then
put_inotify_watch() will choke.

Signed-off-by: Amy Griffis
Cc: John McCutchan
Cc: Robert Love
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Amy Griffis
2006-05-22 03:59:18 +0800
66055a4e7 [PATCH] fix race in inotify_release

While doing some inotify stress testing, I hit the following race. In
inotify_release(), it's possible for a watch to be removed from the lists
in between dropping dev->mutex and taking inode->inotify_mutex. The
reference we hold prevents the watch from being freed, but not from being
removed.

Checking the dev's idr mapping will prevent a double list_del of the
same watch.

Signed-off-by: Amy Griffis
Acked-by: John McCutchan
Cc: Robert Love
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Amy Griffis
2006-05-22 03:59:18 +0800

11 Apr, 2006

1 commit

091e881d0 [PATCH] inotify: check for NULL inode in inotify_d_instantiate

The spufs file system creates files in a directory before instantiating the
directory itself, which causes a NULL pointer access in
inotify_d_instantiate since c32ccd87bfd1414b0aabfcd8dbc7539ad23bcbaa.

I'd like to keep this behavior since it means that the user will not have
access to files in the directory before I know that I succeed in creating
everything in it. This patch adds a simple check for the inode to keep
that working.

Signed-off-by: Arnd Bergmann
Acked-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arnd Bergmann
2006-04-11 21:18:45 +0800

29 Mar, 2006

1 commit

4b6f5d20b [PATCH] Make most file operations structs in fs/ const

This is a conversion to make the various file_operations structs in fs/
const. Basically a regexp job, with a few manual fixups.

The goal is both to increase correctness (harder to accidentally write to
shared data structures) and to reduce the false sharing of cachelines with
things that get dirty in .data (while .rodata is nicely read-only and thus
cache clean).

Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arjan van de Ven
2006-03-29 01:16:06 +0800

27 Mar, 2006

1 commit

fa3536cc1 [PATCH] Use __read_mostly on some hot fs variables

While hunting with oprofile on an SMP platform, I discovered that dentry
lookups were slowed down because d_hash_mask, d_hash_shift and
dentry_hashtable were in a cache line that contained inodes_stat. So each
time inodes_stat is changed by a cpu, other cpus have to refill their
cache line.

This patch moves some variables to the __read_mostly section, in order to
avoid false sharing. RCU dentry lookups can go full speed.

Signed-off-by: Eric Dumazet
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Dumazet
2006-03-27 00:56:56 +0800

26 Mar, 2006

1 commit

c32ccd87b [PATCH] inotify: lock avoidance with parent watch status in dentry

Previous inotify work avoidance is good when inotify is completely unused,
but it breaks down if even a single watch is in place anywhere in the
system. Robin Holt noticed that udev is one such culprit - it slows down a
512-thread application on a 512 CPU system from 6 seconds to 22 minutes.

Solve this by adding a flag in the dentry that tells inotify whether or not
its parent inode has a watch on it. Event queueing to parent will skip
taking locks if this flag is cleared. Setting and clearing of this flag on
all child dentries versus event delivery: this is no worse in terms of race
cases, and that was shown to be equivalent to always performing the check.

The essential behaviour is that activity occurring _after_ a watch has been
added and _before_ it has been removed, will generate events.

Signed-off-by: Nick Piggin
Cc: Robert Love
Cc: John McCutchan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2006-03-26 00:22:53 +0800

23 Mar, 2006

2 commits

f24075bd0 [PATCH] sem2mutex: iprune

Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ingo Molnar
2006-03-23 23:38:12 +0800
d4f9af9da [PATCH] sem2mutex: inotify

Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Ingo Molnar
Cc: John McCutchan
Signed-off-by: Andrew Morton
Acked-by: Robert Love
Signed-off-by: Linus Torvalds

Ingo Molnar
2006-03-23 23:38:11 +0800

08 Feb, 2006

1 commit

b5173119f [PATCH] inotify: fix one-shot support

Fix one-shot support in inotify. We currently drop the IN_ONESHOT flag
during watch addition. Fix is to not do that.

Signed-off-by: Robert Love
Cc: John McCutchan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Robert Love
2006-02-08 08:12:33 +0800

19 Jan, 2006

1 commit

5131cf154 [PATCH] add missing syscall declarations

All standard system calls should be declared in include/linux/syscalls.h.

Add some of the new additions that were previously missed.

Signed-off-by: Arnd Bergmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arnd Bergmann
2006-01-19 11:20:22 +0800

13 Dec, 2005

1 commit

8140a5005 [PATCH] inotify: add two inotify_add_watch flags

The below patch lets userspace have more control over the inodes that
inotify will watch. It introduces two new flags.

IN_ONLYDIR -- only watch the inode if it is a directory.
This is needed to avoid the race that can occur when we want to be
sure that we are watching a directory.

IN_DONT_FOLLOW -- don't follow a symlink. In combination
with IN_ONLYDIR we can make sure that we don't watch the target of
symlinks.

The issues the flags fix came up when writing the gnome-vfs inotify
backend. Default behaviour is unchanged.
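
As a rough user-space illustration of the two flags, the sketch below asks for a directory-only watch that does not follow a final symlink, then checks what happens when the path is a regular file. watch_dir_only() and onlydir_on_regular_file() are hypothetical helper names, and the /tmp location is an assumption.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Watch 'path' only if it is a directory, without following a symlink
 * in the final path component. Returns the watch descriptor, or -1
 * with errno set.
 */
int watch_dir_only(int ifd, const char *path)
{
    return inotify_add_watch(ifd, path,
                             IN_CREATE | IN_DELETE |
                             IN_ONLYDIR | IN_DONT_FOLLOW);
}

/*
 * Try the helper on a freshly created regular file. Returns the errno
 * seen on failure, 0 if the watch was unexpectedly added, or -1 if the
 * setup itself failed.
 */
int onlydir_on_regular_file(void)
{
    char tmpl[] = "/tmp/inotify-onlydir-XXXXXX";
    int fd, ifd, wd, err;

    fd = mkstemp(tmpl);
    if (fd < 0)
        return -1;
    close(fd);

    ifd = inotify_init();
    if (ifd < 0) {
        unlink(tmpl);
        return -1;
    }

    wd = watch_dir_only(ifd, tmpl);
    err = (wd < 0) ? errno : 0;

    close(ifd);
    unlink(tmpl);
    return err;
}
```

On Linux the inotify_add_watch(2) manual page documents ENOTDIR for the case where IN_ONLYDIR is set and the path is not a directory, which is the result this helper is expected to report.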

Signed-off-by: John McCutchan
Acked-by: Robert Love
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

John McCutchan
2005-12-13 00:57:43 +0800

09 Nov, 2005

1 commit

e4543eddf [PATCH] add a vfs_permission helper

Most permission() calls have a struct nameidata * available. This helper
takes that as an argument and thus makes sure we pass it down for lookup
intents and prepares for per-mount read-only support where we need a struct
vfsmount for checking whether a file is writeable.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2005-11-09 23:55:58 +0800

24 Oct, 2005

1 commit

8d3b35914 [PATCH] inotify/idr leak fix

Fix a bug which was reported and diagnosed by Stefan Jones.

IDR trees include a cache of idr_layer objects. There's no way to destroy
this cache, so when we discard an overall idr tree we end up leaking some
memory.

Add and use idr_destroy() for this. v9fs and infiniband also need to use
idr_destroy() to avoid leaks.

Or, we make the cache global, like radix_tree_preload(). Which is probably
better. Later.

Cc: Eric Van Hensbergen
Cc: Roland Dreier
Cc: Robert Love
Cc: John McCutchan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2005-10-24 07:38:39 +0800

08 Sep, 2005

2 commits

7ea6040b0 [PATCH] inotify: fix event loss on hardlinked files

People have run into a problem when they do this:

watch (file1, all_events);
watch (file2, some_events);

if file2 is a hard link to file1, some events will be missed because by
default we replace the mask. The patch below adds a flag IN_MASK_ADD which
will cause inotify to add to the existing mask if present.
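
The hard-link case can be reproduced from user space roughly as follows. This is a sketch under assumptions: mask_add_demo() is a hypothetical helper, the /tmp location and file names are arbitrary, and the filesystem is assumed to support hard links.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Watch a file and a hard link to it through the same inotify instance.
 * Returns 1 if both calls yield the same watch descriptor, 0 if they
 * differ, or -1 if the setup failed.
 */
int mask_add_demo(void)
{
    char dir[] = "/tmp/inotify-maskadd-XXXXXX";
    char file1[64], file2[64];
    int fd, ifd, wd1, wd2, same = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(file1, sizeof(file1), "%s/file1", dir);
    snprintf(file2, sizeof(file2), "%s/file2", dir);

    fd = open(file1, O_CREAT | O_WRONLY, 0600);
    if (fd < 0)
        goto out_dir;
    close(fd);
    if (link(file1, file2) < 0)
        goto out_file1;

    ifd = inotify_init();
    if (ifd < 0)
        goto out_file2;

    wd1 = inotify_add_watch(ifd, file1, IN_MODIFY | IN_ATTRIB);
    /* Same inode: without IN_MASK_ADD this call would replace the mask. */
    wd2 = inotify_add_watch(ifd, file2, IN_CLOSE_WRITE | IN_MASK_ADD);
    if (wd1 >= 0 && wd2 >= 0)
        same = (wd1 == wd2);

    close(ifd);
out_file2:
    unlink(file2);
out_file1:
    unlink(file1);
out_dir:
    rmdir(dir);
    return same;
}
```

Because both names refer to one inode, the second call updates the existing watch rather than creating a new one; IN_MASK_ADD makes that update an OR of the two masks instead of a replacement.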

Signed-off-by: John McCutchan
Signed-off-by: Robert Love
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

John McCutchan
2005-09-08 07:57:39 +0800
820249baf [PATCH] inotify speedup

Bypass an inotify-related fastpath spinlock and several function calls on
systems which have no inotify watches registered.

Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

John McCutchan
2005-09-08 07:57:19 +0800

27 Aug, 2005

1 commit

7c657f2f2 [PATCH] Document idr_get_new_above() semantics, update inotify

There is an off by one problem with idr_get_new_above.

The comment and function name suggest that it will return an id >
starting_id, but it actually returned an id >= starting_id, and kernel
callers other than inotify treated it as such.

The patch below fixes the comment, and fixes inotify's usage. The
function name still doesn't match the behaviour, but it never did.

Signed-off-by: John McCutchan
Signed-off-by: Linus Torvalds

John McCutchan
2005-08-27 02:32:57 +0800

16 Aug, 2005

1 commit

0bf955ce9 [PATCH] inotify: fix idr_get_new_above usage

We are saving the wrong thing in ->last_wd. We want the wd, not the
return value.

Signed-off-by: Robert Love
Signed-off-by: Linus Torvalds

Robert Love
2005-08-16 00:48:31 +0800

02 Aug, 2005

1 commit

b9c55d29e [PATCH] inotify: fix race between the kernel and user space

When you rm a watch, an IN_IGNORED event is sent down the event queue
with the watch descriptor that you just rm'd.

If you then add a watch you could get the ignored watch's wd and if you
haven't read the entire event queue, user space will think that its
newly created watch was just ignored.

To avoid this problem we just use idr_get_new_above instead of
idr_get_new.
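
The user-visible effect can be probed with a small sketch like the one below. wd_reused_after_remove() is a hypothetical helper, watching /tmp is an assumption, and the exact descriptor values depend on the kernel's allocator, so the function only reports whether the old descriptor came straight back.

```c
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Add a watch, remove it (which queues IN_IGNORED for its descriptor),
 * then add a watch again without draining the event queue.
 * Returns 1 if the removed descriptor was handed out again, 0 if a
 * different descriptor was returned, or -1 on setup failure.
 */
int wd_reused_after_remove(void)
{
    int ifd, wd1, wd2;

    ifd = inotify_init();
    if (ifd < 0)
        return -1;

    wd1 = inotify_add_watch(ifd, "/tmp", IN_CREATE);
    if (wd1 < 0) {
        close(ifd);
        return -1;
    }

    /* The IN_IGNORED event for wd1 stays unread in the queue. */
    inotify_rm_watch(ifd, wd1);

    wd2 = inotify_add_watch(ifd, "/tmp", IN_CREATE);
    close(ifd);
    if (wd2 < 0)
        return -1;

    return wd1 == wd2;
}
```

If the descriptor were reused here, a reader processing the pending IN_IGNORED event could not tell whether it referred to the removed watch or the new one, which is the confusion the commit aims to avoid.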

Signed-off-by: John McCutchan
Signed-off-by: Robert Love
Signed-off-by: Linus Torvalds

John McCutchan
2005-08-02 00:16:53 +0800

27 Jul, 2005

7 commits

89373de7d [PATCH] inotify: fix oops fix

Cc: Robert Love
Cc: John McCutchan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2005-07-27 05:34:18 +0800
e5ca844a9 [PATCH] inotify: check retval in init

Check for (unlikely) errors in the filesystem initialization stuff in
our module_init() function.

Signed-off-by: Robert Love
Signed-off-by: John McCutchan
Signed-off-by: Linus Torvalds

Robert Love
2005-07-27 04:37:22 +0800
1b2ccf0cc [PATCH] inotify: change default limits

Change default inotify limits: Maximum instances per user to 128 and
maximum events per queue to 16k. The max instances used to be 128; the
change to 8 was a mistake. Memory consumption is fine.

Signed-off-by: Robert Love
Signed-off-by: John McCutchan
Signed-off-by: Linus Torvalds

Robert Love
2005-07-27 04:37:22 +0800
5eb22cbcd [PATCH] inotify: exit path cleanups

Handle error out paths better.

Signed-off-by: Robert Love
Signed-off-by: John McCutchan
Signed-off-by: Linus Torvalds

Robert Love
2005-07-27 04:37:22 +0800
783bc29bb [PATCH] inotify: oops fix

Bug fix: Ensure that the fd passed to inotify_add_watch() and
inotify_rm_watch() belongs to inotify.
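
From user space, the check shows up as an error when a descriptor of another kind is passed in. The sketch below uses a pipe as the non-inotify descriptor; add_watch_on_pipe_errno() is a hypothetical helper and the watched path is arbitrary.

```c
#include <errno.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Call inotify_add_watch() on the read end of a pipe. Returns the errno
 * seen on failure, 0 if the call unexpectedly succeeded, or -1 if the
 * pipe could not be created.
 */
int add_watch_on_pipe_errno(void)
{
    int p[2];
    int wd, err;

    if (pipe(p) < 0)
        return -1;

    /* p[0] is a valid descriptor, but it does not refer to inotify. */
    wd = inotify_add_watch(p[0], "/tmp", IN_CREATE);
    err = (wd < 0) ? errno : 0;

    close(p[0]);
    close(p[1]);
    return err;
}
```

The inotify_add_watch(2) manual page lists EINVAL for a descriptor that is not an inotify descriptor, so that is the value this helper is expected to return on current kernels.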

Signed-off-by: Robert Love
Signed-off-by: John McCutchan
Signed-off-by: Linus Torvalds

Robert Love
2005-07-27 04:37:21 +0800
33ea2f52b [PATCH] inotify: use fget_light

As an optimization, use fget_light() and fput_light() where possible.

Signed-off-by: Robert Love
Signed-off-by: John McCutchan
Signed-off-by: Linus Torvalds

Robert Love
2005-07-27 04:31:57 +0800
b680716ed [PATCH] inotify: misc. cleanup

Miscellaneous invariant clean up, comment fixes, and so on. Trivial
stuff.

Signed-off-by: Robert Love
Signed-off-by: John McCutchan
Signed-off-by: Linus Torvalds

Robert Love
2005-07-27 04:31:57 +0800

14 Jul, 2005

2 commits

9a556e890 [PATCH] inotify: misc cleanup

Really simple, basic cleanup.

Signed-off-by: Robert Love
Signed-off-by: Linus Torvalds

Robert Love
2005-07-14 02:09:31 +0800
0399cb08c [PATCH] inotify: move sysctl

This moves the inotify sysctl knobs to "/proc/sys/fs/inotify" from
"/proc/sys/fs". Also some related cleanup.

Signed-off-by: Robert Love
Signed-off-by: Linus Torvalds

Robert Love
2005-07-14 02:09:31 +0800

13 Jul, 2005

1 commit

0eeca2830 [PATCH] inotify

inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:

* dnotify requires the opening of one fd per each directory
that you intend to watch. This quickly results in too many
open files and pins removable media, preventing unmount.
* dnotify is directory-based. You only learn about changes to
directories. Sure, a change to a file in a directory affects
the directory, but you are then forced to keep a cache of
stat structures.
* dnotify's interface to user-space is awful. Signals?

inotify provides a more usable, simple, powerful solution to file change
notification:

* inotify's interface is a system call that returns a fd, not SIGIO.
You get a single fd, which is select()-able.
* inotify has an event that says "the filesystem that the item
you were watching is on was unmounted."
* inotify can watch directories or files.
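
The single pollable descriptor can be exercised with a short sketch. It uses poll() rather than select(), which works on the same descriptor; wait_for_create_event() is a hypothetical helper, and the temporary directory under /tmp and the one-second timeout are assumptions.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

/*
 * Watch a temporary directory, create a file in it, and wait for the
 * event with poll(). Returns 1 if the first event read is IN_CREATE,
 * 0 if it is some other event, or -1 on failure or timeout.
 */
int wait_for_create_event(void)
{
    char dir[] = "/tmp/inotify-poll-XXXXXX";
    char path[64];
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    const struct inotify_event *ev;
    struct pollfd pfd;
    ssize_t len;
    int ifd, fd, result = -1;

    if (mkdtemp(dir) == NULL)
        return -1;

    ifd = inotify_init();
    if (ifd < 0)
        goto out_dir;
    if (inotify_add_watch(ifd, dir, IN_CREATE) < 0)
        goto out_fd;

    /* Trigger an event by creating a file inside the watched directory. */
    snprintf(path, sizeof(path), "%s/new", dir);
    fd = open(path, O_CREAT | O_WRONLY, 0600);
    if (fd < 0)
        goto out_fd;
    close(fd);

    pfd.fd = ifd;
    pfd.events = POLLIN;
    if (poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLIN)) {
        len = read(ifd, buf, sizeof(buf));
        if (len >= (ssize_t)sizeof(*ev)) {
            ev = (const struct inotify_event *)buf;
            result = (ev->mask & IN_CREATE) ? 1 : 0;
        }
    }

    unlink(path);
out_fd:
    close(ifd);
out_dir:
    rmdir(dir);
    return result;
}
```

One descriptor covers every watch added to the instance, so an event loop needs only a single entry in its poll set regardless of how many files or directories are being watched.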

Inotify is currently used by Beagle (a desktop search infrastructure),
Gamin (a FAM replacement), and other projects.

See Documentation/filesystems/inotify.txt.

Signed-off-by: Robert Love
Cc: John McCutchan
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Robert Love
2005-07-13 11:38:38 +0800