12 Oct, 2012

2 commits

  • Pull v9fs update from Eric Van Hensbergen.

    * tag 'for-linus-merge-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
    9P: Fix race between p9_write_work() and p9_fd_request()
    9P: Fix race in p9_write_work()
    9P: fix test at the end of p9_write_work()
    9P: Fix race in p9_read_work()
    9p: don't use __getname/__putname for uname/aname
    net/9p: Check errno validity
    fs/9p: avoid debug OOPS when reading a long symlink

    Linus Torvalds
     
  • Race scenario:

    thread A thread B

    p9_write_work() p9_fd_request()

    if (list_empty
    (&m->unsent_req_list))
    ...

    spin_lock(&client->lock);
    req->status = REQ_STATUS_UNSENT;
    list_add_tail(..., &m->unsent_req_list);
    spin_unlock(&client->lock);
    ....
    if (n & POLLOUT &&
    !test_and_set_bit(Wworksched, &m->wsched)
    schedule_work(&m->wq);
    --> not done because Wworksched is set

    clear_bit(Wworksched, &m->wsched);
    return;

    --> nobody will take care of sending the new request.

    This is not very likely to happen though, because p9_write_work()
    being called with an empty unsent_req_list is not frequent.
    But this also means that taking the lock earlier will not be costly.

    Signed-off-by: Simon Derr
    Signed-off-by: Eric Van Hensbergen

    Simon Derr
     

03 Oct, 2012

1 commit

  • Pull vfs update from Al Viro:

    - big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c spiltoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     

27 Sep, 2012

1 commit

  • Both modular callers of sock_map_fd() had been buggy; sctp one leaks
    descriptor and file if copy_to_user() fails, 9p one shouldn't be
    exposing file in the descriptor table at all.

    Switch both to sock_alloc_file(), export it, unexport sock_map_fd() and
    make it static.

    Signed-off-by: Al Viro

    Al Viro
     

18 Sep, 2012

3 commits

  • See previous commit about p9_read_work() for details.

    This fixes a similar race between p9_write_work() and p9_poll_mux()

    Signed-off-by: Simon Derr
    Signed-off-by: Eric Van Hensbergen

    Simon Derr
     
  • At the end of p9_write_work() we want to test if there is still data to send.
    This means:
    - either the current request still has data to send (wsize != 0)
    - or there are requests in the unsent queue

    Signed-off-by: Simon Derr
    Signed-off-by: Eric Van Hensbergen

    Simon Derr
     
  • Race scenario between p9_read_work() and p9_poll_mux()

    Data arrive, Rworksched is set, p9_read_work() is called.

    thread A thread B

    p9_read_work()
    .
    reads data
    .
    checks if new data ready. No.
    .
    gets preempted
    .
    More data arrive, p9_poll_mux() is called. .
    .
    .
    p9_poll_mux() .
    .
    if (!test_and_set_bit(Rworksched, .
    &m->wsched)) { .
    schedule_work(&m->rq); .
    } .
    .
    -> does not schedule work because .
    Rworksched is set .
    .
    clear_bit(Rworksched, &m->wsched);
    return;

    No work has been scheduled, and yet data are waiting.

    Currently p9_read_work() checks if there is data to read,
    and if not, it clears Rworksched.

    I think it should clear Rworksched first, and then check if there is data to read.

    Signed-off-by: Simon Derr
    Signed-off-by: Eric Van Hensbergen

    Simon Derr
     

21 Aug, 2012

1 commit

  • flush[_delayed]_work_sync() are now spurious. Mark them deprecated
    and convert all users to flush[_delayed]_work().

    If you're cc'd and wondering what's going on: Now all workqueues are
    non-reentrant and the regular flushes guarantee that the work item is
    not pending or running on any CPU on return, so there's no reason to
    use the sync flushes at all and they're going away.

    This patch doesn't make any functional difference.

    Signed-off-by: Tejun Heo
    Cc: Russell King
    Cc: Paul Mundt
    Cc: Ian Campbell
    Cc: Jens Axboe
    Cc: Mattia Dongili
    Cc: Kent Yoder
    Cc: David Airlie
    Cc: Jiri Kosina
    Cc: Karsten Keil
    Cc: Bryan Wu
    Cc: Benjamin Herrenschmidt
    Cc: Alasdair Kergon
    Cc: Mauro Carvalho Chehab
    Cc: Florian Tobias Schandinat
    Cc: David Woodhouse
    Cc: "David S. Miller"
    Cc: linux-wireless@vger.kernel.org
    Cc: Anton Vorontsov
    Cc: Sangbeom Kim
    Cc: "James E.J. Bottomley"
    Cc: Greg Kroah-Hartman
    Cc: Eric Van Hensbergen
    Cc: Takashi Iwai
    Cc: Steven Whitehouse
    Cc: Petr Vandrovec
    Cc: Mark Fasheh
    Cc: Christoph Hellwig
    Cc: Avi Kivity

    Tejun Heo
     

16 Apr, 2012

1 commit


06 Jan, 2012

1 commit

  • Reduce object size by deduplicating formats.

    Use vsprintf extension %pV.
    Rename P9_DPRINTK uses to p9_debug, align arguments.
    Add function for _p9_debug and macro to add __func__.
    Add missing "\n"s to p9_debug uses.
    Remove embedded function names as p9_debug adds it.
    Remove P9_EPRINTK macro and convert use to pr_.
    Add and use pr_fmt and pr_.

    $ size fs/9p/built-in.o*
    text data bss dec hex filename
    62133 984 16000 79117 1350d fs/9p/built-in.o.new
    67342 984 16928 85254 14d06 fs/9p/built-in.o.old
    $ size net/9p/built-in.o*
    text data bss dec hex filename
    88792 4148 22024 114964 1c114 net/9p/built-in.o.new
    94072 4148 23232 121452 1da6c net/9p/built-in.o.old

    Signed-off-by: Joe Perches
    Signed-off-by: Eric Van Hensbergen

    Joe Perches
     

25 May, 2011

1 commit

  • Teach 9p filesystem to work in container with non-default network namespace.
    (Note: I also patched the unix domain socket code but don't have a test case
    for that. It's the same fix, I just don't have a server for it...)

    To test, run diod server (http://code.google.com/p/diod):
    diod -n -f -L stderr -l 172.23.255.1:9999 -c /dev/null -e /root
    and then mount like so:
    mount -t 9p -o port=9999,aname=/root,version=9p2000.L 172.23.255.1 /mnt

    A container test environment is described at http://landley.net/lxc

    Signed-off-by: Rob Landley
    Signed-off-by: Eric Van Hensbergen

    Rob Landley
     

20 May, 2011

1 commit


23 Mar, 2011

1 commit

  • Without this we can cause reclaim allocation in writepage.

    [ 3433.448430] =================================
    [ 3433.449117] [ INFO: inconsistent lock state ]
    [ 3433.449117] 2.6.38-rc5+ #84
    [ 3433.449117] ---------------------------------
    [ 3433.449117] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
    [ 3433.449117] kswapd0/505 [HC0[0]:SC0[0]:HE1:SE1] takes:
    [ 3433.449117] (iprune_sem){+++++-}, at: [] shrink_icache_memory+0x45/0x2b1
    [ 3433.449117] {RECLAIM_FS-ON-W} state was registered at:
    [ 3433.449117] [] mark_held_locks+0x52/0x70
    [ 3433.449117] [] lockdep_trace_alloc+0x85/0x9f
    [ 3433.449117] [] slab_pre_alloc_hook+0x18/0x3c
    [ 3433.449117] [] kmem_cache_alloc+0x23/0xa2
    [ 3433.449117] [] idr_pre_get+0x2d/0x6f
    [ 3433.449117] [] p9_idpool_get+0x30/0xae
    [ 3433.449117] [] p9_client_rpc+0xd7/0x9b0
    [ 3433.449117] [] p9_client_clunk+0x88/0xdb
    [ 3433.449117] [] v9fs_evict_inode+0x3c/0x48
    [ 3433.449117] [] evict+0x1f/0x87
    [ 3433.449117] [] dispose_list+0x47/0xe3
    [ 3433.449117] [] evict_inodes+0x138/0x14f
    [ 3433.449117] [] generic_shutdown_super+0x57/0xe8
    [ 3433.449117] [] kill_anon_super+0x11/0x50
    [ 3433.449117] [] v9fs_kill_super+0x49/0xab
    [ 3433.449117] [] deactivate_locked_super+0x21/0x46
    [ 3433.449117] [] deactivate_super+0x40/0x44
    [ 3433.449117] [] mntput_no_expire+0x100/0x109
    [ 3433.449117] [] sys_umount+0x2f1/0x31c
    [ 3433.449117] [] system_call_fastpath+0x16/0x1b
    [ 3433.449117] irq event stamp: 192941
    [ 3433.449117] hardirqs last enabled at (192941): [] _raw_spin_unlock_irq+0x2b/0x30
    [ 3433.449117] hardirqs last disabled at (192940): [] shrink_inactive_list+0x290/0x2f5
    [ 3433.449117] softirqs last enabled at (188470): [] __do_softirq+0x133/0x152
    [ 3433.449117] softirqs last disabled at (188455): [] call_softirq+0x1c/0x28
    [ 3433.449117]
    [ 3433.449117] other info that might help us debug this:
    [ 3433.449117] 1 lock held by kswapd0/505:
    [ 3433.449117] #0: (shrinker_rwsem){++++..}, at: [] shrink_slab+0x38/0x15f
    [ 3433.449117]
    [ 3433.449117] stack backtrace:
    [ 3433.449117] Pid: 505, comm: kswapd0 Not tainted 2.6.38-rc5+ #84
    [ 3433.449117] Call Trace:
    [ 3433.449117] [] ? valid_state+0x17e/0x191
    [ 3433.449117] [] ? save_stack_trace+0x28/0x45
    [ 3433.449117] [] ? check_usage_forwards+0x0/0x87
    [ 3433.449117] [] ? mark_lock+0x113/0x22c
    [ 3433.449117] [] ? __lock_acquire+0x37a/0xcf7
    [ 3433.449117] [] ? mark_lock+0x2d/0x22c
    [ 3433.449117] [] ? __lock_acquire+0x392/0xcf7
    [ 3433.449117] [] ? determine_dirtyable_memory+0x15/0x28
    [ 3433.449117] [] ? lock_acquire+0x57/0x6d
    [ 3433.449117] [] ? shrink_icache_memory+0x45/0x2b1
    [ 3433.449117] [] ? down_read+0x47/0x5c
    [ 3433.449117] [] ? shrink_icache_memory+0x45/0x2b1
    [ 3433.449117] [] ? shrink_icache_memory+0x45/0x2b1
    [ 3433.449117] [] ? shrink_slab+0xdb/0x15f
    [ 3433.449117] [] ? kswapd+0x574/0x96a
    [ 3433.449117] [] ? kswapd+0x0/0x96a
    [ 3433.449117] [] ? kthread+0x7d/0x85
    [ 3433.449117] [] ? kernel_thread_helper+0x4/0x10
    [ 3433.449117] [] ? restore_args+0x0/0x30
    [ 3433.449117] [] ? kthread+0x0/0x85
    [ 3433.449117] [] ? kernel_thread_helper+0x0/0x10

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Venkateswararao Jujjuri
    Signed-off-by: Eric Van Hensbergen

    Aneesh Kumar K.V
     

01 Feb, 2011

2 commits

  • Now that cmwq can handle high concurrency, it's more efficient to use
    work than a dedicated kthread. Convert p9_poll_proc() to a work
    function for p9_poll_work and make p9_pollwake() schedule it on each
    poll event. The work is sync flushed on module exit.

    Signed-off-by: Tejun Heo
    Cc: Eric Van Hensbergen
    Cc: Ron Minnich
    Cc: Latchesar Ionkov
    Cc: v9fs-developer@lists.sourceforge.net

    Tejun Heo
     
  • With cmwq, there's no reason to use a dedicated workqueue in trans_fd.
    Drop p9_mux_wq and use system_wq instead. The used work items are
    already sync canceled in p9_conn_destroy() and doesn't require further
    synchronization.

    Signed-off-by: Tejun Heo
    Cc: Eric Van Hensbergen
    Cc: Ron Minnich
    Cc: Latchesar Ionkov
    Cc: v9fs-developer@lists.sourceforge.net

    Tejun Heo
     

07 Sep, 2010

1 commit

  • The function has an unsigned return type, but returns a negative constant
    to indicate an error condition. The result of calling the function is
    always stored in a variable of type (signed) int, and thus unsigned can be
    dropped from the return type.

    A sematic match that finds this problem is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @exists@
    identifier f;
    constant C;
    @@

    unsigned f(...)
    { }
    //

    Signed-off-by: Julia Lawall
    Signed-off-by: David S. Miller

    Julia Lawall
     

02 Aug, 2010

1 commit

  • This is an off by one bug because strlen() doesn't count the NULL
    terminator. We strcpy() addr into a fixed length array of size
    UNIX_PATH_MAX later on.

    The addr variable is the name of the device being mounted.

    Signed-off-by: Dan Carpenter
    Signed-off-by: Eric Van Hensbergen

    Dan Carpenter
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

09 Feb, 2010

1 commit

  • Options pointer is being moved before calling kfree() which seems
    to cause problems. This uses a separate pointer to track and free
    original allocation.

    Signed-off-by: Venkateswararao Jujjuri
    Signed-off-by: Eric Van Hensbergen w

    Eric Van Hensbergen
     

17 Dec, 2009

1 commit

  • * if we fail in p9_conn_create(), we shouldn't leak references to struct file.
    Logics in ->close() doesn't help - ->trans is already gone by the time it's
    called.
    * sock_create_kern() can fail.
    * use of sock_map_fd() is all fscked up; I'd fixed most of that, but the
    rest will have to wait for a bit more work in net/socket.c (we still are
    violating the basic rule of working with descriptor table: "once the reference
    is installed there, don't rely on finding it there again").

    Signed-off-by: Al Viro

    Al Viro
     

30 Nov, 2009

1 commit


18 Aug, 2009

1 commit


03 Jul, 2009

1 commit


06 Apr, 2009

1 commit


27 Feb, 2009

1 commit


23 Oct, 2008

1 commit


18 Oct, 2008

14 commits