21 Aug, 2010

9 commits


20 Aug, 2010

29 commits

  • Implement a small-memory-footprint uniprocessor-only implementation of
    preemptible RCU. This implementation uses but a single blocked-tasks
    list rather than the combinatorial number used per leaf rcu_node by
    TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies
    processing. This version also takes advantage of uniprocessor execution
    to accelerate grace periods in the case where there are no readers.

    The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU.

    This implementation is a step towards having the choice of RCU
    implementation driven off of the SMP and PREEMPT kernel configuration
    variables, which can happen once this implementation has accumulated
    sufficient experience.

    Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as
    suggested by Steve Rostedt in order to avoid the compiler-reordering
    issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183).

    As can be seen below, CONFIG_TINY_PREEMPT_RCU saves almost 5 Kbytes
    compared to CONFIG_TREE_PREEMPT_RCU. Of course, for non-real-time
    workloads, CONFIG_TINY_RCU is even better.

    CONFIG_TREE_PREEMPT_RCU

       text    data     bss     dec filename
         13       0       0      13 kernel/rcupdate.o
       6170     825      28    7023 kernel/rcutree.o
                               ----
                               7036 Total

    CONFIG_TINY_PREEMPT_RCU

       text    data     bss     dec filename
         13       0       0      13 kernel/rcupdate.o
       2081      81       8    2170 kernel/rcutiny.o
                               ----
                               2183 Total

    CONFIG_TINY_RCU (non-preemptible)

       text    data     bss     dec filename
         13       0       0      13 kernel/rcupdate.o
        719      25       0     744 kernel/rcutiny.o
                               ----
                                757 Total

    Requested-by: Loïc Minier
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
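
    The barrier() change mentioned in this commit can be illustrated with a
    small user-space sketch. This is a toy model under stated assumptions, not
    the kernel's code: the toy_ names and the single global nesting counter
    are hypothetical, a single CPU is assumed, and barrier() is defined here
    as the usual GCC/Clang compiler-only barrier.

```c
/* Compiler-only barrier: the compiler may not move memory accesses
 * across this point. Assumes GCC or Clang inline-asm syntax. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical nesting depth of read-side critical sections. */
static int toy_read_lock_nesting;

static void toy_rcu_read_lock(void)
{
	toy_read_lock_nesting++;
	barrier();	/* critical section stays after the increment */
}

static void toy_rcu_read_unlock(void)
{
	barrier();	/* critical section stays before the decrement */
	--toy_read_lock_nesting;
	barrier();	/* later code stays after the decrement */
}

/* Nonzero while at least one read-side critical section is open. */
static int toy_in_read_side(void)
{
	return toy_read_lock_nesting > 0;
}
```

    In this sketch the counter is a plain int, so only the barrier() calls
    stop the compiler from reordering the critical section around the
    counter updates.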
     
  • Commit cf244dc01bf68 added a fourth level to the TREE_RCU hierarchy,
    but the RCU_FANOUT help message still said "cube root". This commit
    fixes this to "fourth root" and also emphasizes that production
    systems are well-served by the default. (Stress-testing RCU itself
    uses small RCU_FANOUT values in order to test large-system code paths
    on small(er) systems.)

    Located-by: John Kacur
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
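
    The "fourth root" wording follows from simple arithmetic: a hierarchy
    with fanout f and four levels covers at most f^4 CPUs. The sketch below
    works through that arithmetic. The function names are hypothetical, the
    same fanout is assumed at every level, and fanout is assumed to be at
    least 2.

```c
/* Number of levels needed so that fanout^levels >= ncpus.
 * Assumes fanout >= 2 and the same fanout at every level. */
static int toy_levels_needed(unsigned long ncpus, unsigned long fanout)
{
	int levels = 1;
	unsigned long capacity = fanout;

	while (capacity < ncpus) {
		capacity *= fanout;
		levels++;
	}
	return levels;
}

/* Smallest fanout f with f^4 >= ncpus, that is, the fourth root of
 * ncpus rounded up. Intended for modest values of ncpus only. */
static unsigned long toy_min_fanout_four_levels(unsigned long ncpus)
{
	unsigned long f = 2;

	while (f * f * f * f < ncpus)
		f++;
	return f;
}
```

    For example, a fanout of 2 needs all four levels to cover 16 CPUs, which
    is how small fanout values exercise large-system code paths on small
    systems.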
     
  • Currently, if RCU CPU stall warnings are enabled, they are enabled
    immediately upon boot. They can be manually disabled via /sys (and
    also re-enabled via /sys), and are automatically disabled upon panic.
    However, some users need RCU CPU stalls to be disabled at boot time,
    but to be enabled without rebuilding/rebooting. For example, someone
    running a real-time application in production might not want the
    additional latency of RCU CPU stall detection in normal operation, but
    might need to enable it at any point for fault isolation purposes.

    This commit therefore provides a new CONFIG_RCU_CPU_STALL_DETECTOR_RUNNABLE
    kernel configuration parameter that maintains the current behavior
    (enable at boot) by default, but allows a kernel to be configured
    with RCU CPU stall detection built into the kernel, but disabled at
    boot time.

    Requested-by: Clark Williams
    Requested-by: John Kacur
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
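
    One way to picture the effect of the new option is a compile-time
    default for a suppress flag that can still be changed at runtime. The
    sketch below is a user-space illustration only: every toy_ name is
    hypothetical, and the kernel's actual variable and macro names may
    differ.

```c
/* Build with -DCONFIG_RCU_CPU_STALL_DETECTOR_RUNNABLE to model the
 * default behavior (stall warnings enabled at boot). */
#ifdef CONFIG_RCU_CPU_STALL_DETECTOR_RUNNABLE
#define TOY_STALL_SUPPRESS_INIT 0	/* warnings active from boot */
#else
#define TOY_STALL_SUPPRESS_INIT 1	/* built in, but quiet until enabled */
#endif

/* Hypothetical runtime flag, initialized from the build-time choice. */
static int toy_stall_suppress = TOY_STALL_SUPPRESS_INIT;

/* Would a stall warning be printed right now? */
static int toy_stall_warning_allowed(void)
{
	return !toy_stall_suppress;
}

/* Models a runtime write of the flag, for example through /sys. */
static void toy_set_stall_suppress(int value)
{
	toy_stall_suppress = !!value;
}
```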
     
  • Because both TINY_RCU and TREE_PREEMPT_RCU have been in mainline for
    several releases, it is time to restrict the use of TREE_RCU to SMP
    non-preemptible systems. This reduces testing/validation effort. This
    commit is a first step towards driving the selection of RCU implementation
    directly off of the SMP and PREEMPT configuration parameters.

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Set the permissions of rcu_cpu_stall_suppress to 644 so that RCU CPU
    stall warnings can be enabled and disabled at runtime via sysfs.

    Suggested-by: Josh Triplett
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Reported-by: Kyle Hubert
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • RCU heads really don't need to be initialized. Their state before call_rcu()
    really does not matter.

    We need to keep init/destroy_rcu_head_on_stack() though, since we want
    debugobjects to be able to keep track of these objects.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Mathieu Desnoyers
    CC: David S. Miller
    CC: "Paul E. McKenney"
    CC: akpm@linux-foundation.org
    CC: mingo@elte.hu
    CC: laijs@cn.fujitsu.com
    CC: dipankar@in.ibm.com
    CC: josh@joshtriplett.org
    CC: dvhltc@us.ibm.com
    CC: niv@us.ibm.com
    CC: tglx@linutronix.de
    CC: peterz@infradead.org
    CC: rostedt@goodmis.org
    CC: Valdis.Kletnieks@vt.edu
    CC: dhowells@redhat.com
    CC: eric.dumazet@gmail.com
    CC: Alexey Dobriyan
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Mathieu Desnoyers
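
    The claim that the state of an RCU head before call_rcu() does not
    matter can be illustrated with a toy callback queue. This is a
    user-space sketch with hypothetical toy_ names, not the kernel's
    implementation: enqueueing writes both fields of the head, so nothing
    depends on what they held before.

```c
#include <stddef.h>
#include <string.h>

struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

/* Singly linked callback list with a tail pointer. */
static struct toy_rcu_head *toy_cb_list;
static struct toy_rcu_head **toy_cb_tail = &toy_cb_list;

/* Both fields of *head are overwritten here, so the caller need not
 * initialize the head beforehand. */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = NULL;
	*toy_cb_tail = head;
	toy_cb_tail = &head->next;
}

/* Stand-in for the end of a grace period: run and count all callbacks. */
static int toy_invoke_callbacks(void)
{
	struct toy_rcu_head *list = toy_cb_list;
	int n = 0;

	toy_cb_list = NULL;
	toy_cb_tail = &toy_cb_list;
	while (list) {
		struct toy_rcu_head *next = list->next;

		list->func(list);
		list = next;
		n++;
	}
	return n;
}

static int toy_freed;

/* Example callback that only counts invocations. */
static void toy_free_cb(struct toy_rcu_head *head)
{
	(void)head;
	toy_freed++;
}
```

    A head filled with arbitrary bytes and then passed to toy_call_rcu()
    behaves the same as a zeroed one in this model.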
     
  • This adds annotations for RCU operations in core kernel components.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Al Viro
    Cc: Jens Axboe
    Cc: Andrew Morton
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Manfred Spraul
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Nick Piggin
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Alan Cox
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Make it explicit that new RCU read-side critical sections that start
    after call_rcu() and synchronize_rcu() start might still be running
    after the end of the relevant grace period.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • Although the RCU CPU stall warning messages are a very good way to alert
    people to a problem, once alerted, it is sometimes helpful to shut them
    off in order to avoid obscuring other messages that are being used
    to track down the problem. Although you can rebuild the kernel with
    CONFIG_RCU_CPU_STALL_DETECTOR=n, this is sometimes inconvenient. This
    commit therefore adds a boot parameter named "rcu_cpu_stall_suppress"
    that shuts these messages off without requiring a rebuild (though a
    reboot might be needed for those not brave enough to patch their kernel
    while it is running).

    This message-suppression was already in place for the panic case, so this
    commit need only rename the variable and export it via module_param().

    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • Also set the default to 60 seconds, up from the previous hard-coded timeout
    of 10 seconds. This allows people who care to set short timeouts, while
    keeping people with unusual configurations (make randconfig!!!) from being
    bothered with spurious CPU stall warnings.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • find_task_by_vpid() says "Must be called under rcu_read_lock()". However,
    because of commit 3120438 "rcu: Disable lockdep checking in RCU
    list-traversal primitives", the RCU-lockdep checks are suppressed in the
    RCU variants of the struct list_head traversals, so we are currently
    unable to catch "find_task_by_vpid() with tasklist_lock held but RCU lock
    not held" errors.
    This commit therefore places an explicit check for being in an RCU
    read-side critical section in find_task_by_pid_ns().

    ===================================================
    [ INFO: suspicious rcu_dereference_check() usage. ]
    ---------------------------------------------------
    kernel/pid.c:386 invoked rcu_dereference_check() without protection!

    other info that might help us debug this:

    rcu_scheduler_active = 1, debug_locks = 1
    1 lock held by rc.sysinit/1102:
    #0: (tasklist_lock){.+.+..}, at: [] sys_setpgid+0x40/0x160

    stack backtrace:
    Pid: 1102, comm: rc.sysinit Not tainted 2.6.35-rc3-dirty #1
    Call Trace:
    [] lockdep_rcu_dereference+0x94/0xb0
    [] find_task_by_pid_ns+0x6d/0x70
    [] find_task_by_vpid+0x18/0x20
    [] sys_setpgid+0x47/0x160
    [] sysenter_do_call+0x12/0x36

    Commit updated to use a new rcu_lockdep_assert() exported API rather than
    the old internal __do_rcu_dereference().

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Tetsuo Handa
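
    The idea of an explicit "am I in an RCU read-side critical section?"
    check inside a lookup function can be sketched in user space as follows.
    All demo_ names are hypothetical stand-ins. In particular,
    demo_rcu_lockdep_assert() only counts violations and does not reproduce
    the kernel's lockdep report.

```c
#include <stddef.h>

static int demo_rcu_nesting;	/* depth of read-side critical sections */
static int demo_lockdep_splats;	/* number of failed checks so far */

static void demo_rcu_read_lock(void)   { demo_rcu_nesting++; }
static void demo_rcu_read_unlock(void) { demo_rcu_nesting--; }

/* Record a violation instead of printing a lockdep-style report. */
#define demo_rcu_lockdep_assert(cond) \
	do { if (!(cond)) demo_lockdep_splats++; } while (0)

struct demo_task { int pid; };

static struct demo_task demo_tasks[] = { { 1 }, { 1102 } };

/* Lookup that checks its own calling context before searching. */
static struct demo_task *demo_find_task_by_pid(int pid)
{
	size_t i;

	demo_rcu_lockdep_assert(demo_rcu_nesting > 0);
	for (i = 0; i < sizeof(demo_tasks) / sizeof(demo_tasks[0]); i++)
		if (demo_tasks[i].pid == pid)
			return &demo_tasks[i];
	return NULL;
}
```

    A caller that wraps the lookup in demo_rcu_read_lock() and
    demo_rcu_read_unlock() passes the check. A caller that holds only some
    other lock is counted as a violation even though the lookup itself still
    returns a result.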
     
  • &percpu_data is compatible with allocated percpu data.

    This commit therefore uses it and removes the "->rda[NR_CPUS]" array, saving significant
    storage on systems with large numbers of CPUs. This does add an additional
    level of indirection and thus an additional cache line referenced, but
    because ->rda is not used on the read side, this is OK.

    Signed-off-by: Lai Jiangshan
    Reviewed-by: Tejun Heo
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Lai Jiangshan
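
    The storage trade-off described above can be sketched by comparing a
    structure that embeds an NR_CPUS-sized pointer array with one that holds
    a single pointer to separately allocated data. This is an illustration
    only: the toy_ names and TOY_NR_CPUS value are hypothetical, and a plain
    heap array stands in for the kernel's per-CPU allocator.

```c
#include <stdlib.h>

#define TOY_NR_CPUS 4096	/* hypothetical build-time CPU limit */

struct toy_rcu_data { long completed; };

/* Old layout: one pointer slot per possible CPU, always present. */
struct toy_state_old {
	struct toy_rcu_data *rda[TOY_NR_CPUS];
};

/* New layout: one pointer, costing one extra indirection per access. */
struct toy_state_new {
	struct toy_rcu_data *rda;
};

/* Allocate zeroed data for the CPUs actually present. */
static int toy_state_init(struct toy_state_new *sp, int nr_cpus)
{
	sp->rda = calloc((size_t)nr_cpus, sizeof(*sp->rda));
	return sp->rda ? 0 : -1;
}

/* Access one CPU's data through the single pointer. */
static struct toy_rcu_data *toy_cpu_ptr(struct toy_state_new *sp, int cpu)
{
	return &sp->rda[cpu];
}

static void toy_state_destroy(struct toy_state_new *sp)
{
	free(sp->rda);
	sp->rda = NULL;
}
```

    In this model the old structure costs TOY_NR_CPUS pointers regardless of
    how many CPUs exist, while the new one costs a single pointer plus the
    data actually allocated.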
     
  • Add random preemption to help us torture preemptible RCU.

    srcu_read_delay() also calls rcu_read_delay() for shorter delays.

    Added comment to preempt_schedule() call indicating that no quiescent
    states happen if preemption is disabled.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Lai Jiangshan
     
  • Add a section describing PROVE_RCU, DEBUG_OBJECTS_RCU_HEAD, and
    the __rcu sparse checking to the RCU checklist.

    Suggested-by: David Miller
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Avi Kivity
    Cc: Marcelo Tosatti
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Acked-by: Patrick McHardy
    Cc: "David S. Miller"
    Cc: Eric Dumazet
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Dmitry Torokhov
    Acked-by: Dmitry Torokhov
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Acked-by: Trond Myklebust

    Arnd Bergmann
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Acked-by: David Howells
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Acked-by: David Howells
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Acked-by: Paul Menage
    Cc: Li Zefan
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • This avoids warnings from missing __rcu annotations
    in the rculist implementation, making it possible to
    use the same lists in both RCU and non-RCU cases.

    We can add rculist annotations later, together with
    lockdep support for rculist, which is missing as well,
    but that may involve changing all the users.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Pavel Emelyanov
    Cc: Sukadev Bhattiprolu
    Reviewed-by: Josh Triplett

    Arnd Bergmann
     
  • This commit provides definitions for the __rcu annotation defined earlier.
    This annotation permits sparse to check for correct use of RCU-protected
    pointers. If a pointer that is annotated with __rcu is accessed
    directly (as opposed to via rcu_dereference(), rcu_assign_pointer(),
    or one of their variants), sparse can be made to complain. To enable
    such complaints, use the new default-disabled CONFIG_SPARSE_RCU_POINTER
    kernel configuration option. Please note that these sparse complaints are
    intended to be a debugging aid, -not- a code-style-enforcement mechanism.

    There are special rcu_dereference_protected() and rcu_access_pointer()
    accessors for use when RCU read-side protection is not required, for
    example, when no other CPU has access to the data structure in question
    or while the current CPU holds the update-side lock.

    This patch also updates a number of docbook comments that were showing
    their age.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul E. McKenney
    Cc: Christopher Li
    Reviewed-by: Josh Triplett

    Paul E. McKenney
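
    The access pattern that the annotation is meant to enforce can be
    sketched in user space. The macros below are simplified stand-ins, not
    the kernel's definitions: TOY_RCU plays the role of __rcu and expands to
    nothing (so no checker runs here), the toy_ accessors are built on the
    GCC/Clang __atomic builtins, and all names are hypothetical.

```c
#include <stddef.h>

/* Stand-in for __rcu: in the kernel, sparse sees an annotation here. */
#define TOY_RCU

struct toy_cfg { int value; };

/* Pointer that should only be touched through the accessors below. */
static struct toy_cfg TOY_RCU *toy_gp;

/* Publish a new pointer with release ordering. */
#define toy_rcu_assign_pointer(p, v) \
	__atomic_store_n(&(p), (v), __ATOMIC_RELEASE)

/* Fetch the pointer on the read side. */
#define toy_rcu_dereference(p) \
	__atomic_load_n(&(p), __ATOMIC_CONSUME)

/* Plain load for update-side code; cond documents why that is safe. */
#define toy_rcu_dereference_protected(p, cond) ((void)(cond), (p))

static void toy_publish(struct toy_cfg *newc)
{
	toy_rcu_assign_pointer(toy_gp, newc);
}

/* Reader: returns -1 if nothing has been published yet. */
static int toy_read_value(void)
{
	struct toy_cfg *c = toy_rcu_dereference(toy_gp);

	return c ? c->value : -1;
}
```

    In this sketch a direct use such as toy_gp->value compiles silently. The
    point of the real annotation is that a checker can flag such direct
    uses, while accesses through the accessor macros are accepted.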
     
  • The task_cls_classid() function applies rcu_dereference() to integers,
    which does not work with the shiny new sparse-based checking in
    rcu_dereference(). This commit therefore moves to the new RCU API
    rcu_dereference_index_check().

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett
    Acked-by: David S. Miller
    Acked-by: Herbert Xu

    Paul E. McKenney
     

19 Aug, 2010

2 commits

  • * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
    NFS: Fix an Oops in the NFSv4 atomic open code
    NFS: Fix the selection of security flavours in Kconfig
    NFS: fix the return value of nfs_file_fsync()
    rpcrdma: Fix SQ size calculation when memreg is FRMR
    xprtrdma: Do not truncate iova_start values in frmr registrations.
    nfs: Remove redundant NULL check upon kfree()
    nfs: Add "lookupcache" to displayed mount options
    NFS: allow close-to-open cache semantics to apply to root of NFS filesystem
    SUNRPC: fix NFS client over TCP hangs due to packet loss (Bug 16494)

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
    USB HID: Add ID for eGalax Multitouch used in JooJoo tablet
    HID: hiddev: fix memory corruption due to invalid intfdata
    HID: hiddev: protect against disconnect/NULL-dereference race
    HID: picolcd: correct ordering of framebuffer freeing
    HID: picolcd: testing the wrong variable

    Linus Torvalds