15 Jun, 2011

1 commit

  • Fix kernel-doc warnings in signal.c:

    Warning(kernel/signal.c:2374): No description found for parameter 'nset'
    Warning(kernel/signal.c:2374): Excess function parameter 'set' description in 'sys_rt_sigprocmask'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

14 Jun, 2011

1 commit

  • …l/git/tip/linux-2.6-tip

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    ftrace: Revert 8ab2b7efd ftrace: Remove unnecessary disabling of irqs
    kprobes/trace: Fix kprobe selftest for gcc 4.6
    ftrace: Fix possible undefined return code
    oprofile, dcookies: Fix possible circular locking dependency
    oprofile: Fix locking dependency in sync_start()
    oprofile: Free potentially owned tasks in case of errors
    oprofile, x86: Add comments to IBS LVT offset initialization

    Linus Torvalds
     

10 Jun, 2011

1 commit

  • In kernel/irq/manage.c::irq_set_irq_wake() we call
    irq_get_desc_buslock() which may return NULL, but the code
    dereferences the result unconditionally.

    irq_set_irq_wake() has lots of callers - I checked a few and I couldn't
    find anything that guarantees that they won't call it with some input that
    will cause irq_get_desc_buslock() to return NULL, so I think it's a good
    thing to test and -EINVAL was the most sane error code in this situation
    that I could think of.

    Not all callers test the return value of irq_set_irq_wake(), but those
    that do take != 0 to mean error as far as I can see, so they should be
    fine. I guess those that don't test actually should, but that's a
    different issue.

    Signed-off-by: Jesper Juhl
    Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1106092300360.17868@swampdragon.chaosbits.net
    Signed-off-by: Thomas Gleixner

    Jesper Juhl
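
    The check described above can be sketched in userspace C. This is only an illustration under stated assumptions: the descriptor type, the table, and every name ending in _sketch are hypothetical stand-ins, not the kernel's irq_get_desc_buslock() or irq_set_irq_wake().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct irq_desc. */
struct irq_desc_sketch {
    unsigned int wake_depth;
};

#define NR_IRQS_SKETCH 4
static struct irq_desc_sketch desc_table_sketch[NR_IRQS_SKETCH];

/* Stand-in lookup: returns NULL for an IRQ number that has no descriptor. */
static struct irq_desc_sketch *get_desc_sketch(unsigned int irq)
{
    return irq < NR_IRQS_SKETCH ? &desc_table_sketch[irq] : NULL;
}

/* Returns 0 on success, -EINVAL when no descriptor exists for irq. */
int set_irq_wake_sketch(unsigned int irq, unsigned int on)
{
    struct irq_desc_sketch *desc = get_desc_sketch(irq);

    if (!desc)
        return -EINVAL; /* test the lookup result before dereferencing it */

    if (on)
        desc->wake_depth++;
    else if (desc->wake_depth > 0)
        desc->wake_depth--;
    return 0;
}
```

    Callers that treat any non-zero return as an error keep working, since -EINVAL is non-zero.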
     

08 Jun, 2011

5 commits

  • …l/git/tip/linux-2.6-tip

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf: Fix comments in include/linux/perf_event.h
    perf: Comment /proc/sys/kernel/perf_event_paranoid to be part of user ABI
    perf python: Fix argument name list of read_on_cpu()
    perf evlist: Don't die if sample_{id_all|type} is invalid
    perf python: Use exception to propagate errors
    perf evlist: Remove dependency on debug routines
    perf, cgroups: Fix up for new API

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    genirq: Ensure we locate the passed IRQ in irq_alloc_descs()
    genirq: Fix descriptor init on non-sparse IRQs
    irq: Handle spurios irq detection for threaded irqs
    genirq: Print threaded handler in spurious debug output

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Fix/clarify set_task_cpu() locking rules
    lockdep: Fix lock_is_held() on recursion
    sched: Fix schedstat.nr_wakeups_migrate
    sched: Fix cross-cpu clock sync on remote wakeups

    Linus Torvalds
     
  • Revert the commit that removed the disabling of interrupts around
    the initial modifying of mcount callers to nops, and update the comment.

    The original comment was outdated and stated that the interrupts were
    being disabled to prevent kstop machine, which was required with the
    old ftrace daemon, but was no longer the case.

    What the comment failed to mention was that interrupts needed to be
    disabled to keep interrupts from preempting the modifying of the code
    and then executing the code that was partially modified.

    Revert the commit and update the comment.

    Reported-by: Richard W.M. Jones
    Tested-by: Richard W.M. Jones
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • With gcc 4.6, the self test kprobe function:

    kprobe_trace_selftest_target()

    is optimized such that kallsyms does not list it. The kprobes
    test uses this function to insert a probe and test it. But
    it will fail the test if the function is not listed in kallsyms.

    Adding a __used annotation keeps the symbol in the kallsyms table.

    Suggested-by: David Daney
    Cc: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Steven Rostedt
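
    As a rough illustration of the annotation, the sketch below uses the GCC/Clang attribute that the kernel's __used macro is built on. The function names are hypothetical; the real selftest target has a different signature.

```c
#include <assert.h>

/*
 * A static function with a single direct caller. An optimizing compiler
 * may inline or clone it and then drop the standalone symbol. The
 * attribute asks the compiler to emit code for the function regardless.
 */
static __attribute__((used)) int selftest_target_sketch(int a, int b)
{
    return a + b;
}

/* Hypothetical caller standing in for the selftest. */
int run_selftest_sketch(void)
{
    return selftest_target_sketch(1, 2);
}
```

    The attribute does not change what the function computes; it only affects whether the symbol survives optimization.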
     

07 Jun, 2011

3 commits

  • Sergey reported a CONFIG_PROVE_RCU warning in push_rt_task where
    set_task_cpu() was called with both relevant rq->locks held, which
    should be sufficient for running tasks since holding its rq->lock
    will serialize against sched_move_task().

    Update the comments and fix the task_group() lockdep test.

    Reported-and-tested-by: Sergey Senozhatsky
    Cc: Oleg Nesterov
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1307115427.2353.3456.camel@twins
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The main lock_is_held() user is lockdep_assert_held(); avoid false
    assertions in lockdep_off() sections by unconditionally reporting that
    the lock is taken.

    [ the reason this is important is a lockdep_assert_held() in ttwu()
    which triggers a warning under lockdep_off() as in printk() which
    can trigger another wakeup and lock up due to spinlock
    recursion, as reported and heroically debugged by Arne Jansen ]

    Reported-and-tested-by: Arne Jansen
    Signed-off-by: Peter Zijlstra
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1307398759.2497.966.camel@laptop
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
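
    The behaviour can be sketched as follows, assuming a single recursion counter and a one-entry record of held locks. All names ending in _sketch are hypothetical and much simpler than lockdep's real bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>

static int lockdep_recursion_sketch;  /* non-zero inside a lockdep_off() section */
static const void *held_lock_sketch;  /* hypothetical single-entry held-lock record */

void lockdep_off_sketch(void) { lockdep_recursion_sketch++; }
void lockdep_on_sketch(void)  { lockdep_recursion_sketch--; }

/* Acquisitions are not recorded while tracking is switched off. */
void lock_acquire_sketch(const void *lock)
{
    if (!lockdep_recursion_sketch)
        held_lock_sketch = lock;
}

bool lock_is_held_sketch(const void *lock)
{
    /* Tracking is off, so the record is unreliable: report "held"
     * rather than let an assertion fire on incomplete information. */
    if (lockdep_recursion_sketch)
        return true;
    return held_lock_sketch == lock;
}
```

    Inside an off section the answer is always "held", so an assertion built on this check stays quiet there.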
     
  • kernel/trace/ftrace.c: In function 'ftrace_regex_write.clone.15':
    kernel/trace/ftrace.c:2743:6: warning: 'ret' may be used uninitialized in this
    function

    Signed-off-by: GuoWen Li
    Link: http://lkml.kernel.org/r/201106011918.47939.guowen.li.linux@gmail.com
    Signed-off-by: Steven Rostedt

    GuoWen Li
     

04 Jun, 2011

2 commits


03 Jun, 2011

6 commits

  • There is an optimization which does not update the timer if the timer
    was pending and the expiration time was unchanged.

    Since commit 3bbb9ec9 ("timers: Introduce the concept of timer slack
    for legacy timers") this optimization is no longer applied for timers
    where the expiration time got extended due to the slack value. So we
    need to check again after the expiration time might have been updated.

    [ tglx: Made it a single check by applying slack first and sorting
    out the slack = 0 value (all timeouts < 256 jiffies) early ]

    Signed-off-by: Sebastian Andrzej Siewior
    Link: http://lkml.kernel.org/r/20110521105828.GA29442@Chamillionaire.breakpoint.cc
    Signed-off-by: Thomas Gleixner

    Sebastian Andrzej Siewior
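
    A userspace sketch of the "apply slack first, then compare" ordering. The slack rule here (up to 1/256 of the remaining delay, rounded down to a coarser boundary, none below 256 ticks) is a simplification assumed for illustration, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct timer_sketch {
    unsigned long expires;
    bool pending;
    unsigned int updates; /* counts real re-queues, for illustration only */
};

/* Extend the deadline by the allowed slack, then clear the low bits. */
static unsigned long apply_slack_sketch(unsigned long expires, unsigned long now)
{
    long delta = (long)(expires - now);
    unsigned long limit, mask;
    int bit = 0;

    if (delta < 256)
        return expires; /* short timeouts get no slack */

    limit = expires + (unsigned long)delta / 256;
    mask = expires ^ limit;
    if (mask == 0)
        return expires;

    while (mask >>= 1) /* index of the highest differing bit */
        bit++;
    return limit & ~((1UL << bit) - 1);
}

void mod_timer_sketch(struct timer_sketch *t, unsigned long expires,
                      unsigned long now)
{
    expires = apply_slack_sketch(expires, now);

    /* One check, made after slack: a pending timer whose effective
     * expiry is unchanged needs no update. */
    if (t->pending && t->expires == expires)
        return;

    t->expires = expires;
    t->pending = true;
    t->updates++;
}
```

    Comparing before applying slack would miss the match whenever slack moved the stored expiry away from the requested value.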
     
  • When irq_alloc_descs() is called with no base IRQ specified then it will
    search for a range of IRQs starting from a specified base address. In the
    case where an IRQ is specified it still does this search in order to ensure
    that none of the requested range is already allocated and it still uses the
    'from' parameter to specify the base for the search. This means that in the
    case where a base is specified but 'from' is zero (which is reasonable as
    any IRQ number is in the range specified by a zero 'from') the function will
    get confused and try to allocate the first suitably sized block of free IRQs
    it finds.

    Instead use a specified IRQ as the base address for the search, and insist
    that any 'from' that is specified can support that IRQ.

    Signed-off-by: Mark Brown
    Link: http://lkml.kernel.org/r/1307037313-15733-1-git-send-email-broonie@opensource.wolfsonmicro.com
    Signed-off-by: Thomas Gleixner

    Mark Brown
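
    The search rule can be sketched with a plain boolean table in place of the kernel's allocation bitmap. The names, the table size, and the choice of error codes are assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_IRQS_SKETCH 32
static bool irq_allocated_sketch[NR_IRQS_SKETCH];

/* First base >= start with cnt consecutive free entries, or
 * NR_IRQS_SKETCH when no such range exists. */
static unsigned int find_free_range_sketch(unsigned int start, unsigned int cnt)
{
    for (unsigned int base = start; base + cnt <= NR_IRQS_SKETCH; base++) {
        unsigned int i;

        for (i = 0; i < cnt && !irq_allocated_sketch[base + i]; i++)
            ;
        if (i == cnt)
            return base;
    }
    return NR_IRQS_SKETCH;
}

/* irq < 0 means "any range at or above from"; otherwise irq is the
 * exact base the caller wants. Returns the base or a negative errno. */
int alloc_descs_sketch(int irq, unsigned int from, unsigned int cnt)
{
    unsigned int start;

    if (cnt == 0)
        return -EINVAL;

    if (irq >= 0) {
        if (from > (unsigned int)irq)
            return -EINVAL;       /* 'from' cannot support this IRQ */
        from = (unsigned int)irq; /* search starts at the requested IRQ */
    }

    start = find_free_range_sketch(from, cnt);
    if (start >= NR_IRQS_SKETCH)
        return -ENOMEM;
    if (irq >= 0 && start != (unsigned int)irq)
        return -EEXIST;           /* part of the requested range is taken */

    for (unsigned int i = 0; i < cnt; i++)
        irq_allocated_sketch[start + i] = true;
    return (int)start;
}
```

    With a zero 'from' and an explicit IRQ, the search now begins at that IRQ instead of at the first free block.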
     
  • The genirq changes are initializing descriptors for sparse IRQs quite
    differently from how non-sparse (stacked?) IRQs are initialized, with
    the effect that on my platform all IRQs are default-disabled on sparse
    IRQs and default-enabled if non-sparse IRQs are used, crashing some
    GPIO driver.

    Fix this by refactoring the non-sparse IRQs to use the same descriptor
    init function as the sparse IRQs.

    Signed-off: Linus Walleij
    Link: http://lkml.kernel.org/r/1306858479-16622-1-git-send-email-linus.walleij@stericsson.com
    Cc: stable@kernel.org # 2.6.39
    Signed-off-by: Thomas Gleixner

    Linus Walleij
     
  • The detection of spurious interrupts is currently limited to the first-level
    handler. In force-threaded mode we never notice if the threaded irq does
    not feel responsible.
    This patch catches the return value of the threaded handler and forwards
    it to the spurious detector. If the primary handler returns only
    IRQ_WAKE_THREAD then the spurious detector ignores it because it gets
    called again from the threaded handler.

    [ tglx: Report the erroneous return value early and bail out ]

    Signed-off-by: Sebastian Andrzej Siewior
    Link: http://lkml.kernel.org/r/1306824972-27067-2-git-send-email-sebastian@breakpoint.cc
    Signed-off-by: Thomas Gleixner

    Sebastian Andrzej Siewior
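
    A minimal sketch of the flow described above, assuming a detector that only counts handled and unhandled interrupts. The enum mirrors the usual none/handled/wake-thread return values; the type and function names are hypothetical.

```c
#include <assert.h>

enum irqreturn_sketch {
    IRQ_NONE_SKETCH        = 0,
    IRQ_HANDLED_SKETCH     = 1,
    IRQ_WAKE_THREAD_SKETCH = 2
};

struct irq_stats_sketch {
    unsigned int irq_count;
    unsigned int irqs_unhandled;
};

/* Spurious detector: a bare "wake the thread" result is skipped because
 * the threaded handler reports the final outcome itself. */
void note_interrupt_sketch(struct irq_stats_sketch *st,
                           enum irqreturn_sketch ret)
{
    if (ret == IRQ_WAKE_THREAD_SKETCH)
        return;

    st->irq_count++;
    if (ret == IRQ_NONE_SKETCH)
        st->irqs_unhandled++;
}

/* Called once the threaded handler has run: forward its return value. */
void irq_thread_done_sketch(struct irq_stats_sketch *st,
                            enum irqreturn_sketch thread_ret)
{
    note_interrupt_sketch(st, thread_ret);
}
```

    Each interrupt is counted exactly once, either by the primary handler or by its thread.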
     
  • In forced threaded mode (or with an explicit threaded handler) we only
    see the primary handler, but not the threaded handler.

    Signed-off-by: Sebastian Andrzej Siewior
    Link: http://lkml.kernel.org/r/1306824972-27067-1-git-send-email-sebastian@breakpoint.cc
    Signed-off-by: Thomas Gleixner

    Sebastian Andrzej Siewior
     
  • For UP it's stupid to request an initialized cpumask for the clock
    event devices. Though we need the mask set even on UP to avoid a
    horrible ifdeffery especially in the broadcast code.

    For SMP we can at least try to survive with a warning and set the
    cpumask of the cpu we're running on. That gives a decent chance to
    bring the machine up and retrieve the debug info.

    Signed-off-by: Thomas Gleixner
    Cc: Linus Walleij
    Cc: Russell King - ARM Linux
    Cc: Stephen Boyd

    Thomas Gleixner
     

31 May, 2011

4 commits

  • Ben changed the cgroup API in commit f780bdb7c1c (cgroups: add
    per-thread subsystem callbacks) in an incompatible way, but
    forgot to convert the perf cgroup bits.

    Avoid compile warnings and runtime splats and convert perf too ;-)

    Acked-by: Ben Blum
    Cc: Stephane Eranian
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1306767651.1200.2990.camel@twins
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • While looking over the code I found that with the ttwu rework the
    nr_wakeups_migrate test broke since we now switch cpus prior to
    calling ttwu_stat(), hence the test is always true.

    Cure this by passing the migration state in wake_flags. Also move the
    whole test under CONFIG_SMP, it's hard to migrate tasks on UP :-)

    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-pwwxl7gdqs5676f1d4cx6pj7@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
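
    The idea of carrying the migration state in a flag can be sketched like this. The flag value, the task struct, and the function names are hypothetical; only the pattern (record the fact before the cpu changes, test the flag afterwards) is taken from the description above.

```c
#include <assert.h>

#define WF_MIGRATED_SKETCH 0x04 /* hypothetical flag bit */

struct task_sketch {
    int cpu;
    unsigned long nr_wakeups;
    unsigned long nr_wakeups_migrate;
};

/* Statistics run after the cpu switch, so they rely on the flag
 * instead of comparing cpu numbers. */
static void ttwu_stat_sketch(struct task_sketch *p, int wake_flags)
{
    p->nr_wakeups++;
    if (wake_flags & WF_MIGRATED_SKETCH)
        p->nr_wakeups_migrate++;
}

void wake_up_sketch(struct task_sketch *p, int target_cpu)
{
    int wake_flags = 0;

    if (p->cpu != target_cpu) {
        wake_flags |= WF_MIGRATED_SKETCH; /* record before switching */
        p->cpu = target_cpu;
    }
    ttwu_stat_sketch(p, wake_flags);
}
```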
     
  • Markus reported that commit 317f394160e ("sched: Move the second half
    of ttwu() to the remote cpu") caused some accounting funnies on his AMD
    Phenom II X4, such as weird 'top' results.

    It turns out that this is due to a non-synced TSC: the queued remote
    wakeups stopped coupling the two relevant cpu clocks, which leads to
    wakeups seeing time jumps, which in turn leads to skewed runtime stats.

    Add an explicit call to sched_clock_cpu() to couple the per-cpu clocks
    to restore the normal flow of time.

    Reported-and-tested-by: Markus Trippelsdorf
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1306835745.2353.3.camel@twins
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Commit cc3ce5176d83 (rcu: Start RCU kthreads in TASK_INTERRUPTIBLE
    state) fudges a sleeping task's state, resulting in the scheduler seeing
    a TASK_UNINTERRUPTIBLE task going to sleep, but a TASK_INTERRUPTIBLE
    task waking up. The result is unbalanced load calculation.

    The problem that patch tried to address is that the RCU threads could
    stay in UNINTERRUPTIBLE state for quite a while and trigger the hung
    task detector due to on-demand wake-ups.

    Cure the problem differently by always giving the tasks at least one
    wake-up once the CPU is fully up and running; this will kick them out of
    the initial UNINTERRUPTIBLE state and into the regular INTERRUPTIBLE
    wait state.

    [ The alternative would be teaching kthread_create() to start threads as
    INTERRUPTIBLE but that needs a tad more thought. ]

    Reported-by: Damien Wyart
    Signed-off-by: Peter Zijlstra
    Acked-by: Paul E. McKenney
    Link: http://lkml.kernel.org/r/1306755291.1200.2872.camel@twins
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

30 May, 2011

2 commits

  • Thomas Gleixner reports that we now have a boot crash triggered by
    CONFIG_CPUMASK_OFFSTACK=y:

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] find_next_bit+0x55/0xb0
    Call Trace:
    [] cpumask_any_but+0x2a/0x70
    [] flush_tlb_mm+0x2b/0x80
    [] pud_populate+0x35/0x50
    [] pgd_alloc+0x9a/0xf0
    [] mm_init+0xec/0x120
    [] mm_alloc+0x53/0xd0

    which was introduced by commit de03c72cfce5 ("mm: convert
    mm->cpu_vm_cpumask into cpumask_var_t"), and is due to wrong ordering of
    mm_init() vs mm_init_cpumask().

    Thomas wrote a patch to just fix the ordering of initialization, but I
    hate the new double allocation in the fork path, so I ended up instead
    doing some more radical surgery to clean it all up.

    Reported-by: Thomas Gleixner
    Reported-by: Ingo Molnar
    Cc: KOSAKI Motohiro
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
    x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param
    x86 idle: deprecate "no-hlt" cmdline param
    x86 idle APM: deprecate CONFIG_APM_CPU_IDLE
    x86 idle floppy: deprecate disable_hlt()
    x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it
    x86 idle: clarify AMD erratum 400 workaround
    idle governor: Avoid lock acquisition to read pm_qos before entering idle
    cpuidle: menu: fixed wrapping timers at 4.294 seconds

    Linus Torvalds
     

29 May, 2011

4 commits

  • Thanks to the reviews and comments by Rafael, James, Mark and Andi.
    Here's version 2 of the patch incorporating your comments and also some
    update to my previous patch comments.

    I noticed that before entering idle state, the menu idle governor will
    look up the current pm_qos target value according to the list of qos
    requests received. This look up currently needs the acquisition of a
    lock to access the list of qos requests to find the qos target value,
    slowing down the entrance into idle state due to contention by multiple
    cpus to access this list. The contention is severe when there are a lot
    of cpus waking and going into idle. For example, for a simple workload
    that has 32 pairs of processes ping-ponging messages to each other, where
    64 cpu cores are active in the test system, I see the following profile with
    37.82% of cpu cycles spent in contention of pm_qos_lock:

    - 37.82% swapper [kernel.kallsyms] [k] _raw_spin_lock_irqsave
       - _raw_spin_lock_irqsave
          - 95.65% pm_qos_request
               menu_select
               cpuidle_idle_call
             - cpu_idle
                  99.98% start_secondary

    A better approach will be to cache the updated pm_qos target value so
    reading it does not require lock acquisition as in the patch below.
    With this patch the contention for pm_qos_lock is removed and I saw a
    2.2X increase in throughput for my message passing workload.

    cc: stable@kernel.org
    Signed-off-by: Tim Chen
    Acked-by: Andi Kleen
    Acked-by: James Bottomley
    Acked-by: mark gross
    Signed-off-by: Len Brown

    Tim Chen
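
    A userspace sketch of the caching approach, using a pthread mutex for the request list and a C11 atomic for the cached value. Treating the target as the minimum of all requests, the default value, and all names here are assumptions of this sketch, not the pm_qos API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_REQ_SKETCH     8
#define QOS_DEFAULT_SKETCH 2000000 /* hypothetical "no constraint" value */

static pthread_mutex_t qos_lock_sketch = PTHREAD_MUTEX_INITIALIZER;
static int requests_sketch[MAX_REQ_SKETCH];
static int nr_requests_sketch;
static atomic_int cached_target_sketch = QOS_DEFAULT_SKETCH;

/* Walk the request list; the caller must hold qos_lock_sketch. */
static int recompute_target_locked_sketch(void)
{
    int target = QOS_DEFAULT_SKETCH;

    for (int i = 0; i < nr_requests_sketch; i++)
        if (requests_sketch[i] < target)
            target = requests_sketch[i];
    return target;
}

/* Slow path: update the list and refresh the cache under the lock. */
int qos_add_request_sketch(int value)
{
    int ret = -1;

    pthread_mutex_lock(&qos_lock_sketch);
    if (nr_requests_sketch < MAX_REQ_SKETCH) {
        requests_sketch[nr_requests_sketch++] = value;
        atomic_store(&cached_target_sketch, recompute_target_locked_sketch());
        ret = 0;
    }
    pthread_mutex_unlock(&qos_lock_sketch);
    return ret;
}

/* Fast path, as used on idle entry: a single atomic load, no lock. */
int qos_read_target_sketch(void)
{
    return atomic_load(&cached_target_sketch);
}
```

    Readers never touch the lock, so many cpus entering idle at once no longer contend on it; only the rare updates pay for the list walk.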
     
  • …el/git/tip/linux-2.6-tip

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    cpuset: Fix cpuset_cpus_allowed_fallback(), don't update tsk->rt.nr_cpus_allowed
    sched: Fix ->min_vruntime calculation in dequeue_entity()
    sched: Fix ttwu() for __ARCH_WANT_INTERRUPTS_ON_CTXSW
    sched: More sched_domain iterations fixes

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    rcu: Start RCU kthreads in TASK_INTERRUPTIBLE state
    rcu: Remove waitqueue usage for cpu, node, and boost kthreads
    rcu: Avoid acquiring rcu_node locks in timer functions
    atomic: Add atomic_or()
    Documentation: Add statistics about nested locks
    rcu: Decrease memory-barrier usage based on semi-formal proof
    rcu: Make rcu_enter_nohz() pay attention to nesting
    rcu: Don't do reschedule unless in irq
    rcu: Remove old memory barriers from rcu_process_callbacks()
    rcu: Add memory barriers
    rcu: Fix unpaired rcu_irq_enter() from locking selftests

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits)
    perf: Fix SIGIO handling
    perf top: Don't stop if no kernel symtab is found
    perf top: Handle kptr_restrict
    perf top: Remove unused macro
    perf events: initialize fd array to -1 instead of 0
    perf tools: Make sure kptr_restrict warnings fit 80 col terms
    perf tools: Fix build on older systems
    perf symbols: Handle /proc/sys/kernel/kptr_restrict
    perf: Remove duplicate headers
    ftrace: Add internal recursive checks
    tracing: Update btrfs's tracepoints to use u64 interface
    tracing: Add __print_symbolic_u64 to avoid warnings on 32bit machine
    ftrace: Set ops->flag to enabled even on static function tracing
    tracing: Have event with function tracer check error return
    ftrace: Have ftrace_startup() return failure code
    jump_label: Check entries limit in __jump_label_update
    ftrace/recordmcount: Avoid STT_FUNC symbols as base on ARM
    scripts/tags.sh: Add magic for trace-events for etags too
    scripts/tags.sh: Fix ctags for DEFINE_EVENT()
    x86/ftrace: Fix compiler warning in ftrace.c
    ...

    Linus Torvalds
     

28 May, 2011

11 commits

  • Upon creation, kthreads are in TASK_UNINTERRUPTIBLE state, which can
    result in softlockup warnings. Because some of RCU's kthreads can
    legitimately be idle indefinitely, start them in TASK_INTERRUPTIBLE
    state in order to avoid those warnings.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney
    Tested-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     
  • It is not necessary to use waitqueues for the RCU kthreads because
    we always know exactly which thread is to be awakened. In addition,
    wake_up() only issues an actual wakeup when there is a thread waiting on
    the queue, which was why there was an extra explicit wake_up_process()
    to get the RCU kthreads started.

    Eliminating the waitqueues (and wake_up()) in favor of wake_up_process()
    eliminates the need for the initial wake_up_process() and also shrinks
    the data structure size a bit. The wakeup logic is placed in a new
    rcu_wait() macro.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • This commit switches manipulations of the rcu_node ->wakemask field
    to atomic operations, which allows rcu_cpu_kthread_timer() to avoid
    acquiring the rcu_node lock. This should avoid the following lockdep
    splat reported by Valdis Kletnieks:

    [ 12.872150] usb 1-4: new high speed USB device number 3 using ehci_hcd
    [ 12.986667] usb 1-4: New USB device found, idVendor=413c, idProduct=2513
    [ 12.986679] usb 1-4: New USB device strings: Mfr=0, Product=0, SerialNumber=0
    [ 12.987691] hub 1-4:1.0: USB hub found
    [ 12.987877] hub 1-4:1.0: 3 ports detected
    [ 12.996372] input: PS/2 Generic Mouse as /devices/platform/i8042/serio1/input/input10
    [ 13.071471] udevadm used greatest stack depth: 3984 bytes left
    [ 13.172129]
    [ 13.172130] =======================================================
    [ 13.172425] [ INFO: possible circular locking dependency detected ]
    [ 13.172650] 2.6.39-rc6-mmotm0506 #1
    [ 13.172773] -------------------------------------------------------
    [ 13.172997] blkid/267 is trying to acquire lock:
    [ 13.173009] (&p->pi_lock){-.-.-.}, at: [] try_to_wake_up+0x29/0x1aa
    [ 13.173009]
    [ 13.173009] but task is already holding lock:
    [ 13.173009] (rcu_node_level_0){..-...}, at: [] rcu_cpu_kthread_timer+0x27/0x58
    [ 13.173009]
    [ 13.173009] which lock already depends on the new lock.
    [ 13.173009]
    [ 13.173009]
    [ 13.173009] the existing dependency chain (in reverse order) is:
    [ 13.173009]
    [ 13.173009] -> #2 (rcu_node_level_0){..-...}:
    [ 13.173009] [] check_prevs_add+0x8b/0x104
    [ 13.173009] [] validate_chain+0x36f/0x3ab
    [ 13.173009] [] __lock_acquire+0x369/0x3e2
    [ 13.173009] [] lock_acquire+0xfc/0x14c
    [ 13.173009] [] _raw_spin_lock+0x36/0x45
    [ 13.173009] [] rcu_read_unlock_special+0x8c/0x1d5
    [ 13.173009] [] __rcu_read_unlock+0x4f/0xd7
    [ 13.173009] [] rcu_read_unlock+0x21/0x23
    [ 13.173009] [] cpuacct_charge+0x6c/0x75
    [ 13.173009] [] update_curr+0x101/0x12e
    [ 13.173009] [] check_preempt_wakeup+0xf7/0x23b
    [ 13.173009] [] check_preempt_curr+0x2b/0x68
    [ 13.173009] [] ttwu_do_wakeup+0x76/0x128
    [ 13.173009] [] ttwu_do_activate.constprop.63+0x57/0x5c
    [ 13.173009] [] scheduler_ipi+0x48/0x5d
    [ 13.173009] [] smp_reschedule_interrupt+0x16/0x18
    [ 13.173009] [] reschedule_interrupt+0x13/0x20
    [ 13.173009] [] rcu_read_unlock+0x21/0x23
    [ 13.173009] [] find_get_page+0xa9/0xb9
    [ 13.173009] [] filemap_fault+0x6a/0x34d
    [ 13.173009] [] __do_fault+0x54/0x3e6
    [ 13.173009] [] handle_pte_fault+0x12c/0x1ed
    [ 13.173009] [] handle_mm_fault+0x1cd/0x1e0
    [ 13.173009] [] do_page_fault+0x42d/0x5de
    [ 13.173009] [] page_fault+0x1f/0x30
    [ 13.173009]
    [ 13.173009] -> #1 (&rq->lock){-.-.-.}:
    [ 13.173009] [] check_prevs_add+0x8b/0x104
    [ 13.173009] [] validate_chain+0x36f/0x3ab
    [ 13.173009] [] __lock_acquire+0x369/0x3e2
    [ 13.173009] [] lock_acquire+0xfc/0x14c
    [ 13.173009] [] _raw_spin_lock+0x36/0x45
    [ 13.173009] [] __task_rq_lock+0x8b/0xd3
    [ 13.173009] [] wake_up_new_task+0x41/0x108
    [ 13.173009] [] do_fork+0x265/0x33f
    [ 13.173009] [] kernel_thread+0x6b/0x6d
    [ 13.173009] [] rest_init+0x21/0xd2
    [ 13.173009] [] start_kernel+0x3bb/0x3c6
    [ 13.173009] [] x86_64_start_reservations+0xaf/0xb3
    [ 13.173009] [] x86_64_start_kernel+0xf0/0xf7
    [ 13.173009]
    [ 13.173009] -> #0 (&p->pi_lock){-.-.-.}:
    [ 13.173009] [] check_prev_add+0x68/0x20e
    [ 13.173009] [] check_prevs_add+0x8b/0x104
    [ 13.173009] [] validate_chain+0x36f/0x3ab
    [ 13.173009] [] __lock_acquire+0x369/0x3e2
    [ 13.173009] [] lock_acquire+0xfc/0x14c
    [ 13.173009] [] _raw_spin_lock_irqsave+0x44/0x57
    [ 13.173009] [] try_to_wake_up+0x29/0x1aa
    [ 13.173009] [] wake_up_process+0x10/0x12
    [ 13.173009] [] rcu_cpu_kthread_timer+0x44/0x58
    [ 13.173009] [] call_timer_fn+0xac/0x1e9
    [ 13.173009] [] run_timer_softirq+0x1aa/0x1f2
    [ 13.173009] [] __do_softirq+0x109/0x26a
    [ 13.173009] [] call_softirq+0x1c/0x30
    [ 13.173009] [] do_softirq+0x44/0xf1
    [ 13.173009] [] irq_exit+0x58/0xc8
    [ 13.173009] [] smp_apic_timer_interrupt+0x79/0x87
    [ 13.173009] [] apic_timer_interrupt+0x13/0x20
    [ 13.173009] [] get_page_from_freelist+0x2aa/0x310
    [ 13.173009] [] __alloc_pages_nodemask+0x178/0x243
    [ 13.173009] [] pte_alloc_one+0x1e/0x3a
    [ 13.173009] [] __pte_alloc+0x22/0x14b
    [ 13.173009] [] handle_mm_fault+0x17e/0x1e0
    [ 13.173009] [] do_page_fault+0x42d/0x5de
    [ 13.173009] [] page_fault+0x1f/0x30
    [ 13.173009]
    [ 13.173009] other info that might help us debug this:
    [ 13.173009]
    [ 13.173009] Chain exists of:
    [ 13.173009] &p->pi_lock --> &rq->lock --> rcu_node_level_0
    [ 13.173009]
    [ 13.173009] Possible unsafe locking scenario:
    [ 13.173009]
    [ 13.173009]        CPU0                    CPU1
    [ 13.173009]        ----                    ----
    [ 13.173009]   lock(rcu_node_level_0);
    [ 13.173009]                                lock(&rq->lock);
    [ 13.173009]                                lock(rcu_node_level_0);
    [ 13.173009]   lock(&p->pi_lock);
    [ 13.173009]
    [ 13.173009] *** DEADLOCK ***
    [ 13.173009]
    [ 13.173009] 3 locks held by blkid/267:
    [ 13.173009] #0: (&mm->mmap_sem){++++++}, at: [] do_page_fault+0x1f3/0x5de
    [ 13.173009] #1: (&yield_timer){+.-...}, at: [] call_timer_fn+0x0/0x1e9
    [ 13.173009] #2: (rcu_node_level_0){..-...}, at: [] rcu_cpu_kthread_timer+0x27/0x58
    [ 13.173009]
    [ 13.173009] stack backtrace:
    [ 13.173009] Pid: 267, comm: blkid Not tainted 2.6.39-rc6-mmotm0506 #1
    [ 13.173009] Call Trace:
    [ 13.173009] [] print_circular_bug+0xc8/0xd9
    [ 13.173009] [] check_prev_add+0x68/0x20e
    [ 13.173009] [] ? save_stack_trace+0x28/0x46
    [ 13.173009] [] check_prevs_add+0x8b/0x104
    [ 13.173009] [] validate_chain+0x36f/0x3ab
    [ 13.173009] [] __lock_acquire+0x369/0x3e2
    [ 13.173009] [] ? try_to_wake_up+0x29/0x1aa
    [ 13.173009] [] lock_acquire+0xfc/0x14c
    [ 13.173009] [] ? try_to_wake_up+0x29/0x1aa
    [ 13.173009] [] ? rcu_check_quiescent_state+0x82/0x82
    [ 13.173009] [] _raw_spin_lock_irqsave+0x44/0x57
    [ 13.173009] [] ? try_to_wake_up+0x29/0x1aa
    [ 13.173009] [] try_to_wake_up+0x29/0x1aa
    [ 13.173009] [] ? rcu_check_quiescent_state+0x82/0x82
    [ 13.173009] [] wake_up_process+0x10/0x12
    [ 13.173009] [] rcu_cpu_kthread_timer+0x44/0x58
    [ 13.173009] [] ? rcu_check_quiescent_state+0x82/0x82
    [ 13.173009] [] call_timer_fn+0xac/0x1e9
    [ 13.173009] [] ? del_timer+0x75/0x75
    [ 13.173009] [] ? rcu_check_quiescent_state+0x82/0x82
    [ 13.173009] [] run_timer_softirq+0x1aa/0x1f2
    [ 13.173009] [] __do_softirq+0x109/0x26a
    [ 13.173009] [] ? tick_dev_program_event+0x37/0xf6
    [ 13.173009] [] ? time_hardirqs_off+0x1b/0x2f
    [ 13.173009] [] call_softirq+0x1c/0x30
    [ 13.173009] [] do_softirq+0x44/0xf1
    [ 13.173009] [] irq_exit+0x58/0xc8
    [ 13.173009] [] smp_apic_timer_interrupt+0x79/0x87
    [ 13.173009] [] apic_timer_interrupt+0x13/0x20
    [ 13.173009] [] ? get_page_from_freelist+0x114/0x310
    [ 13.173009] [] ? get_page_from_freelist+0x2aa/0x310
    [ 13.173009] [] ? clear_page_c+0x7/0x10
    [ 13.173009] [] ? prep_new_page+0x14c/0x1cd
    [ 13.173009] [] get_page_from_freelist+0x2aa/0x310
    [ 13.173009] [] __alloc_pages_nodemask+0x178/0x243
    [ 13.173009] [] ? __pmd_alloc+0x87/0x99
    [ 13.173009] [] pte_alloc_one+0x1e/0x3a
    [ 13.173009] [] ? __pmd_alloc+0x87/0x99
    [ 13.173009] [] __pte_alloc+0x22/0x14b
    [ 13.173009] [] handle_mm_fault+0x17e/0x1e0
    [ 13.173009] [] do_page_fault+0x42d/0x5de
    [ 13.173009] [] ? sys_brk+0x32/0x10c
    [ 13.173009] [] ? time_hardirqs_off+0x1b/0x2f
    [ 13.173009] [] ? trace_hardirqs_off_caller+0x3f/0x9c
    [ 13.173009] [] ? trace_hardirqs_off_thunk+0x3a/0x3c
    [ 13.173009] [] page_fault+0x1f/0x30
    [ 14.010075] usb 5-1: new full speed USB device number 2 using uhci_hcd

    Reported-by: Valdis Kletnieks
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
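
    The lock-free mask update can be sketched with C11 atomics. The struct and function names are hypothetical, and the actual wake-up of the kthread is left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the rcu_node field discussed above. */
struct rcu_node_sketch {
    atomic_ulong wakemask;
};

/* Timer side: publish one cpu's bit without taking any lock. */
void kthread_timer_sketch(struct rcu_node_sketch *rnp, unsigned int cpu_bit)
{
    assert(cpu_bit < sizeof(unsigned long) * CHAR_BIT);
    atomic_fetch_or(&rnp->wakemask, 1UL << cpu_bit);
    /* a wake-up of the node kthread would follow here */
}

/* Kthread side: take and clear all pending bits in one atomic step. */
unsigned long take_wakemask_sketch(struct rcu_node_sketch *rnp)
{
    return atomic_exchange(&rnp->wakemask, 0UL);
}
```

    Because the timer function no longer holds the node lock when it issues the wake-up, the lock ordering reported in the splat cannot arise on this path.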
     
  • …ck/linux-2.6-rcu into core/urgent

    Ingo Molnar
     
  • Vince noticed that unless we mmap() a buffer, SIGIO gets lost. So
    explicitly push the wakeup (including signals) when requested.

    Reported-by: Vince Weaver
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-2euus3f3x3dyvdk52cjxw8zu@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The rule is, we have to update tsk->rt.nr_cpus_allowed if we change
    tsk->cpus_allowed. Otherwise the RT scheduler may get confused.

    Signed-off-by: KOSAKI Motohiro
    Cc: Oleg Nesterov
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/4DD4B3FA.5060901@jp.fujitsu.com
    Signed-off-by: Ingo Molnar

    KOSAKI Motohiro
     
  • Dima Zavin reported:

    "After pulling the thread off the run-queue during a cgroup change,
    the cfs_rq.min_vruntime gets recalculated. The dequeued thread's vruntime
    then gets normalized to this new value. This can then lead to the thread
    getting an unfair boost in the new group if the vruntime of the next
    task in the old run-queue was way further ahead."

    Reported-by: Dima Zavin
    Signed-off-by: John Stultz
    Recalls-having-tested-once-upon-a-time-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1305674470-23727-1-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Marc reported that e4a52bcb9 (sched: Remove rq->lock from the first
    half of ttwu()) broke his ARM-SMP machine. Now ARM is one of the few
    __ARCH_WANT_INTERRUPTS_ON_CTXSW users, so that exception in the ttwu()
    code was suspect.

    Yong found that the interrupt could hit after context_switch() changes
    current but before it clears p->on_cpu. If that interrupt were to
    attempt a wake-up of p we would indeed find ourselves spinning in IRQ
    context.

    Fix this by reverting to the old behaviour for this situation and
    perform a full remote wake-up.

    Cc: Frank Rowand
    Cc: Yong Zhang
    Cc: Oleg Nesterov
    Reported-by: Marc Zyngier
    Tested-by: Marc Zyngier
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • sched_domain iterations need to be protected by rcu_read_lock() now.
    This patch adds the lock in two more places that need it, which were
    spotted by the following suspicious rcu_dereference_check() usage warnings:

    kernel/sched_rt.c:1244 invoked rcu_dereference_check() without protection!
    kernel/sched_stats.h:41 invoked rcu_dereference_check() without protection!

    Signed-off-by: Xiaotian Feng
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1303469634-11678-1-git-send-email-dfeng@redhat.com
    Signed-off-by: Ingo Molnar

    Xiaotian Feng
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
    PM: Fix PM QOS's user mode interface to work with ASCII input
    PM / Hibernate: Update kerneldoc comments in hibernate.c
    PM / Hibernate: Remove arch_prepare_suspend()
    PM / Hibernate: Update some comments in core hibernate code

    Linus Torvalds
     
  • * 'docs-move' of git://git.kernel.org/pub/scm/linux/kernel/git/rdunlap/linux-docs:
    Create Documentation/security/, move LSM-, credentials-, and keys-related files from Documentation/ to Documentation/security/, add Documentation/security/00-INDEX, and update all occurrences of Documentation/ to Documentation/security/.

    Linus Torvalds