04 Feb, 2010

1 commit

  • hrtimer callbacks are always run from hardirq context, either the
    jiffy tick interrupt or the hrtimer device interrupt.

    [ there is currently one exception that can still call a hrtimer
    callback from softirq, but even in that case this will still
    work correctly. ]

    Reported-by: Wei Yongjun
    Signed-off-by: Peter Zijlstra
    Cc: Yury Polyanskiy
    Tested-by: Wei Yongjun
    Acked-by: David S. Miller
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
     

15 Dec, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
    m68k: rename global variable vmalloc_end to m68k_vmalloc_end
    percpu: add missing per_cpu_ptr_to_phys() definition for UP
    percpu: Fix kdump failure if booted with percpu_alloc=page
    percpu: make misc percpu symbols unique
    percpu: make percpu symbols in ia64 unique
    percpu: make percpu symbols in powerpc unique
    percpu: make percpu symbols in x86 unique
    percpu: make percpu symbols in xen unique
    percpu: make percpu symbols in cpufreq unique
    percpu: make percpu symbols in oprofile unique
    percpu: make percpu symbols in tracer unique
    percpu: make percpu symbols under kernel/ and mm/ unique
    percpu: remove some sparse warnings
    percpu: make alloc_percpu() handle array types
    vmalloc: fix use of non-existent percpu variable in put_cpu_var()
    this_cpu: Use this_cpu_xx in trace_functions_graph.c
    this_cpu: Use this_cpu_xx for ftrace
    this_cpu: Use this_cpu_xx in nmi handling
    this_cpu: Use this_cpu operations in RCU
    this_cpu: Use this_cpu ops for VM statistics
    ...

    Fix up trivial (famous last words) global per-cpu naming conflicts in
    arch/x86/kvm/svm.c
    mm/slab.c

    Linus Torvalds
     

02 Nov, 2009

1 commit

  • Currently, rcu_irq_exit() is invoked only for CONFIG_NO_HZ,
    while rcu_irq_enter() is invoked unconditionally. This patch
    moves rcu_irq_exit() out from under CONFIG_NO_HZ so that the
    calls are balanced.

    This patch has no effect on the behavior of the kernel because
    both rcu_irq_enter() and rcu_irq_exit() are empty for
    !CONFIG_NO_HZ, but the code is easier to understand if the calls
    are obviously balanced in all cases.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Paul E. McKenney
    Cc: dipankar@in.ibm.com
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josh@joshtriplett.org
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    Cc: Valdis.Kletnieks@vt.edu
    Cc: dhowells@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     

29 Oct, 2009

1 commit

  • This patch updates percpu related symbols under kernel/ and mm/ such
    that percpu symbols are unique and don't clash with local symbols.
    This serves two purposes of decreasing the possibility of global
    percpu symbol collision and allowing dropping per_cpu__ prefix from
    percpu symbols.

    * kernel/lockdep.c: s/lock_stats/cpu_lock_stats/

    * kernel/sched.c: s/init_rq_rt/init_rt_rq_var/ (any better idea?)
    s/sched_group_cpus/sched_groups/

    * kernel/softirq.c: s/ksoftirqd/run_ksoftirqd/

    * kernel/softlockup.c: s/(*)_timestamp/softlockup_\1_ts/
    s/watchdog_task/softlockup_watchdog/
    s/timestamp/ts/ for local variables

    * kernel/time/timer_stats: s/lookup_lock/tstats_lookup_lock/

    * mm/slab.c: s/reap_work/slab_reap_work/
    s/reap_node/slab_reap_node/

    * mm/vmstat.c: local variable changed to avoid collision with vmstat_work

    Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
    which cause name clashes" patch.

    Signed-off-by: Tejun Heo
    Acked-by: (slab/vmstat) Christoph Lameter
    Reviewed-by: Christoph Lameter
    Cc: Rusty Russell
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Andrew Morton
    Cc: Nick Piggin

    Tejun Heo
     

18 Sep, 2009

1 commit


23 Aug, 2009

1 commit

  • Make RCU-sched, RCU-bh, and RCU-preempt be underlying
    implementations, with "RCU" defined in terms of one of the
    three. Update the outdated rcu_qsctr_inc() names, as these
    functions no longer increment anything.

    Signed-off-by: Paul E. McKenney
    Cc: laijs@cn.fujitsu.com
    Cc: dipankar@in.ibm.com
    Cc: akpm@linux-foundation.org
    Cc: mathieu.desnoyers@polymtl.ca
    Cc: josht@linux.vnet.ibm.com
    Cc: dvhltc@us.ibm.com
    Cc: niv@us.ibm.com
    Cc: peterz@infradead.org
    Cc: rostedt@goodmis.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

22 Jul, 2009

1 commit

  • commit ca109491f (hrtimer: removing all ur callback modes) moved all
    hrtimer callbacks into hard interrupt context when high resolution
    timers are active. That breaks code which relied on the assumption
    that the callback happens in softirq context.

    Provide a generic infrastructure which combines tasklets and hrtimers
    together to provide an in-softirq hrtimer experience.

    Signed-off-by: Peter Zijlstra
    Cc: torvalds@linux-foundation.org
    Cc: kaber@trash.net
    Cc: David Miller
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
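
The split described above, where the timer expires in hard interrupt context but the real callback runs later in softirq context, can be modelled in userspace. This is only a minimal sketch: `soft_timer`, `soft_timer_expire_hardirq()`, `soft_timer_run_softirq()` and `count_call()` are hypothetical names invented for illustration, not the kernel interface added by this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct soft_timer {
    void (*fn)(void *data); /* callback that expects softirq context */
    void *data;
    int pending;            /* set in "hardirq", cleared in "softirq" */
};

/* Models the hrtimer expiry path. It runs in hard interrupt context,
 * so it only marks the work as pending and never calls fn itself. */
void soft_timer_expire_hardirq(struct soft_timer *t)
{
    assert(t != NULL);
    t->pending = 1;
}

/* Models the tasklet half, which runs later in softirq context.
 * Returns 1 if the callback was invoked, 0 if nothing was pending. */
int soft_timer_run_softirq(struct soft_timer *t)
{
    assert(t != NULL && t->fn != NULL);
    if (!t->pending)
        return 0;
    t->pending = 0;
    t->fn(t->data);
    return 1;
}

/* Sample callback: counts how often it has been called. */
void count_call(void *data)
{
    ++*(int *)data;
}
```

In this model the callback never runs from the expiry path itself, which is the property that code relying on softirq context needs.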
     

19 Jun, 2009

1 commit

  • Statistics for softirqs do not exist yet, although they would be
    as helpful as the statistics for interrupts.
    This patch introduces a count of softirq invocations,
    which is exported in /proc/softirqs.

    When a softirq handler consumes a lot of CPU time,
    /proc/stat looks like the following.

    $ while :; do cat /proc/stat | head -n1 ; sleep 10 ; done
    cpu   88 0  408 739665 583 28    2 0 0
    cpu  450 0 1090 740970 594 28 1294 0 0
                                  ^^^^
                                  softirq

    In such a situation,
    /proc/softirqs shows us which softirq handlers are invoked
    and how fast each count is increasing.

    $ cat /proc/softirqs
                 CPU0     CPU1     CPU2     CPU3
    HI              0        0        0        0
    TIMER      462850   462805   462782   462718
    NET_TX          0        0        0      365
    NET_RX       2472        2        2       40
    BLOCK           0        0      381     1164
    TASKLET         0        0        0      224
    SCHED      462654   462689   462698   462427
    RCU          3046     2423     3367     3173

    $ cat /proc/softirqs
                 CPU0     CPU1     CPU2     CPU3
    HI              0        0        0        0
    TIMER      463361   465077   465056   464991
    NET_TX         53        0        1      365
    NET_RX       3757        2        2       40
    BLOCK           0        0      398     1170
    TASKLET         0        0        0      224
    SCHED      463074   464318   464612   463330
    RCU          3505     2948     3947     3673

    When the CPU time spent in softirq is high,
    the rates of increase are as follows.
    TIMER : 220/sec : CPU1-3
    NET_TX : 5/sec : CPU0
    NET_RX : 120/sec : CPU0
    SCHED : 40-200/sec : all CPU
    RCU : 45-58/sec : all CPU

    The rates of increase when idle are as follows.
    TIMER : 250/sec
    SCHED : 250/sec
    RCU : 2/sec

    It seems that many softirqs for receiving packets and for RCU are
    invoked. This information helps when checking the system.

    Signed-off-by: Keika Kobayashi
    Reviewed-by: Hiroshi Shimamoto
    Reviewed-by: KOSAKI Motohiro
    Cc: Ingo Molnar
    Cc: Eric Dumazet
    Cc: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Keika Kobayashi
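
The rates of increase quoted above come from comparing two snapshots of /proc/softirqs. A minimal sketch of that arithmetic follows; `softirq_rate()` is a hypothetical helper, and the 10-second interval used in the note below is an assumption borrowed from the `sleep 10` in the /proc/stat loop, since the commit does not state how far apart the two /proc/softirqs snapshots were taken.

```c
#include <assert.h>

/* Hypothetical helper: per-second rate of increase of one softirq
 * counter between two /proc/softirqs snapshots taken `seconds` apart. */
double softirq_rate(unsigned long before, unsigned long after, double seconds)
{
    assert(seconds > 0.0);
    assert(after >= before); /* counters only grow between snapshots */
    return (double)(after - before) / seconds;
}
```

Under the 10-second assumption, the TIMER counters for CPU1 (462805 and 465077) give about 227 per second and the NET_RX counters for CPU0 (2472 and 3757) give about 128 per second, the same order of magnitude as the rounded figures listed in the commit message.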
     

13 Jun, 2009

1 commit

  • Rationale: kmemcheck needs to be able to schedule a tasklet without
    touching any dynamically allocated memory _at_ _all_ (since that would
    lead to a recursive page fault). This tasklet is used for writing the
    error reports to the kernel log.

    The new scheduling function avoids touching any other tasklets by
    inserting the new tasklet as the head of the "tasklet_hi" list instead
    of on the tail.

    Also don't wake up the softirq thread lest the scheduler access some
    tracked memory and we go down with a recursive page fault.

    In this case, we'd better just wait for the maximum time of 1/HZ for the
    message to appear.

    Signed-off-by: Vegard Nossum

    Vegard Nossum
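
Why head insertion avoids touching other tasklets can be seen in a small singly linked queue model. This is a sketch under assumptions: `struct node`, `struct queue` and the `queue_*` functions are hypothetical stand-ins, not the kernel's tasklet structures or the function added by this commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a queue with a head pointer and a pointer to
 * the `next` field that a tail insert must write. */
struct node { struct node *next; };
struct queue { struct node *head; struct node **tail; };

void queue_init(struct queue *q)
{
    q->head = NULL;
    q->tail = &q->head;
}

/* Tail insert: when the queue is not empty, *q->tail is the `next`
 * field of the last queued node, so this writes into another node. */
void queue_add_tail(struct queue *q, struct node *n)
{
    assert(n != NULL);
    n->next = NULL;
    *q->tail = n;
    q->tail = &n->next;
}

/* Head insert: reads and writes only the queue itself and the new
 * node, never the memory of nodes that are already queued. */
void queue_add_head(struct queue *q, struct node *n)
{
    assert(n != NULL);
    n->next = q->head;
    if (q->head == NULL)
        q->tail = &n->next; /* keep the tail valid for an empty queue */
    q->head = n;
}
```

In the model, `queue_add_head()` never dereferences an existing node, which mirrors the requirement that scheduling the kmemcheck tasklet must not touch dynamically allocated memory belonging to other tasklets.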
     

11 Jun, 2009

1 commit

  • * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits)
    Revert "x86, bts: reenable ptrace branch trace support"
    tracing: do not translate event helper macros in print format
    ftrace/documentation: fix typo in function grapher name
    tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK
    tracing: add protection around module events unload
    tracing: add trace_seq_vprint interface
    tracing: fix the block trace points print size
    tracing/events: convert block trace points to TRACE_EVENT()
    ring-buffer: fix ret in rb_add_time_stamp
    ring-buffer: pass in lockdep class key for reader_lock
    tracing: add annotation to what type of stack trace is recorded
    tracing: fix multiple use of __print_flags and __print_symbolic
    tracing/events: fix output format of user stack
    tracing/events: fix output format of kernel stack
    tracing/trace_stack: fix the number of entries in the header
    ring-buffer: discard timestamps that are at the start of the buffer
    ring-buffer: try to discard unneeded timestamps
    ring-buffer: fix bug in ring_buffer_discard_commit
    ftrace: do not profile functions when disabled
    tracing: make trace pipe recognize latency format flag
    ...

    Linus Torvalds
     

07 May, 2009

1 commit


29 Apr, 2009

1 commit

  • "tracing: create automated trace defines" causes this compile error on s390,
    as reported by Sachin Sant against linux-next:

    kernel/built-in.o: In function `__do_softirq':
    (.text+0x1c680): undefined reference to `__tracepoint_softirq_entry'

    This happens because the definitions of the softirq tracepoints were moved
    from kernel/softirq.c to kernel/irq/handle.c. Since s390 doesn't support
    generic hardirqs handle.c doesn't get compiled and the definitions are
    missing.

    So move the tracepoints to softirq.c again.

    [ Impact: fix build failure on s390 ]

    Reported-by: Sachin Sant
    Signed-off-by: Heiko Carstens
    Cc: Steven Rostedt
    Cc: fweisbec@gmail.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     

28 Apr, 2009

1 commit

  • This simplifies the node awareness of the code. All our allocators
    only deal with a NUMA node ID locality not with CPU ids anyway - so
    there's no need to maintain (and transform) a CPU id all across the
    IRQ layer.

    v2: keep move_irq_desc related

    [ Impact: cleanup, prepare IRQ code to be NUMA-aware ]

    Signed-off-by: Yinghai Lu
    Cc: Andrew Morton
    Cc: Suresh Siddha
    Cc: "Eric W. Biederman"
    Cc: Rusty Russell
    Cc: Jeremy Fitzhardinge
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     

17 Apr, 2009

1 commit


15 Apr, 2009

2 commits

  • Impact: clean up

    Create a sub directory in include/trace called events to keep the
    trace point headers in their own separate directory. Only headers that
    declare trace points should be defined in this directory.

    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Neil Horman
    Cc: Zhao Lei
    Cc: Eduard - Gabriel Munteanu
    Cc: Pekka Enberg
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • This patch lowers the number of places a developer must modify to add
    new tracepoints. The current method to add a new tracepoint
    into an existing system is to write the trace point macro in the
    trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or
    DECLARE_TRACE, then they must add the same named item into the C file
    with the macro DEFINE_TRACE(name) and then add the trace point.

    This change removes the need to add the DEFINE_TRACE(name).
    Every file that uses the tracepoint must still include the trace/.h
    file, but the one C file must also add a define before the including
    of that file.

    #define CREATE_TRACE_POINTS
    #include <trace/mytrace.h>

    This will cause the trace/mytrace.h file to also produce the C code
    necessary to implement the trace point.

    Note, if more than one trace/.h is used to create the C code
    it is best to list them all together.

    #define CREATE_TRACE_POINTS
    #include
    #include
    #include

    Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with
    the cleaner solution of the define above the includes over my first
    design to have the C code include a "special" header.

    This patch converts sched, irq and lockdep and skb to use this new
    method.

    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Neil Horman
    Cc: Zhao Lei
    Cc: Eduard - Gabriel Munteanu
    Cc: Pekka Enberg
    Signed-off-by: Steven Rostedt

    Steven Rostedt
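
The idea of one header producing either declarations or definitions, depending on a macro defined before the include, can be shown in a single file. This is a simplified sketch: `CREATE_DEMO_POINTS`, `DEMO_TRACE` and the `demo_tracepoint_*` variables are hypothetical stand-ins, not the kernel's actual tracepoint macros, and the "header" is inlined so the example compiles on its own.

```c
#include <assert.h>
#include <limits.h>

/* Exactly one C file defines this before the "header" section. */
#define CREATE_DEMO_POINTS

/* ---- what would normally live in the shared header ---- */
#ifdef CREATE_DEMO_POINTS
/* The creating file gets the definition (storage is emitted here). */
#define DEMO_TRACE(name) int demo_tracepoint_##name = 0;
#else
/* Every other file only gets a declaration. */
#define DEMO_TRACE(name) extern int demo_tracepoint_##name;
#endif

DEMO_TRACE(softirq_entry)
DEMO_TRACE(softirq_exit)
/* ---- end of header section ---- */

/* A user of the tracepoint: bumps the counter and returns the new value. */
int demo_hit_softirq_entry(void)
{
    assert(demo_tracepoint_softirq_entry < INT_MAX);
    return ++demo_tracepoint_softirq_entry;
}
```

Files that omit the define would see only the `extern` declarations, so the storage exists in exactly one object file without a separate definition macro in the C source.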
     

07 Apr, 2009

1 commit


06 Apr, 2009

1 commit

  • * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
    tracing, net: fix net tree and tracing tree merge interaction
    tracing, powerpc: fix powerpc tree and tracing tree interaction
    ring-buffer: do not remove reader page from list on ring buffer free
    function-graph: allow unregistering twice
    trace: make argument 'mem' of trace_seq_putmem() const
    tracing: add missing 'extern' keywords to trace_output.h
    tracing: provide trace_seq_reserve()
    blktrace: print out BLK_TN_MESSAGE properly
    blktrace: extract duplidate code
    blktrace: fix memory leak when freeing struct blk_io_trace
    blktrace: fix blk_probes_ref chaos
    blktrace: make classic output more classic
    blktrace: fix off-by-one bug
    blktrace: fix the original blktrace
    blktrace: fix a race when creating blk_tree_root in debugfs
    blktrace: fix timestamp in binary output
    tracing, Text Edit Lock: cleanup
    tracing: filter fix for TRACE_EVENT_FORMAT events
    ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
    x86: kretprobe-booster interrupt emulation code fix
    ...

    Fix up trivial conflicts in
    arch/parisc/include/asm/ftrace.h
    include/linux/memory.h
    kernel/extable.c
    kernel/module.c

    Linus Torvalds
     

04 Apr, 2009

1 commit


31 Mar, 2009

2 commits

  • It appears I inadvertently introduced rq->lock recursion to the
    hrtimer_start() path when I delegated running already expired
    timers to softirq context.

    This patch fixes it by introducing a __hrtimer_start_range_ns()
    method that will not use raise_softirq_irqoff() but
    __raise_softirq_irqoff() which avoids the wakeup.

    It then also changes schedule() to check for pending softirqs and
    do the wakeup then. I'm not quite sure I like this last bit, nor
    am I convinced it's really needed.

    Signed-off-by: Peter Zijlstra
    Cc: Peter Zijlstra
    Cc: paulus@samba.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
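
The difference between raising a softirq with and without the wakeup, plus the deferred check in schedule(), can be sketched with a small userspace model. All names here (`cpu_model`, `model_raise()`, `model_raise_nowake()`, `model_schedule_check()`) are hypothetical; they only echo the behaviour described in the commit message and are not the kernel implementations.

```c
#include <assert.h>

/* Hypothetical per-CPU state for the model. */
struct cpu_model {
    unsigned int pending; /* bitmask of raised softirqs */
    int wakeups;          /* how often the softirq thread was woken */
    int in_interrupt;     /* nonzero while in interrupt context */
};

/* Models the variant that avoids the wakeup: only marks the softirq
 * as pending. */
void model_raise_nowake(struct cpu_model *c, unsigned int nr)
{
    assert(nr < 32);
    c->pending |= 1u << nr;
}

/* Models the normal variant: marks the softirq pending and, outside
 * interrupt context, also wakes the softirq thread. */
void model_raise(struct cpu_model *c, unsigned int nr)
{
    model_raise_nowake(c, nr);
    if (!c->in_interrupt)
        c->wakeups++;
}

/* Models the check added to schedule(): if anything is pending, do
 * the wakeup now. Returns 1 if a wakeup happened, 0 otherwise. */
int model_schedule_check(struct cpu_model *c)
{
    if (!c->pending)
        return 0;
    c->wakeups++;
    return 1;
}
```

In the model, the no-wake variant performs no wakeup while the caller may still hold a lock the wakeup path needs, and the wakeup happens later from the scheduling check instead.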
     
  • Conflicts:
    lib/Kconfig.debug

    Ingo Molnar
     

28 Mar, 2009

1 commit


13 Mar, 2009

7 commits


10 Mar, 2009

2 commits


06 Mar, 2009

1 commit


05 Mar, 2009

1 commit

  • If a machine is flooded by network frames, a cpu can loop
    100% of its time inside ksoftirqd() without calling schedule().
    This can delay RCU grace period to insane values.

    Adding rcu_qsctr_inc() call in ksoftirqd() solves this problem.

    Paul: "This regression was a result of the recent change from
    "schedule()" to "cond_resched()", which got rid of that quiescent
    state in the common case where a reschedule is not needed".

    Signed-off-by: Eric Dumazet
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Eric Dumazet
     

25 Feb, 2009

1 commit

  • Oleg noticed that we don't strictly need CSD_FLAG_WAIT, rework
    the code so that we can use CSD_FLAG_LOCK for both purposes.

    Signed-off-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Linus Torvalds
    Cc: Nick Piggin
    Cc: Jens Axboe
    Cc: "Paul E. McKenney"
    Cc: Rusty Russell
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

23 Jan, 2009

1 commit

  • Impact: fix to preempt trace triggering lockdep check_flag failure

    In local_bh_disable, the use of add_preempt_count causes the
    preempt tracer to start recording the time preemption is off.
    But because it already modified the preempt_count to show
    softirqs disabled, and before it called the lockdep code to
    handle this, it causes a state that lockdep can not handle.

    The preempt tracer will reset the ring buffer on start of a trace,
    and the ring buffer reset code does a spin_lock_irqsave. This
    calls into lockdep and lockdep will fail when it detects the
    invalid state of having softirqs disabled but the internal
    current->softirqs_enabled is still set.

    The fix is to manually add the SOFTIRQ_OFFSET to preempt count
    and call the preempt tracer code outside the lockdep critical
    area.

    Thanks to Peter Zijlstra for suggesting this solution.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
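
The ordering problem described above can be illustrated with a toy model of the preempt count and lockdep's softirq flag. This is a sketch under assumptions: the `MODEL_*` constants (softirq count kept in bits 8 to 15) and every function name are hypothetical, and the "tracer hook" is reduced to a consistency check recorded in the structure.

```c
#include <assert.h>

/* Assumed layout for the model: softirq nesting lives in bits 8..15. */
#define MODEL_SOFTIRQ_SHIFT  8
#define MODEL_SOFTIRQ_OFFSET (1u << MODEL_SOFTIRQ_SHIFT)
#define MODEL_SOFTIRQ_MASK   (0xffu << MODEL_SOFTIRQ_SHIFT)

struct task_model {
    unsigned int preempt_count;
    int softirqs_enabled;      /* lockdep's view of the softirq state */
    int tracer_saw_consistent; /* what the tracer hook observed */
};

/* The state is consistent when the count and lockdep's flag agree. */
int model_state_consistent(const struct task_model *t)
{
    int count_says_disabled = (t->preempt_count & MODEL_SOFTIRQ_MASK) != 0;
    return count_says_disabled == !t->softirqs_enabled;
}

/* Old ordering: the tracer hook runs after the count changed but
 * before lockdep was told, so it observes a mismatched state. */
void model_bh_disable_old(struct task_model *t)
{
    t->preempt_count += MODEL_SOFTIRQ_OFFSET;
    t->tracer_saw_consistent = model_state_consistent(t); /* tracer hook */
    t->softirqs_enabled = 0;                              /* lockdep update */
}

/* Fixed ordering: add the offset manually, update lockdep, and only
 * then call the tracer hook, outside the critical area. */
void model_bh_disable_fixed(struct task_model *t)
{
    t->preempt_count += MODEL_SOFTIRQ_OFFSET;
    t->softirqs_enabled = 0;                              /* lockdep update */
    t->tracer_saw_consistent = model_state_consistent(t); /* tracer hook */
}
```

Both orderings end in the same final state; they differ only in what the tracer hook, and anything it calls, observes in between.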
     

13 Jan, 2009

1 commit


04 Jan, 2009

1 commit

  • …/git/tip/linux-2.6-tip

    * 'cpus4096-for-linus-3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (77 commits)
    x86: setup_per_cpu_areas() cleanup
    cpumask: fix compile error when CONFIG_NR_CPUS is not defined
    cpumask: use alloc_cpumask_var_node where appropriate
    cpumask: convert shared_cpu_map in acpi_processor* structs to cpumask_var_t
    x86: use cpumask_var_t in acpi/boot.c
    x86: cleanup some remaining usages of NR_CPUS where s/b nr_cpu_ids
    sched: put back some stack hog changes that were undone in kernel/sched.c
    x86: enable cpus display of kernel_max and offlined cpus
    ia64: cpumask fix for is_affinity_mask_valid()
    cpumask: convert RCU implementations, fix
    xtensa: define __fls
    mn10300: define __fls
    m32r: define __fls
    h8300: define __fls
    frv: define __fls
    cris: define __fls
    cpumask: CONFIG_DISABLE_OBSOLETE_CPUMASK_FUNCTIONS
    cpumask: zero extra bits in alloc_cpumask_var_node
    cpumask: replace for_each_cpu_mask_nr with for_each_cpu in kernel/time/
    cpumask: convert mm/
    ...

    Linus Torvalds
     

01 Jan, 2009

2 commits

  • Impact: Remove obsolete API usage

    any_online_cpu() is a good name, but it takes a cpumask_t, not a
    pointer.

    There are several places where any_online_cpu() doesn't really want a
    mask arg at all. Replace all callers with cpumask_any() and
    cpumask_any_and().

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis

    Rusty Russell
     
  • …l/git/tip/linux-2.6-tip

    * 'irq-fixes-for-linus-4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sparseirq: move __weak symbols into separate compilation unit
    sparseirq: work around __weak alias bug
    sparseirq: fix hang with !SPARSE_IRQ
    sparseirq: set lock_class for legacy irq when sparse_irq is selected
    sparseirq: work around compiler optimizing away __weak functions
    sparseirq: fix desc->lock init
    sparseirq: do not printk when migrating IRQ descriptors
    sparseirq: remove duplicated arch_early_irq_init()
    irq: simplify for_each_irq_desc() usage
    proc: remove ifdef CONFIG_SPARSE_IRQ from stat.c
    irq: for_each_irq_desc() move to irqnr.h
    hrtimer: remove #include <linux/irq.h>

    Linus Torvalds
     

29 Dec, 2008

1 commit

  • GCC has a bug with __weak alias functions: if the functions are in
    the same compilation unit as their call site, GCC can decide to
    inline them - and thus rob the linker of the opportunity to override
    the weak alias with the real thing.

    So move all the IRQ handling related __weak symbols to kernel/irq/chip.c.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu