23 Mar, 2015

1 commit

  • When CONFIG_IRQ_WORK is not defined (difficult to do, as it also
    requires CONFIG_PRINTK not to be defined), we get a build failure:

    kernel/built-in.o: In function `flush_smp_call_function_queue':
    kernel/smp.c:263: undefined reference to `irq_work_run'
    kernel/smp.c:263: undefined reference to `irq_work_run'
    Makefile:933: recipe for target 'vmlinux' failed

    The simplest fix is to make irq_work_run() a nop when CONFIG_IRQ_WORK
    is not set.
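The shape of that fix can be sketched in plain C (all names here are illustrative stand-ins, not the kernel's actual code): when the config option is off, the caller still compiles because the symbol resolves to an empty inline stub instead of an undefined reference.

```c
/* Sketch of the nop-stub pattern. CONFIG_IRQ_WORK is not defined in
 * this build, so the empty inline stub is used and the call below
 * compiles away instead of producing an undefined reference. */
#ifdef CONFIG_IRQ_WORK
void irq_work_run(void);                  /* real implementation elsewhere */
#else
static inline void irq_work_run(void) { } /* nop when not configured */
#endif

/* hypothetical caller standing in for flush_smp_call_function_queue() */
static inline int flush_queue_sketch(void)
{
    irq_work_run();   /* links fine either way */
    return 0;
}
```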

    Signed-off-by: Steven Rostedt
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20150319101851.4d224d9b@gandalf.local.home
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

14 Sep, 2014

2 commits

  • The nohz full kick, which restarts the tick when any resource depends
    on it, can't be executed from just anywhere given the operations it
    performs on timers. If it is called from the scheduler or timer code,
    chances are that we run into a deadlock.

    This is why we run the nohz full kick from an irq work. That way we
    make sure that the kick runs in a clean context.

    That holds when the irq work runs from its own dedicated self-IPI,
    but things are different for the many architectures that don't
    support self-triggered IPIs. In order to support them, irq works are
    also handled by the timer interrupt as a fallback.

    Now when irq works run from the timer interrupt, the context isn't
    blank. More precisely, they can run in the context of the hrtimer
    that drives the tick. But the nohz kick cancels and restarts this
    hrtimer, and cancelling an hrtimer from within itself isn't allowed.
    This is why we end up in an endless loop:

    Kernel panic - not syncing: Watchdog detected hard LOCKUP on cpu 2
    CPU: 2 PID: 7538 Comm: kworker/u8:8 Not tainted 3.16.0+ #34
    Workqueue: btrfs-endio-write normal_work_helper [btrfs]
    ffff880244c06c88 000000001b486fe1 ffff880244c06bf0 ffffffff8a7f1e37
    ffffffff8ac52a18 ffff880244c06c78 ffffffff8a7ef928 0000000000000010
    ffff880244c06c88 ffff880244c06c20 000000001b486fe1 0000000000000000
    Call Trace:
    [] dump_stack+0x4e/0x7a
    [] panic+0xd4/0x207
    [] watchdog_overflow_callback+0x118/0x120
    [] __perf_event_overflow+0xae/0x350
    [] ? perf_event_task_disable+0xa0/0xa0
    [] ? x86_perf_event_set_period+0xbf/0x150
    [] perf_event_overflow+0x14/0x20
    [] intel_pmu_handle_irq+0x206/0x410
    [] perf_event_nmi_handler+0x2b/0x50
    [] nmi_handle+0xd2/0x390
    [] ? nmi_handle+0x5/0x390
    [] ? match_held_lock+0x8/0x1b0
    [] default_do_nmi+0x72/0x1c0
    [] do_nmi+0xb8/0x100
    [] end_repeat_nmi+0x1e/0x2e
    [] ? match_held_lock+0x8/0x1b0
    [] ? match_held_lock+0x8/0x1b0
    [] ? match_held_lock+0x8/0x1b0
    [] lock_acquired+0xaf/0x450
    [] ? lock_hrtimer_base.isra.20+0x25/0x50
    [] _raw_spin_lock_irqsave+0x78/0x90
    [] ? lock_hrtimer_base.isra.20+0x25/0x50
    [] lock_hrtimer_base.isra.20+0x25/0x50
    [] hrtimer_try_to_cancel+0x33/0x1e0
    [] hrtimer_cancel+0x1a/0x30
    [] tick_nohz_restart+0x17/0x90
    [] __tick_nohz_full_check+0xc3/0x100
    [] nohz_full_kick_work_func+0xe/0x10
    [] irq_work_run_list+0x44/0x70
    [] irq_work_run+0x2a/0x50
    [] update_process_times+0x5b/0x70
    [] tick_sched_handle.isra.21+0x25/0x60
    [] tick_sched_timer+0x41/0x60
    [] __run_hrtimer+0x72/0x470
    [] ? tick_sched_do_timer+0xb0/0xb0
    [] hrtimer_interrupt+0x117/0x270
    [] local_apic_timer_interrupt+0x37/0x60
    [] smp_apic_timer_interrupt+0x3f/0x50
    [] apic_timer_interrupt+0x6f/0x80

    To fix this we force non-lazy irq works to run on irq work self-IPIs
    when available. Whether the arch can trigger irq work self-IPIs is
    reported by arch_irq_work_has_interrupt().
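The enqueue decision described above can be pictured with a small user-space sketch (all names and the flag value are illustrative, not the kernel's): a non-lazy work must get a self-IPI when the arch supports one, so it never runs from the tick's hrtimer context.

```c
#include <stdbool.h>

#define IRQ_WORK_LAZY 1u   /* flag value is illustrative */

struct irq_work_sketch { unsigned int flags; };

/* stand-ins for arch_irq_work_has_interrupt() and the self-IPI */
static bool arch_has_self_ipi = true;
static int ipis_sent;
static void arch_irq_work_raise_sketch(void) { ipis_sent++; }

/* The fix in miniature: a non-lazy work must not wait for the tick
 * when the arch can send an irq-work self-IPI, because the tick's
 * hrtimer context cannot safely cancel/restart itself. */
static void queue_sketch(struct irq_work_sketch *work)
{
    if (!(work->flags & IRQ_WORK_LAZY) && arch_has_self_ipi)
        arch_irq_work_raise_sketch();   /* run in a clean IRQ context */
    /* otherwise: left on the list for the next timer tick */
}
```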

    Reported-by: Catalin Iacob
    Reported-by: Dave Jones
    Acked-by: Peter Zijlstra (Intel)
    Cc: Ingo Molnar
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     
  • The nohz full code needs irq work to trigger its own interrupt so that
    the subsystem can work even when the tick is stopped.

    Let's introduce arch_irq_work_has_interrupt(), which archs can
    override to advertise their support for this ability.
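The override pattern can be sketched with a weak symbol (an assumption for illustration only; the kernel actually uses per-arch headers rather than weak functions): the generic default says "no self-IPI support", and an architecture may supply its own strong definition.

```c
#include <stdbool.h>

/* Sketch of the override pattern: a weak generic default that reports
 * "no self-IPI support", which an architecture could replace with a
 * strong definition returning true. The name mirrors the kernel's
 * arch_irq_work_has_interrupt(); the weak-symbol mechanics are a
 * user-space stand-in for the kernel's per-arch header approach. */
__attribute__((weak)) bool arch_irq_work_has_interrupt(void)
{
    return false;   /* generic fallback: rely on the timer tick */
}
```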

    Signed-off-by: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Frederic Weisbecker

    Peter Zijlstra
     

16 Jun, 2014

1 commit

  • irq work currently only supports local callbacks. However its code
    is mostly ready to run remote callbacks and we have some potential users.

    The full nohz subsystem currently open codes its own remote irq work
    on top of the scheduler ipi when it wants a CPU to reevaluate its next
    tick. However this ad hoc solution bloats the scheduler IPI.

    Let's just extend the irq work subsystem to support remote queuing on
    top of the generic SMP IPI to handle this kind of user. This shouldn't
    add noticeable overhead.
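Remote queuing as described can be sketched in user space (all names hypothetical; the kernel's actual entry point is irq_work_queue_on()): push onto the target CPU's list, then kick that CPU with the generic SMP IPI rather than piggybacking on the scheduler IPI.

```c
#include <stdbool.h>

#define NR_CPUS_SKETCH 4

struct remote_work { void (*func)(void); struct remote_work *next; };

static struct remote_work *cpu_queue[NR_CPUS_SKETCH];
static int ipi_count[NR_CPUS_SKETCH];

/* stand-in for the generic SMP call IPI mentioned in the text */
static void send_smp_ipi_sketch(int cpu) { ipi_count[cpu]++; }

/* Sketch of remote queueing: enqueue on the target CPU's list, then
 * notify it through the generic SMP IPI instead of open-coding a
 * hook in the scheduler IPI. */
static bool queue_on_sketch(struct remote_work *w, int cpu)
{
    if (cpu < 0 || cpu >= NR_CPUS_SKETCH)
        return false;
    w->next = cpu_queue[cpu];
    cpu_queue[cpu] = w;
    send_smp_ipi_sketch(cpu);
    return true;
}
```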

    Suggested-by: Peter Zijlstra
    Acked-by: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Eric Dumazet
    Cc: Ingo Molnar
    Cc: Kevin Hilman
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Viresh Kumar
    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     

22 Feb, 2014

1 commit

  • On Mon, Feb 10, 2014 at 08:45:16AM -0800, Dave Hansen wrote:
    > The reason I coded this up was that NMIs were firing off so fast that
    > nothing else was getting a chance to run. With this patch, at least the
    > printk() would come out and I'd have some idea what was going on.

    It will start spewing to early_printk() (which is a lot nicer to use
    from NMI context too) when it fails to queue the IRQ-work because it's
    already enqueued.

    It does have a false positive for when two CPUs trigger the warn
    concurrently, but that should be rare and some extra clutter in the
    early printk shouldn't be a problem.
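The claim-or-fallback behavior can be sketched with an atomic flag (all names illustrative; the kernel tracks the pending state in the irq_work's own flags): if the work is already claimed, print directly via the early console instead of dropping the message.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag work_claimed = ATOMIC_FLAG_INIT;
static int early_printk_lines;

/* stand-in for early_printk(), safe to call from NMI context */
static void early_printk_sketch(const char *msg)
{
    (void)msg;
    early_printk_lines++;
}

/* Sketch of the fallback: if the IRQ-work is already pending (the
 * claim fails), emit the message through the early console right
 * away rather than silently losing it. */
static bool warn_from_nmi_sketch(const char *msg)
{
    if (atomic_flag_test_and_set(&work_claimed)) {
        early_printk_sketch(msg);   /* already queued: print directly */
        return false;
    }
    return true;                    /* claimed: queue the irq work (elided) */
}
```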

    Cc: hpa@zytor.com
    Cc: tglx@linutronix.de
    Cc: dzickus@redhat.com
    Cc: Dave Hansen
    Cc: mingo@kernel.org
    Fixes: 6a02ad66b2c4 ("perf/x86: Push the duration-logging printk() to IRQ context")
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140211150116.GO27965@twins.programming.kicks-ass.net
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
     

09 Feb, 2014

1 commit

  • Calling printk() from NMI context is bad (TM), so move it to IRQ
    context.

    This also avoids the problem where the printk() time is measured by
    the generic NMI duration goo and triggers a second warning.
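The NMI-to-IRQ handoff can be pictured with a minimal user-space sketch (all names hypothetical): the NMI side only records the message and marks work pending; the actual printing happens later from the irq-work callback, outside the NMI duration measurement.

```c
#include <stdbool.h>
#include <string.h>

static char pending_msg[64];
static bool work_pending_sketch;

/* NMI side: printk() is unsafe here, so stash the message and queue
 * an irq work (queueing itself is NMI-safe in the real code). */
static void nmi_report_sketch(const char *msg)
{
    strncpy(pending_msg, msg, sizeof(pending_msg) - 1);
    work_pending_sketch = true;   /* stands in for irq_work_queue() */
}

/* IRQ side: the irq-work callback runs in IRQ context, where
 * printing is allowed, and outside the NMI-duration accounting that
 * would otherwise trigger a second warning. */
static const char *irq_work_cb_sketch(void)
{
    if (!work_pending_sketch)
        return 0;
    work_pending_sketch = false;
    return pending_msg;           /* would be printk()'d here */
}
```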

    Signed-off-by: Peter Zijlstra
    Cc: Don Zickus
    Cc: Dave Hansen
    Link: http://lkml.kernel.org/n/tip-75dv35xf6dhhmeb7nq6fua31@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

23 Mar, 2013

1 commit

  • A randconfig caught repeated compiler warnings when CONFIG_IRQ_WORK=n
    due to the definition of a non-inline static function in
    <linux/irq_work.h>:

    include/linux/irq_work.h +40 : warning: 'irq_work_needs_cpu' defined but not used

    Make it inline to suppress the warning. This was caused by commit
    00b42959106a ("irq_work: Don't stop the tick with pending works"),
    merged in v3.9-rc1.
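The fix is the standard header-stub idiom: a plain `static` function defined in a header is instantiated separately in every including file and warned about wherever it goes unused, while `static inline` copies are silently discarded. A sketch of the resulting stub (illustrative, mirroring the CONFIG_IRQ_WORK=n case):

```c
/* Header-stub idiom: "static inline" in a header avoids the
 * "defined but not used" warning that a bare "static" definition
 * produces in every translation unit that includes the header. */
static inline int irq_work_needs_cpu_sketch(void)
{
    return 0;   /* CONFIG_IRQ_WORK=n stub: no pending works */
}
```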

    Signed-off-by: James Hogan
    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Paul Gortmaker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    James Hogan
     

05 Feb, 2013

1 commit

  • Conflicts:
    kernel/irq_work.c

    Add support for printk on full dynticks CPUs.

    * Don't stop tick with irq works pending. This
    fix is generally useful and concerns archs that
    can't raise self IPIs.

    * Flush irq works before CPU offlining.

    * Introduce "lazy" irq works that can wait for the
    next tick to be executed, unless it's stopped.

    * Implement klogd wakeup using irq work. This
    removes the ad-hoc printk_tick()/printk_needs_cpu()
    hooks and makes it work even in dynticks mode.

    Signed-off-by: Frederic Weisbecker

    Frederic Weisbecker
     

04 Feb, 2013

1 commit

  • As no one is using the return value of irq_work_queue(),
    it is better to just make it void.

    Signed-off-by: anish kumar
    Acked-by: Steven Rostedt
    [ Fix stale comments, remove now unnecessary __irq_work_queue() intermediate function ]
    Signed-off-by: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/1359925703-24304-1-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    anish kumar
     

18 Nov, 2012

2 commits

  • On irq work initialization, let the user choose to define it
    as "lazy" or not. "Lazy" means that we don't want to send
    an IPI (provided the arch can send one at all) when we enqueue
    this work, but would rather wait for the next timer tick
    to execute our work if possible.

    This is going to be a benefit for non-urgent enqueuers
    (like printk in the future) that may prefer not to raise
    an IPI storm in case of frequent enqueuing over short periods
    of time.
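The init-time choice can be sketched as two initializers (all names illustrative; in the kernel the distinction is a flag checked at enqueue time): only non-lazy works ask for an immediate IPI, lazy ones ride the next tick.

```c
#include <stdbool.h>

struct lazy_work { void (*func)(void); bool lazy; };

static void noop_cb(void) { }   /* placeholder callback */

/* normal work: wants an immediate self-IPI on enqueue */
static void init_irq_work_sketch(struct lazy_work *w, void (*f)(void))
{
    w->func = f;
    w->lazy = false;
}

/* lazy work: content to wait for the next timer tick, avoiding an
 * IPI storm from frequent enqueues (e.g. a future printk user) */
static void init_lazy_irq_work_sketch(struct lazy_work *w, void (*f)(void))
{
    w->func = f;
    w->lazy = true;
}

/* enqueue-time decision: IPI only for non-lazy work */
static bool needs_ipi_sketch(const struct lazy_work *w)
{
    return !w->lazy;
}
```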

    Signed-off-by: Frederic Weisbecker
    Acked-by: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Andrew Morton
    Cc: Paul Gortmaker

    Frederic Weisbecker
     
  • Don't stop the tick if we have pending irq works on the
    queue, otherwise if the arch can't raise self-IPIs, we may not
    find an opportunity to execute the pending works for a while.
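The check is small enough to sketch directly (names illustrative; the kernel hook this commit introduces is irq_work_needs_cpu(), consulted on the tick-stop path): the tick may only stop when no irq works are pending.

```c
#include <stdbool.h>

static int pending_works;   /* stands in for the per-CPU irq-work list */

static void queue_pending_sketch(void) { pending_works++; }

/* Sketch of the tick-stop guard: keep the tick alive while irq works
 * are pending, so they still get executed on archs that can't raise
 * self-IPIs. */
static bool can_stop_tick_sketch(void)
{
    return pending_works == 0;
}
```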

    Signed-off-by: Frederic Weisbecker
    Acked-by: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Andrew Morton
    Cc: Paul Gortmaker

    Frederic Weisbecker
     

04 Oct, 2011

1 commit

  • Use llist in irq_work instead of irq_work's own lock-less linked
    list implementation, to avoid code duplication.
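What llist provides is exactly the shape irq_work needs: an atomic push and a whole-list take. A user-space sketch using C11 atomics (function names mirror the kernel's llist_add()/llist_del_all(), but this is a stand-in, not the kernel implementation):

```c
#include <stdatomic.h>

struct llnode { struct llnode *next; };

static _Atomic(struct llnode *) list_head;   /* zero-initialized: empty */

/* atomic LIFO push, like llist_add(): safe against concurrent pushers */
static void llist_add_sketch(struct llnode *n)
{
    struct llnode *old = atomic_load(&list_head);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak(&list_head, &old, n));
}

/* detach the whole list at once, like llist_del_all() */
static struct llnode *llist_del_all_sketch(void)
{
    return atomic_exchange(&list_head, (struct llnode *)0);
}
```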

    Signed-off-by: Huang Ying
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1315461646-1379-6-git-send-email-ying.huang@intel.com
    Signed-off-by: Ingo Molnar

    Huang Ying
     

19 Oct, 2010

1 commit

  • Provide a mechanism that allows running code in IRQ context. It is
    most useful for NMI code that needs to interact with the rest of the
    system -- like waking up a task to drain buffers.

    Perf currently has such a mechanism, so extract that and provide it as
    a generic feature, independent of perf so that others may also
    benefit.

    The IRQ context callback is generated through self-IPIs where
    possible, or on architectures like powerpc the decrementer (the
    built-in timer facility) is set to generate an interrupt immediately.

    Architectures that don't have anything like this make do with a
    callback from the timer tick. These architectures can call
    irq_work_run() at the tail of any IRQ handlers that might enqueue such
    work (like the perf IRQ handler) to avoid undue latencies in
    processing the work.
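The API shape this introduces can be sketched in miniature (all names illustrative, and the real queue uses a lock-less list rather than this plain pointer update): an NMI handler enqueues a small work item, and irq_work_run(), called from IRQ context, drains the list and invokes each callback.

```c
struct irq_work_sk {
    void (*func)(struct irq_work_sk *);
    struct irq_work_sk *next;
};

static struct irq_work_sk *run_list;

/* NMI side: enqueue only (in the real code this push is NMI-safe) */
static void irq_work_queue_sk(struct irq_work_sk *w)
{
    w->next = run_list;
    run_list = w;
}

/* IRQ side: called from a self-IPI, the decrementer, or the tail of
 * an IRQ handler; drains the list and runs each callback */
static void irq_work_run_sk(void)
{
    struct irq_work_sk *w = run_list;
    run_list = 0;
    for (; w; w = w->next)
        w->func(w);
}

static int woken;
static void wake_task_sk(struct irq_work_sk *w) { (void)w; woken++; }
```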

    Signed-off-by: Peter Zijlstra
    Acked-by: Kyle McMartin
    Acked-by: Martin Schwidefsky
    [ various fixes ]
    Signed-off-by: Huang Ying
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra