20 Oct, 2008

1 commit


18 Oct, 2008

1 commit


17 Oct, 2008

2 commits


03 Oct, 2008

1 commit


02 Oct, 2008

1 commit


07 Sep, 2008

1 commit


27 Jul, 2008

1 commit

  • A previous patch added early_initcall() to allow cleaner hooking of
    pre-SMP initcalls. Now we remove the older interface, converting all
    existing users to the new one.

    [akpm@linux-foundation.org: cleanups]
    [akpm@linux-foundation.org: build fix]
    [kosaki.motohiro@jp.fujitsu.com: warning fix]
    Signed-off-by: Eduard - Gabriel Munteanu
    Cc: Tom Zanussi
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eduard - Gabriel Munteanu
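    A minimal sketch of what a converted user looks like under the new
    interface (hypothetical function name, not taken from the patch):

    #include <linux/init.h>

    /* runs before SMP is brought up, replacing the old pre-SMP hook */
    static int __init example_early_setup(void)
    {
            /* per-boot-CPU setup that must happen before secondary CPUs start */
            return 0;
    }
    early_initcall(example_early_setup);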
     

19 Jul, 2008

2 commits

  • Ingo Molnar
     
  • Jack Ren and Eric Miao tracked down the following long standing
    problem in the NOHZ code:

    scheduler switch to idle task
    enable interrupts

    Window starts here

    ----> interrupt happens (does not set NEED_RESCHED)
    irq_exit() stops the tick

    ----> interrupt happens (does set NEED_RESCHED)

    return from schedule()

    cpu_idle(): preempt_disable();

    Window ends here

    The interrupts can happen at any point inside the race window. The
    first interrupt stops the tick, the second one causes the scheduler to
    rerun and switch away from idle again and we end up with the tick
    disabled.

    The fact that it needs two interrupts, where the first one does not set
    NEED_RESCHED and the second one does, made the bug obscure and extremely
    hard to reproduce and analyse. Kudos to Jack and Eric.

    Solution: Limit the NOHZ functionality to the idle loop to make sure
    that we cannot run into such a situation ever again.

    cpu_idle()
    {
            preempt_disable();

            while (1) {
                    tick_nohz_stop_sched_tick(1); /* <- tell the NOHZ code that
                                                        we are in the idle loop */
                    /* ... idle, restart the tick, then schedule() ... */
            }
    }
    Debugged-by: eric miao
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
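    A simplified sketch of the gating idea (not the exact upstream code):
    only the idle loop is allowed to arm NOHZ, while irq_exit() may only
    re-evaluate it when the CPU is already known to be idle.

    void tick_nohz_stop_sched_tick(int inidle)
    {
            struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);

            if (inidle)
                    ts->inidle = 1;  /* called from cpu_idle() */
            else if (!ts->inidle)
                    return;          /* irq_exit() outside the idle loop: do nothing */

            /* ... existing "stop the tick" logic ... */
    }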
     

16 Jul, 2008

1 commit

  • Conflicts:

    arch/powerpc/Kconfig
    arch/s390/kernel/time.c
    arch/x86/kernel/apic_32.c
    arch/x86/kernel/cpu/perfctr-watchdog.c
    arch/x86/kernel/i8259_64.c
    arch/x86/kernel/ldt.c
    arch/x86/kernel/nmi_64.c
    arch/x86/kernel/smpboot.c
    arch/x86/xen/smp.c
    include/asm-x86/hw_irq_32.h
    include/asm-x86/hw_irq_64.h
    include/asm-x86/mach-default/irq_vectors.h
    include/asm-x86/mach-voyager/irq_vectors.h
    include/asm-x86/smp.h
    kernel/Makefile

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

15 Jul, 2008

1 commit


26 Jun, 2008

2 commits


24 Jun, 2008

1 commit

  • Hidehiro Kawai noticed that sched_setscheduler() can fail in
    stop_machine: it calls sched_setscheduler() from insmod, which can
    have CAP_SYS_MODULE without CAP_SYS_NICE.

    Two cases could have failed, so they are changed to sched_setscheduler_nocheck:
    kernel/softirq.c:cpu_callback()
    - CPU hotplug callback
    kernel/stop_machine.c:__stop_machine_run()
    - Called from various places, including modprobe()

    Signed-off-by: Rusty Russell
    Cc: Jeremy Fitzhardinge
    Cc: Hidehiro Kawai
    Cc: Andrew Morton
    Cc: linux-mm@kvack.org
    Cc: sugita
    Cc: Satoshi OSHIMA
    Signed-off-by: Ingo Molnar

    Rusty Russell
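    A minimal sketch of how such an in-kernel caller uses the _nocheck
    variant (hypothetical helper, for illustration only):

    #include <linux/sched.h>

    static void boost_kthread_prio(struct task_struct *p)
    {
            struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };

            /* sched_setscheduler() would test the *calling* process (e.g. insmod)
             * for CAP_SYS_NICE; the nocheck variant skips that permission check. */
            sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
    }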
     

20 Jun, 2008

1 commit

  • There's no need to use local_irq_save() over local_irq_disable() in the
    local_bh_enable code since it is a bug to call it with irqs disabled and
    do_softirq will enable irqs if there is any pending work.

    Consolidate the code from local_bh_enable and local_bh_enable_ip so that
    the two no longer diverge in the warnings they trigger, as they currently
    do.

    Also always trigger the warning on in_irq(), not just in the
    trace-irqflags case.

    Signed-off-by: Johannes Berg
    Cc: Michael Buesch
    Cc: David Ellingsworth
    Cc: Linus Torvalds
    Signed-off-by: Ingo Molnar

    Johannes Berg
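    A simplified sketch of the consolidated shape (not the exact code): both
    entry points funnel into one helper, and the in_irq() warning fires
    unconditionally rather than only in the trace-irqflags build.

    static inline void _local_bh_enable_ip(unsigned long ip)
    {
            WARN_ON_ONCE(in_irq() || irqs_disabled());
            /* ... re-enable bottom halves, run do_softirq() if work is pending ... */
    }

    void local_bh_enable(void)
    {
            _local_bh_enable_ip((unsigned long)__builtin_return_address(0));
    }

    void local_bh_enable_ip(unsigned long ip)
    {
            _local_bh_enable_ip(ip);
    }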
     

18 Jun, 2008

1 commit


25 May, 2008

1 commit

  • As git-grep shows, open_softirq() is always called with the last argument
    being NULL

    block/blk-core.c: open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL);
    kernel/hrtimer.c: open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL);
    kernel/rcuclassic.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
    kernel/rcupreempt.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
    kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL);
    kernel/softirq.c: open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
    kernel/softirq.c: open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
    kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL);
    net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
    net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);

    This observation has already been made by Matthew Wilcox in June 2002
    (http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html)

    "I notice that none of the current softirq routines use the data element
    passed to them."

    and the situation hasn't changed since then. So it appears we can safely
    remove that extra argument, saving 128 bytes of kernel data and 54 bytes of text.

    Signed-off-by: Carlos R. Mafra
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Carlos R. Mafra
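    After the change, the registration path is sketched below (simplified):
    the opaque data pointer is neither stored nor passed to the handler.

    void open_softirq(int nr, void (*action)(struct softirq_action *))
    {
            softirq_vec[nr].action = action;
    }

    /* callers simply drop the trailing NULL, e.g.: */
    open_softirq(TASKLET_SOFTIRQ, tasklet_action);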
     

01 May, 2008

1 commit

  • Currently CPU hotplug (unplug) seems broken on s390 and likely others. On CPU
    unplug the system starts to behave very strangely and hangs.

    I bisected the problem to the following commit:

    commit 48f20a9a9488c432fc86df1ff4b7f4fa895d1183
    Author: Olof Johansson
    Date: Tue Mar 4 15:23:25 2008 -0800
    tasklets: execute tasklets in the same order they were queued

    Reverting this patch seems to fix the problem. I looked into takeover_tasklet
    and it seems that there is a way to corrupt the tail pointer of the current
    cpu. If the tasklet list of the frozen cpu is empty, the tail pointer of the
    current cpu points to the address of the head pointer of the stopped cpu and
    not to the next pointer of a tasklet_struct.

    This patch avoids the list splice if the list is empty, and cpu hotplug seems
    to work as the tail pointer is not corrupted. Olof, can you look into that
    patch and ACK/NACK it so Andrew can push this to Linus, if appropriate?
    Please note that some lines are longer than 80 chars, but line-wrapping looked
    worse than this version.

    Signed-off-by: Christian Borntraeger
    Acked-by: Olof Johansson
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Borntraeger
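    A sketch of the fix (simplified): only splice the dead CPU's tasklet list
    when it is non-empty. An empty list has tail == &head, so splicing it
    would leave the current CPU's tail pointing at the dead CPU's head pointer.

    if (&per_cpu(tasklet_vec, cpu).head != per_cpu(tasklet_vec, cpu).tail) {
            *__get_cpu_var(tasklet_vec).tail = per_cpu(tasklet_vec, cpu).head;
            __get_cpu_var(tasklet_vec).tail = per_cpu(tasklet_vec, cpu).tail;
            per_cpu(tasklet_vec, cpu).head = NULL;
            per_cpu(tasklet_vec, cpu).tail = &per_cpu(tasklet_vec, cpu).head;
    }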
     

20 Apr, 2008

1 commit

  • I noticed this when looking at an openswan issue. Openswan (ab?)uses the
    tasklet API to defer processing of packets in some situations, with one
    packet per tasklet_action(). I started noticing sequences of
    backwards-ordered sequence numbers coming over the wire, since new tasklets
    are always queued at the head of the list but processed sequentially.

    Convert it to instead append new entries to the tail of the list. As an
    extra bonus, the splicing code in takeover_tasklets() no longer has to
    iterate over the list.

    Signed-off-by: Olof Johansson
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Olof Johansson
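    A simplified sketch of the tail-append enqueue: the per-CPU list keeps a
    head pointer plus a pointer to the last "next" field, so both enqueueing
    and the splice in takeover_tasklets() stay O(1).

    void __tasklet_schedule(struct tasklet_struct *t)
    {
            unsigned long flags;

            local_irq_save(flags);
            t->next = NULL;
            *__get_cpu_var(tasklet_vec).tail = t;         /* append at the tail */
            __get_cpu_var(tasklet_vec).tail = &(t->next); /* remember the new tail */
            raise_softirq_irqoff(TASKLET_SOFTIRQ);
            local_irq_restore(flags);
    }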
     

01 Mar, 2008

1 commit

  • The PREEMPT-RCU can get stuck if a CPU goes idle and NO_HZ is set. The
    idle CPU will not progress the RCU through its grace period and a
    synchronize_rcu may get stuck. Without this patch I have a box that will
    not boot when PREEMPT_RCU and NO_HZ are set. That same box boots fine
    with this patch.

    This patch comes from the -rt kernel where it has been tested for
    several months.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Paul E. McKenney
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

09 Feb, 2008

1 commit


30 Jan, 2008

2 commits

  • Current idle time in kstat is based on jiffies and is coarse grained.
    tick_sched.idle_sleeptime is making some attempt to keep track of idle time
    in a fine grained manner. But, it is not handling the time spent in
    interrupts fully.

    Make tick_sched.idle_sleeptime accurate with respect to time spent on
    handling interrupts and also add tick_sched.idle_lastupdate, which keeps
    track of last time when idle_sleeptime was updated.

    These statistics will be crucial for the cpufreq-ondemand governor, which can
    shed some of the conservative guard band that it uses today while setting the
    frequency. The ondemand changes that use the exact idle time are coming
    soon.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Venki Pallipadi
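    One plausible shape of the accounting, sketched and simplified from the
    tick-sched code: the idle period is accumulated with ktime deltas and is
    closed out when an interrupt arrives, so interrupt time is not counted
    as idle, and idle_lastupdate records when the sum was last refreshed.

    static void tick_nohz_stop_idle(int cpu)
    {
            struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);

            if (ts->idle_active) {
                    ktime_t now = ktime_get();

                    ts->idle_lastupdate = now;
                    ts->idle_sleeptime = ktime_add(ts->idle_sleeptime,
                                                   ktime_sub(now, ts->idle_entrytime));
                    ts->idle_active = 0;
            }
    }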
     
  • I was confused by the FSEC = 10^15 NSEC statement; this also includes small
    whitespace fixes. When there's a copyright notice, there should be a GPL notice.

    Signed-off-by: Pavel Machek
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Pavel Machek
     

11 Oct, 2007

2 commits


18 Jul, 2007

1 commit

  • Currently, the freezer treats all tasks as freezable, except for the kernel
    threads that explicitly set the PF_NOFREEZE flag for themselves. This
    approach is problematic, since it requires every kernel thread to either
    set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
    care for the freezing of tasks at all.

    It seems better to only require the kernel threads that want to or need to
    be frozen to use some freezer-related code and to remove any
    freezer-related code from the other (nonfreezable) kernel threads, which is
    done in this patch.

    The patch causes all kernel threads to be nonfreezable by default (ie. to
    have PF_NOFREEZE set by default) and introduces the set_freezable()
    function that should be called by the freezable kernel threads in order to
    unset PF_NOFREEZE. It also makes all of the currently freezable kernel
    threads call set_freezable(), so it shouldn't cause any (intentional)
    change of behaviour to appear. Additionally, it updates documentation to
    describe the freezing of tasks more accurately.

    [akpm@linux-foundation.org: build fixes]
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Nigel Cunningham
    Cc: Pavel Machek
    Cc: Oleg Nesterov
    Cc: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
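    A minimal sketch of a kernel thread that opts in under the new scheme
    (hypothetical thread function, not part of the patch):

    #include <linux/freezer.h>
    #include <linux/kthread.h>

    static int example_thread(void *unused)
    {
            set_freezable();                /* clear PF_NOFREEZE: we want to be frozen */

            while (!kthread_should_stop()) {
                    try_to_freeze();        /* park here while tasks are frozen */
                    /* ... do the actual work ... */
            }
            return 0;
    }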
     

17 Jul, 2007

1 commit

  • Fix ksoftirqd termination on cpu hotplug with naughty real time process.

    Assuming the following case:

    - Try to hot remove CPU2 from CPU1.
    - There is a real time process on CPU2, and that process doesn't sleep at all.
    - That rt process and ksoftirqd/2 are migrated to CPU0

    Then ksoftirqd/2 can't stop because that rt process runs forever on
    CPU0, and CPU1, waiting for ksoftirqd/2's termination, hangs. To fix this
    problem, set the priority of ksoftirqd/2 to the maximum before kthread_stop().

    [akpm@linux-foundation.org: fix warning]
    Signed-off-by: Satoru Takeuchi
    Cc: Rusty Russell
    Cc: Ingo Molnar
    Cc: Oleg Nesterov
    Cc: Ashok Raj
    Cc: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Satoru Takeuchi
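    A simplified sketch of the CPU_DEAD handling after the fix: the dying
    CPU's ksoftirqd is given maximum real-time priority so a CPU-bound rt
    task cannot starve it before kthread_stop() completes.

    struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };
    struct task_struct *p = per_cpu(ksoftirqd, hotcpu);

    per_cpu(ksoftirqd, hotcpu) = NULL;
    sched_setscheduler(p, SCHED_FIFO, &param);  /* boost before stopping it */
    kthread_stop(p);
    takeover_tasklets(hotcpu);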
     

10 Jul, 2007

1 commit

  • Do not set softirq threads to nice +19. _If_ for whatever reason
    we failed to process some high-prio softirq and woke up
    ksoftirqd, we should give it a fair chance to actually
    get some work done, even if the system is under load.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

10 May, 2007

1 commit

  • Since nonboot CPUs are now disabled after tasks and devices have been
    frozen and the CPU hotplug infrastructure is used for this purpose, we need
    special CPU hotplug notifications that will help the CPU-hotplug-aware
    subsystems distinguish normal CPU hotplug events from CPU hotplug events
    related to a system-wide suspend or resume operation in progress. This
    patch introduces such notifications and causes them to be used during
    suspend and resume transitions. It also changes all of the
    CPU-hotplug-aware subsystems to take these notifications into consideration
    (for now they are handled in the same way as the corresponding "normal"
    ones).

    [oleg@tv-sign.ru: cleanups]
    Signed-off-by: Rafael J. Wysocki
    Cc: Gautham R Shenoy
    Cc: Pavel Machek
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
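    A sketch of how a CPU-hotplug-aware subsystem typically handles the new
    suspend/resume variants alongside the normal events (illustrative
    callback, not from the patch):

    static int example_pm_cpu_callback(struct notifier_block *nb,
                                       unsigned long action, void *hcpu)
    {
            switch (action) {
            case CPU_ONLINE:
            case CPU_ONLINE_FROZEN:         /* same handling during resume */
                    /* ... bring per-CPU state up ... */
                    break;
            case CPU_DEAD:
            case CPU_DEAD_FROZEN:           /* same handling during suspend */
                    /* ... tear per-CPU state down ... */
                    break;
            }
            return NOTIFY_OK;
    }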
     

17 Feb, 2007

2 commits

  • With Ingo Molnar

    Add functions to provide dynamic ticks and high resolution timers. The code
    which keeps track of jiffies and handles the long idle periods is shared
    between tick based and high resolution timer based dynticks. The dyntick
    functionality can be disabled on the kernel commandline. Provide also the
    infrastructure to support high resolution timers.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Uninline irq_enter(). [dynticks adds more stuff to it]

    No functional changes.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

08 Dec, 2006

1 commit

  • It is possible to have tasklets get scheduled before ksoftirqd has had a chance
    to spawn on all CPUs. This is totally harmless; after a successful
    CPU_UP_PREPARE action, the CPU_ONLINE action will be called, which immediately
    wakes ksoftirqd on the appropriate CPU to process the already pending tasklets. So
    there is no danger of having a missed wakeup for any tasklets that were
    already pending.

    In particular, i386 is affected by this during startup, and is visible when
    using a very large initrd; during the time it takes for the initrd to be
    decompressed, a timer IRQ can come in and schedule RCU callbacks. It is also
    possible that resending of a hardware IRQ via a softirq triggers the same bug.

    Because of different timing conditions, this shows up in all emulators and
    virtual machines tested, including Xen, VMware, Virtual PC, and Qemu. It is
    also possible to trigger on native hardware with a large enough initrd,
    although I don't have a reliable case demonstrating that.

    Signed-off-by: Zachary Amsden
    Cc:
    Cc: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zachary Amsden
     

30 Sep, 2006

1 commit

  • Spawning the ksoftirqd, migration, or watchdog threads, and calling
    init_timers_cpu(), may fail when memory is low. If this happens in initcalls,
    a kernel NULL pointer dereference happens later. This patch makes the crash
    happen immediately in such cases. That seems a bit better than getting a kernel
    NULL pointer dereference later.

    Cc: Ingo Molnar
    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
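    A sketch of the idea (simplified, shown for the ksoftirqd case): fail
    loudly at init time instead of dereferencing a NULL per-CPU pointer much
    later.

    __init int spawn_ksoftirqd(void)
    {
            void *cpu = (void *)(long)smp_processor_id();
            int err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu);

            BUG_ON(err == NOTIFY_BAD);      /* crash immediately on failure */
            cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
            register_cpu_notifier(&cpu_nfb);
            return 0;
    }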
     

01 Aug, 2006

2 commits

  • The recent changes from the irqtrace feature have added overhead to
    local_bh_disable and local_bh_enable that reduces UDP performance across
    x86_64 and IA64, even though IA64 does not support the irqtrace feature.
    The patch in question is

    [PATCH]lockdep: irqtrace subsystem, core
    http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=de30a2b355ea85350ca2f58f3b9bf4e5bc007986

    Prior to this patch, local_bh_disable was a short macro. Now it is a
    function which calls __local_bh_disable with added irq flags save and
    restore. The irq flags save and restore were also added to
    local_bh_enable, probably for injecting the trace irqs code.

    This overhead is on the generic code path across all architectures. On an
    IA-64 test machine (Itanium-2 1.6 GHz) running a benchmark like netperf's
    UDP streaming test, the added overhead results in a drop of 3% in
    throughput, as udp_sendmsg calls local_bh_enable/disable several times.

    Other workloads that have heavy usages of local_bh_enable/disable could
    also be affected. The patch ideally should not have affected IA-64
    performance as it does not have IRQ tracing support. A significant portion
    of the overhead is in the added irq flags save and restore, which I think
    is not needed if IRQ tracing is unused. A suggested patch is attached
    below that recovers the lost performance. However, the "ifdef"s in the
    patch are a bit ugly.

    Signed-off-by: Tim Chen
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tim Chen
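    A simplified sketch of the suggested shape: only the irq-tracing build
    pays for saving and restoring the interrupt flags around the preempt
    count update.

    #ifdef CONFIG_TRACE_IRQFLAGS
    static void __local_bh_disable(unsigned long ip)
    {
            unsigned long flags;

            raw_local_irq_save(flags);
            add_preempt_count(SOFTIRQ_OFFSET);
            /* ... record softirqs-off for the tracer ... */
            raw_local_irq_restore(flags);
    }
    #else
    static inline void __local_bh_disable(unsigned long ip)
    {
            add_preempt_count(SOFTIRQ_OFFSET);
            barrier();
    }
    #endif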
     
  • A few of the callback functions and notifier blocks that are associated with cpu
    notifications incorrectly have __devinit and __devinitdata. They should be
    __cpuinit and __cpuinitdata instead.

    It makes no functional difference but wastes text area when CONFIG_HOTPLUG is
    enabled and CONFIG_HOTPLUG_CPU is not.

    This patch fixes all those instances.

    Signed-off-by: Chandra Seetharaman
    Cc: Ashok Raj
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
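    A small sketch of the corrected annotations (illustrative names):

    static int __cpuinit example_cpu_callback(struct notifier_block *nfb,
                                              unsigned long action, void *hcpu)
    {
            /* ... react to the CPU hotplug event ... */
            return NOTIFY_OK;
    }

    static struct notifier_block __cpuinitdata example_cpu_nfb = {
            .notifier_call = example_cpu_callback,
    };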
     

15 Jul, 2006

1 commit

  • Christoph Hellwig:
    open_softirq just enables a softirq. The softirq array is statically
    allocated, so to add a new one you would have to patch the kernel. So
    there's no point in keeping this export at all, as any user would have to
    patch the enum in include/linux/interrupt.h anyway.

    Signed-off-by: Adrian Bunk
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
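    The point being made, sketched: the softirq slots are a fixed enum in
    include/linux/interrupt.h, so a module cannot add one without patching
    the kernel, and exporting open_softirq() therefore buys it nothing
    (abbreviated enum; the exact entries vary by kernel version).

    enum
    {
            HI_SOFTIRQ = 0,
            TIMER_SOFTIRQ,
            NET_TX_SOFTIRQ,
            NET_RX_SOFTIRQ,
            BLOCK_SOFTIRQ,
            TASKLET_SOFTIRQ,
            /* ... */
    };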
     

11 Jul, 2006

1 commit


04 Jul, 2006

2 commits

  • At the moment, powerpc and s390 have their own versions of do_softirq which
    include local_bh_disable() and __local_bh_enable() calls. They end up
    calling __do_softirq (in kernel/softirq.c) which also does
    local_bh_disable/enable.

    Apparently the two levels of disable/enable trigger a warning from some
    validation code that Ingo is working on, and he would like to see the outer
    level removed. But to do that, we have to move the account_system_vtime
    calls that are currently in the arch do_softirq() implementations for
    powerpc and s390 into the generic __do_softirq() (this is a no-op for other
    archs because account_system_vtime is defined to be an empty inline
    function on all other archs). This patch does that.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Mackerras
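    A simplified sketch of where the accounting ends up: the vtime hooks
    bracket the generic softirq processing in __do_softirq(), so the
    powerpc/s390 wrappers no longer need their own outer bh disable/enable.

    asmlinkage void __do_softirq(void)
    {
            account_system_vtime(current);
            __local_bh_disable((unsigned long)__builtin_return_address(0));

            /* ... run the pending softirq handlers ... */

            account_system_vtime(current);
            _local_bh_enable();
    }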
     
  • Accurate hard-IRQ-flags and softirq-flags state tracing.

    This allows us to attach extra functionality to IRQ flags on/off
    events (such as trace-on/off).

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar