15 Jul, 2008

1 commit

  • * 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (241 commits)
    [ARM] 5171/1: ep93xx: fix compilation of modules using clocks
    [ARM] 5133/2: at91sam9g20 defconfig file
    [ARM] 5130/4: Support for the at91sam9g20
    [ARM] 5160/1: IOP3XX: gpio/gpiolib support
    [ARM] at91: Fix NAND FLASH timings for at91sam9x evaluation kits.
    [ARM] 5084/1: zylonite: Register AC97 device
    [ARM] 5085/2: PXA: Move AC97 over to the new central device declaration model
    [ARM] 5120/1: pxa: correct platform driver names for PXA25x and PXA27x UDC drivers
    [ARM] 5147/1: pxaficp_ir: drop pxa_gpio_mode calls, as pin setting
    [ARM] 5145/1: PXA2xx: provide api to control IrDA pins state
    [ARM] 5144/1: pxaficp_ir: cleanup includes
    [ARM] pxa: remove pxa_set_cken()
    [ARM] pxa: allow clk aliases
    [ARM] Feroceon: don't disable BPU on boot
    [ARM] Orion: LED support for HP mv2120
    [ARM] Orion: add RD88F5181L-FXO support
    [ARM] Orion: add RD88F5181L-GE support
    [ARM] Orion: add Netgear WNR854T support
    [ARM] s3c2410_defconfig: update for current build
    [ARM] Acer n30: Minor style and indentation fixes.
    ...

    Linus Torvalds
     

25 May, 2008

1 commit

  • As git-grep shows, open_softirq() is always called with the last argument
    being NULL

    block/blk-core.c: open_softirq(BLOCK_SOFTIRQ, blk_done_softirq, NULL);
    kernel/hrtimer.c: open_softirq(HRTIMER_SOFTIRQ, run_hrtimer_softirq, NULL);
    kernel/rcuclassic.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
    kernel/rcupreempt.c: open_softirq(RCU_SOFTIRQ, rcu_process_callbacks, NULL);
    kernel/sched.c: open_softirq(SCHED_SOFTIRQ, run_rebalance_domains, NULL);
    kernel/softirq.c: open_softirq(TASKLET_SOFTIRQ, tasklet_action, NULL);
    kernel/softirq.c: open_softirq(HI_SOFTIRQ, tasklet_hi_action, NULL);
    kernel/timer.c: open_softirq(TIMER_SOFTIRQ, run_timer_softirq, NULL);
    net/core/dev.c: open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);
    net/core/dev.c: open_softirq(NET_RX_SOFTIRQ, net_rx_action, NULL);

    This observation has already been made by Matthew Wilcox in June 2002
    (http://www.cs.helsinki.fi/linux/linux-kernel/2002-25/0687.html)

    "I notice that none of the current softirq routines use the data element
    passed to them."

    and the situation hasn't changed since them. So it appears we can safely
    remove that extra argument to save 128 (54) bytes of kernel data (text).

    Signed-off-by: Carlos R. Mafra
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Carlos R. Mafra
     

13 May, 2008

1 commit


30 Apr, 2008

1 commit

  • Add calls to the generic object debugging infrastructure and provide fixup
    functions which allow to keep the system alive when recoverable problems have
    been detected by the object debugging core code.

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc: Greg KH
    Cc: Randy Dunlap
    Cc: Kay Sievers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

17 Apr, 2008

1 commit

  • In order to avoid the false positive from lockdep, each per-cpu base->lock has
    the separate lock class and migrate_timers() uses double_spin_lock().

    This all is overcomplicated: except for migrate_timers() we never take 2 locks
    at once, and migrate_timers() can use spin_lock_nested().

    Signed-off-by: Oleg Nesterov
    Cc: Arjan van de Ven
    Cc: Heiko Carstens
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Oleg Nesterov
     

26 Mar, 2008

1 commit

  • add_timer_on() can add a timer on a CPU which is currently in a long
    idle sleep, but the timer wheel is not reevaluated by the nohz code on
    that CPU. So a timer can be delayed for quite a long time. This
    triggered a false positive in the clocksource watchdog code.

    To avoid this we need to wake up the idle CPU and enforce the
    reevaluation of the timer wheel for the next timer event.

    Add a function, which checks a given CPU for idle state, marks the
    idle task with NEED_RESCHED and sends a reschedule IPI to notify the
    other CPU of the change in the timer wheel.

    Call this function from add_timer_on().

    Signed-off-by: Thomas Gleixner
    Acked-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Cc: stable@kernel.org

    --
    include/linux/sched.h | 6 ++++++
    kernel/sched.c | 43 +++++++++++++++++++++++++++++++++++++++++++
    kernel/timer.c | 10 +++++++++-
    3 files changed, 58 insertions(+), 1 deletion(-)

    Thomas Gleixner
     

09 Feb, 2008

2 commits

  • [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Harvey Harrison
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison
     
  • Some time ago the xxx_vnr() calls (e.g. pid_vnr or find_task_by_vpid) were
    _all_ converted to operate on the current pid namespace. After this each call
    like xxx_nr_ns(foo, current->nsproxy->pid_ns) is nothing but a xxx_vnr(foo)
    one.

    Switch all the xxx_nr_ns() callers to use the xxx_vnr() calls where
    appropriate.

    Signed-off-by: Pavel Emelyanov
    Reviewed-by: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

07 Feb, 2008

1 commit

  • This moves the ability to scale cputime into generic code. This allows us
    to fix the issue in kernel/timer.c (noticed by Balbir) where we could only
    add an unscaled value to the scaled utime/stime.

    This adds a cputime_to_scaled function. As before, the POWERPC version
    does the scaling based on the last SPURR/PURR ratio calculated. The
    generic and s390 (only other arch to implement asm/cputime.h) versions are
    both NOPs.

    Also moves the SPURR and PURR snapshots closer.

    Signed-off-by: Michael Neuling
    Cc: Jay Lan
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Neuling
     

01 Feb, 2008

1 commit

  • * 'task_killable' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc: (22 commits)
    Remove commented-out code copied from NFS
    NFS: Switch from intr mount option to TASK_KILLABLE
    Add wait_for_completion_killable
    Add wait_event_killable
    Add schedule_timeout_killable
    Use mutex_lock_killable in vfs_readdir
    Add mutex_lock_killable
    Use lock_page_killable
    Add lock_page_killable
    Add fatal_signal_pending
    Add TASK_WAKEKILL
    exit: Use task_is_*
    signal: Use task_is_*
    sched: Use task_contributes_to_load, TASK_ALL and TASK_NORMAL
    ptrace: Use task_is_*
    power: Use task_is_*
    wait: Use TASK_NORMAL
    proc/base.c: Use task_is_*
    proc/array.c: Use TASK_REPORT
    perfmon: Use task_is_*
    ...

    Fixed up conflicts in NFS/sunrpc manually..

    Linus Torvalds
     

30 Jan, 2008

2 commits


26 Jan, 2008

1 commit


22 Jan, 2008

1 commit

  • The caller is __cpuinit.
    Also, this code block and its caller are inside #ifdef CONFIG_HOTPLUG_CPU
    blocks, so this code should reflect that config symbol's usage.

    WARNING: vmlinux.o(.text+0x4252f): Section mismatch: reference to .init.text: (between 'timer_cpu_notify' and 'msleep')

    Signed-off-by: Randy Dunlap
    Cc: Sam Ravnborg
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

14 Jan, 2008

1 commit

  • task_ppid_nr_ns is called in three places. One of these should never
    have called it. In the other two, using it broke the existing
    semantics. This was presumably accidental. If the function had not
    been there, it would have been much more obvious to the eye that those
    patches were changing the behavior. We don't need this function.

    In task_state, the pid of the ptracer is not the ppid of the ptracer.

    In do_task_stat, ppid is the tgid of the real_parent, not its pid.
    I also moved the call outside of lock_task_sighand, since it doesn't
    need it.

    In sys_getppid, ppid is the tgid of the real_parent, not its pid.

    Signed-off-by: Roland McGrath
    Signed-off-by: Linus Torvalds

    Roland McGrath
     

19 Dec, 2007

1 commit

  • This patch fixes the following section mismatches with CONFIG_HOTPLUG=n,
    CONFIG_HOTPLUG_CPU=y:

    ...
    WARNING: vmlinux.o(.text+0x41cd3): Section mismatch: reference to .init.data:tvec_base_done.22610 (between 'timer_cpu_notify' and 'run_timer_softirq')
    WARNING: vmlinux.o(.text+0x41d67): Section mismatch: reference to .init.data:tvec_base_done.22610 (between 'timer_cpu_notify' and 'run_timer_softirq')
    ...

    Signed-off-by: Adrian Bunk
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Adrian Bunk
     

07 Dec, 2007

1 commit


10 Nov, 2007

1 commit

  • Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
    deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
    broken on powerpc, because we end up counting user time twice: once in
    timer_interrupt() and once in update_process_times().

    This fixes the problem by pulling the code in update_process_times
    that updates utime and stime into a separate function called
    account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
    there is a version of account_process_tick in kernel/timer.c that
    simply accounts a whole tick to either utime or stime as before. If
    CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
    implement account_process_tick.

    This also lets us simplify the s390 code a bit; it means that the s390
    timer interrupt can now call update_process_times even when
    CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
    suitable account_process_tick().

    account_process_tick() now takes the task_struct * as an argument.
    Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Ingo Molnar

    Paul Mackerras
     

06 Nov, 2007

1 commit


20 Oct, 2007

1 commit

  • This is the largest patch in the set. Make all (I hope) the places where
    the pid is shown to or get from user operate on the virtual pids.

    The idea is:
    - all in-kernel data structures must store either struct pid itself
    or the pid's global nr, obtained with pid_nr() call;
    - when seeking the task from kernel code with the stored id one
    should use find_task_by_pid() call that works with global pids;
    - when showing pid's numerical value to the user the virtual one
    should be used, but however when one shows task's pid outside this
    task's namespace the global one is to be used;
    - when getting the pid from userspace one need to consider this as
    the virtual one and use appropriate task/pid-searching functions.

    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: nuther build fix]
    [akpm@linux-foundation.org: yet nuther build fix]
    [akpm@linux-foundation.org: remove unneeded casts]
    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Alexey Dobriyan
    Cc: Sukadev Bhattiprolu
    Cc: Oleg Nesterov
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

19 Oct, 2007

2 commits

  • This adds items to the taststats struct to account for user and system
    time based on scaling the CPU frequency and instruction issue rates.

    Adds account_(user|system)_time_scaled callbacks which architectures
    can use to account for time using this mechanism.

    Signed-off-by: Michael Neuling
    Cc: Balbir Singh
    Cc: Jay Lan
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Neuling
     
  • Signed-off-by: Daniel Walker
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Walker
     

21 Jul, 2007

2 commits


20 Jul, 2007

1 commit


18 Jul, 2007

1 commit

  • kmalloc_node() and kmem_cache_alloc_node() were not available in a zeroing
    variant in the past. But with __GFP_ZERO it is possible now to do zeroing
    while allocating.

    Use __GFP_ZERO to remove the explicit clearing of memory via memset whereever
    we can.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

17 Jul, 2007

2 commits

  • Add a flag in /proc/timer_stats to indicate deferrable timers. This will
    let developers/users to differentiate between types of tiemrs in
    /proc/timer_stats.

    Deferrable timer and normal timer will appear in /proc/timer_stats as below.
    10D, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)
    10, 1 swapper queue_delayed_work_on (delayed_work_timer_fn)

    Also version of timer_stats changes from v0.1 to v0.2

    Signed-off-by: Venkatesh Pallipadi
    Acked-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: john stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Venki Pallipadi
     
  • Commit 411187fb05cd11676b0979d9fbf3291db69dbce2 caused uptime not to increase
    during suspend. This may cause confusion so I restore the old behaviour by
    using the boot based time instead of monotonic for uptime.

    Signed-off-by: Tomas Janousek
    Acked-by: John Stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tomas Janousek
     

30 May, 2007

1 commit

  • get_next_timer_interrupt() returns a delta of (LONG_MAX > 1) in case
    there is no timer pending. On 64 bit machines this results in a
    multiplication overflow in tick_nohz_stop_sched_tick().

    Reported by: Dave Miller

    Make the return value a constant and limit the return value to a 32 bit
    value.

    When the max timeout value is returned, we can safely stop the tick
    timer device. The max jiffies delta results in a 12 days timeout for
    HZ=1000.

    In the long term the get_next_timer_interrupt() code needs to be
    reworked to return ktime instead of jiffies, but we have to wait until
    the last users of the original NO_IDLE_HZ code are converted.

    Signed-off-by: Thomas Gleixner
    Acked-off-by: David S. Miller
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

15 May, 2007

1 commit

  • The time keeping code move to kernel/time/timekeeping.c broke the
    clocksource resume logic patch, which got applied to the old file by a
    fuzzy application. Fix it up and move the clocksource_resume() call to
    the appropriate place.

    Signed-off-by: Thomas Gleixner
    [ tssk, tssk, everybody should use --fuzz=0 ]
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

11 May, 2007

1 commit

  • On 09-05-2007 21:10, Pallipadi, Venkatesh wrote:
    ...
    > On a 64 bit system, converting pointer to int causes unnecessary
    > compiler warning, and intermediate long conversion was to avoid that.
    > I will have to rephrase my comment to remove 32 bit value and use int,
    > as that is what the function returns.

    So, this patch reverts all changes done by my previous patch.

    I apologize for my wrong comment about "logical error" here.

    Cc: "Pallipadi, Venkatesh"
    Cc: Satyam Sharma
    Cc: Oleg Nesterov
    Signed-off-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    akpm@linux-foundation.org
     

10 May, 2007

3 commits

  • We need to make sure that the clocksources are resumed, when timekeeping is
    resumed. The current resume logic does not guarantee this.

    Add a resume function pointer to the clocksource struct, so clocksource
    drivers which need to reinitialize the clocksource can provide a resume
    function.

    Add a resume function, which calls the maybe available clocksource resume
    functions and resets the watchdog function, so a stable TSC can be used
    accross suspend/resume.

    Signed-off-by: Thomas Gleixner
    Cc: john stultz
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Since nonboot CPUs are now disabled after tasks and devices have been
    frozen and the CPU hotplug infrastructure is used for this purpose, we need
    special CPU hotplug notifications that will help the CPU-hotplug-aware
    subsystems distinguish normal CPU hotplug events from CPU hotplug events
    related to a system-wide suspend or resume operation in progress. This
    patch introduces such notifications and causes them to be used during
    suspend and resume transitions. It also changes all of the
    CPU-hotplug-aware subsystems to take these notifications into consideration
    (for now they are handled in the same way as the corresponding "normal"
    ones).

    [oleg@tv-sign.ru: cleanups]
    Signed-off-by: Rafael J. Wysocki
    Cc: Gautham R Shenoy
    Cc: Pavel Machek
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Signed-off-by: Jarek Poplawski
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jarek Poplawski
     

09 May, 2007

3 commits

  • There are many places in the kernel where the construction like

    foo = list_entry(head->next, struct foo_struct, list);

    are used.
    The code might look more descriptive and neat if using the macro

    list_first_entry(head, type, member) \
    list_entry((head)->next, type, member)

    Here is the macro itself and the examples of its usage in the generic code.
    If it will turn out to be useful, I can prepare the set of patches to
    inject in into arch-specific code, drivers, networking, etc.

    Signed-off-by: Pavel Emelianov
    Signed-off-by: Kirill Korotaev
    Cc: Randy Dunlap
    Cc: Andi Kleen
    Cc: Zach Brown
    Cc: Davide Libenzi
    Cc: John McCutchan
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: john stultz
    Cc: Ram Pai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelianov
     
  • Move the timekeeping code out of kernel/timer.c and into
    kernel/time/timekeeping.c. I made no cleanups or other changes in transit.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: John Stultz
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     
  • Introduce a new flag for timers - deferrable: Timers that work normally
    when system is busy. But, will not cause CPU to come out of idle (just to
    service this timer), when CPU is idle. Instead, this timer will be
    serviced when CPU eventually wakes up with a subsequent non-deferrable
    timer.

    The main advantage of this is to avoid unnecessary timer interrupts when
    CPU is idle. If the routine currently called by a timer can wait until
    next event without any issues, this new timer can be used to setup timer
    event for that routine. This, with dynticks, allows CPUs to be lazy,
    allowing them to stay in idle for extended period of time by reducing
    unnecesary wakeup and thereby reducing the power consumption.

    This patch:

    Builds this new timer on top of existing timer infrastructure. It uses
    last bit in 'base' pointer of timer_list structure to store this deferrable
    timer flag. __next_timer_interrupt() function skips over these deferrable
    timers when CPU looks for next timer event for which it has to wake up.

    This is exported by a new interface init_timer_deferrable() that can be
    called in place of regular init_timer().

    [akpm@linux-foundation.org: Privatise a #define]
    Signed-off-by: Venkatesh Pallipadi
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Oleg Nesterov
    Cc: Dave Jones
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Venki Pallipadi
     

27 Apr, 2007

1 commit


08 Apr, 2007

1 commit

  • Soeren Sonnenburg reported that upon resume he is getting
    this backtrace:

    [] smp_apic_timer_interrupt+0x57/0x90
    [] retrigger_next_event+0x0/0xb0
    [] apic_timer_interrupt+0x28/0x30
    [] retrigger_next_event+0x0/0xb0
    [] __kfifo_put+0x8/0x90
    [] on_each_cpu+0x35/0x60
    [] clock_was_set+0x18/0x20
    [] timekeeping_resume+0x7c/0xa0
    [] __sysdev_resume+0x11/0x80
    [] sysdev_resume+0x47/0x80
    [] device_power_up+0x5/0x10

    it turns out that on resume we mistakenly re-enable interrupts too
    early. Do the timer retrigger only on the current CPU.

    Signed-off-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Acked-by: Soeren Sonnenburg
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

26 Mar, 2007

1 commit

  • The rework of next_timer_interrupt() fixed the timer wheel bugs, but
    invented a rounding error versus the next hrtimer event. This is caused
    by the conversion of the hrtimer internal representation to relative
    jiffies.

    This causes bug #8100:
    http://bugzilla.kernel.org/show_bug.cgi?id=8100

    next_timer_interrupt() returns "now" in such a case and causes the code
    in tick_nohz_stop_sched_tick() to trigger the timer softirq, which is
    bogus as no timer is due for expiry. This results in an endless context
    switching between idle and ksoftirqd until a timer is due for expiry.

    Modify the hrtimer evaluation so that, it returns now + 1, when the
    conversion results in a delta < 1 jiffie.

    It's confirmed to resolve bug #8100

    Reported-by: Emil Karlson
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Linus Torvalds

    Thomas Gleixner