07 Jan, 2012

1 commit

  • * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
    sched/tracing: Add a new tracepoint for sleeptime
    sched: Disable scheduler warnings during oopses
    sched: Fix cgroup movement of waking process
    sched: Fix cgroup movement of newly created process
    sched: Fix cgroup movement of forking process
    sched: Remove cfs bandwidth period check in tg_set_cfs_period()
    sched: Fix load-balance lock-breaking
    sched: Replace all_pinned with a generic flags field
    sched: Only queue remote wakeups when crossing cache boundaries
    sched: Add missing rcu_dereference() around ->real_parent usage
    [S390] fix cputime overflow in uptime_proc_show
    [S390] cputime: add sparse checking and cleanup
    sched: Mark parent and real_parent as __rcu
    sched, nohz: Fix missing RCU read lock
    sched, nohz: Set the NOHZ_BALANCE_KICK flag for idle load balancer
    sched, nohz: Fix the idle cpu check in nohz_idle_balance
    sched: Use jump_labels for sched_feat
    sched/accounting: Fix parameter passing in task_group_account_field
    sched/accounting: Fix user/system tick double accounting
    sched/accounting: Re-use scheduler statistics for the root cgroup
    ...

    Fix up conflicts in
    - arch/ia64/include/asm/cputime.h, include/asm-generic/cputime.h
    usecs_to_cputime64() vs the sparse cleanups
    - kernel/sched/fair.c, kernel/time/tick-sched.c
    scheduler changes in multiple branches

    Linus Torvalds
     

12 Dec, 2011

4 commits

  • These two APIs were provided to combine the calls to
    tick_nohz_idle_enter() and rcu_idle_enter() into a single
    irq-disabled section. This way no interrupt happening in between
    would needlessly process any RCU work.

    However, this is an optimization whose benefits have yet to be
    measured. Let's start simple and completely decouple the idle RCU
    and dyntick-idle logics to simplify things.

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Reviewed-by: Josh Triplett
    Signed-off-by: Paul E. McKenney

    Frederic Weisbecker
     
  • It is assumed that RCU won't be used once we switch to tickless
    mode and until we restart the tick. However, this is not always
    true, as on x86-64, where we dereference the idle notifiers after
    the tick is stopped.

    To prepare for fixing this, add two new APIs:
    tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu().

    If no use of RCU is made in the idle loop between the
    tick_nohz_idle_enter() and tick_nohz_idle_exit() calls, the arch
    must instead call the new *_norcu() versions, so that it doesn't
    need to call rcu_idle_enter() and rcu_idle_exit() itself.

    Otherwise the arch must call tick_nohz_idle_enter() and
    tick_nohz_idle_exit() and also explicitly call:

    - rcu_idle_enter() after its last use of RCU before the CPU is put
    to sleep.
    - rcu_idle_exit() before the first use of RCU after the CPU is woken
    up.

    Signed-off-by: Frederic Weisbecker
    Cc: Mike Frysinger
    Cc: Guan Xuetao
    Cc: David Miller
    Cc: Chris Metcalf
    Cc: Hans-Christian Egtvedt
    Cc: Ralf Baechle
    Cc: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: H. Peter Anvin
    Cc: Russell King
    Cc: Paul Mackerras
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Signed-off-by: Paul E. McKenney

    Frederic Weisbecker
     
  • The tick_nohz_stop_sched_tick() function, which tries to delay
    the next timer tick as long as possible, can be called from two
    places:

    - From the idle loop, to start the dyntick idle mode
    - From interrupt exit, if we have interrupted the dyntick
    idle mode, so that we reprogram the next tick event in
    case the irq changed some internal state that requires this
    action.

    There are only a few minor differences between the two cases,
    handled by that function and driven by the per-cpu ts->inidle
    variable and the inidle parameter. Together these guarantee
    that we only update the dyntick mode on irq exit if we actually
    interrupted the dyntick idle mode, and that we enter the RCU extended
    quiescent state from the idle loop entry only.

    Split this function into:

    - tick_nohz_idle_enter(), which sets ts->inidle to 1, enters
    dynticks idle mode unconditionally if it can, and enters into RCU
    extended quiescent state.

    - tick_nohz_irq_exit() which only updates the dynticks idle mode
    when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called).

    To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed
    to tick_nohz_idle_exit().

    This simplifies the code and micro-optimizes the irq exit path (no need
    for local_irq_save() there). It also prepares for the split between the
    dynticks and RCU extended quiescent state logics, which we'll need in order
    to further fix illegal uses of RCU in extended quiescent states in the idle
    loop.

    Signed-off-by: Frederic Weisbecker
    Cc: Mike Frysinger
    Cc: Guan Xuetao
    Cc: David Miller
    Cc: Chris Metcalf
    Cc: Hans-Christian Egtvedt
    Cc: Ralf Baechle
    Cc: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: H. Peter Anvin
    Cc: Russell King
    Cc: Paul Mackerras
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Frederic Weisbecker
     
  • Earlier versions of RCU used the scheduling-clock tick to detect idleness
    by checking for the idle task, but handled idleness differently for
    CONFIG_NO_HZ=y. But there are now a number of uses of RCU read-side
    critical sections in the idle task, for example, for tracing. A more
    fine-grained detection of idleness is therefore required.

    This commit presses the old dyntick-idle code into full-time service,
    so that rcu_idle_enter(), previously known as rcu_enter_nohz(), is
    always invoked at the beginning of an idle loop iteration. Similarly,
    rcu_idle_exit(), previously known as rcu_exit_nohz(), is always invoked
    at the end of an idle-loop iteration. This allows the idle task to
    use RCU everywhere except between consecutive rcu_idle_enter() and
    rcu_idle_exit() calls, in turn allowing architecture maintainers to
    specify exactly where in the idle loop that RCU may be used.

    Because some of the userspace upcall uses can result in what looks
    to RCU like half of an interrupt, it is not possible to expect that
    the irq_enter() and irq_exit() hooks will give exact counts. This
    patch therefore expands the ->dynticks_nesting counter to 64 bits
    and uses two separate bitfields to count process/idle transitions
    and interrupt entry/exit transitions. It is presumed that userspace
    upcalls do not happen in the idle loop or from usermode execution
    (though usermode might do a system call that results in an upcall).
    The counter is hard-reset on each process/idle transition, which
    prevents the interrupt entry/exit error from accumulating. Overflow
    is avoided by the 64-bitness of the ->dynticks_nesting counter.

    This commit also adds warnings if a non-idle task asks RCU to enter
    idle state (and these checks will need some adjustment before applying
    Frederic's OS-jitter patches, http://lkml.org/lkml/2011/10/7/246).
    In addition, validation of ->dynticks and ->dynticks_nesting is added.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

06 Dec, 2011

1 commit

  • Introduce nr_busy_cpus in the struct sched_group_power [not in sched_group,
    because sched groups are duplicated for the SD_OVERLAP scheduler domain];
    for each cpu that enters or exits idle, this field is updated in every
    scheduler group of the scheduler domains that the cpu belongs to.

    To avoid frequent updates of this state as the cpu enters
    and exits idle, the update during idle exit is
    delayed to the first timer tick that happens after the cpu becomes busy.
    This is done using the NOHZ_IDLE flag in the struct rq's nohz_flags.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20111202010832.555984323@sbsiddha-desk.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

26 Oct, 2011

1 commit

  • * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    time, s390: Get rid of compile warning
    dw_apb_timer: constify clocksource name
    time: Cleanup old CONFIG_GENERIC_TIME references that snuck in
    time: Change jiffies_to_clock_t() argument type to unsigned long
    alarmtimers: Fix error handling
    clocksource: Make watchdog reset lockless
    posix-cpu-timers: Cure SMP accounting oddities
    s390: Use direct ktime path for s390 clockevent device
    clockevents: Add direct ktime programming function
    clockevents: Make minimum delay adjustments configurable
    nohz: Remove "Switched to NOHz mode" debugging messages
    proc: Consider NO_HZ when printing idle and iowait times
    nohz: Make idle/iowait counter update conditional
    nohz: Fix update_ts_time_stat idle accounting
    cputime: Clean up cputime_to_usecs and usecs_to_cputime macros
    alarmtimers: Rework RTC device selection using class interface
    alarmtimers: Add try_to_cancel functionality
    alarmtimers: Add more refined alarm state tracking
    alarmtimers: Remove period from alarm structure
    alarmtimers: Remove interval cap limit hack
    ...

    Linus Torvalds
     

29 Sep, 2011

1 commit

  • RCU no longer uses this global variable, nor does anyone else. This
    commit therefore removes this variable. This reduces memory footprint
    and also removes some atomic instructions and memory barriers from
    the dyntick-idle path.

    Signed-off-by: Alex Shi
    Signed-off-by: Paul E. McKenney

    Shi, Alex
     

08 Sep, 2011

3 commits

  • When performing cpu hotplug tests, the kernel printk log buffer gets flooded
    with pointless "Switched to NOHz mode..." messages, which may push more
    interesting messages out of the buffer. This is especially annoying when
    analyzing a dump afterwards.
    Assuming that switching to NOHz mode simply works, just remove the printk.

    Signed-off-by: Heiko Carstens
    Link: http://lkml.kernel.org/r/20110823112046.GB2540@osiris.boeblingen.de.ibm.com
    Signed-off-by: Thomas Gleixner

    Heiko Carstens
     
  • get_cpu_{idle,iowait}_time_us update the idle/iowait counters
    unconditionally if the given CPU is in the idle loop.

    This doesn't work well outside of CPU governors, which are singletons,
    so nobody (except for IRQs) can race with them.

    We will need to use both functions from the /proc/stat handler to properly
    handle nohz idle/iowait times.

    Make the update depend on a non-NULL last_update_time argument.

    Signed-off-by: Michal Hocko
    Cc: Dave Jones
    Cc: Arnd Bergmann
    Cc: Alexey Dobriyan
    Link: http://lkml.kernel.org/r/11f23179472635ce52e78921d47a20216b872f23.1314172057.git.mhocko@suse.cz
    Signed-off-by: Thomas Gleixner

    Michal Hocko
     
  • update_ts_time_stats currently updates idle time even if we are in
    the iowait loop at the moment. The only real users of the idle counter
    (via get_cpu_idle_time_us) are CPU governors, and they expect to get
    cumulative time for both idle and iowait.
    The value (idle_sleeptime) is also printed to userspace by print_cpu,
    but it prints both idle and iowait times, so the idle part is misleading.

    Let's clean this up: fix update_ts_time_stats to account for both counters
    properly, and update the consumers of idle time to consider iowait time as
    well. If we do this, we can use get_cpu_{idle,iowait}_time_us from other
    contexts as well and get the expected values.

    Signed-off-by: Michal Hocko
    Cc: Dave Jones
    Cc: Arnd Bergmann
    Cc: Alexey Dobriyan
    Link: http://lkml.kernel.org/r/e9c909c221a8da402c4da07e4cd968c3218f8eb1.1314172057.git.mhocko@suse.cz
    Signed-off-by: Thomas Gleixner

    Michal Hocko
     

01 Feb, 2011

1 commit

  • All callers of do_timer() are converted to xtime_update(). The only
    users of xtime_lock are in kernel/time/. Make both local to
    kernel/time/ and remove them from the global header files.

    [ tglx: Reuse tick-internal.h instead of creating another local header
    file. Massaged changelog ]

    Signed-off-by: Torben Hohn
    Cc: Peter Zijlstra
    Cc: johnstul@us.ibm.com
    Cc: yong.zhang0@gmail.com
    Cc: hch@infradead.org
    Signed-off-by: Thomas Gleixner

    Torben Hohn
     

20 Jan, 2011

1 commit

  • When NOHZ=y and high res timers are disabled (via cmdline or
    Kconfig) tick_nohz_switch_to_nohz() will notify the user about
    switching into NOHZ mode. Nothing is printed for the case where
    HIGH_RES_TIMERS=y. Fix this for the HIGH_RES_TIMERS=y case by
    duplicating the printk from the low res NOHZ path in the high
    res NOHZ path.

    This confused me since I was thinking 'dmesg | grep -i NOHZ' would
    tell me if NOHZ was enabled, but if I have hrtimers there is
    nothing.

    Signed-off-by: Stephen Boyd
    Acked-by: Thomas Gleixner
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stephen Boyd
     

07 Aug, 2010

1 commit


05 Aug, 2010

1 commit


03 Aug, 2010

1 commit

  • Historically, Linux has tried to make the regular timer tick on the
    various CPUs not happen at the same time, to avoid contention on
    xtime_lock.

    Nowadays, with the tickless kernel, this contention no longer happens
    since time keeping and updating are done differently. In addition,
    this skew is actually hurting power consumption in a measurable way on
    many-core systems.

    Signed-off-by: Arjan van de Ven
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Arjan van de Ven
     

17 Jul, 2010

1 commit

  • Norbert reported that nohz_ratelimit() causes his laptop to burn about
    4W (40%) extra. For now back out the change and see if we can adjust
    the power management code to make better decisions.

    Reported-by: Norbert Preining
    Signed-off-by: Peter Zijlstra
    Acked-by: Mike Galbraith
    Cc: Arjan van de Ven
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

01 Jul, 2010

1 commit

  • Commit 0224cf4c5e (sched: Intoduce get_cpu_iowait_time_us())
    broke things by not making sure preemption was indeed disabled
    in the callers of nr_iowait_cpu(), which reads the iowait value of
    the current cpu.

    This resulted in a heap of preempt warnings. Cure this by making
    nr_iowait_cpu() take a cpu number and fixing up the callers to pass
    in the right number.

    Signed-off-by: Peter Zijlstra
    Cc: Arjan van de Ven
    Cc: Sergey Senozhatsky
    Cc: Rafael J. Wysocki
    Cc: Maxim Levitsky
    Cc: Len Brown
    Cc: Pavel Machek
    Cc: Jiri Slaby
    Cc: linux-pm@lists.linux-foundation.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

18 Jun, 2010

1 commit

  • Chris Wedgwood reports that 39c0cbe (sched: Rate-limit nohz) causes a
    serial console regression (unresponsiveness), and indeed it does. The
    reason is that the nohz code is skipped even when the tick was already
    stopped before the nohz_ratelimit(cpu) condition changed.

    Move the nohz_ratelimit() check to the other conditions which prevent
    long idle sleeps.

    Reported-by: Chris Wedgwood
    Tested-by: Brian Bloniarz
    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    Cc: Jiri Kosina
    Cc: Linus Torvalds
    Cc: Greg KH
    Cc: Alan Cox
    Cc: OGAWA Hirofumi
    Cc: Jef Driesen
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
     

09 Jun, 2010

1 commit

  • In the new push model, all idle CPUs indeed go into nohz mode. There is
    still the concept of an idle load balancer (performing the load balancing
    on behalf of all the idle cpus in the system). A busy CPU kicks the nohz
    balancer when any of the nohz CPUs needs idle load balancing.
    The kicked CPU does the idle load balancing on behalf of all idle CPUs
    instead of the normal idle balance.

    This addresses the following two problems with the current nohz ilb logic:
    * The idle load balancer continued to have periodic ticks during idle and
    woke up frequently, even though it did not have any rebalancing to do on
    behalf of any of the idle CPUs.
    * On x86 and other CPUs whose APIC timer stops on idle, this
    periodic wakeup can result in an additional periodic interrupt on the CPU
    doing the timer broadcast.

    Also, currently we migrate unpinned timers from an idle cpu to the cpu
    doing idle load balancing (when all the cpus in the system are idle,
    there is no idle load balancing cpu and timers get added to the same idle
    cpu where the request was made, so the existing optimization works only on
    a semi-idle system).

    In a semi-idle system, we no longer have periodic ticks on the idle load
    balancer CPU. Using that cpu will add more delay to the timers than intended
    (as that cpu's timer base may not be up to date with respect to jiffies
    etc.). This was causing mysterious slowdowns during boot etc.

    For now, in the semi-idle case, use the nearest busy cpu for migrating
    timers from an idle cpu. This is good for power savings anyway.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Suresh Siddha
    Signed-off-by: Peter Zijlstra
    Cc: Thomas Gleixner
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Venkatesh Pallipadi
     

10 May, 2010

6 commits

  • For the ondemand cpufreq governor, it is desirable that the iowait
    time be micro-accounted in a similar way to idle time.

    This patch introduces the infrastructure to account and expose
    this information via the get_cpu_iowait_time_us() function.

    [akpm@linux-foundation.org: fix CONFIG_NO_HZ=n build]
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Now that the only user of ts->idle_lastupdate is
    update_ts_time_stats(), the entire field can be eliminated.

    In update_ts_time_stats(), idle_lastupdate is first set to
    "now", and a few lines later, the only user is an if() statement
    that assigns a variable either to "now" or to
    ts->idle_lastupdate, which has the value of "now" at that point.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • This patch folds the updating of the last_update_time into the
    update_ts_time_stats() function, and updates the callers.

    This allows for further cleanups that are done in the next
    patch.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Right now, get_cpu_idle_time_us() only reports the idle
    statistics up to the point the CPU last entered idle, not what is
    valid right now.

    This patch adds an update of the idle statistics to
    get_cpu_idle_time_us(), so that calling this function always
    returns statistics that are accurate at the point of the call.

    This includes resetting the start of the idle time for
    accounting purposes to avoid double accounting.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Currently, two places update the idle statistics (and more to
    come later in this series).

    This patch creates a helper function for updating these
    statistics.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • The exported function get_cpu_idle_time_us() has no comment
    describing it; add a kerneldoc comment.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Reviewed-by: Rik van Riel
    Acked-by: Peter Zijlstra
    Cc: davej@redhat.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     

12 Mar, 2010

1 commit

  • Entering nohz code on every micro-idle is costing ~10% throughput for
    netperf TCP_RR when scheduling cross-cpu. Rate-limiting entry fixes this,
    but raises ticks a bit. On my Q6600, an idle box goes from ~85
    interrupts/sec to 128.

    The higher the context switch rate, the more nohz entry costs. With this
    patch and some cycle recovery patches in my tree, the max cross-cpu context
    switch rate is improved by ~16%, a large portion of which is this rate
    limiting.

    Signed-off-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     

09 Dec, 2009

1 commit

  • * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    timers, init: Limit the number of per cpu calibration bootup messages
    posix-cpu-timers: optimize and document timer_create callback
    clockevents: Add missing include to pacify sparse
    x86: vmiclock: Fix printk format
    x86: Fix printk format due to variable type change
    sparc: fix printk for change of variable type
    clocksource/events: Fix fallout of generic code changes
    nohz: Allow 32-bit machines to sleep for more than 2.15 seconds
    nohz: Track last do_timer() cpu
    nohz: Prevent clocksource wrapping during idle
    nohz: Type cast printk argument
    mips: Use generic mult/shift factor calculation for clocks
    clocksource: Provide a generic mult/shift factor calculation
    clockevents: Use u32 for mult and shift factors
    nohz: Introduce arch_needs_cpu
    nohz: Reuse ktime in sub-functions of tick_check_idle.
    time: Remove xtime_cache
    time: Implement logarithmic time accumulation

    Linus Torvalds
     

14 Nov, 2009

3 commits

  • The previous patch, which limits the sleep time to the maximum
    deferment time of the timekeeping clocksource, has a limitation on
    SMP machines: if all CPUs are idle, then the maximum sleep
    time is limited for all of them.

    Solve this by keeping track of which cpu had the do_timer() duty
    assigned last and limiting the sleep time only for that cpu.

    Signed-off-by: Thomas Gleixner
    LKML-Reference:
    Cc: Jon Hunter
    Cc: John Stultz

    Thomas Gleixner
     
  • The dynamic tick allows the kernel to sleep for periods longer than a
    single tick, but it currently does not limit the sleep time. In the
    worst case the kernel could sleep longer than the wrap-around time of
    the timekeeping clocksource, which would result in losing track of
    time.

    Prevent this by limiting the sleep to the safe maximum sleep time of the
    current timekeeping clocksource. The value is calculated when the
    clocksource is registered.

    [ tglx: simplified the code a bit and massaged the commit msg ]

    Signed-off-by: Jon Hunter
    Cc: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Jon Hunter
     
  • On some archs local_softirq_pending() has a data type of unsigned long,
    on others it is unsigned int. Cast it to (unsigned int) in the
    printk to avoid the compiler warning.

    Signed-off-by: Thomas Gleixner
    LKML-Reference:

    Thomas Gleixner
     

05 Nov, 2009

2 commits

  • Allow the architecture to request a normal jiffy tick when the system
    goes idle and tick_nohz_stop_sched_tick() is called. On s390 the hook is
    used to prevent the system from going fully idle if there has been an
    interrupt other than a clock comparator interrupt since the last wakeup.

    On s390 the HiperSockets response time for 1 connection ping-pong goes
    down from 42 to 34 microseconds. The CPU cost decreases by 27%.

    Signed-off-by: Martin Schwidefsky
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
     
  • On a system with NOHZ=y, tick_check_idle() calls tick_nohz_stop_idle() and
    tick_nohz_update_jiffies(). Given the right conditions (ts->idle_active
    and/or ts->tick_stopped), both functions get a time stamp with ktime_get().
    The same time stamp can be reused if both functions require one.

    On s390 this change has the additional benefit that gcc inlines the
    tick_nohz_stop_idle function into tick_check_idle. The number of
    instructions to execute tick_check_idle drops from 225 to 144
    (without the ktime_get optimization it is 367 vs 215 instructions).

    before:

    0) | tick_check_idle() {
    0) | tick_nohz_stop_idle() {
    0) | ktime_get() {
    0) | read_tod_clock() {
    0) 0.601 us | }
    0) 1.765 us | }
    0) 3.047 us | }
    0) | ktime_get() {
    0) | read_tod_clock() {
    0) 0.570 us | }
    0) 1.727 us | }
    0) | tick_do_update_jiffies64() {
    0) 0.609 us | }
    0) 8.055 us | }

    after:

    0) | tick_check_idle() {
    0) | ktime_get() {
    0) | read_tod_clock() {
    0) 0.617 us | }
    0) 1.773 us | }
    0) | tick_do_update_jiffies64() {
    0) 0.593 us | }
    0) 4.477 us | }

    Signed-off-by: Martin Schwidefsky
    Cc: john stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
     

07 Oct, 2009

1 commit

  • Commit f2e21c9610991e95621a81407cdbab881226419b had unfortunate side
    effects with cpufreq governors on some systems.

    If the system did not switch into NOHZ mode, ts->inidle is not set when
    tick_nohz_stop_sched_tick() is called from the idle routine. Therefore
    all subsequent calls from irq_exit() to tick_nohz_stop_sched_tick()
    fail to call tick_nohz_start_idle(). This results in bogus idle
    accounting information being passed to cpufreq governors.

    Set the inidle flag unconditionally, regardless of the NOHZ active state,
    to keep the idle time accounting correct in any case.

    [ tglx: Added comment and tweaked the changelog ]

    Reported-by: Steven Noonan
    Signed-off-by: Eero Nurkkala
    Cc: Rik van Riel
    Cc: Venkatesh Pallipadi
    Cc: Greg KH
    Cc: Steven Noonan
    Cc: stable@kernel.org
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Eero Nurkkala
     

27 May, 2009

1 commit

  • A call from irq_exit() may occasionally pause the timing
    info for the cpufreq ondemand governor. This causes the
    cpufreq ondemand governor to fail to calculate the
    system load properly. Thus, relocate the checks for this
    particular case to keep the governor always functional.

    Signed-off-by: Eero Nurkkala
    Reported-by: Tero Kristo
    Acked-by: Rik van Riel
    Acked-by: Venkatesh Pallipadi
    Signed-off-by: Thomas Gleixner

    Eero Nurkkala
     

15 Jan, 2009

1 commit