14 Jul, 2016

3 commits

  • The vtime irqtime accounting headers are very scattered and convoluted
    right now. Reorganize them such that it is obvious that only
    CONFIG_VIRT_CPU_ACCOUNTING_NATIVE uses them.

    Signed-off-by: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: Radim Krcmar
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: Wanpeng Li
    Link: http://lkml.kernel.org/r/1468421405-20056-5-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Vtime generic irqtime accounting has been removed but there are a few
    remnants to clean up:

    * The vtime_accounting_cpu_enabled() check in irq entry was only used
    by CONFIG_VIRT_CPU_ACCOUNTING_GEN. We can safely remove it.

    * Without the vtime_accounting_cpu_enabled() check, we no longer need
    to have a vtime_common_account_irq_enter() indirect function.

    * Move vtime_account_irq_enter() implementation under
    CONFIG_VIRT_CPU_ACCOUNTING_NATIVE which is the last user.

    * The vtime_account_user() call was only used on irq entry for
    CONFIG_VIRT_CPU_ACCOUNTING_GEN. We can remove that too.

    Signed-off-by: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: Radim Krcmar
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: Wanpeng Li
    Link: http://lkml.kernel.org/r/1468421405-20056-4-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • The CONFIG_VIRT_CPU_ACCOUNTING_GEN irq time tracking code does not
    appear to currently work right.

    On CPUs without nohz_full=, only tick based irq time sampling is
    done, which breaks down when dealing with a nohz_idle CPU.

    On firewalls and similar systems, no ticks may happen on a CPU for a
    while, and the irq time spent may never get accounted properly. This
    can cause issues with capacity planning and power saving, which use
    the CPU statistics as inputs in decision making.

    Remove the VTIME_GEN vtime irq time code, and replace it with the
    IRQ_TIME_ACCOUNTING code, when selected as a config option by the user.
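
    The failure mode described above can be illustrated with a small
    userspace model. This is only a sketch, not kernel code: the struct
    and the model_*() helpers are hypothetical names, and time is a plain
    nanosecond counter passed in by the caller. It contrasts tick based
    sampling (irq time is only charged if a tick happens to fire during
    the irq) with timestamping at irq entry and exit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one CPU's irq time bookkeeping. */
struct irqtime_model {
    uint64_t sampled_irq_ns;   /* irq time as seen by tick sampling */
    uint64_t measured_irq_ns;  /* irq time from entry/exit timestamps */
    uint64_t irq_start_ns;     /* timestamp taken at irq entry */
    bool in_irq;               /* true between irq entry and exit */
};

/* Tick sampling: charge a whole tick to irq time if the tick lands in an irq. */
static void model_tick_sample(struct irqtime_model *m, uint64_t tick_ns)
{
    if (m->in_irq)
        m->sampled_irq_ns += tick_ns;
}

/* Timestamp based accounting: remember when the irq started. */
static void model_irq_stamp_enter(struct irqtime_model *m, uint64_t now_ns)
{
    m->in_irq = true;
    m->irq_start_ns = now_ns;
}

/* Timestamp based accounting: charge the exact elapsed time on exit. */
static void model_irq_stamp_exit(struct irqtime_model *m, uint64_t now_ns)
{
    m->measured_irq_ns += now_ns - m->irq_start_ns;
    m->in_irq = false;
}
```

    In this model, a CPU that handles irqs while no tick fires keeps
    sampled_irq_ns at zero, while measured_irq_ns grows with every irq.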

    Signed-off-by: Rik van Riel
    Signed-off-by: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: Radim Krcmar
    Cc: Thomas Gleixner
    Cc: Wanpeng Li
    Link: http://lkml.kernel.org/r/1468421405-20056-3-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Rik van Riel
     

04 Dec, 2015

2 commits

  • Readers need to know if vtime runs at all on some CPU somewhere. This
    is a fast-path check that determines whether we need to look further
    and add up any tickless cputime delta.

    This fast-path check uses the context tracking state because vtime is
    tied to context tracking as of now. This check appears to be confusing,
    though, so let's use a vtime function that deals with the context
    tracking details in the vtime implementation instead.
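
    The distinction between "vtime runs on some CPU" and "vtime runs on
    this CPU" can be sketched in userspace as follows. This is a
    simplified model under stated assumptions, not the kernel
    implementation: MODEL_NR_CPUS, struct vtime_model and the model_*()
    helpers are hypothetical, and the real checks are built on context
    tracking state rather than a plain array.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical model: which CPUs currently run vtime accounting. */
struct vtime_model {
    bool cpu_active[MODEL_NR_CPUS];
    int nr_active;              /* number of CPUs with vtime active */
};

/* Turn vtime on or off for one CPU, keeping the global count in sync. */
static void model_set_cpu(struct vtime_model *v, int cpu, bool on)
{
    if (v->cpu_active[cpu] == on)
        return;
    v->cpu_active[cpu] = on;
    v->nr_active += on ? 1 : -1;
}

/* Global fast-path check: does vtime run on any CPU at all? */
static bool model_vtime_enabled(const struct vtime_model *v)
{
    return v->nr_active > 0;
}

/* Local check: does vtime run on this particular CPU? */
static bool model_vtime_cpu_enabled(const struct vtime_model *v, int cpu)
{
    return model_vtime_enabled(v) && v->cpu_active[cpu];
}
```

    A reader would call the global check first and only look at per-task
    snapshots when it returns true.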

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Chris Metcalf
    Cc: Christoph Lameter
    Cc: Hiroshi Shimamoto
    Cc: Linus Torvalds
    Cc: Luiz Capitulino
    Cc: Mike Galbraith
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1447948054-28668-7-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • vtime_accounting_enabled() checks if vtime is running on the current CPU
    and is as such a misnomer. Let's rename it to a function that reflects
    its locality. We are going to need the current name for a function that
    tells if vtime runs at all on some CPU.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Chris Metcalf
    Cc: Christoph Lameter
    Cc: Hiroshi Shimamoto
    Cc: Linus Torvalds
    Cc: Luiz Capitulino
    Cc: Mike Galbraith
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1447948054-28668-6-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

03 Dec, 2013

2 commits


14 Aug, 2013

2 commits

  • If no CPU is in the full dynticks range, we can avoid the full
    dynticks cputime accounting through generic vtime along with its
    overhead and use the traditional tick based accounting instead.

    Let's do this and nop out the off case with static keys.

    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Li Zhong
    Cc: Mike Galbraith
    Cc: Kevin Hilman

    Frederic Weisbecker
     
  • If the arch overrides some generic vtime APIs, let it describe
    these on a dedicated and standalone header. This way it becomes
    convenient to include it in vtime generic headers without irrelevant
    stuff in such a low level header.

    Signed-off-by: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Li Zhong
    Cc: Mike Galbraith
    Cc: Kevin Hilman
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens

    Frederic Weisbecker
     

31 May, 2013

1 commit

  • While computing the cputime delta of dynticks CPUs,
    we are mixing up clocks of different natures:

    * local_clock() which takes care of unstable clock
    sources and fixes them up if needed.

    * sched_clock() which is the weaker version of
    local_clock(). It doesn't compute any fixup in case
    of unstable source.

    If the clock source is stable, those two clocks are the
    same and we can safely compute the difference between
    two arbitrary points.

    Otherwise it results in random deltas as sched_clock()
    can randomly drift away, backward or forward, from local_clock().

    As a consequence, some strange behaviour with unstable tsc
    has been observed such as non progressing constant zero cputime.
    (The 'top' command showing no load).

    Fix this by only using local_clock(), or its irq safe/remote
    equivalent, in vtime code.
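
    The effect of mixing the two clocks can be shown with a small
    userspace model. It is a sketch under assumptions: struct clock_model
    and the model_*() functions are hypothetical stand-ins, with the
    "local" clock modelled as the raw counter plus a fixup offset and the
    "sched" clock as the raw counter alone.

```c
#include <stdint.h>

/* Hypothetical model of one clock reading. */
struct clock_model {
    uint64_t raw_ns;    /* raw counter value */
    int64_t fixup_ns;   /* correction applied only by the "local" clock */
};

/* Stand-in for the weaker clock: no fixup applied. */
static int64_t model_sched_clock(const struct clock_model *c)
{
    return (int64_t)c->raw_ns;
}

/* Stand-in for the corrected clock: raw value plus fixup. */
static int64_t model_local_clock(const struct clock_model *c)
{
    return (int64_t)c->raw_ns + c->fixup_ns;
}

/* Delta computed consistently with one clock. */
static int64_t delta_same_clock(const struct clock_model *start,
                                const struct clock_model *end)
{
    return model_local_clock(end) - model_local_clock(start);
}

/* Delta computed with the snapshot from one clock and the read from another. */
static int64_t delta_mixed_clocks(const struct clock_model *start,
                                  const struct clock_model *end)
{
    return model_sched_clock(end) - model_local_clock(start);
}
```

    With a nonzero fixup, the mixed delta is off by that fixup and can
    even go negative, while the single-clock delta stays equal to the
    elapsed raw time.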

    Reported-by: Mike Galbraith
    Suggested-by: Mike Galbraith
    Cc: Steven Rostedt
    Cc: Paul E. McKenney
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Li Zhong
    Cc: Mike Galbraith
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

28 Jan, 2013

4 commits

  • While remotely reading the cputime of a task running on a
    full dynticks CPU, the values stored in the utime/stime fields
    of struct task_struct may be stale. They may be those
    of the last kernel/user transition time snapshot, and
    we need to add the tickless time spent since this snapshot.

    To fix this, flush the cputime of the dynticks CPUs on
    kernel/user transition and record the time / context
    where we did this. Then, using this snapshot and the current
    time, perform the fixup on the reader side from the task_times()
    accessors.
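
    The snapshot-plus-fixup scheme can be sketched in userspace like this.
    It is a simplified model, not the kernel code: the struct, the enum
    and the model_*() functions are hypothetical, time is a nanosecond
    counter supplied by the caller, and the synchronization a real remote
    reader needs is left out.

```c
#include <stdint.h>

enum ctx_model { CTX_KERNEL, CTX_USER };

/* Hypothetical per-task cputime state. */
struct task_cputime_model {
    uint64_t utime_ns;        /* user time flushed so far */
    uint64_t stime_ns;        /* system time flushed so far */
    uint64_t snap_ns;         /* time of the last transition flush */
    enum ctx_model snap_ctx;  /* context entered at that transition */
};

/* Writer side: flush the pending delta on a kernel/user transition. */
static void model_flush_on_transition(struct task_cputime_model *t,
                                      uint64_t now_ns,
                                      enum ctx_model entering)
{
    uint64_t delta = now_ns - t->snap_ns;

    if (t->snap_ctx == CTX_USER)
        t->utime_ns += delta;
    else
        t->stime_ns += delta;

    /* Record when and into which context we transitioned. */
    t->snap_ns = now_ns;
    t->snap_ctx = entering;
}

/* Reader side: add the time elapsed since the snapshot, without writing. */
static void model_task_times(const struct task_cputime_model *t,
                             uint64_t now_ns,
                             uint64_t *utime_ns, uint64_t *stime_ns)
{
    uint64_t delta = now_ns - t->snap_ns;

    *utime_ns = t->utime_ns;
    *stime_ns = t->stime_ns;
    if (t->snap_ctx == CTX_USER)
        *utime_ns += delta;
    else
        *stime_ns += delta;
}
```

    The stored fields stay stale between transitions; only the reader's
    returned values include the tickless time since the snapshot.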

    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Ingo Molnar
    Cc: Li Zhong
    Cc: Namhyung Kim
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    [fixed kvm module related build errors]
    Signed-off-by: Sedat Dilek

    Frederic Weisbecker
     
  • Do some ground preparatory work before adding guest_enter()
    and guest_exit() context tracking callbacks. Those will
    be later used to read the guest cputime safely when we
    run in full dynticks mode.

    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Gleb Natapov
    Cc: Ingo Molnar
    Cc: Li Zhong
    Cc: Marcelo Tosatti
    Cc: Namhyung Kim
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner

    Frederic Weisbecker
     
  • Allow dynamic switching between tick and virtual based
    cputime accounting. This way we can provide a kind of "on-demand"
    virtual based cputime accounting. In this mode, the kernel relies
    on the context tracking subsystem to dynamically probe on kernel
    boundaries.

    This is in preparation for being able to stop the timer tick in
    more places than just the idle state. Doing so will depend on
    CONFIG_VIRT_CPU_ACCOUNTING_GEN which makes it possible to account
    the cputime without the tick by hooking on kernel/user boundaries.

    Depending on whether the tick is stopped or not, we can switch between
    tick and vtime based accounting anytime in order to minimize the
    overhead associated with user hooks.

    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Ingo Molnar
    Cc: Li Zhong
    Cc: Namhyung Kim
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner

    Frederic Weisbecker
     
  • If we want to stop the tick beyond idle, we need to be
    able to account the cputime without using the tick.

    Virtual based cputime accounting solves that problem by
    hooking into kernel/user boundaries.

    However, implementing CONFIG_VIRT_CPU_ACCOUNTING requires
    low level hooks and involves more overhead. But we already
    have a generic context tracking subsystem that is required
    for RCU needs by archs which plan to shut down the tick
    outside idle.

    This patch implements a generic virtual based cputime
    accounting that relies on these generic kernel/user hooks.

    There are some upsides of doing this:

    - This requires no arch code to implement CONFIG_VIRT_CPU_ACCOUNTING
    if context tracking is already built (already necessary for RCU in full
    tickless mode).

    - We can rely on the generic context tracking subsystem to dynamically
    (de)activate the hooks, so that we can switch anytime between virtual
    and tick based accounting. This way we don't have the overhead
    of the virtual accounting when the tick is running periodically.

    And one downside:

    - There is probably more overhead than a native virtual based cputime
    accounting. But this relies on hooks that are already set anyway.

    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Ingo Molnar
    Cc: Li Zhong
    Cc: Namhyung Kim
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner

    Frederic Weisbecker
     

19 Nov, 2012

2 commits

  • All vtime implementations just flush the user time on process
    tick. Consolidate that in generic code by calling a user time
    accounting helper. This avoids an indirect call in ia64 and
    prepares to also consolidate vtime context switch code.

    Signed-off-by: Frederic Weisbecker
    Reviewed-by: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Steven Rostedt
    Cc: Paul Gortmaker
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens

    Frederic Weisbecker
     
  • Prepending irq-unsafe vtime APIs with underscores was actually
    a bad idea as the result is a big mess in the API namespace that
    is even waiting to be further extended. Also these helpers
    are always called from irq safe callers except kvm. Just
    provide a vtime_account_system_irqsafe() for this specific
    case so that we can remove the underscore prefix on other
    vtime functions.

    Signed-off-by: Frederic Weisbecker
    Reviewed-by: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Steven Rostedt
    Cc: Paul Gortmaker
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens

    Frederic Weisbecker
     

30 Oct, 2012

4 commits

  • vtime_account() doesn't have the same role in
    CONFIG_VIRT_CPU_ACCOUNTING and CONFIG_IRQ_TIME_ACCOUNTING.

    In the first case it handles time accounting in any context. In
    the second case it only handles irq time accounting.

    So when vtime_account() is called from outside vtime_account_irq_*(),
    this call is pointless for CONFIG_IRQ_TIME_ACCOUNTING.

    To fix the confusion, change vtime_account() to irqtime_account_irq()
    in CONFIG_IRQ_TIME_ACCOUNTING. This way we ensure future vtime_account()
    calls won't waste cycles in the irqtime APIs.

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Steven Rostedt
    Cc: Paul Gortmaker

    Frederic Weisbecker
     
  • With CONFIG_VIRT_CPU_ACCOUNTING, when vtime_account()
    is called in irq entry/exit, we perform a check on the
    context: if we are interrupting the idle task we
    account the pending cputime to idle, otherwise account
    to system time or its sub-areas: tsk->stime, hardirq time,
    softirq time, ...

    However this check for idle only concerns the hardirq entry
    and softirq entry:

    * Hardirq may directly interrupt the idle task, in which case
    we need to flush the pending CPU time to idle.

    * The idle task may be directly interrupted by a softirq if
    it calls local_bh_enable(). There is probably no such call
    in any idle task but we need to cover every case. Ksoftirqd
    is not concerned because the idle time is flushed on context
    switch and softirq in the end of hardirq have the idle time
    already flushed from the hardirq entry.

    In the other cases we always account to system/irq time:

    * On hardirq exit we account the time to hardirq time.
    * On softirq exit we account the time to softirq time.

    To optimize this and avoid the indirect call to vtime_account()
    and the checks it performs, specialize the vtime irq APIs and
    only perform the check on irq entry. Irq exit can directly call
    vtime_account_system().

    CONFIG_IRQ_TIME_ACCOUNTING behaviour doesn't change and directly
    maps to its own vtime_account() implementation. One may want
    to take advantage of the new APIs to optimize irq time accounting
    as well in the future.
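
    The split between irq entry and irq exit described above can be
    sketched as a userspace model. This is an illustration only: the
    struct and model_*() helpers are hypothetical, the caller supplies the
    current time, and the single system_ns bucket stands in for the
    system, hardirq and softirq sub-areas.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU accounting state. */
struct cpu_acct_model {
    uint64_t idle_ns;    /* time charged to idle */
    uint64_t system_ns;  /* time charged to system/irq */
    uint64_t last_ns;    /* time of the last flush */
};

/* Return the time pending since the last flush and advance the marker. */
static uint64_t model_take_pending(struct cpu_acct_model *c, uint64_t now_ns)
{
    uint64_t pending = now_ns - c->last_ns;

    c->last_ns = now_ns;
    return pending;
}

/* Irq entry: the only place where the idle check is needed. */
static void model_account_irq_enter(struct cpu_acct_model *c, uint64_t now_ns,
                                    bool interrupting_idle)
{
    uint64_t pending = model_take_pending(c, now_ns);

    if (interrupting_idle)
        c->idle_ns += pending;
    else
        c->system_ns += pending;
}

/* Irq exit: the pending time is irq time, so charge it without any check. */
static void model_account_irq_exit(struct cpu_acct_model *c, uint64_t now_ns)
{
    c->system_ns += model_take_pending(c, now_ns);
}
```

    The exit path has no branch at all, which is the saving the
    specialized APIs aim for.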

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Steven Rostedt
    Cc: Paul Gortmaker

    Frederic Weisbecker
     
  • vtime_account_system() currently has only one caller,
    vtime_account(), which is irq safe.

    Now we are going to call it from other places like kvm where
    irqs are not always disabled by the time we account the cputime.

    So let's make it irqsafe. The arch implementation part is now
    prefixed with "__".

    vtime_account_idle() arch implementation is prefixed accordingly
    to stay consistent.
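
    The wrapper pattern described above (an irq-safe entry point around an
    irq-unsafe implementation) can be sketched in userspace as follows.
    This is a model under assumptions: interrupt state is a plain flag,
    and every model_*() name is hypothetical. The "_unsafe" suffix plays
    the role of the "__" prefix, which is avoided here because such
    identifiers are reserved in userspace C.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's interrupt enable state. */
static bool model_irqs_enabled = true;

static unsigned long model_system_ns;  /* accumulated system time */
static bool model_saw_irqs_on;         /* set if the unsafe part ran with irqs on */

/* Save the current irq state and disable irqs. */
static unsigned long model_local_irq_save(void)
{
    unsigned long flags = model_irqs_enabled ? 1UL : 0UL;

    model_irqs_enabled = false;
    return flags;
}

/* Restore the irq state saved earlier. */
static void model_local_irq_restore(unsigned long flags)
{
    model_irqs_enabled = (flags != 0);
}

/* Implementation part: must only run with irqs disabled. */
static void model_account_system_unsafe(unsigned long delta_ns)
{
    if (model_irqs_enabled)
        model_saw_irqs_on = true;
    model_system_ns += delta_ns;
}

/* Irq-safe wrapper: callable whether or not irqs are already disabled. */
static void model_account_system(unsigned long delta_ns)
{
    unsigned long flags = model_local_irq_save();

    model_account_system_unsafe(delta_ns);
    model_local_irq_restore(flags);
}
```

    Callers that already run with irqs disabled pay only for a redundant
    save and restore; callers with irqs enabled get the protection they
    were missing.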

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Steven Rostedt
    Cc: Paul Gortmaker
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens

    Frederic Weisbecker
     
  • These APIs are scattered around and are going to expand a bit.
    Let's create a dedicated header file for sanity.

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Steven Rostedt
    Cc: Paul Gortmaker

    Frederic Weisbecker