30 Oct, 2011

2 commits


26 May, 2011

1 commit


31 Mar, 2011

1 commit


05 Jan, 2011

2 commits

  • Up to now /proc/interrupts only has statistics for external and i/o
    interrupts but doesn't split them up any further.
    This patch adds a line for every single interrupt source so that it
    is easier to tell what the machine is/was doing.
    Part of the output now looks like this:

               CPU0       CPU2       CPU4
    EXT:       3898       4232       2305
    I/O:        782        315        245
    CLK:       1029       1964        727   [EXT] Clock Comparator
    IPI:       2868       2267       1577   [EXT] Signal Processor
    TMR:          0          0          0   [EXT] CPU Timer
    TAL:          0          0          0   [EXT] Timing Alert
    PFL:          0          0          0   [EXT] Pseudo Page Fault
    [...]
    NMI:          0          1          1   [NMI] Machine Checks

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
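
The per-source accounting described above can be sketched in user-space C as follows. This is only an illustration, not the kernel code: the names `irq_source`, `irq_counts`, `count_irq`, `format_irq_line` and `NR_CPUS_DEMO` are hypothetical, and only three of the sources from the sample output are modelled.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS_DEMO 3	/* hypothetical: three cpus, as in the sample output */

/* Hypothetical subset of interrupt sources, named after the sample output. */
enum irq_source { IRQ_CLK, IRQ_IPI, IRQ_TMR, NR_IRQ_SOURCES };

struct irq_desc_demo {
	const char *name;	/* short tag printed in the first column */
	const char *desc;	/* interrupt class and description */
};

static const struct irq_desc_demo irq_descs[NR_IRQ_SOURCES] = {
	[IRQ_CLK] = { "CLK", "[EXT] Clock Comparator" },
	[IRQ_IPI] = { "IPI", "[EXT] Signal Processor" },
	[IRQ_TMR] = { "TMR", "[EXT] CPU Timer" },
};

/* One counter per cpu and per interrupt source. */
static unsigned int irq_counts[NR_CPUS_DEMO][NR_IRQ_SOURCES];

/* Would be called from the handler of the given source. */
void count_irq(int cpu, enum irq_source src)
{
	assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
	irq_counts[cpu][src]++;
}

/* Format one line for a source; returns the number of characters written. */
int format_irq_line(char *buf, size_t len, enum irq_source src)
{
	int n = snprintf(buf, len, "%s:", irq_descs[src].name);

	for (int cpu = 0; cpu < NR_CPUS_DEMO; cpu++) {
		/* The sketch assumes the caller passes a large enough buffer. */
		assert(n >= 0 && (size_t)n < len);
		n += snprintf(buf + n, len - (size_t)n, " %10u",
			      irq_counts[cpu][src]);
	}
	assert(n >= 0 && (size_t)n < len);
	n += snprintf(buf + n, len - (size_t)n, " %s", irq_descs[src].desc);
	return n;
}
```

Keeping the description next to the tag in one table means a new source only needs one enum entry and one table row.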
     
  • Add kprobes annotations to get the massive 'probe kernel.function("*") {}'
    stress test working.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

01 Dec, 2010

1 commit

  • This fixes the same problem as described in the patch "nohz: fix
    printk_needs_cpu() return value on offline cpus" for the arch_needs_cpu()
    primitive:

    arch_needs_cpu() may return 1 if called on offline cpus. When a cpu gets
    offlined it schedules the idle process which, before killing its own cpu,
    will call tick_nohz_stop_sched_tick().
    That function in turn will call arch_needs_cpu() in order to check if the
    local tick can be disabled. On offline cpus this function should naturally
    return 0 since, regardless of whether the tick gets disabled or not, the cpu
    will be dead shortly after. That is besides the fact that __cpu_disable()
    should already have made sure that no interrupts will be delivered on the
    offlined cpu anyway.

    In this case it prevents tick_nohz_stop_sched_tick() from calling
    select_nohz_load_balancer(). No idea if that really is a problem. However what
    made me debug this is that on 2.6.32 the function get_nohz_load_balancer() is
    used within __mod_timer() to select a cpu on which a timer gets enqueued.
    If arch_needs_cpu() returns 1 then the nohz_load_balancer cpu doesn't get
    updated when a cpu gets offlined. It may contain the cpu number of an offline
    cpu. In turn timers get enqueued on an offline cpu and not very surprisingly
    they never expire and cause system hangs.

    This has been observed on 2.6.32 kernels. On current kernels __mod_timer() uses
    get_nohz_timer_target() which doesn't have that problem. However there might
    be other problems because of the too early exit of tick_nohz_stop_sched_tick()
    in case a cpu goes offline.

    This specific bug was introduced with 3c5d92a0 "nohz: Introduce
    arch_needs_cpu".

    In this case a cpu hotplug notifier is used to fix the issue in order to keep
    the normal/fast path small. All we need to do is to clear the condition that
    makes arch_needs_cpu() return 1 since it is just a performance improvement
    which is supposed to keep the local tick running for a short period if a cpu
    goes idle. Nothing special needs to be done except for clearing the condition.

    Cc: stable@kernel.org
    Acked-by: Peter Zijlstra
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
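
The approach of clearing the condition from a hotplug notifier, rather than checking for offline cpus in the fast path, might be modelled roughly like this. Everything here is a simplified user-space assumption: `keep_tick`, `arch_needs_cpu_demo`, `demo_cpu_notify` and the `DEMO_CPU_*` events are hypothetical stand-ins, not the kernel's actual state or notifier interface.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_DEMO 4	/* hypothetical number of cpus */

/* Hypothetical per-cpu condition that asks to keep the local tick running. */
static bool keep_tick[NR_CPUS_DEMO];

/* Fast path: only reports the condition, no online/offline check. */
int arch_needs_cpu_demo(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
	return keep_tick[cpu] ? 1 : 0;
}

/* Hypothetical hotplug events, loosely modelled on notifier actions. */
enum hotplug_event { DEMO_CPU_ONLINE, DEMO_CPU_DYING };

/* Slow path: when a cpu goes away, clear the condition once. */
void demo_cpu_notify(enum hotplug_event ev, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
	if (ev == DEMO_CPU_DYING)
		keep_tick[cpu] = false;
}
```

Because the condition is only a performance hint, dropping it when the cpu goes offline loses nothing, and the fast path stays a single load.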
     

25 Oct, 2010

1 commit


17 May, 2010

1 commit

  • A machine check can interrupt the i/o and external interrupt handler
    anytime. If the machine check occurs while the interrupt handler is
    waking up from idle, vtime_start_cpu can get executed a second time
    and the int_clock / async_enter_timer values in the lowcore get
    clobbered. This can confuse the cpu time accounting.
    To fix this problem two changes are needed. First the machine check
    handler has to use its own copies of int_clock and async_enter_timer,
    named mcck_clock and mcck_enter_timer. Second the nested execution
    of vtime_start_cpu has to be prevented. This is done in s390_idle_check
    by checking the wait bit in the program status word.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

05 Nov, 2009

1 commit

  • Allow the architecture to request a normal jiffy tick when the system
    goes idle and tick_nohz_stop_sched_tick is called. On s390 the hook is
    used to prevent the system going fully idle if there has been an
    interrupt other than a clock comparator interrupt since the last wakeup.

    On s390 the HiperSockets response time for 1 connection ping-pong goes
    down from 42 to 34 microseconds. The CPU cost decreases by 27%.

    Signed-off-by: Martin Schwidefsky
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Martin Schwidefsky
     

22 Jun, 2009

2 commits


12 Jun, 2009

1 commit


23 Apr, 2009

1 commit

  • The cpu idle field in the output of /proc/stat is too small for cpus
    that have been idle for more than a tick. Add the architecture hook
    arch_idle_time that allows adding the unaccounted idle time of a
    sleeping cpu without waking the cpu.

    The s390 implementation of arch_idle_time uses the already existing
    s390_idle_data per_cpu variable to find the sleep time of a neighboring
    idle cpu.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

14 Apr, 2009

3 commits

  • Start the cpu time accounting very early to catch the cpu time spent
    for the initial kernel setup. To make the output of /proc/uptime
    match the sum of all cpu accounting values of the boot cpu, reset
    xtime and wall_to_monotonic to sane values based on the TOD clock.
    The values set by timekeeping_init are off by up to a second.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • The steal time is calculated by subtracting the time the virtual cpu
    has been running on a physical cpu from the wall clock time. To make
    that work all wall time needs to be added to the steal time field first
    before the virtual cpu time is subtracted.

    The time between the last clock update and the load of the enabled wait
    psw needs to be added to the steal_time field as well to make the sum
    over all cpu accounting numbers match the wall clock.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
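
The order of operations described above (add all wall time first, then subtract the time the virtual cpu actually ran) can be illustrated with a small sketch. The struct and function names are hypothetical, and the use of a signed accumulator is an assumption of this example, not something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical accumulator; signed so it may dip below zero between updates. */
struct steal_acct {
	int64_t steal;
};

/* Step 1: every slice of elapsed wall clock time is added first. */
void add_wall_time(struct steal_acct *a, uint64_t wall_delta)
{
	assert(a);
	a->steal += (int64_t)wall_delta;
}

/* Step 2: the time the virtual cpu ran on a physical cpu is subtracted. */
void sub_cpu_time(struct steal_acct *a, uint64_t cpu_delta)
{
	assert(a);
	a->steal -= (int64_t)cpu_delta;
}

/* Report the accumulated steal time, if any, and reset the accumulator. */
uint64_t flush_steal(struct steal_acct *a)
{
	uint64_t s;

	assert(a);
	if (a->steal <= 0)
		return 0;
	s = (uint64_t)a->steal;
	a->steal = 0;
	return s;
}
```

With 1000 units of wall time and 700 units of virtual cpu time, `flush_steal` reports 300 units. If a slice of wall time (such as the time up to loading the enabled wait psw) were left out in step 1, the sum of all accounting values would fall short of the wall clock by exactly that slice.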
     
  • In case mod_virt_timer is used to add a non pending timer the timer
    is always added as a one-shot timer. If mod_virt_timer is used for
    periodic timers they may therefore be degraded to one-shot timers.

    Add mod_virt_timer_periodic to the interface to allow safe re-programming
    of the interval value.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky

    Jan Glauber
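
A rough model of the difference between the two entry points is sketched below. The `vtimer_demo` structure, the `_demo` function names and the exact handling of an already pending timer are assumptions made for illustration; the real interface may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical virtual timer: interval == 0 means one-shot. */
struct vtimer_demo {
	uint64_t expires;
	uint64_t interval;
	bool pending;
};

/* Models mod_virt_timer: a non pending timer is (re)added as one-shot. */
void mod_virt_timer_demo(struct vtimer_demo *t, uint64_t expires)
{
	assert(t);
	if (!t->pending)
		t->interval = 0;	/* a periodic timer is degraded here */
	t->expires = expires;
	t->pending = true;
}

/* Models mod_virt_timer_periodic: the interval is re-programmed as well. */
void mod_virt_timer_periodic_demo(struct vtimer_demo *t, uint64_t expires)
{
	assert(t);
	t->expires = expires;
	t->interval = expires;
	t->pending = true;
}
```

In this model a caller that owns a periodic timer uses the periodic variant, so the interval survives even when the timer was not pending at the time of the call.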
     

23 Jan, 2009

1 commit

  • On (initial) cpu hotplug the lowcore values for user_timer and
    system_timer don't get initialized like they would get on each
    process schedule.
    On initial start of secondary cpus this leads to the situation
    where per thread user/system_timer values are larger than the
    corresponding contents of the lowcore. When later calculating
    time spent in user/system context the result can be negative.

    So for cpu hotplug we should manually initialize lowcore values.

    Fixes this bug:

    Kernel BUG at 000ec080 [verbose debug info unavailable]
    fixpoint divide exception: 0009 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 10 Not tainted 2.6.28 #4
    Process sysctl (pid: 975, task: 3fa752e0, ksp: 3fbebca0)
    Krnl PSW : 070c1000 800ec080 (show_stat+0x390/0x5fc)
    R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0
    Krnl GPRS: 7fffffff fefc7ce5 3faec080 003879ae
    00000001 01388000 7fffffff 01388000
    00000000 00000000 0049ad50 3fbebcf8
    01388000 002f51a8 800ec1fe 3fbebcf8
    Krnl Code: 800ec076: 9001b188 stm %r0,%r1,392(%r11)
    800ec07a: 9801b0c0 lm %r0,%r1,192(%r11)
    800ec07e: 1d05 dr %r0,%r5
    >800ec080: 9001b0c0 stm %r0,%r1,192(%r11)
    800ec084: 5860b0c4 l %r6,196(%r11)
    800ec088: 1806 lr %r0,%r6
    800ec08a: 8c800001 srdl %r8,1
    800ec08e: 1d87 dr %r8,%r7
    Call Trace:
    ([] show_stat+0x4fe/0x5fc)
    [] seq_read+0xc4/0x3ac
    [] proc_reg_read+0x6e/0x9c
    [] vfs_read+0x78/0x100
    [] sys_read+0x40/0x80
    [] sysc_do_restart+0x1a/0x1e

    Signed-off-by: Heiko Carstens

    Heiko Carstens
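
The failure mode can be reproduced in miniature. The sketch below uses hypothetical `lowcore_demo` and `thread_demo` structures that only carry the two timer fields named in the message; it shows how a delta goes negative when the lowcore copy was never initialized, and how copying the thread values fixes it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins carrying only the two fields of interest. */
struct lowcore_demo {
	uint64_t user_timer;
	uint64_t system_timer;
};

struct thread_demo {
	uint64_t user_timer;
	uint64_t system_timer;
};

/* Time spent in user context since the thread's last snapshot. */
int64_t user_delta(const struct lowcore_demo *lc, const struct thread_demo *t)
{
	assert(lc && t);
	return (int64_t)(lc->user_timer - t->user_timer);
}

/* What a process schedule would do, done manually for cpu hotplug. */
void init_lowcore_timers(struct lowcore_demo *lc, const struct thread_demo *t)
{
	assert(lc && t);
	lc->user_timer = t->user_timer;
	lc->system_timer = t->system_timer;
}
```

A negative delta fed into later arithmetic is one plausible way to end up with a bogus divisor or dividend, as in the divide exception shown in the trace above.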
     

31 Dec, 2008

5 commits

  • Distinguish the cputime of the idle process where idle is actually using
    cpu cycles from the cputime where idle is sleeping on an enabled wait psw.
    The former is accounted as system time, the latter as idle time.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Increase the precision of the idle time calculation that is exported
    to user space via /sys/devices/system/cpu/cpu/idle_time_us

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • The unit of the cputime accounting values that are stored per process is
    currently a microsecond. The CPU timer has a maximum granularity of
    2**-12 microseconds. There is no benefit in storing the per process values
    in the lesser precision and there is the disadvantage that the backend
    has to do the rounding to microseconds. The better solution is to use
    the maximum granularity of the CPU timer as cputime unit.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
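
To make the unit change concrete: with a granularity of 2**-12 microseconds, one microsecond corresponds to 4096 cputime units. The conversion helpers below are a sketch with hypothetical names (`cputime_demo_t`, `usecs_to_cputime_demo`, `cputime_to_usecs_demo`), not the kernel's own macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cputime type: one unit is 2**-12 microseconds. */
typedef uint64_t cputime_demo_t;

#define CPUTIME_DEMO_SHIFT 12

/* Microseconds to cputime units: multiply by 4096. */
cputime_demo_t usecs_to_cputime_demo(uint64_t usecs)
{
	/* Guard against losing the top bits in the shift. */
	assert(usecs <= (UINT64_MAX >> CPUTIME_DEMO_SHIFT));
	return usecs << CPUTIME_DEMO_SHIFT;
}

/* Cputime units to microseconds: divide by 4096, rounding down. */
uint64_t cputime_to_usecs_demo(cputime_demo_t ct)
{
	return ct >> CPUTIME_DEMO_SHIFT;
}
```

Three deltas of 2048 units (half a microsecond each) would each round down to 0 if they were converted before being stored. Summed in native units they give 6144 units, which converts to 1 microsecond, so rounding only happens once, at the point where a value is reported.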
     
  • The cpu time spent by the idle process actually doing something is
    currently accounted as idle time. This is plain wrong; the architectures
    that support VIRT_CPU_ACCOUNTING=y can do better: distinguish between the
    time spent doing nothing and the time spent by idle doing work. The first
    is accounted with account_idle_time and the second with account_system_time.
    The architectures that use the account_xxx_time interface directly and not
    the account_xxx_ticks interface now need to do the check for the idle
    process in their arch code. In particular to improve the system vs true
    idle time accounting the arch code needs to measure the true idle time
    instead of just testing for the idle process.
    To improve the tick based accounting as well we would need an architecture
    primitive that can tell us if the pt_regs of the interrupted context
    points to the magic instruction that halts the cpu.

    In addition idle time is no longer added to the stime of the idle process.
    This field now contains the system time of the idle process as it should
    be. On systems without VIRT_CPU_ACCOUNTING this will always be zero as
    every tick that occurs while idle is running will be accounted as idle
    time.

    This patch contains the necessary common code changes to be able to
    distinguish idle system time and true idle time. The architectures with
    support for VIRT_CPU_ACCOUNTING need some changes to exploit this.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • The utimescaled / stimescaled fields in the task structure and the
    global cpustat should be set on all architectures. On s390 the calls
    to account_user_time_scaled and account_system_time_scaled have never
    been added. In addition system time that is accounted as guest time
    to the user time of a process is accounted to the scaled system time
    instead of the scaled user time.
    To fix the bugs and to prevent future forgetfulness this patch merges
    account_system_time_scaled into account_system_time and
    account_user_time_scaled into account_user_time.

    Cc: Benjamin Herrenschmidt
    Cc: Hidetoshi Seto
    Cc: Tony Luck
    Cc: Jeremy Fitzhardinge
    Cc: Chris Wright
    Cc: Michael Neuling
    Acked-by: Paul Mackerras
    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

25 Dec, 2008

1 commit


14 Jul, 2008

2 commits


27 Apr, 2008

1 commit

  • This patch contains the port of Qumranet's kvm kernel module to IBM zSeries
    (aka s390x, mainframe) architecture. It uses the mainframe's virtualization
    instruction SIE to run virtual machines with up to 64 virtual CPUs each.
    This port is only usable on 64bit host kernels, and can only run 64bit guest
    kernels. However, running 31bit applications in guest userspace is possible.

    The following source files are introduced by this patch
    arch/s390/kvm/kvm-s390.c similar to arch/x86/kvm/x86.c, this implements all
    arch callbacks for kvm. __vcpu_run calls back into
    sie64a to enter the guest machine context
    arch/s390/kvm/sie64a.S assembler function sie64a, which enters guest
    context via SIE, and switches world before and after that
    include/asm-s390/kvm_host.h contains all vital data structures needed to run
    virtual machines on the mainframe
    include/asm-s390/kvm.h defines kvm_regs and friends for user access to
    guest register content
    arch/s390/kvm/gaccess.h functions similar to uaccess to access guest memory
    arch/s390/kvm/kvm-s390.h header file for kvm-s390 internals, extended by
    later patches

    Acked-by: Martin Schwidefsky
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Heiko Carstens
    Signed-off-by: Carsten Otte
    Signed-off-by: Avi Kivity

    Heiko Carstens
     

10 Nov, 2007

1 commit

  • Since powerpc started using CONFIG_GENERIC_CLOCKEVENTS, the
    deterministic CPU accounting (CONFIG_VIRT_CPU_ACCOUNTING) has been
    broken on powerpc, because we end up counting user time twice: once in
    timer_interrupt() and once in update_process_times().

    This fixes the problem by pulling the code in update_process_times
    that updates utime and stime into a separate function called
    account_process_tick. If CONFIG_VIRT_CPU_ACCOUNTING is not defined,
    there is a version of account_process_tick in kernel/timer.c that
    simply accounts a whole tick to either utime or stime as before. If
    CONFIG_VIRT_CPU_ACCOUNTING is defined, then arch code gets to
    implement account_process_tick.

    This also lets us simplify the s390 code a bit; it means that the s390
    timer interrupt can now call update_process_times even when
    CONFIG_VIRT_CPU_ACCOUNTING is turned on, and can just implement a
    suitable account_process_tick().

    account_process_tick() now takes the task_struct * as an argument.
    Tested both with and without CONFIG_VIRT_CPU_ACCOUNTING.

    Signed-off-by: Paul Mackerras
    Signed-off-by: Ingo Molnar

    Paul Mackerras
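
The split described above can be outlined as follows. This is a simplified user-space sketch: `task_demo`, the `_demo` function names and the `DEMO_VIRT_CPU_ACCOUNTING` macro are hypothetical, and the remaining per-tick work of update_process_times is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical task with tick-based user and system time counters. */
struct task_demo {
	uint64_t utime;
	uint64_t stime;
};

#ifndef DEMO_VIRT_CPU_ACCOUNTING
/* Generic version: account one whole tick to either utime or stime. */
void account_process_tick_demo(struct task_demo *p, int user_tick)
{
	assert(p);
	if (user_tick)
		p->utime++;
	else
		p->stime++;
}
#else
/* With precise accounting, arch code would supply its own version. */
void account_process_tick_demo(struct task_demo *p, int user_tick);
#endif

/* The common tick path only calls the hook and no longer accounts itself. */
void update_process_times_demo(struct task_demo *p, int user_tick)
{
	assert(p);
	account_process_tick_demo(p, user_tick);
	/* timer and scheduler work omitted in this sketch */
}
```

Since the accounting sits behind one function, a timer interrupt can call the common tick path unconditionally without counting user time twice.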
     

27 Jul, 2007

1 commit


10 Jul, 2007

1 commit

  • sched-cfs-v2.6.22-git-v18.patch introduces CPU_IDLE in sched.h.
    This conflicts with the already existing define in
    include/asm-s390/processor.h.
    Just rename the s390 defines, since they will go away as soon as
    we support CONFIG_NO_HZ instead of our own CONFIG_NO_IDLE_HZ.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

27 Apr, 2007

1 commit


06 Feb, 2007

2 commits

  • This patch adds support for clock synchronization to an external time
    reference (ETR). The external time reference sends an oscillator
    signal and a synchronization signal every 2^20 microseconds to keep
    the TOD clocks of all connected servers in sync. For availability
    two ETR units can be connected to a machine. If the clock deviates
    by more than the sync-check tolerance all cpus get a machine check
    that indicates that the clock is out of sync. For the lovely details
    of how to get the clock back in sync see the code below.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

11 Oct, 2006

1 commit

  • Remove the last few places where a pointer to pt_regs gets passed.
    Also make sure we call set_irq_regs() before irq_enter() and after
    irq_exit(). This doesn't fix anything but makes sure s390 looks the
    same as all other architectures.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

06 Oct, 2006

1 commit


01 Jul, 2006

1 commit


27 Jun, 2006

1 commit

  • acquired (aquired)
    contiguous (contigious)
    successful (succesful, succesfull)
    surprise (suprise)
    whether (weather)
    some other misspellings

    Signed-off-by: Andreas Mohr
    Signed-off-by: Adrian Bunk

    Andreas Mohr
     

15 Jan, 2006

1 commit

  • finish_arch_switch needs to update the user cpu time as well, not just the
    system cpu time. Otherwise the partial user cpu time of a process that is
    stored in the lowcore will be (mis-)accounted to the next process.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     

31 Oct, 2005

1 commit

  • Remove timer_list.magic and associated debugging code.

    I originally added this when a spinlock was added to timer_list - this meant
    that an all-zeroes timer became illegal and init_timer() was required.

    That spinlock isn't even there any more, although timer.base must now be
    initialised.

    I'll keep this debugging code in -mm.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

01 May, 2005

1 commit

  • Fix overflow in calculation of the new tod value in stop_hz_timer and fix
    wrong virtual timer list idle time in case the virtual timer is already
    expired in stop_cpu_timer.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky