01 Dec, 2011

4 commits

  • git commit 20b40a794baf3b4b "signal race with restarting system calls"
    added code to poke_user/poke_user_compat to reset the system call
    restart information in the thread-info if the PSW address is changed.
    The purpose of that change was to work around old gdbs that do
    not know about REGSET_SYSTEM_CALL. It turned out that this is not
    a good idea: it makes the behaviour of the debuggee dependent on the
    order of specific ptrace calls, e.g. the REGSET_SYSTEM_CALL register
    set needs to be written last. And the workaround does not really fix
    old gdbs; inferior calls on interrupted restarting system calls do not
    work either way.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • The last breaking event address is a read-only value, so the regset has
    no .set function. If a PTRACE_SETREGSET is done for NT_S390_LAST_BREAK we
    get an oops due to a branch to zero:

    Kernel BUG at 0000000000000002 verbose debug info unavailable
    illegal operation: 0001 #1 SMP
    ...
    Call Trace:
    ( ptrace_regset+0x184/0x188)
    ptrace_request+0x37a/0x4fc
    arch_ptrace+0x108/0x1fc
    SyS_ptrace+0xaa/0x12c
    sysc_noemu+0x16/0x1c
    0x3fffd5ec10c
    Last Breaking-Event-Address:
    ptrace_regset+0x132/0x188

    Add a nop .set function to prevent the branch to zero.

    Signed-off-by: Martin Schwidefsky
    Cc: stable@kernel.org

    Martin Schwidefsky
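
A minimal sketch of the idea behind the fix above, assuming a simplified descriptor with `get`/`set` function pointers. The names `regset_ops`, `last_break_get`, `last_break_set` and `regset_write` are hypothetical stand-ins, not the kernel's actual definitions. The point it illustrates: a generic caller invokes `.set` unconditionally, so a read-only register set still needs a setter that does nothing.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a regset descriptor. */
struct regset_ops {
    int (*get)(unsigned long *value);
    int (*set)(unsigned long value);   /* a NULL here means a branch to zero */
};

/* Illustrative read-only value; the number is arbitrary. */
static unsigned long last_break = 0x1000UL;

static int last_break_get(unsigned long *value)
{
    *value = last_break;
    return 0;
}

/* No-op setter: the value is read-only, so a write is accepted and ignored. */
static int last_break_set(unsigned long value)
{
    (void)value;
    return 0;
}

static const struct regset_ops last_break_ops = {
    .get = last_break_get,
    .set = last_break_set,
};

/* Generic caller that invokes .set without checking it for NULL. */
static int regset_write(const struct regset_ops *ops, unsigned long value)
{
    return ops->set(value);
}
```

With the no-op setter in place, `regset_write(&last_break_ops, 42)` returns 0 and leaves the stored value unchanged.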
     
  • The TIF_SYSCALL bit needs to be cleared if the debugger changes the state
    of the ptraced process in regard to the presence of a system call.
    Otherwise the system call will be restarted although the debugger set up
    an inferior call.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • In order to have the same behavior for kdump based stand-alone dump
    as for the kexec method, the is_kdump_kernel() check (only true for
    the kexec method) has to be replaced by the OLDMEM_BASE check (true
    for both methods).

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
     

14 Nov, 2011

5 commits


07 Nov, 2011

1 commit

  • …/kernel/git/jeremy/xen

    * 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
    jump-label: initialize jump-label subsystem much earlier
    x86/jump_label: add arch_jump_label_transform_static()
    s390/jump-label: add arch_jump_label_transform_static()
    jump_label: add arch_jump_label_transform_static() to optimise non-live code updates
    sparc/jump_label: drop arch_jump_label_text_poke_early()
    x86/jump_label: drop arch_jump_label_text_poke_early()
    jump_label: if a key has already been initialized, don't nop it out
    stop_machine: make stop_machine safe and efficient to call early
    jump_label: use proper atomic_t initializer

    Conflicts:
    - arch/x86/kernel/jump_label.c
    Added __init_or_module to arch_jump_label_text_poke_early vs
    removal of that function entirely
    - kernel/stop_machine.c
    same patch ("stop_machine: make stop_machine safe and efficient
    to call early") merged twice, with whitespace fix in one version

    Linus Torvalds
     

30 Oct, 2011

25 commits

  • Currently it can happen that the pre-allocated ELF header contains a wrong
    memory map which would result in errors when copying /proc/vmcore.
    In order to still get a valid vmcore, we (temporarily) disable the error
    checking in copy_oldmem_page(). This will then produce zero pages for those
    memory regions.

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
     
  • We use both the external call and emergency call IPIs to signal remote
    cpus. Therefore it makes sense to account them differently within
    /proc/irqstats so we actually know what happened.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Use __force to quiet sparse warnings about user address space.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Fix three sparse warnings in math-emu / sysinfo:

    arch/s390/kernel/sysinfo.c:448:17: error: return expression in void function
    arch/s390/kernel/sysinfo.c:445:25: warning: shift too big (32) for type unsigned int
    arch/s390/kernel/sysinfo.c:445:25: warning: shift too big (32) for type unsigned int

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Add prototypes and includes for functions used in different modules.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Make functions and data static to avoid sparse warnings.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Remove unnecessary code to avoid false positives from sparse, e.g.

    arch/s390/kernel/compat_signal.c:221:61: warning: invalid access past the end of 'set32' (8 8)

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • On sie_fault we need to switch back to user ASCE. Otherwise we get
    interesting effects when exiting to "userspace" while the guest
    space is still active.

    Signed-off-by: Carsten Otte
    Signed-off-by: Martin Schwidefsky

    Carsten Otte
     
  • Use a sigp sense running to decide which signal processor order to use
    for an ipi. If the target cpu is running, use external call; if the
    target cpu is not running, use emergency signal.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
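
The selection rule described above can be sketched as a single decision function. This is only an illustration: `ipi_order` and `choose_ipi_order` are hypothetical names, and the boolean argument stands in for the result of the sigp sense running query.

```c
#include <stdbool.h>

/* Hypothetical labels for the two signal processor orders. */
enum ipi_order {
    IPI_EXTERNAL_CALL,
    IPI_EMERGENCY_SIGNAL,
};

/*
 * target_running stands in for the outcome of "sigp sense running":
 * a running target gets an external call, a non-running target gets
 * an emergency signal.
 */
static enum ipi_order choose_ipi_order(bool target_running)
{
    return target_running ? IPI_EXTERNAL_CALL : IPI_EMERGENCY_SIGNAL;
}
```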
     
  • Add support for CHSC I/O interrupt statistics in /proc/interrupts.

    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • The user space program can change its addressing mode between the
    24-bit, 31-bit and the 64-bit mode if the kernel is 64 bit. Currently
    the kernel always forces the standard amode on signal delivery and
    signal return and on ptrace: 64-bit for a 64-bit process, 31-bit for
    a compat process and 31-bit kernels. Change the signal and ptrace code
    to allow the full range of addressing modes. Signal handlers are
    run in the standard addressing mode for the process.

    One caveat is that even a 31-bit compat process can switch to the
    64-bit mode. The next signal will switch back into the 31-bit mode
    and there is no room in the 31-bit compat signal frame to store the
    information that the program came from the 64-bit mode.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Split out addressing mode bits from PSW_BASE_BITS, rename PSW_BASE_BITS
    to PSW_MASK_BASE, get rid of psw_user32_bits, remove unused function
    enabled_wait(), introduce PSW_MASK_USER, and drop PSW_MASK_MERGE macros.
    Change psw_kernel_bits / psw_user_bits to contain only the bits that
    are always set in the respective mode.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Add an explicit TIF_SYSCALL bit that indicates if a task is inside
    a system call. The svc_code in the pt_regs structure is now only
    valid if TIF_SYSCALL is set. With this definition TIF_RESTART_SVC
    can be replaced with TIF_SYSCALL. Overall do_signal is a bit more
    readable and it saves a few lines of code.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • An instruction with an address right below the address limit for the
    current addressing mode will wrap. The instruction restart logic in
    the protection fault handler and the signal code need to follow the
    wrapping rules to find the correct instruction address.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
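
A small sketch of the wrapping rule mentioned above, assuming the instruction address is rewound by the instruction length and then truncated to the width of the current addressing mode. The names `amode`, `amode_mask` and `rewind_insn_addr` are hypothetical and not taken from the kernel source.

```c
#include <stdint.h>

/* The three addressing modes named in the log. */
enum amode { AMODE_24, AMODE_31, AMODE_64 };

/* Address mask for each mode: 24, 31 or 64 significant bits. */
static uint64_t amode_mask(enum amode mode)
{
    switch (mode) {
    case AMODE_24:
        return UINT64_C(0x0000000000ffffff);
    case AMODE_31:
        return UINT64_C(0x000000007fffffff);
    default:
        return UINT64_MAX;
    }
}

/*
 * Step back by ilen bytes from addr (the address following the
 * instruction) and wrap the result within the current addressing mode.
 */
static uint64_t rewind_insn_addr(uint64_t addr, unsigned int ilen,
                                 enum amode mode)
{
    return (addr - ilen) & amode_mask(mode);
}
```

For example, in 31-bit mode an address of 2 following a 4-byte instruction rewinds to 0x7ffffffe rather than to a negative or 64-bit value.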
     
  • For an ERESTARTNOHAND/ERESTARTSYS/ERESTARTNOINTR restarting system call
    do_signal will prepare the restart of the system call with a rewind of
    the PSW before calling get_signal_to_deliver (where the debugger might
    take control). For an ERESTART_RESTARTBLOCK restarting system call
    do_signal will set -EINTR as return code.
    There are two issues with this approach:
    1) strace never sees ERESTARTNOHAND, ERESTARTSYS, ERESTARTNOINTR or
    ERESTART_RESTARTBLOCK as the rewinding already took place or the
    return code has been changed to -EINTR
    2) if get_signal_to_deliver does not return with a signal to deliver
    the restart via the repeat of the svc instruction is left in place.
    This opens a race if another signal is made pending before the
    system call instruction can be reexecuted. The original system call
    will be restarted even if the second signal would have ended the
    system call with -EINTR.

    These two issues can be solved by dropping the early rewind of the
    system call before get_signal_to_deliver has been called and by using
    the TIF_RESTART_SVC magic to do the restart if no signal has to be
    delivered. The only situation where the system call restart via the
    repeat of the svc instruction is appropriate is when a SA_RESTART
    signal is delivered to user space.

    Unfortunately this breaks inferior calls by the debugger again. The
    system call number and the length of the system call instruction are
    lost over the inferior call and user space will see ERESTARTNOHAND/
    ERESTARTSYS/ERESTARTNOINTR/ERESTART_RESTARTBLOCK. To correct this a
    new ptrace interface is added to save/restore the system call number
    and system call instruction length.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
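
The restart decision described above can be sketched as a pure function of the system call return value and the outcome of signal delivery. This is an illustration under assumptions, not the actual do_signal code: `restart_action` and the `ACTION_*` names are hypothetical, and the treatment of ERESTARTNOINTR when a handler runs follows the generic Linux convention (always restart), which the commit message does not spell out.

```c
#include <stdbool.h>

/* Kernel-internal restart codes, with the values used by Linux. */
enum {
    ERESTARTSYS = 512,
    ERESTARTNOINTR = 513,
    ERESTARTNOHAND = 514,
    ERESTART_RESTARTBLOCK = 516,
};

/* Hypothetical labels for what the signal code does with the syscall. */
enum restart_action {
    ACTION_NONE,          /* not a restarting system call */
    ACTION_RETURN_EINTR,  /* handler runs, syscall reports -EINTR */
    ACTION_REWIND_SVC,    /* handler runs, svc instruction is repeated */
    ACTION_RESTART_FLAG,  /* no signal delivered, restart via the TIF bit */
};

static enum restart_action restart_action(int retval, bool handler_delivered,
                                          bool sa_restart)
{
    switch (retval) {
    case -ERESTART_RESTARTBLOCK:
    case -ERESTARTNOHAND:
        return handler_delivered ? ACTION_RETURN_EINTR : ACTION_RESTART_FLAG;
    case -ERESTARTSYS:
        if (!handler_delivered)
            return ACTION_RESTART_FLAG;
        return sa_restart ? ACTION_REWIND_SVC : ACTION_RETURN_EINTR;
    case -ERESTARTNOINTR:
        return handler_delivered ? ACTION_REWIND_SVC : ACTION_RESTART_FLAG;
    default:
        return ACTION_NONE;
    }
}
```

Because nothing is rewound before the decision is made, a tracer still sees the original restart code, and a second pending signal is evaluated against the unmodified state.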
     
  • Remove the save_area_64 field from the 0xe00 - 0xf00 area in the lowcore.
    Use a free slot in the save_area array instead.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • This patch implements the crash_map_pages() function for s390.
    KEXEC_CRASH_MEM_ALIGN is set to HPAGE_SIZE, in order to support
    kernel mappings that use large pages. We also use HPAGE_SIZE alignment
    for CONFIG_HUGETLB_PAGE=n in order to have the same 1 MiB alignment on
    all s390 systems.

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
     
  • This patch defines for s390 an ABI-defined pointer to the vmcoreinfo note at
    a well-known address. With this patch, tools are able to find this information
    in dumps created by stand-alone or hypervisor dump tools.

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
     
  • This patch provides the architecture specific part of the s390 kdump
    support.

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
     
  • PSW restart can be triggered on offline CPUs. If this happens, currently
    the PSW restart code fails, because functions like smp_processor_id()
    do not work on offline CPUs. This patch fixes this as follows:

    If PSW restart is triggered on an offline CPU, the PSW restart (sigp restart)
    is done a second time on another CPU that is online and the old CPU is
    stopped afterwards.

    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
     
  • Use the ENTRY macro for the system call wrapper sys_setns_wrapper
    similarly to the other wrappers.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky

    Jan Glauber
     
  • git commit 5e9a2692 "[S390] ptrace cleanup" introduced a regression
    for the case when both a user PER set (e.g. a storage alteration trace) and
    PTRACE_SINGLESTEP are active. The new code will overrule the user PER set
    with an instruction-fetch PER set over the whole address space for ptrace
    single stepping. The inferior process will be stopped after each instruction
    with an instruction fetch event. Any other events that may have occurred
    concurrently are not reported (e.g. storage alteration event) because the
    control bits for them are not set. The solution is to merge the PER control
    bits of the user PER set with the PER_EVENT_IFETCH control bit for
    PTRACE_SINGLESTEP.

    Cc: stable@kernel.org
    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
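
A sketch of the merge described above, limited to the control bits. The function name `merge_per_control` is hypothetical, the bit values are given for illustration only, and the handling of the PER start/end address range is deliberately left out.

```c
#include <stdbool.h>

/* Illustrative PER event bits; treat the exact values as assumptions. */
#define PER_EVENT_IFETCH 0x40000000UL
#define PER_EVENT_STORE  0x20000000UL

/*
 * Combine the user's PER control bits with the instruction-fetch event
 * when single stepping is active, instead of replacing them.
 */
static unsigned long merge_per_control(unsigned long user_control,
                                       bool single_step)
{
    unsigned long control = user_control;

    if (single_step)
        control |= PER_EVENT_IFETCH;
    return control;
}
```

With a storage alteration trace active, single stepping then yields both event bits, so a concurrent storage alteration event is still reported.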
     
  • Fix this warning:
    WARNING: vmlinux.o(.text+0x199b6): Section mismatch in reference from
    the function alloc_masks() to the function .init.text:__alloc_bootmem()

    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • The .start function and indirectly the .next function of the show_cpuinfo
    sequential operation use NR_CPUS as limit instead of nr_cpu_ids.
    This can cause warnings like this:

    WARNING: at /usr/src/linux/include/linux/cpumask.h:107
    Process lscpu (pid: 575, task: 000000007deb4338, ksp: 000000007794f588)
    Krnl PSW : 0704000180000000 0000000000106db4 (show_cpuinfo+0x108/0x234)
    R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:0 PM:0 EA:3
    Krnl GPRS: 0000000000000003 0000000000791988 000000000071b478 0000000000000004
    0000000000000001 0000000000000000 000000007d139500 0000000000000400
    0000000000000000 000000000070e24c 000000007d48d600 0000000000000005
    000000007d48d600 00000000004dfa10 0000000000106cf8 000000007794fcc0
    Krnl Code: 0000000000106da8: 95001000 cli 0(%r1),0
    0000000000106dac: a774ffac brc 7,106d04
    0000000000106db0: a7f40001 brc 15,106db2
    >0000000000106db4: 92011000 mvi 0(%r1),1
    0000000000106db8: a7f4ffa6 brc 15,106d04
    0000000000106dbc: c0e5000065b4 brasl %r14,113924
    0000000000106dc2: c09000303a45 larl %r9,70e24c
    0000000000106dc8: c020001eefd4 larl %r2,4e4d70

    Replacing NR_CPUS with nr_cpu_ids fixes it.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
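
A minimal sketch of the bound check described above, assuming a start function that maps a position to an opaque iterator cookie. `c_start`, the value of `NR_CPUS` and the value of `nr_cpu_ids` are hypothetical here and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 1024                  /* hypothetical compile-time maximum */
static unsigned long nr_cpu_ids = 4;  /* hypothetical runtime number of ids */

/*
 * Return a non-NULL cookie for valid positions and NULL at the end.
 * The cookie is pos + 1 so that position 0 is distinguishable from NULL.
 * Bounding by nr_cpu_ids keeps the iteration away from cpu numbers that
 * can never exist on this system.
 */
static void *c_start(unsigned long pos)
{
    return pos < nr_cpu_ids ? (void *)(uintptr_t)(pos + 1) : NULL;
}
```

With these values the iteration stops at position 4, whereas a bound of NR_CPUS would keep going up to position 1023.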
     
  • Current IRQ statistics support does not show detail counts for I/O
    interrupts which are processed internally only. The result is a
    summation count which is way off such as this one:

           CPU0   CPU1   CPU2
    I/O:   1331    710    442
    [...]
    QAI:     15     16     16  [I/O] QDIO Adapter Interrupt
    QDI:      1      0      0  [I/O] QDIO Interrupt
    DAS:    706    645    381  [I/O] DASD
    C15:     26     10      0  [I/O] 3215
    C70:      0      0      0  [I/O] 3270
    TAP:      0      0      0  [I/O] Tape
    VMR:      0      0      0  [I/O] Unit Record Devices
    LCS:      0      0      0  [I/O] LCS
    CLW:      0      0      0  [I/O] CLAW
    CTC:      0      0      0  [I/O] CTC
    APB:      0      0      0  [I/O] AP Bus

    Fix this by moving I/O interrupt accounting into the common I/O layer.

    Signed-off-by: Peter Oberparleiter
    Signed-off-by: Martin Schwidefsky

    Peter Oberparleiter
     

26 Oct, 2011

2 commits

  • * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    time, s390: Get rid of compile warning
    dw_apb_timer: constify clocksource name
    time: Cleanup old CONFIG_GENERIC_TIME references that snuck in
    time: Change jiffies_to_clock_t() argument type to unsigned long
    alarmtimers: Fix error handling
    clocksource: Make watchdog reset lockless
    posix-cpu-timers: Cure SMP accounting oddities
    s390: Use direct ktime path for s390 clockevent device
    clockevents: Add direct ktime programming function
    clockevents: Make minimum delay adjustments configurable
    nohz: Remove "Switched to NOHz mode" debugging messages
    proc: Consider NO_HZ when printing idle and iowait times
    nohz: Make idle/iowait counter update conditional
    nohz: Fix update_ts_time_stat idle accounting
    cputime: Clean up cputime_to_usecs and usecs_to_cputime macros
    alarmtimers: Rework RTC device selection using class interface
    alarmtimers: Add try_to_cancel functionality
    alarmtimers: Add more refined alarm state tracking
    alarmtimers: Remove period from alarm structure
    alarmtimers: Remove interval cap limit hack
    ...

    Linus Torvalds
     
  • This allows jump-label entries to be cheaply updated on code which is
    not yet live.

    Signed-off-by: Jeremy Fitzhardinge
    Acked-by: Jason Baron
    Acked-by: Peter Zijlstra
    Cc: Jan Glauber

    Jeremy Fitzhardinge
     

17 Oct, 2011

1 commit

  • For s390 there is one additional byte associated with each page,
    the storage key. This byte contains the referenced and changed
    bits and needs to be included into the hibernation image.
    If the storage keys are not restored to their previous state all
    original pages would appear to be dirty. This can cause
    inconsistencies e.g. with read-only filesystems.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Rafael J. Wysocki

    Martin Schwidefsky
     

12 Oct, 2011

1 commit

  • "s390: Use direct ktime path for s390 clockevent device" in linux-next
    introduces this compile warning:

    arch/s390/kernel/time.c: In function 's390_next_ktime':
    arch/s390/kernel/time.c:118:2: warning:
    comparison of distinct pointer types lacks a cast [enabled by default]

    Just use a u64 instead of an s64 variable. This is not a problem since it
    will always contain a positive value.

    Signed-off-by: Heiko Carstens
    Cc: Martin Schwidefsky
    Link: http://lkml.kernel.org/r/1316675957-5538-1-git-send-email-heiko.carstens@de.ibm.com
    Signed-off-by: Thomas Gleixner

    Heiko Carstens
     

20 Sep, 2011

1 commit

  • 598841ca9919d008b520114d8a4378c4ce4e40a1 ([S390] use gmap address
    spaces for kvm guest images) changed kvm to use a separate address
    space for kvm guests. This address space was switched in __vcpu_run.
    In some cases (preemption, page fault) there is the possibility that
    this address space switch is lost.
    The typical symptom was a large number of validity intercepts or
    random guest addressing exceptions.
    Fix this by doing the switch in sie_loop and sie_exit and saving the
    address space in the gmap structure itself. Also use the preempt
    notifier.

    Signed-off-by: Christian Borntraeger
    Acked-by: Avi Kivity
    Signed-off-by: Heiko Carstens

    Christian Borntraeger