31 Jan, 2019

3 commits

  • commit 60f1bf29c0b2519989927cae640cd1f50f59dc7f upstream.

    When calling smp_call_ipl_cpu() from the IPL CPU, we will try to read
    from pcpu_devices->lowcore. However, due to prefixing, that will result
    in reading from absolute address 0 on that CPU. We have to go via the
    actual lowcore instead.

    This means that right now, we will read lc->nodat_stack == 0 and
    therefore work on a very wrong stack.

    This BUG essentially broke rebooting under QEMU TCG (which will report
    a low address protection exception). It is also broken under KVM, where
    it can be easily triggered with 1 VCPU.

    :/# echo 1 > /proc/sys/kernel/sysrq
    :/# echo b > /proc/sysrq-trigger
    [ 28.476745] sysrq: SysRq : Resetting
    [ 28.476793] Kernel stack overflow.
    [ 28.476817] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
    [ 28.476820] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
    [ 28.476826] Krnl PSW : 0400c00180000000 0000000000115c0c (pcpu_delegate+0x12c/0x140)
    [ 28.476861] R:0 T:1 IO:0 EX:0 Key:0 M:0 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
    [ 28.476863] Krnl GPRS: ffffffffffffffff 0000000000000000 000000000010dff8 0000000000000000
    [ 28.476864] 0000000000000000 0000000000000000 0000000000ab7090 000003e0006efbf0
    [ 28.476864] 000000000010dff8 0000000000000000 0000000000000000 0000000000000000
    [ 28.476865] 000000007fffc000 0000000000730408 000003e0006efc58 0000000000000000
    [ 28.476887] Krnl Code: 0000000000115bfe: 4170f000 la %r7,0(%r15)
    [ 28.476887] 0000000000115c02: 41f0a000 la %r15,0(%r10)
    [ 28.476887] #0000000000115c06: e370f0980024 stg %r7,152(%r15)
    [ 28.476887] >0000000000115c0c: c0e5fffff86e brasl %r14,114ce8
    [ 28.476887] 0000000000115c12: 41f07000 la %r15,0(%r7)
    [ 28.476887] 0000000000115c16: a7f4ffa8 brc 15,115b66
    [ 28.476887] 0000000000115c1a: 0707 bcr 0,%r7
    [ 28.476887] 0000000000115c1c: 0707 bcr 0,%r7
    [ 28.476901] Call Trace:
    [ 28.476902] Last Breaking-Event-Address:
    [ 28.476920] [] arch_call_rest_init+0x22/0x80
    [ 28.476927] Kernel panic - not syncing: Corrupt kernel stack, can't continue.
    [ 28.476930] CPU: 0 PID: 424 Comm: sh Not tainted 5.0.0-rc1+ #13
    [ 28.476932] Hardware name: IBM 2964 NE1 716 (KVM/Linux)
    [ 28.476932] Call Trace:

    Fixes: 2f859d0dad81 ("s390/smp: reduce size of struct pcpu")
    Cc: stable@vger.kernel.org # 4.0+
    Reported-by: Cornelia Huck
    Signed-off-by: David Hildenbrand
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    David Hildenbrand
     
  • commit b7cb707c373094ce4008d4a6ac9b6b366ec52da5 upstream.

    smp_rescan_cpus() is called without the device_hotplug_lock, which can lead
    to a deadlock when a new CPU is found and immediately set online by a udev
    rule.

    This was observed on an older kernel version, where the cpu_hotplug_begin()
    loop was still present, and it resulted in hanging chcpu and systemd-udev
    processes. This specific deadlock will not show on current kernels. However,
    there may be other possible deadlocks, and since smp_rescan_cpus() can still
    trigger a CPU hotplug operation, the device_hotplug_lock should be held.

    For reference, this was the deadlock with the old cpu_hotplug_begin() loop:

    chcpu (rescan)                     systemd-udevd

    echo 1 > /sys/../rescan
     -> smp_rescan_cpus()
     -> (*) get_online_cpus()
            (increases refcount)
     -> smp_add_present_cpu()
            (new CPU found)
     -> register_cpu()
     -> device_add()
     -> udev "add" event triggered ----------> udev rule sets CPU online
                                                -> echo 1 > /sys/.../online
                                                -> lock_device_hotplug_sysfs()
                                                   (this is missing in rescan path)
                                                -> device_online()
                                                   -> (**) device_lock(new CPU dev)
                                                   -> cpu_up()
                                                      -> cpu_hotplug_begin()
                                                         (loops until refcount == 0)
                                                         -> deadlock with (*)
     -> bus_probe_device()
     -> device_attach()
        -> device_lock(new CPU dev)
           -> deadlock with (**)

    Fix this by taking the device_hotplug_lock in the CPU rescan path.

    Cc:
    Signed-off-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Gerald Schaefer
     
  • commit 03aa047ef2db4985e444af6ee1c1dd084ad9fb4c upstream.

    Right now the early machine detection code checks stsi 3.2.2 for "KVM"
    and sets MACHINE_IS_VM if the value is anything else. As the console
    detection uses diagnose 8 when MACHINE_IS_VM is true, this will crash
    Linux early on any non z/VM system that sets a value other than KVM.
    So instead of assuming z/VM, do not set any of MACHINE_IS_LPAR,
    MACHINE_IS_VM, or MACHINE_IS_KVM.

    CC: stable@vger.kernel.org
    Reviewed-by: Heiko Carstens
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Christian Borntraeger
     

17 Dec, 2018

1 commit

  • [ Upstream commit 613a41b0d16e617f46776a93b975a1eeea96417c ]

    On s390 the perf top command fails:
    [root@s35lp76 perf] # ./perf top -F100000 --stdio
    Error:
    cycles: PMU Hardware doesn't support sampling/overflow-interrupts.
    Try 'perf stat'
    [root@s35lp76 perf] #

    Using event -e rb0000 works as designed. Event rb0000 is the event
    number of the sampling facility for basic sampling.

    During system start up the following PMUs are installed in the kernel's
    PMU list (from head to tail):
    cpum_cf --> s390 PMU counter facility device driver
    cpum_sf --> s390 PMU sampling facility device driver
    uprobe
    kprobe
    tracepoint
    task_clock
    cpu_clock

    Perf top executes the following functions and calls the perf_event_open(2)
    system call with different parameters many times:

    cmd_top
    --> __cmd_top
        --> perf_evlist__add_default
            --> __perf_evlist__add_default
                --> perf_evlist__new_cycles
                    (creates event type:0 (HW), config 0 (CPU_CYCLES))
                    --> perf_event_attr__set_max_precise_ip
                        Uses perf_event_open(2) to detect the correct
                        precise_ip level. Fails 3 times on s390, which is ok.

    Then:

    cmd_top
    --> __cmd_top
        --> perf_top__start_counters
            --> perf_evlist__config
                --> perf_can_comm_exec
                    --> perf_probe_api
                        This function tests support for the events
                        "cycles:u", "instructions:u" and "cpu-clock:u" using
                        --> perf_do_probe_api
                            --> perf_event_open_cloexec
                                Tests close-on-exec flag support with
                                perf_event_open(2).
                        perf_do_probe_api returns true if the event is
                        supported; it returns true here because event
                        cpu-clock is supported by the PMU cpu_clock.
                        This is achieved by many calls to perf_event_open(2).

    Function perf_top__start_counters now calls perf_evsel__open() for every
    event, which is the default event cpu_cycles (config:0) of type HARDWARE
    (type:0) with a predefined frequency of 4000.

    Given the above order of the PMU list, the PMU cpum_cf gets called first
    and returns 0, which indicates support for this sampling event. The event
    is fully allocated in function perf_event_open() (file kernel/events/core.c,
    near line 10521) and the following check fails:

    event = perf_event_alloc(&attr, cpu, task, group_leader, NULL,
                             NULL, NULL, cgroup_fd);
    if (IS_ERR(event)) {
            err = PTR_ERR(event);
            goto err_cred;
    }

    if (is_sampling_event(event)) {
            if (event->pmu->capabilities & PERF_PMU_CAP_NO_INTERRUPT) {
                    err = -EOPNOTSUPP;
                    goto err_alloc;
            }
    }

    The check for the interrupt capabilities fails and the system call
    perf_event_open() returns -EOPNOTSUPP (-95).

    Add a check to return -ENODEV when sampling is requested in PMU cpum_cf.
    This allows common kernel code in the perf_event_open() system call to
    test the next PMU in the above list.

    Fixes: 97b1198fece0 ("s390, perf: Use common PMU interrupt disabled code")
    Signed-off-by: Thomas Richter
    Reviewed-by: Hendrik Brueckner
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin

    Thomas Richter
     

27 Nov, 2018

2 commits

  • [ Upstream commit 0bb2ae1b26e1fb7543ec7474cdd374ac4b88c4da ]

    The function perf_init_event() creates a new event and
    assigns it to a PMU. This is done in a loop over all existing
    PMUs. For each listed PMU the event init function is called,
    and if this function returns any error other than -ENOENT,
    the loop is terminated and the creation of the event fails.

    If the event is invalid, return -ENOENT to try other PMUs.

    Signed-off-by: Thomas Richter
    Reviewed-by: Hendrik Brueckner
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin

    Thomas Richter
     
  • [ Upstream commit b44b136a3773d8a9c7853f8df716bd1483613cbb ]

    According to Documentation/kbuild/makefiles.txt all build targets using
    if_changed should use FORCE as well. Add the missing FORCE to make sure
    vdso targets are rebuilt properly, not only when immediate prerequisites
    have changed but also when the build command differs.
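    As a sketch (target and command names are illustrative of the s390 vdso
    Makefiles, not copied from them), the rule pattern the commit describes
    looks like:

```make
# FORCE is always considered out of date, so the rule always runs;
# if_changed then re-executes the command only when the saved command
# line differs from the current one or a prerequisite is newer.
$(obj)/vdso64.so.dbg: $(src)/vdso64.lds $(obj-vdso64) FORCE
	$(call if_changed,ld)

# kbuild's FORCE idiom:
PHONY += FORCE
FORCE:
```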

    Reviewed-by: Philipp Rudo
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin

    Vasily Gorbik
     

04 Oct, 2018

1 commit

  • [ Upstream commit 9f35b818a2f90fb6cb291aa0c9f835d4f0974a9a ]

    Get rid of this compile warning for !PROC_FS:

    CC arch/s390/kernel/sysinfo.o
    arch/s390/kernel/sysinfo.c:275:12: warning: 'sysinfo_show' defined but not used [-Wunused-function]
    static int sysinfo_show(struct seq_file *m, void *v)

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Heiko Carstens
     

15 Sep, 2018

1 commit

  • [ Upstream commit 2d2e7075b87181ed0c675e4936e20bdadba02e1f ]

    The vmcoreinfo of a crashed system is potentially fragmented. Thus the
    crash kernel has an intermediate step where the vmcoreinfo is copied into
    a temporary, contiguous buffer in the crash kernel memory. This temporary
    buffer is never freed. Free it now to prevent the memory leak.

    While at it replace all occurrences of "VMCOREINFO" with the
    corresponding macro to prevent potential renaming issues.

    Signed-off-by: Philipp Rudo
    Acked-by: Heiko Carstens
    Signed-off-by: Heiko Carstens
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Philipp Rudo
     

11 Jul, 2018

1 commit

  • commit 891f6a726cacbb87e5b06076693ffab53bd378d7 upstream.

    In the critical section cleanup we must not mess with r1. For march=z9
    or older, larl + ex (instead of exrl) are used with r1 as a temporary
    register. This can clobber r1 in several interrupt handlers. Fix this by
    using r11 as a temp register. r11 is being saved by all callers of
    cleanup_critical.

    Fixes: 6dd85fbb87 ("s390: move expoline assembler macros to a header")
    Cc: stable@vger.kernel.org #v4.16
    Reported-by: Oliver Kurz
    Reported-by: Petr Tesařík
    Signed-off-by: Christian Borntraeger
    Reviewed-by: Hendrik Brueckner
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Christian Borntraeger
     

25 May, 2018

5 commits

  • [ Upstream commit 6deaa3bbca804b2a3627fd685f75de64da7be535 ]

    The BPF JIT uses a 'b (%r)' instruction in the definition
    of the sk_load_word and sk_load_half functions.

    Add support for branch-on-condition instructions contained in the
    thunk code of an expoline.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit 4253b0e0627ee3461e64c2495c616f1c8f6b127b ]

    The nospec-branch.c file is compiled without the gcc options to
    generate expoline thunks. The return branch of the sysfs show
    functions cpu_show_spectre_v1 and cpu_show_spectre_v2 is an indirect
    branch as well. These need to be compiled with expolines.

    Move the sysfs functions for spectre reporting to a separate file
    and lose a '.' in one of the messages.

    Cc: stable@vger.kernel.org # 4.16
    Fixes: d424986f1d ("s390: add sysfs attributes for spectre")
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit c50c84c3ac4d5db683904bdb3257798b6ef980ae ]

    The assembler code in arch/s390/kernel uses a few more indirect branches
    which need to be done with execute trampolines for CONFIG_EXPOLINE=y.

    Cc: stable@vger.kernel.org # 4.16
    Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
    Reviewed-by: Hendrik Brueckner
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit 23a4d7fd34856da8218c4cfc23dba7a6ec0a423a ]

    The return from the ftrace_stub, _mcount, ftrace_caller and
    return_to_handler functions is done with "br %r14" and "br %r1".
    These are indirect branches as well and need to use execute
    trampolines for CONFIG_EXPOLINE=y.

    The ftrace_caller function is a special case as it returns to the
    start of a function and may only use %r0 and %r1. For a pre z10
    machine the standard execute trampoline uses a LARL + EX to do
    this, but this requires *two* registers in the range %r1..%r15.
    To get around this the 'br %r1' located in the lowcore is used,
    then the EX instruction does not need an address register.
    But the lowcore trick may only be used for pre z14 machines,
    with noexec=on the mapping for the first page may not contain
    instructions. The solution for that is an ALTERNATIVE in the
    expoline THUNK generated by 'GEN_BR_THUNK %r1' to switch to
    EXRL, this relies on the fact that a machine that supports
    noexec=on has EXRL as well.

    Cc: stable@vger.kernel.org # 4.16
    Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit 6dd85fbb87d1d6b87a3b1f02ca28d7b2abd2e7ba ]

    To be able to use the expoline branches in different assembler
    files move the associated macros from entry.S to a new header
    nospec-insn.h.

    While we are at it make the macros a bit nicer to use.

    Cc: stable@vger.kernel.org # 4.16
    Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     

23 May, 2018

2 commits

  • commit 9f18fff63cfd6f559daa1eaae60640372c65f84b upstream.

    The inline assembly to call __do_softirq on the irq stack uses
    an indirect branch. This can be replaced with a normal relative
    branch.

    Cc: stable@vger.kernel.org # 4.16
    Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
    Reviewed-by: Hendrik Brueckner
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • commit 4bbaf2584b86b0772413edeac22ff448f36351b1 upstream.

    Correct a trinity finding for the perf_event_open() system call with
    a perf event attribute structure that uses a frequency but has the
    sampling frequency set to zero. This causes a FP divide exception during
    the sample rate initialization for the hardware sampling facility.

    Fixes: 8c069ff4bd606 ("s390/perf: add support for the CPU-Measurement Sampling Facility")
    Cc: stable@vger.kernel.org # 3.14+
    Reviewed-by: Heiko Carstens
    Signed-off-by: Hendrik Brueckner
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Hendrik Brueckner
     

29 Apr, 2018

17 commits

  • commit 783c3b53b9506db3e05daacfe34e0287eebb09d8 upstream.

    Implement s390 specific arch_uretprobe_is_alive() to avoid SIGSEGVs
    observed with uretprobes in combination with setjmp/longjmp.

    See commit 2dea1d9c38e4 ("powerpc/uprobes: Implement
    arch_uretprobe_is_alive()") for more details.

    With this implemented all test cases referenced in the above commit
    pass.

    Reported-by: Ziqian SUN
    Cc: # v4.3+
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Heiko Carstens
     
  • [ Upstream commit 6cf09958f32b9667bb3ebadf74367c791112771b ]

    The main linker script vmlinux.lds.S for the kernel image merges
    the expoline code patch tables into two sections ".nospec_call_table"
    and ".nospec_return_table". This is *not* done for modules; there the
    sections retain their original names as generated by gcc:
    ".s390_indirect_call", ".s390_return_mem" and ".s390_return_reg".

    The module_finalize code has to check for the compiler generated
    section names, otherwise no code patching is done. This slows down
    the module code in case of "spectre_v2=off".

    Cc: stable@vger.kernel.org # 4.16
    Fixes: f19fbd5ed6 ("s390: introduce execute-trampolines for branches")
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit 6a3d1e81a434fc311f224b8be77258bafc18ccc6 ]

    With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via
    early_initcall is done *after* the early_param functions. This
    overwrites any settings done with the nobp/no_spectre_v2/spectre_v2
    parameters. The code patching for the kernel is done after the
    evaluation of the early parameters but before the early_initcall
    is done. The end result is a kernel image that is patched correctly
    but the kernel modules are not.

    Make sure that the nospec auto detection function is called before the
    early parameters are evaluated and before the code patching is done.

    Fixes: 6e179d64126b ("s390: add automatic detection of the spectre defense")
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit d424986f1d6b16079b3231db0314923f4f8deed1 ]

    Set CONFIG_GENERIC_CPU_VULNERABILITIES and provide the two functions
    cpu_show_spectre_v1 and cpu_show_spectre_v2 to report the spectre
    mitigations.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit bc035599718412cfba9249aa713f90ef13f13ee9 ]

    Add a boot message if either of the spectre defenses is active.
    The message is
    "Spectre V2 mitigation: execute trampolines."
    or "Spectre V2 mitigation: limited branch prediction."

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit 6e179d64126b909f0b288fa63cdbf07c531e9b1d ]

    Automatically decide between nobp vs. expolines if the spectre_v2=auto
    kernel parameter is specified or CONFIG_EXPOLINE_AUTO=y is set.

    The decision made at boot time due to CONFIG_EXPOLINE_AUTO=y being set
    can be overruled with the nobp, nospec and spectre_v2 kernel parameters.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit b2e2f43a01bace1a25bdbae04c9f9846882b727a ]

    Keep the code for the nobp parameter handling with the code for
    expolines. Both are related to the spectre v2 mitigation.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit d3f468963cd6fd6d2aa5e26aed8b24232096d0e1 ]

    When a system call is interrupted we might call the critical section
    cleanup handler that re-does some of the operations. When we are between
    .Lsysc_vtime and .Lsysc_do_svc we might also redo the saving of the
    problem state registers r0-r7:

    .Lcleanup_system_call:
            [...]
    0:      # update accounting time stamp
            mvc     __LC_LAST_UPDATE_TIMER(8),__LC_SYNC_ENTER_TIMER
            # set up saved register r11
            lg      %r15,__LC_KERNEL_STACK
            la      %r9,STACK_FRAME_OVERHEAD(%r15)
            stg     %r9,24(%r11)            # r11 pt_regs pointer
            # fill pt_regs
            mvc     __PT_R8(64,%r9),__LC_SAVE_AREA_SYNC
    --->    stmg    %r0,%r7,__PT_R0(%r9)

    The problem is now, that we might have already zeroed out r0.
    The fix is to move the zeroing of r0 after sysc_do_svc.

    Reported-by: Farhan Ali
    Fixes: 7041d28115e91 ("s390: scrub registers on kernel entry and KVM exit")
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Christian Borntraeger
     
  • [ Upstream commit d5feec04fe578c8dbd9e2e1439afc2f0af761ed4 ]

    The system call path can be interrupted before the switch back to the
    standard branch prediction with BPENTER has been done. The critical
    section cleanup code skips forward to .Lsysc_do_svc and bypasses the
    BPENTER. In this case the kernel and all subsequent code will run with
    the limited branch prediction.

    Fixes: eacf67eb9b32 ("s390: run user space and KVM guests with modified branch prediction")
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit 2cb370d615e9fbed9e95ed222c2c8f337181aa90 ]

    I've accidentally stumbled upon the IS_ENABLED(EXPOLINE_*) lines, which
    obviously always evaluate to false. Fix this.

    Fixes: f19fbd5ed642 ("s390: introduce execute-trampolines for branches")
    Signed-off-by: Eugeniu Rosca
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Eugeniu Rosca
     
  • [ Upstream commit f19fbd5ed642dc31c809596412dab1ed56f2f156 ]

    Add CONFIG_EXPOLINE to enable the use of the new -mindirect-branch= and
    -mfunction-return= compiler options to create a kernel fortified against
    the spectre v2 attack.

    With CONFIG_EXPOLINE=y all indirect branches will be issued with an
    execute type instruction. For z10 or newer the EXRL instruction will
    be used, for older machines the EX instruction. The typical indirect
    call

    basr %r14,%r1

    is replaced with a PC relative call to a new thunk

    brasl %r14,__s390x_indirect_jump_r1

    The thunk contains the EXRL/EX instruction to the indirect branch

    __s390x_indirect_jump_r1:
    exrl 0,0f
    j .
    0: br %r1

    The detour via the execute type instruction has a performance impact.
    To get rid of the detour the new kernel parameters "nospectre_v2" and
    "spectre_v2=[on,off,auto]" can be used. If a parameter is specified,
    the kernel and module code will be patched at runtime.

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit 6b73044b2b0081ee3dd1cd6eaab7dee552601efb ]

    Define TIF_ISOLATE_BP and TIF_ISOLATE_BP_GUEST and add the necessary
    plumbing in entry.S to be able to run user space and KVM guests with
    limited branch prediction.

    To switch a user space process to limited branch prediction the
    s390_isolate_bp() function has to be called; to run a vCPU of a KVM
    guest associated with the current task with limited branch prediction,
    call s390_isolate_bp_guest().

    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit d768bd892fc8f066cd3aa000eb1867bcf32db0ee ]

    Add the PPA instruction to the system entry and exit path to switch
    the kernel to a different branch prediction behaviour. The instructions
    are added via CPU alternatives and can be disabled with the "nospec"
    or the "nobp=0" kernel parameter. If the default behaviour selected
    with CONFIG_KERNEL_NOBP is set to "n" then the "nobp=1" parameter can be
    used to enable the changed kernel branch prediction.

    Acked-by: Cornelia Huck
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit cf1489984641369611556bf00c48f945c77bcf02 ]

    To be able to switch off specific CPU alternatives with kernel parameters
    make a copy of the facility bit mask provided by STFLE and use the copy
    for the decision to apply an alternative.

    Reviewed-by: David Hildenbrand
    Reviewed-by: Cornelia Huck
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit 7041d28115e91f2144f811ffe8a195c696b1e1d0 ]

    Clear all user space registers on entry to the kernel and all KVM guest
    registers on KVM guest exit if the register does not contain either a
    parameter or a result value.

    Reviewed-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Martin Schwidefsky
     
  • [ Upstream commit 049a2c2d486e8cc82c5cd79fa479c5b105b109e9 ]

    Remove the CPU_ALTERNATIVES config option and enable the code
    unconditionally. The config option was only added to avoid a conflict
    with the named saved segment support. Since that code is gone there is
    no reason to keep the CPU_ALTERNATIVES config option.

    Just enable it unconditionally to also reduce the number of config
    options and make it less likely that something breaks.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Heiko Carstens
     
  • [ Upstream commit 686140a1a9c41d85a4212a1c26d671139b76404b ]

    Implement CPU alternatives, which allow optionally patching newer
    instructions at runtime, based on CPU facility availability.

    A new kernel boot parameter "noaltinstr" disables patching.

    The current implementation is derived from the x86 alternatives, although
    ideal instruction padding (when altinstr is longer than oldinstr) is
    added at compile time, and no oldinstr nops optimization has to be done
    at runtime. Also a couple of compile-time sanity checks are done:
    1. oldinstr and altinstr must be
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Vasily Gorbik
     

19 Apr, 2018

1 commit

  • commit 15deb080a6087b73089139569558965750e69d67 upstream.

    When loadparm is set in the reipl parm block, the kernel should also set
    the DIAG308_FLAGS_LP_VALID flag.

    This fixes loadparm being ignored during z/VM fcp -> ccw reipl and KVM
    direct boot -> ccw reipl.

    Cc:
    Reviewed-by: Heiko Carstens
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Vasily Gorbik
     

22 Feb, 2018

1 commit

  • commit 6dd0d2d22aa363fec075cb2577ba273ac8462e94 upstream.

    For some reason, the implementation of some 16-bit ID system calls
    (namely, setuid16/setgid16 and setfsuid16/setfsgid16) used a type cast
    instead of the low2highgid/low2highuid macros for converting [GU]IDs,
    which led to incorrect handling of the value -1 (which ought to be
    considered invalid).

    Discovered by strace test suite.

    Cc: stable@vger.kernel.org
    Signed-off-by: Eugene Syromiatnikov
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Eugene Syromiatnikov
     

04 Feb, 2018

1 commit

  • [ Upstream commit 38389ec84e835fa31a59b7dabb18343106a6d0d5 ]

    Commit 1887aa07b676
    ("s390/topology: add detection of dedicated vs shared CPUs")
    introduced the following compiler error when CONFIG_SCHED_TOPOLOGY is
    not set:

    CC arch/s390/kernel/smp.o
    ...
    arch/s390/kernel/smp.c: In function ‘smp_start_secondary’:
    arch/s390/kernel/smp.c:812:6: error: implicit declaration of function
    ‘topology_cpu_dedicated’; did you mean ‘topology_cpu_init’?

    This patch fixes the compiler error by adding function
    topology_cpu_dedicated() to return false when this config option is
    not defined.

    Signed-off-by: Thomas Richter
    Reviewed-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Thomas Richter
     

20 Dec, 2017

1 commit

  • commit bdcf0a423ea1c40bbb40e7ee483b50fc8aa3d758 upstream.

    In testing, we found that nfsd threads may call set_groups in parallel
    for the same entry cached in auth.unix.gid, racing in the call of
    groups_sort, corrupting the groups for that entry and leading to
    permission denials for the client.

    This patch:
    - Make groups_sort globally visible.
    - Move the call to groups_sort to the modifiers of group_info
    - Remove the call to groups_sort from set_groups

    Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com
    Signed-off-by: Thiago Rafael Becker
    Reviewed-by: Matthew Wilcox
    Reviewed-by: NeilBrown
    Acked-by: "J. Bruce Fields"
    Cc: Al Viro
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Thiago Rafael Becker
     

14 Dec, 2017

1 commit

  • commit e779498df587dd2189b30fe5b9245aefab870eb8 upstream.

    When wiring up the socket system calls the compat entries were
    incorrectly set. Not all of them point to the corresponding compat
    wrapper functions, which clear the upper 33 bits of user space
    pointers, like it is required.

    Fixes: 977108f89c989 ("s390: wire up separate socketcalls system calls")
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Heiko Carstens
     

10 Dec, 2017

2 commits

  • [ Upstream commit 5ef2d5231d547c672c67bdf84c13a4adaf477964 ]

    If the guarded storage regset for current is supposed to be changed,
    the regset from user space is copied directly into the guarded storage
    control block.

    If then the process gets scheduled away while the control block is
    being copied and before the new control block has been loaded, the
    result is random: the process can be scheduled away due to a page
    fault or preemption. If that happens the already copied parts will be
    overwritten by save_gs_cb(), called from switch_to().

    Avoid this by copying the data to a temporary buffer on the stack and
    do the actual update with preemption disabled.

    Fixes: f5bbd7219891 ("s390/ptrace: guarded storage regset for the current task")
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Heiko Carstens
     
  • commit 8d9047f8b967ce6181fd824ae922978e1b055cc0 upstream.

    Free data structures required for runtime instrumentation from
    arch_release_task_struct(). This allows to simplify the code a bit,
    and also makes the semantics a bit easier: arch_release_task_struct()
    is never called from the task that is being removed.

    In addition this allows to get rid of exit_thread() in a later patch.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Cc: Ben Hutchings
    Signed-off-by: Greg Kroah-Hartman

    Heiko Carstens