28 Jun, 2016

1 commit

  • This reverts commit 852ffd0f4e23248b47531058e531066a988434b5.

    There are use cases where an intermediate boot kernel (1) uses kexec
    to boot the final production kernel (2). For this scenario we should
    provide the original boot information to the production kernel (2).
    Therefore clearing the boot information during kexec() should not
    be done.

    Cc: stable@vger.kernel.org # v3.17+
    Reported-by: Steffen Maier
    Signed-off-by: Michael Holzheu
    Reviewed-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Michael Holzheu
     

16 Jun, 2016

1 commit

  • On s390, there are two different hardware PMUs for counting and
    sampling. Previously, both PMUs shared the perf_hw_context, which is
    not correct and recently started to trigger this warning:

    ------------[ cut here ]------------
    WARNING: CPU: 5 PID: 1 at kernel/events/core.c:8485 perf_pmu_register+0x420/0x428
    Modules linked in:
    CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc1+ #2
    task: 00000009c5240000 ti: 00000009c5234000 task.ti: 00000009c5234000
    Krnl PSW : 0704c00180000000 0000000000220c50 (perf_pmu_register+0x420/0x428)
    R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
    Krnl GPRS: ffffffffffffffff 0000000000b15ac6 0000000000000000 00000009cb440000
    000000000022087a 0000000000000000 0000000000b78fa0 0000000000000000
    0000000000a9aa90 0000000000000084 0000000000000005 000000000088a97a
    0000000000000004 0000000000749dd0 000000000022087a 00000009c5237cc0
    Krnl Code: 0000000000220c44: a7f4ff54 brc 15,220aec
    0000000000220c48: 92011000 mvi 0(%r1),1
    #0000000000220c4c: a7f40001 brc 15,220c4e
    >0000000000220c50: a7f4ff12 brc 15,220a74
    0000000000220c54: 0707 bcr 0,%r7
    0000000000220c56: 0707 bcr 0,%r7
    0000000000220c58: ebdff0800024 stmg %r13,%r15,128(%r15)
    0000000000220c5e: a7f13fe0 tmll %r15,16352
    Call Trace:
    ([] perf_pmu_register+0x4a/0x428)
    ([] init_cpum_sampling_pmu+0x14c/0x1f8)
    ([] do_one_initcall+0x48/0x140)
    ([] kernel_init_freeable+0x1e6/0x2a0)
    ([] kernel_init+0x24/0x138)
    ([] kernel_thread_starter+0x6/0xc)
    ([] kernel_thread_starter+0x0/0xc)
    Last Breaking-Event-Address:
    [] perf_pmu_register+0x41c/0x428
    ---[ end trace 0c6ef9f5b771ad97 ]---

    Using the perf_sw_context is an option because the cpum_cf PMU does
    not use interrupts. To make this more clear, initialize the
    capabilities in the PMU structure.
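
    The shape of the fix can be sketched in plain C. This is an illustrative
    userspace model, not the kernel's perf code: pmu_register(), the context
    constants and the capability flag are stand-ins for perf_pmu_register(),
    perf_hw_context/perf_sw_context and PERF_PMU_CAP_NO_INTERRUPT.

    ```c
    #include <assert.h>

    /* Illustrative model: each PMU declares a task context; only one PMU
     * may own the hardware context, which is what the kernel warns about. */
    enum { PERF_HW_CONTEXT, PERF_SW_CONTEXT };
    #define PMU_CAP_NO_INTERRUPT 0x1

    struct pmu { const char *name; int task_ctx_nr; unsigned int caps; };

    static int hw_ctx_taken;

    /* returns 0 on success, -1 when a second PMU claims the hw context */
    int pmu_register(struct pmu *p)
    {
        if (p->task_ctx_nr == PERF_HW_CONTEXT) {
            if (hw_ctx_taken)
                return -1;
            hw_ctx_taken = 1;
        }
        return 0;
    }
    ```

    With the counting PMU moved to the software context (it raises no
    interrupts, hence the capability flag), both PMUs register cleanly
    instead of tripping the warning.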

    Signed-off-by: Hendrik Brueckner
    Suggested-by: Peter Zijlstra
    Acked-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Hendrik Brueckner
     

26 May, 2016

1 commit

  • Pull perf updates from Ingo Molnar:
    "Mostly tooling and PMU driver fixes, but also a number of late updates
    such as the reworking of the call-chain size limiting logic to make
    call-graph recording more robust, plus tooling side changes for the
    new 'backwards ring-buffer' extension to the perf ring-buffer"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
    perf record: Read from backward ring buffer
    perf record: Rename variable to make code clear
    perf record: Prevent reading invalid data in record__mmap_read
    perf evlist: Add API to pause/resume
    perf trace: Use the ptr->name beautifier as default for "filename" args
    perf trace: Use the fd->name beautifier as default for "fd" args
    perf report: Add srcline_from/to branch sort keys
    perf evsel: Record fd into perf_mmap
    perf evsel: Add overwrite attribute and check write_backward
    perf tools: Set buildid dir under symfs when --symfs is provided
    perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
    perf annotate: Sort list of recognised instructions
    perf annotate: Fix identification of ARM blt and bls instructions
    perf tools: Fix usage of max_stack sysctl
    perf callchain: Stop validating callchains by the max_stack sysctl
    perf trace: Fix exit_group() formatting
    perf top: Use machine->kptr_restrict_warned
    perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
    perf machine: Do not bail out if not managing to read ref reloc symbol
    perf/x86/intel/p4: Trival indentation fix, remove space
    ...

    Linus Torvalds
     

24 May, 2016

2 commits

  • Most architectures rely on mmap_sem for write in their
    arch_setup_additional_pages. If the waiting task gets killed by the oom
    killer, it would block the oom_reaper from asynchronous address space
    reclaim and reduce the chances of timely OOM resolution. Wait for the
    lock in killable mode and return with -EINTR if the task gets killed
    while waiting.
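
    The pattern can be sketched in userspace C. The real API is
    down_write_killable(); here the semaphore and signal delivery are
    modeled by plain flags, and all names are illustrative.

    ```c
    #include <assert.h>
    #include <errno.h>

    static int fatal_signal_pending_flag; /* stand-in for fatal_signal_pending() */
    static int sem_available = 1;         /* stand-in for an uncontended mmap_sem */

    int down_write_killable_sketch(void)
    {
        while (!sem_available) {
            if (fatal_signal_pending_flag)
                return -EINTR; /* give up instead of blocking the oom_reaper */
        }
        sem_available = 0;
        return 0;
    }

    int setup_additional_pages_sketch(void)
    {
        int ret = down_write_killable_sketch();

        if (ret)
            return ret;      /* propagate -EINTR to the caller */
        /* ... set up the vdso mapping here ... */
        sem_available = 1;   /* up_write() */
        return 0;
    }
    ```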

    Signed-off-by: Michal Hocko
    Acked-by: Andy Lutomirski [x86 vdso]
    Acked-by: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     
  • …unprotect)_crashkres()

    Commit 3f625002581b ("kexec: introduce a protection mechanism for the
    crashkernel reserved memory") introduced a mechanism for protecting the
    crash kernel reserved memory that is similar to the previous
    crash_map/unmap_reserved_pages() implementation; the new one is more
    generic in name and cleaner in code (besides, some archs may not be
    allowed to unmap the pgtable).

    Therefore, this patch consolidates them, and uses the new
    arch_kexec_protect(unprotect)_crashkres() to replace former
    crash_map/unmap_reserved_pages() which by now has been only used by
    S390.

    The consolidation work needs the crash memory to be mapped initially;
    this is done in machine_kdump_pm_init(), which runs after
    reserve_crashkernel(). Once the kdump kernel is loaded, the new
    arch_kexec_protect_crashkres() implemented for S390 will actually
    unmap the pgtable like before.

    Signed-off-by: Xunlei Pang <xlpang@redhat.com>
    Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
    Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
    Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
    Cc: "Eric W. Biederman" <ebiederm@xmission.com>
    Cc: Minfei Huang <mhuang@redhat.com>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Cc: Dave Young <dyoung@redhat.com>
    Cc: Baoquan He <bhe@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Xunlei Pang
     

21 May, 2016

1 commit

  • We need to call exit_thread from copy_process in a fail path. So make it
    accept task_struct as a parameter.

    [v2]
    * s390: exit_thread_runtime_instr doesn't make sense to be called for
    non-current tasks.
    * arm: fix the comment in vfp_thread_copy
    * change 'me' to 'tsk' for task_struct
    * now we can change only archs that actually have exit_thread
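
    An illustrative model of the interface change, in plain C: exit_thread()
    takes the task explicitly, so copy_process() can clean up a half-built
    child (which is not `current`) in its failure path. Types and names are
    stand-ins, not the kernel's.

    ```c
    #include <assert.h>

    struct task_sketch { int thread_live; };

    void exit_thread_sketch(struct task_sketch *tsk)
    {
        tsk->thread_live = 0; /* arch teardown now acts on tsk, not current */
    }

    int copy_process_sketch(struct task_sketch *child, int later_step_fails)
    {
        child->thread_live = 1;            /* arch copy_thread() succeeded */
        if (later_step_fails) {
            exit_thread_sketch(child);     /* the new fail-path call */
            return -1;
        }
        return 0;
    }
    ```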

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Jiri Slaby
    Cc: "David S. Miller"
    Cc: "H. Peter Anvin"
    Cc: "James E.J. Bottomley"
    Cc: Aurelien Jacquiot
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Chen Liqin
    Cc: Chris Metcalf
    Cc: Chris Zankel
    Cc: David Howells
    Cc: Fenghua Yu
    Cc: Geert Uytterhoeven
    Cc: Guan Xuetao
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ivan Kokshaysky
    Cc: James Hogan
    Cc: Jeff Dike
    Cc: Jesper Nilsson
    Cc: Jiri Slaby
    Cc: Jonas Bonn
    Cc: Koichi Yasutake
    Cc: Lennox Wu
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Mikael Starvik
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Ralf Baechle
    Cc: Rich Felker
    Cc: Richard Henderson
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Russell King
    Cc: Steven Miao
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Slaby
     

20 May, 2016

1 commit

  • …ernel/git/acme/linux into perf/core

    Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

    User visible changes:

    - Honour the kernel.perf_event_max_stack knob more precisely by not counting
    PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to
    the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)

    - Fix indentation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim)

    - Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)

    - Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)

    - Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)

    - Store vdso buildid unconditionally, as it appears in callchains and
    we're not checking those when creating the build-id table, so we
    end up not being able to resolve VDSO symbols when doing analysis
    on a different machine than the one where recording was done, possibly
    of a different arch even (arm -> x86_64) (He Kuang)

    Infrastructure changes:

    - Generalize max_stack sysctl handler, will be used for configuring
    multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)

    Cleanups:

    - Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using
    open coded strings (Masami Hiramatsu)

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

19 May, 2016

1 commit

  • Pull s390 updates from Martin Schwidefsky:
    "The s390 patches for the 4.7 merge window have the usual bug fixes and
    cleanups, and the following new features:

    - An interface for dasd driver to query if a volume is online to
    another operating system

    - A new ioctl for the dasd driver to verify the format for a range of
    tracks

    - Following the example of x86 the struct fpu is now allocated with
    the task_struct

    - The 'report_error' interface for the PCI bus to send an
    adapter-error notification from user space to the service element
    of the machine"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (29 commits)
    s390/vmem: remove unused function parameter
    s390/vmem: fix identity mapping
    s390: add missing include statements
    s390: add missing declarations
    s390: make couple of variables and functions static
    s390/cache: remove superfluous locking
    s390/cpuinfo: simplify locking and skip offline cpus early
    s390/3270: hangup the 3270 tty after a disconnect
    s390/3270: handle reconnect of a tty with a different size
    s390/3270: avoid endless I/O loop with disconnected 3270 terminals
    s390/3270: fix garbled output on 3270 tty view
    s390/3270: fix view reference counting
    s390/3270: add missing tty_kref_put
    s390/dumpstack: implement and use return_address()
    s390/cpum_sf: Remove superfluous SMP function call
    s390/cpum_cf: Remove superfluous SMP function call
    s390/Kconfig: make z196 the default processor type
    s390/sclp: avoid compile warning in sclp_pci_report
    s390/fpu: allocate 'struct fpu' with the task_struct
    s390/crypto: cleanup and move the header with the cpacf definitions
    ...

    Linus Torvalds
     

18 May, 2016

1 commit

  • Pull livepatching updates from Jiri Kosina:

    - removal of our own implementation of architecture-specific relocation
    code, leveraging the existing code in the module loader to perform the
    arch-dependent work instead, from Jessica Yu.

    The relevant patches have been acked by Rusty (for module.c) and
    Heiko (for s390).

    - live patching support for ppc64le, which is a joint work of Michael
    Ellerman and Torsten Duwe. This comes from a topic branch that is
    shared between livepatching.git and the ppc tree.

    - addition of livepatching documentation from Petr Mladek

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching:
    livepatch: make object/func-walking helpers more robust
    livepatch: Add some basic livepatch documentation
    powerpc/livepatch: Add live patching support on ppc64le
    powerpc/livepatch: Add livepatch stack to struct thread_info
    powerpc/livepatch: Add livepatch header
    livepatch: Allow architectures to specify an alternate ftrace location
    ftrace: Make ftrace_location_range() global
    livepatch: robustify klp_register_patch() API error checking
    Documentation: livepatch: outline Elf format and requirements for patch modules
    livepatch: reuse module loader code to write relocations
    module: s390: keep mod_arch_specific for livepatch modules
    module: preserve Elf information for livepatch modules
    Elf: add livepatch-specific Elf constants

    Linus Torvalds
     

17 May, 2016

1 commit

  • This makes perf_callchain_{user,kernel}() receive the max stack
    as context for the perf_callchain_entry, instead of accessing
    the global sysctl_perf_event_max_stack.
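
    The shape of the change, as an illustrative userspace sketch: the
    unwinder bounds the callchain by a limit carried in the entry context
    instead of reading a global. Field and function names are made up.

    ```c
    #include <assert.h>

    struct callchain_ctx { int nr; int max_stack; unsigned long ip[64]; };

    void callchain_store(struct callchain_ctx *c, unsigned long ip)
    {
        if (c->nr < c->max_stack)  /* bound comes from the context */
            c->ip[c->nr++] = ip;
    }
    ```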

    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexei Starovoitov
    Cc: Brendan Gregg
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Cc: Milian Wolff
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Cc: Wang Nan
    Cc: Zefan Li
    Link: http://lkml.kernel.org/n/tip-kolmn1yo40p7jhswxwrc7rrd@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

10 May, 2016

5 commits

  • arch_mmap_rnd, cpu_have_feature, and arch_randomize_brk are all
    defined as globally visible variables.
    However, the files they are defined in do not include the header files
    with the declarations. To avoid possible mismatches, add the missing
    include statements so we have proper type checking in place.
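
    A minimal illustration of why this matters: when declaration and
    definition sit in the same translation unit, the compiler verifies that
    they agree. The symbol name mirrors the commit; the body is made up.

    ```c
    #include <assert.h>

    /* --- what would live in the header (declaration) --- */
    unsigned long arch_mmap_rnd(void);

    /* --- what lives in the .c file (definition, now type-checked
     * against the declaration above) --- */
    unsigned long arch_mmap_rnd(void)
    {
        return 0x7ff000UL; /* illustrative value, not the real logic */
    }
    ```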

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • arch_dup_task_struct and the per cpu variable mt_cycles are globally
    visible, but do not have declarations in any header file.
    Therefore add them so we have proper type checking in place.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • copy_oldmem_user() and ap_jumptable are private to the files they are
    being used in. Therefore make them static.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • With "s390/cpuinfo: simplify locking and skip offline cpus early" we
    already prevent cpus from going away. The additional
    get_online_cpus() / put_online_cpus() pair within show_cacheinfo() is
    no longer needed.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Move the get_online_cpus() and put_online_cpus() calls to the start and
    stop operations of the seq_file ops. This way there is no need to lock
    cpu hotplug again and again for each single cpu.

    This way we can also skip offline cpus early if we simply use
    cpumask_next() within the next operation.
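
    A userspace model of the change: take the cpu-hotplug lock once in
    ->start() and release it in ->stop(), rather than once per cpu in
    ->show(). The lock is modeled as a counter; all names are illustrative.

    ```c
    #include <assert.h>

    static int hotplug_lock_depth;
    static int lock_acquisitions;

    static void get_online_cpus_sketch(void)
    {
        hotplug_lock_depth++;
        lock_acquisitions++;
    }

    static void put_online_cpus_sketch(void)
    {
        hotplug_lock_depth--;
    }

    void cpuinfo_walk_sketch(int ncpus)
    {
        get_online_cpus_sketch();          /* ->start */
        for (int cpu = 0; cpu < ncpus; cpu++)
            ;                              /* ->show for each online cpu */
        put_online_cpus_sketch();          /* ->stop */
    }
    ```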

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

06 May, 2016

1 commit

  • In order to enable symmetric hotplug, we must mirror the online &&
    !active state of cpu-down on the cpu-up side.

    However, to retain sanity, limit this state to per-cpu kthreads.

    Aside from the change to set_cpus_allowed_ptr(), which allows moving
    the per-cpu kthreads on, the other critical piece is the cpu selection
    for pinned tasks in select_task_rq(). This avoids dropping into
    select_fallback_rq().

    select_fallback_rq() cannot be allowed to select !active cpus because
    it is used to migrate user tasks away. And we do not want to move user
    tasks onto cpus that are in transition.

    Requested-by: Thomas Gleixner
    Signed-off-by: Peter Zijlstra (Intel)
    Tested-by: Thomas Gleixner
    Cc: Lai Jiangshan
    Cc: Jan H. Schönherr
    Cc: Oleg Nesterov
    Cc: rt@linutronix.de
    Link: http://lkml.kernel.org/r/20160301152303.GV6356@twins.programming.kicks-ass.net
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra (Intel)
     

04 May, 2016

1 commit

  • Implement return_address() and use it instead of __builtin_return_address(n).

    __builtin_return_address(n) is not guaranteed to work for n > 0,
    therefore implement a private return_address() function which walks
    the stack frames and returns the proper return address.

    This way we also get rid of a compile warning that gcc 6.1 emits, and
    s390 looks like all other architectures.
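
    The traversal can be sketched in userspace C: each frame records its
    caller's frame and a return address, and return_address(n) follows n
    back-chain links. The real s390 code walks actual kernel stack frames;
    this model only shows the walk, and all names are made up.

    ```c
    #include <assert.h>
    #include <stddef.h>

    struct frame_sketch { struct frame_sketch *back; unsigned long ret_addr; };

    unsigned long return_address_sketch(struct frame_sketch *top, int n)
    {
        struct frame_sketch *f = top;

        while (n-- > 0 && f)
            f = f->back;            /* skip n levels of callers */
        return f ? f->ret_addr : 0; /* 0 when the chain ends early */
    }
    ```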

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

03 May, 2016

2 commits

  • Since commit 3b9d6da67e11 ("cpu/hotplug: Fix rollback during error-out
    in __cpu_disable()") it is ensured that callbacks of CPU_ONLINE and
    CPU_DOWN_PREPARE are processed on the hotplugged CPU. Due to this,
    SMP function calls are no longer required.

    Replace smp_call_function_single() with a direct call of
    setup_pmc_cpu(). To keep the calling convention, interrupts are
    explicitly disabled around the call.
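
    A sketch of the replacement pattern in userspace C: since the callback
    already runs on the hotplugged cpu, setup_pmc_cpu() is called directly,
    with interrupts disabled to keep the old smp_call_function_single()
    calling convention. Interrupt state is modeled by a flag; names are
    illustrative.

    ```c
    #include <assert.h>

    static int irqs_on = 1;
    static int ran_with_irqs_off;

    static void setup_pmc_cpu_sketch(void *flags)
    {
        (void)flags;
        ran_with_irqs_off = !irqs_on;
    }

    void notifier_sketch(void)
    {
        int saved = irqs_on;

        irqs_on = 0;               /* local_irq_save()    */
        setup_pmc_cpu_sketch(0);   /* direct call, no IPI */
        irqs_on = saved;           /* local_irq_restore() */
    }
    ```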

    Cc: linux-s390@vger.kernel.org
    Signed-off-by: Anna-Maria Gleixner
    Acked-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Anna-Maria Gleixner
     
  • Since commit 3b9d6da67e11 ("cpu/hotplug: Fix rollback during error-out
    in __cpu_disable()") it is ensured that callbacks of CPU_ONLINE and
    CPU_DOWN_PREPARE are processed on the hotplugged CPU. Due to this,
    SMP function calls are no longer required.

    Replace smp_call_function_single() with a direct call of
    setup_pmc_cpu(). To keep the calling convention, interrupts are
    explicitly disabled around the call.

    Cc: linux-s390@vger.kernel.org
    Signed-off-by: Anna-Maria Gleixner
    Acked-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Anna-Maria Gleixner
     

21 Apr, 2016

1 commit

  • Analogous to git commit 0c8c0f03e3a292e031596484275c14cf39c0ab7a
    "x86/fpu, sched: Dynamically allocate 'struct fpu'",
    move the struct fpu to the end of the struct thread_struct,
    set CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and add the
    setup_task_size() function to calculate the correct size
    of the task struct.

    For the performance_defconfig this increases the size of
    struct task_struct from 7424 bytes to 7936 bytes (MACHINE_HAS_VX==1)
    or 7552 bytes (MACHINE_HAS_VX==0). The dynamic allocation of the
    struct fpu is removed. The slab cache uses an 8KB block for the
    task struct in all cases, there is enough room for the struct fpu.
    For MACHINE_HAS_VX==1 each task now needs 512 bytes less memory.
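
    The sizing trick can be sketched in userspace C: with the FPU state
    last in the thread structure, the allocation can stop before the
    vector registers when the machine has no vector facility. The structs
    and sizes below are illustrative, not the real 7424/7936-byte layout.

    ```c
    #include <assert.h>
    #include <stddef.h>

    struct fpu_sketch {
        unsigned long fprs[16];   /* always present            */
        unsigned long vxrs[32];   /* only with MACHINE_HAS_VX  */
    };

    struct thread_sketch {
        unsigned long ksp;
        struct fpu_sketch fpu;    /* must stay the last member */
    };

    size_t task_size_sketch(int machine_has_vx)
    {
        if (machine_has_vx)
            return sizeof(struct thread_sketch);
        /* without VX the trailing vector registers need not be allocated */
        return offsetof(struct thread_sketch, fpu.vxrs);
    }
    ```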

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

01 Apr, 2016

3 commits

  • Livepatch needs to utilize the symbol information contained in the
    mod_arch_specific struct in order to be able to call the s390
    apply_relocate_add() function to apply relocations. Keep a reference to
    syminfo if the module is a livepatch module. Remove the redundant vfree()
    in module_finalize() since module_arch_freeing_init() (which also frees
    those structures) is called in do_init_module(). If the module isn't a
    livepatch module, we free the structures in module_arch_freeing_init() as
    usual.

    Signed-off-by: Jessica Yu
    Reviewed-by: Miroslav Benes
    Acked-by: Josh Poimboeuf
    Acked-by: Heiko Carstens
    Signed-off-by: Jiri Kosina

    Jessica Yu
     
  • Pull s390 fixes from Martin Schwidefsky:
    - A proper fix for the locking issue in the dasd driver
    - Wire up the new preadv2 and pwritev2 system calls
    - Add the mark_rodata_ro function and set DEBUG_RODATA=y
    - A few more bug fixes.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390: wire up preadv2/pwritev2 syscalls
    s390/pci: PCI function group 0 is valid for clp_query_pci_fn
    s390/crypto: provide correct file mode at device register.
    s390/mm: handle PTE-mapped tail pages in fast gup
    s390: add DEBUG_RODATA support
    s390: disable postinit-readonly for now
    s390/dasd: reorder lcu and device lock
    s390/cpum_sf: Fix cpu hotplug notifier transitions
    s390/cpum_cf: Fix missing cpu hotplug notifier transition

    Linus Torvalds
     
  • Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

26 Mar, 2016

1 commit

  • KASAN needs to know whether the allocation happens in an IRQ handler.
    This lets us strip everything below the IRQ entry point to reduce the
    number of unique stack traces needed to be stored.

    Move the definition of __irq_entry to <linux/interrupt.h> so that the
    users don't need to pull in <linux/ftrace.h>. Also introduce the
    __softirq_entry macro which is similar to __irq_entry, but puts the
    corresponding functions to the .softirqentry.text section.
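
    A GCC/ELF sketch of what such macros expand to: a section attribute
    that groups the annotated functions into a dedicated text section, so
    a tool (here: KASAN) can classify an address by range. The section
    names match the commit; the handler below is made up.

    ```c
    #include <assert.h>

    #define __irq_entry_sketch     __attribute__((__section__(".irqentry.text")))
    #define __softirq_entry_sketch __attribute__((__section__(".softirqentry.text")))

    int __irq_entry_sketch do_IRQ_sketch(int irq)
    {
        return irq + 1; /* placeholder handler body */
    }
    ```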

    Signed-off-by: Alexander Potapenko
    Acked-by: Steven Rostedt
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     

18 Mar, 2016

1 commit

  • This changes several users of manual "on"/"off" parsing to use
    strtobool.

    Some side-effects:
    - these uses will now parse y/n/1/0 meaningfully too
    - the early_param uses will now bubble up parse errors
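
    For reference, a userspace re-implementation of the semantics this
    series converges on: accept y/Y/1 and n/N/0 as well as "on"/"off".
    The kernel's strtobool()/kstrtobool() differ in details; this is only
    a sketch.

    ```c
    #include <assert.h>
    #include <errno.h>

    int strtobool_sketch(const char *s, int *res)
    {
        if (!s)
            return -EINVAL;
        switch (s[0]) {
        case 'y': case 'Y': case '1':
            *res = 1;
            return 0;
        case 'n': case 'N': case '0':
            *res = 0;
            return 0;
        case 'o': case 'O':
            if (s[1] == 'n' || s[1] == 'N') { *res = 1; return 0; }
            if (s[1] == 'f' || s[1] == 'F') { *res = 0; return 0; }
            return -EINVAL;
        default:
            return -EINVAL;
        }
    }
    ```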

    Signed-off-by: Kees Cook
    Acked-by: Heiko Carstens
    Acked-by: Michael Ellerman
    Cc: Amitkumar Karwar
    Cc: Andy Shevchenko
    Cc: Daniel Borkmann
    Cc: Joe Perches
    Cc: Kalle Valo
    Cc: Martin Schwidefsky
    Cc: Nishant Sarmukadam
    Cc: Rasmus Villemoes
    Cc: Steve French
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

17 Mar, 2016

4 commits

  • The cpumf_pmu_notifier() hotplug callback lacks handling of the
    CPU_DOWN_FAILED case. That means, if CPU_DOWN_PREPARE fails, the PMC
    of the CPU is not set up again. Furthermore the CPU_ONLINE_FROZEN case
    will never be processed because of masking the switch expression with
    CPU_TASKS_FROZEN.

    Add handling for CPU_DOWN_FAILED transition to setup the PMC of the
    CPU. Remove CPU_ONLINE_FROZEN case.
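
    A sketch of the fixed notifier switch: masking with ~CPU_TASKS_FROZEN
    lets the frozen variants share the handlers, and CPU_DOWN_FAILED is
    treated like CPU_ONLINE so an aborted offline re-initializes the PMC.
    The constants below are illustrative stand-ins for the kernel's.

    ```c
    #include <assert.h>

    enum { CPU_ONLINE_S = 1, CPU_DOWN_PREPARE_S = 2, CPU_DOWN_FAILED_S = 3 };
    #define CPU_TASKS_FROZEN_S 0x10

    static int pmc_ready;

    void cpumf_notifier_sketch(unsigned long action)
    {
        switch (action & ~(unsigned long)CPU_TASKS_FROZEN_S) {
        case CPU_ONLINE_S:
        case CPU_DOWN_FAILED_S:   /* the previously missing case */
            pmc_ready = 1;        /* setup_pmc_cpu()             */
            break;
        case CPU_DOWN_PREPARE_S:
            pmc_ready = 0;        /* release the PMC             */
            break;
        }
    }
    ```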

    Signed-off-by: Anna-Maria Gleixner
    Acked-by: Hendrik Brueckner
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Anna-Maria Gleixner
     
  • The cpumf_pmu_notifier() hotplug callback lacks handling of the
    CPU_DOWN_FAILED case. That means, if CPU_DOWN_PREPARE fails, the PMC
    of the CPU is not set up again.

    Add handling for CPU_DOWN_FAILED transition to setup the PMC of the
    CPU.

    Signed-off-by: Anna-Maria Gleixner
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Anna-Maria Gleixner
     
  • Merge first patch-bomb from Andrew Morton:

    - some misc things

    - ocfs2 updates

    - about half of MM

    - checkpatch updates

    - autofs4 update

    * emailed patches from Andrew Morton : (120 commits)
    autofs4: fix string.h include in auto_dev-ioctl.h
    autofs4: use pr_xxx() macros directly for logging
    autofs4: change log print macros to not insert newline
    autofs4: make autofs log prints consistent
    autofs4: fix some white space errors
    autofs4: fix invalid ioctl return in autofs4_root_ioctl_unlocked()
    autofs4: fix coding style line length in autofs4_wait()
    autofs4: fix coding style problem in autofs4_get_set_timeout()
    autofs4: coding style fixes
    autofs: show pipe inode in mount options
    kallsyms: add support for relative offsets in kallsyms address table
    kallsyms: don't overload absolute symbol type for percpu symbols
    x86: kallsyms: disable absolute percpu symbols on !SMP
    checkpatch: fix another left brace warning
    checkpatch: improve UNSPECIFIED_INT test for bare signed/unsigned uses
    checkpatch: warn on bare unsigned or signed declarations without int
    checkpatch: exclude asm volatile from complex macro check
    mm: memcontrol: drop unnecessary lru locking from mem_cgroup_migrate()
    mm: migrate: consolidate mem_cgroup_migrate() calls
    mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous
    ...

    Linus Torvalds
     
  • Pull s390 updates from Martin Schwidefsky:

    - Add the CPU id for the new z13s machine

    - Add a s390 specific XOR template for RAID-5 checksumming based on the
    XC instruction. Remove all other alternatives, XC is always faster

    - The merge of our four different stack tracers into a single one

    - Tidy up the code related to page tables, several large inline
    functions are now out-of-line. Bloat-o-meter reports ~11K text size
    reduction

    - A binary interface for the privileged CLP instruction to retrieve
    the hardware view of the installed PCI functions

    - Improvements for the dasd format code

    - Bug fixes and cleanups

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (31 commits)
    s390/pci: enforce fmb page boundary rule
    s390: fix floating pointer register corruption (again)
    s390/cpumf: add missing lpp magic initialization
    s390: Fix misspellings in comments
    s390/mm: split arch/s390/mm/pgtable.c
    s390/mm: uninline pmdp_xxx functions from pgtable.h
    s390/mm: uninline ptep_xxx functions from pgtable.h
    s390/pci: add ioctl interface for CLP
    s390: Use pr_warn instead of pr_warning
    s390/dasd: remove casts to dasd_*_private
    s390/dasd: Refactor dasd format functions
    s390/dasd: Simplify code in format logic
    s390/dasd: Improve dasd format code
    s390/percpu: remove this_cpu_cmpxchg_double_4
    s390/cpumf: Improve guest detection heuristics
    s390/fault: merge report_user_fault implementations
    s390/dis: use correct escape sequence for '%' character
    s390/kvm: simplify set_guest_storage_key
    s390/oprofile: add z13/z13s model numbers
    s390: add z13s model number to z13 elf platform
    ...

    Linus Torvalds
     

16 Mar, 2016

2 commits

  • We can use debug_pagealloc_enabled() to check if we can map the identity
    mapping with 1MB/2GB pages as well as to print the current setting in
    dump_stack.

    Signed-off-by: Christian Borntraeger
    Reviewed-by: Heiko Carstens
    Cc: Thomas Gleixner
    Acked-by: David Rientjes
    Cc: Laura Abbott
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Borntraeger
     
  • Pull cpu hotplug updates from Thomas Gleixner:
    "This is the first part of the ongoing cpu hotplug rework:

    - Initial implementation of the state machine

    - Runs all online and prepare down callbacks on the plugged cpu and
    not on some random processor

    - Replaces busy loop waiting with completions

    - Adds tracepoints so the states can be followed"

    More detailed commentary on this work from an earlier email:
    "What's wrong with the current cpu hotplug infrastructure?

    - Asymmetry

    The hotplug notifier mechanism is asymmetric versus the bringup and
    teardown. This is mostly caused by the notifier mechanism.

    - Largely undocumented dependencies

    While some notifiers use explicitly defined notifier priorities,
    we have quite a few notifiers which use numerical priorities to
    express dependencies without any documentation why.

    - Control processor driven

    Most of the bringup/teardown of a cpu is driven by a control
    processor. While it is understandable that preparatory steps,
    like idle thread creation, memory allocation for and initialization
    of essential facilities, need to be done before a cpu can boot,
    there is no reason why everything else must run on a control
    processor. Before this patch series, bringup looks like this:

    Control CPU Booting CPU

    do preparatory steps
    kick cpu into life

    do low level init

    sync with booting cpu sync with control cpu

    bring the rest up

    - All or nothing approach

    There is no way to do partial bringups. That's something which is
    really desired because we waste, e.g. at boot, a substantial amount
    of time just busy-waiting for the cpu to come to life. That's stupid
    as we could very well do the preparatory steps and the initial IPI
    for other cpus and then go back and do the necessary low level
    synchronization with the freshly booted cpu.

    - Minimal debuggability

    Due to the notifier based design, it's impossible to switch between
    two stages of the bringup/teardown back and forth in order to test
    the correctness. So in many hotplug notifiers the cancel
    mechanisms are either nonexistent or completely untested.

    - Notifier [un]registering is tedious

    To [un]register notifiers we need to protect against hotplug at
    every callsite. There is no mechanism that bringup/teardown
    callbacks are issued on the online cpus, so every caller needs to
    do it itself. That also includes error rollback.

    What's the new design?

    The base of the new design is a symmetric state machine, where both
    the control processor and the booting/dying cpu execute a well
    defined set of states. Each state is symmetric in the end, except
    for some well defined exceptions, and the bringup/teardown can be
    stopped and reversed at almost all states.

    So the bringup of a cpu will look like this in the future:

    Control CPU Booting CPU

    do preparatory steps
    kick cpu into life

    do low level init

    sync with booting cpu sync with control cpu

    bring itself up

    The synchronization step does not require the control cpu to wait.
    That mechanism can be done asynchronously via a worker or some
    other mechanism.

    The teardown can be made very similar, so that the dying cpu cleans
    up and brings itself down. Cleanups which need to be done after
    the cpu is gone, can be scheduled asynchronously as well.

    There is a long way to this, as we need to refactor the notion when a
    cpu is available. Today we set the cpu online right after it comes
    out of the low level bringup, which is not really correct.

    The proper mechanism is to set it to available, i.e. cpu local
    threads, like softirqd, hotplug thread etc. can be scheduled on that
    cpu, and once it finished all booting steps, it's set to online, so
    general workloads can be scheduled on it. The reverse happens on
    teardown. First thing to do is to forbid scheduling of general
    workloads, then teardown all the per cpu resources and finally shut it
    off completely.

    This patch series implements the basic infrastructure for this at the
    core level. This includes the following:

    - Basic state machine implementation with well defined states, so
    ordering and prioritization can be expressed.

    - Interfaces to [un]register state callbacks

    This invokes the bringup/teardown callback on all online cpus with
    the proper protection in place and [un]installs the callbacks in
    the state machine array.

    For callbacks which have no particular ordering requirement we have
    a dynamic state space, so that drivers don't have to register an
    explicit hotplug state.

    If a callback fails, the code automatically does a rollback to the
    previous state.

    - Sysfs interface to drive the state machine to a particular step.

    This is only partially functional today. Full functionality and
    therefore testability will be achieved once we have converted all
    existing hotplug notifiers over to the new scheme.

    - Run all CPU_ONLINE/DOWN_PREPARE notifiers on the booting/dying
    processor:

    Control CPU                      Booting CPU

    do preparatory steps
    kick cpu into life

                                     do low level init

    sync with booting cpu            sync with control cpu

    wait for boot
                                     bring itself up

                                     Signal completion to control cpu
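
    The ordered state callbacks with automatic rollback described above
    can be sketched as a toy model. This is illustrative Python, not the
    kernel's cpuhp code; all names are made up:

```python
# Toy hotplug state machine: bringup callbacks run in order; if one
# fails, the already-completed states are rolled back by running their
# teardown callbacks in reverse order.
def cpu_up(states, cpu):
    done = []
    for name, bringup, teardown in states:
        try:
            bringup(cpu)
        except Exception:
            # automatic rollback to the previous stable state
            for _, _, td in reversed(done):
                td(cpu)
            return False
        done.append((name, bringup, teardown))
    return True

log = []

def failing_bringup(cpu):
    raise RuntimeError("bringup failed")

states = [
    ("prepare", lambda c: log.append("prepare"), lambda c: log.append("unprepare")),
    ("threads", lambda c: log.append("threads"), lambda c: log.append("unthreads")),
    ("online",  failing_bringup,                 lambda c: log.append("unused")),
]

ok = cpu_up(states, 0)
assert ok is False
assert log == ["prepare", "threads", "unthreads", "unprepare"]
```

    A successful run would simply leave every state installed; only the
    failure path walks the teardown callbacks.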

    In a previous step of this work we've done a full tree mechanical
    conversion of all hotplug notifiers to the new scheme. The balance
    is a net removal of about 4000 lines of code.

    This is not included in this series, as we decided to take a
    different approach. Instead of mechanically converting everything
    over, we will do a proper overhaul of the usage sites one by one so
    they nicely fit into the symmetric callback scheme.

    I decided to do that after I looked at the ugliness of some of the
    converted sites and figured out that their hotplug mechanism is
    completely buggered anyway. So there is no point in doing a
    mechanical conversion first, as we need to go through the usage
    sites one by one again in order to achieve fully symmetric and
    testable behaviour"

    * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    cpu/hotplug: Document states better
    cpu/hotplug: Fix smpboot thread ordering
    cpu/hotplug: Remove redundant state check
    cpu/hotplug: Plug death reporting race
    rcu: Make CPU_DYING_IDLE an explicit call
    cpu/hotplug: Make wait for dead cpu completion based
    cpu/hotplug: Let upcoming cpu bring itself fully up
    arch/hotplug: Call into idle with a proper state
    cpu/hotplug: Move online calls to hotplugged cpu
    cpu/hotplug: Create hotplug threads
    cpu/hotplug: Split out the state walk into functions
    cpu/hotplug: Unpark smpboot threads from the state machine
    cpu/hotplug: Move scheduler cpu_online notifier to hotplug core
    cpu/hotplug: Implement setup/removal interface
    cpu/hotplug: Make target state writeable
    cpu/hotplug: Add sysfs state interface
    cpu/hotplug: Hand in target state to _cpu_up/down
    cpu/hotplug: Convert the hotplugged cpu work to a state machine
    cpu/hotplug: Convert to a state machine for the control processor
    cpu/hotplug: Add tracepoints
    ...

    Linus Torvalds
     

15 Mar, 2016

2 commits

  • Pull locking changes from Ingo Molnar:
    "Various updates:

    - Futex scalability improvements: remove page lock use for shared
    futex get_futex_key(), which speeds up 'perf bench futex hash'
    benchmarks by over 40% on a 60-core Westmere. This makes anon-mem
    shared futexes perform close to private futexes. (Mel Gorman)

    - lockdep hash collision detection and fix (Alfredo Alvarez
    Fernandez)

    - lockdep testing enhancements (Alfredo Alvarez Fernandez)

    - robustify lockdep init by using hlists (Andrew Morton, Andrey
    Ryabinin)

    - mutex and csd_lock micro-optimizations (Davidlohr Bueso)

    - small x86 barriers tweaks (Michael S Tsirkin)

    - qspinlock updates (Waiman Long)"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
    locking/csd_lock: Use smp_cond_acquire() in csd_lock_wait()
    locking/csd_lock: Explicitly inline csd_lock*() helpers
    futex: Replace barrier() in unqueue_me() with READ_ONCE()
    locking/lockdep: Detect chain_key collisions
    locking/lockdep: Prevent chain_key collisions
    tools/lib/lockdep: Fix link creation warning
    tools/lib/lockdep: Add tests for AA and ABBA locking
    tools/lib/lockdep: Add userspace version of READ_ONCE()
    tools/lib/lockdep: Fix the build on recent kernels
    locking/qspinlock: Move __ARCH_SPIN_LOCK_UNLOCKED to qspinlock_types.h
    locking/mutex: Allow next waiter lockless wakeup
    locking/pvqspinlock: Enable slowpath locking count tracking
    locking/qspinlock: Use smp_cond_acquire() in pending code
    locking/pvqspinlock: Move lock stealing count tracking code into pv_queued_spin_steal_lock()
    locking/mcs: Fix mcs_spin_lock() ordering
    futex: Remove requirement for lock_page() in get_futex_key()
    futex: Rename barrier references in ordering guarantees
    locking/atomics: Update comment about READ_ONCE() and structures
    locking/lockdep: Eliminate lockdep_init()
    locking/lockdep: Convert hash tables to hlists
    ...

    Linus Torvalds
     
  • Pull ram resource handling changes from Ingo Molnar:
    "Core kernel resource handling changes to support NVDIMM error
    injection.

    This tree introduces a new I/O resource type, IORESOURCE_SYSTEM_RAM,
    for System RAM while keeping the current IORESOURCE_MEM type bit set
    for all memory-mapped ranges (including System RAM) for backward
    compatibility.

    With this resource flag it no longer takes a strcmp() loop through the
    resource tree to find "System RAM" resources.
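
    The difference can be sketched in a few lines: with a dedicated type
    bit, finding System RAM becomes a bitmask test instead of a name
    comparison while walking the tree. This is an illustrative Python
    sketch; the flag values are placeholders, not the kernel's actual
    constants:

```python
# Sketch: flag-based matching vs. name matching for System RAM.
# IORESOURCE_MEM stays set on System RAM for backward compatibility.
IORESOURCE_MEM        = 0x00000200  # placeholder value
IORESOURCE_SYSRAM     = 0x01000000  # placeholder value
IORESOURCE_SYSTEM_RAM = IORESOURCE_MEM | IORESOURCE_SYSRAM

resources = [
    {"name": "System RAM", "flags": IORESOURCE_SYSTEM_RAM},
    {"name": "PCI Bus",    "flags": IORESOURCE_MEM},
]

# old way: strcmp() against the resource name while walking the tree
by_name = [r for r in resources if r["name"] == "System RAM"]
# new way: test the type bit
by_flag = [r for r in resources if r["flags"] & IORESOURCE_SYSRAM]

assert by_name == by_flag
# MEM is still set on every System RAM range
assert all(r["flags"] & IORESOURCE_MEM for r in by_flag)
```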

    The new resource type is then used to extend ACPI/APEI error injection
    facility to also support NVDIMM"

    * 'core-resources-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    ACPI/EINJ: Allow memory error injection to NVDIMM
    resource: Kill walk_iomem_res()
    x86/kexec: Remove walk_iomem_res() call with GART type
    x86, kexec, nvdimm: Use walk_iomem_res_desc() for iomem search
    resource: Add walk_iomem_res_desc()
    memremap: Change region_intersects() to take @flags and @desc
    arm/samsung: Change s3c_pm_run_res() to use System RAM type
    resource: Change walk_system_ram() to use System RAM type
    drivers: Initialize resource entry to zero
    xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM
    kexec: Set IORESOURCE_SYSTEM_RAM for System RAM
    arch: Set IORESOURCE_SYSTEM_RAM flag for System RAM
    ia64: Set System RAM type and descriptor
    x86/e820: Set System RAM type and descriptor
    resource: Add I/O resource descriptor
    resource: Handle resource flags properly
    resource: Add System RAM resource type

    Linus Torvalds
     

10 Mar, 2016

2 commits

  • There is a tricky interaction between the machine check handler
    and the critical sections of load_fpu_regs and save_fpu_regs
    functions. If the machine check interrupts one of the two
    functions, the critical section cleanup will complete the function
    before the machine check handler s390_do_machine_check is called.
    Trouble is that the machine check handler needs to validate the
    floating point registers *before* and not *after* the completion
    of load_fpu_regs/save_fpu_regs.

    The simplest solution is to rewind the PSW to the start of the
    load_fpu_regs/save_fpu_regs and retry the function after the
    return from the machine check handler.
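
    The rewind-and-retry idea can be illustrated with a toy model. This
    is not the s390 entry code; the step list and names are made up to
    show the control flow only:

```python
# Toy model of the PSW-rewind fix: if a "machine check" fires inside
# the critical section, run the handler first (so it sees the original
# register state), then restart the whole function from the top.
class MachineCheck(Exception):
    pass

def save_fpu_regs(steps, fail_at):
    # runs each step; a machine check at index `fail_at` aborts the run
    done = []
    for i, step in enumerate(steps):
        if i == fail_at:
            raise MachineCheck()
        done.append(step)
    return done

def machine_check_handler():
    # validate the floating point registers here, *before* the
    # interrupted critical section is allowed to complete
    pass

def run(steps, fail_at):
    try:
        return save_fpu_regs(steps, fail_at)
    except MachineCheck:
        machine_check_handler()
        # "rewind the PSW": retry the function from the start
        return save_fpu_regs(steps, fail_at=None)

result = run(["r0", "r1", "r2"], fail_at=1)
assert result == ["r0", "r1", "r2"]
```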

    Tested-by: Christian Borntraeger
    Cc: # 4.3+
    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Add the missing lpp magic initialization for cpu 0. Without this,
    all samples on cpu 0 do not have the most significant bit set in the
    program parameter field, which we use to distinguish between guest and
    host samples if the pid is also 0.

    We did initialize the lpp magic in the absolute zero lowcore but
    forgot to do the same when switching to the allocated lowcore, which
    is done on cpu 0 only.

    Reported-by: Shu Juan Zhang
    Acked-by: Christian Borntraeger
    Cc: stable@vger.kernel.org # v4.4+
    Fixes: e22cf8ca6f75 ("s390/cpumf: rework program parameter setting to detect guest samples")
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

08 Mar, 2016

3 commits

  • Signed-off-by: Adam Buchbinder
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Adam Buchbinder
     
  • The pgtable.c file is quite big, before it grows any larger split it
    into pgtable.c, pgalloc.c and gmap.c. In addition move the gmap related
    header definitions into the new gmap.h header and all of the pgste
    helpers from pgtable.h to pgtable.c.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • We have to check bit 40 of the facility list before issuing LPP,
    not bit 48. Otherwise a guest running on a system with the
    "decimal-floating-point zoned-conversion facility" but without the
    "set-program-parameters facility" might crash on an lpp
    instruction.

    Signed-off-by: Christian Borntraeger
    Cc: stable@vger.kernel.org # v4.4+
    Fixes: e22cf8ca6f75 ("s390/cpumf: rework program parameter setting to detect guest samples")
    Signed-off-by: Martin Schwidefsky

    Christian Borntraeger
     

07 Mar, 2016

1 commit


04 Mar, 2016

1 commit