28 Apr, 2014

16 commits

  • We have two copies of code that creates an OPAL sg list. Consolidate
    these into a common set of helpers and fix the endian issues.

    The flash interface embedded a version number in the num_entries
    field, whereas the dump interface did not. Since versioning
    wasn't added to the flash interface and it is impossible to add
    this in a backwards compatible way, just remove it.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
     
  • Fix little endian issues with the OPAL error log code.

    Signed-off-by: Anton Blanchard
    Reviewed-by: Stewart Smith
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
     
  • The bitmap in opal_poll_events and opal_handle_interrupt is
    big endian, so we need to byteswap it on little endian builds.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
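
    The byteswap described in this entry can be illustrated outside the kernel. The following is a minimal sketch, not the patch itself: the helper name be64_to_host() is hypothetical, and kernel code would normally use be64_to_cpu() on a __be64 value instead. It assembles a big-endian 64-bit bitmap from its raw bytes, so the result is the same on little-endian and big-endian hosts.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's be64_to_cpu().
 * Interprets 8 bytes as a big-endian 64-bit value, independent of
 * the byte order of the machine running the code.
 */
uint64_t be64_to_host(const unsigned char bytes[8])
{
    uint64_t value = 0;

    /* The most significant byte comes first in big-endian layout. */
    for (int i = 0; i < 8; i++)
        value = (value << 8) | bytes[i];

    return value;
}
```

    Reading the bitmap byte by byte avoids any dependence on host endianness, at the cost of a few shifts compared with a single swap instruction.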
     
  • We had some duplication of the internal OPAL functions.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
     
  • Using size_t in our APIs is asking for trouble, especially
    when some OPAL calls use size_t pointers.

    Signed-off-by: Anton Blanchard
    Reviewed-by: Stewart Smith
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
     
  • On the PowerNV platform, we hold an unnecessary refcount on a pci_dev, which
    prevents the pci_dev from being destroyed when hotplugging a PCI device.

    This patch releases the unnecessary refcount.

    Signed-off-by: Wei Yang
    Signed-off-by: Benjamin Herrenschmidt

    Wei Yang
     
  • During the EEH hotplug event, iommu_add_device() is invoked three times,
    and two of those invocations trigger a warning or an error.

    The three invocations of iommu_add_device() are:

    pci_device_add
    ...
    set_iommu_table_base_and_group kobj->sd is not initialized. The
    dev->kobj->sd is initialized in device_add().
    The third time's warning is triggered by the re-attach of the iommu_group.

    After applying this patch, the error

    iommu_tce: 0003:05:00.0 has not been added, ret=-14

    and the warning

    [ 204.123609] ------------[ cut here ]------------
    [ 204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125
    [ 204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt
    [ 204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102
    [ 204.124400] task: c0000027ed485670 ti: c0000027ed50c000 task.ti: c0000027ed50c000
    [ 204.124453] NIP: c00000000003cf80 LR: c00000000006c648 CTR: c00000000006c5c0
    [ 204.124506] REGS: c0000027ed50f440 TRAP: 0700 Not tainted (3.14.0-rc5yw+)
    [ 204.124558] MSR: 9000000000029032 CR: 88008084 XER: 20000000
    [ 204.124682] CFAR: c00000000006c644 SOFTE: 1
    GPR00: c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300
    GPR04: c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110
    GPR08: c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062
    GPR12: 0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80
    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
    GPR24: 000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800
    GPR28: 0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000
    [ 204.125353] NIP [c00000000003cf80] .iommu_add_device+0x30/0x1f0
    [ 204.125399] LR [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
    [ 204.125443] Call Trace:
    [ 204.125464] [c0000027ed50f6c0] [c0000027ed50f750] 0xc0000027ed50f750 (unreliable)
    [ 204.125526] [c0000027ed50f750] [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0
    [ 204.125588] [c0000027ed50f7d0] [c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340
    [ 204.125650] [c0000027ed50f870] [c000000000044408] .pcibios_setup_device+0x88/0x2f0
    [ 204.125712] [c0000027ed50f940] [c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0
    [ 204.125774] [c0000027ed50f9c0] [c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0
    [ 204.125837] [c0000027ed50fa50] [c00000000086f970] .eeh_reset_device+0x36c/0x4f0
    [ 204.125939] [c0000027ed50fb20] [c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480
    [ 204.126068] [c0000027ed50fbc0] [c00000000003a35c] .eeh_handle_event+0x4c/0x340
    [ 204.126192] [c0000027ed50fc80] [c00000000003a74c] .eeh_event_handler+0xfc/0x1b0
    [ 204.126319] [c0000027ed50fd30] [c0000000000d1ea0] .kthread+0x110/0x130
    [ 204.126430] [c0000027ed50fe30] [c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c
    [ 204.126556] Instruction dump:
    [ 204.126610] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000
    [ 204.126787] 60000000 e87e0298 3143ffff 7d2a1910 2fa90000 40de00c8 ebfe0218
    [ 204.126966] ---[ end trace 6e7aefd80add2973 ]---

    are cleared.

    This patch removes iommu_add_device() in pnv_pci_ioda_dma_dev_setup(), which
    reverts part of the change in commit d905c5df (PPC: POWERNV: move
    iommu_add_device earlier).

    Signed-off-by: Wei Yang
    Signed-off-by: Benjamin Herrenschmidt

    Wei Yang
     
  • With this patch I was able to update firmware on an LE kernel.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
     
  • We have a subtle race when sending CPUs back to OPAL on kexec.

    We mark them as "in real mode" right before we send them down. Once
    we've booted the new kernel, it might try to call opal_reinit_cpus()
    to change endianness, and that requires all CPUs to be spinning inside
    OPAL.

    However there is no synchronization here and we've observed cases
    where the returning CPUs hadn't established their new state inside
    OPAL before opal_reinit_cpus() is called, causing it to fail.

    The proper fix is to actually wait for them to go down all the way
    from the kexec'ing kernel.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     
  • The size of the sysparam sysfs files is determined from the device tree
    at boot. However the buffer is hard coded to 64 bytes. If we encounter a
    parameter that is larger than 64 bytes, or misparse the device tree, the
    buffer will overflow when reading or writing to the parameter.

    Check it at discovery time, and if the parameter is too large, do not
    create a sysfs entry for it.

    Signed-off-by: Joel Stanley
    Signed-off-by: Benjamin Herrenschmidt

    Joel Stanley
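
    The discovery-time check described in this entry can be sketched as follows. This is an illustrative userspace model under stated assumptions, not the kernel code: the constant name MAX_PARAM_DATA_LEN and the helper count_usable_params() are assumptions, and the real code creates sysfs entries rather than counting them.

```c
#include <stddef.h>

/* Assumed name for the fixed 64-byte buffer size mentioned above. */
#define MAX_PARAM_DATA_LEN 64

/*
 * Hypothetical helper: given the sizes read from the device tree,
 * count how many parameters would get a sysfs entry. Parameters that
 * do not fit in the buffer are skipped instead of being exposed.
 */
size_t count_usable_params(const size_t *sizes, size_t n)
{
    size_t usable = 0;

    for (size_t i = 0; i < n; i++) {
        if (sizes[i] > MAX_PARAM_DATA_LEN)
            continue;   /* too large for the buffer: no entry */
        usable++;
    }

    return usable;
}
```

    Rejecting oversized parameters once at discovery means the read and write paths never see a size that exceeds the buffer.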
     
  • Signed-off-by: Benjamin Herrenschmidt

    Joel Stanley
     
  • The sysparam code currently uses the userspace supplied number of
    bytes when memcpy()ing in to a local 64-byte buffer.

    Limit the maximum number of bytes by the size of the buffer.

    Signed-off-by: Benjamin Herrenschmidt

    Joel Stanley
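
    The bounded copy described in this entry amounts to clamping the caller-supplied length before memcpy(). A minimal sketch, assuming a hypothetical helper copy_param() and a hypothetical constant PARAM_BUF_LEN for the 64-byte buffer (the kernel patch may be structured differently):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name for the 64-byte local buffer size. */
#define PARAM_BUF_LEN 64

/*
 * Copy at most PARAM_BUF_LEN bytes from src into buf and return the
 * number of bytes actually copied. The caller-supplied count is
 * never trusted to fit the destination.
 */
size_t copy_param(unsigned char buf[PARAM_BUF_LEN], const void *src,
                  size_t count)
{
    if (count > PARAM_BUF_LEN)
        count = PARAM_BUF_LEN;   /* clamp to the buffer size */

    memcpy(buf, src, count);
    return count;
}
```

    Returning the clamped length lets the caller detect a short write instead of silently assuming that every byte was stored.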
     
  • The OPAL calls are returning int64_t values, which the sysparam code
    stores in an int, and the sysfs callback returns ssize_t. Make the code
    easier to read by consistently using ssize_t.

    Signed-off-by: Joel Stanley
    Signed-off-by: Benjamin Herrenschmidt

    Joel Stanley
     
  • When a sysparam query in OPAL returned a negative value (error code),
    sysfs would spew out a decent chunk of memory; almost 64K more than
    expected. This was traced to a signed/unsigned mix-up in the OPAL sysparam
    sysfs code at sys_param_show.

    The return value of sys_param_show is a ssize_t, calculated using

    return ret ? ret : attr->param_size;

    Alan Modra explains:

    "attr->param_size" is an unsigned int, "ret" an int, so the overall
    expression has type unsigned int. Result is that ret is cast to
    unsigned int before being cast to ssize_t.

    Instead of using the ternary operator, set ret to the param_size if an
    error is not detected. The same bug exists in the sysfs write callback;
    this patch fixes it in the same way.

    A note on debugging this next time: on my system gcc will warn about
    this if compiled with -Wsign-compare, which is not enabled by -Wall,
    only -Wextra.

    Signed-off-by: Joel Stanley
    Signed-off-by: Benjamin Herrenschmidt

    Joel Stanley
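
    The conversion that Alan Modra describes in this entry can be reproduced in a few lines of standalone C. The sketch below is illustrative only: show_buggy() and show_fixed() are hypothetical names, int64_t stands in for ssize_t, and the large value assumes a 32-bit int and unsigned int, as on common Linux targets.

```c
#include <stdint.h>

/*
 * Mirrors the problematic expression: one operand is int, the other
 * unsigned int, so the conditional expression has type unsigned int.
 * A negative ret is converted to a large unsigned value before it is
 * widened to the 64-bit return type.
 */
int64_t show_buggy(int ret, unsigned int param_size)
{
    return ret ? ret : param_size;
}

/*
 * Avoids the ternary: the result starts out as the signed error code
 * and is replaced by param_size only when no error was reported.
 */
int64_t show_fixed(int ret, unsigned int param_size)
{
    int64_t result = ret;

    if (result == 0)
        result = param_size;

    return result;
}
```

    With ret = -1, show_buggy() returns 4294967295 on such targets, whereas show_fixed() returns -1, which is what a sysfs callback needs in order to report the error.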
     
  • Commit 41dd03a9 may cause an Oops in rtas_stop_self().

    The reason is that rtas_args was moved into stack space. On a box
    with more than 4GB of RAM, the stack can easily lie outside the 32-bit
    range, but RTAS is 32-bit.

    So the patch moves rtas_args off the stack by declaring it static.

    Signed-off-by: Li Zhong
    Signed-off-by: Anton Blanchard
    Cc: stable@vger.kernel.org # 3.14+
    Signed-off-by: Benjamin Herrenschmidt

    Li Zhong
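
    The constraint behind this entry is that a 32-bit firmware interface can only be handed buffers that lie entirely below 4GB. A minimal sketch of that range check follows; the function name reachable_by_32bit_firmware() is hypothetical and does not appear in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: returns true if the buffer [addr, addr + len)
 * lies entirely within the first 4GB, i.e. every byte of it can be
 * addressed with a 32-bit pointer. Written to avoid overflow in the
 * addr + len computation.
 */
bool reachable_by_32bit_firmware(uint64_t addr, uint64_t len)
{
    const uint64_t limit = UINT64_C(1) << 32;

    return len <= limit && addr <= limit - len;
}
```

    A stack-allocated argument block gives no guarantee of passing such a check on a large-memory machine, which is why the entry above moves the structure to static storage.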
     
  • Commit aac416fc38c (lkdtm: flush icache and report actions) calls
    flush_icache_range from a module. It's exported on most architectures
    that implement it, but not on powerpc. This patch exports it to fix
    the module link failure.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Benjamin Herrenschmidt

    Jeff Mahoney
     

20 Apr, 2014

2 commits


19 Apr, 2014

6 commits

  • Merge misc fixes from Andrew Morton:
    "13 fixes"

    * emailed patches from Andrew Morton:
    thp: close race between split and zap huge pages
    mm: fix new kernel-doc warning in filemap.c
    mm: fix CONFIG_DEBUG_VM_RB description
    mm: use paravirt friendly ops for NUMA hinting ptes
    mips: export flush_icache_range
    mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
    wait: explain the shadowing and type inconsistencies
    Shiraz has moved
    Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt
    powerpc/mm: fix ".__node_distance" undefined
    kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()
    init/Kconfig: move the trusted keyring config option to general setup
    vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()

    Linus Torvalds
     
  • The lkdtm module performs tests against executable memory ranges, so it
    needs to flush the icache for proper behaviors. Other architectures
    already export this, so do the same for MIPS.

    [akpm@linux-foundation.org: relocate export sites]
    Signed-off-by: Kees Cook
    Cc: Paul Gortmaker
    Cc: Ralf Baechle
    Cc: Sanjay Lal
    Cc: John Crispin
    Cc: Sergei Shtylyov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • shiraz.hashim@st.com email-id doesn't exist anymore as he has left the
    company. Replace ST's id with shiraz.linux.kernel@gmail.com.

    It also updates .mailmap file to fix address for 'git shortlog'.

    Signed-off-by: Viresh Kumar
    Cc: Shiraz Hashim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Viresh Kumar
     
  • CHK include/config/kernel.release
    CHK include/generated/uapi/linux/version.h
    CHK include/generated/utsrelease.h
    ...
    Building modules, stage 2.
    WARNING: 1 bad relocations
    c0000000013d6a30 R_PPC64_ADDR64 uprobes_fetch_type_table
    WRAP arch/powerpc/boot/zImage.pseries
    WRAP arch/powerpc/boot/zImage.epapr
    MODPOST 1849 modules
    ERROR: ".__node_distance" [drivers/block/nvme.ko] undefined!
    make[1]: *** [__modpost] Error 1
    make: *** [modules] Error 2
    make: *** Waiting for unfinished jobs....

    The reason is that the symbol "__node_distance" is not exported on powerpc.

    Signed-off-by: Mike Qiu
    Acked-by: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Nathan Fontenot
    Cc: Stephen Rothwell
    Cc: Srivatsa S. Bhat
    Cc: Jesse Larrew
    Cc: Robert Jennings
    Cc: Alistair Popple
    Cc: Mike Qiu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Qiu
     
  • Commit 93ea02bb8435 ("arch: Clean up asm/barrier.h implementations")
    wired generic barrier.h for ARC, but failed to delete the existing file.

    In 3.15, due to rcupdate.h updates, this causes a build breakage on ARC:

    CC arch/arc/kernel/asm-offsets.s
    In file included from include/linux/sched.h:45:0,
    from arch/arc/kernel/asm-offsets.c:9:
    include/linux/rculist.h: In function __list_add_rcu:
    include/linux/rculist.h:54:2: error: implicit declaration of function smp_store_release [-Werror=implicit-function-declaration]
    rcu_assign_pointer(list_next_rcu(prev), new);
    ^

    Cc: Peter Zijlstra
    Signed-off-by: Vineet Gupta
    Signed-off-by: Linus Torvalds

    Vineet Gupta
     
  • Pull PCI updates from Bjorn Helgaas:
    "These are fixes for a powerpc NULL pointer dereference, an OF
    interrupt mapping issue on some of the new host bridges, and a
    DesignWare iATU issue.

    Host bridge drivers
    - Fix OF interrupt mapping for DesignWare, R-Car, Tegra (Lucas Stach)
    - Fix DesignWare iATU programming (Mohit Kumar)

    Miscellaneous
    - Fix powerpc NULL dereference from list_for_each_entry() update (Mike Qiu)"

    * tag 'pci-v3.15-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
    PCI: tegra: Use new OF interrupt mapping when possible
    PCI: rcar: Use new OF interrupt mapping when possible
    PCI: designware: Use new OF interrupt mapping when possible
    PCI: designware: Fix iATU programming for cfg1, io and mem viewport
    PCI: designware: Fix comment for setting number of lanes
    powerpc/PCI: Fix NULL dereference in sys_pciconfig_iobase() list traversal

    Linus Torvalds
     

18 Apr, 2014

3 commits

  • CPUs which should support the RAPL counters according to
    Family/Model/Stepping may still issue #GP when attempting to access
    the RAPL MSRs. This may happen when Linux is running under KVM and
    we are passing-through host F/M/S data, for example. Use rdmsrl_safe
    to first access the RAPL_POWER_UNIT MSR; if this fails, do not
    attempt to use this PMU.

    Signed-off-by: Venkatesh Srinivas
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1394739386-22260-1-git-send-email-venkateshs@google.com
    Cc: zheng.z.yan@intel.com
    Cc: eranian@google.com
    Cc: ak@linux.intel.com
    Cc: linux-kernel@vger.kernel.org
    [ The patch also silently fixes another bug: rapl_pmu_init() didn't handle the memory alloc failure case previously. ]
    Signed-off-by: Ingo Molnar

    Venkatesh Srinivas
     
  • Pull parisc updates from Helge Deller:
    "There are two major changes in this patchset:

    The major fix is that the epoll_pwait() syscall for 32bit userspace
    was not using the compat wrapper on a 64bit kernel.

    Secondly we changed the value of SHMLBA from 4MB to PAGE_SIZE to
    reflect that we can actually mmap to any multiple of PAGE_SIZE. The
    only thing which needs care is that shared mmaps need to be mapped at
    the same offset inside the 4MB cache window"

    * 'parisc-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: fix epoll_pwait syscall on compat kernel
    parisc: change value of SHMLBA from 0x00400000 to PAGE_SIZE
    parisc: Replace __get_cpu_var uses for address calculation

    Linus Torvalds
     
  • Pull Xen fixes from David Vrabel:
    "Xen regression and bug fixes for 3.15-rc1:

    - fix completely broken 32-bit PV guests caused by x86 refactoring
    32-bit thread_info.
    - only enable ticketlock slow path on Xen (not bare metal)
    - fix two bugs with PV guests not shutting down when requested
    - fix a minor memory leak in xen-pciback error path"

    * tag 'stable/for-linus-3.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/manage: Poweroff forcefully if user-space is not yet up.
    xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart.
    xen/spinlock: Don't enable them unconditionally.
    xen-pciback: silence an unwanted debug printk
    xen: fix memory leak in __xen_pcibk_add_pci_dev()
    x86/xen: Fix 32-bit PV guests's usage of kernel_stack

    Linus Torvalds
     

17 Apr, 2014

8 commits

  • The current kprobes in-kernel page fault handler doesn't
    expect that its single-stepping can be interrupted by
    an NMI handler which may cause a page fault (e.g. perf
    with callback tracing).

    In that case, the page fault is handled by kprobes, which
    mistakenly assumes that the page fault was caused by the
    single-stepping code and tries to restore the IP address
    to the probed address.

    But in truth the page fault was caused by the NMI
    handler, and do_page_fault fails to handle the real
    page fault because the IP address has been modified,
    which causes kernel BUGs like the one below.

    ----
    [ 2264.726905] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
    [ 2264.727190] IP: [] copy_user_generic_string+0x0/0x40

    To handle this correctly, I fixed the kprobes fault
    handler to check that the faulting IP address is in its
    own single-step buffer instead of checking the current
    kprobe state.

    Signed-off-by: Masami Hiramatsu
    Cc: Andi Kleen
    Cc: Ananth N Mavinakayanahalli
    Cc: Sandeepa Prabhu
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: fche@redhat.com
    Cc: systemtap@sourceware.org
    Link: http://lkml.kernel.org/r/20140417081644.26341.52351.stgit@ltc230.yrl.intra.hitachi.co.jp
    Signed-off-by: Ingo Molnar

    Masami Hiramatsu
     
  • The following commit:

    27f6c573e0f7 ("x86, CMCI: Add proper detection of end of CMCI storms")

    Added two preemption bugs:

    - machine_check_poll() does a get_cpu_var() without a matching
    put_cpu_var(), which causes preemption imbalance and crashes upon
    bootup.

    - it does percpu ops without disabling preemption. Preemption is not
    disabled due to the mistaken use of a raw spinlock.

    To fix these bugs fix the imbalance and change
    cmci_discover_lock to a regular spinlock.

    Reported-by: Owen Kibel
    Reported-by: Linus Torvalds
    Signed-off-by: Ingo Molnar
    Cc: Chen, Gong
    Cc: Josh Boyer
    Cc: Tony Luck
    Cc: Peter Zijlstra
    Cc: Alexander Todorov
    Cc: Borislav Petkov
    Link: http://lkml.kernel.org/n/tip-jtjptvgigpfkpvtQxpEk1at2@git.kernel.org
    Signed-off-by: Ingo Molnar
    --
    arch/x86/kernel/cpu/mcheck/mce.c | 4 +---
    arch/x86/kernel/cpu/mcheck/mce_intel.c | 18 +++++++++---------
    2 files changed, 10 insertions(+), 12 deletions(-)

    Ingo Molnar
     
  • Pull x86 fixes from Ingo Molnar:
    "Various fixes:

    - reboot regression fix
    - build message spam fix
    - GPU quirk fix
    - 'make kvmconfig' fix

    plus the wire-up of the renameat2() system call on i386"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: Remove the PCI reboot method from the default chain
    x86/build: Supress "Nothing to be done for ..." messages
    x86/gpu: Fix sign extension issue in Intel graphics stolen memory quirks
    x86/platform: Fix "make O=dir kvmconfig"
    i386: Wire up the renameat2() syscall

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "Tooling fixes, plus a simple hardware-enablement patch for the Intel
    RAPL PMU (energy use measurement) on Haswell CPUs, which I hope is
    still fine at this stage"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf tools: Instead of redirecting flex output, use -o
    perf tools: Fix double free in perf test 21 (code-reading.c)
    perf stat: Initialize statistics correctly
    perf bench: Set more defaults in the 'numa' suite
    perf bench: Fix segfault at the end of an 'all' execution
    perf bench: Update manpage to mention numa and futex
    perf probe: Use dwarf_getcfi_elf() instead of dwarf_getcfi()
    perf probe: Fix to handle errors in line_range searching
    perf probe: Fix --line option behavior
    perf tools: Pick up libdw without explicit LIBDW_DIR
    MAINTAINERS: Change e-mail to kernel.org one
    perf callchains: Disable unwind libraries when libelf isn't found
    tools lib traceevent: Do not call warning() directly
    tools lib traceevent: Print event name when show warning if possible
    perf top: Fix documentation of invalid -s option
    perf/x86: Enable DRAM RAPL support on Intel Haswell

    Linus Torvalds
     
  • Pull pincontrol fixes from Linus Walleij:
    "A first set of pin control fixes for the v3.15 series:

    - Fix a couple of barnsjukdomar on the Rockchip driver.

    - Remove an idiotic debug print I happened to leave behind in the
    Nomadik driver.

    - Fixup the Qualcomm MSM interrupt handling code for the TLMM v2.

    - Three patches renaming the Broadcom Capri driver to BCM28155. This
    has been falling between the chairs for some time due to some
    cross-tree synchronization misunderstandings, now I'm fed up with
    this and just rename it in this -rc1 phase"

    * tag 'pinctrl-v3.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
    pinctrl: fix typo in bindings documentation
    Update bcm_defconfig with new pinctrl CONFIG
    pinctrl: Rename Broadcom Capri pinctrl driver
    pinctrl: msm: Correct interrupt code for TLMM v2
    pinctrl: nomadik: delete stray debug print
    pinctrl: rockchip: handle first half of rk3188-bank0 correctly
    pinctrl: rockchip: add return value to rockchip_set_mux
    pinctrl: rockchip: fix offset of mux registers for rk3188

    Linus Torvalds
     
  • Pull s390 patches from Martin Schwidefsky:
    "An update to the oops output with additional information about the
    crash. The renameat2 system call is enabled. Two patches in regard
    to the PTR_ERR_OR_ZERO cleanup. And a bunch of bug fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390/sclp_cmd: replace PTR_RET with PTR_ERR_OR_ZERO
    s390/sclp: replace PTR_RET with PTR_ERR_OR_ZERO
    s390/sclp_vt220: Fix kernel panic due to early terminal input
    s390/compat: fix typo
    s390/uaccess: fix possible register corruption in strnlen_user_srst()
    s390: add 31 bit warning message
    s390: wire up sys_renameat2
    s390: show_registers() should not map user space addresses to kernel symbols
    s390/mm: print control registers and page table walk on crash
    s390/smp: fix smp_stop_cpu() for !CONFIG_SMP
    s390: fix control register update

    Linus Torvalds
     
  • Pull itanium erratum fix from Tony Luck:
    "Small workaround for a rare, but annoying, erratum #237"

    * tag 'please-pull-ia64-erratum' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
    [IA64] Change default PSR.ac from '1' to '0' (Fix erratum #237)

    Linus Torvalds
     
  • April 2014 Itanium processor specification update:

    http://www.intel.com/content/www/us/en/processors/itanium/itanium-specification-update.html

    describes this erratum:

    =========================================================================
    237. Under a complex set of conditions, store to load forwarding for a
    sub 8-byte load may complete incorrectly

    Problem: A load instruction may complete incorrectly when a code sequence
    using 4-byte or smaller load and store operations to the same address
    is executed in combination with specific timing of all the following
    concurrent conditions: store to load forwarding, alignment checking
    enabled, a mis-predicted branch, and complex cache utilization activity.

    Implication: The affected sub 8-byte instruction may complete
    incorrectly resulting in unpredictable system behavior. There is an
    extremely low probability of exposure due to the significant number of
    complex microarchitectural concurrent conditions required to encounter
    the erratum.

    Workaround: Set PSR.ac = 0 to completely avoid the erratum. Disabling
    Hyper-Threading will significantly reduce exposure to the conditions
    that contribute to encountering the erratum.

    Status: See the Summary Table of Changes for the affected steppings.
    =========================================================================

    [Table of changes essentially lists all models from McKinley to Tukwila]

    The PSR.ac bit controls whether the processor will always generate
    an unaligned reference trap (0x5a00) for a misaligned data access
    (when PSR.ac=1) or if it will let the access succeed when running
    on a cpu that implements logic to handle some unaligned accesses.

    Way back in 2008 in commit b704882e70d87d7f56db5ff17e2253f3fa90e4f3
    [IA64] Rationalize kernel mode alignment checking
    we made the decision to always enable strict checking. We were
    already doing so in trap/interrupt context because the common
    preamble code set this bit - but the rest of supervisor code
    (and by inheritance user code) ran with PSR.ac=0.

    We now reverse that decision and set PSR.ac=0 everywhere in the
    kernel (also inherited by user processes). This will avoid the
    erratum using the method described in the Itanium specification
    update. Net effect for users is that the processor will handle
    unaligned access when it can (typically with a tiny performance
    bubble in the pipeline ... but much less invasive than taking a
    trap and having the OS perform the access).

    Signed-off-by: Tony Luck

    Tony Luck
     

16 Apr, 2014

2 commits

  • Steve reported a reboot hang and bisected it back to this commit:

    a4f1987e4c54 x86, reboot: Add EFI and CF9 reboot methods into the default list

    He heroically tested all reboot methods and found the following:

    reboot=t # triple fault ok
    reboot=k # keyboard ctrl FAIL
    reboot=b # BIOS ok
    reboot=a # ACPI FAIL
    reboot=e # EFI FAIL [system has no EFI]
    reboot=p # PCI 0xcf9 FAIL

    And I think it's pretty obvious that we should only try PCI 0xcf9 as a
    last resort - if at all.

    The other observation is that (on this box) we should never try
    the PCI reboot method, but close with either the 'triple fault'
    or the 'BIOS' (terminal!) reboot methods.

    Thirdly, CF9_COND is a total misnomer - it should be something like
    CF9_SAFE or CF9_CAREFUL, and 'CF9' should be 'CF9_FORCE' ...

    So this patch fixes the worst problems:

    - it orders the actual reboot logic to follow the reboot ordering
    pattern - it was in a pretty random order before for no good
    reason.

    - it fixes the CF9 misnomers and uses BOOT_CF9_FORCE and
    BOOT_CF9_SAFE flags to make the code more obvious.

    - it tries the BIOS reboot method before the PCI reboot method.
    (Since 'BIOS' is a terminal reboot method resulting in a hang
    if it does not work, this is essentially equivalent to removing
    the PCI reboot method from the default reboot chain.)

    - just for the miraculous possibility of terminal (resulting
    in hang) reboot methods of triple fault or BIOS returning
    without having done their job, there's an ordering between
    them as well.

    Reported-and-bisected-and-tested-by: Steven Rostedt
    Cc: Li Aubrey
    Cc: Linus Torvalds
    Cc: Matthew Garrett
    Link: http://lkml.kernel.org/r/20140404064120.GB11877@gmail.com
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The git commit a945928ea2709bc0e8e8165d33aed855a0110279
    ('xen: Do not enable spinlocks before jump_label_init() has executed')
    was added to deal with the jump machinery. Earlier the code
    that turned on the jump label was only called by Xen specific
    functions. But now that it has been moved to the initcall machinery
    it gets called on Xen, KVM, and baremetal - ouch! And the detection
    machinery to only call it on Xen wasn't remembered in the heat
    of merge window excitement.

    This means that the slowpath is enabled on baremetal while it should
    not be.

    Reported-by: Waiman Long
    Acked-by: Steven Rostedt
    CC: stable@vger.kernel.org
    CC: Boris Ostrovsky
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: David Vrabel

    Konrad Rzeszutek Wilk
     

15 Apr, 2014

3 commits

  • Commit 198d208df4371734ac4728f69cb585c284d20a15 ("x86: Keep
    thread_info on thread stack in x86_32") made 32-bit kernels use
    kernel_stack to point to thread_info. That change missed a couple of
    updates needed by Xen's 32-bit PV guests:

    1. kernel_stack needs to be initialized for secondary CPUs

    2. GET_THREAD_INFO() now uses %fs register which may not be the
    kernel's version when executing xen_iret().

    With respect to the second issue, we don't need GET_THREAD_INFO()
    anymore: we used it as an intermediate step to get to per_cpu xen_vcpu
    and avoid referencing %fs. Now that we are going to use %fs anyway we
    may as well go directly to xen_vcpu.

    Signed-off-by: Boris Ostrovsky
    Signed-off-by: David Vrabel

    Boris Ostrovsky
     
  • Pull KVM fixes from Marcelo Tosatti:
    - Fix for guest triggerable BUG_ON (CVE-2014-0155)
    - CR4.SMAP support
    - Spurious WARN_ON() fix

    * git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: x86: remove WARN_ON from get_kernel_ns()
    KVM: Rename variable smep to cr4_smep
    KVM: expose SMAP feature to guest
    KVM: Disable SMAP for guests in EPT realmode and EPT unpaging mode
    KVM: Add SMAP support when setting CR4
    KVM: Remove SMAP bit from CR4_RESERVED_BITS
    KVM: ioapic: try to recover if pending_eoi goes out of range
    KVM: ioapic: fix assignment of ioapic->rtc_status.pending_eoi (CVE-2014-0155)

    Linus Torvalds
     
  • 3bc955987fb3 ("powerpc/PCI: Use list_for_each_entry() for bus traversal")
    caused a NULL pointer dereference because the loop body set the iterator to
    NULL:

    Unable to handle kernel paging request for data at address 0x00000000
    Faulting instruction address: 0xc000000000041d78
    Oops: Kernel access of bad area, sig: 11 [#1]
    ...
    NIP [c000000000041d78] .sys_pciconfig_iobase+0x68/0x1f0
    LR [c000000000041e0c] .sys_pciconfig_iobase+0xfc/0x1f0
    Call Trace:
    [c0000003b4787db0] [c000000000041e0c] .sys_pciconfig_iobase+0xfc/0x1f0 (unreliable)
    [c0000003b4787e30] [c000000000009ed8] syscall_exit+0x0/0x98

    Fix it by using a temporary variable for the iterator.

    [bhelgaas: changelog, drop tmp_bus initialization]
    Fixes: 3bc955987fb3 powerpc/PCI: Use list_for_each_entry() for bus traversal
    Signed-off-by: Mike Qiu
    Signed-off-by: Bjorn Helgaas

    Mike Qiu
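
    The fix described in this entry, using a temporary variable as the loop iterator, can be shown with a simplified list. This is a sketch under stated assumptions: struct bus, for_each_bus() and find_bus() are hypothetical stand-ins, and the real code iterates with list_for_each_entry() over struct pci_bus.

```c
#include <stddef.h>

/* Simplified stand-in for a bus list entry. */
struct bus {
    int number;
    struct bus *next;
};

/*
 * Hypothetical stand-in for list_for_each_entry(): the macro itself
 * dereferences pos to advance, so the loop body must never set pos
 * to NULL.
 */
#define for_each_bus(pos, head) \
    for ((pos) = (head); (pos) != NULL; (pos) = (pos)->next)

/* Return the bus with the given number, or NULL if there is none. */
struct bus *find_bus(struct bus *head, int number)
{
    struct bus *bus = NULL;   /* result; stays NULL when nothing matches */
    struct bus *tmp;          /* iterator, touched only by the macro */

    for_each_bus(tmp, head) {
        if (tmp->number == number) {
            bus = tmp;
            break;
        }
    }

    return bus;
}
```

    Keeping the result and the iterator in separate variables means the "not found" state never interferes with the pointer that the loop macro uses to advance.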