09 Sep, 2016

1 commit

  • Pull powerpc fixes from Michael Ellerman:
    "Fixes marked for stable:
    - Don't alias user region to other regions below PAGE_OFFSET from
    Paul Mackerras
    - Fix again csum_partial_copy_generic() on 32-bit from Christophe
    Leroy
    - Fix corrupted PE allocation bitmap on releasing PE from Gavin Shan

    Fixes for code merged this cycle:
    - Fix crash on releasing compound PE from Gavin Shan
    - Fix processor numbers in OPAL ICP from Benjamin Herrenschmidt
    - Fix little endian build with CONFIG_KEXEC=n from Thiago Jung
    Bauermann"

    * tag 'powerpc-4.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc/mm: Don't alias user region to other regions below PAGE_OFFSET
    powerpc/32: Fix again csum_partial_copy_generic()
    powerpc/powernv: Fix corrupted PE allocation bitmap on releasing PE
    powerpc/powernv: Fix crash on releasing compound PE
    powerpc/xics/opal: Fix processor numbers in OPAL ICP
    powerpc/pseries: Fix little endian build with CONFIG_KEXEC=n

    Linus Torvalds
     

08 Sep, 2016

3 commits

  • In commit c60ac5693c47 ("powerpc: Update kernel VSID range", 2013-03-13)
    we lost a check on the region number (the top four bits of the effective
    address) for addresses below PAGE_OFFSET. That commit replaced a check
    that the top 18 bits were all zero with a check that bits 46 - 59 were
    zero (performed for all addresses, not just user addresses).

    This means that userspace can access an address like 0x1000_0xxx_xxxx_xxxx
    and we will insert a valid SLB entry for it. The VSID used will be the
    same as if the top 4 bits were 0, but the page size will be some random
    value obtained by indexing beyond the end of the mm_ctx_high_slices_psize
    array in the paca. If that page size is the same as would be used for
    region 0, then userspace just has an alias of the region 0 space. If the
    page size is different, then no HPTE will be found for the access, and
    the process will get a SIGSEGV (since hash_page_mm() will refuse to create
    a HPTE for the bogus address).

    The access beyond the end of the mm_ctx_high_slices_psize can be at most
    5.5MB past the array, and so will be in RAM somewhere. Since the access
    is a load performed in real mode, it won't fault or crash the kernel.
    At most this bug could perhaps leak a little bit of information about
    blocks of 32 bytes of memory located at offsets of i * 512kB past the
    paca->mm_ctx_high_slices_psize array, for 1
    Reviewed-by: Aneesh Kumar K.V
    Signed-off-by: Michael Ellerman

    Paul Mackerras
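
    A minimal sketch of the region check described above, in C for
    readability. The helper names are hypothetical and this is not the
    actual fix; it only illustrates that an address below PAGE_OFFSET
    should be accepted as a user address only when its top four bits
    (the region number) are zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* The region number is the top four bits of a 64-bit effective address. */
#define EA_REGION_SHIFT 60

static inline unsigned int ea_region(uint64_t ea)
{
    return (unsigned int)(ea >> EA_REGION_SHIFT);
}

/*
 * Hypothetical helper: accept an address as a user address only if it
 * lies in region 0, so that 0x1000_0xxx_xxxx_xxxx cannot alias region 0.
 */
static inline bool user_ea_region_ok(uint64_t ea)
{
    return ea_region(ea) == 0;
}
```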
     
  • Commit 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic()
    based on copy_tofrom_user()") introduced a bug when destination address
    is odd and len is lower than cacheline size.

    In that case the resulting csum value doesn't have to be rotated one
    byte because the cache-aligned copy part is skipped so no alignment
    is performed.

    Fixes: 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic() based on copy_tofrom_user()")
    Cc: stable@vger.kernel.org # v4.6+
    Reported-by: Alessio Igor Bogani
    Signed-off-by: Christophe Leroy
    Tested-by: Alessio Igor Bogani
    Signed-off-by: Michael Ellerman

    Christophe Leroy
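
    For background, a minimal sketch of why a one-byte rotation is tied
    to alignment at all. It uses a plain 16-bit ones' complement sum
    (csum16() and swab16() are illustrative names, not the kernel
    routines): summing the same bytes starting one byte later yields the
    byte-swapped sum, so a rotation is only needed when the data really
    was summed at a shifted offset.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement sum, pairing bytes as big-endian words. */
static uint16_t csum16(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)buf[i] << 8 | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;  /* pad the odd tail with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);  /* fold the carries back in */
    return (uint16_t)sum;
}

/* Rotate a 16-bit checksum by one byte. */
static uint16_t swab16(uint16_t v)
{
    return (uint16_t)(v << 8 | v >> 8);
}
```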
     
  • In pnv_ioda_free_pe(), the PE object (including the associated PE
    number) is cleared before resetting the corresponding bit in the
    PE allocation bitmap. It means PE#0 is always released to the bitmap
    wrongly.

    This fixes above issue by caching the PE number before the PE object
    is cleared.

    Fixes: 1e9167726c41 ("powerpc/powernv: Use PE instead of number during setup and release")
    Cc: stable@vger.kernel.org # v4.7+
    Signed-off-by: Gavin Shan
    Signed-off-by: Michael Ellerman

    Gavin Shan
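
    A minimal sketch of the ordering issue, with simplified stand-in
    structures (pe_sketch, phb_sketch and the helpers are hypothetical,
    not the kernel's types). The PE number is read before the object is
    cleared, so the right bit is reset in the bitmap.

```c
#include <string.h>

#define MAX_PE 8

struct pe_sketch {
    unsigned int pe_number;
    unsigned int flags;
};

struct phb_sketch {
    unsigned long pe_alloc;               /* one bit per allocated PE */
    struct pe_sketch pe_array[MAX_PE];
};

static struct pe_sketch *alloc_pe(struct phb_sketch *phb, unsigned int n)
{
    phb->pe_alloc |= 1UL << n;
    phb->pe_array[n].pe_number = n;
    return &phb->pe_array[n];
}

static void free_pe(struct phb_sketch *phb, struct pe_sketch *pe)
{
    /* Cache the number first: after memset() it would read back as 0. */
    unsigned int pe_num = pe->pe_number;

    memset(pe, 0, sizeof(*pe));
    phb->pe_alloc &= ~(1UL << pe_num);
}
```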
     

07 Sep, 2016

1 commit

  • Instead of having each caller of check_object_size() need to remember to
    check for a const size parameter, move the check into check_object_size()
    itself. This actually matches the original implementation in PaX, though
    this commit cleans up the now-redundant builtin_const() calls in the
    various architectures.

    Signed-off-by: Kees Cook

    Kees Cook
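
    A minimal sketch of the idea, assuming a GCC or Clang compiler for
    __builtin_constant_p(). The names are hypothetical, and a macro is
    used here (rather than an inline function) so that the behaviour
    does not depend on the optimisation level.

```c
static int slow_checks;  /* counts how often the runtime check ran */

static void check_object_size_slow(const void *ptr, unsigned long n)
{
    (void)ptr;
    (void)n;
    slow_checks++;
}

/* Skip the runtime check when the size is a compile-time constant. */
#define check_object_size_sketch(ptr, n)            \
    do {                                            \
        if (!__builtin_constant_p(n))               \
            check_object_size_slow((ptr), (n));     \
    } while (0)
```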
     

06 Sep, 2016

3 commits

  • The compound PE is created to accommodate the devices attached to
    one specific PCI bus that consume multiple M64 segments. The compound
    PE is made up of one master PE and possibly multiple slave PEs. The
    slave PEs should be destroyed when releasing the master PE. A kernel
    crash happens when dereferencing @pe->pdev on releasing the slave PE
    in pnv_ioda_deconfigure_pe().

    # echo 0 > /sys/bus/pci/slots/C7/power
    iommu: Removing device 0000:01:00.1 from group 0
    iommu: Removing device 0000:01:00.0 from group 0
    Unable to handle kernel paging request for data at address 0x00000010
    Faulting instruction address: 0xc00000000005d898
    cpu 0x1: Vector: 300 (Data Access) at [c000000fe8217620]
    pc: c00000000005d898: pnv_ioda_release_pe+0x288/0x610
    lr: c00000000005dbdc: pnv_ioda_release_pe+0x5cc/0x610
    sp: c000000fe82178a0
    msr: 9000000000009033
    dar: 10
    dsisr: 40000000
    current = 0xc000000fe815ab80
    paca = 0xc00000000ff00400 softe: 0 irq_happened: 0x01
    pid = 2709, comm = sh
    Linux version 4.8.0-rc5-gavin-00006-g745efdb (gwshan@gwshan) \
    (gcc version 4.9.3 (Buildroot 2016.02-rc2-00093-g5ea3bce) ) #586 SMP \
    Tue Sep 6 13:37:29 AEST 2016
    enter ? for help
    [c000000fe8217940] c00000000005d684 pnv_ioda_release_pe+0x74/0x610
    [c000000fe82179e0] c000000000034460 pcibios_release_device+0x50/0x70
    [c000000fe8217a10] c0000000004aba80 pci_release_dev+0x50/0xa0
    [c000000fe8217a40] c000000000704898 device_release+0x58/0xf0
    [c000000fe8217ac0] c000000000470510 kobject_release+0x80/0xf0
    [c000000fe8217b00] c000000000704dd4 put_device+0x24/0x40
    [c000000fe8217b20] c0000000004af94c pci_remove_bus_device+0x12c/0x150
    [c000000fe8217b60] c000000000034244 pci_hp_remove_devices+0x94/0xd0
    [c000000fe8217ba0] c0000000004ca444 pnv_php_disable_slot+0x64/0xb0
    [c000000fe8217bd0] c0000000004c88c0 power_write_file+0xa0/0x190
    [c000000fe8217c50] c0000000004c248c pci_slot_attr_store+0x3c/0x60
    [c000000fe8217c70] c0000000002d6494 sysfs_kf_write+0x94/0xc0
    [c000000fe8217cb0] c0000000002d50f0 kernfs_fop_write+0x180/0x260
    [c000000fe8217d00] c0000000002334a0 __vfs_write+0x40/0x190
    [c000000fe8217d90] c000000000234738 vfs_write+0xc8/0x240
    [c000000fe8217de0] c000000000236250 SyS_write+0x60/0x110
    [c000000fe8217e30] c000000000009524 system_call+0x38/0x108

    It fixes the kernel crash by bypassing releasing resources (DMA,
    IO and memory segments, PELTM) because there are no resources assigned
    to the slave PE.

    Fixes: c5f7700bbd2e ("powerpc/powernv: Dynamically release PE")
    Reported-by: Frederic Barrat
    Signed-off-by: Gavin Shan
    Signed-off-by: Michael Ellerman

    Gavin Shan
     
  • When using the OPAL ICP backend we incorrectly pass Linux CPU numbers
    rather than HW CPU numbers to OPAL.

    Fixes: d74361881f0d ("powerpc/xics: Add ICP OPAL backend")
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Michael Ellerman

    Benjamin Herrenschmidt
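
    A minimal sketch of the distinction, with a made-up mapping table
    and hypothetical function names (none of these are the real OPAL or
    xics interfaces). The point is only that the Linux CPU number is
    translated to the hardware CPU number before it is handed to
    firmware.

```c
#define NR_CPUS_SKETCH 4

/* Made-up mapping from Linux CPU number to hardware CPU number. */
static const unsigned int hw_cpu_id[NR_CPUS_SKETCH] = { 8, 9, 10, 11 };

static unsigned int last_server;  /* records what "firmware" was given */

static unsigned int hard_cpu_id(unsigned int linux_cpu)
{
    return hw_cpu_id[linux_cpu];
}

/* Stand-in for a firmware call that expects a hardware CPU number. */
static void firmware_set_priority(unsigned int server, unsigned char prio)
{
    (void)prio;
    last_server = server;
}

static void icp_set_priority_sketch(unsigned int linux_cpu, unsigned char prio)
{
    /* Translate first; passing linux_cpu directly is the bug described. */
    firmware_set_priority(hard_cpu_id(linux_cpu), prio);
}
```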
     
  • On ppc64le, builds with CONFIG_KEXEC=n fail with:

    arch/powerpc/platforms/pseries/setup.c: In function ‘pseries_big_endian_exceptions’:
    arch/powerpc/platforms/pseries/setup.c:403:13: error: implicit declaration of function ‘kdump_in_progress’
    if (rc && !kdump_in_progress())

    This is because pseries/setup.c includes one header, but
    kdump_in_progress() is defined in a different one. This is a problem
    because the former only includes the latter if CONFIG_KEXEC_CORE=y.

    Fix it by including the defining header directly, as is done in
    powernv/setup.c.

    Fixes: d3cbff1b5a90 ("powerpc: Put exception configuration in a common place")
    Signed-off-by: Thiago Jung Bauermann
    Signed-off-by: Michael Ellerman

    Thiago Jung Bauermann
     

29 Aug, 2016

3 commits

  • Userspace can begin and suspend a transaction within the signal
    handler which means they might enter sys_rt_sigreturn() with the
    processor in suspended state.

    sys_rt_sigreturn() wants to restore process context (which may have
    been in a transaction before signal delivery). To do this it must
    restore TM SPRs. To achieve this, any transaction initiated within the
    signal frame must be discarded in order to be able to restore TM SPRs
    as TM SPRs can only be manipulated non-transactionally.
    From the PowerPC ISA:
    TM Bad Thing Exception [Category: Transactional Memory]
    An attempt is made to execute a mtspr targeting a TM register in
    other than Non-transactional state.

    Not doing so results in a TM Bad Thing:
    [12045.221359] Kernel BUG at c000000000050a40 [verbose debug info unavailable]
    [12045.221470] Unexpected TM Bad Thing exception at c000000000050a40 (msr 0x201033)
    [12045.221540] Oops: Unrecoverable exception, sig: 6 [#1]
    [12045.221586] SMP NR_CPUS=2048 NUMA PowerNV
    [12045.221634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE
    nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
    xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter
    ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm
    uio_pdrv_genirq ipmi_powernv uio powernv_rng ipmi_msghandler autofs4 ses enclosure
    scsi_transport_sas bnx2x ipr mdio libcrc32c
    [12045.222167] CPU: 68 PID: 6178 Comm: sigreturnpanic Not tainted 4.7.0 #34
    [12045.222224] task: c0000000fce38600 ti: c0000000fceb4000 task.ti: c0000000fceb4000
    [12045.222293] NIP: c000000000050a40 LR: c0000000000163bc CTR: 0000000000000000
    [12045.222361] REGS: c0000000fceb7ac0 TRAP: 0700 Not tainted (4.7.0)
    [12045.222418] MSR: 9000000300201033 CR: 28444280 XER: 20000000
    [12045.222625] CFAR: c0000000000163b8 SOFTE: 0 PACATMSCRATCH: 900000014280f033
    GPR00: 01100000b8000001 c0000000fceb7d40 c00000000139c100 c0000000fce390d0
    GPR04: 900000034280f033 0000000000000000 0000000000000000 0000000000000000
    GPR08: 0000000000000000 b000000000001033 0000000000000001 0000000000000000
    GPR12: 0000000000000000 c000000002926400 0000000000000000 0000000000000000
    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR24: 0000000000000000 00003ffff98cadd0 00003ffff98cb470 0000000000000000
    GPR28: 900000034280f033 c0000000fceb7ea0 0000000000000001 c0000000fce390d0
    [12045.223535] NIP [c000000000050a40] tm_restore_sprs+0xc/0x1c
    [12045.223584] LR [c0000000000163bc] tm_recheckpoint+0x5c/0xa0
    [12045.223630] Call Trace:
    [12045.223655] [c0000000fceb7d80] [c000000000026e74] sys_rt_sigreturn+0x494/0x6c0
    [12045.223738] [c0000000fceb7e30] [c0000000000092e0] system_call+0x38/0x108
    [12045.223806] Instruction dump:
    [12045.223841] 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
    [12045.223955] 4e800020 e80304a8 7c0023a6 e80304b0 e80304b8 7c0123a6 4e800020
    [12045.224074] ---[ end trace cb8002ee240bae76 ]---

    It isn't clear exactly if there is really a use case for userspace
    returning with a suspended transaction, however, doing so doesn't (on
    its own) constitute a bad frame. As such, this patch simply discards
    the transactional state of the context calling the sigreturn and
    continues.

    Reported-by: Laurent Dufour
    Signed-off-by: Cyril Bur
    Tested-by: Laurent Dufour
    Reviewed-by: Laurent Dufour
    Acked-by: Simon Guo
    Signed-off-by: Benjamin Herrenschmidt

    Cyril Bur
     
    When the Linux kernel is notified about a duplicate error log from
    OPAL, it fails to remove the sysfs entries
    (/sys/firmware/opal/elog/0xXXXXXXXX) of such error logs. This is because
    we currently search for the error log/dump kobject in the kset list via
    the 'kset_find_obj()' routine, which increments the reference count by
    one once it finds the kobject.

    So, unless we decrement the reference count by one after finding the
    kobject, we cannot release the kobject properly later.

    This patch adds the 'kobject_put()' which was missing earlier.

    Signed-off-by: Mukesh Ojha
    Cc: stable@vger.kernel.org
    Reviewed-by: Vasant Hegde
    Signed-off-by: Benjamin Herrenschmidt

    Mukesh Ojha
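
    A minimal model of the reference counting involved, using stand-in
    types and functions (obj_sketch, set_find() and friends are
    hypothetical and only mimic a lookup that returns a counted
    reference). The duplicate path drops the reference taken by the
    lookup, so a later final put can still release the object.

```c
#include <stddef.h>
#include <string.h>

struct obj_sketch {
    const char *name;
    int refcount;
    int released;
};

static struct obj_sketch *obj_get(struct obj_sketch *o)
{
    o->refcount++;
    return o;
}

static void obj_put(struct obj_sketch *o)
{
    if (--o->refcount == 0)
        o->released = 1;
}

/* Lookup that, like the routine named above, returns a counted reference. */
static struct obj_sketch *set_find(struct obj_sketch **set, size_t n,
                                   const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i]->name, name) == 0)
            return obj_get(set[i]);
    return NULL;
}

/* Returns 1 for a duplicate notification, 0 for a new log. */
static int handle_notification(struct obj_sketch **set, size_t n,
                               const char *name)
{
    struct obj_sketch *o = set_find(set, n, name);

    if (o) {
        obj_put(o);  /* the put that was missing */
        return 1;
    }
    return 0;
}
```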
     
  • tabort_syscall runs with RI=1, so a nested recoverable machine
    check will load the paca into r13 and overwrite what we loaded
    it with, because exceptions returning to privileged mode do not
    restore r13.

    Fixes: b4b56f9ecab4 (powerpc/tm: Abort syscalls in active transactions)
    Cc: stable@vger.kernel.org
    Signed-off-by: Nick Piggin
    Signed-off-by: Benjamin Herrenschmidt

    Nicholas Piggin
     

22 Aug, 2016

12 commits

  • hmi.c functions are unused unless sibling_subcore_state is nonzero, and
    that in turn happens only if KVM is in use. So move the code to
    arch/powerpc/kvm/, putting it under CONFIG_KVM_BOOK3S_HV_POSSIBLE
    rather than CONFIG_PPC_BOOK3S_64. The sibling_subcore_state is also
    included in struct paca_struct only if KVM is supported by the kernel.

    Cc: Daniel Axtens
    Cc: Michael Ellerman
    Cc: Mahesh Salgaonkar
    Cc: Paul Mackerras
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: kvm-ppc@vger.kernel.org
    Cc: kvm@vger.kernel.org
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Benjamin Herrenschmidt

    Paolo Bonzini
     
  • of_mm_gpiochip_add_data() calls mm_gc->save_regs() before
    setting the data. Therefore ->save_regs() cannot use
    gpiochip_get_data()

    [ 0.275940] Unable to handle kernel paging request for data at address 0x00000130
    [ 0.283120] Faulting instruction address: 0xc01b44cc
    [ 0.288175] Oops: Kernel access of bad area, sig: 11 [#1]
    [ 0.293343] PREEMPT CMPC885
    [ 0.296141] CPU: 0 PID: 1 Comm: swapper Not tainted 4.7.0-g65124df-dirty #68
    [ 0.304131] task: c6074000 ti: c6080000 task.ti: c6080000
    [ 0.309459] NIP: c01b44cc LR: c0011720 CTR: c0011708
    [ 0.314372] REGS: c6081d90 TRAP: 0300 Not tainted (4.7.0-g65124df-dirty)
    [ 0.322267] MSR: 00009032 CR: 24000028 XER: 20000000
    [ 0.328813] DAR: 00000130 DSISR: c0000000
    GPR00: c01b6d0c c6081e40 c6074000 c6017000 c9028000 c601d028 c6081dd8 00000000
    GPR08: c601d028 00000000 ffffffff 00000001 24000044 00000000 c0002790 00000000
    GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 c05643b0 00000083
    GPR24: c04a1a6c c0560000 c04a8308 c04c6480 c0012498 c6017000 c7ffcc78 c6017000
    [ 0.360806] NIP [c01b44cc] gpiochip_get_data+0x4/0xc
    [ 0.365684] LR [c0011720] cpm1_gpio16_save_regs+0x18/0x44
    [ 0.370972] Call Trace:
    [ 0.373451] [c6081e50] [c01b6d0c] of_mm_gpiochip_add_data+0x70/0xdc
    [ 0.379624] [c6081e70] [c00124c0] cpm_init_par_io+0x28/0x118
    [ 0.385238] [c6081e80] [c04a8ac0] do_one_initcall+0xb0/0x17c
    [ 0.390819] [c6081ef0] [c04a8cbc] kernel_init_freeable+0x130/0x1dc
    [ 0.396924] [c6081f30] [c00027a4] kernel_init+0x14/0x110
    [ 0.402177] [c6081f40] [c000b424] ret_from_kernel_thread+0x5c/0x64
    [ 0.408233] Instruction dump:
    [ 0.411168] 4182fafc 3f80c040 48234c6d 3bc0fff0 3b9c5ed0 4bfffaf4 81290020 712a0004
    [ 0.418825] 4182fb34 48234c51 4bfffb2c 81230004 4e800020 7c0802a6 9421ffe0
    [ 0.426763] ---[ end trace fe4113ee21d72ffa ]---

    Fixes: e65078f1f3490 ("powerpc: sysdev: cpm1: use gpiochip data pointer")
    Fixes: a14a2d484b386 ("powerpc: cpm_common: use gpiochip data pointer")
    Cc: stable@vger.kernel.org
    Signed-off-by: Christophe Leroy
    Reviewed-by: Linus Walleij
    Signed-off-by: Benjamin Herrenschmidt

    Christophe Leroy
     
  • MCE must not enable MSR_RI until PACA_EXMC is no longer being used.

    Signed-off-by: Benjamin Herrenschmidt

    Nicholas Piggin
     
  • MCE must not use PACA_EXGEN. When a general exception enables MSR_RI,
    that means SPRN_SRR[01] and SPRN_SPRG are no longer used. However the
    PACA save area is still in use.

    Acked-by: Mahesh Salgaonkar
    Signed-off-by: Benjamin Herrenschmidt

    Nicholas Piggin
     
  • When booting from an OpenFirmware which supports it, we use the
    "ibm,client-architecture-support" firmware call to communicate
    our capabilities to firmware.

    The format of the structure we pass to firmware is specified in
    PAPR (Power Architecture Platform Requirements), or the public version
    LoPAPR (Linux on Power Architecture Platform Reference).

    Referring to table 244 in LoPAPR v1.1, option vector 5 contains a 4 byte
    field at bytes 17-20 for the "Platform Facilities Enable". This is
    followed by a 1 byte field at byte 21 for "Sub-Processor Representation
    Level".

    Comparing to the code, there we have the Platform Facilities
    options (OV5_PFO_*) at byte 17, but we fail to pad that field out to its
    full width of 4 bytes. This means the OV5_SUB_PROCESSORS option is
    incorrectly placed at byte 18.

    Fix it by adding zero bytes for bytes 18, 19, 20, and comment the bytes
    to hopefully make it clearer in future.

    As far as I'm aware nothing actually consumes this value at this time,
    so the effect of this bug is nil in practice.

    It does mean we've been incorrectly setting bit 15 of the "Platform
    Facilities Enable" option for the past ~3 1/2 years, so we should avoid
    allocating that bit to anything else in future.

    Fixes: df77c7992029 ("powerpc/pseries: Update ibm,architecture.vec for PAPR 2.7/POWER8")
    Signed-off-by: Michael Ellerman
    Signed-off-by: Benjamin Herrenschmidt

    Michael Ellerman
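
    A minimal sketch of the layout problem. It assumes the array can be
    indexed directly by the byte numbers quoted above, that the 4 byte
    field is read big-endian with bit 0 as the most significant bit, and
    that the sub-processor option value is 1; all names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum {
    OV5_PFO_FIRST  = 17,  /* first byte of the 4 byte PFO field */
    OV5_SUBPROC    = 21,  /* byte for the sub-processor level */
    OV5_SKETCH_LEN = 22,
};

/* Build the vector either with or without padding the PFO field. */
static void build_ov5(uint8_t *v, uint8_t pfo, uint8_t subproc, int padded)
{
    memset(v, 0, OV5_SKETCH_LEN);
    v[OV5_PFO_FIRST] = pfo;
    v[padded ? OV5_SUBPROC : OV5_PFO_FIRST + 1] = subproc;
}

/* Read bytes 17-20 as one big-endian 32-bit field. */
static uint32_t pfo_field(const uint8_t *v)
{
    return (uint32_t)v[17] << 24 | (uint32_t)v[18] << 16 |
           (uint32_t)v[19] << 8 | v[20];
}

/* Bit numbering with bit 0 as the most significant bit. */
static int ibm_bit(uint32_t word, unsigned int bit)
{
    return (int)((word >> (31 - bit)) & 1);
}
```

    With the unpadded layout the sub-processor value lands in byte 18,
    which shows up as bit 15 of the PFO field under these assumptions.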
     
  • We observed a kernel oops when running a PPC guest with config NR_CPUS=4
    and qemu option "-smp cores=1,threads=8":

    [ 30.634781] Unable to handle kernel paging request for data at
    address 0xc00000014192eb17
    [ 30.636173] Faulting instruction address: 0xc00000000003e5cc
    [ 30.637069] Oops: Kernel access of bad area, sig: 11 [#1]
    [ 30.637877] SMP NR_CPUS=4 NUMA pSeries
    [ 30.638471] Modules linked in:
    [ 30.638949] CPU: 3 PID: 27 Comm: migration/3 Not tainted
    4.7.0-07963-g9714b26 #1
    [ 30.640059] task: c00000001e29c600 task.stack: c00000001e2a8000
    [ 30.640956] NIP: c00000000003e5cc LR: c00000000003e550 CTR:
    0000000000000000
    [ 30.642001] REGS: c00000001e2ab8e0 TRAP: 0300 Not tainted
    (4.7.0-07963-g9714b26)
    [ 30.643139] MSR: 8000000102803033 CR: 22004084 XER: 00000000
    [ 30.644583] CFAR: c000000000009e98 DAR: c00000014192eb17 DSISR: 40000000 SOFTE: 0
    GPR00: c00000000140a6b8 c00000001e2abb60 c0000000016dd300 0000000000000003
    GPR04: 0000000000000000 0000000000000004 c0000000016e5920 0000000000000008
    GPR08: 0000000000000004 c00000014192eb17 0000000000000000 0000000000000020
    GPR12: c00000000140a6c0 c00000000ffffc00 c0000000000d3ea8 c00000001e005680
    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR20: 0000000000000000 c00000001e6b3a00 0000000000000000 0000000000000001
    GPR24: c00000001ff85138 c00000001ff85130 000000001eb6f000 0000000000000001
    GPR28: 0000000000000000 c0000000017014e0 0000000000000000 0000000000000018
    [ 30.653882] NIP [c00000000003e5cc] __cpu_disable+0xcc/0x190
    [ 30.654713] LR [c00000000003e550] __cpu_disable+0x50/0x190
    [ 30.655528] Call Trace:
    [ 30.655893] [c00000001e2abb60] [c00000000003e550] __cpu_disable+0x50/0x190 (unreliable)
    [ 30.657280] [c00000001e2abbb0] [c0000000000aca0c] take_cpu_down+0x5c/0x100
    [ 30.658365] [c00000001e2abc10] [c000000000163918] multi_cpu_stop+0x1a8/0x1e0
    [ 30.659617] [c00000001e2abc60] [c000000000163cc0] cpu_stopper_thread+0xf0/0x1d0
    [ 30.660737] [c00000001e2abd20] [c0000000000d8d70] smpboot_thread_fn+0x290/0x2a0
    [ 30.661879] [c00000001e2abd80] [c0000000000d3fa8] kthread+0x108/0x130
    [ 30.662876] [c00000001e2abe30] [c000000000009968] ret_from_kernel_thread+0x5c/0x74
    [ 30.664017] Instruction dump:
    [ 30.664477] 7bde1f24 38a00000 787f1f24 3b600001 39890008 7d204b78 7d05e214 7d0b07b4
    [ 30.665642] 796b1f24 7d26582a 7d204a14 7d29f214 7d4a3878 7d4049ad 40c2fff4
    [ 30.666854] ---[ end trace 32643b7195717741 ]---

    The reason is that in __cpu_disable(), when we try to set the
    cpu_sibling_mask or cpu_core_mask of the sibling CPUs of the disabled
    one, we don't check whether the current configuration employs those
    sibling CPUs (hw threads). If a CPU is not employed by a
    configuration, the percpu structures cpu_{sibling,core}_mask are not
    allocated, so accessing those cpumasks results in problems like the
    one above.

    This patch fixes the problem by adding an additional check on whether
    the id is no less than nr_cpu_ids in the sibling CPU iteration code.

    Signed-off-by: Boqun Feng
    Signed-off-by: Benjamin Herrenschmidt

    Boqun Feng
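
    A minimal sketch of the added bound check, using a hypothetical
    helper rather than the real __cpu_disable() code. Sibling ids at or
    beyond nr_cpu_ids are skipped, so per-CPU data that was never
    allocated is not touched.

```c
/*
 * Visit the hardware threads of the core that "cpu" belongs to and mark
 * each one that the current configuration actually employs. Returns the
 * number of siblings updated.
 */
static int update_sibling_masks(unsigned int cpu,
                                unsigned int threads_per_core,
                                unsigned int nr_cpu_ids,
                                unsigned char *touched)
{
    unsigned int base = cpu - (cpu % threads_per_core);
    int updated = 0;

    for (unsigned int i = 0; i < threads_per_core; i++) {
        unsigned int id = base + i;

        if (id >= nr_cpu_ids)
            continue;  /* not employed: no per-CPU mask exists */
        touched[id] = 1;
        updated++;
    }
    return updated;
}
```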
     
  • These files were only including module.h for exception table
    related functions. We've now separated that content out into its
    own file "extable.h" so now move over to that and avoid all the
    extra header content in module.h that we don't really need to compile
    these files.

    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Paul Gortmaker
    Signed-off-by: Benjamin Herrenschmidt

    Paul Gortmaker
     
    An unsigned value is always non-negative, so the loop could never end
    if the condition is never true.

    The problem has been detected using semantic patch
    scripts/coccinelle/tests/unsigned_lesser_than_zero.cocci

    Signed-off-by: Andrzej Hajda
    Signed-off-by: Benjamin Herrenschmidt

    Andrzej Hajda
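
    A minimal sketch of this class of bug with a hypothetical function
    (not the patched code). A downward loop whose exit test is "i >= 0"
    needs a signed index; with an unsigned index the test is always true.

```c
/*
 * Return the index of the last non-zero entry, or -1 if there is none.
 * The index is a signed int on purpose: with "unsigned int i" the test
 * "i >= 0" could never fail, and the loop would only stop on a match.
 */
static int find_last_set(const unsigned char *flags, unsigned int n)
{
    for (int i = (int)n - 1; i >= 0; i--)
        if (flags[i])
            return i;
    return -1;
}
```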
     
  • This patch leverages 'struct pci_host_bridge' from the PCI subsystem
    in order to free the pci_controller only after the last reference to
    its devices is dropped (avoiding an oops in pcibios_release_device()
    if the last reference is dropped after pcibios_free_controller()).

    The patch relies on pci_host_bridge.release_fn() (and .release_data),
    which is called automatically by the PCI subsystem when the root bus
    is released (i.e., the last reference is dropped). Those fields are
    set via pci_set_host_bridge_release() (e.g. in the platform-specific
    implementation of pcibios_root_bridge_prepare()).

    It introduces the 'pcibios_free_controller_deferred()' .release_fn()
    and it expects .release_data to hold a pointer to the pci_controller.

    The function implicitly calls 'pcibios_free_controller()', so a user
    must *NOT* explicitly call it if using the new _deferred() callback.

    The functionality is enabled for pseries (although it isn't platform
    specific, and may be used by cxl).

    Details on not-so-elegant design choices:

    - Use 'pci_host_bridge.release_data' field as pointer to associated
    'struct pci_controller' so *not* to 'pci_bus_to_host(bridge->bus)'
    in pcibios_free_controller_deferred().

    That's because pci_remove_root_bus() sets 'host_bridge->bus = NULL'
    (so, if the last reference is released after pci_remove_root_bus()
    runs, which eventually reaches pcibios_free_controller_deferred(),
    that would hit a null pointer dereference).

    The cxl/vphb.c code calls pci_remove_root_bus(), and the cxl folks
    are interested in this fix.

    Test-case #1 (hold references)

    # ls -ld /sys/block/sd* | grep -m1 0021:01:00.0
    /sys/block/sdaa -> ../devices/pci0021:01/0021:01:00.0/

    # ls -ld /sys/block/sd* | grep -m1 0021:01:00.1
    /sys/block/sdab -> ../devices/pci0021:01/0021:01:00.1/

    # cat >/dev/sdaa & pid1=$!
    # cat >/dev/sdab & pid2=$!

    # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
    Validating PHB DLPAR capability...yes.
    [ 594.306719] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
    [ 594.306738] pci_hp_remove_devices: Removing 0021:01:00.0...
    ...
    [ 598.236381] pci_hp_remove_devices: Removing 0021:01:00.1...
    ...
    [ 611.972077] pci_bus 0021:01: busn_res: [bus 01-ff] is released
    [ 611.972140] rpadlpar_io: slot PHB 33 removed

    # kill -9 $pid1
    # kill -9 $pid2
    [ 632.918088] pcibios_free_controller_deferred: domain 33, dynamic 1

    Test-case #2 (don't hold references)

    # drmgr -w 5 -d 1 -c phb -s 'PHB 33' -r
    Validating PHB DLPAR capability...yes.
    [ 916.357363] pci_hp_remove_devices: PCI: Removing devices on bus 0021:01
    [ 916.357386] pci_hp_remove_devices: Removing 0021:01:00.0...
    ...
    [ 920.566527] pci_hp_remove_devices: Removing 0021:01:00.1...
    ...
    [ 933.955873] pci_bus 0021:01: busn_res: [bus 01-ff] is released
    [ 933.955977] pcibios_free_controller_deferred: domain 33, dynamic 1
    [ 933.955999] rpadlpar_io: slot PHB 33 removed

    Suggested-By: Gavin Shan
    Signed-off-by: Mauricio Faria de Oliveira
    Reviewed-by: Gavin Shan
    Reviewed-by: Andrew Donnellan
    Tested-by: Andrew Donnellan # cxl
    Signed-off-by: Benjamin Herrenschmidt

    Mauricio Faria de Oliveira
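
    A minimal model of the deferred release, with stand-in types and
    helpers (bridge_sketch, bridge_put() and the rest are hypothetical
    and only mimic a release callback with a data pointer). The
    controller is freed when the last reference is dropped, and the
    callback reaches it through release_data rather than through the
    bus pointer.

```c
#include <stddef.h>

struct controller_sketch {
    int freed;
};

struct bridge_sketch {
    int refcount;
    void (*release_fn)(struct bridge_sketch *bridge);
    void *release_data;
};

/* Release callback: finds the controller via release_data only. */
static void free_controller_deferred(struct bridge_sketch *bridge)
{
    struct controller_sketch *phb = bridge->release_data;

    phb->freed = 1;
}

static void set_bridge_release(struct bridge_sketch *bridge,
                               void (*fn)(struct bridge_sketch *),
                               void *data)
{
    bridge->release_fn = fn;
    bridge->release_data = data;
}

/* Drop one reference; run the release callback on the last one. */
static void bridge_put(struct bridge_sketch *bridge)
{
    if (--bridge->refcount == 0 && bridge->release_fn != NULL)
        bridge->release_fn(bridge);
}
```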
     
  • The field "owner" is set by the core.
    Thus delete an unneeded initialisation.

    Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
    Signed-off-by: Markus Elfring
    Signed-off-by: Benjamin Herrenschmidt

    Markus Elfring
     
  • The field "owner" is set by the core.
    Thus delete an unneeded initialisation.

    Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
    Signed-off-by: Markus Elfring
    Signed-off-by: Benjamin Herrenschmidt

    Markus Elfring
     
  • Powerpc builds may fail with the following build error.

    Error log:
    In file included from ./arch/powerpc/include/asm/mmu_context.h:11:0,
    from ./include/linux/mmu_context.h:4,
    from mm/mmu_context.c:8:
    ./arch/powerpc/include/asm/cputhreads.h: In function 'get_tensr':
    ./arch/powerpc/include/asm/cputhreads.h:101:2: error:
    implicit declaration of function 'cpu_has_feature'

    The problem can be triggered by configuring ppc64e_defconfig and selecting
    CONFIG_TICK_CPU_ACCOUNTING instead of CONFIG_VIRT_CPU_ACCOUNTING_NATIVE.

    Fixes: b92a226e5284 ("powerpc: Move cpu_has_feature() to a separate file")
    Signed-off-by: Guenter Roeck
    Signed-off-by: Benjamin Herrenschmidt

    Guenter Roeck
     

14 Aug, 2016

1 commit

  • Pull KVM fixes from Radim Krčmář:
    "KVM:
    - lock kvm_device list to prevent corruption on device creation.

    PPC:
    - split debugfs initialization from creation of the xics device to
    unlock the newly taken kvm lock earlier.

    s390:
    - prevent userspace from triggering two WARN_ON_ONCE.

    MIPS:
    - fix several issues in the management of TLB faults (Cc: stable)"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    MIPS: KVM: Propagate kseg0/mapped tlb fault errors
    MIPS: KVM: Fix gfn range check in kseg0 tlb faults
    MIPS: KVM: Add missing gfn range check
    MIPS: KVM: Fix mapped fault broken commpage handling
    KVM: Protect device ops->create and list_add with kvm->lock
    KVM: PPC: Move xics_debugfs_init out of create
    KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed
    KVM: s390: set the prefix initially properly

    Linus Torvalds
     

13 Aug, 2016

1 commit

  • Pull powerpc fixes from Michael Ellerman:
    "Some powerpc fixes for 4.8:

    Misc:
    - powerpc/vdso: Fix build rules to rebuild vdsos correctly from Nicholas Piggin
    - powerpc/ptrace: Fix coredump since ptrace TM changes from Cyril Bur
    - powerpc/32: Fix csum_partial_copy_generic() from Christophe Leroy
    - cxl: Set psl_fir_cntl to production environment value from Frederic Barrat
    - powerpc/eeh: Switch to conventional PCI address output in EEH log from Guilherme G. Piccoli
    - cxl: Use fixed width predefined types in data structure. from Philippe Bergheaud
    - powerpc/vdso: Add missing include file from Guenter Roeck
    - powerpc: Fix unused function warning 'lmb_to_memblock' from Alastair D'Silva
    - powerpc/powernv/ioda: Fix TCE invalidate to work in real mode again from Alexey Kardashevskiy
    - powerpc/cell: Add missing error code in spufs_mkgang() from Dan Carpenter
    - crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading from Anton Blanchard
    - powerpc/pasemi: Fix coherent_dma_mask for dma engine from Darren Stevens

    Benjamin Herrenschmidt:
    - powerpc/32: Fix crash during static key init
    - powerpc: Update obsolete comment in setup_32.c about early_init()
    - powerpc: Print the kernel load address at the end of prom_init()
    - powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs
    - powerpc/xics: Properly set Edge/Level type and enable resend

    Mahesh Salgaonkar:
    - powerpc/book3s: Fix MCE console messages for unrecoverable MCE.
    - powerpc/powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.
    - powerpc/powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h
    - powerpc/powernv: Load correct TOC pointer while waking up from winkle.

    Andrew Donnellan:
    - cxl: Fix sparse warnings
    - cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests

    Michael Ellerman:
    - selftests/powerpc: Specify we expect to build with std=gnu99
    - powerpc/Makefile: Use cflags-y/aflags-y for setting endian options
    - powerpc/pci: Fix endian bug in fixed PHB numbering"

    * tag 'powerpc-4.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (26 commits)
    selftests/powerpc: Specify we expect to build with std=gnu99
    powerpc/vdso: Fix build rules to rebuild vdsos correctly
    powerpc/Makefile: Use cflags-y/aflags-y for setting endian options
    powerpc/32: Fix crash during static key init
    powerpc: Update obsolete comment in setup_32.c about early_init()
    powerpc: Print the kernel load address at the end of prom_init()
    powerpc/ptrace: Fix coredump since ptrace TM changes
    powerpc/32: Fix csum_partial_copy_generic()
    cxl: Set psl_fir_cntl to production environment value
    powerpc/pnv/pci: Fix incorrect PE reservation attempt on some 64-bit BARs
    powerpc/book3s: Fix MCE console messages for unrecoverable MCE.
    powerpc/pci: Fix endian bug in fixed PHB numbering
    powerpc/eeh: Switch to conventional PCI address output in EEH log
    cxl: Fix sparse warnings
    cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests
    cxl: Use fixed width predefined types in data structure.
    powerpc/vdso: Add missing include file
    powerpc: Fix unused function warning 'lmb_to_memblock'
    powerpc/powernv: Fix MCE handler to avoid trashing CR0/CR1 registers.
    powerpc/powernv: Move IDLE_STATE_ENTER_SEQ macro to cpuidle.h
    ...

    Linus Torvalds
     

12 Aug, 2016

2 commits

  • KVM devices were manipulating list data structures without any form of
    synchronization, and some implementations of the create operations also
    suffered from a lack of synchronization.

    Now when we've split the xics create operation into create and init, we
    can hold the kvm->lock mutex while calling the create operation and when
    manipulating the devices list.

    The error path in the generic code gets slightly ugly because we have to
    take the mutex again and delete the device from the list, but holding
    the mutex during anon_inode_getfd or releasing/locking the mutex in the
    common non-error path seemed wrong.

    Signed-off-by: Christoffer Dall
    Reviewed-by: Paolo Bonzini
    Acked-by: Christian Borntraeger
    Signed-off-by: Radim Krčmář

    Christoffer Dall
     
  • As we are about to hold the kvm->lock during the create operation on KVM
    devices, we should move the call to xics_debugfs_init into its own
    function, since holding a mutex over extended amounts of time might not
    be a good idea.

    Introduce an init operation on the kvm_device_ops struct which cannot
    fail and call this, if configured, after the device has been created.

    Signed-off-by: Christoffer Dall
    Reviewed-by: Paolo Bonzini
    Signed-off-by: Radim Krčmář

    Christoffer Dall
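
    A minimal sketch of an ops table with an optional init step that cannot
    fail, in the spirit of the change above. The demo_* types and functions
    are hypothetical and do not reproduce the real kvm_device_ops layout.

```c
#include <stddef.h>

struct demo_device {
    int created;
    int initialised;
};

struct demo_device_ops {
    int  (*create)(struct demo_device *dev);  /* may fail */
    void (*init)(struct demo_device *dev);    /* optional, cannot fail */
};

/* Example callbacks, used only for illustration. */
static int demo_create(struct demo_device *dev)
{
    dev->created = 1;
    return 0;
}

static void demo_init(struct demo_device *dev)
{
    dev->initialised = 1;
}

int demo_device_setup(const struct demo_device_ops *ops,
                      struct demo_device *dev)
{
    int ret = ops->create(dev);

    if (ret)
        return ret;

    /* init runs after creation, and only if the device configured it. */
    if (ops->init)
        ops->init(dev);
    return 0;
}
```

    Because init returns void, the caller needs no extra error handling
    once create has succeeded.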
     

10 Aug, 2016

7 commits

  • When using if_changed, we need to add FORCE as a dependency (see
    Documentation/kbuild/makefiles.txt) otherwise we don't get command line
    change checking amongst other things. This has resulted in vdsos not
    being rebuilt when switching between big and little endian.

    The vdso64/32ld commands have to be changed around to avoid pulling
    FORCE into the linker command line (code copied from x86).

    Signed-off-by: Nicholas Piggin
    Signed-off-by: Michael Ellerman

    Nicholas Piggin
     
  • When we introduced the little endian support, we added the endian flags
    to CC directly using override. I don't know the history of why we did
    that; I suspect no one does.

    Although this mostly works, it has one bug, which is that CROSS32CC
    doesn't get -mbig-endian. That means when the compiler is little endian
    by default and the user is building big endian, vdso32 is incorrectly
    compiled as little endian and the kernel fails to build.

    Instead we can add the endian flags to cflags-y/aflags-y, and then
    append those to KBUILD_CFLAGS/KBUILD_AFLAGS.

    This has the advantage of being 1) less ugly, 2) the documented way of
    adding flags in the arch Makefile and 3) it fixes building vdso32 with a
    LE toolchain.

    Signed-off-by: Michael Ellerman

    Michael Ellerman
     
  • We cannot do those initializations from apply_feature_fixups() as
    this function runs in a very restricted environment on 32-bit where
    the kernel isn't running at its linked address and the PTRRELOC()
    macro must be used for any global access.

    Instead, split them into a separate setup_feature_keys() function
    which is called in a more suitable spot on ppc32.

    Fixes: 309b315b6ec6 ("powerpc: Call jump_label_init() in apply_feature_fixups()")
    Reported-and-tested-by: Christian Kujau
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Michael Ellerman

    Benjamin Herrenschmidt
     
  • We don't identify the machine type anymore...

    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Michael Ellerman

    Benjamin Herrenschmidt
     
  • This makes it easier to debug crashes that happen very early before
    the kernel takes over Open Firmware by allowing us to relate the OF
    reported crashing addresses to offsets within the kernel.

    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Michael Ellerman

    Benjamin Herrenschmidt
     
  • Commit 8d460f6156cd ("powerpc/process: Add the function
    flush_tmregs_to_thread") added flush_tmregs_to_thread() and included
    the assumption that it would only be called for a task which is not
    current.

    Although this is correct for ptrace, when generating a core dump, some
    of the routines which call flush_tmregs_to_thread() are called. This
    leads to a WARNing such as:

    Not expecting ptrace on self: TM regs may be incorrect
    ------------[ cut here ]------------
    WARNING: CPU: 123 PID: 7727 at arch/powerpc/kernel/process.c:1088 flush_tmregs_to_thread+0x78/0x80
    CPU: 123 PID: 7727 Comm: libvirtd Not tainted 4.8.0-rc1-gcc6x-g61e8a0d #1
    task: c000000fe631b600 task.stack: c000000fe63b0000
    NIP: c00000000001a1a8 LR: c00000000001a1a4 CTR: c000000000717780
    REGS: c000000fe63b3420 TRAP: 0700 Not tainted (4.8.0-rc1-gcc6x-g61e8a0d)
    MSR: 900000010282b033 CR: 28004222 XER: 20000000
    ...
    NIP [c00000000001a1a8] flush_tmregs_to_thread+0x78/0x80
    LR [c00000000001a1a4] flush_tmregs_to_thread+0x74/0x80
    Call Trace:
    flush_tmregs_to_thread+0x74/0x80 (unreliable)
    vsr_get+0x64/0x1a0
    elf_core_dump+0x604/0x1430
    do_coredump+0x5fc/0x1200
    get_signal+0x398/0x740
    do_signal+0x54/0x2b0
    do_notify_resume+0x98/0xb0
    ret_from_except_lite+0x70/0x74

    So fix flush_tmregs_to_thread() to detect the case where it is called on
    current, and a transaction is active, and in that case flush the TM regs
    to the thread_struct.

    This patch also moves flush_tmregs_to_thread() into ptrace.c as it is
    only called from that file.

    Fixes: 8d460f6156cd ("powerpc/process: Add the function flush_tmregs_to_thread")
    Signed-off-by: Cyril Bur
    [mpe: Flesh out change log]
    Signed-off-by: Michael Ellerman

    Cyril Bur
     
  • Commit 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic()
    based on copy_tofrom_user()") introduced a bug when the destination
    address is odd and the initial csum is not null.

    In that (rare) case the initial csum value has to be rotated by one
    byte, just as the resulting value is.

    This patch also fixes the related comments.

    Fixes: 7aef4136566b0 ("powerpc32: rewrite csum_partial_copy_generic() based on copy_tofrom_user()")
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman

    Christophe Leroy
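
    Why the initial value needs the same rotation can be seen in a small
    16-bit model of the ones' complement sum. This is a simplified sketch,
    not the 32-bit assembly routine: the function names are hypothetical,
    and the "odd" variants simply accumulate byte-swapped words to mimic
    data processed one byte out of phase.

```c
#include <stdint.h>

/* 16-bit ones' complement addition with end-around carry. */
static uint16_t add1c(uint16_t a, uint16_t b)
{
    uint32_t s = (uint32_t)a + b;
    return (uint16_t)((s & 0xffff) + (s >> 16));
}

/* Rotating a 16-bit value by one byte is a byte swap. */
static uint16_t swap16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Reference: fold the words into the initial checksum. */
uint16_t csum_ref(uint16_t init, const uint16_t *w, int n)
{
    uint16_t acc = init;
    for (int i = 0; i < n; i++)
        acc = add1c(acc, w[i]);
    return acc;
}

/* Out-of-phase path, buggy: only the result is rotated back. */
uint16_t csum_odd_buggy(uint16_t init, const uint16_t *w, int n)
{
    uint16_t acc = init;
    for (int i = 0; i < n; i++)
        acc = add1c(acc, swap16(w[i]));
    return swap16(acc);
}

/* Out-of-phase path, fixed: rotate the initial value as well. */
uint16_t csum_odd_fixed(uint16_t init, const uint16_t *w, int n)
{
    uint16_t acc = swap16(init);
    for (int i = 0; i < n; i++)
        acc = add1c(acc, swap16(w[i]));
    return swap16(acc);
}
```

    In this model the buggy variant still agrees with the reference when
    the initial value is zero, which matches the description of the bug as
    appearing only with a non-null initial csum.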
     

09 Aug, 2016

6 commits

  • The generic allocation code may sometimes decide to assign a prefetchable
    64-bit BAR to the M32 window. In fact it may also decide to allocate
    a 64-bit non-prefetchable BAR to the M64 one! So using the resource
    flags as a test to decide which window was used for PE allocation is
    just wrong and leads to insane PE numbers.

    Instead, compare the addresses to figure it out.

    Signed-off-by: Benjamin Herrenschmidt
    [mpe: Rename the function as agreed by Ben & Gavin]
    Signed-off-by: Michael Ellerman

    Benjamin Herrenschmidt
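
    The idea of classifying a BAR by where it was placed, rather than by
    its flags, can be sketched as below. The demo_* types, the inclusive
    ranges and the window layout are assumptions made for illustration;
    they are not the powernv data structures.

```c
#include <stdint.h>

enum demo_window { DEMO_WIN_NONE, DEMO_WIN_M32, DEMO_WIN_M64 };

/* Inclusive address range of a window or a BAR. */
struct demo_range {
    uint64_t start;
    uint64_t end;
};

static int demo_contains(const struct demo_range *outer,
                         const struct demo_range *inner)
{
    return inner->start >= outer->start && inner->end <= outer->end;
}

/*
 * Decide which window a BAR landed in purely from its addresses.
 * Resource flags (64-bit, prefetchable) are deliberately not consulted.
 */
enum demo_window demo_window_of(const struct demo_range *bar,
                                const struct demo_range *m32,
                                const struct demo_range *m64)
{
    if (demo_contains(m64, bar))
        return DEMO_WIN_M64;
    if (demo_contains(m32, bar))
        return DEMO_WIN_M32;
    return DEMO_WIN_NONE;
}
```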
     
  • When a machine check occurs with MSR(RI=0), the MC interrupt is
    unrecoverable and the kernel goes down the panic path. But the console
    message still shows it as recovered. This patch fixes the MCE console
    messages.

    Fixes: 36df96f8acaf ("powerpc/book3s: Decode and save machine check event.")
    Signed-off-by: Mahesh Salgaonkar
    Signed-off-by: Michael Ellerman

    Mahesh Salgaonkar
     
  • The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number
    based on device-tree properties") added code to read a 64-bit property
    from the device tree, and if not found read a 32-bit property (reg).

    There was a bug in the 32-bit case, on big endian machines, due to the
    use of the 64-bit value to read the 32-bit property. The cast of &prop
    means we end up writing to the high 32 bits of prop, leaving the low
    32 bits containing whatever junk was on the stack.

    If that junk value was non-zero, and < MAX_PHBS, we would end up using
    it as the PHB id. This results in users seeing what appear to be random
    PHB ids.

    Fix it by reading into a u32 property and then assigning that to the
    u64 value, letting the CPU do the correct conversions for us.

    Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties")
    Signed-off-by: Michael Ellerman

    Michael Ellerman
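
    The pitfall and the fix can be reproduced in plain C. This is a sketch
    under assumptions: demo_read_u32() is a hypothetical stand-in for the
    device-tree helper, the value 5 is arbitrary, and memcpy() models a
    32-bit store through a pointer to the start of the 64-bit variable.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper that returns a 32-bit property through a pointer. */
static void demo_read_u32(uint32_t *out)
{
    *out = 5;
}

/*
 * Buggy pattern: the 32-bit value is stored into the first four bytes of
 * a 64-bit variable. On big endian those bytes are the high half, so the
 * low half keeps whatever was there before (stack_junk simulates that).
 */
uint64_t demo_phb_id_buggy(uint64_t stack_junk)
{
    uint64_t prop = stack_junk;
    uint32_t val;

    demo_read_u32(&val);
    memcpy(&prop, &val, sizeof(val));
    return prop;
}

/* Fixed pattern: read into a u32, then assign; the compiler widens it. */
uint64_t demo_phb_id_fixed(uint64_t stack_junk)
{
    uint64_t prop = stack_junk;
    uint32_t val;

    demo_read_u32(&val);
    prop = val;
    return prop;
}
```

    The fixed variant returns the same value on either endianness, while
    the result of the buggy one depends on byte order and on the previous
    contents of the variable.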
     
  • This is a very minor/trivial fix for the output of PCI address on EEH
    logs. The PCI address on "OF node" field currently is using ":" as a
    separator for the function, but the usual separator is ".". This patch
    changes the separator to dot, so the PCI address is printed as usual.

    Signed-off-by: Guilherme G. Piccoli
    Reviewed-by: Gavin Shan
    Signed-off-by: Michael Ellerman

    Guilherme G. Piccoli
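
    For reference, the conventional domain:bus:device.function notation can
    be produced as in the sketch below. The helper name is hypothetical and
    the field widths follow the usual convention; this is not the EEH
    logging code itself.

```c
#include <stdio.h>

/*
 * Format a PCI address as dddd:bb:dd.f, with a dot before the function
 * number. Returns the number of characters snprintf() would write.
 */
int demo_format_pci_addr(char *buf, size_t len, unsigned int domain,
                         unsigned int bus, unsigned int dev,
                         unsigned int fn)
{
    return snprintf(buf, len, "%04x:%02x:%02x.%x", domain, bus, dev, fn);
}
```

    With domain 0, bus 1, device 0 and function 1 this yields
    "0000:01:00.1".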
     
  • Some powerpc builds fail with the following build error:

    In file included from ./arch/powerpc/include/asm/mmu_context.h:11:0,
    from arch/powerpc/kernel/vdso.c:28:
    arch/powerpc/include/asm/cputhreads.h: In function 'get_tensr':
    arch/powerpc/include/asm/cputhreads.h:101:2: error:
    implicit declaration of function 'cpu_has_feature'

    Fixes: b92a226e5284 ("powerpc: Move cpu_has_feature() to a separate file")
    Signed-off-by: Guenter Roeck
    Signed-off-by: Michael Ellerman

    Guenter Roeck
     
  • This patch fixes the following warning:
    arch/powerpc/platforms/pseries/hotplug-memory.c:323:29: error: 'lmb_to_memblock' defined but not used [-Werror=unused-function]
    static struct memory_block *lmb_to_memblock(struct of_drconf_cell *lmb)
    ^~~~~~~~~~~~~~~

    The only consumer of this function is 'dlpar_remove_lmb', which is
    enabled with CONFIG_MEMORY_HOTREMOVE, so move it into the same
    ifdef block.

    Signed-off-by: Michael Ellerman

    Alastair D'Silva
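
    The general shape of this kind of fix is sketched below: a static
    helper is placed inside the same preprocessor block as its only caller,
    so it is not compiled, and cannot be reported as unused, when the
    option is off. CONFIG_DEMO_HOTREMOVE and the demo_* functions are
    hypothetical, as is the -1 returned by the stub.

```c
/* CONFIG_DEMO_HOTREMOVE stands in for a real config option. */
#ifdef CONFIG_DEMO_HOTREMOVE

/* Helper with a single consumer, kept next to it inside the ifdef. */
static int demo_lookup_block(int id)
{
    return id * 2;
}

int demo_remove_block(int id)
{
    return demo_lookup_block(id);
}

#else

/* Stub used when the feature is compiled out. */
int demo_remove_block(int id)
{
    (void)id;
    return -1;
}

#endif
```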