08 Mar, 2020

1 commit

  • Merge Linux stable release v5.4.24 into imx_5.4.y

    * tag 'v5.4.24': (3306 commits)
    Linux 5.4.24
    blktrace: Protect q->blk_trace with RCU
    kvm: nVMX: VMWRITE checks unsupported field before read-only field
    ...

    Signed-off-by: Jason Liu

    Conflicts:
    arch/arm/boot/dts/imx6sll-evk.dts
    arch/arm/boot/dts/imx7ulp.dtsi
    arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
    drivers/clk/imx/clk-composite-8m.c
    drivers/gpio/gpio-mxc.c
    drivers/irqchip/Kconfig
    drivers/mmc/host/sdhci-of-esdhc.c
    drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
    drivers/net/can/flexcan.c
    drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
    drivers/net/ethernet/mscc/ocelot.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    drivers/net/phy/realtek.c
    drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
    drivers/perf/fsl_imx8_ddr_perf.c
    drivers/tee/optee/shm_pool.c
    drivers/usb/cdns3/gadget.c
    kernel/sched/cpufreq.c
    net/core/xdp.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c
    sound/soc/sof/core.c
    sound/soc/sof/imx/Kconfig
    sound/soc/sof/loader.c

    Jason Liu
     

29 Feb, 2020

6 commits

  • commit 50a175dd18de7a647e72aca7daf4744e3a5a81e3 upstream.

    With HW assistance all page tables must be 4k aligned, the 8xx drops
    the last 12 bits during the walk.

    Redefine HUGEPD_SHIFT_MASK to mask the last 12 bits out. HUGEPD_SHIFT_MASK
    is used for alignment of the page table cache.
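    The alignment requirement can be sketched in userspace C (the mask value and helper name here are illustrative, not the kernel's exact definitions): masking the low 12 bits out of a page-table address yields the 4k alignment the 8xx hardware walker assumes.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch: the 8xx HW walker drops the low 12 bits of a
 * page-table pointer, so tables must be 4k aligned.  A 0xfff mask,
 * as in the redefined HUGEPD_SHIFT_MASK, models that alignment. */
#define PT_ALIGN_MASK 0xfffUL

static uintptr_t pt_align_4k(uintptr_t addr)
{
    return addr & ~PT_ALIGN_MASK;
}
```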

    Fixes: 22569b881d37 ("powerpc/8xx: Enable 8M hugepage support with HW assistance")
    Cc: stable@vger.kernel.org # v5.0+
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/778b1a248c4c7ca79640eeff7740044da6a220a0.1581264115.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     
  • commit f2b67ef90b0d5eca0f2255e02cf2f620bc0ddcdb upstream.

    Commit 55c8fc3f4930 ("powerpc/8xx: reintroduce 16K pages with HW
    assistance") redefined pte_t as a struct of 4 pte_basic_t, because
    in 16K pages mode there are four identical entries in the
    page table. But the size of hugepage tables is calculated based
    on the size of (void *). Therefore, we end up with page tables
    of size 1k instead of 4k for 512k pages.

    As 512k hugepage tables are the same size as standard page tables,
    i.e. 4k, use the standard page tables instead of PGT_CACHE tables.
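    The size mismatch can be modelled with a few lines of C (types and entry count are illustrative): sizing the table by sizeof(void *) undercounts once pte_t becomes a struct of four entries.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model: in 16K-pages mode pte_t is four identical
 * pte_basic_t entries, so a 256-entry table is 4k.  Sizing it by
 * sizeof(void *) gives a different (too small on 32-bit) answer. */
typedef unsigned int pte_basic_t;              /* 32 bits on the 8xx */
typedef struct { pte_basic_t pte[4]; } pte_t;  /* 16K mode: 4 copies */

enum { ENTRIES = 256 };                        /* illustrative count */

static size_t table_size_by_ptr(void) { return ENTRIES * sizeof(void *); }
static size_t table_size_by_pte(void) { return ENTRIES * sizeof(pte_t); }
```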

    Fixes: 3fb69c6a1a13 ("powerpc/8xx: Enable 512k hugepage support with HW assistance")
    Cc: stable@vger.kernel.org # v5.0+
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/90ec56a2315be602494619ed0223bba3b0b8d619.1580997007.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     
  • commit 9eb425b2e04e0e3006adffea5bf5f227a896f128 upstream.

    Fixes: 12c3f1fd87bf ("powerpc/32s: get rid of CPU_FTR_601 feature")
    Cc: stable@vger.kernel.org # v5.4+
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/a99fc0ad65b87a1ba51cfa3e0e9034ee294c3e07.1582034961.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     
  • commit 2464cc4c345699adea52c7aef75707207cb8a2f6 upstream.

    After a treclaim, we expect to be in non-transactional state. If we
    don't clear the current thread's MSR[TS] before we get preempted, then
    tm_recheckpoint_new_task() will recheckpoint and we get rescheduled in
    suspended transaction state.

    When handling a signal caught in transactional state,
    handle_rt_signal64() calls get_tm_stackpointer() that treclaims the
    transaction using tm_reclaim_current() but without clearing the
    thread's MSR[TS]. This can cause the TM Bad Thing exception below if
    later we pagefault and get preempted trying to access the user's
    sigframe, using __put_user(). Afterwards, when we are rescheduled back
    into do_page_fault() (but now in suspended state since the thread's
    MSR[TS] was not cleared), upon executing 'rfid' after completion of
    the page fault handling, the exception is raised because a transition
    from suspended to non-transactional state is invalid.

    Unexpected TM Bad Thing exception at c00000000000de44 (msr 0x8000000302a03031) tm_scratch=800000010280b033
    Oops: Unrecoverable exception, sig: 6 [#1]
    LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
    CPU: 25 PID: 15547 Comm: a.out Not tainted 5.4.0-rc2 #32
    NIP: c00000000000de44 LR: c000000000034728 CTR: 0000000000000000
    REGS: c00000003fe7bd70 TRAP: 0700 Not tainted (5.4.0-rc2)
    MSR: 8000000302a03031 CR: 44000884 XER: 00000000
    CFAR: c00000000000dda4 IRQMASK: 0
    PACATMSCRATCH: 800000010280b033
    GPR00: c000000000034728 c000000f65a17c80 c000000001662800 00007fffacf3fd78
    GPR04: 0000000000001000 0000000000001000 0000000000000000 c000000f611f8af0
    GPR08: 0000000000000000 0000000078006001 0000000000000000 000c000000000000
    GPR12: c000000f611f84b0 c00000003ffcb200 0000000000000000 0000000000000000
    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000f611f8140
    GPR24: 0000000000000000 00007fffacf3fd68 c000000f65a17d90 c000000f611f7800
    GPR28: c000000f65a17e90 c000000f65a17e90 c000000001685e18 00007fffacf3f000
    NIP [c00000000000de44] fast_exception_return+0xf4/0x1b0
    LR [c000000000034728] handle_rt_signal64+0x78/0xc50
    Call Trace:
    [c000000f65a17c80] [c000000000034710] handle_rt_signal64+0x60/0xc50 (unreliable)
    [c000000f65a17d30] [c000000000023640] do_notify_resume+0x330/0x460
    [c000000f65a17e20] [c00000000000dcc4] ret_from_except_lite+0x70/0x74
    Instruction dump:
    7c4ff120 e8410170 7c5a03a6 38400000 f8410060 e8010070 e8410080 e8610088
    60000000 60000000 e8810090 e8210078 48000000 e8610178 88ed0989
    ---[ end trace 93094aa44b442f87 ]---

    The simplified sequence of events that triggers the above exception is:

    ... # userspace in NON-TRANSACTIONAL state
    tbegin # userspace in TRANSACTIONAL state
    signal delivery # kernelspace in SUSPENDED state
    handle_rt_signal64()
    get_tm_stackpointer()
    treclaim # kernelspace in NON-TRANSACTIONAL state
    __put_user()
    page fault happens. We will never get back here because of the TM Bad Thing exception.

    page fault handling kicks in and we voluntarily preempt ourselves
    do_page_fault()
    __schedule()
    __switch_to(other_task)

    our task is rescheduled and we recheckpoint because the thread's MSR[TS] was not cleared
    __switch_to(our_task)
    switch_to_tm()
    tm_recheckpoint_new_task()
    trechkpt # kernelspace in SUSPENDED state

    The page fault handling resumes, but now we are in suspended transaction state
    do_page_fault() completes
    rfid
    Acked-by: Michael Neuling
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20200211033831.11165-1-gustavold@linux.ibm.com
    Signed-off-by: Greg Kroah-Hartman

    Gustavo Luiz Duarte
     
  • commit d4f194ed9eb9841a8f978710e4d24296f791a85b upstream.

    Recovering a dead PHB can currently cause a deadlock as the PCI
    rescan/remove lock is taken twice.

    This is caused as part of an existing bug in
    eeh_handle_special_event(). The pe is processed while traversing the
    PHBs even though the pe is unrelated to the loop. This causes the pe
    to be, incorrectly, processed more than once.

    Untangling this section moves the pe processing out of the loop and
    also outside the locked section, correcting both problems.

    Fixes: 2e25505147b8 ("powerpc/eeh: Fix crash when edev->pdev changes")
    Cc: stable@vger.kernel.org # 5.4+
    Signed-off-by: Sam Bobroff
    Reviewed-by: Frederic Barrat
    Tested-by: Frederic Barrat
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/0547e82dbf90ee0729a2979a8cac5c91665c621f.1581051445.git.sbobroff@linux.ibm.com
    Signed-off-by: Greg Kroah-Hartman

    Sam Bobroff
     
  • commit a4031afb9d10d97f4d0285844abbc0ab04245304 upstream.

    In the ITLB miss handler, the line supposed to clear bits 20-23 of
    the L2 ITLB entry is buggy and in fact does nothing, leaving an
    undefined value which could allow execution when it shouldn't.

    Properly do the clearing with the relevant instruction.

    Fixes: 74fabcadfd43 ("powerpc/8xx: don't use r12/SPRN_SPRG_SCRATCH2 in TLB Miss handlers")
    Cc: stable@vger.kernel.org # v5.0+
    Signed-off-by: Christophe Leroy
    Reviewed-by: Leonardo Bras
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/4f70c2778163affce8508a210f65d140e84524b4.1581272050.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     

26 Feb, 2020

1 commit

  • There's an OF helper called of_dma_is_coherent(), which checks if a
    device has a "dma-coherent" property to see if the device is coherent
    for DMA.

    But on some platforms devices are coherent by default, and on some
    platforms it's not possible to update existing device trees to add the
    "dma-coherent" property.

    So add a Kconfig symbol to allow arch code to tell
    of_dma_is_coherent() that devices are coherent by default, regardless
    of the presence of the property.

    Select that symbol on powerpc when NOT_COHERENT_CACHE is not set,
    i.e. when the system has a coherent cache.
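    The resulting logic can be sketched like this (the flag name is illustrative; it stands in for the new Kconfig symbol rather than quoting the actual helper):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of the change: a device is treated as DMA
 * coherent either because the DT node has "dma-coherent", or because
 * the architecture opted in to coherent-by-default via the new
 * Kconfig symbol (modelled here as a plain flag). */
static bool arch_default_coherent;   /* stands in for the Kconfig symbol */

static bool dma_is_coherent(bool has_dma_coherent_prop)
{
    if (arch_default_coherent)
        return true;                 /* coherent regardless of the DT */
    return has_dma_coherent_prop;    /* otherwise require the property */
}
```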

    Fixes: 92ea637edea3 ("of: introduce of_dma_is_coherent() helper")
    Cc: stable@vger.kernel.org # v3.16+
    Reported-by: Christian Zigotzky
    Tested-by: Christian Zigotzky
    Signed-off-by: Michael Ellerman
    Reviewed-by: Ulf Hansson
    Signed-off-by: Rob Herring
    (cherry picked from commit dabf6b36b83a18d57e3d4b9d50544ed040d86255)
    (cherry picked from commit 43557841be9fbf4dc9e053944a2e896c4baea73b)

    Michael Ellerman
     

24 Feb, 2020

7 commits

  • [ Upstream commit 43e76cd368fbb67e767da5363ffeaa3989993c8c ]

    Commit 8580ac9404f6 ("bpf: Process in-kernel BTF") introduced two weak
    symbols that may be unresolved at link time which result in an absolute
    relocation to 0. relocs_check.sh emits the following warning:

    "WARNING: 2 bad relocations
    c000000001a41478 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_start
    c000000001a41480 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_end"

    whereas those relocations are legitimate even for a relocatable kernel
    compiled with -pie option.

    relocs_check.sh already excluded some weak unresolved symbols explicitly:
    remove those hardcoded symbols and add some logic that parses the symbols
    using nm, retrieves all the weak unresolved symbols and excludes those from
    the list of the potential bad relocations.
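    The filtering idea, transposed into a small C helper for illustration (the real change is shell in relocs_check.sh, so the parsing details here are assumptions): a line of `nm -u` output whose type column is "w" marks a weak undefined symbol to exclude.

```c
#include <assert.h>
#include <string.h>

/* Weak undefined symbols have no address in nm output, so the first
 * whitespace-separated field is the type letter "w"; ordinary bad
 * relocations start with an address instead. */
static int is_weak_undefined(const char *nm_line)
{
    char buf[128];
    char *type;

    strncpy(buf, nm_line, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    type = strtok(buf, " \t");       /* first field: type or address */
    return type && strcmp(type, "w") == 0;
}
```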

    Reported-by: Stephen Rothwell
    Signed-off-by: Alexandre Ghiti
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20200118170335.21440-1-alex@ghiti.fr
    Signed-off-by: Sasha Levin

    Alexandre Ghiti
     
  • [ Upstream commit 0f9aee0cb9da7db7d96f63cfa2dc5e4f1bffeb87 ]

    Running vdsotest repeatedly leaves the following log:

    [ 79.629901] vdsotest[396]: User access of kernel address (ffffffff) - exploit attempt? (uid: 0)

    A pointer set to (-1) is likely a programming error similar to
    a NULL pointer and is not worth logging as an exploit attempt.

    Don't log user accesses to 0xffffffff.

    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/0728849e826ba16f1fbd6fa7f5c6cc87bd64e097.1577087627.git.christophe.leroy@c-s.fr
    Signed-off-by: Sasha Levin

    Christophe Leroy
     
  • [ Upstream commit f1dbc1c5c70d0d4c60b5d467ba941fba167c12f6 ]

    Correct an overflow problem in the calculation and display of the
    Maximum Memory value reported to syscfg.
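    The bug class can be sketched as follows (function names and values are illustrative; the fix in the patch is casting n_lmbs to unsigned long before the multiply):

```c
#include <assert.h>
#include <stdint.h>

/* Multiplying two 32-bit quantities before widening truncates the
 * result; widening one operand first keeps the full value. */
static uint64_t max_mem_wrong(uint32_t n_lmbs, uint32_t lmb_size)
{
    return n_lmbs * lmb_size;            /* 32-bit multiply, wraps */
}

static uint64_t max_mem_right(uint32_t n_lmbs, uint32_t lmb_size)
{
    return (uint64_t)n_lmbs * lmb_size;  /* widened before multiply */
}
```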

    Signed-off-by: Michael Bringmann
    [mpe: Only n_lmbs needs casting to unsigned long]
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/5577aef8-1d5a-ca95-ff0a-9c7b5977e5bf@linux.ibm.com
    Signed-off-by: Sasha Levin

    Michael Bringmann
     
  • [ Upstream commit 1fb4124ca9d456656a324f1ee29b7bf942f59ac8 ]

    When disabling virtual functions on an SR-IOV adapter we currently do not
    correctly remove the EEH state for the now-dead virtual functions. When
    removing the pci_dn that was created for the VF when SR-IOV was enabled
    we free the corresponding eeh_dev without removing it from the child device
    list of the eeh_pe that contained it. This can result in crashes due to the
    use-after-free.
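    A minimal userspace model of this bug class (structures are illustrative, not the kernel's eeh_pe/eeh_dev): the device must be detached from the parent's child list before its memory goes away, otherwise the list keeps a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>

struct dev { struct dev *next; };
struct pe  { struct dev *children; };

/* Walk the singly linked child list and detach d before it is freed;
 * skipping this step is what leads to the use-after-free. */
static void pe_unlink(struct pe *pe, struct dev *d)
{
    struct dev **pp = &pe->children;

    while (*pp && *pp != d)
        pp = &(*pp)->next;
    if (*pp)
        *pp = d->next;               /* detach before the free */
}
```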

    Signed-off-by: Oliver O'Halloran
    Reviewed-by: Sam Bobroff
    Tested-by: Sam Bobroff
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20190821062655.19735-1-oohall@gmail.com
    Signed-off-by: Sasha Levin

    Oliver O'Halloran
     
  • [ Upstream commit 4de0a8355463e068e443b48eb5ae32370155368b ]

    Fixes gcc '-Wunused-but-set-variable' warning:

    arch/powerpc/kvm/emulate_loadstore.c: In function kvmppc_emulate_loadstore:
    arch/powerpc/kvm/emulate_loadstore.c:87:6: warning: variable ra set but not used [-Wunused-but-set-variable]
    arch/powerpc/kvm/emulate_loadstore.c: In function kvmppc_emulate_loadstore:
    arch/powerpc/kvm/emulate_loadstore.c:87:10: warning: variable rs set but not used [-Wunused-but-set-variable]
    arch/powerpc/kvm/emulate_loadstore.c: In function kvmppc_emulate_loadstore:
    arch/powerpc/kvm/emulate_loadstore.c:87:14: warning: variable rt set but not used [-Wunused-but-set-variable]

    They are not used since commit 2b33cb585f94 ("KVM: PPC: Reimplement
    LOAD_FP/STORE_FP instruction mmio emulation with analyse_instr() input")

    Reported-by: Hulk Robot
    Signed-off-by: zhengbin
    Signed-off-by: Paul Mackerras
    Signed-off-by: Sasha Levin

    zhengbin
     
  • [ Upstream commit 965c94f309be58fbcc6c8d3e4f123376c5970d79 ]

    An ioda_pe for each VF is allocated in pnv_pci_sriov_enable() before
    the pci_dev for the VF is created. We need to set the pe->pdev pointer
    at some point after the pci_dev is created. Currently we do that in:

    pcibios_bus_add_device()
    pnv_pci_dma_dev_setup() (via phb->ops.dma_dev_setup)
    /* fixup is done here */
    pnv_pci_ioda_dma_dev_setup() (via pnv_phb->dma_dev_setup)

    The fixup needs to be done before setting up DMA for the VF's PE,
    but there's no real reason to delay it until this point. Move the
    fixup into pnv_pci_ioda_fixup_iov() so the ordering is:

    pcibios_add_device()
    pnv_pci_ioda_fixup_iov() (via ppc_md.pcibios_fixup_sriov)

    pcibios_bus_add_device()
    ...

    This isn't strictly required, but it's a slightly more logical place
    to do the fixup and it simplifies pnv_pci_dma_dev_setup().

    Signed-off-by: Oliver O'Halloran
    Reviewed-by: Alexey Kardashevskiy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20200110070207.439-4-oohall@gmail.com
    Signed-off-by: Sasha Levin

    Oliver O'Halloran
     
  • [ Upstream commit 3b5b9997b331e77ce967eba2c4bc80dc3134a7fe ]

    On pseries there is a bug with adding hotplugged devices to an IOMMU
    group. For a number of dumb reasons fixing that bug first requires
    re-working how VFs are configured on PowerNV. For background, on
    PowerNV we use the pcibios_sriov_enable() hook to do two things:

    1. Create a pci_dn structure for each of the VFs, and
    2. Configure the PHB's internal BARs so the MMIO range for each VF
    maps to a unique PE.

    Roughly speaking a PE is the hardware counterpart to a Linux IOMMU
    group since all the devices in a PE share the same IOMMU table. A PE
    also defines the set of devices that should be isolated in response to
    a PCI error (i.e. bad DMA, UR/CA, AER events, etc). When isolated all
    MMIO and DMA traffic to and from devices in the PE is blocked by the
    root complex until the PE is recovered by the OS.

    The requirement to block MMIO causes a giant headache because the P8
    PHB generally uses a fixed mapping between MMIO addresses and PEs. As
    a result we need to delay configuring the IOMMU groups for devices
    until after MMIO resources are assigned. For physical devices (i.e.
    non-VFs) the PE assignment is done in pcibios_setup_bridge() which is
    called immediately after the MMIO resources for downstream
    devices (and the bridge's windows) are assigned. For VFs the setup is
    more complicated because:

    a) pcibios_setup_bridge() is not called again when VFs are activated, and
    b) The pci_dev for VFs are created by generic code which runs after
    pcibios_sriov_enable() is called.

    The work around for this is a two step process:

    1. A fixup in pcibios_add_device() is used to initialise the cached
    pe_number in pci_dn, then
    2. A bus notifier then adds the device to the IOMMU group for the PE
    specified in pci_dn->pe_number.

    A side effect of fixing the pseries bug mentioned in the first
    paragraph is moving the fixup out of pcibios_add_device() and into
    pcibios_bus_add_device(), which is called much later. This results in
    step 2. failing because pci_dn->pe_number won't be initialised when
    the bus notifier is run.

    We can fix this by removing the need for the fixup. The PE for a VF is
    known before the VF is even scanned so we can initialise
    pci_dn->pe_number in pcibios_sriov_enable() instead. Unfortunately,
    moving the initialisation causes two problems:

    1. We trip the WARN_ON() in the current fixup code, and
    2. The EEH core clears pdn->pe_number when recovering a VF and
    relies on the fixup to correctly re-set it.

    The only justification for either of these is a comment in
    eeh_rmv_device() suggesting that pdn->pe_number *must* be set to
    IODA_INVALID_PE in order for the VF to be scanned. However, this
    comment appears to have no basis in reality. Both bugs can be fixed by
    just deleting the code.

    Tested-by: Alexey Kardashevskiy
    Reviewed-by: Alexey Kardashevskiy
    Signed-off-by: Oliver O'Halloran
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20191028085424.12006-1-oohall@gmail.com
    Signed-off-by: Sasha Levin

    Oliver O'Halloran
     

15 Feb, 2020

6 commits

  • commit 7559d3d295f3365ea7ac0c0274c05e633fe4f594 upstream.

    By default a pseries guest supports a H_PUT_TCE hypercall which maps
    a single IOMMU page in a DMA window. Additionally the hypervisor may
    support H_PUT_TCE_INDIRECT/H_STUFF_TCE which update multiple TCEs at once;
    this is advertised via the device tree /rtas/ibm,hypertas-functions
    property which Linux converts to FW_FEATURE_MULTITCE.

    FW_FEATURE_MULTITCE is checked when dma_iommu_ops is used; however
    the code managing the huge DMA window (DDW) ignores it and calls
    H_PUT_TCE_INDIRECT even if it is explicitly disabled via
    the "multitce=off" kernel command line parameter.

    This adds FW_FEATURE_MULTITCE checking to the DDW code path.

    This changes tce_build_pSeriesLP to take liobn and page size, as
    the huge window does not have an iommu_table descriptor, which is
    usually the place to store these numbers.

    Fixes: 4e8b0cf46b25 ("powerpc/pseries: Add support for dynamic dma windows")
    Signed-off-by: Alexey Kardashevskiy
    Reviewed-by: Thiago Jung Bauermann
    Tested-by: Thiago Jung Bauermann
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20191216041924.42318-3-aik@ozlabs.ru
    Signed-off-by: Greg Kroah-Hartman

    Alexey Kardashevskiy
     
  • commit aff8c8242bc638ba57247ae1ec5f272ac3ed3b92 upstream.

    Commit e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to
    iommu_table") missed an iommu_table allocation in the pseries vio code.
    The iommu_table is allocated with kzalloc and as a result the associated
    kref gets a value of zero. This has the side effect that during a DLPAR
    remove of the associated virtual IOA the iommu_tce_table_put() triggers
    a use-after-free underflow warning.

    Call Trace:
    [c0000002879e39f0] [c00000000071ecb4] refcount_warn_saturate+0x184/0x190
    (unreliable)
    [c0000002879e3a50] [c0000000000500ac] iommu_tce_table_put+0x9c/0xb0
    [c0000002879e3a70] [c0000000000f54e4] vio_dev_release+0x34/0x70
    [c0000002879e3aa0] [c00000000087cfa4] device_release+0x54/0xf0
    [c0000002879e3b10] [c000000000d64c84] kobject_cleanup+0xa4/0x240
    [c0000002879e3b90] [c00000000087d358] put_device+0x28/0x40
    [c0000002879e3bb0] [c0000000007a328c] dlpar_remove_slot+0x15c/0x250
    [c0000002879e3c50] [c0000000007a348c] remove_slot_store+0xac/0xf0
    [c0000002879e3cd0] [c000000000d64220] kobj_attr_store+0x30/0x60
    [c0000002879e3cf0] [c0000000004ff13c] sysfs_kf_write+0x6c/0xa0
    [c0000002879e3d10] [c0000000004fde4c] kernfs_fop_write+0x18c/0x260
    [c0000002879e3d60] [c000000000410f3c] __vfs_write+0x3c/0x70
    [c0000002879e3d80] [c000000000415408] vfs_write+0xc8/0x250
    [c0000002879e3dd0] [c0000000004157dc] ksys_write+0x7c/0x120
    [c0000002879e3e20] [c00000000000b278] system_call+0x5c/0x68

    Further, since the refcount was always zero, iommu_tce_table_put()
    fails to call the iommu_table release function, resulting in a leak.

    Fix this issue by initializing the iommu_table kref immediately after
    allocation.
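    The failure mode can be modelled in userspace (a plain counter stands in for the kref; names are illustrative): a zeroed structure straight out of kzalloc() underflows on put() instead of releasing at the 1 -> 0 transition.

```c
#include <assert.h>
#include <string.h>

struct table { int refcount; int released; };

/* kzalloc() analogue: everything, including the refcount, starts at 0. */
static void table_alloc_zeroed(struct table *t) { memset(t, 0, sizeof(*t)); }

/* kref_put() analogue: release only on the 1 -> 0 transition. */
static int table_put(struct table *t)
{
    if (--t->refcount == 0) {
        t->released = 1;
        return 1;
    }
    return 0;
}
```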

    Fixes: e5afdf9dd515 ("powerpc/vfio_spapr_tce: Add reference counting to iommu_table")
    Signed-off-by: Tyrel Datwyler
    Reviewed-by: Alexey Kardashevskiy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/1579558202-26052-1-git-send-email-tyreld@linux.ibm.com
    Signed-off-by: Greg Kroah-Hartman

    Tyrel Datwyler
     
  • commit 5649607a8d0b0e019a4db14aab3de1e16c3a2b4f upstream.

    The string 'bus_desc.provider_name' allocated inside
    papr_scm_nvdimm_init() will leak if the call to
    nvdimm_bus_register() fails or when papr_scm_remove() is called.

    This minor patch ensures that 'bus_desc.provider_name' is freed in the
    error path for nvdimm_bus_register() as well as in papr_scm_remove().

    Fixes: b5beae5e224f ("powerpc/pseries: Add driver for PAPR SCM regions")
    Signed-off-by: Vaibhav Jain
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20200122155140.120429-1-vaibhav@linux.ibm.com
    Signed-off-by: Greg Kroah-Hartman

    Vaibhav Jain
     
  • commit f509247b08f2dcf7754d9ed85ad69a7972aa132b upstream.

    ptdump_check_wx() is called from mark_rodata_ro() which only exists
    when CONFIG_STRICT_KERNEL_RWX is selected.

    Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot")
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/922d4939c735c6b52b4137838bcc066fffd4fc33.1578989545.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     
  • commit e26ad936dd89d79f66c2b567f700e0c2a7103070 upstream.

    ptdump_check_wx() also has to be called when pages are mapped
    by blocks.

    Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot")
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/37517da8310f4457f28921a4edb88fb21d27b62a.1578989531.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     
  • commit d862b44133b7a1d7de25288e09eabf4df415e971 upstream.

    This reverts commit edea902c1c1efb855f77e041f9daf1abe7a9768a.

    At the time the change allowed direct DMA ops for secure VMs; however
    since then we switched to using SWIOTLB backed with IOMMU (direct mapping)
    and to make this work, we need dma_iommu_ops which handles all cases
    including TCE mapping I/O pages in the presence of an IOMMU.

    Fixes: edea902c1c1e ("powerpc/pseries/iommu: Don't use dma_iommu_ops on secure guests")
    Signed-off-by: Ram Pai
    [aik: added "revert" and "fixes:"]
    Signed-off-by: Alexey Kardashevskiy
    Reviewed-by: Thiago Jung Bauermann
    Tested-by: Thiago Jung Bauermann
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20191216041924.42318-2-aik@ozlabs.ru
    Signed-off-by: Greg Kroah-Hartman

    Ram Pai
     

11 Feb, 2020

14 commits

  • [ Upstream commit 1d8f739b07bd538f272f60bf53f10e7e6248d295 ]

    __builtin_constant_p() always returns 0 for pointers, so on RADIX
    we always end up opening both directions (by writing 0 to SPR29):

    0000000000000170 :
    ...
    1b0: 4c 00 01 2c isync
    1b4: 39 20 00 00 li r9,0
    1b8: 7d 3d 03 a6 mtspr 29,r9
    1bc: 4c 00 01 2c isync
    1c0: 48 00 00 01 bl 1c0
    1c0: R_PPC64_REL24 .__copy_tofrom_user
    ...
    0000000000000220 :
    ...
    2ac: 4c 00 01 2c isync
    2b0: 39 20 00 00 li r9,0
    2b4: 7d 3d 03 a6 mtspr 29,r9
    2b8: 4c 00 01 2c isync
    2bc: 7f c5 f3 78 mr r5,r30
    2c0: 7f 83 e3 78 mr r3,r28
    2c4: 48 00 00 01 bl 2c4
    2c4: R_PPC64_REL24 .__copy_tofrom_user
    ...

    Use an explicit parameter for direction selection, so that GCC
    is able to see it is a constant:

    00000000000001b0 :
    ...
    1f0: 4c 00 01 2c isync
    1f4: 3d 20 40 00 lis r9,16384
    1f8: 79 29 07 c6 rldicr r9,r9,32,31
    1fc: 7d 3d 03 a6 mtspr 29,r9
    200: 4c 00 01 2c isync
    204: 48 00 00 01 bl 204
    204: R_PPC64_REL24 .__copy_tofrom_user
    ...
    0000000000000260 :
    ...
    2ec: 4c 00 01 2c isync
    2f0: 39 20 ff ff li r9,-1
    2f4: 79 29 00 04 rldicr r9,r9,0,0
    2f8: 7d 3d 03 a6 mtspr 29,r9
    2fc: 4c 00 01 2c isync
    300: 7f c5 f3 78 mr r5,r30
    304: 7f 83 e3 78 mr r3,r28
    308: 48 00 00 01 bl 308
    308: R_PPC64_REL24 .__copy_tofrom_user
    ...
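    The root cause is easy to reproduce in userspace (GCC/Clang builtin; the noinline attribute is added here so the optimiser cannot specialise the pointer case away):

```c
#include <assert.h>

/* __builtin_constant_p() folds to 1 for an integer literal but is 0
 * for a pointer-valued function parameter, which is why the old macro
 * never saw a compile-time-constant direction for its pointer args. */
static int is_const_int(void)
{
    return __builtin_constant_p(42);
}

__attribute__((noinline))
static int is_const_ptr(const void *p)
{
    return __builtin_constant_p(p);
}
```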

    Signed-off-by: Christophe Leroy
    [mpe: Spell out the directions, s/KUAP_R/KUAP_READ/ etc.]
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/f4e88ec4941d5facb35ce75026b0112f980086c3.1579866752.git.christophe.leroy@c-s.fr
    Signed-off-by: Sasha Levin

    Christophe Leroy
     
  • [ Upstream commit f9b84e19221efc5f493156ee0329df3142085f28 ]

    Use kvm_vcpu_gfn_to_hva() when retrieving the host page size so that the
    correct set of memslots is used when handling x86 page faults in SMM.

    Fixes: 54bf36aac520 ("KVM: x86: use vcpu-specific functions to read/write/translate GFNs")
    Cc: stable@vger.kernel.org
    Signed-off-by: Sean Christopherson
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Sasha Levin

    Sean Christopherson
     
  • commit c3aae14e5d468d18dbb5d7c0c8c7e2968cc14aad upstream.

    Clang warns:

    ../arch/powerpc/boot/4xx.c:231:3: warning: misleading indentation;
    statement is not part of the previous 'else' [-Wmisleading-indentation]
    val = SDRAM0_READ(DDR0_42);
    ^
    ../arch/powerpc/boot/4xx.c:227:2: note: previous statement is here
    else
    ^

    This is because there is a space at the beginning of this line; remove
    it so that the indentation is consistent according to the Linux kernel
    coding style and clang no longer warns.

    Fixes: d23f5099297c ("[POWERPC] 4xx: Adds decoding of 440SPE memory size to boot wrapper library")
    Signed-off-by: Nathan Chancellor
    Reviewed-by: Nick Desaulniers
    Signed-off-by: Michael Ellerman
    Link: https://github.com/ClangBuiltLinux/linux/issues/780
    Link: https://lore.kernel.org/r/20191209200338.12546-1-natechancellor@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Nathan Chancellor
     
  • commit 0ed1325967ab5f7a4549a2641c6ebe115f76e228 upstream.

    Architectures for which we have hardware walkers of Linux page table
    should flush TLB on mmu gather batch allocation failures and batch flush.
    Some architectures like POWER supports multiple translation modes (hash
    and radix) and in the case of POWER only radix translation mode needs the
    above TLBI. This is because for hash translation mode kernel wants to
    avoid this extra flush since there are no hardware walkers of linux page
    table. With radix translation, the hardware also walks linux page table
    and with that, kernel needs to make sure to TLB invalidate page walk cache
    before page table pages are freed.

    More details in commit d86564a2f085 ("mm/tlb, x86/mm: Support invalidating
    TLB caches for RCU_TABLE_FREE")

    The changes to sparc are to make sure we keep the old behavior since we
    are now removing HAVE_RCU_TABLE_NO_INVALIDATE. The default value for
    tlb_needs_table_invalidate is to always force an invalidate and sparc can
    avoid the table invalidate. Hence we define tlb_needs_table_invalidate to
    false for sparc architecture.

    Link: http://lkml.kernel.org/r/20200116064531.483522-3-aneesh.kumar@linux.ibm.com
    Fixes: a46cc7a90fd8 ("powerpc/mm/radix: Improve TLB/PWC flushes")
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Michael Ellerman [powerpc]
    Cc: [4.14+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra
     
  • commit cb10bf9194f4d2c5d830eddca861f7ca0fecdbb4 upstream.

    Explicitly free the shared page if kvmppc_mmu_init() fails during
    kvmppc_core_vcpu_create(), as the page is freed only in
    kvmppc_core_vcpu_free(), which is not reached via kvm_vcpu_uninit().

    Fixes: 96bc451a15329 ("KVM: PPC: Introduce shared page")
    Cc: stable@vger.kernel.org
    Reviewed-by: Greg Kurz
    Signed-off-by: Sean Christopherson
    Acked-by: Paul Mackerras
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Sean Christopherson
     
  • commit 1a978d9d3e72ddfa40ac60d26301b154247ee0bc upstream.

    Call kvm_vcpu_uninit() if vcore creation fails to avoid leaking any
    resources allocated by kvm_vcpu_init(), i.e. the vcpu->run page.

    Fixes: 371fefd6f2dc4 ("KVM: PPC: Allow book3s_hv guests to use SMT processor modes")
    Cc: stable@vger.kernel.org
    Reviewed-by: Greg Kurz
    Signed-off-by: Sean Christopherson
    Acked-by: Paul Mackerras
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Sean Christopherson
     
  • commit 9dc086f1e9ef39dd823bd27954b884b2062f9e70 upstream.

    The early versions of our kernel user access prevention (KUAP) were
    written by Russell and Christophe, and didn't have separate
    read/write access.

    At some point I picked up the series and added the read/write access,
    but I failed to update the usages in futex.h to correctly allow read
    and write.

    However we didn't notice because of another bug which was causing the
    low-level code to always enable read and write. That bug was fixed
    recently in commit 1d8f739b07bd ("powerpc/kuap: Fix set direction in
    allow/prevent_user_access()").

    futex_atomic_cmpxchg_inatomic() is passed the user address as %3 and
    does:

    1: lwarx %1, 0, %3
    cmpw 0, %1, %4
    bne- 3f
    2: stwcx. %5, 0, %3

    Which clearly loads and stores from/to %3. The logic in
    arch_futex_atomic_op_inuser() is similar, so fix both of them to use
    allow_read_write_user().

    Without this fix, and with PPC_KUAP_DEBUG=y, we see eg:

    Bug: Read fault blocked by AMR!
    WARNING: CPU: 94 PID: 149215 at arch/powerpc/include/asm/book3s/64/kup-radix.h:126 __do_page_fault+0x600/0xf30
    CPU: 94 PID: 149215 Comm: futex_requeue_p Tainted: G W 5.5.0-rc7-gcc9x-g4c25df5640ae #1
    ...
    NIP [c000000000070680] __do_page_fault+0x600/0xf30
    LR [c00000000007067c] __do_page_fault+0x5fc/0xf30
    Call Trace:
    [c00020138e5637e0] [c00000000007067c] __do_page_fault+0x5fc/0xf30 (unreliable)
    [c00020138e5638c0] [c00000000000ada8] handle_page_fault+0x10/0x30
    --- interrupt: 301 at cmpxchg_futex_value_locked+0x68/0xd0
    LR = futex_lock_pi_atomic+0xe0/0x1f0
    [c00020138e563bc0] [c000000000217b50] futex_lock_pi_atomic+0x80/0x1f0 (unreliable)
    [c00020138e563c30] [c00000000021b668] futex_requeue+0x438/0xb60
    [c00020138e563d60] [c00000000021c6cc] do_futex+0x1ec/0x2b0
    [c00020138e563d90] [c00000000021c8b8] sys_futex+0x128/0x200
    [c00020138e563e20] [c00000000000b7ac] system_call+0x5c/0x68
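    The read+write requirement can be modelled in plain C. This is a
    standalone simulation of the behaviour described above, not kernel
    code: the helper names mirror the kernel's KUAP API, and the
    "window" flag stands in for the hardware AMR check.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated KUAP window: which user-access direction is currently open. */
enum kuap_dir { KUAP_NONE, KUAP_READ, KUAP_WRITE, KUAP_READ_WRITE };

static enum kuap_dir kuap_window = KUAP_NONE;

static void allow_write_user(void)      { kuap_window = KUAP_WRITE; }
static void allow_read_write_user(void) { kuap_window = KUAP_READ_WRITE; }
static void prevent_user_access(void)   { kuap_window = KUAP_NONE; }

/* cmpxchg on a "user" word: it both loads and stores through uaddr, so
 * it needs the read+write window; -1 simulates the KUAP fault. */
static int futex_cmpxchg_model(int *uaddr, int oldval, int newval)
{
    if (kuap_window != KUAP_READ_WRITE)
        return -1;              /* load or store blocked by AMR */
    if (*uaddr != oldval)
        return 1;               /* comparison failed, no store */
    *uaddr = newval;
    return 0;
}
```

    With only a write window open, the load side of the cmpxchg faults;
    opening the read+write window, as the fix does, lets it complete.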

    Fixes: de78a9c42a79 ("powerpc: Add a framework for Kernel Userspace Access Protection")
    Cc: stable@vger.kernel.org # v5.2+
    Reported-by: syzbot+e808452bad7c375cbee6@syzkaller-ppc64.appspotmail.com
    Signed-off-by: Michael Ellerman
    Reviewed-by: Christophe Leroy
    Link: https://lore.kernel.org/r/20200207122145.11928-1-mpe@ellerman.id.au
    Signed-off-by: Greg Kroah-Hartman

    Michael Ellerman
     
  • commit dabf6b36b83a18d57e3d4b9d50544ed040d86255 upstream.

    There's an OF helper called of_dma_is_coherent(), which checks if a
    device has a "dma-coherent" property to see if the device is coherent
    for DMA.

    But on some platforms devices are coherent by default, and on some
    platforms it's not possible to update existing device trees to add the
    "dma-coherent" property.

    So add a Kconfig symbol to allow arch code to tell
    of_dma_is_coherent() that devices are coherent by default, regardless
    of the presence of the property.

    Select that symbol on powerpc when NOT_COHERENT_CACHE is not set, ie.
    when the system has a coherent cache.
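    The effect of the new symbol can be sketched in standalone C. This
    is an illustrative model, not the kernel source; the symbol name
    OF_DMA_DEFAULT_COHERENT is assumed from the description above, and
    the real helper also walks parent nodes, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the Kconfig symbol the arch selects. */
#define CONFIG_OF_DMA_DEFAULT_COHERENT 1

/* Minimal stand-in for a device tree node. */
struct node_model { bool has_dma_coherent_prop; };

static bool of_dma_is_coherent_model(const struct node_model *np)
{
    if (CONFIG_OF_DMA_DEFAULT_COHERENT)
        return true;                       /* coherent by default */
    return np->has_dma_coherent_prop;      /* else require the property */
}
```

    A node without a "dma-coherent" property is still reported coherent
    when the arch has selected the default-coherent symbol.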

    Fixes: 92ea637edea3 ("of: introduce of_dma_is_coherent() helper")
    Cc: stable@vger.kernel.org # v3.16+
    Reported-by: Christian Zigotzky
    Tested-by: Christian Zigotzky
    Signed-off-by: Michael Ellerman
    Reviewed-by: Ulf Hansson
    Signed-off-by: Rob Herring
    Signed-off-by: Greg Kroah-Hartman

    Michael Ellerman
     
  • commit 9933819099c4600b41a042f27a074470a43cf6b9 upstream.

    Commit f7354ccac844 ("powerpc/32: Remove CURRENT_THREAD_INFO and
    rename TI_CPU") broke the CPU wake-up from sleep mode (i.e. when
    _TLF_SLEEPING is set) by delaying the tovirt(r2, r2).

    This is because r2 is not restored by fast_exception_return. It used
    to work (by chance?) because the CPU wake-up interrupt never comes
    from user, so r2 is expected to point to 'current' on return.

    Commit e2fb9f544431 ("powerpc/32: Prepare for Kernel Userspace Access
    Protection") broke it even more by clobbering r0 which is not
    restored by fast_exception_return either.

    Use r6 instead of r0. This is possible because r3-r6 are restored by
    fast_exception_return and only r3-r5 are used for exception arguments.

    For r2 it could be converted back to virtual address, but stay on the
    safe side and restore it from the stack instead. It should be live
    in the cache at that moment, so loading from the stack should make
    no difference compared to converting it from phys to virt.

    Fixes: f7354ccac844 ("powerpc/32: Remove CURRENT_THREAD_INFO and rename TI_CPU")
    Fixes: e2fb9f544431 ("powerpc/32: Prepare for Kernel Userspace Access Protection")
    Cc: stable@vger.kernel.org
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/6d02c3ae6ad77af34392e98117e44c2bf6d13ba1.1580121710.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     
  • commit 6ec20aa2e510b6297906c45f009aa08b2d97269a upstream.

    At the moment, bad_kuap_fault() reports a fault only if a bad access
    to userspace occurred while access to userspace was not granted.

    But if a fault occurs for a write outside the allowed userspace
    segment(s) that have been unlocked, bad_kuap_fault() fails to
    detect it and the kernel loops forever in do_page_fault().

    Fix it by checking that the accessed address is within the allowed
    range.
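    The range check can be sketched as a standalone model. This is an
    illustration of the logic described above, not the 32s
    implementation; the single [start, end) window is an assumption for
    clarity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simulated KUAP state: which user range (if any) has been unlocked. */
struct kuap_model { uintptr_t start, end; bool open; };

/* A user-access fault is "bad" if no window is open, or if the
 * faulting address lies outside the unlocked range; without the range
 * check, do_page_fault() would retry such a fault forever. */
static bool bad_kuap_fault_model(const struct kuap_model *k, uintptr_t addr)
{
    if (!k->open)
        return true;                          /* nothing was unlocked */
    return addr < k->start || addr >= k->end; /* outside unlocked range */
}
```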

    Fixes: a68c31fc01ef ("powerpc/32s: Implement Kernel Userspace Access Protection")
    Cc: stable@vger.kernel.org # v5.2+
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/f48244e9485ada0a304ed33ccbb8da271180c80d.1579866752.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     
  • commit fbee6ba2dca30d302efe6bddb3a886f5e964a257 upstream.

    In lmb_is_removable(), if a section is not present, it should continue
    to test the rest of the sections in the block. But the current code
    fails to do so.
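    The loop fix can be modelled in plain C. This is an illustrative
    sketch of the control flow, not the pseries implementation; the
    present/removable arrays stand in for the per-section checks.

```c
#include <assert.h>
#include <stdbool.h>

/* A section that is not present must be skipped with `continue` so the
 * remaining sections in the block are still tested. */
static bool lmb_is_removable_model(const bool *present,
                                   const bool *removable, int n)
{
    bool rc = true;
    for (int i = 0; i < n; i++) {
        if (!present[i])
            continue;           /* the fix: keep checking the rest */
        rc = rc && removable[i];
    }
    return rc;
}
```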

    Fixes: 51925fb3c5c9 ("powerpc/pseries: Implement memory hotplug remove in the kernel")
    Cc: stable@vger.kernel.org # v4.1+
    Signed-off-by: Pingfan Liu
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/1578632042-12415-1-git-send-email-kernelfans@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Pingfan Liu
     
  • commit c2a20711fc181e7f22ee5c16c28cb9578af84729 upstream.

    ASDR is HV-privileged and must only be accessed in HV-mode. This
    fixes a Program Check (0x700) seen when xmon running in a VM dumps
    SPRs.

    Fixes: d1e1b351f50f ("powerpc/xmon: Add ISA v3.0 SPRs to SPR dump")
    Cc: stable@vger.kernel.org # v4.14+
    Signed-off-by: Sukadev Bhattiprolu
    Reviewed-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20200107021633.GB29843@us.ibm.com
    Signed-off-by: Greg Kroah-Hartman

    Sukadev Bhattiprolu
     
  • commit d80ae83f1f932ab7af47b54d0d3bef4f4dba489f upstream.

    Verification cannot rely on simple bit checking, because on some
    platforms PAGE_RW is 0: checking that a page is not writable then
    means checking that PAGE_RO is set, rather than checking that
    PAGE_RW is not set.

    Use pte helpers instead of checking bits.
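    The encoding issue can be modelled in standalone C. The bit values
    below are illustrative, not the kernel's: they only demonstrate why
    a raw PAGE_RW bit test fails on a platform that encodes
    write-protection as "PAGE_RO set".

```c
#include <assert.h>
#include <stdbool.h>

/* On this modelled platform there is no RW bit at all ... */
#define PAGE_RW 0x0UL
/* ... instead, PAGE_RO set means "not writable". */
#define PAGE_RO 0x1UL

/* A pte helper that knows the platform's encoding. */
static bool pte_write_model(unsigned long pte) { return !(pte & PAGE_RO); }
```

    Here `(pte & PAGE_RW)` is always 0, so the naive test reports every
    page as non-writable, while the helper answers correctly.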

    Fixes: 453d87f6a8ae ("powerpc/mm: Warn if W+X pages found on boot")
    Cc: stable@vger.kernel.org # v5.2+
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/0d894839fdbb19070f0e1e4140363be4f2bb62fc.1578989540.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     
  • commit 12e4d53f3f04e81f9e83d6fc10edc7314ab9f6b9 upstream.

    Patch series "Fixup page directory freeing", v4.

    This is a repost of the patch series from Peter, with the
    arch-specific changes other than ppc64 dropped. The ppc64 changes
    are included here because we are redoing the series on top of the
    ppc64 changes, which makes them easy to backport. Only the first 2
    patches need to be backported to stable.

    The thing is, on anything SMP, freeing page directories should observe the
    exact same order as normal page freeing:

    1) unhook page/directory
    2) TLB invalidate
    3) free page/directory

    Without this, any concurrent page-table walk could end up with a
    Use-after-Free. This is esp. trivial for anything that has software
    page-table walkers (HAVE_FAST_GUP / software TLB fill) or the hardware
    caches partial page-walks (ie. caches page directories).

    Even on UP this might give issues since mmu_gather is preemptible these
    days. An interrupt or preempted task accessing user pages might stumble
    into the free page if the hardware caches page directories.
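    The ordering argument can be made concrete with a toy model. This
    is an illustration only, not kernel code: the walker's pointer copy
    stands in for a lockless page-table walk, and a `freed` flag stands
    in for actually freeing the page so the race is observable.

```c
#include <assert.h>
#include <stdbool.h>

struct page_model  { bool freed; };
struct table_model { struct page_model *slot; };

/* Returns true if the concurrent walker dereferenced a freed page. */
static bool race_with_walker(bool free_before_invalidate)
{
    struct page_model pg = { .freed = false };
    struct table_model tbl = { .slot = &pg };

    struct page_model *snap = tbl.slot; /* walker loads the pointer    */

    tbl.slot = 0;                       /* 1) unhook                   */
    if (free_before_invalidate)
        pg.freed = true;                /* 3) free too early: the bug  */
    bool saw_freed = snap->freed;       /* walker finishes its access  */
    /* 2) invalidate / grace period: only now is the walker gone */
    if (!free_before_invalidate)
        pg.freed = true;                /* 3) free, safely after 2)    */
    return saw_freed;
}
```

    Freeing before the invalidate step lets the walker observe freed
    memory; keeping the unhook, invalidate, free order does not.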

    This patch series fixes ppc64 and add generic MMU_GATHER changes to
    support the conversion of other architectures. I haven't added patches
    w.r.t other architecture because they are yet to be acked.

    This patch (of 9):

    A followup patch is going to make sure we correctly invalidate the
    page walk cache before we free page table pages. In order to keep
    things simple, enable RCU_TABLE_FREE even for !SMP so that we don't
    have to fix up the !SMP case differently in the followup patch.

    The !SMP case is currently broken for radix translation w.r.t the
    page walk cache flush. We can get interrupted between freeing a
    page table and flushing, which would leave page walk cache entries
    pointing to tables which have already been freed. Michael said "both
    our platforms that run on Power9 force SMP on in Kconfig, so the
    !SMP case is unlikely to be a problem for anyone in practice,
    unless they've hacked their kernel to build it !SMP."

    Link: http://lkml.kernel.org/r/20200116064531.483522-2-aneesh.kumar@linux.ibm.com
    Signed-off-by: Aneesh Kumar K.V
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Michael Ellerman
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Aneesh Kumar K.V
     

06 Feb, 2020

1 commit

  • [ Upstream commit 73d527aef68f7644e59f22ce7f9ac75e7b533aea ]

    Add fsl,erratum-a011043 to internal MDIO buses.
    Software may get a false read error when reading internal
    PCS registers through MDIO. As a workaround, all internal
    MDIO accesses should ignore the MDIO_CFG[MDIO_RD_ER] bit.

    Signed-off-by: Madalin Bucur
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Madalin Bucur
     

29 Jan, 2020

2 commits

  • commit 17328f218fb760c9c6accc5b52494889243a6b98 upstream.

    A load on an ESB page returning all 1's means that the underlying
    device has invalidated the access to the PQ state of the interrupt
    through mmio. It may happen, for example when querying a PHB interrupt
    while the PHB is in an error state.

    In that case, we should consider the interrupt to be invalid when
    checking its state in the irq_get_irqchip_state() handler.
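    The check can be sketched in standalone C. This is an illustrative
    model, not the kernel source; XIVE_ESB_INVALID here simply stands
    for the all-ones value an ESB load returns when the device has
    revoked access to the PQ state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define XIVE_ESB_INVALID 0xffu  /* all 1's from the MMIO load */

/* An all-ones load means the PQ state is unreadable: report the
 * interrupt as invalid instead of decoding the bits. */
static bool irq_state_readable(uint8_t esb_load)
{
    return esb_load != XIVE_ESB_INVALID;
}
```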

    Fixes: da15c03b047d ("powerpc/xive: Implement get_irqchip_state method for XIVE to fix shutdown race")
    Cc: stable@vger.kernel.org # v5.4+
    Signed-off-by: Frederic Barrat
    [clg: wrote a commit log, introduced XIVE_ESB_INVALID]
    Signed-off-by: Cédric Le Goater
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20200113130118.27969-1-clg@kaod.org
    Signed-off-by: Greg Kroah-Hartman

    Frederic Barrat
     
  • commit 5d2e5dd5849b4ef5e8ec35e812cdb732c13cd27e upstream.

    Commit 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in
    the same 0xc range") has a bug in the definition of MIN_USER_CONTEXT.

    The result is that the context id used for the vmemmap and the lowest
    context id handed out to userspace are the same. The context id is
    essentially the process identifier as far as the first stage of the
    MMU translation is concerned.

    This can result in multiple SLB entries with the same VSID (Virtual
    Segment ID), accessible to the kernel and to some random userspace
    process that happens to get the overlapping id, which is not
    expected, eg:

    07 c00c000008000000 40066bdea7000500 1T ESID= c00c00 VSID= 66bdea7 LLP:100
    12 0002000008000000 40066bdea7000d80 1T ESID= 200 VSID= 66bdea7 LLP:100

    Even though the user process and the kernel use the same VSID, the
    permissions in the hash page table prevent the user process from
    reading or writing to any kernel mappings.

    It can also lead to SLB entries with different base page size
    encodings (LLP), eg:

    05 c00c000008000000 00006bde0053b500 256M ESID=c00c00000 VSID= 6bde0053b LLP:100
    09 0000000008000000 00006bde0053bc80 256M ESID= 0 VSID= 6bde0053b LLP: 0

    Such SLB entries can result in machine checks, eg. as seen on a G5:

    Oops: Machine check, sig: 7 [#1]
    BE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=4 NUMA PowerMac
    NIP: c00000000026f248 LR: c000000000295e58 CTR: 0000000000000000
    REGS: c0000000erfd3d70 TRAP: 0200 Tainted: G M (5.5.0-rc1-gcc-8.2.0-00010-g228b667d8ea1)
    MSR: 9000000000109032 CR: 24282048 XER: 00000000
    DAR: c00c000000612c80 DSISR: 00000400 IRQMASK: 0
    ...
    NIP [c00000000026f248] .kmem_cache_free+0x58/0x140
    LR [c000000000295e58] .putname+0xb8/0xa0
    Call Trace:
    .putname+0xb8/0xa0
    .filename_lookup.part.76+0xbe/0x160
    .do_faccessat+0xe0/0x380
    system_call+0x5c/0x68

    This happens with 256MB segments and 64K pages, as the duplicate VSID
    is hit with the first vmemmap segment and the first user segment, and
    older 32-bit userspace maps things in the first user segment.

    On other CPUs a machine check is not seen. Instead the userspace
    process can get stuck continuously faulting, with the fault never
    properly serviced, due to the kernel not understanding that there is
    already a HPTE for the address but with inaccessible permissions.

    On machines with 1T segments we've not seen the bug hit other than by
    deliberately exercising it. That seems to be just a matter of luck
    though, due to the typical layout of the user virtual address space
    and the ranges of vmemmap that are typically populated.

    To fix it we add 2 to MIN_USER_CONTEXT. This ensures the lowest
    context given to userspace doesn't overlap with the VMEMMAP context,
    or with the context for INVALID_REGION_ID.

    Fixes: 0034d395f89d ("powerpc/mm/hash64: Map all the kernel regions in the same 0xc range")
    Cc: stable@vger.kernel.org # v5.2+
    Reported-by: Christian Marillat
    Reported-by: Romain Dolbeau
    Signed-off-by: Aneesh Kumar K.V
    [mpe: Account for INVALID_REGION_ID, mostly rewrite change log]
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20200123102547.11623-1-mpe@ellerman.id.au
    Signed-off-by: Greg Kroah-Hartman

    Aneesh Kumar K.V
     

26 Jan, 2020

2 commits

  • commit b6afd1234cf93aa0d71b4be4788c47534905f0be upstream.

    Commit 01c9348c7620ec65 ("powerpc: Use hardware RNG for
    arch_get_random_seed_* not arch_get_random_*") updated
    arch_get_random_[int|long]() to be NOPs, and moved the hardware
    RNG backing to arch_get_random_seed_[int|long]() instead. However, it
    failed to take into account that arch_get_random_int() was implemented
    in terms of arch_get_random_long(), and so we ended up with a version
    of the former that is essentially a NOP as well.

    Fix this by calling arch_get_random_seed_long() from
    arch_get_random_seed_int() instead.
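    The shape of the fix can be shown with a standalone model. This is
    an illustration, not the powerpc source: the fake seed value stands
    in for the hardware RNG, and the NOP long variant mirrors the
    post-01c9348c behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>

/* After the earlier commit, this is a NOP: no entropy supplied. */
static bool arch_get_random_long_model(unsigned long *v)
{
    (void)v;
    return false;
}

/* Backed by the hardware RNG (faked here with a constant). */
static bool arch_get_random_seed_long_model(unsigned long *v)
{
    *v = 0x123456789abcdefUL;
    return true;
}

/* The fix: delegate to the seed variant, not the NOP long variant. */
static bool arch_get_random_seed_int_model(unsigned int *v)
{
    unsigned long val;
    bool rc = arch_get_random_seed_long_model(&val);
    *v = (unsigned int)val;
    return rc;
}
```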

    Fixes: 01c9348c7620ec65 ("powerpc: Use hardware RNG for arch_get_random_seed_* not arch_get_random_*")
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20191204115015.18015-1-ardb@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Ard Biesheuvel
     
  • commit 71eb40fc53371bc247c8066ae76ad9e22ae1e18d upstream.

    When enabling CONFIG_RELOCATABLE and CONFIG_KASAN on FSL_BOOKE,
    the kernel doesn't boot.

    relocate_init() requires KASAN early shadow area to be set up because
    it needs access to the device tree through generic functions.

    Call kasan_early_init() before calling relocate_init().
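    The ordering constraint can be modelled in a few lines. This is a
    toy illustration, not the powerpc boot code: a flag stands in for
    the KASAN early shadow area, and a false return simulates the boot
    failure when relocate_init() runs first.

```c
#include <assert.h>
#include <stdbool.h>

static bool kasan_shadow_ready;

static void kasan_early_init_model(void) { kasan_shadow_ready = true; }

/* relocate_init() reads the device tree through helpers that, with
 * KASAN, touch the shadow area; returns false if it isn't ready. */
static bool relocate_init_model(void) { return kasan_shadow_ready; }
```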

    Reported-by: Lexi Shao
    Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
    Signed-off-by: Christophe Leroy
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/b58426f1664a4b344ff696d18cacf3b3e8962111.1575036985.git.christophe.leroy@c-s.fr
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy