01 May, 2014

4 commits


30 Apr, 2014

2 commits

  • Unaligned stores take alignment exceptions on POWER7 running in little-endian.
    This is a dumb little-endian base memcpy that prevents unaligned stores.
    Once booted the feature fixup code switches over to the VMX copy loops
    (which are already endian safe).

    The question is what we do before that switch over. The base 64bit
    memcpy takes alignment exceptions on POWER7 so we can't use it as is.
    Fixing the causes of alignment exception would slow it down, because
    we'd need to ensure all loads and stores are aligned either through
    rotate tricks or bytewise loads and stores. Either would be bad for
    all other 64bit platforms.

    [ I simplified the loop a bit - Anton ]

    Signed-off-by: Philippe Bergheaud
    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Philippe Bergheaud
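
A minimal C sketch of the idea behind a copy that cannot issue unaligned stores: move one byte at a time, so every access is naturally aligned. The name `bytewise_memcpy` is hypothetical and this is not the kernel routine touched by the commit; it only illustrates why such a loop is safe and why it is slower than a word-at-a-time copy.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not the kernel's memcpy: copy n bytes using only
 * byte loads and stores, so no access can ever be misaligned.
 */
void *bytewise_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* One byte per iteration: slow, but alignment never matters. */
    while (n--)
        *d++ = *s++;

    return dest;
}
```

A loop like this only needs to serve until a faster, alignment-tolerant routine takes over, which matches the "before that switch over" window the message describes.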
     
  • Merge Linus tree to get "cpufreq, powernv: Fix build failure on UP"
    to avoid build breakages in some of my test configs.

    Benjamin Herrenschmidt
     

28 Apr, 2014

34 commits

  • If we do a treclaim and we are not in TM suspend mode, it results in a TM bad
    thing (i.e. a 0x700 program check). Similarly, if we do a trechkpt and we have
    an active transaction or TEXASR Failure Summary (FS) is not set, we also take a
    TM bad thing.

    This should never happen, but if it does (i.e. a kernel bug), the cause is
    almost impossible to debug as the GPR state is mostly userspace and hence we
    don't get a call chain.

    This adds checks for these cases and triggers a BUG_ON() (in asm) if we ever
    hit them. It also moves the register saving around to preserve r1 until
    later.

    Signed-off-by: Michael Neuling
    Signed-off-by: Benjamin Herrenschmidt

    Michael Neuling
     
  • We save r1 to the scratch SPR and restore it from there after the trechkpt so
    saving r1 to the paca is not needed.

    Signed-off-by: Michael Neuling
    Signed-off-by: Benjamin Herrenschmidt

    Michael Neuling
     
  • Implement a method named pnv_get_proc_freq(unsigned int cpu) which
    returns the current clock rate on the 'cpu' in Hz to be reported in
    /proc/cpuinfo. This method uses the value reported by cpufreq when
    such a value is sane. Otherwise it falls back to the old way of reporting
    the clock rate, i.e. ppc_proc_freq.

    Set the ppc_md.get_proc_freq() hook to pnv_get_proc_freq() on the
    PowerNV platform.

    Signed-off-by: Gautham R. Shenoy
    Signed-off-by: Benjamin Herrenschmidt

    Gautham R. Shenoy
     
  • Currently, the code in setup-common.c for powerpc assumes that all
    clock rates are the same in an SMP system. This value is cached in the
    variable named ppc_proc_freq and is the value that is reported in
    /proc/cpuinfo.

    However, on the PowerNV platform, the clock rate is the same only across
    the threads of the same core. Hence the value that is reported in
    /proc/cpuinfo is incorrect on PowerNV platforms. We need a better way
    to query and report the correct value of the processor clock in
    /proc/cpuinfo.

    The patch achieves this by creating a machdep_call named
    get_proc_freq() which is expected to return the frequency in Hz. The
    code in show_cpuinfo() can invoke this method to display the correct
    clock rate on platforms that have implemented this method. On the
    other powerpc platforms it can use the value cached in ppc_proc_freq.

    Signed-off-by: Gautham R. Shenoy
    Signed-off-by: Benjamin Herrenschmidt

    Gautham R. Shenoy
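
The hook-with-fallback pattern described above can be sketched in plain C as follows. Everything suffixed `_sketch`, the function `cpuinfo_clock_hz`, the example hook, and all clock values are hypothetical stand-ins for illustration, not the kernel's actual definitions.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the kernel's machdep_calls. */
struct machdep_calls_sketch {
    unsigned long (*get_proc_freq)(unsigned int cpu);
};

/* Stand-in for the cached, system-wide clock rate in Hz (value assumed). */
static unsigned long ppc_proc_freq_sketch = 3000000000UL;

/* The hook is unset (NULL) unless a platform installs one. */
static struct machdep_calls_sketch ppc_md_sketch;

/* Rate to report for 'cpu': the platform hook if set, else the cached value. */
unsigned long cpuinfo_clock_hz(unsigned int cpu)
{
    if (ppc_md_sketch.get_proc_freq)
        return ppc_md_sketch.get_proc_freq(cpu);
    return ppc_proc_freq_sketch;
}

/* Example platform hook: pretend each core of 8 threads has its own rate. */
unsigned long example_get_proc_freq(unsigned int cpu)
{
    return 2000000000UL + (cpu / 8) * 100000000UL;
}
```

Platforms that never set the hook keep their existing behaviour, so the change is opt-in per platform.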
     
  • Firmware update on PowerNV platform takes several minutes. During
    this time one CPU is stuck in FW and the kernel complains about "soft
    lockups".

    This patch returns all secondary CPUs to firmware before starting
    the firmware update process.

    [ Reworked a bit and cleaned up -- BenH ]

    Signed-off-by: Vasant Hegde
    Signed-off-by: Benjamin Herrenschmidt

    Vasant Hegde
     
  • This patch updates the implementation of pci_process_bridge_OF_ranges to use
    the of_pci_range_parser helpers.

    Signed-off-by: Andrew Murray
    Signed-off-by: Benjamin Herrenschmidt

    Andrew Murray
     
  • This patch adds support to legacy serial for
    UARTS with shifted registers.

    The MVME5100 Single Board Computer is a PowerPC platform
    that has 16550 style UARTS with register addresses that are
    16 bytes apart (shifted by 4).

    Commit 309257484cc1a592e8ac5fbdd8cd661be2b80bf8
    "powerpc: Cleanup udbg_16550 and add support for LPC PIO-only UARTs"
    added support to udbg_16550 for shifted registers by adding a "stride"
    parameter to the initialisation operations for Programmed IO and
    Memory Mapped IO.

    As a consequence it is now possible to use the services of legacy serial
    to provide early serial console messages for the MVME5100.

    An added benefit of this is that the serial console will always be
    "ttyS0" irrespective of whether the computer is fitted with extra
    PCI 8250 interface boards or not.

    I have tested this patch using the four PowerPC platforms available to me:

    MVME5100 - shifted registers,
    SAM440EP - unshifted registers,
    MPC8349 - unshifted registers,
    MVME4100 - unshifted registers.

    Signed-off-by: Stephen Chivers
    Signed-off-by: Benjamin Herrenschmidt

    Stephen Chivers
     
  • The code is only slightly modified : entry points now use the
    FIXUP_ENDIAN trampoline to switch endian order. The 32bit wrapper
    is kept for big endian kernels and 64bit is enforced for little
    endian kernels with a PPC64_BOOT_WRAPPER config option.

    The linker script is generated using the kernel preprocessor flags
    to make use of the CONFIG_* definitions and the wrapper script is
    modified to take into account the new elf64ppc format.

    Finally, the zImage file is compiled as a position independent
    executable (-pie) which makes it loadable at any address by the
    firmware.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • When entering the boot wrapper in little endian, we will need to fix
    the endian order using a fixup trampoline like in the kernel. This
    patch overrides the _zimage_start entry point for this purpose.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • This patch adds support for a 64bit wrapper entry point. As in 32bit, the
    entry point does its own relocation and can be loaded at any address
    by the firmware.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • This patch defines a 'prom' routine similar to 'enter_prom' in the
    kernel.

    The difference is in the MSR which is built before entering prom. Big
    endian order is enforced as in the kernel but 32bit mode is not. It
    prepares ground for the next patches which will introduce Little endian
    order.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • It could certainly be improved using Elf macros and byteswapping
    routines, but the initial version of the code is organised to be a
    single file program with limited dependencies. yaboot is the same.

    Please scream if you want a total rewrite.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • These are not the most efficient versions of swab but the wrapper does
    not do much byte swapping. On a big endian cpu, these routines are
    a no-op.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
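
For illustration, a simple (not optimal) 32-bit byte swap and a conversion helper that is a no-op on big endian could look like the sketch below. The `_sketch` names are hypothetical, and the endianness test assumes the GCC/Clang `__BYTE_ORDER__` predefined macros; a compiler without them is treated as little endian here.

```c
#include <stdint.h>

/* Straightforward (not the fastest) byte reversal of a 32-bit value. */
uint32_t swab32_sketch(uint32_t x)
{
    return ((x & 0x000000ffu) << 24) |
           ((x & 0x0000ff00u) << 8)  |
           ((x & 0x00ff0000u) >> 8)  |
           ((x & 0xff000000u) >> 24);
}

/* Convert a CPU-order value to big-endian order. */
uint32_t cpu_to_be32_sketch(uint32_t x)
{
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return x;                 /* already big endian: no-op */
#else
    return swab32_sketch(x);  /* assumed little endian: reverse the bytes */
#endif
}
```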
     
  • arch/powerpc/boot/oflib.c:211:9: warning: cast to pointer from integer of \
    different size [-Wint-to-pointer-cast]
    return (phandle) of_call_prom("finddevice", 1, 1, name);

    This is a workaround. The definitive solution would be to define the
    phandle typedef as a u32, as in the kernel, but this would break the
    device tree ops API.

    Let it be for the moment.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • This makes ihandle 64bit friendly.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • This patch fixes 64bit compile warnings and updates the wrapper code
    to converge with the kernel code in prom_init.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • This is mostly useful to bring the boot wrapper code closer to
    the kernel code in prom_init.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • Values will need to be byte-swapped when calling prom (big endian) from
    a little endian boot wrapper.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • This patch updates the wrapper code to converge with the kernel code in
    prom_init.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • This patch fixes warnings when the wrapper is compiled in 64bit and
    updates the boot wrapper code related to prom to converge with the
    kernel code in prom_init. This should make the review of changes easier.

    The kernel has a different number of possible arguments (10) when
    entering prom. There does not seem to be any good reason to have
    12 in the wrapper, so the patch changes this value to args[10] in
    the prom_args struct.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • When the boot wrapper is compiled in 64bit, there is no need to
    use __div64_32.

    Signed-off-by: Cédric Le Goater
    Signed-off-by: Benjamin Herrenschmidt

    Cédric Le Goater
     
  • Function early_init_dt_scan_fw_dump() is called to scan the device
    tree for fdump properties under node "rtas". If any one of them is
    invalid, we can stop scanning the device tree early by returning
    "1". This saves a bit of time during boot.

    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     
  • If the PE contains a single PCI function, "pe->pbus" would be NULL,
    so it cannot reliably be used by pci_domain_nr(). Instead, we grab the
    PCI domain number from the PCI host controller (struct pci_controller)
    instance.

    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     
  • In function pnv_pci_ioda2_setup_dma_pe(), the IOMMU table type is
    set to (TCE_PCI_SWINV_CREATE | TCE_PCI_SWINV_FREE) unconditionally.
    It was just set to TCE_PCI by pnv_pci_setup_iommu_table(). So the
    primary IOMMU table type (TCE_PCI) is lost. The patch fixes it.

    Also, pnv_pci_setup_iommu_table() already set "tbl->it_busno" to
    zero and we needn't do it again. The patch removes the redundant
    assignment.

    The patch also fixes similar issues in pnv_pci_ioda_setup_dma_pe().

    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
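
The lost-flag problem described above comes down to assignment versus OR-assignment on a flags word. In the sketch below the flag values, the struct, and the function names are all hypothetical; it shows the general pattern rather than the patch itself.

```c
#include <stdint.h>

/* Hypothetical flag values; the kernel's real definitions may differ. */
#define TCE_PCI_SKETCH              0x1u
#define TCE_PCI_SWINV_CREATE_SKETCH 0x2u
#define TCE_PCI_SWINV_FREE_SKETCH   0x4u

/* Cut-down stand-in for an IOMMU table descriptor. */
struct iommu_table_sketch {
    uint32_t it_type;
};

/* Buggy pattern: plain assignment discards the type set earlier. */
void set_swinv_overwrite(struct iommu_table_sketch *tbl)
{
    tbl->it_type = TCE_PCI_SWINV_CREATE_SKETCH | TCE_PCI_SWINV_FREE_SKETCH;
}

/* One way to keep the earlier type: OR the new bits in. */
void set_swinv_preserve(struct iommu_table_sketch *tbl)
{
    tbl->it_type |= TCE_PCI_SWINV_CREATE_SKETCH | TCE_PCI_SWINV_FREE_SKETCH;
}
```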
     
  • The patch adds support for fundamental reset on PLX downstream
    ports. If the PCI device matches an entry in the internal table,
    which includes the PLX vendor ID, bridge device ID, register offset
    for fundamental reset and bit, fundamental reset will be done
    accordingly. Otherwise, it will fall back to hot reset.

    Additional flag (EEH_DEV_FRESET) is introduced to record the last
    reset type on the PCI bridge.

    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     
  • When PCI_ERS_RESULT_CAN_RECOVER is returned from device drivers, the
    EEH core should enable I/O and DMA for the affected PE. However,
    DMA was never enabled in eeh_handle_normal_event().
    Besides, the frozen state of the affected PE should be cleared
    after successful recovery, but it was not.

    The patch fixes both issues.

    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     
  • In the kdump scenario, the first kernel doesn't shut down PCI devices
    and the kdump kernel cleans the PHB IODA table at early probe time.
    That means the kdump kernel can't support PCI transactions piled up
    by the first kernel. Otherwise, lots of EEH errors and frozen PEs
    will be detected.

    In order to avoid the EEH errors, the PHB is reset to drop all
    PCI transactions from the first kernel.

    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     
  • The problem was initially reported by Wendy, who tried to pass through
    an IPR adapter, which was connected to the PHB root port directly, to a
    KVM based guest. When doing that, pci_reset_bridge_secondary_bus() was
    called by the VFIO driver and linkDown was detected by the root port.
    That caused all PEs to be frozen.

    The patch fixes the issue by routing the reset for the secondary bus
    of the root port to the underlying firmware. For that, one more weak
    function pci_reset_secondary_bus() is introduced so that individual
    platforms can override it and do a specific reset for the bridge's
    secondary bus.

    Reported-by: Wendy Xiong
    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     
  • Basically, we have 3 types of resets to fulfil PE reset: fundamental,
    hot and PHB reset. For the latter 2 cases, we need the PCI bus reset hold
    and settlement delays as specified by the PCI spec. PowerNV and pSeries
    platforms are running on top of different firmware and some of the
    delays have been covered by the underlying firmware (PowerNV).

    The patch unifies the delays so that they are done in the backend,
    instead of the EEH core.

    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     
  • Resetting a root port involves more work than resetting a PCIe switch
    port, so the root port reset should be done in firmware rather than in
    the kernel itself. The problem was introduced by commit 5b2e198e
    ("powerpc/powernv: Rework EEH reset").

    Cc: linux-stable
    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     
  • In pseries_eeh_get_state(), EEH_STATE_UNAVAILABLE is always
    overwritten by EEH_STATE_NOT_SUPPORT because of the missed
    "break" there. The patch fixes the issue.

    Reported-by: Joe Perches
    Cc: linux-stable
    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
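
A missing "break" of this kind is easy to reproduce in isolation. In the sketch below the state names, the firmware codes (0 and 5), and both functions are hypothetical; only the fall-through behaviour mirrors what the message describes.

```c
/* Hypothetical state codes, for illustration only. */
enum {
    STATE_OK_SKETCH = 0,
    STATE_UNAVAILABLE_SKETCH = 1,
    STATE_NOT_SUPPORT_SKETCH = 2
};

/* Buggy: case 5 falls through into default, so its result is overwritten. */
int decode_state_buggy(int fw_code)
{
    int result;

    switch (fw_code) {
    case 0:
        result = STATE_OK_SKETCH;
        break;
    case 5:
        result = STATE_UNAVAILABLE_SKETCH;
        /* no break here: execution continues into default */
    default:
        result = STATE_NOT_SUPPORT_SKETCH;
    }
    return result;
}

/* Fixed: the added break keeps the "unavailable" result. */
int decode_state_fixed(int fw_code)
{
    int result;

    switch (fw_code) {
    case 0:
        result = STATE_OK_SKETCH;
        break;
    case 5:
        result = STATE_UNAVAILABLE_SKETCH;
        break;
    default:
        result = STATE_NOT_SUPPORT_SKETCH;
    }
    return result;
}
```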
     
  • Once a specific PE has been marked as EEH_PE_ISOLATED, it is either in
    the middle of recovery or removed permanently. We needn't report
    the frozen PE again. Otherwise, we will endlessly report the
    same frozen PE.

    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
     
  • The issue was detected in a somewhat complicated test case where
    we have multiple hierarchical PEs, as shown in the following figure:

    +-----------------+
    | PE#3     p2p#0  |
    |          p2p#1  |
    +-----------------+
            |
    +-----------------+
    | PE#4     pdev#0 |
    |          pdev#1 |
    +-----------------+

    PE#4 (which has 2 PCI devices) is the child of PE#3, which has 2 p2p
    bridges. We accidentally hit a less-known scenario: PE#4 was removed
    permanently from the system because of a permanent failure (e.g.
    exceeding the max allowed failure times in the last hour), then we
    detected EEH errors on PE#3 and tried to recover it. However, eeh_dev
    instances for pdev#0/1 were not detached from PE#4, which was still
    connected to PE#3. All of that was because we rely on count-based
    pcibios_release_device(), which isn't reliable enough. When doing
    recovery for PE#3, we still applied hotplug on PE#4 and pdev#0/1, which
    were not valid any more. Eventually, we ran into a kernel crash.

    The patch fixes the above issue from two aspects. For unplug, we simply
    skip any permanently removed PE, whose state is (EEH_PE_STATE_ISOLATED
    && !EEH_PE_STATE_RECOVERING) and whose frozen count should be greater
    than EEH_MAX_ALLOWED_FREEZES. For plug, we mark all permanently
    removed EEH devices with EEH_DEV_REMOVED and return 0xFF's when their
    PCI config space is read, so that the PCI core will omit them.

    Signed-off-by: Gavin Shan
    Signed-off-by: Benjamin Herrenschmidt

    Gavin Shan
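
The "return 0xFF's on config read" part of the fix relies on the convention that an all-ones read means no device is present. A minimal sketch under that assumption follows; the struct, the fake config space, and both functions are hypothetical stand-ins, not the kernel's EEH code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device record; only the 'removed' mark matters here. */
struct eeh_dev_sketch {
    bool removed;          /* stands in for the EEH_DEV_REMOVED mark */
    uint32_t config[64];   /* fake 256-byte config space, as dwords */
};

/* Read one config dword; removed devices read back as all ones. */
uint32_t read_config_dword(const struct eeh_dev_sketch *edev,
                           unsigned int index)
{
    if (edev->removed || index >= 64)
        return 0xFFFFFFFFu;
    return edev->config[index];
}

/* A probe that treats an all-ones vendor/device dword as "no device". */
bool device_present(const struct eeh_dev_sketch *edev)
{
    return read_config_dword(edev, 0) != 0xFFFFFFFFu;
}
```

Because the probe sees all ones, a scan never re-creates the removed device, which is how marking devices avoids touching stale state during a later recovery.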