12 Jan, 2020

1 commit

  • [ Upstream commit 913e73c77d48aeeb50c16450a653dca9c71ae2e2 ]

    If we couldn't fully init a context, we were leaking memory.

    Fixes: b9721d275cc2 ("ocxl: Allow external drivers to use OpenCAPI contexts")
    Signed-off-by: Frederic Barrat
    Acked-by: Andrew Donnellan
    Reviewed-by: Greg Kurz
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20191209105513.8566-1-fbarrat@linux.ibm.com
    Signed-off-by: Sasha Levin

    Frederic Barrat
     

31 Dec, 2019

1 commit

  • commit a58d37bce0d21cf7fbd589384c619e465ef2f927 upstream.

    If an ocxl device is unbound through sysfs at the same time its AFU is
    being opened by a user process, the open code may dereference freed
    structures, which can lead to kernel oops messages. You'd have to hit a
    tiny time window, but it's possible. It's fairly easy to test by
    making the time window bigger artificially.

    Fix it with a combination of 2 changes:
    - when an AFU device is found in the IDR by looking for the device
      minor number, we should hold a reference on the device until after
      the context is allocated. A reference on the AFU structure is kept
      when the context is allocated, so we can release the reference on
      the device after the context allocation.
    - with the fix above, there's still another even tinier window,
      between the time the AFU device is found in the IDR and the
      reference on the device is taken. We can fix this one by removing
      the IDR entry earlier, when the device setup is removed, instead
      of waiting for the 'release' device callback, with proper locking
      around the IDR.
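
    The ordering described above can be sketched in userspace C. This is
    an illustration only, not the driver code: the table, the spin lock
    and every name below (afu_dev, minor_table, find_and_get, ...) are
    hypothetical stand-ins for the IDR, its lock and the device
    reference counting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_MINORS 8

/* Hypothetical stand-in for an AFU device; not the driver's structure. */
struct afu_dev {
	int minor;
	int refcount;	/* only touched with table_lock held */
};

static struct afu_dev *minor_table[MAX_MINORS];	/* plays the role of the IDR */
static atomic_flag table_lock = ATOMIC_FLAG_INIT;	/* plays the role of a mutex */

static void lock_table(void)
{
	while (atomic_flag_test_and_set(&table_lock))
		;	/* spin; a real driver would sleep on a mutex */
}

static void unlock_table(void)
{
	atomic_flag_clear(&table_lock);
}

static void register_dev(struct afu_dev *dev)
{
	assert(dev->minor >= 0 && dev->minor < MAX_MINORS);
	lock_table();
	minor_table[dev->minor] = dev;
	unlock_table();
}

/* Unbind path: drop the table entry first, so new lookups fail. */
static void unregister_dev(struct afu_dev *dev)
{
	lock_table();
	minor_table[dev->minor] = NULL;
	unlock_table();
}

/* Open path: look up and take the reference under the same lock. */
static struct afu_dev *find_and_get(int minor)
{
	struct afu_dev *dev = NULL;

	if (minor < 0 || minor >= MAX_MINORS)
		return NULL;
	lock_table();
	dev = minor_table[minor];
	if (dev)
		dev->refcount++;
	unlock_table();
	return dev;
}

/* Release a reference once the caller no longer needs the device. */
static void put_dev(struct afu_dev *dev)
{
	lock_table();
	dev->refcount--;
	unlock_table();
}
```

    Because the lookup and the reference are taken under one lock, and
    the entry is removed under that same lock, an opener either gets a
    referenced device or gets NULL.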

    Fixes: 75ca758adbaf ("ocxl: Create a clear delineation between ocxl backend & frontend")
    Cc: stable@vger.kernel.org # v5.2+
    Signed-off-by: Frederic Barrat
    Reviewed-by: Greg Kurz
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20190624144148.32022-1-fbarrat@linux.ibm.com
    Signed-off-by: Greg Kroah-Hartman

    Frederic Barrat
     

05 Sep, 2019

1 commit

  • Introduce two options to control the use of the tlbie instruction.

    The first is a boot time option which completely disables use of the
    instruction by the kernel. It is currently incompatible with HASH MMU,
    KVM, and coherent accelerators.

    The second is a debugfs option which can be switched at runtime and
    avoids using tlbie for invalidating CPU TLBs for normal process and
    kernel address mappings. Coherent accelerators are still managed with
    tlbie, as are KVM partition scope translations.

    Cross-CPU TLB flushing is implemented with IPIs and tlbiel. This is a
    basic implementation which does not attempt to make any optimisation
    beyond the tlbie implementation.

    This is useful for performance testing among other things. For example
    in certain situations on large systems, using IPIs may be faster than
    tlbie as they can be directed rather than broadcast. Later we may also
    take advantage of the IPIs to do more interesting things such as trim
    the mm cpumask more aggressively.

    Signed-off-by: Nicholas Piggin
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20190902152931.17840-7-npiggin@gmail.com

    Nicholas Piggin
     

14 Jul, 2019

1 commit

  • Pull powerpc updates from Michael Ellerman:
    "Notable changes:

    - Removal of the NPU DMA code, used by the out-of-tree Nvidia driver,
    as well as some other functions only used by drivers that haven't
    (yet?) made it upstream.

    - A fix for a bug in our handling of hardware watchpoints (eg. perf
    record -e mem: ...) which could lead to register corruption and
    kernel crashes.

    - Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for
    vmalloc when using the Radix MMU.

    - A large but incremental rewrite of our exception handling code to
    use gas macros rather than multiple levels of nested CPP macros.

    And the usual small fixes, cleanups and improvements.

    Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Andreas Schwab,
    Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann,
    Athira Rajeev, Cédric Le Goater, Christian Lamparter, Christophe
    Leroy, Christophe Lombard, Christoph Hellwig, Daniel Axtens, Denis
    Efremov, Enrico Weigelt, Frederic Barrat, Gautham R. Shenoy, Geert
    Uytterhoeven, Geliang Tang, Gen Zhang, Greg Kroah-Hartman, Greg Kurz,
    Gustavo Romero, Krzysztof Kozlowski, Madhavan Srinivasan, Masahiro
    Yamada, Mathieu Malaterre, Michael Neuling, Nathan Lynch, Naveen N.
    Rao, Nicholas Piggin, Nishad Kamdar, Oliver O'Halloran, Qian Cai, Ravi
    Bangoria, Sachin Sant, Sam Bobroff, Satheesh Rajendran, Segher
    Boessenkool, Shaokun Zhang, Shawn Anastasio, Stewart Smith, Suraj
    Jitindar Singh, Thiago Jung Bauermann, YueHaibing"

    * tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (163 commits)
    powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state.
    powerpc/eeh: Handle hugepages in ioremap space
    ocxl: Update for AFU descriptor template version 1.1
    powerpc/boot: pass CONFIG options in a simpler and more robust way
    powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h
    powerpc/irq: Don't WARN continuously in arch_local_irq_restore()
    powerpc/module64: Use symbolic instructions names.
    powerpc/module32: Use symbolic instructions names.
    powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h
    powerpc/module64: Fix comment in R_PPC64_ENTRY handling
    powerpc/boot: Add lzo support for uImage
    powerpc/boot: Add lzma support for uImage
    powerpc/boot: don't force gzipped uImage
    powerpc/8xx: Add microcode patch to move SMC parameter RAM.
    powerpc/8xx: Use IO accessors in microcode programming.
    powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c
    powerpc/8xx: refactor programming of microcode CPM params.
    powerpc/8xx: refactor printing of microcode patch name.
    powerpc/8xx: Refactor microcode write
    powerpc/8xx: refactor writing of CPM microcode arrays
    ...

    Linus Torvalds
     

10 Jul, 2019

1 commit

  • The OpenCAPI discovery and configuration specification has been
    updated and introduces version 1.1 of the AFU descriptor template,
    with new fields to better define the memory layout of an OpenCAPI
    adapter.

    The ocxl driver doesn't do much yet to support LPC memory but as we
    start seeing (non-LPC) AFU images using the new template, this patch
    updates the config space parsing code to avoid spitting a warning.

    Signed-off-by: Alastair D'Silva
    Signed-off-by: Frederic Barrat
    Reviewed-by: Christophe Lombard
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20190605111545.19762-1-fbarrat@linux.ibm.com

    Alastair D'Silva
     

04 Jul, 2019

1 commit

  • If an OpenCAPI context is to be used directly by a kernel driver, there
    may not be a suitable mm to use.

    The patch makes the mm parameter to ocxl_context_attach optional.
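
    A minimal sketch of what an optional mm parameter can look like,
    assuming simplified types. The structures, the field names and the
    choice of 0 as the fallback value are assumptions made for
    illustration, not the driver's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified types; the real driver structures differ. */
struct mm_info { unsigned long context_id; };
struct context { int attached; unsigned long pidr; };

/* Attach a context; mm may be NULL when the caller is a kernel driver. */
static int context_attach(struct context *ctx, const struct mm_info *mm)
{
	assert(ctx != NULL);
	if (ctx->attached)
		return -EBUSY;
	/* No mm (kernel user): use 0 rather than dereferencing NULL. */
	ctx->pidr = mm ? mm->context_id : 0;
	ctx->attached = 1;
	return 0;
}
```

    A kernel user would call context_attach(&ctx, NULL), while a user
    process path would keep passing its mm.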

    Signed-off-by: Alastair D'Silva
    Acked-by: Andrew Donnellan
    Acked-by: Frederic Barrat
    Acked-by: Nicholas Piggin
    Link: https://lore.kernel.org/r/20190620041203.12274-1-alastair@au1.ibm.com
    Signed-off-by: Greg Kroah-Hartman

    Alastair D'Silva
     

09 Jun, 2019

1 commit


28 May, 2019

1 commit

  • Fix sparse warning:

    drivers/misc/ocxl/pci.c:44:6: warning:
    symbol 'ocxl_remove' was not declared. Should it be static?

    Reported-by: Hulk Robot
    Signed-off-by: YueHaibing
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman

    YueHaibing
     

25 May, 2019

1 commit

  • 'default n' is the default value for any bool or tristate Kconfig
    setting so there is no need to write it explicitly.

    Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO
    is not set' for visible symbols") the Kconfig behavior is the same
    regardless of 'default n' being present or not:

    ...
    One side effect of (and the main motivation for) this change is making
    the following two definitions behave exactly the same:

    config FOO
    bool

    config FOO
    bool
    default n

    With this change, neither of these will generate a
    '# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
    That might make it clearer to people that a bare 'default n' is
    redundant.
    ...

    Signed-off-by: Bartlomiej Zolnierkiewicz
    Acked-by: Frederic Barrat
    Acked-by: Arnd Bergmann
    Acked-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Bartlomiej Zolnierkiewicz
     

21 May, 2019

1 commit


06 May, 2019

1 commit

  • In case of error, the function eventfd_ctx_fdget() returns ERR_PTR() and
    never returns NULL. The NULL test in the return value check should be
    replaced with IS_ERR().
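
    The difference between the two checks can be shown with a userspace
    re-creation of the error pointer idiom. The helpers below only mimic
    the kernel macros of the same name, and ctx_fdget() is a hypothetical
    stand-in for a function that returns an error pointer on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stddef.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel helpers, for illustration only. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error codes live in the last MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static int dummy_ctx;

/* Hypothetical getter: returns an error pointer, never NULL. */
static void *ctx_fdget(int fd)
{
	return fd < 0 ? ERR_PTR(-EBADF) : (void *)&dummy_ctx;
}

/* NULL test: the error pointer is non-NULL, so the failure is missed. */
static int broken_check(int fd)
{
	void *ctx = ctx_fdget(fd);

	if (!ctx)
		return -EINVAL;
	return 0;
}

/* IS_ERR() test: the failure is caught and its code propagated. */
static int fixed_check(int fd)
{
	void *ctx = ctx_fdget(fd);

	if (IS_ERR(ctx))
		return (int)PTR_ERR(ctx);
	assert(ctx != NULL);
	return 0;
}
```

    With a bad descriptor, broken_check() reports success while
    fixed_check() returns -EBADF.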

    This issue was detected by using the Coccinelle software.

    Fixes: 060146614643 ("ocxl: move event_fd handling to frontend")
    Signed-off-by: Wei Yongjun
    Acked-by: Alastair D'Silva
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman

    Wei Yongjun
     

03 May, 2019

11 commits


01 May, 2019

1 commit

  • Fixes gcc '-Wunused-but-set-variable' warning:

    drivers/misc/ocxl/link.c: In function 'xsl_fault_handler':
    drivers/misc/ocxl/link.c:187:17: warning: variable 'tid' set but not used
    drivers/misc/ocxl/link.c:187:6: warning: variable 'lpid' set but not used

    They are never used and can be removed.

    Signed-off-by: YueHaibing
    Reviewed-by: Mukesh Ojha
    Acked-by: Andrew Donnellan
    Acked-by: Frederic Barrat
    Signed-off-by: Michael Ellerman

    YueHaibing
     

21 Apr, 2019

1 commit

  • This patch maps the vmalloc, IO and vmemmap regions in the 0xc address
    range instead of the current 0xd and 0xf ranges. This brings the mapping
    closer to radix translation mode.

    With hash 64K page size each of these regions is 512TB, whereas with the
    4K config we are limited by the max page table range of 64TB and hence
    these regions are 16TB each.

    The kernel mapping is now:

    On 4K hash

    kernel_region_map_size = 16TB
    kernel vmalloc start = 0xc000100000000000
    kernel IO start = 0xc000200000000000
    kernel vmemmap start = 0xc000300000000000

    On 64K hash, 64K radix and 4K radix:

    kernel_region_map_size = 512TB
    kernel vmalloc start = 0xc008000000000000
    kernel IO start = 0xc00a000000000000
    kernel vmemmap start = 0xc00c000000000000
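
    The start addresses listed above are consistent with each region
    beginning at a whole multiple of kernel_region_map_size above
    0xc000000000000000. The sketch below checks that arithmetic; the
    multiples (1 to 3 for the 4K hash case, 4 to 6 for the other cases)
    are inferred from the numbers above, not taken from the patch, and
    the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_BASE 0xc000000000000000ULL
#define TB (1ULL << 40)

/* Start of the region that sits `index` map sizes above the base. */
static uint64_t region_start(uint64_t map_size, unsigned int index)
{
	assert(map_size != 0);
	return KERNEL_BASE + (uint64_t)index * map_size;
}
```

    For example, region_start(16 * TB, 1) gives the 4K hash vmalloc
    start and region_start(512 * TB, 6) gives the vmemmap start for the
    512TB layout.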

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Michael Ellerman

    Aneesh Kumar K.V
     

21 Dec, 2018

1 commit

  • The AFU Descriptor Template in the PCI config space has a Name Space
    field, which is a 24-byte ASCII string holding a descriptive name
    space for the AFU. The OCXL driver reads the string four characters at
    a time with pci_read_config_dword().

    This optimization is valid on a little-endian system since this is PCI,
    but a big-endian system ends up with each subset of four characters in
    reverse order.

    This could be fixed by switching to read characters one by one. Another
    option is to swap the bytes if we're big-endian.

    Go for the latter with le32_to_cpu().
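
    The byte ordering issue can be illustrated in userspace. The sketch
    below is not the patch itself: instead of le32_to_cpu() it uses
    explicit shifts, which store the least significant byte first on any
    host, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value as four bytes, least significant byte first. */
static void put_le32(uint32_t val, char *dst)
{
	dst[0] = (char)(val & 0xff);
	dst[1] = (char)((val >> 8) & 0xff);
	dst[2] = (char)((val >> 16) & 0xff);
	dst[3] = (char)((val >> 24) & 0xff);
}

/*
 * Build a string from dwords, as if each dword came from a config
 * space read. name must have room for 4 * count + 1 bytes.
 */
static void read_name(const uint32_t *dwords, size_t count, char *name)
{
	assert(dwords != NULL && name != NULL);
	for (size_t i = 0; i < count; i++)
		put_le32(dwords[i], name + 4 * i);
	name[4 * count] = '\0';
}
```

    Copying each dword to memory as-is would give the same string on a
    little-endian host, but would reverse every group of four characters
    on a big-endian one.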

    Cc: stable@vger.kernel.org # v4.16
    Signed-off-by: Greg Kurz
    Acked-by: Frederic Barrat
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman

    Greg Kurz
     

20 Dec, 2018

3 commits

  • The AFU irq code doesn't need to reach out to the platform.

    Signed-off-by: Greg Kurz
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman

    Greg Kurz
     
  • Implementing rollback with goto and labels is a common practice that
    leads to prettier and more maintainable code. FWIW, this design pattern
    is already being used in alloc_link() a few lines below in this file.

    Do the same in setup_xsl_irq().
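
    As a generic illustration of the pattern (not the actual
    setup_xsl_irq() code), a two-step setup with goto-based rollback
    might look like the sketch below. The structure, the alloc_step()
    helper and the fail_step hook are hypothetical and exist only so the
    failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical resources acquired in two steps. */
struct link {
	char *name;
	int *irq;
};

static int fail_step;	/* test hook: make the given step fail */

static void *alloc_step(int step, size_t size)
{
	return step == fail_step ? NULL : malloc(size);
}

static int setup_link(struct link *l)
{
	int rc = -ENOMEM;

	assert(l != NULL);
	l->name = alloc_step(1, 16);
	if (!l->name)
		goto err;

	l->irq = alloc_step(2, sizeof(*l->irq));
	if (!l->irq)
		goto err_name;

	return 0;

	/* Labels undo the steps in reverse order of acquisition. */
err_name:
	free(l->name);
	l->name = NULL;
err:
	return rc;
}
```

    Each label releases exactly what was acquired before the failing
    step, so adding a third step only needs one more label.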

    Signed-off-by: Greg Kurz
    Acked-by: Frederic Barrat
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman

    Greg Kurz
     
  • All fields in the PE are big-endian. Use cpu_to_be32() like everywhere
    else something is written to the PE. Otherwise a wrong TID will be used
    by the NPU. If this TID happens to point to an existing thread sharing
    the same mm, it could be woken up by error. This is highly improbable
    though. The likely outcome of this is the NPU not finding the target
    thread and forcing the AFU into sending an interrupt, which userspace
    is supposed to handle anyway.

    Fixes: e948e06fc63a ("ocxl: Expose the thread_id needed for wait on POWER9")
    Cc: stable@vger.kernel.org # v4.18
    Signed-off-by: Greg Kurz
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman

    Greg Kurz
     

19 Sep, 2018

1 commit

  • The AFU Information DVSEC capability is a means to extract common,
    general information about all of the AFUs associated with a Function
    independent of the specific functionality that each AFU provides.
    Writing to the AFU Index field gives access to the descriptor data
    for each AFU.

    With the current code, we are not able to access this data when the
    index >= 1 because we are writing to the wrong location. All requests
    for per-AFU data end up pointing to the data of AFU 0, which could
    cause problems when using a card with more than one AFU per
    function.

    This patch fixes the access to the AFU Descriptor Data indexed by the
    AFU Info Index field.

    Fixes: 5ef3166e8a32 ("ocxl: Driver code for 'generic' opencapi devices")
    Cc: stable # 4.16
    Signed-off-by: Christophe Lombard

    Acked-by: Frederic Barrat
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman

    Christophe Lombard
     

18 Aug, 2018

2 commits

  • Merge updates from Andrew Morton:

    - a few misc things

    - a few Y2038 fixes

    - ntfs fixes

    - arch/sh tweaks

    - ocfs2 updates

    - most of MM

    * emailed patches from Andrew Morton: (111 commits)
    mm/hmm.c: remove unused variables align_start and align_end
    fs/userfaultfd.c: remove redundant pointer uwq
    mm, vmacache: hash addresses based on pmd
    mm/list_lru: introduce list_lru_shrink_walk_irq()
    mm/list_lru.c: pass struct list_lru_node* as an argument to __list_lru_walk_one()
    mm/list_lru.c: move locking from __list_lru_walk_one() to its caller
    mm/list_lru.c: use list_lru_walk_one() in list_lru_walk_node()
    mm, swap: make CONFIG_THP_SWAP depend on CONFIG_SWAP
    mm/sparse: delete old sparse_init and enable new one
    mm/sparse: add new sparse_init_nid() and sparse_init()
    mm/sparse: move buffer init/fini to the common place
    mm/sparse: use the new sparse buffer functions in non-vmemmap
    mm/sparse: abstract sparse buffer allocations
    mm/hugetlb.c: don't zero 1GiB bootmem pages
    mm, page_alloc: double zone's batchsize
    mm/oom_kill.c: document oom_lock
    mm/hugetlb: remove gigantic page support for HIGHMEM
    mm, oom: remove sleep from under oom_lock
    kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()
    mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
    ...

    Linus Torvalds
     
  • Use new return type vm_fault_t for fault handler. For now, this is just
    documenting that the function returns a VM_FAULT value rather than an
    errno. Once all instances are converted, vm_fault_t will become a
    distinct type.

    Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t")

    In this patch all the callers of handle_mm_fault() are changed to return
    the vm_fault_t type.

    Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
    Signed-off-by: Souptick Joarder
    Cc: Matthew Wilcox
    Cc: Richard Henderson
    Cc: Tony Luck
    Cc: Matt Turner
    Cc: Vineet Gupta
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Richard Kuo
    Cc: Geert Uytterhoeven
    Cc: Michal Simek
    Cc: James Hogan
    Cc: Ley Foon Tan
    Cc: Jonas Bonn
    Cc: James E.J. Bottomley
    Cc: Benjamin Herrenschmidt
    Cc: Palmer Dabbelt
    Cc: Yoshinori Sato
    Cc: David S. Miller
    Cc: Richard Weinberger
    Cc: Guan Xuetao
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: "Levin, Alexander (Sasha Levin)"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Souptick Joarder
     

02 Jul, 2018

2 commits

  • If a process exits without doing proper cleanup, there's a window
    where an opencapi device can try to access the memory of the dying
    process and may trigger a page fault. That's an expected scenario and
    the ocxl driver holds a reference on the mm_struct of the process
    until the opencapi device is notified of the process exiting.
    However, if mm_users is already at 0, i.e. the address space of the
    process has already been destroyed, the driver shouldn't try resolving
    the page fault, as it will fail and may also access already freed
    data.

    It is fixed by only calling the bottom half of the page fault handler
    if mm_users is greater than 0, and by getting a reference on mm_users
    instead of mm_count. Otherwise, we can safely return a translation fault to
    the device, as its associated memory context is being removed. The
    opencapi device will be properly cleaned up shortly after when closing
    the file descriptors.
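
    The "take a reference only if the count is still above zero" idea
    can be sketched with C11 atomics. This is a userspace illustration
    under assumed names (struct mm, mm_get_not_zero, handle_fault), not
    the driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in holding only the user count. */
struct mm {
	atomic_int users;
};

/* Take a user reference only if the count has not dropped to zero. */
static bool mm_get_not_zero(struct mm *mm)
{
	int old;

	assert(mm != NULL);
	old = atomic_load(&mm->users);
	while (old > 0) {
		/* On failure, old is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(&mm->users, &old, old + 1))
			return true;
	}
	return false;
}

enum fault_result { FAULT_HANDLED, FAULT_TRANSLATION_ERROR };

static enum fault_result handle_fault(struct mm *mm)
{
	/* Address space already gone: report a translation fault. */
	if (!mm_get_not_zero(mm))
		return FAULT_TRANSLATION_ERROR;

	/* ... the bottom half would resolve the fault here ... */

	atomic_fetch_sub(&mm->users, 1);	/* drop the reference */
	return FAULT_HANDLED;
}
```

    A count that has already reached zero is never raised again, so the
    fault path cannot revive an address space that is being torn down.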

    Fixes: 5ef3166e8a32 ("ocxl: Driver code for 'generic' opencapi devices")
    Cc: stable@vger.kernel.org # v4.16+
    Signed-off-by: Frederic Barrat
    Reviewed-By: Alastair D'Silva
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman

    Frederic Barrat
     
  • Use new return type vm_fault_t for fault handler. For now, this is
    just documenting that the function returns a VM_FAULT value rather
    than an errno. Once all instances are converted, vm_fault_t will
    become a distinct type.

    Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t")

    There is an existing bug where vm_insert_pfn() can return ENOMEM,
    which was ignored, with VM_FAULT_NOPAGE returned by default. The new
    inline vmf_insert_pfn() fixes this by returning the correct
    vm_fault_t value.
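
    One way such a wrapper might translate an errno into a fault code is
    sketched below. The type, the constants and their values are
    placeholders chosen for illustration, and the mapping is an
    assumption rather than a copy of the kernel helper.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder fault codes; the values are illustrative only. */
typedef unsigned int fault_t;
#define FAULT_OOM	0x1u
#define FAULT_SIGBUS	0x2u
#define FAULT_NOPAGE	0x100u

/* Translate an errno-style result (0 or negative) into a fault code. */
static fault_t fault_from_errno(int err)
{
	assert(err <= 0);
	if (err == -ENOMEM)
		return FAULT_OOM;	/* no longer silently dropped */
	if (err < 0 && err != -EBUSY)
		return FAULT_SIGBUS;
	return FAULT_NOPAGE;		/* success, or page already present */
}
```

    The point is that an out-of-memory failure gets its own fault code
    instead of being reported as a successful insertion.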

    Signed-off-by: Souptick Joarder
    Acked-by: Andrew Donnellan
    Acked-by: Frederic Barrat
    Signed-off-by: Michael Ellerman

    Souptick Joarder
     

05 Jun, 2018

1 commit


03 Jun, 2018

3 commits


28 Mar, 2018

1 commit


16 Mar, 2018

1 commit


02 Mar, 2018

1 commit