29 Oct, 2020

1 commit


25 Oct, 2020

1 commit

  • Pull block fixes from Jens Axboe:

    - NVMe pull request from Christoph:
        - rdma error handling fixes (Chao Leng)
        - fc error handling and reconnect fixes (James Smart)
        - fix the qid displace when tracing ioctl command (Keith Busch)
        - don't use BLK_MQ_REQ_NOWAIT for passthru (Chaitanya Kulkarni)
        - fix MTDS for passthru (Logan Gunthorpe)
        - blacklist Write Same on more devices (Kai-Heng Feng)
        - fix an uninitialized work struct (zhenwei pi)

    - lightnvm out-of-bounds fix (Colin)

    - SG allocation leak fix (Doug)

    - rnbd fixes (Gioh, Guoqing, Jack)

    - zone error translation fixes (Keith)

    - kerneldoc markup fix (Mauro)

    - zram lockdep fix (Peter)

    - Kill unused io_context members (Yufen)

    - NUMA memory allocation cleanup (Xianting)

    - NBD config wakeup fix (Xiubo)

    * tag 'block-5.10-2020-10-24' of git://git.kernel.dk/linux-block: (27 commits)
    block: blk-mq: fix a kernel-doc markup
    nvme-fc: shorten reconnect delay if possible for FC
    nvme-fc: wait for queues to freeze before calling update_hr_hw_queues
    nvme-fc: fix error loop in create_hw_io_queues
    nvme-fc: fix io timeout to abort I/O
    null_blk: use zone status for max active/open
    nvmet: don't use BLK_MQ_REQ_NOWAIT for passthru
    nvmet: cleanup nvmet_passthru_map_sg()
    nvmet: limit passthru MTDS by BIO_MAX_PAGES
    nvmet: fix uninitialized work for zero kato
    nvme-pci: disable Write Zeroes on Sandisk Skyhawk
    nvme: use queuedata for nvme_req_qid
    nvme-rdma: fix crash due to incorrect cqe
    nvme-rdma: fix crash when connect rejected
    block: remove unused members for io_context
    blk-mq: remove the calling of local_memory_node()
    zram: Fix __zram_bvec_{read,write}() locking order
    skd_main: remove unused including
    sgl_alloc_order: fix memory leak
    lightnvm: fix out-of-bounds write to array devices->info[]
    ...

    Linus Torvalds
     

18 Oct, 2020

1 commit

  • Pull rdma updates from Jason Gunthorpe:
    "A usual cycle for RDMA with a typical mix of driver and core subsystem
    updates:

    - Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma,
    hns, usnic, qib, qedr, cxgb4, bnxt_re

    - Various rtrs fixes and updates

    - Bug fix for mlx4 CM emulation for virtualization scenarios where
    MRA wasn't working right

    - Use tracepoints instead of pr_debug in the CM code

    - Scrub the locking in ucma and cma to close more syzkaller bugs

    - Use tasklet_setup in the subsystem

    - Revert the idea that 'destroy' operations are not allowed to fail
    at the driver level. This proved unworkable from a HW perspective.

    - Revise how the umem API works so drivers make fewer mistakes using
    it

    - XRC support for qedr

    - Convert uverbs objects RWQ and MW to the new allocation scheme

    - Large queue entry sizes for hns

    - Use hmm_range_fault() for mlx5 On Demand Paging

    - uverbs APIs to inspect the GID table instead of sysfs

    - Move some of the RDMA code for building large page SGLs into
    lib/scatterlist"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (191 commits)
    RDMA/ucma: Fix use after free in destroy id flow
    RDMA/rxe: Handle skb_clone() failure in rxe_recv.c
    RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI
    RDMA: Explicitly pass in the dma_device to ib_register_device
    lib/scatterlist: Do not limit max_segment to PAGE_ALIGNED values
    IB/mlx4: Convert rej_tmout radix-tree to XArray
    RDMA/rxe: Fix bug rejecting all multicast packets
    RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()
    RDMA/rxe: Remove duplicate entries in struct rxe_mr
    IB/hfi,rdmavt,qib,opa_vnic: Update MAINTAINERS
    IB/rdmavt: Fix sizeof mismatch
    MAINTAINERS: CISCO VIC LOW LATENCY NIC DRIVER
    RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
    RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
    RDMA/umem: Move to allocate SG table from pages
    lib/scatterlist: Add support in dynamic allocation of SG table from pages
    tools/testing/scatterlist: Show errors in human readable form
    tools/testing/scatterlist: Rejuvenate bit-rotten test
    RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
    RDMA/uverbs: Expose the new GID query API to user space
    ...

    Linus Torvalds
     

17 Oct, 2020

1 commit

  • 'sgl' is zeroed a few lines below in 'sg_init_table()'. There is no need to
    clear it twice.

    Remove the redundant initialization.

    Signed-off-by: Christophe JAILLET
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200920071544.368841-1-christophe.jaillet@wanadoo.fr
    Signed-off-by: Linus Torvalds

    Christophe JAILLET
     

16 Oct, 2020

2 commits

  • The main intention of the max_segment argument to
    __sg_alloc_table_from_pages() is to match the DMA layer segment size set
    by dma_set_max_seg_size().

    Restricting the input to be page aligned makes it impossible to just
    connect the DMA layer to this API.

    The only reason for a page alignment here is because the algorithm will
    overshoot the max_segment if it is not a multiple of PAGE_SIZE. Simply fix
    the alignment before starting and don't expose this implementation detail
    to the callers.

    A future patch will completely remove SCATTERLIST_MAX_SEGMENT.

    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     
  • sgl_alloc_order() can fail when 'length' is large on a memory
    constrained system. When order > 0 it will potentially be
    making several multi-page allocations with the later ones more
    likely to fail than the earlier one. So it is important that
    sgl_alloc_order() frees up any pages it has obtained before
    returning NULL. In the case when order > 0 it calls the wrong
    free page function and leaks. In testing the leak was
    sufficient to bring down my 8 GiB laptop with OOM.

    Reviewed-by: Bart Van Assche
    Signed-off-by: Douglas Gilbert
    Signed-off-by: Jens Axboe

    Douglas Gilbert
     

06 Oct, 2020

1 commit

  • Extend __sg_alloc_table_from_pages to support dynamic allocation of
    SG table from pages. It should be used by drivers that can't supply
    all the pages at one time.

    This function returns the last populated SGE in the table. Users should
    pass it as an argument to the function from the second call and forward.
    As before, nents will be equal to the number of populated SGEs (chunks).

    With this new extension, drivers can benefit from the optimization of
    merging contiguous pages without needing to allocate all pages in
    advance and hold them in a large buffer.

    E.g. with the Infiniband driver, a single page can be allocated for
    holding the pages. For 1TB memory registration, the temporary buffer
    would consume only 4KB, instead of 2GB.

    Link: https://lore.kernel.org/r/20201004154340.1080481-2-leon@kernel.org
    Signed-off-by: Maor Gottlieb
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Jason Gunthorpe

    Maor Gottlieb
     

08 Apr, 2020

1 commit


01 Feb, 2020

1 commit

  • Clang warns:

    ../lib/scatterlist.c:314:5: warning: misleading indentation; statement
    is not part of the previous 'if' [-Wmisleading-indentation]
    return -ENOMEM;
    ^
    ../lib/scatterlist.c:311:4: note: previous statement is here
    if (prv)
    ^
    1 warning generated.

    This warning occurs because there is a space before the tab on this
    line. Remove it so that the indentation is consistent with the Linux
    kernel coding style and clang no longer warns.

    Link: http://lkml.kernel.org/r/20191218033606.11942-1-natechancellor@gmail.com
    Link: https://github.com/ClangBuiltLinux/linux/issues/830
    Fixes: edce6820a9fd ("scatterlist: prevent invalid free when alloc fails")
    Signed-off-by: Nathan Chancellor
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nathan Chancellor
     

12 Jul, 2019

1 commit

  • Pull SCSI scatter-gather list updates from James Bottomley:
    "This topic branch covers a fundamental change in how our sg lists are
    allocated to make mq more efficient by reducing the size of the
    preallocated sg list.

    This necessitates a large number of driver changes because the
    previous guarantee that if a driver specified SG_ALL as the size of
    its scatter list, it would get a non-chained list and didn't need to
    bother with scatterlist iterators is now broken and every driver
    *must* use scatterlist iterators.

    This was broken out as a separate topic because we need to convert all
    the drivers before pulling the trigger and unconverted drivers kept
    being found, necessitating a rebase"

    * tag 'scsi-sg' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (21 commits)
    scsi: core: don't preallocate small SGL in case of NO_SG_CHAIN
    scsi: lib/sg_pool.c: clear 'first_chunk' in case of no preallocation
    scsi: core: avoid preallocating big SGL for data
    scsi: core: avoid preallocating big SGL for protection information
    scsi: lib/sg_pool.c: improve APIs for allocating sg pool
    scsi: esp: use sg helper to iterate over scatterlist
    scsi: NCR5380: use sg helper to iterate over scatterlist
    scsi: wd33c93: use sg helper to iterate over scatterlist
    scsi: ppa: use sg helper to iterate over scatterlist
    scsi: pcmcia: nsp_cs: use sg helper to iterate over scatterlist
    scsi: imm: use sg helper to iterate over scatterlist
    scsi: aha152x: use sg helper to iterate over scatterlist
    scsi: s390: zfcp_fc: use sg helper to iterate over scatterlist
    scsi: staging: unisys: visorhba: use sg helper to iterate over scatterlist
    scsi: usb: image: microtek: use sg helper to iterate over scatterlist
    scsi: pmcraid: use sg helper to iterate over scatterlist
    scsi: ipr: use sg helper to iterate over scatterlist
    scsi: mvumi: use sg helper to iterate over scatterlist
    scsi: lpfc: use sg helper to iterate over scatterlist
    scsi: advansys: use sg helper to iterate over scatterlist
    ...

    Linus Torvalds
     

09 Jul, 2019

1 commit

  • Pull crypto updates from Herbert Xu:
    "Here is the crypto update for 5.3:

    API:
    - Test shash interface directly in testmgr
    - cra_driver_name is now mandatory

    Algorithms:
    - Replace arc4 crypto_cipher with library helper
    - Implement 5 way interleave for ECB, CBC and CTR on arm64
    - Add xxhash
    - Add continuous self-test on noise source to drbg
    - Update jitter RNG

    Drivers:
    - Add support for SHA204A random number generator
    - Add support for 7211 in iproc-rng200
    - Fix fuzz test failures in inside-secure
    - Fix fuzz test failures in talitos
    - Fix fuzz test failures in qat"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (143 commits)
    crypto: stm32/hash - remove interruptible condition for dma
    crypto: stm32/hash - Fix hmac issue more than 256 bytes
    crypto: stm32/crc32 - rename driver file
    crypto: amcc - remove memset after dma_alloc_coherent
    crypto: ccp - Switch to SPDX license identifiers
    crypto: ccp - Validate the the error value used to index error messages
    crypto: doc - Fix formatting of new crypto engine content
    crypto: doc - Add parameter documentation
    crypto: arm64/aes-ce - implement 5 way interleave for ECB, CBC and CTR
    crypto: arm64/aes-ce - add 5 way interleave routines
    crypto: talitos - drop icv_ool
    crypto: talitos - fix hash on SEC1.
    crypto: talitos - move struct talitos_edesc into talitos.h
    lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE
    crypto/NX: Set receive window credits to max number of CRBs in RxFIFO
    crypto: asymmetric_keys - select CRYPTO_HASH where needed
    crypto: serpent - mark __serpent_setkey_sbox noinline
    crypto: testmgr - dynamically allocate crypto_shash
    crypto: testmgr - dynamically allocate testvec_config
    crypto: talitos - eliminate unneeded 'done' functions at build time
    ...

    Linus Torvalds
     

03 Jul, 2019

1 commit

  • All mapping iterator logic is based on the assumption that sg->offset
    is always lower than PAGE_SIZE.

    But there are situations where sg->offset is such that the SG item
    is on the second page. In that case sg_copy_to_buffer() fails to
    properly copy the data into the buffer. One of the reasons is
    that the data will be outside the kmapped area used to access that
    data.

    This patch fixes the issue by adjusting the mapping iterator
    offset and pgoffset fields such that offset is always lower than
    PAGE_SIZE.

    Signed-off-by: Christophe Leroy
    Fixes: 4225fc8555a9 ("lib/scatterlist: use page iterator in the mapping iterator")
    Cc: stable@vger.kernel.org
    Signed-off-by: Herbert Xu

    Christophe Leroy
     

21 Jun, 2019

1 commit

  • sg_alloc_table_chained() currently allows the caller to provide one
    preallocated SGL and returns if the requested number isn't bigger than
    size of that SGL. This is used to inline an SGL for an IO request.

    However, the scatter-gather code only allows the size of the 1st
    preallocated SGL to be SG_CHUNK_SIZE (128). This means a substantial
    amount of memory (4KB) is claimed for the SGL for each IO request. If
    the I/O is small, it would be prudent to allocate a smaller SGL.

    Introduce an extra parameter to sg_alloc_table_chained() and
    sg_free_table_chained() for specifying size of the preallocated SGL.

    Both __sg_free_table() and __sg_alloc_table() assume that each SGL has the
    same size except for the last one. Change the code to allow both functions
    to accept a variable size for the 1st preallocated SGL.

    [mkp: attempted to clarify commit desc]

    Cc: Christoph Hellwig
    Cc: Bart Van Assche
    Cc: Ewan D. Milne
    Cc: Hannes Reinecke
    Cc: Sagi Grimberg
    Cc: Chuck Lever
    Cc: netdev@vger.kernel.org
    Cc: linux-nvme@lists.infradead.org
    Suggested-by: Christoph Hellwig
    Signed-off-by: Ming Lei
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Martin K. Petersen

    Ming Lei
     

19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this source code is licensed under the gnu general public license
    version 2 see the file copying for more details

    this source code is licensed under general public license version 2
    see

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 52 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Allison Randal
    Reviewed-by: Alexios Zavras
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190602204653.449021192@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

12 Feb, 2019

1 commit

  • Commit 2db76d7c3c6d ("lib/scatterlist: sg_page_iter: support sg lists w/o
    backing pages") introduced the sg_page_iter_dma_address() function without
    providing a way to use it in the general case. If the sg_dma_len() is not
    equal to the sg length callers cannot safely use the
    for_each_sg_page/sg_page_iter_dma_address combination.

    Resolve this API mistake by providing a DMA specific iterator,
    for_each_sg_dma_page(), that uses the right length so
    sg_page_iter_dma_address() works as expected with all sglists.

    A new iterator type is introduced to provide compile-time safety against
    wrongly mixing accessors and iterators.

    Acked-by: Christoph Hellwig (for scatterlist)
    Acked-by: Thomas Hellstrom
    Acked-by: Sakari Ailus (ipu3-cio2)
    Signed-off-by: Jason Gunthorpe

    Jason Gunthorpe
     

06 Dec, 2018

1 commit

  • These days architectures are mostly out of the business of dealing with
    struct scatterlist at all, unless they have architecture specific iommu
    drivers. Replace the ARCH_HAS_SG_CHAIN symbol with an ARCH_NO_SG_CHAIN
    one only enabled for architectures with horrible legacy iommu drivers
    like alpha and parisc, and conditionally for arm which wants to keep it
    disabled for legacy platforms.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Palmer Dabbelt

    Christoph Hellwig
     

01 Jul, 2018

1 commit

  • Pull block fixes from Jens Axboe:
    "Small set of fixes for this series. Mostly just minor fixes, the only
    oddball in here is the sg change.

    The sg change came out of the stall fix for NVMe, where we added a
    mempool and limited us to a single page allocation. CONFIG_SG_DEBUG
    sort-of ruins that, since we'd need to account for that. That's
    actually a generic problem, since lots of drivers need to allocate SG
    lists. So this just removes support for CONFIG_SG_DEBUG, which I added
    back in 2007 and to my knowledge it was never useful.

    Anyway, outside of that, this pull contains:

    - clone of request with special payload fix (Bart)

    - drbd discard handling fix (Bart)

    - SATA blk-mq stall fix (me)

    - chunk size fix (Keith)

    - double free nvme rdma fix (Sagi)"

    * tag 'for-linus-20180629' of git://git.kernel.dk/linux-block:
    sg: remove ->sg_magic member
    drbd: Fix drbd_request_prepare() discard handling
    blk-mq: don't queue more if we get a busy return
    block: Fix cloning of requests with a special payload
    nvme-rdma: fix possible double free of controller async event buffer
    block: Fix transfer when chunk sectors exceeds max

    Linus Torvalds
     

29 Jun, 2018

1 commit

  • This was introduced more than a decade ago when sg chaining was
    added, but we never really caught anything with it. The scatterlist
    entry size can be critical, since drivers allocate it, so remove
    the magic member. Recently it's been triggering allocation stalls
    and failures in NVMe.

    Tested-by: Jordan Glover
    Acked-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     

13 Jun, 2018

1 commit

  • The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
    patch replaces cases of:

    kmalloc(a * b, gfp)

    with:
    kmalloc_array(a * b, gfp)

    as well as handling cases of:

    kmalloc(a * b * c, gfp)

    with:

    kmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The tools/ directory was manually excluded, since it has its own
    implementation of kmalloc().

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kmalloc
    + kmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kmalloc(sizeof(THING) * C2, ...)
    |
    kmalloc(sizeof(TYPE) * C2, ...)
    |
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(C1 * C2, ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

31 Mar, 2018

1 commit

  • sg_init_marker initializes sg_magic in the sg table and calls
    sg_mark_end() on the last entry of the table. This can be useful to
    avoid the memset in sg_init_table() when the scatterlist is already
    zeroed out, for example when the scatterlist is embedded inside
    another struct and that container struct is zeroed out.

    Suggested-by: Daniel Borkmann
    Signed-off-by: Prashant Bhole
    Acked-by: John Fastabend
    Signed-off-by: Daniel Borkmann

    Prashant Bhole
     

20 Jan, 2018

1 commit

  • This patch avoids that workloads with large block sizes (megabytes)
    can trigger the following call stack with the ib_srpt driver (that
    driver is the only driver that chains scatterlists allocated by
    sgl_alloc_order()):

    BUG: Bad page state in process kworker/0:1H pfn:2423a78
    page:fffffb03d08e9e00 count:-3 mapcount:0 mapping: (null) index:0x0
    flags: 0x57ffffc0000000()
    raw: 0057ffffc0000000 0000000000000000 0000000000000000 fffffffdffffffff
    raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
    page dumped because: nonzero _count
    CPU: 0 PID: 733 Comm: kworker/0:1H Tainted: G I 4.15.0-rc7.bart+ #1
    Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015
    Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
    Call Trace:
    dump_stack+0x5c/0x83
    bad_page+0xf5/0x10f
    get_page_from_freelist+0xa46/0x11b0
    __alloc_pages_nodemask+0x103/0x290
    sgl_alloc_order+0x101/0x180
    target_alloc_sgl+0x2c/0x40 [target_core_mod]
    srpt_alloc_rw_ctxs+0x173/0x2d0 [ib_srpt]
    srpt_handle_new_iu+0x61e/0x7f0 [ib_srpt]
    __ib_process_cq+0x55/0xa0 [ib_core]
    ib_cq_poll_work+0x1b/0x60 [ib_core]
    process_one_work+0x141/0x340
    worker_thread+0x47/0x3e0
    kthread+0xf5/0x130
    ret_from_fork+0x1f/0x30

    Fixes: e80a0af4759a ("lib/scatterlist: Introduce sgl_alloc() and sgl_free()")
    Reported-by: Laurence Oberman
    Tested-by: Laurence Oberman
    Signed-off-by: Bart Van Assche
    Cc: Nicholas A. Bellinger
    Cc: Laurence Oberman
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

07 Jan, 2018

1 commit

  • Many kernel drivers contain code that allocates and frees both a
    scatterlist and the pages that populate that scatterlist.
    Introduce functions in lib/scatterlist.c that perform these tasks
    instead of duplicating this functionality in multiple drivers.
    Only include these functions in the build if CONFIG_SGL_ALLOC=y
    to avoid that the kernel size increases if this functionality is
    not used.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

07 Sep, 2017

3 commits

  • Drivers like i915 benefit from being able to control the maximum
    size of the sg coalesced segment while building the scatter-gather
    list.

    Introduce and export the __sg_alloc_table_from_pages function
    which will allow it that control.

    v2: Reorder parameters. (Chris Wilson)
    v3: Fix incomplete reordering in v2.
    v4: max_segment needs to be page aligned.
    v5: Rebase.
    v6: Rebase.
    v7: Fix spelling in commit and mention max segment size in
    __sg_alloc_table_from_pages kerneldoc. (Andrew Morton)

    Signed-off-by: Tvrtko Ursulin
    Cc: Masahiro Yamada
    Cc: linux-kernel@vger.kernel.org
    Cc: Chris Wilson
    Reviewed-by: Chris Wilson
    Cc: Joonas Lahtinen
    Cc: Andrew Morton
    Link: https://patchwork.freedesktop.org/patch/msgid/20170803091351.23594-1-tvrtko.ursulin@linux.intel.com

    Tvrtko Ursulin
     
  • Since the scatterlist length field is an unsigned int, make
    sure that sg_alloc_table_from_pages does not overflow it while
    coalescing pages to a single entry.

    v2: Drop reference to future use. Use UINT_MAX.
    v3: max_segment must be page aligned.
    v4: Do not rely on compiler to optimise out the rounddown.
    (Joonas Lahtinen)
    v5: Simplified loops and use post-increments rather than
    pre-increments. Use PAGE_MASK and fix comment typo.
    (Andy Shevchenko)
    v6: Commit spelling fix.

    Signed-off-by: Tvrtko Ursulin
    Cc: Masahiro Yamada
    Cc: linux-kernel@vger.kernel.org
    Reviewed-by: Chris Wilson
    Cc: Joonas Lahtinen
    Cc: Andy Shevchenko
    Link: https://patchwork.freedesktop.org/patch/msgid/20170803091312.22875-1-tvrtko.ursulin@linux.intel.com

    Tvrtko Ursulin
     
  • Scatterlist entries have an unsigned int for the offset so
    correct the sg_alloc_table_from_pages function accordingly.

    Since these are offsets within a page, unsigned int is
    wide enough.

    Also converts callers which were using unsigned long locally
    with the lower_32_bits annotation to make it explicitly
    clear what is happening.

    v2: Use offset_in_page. (Chris Wilson)

    Signed-off-by: Tvrtko Ursulin
    Cc: Masahiro Yamada
    Cc: Pawel Osciak
    Cc: Marek Szyprowski
    Cc: Kyungmin Park
    Cc: Tomasz Stanislawski
    Cc: Matt Porter
    Cc: Alexandre Bounine
    Cc: linux-media@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Acked-by: Marek Szyprowski (v1)
    Reviewed-by: Chris Wilson
    Reviewed-by: Mauro Carvalho Chehab
    Link: https://patchwork.freedesktop.org/patch/msgid/20170731185512.20010-1-tvrtko.ursulin@linux.intel.com

    Tvrtko Ursulin
     

15 Jun, 2017

1 commit


28 Feb, 2017

2 commits

  • Commit 50bed2e2862a ("sg: disable interrupts inside sg_copy_buffer")
    introduced disabling interrupts in sg_copy_buffer() since atomic uses of
    miter required it due to use of kmap_atomic().

    However, as commit 8290e2d2dcbf ("scatterlist: atomic sg_mapping_iter()
    no longer needs disabled IRQs") acknowledges disabling interrupts is no
    longer needed for calls to kmap_atomic() and therefore unneeded for
    miter ops either, so remove it from sg_copy_buffer().

    Link: http://lkml.kernel.org/r/1486040150-14109-3-git-send-email-gilad@benyossef.com
    Signed-off-by: Gilad Ben-Yossef
    Cc: Jens Axboe
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gilad Ben-Yossef
     
  • Test the cheaper boolean expression with no side effects first.
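    The reordering relies on C's short-circuit evaluation: with the cheap,
    side-effect-free test first, the expensive call is skipped whenever the
    cheap test already decides the result. A hypothetical illustration (the
    names here are invented, not the kernel code):

```c
#include <stdbool.h>

static int expensive_calls; /* counts how often the costly check runs */

static bool expensive_check(void)
{
	expensive_calls++;
	return true;
}

/* Cheap flag first: expensive_check() only runs when the flag
 * alone cannot decide the outcome. */
static bool should_copy(bool cheap_flag)
{
	return cheap_flag && expensive_check();
}
```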

    Link: http://lkml.kernel.org/r/1486040150-14109-2-git-send-email-gilad@benyossef.com
    Signed-off-by: Gilad Ben-Yossef
    Cc: Jens Axboe
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gilad Ben-Yossef
     

09 Feb, 2016

1 commit


17 Aug, 2015

1 commit

  • There are a couple of uses of struct scatterlist that never go to
    the dma_map_sg() helper and thus don't care about ARCH_HAS_SG_CHAIN,
    which indicates that we can map chained S/G lists.

    The most important one is the crypto code, which currently has
    to open code a few helpers to always allow chaining. This patch
    removes a few #ifdef ARCH_HAS_SG_CHAIN statements so that we can
    switch the crypto code to these common helpers.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

01 Jul, 2015

3 commits

  • do_device_access() takes a separate parameter to indicate the direction of
    data transfer, which it used to use to select the appropriate function out
    of sg_pcopy_{to,from}_buffer(). However, these two functions now have
    const-qualified buffer parameters, so their prototypes differ and they can
    no longer be selected interchangeably without a compiler warning.
    So this patch makes it bypass these wrappers and call the underlying
    function sg_copy_buffer() directly; this has the same calling style as
    do_device_access(), i.e. a separate direction-of-transfer parameter and no
    pointers-to-const, so skipping the wrappers not only eliminates the
    warning, it also makes the code simpler :)
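    The shape of this API can be sketched in a simplified userspace model;
    copy_buffer() stands in for sg_copy_buffer() and the wrapper for
    sg_pcopy_from_buffer() (names and signatures simplified, not the kernel
    originals):

```c
#include <stdbool.h>
#include <string.h>

/* Shared implementation: one direction flag and no pointers-to-const,
 * mirroring sg_copy_buffer()'s calling style. */
static size_t copy_buffer(char *sgbuf, char *buf, size_t n, bool to_buffer)
{
	if (to_buffer)
		memcpy(buf, sgbuf, n);
	else
		memcpy(sgbuf, buf, n);
	return n;
}

/* Const-qualified wrapper for the "from buffer" direction; the const
 * must be cast away internally because the implementation is shared. */
static size_t copy_from_buffer(char *sgbuf, const void *buf, size_t n)
{
	return copy_buffer(sgbuf, (char *)buf, n, false);
}
```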

    [akpm@linux-foundation.org: fix very broken build]
    Signed-off-by: Dave Gordon
    Acked-by: Arnd Bergmann
    Cc: James Bottomley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Gordon
     
  • The 'buf' parameter of sg(p)copy_from_buffer() can and should be
    const-qualified, although because of the shared implementation of
    _to_buffer() and _from_buffer(), we have to cast this away internally.

    This means that callers who have a 'const' buffer containing the data to
    be copied to the sg-list no longer have to cast away the const-ness
    themselves. It also enables improved coverage by code analysis tools.

    Signed-off-by: Dave Gordon
    Cc: Akinobu Mita
    Cc: "Martin K. Petersen"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Gordon
     
  • The kerneldoc for the functions doesn't match the code; the last two
    parameters (buflen, skip) have been transposed, which is confusing,
    especially as they're both integral types and the compiler won't warn
    about swapping them.

    These functions and the kerneldoc were introduced in commit:
    df642cea lib/scatterlist: introduce sg_pcopy_from_buffer() ...
    Author: Akinobu Mita
    Date: Mon Jul 8 16:01:54 2013 -0700

    The only difference between sg_pcopy_{from,to}_buffer() and
    sg_copy_{from,to}_buffer() is an additional argument that
    specifies the number of bytes to skip in the SG list before
    copying.

    The functions have the extra argument at the end, but the kerneldoc
    lists it in penultimate position.

    Signed-off-by: Dave Gordon
    Reviewed-by: Akinobu Mita
    Cc: "Martin K. Petersen"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Gordon
     

03 Jun, 2015

1 commit

  • When performing a dma_map_sg() call, the number of sg entries to map is
    required. Using sg_nents to retrieve the number of sg entries will
    return the total number of entries in the sg list up to the entry marked
    as the end. If there happen to be unused entries in the list, these will
    still be counted. Some dma_map_sg() implementations will not handle the
    unused entries correctly (lib/swiotlb.c) and execute a BUG_ON.

    The sg_nents_for_len() function will traverse the sg list and return the
    number of entries required to satisfy the supplied length argument. This
    can then be supplied to the dma_map_sg() call to successfully map the
    sg.
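    The counting logic can be modeled over a plain array of segment lengths
    (a stand-in for walking the real sg list with sg_next(); this mirrors
    sg_nents_for_len()'s contract but is not the kernel implementation):

```c
#include <stddef.h>

/* Return how many leading segments are needed to cover `len` bytes,
 * or -1 if the segments run out first -- modeling sg_nents_for_len()
 * on an array of segment lengths. */
static int nents_for_len(const size_t *seg_len, int nents, size_t len)
{
	int i;

	for (i = 0; i < nents; i++) {
		if (seg_len[i] >= len)
			return i + 1; /* this segment satisfies the rest */
		len -= seg_len[i];
	}
	return -1; /* list too short for the requested length */
}
```

    Passing the returned count (rather than the full entry count) to
    dma_map_sg() avoids handing unused trailing entries to the mapping code.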

    Signed-off-by: Tom Lendacky
    Signed-off-by: Herbert Xu

    Tom Lendacky
     

29 Oct, 2014

1 commit


09 Aug, 2014

1 commit

  • Rather than have architectures #define ARCH_HAS_SG_CHAIN in an
    architecture-specific scatterlist.h, make it a proper Kconfig option and
    use that instead. At the same time, remove the header files that are now
    mostly useless and just include asm-generic/scatterlist.h.

    [sfr@canb.auug.org.au: powerpc files now need asm/dma.h]
    Signed-off-by: Laura Abbott
    Acked-by: Thomas Gleixner [x86]
    Acked-by: Benjamin Herrenschmidt [powerpc]
    Acked-by: Heiko Carstens
    Cc: Russell King
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Paul Mackerras
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: "James E.J. Bottomley"
    Cc: Martin Schwidefsky
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laura Abbott
     

26 Jul, 2014

1 commit

  • Blk-mq drivers usually preallocate their S/G list as part of the request,
    but if we want to support the very large S/G lists currently supported by
    the SCSI code that would tie up a lot of memory in the preallocated request
    pool. Add support to the scatterlist code so that it can initialize an
    S/G list that uses a preallocated first chunk and dynamically allocated
    additional chunks. That way the scsi-mq code can preallocate a first
    page worth of S/G entries as part of the request, and dynamically extend
    the S/G list when needed.
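    The memory-layout trick - a small chunk embedded in the request, extended
    with heap allocations only when needed - can be sketched generically
    (hypothetical names, not the kernel's chained sg_alloc_table API):

```c
#include <stdlib.h>

#define INLINE_ENTS 4 /* entries preallocated inside the request */

struct req {
	int inline_ents[INLINE_ENTS]; /* first chunk, no allocation needed */
	int *extra;                   /* dynamically allocated overflow */
	int nents;
};

/* Size the table for `nents` entries; only allocate when the inline
 * chunk is too small, so the common (small) case is allocation-free. */
static int req_alloc_ents(struct req *r, int nents)
{
	r->nents = nents;
	r->extra = NULL;
	if (nents > INLINE_ENTS) {
		r->extra = calloc((size_t)(nents - INLINE_ENTS), sizeof(int));
		if (!r->extra)
			return -1;
	}
	return 0;
}
```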

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Webb Scales
    Acked-by: Jens Axboe
    Tested-by: Bart Van Assche
    Tested-by: Robert Elliott

    Christoph Hellwig
     

09 Dec, 2013

1 commit

  • sg_copy_buffer() can't meet the needs of some drivers (such as USB
    mass storage), so we have to use the sg_miter_* APIs to access the
    sg buffer, which means sg_miter_skip() needs to be exported for
    these drivers.

    The API is needed for converting the USB storage driver to the
    sg_miter_* APIs for accessing the sg buffer.
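    Skipping works by consuming whole segments until the remaining offset
    falls inside one. A simplified model over an array of segment lengths
    (a toy stand-in, not the real struct sg_mapping_iter or the
    sg_miter_skip() signature):

```c
#include <stdbool.h>
#include <stddef.h>

struct miter {                /* toy stand-in for struct sg_mapping_iter */
	const size_t *seg_len;
	int nents;
	int idx;              /* current segment */
	size_t off;           /* offset within the current segment */
};

/* Advance the iterator by `skip` bytes without copying anything,
 * returning false if the list is exhausted before the skip completes. */
static bool miter_skip(struct miter *m, size_t skip)
{
	while (m->idx < m->nents) {
		size_t left = m->seg_len[m->idx] - m->off;

		if (skip < left) {
			m->off += skip; /* lands inside this segment */
			return true;
		}
		skip -= left; /* consume the rest of this segment */
		m->idx++;
		m->off = 0;
	}
	return skip == 0;
}
```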

    Acked-by: Andrew Morton
    Cc: FUJITA Tomonori
    Cc: Jens Axboe
    Signed-off-by: Ming Lei
    Reviewed-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Ming Lei
     

01 Nov, 2013

1 commit

  • Commit b1adaf65ba03 ("[SCSI] block: add sg buffer copy helper
    functions") introduces two sg buffer copy helpers, and calls
    flush_kernel_dcache_page() on pages in SG list after these pages are
    written to.

    Unfortunately, the commit may introduce a potential bug:

    - Before sending some SCSI commands, a kmalloc() buffer may be passed to
    the block layer, so flush_kernel_dcache_page() can end up seeing a slab
    page

    - According to cachetlb.txt, flush_kernel_dcache_page() is only called
    on "a user page", which surely can't be a slab page.

    - ARCH's implementation of flush_kernel_dcache_page() may use page
    mapping information to do optimization so page_mapping() will see the
    slab page, then VM_BUG_ON() is triggered.

    Aaro Koskinen reported the bug on ARM/kirkwood when DEBUG_VM is enabled,
    and this patch fixes the bug by adding test of '!PageSlab(miter->page)'
    before calling flush_kernel_dcache_page().

    Signed-off-by: Ming Lei
    Reported-by: Aaro Koskinen
    Tested-by: Simon Baatz
    Cc: Russell King - ARM Linux
    Cc: Will Deacon
    Cc: Aaro Koskinen
    Acked-by: Catalin Marinas
    Cc: FUJITA Tomonori
    Cc: Tejun Heo
    Cc: "James E.J. Bottomley"
    Cc: Jens Axboe
    Cc: [3.2+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ming Lei
     

10 Jul, 2013

1 commit

  • I was reviewing code which I suspected might allocate a zero size SG
    table. That will cause memory corruption. Also we can't return before
    doing the memset or we could end up using uninitialized memory in the
    cleanup path.
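    Both rules generalize: reject a zero-size request up front, and initialize
    the structure before any path can return it partially set up, so the
    caller's cleanup never reads uninitialized fields. A hedged sketch with
    invented names (not the kernel's sg_alloc_table internals):

```c
#include <stddef.h>
#include <string.h>

struct table {
	unsigned int nents;
	void *entries;
};

/* Initialize a table for `nents` entries. The struct is zeroed before
 * any return path, and zero-size requests are rejected rather than
 * allowed to corrupt memory downstream. */
static int table_init(struct table *t, unsigned int nents)
{
	memset(t, 0, sizeof(*t)); /* before any early return */
	if (nents == 0)
		return -1;
	t->nents = nents;
	return 0;
}
```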

    Signed-off-by: Dan Carpenter
    Cc: Akinobu Mita
    Cc: Imre Deak
    Cc: Tejun Heo
    Cc: Daniel Vetter
    Cc: Maxim Levitsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Carpenter