17 Jan, 2021

1 commit

  • commit 19cd3403cb0d522dd5e10188eef85817de29e26e upstream.

    Without CRC32 support, this fails to link:

    arm-linux-gnueabi-ld: drivers/lightnvm/pblk-init.o: in function `pblk_init':
    pblk-init.c:(.text+0x2654): undefined reference to `crc32_le'
    arm-linux-gnueabi-ld: drivers/lightnvm/pblk-init.o: in function `pblk_exit':
    pblk-init.c:(.text+0x2a7c): undefined reference to `crc32_le'

    Fixes: a4bd217b4326 ("lightnvm: physical block device (pblk) target")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     

16 Oct, 2020

1 commit

  • There is an off-by-one array check that can lead to an out-of-bounds
    write to devices->info[i]. Fix this by using >= rather than > for
    the size check. Also replace the hard-coded array size limit with
    ARRAY_SIZE on the array.

    Addresses-Coverity: ("Out-of-bounds write")
    Fixes: cd9e9808d18f ("lightnvm: Support for Open-Channel SSDs")
    Signed-off-by: Colin Ian King
    Signed-off-by: Jens Axboe

    Colin Ian King
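
The off-by-one described above can be sketched in userspace C. This is only an illustration, not the kernel code: `struct dev_list`, `struct dev_info`, `dev_list_add()` and the table size of 4 are hypothetical stand-ins, and `ARRAY_SIZE` is redefined locally.

```c
#include <stddef.h>

/* Local equivalent of the kernel's ARRAY_SIZE helper. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-ins for the structures named in the commit message. */
struct dev_info {
	int id;
};

struct dev_list {
	size_t nr_devices;
	struct dev_info info[4];
};

/* Returns 0 on success, -1 if the table is already full. */
int dev_list_add(struct dev_list *devices, int id)
{
	size_t i = devices->nr_devices;

	/*
	 * With '>' instead of '>=', i == ARRAY_SIZE(devices->info) would
	 * pass the check and the store below would land one element past
	 * the end of the array.
	 */
	if (i >= ARRAY_SIZE(devices->info))
		return -1;

	devices->info[i].id = id;
	devices->nr_devices = i + 1;
	return 0;
}
```

Deriving the bound from the array itself means the check stays correct if the array is ever resized.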
     

24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants
    with the new pseudo-keyword macro fallthrough[1]. Also, remove
    unnecessary fall-through markings where applicable.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
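
A minimal userspace sketch of the conversion follows. The macro definition below approximates the kernel's and assumes a compiler that supports the `__fallthrough__` statement attribute (recent GCC or Clang). `cleanup_steps()` is a hypothetical example function, not code from the patch.

```c
/* Userspace approximation of the kernel's fallthrough pseudo-keyword. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical example: count how many teardown steps run for a stage. */
int cleanup_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 3:		/* empty case: no marking is needed here */
	case 2:
		steps++;
		fallthrough;	/* was: fall through comment */
	case 1:
		steps++;
		fallthrough;
	case 0:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

Unlike a comment, the statement form is visible to the compiler, so `-Wimplicit-fallthrough` can tell intentional fall-through from a missing `break`.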
     

01 Jul, 2020

3 commits


03 Jun, 2020

2 commits

  • Pull block updates from Jens Axboe:
    "Core block changes that have been queued up for this release:

    - Remove dead blk-throttle and blk-wbt code (Guoqing)

    - Include pid in blktrace note traces (Jan)

    - Don't spew I/O errors on wouldblock termination (me)

    - Zone append addition (Johannes, Keith, Damien)

    - IO accounting improvements (Konstantin, Christoph)

    - blk-mq hardware map update improvements (Ming)

    - Scheduler dispatch improvement (Salman)

    - Inline block encryption support (Satya)

    - Request map fixes and improvements (Weiping)

    - blk-iocost tweaks (Tejun)

    - Fix for timeout failing with error injection (Keith)

    - Queue re-run fixes (Douglas)

    - CPU hotplug improvements (Christoph)

    - Queue entry/exit improvements (Christoph)

    - Move DMA drain handling to the few drivers that use it (Christoph)

    - Partition handling cleanups (Christoph)"

    * tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-block: (127 commits)
    block: mark bio_wouldblock_error() bio with BIO_QUIET
    blk-wbt: rename __wbt_update_limits to wbt_update_limits
    blk-wbt: remove wbt_update_limits
    blk-throttle: remove tg_drain_bios
    blk-throttle: remove blk_throtl_drain
    null_blk: force complete for timeout request
    blk-mq: drain I/O when all CPUs in a hctx are offline
    blk-mq: add blk_mq_all_tag_iter
    blk-mq: open code __blk_mq_alloc_request in blk_mq_alloc_request_hctx
    blk-mq: use BLK_MQ_NO_TAG in more places
    blk-mq: rename BLK_MQ_TAG_FAIL to BLK_MQ_NO_TAG
    blk-mq: move more request initialization to blk_mq_rq_ctx_init
    blk-mq: simplify the blk_mq_get_request calling convention
    blk-mq: remove the bio argument to ->prepare_request
    nvme: force complete cancelled requests
    blk-mq: blk-mq: provide forced completion method
    block: fix a warning when blkdev.h is included for !CONFIG_BLOCK builds
    block: blk-crypto-fallback: remove redundant initialization of variable err
    block: reduce part_stat_lock() scope
    block: use __this_cpu_add() instead of access by smp_processor_id()
    ...

    Linus Torvalds
     
  • The pgprot argument to __vmalloc is always PAGE_KERNEL now, so remove it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Reviewed-by: Michael Kelley [hyperv]
    Acked-by: Gao Xiang [erofs]
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Wei Liu
    Cc: Christian Borntraeger
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: David Airlie
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Johannes Weiner
    Cc: "K. Y. Srinivasan"
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Sakari Ailus
    Cc: Stephen Hemminger
    Cc: Sumit Semwal
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20200414131348.444715-22-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

27 May, 2020

1 commit


31 Mar, 2020

1 commit

  • Pull block driver updates from Jens Axboe:

    - floppy driver cleanup series from Willy

    - NVMe updates and fixes (Various)

    - null_blk trace improvements (Chaitanya)

    - bcache fixes (Coly)

    - md fixes (via Song)

    - loop block size change optimizations (Martijn)

    - scnprintf() use (Takashi)

    * tag 'for-5.7/drivers-2020-03-29' of git://git.kernel.dk/linux-block: (81 commits)
    null_blk: add trace in null_blk_zoned.c
    null_blk: add tracepoint helpers for zoned mode
    block: add a zone condition debug helper
    nvme: cleanup namespace identifier reporting in nvme_init_ns_head
    nvme: rename __nvme_find_ns_head to nvme_find_ns_head
    nvme: refactor nvme_identify_ns_descs error handling
    nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl
    nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl
    nvme: Fix controller creation races with teardown flow
    nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl
    nvme: Fix ctrl use-after-free during sysfs deletion
    nvme-pci: Re-order nvme_pci_free_ctrl
    nvme: Remove unused return code from nvme_delete_ctrl_sync
    nvme: Use nvme_state_terminal helper
    nvme: release ida resources
    nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO
    nvmet-tcp: optimize tcp stack TX when data digest is used
    nvme-fabrics: Use scnprintf() for avoiding potential buffer overflow
    nvme-multipath: do not reset on unknown status
    nvmet-rdma: allocate RW ctxs according to mdts
    ...

    Linus Torvalds
     

28 Mar, 2020

1 commit

  • Current make_request based drivers use either blk_alloc_queue_node or
    blk_alloc_queue to allocate a queue, and then set up the make_request_fn
    function pointer and a few parameters using the blk_queue_make_request
    helper. Simplify this by passing the make_request pointer to
    blk_alloc_queue, and while at it merge the _node variant into the main
    helper by always passing a node_id, and remove the superfluous gfp_mask
    parameter. A lower-level __blk_alloc_queue is kept for the blk-mq case.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

12 Mar, 2020

1 commit


27 Nov, 2019

1 commit

  • Rework event_create_dir() to use an array of static data instead of
    function pointers where possible.

    The problem is that it would call the function pointer on module load
    before parse_args(), possibly even before jump_labels were initialized.
    Luckily the generated functions don't use jump_labels but it still seems
    fragile. It also gets in the way of changing when we make the module map
    executable.

    The generated functions are basically calling trace_define_field() with a
    bunch of static arguments. So instead of a function, capture these
    arguments in a static array, avoiding the function call.

    Now there are a number of cases where the fields are dynamic (syscall
    arguments, kprobes and uprobes), in which case a static array does not
    work; for these we preserve the function call. Luckily none of these
    cases are related to modules, so we can retain the function call for
    them.

    Also fix up all broken tracepoint definitions that now generate a
    compile error.

    Tested-by: Alexei Starovoitov
    Tested-by: Steven Rostedt (VMware)
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Steven Rostedt (VMware)
    Acked-by: Alexei Starovoitov
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20191111132458.342979914@infradead.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

06 Sep, 2019

2 commits


09 Aug, 2019

1 commit

  • A previous commit correctly removed set-but-not-read variables, but
    this left two new variables now unused. Kill them.

    Fixes: ba6f7da99aaf ("lightnvm: remove set but not used variables 'data_len' and 'rq_len'")
    Reported-by: Stephen Rothwell
    Signed-off-by: Jens Axboe

    Jens Axboe
     

08 Aug, 2019

1 commit

  • drivers/lightnvm/pblk-read.c: In function pblk_submit_read_gc:
    drivers/lightnvm/pblk-read.c:423:6: warning: variable data_len set but not used [-Wunused-but-set-variable]
    drivers/lightnvm/pblk-recovery.c: In function pblk_recov_scan_oob:
    drivers/lightnvm/pblk-recovery.c:368:15: warning: variable rq_len set but not used [-Wunused-but-set-variable]

    They are not used since commit 48e5da725581 ("lightnvm:
    move metadata mapping to lower level driver")

    Reported-by: Hulk Robot
    Signed-off-by: YueHaibing
    Signed-off-by: Jens Axboe

    YueHaibing
     

06 Aug, 2019

3 commits


21 Jun, 2019

2 commits

  • With gcc 4.1:

    drivers/lightnvm/core.c: In function ‘nvm_remove_tgt’:
    drivers/lightnvm/core.c:510: warning: ‘t’ is used uninitialized in this function

    Indeed, if no NVM devices have been registered, t will be an
    uninitialized pointer, and may be dereferenced later. A call to
    nvm_remove_tgt() can be triggered from userspace by issuing the
    NVM_DEV_REMOVE ioctl on the lightnvm control device.

    Fix this by preinitializing t to NULL.

    Fixes: 843f2edbdde085b4 ("lightnvm: do not remove instance under global lock")
    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Geert Uytterhoeven
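
The failure mode can be sketched in userspace C as below. The types and `find_target()` are hypothetical and only mirror the shape of the problem: a pointer assigned inside a loop over devices is never assigned if there are no devices.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for targets and the devices that own them. */
struct target {
	const char *name;
};

struct device {
	struct target *targets;
	size_t nr_targets;
};

/* Returns the first target with a matching name, or NULL if none exists. */
struct target *find_target(struct device *devs, size_t nr_devs,
			   const char *name)
{
	/*
	 * Preinitialize to NULL: if nr_devs is 0 the loop body never runs,
	 * and without this the caller would receive an indeterminate pointer.
	 */
	struct target *t = NULL;
	size_t i, j;

	for (i = 0; i < nr_devs && !t; i++) {
		for (j = 0; j < devs[i].nr_targets; j++) {
			if (strcmp(devs[i].targets[j].name, name) == 0) {
				t = &devs[i].targets[j];
				break;
			}
		}
	}
	return t;
}
```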
     
  • bio_add_pc_page() may merge pages when a bio is padded due to a flush.
    Fix iteration over the bio to free the correct pages in case of a merge.

    Signed-off-by: Heiner Litz
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Heiner Litz
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation this program is
    distributed in the hope that it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with this program see the file copying if not
    write to the free software foundation 675 mass ave cambridge ma
    02139 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 3 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190531190112.675111872@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 May, 2019

1 commit


07 May, 2019

16 commits

  • This patch replaces the few remaining usages of rqd->ppa_list[] with
    the existing nvm_rq_to_ppa_list() helper. This is needed for
    theoretical devices with ws_min/ws_opt equal to 1.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • This patch changes the approach to handling the partial read path.

    In the old approach, merging of data from the ring buffer and the
    drive was done entirely by the driver. This had some disadvantages:
    the code was complex and relied on bio internals, so it was hard to
    maintain and strongly dependent on bio changes.

    In the new approach, most of the handling is done by block layer
    functions such as bio_split(), bio_chain() and generic_make_request(),
    and it is generally less complex and easier to maintain. More details
    of the new approach follow.

    When a read bio arrives, it is cloned for pblk internal purposes. All
    the L2P mapping, which includes copying data from the ring buffer to
    the bio and thus the bio_advance() calls, is done on the cloned bio,
    so the original bio is untouched. If we find that we have a partial
    read case, the original bio is still untouched, so we can split it and
    continue to process only the first part of it in the current context,
    while the rest is issued as a separate bio request which is passed to
    generic_make_request() for further processing.

    Signed-off-by: Igor Konopko
    Reviewed-by: Heiner Litz
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • Currently all the target instances are removed under the global
    nvm_lock. This was needed to ensure that the nvm_dev struct would not
    be freed by a hot unplug event during target removal. However, the
    current implementation has some drawbacks: the same lock is used when
    a new nvme subsystem is registered, so a long target removal on drive
    A can delay the registration (and listing in the OS) of drive B for a
    long time, since it has to wait for that lock.

    Now that we have a kref which ensures that nvm_dev will not be freed
    in the meantime, we can easily get rid of this lock while we are
    removing nvm targets.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • While the creation process is still in progress, the target is not
    yet on the targets list. This leaves a window in which the whole
    lightnvm subsystem can be removed by a call to nvm_unregister(),
    ultimately causing a kernel panic inside the target init function.

    This patch changes the behaviour by adding a kref variable which
    tracks all the users of the nvm_dev structure. When nvm_dev is
    allocated, the kref value is set to 1. The value is then increased
    before every target creation and decreased after target removal. The
    extra reference is dropped when the nvm subsystem is unregistered.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
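
The reference-count lifecycle described above can be sketched as follows. This is a simplified userspace model, not the kernel implementation: it uses a plain `int` where the kernel uses `struct kref` with atomic operations, and every name here (`struct dev`, `dev_alloc_init()`, `target_create()` and so on) is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical model of a refcounted device; not thread-safe. */
struct dev {
	int refs;
	bool freed;	/* stands in for the release callback having run */
};

static void dev_put(struct dev *d)
{
	if (--d->refs == 0)
		d->freed = true;
}

/* Allocation sets the count to 1: the subsystem's own reference. */
void dev_alloc_init(struct dev *d)
{
	d->refs = 1;
	d->freed = false;
}

/* Each target takes a reference before creation starts... */
void target_create(struct dev *d)
{
	d->refs++;
}

/* ...and drops it once the target has been removed. */
void target_remove(struct dev *d)
{
	dev_put(d);
}

/* Unregistering drops only the initial reference. */
void dev_unregister(struct dev *d)
{
	dev_put(d);
}
```

In this model, unregistering while a target still exists leaves the device alive until that target is removed, which is the window the patch closes.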
     
  • This patch ensures that smeta was fully written before even
    trying to read it based on chunk table state and write pointer.

    Signed-off-by: Igor Konopko
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • This patch prepares the read path for the new approach to partial
    read handling, which is simpler than the previous one.

    The most important change is to move the handling of completed and
    failed bios from pblk_make_rq() to the particular read and write
    functions. This is needed because, after the partial read path
    changes, the completed/failed bio will sometimes be different from
    the original one, so we can no longer do this in pblk_make_rq().

    The other changes are a small read path refactor to reduce the size
    of the following patch with the partial read changes.

    Generally the goal of this patch is not to change the functionality,
    but just to prepare the code for the following changes.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • Currently, when there is an IO error (or similar) on the GC read path,
    pblk still moves the line that was under the GC process to the free
    state. Such behaviour can lead to a silent data mismatch issue.

    With this patch, a line under the GC process on which some IO errors
    occurred will be put back into the closed state (instead of the free
    state, as it was without this patch), and the L2P mapping for the
    failed sectors will not be updated.

    Then, in case of any user IOs to such failed sectors, pblk will be
    able to return at least a real IO error instead of stale data, as it
    does right now.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Reviewed-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • Currently during pblk padding an internal IO timeout is used which is
    smaller than the default NVMe timeout. This can lead to various
    use-after-free issues. Since in case of any IO timeouts the NVMe and
    block layers will handle the timeout by themselves and report it back
    to us, there is no need to keep this internal timeout in pblk.

    Signed-off-by: Igor Konopko
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • This patch changes the behaviour of recovery padding in order to
    support the case where some IOs were already submitted to the drive
    and some subsequent ones were not submitted because an error was
    returned.

    Currently, in case of errors we simply exit the pad function without
    waiting for inflight IOs, which leads to a panic on inflight IO
    completion.

    After the changes we always wait for all the inflight IOs before
    exiting the function.

    Signed-off-by: Igor Konopko
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • Read errors are not correctly propagated. Errors are cleared before
    returning control to the io submitter. Change the behaviour such that
    all read errors except the high ecc read warning status are returned
    appropriately.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Reviewed-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • In case of OOB recovery, we can hit the scenario where all the data in
    the line was written and some part of emeta was written too. In such
    a case the pblk_update_line_wp() function will call the
    pblk_alloc_page() function, which will cause left_msecs to be set to
    a value below zero (since this field does not track the emeta region)
    and thus lead to multiple kernel warnings. This patch fixes that issue.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • In the write recovery path, there is a chance that the writer thread
    is not active. Kick it immediately instead of waiting for the timer.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Reviewed-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • In pblk_rb_tear_down_check() the spinlock functions are not
    called in proper order.

    Fixes: a4bd217 ("lightnvm: physical block device (pblk) target")
    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Reviewed-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • When we trigger nvm target removal during device hot unplug, there is
    a chance of hitting a general protection fault. This is caused by the
    use of an nvm_dev that may be freed from another (hot unplug) thread
    (in the nvm_unregister function).

    Introduce a lock in the nvme_ioctl_dev_remove function to prevent this
    situation.

    Signed-off-by: Marcin Dziegielewski
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Marcin Dziegielewski
     
  • In the current implementation of l2p recovery, when we recover after
    gc and have an open line, we do not set the current data line
    properly (we set the last line from the device instead of the last
    line ordered by seq_nr), which results in a kernel panic and data
    corruption.

    Signed-off-by: Marcin Dziegielewski
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Marcin Dziegielewski
     
  • For large IO where blk_queue_split needs to be called inside
    pblk_rw_io, a bio leak results because bio_endio is not called on the
    newly allocated bio. One way to observe this is to mount an ext4
    filesystem on the target and issue 1MB IO with dd, e.g., dd bs=1MB
    if=/dev/null of=/mount/myvolume. kmemleak reports:

    unreferenced object 0xffff88803d7d0100 (size 256):
    comm "kworker/u16:1", pid 68, jiffies 4294899333 (age 284.120s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 60 e8 31 81 88 ff ff .........`.1....
    01 40 00 00 06 06 00 00 00 00 00 00 05 00 00 00 .@..............
    backtrace:
    [] kmem_cache_alloc+0x204/0x3c0
    [] mempool_alloc_slab+0x1d/0x30
    [] mempool_alloc+0x83/0x220
    [] bio_alloc_bioset+0x229/0x320
    [] bio_clone_fast+0x26/0xc0
    [] bio_split+0x41/0x110
    [] blk_queue_split+0x349/0x930
    [] pblk_make_rq+0x1b5/0x1f0
    [] generic_make_request+0x2f9/0x690
    [] submit_bio+0x12e/0x1f0
    [] ext4_io_submit+0x64/0x80
    [] ext4_bio_write_page+0x32e/0x890
    [] mpage_submit_page+0x65/0xc0
    [] mpage_map_and_submit_buffers+0x171/0x330
    [] ext4_writepages+0xd5e/0x1650
    [] do_writepages+0x39/0xc0

    If a split is needed, blk_queue_split returns the newly allocated bio
    to the caller by changing the value of the pointer passed by
    reference, while the original is passed to generic_make_request.

    Although pblk_rw_io's local bio* has changed and been passed to
    pblk_submit_read and pblk_write_to_cache, the work is done on this new
    bio*, and pblk_rw_io returns NVM_IO_DONE, pblk_make_rq calls bio_endio
    on the old bio* because it passed the bio pointer by value to
    pblk_rw_io.

    pblk_rw_io is folded into pblk_make_rq so that there is no copying
    of bio* and bio_endio is called on the correct bio*.

    Signed-off-by: Chansol Kim
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Chansol Kim
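
The pointer-by-value problem described above can be reduced to a small userspace sketch. Everything here is hypothetical and heavily simplified: `struct bio` is a toy type with a single flag, `split_front()` only models the way blk_queue_split redirects the caller's pointer, and setting `completed` stands in for bio_endio.

```c
/* Toy stand-in for a bio; 'completed' models bio_endio having been called. */
struct bio {
	int completed;
};

/* Models a split: the caller's pointer is redirected to the new bio. */
static void split_front(struct bio **bio, struct bio *front)
{
	*bio = front;
}

/* Helper that receives the bio pointer by value, as pblk_rw_io did. */
static void rw_io(struct bio *bio, struct bio *front)
{
	split_front(&bio, front);	/* updates only this function's copy */
	/* ... the I/O work is carried out on the new bio here ... */
}

/* Buggy shape: the caller still holds the old pointer after the helper. */
void make_rq_buggy(struct bio *bio, struct bio *front)
{
	rw_io(bio, front);
	bio->completed = 1;	/* completes the old bio; 'front' is leaked */
}

/* Fixed shape: split and completion happen in the same scope. */
void make_rq_fixed(struct bio *bio, struct bio *front)
{
	split_front(&bio, front);
	/* ... the I/O work is carried out on the new bio here ... */
	bio->completed = 1;	/* 'bio' now refers to the new bio */
}
```

Passing a `struct bio **` down to the helper would also work. Folding the helper into its caller, as the patch does, avoids the extra indirection.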