24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove fall-through
    markings where they are no longer necessary.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
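
    As a rough, self-contained illustration (the kernel's real definition
    lives in its compiler headers; the version guard below is an
    assumption, not the kernel's actual one), the pseudo-keyword boils
    down to a compiler attribute that documents intentional fall-through:

```c
#include <assert.h>

/* Illustrative stand-in for the kernel macro (assumption: GCC >= 7
 * supports the statement attribute; older compilers get a no-op). */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 7
#define fallthrough __attribute__((__fallthrough__))
#else
#define fallthrough do {} while (0)
#endif

/* Deliberate fall-through: case 1 also applies the case 2 step. */
static int accumulate(int op)
{
	int v = 0;

	switch (op) {
	case 1:
		v += 1;
		fallthrough;	/* replaces the old comment marker */
	case 2:
		v += 2;
		break;
	default:
		break;
	}
	return v;
}
```

    Unlike a comment, the macro is visible to the compiler, so
    -Wimplicit-fallthrough can flag every unmarked fall-through.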

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     

06 Aug, 2019

2 commits


21 Jun, 2019

1 commit

  • bio_add_pc_page() may merge pages when a bio is padded due to a flush.
    Fix iteration over the bio to free the correct pages in case of a merge.

    Signed-off-by: Heiner Litz
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Heiner Litz
     

07 May, 2019

6 commits

  • This patch replaces the few remaining usages of rqd->ppa_list[] with
    the existing nvm_rq_to_ppa_list() helper. This is needed for theoretical
    devices with ws_min/ws_opt equal to 1.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • This patch changes the approach to handling the partial read path.

    In the old approach, merging of data from the round buffer and the
    drive was done entirely by the driver. This had some disadvantages:
    the code was complex and relied on bio internals, so it was hard to
    maintain and strongly dependent on bio changes.

    In the new approach, most of the handling is done by block layer
    functions such as bio_split(), bio_chain() and generic_make_request(),
    and it is generally less complex and easier to maintain. Some more
    details of the new approach follow.

    When a read bio arrives, it is cloned for pblk internal purposes. All
    the L2P mapping, which includes copying data from the round buffer to
    the bio and thus the bio_advance() calls, is done on the cloned bio,
    so the original bio is untouched. If we find that we have a partial
    read case, we still have the original bio untouched, so we can split
    it and continue to process only the first part of it in the current
    context, while the rest is submitted as a separate bio request to
    generic_make_request() for further processing.
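
    The split idea can be sketched in plain userspace C (purely
    illustrative; struct req, leading_cached() and split_req() are
    invented names, not pblk code): serve the leading cache-resident run
    now and leave the remainder as a new request for resubmission:

```c
#include <assert.h>
#include <stddef.h>

/* A request over a range of sectors (schematic stand-in for a bio). */
struct req { size_t start, len; };

/* cached[i] != 0 means sector i can be served from the write buffer. */
static size_t leading_cached(const int *cached, const struct req *r)
{
	size_t n = 0;

	while (n < r->len && cached[r->start + n])
		n++;
	return n;
}

/* Split r: the returned first part is handled in the current context;
 * r becomes the remainder that would be resubmitted, mirroring
 * bio_split() followed by generic_make_request(). */
static struct req split_req(const int *cached, struct req *r)
{
	struct req first = { r->start, leading_cached(cached, r) };

	r->start += first.len;
	r->len -= first.len;
	return first;
}
```

    The key property mirrored here is that the original request is only
    split, never advanced, so the remainder is still a valid request.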

    Signed-off-by: Igor Konopko
    Reviewed-by: Heiner Litz
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • Currently, when there is an IO error (or similar) on the GC read path,
    pblk still moves the line that was under GC to the free state. Such
    behaviour can lead to silent data mismatch issues.

    With this patch, a line that was under GC and on which some IO errors
    occurred is put back into the closed state (instead of the free state,
    as before this patch), and the L2P mapping for the failed sectors is
    not updated.

    Then, in case of any user IOs to the failed sectors, pblk is able to
    return at least a real IO error instead of the stale data returned
    today.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Reviewed-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • Read errors are not correctly propagated. Errors are cleared before
    returning control to the IO submitter. Change the behaviour such that
    all read errors except the high ECC read warning status are returned
    appropriately.

    Signed-off-by: Igor Konopko
    Reviewed-by: Javier González
    Reviewed-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • The smeta_ssec field in pblk_line has been unused since it was
    replaced by the pblk_line_smeta_start() function, so remove it.

    Signed-off-by: Igor Konopko
    Reviewed-by: Hans Holmberg
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • Currently the L2P map size is calculated based on the total number of
    available sectors, which is redundant, since it contains the mapping
    for overprovisioning as well (11% by default).

    Change this size to the real capacity and thus reduce the memory
    footprint significantly - with the default OP value this is approx.
    110MB less DRAM for every 1TB of media.
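
    A back-of-the-envelope check of the quoted figure, assuming 4-byte
    L2P entries and 4KB sectors (both assumptions, not stated in the
    log):

```c
#include <assert.h>
#include <stdint.h>

/* One 32-bit L2P entry per 4KB sector (assumed entry/sector sizes). */
static uint64_t l2p_bytes(uint64_t media_bytes)
{
	return (media_bytes / 4096) * 4;
}

/* Share of the map that only covers the ~11% default overprovisioning. */
static uint64_t op_saving(uint64_t media_bytes)
{
	return l2p_bytes(media_bytes) * 11 / 100;
}
```

    For 1TB of media this gives a 1GB map, of which the OP share is about
    112MB - consistent with the "approx. 110MB" above.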

    Signed-off-by: Igor Konopko
    Reviewed-by: Hans Holmberg
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     

11 Feb, 2019

3 commits

  • This patch fixes a race condition where a write is mapped to the last
    sectors of a line. The write is synced to the device but the L2P is not
    updated yet. When the line is garbage collected before the L2P update
    is performed, the sectors are ignored by the GC logic and the line is
    freed before all sectors are moved. When the L2P is finally updated, it
    contains a mapping to a freed line, and subsequent reads of the
    corresponding LBAs fail.

    This patch introduces a per-line counter specifying the number of
    sectors that are synced to the device but have not been updated in the
    L2P. Lines with a counter greater than zero will not be selected
    for GC.
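
    A minimal sketch of the counter idea (the names below are invented,
    not pblk's actual fields or helpers):

```c
#include <assert.h>

/* Per-line bookkeeping: sectors synced to the device whose L2P update
 * has not happened yet. */
struct line {
	int inflight_l2p;
};

/* Write completed on the device; L2P update still pending. */
static void sec_synced(struct line *l)  { l->inflight_l2p++; }

/* L2P entry for the sector has now been updated. */
static void l2p_updated(struct line *l) { l->inflight_l2p--; }

/* GC victim selection skips lines with pending L2P updates, closing
 * the race between sync completion and the map update. */
static int gc_may_select(const struct line *l)
{
	return l->inflight_l2p == 0;
}
```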

    Signed-off-by: Heiner Litz
    Reviewed-by: Hans Holmberg
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Heiner Litz
     
  • There are new types and helpers that are supposed to be used in new code.

    As a preparation to get rid of legacy types and API functions do
    the conversion here.

    Signed-off-by: Andy Shevchenko
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Andy Shevchenko
     
  • As chunk metadata is allocated using vmalloc, we need to free it
    using vfree.

    Fixes: 090ee26fd512 ("lightnvm: use internal allocation for chunk log page")
    Signed-off-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Hans Holmberg
     

12 Dec, 2018

6 commits

  • pblk performs recovery of open lines by storing the LBA in the per-LBA
    metadata field. Recovery therefore only works for drives that have this
    field.

    This patch adds support for packed metadata, which stores the L2P
    mapping for open lines in the last sector of every write unit and
    enables drives without per-IO metadata to recover open lines.

    After this patch, drives with OOB size <16B are supported.

    Signed-off-by: Igor Konopko
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • Currently lightnvm and pblk use a single DMA pool, whose entry size is
    always equal to PAGE_SIZE. The contents of each entry allocated from
    the DMA pool consist of a PPA list (8 bytes * 64), leaving 56 bytes *
    64 of space for metadata. Since the metadata field can be bigger, such
    as 128 bytes, the static size does not cover this use case.

    This patch adds support for I/O metadata above 56 bytes by changing the
    DMA pool size based on the device meta size, and allows pblk to use OOB
    metadata >=16B.
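
    The sizing arithmetic can be sketched as follows (the 64-sector
    maximum and 8-byte PPA size are taken from the numbers above; the
    helper name is invented):

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE    4096
#define NVM_MAX_VLBA   64	/* max sectors per request */

/* A pool entry must hold a PPA list (8 bytes per sector) plus the
 * per-sector OOB metadata, so size it from the device meta size
 * instead of fixing it at PAGE_SIZE. */
static size_t dma_entry_size(size_t meta_size)
{
	size_t need = NVM_MAX_VLBA * (8 + meta_size);

	return need > PAGE_SIZE ? need : PAGE_SIZE;
}
```

    With 56-byte metadata the entry is exactly one page (64 * 64 = 4096),
    which is why the static size used to work; 128-byte metadata needs
    64 * 136 = 8704 bytes, exceeding it.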

    Reviewed-by: Javier González
    Signed-off-by: Igor Konopko
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • pblk currently assumes that the size of the OOB metadata on the drive
    is always equal to the size of the pblk_sec_meta struct. This commit
    adds helpers that will allow handling different sizes of OOB metadata
    on the drive in the future.

    After this patch, only OOB metadata equal to 16 bytes is supported.

    Reviewed-by: Javier González
    Signed-off-by: Igor Konopko
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Igor Konopko
     
  • pblk's recovery path is single threaded and therefore a number of
    assumptions regarding concurrency can be made. To avoid confusion, make
    this explicit with a couple of comments in the code.

    Signed-off-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Javier González
     
  • Protect the list_add on the pblk_line_init_bb() error
    path in case this code is used for some other purpose
    in the future.

    Signed-off-by: Hua Su
    Reviewed-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Hua Su
     
  • The check for chunk closes suffers from an off-by-one issue, leading
    to chunk close events not being traced.
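
    As a generic illustration of this class of bug (not the actual pblk
    code; the names and chunk size are invented), a close check that
    stops one short of the boundary never fires for a fully written
    chunk:

```c
#include <assert.h>

#define CHUNK_SECS 64	/* illustrative chunk size in sectors */

/* Buggy form: after the final write the write pointer equals the chunk
 * size, so the strict comparison is never true and the close event is
 * never traced. */
static int chunk_closed_buggy(int wp) { return wp > CHUNK_SECS; }

/* Fixed form: the boundary itself means the chunk is closed. */
static int chunk_closed_fixed(int wp) { return wp >= CHUNK_SECS; }
```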

    Fixes: 4c44abf43d00 ("lightnvm: pblk: add trace events for chunk states")
    Signed-off-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Hans Holmberg
     

09 Oct, 2018

19 commits

  • pblk exposes a sysfs interface that represents its internal state. Part
    of this state is the map bitmap for the current open line, which should
    be protected by the line lock to avoid a race when freeing the line
    metadata. Currently, it is not.

    This patch makes sure that the line state is consistent and NULL
    bitmap pointers are not dereferenced.

    Signed-off-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Javier González
     
  • Add a GPL-2.0 SPDX license tag to all pblk files.

    Signed-off-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Javier González
     
  • pblk guarantees write ordering at a chunk level through a per open chunk
    semaphore. At this point, since we only have an open I/O stream for both
    user and GC data, the semaphore is per parallel unit.

    For the metadata I/O that is synchronous, the semaphore is not needed as
    ordering is guaranteed. However, if the metadata scheme changes or
    multiple streams are open, this guarantee might not be preserved.

    This patch makes sure that all writes go through the semaphore, even for
    synchronous I/O. This is consistent with pblk's write I/O model. It also
    simplifies maintenance since changes in the metadata scheme could cause
    ordering issues.

    Signed-off-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Javier González
     
  • pblk maintains two different metadata paths for smeta and emeta, which
    store metadata at the start of the line and at the end of the line,
    respectively. Until now, these paths have been common for writing and
    retrieving metadata; however, as these paths diverge, the common code
    becomes less clear and unnecessarily complicated.

    In preparation for further changes to the metadata write path, this
    patch separates the write and read paths for smeta and emeta and
    removes the synchronous emeta path, as it is not used anymore (emeta is
    scheduled asynchronously to prevent jitter due to internal I/Os).

    Signed-off-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Javier González
     
  • DMA allocations for the ppa_list and meta_list in rqd are replicated
    in several places across the pblk codebase. Add helpers to encapsulate
    creation and deletion and simplify the code.

    Signed-off-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Javier González
     
  • The lightnvm subsystem provides helpers to retrieve chunk metadata,
    where the target needs to provide a buffer to store the metadata. An
    implicit assumption is that this buffer is contiguous and can be used to
    retrieve the data from the device. If the device exposes too many
    chunks, then kmalloc might fail, thus failing instance creation.

    This patch removes this assumption by implementing an internal buffer in
    the lightnvm subsystem to retrieve chunk metadata. Targets can then
    use virtual memory allocations. Since this is a target API change, adapt
    pblk accordingly.

    Signed-off-by: Javier González
    Reviewed-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Javier González
     
  • Trace state of chunk resets.

    Signed-off-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Hans Holmberg
     
  • Add trace events for tracking pblk state changes.

    Signed-off-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Hans Holmberg
     
  • Add trace events for logging for line state changes.

    Signed-off-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Hans Holmberg
     
  • Introduce trace points for tracking chunk states in pblk - this is
    useful for inspecting the entire state of the drive, and really handy
    for both fw and pblk debugging.

    Signed-off-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Hans Holmberg
     
  • Remove the debug-only iteration within __pblk_down_page, which
    then allows us to reduce the arguments of the functions that call it
    down to pblk and the parallel unit, simplifying the callers' logic
    considerably.

    Also, rename the functions pblk_[down/up]_page to
    pblk_[down/up]_chunk, to communicate that it manages the write
    pointer of the chunk. Note that it also protects the parallel unit
    such that at most one chunk is active per parallel unit.

    Signed-off-by: Matias Bjørling
    Reviewed-by: Javier González
    Signed-off-by: Jens Axboe

    Matias Bjørling
     
  • The parameters nr_ppas and ppa_list are not used, so remove them.

    Signed-off-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Hans Holmberg
     
  • Line map bitmap allocations are fairly large and can fail. Allocation
    failures are fatal to pblk, stopping the write pipeline. To avoid this,
    allocate the bitmaps using a mempool instead.

    Mempool allocations never fail if called from a process context,
    and pblk *should* only allocate map bitmaps in process context,
    but keep the failure handling for robustness' sake.
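
    The reserve-backed idea can be sketched in userspace C (illustrative
    only; the kernel's mempool API differs, and all names here are
    invented): a few preallocated buffers guarantee forward progress
    when the regular allocator fails:

```c
#include <assert.h>
#include <stdlib.h>

#define POOL_MIN     2		/* guaranteed reserve elements */
#define BITMAP_BYTES 128	/* illustrative bitmap size */

struct pool {
	void *reserve[POOL_MIN];
	int nfree;
};

/* Fill the reserve up front, while allocation can still succeed. */
static void pool_init(struct pool *p)
{
	p->nfree = 0;
	for (int i = 0; i < POOL_MIN; i++)
		p->reserve[p->nfree++] = malloc(BITMAP_BYTES);
}

/* Try the regular allocator first; fall back to the reserve on
 * failure (malloc_fails simulates allocator pressure for testing). */
static void *pool_alloc(struct pool *p, int malloc_fails)
{
	if (!malloc_fails) {
		void *b = malloc(BITMAP_BYTES);

		if (b)
			return b;
	}
	return p->nfree ? p->reserve[--p->nfree] : NULL;
}
```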

    Signed-off-by: Hans Holmberg
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Hans Holmberg
     
  • If a line is recovered from open chunks, the memory structures for
    emeta have not necessarily been properly set on line initialization.
    When closing a line, make sure that emeta is consistent so that the line
    can be recovered on the fast path on next reboot.

    Also, remove a couple of empty lines at the end of the function.

    Signed-off-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Javier González
     
  • The current helper to obtain a line from a ppa returns the line id,
    which requires its users to explicitly retrieve the pointer to the line
    with the id.

    Make 2 different helpers: one returning the line id and one returning
    the line directly.

    Signed-off-by: Javier González
    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Javier González
     
  • The read completion path uses the put_line variable to decide whether
    the reference on a line should be released. The function name used for
    that is pblk_read_put_rqd_kref, which could lead one to believe that it
    is the rqd that is releasing the reference, while it is the line
    reference that is put.

    Rename and also split the function in two to account for either rqd or
    single ppa callers and move it to core, such that it later can be used
    in the write path as well.

    Signed-off-by: Matias Bjørling
    Reviewed-by: Javier González
    Reviewed-by: Heiner Litz
    Signed-off-by: Jens Axboe

    Matias Bjørling
     
  • pblk implements two data paths for recovering line state: one for 1.2
    and another for 2.0. Instead of having pblk implement these, combine
    them in the core to reduce complexity and make them available to other
    targets.

    The new interface will adhere to the 2.0 chunk definition,
    including managing open chunks with an active write pointer. To provide
    this interface, a 1.2 device recovers the state of the chunks by
    manually detecting if a chunk is either free/open/close/offline, and if
    open, scanning the flash pages sequentially to find the next writeable
    page. This process takes on average ~10 seconds on a device with 64 dies,
    1024 blocks and 60us read access time. The process can be parallelized
    but is left out for maintenance simplicity, as the 1.2 specification is
    deprecated. For 2.0 devices, the logic is maintained internally in the
    drive and retrieved through the 2.0 interface.

    Signed-off-by: Matias Bjørling
    Signed-off-by: Jens Axboe

    Matias Bjørling
     
  • rqd.error is masked by the return value of pblk_submit_io_sync.
    The rqd structure is then passed on to the end_io function, which
    assumes that any error should lead to a chunk being marked
    offline/bad. Since pblk_submit_io_sync can fail before the
    command is issued to the device, the error value may not correspond
    to a media failure, leading to chunks being prematurely retired.

    Also, the pblk_blk_erase_sync function prints an error message in case
    the erase fails. Since the caller prints an error message by itself,
    remove the error message in this function.
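
    The intended separation can be sketched as follows (names and return
    codes invented; not the actual pblk code): a submission failure is
    not evidence of bad media and must not retire the chunk:

```c
#include <assert.h>

enum mark { MARK_OK, MARK_BAD, NO_MARK };

/* Decide what to do with the chunk after an erase attempt: only a
 * media error reported by the device justifies marking it bad; if the
 * command never reached the device, leave the chunk alone. */
static enum mark erase_outcome(int submit_rc, int media_err)
{
	if (submit_rc)		/* submission failed before the device saw it */
		return NO_MARK;
	return media_err ? MARK_BAD : MARK_OK;
}
```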

    Signed-off-by: Matias Bjørling
    Reviewed-by: Javier González
    Reviewed-by: Hans Holmberg
    Signed-off-by: Jens Axboe

    Matias Bjørling
     
  • Add an nvm_set_flags helper to enable the core to appropriately
    set the command flags for read/write/erase depending on which version
    a drive supports.

    The flags argument can be distilled into the access hint,
    scrambling, and program/erase suspend. Replace the access hint with
    an "is_seq" parameter. The rest of the flags are dependent on the
    command opcode, which is trivial to detect and set.

    Signed-off-by: Matias Bjørling
    Reviewed-by: Javier González
    Signed-off-by: Jens Axboe

    Matias Bjørling
     

13 Jul, 2018

2 commits