03 Apr, 2021

14 commits

  • The nvme_fc_rcv_ls_req() function takes a pointer to the remoteport as
    its first argument, named portptr, but the documentation comment calls
    it remoteport. Fix that to get rid of the compilation warnings.

    drivers/nvme//host/fc.c:1724: warning: Function parameter or member 'portptr' not described in 'nvme_fc_rcv_ls_req'
    drivers/nvme//host/fc.c:1724: warning: Excess function parameter 'remoteport' description in 'nvme_fc_rcv_ls_req'

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: James Smart
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • Add a blank line after the variable declarations in nvme_pr_preempt(),
    nvme_pr_clear(), and nvme_pr_release(), matching the rest of the code
    in nvme/host/core.c.

    No functional change(s) in this patch.

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • nvme_clear_nvme_request() has a check for the flag RQF_DONTPREP and is
    called from nvme_init_request() and nvme_setup_cmd().

    The function nvme_init_request() is called from nvme_alloc_request()
    and nvme_alloc_request_qid(). Both callers allocate a new request
    every time, and RQF_DONTPREP is never set on a newly allocated
    request: after getting a tag, the block layer sets req->rq_flags to 0
    and never sets RQF_DONTPREP when returning the request:

    nvme_alloc_request()
      blk_mq_alloc_request()
        blk_mq_rq_ctx_init()
          rq->rq_flags = 0

    The block layer may set other rq_flags, but RQF_DONTPREP is not one of
    them; that flag is set by the driver.

    That means we can unconditionally set RQF_DONTPREP in rq->rq_flags
    when nvme_init_request()->nvme_clear_nvme_request() is called from the
    above two callers.

    Move the check for RQF_DONTPREP from nvme_clear_nvme_request() into
    nvme_setup_cmd().

    This avoids unnecessary checks in the fast path, since
    nvme_alloc_request() is now called from the fast path when the NVMeOF
    target is configured with a passthru backend.

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
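
The RQF_DONTPREP change described in the entry above can be sketched in user space. This is a minimal illustration, not the driver code: struct mock_request, its fields, the mock_* function names, and the flag value are all hypothetical stand-ins chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the block layer flag; the value is illustrative only. */
#define RQF_DONTPREP (1u << 7)

/* Hypothetical, cut-down request: only the fields this sketch needs. */
struct mock_request {
	unsigned int rq_flags;   /* block layer flags */
	unsigned int retries;    /* driver-private retry counter */
	unsigned int nvme_flags; /* driver-private flags */
};

/* Unconditional reset: no flag check inside the helper any more. */
static void mock_clear_nvme_request(struct mock_request *req)
{
	assert(req != NULL);
	req->retries = 0;
	req->nvme_flags = 0;
	req->rq_flags |= RQF_DONTPREP;
}

/* Allocation path: rq_flags starts at 0, so no check is needed here. */
static void mock_init_request(struct mock_request *req)
{
	req->rq_flags = 0;
	mock_clear_nvme_request(req);
}

/* Command setup path: a request already prepared keeps its driver state. */
static void mock_setup_cmd(struct mock_request *req)
{
	if (!(req->rq_flags & RQF_DONTPREP))
		mock_clear_nvme_request(req);
}
```

The only branch left is in the setup path, where a request may arrive a second time with RQF_DONTPREP already set; the allocation path resets the state without testing anything.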
     
  • Since nvmet_setup_passthru() function falls in fast path when called
    from the NVMeOF passthru backend, make it inline.

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • The function nvme_init_ctrl_finish() (formerly nvme_init_identify()) has
    grown over time to about 200 lines, given the size of the nvme_id_ctrl
    data structure.

    Move the nvme_id_ctrl data structure related initialization into the
    helper nvme_init_identify() and call it from nvme_init_ctrl_finish().

    When we move the code into nvme_init_identify(), change the local
    variable i from int to unsigned int, remove the duplicate kfree()
    after nvme_mpath_init(), and jump to the label out_free if
    nvme_mpath_init() fails.

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • This is a prep patch so that we can move the identify data structure
    related code initialization from nvme_init_identify() into a helper.

    Rename the function nvme_init_identify() to nvme_init_ctrl_finish().

    Next patch will move the nvme_id_ctrl related initialization from newly
    renamed function nvme_init_ctrl_finish() into the nvme_init_identify()
    helper.

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • For passthrough I/O commands, effects are usually zero.
    nvme_passthrough_end() performs three futile checks in this case.
    Skip the function call and its checks when effects is zero.

    Signed-off-by: Kanchan Joshi
    Reviewed-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Kanchan Joshi
     
  • Use the proper macro instead of hard-coded value.

    Signed-off-by: Kanchan Joshi
    Reviewed-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Kanchan Joshi
     
  • Use tabs instead of spaces for indentation in
    nvmet_execute_identify_ns().

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • In nvmet_check_ctrl_status() cmd can be derived from nvmet_req. Remove
    the local variable cmd in the nvmet_check_ctrl_status() and function
    parameter cmd for nvmet_check_ctrl_status(). Derive the cmd value from
    req parameter in the nvmet_check_ctrl_status().

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • Instead of updating the error log page in the caller of
    nvmet_alloc_ctrl(), update the error log page in nvmet_alloc_ctrl()
    itself.

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • In the function nvmet_alloc_ctrl() we assign the status value before we
    call nvmet_find_get_subsys() to:

    status = NVME_SC_CONNECT_INVALID_PARAM | NVME_SC_DNR;

    After we successfully find the subsystem we again set the status value
    to:

    status = NVME_SC_CONNECT_INVALID_PARAM | NVME_SC_DNR;

    Remove the duplicate status assignment value.

    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • Get rid of a local variable that is not needed and just return the
    status directly.

    Signed-off-by: Chaitanya Kulkarni
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     
  • The barriers were added to nvme_irq() in commit 3a7afd8ee42a
    ("nvme-pci: remove the CQ lock for interrupt driven queues") to prevent
    the compiler from optimizing memory accesses to the variables that were
    previously protected by a spinlock in nvme_irq(), both in completion
    queue processing and in the queue head check condition.

    The variable nvmeq->last_cq_head used in those checks was removed in
    commit f6c4d97b0d82 ("nvme/pci: Remove last_cq_head"), which stopped
    poll queues from mistakenly triggering the spurious interrupt
    detection.

    Remove the barriers which were protecting the updates to the variables.

    Reported-by: Heiner Kallweit
    Signed-off-by: Chaitanya Kulkarni
    Reviewed-by: Heiner Kallweit
    Signed-off-by: Christoph Hellwig

    Chaitanya Kulkarni
     

29 Mar, 2021

4 commits


26 Mar, 2021

1 commit

  • …d into for-5.13/drivers

    Pull MD updates from Song:

    "The major changes are:

    1. Performance improvement for raid10 discard requests, from Xiao Ni.
    2. Fix missing information of /proc/mdstat, from Jan Glauber."

    * 'md-next' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
    md: Fix missing unused status line of /proc/mdstat
    md/raid10: improve discard request for far layout
    md/raid10: improve raid10 discard request
    md/raid10: pull the code that wait for blocked dev into one function
    md/raid10: extend r10bio devs to raid disks
    md: add md_submit_discard_bio() for submitting discard bio

    Jens Axboe
     

25 Mar, 2021

6 commits

  • Reading /proc/mdstat with a read buffer size that would not
    fit the unused status line in the first read will skip this
    line from the output.

    So 'dd if=/proc/mdstat bs=64 2>/dev/null' will not print something
    like: unused devices:

    Don't return NULL immediately in start() for v=2 but call
    show() once to print the status line also for multiple reads.

    Cc: stable@vger.kernel.org
    Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code and interface")
    Signed-off-by: Jan Glauber
    Signed-off-by: Song Liu

    Jan Glauber
     
  • For far layout, the discard region is not contiguous on the disks, so it
    needs far copies r10bios to cover all regions, and it needs a way to know
    whether all r10bios have finished. As in raid10_sync_request, only the
    first r10bio's master_bio records the discard bio. The master_bio of each
    other r10bio records the first r10bio. The first r10bio can finish after
    the other r10bios finish and then return the discard bio.

    Tested-by: Adrian Huang
    Signed-off-by: Xiao Ni
    Signed-off-by: Song Liu

    Xiao Ni
     
  • Currently the discard request is split by chunk size, so it takes a long
    time to finish mkfs on disks which support discard. This patch improves
    the handling of raid10 discard requests. It takes an approach similar to
    patch 29efc390b (md/md0: optimize raid0 discard handling).

    But it is a little more complex than raid0, because raid10 has different
    layouts. If raid10 uses the offset layout and the discard request is
    smaller than the stripe size, there are some holes when we submit the
    discard bio to the underlying disks.

    For example: five disks (disk1 - disk5)
    D01 D02 D03 D04 D05
    D05 D01 D02 D03 D04
    D06 D07 D08 D09 D10
    D10 D06 D07 D08 D09
    The discard bio just wants to discard from D03 to D10. For disk3, there is
    a hole between D03 and D08. For disk4, there is a hole between D04 and D09.
    D03 is a chunk, raid10_write_request can handle one chunk perfectly. So
    the part that is not aligned with stripe size is still handled by
    raid10_write_request.

    If a reshape is running when the discard bio arrives and the discard bio
    spans the reshape position, raid10_write_request is responsible for
    handling this discard bio.

    I did a test with this patch set.
    Without patch:
    time mkfs.xfs /dev/md0
    real    4m39.775s
    user    0m0.000s
    sys     0m0.298s

    With patch:
    time mkfs.xfs /dev/md0
    real    0m0.105s
    user    0m0.000s
    sys     0m0.007s

    nvme3n1 259:1 0 477G 0 disk
    └─nvme3n1p1 259:10 0 50G 0 part
    nvme4n1 259:2 0 477G 0 disk
    └─nvme4n1p1 259:11 0 50G 0 part
    nvme5n1 259:6 0 477G 0 disk
    └─nvme5n1p1 259:12 0 50G 0 part
    nvme2n1 259:9 0 477G 0 disk
    └─nvme2n1p1 259:15 0 50G 0 part
    nvme0n1 259:13 0 477G 0 disk
    └─nvme0n1p1 259:14 0 50G 0 part

    Reviewed-by: Coly Li
    Reviewed-by: Guoqing Jiang
    Tested-by: Adrian Huang
    Signed-off-by: Xiao Ni
    Signed-off-by: Song Liu

    Xiao Ni
     
  • The following patch will reuse this logic, so pull the shared code into
    one function.

    Tested-by: Adrian Huang
    Signed-off-by: Xiao Ni
    Signed-off-by: Song Liu

    Xiao Ni
     
  • Currently r10bio->devs[conf->copies] is allocated. A discard bio needs to
    be submitted to all member disks, and it needs to use r10bio. So extend
    the allocation to r10bio->devs[geo.raid_disks].

    Reviewed-by: Coly Li
    Tested-by: Adrian Huang
    Signed-off-by: Xiao Ni
    Signed-off-by: Song Liu

    Xiao Ni
     
  • Move this logic from raid0.c to md.c, so that we can also use it in
    raid10.c.

    Reviewed-by: Coly Li
    Reviewed-by: Guoqing Jiang
    Tested-by: Adrian Huang
    Signed-off-by: Xiao Ni
    Signed-off-by: Song Liu

    Xiao Ni
     

24 Mar, 2021

3 commits

  • This removes the driver on the premise that it has been unused for a long
    time. This is a better approach than changing untestable code that
    nobody cares about in the first place. Similarly, the umem.com website now
    shows a mere GoDaddy parking ad.

    Acked-by: NeilBrown
    Suggested-by: Christoph Hellwig
    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Jens Axboe

    Davidlohr Bueso
     
  • The returned string from rsxx_card_state_to_str is 'const',
    but the other qualifier doesn't change anything here except
    causing a warning with 'clang -Wextra':

    drivers/block/rsxx/core.c:393:21: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers]
    static const char * const rsxx_card_state_to_str(unsigned int state)

    Fixes: f37912039eb0 ("block: IBM RamSan 70/80 trivial changes.")
    Reviewed-by: Nick Desaulniers
    Signed-off-by: Arnd Bergmann
    Link: https://lore.kernel.org/r/20210323215753.281668-1-arnd@kernel.org
    Signed-off-by: Jens Axboe

    Arnd Bergmann
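
The warning in the entry above can be reproduced with a small lookup function. This is a sketch only: state_names and state_to_str are hypothetical names, and the strings are made up, since the log does not show the driver's table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state names; the real driver's table is not shown here. */
static const char * const state_names[] = { "ready", "busy", "fault" };

/*
 * "const char *" already makes the returned characters read-only.
 * Writing "const char * const" on the return type would qualify the
 * returned pointer value itself, which the caller receives as a copy,
 * so the extra qualifier has no effect and only triggers
 * -Wignored-qualifiers. On the array above, the second const is
 * meaningful: it makes the table entries themselves immutable.
 */
static const char *state_to_str(unsigned int state)
{
	size_t n = sizeof(state_names) / sizeof(state_names[0]);

	return state < n ? state_names[state] : "unknown";
}
```

Dropping the second const from the return type leaves callers unaffected, because they could never observe the qualifier on a returned value in the first place.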
     
  • Sysace IP is no longer used on Xilinx PowerPC 405/440 and Microblaze
    systems. The driver is not regularly tested and has very likely not been
    working for quite a long time, so remove it.

    Signed-off-by: Michal Simek
    Signed-off-by: Jens Axboe

    Michal Simek
     

16 Mar, 2021

2 commits

  • Wire up device_driver->dev_groups, so that really_probe() creates the
    sysfs attributes for us automatically.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Jan Hoeppner
    Signed-off-by: Stefan Haberland
    Link: https://lore.kernel.org/r/20210316094513.2601218-3-sth@linux.ibm.com
    Signed-off-by: Jens Axboe

    Julian Wiedmann
     
  • commit e03c5941f904 ("s390/dasd: Remove unused parameter from
    dasd_generic_probe()") allows us to wire the generic callback up
    directly, avoiding the additional level of indirection.

    While at it also remove the forward declaration for the dasd_fba_driver
    struct, it's no longer needed.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Jan Hoeppner
    Signed-off-by: Stefan Haberland
    Link: https://lore.kernel.org/r/20210316094513.2601218-2-sth@linux.ibm.com
    Signed-off-by: Jens Axboe

    Julian Wiedmann
     

15 Mar, 2021

10 commits

  • Linus Torvalds
     
  • Doing a

    prctl(PR_SET_MM, PR_SET_MM_AUXV, addr, 1);

    will copy 1 byte from userspace to (quite big) on-stack array
    and then stash everything to mm->saved_auxv.
    AT_NULL terminator will be inserted at the very end.

    /proc/*/auxv handler will find that AT_NULL terminator
    and copy original stack contents to userspace.

    This devious scheme requires CAP_SYS_RESOURCE.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
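
The leak described in the entry above can be modeled in user space without relying on uninitialized memory. In this sketch the stale stack contents are passed in explicitly, AUXV_BYTES is an illustrative size, and save_auxv is a hypothetical function. The zero_first option shows one possible mitigation and is an assumption of this example; the log does not say how the kernel fixed it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define AUXV_BYTES 64 /* illustrative size, not the kernel's */

/*
 * Copy len bytes of user data into a temporary array, then stash the
 * whole array. stale_stack models whatever earlier calls left behind
 * in that stack slot.
 */
static void save_auxv(unsigned char *saved,
		      const unsigned char *stale_stack,
		      const unsigned char *user, size_t len,
		      int zero_first)
{
	unsigned char tmp[AUXV_BYTES];

	assert(len <= AUXV_BYTES);
	memcpy(tmp, stale_stack, AUXV_BYTES); /* leftover stack contents */
	if (zero_first)
		memset(tmp, 0, AUXV_BYTES);   /* assumed mitigation */
	memcpy(tmp, user, len);               /* short copy from "userspace" */
	memcpy(saved, tmp, AUXV_BYTES);       /* everything gets stashed */
}
```

With a 1-byte copy and no zeroing, every byte after the first in the saved buffer still holds the stale pattern, which is what a later reader of the saved data would see.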
     
  • Pull irq fixes from Thomas Gleixner:
    "A set of irqchip updates:

    - Make the GENERIC_IRQ_MULTI_HANDLER configuration correct

    - Add a missing DT compatible string for the Ingenic driver

    - Remove the pointless debugfs_file pointer from struct irqdomain"

    * tag 'irq-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqchip/ingenic: Add support for the JZ4760
    dt-bindings/irq: Add compatible string for the JZ4760B
    irqchip: Do not blindly select CONFIG_GENERIC_IRQ_MULTI_HANDLER
    ARM: ep93xx: Select GENERIC_IRQ_MULTI_HANDLER directly
    irqdomain: Remove debugfs_file from struct irq_domain

    Linus Torvalds
     
  • Pull timer fix from Thomas Gleixner:
    "A single fix for hrtimers to prevent an interrupt storm caused by
    the lack of reevaluation of the timers which expire in softirq context
    under certain circumstances, e.g. when the clock was set"

    * tag 'timers-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()

    Linus Torvalds
     
  • Pull scheduler fixes from Thomas Gleixner:
    "A set of scheduler updates:

    - Prevent a NULL pointer dereference in the migration_stop_cpu()
    mechanism

    - Prevent self concurrency of affine_move_task()

    - Small fixes and cleanups related to task migration/affinity setting

    - Ensure that sync_runqueues_membarrier_state() is invoked on the
    current CPU when it is in the cpu mask"

    * tag 'sched-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/membarrier: fix missing local execution of ipi_sync_rq_state()
    sched: Simplify set_affinity_pending refcounts
    sched: Fix affine_move_task() self-concurrency
    sched: Optimize migration_cpu_stop()
    sched: Collate affine_move_task() stoppers
    sched: Simplify migration_cpu_stop()
    sched: Fix migration_cpu_stop() requeueing

    Linus Torvalds
     
  • Pull objtool fix from Thomas Gleixner:
    "A single objtool fix to handle the PUSHF/POPF validation correctly for
    the paravirt changes which modified arch_local_irq_restore not to use
    popf"

    * tag 'objtool-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    objtool,x86: Fix uaccess PUSHF/POPF validation

    Linus Torvalds
     
  • Pull locking fixes from Thomas Gleixner:
    "A couple of locking fixes:

    - A fix for the static_call mechanism so it handles unaligned
    addresses correctly.

    - Make u64_stats_init() a macro so every instance gets a separate
    lockdep key.

    - Make seqcount_latch_init() a macro as well to preserve the static
    variable which is used for the lockdep key"

    * tag 'locking-urgent-2021-03-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    seqlock,lockdep: Fix seqcount_latch_init()
    u64_stats,lockdep: Fix u64_stats_init() vs lockdep
    static_call: Fix the module key fixup

    Linus Torvalds
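
The macro conversions mentioned in the locking entry above rest on a general C property: a static variable inside a function exists once, while a static variable inside a macro body exists once per expansion. The sketch below shows that property only; struct mock_key, struct mock_stats, stats_init_fn, and stats_init_macro are hypothetical names and do not reflect the kernel's lockdep types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a lock class key and the object using it. */
struct mock_key { char dummy; };
struct mock_stats { const struct mock_key *key; };

/* Function version: one static key shared by every caller. */
static void stats_init_fn(struct mock_stats *s)
{
	static struct mock_key key;

	s->key = &key;
}

/* Macro version: each expansion declares its own static key. */
#define stats_init_macro(s)                  \
	do {                                 \
		static struct mock_key key_; \
		(s)->key = &key_;            \
	} while (0)
```

Two objects initialized through the function end up pointing at the same key, whereas two separate uses of the macro get distinct keys, which is the per-instance behavior the entry describes.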
     
  • Pull perf fixes from Borislav Petkov:

    - Make sure PMU internal buffers are flushed for per-CPU events too and
    properly handle PID/TID for large PEBS.

    - Handle the case properly when there's no PMU and therefore return an
    empty list of perf MSRs for VMX to switch instead of reading random
    garbage from the stack.

    * tag 'perf_urgent_for_v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/perf: Use RET0 as default for guest_get_msrs to handle "no PMU" case
    perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR
    perf/core: Flush PMU internal buffers for per-CPU events

    Linus Torvalds
     
  • Pull EFI fix from Ard Biesheuvel via Borislav Petkov:
    "Fix an oversight in the handling of EFI_RT_PROPERTIES_TABLE, which was
    added in v5.10, but failed to take the SetVirtualAddressMap() RT service
    into account"

    * tag 'efi-urgent-for-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    efi: stub: omit SetVirtualAddressMap() if marked unsupported in RT_PROP table

    Linus Torvalds
     
  • Pull x86 fixes from Borislav Petkov:

    - A couple of SEV-ES fixes and robustifications: verify usermode stack
    pointer in NMI is not coming from the syscall gap, correctly track
    IRQ states in the #VC handler and access user insn bytes atomically
    in same handler as latter cannot sleep.

    - Balance 32-bit fast syscall exit path to do the proper work on exit
    and thus not confuse audit and ptrace frameworks.

    - Two fixes for the ORC unwinder going "off the rails" into KASAN
    redzones and when ORC data is missing.

    * tag 'x86_urgent_for_v5.12_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/sev-es: Use __copy_from_user_inatomic()
    x86/sev-es: Correctly track IRQ states in runtime #VC handler
    x86/sev-es: Check regs->sp is trusted before adjusting #VC IST stack
    x86/sev-es: Introduce ip_within_syscall_gap() helper
    x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls
    x86/unwind/orc: Silence warnings caused by missing ORC data
    x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2

    Linus Torvalds