08 Oct, 2020

1 commit

  • * tag 'v5.4.70': (3051 commits)
    Linux 5.4.70
    netfilter: ctnetlink: add a range check for l3/l4 protonum
    ep_create_wakeup_source(): dentry name can change under you...
    ...

    Conflicts:
    arch/arm/mach-imx/pm-imx6.c
    arch/arm64/boot/dts/freescale/imx8mm-evk.dts
    arch/arm64/boot/dts/freescale/imx8mn-ddr4-evk.dts
    drivers/crypto/caam/caamalg.c
    drivers/gpu/drm/imx/dw_hdmi-imx.c
    drivers/gpu/drm/imx/imx-ldb.c
    drivers/gpu/drm/imx/ipuv3/ipuv3-crtc.c
    drivers/mmc/host/sdhci-esdhc-imx.c
    drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
    drivers/net/ethernet/freescale/enetc/enetc.c
    drivers/net/ethernet/freescale/enetc/enetc_pf.c
    drivers/thermal/imx_thermal.c
    drivers/usb/cdns3/ep0.c
    drivers/xen/swiotlb-xen.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c

    Signed-off-by: Jason Liu

    Jason Liu
     

07 Oct, 2020

1 commit

  • [ Upstream commit 09a6b0bc3be793ca8cba580b7992d73e9f68f15d ]

    Commit f227e3ec3b5c ("random32: update the net random state on interrupt
    and activity") broke compilation and was temporarily fixed by Linus in
    83bdc7275e62 ("random32: remove net_rand_state from the latent entropy
    gcc plugin") by entirely moving net_rand_state out of the things handled
    by the latent_entropy GCC plugin.

    From what I understand when reading the plugin code, using the
    __latent_entropy attribute on a declaration was the wrong part and
    simply keeping the __latent_entropy attribute on the variable definition
    was the correct fix.

    Fixes: 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy gcc plugin")
    Acked-by: Willy Tarreau
    Cc: Emese Revfy
    Signed-off-by: Thibaut Sautereau
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Thibaut Sautereau
     

01 Oct, 2020

2 commits

  • commit 1e1b6d63d6340764e00356873e5794225a2a03ea upstream.

    LLVM implemented a recent "libcall optimization" that lowers calls to
    `sprintf(dest, "%s", str)` where the return value is used to
    `stpcpy(dest, str) - dest`.

    This generally avoids the machinery involved in parsing format strings.
    `stpcpy` is just like `strcpy` except it returns the pointer to the new
    tail of `dest`. This optimization was introduced into clang-12.

    Implement this so that we don't observe linkage failures due to missing
    symbol definitions for `stpcpy`.
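
    A minimal user-space sketch of the equivalence described above. The name
    `sketch_stpcpy` and the two wrapper helpers are hypothetical and only
    illustrate the semantics; this is not the kernel's implementation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for stpcpy(): copy src, including its NUL, into
 * dest and return a pointer to the terminating NUL written in dest. */
static char *sketch_stpcpy(char *dest, const char *src)
{
	size_t len = strlen(src);

	memcpy(dest, src, len + 1);
	return dest + len;
}

/* Length as the lowered call would compute it: stpcpy(dest, str) - dest. */
static int copy_len_stpcpy(char *dest, const char *src)
{
	return (int)(sketch_stpcpy(dest, src) - dest);
}

/* Length as the original sprintf(dest, "%s", str) form computes it. */
static int copy_len_sprintf(char *dest, const char *src)
{
	return sprintf(dest, "%s", src);
}
```

    Both helpers should leave the same bytes in `dest` and return the same
    length, which is why the compiler may substitute one form for the other.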

    Similar to last year's fire drill with: commit 5f074f3e192f
    ("lib/string.c: implement a basic bcmp")

    The kernel is somewhere between a "freestanding" environment (no full
    libc) and "hosted" environment (many symbols from libc exist with the
    same type, function signature, and semantics).

    As Peter Anvin notes, there's not really a great way to inform the
    compiler that you're targeting a freestanding environment but would like
    to opt-in to some libcall optimizations (see pr/47280 below), rather
    than opt-out.

    Arvind notes, -fno-builtin-* behaves slightly differently between GCC
    and Clang, and Clang is missing many __builtin_* definitions, which I
    consider a bug in Clang and am working on fixing.

    Masahiro summarizes the subtle distinction between compilers justly:
    To prevent transformation from foo() into bar(), there are two ways in
    Clang to do that; -fno-builtin-foo, and -fno-builtin-bar. There is
    only one in GCC; -fno-builtin-foo.

    (Any difference in that behavior in Clang is likely a bug from a missing
    __builtin_* definition.)

    Masahiro also notes:
    We want to disable optimization from foo() to bar(),
    but we may still benefit from the optimization from
    foo() into something else. If GCC implements the same transform, we
    would run into a problem because it is not -fno-builtin-bar, but
    -fno-builtin-foo that disables that optimization.

    In this regard, -fno-builtin-foo would be more future-proof than
    -fno-builtin-bar, but -fno-builtin-foo is still potentially overkill. We
    may want to prevent calls from foo() being optimized into calls to
    bar(), but we still may want other optimization on calls to foo().

    It seems that compilers today don't quite provide the fine grain control
    over which libcall optimizations pseudo-freestanding environments would
    prefer.

    Finally, Kees notes that this interface is unsafe, so we should not
    encourage its use. As such, I've removed the declaration from any
    header, but it still needs to be exported to avoid linkage errors in
    modules.

    Reported-by: Sami Tolvanen
    Suggested-by: Andy Lavr
    Suggested-by: Arvind Sankar
    Suggested-by: Joe Perches
    Suggested-by: Kees Cook
    Suggested-by: Masahiro Yamada
    Suggested-by: Rasmus Villemoes
    Signed-off-by: Nick Desaulniers
    Signed-off-by: Andrew Morton
    Tested-by: Nathan Chancellor
    Link: https://lkml.kernel.org/r/20200914161643.938408-1-ndesaulniers@google.com
    Link: https://bugs.llvm.org/show_bug.cgi?id=47162
    Link: https://bugs.llvm.org/show_bug.cgi?id=47280
    Link: https://github.com/ClangBuiltLinux/linux/issues/1126
    Link: https://man7.org/linux/man-pages/man3/stpcpy.3.html
    Link: https://pubs.opengroup.org/onlinepubs/9699919799/functions/stpcpy.html
    Link: https://reviews.llvm.org/D85963
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Nick Desaulniers
     
  • [ Upstream commit 2cb80dbbbaba4f2f86f686c34cb79ea5cbfb0edb ]

    KUnit tests for initialized data behavior of proc_dointvec that is
    explicitly checked in the code. Includes basic parsing tests including
    int min/max overflow.

    Signed-off-by: Iurii Zaikin
    Signed-off-by: Brendan Higgins
    Reviewed-by: Greg Kroah-Hartman
    Reviewed-by: Logan Gunthorpe
    Acked-by: Luis Chamberlain
    Reviewed-by: Stephen Boyd
    Signed-off-by: Shuah Khan
    Signed-off-by: Sasha Levin

    Iurii Zaikin
     

17 Sep, 2020

1 commit

  • commit 40b8b826a6998639dd1c26f0e127f18371e1058d upstream.

    The commit 079ad2fb4bf9 ("kobject: Avoid premature parent object freeing in
    kobject_cleanup()") inadvertently removed the ability to call kobject_del()
    with a NULL pointer. Restore the old behaviour.
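
    The restored behaviour amounts to an early return on NULL. A simplified
    user-space sketch, assuming a hypothetical `struct sketch_obj` in place
    of a real kobject (the return value exists only so the behaviour can be
    checked; the kernel function returns nothing):

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a kobject. */
struct sketch_obj {
	int registered;	/* non-zero while the object is registered */
};

/* Returns 1 if the object was unregistered, 0 if there was nothing to do. */
static int sketch_obj_del(struct sketch_obj *obj)
{
	if (!obj)	/* tolerate NULL so callers need no check of their own */
		return 0;

	obj->registered = 0;
	return 1;
}
```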

    Fixes: 079ad2fb4bf9 ("kobject: Avoid premature parent object freeing in kobject_cleanup()")
    Cc: stable
    Reported-by: Qu Wenruo
    Cc: Heikki Krogerus
    Signed-off-by: Andy Shevchenko
    Reviewed-by: Qu Wenruo
    Link: https://lore.kernel.org/r/20200803082706.65347-1-andriy.shevchenko@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman

    Andy Shevchenko
     

21 Aug, 2020

2 commits

  • [ Upstream commit 0776d1231bec0c7ab43baf440a3f5ef5f49dd795 ]

    Reset the member "test_fs" of the test configuration after a call of the
    function "kfree_const" to a null pointer so that a double memory release
    will not be performed.
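
    The free-then-clear pattern can be sketched in user space as follows.
    This assumes plain `free()` in place of `kfree_const()` and a
    hypothetical `struct sketch_config` that holds only the member named
    above:

```c
#include <stdlib.h>

/* Hypothetical test configuration; only the field named in the text. */
struct sketch_config {
	char *test_fs;
};

/* Free the string and clear the pointer. A second call then passes NULL
 * to free(), which is defined to do nothing, so no double free occurs. */
static void sketch_config_free(struct sketch_config *cfg)
{
	free(cfg->test_fs);
	cfg->test_fs = NULL;
}
```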

    Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader")
    Signed-off-by: Tiezhu Yang
    Signed-off-by: Luis Chamberlain
    Signed-off-by: Andrew Morton
    Acked-by: Luis Chamberlain
    Cc: Alexei Starovoitov
    Cc: Al Viro
    Cc: Christian Brauner
    Cc: Chuck Lever
    Cc: David Howells
    Cc: David S. Miller
    Cc: Greg Kroah-Hartman
    Cc: Jakub Kicinski
    Cc: James Morris
    Cc: Jarkko Sakkinen
    Cc: J. Bruce Fields
    Cc: Jens Axboe
    Cc: Josh Triplett
    Cc: Kees Cook
    Cc: Lars Ellenberg
    Cc: Nikolay Aleksandrov
    Cc: Philipp Reisner
    Cc: Roopa Prabhu
    Cc: "Serge E. Hallyn"
    Cc: Sergei Trofimovich
    Cc: Sergey Kvachonok
    Cc: Shuah Khan
    Cc: Tony Vroon
    Cc: Christoph Hellwig
    Link: http://lkml.kernel.org/r/20200610154923.27510-4-mcgrof@kernel.org
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Tiezhu Yang
     
  • [ Upstream commit 35bd8c07db2ce8fd2834ef866240613a4ef982e7 ]

    Sometimes debugging a device is easiest using devmem on its register
    map, and that can be seen with /proc/iomem. But some device drivers have
    many memory regions. Take for example a networking switch. Its memory
    map used to look like this in /proc/iomem:

    1fc000000-1fc3fffff : pcie@1f0000000
      1fc000000-1fc3fffff : 0000:00:00.5
        1fc010000-1fc01ffff : sys
        1fc030000-1fc03ffff : rew
        1fc060000-1fc0603ff : s2
        1fc070000-1fc0701ff : devcpu_gcb
        1fc080000-1fc0800ff : qs
        1fc090000-1fc0900cb : ptp
        1fc100000-1fc10ffff : port0
        1fc110000-1fc11ffff : port1
        1fc120000-1fc12ffff : port2
        1fc130000-1fc13ffff : port3
        1fc140000-1fc14ffff : port4
        1fc150000-1fc15ffff : port5
        1fc200000-1fc21ffff : qsys
        1fc280000-1fc28ffff : ana

    But after the patch in Fixes: was applied, the information is now
    presented in a much more opaque way:

    1fc000000-1fc3fffff : pcie@1f0000000
      1fc000000-1fc3fffff : 0000:00:00.5
        1fc010000-1fc01ffff : 0000:00:00.5
        1fc030000-1fc03ffff : 0000:00:00.5
        1fc060000-1fc0603ff : 0000:00:00.5
        1fc070000-1fc0701ff : 0000:00:00.5
        1fc080000-1fc0800ff : 0000:00:00.5
        1fc090000-1fc0900cb : 0000:00:00.5
        1fc100000-1fc10ffff : 0000:00:00.5
        1fc110000-1fc11ffff : 0000:00:00.5
        1fc120000-1fc12ffff : 0000:00:00.5
        1fc130000-1fc13ffff : 0000:00:00.5
        1fc140000-1fc14ffff : 0000:00:00.5
        1fc150000-1fc15ffff : 0000:00:00.5
        1fc200000-1fc21ffff : 0000:00:00.5
        1fc280000-1fc28ffff : 0000:00:00.5

    That patch made a fair comment that /proc/iomem might be confusing when
    it shows resources without an associated device, but we can do better
    than just hide the resource name altogether. Namely, we can print the
    device name _and_ the resource name. Like this:

    1fc000000-1fc3fffff : pcie@1f0000000
      1fc000000-1fc3fffff : 0000:00:00.5
        1fc010000-1fc01ffff : 0000:00:00.5 sys
        1fc030000-1fc03ffff : 0000:00:00.5 rew
        1fc060000-1fc0603ff : 0000:00:00.5 s2
        1fc070000-1fc0701ff : 0000:00:00.5 devcpu_gcb
        1fc080000-1fc0800ff : 0000:00:00.5 qs
        1fc090000-1fc0900cb : 0000:00:00.5 ptp
        1fc100000-1fc10ffff : 0000:00:00.5 port0
        1fc110000-1fc11ffff : 0000:00:00.5 port1
        1fc120000-1fc12ffff : 0000:00:00.5 port2
        1fc130000-1fc13ffff : 0000:00:00.5 port3
        1fc140000-1fc14ffff : 0000:00:00.5 port4
        1fc150000-1fc15ffff : 0000:00:00.5 port5
        1fc200000-1fc21ffff : 0000:00:00.5 qsys
        1fc280000-1fc28ffff : 0000:00:00.5 ana

    Fixes: 8d84b18f5678 ("devres: always use dev_name() in devm_ioremap_resource()")
    Signed-off-by: Vladimir Oltean
    Link: https://lore.kernel.org/r/20200601095826.1757621-1-olteanv@gmail.com
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

    Vladimir Oltean
     

19 Aug, 2020

3 commits

  • [ Upstream commit 079ad2fb4bf9eba8a0aaab014b49705cd7f07c66 ]

    If kobject_del() is invoked by kobject_cleanup() to delete the
    target kobject, it may cause its parent kobject to be freed
    before invoking the target kobject's ->release() method, which
    effectively means freeing the parent before dealing with the
    child entirely.

    That is confusing at best and it may also lead to functional
    issues if the callers of kobject_cleanup() are not careful enough
    about the order in which these calls are made, so avoid the
    problem by making kobject_cleanup() drop the last reference to
    the target kobject's parent at the end, after invoking the target
    kobject's ->release() method.

    [ rjw: Rewrite the subject and changelog, make kobject_cleanup()
    drop the parent reference only when __kobject_del() has been
    called. ]

    Reported-by: Naresh Kamboju
    Reported-by: kernel test robot
    Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
    Suggested-by: Rafael J. Wysocki
    Signed-off-by: Heikki Krogerus
    Signed-off-by: Rafael J. Wysocki
    Link: https://lore.kernel.org/r/1908555.IiAGLGrh1Z@kreacher
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

    Heikki Krogerus
     
  • [ Upstream commit f678ce8cc3cb2ad29df75d8824c74f36398ba871 ]

    ddebug_describe_flags() currently fills a caller provided string buffer,
    after testing its size (also passed) in a BUG_ON. Fix this by
    replacing them with a known-big-enough string buffer wrapped in a
    struct, and passing that instead.

    Also simplify ddebug_describe_flags() flags parameter from a struct to
    a member in that struct, and hoist the member deref up to the caller.
    This makes the function reusable (soon) where flags are unpacked.
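
    A rough user-space sketch of the "buffer wrapped in a struct" idea. The
    flag table, its characters, and all `sketch_` names are hypothetical;
    the point is only that the buffer size is derived from the table, so no
    runtime size check is needed:

```c
#include <stddef.h>

/* Hypothetical flag table: one character per flag bit. */
static const struct { unsigned int bit; char ch; } sketch_opts[] = {
	{ 1u << 0, 'p' },
	{ 1u << 1, 'm' },
	{ 1u << 2, 'f' },
};
#define SKETCH_NOPTS (sizeof(sketch_opts) / sizeof(sketch_opts[0]))

/* One byte per possible flag plus the NUL: always big enough. */
struct sketch_flagsbuf {
	char buf[SKETCH_NOPTS + 1];
};

/* Takes the flags value directly rather than the struct that holds it. */
static char *sketch_describe_flags(unsigned int flags,
				   struct sketch_flagsbuf *fb)
{
	char *p = fb->buf;
	size_t i;

	for (i = 0; i < SKETCH_NOPTS; i++)
		if (flags & sketch_opts[i].bit)
			*p++ = sketch_opts[i].ch;
	if (p == fb->buf)
		*p++ = '_';	/* placeholder when no flag is set */
	*p = '\0';
	return fb->buf;
}
```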

    Signed-off-by: Jim Cromie
    Link: https://lore.kernel.org/r/20200719231058.1586423-8-jim.cromie@gmail.com
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

    Jim Cromie
     
  • [ Upstream commit 3906f640224dbe7714b52b66d7d68c0812808e19 ]

    The crypto notify call occurs with a read mutex held so you must
    not do any substantial work directly. In particular, you cannot
    call crypto_alloc_* as they may trigger further notifications
    which may dead-lock in the presence of another writer.

    This patch fixes this by postponing the work into a work queue and
    taking the same lock in the module init function.

    While we're at it this patch also ensures that all RCU accesses are
    marked appropriately (tested with sparse).

    Finally this also reveals a race condition in module param show
    function as it may be called prior to the module init function.
    It's fixed by testing whether crct10dif_tfm is NULL (this is true
    iff the init function has not completed assuming fallback is false).

    Fixes: 11dcb1037f40 ("crc-t10dif: Allow current transform to be...")
    Fixes: b76377543b73 ("crc-t10dif: Pick better transform if one...")
    Signed-off-by: Herbert Xu
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Sasha Levin

    Herbert Xu
     

07 Aug, 2020

2 commits

  • commit 83bdc7275e6206f560d247be856bceba3e1ed8f2 upstream.

    It turns out that the plugin right now ends up being really unhappy
    about the change from 'static' to 'extern' storage that happened in
    commit f227e3ec3b5c ("random32: update the net random state on interrupt
    and activity").

    This is probably a trivial fix for the latent_entropy plugin, but for
    now, just remove net_rand_state from the list of things the plugin
    worries about.

    Reported-by: Stephen Rothwell
    Cc: Emese Revfy
    Cc: Kees Cook
    Cc: Willy Tarreau
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds
     
  • commit f227e3ec3b5cad859ad15666874405e8c1bbc1d4 upstream.

    This modifies the first 32 bits out of the 128 bits of a random CPU's
    net_rand_state on interrupt or CPU activity to complicate remote
    observations that could lead to guessing the network RNG's internal
    state.

    Note that depending on some network devices' interrupt rate moderation
    or binding, this re-seeding might happen on every packet or even almost
    never.

    In addition, with NOHZ some CPUs might not even get timer interrupts,
    leaving their local state rarely updated, while they are running
    networked processes making use of the random state. For this reason, we
    also perform this update in update_process_times() in order to at least
    update the state when there is user or system activity, since it's the
    only case we care about.

    Reported-by: Amit Klein
    Suggested-by: Linus Torvalds
    Cc: Eric Dumazet
    Cc: "Jason A. Donenfeld"
    Cc: Andy Lutomirski
    Cc: Kees Cook
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Signed-off-by: Willy Tarreau
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Willy Tarreau
     

01 Jul, 2020

1 commit


24 Jun, 2020

1 commit

  • [ Upstream commit acaab7335bd6f0c0b54ce3a00bd7f18222ce0f5f ]

    The zlib inflate code has an old micro-optimization based on the
    assumption that for pre-increment memory accesses, the compiler will
    generate code that fits better into the processor's pipeline than what
    would be generated for post-increment memory accesses.

    This optimization was already removed in upstream zlib in 2016:
    https://github.com/madler/zlib/commit/9aaec95e8211

    This optimization causes UB according to C99, which says in section 6.5.6
    "Additive operators": "If both the pointer operand and the result point to
    elements of the same array object, or one past the last element of the
    array object, the evaluation shall not produce an overflow; otherwise, the
    behavior is undefined".

    This UB is not only a theoretical concern, but can also cause trouble for
    future work on compiler-based sanitizers.

    According to the zlib commit, this optimization also is not optimal
    anymore with modern compilers.

    Replace uses of OFF, PUP and UP_UNALIGNED with their definitions in the
    POSTINC case, and remove the macro definitions, just like in the upstream
    patch.
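
    To illustrate the distinction, here is a small sketch of the
    post-increment form. It is not the zlib code itself, and the function
    name is hypothetical. The pre-increment style would instead start each
    pointer one element before its array and use `*++out = *++in`, and
    forming that "one before" pointer is what the C99 text quoted above
    leaves undefined.

```c
#include <stddef.h>

/* Post-increment form: both pointers start at the first element, so no
 * pointer is ever formed before the start of either array. */
static void sketch_copy_postinc(unsigned char *out, const unsigned char *in,
				size_t n)
{
	while (n--)
		*out++ = *in++;
}
```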

    Signed-off-by: Jann Horn
    Signed-off-by: Andrew Morton
    Cc: Mikhail Zaslonko
    Link: http://lkml.kernel.org/r/20200507123112.252723-1-jannh@google.com
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Jann Horn
     

22 Jun, 2020

2 commits

  • [ Upstream commit adb72ae1915db28f934e9e02c18bfcea2f3ed3b7 ]

    Patch series "Fix some incompatibilities between KASAN and FORTIFY_SOURCE", v4.

    3 KASAN self-tests fail on a kernel with both KASAN and FORTIFY_SOURCE:
    memchr, memcmp and strlen.

    When FORTIFY_SOURCE is on, a number of functions are replaced with
    fortified versions, which attempt to check the sizes of the operands.
    However, these functions often directly invoke __builtin_foo() once they
    have performed the fortify check. The compiler can detect that the
    results of these functions are not used, and knows that they have no other
    side effects, and so can eliminate them as dead code.

    Why are only memchr, memcmp and strlen affected?
    ================================================

    Of string and string-like functions, kasan_test tests:

    * strchr -> not affected, no fortified version
    * strrchr -> likewise
    * strcmp -> likewise
    * strncmp -> likewise

    * strnlen -> not affected, the fortify source implementation calls the
    underlying strnlen implementation which is instrumented, not
    a builtin

    * strlen -> affected, the fortify source implementation calls a __builtin
    version which the compiler can determine is dead.

    * memchr -> likewise
    * memcmp -> likewise

    * memset -> not affected, the compiler knows that memset writes to its
    first argument and therefore is not dead.

    Why does this not affect the functions normally?
    ================================================

    In string.h, these functions are not marked as __pure, so the compiler
    cannot know that they do not have side effects. If relevant functions are
    marked as __pure in string.h, we see the following warnings and the
    functions are elided:

    lib/test_kasan.c: In function `kasan_memchr':
    lib/test_kasan.c:606:2: warning: statement with no effect [-Wunused-value]
    memchr(ptr, '1', size + 1);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~
    lib/test_kasan.c: In function `kasan_memcmp':
    lib/test_kasan.c:622:2: warning: statement with no effect [-Wunused-value]
    memcmp(ptr, arr, size+1);
    ^~~~~~~~~~~~~~~~~~~~~~~~
    lib/test_kasan.c: In function `kasan_strings':
    lib/test_kasan.c:645:2: warning: statement with no effect [-Wunused-value]
    strchr(ptr, '1');
    ^~~~~~~~~~~~~~~~
    ...

    This annotation would make sense to add and could be added at any point,
    so the behaviour of test_kasan.c should change.

    The fix
    =======

    Make all the functions that are pure write their results to a global,
    which makes them live. The strlen and memchr tests now pass.

    The memcmp test still fails to trigger, which is addressed in the next
    patch.
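
    The idea of the fix can be sketched in user space like this. The sink
    variables and function name are hypothetical, and `volatile` is an
    assumption made here so the stores cannot be dropped; the calls stay
    live because their results are stored somewhere observable.

```c
#include <string.h>

/* Hypothetical global sinks for otherwise unused results. */
static void *volatile sketch_ptr_result;
static volatile int sketch_int_result;

static void sketch_exercise(const char *buf, size_t size)
{
	/* Without the assignments, a compiler that knows these functions
	 * have no side effects may remove the calls entirely. */
	sketch_ptr_result = memchr(buf, '1', size);
	sketch_int_result = (int)strlen(buf);
}
```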

    [dja@axtens.net: drop patch 3]
    Link: http://lkml.kernel.org/r/20200424145521.8203-2-dja@axtens.net
    Fixes: 0c96350a2d2f ("lib/test_kasan.c: add tests for several string/memory API functions")
    Signed-off-by: Daniel Axtens
    Signed-off-by: Andrew Morton
    Tested-by: David Gow
    Reviewed-by: Dmitry Vyukov
    Cc: Daniel Micay
    Cc: Andrey Ryabinin
    Cc: Alexander Potapenko
    Link: http://lkml.kernel.org/r/20200423154503.5103-1-dja@axtens.net
    Link: http://lkml.kernel.org/r/20200423154503.5103-2-dja@axtens.net
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Daniel Axtens
     
  • [ Upstream commit 18f1ca46858eac22437819937ae44aa9a8f9f2fa ]

    When building 64r6_defconfig with CONFIG_MIPS32_O32 disabled and
    CONFIG_CRYPTO_RSA enabled:

    lib/mpi/generic_mpih-mul1.c:37:24: error: invalid use of a cast in a
    inline asm context requiring an l-value: remove the cast
    or build with -fheinous-gnu-extensions
    umul_ppmm(prod_high, prod_low, s1_ptr[j], s2_limb);
    ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    lib/mpi/longlong.h:664:22: note: expanded from macro 'umul_ppmm'
    : "=d" ((UDItype)(w0))
    ~~~~~~~~~~^~~
    lib/mpi/generic_mpih-mul1.c:37:13: error: invalid use of a cast in a
    inline asm context requiring an l-value: remove the cast
    or build with -fheinous-gnu-extensions
    umul_ppmm(prod_high, prod_low, s1_ptr[j], s2_limb);
    ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    lib/mpi/longlong.h:668:22: note: expanded from macro 'umul_ppmm'
    : "=d" ((UDItype)(w1))
    ~~~~~~~~~~^~~
    2 errors generated.

    This special case for umul_ppmm for MIPS64r6 was added in
    commit bbc25bee37d2b ("lib/mpi: Fix umul_ppmm() for MIPS64r6"), due to
    GCC being inefficient and emitting a __multi3 intrinsic.

    There is no such issue with clang; with this patch applied, I can build
    this configuration without any problems and there are no link errors
    like mentioned in the commit above (which I can still reproduce with
    GCC 9.3.0 when that commit is reverted). Only use this definition when
    GCC is being used.

    This really should have been caught by commit b0c091ae04f67 ("lib/mpi:
    Eliminate unused umul_ppmm definitions for MIPS") when I was messing
    around in this area but I was not testing 64-bit MIPS at the time.

    Link: https://github.com/ClangBuiltLinux/linux/issues/885
    Reported-by: Dmitry Golovin
    Signed-off-by: Nathan Chancellor
    Signed-off-by: Herbert Xu
    Signed-off-by: Sasha Levin

    Nathan Chancellor
     

19 Jun, 2020

1 commit

  • * tag 'v5.4.47': (2193 commits)
    Linux 5.4.47
    KVM: arm64: Save the host's PtrAuth keys in non-preemptible context
    KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
    ...

    Conflicts:
    arch/arm/boot/dts/imx6qdl.dtsi
    arch/arm/mach-imx/Kconfig
    arch/arm/mach-imx/common.h
    arch/arm/mach-imx/suspend-imx6.S
    arch/arm64/boot/dts/freescale/imx8qxp-mek.dts
    arch/powerpc/include/asm/cacheflush.h
    drivers/cpufreq/imx6q-cpufreq.c
    drivers/dma/imx-sdma.c
    drivers/edac/synopsys_edac.c
    drivers/firmware/imx/imx-scu.c
    drivers/net/ethernet/freescale/fec.h
    drivers/net/ethernet/freescale/fec_main.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    drivers/net/phy/phy_device.c
    drivers/perf/fsl_imx8_ddr_perf.c
    drivers/usb/cdns3/gadget.c
    drivers/usb/dwc3/gadget.c
    include/uapi/linux/dma-buf.h

    Signed-off-by: Jason Liu

    Jason Liu
     

17 Jun, 2020

1 commit

  • commit b5265c813ce4efbfa2e46fd27cdf9a7f44a35d2e upstream.

    In some rare cases, for input data over 32 KB, lzo-rle could encode two
    different inputs to the same compressed representation, so that
    decompression is then ambiguous (i.e. data may be corrupted - although
    zram is not affected because it operates over 4 KB pages).

    This modifies the compressor without changing the decompressor or the
    bitstream format, such that:

    - there is no change to how data produced by the old compressor is
    decompressed

    - an old decompressor will correctly decode data from the updated
    compressor

    - performance and compression ratio are not affected

    - we avoid introducing a new bitstream format

    In testing over 12.8M real-world files totalling 903 GB, three files
    were affected by this bug. I also constructed 37M semi-random 64 KB
    files totalling 2.27 TB, and saw no affected files. Finally I tested
    over files constructed to contain each of the ~1024 possible bad input
    sequences; for all of these cases, updated lzo-rle worked correctly.

    There is no significant impact to performance or compression ratio.

    Signed-off-by: Dave Rodgman
    Signed-off-by: Andrew Morton
    Cc: Mark Rutland
    Cc: Dave Rodgman
    Cc: Willy Tarreau
    Cc: Sergey Senozhatsky
    Cc: Markus F.X.J. Oberhumer
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Chao Yu
    Link: http://lkml.kernel.org/r/20200507100203.29785-1-dave.rodgman@arm.com
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Dave Rodgman
     

27 May, 2020

1 commit

  • commit 7bd57fbc4a4ddedc664cad0bbced1b469e24e921 upstream.

    I don't see what security concern is addressed by obfuscating NULL
    and IS_ERR() error pointers, printed with %p/%pK. Given the number
    of sites where %p is used (over 10000) and the fact that NULL pointers
    aren't uncommon, it probably wouldn't take long for an attacker to
    find the hash that corresponds to 0. Although harder, the same goes
    for most common error values, such as -1, -2, -11, -14, etc.

    The NULL part actually fixes a regression: NULL pointers weren't
    obfuscated until commit 3e5903eb9cff ("vsprintf: Prevent crash when
    dereferencing invalid pointers") which went into 5.2. I'm tacking
    the IS_ERR() part on here because error pointers won't leak kernel
    addresses and printing them as pointers shouldn't be any different
    from e.g. %d with PTR_ERR_OR_ZERO(). Obfuscating them just makes
    debugging based on existing pr_debug and friends excruciating.

    Note that the "always print 0's for %pK when kptr_restrict == 2"
    behaviour which goes way back is left as is.

    Example output with the patch applied:

                    ptr               error-ptr         NULL
    %p:             0000000001f8cc5b  fffffffffffffff2  0000000000000000
    %pK, kptr = 0:  0000000001f8cc5b  fffffffffffffff2  0000000000000000
    %px:            ffff888048c04020  fffffffffffffff2  0000000000000000
    %pK, kptr = 1:  ffff888048c04020  fffffffffffffff2  0000000000000000
    %pK, kptr = 2:  0000000000000000  0000000000000000  0000000000000000
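
    A hedged sketch of the decision this describes: NULL and error pointers
    are shown as-is, everything else stays subject to hashing. The helper
    names are hypothetical; the 4095 bound mirrors the kernel's MAX_ERRNO,
    and the error-ptr value fffffffffffffff2 in the table is -14.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095	/* mirrors the kernel's MAX_ERRNO */

/* Non-zero if ptr encodes a small negative errno, i.e. lies in the top
 * SKETCH_MAX_ERRNO values of the address space. */
static int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Non-zero if the value would be printed unhashed. */
static int sketch_print_unhashed(const void *ptr)
{
	return ptr == NULL || sketch_is_err(ptr);
}
```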

    Fixes: 3e5903eb9cff ("vsprintf: Prevent crash when dereferencing invalid pointers")
    Signed-off-by: Ilya Dryomov
    Reviewed-by: Petr Mladek
    Reviewed-by: Sergey Senozhatsky
    Reviewed-by: Andy Shevchenko
    Acked-by: Steven Rostedt (VMware)
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Ilya Dryomov
     

10 May, 2020

2 commits

  • [ Upstream commit e537654b7039aacfe8ae629d49655c0e5692ad44 ]

    Implement a resource managed strongly uncachable ioremap function.

    Cc: # v4.19+
    Tested-by: AceLan Kao
    Signed-off-by: Tuowen Zhao
    Acked-by: Mika Westerberg
    Acked-by: Andy Shevchenko
    Acked-by: Luis Chamberlain
    Signed-off-by: Lee Jones
    Signed-off-by: Sasha Levin

    Tuowen Zhao
     
  • [ Upstream commit 5990cdee689c6885b27c6d969a3d58b09002b0bc ]

    0day reports over and over on a powerpc randconfig with clang:

    lib/mpi/generic_mpih-mul1.c:37:13: error: invalid use of a cast in a
    inline asm context requiring an l-value: remove the cast or build with
    -fheinous-gnu-extensions

    Remove the superfluous casts, which have been done previously for x86
    and arm32 in commit dea632cadd12 ("lib/mpi: fix build with clang") and
    commit 7b7c1df2883d ("lib/mpi/longlong.h: fix building with 32-bit
    x86").

    Reported-by: kbuild test robot
    Signed-off-by: Nathan Chancellor
    Acked-by: Herbert Xu
    Signed-off-by: Michael Ellerman
    Link: https://github.com/ClangBuiltLinux/linux/issues/991
    Link: https://lore.kernel.org/r/20200413195041.24064-1-natechancellor@gmail.com
    Signed-off-by: Sasha Levin

    Nathan Chancellor
     

29 Apr, 2020

1 commit

  • [ Upstream commit 06bd48b6cd97ef3889b68c8e09014d81dbc463f1 ]

    You can build a user-space test program for the raid6 library code,
    like this:

    $ cd lib/raid6/test
    $ make

    The command in $(shell ...) function is evaluated by /bin/sh by default.
    (or, you can specify the shell by passing SHELL= from command line)

    Currently '>&/dev/null' is used to sink both stdout and stderr. Because
    this syntax is a bashism, it only works when /bin/sh is a symbolic link to
    bash (this is the case on RHEL etc.)

    This does not work on Ubuntu where /bin/sh is a symbolic link to dash.

    I see lots of

    /bin/sh: 1: Syntax error: Bad fd number

    and

    warning "your version of binutils lacks ... support"

    Replace it with portable '>/dev/null 2>&1'.

    Fixes: 4f8c55c5ad49 ("lib/raid6: build proper files on corresponding arch")
    Signed-off-by: Masahiro Yamada
    Acked-by: H. Peter Anvin (Intel)
    Reviewed-by: Jason A. Donenfeld
    Acked-by: Ingo Molnar
    Reviewed-by: Nick Desaulniers
    Signed-off-by: Sasha Levin

    Masahiro Yamada
     

23 Apr, 2020

1 commit

  • commit 7d32e69310d67e6b04af04f26193f79dfc2f05c7 upstream.

    Currently, turning on DEBUG_INFO_SPLIT when DEBUG_INFO_BTF is also
    enabled will produce an invalid BTF file, since the gen_btf function in
    the link-vmlinux.sh script doesn't handle *.dwo files.

    Enabling DEBUG_INFO_REDUCED will also produce an invalid BTF file,
    and using GCC_PLUGIN_RANDSTRUCT with BTF makes no sense.

    Fixes: e83b9f55448a ("kbuild: add ability to generate BTF type info for vmlinux")
    Reported-by: Jann Horn
    Reported-by: Liu Yiding
    Signed-off-by: Slava Bacherikov
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Kees Cook
    Acked-by: KP Singh
    Acked-by: Andrii Nakryiko
    Link: https://lore.kernel.org/bpf/20200402204138.408021-1-slava@bacher09.org
    Signed-off-by: Greg Kroah-Hartman

    Slava Bacherikov
     

17 Apr, 2020

2 commits

  • commit 7e934cf5ace1dceeb804f7493fa28bb697ed3c52 upstream.

    xas_for_each_marked() is using entry == NULL as a termination condition
    of the iteration. When xas_for_each_marked() is used protected only by
    RCU, this can however race with xas_store(xas, NULL) in the following
    way:

    TASK1                               TASK2
    page_cache_delete()                 find_get_pages_range_tag()
                                          xas_for_each_marked()
                                            xas_find_marked()
                                              off = xas_find_chunk()

      xas_store(&xas, NULL)
        xas_init_marks(&xas);
        ...
        rcu_assign_pointer(*slot, NULL);
                                              entry = xa_entry(off);

    And thus xas_for_each_marked() terminates prematurely possibly leading
    to missed entries in the iteration (translating to missing writeback of
    some pages or a similar problem).

    If we find a NULL entry that has been marked, skip it (unless we're trying
    to allocate an entry).

    Reported-by: Jan Kara
    CC: stable@vger.kernel.org
    Fixes: ef8e5717db01 ("page cache: Convert delete_batch to XArray")
    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox (Oracle)
     
  • commit c36d451ad386b34f452fc3c8621ff14b9eaa31a6 upstream.

    Inspired by the recent Coverity report, I looked for other places where
    the offset wasn't being converted to an unsigned long before being
    shifted, and I found one in xas_pause() when the entry being paused is
    of order >32.

    Fixes: b803b42823d0 ("xarray: Add XArray iterators")
    Signed-off-by: Matthew Wilcox (Oracle)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox (Oracle)
     

13 Apr, 2020

1 commit

  • commit d5767057c9a76a29f073dad66b7fa12a90e8c748 upstream.

    ext2_swab() is defined locally in lib/find_bit.c. However, it is not
    specific to ext2, nor to bitmaps.

    There are many potential users of it, so rename it to just swab() and
    move to include/uapi/linux/swab.h

    ABI guarantees that size of unsigned long corresponds to BITS_PER_LONG,
    therefore drop unneeded cast.
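
    For readers unfamiliar with the helper, a byte swap of an unsigned long
    can be sketched in portable userspace C as below. This is an
    illustration only: swab_ulong() is a hypothetical name, and the kernel
    helper is not implemented with a loop like this one.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

static_assert(CHAR_BIT == 8, "sketch assumes 8-bit bytes");

/* Reverse the byte order of an unsigned long, whatever its width. */
static unsigned long swab_ulong(unsigned long x)
{
    unsigned long out = 0;

    for (size_t i = 0; i < sizeof(x); i++) {
        out = (out << CHAR_BIT) | (x & UCHAR_MAX);  /* append low byte */
        x >>= CHAR_BIT;                             /* move to next byte */
    }
    return out;
}
```

    Because the loop runs sizeof(unsigned long) times, the same code covers
    both 32-bit and 64-bit longs without a cast or a width check.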

    Link: http://lkml.kernel.org/r/20200103202846.21616-1-yury.norov@gmail.com
    Signed-off-by: Yury Norov
    Cc: Allison Randal
    Cc: Joe Perches
    Cc: Thomas Gleixner
    Cc: William Breathitt Gray
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Yury Norov
     

08 Apr, 2020

1 commit

  • [ Upstream commit bd40b17ca49d7d110adf456e647701ce74de2241 ]

    Coverity pointed out that xas_sibling() was shifting xa_offset without
    promoting it to an unsigned long first, so the shift could cause an
    overflow and we'd get the wrong answer. The fix is obvious, and the
    new test-case provokes UBSAN to report an error:
    runtime error: shift exponent 60 is too large for 32-bit type 'int'
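
    The class of bug is easy to reproduce outside the kernel. In the sketch
    below (offset_to_index() is a hypothetical helper, not the XArray code),
    the narrow offset is widened to unsigned long before the shift; without
    the cast the operand would be promoted to int and a large shift count
    would be undefined behaviour.

```c
#include <assert.h>
#include <limits.h>

/* Width of unsigned long in bits on the build target. */
#define ULONG_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Turn a slot offset into an index contribution. Writing
 * `offset << shift` directly would perform the shift in int, which
 * overflows once shift reaches the width of int.
 */
static unsigned long offset_to_index(unsigned char offset, unsigned int shift)
{
    assert(shift < ULONG_BITS);
    return (unsigned long)offset << shift;
}
```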

    Fixes: 19c30f4dd092 ("XArray: Fix xa_find_after with multi-index entries")
    Reported-by: Bjorn Helgaas
    Reported-by: Kees Cook
    Signed-off-by: Matthew Wilcox (Oracle)
    Cc: stable@vger.kernel.org
    Signed-off-by: Sasha Levin

    Matthew Wilcox (Oracle)
     

08 Mar, 2020

1 commit

  • Merge Linux stable release v5.4.24 into imx_5.4.y

    * tag 'v5.4.24': (3306 commits)
    Linux 5.4.24
    blktrace: Protect q->blk_trace with RCU
    kvm: nVMX: VMWRITE checks unsupported field before read-only field
    ...

    Signed-off-by: Jason Liu

    Conflicts:
    arch/arm/boot/dts/imx6sll-evk.dts
    arch/arm/boot/dts/imx7ulp.dtsi
    arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
    drivers/clk/imx/clk-composite-8m.c
    drivers/gpio/gpio-mxc.c
    drivers/irqchip/Kconfig
    drivers/mmc/host/sdhci-of-esdhc.c
    drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
    drivers/net/can/flexcan.c
    drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
    drivers/net/ethernet/mscc/ocelot.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    drivers/net/phy/realtek.c
    drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
    drivers/perf/fsl_imx8_ddr_perf.c
    drivers/tee/optee/shm_pool.c
    drivers/usb/cdns3/gadget.c
    kernel/sched/cpufreq.c
    net/core/xdp.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c
    sound/soc/sof/core.c
    sound/soc/sof/imx/Kconfig
    sound/soc/sof/loader.c

    Jason Liu
     

05 Mar, 2020

1 commit

  • commit 7ecaf069da52e472d393f03e79d721aabd724166 upstream.

    Currently, some sanity checks for uapi headers are done by
    scripts/headers_check.pl, which is wired up to the 'headers_check'
    target in the top Makefile.

    It is true that compiling headers gives better test coverage, but there
    are still several headers excluded from the compile test. I would like
    to keep headers_check.pl for a while, but we can delete a lot of
    code by moving the build rule to usr/include/Makefile.

    Signed-off-by: Masahiro Yamada
    Signed-off-by: Greg Kroah-Hartman

    Masahiro Yamada
     

29 Feb, 2020

1 commit

  • commit 305e519ce48e935702c32241f07d393c3c8fed3e upstream.

    Walter Wu has reported a potential case in which init_stack_slab() is
    called after stack_slabs[STACK_ALLOC_MAX_SLABS - 1] has already been
    initialized. In that case init_stack_slab() will overwrite
    stack_slabs[STACK_ALLOC_MAX_SLABS], which may result in a memory
    corruption.
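
    A bounds check of the kind this calls for might look like the following
    userspace sketch. The names (MAX_SLABS, slabs, cur_index,
    toy_init_slab) are hypothetical stand-ins for the stack depot state,
    and the logic is a simplified model rather than the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLABS 4            /* stand-in for STACK_ALLOC_MAX_SLABS */

static void *slabs[MAX_SLABS]; /* stand-in for stack_slabs[] */
static size_t cur_index;       /* slab currently being filled */

/*
 * Hand a preallocated block to the depot. Returns true when the block
 * was consumed and false when there is no slot left for it. The next
 * slot is touched only if it lies inside the array.
 */
static bool toy_init_slab(void **prealloc)
{
    assert(prealloc && *prealloc);
    assert(cur_index < MAX_SLABS);

    if (slabs[cur_index] == NULL) {
        slabs[cur_index] = *prealloc;
    } else if (cur_index + 1 < MAX_SLABS && slabs[cur_index + 1] == NULL) {
        slabs[cur_index + 1] = *prealloc;
    } else {
        return false;          /* last slab already set up: do nothing */
    }
    *prealloc = NULL;
    return true;
}
```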

    Link: http://lkml.kernel.org/r/20200218102950.260263-1-glider@google.com
    Fixes: cd11016e5f521 ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB")
    Signed-off-by: Alexander Potapenko
    Reported-by: Walter Wu
    Cc: Dmitry Vyukov
    Cc: Matthias Brugger
    Cc: Thomas Gleixner
    Cc: Josh Poimboeuf
    Cc: Kate Stewart
    Cc: Greg Kroah-Hartman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Alexander Potapenko
     

24 Feb, 2020

3 commits

  • [ Upstream commit 4e456fee215677584cafa7f67298a76917e89c64 ]

    Clang warns:

    ../lib/scatterlist.c:314:5: warning: misleading indentation; statement
    is not part of the previous 'if' [-Wmisleading-indentation]
    return -ENOMEM;
    ^
    ../lib/scatterlist.c:311:4: note: previous statement is here
    if (prv)
    ^
    1 warning generated.

    This warning occurs because there is a space before the tab on this
    line. Remove it so that the indentation is consistent with the Linux
    kernel coding style and clang no longer warns.

    Link: http://lkml.kernel.org/r/20191218033606.11942-1-natechancellor@gmail.com
    Link: https://github.com/ClangBuiltLinux/linux/issues/830
    Fixes: edce6820a9fd ("scatterlist: prevent invalid free when alloc fails")
    Signed-off-by: Nathan Chancellor
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Nathan Chancellor
     
  • [ Upstream commit 35fd7a637c42bb54ba4608f4d40ae6e55fc88781 ]

    The counters obj_pool_free and obj_nr_tofree, and the flag obj_freeing, are
    read locklessly outside the pool_lock critical sections. If read with plain
    accesses, this would result in data races.

    This is addressed as follows:

    * reads outside critical sections become READ_ONCE()s (pairing with
    WRITE_ONCE()s added);

    * writes become WRITE_ONCE()s (pairing with READ_ONCE()s added); since
    writes happen inside critical sections, only the write and not the read
    of RMWs needs to be atomic, thus WRITE_ONCE(var, var +/- X) is
    sufficient.
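
    The pairing described above can be sketched in userspace as follows.
    The macros are rough approximations of the kernel ones built on GNU C
    __typeof__, and pool_free_count, pool_adjust() and pool_needs_refill()
    are hypothetical names; the lock itself is left out and only assumed in
    the comments.

```c
#include <assert.h>

/* Userspace approximations of the kernel macros (needs GNU C __typeof__). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static int pool_free_count;    /* stand-in for obj_pool_free */

/*
 * Writer side. The caller is assumed to hold the pool lock, so the read
 * half of the read-modify-write may be a plain access; only the store
 * has to be marked for the benefit of lockless readers.
 */
static void pool_adjust(int delta)
{
    assert(pool_free_count + delta >= 0);
    WRITE_ONCE(pool_free_count, pool_free_count + delta);
}

/* Reader side, called without the lock: the load is marked. */
static int pool_needs_refill(int min_level)
{
    return READ_ONCE(pool_free_count) < min_level;
}
```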

    The data races were reported by KCSAN:

    BUG: KCSAN: data-race in __free_object / fill_pool

    write to 0xffffffff8beb04f8 of 4 bytes by interrupt on cpu 1:
    __free_object+0x1ee/0x8e0 lib/debugobjects.c:404
    __debug_check_no_obj_freed+0x199/0x330 lib/debugobjects.c:969
    debug_check_no_obj_freed+0x3c/0x44 lib/debugobjects.c:994
    slab_free_hook mm/slub.c:1422 [inline]

    read to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 2:
    fill_pool+0x3d/0x520 lib/debugobjects.c:135
    __debug_object_init+0x3c/0x810 lib/debugobjects.c:536
    debug_object_init lib/debugobjects.c:591 [inline]
    debug_object_activate+0x228/0x320 lib/debugobjects.c:677
    debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]

    BUG: KCSAN: data-race in __debug_object_init / fill_pool

    read to 0xffffffff8beb04f8 of 4 bytes by task 10 on cpu 6:
    fill_pool+0x3d/0x520 lib/debugobjects.c:135
    __debug_object_init+0x3c/0x810 lib/debugobjects.c:536
    debug_object_init_on_stack+0x39/0x50 lib/debugobjects.c:606
    init_timer_on_stack_key kernel/time/timer.c:742 [inline]

    write to 0xffffffff8beb04f8 of 4 bytes by task 1 on cpu 3:
    alloc_object lib/debugobjects.c:258 [inline]
    __debug_object_init+0x717/0x810 lib/debugobjects.c:544
    debug_object_init lib/debugobjects.c:591 [inline]
    debug_object_activate+0x228/0x320 lib/debugobjects.c:677
    debug_rcu_head_queue kernel/rcu/rcu.h:176 [inline]

    BUG: KCSAN: data-race in free_obj_work / free_object

    read to 0xffffffff9140c190 of 4 bytes by task 10 on cpu 6:
    free_object+0x4b/0xd0 lib/debugobjects.c:426
    debug_object_free+0x190/0x210 lib/debugobjects.c:824
    destroy_timer_on_stack kernel/time/timer.c:749 [inline]

    write to 0xffffffff9140c190 of 4 bytes by task 93 on cpu 1:
    free_obj_work+0x24f/0x480 lib/debugobjects.c:313
    process_one_work+0x454/0x8d0 kernel/workqueue.c:2264
    worker_thread+0x9a/0x780 kernel/workqueue.c:2410

    Reported-by: Qian Cai
    Signed-off-by: Marco Elver
    Signed-off-by: Thomas Gleixner
    Link: https://lore.kernel.org/r/20200116185529.11026-1-elver@google.com
    Signed-off-by: Sasha Levin

    Marco Elver
     
  • [ Upstream commit 5e5ac01c2b8802921fee680518a986011cb59820 ]

    The compilation warning is a redefinition, shown as follows:

    In file included from tables.c:2:
    ../../../include/linux/export.h:180: warning: "EXPORT_SYMBOL" redefined
    #define EXPORT_SYMBOL(sym) __EXPORT_SYMBOL(sym, "")

    In file included from tables.c:1:
    ../../../include/linux/raid/pq.h:61: note: this is the location of the previous definition
    #define EXPORT_SYMBOL(sym)

    Fixes: 69a94abb82ee ("export.h, genksyms: do not make genksyms calculate CRC of trimmed symbols")
    Signed-off-by: Zhengyuan Liu
    Signed-off-by: Song Liu
    Signed-off-by: Sasha Levin

    Zhengyuan Liu
     

11 Feb, 2020

1 commit

  • commit 3e21d9a501bf99aee2e5835d7f34d8c823f115b5 upstream.

    In case memory resources for _ptr2_ were allocated, release them before
    return.

    Notice that in case _ptr1_ happens to be NULL, krealloc() behaves
    exactly like kmalloc().
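
    The same ownership rule applies to realloc() in userspace, which makes
    for a compact illustration. This is a sketch with a hypothetical
    function name (toy_grow), not the KASAN test code: when the first
    allocation fails, realloc(NULL, n) acts like malloc(n), so the second
    pointer may still be valid and has to be released on the error path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate size1 bytes, grow to size2, and clean up on every path. */
static int toy_grow(size_t size1, size_t size2)
{
    assert(size1 > 0 && size2 >= size1);

    char *ptr1 = malloc(size1);
    int had_ptr1 = (ptr1 != NULL);  /* remember before realloc consumes it */
    char *ptr2 = realloc(ptr1, size2);

    if (!had_ptr1 || !ptr2) {
        /*
         * A valid ptr2 owns the memory (even if ptr1 was NULL);
         * otherwise ptr1, which may be NULL, is still ours to free.
         */
        free(ptr2 ? ptr2 : ptr1);
        return -1;
    }

    memset(ptr2, 0, size2);
    free(ptr2);
    return 0;
}
```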

    Addresses-Coverity-ID: 1490594 ("Resource leak")
    Link: http://lkml.kernel.org/r/20200123160115.GA4202@embeddedor
    Fixes: 3f15801cdc23 ("lib: add kasan test module")
    Signed-off-by: Gustavo A. R. Silva
    Reviewed-by: Dmitry Vyukov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Gustavo A. R. Silva
     

06 Feb, 2020

1 commit

  • [ Upstream commit 82a22311b7a68a78709699dc8c098953b70e4fd2 ]

    If we were unlucky enough to call xas_pause() when the index was at
    ULONG_MAX (or a multi-slot entry which ends at ULONG_MAX), we would
    wrap the index back around to 0 and restart the iteration from the
    beginning. Use the XAS_BOUNDS state to indicate that we should just
    stop the iteration.

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Sasha Levin

    Matthew Wilcox (Oracle)
     

29 Jan, 2020

4 commits

  • commit ab10ae1c3bef56c29bac61e1201c752221b87b41 upstream.

    The range passed to user_access_begin() by strncpy_from_user() and
    strnlen_user() starts at 'src' and goes up to the limit of userspace
    although reads will be limited by the 'count' param.

    On 32-bit powerpc (book3s/32), access has to be granted for each
    256 MB segment, and the cost increases with the number of segments to
    unlock.

    Limit the range with 'count' param.
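
    The clamping step can be sketched as a small pure function. The name
    toy_access_len() and its parameters are hypothetical; the sketch only
    shows the arithmetic of limiting the window to 'count' bytes and does
    not call any user-access primitive.

```c
#include <assert.h>

/*
 * Length of the window to open for a read starting at src_addr: the
 * distance to the end of userspace, but never more than count bytes.
 */
static unsigned long toy_access_len(unsigned long src_addr,
                                    unsigned long max_addr,
                                    unsigned long count)
{
    assert(src_addr < max_addr);

    unsigned long max = max_addr - src_addr;

    if (max > count)
        max = count;    /* do not unlock more than will be read */
    return max;
}
```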

    Fixes: 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'")
    Signed-off-by: Christophe Leroy
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     
  • commit c44aa5e8ab58b5f4cf473970ec784c3333496a2e upstream.

    If you call xas_find() with the initial index > max, it should have
    returned NULL but was returning the entry at index.

    Signed-off-by: Matthew Wilcox (Oracle)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox (Oracle)
     
  • commit 19c30f4dd0923ef191f35c652ee4058e91e89056 upstream.

    If the entry is of an order which is a multiple of XA_CHUNK_SIZE,
    the current detection of sibling entries does not work. Factor out
    an xas_sibling() function to make xa_find_after() a little more
    understandable, and write a new implementation that doesn't suffer from
    the same bug.

    Signed-off-by: Matthew Wilcox (Oracle)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox (Oracle)
     
  • commit 430f24f94c8a174d411a550d7b5529301922e67a upstream.

    If there is an entry at ULONG_MAX, xa_for_each() will overflow the
    'index + 1' in xa_find_after() and wrap around to 0. Catch this case
    and terminate the loop by returning NULL.
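
    The wrap-around is easiest to see on a toy container. In this sketch a
    sorted array stands in for the XArray and all toy_* names are
    hypothetical; the point is only the explicit ULONG_MAX check before
    computing 'after + 1'.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct toy_entry {
    unsigned long index;
    int value;
};

/* First entry with index >= start in a table sorted by index, or NULL. */
static const struct toy_entry *toy_find(const struct toy_entry *tab,
                                        size_t n, unsigned long start)
{
    assert(tab || n == 0);
    for (size_t i = 0; i < n; i++)
        if (tab[i].index >= start)
            return &tab[i];
    return NULL;
}

/* First entry with index > after, or NULL. */
static const struct toy_entry *toy_find_after(const struct toy_entry *tab,
                                              size_t n, unsigned long after)
{
    if (after == ULONG_MAX)
        return NULL;    /* after + 1 would wrap around to 0 */
    return toy_find(tab, n, after + 1);
}

/* Visit every entry once, in the style of a for-each loop. */
static size_t toy_count(const struct toy_entry *tab, size_t n)
{
    size_t visited = 0;

    for (const struct toy_entry *e = toy_find(tab, n, 0); e;
         e = toy_find_after(tab, n, e->index))
        visited++;
    return visited;
}
```

    Without the check, a table containing an entry at ULONG_MAX would send
    the loop back to index 0 and it would never terminate.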

    Signed-off-by: Matthew Wilcox (Oracle)
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox (Oracle)
     

12 Jan, 2020

1 commit

  • [ Upstream commit df034c93f15ee71df231ff9fe311d27ff08a2a52 ]

    Under heavy loads where the kyber I/O scheduler hits the token limits for
    its scheduling domains, kyber can become stuck. When active requests
    complete, kyber may not be woken up leaving the I/O requests in kyber
    stuck.

    This stuck state is due to a race condition with kyber and the sbitmap
    functions it uses to run a callback when enough requests have completed.
    The running of a sbt_wait callback can race with the attempt to insert the
    sbt_wait. Since sbitmap_del_wait_queue removes the sbt_wait from the list
    first then sets the sbq field to NULL, kyber can see the item as not on a
    list but the call to sbitmap_add_wait_queue will see sbq as non-NULL. This
    results in the sbt_wait being inserted onto the wait list but ws_active
    doesn't get incremented. So the sbitmap queue does not know there is a
    waiter on a wait list.

    Since sbitmap doesn't think there is a waiter, kyber may never be
    informed that there are domain tokens available and the I/O never advances.
    With the sbt_wait on a wait list, kyber believes it has an active waiter
    so cannot insert a new waiter when reaching the domain's full state.

    This race can be fixed by only adding the sbt_wait to the queue if the
    sbq field is NULL. If sbq is not NULL, there is already an action active
    which will trigger the re-running of kyber. Let it run and add the
    sbt_wait to the wait list if still needing to wait.
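
    The guard described in the last paragraph can be modelled in a few
    lines. This is a single-threaded sketch with hypothetical types
    (struct toy_queue, struct toy_wait) standing in for the sbitmap queue
    and the sbt_wait; it shows only the invariant that list membership and
    the ws_active count change together.

```c
#include <assert.h>
#include <stddef.h>

struct toy_queue {
    int ws_active;          /* number of registered waiters */
};

struct toy_wait {
    struct toy_queue *sbq;  /* non-NULL while registered */
    int on_list;            /* stand-in for wait-list membership */
};

/* Register the waiter only if it is not already registered. */
static void toy_add_wait(struct toy_queue *q, struct toy_wait *w)
{
    assert(q && w);
    if (w->sbq == NULL) {
        w->sbq = q;
        q->ws_active++;
        w->on_list = 1;
    }
}

/* Remove from the list first, then drop the registration. */
static void toy_del_wait(struct toy_wait *w)
{
    assert(w);
    w->on_list = 0;
    if (w->sbq) {
        w->sbq->ws_active--;
        w->sbq = NULL;
    }
}
```

    Keying the add on the sbq field rather than on list membership means a
    second add can never put the waiter on the list without also counting it.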

    Reviewed-by: Omar Sandoval
    Signed-off-by: David Jeffery
    Reported-by: John Pittman
    Tested-by: John Pittman
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    David Jeffery