05 Aug, 2019

11 commits

  • blk-mq may schedule a queue's complete function to run on a remote CPU via
    IPI, but it provides no way to synchronize with the request's complete
    fn. The current queue freeze interface can't provide the synchronization
    because aborted requests stay in the blk-mq queues during EH.

    In some drivers' EH (such as NVMe), a hardware queue's resources may be freed &
    re-allocated. If a completed request's complete fn finally runs after the
    hardware queue's resources are released, a kernel crash will be triggered.

    Prepare for fixing this kind of issue by introducing
    blk_mq_tagset_wait_completed_request().

    Cc: Max Gurtovoy
    Cc: Sagi Grimberg
    Cc: Keith Busch
    Cc: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • NVMe needs this function to decide whether a request that is about to be
    aborted has already been completed in the normal IO path.

    So introduce it.

    Cc: Max Gurtovoy
    Cc: Sagi Grimberg
    Cc: Keith Busch
    Cc: Christoph Hellwig
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • Linus Torvalds
     
  • Pull tpm fixes from Jarkko Sakkinen:
    "Two bug fixes that did not make into my first pull request"

    * tag 'tpmdd-next-20190805' of git://git.infradead.org/users/jjs/linux-tpmdd:
    tpm: tpm_ibm_vtpm: Fix unallocated banks
    tpm: Fix null pointer dereference on chip register error path

    Linus Torvalds
     
  • Pull MTD fixes from Miquel Raynal:
    "NAND:

    - Fix Micron driver as some chips enable internal ECC correction
    during their discovery while they advertise that they do not have any.

    Hyperbus:

    - Restrict the build to only ARM64 SoCs (and compile testing) which
    is what should have been done since the beginning.

    - Fix Kconfig issue by selecting something instead of implying it"

    * tag 'mtd/fixes-for-5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
    mtd: hyperbus: Add hardware dependency to AM654 driver
    mtd: hyperbus: Kconfig: Fix HBMC_AM654 dependencies
    mtd: rawnand: micron: handle on-die "ECC-off" devices correctly

    Linus Torvalds
     
  • The nr_allocated_banks and allocated banks are initialized as part of
    tpm_chip_register. Currently, this is done as part of auto startup
    function. However, some drivers, like the ibm vtpm driver, do not run
    auto startup during initialization. This results in uninitialized memory
    issue and causes a kernel panic during boot.

    This patch moves the pcr allocation outside the auto startup function
    into tpm_chip_register. This ensures that allocated banks are initialized
    in any case.

    Fixes: 879b589210a9 ("tpm: retrieve digest size of unknown algorithms with PCR read")
    Reported-by: Michal Suchanek
    Signed-off-by: Nayna Jain
    Reviewed-by: Mimi Zohar
    Tested-by: Sachin Sant
    Tested-by: Michal Suchánek
    Reviewed-by: Jarkko Sakkinen
    Signed-off-by: Jarkko Sakkinen

    Nayna Jain
     
  • If clk_enable is not defined and chip initialization
    is canceled, the code hits a NULL pointer dereference.

    Easily reproducible with vTPM init fail:
    swtpm chardev --tpmstate dir=nonexistent_dir --tpm2 --vtpm-proxy

    BUG: kernel NULL pointer dereference, address: 00000000
    ...
    Call Trace:
    tpm_chip_start+0x9d/0xa0 [tpm]
    tpm_chip_register+0x10/0x1a0 [tpm]
    vtpm_proxy_work+0x11/0x30 [tpm_vtpm_proxy]
    process_one_work+0x214/0x5a0
    worker_thread+0x134/0x3e0
    ? process_one_work+0x5a0/0x5a0
    kthread+0xd4/0x100
    ? process_one_work+0x5a0/0x5a0
    ? kthread_park+0x90/0x90
    ret_from_fork+0x19/0x24

    Fixes: 719b7d81f204 ("tpm: introduce tpm_chip_start() and tpm_chip_stop()")
    Cc: stable@vger.kernel.org # v5.1+
    Signed-off-by: Milan Broz
    Reviewed-by: Jarkko Sakkinen
    Signed-off-by: Jarkko Sakkinen

    Milan Broz
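
The failure mode described above is an optional callback being invoked on an error path without a check. A minimal sketch of one way to guard such a call follows. All type and function names here (chip_sketch, chip_ops_sketch, clk_enable_guarded, chip_start_sketch, failing_locality) are hypothetical stand-ins for illustration, not the kernel's TPM types, and the exact shape of the real fix is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct chip_sketch;

/* Hypothetical ops table: clk_enable is optional and may be NULL. */
struct chip_ops_sketch {
	void (*clk_enable)(struct chip_sketch *chip, bool on);
	int (*request_locality)(struct chip_sketch *chip);
};

struct chip_sketch {
	const struct chip_ops_sketch *ops;
};

/* Hypothetical start step that always fails, to exercise the error path. */
int failing_locality(struct chip_sketch *chip)
{
	(void)chip;
	return -1;
}

/* Invoke the optional hook only when the driver provides one. */
void clk_enable_guarded(struct chip_sketch *chip, bool on)
{
	assert(chip && chip->ops);
	if (chip->ops->clk_enable)
		chip->ops->clk_enable(chip, on);
}

/* Start the chip; on failure, undo the clock enable via the guarded helper. */
int chip_start_sketch(struct chip_sketch *chip)
{
	int ret;

	clk_enable_guarded(chip, true);
	ret = chip->ops->request_locality(chip);
	if (ret < 0)
		clk_enable_guarded(chip, false); /* safe when clk_enable is NULL */
	return ret;
}
```

With an ops table whose clk_enable is NULL and whose start step fails, chip_start_sketch() returns the error instead of jumping through a NULL pointer.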
     
  • Pull powerpc fixes from Michael Ellerman:
    "Some more powerpc fixes for 5.3:

    - Wire up the new clone3 syscall.

    - A fix for the PAPR SCM nvdimm driver, to fix a crash when firmware
    gives us a device that's attached to a non-online NUMA node.

    - A fix for a boot failure on 32-bit with KASAN enabled.

    - Three fixes for implicit fall through warnings, some of which are
    errors for us due to -Werror.

    Thanks to: Aneesh Kumar K.V, Christophe Leroy, Kees Cook, Santosh
    Sivaraj, Stephen Rothwell"

    * tag 'powerpc-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc/kasan: fix early boot failure on PPC32
    drivers/macintosh/smu.c: Mark expected switch fall-through
    powerpc/spe: Mark expected switch fall-throughs
    powerpc/nvdimm: Pick nearby online node if the device node is not online
    powerpc/kvm: Fall through switch case explicitly
    powerpc: Wire up clone3 syscall

    Linus Torvalds
     
  • At the end of the v5.3 upstream kernel development cycle, Simon will be
    stepping down from his role as Renesas SoC maintainer. Starting with
    the v5.4 development cycle, Geert is taking over this role.

    Add Geert as a co-maintainer, and add his git repository and branch.

    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Niklas Söderlund
    Acked-by: Simon Horman
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     
  • …/masahiroy/linux-kbuild

    Pull Kbuild fixes from Masahiro Yamada:

    - detect missing "WITH Linux-syscall-note" for uapi headers

    - fix needless rebuild when using Clang

    - fix false-positive cc-option in Kconfig when using Clang

    - avoid including corrupted .*.cmd files in the modpost stage

    - fix warning of 'make vmlinux'

    - fix {m,n,x,g}config to not generate the broken .config on the second
    save operation.

    - some trivial Makefile fixes

    * tag 'kbuild-fixes-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    kconfig: Clear "written" flag to avoid data loss
    kbuild: Check for unknown options with cc-option usage in Kconfig and clang
    lib/raid6: fix unnecessary rebuild of vpermxor*.c
    kbuild: modpost: do not parse unnecessary rules for vmlinux modpost
    kbuild: modpost: remove unnecessary dependency for __modpost
    kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules
    kbuild: modpost: include .*.cmd files only when targets exist
    kbuild: initialize CLANG_FLAGS correctly in the top Makefile
    kbuild: detect missing "WITH Linux-syscall-note" for uapi headers

    Linus Torvalds
     
  • Pull SafeSetID maintainer update from Micah Morton:
    "Add entry in MAINTAINERS file for SafeSetID LSM"

    * tag 'safesetid-maintainers-correction-5.3-rc2' of git://github.com/micah-morton/linux:
    Add entry in MAINTAINERS file for SafeSetID LSM

    Linus Torvalds
     

04 Aug, 2019

8 commits

  • Prior to this commit, starting nconfig, xconfig or gconfig, and saving
    the .config file more than once caused data loss, where a .config file
    that contained only comments would be written to disk starting from the
    second save operation.

    This bug manifests itself because the SYMBOL_WRITTEN flag is never
    cleared after the first call to conf_write, and subsequent calls to
    conf_write then skip all of the configuration symbols due to the
    SYMBOL_WRITTEN flag being set.

    This commit resolves this issue by clearing the SYMBOL_WRITTEN flag
    from all symbols before conf_write returns.

    Fixes: 8e2442a5f86e ("kconfig: fix missing choice values in auto.conf")
    Cc: linux-stable # 4.19+
    Signed-off-by: M. Vefa Bicakci
    Signed-off-by: Masahiro Yamada

    M. Vefa Bicakci
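
The effect of a "written" flag that is never cleared can be reproduced in a few lines. This is a minimal sketch only: struct symbol, the SYMBOL_WRITTEN value, and conf_write_sketch() are simplified stand-ins invented for illustration and do not mirror the real kconfig data structures.

```c
#include <assert.h>
#include <stddef.h>

#define SYMBOL_WRITTEN 0x1u /* hypothetical flag value for this sketch */

struct symbol {
	const char *name;
	unsigned int flags;
};

/*
 * Emit every symbol not yet marked SYMBOL_WRITTEN and return how many
 * were emitted. With clear_after == 0 the flag survives the call, which
 * models the bug: the next save finds every symbol already marked.
 */
int conf_write_sketch(struct symbol *syms, size_t n, int clear_after)
{
	int written = 0;
	size_t i;

	assert(syms || n == 0);
	for (i = 0; i < n; i++) {
		if (syms[i].flags & SYMBOL_WRITTEN)
			continue; /* already emitted, skip */
		syms[i].flags |= SYMBOL_WRITTEN;
		written++;
	}
	if (clear_after)
		for (i = 0; i < n; i++)
			syms[i].flags &= ~SYMBOL_WRITTEN;
	return written;
}
```

Calling conf_write_sketch() twice on the same array with clear_after set to 0 emits nothing the second time; with clear_after set to 1 both calls emit every symbol.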
     
  • Pull Xtensa fix from Max Filippov:
    "Fix build for xtensa cores with coprocessors that was broken by
    entry/return abstraction patch"

    * tag 'xtensa-20190803' of git://github.com/jcmvbkbc/linux-xtensa:
    xtensa: fix build for cores with coprocessors

    Linus Torvalds
     
  • Pull i2c fixes from Wolfram Sang:
    "A set of driver fixes for the I2C subsystem"

    * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    i2c: s3c2410: Mark expected switch fall-through
    i2c: at91: fix clk_offset for sama5d2
    i2c: at91: disable TXRDY interrupt after sending data
    i2c: iproc: Fix i2c master read more than 63 bytes
    eeprom: at24: make spd world-readable again

    Linus Torvalds
     
  • Pull perf tooling fixes from Thomas Gleixner:
    "A set of updates for perf tools and documentation:

    perf header:
    - Prevent a division by zero
    - Deal with an uninitialized warning properly

    libbpf:
    - Fix the missing __WORDSIZE definition for musl & al

    UAPI headers:
    - Synchronize kernel headers

    Documentation:
    - Fix the memory units for perf.data size"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    libbpf: fix missing __WORDSIZE definition
    perf tools: Fix perf.data documentation units for memory size
    perf header: Fix use of unitialized value warning
    perf header: Fix divide by zero error if f_header.attr_size==0
    tools headers UAPI: Sync if_link.h with the kernel
    tools headers UAPI: Sync sched.h with the kernel
    tools headers UAPI: Sync usbdevice_fs.h with the kernels to get new ioctl
    tools perf beauty: Fix usbdevfs_ioctl table generator to handle _IOC()
    tools headers UAPI: Update tools's copy of drm.h headers
    tools headers UAPI: Update tools's copy of mman.h headers
    tools headers UAPI: Update tools's copy of kvm.h headers
    tools include UAPI: Sync x86's syscalls_64.tbl and generic unistd.h to pick up clone3 and pidfd_open

    Linus Torvalds
     
  • Pull vdso timer fixes from Thomas Gleixner:
    "A series of commits to deal with the regression caused by the generic
    VDSO implementation.

    The usage of clock_gettime64() for 32bit compat fallback syscalls
    caused seccomp filters to kill innocent processes because they only
    allow clock_gettime().

    Handle the compat syscalls with clock_gettime() as before, which is
    not a functional problem for the VDSO as the legacy compat application
    interface is not y2038 safe anyway. It's just extra fallback code
    which needs to be implemented on every architecture.

    It's opt in for now so that it does not break the compile of already
    converted architectures in linux-next. Once these are fixed, the
    #ifdeffery goes away.

    So much for trying to be smart and reuse code..."

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    arm64: compat: vdso: Use legacy syscalls as fallback
    x86/vdso/32: Use 32bit syscall fallback
    lib/vdso/32: Provide legacy syscall fallbacks
    lib/vdso: Move fallback invocation to the callers
    lib/vdso/32: Remove inconsistent NULL pointer checks

    Linus Torvalds
     
  • Pull irq fixes from Thomas Gleixner:
    "A small bunch of fixes from the irqchip department:

    - Fix a couple of UAF on error paths (RZA1, GICv3 ITS)

    - Fix iMX GPCv2 trigger setting

    - Add missing of_node_put() on error path in MBIGEN

    - Add another bunch of /* fall-through */ to silence warnings"

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    irqchip/renesas-rza1: Fix an use-after-free in rza1_irqc_probe()
    irqchip/irq-imx-gpcv2: Forward irq type to parent
    irqchip/irq-mbigen: Add of_node_put() before return
    irqchip/gic-v3-its: Free unused vpt_page when alloc vpe table fail
    irqchip/gic-v3: Mark expected switch fall-through

    Linus Torvalds
     
  • Pull xfs fixes from Darrick Wong:

    - Avoid leaking kernel stack contents to userspace

    - Fix a potential null pointer dereference in the dabtree scrub code

    * tag 'xfs-5.3-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
    xfs: Fix possible null-pointer dereferences in xchk_da_btree_block_check_sibling()
    xfs: fix stack contents leakage in the v1 inumber ioctls

    Linus Torvalds
     
  • Merge misc fixes from Andrew Morton:
    "17 fixes"

    * emailed patches from Andrew Morton:
    drivers/acpi/scan.c: document why we don't need the device_hotplug_lock
    memremap: move from kernel/ to mm/
    lib/test_meminit.c: use GFP_ATOMIC in RCU critical section
    asm-generic: fix -Wtype-limits compiler warnings
    cgroup: kselftest: relax fs_spec checks
    mm/memory_hotplug.c: remove unneeded return for void function
    mm/migrate.c: initialize pud_entry in migrate_vma()
    coredump: split pipe command whitespace before expanding template
    page flags: prioritize kasan bits over last-cpuid
    ubsan: build ubsan.c more conservatively
    kasan: remove clang version check for KASAN_STACK
    mm: compaction: avoid 100% CPU usage during compaction when a task is killed
    mm: migrate: fix reference check race between __find_get_block() and migration
    mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker
    ocfs2: remove set but not used variable 'last_hash'
    Revert "kmemleak: allow to coexist with fault injection"
    kernel/signal.c: fix a kernel-doc markup

    Linus Torvalds
     

03 Aug, 2019

21 commits

  • Pull RISC-V fixes from Paul Walmsley:
    "Three minor RISC-V-related changes for v5.3-rc3:

    - Add build ID to VDSO builds to avoid a double-free in perf when
    libelf isn't used

    - Align the RV64 defconfig to the output of "make savedefconfig" so
    subsequent defconfig patches don't get out of hand

    - Drop a superfluous DT property from the FU540 SoC DT data (since it
    must be already set in board data that includes it)"

    * tag 'riscv/for-v5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
    riscv: defconfig: align RV64 defconfig to the output of "make savedefconfig"
    riscv: dts: fu540-c000: drop "timebase-frequency"
    riscv: Fix perf record without libelf support

    Linus Torvalds
     
  • Let's document why the lock is not needed in acpi_scan_init(); right now
    this is not really obvious.

    [akpm@linux-foundation.org: fix tpyo]
    Link: http://lkml.kernel.org/r/20190731135306.31524-1-david@redhat.com
    Signed-off-by: David Hildenbrand
    Acked-by: Michal Hocko
    Acked-by: Rafael J. Wysocki
    Cc: Oscar Salvador
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • memremap.c implements MM functionality for ZONE_DEVICE, so it really
    should be in the mm/ directory, not the kernel/ one.

    Link: http://lkml.kernel.org/r/20190722094143.18387-1-hch@lst.de
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Anshuman Khandual
    Acked-by: Dan Williams
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • kmalloc() shouldn't sleep while in RCU critical section, therefore use
    GFP_ATOMIC instead of GFP_KERNEL.

    The bug was spotted by the 0day kernel testing robot.

    Link: http://lkml.kernel.org/r/20190725121703.210874-1-glider@google.com
    Fixes: 7e659650cbda ("lib: introduce test_meminit module")
    Signed-off-by: Alexander Potapenko
    Reviewed-by: Andrew Morton
    Reported-by: kernel test robot
    Cc: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     
  • Commit d66acc39c7ce ("bitops: Optimise get_order()") introduced a
    compilation warning because "rx_frag_size" is an "ushort" while
    PAGE_SHIFT here is 16.

    The commit changed the get_order() to be a multi-line macro where
    compilers insist to check all statements in the macro even when
    __builtin_constant_p(rx_frag_size) will return false as "rx_frag_size"
    is a module parameter.

    In file included from ./arch/powerpc/include/asm/page_64.h:107,
    from ./arch/powerpc/include/asm/page.h:242,
    from ./arch/powerpc/include/asm/mmu.h:132,
    from ./arch/powerpc/include/asm/lppaca.h:47,
    from ./arch/powerpc/include/asm/paca.h:17,
    from ./arch/powerpc/include/asm/current.h:13,
    from ./include/linux/thread_info.h:21,
    from ./arch/powerpc/include/asm/processor.h:39,
    from ./include/linux/prefetch.h:15,
    from drivers/net/ethernet/emulex/benet/be_main.c:14:
    drivers/net/ethernet/emulex/benet/be_main.c: In function 'be_rx_cqs_create':
    ./include/asm-generic/getorder.h:54:9: warning: comparison is always
    true due to limited range of data type [-Wtype-limits]
    (((n) < (1UL << PAGE_SHIFT)) ? 0 : \
    ^
    drivers/net/ethernet/emulex/benet/be_main.c:3138:33: note: in expansion
    of macro 'get_order'
    adapter->big_page_size = (1 << get_order(rx_frag_size)) * PAGE_SIZE;
    ^~~~~~~~~

    Fix it by moving all of this multi-line macro into a proper function,
    and killing __get_order() off.

    [akpm@linux-foundation.org: remove __get_order() altogether]
    [cai@lca.pw: v2]
    Link: http://lkml.kernel.org/r/1564000166-31428-1-git-send-email-cai@lca.pw
    Link: http://lkml.kernel.org/r/1563914986-26502-1-git-send-email-cai@lca.pw
    Fixes: d66acc39c7ce ("bitops: Optimise get_order()")
    Signed-off-by: Qian Cai
    Reviewed-by: Nathan Chancellor
    Cc: David S. Miller
    Cc: Arnd Bergmann
    Cc: David Howells
    Cc: Jakub Jelinek
    Cc: Nick Desaulniers
    Cc: Bill Wendling
    Cc: James Y Knight
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qian Cai
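
To illustrate why a real function sidesteps the warning, here is a userspace sketch of an order computation written as a function. The argument is converted to unsigned long at the call, so no comparison is made against the narrower type of the caller's variable. SKETCH_PAGE_SHIFT and get_order_sketch() are names made up for this example, the 64 KiB page size is an assumption taken from the warning text, and the bit-counting loop stands in for whatever helper the kernel actually uses.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PAGE_SHIFT 16 /* assumption: 64 KiB pages, as in the warning above */

/* Smallest order such that (1UL << (order + SKETCH_PAGE_SHIFT)) >= size. */
int get_order_sketch(unsigned long size)
{
	int order = 0;

	if (size == 0)
		return (int)(sizeof(unsigned long) * CHAR_BIT) - SKETCH_PAGE_SHIFT;

	size = (size - 1) >> SKETCH_PAGE_SHIFT;
	while (size) { /* count significant bits of the page count */
		order++;
		size >>= 1;
	}
	return order;
}
```

An unsigned short argument such as a 2048-byte fragment size is promoted at the call site and yields order 0.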
     
  • On my laptop most memcg kselftests were being skipped because it claimed
    cgroup v2 hierarchy wasn't mounted, but this isn't correct. Instead, it
    seems current systemd HEAD mounts it with the name "cgroup2" instead of
    "cgroup":

    % grep cgroup /proc/mounts
    cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0

    I can't think of a reason to need to check fs_spec explicitly
    since it's arbitrary, so we can just rely on fs_vfstype.

    After these changes, `make TARGETS=cgroup kselftest` actually runs the
    cgroup v2 tests in more cases.

    Link: http://lkml.kernel.org/r/20190723210737.GA487@chrisdown.name
    Signed-off-by: Chris Down
    Cc: Johannes Weiner
    Cc: Tejun Heo
    Cc: Roman Gushchin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Down
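
The relaxed check can be sketched as a parser for one /proc/mounts line that looks only at the filesystem type field. is_cgroup2_mount() is a hypothetical helper written for this illustration, not the selftest's actual code; it uses sscanf() on a single line rather than the mount table API so that it is easy to try in isolation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if a /proc/mounts line describes a cgroup v2 mount, judging
 * only by the third field (fs_vfstype). The first field (fs_spec) is
 * parsed but deliberately not compared, since its value is arbitrary.
 * On success the mount point is copied into root.
 */
int is_cgroup2_mount(const char *line, char *root, size_t len)
{
	char spec[256], file[256], vfstype[64];

	assert(line && root && len > 0);
	if (sscanf(line, "%255s %255s %63s", spec, file, vfstype) != 3)
		return 0;
	if (strcmp(vfstype, "cgroup2") != 0)
		return 0;
	snprintf(root, len, "%s", file);
	return 1;
}
```

Both "cgroup2 /sys/fs/cgroup cgroup2 ..." and "cgroup /sys/fs/cgroup cgroup2 ..." are accepted, because only the type field decides.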
     
  • return is unneeded in void function

    Link: http://lkml.kernel.org/r/20190723130814.21826-1-houweitaoo@gmail.com
    Signed-off-by: Weitao Hou
    Reviewed-by: David Hildenbrand
    Reviewed-by: Oscar Salvador
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weitao Hou
     
  • When CONFIG_MIGRATE_VMA_HELPER is enabled, migrate_vma() calls
    migrate_vma_collect() which initializes a struct mm_walk but didn't
    initialize mm_walk.pud_entry. (Found by code inspection) Use a C
    structure initialization to make sure it is set to NULL.

    Link: http://lkml.kernel.org/r/20190719233225.12243-1-rcampbell@nvidia.com
    Fixes: 8763cb45ab967 ("mm/migrate: new memory migration helper for use with device memory")
    Signed-off-by: Ralph Campbell
    Reviewed-by: John Hubbard
    Reviewed-by: Andrew Morton
    Cc: "Jérôme Glisse"
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ralph Campbell
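
The "C structure initialization" mentioned above relies on a language guarantee: members that a designated initializer does not name are zero-initialized, so an unnamed function pointer becomes NULL. A minimal sketch, using a cut-down hypothetical struct walk_sketch rather than the kernel's struct mm_walk:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down walker with three optional callbacks. */
struct walk_sketch {
	int (*pud_entry)(unsigned long addr);
	int (*pmd_entry)(unsigned long addr);
	int (*pte_hole)(unsigned long addr);
	void *private;
};

int collect_pmd(unsigned long addr) { (void)addr; return 0; }
int collect_hole(unsigned long addr) { (void)addr; return 0; }

/*
 * Only three members are named; pud_entry is not, so the language
 * initializes it to a null pointer instead of leaving stack garbage.
 */
struct walk_sketch make_walk(void *private)
{
	struct walk_sketch walk = {
		.pmd_entry = collect_pmd,
		.pte_hole = collect_hole,
		.private = private,
	};
	return walk;
}
```

Assigning the members one by one to an uninitialized local would leave pud_entry indeterminate, which is the situation the commit message describes.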
     
  • Save the offsets of the start of each argument to avoid having to update
    pointers to each argument after every corename krealloc and to avoid
    having to duplicate the memory for the dump command.

    Executable names containing spaces were previously being expanded from
    %e or %E and then split in the middle of the filename. This is
    incorrect behaviour since an argument list can represent arguments with
    spaces.

    The splitting could lead to extra arguments being passed to the core
    dump handler that it might have interpreted as options or ignored
    completely.

    Core dump handlers that are not aware of this Linux kernel issue will be
    using %e or %E without considering that it may be split and so they will
    be vulnerable to processes with spaces in their names breaking their
    argument list. If their internals are otherwise well written, such as
    if they are written in shell but quote arguments, they will work better
    after this change than before. If they are not well written, then there
    is a slight chance of breakage depending on the details of the code but
    they will already be fairly broken by the split filenames.

    Core dump handlers that are aware of this Linux kernel issue will be
    placing %e or %E as the last item in their core_pattern and then
    aggregating all of the remaining arguments into one, separated by
    spaces. Alternatively they will be obtaining the filename via other
    methods. Both of these will be compatible with the new arrangement.

    A side effect from this change is that unknown template types (for
    example %z) result in an empty argument to the dump handler instead of
    the argument being dropped. This is a desired change as:

    It is easier for dump handlers to process empty arguments than dropped
    ones, especially if they are written in shell or don't pass each
    template item with a preceding command-line option in order to
    differentiate between individual template types. Most core_patterns in
    the wild do not use options so they can confuse different template types
    (especially numeric ones) if an earlier one gets dropped in old kernels.
    If the kernel introduces a new template type and a core_pattern uses it,
    the core dump handler might not expect that the argument can be dropped
    in old kernels.

    For example, this can result in security issues when %d is dropped in
    old kernels. This happened with the corekeeper package in Debian and
    resulted in the interface between corekeeper and Linux having to be
    rewritten to use command-line options to differentiate between template
    types.

    The core_pattern for most core dump handlers is written by the handler
    author who would generally not insert unknown template types so this
    change should be compatible with all the core dump handlers that exist.

    Link: http://lkml.kernel.org/r/20190528051142.24939-1-pabs3@bonedaddy.net
    Fixes: 74aadce98605 ("core_pattern: allow passing of arguments to user mode helper when core_pattern is a pipe")
    Signed-off-by: Paul Wise
    Reported-by: Jakub Wilk [https://bugs.debian.org/924398]
    Reported-by: Paul Wise [https://lore.kernel.org/linux-fsdevel/c8b7ecb8508895bf4adb62a748e2ea2c71854597.camel@bonedaddy.net/]
    Suggested-by: Jakub Wilk
    Acked-by: Neil Horman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Wise
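
The ordering change (split on whitespace first, expand templates second) can be sketched in userspace. format_args(), MAX_ARGS and ARG_LEN are hypothetical names for this illustration, and only %e and %% are handled; the kernel's implementation tracks argument offsets in a growing buffer instead, as the message describes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 8
#define ARG_LEN 128

/*
 * Split the pattern on spaces first, then expand %e inside each piece,
 * so a space in exe_name stays inside one argument. An unknown
 * specifier expands to nothing, which can leave an empty argument.
 * Returns the number of arguments stored in argv.
 */
int format_args(const char *pattern, const char *exe_name,
		char argv[MAX_ARGS][ARG_LEN])
{
	int argc = 0;
	const char *p = pattern;

	assert(pattern && exe_name);
	while (*p && argc < MAX_ARGS) {
		size_t used = 0;
		char *out = argv[argc];

		while (*p == ' ') /* skip separators in the pattern itself */
			p++;
		if (!*p)
			break;
		out[0] = '\0';
		for (; *p && *p != ' '; p++) {
			if (*p == '%' && p[1]) {
				p++;
				if (*p == 'e')
					used += snprintf(out + used, ARG_LEN - used, "%s", exe_name);
				else if (*p == '%')
					used += snprintf(out + used, ARG_LEN - used, "%%");
				/* any other specifier: expands to nothing */
			} else {
				used += snprintf(out + used, ARG_LEN - used, "%c", *p);
			}
			if (used >= ARG_LEN)
				used = ARG_LEN - 1; /* output was truncated */
		}
		argc++;
	}
	return argc;
}
```

For the pattern "/bin/handler %e %z" and an executable named "my prog", this yields three arguments: the handler path, "my prog" as a single argument, and an empty string for the unknown %z.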
     
  • ARM64 randconfig builds regularly run into a build error, especially
    when NUMA_BALANCING and SPARSEMEM are enabled but not SPARSEMEM_VMEMMAP:

    #error "KASAN: not enough bits in page flags for tag"

    The last-cpuid bits are already conditional on the available space, so
    the result of the calculation is a bit random on whether they were
    already left out or not.

    Adding the kasan tag bits before last-cpuid makes it much more likely to
    end up with a successful build here, and should be reliable for
    randconfig at least, as long as that does not randomize NR_CPUS or
    NODES_SHIFT but uses the defaults.

    In order for the modified check to not trigger in the x86 vdso32 code
    where all constants are wrong (building with -m32), enclose all the
    definitions with an #ifdef.

    [arnd@arndb.de: build fix]
    Link: http://lkml.kernel.org/r/CAK8P3a3Mno1SWTcuAOT0Wa9VS15pdU6EfnkxLbDpyS55yO04+g@mail.gmail.com
    Link: http://lkml.kernel.org/r/20190722115520.3743282-1-arnd@arndb.de
    Link: https://lore.kernel.org/lkml/20190618095347.3850490-1-arnd@arndb.de/
    Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Andrey Konovalov
    Reviewed-by: Andrey Ryabinin
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Will Deacon
    Cc: Christoph Lameter
    Cc: Mark Rutland
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     
  • objtool points out several conditions that it does not like, depending
    on the combination with other configuration options and compiler
    variants:

    stack protector:
    lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0xbf: call to __stack_chk_fail() with UACCESS enabled
    lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0xbe: call to __stack_chk_fail() with UACCESS enabled

    stackleak plugin:
    lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x4a: call to stackleak_track_stack() with UACCESS enabled
    lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x4a: call to stackleak_track_stack() with UACCESS enabled

    kasan:
    lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x25: call to memcpy() with UACCESS enabled
    lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x25: call to memcpy() with UACCESS enabled

    The stackleak and kasan options just need to be disabled for this file
    as we do for other files already. For the stack protector, we already
    attempt to disable it, but this fails on clang because the check is
    mixed with the gcc specific -fno-conserve-stack option. According to
    Andrey Ryabinin, that option is not even needed; dropping it here fixes
    the stackprotector issue.

    Link: http://lkml.kernel.org/r/20190722125139.1335385-1-arnd@arndb.de
    Link: https://lore.kernel.org/lkml/20190617123109.667090-1-arnd@arndb.de/t/
    Link: https://lore.kernel.org/lkml/20190722091050.2188664-1-arnd@arndb.de/t/
    Fixes: d08965a27e84 ("x86/uaccess, ubsan: Fix UBSAN vs. SMAP")
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Andrey Ryabinin
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Dmitry Vyukov
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: Matthew Wilcox
    Cc: Ard Biesheuvel
    Cc: Andy Shevchenko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     
  • asan-stack mode still uses dangerously large kernel stacks of tens of
    kilobytes in some drivers, and it does not seem that anyone is working
    on the clang bug.

    Turn it off for all clang versions to prevent users from accidentally
    enabling it once they update to clang-9, and to help automated build
    testing with clang-9.

    Link: https://bugs.llvm.org/show_bug.cgi?id=38809
    Link: http://lkml.kernel.org/r/20190719200347.2596375-1-arnd@arndb.de
    Fixes: 6baec880d7a5 ("kasan: turn off asan-stack for clang-8 and earlier")
    Signed-off-by: Arnd Bergmann
    Acked-by: Nick Desaulniers
    Reviewed-by: Mark Brown
    Reviewed-by: Andrey Ryabinin
    Cc: Qian Cai
    Cc: Andrey Konovalov
    Cc: Vasily Gorbik
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     
  • "howaboutsynergy" reported via kernel bugzilla number 204165 that
    compact_zone_order was consuming 100% CPU during a stress test for
    prolonged periods of time. Specifically the following command, which
    should exit in 10 seconds, was taking an excessive time to finish while
    the CPU was pegged at 100%.

    stress -m 220 --vm-bytes 1000000000 --timeout 10

    Tracing indicated a pattern as follows

    stress-3923 [007] 519.106208: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
    stress-3923 [007] 519.106212: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
    stress-3923 [007] 519.106216: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
    stress-3923 [007] 519.106219: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
    stress-3923 [007] 519.106223: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
    stress-3923 [007] 519.106227: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
    stress-3923 [007] 519.106231: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
    stress-3923 [007] 519.106235: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
    stress-3923 [007] 519.106238: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
    stress-3923 [007] 519.106242: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0

    Note that compaction is entered in rapid succession while scanning and
    isolating nothing. The problem is that when a task that is compacting
    receives a fatal signal, it retries indefinitely instead of exiting,
    and it makes no progress because a fatal signal is pending.

    It's not easy to trigger this condition although enabling zswap helps on
    the basis that the timing is altered. A very small window has to be hit
    for the problem to occur (signal delivered while compacting and
    isolating a PFN for migration that is not aligned to SWAP_CLUSTER_MAX).

    This was reproduced locally (16G single socket system, 8G swap, 30%
    zswap configured, vm-bytes 22000000000, using Colin King's stress-ng
    implementation from github running in a loop until the problem hits).
    Tracing recorded the problem occurring almost 200K times in a short
    window. With this patch, the problem hit 4 times but the task exited
    normally instead of consuming CPU.

    This problem has existed for some time but it was made worse by commit
    cf66f0700c8f ("mm, compaction: do not consider a need to reschedule as
    contention"). Before that commit, if the same condition was hit then
    locks would be quickly contended and compaction would exit that way.

    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204165
    Link: http://lkml.kernel.org/r/20190718085708.GE24383@techsingularity.net
    Fixes: cf66f0700c8f ("mm, compaction: do not consider a need to reschedule as contention")
    Signed-off-by: Mel Gorman
    Reviewed-by: Vlastimil Babka
    Cc: [5.1+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • buffer_migrate_page_norefs() can race with bh users in the following
    way:

    CPU1                                    CPU2
    buffer_migrate_page_norefs()
      buffer_migrate_lock_buffers()
      checks bh refs
      spin_unlock(&mapping->private_lock)
                                            __find_get_block()
                                              spin_lock(&mapping->private_lock)
                                              grab bh ref
                                              spin_unlock(&mapping->private_lock)
      move page                               do bh work

    This can result in various issues like lost updates to buffers (i.e.
    metadata corruption) or use after free issues for the old page.

    This patch closes the race by holding mapping->private_lock while the
    mapping is being moved to a new page. Ordinarily, a reference can be
    taken outside of the private_lock using the per-cpu BH LRU but the
    references are checked and the LRU invalidated if necessary. The
    private_lock is held once the references are known so the buffer lookup
    slow path will spin on the private_lock. Between the page lock and
    private_lock, it should be impossible for other references to be
    acquired and updates to happen during the migration.

    A user had reported data corruption issues on a distribution kernel with
    a similar page migration implementation as mainline. The data
    corruption could not be reproduced with this patch applied. A small
    number of migration-intensive tests were run and no performance problems
    were noted.

    [mgorman@techsingularity.net: Changelog, removed tracing]
    Link: http://lkml.kernel.org/r/20190718090238.GF24383@techsingularity.net
    Fixes: 89cb0888ca14 "mm: migrate: provide buffer_migrate_page_norefs()"
    Signed-off-by: Jan Kara
    Signed-off-by: Mel Gorman
    Cc: [5.0+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Shakeel Butt reported premature OOM on a kernel booted with
    "cgroup_disable=memory", because mem_cgroup_is_root() returns false even
    though memcg is actually NULL. drop_caches is also broken.

    This is because commit aeed1d325d42 ("mm/vmscan.c: generalize
    shrink_slab() calls in shrink_node()") removed the !memcg check before
    !mem_cgroup_is_root(). Surprisingly, the root memcg is allocated even
    when the memory cgroup is disabled by the kernel boot parameter.

    Add a mem_cgroup_disabled() check to make the reclaimer work as expected.
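
    The effect of the added check can be illustrated with a small
    user-space model. Everything prefixed toy_ is a hypothetical stand-in,
    and the model assumes the check guards the choice between a per-memcg
    reclaim path and the global one; the exact placement in mm/vmscan.c is
    not shown here.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    struct toy_memcg { int id; };

    /* The root memcg exists even when the controller is disabled at boot. */
    static struct toy_memcg toy_root_memcg = { 0 };
    static bool toy_memcg_disabled;     /* models cgroup_disable=memory */

    static bool toy_mem_cgroup_disabled(void)
    {
        return toy_memcg_disabled;
    }

    /* A NULL memcg is not the root object, so this returns false for NULL. */
    static bool toy_mem_cgroup_is_root(const struct toy_memcg *memcg)
    {
        return memcg == &toy_root_memcg;
    }

    enum toy_path { TOY_PATH_GLOBAL, TOY_PATH_MEMCG_ONLY };

    /* Without the check: a NULL memcg is mistaken for a non-root cgroup. */
    static enum toy_path toy_pick_path_old(const struct toy_memcg *memcg)
    {
        if (!toy_mem_cgroup_is_root(memcg))
            return TOY_PATH_MEMCG_ONLY;
        return TOY_PATH_GLOBAL;
    }

    /* With the check: a disabled controller always takes the global path. */
    static enum toy_path toy_pick_path_new(const struct toy_memcg *memcg)
    {
        if (!toy_mem_cgroup_disabled() && !toy_mem_cgroup_is_root(memcg))
            return TOY_PATH_MEMCG_ONLY;
        return TOY_PATH_GLOBAL;
    }

    /* Small self-check for the disabled-controller case. */
    static void toy_memcg_demo(void)
    {
        toy_memcg_disabled = true;
        assert(toy_pick_path_old(NULL) == TOY_PATH_MEMCG_ONLY);
        assert(toy_pick_path_new(NULL) == TOY_PATH_GLOBAL);
        toy_memcg_disabled = false;
    }
    ```

    With the controller enabled, both variants behave the same: the root
    memcg takes the global path and any other memcg takes the per-memcg
    path.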

    Link: http://lkml.kernel.org/r/1563385526-20805-1-git-send-email-yang.shi@linux.alibaba.com
    Fixes: aeed1d325d42 ("mm/vmscan.c: generalize shrink_slab() calls in shrink_node()")
    Signed-off-by: Yang Shi
    Reported-by: Shakeel Butt
    Reviewed-by: Shakeel Butt
    Reviewed-by: Kirill Tkhai
    Acked-by: Michal Hocko
    Cc: Jan Hadrava
    Cc: Vladimir Davydov
    Cc: Johannes Weiner
    Cc: Roman Gushchin
    Cc: Hugh Dickins
    Cc: Qian Cai
    Cc: Kirill A. Shutemov
    Cc: [4.19+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Shi
     
  • Fixes gcc '-Wunused-but-set-variable' warning:

    fs/ocfs2/xattr.c: In function ocfs2_xattr_bucket_find:
    fs/ocfs2/xattr.c:3828:6: warning: variable last_hash set but not used [-Wunused-but-set-variable]

    It's never used and can be removed.

    Link: http://lkml.kernel.org/r/20190716132110.34836-1-yuehaibing@huawei.com
    Signed-off-by: YueHaibing
    Acked-by: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Changwei Ge
    Cc: Gang He
    Cc: Jun Piao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    YueHaibing
     
  • When running LTP's oom test with kmemleak enabled, the warning below was
    triggered because the kernel detects that __GFP_NOFAIL &
    ~__GFP_DIRECT_RECLAIM is passed in:

    WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
    Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata
    CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
    RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
    ...
    kmemleak_alloc+0x4e/0xb0
    kmem_cache_alloc+0x2a7/0x3e0
    mempool_alloc_slab+0x2d/0x40
    mempool_alloc+0x118/0x2b0
    bio_alloc_bioset+0x19d/0x350
    get_swap_bio+0x80/0x230
    __swap_writepage+0x5ff/0xb20

    The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak
    has __GFP_NOFAIL set all the time due to d9570ee3bd1d4f2 ("kmemleak:
    allow to coexist with fault injection"). But, it doesn't make any sense
    to have __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same
    time.
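
    The conflicting flag combination can be shown with a toy bitmask model.
    The TOY_GFP_* values below are illustrative only and do not match the
    kernel's real bit assignments, and the toy_* helpers model just the two
    bits discussed here rather than the full mask handling in mempool or
    kmemleak.

    ```c
    #include <assert.h>
    #include <stdbool.h>

    /* Illustrative bit values only; the kernel's real values differ. */
    #define TOY_GFP_DIRECT_RECLAIM  0x1u
    #define TOY_GFP_NOFAIL          0x2u

    /*
     * "Must not fail" only makes sense if the allocator may enter direct
     * reclaim, so NOFAIL without DIRECT_RECLAIM is contradictory.
     */
    static bool toy_gfp_is_contradictory(unsigned int gfp)
    {
        return (gfp & TOY_GFP_NOFAIL) && !(gfp & TOY_GFP_DIRECT_RECLAIM);
    }

    /* Models the mempool path clearing direct reclaim. */
    static unsigned int toy_mempool_mask(unsigned int gfp)
    {
        return gfp & ~TOY_GFP_DIRECT_RECLAIM;
    }

    /* Models kmemleak always forcing NOFAIL (behaviour before the revert). */
    static unsigned int toy_kmemleak_mask_old(unsigned int gfp)
    {
        return gfp | TOY_GFP_NOFAIL;
    }

    /* Models kmemleak after the revert: NOFAIL is no longer forced. */
    static unsigned int toy_kmemleak_mask_reverted(unsigned int gfp)
    {
        return gfp;
    }

    /* Small self-check reproducing the combination from the warning. */
    static void toy_gfp_demo(void)
    {
        unsigned int gfp = toy_mempool_mask(TOY_GFP_DIRECT_RECLAIM);

        assert(toy_gfp_is_contradictory(toy_kmemleak_mask_old(gfp)));
        assert(!toy_gfp_is_contradictory(toy_kmemleak_mask_reverted(gfp)));
    }
    ```

    In this model the revert trades the contradictory mask for an
    allocation that is allowed to fail.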

    According to the discussion on the mailing list, the commit should be
    reverted as a short-term solution. Catalin Marinas will follow up with
    a better long-term solution.

    The failure rate of kmemleak metadata allocation may increase in some
    circumstances, but this is an expected side effect.

    Link: http://lkml.kernel.org/r/1563299431-111710-1-git-send-email-yang.shi@linux.alibaba.com
    Fixes: d9570ee3bd1d4f2 ("kmemleak: allow to coexist with fault injection")
    Signed-off-by: Yang Shi
    Suggested-by: Catalin Marinas
    Acked-by: Michal Hocko
    Cc: Dmitry Vyukov
    Cc: David Rientjes
    Cc: Matthew Wilcox
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Shi
     
  • The kernel-doc parser doesn't handle expressions with %foo*. Instead,
    when an asterisk should be part of a constant, it uses an alternative
    notation: `foo*`.

    Link: http://lkml.kernel.org/r/7f18c2e0b5e39e6b7eb55ddeb043b8b260b49f2d.1563361575.git.mchehab+samsung@kernel.org
    Signed-off-by: Mauro Carvalho Chehab
    Cc: Deepa Dinamani
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mauro Carvalho Chehab
     
  • Pull more drm fixes from Daniel Vetter:
    "Dave sends his pull, everyone realizes they've been asleep at the
    wheel and hits send on their own pulls :-/

    Normally I'd just ignore these all because w/e for me and Dave. But
    this time around the latecomers also included drm-intel-fixes, which
    failed to send out a -fixes pull thus far for this release (screwed up
    vacation coverage, despite that 2/3 maintainers were around ... they
    all look appropriately guilty), and that really is overdue to get
    landed.

    And since I had to do a pull request anyway I pulled the other two
    late ones too.

    intel fixes (didn't have any ever since the main merge window pull):
    - gvt fixes (2 cc: stable)
    - fix gpu reset vs mm-shrinker vs wakeup fun (needed a few patches)
    - two gem locking fixes (one cc: stable)
    - pile of misc fixes all over with minor impact, 6 cc: stable, others
    from this window

    exynos:
    - misc minor fixes

    misc:
    - some build/Kconfig fixes
    - regression fix for vm scalability perf test which seems to mostly
    exercise dmesg/console logging ...
    - the vgem cache flush fix for arm64 broke the world on x86, so
    that's reverted again

    * tag 'drm-fixes-2019-08-02-1' of git://anongit.freedesktop.org/drm/drm: (42 commits)
    Revert "drm/vgem: fix cache synchronization on arm/arm64"
    drm/exynos: fix missing decrement of retry counter
    drm/exynos: add CONFIG_MMU dependency
    drm/exynos: remove redundant assignment to pointer 'node'
    drm/exynos: using dev_get_drvdata directly
    drm/bochs: Use shadow buffer for bochs framebuffer console
    drm/fb-helper: Instanciate shadow FB if configured in device's mode_config
    drm/fb-helper: Map DRM client buffer only when required
    drm/client: Support unmapping of DRM client buffers
    drm/i915: Only recover active engines
    drm/i915: Add a wakeref getter for iff the wakeref is already active
    drm/i915: Lift intel_engines_resume() to callers
    drm/vgem: fix cache synchronization on arm/arm64
    drm/i810: Use CONFIG_PREEMPTION
    drm/bridge: tc358764: Fix build error
    drm/bridge: lvds-encoder: Fix build error while CONFIG_DRM_KMS_HELPER=m
    drm/i915/gvt: Adding ppgtt to GVT GEM context after shadow pdps settled.
    drm/i915/gvt: grab runtime pm first for forcewake use
    drm/i915/gvt: fix incorrect cache entry for guest page mapping
    drm/i915/gvt: Checking workload's gma earlier
    ...

    Linus Torvalds
     
  • Pull selinux fix from Paul Moore:
    "One more small fix for a potential memory leak in an error path"

    * tag 'selinux-pr-20190801' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
    selinux: fix memory leak in policydb_init()

    Linus Torvalds
     
  • The hbmc-am654 driver is for the TI AM654, which is an ARM64 SoC, so
    don't propose this driver on other architectures unless
    build-testing.

    Fixes: b07079f1642c ("mtd: hyperbus: Add driver for TI's HyperBus memory controller")
    Signed-off-by: Jean Delvare
    Cc: Vignesh Raghavendra
    Cc: Miquel Raynal
    Signed-off-by: Miquel Raynal

    Jean Delvare