07 Oct, 2020

4 commits


21 Sep, 2020

14 commits

  • Linus Torvalds
     
  • Pull syscall tracing fix from Borislav Petkov:
    "Fix the seccomp syscall rewriting so that trace and audit see the
    rewritten syscall number, from Kees Cook"

    * tag 'core_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    core/entry: Report syscall correctly for trace and audit

    Linus Torvalds
     
  • Pull objtool fix from Borislav Petkov:
    "Fix noreturn detection for ignored sibling functions (Josh Poimboeuf)"

    * tag 'objtool_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    objtool: Fix noreturn detection for ignored functions

    Linus Torvalds
     
  • Pull locking fixes from Borislav Petkov:
    "Two fixes from the locking/urgent pile:

    - Fix lockdep's detection of "USED" inversions

    Linus Torvalds
     
  • Pull EFI fix from Borislav Petkov:
    "Ensure that the EFI bootloader control module only probes successfully
    on systems that support the EFI SetVariable runtime service"

    [ Tag and commit from Ard Biesheuvel, forwarded by Borislav ]

    * tag 'efi-urgent-for-v5.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    efi: efibc: check for efivars write capability

    Linus Torvalds
     
  • Pull x86 fixes from Borislav Petkov:

    - A defconfig fix (Daniel Díaz)

    - Disable relocation relaxation for the compressed kernel when not
    built as -pie as in that case kernels built with clang and linked
    with LLD fail to boot due to the linker optimizing some instructions
    in non-PIE form; the gory details in the commit message (Arvind
    Sankar)

    - A fix for the "bad bp value" warning issued by the frame-pointer
    unwinder (Josh Poimboeuf)

    * tag 'x86_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/unwind/fp: Fix FP unwinding in ret_from_fork
    x86/boot/compressed: Disable relocation relaxation
    x86/defconfigs: Explicitly unset CONFIG_64BIT in i386_defconfig

    Linus Torvalds
     
  • Pull libnvdimm fixes from Dan Williams:
    "A handful of fixes to address a string of mistakes in the mechanism
    for device-mapper to determine if its component devices are dax
    capable.

    - Fix an original bug in device-mapper table reference counting when
    interrogating dax capability in the component device. This bug was
    hidden by the following bug.

    - Fix device-mapper to use the proper helper (dax_supported() instead
    of the leaf helper generic_fsdax_supported()) to determine dax
    operation of a stacked block device configuration. The original
    implementation is only valid for one level of dax-capable block
    device stacking. This bug was discovered while fixing the below
    regression.

    - Fix an infinite recursion regression introduced by broken attempts
    to quiet the generic_fsdax_supported() path and make it bail out
    before logging "dax capability not found" errors"

    * tag 'libnvdimm-fixes-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    dax: Fix stack overflow when mounting fsdax pmem device
    dm: Call proper helper to determine dax support
    dm/dax: Fix table reference counts

    Linus Torvalds
     
  • Pull RISC-V fixes from Palmer Dabbelt:

    - A fix for a lockdep issue to avoid an assertion triggering during
    early boot. There shouldn't be any incorrect behavior as the system
    isn't concurrent at the time.

    - The addition of a missing fence when installing early fixmap
    mappings.

    - A correction to the K210 device tree's interrupt map.

    - A fix for M-mode timer handling on the K210.

    * tag 'riscv-for-linus-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
    RISC-V: Resurrect the MMIO timer implementation for M-mode systems
    riscv: Fix Kendryte K210 device tree
    riscv: Add sfence.vma after early page table changes
    RISC-V: Take text_mutex in ftrace_init_nop()

    Linus Torvalds
     
  • Pull USB/Thunderbolt fixes from Greg KH:
    "Here are some small USB and one Thunderbolt driver fixes.

    Nothing major at all, just some fixes for reported issues, and a quirk
    addition:

    - typec fixes

    - UAS disconnect fix

    - usblp race fix

    - ehci-hcd modversions build fix

    - ignore wakeup quirk table addition

    - thunderbolt DROM read fix

    All of these have been in linux-next with no reported issues"

    * tag 'usb-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
    usblp: fix race between disconnect() and read()
    ehci-hcd: Move include to keep CRC stable
    usb: typec: intel_pmc_mux: Handle SCU IPC error conditions
    USB: quirks: Add USB_QUIRK_IGNORE_REMOTE_WAKEUP quirk for BYD zhaoxin notebook
    USB: UAS: fix disconnect by unplugging a hub
    usb: typec: ucsi: Prevent mode overrun
    usb: typec: ucsi: acpi: Increase command completion timeout value
    thunderbolt: Retry DROM read once if parsing fails

    Linus Torvalds
     
  • Pull tty/serial/fbcon fixes from Greg KH:
    "Here are some small tty/serial and one more fbcon fix.

    They include:

    - serial core locking regression fixes

    - new device ids for 8250_pci driver

    - fbcon fix for syzbot found issue

    All have been in linux-next with no reported issues"

    * tag 'tty-5.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
    fbcon: Fix user font detection test at fbcon_resize().
    serial: 8250_pci: Add Realtek 816a and 816b
    serial: core: fix console port-lock regression
    serial: core: fix port-lock initialisation

    Linus Torvalds
     
  • Pull EDAC fixes from Borislav Petkov:
    "Two fixes resulting from CONFIG_DEBUG_TEST_DRIVER_REMOVE=y
    experiments:

    - complete a previous fix to reset a local structure containing
    scanned system data properly so that the driver rescans, as it
    should, on a second load.

    - address a refcount underflow due to not paying attention to the
    driver whitelist on unregister"

    * tag 'edac_urgent_for_v5.9_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
    EDAC/ghes: Check whether the driver is on the safe list correctly
    EDAC/ghes: Clear scanned data on unload

    Linus Torvalds
     
  • Pull input fixes from Dmitry Torokhov:
    "Just a couple of driver quirks"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: trackpoint - add new trackpoint variant IDs
    Input: i8042 - add Entroware Proteus EL07R4 to nomux and reset lists

    Linus Torvalds
     
  • Sedat Dilek pointed out some silly comment typo issues.

    Reported-by: Sedat Dilek
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • …/masahiroy/linux-kbuild

    Pull Kbuild fixes from Masahiro Yamada:
    "Fix qconf warnings and revive help message"

    * tag 'kbuild-fixes-v5.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    kconfig: qconf: revive help message in the info view
    kconfig: qconf: fix incomplete type 'struct gstr' warning
    kconfig: qconf: use delete[] instead of delete to free array (again)

    Linus Torvalds
     

20 Sep, 2020

22 commits

  • When mounting fsdax pmem device, commit 6180bb446ab6 ("dax: fix
    detection of dax support for non-persistent memory block devices")
    introduces the stack overflow [1][2]. Here is the call path for
    mounting ext4 file system:
    ext4_fill_super
      bdev_dax_supported
        __bdev_dax_supported
          dax_supported
            generic_fsdax_supported
              __generic_fsdax_supported
                bdev_dax_supported

    The call path leads to the infinite calling loop, so we cannot
    call bdev_dax_supported() in __generic_fsdax_supported(). The sanity
    checking of the variable 'dax_dev' is moved prior to the two
    bdev_dax_pgoff() checks [3][4].
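
The cycle in the call path above can be illustrated with a small user-space sketch. Everything in it is hypothetical (the `toy_` names, the struct, the depth cap); it only models the shape of the bug and of the fix as the message describes them, not the kernel's actual code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a block device; not a kernel structure. */
struct toy_bdev {
    bool has_dax_dev;  /* models "dax_dev is non-NULL" */
    bool pgoff_ok;     /* models both bdev_dax_pgoff() checks passing */
};

/* Artificial cap so the buggy path terminates in this demo. */
#define TOY_MAX_DEPTH 64

static bool toy_leaf_buggy(const struct toy_bdev *b, int depth, int *deepest);

/* Models the top-level entry point, which dispatches to the leaf helper. */
static bool toy_top_buggy(const struct toy_bdev *b, int depth, int *deepest)
{
    return toy_leaf_buggy(b, depth + 1, deepest);
}

/* Buggy shape: the leaf helper calls back into the top-level entry point. */
static bool toy_leaf_buggy(const struct toy_bdev *b, int depth, int *deepest)
{
    if (depth > *deepest)
        *deepest = depth;
    if (depth >= TOY_MAX_DEPTH)
        return false;  /* without the cap, recursion continues until the stack overflows */
    return toy_top_buggy(b, depth + 1, deepest);
}

/* Fixed shape: test dax_dev first, then the offset checks, never calling back up. */
static bool toy_leaf_fixed(const struct toy_bdev *b)
{
    if (!b->has_dax_dev)
        return false;
    return b->pgoff_ok;
}
```

In this model the buggy pair only stops because of the artificial cap, while the fixed leaf helper returns after a bounded number of checks.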

    [1] https://lore.kernel.org/linux-nvdimm/1420999447.1004543.1600055488770.JavaMail.zimbra@redhat.com/
    [2] https://lore.kernel.org/linux-nvdimm/alpine.LRH.2.02.2009141131220.30651@file01.intranet.prod.int.rdu2.redhat.com/
    [3] https://lore.kernel.org/linux-nvdimm/CA+RJvhxBHriCuJhm-D8NvJRe3h2MLM+ZMFgjeJjrRPerMRLvdg@mail.gmail.com/
    [4] https://lore.kernel.org/linux-nvdimm/20200903160608.GU878166@iweiny-DESK2.sc.intel.com/

    Fixes: 6180bb446ab6 ("dax: fix detection of dax support for non-persistent memory block devices")
    Reported-by: Yi Zhang
    Reported-by: Mikulas Patocka
    Signed-off-by: Adrian Huang
    Reviewed-by: Jan Kara
    Tested-by: Ritesh Harjani
    Cc: Coly Li
    Cc: Ira Weiny
    Cc: John Pittman
    Link: https://lore.kernel.org/r/20200917111549.6367-1-adrianhuang0701@gmail.com
    Signed-off-by: Dan Williams

    Adrian Huang
     
  • DM was calling generic_fsdax_supported() to determine whether a device
    referenced in the DM table supports DAX. However, this is a helper for
    "leaf" device drivers so that they don't have to duplicate common generic
    checks. High-level code should call the dax_supported() helper, which
    calls into the appropriate helper for the particular device. This problem
    manifested itself as kernel messages:

    dm-3: error: dax access failed (-95)

    when lvm2-testsuite was run in cases where a DM device was stacked on top
    of another DM device.

    Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
    Cc:
    Tested-by: Adrian Huang
    Signed-off-by: Jan Kara
    Acked-by: Mike Snitzer
    Reported-by: kernel test robot
    Link: https://lore.kernel.org/r/160061715195.13131.5503173247632041975.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Jan Kara
     
  • A recent fix to the dm_dax_supported() flow uncovered a latent bug. When
    dm_get_live_table() fails it is still required to drop the
    srcu_read_lock(). Without this change the lvm2 test-suite triggers this
    warning:

    # lvm2-testsuite --only pvmove-abort-all.sh

    WARNING: lock held when returning to user space!
    5.9.0-rc5+ #251 Tainted: G OE
    ------------------------------------------------
    lvm/1318 is leaving the kernel with locks still held!
    1 lock held by lvm/1318:
    #0: ffff9372abb5a340 (&md->io_barrier){....}-{0:0}, at: dm_get_live_table+0x5/0xb0 [dm_mod]

    ...and later on this hang signature:

    INFO: task lvm:1344 blocked for more than 122 seconds.
    Tainted: G OE 5.9.0-rc5+ #251
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    task:lvm state:D stack: 0 pid: 1344 ppid: 1 flags:0x00004000
    Call Trace:
    __schedule+0x45f/0xa80
    ? finish_task_switch+0x249/0x2c0
    ? wait_for_completion+0x86/0x110
    schedule+0x5f/0xd0
    schedule_timeout+0x212/0x2a0
    ? __schedule+0x467/0xa80
    ? wait_for_completion+0x86/0x110
    wait_for_completion+0xb0/0x110
    __synchronize_srcu+0xd1/0x160
    ? __bpf_trace_rcu_utilization+0x10/0x10
    __dm_suspend+0x6d/0x210 [dm_mod]
    dm_suspend+0xf6/0x140 [dm_mod]
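
The locking rule stated at the top of this message (the read lock must be dropped even when no live table is returned) can be sketched in user space as follows. The `toy_` types and the plain counter standing in for the SRCU read-side lock are assumptions made for illustration; this is not the device-mapper code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical placeholders; not the device-mapper structures. */
struct toy_table { int unused; };

struct toy_md {
    int read_locks;          /* models the SRCU read-side nesting count */
    struct toy_table *map;   /* NULL models "no live table" */
};

/* Models dm_get_live_table(): always takes the read lock, may return NULL. */
static struct toy_table *toy_get_live_table(struct toy_md *md)
{
    md->read_locks++;
    return md->map;
}

/* Models dm_put_live_table(): drops the read lock. */
static void toy_put_live_table(struct toy_md *md)
{
    md->read_locks--;
}

/* Buggy shape: the early return skips the unlock. */
static bool toy_supported_buggy(struct toy_md *md)
{
    struct toy_table *map = toy_get_live_table(md);

    if (!map)
        return false;        /* read lock is still held here */
    toy_put_live_table(md);
    return true;
}

/* Fixed shape: both paths go through the unlock. */
static bool toy_supported_fixed(struct toy_md *md)
{
    struct toy_table *map = toy_get_live_table(md);
    bool ret = false;

    if (!map)
        goto out;
    ret = true;
out:
    toy_put_live_table(md);
    return ret;
}
```

With `map` set to NULL, the buggy variant leaves `read_locks` at 1, which corresponds to the "lock held when returning to user space" warning quoted above; the fixed variant leaves it at 0.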

    Fixes: 7bf7eac8d648 ("dax: Arrange for dax_supported check to span multiple devices")
    Cc:
    Cc: Jan Kara
    Cc: Alasdair Kergon
    Cc: Mike Snitzer
    Reported-by: Adrian Huang
    Reviewed-by: Ira Weiny
    Tested-by: Adrian Huang
    Link: https://lore.kernel.org/r/160045867590.25663.7548541079217827340.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Since commit 68fd110b3e7e ("kconfig: qconf: remove redundant help in
    the info view"), the help message is no longer displayed.

    I intended to drop duplicated "Symbol:", "Type:", but precious info
    about help and reverse dependencies was lost too.

    Revive it now.

    "defined at" is contained in menu_get_ext_help(), so I made sure
    to not display it twice.

    Fixes: 68fd110b3e7e ("kconfig: qconf: remove redundant help in the info view")
    Reported-by: Maxim Levitsky
    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • "make HOSTCXX=clang++ xconfig" reports the following:

    HOSTCXX scripts/kconfig/qconf.o
    In file included from scripts/kconfig/qconf.cc:23:
    In file included from scripts/kconfig/lkc.h:15:
    scripts/kconfig/lkc_proto.h:26:13: warning: 'get_relations_str' has C-linkage specified, but returns incomplete type 'struct gstr' which could be incompatible with C [-Wreturn-type-c-linkage]
    struct gstr get_relations_str(struct symbol **sym_arr, struct list_head *head);
    ^

    Currently, get_relations_str() is declared before the struct gstr
    definition.

    Move all declarations of menu.c functions below.

    BTW, some are declared in lkc.h and some in lkc_proto.h, but the
    difference is unclear. I guess some refactoring is needed.

    Signed-off-by: Masahiro Yamada
    Acked-by: Boris Kolpackov

    Masahiro Yamada
     
  • Merge fixes from Andrew Morton:
    "15 patches.

    Subsystems affected by this patch series: mailmap, mm/hotfixes,
    mm/thp, mm/memory-hotplug, misc, kcsan"

    * emailed patches from Andrew Morton:
    kcsan: kconfig: move to menu 'Generic Kernel Debugging Instruments'
    fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype
    stackleak: let stack_erasing_sysctl take a kernel pointer buffer
    ftrace: let ftrace_enable_sysctl take a kernel pointer buffer
    mm/memory_hotplug: drain per-cpu pages again during memory offline
    selftests/vm: fix display of page size in map_hugetlb
    mm/thp: fix __split_huge_pmd_locked() for migration PMD
    kprobes: fix kill kprobe which has been marked as gone
    tmpfs: restore functionality of nr_inodes=0
    mlock: fix unevictable_pgs event counts on THP
    mm: fix check_move_unevictable_pages() on THP
    mm: migration of hugetlbfs page skip memcg
    ksm: reinstate memcg charge on copied pages
    mailmap: add older email addresses for Kees Cook

    Linus Torvalds
     
  • Pull i2c fixes from Wolfram Sang:
    "Another bunch of fixes for I2C.

    Jean's i801 patch is a cleanup on top of Volker's i801 patch, but it
    will make dependency handling much easier if those two go together"

    * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    i2c: mxs: use MXS_DMA_CTRL_WAIT4END instead of DMA_CTRL_ACK
    i2c: mediatek: Send i2c master code at more than 1MHz
    i2c: mediatek: Fix generic definitions for bus frequency
    i2c: core: Call i2c_acpi_install_space_handler() before i2c_acpi_register_devices()
    i2c: i801: Simplify the suspend callback
    i2c: i801: Fix resume bug
    i2c: aspeed: Mask IRQ status to relevant bits

    Linus Torvalds
     
  • The K210 doesn't implement rdtime in M-mode, and since that's where Linux runs
    in the NOMMU systems that means we can't use rdtime. The K210 is the only
    system that anyone is currently running NOMMU or M-mode on, so here we're just
    inlining the timer read directly.

    This also adds the CLINT driver as an !MMU dependency, as it's currently the
    only timer driver available for these systems and without it we get a build
    failure for some configurations.

    Tested-by: Damien Le Moal
    Signed-off-by: Palmer Dabbelt

    Palmer Dabbelt
     
  • The Kendryte K210 SoC CLINT is compatible with SiFive clint v0
    (sifive,clint0). Fix the Kendryte K210 device tree clint entry to be
    in line with the SiFive timer definition documented in
    Documentation/devicetree/bindings/timer/sifive,clint.yaml.
    The device tree clint entry is renamed to match the U-Boot device tree
    definition, to improve compatibility with the U-Boot defined device tree.
    To ensure correct initialization, the interrupt-cells attribute is added
    and the interrupts-extended attribute definition is fixed.

    This fixes boot failures with Kendryte K210 SoC boards.

    Note that the clock referenced is kept as K210_CLK_ACLK, which does not
    necessarily match the clint MTIME increment rate. This, however, does not
    seem to cause any problem for now.

    Signed-off-by: Damien Le Moal
    Signed-off-by: Palmer Dabbelt

    Damien Le Moal
     
  • This invalidates the local TLB after modifying the page tables during early
    init, as it's too early to handle spurious faults as we otherwise do.

    Fixes: f2c17aabc917 ("RISC-V: Implement compile-time fixed mappings")
    Reported-by: Syven Wang
    Signed-off-by: Syven Wang
    Signed-off-by: Greentime Hu
    Reviewed-by: Anup Patel
    [Palmer: Cleaned up the commit text]
    Signed-off-by: Palmer Dabbelt

    Greentime Hu
     
  • This moves the KCSAN kconfig items under menu 'Generic Kernel Debugging
    Instruments' where UBSAN resides.

    Signed-off-by: Changbin Du
    Signed-off-by: Andrew Morton
    Tested-by: Randy Dunlap
    Reviewed-by: Randy Dunlap
    Cc: Greg Kroah-Hartman
    Cc: Marco Elver
    Link: https://lkml.kernel.org/r/20200904152224.5570-1-changbin.du@gmail.com
    Signed-off-by: Linus Torvalds

    Changbin Du
     
  • Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
    changed ctl_table.proc_handler to take a kernel pointer. Adjust the
    definition of dirtytime_interval_handler to match its prototype in
    linux/writeback.h which fixes the following sparse error/warning:

    fs/fs-writeback.c:2189:50: warning: incorrect type in argument 3 (different address spaces)
    fs/fs-writeback.c:2189:50: expected void *
    fs/fs-writeback.c:2189:50: got void [noderef] __user *buffer
    fs/fs-writeback.c:2184:5: error: symbol 'dirtytime_interval_handler' redeclared with different type (incompatible argument 3 (different address spaces)):
    fs/fs-writeback.c:2184:5: int extern [addressable] [signed] [toplevel] dirtytime_interval_handler( ... )
    fs/fs-writeback.c: note: in included file:
    ./include/linux/writeback.h:374:5: note: previously declared as:
    ./include/linux/writeback.h:374:5: int extern [addressable] [signed] [toplevel] dirtytime_interval_handler( ... )

    Fixes: 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
    Signed-off-by: Tobias Klauser
    Signed-off-by: Andrew Morton
    Reviewed-by: Jan Kara
    Cc: Christoph Hellwig
    Cc: Al Viro
    Link: https://lkml.kernel.org/r/20200907093140.13434-1-tklauser@distanz.ch
    Signed-off-by: Linus Torvalds

    Tobias Klauser
     
  • Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
    changed ctl_table.proc_handler to take a kernel pointer. Adjust the
    signature of stack_erasing_sysctl to match ctl_table.proc_handler which
    fixes the following sparse warning:

    kernel/stackleak.c:31:50: warning: incorrect type in argument 3 (different address spaces)
    kernel/stackleak.c:31:50: expected void *
    kernel/stackleak.c:31:50: got void [noderef] __user *buffer

    Fixes: 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
    Signed-off-by: Tobias Klauser
    Signed-off-by: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Al Viro
    Link: https://lkml.kernel.org/r/20200907093253.13656-1-tklauser@distanz.ch
    Signed-off-by: Linus Torvalds

    Tobias Klauser
     
  • Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
    changed ctl_table.proc_handler to take a kernel pointer. Adjust the
    signature of ftrace_enable_sysctl to match ctl_table.proc_handler which
    fixes the following sparse warning:

    kernel/trace/ftrace.c:7544:43: warning: incorrect type in argument 3 (different address spaces)
    kernel/trace/ftrace.c:7544:43: expected void *
    kernel/trace/ftrace.c:7544:43: got void [noderef] __user *buffer

    Fixes: 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
    Signed-off-by: Tobias Klauser
    Signed-off-by: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Al Viro
    Link: https://lkml.kernel.org/r/20200907093207.13540-1-tklauser@distanz.ch
    Signed-off-by: Linus Torvalds

    Tobias Klauser
     
  • There is a race during page offline that can lead to an infinite loop:
    a page never ends up on a buddy list and __offline_pages() keeps
    retrying infinitely or until a termination signal is received.

    Thread#1 - a new process:

    load_elf_binary
     begin_new_exec
      exec_mmap
       mmput
        exit_mmap
         tlb_finish_mmu
          tlb_flush_mmu
           release_pages
            free_unref_page_list
             free_unref_page_prepare
              set_pcppage_migratetype(page, migratetype);
              // Set page->index migration type below MIGRATE_PCPTYPES

    Thread#2 - hot-removes memory:

    __offline_pages
     start_isolate_page_range
      set_migratetype_isolate
       set_pageblock_migratetype(page, MIGRATE_ISOLATE);
       // Set migration type to MIGRATE_ISOLATE
       drain_all_pages(zone);
       // drain per-cpu page lists to buddy allocator.

    Thread#1 - continue:

    free_unref_page_commit
     migratetype = get_pcppage_migratetype(page);
     // get old migration type
     list_add(&page->lru, &pcp->lists[migratetype]);
     // add new page to already drained pcp list

    Thread#2:

    Never drains pcp again, and therefore gets stuck in the loop.

    The fix is to try to drain per-cpu lists again after
    check_pages_isolated_cb() fails.
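
The interleaving above can be replayed deterministically in a small user-space model. The `toy_` names, the counters, and the `max_passes` bound are assumptions added for illustration; the sketch only shows why draining again after a failed isolation check lets the retry loop finish.

```c
#include <stdbool.h>

/* Hypothetical model of one zone being offlined; not kernel code. */
struct toy_zone {
    int total;      /* pages that must all reach the buddy lists */
    int buddy;      /* pages already on the buddy lists */
    int pcp;        /* pages parked on per-cpu lists */
    int in_flight;  /* pages a racing free adds to pcp after the first drain */
};

/* Models drain_all_pages(): move per-cpu pages to the buddy lists. */
static void toy_drain(struct toy_zone *z)
{
    z->buddy += z->pcp;
    z->pcp = 0;
}

/* Models the isolation check: succeeds only when every page is in buddy. */
static bool toy_isolated(const struct toy_zone *z)
{
    return z->buddy == z->total;
}

/*
 * Returns the pass on which offlining succeeded, or -1 after max_passes.
 * redrain selects the fixed behavior (drain again after a failed check).
 */
static int toy_offline(struct toy_zone *z, bool redrain, int max_passes)
{
    toy_drain(z);            /* the drain done when isolation starts */
    z->pcp += z->in_flight;  /* the racing free lands after that drain */
    z->in_flight = 0;

    for (int pass = 1; pass <= max_passes; pass++) {
        if (toy_isolated(z))
            return pass;
        if (redrain)
            toy_drain(z);
    }
    return -1;
}
```

In this model a zone with one in-flight page never passes the check without the extra drain, and passes on the second attempt with it.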

    Fixes: c52e75935f8d ("mm: remove extra drain pages on pcp list")
    Signed-off-by: Pavel Tatashin
    Signed-off-by: Andrew Morton
    Acked-by: David Rientjes
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Acked-by: David Hildenbrand
    Cc: Oscar Salvador
    Cc: Wei Yang
    Cc:
    Link: https://lkml.kernel.org/r/20200903140032.380431-1-pasha.tatashin@soleen.com
    Link: https://lkml.kernel.org/r/20200904151448.100489-2-pasha.tatashin@soleen.com
    Link: http://lkml.kernel.org/r/20200904070235.GA15277@dhcp22.suse.cz
    Signed-off-by: Linus Torvalds

    Pavel Tatashin
     
  • The displayed size is in bytes while the text says it is in kB.

    Shift it by 10 to really display kBytes.
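
As a minimal sketch of the conversion described here: shifting a byte count right by 10 divides it by 1024. The helper names are hypothetical and the selftest's own code is not reproduced.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: convert a byte count to whole kB (1 kB = 1024 bytes). */
static size_t toy_bytes_to_kb(size_t bytes)
{
    return bytes >> 10;
}

/* Writes "<n> kB" into buf so the printed unit matches the value. */
static int toy_format_kb(char *buf, size_t buflen, size_t bytes)
{
    return snprintf(buf, buflen, "%zu kB", toy_bytes_to_kb(bytes));
}
```

For a 2 MiB page (2097152 bytes) this yields 2048, so the text reads "2048 kB" rather than "2097152 kB".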

    Fixes: fa7b9a805c79 ("tools/selftest/vm: allow choosing mem size and page size in map_hugetlb")
    Signed-off-by: Christophe Leroy
    Signed-off-by: Andrew Morton
    Cc:
    Link: https://lkml.kernel.org/r/e27481224564a93d14106e750de31189deaa8bc8.1598861977.git.christophe.leroy@csgroup.eu
    Signed-off-by: Linus Torvalds

    Christophe Leroy
     
  • A migrating transparent huge page has to already be unmapped. Otherwise,
    the page could be modified while it is being copied to a new page and data
    could be lost. The function __split_huge_pmd() checks for a PMD migration
    entry before calling __split_huge_pmd_locked() leading one to think that
    __split_huge_pmd_locked() can handle splitting a migrating PMD.

    However, the code always increments the page->_mapcount and adjusts the
    memory control group accounting assuming the page is mapped.

    Also, if the PMD entry is a migration PMD entry, the call to
    is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead
    of migration_entry_to_pfn(pmd_to_swp_entry(pmd)). Fix these problems by
    checking for a PMD migration entry.

    Fixes: 84c3fc4e9c56 ("mm: thp: check pmd migration entry in common path")
    Signed-off-by: Ralph Campbell
    Signed-off-by: Andrew Morton
    Reviewed-by: Yang Shi
    Reviewed-by: Zi Yan
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Alistair Popple
    Cc: Christoph Hellwig
    Cc: Jason Gunthorpe
    Cc: Bharata B Rao
    Cc: Ben Skeggs
    Cc: Shuah Khan
    Cc: [4.14+]
    Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com
    Signed-off-by: Linus Torvalds

    Ralph Campbell
     
  • If a kprobe is marked as gone, we should not kill it again. Otherwise, we
    can disarm the kprobe more than once. In that case, the
    kprobe_ftrace_enabled count can become unbalanced, which can cause
    kprobes to stop working.
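
The imbalance can be shown with a toy counter. The structures and function names below are hypothetical and only model the idea of skipping a probe that is already marked gone; they are not the kprobes implementation.

```c
#include <stdbool.h>

/* Hypothetical model; the field names are illustrative, not the kernel's. */
struct toy_probe {
    bool gone;
};

struct toy_state {
    int ftrace_enabled;  /* models the kprobe_ftrace_enabled count */
};

/* Unconditional kill: every call disarms and decrements the count. */
static void toy_kill(struct toy_state *s, struct toy_probe *p)
{
    p->gone = true;
    s->ftrace_enabled--;
}

/* Guarded kill: a probe already marked gone is left alone. */
static void toy_kill_once(struct toy_state *s, struct toy_probe *p)
{
    if (p->gone)
        return;
    toy_kill(s, p);
}
```

Starting from a count of 1, two unguarded kills of the same probe leave the count at -1, while two guarded kills leave it at 0.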

    Fixes: e8386a0cb22f ("kprobes: support probing module __exit function")
    Co-developed-by: Chengming Zhou
    Signed-off-by: Muchun Song
    Signed-off-by: Chengming Zhou
    Signed-off-by: Andrew Morton
    Acked-by: Masami Hiramatsu
    Cc: "Naveen N . Rao"
    Cc: Anil S Keshavamurthy
    Cc: David S. Miller
    Cc: Song Liu
    Cc: Steven Rostedt
    Cc:
    Link: https://lkml.kernel.org/r/20200822030055.32383-1-songmuchun@bytedance.com
    Signed-off-by: Linus Torvalds

    Muchun Song
     
  • Commit e809d5f0b5c9 ("tmpfs: per-superblock i_ino support") made changes
    to shmem_reserve_inode() in mm/shmem.c, however the original test for
    (sbinfo->max_inodes) got dropped. This causes mounting tmpfs with option
    nr_inodes=0 to fail:

    # mount -ttmpfs -onr_inodes=0 none /ext0
    mount: /ext0: mount(2) system call failed: Cannot allocate memory.

    This patch restores the nr_inodes=0 functionality.
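
A minimal sketch of the dropped test, assuming (as the tmpfs documentation describes) that nr_inodes=0 means "no inode limit". The `toy_` structure and the -1 return value are placeholders; the actual shmem_reserve_inode() is not reproduced here.

```c
/* Hypothetical superblock info; only the two fields the check needs. */
struct toy_sbinfo {
    unsigned long max_inodes;   /* 0 models nr_inodes=0 (no limit) */
    unsigned long free_inodes;
};

/* Without the max_inodes test: an unlimited mount looks like a full one. */
static int toy_reserve_inode_buggy(struct toy_sbinfo *sb)
{
    if (!sb->free_inodes)
        return -1;
    sb->free_inodes--;
    return 0;
}

/* With the test restored: accounting only applies when a limit is set. */
static int toy_reserve_inode(struct toy_sbinfo *sb)
{
    if (sb->max_inodes) {
        if (!sb->free_inodes)
            return -1;
        sb->free_inodes--;
    }
    return 0;
}
```

With `max_inodes` and `free_inodes` both 0, the first variant refuses every reservation, which matches the failing mount shown above; the second succeeds.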

    Fixes: e809d5f0b5c9 ("tmpfs: per-superblock i_ino support")
    Signed-off-by: Byron Stanoszek
    Signed-off-by: Andrew Morton
    Acked-by: Hugh Dickins
    Acked-by: Chris Down
    Link: https://lkml.kernel.org/r/20200902035715.16414-1-gandalf@winds.org
    Signed-off-by: Linus Torvalds

    Byron Stanoszek
     
  • 5.8 commit 5d91f31faf8e ("mm: swap: fix vmstats for huge page") has
    established that vm_events should count every subpage of a THP, including
    unevictable_pgs_culled and unevictable_pgs_rescued; but
    lru_cache_add_inactive_or_unevictable() was not doing so for
    unevictable_pgs_mlocked, and mm/mlock.c was not doing so for
    unevictable_pgs mlocked, munlocked, cleared and stranded.

    Fix them; but THPs don't go the pagevec way in mlock.c, so no fixes needed
    on that path.

    Fixes: 5d91f31faf8e ("mm: swap: fix vmstats for huge page")
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Reviewed-by: Shakeel Butt
    Acked-by: Yang Shi
    Cc: Alex Shi
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Mike Kravetz
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301408230.5954@eggly.anvils
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • check_move_unevictable_pages() is used in making unevictable shmem pages
    evictable: by shmem_unlock_mapping(), drm_gem_check_release_pagevec() and
    i915/gem check_release_pagevec(). Those may pass down subpages of a huge
    page, when /sys/kernel/mm/transparent_hugepage/shmem_enabled is "force".

    That does not crash or warn at present, but the accounting of vmstats
    unevictable_pgs_scanned and unevictable_pgs_rescued is inconsistent:
    scanned being incremented on each subpage, rescued only on the head (since
    tails already appear evictable once the head has been updated).

    5.8 commit 5d91f31faf8e ("mm: swap: fix vmstats for huge page") has
    established that vm_events in general (and unevictable_pgs_rescued in
    particular) should count every subpage: so follow that precedent here.

    Do this in such a way that if mem_cgroup_page_lruvec() is made stricter
    (to check page->mem_cgroup is always set), no problem: skip the tails
    before calling it, and add thp_nr_pages() to vmstats on the head.
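
The accounting rule in the last paragraph (skip tails, count the whole huge page on its head) can be sketched as below. The page descriptor and counters are hypothetical simplifications, with `nr_pages` standing in for what thp_nr_pages() would report.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page descriptor; not the kernel's struct page. */
struct toy_page {
    bool is_tail;    /* true for tail subpages of a huge page */
    int  nr_pages;   /* on a head or small page: subpages it covers */
    bool evictable;  /* whether the page can be rescued */
};

struct toy_counts {
    int scanned;
    int rescued;
};

/* Counts every subpage exactly once, through its head page. */
static struct toy_counts toy_check_move(const struct toy_page *pages, size_t n)
{
    struct toy_counts c = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        if (pages[i].is_tail)
            continue;  /* tails are accounted through their head */
        c.scanned += pages[i].nr_pages;
        if (pages[i].evictable)
            c.rescued += pages[i].nr_pages;
    }
    return c;
}
```

For one evictable four-subpage huge page passed as head plus three tails, followed by one evictable small page, both counters end at 5, so scanned and rescued stay consistent.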

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Reviewed-by: Shakeel Butt
    Acked-by: Yang Shi
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Mike Kravetz
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301405000.5954@eggly.anvils
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • hugetlbfs pages do not participate in memcg: so although they do find most
    of migrate_page_states() useful, it would be better if they did not call
    into mem_cgroup_migrate() - where Qian Cai reported that LTP's
    move_pages12 triggers the warning in Alex Shi's prospective commit
    "mm/memcg: warning on !memcg after readahead page charged".

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Reviewed-by: Shakeel Butt
    Acked-by: Johannes Weiner
    Cc: Alex Shi
    Cc: Michal Hocko
    Cc: Mike Kravetz
    Cc: Qian Cai
    Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008301359460.5954@eggly.anvils
    Signed-off-by: Linus Torvalds

    Hugh Dickins