14 Oct, 2020

1 commit

  • kmemleak_scan() currently relies on the big tasklist_lock hammer to
    stabilize iterating through the tasklist. Instead, this patch proposes
    simply using rcu along with the rcu-safe for_each_process_thread flavor
    (without changing scan semantics), which doesn't make use of
    next_thread/p->thread_group and thus cannot race with exit. Furthermore,
    any races with fork() and not seeing the new child should be benign as
    it's not running yet and can also be detected by the next scan.

    Avoiding the tasklist_lock could prove beneficial for performance
    considering the scan operation is done periodically. I have seen
    improvements of 30%-ish when doing similar replacements on very
    pathological microbenchmarks (ie stressing get/setpriority(2)).

    However my main motivation is that it's one less user of the global
    lock, something that Linus has long wanted to see gone eventually
    (if ever), even if the traditional fairness issues have been dealt with
    now with qrwlocks. Of course this is a very long way ahead. This
    patch also kills another user of the deprecated tsk->thread_group.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Andrew Morton
    Reviewed-by: Qian Cai
    Acked-by: Catalin Marinas
    Acked-by: Oleg Nesterov
    Link: https://lkml.kernel.org/r/20200820203902.11308-1-dave@stgolabs.net
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     

15 Aug, 2020

1 commit

  • Even if KCSAN is disabled for kmemleak, update_checksum() could still call
    crc32() (which is outside of kmemleak.c) to dereference object->pointer.
    Thus, the value of object->pointer could be accessed concurrently as
    noticed by KCSAN,

    BUG: KCSAN: data-race in crc32_le_base / do_raw_spin_lock

    write to 0xffffb0ea683a7d50 of 4 bytes by task 23575 on cpu 12:
    do_raw_spin_lock+0x114/0x200
    debug_spin_lock_after at kernel/locking/spinlock_debug.c:91
    (inlined by) do_raw_spin_lock at kernel/locking/spinlock_debug.c:115
    _raw_spin_lock+0x40/0x50
    __handle_mm_fault+0xa9e/0xd00
    handle_mm_fault+0xfc/0x2f0
    do_page_fault+0x263/0x6f9
    page_fault+0x34/0x40

    read to 0xffffb0ea683a7d50 of 4 bytes by task 839 on cpu 60:
    crc32_le_base+0x67/0x350
    crc32_le_base+0x67/0x350:
    crc32_body at lib/crc32.c:106
    (inlined by) crc32_le_generic at lib/crc32.c:179
    (inlined by) crc32_le at lib/crc32.c:197
    kmemleak_scan+0x528/0xd90
    update_checksum at mm/kmemleak.c:1172
    (inlined by) kmemleak_scan at mm/kmemleak.c:1497
    kmemleak_scan_thread+0xcc/0xfa
    kthread+0x1e0/0x200
    ret_from_fork+0x27/0x50

    If a shattered value was returned due to a data race, it will be corrected
    in the next scan. Thus, let KCSAN ignore all reads in the region to
    silence KCSAN in case the write side is non-atomic.

    Suggested-by: Marco Elver
    Signed-off-by: Qian Cai
    Signed-off-by: Andrew Morton
    Acked-by: Marco Elver
    Acked-by: Catalin Marinas
    Link: http://lkml.kernel.org/r/20200317182754.2180-1-cai@lca.pw
    Signed-off-by: Linus Torvalds

    Qian Cai
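
    The reasoning in the commit above (a torn read only produces a wrong
    checksum for one scan, and the next scan recomputes it) can be sketched
    in userspace. This is a minimal model, not the kernel code: the
    crc32_sketch() routine is a plain bitwise CRC-32 standing in for the
    kernel's crc32(), and struct tracked_object and update_checksum_sketch()
    are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); a stand-in for crc32(). */
uint32_t crc32_sketch(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= p[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical, reduced version of the per-object metadata. */
struct tracked_object {
	const void *pointer;	/* start of the tracked memory block */
	size_t size;		/* size of the block in bytes */
	uint32_t checksum;	/* checksum recorded by the previous scan */
};

/*
 * Recompute the checksum and report whether the content changed since the
 * previous scan. A racy read can only make this answer wrong for one scan;
 * the following scan overwrites the stored value again.
 */
bool update_checksum_sketch(struct tracked_object *obj)
{
	uint32_t old = obj->checksum;

	obj->checksum = crc32_sketch(obj->pointer, obj->size);
	return obj->checksum != old;
}
```

    Calling update_checksum_sketch() twice on an unmodified buffer reports
    "unchanged" the second time, whatever value the first call stored.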
     

03 Apr, 2020

1 commit

  • Clang warns:

    mm/kmemleak.c:1955:28: warning: array comparison always evaluates to a constant [-Wtautological-compare]
    if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
    ^
    mm/kmemleak.c:1955:60: warning: array comparison always evaluates to a constant [-Wtautological-compare]
    if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)

    These are not true arrays, they are linker defined symbols, which are just
    addresses. Using the address of operator silences the warning and does
    not change the resulting assembly with either clang/ld.lld or gcc/ld
    (tested with diff + objdump -Dr).

    Suggested-by: Nick Desaulniers
    Signed-off-by: Nathan Chancellor
    Signed-off-by: Andrew Morton
    Acked-by: Catalin Marinas
    Link: https://github.com/ClangBuiltLinux/linux/issues/895
    Link: http://lkml.kernel.org/r/20200220051551.44000-1-natechancellor@gmail.com
    Signed-off-by: Linus Torvalds

    Nathan Chancellor
     

01 Feb, 2020

1 commit

  • kmemleak_lock as a rwlock on RT can possibly be acquired in atomic
    context, which does not work.

    Since the kmemleak operation is performed in atomic context, make it a
    raw_spinlock_t so it can also be acquired on RT. This is used for
    debugging and is not enabled by default in a production-like environment
    (where performance/latency matters), so it makes sense to make it a
    raw_spinlock_t instead of trying to get rid of the atomic context. Also
    turn the kmemleak_object->lock into a raw_spinlock_t, which is acquired
    (nested) while the kmemleak_lock is held.

    The time spent in "echo scan > kmemleak" slightly improved on a 64-core
    box with this patch applied after boot.

    [bigeasy@linutronix.de: redo the description, update comments. Merge the individual bits: He Zhe did the kmemleak_lock, Liu Haitao the ->lock and Yongxin Liu forwarded Liu's patch.]
    Link: http://lkml.kernel.org/r/20191219170834.4tah3prf2gdothz4@linutronix.de
    Link: https://lkml.kernel.org/r/20181218150744.GB20197@arrakis.emea.arm.com
    Link: https://lkml.kernel.org/r/1542877459-144382-1-git-send-email-zhe.he@windriver.com
    Link: https://lkml.kernel.org/r/20190927082230.34152-1-yongxin.liu@windriver.com
    Signed-off-by: He Zhe
    Signed-off-by: Liu Haitao
    Signed-off-by: Yongxin Liu
    Signed-off-by: Sebastian Andrzej Siewior
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    He Zhe
     

14 Oct, 2019

1 commit

  • In case of an error (e.g. memory pool too small), kmemleak disables
    itself and cleans up the already allocated metadata objects. However, if
    this happens early before the RCU callback mechanism is available,
    put_object() skips call_rcu() and frees the object directly. This is not
    safe with the RCU list traversal in __kmemleak_do_cleanup().

    Change the list traversal in __kmemleak_do_cleanup() to
    list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since
    the kmemleak is already disabled at this point. In addition, avoid an
    unnecessary metadata object rb-tree look-up since it already has the
    struct kmemleak_object pointer.

    Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
    Reported-by: Alexey Kardashevskiy
    Reported-by: Marc Dionne
    Reported-by: Ted Ts'o
    Cc: Andrew Morton
    Signed-off-by: Catalin Marinas
    Signed-off-by: Linus Torvalds

    Catalin Marinas
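
    The traversal problem described in the commit above is the classic
    "free while iterating" pattern. The sketch below is a userspace
    illustration with a hypothetical singly linked struct node rather than
    the kernel's list_head: the next pointer is saved before the current
    entry is freed, which is the idea behind list_for_each_entry_safe().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; the kernel uses struct list_head instead. */
struct node {
	struct node *next;
	int id;
};

/* Push a new node at the head; returns the old head if allocation fails. */
struct node *push_node(struct node *head, int id)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->id = id;
	n->next = head;
	return n;
}

/*
 * Free every node and return how many were freed. Reading n->next in the
 * loop increment after free(n) would be a use-after-free, so the successor
 * is saved in tmp before the node goes away.
 */
size_t free_all_sketch(struct node *head)
{
	struct node *n, *tmp;
	size_t freed = 0;

	for (n = head; n; n = tmp) {
		tmp = n->next;
		free(n);
		freed++;
	}
	return freed;
}
```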
     

25 Sep, 2019

4 commits

  • The only way to obtain the current memory pool size for a running kernel
    is to check the kernel config file which is inconvenient. Record it in
    the kernel messages.

    [akpm@linux-foundation.org: s/memory pool size/memory pool/available/, per Catalin]
    Link: http://lkml.kernel.org/r/1565809631-28933-1-git-send-email-cai@lca.pw
    Signed-off-by: Qian Cai
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qian Cai
     
  • Currently kmemleak uses a static early_log buffer to trace all memory
    allocation/freeing before the slab allocator is initialised. Such early
    log is replayed during kmemleak_init() to properly initialise the kmemleak
    metadata for objects allocated up to that point. With a memory pool that
    does not rely on the slab allocator, it is possible to skip this early log
    entirely.

    In order to remove the early logging, consider kmemleak_enabled == 1 by
    default while the kmem_cache availability is checked directly on the
    object_cache and scan_area_cache variables. The RCU callback is only
    invoked after object_cache has been initialised as we wouldn't have any
    concurrent list traversal before this.

    In order to reduce the number of callbacks before kmemleak is fully
    initialised, move the kmemleak_init() call to mm_init().

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: remove WARN_ON(), per Catalin]
    Link: http://lkml.kernel.org/r/20190812160642.52134-4-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     
  • Add a memory pool for struct kmemleak_object in case the normal
    kmem_cache_alloc() fails under the gfp constraints passed by the caller.
    The mem_pool[] array size is currently fixed at 16000.

    We are not using the existing mempool kernel API since this requires
    the slab allocator to be available (for pool->elements allocation). A
    subsequent kmemleak patch will replace the static early log buffer with
    the pool allocation introduced here and this functionality is required
    to be available before the slab was initialised.

    Link: http://lkml.kernel.org/r/20190812160642.52134-3-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
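
    A minimal userspace sketch of the kind of static pool the commit above
    describes: a fixed array that needs no other allocator, plus a free list
    for recycled entries. All names here (struct pool_object,
    mem_pool_alloc_sketch(), alloc_object_sketch()) and the tiny pool size
    are assumptions for illustration; the commit itself only states that
    mem_pool[] has 16000 entries and is used when kmem_cache_alloc() fails.

```c
#include <stddef.h>

#define MEM_POOL_SIZE_SKETCH 4	/* tiny for illustration; the commit uses 16000 */

/* Hypothetical, reduced metadata object. */
struct pool_object {
	struct pool_object *next_free;	/* link used only while on the free list */
	unsigned long pointer;
};

static struct pool_object mem_pool[MEM_POOL_SIZE_SKETCH];
static size_t mem_pool_unused = MEM_POOL_SIZE_SKETCH;
static struct pool_object *mem_pool_free_list;

/* Take an entry from the free list first, then from the untouched array slots. */
struct pool_object *mem_pool_alloc_sketch(void)
{
	struct pool_object *obj = mem_pool_free_list;

	if (obj) {
		mem_pool_free_list = obj->next_free;
		return obj;
	}
	if (mem_pool_unused > 0)
		return &mem_pool[--mem_pool_unused];
	return NULL;	/* pool exhausted */
}

/* Return an entry to the pool by pushing it on the free list. */
void mem_pool_free_sketch(struct pool_object *obj)
{
	obj->next_free = mem_pool_free_list;
	mem_pool_free_list = obj;
}

/*
 * Try the primary allocator and fall back to the pool when it is missing
 * (NULL models "slab not initialised yet") or returns NULL.
 */
struct pool_object *alloc_object_sketch(struct pool_object *(*primary_alloc)(void))
{
	struct pool_object *obj = primary_alloc ? primary_alloc() : NULL;

	return obj ? obj : mem_pool_alloc_sketch();
}
```

    Because the pool is a plain static array, it is usable before any
    dynamic allocator exists, which is the property the commit relies on.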
     
  • Patch series "mm: kmemleak: Use a memory pool for kmemleak object
    allocations", v3.

    Following the discussions on v2 of this patch(set) [1], this series takes
    a slightly different approach:

    - it implements its own simple memory pool that does not rely on the
    slab allocator

    - drops the early log buffer logic entirely since it can now allocate
    metadata from the memory pool directly before kmemleak is fully
    initialised

    - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
    CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE

    - moves the kmemleak_init() call earlier (mm_init())

    - to avoid a separate memory pool for struct scan_area, it makes the
    tool robust when such allocations fail as scan areas are rather an
    optimisation

    [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

    This patch (of 3):

    Object scan areas are an optimisation aimed to decrease the false
    positives and slightly improve the scanning time of large objects known to
    only have a few specific pointers. If a struct scan_area fails to
    allocate, kmemleak can still function normally by scanning the full
    object.

    Introduce an OBJECT_FULL_SCAN flag and mark objects as such when scan_area
    allocation fails.

    Link: http://lkml.kernel.org/r/20190812160642.52134-2-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Cc: Michal Hocko
    Cc: Matthew Wilcox
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
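
    The fallback described in the commit above can be sketched as follows.
    This is a userspace model under stated assumptions: the flag value, the
    struct layouts and the helper names are hypothetical, and an allocation
    failure is modelled by passing a NULL area. Only the OBJECT_FULL_SCAN
    name and the "scan the full object on failure" behaviour come from the
    commit message.

```c
#include <stddef.h>

#define OBJECT_FULL_SCAN (1u << 3)	/* bit value is an assumption */

/* Hypothetical scan area: a sub-range of the object worth scanning. */
struct scan_area {
	struct scan_area *next;
	size_t size;
};

/* Hypothetical, reduced metadata object. */
struct scanned_object {
	unsigned int flags;
	size_t size;			/* full object size in bytes */
	struct scan_area *areas;	/* optional list of scan areas */
};

/*
 * Attach a scan area. A NULL area models a failed allocation: instead of
 * reporting an error, the object is marked to be scanned in full.
 */
void add_scan_area_sketch(struct scanned_object *obj, struct scan_area *area,
			  size_t size)
{
	if (!area) {
		obj->flags |= OBJECT_FULL_SCAN;
		return;
	}
	area->size = size;
	area->next = obj->areas;
	obj->areas = area;
}

/* Number of bytes a scan would cover for this object. */
size_t bytes_to_scan_sketch(const struct scanned_object *obj)
{
	size_t total = 0;

	if (!obj->areas || (obj->flags & OBJECT_FULL_SCAN))
		return obj->size;
	for (const struct scan_area *a = obj->areas; a; a = a->next)
		total += a->size;
	return total;
}
```

    The trade-off is more scanning time and possibly more false negatives
    for that object, which is acceptable because scan areas are only an
    optimisation.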
     

14 Aug, 2019

1 commit

  • If an error occurs during kmemleak_init() (e.g. kmem cache cannot be
    created), kmemleak is disabled but kmemleak_early_log remains enabled.
    Subsequently, when the .init.text section is freed, the log_early()
    function no longer exists. To avoid a page fault in such a scenario,
    ensure that kmemleak_disable() also disables early logging.

    Link: http://lkml.kernel.org/r/20190731152302.42073-1-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Reported-by: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     

03 Aug, 2019

1 commit

  • When running ltp's oom test with kmemleak enabled, the below warning was
    triggered since the kernel detects that __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM
    is passed in:

    WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
    Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata
    CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
    RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
    ...
    kmemleak_alloc+0x4e/0xb0
    kmem_cache_alloc+0x2a7/0x3e0
    mempool_alloc_slab+0x2d/0x40
    mempool_alloc+0x118/0x2b0
    bio_alloc_bioset+0x19d/0x350
    get_swap_bio+0x80/0x230
    __swap_writepage+0x5ff/0xb20

    The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak
    has __GFP_NOFAIL set all the time due to d9570ee3bd1d4f2 ("kmemleak:
    allow to coexist with fault injection"). But, it doesn't make any sense
    to have __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same
    time.

    According to the discussion on the mailing list, the commit should be
    reverted as a short-term solution. Catalin Marinas would follow up with
    a better solution for the longer term.

    The failure rate of kmemleak metadata allocation may increase in some
    circumstances, but this should be an expected side effect.

    Link: http://lkml.kernel.org/r/1563299431-111710-1-git-send-email-yang.shi@linux.alibaba.com
    Fixes: d9570ee3bd1d4f2 ("kmemleak: allow to coexist with fault injection")
    Signed-off-by: Yang Shi
    Suggested-by: Catalin Marinas
    Acked-by: Michal Hocko
    Cc: Dmitry Vyukov
    Cc: David Rientjes
    Cc: Matthew Wilcox
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Shi
     

13 Jul, 2019

3 commits

  • Pull driver core and debugfs updates from Greg KH:
    "Here is the "big" driver core and debugfs changes for 5.3-rc1

    It's a lot of different patches, all across the tree due to some api
    changes and lots of debugfs cleanups.

    Other than the debugfs cleanups, in this set of changes we have:

    - bus iteration function cleanups

    - scripts/get_abi.pl tool to display and parse Documentation/ABI
    entries in a simple way

    - cleanups to Documentation/ABI/ entries to make them parse easier
    due to typos and other minor things

    - default_attrs use for some ktype users

    - driver model documentation file conversions to .rst

    - compressed firmware file loading

    - deferred probe fixes

    All of these have been in linux-next for a while, with a bunch of
    merge issues that Stephen has been patient with me for"

    * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
    debugfs: make error message a bit more verbose
    orangefs: fix build warning from debugfs cleanup patch
    ubifs: fix build warning after debugfs cleanup patch
    driver: core: Allow subsystems to continue deferring probe
    drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
    arch_topology: Remove error messages on out-of-memory conditions
    lib: notifier-error-inject: no need to check return value of debugfs_create functions
    swiotlb: no need to check return value of debugfs_create functions
    ceph: no need to check return value of debugfs_create functions
    sunrpc: no need to check return value of debugfs_create functions
    ubifs: no need to check return value of debugfs_create functions
    orangefs: no need to check return value of debugfs_create functions
    nfsd: no need to check return value of debugfs_create functions
    lib: 842: no need to check return value of debugfs_create functions
    debugfs: provide pr_fmt() macro
    debugfs: log errors when something goes wrong
    drivers: s390/cio: Fix compilation warning about const qualifiers
    drivers: Add generic helper to match by of_node
    driver_find_device: Unify the match function with class_find_device()
    bus_find_device: Unify the match callback with class_find_device
    ...

    Linus Torvalds
     
  • According to POSIX, EBUSY means that the "device or resource is busy", and
    this can lead to people thinking that the file
    `/sys/kernel/debug/kmemleak/` is somehow locked or being used by another
    process. Change this error code to a more appropriate one.

    Link: http://lkml.kernel.org/r/20190612155231.19448-1-andrealmeid@collabora.com
    Signed-off-by: André Almeida
    Reviewed-by: Andrew Morton
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    André Almeida
     
  • in_softirq() is the wrong predicate to check if we are in a softirq
    context. It also returns true if we have BH disabled, so objects are
    falsely stamped with "softirq" comm. The correct predicate is
    in_serving_softirq().

    Previously, if a user did cat /sys/kernel/debug/kmemleak they would
    see this, which is clearly wrong since this is system call context (see
    the comm):

    unreferenced object 0xffff88805bd661c0 (size 64):
    comm "softirq", pid 0, jiffies 4294942959 (age 12.400s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00 ................
    00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [] slab_post_alloc_hook mm/slab.h:439 [inline]
    [] slab_alloc mm/slab.c:3326 [inline]
    [] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
    [] kmalloc include/linux/slab.h:547 [inline]
    [] kzalloc include/linux/slab.h:742 [inline]
    [] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
    [] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
    [] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
    [] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
    [] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
    [] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
    [] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
    [] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
    [] __do_sys_setsockopt net/socket.c:2089 [inline]
    [] __se_sys_setsockopt net/socket.c:2086 [inline]
    [] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
    [] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    now they will see this:

    unreferenced object 0xffff88805413c800 (size 64):
    comm "syz-executor.4", pid 8960, jiffies 4294994003 (age 14.350s)
    hex dump (first 32 bytes):
    00 7a 8a 57 80 88 ff ff e0 00 00 01 00 00 00 00 .z.W............
    00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [] slab_post_alloc_hook mm/slab.h:439 [inline]
    [] slab_alloc mm/slab.c:3326 [inline]
    [] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
    [] kmalloc include/linux/slab.h:547 [inline]
    [] kzalloc include/linux/slab.h:742 [inline]
    [] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
    [] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
    [] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
    [] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
    [] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
    [] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
    [] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
    [] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
    [] __do_sys_setsockopt net/socket.c:2089 [inline]
    [] __se_sys_setsockopt net/socket.c:2086 [inline]
    [] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
    [] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Link: http://lkml.kernel.org/r/20190517171507.96046-1-dvyukov@gmail.com
    Signed-off-by: Dmitry Vyukov
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
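
    The difference between the two predicates in the commit above can be
    modelled on a plain counter. This is a simplified userspace sketch: the
    bit layout follows the softirq field of preempt_count as commonly
    defined in include/linux/preempt.h, but it is reproduced here from
    memory as an assumption, and the *_sketch() helpers are hypothetical.

```c
#include <stdbool.h>

/* Softirq field of the preempt counter (layout assumed, see lead-in). */
#define SOFTIRQ_SHIFT		8
#define SOFTIRQ_BITS		8
#define SOFTIRQ_OFFSET		(1u << SOFTIRQ_SHIFT)
#define SOFTIRQ_MASK		(((1u << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
/* Disabling bottom halves adds twice the offset, leaving the low bit free. */
#define SOFTIRQ_DISABLE_OFFSET	(2 * SOFTIRQ_OFFSET)

/* True when serving a softirq OR when bottom halves are merely disabled. */
bool in_softirq_sketch(unsigned int count)
{
	return (count & SOFTIRQ_MASK) != 0;
}

/* True only while a softirq handler is actually running. */
bool in_serving_softirq_sketch(unsigned int count)
{
	return (count & SOFTIRQ_MASK & SOFTIRQ_OFFSET) != 0;
}
```

    With a counter of SOFTIRQ_DISABLE_OFFSET (a task that only disabled BH),
    the first helper returns true and the second returns false, which is the
    case that produced the bogus "softirq" comm.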
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation this program is
    distributed in the hope that it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with this program if not write to the free
    software foundation inc 59 temple place suite 330 boston ma 02111
    1307 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 136 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Alexios Zavras
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190530000436.384967451@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

03 Jun, 2019

1 commit


07 May, 2019

1 commit

  • Pull stack trace updates from Ingo Molnar:
    "So Thomas looked at the stacktrace code recently and noticed a few
    weirdnesses, and we all know how such stories of crummy kernel code
    meeting German engineering perfection end: a 45-patch series to clean
    it all up! :-)

    Here's the changes in Thomas's words:

    'Struct stack_trace is a sinkhole for input and output parameters
    which is largely pointless for most usage sites. In fact if embedded
    into other data structures it creates indirections and extra storage
    overhead for no benefit.

    Looking at all usage sites makes it clear that they just require an
    interface which is based on a storage array. That array is either on
    stack, global or embedded into some other data structure.

    Some of the stack depot usage sites are outright wrong, but
    fortunately the wrongness just causes more stack being used for
    nothing and does not have functional impact.

    Another oddity is the inconsistent termination of the stack trace
    with ULONG_MAX. It's pointless as the number of entries is what
    determines the length of the stored trace. In fact quite some call
    sites remove the ULONG_MAX marker afterwards with or without nasty
    comments about it. Not all architectures do that and those which do,
    do it inconsistently either conditional on nr_entries == 0 or
    unconditionally.

    The following series cleans that up by:

    1) Removing the ULONG_MAX termination in the architecture code

    2) Removing the ULONG_MAX fixups at the call sites

    3) Providing plain storage array based interfaces for stacktrace
    and stackdepot.

    4) Cleaning up the mess at the callsites including some related
    cleanups.

    5) Removing the struct stack_trace based interfaces

    This is not changing the struct stack_trace interfaces at the
    architecture level, but it removes the exposure to the generic
    code'"

    * 'core-stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
    x86/stacktrace: Use common infrastructure
    stacktrace: Provide common infrastructure
    lib/stackdepot: Remove obsolete functions
    stacktrace: Remove obsolete functions
    livepatch: Simplify stack trace retrieval
    tracing: Remove the last struct stack_trace usage
    tracing: Simplify stack trace retrieval
    tracing: Make ftrace_trace_userstack() static and conditional
    tracing: Use percpu stack trace buffer more intelligently
    tracing: Simplify stacktrace retrieval in histograms
    lockdep: Simplify stack trace handling
    lockdep: Remove save argument from check_prev_add()
    lockdep: Remove unused trace argument from print_circular_bug()
    drm: Simplify stacktrace handling
    dm persistent data: Simplify stack trace handling
    dm bufio: Simplify stack trace retrieval
    btrfs: ref-verify: Simplify stack trace retrieval
    dma/debug: Simplify stracktrace retrieval
    fault-inject: Simplify stacktrace retrieval
    mm/page_owner: Simplify stack trace handling
    ...

    Linus Torvalds
     

29 Apr, 2019

1 commit

  • Replace the indirection through struct stack_trace by using the storage
    array based interfaces.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Josh Poimboeuf
    Acked-by: Catalin Marinas
    Cc: Andy Lutomirski
    Cc: linux-mm@kvack.org
    Cc: Steven Rostedt
    Cc: Alexander Potapenko
    Cc: Alexey Dobriyan
    Cc: Andrew Morton
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: kasan-dev@googlegroups.com
    Cc: Mike Rapoport
    Cc: Akinobu Mita
    Cc: Christoph Hellwig
    Cc: iommu@lists.linux-foundation.org
    Cc: Robin Murphy
    Cc: Marek Szyprowski
    Cc: Johannes Thumshirn
    Cc: David Sterba
    Cc: Chris Mason
    Cc: Josef Bacik
    Cc: linux-btrfs@vger.kernel.org
    Cc: dm-devel@redhat.com
    Cc: Mike Snitzer
    Cc: Alasdair Kergon
    Cc: Daniel Vetter
    Cc: intel-gfx@lists.freedesktop.org
    Cc: Joonas Lahtinen
    Cc: Maarten Lankhorst
    Cc: dri-devel@lists.freedesktop.org
    Cc: David Airlie
    Cc: Jani Nikula
    Cc: Rodrigo Vivi
    Cc: Tom Zanussi
    Cc: Miroslav Benes
    Cc: linux-arch@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190425094801.863716911@linutronix.de

    Thomas Gleixner
     

20 Apr, 2019

1 commit

  • The only references outside of the #ifdef have been removed, so now we
    get a warning in non-SMP configurations:

    mm/kmemleak.c:1404:13: error: unused function 'scan_large_block' [-Werror,-Wunused-function]

    Add a new #ifdef around it.

    Link: http://lkml.kernel.org/r/20190416123148.3502045-1-arnd@arndb.de
    Fixes: 298a32b13208 ("kmemleak: powerpc: skip scanning holes in the .bss section")
    Signed-off-by: Arnd Bergmann
    Acked-by: Catalin Marinas
    Cc: Vincent Whitchurch
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     

06 Apr, 2019

1 commit

  • Commit 2d4f567103ff ("KVM: PPC: Introduce kvm_tmp framework") adds
    kvm_tmp[] into the .bss section and then frees the rest of the unused
    space back to the page allocator.

    kernel_init
    kvm_guest_init
    kvm_free_tmp
    free_reserved_area
    free_unref_page
    free_unref_page_prepare

    With DEBUG_PAGEALLOC=y, it will unmap those pages from the kernel. As a
    result, kmemleak scan will trigger a panic when it scans the .bss
    section with unmapped pages.

    This patch creates dedicated kmemleak objects for the .data, .bss and
    potentially .data..ro_after_init sections to allow partial freeing via
    the kmemleak_free_part() in the powerpc kvm_free_tmp() function.

    Link: http://lkml.kernel.org/r/20190321171917.62049-1-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Reported-by: Qian Cai
    Acked-by: Michael Ellerman (powerpc)
    Tested-by: Qian Cai
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Avi Kivity
    Cc: Paolo Bonzini
    Cc: Radim Krcmar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     

22 Feb, 2019

1 commit

  • kmemleak keeps two global variables, min_addr and max_addr, which store
    the range of valid (encountered by kmemleak) pointer values, which it
    later uses to speed up pointer lookup when scanning blocks.

    With tagged pointers this range will get bigger than it needs to be. This
    patch makes kmemleak untag pointers before saving them to min_addr and
    max_addr and when performing a lookup.

    Link: http://lkml.kernel.org/r/16e887d442986ab87fe87a755815ad92fa431a5f.1550066133.git.andreyknvl@google.com
    Signed-off-by: Andrey Konovalov
    Tested-by: Qian Cai
    Acked-by: Catalin Marinas
    Cc: Alexander Potapenko
    Cc: Andrey Ryabinin
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Dmitry Vyukov
    Cc: Evgeniy Stepanov
    Cc: Joonsoo Kim
    Cc: Kostya Serebryany
    Cc: Pekka Enberg
    Cc: Vincenzo Frascino
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
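
    The effect of untagging on the min_addr/max_addr range from the commit
    above can be sketched in userspace. The tag layout (a tag in the top
    byte, reset to 0xff) and the helper names are assumptions for
    illustration; only the min_addr and max_addr names come from the commit
    message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: the tag lives in the top byte of a 64-bit pointer. */
#define TAG_SHIFT_SKETCH	56
#define TAG_MASK_SKETCH		(0xffULL << TAG_SHIFT_SKETCH)

uint64_t min_addr = UINT64_MAX;
uint64_t max_addr = 0;

/* Reset the tag to its "untagged" value (all ones in this sketch). */
uint64_t untag_sketch(uint64_t ptr)
{
	return ptr | TAG_MASK_SKETCH;
}

/* Record a new object; the range is kept in untagged form. */
void track_object_sketch(uint64_t ptr, uint64_t size)
{
	uint64_t start = untag_sketch(ptr);

	if (start < min_addr)
		min_addr = start;
	if (start + size > max_addr)
		max_addr = start + size;
}

/* Cheap rejection test used before a more expensive per-object lookup. */
bool may_point_to_object_sketch(uint64_t candidate)
{
	uint64_t addr = untag_sketch(candidate);

	return addr >= min_addr && addr < max_addr;
}
```

    Without the untagging step, two objects that are neighbours in memory
    but carry different tags would stretch the range across most of the
    address space and defeat the quick rejection.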
     

29 Dec, 2018

2 commits

  • Kmemleak scan can be CPU intensive and can stall user tasks at times. To
    prevent this, add config DEBUG_KMEMLEAK_AUTO_SCAN to enable/disable auto
    scan on boot up. Also protect first_run with DEBUG_KMEMLEAK_AUTO_SCAN as
    this is meant only for the first automatic scan.

    Link: http://lkml.kernel.org/r/1540231723-7087-1-git-send-email-prpatel@nvidia.com
    Signed-off-by: Sri Krishna chowdary
    Signed-off-by: Sachin Nikam
    Signed-off-by: Prateek
    Reviewed-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sri Krishna chowdary
     
  • kmemleak_scan() goes through all online nodes and tries to scan all used
    pages.

    We can do better and use pfn_to_online_page(), so in case we have
    CONFIG_MEMORY_HOTPLUG, offlined pages will be skipped automatically. For
    boxes where CONFIG_MEMORY_HOTPLUG is not present, pfn_to_online_page()
    will fall back to pfn_valid().

    Another little optimization is to check if the page belongs to the node we
    are currently checking, so in case we have nodes interleaved we will not
    check the same pfn multiple times.

    I ran some tests:

    Add some memory to node1 and node2 making it interleaved:

    (qemu) object_add memory-backend-ram,id=ram0,size=1G
    (qemu) device_add pc-dimm,id=dimm0,memdev=ram0,node=1
    (qemu) object_add memory-backend-ram,id=ram1,size=1G
    (qemu) device_add pc-dimm,id=dimm1,memdev=ram1,node=2
    (qemu) object_add memory-backend-ram,id=ram2,size=1G
    (qemu) device_add pc-dimm,id=dimm2,memdev=ram2,node=1

    Then, we offline that memory:
    # for i in {32..39} ; do echo "offline" > /sys/devices/system/node/node1/memory$i/state;done
    # for i in {48..55} ; do echo "offline" > /sys/devices/system/node/node1/memory$i/state;done
    # for i in {40..47} ; do echo "offline" > /sys/devices/system/node/node2/memory$i/state;done

    And we run kmemleak_scan:

    # echo "scan" > /sys/kernel/debug/kmemleak

    before the patch:

    kmemleak: time spend: 41596 us

    after the patch:

    kmemleak: time spend: 34899 us

    [akpm@linux-foundation.org: remove stray newline, per Oscar]
    Link: http://lkml.kernel.org/r/20181206131918.25099-1-osalvador@suse.de
    Signed-off-by: Oscar Salvador
    Reviewed-by: Wei Yang
    Suggested-by: Michal Hocko
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oscar Salvador
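
    The two checks described in the commit above (skip offline pages, skip
    pages that belong to another node) can be modelled on a flat array. This
    is a userspace sketch only: struct page_sketch, struct node_span_sketch
    and scan_node_sketch() are hypothetical, and the online flag stands in
    for pfn_to_online_page() returning NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page state, indexed by pfn. */
struct page_sketch {
	int nid;	/* node the page belongs to */
	bool online;	/* false models pfn_to_online_page() returning NULL */
};

/* Hypothetical pfn span of a node; spans of interleaved nodes overlap. */
struct node_span_sketch {
	int nid;
	size_t start_pfn;
	size_t end_pfn;	/* exclusive */
};

/* Count the pages that would be scanned while walking one node's span. */
size_t scan_node_sketch(const struct page_sketch *pages,
			const struct node_span_sketch *span)
{
	size_t scanned = 0;

	for (size_t pfn = span->start_pfn; pfn < span->end_pfn; pfn++) {
		const struct page_sketch *page = &pages[pfn];

		if (!page->online)
			continue;	/* offlined memory is skipped */
		if (page->nid != span->nid)
			continue;	/* another node's walk will visit it */
		scanned++;
	}
	return scanned;
}
```

    Summed over all nodes, each online page is counted exactly once even
    though the spans overlap.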
     

31 Oct, 2018

1 commit

  • Move remaining definitions and declarations from include/linux/bootmem.h
    into include/linux/memblock.h and remove the redundant header.

    The includes were replaced with the semantic patch below and then
    semi-automated removal of duplicated '#include <linux/memblock.h>'

    @@
    @@
    - #include <linux/bootmem.h>
    + #include <linux/memblock.h>

    [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
    [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
    [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
    Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
    Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Stephen Rothwell
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

27 Oct, 2018

1 commit

  • Currently, kmemleak only prints the number of suspected leaks to dmesg but
    requires the user to read a debugfs file to get the actual stack traces of
    the objects' allocation points. Add a module option to print the full
    object information to dmesg too. It can be enabled with
    kmemleak.verbose=1 on the kernel command line, or "echo 1 >
    /sys/module/kmemleak/parameters/verbose":

    This allows easier integration of kmemleak into test systems: We have
    automated test infrastructure to test our Linux systems. With this
    option, running our tests with kmemleak is as simple as enabling kmemleak
    and passing this command line option; the test infrastructure knows how to
    save kernel logs, which will now include kmemleak reports. Without this
    option, the test infrastructure needs to be specifically taught to read
    out the kmemleak debugfs file. Removing this need for special handling
    makes kmemleak more similar to other kernel debug options (slab debugging,
    debug objects, etc).

    Link: http://lkml.kernel.org/r/20180903144046.21023-1-vincent.whitchurch@axis.com
    Signed-off-by: Vincent Whitchurch
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vincent Whitchurch
     

05 Sep, 2018

1 commit

  • If kmemleak is built in to the kernel, but is disabled by default, the
    debugfs file is never registered. Because of this, it is not possible
    to find out if the kernel is built with kmemleak support by checking for
    the presence of this file. To allow this, always register the file.

    After this patch, if the file doesn't exist, kmemleak is not available
    in the kernel. If writing "scan" or any other value than "clear" to
    this file results in EBUSY, then kmemleak is available but is disabled
    by default and can be activated via the kernel command line.
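
    The detection rule above can be sketched as a small userspace probe.
    This is only an illustration: the enum and function names are
    hypothetical, the caller supplies the debugfs path, and a missing file
    can also mean that debugfs is simply not mounted. Note that a
    successful write really does trigger a scan.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical classification of what a probe can observe. */
enum kmemleak_state {
    KMEMLEAK_ABSENT,    /* file missing: no kmemleak support (or no debugfs) */
    KMEMLEAK_DISABLED,  /* write failed with EBUSY: built in, but disabled */
    KMEMLEAK_ACTIVE,    /* write accepted: a scan was triggered */
    KMEMLEAK_UNKNOWN    /* anything else, e.g. EACCES when not root */
};

/* Probe the control file at `path` by writing "scan" to it. */
enum kmemleak_state probe_kmemleak(const char *path)
{
    static const char cmd[] = "scan";
    int fd;
    ssize_t n;
    int err;

    assert(path != NULL);
    fd = open(path, O_WRONLY);
    if (fd < 0)
        return errno == ENOENT ? KMEMLEAK_ABSENT : KMEMLEAK_UNKNOWN;

    n = write(fd, cmd, sizeof(cmd) - 1);
    err = (n < 0) ? errno : 0;  /* errno is only meaningful on failure */
    close(fd);

    if (n == (ssize_t)(sizeof(cmd) - 1))
        return KMEMLEAK_ACTIVE;
    if (err == EBUSY)
        return KMEMLEAK_DISABLED;
    return KMEMLEAK_UNKNOWN;
}
```

    A caller would typically pass "/sys/kernel/debug/kmemleak" and treat
    KMEMLEAK_UNKNOWN as "cannot tell", for example when run without root.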

    Catalin: "that's also consistent with a late disabling of kmemleak when
    the debugfs entry sticks around."

    Link: http://lkml.kernel.org/r/20180824131220.19176-1-vincent.whitchurch@axis.com
    Signed-off-by: Vincent Whitchurch
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vincent Whitchurch
     

06 Apr, 2018

2 commits

  • Link: http://lkml.kernel.org/r/1519585191-10180-4-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrew Morton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • early_param() handlers are only called during kernel initialization, so
    Linux marks them with the __init macro to save memory.

    kmemleak_boot_config() was missing this annotation, so mark it __init
    as well.

    Link: http://lkml.kernel.org/r/20180117034720.26897-1-douly.fnst@cn.fujitsu.com
    Signed-off-by: Dou Liyang
    Reviewed-by: Andrew Morton
    Cc: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dou Liyang
     

29 Mar, 2018

1 commit

  • A crash is observed when kmemleak_scan accesses the object->pointer,
    likely due to the following race.

    TASK A                       TASK B                       TASK C
    kmemleak_write
     (with "scan" and
     NOT "scan=on")
    kmemleak_scan()
                                 create_object
                                 kmem_cache_alloc fails
                                 kmemleak_disable
                                 kmemleak_do_cleanup
                                 kmemleak_free_enabled = 0
                                                              kfree
                                                              kmemleak_free bails out
                                                               (kmemleak_free_enabled is 0)
                                                              slub frees object->pointer
    update_checksum
    crash - object->pointer
     freed (DEBUG_PAGEALLOC)

    kmemleak_do_cleanup waits for the scan thread to complete, but not for
    a direct call to kmemleak_scan via kmemleak_write. So add a wait for
    kmemleak_scan completion before disabling kmemleak_free, and while at it
    fix the comment on stop_scan_thread.
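
    The ordering this fix relies on can be modelled in userspace with a
    mutex: the scan runs with the lock held, and cleanup must take the same
    lock before it stops tracking frees. This is a sketch under
    assumptions, not the kernel code; the function names are hypothetical
    and free_tracking_enabled merely stands in for kmemleak_free_enabled.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t scan_mutex = PTHREAD_MUTEX_INITIALIZER;
static int free_tracking_enabled = 1;  /* models kmemleak_free_enabled */
static int scans_completed;

/* Models a scan: the whole walk runs with scan_mutex held. */
void model_scan(void)
{
    pthread_mutex_lock(&scan_mutex);
    scans_completed++;  /* stand-in for walking the tracked objects */
    pthread_mutex_unlock(&scan_mutex);
}

/* Models cleanup: block until any scan has finished, then stop tracking. */
void model_cleanup(void)
{
    int rc = pthread_mutex_lock(&scan_mutex);

    assert(rc == 0);
    free_tracking_enabled = 0;  /* no scan can be in flight at this point */
    pthread_mutex_unlock(&scan_mutex);
}
```

    Because both paths serialize on the same lock, a scan started from
    either the scan thread or a direct write can no longer observe an
    object whose free went untracked.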

    [vinmenon@codeaurora.org: fix stop_scan_thread comment]
    Link: http://lkml.kernel.org/r/1522219972-22809-1-git-send-email-vinmenon@codeaurora.org
    Link: http://lkml.kernel.org/r/1522063429-18992-1-git-send-email-vinmenon@codeaurora.org
    Signed-off-by: Vinayak Menon
    Reviewed-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vinayak Menon
     

01 Feb, 2018

1 commit

  • The preempt counter APIs have been split out, so hardirq.h now only
    provides the irq_enter/exit APIs, which kmemleak does not use at all.

    So, remove the unused hardirq.h include.

    Link: http://lkml.kernel.org/r/1510959741-31109-1-git-send-email-yang.s@alibaba-inc.com
    Signed-off-by: Yang Shi
    Cc: Michal Hocko
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Shi
     

14 Jan, 2018

1 commit

  • kmemleak does one slab allocation per user allocation. So if slab fault
    injection is enabled to any degree, kmemleak instantly fails to allocate
    and turns itself off. However, it's useful to use kmemleak with fault
    injection to find leaks on error paths. On the other hand, checking
    kmemleak itself is not so useful because (1) it's a debugging tool and
    (2) it has a very regular allocation pattern (basically a single
    allocation site, so it either works or not).

    Turn off fault injection for kmemleak allocations.
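
    The idea can be illustrated with a toy allocator whose injected
    failures are skipped for callers that opt out. Everything here is
    hypothetical (the flag, the global switch and the function name); it
    does not reproduce the kernel's GFP flags or its fault-injection
    framework.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define ALLOC_NO_INJECT 0x1u  /* hypothetical flag: exempt from injection */

static bool inject_failures;  /* hypothetical switch for the injector */

/* Allocate `size` bytes; fail artificially unless the caller opted out. */
void *model_alloc(size_t size, unsigned int flags)
{
    assert(size > 0);
    if (inject_failures && !(flags & ALLOC_NO_INJECT))
        return NULL;  /* injected failure */
    return malloc(size);
}
```

    With the injector on, ordinary callers see failures and exercise their
    error paths, while the debugging tool's own bookkeeping allocations
    keep succeeding, so it stays enabled.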

    Link: http://lkml.kernel.org/r/20180109192243.19316-1-dvyukov@google.com
    Signed-off-by: Dmitry Vyukov
    Cc: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
     

15 Dec, 2017

1 commit

  • Commit bde5f6bc68db ("kmemleak: add scheduling point to
    kmemleak_scan()") tries to rate-limit the frequency of cond_resched()
    calls, but does it in a way which might incur an expensive division
    operation in the inner loop. Simplify this.
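
    One common way to avoid a division in such a loop is to test the loop
    index against a power-of-two mask. The sketch below assumes an interval
    of 64 items and a hypothetical function name; it counts the yield
    points rather than actually rescheduling.

```c
#include <assert.h>

#define YIELD_MASK 63UL  /* assumed interval of 64 items; must be 2^n - 1 */

/* Count how often a loop over [start, end) would reach its yield point. */
unsigned long count_yield_points(unsigned long start, unsigned long end)
{
    unsigned long yields = 0;

    assert(((YIELD_MASK + 1) & YIELD_MASK) == 0);  /* power-of-two interval */
    for (unsigned long i = start; i < end; i++) {
        if (!(i & YIELD_MASK))
            yields++;  /* a kernel loop would call cond_resched() here */
        /* ... per-item work would go here ... */
    }
    return yields;
}
```

    The mask test costs a single AND per iteration, whereas a modulo by a
    value the compiler cannot reduce to a shift may compile to a division.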

    Fixes: bde5f6bc68db5 ("kmemleak: add scheduling point to kmemleak_scan()")
    Suggested-by: Linus Torvalds
    Cc: Yisheng Xie
    Cc: Catalin Marinas
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

30 Nov, 2017

1 commit

  • kmemleak_scan() scans the struct page array of each node, which can be
    very large and result in a soft lockup. We have seen a soft lockup when
    running a scan while compiling a kernel:

    watchdog: BUG: soft lockup - CPU#53 stuck for 22s! [bash:10287]
    [...]
    Call Trace:
    kmemleak_scan+0x21a/0x4c0
    kmemleak_write+0x312/0x350
    full_proxy_write+0x5a/0xa0
    __vfs_write+0x33/0x150
    vfs_write+0xad/0x1a0
    SyS_write+0x52/0xc0
    do_syscall_64+0x61/0x1a0
    entry_SYSCALL64_slow_path+0x25/0x25

    Fix this by adding cond_resched every MAX_SCAN_SIZE.
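
    A userspace model of the pattern: process a large region in bounded
    chunks and give up the CPU after each one. The chunk size, the helper
    names and the byte-sum "scan" are assumptions made for illustration;
    yield_cpu() only counts calls where kernel code would call
    cond_resched().

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SIZE 4096  /* assumed value; stands in for MAX_SCAN_SIZE */

static size_t yield_calls;

/* Stand-in for cond_resched(): here it only counts invocations. */
static void yield_cpu(void)
{
    yield_calls++;
}

/* Sum `len` bytes in chunks of at most CHUNK_SIZE, yielding after each. */
unsigned long scan_in_chunks(const unsigned char *buf, size_t len)
{
    unsigned long sum = 0;

    assert(buf != NULL || len == 0);
    for (size_t off = 0; off < len; off += CHUNK_SIZE) {
        size_t end = (len - off < CHUNK_SIZE) ? len : off + CHUNK_SIZE;

        for (size_t i = off; i < end; i++)
            sum += buf[i];  /* stand-in for the real per-word scan */
        yield_cpu();
    }
    return sum;
}
```

    The time spent between two yield points is then bounded by the chunk
    size rather than by the size of the whole region.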

    Link: http://lkml.kernel.org/r/1511439788-20099-1-git-send-email-xieyisheng1@huawei.com
    Signed-off-by: Yisheng Xie
    Suggested-by: Catalin Marinas
    Acked-by: Catalin Marinas
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yisheng Xie
     

16 Nov, 2017

2 commits

  • Patch series "kmemcheck: kill kmemcheck", v2.

    As discussed at LSF/MM, kill kmemcheck.

    KASan is a replacement that is able to work without the limitation of
    kmemcheck (single CPU, slow). KASan is already upstream.

    We are also not aware of any users of kmemcheck (or users who don't
    consider KASan as a suitable replacement).

    The only objection was that since KASAN wasn't supported by all GCC
    versions provided by distros at that time we should hold off for 2
    years, and try again.

    Now that 2 years have passed, and all distros provide gcc that supports
    KASAN, kill kmemcheck again for the very same reasons.

    This patch (of 4):

    Remove kmemcheck annotations, and calls to kmemcheck from the kernel.

    [alexander.levin@verizon.com: correctly remove kmemcheck call from dma_map_sg_attrs]
    Link: http://lkml.kernel.org/r/20171012192151.26531-1-alexander.levin@verizon.com
    Link: http://lkml.kernel.org/r/20171007030159.22241-2-alexander.levin@verizon.com
    Signed-off-by: Sasha Levin
    Cc: Alexander Potapenko
    Cc: Eric W. Biederman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Steven Rostedt
    Cc: Tim Hansen
    Cc: Vegard Nossum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Levin, Alexander (Sasha Levin)
     
  • Kmemleak can be tweaked at runtime by writing commands into its debugfs
    file. Root can write to it anyway, but without the write bit set in the
    file mode this interface isn't obvious.

    Link: http://lkml.kernel.org/r/150728996582.744328.11541332857988399411.stgit@buzz
    Signed-off-by: Konstantin Khlebnikov
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

07 Jul, 2017

3 commits

  • Kmemleak requires that vmalloc'ed objects have a minimum reference count
    of 2: one in the corresponding vm_struct object and the other owned by
    the vmalloc() caller. There are cases, however, where the original
    vmalloc() returned pointer is lost and, instead, a pointer to vm_struct
    is stored (see free_thread_stack()). Kmemleak currently reports such
    objects as leaks.

    This patch adds support for treating any surplus references to an object
    as additional references to a specified object. It introduces the
    kmemleak_vmalloc() API function which takes a vm_struct pointer and sets
    its surplus reference passing to the actual vmalloc() returned pointer.
    The __vmalloc_node_range() calling site has been modified accordingly.
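
    The surplus-reference idea can be modelled as follows. This is a
    heavily simplified sketch with hypothetical names; it ignores locking,
    object lookup and the gray list, and only shows how a reference beyond
    min_count is credited to another object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a tracked object. */
struct tracked {
    int min_count;           /* references needed to count as reachable */
    int count;               /* references found during this scan */
    struct tracked *excess;  /* object that receives surplus references */
};

static int is_reachable(const struct tracked *obj)
{
    return obj->count >= obj->min_count;
}

/* Record one reference to `obj` found while scanning memory. */
void found_reference(struct tracked *obj)
{
    assert(obj != NULL);
    if (!is_reachable(obj)) {
        obj->count++;
        return;
    }
    /* Already reachable: pass the surplus reference on to the target. */
    if (obj->excess && !is_reachable(obj->excess))
        obj->excess->count++;
}
```

    In this model the vmalloc'ed area has min_count 2 and the vm_struct has
    min_count 1 with excess pointing at the area. A second pointer to the
    vm_struct then supplies the area's missing reference, so the area is no
    longer reported as a leak.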

    Link: http://lkml.kernel.org/r/1495726937-23557-4-git-send-email-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Reported-by: "Luis R. Rodriguez"
    Cc: Michal Hocko
    Cc: Andy Lutomirski
    Cc: "Luis R. Rodriguez"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     
  • scan_block() updates the number of references (pointers) to objects,
    adding them to the gray_list when object->min_count is reached. The
    patch factors out this functionality into a separate update_refs()
    function.
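
    A minimal model of the factored-out step, assuming a simplified object
    and a singly linked gray list. The names are hypothetical, the list
    order differs from the kernel's, and objects with a min_count of 0 are
    not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a tracked object. */
struct obj {
    int min_count;          /* references needed to count as reachable */
    int count;              /* references found so far */
    struct obj *next_gray;  /* link in the simplified gray list */
};

static struct obj *gray_head;  /* simplified gray list, newest first */

/* Count one more reference; queue the object once min_count is reached. */
void model_update_refs(struct obj *o)
{
    assert(o != NULL);
    if (o->count >= o->min_count)
        return;  /* already reachable, nothing more to record */
    o->count++;
    if (o->count >= o->min_count) {
        o->next_gray = gray_head;
        gray_head = o;
    }
}
```

    Keeping this in one helper means every place that discovers a reference
    applies the same counting and queueing rule.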

    Link: http://lkml.kernel.org/r/1495726937-23557-3-git-send-email-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Cc: Michal Hocko
    Cc: Andy Lutomirski
    Cc: "Luis R. Rodriguez"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     
  • Change the kmemleak_object.flags type to unsigned int and move the
    early_log.min_count (int) near early_log.op_type (int) to slightly
    reduce the size of these structures on 64-bit architectures.
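
    Why moving two int members next to each other helps can be seen in a
    small layout comparison. The structures below are hypothetical and are
    not the real kmemleak definitions; the sizes in the comments assume a
    typical LP64 ABI with 8-byte pointers.

```c
#include <assert.h>
#include <stddef.h>

/* An int on either side of a pointer: each int gets 4 bytes of padding. */
struct log_before {
    int op_type;
    const void *ptr;
    int min_count;
};  /* typically 24 bytes on LP64 */

/* Both ints adjacent: they share one 8-byte slot before the pointer. */
struct log_after {
    int op_type;
    int min_count;
    const void *ptr;
};  /* typically 16 bytes on LP64 */

/* Bytes saved per instance by the reordering on the current ABI. */
size_t bytes_saved(void)
{
    assert(sizeof(struct log_after) <= sizeof(struct log_before));
    return sizeof(struct log_before) - sizeof(struct log_after);
}
```

    On 32-bit ABIs where pointers are 4 bytes, both layouts usually have
    the same size, which matches the commit's note that the saving applies
    to 64-bit architectures.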

    Link: http://lkml.kernel.org/r/1495726937-23557-2-git-send-email-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Cc: Michal Hocko
    Cc: Andy Lutomirski
    Cc: "Luis R. Rodriguez"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     

01 Apr, 2017

1 commit

  • A section name for .data..ro_after_init was added by both:

    commit d07a980c1b8d ("s390: add proper __ro_after_init support")

    and

    commit d7c19b066dcf ("mm: kmemleak: scan .data.ro_after_init")

    The latter adds incorrect wrapping around the existing s390 section, and
    came later. I'd prefer the s390 naming, so this moves the s390-specific
    name up to the asm-generic/sections.h and renames the section as used by
    kmemleak (and in the future, kernel/extable.c).

    Link: http://lkml.kernel.org/r/20170327192213.GA129375@beast
    Signed-off-by: Kees Cook
    Acked-by: Heiko Carstens [s390 parts]
    Acked-by: Jakub Kicinski
    Cc: Eddie Kovsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     

02 Mar, 2017

1 commit