15 Sep, 2022

1 commit

  • This reverts commit 23c2d497de21f25898fbea70aeb292ab8acc8c94.

    Commit 23c2d497de21 ("mm: kmemleak: take a full lowmem check in
    kmemleak_*_phys()") brought false leak alarms on some archs like arm64
    that does not init pfn boundary in early booting. The final solution
    lands on linux-6.0: commit 0c24e061196c ("mm: kmemleak: add rbtree and
    store physical address for objects allocated with PA").

    Revert this commit before linux-6.0. The original issue of an invalid PA
    can be mitigated by an additional check in the devicetree code.

    The false alarm report is as follows (kmemleak output, QEMU/arm64):
    unreferenced object 0xffff0000c0170a00 (size 128):
    comm "swapper/0", pid 1, jiffies 4294892404 (age 126.208s)
    hex dump (first 32 bytes):
    62 61 73 65 00 00 00 00 00 00 00 00 00 00 00 00 base............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] __kmalloc_track_caller+0x1b0/0x2e4
    [] kstrdup_const+0x8c/0xc4
    [] kvasprintf_const+0xbc/0xec
    [] kobject_set_name_vargs+0x58/0xe4
    [] kobject_add+0x84/0x100
    [] __of_attach_node_sysfs+0x78/0xec
    [] of_core_init+0x68/0x104
    [] driver_init+0x28/0x48
    [] do_basic_setup+0x14/0x28
    [] kernel_init_freeable+0x110/0x178
    [] kernel_init+0x20/0x1a0
    [] ret_from_fork+0x10/0x20

    This patch is also applicable to linux-5.17.y/linux-5.18.y/linux-5.19.y

    Cc:
    Signed-off-by: Yee Lee
    Signed-off-by: Greg Kroah-Hartman

    Yee Lee
     

20 Apr, 2022

1 commit

  • commit 23c2d497de21f25898fbea70aeb292ab8acc8c94 upstream.

    The kmemleak_*_phys() APIs do not check the address against lowmem's min
    boundary, while the caller may pass an address below lowmem, which will
    trigger an oops:

    # echo scan > /sys/kernel/debug/kmemleak
    Unable to handle kernel paging request at virtual address ff5fffffffe00000
    Oops [#1]
    Modules linked in:
    CPU: 2 PID: 134 Comm: bash Not tainted 5.18.0-rc1-next-20220407 #33
    Hardware name: riscv-virtio,qemu (DT)
    epc : scan_block+0x74/0x15c
    ra : scan_block+0x72/0x15c
    epc : ffffffff801e5806 ra : ffffffff801e5804 sp : ff200000104abc30
    gp : ffffffff815cd4e8 tp : ff60000004cfa340 t0 : 0000000000000200
    t1 : 00aaaaaac23954cc t2 : 00000000000003ff s0 : ff200000104abc90
    s1 : ffffffff81b0ff28 a0 : 0000000000000000 a1 : ff5fffffffe01000
    a2 : ffffffff81b0ff28 a3 : 0000000000000002 a4 : 0000000000000001
    a5 : 0000000000000000 a6 : ff200000104abd7c a7 : 0000000000000005
    s2 : ff5fffffffe00ff9 s3 : ffffffff815cd998 s4 : ffffffff815d0e90
    s5 : ffffffff81b0ff28 s6 : 0000000000000020 s7 : ffffffff815d0eb0
    s8 : ffffffffffffffff s9 : ff5fffffffe00000 s10: ff5fffffffe01000
    s11: 0000000000000022 t3 : 00ffffffaa17db4c t4 : 000000000000000f
    t5 : 0000000000000001 t6 : 0000000000000000
    status: 0000000000000100 badaddr: ff5fffffffe00000 cause: 000000000000000d
    scan_gray_list+0x12e/0x1a6
    kmemleak_scan+0x2aa/0x57e
    kmemleak_write+0x32a/0x40c
    full_proxy_write+0x56/0x82
    vfs_write+0xa6/0x2a6
    ksys_write+0x6c/0xe2
    sys_write+0x22/0x2a
    ret_from_syscall+0x0/0x2

    The callers may not quite know the actual address they pass (e.g. from
    the devicetree). So the kmemleak_*_phys() APIs should guarantee that the
    address they finally use is in the lowmem range, hence the check against
    lowmem's min boundary.
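
    A minimal sketch of the added boundary check, assuming it lands in
    kmemleak_alloc_phys() and uses min_low_pfn/max_low_pfn as the lowmem
    bounds (exact hunks may differ):

    /* sketch: only register kmemleak objects for addresses inside lowmem */
    void __ref kmemleak_alloc_phys(phys_addr_t phys, size_t size, int min_count,
                                   gfp_t gfp)
    {
            if (PHYS_PFN(phys) >= min_low_pfn && PHYS_PFN(phys) < max_low_pfn)
                    kmemleak_alloc(__va(phys), size, min_count, gfp);
    }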

    Link: https://lkml.kernel.org/r/20220413122925.33856-1-patrick.wang.shcn@gmail.com
    Signed-off-by: Patrick Wang
    Acked-by: Catalin Marinas
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Patrick Wang
     

08 Apr, 2022

1 commit

  • commit bfc8089f00fa526dea983844c880fa8106c33ac4 upstream.

  • When we use HW tag-based KASAN and enable vmalloc support, we hit the
    following bug. It is due to a comparison between a tagged object and a
    non-tagged pointer.

    We need to reset the KASAN tag when we compare a tagged object with a
    non-tagged pointer.
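
    A minimal sketch of the idea, assuming the comparison sits in
    add_scan_area() (untagged_ptr/untagged_objp are illustrative locals; the
    exact hunk may differ):

    unsigned long untagged_ptr, untagged_objp;

    /* sketch: strip KASAN tags before comparing object and scan-area bounds */
    untagged_ptr  = (unsigned long)kasan_reset_tag((void *)ptr);
    untagged_objp = (unsigned long)kasan_reset_tag((void *)object->pointer);

    if (untagged_ptr + size > untagged_objp + object->size) {
            kmemleak_warn("Scan area larger than object 0x%08lx\n", ptr);
            ...
    }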

    kmemleak: [name:kmemleak&]Scan area larger than object 0xffffffe77076f440
    CPU: 4 PID: 1 Comm: init Tainted: G S W 5.15.25-android13-0-g5cacf919c2bc #1
    Hardware name: MT6983(ENG) (DT)
    Call trace:
    add_scan_area+0xc4/0x244
    kmemleak_scan_area+0x40/0x9c
    layout_and_allocate+0x1e8/0x288
    load_module+0x2c8/0xf00
    __se_sys_finit_module+0x190/0x1d0
    __arm64_sys_finit_module+0x20/0x30
    invoke_syscall+0x60/0x170
    el0_svc_common+0xc8/0x114
    do_el0_svc+0x28/0xa0
    el0_svc+0x60/0xf8
    el0t_64_sync_handler+0x88/0xec
    el0t_64_sync+0x1b4/0x1b8
    kmemleak: [name:kmemleak&]Object 0xf5ffffe77076b000 (size 32768):
    kmemleak: [name:kmemleak&] comm "init", pid 1, jiffies 4294894197
    kmemleak: [name:kmemleak&] min_count = 0
    kmemleak: [name:kmemleak&] count = 0
    kmemleak: [name:kmemleak&] flags = 0x1
    kmemleak: [name:kmemleak&] checksum = 0
    kmemleak: [name:kmemleak&] backtrace:
    module_alloc+0x9c/0x120
    move_module+0x34/0x19c
    layout_and_allocate+0x1c4/0x288
    load_module+0x2c8/0xf00
    __se_sys_finit_module+0x190/0x1d0
    __arm64_sys_finit_module+0x20/0x30
    invoke_syscall+0x60/0x170
    el0_svc_common+0xc8/0x114
    do_el0_svc+0x28/0xa0
    el0_svc+0x60/0xf8
    el0t_64_sync_handler+0x88/0xec
    el0t_64_sync+0x1b4/0x1b8

    Link: https://lkml.kernel.org/r/20220318034051.30687-1-Kuan-Ying.Lee@mediatek.com
    Signed-off-by: Kuan-Ying Lee
    Reviewed-by: Catalin Marinas
    Cc: Matthias Brugger
    Cc: Chinwen Chang
    Cc: Nicholas Tang
    Cc: Yee Lee
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Kuan-Ying Lee
     

09 Feb, 2022

1 commit

  • commit c10a0f877fe007021d70f9cada240f42adc2b5db upstream.

  • When using devm_request_free_mem_region() and devm_memremap_pages() to
    add ZONE_DEVICE memory, if the requested free mem region's end pfn is
    huge (e.g., 0x400000000), node_end_pfn() will also be huge (see
    move_pfn_range_to_zone()). Thus it creates a huge hole between
    node_start_pfn() and node_end_pfn().

    We found that on some AMD APUs, amdkfd requested such a free mem region
    and created a huge hole. In such a case, the following code snippet was
    just doing busy test_bit() looping on the huge hole.

    for (pfn = start_pfn; pfn < end_pfn; pfn++) {
            struct page *page = pfn_to_online_page(pfn);

            if (!page)
                    continue;
            ...
    }

    So we got a soft lockup:

    watchdog: BUG: soft lockup - CPU#6 stuck for 26s! [bash:1221]
    CPU: 6 PID: 1221 Comm: bash Not tainted 5.15.0-custom #1
    RIP: 0010:pfn_to_online_page+0x5/0xd0
    Call Trace:
    ? kmemleak_scan+0x16a/0x440
    kmemleak_write+0x306/0x3a0
    ? common_file_perm+0x72/0x170
    full_proxy_write+0x5c/0x90
    vfs_write+0xb9/0x260
    ksys_write+0x67/0xe0
    __x64_sys_write+0x1a/0x20
    do_syscall_64+0x3b/0xc0
    entry_SYSCALL_64_after_hwframe+0x44/0xae
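
    Per the commit title, the fix avoids scanning such holes by walking
    per-zone pfn ranges rather than the node-wide range; a minimal sketch of
    that loop shape, assuming the usual zone iterators (exact code may
    differ):

    /* sketch: iterate per-zone pfn ranges so node-wide holes are never walked */
    for_each_populated_zone(zone) {
            unsigned long start_pfn = zone->zone_start_pfn;
            unsigned long end_pfn = zone_end_pfn(zone);

            for (pfn = start_pfn; pfn < end_pfn; pfn++) {
                    struct page *page = pfn_to_online_page(pfn);

                    if (!page)
                            continue;
                    /* only scan pages belonging to this zone */
                    if (page_zone(page) != zone)
                            continue;
                    ...
            }
    }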

    I did some tests with the patch.

    (1) amdgpu module unloaded

    before the patch:

    real 0m0.976s
    user 0m0.000s
    sys 0m0.968s

    after the patch:

    real 0m0.981s
    user 0m0.000s
    sys 0m0.973s

    (2) amdgpu module loaded

    before the patch:

    real 0m35.365s
    user 0m0.000s
    sys 0m35.354s

    after the patch:

    real 0m1.049s
    user 0m0.000s
    sys 0m1.042s

    Link: https://lkml.kernel.org/r/20211108140029.721144-1-lang.yu@amd.com
    Signed-off-by: Lang Yu
    Acked-by: David Hildenbrand
    Acked-by: Catalin Marinas
    Cc: Oscar Salvador
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Lang Yu
     

09 Sep, 2021

2 commits

  • In a memory pressure situation, I'm seeing the lockdep WARNING below.
    Actually, this is similar to a known false positive which is already
    addressed by commit 6dcde60efd94 ("xfs: more lockdep whackamole with
    kmem_alloc*").

    This warning still persists because it's not from kmalloc() itself but
    from an allocation for the kmemleak object. While kmalloc() itself
    suppresses the warning with __GFP_NOLOCKDEP, gfp_kmemleak_mask() is
    dropping the flag for kmemleak's allocation.

    Allow __GFP_NOLOCKDEP to be passed to kmemleak's allocation, so that the
    warning for it is also suppressed.
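
    A minimal sketch of the change, assuming the flag simply needs to
    survive gfp_kmemleak_mask() (the mask contents shown are the usual ones;
    the exact definition may differ):

    /* sketch: let the caller's __GFP_NOLOCKDEP reach the metadata allocation */
    #define gfp_kmemleak_mask(gfp)  (((gfp) & (GFP_KERNEL | GFP_ATOMIC | \
                                               __GFP_NOLOCKDEP)) |       \
                                     __GFP_NORETRY | __GFP_NOMEMALLOC |  \
                                     __GFP_NOWARN)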

    ======================================================
    WARNING: possible circular locking dependency detected
    5.14.0-rc7-BTRFS-ZNS+ #37 Not tainted
    ------------------------------------------------------
    kswapd0/288 is trying to acquire lock:
    ffff88825ab45df0 (&xfs_nondir_ilock_class){++++}-{3:3}, at: xfs_ilock+0x8a/0x250

    but task is already holding lock:
    ffffffff848cc1e0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (fs_reclaim){+.+.}-{0:0}:
    fs_reclaim_acquire+0x112/0x160
    kmem_cache_alloc+0x48/0x400
    create_object.isra.0+0x42/0xb10
    kmemleak_alloc+0x48/0x80
    __kmalloc+0x228/0x440
    kmem_alloc+0xd3/0x2b0
    kmem_alloc_large+0x5a/0x1c0
    xfs_attr_copy_value+0x112/0x190
    xfs_attr_shortform_getvalue+0x1fc/0x300
    xfs_attr_get_ilocked+0x125/0x170
    xfs_attr_get+0x329/0x450
    xfs_get_acl+0x18d/0x430
    get_acl.part.0+0xb6/0x1e0
    posix_acl_xattr_get+0x13a/0x230
    vfs_getxattr+0x21d/0x270
    getxattr+0x126/0x310
    __x64_sys_fgetxattr+0x1a6/0x2a0
    do_syscall_64+0x3b/0x90
    entry_SYSCALL_64_after_hwframe+0x44/0xae

    -> #0 (&xfs_nondir_ilock_class){++++}-{3:3}:
    __lock_acquire+0x2c0f/0x5a00
    lock_acquire+0x1a1/0x4b0
    down_read_nested+0x50/0x90
    xfs_ilock+0x8a/0x250
    xfs_can_free_eofblocks+0x34f/0x570
    xfs_inactive+0x411/0x520
    xfs_fs_destroy_inode+0x2c8/0x710
    destroy_inode+0xc5/0x1a0
    evict+0x444/0x620
    dispose_list+0xfe/0x1c0
    prune_icache_sb+0xdc/0x160
    super_cache_scan+0x31e/0x510
    do_shrink_slab+0x337/0x8e0
    shrink_slab+0x362/0x5c0
    shrink_node+0x7a7/0x1a40
    balance_pgdat+0x64e/0xfe0
    kswapd+0x590/0xa80
    kthread+0x38c/0x460
    ret_from_fork+0x22/0x30

    other info that might help us debug this:
    Possible unsafe locking scenario:
    CPU0 CPU1
    ---- ----
    lock(fs_reclaim);
    lock(&xfs_nondir_ilock_class);
    lock(fs_reclaim);
    lock(&xfs_nondir_ilock_class);

    *** DEADLOCK ***
    3 locks held by kswapd0/288:
    #0: ffffffff848cc1e0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30
    #1: ffffffff848a08d8 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x269/0x5c0
    #2: ffff8881a7a820e8 (&type->s_umount_key#60){++++}-{3:3}, at: super_cache_scan+0x5a/0x510

    Link: https://lkml.kernel.org/r/20210907055659.3182992-1-naohiro.aota@wdc.com
    Signed-off-by: Naohiro Aota
    Acked-by: Catalin Marinas
    Cc: "Darrick J . Wong"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naohiro Aota
     
  • Replace the obsolete and ambiguous macro in_irq() with the new macro
    in_hardirq().

    Link: https://lkml.kernel.org/r/20210813145245.86070-1-changbin.du@gmail.com
    Signed-off-by: Changbin Du
    Acked-by: Catalin Marinas [kmemleak]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Changbin Du
     

14 Aug, 2021

1 commit

  • Patch series "kasan, slub: reset tag when printing address", v3.

    With hardware tag-based KASAN enabled, we reset the tag when we access
    metadata to avoid false alarms.

    This patch (of 2):

    Kmemleak needs to scan kernel memory to check for memory leaks. With
    hardware tag-based KASAN enabled, when it scans an invalid slab object
    and dereferences it, the issue below occurs.

    Hardware tag-based KASAN doesn't use compiler instrumentation, so we
    cannot use kasan_disable_current() to ignore the tag check.

    Based on the below report, there are 11 0xf7 granules, which amounts to
    176 bytes, and the object is allocated from the kmalloc-256 cache. So
    when kmemleak accesses the last 256-176 bytes, it causes faults, as those
    are marked with KASAN_KMALLOC_REDZONE == KASAN_TAG_INVALID == 0xfe.

    Thus, we reset tags before accessing metadata to avoid false positives.
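
    A minimal sketch of the reset in the scan path, assuming it lands in
    scan_block() where candidate pointers are loaded (exact hunks may
    differ):

    /* sketch: drop the tag before dereferencing a possibly-tagged address */
    for (ptr = start; ptr < end; ptr++) {
            unsigned long pointer;

            ...
            pointer = *(unsigned long *)kasan_reset_tag((void *)ptr);
            ...
    }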

    BUG: KASAN: out-of-bounds in scan_block+0x58/0x170
    Read at addr f7ff0000c0074eb0 by task kmemleak/138
    Pointer tag: [f7], memory tag: [fe]

    CPU: 7 PID: 138 Comm: kmemleak Not tainted 5.14.0-rc2-00001-g8cae8cd89f05-dirty #134
    Hardware name: linux,dummy-virt (DT)
    Call trace:
    dump_backtrace+0x0/0x1b0
    show_stack+0x1c/0x30
    dump_stack_lvl+0x68/0x84
    print_address_description+0x7c/0x2b4
    kasan_report+0x138/0x38c
    __do_kernel_fault+0x190/0x1c4
    do_tag_check_fault+0x78/0x90
    do_mem_abort+0x44/0xb4
    el1_abort+0x40/0x60
    el1h_64_sync_handler+0xb4/0xd0
    el1h_64_sync+0x78/0x7c
    scan_block+0x58/0x170
    scan_gray_list+0xdc/0x1a0
    kmemleak_scan+0x2ac/0x560
    kmemleak_scan_thread+0xb0/0xe0
    kthread+0x154/0x160
    ret_from_fork+0x10/0x18

    Allocated by task 0:
    kasan_save_stack+0x2c/0x60
    __kasan_kmalloc+0xec/0x104
    __kmalloc+0x224/0x3c4
    __register_sysctl_paths+0x200/0x290
    register_sysctl_table+0x2c/0x40
    sysctl_init+0x20/0x34
    proc_sys_init+0x3c/0x48
    proc_root_init+0x80/0x9c
    start_kernel+0x648/0x6a4
    __primary_switched+0xc0/0xc8

    Freed by task 0:
    kasan_save_stack+0x2c/0x60
    kasan_set_track+0x2c/0x40
    kasan_set_free_info+0x44/0x54
    ____kasan_slab_free.constprop.0+0x150/0x1b0
    __kasan_slab_free+0x14/0x20
    slab_free_freelist_hook+0xa4/0x1fc
    kfree+0x1e8/0x30c
    put_fs_context+0x124/0x220
    vfs_kern_mount.part.0+0x60/0xd4
    kern_mount+0x24/0x4c
    bdev_cache_init+0x70/0x9c
    vfs_caches_init+0xdc/0xf4
    start_kernel+0x638/0x6a4
    __primary_switched+0xc0/0xc8

    The buggy address belongs to the object at ffff0000c0074e00
    which belongs to the cache kmalloc-256 of size 256
    The buggy address is located 176 bytes inside of
    256-byte region [ffff0000c0074e00, ffff0000c0074f00)
    The buggy address belongs to the page:
    page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x100074
    head:(____ptrval____) order:2 compound_mapcount:0 compound_pincount:0
    flags: 0xbfffc0000010200(slab|head|node=0|zone=2|lastcpupid=0xffff|kasantag=0x0)
    raw: 0bfffc0000010200 0000000000000000 dead000000000122 f5ff0000c0002300
    raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff0000c0074c00: f0 f0 f0 f0 f0 f0 f0 f0 f0 fe fe fe fe fe fe fe
    ffff0000c0074d00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
    >ffff0000c0074e00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 fe fe fe fe fe
    ^
    ffff0000c0074f00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
    ffff0000c0075000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ==================================================================
    Disabling lock debugging due to kernel taint
    kmemleak: 181 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

    Link: https://lkml.kernel.org/r/20210804090957.12393-1-Kuan-Ying.Lee@mediatek.com
    Link: https://lkml.kernel.org/r/20210804090957.12393-2-Kuan-Ying.Lee@mediatek.com
    Signed-off-by: Kuan-Ying Lee
    Acked-by: Catalin Marinas
    Reviewed-by: Andrey Konovalov
    Cc: Marco Elver
    Cc: Nicholas Tang
    Cc: Andrey Ryabinin
    Cc: Alexander Potapenko
    Cc: Chinwen Chang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kuan-Ying Lee
     

30 Jun, 2021

1 commit

  • This commit contains 3 modifications:

    1. Convert the type of jiffies_scan_wait to "unsigned long".

    2. Use READ/WRITE_ONCE() for accessing "jiffies_scan_wait".

    3. Fix the possible wrong memory scanning period. If you set a large
    memory scanning period like below, then the "secs" variable will be
    non-zero; however, the value of "jiffies_scan_wait" will be zero.

    echo "scan=0x10000000" > /sys/kernel/debug/kmemleak

    This is because the type of msecs_to_jiffies()'s parameter is "unsigned
    int", and "secs * 1000" is larger than its max value. This in turn
    leads to an unexpected jiffies_scan_wait, possibly zero. We correct it
    by replacing kstrtoul() with kstrtouint(), and check the msecs value to
    prevent it from exceeding UINT_MAX.
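
    A minimal sketch of the described approach (illustrative only; the
    surrounding kmemleak_write() handling may differ):

    /* sketch: parse as unsigned int, clamp msecs, publish with WRITE_ONCE() */
    unsigned int secs;
    unsigned long msecs;

    ret = kstrtouint(buf + 5, 0, &secs);
    if (ret < 0)
            goto out;

    msecs = secs * MSEC_PER_SEC;
    if (msecs > UINT_MAX)
            msecs = UINT_MAX;

    stop_scan_thread();
    if (msecs) {
            WRITE_ONCE(jiffies_scan_wait, msecs_to_jiffies(msecs));
            start_scan_thread();
    }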

    Link: https://lkml.kernel.org/r/20210613174022.23044-1-yanfei.xu@windriver.com
    Signed-off-by: Yanfei Xu
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yanfei Xu
     

01 May, 2021

1 commit

  • s/interruptable/interruptible/

    Link: https://lkml.kernel.org/r/20210319214140.23304-1-unixbhaskar@gmail.com
    Signed-off-by: Bhaskar Chowdhury
    Acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bhaskar Chowdhury
     

26 Mar, 2021

1 commit

  • Because memblock allocations are registered with kmemleak, the KFENCE
    pool was seen by kmemleak as one large object. Later allocations
    through kfence_alloc() that were registered with kmemleak via
    slab_post_alloc_hook() would then overlap and trigger a warning.
    Therefore, once the pool is initialized, we can remove (free) it from
    kmemleak again, since it should be treated as allocator-internal and be
    seen as "free memory".

    The second problem is that kmemleak is passed the rounded size, and not
    the originally requested size, which is also the size of KFENCE objects.
    To avoid kmemleak scanning past the end of an object and triggering a
    KFENCE out-of-bounds error, fix the size if it is a KFENCE object.

    For simplicity, to avoid a call to kfence_ksize() in
    slab_post_alloc_hook() (and avoid new IS_ENABLED(CONFIG_DEBUG_KMEMLEAK)
    guard), just call kfence_ksize() in mm/kmemleak.c:create_object().
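
    A minimal sketch of both pieces, assuming the usual kfence/kmemleak
    hooks (exact call sites may differ):

    /* sketch: in mm/kfence/core.c, drop the pool object from kmemleak once
     * the pool is initialised so per-object registrations don't overlap it */
    kmemleak_free(__kfence_pool);

    /* sketch: in mm/kmemleak.c:create_object(), record the real KFENCE
     * object size instead of the rounded allocation size */
    object->size = kfence_ksize((void *)ptr) ?: size;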

    Link: https://lkml.kernel.org/r/20210317084740.3099921-1-elver@google.com
    Signed-off-by: Marco Elver
    Reported-by: Luis Henriques
    Reviewed-by: Catalin Marinas
    Tested-by: Luis Henriques
    Cc: Alexander Potapenko
    Cc: Dmitry Vyukov
    Cc: Andrey Konovalov
    Cc: Jann Horn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marco Elver
     

14 Oct, 2020

1 commit

  • kmemleak_scan() currently relies on the big tasklist_lock hammer to
    stabilize iterating through the tasklist. Instead, this patch proposes
    simply using rcu along with the rcu-safe for_each_process_thread flavor
    (without changing scan semantics), which doesn't make use of
    next_thread/p->thread_group and thus cannot race with exit. Furthermore,
    any races with fork() and not seeing the new child should be benign as
    it's not running yet and can also be detected by the next scan.
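
    A minimal sketch of the resulting loop in kmemleak_scan(), assuming
    try_get_task_stack()/put_task_stack() are kept as before:

    /* sketch: RCU-protected thread walk instead of read_lock(&tasklist_lock) */
    rcu_read_lock();
    for_each_process_thread(g, p) {
            void *stack = try_get_task_stack(p);

            if (stack) {
                    scan_block(stack, stack + THREAD_SIZE, NULL);
                    put_task_stack(p);
            }
    }
    rcu_read_unlock();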

    Avoiding the tasklist_lock could prove beneficial for performance
    considering the scan operation is done periodically. I have seen
    improvements of 30%-ish when doing similar replacements on very
    pathological microbenchmarks (ie stressing get/setpriority(2)).

    However my main motivation is that it's one less user of the global
    lock, something that Linus has long wanted to see gone eventually
    (if ever), even if the traditional fairness issues have been dealt with
    now by qrwlocks. Of course this is a very long way off. This patch
    also kills another user of the deprecated tsk->thread_group.

    Signed-off-by: Davidlohr Bueso
    Signed-off-by: Andrew Morton
    Reviewed-by: Qian Cai
    Acked-by: Catalin Marinas
    Acked-by: Oleg Nesterov
    Link: https://lkml.kernel.org/r/20200820203902.11308-1-dave@stgolabs.net
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     

15 Aug, 2020

1 commit

  • Even if KCSAN is disabled for kmemleak, update_checksum() could still call
    crc32() (which is outside of kmemleak.c) to dereference object->pointer.
    Thus, the value of object->pointer could be accessed concurrently as
    noticed by KCSAN,

    BUG: KCSAN: data-race in crc32_le_base / do_raw_spin_lock

    write to 0xffffb0ea683a7d50 of 4 bytes by task 23575 on cpu 12:
    do_raw_spin_lock+0x114/0x200
    debug_spin_lock_after at kernel/locking/spinlock_debug.c:91
    (inlined by) do_raw_spin_lock at kernel/locking/spinlock_debug.c:115
    _raw_spin_lock+0x40/0x50
    __handle_mm_fault+0xa9e/0xd00
    handle_mm_fault+0xfc/0x2f0
    do_page_fault+0x263/0x6f9
    page_fault+0x34/0x40

    read to 0xffffb0ea683a7d50 of 4 bytes by task 839 on cpu 60:
    crc32_le_base+0x67/0x350
    crc32_le_base+0x67/0x350:
    crc32_body at lib/crc32.c:106
    (inlined by) crc32_le_generic at lib/crc32.c:179
    (inlined by) crc32_le at lib/crc32.c:197
    kmemleak_scan+0x528/0xd90
    update_checksum at mm/kmemleak.c:1172
    (inlined by) kmemleak_scan at mm/kmemleak.c:1497
    kmemleak_scan_thread+0xcc/0xfa
    kthread+0x1e0/0x200
    ret_from_fork+0x27/0x50

    If a torn value is returned due to a data race, it will be corrected in
    the next scan. Thus, let KCSAN ignore all reads in the region to
    silence the warning in case the write side is non-atomic.
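
    A minimal sketch of how the region can be excluded, assuming
    kcsan_disable_current()/kcsan_enable_current() wrap the checksum update
    (the exact placement may differ):

    /* sketch: tell KCSAN to ignore the racy read of *object->pointer */
    static bool update_checksum(struct kmemleak_object *object)
    {
            u32 old_csum = object->checksum;

            kasan_disable_current();
            kcsan_disable_current();
            object->checksum = crc32(0, (void *)object->pointer, object->size);
            kasan_enable_current();
            kcsan_enable_current();

            return object->checksum != old_csum;
    }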

    Suggested-by: Marco Elver
    Signed-off-by: Qian Cai
    Signed-off-by: Andrew Morton
    Acked-by: Marco Elver
    Acked-by: Catalin Marinas
    Link: http://lkml.kernel.org/r/20200317182754.2180-1-cai@lca.pw
    Signed-off-by: Linus Torvalds

    Qian Cai
     

03 Apr, 2020

1 commit

  • Clang warns:

    mm/kmemleak.c:1955:28: warning: array comparison always evaluates to a constant [-Wtautological-compare]
    if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
    ^
    mm/kmemleak.c:1955:60: warning: array comparison always evaluates to a constant [-Wtautological-compare]
    if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)

    These are not true arrays, they are linker defined symbols, which are just
    addresses. Using the address of operator silences the warning and does
    not change the resulting assembly with either clang/ld.lld or gcc/ld
    (tested with diff + objdump -Dr).
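
    A minimal sketch of the silenced comparison (these are linker-defined
    symbols, so taking their addresses is equivalent):

    /* sketch: take the symbols' addresses explicitly to silence clang */
    if (&__start_ro_after_init < &_sdata || &__end_ro_after_init > &_edata)
            create_object((unsigned long)__start_ro_after_init,
                          __end_ro_after_init - __start_ro_after_init,
                          KMEMLEAK_GREY, GFP_ATOMIC);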

    Suggested-by: Nick Desaulniers
    Signed-off-by: Nathan Chancellor
    Signed-off-by: Andrew Morton
    Acked-by: Catalin Marinas
    Link: https://github.com/ClangBuiltLinux/linux/issues/895
    Link: http://lkml.kernel.org/r/20200220051551.44000-1-natechancellor@gmail.com
    Signed-off-by: Linus Torvalds

    Nathan Chancellor
     

01 Feb, 2020

1 commit

  • kmemleak_lock as a rwlock on RT can possibly be acquired in atomic
    context, which does not work.

    Since the kmemleak operation is performed in atomic context, make it a
    raw_spinlock_t so it can also be acquired on RT. This is used for
    debugging and is not enabled by default in a production-like environment
    (where performance/latency matters), so it makes sense to make it a
    raw_spinlock_t instead of trying to get rid of the atomic context. Also
    turn the kmemleak_object->lock into a raw_spinlock_t, which is acquired
    (nested) while the kmemleak_lock is held.

    The time spent in "echo scan > kmemleak" slightly improved on a 64-core
    box with this patch applied after boot.
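
    A minimal sketch of the conversion (lock call sites change from spin_*
    to raw_spin_* accordingly; exact hunks may differ):

    /* sketch: both locks become raw so they keep spinning on PREEMPT_RT */
    static DEFINE_RAW_SPINLOCK(kmemleak_lock);

    struct kmemleak_object {
            raw_spinlock_t lock;
            ...
    };

    raw_spin_lock_irqsave(&kmemleak_lock, flags);
    ...
    raw_spin_unlock_irqrestore(&kmemleak_lock, flags);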

    [bigeasy@linutronix.de: redo the description, update comments. Merge the individual bits: He Zhe did the kmemleak_lock, Liu Haitao the ->lock and Yongxin Liu forwarded Liu's patch.]
    Link: http://lkml.kernel.org/r/20191219170834.4tah3prf2gdothz4@linutronix.de
    Link: https://lkml.kernel.org/r/20181218150744.GB20197@arrakis.emea.arm.com
    Link: https://lkml.kernel.org/r/1542877459-144382-1-git-send-email-zhe.he@windriver.com
    Link: https://lkml.kernel.org/r/20190927082230.34152-1-yongxin.liu@windriver.com
    Signed-off-by: He Zhe
    Signed-off-by: Liu Haitao
    Signed-off-by: Yongxin Liu
    Signed-off-by: Sebastian Andrzej Siewior
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    He Zhe
     

14 Oct, 2019

1 commit

  • In case of an error (e.g. memory pool too small), kmemleak disables
    itself and cleans up the already allocated metadata objects. However, if
    this happens early before the RCU callback mechanism is available,
    put_object() skips call_rcu() and frees the object directly. This is not
    safe with the RCU list traversal in __kmemleak_do_cleanup().

    Change the list traversal in __kmemleak_do_cleanup() to
    list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since
    kmemleak is already disabled at this point. In addition, avoid an
    unnecessary metadata object rb-tree look-up since it already has the
    struct kmemleak_object pointer.
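
    A minimal sketch of the reworked cleanup, assuming a helper that unlinks
    the metadata from the rb-tree and list directly (__remove_object below
    is that illustrative helper):

    /* sketch: safe traversal, no RCU needed since kmemleak is already disabled */
    static void __kmemleak_do_cleanup(void)
    {
            struct kmemleak_object *object, *tmp;

            list_for_each_entry_safe(object, tmp, &object_list, object_list) {
                    __remove_object(object);   /* unlink rb-tree + list directly */
                    __delete_object(object);
            }
    }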

    Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
    Reported-by: Alexey Kardashevskiy
    Reported-by: Marc Dionne
    Reported-by: Ted Ts'o
    Cc: Andrew Morton
    Signed-off-by: Catalin Marinas
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     

25 Sep, 2019

4 commits

  • The only way to obtain the current memory pool size for a running kernel
    is to check the kernel config file which is inconvenient. Record it in
    the kernel messages.

    [akpm@linux-foundation.org: s/memory pool size/memory pool/available/, per Catalin]
    Link: http://lkml.kernel.org/r/1565809631-28933-1-git-send-email-cai@lca.pw
    Signed-off-by: Qian Cai
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Qian Cai
     
  • Currently kmemleak uses a static early_log buffer to trace all memory
    allocation/freeing before the slab allocator is initialised. Such early
    log is replayed during kmemleak_init() to properly initialise the kmemleak
    metadata for objects allocated up that point. With a memory pool that
    does not rely on the slab allocator, it is possible to skip this early log
    entirely.

    In order to remove the early logging, consider kmemleak_enabled == 1 by
    default while the kmem_cache availability is checked directly on the
    object_cache and scan_area_cache variables. The RCU callback is only
    invoked after object_cache has been initialised as we wouldn't have any
    concurrent list traversal before this.

    In order to reduce the number of callbacks before kmemleak is fully
    initialised, move the kmemleak_init() call to mm_init().

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: remove WARN_ON(), per Catalin]
    Link: http://lkml.kernel.org/r/20190812160642.52134-4-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     
  • Add a memory pool for struct kmemleak_object in case the normal
    kmem_cache_alloc() fails under the gfp constraints passed by the caller.
    The mem_pool[] array size is currently fixed at 16000.

    We are not using the existing mempool kernel API since this requires
    the slab allocator to be available (for pool->elements allocation). A
    subsequent kmemleak patch will replace the static early log buffer with
    the pool allocation introduced here and this functionality is required
    to be available before the slab was initialised.
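
    A minimal sketch of the fallback path, assuming a fixed array plus a
    free list reusing the object_list linkage (simplified; locking and the
    exact bookkeeping differ):

    /* sketch: static pool used when kmem_cache_alloc() is unavailable or fails */
    static struct kmemleak_object mem_pool[CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE];
    static LIST_HEAD(mem_pool_free_list);
    static int mem_pool_free_count = ARRAY_SIZE(mem_pool);

    static struct kmemleak_object *mem_pool_alloc(gfp_t gfp)
    {
            struct kmemleak_object *object;

            /* try the slab allocator first */
            if (object_cache) {
                    object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
                    if (object)
                            return object;
            }

            /* slab allocation failed, try the memory pool (locking elided) */
            object = list_first_entry_or_null(&mem_pool_free_list,
                                              typeof(*object), object_list);
            if (object)
                    list_del(&object->object_list);
            else if (mem_pool_free_count)
                    object = &mem_pool[--mem_pool_free_count];
            return object;
    }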

    Link: http://lkml.kernel.org/r/20190812160642.52134-3-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     
  • Patch series "mm: kmemleak: Use a memory pool for kmemleak object
    allocations", v3.

    Following the discussions on v2 of this patch(set) [1], this series takes
    slightly different approach:

    - it implements its own simple memory pool that does not rely on the
    slab allocator

    - drops the early log buffer logic entirely since it can now allocate
    metadata from the memory pool directly before kmemleak is fully
    initialised

    - CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
    CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE

    - moves the kmemleak_init() call earlier (mm_init())

    - to avoid a separate memory pool for struct scan_area, it makes the
    tool robust when such allocations fail as scan areas are rather an
    optimisation

    [1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

    This patch (of 3):

    Object scan areas are an optimisation aimed to decrease the false
    positives and slightly improve the scanning time of large objects known to
    only have a few specific pointers. If a struct scan_area fails to
    allocate, kmemleak can still function normally by scanning the full
    object.

    Introduce an OBJECT_FULL_SCAN flag and mark objects as such when scan_area
    allocation fails.
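
    A minimal sketch of the fallback, assuming the flag is set where the
    scan-area allocation fails and checked later in the scan path (exact
    hunks may differ):

    /* sketch: degrade gracefully when a scan area cannot be allocated */
    area = kmem_cache_alloc(scan_area_cache, gfp_kmemleak_mask(gfp));
    if (!area) {
            pr_warn_once("Cannot allocate a scan area, scanning the full object\n");
            /* mark the object for full scanning to avoid false positives */
            object->flags |= OBJECT_FULL_SCAN;
            goto out_unlock;
    }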

    Link: http://lkml.kernel.org/r/20190812160642.52134-2-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Cc: Michal Hocko
    Cc: Matthew Wilcox
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     

14 Aug, 2019

1 commit

  • If an error occurs during kmemleak_init() (e.g. kmem cache cannot be
    created), kmemleak is disabled but kmemleak_early_log remains enabled.
    Subsequently, when the .init.text section is freed, the log_early()
    function no longer exists. To avoid a page fault in such a scenario,
    ensure that kmemleak_disable() also disables early logging.

    Link: http://lkml.kernel.org/r/20190731152302.42073-1-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Reported-by: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     

03 Aug, 2019

1 commit

  • When running ltp's oom test with kmemleak enabled, the warning below was
    triggered since the kernel detects that __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM
    is passed in:

    WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
    Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata
    CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
    RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
    ...
    kmemleak_alloc+0x4e/0xb0
    kmem_cache_alloc+0x2a7/0x3e0
    mempool_alloc_slab+0x2d/0x40
    mempool_alloc+0x118/0x2b0
    bio_alloc_bioset+0x19d/0x350
    get_swap_bio+0x80/0x230
    __swap_writepage+0x5ff/0xb20

    The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak
    has __GFP_NOFAIL set all the time due to d9570ee3bd1d4f2 ("kmemleak:
    allow to coexist with fault injection"). But, it doesn't make any sense
    to have __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same
    time.

    According to the discussion on the mailing list, the commit should be
    reverted as a short-term solution. Catalin Marinas would follow up with
    a better solution for the longer term.

    The failure rate of kmemleak metadata allocation may increase in some
    circumstances, but this should be an expected side effect.

    Link: http://lkml.kernel.org/r/1563299431-111710-1-git-send-email-yang.shi@linux.alibaba.com
    Fixes: d9570ee3bd1d4f2 ("kmemleak: allow to coexist with fault injection")
    Signed-off-by: Yang Shi
    Suggested-by: Catalin Marinas
    Acked-by: Michal Hocko
    Cc: Dmitry Vyukov
    Cc: David Rientjes
    Cc: Matthew Wilcox
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Shi
     

13 Jul, 2019

3 commits

  • Pull driver core and debugfs updates from Greg KH:
    "Here is the "big" driver core and debugfs changes for 5.3-rc1

    It's a lot of different patches, all across the tree due to some api
    changes and lots of debugfs cleanups.

    Other than the debugfs cleanups, in this set of changes we have:

    - bus iteration function cleanups

    - scripts/get_abi.pl tool to display and parse Documentation/ABI
    entries in a simple way

    - cleanups to Documentation/ABI/ entries to make them parse more
    easily, fixing typos and other minor things

    - default_attrs use for some ktype users

    - driver model documentation file conversions to .rst

    - compressed firmware file loading

    - deferred probe fixes

    All of these have been in linux-next for a while, with a bunch of
    merge issues that Stephen has been patient with me for"

    * tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
    debugfs: make error message a bit more verbose
    orangefs: fix build warning from debugfs cleanup patch
    ubifs: fix build warning after debugfs cleanup patch
    driver: core: Allow subsystems to continue deferring probe
    drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
    arch_topology: Remove error messages on out-of-memory conditions
    lib: notifier-error-inject: no need to check return value of debugfs_create functions
    swiotlb: no need to check return value of debugfs_create functions
    ceph: no need to check return value of debugfs_create functions
    sunrpc: no need to check return value of debugfs_create functions
    ubifs: no need to check return value of debugfs_create functions
    orangefs: no need to check return value of debugfs_create functions
    nfsd: no need to check return value of debugfs_create functions
    lib: 842: no need to check return value of debugfs_create functions
    debugfs: provide pr_fmt() macro
    debugfs: log errors when something goes wrong
    drivers: s390/cio: Fix compilation warning about const qualifiers
    drivers: Add generic helper to match by of_node
    driver_find_device: Unify the match function with class_find_device()
    bus_find_device: Unify the match callback with class_find_device
    ...

    Linus Torvalds
     
  • According to POSIX, EBUSY means that the "device or resource is busy", and
    this can lead to people thinking that the file
    `/sys/kernel/debug/kmemleak/` is somehow locked or being used by another
    process. Change this error code to a more appropriate one.

    Link: http://lkml.kernel.org/r/20190612155231.19448-1-andrealmeid@collabora.com
    Signed-off-by: André Almeida
    Reviewed-by: Andrew Morton
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    André Almeida
     
  • in_softirq() is a wrong predicate to check if we are in a softirq
    context. It also returns true if we have BH disabled, so objects are
    falsely stamped with "softirq" comm. The correct predicate is
    in_serving_softirq().
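
    For reference, a minimal sketch of how create_object() stamps the comm
    with the corrected predicate (shape per mainline of that era, which
    still used in_irq(); a later commit above renames it to in_hardirq()):

    /* sketch: record task information depending on the current context */
    if (in_irq()) {
            object->pid = 0;
            strncpy(object->comm, "hardirq", sizeof(object->comm));
    } else if (in_serving_softirq()) {
            object->pid = 0;
            strncpy(object->comm, "softirq", sizeof(object->comm));
    } else {
            object->pid = current->pid;
            /* racy read of current->comm, good enough for debugging */
            strncpy(object->comm, current->comm, sizeof(object->comm));
    }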

    If the user does cat on /sys/kernel/debug/kmemleak, previously they
    would see this, which is clearly wrong since it is system-call context
    (see the comm):

    unreferenced object 0xffff88805bd661c0 (size 64):
    comm "softirq", pid 0, jiffies 4294942959 (age 12.400s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00 ................
    00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [] slab_post_alloc_hook mm/slab.h:439 [inline]
    [] slab_alloc mm/slab.c:3326 [inline]
    [] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
    [] kmalloc include/linux/slab.h:547 [inline]
    [] kzalloc include/linux/slab.h:742 [inline]
    [] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
    [] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
    [] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
    [] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
    [] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
    [] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
    [] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
    [] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
    [] __do_sys_setsockopt net/socket.c:2089 [inline]
    [] __se_sys_setsockopt net/socket.c:2086 [inline]
    [] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
    [] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    now they will see this:

    unreferenced object 0xffff88805413c800 (size 64):
    comm "syz-executor.4", pid 8960, jiffies 4294994003 (age 14.350s)
    hex dump (first 32 bytes):
    00 7a 8a 57 80 88 ff ff e0 00 00 01 00 00 00 00 .z.W............
    00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
    [] slab_post_alloc_hook mm/slab.h:439 [inline]
    [] slab_alloc mm/slab.c:3326 [inline]
    [] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
    [] kmalloc include/linux/slab.h:547 [inline]
    [] kzalloc include/linux/slab.h:742 [inline]
    [] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
    [] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
    [] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
    [] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
    [] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
    [] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
    [] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
    [] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
    [] __do_sys_setsockopt net/socket.c:2089 [inline]
    [] __se_sys_setsockopt net/socket.c:2086 [inline]
    [] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
    [] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Link: http://lkml.kernel.org/r/20190517171507.96046-1-dvyukov@gmail.com
    Signed-off-by: Dmitry Vyukov
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Vyukov
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation this program is
    distributed in the hope that it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with this program if not write to the free
    software foundation inc 59 temple place suite 330 boston ma 02111
    1307 usa

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 136 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Alexios Zavras
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190530000436.384967451@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

03 Jun, 2019

1 commit


07 May, 2019

1 commit

  • Pull stack trace updates from Ingo Molnar:
    "So Thomas looked at the stacktrace code recently and noticed a few
    weirdnesses, and we all know how such stories of crummy kernel code
    meeting German engineering perfection end: a 45-patch series to clean
    it all up! :-)

    Here's the changes in Thomas's words:

    'Struct stack_trace is a sinkhole for input and output parameters
    which is largely pointless for most usage sites. In fact if embedded
    into other data structures it creates indirections and extra storage
    overhead for no benefit.

    Looking at all usage sites makes it clear that they just require an
    interface which is based on a storage array. That array is either on
    stack, global or embedded into some other data structure.

    Some of the stack depot usage sites are outright wrong, but
    fortunately the wrongness just causes more stack being used for
    nothing and does not have functional impact.

    Another oddity is the inconsistent termination of the stack trace
    with ULONG_MAX. It's pointless as the number of entries is what
    determines the length of the stored trace. In fact quite some call
    sites remove the ULONG_MAX marker afterwards with or without nasty
    comments about it. Not all architectures do that and those which do,
    do it inconsistently, either conditional on nr_entries == 0 or
    unconditionally.

    The following series cleans that up by:

    1) Removing the ULONG_MAX termination in the architecture code

    2) Removing the ULONG_MAX fixups at the call sites

    3) Providing plain storage array based interfaces for stacktrace
    and stackdepot.

    4) Cleaning up the mess at the callsites including some related
    cleanups.

    5) Removing the struct stack_trace based interfaces

    This is not changing the struct stack_trace interfaces at the
    architecture level, but it removes the exposure to the generic
    code'"

    * 'core-stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
    x86/stacktrace: Use common infrastructure
    stacktrace: Provide common infrastructure
    lib/stackdepot: Remove obsolete functions
    stacktrace: Remove obsolete functions
    livepatch: Simplify stack trace retrieval
    tracing: Remove the last struct stack_trace usage
    tracing: Simplify stack trace retrieval
    tracing: Make ftrace_trace_userstack() static and conditional
    tracing: Use percpu stack trace buffer more intelligently
    tracing: Simplify stacktrace retrieval in histograms
    lockdep: Simplify stack trace handling
    lockdep: Remove save argument from check_prev_add()
    lockdep: Remove unused trace argument from print_circular_bug()
    drm: Simplify stacktrace handling
    dm persistent data: Simplify stack trace handling
    dm bufio: Simplify stack trace retrieval
    btrfs: ref-verify: Simplify stack trace retrieval
    dma/debug: Simplify stracktrace retrieval
    fault-inject: Simplify stacktrace retrieval
    mm/page_owner: Simplify stack trace handling
    ...

    Linus Torvalds
     

29 Apr, 2019

1 commit

  • Replace the indirection through struct stack_trace by using the storage
    array based interfaces.
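
    A minimal sketch of the result in mm/kmemleak.c, assuming the
    MAX_TRACE-sized array embedded in the object is kept as before:

    /* sketch: plain storage-array API instead of filling a struct stack_trace */
    static int __save_stack_trace(unsigned long *trace)
    {
            return stack_trace_save(trace, MAX_TRACE, 2);
    }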

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Josh Poimboeuf
    Acked-by: Catalin Marinas
    Cc: Andy Lutomirski
    Cc: linux-mm@kvack.org
    Cc: Steven Rostedt
    Cc: Alexander Potapenko
    Cc: Alexey Dobriyan
    Cc: Andrew Morton
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: kasan-dev@googlegroups.com
    Cc: Mike Rapoport
    Cc: Akinobu Mita
    Cc: Christoph Hellwig
    Cc: iommu@lists.linux-foundation.org
    Cc: Robin Murphy
    Cc: Marek Szyprowski
    Cc: Johannes Thumshirn
    Cc: David Sterba
    Cc: Chris Mason
    Cc: Josef Bacik
    Cc: linux-btrfs@vger.kernel.org
    Cc: dm-devel@redhat.com
    Cc: Mike Snitzer
    Cc: Alasdair Kergon
    Cc: Daniel Vetter
    Cc: intel-gfx@lists.freedesktop.org
    Cc: Joonas Lahtinen
    Cc: Maarten Lankhorst
    Cc: dri-devel@lists.freedesktop.org
    Cc: David Airlie
    Cc: Jani Nikula
    Cc: Rodrigo Vivi
    Cc: Tom Zanussi
    Cc: Miroslav Benes
    Cc: linux-arch@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190425094801.863716911@linutronix.de

    Thomas Gleixner
     

20 Apr, 2019

1 commit

  • The only references outside of the #ifdef have been removed, so now we
    get a warning in non-SMP configurations:

    mm/kmemleak.c:1404:13: error: unused function 'scan_large_block' [-Werror,-Wunused-function]

    Add a new #ifdef around it.

    Link: http://lkml.kernel.org/r/20190416123148.3502045-1-arnd@arndb.de
    Fixes: 298a32b13208 ("kmemleak: powerpc: skip scanning holes in the .bss section")
    Signed-off-by: Arnd Bergmann
    Acked-by: Catalin Marinas
    Cc: Vincent Whitchurch
    Cc: Michael Ellerman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     

06 Apr, 2019

1 commit

  • Commit 2d4f567103ff ("KVM: PPC: Introduce kvm_tmp framework") adds
    kvm_tmp[] into the .bss section and then frees the rest of the unused
    space back to the page allocator.

    kernel_init
    kvm_guest_init
    kvm_free_tmp
    free_reserved_area
    free_unref_page
    free_unref_page_prepare

    With DEBUG_PAGEALLOC=y, it will unmap those pages from the kernel. As a
    result, a kmemleak scan will trigger a panic when it scans the .bss
    section with unmapped pages.

    This patch creates dedicated kmemleak objects for the .data, .bss and
    potentially .data..ro_after_init sections to allow partial freeing via
    the kmemleak_free_part() in the powerpc kvm_free_tmp() function.
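
    A minimal sketch of the registration in kmemleak_init(), assuming the
    usual linker-provided section symbols (the data/bss split is the point;
    exact arguments may differ):

    /* sketch: separate objects so .bss can later be partially freed */
    create_object((unsigned long)_sdata, _edata - _sdata,
                  KMEMLEAK_GREY, GFP_ATOMIC);
    create_object((unsigned long)__bss_start, __bss_stop - __bss_start,
                  KMEMLEAK_GREY, GFP_ATOMIC);
    /* only register .data..ro_after_init if not within .data */
    if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
            create_object((unsigned long)__start_ro_after_init,
                          __end_ro_after_init - __start_ro_after_init,
                          KMEMLEAK_GREY, GFP_ATOMIC);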

    Link: http://lkml.kernel.org/r/20190321171917.62049-1-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas
    Reported-by: Qian Cai
    Acked-by: Michael Ellerman (powerpc)
    Tested-by: Qian Cai
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Avi Kivity
    Cc: Paolo Bonzini
    Cc: Radim Krcmar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     

22 Feb, 2019

1 commit

  • kmemleak keeps two global variables, min_addr and max_addr, which store
    the range of valid (encountered by kmemleak) pointer values, which it
    later uses to speed up pointer lookup when scanning blocks.

    With tagged pointers this range will get bigger than it needs to be. This
    patch makes kmemleak untag pointers before saving them to min_addr and
    max_addr and when performing a lookup.
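
    A minimal sketch of the untagging, assuming kasan_reset_tag() is applied
    both when recording the range in create_object() and when looking up
    pointers during scanning (exact hunks may differ):

    /* sketch: keep the min/max pointer range in untagged form */
    untagged_ptr = (unsigned long)kasan_reset_tag((void *)ptr);
    min_addr = min(min_addr, untagged_ptr);
    max_addr = max(max_addr, untagged_ptr + size);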

    Link: http://lkml.kernel.org/r/16e887d442986ab87fe87a755815ad92fa431a5f.1550066133.git.andreyknvl@google.com
    Signed-off-by: Andrey Konovalov
    Tested-by: Qian Cai
    Acked-by: Catalin Marinas
    Cc: Alexander Potapenko
    Cc: Andrey Ryabinin
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Dmitry Vyukov
    Cc: Evgeniy Stepanov
    Cc: Joonsoo Kim
    Cc: Kostya Serebryany
    Cc: Pekka Enberg
    Cc: Vincenzo Frascino
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
     

29 Dec, 2018

2 commits

  • Kmemleak scan can be cpu intensive and can stall user tasks at times. To
    prevent this, add config DEBUG_KMEMLEAK_AUTO_SCAN to enable/disable auto
    scan on boot up. Also protect first_run with DEBUG_KMEMLEAK_AUTO_SCAN as
    this is meant only for the first automatic scan.

    Link: http://lkml.kernel.org/r/1540231723-7087-1-git-send-email-prpatel@nvidia.com
    Signed-off-by: Sri Krishna chowdary
    Signed-off-by: Sachin Nikam
    Signed-off-by: Prateek
    Reviewed-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sri Krishna chowdary
     
  • kmemleak_scan() goes through all online nodes and tries to scan all used
    pages.

    We can do better and use pfn_to_online_page(), so in case we have
    CONFIG_MEMORY_HOTPLUG, offlined pages will be skipped automatically. For
    boxes where CONFIG_MEMORY_HOTPLUG is not present, pfn_to_online_page()
    will fall back to pfn_valid().

    Another little optimization is to check if the page belongs to the node we
    are currently checking, so in case we have nodes interleaved we will not
    check the same pfn multiple times.
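
    A minimal sketch of the per-node scan loop after both changes, assuming
    i is the node id being iterated and the in-use check is kept as before
    (exact code may differ):

    /* sketch: skip offline pages and pages from other (interleaved) nodes */
    for (pfn = start_pfn; pfn < end_pfn; pfn++) {
            struct page *page = pfn_to_online_page(pfn);

            if (!page)
                    continue;
            /* only scan pages belonging to this node */
            if (page_to_nid(page) != i)
                    continue;
            /* only scan pages in use */
            if (page_count(page) == 0)
                    continue;
            scan_block(page, page + 1, NULL);
    }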

    I ran some tests:

    Add some memory to node1 and node2 making it interleaved:

    (qemu) object_add memory-backend-ram,id=ram0,size=1G
    (qemu) device_add pc-dimm,id=dimm0,memdev=ram0,node=1
    (qemu) object_add memory-backend-ram,id=ram1,size=1G
    (qemu) device_add pc-dimm,id=dimm1,memdev=ram1,node=2
    (qemu) object_add memory-backend-ram,id=ram2,size=1G
    (qemu) device_add pc-dimm,id=dimm2,memdev=ram2,node=1

    Then, we offline that memory:
    # for i in {32..39} ; do echo "offline" > /sys/devices/system/node/node1/memory$i/state;done
    # for i in {48..55} ; do echo "offline" > /sys/devices/system/node/node1/memory$i/state;done
    # for i in {40..47} ; do echo "offline" > /sys/devices/system/node/node2/memory$i/state;done

    And we run kmemleak_scan:

    # echo "scan" > /sys/kernel/debug/kmemleak

    before the patch:

    kmemleak: time spend: 41596 us

    after the patch:

    kmemleak: time spend: 34899 us

    [akpm@linux-foundation.org: remove stray newline, per Oscar]
    Link: http://lkml.kernel.org/r/20181206131918.25099-1-osalvador@suse.de
    Signed-off-by: Oscar Salvador
    Reviewed-by: Wei Yang
    Suggested-by: Michal Hocko
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oscar Salvador
     

31 Oct, 2018

1 commit

  • Move remaining definitions and declarations from include/linux/bootmem.h
    into include/linux/memblock.h and remove the redundant header.

    The includes were replaced with the semantic patch below and then
    semi-automated removal of duplicated '#include <linux/memblock.h>'.

    @@
    @@
    - #include <linux/bootmem.h>
    + #include <linux/memblock.h>

    [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
    [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
    Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
    [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
    Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
    Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Signed-off-by: Stephen Rothwell
    Acked-by: Michal Hocko
    Cc: Catalin Marinas
    Cc: Chris Zankel
    Cc: "David S. Miller"
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Ingo Molnar
    Cc: "James E.J. Bottomley"
    Cc: Jonas Bonn
    Cc: Jonathan Corbet
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Burton
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Russell King
    Cc: Serge Semin
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

27 Oct, 2018

1 commit

  • Currently, kmemleak only prints the number of suspected leaks to dmesg but
    requires the user to read a debugfs file to get the actual stack traces of
    the objects' allocation points. Add a module option to print the full
    object information to dmesg too. It can be enabled with
    kmemleak.verbose=1 on the kernel command line, or "echo 1 >
    /sys/module/kmemleak/parameters/verbose":

    This allows easier integration of kmemleak into test systems: We have
    automated test infrastructure to test our Linux systems. With this
    option, running our tests with kmemleak is as simple as enabling kmemleak
    and passing this command line option; the test infrastructure knows how to
    save kernel logs, which will now include kmemleak reports. Without this
    option, the test infrastructure needs to be specifically taught to read
    out the kmemleak debugfs file. Removing this need for special handling
    makes kmemleak more similar to other kernel debug options (slab debugging,
    debug objects, etc).

    Link: http://lkml.kernel.org/r/20180903144046.21023-1-vincent.whitchurch@axis.com
    Signed-off-by: Vincent Whitchurch
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vincent Whitchurch
     

05 Sep, 2018

1 commit

  • If kmemleak is built into the kernel but is disabled by default, the
    debugfs file is never registered. Because of this, it is not possible
    to find out if the kernel is built with kmemleak support by checking for
    the presence of this file. To allow this, always register the file.

    After this patch, if the file doesn't exist, kmemleak is not available
    in the kernel. If writing "scan" or any value other than "clear" to
    this file results in EBUSY, then kmemleak is available but is disabled
    by default and can be activated via the kernel command line.

    Catalin: "that's also consistent with a late disabling of kmemleak when
    the debugfs entry sticks around."

    Link: http://lkml.kernel.org/r/20180824131220.19176-1-vincent.whitchurch@axis.com
    Signed-off-by: Vincent Whitchurch
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vincent Whitchurch
     

06 Apr, 2018

2 commits

  • Link: http://lkml.kernel.org/r/1519585191-10180-4-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrew Morton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • early_param() handlers are only called during kernel initialization, so
    Linux marks such functions with the __init macro to save memory.

    But it forgot to mark kmemleak_boot_config(). So, make it __init as
    well.
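
    A minimal sketch of the annotated handler (body roughly per mainline of
    that era; the one-line change is the __init annotation itself):

    /* sketch: runs only at boot, so let its text be discarded after init */
    static int __init kmemleak_boot_config(char *str)
    {
            if (!str)
                    return -EINVAL;
            if (strcmp(str, "off") == 0)
                    kmemleak_disable();
            else if (strcmp(str, "on") == 0)
                    kmemleak_skip_disable = 1;
            else
                    return -EINVAL;
            return 0;
    }
    early_param("kmemleak", kmemleak_boot_config);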

    Link: http://lkml.kernel.org/r/20180117034720.26897-1-douly.fnst@cn.fujitsu.com
    Signed-off-by: Dou Liyang
    Reviewed-by: Andrew Morton
    Cc: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dou Liyang
     

29 Mar, 2018

1 commit

  • A crash is observed when kmemleak_scan accesses the object->pointer,
    likely due to the following race.

    TASK A                      TASK B                      TASK C
    kmemleak_write
     (with "scan" and
     NOT "scan=on")
    kmemleak_scan()
                                create_object
                                kmem_cache_alloc fails
                                kmemleak_disable
                                kmemleak_do_cleanup
                                kmemleak_free_enabled = 0
                                                            kfree
                                                            kmemleak_free bails out
                                                             (kmemleak_free_enabled is 0)
                                                            slub frees object->pointer
    update_checksum
     crash - object->pointer
      freed (DEBUG_PAGEALLOC)

    kmemleak_do_cleanup waits for the scan thread to complete, but not for
    direct call to kmemleak_scan via kmemleak_write. So add a wait for
    kmemleak_scan completion before disabling kmemleak_free, and while at it
    fix the comment on stop_scan_thread.
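
    A minimal sketch of the serialization, assuming kmemleak_do_cleanup()
    takes scan_mutex (which the direct scan via kmemleak_write() also holds)
    before clearing kmemleak_free_enabled:

    /* sketch: make sure no scan (thread or direct write) is still running */
    static void kmemleak_do_cleanup(struct work_struct *work)
    {
            stop_scan_thread();

            mutex_lock(&scan_mutex);
            /*
             * Once the scan is guaranteed to have finished, it is safe to
             * stop tracking object freeing.
             */
            kmemleak_free_enabled = 0;
            mutex_unlock(&scan_mutex);
            ...
    }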

    [vinmenon@codeaurora.org: fix stop_scan_thread comment]
    Link: http://lkml.kernel.org/r/1522219972-22809-1-git-send-email-vinmenon@codeaurora.org
    Link: http://lkml.kernel.org/r/1522063429-18992-1-git-send-email-vinmenon@codeaurora.org
    Signed-off-by: Vinayak Menon
    Reviewed-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vinayak Menon
     

01 Feb, 2018

1 commit

  • Preempt counter APIs have been split out; currently, hardirq.h just
    includes the irq_enter/exit APIs, which are not used by kmemleak at all.

    So, remove the unused hardirq.h.

    Link: http://lkml.kernel.org/r/1510959741-31109-1-git-send-email-yang.s@alibaba-inc.com
    Signed-off-by: Yang Shi
    Cc: Michal Hocko
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Shi