14 Oct, 2020
1 commit
-
kmemleak_scan() currently relies on the big tasklist_lock hammer to
stabilize iterating through the tasklist. Instead, this patch proposes
simply using rcu along with the rcu-safe for_each_process_thread flavor
(without changing scan semantics), which doesn't make use of
next_thread/p->thread_group and thus cannot race with exit. Furthermore,
any races with fork() and not seeing the new child should be benign as
it's not running yet and can also be detected by the next scan.

Avoiding the tasklist_lock could prove beneficial for performance
considering the scan operation is done periodically. I have seen
improvements of 30%-ish when doing similar replacements on very
pathological microbenchmarks (i.e. stressing get/setpriority(2)).

However, my main motivation is that it's one less user of the global
lock, something that Linus has long wanted to see gone eventually
(if ever) even if the traditional fairness issues have been dealt with
now with qrwlocks. Of course this is a very long way ahead. This
patch also kills another user of the deprecated tsk->thread_group.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Andrew Morton
Reviewed-by: Qian Cai
Acked-by: Catalin Marinas
Acked-by: Oleg Nesterov
Link: https://lkml.kernel.org/r/20200820203902.11308-1-dave@stgolabs.net
Signed-off-by: Linus Torvalds
15 Aug, 2020
1 commit
-
Even if KCSAN is disabled for kmemleak, update_checksum() could still call
crc32() (which is outside of kmemleak.c) to dereference object->pointer.
Thus, the value of object->pointer could be accessed concurrently as
noticed by KCSAN,

BUG: KCSAN: data-race in crc32_le_base / do_raw_spin_lock
write to 0xffffb0ea683a7d50 of 4 bytes by task 23575 on cpu 12:
do_raw_spin_lock+0x114/0x200
debug_spin_lock_after at kernel/locking/spinlock_debug.c:91
(inlined by) do_raw_spin_lock at kernel/locking/spinlock_debug.c:115
_raw_spin_lock+0x40/0x50
__handle_mm_fault+0xa9e/0xd00
handle_mm_fault+0xfc/0x2f0
do_page_fault+0x263/0x6f9
page_fault+0x34/0x40

read to 0xffffb0ea683a7d50 of 4 bytes by task 839 on cpu 60:
crc32_le_base+0x67/0x350
crc32_le_base+0x67/0x350:
crc32_body at lib/crc32.c:106
(inlined by) crc32_le_generic at lib/crc32.c:179
(inlined by) crc32_le at lib/crc32.c:197
kmemleak_scan+0x528/0xd90
update_checksum at mm/kmemleak.c:1172
(inlined by) kmemleak_scan at mm/kmemleak.c:1497
kmemleak_scan_thread+0xcc/0xfa
kthread+0x1e0/0x200
ret_from_fork+0x27/0x50

If a shattered value was returned due to a data race, it will be corrected
in the next scan. Thus, let KCSAN ignore all reads in the region to
silence KCSAN in case the write side is non-atomic.

Suggested-by: Marco Elver
Signed-off-by: Qian Cai
Signed-off-by: Andrew Morton
Acked-by: Marco Elver
Acked-by: Catalin Marinas
Link: http://lkml.kernel.org/r/20200317182754.2180-1-cai@lca.pw
Signed-off-by: Linus Torvalds
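The claim that a torn read is corrected by the next scan can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `struct tracked`, `crc32_bitwise()` and this `update_checksum()` are stand-ins, not the kernel's definitions, and the KCSAN annotations themselves have no userspace equivalent. The point is only that the checksum is recomputed from the live memory on every pass, so a bad value does not persist once the contents are stable.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-object metadata kept by the scanner. */
struct tracked {
    const unsigned char *pointer;   /* start of the tracked block */
    size_t size;                    /* length of the tracked block */
    uint32_t checksum;              /* checksum seen on the previous pass */
};

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320), used as a stand-in. */
static uint32_t crc32_bitwise(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xffffffffu;

    while (len--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xedb88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Recompute the checksum; report whether it differs from the previous pass. */
static bool update_checksum(struct tracked *obj)
{
    uint32_t old = obj->checksum;

    obj->checksum = crc32_bitwise(obj->pointer, obj->size);
    return obj->checksum != old;
}
```

If one pass happened to read a half-written value, the following pass over unchanged memory would compute the correct checksum and report a change once more, after which the result settles.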
03 Apr, 2020
1 commit
-
Clang warns:
mm/kmemleak.c:1955:28: warning: array comparison always evaluates to a constant [-Wtautological-compare]
if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)
^
mm/kmemleak.c:1955:60: warning: array comparison always evaluates to a constant [-Wtautological-compare]
if (__start_ro_after_init < _sdata || __end_ro_after_init > _edata)

These are not true arrays, they are linker defined symbols, which are just
addresses. Using the address-of operator silences the warning and does
not change the resulting assembly with either clang/ld.lld or gcc/ld
(tested with diff + objdump -Dr).

Suggested-by: Nick Desaulniers
Signed-off-by: Nathan Chancellor
Signed-off-by: Andrew Morton
Acked-by: Catalin Marinas
Link: https://github.com/ClangBuiltLinux/linux/issues/895
Link: http://lkml.kernel.org/r/20200220051551.44000-1-natechancellor@gmail.com
Signed-off-by: Linus Torvalds
01 Feb, 2020
1 commit
-
kmemleak_lock as a rwlock on RT can possibly be acquired in atomic
context which does not work.

Since the kmemleak operation is performed in atomic context make it a
raw_spinlock_t so it can also be acquired on RT. This is used for
debugging and is not enabled by default in a production-like environment
(where performance/latency matters) so it makes sense to make it a
raw_spinlock_t instead of trying to get rid of the atomic context. Also
turn the kmemleak_object->lock into raw_spinlock_t which is acquired
(nested) while the kmemleak_lock is held.

The time spent in "echo scan > kmemleak" slightly improved on a 64-core box
with this patch applied after boot.

[bigeasy@linutronix.de: redo the description, update comments. Merge the individual bits: He Zhe did the kmemleak_lock, Liu Haitao the ->lock and Yongxin Liu forwarded Liu's patch.]
Link: http://lkml.kernel.org/r/20191219170834.4tah3prf2gdothz4@linutronix.de
Link: https://lkml.kernel.org/r/20181218150744.GB20197@arrakis.emea.arm.com
Link: https://lkml.kernel.org/r/1542877459-144382-1-git-send-email-zhe.he@windriver.com
Link: https://lkml.kernel.org/r/20190927082230.34152-1-yongxin.liu@windriver.com
Signed-off-by: He Zhe
Signed-off-by: Liu Haitao
Signed-off-by: Yongxin Liu
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Oct, 2019
1 commit
-
In case of an error (e.g. memory pool too small), kmemleak disables
itself and cleans up the already allocated metadata objects. However, if
this happens early before the RCU callback mechanism is available,
put_object() skips call_rcu() and frees the object directly. This is not
safe with the RCU list traversal in __kmemleak_do_cleanup().

Change the list traversal in __kmemleak_do_cleanup() to
list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since
kmemleak is already disabled at this point. In addition, avoid an
unnecessary metadata object rb-tree look-up since it already has the
struct kmemleak_object pointer.

Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
Reported-by: Alexey Kardashevskiy
Reported-by: Marc Dionne
Reported-by: Ted Ts'o
Cc: Andrew Morton
Signed-off-by: Catalin Marinas
Signed-off-by: Linus Torvalds
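The reason a "safe" traversal is needed when entries are freed during the walk can be shown with a minimal userspace sketch. The singly linked `struct node`, `push()` and `free_all()` below are hypothetical stand-ins, not kernel code; the loop only mirrors the idea behind list_for_each_entry_safe(), which is to remember the successor before the current entry is released.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a metadata object linked on a list. */
struct node {
    int id;
    struct node *next;
};

/* Push a new node at the head; returns the new head (or the old one on OOM). */
static struct node *push(struct node *head, int id)
{
    struct node *n = malloc(sizeof(*n));

    if (!n)
        return head;
    n->id = id;
    n->next = head;
    return n;
}

/*
 * Free every node. The successor is saved before free(), so the loop never
 * dereferences a node that has already been released. Returns the number of
 * nodes freed.
 */
static int free_all(struct node **head)
{
    struct node *cur = *head;
    struct node *tmp;
    int freed = 0;

    while (cur) {
        tmp = cur->next;    /* save the successor first */
        free(cur);
        cur = tmp;
        freed++;
    }
    *head = NULL;
    return freed;
}
```

A loop that read `cur->next` after `free(cur)` would be the userspace analogue of the unsafe traversal described in the commit.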
25 Sep, 2019
4 commits
-
The only way to obtain the current memory pool size for a running kernel
is to check the kernel config file which is inconvenient. Record it in
the kernel messages.

[akpm@linux-foundation.org: s/memory pool size/memory pool/available/, per Catalin]
Link: http://lkml.kernel.org/r/1565809631-28933-1-git-send-email-cai@lca.pw
Signed-off-by: Qian Cai
Acked-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Currently kmemleak uses a static early_log buffer to trace all memory
allocation/freeing before the slab allocator is initialised. Such early
log is replayed during kmemleak_init() to properly initialise the kmemleak
metadata for objects allocated up to that point. With a memory pool that
does not rely on the slab allocator, it is possible to skip this early log
entirely.

In order to remove the early logging, consider kmemleak_enabled == 1 by
default while the kmem_cache availability is checked directly on the
object_cache and scan_area_cache variables. The RCU callback is only
invoked after object_cache has been initialised as we wouldn't have any
concurrent list traversal before this.

In order to reduce the number of callbacks before kmemleak is fully
initialised, move the kmemleak_init() call to mm_init().

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: remove WARN_ON(), per Catalin]
Link: http://lkml.kernel.org/r/20190812160642.52134-4-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Cc: Matthew Wilcox
Cc: Michal Hocko
Cc: Qian Cai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Add a memory pool for struct kmemleak_object in case the normal
kmem_cache_alloc() fails under the gfp constraints passed by the caller.
The mem_pool[] array size is currently fixed at 16000.

We are not using the existing mempool kernel API since this requires
the slab allocator to be available (for pool->elements allocation). A
subsequent kmemleak patch will replace the static early log buffer with
the pool allocation introduced here and this functionality is required
to be available before the slab was initialised.

Link: http://lkml.kernel.org/r/20190812160642.52134-3-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Cc: Matthew Wilcox
Cc: Michal Hocko
Cc: Qian Cai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Patch series "mm: kmemleak: Use a memory pool for kmemleak object
allocations", v3.

Following the discussions on v2 of this patch(set) [1], this series takes
a slightly different approach:

- it implements its own simple memory pool that does not rely on the
  slab allocator

- drops the early log buffer logic entirely since it can now allocate
  metadata from the memory pool directly before kmemleak is fully
  initialised

- CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE option is renamed to
  CONFIG_DEBUG_KMEMLEAK_MEM_POOL_SIZE

- moves the kmemleak_init() call earlier (mm_init())

- to avoid a separate memory pool for struct scan_area, it makes the
  tool robust when such allocations fail as scan areas are rather an
  optimisation

[1] http://lkml.kernel.org/r/20190727132334.9184-1-catalin.marinas@arm.com

This patch (of 3):
Object scan areas are an optimisation aimed to decrease the false
positives and slightly improve the scanning time of large objects known to
only have a few specific pointers. If a struct scan_area fails to
allocate, kmemleak can still function normally by scanning the full
object.

Introduce an OBJECT_FULL_SCAN flag and mark objects as such when scan_area
allocation fails.

Link: http://lkml.kernel.org/r/20190812160642.52134-2-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Cc: Michal Hocko
Cc: Matthew Wilcox
Cc: Qian Cai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
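The two mechanisms this series describes, a static pool that needs no slab allocator and a full-scan fallback when a scan area cannot be allocated, can be sketched in userspace as follows. This is a simplified model under stated assumptions: the struct layouts, the `OBJECT_FULL_SCAN` bit value, the tiny `POOL_SIZE` and all function names except the flag name are illustrative, and the primary kmem_cache allocation that the pool backs up is left out.

```c
#include <stddef.h>

#define POOL_SIZE 2                   /* tiny for illustration; the commit uses 16000 */
#define OBJECT_FULL_SCAN (1u << 0)    /* bit position is an assumption */

/* Hypothetical scan area: a sub-range of the object worth scanning. */
struct scan_area {
    size_t start, size;
    struct scan_area *next;
};

/* Hypothetical metadata object. */
struct object {
    unsigned int flags;
    size_t size;
    struct scan_area *areas;
    struct object *next_free;         /* chains pool entries that were freed */
};

static struct object mem_pool[POOL_SIZE];
static int mem_pool_free_count = POOL_SIZE;
static struct object *mem_pool_free_list;

/* Fallback allocator: reuse a freed pool entry, else carve a fresh one. */
static struct object *pool_alloc(void)
{
    struct object *obj = mem_pool_free_list;

    if (obj) {
        mem_pool_free_list = obj->next_free;
        return obj;
    }
    if (mem_pool_free_count > 0)
        return &mem_pool[--mem_pool_free_count];
    return NULL;                      /* pool exhausted */
}

/* Return a pool entry to the free list. */
static void pool_free(struct object *obj)
{
    obj->next_free = mem_pool_free_list;
    mem_pool_free_list = obj;
}

/* Pass area == NULL to model a failed struct scan_area allocation. */
static void add_scan_area(struct object *obj, struct scan_area *area,
                          size_t start, size_t size)
{
    if (!area) {
        obj->flags |= OBJECT_FULL_SCAN;   /* fall back to scanning it all */
        return;
    }
    area->start = start;
    area->size = size;
    area->next = obj->areas;
    obj->areas = area;
}

/* Number of bytes a scan of this object would cover. */
static size_t bytes_to_scan(const struct object *obj)
{
    const struct scan_area *a;
    size_t total = 0;

    if ((obj->flags & OBJECT_FULL_SCAN) || !obj->areas)
        return obj->size;
    for (a = obj->areas; a; a = a->next)
        total += a->size;
    return total;
}
```

The design point in both halves is the same: a failed allocation degrades precision or capacity instead of disabling the tool.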
14 Aug, 2019
1 commit
-
If an error occurs during kmemleak_init() (e.g. kmem cache cannot be
created), kmemleak is disabled but kmemleak_early_log remains enabled.
Subsequently, when the .init.text section is freed, the log_early()
function no longer exists. To avoid a page fault in such a scenario,
ensure that kmemleak_disable() also disables early logging.

Link: http://lkml.kernel.org/r/20190731152302.42073-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Reported-by: Qian Cai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
03 Aug, 2019
1 commit
-
When running ltp's oom test with kmemleak enabled, the below warning was
triggered since the kernel detects that __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
passed in:

WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata
CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
...
kmemleak_alloc+0x4e/0xb0
kmem_cache_alloc+0x2a7/0x3e0
mempool_alloc_slab+0x2d/0x40
mempool_alloc+0x118/0x2b0
bio_alloc_bioset+0x19d/0x350
get_swap_bio+0x80/0x230
__swap_writepage+0x5ff/0xb20

The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak
has __GFP_NOFAIL set all the time due to d9570ee3bd1d4f2 ("kmemleak:
allow to coexist with fault injection"). But, it doesn't make any sense
to have __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same
time.

According to the discussion on the mailing list, the commit should be
reverted as a short-term solution. Catalin Marinas would follow up with
a better solution for the longer term.

The failure rate of kmemleak metadata allocation may increase in some
circumstances, but this should be an expected side effect.

Link: http://lkml.kernel.org/r/1563299431-111710-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: d9570ee3bd1d4f2 ("kmemleak: allow to coexist with fault injection")
Signed-off-by: Yang Shi
Suggested-by: Catalin Marinas
Acked-by: Michal Hocko
Cc: Dmitry Vyukov
Cc: David Rientjes
Cc: Matthew Wilcox
Cc: Qian Cai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
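The contradiction described above is a matter of flag arithmetic, which the following userspace sketch models. The `DEMO_GFP_*` names and bit values are invented for illustration and are not the kernel's; the two mask functions are simplified stand-ins that only pass the caller's reclaim bit through, so they should not be read as the actual gfp_kmemleak_mask() definition.

```c
#include <stdbool.h>

/* Illustrative bit values only; the kernel's real flags are defined elsewhere. */
#define DEMO_GFP_DIRECT_RECLAIM (1u << 0)
#define DEMO_GFP_NOFAIL         (1u << 1)

/* Simplified mask while "never fail" was forced onto metadata allocations. */
static unsigned int mask_with_nofail(unsigned int gfp)
{
    return (gfp & DEMO_GFP_DIRECT_RECLAIM) | DEMO_GFP_NOFAIL;
}

/* Simplified mask after the revert: the metadata allocation may fail again. */
static unsigned int mask_after_revert(unsigned int gfp)
{
    return gfp & DEMO_GFP_DIRECT_RECLAIM;
}

/* "Must not fail" combined with "may not reclaim" is the contradictory case. */
static bool nofail_without_reclaim(unsigned int gfp)
{
    return (gfp & DEMO_GFP_NOFAIL) && !(gfp & DEMO_GFP_DIRECT_RECLAIM);
}
```

A caller that has cleared the reclaim bit, as mempool_alloc_slab() does in the trace, hits the contradictory combination only with the first mask.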
13 Jul, 2019
3 commits
-
Pull driver core and debugfs updates from Greg KH:
"Here is the "big" driver core and debugfs changes for 5.3-rc1

It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups.

Other than the debugfs cleanups, in this set of changes we have:

- bus iteration function cleanups
- scripts/get_abi.pl tool to display and parse Documentation/ABI
  entries in a simple way
- cleanups to Documentation/ABI/ entries to make them parse easier
  due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes

All of these have been in linux-next for a while, with a bunch of
merge issues that Stephen has been patient with me for"

* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
debugfs: make error message a bit more verbose
orangefs: fix build warning from debugfs cleanup patch
ubifs: fix build warning after debugfs cleanup patch
driver: core: Allow subsystems to continue deferring probe
drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
arch_topology: Remove error messages on out-of-memory conditions
lib: notifier-error-inject: no need to check return value of debugfs_create functions
swiotlb: no need to check return value of debugfs_create functions
ceph: no need to check return value of debugfs_create functions
sunrpc: no need to check return value of debugfs_create functions
ubifs: no need to check return value of debugfs_create functions
orangefs: no need to check return value of debugfs_create functions
nfsd: no need to check return value of debugfs_create functions
lib: 842: no need to check return value of debugfs_create functions
debugfs: provide pr_fmt() macro
debugfs: log errors when something goes wrong
drivers: s390/cio: Fix compilation warning about const qualifiers
drivers: Add generic helper to match by of_node
driver_find_device: Unify the match function with class_find_device()
bus_find_device: Unify the match callback with class_find_device
...
-
According to POSIX, EBUSY means that the "device or resource is busy", and
this can lead to people thinking that the file
`/sys/kernel/debug/kmemleak/` is somehow locked or being used by another
process. Change this error code to a more appropriate one.

Link: http://lkml.kernel.org/r/20190612155231.19448-1-andrealmeid@collabora.com
Signed-off-by: André Almeida
Reviewed-by: Andrew Morton
Acked-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
in_softirq() is a wrong predicate to check if we are in a softirq
context. It also returns true if we have BH disabled, so objects are
falsely stamped with "softirq" comm. The correct predicate is
in_serving_softirq().

If the user does cat from /sys/kernel/debug/kmemleak, previously they would
see this, which is clearly wrong, as this is system call context (see the
comm):

unreferenced object 0xffff88805bd661c0 (size 64):
comm "softirq", pid 0, jiffies 4294942959 (age 12.400s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 ff ff ff ff 00 00 00 00 ................
00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
backtrace:
[] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
[] slab_post_alloc_hook mm/slab.h:439 [inline]
[] slab_alloc mm/slab.c:3326 [inline]
[] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
[] kmalloc include/linux/slab.h:547 [inline]
[] kzalloc include/linux/slab.h:742 [inline]
[] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
[] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
[] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
[] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
[] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
[] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
[] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
[] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
[] __do_sys_setsockopt net/socket.c:2089 [inline]
[] __se_sys_setsockopt net/socket.c:2086 [inline]
[] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
[] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
[] entry_SYSCALL_64_after_hwframe+0x44/0xa9

now they will see this:

unreferenced object 0xffff88805413c800 (size 64):
comm "syz-executor.4", pid 8960, jiffies 4294994003 (age 14.350s)
hex dump (first 32 bytes):
00 7a 8a 57 80 88 ff ff e0 00 00 01 00 00 00 00 .z.W............
00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
backtrace:
[] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
[] slab_post_alloc_hook mm/slab.h:439 [inline]
[] slab_alloc mm/slab.c:3326 [inline]
[] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
[] kmalloc include/linux/slab.h:547 [inline]
[] kzalloc include/linux/slab.h:742 [inline]
[] ip_mc_add1_src net/ipv4/igmp.c:1961 [inline]
[] ip_mc_add_src+0x36b/0x400 net/ipv4/igmp.c:2085
[] ip_mc_msfilter+0x22d/0x310 net/ipv4/igmp.c:2475
[] do_ip_setsockopt.isra.0+0x19fe/0x1c00 net/ipv4/ip_sockglue.c:957
[] ip_setsockopt+0x3b/0xb0 net/ipv4/ip_sockglue.c:1246
[] udp_setsockopt+0x4e/0x90 net/ipv4/udp.c:2616
[] sock_common_setsockopt+0x3e/0x50 net/core/sock.c:3130
[] __sys_setsockopt+0x9e/0x120 net/socket.c:2078
[] __do_sys_setsockopt net/socket.c:2089 [inline]
[] __se_sys_setsockopt net/socket.c:2086 [inline]
[] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2086
[] do_syscall_64+0x7c/0x1a0 arch/x86/entry/common.c:301
[] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Link: http://lkml.kernel.org/r/20190517171507.96046-1-dvyukov@gmail.com
Signed-off-by: Dmitry Vyukov
Acked-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
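Why the two predicates disagree while bottom halves are disabled can be modelled with a counter in userspace. The sketch below assumes the usual preempt counter layout (a softirq field starting at bit 8, with BH-disable adding twice the softirq offset); treat those constants and every `model_*` name as assumptions made for illustration, not as a copy of the kernel headers.

```c
#include <stdbool.h>

/* Assumed layout of the softirq field inside the preempt counter. */
#define SOFTIRQ_SHIFT          8
#define SOFTIRQ_OFFSET         (1UL << SOFTIRQ_SHIFT)
#define SOFTIRQ_MASK           (0xffUL << SOFTIRQ_SHIFT)
#define SOFTIRQ_DISABLE_OFFSET (2 * SOFTIRQ_OFFSET)

/* Hypothetical stand-in for the per-task preempt counter. */
static unsigned long preempt_count_model;

/* True whenever any softirq bit is set, including "BH disabled". */
static bool model_in_softirq(void)
{
    return (preempt_count_model & SOFTIRQ_MASK) != 0;
}

/* True only while a softirq handler is actually running. */
static bool model_in_serving_softirq(void)
{
    return (preempt_count_model & SOFTIRQ_MASK & SOFTIRQ_OFFSET) != 0;
}

/* Process context disabling bottom halves. */
static void model_local_bh_disable(void)
{
    preempt_count_model += SOFTIRQ_DISABLE_OFFSET;
}

static void model_local_bh_enable(void)
{
    preempt_count_model -= SOFTIRQ_DISABLE_OFFSET;
}

/* Entry to and exit from softirq processing. */
static void model_softirq_enter(void)
{
    preempt_count_model += SOFTIRQ_OFFSET;
}

static void model_softirq_exit(void)
{
    preempt_count_model -= SOFTIRQ_OFFSET;
}
```

In this model a task that has only disabled bottom halves satisfies the first predicate but not the second, which matches the mislabelled "softirq" comm in the first report above.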
05 Jun, 2019
1 commit
-
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not write to the free
software foundation inc 59 temple place suite 330 boston ma 02111
1307 usa

extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 136 file(s).
Signed-off-by: Thomas Gleixner
Reviewed-by: Alexios Zavras
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190530000436.384967451@linutronix.de
Signed-off-by: Greg Kroah-Hartman
03 Jun, 2019
1 commit
-
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.

Cc: Catalin Marinas
Cc: linux-mm@kvack.org
Signed-off-by: Greg Kroah-Hartman
07 May, 2019
1 commit
-
Pull stack trace updates from Ingo Molnar:
"So Thomas looked at the stacktrace code recently and noticed a few
weirdnesses, and we all know how such stories of crummy kernel code
meeting German engineering perfection end: a 45-patch series to clean
it all up! :-)

Here's the changes in Thomas's words:

'Struct stack_trace is a sinkhole for input and output parameters
which is largely pointless for most usage sites. In fact if embedded
into other data structures it creates indirections and extra storage
overhead for no benefit.

Looking at all usage sites makes it clear that they just require an
interface which is based on a storage array. That array is either on
stack, global or embedded into some other data structure.

Some of the stack depot usage sites are outright wrong, but
fortunately the wrongness just causes more stack being used for
nothing and does not have functional impact.

Another oddity is the inconsistent termination of the stack trace
with ULONG_MAX. It's pointless as the number of entries is what
determines the length of the stored trace. In fact quite some call
sites remove the ULONG_MAX marker afterwards with or without nasty
comments about it. Not all architectures do that and those which do,
do it inconsistently either conditional on nr_entries == 0 or
unconditionally.

The following series cleans that up by:

1) Removing the ULONG_MAX termination in the architecture code
2) Removing the ULONG_MAX fixups at the call sites
3) Providing plain storage array based interfaces for stacktrace
   and stackdepot.
4) Cleaning up the mess at the callsites including some related
   cleanups.
5) Removing the struct stack_trace based interfaces

This is not changing the struct stack_trace interfaces at the
architecture level, but it removes the exposure to the generic
code'"

* 'core-stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
x86/stacktrace: Use common infrastructure
stacktrace: Provide common infrastructure
lib/stackdepot: Remove obsolete functions
stacktrace: Remove obsolete functions
livepatch: Simplify stack trace retrieval
tracing: Remove the last struct stack_trace usage
tracing: Simplify stack trace retrieval
tracing: Make ftrace_trace_userstack() static and conditional
tracing: Use percpu stack trace buffer more intelligently
tracing: Simplify stacktrace retrieval in histograms
lockdep: Simplify stack trace handling
lockdep: Remove save argument from check_prev_add()
lockdep: Remove unused trace argument from print_circular_bug()
drm: Simplify stacktrace handling
dm persistent data: Simplify stack trace handling
dm bufio: Simplify stack trace retrieval
btrfs: ref-verify: Simplify stack trace retrieval
dma/debug: Simplify stracktrace retrieval
fault-inject: Simplify stacktrace retrieval
mm/page_owner: Simplify stack trace handling
...
29 Apr, 2019
1 commit
-
Replace the indirection through struct stack_trace by using the storage
array based interfaces.

Signed-off-by: Thomas Gleixner
Reviewed-by: Josh Poimboeuf
Acked-by: Catalin Marinas
Cc: Andy Lutomirski
Cc: linux-mm@kvack.org
Cc: Steven Rostedt
Cc: Alexander Potapenko
Cc: Alexey Dobriyan
Cc: Andrew Morton
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Dmitry Vyukov
Cc: Andrey Ryabinin
Cc: kasan-dev@googlegroups.com
Cc: Mike Rapoport
Cc: Akinobu Mita
Cc: Christoph Hellwig
Cc: iommu@lists.linux-foundation.org
Cc: Robin Murphy
Cc: Marek Szyprowski
Cc: Johannes Thumshirn
Cc: David Sterba
Cc: Chris Mason
Cc: Josef Bacik
Cc: linux-btrfs@vger.kernel.org
Cc: dm-devel@redhat.com
Cc: Mike Snitzer
Cc: Alasdair Kergon
Cc: Daniel Vetter
Cc: intel-gfx@lists.freedesktop.org
Cc: Joonas Lahtinen
Cc: Maarten Lankhorst
Cc: dri-devel@lists.freedesktop.org
Cc: David Airlie
Cc: Jani Nikula
Cc: Rodrigo Vivi
Cc: Tom Zanussi
Cc: Miroslav Benes
Cc: linux-arch@vger.kernel.org
Link: https://lkml.kernel.org/r/20190425094801.863716911@linutronix.de
20 Apr, 2019
1 commit
-
The only references outside of the #ifdef have been removed, so now we
get a warning in non-SMP configurations:

mm/kmemleak.c:1404:13: error: unused function 'scan_large_block' [-Werror,-Wunused-function]

Add a new #ifdef around it.

Link: http://lkml.kernel.org/r/20190416123148.3502045-1-arnd@arndb.de
Fixes: 298a32b13208 ("kmemleak: powerpc: skip scanning holes in the .bss section")
Signed-off-by: Arnd Bergmann
Acked-by: Catalin Marinas
Cc: Vincent Whitchurch
Cc: Michael Ellerman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
06 Apr, 2019
1 commit
-
Commit 2d4f567103ff ("KVM: PPC: Introduce kvm_tmp framework") adds
kvm_tmp[] into the .bss section and then frees the rest of the unused space
back to the page allocator.

kernel_init
  kvm_guest_init
    kvm_free_tmp
      free_reserved_area
        free_unref_page
          free_unref_page_prepare

With DEBUG_PAGEALLOC=y, it will unmap those pages from kernel. As the
result, kmemleak scan will trigger a panic when it scans the .bss
section with unmapped pages.

This patch creates dedicated kmemleak objects for the .data, .bss and
potentially .data..ro_after_init sections to allow partial freeing via
the kmemleak_free_part() in the powerpc kvm_free_tmp() function.

Link: http://lkml.kernel.org/r/20190321171917.62049-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Reported-by: Qian Cai
Acked-by: Michael Ellerman (powerpc)
Tested-by: Qian Cai
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Avi Kivity
Cc: Paolo Bonzini
Cc: Radim Krcmar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
22 Feb, 2019
1 commit
-
kmemleak keeps two global variables, min_addr and max_addr, which store
the range of valid (encountered by kmemleak) pointer values, which it
later uses to speed up pointer lookup when scanning blocks.

With tagged pointers this range will get bigger than it needs to be. This
patch makes kmemleak untag pointers before saving them to min_addr and
max_addr and when performing a lookup.

Link: http://lkml.kernel.org/r/16e887d442986ab87fe87a755815ad92fa431a5f.1550066133.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov
Tested-by: Qian Cai
Acked-by: Catalin Marinas
Cc: Alexander Potapenko
Cc: Andrey Ryabinin
Cc: Christoph Lameter
Cc: David Rientjes
Cc: Dmitry Vyukov
Cc: Evgeniy Stepanov
Cc: Joonsoo Kim
Cc: Kostya Serebryany
Cc: Pekka Enberg
Cc: Vincenzo Frascino
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
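The effect of tags on the min_addr/max_addr window can be seen in a short userspace model. The sketch assumes a 64-bit pointer with the tag in the top byte and an untagged form where that byte is all ones; `untag()`, `track_object()` and `may_be_tracked()` are hypothetical names, and only `min_addr` and `max_addr` come from the commit text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the tag occupies the top byte of a 64-bit pointer. */
#define TAG_SHIFT 56
#define TAG_MASK  (0xffULL << TAG_SHIFT)

static uint64_t min_addr = UINT64_MAX;
static uint64_t max_addr;

/* Reset the tag byte to all ones, the assumed "untagged" form. */
static uint64_t untag(uint64_t ptr)
{
    return ptr | TAG_MASK;
}

/* Widen the window of addresses known to hold tracked objects. */
static void track_object(uint64_t ptr, uint64_t size)
{
    uint64_t u = untag(ptr);

    if (u < min_addr)
        min_addr = u;
    if (u + size > max_addr)
        max_addr = u + size;
}

/* Cheap pre-filter applied to a candidate value before a full lookup. */
static bool may_be_tracked(uint64_t candidate)
{
    uint64_t u = untag(candidate);

    return u >= min_addr && u < max_addr;
}
```

Without the untag step, two objects that differ only in their tag would stretch the window across most of the address space, and the pre-filter would reject almost nothing.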
29 Dec, 2018
2 commits
-
Kmemleak scan can be cpu intensive and can stall user tasks at times. To
prevent this, add config DEBUG_KMEMLEAK_AUTO_SCAN to enable/disable auto
scan on boot up. Also protect first_run with DEBUG_KMEMLEAK_AUTO_SCAN as
this is meant for only first automatic scan.

Link: http://lkml.kernel.org/r/1540231723-7087-1-git-send-email-prpatel@nvidia.com
Signed-off-by: Sri Krishna chowdary
Signed-off-by: Sachin Nikam
Signed-off-by: Prateek
Reviewed-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
kmemleak_scan() goes through all online nodes and tries to scan all used
pages.

We can do better and use pfn_to_online_page(), so in case we have
CONFIG_MEMORY_HOTPLUG, offlined pages will be skipped automatically. For
boxes where CONFIG_MEMORY_HOTPLUG is not present, pfn_to_online_page()
will fall back to pfn_valid().

Another little optimization is to check if the page belongs to the node we
are currently checking, so in case we have nodes interleaved we will not
check the same pfn multiple times.

I ran some tests:

Add some memory to node1 and node2 making it interleaved:

(qemu) object_add memory-backend-ram,id=ram0,size=1G
(qemu) device_add pc-dimm,id=dimm0,memdev=ram0,node=1
(qemu) object_add memory-backend-ram,id=ram1,size=1G
(qemu) device_add pc-dimm,id=dimm1,memdev=ram1,node=2
(qemu) object_add memory-backend-ram,id=ram2,size=1G
(qemu) device_add pc-dimm,id=dimm2,memdev=ram2,node=1

Then, we offline that memory:

# for i in {32..39} ; do echo "offline" > /sys/devices/system/node/node1/memory$i/state;done
# for i in {48..55} ; do echo "offline" > /sys/devices/system/node/node1/memory$i/state;done
# for i in {40..47} ; do echo "offline" > /sys/devices/system/node/node2/memory$i/state;done

And we run kmemleak_scan:

# echo "scan" > /sys/kernel/debug/kmemleak
before the patch:
kmemleak: time spend: 41596 us
after the patch:
kmemleak: time spend: 34899 us
[akpm@linux-foundation.org: remove stray newline, per Oscar]
Link: http://lkml.kernel.org/r/20181206131918.25099-1-osalvador@suse.de
Signed-off-by: Oscar Salvador
Reviewed-by: Wei Yang
Suggested-by: Michal Hocko
Acked-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
31 Oct, 2018
1 commit
-
Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>'

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport
Signed-off-by: Stephen Rothwell
Acked-by: Michal Hocko
Cc: Catalin Marinas
Cc: Chris Zankel
Cc: "David S. Miller"
Cc: Geert Uytterhoeven
Cc: Greentime Hu
Cc: Greg Kroah-Hartman
Cc: Guan Xuetao
Cc: Ingo Molnar
Cc: "James E.J. Bottomley"
Cc: Jonas Bonn
Cc: Jonathan Corbet
Cc: Ley Foon Tan
Cc: Mark Salter
Cc: Martin Schwidefsky
Cc: Matt Turner
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Palmer Dabbelt
Cc: Paul Burton
Cc: Richard Kuo
Cc: Richard Weinberger
Cc: Rich Felker
Cc: Russell King
Cc: Serge Semin
Cc: Thomas Gleixner
Cc: Tony Luck
Cc: Vineet Gupta
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
27 Oct, 2018
1 commit
-
Currently, kmemleak only prints the number of suspected leaks to dmesg but
requires the user to read a debugfs file to get the actual stack traces of
the objects' allocation points. Add a module option to print the full
object information to dmesg too. It can be enabled with
kmemleak.verbose=1 on the kernel command line, or "echo 1 >
/sys/module/kmemleak/parameters/verbose":

This allows easier integration of kmemleak into test systems: We have
automated test infrastructure to test our Linux systems. With this
option, running our tests with kmemleak is as simple as enabling kmemleak
and passing this command line option; the test infrastructure knows how to
save kernel logs, which will now include kmemleak reports. Without this
option, the test infrastructure needs to be specifically taught to read
out the kmemleak debugfs file. Removing this need for special handling
makes kmemleak more similar to other kernel debug options (slab debugging,
debug objects, etc).

Link: http://lkml.kernel.org/r/20180903144046.21023-1-vincent.whitchurch@axis.com
Signed-off-by: Vincent Whitchurch
Acked-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
05 Sep, 2018
1 commit
-
If kmemleak is built in to the kernel, but is disabled by default, the
debugfs file is never registered. Because of this, it is not possible
to find out if the kernel is built with kmemleak support by checking for
the presence of this file. To allow this, always register the file.

After this patch, if the file doesn't exist, kmemleak is not available
in the kernel. If writing "scan" or any other value than "clear" to
this file results in EBUSY, then kmemleak is available but is disabled
by default and can be activated via the kernel command line.

Catalin: "that's also consistent with a late disabling of kmemleak when
the debugfs entry sticks around."

Link: http://lkml.kernel.org/r/20180824131220.19176-1-vincent.whitchurch@axis.com
Signed-off-by: Vincent Whitchurch
Acked-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
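The detection procedure described in this commit can be sketched as a small userspace probe. The enum, the function name and the decision to classify every other outcome as unknown are assumptions for illustration. Two caveats apply: writing "scan" starts a real scan when kmemleak is active, and a later entry in this log (13 Jul, 2019) replaces EBUSY with a different error code, so the errno check is version-dependent. A missing debugfs mount also produces a missing file.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical classification of what the probe observed. */
enum kmemleak_state {
    KMEMLEAK_ABSENT,      /* file missing: not built in (or debugfs not mounted) */
    KMEMLEAK_DISABLED,    /* file present, write rejected with EBUSY */
    KMEMLEAK_ACTIVE,      /* write accepted */
    KMEMLEAK_UNKNOWN      /* anything else, e.g. no permission */
};

/* path would normally be /sys/kernel/debug/kmemleak. */
static enum kmemleak_state probe_kmemleak(const char *path)
{
    ssize_t n;
    int err;
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return errno == ENOENT ? KMEMLEAK_ABSENT : KMEMLEAK_UNKNOWN;

    n = write(fd, "scan", 4);     /* side effect: triggers a scan if active */
    err = errno;                  /* save before close() can overwrite it */
    close(fd);

    if (n == 4)
        return KMEMLEAK_ACTIVE;
    if (n < 0 && err == EBUSY)
        return KMEMLEAK_DISABLED;
    return KMEMLEAK_UNKNOWN;
}
```

A test harness could call `probe_kmemleak("/sys/kernel/debug/kmemleak")` once at start-up and skip leak checks unless the result is `KMEMLEAK_ACTIVE`.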
06 Apr, 2018
2 commits
-
Link: http://lkml.kernel.org/r/1519585191-10180-4-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport
Reviewed-by: Andrew Morton
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
The early_param() is only called during kernel initialization, so Linux
marks the functions of it with the __init macro to save memory.

But it forgot to mark kmemleak_boot_config(). So, make it __init as
well.

Link: http://lkml.kernel.org/r/20180117034720.26897-1-douly.fnst@cn.fujitsu.com
Signed-off-by: Dou Liyang
Reviewed-by: Andrew Morton
Cc: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
29 Mar, 2018
1 commit
-
A crash is observed when kmemleak_scan accesses the object->pointer,
likely due to the following race.

  TASK A                    TASK B                      TASK C
  kmemleak_write
   (with "scan" and
   NOT "scan=on")
  kmemleak_scan()
                            create_object
                            kmem_cache_alloc fails
                            kmemleak_disable
                            kmemleak_do_cleanup
                            kmemleak_free_enabled = 0
                                                        kfree
                                                        kmemleak_free bails out
                                                         (kmemleak_free_enabled is 0)
                                                        slub frees object->pointer
  update_checksum
  crash - object->pointer
   freed (DEBUG_PAGEALLOC)

kmemleak_do_cleanup waits for the scan thread to complete, but not for
direct call to kmemleak_scan via kmemleak_write. So add a wait for
kmemleak_scan completion before disabling kmemleak_free, and while at it
fix the comment on stop_scan_thread.

[vinmenon@codeaurora.org: fix stop_scan_thread comment]
Link: http://lkml.kernel.org/r/1522219972-22809-1-git-send-email-vinmenon@codeaurora.org
Link: http://lkml.kernel.org/r/1522063429-18992-1-git-send-email-vinmenon@codeaurora.org
Signed-off-by: Vinayak Menon
Reviewed-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
01 Feb, 2018
1 commit
-
The preempt counter APIs have been split out, so hardirq.h now only
provides the irq_enter/exit APIs, which are not used by kmemleak at all.

So, remove the unused hardirq.h include.

Link: http://lkml.kernel.org/r/1510959741-31109-1-git-send-email-yang.s@alibaba-inc.com
Signed-off-by: Yang Shi
Cc: Michal Hocko
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Jan, 2018
1 commit
-
kmemleak does one slab allocation per user allocation. So if slab fault
injection is enabled to any degree, kmemleak instantly fails to allocate
and turns itself off. However, it's useful to use kmemleak with fault
injection to find leaks on error paths. On the other hand, checking
kmemleak itself is not so useful because (1) it's a debugging tool and
(2) it has a very regular allocation pattern (basically a single
allocation site, so it either works or not).

Turn off fault injection for kmemleak allocations.

Link: http://lkml.kernel.org/r/20180109192243.19316-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov
Cc: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Dec, 2017
1 commit
-
Commit bde5f6bc68db ("kmemleak: add scheduling point to
kmemleak_scan()") tries to rate-limit the frequency of cond_resched()
calls, but does it in a way which might incur an expensive division
operation in the inner loop. Simplify this.

Fixes: bde5f6bc68db5 ("kmemleak: add scheduling point to kmemleak_scan()")
Suggested-by: Linus Torvalds
Cc: Yisheng Xie
Cc: Catalin Marinas
Cc: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
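
One common way to rate-limit a check in a hot loop without a division is to test the loop counter against a power-of-two mask. The sketch below shows that general technique only; the interval value and function names are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical interval; it must be a power of two for the mask to work. */
#define YIELD_INTERVAL 64UL

/* Nonzero when iteration i should offer a scheduling point.
 * A single AND replaces the modulo (and thus a possible division). */
int should_yield(unsigned long i)
{
    return (i & (YIELD_INTERVAL - 1)) == 0;
}

/* Walks [start, end) and counts how many scheduling points would be hit. */
unsigned long count_yield_points(unsigned long start, unsigned long end)
{
    unsigned long i, yields = 0;

    assert(start <= end);
    for (i = start; i < end; i++) {
        if (should_yield(i))
            yields++;   /* kernel code would call cond_resched() here */
        /* ... per-item work would go here ... */
    }
    return yields;
}
```

The trade-off is that the interval can no longer be an arbitrary value, which is usually acceptable when the goal is only to avoid hogging the CPU.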
30 Nov, 2017
1 commit
-
kmemleak_scan() scans the struct page array of each node, which can be
very large and can result in a soft lockup. We have seen a soft lockup
when running a scan while compiling the kernel:

  watchdog: BUG: soft lockup - CPU#53 stuck for 22s! [bash:10287]
  [...]
  Call Trace:
   kmemleak_scan+0x21a/0x4c0
   kmemleak_write+0x312/0x350
   full_proxy_write+0x5a/0xa0
   __vfs_write+0x33/0x150
   vfs_write+0xad/0x1a0
   SyS_write+0x52/0xc0
   do_syscall_64+0x61/0x1a0
   entry_SYSCALL64_slow_path+0x25/0x25

Fix this by adding cond_resched every MAX_SCAN_SIZE.

Link: http://lkml.kernel.org/r/1511439788-20099-1-git-send-email-xieyisheng1@huawei.com
Signed-off-by: Yisheng Xie
Suggested-by: Catalin Marinas
Acked-by: Catalin Marinas
Cc: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
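
The idea of yielding once per fixed-size chunk can be modelled in userspace as below. This is a sketch under assumptions, not the kernel code: SCAN_CHUNK is a hypothetical stand-in for MAX_SCAN_SIZE, sched_yield() stands in for cond_resched(), and the byte sum is placeholder work.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for MAX_SCAN_SIZE. */
#define SCAN_CHUNK 4096

/* Processes len bytes in chunks of at most SCAN_CHUNK, yielding the CPU
 * between chunks. Returns how many times it yielded. */
size_t scan_in_chunks(const unsigned char *start, size_t len,
                      unsigned long *sum)
{
    size_t off = 0, yields = 0;

    assert(sum != NULL);
    assert(start != NULL || len == 0);
    while (off < len) {
        size_t n = len - off < SCAN_CHUNK ? len - off : SCAN_CHUNK;
        size_t i;

        for (i = 0; i < n; i++)
            *sum += start[off + i];   /* placeholder per-byte work */
        off += n;
        if (off < len) {
            sched_yield();            /* userspace analogue of cond_resched() */
            yields++;
        }
    }
    return yields;
}
```

Bounding the work done between scheduling points, rather than the total work, is what keeps the watchdog from reporting a soft lockup on very large ranges.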
16 Nov, 2017
2 commits
-
Patch series "kmemcheck: kill kmemcheck", v2.

As discussed at LSF/MM, kill kmemcheck.

KASan is a replacement that is able to work without the limitation of
kmemcheck (single CPU, slow). KASan is already upstream.

We are also not aware of any users of kmemcheck (or users who don't
consider KASan as a suitable replacement).

The only objection was that since KASAN wasn't supported by all GCC
versions provided by distros at that time we should hold off for 2
years, and try again.

Now that 2 years have passed, and all distros provide gcc that supports
KASAN, kill kmemcheck again for the very same reasons.

This patch (of 4):

Remove kmemcheck annotations, and calls to kmemcheck from the kernel.

[alexander.levin@verizon.com: correctly remove kmemcheck call from dma_map_sg_attrs]
Link: http://lkml.kernel.org/r/20171012192151.26531-1-alexander.levin@verizon.com
Link: http://lkml.kernel.org/r/20171007030159.22241-2-alexander.levin@verizon.com
Signed-off-by: Sasha Levin
Cc: Alexander Potapenko
Cc: Eric W. Biederman
Cc: Michal Hocko
Cc: Pekka Enberg
Cc: Steven Rostedt
Cc: Tim Hansen
Cc: Vegard Nossum
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Kmemleak can be tweaked at runtime by writing commands into its debugfs
file. Root can write to it anyway, but without the write bit set this
interface isn't obvious.

Link: http://lkml.kernel.org/r/150728996582.744328.11541332857988399411.stgit@buzz
Signed-off-by: Konstantin Khlebnikov
Acked-by: Catalin Marinas
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Jul, 2017
3 commits
-
Kmemleak requires that vmalloc'ed objects have a minimum reference count
of 2: one in the corresponding vm_struct object and the other owned by
the vmalloc() caller. There are cases, however, where the original
vmalloc() returned pointer is lost and, instead, a pointer to vm_struct
is stored (see free_thread_stack()). Kmemleak currently reports such
objects as leaks.

This patch adds support for treating any surplus references to an object
as additional references to a specified object. It introduces the
kmemleak_vmalloc() API function which takes a vm_struct pointer and sets
its surplus reference passing to the actual vmalloc() returned pointer.
The __vmalloc_node_range() calling site has been modified accordingly.

Link: http://lkml.kernel.org/r/1495726937-23557-4-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Reported-by: "Luis R. Rodriguez"
Cc: Michal Hocko
Cc: Andy Lutomirski
Cc: "Luis R. Rodriguez"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
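
The surplus-reference idea from the commit above can be modelled with a small userspace sketch. Everything here is a simplification under assumptions: the struct and function names are hypothetical and do not mirror the kernel's data structures, and "reachable" simply means the scan found at least min_count pointers to the object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a tracked allocation. */
struct tracked_object {
    int min_count;                  /* pointers needed to count as reachable */
    int count;                      /* pointers found during this scan */
    struct tracked_object *excess;  /* receiver of surplus pointers, or NULL */
};

int is_reachable(const struct tracked_object *obj)
{
    return obj->count >= obj->min_count;
}

/* Record one pointer to obj found while scanning memory. */
void note_reference(struct tracked_object *obj)
{
    assert(obj != NULL);
    if (!is_reachable(obj)) {
        obj->count++;
        return;
    }
    /* obj already has enough references: pass the surplus one on. */
    if (obj->excess != NULL && !is_reachable(obj->excess))
        obj->excess->count++;
}
```

In this model, a vmalloc'ed block with min_count 2 becomes reachable once the scan finds the address stored in its vm_struct plus one surplus pointer to the vm_struct itself, which corresponds to the case where only the vm_struct pointer was kept by the caller.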
scan_block() updates the number of references (pointers) to objects,
adding them to the gray_list when object->min_count is reached. The
patch factors out this functionality into a separate update_refs()
function.

Link: http://lkml.kernel.org/r/1495726937-23557-3-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Cc: Michal Hocko
Cc: Andy Lutomirski
Cc: "Luis R. Rodriguez"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Change the kmemleak_object.flags type to unsigned int and move
early_log.min_count (int) next to early_log.op_type (int) to slightly
reduce the size of these structures on 64-bit architectures.

Link: http://lkml.kernel.org/r/1495726937-23557-2-git-send-email-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas
Cc: Michal Hocko
Cc: Andy Lutomirski
Cc: "Luis R. Rodriguez"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
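
Why moving one int next to another shrinks a structure can be seen with a small standalone example. The two structs below are hypothetical and only loosely inspired by the fields named above; they are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* int fields separated by pointer-sized fields: on common LP64 ABIs each
 * int is followed by 4 bytes of padding. */
struct log_scattered {
    int op_type;
    const void *ptr;
    size_t size;
    int min_count;
};

/* Same fields with the two ints adjacent, so they share one 8-byte slot. */
struct log_grouped {
    int op_type;
    int min_count;
    const void *ptr;
    size_t size;
};

/* Bytes saved per record by grouping the int fields. */
size_t bytes_saved_by_grouping(void)
{
    assert(sizeof(struct log_grouped) <= sizeof(struct log_scattered));
    return sizeof(struct log_scattered) - sizeof(struct log_grouped);
}
```

On x86-64 the scattered layout typically occupies 32 bytes and the grouped one 24, while on 32-bit targets the two are usually the same size, which matches the "64-bit architectures" qualifier in the commit message.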
01 Apr, 2017
1 commit
-
A section name for .data..ro_after_init was added by both:

    commit d07a980c1b8d ("s390: add proper __ro_after_init support")

and

    commit d7c19b066dcf ("mm: kmemleak: scan .data.ro_after_init")

The latter adds incorrect wrapping around the existing s390 section, and
came later. I'd prefer the s390 naming, so this moves the s390-specific
name up to the asm-generic/sections.h and renames the section as used by
kmemleak (and in the future, kernel/extable.c).

Link: http://lkml.kernel.org/r/20170327192213.GA129375@beast
Signed-off-by: Kees Cook
Acked-by: Heiko Carstens [s390 parts]
Acked-by: Jakub Kicinski
Cc: Eddie Kovsky
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
02 Mar, 2017
1 commit
-
We are going to split out of , which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder file that just
maps to to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar