17 Dec, 2018

1 commit

  • [ Upstream commit 8de456cf87ba863e028c4dd01bae44255ce3d835 ]

    CONFIG_DEBUG_OBJECTS_RCU_HEAD does not play well with kmemleak due to
    recursive calls.

    fill_pool
    kmemleak_ignore
    make_black_object
    put_object
    __call_rcu (kernel/rcu/tree.c)
    debug_rcu_head_queue
    debug_object_activate
    debug_object_init
    fill_pool
    kmemleak_ignore
    make_black_object
    ...

    So add SLAB_NOLEAKTRACE to kmem_cache_create() to not register newly
    allocated debug objects at all.
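
    As a rough userspace illustration of what the flag achieves, the sketch
    below skips the leak-tracker hook for caches that opt out. All names here
    (toy_cache, TOY_NOLEAKTRACE, leak_tracker_register) are hypothetical
    stand-ins, not the kernel's slab or kmemleak code.

```c
#include <stdlib.h>

/* Hypothetical stand-in for SLAB_NOLEAKTRACE; not the kernel's value. */
#define TOY_NOLEAKTRACE 0x1u

struct toy_cache {
    size_t obj_size;
    unsigned int flags;
};

/* Counts how many allocations were handed to the (toy) leak tracker. */
static unsigned int tracked_allocations;

static void leak_tracker_register(void *obj)
{
    (void)obj;
    tracked_allocations++;
}

/* Allocate from the cache; only report to the tracker if the cache allows it. */
static void *toy_cache_alloc(const struct toy_cache *cache)
{
    void *obj = malloc(cache->obj_size);

    if (obj && !(cache->flags & TOY_NOLEAKTRACE))
        leak_tracker_register(obj);
    return obj;
}
```

    Because the tracker is never entered for such a cache, it cannot call
    back into the allocator that is in the middle of refilling its pool.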

    Link: http://lkml.kernel.org/r/20181126165343.2339-1-cai@gmx.us
    Signed-off-by: Qian Cai
    Suggested-by: Catalin Marinas
    Acked-by: Waiman Long
    Acked-by: Catalin Marinas
    Cc: Thomas Gleixner
    Cc: Yang Shi
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Qian Cai
     

02 Aug, 2018

1 commit


31 Jul, 2018

1 commit

  • While debugging an issue debugobject tracking warned about an annotation
    issue of an object on stack. It turned out that the issue was due to the
    object in concern being on a different stack which was due to another
    issue.

    Thomas suggested to print the pointers and the location of the stack for
    the currently running task. This helped to figure out that the object was
    on the wrong stack.

    As this is general useful information for debugging similar issues, make
    the error message more informative by printing the pointers.
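
    A minimal userspace sketch of such a report, assuming the stack is
    described by a base address and a size. The helper names and the message
    text are hypothetical and do not reproduce the kernel's actual output.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* True when addr lies inside [stack_base, stack_base + stack_size). */
static bool addr_on_stack(uintptr_t addr, uintptr_t stack_base, size_t stack_size)
{
    return addr >= stack_base && addr - stack_base < stack_size;
}

/* Format a diagnostic that names both the object address and the stack bounds. */
static int format_stack_report(char *buf, size_t len, uintptr_t addr,
                               uintptr_t stack_base, size_t stack_size)
{
    return snprintf(buf, len, "object %#lx is %s stack %#lx-%#lx",
                    (unsigned long)addr,
                    addr_on_stack(addr, stack_base, stack_size) ? "on" : "NOT on",
                    (unsigned long)stack_base,
                    (unsigned long)(stack_base + stack_size));
}
```

    Printing the bounds next to the address makes it obvious at a glance
    whether the object belongs to the current task's stack or to another one.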

    [ tglx: Massaged changelog ]

    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Thomas Gleixner
    Acked-by: Waiman Long
    Acked-by: Yang Shi
    Cc: kernel-team@android.com
    Cc: Arnd Bergmann
    Cc: astrachan@google.com
    Link: https://lkml.kernel.org/r/20180723212531.202328-1-joel@joelfernandes.org

    Joel Fernandes (Google)
     

15 Mar, 2018

1 commit

  • debug_objects_maxchecked is only updated in __debug_check_no_obj_freed(),
    and only read when the debugfs statistics are printed. Unfortunately both
    of these are optional and depend on different Kconfig symbols.

    When both CONFIG_DEBUG_OBJECTS_FREE and CONFIG_DEBUG_FS are disabled this
    warning is emitted:

    lib/debugobjects.c:56:14: error: 'debug_objects_maxchecked' defined but not used [-Werror=unused-variable]

    Rather than trying to add more complex #ifdef protections, mark the
    variable as __maybe_unused so it can be silently dropped when unused.
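
    The pattern can be sketched outside the kernel as follows. MAYBE_UNUSED
    is a local macro expanding to the GCC/Clang attribute, and TOY_TRACK_FREE
    and TOY_STATS are hypothetical stand-ins for the two Kconfig symbols.

```c
/* GCC/Clang attribute that suppresses "defined but not used" warnings. */
#define MAYBE_UNUSED __attribute__((__unused__))

/* Written only with TOY_TRACK_FREE, read only with TOY_STATS. */
static int toy_max_checked MAYBE_UNUSED;

#ifdef TOY_TRACK_FREE
static void toy_record_checked(int checked)
{
    if (checked > toy_max_checked)
        toy_max_checked = checked;
}
#endif

#ifdef TOY_STATS
static int toy_read_max_checked(void)
{
    return toy_max_checked;
}
#endif
```

    With neither symbol defined the variable has no users, yet the build
    stays quiet because the attribute tells the compiler this is expected.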

    Fixes: bd9dcd046509 ("debugobjects: Export max loops counter")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Thomas Gleixner
    Acked-by: Yang Shi
    Cc: Waiman Long
    Link: https://lkml.kernel.org/r/20180313131857.158876-1-arnd@arndb.de

    Arnd Bergmann
     

23 Feb, 2018

1 commit

  • The removal of the batched object freeing has caused the debug_objects_freed
    to become read-only, and the reading is inside an ifdef, so gcc warns that it
    is completely unused without CONFIG_DEBUG_FS:

    lib/debugobjects.c:71:14: error: 'debug_objects_freed' defined but not used [-Werror=unused-variable]

    Assuming we are still interested in this number, this adds back code to
    keep track of the freed objects.

    Fixes: 636e1970fd7d ("debugobjects: Use global free list in free_object()")
    Suggested-by: Waiman Long
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Thomas Gleixner
    Acked-by: Yang Shi
    Acked-by: Waiman Long
    Link: https://lkml.kernel.org/r/20180222155335.1647466-1-arnd@arndb.de

    Arnd Bergmann
     

13 Feb, 2018

4 commits

  • __debug_check_no_obj_freed() iterates over the to be freed memory region in
    chunks and iterates over the corresponding hash bucket list for each
    chunk. This can accumulate to hundreds of thousands of checked objects. In the
    worst case this can trigger the soft lockup detector:

    NMI watchdog: BUG: soft lockup - CPU#15 stuck for 22s!
    CPU: 15 PID: 110342 Comm: stress-ng-getde
    Call Trace:
    debug_check_no_obj_freed+0x13e/0x220
    __free_pages_ok+0x1f1/0x5c0
    __free_pages+0x25/0x40
    __free_slab+0x19b/0x270
    discard_slab+0x39/0x50
    __slab_free+0x207/0x270
    ___cache_free+0xa6/0xb0
    qlist_free_all+0x47/0x80
    quarantine_reduce+0x159/0x190
    kasan_kmalloc+0xaf/0xc0
    kasan_slab_alloc+0x12/0x20
    kmem_cache_alloc+0xfa/0x360
    ? getname_flags+0x4f/0x1f0
    getname_flags+0x4f/0x1f0
    getname+0x12/0x20
    do_sys_open+0xf9/0x210
    SyS_open+0x1e/0x20
    entry_SYSCALL_64_fastpath+0x1f/0xc2

    The code path might be called in either atomic or non-atomic context, but
    in_atomic() can't tell if the current context is atomic or not on a
    PREEMPT=n kernel, so cond_resched() can't be used to prevent the
    softlockup.

    Utilize the global free list to shorten the loop execution time.

    [ tglx: Massaged changelog ]

    Suggested-by: Thomas Gleixner
    Signed-off-by: Yang Shi
    Signed-off-by: Thomas Gleixner
    Cc: longman@redhat.com
    Link: https://lkml.kernel.org/r/1517872708-24207-5-git-send-email-yang.shi@linux.alibaba.com

    Yang Shi
     
  • The newly added global free list makes it possible to avoid lengthy
    pool_list iterations in free_obj_work() by putting objects either into the
    pool list when the fill level of the pool is below the maximum or by
    putting them on the global free list immediately.

    As the pool is now guaranteed to never exceed the maximum fill level, the
    batch removal from the pool list in free_obj_work() can be removed.

    Split free_object() into two parts, so the actual queueing function can be
    reused without invoking schedule_work() on every invocation.
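
    The split described above can be sketched with plain counters. This is a
    simplified model under assumed names (toy_queue_object, toy_free_object,
    TOY_POOL_MAX), not the kernel implementation, and it omits locking.

```c
#include <stdbool.h>

#define TOY_POOL_MAX 4          /* hypothetical maximum fill level */

static int toy_pool_free;       /* objects on the pool list */
static int toy_on_free_list;    /* objects on the global free list */
static int toy_work_scheduled;  /* how often the worker was kicked */

/* Queue one object; return true when the worker should be scheduled. */
static bool toy_queue_object(void)
{
    if (toy_pool_free < TOY_POOL_MAX) {
        toy_pool_free++;
        return false;
    }
    toy_on_free_list++;
    return true;
}

/* Wrapper for single frees: queue first, schedule work only if needed. */
static void toy_free_object(void)
{
    if (toy_queue_object())
        toy_work_scheduled++;
}
```

    A caller that frees many objects in a loop can use the queueing function
    directly and schedule the worker once at the end.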

    [ tglx: Remove the batch removal from pool list and massage changelog ]

    Suggested-by: Thomas Gleixner
    Signed-off-by: Yang Shi
    Signed-off-by: Thomas Gleixner
    Cc: longman@redhat.com
    Link: https://lkml.kernel.org/r/1517872708-24207-4-git-send-email-yang.shi@linux.alibaba.com

    Yang Shi
     
  • free_object() adds objects to the pool list and schedules work when the
    pool list is larger than the pool size. The worker handles the actual
    kfree() of the object by iterating the pool list until the pool size is
    below the maximum pool size again.

    To iterate the pool list, pool_lock has to be held and the objects which
    should be freed() need to be put into temporary storage so pool_lock can be
    dropped for the actual kmem_cache_free() invocation. That's a pointless and
    expensive exercise if there is a large number of objects to free.

    In such a case it's better to evaluate the fill level of the pool in
    free_object() and queue the object to free either on the pool list or, if
    the pool is full, on a separate global free list.

    The worker can then do the following simpler operation:

    - Move objects back from the global free list to the pool list if the
    pool list is no longer full.

    - Remove the remaining objects in a single list move operation from the
    global free list and do the kmem_cache_free() operation lockless from
    the temporary list head.

    In fill_pool() the global free list is checked as well to avoid real
    allocations from the kmem cache.
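
    The two worker steps listed above can be sketched in userspace with
    singly linked lists. The names (toy_obj, toy_free_worker, TOY_POOL_MAX)
    are hypothetical and locking is only indicated in comments.

```c
#include <stdlib.h>

#define TOY_POOL_MAX 2                 /* hypothetical maximum fill level */

struct toy_obj { struct toy_obj *next; };

static struct toy_obj *toy_pool;       /* pool list head */
static struct toy_obj *toy_free_list;  /* global free list head */
static int toy_pool_count;

/* Returns the number of objects actually released. */
static int toy_free_worker(void)
{
    struct toy_obj *to_free, *obj;
    int freed = 0;

    /* Step 1 (would run under the pool lock): refill the pool. */
    while (toy_free_list && toy_pool_count < TOY_POOL_MAX) {
        obj = toy_free_list;
        toy_free_list = obj->next;
        obj->next = toy_pool;
        toy_pool = obj;
        toy_pool_count++;
    }

    /* Step 2: detach everything that is left in a single move. */
    to_free = toy_free_list;
    toy_free_list = NULL;

    /* Step 3 (lock would be dropped here): release the detached objects. */
    while (to_free) {
        obj = to_free;
        to_free = obj->next;
        free(obj);
        freed++;
    }
    return freed;
}
```

    Only the pointer swap in step 2 needs the lock, so the time spent with
    the lock held no longer grows with the number of objects to free.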

    Add the necessary list head and a counter for the number of objects on the
    global free list and export that counter via debugfs:

    max_chain     :79
    max_loops     :8147
    warnings      :0
    fixups        :0
    pool_free     :1697
    pool_min_free :346
    pool_used     :15356
    pool_max_used :23933
    on_free_list  :39
    objs_allocated:32617
    objs_freed    :16588

    Nothing queues objects on the global free list yet. This happens in a
    follow up change.

    [ tglx: Simplified implementation and massaged changelog ]

    Suggested-by: Thomas Gleixner
    Signed-off-by: Yang Shi
    Signed-off-by: Thomas Gleixner
    Cc: longman@redhat.com
    Link: https://lkml.kernel.org/r/1517872708-24207-3-git-send-email-yang.shi@linux.alibaba.com

    Yang Shi
     
  • __debug_check_no_obj_freed() can be an expensive operation depending on the
    size of memory freed. It already exports the maximum chain walk length via
    debugfs, but this only records the maximum of a single memory chunk.

    However, there is no information about the total number of objects
    inspected for a __debug_check_no_obj_freed() operation, which might be
    significantly larger when a huge memory region is freed.

    Aggregate the number of objects inspected for a single invocation of
    __debug_check_no_obj_freed() and export it via debugfs.

    The resulting output of /sys/kernel/debug/debug_objects/stats looks like:

    max_chain     :121
    max_checked   :543267
    warnings      :0
    fixups        :0
    pool_free     :1764
    pool_min_free :341
    pool_used     :86438
    pool_max_used :268887
    objs_allocated:6068254
    objs_freed    :5981076
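
    The difference between the two counters can be sketched as follows. The
    function and variable names are hypothetical; chain_lens[i] stands for
    the number of objects walked in the hash bucket of chunk i.

```c
#include <stddef.h>

static int toy_max_chain;    /* longest single bucket walk seen so far */
static int toy_max_checked;  /* most objects inspected in one invocation */

/* Model one invocation that covers nchunks memory chunks. */
static void toy_check_region(const int *chain_lens, size_t nchunks)
{
    int checked = 0;

    for (size_t i = 0; i < nchunks; i++) {
        if (chain_lens[i] > toy_max_chain)
            toy_max_chain = chain_lens[i];
        checked += chain_lens[i];   /* aggregate across all chunks */
    }
    if (checked > toy_max_checked)
        toy_max_checked = checked;
}
```

    A region spanning many chunks can therefore show a modest max_chain
    while max_checked reveals the real cost of the whole call.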

    [ tglx: Renamed the variable to max_checked and adjusted changelog ]

    Signed-off-by: Yang Shi
    Signed-off-by: Thomas Gleixner
    Cc: longman@redhat.com
    Link: https://lkml.kernel.org/r/1517872708-24207-2-git-send-email-yang.shi@linux.alibaba.com

    Yang Shi
     

14 Aug, 2017

1 commit

  • The allocated debug objects are either on the free list or in the
    hashed bucket lists. So they won't get lost. However if both debug
    objects and kmemleak are enabled and kmemleak scanning is done
    while some of the debug objects are transitioning from one list to
    the others, false positive reporting of memory leaks may happen for
    those objects. For example,

    [38687.275678] kmemleak: 12 new suspected memory leaks (see
    /sys/kernel/debug/kmemleak)
    unreferenced object 0xffff92e98aabeb68 (size 40):
    comm "ksmtuned", pid 4344, jiffies 4298403600 (age 906.430s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 d0 bc db 92 e9 92 ff ff ................
    01 00 00 00 00 00 00 00 38 36 8a 61 e9 92 ff ff ........86.a....
    backtrace:
    kmemleak_alloc+0x4a/0xa0
    kmem_cache_alloc+0xe9/0x320
    __debug_object_init+0x3e6/0x400
    debug_object_activate+0x131/0x210
    __call_rcu+0x3f/0x400
    call_rcu_sched+0x1d/0x20
    put_object+0x2c/0x40
    __delete_object+0x3c/0x50
    delete_object_full+0x1d/0x20
    kmemleak_free+0x32/0x80
    kmem_cache_free+0x77/0x350
    unlink_anon_vmas+0x82/0x1e0
    free_pgtables+0xa1/0x110
    exit_mmap+0xc1/0x170
    mmput+0x80/0x150
    do_exit+0x2a9/0xd20

    The references in the debug objects may also hide a real memory leak.

    As there is no point in having kmemleak to track debug object
    allocations, kmemleak checking is now disabled for debug objects.

    Signed-off-by: Waiman Long
    Signed-off-by: Thomas Gleixner
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/1502718733-8527-1-git-send-email-longman@redhat.com

    Waiman Long
     

02 Mar, 2017

1 commit


10 Feb, 2017

1 commit

  • As suggested by Ingo, the debug_objects_alloc counter is now renamed to
    debug_objects_allocated, with minor tweaks to the comment and debug output.

    Signed-off-by: Waiman Long
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1486503630-1501-1-git-send-email-longman@redhat.com
    Signed-off-by: Ingo Molnar

    Waiman Long
     

06 Feb, 2017

1 commit

  • On a large SMP system with many CPUs, the global pool_lock may become
    a performance bottleneck as all the CPUs that need to allocate or
    free debug objects have to take the lock. That can sometimes cause
    soft lockups like:

    NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21]
    ...
    RIP: 0010:_raw_spin_unlock_irqrestore+0x3b/0x60
    ...
    Call Trace:
    free_object+0x81/0xb0
    debug_check_no_obj_freed+0x193/0x220
    ? trace_hardirqs_on_caller+0xf9/0x1c0
    ? file_free_rcu+0x36/0x60
    kmem_cache_free+0xd2/0x380
    ? fput+0x90/0x90
    file_free_rcu+0x36/0x60
    rcu_nocb_kthread+0x1b3/0x550
    ? rcu_nocb_kthread+0x101/0x550
    ? sync_exp_work_done.constprop.63+0x50/0x50
    kthread+0x101/0x120
    ? trace_hardirqs_on_caller+0xf9/0x1c0
    ret_from_fork+0x22/0x50

    To reduce the amount of contention on the pool_lock, the actual
    kmem_cache_free() of the debug objects will be delayed if the pool_lock
    is busy. This will temporarily increase the number of free objects
    available in the free pool when the system is busy. As a result,
    the number of kmem_cache allocations and frees is reduced.

    To further reduce the lock operations, free debug objects in batches of
    four.

    Signed-off-by: Waiman Long
    Cc: Christian Borntraeger
    Cc: "Du Changbin"
    Cc: Andrew Morton
    Cc: Jan Stancek
    Link: http://lkml.kernel.org/r/1483647425-4135-4-git-send-email-longman@redhat.com
    Signed-off-by: Thomas Gleixner

    Waiman Long
     

04 Feb, 2017

2 commits

  • On large SMP systems with hundreds of CPUs, the current thresholds
    for allocating and freeing debug objects (256 and 1024 respectively)
    may not work well. This can cause a lot of needless calls to
    kmem_cache_alloc() and kmem_cache_free() on those systems.

    To alleviate this thrashing problem, the object freeing threshold
    is now increased to "1024 + # of CPUs * 32", and the object
    allocation threshold is increased to "256 + # of CPUs * 4". That
    should make the debug objects subsystem scale better with the number
    of CPUs available in the system.
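
    The two formulas quoted above, written out as a small sketch. The
    function names are hypothetical; only the constants come from the
    changelog.

```c
/* Free objects back to the kmem cache once the pool exceeds this size. */
static int toy_free_threshold(int ncpus)
{
    return 1024 + ncpus * 32;
}

/* Refill the pool from the kmem cache once it drops below this size. */
static int toy_alloc_threshold(int ncpus)
{
    return 256 + ncpus * 4;
}
```

    For a machine with 100 CPUs this gives a freeing threshold of 4224 and an
    allocation threshold of 656, compared to the fixed 1024 and 256 before.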

    Signed-off-by: Waiman Long
    Cc: Christian Borntraeger
    Cc: "Du Changbin"
    Cc: Andrew Morton
    Cc: Jan Stancek
    Link: http://lkml.kernel.org/r/1483647425-4135-3-git-send-email-longman@redhat.com
    Signed-off-by: Thomas Gleixner

    Waiman Long
     
  • New debugfs stat counters are added to track the numbers of
    kmem_cache_alloc() and kmem_cache_free() function calls to get a
    sense of how the internal debug objects cache management is performing.

    Signed-off-by: Waiman Long
    Cc: Christian Borntraeger
    Cc: "Du Changbin"
    Cc: Andrew Morton
    Cc: Jan Stancek
    Link: http://lkml.kernel.org/r/1483647425-4135-2-git-send-email-longman@redhat.com
    Signed-off-by: Thomas Gleixner

    Waiman Long
     

14 Dec, 2016

1 commit

  • Pull workqueue updates from Tejun Heo:
    "Mostly patches to initialize workqueue subsystem earlier and get rid
    of keventd_up().

    The patches were headed for the last merge cycle but got delayed due
    to a bug found at the last minute, which is fixed now.

    Also, to help debugging, destroy_workqueue() is more chatty now on a
    sanity check failure."

    * 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: move wq_numa_init() to workqueue_init()
    workqueue: remove keventd_up()
    debugobj, workqueue: remove keventd_up() usage
    slab, workqueue: remove keventd_up() usage
    power, workqueue: remove keventd_up() usage
    tty, workqueue: remove keventd_up() usage
    mce, workqueue: remove keventd_up() usage
    workqueue: make workqueue available early during boot
    workqueue: dump workqueue state on sanity check failures in destroy_workqueue()

    Linus Torvalds
     

01 Dec, 2016

1 commit

  • Drivers, or other modules, that use a mixture of objects (especially
    objects embedded within other objects) would like to take advantage of
    the debugobjects facilities to help catch misuse. Currently, the
    debugobjects interface is only available to builtin drivers and requires
    a set of EXPORT_SYMBOL_GPL for use by modules.

    I am using the debugobjects in i915.ko to try and catch some invalid
    operations on embedded objects. The problem currently only presents
    itself across module unload so forcing i915 to be builtin is not an
    option.

    Link: http://lkml.kernel.org/r/20161122143039.6433-1-chris@chris-wilson.co.uk
    Signed-off-by: Chris Wilson
    Cc: "Du, Changbin"
    Cc: Thomas Gleixner
    Cc: Christian Borntraeger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Wilson
     

18 Sep, 2016

1 commit


20 May, 2016

3 commits

  • When activating a static object we need to make sure that the object is
    tracked in the object tracker. If it is a non-static object then the
    activation is illegal.

    In the previous implementation, each subsystem needed to take care of this
    in its fixup callbacks. This can be moved into the debugobjects core, which
    removes duplicated code and leaves *pure* fixup callbacks.

    To achieve this, a new callback "is_static_object" is introduced to let
    the type specific code decide whether an object is static or not. If
    yes, we take it into the object tracker, otherwise we give a warning and
    invoke the fixup callback.

    This change has passed the debugobjects selftest, and I also did some
    testing with all debugobjects support enabled.

    Finally, I have a concern about the fixups: can a fixup safely change an
    object which is in an incorrect state? The 'addr' may not point to any
    valid object if a non-static object is not tracked. Changing such an
    object can overwrite someone else's memory and cause unexpected
    behaviour. For example, timer_fixup_activate binds the timer to the
    function stub_timer.
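
    A simplified userspace model of that decision is sketched below. The
    descriptor layout, the toy_timer type and all function names are
    hypothetical; the real core also records the object and prints a warning.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_descr {
    const char *name;
    bool (*is_static_object)(void *addr);  /* optional */
    bool (*fixup_activate)(void *addr);    /* optional */
};

enum toy_result { TOY_TRACKED_STATIC, TOY_FIXED, TOY_NOT_FIXED };

/* Example type: a timer that knows whether it was statically initialized. */
struct toy_timer { bool statically_initialized; };

static bool toy_timer_is_static(void *addr)
{
    return ((struct toy_timer *)addr)->statically_initialized;
}

static bool toy_timer_fixup(void *addr)
{
    (void)addr;
    return true;   /* pretend the object could be repaired */
}

static const struct toy_descr toy_timer_descr = {
    "toy_timer", toy_timer_is_static, toy_timer_fixup
};

/* Called when an object that is not yet tracked gets activated. */
static enum toy_result toy_activate_untracked(void *addr, const struct toy_descr *d)
{
    if (d->is_static_object && d->is_static_object(addr))
        return TOY_TRACKED_STATIC;   /* core starts tracking it, no warning */

    /* Not static: this is a bug. Warn (omitted) and let the type repair it. */
    if (d->fixup_activate && d->fixup_activate(addr))
        return TOY_FIXED;
    return TOY_NOT_FIXED;
}
```

    The fixup callback no longer has to ask whether the object is static; by
    the time it runs, the core has already answered that question.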

    Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com
    [changbin.du@intel.com: improve code comments where invoke the new is_static_object callback]
    Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com
    Signed-off-by: Du, Changbin
    Cc: Jonathan Corbet
    Cc: Josh Triplett
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Cc: Tejun Heo
    Cc: Christian Borntraeger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Du, Changbin
     
  • debug_object_fixup() returns non-zero when the problem has been fixed,
    but the code got it backwards and took 0 as a successful fixup. Fix
    it.

    Signed-off-by: Du, Changbin
    Cc: Jonathan Corbet
    Cc: Josh Triplett
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Cc: Tejun Heo
    Cc: Christian Borntraeger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Du, Changbin
     
  • I am going to introduce the debugobjects infrastructure to the USB
    subsystem. Before doing so, I found that the debugobjects code could be
    improved. This patchset makes the fixup functions return bool instead of
    int, because a fixup only needs to report success or failure; boolean is
    the 'real' type.

    This patch (of 7):

    The object debugging infrastructure core provides some fixup callbacks
    for the subsystems which use it. These callbacks are called from the debug
    code whenever a problem in debug_object_init is detected. The
    debugobjects core expects them to return 1 when the fixup was successful,
    otherwise 0. So the return type is effectively boolean.

    A bad thing is that debug_object_fixup uses the return value in an
    arithmetic operation, which made it unclear what the real return type
    is.

    Reading over the whole code, I found some places that use the return value
    incorrectly (see next patch). So why not use the bool type instead?

    Signed-off-by: Du, Changbin
    Cc: Jonathan Corbet
    Cc: Josh Triplett
    Cc: Steven Rostedt
    Cc: Thomas Gleixner
    Cc: Tejun Heo
    Cc: Christian Borntraeger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Du, Changbin
     

27 Jan, 2016

1 commit

  • On my bigger s390 systems I always get "Out of memory.
    ODEBUG disabled". Since the number of objects is needed at
    compile time, we can not change the size dynamically before
    the caches etc are available. Doubling the size seems to
    do the trick. Since it is init data it will be freed anyway,
    this should be ok.

    Signed-off-by: Christian Borntraeger
    Link: http://lkml.kernel.org/r/1453905478-13409-1-git-send-email-borntraeger@de.ibm.com
    Signed-off-by: Thomas Gleixner

    Christian Borntraeger
     

05 Jun, 2014

3 commits


13 Nov, 2013

1 commit


19 Aug, 2013

1 commit

  • In order to better respond to things like duplicate invocations
    of call_rcu(), RCU needs to see the status of a call to
    debug_object_activate(). This would allow RCU to leak the callback in
    order to avoid adding freelist-reuse mischief to the duplicate invocations.
    This commit therefore makes debug_object_activate() return status,
    zero for success and -EINVAL for failure.
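
    How a caller might use the new return value is sketched below with
    hypothetical names (toy_head, toy_object_activate, toy_call). It models
    only the duplicate-queueing case and is not RCU's actual code.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_head { bool queued; };

static int toy_queue_len;

/* Returns 0 on success and -EINVAL if the object is already active. */
static int toy_object_activate(struct toy_head *h)
{
    if (h->queued)
        return -EINVAL;
    h->queued = true;
    return 0;
}

/* Caller in the style of call_rcu(): refuse to queue a duplicate. */
static void toy_call(struct toy_head *h)
{
    if (toy_object_activate(h))
        return;   /* duplicate: deliberately leak instead of corrupting the list */
    toy_queue_len++;
}
```

    Leaking the duplicate is the lesser evil here, since queueing the same
    head twice would corrupt the callback list.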

    Signed-off-by: Paul E. McKenney
    Cc: Mathieu Desnoyers
    Cc: Sedat Dilek
    Cc: Davidlohr Bueso
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: Linus Torvalds
    Tested-by: Sedat Dilek
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

28 Feb, 2013

1 commit

  • I'm not sure why, but the hlist for each entry iterators were conceived
    differently from the list ones. The list iterator takes three arguments:

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not only
    do they not really need it, it also prevents the iterator from looking
    exactly like the list iterator, which is unfortunate.
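
    A three-argument hlist iterator can be approximated in userspace as
    sketched below. This is a simplified illustration, not the macro from
    linux/list.h: the node type has no pprev pointer, the toy_ names are
    hypothetical, and it relies on the GCC/Clang __typeof__ extension.

```c
#include <stddef.h>

struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Three-argument form: the node cursor is hidden inside the macro. */
#define toy_hlist_for_each_entry(pos, head, member)                        \
    for (struct hlist_node *cur_ = (head)->first;                          \
         cur_ && ((pos) = toy_container_of(cur_, __typeof__(*(pos)),       \
                                           member), 1);                    \
         cur_ = cur_->next)

struct toy_item {
    int value;
    struct hlist_node link;
};

/* Sum all values on the list using only (pos, head, member). */
static int toy_sum(struct hlist_head *head)
{
    struct toy_item *item;
    int sum = 0;

    toy_hlist_for_each_entry(item, head, link)
        sum += item->value;
    return sum;
}
```

    The call site now reads exactly like a list_for_each_entry() loop, with
    no separate node variable to declare.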

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small number of places were using the 'node' parameter; these
    were modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foudnation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     

18 Apr, 2012

1 commit

  • There was a return missed in 1fda107d44 "debugobjects: Remove unused
    return value from fill_pool()". It makes gcc complain:

    lib/debugobjects.c: In function ‘fill_pool’:
    lib/debugobjects.c:98:4: warning: ‘return’ with a value, in
    function returning void [enabled by default]

    Signed-off-by: Dan Carpenter
    Link: http://lkml.kernel.org/r/20120418112810.GA2669@elgon.mountain
    Signed-off-by: Thomas Gleixner

    Dan Carpenter
     

11 Apr, 2012

2 commits


06 Mar, 2012

1 commit

  • debugobjects is now printing a warning when a fixup for a NOTAVAILABLE
    object is run. This causes the selftest to fail like:

    ODEBUG: selftest warnings failed 4 != 5

    We could just increase the number of warnings that the selftest is
    expecting to see because that is actually what has changed. But, it turns
    out that fixup_activate() was written with inverted logic and thus a fixup
    for a static object returned 1 indicating the object had been fixed, and 0
    otherwise. Fix the logic to be correct and update the counts to reflect
    that nothing needed fixing for a static object.

    Signed-off-by: Stephen Boyd
    Reported-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     

24 Nov, 2011

2 commits

  • Calling del_timer_sync() on an uninitialized timer leads to a
    never ending loop in lock_timer_base() that spins checking for a
    non-NULL timer base. Add an assertion to debugobjects to catch
    usage of uninitialized objects so that we can initialize timers
    in the del_timer_sync() path before it calls lock_timer_base().

    [ sboyd@codeaurora.org: Clarify commit message ]

    Signed-off-by: Christine Chan
    Signed-off-by: Stephen Boyd
    Cc: John Stultz
    Link: http://lkml.kernel.org/r/1320724108-20788-3-git-send-email-sboyd@codeaurora.org
    Signed-off-by: Andrew Morton
    Signed-off-by: Thomas Gleixner

    Christine Chan
     
  • Make debugobjects use the return code from the fixup function. That
    allows us better diagnostics in the activate check than relying on a
    WARN_ON() in the object specific code.

    [ tglx@linutronix.de: Split out the debugobjects vs. the timer change ]

    Signed-off-by: Stephen Boyd
    Cc: Christine Chan
    Cc: John Stultz
    Signed-off-by: Andrew Morton
    Link: http://lkml.kernel.org/r/1320724108-20788-2-git-send-email-sboyd@codeaurora.org
    Signed-off-by: Thomas Gleixner

    Stephen Boyd
     

20 Jun, 2011

1 commit

  • The order of initialization looks like this:
    ...
    debugobjects
    kmemleak
    ...(lots of other subsystems)...
    workqueues (through early initcall)
    ...

    debugobjects uses schedule_work for batch freeing of its data and kmemleak
    heavily uses debugobjects, so when objects are freed before workqueues
    are initialized, the kernel crashes:

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: __queue_work+0x29/0x41a
    queue_work_on+0x16/0x1d
    queue_work+0x29/0x55
    schedule_work+0x13/0x15
    free_object+0x90/0x95
    debug_check_no_obj_freed+0x187/0x1d3
    ? _raw_spin_unlock_irqrestore+0x30/0x4d
    ? free_object_rcu+0x68/0x6d
    kmem_cache_free+0x64/0x12c
    free_object_rcu+0x68/0x6d
    __rcu_process_callbacks+0x1b6/0x2d9
    ...

    because system_wq is NULL.

    Fix it by checking whether the workqueue subsystem has been initialized
    before using it.
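
    The shape of such a guard is sketched below. The flag and function names
    are hypothetical; the changelog does not say which readiness check the
    kernel used at this point.

```c
#include <stdbool.h>

static bool toy_wq_ready;   /* set once the workqueue subsystem is up */
static int toy_scheduled;   /* free work handed to the workqueue */
static int toy_deferred;    /* requests that arrived too early */

/* Only touch the workqueue after it has been initialized. */
static void toy_schedule_free_work(void)
{
    if (!toy_wq_ready) {
        toy_deferred++;     /* too early: leave the objects in the pool */
        return;
    }
    toy_scheduled++;
}
```

    Objects that cannot be handed to the worker yet simply stay in the pool
    until a later free triggers the work item.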

    Signed-off-by: Marcin Slusarz
    Cc: Catalin Marinas
    Cc: Tejun Heo
    Cc: Dipankar Sarma
    Cc: Paul E. McKenney
    Cc: stable@kernel.org
    Link: http://lkml.kernel.org/r/20110528112342.GA3068@joi.lan
    Signed-off-by: Thomas Gleixner

    Marcin Slusarz
     

08 Mar, 2011

1 commit

  • In complex subsystems like mac80211 structures can contain several
    timers and work structs, so identifying a specific instance from the
    call trace and object type output of debugobjects can be hard.

    Allow the subsystems which support debugobjects to provide a hint
    function. This function returns a pointer to a kernel address
    (preferably the object's callback function) which is printed along
    with the debugobjects type.
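
    A hint callback can be sketched as an optional function pointer in the
    type descriptor. The types and names below are hypothetical and the
    callback returns a function pointer rather than a raw address to stay
    within standard C.

```c
#include <stddef.h>

typedef void (*toy_fn)(void);

struct toy_descr {
    const char *name;
    toy_fn (*debug_hint)(void *addr);   /* optional */
};

/* Example object type: a timer with a callback function. */
struct toy_timer { toy_fn function; };

static void toy_timeout(void) { }

/* The hint for a timer is the callback it would run. */
static toy_fn toy_timer_hint(void *addr)
{
    return ((struct toy_timer *)addr)->function;
}

static const struct toy_descr toy_timer_descr = { "toy_timer", toy_timer_hint };

/* Core side: ask the type for a hint if it provides one. */
static toy_fn toy_get_hint(void *addr, const struct toy_descr *d)
{
    return d->debug_hint ? d->debug_hint(addr) : NULL;
}
```

    Printing the hint next to the type name tells the reader which of several
    timers embedded in one structure triggered the report.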

    Add hint methods for timer_list, work_struct and hrtimer.

    [ tglx: Massaged changelog, made it compile ]

    Signed-off-by: Stanislaw Gruszka
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Stanislaw Gruszka
     

18 May, 2010

2 commits

  • * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (24 commits)
    rcu: remove all rcu head initializations, except on_stack initializations
    rcu head introduce rcu head init on stack
    Debugobjects transition check
    rcu: fix build bug in RCU_FAST_NO_HZ builds
    rcu: RCU_FAST_NO_HZ must check RCU dyntick state
    rcu: make SRCU usable in modules
    rcu: improve the RCU CPU-stall warning documentation
    rcu: reduce the number of spurious RCU_SOFTIRQ invocations
    rcu: permit discontiguous cpu_possible_mask CPU numbering
    rcu: improve RCU CPU stall-warning messages
    rcu: print boot-time console messages if RCU configs out of ordinary
    rcu: disable CPU stall warnings upon panic
    rcu: enable CPU_STALL_VERBOSE by default
    rcu: slim down rcutiny by removing rcu_scheduler_active and friends
    rcu: refactor RCU's context-switch handling
    rcu: rename rcutiny rcu_ctrlblk to rcu_sched_ctrlblk
    rcu: shrink rcutiny by making synchronize_rcu_bh() be inline
    rcu: fix now-bogus rcu_scheduler_active comments.
    rcu: Fix bogus CONFIG_PROVE_LOCKING in comments to reflect reality.
    rcu: ignore offline CPUs in last non-dyntick-idle CPU check
    ...

    Linus Torvalds
     
  • …/kernel/git/tip/linux-2.6-tip

    * 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    debugobjects: Section mismatch cleanup

    Linus Torvalds
     

11 May, 2010

1 commit

  • Implement a basic state machine checker in the debugobjects.

    This state machine checker detects races and inconsistencies within the "active"
    life of a debugobject. The checker only keeps track of the current state; all
    the state machine logic is kept at the object instance level.

    The checker works by adding a supplementary "unsigned int astate" field to the
    debug_obj structure. It keeps track of the current "active state" of the object.

    The only constraints that are imposed on the states by the debugobjects
    system are that:

    - activation of an object sets the current active state to 0,
    - deactivation of an object expects the current active state to be 0.

    For the rest of the states, the state mapping is determined by the specific
    object instance. Therefore, the logic keeping track of the state machine is
    within the specialized instance, without any need to know about it at the
    debugobject level.

    The current object active state is changed by calling:

    debug_object_active_state(addr, descr, expect, next)

    where "expect" is the expected state and "next" is the next state to move to if
    the expected state is found. A warning is generated if the expected state is
    not found.
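
    The checker's behaviour can be modelled in a few lines. This is a
    userspace sketch with hypothetical names; it drops the addr/descr lookup
    and counts warnings instead of printing them.

```c
#include <stdbool.h>

struct toy_obj {
    bool active;
    unsigned int astate;   /* current "active state" */
};

static int toy_warnings;

/* Activation always resets the active state to 0. */
static void toy_activate(struct toy_obj *o)
{
    o->active = true;
    o->astate = 0;
}

/* Deactivation expects the active state to be back at 0. */
static void toy_deactivate(struct toy_obj *o)
{
    if (o->astate != 0) {
        toy_warnings++;
        return;
    }
    o->active = false;
}

/* Move from "expect" to "next"; warn if the object is in another state. */
static void toy_active_state(struct toy_obj *o, unsigned int expect,
                             unsigned int next)
{
    if (o->active && o->astate == expect)
        o->astate = next;
    else
        toy_warnings++;
}
```

    What the non-zero states mean is left entirely to the object type; the
    checker only compares and stores the numbers.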

    Signed-off-by: Mathieu Desnoyers
    Reviewed-by: Thomas Gleixner
    Acked-by: David S. Miller
    CC: "Paul E. McKenney"
    CC: akpm@linux-foundation.org
    CC: mingo@elte.hu
    CC: laijs@cn.fujitsu.com
    CC: dipankar@in.ibm.com
    CC: josh@joshtriplett.org
    CC: dvhltc@us.ibm.com
    CC: niv@us.ibm.com
    CC: peterz@infradead.org
    CC: rostedt@goodmis.org
    CC: Valdis.Kletnieks@vt.edu
    CC: dhowells@redhat.com
    CC: eric.dumazet@gmail.com
    CC: Alexey Dobriyan
    Signed-off-by: Paul E. McKenney

    Mathieu Desnoyers
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to place the new include so that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo