25 Mar, 2010

1 commit

  • cpuset_mem_spread_node() returns an offline node, and causes an oops.

    This patch fixes it by initializing task->mems_allowed to
    node_states[N_HIGH_MEMORY], and updating task->mems_allowed when doing
    memory hotplug.
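
    The kthread.c side of the fix is small. A minimal sketch of the idea,
    assuming the set_mems_allowed() helper from cpuset.h (the exact call
    site in kthreadd() is our reading of the description, not a quote of
    the patch):

    /* start kernel threads with every memory-bearing node allowed,
     * rather than a mask that may name offline nodes */
    set_mems_allowed(node_states[N_HIGH_MEMORY]);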

    Signed-off-by: Miao Xie
    Acked-by: David Rientjes
    Reported-by: Nick Piggin
    Tested-by: Nick Piggin
    Cc: Paul Menage
    Cc: Li Zefan
    Cc: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
     

03 Nov, 2009

1 commit

  • Eric Paris reported that commit
    f685ceacab07d3f6c236f04803e2f2f0dbcc5afb causes boot time
    PREEMPT_DEBUG complaints.

    [ 4.590699] BUG: using smp_processor_id() in preemptible [00000000] code: rmmod/1314
    [ 4.593043] caller is task_hot+0x86/0xd0

    Since kthread_bind() messes with scheduler internals, move the
    body to sched.c, and lock the runqueue.
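
    A sketch of the shape this takes, assuming the scheduler internals of
    that era (cpu_rq(), set_task_cpu(), rq->lock); illustrative, not the
    literal diff:

    void kthread_bind(struct task_struct *p, unsigned int cpu)
    {
            struct rq *rq = cpu_rq(cpu);
            unsigned long flags;

            /* the kthread must already be sleeping before we re-place it */
            if (!wait_task_inactive(p, TASK_UNINTERRUPTIBLE)) {
                    WARN_ON(1);
                    return;
            }

            spin_lock_irqsave(&rq->lock, flags);
            set_task_cpu(p, cpu);
            p->cpus_allowed = cpumask_of_cpu(cpu);
            p->flags |= PF_THREAD_BOUND;
            spin_unlock_irqrestore(&rq->lock, flags);
    }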

    Reported-by: Eric Paris
    Signed-off-by: Mike Galbraith
    Tested-by: Eric Paris
    Cc: Peter Zijlstra
    LKML-Reference:
    [ v2: fix !SMP build and clean up ]
    Signed-off-by: Ingo Molnar

    Mike Galbraith
     

28 Jul, 2009

1 commit

  • Commit 63706172f332fd3f6e7458ebfb35fa6de9c21dc5 ("kthreads: rework
    kthread_stop()") removed the limitation that the thread function must
    not call do_exit() itself, but forgot to update the comment.

    Since that commit it is OK to use kthread_stop() even if kthread can
    exit itself.
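
    In practice, a thread function like the following is now a safe partner
    for kthread_stop(); a minimal sketch (my_fatal_error() is a hypothetical
    condition):

    #include <linux/kthread.h>
    #include <linux/delay.h>

    static int my_thread_fn(void *data)
    {
            while (!kthread_should_stop()) {
                    if (my_fatal_error())   /* hypothetical */
                            return -EIO;    /* exiting on our own is OK now */
                    msleep_interruptible(1000);
            }
            return 0;                       /* we were asked to stop */
    }

    Either way, kthread_stop() returns the thread function's return value.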

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Rusty Russell
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

19 Jun, 2009

2 commits

  • Based on Eric's patch which in turn was based on my patch.

    kthread_stop() has several nasty problems:

    - it runs unpredictably long with the global semaphore held.

    - it deadlocks if kthread itself does kthread_stop() before it obeys
    the kthread_should_stop() request.

    - it is not usable if the kthread exits on its own, see for example the
    ugly "wait_to_die:" hack in migration_thread()

    - it is not possible to just tell the kthread it should stop; we must
    always wait for its exit.

    With this patch kthread() allocates all necessary data (struct kthread) on
    its own stack, and the kthread_stop_xxx globals are deleted. ->vfork_done
    is used as a pointer into "struct kthread", which means kthread_stop() can
    easily wait for the kthread's exit.
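
    Roughly, the new layout looks like this (a simplified sketch of the
    approach, not the complete code):

    struct kthread {
            int should_stop;
            struct completion exited;
    };

    #define to_kthread(tsk) \
            container_of((tsk)->vfork_done, struct kthread, exited)

    static int kthread(void *_create)
    {
            struct kthread self;    /* lives on this thread's own stack */

            self.should_stop = 0;
            init_completion(&self.exited);
            current->vfork_done = &self.exited; /* kthread_stop() finds us */

            /* ... sleep until the first wakeup, run threadfn(), then
             * do_exit(ret), which completes &self.exited ... */
    }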

    Signed-off-by: Oleg Nesterov
    Cc: Christoph Hellwig
    Cc: "Eric W. Biederman"
    Cc: Ingo Molnar
    Cc: Pavel Emelyanov
    Cc: Rusty Russell
    Cc: Vitaliy Gusev
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • We use two completions to create the kernel thread, which is a bit ugly.
    kthread() wakes up create_kthread() via ->started, then create_kthread()
    wakes up the caller kthread_create() via ->done. But create_kthread() does
    not need to wait for kthread(); it can just return. Instead, kthread()
    itself can wake up the caller of kthread_create().

    Kill kthread_create_info->started; ->done is enough. This improves the
    scalability a bit and simplifies the code.

    The only problem is if kernel_thread() fails: in that case create_kthread()
    must do complete(&create->done).
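
    The failure path is then the only place where create_kthread() completes
    ->done itself; sketched from the description:

    static void create_kthread(struct kthread_create_info *create)
    {
            int pid;

            pid = kernel_thread(kthread, create,
                                CLONE_FS | CLONE_FILES | SIGCHLD);
            if (pid < 0) {
                    create->result = ERR_PTR(pid);
                    complete(&create->done);    /* kthread() will never run */
            }
            /* on success, kthread() itself wakes the kthread_create() caller */
    }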

    Signed-off-by: Oleg Nesterov
    Cc: Christoph Hellwig
    Cc: "Eric W. Biederman"
    Cc: Ingo Molnar
    Cc: Pavel Emelyanov
    Cc: Rusty Russell
    Cc: Vitaliy Gusev
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

17 Jun, 2009

1 commit

  • Fix allocating page cache/slab objects on an unallowed node when memory
    spread is set, by updating tasks' mems_allowed after their cpuset's mems
    is changed.

    In order to update tasks' mems_allowed in time, we must modify the memory
    policy code, because memory policy was originally applied only in the
    process's own context. After this patch, one task can directly manipulate
    another's mems_allowed, and we use alloc_lock in the task_struct to
    protect the task's mems_allowed and memory policy.

    In the fast path, however, we do not take the lock, because doing so could
    cause a performance regression. Without the lock, the task might
    momentarily see an empty nodemask while its cpuset's mems_allowed is being
    changed to a non-overlapping set. To avoid this, we first set all newly
    allowed nodes, then clear the newly disallowed ones.
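
    A sketch of that ordering (grow the mask first, shrink it second, so a
    lockless reader never observes an empty mask):

    static void change_task_nodemask(struct task_struct *tsk,
                                     const nodemask_t *newmems)
    {
            /* step 1: allow the union of the old and new nodes */
            nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);

            /* (rebind the task's memory policy to the enlarged mask here) */

            /* step 2: drop the nodes that are no longer allowed */
            tsk->mems_allowed = *newmems;
    }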

    [lee.schermerhorn@hp.com:
    The rework of mpol_new() to extract the adjusting of the node mask to
    apply cpuset and mpol flags "context" breaks set_mempolicy() and mbind()
    with MPOL_PREFERRED and a NULL nodemask--i.e., explicit local
    allocation. Fix this by adding the check for MPOL_PREFERRED and an empty
    node mask to mpol_new_mempolicy().

    Remove the now unneeded 'nodes = NULL' from mpol_new().

    Note that mpol_new_mempolicy() is always called with a non-NULL
    'nodes' parameter now that it has been removed from mpol_new().
    Therefore, we don't need to test nodes for NULL before testing it for
    'empty'. However, just to be extra paranoid, add a VM_BUG_ON() to
    verify this assumption.]
    [lee.schermerhorn@hp.com:

    I don't think the function name 'mpol_new_mempolicy' is descriptive
    enough to differentiate it from mpol_new().

    This function applies cpuset set context, usually constraining nodes
    to those allowed by the cpuset. However, when the 'RELATIVE_NODES' flag
    is set, it also translates the nodes. So I settled on
    'mpol_set_nodemask()', because the comment block for mpol_new() mentions
    that we need to call this function to "set nodes".

    Some additional minor line length, whitespace and typo cleanup.]
    Signed-off-by: Miao Xie
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Christoph Lameter
    Cc: Paul Menage
    Cc: Nick Piggin
    Cc: Yasunori Goto
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
     

15 Apr, 2009

2 commits

  • Impact: clean up

    Create a sub directory in include/trace called events to keep the
    trace point headers in their own separate directory. Only headers that
    declare trace points should be defined in this directory.
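
    Tracepoint users then pull the declarations in via paths such as:

    #include <trace/events/sched.h>
    #include <trace/events/irq.h>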

    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Neil Horman
    Cc: Zhao Lei
    Cc: Eduard - Gabriel Munteanu
    Cc: Pekka Enberg
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • This patch lowers the number of places a developer must modify to add
    new tracepoints. The current method to add a new tracepoint
    to an existing system is to write the tracepoint macro in the
    trace header with one of the macros TRACE_EVENT, TRACE_FORMAT or
    DECLARE_TRACE, then add the same-named item into a C file
    with the macro DEFINE_TRACE(name), and then add the tracepoint call
    itself.

    This change cuts out the need to add the DEFINE_TRACE(name).
    Every file that uses the tracepoint must still include the trace/ header
    file, but the one C file must also add a define before including
    that file.

    #define CREATE_TRACE_POINTS
    #include <trace/mytrace.h>

    This will cause the trace/mytrace.h file to also produce the C code
    necessary to implement the trace point.
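
    For concreteness, a hypothetical trace/mytrace.h under this scheme could
    look like the following; the double-include guard and the trailing
    include of define_trace.h are what let CREATE_TRACE_POINTS expand the
    same header a second time into C code:

    #undef TRACE_SYSTEM
    #define TRACE_SYSTEM mytrace

    #if !defined(_TRACE_MYTRACE_H) || defined(TRACE_HEADER_MULTI_READ)
    #define _TRACE_MYTRACE_H

    #include <linux/tracepoint.h>

    TRACE_EVENT(myevent,
            TP_PROTO(int value),
            TP_ARGS(value),
            TP_STRUCT__entry(__field(int, value)),
            TP_fast_assign(__entry->value = value;),
            TP_printk("value=%d", __entry->value)
    );

    #endif /* _TRACE_MYTRACE_H */

    /* This part must be outside protection */
    #include <trace/define_trace.h>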

    Note: if more than one trace/ header is used to create the C code,
    it is best to list them all together.

    #define CREATE_TRACE_POINTS
    #include <trace/foo.h>
    #include <trace/bar.h>
    #include <trace/fido.h>

    Thanks to Mathieu Desnoyers and Christoph Hellwig for coming up with
    the cleaner solution of the define above the includes over my first
    design to have the C code include a "special" header.

    This patch converts sched, irq and lockdep and skb to use this new
    method.

    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Neil Horman
    Cc: Zhao Lei
    Cc: Eduard - Gabriel Munteanu
    Cc: Pekka Enberg
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

30 Mar, 2009

1 commit

  • Impact: cleanup

    (Thanks to Al Viro for reminding me of this, via Ingo)

    CPU_MASK_ALL is the (deprecated) "all bits set" cpumask, defined like so:

    #define CPU_MASK_ALL (cpumask_t) { { ... } }

    Taking the address of such a temporary is questionable at best;
    unfortunately 321a8e9d (cpumask: add CPU_MASK_ALL_PTR macro) added
    CPU_MASK_ALL_PTR:

    #define CPU_MASK_ALL_PTR (&CPU_MASK_ALL)

    which formalizes this practice. One day gcc could bite us over this
    usage (though we seem to have gotten away with it so far).

    So replace everywhere which used &CPU_MASK_ALL or CPU_MASK_ALL_PTR
    with the modern "cpu_all_mask" (a real const struct cpumask *).

    Signed-off-by: Rusty Russell
    Acked-by: Ingo Molnar
    Reported-by: Al Viro
    Cc: Mike Travis

    Rusty Russell
     

16 Nov, 2008

1 commit

  • Impact: API *CHANGE*. Must update all tracepoint users.

    Add DEFINE_TRACE() to tracepoints to let them declare the tracepoint
    structure in a single spot for all the kernel. It helps reduce memory
    consumption, especially when declaring a lot of tracepoints, e.g. for
    kmalloc tracing.

    *API CHANGE WARNING*: now, DECLARE_TRACE() must be used in headers for
    tracepoint declarations rather than DEFINE_TRACE(). This is the sane way
    to do it. The name previously used was misleading.
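
    Under the corrected naming, the split looks like this (a sketch using
    the later TP_PROTO/TP_ARGS spellings; the tracepoint name and its
    signature are illustrative):

    /* in a header, visible to every user of the tracepoint */
    DECLARE_TRACE(my_event,
            TP_PROTO(struct task_struct *p),
            TP_ARGS(p));

    /* in exactly one C file, instantiating the tracepoint structure */
    DEFINE_TRACE(my_event);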

    Updates scheduler instrumentation to follow this API change.

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Ingo Molnar

    Mathieu Desnoyers
     

21 Oct, 2008

1 commit

  • Merge branch 'tracing-v28-for-linus' of
    git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (131 commits)
    tracing/fastboot: improve help text
    tracing/stacktrace: improve help text
    tracing/fastboot: fix initcalls disposition in bootgraph.pl
    tracing/fastboot: fix bootgraph.pl initcall name regexp
    tracing/fastboot: fix issues and improve output of bootgraph.pl
    tracepoints: synchronize unregister static inline
    tracepoints: tracepoint_synchronize_unregister()
    ftrace: make ftrace_test_p6nop disassembler-friendly
    markers: fix synchronize marker unregister static inline
    tracing/fastboot: add better resolution to initcall debug/tracing
    trace: add build-time check to avoid overrunning hex buffer
    ftrace: fix hex output mode of ftrace
    tracing/fastboot: fix initcalls disposition in bootgraph.pl
    tracing/fastboot: fix printk format typo in boot tracer
    ftrace: return an error when setting a nonexistent tracer
    ftrace: make some tracers reentrant
    ring-buffer: make reentrant
    ring-buffer: move page indexes into page headers
    tracing/fastboot: only trace non-module initcalls
    ftrace: move pc counter in irqtrace
    ...

    Manually fix conflicts:
    - init/main.c: initcall tracing
    - kernel/module.c: verbose level vs tracepoints
    - scripts/bootgraph.pl: fallout from cherry-picking commits.

    Linus Torvalds
     

14 Oct, 2008

1 commit

  • Instrument the scheduler activity (sched_switch, migration, wakeups,
    wait for a task, signal delivery) and process/thread
    creation/destruction (fork, exit, kthread stop). kthread
    creation is not instrumented in this patch because it is architecture
    dependent. This allows connecting tracers such as ftrace, which detect
    scheduling latencies and good/bad scheduler decisions. Tools like LTTng
    can export this scheduler information along with instrumentation of the
    rest of the kernel activity to perform post-mortem analysis of the
    scheduler activity.

    About the performance impact of tracepoints (which is comparable to
    markers), even without immediate-values optimizations, tests done by
    Hideo Aoki on ia64 show no regression. His test case was using hackbench
    on a kernel where scheduler instrumentation (about 5 events in core
    scheduler code) was added. See the "Tracepoints" patch header for
    detailed performance results.

    Changelog:

    - Change instrumentation location and parameter to match ftrace
    instrumentation, previously done with kernel markers.

    [ mingo@elte.hu: conflict resolutions ]
    Signed-off-by: Mathieu Desnoyers
    Acked-by: 'Peter Zijlstra'
    Signed-off-by: Ingo Molnar

    Mathieu Desnoyers
     

27 Jul, 2008

1 commit

  • This extends wait_task_inactive() with a new argument so it can be used in
    a "soft" mode where it will check for the task changing state unexpectedly
    and back off. There is no change to existing callers. This lays the
    groundwork to allow robust, noninvasive tracing that can try to sample a
    blocked thread but back off safely if it wakes up.
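
    A sketch of the intended calling pattern (the caller and the expected
    state are illustrative):

    /*
     * Sample a thread we believe is stopped in TASK_TRACED.  A nonzero
     * return is its context-switch count; 0 means it changed state
     * unexpectedly and we must back off rather than poke at it.
     */
    if (!wait_task_inactive(child, TASK_TRACED))
            return -EAGAIN;     /* it woke up; retry later */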

    Signed-off-by: Roland McGrath
    Cc: Oleg Nesterov
    Reviewed-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     

17 Jul, 2008

1 commit

  • The freezer currently attempts to distinguish kernel threads from
    user space tasks by checking if their mm pointer is unset and it
    does not send fake signals to kernel threads. However, there are
    kernel threads, mostly related to networking, that behave like
    user space tasks and may want to be sent a fake signal to be frozen.

    Introduce the new process flag PF_FREEZER_NOSIG that will be set
    by default for all kernel threads and make the freezer only send
    fake signals to the tasks having PF_FREEZER_NOSIG unset. Provide
    the set_freezable_with_signal() function to be called by the kernel
    threads that want to be sent a fake signal for freezing.
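
    Usage from such a kernel thread would be a sketch like this (the wait
    queue and the pending-work condition are hypothetical):

    static int my_net_kthread(void *unused)
    {
            set_freezable_with_signal();    /* opt in to fake signals */

            while (!kthread_should_stop()) {
                    wait_event_interruptible(my_waitq,
                            work_pending_flag || kthread_should_stop());
                    try_to_freeze();        /* the fake signal broke the wait */
                    /* ... do the actual work ... */
            }
            return 0;
    }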

    This patch should not change the freezer's observable behavior.

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Andi Kleen
    Acked-by: Pavel Machek
    Signed-off-by: Len Brown

    Rafael J. Wysocki
     

10 Jun, 2008

1 commit

  • Kthreads that have called kthread_bind() are bound to specific cpus, so
    other tasks should not be able to change their cpus_allowed from under
    them. Otherwise, it is possible to move kthreads, such as the migration
    or software watchdog threads, so they are not allowed access to the cpu
    they work on.
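
    For reference, the usual pattern for such bound threads:

    struct task_struct *t;

    t = kthread_create(my_percpu_fn, NULL, "my_kthread/%d", cpu);
    if (!IS_ERR(t)) {
            kthread_bind(t, cpu);   /* from here on, nobody may re-place it */
            wake_up_process(t);
    }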

    Cc: Peter Zijlstra
    Cc: Paul Menage
    Cc: Paul Jackson
    Signed-off-by: David Rientjes
    Signed-off-by: Ingo Molnar

    David Rientjes
     

30 Apr, 2008

1 commit

  • There are some places that are known to operate on tasks'
    global pids only:

    * the rest_init() call (called on boot)
    * kgdb's getthread
    * create_kthread() (since the kthread runs in the init ns)

    So use find_task_by_pid_ns(..., &init_pid_ns) there
    and schedule find_task_by_pid() for removal.
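
    That is, the lookups become explicit about the namespace (sketch):

    /* before: implicitly the caller's pid namespace */
    p = find_task_by_pid(pid);

    /* after: explicitly the initial namespace, for global pids */
    p = find_task_by_pid_ns(pid, &init_pid_ns);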

    [sukadev@us.ibm.com: Fix warning in kernel/pid.c]
    Signed-off-by: Pavel Emelyanov
    Cc: "Eric W. Biederman"
    Signed-off-by: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

29 Apr, 2008

1 commit

  • From the POV of synchronization, there should be no need to call
    wake_up_process() with the 'kthread_create_lock' being held.
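
    The resulting shape in kthread_create(), sketched:

    spin_lock(&kthread_create_lock);
    list_add_tail(&create.list, &kthread_create_list);
    spin_unlock(&kthread_create_lock);

    wake_up_process(kthreadd_task);     /* no longer under the lock */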

    Signed-off-by: Dmitry Adamushko
    Cc: Nick Piggin
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Cc: "Paul E. McKenney"
    Cc: Peter Zijlstra
    Cc: Andy Whitcroft
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Adamushko
     

22 Apr, 2008

1 commit

  • * 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
    Deprecate the asm/semaphore.h files in feature-removal-schedule.
    Convert asm/semaphore.h users to linux/semaphore.h
    security: Remove unnecessary inclusions of asm/semaphore.h
    lib: Remove unnecessary inclusions of asm/semaphore.h
    kernel: Remove unnecessary inclusions of asm/semaphore.h
    include: Remove unnecessary inclusions of asm/semaphore.h
    fs: Remove unnecessary inclusions of asm/semaphore.h
    drivers: Remove unnecessary inclusions of asm/semaphore.h
    net: Remove unnecessary inclusions of asm/semaphore.h
    arch: Remove unnecessary inclusions of asm/semaphore.h

    Linus Torvalds
     

01 Aug, 2007

1 commit

  • WARNING: kernel/built-in.o(.text+0x16910): Section mismatch:
    reference to .init.text: (between 'kthreadd' and 'init_waitqueue_head')

    comes because kernel/kthread.c:kthreadd() is not __init but calls
    kthreadd_setup() which is __init. But this is ok, because kthreadd_setup()
    is only ever called at init time, and then kthreadd() proceeds into its
    "for (;;)" loop. We could mark kthreadd __init_refok, but kthreadd_setup()
    with just one callsite and 4 lines in it (it's been that small since
    10ab825bdef8df51) doesn't need to be a separate function at all -- so let's
    just move those four lines at beginning of kthreadd() itself.
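
    After the move, kthreadd() opens with the setup inline; roughly (the
    exact four calls are reconstructed from the description, so treat them
    as our best reading rather than a quote):

    int kthreadd(void *unused)
    {
            struct task_struct *tsk = current;

            /* the former kthreadd_setup(), now inlined */
            set_task_comm(tsk, "kthreadd");
            ignore_signals(tsk);
            set_user_nice(tsk, -5);
            set_cpus_allowed(tsk, CPU_MASK_ALL);

            for (;;) {
                    /* ... service kthread_create_list forever ... */
            }
    }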

    Signed-off-by: Satyam Sharma
    Acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Satyam Sharma
     

24 May, 2007

1 commit

  • kthread() sleeps in TASK_INTERRUPTIBLE state waiting for the first wakeup. In
    theory, this wakeup may come from freeze_process()->signal_wake_up(), so the
    task can disappear even before kthread_create() sets its ->comm.

    Change kthread() to use TASK_UNINTERRUPTIBLE.
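
    That is, the first sleep in kthread() becomes (sketch):

    /* TASK_UNINTERRUPTIBLE: a freezer fake signal cannot end this early */
    __set_current_state(TASK_UNINTERRUPTIBLE);
    complete(&create->started);
    schedule();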

    [akpm@linux-foundation.org: s/BUG_ON/WARN_ON+recover]
    Signed-off-by: Oleg Nesterov
    Acked-by: "Eric W. Biederman"
    Signed-off-by: Rafael J. Wysocki
    Cc: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

10 May, 2007

2 commits

  • Currently kernel threads use sigprocmask(SIG_BLOCK) to protect against
    signals. This doesn't prevent signal delivery; it only blocks
    signal_wake_up(). Every "killall -33 kthreadd" means a "struct siginfo"
    leak.

    Change kthreadd_setup() to set all handlers to SIG_IGN instead of blocking
    them (make a new helper ignore_signals() for that). If the kernel thread
    needs some signal, it should use allow_signal() anyway, and in that case it
    should not use CLONE_SIGHAND.
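
    The helper is small; essentially (a sketch of ignore_signals()):

    void ignore_signals(struct task_struct *t)
    {
            int i;

            /* SIG_IGN everything, then drop whatever was already queued */
            for (i = 0; i < _NSIG; ++i)
                    t->sighand->action[i].sa.sa_handler = SIG_IGN;

            flush_signals(t);
    }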

    Note that we can't change daemonize() (should die!) in the same way,
    because it can be used along with CLONE_SIGHAND. This means that
    allow_signal() still should unblock the signal to work correctly with
    daemonize()ed threads.

    However, disallow_signal() doesn't block the signal any longer but ignores
    it.

    NOTE: with or without this patch the kernel threads are not protected from
    handle_stop_signal(), this seems harmless, but not good.

    Signed-off-by: Oleg Nesterov
    Acked-by: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Currently there is a circular reference between work queue initialization
    and kthread initialization. This prevents the kthread infrastructure from
    initializing until after work queues have been initialized.

    We want the properties of tasks created with kthread_create to be as close
    as possible to those of init_task and not be contaminated by user
    processes. The later we start the kthreadd that creates these tasks, the
    harder it is to avoid contamination from user processes and the more of a
    mess we have to clean up, because the defaults have changed on us.

    So this patch modifies the kthread support to not use work queues but to
    instead use a simple list of structures, and to have kthreadd start from
    init_task immediately after our kernel thread that execs /sbin/init.

    By being a true child of init_task we only have to change those process
    settings that we want to have different from init_task, such as our process
    name, the cpus that are allowed, blocking all signals and setting SIGCHLD
    to SIG_IGN so that all of our children are reaped automatically.

    By being a true child of init_task we also naturally get our ppid set to 0
    and do not wind up as a child of PID == 1, ensuring that tasks generated
    by kthread_create will not slow down the wait family of
    functions.

    [akpm@linux-foundation.org: use interruptible sleeps]
    Signed-off-by: Eric W. Biederman
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

12 Feb, 2007

1 commit

  • A variety of (mostly) innocuous fixes to the embedded kernel-doc content in
    source files, including:

    * make multi-line initial descriptions single line
    * denote some function names, constants and structs as such
    * change erroneous opening '/*' to '/**' in a few places (see the example
    after this list)
    * reword some text for clarity
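
    For the '/**' item: only that opener marks a comment block as kernel-doc,
    for example:

    /**
     * kthread_should_stop - should this kthread return now?
     *
     * When someone calls kthread_stop() on your kthread, it will be woken
     * and this will return true.  You should then return, and your return
     * value will be passed through to kthread_stop().
     */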

    Signed-off-by: Robert P. J. Day
    Cc: "Randy.Dunlap"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     

22 Nov, 2006

1 commit

  • Pass the work_struct pointer to the work function rather than context data.
    The work function can use container_of() to work out the data.
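
    The canonical pattern then becomes (the types are illustrative):

    struct my_dev {
            int unit;
            struct work_struct work;
    };

    static void my_work_fn(struct work_struct *work)
    {
            /* recover the container from the member pointer */
            struct my_dev *dev = container_of(work, struct my_dev, work);

            printk(KERN_INFO "servicing unit %d\n", dev->unit);
    }

    /* setup: INIT_WORK(&dev->work, my_work_fn); -- no context pointer */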

    For the cases where the container of the work_struct may go away the moment the
    pending bit is cleared, it is made possible to defer the release of the
    structure by deferring the clearing of the pending bit.

    To make this work, an extra flag is introduced into the management side of the
    work_struct. This governs auto-release of the structure upon execution.

    Ordinarily, the work queue executor would release the work_struct for
    further scheduling or deallocation by clearing the pending bit prior to
    jumping to the work function. This means that, unless the driver itself
    guarantees that the work_struct won't go away, the work function may not
    access anything else in the work_struct or its container lest they be
    deallocated. This is a problem if the auxiliary data is taken away (as
    done by the last patch).

    However, if the pending bit is *not* cleared before jumping to the work
    function, then the work function *may* access the work_struct and its container
    with no problems. But then the work function must itself release the
    work_struct by calling work_release().

    In most cases, automatic release is fine, so this is the default. Special
    initiators exist for the non-auto-release case (ending in _NAR).

    Signed-Off-By: David Howells

    David Howells
     

26 Mar, 2006

1 commit

  • A couple of places are forgetting to take tasklist_lock.

    The kswapd case is probably unimportant. keventd_create_kthread() was racy.

    The whole thing is a bit flaky: you start a kernel thread, get its pid
    from kernel_thread(), then look up its task_struct.

    a) It assumes that pid recycling takes a "long" time.

    b) We get a task_struct but no reference was taken on it. The owner of the
    kswapd and kthread task_struct*'s must assume that the new thread won't
    exit unexpectedly. Because if it does, they're left holding dead memory
    and any attempt to control or stop that task will crash.
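
    The safe pattern the fix moves towards, sketched with the lookup
    function of the day:

    struct task_struct *tsk;

    read_lock(&tasklist_lock);
    tsk = find_task_by_pid(pid);
    if (tsk)
            get_task_struct(tsk);   /* pin it before dropping the lock */
    read_unlock(&tasklist_lock);

    if (tsk) {
            /* ... control or stop the thread without fear ... */
            put_task_struct(tsk);
    }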

    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

23 Mar, 2006

1 commit

  • Semaphore to mutex conversion.

    The conversion was generated via scripts, and the result was validated
    automatically via a script as well.
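
    The pattern of the conversion, using a lock of kthread.c's vintage as
    the example (the name is illustrative):

    /* before */
    static DECLARE_MUTEX(kthread_stop_lock);
    down(&kthread_stop_lock);
    up(&kthread_stop_lock);

    /* after */
    static DEFINE_MUTEX(kthread_stop_lock);
    mutex_lock(&kthread_stop_lock);
    mutex_unlock(&kthread_stop_lock);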

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

31 Oct, 2005

1 commit

  • Enhance the kthread API by adding kthread_stop_sem, for use in stopping
    threads that spend their idle time waiting on a semaphore.
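
    A sketch of the intended pairing (the semaphore and the loop body are
    illustrative):

    /* thread side: idles on a semaphore between work items */
    static int my_thread_fn(void *data)
    {
            while (!kthread_should_stop()) {
                    if (down_interruptible(&my_sem))
                            continue;
                    /* ... handle one work item ... */
            }
            return 0;
    }

    /* stopper side: sets the stop flag, then up()s the semaphore */
    ret = kthread_stop_sem(task, &my_sem);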

    Signed-off-by: Alan Stern
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern