02 Jun, 2009

1 commit

  • v3: zhaolei@cn.fujitsu.com: Change TRACE_EVENT definition to new format
    introduced by Steven Rostedt: consolidate trace and trace_event headers
    v2: kosaki@jp.fujitsu.com: print the function names instead of addr, and zap
    the work addr
    v1: zhaolei@cn.fujitsu.com: Make workqueue tracepoints use TRACE_EVENT macro

    TRACE_EVENT is a more generic way to define tracepoints.
    Doing so adds these new capabilities to the tracepoints:

    - zero-copy and per-cpu splice() tracing
    - binary tracing without printf overhead
    - structured logging records exposed under /debug/tracing/events
    - trace events embedded in function tracer output and other plugins
    - user-defined, per tracepoint filter expressions

    Then, this patch converts DEFINE_TRACE to TRACE_EVENT in workqueue related
    tracepoints.
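
    As an illustration, a workqueue tracepoint expressed with TRACE_EVENT takes
    roughly the following shape (a hedged sketch; the exact fields in the patch
    may differ, and the usual trace-header boilerplate is omitted):

    TRACE_EVENT(workqueue_execution,

            TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),

            TP_ARGS(wq_thread, work),

            /* binary record layout, exposed under /debug/tracing/events */
            TP_STRUCT__entry(
                    __array(char, thread_comm, TASK_COMM_LEN)
                    __field(pid_t, thread_pid)
                    __field(work_func_t, func)
            ),

            TP_fast_assign(
                    memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
                    __entry->thread_pid = wq_thread->pid;
                    __entry->func = work->func;
            ),

            /* print the function name rather than its raw address */
            TP_printk("thread=%s:%d func=%pf", __entry->thread_comm,
                      __entry->thread_pid, __entry->func)
    );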

    [ Impact: expand workqueue tracer to events tracing ]

    Signed-off-by: Zhao Lei
    Cc: Steven Rostedt
    Cc: Tom Zanussi
    Cc: Oleg Nesterov
    Cc: Andrew Morton
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Frederic Weisbecker

    Zhaolei
     

09 Apr, 2009

1 commit

  • Impact: circular locking bugfix

    The various implementations and proposed implementations of work_on_cpu()
    are vulnerable to various deadlocks because they all use queues of some
    form.

    Unrelated pieces of kernel code thus gained dependencies wherein if one
    work_on_cpu() caller holds a lock which some other work_on_cpu() callback
    also takes, the kernel could rarely deadlock.

    Fix this by creating a short-lived kernel thread for each work_on_cpu()
    invocation.

    This is not terribly fast, but the only current caller of work_on_cpu() is
    pci_call_probe().

    It would be nice to find some other way of doing the node-local
    allocations in the PCI probe code so that we can zap work_on_cpu()
    altogether. The code there is rather nasty. I can't think of anything
    simple at this time...
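
    For reference, a minimal sketch of the per-invocation kernel-thread approach
    (assuming the usual kthread and completion APIs; details are illustrative
    rather than quoted from the patch):

    #include <linux/kthread.h>
    #include <linux/completion.h>

    struct work_for_cpu {
            struct completion completion;
            long (*fn)(void *);
            void *arg;
            long ret;
    };

    static int do_work_for_cpu(void *_wfc)
    {
            struct work_for_cpu *wfc = _wfc;

            wfc->ret = wfc->fn(wfc->arg);
            complete(&wfc->completion);
            return 0;
    }

    long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
    {
            struct task_struct *sub_thread;
            struct work_for_cpu wfc = {
                    .completion = COMPLETION_INITIALIZER_ONSTACK(wfc.completion),
                    .fn = fn,
                    .arg = arg,
            };

            /* one short-lived kernel thread per call, bound to the target cpu,
             * so no shared queue and no lock dependencies between callers */
            sub_thread = kthread_create(do_work_for_cpu, &wfc, "work_for_cpu");
            if (IS_ERR(sub_thread))
                    return PTR_ERR(sub_thread);
            kthread_bind(sub_thread, cpu);
            wake_up_process(sub_thread);
            wait_for_completion(&wfc.completion);

            return wfc.ret;
    }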

    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Rusty Russell

    Andrew Morton
     

06 Apr, 2009

1 commit

  • * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (413 commits)
    tracing, net: fix net tree and tracing tree merge interaction
    tracing, powerpc: fix powerpc tree and tracing tree interaction
    ring-buffer: do not remove reader page from list on ring buffer free
    function-graph: allow unregistering twice
    trace: make argument 'mem' of trace_seq_putmem() const
    tracing: add missing 'extern' keywords to trace_output.h
    tracing: provide trace_seq_reserve()
    blktrace: print out BLK_TN_MESSAGE properly
    blktrace: extract duplidate code
    blktrace: fix memory leak when freeing struct blk_io_trace
    blktrace: fix blk_probes_ref chaos
    blktrace: make classic output more classic
    blktrace: fix off-by-one bug
    blktrace: fix the original blktrace
    blktrace: fix a race when creating blk_tree_root in debugfs
    blktrace: fix timestamp in binary output
    tracing, Text Edit Lock: cleanup
    tracing: filter fix for TRACE_EVENT_FORMAT events
    ftrace: Using FTRACE_WARN_ON() to check "freed record" in ftrace_release()
    x86: kretprobe-booster interrupt emulation code fix
    ...

    Fix up trivial conflicts in
    arch/parisc/include/asm/ftrace.h
    include/linux/memory.h
    kernel/extable.c
    kernel/module.c

    Linus Torvalds
     

03 Apr, 2009

1 commit

  • 1) lockdep will complain when run_workqueue() performs recursion.

    2) The recursive implementation of run_workqueue() means that
    flush_workqueue() and its documentation are inconsistent. This may
    hide deadlocks and other bugs.

    3) The recursion in run_workqueue() will poison cwq->current_work, but
    flush_work() and __cancel_work_timer(), etcetera need a reliable
    cwq->current_work.

    Signed-off-by: Lai Jiangshan
    Acked-by: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Frederic Weisbecker
    Cc: Eric Dumazet
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
     

02 Apr, 2009

1 commit


30 Mar, 2009

1 commit


03 Feb, 2009

1 commit


20 Jan, 2009

2 commits

  • Impact: remove potential clashes with generic kevent workqueue

    Annoyingly, some places where we want to use work_on_cpu() are already in
    workqueues. As per Ingo's suggestion, we create a different workqueue
    for work_on_cpu.
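
    A hedged sketch of the change (assuming the existing work_for_cpu wrapper
    'wfc' used by work_on_cpu(); names are illustrative):

    static struct workqueue_struct *work_on_cpu_wq __read_mostly;

    /* in work_on_cpu(), queue onto the dedicated queue instead of keventd: */
    queue_work_on(cpu, work_on_cpu_wq, &wfc.work);

    /* created once at init time: */
    work_on_cpu_wq = create_workqueue("work_on_cpu");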

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Rusty Russell
     
  • Impact: remove potential circular lock dependency with cpu hotplug lock

    This has caused more problems than it solved, with a pile of cpu
    hotplug locking issues.

    Followup patches will call get_online_cpus() in callers that need it, but
    if they don't do it they're no worse off than before, when they were using
    set_cpus_allowed() without locking.

    Signed-off-by: Rusty Russell
    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Rusty Russell
     

14 Jan, 2009

1 commit

  • Impact: new tracer

    The workqueue tracer provides some statistical information about each cpu
    workqueue thread, such as the number of works inserted and executed since
    its creation. It can help to evaluate how much work each of them has to
    perform. For example, it can help a developer decide whether to choose a
    per-cpu workqueue instead of a singlethreaded one.

    It only traces statistical information for now, but it will probably
    provide event tracing too later.

    Such a tracer could also help with, and be improved for, the development
    of rt-priority sorted workqueues.

    To have a snapshot of the workqueues state at any time, just do

    cat /debugfs/tracing/trace_stat/workqueues

    For example:

    CPU  INSERTED  EXECUTED  NAME

      1       125       125  reiserfs/1
      1         0         0  scsi_tgtd/1
      1         0         0  aio/1
      1         0         0  ata/1
      1       114       114  kblockd/1
      1         0         0  kintegrityd/1
      1      2147      2147  events/1

      0         0         0  kpsmoused
      0       105       105  reiserfs/0
      0         0         0  scsi_tgtd/0
      0         0         0  aio/0
      0         0         0  ata_aux
      0         0         0  ata/0
      0         0         0  cqueue
      0         0         0  kacpi_notify
      0         0         0  kacpid
      0       149       149  kblockd/0
      0         0         0  kintegrityd/0
      0      1000      1000  khelper
      0      2270      2270  events/0

    Changes in V2:

    - Drop the static array based on NR_CPU and dynamically allocate the stat
      array with num_possible_cpus() and other cpu mask facilities....
    - Trace workqueue insertion at a bit lower level (insert_work instead of
      queue_work) to handle even the workqueue barriers.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

01 Jan, 2009

1 commit

  • Impact: Reduce memory usage, use new cpumask API.

    cpu_populated_map becomes a cpumask_var_t, and cpu_singlethread_map
    becomes a plain cpumask pointer: it is simply the cpumask containing the
    first possible CPU anyway.
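
    A hedged sketch of the resulting declarations (initialisation shown only
    roughly, for illustration):

    static cpumask_var_t cpu_populated_map __read_mostly;
    static const struct cpumask *cpu_singlethread_map __read_mostly;

    /* at init time, roughly: */
    cpu_singlethread_map = cpumask_of(singlethread_cpu);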

    Signed-off-by: Rusty Russell

    Rusty Russell
     

14 Nov, 2008

2 commits


06 Nov, 2008

1 commit

  • Impact: introduce new APIs

    We want to deprecate cpumasks on the stack, as we are headed for
    ginormous numbers of CPUs. Eventually, we want to head towards an
    undefined 'struct cpumask' so they can never be declared on the stack.

    1) New cpumask functions which take pointers instead of copies.
    (cpus_* -> cpumask_*)

    2) Several new helpers to reduce requirements for temporary cpumasks
    (cpumask_first_and, cpumask_next_and, cpumask_any_and)

    3) Helpers for declaring cpumasks on or offstack for large NR_CPUS
    (cpumask_var_t, alloc_cpumask_var and free_cpumask_var)

    4) 'struct cpumask' for explicitness and to mark new-style code.

    5) Make iterator functions stop at nr_cpu_ids (a runtime constant),
    not NR_CPUS for time efficiency and for smaller dynamic allocations
    in future.

    6) cpumask_copy() so we can allocate less than a full cpumask eventually
    (for alloc_cpumask_var), and so we can eliminate the 'struct cpumask'
    definition eventually.

    7) work_on_cpu() helper for doing task on a CPU, rather than saving old
    cpumask for current thread and manipulating it.

    8) smp_call_function_many() which is smp_call_function_mask() except
    taking a cpumask pointer.

    Note that this patch simply introduces the new functions and leaves
    the obsolescent ones in place. This is to simplify the transition
    patches.
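
    A hedged usage sketch of the pointer-based style described above (the
    function and variable names here are illustrative, not from the patch):

    #include <linux/cpumask.h>
    #include <linux/gfp.h>

    static int cpumask_api_example(const struct cpumask *src)
    {
            cpumask_var_t tmp;
            int cpu;

            /* (3) off-stack when CONFIG_CPUMASK_OFFSTACK=y, on-stack otherwise */
            if (!alloc_cpumask_var(&tmp, GFP_KERNEL))
                    return -ENOMEM;

            cpumask_copy(tmp, src);                 /* (6) copy via pointers   */
            cpu = cpumask_first_and(tmp, src);      /* (2) no temporary needed */

            /* (5) iteration is bounded by nr_cpu_ids, not NR_CPUS */
            while (cpu < nr_cpu_ids)
                    cpu = cpumask_next_and(cpu, tmp, src);

            free_cpumask_var(tmp);
            return 0;
    }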

    Signed-off-by: Rusty Russell
    Signed-off-by: Ingo Molnar

    Rusty Russell
     

22 Oct, 2008

1 commit

  • create_rt_workqueue will create a real time prioritized workqueue.
    This is needed for the conversion of stop_machine to a workqueue based
    implementation.
    This patch adds yet another parameter to __create_workqueue_key to tell
    it that we want an rt workqueue.
    However, it looks like we should rather have something like "int type"
    instead of separate singlethread, freezable and rt flags.
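
    A hedged sketch of how the rt flag might be wired up (argument order and
    names are illustrative):

    #define create_rt_workqueue(name) __create_workqueue((name), 0, 0, 1)

    /* and in the workqueue thread creation path, roughly: */
    struct sched_param param = { .sched_priority = MAX_RT_PRIO - 1 };

    if (cwq->wq->rt)
            sched_setscheduler_nocheck(p, SCHED_FIFO, &param);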

    Signed-off-by: Heiko Carstens
    Signed-off-by: Rusty Russell
    Cc: Ingo Molnar

    Heiko Carstens
     

17 Oct, 2008

1 commit


12 Aug, 2008

1 commit


11 Aug, 2008

2 commits


31 Jul, 2008

1 commit


26 Jul, 2008

8 commits

    The bug was pointed out by Akinobu Mita, and this patch is based on his
    original patch.

    workqueue_cpu_callback(CPU_UP_PREPARE) expects that if it returns
    NOTIFY_BAD, _cpu_up() will send CPU_UP_CANCELED then.

    However, this is not true since

    "cpu hotplug: cpu: deliver CPU_UP_CANCELED only to NOTIFY_OKed callbacks with CPU_UP_PREPARE"
    commit: a0d8cdb652d35af9319a9e0fb7134de2a276c636

    The callback which has returned NOTIFY_BAD will not receive
    CPU_UP_CANCELED. Change the code to fulfil the CPU_UP_CANCELED logic if
    CPU_UP_PREPARE fails.

    Signed-off-by: Oleg Nesterov
    Reported-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • schedule_on_each_cpu() can use schedule_work_on() to avoid the code
    duplication.
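
    A hedged sketch of the deduplicated helper (per-cpu allocation details are
    illustrative, not quoted from the patch):

    int schedule_on_each_cpu(work_func_t func)
    {
            int cpu;
            struct work_struct *works;

            works = alloc_percpu(struct work_struct);
            if (!works)
                    return -ENOMEM;

            get_online_cpus();
            for_each_online_cpu(cpu) {
                    struct work_struct *work = per_cpu_ptr(works, cpu);

                    INIT_WORK(work, func);
                    schedule_work_on(cpu, work);    /* no open-coded queueing */
            }
            for_each_online_cpu(cpu)
                    flush_work(per_cpu_ptr(works, cpu));
            put_online_cpus();

            free_percpu(works);
            return 0;
    }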

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • queue_work() can use queue_work_on() to avoid the code duplication.
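
    A hedged sketch of the deduplicated queue_work() (illustrative):

    int queue_work(struct workqueue_struct *wq, struct work_struct *work)
    {
            int ret;

            /* reuse queue_work_on() with the current cpu */
            ret = queue_work_on(get_cpu(), wq, work);
            put_cpu();

            return ret;
    }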

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Add lockdep annotations to flush_work() and update the comment.

    Signed-off-by: Oleg Nesterov
    Cc: Jarek Poplawski
    Acked-by: Johannes Berg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
    workqueue_cpu_callback(CPU_DEAD) flushes cwq->thread under
    cpu_maps_update_begin(). This means that the multithreaded workqueues
    can't use get_online_cpus() due to a possible deadlock, a very bad and
    very old problem.

    Introduce the new state, CPU_POST_DEAD, which is called after
    cpu_hotplug_done() but before cpu_maps_update_done().

    Change workqueue_cpu_callback() to use CPU_POST_DEAD instead of CPU_DEAD.
    This means that create/destroy functions can't rely on get_online_cpus()
    any longer and should take cpu_add_remove_lock instead.

    [akpm@linux-foundation.org: fix CONFIG_SMP=n]
    Signed-off-by: Oleg Nesterov
    Acked-by: Gautham R Shenoy
    Cc: Heiko Carstens
    Cc: Max Krasnyansky
    Cc: Paul Jackson
    Cc: Paul Menage
    Cc: Peter Zijlstra
    Cc: Vegard Nossum
    Cc: Martin Schwidefsky
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Change schedule_on_each_cpu() to use flush_work() instead of
    flush_workqueue(), this way we don't wait for other work_struct's which
    can be queued meanwhile.

    Signed-off-by: Oleg Nesterov
    Cc: Jarek Poplawski
    Cc: Max Krasnyansky
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
    Most users of flush_workqueue() can be changed to use cancel_work_sync(),
    but sometimes we really need to wait for the completion and cancelling is
    not an option. schedule_on_each_cpu() is a good example.

    Add the new helper, flush_work(work), which waits for the completion of the
    specific work_struct. More precisely, it "flushes" the result of the last
    queue_work() which is visible to the caller.

    For example, this code

    queue_work(wq, work);
    /* WINDOW */
    queue_work(wq, work);

    flush_work(work);

    doesn't necessarily work "as expected". What can happen in the WINDOW above is

    - wq starts the execution of work->func()

    - the caller migrates to another CPU

    Now, after the 2nd queue_work() this work is active on the previous CPU, and
    at the same time it is queued on another. In this case flush_work(work) may
    return before the first work->func() completes.

    It is trivial to add another helper

    int flush_work_sync(struct work_struct *work)
    {
            return flush_work(work) || wait_on_work(work);
    }

    which works "more correctly", but it has to iterate over all CPUs and is
    thus much slower than flush_work().

    Signed-off-by: Oleg Nesterov
    Acked-by: Max Krasnyansky
    Acked-by: Jarek Poplawski
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
    insert_work() inserts the new work_struct before or after cwq->worklist,
    depending on the "int tail" parameter. Change it to accept a "list_head *"
    instead; this shrinks .text a bit and allows us to insert the barrier
    after a specific work_struct.
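
    A hedged sketch of the new signature (the function body is illustrative):

    static void insert_work(struct cpu_workqueue_struct *cwq,
                            struct work_struct *work, struct list_head *head)
    {
            set_wq_data(work, cwq);
            /* ensure work->data is visible before the list insertion */
            smp_wmb();
            list_add_tail(&work->entry, head);
            wake_up(&cwq->more_work);
    }

    /* flush code can now queue its barrier right after a specific
     * work_struct by passing the appropriate list position as 'head' */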

    Signed-off-by: Oleg Nesterov
    Cc: Jarek Poplawski
    Cc: Max Krasnyansky
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

25 Jul, 2008

1 commit

  • This interface allows adding a job on a specific cpu.

    Although a work struct on a cpu will be rescheduled to another cpu if that
    cpu dies, there is a recursion if a work task tries to offline the cpu it is
    running on. We need to schedule the task on a specific cpu in this case.
    http://bugzilla.kernel.org/show_bug.cgi?id=10897
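
    Per the 26 Jul entries above that build on it, the interface amounts to
    queueing or scheduling work on a given cpu; roughly (prototypes only, a
    hedged sketch):

    /* queue work on a specific cpu instead of whichever cpu the caller runs on */
    int queue_work_on(int cpu, struct workqueue_struct *wq, struct work_struct *work);
    int schedule_work_on(int cpu, struct work_struct *work);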

    [oleg@tv-sign.ru: cleanups]
    Signed-off-by: Zhang Rui
    Tested-by: Rus
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zhang Rui
     

06 Jul, 2008

1 commit


05 Jul, 2008

1 commit

  • Remove all clameter@sgi.com addresses from the kernel tree since they will
    become invalid on June 27th. Change my maintainer email address for the
    slab allocators to cl@linux-foundation.org (which will be the new email
    address for the future).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Stephen Rothwell
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

24 May, 2008

1 commit


01 May, 2008

1 commit

    timer_stats_timer_set_start_info() is invoked twice; additionally, the
    invocation of this function can be moved so that it is only called when a
    delay is really required.

    Signed-off-by: Andrew Liu
    Cc: Pavel Machek
    Cc: Ingo Molnar
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Liu
     

30 Apr, 2008

1 commit


29 Apr, 2008

2 commits

  • cleanup_workqueue_thread() doesn't need the second argument, remove it.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
    When cpu_populated_map was introduced, it was assumed that cwq->thread could
    survive after CPU_DEAD; that is why we never shrink cpu_populated_map.

    This is not very nice; we can safely remove the already dead CPU from the map.
    The only required change is that destroy_workqueue() must hold the hotplug
    lock until it destroys all cwq->thread's, to protect the cpu_populated_map.
    We could make the local copy of cpu mask and drop the lock, but
    sizeof(cpumask_t) may be very large.

    Also, fix the comment near queue_work(). Unless _cpu_down() happens we do
    guarantee the cpu-affinity of the work_struct, and we have users which rely on
    this.

    [akpm@linux-foundation.org: repair comment]
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

17 Apr, 2008

1 commit


09 Feb, 2008

2 commits


26 Jan, 2008

1 commit