17 Jul, 2007

2 commits

  • Change cancel_work_sync() and cancel_delayed_work_sync() to return a boolean
    indicating whether the work was actually cancelled. A zero return value means
    that the work was not pending/queued.

    Without that kind of change it is not possible to avoid flush_workqueue()
    sometimes, see the next patch as an example.

    Also, this patch unifies both functions and kills the (unlikely) busy-wait
    loop.

    Signed-off-by: Oleg Nesterov
    Acked-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
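
A small user-space sketch of why the boolean result is useful. Every name here (the `_sketch` suffixes, `struct session`, the reference count) is a hypothetical stand-in rather than kernel code. It only models the rule that a nonzero return means the work was pending and will now never run, so the caller has to undo whatever the handler would have undone. The "sync" part, waiting for a handler that is already running, is not modelled.

```c
/* User-space model only; none of these names are kernel APIs. */
struct work_sketch {
	int pending;			/* 1 while queued and not yet run */
};

struct session {
	int refs;			/* one reference is held per queued work */
	struct work_sketch work;
};

/* Queue the work; returns 0 if it was already pending. */
static int queue_work_sketch(struct work_sketch *w)
{
	if (w->pending)
		return 0;
	w->pending = 1;
	return 1;
}

/* Models the new contract: nonzero means the work was pending and got cancelled. */
static int cancel_work_sync_sketch(struct work_sketch *w)
{
	int was_pending = w->pending;

	w->pending = 0;
	return was_pending;
}

/* Take a reference only when the work was actually queued. */
static void session_kick(struct session *s)
{
	if (queue_work_sketch(&s->work))
		s->refs++;
}

/* Teardown: if the handler will never run, drop its reference here. */
static void session_stop(struct session *s)
{
	if (cancel_work_sync_sketch(&s->work))
		s->refs--;
}
```

With a void return, `session_stop()` could not tell whether the handler's reference was still outstanding and would have to fall back to flushing the whole queue.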
     
  • Imho, the current naming of cancel_xxx workqueue functions is very confusing.

    cancel_delayed_work()
    cancel_rearming_delayed_work()
    cancel_rearming_delayed_workqueue() // obsolete

    cancel_work_sync()

    This looks as if the first 2 functions differ in the "type" of their argument,
    which is no longer true; nowadays the difference is the behaviour.

    The semantics of cancel_rearming_delayed_work(dwork) were changed
    significantly: it doesn't require that dwork rearms itself, and it cancels
    dwork synchronously.

    Rename it to cancel_delayed_work_sync(). This matches cancel_delayed_work()
    and cancel_work_sync(). Re-create cancel_rearming_delayed_work() as a simple
    inline obsolete wrapper, like cancel_rearming_delayed_workqueue().

    Signed-off-by: Oleg Nesterov
    Acked-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

18 May, 2007

1 commit

  • As pointed out by Jarek Poplawski, the patch

    [WORKQUEUE]: cancel_delayed_work: use del_timer() instead of del_timer_sync()
    commit: 071b638689464c6b39407025eedd810d5b5e6f5d

    was wrong, it was merged by mistake after that.

    From the changelog:

    after this patch:
    ...
    delayed_work_timer_fn->__queue_work() in progress.

    The latter doesn't differ from the caller's POV,

    it does make a difference if the caller calls flush_workqueue() after
    cancel_delayed_work(), in that case flush_workqueue() can miss this
    work_struct.

    Signed-off-by: Oleg Nesterov
    Cc: Jarek Poplawski
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

17 May, 2007

1 commit

  • It is a known fact that freezeable multithreaded workqueues don't like
    CPU_DEAD. We keep them only for the incoming CPU-hotplug rework.

    Sadly, we can't just kill create_freezeable_workqueue() right now, so make
    them singlethread.

    Signed-off-by: Oleg Nesterov
    Cc: "Rafael J. Wysocki"
    Cc: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

10 May, 2007

5 commits

  • flush_work(wq, work) doesn't need the first parameter, we can use cwq->wq
    (this was possible from the very beginning, I missed this). So we can unify
    flush_work_keventd and flush_work.

    Also, rename flush_work() to cancel_work_sync() and fix all callers.
    Perhaps this is not the best name, but "flush_work" is really bad.

    (akpm: this is why the earlier patches bypassed maintainers)

    Signed-off-by: Oleg Nesterov
    Cc: Jeff Garzik
    Cc: "David S. Miller"
    Cc: Jens Axboe
    Cc: Tejun Heo
    Cc: Auke Kok
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • We don't have any users, and it is not so trivial to use NOAUTOREL works
    correctly. It is better to simplify the API.

    Delete NOAUTOREL support and rename work_release to work_clear_pending to
    avoid confusion.

    Signed-off-by: Oleg Nesterov
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • cancel_rearming_delayed_workqueue(wq, dwork) doesn't need the first
    parameter. We don't hang on un-queued dwork any longer, and work->data
    doesn't change its type. This means we can always figure out "wq" from
    dwork when it is needed.

    Remove this parameter, and rename the function to
    cancel_rearming_delayed_work(). Re-create an inline "obsolete"
    cancel_rearming_delayed_workqueue(wq) which just calls
    cancel_rearming_delayed_work().

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Because it has no callers.

    Actually, I think the whole idea of run_scheduled_work() was not right: it is
    not good to mix "unqueue this work and execute its ->func()" in one function.

    Signed-off-by: Oleg Nesterov
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • A basic problem with flush_scheduled_work() is that it blocks behind _all_
    presently-queued works, rather than just the work which the caller wants to
    flush. If the caller holds some lock, and if one of the queued works happens
    to want that lock as well, then accidental deadlocks can occur.

    One example of this is the phy layer: it wants to flush work while holding
    rtnl_lock(). But if a linkwatch event happens to be queued, the phy code will
    deadlock because the linkwatch callback function takes rtnl_lock.

    So we implement a new function which will flush a *single* work - just the one
    which the caller wants to free up. Thus we avoid the accidental deadlocks
    which can arise from unrelated subsystems' callbacks taking shared locks.

    flush_work() non-blockingly dequeues the work_struct which we want to kill,
    then it waits for its handler to complete on all CPUs.

    Add ->current_work to the "struct cpu_workqueue_struct", it points to
    currently running "struct work_struct". When flush_work(work) detects
    ->current_work == work, it inserts a barrier at the _head_ of ->worklist
    (and thus right _after_ that work) and waits for completion. This means
    that the next work fired on that CPU will be this barrier, or another
    barrier queued by concurrent flush_work(), so the caller of flush_work()
    will be woken before any "regular" work has a chance to run.

    When wait_on_work() unlocks workqueue_mutex (or whatever we choose to protect
    against CPU hotplug), CPU may go away. But in that case take_over_work() will
    move a barrier we queued to another CPU, it will be fired sometime, and
    wait_on_work() will be woken.

    Actually, we are doing cleanup_workqueue_thread()->kthread_stop() before
    take_over_work(), so cwq->thread should complete its ->worklist (and thus
    the barrier), because currently we don't check kthread_should_stop() in
    run_workqueue(). But even if we did, everything should be ok.

    [akpm@osdl.org: cleanup]
    [akpm@osdl.org: add flush_work_keventd() wrapper]
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
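
The barrier placement described above can be modelled in a few lines of user-space C. This is only a sketch under simplifying assumptions: works are plain integer ids, the per-CPU queue is a small array, there is no locking or sleeping, and all names (`cwq_sketch`, `BARRIER_ID`, `insert_barrier_if_running`) are hypothetical. It shows just one property: a barrier put at the head of the worklist is the next thing the worker picks up after the currently running work.

```c
#define MAX_WORKS	8
#define BARRIER_ID	(-1)		/* hypothetical id marking a barrier entry */

struct cwq_sketch {
	int worklist[MAX_WORKS];	/* queued work ids, index 0 is the head */
	int count;
	int current_work;		/* id of the running work, 0 when idle */
};

/* Normal queueing: append at the tail. Returns 0 if the list is full. */
static int queue_tail(struct cwq_sketch *cwq, int id)
{
	if (cwq->count == MAX_WORKS)
		return 0;
	cwq->worklist[cwq->count++] = id;
	return 1;
}

/* Barrier queueing: insert at the head, shifting everything else back. */
static int queue_head(struct cwq_sketch *cwq, int id)
{
	int i;

	if (cwq->count == MAX_WORKS)
		return 0;
	for (i = cwq->count; i > 0; i--)
		cwq->worklist[i] = cwq->worklist[i - 1];
	cwq->worklist[0] = id;
	cwq->count++;
	return 1;
}

/* Worker side: take the head entry and record it as the current work. */
static int pop_next(struct cwq_sketch *cwq)
{
	int i, id;

	if (cwq->count == 0) {
		cwq->current_work = 0;
		return 0;
	}
	id = cwq->worklist[0];
	for (i = 1; i < cwq->count; i++)
		cwq->worklist[i - 1] = cwq->worklist[i];
	cwq->count--;
	cwq->current_work = id;
	return id;
}

/* Flusher side: only a work that is running right now needs a barrier. */
static int insert_barrier_if_running(struct cwq_sketch *cwq, int id)
{
	if (cwq->current_work != id)
		return 0;
	return queue_head(cwq, BARRIER_ID);
}
```

In the real code the flusher would then sleep until the barrier runs; here the caller can simply check that the next popped entry is `BARRIER_ID`.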
     

09 May, 2007

1 commit

  • Add a new deferrable delayed work init. This can be used to schedule work
    that is 'unimportant' when the CPU is idle and can be run later, when the CPU
    eventually comes out of idle.

    Use this init in cpufreq ondemand governor.

    Signed-off-by: Venkatesh Pallipadi
    Cc: Dave Jones
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Venki Pallipadi
     

27 Apr, 2007

1 commit

  • del_timer_sync() buys nothing for cancel_delayed_work(), but it is less
    efficient since it locks the timer unconditionally, and may wait for the
    completion of the delayed_work_timer_fn().

    cancel_delayed_work() == 0 means:

    before this patch:
    work->func may still be running or queued

    after this patch:
    work->func may still be running or queued, or
    delayed_work_timer_fn->__queue_work() in progress.

    The latter doesn't differ from the caller's POV,
    delayed_work_timer_fn() is called with _PENDING
    bit set.

    cancel_delayed_work() == 1 with this patch adds a new possibility:

    delayed_work->work was cancelled, but delayed_work_timer_fn
    is still running (this is only possible for the re-arming
    works on single-threaded workqueue).

    In this case the timer was re-started by work->func(), nobody
    else can do this. This in turn means that delayed_work_timer_fn
    has already passed __queue_work() (and won't touch delayed_work)
    because nobody else can queue delayed_work->work.

    Signed-off-by: Oleg Nesterov
    Signed-Off-By: David Howells
    Signed-off-by: David S. Miller

    Oleg Nesterov
     

17 Dec, 2006

1 commit

  • On architectures where the atomicity of the bit operations is handled by
    external means (ie a separate spinlock to protect concurrent accesses),
    just doing a direct assignment on the workqueue data field (as done by
    commit 4594bf159f1962cec3b727954b7c598b07e2e737) can cause the
    assignment to be lost due to lack of serialization with the bitops on
    the same word.

    So we need to serialize the assignment with the locks on those
    architectures (notably older ARM chips, PA-RISC and sparc32).

    So rather than using an "unsigned long", let's use "atomic_long_t",
    which already has a safe assignment operation (atomic_long_set()) on
    such architectures.

    This requires that the atomic operations use the same atomicity locks as
    the bit operations do, but that is largely the case anyway. Sparc32
    will probably need fixing.

    Architectures (including modern ARM with LL/SC) that implement sane
    atomic operations for SMP won't see any of this matter.

    Cc: Russell King
    Cc: David Howells
    Cc: David Miller
    Cc: Matthew Wilcox
    Cc: Linux Arch Maintainers
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
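
The hazard can be illustrated with a user-space emulation of an architecture whose bit operations are made atomic by a lock. This is a sketch, not the kernel's implementation: `emu_atomic_long`, `emu_test_and_set_bit()` and `emu_atomic_long_set()` are hypothetical names, and a C11 `atomic_flag` spinlock stands in for the architecture's atomicity lock. The point it shows is that the assignment has to take the same lock as the bitops.

```c
#include <stdatomic.h>

/* One lock guards every access to the word, as on CPUs without atomic RMW. */
static atomic_flag word_lock = ATOMIC_FLAG_INIT;

struct emu_atomic_long {
	long counter;
};

static void word_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&word_lock, memory_order_acquire))
		;	/* spin until the lock is free */
}

static void word_lock_release(void)
{
	atomic_flag_clear_explicit(&word_lock, memory_order_release);
}

/* Emulated bitop: a load/modify/store sequence made atomic by the lock. */
static int emu_test_and_set_bit(int nr, struct emu_atomic_long *v)
{
	long mask = 1L << nr;
	long old;

	word_lock_acquire();
	old = v->counter;
	v->counter = old | mask;
	word_lock_release();
	return (old & mask) != 0;
}

/*
 * Emulated assignment: taking the same lock keeps the store from landing
 * between the load and the store of a concurrent bitop, where a plain
 * "v->counter = val" could be overwritten and lost.
 */
static void emu_atomic_long_set(struct emu_atomic_long *v, long val)
{
	word_lock_acquire();
	v->counter = val;
	word_lock_release();
}
```

On hardware with real atomic instructions neither function would need the lock, which matches the remark that such architectures see no difference.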
     

16 Dec, 2006

1 commit


08 Dec, 2006

2 commits

  • This allows workqueue users to run just their own pending work, rather
    than wait for the whole workqueue to finish running. This solves the
    deadlock with networking libphy that was due to other workqueue entries
    possibly needing a lock that was held by the routine that wanted to
    flush its own work.

    It's not wonderful: if you absolutely need to synchronize with the work
    function having been executed, any user strictly speaking should have
    its own completion tracking logic, since when we run things explicitly
    by hand, the generic workqueue layer can no longer help us synchronize.

    Also, this is strictly only usable for work that has been scheduled
    without any delayed timers. You can not mix the new interface with
    schedule_delayed_work().

    But it's better than what we had currently.

    Acked-by: Maciej W. Rozycki
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Make it possible to create a workqueue the worker thread of which will be
    frozen during suspend, along with other kernel threads.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Cc: Nigel Cunningham
    Cc: David Chinner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

22 Nov, 2006

4 commits

  • Pass the work_struct pointer to the work function rather than context data.
    The work function can use container_of() to work out the data.

    For the cases where the container of the work_struct may go away the moment the
    pending bit is cleared, it is made possible to defer the release of the
    structure by deferring the clearing of the pending bit.

    To make this work, an extra flag is introduced into the management side of the
    work_struct. This governs auto-release of the structure upon execution.

    Ordinarily, the work queue executor would release the work_struct for further
    scheduling or deallocation by clearing the pending bit prior to jumping to the
    work function. This means that, unless the driver makes some guarantee itself
    that the work_struct won't go away, the work function may not access anything
    else in the work_struct or its container lest they be deallocated. This is a
    problem if the auxiliary data is taken away (as done by the last patch).

    However, if the pending bit is *not* cleared before jumping to the work
    function, then the work function *may* access the work_struct and its container
    with no problems. But then the work function must itself release the
    work_struct by calling work_release().

    In most cases, automatic release is fine, so this is the default. Special
    initiators exist for the non-auto-release case (ending in _NAR).

    Signed-Off-By: David Howells

    David Howells
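
A minimal user-space sketch of the calling convention this commit describes: the handler gets only the work_struct pointer and recovers its enclosing object with container_of(). The struct layout is heavily simplified, and `struct my_device`, `my_device_work()` and `run_work()` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel definitions; not the real headers. */
struct work_struct;
typedef void (*work_func_t)(struct work_struct *work);

struct work_struct {
	work_func_t func;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical driver state that embeds its work item. */
struct my_device {
	int events_handled;
	struct work_struct work;
};

/* The handler receives only the work_struct and recovers its container. */
static void my_device_work(struct work_struct *work)
{
	struct my_device *dev = container_of(work, struct my_device, work);

	dev->events_handled++;
}

/* Stand-in for the executor: call the handler with the work pointer. */
static void run_work(struct work_struct *work)
{
	work->func(work);
}
```

Because the handler reaches its data through the work_struct itself, no separate context pointer has to be stored in the work item.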
     
  • Reclaim a word from the size of the work_struct by folding the pending bit and
    the wq_data pointer together. This shouldn't cause misalignment problems as
    all pointers should be at least 4-byte aligned.

    Signed-Off-By: David Howells

    David Howells
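
The folding trick can be sketched in user-space C as follows. It assumes, as the commit message does, that the stored pointer is at least 4-byte aligned so its low bits are free; the names (`work_sketch`, `WORK_PENDING_MASK`, the accessor functions) are hypothetical and only one of the free bits is used.

```c
#include <assert.h>
#include <stdint.h>

#define WORK_PENDING_MASK	((uintptr_t)0x1)	/* assumed: bit 0 is the pending flag */

struct work_sketch {
	uintptr_t management;	/* pointer bits plus the pending flag in one word */
};

/* Store the pointer while preserving the current pending flag. */
static void set_wq_data(struct work_sketch *w, void *wq_data)
{
	uintptr_t p = (uintptr_t)wq_data;

	assert((p & WORK_PENDING_MASK) == 0);	/* relies on pointer alignment */
	w->management = p | (w->management & WORK_PENDING_MASK);
}

/* Mask the flag off to get the original pointer back. */
static void *get_wq_data(const struct work_sketch *w)
{
	return (void *)(w->management & ~WORK_PENDING_MASK);
}

static void set_pending(struct work_sketch *w)
{
	w->management |= WORK_PENDING_MASK;
}

static void clear_pending(struct work_sketch *w)
{
	w->management &= ~WORK_PENDING_MASK;
}

static int is_pending(const struct work_sketch *w)
{
	return (w->management & WORK_PENDING_MASK) != 0;
}
```

The sketch uses plain loads and stores; a concurrent implementation would need atomic bit operations on the shared word.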
     
  • Define a type for the work function prototype. It's not only kept in the
    work_struct struct, it's also passed as an argument to several functions.

    This makes it easier to change it.

    Signed-Off-By: David Howells

    David Howells
     
  • Separate delayable work items from non-delayable work items by splitting them
    into a separate structure (delayed_work), which incorporates a work_struct and
    the timer_list removed from work_struct.

    The work_struct struct is huge, and this limits its usefulness. On a 64-bit
    architecture it's nearly 100 bytes in size. This reduces that by half for the
    non-delayable type of event.

    Signed-Off-By: David Howells

    David Howells
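
The shape of the split can be sketched with simplified stand-in structures. The field lists below are illustrative assumptions, not the real kernel layouts, and every `_sketch` name is hypothetical. What the sketch shows is that the timer lives only in the delayable wrapper, so a plain work item no longer pays for it.

```c
#include <stddef.h>

struct work_sketch;
typedef void (*work_func_sketch_t)(struct work_sketch *work);

struct list_sketch {
	struct list_sketch *next, *prev;
};

/* Rough stand-in for a timer; the real timer_list differs. */
struct timer_sketch {
	struct list_sketch entry;
	unsigned long expires;
	void (*function)(unsigned long);
	unsigned long data;
};

/* Non-delayable item: no timer embedded. */
struct work_sketch {
	unsigned long management;
	struct list_sketch entry;
	work_func_sketch_t func;
};

/* Delayable item: wraps the plain item and adds the timer. */
struct delayed_work_sketch {
	struct work_sketch work;
	struct timer_sketch timer;
};

/* Recover the delayable wrapper from its embedded plain item. */
static struct delayed_work_sketch *to_delayed_sketch(struct work_sketch *work)
{
	return (struct delayed_work_sketch *)
		((char *)work - offsetof(struct delayed_work_sketch, work));
}
```

Code that only ever queues immediate work can embed `struct work_sketch` directly, while delayed users embed the larger wrapper and pass its `work` member to the common queueing paths.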
     

30 Jun, 2006

1 commit


28 Feb, 2006

1 commit

  • We have several points in the SCSI stack (primarily for our device
    functions) where we need to guarantee process context, but (given the
    place where the last reference was released) we cannot guarantee this.

    This API gets around the issue by executing the function directly if
    the caller has process context, but scheduling a workqueue to execute
    in process context if the caller doesn't have it.

    Signed-off-by: James Bottomley

    James Bottomley
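
The run-now-or-defer behaviour can be modelled in user space as below. This is a sketch under stated assumptions: `fake_in_interrupt` stands in for the kernel's context check, `run_deferred_sketch()` stands in for the worker thread, the 0/1 return convention is an assumption of this example, and all `_sketch` names are hypothetical.

```c
#include <stdbool.h>

/* Stand-in for the kernel's context check; set by the caller of this model. */
static bool fake_in_interrupt;

struct execute_work_sketch {
	void (*fn)(void *data);
	void *data;
	bool queued;
};

/* Returns 0 if fn ran immediately, 1 if it was deferred (assumed convention). */
static int execute_in_process_context_sketch(void (*fn)(void *data), void *data,
					     struct execute_work_sketch *ew)
{
	if (!fake_in_interrupt) {
		fn(data);		/* already in process context: run directly */
		return 0;
	}
	ew->fn = fn;			/* otherwise remember the call for later */
	ew->data = data;
	ew->queued = true;
	return 1;
}

/* Stand-in for the worker thread picking the item up in process context. */
static void run_deferred_sketch(struct execute_work_sketch *ew)
{
	if (!ew->queued)
		return;
	ew->queued = false;
	ew->fn(ew->data);
}

/* Hypothetical callback: counts how many times it was invoked. */
static void release_device_sketch(void *data)
{
	++*(int *)data;
}
```

The caller does not need to know which path was taken; either way the function eventually runs where sleeping is allowed.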
     

09 Jan, 2006

1 commit

  • swap migration's isolate_lru_page() currently uses an IPI to notify other
    processors that the lru caches need to be drained if the page cannot be
    found on the LRU. The IPI interrupt may interrupt a processor that is just
    processing lru requests and cause a race condition.

    This patch introduces a new function run_on_each_cpu() that uses the
    keventd() to run the LRU draining on each processor. Processors disable
    preemption when dealing with the LRU caches (these are per processor) and thus
    executing LRU draining from another process is safe.

    Thanks to Lee Schermerhorn for finding this race
    condition.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

17 Apr, 2005

2 commits

  • This was unexported by Arjan because we have no current users.

    However, during a conversion from tasklets to workqueues of the parisc led
    functions, we ran across a case where this was needed. In particular, the
    open coded equivalent of cancel_rearming_delayed_workqueue was implemented
    incorrectly, which is, I think, all the evidence necessary that this is a
    useful API.

    Signed-off-by: James Bottomley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    James Bottomley
     
  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds