13 Sep, 2012

2 commits

  • ed3e694d "move exit_task_work() past exit_files() et.al" destroyed
    the add/exit synchronization we had, the caller itself should ensure
    task_work_add() can't race with the exiting task.

    However, this is not convenient/simple, and the only user which tries
    to do this is buggy (see the next patch). Unless the task is current,
    there is simply no way to do this in general.

    Change exit_task_work()->task_work_run() to use the dummy "work_exited"
    entry to let task_work_add() know it should fail.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Peter Zijlstra
    Cc: Al Viro
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20120826191211.GA4228@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     
  • Change task_work's to use llist-like code to avoid pi_lock
    in task_work_add(), this makes it useable under rq->lock.

    task_work_cancel() and task_work_run() still use pi_lock
    to synchronize with each other.

    (This is in preparation for a deadlock fix.)

    Suggested-by: Peter Zijlstra
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Peter Zijlstra
    Cc: Al Viro
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20120826191209.GA4221@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     

22 Aug, 2012

1 commit

  • It seems commit 4a9d4b024a31 ("switch fput to task_work_add") re-
    introduced the problem addressed in 944be0b22472 ("close_files(): add
    scheduling point")

    If a server process with a lot of files (say 2 million tcp sockets) is
    killed, we can spend a lot of time in task_work_run() and trigger a soft
    lockup.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     

23 Jul, 2012

4 commits


24 May, 2012

1 commit

  • Provide a simple mechanism that allows running code in the (nonatomic)
    context of the arbitrary task.

    The caller does task_work_add(task, task_work) and this task executes
    task_work->func() either from do_notify_resume() or from do_exit(). The
    callback can rely on PF_EXITING to detect the latter case.

    "struct task_work" can be embedded in another struct, still it has "void
    *data" to handle the most common/simple case.

    This allows us to kill the ->replacement_session_keyring hack, and
    potentially this can have more users.

    Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks into
    tracehook_notify_resume() and do_exit(). But at the same time we can
    remove the "replacement_session_keyring != NULL" checks from
    arch/*/signal.c and exit_creds().

    Note: task_work_add/task_work_run abuses ->pi_lock. This is only because
    this lock is already used by lookup_pi_state() to synchronize with
    do_exit() setting PF_EXITING. Fortunately the scope of this lock in
    task_work.c is really tiny, and the code is unlikely anyway.

    Signed-off-by: Oleg Nesterov
    Acked-by: David Howells
    Cc: Thomas Gleixner
    Cc: Richard Kuo
    Cc: Linus Torvalds
    Cc: Alexander Gordeev
    Cc: Chris Zankel
    Cc: David Smith
    Cc: "Frank Ch. Eigler"
    Cc: Geert Uytterhoeven
    Cc: Larry Woodman
    Cc: Peter Zijlstra
    Cc: Tejun Heo
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Oleg Nesterov