Commit 7b2b55da1db10a5525460633ae4b6fb0be060c41

Authored by Jason Xing
Committed by Linus Torvalds
1 parent bb65f89b7d

psi: get poll_work to run when calling poll syscall next time

Only when the poll syscall is called for the first time does the user
receive POLLPRI correctly.  After that, the user always fails to receive
the event signal.

Reproduce case:
 1. Get the monitor code in Documentation/accounting/psi.txt
 2. Run it, and wait for the event to be triggered.
 3. Kill and restart the process.
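
For readers without the documentation at hand, the monitor in step 1 can be approximated as below. This is a minimal sketch modelled on the psi.txt example, not a copy of it: the helper names psi_open_trigger and psi_wait_event and their return conventions are assumptions made for illustration.

```c
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Open a PSI pressure file and register a trigger on it.
 * Returns the file descriptor, or -1 on failure.
 * Helper name and signature are illustrative, not taken from psi.txt.
 */
static int psi_open_trigger(const char *path, const char *trig)
{
	int fd = open(path, O_RDWR | O_NONBLOCK);

	if (fd < 0)
		return -1;
	/* The trigger string is written including its terminating NUL. */
	if (write(fd, trig, strlen(trig) + 1) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Wait up to timeout_ms (-1 blocks indefinitely) for a trigger event.
 * Returns 1 on POLLPRI, 0 on timeout, -1 on error or POLLERR.
 */
static int psi_wait_event(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI };
	int n = poll(&pfd, 1, timeout_ms);

	if (n < 0)
		return -1;
	if (n == 0)
		return 0;
	if (pfd.revents & POLLERR)
		return -1;	/* the monitored file went away */
	return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A caller would typically use psi_open_trigger("/proc/pressure/memory", "some 150000 1000000") and then loop on psi_wait_event(fd, -1). With the bug described here, the second run of such a program would block in poll() without ever seeing POLLPRI.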

The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0).  And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker.  The
cancel needs to pair with resetting the poll_scheduled flag.
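
The stuck flag can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: struct model_group and the model_* helpers are hypothetical, the interleaving is replayed sequentially rather than with real threads, and the scheduling side is assumed to guard queueing with a compare-and-exchange on poll_scheduled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant psi_group state. */
struct model_group {
	atomic_int poll_scheduled;	/* 1 while work is considered queued */
	bool work_queued;		/* models the delayed work item */
	bool worker_alive;		/* models group->poll_kworker != NULL */
};

static void model_init(struct model_group *g)
{
	atomic_init(&g->poll_scheduled, 0);
	g->work_queued = false;
	g->worker_alive = true;
}

/* Scheduling side: returns true only if work was actually queued. */
static bool model_schedule(struct model_group *g)
{
	int expected = 0;

	/* Bail out if the flag says work is already scheduled. */
	if (!atomic_compare_exchange_strong(&g->poll_scheduled, &expected, 1))
		return false;
	if (!g->worker_alive) {
		atomic_store(&g->poll_scheduled, 0);
		return false;
	}
	g->work_queued = true;
	return true;
}

/* Teardown: cancel pending work, optionally reset the flag, drop the worker. */
static void model_destroy(struct model_group *g, bool reset_flag)
{
	g->work_queued = false;		/* cancelled work never runs */
	if (reset_flag)
		atomic_store(&g->poll_scheduled, 0);
	g->worker_alive = false;
}

/* A new trigger is created, for example after the monitor restarts. */
static void model_recreate(struct model_group *g)
{
	g->worker_alive = true;
}
```

Replaying schedule, destroy, recreate, schedule with reset_flag set to false leaves poll_scheduled at 1, so the final model_schedule() returns false and no work is ever queued again. With reset_flag set to true, which mirrors the pairing this commit asks for, the final call succeeds.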

Link: http://lkml.kernel.org/r/1566357985-97781-1-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: Jason Xing <kerneljasonxing@linux.alibaba.com>
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Caspar Zhang <caspar@linux.alibaba.com>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 1 changed file with 8 additions and 0 deletions

@@ -1131,7 +1131,15 @@
 	 * deadlock while waiting for psi_poll_work to acquire trigger_lock
 	 */
 	if (kworker_to_destroy) {
+		/*
+		 * After the RCU grace period has expired, the worker
+		 * can no longer be found through group->poll_kworker.
+		 * But it might have been already scheduled before
+		 * that - deschedule it cleanly before destroying it.
+		 */
 		kthread_cancel_delayed_work_sync(&group->poll_work);
+		atomic_set(&group->poll_scheduled, 0);
+
 		kthread_destroy_worker(kworker_to_destroy);
 	}
 	kfree(t);