  		     ====================================
  		     SLOW WORK ITEM EXECUTION THREAD POOL
  		     ====================================
  
  By: David Howells <dhowells@redhat.com>
  
  The slow work item execution thread pool is a pool of threads for performing
  things that take a relatively long time, such as making mkdir calls.
  Typically, when processing something, these items will spend a lot of time
  blocking a thread on I/O, thus making that thread unavailable for doing other
  work.
  
  The standard workqueue model is unsuitable for this class of work item as that
  limits the owner to a single thread or a single thread per CPU.  For some
  tasks, however, more threads - or fewer - are required.
  
  There is just one pool per system.  It contains no threads unless something
  wants to use it - and that something must register its interest first.  When
  the pool is active, the number of threads it contains is dynamic, varying
  between a maximum and minimum setting, depending on the load.
  
  
  ====================
  CLASSES OF WORK ITEM
  ====================
  
  This pool supports two classes of work items:
  
   (*) Slow work items.
  
   (*) Very slow work items.
  
  The former are expected to finish much quicker than the latter.
  
  An operation of the very slow class may do a batch combination of several
  lookups, mkdirs, and a create for instance.
  
  An operation of the ordinarily slow class may, for example, write stuff or
  expand files, provided the time taken to do so isn't too long.
  
  Operations of both types may sleep during execution, thus tying up the threads
  loaned to them.
  
  A further class of work item is available, based on the slow work item class:
  
   (*) Delayed slow work items.
  
  These are slow work items that have a timer to defer queueing of the item for
  a while.
  
  THREAD-TO-CLASS ALLOCATION
  --------------------------
  
  Not all the threads in the pool are available to work on very slow work items.
  The number will be between one and one fewer than the number of active threads.
  This is configurable (see the "Pool Configuration" section).
  
  All the threads are available to work on ordinarily slow work items, but a
  percentage of the threads will prefer to work on very slow work items.
  
  The configuration ensures that at least one thread will be available to work on
  very slow work items, and at least one thread will be available that won't work
  on very slow work items at all.
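  
  The bounds described above can be sketched in userspace C as follows.  This
  is only an illustration: the function name vslow_thread_limit(), the
  round-down division and the handling of fewer than two threads are
  assumptions, not the kernel's code.  The percentage argument stands for the
  vslow-percentage setting.

```c
/* Userspace sketch only; the name and rounding are illustrative assumptions. */
static unsigned int vslow_thread_limit(unsigned int active,
				       unsigned int percent)
{
	unsigned int limit;

	if (active < 2)
		return 0;	/* the bounds below need at least two threads */

	limit = active * percent / 100;	/* integer division rounds down */
	if (limit < 1)
		limit = 1;		/* one thread always takes very slow items */
	if (limit > active - 1)
		limit = active - 1;	/* one thread never takes them */
	return limit;
}
```

  With eight active threads and a setting of 50, four threads may take very
  slow items under this sketch; with four threads and a setting of 99, the
  result is capped at three.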
  
  
  =====================
  USING SLOW WORK ITEMS
  =====================
  
  Firstly, a module or subsystem wanting to make use of slow work items must
  register its interest:
  
  	int ret = slow_work_register_user(struct module *module);

  This will return 0 if successful, or a -ve error upon failure.  The module
  pointer should be the module interested in using this facility (almost
  certainly THIS_MODULE).
  
  
  Slow work items may then be set up by:
  
   (1) Declaring a slow_work struct type variable:
  
  	#include <linux/slow-work.h>
  
  	struct slow_work myitem;
  
   (2) Declaring the operations to be used for this item:
  
  	struct slow_work_ops myitem_ops = {
  		.get_ref = myitem_get_ref,
  		.put_ref = myitem_put_ref,
  		.execute = myitem_execute,
  	};
  
       [*] For a description of the ops, see section "Item Operations".
  
   (3) Initialising the item:
  
  	slow_work_init(&myitem, &myitem_ops);
  
       or:
  
  	delayed_slow_work_init(&myitem, &myitem_ops);
  
       or:
  
  	vslow_work_init(&myitem, &myitem_ops);
  
       depending on its class.
  
  A suitably set up work item can then be enqueued for processing:
  
  	int ret = slow_work_enqueue(&myitem);
  
  This will return a -ve error if the thread pool is unable to gain a reference
  on the item, or 0 otherwise.  Delayed work items are enqueued with:
  
  	int ret = delayed_slow_work_enqueue(&myitem, my_jiffy_delay);
  
  
  The items are reference counted, so there ought to be no need for a flush
  operation.  However, as the reference counting is optional, functions to
  cancel existing work items are also provided:
  
  	cancel_slow_work(&myitem);
  	cancel_delayed_slow_work(&myitem);
  
  These can be used to cancel pending work.  Each cancel function either
  prevents the item from being executed or waits for it to finish executing,
  depending on timing.
  
  
  When all of a module's slow work items have been processed, and the
  module has no further interest in the facility, it should unregister its
  interest:
  
  	slow_work_unregister_user(struct module *module);
  
  The module pointer is used to wait for all outstanding work items for that
  module before completing the unregistration.  This prevents the put_ref() code
  from being taken away before it completes.  module should almost certainly be
  THIS_MODULE.

  ================
  HELPER FUNCTIONS
  ================
  
  The slow-work facility provides a function by which it can be determined
  whether or not an item is queued for later execution:
  
  	bool queued = slow_work_is_queued(struct slow_work *work);
  
  If it returns false, then the item is not on the queue (it may be executing
  with a requeue pending).  This can be used to work out whether an item on which
  another depends is on the queue, thus allowing a dependent item to be queued
  after it.
  
  If the above shows an item on which another depends not to be queued, then the
  owner of the dependent item might need to wait.  However, to avoid locking up
  the threads unnecessarily by sleeping in them, it can make sense under some
  circumstances to return the work item to the queue, thus deferring it until
  some other items have had a chance to make use of the yielded thread.
  
  To yield a thread and defer an item, the work function should simply enqueue
  the work item again and return.  However, this doesn't work if there's nothing
  actually on the queue, as the thread just vacated will jump straight back into
  the item's work function, thus busy waiting on a CPU.
  
  Instead, the item should use the thread to wait for the dependency to go away,
  but rather than using schedule() or schedule_timeout() to sleep, it should use
  the following function:
  
  	bool requeue = slow_work_sleep_till_thread_needed(
  			struct slow_work *work,
  			signed long *_timeout);
  
  This will add a second wait and then sleep, such that it will be woken up
  either if something appears on the queue that could usefully make use of the
  thread (and behind which this item can be queued), or if the event the caller
  set up to wait for happens.  True will be returned if something else appeared
  on the queue and this work function should perhaps return, or false if
  something else woke it up.  The timeout is as for schedule_timeout().
  
  For example:
  
  	wq = bit_waitqueue(&my_flags, MY_BIT);
  	init_wait(&wait);
  	requeue = false;
  	do {
  		prepare_to_wait(wq, &wait, TASK_UNINTERRUPTIBLE);
  		if (!test_bit(MY_BIT, &my_flags))
  			break;
  		requeue = slow_work_sleep_till_thread_needed(&my_work,
  							     &timeout);
  	} while (timeout > 0 && !requeue);
  	finish_wait(wq, &wait);
  	if (!test_bit(MY_BIT, &my_flags))
  		goto do_my_thing;
  	if (requeue)
  		return; // to slow_work

  ===============
  ITEM OPERATIONS
  ===============
  
  Each work item requires a table of operations of type struct slow_work_ops.
  Only ->execute() is required; the getting and putting of a reference and the
  describing of an item are all optional.
  
   (*) Get a reference on an item:
  
  	int (*get_ref)(struct slow_work *work);
  
       This allows the thread pool to attempt to pin an item by getting a
       reference on it.  This function should return 0 if the reference was
       granted, or a -ve error otherwise.  If an error is returned,
       slow_work_enqueue() will fail.
  
       The reference is held whilst the item is queued and whilst it is being
       executed.  The item may then be requeued with the same reference held, or
       the reference will be released.
  
   (*) Release a reference on an item:
  
  	void (*put_ref)(struct slow_work *work);
  
       This allows the thread pool to unpin an item by releasing the reference on
       it.  The thread pool will not touch the item again once this has been
       called.
  
   (*) Execute an item:
  
  	void (*execute)(struct slow_work *work);
  
       This should perform the work required of the item.  It may sleep, it may
       perform disk I/O and it may wait for locks.
  
   (*) View an item through debugfs:
  
  	void (*desc)(struct slow_work *work, struct seq_file *m);
  
       If supplied, this should print to 'm' a small string describing the work
       the item is to do.  This should be no more than about 40 characters, and
       shouldn't include a newline character.
  
       See the 'Viewing executing and queued items' section below.
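  
  How the reference operations bracket execution can be modelled in userspace
  C as below.  The struct definitions are cut-down stand-ins for the kernel
  types, and struct my_object, the my_*() functions and model_enqueue_and_run()
  are hypothetical names invented for this sketch; the real thread pool queues
  the item and runs it later on one of its own threads rather than calling
  execute() inline.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types; only the fields used here. */
struct slow_work;

struct slow_work_ops {
	int  (*get_ref)(struct slow_work *work);	/* optional */
	void (*put_ref)(struct slow_work *work);	/* optional */
	void (*execute)(struct slow_work *work);	/* required */
};

struct slow_work {
	const struct slow_work_ops *ops;
};

/* Hypothetical object that embeds a work item and counts references. */
struct my_object {
	struct slow_work work;	/* first member, so a cast recovers the object */
	int refs;		/* references currently held on the object */
	int dying;		/* set once the object is being torn down */
	int runs;		/* number of times execute() has run */
};

static int my_get_ref(struct slow_work *work)
{
	struct my_object *obj = (struct my_object *)work;

	if (obj->dying)
		return -EAGAIN;	/* refuse to pin, so enqueueing fails */
	obj->refs++;
	return 0;
}

static void my_put_ref(struct slow_work *work)
{
	((struct my_object *)work)->refs--;
}

static void my_execute(struct slow_work *work)
{
	((struct my_object *)work)->runs++;
}

static const struct slow_work_ops my_ops = {
	.get_ref = my_get_ref,
	.put_ref = my_put_ref,
	.execute = my_execute,
};

/* Model of one enqueue-and-run cycle as seen by the item's owner. */
static int model_enqueue_and_run(struct slow_work *work)
{
	int ret = 0;

	if (work->ops->get_ref)
		ret = work->ops->get_ref(work);
	if (ret < 0)
		return ret;			/* item was never pinned */
	work->ops->execute(work);		/* reference held throughout */
	if (work->ops->put_ref)
		work->ops->put_ref(work);	/* item not touched after this */
	return 0;
}
```

  In this model a successful cycle leaves the reference count where it started,
  and an item whose get_ref() fails is never executed.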
  
  ==================
  POOL CONFIGURATION
  ==================
  
  The slow-work thread pool has a number of configurables:
  
   (*) /proc/sys/kernel/slow-work/min-threads
  
       The minimum number of threads that should be in the pool whilst it is in
       use.  This may be anywhere between 2 and max-threads.
  
   (*) /proc/sys/kernel/slow-work/max-threads
  
       The maximum number of threads that should be in the pool.  This may be
       anywhere between min-threads and 255 or NR_CPUS * 2, whichever is greater.
  
   (*) /proc/sys/kernel/slow-work/vslow-percentage
  
       The percentage of active threads in the pool that may be used to execute
       very slow work items.  This may be between 1 and 99.  The resultant number
       is bounded to between 1 and one fewer than the number of active threads.
       This ensures there is always at least one thread that can process very
       slow work items, and always at least one thread that won't.
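  
  The permitted ranges listed above can be checked with a small userspace C
  sketch such as the following.  struct pool_config, max_threads_ceiling() and
  pool_config_valid() are hypothetical names used only for illustration, and
  nr_cpus stands in for NR_CPUS; none of this is taken from the kernel source.

```c
#include <stdbool.h>

/* Hypothetical container for the three settings described above. */
struct pool_config {
	unsigned int min_threads;
	unsigned int max_threads;
	unsigned int vslow_percentage;
};

/* Upper bound on max-threads: 255 or NR_CPUS * 2, whichever is greater. */
static unsigned int max_threads_ceiling(unsigned int nr_cpus)
{
	unsigned int by_cpus = nr_cpus * 2;

	return by_cpus > 255 ? by_cpus : 255;
}

/* Check a proposed configuration against the documented ranges. */
static bool pool_config_valid(const struct pool_config *c,
			      unsigned int nr_cpus)
{
	if (c->min_threads < 2 || c->min_threads > c->max_threads)
		return false;	/* min-threads: between 2 and max-threads */
	if (c->max_threads > max_threads_ceiling(nr_cpus))
		return false;	/* max-threads: at most the ceiling */
	return c->vslow_percentage >= 1 && c->vslow_percentage <= 99;
}
```

  For instance, a max-threads value of 256 is rejected by this check when
  nr_cpus is 4, but accepted when nr_cpus is 128.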
  
  
  ==================================
  VIEWING EXECUTING AND QUEUED ITEMS
  ==================================
  
  If CONFIG_SLOW_WORK_DEBUG is enabled, a debugfs file is made available:
  
  	/sys/kernel/debug/slow_work/runqueue
  
  through which the list of work items being executed and the queues of items to
  be executed may be viewed.  The owner of a work item is given the chance to
  add some information of its own.
  
  The contents look something like the following:
  
      THR PID   ITEM ADDR        FL MARK  DESC
      === ===== ================ == ===== ==========
        0  3005 ffff880023f52348  a 952ms FSC: OBJ17d3: LOOK
        1  3006 ffff880024e33668  2 160ms FSC: OBJ17e5 OP60d3b: Write1/Store fl=2
        2  3165 ffff8800296dd180  a 424ms FSC: OBJ17e4: LOOK
        3  4089 ffff8800262c8d78  a 212ms FSC: OBJ17ea: CRTN
        4  4090 ffff88002792bed8  2 388ms FSC: OBJ17e8 OP60d36: Write1/Store fl=2
        5  4092 ffff88002a0ef308  2 388ms FSC: OBJ17e7 OP60d2e: Write1/Store fl=2
        6  4094 ffff88002abaf4b8  2 132ms FSC: OBJ17e2 OP60d4e: Write1/Store fl=2
        7  4095 ffff88002bb188e0  a 388ms FSC: OBJ17e9: CRTN
      vsq     - ffff880023d99668  1 308ms FSC: OBJ17e0 OP60f91: Write1/EnQ fl=2
      vsq     - ffff8800295d1740  1 212ms FSC: OBJ16be OP4d4b6: Write1/EnQ fl=2
      vsq     - ffff880025ba3308  1 160ms FSC: OBJ179a OP58dec: Write1/EnQ fl=2
      vsq     - ffff880024ec83e0  1 160ms FSC: OBJ17ae OP599f2: Write1/EnQ fl=2
      vsq     - ffff880026618e00  1 160ms FSC: OBJ17e6 OP60d33: Write1/EnQ fl=2
      vsq     - ffff880025a2a4b8  1 132ms FSC: OBJ16a2 OP4d583: Write1/EnQ fl=2
      vsq     - ffff880023cbe6d8  9 212ms FSC: OBJ17eb: LOOK
      vsq     - ffff880024d37590  9 212ms FSC: OBJ17ec: LOOK
      vsq     - ffff880027746cb0  9 212ms FSC: OBJ17ed: LOOK
      vsq     - ffff880024d37ae8  9 212ms FSC: OBJ17ee: LOOK
      vsq     - ffff880024d37cb0  9 212ms FSC: OBJ17ef: LOOK
      vsq     - ffff880025036550  9 212ms FSC: OBJ17f0: LOOK
      vsq     - ffff8800250368e0  9 212ms FSC: OBJ17f1: LOOK
      vsq     - ffff880025036aa8  9 212ms FSC: OBJ17f2: LOOK
  
  In the 'THR' column, executing items show the thread they're occupying and
  queued items indicate which queue they're on.  'PID' shows the process ID of
  a slow-work thread that's executing something.  'FL' shows the work item flags.
  'MARK' indicates how long since an item was queued or began executing.  Lastly,
  the 'DESC' column permits the owner of an item to give some information.
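  
  A line of this output can be split into its columns from userspace with
  sscanf(), as in the sketch below.  struct runqueue_entry and
  parse_runqueue_line() are hypothetical names, and the field widths are
  assumptions based only on the sample shown above.  THR, PID and MARK are kept
  as strings because queued items show a queue name and '-' in the first two
  columns, and the sample only shows one form of the MARK value.

```c
#include <stdio.h>
#include <string.h>

/* One parsed row of the runqueue file; layout assumed from the sample. */
struct runqueue_entry {
	char thr[8];			/* thread number or queue name */
	char pid[8];			/* PID, or "-" for queued items */
	unsigned long long addr;	/* work item address */
	unsigned int flags;		/* work item flags (hexadecimal) */
	char mark[16];			/* time since queued or started */
	char desc[64];			/* owner-supplied description */
};

/* Returns 0 on success, or -1 for headings and unparseable lines. */
static int parse_runqueue_line(const char *line, struct runqueue_entry *e)
{
	int n;

	memset(e, 0, sizeof(*e));
	n = sscanf(line, " %7s %7s %llx %x %15s %63[^\n]",
		   e->thr, e->pid, &e->addr, &e->flags, e->mark, e->desc);
	/* The description is optional; the first five fields are not. */
	return n >= 5 ? 0 : -1;
}
```

  The two heading lines are rejected because their third column is not a
  hexadecimal address.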