Commit b04f784e5d19ed58892833dae845738972cea260

Authored by Nick Piggin
Committed by Al Viro
1 parent 2a4419b5b2

fs: remove extra lookup in __lookup_hash

fs: remove extra lookup in __lookup_hash

Optimize lookup for create operations, where no dentry should often be
common-case. In cases where it is not, such as unlink, the added overhead
is much smaller than the removed.

Also, move comments about __d_lookup racyness to the __d_lookup call site.
d_lookup is intuitive; __d_lookup is what needs commenting. So in that same
vein, add kerneldoc comments to __d_lookup and clean up some of the comments:

- We are interested in how the RCU lookup works here, particularly with
  renames. Make that explicit, and point to the document where it is explained
  in more detail.
- RCU is pretty standard now, and macros make implementations pretty mindless.
  If we want to know about RCU barrier details, we look in RCU code.
- Delete some boring legacy comments because we don't care much about how the
  code used to work, more about the interesting parts of how it works now. So
  comments about lazy LRU may be interesting, but would better be done in the
  LRU or refcount management code.

Signed-off-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Showing 2 changed files with 51 additions and 41 deletions Side-by-side Diff

... ... @@ -1332,31 +1332,13 @@
1332 1332 * d_lookup - search for a dentry
1333 1333 * @parent: parent dentry
1334 1334 * @name: qstr of name we wish to find
  1335 + * Returns: dentry, or NULL
1335 1336 *
1336   - * Searches the children of the parent dentry for the name in question. If
1337   - * the dentry is found its reference count is incremented and the dentry
1338   - * is returned. The caller must use dput to free the entry when it has
1339   - * finished using it. %NULL is returned on failure.
1340   - *
1341   - * __d_lookup is dcache_lock free. The hash list is protected using RCU.
1342   - * Memory barriers are used while updating and doing lockless traversal.
1343   - * To avoid races with d_move while rename is happening, d_lock is used.
1344   - *
1345   - * Overflows in memcmp(), while d_move, are avoided by keeping the length
1346   - * and name pointer in one structure pointed by d_qstr.
1347   - *
1348   - * rcu_read_lock() and rcu_read_unlock() are used to disable preemption while
1349   - * lookup is going on.
1350   - *
1351   - * The dentry unused LRU is not updated even if lookup finds the required dentry
1352   - * in there. It is updated in places such as prune_dcache, shrink_dcache_sb,
1353   - * select_parent and __dget_locked. This laziness saves lookup from dcache_lock
1354   - * acquisition.
1355   - *
1356   - * d_lookup() is protected against the concurrent renames in some unrelated
1357   - * directory using the seqlockt_t rename_lock.
  1337 + * d_lookup searches the children of the parent dentry for the name in
  1338 + * question. If the dentry is found its reference count is incremented and the
  1339 + * dentry is returned. The caller must use dput to free the entry when it has
  1340 + * finished using it. %NULL is returned if the dentry does not exist.
1358 1341 */
1359   -
1360 1342 struct dentry * d_lookup(struct dentry * parent, struct qstr * name)
1361 1343 {
1362 1344 struct dentry * dentry = NULL;
... ... @@ -1372,6 +1354,21 @@
1372 1354 }
1373 1355 EXPORT_SYMBOL(d_lookup);
1374 1356  
  1357 +/*
  1358 + * __d_lookup - search for a dentry (racy)
  1359 + * @parent: parent dentry
  1360 + * @name: qstr of name we wish to find
  1361 + * Returns: dentry, or NULL
  1362 + *
  1363 + * __d_lookup is like d_lookup, however it may (rarely) return a
  1364 + * false-negative result due to unrelated rename activity.
  1365 + *
  1366 + * __d_lookup is slightly faster by avoiding rename_lock read seqlock,
  1367 + * however it must be used carefully, eg. with a following d_lookup in
  1368 + * the case of failure.
  1369 + *
  1370 + * __d_lookup callers must be commented.
  1371 + */
1375 1372 struct dentry * __d_lookup(struct dentry * parent, struct qstr * name)
1376 1373 {
1377 1374 unsigned int len = name->len;
... ... @@ -1382,6 +1379,19 @@
1382 1379 struct hlist_node *node;
1383 1380 struct dentry *dentry;
1384 1381  
  1382 + /*
  1383 + * The hash list is protected using RCU.
  1384 + *
  1385 + * Take d_lock when comparing a candidate dentry, to avoid races
  1386 + * with d_move().
  1387 + *
  1388 + * It is possible that concurrent renames can mess up our list
  1389 + * walk here and result in missing our dentry, resulting in the
  1390 + * false-negative result. d_lookup() protects against concurrent
  1391 + * renames using rename_lock seqlock.
  1392 + *
  1393 + * See Documentation/vfs/dcache-locking.txt for more details.
  1394 + */
1385 1395 rcu_read_lock();
1386 1396  
1387 1397 hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
... ... @@ -1396,8 +1406,8 @@
1396 1406  
1397 1407 /*
1398 1408 * Recheck the dentry after taking the lock - d_move may have
1399   - * changed things. Don't bother checking the hash because we're
1400   - * about to compare the whole name anyway.
  1409 + * changed things. Don't bother checking the hash because
  1410 + * we're about to compare the whole name anyway.
1401 1411 */
1402 1412 if (dentry->d_parent != parent)
1403 1413 goto next;
... ... @@ -735,6 +735,11 @@
735 735 return err;
736 736 }
737 737  
  738 + /*
  739 + * Rename seqlock is not required here because in the off chance
  740 + * of a false negative due to a concurrent rename, we're going to
  741 + * do the non-racy lookup, below.
  742 + */
738 743 dentry = __d_lookup(nd->path.dentry, name);
739 744 if (!dentry)
740 745 goto need_lookup;
741 746  
... ... @@ -754,17 +759,13 @@
754 759 mutex_lock(&dir->i_mutex);
755 760 /*
756 761 * First re-do the cached lookup just in case it was created
757   - * while we waited for the directory semaphore..
  762 + * while we waited for the directory semaphore, or the first
  763 + * lookup failed due to an unrelated rename.
758 764 *
759   - * FIXME! This could use version numbering or similar to
760   - * avoid unnecessary cache lookups.
761   - *
762   - * The "dcache_lock" is purely to protect the RCU list walker
763   - * from concurrent renames at this point (we mustn't get false
764   - * negatives from the RCU list walk here, unlike the optimistic
765   - * fast walk).
766   - *
767   - * so doing d_lookup() (with seqlock), instead of lockfree __d_lookup
  765 + * This could use version numbering or similar to avoid unnecessary
  766 + * cache lookups, but then we'd have to do the first lookup in the
  767 + * non-racy way. However in the common case here, everything should
  768 + * be hot in cache, so would it be a big win?
768 769 */
769 770 dentry = d_lookup(parent, name);
770 771 if (likely(!dentry)) {
771 772  
... ... @@ -1136,13 +1137,12 @@
1136 1137 goto out;
1137 1138 }
1138 1139  
1139   - dentry = __d_lookup(base, name);
1140   -
1141   - /* lockess __d_lookup may fail due to concurrent d_move()
1142   - * in some unrelated directory, so try with d_lookup
  1140 + /*
  1141 + * Don't bother with __d_lookup: callers are for creat as
  1142 + * well as unlink, so a lot of the time it would cost
  1143 + * a double lookup.
1143 1144 */
1144   - if (!dentry)
1145   - dentry = d_lookup(base, name);
  1145 + dentry = d_lookup(base, name);
1146 1146  
1147 1147 if (dentry && dentry->d_op && dentry->d_op->d_revalidate)
1148 1148 dentry = do_revalidate(dentry, nd);