Commit 64580cdff828d554d78010a8b19fef0408fdd2bd

Authored by Ryusuke Konishi
Committed by Greg Kroah-Hartman
1 parent 32b783339f

nilfs2: fix deadlock of segment constructor during recovery

commit 283ee1482f349d6c0c09dfb725db5880afc56813 upstream.

According to a report from Yuxuan Shui, nilfs2 in kernel 3.19 got stuck
during recovery at mount time.  The code path that caused the deadlock was
as follows:

  nilfs_fill_super()
    load_nilfs()
      nilfs_salvage_orphan_logs()
        * Do roll-forwarding, attach segment constructor for recovery,
          and kick it.

        nilfs_segctor_thread()
          nilfs_segctor_thread_construct()
           * A lock is held with nilfs_transaction_lock()
             nilfs_segctor_do_construct()
               nilfs_segctor_drop_written_files()
                 iput()
                   iput_final()
                     write_inode_now()
                       writeback_single_inode()
                         __writeback_single_inode()
                           do_writepages()
                             nilfs_writepage()
                               nilfs_construct_dsync_segment()
                                 nilfs_transaction_lock() --> deadlock

This can happen if commit 7ef3ff2fea8b ("nilfs2: fix deadlock of segment
constructor over I_SYNC flag") is applied and roll-forward recovery is
performed at mount time.  Roll-forward recovery can happen if a datasync
write is done and the file system crashes immediately after that.  For
instance, we can reproduce the issue with the following steps:

 < nilfs2 is mounted on /nilfs (device: /dev/sdb1) >
 # dd if=/dev/zero of=/nilfs/test bs=4k count=1 && sync
 # dd if=/dev/zero of=/nilfs/test conv=notrunc oflag=dsync bs=4k \
      count=1 && reboot -nfh
 < the system will immediately reboot >
 # mount -t nilfs2 /dev/sdb1 /nilfs

The deadlock occurs because iput() can run the segment constructor through
writeback_single_inode() if the MS_ACTIVE flag is not set on sb->s_flags.
The above commit changed the segment constructor so that it calls iput()
asynchronously for inodes with i_nlink == 0, but that change was
incomplete: it did not cover iput() calls made before the mount finishes.
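
The re-entrancy described above can be illustrated with a small userspace
model.  This is only a sketch under stated assumptions, not kernel code:
struct model_lock and every model_* function are hypothetical names
invented here, and the lock reports -EDEADLK at the point where the real
nilfs_transaction_lock() would block forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the lock taken by nilfs_transaction_lock(). */
struct model_lock {
	bool held;
};

/* Returns 0 on success, -EDEADLK where the real lock would never return. */
int model_lock_acquire(struct model_lock *l)
{
	if (l->held)
		return -EDEADLK;
	l->held = true;
	return 0;
}

void model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Models iput() -> ... -> nilfs_writepage() ->
 * nilfs_construct_dsync_segment().  Simplification: when the superblock
 * is active, no synchronous writeback is modeled at all.
 */
int model_iput(struct model_lock *l, bool sb_active)
{
	int err;

	if (sb_active)
		return 0;

	err = model_lock_acquire(l);	/* second acquisition, same thread */
	if (err)
		return err;
	model_lock_release(l);
	return 0;
}

/* Models the segment constructor calling iput() while holding the lock. */
int model_construct(bool sb_active)
{
	struct model_lock l = { .held = false };
	int err;

	err = model_lock_acquire(&l);
	if (err)
		return err;
	err = model_iput(&l, sb_active);
	model_lock_release(&l);
	return err;
}
```

In this model, model_construct(true) succeeds, while model_construct(false)
reports the self-deadlock that corresponds to the mount-time case.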

This fixes the remaining deadlock by deferring iput() in the segment
constructor even when the mount has not finished, that is, when the
MS_ACTIVE flag is not set.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Reported-by: Yuxuan Shui <yshuiv7@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Showing 1 changed file with 4 additions and 3 deletions

@@ -1906,6 +1906,7 @@
 					     struct the_nilfs *nilfs)
 {
 	struct nilfs_inode_info *ii, *n;
+	int during_mount = !(sci->sc_super->s_flags & MS_ACTIVE);
 	int defer_iput = false;
 
 	spin_lock(&nilfs->ns_inode_lock);
@@ -1918,10 +1919,10 @@
 		brelse(ii->i_bh);
 		ii->i_bh = NULL;
 		list_del_init(&ii->i_dirty);
-		if (!ii->vfs_inode.i_nlink) {
+		if (!ii->vfs_inode.i_nlink || during_mount) {
 			/*
-			 * Defer calling iput() to avoid a deadlock
-			 * over I_SYNC flag for inodes with i_nlink == 0
+			 * Defer calling iput() to avoid deadlocks if
+			 * i_nlink == 0 or mount is not yet finished.
 			 */
 			list_add_tail(&ii->i_dirty, &sci->sc_iput_queue);
 			defer_iput = true;
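
The condition changed in the hunk above can be exercised in isolation with
a userspace sketch.  The names below (struct model_inode,
model_should_defer(), model_drop_written_files()) are hypothetical and
exist only for illustration.  The sketch assumes that a deferred inode is
simply marked for a later iput() outside the lock, and it does not model
the queue or the worker that drains it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal stand-in for the per-inode state that matters here. */
struct model_inode {
	unsigned int nlink;	/* models vfs_inode.i_nlink */
	bool deferred;		/* would be moved to the deferred-iput queue */
	bool put_inline;	/* iput() would run inside the constructor */
};

/* Mirrors the patched test: !i_nlink || during_mount. */
bool model_should_defer(unsigned int nlink, bool sb_active)
{
	bool during_mount = !sb_active;	/* MS_ACTIVE not yet set */

	return nlink == 0 || during_mount;
}

/* Classifies each inode and returns how many were deferred. */
size_t model_drop_written_files(struct model_inode *inodes, size_t n,
				bool sb_active)
{
	size_t i, ndeferred = 0;

	assert(inodes != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (model_should_defer(inodes[i].nlink, sb_active)) {
			inodes[i].deferred = true;
			ndeferred++;
		} else {
			inodes[i].put_inline = true;
		}
	}
	return ndeferred;
}
```

With the superblock active, only inodes with nlink == 0 are deferred, which
matches the behavior before this patch.  With the superblock not yet
active, every inode is deferred, so no iput() runs while the constructor
holds its lock.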