Commit e6fb6da2e10682d477f2fdb749451d9fe5d168e8

Authored by Wu Fengguang
1 parent cb9bd1159c

writeback: try more writeback as long as something was written

writeback_inodes_wb()/__writeback_inodes_sb() are not aggressive: they
may populate only a subset of the eligible inodes into b_io at entrance
time. Once the queued inodes are all synced, they simply return,
possibly with every queued inode page written but wbc.nr_to_write
still > 0.

For kupdate and background writeback, there may be more eligible inodes
sitting in b_dirty when the current set of b_io inodes is completed. So
it is necessary to try another round of writeback as long as we made some
progress in this round. When there are no more eligible inodes, queue_io()
will enqueue nothing, hence nothing will be synced and we may safely
bail.

For example, imagine 100 inodes

        i0, i1, i2, ..., i90, i91, ..., i99

At queue_io() time, i90-i99 happen to be expired and moved to b_io for
IO. When finished successfully, if their total size is less than
MAX_WRITEBACK_PAGES, nr_to_write will be > 0. Then wb_writeback() will
quit the background work (w/o this patch) while it's still over
background threshold. This will be a fairly normal/frequent case I guess.
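
The example can be played through with a toy model. This is only a sketch
under stated assumptions, not the kernel code: the names toy_wb, toy_round()
and toy_writeback() are hypothetical, as are the values chosen for
PAGES_PER_INODE, BATCH and WRITE_CHUNK (a stand-in for MAX_WRITEBACK_PAGES).
It also ignores wbc.inodes_written and wbc.more_io, which the real loop
consults, and it queues inodes from the front of the array rather than
i90-i99, which does not change the counts.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_INODES        100   /* i0..i99, as in the example */
#define WRITE_CHUNK      1024  /* stand-in for MAX_WRITEBACK_PAGES (assumed value) */
#define PAGES_PER_INODE  4     /* assumption: every inode has 4 dirty pages */
#define BATCH            10    /* assumption: inodes eligible per queue_io() call */

/* Hypothetical model of one writeback target. */
struct toy_wb {
	int dirty_pages[NR_INODES]; /* dirty pages left per inode */
	int next;                   /* first inode that has not been queued yet */
};

static void toy_init(struct toy_wb *wb)
{
	/* The model relies on a batch being smaller than a chunk. */
	assert(BATCH * PAGES_PER_INODE < WRITE_CHUNK);
	for (int i = 0; i < NR_INODES; i++)
		wb->dirty_pages[i] = PAGES_PER_INODE;
	wb->next = 0;
}

/* One round: queue up to BATCH inodes and write them. Returns pages written. */
static int toy_round(struct toy_wb *wb)
{
	int nr_to_write = WRITE_CHUNK;
	int queued = 0;

	while (wb->next < NR_INODES && queued < BATCH &&
	       nr_to_write >= wb->dirty_pages[wb->next]) {
		nr_to_write -= wb->dirty_pages[wb->next];
		wb->dirty_pages[wb->next] = 0;
		wb->next++;
		queued++;
	}
	return WRITE_CHUNK - nr_to_write;
}

/*
 * Run rounds until the policy says stop; returns inodes still dirty.
 * retry_on_progress == false: retry only when the whole chunk was consumed.
 * retry_on_progress == true:  retry whenever the round wrote anything.
 */
int toy_writeback(bool retry_on_progress)
{
	struct toy_wb wb;

	toy_init(&wb);
	for (;;) {
		int written = toy_round(&wb);

		if (written == WRITE_CHUNK)
			continue;
		if (retry_on_progress && written > 0)
			continue;
		break;
	}
	return NR_INODES - wb.next;
}
```

In this model toy_writeback(false) stops after the first batch and leaves 90
inodes dirty, while toy_writeback(true) keeps going until a round writes
nothing and leaves none.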

Now that we do tagged sync and update inode->dirtied_when after the sync,
this change won't livelock sync(1).  I actually tried to write 1 page
per 1ms with this command

	write-and-fsync -n10000 -S 1000 -c 4096 /fs/test

and ran sync(1) at the same time. The sync completed quickly on ext4,
xfs, and btrfs.

Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>

Showing 1 changed file with 8 additions and 8 deletions

@@ -739,22 +739,22 @@
 		wrote += write_chunk - wbc.nr_to_write;
 
 		/*
-		 * If we consumed everything, see if we have more
+		 * Did we write something? Try for more
+		 *
+		 * Dirty inodes are moved to b_io for writeback in batches.
+		 * The completion of the current batch does not necessarily
+		 * mean the overall work is done. So we keep looping as long
+		 * as made some progress on cleaning pages or inodes.
 		 */
-		if (wbc.nr_to_write <= 0)
+		if (wbc.nr_to_write < write_chunk)
 			continue;
 		if (wbc.inodes_written)
 			continue;
 		/*
-		 * Didn't write everything and we don't have more IO, bail
+		 * No more inodes for IO, bail
 		 */
 		if (!wbc.more_io)
 			break;
-		/*
-		 * Did we write something? Try for more
-		 */
-		if (wbc.nr_to_write < write_chunk)
-			continue;
 		/*
 		 * Nothing written. Wait for some inode to
 		 * become available for writeback. Otherwise
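
The effect of reordering the checks in this hunk can be seen by writing both
orderings as pure functions. This is a sketch for illustration only: struct
wbc_view, enum wb_next and the two function names are hypothetical, and
WB_WAIT merely stands for the "Nothing written. Wait for some inode" path,
whose body lies outside the hunk shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of one pass through the checks. */
enum wb_next { WB_CONTINUE, WB_BREAK, WB_WAIT };

/* Hypothetical view of the three wbc fields the hunk reads. */
struct wbc_view {
	long nr_to_write;     /* pages of budget left after the round */
	long inodes_written;  /* non-zero if the round cleaned inodes */
	bool more_io;         /* true if inodes are still queued for IO */
};

/* Order of checks before the patch (the removed and context lines). */
enum wb_next next_step_before(const struct wbc_view *wbc, long write_chunk)
{
	assert(wbc->nr_to_write <= write_chunk);
	if (wbc->nr_to_write <= 0)           /* consumed the whole chunk */
		return WB_CONTINUE;
	if (wbc->inodes_written)
		return WB_CONTINUE;
	if (!wbc->more_io)                   /* bail before checking progress */
		return WB_BREAK;
	if (wbc->nr_to_write < write_chunk)  /* wrote something */
		return WB_CONTINUE;
	return WB_WAIT;
}

/* Order of checks after the patch (the added and context lines). */
enum wb_next next_step_after(const struct wbc_view *wbc, long write_chunk)
{
	assert(wbc->nr_to_write <= write_chunk);
	if (wbc->nr_to_write < write_chunk)  /* any pages written: try for more */
		return WB_CONTINUE;
	if (wbc->inodes_written)
		return WB_CONTINUE;
	if (!wbc->more_io)                   /* no more inodes for IO, bail */
		return WB_BREAK;
	return WB_WAIT;
}
```

Under this model the two orderings differ in one case only: some pages were
written (0 < nr_to_write < write_chunk), inodes_written is zero and more_io
is clear. The old order breaks out of the loop there, while the new order
continues with another round.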