reiserfs: fix race with flush_used_journal_lists and flush_journal_list

There are two locks involved in managing the journal lists. The general reiserfs_write_lock and the journal->j_flush_mutex. While flush_journal_list is sleeping to acquire the j_flush_mutex or to submit a block for write, it will drop the write lock. This allows another thread to acquire the write lock and ultimately call flush_used_journal_lists to traverse the list of journal lists and select one for flushing. It can select the journal_list that has just had flush_journal_list called on it in the original thread and call it again with the same journal_list. The second thread then drops the write lock to acquire j_flush_mutex and the first thread reacquires it and continues execution and eventually clears and frees the journal list before dropping j_flush_mutex and returning. The second thread acquires j_flush_mutex and ends up operating on a journal_list that has already been released. If the memory hasn't been reused, we'll soon after hit a BUG_ON because the transaction id has already been cleared. If it's been reused, we'll crash in other fun ways. Since flush_journal_list will synchronize on j_flush_mutex, we can fix the race by taking a proper reference in flush_used_journal_lists and checking to see if it's still valid after the mutex is taken. It's safe to iterate the list of journal lists and pick a list with just the write lock as long as a reference is taken on the journal list before we drop the lock. We already have code to handle whether a transaction has been flushed already so we can use that to handle the race and get rid of the trans_id BUG_ON. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>

reiserfs: fix race with flush_used_journal_lists and flush_journal_list
There are two locks involved in managing the journal lists. The general reiserfs_write_lock and the journal->j_flush_mutex. While flush_journal_list is sleeping to acquire the j_flush_mutex or to submit a block for write, it will drop the write lock. This allows another thread to acquire the write lock and ultimately call flush_used_journal_lists to traverse the list of journal lists and select one for flushing. It can select the journal_list that has just had flush_journal_list called on it in the original thread and call it again with the same journal_list. The second thread then drops the write lock to acquire j_flush_mutex and the first thread reacquires it and continues execution and eventually clears and frees the journal list before dropping j_flush_mutex and returning. The second thread acquires j_flush_mutex and ends up operating on a journal_list that has already been released. If the memory hasn't been reused, we'll soon after hit a BUG_ON because the transaction id has already been cleared. If it's been reused, we'll crash in other fun ways. Since flush_journal_list will synchronize on j_flush_mutex, we can fix the race by taking a proper reference in flush_used_journal_lists and checking to see if it's still valid after the mutex is taken. It's safe to iterate the list of journal lists and pick a list with just the write lock as long as a reference is taken on the journal list before we drop the lock. We already have code to handle whether a transaction has been flushed already so we can use that to handle the race and get rid of the trans_id BUG_ON. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: Jan Kara <jack@suse.cz>
Jeff Mahoney · Jan Kara
1 parent 7bc9cc07ee
Showing 1 changed file with 4 additions and 1 deletions Side-by-side Diff
fs/reiserfs/journal.c
@@ -1338,7 +1338,6 @@
 		reiserfs_warning(s, "clm-2048", "called with wcount %d",
 				 atomic_read(&journal->j_wcount));
 	}
-	BUG_ON(jl->j_trans_id == 0);
  
 	/* if flushall == 0, the lock is already held */
 	if (flushall) {
@@ -1765,6 +1764,8 @@
 			break;
 		tjl = JOURNAL_LIST_ENTRY(tjl->j_list.next);
 	}
+	get_journal_list(jl);
+	get_journal_list(flush_jl);
 	/* try to find a group of blocks we can flush across all the
 	 ** transactions, but only bother if we've actually spanned
 	 ** across multiple lists
@@ -1773,6 +1774,8 @@
 		ret = kupdate_transactions(s, jl, &tjl, &trans_id, len, i);
 	}
 	flush_journal_list(s, flush_jl, 1);
+	put_journal_list(s, flush_jl);
+	put_journal_list(s, jl);
 	return 0;
 }
...	...	@@ -1338,7 +1338,6 @@
1338	1338	reiserfs_warning(s, "clm-2048", "called with wcount %d",
1339	1339	atomic_read(&journal->j_wcount));
1340	1340	}
1341		- BUG_ON(jl->j_trans_id == 0);
1342	1341
1343	1342	/* if flushall == 0, the lock is already held */
1344	1343	if (flushall) {
...	...	@@ -1765,6 +1764,8 @@
1765	1764	break;
1766	1765	tjl = JOURNAL_LIST_ENTRY(tjl->j_list.next);
1767	1766	}
	1767	+ get_journal_list(jl);
	1768	+ get_journal_list(flush_jl);
1768	1769	/* try to find a group of blocks we can flush across all the
1769	1770	** transactions, but only bother if we've actually spanned
1770	1771	** across multiple lists
...	...	@@ -1773,6 +1774,8 @@
1773	1774	ret = kupdate_transactions(s, jl, &tjl, &trans_id, len, i);
1774	1775	}
1775	1776	flush_journal_list(s, flush_jl, 1);
	1777	+ put_journal_list(s, flush_jl);
	1778	+ put_journal_list(s, jl);
1776	1779	return 0;
1777	1780	}
1778	1781