Commit 4120db47198d21d8cd3b2cdbbe1ea6118a50bcd4
Committed by
Linus Torvalds
1 parent
168a9fd6a1
[PATCH] bugfix: two read_inode() calls without clear_inode() call between
Bug symptoms ~~~~~~~~~~~~ For the same inode VFS calls read_inode() twice and doesn't call clear_inode() between the two read_inode() invocations. Bug description ~~~~~~~~~~~~~~~ Suppose we have an inode which has zero reference count but is still in the inode cache. Suppose kswapd invokes shrink_icache_memory() to free some RAM. In prune_icache() inodes are removed from i_hash. prune_icache () is then going to call clear_inode(), but drops the inode_lock spinlock before this. If in this moment another task calls iget() for an inode which was just removed from i_hash by prune_icache(), then iget() invokes read_inode() for this inode, because it is *already removed* from i_hash. The end result is: we call iget(#N) then iput(#N); inode #N has zero i_count now and is in the inode cache; kswapd starts. kswapd removes the inode #N from i_hash ans is preempted; we call iget(#N) again; read_inode() is invoked as the result; but we expect clear_inode() before. Fix ~~~~~~~ To fix the bug I remove inodes from i_hash later, when clear_inode() is actually called. I remove them from i_hash under spinlock protection. Since the i_state is set to I_FREEING, it is safe to do this. The others will sleep waiting for the inode state change. I also postpone removing inodes from i_sb_list. It is not compulsory to do so but I do it for readability reasons. Inodes are added/removed to the lists together everywhere in the code and there is no point to change this rule. This is harmless because the only user of i_sb_list which somehow may interfere with me (invalidate_list()) is excluded by the iprune_sem mutex. The same race is possible in invalidate_list() so I do the same for it. Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Showing 1 changed file with 7 additions and 4 deletions Side-by-side Diff
fs/inode.c
... | ... | @@ -282,6 +282,13 @@ |
282 | 282 | if (inode->i_data.nrpages) |
283 | 283 | truncate_inode_pages(&inode->i_data, 0); |
284 | 284 | clear_inode(inode); |
285 | + | |
286 | + spin_lock(&inode_lock); | |
287 | + hlist_del_init(&inode->i_hash); | |
288 | + list_del_init(&inode->i_sb_list); | |
289 | + spin_unlock(&inode_lock); | |
290 | + | |
291 | + wake_up_inode(inode); | |
285 | 292 | destroy_inode(inode); |
286 | 293 | nr_disposed++; |
287 | 294 | } |
... | ... | @@ -317,8 +324,6 @@ |
317 | 324 | inode = list_entry(tmp, struct inode, i_sb_list); |
318 | 325 | invalidate_inode_buffers(inode); |
319 | 326 | if (!atomic_read(&inode->i_count)) { |
320 | - hlist_del_init(&inode->i_hash); | |
321 | - list_del(&inode->i_sb_list); | |
322 | 327 | list_move(&inode->i_list, dispose); |
323 | 328 | inode->i_state |= I_FREEING; |
324 | 329 | count++; |
... | ... | @@ -439,8 +444,6 @@ |
439 | 444 | if (!can_unuse(inode)) |
440 | 445 | continue; |
441 | 446 | } |
442 | - hlist_del_init(&inode->i_hash); | |
443 | - list_del_init(&inode->i_sb_list); | |
444 | 447 | list_move(&inode->i_list, &freeable); |
445 | 448 | inode->i_state |= I_FREEING; |
446 | 449 | nr_pruned++; |