Commit a85d9df1ea1d23682a0ed1e100e6965006595d06

Authored by KOSAKI Motohiro
Committed by Linus Torvalds
1 parent f893ab41e4

mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of spin_lock_irq()

During an aio stress test, we observed the following lockdep warning.  This
means AIO+numa_balancing can currently deadlock.

The problem is that aio_migratepage() disables interrupts, but
__set_page_dirty_nobuffers() unintentionally enables them again.

Generally, all helper functions should use spin_lock_irqsave() instead of
spin_lock_irq() because they cannot know whether their caller has already
disabled interrupts.

   other info that might help us debug this:
    Possible unsafe locking scenario:

          CPU0
          ----
     lock(&(&ctx->completion_lock)->rlock);
     <Interrupt>
       lock(&(&ctx->completion_lock)->rlock);

    *** DEADLOCK ***

      dump_stack+0x19/0x1b
      print_usage_bug+0x1f7/0x208
      mark_lock+0x21d/0x2a0
      mark_held_locks+0xb9/0x140
      trace_hardirqs_on_caller+0x105/0x1d0
      trace_hardirqs_on+0xd/0x10
      _raw_spin_unlock_irq+0x2c/0x50
      __set_page_dirty_nobuffers+0x8c/0xf0
      migrate_page_copy+0x434/0x540
      aio_migratepage+0xb1/0x140
      move_to_new_page+0x7d/0x230
      migrate_pages+0x5e5/0x700
      migrate_misplaced_page+0xbc/0xf0
      do_numa_page+0x102/0x190
      handle_pte_fault+0x241/0x970
      handle_mm_fault+0x265/0x370
      __do_page_fault+0x172/0x5a0
      do_page_fault+0x1a/0x70
      page_fault+0x28/0x30

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 1 changed file with 3 additions and 2 deletions

@@ -2173,11 +2173,12 @@
 	if (!TestSetPageDirty(page)) {
 		struct address_space *mapping = page_mapping(page);
 		struct address_space *mapping2;
+		unsigned long flags;
 
 		if (!mapping)
 			return 1;
 
-		spin_lock_irq(&mapping->tree_lock);
+		spin_lock_irqsave(&mapping->tree_lock, flags);
 		mapping2 = page_mapping(page);
 		if (mapping2) { /* Race with truncate? */
 			BUG_ON(mapping2 != mapping);
@@ -2186,7 +2187,7 @@
 			radix_tree_tag_set(&mapping->page_tree,
 				page_index(page), PAGECACHE_TAG_DIRTY);
 		}
-		spin_unlock_irq(&mapping->tree_lock);
+		spin_unlock_irqrestore(&mapping->tree_lock, flags);
 		if (mapping->host) {
 			/* !PageAnon && !swapper_space */
 			__mark_inode_dirty(mapping->host, I_DIRTY_PAGES);
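
The difference between the two locking styles can be illustrated with a small userspace model. This is only a sketch, not kernel code: the interrupt-enable state is modeled as a plain boolean, and every name below (irqs_enabled, model_irq_save(), model_irq_restore(), helper_irq(), helper_irqsave(), caller_with_irqs_off()) is hypothetical and exists only for this illustration. The caller plays the role of a function that has already disabled interrupts, and each helper plays the role of a function that takes a lock in one of the two styles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's interrupt-enable flag. */
static bool irqs_enabled = true;

/* Models saving the current interrupt state and then disabling interrupts. */
static unsigned long model_irq_save(void)
{
	unsigned long flags = irqs_enabled ? 1UL : 0UL;

	irqs_enabled = false;
	return flags;
}

/* Models restoring whatever interrupt state was saved earlier. */
static void model_irq_restore(unsigned long flags)
{
	irqs_enabled = (flags != 0);
}

/* Helper in the spin_lock_irq()/spin_unlock_irq() style. */
static void helper_irq(void)
{
	irqs_enabled = false;	/* lock: disable unconditionally */
	/* ... critical section ... */
	irqs_enabled = true;	/* unlock: enable unconditionally */
}

/* Helper in the spin_lock_irqsave()/spin_unlock_irqrestore() style. */
static void helper_irqsave(void)
{
	unsigned long flags = model_irq_save();

	/* ... critical section ... */
	model_irq_restore(flags);	/* put back the caller's state */
}

/*
 * A caller that disables interrupts before invoking the helper.
 * Returns the interrupt state it observes once the helper has returned;
 * true means the helper re-enabled interrupts behind the caller's back.
 */
static bool caller_with_irqs_off(void (*helper)(void))
{
	unsigned long flags = model_irq_save();
	bool seen;

	assert(!irqs_enabled);	/* the caller expects interrupts to stay off */
	helper();
	seen = irqs_enabled;
	model_irq_restore(flags);
	return seen;
}
```

In this model, caller_with_irqs_off(helper_irq) reports that interrupts were switched back on while the caller still expected them to be off, which corresponds to the window in which an interrupt handler could try to take a lock the caller already holds. caller_with_irqs_off(helper_irqsave) reports that the caller's state was preserved.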