Commit 788a2f69f9b8b77b30ace8d1ef9380fa4ea5c6ec
Committed by: Jiri Slaby
Parent: 84a6c7694a
Exists in: ti-linux-3.12.y and 2 other branches
vmalloc: use rcu list iterator to reduce vmap_area_lock contention
commit 474750aba88817c53f39424e5567b8e4acc4b39b upstream.

Richard Yao reported a month ago that his system had trouble with
vmap_area_lock contention during performance analysis by /proc/meminfo.
Andrew asked why his analysis checks /proc/meminfo stressfully, but he
didn't answer it.

https://lkml.org/lkml/2014/4/10/416

Although I'm not sure whether this is the right usage or not, there is a
solution that reduces vmap_area_lock contention with no side effect.
That is just to use the rcu list iterator in get_vmalloc_info().

rcu can be used in this function because all RCU protocol is already
respected by writers, since Nick Piggin's commit db64fe02258f1 ("mm:
rewrite vmap layer") back in linux-2.6.28.

Specifically: insertions use list_add_rcu(), deletions use
list_del_rcu() and kfree_rcu().

Note the rb tree is not used from the rcu reader (it would not be
safe); only the vmap_area_list has full RCU protection.

Note that __purge_vmap_area_lazy() already uses this rcu protection:

	rcu_read_lock();
	list_for_each_entry_rcu(va, &vmap_area_list, list) {
		if (va->flags & VM_LAZY_FREE) {
			if (va->va_start < *start)
				*start = va->va_start;
			if (va->va_end > *end)
				*end = va->va_end;
			nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
			list_add_tail(&va->purge_list, &valist);
			va->flags |= VM_LAZY_FREEING;
			va->flags &= ~VM_LAZY_FREE;
		}
	}
	rcu_read_unlock();

Peter:

: While rcu list traversal over the vmap_area_list is safe, this may
: arrive at different results than the spinlocked version. The rcu list
: traversal version will not be a 'snapshot' of a single, valid instant
: of the entire vmap_area_list, but rather a potential amalgam of
: different list states.

Joonsoo:

: Yes, you are right, but I don't think that we should be strict here.
: Meminfo is already not a 'snapshot' at a specific time. While we try
: to get certain stats, the other stats can change.
: And, although we may arrive at different results than the spinlocked
: version, the difference would not be large and would not make a
: serious side effect.

[edumazet@google.com: add more commit description]
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Reported-by: Richard Yao <ryao@gentoo.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Zhang Yanfei <zhangyanfei.yes@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Showing 1 changed file with 3 additions and 3 deletions.
mm/vmalloc.c
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2685,14 +2685,14 @@
 
 	prev_end = VMALLOC_START;
 
-	spin_lock(&vmap_area_lock);
+	rcu_read_lock();
 
 	if (list_empty(&vmap_area_list)) {
 		vmi->largest_chunk = VMALLOC_TOTAL;
 		goto out;
 	}
 
-	list_for_each_entry(va, &vmap_area_list, list) {
+	list_for_each_entry_rcu(va, &vmap_area_list, list) {
 		unsigned long addr = va->va_start;
 
 		/*
@@ -2719,7 +2719,7 @@
 	vmi->largest_chunk = VMALLOC_END - prev_end;
 
 out:
-	spin_unlock(&vmap_area_lock);
+	rcu_read_unlock();
 }
 #endif