Commit cc9a6c8776615f9c194ccf0b63a0aa5628235545

Authored by Mel Gorman
Committed by Linus Torvalds
1 parent e845e19936

cpuset: mm: reduce large amounts of memory barrier related damage v3

Commit c0ff7453bb5c ("cpuset,mm: fix no node to alloc memory when
changing cpuset's mems") wins a super prize for the largest number of
memory barriers entered into fast paths for one commit.

[get|put]_mems_allowed is incredibly heavy with pairs of full memory
barriers inserted into a number of hot paths.  This was detected while
investigating a large page allocator slowdown introduced some time
after 2.6.32.  The largest portion of this overhead was shown by
oprofile to be at an mfence introduced by this commit into the page
allocator hot path.

For extra style points, the commit introduced the use of yield() in an
implementation of what looks like a spinning mutex.

This patch replaces the full memory barriers on both the read and write
sides with a sequence counter, leaving only read barriers on the fast
path.  This is much cheaper on some architectures, including x86.  The
bulk of the patch is the retry logic needed when the nodemask changes in
a manner that could cause a false failure.
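
The read side now samples a cookie from the per-task seqcount, performs
the allocation and retries only if the allocation failed while the
nodemask was changing.  A condensed form of the new pattern, taken from
the mm/filemap.c hunk below (the wrapper name alloc_spread_page is
illustrative):

    static struct page *alloc_spread_page(gfp_t gfp)
    {
        unsigned int cpuset_mems_cookie;
        struct page *page;
        int n;

        do {
            /* Sample the mems_allowed sequence counter */
            cpuset_mems_cookie = get_mems_allowed();
            n = cpuset_mem_spread_node();
            page = alloc_pages_exact_node(n, gfp, 0);
            /*
             * Retry only when the allocation failed and the
             * nodemask was updated in parallel.
             */
        } while (!put_mems_allowed(cpuset_mems_cookie) && !page);

        return page;
    }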

While updating the nodemask, a check is made to see if a false failure
is a risk.  If it is, the sequence number gets bumped and parallel
allocators will briefly stall while the nodemask update takes place.
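
On the write side, a condensed sketch of the kernel/cpuset.c change
below (the function name and the task_lock placement are reconstructed
from context):

    static void change_task_nodemask(struct task_struct *tsk,
                                     nodemask_t *newmems)
    {
        bool need_loop;

        task_lock(tsk);
        /*
         * A false failure is only a risk if the new nodemask shares no
         * nodes with the old one or a mempolicy must be rebound; only
         * then is the sequence count bumped.
         */
        need_loop = task_has_mempolicy(tsk) ||
                !nodes_intersects(*newmems, tsk->mems_allowed);

        if (need_loop)
            write_seqcount_begin(&tsk->mems_allowed_seq);

        nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
        mpol_rebind_task(tsk, newmems, MPOL_REBIND_STEP1);
        mpol_rebind_task(tsk, newmems, MPOL_REBIND_STEP2);
        tsk->mems_allowed = *newmems;

        if (need_loop)
            write_seqcount_end(&tsk->mems_allowed_seq);
        task_unlock(tsk);
    }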

In a page fault test microbenchmark, oprofile samples from
__alloc_pages_nodemask went from 4.53% of all samples to 1.15%.  The
actual results were

                             3.3.0-rc3          3.3.0-rc3
                             rc3-vanilla        nobarrier-v2r1
    Clients   1 UserTime       0.07 (  0.00%)   0.08 (-14.19%)
    Clients   2 UserTime       0.07 (  0.00%)   0.07 (  2.72%)
    Clients   4 UserTime       0.08 (  0.00%)   0.07 (  3.29%)
    Clients   1 SysTime        0.70 (  0.00%)   0.65 (  6.65%)
    Clients   2 SysTime        0.85 (  0.00%)   0.82 (  3.65%)
    Clients   4 SysTime        1.41 (  0.00%)   1.41 (  0.32%)
    Clients   1 WallTime       0.77 (  0.00%)   0.74 (  4.19%)
    Clients   2 WallTime       0.47 (  0.00%)   0.45 (  3.73%)
    Clients   4 WallTime       0.38 (  0.00%)   0.37 (  1.58%)
    Clients   1 Flt/sec/cpu  497620.28 (  0.00%) 520294.53 (  4.56%)
    Clients   2 Flt/sec/cpu  414639.05 (  0.00%) 429882.01 (  3.68%)
    Clients   4 Flt/sec/cpu  257959.16 (  0.00%) 258761.48 (  0.31%)
    Clients   1 Flt/sec      495161.39 (  0.00%) 517292.87 (  4.47%)
    Clients   2 Flt/sec      820325.95 (  0.00%) 850289.77 (  3.65%)
    Clients   4 Flt/sec      1020068.93 (  0.00%) 1022674.06 (  0.26%)
    MMTests Statistics: duration
    Sys Time Running Test (seconds)             135.68    132.17
    User+Sys Time Running Test (seconds)         164.2    160.13
    Total Elapsed Time (seconds)                123.46    120.87

The overall improvement is small, but the system CPU time is much
improved and roughly in line with what oprofile reported (these
performance figures were taken without profiling, so some skew is
expected).  The actual number of page faults is noticeably improved.

For benchmarks like kernel builds, the overall benefit is marginal but
the system CPU time is slightly reduced.

To test the actual bug the commit fixed I opened two terminals.  The
first ran within a cpuset and continually ran a small program that
faulted 100M of anonymous data.  In a second window, the nodemask of the
cpuset was continually randomised in a loop.

Without the commit, the program would fail every so often (usually
within 10 seconds); with the commit, everything worked fine.  With this
patch applied it also worked fine, so the fix should be functionally
equivalent.
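
The faulting program might have looked something like this minimal
sketch (the 100M size matches the description above; everything else is
illustrative):

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    #define SIZE (100UL << 20)  /* 100M of anonymous data */

    int main(void)
    {
        for (;;) {
            char *buf = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

            if (buf == MAP_FAILED) {
                perror("mmap");
                return 1;
            }
            /* Touch every page to force the faults */
            memset(buf, 1, SIZE);
            munmap(buf, SIZE);
        }
        return 0;
    }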

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 12 changed files with 133 additions and 110 deletions

include/linux/cpuset.h
... ... @@ -89,42 +89,33 @@
89 89 extern void cpuset_print_task_mems_allowed(struct task_struct *p);
90 90  
91 91 /*
92   - * reading current mems_allowed and mempolicy in the fastpath must protected
93   - * by get_mems_allowed()
  92 + * get_mems_allowed is required when making decisions involving mems_allowed
  93 + * such as during page allocation. mems_allowed can be updated in parallel
  94 + * and depending on the new value an operation can fail potentially causing
  95 + * process failure. A retry loop with get_mems_allowed and put_mems_allowed
  96 + * prevents these artificial failures.
94 97 */
95   -static inline void get_mems_allowed(void)
  98 +static inline unsigned int get_mems_allowed(void)
96 99 {
97   - current->mems_allowed_change_disable++;
98   -
99   - /*
100   - * ensure that reading mems_allowed and mempolicy happens after the
101   - * update of ->mems_allowed_change_disable.
102   - *
103   - * the write-side task finds ->mems_allowed_change_disable is not 0,
104   - * and knows the read-side task is reading mems_allowed or mempolicy,
105   - * so it will clear old bits lazily.
106   - */
107   - smp_mb();
  100 + return read_seqcount_begin(&current->mems_allowed_seq);
108 101 }
109 102  
110   -static inline void put_mems_allowed(void)
  103 +/*
  104 + * If this returns false, the operation that took place after get_mems_allowed
  105 + * may have failed. It is up to the caller to retry the operation if
  106 + * appropriate.
  107 + */
  108 +static inline bool put_mems_allowed(unsigned int seq)
111 109 {
112   - /*
113   - * ensure that reading mems_allowed and mempolicy before reducing
114   - * mems_allowed_change_disable.
115   - *
116   - * the write-side task will know that the read-side task is still
117   - * reading mems_allowed or mempolicy, don't clears old bits in the
118   - * nodemask.
119   - */
120   - smp_mb();
121   - --ACCESS_ONCE(current->mems_allowed_change_disable);
  110 + return !read_seqcount_retry(&current->mems_allowed_seq, seq);
122 111 }
123 112  
124 113 static inline void set_mems_allowed(nodemask_t nodemask)
125 114 {
126 115 task_lock(current);
  116 + write_seqcount_begin(&current->mems_allowed_seq);
127 117 current->mems_allowed = nodemask;
  118 + write_seqcount_end(&current->mems_allowed_seq);
128 119 task_unlock(current);
129 120 }
130 121  
... ... @@ -234,12 +225,14 @@
234 225 {
235 226 }
236 227  
237   -static inline void get_mems_allowed(void)
  228 +static inline unsigned int get_mems_allowed(void)
238 229 {
  230 + return 0;
239 231 }
240 232  
241   -static inline void put_mems_allowed(void)
  233 +static inline bool put_mems_allowed(unsigned int seq)
242 234 {
  235 + return true;
243 236 }
244 237  
245 238 #endif /* !CONFIG_CPUSETS */
include/linux/init_task.h
... ... @@ -29,6 +29,13 @@
29 29 #define INIT_GROUP_RWSEM(sig)
30 30 #endif
31 31  
  32 +#ifdef CONFIG_CPUSETS
  33 +#define INIT_CPUSET_SEQ \
  34 + .mems_allowed_seq = SEQCNT_ZERO,
  35 +#else
  36 +#define INIT_CPUSET_SEQ
  37 +#endif
  38 +
32 39 #define INIT_SIGNALS(sig) { \
33 40 .nr_threads = 1, \
34 41 .wait_chldexit = __WAIT_QUEUE_HEAD_INITIALIZER(sig.wait_chldexit),\
... ... @@ -192,6 +199,7 @@
192 199 INIT_FTRACE_GRAPH \
193 200 INIT_TRACE_RECURSION \
194 201 INIT_TASK_RCU_PREEMPT(tsk) \
  202 + INIT_CPUSET_SEQ \
195 203 }
include/linux/sched.h
... ... @@ -1514,7 +1514,7 @@
1514 1514 #endif
1515 1515 #ifdef CONFIG_CPUSETS
1516 1516 nodemask_t mems_allowed; /* Protected by alloc_lock */
1517   - int mems_allowed_change_disable;
  1517 + seqcount_t mems_allowed_seq; /* Sequence no to catch updates */
1518 1518 int cpuset_mem_spread_rotor;
1519 1519 int cpuset_slab_spread_rotor;
1520 1520 #endif
kernel/cpuset.c
... ... @@ -964,7 +964,6 @@
964 964 {
965 965 bool need_loop;
966 966  
967   -repeat:
968 967 /*
969 968 * Allow tasks that have access to memory reserves because they have
970 969 * been OOM killed to get memory anywhere.
... ... @@ -983,45 +982,19 @@
983 982 */
984 983 need_loop = task_has_mempolicy(tsk) ||
985 984 !nodes_intersects(*newmems, tsk->mems_allowed);
  985 +
  986 + if (need_loop)
  987 + write_seqcount_begin(&tsk->mems_allowed_seq);
  988 +
986 989 nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
987 990 mpol_rebind_task(tsk, newmems, MPOL_REBIND_STEP1);
988 991  
989   - /*
990   - * ensure checking ->mems_allowed_change_disable after setting all new
991   - * allowed nodes.
992   - *
993   - * the read-side task can see an nodemask with new allowed nodes and
994   - * old allowed nodes. and if it allocates page when cpuset clears newly
995   - * disallowed ones continuous, it can see the new allowed bits.
996   - *
997   - * And if setting all new allowed nodes is after the checking, setting
998   - * all new allowed nodes and clearing newly disallowed ones will be done
999   - * continuous, and the read-side task may find no node to alloc page.
1000   - */
1001   - smp_mb();
1002   -
1003   - /*
1004   - * Allocation of memory is very fast, we needn't sleep when waiting
1005   - * for the read-side.
1006   - */
1007   - while (need_loop && ACCESS_ONCE(tsk->mems_allowed_change_disable)) {
1008   - task_unlock(tsk);
1009   - if (!task_curr(tsk))
1010   - yield();
1011   - goto repeat;
1012   - }
1013   -
1014   - /*
1015   - * ensure checking ->mems_allowed_change_disable before clearing all new
1016   - * disallowed nodes.
1017   - *
1018   - * if clearing newly disallowed bits before the checking, the read-side
1019   - * task may find no node to alloc page.
1020   - */
1021   - smp_mb();
1022   -
1023 992 mpol_rebind_task(tsk, newmems, MPOL_REBIND_STEP2);
1024 993 tsk->mems_allowed = *newmems;
  994 +
  995 + if (need_loop)
  996 + write_seqcount_end(&tsk->mems_allowed_seq);
  997 +
1025 998 task_unlock(tsk);
1026 999 }
1027 1000  
kernel/fork.c
... ... @@ -1237,6 +1237,7 @@
1237 1237 #ifdef CONFIG_CPUSETS
1238 1238 p->cpuset_mem_spread_rotor = NUMA_NO_NODE;
1239 1239 p->cpuset_slab_spread_rotor = NUMA_NO_NODE;
  1240 + seqcount_init(&p->mems_allowed_seq);
1240 1241 #endif
1241 1242 #ifdef CONFIG_TRACE_IRQFLAGS
1242 1243 p->irq_events = 0;
mm/filemap.c
... ... @@ -499,10 +499,13 @@
499 499 struct page *page;
500 500  
501 501 if (cpuset_do_page_mem_spread()) {
502   - get_mems_allowed();
503   - n = cpuset_mem_spread_node();
504   - page = alloc_pages_exact_node(n, gfp, 0);
505   - put_mems_allowed();
  502 + unsigned int cpuset_mems_cookie;
  503 + do {
  504 + cpuset_mems_cookie = get_mems_allowed();
  505 + n = cpuset_mem_spread_node();
  506 + page = alloc_pages_exact_node(n, gfp, 0);
  507 + } while (!put_mems_allowed(cpuset_mems_cookie) && !page);
  508 +
506 509 return page;
507 510 }
508 511 return alloc_pages(gfp, 0);
mm/hugetlb.c
... ... @@ -454,14 +454,16 @@
454 454 struct vm_area_struct *vma,
455 455 unsigned long address, int avoid_reserve)
456 456 {
457   - struct page *page = NULL;
  457 + struct page *page;
458 458 struct mempolicy *mpol;
459 459 nodemask_t *nodemask;
460 460 struct zonelist *zonelist;
461 461 struct zone *zone;
462 462 struct zoneref *z;
  463 + unsigned int cpuset_mems_cookie;
463 464  
464   - get_mems_allowed();
  465 +retry_cpuset:
  466 + cpuset_mems_cookie = get_mems_allowed();
465 467 zonelist = huge_zonelist(vma, address,
466 468 htlb_alloc_mask, &mpol, &nodemask);
467 469 /*
... ... @@ -488,10 +490,15 @@
488 490 }
489 491 }
490 492 }
491   -err:
  493 +
492 494 mpol_cond_put(mpol);
493   - put_mems_allowed();
  495 + if (unlikely(!put_mems_allowed(cpuset_mems_cookie) && !page))
  496 + goto retry_cpuset;
494 497 return page;
  498 +
  499 +err:
  500 + mpol_cond_put(mpol);
  501 + return NULL;
495 502 }
496 503  
497 504 static void update_and_free_page(struct hstate *h, struct page *page)
mm/mempolicy.c
... ... @@ -1850,18 +1850,24 @@
1850 1850 alloc_pages_vma(gfp_t gfp, int order, struct vm_area_struct *vma,
1851 1851 unsigned long addr, int node)
1852 1852 {
1853   - struct mempolicy *pol = get_vma_policy(current, vma, addr);
  1853 + struct mempolicy *pol;
1854 1854 struct zonelist *zl;
1855 1855 struct page *page;
  1856 + unsigned int cpuset_mems_cookie;
1856 1857  
1857   - get_mems_allowed();
  1858 +retry_cpuset:
  1859 + pol = get_vma_policy(current, vma, addr);
  1860 + cpuset_mems_cookie = get_mems_allowed();
  1861 +
1858 1862 if (unlikely(pol->mode == MPOL_INTERLEAVE)) {
1859 1863 unsigned nid;
1860 1864  
1861 1865 nid = interleave_nid(pol, vma, addr, PAGE_SHIFT + order);
1862 1866 mpol_cond_put(pol);
1863 1867 page = alloc_page_interleave(gfp, order, nid);
1864   - put_mems_allowed();
  1868 + if (unlikely(!put_mems_allowed(cpuset_mems_cookie) && !page))
  1869 + goto retry_cpuset;
  1870 +
1865 1871 return page;
1866 1872 }
1867 1873 zl = policy_zonelist(gfp, pol, node);
... ... @@ -1872,7 +1878,8 @@
1872 1878 struct page *page = __alloc_pages_nodemask(gfp, order,
1873 1879 zl, policy_nodemask(gfp, pol));
1874 1880 __mpol_put(pol);
1875   - put_mems_allowed();
  1881 + if (unlikely(!put_mems_allowed(cpuset_mems_cookie) && !page))
  1882 + goto retry_cpuset;
1876 1883 return page;
1877 1884 }
1878 1885 /*
... ... @@ -1880,7 +1887,8 @@
1880 1887 */
1881 1888 page = __alloc_pages_nodemask(gfp, order, zl,
1882 1889 policy_nodemask(gfp, pol));
1883   - put_mems_allowed();
  1890 + if (unlikely(!put_mems_allowed(cpuset_mems_cookie) && !page))
  1891 + goto retry_cpuset;
1884 1892 return page;
1885 1893 }
... ... @@ -1907,11 +1915,14 @@
1907 1915 {
1908 1916 struct mempolicy *pol = current->mempolicy;
1909 1917 struct page *page;
  1918 + unsigned int cpuset_mems_cookie;
1910 1919  
1911 1920 if (!pol || in_interrupt() || (gfp & __GFP_THISNODE))
1912 1921 pol = &default_policy;
1913 1922  
1914   - get_mems_allowed();
  1923 +retry_cpuset:
  1924 + cpuset_mems_cookie = get_mems_allowed();
  1925 +
1915 1926 /*
1916 1927 * No reference counting needed for current->mempolicy
1917 1928 * nor system default_policy
... ... @@ -1922,7 +1933,10 @@
1922 1933 page = __alloc_pages_nodemask(gfp, order,
1923 1934 policy_zonelist(gfp, pol, numa_node_id()),
1924 1935 policy_nodemask(gfp, pol));
1925   - put_mems_allowed();
  1936 +
  1937 + if (unlikely(!put_mems_allowed(cpuset_mems_cookie) && !page))
  1938 + goto retry_cpuset;
  1939 +
1926 1940 return page;
1927 1941 }
1928 1942 EXPORT_SYMBOL(alloc_pages_current);
mm/page_alloc.c
... ... @@ -2380,8 +2380,9 @@
2380 2380 {
2381 2381 enum zone_type high_zoneidx = gfp_zone(gfp_mask);
2382 2382 struct zone *preferred_zone;
2383   - struct page *page;
  2383 + struct page *page = NULL;
2384 2384 int migratetype = allocflags_to_migratetype(gfp_mask);
  2385 + unsigned int cpuset_mems_cookie;
2385 2386  
2386 2387 gfp_mask &= gfp_allowed_mask;
... ... @@ -2400,15 +2401,15 @@
2400 2401 if (unlikely(!zonelist->_zonerefs->zone))
2401 2402 return NULL;
2402 2403  
2403   - get_mems_allowed();
  2404 +retry_cpuset:
  2405 + cpuset_mems_cookie = get_mems_allowed();
  2406 +
2404 2407 /* The preferred zone is used for statistics later */
2405 2408 first_zones_zonelist(zonelist, high_zoneidx,
2406 2409 nodemask ? : &cpuset_current_mems_allowed,
2407 2410 &preferred_zone);
2408   - if (!preferred_zone) {
2409   - put_mems_allowed();
2410   - return NULL;
2411   - }
  2411 + if (!preferred_zone)
  2412 + goto out;
2412 2413  
2413 2414 /* First allocation attempt */
2414 2415 page = get_page_from_freelist(gfp_mask|__GFP_HARDWALL, nodemask, order,
... ... @@ -2418,9 +2419,19 @@
2418 2419 page = __alloc_pages_slowpath(gfp_mask, order,
2419 2420 zonelist, high_zoneidx, nodemask,
2420 2421 preferred_zone, migratetype);
2421   - put_mems_allowed();
2422 2422  
2423 2423 trace_mm_page_alloc(page, order, gfp_mask, migratetype);
  2424 +
  2425 +out:
  2426 + /*
  2427 + * When updating a task's mems_allowed, it is possible to race with
  2428 + * parallel threads in such a way that an allocation can fail while
  2429 + * the mask is being updated. If a page allocation is about to fail,
  2430 + * check if the cpuset changed during allocation and if so, retry.
  2431 + */
  2432 + if (unlikely(!put_mems_allowed(cpuset_mems_cookie) && !page))
  2433 + goto retry_cpuset;
  2434 +
2424 2435 return page;
2425 2436 }
2426 2437 EXPORT_SYMBOL(__alloc_pages_nodemask);
... ... @@ -2634,13 +2645,15 @@
2634 2645 bool skip_free_areas_node(unsigned int flags, int nid)
2635 2646 {
2636 2647 bool ret = false;
  2648 + unsigned int cpuset_mems_cookie;
2637 2649  
2638 2650 if (!(flags & SHOW_MEM_FILTER_NODES))
2639 2651 goto out;
2640 2652  
2641   - get_mems_allowed();
2642   - ret = !node_isset(nid, cpuset_current_mems_allowed);
2643   - put_mems_allowed();
  2653 + do {
  2654 + cpuset_mems_cookie = get_mems_allowed();
  2655 + ret = !node_isset(nid, cpuset_current_mems_allowed);
  2656 + } while (!put_mems_allowed(cpuset_mems_cookie));
2644 2657 out:
2645 2658 return ret;
2646 2659 }
mm/slab.c
... ... @@ -3284,12 +3284,10 @@
3284 3284 if (in_interrupt() || (flags & __GFP_THISNODE))
3285 3285 return NULL;
3286 3286 nid_alloc = nid_here = numa_mem_id();
3287   - get_mems_allowed();
3288 3287 if (cpuset_do_slab_mem_spread() && (cachep->flags & SLAB_MEM_SPREAD))
3289 3288 nid_alloc = cpuset_slab_spread_node();
3290 3289 else if (current->mempolicy)
3291 3290 nid_alloc = slab_node(current->mempolicy);
3292   - put_mems_allowed();
3293 3291 if (nid_alloc != nid_here)
3294 3292 return ____cache_alloc_node(cachep, flags, nid_alloc);
3295 3293 return NULL;
... ... @@ -3312,14 +3310,17 @@
3312 3310 enum zone_type high_zoneidx = gfp_zone(flags);
3313 3311 void *obj = NULL;
3314 3312 int nid;
  3313 + unsigned int cpuset_mems_cookie;
3315 3314  
3316 3315 if (flags & __GFP_THISNODE)
3317 3316 return NULL;
3318 3317  
3319   - get_mems_allowed();
3320   - zonelist = node_zonelist(slab_node(current->mempolicy), flags);
3321 3318 local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
3322 3319  
  3320 +retry_cpuset:
  3321 + cpuset_mems_cookie = get_mems_allowed();
  3322 + zonelist = node_zonelist(slab_node(current->mempolicy), flags);
  3323 +
3323 3324 retry:
3324 3325 /*
3325 3326 * Look through allowed nodes for objects available
... ... @@ -3372,7 +3373,9 @@
3372 3373 }
3373 3374 }
3374 3375 }
3375   - put_mems_allowed();
  3376 +
  3377 + if (unlikely(!put_mems_allowed(cpuset_mems_cookie) && !obj))
  3378 + goto retry_cpuset;
3376 3379 return obj;
3377 3380 }
3378 3381  
mm/slub.c
... ... @@ -1581,6 +1581,7 @@
1581 1581 struct zone *zone;
1582 1582 enum zone_type high_zoneidx = gfp_zone(flags);
1583 1583 void *object;
  1584 + unsigned int cpuset_mems_cookie;
1584 1585  
1585 1586 /*
1586 1587 * The defrag ratio allows a configuration of the tradeoffs between
... ... @@ -1604,23 +1605,32 @@
1604 1605 get_cycles() % 1024 > s->remote_node_defrag_ratio)
1605 1606 return NULL;
1606 1607  
1607   - get_mems_allowed();
1608   - zonelist = node_zonelist(slab_node(current->mempolicy), flags);
1609   - for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
1610   - struct kmem_cache_node *n;
  1608 + do {
  1609 + cpuset_mems_cookie = get_mems_allowed();
  1610 + zonelist = node_zonelist(slab_node(current->mempolicy), flags);
  1611 + for_each_zone_zonelist(zone, z, zonelist, high_zoneidx) {
  1612 + struct kmem_cache_node *n;
1611 1613  
1612   - n = get_node(s, zone_to_nid(zone));
  1614 + n = get_node(s, zone_to_nid(zone));
1613 1615  
1614   - if (n && cpuset_zone_allowed_hardwall(zone, flags) &&
1615   - n->nr_partial > s->min_partial) {
1616   - object = get_partial_node(s, n, c);
1617   - if (object) {
1618   - put_mems_allowed();
1619   - return object;
  1616 + if (n && cpuset_zone_allowed_hardwall(zone, flags) &&
  1617 + n->nr_partial > s->min_partial) {
  1618 + object = get_partial_node(s, n, c);
  1619 + if (object) {
  1620 + /*
  1621 + * Return the object even if
  1622 + * put_mems_allowed indicated that
  1623 + * the cpuset mems_allowed was
  1624 + * updated in parallel. It's a
  1625 + * harmless race between the alloc
  1626 + * and the cpuset update.
  1627 + */
  1628 + put_mems_allowed(cpuset_mems_cookie);
  1629 + return object;
  1630 + }
1620 1631 }
1621 1632 }
1622   - }
1623   - put_mems_allowed();
  1633 + } while (!put_mems_allowed(cpuset_mems_cookie));
1624 1634 #endif
1625 1635 return NULL;
1626 1636 }
mm/vmscan.c
... ... @@ -2343,7 +2343,6 @@
2343 2343 unsigned long writeback_threshold;
2344 2344 bool aborted_reclaim;
2345 2345  
2346   - get_mems_allowed();
2347 2346 delayacct_freepages_start();
2348 2347  
2349 2348 if (global_reclaim(sc))
... ... @@ -2407,7 +2406,6 @@
2407 2406  
2408 2407 out:
2409 2408 delayacct_freepages_end();
2410   - put_mems_allowed();
2411 2409  
2412 2410 if (sc->nr_reclaimed)
2413 2411 return sc->nr_reclaimed;