Commit 7cb0240492caea2f6467f827313478f41877e6ef

Authored by Mel Gorman
Committed by Linus Torvalds
1 parent 99a1dec70d

netvm: allow the use of __GFP_MEMALLOC by specific sockets

Allow specific sockets to be tagged SOCK_MEMALLOC and use __GFP_MEMALLOC
for their allocations.  These sockets will be able to go below watermarks
and allocate from the emergency reserve.  Such sockets are to be used to
service the VM (i.e. to swap over).  They must be handled kernel side;
exposing such a socket to user-space is a bug.

There is a risk that the reserves will be depleted, so for now the
administrator is responsible for increasing min_free_kbytes as necessary
to prevent deadlock for their workloads.

[a.p.zijlstra@chello.nl: Original patches]
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 2 changed files with 26 additions and 1 deletion

@@ -621,6 +621,7 @@
 	SOCK_RCVTSTAMPNS, /* %SO_TIMESTAMPNS setting */
 	SOCK_LOCALROUTE, /* route locally only, %SO_DONTROUTE setting */
 	SOCK_QUEUE_SHRUNK, /* write queue has been shrunk recently */
+	SOCK_MEMALLOC, /* VM depends on this socket for swapping */
 	SOCK_TIMESTAMPING_TX_HARDWARE, /* %SOF_TIMESTAMPING_TX_HARDWARE */
 	SOCK_TIMESTAMPING_TX_SOFTWARE, /* %SOF_TIMESTAMPING_TX_SOFTWARE */
 	SOCK_TIMESTAMPING_RX_HARDWARE, /* %SOF_TIMESTAMPING_RX_HARDWARE */
@@ -660,7 +661,7 @@
 
 static inline gfp_t sk_gfp_atomic(struct sock *sk, gfp_t gfp_mask)
 {
-	return GFP_ATOMIC;
+	return GFP_ATOMIC | (sk->sk_allocation & __GFP_MEMALLOC);
 }
 
 static inline void sk_acceptq_removed(struct sock *sk)
@@ -803,6 +804,8 @@
 extern void sk_stream_wait_close(struct sock *sk, long timeo_p);
 extern int sk_stream_error(struct sock *sk, int flags, int err);
 extern void sk_stream_kill_queues(struct sock *sk);
+extern void sk_set_memalloc(struct sock *sk);
+extern void sk_clear_memalloc(struct sock *sk);
 
 extern int sk_wait_data(struct sock *sk, long *timeo);
 
@@ -271,6 +271,28 @@
 int sysctl_optmem_max __read_mostly = sizeof(unsigned long)*(2*UIO_MAXIOV+512);
 EXPORT_SYMBOL(sysctl_optmem_max);
 
+/**
+ * sk_set_memalloc - sets %SOCK_MEMALLOC
+ * @sk: socket to set it on
+ *
+ * Set %SOCK_MEMALLOC on a socket for access to emergency reserves.
+ * It's the responsibility of the admin to adjust min_free_kbytes
+ * to meet the requirements
+ */
+void sk_set_memalloc(struct sock *sk)
+{
+	sock_set_flag(sk, SOCK_MEMALLOC);
+	sk->sk_allocation |= __GFP_MEMALLOC;
+}
+EXPORT_SYMBOL_GPL(sk_set_memalloc);
+
+void sk_clear_memalloc(struct sock *sk)
+{
+	sock_reset_flag(sk, SOCK_MEMALLOC);
+	sk->sk_allocation &= ~__GFP_MEMALLOC;
+}
+EXPORT_SYMBOL_GPL(sk_clear_memalloc);
+
 #if defined(CONFIG_CGROUPS)
 #if !defined(CONFIG_NET_CLS_CGROUP)
 int net_cls_subsys_id = -1;
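
The flag handling in this patch can be modelled in user space to see how the pieces interact. The sketch below is an illustration only, not kernel code: every model_ and MODEL_ name, the struct layout, and the numeric bit values are assumptions made for this example (the real gfp bits are defined in the kernel headers and differ between versions). It mirrors the logic of the diff: setting the flag also ORs the memalloc bit into sk_allocation, and the atomic mask helper passes that bit through only while it is set.

```c
#include <stdbool.h>

/* Placeholder types and bit values for this model only; they are NOT
 * the kernel's definitions. */
typedef unsigned int gfp_t;
#define MODEL_GFP_ATOMIC   0x0020u
#define MODEL_GFP_KERNEL   0x00d0u
#define MODEL_GFP_MEMALLOC 0x2000u

enum model_sock_flags {
	MODEL_SOCK_MEMALLOC = 0,	/* stands in for SOCK_MEMALLOC */
};

/* Minimal stand-in for struct sock: a flag word and an allocation mask. */
struct model_sock {
	unsigned long sk_flags;
	gfp_t sk_allocation;
};

static bool model_sock_flag(const struct model_sock *sk,
			    enum model_sock_flags flag)
{
	return (sk->sk_flags >> flag) & 1UL;
}

/* Mirrors sk_set_memalloc(): tag the socket and widen its mask. */
static void model_sk_set_memalloc(struct model_sock *sk)
{
	sk->sk_flags |= 1UL << MODEL_SOCK_MEMALLOC;
	sk->sk_allocation |= MODEL_GFP_MEMALLOC;
}

/* Mirrors sk_clear_memalloc(): undo both changes. */
static void model_sk_clear_memalloc(struct model_sock *sk)
{
	sk->sk_flags &= ~(1UL << MODEL_SOCK_MEMALLOC);
	sk->sk_allocation &= ~MODEL_GFP_MEMALLOC;
}

/* Mirrors the patched sk_gfp_atomic(): the atomic mask, plus the
 * memalloc bit if (and only if) the socket's mask carries it. */
static gfp_t model_sk_gfp_atomic(const struct model_sock *sk)
{
	return MODEL_GFP_ATOMIC | (sk->sk_allocation & MODEL_GFP_MEMALLOC);
}
```

In this model, a socket that was never tagged yields the plain atomic mask, so untagged sockets behave as they did before the patch. Only a caller that explicitly opts in, as a kernel-side swap transport presumably would, gets allocations that may dip into the reserves.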