Commit 754af6f5a85fcd1ecb456851d20c65e4c6ce10ab

Authored by Lee Schermerhorn
Committed by Linus Torvalds
1 parent 32a4330d41

Mem Policy: add MPOL_F_MEMS_ALLOWED get_mempolicy() flag

Allow an application to query the memories allowed by its context.

Updated numa_memory_policy.txt to mention that applications can use this to
obtain allowed memories for constructing valid policies.
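
A minimal userspace sketch of such a query follows. It assumes a Linux
system whose headers define SYS_get_mempolicy, and it uses the raw syscall
because the libnuma wrappers are not yet updated (see TODO). The names
query_mems_allowed(), mask_to_nodes() and SKETCH_MAX_NODES are hypothetical
and exist only for this illustration; the flag value is the one added to
include/linux/mempolicy.h by this patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value added to include/linux/mempolicy.h by this patch. */
#ifndef MPOL_F_MEMS_ALLOWED
#define MPOL_F_MEMS_ALLOWED (1 << 2)
#endif

/* Hypothetical bound for this sketch; assumed to cover the kernel's node count. */
#define SKETCH_MAX_NODES 1024
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)
/* One spare word so the kernel's rounded-up copy always fits. */
#define SKETCH_MASK_WORDS (SKETCH_MAX_NODES / BITS_PER_ULONG + 1)

/*
 * Hypothetical helper: ask the kernel which nodes' memories the calling
 * task may use. Returns 0 on success, -1 with errno set on failure.
 * addr stays 0 and no other flag is set, because the patch rejects
 * MPOL_F_NODE or MPOL_F_ADDR combined with MPOL_F_MEMS_ALLOWED.
 */
static int query_mems_allowed(unsigned long *mask, unsigned long maxnode)
{
	int policy = 0;

	return (int)syscall(SYS_get_mempolicy, &policy, mask, maxnode,
			    0UL, (unsigned long)MPOL_F_MEMS_ALLOWED);
}

/*
 * Hypothetical helper: expand a node bitmask into a list of node ids.
 * Stores at most 'cap' ids in 'nodes' and returns the total number of
 * bits set, which may exceed 'cap'.
 */
static size_t mask_to_nodes(const unsigned long *mask, size_t nwords,
			    int *nodes, size_t cap)
{
	size_t count = 0;

	assert(mask != NULL && nodes != NULL);
	for (size_t bit = 0; bit < nwords * BITS_PER_ULONG; bit++) {
		if (mask[bit / BITS_PER_ULONG] &
		    (1UL << (bit % BITS_PER_ULONG))) {
			if (count < cap)
				nodes[count] = (int)bit;
			count++;
		}
	}
	return count;
}
```

A caller would pass a zeroed array of SKETCH_MASK_WORDS words with
maxnode = SKETCH_MAX_NODES, then build its policy nodemask only from the ids
that mask_to_nodes() reports. Because the cpuset can change at any time, a
later policy call may still fail, in which case the query should be repeated.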

TODO:  update out-of-tree libnuma wrapper[s], or maybe add a new
wrapper--e.g.,  numa_get_mems_allowed() ?

Also, update numa syscall man pages.

Tested with memtoy V>=0.13.

Signed-off-by:  Lee Schermerhorn <lee.schermerhorn@hp.com>
Acked-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 3 changed files with 28 additions and 18 deletions

Documentation/vm/numa_memory_policy.txt
@@ -302,32 +302,31 @@
 
 Memory policies work within cpusets as described above. For memory policies
 that require a node or set of nodes, the nodes are restricted to the set of
-nodes whose memories are allowed by the cpuset constraints. If the
-intersection of the set of nodes specified for the policy and the set of nodes
-allowed by the cpuset is the empty set, the policy is considered invalid and
-cannot be installed.
+nodes whose memories are allowed by the cpuset constraints. If the nodemask
+specified for the policy contains nodes that are not allowed by the cpuset, or
+the intersection of the set of nodes specified for the policy and the set of
+nodes with memory is the empty set, the policy is considered invalid
+and cannot be installed.
 
 The interaction of memory policies and cpusets can be problematic for a
 couple of reasons:
 
-1) the memory policy APIs take physical node id's as arguments. However, the
-   memory policy APIs do not provide a way to determine what nodes are valid
-   in the context where the application is running. An application MAY consult
-   the cpuset file system [directly or via an out of tree, and not generally
-   available, libcpuset API] to obtain this information, but then the
-   application must be aware that it is running in a cpuset and use what are
-   intended primarily as administrative APIs.
+1) the memory policy APIs take physical node id's as arguments. As mentioned
+   above, it is illegal to specify nodes that are not allowed in the cpuset.
+   The application must query the allowed nodes using the get_mempolicy()
+   API with the MPOL_F_MEMS_ALLOWED flag to determine the allowed nodes and
+   restrict itself to those nodes. However, the resources available to a
+   cpuset can be changed by the system administrator, or a workload manager
+   application, at any time. So, a task may still get errors attempting to
+   specify policy nodes, and must query the allowed memories again.
 
-   However, as long as the policy specifies at least one node that is valid
-   in the controlling cpuset, the policy can be used.
-
 2) when tasks in two cpusets share access to a memory region, such as shared
    memory segments created by shmget() of mmap() with the MAP_ANONYMOUS and
    MAP_SHARED flags, and any of the tasks install shared policy on the region,
    only nodes whose memories are allowed in both cpusets may be used in the
-   policies. Again, obtaining this information requires "stepping outside"
-   the memory policy APIs, as well as knowing in what cpusets other task might
-   be attaching to the shared region, to use the cpuset information.
+   policies. Obtaining this information requires "stepping outside" the
+   memory policy APIs to use the cpuset information and requires that one
+   know in what cpusets other task might be attaching to the shared region.
    Furthermore, if the cpusets' allowed memory sets are disjoint, "local"
    allocation is the only valid policy.
include/linux/mempolicy.h
@@ -19,6 +19,7 @@
 /* Flags for get_mem_policy */
 #define MPOL_F_NODE (1<<0) /* return next IL mode instead of node mask */
 #define MPOL_F_ADDR (1<<1) /* look up vma using address */
+#define MPOL_F_MEMS_ALLOWED (1<<2) /* return allowed memories */
 
 /* Flags for mbind */
 #define MPOL_MF_STRICT (1<<0) /* Verify existing pages in the mapping */
mm/mempolicy.c
@@ -526,8 +526,18 @@
 	struct mempolicy *pol = current->mempolicy;
 
 	cpuset_update_task_memory_state();
-	if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
+	if (flags &
+		~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR|MPOL_F_MEMS_ALLOWED))
 		return -EINVAL;
+
+	if (flags & MPOL_F_MEMS_ALLOWED) {
+		if (flags & (MPOL_F_NODE|MPOL_F_ADDR))
+			return -EINVAL;
+		*policy = 0; /* just so it's initialized */
+		*nmask = cpuset_current_mems_allowed;
+		return 0;
+	}
+
 	if (flags & MPOL_F_ADDR) {
 		down_read(&mm->mmap_sem);
 		vma = find_vma_intersection(mm, addr, addr+1);