14 Nov, 2008

3 commits

  • Conflicts:
    security/keys/internal.h
    security/keys/process_keys.c
    security/keys/request_key.c

    Fixed conflicts above by using the non 'tsk' versions.

    Signed-off-by: James Morris

    James Morris
     
  • Use RCU to access another task's creds and to release a task's own creds.
    This means that it will be possible for the credentials of a task to be
    replaced without another task (a) requiring a full lock to read them, and (b)
    seeing deallocated memory.

    Signed-off-by: David Howells
    Acked-by: James Morris
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    David Howells
     
  • Separate the task security context from task_struct. At this point, the
    security data is temporarily embedded in the task_struct with two pointers
    pointing to it.

    Note that the Alpha arch is altered as it refers to (E)UID and (E)GID in
    entry.S via asm-offsets.

    With comment fixes Signed-off-by: Marc Dionne

    Signed-off-by: David Howells
    Acked-by: James Morris
    Acked-by: Serge Hallyn
    Signed-off-by: James Morris

    David Howells
     

11 Nov, 2008

1 commit


07 Nov, 2008

2 commits


14 Aug, 2008

1 commit

  • Fix the setting of PF_SUPERPRIV by __capable() as it could corrupt the flags
    of the target process if that is not the current process and it is trying to
    change its own flags in a different way at the same time.

    __capable() is using neither atomic ops nor locking to protect t->flags. This
    patch removes __capable() and introduces has_capability() that doesn't set
    PF_SUPERPRIV on the process being queried.

    This patch further splits security_ptrace() in two:

    (1) security_ptrace_may_access(). This passes judgement on whether one
    process may access another only (PTRACE_MODE_ATTACH for ptrace() and
    PTRACE_MODE_READ for /proc), and takes a pointer to the child process.
    current is the parent.

    (2) security_ptrace_traceme(). This passes judgement on PTRACE_TRACEME only,
    and takes only a pointer to the parent process. current is the child.

    In Smack and commoncap, this uses has_capability() to determine whether
    the parent will be permitted to use PTRACE_ATTACH if normal checks fail.
    This does not set PF_SUPERPRIV.

    Two of the instances of __capable() actually only act on current, and so have
    been changed to calls to capable().

    Of the places that were using __capable():

    (1) The OOM killer calls __capable() thrice when weighing the killability of a
    process. All of these now use has_capability().

    (2) cap_ptrace() and smack_ptrace() were using __capable() to check to see
    whether the parent was allowed to trace any process. As mentioned above,
    these have been split. For PTRACE_ATTACH and /proc, capable() is now
    used, and for PTRACE_TRACEME, has_capability() is used.

    (3) cap_safe_nice() only ever saw current, so now uses capable().

    (4) smack_setprocattr() rejected accesses to tasks other than current just
    after calling __capable(), so the order of these two tests has been
    switched and capable() is used instead.

    (5) In smack_file_send_sigiotask(), we need to allow privileged processes to
    receive SIGIO on files they're manipulating.

    (6) In smack_task_wait(), we let a process wait for a privileged process,
    whether or not the process doing the waiting is privileged.

    I've tested this with the LTP SELinux and syscalls testscripts.

    Signed-off-by: David Howells
    Acked-by: Serge Hallyn
    Acked-by: Casey Schaufler
    Acked-by: Andrew G. Morgan
    Acked-by: Al Viro
    Signed-off-by: James Morris

    David Howells
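
The distinction between capable() and has_capability() described in the commit above can be illustrated with a small user-space model. This is only a sketch: struct task is a hypothetical, cut-down stand-in for task_struct, and the bodies are simplified guesses at the behaviour the message describes, not the kernel's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define PF_SUPERPRIV   0x00000100u  /* "used super-user privileges" flag */
#define CAP_SYS_PTRACE 19           /* capability number from linux/capability.h */

/* Hypothetical stand-in for task_struct (illustration only). */
struct task {
    uint32_t flags;          /* only safe to modify from the task itself */
    uint64_t cap_effective;  /* effective capability bitmask */
};

/* Pure query: never writes to the task being inspected, so it is
 * safe to call on a task other than the caller. */
static bool has_capability(const struct task *t, int cap)
{
    return (t->cap_effective >> cap) & 1u;
}

/* Acts on the caller's own task only, so it may record that
 * privilege was used by setting PF_SUPERPRIV. */
static bool capable(struct task *current_task, int cap)
{
    if (has_capability(current_task, cap)) {
        current_task->flags |= PF_SUPERPRIV;
        return true;
    }
    return false;
}
```

In this model, querying another task leaves its flags untouched, which is the property the removed __capable() lacked.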
     

28 Apr, 2008

3 commits

  • In commit 4c4a22148909e4c003562ea7ffe0a06e26919e3c, we moved the
    memcontroller-related code from badness() to select_bad_process(), so the
    parameter 'mem' in badness() is unused now.

    Signed-off-by: Li Zefan
    Acked-by: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Filtering zonelists requires very frequent use of zone_idx(). This is costly
    as it involves a lookup of another structure and a subtraction operation. As
    the zone_idx is often required, it should be quickly accessible. The node idx
    could also be stored here if it was found that accessing zone->node is
    significant which may be the case on workloads where nodemasks are heavily
    used.

    This patch introduces a struct zoneref to store a zone pointer and a zone
    index. The zonelist then consists of an array of these struct zonerefs which
    are looked up as necessary. Helpers are given for accessing the zone index as
    well as the node index.

    [kamezawa.hiroyu@jp.fujitsu.com: Suggested struct zoneref instead of embedding information in pointers]
    [hugh@veritas.com: mm-have-zonelist: fix memcg ooms]
    [hugh@veritas.com: just return do_try_to_free_pages]
    [hugh@veritas.com: do_try_to_free_pages gfp_mask redundant]
    Signed-off-by: Mel Gorman
    Acked-by: Christoph Lameter
    Acked-by: David Rientjes
    Signed-off-by: Lee Schermerhorn
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Cc: Nick Piggin
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
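
A minimal user-space sketch of the zoneref idea from the commit above: the index is computed once when the entry is filled in and then read directly. The structure layouts are heavily reduced, and the helper names are modelled on the description in the message; treat them as assumptions rather than the exact kernel definitions.

```c
#include <stddef.h>

enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, MAX_NR_ZONES };

struct pglist_data;

/* Reduced stand-ins for the kernel structures (illustration only). */
struct zone {
    struct pglist_data *zone_pgdat;  /* owning node */
};

struct pglist_data {
    int node_id;
    struct zone node_zones[MAX_NR_ZONES];
};

/* The costly path: follow a pointer to the node, then subtract. */
static int zone_idx(const struct zone *z)
{
    return (int)(z - z->zone_pgdat->node_zones);
}

/* One zonelist entry: the zone pointer plus its cached index. */
struct zoneref {
    struct zone *zone;
    int zone_idx;
};

/* Fill in an entry, paying for zone_idx() only once. */
static void zoneref_set_zone(struct zone *zone, struct zoneref *ref)
{
    ref->zone = zone;
    ref->zone_idx = zone_idx(zone);
}

/* Cheap accessors used while filtering a zonelist. */
static int zonelist_zone_idx(const struct zoneref *ref)
{
    return ref->zone_idx;
}

static int zonelist_node_idx(const struct zoneref *ref)
{
    return ref->zone->zone_pgdat->node_id;
}
```

Note that the node index is still looked up through the zone here; the message only suggests caching it too if that access proves significant.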
     
  • Currently a node has two sets of zonelists, one for each zone type in the
    system and a second set for GFP_THISNODE allocations. Based on the zones
    allowed by a gfp mask, one of these zonelists is selected. All of these
    zonelists consume memory and occupy cache lines.

    This patch replaces the multiple zonelists per-node with two zonelists. The
    first contains all populated zones in the system, ordered by distance, for
    fallback allocations when the target/preferred node has no free pages. The
    second contains all populated zones in the node suitable for GFP_THISNODE
    allocations.

    An iterator macro is introduced called for_each_zone_zonelist() that iterates
    through each zone allowed by the GFP flags in the selected zonelist.

    Signed-off-by: Mel Gorman
    Acked-by: Christoph Lameter
    Signed-off-by: Lee Schermerhorn
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
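
The iterator described in the commit above can be sketched in user space as follows. The macro name and argument order follow the message; everything else is an assumption for illustration: the idx field stored in struct zone, the NULL-terminated array, and the next_allowed() and count_allowed() helpers are hypothetical.

```c
#include <stddef.h>

#define MAX_ZONES_PER_ZONELIST 8

/* Reduced stand-ins (illustration only). */
struct zone {
    const char *name;
    int idx;              /* zone type index; higher means "more permissive" */
};

struct zonelist {
    /* NULL-terminated, ordered by distance from the preferred node */
    struct zone *zones[MAX_ZONES_PER_ZONELIST + 1];
};

/* Hypothetical helper: skip entries whose index is above the highest
 * index the GFP flags allow; stops on the NULL terminator. */
static struct zone **next_allowed(struct zone **z, int highidx)
{
    while (*z && (*z)->idx > highidx)
        z++;
    return z;
}

/* Visit every zone in zlist whose index is <= highidx. */
#define for_each_zone_zonelist(zone, z, zlist, highidx)              \
    for (z = next_allowed((zlist)->zones, highidx), zone = *z;       \
         zone;                                                       \
         z = next_allowed(z + 1, highidx), zone = *z)

/* Example use: count the zones an allocation may fall back to. */
static int count_allowed(struct zonelist *zl, int highidx)
{
    struct zone **z;
    struct zone *zone;
    int n = 0;

    for_each_zone_zonelist(zone, z, zl, highidx)
        n++;
    return n;
}
```

One list can then serve every GFP mask, because the filtering happens during iteration instead of being baked into separate per-zone-type lists.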
     

16 Apr, 2008

1 commit

  • When I used a test program to fork mass processes and immediately move them to
    a cgroup where the memory limit is low enough to trigger oom kill, I got oops:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000808
    IP: [] _spin_lock_irqsave+0x8/0x18
    PGD 4c95f067 PUD 4406c067 PMD 0
    Oops: 0002 [1] SMP
    CPU 2
    Modules linked in:

    Pid: 11973, comm: a.out Not tainted 2.6.25-rc7 #5
    RIP: 0010:[] [] _spin_lock_irqsave+0x8/0x18
    RSP: 0018:ffff8100448c7c30 EFLAGS: 00010002
    RAX: 0000000000000202 RBX: 0000000000000009 RCX: 000000000001c9f3
    RDX: 0000000000000100 RSI: 0000000000000001 RDI: 0000000000000808
    RBP: ffff81007e444080 R08: 0000000000000000 R09: ffff8100448c7900
    R10: ffff81000105f480 R11: 00000100ffffffff R12: ffff810067c84140
    R13: 0000000000000001 R14: ffff8100441d0018 R15: ffff81007da56200
    FS: 00007f70eb1856f0(0000) GS:ffff81007fbad3c0(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000808 CR3: 000000004498a000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process a.out (pid: 11973, threadinfo ffff8100448c6000, task ffff81007da533e0)
    Stack: ffffffff8023ef5a 00000000000000d0 ffffffff80548dc0 00000000000000d0
    ffff810067c84140 ffff81007e444080 ffffffff8026cef9 00000000000000d0
    ffff8100441d0000 00000000000000d0 ffff8100441d0000 ffff8100505445c0
    Call Trace:
    [] ? force_sig_info+0x25/0xb9
    [] ? oom_kill_task+0x77/0xe2
    [] ? mem_cgroup_out_of_memory+0x55/0x67
    [] ? mem_cgroup_charge_common+0xec/0x202
    [] ? handle_mm_fault+0x24e/0x77f
    [] ? default_wake_function+0x0/0xe
    [] ? get_user_pages+0x2ce/0x3af
    [] ? mem_cgroup_charge_common+0x2d/0x202
    [] ? make_pages_present+0x8e/0xa4
    [] ? mmap_region+0x373/0x429
    [] ? do_mmap_pgoff+0x2ff/0x364
    [] ? sys_mmap+0xe5/0x111
    [] ? tracesys+0xdc/0xe1

    Code: 00 00 01 48 8b 3c 24 e9 46 d4 dd ff f0 ff 07 48 8b 3c 24 e9 3a d4 dd ff fe 07 48 8b 3c 24 e9 2f d4 dd ff 9c 58 fa ba 00 01 00 00 66 0f c1 17 38 f2 74 06 f3 90 8a 17 eb f6 c3 fa b8 00 01 00
    RIP [] _spin_lock_irqsave+0x8/0x18
    RSP
    CR2: 0000000000000808
    ---[ end trace c3702fa668021ea4 ]---

    It's reproducible on an x86_64 box, but doesn't happen on x86_32.

    This is because tsk->sighand is not guarded by RCU, so we have to
    hold tasklist_lock, just as out_of_memory() does.

    Signed-off-by: Li Zefan
    Cc: KAMEZAWA Hiroyuki
    Acked-by: Balbir Singh
    Cc: Pavel Emelianov
    Cc: Paul Menage
    Cc: Oleg Nesterov
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     

20 Mar, 2008

1 commit


05 Mar, 2008

1 commit

  • Rename Memory Controller to Memory Resource Controller. Reflect the same
    changes in the CONFIG definition for the Memory Resource Controller. Group
    together the config options for Resource Counters and Memory Resource
    Controller.

    Signed-off-by: Balbir Singh
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Balbir Singh
     

08 Feb, 2008

3 commits

  • Adds a new sysctl, 'oom_dump_tasks', that enables the kernel to produce a
    dump of all system tasks (excluding kernel threads) when performing an
    OOM-killing. Information includes pid, uid, tgid, vm size, rss, cpu,
    oom_adj score, and name.

    This is helpful for determining why there was an OOM condition and which
    rogue task caused it.

    It is configurable so that large systems, such as those with several
    thousand tasks, do not incur a performance penalty associated with dumping
    data they may not desire.

    If an OOM was triggered as a result of a memory controller, the tasklist
    shall be filtered to exclude tasks that are not a member of the same
    cgroup.

    Cc: Andrea Arcangeli
    Cc: Christoph Lameter
    Cc: Balbir Singh
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Creates a helper function to return non-zero if a task is a member of a
    memory controller:

    int task_in_mem_cgroup(const struct task_struct *task,
                           const struct mem_cgroup *mem);

    When the OOM killer is constrained by the memory controller, the exclusion
    of tasks that are not a member of that controller was previously misplaced
    and appeared in the badness scoring function. It should be excluded
    during the tasklist scan in select_bad_process() instead.

    [akpm@linux-foundation.org: build fix]
    Cc: Christoph Lameter
    Cc: Balbir Singh
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
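
The change described in the commit above, filtering by cgroup membership during the task scan instead of inside the scoring function, might look roughly like this in a user-space model. The structures are hypothetical reductions, the precomputed points field stands in for the badness score, and select_bad_process() here takes an array instead of walking the kernel's tasklist.

```c
#include <stddef.h>

/* Reduced stand-ins (illustration only). */
struct mem_cgroup {
    int id;
};

struct task_struct {
    const char *comm;
    const struct mem_cgroup *memcg;  /* cgroup the task is charged to */
    unsigned long points;            /* precomputed badness score */
};

/* Same contract as the prototype in the commit message:
 * non-zero if the task is a member of the memory controller. */
static int task_in_mem_cgroup(const struct task_struct *task,
                              const struct mem_cgroup *mem)
{
    return task->memcg == mem;
}

/* Pick the highest-scoring task. mem == NULL models a global OOM;
 * otherwise tasks outside the cgroup are skipped during the scan,
 * so the scoring itself never needs to know about cgroups. */
static const struct task_struct *
select_bad_process(const struct task_struct *tasks, size_t n,
                   const struct mem_cgroup *mem)
{
    const struct task_struct *chosen = NULL;

    for (size_t i = 0; i < n; i++) {
        if (mem && !task_in_mem_cgroup(&tasks[i], mem))
            continue;
        if (!chosen || tasks[i].points > chosen->points)
            chosen = &tasks[i];
    }
    return chosen;
}
```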
     
  • Out of memory handling for cgroups over their limit. A task from the
    cgroup over limit is chosen using the existing OOM logic and killed.

    TODO:
    1. As discussed in the OLS BOF session, consider implementing a user
    space policy for OOM handling.

    [akpm@linux-foundation.org: fix build due to oom-killer changes]
    Signed-off-by: Pavel Emelianov
    Signed-off-by: Balbir Singh
    Cc: Paul Menage
    Cc: Peter Zijlstra
    Cc: "Eric W. Biederman"
    Cc: Nick Piggin
    Cc: Kirill Korotaev
    Cc: Herbert Poetzl
    Cc: David Rientjes
    Cc: Vaidyanathan Srinivasan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelianov
     

06 Feb, 2008

2 commits

  • Root processes are considered more important when out of memory and killing
    processes. The check for CAP_SYS_ADMIN was augmented with a check for
    uid==0 or euid==0.

    There are several possible ways to look at this:

    1. uid comparisons are unnecessary, trust CAP_SYS_ADMIN
    alone. However CAP_SYS_RESOURCE is the one that really
    means "give me extra resources" so allow for that as
    well.
    2. Any privileged code should be protected, but uid is not
    an indication of privilege. So we should check whether
    any capabilities are raised.
    3. uid==0 makes processes on the host as well as in containers
    more important, so we should keep the existing checks.
    4. uid==0 makes processes only on the host more important,
    even without any capabilities. So we should be keeping
    the (uid==0||euid==0) check but only when
    userns==&init_user_ns.

    I'm following number 1 here.

    Signed-off-by: Serge Hallyn
    Cc: Andrew Morgan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
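
Option 1 from the commit above (trust capabilities alone, and also honour CAP_SYS_RESOURCE) can be sketched as a small scoring adjustment. The function name adjust_for_privilege() is hypothetical, and the divide-by-four discount is an assumption for illustration; the commit message does not state the factor.

```c
#include <stdint.h>

#define CAP_SYS_ADMIN    21  /* capability numbers from linux/capability.h */
#define CAP_SYS_RESOURCE 24

/* Hypothetical helper: lower the OOM score of a privileged task.
 * No uid or euid comparison is made; only the effective
 * capability set is consulted. */
static unsigned long adjust_for_privilege(unsigned long points,
                                          uint64_t cap_effective)
{
    const uint64_t protected_caps =
        (1ull << CAP_SYS_ADMIN) | (1ull << CAP_SYS_RESOURCE);

    if (cap_effective & protected_caps)
        points /= 4;  /* assumed discount factor */
    return points;
}
```

With this shape, a uid 0 process that has dropped both capabilities gets no discount, which is the behavioural difference between option 1 and options 3 and 4.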
     
  • The patch supports legacy (32-bit) capability userspace, and where possible
    translates 32-bit capabilities to/from userspace and the VFS to 64-bit
    kernel space capabilities. If a capability set cannot be compressed into
    32-bits for consumption by user space, the system call fails, with -ERANGE.

    FWIW libcap-2.00 supports this change (and earlier capability formats)

    http://www.kernel.org/pub/linux/libs/security/linux-privs/kernel-2.6/

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: use get_task_comm()]
    [ezk@cs.sunysb.edu: build fix]
    [akpm@linux-foundation.org: do not initialise statics to 0 or NULL]
    [akpm@linux-foundation.org: unused var]
    [serue@us.ibm.com: export __cap_ symbols]
    Signed-off-by: Andrew G. Morgan
    Cc: Stephen Smalley
    Acked-by: Serge Hallyn
    Cc: Chris Wright
    Cc: James Morris
    Cc: Casey Schaufler
    Signed-off-by: Erez Zadok
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morgan
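
The translation rule in the commit above, where a 64-bit capability set is handed to legacy user space only if it fits in 32 bits, can be sketched like this. The function names cap_to_legacy32() and cap_from_legacy32() are hypothetical, and a single 64-bit integer stands in for the kernel's capability set type.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical helper: compress a 64-bit capability set for a
 * legacy 32-bit caller. Fails with -ERANGE if any capability
 * above bit 31 is raised, since it cannot be represented. */
static int cap_to_legacy32(uint64_t caps, uint32_t *out)
{
    if (caps >> 32)
        return -ERANGE;
    *out = (uint32_t)caps;
    return 0;
}

/* Hypothetical helper: widening in the other direction always
 * succeeds; the upper 32 bits are simply clear. */
static uint64_t cap_from_legacy32(uint32_t legacy)
{
    return (uint64_t)legacy;
}
```

Failing the call instead of silently truncating means a legacy program never sees a capability set that looks less privileged than it really is.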
     

26 Jan, 2008

1 commit

  • Move the task_struct members specific to rt scheduling together.
    A future optimization could be to put sched_entity and sched_rt_entity
    into a union.

    Signed-off-by: Peter Zijlstra
    CC: Srivatsa Vaddagiri
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

21 Oct, 2007

1 commit


20 Oct, 2007

4 commits

  • The task_struct->pid member is going to be deprecated, so start
    using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in
    the kernel.

    The first thing to start with is the pid printed to dmesg - in
    this case we may safely use task_pid_nr(). Besides, printks account for
    more (much more) than half of all the explicit pid usage.

    [akpm@linux-foundation.org: git-drm went and changed lots of stuff]
    Signed-off-by: Pavel Emelyanov
    Cc: Dave Airlie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • With pid namespaces this field is now dangerous to use explicitly, so hide
    it behind the helpers.

    Also the pid and pgrp fields of task_struct and signal_struct are to be
    deprecated. Unfortunately this patch cannot be sent right now as this
    leads to tons of warnings, so start isolating them, and deprecate later.

    Actually the p->tgid == pid has to be changed to has_group_leader_pid(),
    but Oleg pointed out that in case of posix cpu timers this is the same, and
    thread_group_leader() is preferable.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Oleg Nesterov
    Cc: Sukadev Bhattiprolu
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • mm/oom_kill.c: Convert list_for_each to list_for_each_entry in
    oom_kill_process()

    Signed-off-by: Matthias Kaehlcke
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthias Kaehlcke
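
For readers unfamiliar with the two list idioms named in the commit above, here is a user-space sketch of the conversion. The macros are simplified re-implementations in the style of the kernel's list.h (they rely on the GNU __typeof__ extension), and struct child with its pid field is a hypothetical example type, not the structure oom_kill_process() walks.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Old style: iterate over raw list nodes. */
#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

/* New style: iterate directly over the containing structures. */
#define list_for_each_entry(pos, head, member)                          \
    for (pos = container_of((head)->next, __typeof__(*pos), member);    \
         &pos->member != (head);                                        \
         pos = container_of(pos->member.next, __typeof__(*pos), member))

static void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Hypothetical element type for the example. */
struct child {
    int pid;
    struct list_head sibling;
};

/* Before: each iteration needs an explicit container_of() step. */
static int sum_pids_open_coded(struct list_head *children)
{
    struct list_head *pos;
    int sum = 0;

    list_for_each(pos, children) {
        struct child *c = container_of(pos, struct child, sibling);
        sum += c->pid;
    }
    return sum;
}

/* After: the macro hands back the entry itself. */
static int sum_pids_entry(struct list_head *children)
{
    struct child *c;
    int sum = 0;

    list_for_each_entry(c, children, sibling)
        sum += c->pid;
    return sum;
}
```

Both functions visit the same elements in the same order; the conversion only removes the boilerplate cast.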
     
  • is_init() is an ambiguous name for the pid==1 check. Split it into
    is_global_init() and is_container_init().

    A cgroup init has its tsk->pid == 1.

    A global init also has its tsk->pid == 1 and its active pid namespace
    is the init_pid_ns. But rather than check the active pid namespace,
    compare the task structure with 'init_pid_ns.child_reaper', which is
    initialized during boot to the /sbin/init process and never changes.

    Changelog:

    2.6.22-rc4-mm2-pidns1:
    - Use 'init_pid_ns.child_reaper' to determine if a given task is the
    global init (/sbin/init) process. This would improve performance
    and remove dependence on the task_pid().

    2.6.21-mm2-pidns2:

    - [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
    ppc,avr32}/traps.c for the _exception() call to is_global_init().
    This way, we kill only the cgroup if the cgroup's init has a
    bug rather than force a kernel panic.

    [akpm@linux-foundation.org: fix comment]
    [sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
    [bunk@stusta.de: kernel/pid.c: remove unused exports]
    [sukadev@us.ibm.com: Fix capability.c to work with threaded init]
    Signed-off-by: Serge E. Hallyn
    Signed-off-by: Sukadev Bhattiprolu
    Acked-by: Pavel Emelianov
    Cc: Eric W. Biederman
    Cc: Cedric Le Goater
    Cc: Dave Hansen
    Cc: Herbert Poetzel
    Cc: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
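
The two predicates from the commit above can be modelled in user space as below. This is a sketch under stated assumptions: the structures are cut down to one field each, and pid_in_ns is a hypothetical field standing in for tsk->pid as seen inside the task's own namespace.

```c
#include <stdbool.h>
#include <stddef.h>

struct task_struct;

/* Reduced stand-ins (illustration only). */
struct pid_namespace {
    struct task_struct *child_reaper;  /* set once at boot for init_pid_ns */
};

struct task_struct {
    int pid_in_ns;  /* hypothetical: pid within the task's own namespace */
};

static struct pid_namespace init_pid_ns;

/* Global init: identified by pointer comparison with the boot-time
 * child reaper, so no pid or namespace lookup is needed. */
static bool is_global_init(const struct task_struct *tsk)
{
    return tsk == init_pid_ns.child_reaper;
}

/* Container init: any task that is pid 1 in its namespace. */
static bool is_container_init(const struct task_struct *tsk)
{
    return tsk->pid_in_ns == 1;
}
```

A container's init satisfies only the second predicate, which is what lets callers choose between killing one container and treating the failure as fatal for the whole system.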
     

17 Oct, 2007

8 commits

  • There's no reason to sleep in try_set_zone_oom() or clear_zonelist_oom() if
    the lock can't be acquired; it will be available soon enough once the zonelist
    scanning is done. All other threads waiting for the OOM killer are also
    contingent on the exiting task being able to acquire the lock in
    clear_zonelist_oom() so it doesn't make sense to put it to sleep.

    Cc: Andrea Arcangeli
    Cc: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Since no task descriptor's 'cpuset' field is dereferenced in the execution of
    the OOM killer anymore, it is no longer necessary to take callback_mutex.

    [akpm@linux-foundation.org: restore cpuset_lock for other patches]
    Cc: Andrea Arcangeli
    Acked-by: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Instead of testing for overlap in the memory nodes of the nearest
    exclusive ancestor of both current and the candidate task, it is better to
    simply test for intersection between the task's mems_allowed in their task
    descriptors. This does not require taking callback_mutex since it is only
    used as a hint in the badness scoring.

    Tasks that do not have an intersection in their mems_allowed with the current
    task are not explicitly restricted from being OOM killed because it is quite
    possible that the candidate task has allocated memory there before and has
    since changed its mems_allowed.

    Cc: Andrea Arcangeli
    Acked-by: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
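
The intersection test from the commit above reduces to a bitmask AND when each node is one bit. The sketch below assumes a 64-node nodemask_t typedef and a hypothetical adjust_for_cpuset() helper; the divide-by-eight penalty is an illustrative assumption, chosen only to show that a non-overlapping task is made less attractive without being excluded.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t nodemask_t;  /* assumption: one bit per memory node */

/* Reduced stand-in (illustration only). */
struct task_struct {
    nodemask_t mems_allowed;
};

/* True if the two tasks may allocate from at least one common node.
 * Reads only the masks in the task descriptors, no cpuset lock. */
static bool mems_allowed_intersects(const struct task_struct *a,
                                    const struct task_struct *b)
{
    return (a->mems_allowed & b->mems_allowed) != 0;
}

/* Hypothetical helper: use the result as a hint only. A candidate
 * with no overlap keeps a non-zero score, because it may still hold
 * memory allocated before its mems_allowed changed. */
static unsigned long adjust_for_cpuset(unsigned long points,
                                       const struct task_struct *cur,
                                       const struct task_struct *candidate)
{
    if (!mems_allowed_intersects(cur, candidate))
        points /= 8;  /* assumed penalty factor */
    return points;
}
```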
     
  • Suppresses the extraneous stack and memory dump when a parallel OOM killing
    has been found. There's no need to fill the ring buffer with this information
    if it's already been printed and the condition that triggered the previous OOM
    killer has not yet been alleviated.

    Cc: Andrea Arcangeli
    Acked-by: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Adds a new sysctl, 'oom_kill_allocating_task', which will automatically kill
    the OOM-triggering task instead of scanning through the tasklist to find a
    memory-hogging target. This is helpful for systems with an insanely large
    number of tasks where scanning the tasklist significantly degrades
    performance.

    Cc: Andrea Arcangeli
    Acked-by: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • OOM killer synchronization should be done with zone granularity so that memory
    policy and cpuset allocations may have their corresponding zones locked and
    allow parallel kills for other OOM conditions that may exist elsewhere in the
    system. DMA allocations can be targeted at the zone level, which would not be
    possible if locking was done in nodes or globally.

    Synchronization shall be done with a variation of "trylocks." The goal is to
    put the current task to sleep and restart the failed allocation attempt later
    if the trylock fails. Otherwise, the OOM killer is invoked.

    Each zone in the zonelist that __alloc_pages() was called with is checked for
    the newly-introduced ZONE_OOM_LOCKED flag. If any zone has this flag present,
    the "trylock" to serialize the OOM killer fails and returns zero. Otherwise,
    all the zones have ZONE_OOM_LOCKED set and the try_set_zone_oom() function
    returns non-zero.

    Cc: Andrea Arcangeli
    Cc: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
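
The all-or-nothing "trylock" from the commit above can be sketched as follows. This is a single-threaded model: it assumes the caller already holds whatever lock serializes the scan, the flag bit value is hypothetical, and a NULL-terminated array of zone pointers stands in for the zonelist.

```c
#include <stddef.h>

#define ZONE_OOM_LOCKED (1u << 0)  /* hypothetical bit position */

/* Reduced stand-in (illustration only). */
struct zone {
    unsigned int flags;
};

/* Returns 0 if any zone in the list is already locked by another
 * OOM kill in progress (nothing is modified in that case).
 * Otherwise marks every zone and returns 1.
 * Assumption: the caller serializes calls to this function. */
static int try_set_zone_oom(struct zone **zones)
{
    struct zone **z;

    for (z = zones; *z; z++)
        if ((*z)->flags & ZONE_OOM_LOCKED)
            return 0;

    for (z = zones; *z; z++)
        (*z)->flags |= ZONE_OOM_LOCKED;
    return 1;
}

/* Release the zones once the kill has been carried out. */
static void clear_zonelist_oom(struct zone **zones)
{
    for (struct zone **z = zones; *z; z++)
        (*z)->flags &= ~ZONE_OOM_LOCKED;
}
```

Two allocations whose zonelists share no zone can both succeed here, which is the parallelism that zone granularity is meant to allow.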
     
  • The OOM killer's CONSTRAINT definitions are really more appropriate in an
    enum, so define them in include/linux/oom.h.

    Cc: Andrea Arcangeli
    Acked-by: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • constrained_alloc() builds its own memory map for nodes with memory. We have
    that available in N_HIGH_MEMORY now. So simplify the code.

    Signed-off-by: Christoph Lameter
    Acked-by: Nishanth Aravamudan
    Acked-by: Lee Schermerhorn
    Acked-by: Bob Picco
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

01 Aug, 2007

1 commit


30 Jul, 2007

1 commit

  • Remove fs.h from mm.h. For this,
    1) Uninline vma_wants_writenotify(). It's pretty huge anyway.
    2) Add back fs.h or less bloated headers (err.h) to files that need it.

    As a result, on x86_64 allyesconfig, the number of files rebuilt because of
    fs.h dependencies is cut down from 3929 to 3444 (-12.3%).

    Cross-compile tested without regressions on my two usual configs and (sigh):

    alpha arm-mx1ads mips-bigsur powerpc-ebony
    alpha-allnoconfig arm-neponset mips-capcella powerpc-g5
    alpha-defconfig arm-netwinder mips-cobalt powerpc-holly
    alpha-up arm-netx mips-db1000 powerpc-iseries
    arm arm-ns9xxx mips-db1100 powerpc-linkstation
    arm-assabet arm-omap_h2_1610 mips-db1200 powerpc-lite5200
    arm-at91rm9200dk arm-onearm mips-db1500 powerpc-maple
    arm-at91rm9200ek arm-picotux200 mips-db1550 powerpc-mpc7448_hpc2
    arm-at91sam9260ek arm-pleb mips-ddb5477 powerpc-mpc8272_ads
    arm-at91sam9261ek arm-pnx4008 mips-decstation powerpc-mpc8313_rdb
    arm-at91sam9263ek arm-pxa255-idp mips-e55 powerpc-mpc832x_mds
    arm-at91sam9rlek arm-realview mips-emma2rh powerpc-mpc832x_rdb
    arm-ateb9200 arm-realview-smp mips-excite powerpc-mpc834x_itx
    arm-badge4 arm-rpc mips-fulong powerpc-mpc834x_itxgp
    arm-carmeva arm-s3c2410 mips-ip22 powerpc-mpc834x_mds
    arm-cerfcube arm-shannon mips-ip27 powerpc-mpc836x_mds
    arm-clps7500 arm-shark mips-ip32 powerpc-mpc8540_ads
    arm-collie arm-simpad mips-jazz powerpc-mpc8544_ds
    arm-corgi arm-spitz mips-jmr3927 powerpc-mpc8560_ads
    arm-csb337 arm-trizeps4 mips-malta powerpc-mpc8568mds
    arm-csb637 arm-versatile mips-mipssim powerpc-mpc85xx_cds
    arm-ebsa110 i386 mips-mpc30x powerpc-mpc8641_hpcn
    arm-edb7211 i386-allnoconfig mips-msp71xx powerpc-mpc866_ads
    arm-em_x270 i386-defconfig mips-ocelot powerpc-mpc885_ads
    arm-ep93xx i386-up mips-pb1100 powerpc-pasemi
    arm-footbridge ia64 mips-pb1500 powerpc-pmac32
    arm-fortunet ia64-allnoconfig mips-pb1550 powerpc-ppc64
    arm-h3600 ia64-bigsur mips-pnx8550-jbs powerpc-prpmc2800
    arm-h7201 ia64-defconfig mips-pnx8550-stb810 powerpc-ps3
    arm-h7202 ia64-gensparse mips-qemu powerpc-pseries
    arm-hackkit ia64-sim mips-rbhma4200 powerpc-up
    arm-integrator ia64-sn2 mips-rbhma4500 s390
    arm-iop13xx ia64-tiger mips-rm200 s390-allnoconfig
    arm-iop32x ia64-up mips-sb1250-swarm s390-defconfig
    arm-iop33x ia64-zx1 mips-sead s390-up
    arm-ixp2000 m68k mips-tb0219 sparc
    arm-ixp23xx m68k-amiga mips-tb0226 sparc-allnoconfig
    arm-ixp4xx m68k-apollo mips-tb0287 sparc-defconfig
    arm-jornada720 m68k-atari mips-workpad sparc-up
    arm-kafa m68k-bvme6000 mips-wrppmc sparc64
    arm-kb9202 m68k-hp300 mips-yosemite sparc64-allnoconfig
    arm-ks8695 m68k-mac parisc sparc64-defconfig
    arm-lart m68k-mvme147 parisc-allnoconfig sparc64-up
    arm-lpd270 m68k-mvme16x parisc-defconfig um-x86_64
    arm-lpd7a400 m68k-q40 parisc-up x86_64
    arm-lpd7a404 m68k-sun3 powerpc x86_64-allnoconfig
    arm-lubbock m68k-sun3x powerpc-cell x86_64-defconfig
    arm-lusl7200 mips powerpc-celleb x86_64-up
    arm-mainstone mips-atlas powerpc-chrp32

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

08 May, 2007

3 commits

  • Fixes a deadlock in the OOM killer for allocations that are not
    __GFP_HARDWALL.

    Before the OOM killer checks for the allocation constraint, it takes
    callback_mutex.

    constrained_alloc() iterates through each zone in the allocation zonelist
    and calls cpuset_zone_allowed_softwall() to determine whether an allocation
    for gfp_mask is possible. If a zone's node is not in the OOM-triggering
    task's mems_allowed, it is not exiting, and we did not fail on a
    __GFP_HARDWALL allocation, cpuset_zone_allowed_softwall() attempts to take
    callback_mutex to check the nearest exclusive ancestor of current's cpuset.
    This results in deadlock.

    We now take callback_mutex after iterating through the zonelist since we
    don't need it yet.

    Cc: Andi Kleen
    Cc: Nick Piggin
    Cc: Christoph Lameter
    Cc: Martin J. Bligh
    Signed-off-by: David Rientjes
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • The current panic_on_oom may not work if there is a process using
    cpusets/mempolicy, because other nodes' memory may remain. But some people
    want failover by panic ASAP even if they are used. This patch adds a new
    setting for that request.

    This is tested on my ia64 box which has 3 nodes.

    Signed-off-by: Yasunori Goto
    Signed-off-by: Benjamin LaHaise
    Cc: Christoph Lameter
    Cc: Paul Jackson
    Cc: Ethan Solomita
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • If the badness of a process is zero then oom_adj>0 has no effect. This
    patch makes sure that the oom_adj shift actually increases badness points
    appropriately.

    Signed-off-by: Joshua N. Pritikin
    Cc: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joshua N Pritikin
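
The problem fixed by the commit above is that shifting zero left still gives zero. A sketch of the adjusted logic, assuming oom_adj is applied as a bit shift as the message says; the function name apply_oom_adj() is hypothetical, and any special "never kill" value is assumed to be handled elsewhere.

```c
/* Hypothetical helper: apply the oom_adj shift to a badness score.
 * A positive oom_adj must always raise the score, so a score of
 * zero is bumped to one before shifting. */
static unsigned long apply_oom_adj(unsigned long points, int oom_adj)
{
    if (oom_adj > 0) {
        if (!points)
            points = 1;        /* otherwise 0 << n would stay 0 */
        points <<= oom_adj;
    } else if (oom_adj < 0) {
        points >>= -oom_adj;   /* negative values make a kill less likely */
    }
    return points;
}
```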
     

24 Apr, 2007

2 commits

  • I only have CONFIG_NUMA=y for build testing: surprised when trying a memhog
    to see lots of other processes killed with "No available memory
    (MPOL_BIND)". memhog is killed correctly once we initialize nodemask in
    constrained_alloc().

    Signed-off-by: Hugh Dickins
    Acked-by: Christoph Lameter
    Acked-by: William Irwin
    Acked-by: KAMEZAWA Hiroyuki
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • oom_kill_task() calls __oom_kill_task() to OOM kill a selected task.
    When finding other threads that share an mm with that task, we need to
    kill those individual threads and not the same one.

    (Bug introduced by f2a2a7108aa0039ba7a5fe7a0d2ecef2219a7584)

    Acked-by: William Irwin
    Acked-by: Christoph Lameter
    Cc: Nick Piggin
    Cc: Andrew Morton
    Cc: Andi Kleen
    Signed-off-by: David Rientjes
    Signed-off-by: Linus Torvalds

    David Rientjes
     

17 Mar, 2007

1 commit