21 Aug, 2008

1 commit

  • When a user calls sys_setpriority(PRIO_PGRP ...) on an NPTL-style
    multi-LWP process, only the task leader of the process is affected; the
    sibling LWP threads do not receive the setting. The problem was that the
    iterator used in sys_setpriority() only iterates over one task for each
    process, ignoring all other sibling threads.

    Introduce a new macro do_each_pid_thread / while_each_pid_thread to walk
    each thread of a process. Convert 4 call sites in {set/get}priority and
    ioprio_{set/get}.
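
    A minimal user-space model of the difference, not the kernel macros
    themselves: the struct names and helper functions below are hypothetical,
    and only the iteration pattern (leader only versus every thread of every
    process in the group) is taken from the description above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; not the real task_struct. */
struct thread {
    int nice;
};

struct process {
    struct thread *threads;   /* threads[0] models the thread-group leader */
    size_t nr_threads;
};

/* Old behaviour: only the leader of each process in the group is visited. */
static size_t set_nice_leaders_only(struct process *grp, size_t nr, int nice)
{
    size_t touched = 0;

    for (size_t p = 0; p < nr; p++) {
        if (grp[p].nr_threads == 0)
            continue;
        grp[p].threads[0].nice = nice;
        touched++;
    }
    return touched;
}

/* New behaviour: every thread of every process in the group is visited. */
static size_t set_nice_all_threads(struct process *grp, size_t nr, int nice)
{
    size_t touched = 0;

    for (size_t p = 0; p < nr; p++) {
        for (size_t t = 0; t < grp[p].nr_threads; t++) {
            grp[p].threads[t].nice = nice;
            touched++;
        }
    }
    return touched;
}
```

    With two processes of three and two threads, the first helper touches two
    tasks and the second touches five.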

    Signed-off-by: Ken Chen
    Cc: Oleg Nesterov
    Cc: Roland McGrath
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ken Chen
     

15 Aug, 2008

1 commit

  • Call kernel_restart_prepare() in kernel_kexec() instead of duplicating the
    code.

    Signed-off-by: Huang Ying
    Acked-by: Pavel Machek
    Acked-by: Vivek Goyal
    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Cc: "Eric W. Biederman"
    Cc: Vivek Goyal
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Ying
     

27 Jul, 2008

1 commit

  • This patch provides an enhancement to kexec/kdump. It implements the
    following features:

    - Backup/restore memory used by the original kernel before/after
    kexec.

    - Save/restore CPU state before/after kexec.

    The features of this patch can be used as a general method to call a program
    in physical mode (with paging turned off). This can be used to call BIOS code
    under Linux.

    kexec-tools needs to be patched to support kexec jump. The patches and
    the precompiled kexec can be downloaded from the following URLs:

    source: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-src_git_kh10.tar.bz2
    patches: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec-tools-patches_git_kh10.tar.bz2
    binary: http://khibernation.sourceforge.net/download/release_v10/kexec-tools/kexec_git_kh10

    Usage example of calling some physical mode code and returning:

    1. Compile and install the patched kernel with the following options selected:

    CONFIG_X86_32=y
    CONFIG_KEXEC=y
    CONFIG_PM=y
    CONFIG_KEXEC_JUMP=y

    2. Build patched kexec-tool or download the pre-built one.

    3. Build a physical mode executable, named for example "phy_mode".

    4. Boot kernel compiled in step 1.

    5. Load the physical mode executable with /sbin/kexec. The shell command
    line can be as follows:

    /sbin/kexec --load-preserve-context --args-none phy_mode

    6. Call the physical mode executable with the following shell command line:

    /sbin/kexec -e

    Implementation point:

    To support jumping without reserving memory, one shadow backup page (source
    page) is allocated for each page used by the kexeced code image (destination
    page). During kexec_load, the image of the kexeced code is loaded into the
    source pages, and before executing, the destination pages and the source
    pages are swapped, so the contents of the destination pages are backed up.
    Before jumping to the kexeced code image and after jumping back to the
    original kernel, the destination pages and the source pages are swapped too.
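
    The swap scheme can be modelled in user space as follows. This is only a
    sketch under stated assumptions: the page size, the function names and the
    use of a temporary buffer are hypothetical and are not taken from the
    patch. It shows that swapping twice restores the original contents.

```c
#include <string.h>

#define PAGE_BYTES 4096  /* assumed page size for this sketch */

/* Exchange the contents of two pages through a temporary buffer. */
static void swap_pages(unsigned char *dst, unsigned char *src)
{
    unsigned char tmp[PAGE_BYTES];

    memcpy(tmp, dst, PAGE_BYTES);
    memcpy(dst, src, PAGE_BYTES);
    memcpy(src, tmp, PAGE_BYTES);
}

/*
 * Swap every destination page with its shadow (source) page.
 * First call: the loaded image lands in the destination pages and the
 * original contents are backed up in the source pages.
 * Second call: the original contents are restored.
 */
static void swap_all(unsigned char (*dst)[PAGE_BYTES],
                     unsigned char (*src)[PAGE_BYTES], size_t nr_pages)
{
    for (size_t i = 0; i < nr_pages; i++)
        swap_pages(dst[i], src[i]);
}
```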

    The C ABI (calling convention) is used as the communication protocol
    between the kernel and the called code.

    A flag named KEXEC_PRESERVE_CONTEXT for sys_kexec_load is added to
    indicate that the loaded kernel image is used for jumping back.

    Currently, only the i386 architecture is supported.

    Signed-off-by: Huang Ying
    Acked-by: Vivek Goyal
    Cc: "Eric W. Biederman"
    Cc: Pavel Machek
    Cc: Nigel Cunningham
    Cc: "Rafael J. Wysocki"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Ying
     

26 Jul, 2008

2 commits

  • With the removal of the Solaris binary emulation the export of
    uts_sem became unused.

    Signed-off-by: Adrian Bunk
    Acked-by: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Presently call_usermodehelper_setup() uses GFP_ATOMIC, but that allocation
    can return NULL _very_ easily.

    GFP_ATOMIC is needed only when we can't sleep, and GFP_KERNEL is more
    robust.

    Thus, add a gfp_mask argument to call_usermodehelper_setup().

    Its callers pass the gfp_t as below:

    call_usermodehelper() and call_usermodehelper_keys():
        depends on the 'wait' argument.
    call_usermodehelper_pipe():
        always GFP_KERNEL, because it always runs in process context.
    orderly_poweroff():
        passes GFP_ATOMIC, because it may run in interrupt context.
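
    One way the "depends on the 'wait' argument" rule could look, as a
    user-space sketch. The enum gfp_choice and helper_gfp_mask() are
    hypothetical names, and the mapping (non-waiting callers get the atomic
    mask, everyone else the sleeping one) is an assumption of this sketch
    rather than a quote from the patch.

```c
/* Hypothetical stand-ins; the real gfp_t flags live in the kernel headers. */
enum gfp_choice { SKETCH_GFP_ATOMIC, SKETCH_GFP_KERNEL };

/* Wait modes of the usermode-helper API, with their integer values. */
enum umh_wait { UMH_NO_WAIT = -1, UMH_WAIT_EXEC = 0, UMH_WAIT_PROC = 1 };

static enum gfp_choice helper_gfp_mask(enum umh_wait wait)
{
    /* Assumption: a caller that does not wait may be in atomic context. */
    return wait == UMH_NO_WAIT ? SKETCH_GFP_ATOMIC : SKETCH_GFP_KERNEL;
}
```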

    Signed-off-by: KOSAKI Motohiro
    Cc: "Paul Menage"
    Reviewed-by: Li Zefan
    Acked-by: Jeremy Fitzhardinge
    Cc: Rusty Russell
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     

25 May, 2008

1 commit

  • If none of the switch cases match, the PR_SET_PDEATHSIG and
    PR_SET_DUMPABLE cases of the switch statement will never write to local
    variable `error'.

    Signed-off-by: Shi Weihua
    Cc: Andrew G. Morgan
    Acked-by: "Serge E. Hallyn"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shi Weihua
     

30 Apr, 2008

4 commits

  • 1. sys_getpgid() needs rcu_read_lock() to derive the pgrp _nr, even if
    the task is current, otherwise we can race with another thread which
    does sys_setpgid().

    2. Use rcu_read_lock() instead of tasklist_lock when pid != 0, make sure
    that we don't use the NULL pid if the task exits right after successful
    find_task_by_vpid().

    Signed-off-by: Oleg Nesterov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • 1. sys_getsid() needs rcu_read_lock() to derive the session _nr, even if
    the task is current, otherwise we can race with another thread which
    does sys_setsid().

    2. The task can exit between find_task_by_vpid() and task_session_vnr(),
    in that unlikely case sys_getsid() returns 0 instead of -ESRCH.

    Signed-off-by: Oleg Nesterov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Use change_pid() instead of detach_pid() + attach_pid() in sys_setpgid().

    This way task_pgrp() is not NULL in between.

    Signed-off-by: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Cc: Pavel Emelyanov
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Just a trivial example, more to come.

    k_getrusage() holds rcu_read_lock() because it was previously required by
    lock_task_sighand(). Unneeded now.

    Signed-off-by: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Cc: "Paul E. McKenney"
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

29 Apr, 2008

1 commit

  • Add the RUSAGE_THREAD option for the getrusage system call. This is
    essentially Roland's patch from http://lkml.org/lkml/2008/1/18/589, but the
    line about RUSAGE_LWP has been removed, as suggested by Ulrich and
    Christoph.

    Signed-off-by: Roland McGrath
    Signed-off-by: Sripathi Kodi
    Cc: Ingo Molnar
    Cc: Michael Kerrisk
    Cc: Ulrich Drepper
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sripathi Kodi
     

28 Apr, 2008

1 commit

  • Filesystem capability support makes it possible to do away with (set)uid-0
    based privilege and use capabilities instead. That is, with filesystem
    support for capabilities but without this present patch, it is (conceptually)
    possible to manage a system with capabilities alone and never need to obtain
    privilege via (set)uid-0.

    Of course, conceptually isn't quite the same as currently possible since few
    user applications, certainly not enough to run a viable system, are currently
    prepared to leverage capabilities to exercise privilege. Further, many
    applications exist that may never get upgraded in this way, and the kernel
    will continue to want to support their setuid-0 base privilege needs.

    Where pure-capability applications evolve and replace setuid-0 binaries, it is
    desirable that there be a mechanism by which they can contain their
    privilege. In addition to leveraging the per-process bounding and inheritable
    sets, this should include suppressing the privilege of the uid-0 superuser
    from the process' tree of children.

    The feature added by this patch can be leveraged to suppress the privilege
    associated with (set)uid-0. This suppression requires CAP_SETPCAP to
    initiate, and only immediately affects the 'current' process (it is inherited
    through fork()/exec()). This reimplementation differs significantly from the
    historical support for securebits which was system-wide, unwieldy and which
    has ultimately withered to a dead relic in the source of the modern kernel.

    With this patch applied, a process that is capable(CAP_SETPCAP) can now drop
    all legacy privilege (through uid=0) for itself and all subsequently
    fork()'d/exec()'d children with:

    prctl(PR_SET_SECUREBITS, 0x2f);
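
    The value 0x2f can be decomposed as below. This is a sketch: the SKETCH_*
    names are hypothetical, and the bit positions are repeated here on the
    assumption that they match <linux/securebits.h> (NOROOT at bit 0,
    NO_SETUID_FIXUP at bit 2, KEEP_CAPS at bit 4, each followed by its lock
    bit).

```c
/* Bit positions assumed to match <linux/securebits.h>. */
#define SKETCH_NOROOT                  (1u << 0)
#define SKETCH_NOROOT_LOCKED           (1u << 1)
#define SKETCH_NO_SETUID_FIXUP         (1u << 2)
#define SKETCH_NO_SETUID_FIXUP_LOCKED  (1u << 3)
#define SKETCH_KEEP_CAPS               (1u << 4)
#define SKETCH_KEEP_CAPS_LOCKED        (1u << 5)

/*
 * Set and lock NOROOT and NO_SETUID_FIXUP, and lock KEEP_CAPS in its
 * cleared state, so none of the three can be changed again.
 */
static unsigned int drop_legacy_root_bits(void)
{
    return SKETCH_NOROOT | SKETCH_NOROOT_LOCKED |
           SKETCH_NO_SETUID_FIXUP | SKETCH_NO_SETUID_FIXUP_LOCKED |
           SKETCH_KEEP_CAPS_LOCKED;
}
```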

    This patch represents a no-op unless CONFIG_SECURITY_FILE_CAPABILITIES is
    enabled at configure time.

    [akpm@linux-foundation.org: fix uninitialised var warning]
    [serue@us.ibm.com: capabilities: use cap_task_prctl when !CONFIG_SECURITY]
    Signed-off-by: Andrew G. Morgan
    Acked-by: Serge Hallyn
    Reviewed-by: James Morris
    Cc: Stephen Smalley
    Cc: Paul Moore
    Signed-off-by: Serge E. Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew G. Morgan
     

20 Apr, 2008

1 commit


09 Feb, 2008

7 commits

  • Some time ago the xxx_vnr() calls (e.g. pid_vnr or find_task_by_vpid) were
    _all_ converted to operate on the current pid namespace. After this each call
    like xxx_nr_ns(foo, current->nsproxy->pid_ns) is nothing but a xxx_vnr(foo)
    one.

    Switch all the xxx_nr_ns() callers to use the xxx_vnr() calls where
    appropriate.

    Signed-off-by: Pavel Emelyanov
    Reviewed-by: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • With the new semantics of find_vpid() we don't need to play with ->nsproxy
    explicitly, _vxx() do the right things.

    Also s/tasklist/rcu/.

    Signed-off-by: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Cc: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Eric's "fix clone(CLONE_NEWPID)" eliminated the last reason for this hack.

    Signed-off-by: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • As Eric pointed out, there is no problem with init starting with sid == pgid
    == 0, and this was historical linux behavior changed in 2.6.18.

    Remove kernel_init()->__set_special_pids(), this is unneeded and complicates
    the rules for sys_setsid().

    This change and the previous change in daemonize() mean that /sbin/init does
    not need the special "session != 1" hack in sys_setsid() any longer. We can't
    remove this check yet; we should clean up copy_process(CLONE_NEWPID) first, so
    update the comment only.

    Signed-off-by: Oleg Nesterov
    Acked-by: "Eric W. Biederman"
    Cc: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Change set_special_pids() to work with struct pid, not pid_t from the global
    namespace. This again speeds up and, imho, cleans up the code; it is also a
    preparation for the next patch.

    Signed-off-by: Oleg Nesterov
    Acked-by: "Eric W. Biederman"
    Acked-by: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • sys_setsid() still deals with pid_t's from the global namespace. This means
    that the "session > 1" check can't help for sub-namespace init, setsid() can't
    succeed because copy_process(CLONE_NEWPID) populates PIDTYPE_PGID/SID links.

    Remove the usage of task_struct->pid and convert the code to use "struct pid".
    This also simplifies and speeds up the code, and saves one find_pid().

    Signed-off-by: Oleg Nesterov
    Cc: "Eric W. Biederman"
    Acked-by: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • sys_setpgid() does unneeded conversions from pid_t to "struct pid" and vice
    versa. Use "struct pid" more consistently. Saves one find_vpid() and
    eliminates the explicit usage of ->nsproxy->pid_ns. Imho, this cleans up
    the code.

    Also use the same_thread_group() helper.

    Signed-off-by: Oleg Nesterov
    Acked-by: Pavel Emelyanov
    Acked-by: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

07 Feb, 2008

2 commits

  • groups_sort() can take quite long if a user loads a large gid table.

    This is because GROUP_AT(group_info, some_integer) uses an integer divide.
    So having to do XXX thousand divides during one syscall can lead to very
    high latencies. (NGROUPS_MAX=65536)

    In the past (25 Mar 2006), an analogous problem was found in groups_search()
    (commit d74beb9f33a5f16d2965f11b275e401f225c949d) and at that time I
    changed some variables to unsigned int.

    I believe that a more generic fix is to make sure NGROUPS_PER_BLOCK is
    unsigned.
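
    A user-space sketch of the indexing involved. All SKETCH_* and sketch_*
    names, the page size and the block layout are assumptions made for
    illustration; the point is only that the per-block count is an unsigned
    power of two, so the divide and modulo in the accessor can be reduced to a
    shift and a mask.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u          /* assumed page size */
typedef unsigned int sketch_gid_t;      /* stand-in for gid_t */

/* Unsigned on purpose, following the fix described above. */
#define SKETCH_NGROUPS_PER_BLOCK \
    ((unsigned int)(SKETCH_PAGE_SIZE / sizeof(sketch_gid_t)))

struct sketch_group_info {
    sketch_gid_t **blocks;              /* one page-sized array per block */
};

/* Return a pointer to entry i, spread across page-sized blocks. */
static sketch_gid_t *group_at(struct sketch_group_info *gi, unsigned int i)
{
    return &gi->blocks[i / SKETCH_NGROUPS_PER_BLOCK]
                      [i % SKETCH_NGROUPS_PER_BLOCK];
}
```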

    Signed-off-by: Eric Dumazet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • NR_OPEN (historically set to 1024*1024) actually forbids processes from
    opening more than 1024*1024 handles.

    Unfortunately some production servers hit the not so 'ridiculously high
    value' of 1024*1024 file descriptors per process.

    Changing NR_OPEN is not considered safe because of potential vmalloc space
    exhaustion.

    This patch introduces a new sysctl (/proc/sys/fs/nr_open) which defaults to
    1024*1024, so that admins can decide to change this limit if their workload
    needs it.

    [akpm@linux-foundation.org: export it for sparc64]
    Signed-off-by: Eric Dumazet
    Cc: Alan Cox
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: "David S. Miller"
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     

06 Feb, 2008

2 commits

  • kernel_shutdown_prepare() can now become static.

    Signed-off-by: Adrian Bunk
    Acked-by: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • The capability bounding set is a set beyond which capabilities cannot grow.
    Currently cap_bset is per-system. It can be manipulated through sysctl,
    but only init can add capabilities. Root can remove capabilities. By
    default it includes all caps except CAP_SETPCAP.

    This patch makes the bounding set per-process when file capabilities are
    enabled. It is inherited at fork from the parent. No one can add elements;
    CAP_SETPCAP is required to remove them.

    One example use of this is to start a safer container. For instance, until
    device namespaces or per-container device whitelists are introduced, it is
    best to take CAP_MKNOD away from a container.

    The bounding set will not affect pP and pE immediately. It will only
    affect pP' and pE' after subsequent exec()s. It also does not affect pI,
    and exec() does not constrain pI'. So to really start a shell with no way
    of regaining CAP_MKNOD, you would do

    prctl(PR_CAPBSET_DROP, CAP_MKNOD);
    cap_t cap = cap_get_proc();
    cap_value_t caparray[1];
    caparray[0] = CAP_MKNOD;
    cap_set_flag(cap, CAP_INHERITABLE, 1, caparray, CAP_DROP);
    cap_set_proc(cap);
    cap_free(cap);

    The following test program will get and set the bounding
    set (but not pI). For instance

    ./bset get
    (lists capabilities in bset)
    ./bset drop cap_net_raw
    (starts shell with new bset)
    (use capset, setuid binary, or binary with
    file capabilities to try to increase caps)

    ************************************************************
    cap_bound.c
    ************************************************************
    #include
    #include
    #include
    #include
    #include
    #include
    #include

    #ifndef PR_CAPBSET_READ
    #define PR_CAPBSET_READ 23
    #endif

    #ifndef PR_CAPBSET_DROP
    #define PR_CAPBSET_DROP 24
    #endif

    int usage(char *me)
    {
    printf("Usage: %s get\n", me);
    printf("       %s drop <capability>\n", me);
    return 1;
    }

    #define numcaps 32
    char *captable[numcaps] = {
    "cap_chown",
    "cap_dac_override",
    "cap_dac_read_search",
    "cap_fowner",
    "cap_fsetid",
    "cap_kill",
    "cap_setgid",
    "cap_setuid",
    "cap_setpcap",
    "cap_linux_immutable",
    "cap_net_bind_service",
    "cap_net_broadcast",
    "cap_net_admin",
    "cap_net_raw",
    "cap_ipc_lock",
    "cap_ipc_owner",
    "cap_sys_module",
    "cap_sys_rawio",
    "cap_sys_chroot",
    "cap_sys_ptrace",
    "cap_sys_pacct",
    "cap_sys_admin",
    "cap_sys_boot",
    "cap_sys_nice",
    "cap_sys_resource",
    "cap_sys_time",
    "cap_sys_tty_config",
    "cap_mknod",
    "cap_lease",
    "cap_audit_write",
    "cap_audit_control",
    "cap_setfcap"
    };

    int getbcap(void)
    {
    int comma=0;
    unsigned long i;
    int ret;

    printf("i know of %d capabilities\n", numcaps);
    printf("capability bounding set:");
    for (i=0; i<numcaps; i++) {
    ret = prctl(PR_CAPBSET_READ, i);
    if (ret < 0)
    perror("prctl");
    else if (ret==1)
    printf("%s%s", (comma++) ? ", " : " ", captable[i]);
    }
    printf("\n");
    return 0;
    }

    int capdrop(char *str)
    {
    unsigned long i;

    int found=0;
    for (i=0; i
    Signed-off-by: Andrew G. Morgan
    Cc: Stephen Smalley
    Cc: James Morris
    Cc: Chris Wright
    Cc: Casey Schaufler
    Signed-off-by: "Serge E. Hallyn"
    Tested-by: Jiri Slaby
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     

17 Nov, 2007

1 commit

  • Don't use the vgetcpu tcache - it's causing problems with tasks
    migrating: they'll see the old cache up to a jiffy after the
    migration, further increasing the costs of the migration.

    In the worst case they see completely bogus information from
    the tcache, when a sys_getcpu() call "invalidated" the cache
    info by incrementing the jiffies _and_ the cpuid info in the
    cache and the following vdso_getcpu() call happens after
    vdso_jiffies have been incremented.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Ulrich Drepper
    Signed-off-by: Thomas Gleixner

    Ingo Molnar
     

20 Oct, 2007

5 commits

  • The pgrp field is not used widely around the kernel so it is now marked as
    deprecated with appropriate comment.

    The initialization of INIT_SIGNALS is trimmed because
    a) they are set to 0 automatically;
    b) gcc cannot properly initialize two anonymous (the second one
    is the one with the session) unions. In this particular case
    to make it compile we'd have to add some field initialized
    right before the .pgrp.

    This is the same patch as the 1ec320afdc9552c92191d5f89fcd1ebe588334ca one
    (from Cedric), but for the pgrp field.

    Some progress report:

    We have to deprecate the pid, tgid, session and pgrp fields on struct
    task_struct and struct signal_struct. The session and pgrp are already
    deprecated. The tgid value is close to being such - the worst known usage
    is in fs/locks.c and the audit code. The pid field deprecation is mainly
    blocked by numerous printk-s around the kernel that print the tsk->pid to
    log.

    Signed-off-by: Pavel Emelyanov
    Cc: Oleg Nesterov
    Cc: Sukadev Bhattiprolu
    Cc: Cedric Le Goater
    Cc: Serge Hallyn
    Cc: "Eric W. Biederman"
    Cc: Herbert Poetzl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • The find_task_by_something helpers are a set of macros used to find a task
    by pid, depending on what kind of pid is supplied - a global or a virtual
    one. All of them are wrappers around the most generic one -
    find_task_by_pid_type_ns() - and just substitute some args for it.

    It turned out that dereferencing the current->nsproxy->pid_ns construction
    and pushing one more argument on the stack inline causes kernel text size to
    grow.

    This patch moves all this stuff out-of-line into kernel/pid.c. Together
    with the next patch it saves a bit less than 400 bytes from the .text
    section.

    Signed-off-by: Pavel Emelyanov
    Cc: Sukadev Bhattiprolu
    Cc: Oleg Nesterov
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • This is the largest patch in the set. Make all (I hope) the places where
    a pid is shown to or obtained from the user operate on virtual pids.

    The idea is:
    - all in-kernel data structures must store either struct pid itself
    or the pid's global nr, obtained with pid_nr() call;
    - when seeking the task from kernel code with the stored id one
    should use find_task_by_pid() call that works with global pids;
    - when showing a pid's numerical value to the user the virtual one
    should be used; however, when one shows a task's pid outside this
    task's namespace the global one is to be used;
    - when getting a pid from userspace one needs to consider it as
    the virtual one and use appropriate task/pid-searching functions.
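
    A small user-space model of global versus virtual pid numbers. The
    sketch_* types and functions are hypothetical simplifications: a pid
    carries one number per namespace level, level 0 holds the global number,
    and the lookup returns 0 when the pid is not visible in the given
    namespace (an assumption of this sketch).

```c
#define SKETCH_MAX_LEVEL 4

struct sketch_ns {
    int level;                          /* 0 is the initial namespace */
};

struct sketch_upid {
    int nr;                             /* number seen in this namespace */
    const struct sketch_ns *ns;
};

struct sketch_pid {
    int level;                          /* deepest namespace level */
    struct sketch_upid numbers[SKETCH_MAX_LEVEL];
};

/* Global number: the one stored for the initial namespace. */
static int sketch_pid_nr(const struct sketch_pid *pid)
{
    return pid->numbers[0].nr;
}

/* Number as seen from namespace ns, or 0 if the pid is not visible there. */
static int sketch_pid_nr_ns(const struct sketch_pid *pid,
                            const struct sketch_ns *ns)
{
    if (ns->level <= pid->level && pid->numbers[ns->level].ns == ns)
        return pid->numbers[ns->level].nr;
    return 0;
}
```

    In this model the virtual number is simply sketch_pid_nr_ns() applied to
    the caller's own namespace.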

    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: nuther build fix]
    [akpm@linux-foundation.org: yet nuther build fix]
    [akpm@linux-foundation.org: remove unneeded casts]
    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Alexey Dobriyan
    Cc: Sukadev Bhattiprolu
    Cc: Oleg Nesterov
    Cc: Paul Menage
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • The set of functions process_session, task_session, process_group and
    task_pgrp is confusing, as the names can be mixed with each other when looking
    at the code for a long time.

    The proposals are to
    * equip the functions that return the integer with _nr suffix to
    represent that fact,
    * and to make all functions work with task (not process) by making
    the common prefix of the same name.

    For uniformity the routines signal_session() and set_signal_session() are
    replaced with task_session_nr() and set_task_session(), especially since they
    are only used with the explicit task->signal dereference.

    Signed-off-by: Pavel Emelianov
    Acked-by: Serge E. Hallyn
    Cc: Kirill Korotaev
    Cc: "Eric W. Biederman"
    Cc: Cedric Le Goater
    Cc: Herbert Poetzl
    Cc: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelianov
     
  • There is separate notifier header, but no separate notifier .c file.

    Extract notifier code out of kernel/sys.c which will remain for
    misc syscalls I hope. Merge kernel/die_notifier.c into kernel/notifier.c.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

19 Oct, 2007

1 commit


01 Oct, 2007

1 commit

  • We need to disable all CPUs other than the boot CPU (usually 0) before
    attempting to power-off modern SMP machines. This fixes the
    hang-on-poweroff issue on my MythTV SMP box, and also on Thomas Gleixner's
    new toybox.

    Signed-off-by: Mark Lord
    Acked-by: Thomas Gleixner
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Lord
     

31 Aug, 2007

1 commit

  • Spotted by Marcin Kowalczyk.

    sys_setpgid(child) fails if the child was forked by sub-thread.

    Fix the "is it our child" check. The previous commit
    ee0acf90d320c29916ba8c5c1b2e908d81f5057d was not complete.

    (this patch asks for the new same_thread_group() helper, but mainline doesn't
    have it yet).

    Signed-off-by: Oleg Nesterov
    Acked-by: Roland McGrath
    Cc:
    Tested-by: "Marcin 'Qrczak' Kowalczyk"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

30 Jul, 2007

1 commit


27 Jul, 2007

1 commit

  • Commit bd804eba1c8597cbb7cd5a5f9fe886aae16a079a ("PM: Introduce
    pm_power_off_prepare") caused problems in the poweroff path, as reported by
    YOSHIFUJI Hideaki / 吉藤英明.

    Generally, sysdev_shutdown() should be called after the ACPI preparation for
    powering the system off. To make it happen, we can separate sysdev_shutdown()
    from device_shutdown() and call it directly wherever necessary.

    Signed-off-by: Rafael J. Wysocki
    Tested-by: YOSHIFUJI Hideaki / 吉藤英明
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

20 Jul, 2007

2 commits

  • This patch changes mm_struct.dumpable to a pair of bit flags.

    set_dumpable() converts the three-value dumpable to two flags and stores them
    in the lower two bits of mm_struct.flags instead of mm_struct.dumpable.
    get_dumpable() behaves in the opposite way.
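
    A user-space sketch of such a conversion. The SKETCH_* bit names, the
    sketch_* function names and the exact encoding (value 2 sets both bits)
    are assumptions made for illustration, not a copy of the patch.

```c
/* Assumed layout: bit 0 = dumpable, bit 1 = dump securely. */
#define SKETCH_DUMPABLE       (1ul << 0)
#define SKETCH_DUMP_SECURELY  (1ul << 1)

/* Store the three-value setting (0, 1 or 2) in the two low bits of *flags. */
static void sketch_set_dumpable(unsigned long *flags, int value)
{
    *flags &= ~(SKETCH_DUMPABLE | SKETCH_DUMP_SECURELY);
    switch (value) {
    case 1:
        *flags |= SKETCH_DUMPABLE;
        break;
    case 2:
        *flags |= SKETCH_DUMPABLE | SKETCH_DUMP_SECURELY;
        break;
    default:    /* 0: leave both bits clear */
        break;
    }
}

/* Recover the three-value setting from the two low bits. */
static int sketch_get_dumpable(unsigned long flags)
{
    unsigned long bits = flags & (SKETCH_DUMPABLE | SKETCH_DUMP_SECURELY);

    return bits >= 2 ? 2 : (int)bits;
}
```

    The other bits of the flags word are left untouched, so it can keep
    carrying unrelated flags.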

    [akpm@linux-foundation.org: export set_dumpable]
    Signed-off-by: Hidehiro Kawai
    Cc: Alan Cox
    Cc: David Howells
    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kawai, Hidehiro
     
  • Introduce the pm_power_off_prepare() callback that can be registered by the
    interested platforms in analogy with pm_idle() and pm_power_off(), used for
    preparing the system to power off (needed by ACPI).

    This allows us to drop acpi_sysclass and device_acpi that are only defined in
    order to register the ACPI power off preparation callback, which is needed by
    pm_power_off() registered in a much different way.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

18 Jul, 2007

2 commits

  • Rather than using a tri-state integer for the wait flag in
    call_usermodehelper_exec, define a proper enum, and use that. I've
    preserved the integer values so that any callers I've missed should
    still work OK.

    Signed-off-by: Jeremy Fitzhardinge
    Cc: James Bottomley
    Cc: Randy Dunlap
    Cc: Christoph Hellwig
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Cc: Johannes Berg
    Cc: Ralf Baechle
    Cc: Bjorn Helgaas
    Cc: Joel Becker
    Cc: Tony Luck
    Cc: Kay Sievers
    Cc: Srivatsa Vaddagiri
    Cc: Oleg Nesterov
    Cc: David Howells

    Jeremy Fitzhardinge
     
  • Various pieces of code around the kernel want to be able to trigger an
    orderly poweroff. This pulls them together into a single
    implementation.

    By default the poweroff command is /sbin/poweroff, but it can be set
    via sysctl: kernel/poweroff_cmd. This is split at whitespace, so it
    can include command-line arguments.
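
    Splitting a command string at whitespace can be sketched in user space as
    below. split_cmd() is a hypothetical helper written for this example; the
    kernel uses its own routine, which is not reproduced here.

```c
#include <string.h>

/*
 * Split cmd in place at spaces, tabs and newlines. Fills argv with up to
 * max_args - 1 pointers into cmd plus a terminating NULL, and returns the
 * number of arguments stored. max_args must be at least 1.
 */
static int split_cmd(char *cmd, char *argv[], int max_args)
{
    int argc = 0;
    char *tok = strtok(cmd, " \t\n");

    while (tok != NULL && argc < max_args - 1) {
        argv[argc++] = tok;
        tok = strtok(NULL, " \t\n");
    }
    argv[argc] = NULL;
    return argc;
}
```

    For a writable buffer holding "/sbin/shutdown -h now" this yields three
    arguments, with argv[0] being the program path.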

    This patch replaces four other instances of invoking either "poweroff"
    or "shutdown -h now": two sbus drivers, and acpi thermal
    management.

    sparc64 has its own "powerd"; still need to determine whether it should
    be replaced by orderly_poweroff().

    Signed-off-by: Jeremy Fitzhardinge
    Acked-by: Len Brown
    Signed-off-by: Chris Wright
    Cc: Andrew Morton
    Cc: Randy Dunlap
    Cc: Andi Kleen
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: David S. Miller

    Jeremy Fitzhardinge
     

17 Jul, 2007

1 commit

  • This reduces the memory footprint and it enforces that only the current
    task can enable seccomp on itself (this is a requirement for a
    straightforward [modulo preempt ;) ] TIF_NOTSC implementation).

    Signed-off-by: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli