27 May, 2009

1 commit


31 Mar, 2009

1 commit

  • There seems to be a common pattern in the kernel where drivers want to
    call request_module() from inside a module_init() function. Currently
    this would deadlock.

    As a result, several drivers go through hoops like scheduling things via
    kevent, or creating custom work queues (because kevent can deadlock on them).

    This patch changes this to use a request_module_nowait() function macro instead,
    which just fires the modprobe off but doesn't wait for it, and thus avoids the
    original deadlock entirely.

    On my laptop this already results in one less kernel thread running.

    (Includes Jiri's patch to use enum umh_wait)

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Rusty Russell (bool-ified)
    Cc: Jiri Slaby

    Arjan van de Ven
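
The wait/nowait split described in this commit can be sketched in user space as two thin macros over one function that takes a flag. Everything named demo_* is hypothetical and only records the request; it does not spawn modprobe, and it is not the kernel's implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the module-request backend: it records what was
 * asked for instead of running modprobe. */
static char last_module[64];
static bool last_wait;

static int demo_request_module(bool wait, const char *fmt, ...)
{
    va_list args;
    int n;

    assert(fmt != NULL);
    va_start(args, fmt);
    n = vsnprintf(last_module, sizeof(last_module), fmt, args);
    va_end(args);
    if (n < 0 || (size_t)n >= sizeof(last_module))
        return -1;              /* formatted module name did not fit */

    last_wait = wait;           /* true: caller would block until modprobe exits */
    return 0;
}

/* Two wrappers that differ only in the wait flag (C99 variadic macros). */
#define demo_request_module_wait(...)   demo_request_module(true, __VA_ARGS__)
#define demo_request_module_nowait(...) demo_request_module(false, __VA_ARGS__)
```

A caller in an init path would use demo_request_module_nowait("snd-%s", "dummy"), which returns as soon as the request is recorded rather than waiting on the helper.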
     

30 Mar, 2009

1 commit

  • Impact: cleanup

    (Thanks to Al Viro for reminding me of this, via Ingo)

    CPU_MASK_ALL is the (deprecated) "all bits set" cpumask, defined like so:

    #define CPU_MASK_ALL (cpumask_t) { { ... } }

    Taking the address of such a temporary is questionable at best.
    Unfortunately, 321a8e9d (cpumask: add CPU_MASK_ALL_PTR macro) added
    CPU_MASK_ALL_PTR:

    #define CPU_MASK_ALL_PTR (&CPU_MASK_ALL)

    This formalizes the practice. One day gcc could bite us over this
    usage (though we seem to have gotten away with it so far).

    So replace every use of &CPU_MASK_ALL or CPU_MASK_ALL_PTR
    with the modern "cpu_all_mask" (a real const struct cpumask *).

    Signed-off-by: Rusty Russell
    Acked-by: Ingo Molnar
    Reported-by: Al Viro
    Cc: Mike Travis

    Rusty Russell
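
The difference between the two styles can be sketched in plain C. The demo_* names and the two-word mask are hypothetical; the point is only that the macro produces a fresh unnamed object at each use, while the replacement is a pointer to one named constant object with static storage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-word mask standing in for cpumask_t. */
struct demo_cpumask {
    unsigned long bits[2];
};

/* Old style: each expansion is an unnamed compound literal. Taking its
 * address ties the pointer to that temporary object's lifetime. */
#define DEMO_MASK_ALL ((struct demo_cpumask){ { ~0UL, ~0UL } })

/* New style: one named constant object, reachable through a stable pointer. */
static const struct demo_cpumask demo_all_bits = { { ~0UL, ~0UL } };
static const struct demo_cpumask *const demo_all_mask = &demo_all_bits;

/* Returns 1 when every bit in the mask is set. */
static int demo_mask_full(const struct demo_cpumask *m)
{
    assert(m != NULL);
    return m->bits[0] == ~0UL && m->bits[1] == ~0UL;
}
```

Callers that previously wrote &DEMO_MASK_ALL would pass demo_all_mask instead, so no address of a temporary is ever taken.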
     

07 Jan, 2009

1 commit

  • Fix varargs kernel-doc format in kmod.c:
    Use @... instead of @varargs.

    Warning(kernel/kmod.c:67): Excess function parameter or struct member 'varargs' description in 'request_module'

    Signed-off-by: Randy Dunlap
    Acked-by: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

14 Nov, 2008

2 commits

  • Inaugurate copy-on-write credentials management. This uses RCU to manage the
    credentials pointer in the task_struct with respect to accesses by other tasks.
    A process may only modify its own credentials, and so does not need locking to
    access or modify its own credentials.

    A mutex (cred_replace_mutex) is added to the task_struct to control the effect
    of PTRACE_ATTACHED on credential calculations, particularly with respect to
    execve().

    With this patch, the contents of an active credentials struct may not be
    changed directly; rather a new set of credentials must be prepared, modified
    and committed using something like the following sequence of events:

    struct cred *new = prepare_creds();
    int ret = blah(new);
    if (ret < 0) {
    abort_creds(new);
    return ret;
    }
    return commit_creds(new);

    There are some exceptions to this rule: the keyrings pointed to by the active
    credentials may be instantiated - keyrings violate the COW rule as managing
    COW keyrings is tricky, given that it is possible for a task to directly alter
    the keys in a keyring in use by another task.

    To help enforce this, various pointers to sets of credentials, such as those in
    the task_struct, are declared const. The purpose of this is compile-time
    discouragement of altering credentials through those pointers. Once a set of
    credentials has been made public through one of these pointers, it may not be
    modified, except under special circumstances:

    (1) Its reference count may be incremented and decremented.

    (2) The keyrings to which it points may be modified, but not replaced.

    The only safe way to modify anything else is to create a replacement and commit
    using the functions described in Documentation/credentials.txt (which will be
    added by a later patch).

    This patch and the preceding patches have been tested with the LTP SELinux
    testsuite.

    This patch makes several logical sets of alteration:

    (1) execve().

    This now prepares and commits credentials in various places in the
    security code rather than altering the current creds directly.

    (2) Temporary credential overrides.

    do_coredump() and sys_faccessat() now prepare their own credentials and
    temporarily override the ones currently on the acting thread, whilst
    preventing interference from other threads by holding cred_replace_mutex
    on the thread being dumped.

    This will be replaced in a future patch by something that hands down the
    credentials directly to the functions being called, rather than altering
    the task's objective credentials.

    (3) LSM interface.

    A number of functions have been changed, added or removed:

    (*) security_capset_check(), ->capset_check()
    (*) security_capset_set(), ->capset_set()

    Removed in favour of security_capset().

    (*) security_capset(), ->capset()

    New. This is passed a pointer to the new creds, a pointer to the old
    creds and the proposed capability sets. It should fill in the new
    creds or return an error. All pointers, barring the pointer to the
    new creds, are now const.

    (*) security_bprm_apply_creds(), ->bprm_apply_creds()

    Changed; now returns a value, which will cause the process to be
    killed if it's an error.

    (*) security_task_alloc(), ->task_alloc_security()

    Removed in favour of security_prepare_creds().

    (*) security_cred_free(), ->cred_free()

    New. Free security data attached to cred->security.

    (*) security_prepare_creds(), ->cred_prepare()

    New. Duplicate any security data attached to cred->security.

    (*) security_commit_creds(), ->cred_commit()

    New. Apply any security effects for the upcoming installation of new
    security by commit_creds().

    (*) security_task_post_setuid(), ->task_post_setuid()

    Removed in favour of security_task_fix_setuid().

    (*) security_task_fix_setuid(), ->task_fix_setuid()

    Fix up the proposed new credentials for setuid(). This is used by
    cap_set_fix_setuid() to implicitly adjust capabilities in line with
    setuid() changes. Changes are made to the new credentials, rather
    than the task itself as in security_task_post_setuid().

    (*) security_task_reparent_to_init(), ->task_reparent_to_init()

    Removed. Instead the task being reparented to init is referred
    directly to init's credentials.

    NOTE! This results in the loss of some state: SELinux's osid no
    longer records the sid of the thread that forked it.

    (*) security_key_alloc(), ->key_alloc()
    (*) security_key_permission(), ->key_permission()

    Changed. These now take cred pointers rather than task pointers to
    refer to the security context.

    (4) sys_capset().

    This has been simplified and uses less locking. The LSM functions it
    calls have been merged.

    (5) reparent_to_kthreadd().

    This gives the current thread the same credentials as init by simply using
    commit_thread() to point that way.

    (6) __sigqueue_alloc() and switch_uid()

    __sigqueue_alloc() can't stop the target task from changing its creds
    beneath it, so this function gets a reference to the currently applicable
    user_struct which it then passes into the sigqueue struct it returns if
    successful.

    switch_uid() is now called from commit_creds(), and possibly should be
    folded into that. commit_creds() should take care of protecting
    __sigqueue_alloc().

    (7) [sg]et[ug]id() and co and [sg]et_current_groups.

    The set functions now all use prepare_creds(), commit_creds() and
    abort_creds() to build and check a new set of credentials before applying
    it.

    security_task_set[ug]id() is called inside the prepared section. This
    guarantees that nothing else will affect the creds until we've finished.

    The calling of set_dumpable() has been moved into commit_creds().

    Much of the functionality of set_user() has been moved into
    commit_creds().

    The get functions all simply access the data directly.

    (8) security_task_prctl() and cap_task_prctl().

    security_task_prctl() has been modified to return -ENOSYS if it doesn't
    want to handle a function, or otherwise return the return value directly
    rather than through an argument.

    Additionally, cap_task_prctl() now prepares a new set of credentials, even
    if it doesn't end up using it.

    (9) Keyrings.

    A number of changes have been made to the keyrings code:

    (a) switch_uid_keyring(), copy_keys(), exit_keys() and suid_keys() have
    all been dropped and built in to the credentials functions directly.
    They may want separating out again later.

    (b) key_alloc() and search_process_keyrings() now take a cred pointer
    rather than a task pointer to specify the security context.

    (c) copy_creds() gives a new thread within the same thread group a new
    thread keyring if its parent had one, otherwise it discards the thread
    keyring.

    (d) The authorisation key now points directly to the credentials to extend
    the search into rather than pointing to the task that carries them.

    (e) Installing thread, process or session keyrings causes a new set of
    credentials to be created, even though it's not strictly necessary for
    process or session keyrings (they're shared).

    (10) Usermode helper.

    The usermode helper code now carries a cred struct pointer in its
    subprocess_info struct instead of a new session keyring pointer. This set
    of credentials is derived from init_cred and installed on the new process
    after it has been cloned.

    call_usermodehelper_setup() allocates the new credentials and
    call_usermodehelper_freeinfo() discards them if they haven't been used. A
    special cred function (prepare_usermodeinfo_creds()) is provided
    specifically for call_usermodehelper_setup() to call.

    call_usermodehelper_setkeys() adjusts the credentials to sport the
    supplied keyring as the new session keyring.

    (11) SELinux.

    SELinux has a number of changes, in addition to those to support the LSM
    interface changes mentioned above:

    (a) selinux_setprocattr() no longer does its check for whether the
    current ptracer can access processes with the new SID inside the lock
    that covers getting the ptracer's SID. Whilst this lock ensures that
    the check is done with the ptracer pinned, the result is only valid
    until the lock is released, so there's no point doing it inside the
    lock.

    (12) is_single_threaded().

    This function has been extracted from selinux_setprocattr() and put into
    a file of its own in the lib/ directory as join_session_keyring() now
    wants to use it too.

    The code in SELinux just checked to see whether a task shared mm_structs
    with other tasks (CLONE_VM), but that isn't good enough. We really want
    to know if they're part of the same thread group (CLONE_THREAD).

    (13) nfsd.

    The NFS server daemon now has to use the COW credentials to set the
    credentials it is going to use. It really needs to pass the credentials
    down to the functions it calls, but it can't do that until other patches
    in this series have been applied.

    Signed-off-by: David Howells
    Acked-by: James Morris
    Signed-off-by: James Morris

    David Howells
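
The prepare/modify/commit-or-abort sequence quoted in this commit can be sketched as a user-space toy. All demo_* names are hypothetical, the uid limit is an invented validity check, and there is no RCU or locking here; the sketch only shows that the published object is never edited in place.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_cred {
    int refcount;
    unsigned int uid;
    unsigned int gid;
};

static struct demo_cred demo_init_cred = { 1, 0, 0 };
/* The published pointer is const: readers must not modify through it. */
static const struct demo_cred *demo_current_cred = &demo_init_cred;

/* Make a private, modifiable copy of the current credentials. */
static struct demo_cred *demo_prepare_creds(void)
{
    struct demo_cred *new = malloc(sizeof(*new));

    if (!new)
        return NULL;
    *new = *demo_current_cred;
    new->refcount = 1;
    return new;
}

/* Drop a reference; the static initial set is never freed. */
static void demo_put_cred(const struct demo_cred *cred)
{
    struct demo_cred *c = (struct demo_cred *)cred;

    if (c != &demo_init_cred && --c->refcount == 0)
        free(c);
}

/* Throw away a prepared copy that was never published. */
static void demo_abort_creds(struct demo_cred *new)
{
    demo_put_cred(new);
}

/* Publish the new set and release the old one. */
static int demo_commit_creds(struct demo_cred *new)
{
    const struct demo_cred *old = demo_current_cred;

    assert(new != NULL);
    demo_current_cred = new;
    demo_put_cred(old);
    return 0;
}

/* A setter built on the sequence: prepare, check, then commit or abort. */
static int demo_setuid(unsigned int uid)
{
    struct demo_cred *new = demo_prepare_creds();

    if (!new)
        return -1;
    if (uid > 65535) {          /* hypothetical validity check */
        demo_abort_creds(new);
        return -2;
    }
    new->uid = uid;
    return demo_commit_creds(new);
}
```

A failed demo_setuid() leaves demo_current_cred untouched, which is the property the copy-on-write rule is meant to guarantee.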
     
  • Alter the use of the key instantiation and negation functions' link-to-keyring
    arguments. Currently this specifies a keyring in the target process to link
    the key into, creating the keyring if it doesn't exist. This, however, can be
    a problem for copy-on-write credentials as it means that the instantiating
    process can alter the credentials of the requesting process.

    This patch alters the behaviour such that:

    (1) If keyctl_instantiate_key() or keyctl_negate_key() are given a specific
    keyring by ID (ringid >= 0), then that keyring will be used.

    (2) If keyctl_instantiate_key() or keyctl_negate_key() are given one of the
    special constants that refer to the requesting process's keyrings
    (KEY_SPEC_*_KEYRING, all [...]

    [Diagram: a row of boxes, the last two labelled "Instantiator" and joined
    by an arrow, with request_key() marked twice beneath the row.]

    This might be useful, for example, in Kerberos, where the requestor requests a
    ticket, and then the ticket instantiator requests the TGT, which someone else
    then has to go and fetch. The TGT, however, should be retained in the
    keyrings of the requestor, not the first instantiator. To make this explicit
    an extra special keyring constant is also added.

    Signed-off-by: David Howells
    Reviewed-by: James Morris
    Signed-off-by: James Morris

    David Howells
     

17 Oct, 2008

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
    module: remove CONFIG_KMOD in comment after #endif
    remove CONFIG_KMOD from fs
    remove CONFIG_KMOD from drivers

    Manually fix conflict due to include cleanups in drivers/md/md.c

    Linus Torvalds
     
  • We currently use a PM notifier to disable user mode helpers before suspend
    and hibernation and to re-enable them during resume. However, this is not
    an ideal solution, because if any drivers want to upload firmware into
    memory before suspend, they have to use a PM notifier for this purpose and
    there is no guarantee that the ordering of PM notifiers will be as
    expected (ie. the notifier that disables user mode helpers has to be run
    after the driver's notifier used for uploading the firmware).

    For this reason, it seems better to move the disabling and enabling of
    user mode helpers to separate functions that will be called by the PM core
    as necessary.

    [akpm@linux-foundation.org: remove unneeded ifdefs]
    Signed-off-by: Rafael J. Wysocki
    Cc: Alan Stern
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

16 Oct, 2008

1 commit


26 Jul, 2008

1 commit

  • Presently call_usermodehelper_setup() uses GFP_ATOMIC, but with that flag
    it can return NULL _very_ easily.

    GFP_ATOMIC is needed only when we can't sleep; otherwise GFP_KERNEL is
    more robust and the better choice.

    Thus, add a gfp_mask argument to call_usermodehelper_setup().

    Its callers pass the gfp_t as follows:

    call_usermodehelper() and call_usermodehelper_keys():
    depends on the 'wait' argument.
    call_usermodehelper_pipe():
    always GFP_KERNEL, because it always runs in process context.
    orderly_poweroff():
    passes GFP_ATOMIC, because it may run in interrupt context.

    Signed-off-by: KOSAKI Motohiro
    Cc: "Paul Menage"
    Reviewed-by: Li Zefan
    Acked-by: Jeremy Fitzhardinge
    Cc: Rusty Russell
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
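
One way a caller could derive the allocation mode from its 'wait' argument is sketched below. The enum values and the mapping are assumptions made for illustration (a caller that will not wait is assumed to be unable to sleep); the commit only says the choice depends on 'wait'.

```c
#include <assert.h>

/* Hypothetical mirror of the tri-state wait flag; values are assumed. */
enum demo_umh_wait {
    DEMO_UMH_NO_WAIT = -1,      /* do not wait at all */
    DEMO_UMH_WAIT_EXEC = 0,     /* wait until the helper has been started */
    DEMO_UMH_WAIT_PROC = 1      /* wait until the helper has exited */
};

/* Hypothetical stand-ins for the two allocation modes. */
enum demo_gfp {
    DEMO_GFP_ATOMIC,            /* must not sleep, may fail easily */
    DEMO_GFP_KERNEL             /* may sleep, more robust */
};

/* Assumed mapping: only the no-wait case falls back to atomic allocation. */
static enum demo_gfp demo_gfp_for_wait(enum demo_umh_wait wait)
{
    assert(wait >= DEMO_UMH_NO_WAIT && wait <= DEMO_UMH_WAIT_PROC);
    return wait == DEMO_UMH_NO_WAIT ? DEMO_GFP_ATOMIC : DEMO_GFP_KERNEL;
}
```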
     

25 Jul, 2008

1 commit

  • This patch adds O_NONBLOCK support to pipe2. It is minimally more involved
    than the patches for eventfd et al. but still trivial. The interfaces of the
    create_write_pipe and create_read_pipe helper functions were changed, along
    with their one other caller.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    #ifndef __NR_pipe2
    # ifdef __x86_64__
    # define __NR_pipe2 293
    # elif defined __i386__
    # define __NR_pipe2 331
    # else
    # error "need __NR_pipe2"
    # endif
    #endif

    int
    main (void)
    {
      int fds[2];
      if (syscall (__NR_pipe2, fds, 0) == -1)
        {
          puts ("pipe2(0) failed");
          return 1;
        }
      for (int i = 0; i < 2; ++i)
        {
          int fl = fcntl (fds[i], F_GETFL);
          if (fl == -1)
            {
              puts ("fcntl failed");
              return 1;
            }
          if (fl & O_NONBLOCK)
            {
              printf ("pipe2(0) set non-blocking mode for fds[%d]\n", i);
              return 1;
            }
          close (fds[i]);
        }

      if (syscall (__NR_pipe2, fds, O_NONBLOCK) == -1)
        {
          puts ("pipe2(O_NONBLOCK) failed");
          return 1;
        }
      for (int i = 0; i < 2; ++i)
        {
          int fl = fcntl (fds[i], F_GETFL);
          if (fl == -1)
            {
              puts ("fcntl failed");
              return 1;
            }
          if ((fl & O_NONBLOCK) == 0)
            {
              printf ("pipe2(O_NONBLOCK) does not set non-blocking mode for fds[%d]\n", i);
              return 1;
            }
          close (fds[i]);
        }

      puts ("OK");

      return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     

22 Jul, 2008

1 commit


02 May, 2008

1 commit


20 Apr, 2008

1 commit

  • * Use the new set_cpus_allowed_ptr() function added by the previous patch,
    which takes the "newly allowed cpus" cpumask_t argument by pointer
    instead of by value:

    -int set_cpus_allowed(struct task_struct *p, cpumask_t new_mask)
    +int set_cpus_allowed_ptr(struct task_struct *p, const cpumask_t *new_mask)

    * Modify CPU_MASK_ALL

    Depends on:
    [sched-devel]: sched: add new set_cpus_allowed_ptr function

    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
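
The motivation for the pointer variant can be sketched with a deliberately large mask. The 512-byte size (4096 CPUs) and all demo_* names are hypothetical; the sketch only contrasts copying the whole mask as an argument with passing its address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MASK_BYTES 512     /* hypothetical: 4096 CPUs, 8 per byte */

struct demo_cpumask {
    unsigned char bits[DEMO_MASK_BYTES];
};

struct demo_task {
    struct demo_cpumask cpus_allowed;
};

/* By value: every call copies all 512 bytes into the argument. */
static int demo_set_cpus_allowed(struct demo_task *p, struct demo_cpumask new_mask)
{
    assert(p != NULL);
    p->cpus_allowed = new_mask;
    return 0;
}

/* By pointer: only an address is passed; one copy into the task remains. */
static int demo_set_cpus_allowed_ptr(struct demo_task *p,
                                     const struct demo_cpumask *new_mask)
{
    assert(p != NULL && new_mask != NULL);
    p->cpus_allowed = *new_mask;
    return 0;
}

/* Returns 1 when both masks hold the same bits. */
static int demo_mask_equal(const struct demo_cpumask *a,
                           const struct demo_cpumask *b)
{
    return memcmp(a->bits, b->bits, DEMO_MASK_BYTES) == 0;
}
```

Both functions leave the task in the same state; the pointer form simply avoids placing a second full-size mask on the stack for each call.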
     

15 Feb, 2008

1 commit

  • This test seems to be unnecessary since we always have rootfs mounted before
    calling a usermodehelper.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Jan Blunck
    Acked-by: Christoph Hellwig
    Acked-by: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
     

18 Jan, 2008

1 commit


12 Sep, 2007

1 commit

  • The semantics of call_usermodehelper_pipe() used to be that it would fork
    the helper, and wait for the kernel thread to be started. This was
    implemented by setting sub_info.wait to 0 (implicitly), and doing a
    wait_for_completion().

    As part of the cleanup done in 0ab4dc92278a0f3816e486d6350c6652a72e06c8,
    call_usermodehelper_pipe() was changed to pass 1 as the value for wait to
    call_usermodehelper_exec().

    This is equivalent to setting sub_info.wait to 1, which is a change from
    the previous behaviour. Using 1 instead of 0 causes
    __call_usermodehelper() to start the kernel thread running
    wait_for_helper(), rather than directly calling ____call_usermodehelper().

    The end result is that the calling kernel code blocks until the user mode
    helper finishes. As the helper is expecting input on stdin, and now no one
    is writing anything, everything locks up (observed in do_coredump).

    The fix is to change the 1 to UMH_WAIT_EXEC (aka 0), indicating that we
    want to wait for the kernel thread to be started, but not for the helper to
    finish.

    Signed-off-by: Michael Ellerman
    Acked-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Ellerman
     

27 Jul, 2007

1 commit

  • Fix kmod.c:
    Warning(linux-2.6.23-rc1//kernel/kmod.c:364): No description found for parameter 'envp'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

20 Jul, 2007

2 commits

  • At present, if a user mode helper is running while
    usermodehelper_pm_callback() is executed, the helper may be frozen and the
    completion in call_usermodehelper_exec() won't be completed until user
    space processes are thawed. As a result, the freezing of kernel threads
    may fail, which is not desirable.

    Prevent this from happening by introducing a counter of running user mode
    helpers and allowing usermodehelper_pm_callback() to succeed for action =
    PM_HIBERNATION_PREPARE or action = PM_SUSPEND_PREPARE only if there are no
    helpers running. [Namely, usermodehelper_pm_callback() waits for at most
    RUNNING_HELPERS_TIMEOUT for the number of running helpers to become zero
    and fails if that doesn't happen.]

    Special thanks to Uli Luckas, Pavel Machek and Oleg Nesterov for
    reviewing the previous versions of this patch and for very useful
    comments.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Uli Luckas
    Acked-by: Nigel Cunningham
    Acked-by: Pavel Machek
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
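
The counter scheme described in this commit can be sketched single-threaded. The demo_* names and the poll count are hypothetical, and the loop stands in for a timed wait: real code would sleep between checks so that running helpers get a chance to finish.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define DEMO_RUNNING_HELPERS_POLLS 5    /* hypothetical stand-in for the timeout */

static int demo_running_helpers;
static bool demo_helpers_disabled;

/* Called before a helper is started; refused once helpers are disabled. */
static int demo_helper_start(void)
{
    if (demo_helpers_disabled)
        return -EBUSY;
    demo_running_helpers++;
    return 0;
}

/* Called when a helper has finished. */
static void demo_helper_end(void)
{
    assert(demo_running_helpers > 0);
    demo_running_helpers--;
}

/* Suspend side: block new helpers, then require the counter to reach zero. */
static int demo_helpers_disable(void)
{
    int polls;

    demo_helpers_disabled = true;
    for (polls = 0; polls < DEMO_RUNNING_HELPERS_POLLS; polls++) {
        if (demo_running_helpers == 0)
            return 0;
        /* a real implementation would sleep here */
    }
    demo_helpers_disabled = false;      /* give up and allow helpers again */
    return -EAGAIN;
}

/* Resume side: allow helpers to start again. */
static void demo_helpers_enable(void)
{
    demo_helpers_disabled = false;
}
```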
     
  • Use a hibernation and suspend notifier to disable the user mode helper before
    a hibernation/suspend and enable it after the operation.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Acked-by: Nigel Cunningham
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

18 Jul, 2007

2 commits

  • Rather than using a tri-state integer for the wait flag in
    call_usermodehelper_exec, define a proper enum, and use that. I've
    preserved the integer values so that any callers I've missed should
    still work OK.

    Signed-off-by: Jeremy Fitzhardinge
    Cc: James Bottomley
    Cc: Randy Dunlap
    Cc: Christoph Hellwig
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Cc: Johannes Berg
    Cc: Ralf Baechle
    Cc: Bjorn Helgaas
    Cc: Joel Becker
    Cc: Tony Luck
    Cc: Kay Sievers
    Cc: Srivatsa Vaddagiri
    Cc: Oleg Nesterov
    Cc: David Howells

    Jeremy Fitzhardinge
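
A sketch of how an enum can replace the tri-state integer while keeping old callers working. The demo_* names are hypothetical and the values -1, 0 and 1 are assumed to be the preserved integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the wait enum; values match the old integers. */
enum demo_umh_wait {
    DEMO_UMH_NO_WAIT = -1,      /* return immediately */
    DEMO_UMH_WAIT_EXEC = 0,     /* wait until the helper has been started */
    DEMO_UMH_WAIT_PROC = 1      /* wait until the helper has exited */
};

/* Converts a legacy integer argument; valid because values were preserved. */
static enum demo_umh_wait demo_wait_from_legacy(int value)
{
    assert(value >= -1 && value <= 1);
    return (enum demo_umh_wait)value;
}

/* True when the caller blocks at least until the helper is started. */
static bool demo_blocks_until_exec(enum demo_umh_wait wait)
{
    return wait != DEMO_UMH_NO_WAIT;
}

/* True when the caller blocks until the helper process has finished. */
static bool demo_blocks_until_exit(enum demo_umh_wait wait)
{
    return wait == DEMO_UMH_WAIT_PROC;
}
```

A caller that feeds the helper through a pipe must not pick the last mode: it would block waiting for an exit that cannot happen until it writes the data.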
     
  • Rather than having hundreds of variations of call_usermodehelper for
    various pieces of usermode state which could be set up, split the
    info allocation and initialization from the actual process execution.

    This means the general pattern becomes:
    info = call_usermodehelper_setup(path, argv, envp); /* basic state */
    call_usermodehelper_*(info, stuff...); /* extra state */
    call_usermodehelper_exec(info, wait); /* run process and free info */

    This patch introduces wrappers for all the existing calling styles for
    call_usermodehelper_*, but folds their implementations into one.

    Signed-off-by: Jeremy Fitzhardinge
    Cc: Andi Kleen
    Cc: Rusty Russell
    Cc: David Howells
    Cc: Björn Steinbrink
    Cc: Randy Dunlap

    Jeremy Fitzhardinge
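
The setup / extra state / exec pattern can be sketched in user space. The demo_* names and the stdin_fd field are hypothetical, and demo_exec() only validates and frees the info; it does not start a process.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified counterpart of the per-call info structure. */
struct demo_subprocess_info {
    const char *path;
    char **argv;
    char **envp;
    int stdin_fd;               /* optional extra state, -1 when unset */
};

/* Step 1: allocate the info and record the basic state. */
static struct demo_subprocess_info *demo_setup(const char *path,
                                               char **argv, char **envp)
{
    struct demo_subprocess_info *info = calloc(1, sizeof(*info));

    if (!info)
        return NULL;
    info->path = path;
    info->argv = argv;
    info->envp = envp;
    info->stdin_fd = -1;
    return info;
}

/* Step 2 (optional): attach one piece of extra state. */
static void demo_set_stdin(struct demo_subprocess_info *info, int fd)
{
    assert(info != NULL);
    info->stdin_fd = fd;
}

/* Step 3: "run" the helper and free the info in every case. */
static int demo_exec(struct demo_subprocess_info *info, int wait)
{
    int ret;

    assert(info != NULL);
    (void)wait;                 /* wait mode is not modelled in this sketch */
    ret = (info->path && info->path[0]) ? 0 : -ENOENT;
    free(info);
    return ret;
}
```

Because demo_exec() always frees the info, callers never touch the pointer after step 3, regardless of which optional setters they used in step 2.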
     

10 May, 2007

2 commits


09 May, 2007

2 commits

  • Fix the priority greediness of kevent's children. Such tasks were always
    scheduled at nice level -5 and, at that time, udev stole CPU time from us
    at -5.

    Already posted at http://lkml.org/lkml/2005/1/10/85

    [akpm@linux-foundation.org: add comment]
    Signed-off-by: Jan Engelhardt
    Cc: Chris Wright
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Engelhardt
     
  • Remove includes of where it is not used/needed.
    Suggested by Al Viro.

    Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
    sparc64, and arm (all 59 defconfigs).

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

24 Feb, 2007

2 commits


17 Feb, 2007

1 commit

  • On recent systems, calls to /sbin/modprobe are handled by udev depending
    on the kind of device the kernel has discovered. This patch creates a
    uevent for the kernel's internal request_module(), to let udev take control
    of the request instead of having the kernel fork the binary directly.
    The direct execution of /sbin/modprobe can be disabled by setting:
    /sys/module/kmod/mod_request_helper (/proc/sys/kernel/modprobe)
    to an empty string, the same way /proc/sys/kernel/hotplug is disabled on a
    udev system.

    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     

13 Feb, 2007

1 commit

  • When a machine check event is detected (including an AMD RevF threshold
    overflow event), allow a "trigger" program to be run. This allows user space
    to react to such events sooner.

    The trigger is configured using a new trigger entry in the
    machinecheck sysfs interface. It is currently shared between
    all CPUs.

    I also fixed the AMD threshold handler to run the machine
    check polling code immediately to actually log any events
    that might have caused the threshold interrupt.

    Also added some documentation for the mce sysfs interface.

    Signed-off-by: Andi Kleen

    Andi Kleen
     

09 Dec, 2006

1 commit

  • Rename 'struct namespace' to 'struct mnt_namespace' to avoid confusion with
    other namespaces being developed for containers: pid, uts, ipc, etc.
    'namespace' variables and attributes are also renamed to 'mnt_ns'.

    Signed-off-by: Kirill Korotaev
    Signed-off-by: Cedric Le Goater
    Cc: Eric W. Biederman
    Cc: Herbert Poetzl
    Cc: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill Korotaev
     

05 Dec, 2006

1 commit


29 Nov, 2006

1 commit


22 Nov, 2006

1 commit

  • Pass the work_struct pointer to the work function rather than context data.
    The work function can use container_of() to work out the data.

    For the cases where the container of the work_struct may go away the moment the
    pending bit is cleared, it is made possible to defer the release of the
    structure by deferring the clearing of the pending bit.

    To make this work, an extra flag is introduced into the management side of the
    work_struct. This governs auto-release of the structure upon execution.

    Ordinarily, the work queue executor would release the work_struct for further
    scheduling or deallocation by clearing the pending bit prior to jumping to the
    work function. This means that, unless the driver makes some guarantee itself
    that the work_struct won't go away, the work function may not access anything
    else in the work_struct or its container lest they be deallocated. This is a
    problem if the auxiliary data is taken away (as done by the last patch).

    However, if the pending bit is *not* cleared before jumping to the work
    function, then the work function *may* access the work_struct and its container
    with no problems. But then the work function must itself release the
    work_struct by calling work_release().

    In most cases, automatic release is fine, so this is the default. Special
    initiators exist for the non-auto-release case (ending in _NAR).

    Signed-Off-By: David Howells

    David Howells
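
The container_of() approach mentioned in this commit can be sketched in user space. The demo_* names are hypothetical; the macro is the usual offsetof-based form, and the work function receives only the embedded work item.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal work item: just the function to run. */
struct demo_work {
    void (*func)(struct demo_work *work);
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* A driver-like object that embeds its work item. */
struct demo_device {
    int irq_count;
    struct demo_work work;
};

/* The work function gets the work item, not a separate context pointer. */
static void demo_device_work(struct demo_work *work)
{
    struct demo_device *dev =
        demo_container_of(work, struct demo_device, work);

    dev->irq_count++;
}

/* Stand-in for the queue executor: call the function with the item itself. */
static void demo_run_work(struct demo_work *work)
{
    assert(work != NULL && work->func != NULL);
    work->func(work);
}
```

This only works while the containing object is still alive when the function runs, which is the lifetime concern the commit addresses with its release rules.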
     

02 Oct, 2006

1 commit

  • The use of execve() in the kernel is dubious, since it relies on the
    __KERNEL_SYSCALLS__ mechanism that stores the result in a global errno
    variable. As a first step of getting rid of this, change all users to a
    global kernel_execve function that returns a proper error code.

    This function is a terrible hack, and a later patch removes it again after the
    kernel syscalls are gone.

    Signed-off-by: Arnd Bergmann
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Russell King
    Cc: Ian Molton
    Cc: Mikael Starvik
    Cc: David Howells
    Cc: Yoshinori Sato
    Cc: Hirokazu Takata
    Cc: Ralf Baechle
    Cc: Kyle McMartin
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: Paul Mundt
    Cc: Kazumoto Kojima
    Cc: Richard Curnow
    Cc: William Lee Irwin III
    Cc: "David S. Miller"
    Cc: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Cc: Miles Bader
    Cc: Chris Zankel
    Cc: "Luck, Tony"
    Cc: Geert Uytterhoeven
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     

01 Oct, 2006

2 commits

  • Using the infrastructure created in previous patches implement support to
    pipe core dumps into programs.

    This is done by overloading the existing core_pattern sysctl
    with a new syntax:

    |program

    When the first character of the pattern is a '|' the kernel will instead
    treat the rest of the pattern as a command to run. The core dump will be
    written to the standard input of that program instead of to a file.

    This is useful for having automatic core dump analysis without filling up
    disks. The program can do some simple analysis and save only a summary of
    the core dump.

    The core dump process will run with the privileges and in the name space of
    the process that caused the core dump.

    I also increased the core pattern size to 128 bytes so that longer command
    lines fit.

    Most of the changes come from allowing core dumps without seeks. They are
    fairly straightforward though.

    One small incompatibility is that if someone previously had a core pattern
    that started with '|', they will suddenly get new behaviour. I think that's
    unlikely to be a real problem though.

    Additional background:

    > Very nice, do you happen to have a program that can accept this kind of
    > input for crash dumps? I'm guessing that the embedded people will
    > really want this functionality.

    I had a cheesy demo/prototype. Basically it wrote the dump to a file again,
    ran gdb on it to get a backtrace and wrote the summary to a shared directory.
    Then there was a simple CGI script to generate a "top 10" crashes HTML
    listing.

    Unfortunately this still had the disadvantage of needing full disk space for
    a dump, apart from deleting it afterwards (in fact it was worse because holes
    didn't work over the pipe, so a holey address map would require more
    space).

    Fortunately gdb seems to be happy to handle /proc/pid/fd/xxx input pipes as
    cores (at least it worked with zsh's =(cat core) syntax), so it would be
    likely possible to do it without temporary space with a simple wrapper that
    calls it in the right way. I ran out of time before doing that though.

    The demo prototype scripts weren't very good. If there is really interest I
    can dig them out (they are currently on a laptop disk on the desk with the
    laptop itself being in service), but I would recommend to rewrite them for any
    serious application of this and fix the disk space problem.

    Also to be really useful it should probably find a way to automatically fetch
    the debuginfos (I cheated and just installed them in advance). If nobody else
    does it I can probably do the rewrite myself again at some point.

    My hope at some point was that desktops would support it in their builtin
    crash reporters, but at least the KDE people I talked to seemed to be happy
    with their user space only solution.

    Alan sayeth:

    I don't believe that piping as such is necessarily the right model, but
    the ability to intercept and process core dumps from user space is asked
    for by many enterprise users as well. They want to know about, capture,
    analyse and process core dumps, often centrally and in automated form.

    [akpm@osdl.org: loff_t != unsigned long]
    Signed-off-by: Andi Kleen
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
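
The first-character check on the core pattern can be sketched as a small helper. The function name and the example patterns are hypothetical; the sketch only separates a piped command from an ordinary file name pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the command part of a "|program" pattern, or NULL when the
 * pattern is an ordinary core file name. */
static const char *demo_core_pipe_command(const char *pattern)
{
    assert(pattern != NULL);
    if (pattern[0] != '|')
        return NULL;            /* write the dump to a file as before */
    return pattern + 1;         /* everything after '|' is the command */
}
```

For a pattern such as "|/usr/bin/demo-analyzer" the helper returns a pointer to "/usr/bin/demo-analyzer", and for "core.%p" it returns NULL.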
     
  • A new member in the ever growing family of call_usermode* functions is
    born. The new call_usermodehelper_pipe() function allows data to be piped
    to the stdin of the called user mode program and otherwise behaves like the
    normal call_usermodehelper() (except that it always waits for the child to
    finish).

    Signed-off-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

30 Sep, 2006

1 commit

  • If ____call_usermodehelper fails, we're not interested in the child
    process' exit value, but the real error, so let's stop wait_for_helper from
    overwriting it in that case.

    Issue discovered by Benedikt Böhm while working on a Linux-VServer usermode
    helper.

    Signed-off-by: Björn Steinbrink
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Björn Steinbrink
     

17 Sep, 2006

1 commit

  • I think there is a bug in kmod.c: In __call_usermodehelper(), when
    kernel_thread(wait_for_helper, ...) returns success, the sub_info should
    not be used any more, since wait_for_helper() might call complete() at any
    time.

    Normally wait_for_helper() takes a long time to finish, so you will not hit
    the problem in most cases. But if you remove /sbin/modprobe, it becomes
    easier to get an oops in khelper.

    Cc: Matt Helsley
    Cc: Martin Schwidefsky
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kenneth Lee
     

04 Jul, 2006

1 commit

  • lockdep needs to have the waitqueue lock initialized for on-stack waitqueues
    implicitly initialized by DECLARE_COMPLETION(). Annotate on-stack completions
    accordingly.

    Has no effect on non-lockdep kernels.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar