15 Jul, 2017

1 commit

  • If we reach the limit of modprobe_limit threads running the next
    request_module() call will fail. The original reason for adding a kill
    was to do away with possible issues with in old circumstances which would
    create a recursive series of request_module() calls.

    We can do better than just be super aggressive and reject calls once we've
    reached the limit by simply making pending callers wait until the
    threshold has been reduced, and then throttling them in, one by one.

    This throttling enables requests over the kmod concurrent limit to be
    processed once a pending request completes. Only the first item queued up
    to wait is woken up. The assumption here is once a task is woken it will
    have no other option to also kick the queue to check if there are more
    pending tasks -- regardless of whether or not it was successful.

    By throttling and processing only max kmod concurrent tasks we ensure we
    avoid unexpected fatal request_module() calls, and we keep memory
    consumption on module loading to a minimum.

    With x86_64 qemu, with 4 cores, 4 GiB of RAM it takes the following run
    time to run both tests:

    time ./kmod.sh -t 0008
    real 0m16.366s
    user 0m0.883s
    sys 0m8.916s

    time ./kmod.sh -t 0009
    real 0m50.803s
    user 0m0.791s
    sys 0m9.852s

    Link: http://lkml.kernel.org/r/20170628223155.26472-4-mcgrof@kernel.org
    Signed-off-by: Luis R. Rodriguez
    Reviewed-by: Petr Mladek
    Cc: Jessica Yu
    Cc: Shuah Khan
    Cc: Rusty Russell
    Cc: Michal Marek
    Cc: Greg Kroah-Hartman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Luis R. Rodriguez
     

28 Jun, 2017

1 commit

  • When checking if we want to allow a kmod thread to kick off we increment,
    then read to see if we should enable a thread. If we were over the allowed
    limit limit we decrement. Splitting the increment far apart from decrement
    means there could be a time where two increments happen potentially
    giving a false failure on a thread which should have been allowed.

    CPU1 CPU2
    atomic_inc()
    atomic_inc()
    atomic_read()
    atomic_read()
    atomic_dec()
    atomic_dec()

    In this case a read on CPU1 gets the atomic_inc()'s and we could negate
    it from getting a kmod thread. We could try to prevent this with a lock
    or preemption but that is overkill. We can fix by reducing the number of
    atomic operations. We do this by inverting the logic of of the enabler,
    instead of incrementing kmod_concurrent as we get new kmod users, define the
    variable kmod_concurrent_max as the max number of currently allowed kmod
    users and as we get new kmod users just decrement it if its still positive.
    This combines the dec and read in one atomic operation.

    In this case we no longer get the same false failure:

    CPU1 CPU2
    atomic_dec_if_positive()
    atomic_dec_if_positive()
    atomic_inc()
    atomic_inc()

    The number of threads is computed at init, and since the current computation
    of kmod_concurrent includes the thread count we can avoid setting
    kmod_concurrent_max later in boot through an init call by simply sticking to
    50 as the kmod_concurrent_max. The assumption here is a system with modules
    must at least have ~16 MiB of RAM.

    Suggested-by: Petr Mladek
    Suggested-by: Dmitry Torokhov
    Signed-off-by: Luis R. Rodriguez
    Reviewed-by: Petr Mladek
    Signed-off-by: Jessica Yu

    Luis R. Rodriguez
     

02 Mar, 2017

2 commits


19 Jan, 2017

2 commits

  • Some usermode helper applications are defined at kernel build time, while
    others can be changed at runtime. To provide a sane way to filter these, add a
    new kernel option "STATIC_USERMODEHELPER". This option routes all
    call_usermodehelper() calls through this binary, no matter what the caller
    wishes to have called.

    The new binary (by default set to /sbin/usermode-helper, but can be changed
    through the STATIC_USERMODEHELPER_PATH option) can properly filter the
    requested programs to be run by the kernel by looking at the first argument
    that is passed to it. All other options should then be passed onto the proper
    program if so desired.

    To disable all call_usermodehelper() calls by the kernel, set
    STATIC_USERMODEHELPER_PATH to an empty string.

    Thanks to Neil Brown for the idea of this feature.

    Cc: NeilBrown
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • This is in preparation for making it so that usermode helper programs
    can't be changed, if desired, by userspace. We will tackle the mess of
    cleaning up the write-ability of argv and env later, that's going to
    take more work, for much less gain...

    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

25 Dec, 2016

1 commit


23 Oct, 2015

1 commit

  • call_usermodehelper_exec_sync() does fork() + wait() with "unignored"
    SIGCHLD. What we have missed is that this worker thread can have other
    children previously forked by call_usermodehelper_exec_work() without
    UMH_WAIT_PROC. If such a child exits in between it becomes a zombie
    because auto-reaping only works if SIGCHLD is ignored, and nobody can
    reap it (unless/until this worker thread exits too).

    Change the !UMH_WAIT_PROC case to use CLONE_PARENT.

    Note: this is only first step. All PF_KTHREAD tasks, even created by
    kernel_thread() should have ->parent == kthreadd by default.

    Fixes: bb304a5c6fc63d8506c ("kmod: handle UMH_WAIT_PROC from system unbound workqueue")
    Signed-off-by: Oleg Nesterov
    Acked-by: Frederic Weisbecker
    Cc: Rik van Riel
    Cc: Christoph Lameter
    Cc: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

11 Sep, 2015

6 commits

  • The UMH_WAIT_PROC handler runs in its own thread in order to make sure
    that waiting for the exec kernel thread completion won't block other
    usermodehelper queued jobs.

    On older workqueue implementations, worklets couldn't sleep without
    blocking the rest of the queue. But now the workqueue subsystem handles
    that. Khelper still had the older limitation due to its singlethread
    properties but we replaced it to system unbound workqueues.

    Those are affine to the current node and can block up to some number of
    instances.

    They are a good candidate to handle UMH_WAIT_PROC assuming that we have
    enough system unbound workers to handle lots of parallel usermodehelper
    jobs.

    Signed-off-by: Frederic Weisbecker
    Cc: Rik van Riel
    Reviewed-by: Oleg Nesterov
    Cc: Christoph Lameter
    Cc: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • We need to launch the usermodehelper kernel threads with the widest
    affinity and this is partly why we use khelper. This workqueue has
    unbound properties and thus a wide affinity inherited by all its children.

    Now khelper also has special properties that we aren't much interested in:
    ordered and singlethread. There is really no need about ordering as all
    we do is creating kernel threads. This can be done concurrently. And
    singlethread is a useless limitation as well.

    The workqueue engine already proposes generic unbound workqueues that
    don't share these useless properties and handle well parallel jobs.

    The only worrysome specific is their affinity to the node of the current
    CPU. It's fine for creating the usermodehelper kernel threads but those
    inherit this affinity for longer jobs such as requesting modules.

    This patch proposes to use these node affine unbound workqueues assuming
    that a node is sufficient to handle several parallel usermodehelper
    requests.

    Signed-off-by: Frederic Weisbecker
    Cc: Rik van Riel
    Reviewed-by: Oleg Nesterov
    Cc: Christoph Lameter
    Cc: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • There seem to be quite some confusions on the comments, likely due to
    changes that came after them.

    Now since it's very non obvious why we have 3 levels of asynchronous code
    to implement usermodehelpers, it's important to comment in detail the
    reason of this layout.

    Signed-off-by: Frederic Weisbecker
    Cc: Rik van Riel
    Reviewed-by: Oleg Nesterov
    Cc: Christoph Lameter
    Cc: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • Khelper is affine to all CPUs. Now since it creates the
    call_usermodehelper_exec_[a]sync() kernel threads, those inherit the wide
    affinity.

    As such explicitly forcing a wide affinity from those kernel threads
    is like a no-op.

    Just remove it. It's needless and it breaks CPU isolation users who
    rely on workqueue affinity tuning.

    Signed-off-by: Frederic Weisbecker
    Cc: Rik van Riel
    Reviewed-by: Oleg Nesterov
    Cc: Christoph Lameter
    Cc: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • This patchset does a bunch of cleanups and converts khelper to use system
    unbound workqueues. The 3 first patches should be uncontroversial. The
    last 2 patches are debatable.

    Kmod creates kernel threads that perform userspace jobs and we want those
    to have a large affinity in order not to contend busy CPUs. This is
    (partly) why we use khelper which has a wide affinity that the kernel
    threads it create can inherit from. Now khelper is a dedicated workqueue
    that has singlethread properties which we aren't interested in.

    Hence those two debatable changes:

    _ We would like to use generic workqueues. System unbound workqueues are
    a very good candidate but they are not wide affine, only node affine.
    Now probably a node is enough to perform many parallel kmod jobs.

    _ We would like to remove the wait_for_helper kernel thread (UMH_WAIT_PROC
    handler) to use the workqueue. It means that if the workqueue blocks,
    and no other worker can take pending kmod request, we can be screwed.
    Now if we have 512 threads, this should be enough.

    This patch (of 5):

    Underscores on function names aren't much verbose to explain the purpose
    of a function. And kmod has interesting such flavours.

    Lets rename the following functions:

    * __call_usermodehelper -> call_usermodehelper_exec_work
    * ____call_usermodehelper -> call_usermodehelper_exec_async
    * wait_for_helper -> call_usermodehelper_exec_sync

    Signed-off-by: Frederic Weisbecker
    Cc: Rik van Riel
    Reviewed-by: Oleg Nesterov
    Cc: Christoph Lameter
    Cc: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frederic Weisbecker
     
  • If request_module() successfully runs modprobe, but modprobe exits with a
    non-zero status, then the return value from request_module() will be that
    (positive) error status. So the return from request_module can be:

    negative errno
    zero for success
    positive exit code.

    Signed-off-by: NeilBrown
    Cc: Goldwyn Rodrigues
    Cc: Oleg Nesterov
    Cc: Tejun Heo
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    NeilBrown
     

11 Dec, 2014

2 commits

  • Now that we do not call kernel_thread(CLONE_VFORK) from the worker
    thread we can not deadlock if do_execve() in turn triggers another
    call_usermodehelper(), we can remove the kmod_thread_locker code.

    Note: we should probably kill khelper_wq and simply use one of the
    global workqueues, say, system_unbound_wq, this special wq for umh buys
    nothing nowadays.

    Signed-off-by: Oleg Nesterov
    Cc: Martin Schwidefsky
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • After "kernel/kmod: fix use-after-free of the sub_infostructure"
    CLONE_VFORK in __call_usermodehelper() buys nothing, we rely on on
    umh_complete() in ____call_usermodehelper() anyway.

    Remove it. This also eliminates the unnecessary sleep/wakeup in the
    likely case, and this allows the next change.

    While at it, kill the "int wait" locals in ____call_usermodehelper() and
    __call_usermodehelper(), they can safely use sub_info->wait.

    Signed-off-by: Oleg Nesterov
    Cc: Martin Schwidefsky
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

30 Oct, 2014

1 commit

  • Found this in the message log on a s390 system:

    BUG kmalloc-192 (Not tainted): Poison overwritten
    Disabling lock debugging due to kernel taint
    INFO: 0x00000000684761f4-0x00000000684761f7. First byte 0xff instead of 0x6b
    INFO: Allocated in call_usermodehelper_setup+0x70/0x128 age=71 cpu=2 pid=648
    __slab_alloc.isra.47.constprop.56+0x5f6/0x658
    kmem_cache_alloc_trace+0x106/0x408
    call_usermodehelper_setup+0x70/0x128
    call_usermodehelper+0x62/0x90
    cgroup_release_agent+0x178/0x1c0
    process_one_work+0x36e/0x680
    worker_thread+0x2f0/0x4f8
    kthread+0x10a/0x120
    kernel_thread_starter+0x6/0xc
    kernel_thread_starter+0x0/0xc
    INFO: Freed in call_usermodehelper_exec+0x110/0x1b8 age=71 cpu=2 pid=648
    __slab_free+0x94/0x560
    kfree+0x364/0x3e0
    call_usermodehelper_exec+0x110/0x1b8
    cgroup_release_agent+0x178/0x1c0
    process_one_work+0x36e/0x680
    worker_thread+0x2f0/0x4f8
    kthread+0x10a/0x120
    kernel_thread_starter+0x6/0xc
    kernel_thread_starter+0x0/0xc

    There is a use-after-free bug on the subprocess_info structure allocated
    by the user mode helper. In case do_execve() returns with an error
    ____call_usermodehelper() stores the error code to sub_info->retval, but
    sub_info can already have been freed.

    Regarding UMH_NO_WAIT, the sub_info structure can be freed by
    __call_usermodehelper() before the worker thread returns from
    do_execve(), allowing memory corruption when do_execve() failed after
    exec_mmap() is called.

    Regarding UMH_WAIT_EXEC, the call to umh_complete() allows
    call_usermodehelper_exec() to continue which then frees sub_info.

    To fix this race the code needs to make sure that the call to
    call_usermodehelper_freeinfo() is always done after the last store to
    sub_info->retval.

    Signed-off-by: Martin Schwidefsky
    Reviewed-by: Oleg Nesterov
    Cc: Tetsuo Handa
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     

07 Jun, 2014

1 commit

  • Now that we have kernel_sigaction() we can change wait_for_helper() to
    use it and cleans up the code a bit.

    Signed-off-by: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Al Viro
    Cc: David Woodhouse
    Cc: Frederic Weisbecker
    Cc: Geert Uytterhoeven
    Cc: Ingo Molnar
    Cc: Mathieu Desnoyers
    Cc: Richard Weinberger
    Cc: Steven Rostedt
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

18 Apr, 2014

1 commit

  • Mostly scripted conversion of the smp_mb__* barriers.

    Signed-off-by: Peter Zijlstra
    Acked-by: Paul E. McKenney
    Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
    Cc: Linus Torvalds
    Cc: linux-arch@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

06 Feb, 2014

1 commit

  • This changes 'do_execve()' to get the executable name as a 'struct
    filename', and to free it when it is done. This is what the normal
    users want, and it simplifies and streamlines their error handling.

    The controlled lifetime of the executable name also fixes a
    use-after-free problem with the trace_sched_process_exec tracepoint: the
    lifetime of the passed-in string for kernel users was not at all
    obvious, and the user-mode helper code used UMH_WAIT_EXEC to serialize
    the pathname allocation lifetime with the execve() having finished,
    which in turn meant that the trace point that happened after
    mm_release() of the old process VM ended up using already free'd memory.

    To solve the kernel string lifetime issue, this simply introduces
    "getname_kernel()" that works like the normal user-space getname()
    function, except with the source coming from kernel memory.

    As Oleg points out, this also means that we could drop the tcomm[] array
    from 'struct linux_binprm', since the pathname lifetime now covers
    setup_new_exec(). That would be a separate cleanup.

    Reported-by: Igor Zhbanov
    Tested-by: Steven Rostedt
    Cc: Oleg Nesterov
    Cc: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

01 Oct, 2013

1 commit

  • If /proc/sys/kernel/core_pattern contains only "|", a NULL pointer
    dereference happens upon core dump because argv_split("") returns
    argv[0] == NULL.

    This bug was once fixed by commit 264b83c07a84 ("usermodehelper: check
    subprocess_info->path != NULL") but was by error reintroduced by commit
    7f57cfa4e2aa ("usermodehelper: kill the sub_info->path[0] check").

    This bug seems to exist since 2.6.19 (the version which core dump to
    pipe was added). Depending on kernel version and config, some side
    effect might happen immediately after this oops (e.g. kernel panic with
    2.6.32-358.18.1.el6).

    Signed-off-by: Tetsuo Handa
    Acked-by: Oleg Nesterov
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     

04 Jul, 2013

1 commit

  • call_usermodehelper_exec() does nothing but returns success if path[0] ==
    0. The only user which needs this strange feature is request_module(), it
    can check modprobe_path[0] itself like other users do if they want to
    detect the "disabled by admin" case.

    Kill it. Not only it looks strange, it can confuse other callers. And
    this allows us to revert 264b83c0 ("usermodehelper: check
    subprocess_info->path != NULL"), do_execve(NULL) is safe.

    Signed-off-by: Oleg Nesterov
    Acked-by: Rusty Russell
    Cc: Lucas De Marchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

17 May, 2013

1 commit

  • argv_split(empty_or_all_spaces) happily succeeds, it simply returns
    argc == 0 and argv[0] == NULL. Change call_usermodehelper_exec() to
    check sub_info->path != NULL to avoid the crash.

    This is the minimal fix, todo:

    - perhaps we should change argv_split() to return NULL or change the
    callers.

    - kill or justify ->path[0] check

    - narrow the scope of helper_lock()

    Signed-off-by: Oleg Nesterov
    Acked-By: Lucas De Marchi
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

01 May, 2013

3 commits

  • This function suffers from not being able to determine if the cleanup is
    called in case it returns -ENOMEM. Nobody is using it anymore, so let's
    remove it.

    Signed-off-by: Lucas De Marchi
    Cc: Oleg Nesterov
    Cc: David Howells
    Cc: James Morris
    Cc: Al Viro
    Cc: Tejun Heo
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lucas De Marchi
     
  • Use call_usermodehelper_setup() + call_usermodehelper_exec() instead of
    calling call_usermodehelper_fns(). In case the latter returns -ENOMEM the
    cleanup function may had not been called - in this case we would not free
    argv and module_name.

    Signed-off-by: Lucas De Marchi
    Cc: Oleg Nesterov
    Cc: David Howells
    Cc: James Morris
    Cc: Al Viro
    Cc: Tejun Heo
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lucas De Marchi
     
  • call_usermodehelper_setup() + call_usermodehelper_exec() need to be
    called instead of call_usermodehelper_fns() when the cleanup function
    needs to be called even when an ENOMEM error occurs. In this case using
    call_usermodehelper_fns() the user can't distinguish if the cleanup
    function was called or not.

    [akpm@linux-foundation.org: export call_usermodehelper_setup() to modules]
    Signed-off-by: Lucas De Marchi
    Reviewed-by: Oleg Nesterov
    Cc: David Howells
    Cc: James Morris
    Cc: Al Viro
    Cc: Tejun Heo
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lucas De Marchi
     

24 Jan, 2013

1 commit


23 Jan, 2013

1 commit

  • Synchronous requet_module() from an async worker can lead to deadlock
    because module init path may invoke async_synchronize_full(). The
    async worker waits for request_module() to complete and the module
    loading waits for the async task to finish. This bug happened in the
    block layer because of default elevator auto-loading.

    Block layer has been updated not to do default elevator auto-loading
    and it has been decided to disallow synchronous request_module() from
    async workers.

    Trigger WARN_ON_ONCE() on synchronous request_module() from async
    workers.

    For more details, please refer to the following thread.

    http://thread.gmane.org/gmane.linux.kernel/1420814

    Signed-off-by: Tejun Heo
    Reported-by: Alex Riesen
    Cc: Linus Torvalds
    Cc: Arjan van de Ven

    Tejun Heo
     

20 Dec, 2012

1 commit

  • All architectures have
    CONFIG_GENERIC_KERNEL_THREAD
    CONFIG_GENERIC_KERNEL_EXECVE
    __ARCH_WANT_SYS_EXECVE
    None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers
    of kernel_execve() (which is a trivial wrapper for do_execve() now) left.
    Kill the conditionals and make both callers use do_execve().

    Signed-off-by: Al Viro

    Al Viro
     

13 Oct, 2012

1 commit

  • * allow kernel_execve() leave the actual return to userland to
    caller (selected by CONFIG_GENERIC_KERNEL_EXECVE). Callers
    updated accordingly.
    * architecture that does select GENERIC_KERNEL_EXECVE in its
    Kconfig should have its ret_from_kernel_thread() do this:
    call schedule_tail
    call the callback left for it by copy_thread(); if it ever
    returns, that's because it has just done successful kernel_execve()
    jump to return from syscall
    IOW, its only difference from ret_from_fork() is that it does call the
    callback.
    * such an architecture should also get rid of ret_from_kernel_execve()
    and __ARCH_WANT_KERNEL_EXECVE

    This is the last part of infrastructure patches in that area - from
    that point on work on different architectures can live independently.

    Signed-off-by: Al Viro

    Al Viro
     

12 Oct, 2012

1 commit

  • Most of them never returned anyway - only two functions had to be
    changed. That allows to simplify their callers a whole lot.

    Note that this does *not* apply to kthread_run() callbacks - all of
    those had been called from the same kernel_thread() callback, which
    did do_exit() already. This is strictly about very few low-level
    kernel_thread() callbacks (there are only 6 of those, mostly as part
    of kthread.h and kmod.h exported mechanisms, plus kernel_init()
    itself).

    Signed-off-by: Al Viro

    Al Viro
     

31 Jul, 2012

2 commits

  • The system deadlocks (at least since 2.6.10) when
    call_usermodehelper(UMH_WAIT_EXEC) request triggers
    call_usermodehelper(UMH_WAIT_PROC) request.

    This is because "khelper thread is waiting for the worker thread at
    wait_for_completion() in do_fork() since the worker thread was created
    with CLONE_VFORK flag" and "the worker thread cannot call complete()
    because do_execve() is blocked at UMH_WAIT_PROC request" and "the khelper
    thread cannot start processing UMH_WAIT_PROC request because the khelper
    thread is waiting for the worker thread at wait_for_completion() in
    do_fork()".

    The easiest example to observe this deadlock is to use a corrupted
    /sbin/hotplug binary (like shown below).

    # : > /tmp/dummy
    # chmod 755 /tmp/dummy
    # echo /tmp/dummy > /proc/sys/kernel/hotplug
    # modprobe whatever

    call_usermodehelper("/tmp/dummy", UMH_WAIT_EXEC) is called from
    kobject_uevent_env() in lib/kobject_uevent.c upon loading/unloading a
    module. do_execve("/tmp/dummy") triggers a call to
    request_module("binfmt-0000") from search_binary_handler() which in turn
    calls call_usermodehelper(UMH_WAIT_PROC).

    In order to avoid deadlock, as a for-now and easy-to-backport solution, do
    not try to call wait_for_completion() in call_usermodehelper_exec() if the
    worker thread was created by khelper thread with CLONE_VFORK flag. Future
    and fundamental solution might be replacing singleton khelper thread with
    some workqueue so that recursive calls up to max_active dependency loop
    can be handled without deadlock.

    [akpm@linux-foundation.org: add comment to kmod_thread_locker]
    Signed-off-by: Tetsuo Handa
    Cc: Arjan van de Ven
    Acked-by: Rusty Russell
    Cc: Tejun Heo
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     
  • This function's interface is, uh, subtle. Attempt to apologise for it.

    Cc: WANG Cong
    Cc: Cyrill Gorcunov
    Cc: Kees Cook
    Cc: Serge Hallyn
    Cc: "Eric W. Biederman"
    Cc: Alan Cox
    Cc: Oleg Nesterov
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

01 Jun, 2012

3 commits

  • Warning(kernel/kmod.c:419): No description found for parameter 'depth'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • If we move call_usermodehelper_fns() to kmod.c file and EXPORT_SYMBOL it
    we can avoid exporting all it's helper functions:
    call_usermodehelper_setup
    call_usermodehelper_setfns
    call_usermodehelper_exec
    And make all of them static to kmod.c

    Since the optimizer will see all these as a single call site it will
    inline them inside call_usermodehelper_fns(). So we loose the call to
    _fns but gain 3 calls to the helpers. (Not that it matters)

    Signed-off-by: Boaz Harrosh
    Cc: Oleg Nesterov
    Cc: Tetsuo Handa
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Boaz Harrosh
     
  • call_usermodehelper_freeinfo() is not used outside of kmod.c. So unexport
    it, and make it static to kmod.c

    Signed-off-by: Boaz Harrosh
    Cc: Oleg Nesterov
    Cc: Tetsuo Handa
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Boaz Harrosh
     

29 Mar, 2012

3 commits

  • There is a race condition between the freezer and request_firmware()
    such that if request_firmware() is run on one CPU and
    freeze_processes() is run on another CPU and usermodehelper_disable()
    called by it succeeds to grab umhelper_sem for writing before
    usermodehelper_read_trylock() called from request_firmware()
    acquires it for reading, the request_firmware() will fail and
    trigger a WARN_ON() complaining that it was called at a wrong time.
    However, in fact, it wasn't called at a wrong time and
    freeze_processes() simply happened to be executed simultaneously.

    To avoid this race, at least in some cases, modify
    usermodehelper_read_trylock() so that it doesn't fail if the
    freezing of tasks has just started and hasn't been completed yet.
    Instead, during the freezing of tasks, it will try to freeze the
    task that has called it so that it can wait until user space is
    thawed without triggering the scary warning.

    For this purpose, change usermodehelper_disabled so that it can
    take three different values, UMH_ENABLED (0), UMH_FREEZING and
    UMH_DISABLED. The first one means that usermode helpers are
    enabled, the last one means "hard disable" (i.e. the system is not
    ready for usermode helpers to be used) and the second one
    is reserved for the freezer. Namely, when freeze_processes() is
    started, it sets usermodehelper_disabled to UMH_FREEZING which
    tells usermodehelper_read_trylock() that it shouldn't fail just
    yet and should call try_to_freeze() if woken up and cannot
    return immediately. This way all freezable tasks that happen
    to call request_firmware() right before freeze_processes() is
    started and lose the race for umhelper_sem with it will be
    frozen and will sleep until thaw_processes() unsets
    usermodehelper_disabled. [For the non-freezable callers of
    request_firmware() the race for umhelper_sem against
    freeze_processes() is unfortunately unavoidable.]

    Reported-by: Stephen Boyd
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Greg Kroah-Hartman
    Cc: stable@vger.kernel.org

    Rafael J. Wysocki
     
  • If firmware is requested asynchronously, by calling
    request_firmware_nowait(), there is no reason to fail the request
    (and warn the user) when the system is (presumably temporarily)
    unready to handle it (because user space is not available yet or
    frozen). For this reason, introduce an alternative routine for
    read-locking umhelper_sem, usermodehelper_read_lock_wait(), that
    will wait for usermodehelper_disabled to be unset (possibly with
    a timeout) and make request_firmware_work_func() use it instead of
    usermodehelper_read_trylock().

    Accordingly, modify request_firmware() so that it uses
    usermodehelper_read_trylock() to acquire umhelper_sem and remove
    the code related to that lock from _request_firmware().

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Greg Kroah-Hartman
    Cc: stable@vger.kernel.org

    Rafael J. Wysocki
     
  • Instead of two functions, read_lock_usermodehelper() and
    usermodehelper_is_disabled(), used in combination, introduce
    usermodehelper_read_trylock() that will only return with umhelper_sem
    held if usermodehelper_disabled is unset (and will return -EAGAIN
    otherwise) and make _request_firmware() use it.

    Rename read_unlock_usermodehelper() to
    usermodehelper_read_unlock() to follow the naming convention of the
    new function.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Greg Kroah-Hartman
    Cc: stable@vger.kernel.org

    Rafael J. Wysocki
     

24 Mar, 2012

1 commit

  • As Tetsuo Handa pointed out, request_module() can stress the system
    while the oom-killed caller sleeps in TASK_UNINTERRUPTIBLE.

    The task T uses "almost all" memory, then it does something which
    triggers request_module(). Say, it can simply call sys_socket(). This
    in turn needs more memory and leads to OOM. oom-killer correctly
    chooses T and kills it, but this can't help because it sleeps in
    TASK_UNINTERRUPTIBLE and after that oom-killer becomes "disabled" by the
    TIF_MEMDIE task T.

    Make __request_module() killable. The only necessary change is that
    call_modprobe() should kmalloc argv and module_name, they can't live in
    the stack if we use UMH_KILLABLE. This memory is freed via
    call_usermodehelper_freeinfo()->cleanup.

    Reported-by: Tetsuo Handa
    Signed-off-by: Oleg Nesterov
    Cc: Rusty Russell
    Cc: Tejun Heo
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov