24 Sep, 2009

1 commit

  • CLONE_PARENT was used to implement an older threading model. For
    consistency with the CLONE_THREAD check in copy_pid_ns(), disable
    CLONE_PARENT with CLONE_NEWPID, at least until the required semantics of
    pid namespaces are clear.

    Signed-off-by: Sukadev Bhattiprolu
    Acked-by: Roland McGrath
    Acked-by: Serge Hallyn
    Cc: Oren Laadan
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukadev Bhattiprolu
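
The restriction described in this commit can be modeled in user space as a plain flag check. This is a minimal sketch, not the kernel's code: `model_pidns_flags_ok()` and the `MODEL_CLONE_*` names are hypothetical (the values mirror the Linux `CLONE_*` flags), and the `-EINVAL` return is an assumption about how the rejection is reported.

```c
#include <errno.h>

/* Hypothetical names; the values mirror the Linux CLONE_* flags. */
#define MODEL_CLONE_PARENT 0x00008000UL
#define MODEL_CLONE_THREAD 0x00010000UL
#define MODEL_CLONE_NEWPID 0x20000000UL

/*
 * User-space model of the check: a new pid namespace may not be
 * combined with CLONE_THREAD or CLONE_PARENT.
 * Returns 0 if the combination is allowed, -EINVAL (assumed) otherwise.
 */
static int model_pidns_flags_ok(unsigned long flags)
{
	if ((flags & MODEL_CLONE_NEWPID) &&
	    (flags & (MODEL_CLONE_THREAD | MODEL_CLONE_PARENT)))
		return -EINVAL;
	return 0;
}
```

In this model, CLONE_PARENT on its own and CLONE_NEWPID on its own are both accepted; only the combination is rejected.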
     

19 Jun, 2009

2 commits

  • copy_pid_ns() is a perfect example of a case where unwinding leads to more
    code and makes it less clear. Watch the diffstat.

    Signed-off-by: Alexey Dobriyan
    Cc: Pavel Emelyanov
    Cc: "Eric W. Biederman"
    Reviewed-by: Serge Hallyn
    Acked-by: Sukadev Bhattiprolu
    Reviewed-by: WANG Cong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • create_pid_namespace() creates everything, but caller has to assign parent
    pidns by hand, which is unnatural. At the moment of call new ->level has
    to be taken from somewhere and parent pidns is already available.

    Signed-off-by: Alexey Dobriyan
    Cc: Pavel Emelyanov
    Cc: "Eric W. Biederman"
    Acked-by: Serge Hallyn
    Acked-by: Sukadev Bhattiprolu
    Reviewed-by: WANG Cong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
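
The interface change described in this commit can be sketched as a constructor that receives the parent namespace and derives the level from it, so the caller no longer assigns either by hand. This is a simplified user-space model under stated assumptions: `struct model_pid_ns` and `model_create_pid_ns()` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for a pid namespace. */
struct model_pid_ns {
	unsigned int level;          /* nesting depth, 0 for the root */
	struct model_pid_ns *parent; /* NULL only for the root */
};

/*
 * Create a child namespace of 'parent'. Both ->level and ->parent are
 * set here, so the caller has nothing left to fill in.
 * Returns NULL if allocation fails.
 */
static struct model_pid_ns *model_create_pid_ns(struct model_pid_ns *parent)
{
	struct model_pid_ns *ns;

	assert(parent != NULL);

	ns = calloc(1, sizeof(*ns));
	if (!ns)
		return NULL;

	ns->level = parent->level + 1;
	ns->parent = parent;
	return ns;
}
```

Passing the parent in keeps the two related fields consistent by construction: the level can never disagree with the parent's level.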
     

03 Apr, 2009

1 commit

  • send_signal() assumes that signals with SEND_SIG_PRIV are generated from
    within the same namespace. So any nested container-init processes become
    immune to the SIGKILL generated by kill_proc_info() in
    zap_pid_ns_processes().

    Use force_sig() in zap_pid_ns_processes() instead - force_sig() clears the
    SIGNAL_UNKILLABLE flag ensuring the signal is processed by
    container-inits.

    Signed-off-by: Sukadev Bhattiprolu
    Cc: Oleg Nesterov
    Cc: Roland McGrath
    Cc: "Eric W. Biederman"
    Cc: Daniel Lezcano
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukadev Bhattiprolu
     

03 Sep, 2008

2 commits

  • We don't change pid_ns->child_reaper when the main thread of the
    subnamespace init exits. As Robert Rex pointed out, this is wrong.

    Yes, the re-parenting itself works correctly, but if the reparented task
    exits it needs ->parent->nsproxy->pid_ns in do_notify_parent(), and if the
    main thread is zombie its ->nsproxy was already cleared by
    exit_task_namespaces().

    Introduce the new function, find_new_reaper(), which finds the new
    ->parent for the re-parenting and changes ->child_reaper if needed. Kill
    the now unneeded exit_child_reaper().

    Also move the changing of ->child_reaper from zap_pid_ns_processes() to
    find_new_reaper(); this consolidates the games with ->child_reaper and
    makes it stable under tasklist_lock.

    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=11391

    Reported-by: Robert Rex
    Signed-off-by: Oleg Nesterov
    Acked-by: Serge Hallyn
    Acked-by: Pavel Emelyanov
    Acked-by: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • zap_pid_ns_processes() sets pid_ns->child_reaper = NULL, this is wrong.

    Yes, we have already killed all tasks in this namespace, and sys_wait4()
    doesn't see any child. But this doesn't mean ->children list is empty, we
    may have EXIT_DEAD tasks which are not visible to do_wait(). In that case
    the subsequent forget_original_parent() will crash the kernel because it
    will try to re-parent these tasks to the NULL reaper.

    Even if there are no children, it is not good that
    forget_original_parent() uses reaper == NULL.

    Change the code to set ->child_reaper = init_pid_ns.child_reaper instead.
    We could use pid_ns->parent->child_reaper as well, I think this does not
    really matter. These EXIT_DEAD tasks are not visible to the new ->parent
    after re-parenting, they will silently do release_task() eventually.

    Note that we must change ->child_reaper, otherwise
    forget_original_parent() will use reaper == father, and in that case we
    will hit the (correct) BUG_ON(!list_empty(&father->children)).

    Signed-off-by: Oleg Nesterov
    Acked-by: Serge Hallyn
    Acked-by: Sukadev Bhattiprolu
    Acked-by: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
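
The invariant this commit establishes (the namespace's reaper is never NULL when children are re-parented) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the `model_*` types and functions are hypothetical, and locking, task lists, and EXIT_DEAD handling are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for kernel structures. */
struct model_task {
	struct model_task *parent;
};

struct model_ns {
	struct model_task *child_reaper;
};

/*
 * Model of the fix: when the namespace is torn down, point its reaper
 * at the global init's reaper instead of setting it to NULL.
 */
static void model_zap_ns(struct model_ns *ns, struct model_task *global_reaper)
{
	ns->child_reaper = global_reaper;
}

/*
 * Model of re-parenting: any remaining child is handed to the
 * namespace's reaper, which must therefore never be NULL.
 */
static void model_reparent(struct model_task *child, const struct model_ns *ns)
{
	assert(ns->child_reaper != NULL);
	child->parent = ns->child_reaper;
}
```

With the old behaviour (reaper set to NULL), the assertion in `model_reparent()` would fire, which corresponds to the crash described in the commit message.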
     

26 Jul, 2008

2 commits

  • Allocate the structure on the first call to sys_acct(). After this, each
    namespace that requested accounting keeps this structure until its own
    death.

    Two notes:
    - the routines that close accounting at fs umount time use the
      init_pid_ns's acct for now;
    - the accounting routine accounts to the dying task's namespace
      (also for now).

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • It makes many fields initialization implicit helping in auto-setting
    #ifdef-ed fields (bsd-acct related pointer will be such).

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

30 Apr, 2008

1 commit

  • These values represent the nesting level of a namespace and of the pids
    living in it, and they are always non-negative.

    Turning this from int to unsigned int saves some space in pid.c (11 bytes on
    x86 and 64 on ia64) by letting the compiler optimize pid_nr_ns() a bit.
    E.g. on ia64 this removes the sign extension calls, which the compiler adds
    to optimize access to pid->numbers[ns->level].

    Signed-off-by: Pavel Emelyanov
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
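
The array access mentioned in this commit, pid->numbers[ns->level], can be sketched in user space to show why an unsigned level is a natural fit for an array index. This is a simplified model, not the kernel source: the `model_*` types, `MODEL_MAX_LEVEL`, and the fixed-size array are assumptions made for illustration.

```c
#include <stddef.h>

#define MODEL_MAX_LEVEL 4 /* hypothetical bound for this sketch */

struct model_ns {
	unsigned int level; /* nesting depth, never negative */
};

/* One numeric id per namespace level in which the pid is visible. */
struct model_upid {
	int nr;
	const struct model_ns *ns;
};

struct model_pid {
	unsigned int level; /* deepest level this pid lives in */
	struct model_upid numbers[MODEL_MAX_LEVEL];
};

/*
 * Return the id of 'pid' as seen from 'ns', or 0 if the pid is not
 * visible there. The unsigned level indexes the array directly.
 */
static int model_pid_nr_ns(const struct model_pid *pid,
			   const struct model_ns *ns)
{
	if (pid && ns->level <= pid->level) {
		const struct model_upid *upid = &pid->numbers[ns->level];

		if (upid->ns == ns)
			return upid->nr;
	}
	return 0;
}
```

A pid created in a nested namespace carries one entry per level, so the same task can be, for example, pid 1 inside its container and some larger number in the root namespace.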
     

29 Apr, 2008

1 commit


09 Feb, 2008

1 commit

  • Just like with the user namespaces, move the namespace management code into
    a separate .c file and mark the (already existing) PID_NS option as "depend
    on NAMESPACES".

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Cedric Le Goater
    Cc: "Eric W. Biederman"
    Cc: Herbert Poetzl
    Cc: Kirill Korotaev
    Cc: Sukadev Bhattiprolu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov