Eric Lee / smarc-fsl-linux-kernel

15 Jan, 2012

1 commit

c49c41a41 Merge branch 'for-linus' of git://selinuxproject.org/~jmorris/linux-security ... Browse Code »

* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
capabilities: remove __cap_full_set definition
security: remove the security_netlink_recv hook as it is equivalent to capable()
ptrace: do not audit capability check when outputing /proc/pid/stat
capabilities: remove task_ns_* functions
capabitlies: ns_capable can use the cap helpers rather than lsm call
capabilities: style only - move capable below ns_capable
capabilites: introduce new has_ns_capabilities_noaudit
capabilities: call has_ns_capability from has_capability
capabilities: remove all _real_ interfaces
capabilities: introduce security_capable_noaudit
capabilities: reverse arguments to security_capable
capabilities: remove the task from capable LSM hook entirely
selinux: sparse fix: fix several warnings in the security server cod
selinux: sparse fix: fix warnings in netlink code
selinux: sparse fix: eliminate warnings for selinuxfs
selinux: sparse fix: declare selinux_disable() in security.h
selinux: sparse fix: move selinux_complete_init
selinux: sparse fix: make selinux_secmark_refcount static
SELinux: Fix RCU deref check warning in sel_netport_insert()

Manually fix up a semantic mis-merge wrt security_netlink_recv():

- the interface was removed in commit fd7784615248 ("security: remove
the security_netlink_recv hook as it is equivalent to capable()")

- a new user of it appeared in commit a38f7907b926 ("crypto: Add
userspace configuration API")

causing no automatic merge conflict, but Eric Paris pointed out the
issue.

Linus Torvalds
2012-01-15 10:36:33 +0800

06 Jan, 2012

2 commits

69f594a38 ptrace: do not audit capability check when outputing /proc/pid/stat ... Browse Code »

Reading /proc/pid/stat of another process checks if one has ptrace permissions
on that process. If one does have permissions it outputs some data about the
process which might have security and attack implications. If the current
task does not have ptrace permissions the read still works, but those fields
are filled with inocuous (0) values. Since this check and a subsequent denial
is not a violation of the security policy we should not audit such denials.

This can be quite useful to removing ptrace broadly across a system without
flooding the logs when ps is run or something which harmlessly walks proc.

Signed-off-by: Eric Paris
Acked-by: Serge E. Hallyn

Eric Paris
2012-01-06 07:53:00 +0800
f1c84dae0 capabilities: remove task_ns_* functions ... Browse Code »

task_ in the front of a function, in the security subsystem anyway, means
to me at least, that we are operating with that task as the subject of the
security decision. In this case what it means is that we are using current as
the subject but we use the task to get the right namespace. Who in the world
would ever realize that's what task_ns_capability means just by the name? This
patch eliminates the task_ns functions entirely and uses the has_ns_capability
function instead. This means we explicitly open code the ns in question in
the caller. I think it makes the caller a LOT more clear what is going on.

Signed-off-by: Eric Paris
Acked-by: Serge E. Hallyn

Eric Paris
2012-01-06 07:52:59 +0800

05 Jan, 2012

1 commit

8a88951b5 ptrace: ensure JOBCTL_STOP_SIGMASK is not zero after detach ... Browse Code »

This is the temporary simple fix for 3.2, we need more changes in this
area.

1. do_signal_stop() assumes that the running untraced thread in the
stopped thread group is not possible. This was our goal but it is
not yet achieved: a stopped-but-resumed tracee can clone the running
thread which can initiate another group-stop.

Remove WARN_ON_ONCE(!current->ptrace).

2. A new thread always starts with ->jobctl = 0. If it is auto-attached
and this group is stopped, __ptrace_unlink() sets JOBCTL_STOP_PENDING
but JOBCTL_STOP_SIGMASK part is zero, this triggers WANR_ON(!signr)
in do_jobctl_trap() if another debugger attaches.

Change __ptrace_unlink() to set the artificial SIGSTOP for report.

Alternatively we could change ptrace_init_task() to copy signr from
current, but this means we can copy it for no reason and hide the
possible similar problems.

Acked-by: Tejun Heo
Cc: [3.1]
Signed-off-by: Oleg Nesterov
Signed-off-by: Linus Torvalds

Oleg Nesterov
2012-01-05 07:01:59 +0800

31 Oct, 2011

1 commit

9984de1a5 kernel: Map most files to use export.h instead of module.h ... Browse Code »

The changed files were only including linux/module.h for the
EXPORT_SYMBOL infrastructure, and nothing else. Revector them
onto the isolated export header for faster compile times.

Nothing to see here but a whole lot of instances of:

-#include
+#include

This commit is only changing the kernel dir; next targets
will probably be mm, fs, the arch dirs, etc.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-10-31 21:20:12 +0800

26 Sep, 2011

1 commit

f9d81f61c ptrace: PTRACE_LISTEN forgets to unlock ->siglock ... Browse Code »

If PTRACE_LISTEN fails after lock_task_sighand() it doesn't drop ->siglock.

Reported-by: Matt Fleming
Signed-off-by: Oleg Nesterov
Signed-off-by: Linus Torvalds

Oleg Nesterov
2011-09-26 02:02:00 +0800

19 Jul, 2011

1 commit

f701e5b73 connector: add an event for monitoring process tracers ... Browse Code »

This change adds a procfs connector event, which is emitted on every
successful process tracer attach or detach.

If some process connects to other one, kernelspace connector reports
process id and thread group id of both these involved processes. On
disconnection null process id is returned.

Such an event allows to create a simple automated userspace mechanism
to be aware about processes connecting to others, therefore predefined
process policies can be applied to them if needed.

Note, a detach signal is emitted only in case, if a tracer process
explicitly executes PTRACE_DETACH request. In other cases like tracee
or tracer exit detach event from proc connector is not reported.

Signed-off-by: Vladimir Zapolskiy
Acked-by: Evgeniy Polyakov
Cc: David S. Miller
Signed-off-by: Oleg Nesterov

Vladimir Zapolskiy
2011-07-19 03:38:33 +0800

28 Jun, 2011

2 commits

d4f7c511c do not change dead_task->exit_signal ... Browse Code »

__ptrace_detach() and do_notify_parent() set task->exit_signal = -1
to mark the task dead. This is no longer needed, nobody checks
exit_signal to detect the EXIT_DEAD task.

Signed-off-by: Oleg Nesterov
Reviewed-by: Tejun Heo

Oleg Nesterov
2011-06-28 02:30:10 +0800
9843a1e97 __ptrace_detach: avoid task_detached(), check do_notify_parent() ... Browse Code »

__ptrace_detach() relies on the current obscure behaviour of
do_notify_parent(tsk) which changes tsk->exit_signal if this child
should be silently reaped. That is why we check task_detached(), it
is true if the task is sub-thread, or it is the group_leader but
its exit_signal was changed by do_notify_parent().

This is confusing, change the code to rely on !thread_group_leader()
or the value returned by do_notify_parent().

Signed-off-by: Oleg Nesterov
Acked-by: Tejun Heo

Oleg Nesterov
2011-06-28 02:30:08 +0800

17 Jun, 2011

4 commits

544b2c91a ptrace: implement PTRACE_LISTEN ... Browse Code »

The previous patch implemented async notification for ptrace but it
only worked while trace is running. This patch introduces
PTRACE_LISTEN which is suggested by Oleg Nestrov.

It's allowed iff tracee is in STOP trap and puts tracee into
quasi-running state - tracee never really runs but wait(2) and
ptrace(2) consider it to be running. While ptracer is listening,
tracee is allowed to re-enter STOP to notify an async event.
Listening state is cleared on the first notification. Ptracer can
also clear it by issuing INTERRUPT - tracee will re-trap into STOP
with listening state cleared.

This allows ptracer to monitor group stop state without running tracee
- use INTERRUPT to put tracee into STOP trap, issue LISTEN and then
wait(2) to wait for the next group stop event. When it happens,
PTRACE_GETSIGINFO provides information to determine the current state.

Test program follows.

#define PTRACE_SEIZE 0x4206
#define PTRACE_INTERRUPT 0x4207
#define PTRACE_LISTEN 0x4208

#define PTRACE_SEIZE_DEVEL 0x80000000

static const struct timespec ts1s = { .tv_sec = 1 };

int main(int argc, char **argv)
{
pid_t tracee, tracer;
int i;

tracee = fork();
if (!tracee)
while (1)
pause();

tracer = fork();
if (!tracer) {
siginfo_t si;

ptrace(PTRACE_SEIZE, tracee, NULL,
(void *)(unsigned long)PTRACE_SEIZE_DEVEL);
ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
repeat:
waitid(P_PID, tracee, NULL, WSTOPPED);

ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
if (!si.si_code) {
printf("tracer: SIG %d\n", si.si_signo);
ptrace(PTRACE_CONT, tracee, NULL,
(void *)(unsigned long)si.si_signo);
goto repeat;
}
printf("tracer: stopped=%d signo=%d\n",
si.si_signo != SIGTRAP, si.si_signo);
if (si.si_signo != SIGTRAP)
ptrace(PTRACE_LISTEN, tracee, NULL, NULL);
else
ptrace(PTRACE_CONT, tracee, NULL, NULL);
goto repeat;
}

for (i = 0; i < 3; i++) {
nanosleep(&ts1s, NULL);
printf("mother: SIGSTOP\n");
kill(tracee, SIGSTOP);
nanosleep(&ts1s, NULL);
printf("mother: SIGCONT\n");
kill(tracee, SIGCONT);
}
nanosleep(&ts1s, NULL);

kill(tracer, SIGKILL);
kill(tracee, SIGKILL);
return 0;
}

This is identical to the program to test TRAP_NOTIFY except that
tracee is PTRACE_LISTEN'd instead of PTRACE_CONT'd when group stopped.
This allows ptracer to monitor when group stop ends without running
tracee.

# ./test-listen
tracer: stopped=0 signo=5
mother: SIGSTOP
tracer: SIG 19
tracer: stopped=1 signo=19
mother: SIGCONT
tracer: stopped=0 signo=5
tracer: SIG 18
mother: SIGSTOP
tracer: SIG 19
tracer: stopped=1 signo=19
mother: SIGCONT
tracer: stopped=0 signo=5
tracer: SIG 18
mother: SIGSTOP
tracer: SIG 19
tracer: stopped=1 signo=19
mother: SIGCONT
tracer: stopped=0 signo=5
tracer: SIG 18

-v2: Moved JOBCTL_LISTENING check in wait_task_stopped() into
task_stopped_code() as suggested by Oleg.

Signed-off-by: Tejun Heo
Cc: Oleg Nesterov

Tejun Heo
2011-06-17 03:41:54 +0800
fca26f260 ptrace: implement PTRACE_INTERRUPT ... Browse Code »

Currently, there's no way to trap a running ptracee short of sending a
signal which has various side effects. This patch implements
PTRACE_INTERRUPT which traps ptracee without any signal or job control
related side effect.

The implementation is almost trivial. It uses the group stop trap -
SIGTRAP | PTRACE_EVENT_STOP << 8. A new trap flag
JOBCTL_TRAP_INTERRUPT is added, which is set on PTRACE_INTERRUPT and
cleared when any trap happens. As INTERRUPT should be useable
regardless of the current state of tracee, task_is_traced() test in
ptrace_check_attach() is skipped for INTERRUPT.

PTRACE_INTERRUPT is available iff tracee is attached with
PTRACE_SEIZE.

Test program follows.

#define PTRACE_SEIZE 0x4206
#define PTRACE_INTERRUPT 0x4207

#define PTRACE_SEIZE_DEVEL 0x80000000

static const struct timespec ts100ms = { .tv_nsec = 100000000 };
static const struct timespec ts1s = { .tv_sec = 1 };
static const struct timespec ts3s = { .tv_sec = 3 };

int main(int argc, char **argv)
{
pid_t tracee;

tracee = fork();
if (tracee == 0) {
nanosleep(&ts100ms, NULL);
while (1) {
printf("tracee: alive pid=%d\n", getpid());
nanosleep(&ts1s, NULL);
}
}

if (argc > 1)
kill(tracee, SIGSTOP);

nanosleep(&ts100ms, NULL);

ptrace(PTRACE_SEIZE, tracee, NULL,
(void *)(unsigned long)PTRACE_SEIZE_DEVEL);
if (argc > 1) {
waitid(P_PID, tracee, NULL, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, NULL);
}
nanosleep(&ts3s, NULL);

printf("tracer: INTERRUPT and DETACH\n");
ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
waitid(P_PID, tracee, NULL, WSTOPPED);
ptrace(PTRACE_DETACH, tracee, NULL, NULL);
nanosleep(&ts3s, NULL);

printf("tracer: exiting\n");
kill(tracee, SIGKILL);
return 0;
}

When called without argument, tracee is seized from running state,
interrupted and then detached back to running state.

# ./test-interrupt
tracee: alive pid=4546
tracee: alive pid=4546
tracee: alive pid=4546
tracer: INTERRUPT and DETACH
tracee: alive pid=4546
tracee: alive pid=4546
tracee: alive pid=4546
tracer: exiting

When called with argument, tracee is seized from stopped state,
continued, interrupted and then detached back to stopped state.

# ./test-interrupt 1
tracee: alive pid=4548
tracee: alive pid=4548
tracee: alive pid=4548
tracer: INTERRUPT and DETACH
tracer: exiting

Before PTRACE_INTERRUPT, once the tracee was running, there was no way
to trap tracee and do PTRACE_DETACH without causing side effect.

-v2: Updated to use task_set_jobctl_pending() so that it doesn't end
up scheduling TRAP_STOP if child is dying which may make the
child unkillable. Spotted by Oleg.

Signed-off-by: Tejun Heo
Cc: Oleg Nesterov

Tejun Heo
2011-06-17 03:41:53 +0800
3544d72a0 ptrace: implement PTRACE_SEIZE ... Browse Code »

PTRACE_ATTACH implicitly issues SIGSTOP on attach which has side
effects on tracee signal and job control states. This patch
implements a new ptrace request PTRACE_SEIZE which attaches a tracee
without trapping it or affecting its signal and job control states.

The usage is the same with PTRACE_ATTACH but it takes PTRACE_SEIZE_*
flags in @data. Currently, the only defined flag is
PTRACE_SEIZE_DEVEL which is a temporary flag to enable PTRACE_SEIZE.
PTRACE_SEIZE will change ptrace behaviors outside of attach itself.
The changes will be implemented gradually and the DEVEL flag is to
prevent programs which expect full SEIZE behavior from using it before
all the behavior modifications are complete while allowing unit
testing. The flag will be removed once SEIZE behaviors are completely
implemented.

* PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap. After
attaching tracee continues to run unless a trap condition occurs.

* PTRACE_SEIZE doesn't affect signal or group stop state.

* If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses
exit_code of (signr | PTRACE_EVENT_STOP << 8) where signr is one of
the stopping signals if group stop is in effect or SIGTRAP
otherwise, and returns usual trap siginfo on PTRACE_GETSIGINFO
instead of NULL.

Seizing sets PT_SEIZED in ->ptrace of the tracee. This flag will be
used to determine whether new SEIZE behaviors should be enabled.

Test program follows.

#define PTRACE_SEIZE 0x4206
#define PTRACE_SEIZE_DEVEL 0x80000000

static const struct timespec ts100ms = { .tv_nsec = 100000000 };
static const struct timespec ts1s = { .tv_sec = 1 };
static const struct timespec ts3s = { .tv_sec = 3 };

int main(int argc, char **argv)
{
pid_t tracee;

tracee = fork();
if (tracee == 0) {
nanosleep(&ts100ms, NULL);
while (1) {
printf("tracee: alive\n");
nanosleep(&ts1s, NULL);
}
}

if (argc > 1)
kill(tracee, SIGSTOP);

nanosleep(&ts100ms, NULL);

ptrace(PTRACE_SEIZE, tracee, NULL,
(void *)(unsigned long)PTRACE_SEIZE_DEVEL);
if (argc > 1) {
waitid(P_PID, tracee, NULL, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, NULL);
}
nanosleep(&ts3s, NULL);
printf("tracer: exiting\n");
return 0;
}

When the above program is called w/o argument, tracee is seized while
running and remains running. When tracer exits, tracee continues to
run and print out messages.

# ./test-seize-simple
tracee: alive
tracee: alive
tracee: alive
tracer: exiting
tracee: alive
tracee: alive

When called with an argument, tracee is seized from stopped state and
continued, and returns to stopped state when tracer exits.

# ./test-seize
tracee: alive
tracee: alive
tracee: alive
tracer: exiting
# ps -el|grep test-seize
1 T 0 4720 1 0 80 0 - 941 signal ttyS0 00:00:00 test-seize

-v2: SEIZE doesn't schedule TRAP_STOP and leaves tracee running as Jan
suggested.

-v3: PTRACE_EVENT_STOP traps now report group stop state by signr. If
group stop is in effect the stop signal number is returned as
part of exit_code; otherwise, SIGTRAP. This was suggested by
Denys and Oleg.

Signed-off-by: Tejun Heo
Cc: Jan Kratochvil
Cc: Denys Vlasenko
Cc: Oleg Nesterov

Tejun Heo
2011-06-17 03:41:53 +0800
73ddff2be job control: introduce JOBCTL_TRAP_STOP and use it for group stop trap ... Browse Code »

do_signal_stop() implemented both normal group stop and trap for group
stop while ptraced. This approach has been enough but scheduled
changes require trap mechanism which can be used in more generic
manner and using group stop trap for generic trap site simplifies both
userland visible interface and implementation.

This patch adds a new jobctl flag - JOBCTL_TRAP_STOP. When set, it
triggers a trap site, which behaves like group stop trap, in
get_signal_to_deliver() after checking for pending signals. While
ptraced, do_signal_stop() doesn't stop itself. It initiates group
stop if requested and schedules JOBCTL_TRAP_STOP and returns. The
caller - get_signal_to_deliver() - is responsible for checking whether
TRAP_STOP is pending afterwards and handling it.

ptrace_attach() is updated to use JOBCTL_TRAP_STOP instead of
JOBCTL_STOP_PENDING and __ptrace_unlink() to clear all pending trap
bits and TRAPPING so that TRAP_STOP and future trap bits don't linger
after detach.

While at it, add proper function comment to do_signal_stop() and make
it return bool.

-v2: __ptrace_unlink() updated to clear JOBCTL_TRAP_MASK and TRAPPING
instead of JOBCTL_PENDING_MASK. This avoids accidentally
clearing JOBCTL_STOP_CONSUME. Spotted by Oleg.

-v3: do_signal_stop() updated to return %false without dropping
siglock while ptraced and TRAP_STOP check moved inside for(;;)
loop after group stop participation. This avoids unnecessary
relocking and also will help avoiding unnecessary traps by
consuming group stop before handling pending traps.

-v4: Jobctl trap handling moved into a separate function -
do_jobctl_trap().

Signed-off-by: Tejun Heo
Cc: Oleg Nesterov

Tejun Heo
2011-06-17 03:41:52 +0800

05 Jun, 2011

5 commits

62c124ff3 ptrace: use bit_waitqueue for TRAPPING instead of wait_chldexit ... Browse Code »

ptracer->signal->wait_chldexit was used to wait for TRAPPING; however,
->wait_chldexit was already complicated with waker-side filtering
without adding TRAPPING wait on top of it. Also, it unnecessarily
made TRAPPING clearing depend on the current ptrace relationship - if
the ptracee is detached, wakeup is lost.

There is no reason to use signal->wait_chldexit here. We're just
waiting for JOBCTL_TRAPPING bit to clear and given the relatively
infrequent use of ptrace, bit_waitqueue can serve it perfectly.

This patch makes JOBCTL_TRAPPING wait use bit_waitqueue instead of
signal->wait_chldexit.

-v2: Use JOBCTL_*_BIT macros instead of ilog2() as suggested by Linus.

Signed-off-by: Tejun Heo
Cc: Linus Torvalds
Signed-off-by: Oleg Nesterov

Tejun Heo
2011-06-05 00:17:11 +0800
7dd3db54e job control: introduce task_set_jobctl_pending() ... Browse Code »

task->jobctl currently hosts JOBCTL_STOP_PENDING and will host TRAP
pending bits too. Setting pending conditions on a dying task may make
the task unkillable. Currently, each setting site is responsible for
checking for the condition but with to-be-added job control traps this
becomes too fragile.

This patch adds task_set_jobctl_pending() which should be used when
setting task->jobctl bits to schedule a stop or trap. The function
performs the followings to ease setting pending bits.

* Sanity checks.

* If fatal signal is pending or PF_EXITING is set, no bit is set.

* STOP_SIGMASK is automatically cleared if new value is being set.

do_signal_stop() and ptrace_attach() are updated to use
task_set_jobctl_pending() instead of setting STOP_PENDING explicitly.
The surrounding structures around setting are changed to fit
task_set_jobctl_pending() better but there should be no userland
visible behavior difference.

Signed-off-by: Tejun Heo
Cc: Oleg Nesterov
Signed-off-by: Oleg Nesterov

Tejun Heo
2011-06-05 00:17:11 +0800
755e276b3 ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments ... Browse Code »

PTRACE_INTERRUPT is going to be added which should also skip
task_is_traced() check in ptrace_check_attach(). Rename @kill to
@ignore_state and make it bool. Add function comment while at it.

This patch doesn't introduce any behavior difference.

Signed-off-by: Tejun Heo
Signed-off-by: Oleg Nesterov

Tejun Heo
2011-06-05 00:17:10 +0800
a8f072c1d job control: rename signal->group_stop and flags to jobctl and update them ... Browse Code »

signal->group_stop currently hosts mostly group stop related flags;
however, it's gonna be used for wider purposes and the GROUP_STOP_
flag prefix becomes confusing. Rename signal->group_stop to
signal->jobctl and rename all GROUP_STOP_* flags to JOBCTL_*.

Bit position macros JOBCTL_*_BIT are defined and JOBCTL_* flags are
defined in terms of them to allow using bitops later.

While at it, reassign JOBCTL_TRAPPING to bit 22 to better accomodate
future additions.

This doesn't cause any functional change.

-v2: JOBCTL_*_BIT macros added as suggested by Linus.

Signed-off-by: Tejun Heo
Cc: Linus Torvalds
Signed-off-by: Oleg Nesterov

Tejun Heo
2011-06-05 00:17:09 +0800
0b1007c35 ptrace: remove silly wait_trap variable from ptrace_attach() ... Browse Code »

Remove local variable wait_trap which determines whether to wait for
!TRAPPING or not and simply wait for it if attach was successful.

-v2: Oleg pointed out wait should happen iff attach was successful.

Signed-off-by: Tejun Heo
Cc: Oleg Nesterov
Signed-off-by: Oleg Nesterov

Tejun Heo
2011-06-05 00:17:09 +0800

26 May, 2011

1 commit

0666fb51b ptrace: ptrace_resume() shouldn't wake up !TASK_TRACED thread ... Browse Code »

It is not clear why ptrace_resume() does wake_up_process(). Unless the
caller is PTRACE_KILL the tracee should be TASK_TRACED so we can use
wake_up_state(__TASK_TRACED). If sys_ptrace() races with SIGKILL we do
not need the extra and potentionally spurious wakeup.

If the caller is PTRACE_KILL, wake_up_process() is even more wrong.
The tracee can sleep in any state in any place, and if we have a buggy
code which doesn't handle a spurious wakeup correctly PTRACE_KILL can
be used to exploit it. For example:

int main(void)
{
int child, status;

child = fork();
if (!child) {
int ret;

assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);

ret = pause();
printf("pause: %d %m\n", ret);

return 0x23;
}

sleep(1);
assert(ptrace(PTRACE_KILL, child, 0,0) == 0);

assert(child == wait(&status));
printf("wait: %x\n", status);

return 0;
}

prints "pause: -1 Unknown error 514", -ERESTARTNOHAND leaks to the
userland. In this case sys_pause() is buggy as well and should be
fixed.

I do not know what was the original rationality behind PTRACE_KILL.
The man page is simply wrong and afaics it was always wrong. Imho
it should be deprecated, or may be it should do send_sig(SIGKILL)
as Denys suggests, but in any case I do not think that the current
behaviour was intentional.

Note: there is another problem, ptrace_resume() changes ->exit_code
and this can race with SIGKILL too. Eventually we should change ptrace
to not use ->exit_code.

Signed-off-by: Oleg Nesterov

Oleg Nesterov
2011-05-26 01:20:21 +0800

21 May, 2011

1 commit

3ed4c0583 Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc ... Browse Code »

* 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (41 commits)
signal: trivial, fix the "timespec declared inside parameter list" warning
job control: reorganize wait_task_stopped()
ptrace: fix signal->wait_chldexit usage in task_clear_group_stop_trapping()
signal: sys_sigprocmask() needs retarget_shared_pending()
signal: cleanup sys_sigprocmask()
signal: rename signandsets() to sigandnsets()
signal: do_sigtimedwait() needs retarget_shared_pending()
signal: introduce do_sigtimedwait() to factor out compat/native code
signal: sys_rt_sigtimedwait: simplify the timeout logic
signal: cleanup sys_rt_sigprocmask()
x86: signal: sys_rt_sigreturn() should use set_current_blocked()
x86: signal: handle_signal() should use set_current_blocked()
signal: sigprocmask() should do retarget_shared_pending()
signal: sigprocmask: narrow the scope of ->siglock
signal: retarget_shared_pending: optimize while_each_thread() loop
signal: retarget_shared_pending: consider shared/unblocked signals only
signal: introduce retarget_shared_pending()
ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/
signal: Turn SIGNAL_STOP_DEQUEUED into GROUP_STOP_DEQUEUED
signal: do_signal_stop: Remove the unneeded task_clear_group_stop_pending()
...

Linus Torvalds
2011-05-21 04:33:21 +0800

25 Apr, 2011

1 commit

bf26c0184 ptrace: Prepare to fix racy accesses on task breakpoints ... Browse Code »

When a task is traced and is in a stopped state, the tracer
may execute a ptrace request to examine the tracee state and
get its task struct. Right after, the tracee can be killed
and thus its breakpoints released.
This can happen concurrently when the tracer is in the middle
of reading or modifying these breakpoints, leading to dereferencing
a freed pointer.

Hence, to prepare the fix, create a generic breakpoint reference
holding API. When a reference on the breakpoints of a task is
held, the breakpoints won't be released until the last reference
is dropped. After that, no more ptrace request on the task's
breakpoints can be serviced for the tracer.

Reported-by: Oleg Nesterov
Signed-off-by: Frederic Weisbecker
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Will Deacon
Cc: Prasad
Cc: Paul Mundt
Cc: v2.6.33..
Link: http://lkml.kernel.org/r/1302284067-7860-2-git-send-email-fweisbec@gmail.com

Frederic Weisbecker
2011-04-25 23:28:24 +0800

08 Apr, 2011

1 commit

e46bc9b6f Merge branch 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into ptrace Browse Code »

Oleg Nesterov
2011-04-08 02:44:11 +0800

04 Apr, 2011

1 commit

321fb5619 ptrace: ptrace_check_attach() should not do s/STOPPED/TRACED/ ... Browse Code »

After "ptrace: Clean transitions between TASK_STOPPED and TRACED"
d79fdd6d96f46fabb779d86332e3677c6f5c2a4f, ptrace_check_attach()
should never see a TASK_STOPPED tracee and s/STOPPED/TRACED/ is
no longer legal. Add the warning.

Note: ptrace_check_attach() can be greatly simplified, in particular
it doesn't need tasklist. But I'd prefer another patch for that.

Signed-off-by: Oleg Nesterov
Signed-off-by: Tejun Heo

Oleg Nesterov
2011-04-04 08:11:05 +0800

24 Mar, 2011

1 commit

8409cca70 userns: allow ptrace from non-init user namespaces ... Browse Code »

ptrace is allowed to tasks in the same user namespace according to the
usual rules (i.e. the same rules as for two tasks in the init user
namespace). ptrace is also allowed to a user namespace to which the
current task the has CAP_SYS_PTRACE capability.

Changelog:
Dec 31: Address feedback by Eric:
. Correct ptrace uid check
. Rename may_ptrace_ns to ptrace_capable
. Also fix the cap_ptrace checks.
Jan 1: Use const cred struct
Jan 11: use task_ns_capable() in place of ptrace_capable().
Feb 23: same_or_ancestore_user_ns() was not an appropriate
check to constrain cap_issubset. Rather, cap_issubset()
only is meaningful when both capsets are in the same
user_ns.

Signed-off-by: Serge E. Hallyn
Cc: "Eric W. Biederman"
Acked-by: Daniel Lezcano
Acked-by: David Howells
Cc: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Serge E. Hallyn
2011-03-24 10:47:05 +0800

23 Mar, 2011

4 commits

0e9f0a4ab ptrace: Always put ptracee into appropriate execution state ... Browse Code »

Currently, __ptrace_unlink() wakes up the tracee iff it's in
TASK_TRACED. For unlinking from PTRACE_DETACH, this is correct as the
tracee is guaranteed to be in TASK_TRACED or dead; however, unlinking
also happens when the ptracer exits and in this case the ptracee can
be in any state and ptrace might be left running even if the group it
belongs to is stopped.

This patch updates __ptrace_unlink() such that GROUP_STOP_PENDING is
reinstated regardless of the ptracee's current state as long as it's
alive and makes sure that signal_wake_up() is called if execution
state transition is necessary.

Test case follows.

#include
#include
#include
#include
#include

static const struct timespec ts1s = { .tv_sec = 1 };

int main(void)
{
pid_t tracee;
siginfo_t si;

tracee = fork();
if (tracee == 0) {
while (1) {
nanosleep(&ts1s, NULL);
write(1, ".", 1);
}
}

ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
waitid(P_PID, tracee, &si, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
waitid(P_PID, tracee, &si, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
write(1, "exiting", 7);
return 0;
}

Before the patch, after the parent process exits, the child is left
running and prints out "." every second.

exiting..... (continues)

After the patch, the group stop initiated by the implied SIGSTOP from
PTRACE_ATTACH is re-established when the parent exits.

exiting

Signed-off-by: Tejun Heo
Reported-by: Oleg Nesterov
Acked-by: Oleg Nesterov

Tejun Heo
2011-03-23 17:37:01 +0800
e3bd058f6 ptrace: Collapse ptrace_untrace() into __ptrace_unlink() ... Browse Code »

Remove the extra task_is_traced() check in __ptrace_unlink() and
collapse ptrace_untrace() into __ptrace_unlink(). This is to prepare
for further changes.

While at it, drop the comment on top of ptrace_untrace() and convert
__ptrace_unlink() comment to docbook format. Detailed comment will be
added by the next patch.

This patch doesn't cause any visible behavior changes.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov

Tejun Heo
2011-03-23 17:37:01 +0800
d79fdd6d9 ptrace: Clean transitions between TASK_STOPPED and TRACED ... Browse Code »

Currently, if the task is STOPPED on ptrace attach, it's left alone
and the state is silently changed to TRACED on the next ptrace call.
The behavior breaks the assumption that arch_ptrace_stop() is called
before any task is poked by ptrace and is ugly in that a task
manipulates the state of another task directly.

With GROUP_STOP_PENDING, the transitions between TASK_STOPPED and
TRACED can be made clean. The tracer can use the flag to tell the
tracee to retry stop on attach and detach. On retry, the tracee will
enter the desired state in the correct way. The lower 16bits of
task->group_stop is used to remember the signal number which caused
the last group stop. This is used while retrying for ptrace attach as
the original group_exit_code could have been consumed with wait(2) by
then.

As the real parent may wait(2) and consume the group_exit_code
anytime, the group_exit_code needs to be saved separately so that it
can be used when switching from regular sleep to ptrace_stop(). This
is recorded in the lower 16bits of task->group_stop.

If a task is already stopped and there's no intervening SIGCONT, a
ptrace request immediately following a successful PTRACE_ATTACH should
always succeed even if the tracer doesn't wait(2) for attach
completion; however, with this change, the tracee might still be
TASK_RUNNING trying to enter TASK_TRACED which would cause the
following request to fail with -ESRCH.

This intermediate state is hidden from the ptracer by setting
GROUP_STOP_TRAPPING on attach and making ptrace_check_attach() wait
for it to clear on its signal->wait_chldexit. Completing the
transition or getting killed clears TRAPPING and wakes up the tracer.

Note that the STOPPED -> RUNNING -> TRACED transition is still visible
to other threads which are in the same group as the ptracer and the
reverse transition is visible to all. Please read the comments for
details.

Oleg:

* Spotted a race condition where a task may retry group stop without
proper bookkeeping. Fixed by redoing bookkeeping on retry.

* Spotted that the transition is visible to userland in several
different ways. Most are fixed with GROUP_STOP_TRAPPING. Unhandled
corner case is documented.

* Pointed out not setting GROUP_STOP_SIGMASK on an already stopped
task would result in more consistent behavior.

* Pointed out that calling ptrace_stop() from do_signal_stop() in
TASK_STOPPED can race with group stop start logic and then confuse
the TRAPPING wait in ptrace_check_attach(). ptrace_stop() is now
called with TASK_RUNNING.

* Suggested using signal->wait_chldexit instead of bit wait.

* Spotted a race condition between TRACED transition and clearing of
TRAPPING.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov
Cc: Roland McGrath
Cc: Jan Kratochvil

Tejun Heo
2011-03-23 17:37:00 +0800
9f2bf6513 ptrace: Remove the extra wake_up_state() from ptrace_detach() ... Browse Code »

This wake_up_state() has a turbulent history. This is a remnant from
ancient ptrace implementation and patently wrong. Commit 95a3540d
(ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic) removed
it but the change was reverted later by commit edaba2c5 (ptrace:
revert "ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic")
citing compatibility breakage and general brokeness of the whole group
stop / ptrace interaction. Then, recently, it got converted from
wake_up_process() to wake_up_state() to make it less dangerous.

Digging through the mailing archives, the compatibility breakage
doesn't seem to be critical in the sense that the behavior isn't well
defined or reliable to begin with and it seems to have been agreed to
remove the wakeup with proper cleanup of the whole thing.

Now that the group stop and its interaction with ptrace are being
cleaned up, it's high time to finally kill this silliness.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov
Cc: Roland McGrath

Tejun Heo
2011-03-23 17:37:00 +0800

05 Mar, 2011

1 commit

e3e89cc53 Mark ptrace_{traceme,attach,detach} static ... Browse Code »

They are only used inside kernel/ptrace.c, and have been for a long
time. We don't want to go back to the bad-old-days when architectures
did things on their own, so make them static and private.

Signed-off-by: Linus Torvalds

Linus Torvalds
2011-03-05 01:23:30 +0800

12 Feb, 2011

1 commit

01e05e9a9 ptrace: use safer wake up on ptrace_detach() ... Browse Code »

The wake_up_process() call in ptrace_detach() is spurious and not
interlocked with the tracee state. IOW, the tracee could be running or
sleeping in any place in the kernel by the time wake_up_process() is
called. This can lead to the tracee waking up unexpectedly which can be
dangerous.

The wake_up is spurious and should be removed but for now reduce its
toxicity by only waking up if the tracee is in TRACED or STOPPED state.

This bug can possibly be used as an attack vector. I don't think it
will take too much effort to come up with an attack which triggers oops
somewhere. Most sleeps are wrapped in condition test loops and should
be safe but we have quite a number of places where sleep and wakeup
conditions are expected to be interlocked. Although the window of
opportunity is tiny, ptrace can be used by non-privileged users and with
some loading the window can definitely be extended and exploited.

Signed-off-by: Tejun Heo
Acked-by: Roland McGrath
Acked-by: Oleg Nesterov
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tejun Heo
2011-02-12 08:12:19 +0800

28 Oct, 2010

4 commits

9b1bf12d5 signals: move cred_guard_mutex from task_struct to signal_struct ... Browse Code »

Oleg Nesterov pointed out we have to prevent multiple-threads-inside-exec
itself and we can reuse ->cred_guard_mutex for it. Yes, concurrent
execve() has no worth.

Let's move ->cred_guard_mutex from task_struct to signal_struct. It
naturally prevent multiple-threads-inside-exec.

Signed-off-by: KOSAKI Motohiro
Reviewed-by: Oleg Nesterov
Acked-by: Roland McGrath
Acked-by: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2010-10-28 09:03:12 +0800
9fed81dc4 ptrace: cleanup ptrace_request() ... Browse Code »

Use new 'datavp' and 'datalp' variables to remove unnecesary castings.

Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-28 09:03:10 +0800
4abf98696 ptrace: change signature of sys_ptrace() and friends ... Browse Code »

Since userspace API of ptrace syscall defines @addr and @data as void
pointers, it would be more appropriate to define them as unsigned long in
kernel. Therefore related functions are changed also.

'unsigned long' is typically used in other places in kernel as an opaque
data type and that using this helps cleaning up a lot of warnings from
sparse.

Suggested-by: Arnd Bergmann
Signed-off-by: Namhyung Kim
Acked-by: Arnd Bergmann
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-28 09:03:10 +0800
c4b5ed250 ptrace: annotate lock context change on exit_ptrace() ... Browse Code »

exit_ptrace() releases and regrabs tasklist_lock but was missing proper
annotation. Add it.

Signed-off-by: Namhyung Kim
Acked-by: Roland McGrath
Cc: Ingo Molnar
Cc: Oleg Nesterov
Signed-off-by: Linus Torvalds

Namhyung Kim
2010-10-28 09:03:10 +0800

11 Aug, 2010

1 commit

c7e49c148 ptrace: optimize exit_ptrace() for the likely case ... Browse Code »

exit_ptrace() takes tasklist_lock unconditionally. We need this lock to
avoid the race with ptrace_traceme(), it acts as a barrier.

Change its caller, forget_original_parent(), to call exit_ptrace() under
tasklist_lock. Change exit_ptrace() to drop and reacquire this lock if
needed.

This allows us to add the fastpath list_empty(ptraced) check. In the
likely no-tracees case exit_ptrace() just returns and we avoid the lock()
+ unlock() sequence.

"Zhang, Yanmin" suggested to add this
check, and he reports that this change adds about 11% improvement in some
tests.

Suggested-and-tested-by: "Zhang, Yanmin"
Signed-off-by: Oleg Nesterov
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2010-08-11 23:59:19 +0800

28 May, 2010

2 commits

e0129ef91 ptrace: PTRACE_GETFDPIC: fix the unsafe usage of child->mm ... Browse Code »

Now that Mike Frysinger unified the FDPIC ptrace code, we can fix the
unsafe usage of child->mm in ptrace_request(PTRACE_GETFDPIC).

We have the reference to task_struct, and ptrace_check_attach() verified
the tracee is stopped. But nothing can protect from SIGKILL after that,
we must not assume child->mm != NULL.

Signed-off-by: Oleg Nesterov
Acked-by: Mike Frysinger
Acked-by: David Howells
Cc: Paul Mundt
Cc: Greg Ungerer
Acked-by: Roland McGrath
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2010-05-28 00:12:44 +0800
9c1a12592 ptrace: unify FDPIC implementations ... Browse Code »

The Blackfin/FRV/SuperH guys all have the same exact FDPIC ptrace code in
their arch handlers (since they were probably copied & pasted). Since
these ptrace interfaces are an arch independent aspect of the FDPIC code,
unify them in the common ptrace code so new FDPIC ports don't need to copy
and paste this fundamental stuff yet again.

Signed-off-by: Mike Frysinger
Acked-by: Roland McGrath
Acked-by: David Howells
Acked-by: Paul Mundt
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mike Frysinger
2010-05-28 00:12:44 +0800

18 May, 2010

1 commit

4d7b4ac22 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/… ... Browse Code »

…git/tip/linux-2.6-tip

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits)
perf tools: Add mode to build without newt support
perf symbols: symbol inconsistency message should be done only at verbose=1
perf tui: Add explicit -lslang option
perf options: Type check all the remaining OPT_ variants
perf options: Type check OPT_BOOLEAN and fix the offenders
perf options: Check v type in OPT_U?INTEGER
perf options: Introduce OPT_UINTEGER
perf tui: Add workaround for slang < 2.1.4
perf record: Fix bug mismatch with -c option definition
perf options: Introduce OPT_U64
perf tui: Add help window to show key associations
perf tui: Make <- exit menus too
perf newt: Add single key shortcuts for zoom into DSO and threads
perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed
perf newt: Fix the 'A'/'a' shortcut for annotate
perf newt: Make <- exit the ui_browser
x86, perf: P4 PMU - fix counters management logic
perf newt: Make <- zoom out filters
perf report: Report number of events, not samples
perf hist: Clarify events_stats fields usage
...

Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c

Linus Torvalds
2010-05-18 23:19:03 +0800

27 Apr, 2010

1 commit

b8bc1389b ptrace: Cleanup useless header ... Browse Code »

BKL isn't present anymore into this file thus we can safely remove
smp_lock.h inclusion.

Signed-off-by: Alessio Igor Bogani
Cc: Roland McGrath
Cc: Oleg Nesterov
Cc: Andrew Morton
Cc: James Morris
Cc: Ingo Molnar
Signed-off-by: Frederic Weisbecker

Alessio Igor Bogani
2010-04-27 05:42:51 +0800

10 Apr, 2010

1 commit

5534ecb2d ptrace: kill BKL in ptrace syscall ... Browse Code »

The comment suggests that this usage is stale. There is no bkl in the
exec path so if there is a race lurking there, the bkl in ptrace is
not going to help in this regard.

Overview of the possibility of "accidental" races this bkl might
protect:

- ptrace_traceme() is protected against task removal and concurrent
read/write on current->ptrace as it locks write tasklist_lock.

- arch_ptrace_attach() is serialized by ptrace_traceme() against
concurrent PTRACE_TRACEME or PTRACE_ATTACH

- ptrace_attach() is protected the same way ptrace_traceme() and
in turn serializes arch_ptrace_attach()

- ptrace_check_attach() does its own well described serializing too.

There is no obvious race here.

Signed-off-by: Arnd Bergmann
Signed-off-by: Frederic Weisbecker
Acked-by: Oleg Nesterov
Acked-by: Roland McGrath
Cc: Andrew Morton
Cc: Roland McGrath

Arnd Bergmann
2010-04-10 21:34:21 +0800