Doug / smarc-fsl-linux-kernel | Embedian Git Server

23 Mar, 2011

40 commits

45cb24a1d job control: Allow access to job control events through ptracees ... Browse Code »

Currently a real parent can't access job control stopped/continued
events through a ptraced child. This utterly breaks job control when
the children are ptraced.

For example, if a program is run from an interactive shell and then
strace(1) attaches to it, pressing ^Z would send SIGTSTP and strace(1)
would notice it but the shell has no way to tell whether the child
entered job control stop and thus can't tell when to take over the
terminal - leading to awkward lone ^Z on the terminal.

Because the job control and ptrace stopped states are independent,
there is no reason to prevent real parents from accessing the stopped
state regardless of ptrace. The continued state isn't separate but
ptracers don't have any use for them as ptracees can never resume
without explicit command from their ptracers, so as long as ptracers
don't consume it, it should be fine.

Although this is a behavior change, because the previous behavior is
utterly broken when viewed from real parents and the change is only
visible to real parents, I don't think it's necessary to make this
behavior optional.

One situation to be careful about is when a task from the real
parent's group is ptracing. The parent group is the recipient of both
ptrace and job control stop events and one stop can be reported as
both job control and ptrace stops. As this can break the current
ptrace users, suppress job control stopped events for these cases.

If a real parent ptracer wants to know about both job control and
ptrace stops, it can create a separate process to serve the role of
real parent.

Note that this only updates wait(2) side of things. The real parent
can access the states via wait(2) but still is not properly notified
(woken up and delivered signal). Test case polls wait(2) with WNOHANG
to work around. Notification will be updated by future patches.

Test case follows.

#include
#include
#include
#include
#include
#include
#include

int main(void)
{
const struct timespec ts100ms = { .tv_nsec = 100000000 };
pid_t tracee, tracer;
siginfo_t si;
int i;

tracee = fork();
if (tracee == 0) {
while (1) {
printf("tracee: SIGSTOP\n");
raise(SIGSTOP);
nanosleep(&ts100ms, NULL);
printf("tracee: SIGCONT\n");
raise(SIGCONT);
nanosleep(&ts100ms, NULL);
}
}

waitid(P_PID, tracee, &si, WSTOPPED | WNOHANG | WNOWAIT);

tracer = fork();
if (tracer == 0) {
nanosleep(&ts100ms, NULL);
ptrace(PTRACE_ATTACH, tracee, NULL, NULL);

for (i = 0; i < 11; i++) {
si.si_pid = 0;
waitid(P_PID, tracee, &si, WSTOPPED);
if (si.si_pid && si.si_code == CLD_TRAPPED)
ptrace(PTRACE_CONT, tracee, NULL,
(void *)(long)si.si_status);
}
printf("tracer: EXITING\n");
return 0;
}

while (1) {
si.si_pid = 0;
waitid(P_PID, tracee, &si,
WSTOPPED | WCONTINUED | WEXITED | WNOHANG);
if (si.si_pid)
printf("mommy : WAIT status=%02d code=%02d\n",
si.si_status, si.si_code);
nanosleep(&ts100ms, NULL);
}
return 0;
}

Before the patch, while ptraced, the parent can't see any job control
events.

tracee: SIGSTOP
mommy : WAIT status=19 code=05
tracee: SIGCONT
tracee: SIGSTOP
tracee: SIGCONT
tracee: SIGSTOP
tracee: SIGCONT
tracee: SIGSTOP
tracer: EXITING
mommy : WAIT status=19 code=05
^C

After the patch,

tracee: SIGSTOP
mommy : WAIT status=19 code=05
tracee: SIGCONT
mommy : WAIT status=18 code=06
tracee: SIGSTOP
mommy : WAIT status=19 code=05
tracee: SIGCONT
mommy : WAIT status=18 code=06
tracee: SIGSTOP
mommy : WAIT status=19 code=05
tracee: SIGCONT
mommy : WAIT status=18 code=06
tracee: SIGSTOP
tracer: EXITING
mommy : WAIT status=19 code=05
^C

-v2: Oleg pointed out that wait(2) should be suppressed for the real
parent's group instead of only the real parent task itself.
Updated accordingly.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov

Tejun Heo
2011-03-23 17:37:01 +0800
9b84cca25 job control: Fix ptracer wait(2) hang and explain notask_error clearing ... Browse Code »

wait(2) and friends allow access to stopped/continued states through
zombies, which is required as the states are process-wide and should
be accessible whether the leader task is alive or undead.
wait_consider_task() implements this by always clearing notask_error
and going through wait_task_stopped/continued() for unreaped zombies.

However, while ptraced, the stopped state is per-task and as such if
the ptracee became a zombie, there's no further stopped event to
listen to and wait(2) and friends should return -ECHILD on the tracee.

Fix it by clearing notask_error only if WCONTINUED | WEXITED is set
for ptraced zombies. While at it, document why clearing notask_error
is safe for each case.

Test case follows.

#include
#include
#include
#include
#include
#include
#include

static void *nooper(void *arg)
{
pause();
return NULL;
}

int main(void)
{
const struct timespec ts1s = { .tv_sec = 1 };
pid_t tracee, tracer;
siginfo_t si;

tracee = fork();
if (tracee == 0) {
pthread_t thr;

pthread_create(&thr, NULL, nooper, NULL);
nanosleep(&ts1s, NULL);
printf("tracee exiting\n");
pthread_exit(NULL); /* let subthread run */
}

tracer = fork();
if (tracer == 0) {
ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
while (1) {
if (waitid(P_PID, tracee, &si, WSTOPPED) < 0) {
perror("waitid");
break;
}
ptrace(PTRACE_CONT, tracee, NULL,
(void *)(long)si.si_status);
}
return 0;
}

waitid(P_PID, tracer, &si, WEXITED);
kill(tracee, SIGKILL);
return 0;
}

Before the patch, after the tracee becomes a zombie, the tracer's
waitid(WSTOPPED) never returns and the program doesn't terminate.

tracee exiting
^C

After the patch, tracee exiting triggers waitid() to fail.

tracee exiting
waitid: No child processes

-v2: Oleg pointed out that exited in addition to continued can happen
for ptraced dead group leader. Clear notask_error for ptraced
child on WEXITED too.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov

Tejun Heo
2011-03-23 17:37:01 +0800
823b018e5 job control: Small reorganization of wait_consider_task() ... Browse Code »

Move EXIT_DEAD test in wait_consider_task() above ptrace check. As
ptraced tasks can't be EXIT_DEAD, this change doesn't cause any
behavior change. This is to prepare for further changes.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov

Tejun Heo
2011-03-23 17:37:01 +0800
408a37de6 job control: Don't set group_stop exit_code if re-entering job control stop ... Browse Code »

While ptraced, a task may be resumed while the containing process is
still job control stopped. If the task receives another stop signal
in this state, it will still initiate group stop, which generates
group_exit_code, which the real parent would be able to see once the
ptracer detaches.

In this scenario, the real parent may see two consecutive CLD_STOPPED
events from two stop signals without intervening SIGCONT, which
normally is impossible.

Test case follows.

#include
#include
#include
#include

int main(void)
{
pid_t tracee;
siginfo_t si;

tracee = fork();
if (!tracee)
while (1)
pause();

kill(tracee, SIGSTOP);
waitid(P_PID, tracee, &si, WSTOPPED);

if (!fork()) {
ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
waitid(P_PID, tracee, &si, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
waitid(P_PID, tracee, &si, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
waitid(P_PID, tracee, &si, WSTOPPED);
ptrace(PTRACE_DETACH, tracee, NULL, NULL);
return 0;
}

while (1) {
si.si_pid = 0;
waitid(P_PID, tracee, &si, WSTOPPED | WNOHANG);
if (si.si_pid)
printf("st=%02d c=%02d\n", si.si_status, si.si_code);
}
return 0;
}

Before the patch, the latter waitid() in polling mode reports the
second stopped event generated by the implied SIGSTOP of
PTRACE_ATTACH.

st=19 c=05
^C

After the patch, the second event is not reported.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov

Tejun Heo
2011-03-23 17:37:01 +0800
0e9f0a4ab ptrace: Always put ptracee into appropriate execution state ... Browse Code »

Currently, __ptrace_unlink() wakes up the tracee iff it's in
TASK_TRACED. For unlinking from PTRACE_DETACH, this is correct as the
tracee is guaranteed to be in TASK_TRACED or dead; however, unlinking
also happens when the ptracer exits and in this case the ptracee can
be in any state and ptrace might be left running even if the group it
belongs to is stopped.

This patch updates __ptrace_unlink() such that GROUP_STOP_PENDING is
reinstated regardless of the ptracee's current state as long as it's
alive and makes sure that signal_wake_up() is called if execution
state transition is necessary.

Test case follows.

#include
#include
#include
#include
#include

static const struct timespec ts1s = { .tv_sec = 1 };

int main(void)
{
pid_t tracee;
siginfo_t si;

tracee = fork();
if (tracee == 0) {
while (1) {
nanosleep(&ts1s, NULL);
write(1, ".", 1);
}
}

ptrace(PTRACE_ATTACH, tracee, NULL, NULL);
waitid(P_PID, tracee, &si, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
waitid(P_PID, tracee, &si, WSTOPPED);
ptrace(PTRACE_CONT, tracee, NULL, (void *)(long)si.si_status);
write(1, "exiting", 7);
return 0;
}

Before the patch, after the parent process exits, the child is left
running and prints out "." every second.

exiting..... (continues)

After the patch, the group stop initiated by the implied SIGSTOP from
PTRACE_ATTACH is re-established when the parent exits.

exiting

Signed-off-by: Tejun Heo
Reported-by: Oleg Nesterov
Acked-by: Oleg Nesterov

Tejun Heo
2011-03-23 17:37:01 +0800
e3bd058f6 ptrace: Collapse ptrace_untrace() into __ptrace_unlink() ... Browse Code »

Remove the extra task_is_traced() check in __ptrace_unlink() and
collapse ptrace_untrace() into __ptrace_unlink(). This is to prepare
for further changes.

While at it, drop the comment on top of ptrace_untrace() and convert
__ptrace_unlink() comment to docbook format. Detailed comment will be
added by the next patch.

This patch doesn't cause any visible behavior changes.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov

Tejun Heo
2011-03-23 17:37:01 +0800
d79fdd6d9 ptrace: Clean transitions between TASK_STOPPED and TRACED ... Browse Code »

Currently, if the task is STOPPED on ptrace attach, it's left alone
and the state is silently changed to TRACED on the next ptrace call.
The behavior breaks the assumption that arch_ptrace_stop() is called
before any task is poked by ptrace and is ugly in that a task
manipulates the state of another task directly.

With GROUP_STOP_PENDING, the transitions between TASK_STOPPED and
TRACED can be made clean. The tracer can use the flag to tell the
tracee to retry stop on attach and detach. On retry, the tracee will
enter the desired state in the correct way. The lower 16bits of
task->group_stop is used to remember the signal number which caused
the last group stop. This is used while retrying for ptrace attach as
the original group_exit_code could have been consumed with wait(2) by
then.

As the real parent may wait(2) and consume the group_exit_code
anytime, the group_exit_code needs to be saved separately so that it
can be used when switching from regular sleep to ptrace_stop(). This
is recorded in the lower 16bits of task->group_stop.

If a task is already stopped and there's no intervening SIGCONT, a
ptrace request immediately following a successful PTRACE_ATTACH should
always succeed even if the tracer doesn't wait(2) for attach
completion; however, with this change, the tracee might still be
TASK_RUNNING trying to enter TASK_TRACED which would cause the
following request to fail with -ESRCH.

This intermediate state is hidden from the ptracer by setting
GROUP_STOP_TRAPPING on attach and making ptrace_check_attach() wait
for it to clear on its signal->wait_chldexit. Completing the
transition or getting killed clears TRAPPING and wakes up the tracer.

Note that the STOPPED -> RUNNING -> TRACED transition is still visible
to other threads which are in the same group as the ptracer and the
reverse transition is visible to all. Please read the comments for
details.

Oleg:

* Spotted a race condition where a task may retry group stop without
proper bookkeeping. Fixed by redoing bookkeeping on retry.

* Spotted that the transition is visible to userland in several
different ways. Most are fixed with GROUP_STOP_TRAPPING. Unhandled
corner case is documented.

* Pointed out not setting GROUP_STOP_SIGMASK on an already stopped
task would result in more consistent behavior.

* Pointed out that calling ptrace_stop() from do_signal_stop() in
TASK_STOPPED can race with group stop start logic and then confuse
the TRAPPING wait in ptrace_check_attach(). ptrace_stop() is now
called with TASK_RUNNING.

* Suggested using signal->wait_chldexit instead of bit wait.

* Spotted a race condition between TRACED transition and clearing of
TRAPPING.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov
Cc: Roland McGrath
Cc: Jan Kratochvil

Tejun Heo
2011-03-23 17:37:00 +0800
5224fa366 ptrace: Make do_signal_stop() use ptrace_stop() if the task is being ptraced ... Browse Code »

A ptraced task would still stop at do_signal_stop() when it's stopping
for stop signals and do_signal_stop() behaves the same whether the
task is ptraced or not. However, in addition to stopping,
ptrace_stop() also does ptrace specific stuff like calling
architecture specific callbacks, so this behavior makes the code more
fragile and difficult to understand.

This patch makes do_signal_stop() test whether the task is ptraced and
use ptrace_stop() if so. This renders tracehook_notify_jctl() rather
pointless as the ptrace notification is now handled by ptrace_stop()
regardless of the return value from the tracehook. It probably is a
good idea to update it.

This doesn't solve the whole problem as tasks already in stopped state
would stay in the regular stop when ptrace attached. That part will
be handled by the next patch.

Oleg pointed out that this makes a userland-visible change. Before,
SIGCONT would be able to wake up a task in group stop even if the task
is ptraced if the tracer hasn't issued another ptrace command
afterwards (as the next ptrace commands transitions the state into
TASK_TRACED which ignores SIGCONT wakeups). With this and the next
patch, SIGCONT may race with the transition into TASK_TRACED and is
ignored if the tracee already entered TASK_TRACED.

Another userland visible change of this and the next patch is that the
ptracee's state would now be TASK_TRACED where it used to be
TASK_STOPPED, which is visible via fs/proc.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov
Cc: Roland McGrath
Cc: Jan Kratochvil

Tejun Heo
2011-03-23 17:37:00 +0800
0ae8ce1c8 ptrace: Participate in group stop from ptrace_stop() iff the task is trapping for group stop ... Browse Code »

Currently, ptrace_stop() unconditionally participates in group stop
bookkeeping. This is unnecessary and inaccurate. Make it only
participate if the task is trapping for group stop - ie. if @why is
CLD_STOPPED. As ptrace_stop() currently is not used when trapping for
group stop, this equals to disabling group stop participation from
ptrace_stop().

A visible behavior change is increased likelihood of delayed group
stop completion if the thread group contains one or more ptraced
tasks.

This is to preapre for further cleanup of the interaction between
group stop and ptrace.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov
Cc: Roland McGrath

Tejun Heo
2011-03-23 17:37:00 +0800
39efa3ef3 signal: Use GROUP_STOP_PENDING to stop once for a single group stop ... Browse Code »

Currently task->signal->group_stop_count is used to decide whether to
stop for group stop. However, if there is a task in the group which
is taking a long time to stop, other tasks which are continued by
ptrace would repeatedly stop for the same group stop until the group
stop is complete.

Conversely, if a ptraced task is in TASK_TRACED state, the debugger
won't get notified of group stops which is inconsistent compared to
the ptraced task in any other state.

This patch introduces GROUP_STOP_PENDING which tracks whether a task
is yet to stop for the group stop in progress. The flag is set when a
group stop starts and cleared when the task stops the first time for
the group stop, and consulted whenever whether the task should
participate in a group stop needs to be determined. Note that now
tasks in TASK_TRACED also participate in group stop.

This results in the following behavior changes.

* For a single group stop, a ptracer would see at most one stop
reported.

* A ptracee in TASK_TRACED now also participates in group stop and the
tracer would get the notification. However, as a ptraced task could
be in TASK_STOPPED state or any ptrace trap could consume group
stop, the notification may still be missing. These will be
addressed with further patches.

* A ptracee may start a group stop while one is still in progress if
the tracer let it continue with stop signal delivery. Group stop
code handles this correctly.

Oleg:

* Spotted that a task might skip signal check even when its
GROUP_STOP_PENDING is set. Fixed by updating
recalc_sigpending_tsk() to check GROUP_STOP_PENDING instead of
group_stop_count.

* Pointed out that task->group_stop should be cleared whenever
task->signal->group_stop_count is cleared. Fixed accordingly.

* Pointed out the behavior inconsistency between TASK_TRACED and
RUNNING and the last behavior change.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov
Cc: Roland McGrath

Tejun Heo
2011-03-23 17:37:00 +0800
e5c1902e9 signal: Fix premature completion of group stop when interfered by ptrace ... Browse Code »

task->signal->group_stop_count is used to track the progress of group
stop. It's initialized to the number of tasks which need to stop for
group stop to finish and each stopping or trapping task decrements.
However, each task doesn't keep track of whether it decremented the
counter or not and if woken up before the group stop is complete and
stops again, it can decrement the counter multiple times.

Please consider the following example code.

static void *worker(void *arg)
{
while (1) ;
return NULL;
}

int main(void)
{
pthread_t thread;
pid_t pid;
int i;

pid = fork();
if (!pid) {
for (i = 0; i < 5; i++)
pthread_create(&thread, NULL, worker, NULL);
while (1) ;
return 0;
}

ptrace(PTRACE_ATTACH, pid, NULL, NULL);
while (1) {
waitid(P_PID, pid, NULL, WSTOPPED);
ptrace(PTRACE_SINGLESTEP, pid, NULL, (void *)(long)SIGSTOP);
}
return 0;
}

The child creates five threads and the parent continuously traps the
first thread and whenever the child gets a signal, SIGSTOP is
delivered. If an external process sends SIGSTOP to the child, all
other threads in the process should reliably stop. However, due to
the above bug, the first thread will often end up consuming
group_stop_count multiple times and SIGSTOP often ends up stopping
none or part of the other four threads.

This patch adds a new field task->group_stop which is protected by
siglock and uses GROUP_STOP_CONSUME flag to track which task is still
to consume group_stop_count to fix this bug.

task_clear_group_stop_pending() and task_participate_group_stop() are
added to help manipulating group stop states. As ptrace_stop() now
also uses task_participate_group_stop(), it will set
SIGNAL_STOP_STOPPED if it completes a group stop.

There still are many issues regarding the interaction between group
stop and ptrace. Patches to address them will follow.

- Oleg spotted duplicate GROUP_STOP_CONSUME. Dropped.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov
Cc: Roland McGrath

Tejun Heo
2011-03-23 17:37:00 +0800
fe1bc6a09 ptrace: Add @why to ptrace_stop() ... Browse Code »

To prepare for cleanup of the interaction between group stop and
ptrace, add @why to ptrace_stop(). Existing users are updated such
that there is no behavior change.

Signed-off-by: Tejun Heo
Acked-by: Roland McGrath

Tejun Heo
2011-03-23 17:37:00 +0800
edf2ed153 ptrace: Kill tracehook_notify_jctl() ... Browse Code »

tracehook_notify_jctl() aids in determining whether and what to report
to the parent when a task is stopped or continued. The function also
adds an extra requirement that siglock may be released across it,
which is currently unused and quite difficult to satisfy in
well-defined manner.

As job control and the notifications are about to receive major
overhaul, remove the tracehook and open code it. If ever necessary,
let's factor it out after the overhaul.

* Oleg spotted incorrect CLD_CONTINUED/STOPPED selection when ptraced.
Fixed.

Signed-off-by: Tejun Heo
Cc: Oleg Nesterov
Cc: Roland McGrath

Tejun Heo
2011-03-23 17:37:00 +0800
71db5eb99 signal: Remove superflous try_to_freeze() loop in do_signal_stop() ... Browse Code »

do_signal_stop() is used only by get_signal_to_deliver() and after a
successful signal stop, it always calls try_to_freeze(), so the
try_to_freeze() loop around schedule() in do_signal_stop() is
superflous and confusing. Remove it.

Signed-off-by: Tejun Heo
Acked-by: Rafael J. Wysocki
Acked-by: Oleg Nesterov
Acked-by: Roland McGrath

Tejun Heo
2011-03-23 17:37:00 +0800
9f2bf6513 ptrace: Remove the extra wake_up_state() from ptrace_detach() ... Browse Code »

This wake_up_state() has a turbulent history. This is a remnant from
ancient ptrace implementation and patently wrong. Commit 95a3540d
(ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic) removed
it but the change was reverted later by commit edaba2c5 (ptrace:
revert "ptrace_detach: the wrong wakeup breaks the ERESTARTxxx logic")
citing compatibility breakage and general brokeness of the whole group
stop / ptrace interaction. Then, recently, it got converted from
wake_up_process() to wake_up_state() to make it less dangerous.

Digging through the mailing archives, the compatibility breakage
doesn't seem to be critical in the sense that the behavior isn't well
defined or reliable to begin with and it seems to have been agreed to
remove the wakeup with proper cleanup of the whole thing.

Now that the group stop and its interaction with ptrace are being
cleaned up, it's high time to finally kill this silliness.

Signed-off-by: Tejun Heo
Acked-by: Oleg Nesterov
Cc: Roland McGrath

Tejun Heo
2011-03-23 17:37:00 +0800
c672af35d signal: Fix SIGCONT notification code ... Browse Code »

After a task receives SIGCONT, its parent is notified via SIGCHLD with
its siginfo describing what the notified event is. If SIGCONT is
received while the child process is stopped, the code should be
CLD_CONTINUED. If SIGCONT is recieved while the child process is in
the process of being stopped, it should be CLD_STOPPED. Which code to
use is determined in prepare_signal() and recorded in signal->flags
using SIGNAL_CLD_CONTINUED|STOP flags.

get_signal_deliver() should test these flags and then notify
accoringly; however, it incorrectly tested SIGNAL_STOP_CONTINUED
instead of SIGNAL_CLD_CONTINUED, thus incorrectly notifying
CLD_CONTINUED if the signal is delivered before the task is wait(2)ed
and CLD_STOPPED if the state was fetched already.

Fix it by testing SIGNAL_CLD_CONTINUED. While at it, uncompress the
?: test into if/else clause for better readability.

Signed-off-by: Tejun Heo
Reviewed-by: Oleg Nesterov
Acked-by: Roland McGrath

Tejun Heo
2011-03-23 17:36:59 +0800
6447f55da Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx ... Browse Code »

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (66 commits)
avr32: at32ap700x: fix typo in DMA master configuration
dmaengine/dmatest: Pass timeout via module params
dma: let IMX_DMA depend on IMX_HAVE_DMA_V1 instead of an explicit list of SoCs
fsldma: make halt behave nicely on all supported controllers
fsldma: reduce locking during descriptor cleanup
fsldma: support async_tx dependencies and automatic unmapping
fsldma: fix controller lockups
fsldma: minor codingstyle and consistency fixes
fsldma: improve link descriptor debugging
fsldma: use channel name in printk output
fsldma: move related helper functions near each other
dmatest: fix automatic buffer unmap type
drivers, pch_dma: Fix warning when CONFIG_PM=n.
dmaengine/dw_dmac fix: use readl & writel instead of __raw_readl & __raw_writel
avr32: at32ap700x: Specify DMA Flow Controller, Src and Dst msize
dw_dmac: Setting Default Burst length for transfers as 16.
dw_dmac: Allow src/dst msize & flow controller to be configured at runtime
dw_dmac: Changing type of src_master and dest_master to u8.
dw_dmac: Pass Channel Priority from platform_data
dw_dmac: Pass Channel Allocation Order from platform_data
...

Linus Torvalds
2011-03-23 08:53:13 +0800
c50e3f512 bloat-o-meter: include read-only data section in report ... Browse Code »

I'm not sure why the read-only data section is excluded from the report,
it seems as relevant as the other data sections (b and d).

I've stripped the symbols starting with __mod_ as they can have their
names dynamically generated and thus comparison between binaries is not
possible.

Signed-off-by: Jean Delvare
Cc: Andi Kleen
Acked-by: Nathan Lynch
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jean Delvare
2011-03-23 08:44:17 +0800
565d76cb7 zlib: slim down zlib_deflate() workspace when possible ... Browse Code »

Instead of always creating a huge (268K) deflate_workspace with the
maximum compression parameters (windowBits=15, memLevel=8), allow the
caller to obtain a smaller workspace by specifying smaller parameter
values.

For example, when capturing oops and panic reports to a medium with
limited capacity, such as NVRAM, compression may be the only way to
capture the whole report. In this case, a small workspace (24K works
fine) is a win, whether you allocate the workspace when you need it (i.e.,
during an oops or panic) or at boot time.

I've verified that this patch works with all accepted values of windowBits
(positive and negative), memLevel, and compression level.

Signed-off-by: Jim Keniston
Cc: Herbert Xu
Cc: David Miller
Cc: Chris Mason
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jim Keniston
2011-03-23 08:44:17 +0800
b12d12596 fs/devpts/inode.c: correctly check d_alloc_name() return code in devpts_pty_new() ... Browse Code »

d_alloc_name return NULL in case error, but we expect errno in
devpts_pty_new.

Addresses http://bugzilla.openvz.org/show_bug.cgi?id=1758

Signed-off-by: Andrey Vagin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrey Vagin
2011-03-23 08:44:17 +0800
e91f90bb0 aio: wake all waiters when destroying ctx ... Browse Code »

The test program below will hang because io_getevents() uses
add_wait_queue_exclusive(), which means the wake_up() in io_destroy() only
wakes up one of the threads. Fix this by using wake_up_all() in the aio
code paths where we want to make sure no one gets stuck.

// t.c -- compile with gcc -lpthread -laio t.c

#include
#include
#include
#include

static const int nthr = 2;

void *getev(void *ctx)
{
struct io_event ev;
io_getevents(ctx, 1, 1, &ev, NULL);
printf("io_getevents returned\n");
return NULL;
}

int main(int argc, char *argv[])
{
io_context_t ctx = 0;
pthread_t thread[nthr];
int i;

io_setup(1024, &ctx);

for (i = 0; i < nthr; ++i)
pthread_create(&thread[i], NULL, getev, ctx);

sleep(1);

io_destroy(ctx);

for (i = 0; i < nthr; ++i)
pthread_join(thread[i], NULL);

return 0;
}

Signed-off-by: Roland Dreier
Reviewed-by: Jeff Moyer
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Roland Dreier
2011-03-23 08:44:17 +0800
77d1c8eb8 pps: remove unreachable code ... Browse Code »

Remove code enabled only when CONFIG_PREEMPT_RT is turned on because it is
not used in the vanilla kernel.

Signed-off-by: Alexander Gordeev
Cc: john stultz
Cc: Rodolfo Giometti
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Gordeev
2011-03-23 08:44:17 +0800
da23ef054 adfs: add hexadecimal filetype suffix option ... Browse Code »

ADFS (FileCore) storage complies with the RISC OS filetype specification
(12 bits of file type information is stored in the file load address,
rather than using a file extension). The existing driver largely ignores
this information and does not present it to the end user.

It is desirable that stored filetypes be made visible to the end user to
facilitate a precise copy of data and metadata from a hard disc (or image
thereof) into a RISC OS emulator (such as RPCEmu) or to a network share
which can be accessed by real Acorn systems.

This patch implements a per-mount filetype suffix option (use -o
ftsuffix=1) to present any filetype as a ,xyz hexadecimal suffix on each
file. This type suffix is compatible with that used by RISC OS systems
that access network servers using NFS client software and by RPCemu's host
filing system.

Signed-off-by: Stuart Swales
Cc: Russell King
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stuart Swales
2011-03-23 08:44:17 +0800
7a9730af9 adfs: improve timestamp precision ... Browse Code »

ADFS (FileCore) storage complies with the RISC OS timestamp specification
(40-bit centiseconds since 01 Jan 1900 00:00:00). It is desirable that
stored timestamp precision be maintained to facilitate a precise copy of
data and metadata from a hard disc (or image thereof) into a RISC OS
emulator (such as RPCEmu).

This patch implements a full-precision conversion from ADFS to Unix
timestamp as the existing driver, for ease of calculation with old 32-bit
compilers, uses the common trick of shifting the 40-bits representing
centiseconds around into 32-bits representing seconds thereby losing
precision.

Signed-off-by: Stuart Swales
Cc: Russell King
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stuart Swales
2011-03-23 08:44:17 +0800
2f09719af adfs: fix E+/F+ dir size > 2048 crashing kernel ... Browse Code »

Kernel crashes in fs/adfs module when accessing directories with a large
number of objects on mounted Acorn ADFS E+/F+ format discs (or images) as
the existing code writes off the end of the fixed array of struct
buffer_head pointers.

Additionally, each directory access that didn't crash would leak a buffer
as nr_buffers was not adjusted correctly for E+/F+ discs (was always left
as one less than required).

The patch fixes this by allocating a dynamically-sized set of struct
buffer_head pointers if necessary for the E+/F+ case (many directories
still do in fact fit in 2048 bytes) and sets the correct nr_buffers so
that all buffers are released.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=26072

Tested by tar'ing the contents of my RISC PC's E+ format 20Gb HDD which
contains a number of large directories that previously crashed the kernel.

Signed-off-by: Stuart Swales
Cc: Russell King
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Stuart Swales
2011-03-23 08:44:17 +0800
12da58b0c Documentation/vm/page-types.c: auto debugfs mount for hwpoison operation ... Browse Code »

page-types.c doesn't supply a way to specify the debugfs path and the
original debugfs path is not usual on most machines. This patch supplies
a way to auto mount debugfs if needed.

This patch is heavily inspired by tools/perf/utils/debugfs.c

[akpm@linux-foundation.org: make functions static]
[akpm@linux-foundation.org: fix debugfs_mount() signature]
Signed-off-by: Chen Gong
Reviewed-by: KOSAKI Motohiro
Reviewed-by: Wu Fengguang
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Chen Gong
2011-03-23 08:44:17 +0800
e06c37440 Documentation/Changes: minor corrections ... Browse Code »

I noticed the 'mcelog' program had no comment and then ended up "fixing"
a few more things:

* reiserfsck -V does not print "reiserfsprogs" (any more?)
* is "udevinfo" still shipped? udevd certainly is
* grub2 doesn't have a 'grub' binary
* add a "# how to get the mcelog version" comment

Signed-off-by: Christian Kujau
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christian Kujau
2011-03-23 08:44:17 +0800
38829dc9d Documentation/CodingStyle: flesh out if-else examples ... Browse Code »

There is a missing case for "Chapter 3: Placing Braces and Spaces". We
often know we should not use braces where a single statement. The first
case is:

if (condition)
action();

Another case is:

if (condition)
do_this();
else
do_that();

However, I can not find a description of the second case.

Signed-off-by: Harry Wei
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Harry Wei
2011-03-23 08:44:16 +0800
0bc825d24 codafs: fix compile warning when CONFIG_SYSCTL=n ... Browse Code »

When CONFIG_SYSCTL=n, we get the following warning:

fs/coda/sysctl.c:18: warning: `coda_tabl' defined but not used

Fix the warning by making sure coda_table and it's callee function are in
the same context. Also clean up the code by removing extra #ifdef.

[akpm@linux-foundation.org: remove unneeded stub macros]
Signed-off-by: Rakib Mullick
Cc: Jan Harkes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rakib Mullick
2011-03-23 08:44:16 +0800
1c00f0161 x86: allow CONFIG_ISA_DMA_API to be disabled ... Browse Code »

Not all 64-bit systems require ISA-style DMA, so allow it to be
configurable. x86 utilizes the generic ISA DMA allocator from
kernel/dma.c, so require it only when CONFIG_ISA_DMA_API is enabled.

Disabling CONFIG_ISA_DMA_API is dependent on x86_64 since those machines
do not have ISA slots and benefit the most from disabling the option (and
on CONFIG_EXPERT as required by H. Peter Anvin).

When disabled, this also avoids declaring claim_dma_lock(),
release_dma_lock(), request_dma(), and free_dma() since those interfaces
will no longer be provided.

Signed-off-by: David Rientjes
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Cc: Bjorn Helgaas
Cc: Russell King
Cc: "Luck, Tony"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2011-03-23 08:44:16 +0800
8df3bd9e1 x86: only compile floppy driver if CONFIG_ISA_DMA_API is enabled ... Browse Code »

The generic floppy disk driver utilizies the interface provided by
CONFIG_ISA_DMA_API, specifically claim_dma_lock(), release_dma_lock(),
request_dma(), and free_dma(). Thus, there's a strict dependency on the
config option and the driver should only be loaded if the kernel supports
ISA-style DMA.

Signed-off-by: David Rientjes
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Cc: Bjorn Helgaas
Cc: Russell King
Cc: "Luck, Tony"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2011-03-23 08:44:16 +0800
4061d68e1 x86: only compile 8237A if CONFIG_ISA_DMA_API is enabled ... Browse Code »

8237A utilizes the interface provided by CONFIG_ISA_DMA_API, specifically
claim_dma_lock() and release_dma_lock(). Thus, there's a strict
dependency on the config option and the module should only be loaded if
the kernel supports ISA-style DMA.

Signed-off-by: David Rientjes
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Cc: Bjorn Helgaas
Cc: Russell King
Cc: "Luck, Tony"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2011-03-23 08:44:16 +0800
586f83e2b pnp: only assign IORESOURCE_DMA if CONFIG_ISA_DMA_API is enabled ... Browse Code »

IORESOURCE_DMA cannot be assigned without utilizing the interface
provided by CONFIG_ISA_DMA_API, specifically request_dma() and
free_dma(). Thus, there's a strict dependency on the config option and
limits IORESOURCE_DMA only to architectures that support ISA-style DMA.

ia64 is not one of those architectures, so pnp_check_dma() no longer
needs to be special-cased for that architecture.

pnp_assign_resources() will now return -EINVAL if IORESOURCE_DMA is
attempted on such a kernel.

Signed-off-by: David Rientjes
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Cc: Bjorn Helgaas
Cc: Russell King
Cc: "Luck, Tony"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2011-03-23 08:44:16 +0800
ff859ba6d rtc: add real-time clock driver for NVIDIA Tegra ... Browse Code »

This is a platform driver that supports the built-in real-time clock on
Tegra SOCs.

Signed-off-by: Andrew Chew
Acked-by: Alessandro Zummo
Acked-by: Wan ZongShun
Acked-by: Jon Mayo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Chew
2011-03-23 08:44:16 +0800
49d50fb1c drivers/rtc/rtc-ds1511.c: world-writable sysfs nvram file ... Browse Code »

Don't allow everybogy to write to NVRAM.

Signed-off-by: Vasiliy Kulikov
Cc: Andy Sharp
Cc: Alessandro Zummo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vasiliy Kulikov
2011-03-23 08:44:16 +0800
cf044f0ed drivers/rtc/rtc-isl1208.c: add alarm support ... Browse Code »

Add alarm/wakeup support to rtc isl1208 driver

Signed-off-by: Ryan Mallon
Cc: Alessandro Zummo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ryan Mallon
2011-03-23 08:44:16 +0800
bc96ba741 rtc: convert DS1374 to dev_pm_ops ... Browse Code »

There is a general move to replace bus-specific PM ops with dev_pm_ops in
order to facilitate core improvements. Do this conversion for DS1374.

Signed-off-by: Mark Brown
Cc: Alessandro Zummo
Cc: john stultz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mark Brown
2011-03-23 08:44:16 +0800
ea611b269 init: return proper error code in do_mounts_rd() ... Browse Code »

In do_mounts_rd() if memory cannot be allocated, return -ENOMEM.

Signed-off-by: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2011-03-23 08:44:15 +0800
1a530a6f2 binfmt_elf: quiet GCC-4.6 'set but not used' warning in load_elf_binary() ... Browse Code »

With GCC-4.6 we get warnings about things being 'set but not used'.

In load_elf_binary() this can happen with reloc_func_desc if ELF_PLAT_INIT
is defined, but doesn't use the reloc_func_desc argument.

Quiet the warning/error by marking reloc_func_desc as __maybe_unused.

Signed-off-by: David Daney
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Daney
2011-03-23 08:44:15 +0800
f4d93ad74 epoll: fix compiler warning and optimize the non-blocking path ... Browse Code »

Add a comment to ep_poll(), rename labels a bit clearly, fix a warning of
unused variable from gcc and optimize the non-blocking path a little.

Hinted-by: Andrew Morton
Signed-off-by: Davide Libenzi

hannes@cmpxchg.org:

: The non-blocking ep_poll path optimization introduced skipping over the
: return value setup.
:
: Initialize it properly, my userspace gets upset by epoll_wait() returning
: random things.
:
: In addition, remove the reinitialization at the fetch_events label, the
: return value is garuanteed to be zero when execution reaches there.

[hannes@cmpxchg.org: fix initialization]
Signed-off-by: Johannes Weiner
Cc: Shawn Bohrer
Acked-by: Davide Libenzi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Shawn Bohrer
2011-03-23 08:44:15 +0800