08 Oct, 2007

2 commits

  • Async signals should not be reported as sent by current in the audit
    log. As it is, we call audit_signal_info() too early in
    check_kill_permission(). Note that check_kill_permission() already has
    the relevant test - it needs to know whether it should apply
    current-based permission checks. So the solution is to move the
    audit_signal_info() call after that test (a sketch follows this
    entry).

    The bogosity in question is easily reproduced: add a rule watching for
    e.g. kill(2) from a specific process (so that audit_signal_info() does
    not short-circuit to nothing), say load_policy, and watch the bogus
    OBJ_PID entry in the audit logs claiming that a write(2) to a selinuxfs
    file issued by load_policy(8) had somehow managed to send a signal to
    syslogd...

    Signed-off-by: Al Viro
    Acked-by: Steve Grubb
    Acked-by: Eric Paris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
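
    A hedged sketch of the reordering (kernel/signal.c shape of that era;
    the permission-check body is elided, and the helper names are best
    guesses, not the literal patch): audit_signal_info() runs only inside
    the branch that identifies a user-generated signal, so an async signal
    is never attributed to current.

    static int check_kill_permission(int sig, struct siginfo *info,
                                     struct task_struct *t)
    {
            int error = -EINVAL;

            if (!valid_signal(sig))
                    return error;

            error = -EPERM;
            if (info == SEND_SIG_NOINFO ||
                (!is_si_special(info) && SI_FROMUSER(info))) {
                    /* user-generated: only now may audit record
                       current as the sender */
                    error = audit_signal_info(sig, t);
                    if (error)
                            return error;
                    /* ... current-based permission checks ... */
            }
            return security_task_kill(t, info, sig, 0);
    }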
     
  • When using /proc/timer_stats on ppc64 I noticed the events/sec field
    wasn't accurate. Sometimes the integer part was incorrect due to
    rounding (we weren't taking the fractional seconds into consideration).

    The fraction part is also wrong; we need to pad the printf statement
    and take the bottom three digits of 1000 times the value (see the
    sketch after this entry).

    Signed-off-by: Anton Blanchard
    Acked-by: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
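
    A minimal, compilable illustration of the rounding idea (a userspace
    stand-in with assumed names, not the kernel code): scale by 1000 so
    the fractional seconds survive the integer division, then print the
    bottom three digits padded with %03llu so a fraction of 7 comes out
    as ".007" rather than ".7".

    #include <stdio.h>

    static void print_events_per_sec(unsigned long long events,
                                     unsigned long long period_ms)
    {
            /* events per second, scaled by 1000 to keep three
               fractional digits */
            unsigned long long eps_x1000 = events * 1000000ULL / period_ms;

            printf("%llu.%03llu events/sec\n",
                   eps_x1000 / 1000, eps_x1000 % 1000);
    }

    int main(void)
    {
            /* 5007 events over 1000 seconds: prints "5.007", where an
               unpadded "%llu.%llu" format would print "5.7" */
            print_events_per_sec(5007, 1000 * 1000);
            return 0;
    }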
     

01 Oct, 2007

2 commits

  • Calling handle_futex_death in exit_robust_list for the different robust
    mutexes of a thread basically frees the mutex. Another thread might
    grab the lock immediately, which updates the next pointer of the mutex.
    fetch_robust_entry over the next pointer might therefore branch into
    the robust mutex list of a different thread. This can cause two
    problems: 1) some mutexes held by the dead thread are not freed and
    2) some mutexes held by a different thread are freed.

    The next pointer needs to be read before calling handle_futex_death
    (see the sketch after this entry).

    Signed-off-by: Martin Schwidefsky
    Acked-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
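
    A hedged sketch of the resulting loop shape in exit_robust_list()
    (simplified from kernel/futex.c; fetch_robust_entry() copies a user
    pointer and strips the PI bit):

    while (entry != &head->list) {
            /* Read the next pointer BEFORE handle_futex_death(): once
               the futex is released, another thread can take it and
               relink entry->next into its own robust list. */
            if (fetch_robust_entry(&next_entry, &entry->next, &next_pi))
                    break;

            if (entry != pending)
                    handle_futex_death((void __user *)entry + futex_offset,
                                       curr, pi);

            entry = next_entry;
            pi = next_pi;
    }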
     
  • We need to disable all CPUs other than the boot CPU (usually 0) before
    attempting to power off modern SMP machines (a sketch of the fix's
    shape follows this entry). This fixes the hang-on-poweroff issue on my
    MythTV SMP box, and also on Thomas Gleixner's new toybox.

    Signed-off-by: Mark Lord
    Acked-by: Thomas Gleixner
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Lord
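
    A hedged sketch of the shape of the fix (kernel/sys.c, surrounding
    details elided): pull every CPU but the boot CPU down before the
    machine-specific power-off code runs.

    void kernel_power_off(void)
    {
            kernel_shutdown_prepare(SYSTEM_POWER_OFF);
            disable_nonboot_cpus();         /* the fix */
            printk(KERN_EMERG "Power down.\n");
            machine_power_off();
    }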
     

23 Sep, 2007

1 commit

  • In a desperate attempt to fix the suspend/resume problem on Andrew's
    VAIO I added a workaround which enforced the broadcast of the oneshot
    timer on resume. This was actually resolving the problem on the VAIO
    but was just a stupid workaround, which was not tackling the root
    cause: the assignment of lower idle C-states in the ACPI
    processor_idle code. The cpuidle patches, which utilize the dynamic
    tick feature and go faster into deeper C-states, exposed the problem
    again. The correct solution is the previous patch, which prevents
    lower C-states across the suspend/resume.

    Remove the enforcement code, including the conditional broadcast timer
    arming, which helped to paper over the real problem for quite a time.
    The oneshot broadcast flag for the CPU which runs the resume code can
    never be set at the time when this code is executed. It only gets set
    when the CPU is entering a lower idle C-state.

    Signed-off-by: Thomas Gleixner
    Tested-by: Andrew Morton
    Cc: Len Brown
    Cc: Venkatesh Pallipadi
    Cc: Rafael J. Wysocki
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     

21 Sep, 2007

1 commit

  • This simplifies the signalfd code by not keeping the signalfd attached
    to the sighand for its whole lifetime.

    This way, the signalfd remains attached to the sighand only during
    poll(2) (and select and epoll) and read(2). It also allows removing
    all the custom "tsk == current" checks in kernel/signal.c, since
    dequeue_signal() will only be called by "current".

    I think this is also what Ben was suggesting some time ago.

    The external effect of this is that a thread can extract only its own
    private signals and the group ones. I think this is acceptable
    behaviour, in that those are the signals the thread would be able to
    fetch without signalfd.

    Signed-off-by: Davide Libenzi
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

20 Sep, 2007

6 commits

  • When using rt_mutex, a NULL pointer dereference occurs in
    enqueue_task_rt. Here is a scenario:
    1) there are two threads; thread A is fair_sched_class and
    thread B is rt_sched_class.
    2) Thread A is boosted up to rt_sched_class, because thread A
    holds a rt_mutex lock and thread B is waiting for the lock.
    3) At this time, when thread A creates a new thread C, thread
    C gets rt_sched_class.
    4) When doing wake_up_new_task() for thread C, the priority
    of thread C is out of the RT priority range, because the
    normal priority of thread A is not an RT priority. This corrupts
    data by overflowing the rt_prio_array.
    The new thread C should be fair_sched_class.

    The new thread must be in a valid scheduler class before being
    queued. This patch sets the suitable scheduler class (see the sketch
    after this entry).

    Signed-off-by: Hiroshi Shimamoto
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Hiroshi Shimamoto
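
    A hedged sketch of the idea on the fork path (the placement is
    assumed; the class names are as in the commit text): base the child's
    priority on the parent's normal_prio, not the PI-boosted prio, and
    make the scheduler class match before the child can be enqueued.

    p->prio = current->normal_prio;         /* ignore any PI boost */
    if (!rt_prio(p->prio))
            p->sched_class = &fair_sched_class;
    else
            p->sched_class = &rt_sched_class;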
     
  • add /proc/sys/kernel/sched_compat_yield to make sys_sched_yield()
    more aggressive, by moving the yielding task to the last position
    in the rbtree.

    with sched_compat_yield=0:

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    2539 mingo 20 0 1576 252 204 R 50 0.0 0:02.03 loop_yield
    2541 mingo 20 0 1576 244 196 R 50 0.0 0:02.05 loop

    with sched_compat_yield=1:

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    2584 mingo 20 0 1576 248 196 R 99 0.0 0:52.45 loop
    2582 mingo 20 0 1576 256 204 R 0 0.0 0:00.00 loop_yield

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • It turned out that the user namespace is released during do_exit() in
    exit_task_namespaces(), but the struct user_struct is released only
    during put_task_struct(), i.e. MUCH later.

    On debug kernels with poisoned slabs this will cause the oops in
    uid_hash_remove() because the head of the chain, which resides inside the
    struct user_namespace, will be already freed and poisoned.

    Since the uid hash itself is required only when someone can search it, i.e.
    when the namespace is alive, we can safely unhash all the user_struct-s from
    it during the namespace exiting. The subsequent free_uid() will complete the
    user_struct destruction.

    For example, this simple program, run on a kernel with CONFIG_USER_NS
    turned on, will oops the kernel immediately:

    #define _GNU_SOURCE
    #include <sched.h>

    char stack[2 * 1024 * 1024];

    int f(void *foo)
    {
            return 0;
    }

    int main(void)
    {
            /* 0x10000000 is CLONE_NEWUSER */
            clone(f, stack + 1 * 1024 * 1024, 0x10000000, 0);
            return 0;
    }

    This was spotted during OpenVZ kernel testing.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Alexey Dobriyan
    Acked-by: "Serge E. Hallyn"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • Surprisingly (spotted by Alexey Dobriyan), the uid hash still uses
    list_heads, thus occupying twice as much space as it could. Convert it
    to hlist_heads (see the sketch after this entry).

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: Alexey Dobriyan
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
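
    Why this halves the table: a minimal, compilable sketch of the two
    head types. A list_head bucket carries two pointers; an hlist_head
    bucket carries one, with the extra bookkeeping pushed into the nodes
    that live in the hashed objects themselves.

    #include <stdio.h>

    struct list_head  { struct list_head *next, *prev; };
    struct hlist_node { struct hlist_node *next, **pprev; };
    struct hlist_head { struct hlist_node *first; };

    int main(void)
    {
            printf("list_head bucket:  %zu bytes\n", sizeof(struct list_head));
            printf("hlist_head bucket: %zu bytes\n", sizeof(struct hlist_head));
            return 0;
    }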
     
  • kernel/user.c: Convert list_for_each to list_for_each_entry in
    uid_hash_find() (a before/after sketch follows this entry).

    Signed-off-by: Matthias Kaehlcke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthias Kaehlcke
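
    A hedged before/after sketch of the conversion's shape (kernel
    context; the field and variable names here are assumptions, not
    copied from kernel/user.c):

    /* before: manual list_entry() on every iteration */
    struct list_head *pos;
    list_for_each(pos, hashent) {
            struct user_struct *user =
                    list_entry(pos, struct user_struct, uidhash_list);
            if (user->uid == uid)
                    return user;
    }

    /* after: the iterator hands us the containing struct directly */
    struct user_struct *user;
    list_for_each_entry(user, hashent, uidhash_list) {
            if (user->uid == uid)
                    return user;
    }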
     
  • struct utsname is copied from the master one without any exclusion.

    Here is sample output from one proggie doing

    sethostname("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");
    sethostname("bbbbbbbbbbbbbbbbbbbbbbbbbbbbbb");

    and another

    clone(,, CLONE_NEWUTS, ...)
    uname()

    hostname = 'aaaaaaaaaaaaaaaaaaaaaaaaabbbbb'
    hostname = 'bbbaaaaaaaaaaaaaaaaaaaaaaaaaaa'
    hostname = 'aaaaaaaabbbbbbbbbbbbbbbbbbbbbb'
    hostname = 'aaaaaaaaaaaaaaaaaaaaaaaaaabbbb'
    hostname = 'aaaaaaaaaaaaaaaaaaaaaaaaaaaabb'
    hostname = 'aaabbbbbbbbbbbbbbbbbbbbbbbbbbb'
    hostname = 'bbbbbbbbbbbbbbbbaaaaaaaaaaaaaa'

    The hostname is sometimes corrupted.

    Yes, even _the_ simplest namespace activity had a bug in it. :-(
    (A sketch of the locking fix follows this entry.)

    Signed-off-by: Alexey Dobriyan
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
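
    A hedged sketch of the fix's shape (the helper name is an
    assumption): copy the parent's utsname under uts_sem, the same
    semaphore sethostname() takes for writing, so no half-written name
    can be observed.

    static struct uts_namespace *clone_uts_ns(struct uts_namespace *old_ns)
    {
            struct uts_namespace *ns;

            ns = kmalloc(sizeof(*ns), GFP_KERNEL);
            if (!ns)
                    return ERR_PTR(-ENOMEM);

            down_read(&uts_sem);    /* excludes concurrent sethostname() */
            memcpy(&ns->name, &old_ns->name, sizeof(ns->name));
            up_read(&uts_sem);

            kref_init(&ns->kref);
            return ns;
    }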
     

16 Sep, 2007

5 commits

  • Taking a cpu offline removes the cpu from the online mask before the
    CPU_DEAD notification is done. The clock events layer does the cleanup
    of the dead CPU from the CPU_DEAD notifier chain. tick_do_timer_cpu is
    used to avoid xtime lock contention by assigning the task of jiffies
    xtime updates to one CPU. If a CPU is taken offline, then this
    assignment becomes stale. This went unnoticed because most of the time
    the offline CPU went dead before the online CPU reached __cpu_die(),
    where the CPU_DEAD state is checked. In the case that the offline CPU did
    not reach the DEAD state before we reach __cpu_die(), the code in there
    goes to sleep for 100ms. Due to the stale time update assignment, the
    system is stuck forever.

    Take the assignment away when a cpu is no longer in the
    cpu_online_mask. We do this in the last call to
    tick_nohz_stop_sched_tick(), when the offline CPU is on the way to the
    final play_dead() idle entry (see the sketch after this entry).

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
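
    A hedged sketch of the check's shape in tick_nohz_stop_sched_tick()
    (simplified): drop the jiffies-update duty once this CPU has left the
    online mask, so a live CPU can pick it up.

    if (unlikely(!cpu_online(cpu))) {
            if (cpu == tick_do_timer_cpu)
                    tick_do_timer_cpu = -1; /* let another CPU update xtime */
    }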
     
  • When a cpu goes offline it is removed from the broadcast masks. If the
    mask becomes empty, the code shuts down the broadcast device. This is
    wrong, because the broadcast device needs to be ready for the online
    cpu going idle (into a C-state, which stops the local APIC timer).

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The jinxed VAIO refuses to resume without hitting keys on the keyboard
    when this is not enforced. It is unclear why the cpu ends up in a lower
    C State without notifying the clock events layer, but enforcing the
    oneshot broadcast here is safe.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Timekeeping resume adjusts xtime by adding the slept time in seconds
    and resets the reference value of the clock source
    (clock->cycle_last). clock->cycle_last is used to calculate the delta
    between the last xtime update and the readout of the clock source in
    __get_nsec_offset(). xtime plus the offset is the current time. The
    resume code ignores the delta which had already elapsed between the
    last xtime update and the actual time of suspend. If the suspend time
    is short, then we can see time going backwards on resume.

    Suspend:
    offs_s = clock->read() - clock->cycle_last;
    now = xtime + offs_s;
    timekeeping_suspend_time = read_rtc();

    Resume:
    sleep_time = read_rtc() - timekeeping_suspend_time;
    xtime.tv_sec += sleep_time;
    clock->cycle_last = clock->read();
    offs_r = clock->read() - clock->cycle_last;
    now = xtime + offs_r;

    if sleep_time_seconds == 0 and offs_r < offs_s, then time goes
    backwards.

    Fix this by storing the offset from the last xtime update and add it to
    xtime during resume, when we reset clock->cycle_last:

    sleep_time = read_rtc() - timekeeping_suspend_time;
    xtime.tv_sec += sleep_time;
    xtime += offs_s; /* Fixup xtime offset at suspend time */
    clock->cycle_last = clock->read();
    offs_r = clock->read() - clock->cycle_last;
    now = xtime + offs_r;

    Thanks to Marcelo for tracking this down on the OLPC and providing the
    necessary details to analyze the root cause.

    Signed-off-by: Thomas Gleixner
    Cc: John Stultz
    Cc: Marcelo Tosatti

    Thomas Gleixner
     
  • Lockdep complains about the access of rtc in timekeeping_suspend
    inside the interrupt disabled region of the write locked xtime lock.
    Move the access outside.

    Signed-off-by: Thomas Gleixner
    Cc: John Stultz

    Thomas Gleixner
     

12 Sep, 2007

3 commits

  • Seems to me that this timer will only get started on platforms that say
    they don't want it?

    Signed-off-by: Tony Breeds
    Cc: Paul Mackerras
    Cc: Gabriel Paubert
    Cc: Zachary Amsden
    Acked-by: Thomas Gleixner
    Cc: John Stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tony Breeds
     
  • The semantics of call_usermodehelper_pipe() used to be that it would fork
    the helper, and wait for the kernel thread to be started. This was
    implemented by setting sub_info.wait to 0 (implicitly), and doing a
    wait_for_completion().

    As part of the cleanup done in 0ab4dc92278a0f3816e486d6350c6652a72e06c8,
    call_usermodehelper_pipe() was changed to pass 1 as the value for wait to
    call_usermodehelper_exec().

    This is equivalent to setting sub_info.wait to 1, which is a change from
    the previous behaviour. Using 1 instead of 0 causes
    __call_usermodehelper() to start the kernel thread running
    wait_for_helper(), rather than directly calling ____call_usermodehelper().

    The end result is that the calling kernel code blocks until the user mode
    helper finishes. As the helper is expecting input on stdin, and now no one
    is writing anything, everything locks up (observed in do_coredump).

    The fix is to change the 1 to UMH_WAIT_EXEC (aka 0), indicating that
    we want to wait for the kernel thread to be started, but not for the
    helper to finish (see the sketch after this entry).

    Signed-off-by: Michael Ellerman
    Acked-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Ellerman
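
    A hedged sketch of the one-line shape of the fix inside
    call_usermodehelper_pipe() (simplified): wait only for the exec, not
    for the helper to exit, since the caller still has to feed the
    helper's stdin.

    /* was: call_usermodehelper_exec(sub_info, 1), i.e. wait for the
       helper to finish - which it never does while starved of input */
    return call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC);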
     
  • The futex list traversal on the compat side appears to have a bug.

    Its loop termination condition compares:

    while (compat_ptr(uentry) != &head->list)

    But that can't be right because "uentry" has the special
    "pi" indicator bit still potentially set at bit 0. This
    is cleared by fetch_robust_entry() into the "entry"
    return value.

    What this seems to mean is that the list won't terminate
    when list iteration gets back to the head. And we'll
    also process the list head like a normal entry, which could
    cause all kinds of problems.

    So we should check for equality with "entry". That pointer
    is of the non-compat type, so we have to do a little casting
    to keep the compiler and sparse happy (see the sketch after
    this entry).

    The same problem can in theory occur with the 'pending'
    variable, although that has not been reported from users
    so far.

    Based on the original patch from David Miller.

    Acked-by: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: David Miller
    Signed-off-by: Arnd Bergmann
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
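
    A hedged sketch of the corrected termination test in the compat
    walker: compare against "entry" (pi bit already stripped), casting
    the compat list head to the non-compat entry type in the __user
    address space to keep the compiler and sparse quiet.

    while (entry != (struct robust_list __user *) &head->list) {
            /* ... fetch the next uentry/entry pair, call
               handle_futex_death() on the current one, advance ... */
    }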
     

11 Sep, 2007

1 commit

  • When PTRACE_SYSCALL was used and then PTRACE_DETACH is used, the
    TIF_SYSCALL_TRACE flag is left set on the formerly-traced task. This
    means that when a new tracer comes along and does PTRACE_ATTACH, it's
    possible he gets a syscall tracing stop even though he's never used
    PTRACE_SYSCALL. This happens if the task was in the middle of a system
    call when the second PTRACE_ATTACH was done. The symptom is an
    unexpected SIGTRAP when the tracer thinks that only SIGSTOP should have
    been provoked by his ptrace calls so far.

    A few machines already fixed this in ptrace_disable (i386, ia64, m68k).
    But all other machines do not, and still have this bug. On x86_64, this
    constitutes a regression in IA32 compatibility support.

    Since all machines now use TIF_SYSCALL_TRACE for this, I put the
    clearing of TIF_SYSCALL_TRACE in the generic ptrace_detach code rather
    than adding it to every other machine's ptrace_disable (see the sketch
    after this entry).

    Signed-off-by: Roland McGrath
    Signed-off-by: Linus Torvalds

    Roland McGrath
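
    A hedged sketch of the generic-side change (kernel/ptrace.c shape,
    surrounding detach logic elided):

    int ptrace_detach(struct task_struct *child, unsigned int data)
    {
            if (!valid_signal(data))
                    return -EIO;

            /* architecture-specific hardware disable ... */
            ptrace_disable(child);
            /* ... plus the generic flag, so the next tracer does not
               inherit a pending syscall-trace stop */
            clear_tsk_thread_flag(child, TIF_SYSCALL_TRACE);

            /* ... unlink from the tracer and resume the child ... */
            return 0;
    }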
     

05 Sep, 2007

8 commits

  • fix ideal_runtime:

    - do not scale it using niced_granularity();
    it is measured against sum_exec_delta, so it's wall-time, not
    fair-time.

    - move the whole check into __check_preempt_curr_fair()
    so that wakeup preemption can also benefit from the new logic.

    this also results in code size reduction:

    text data bss dec hex filename
    13391 228 1204 14823 39e7 sched.o.before
    13369 228 1204 14801 39d1 sched.o.after

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Second preparatory patch for fix-ideal runtime:

    Mark prev_sum_exec_runtime at the beginning of our run, the same spot
    that adds our wait period to wait_runtime. This seems a more natural
    location to do this, and it also reduces the code a bit:

    text data bss dec hex filename
    13397 228 1204 14829 39ed sched.o.before
    13391 228 1204 14823 39e7 sched.o.after

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Preparatory patch for fix-ideal-runtime:

    simplify __check_preempt_curr_fair(): get rid of the integer return.

    text data bss dec hex filename
    13404 228 1204 14836 39f4 sched.o.before
    13393 228 1204 14825 39e9 sched.o.after

    functionality is unchanged.

    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • rename RSR to SRR - 'RSR' is already defined on xtensa.

    found by Adrian Bunk.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • when cleaning sched-stats also clear prev_sum_exec_runtime.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • the cfs_rq->wait_runtime debug/statistics counter was not maintained
    properly - fix this.

    this also removes some code:

    text data bss dec hex filename
    13420 228 1204 14852 3a04 sched.o.before
    13404 228 1204 14836 39f4 sched.o.after

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra

    Ingo Molnar
     
  • fix niced_granularity(): the bug resulted in under-scheduling for
    CPU-bound negative nice level tasks (and this in turn caused higher
    than necessary latencies in nice-0 tasks).

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • First fix the check
    if (*imbalance + SCHED_LOAD_SCALE_FUZZ < busiest_load_per_task)
    with this
    if (*imbalance < busiest_load_per_task)

    As the current check is always false for nice-0 tasks
    (SCHED_LOAD_SCALE_FUZZ is the same as busiest_load_per_task for
    nice-0 tasks).

    With the above change, imbalance was getting reset to 0 in the corner
    case condition, making the FUZZ logic fail. Fix it by not corrupting
    the imbalance, and change the imbalance only when it finds that the
    HT/MC optimization is needed.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

31 Aug, 2007

6 commits

  • Spotted by taoyue and Jeremy Katz.

    collect_signal:                         sigqueue_free:

    list_del_init(&first->list);
                                            if (!list_empty(&q->list)) {
                                                    // not taken
                                            }
                                            q->flags &= ~SIGQUEUE_PREALLOC;

    __sigqueue_free(first);                 __sigqueue_free(q);

    Now, __sigqueue_free() is called twice on the same "struct sigqueue" with the
    obviously bad implications.

    In particular, this double free breaks the array_cache->avail logic, so the
    same sigqueue could be "allocated" twice, and the bug can manifest itself via
    the "impossible" BUG_ON(!SIGQUEUE_PREALLOC) in sigqueue_free/send_sigqueue.

    Hopefully this can explain these mysterious bug-reports, see

    http://marc.info/?t=118766926500003
    http://marc.info/?t=118466273000005

    Alexey Dobriyan reports this patch makes the difference for the testcase, but
    nobody has an access to the application which opened the problems originally.

    Also, this patch removes the tasklist lock/unlock; ->siglock is
    enough. (A sketch of the fixed sigqueue_free() follows this entry.)

    Signed-off-by: Oleg Nesterov
    Cc: taoyue
    Cc: Jeremy Katz
    Cc: Sukadev Bhattiprolu
    Cc: Alexey Dobriyan
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Roland McGrath
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
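
    A hedged sketch of the fixed sigqueue_free() shape: the "is it still
    queued" decision is made under ->siglock, and if the entry is still
    queued, the dequeuer (collect_signal) now owns the final free.

    void sigqueue_free(struct sigqueue *q)
    {
            unsigned long flags;
            spinlock_t *lock = &current->sighand->siglock;

            BUG_ON(!(q->flags & SIGQUEUE_PREALLOC));

            spin_lock_irqsave(lock, flags);
            q->flags &= ~SIGQUEUE_PREALLOC;
            if (!list_empty(&q->list))
                    q = NULL;       /* still queued: collect_signal() frees it */
            spin_unlock_irqrestore(lock, flags);

            if (q)
                    __sigqueue_free(q);
    }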
     
  • Signed-off-by: Alexey Dobriyan
    Acked-by: Cedric Le Goater
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Mariusz Kozlowski reported lockdep's warning:

    > =================================
    > [ INFO: inconsistent lock state ]
    > 2.6.23-rc2-mm1 #7
    > ---------------------------------
    > inconsistent {in-hardirq-W} -> {hardirq-on-W} usage.
    > ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1] takes:
    > (&tp->lock){+...}, at: [] rtl8139_interrupt+0x27/0x46b [8139too]
    > {in-hardirq-W} state was registered at:
    > [] __lock_acquire+0x949/0x11ac
    > [] lock_acquire+0x99/0xb2
    > [] _spin_lock+0x35/0x42
    > [] rtl8139_interrupt+0x27/0x46b [8139too]
    > [] handle_IRQ_event+0x28/0x59
    > [] handle_level_irq+0xad/0x10b
    > [] do_IRQ+0x93/0xd0
    > [] common_interrupt+0x2e/0x34
    ...
    > other info that might help us debug this:
    > 1 lock held by ifconfig/5492:
    > #0: (rtnl_mutex){--..}, at: [] mutex_lock+0x1c/0x1f
    >
    > stack backtrace:
    ...
    > [] _spin_lock+0x35/0x42
    > [] rtl8139_interrupt+0x27/0x46b [8139too]
    > [] free_irq+0x11b/0x146
    > [] rtl8139_close+0x8a/0x14a [8139too]
    > [] dev_close+0x57/0x74
    ...

    This shows that a driver's irq handler was running both in hard
    interrupt and process contexts with irqs enabled. The latter was done
    during a free_irq() call and was possible only with
    CONFIG_DEBUG_SHIRQ enabled. This was fixed by another patch.

    But a similar problem is possible with request_irq(): any locks taken
    from the irq handler could be vulnerable - especially with soft
    interrupts. This patch fixes it by disabling local interrupts during
    the handler's run (see the sketch after this entry). (It seems
    disabling softirqs should be enough, but that needs more checking for
    possible races or other special cases.)

    Reported-by: Mariusz Kozlowski
    Signed-off-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jarek Poplawski
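
    A hedged sketch of the fix's shape in the CONFIG_DEBUG_SHIRQ probe
    inside request_irq() (simplified): run the spurious test invocation
    with local interrupts off, the way a real shared-handler call would
    run.

    #ifdef CONFIG_DEBUG_SHIRQ
            if (irqflags & IRQF_SHARED) {
                    unsigned long flags;

                    local_irq_save(flags);  /* the fix: mimic hardirq context */
                    handler(irq, dev_id);
                    local_irq_restore(flags);
            }
    #endif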
     
  • Dependencies of CONFIG_SUSPEND and CONFIG_HIBERNATION introduced by commit
    296699de6bdc717189a331ab6bbe90e05c94db06 "Introduce CONFIG_SUSPEND for
    suspend-to-Ram and standby" are incorrect, as they don't cover the facts that
    (1) not all architectures support suspend and (2) SMP hibernation is only
    possible on X86 and PPC64 (if CONFIG_PPC64_SWSUSP is set).

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Spotted by Marcin Kowalczyk.

    sys_setpgid(child) fails if the child was forked by a sub-thread.

    Fix the "is it our child" check. The previous commit
    ee0acf90d320c29916ba8c5c1b2e908d81f5057d was not complete.

    (This patch asks for the new same_thread_group() helper, but mainline
    doesn't have it yet; a sketch of the corrected check follows this
    entry.)

    Signed-off-by: Oleg Nesterov
    Acked-by: Roland McGrath
    Cc:
    Tested-by: "Marcin 'Qrczak' Kowalczyk"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
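
    A hedged sketch of the corrected test in sys_setpgid() (shape per the
    commit text: compare thread groups - what the future
    same_thread_group() helper would spell - instead of demanding that
    the real parent be the group leader task itself):

    /* was: else if (p->real_parent != group_leader) */
    else if (p->real_parent->tgid != group_leader->tgid)
            goto out;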
     
  • taskstats.ac_exitcode is assigned to task_struct.exit_code in bacct_add_tsk()
    through the following kernel function calls:

    do_exit()
    taskstats_exit()
    fill_pid()
    bacct_add_tsk()

    The problem is that in do_exit(), task_struct.exit_code is set to
    'code' only after taskstats_exit() has been called. So we need to move
    the assignment before taskstats_exit() (see the sketch after this
    entry).

    Signed-off-by: Jonathan Lim
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jonathan Lim
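
    A hedged sketch of the reordering in do_exit() (simplified): store
    the exit code before taskstats_exit() runs, so that fill_pid() ->
    bacct_add_tsk() records the real value.

    tsk->exit_code = code;                  /* moved up, before taskstats */
    taskstats_exit(tsk, group_dead);        /* snapshots tsk->exit_code */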
     

28 Aug, 2007

2 commits

  • cleanup: we have the 'se' and 'curr' entity-pointers already,
    no need to use p->se and current->se.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith

    Ingo Molnar
     
  • small schedstat fix: the cfs_rq->wait_runtime 'sum of all runtimes'
    statistics counters missed newly forked tasks and thus had a constant
    negative skew. Fix this.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Mike Galbraith

    Ingo Molnar