14 Nov, 2005

1 commit

  • new helper - task_thread_info(task). On platforms that have thread_info
    allocated separately (i.e. in default case) it simply returns
    task->thread_info. m68k wants (and for good reasons) to embed its thread_info
    into task_struct. So it will (in later patch) have task_thread_info() of its
    own. For now we just add a macro for generic case and convert existing
    instances of its body in core kernel to uses of new macro. Obviously safe -
    all normal architectures get the same preprocessor output they used to get.

    Signed-off-by: Al Viro
    Signed-off-by: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Al Viro
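
The generic case described above can be sketched in a few lines of C. This is an illustration only: the two structures are cut-down stand-ins for the kernel's, and `task_cpu_of()` is a hypothetical caller invented here to show what a converted call site might look like.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins; the real kernel structures have many more fields. */
struct thread_info {
    int cpu;
    unsigned long flags;
};

struct task_struct {
    struct thread_info *thread_info; /* separately allocated (generic case) */
    int pid;
};

/* Generic helper: expands to what callers previously wrote by hand. */
#define task_thread_info(task) ((task)->thread_info)

/* Hypothetical caller, converted from tsk->thread_info->cpu. */
static inline int task_cpu_of(const struct task_struct *tsk)
{
    assert(tsk != NULL);
    return task_thread_info(tsk)->cpu;
}
```

An architecture that embeds thread_info in task_struct could supply its own definition of the macro without touching callers such as the one above.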
     

07 Nov, 2005

1 commit

  • This patch adds a connector that reports fork, exec, id change, and exit
    events for all processes to userspace. It replaces the fork_advisor patch
    that ELSA is currently using. Applications that may find these events
    useful include accounting/auditing (e.g. ELSA), system activity monitoring
    (e.g. top), security, and resource management (e.g. CKRM).

    Signed-off-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Helsley
     

31 Oct, 2005

3 commits

  • This patch replaces hardcoded SEND_SIG_xxx constants with
    their symbolic names.

    No changes in affected .o files.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Back about a year ago when I last fiddled heavily with the do_wait code, I
    was thinking too hard about the wrong thing and I now think I introduced a
    bug whose inverse I thought I was fixing.

    Apparently no one was looking too hard over my shoulder, so as to cite my
    bogus reasoning at the time. In the race condition when PTRACE_ATTACH is
    about to steal a child and then the child hits a tracing event (what
    my_ptrace_child checks for), the real parent does need to set its flag
    noting it has some eligible live children. Otherwise a spurious ECHILD
    error is possible, since the child in question is not yet on the
    ptrace_children list.

    Signed-off-by: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     
  • The PF_DEAD setting doesn't belong to exit_notify(), move it to a proper
    place.

    Signed-off-by: Coywolf Qi Hunt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Coywolf Qi Hunt
     

30 Oct, 2005

1 commit

  • update_mem_hiwater has attracted various criticisms, in particular from those
    concerned with mm scalability. Originally it was called whenever rss or
    total_vm got raised. Then many of those callsites were replaced by a timer
    tick call from account_system_time. Now Frank van Maarseveen reports that to
    be found inadequate. How about this? Works for Frank.

    Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
    update_hiwater_rss and update_hiwater_vm. Don't attempt to keep
    mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
    by 1): those are hot paths. Do the opposite, update only when about to lower
    rss (usually by many), or just before final accounting in do_exit. Handle
    mm->hiwater_vm in the same way, though it's much less of an issue. Demand
    that whoever collects these hiwater statistics do the work of taking the
    maximum with rss or total_vm.

    And there has been no collector of these hiwater statistics in the tree. The
    new convention needs an example, so match Frank's usage by adding a VmPeak
    line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
    (High-Water-Mark or High-Water-Memory).

    There was a particular anomaly during mremap move, that hiwater_vm might be
    captured too high. A fleeting such anomaly remains, but it's quickly
    corrected now, whereas before it would stick.

    What locking? None: if the app is racy then these statistics will be racy,
    it's not worth any overhead to make them exact. But whenever it suits,
    hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
    page_table_lock (for now) or with preemption disabled (later on): without
    going to any trouble, minimize the time between reading current values and
    updating, to minimize those occasions when a racing thread bumps a count up
    and back down in between.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
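
The lazy high-water convention described in this commit can be sketched in user-space C. This is a minimal illustration, not the kernel code: `struct mm_counters`, `add_rss()`, `drop_rss()` and `peak_rss()` are hypothetical names, and the kernel's update_hiwater_rss is a macro operating on mm_struct rather than a function.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant mm_struct counters. */
struct mm_counters {
    unsigned long rss;         /* current resident pages */
    unsigned long hiwater_rss; /* lazily maintained peak */
};

/* Run only when rss is about to drop (or at exit), never per increment. */
static inline void update_hiwater_rss(struct mm_counters *mm)
{
    if (mm->hiwater_rss < mm->rss)
        mm->hiwater_rss = mm->rss;
}

/* Hot path: raise rss without touching the high-water mark. */
static inline void add_rss(struct mm_counters *mm, unsigned long pages)
{
    mm->rss += pages;
}

/* Cold path: record the peak first, then lower rss. */
static inline void drop_rss(struct mm_counters *mm, unsigned long pages)
{
    assert(pages <= mm->rss);
    update_hiwater_rss(mm);
    mm->rss -= pages;
}

/* Collector's job: the peak is the max of the stored mark and the live value. */
static inline unsigned long peak_rss(const struct mm_counters *mm)
{
    return mm->hiwater_rss > mm->rss ? mm->hiwater_rss : mm->rss;
}
```

Because the stored mark can lag behind the live counter, a reader that wants the peak has to take the maximum itself, which is the burden the commit message places on whoever collects the statistics.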
     

28 Oct, 2005

1 commit


24 Oct, 2005

1 commit

  • do_exit() clears ->it_##clock##_expires, but nothing prevents
    another cpu from attaching the timer to the exiting process after that.

    After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
    before do_exit() calls 'schedule()', a local timer interrupt can find
    tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
    does sys_wait4) interrupted task has ->signal == NULL.

    At this moment exiting task has no pending cpu timers, they were cleaned
    up in __exit_signal()->posix_cpu_timers_exit{,_group}(), so we can just
    return from irq.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

22 Oct, 2005

1 commit

  • When I originally moved exit_itimers into __exit_signal, that was the only
    place where we could reliably know it was the last thread in the group
    dying, without races. Since then we've gotten the signal_struct.live
    counter, and do_exit can reliably do group-wide cleanup work.

    This patch moves the call to do_exit, where it's made without locks. This
    avoids the deadlock issues that the old __exit_signal code's comment talks
    about, and the one that Oleg found recently with process CPU timers.

    [ This replaces e03d13e985d48ac4885382c9e3b1510c78bd047f, which is why
    it was just reverted. ]

    Signed-off-by: Roland McGrath
    Signed-off-by: Linus Torvalds

    Roland McGrath
     

02 Oct, 2005

1 commit

  • We should always use bitmask ops, rather than depend on some ordering of
    the different states. With the TASK_NONINTERACTIVE flag, the inequality
    doesn't really work.

    Oleg Nesterov argues (likely correctly) that this test is unnecessary in
    the first place. However, the minimal fix for now is to at least make
    it work in the presence of TASK_NONINTERACTIVE. Waiting for consensus
    from Roland & co on potential bigger cleanups.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
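
Why an ordering test breaks once a modifier bit is added can be shown with a small sketch. The numeric values below are illustrative assumptions chosen for the example, and both helper functions are hypothetical; neither is the code that the patch actually changed.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only (assumed for this sketch). */
#define TASK_INTERRUPTIBLE    1
#define TASK_STOPPED          4
#define TASK_TRACED           8
#define TASK_NONINTERACTIVE  64 /* modifier bit OR-ed into a sleeping state */

/* The modifier must not overlap the bits being tested. */
static_assert((TASK_NONINTERACTIVE & (TASK_STOPPED | TASK_TRACED)) == 0,
              "modifier bit overlaps state bits");

/* Fragile: assumes every larger value means "stopped or beyond". */
static inline bool stopped_by_ordering(long state)
{
    return state >= TASK_STOPPED;
}

/* Robust: looks only at the bits of interest. */
static inline bool stopped_by_mask(long state)
{
    return (state & (TASK_STOPPED | TASK_TRACED)) != 0;
}
```

With these values, a task sleeping in `TASK_INTERRUPTIBLE | TASK_NONINTERACTIVE` has state 65: the ordering test reports it as stopped, while the mask test does not.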
     

18 Sep, 2005

1 commit

  • With the new fdtable locking rules, you have to protect fdtable with either
    ->file_lock or rcu_read_lock/unlock(). There are some places where we
    aren't doing either. This patch fixes those places.

    Signed-off-by: Dipankar Sarma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
     

10 Sep, 2005

2 commits

  • Patch to eliminate struct files_struct.file_lock spinlock on the reader side
    and use rcu refcounting rcuref_xxx api for the f_count refcounter. The
    updates to the fdtable are done by allocating a new fdtable structure and
    setting files->fdt to point to the new structure. The fdtable structure is
    protected by RCU thereby allowing lock-free lookup. For fd arrays/sets that
    are vmalloced, we use keventd to free them since RCU callbacks can't sleep. A
    global list of fdtable to be freed is not scalable, so we use a per-cpu list.
    If keventd is already handling the current cpu's work, we use a timer to defer
    queueing of that work.

    Since the last publication, this patch has been re-written to avoid using
    explicit memory barriers and use rcu_assign_pointer(), rcu_dereference()
    primitives instead. This required that the fd information is kept in a
    separate structure (fdtable) and updated atomically.

    Signed-off-by: Dipankar Sarma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
     
  • In order for the RCU to work, the file table array, sets and their sizes must
    be updated atomically. Instead of ensuring this through too many memory
    barriers, we put the arrays and their sizes in a separate structure. This
    patch takes the first step of putting the file table elements in a separate
    structure fdtable that is embedded within files_struct. It also changes all
    the users to refer to the file table using files_fdtable() macro. Subsequent
    application of RCU becomes easier after this.

    Signed-off-by: Dipankar Sarma
    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dipankar Sarma
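
The reason for grouping the fd array and its size into one fdtable can be sketched with a user-space analogy. This is not the kernel implementation: C11 acquire/release atomics stand in for rcu_dereference() and rcu_assign_pointer(), the structures are cut down, and freeing the old table after a grace period is left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Array and size live together, so one pointer swap updates both. */
struct fdtable {
    size_t max_fds;
    void **fd; /* max_fds slots */
};

struct files_struct {
    _Atomic(struct fdtable *) fdt;
};

/* Reader: one acquire load yields a consistent (size, array) pair. */
static inline void *lookup_fd(struct files_struct *files, size_t fd)
{
    struct fdtable *fdt =
        atomic_load_explicit(&files->fdt, memory_order_acquire);
    return fd < fdt->max_fds ? fdt->fd[fd] : NULL;
}

/*
 * Writer (assumed to hold the update-side lock): build a larger copy,
 * then publish it with a single release store. Returns the old table,
 * which the caller may free only once no reader can still be using it.
 */
static inline struct fdtable *expand_fdtable(struct files_struct *files,
                                             size_t new_max)
{
    struct fdtable *old =
        atomic_load_explicit(&files->fdt, memory_order_relaxed);
    struct fdtable *nfdt;

    assert(new_max > 0 && new_max >= old->max_fds);
    nfdt = malloc(sizeof(*nfdt));
    if (!nfdt)
        return NULL;
    nfdt->fd = calloc(new_max, sizeof(*nfdt->fd));
    if (!nfdt->fd) {
        free(nfdt);
        return NULL;
    }
    nfdt->max_fds = new_max;
    if (old->max_fds)
        memcpy(nfdt->fd, old->fd, old->max_fds * sizeof(*old->fd));
    atomic_store_explicit(&files->fdt, nfdt, memory_order_release);
    return old;
}
```

If the size and the array were separate fields of files_struct, a reader could observe the new size with the old array; publishing them through one pointer avoids that without extra barriers in the reader.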
     

05 Aug, 2005

1 commit


28 Jun, 2005

1 commit

  • This updates the CFQ io scheduler to the new time sliced design (cfq
    v3). It provides full process fairness, while giving excellent
    aggregate system throughput even for many competing processes. It
    supports io priorities, either inherited from the cpu nice value or set
    directly with the ioprio_get/set syscalls. The latter closely mimic
    set/getpriority.

    This import is based on my latest from -mm.

    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Jens Axboe
     

24 Jun, 2005

2 commits

  • Avoid taking the tasklist_lock in sys_times if the process is single
    threaded. In a NUMA system taking the tasklist_lock may cause a bouncing
    cacheline if multiple independent processes continually call sys_times to
    measure their performance.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Shai Fultheim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Prevent recursive faults in do_exit() by leaving the task alone and waiting
    for reboot. This may allow a more graceful shutdown and possibly save the
    original oops.

    Signed-off-by: Alexander Nyberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Nyberg
     

18 Jun, 2005

1 commit


04 May, 2005

1 commit

  • The patch "MCA recovery improvements" added do_exit to mca_drv.c.
    That's fine when the mca recovery code is built in the kernel
    (CONFIG_IA64_MCA_RECOVERY=y) but breaks building the mca recovery
    code as a module (CONFIG_IA64_MCA_RECOVERY=m).

    Most users are currently building this as a module, as loading
    and unloading the module provides a very convenient way to turn
    on/off error recovery.

    This patch exports do_exit, so mca_drv.c can build as a module.

    Signed-off-by: Russ Anderson (rja@sgi.com)
    Signed-off-by: Tony Luck

    Russ Anderson
     

01 May, 2005

3 commits

  • Another large rollup of various patches from Adrian which make things static
    where they were needlessly exported.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • I have recompiled Linux kernel 2.6.11.5 documentation for me and our
    university students again. The documentation could be extended for more
    sources which are equipped by structured comments for recent 2.6 kernels. I
    have tried to proceed with that task. I have done that more times from 2.6.0
    time and it gets boring to do same changes again and again. Linux kernel
    compiles after changes for i386 and ARM targets. I have added references to
    some more files into kernel-api book, I have added some section names as well.
    So please, check that changes do not break something and that categories are
    not too much skewed.

    I have changed kernel-doc to accept "fastcall" and "asmlinkage" words reserved
    by kernel convention. Most of the other changes are modifications in the
    comments to make kernel-doc happy, accept some parameters description and do
    not bail out on errors. Changed <pid> to @pid in the description, moved some
    #ifdef before comments to correct function to comments bindings, etc.

    You can see result of the modified documentation build at
    http://cmp.felk.cvut.cz/~pisa/linux/lkdb-2.6.11.tar.gz

    Some more sources are ready to be included into kernel-doc generated
    documentation. Sources have been added into kernel-api for now. Some more
    section names added and probably some more chaos introduced as result of quick
    cleanup work.

    Signed-off-by: Pavel Pisa
    Signed-off-by: Martin Waitz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Pisa
     
  • Convert most of the current code that uses _NSIG directly to instead use
    valid_signal(). This avoids gcc -W warnings and off-by-one errors.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
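
The kind of check that valid_signal() centralizes can be sketched as follows. The limit `DEMO_NSIG` and the caller `check_signal_arg()` are hypothetical names used for illustration, and 64 is an assumed value; the point is that a single unsigned comparison rejects both negative and too-large signal numbers.

```c
#include <assert.h>
#include <errno.h>

/* Assumed limit for this sketch; stands in for the kernel's _NSIG. */
#define DEMO_NSIG 64

/* Signal numbers up to and including the limit are valid. */
static inline int valid_signal(unsigned long sig)
{
    return sig <= DEMO_NSIG ? 1 : 0;
}

/* Hypothetical caller: a negative int converts to a huge unsigned value. */
static inline int check_signal_arg(int sig)
{
    if (!valid_signal(sig))
        return -EINVAL;
    return 0;
}
```

Open-coded tests tend to vary between `sig > limit` and `sig >= limit`, and a `sig < 0` test on an unsigned argument is what draws compiler warnings; routing every caller through one helper removes both problems.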
     

30 Apr, 2005

1 commit


17 Apr, 2005

2 commits

  • This patch hides reparent_to_init(). reparent_to_init() should only be
    called by daemonize().

    Signed-off-by: Coywolf Qi Hunt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Coywolf Qi Hunt
     
  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds