21 Feb, 2006

1 commit


10 Jan, 2006

1 commit


09 Jan, 2006

4 commits

  • This adds an option to remove vm86 support under CONFIG_EMBEDDED. Saves
    about 5k.

    This version eliminates most of the #ifdefs of the previous version and
    instead uses function stubs in vm86.h. Also, release_vm86_irqs is moved
    from asm-i386/irq.h to a more appropriate home in vm86.h so that the stubs
    can live together.

    $ size vmlinux-baseline vmlinux-novm86
       text    data     bss     dec     hex filename
    2920821  523232  190652 3634705  377611 vmlinux-baseline
    2916268  523100  190492 3629860  376324 vmlinux-novm86

    Signed-off-by: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Mackall
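
The stub approach described in the vm86 commit above can be sketched as follows. This is a minimal illustration, not the contents of vm86.h: the names `DEMO_CONFIG_VM86`, `struct demo_task`, `demo_release_vm86_irqs` and `demo_exit_thread` are hypothetical stand-ins. The idea is that callers stay free of #ifdefs because the header supplies an empty inline function when the feature is compiled out.

```c
/* Hypothetical task state; the real kernel structures differ. */
struct demo_task {
    int irqs_held;
};

/* Uncomment to build the "feature enabled" variant of this sketch. */
/* #define DEMO_CONFIG_VM86 1 */

#ifdef DEMO_CONFIG_VM86
/* Feature built in: release whatever the task holds and report how many. */
static inline int demo_release_vm86_irqs(struct demo_task *t)
{
    int released = t->irqs_held;
    t->irqs_held = 0;
    return released;
}
#else
/* Feature compiled out: an empty stub the compiler can optimize away. */
static inline int demo_release_vm86_irqs(struct demo_task *t)
{
    (void)t;
    return 0;
}
#endif

/* The caller needs no #ifdef; it works with either variant above. */
static int demo_exit_thread(struct demo_task *t)
{
    return demo_release_vm86_irqs(t);
}
```

With the option left undefined, `demo_exit_thread` does nothing and the task state is untouched, which is where the size saving in a pattern like this would come from.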
     
  • Configurable 16-bit UID and friends support

    This allows turning off the legacy 16-bit UID interfaces on embedded platforms.

       text    data     bss     dec     hex filename
    3330172  529036  190556 4049764  3dcb64 vmlinux-baseline
    3328268  529040  190556 4047864  3dc3f8 vmlinux

    From: Adrian Bunk

    UID16 was accidentally disabled for !EMBEDDED.

    Signed-off-by: Matt Mackall
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Mackall
     
  • sys_migrate_pages implementation using swap based page migration

    This is the original API proposed by Ray Bryant in his posts during the first
    half of 2005 on linux-mm@kvack.org and linux-kernel@vger.kernel.org.

    The intent of sys_migrate_pages is to migrate the memory of a process. A
    process may have migrated to another node, while its memory was allocated
    optimally for the prior context. sys_migrate_pages allows shifting the
    memory to the new node.

    sys_migrate_pages is also useful for manually moving a process's memory
    when the process's available memory nodes have changed through cpuset
    operations. Paul Jackson is working on a mechanism that will migrate
    memory automatically if the cpuset of a process is changed. However, a
    user may decide to control the migration manually.

    This implementation is put into the policy layer since it uses concepts and
    functions that are also needed for mbind and friends. The patch also provides
    a do_migrate_pages function that may be useful for cpusets to automatically
    move memory. sys_migrate_pages does not modify policies in contrast to Ray's
    implementation.

    The current code here is based on the swap based page migration capability and
    thus is not able to preserve the physical layout relative to its containing
    nodeset (which may be a cpuset). When direct page migration becomes available
    then the implementation needs to be changed to do an isomorphic move of pages
    between different nodesets. The current implementation simply evicts all
    pages in the source nodeset that are not in the target nodeset.

    Patch supports ia64, i386 and x86_64.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
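
The eviction rule stated in the sys_migrate_pages commit above (evict pages on source nodes that are not also target nodes) can be sketched with plain bitmasks. This is an illustration only: `demo_nodemask_t`, `demo_nodes_to_evict` and `demo_page_must_move` are hypothetical names, and a single unsigned long stands in for the kernel's node mask type.

```c
#include <stdbool.h>

/* One bit per NUMA node; an assumption made for this sketch. */
typedef unsigned long demo_nodemask_t;

/* Nodes whose pages get evicted: in the source set but not in the target set. */
static demo_nodemask_t demo_nodes_to_evict(demo_nodemask_t source,
                                           demo_nodemask_t target)
{
    return source & ~target;
}

/* A page is moved only if the node it currently lives on is being vacated. */
static bool demo_page_must_move(int page_node, demo_nodemask_t source,
                                demo_nodemask_t target)
{
    return (demo_nodes_to_evict(source, target) >> page_node) & 1UL;
}
```

For example, with source nodes {0, 1} and target nodes {1, 2}, only pages on node 0 are evicted; pages already on node 1 stay where they are.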
     
  • This is the current version of the spu file system, used
    for driving SPEs on the Cell Broadband Engine.

    This release is almost identical to the version for the
    2.6.14 kernel posted earlier, which is available as part
    of the Cell BE Linux distribution from
    http://www.bsc.es/projects/deepcomputing/linuxoncell/.

    The first patch provides all the interfaces for running
    SPU applications, but does not have any support for
    debugging SPU tasks or for scheduling. Both of these
    features are added in the subsequent patches.

    See Documentation/filesystems/spufs.txt on how to use
    spufs.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Paul Mackerras

    Arnd Bergmann
     

02 Aug, 2005

1 commit

  • This removes sys_set_zone_reclaim() for now. While I'm sure Martin is
    trying to solve a real problem, we must not hard-code an incomplete and
    insufficient approach into a syscall, because syscalls are pretty much
    for eternity. I am quite strongly convinced that this syscall must not
    hit v2.6.13 in its current form.

    Firstly, the syscall lacks basic syscall design: e.g. it allows the
    global setting of VM policy for unprivileged users. (!) [ Imagine an
    Oracle installation and a SAP installation on the same NUMA box fighting
    over the 'optimal' setting for this flag. What will they do? Will they
    try to set the flag to their own preferred value every second or so? ]

    Secondly, it was added based on a single datapoint from Martin:

    http://marc.theaimsgroup.com/?l=linux-mm&m=111763597218177&w=2

    where Martin characterizes the numbers the following way:

    ' Run-to-run variability for "make -j" is huge, so these numbers aren't
    terribly useful except to see that with reclaim the benchmark still
    finishes in a reasonable amount of time. '

    In other words: the fundamental problem has likely not been solved, only
    a tendency in the right direction has been observed, and a
    handful of numbers were picked out of a set of hugely variable results,
    without showing the variability data. How much variance is there
    run-to-run?

    I'd really suggest first walking the walk and seeing what's needed to get
    stable & predictable kernel compilation numbers on that NUMA box, before
    adding random syscalls to tune a particular aspect of the VM ... which
    approach might not even matter once the whole picture has been analyzed
    and understood!

    The third, most important point is that the syscall exposes VM tuning
    internals in a completely unstructured way. What sense does it make to
    have a _GLOBAL_ per-node setting for 'should we go to another node for
    reclaim'? If anything, it might make sense to do this per-app, via numalib or
    so.

    The change is minimalistic in that it doesn't remove the syscall and the
    underlying infrastructure changes, only the user-visible changes. We
    could perhaps add a CAP_SYS_ADMIN-only sysctl for this hack, a la
    /proc/sys/vm/swappiness, but even that looks quite counterproductive
    when the generic approach is that we are trying to reduce the number of
    external factors in the VM balance picture.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

13 Jul, 2005

1 commit

  • inotify is intended to correct the deficiencies of dnotify, particularly
    its inability to scale and its terrible user interface:

    * dnotify requires opening one fd for each directory
      that you intend to watch. This quickly results in too many
      open files and pins removable media, preventing unmount.
    * dnotify is directory-based. You only learn about changes to
      directories. Sure, a change to a file in a directory affects
      the directory, but you are then forced to keep a cache of
      stat structures.
    * dnotify's interface to user-space is awful. Signals?
    inotify provides a more usable, simple, powerful solution to file change
    notification:

    * inotify's interface is a system call that returns a fd, not SIGIO.
      You get a single fd, which is select()-able.
    * inotify has an event that says "the filesystem that the item
      you were watching is on was unmounted."
    * inotify can watch directories or files.

    Inotify is currently used by Beagle (a desktop search infrastructure),
    Gamin (a FAM replacement), and other projects.

    See Documentation/filesystems/inotify.txt.

    Signed-off-by: Robert Love
    Cc: John McCutchan
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert Love
     

26 Jun, 2005

1 commit

  • This patch introduces the architecture-independent implementation of the
    sys_kexec_load and compat_sys_kexec_load system calls.

    Kexec on panic support has been integrated into the core patch and is
    relatively clean.

    In addition, the hopefully architecture-independent option
    crashkernel=size@location has been documented. Its purpose is to reserve
    space for the panic kernel to live in, a region that no DMA transfer will
    ever be set up to access.

    Signed-off-by: Eric Biederman
    Signed-off-by: Alexander Nyberg
    Signed-off-by: Adrian Bunk
    Signed-off-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
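
To make the crashkernel=size@location syntax from the kexec commit above concrete, here is a small user-space sketch that splits such a value into its two numbers. It is not the kernel's parser: `demo_memparse` and `demo_parse_crashkernel` are hypothetical helpers, and the K/M/G suffix handling is an assumption about the accepted number format.

```c
#include <stdlib.h>

/* Parse a number with an optional K, M or G suffix (assumed format). */
static unsigned long long demo_memparse(const char *s, char **end)
{
    unsigned long long value = strtoull(s, end, 0);

    switch (**end) {
    case 'G': case 'g':
        value <<= 10;
        /* fall through */
    case 'M': case 'm':
        value <<= 10;
        /* fall through */
    case 'K': case 'k':
        value <<= 10;
        (*end)++;
        break;
    default:
        break;
    }
    return value;
}

/* Split "size@location" into two byte counts. Returns 0 on success, -1 on error. */
static int demo_parse_crashkernel(const char *arg,
                                  unsigned long long *size,
                                  unsigned long long *base)
{
    char *end;

    *size = demo_memparse(arg, &end);
    if (end == arg || *end != '@')
        return -1;

    const char *loc = end + 1;
    *base = demo_memparse(loc, &end);
    if (end == loc || *end != '\0')
        return -1;

    return 0;
}
```

Under these assumptions, "64M@16M" would describe a 64 MiB reservation starting at the 16 MiB physical address, while a value without the @location part is rejected by this sketch.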
     

22 Jun, 2005

1 commit

  • This is the core of the (much simplified) early reclaim. The goal of this
    patch is to reclaim some easily-freed pages from a zone before falling back
    onto another zone.

    One of the major uses of this is NUMA machines. With the default allocator
    behavior the allocator would look for memory in another zone, which might be
    off-node, before trying to reclaim from the current zone.

    This adds a zone tuneable to enable early zone reclaim. It is selected on a
    per-zone basis and is turned on/off via syscall.

    Adding some extra throttling on the reclaim was also required (patch
    4/4). Without it, the machine would grind to a crawl when doing a
    "make -j" kernel build. Even with this patch the System Time is higher on
    average, but it seems tolerable. Here are some numbers for kernbench
    runs on a 2-node, 4cpu, 8Gig RAM Altix in the "make -j" run:

                         wall  user  sys  %cpu  ctx sw.  sleeps
                         ----  ----  ---  ----  -------  ------
    No patch             1009  1384  847   258   298170  504402
    w/patch, no reclaim   880  1376  667   288   254064  396745
    w/patch & reclaim    1079  1385  926   252   291625  548873

    These numbers are the average of 2 runs of 3 "make -j" runs done right
    after system boot. Run-to-run variability for "make -j" is huge, so
    these numbers aren't terribly useful except to see that with reclaim
    the benchmark still finishes in a reasonable amount of time.

    I also looked at the NUMA hit/miss stats for the "make -j" runs and the
    reclaim doesn't make any difference when the machine is thrashing away.

    Doing a "make -j8" on a single node that is filled with page cache pages
    takes 700 seconds with reclaim turned on and 735 seconds without reclaim
    (due to remote memory accesses).

    The simple zone_reclaim syscall program is at
    http://www.bork.org/~mort/sgi/zone_reclaim.c

    Signed-off-by: Martin Hicks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Hicks
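
The allocation order described in the early reclaim commit above (try to free easily-reclaimed pages in the current zone before falling back to the next one) can be modelled with a toy allocator. Everything here is a hypothetical simplification: `struct demo_zone`, its fields and `demo_alloc_pages` do not correspond to kernel structures, and "reclaim" is reduced to moving a counter.

```c
/* Toy model of a zone: free pages, cheaply reclaimable pages, and the tunable. */
struct demo_zone {
    long free_pages;
    long easily_freed;
    int reclaim_enabled;
};

/*
 * Walk the zones in fallback order. Returns the index of the zone that
 * satisfied the request, or -1 if none could.
 */
static int demo_alloc_pages(struct demo_zone *zones, int nzones, long want)
{
    for (int i = 0; i < nzones; i++) {
        struct demo_zone *z = &zones[i];

        /* Early reclaim: free cheap pages here before looking elsewhere. */
        if (z->free_pages < want && z->reclaim_enabled) {
            z->free_pages += z->easily_freed;
            z->easily_freed = 0;
        }

        if (z->free_pages >= want) {
            z->free_pages -= want;
            return i;
        }
        /* Otherwise fall back to the next zone, which may be off-node. */
    }
    return -1;
}
```

With the per-zone flag set, a request that the first zone could only satisfy after reclaim stays in that zone; with the flag clear, the same request falls through to the second zone, which is the default behavior the commit message describes.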
     

01 May, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds