17 Oct, 2007

40 commits

  • Probing pages and radix_tree_tagged are lockless operations with the lockless
    radix-tree. Convert these users to RCU locking rather than using tree_lock.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • The commit b5810039a54e5babf428e9a1e89fc1940fabff11 contains the note

    A last caveat: the ZERO_PAGE is now refcounted and managed with rmap
    (and thus mapcounted and count towards shared rss). These writes to
    the struct page could cause excessive cacheline bouncing on big
    systems. There are a number of ways this could be addressed if it is
    an issue.

    And indeed this cacheline bouncing has shown up on large SGI systems.
    There was a situation where an Altix system was essentially livelocked
    tearing down ZERO_PAGE pagetables when an HPC app aborted during startup.
    This situation can be avoided in userspace, but it does highlight the
    potential scalability problem with refcounting ZERO_PAGE, and corner
    cases where it can really hurt (we don't want the system to livelock!).

    There are several broad ways to fix this problem:
    1. add back some special casing to avoid refcounting ZERO_PAGE
    2. per-node or per-cpu ZERO_PAGES
    3. remove the ZERO_PAGE completely

    I will argue for 3. The others should also fix the problem, but they
    result in more complex code than does 3, with little or no real benefit
    that I can see.

    Why? Inserting a ZERO_PAGE for anonymous read faults appears to be a
    false optimisation: if an application is performance critical, it would
    not be doing many read faults of new memory, or at least it could be
    expected to write to that memory soon afterwards. If cache or memory use
    is critical, it should not be working with a significant number of
    ZERO_PAGEs anyway (a more compact representation of zeroes should be
    used).

    As a sanity check -- measuring on my desktop system, there are never many
    mappings to the ZERO_PAGE (eg. 2 or 3), thus memory usage here should not
    increase much without it.

    When running a make -j4 kernel compile on my dual core system, there are
    about 1,000 mappings to the ZERO_PAGE created per second, but about 1,000
    ZERO_PAGE COW faults per second (less than 1 ZERO_PAGE mapping per second
    is torn down without being COWed). So removing ZERO_PAGE will save 1,000
    page faults per second when running kbuild, while keeping it only saves
    less than 1 page clearing operation per second. 1 page clear is cheaper
    than a thousand faults, presumably, so there isn't an obvious loss.

    Neither the logical argument nor these basic tests give a guarantee of no
    regressions. However, this is a reasonable opportunity to try to remove
    the ZERO_PAGE from the pagefault path. If it is found to cause regressions,
    we can reintroduce it and just avoid refcounting it.

    The /dev/zero ZERO_PAGE usage and TLB tricks also get nuked. I don't see
    much use to them except on benchmarks. All other users of ZERO_PAGE are
    converted just to use ZERO_PAGE(0) for simplicity. We can look at
    replacing them all and maybe ripping out ZERO_PAGE completely when we are
    more satisfied with this solution.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus "snif" Torvalds

    Nick Piggin
     
  • This gets rid of all kmalloc caches larger than page size. A kmalloc
    request larger than PAGE_SIZE / 2 is going to be passed through to the page
    allocator. This works both inline, where we will call __get_free_pages
    instead of kmem_cache_alloc, and in __kmalloc.

    kfree is modified to check if the object is in a slab page. If not then
    the page is freed via the page allocator instead. Roughly similar to what
    SLOB does.

    Advantages:
    - Reduces memory overhead for kmalloc array
    - Large kmalloc operations are faster since they do not
      need to pass through the slab allocator to get to the
      page allocator.
    - Performance increase of 10%-20% on alloc and 50% on free for
      PAGE_SIZEd allocations.
      SLUB must call the page allocator for each alloc anyway, since
      the higher order pages that allowed avoiding the page alloc calls
      are not available in a reliable way anymore. So we are basically
      removing useless slab allocator overhead.
    - Large kmallocs yield page-aligned objects, which is what
      SLAB did. Bad things like using page sized kmalloc allocations to
      stand in for page allocator allocs can be transparently handled and
      are not distinguishable from page allocator uses.
    - Checking for too large objects can be removed since
      it is done by the page allocator.

    Drawbacks:
    - No accounting for large kmalloc slab allocations anymore
    - No debugging of large kmalloc slab allocations.
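
    The size-based dispatch described above can be sketched in userspace C.
    This is only an illustration, not the SLUB code: the toy_ names and the
    4 KiB page size are assumptions, and the threshold of half a page is taken
    from the description.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* True when a request would bypass the slab caches and go straight to the
 * page allocator (threshold assumed to be half a page). */
static bool toy_kmalloc_is_large(size_t size)
{
    return size > TOY_PAGE_SIZE / 2;
}

/* Smallest order such that (TOY_PAGE_SIZE << order) >= size, similar in
 * spirit to what a get_order()-style helper computes. */
static unsigned int toy_get_order(size_t size)
{
    size_t pages = (size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
    unsigned int order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}
```

    With these assumptions a 2048-byte request stays in the slab caches, while
    a 2049-byte request is treated as large and rounded up to an order-0 page.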

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Convert some 'unsigned long' to pgoff_t.

    Signed-off-by: Fengguang Wu
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • - remove unused local next_index in do_generic_mapping_read()
    - remove a redundant page_cache_read() declaration

    Signed-off-by: Fengguang Wu
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • Remove the size limit max_sectors_kb imposed on max_readahead_kb.

    The size restriction is unreasonable. Especially when max_sectors_kb cannot
    grow larger than max_hw_sectors_kb, which can be rather small for some disk
    drives.

    Cc: Jens Axboe
    Signed-off-by: Fengguang Wu
    Acked-by: Jens Axboe
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • Remove VM_MAX_CACHE_HIT, MAX_RA_PAGES and MIN_RA_PAGES.

    Signed-off-by: Fengguang Wu
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • The local copy of ra in do_generic_mapping_read() can now go away.

    It predates readahead(req_size), dating from a time when the readahead code
    was called on *every* single page. Hence a local copy had to be made to
    reduce the chance of the readahead state being overwritten by a concurrent
    reader. More details in: Linux: Random File I/O Regressions In 2.6

    Cc: Nick Piggin
    Signed-off-by: Fengguang Wu
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • This is a simplified version of the pagecache context based readahead. It
    handles the case of multiple threads reading on the same fd and invalidating
    each others' readahead state. It does the trick by scanning the pagecache and
    recovering the current read stream's readahead status.

    The algorithm works in an opportunistic way, in that it does not try to detect
    interleaved reads _actively_, which requires a probe into the page cache
    (which means a little more overhead for random reads). It only tries to
    handle a previously started sequential readahead whose state was overwritten
    by another concurrent stream, and it can do this job pretty well.

    Negative and positive examples (or what you can expect from it):

    1) it cannot detect and serve perfect request-by-request interleaved reads
    right:
    time    stream 1    stream 2
    0       1
    1                   1001
    2       2
    3                   1002
    4       3
    5                   1003
    6       4
    7                   1004
    8       5
    9                   1005

    Here no single readahead will be carried out.

    2) However, if it's two concurrent reads by two threads, the chance of the
    initial sequential readahead being started is high. Once the first sequential
    readahead is started for a stream, this patch will ensure that the readahead
    window continues to ramp up and won't be disturbed by other streams.

    time    stream 1    stream 2
    0       1
    1       2
    2                   1001
    3       3
    4                   1002
    5                   1003
    6       4
    7       5
    8                   1004
    9       6
    10                  1005
    11      7
    12                  1006
    13                  1007
    13 1007

    Here stream 1 will start a readahead at page 2, and stream 2 will start its
    first readahead at page 1003. From then on the two streams will be served
    right.

    Cc: Rusty Russell
    Signed-off-by: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • Introduce radix_tree_next_hole(root, index, max_scan) to scan radix tree for
    the first hole. It will be used in interleaved readahead.

    The implementation is dumb and obviously correct. It can help debug (and
    document) the possible smart one in future.
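
    A "dumb and obviously correct" hole scan of this kind might look like the
    following userspace sketch. It is an assumption-laden model, not the
    kernel function: a plain bool array stands in for the radix tree, and the
    toy_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the radix tree: present[i] is true when
 * index i holds an item; indices beyond nr_slots count as empty. */
struct toy_tree {
    const bool *present;
    unsigned long nr_slots;
};

static bool toy_lookup(const struct toy_tree *t, unsigned long index)
{
    return index < t->nr_slots && t->present[index];
}

/* Probe at most max_scan consecutive indices starting at index and stop at
 * the first empty one. If every probed index is occupied, the returned
 * index is simply the one just past the scanned range. */
static unsigned long toy_next_hole(const struct toy_tree *t,
                                   unsigned long index,
                                   unsigned long max_scan)
{
    unsigned long i;

    assert(t != NULL);
    for (i = 0; i < max_scan; i++) {
        if (!toy_lookup(t, index))
            break;
        index++;
        if (index == 0)  /* wrapped around the index space */
            break;
    }
    return index;
}
```

    The caller has to compare the result against index + max_scan to tell a
    real hole from an exhausted scan.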

    Cc: Nick Piggin
    Signed-off-by: Fengguang Wu
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • Combine the file_ra_state members
    unsigned long prev_index
    unsigned int prev_offset
    into
    loff_t prev_pos

    It is more consistent and better supports huge files.
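
    As a rough illustration of the folding, a page index and an in-page offset
    can be packed into, and recovered from, a single 64-bit position. The
    helper names, the toy_loff_t typedef and the 4 KiB page size below are
    assumptions for the sketch; widening before the shift is what avoids an
    overflow when unsigned long is 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12                      /* assumption: 4 KiB pages */
#define TOY_PAGE_SIZE  (1ULL << TOY_PAGE_SHIFT)

typedef int64_t toy_loff_t;                    /* stand-in for loff_t */

/* Combine (index, offset) into one byte position. The index is widened to
 * 64 bits before shifting so large indices are not truncated. */
static toy_loff_t toy_make_pos(unsigned long index, unsigned int offset)
{
    assert(offset < TOY_PAGE_SIZE);
    return (toy_loff_t)(((uint64_t)index << TOY_PAGE_SHIFT) | offset);
}

static unsigned long toy_pos_index(toy_loff_t pos)
{
    return (unsigned long)((uint64_t)pos >> TOY_PAGE_SHIFT);
}

static unsigned int toy_pos_offset(toy_loff_t pos)
{
    return (unsigned int)((uint64_t)pos & (TOY_PAGE_SIZE - 1));
}
```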

    Thanks to Peter for the nice proposal!

    [akpm@linux-foundation.org: fix shift overflow]
    Cc: Peter Zijlstra
    Signed-off-by: Fengguang Wu
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • Fold file_ra_state.mmap_hit into file_ra_state.mmap_miss and make it an int.

    Signed-off-by: Fengguang Wu
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • Use 'unsigned int' instead of 'unsigned long' for readahead sizes.

    This helps reduce memory consumption on 64bit CPU when a lot of files are
    opened.

    CC: Andi Kleen
    Signed-off-by: Fengguang Wu
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fengguang Wu
     
  • This patch cleans up duplicate includes in
    mm/

    Signed-off-by: Jesper Juhl
    Acked-by: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
     
  • This patch cleans up duplicate includes in
    include/linux/memory_hotplug.h

    Signed-off-by: Jesper Juhl
    Acked-by: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
     
  • We have had complaints where a threaded application is left in a bad state
    after one of its threads is killed when we hit a VM: out_of_memory
    condition.

    Killing just one of the process threads can leave the application in a bad
    state, whereas killing the entire process group would allow for the
    application to restart, or be otherwise handled, and makes it very obvious
    that something has gone wrong.

    This change allows the entire process group to be taken down, rather
    than just the one thread.

    Signed-off-by: Will Schmidt
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Russell King
    Cc: Ian Molton
    Cc: Haavard Skinnemoen
    Cc: Mikael Starvik
    Cc: David Howells
    Cc: Andi Kleen
    Cc: "Luck, Tony"
    Cc: Hirokazu Takata
    Cc: Geert Uytterhoeven
    Cc: Roman Zippel
    Cc: Ralf Baechle
    Cc: Kyle McMartin
    Cc: Matthew Wilcox
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: Paul Mundt
    Cc: Kazumoto Kojima
    Cc: Richard Curnow
    Cc: William Lee Irwin III
    Cc: "David S. Miller"
    Cc: Chris Zankel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Will Schmidt
     
  • WARNING: mm/built-in.o(.text+0x24bd3): Section mismatch: reference to .init.text:early_kmem_cache_node_alloc (between 'init_kmem_cache_nodes' and 'calculate_sizes')
    ...

    Signed-off-by: Adrian Bunk
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Enable virtual memmap support for SPARSEMEM on PPC64 systems. Slice a 16th
    off the end of the linear mapping space and use that to hold the vmemmap.
    Uses the same size mapping as is used in the linear 1:1 kernel mapping.

    [pbadari@gmail.com: fix warning]
    Signed-off-by: Andy Whitcroft
    Acked-by: Mel Gorman
    Cc: Christoph Lameter
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Badari Pulavarty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • [apw@shadowen.org: style fixups]
    [apw@shadowen.org: vmemmap sparc64: convert to new config options]
    Signed-off-by: Andy Whitcroft
    Acked-by: Mel Gorman
    Acked-by: Christoph Lameter
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Miller
     
  • Equip IA64 sparsemem with a virtual memmap. This is similar to the existing
    CONFIG_VIRTUAL_MEM_MAP functionality for DISCONTIGMEM. It uses a PAGE_SIZE
    mapping.

    This is provided as a minimally intrusive solution. We split the 128TB
    VMALLOC area into two 64TB areas and use one for the virtual memmap.

    This should replace CONFIG_VIRTUAL_MEM_MAP long term.

    [apw@shadowen.org: convert to new helper based initialisation]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andy Whitcroft
    Acked-by: Mel Gorman
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • x86_64 uses 2M page table entries to map its 1-1 kernel space. We also
    implement the virtual memmap using 2M page table entries, so there is no
    additional runtime overhead over FLATMEM, although initialisation is slightly
    more complex. As FLATMEM still references memory to obtain the mem_map pointer and
    SPARSEMEM_VMEMMAP uses a compile time constant, SPARSEMEM_VMEMMAP should be
    superior.

    With this SPARSEMEM becomes the most efficient way of handling virt_to_page,
    pfn_to_page and friends for UP, SMP and NUMA on x86_64.

    [apw@shadowen.org: code resplit, style fixups]
    [apw@shadowen.org: vmemmap x86_64: ensure end of section memmap is initialised]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andy Whitcroft
    Acked-by: Mel Gorman
    Cc: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Convert the common vmemmap population into initialisation helpers for use by
    architecture vmemmap populators. All architectures implementing the
    SPARSEMEM_VMEMMAP variant supply an architecture specific vmemmap_populate()
    initialiser, which may make use of the helpers.

    This allows us to clean up and remove the initialisation Kconfig entries.
    With this patch there is a single SPARSEMEM_VMEMMAP_ENABLE Kconfig option to
    indicate use of that variant.

    Signed-off-by: Andy Whitcroft
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • SPARSEMEM is a pretty nice framework that unifies quite a bit of code over all
    the arches. It would be great if it could be the default so that we can get
    rid of various forms of DISCONTIG and other variations on memory maps. So far
    what has hindered this are the additional lookups that SPARSEMEM introduces
    for virt_to_page and page_address. This goes so far that the code to do this
    has to be kept in a separate function and cannot be used inline.

    This patch introduces a virtual memmap mode for SPARSEMEM, in which the memmap
    is mapped into a virtually contiguous area; only the active sections are
    physically backed. This allows virt_to_page, page_address and cohorts to become
    simple shift/add operations. No page flag fields, no table lookups, nothing
    involving memory is required.

    The two key operations pfn_to_page and page_to_pfn become:

    #define __pfn_to_page(pfn) (vmemmap + (pfn))
    #define __page_to_pfn(page) ((page) - vmemmap)
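
    The pointer arithmetic behind these two macros can be tried out in
    userspace. The sketch below is only a model: a static array stands in for
    the virtually mapped memmap, and the toy_ names and the array size are
    assumptions rather than kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

struct toy_page {
    unsigned long flags;
};

#define TOY_NR_PAGES 1024UL

/* Stand-in for the virtually mapped memmap. In the kernel, vmemmap would be
 * a fixed virtual address rather than an ordinary array. */
static struct toy_page toy_memmap[TOY_NR_PAGES];
#define toy_vmemmap (toy_memmap)

/* pfn -> struct page: a single add on a constant base. */
static struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
    assert(pfn < TOY_NR_PAGES);
    return toy_vmemmap + pfn;
}

/* struct page -> pfn: a single pointer subtraction. */
static unsigned long toy_page_to_pfn(const struct toy_page *page)
{
    return (unsigned long)(page - toy_vmemmap);
}
```

    Neither direction needs a table lookup; the conversion is plain pointer
    arithmetic on a constant base.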

    By having a virtual mapping for the memmap we allow simple access without
    wasting physical memory. As kernel memory is typically already mapped 1:1
    this introduces no additional overhead. The virtual mapping must be big
    enough to allow a struct page to be allocated and mapped for all valid
    physical pages. This will make a virtual memmap difficult to use on 32 bit
    platforms that support 36 address bits.

    However, if there is enough virtual space available and the arch already maps
    its 1-1 kernel space using TLBs (f.e. true of IA64 and x86_64) then this
    technique makes SPARSEMEM lookups even more efficient than CONFIG_FLATMEM.
    FLATMEM needs to read the contents of the mem_map variable to get the start of
    the memmap and then add the offset to the required entry. vmemmap is a
    constant to which we can simply add the offset.

    This patch has the potential to allow us to make SPARSEMEM the default (and
    even the only) option for most systems. It should be optimal on UP, SMP and
    NUMA on most platforms. Then we may even be able to remove the other memory
    models: FLATMEM, DISCONTIG etc.

    [apw@shadowen.org: config cleanups, resplit code etc]
    [kamezawa.hiroyu@jp.fujitsu.com: Fix sparsemem_vmemmap init]
    [apw@shadowen.org: vmemmap: remove excess debugging]
    [apw@shadowen.org: simplify initialisation code and reduce duplication]
    [apw@shadowen.org: pull out the vmemmap code into its own file]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andy Whitcroft
    Acked-by: Mel Gorman
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Cc: "David S. Miller"
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • We have flags to indicate whether a section actually has a valid mem_map
    associated with it. This is never set and we rely solely on the present bit
    to indicate a section is valid. By definition a section is not valid if it
    has no mem_map and there is a window during init where the present bit is set
    but there is no mem_map, during which pfn_valid() will return true
    incorrectly.

    Use the existing SECTION_HAS_MEM_MAP flag to indicate the presence of a valid
    mem_map. Switch valid_section{,_nr} and pfn_valid() to this bit. Add a new
    present_section{,_nr} and pfn_present() interfaces for those users who care to
    know that a section is going to be valid.
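
    The distinction between "present" and "valid" can be modelled with two
    flag bits. This is a sketch under stated assumptions: the toy_ names, the
    bit positions and the struct layout are illustrative and not taken from
    the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed flag bits kept in the low bits of the section word. */
#define TOY_SECTION_MARKED_PRESENT (1UL << 0)
#define TOY_SECTION_HAS_MEM_MAP    (1UL << 1)

struct toy_mem_section {
    unsigned long section_mem_map;
};

/* Present: memory exists here, but its mem_map may not be set up yet. */
static bool toy_present_section(const struct toy_mem_section *s)
{
    return s != NULL && (s->section_mem_map & TOY_SECTION_MARKED_PRESENT) != 0;
}

/* Valid: a mem_map has been attached, so struct page lookups are safe. */
static bool toy_valid_section(const struct toy_mem_section *s)
{
    return s != NULL && (s->section_mem_map & TOY_SECTION_HAS_MEM_MAP) != 0;
}
```

    During the init window described above, a section would report present
    but not yet valid.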

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Andy Whitcroft
    Acked-by: Mel Gorman
    Cc: Christoph Lameter
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Cc: "David S. Miller"
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • SPARSEMEM is a pretty nice framework that unifies quite a bit of code over all
    the arches. It would be great if it could be the default so that we can get
    rid of various forms of DISCONTIG and other variations on memory maps. So far
    what has hindered this are the additional lookups that SPARSEMEM introduces
    for virt_to_page and page_address. This goes so far that the code to do this
    has to be kept in a separate function and cannot be used inline.

    This patch introduces a virtual memmap mode for SPARSEMEM, in which the memmap
    is mapped into a virtually contiguous area; only the active sections are
    physically backed. This allows virt_to_page, page_address and cohorts to become
    simple shift/add operations. No page flag fields, no table lookups, nothing
    involving memory is required.

    The two key operations pfn_to_page and page_to_pfn become:

    #define __pfn_to_page(pfn) (vmemmap + (pfn))
    #define __page_to_pfn(page) ((page) - vmemmap)

    By having a virtual mapping for the memmap we allow simple access without
    wasting physical memory. As kernel memory is typically already mapped 1:1
    this introduces no additional overhead. The virtual mapping must be big
    enough to allow a struct page to be allocated and mapped for all valid
    physical pages. This will make a virtual memmap difficult to use on 32 bit
    platforms that support 36 address bits.

    However, if there is enough virtual space available and the arch already maps
    its 1-1 kernel space using TLBs (f.e. true of IA64 and x86_64) then this
    technique makes SPARSEMEM lookups even more efficient than CONFIG_FLATMEM.
    FLATMEM needs to read the contents of the mem_map variable to get the start of
    the memmap and then add the offset to the required entry. vmemmap is a
    constant to which we can simply add the offset.

    This patch has the potential to allow us to make SPARSEMEM the default (and
    even the only) option for most systems. It should be optimal on UP, SMP and
    NUMA on most platforms. Then we may even be able to remove the other memory
    models: FLATMEM, DISCONTIG etc.

    The current aim is to bring a common virtually mapped mem_map to all
    architectures. This should facilitate the removal of the bespoke
    implementations from the architectures. This also brings performance
    improvements for most architectures, making sparsemem vmemmap the more desirable
    memory model. The ultimate aim of this work is to expand sparsemem support to
    encompass all the features of the other memory models. This could allow us to
    drop support for and remove the other models in the longer term.

    Below are some comparative kernbench numbers for various architectures,
    comparing default memory model against SPARSEMEM VMEMMAP. All but ia64 show
    marginal improvement; we expect the ia64 figures to be sorted out when the
    larger mapping support returns.

    x86-64 non-NUMA
                Base   VMEMMAP  % change (-ve good)
    User       85.07     84.84     -0.26
    System     34.32     33.84     -1.39
    Total     119.38    118.68     -0.59

    ia64
                Base   VMEMMAP  % change (-ve good)
    User     1016.41   1016.93      0.05
    System     50.83     51.02      0.36
    Total    1067.25   1067.95      0.07

    x86-64 NUMA
                Base   VMEMMAP  % change (-ve good)
    User      430.77    431.73      0.22
    System     45.39     43.98     -3.11
    Total     476.17    475.71     -0.10

    ppc64
                Base   VMEMMAP  % change (-ve good)
    User      488.77    488.35     -0.09
    System     56.92     56.37     -0.97
    Total     545.69    544.72     -0.18

    Below are some AIM benchmarks on IA64 and x86-64 (thanks, Bob). These seem
    pretty much flat, as you would expect.

    ia64 results 2 cpu non-numa 4Gb SCSI disk

    Benchmark Version Machine Run Date
    AIM Multiuser Benchmark - Suite VII "1.1" extreme Jun 1 07:17:24 2007

    Tasks Jobs/Min JTI Real CPU Jobs/sec/task
    1 98.9 100 58.9 1.3 1.6482
    101 5547.1 95 106.0 79.4 0.9154
    201 6377.7 95 183.4 158.3 0.5288
    301 6932.2 95 252.7 237.3 0.3838
    401 7075.8 93 329.8 316.7 0.2941
    501 7235.6 94 403.0 396.2 0.2407
    600 7387.5 94 472.7 475.0 0.2052

    Benchmark Version Machine Run Date
    AIM Multiuser Benchmark - Suite VII "1.1" vmemmap Jun 1 09:59:04 2007

    Tasks Jobs/Min JTI Real CPU Jobs/sec/task
    1 99.1 100 58.8 1.2 1.6509
    101 5480.9 95 107.2 79.2 0.9044
    201 6490.3 95 180.2 157.8 0.5382
    301 6886.6 94 254.4 236.8 0.3813
    401 7078.2 94 329.7 316.0 0.2942
    501 7250.3 95 402.2 395.4 0.2412
    600 7399.1 94 471.9 473.9 0.2055

    open power 710 2 cpu, 4 Gb, SCSI and configured physically

    Benchmark Version Machine Run Date
    AIM Multiuser Benchmark - Suite VII "1.1" extreme May 29 15:42:53 2007

    Tasks Jobs/Min JTI Real CPU Jobs/sec/task
    1 25.7 100 226.3 4.3 0.4286
    101 1096.0 97 536.4 199.8 0.1809
    201 1236.4 96 946.1 389.1 0.1025
    301 1280.5 96 1368.0 582.3 0.0709
    401 1270.2 95 1837.4 771.0 0.0528
    501 1251.4 96 2330.1 955.9 0.0416
    601 1252.6 96 2792.4 1139.2 0.0347
    701 1245.2 96 3276.5 1334.6 0.0296
    918 1229.5 96 4345.4 1728.7 0.0223

    Benchmark Version Machine Run Date
    AIM Multiuser Benchmark - Suite VII "1.1" vmemmap May 30 07:28:26 2007

    Tasks Jobs/Min JTI Real CPU Jobs/sec/task
    1 25.6 100 226.9 4.3 0.4275
    101 1049.3 97 560.2 198.1 0.1731
    201 1199.1 97 975.6 390.7 0.0994
    301 1261.7 96 1388.5 591.5 0.0699
    401 1256.1 96 1858.1 771.9 0.0522
    501 1220.1 96 2389.7 955.3 0.0406
    601 1224.6 96 2856.3 1133.4 0.0340
    701 1252.0 96 3258.7 1314.1 0.0298
    915 1232.8 96 4319.7 1704.0 0.0225

    amd64 2 2-core, 4Gb and SATA

    Benchmark Version Machine Run Date
    AIM Multiuser Benchmark - Suite VII "1.1" extreme Jun 2 03:59:48 2007

    Tasks Jobs/Min JTI Real CPU Jobs/sec/task
    1 13.0 100 446.4 2.1 0.2173
    101 533.4 97 1102.0 110.2 0.0880
    201 578.3 97 2022.8 220.8 0.0480
    301 583.8 97 3000.6 332.3 0.0323
    401 580.5 97 4020.1 442.2 0.0241
    501 574.8 98 5072.8 558.8 0.0191
    600 566.5 98 6163.8 671.0 0.0157

    Benchmark Version Machine Run Date
    AIM Multiuser Benchmark - Suite VII "1.1" vmemmap Jun 3 04:19:31 2007

    Tasks Jobs/Min JTI Real CPU Jobs/sec/task
    1 13.0 100 447.8 2.0 0.2166
    101 536.5 97 1095.6 109.7 0.0885
    201 567.7 97 2060.5 219.3 0.0471
    301 582.1 96 3009.4 330.2 0.0322
    401 578.2 96 4036.4 442.4 0.0240
    501 585.1 98 4983.2 555.1 0.0195
    600 565.5 98 6175.2 660.6 0.0157

    This patch:

    Fix some spelling errors.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andy Whitcroft
    Acked-by: Mel Gorman
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Cc: "David S. Miller"
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • x86(-64) are the last architectures still using the page fault notifier
    cruft for the kprobes page fault hook. This patch converts them to the
    proper direct calls, and removes the now unused pagefault notifier bits
    as well as the cruft in kprobes.c that was related to this mess.

    I know Andi didn't really like this, but all other architecture maintainers
    agreed the direct calls are much better and besides the obvious cruft
    removal a common way of dealing with kprobes across architectures is
    important as well.

    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: fix sparc64]
    Signed-off-by: Christoph Hellwig
    Cc: Andi Kleen
    Cc:
    Cc: Prasanna S Panchamukhi
    Cc: Ananth N Mavinakayanahalli
    Cc: Anil S Keshavamurthy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Convert cpu_sibling_map from a static array sized by NR_CPUS to a per_cpu
    variable. This saves sizeof(cpumask_t) for each unused cpu. Access is mostly
    from startup and CPU HOTPLUG functions.

    Signed-off-by: Mike Travis
    Cc: Andi Kleen
    Cc: Christoph Lameter
    Cc: "Siddha, Suresh B"
    Cc: "David S. Miller"
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Travis
     
  • This is from an earlier message from 'Christoph Lameter':

    cpu_core_map is currently an array defined using NR_CPUS. This means that
    we overallocate, since we will rarely use the maximum configured number of
    cpus.

    If we put the cpu_core_map into the per cpu area then it will be allocated
    for each processor as it comes online.

    This means that the core map cannot be accessed until the per cpu area
    has been allocated. Xen does a weird thing here looping over all processors
    and zeroing the masks that are not yet allocated and that will be zeroed
    when they are allocated. I commented the code out.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Mike Travis
    Cc: Andi Kleen
    Cc: Christoph Lameter
    Cc: "Siddha, Suresh B"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Travis
     
  • Notebook manufacturers seem to have built newer Wacom pen-enabled tablets
    into recent tablet PCs, which are not recognized by the serial pnp driver.

    Attached is a patch which makes the newer Wacom WACF007 and WACF008 tablets
    useable with the serial driver. The device is fully compatible with it.

    Signed-off-by: Maik Broemme
    Cc: Andrey Panin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maik Broemme
     
  • The UPF_FIXED_PORT flag was introduced in 2.6.22 and it can be used
    instead of the driver specific verify_port routine.

    Signed-off-by: Atsushi Nemoto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Atsushi Nemoto
     
  • Enable wakeup from serial ports, make it run-time configurable over sysfs,
    e.g.,

    echo enabled > /sys/devices/platform/serial8250.0/tty/ttyS0/power/wakeup

    Requires

    # CONFIG_SYSFS_DEPRECATED is not set

    Following suggestions from Alan and Russell moved the may_wake_up checks
    to serial_core.c. This time actually tested - it does even work. Could
    someone, please, verify, that put_device after device_find_child is
    correct?

    Also would be nice to test with a Natsemi UART, that can wake up the system,
    if such systems exist.

    For this you just have to apply the patch below, issue the above "echo"
    command to one of your Natsemi ports, suspend and resume your system, and
    verify that your Natsemi port still works. If you are actually capable of
    waking up the system from that port, would be nice to test that as well.

    Signed-off-by: Guennadi Liakhovetski
    Cc: Alan Cox
    Cc: Russell King
    Cc: Kay Sievers
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Guennadi Liakhovetski
     
  • Provide {enable,disable}_irq_wakeup dummies for undefined
    cross-compilers for platforms without CONFIG_GENERIC_IRQ.

    Needed by wake-up-from-a-serial-port.patch

    Signed-off-by: Guennadi Liakhovetski
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Guennadi Liakhovetski
     
  • Add support for a whole range of boards. Some are partly autodetected but
    not fully correctly; others (PCI Express notably) are not detected at all.
    Stick all the right entries in.

    Thanks to Mainpine for information and testing.

    Signed-off-by: Alan Cox
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • Do not include some header files already included by serial_core.h.

    Signed-off-by: Atsushi Nemoto
    Cc: Ralf Baechle
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Atsushi Nemoto
     
  • Most non cardbus devices can't do dma, so flag them as such in the device
    creation routine.

    Signed-off-by: James Bottomley
    Cc: Andi Kleen
    Cc: Alan Cox
    Cc: Tejun Heo
    Cc: Natalie Protasevich
    Cc: Jeff Garzik
    Cc: Dominik Brodowski
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    James Bottomley
     
  • Some devices are incapable of DMA and need to be recognised as such.
    Introduce a NONE dma mask to facilitate this plus an inline function:
    is_device_dma_capable() to check this.
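
    A check of this kind could be sketched as follows. The struct, the toy_
    names and the choice of zero as the NONE mask value are assumptions for
    illustration, not the definitions added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_DMA_MASK_NONE 0x0ULL  /* assumed: no address bits usable for DMA */

/* Hypothetical minimal device: only the DMA mask pointer matters here. */
struct toy_device {
    uint64_t *dma_mask;
};

/* A device counts as DMA capable only if it has a mask at all and that
 * mask is not the NONE value. */
static bool toy_is_device_dma_capable(const struct toy_device *dev)
{
    assert(dev != NULL);
    return dev->dma_mask != NULL && *dev->dma_mask != TOY_DMA_MASK_NONE;
}
```

    A bus driver could then set the mask to the NONE value at device creation
    time, and callers would test the helper before setting up any DMA mapping.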

    Signed-off-by: James Bottomley
    Cc: Andi Kleen
    Cc: Alan Cox
    Cc: Tejun Heo
    Cc: Natalie Protasevich
    Cc: Jeff Garzik
    Cc: Dominik Brodowski
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    James Bottomley
     
  • Add support for Sierra Wireless AC850 which has the same Ids as the
    AC710/750 but has a different firmware.

    Cc: Dominik Brodowski
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Leblond
     
  • Based on a patch by Haavard Skinnemoen posted to linux-pcmcia, but using
    static inlines for readability reasons. This should fix PCMCIA on AVR32.

    Signed-off-by: Daniel Ritz
    Cc: Haavard Skinnemoen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Ritz
     
  • Only a few definitions are in xxs1500.h.
    They can be moved to au1000_xxs1500.c.

    [m.kozlowski@tuxland.pl: fix unbalanced parenthesis]
    Signed-off-by: Yoichi Yuasa
    Cc: Ralf Baechle
    Cc: Dominik Brodowski
    Signed-off-by: Mariusz Kozlowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yoichi Yuasa
     
  • Recently I've been trying to get a working PCMCIA interface on the H5000 iPAQ
    series, using a dual PCMCIA sleeve. So far things work correctly, but I had
    to make one modification to drivers/pcmcia/pxa2xx_base.c to get the interface
    working with an Orinoco Gold PCMCIA card (a wired pcnet_cs ethernet card worked
    even without this modification).

    The issue has something to do with the assert time on the PCMCIA bus, but I'm
    not really sure what -- I found the working value by trial and error.
    I'm not sure how the assert value in pxa2xx_mcxx_asst is calculated (I
    know, a simple formula, but the reason why it is calculated that way is not
    obvious to me), nor whether my modification is correct. It just works
    with the iPAQ.

    Cc: Russell King
    Cc: Richard Purdie
    Cc: Dominik Brodowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Milan Plzik