22 Jul, 2008

1 commit

  • This allows attributes to be generated dynamically and show/store
    functions to be shared between attributes. Right now most attributes
    are generated by special macros and lots of duplicated code. With the
    attribute passed in, it's instead possible to attach some data to the
    attribute and then use that in shared low-level functions to do
    different things.

    I need this for the dynamically generated bank attributes in the x86
    machine check code, but it'll allow some further cleanups.

    I converted all in-tree users to the new show/store prototype. It's a
    single huge patch to avoid unbisectable sections.

    Runtime tested: x86-32, x86-64
    Compiled only: ia64, powerpc
    Not compile tested/only grep converted: sh, arm, avr32

    Signed-off-by: Andi Kleen
    Signed-off-by: Greg Kroah-Hartman

    Andi Kleen
     

30 Apr, 2008

1 commit

  • Fuse will use temporary buffers to write back dirty data from memory mappings
    (normal writes are done synchronously). This is needed, because there cannot
    be any guarantee about the time in which a write will complete.

    By using temporary buffers, from the MM's point of view the page is
    written back immediately. If the writeout was due to memory pressure,
    this effectively migrates data from a full zone to a less full zone.

    This patch adds a new counter (NR_WRITEBACK_TEMP) for the number of pages used
    as temporary buffers.

    [Lee.Schermerhorn@hp.com: add vmstat_text for NR_WRITEBACK_TEMP]
    Signed-off-by: Miklos Szeredi
    Cc: Christoph Lameter
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

20 Apr, 2008

2 commits

  • * Cleaned up references to cpumask_scnprintf() and added new
    cpulist_scnprintf() interfaces where appropriate.

    * Fix some small bugs (and make some code efficiency improvements) for
    various uses of cpumask_scnprintf.

    * Clean up some checkpatch errors.

    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
     
  • * Use new node_to_cpumask_ptr. This creates a pointer to the
    cpumask for a given node. This definition is in mm patch:

    asm-generic-add-node_to_cpumask_ptr-macro.patch

    * Use new set_cpus_allowed_ptr function.

    Depends on:
    [mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
    [sched-devel]: sched: add new set_cpus_allowed_ptr function
    [x86/latest]: x86: add cpus_scnprintf function

    Cc: Greg Kroah-Hartman
    Cc: Greg Banks
    Cc: H. Peter Anvin
    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
     

17 Oct, 2007

1 commit

  • Add a per node state sysfs class attribute file to /sys/devices/system/node
    to display node state masks.

    E.g., on a 4-cell HP ia64 NUMA platform, we have 5 nodes: 4 representing
    the actual hardware cells and one memory-only pseudo-node representing a
    small amount [512MB] of "hardware interleaved" memory. With this patch, in
    /sys/devices/system/node we see:

    #ls -1F /sys/devices/system/node
    has_cpu
    has_normal_memory
    node0/
    node1/
    node2/
    node3/
    node4/
    online
    possible
    #cat /sys/devices/system/node/possible
    0-255
    #cat /sys/devices/system/node/online
    0-4
    #cat /sys/devices/system/node/has_normal_memory
    0-4
    #cat /sys/devices/system/node/has_cpu
    0-3

    Signed-off-by: Lee Schermerhorn
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     

26 Sep, 2006

2 commits

  • Remove the atomic counter for slab_reclaim_pages and replace the counter
    and NR_SLAB with two ZVC counter that account for unreclaimable and
    reclaimable slab pages: NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE.

    Change the check in vmscan.c to refer to NR_SLAB_RECLAIMABLE. The
    intent seems to be to check for slab pages that could be freed.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Do not display HIGHMEM memory sizes if CONFIG_HIGHMEM is not set.

    Some texts depend on CONFIG_HIGHMEM. Remove those strings and make the
    display of the highmem counter values optional, omitting them when
    CONFIG_HIGHMEM is not set.

    [akpm@osdl.org: remove some ifdefs]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

01 Jul, 2006

10 commits

  • The numa statistics are really event counters. But they are per node and
    so we have had special treatment for these counters through additional
    fields on the pcp structure. We can now use the per zone nature of the
    zoned VM counters to realize these.

    This will shrink the size of the pcp structure on NUMA systems. We will
    have some room to add additional per zone counters that will all still fit
    in the same cacheline.

    Bits  Prior pcp size        Size after patch      We can add
    ------------------------------------------------------------------
    64    128 bytes (16 words)  80 bytes (10 words)   48
    32     76 bytes (19 words)  56 bytes (14 words)    8 (64 byte cacheline)
                                                      72 (128 byte)

    Remove the special statistics for numa and replace them with zoned vm
    counters. This has the side effect that global sums of these events now
    show up in /proc/vmstat.

    Also take the opportunity to move the zone_statistics() function from
    page_alloc.c into vmstat.c.

    Discussions:
    V2 http://marc.theaimsgroup.com/?t=115048227000002&r=1&w=2

    Signed-off-by: Christoph Lameter
    Acked-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Conversion of nr_bounce to a per zone counter

    nr_bounce is only used for proc output. So it could be left as an event
    counter. However, the event counters may not be accurate and nr_bounce is
    categorizing types of pages in a zone. So we really need this to also be a
    per zone counter.

    [akpm@osdl.org: bugfix]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Conversion of nr_unstable to a per zone counter

    We need to do some special modifications to the nfs code since there are
    multiple cases of disposition and we need to have a page ref for proper
    accounting.

    This converts the last critical page state of the VM and therefore we need to
    remove several functions that were depending on GET_PAGE_STATE_LAST in order
    to make the kernel compile again. We are only left with event type counters
    in page state.

    [akpm@osdl.org: bugfixes]
    Signed-off-by: Christoph Lameter
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Conversion of nr_writeback to per zone counter.

    This removes the last page_state counter from arch/i386/mm/pgtable.c so we
    drop the page_state from there.

    [akpm@osdl.org: bugfix]
    Signed-off-by: Christoph Lameter
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • This makes nr_dirty a per zone counter. Looping over all processors is
    avoided during writeback state determination.

    The counter aggregation for nr_dirty had to be undone in the NFS layer since
    we summed up the page counts from multiple zones. Someone more familiar with
    NFS should probably review what I have done.

    [akpm@osdl.org: bugfix]
    Signed-off-by: Christoph Lameter
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Conversion of nr_page_table_pages to a per zone counter

    [akpm@osdl.org: bugfix]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • - Allows reclaim to access the counter without looping over the
    per-processor counts.

    - Allows accurate statistics on how many pages are used in a zone by
    the slab. This may become useful to balance slab allocations over
    various zones.

    [akpm@osdl.org: bugfix]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • The current NR_FILE_MAPPED is used by zone reclaim and the dirty load
    calculation as the number of mapped pagecache pages. However, that is not
    true. NR_FILE_MAPPED includes the mapped anonymous pages. This patch
    separates those and therefore allows an accurate tracking of the anonymous
    pages per zone.

    It then becomes possible to determine the number of unmapped pages per zone
    and we can avoid scanning for unmapped pages if there are none.

    Also it may now be possible to determine the mapped/unmapped ratio in
    get_dirty_limit. Isn't the number of anonymous pages irrelevant in that
    calculation?

    Note that this will change the meaning of the number of mapped pages reported
    in /proc/vmstat /proc/meminfo and in the per node statistics. This may affect
    user space tools that monitor these counters! NR_FILE_MAPPED works like
    NR_FILE_DIRTY. It is only valid for pagecache pages.

    Signed-off-by: Christoph Lameter
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Currently a single atomic variable is used to establish the size of the page
    cache in the whole machine. The zoned VM counters have the same method of
    implementation as the nr_pagecache code but also allow the determination of
    the pagecache size per zone.

    Remove the special implementation for nr_pagecache and make it a zoned counter
    named NR_FILE_PAGES.

    Updates of the page cache counters are always performed with interrupts off.
    We can therefore use the __ variant here.

    Signed-off-by: Christoph Lameter
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • nr_mapped is important because it allows a determination of how many pages of
    a zone are not mapped, which would allow a more efficient means of determining
    when we need to reclaim memory in a zone.

    We take the nr_mapped field out of the page state structure and define a new
    per zone counter named NR_FILE_MAPPED (the anonymous pages will be split off
    from NR_MAPPED in the next patch).

    We replace the use of nr_mapped in various kernel locations. This
    avoids looping over all processors in try_to_free_pages(), writeback,
    and reclaim (swap + zone reclaim).

    [akpm@osdl.org: bugfix]
    Signed-off-by: Christoph Lameter
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

28 Jun, 2006

2 commits

  • With Goto-san's patch, we can add a new pgdat/node at runtime. I'm now
    considering node hot-add with cpu + memory on ACPI.

    I found that the ACPI container which describes a node can enumerate
    cpus before memory. This means cpu hot-add can occur before memory
    hot-add.

    For the most part, cpu hot-add doesn't depend on node hot-add. But
    register_cpu(), which creates the symbolic link from node to cpu,
    requires that the node be onlined before it is called. When a node is
    onlined, its pgdat should be there.

    This patch set holds off creating the symbolic link from node to cpu
    until the node is onlined.

    This removes the node argument from register_cpu().

    Currently, register_cpu() requires 'struct node' as its argument. But
    the array of struct node is now unified in drivers/base/node.c (by
    Goto's node hotplug patch). We can get the struct node in a generic
    way, so this argument is no longer necessary.

    This patch also guarantees that a cpu is added under a node only when
    the node is onlined. This is necessary for the node-hot-add vs.
    cpu-hot-add patch following this one.

    Moreover, register_cpu() calculates cpu->node_id with cpu_to_node(),
    without regard to its 'struct node *root' argument, so this patch
    removes that argument.

    Also modify the callers of register_cpu()/unregister_cpu(), whose
    arguments are changed by the register-cpu-remove-node-struct patch.

    [Brice.Goglin@ens-lyon.org: fix it]
    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Cc: Ashok Raj
    Cc: Dave Hansen
    Signed-off-by: Brice Goglin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • When a new node becomes enabled by hot-add, new sysfs files must be
    created for it. So, if a new node is enabled by add_memory(),
    register_one_node() is called to create them. In addition, i386's
    arch_register_node() and a part of powerpc's register_nodes() are
    consolidated into register_one_node() as generic code.

    This is tested by Tiger4(IPF) with node hot-plug emulation.

    Signed-off-by: Keiichiro Tokunaga
    Signed-off-by: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     

05 Sep, 2005

1 commit

  • Add page_state info to the per-node meminfo file in sysfs. This is mostly
    just for informational purposes.

    The lack of this information was brought up recently during a discussion
    regarding pagecache clearing, and I put this patch together to test out one
    of the suggestions.

    It seems like interesting info to have, so I'm submitting the patch.

    Signed-off-by: Martin Hicks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Hicks
     

22 Jun, 2005

1 commit

  • This patch modifies the way pagesets in struct zone are managed.

    Each zone has a per-cpu array of pagesets. So any particular CPU has some
    memory in each zone structure which belongs to itself. Even if that CPU is
    not local to that zone.

    So the patch relocates the pagesets for each cpu to the node that is
    nearest to the cpu, instead of allocating the pagesets in the (possibly
    remote) target zone. This means that the operations to manage pages on
    a remote zone can be done with information available locally.

    We play a macro trick so that non-NUMA machines avoid the additional
    pointer chase on the page allocator fastpath.

    AIM7 benchmark on a 32 CPU SGI Altix

    w/o patches:
    Tasks   jobs/min  jti  jobs/min/task    real      cpu
        1     484.68  100       484.6769   12.01     1.97  Fri Mar 25 11:01:42 2005
      100   27140.46   89       271.4046   21.44   148.71  Fri Mar 25 11:02:04 2005
      200   30792.02   82       153.9601   37.80   296.72  Fri Mar 25 11:02:42 2005
      300   32209.27   81       107.3642   54.21   451.34  Fri Mar 25 11:03:37 2005
      400   34962.83   78        87.4071   66.59   588.97  Fri Mar 25 11:04:44 2005
      500   31676.92   75        63.3538   91.87   742.71  Fri Mar 25 11:06:16 2005
      600   36032.69   73        60.0545   96.91   885.44  Fri Mar 25 11:07:54 2005
      700   35540.43   77        50.7720  114.63  1024.28  Fri Mar 25 11:09:49 2005
      800   33906.70   74        42.3834  137.32  1181.65  Fri Mar 25 11:12:06 2005
      900   34120.67   73        37.9119  153.51  1325.26  Fri Mar 25 11:14:41 2005
     1000   34802.37   74        34.8024  167.23  1465.26  Fri Mar 25 11:17:28 2005

    with slab API changes and pageset patch:

    Tasks   jobs/min  jti  jobs/min/task    real      cpu
        1     485.00  100       485.0000   12.00     1.96  Fri Mar 25 11:46:18 2005
      100   28000.96   89       280.0096   20.79   150.45  Fri Mar 25 11:46:39 2005
      200   32285.80   79       161.4290   36.05   293.37  Fri Mar 25 11:47:16 2005
      300   40424.15   84       134.7472   43.19   438.42  Fri Mar 25 11:47:59 2005
      400   39155.01   79        97.8875   59.46   590.05  Fri Mar 25 11:48:59 2005
      500   37881.25   82        75.7625   76.82   730.19  Fri Mar 25 11:50:16 2005
      600   39083.14   78        65.1386   89.35   872.79  Fri Mar 25 11:51:46 2005
      700   38627.83   77        55.1826  105.47  1022.46  Fri Mar 25 11:53:32 2005
      800   39631.94   78        49.5399  117.48  1169.94  Fri Mar 25 11:55:30 2005
      900   36903.70   79        41.0041  141.94  1310.78  Fri Mar 25 11:57:53 2005
     1000   36201.23   77        36.2012  160.77  1458.31  Fri Mar 25 12:00:34 2005

    Signed-off-by: Christoph Lameter
    Signed-off-by: Shobhit Dayal
    Signed-off-by: Shai Fultheim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds