12 Feb, 2007

1 commit

  • The dirty ratio that governs writeback behavior is currently computed from
    the total number of pages on the system.

    However, not every page in the system can be dirtied. Thus the ratio is
    always too low and can never reach 100%. The ratio may be particularly skewed
    if large hugepage allocations, slab allocations or device driver buffers make
    large sections of memory unavailable. In that case we may get into a
    situation in which, for example, the background writeback ratio of 40% cannot
    be reached anymore, which leads to undesired writeback behavior.

    This patchset fixes that issue by determining the ratio based on the actual
    pages that may potentially be dirty. These are the pages on the active and
    the inactive list plus free pages.

    So far the problem with those counts has been that they are expensive to
    calculate, because counts from multiple nodes and multiple zones have to be
    summed up. This patchset makes these counters ZVC counters. This means that
    a current sum per zone, per node and for the whole system is always available
    via global variables and is no longer expensive to calculate.

    The patchset results in some other good side effects:

    - Removal of the various functions that sum up free, active and inactive
    page counts

    - Cleanup of the functions that display information via the proc filesystem.

    This patch:

    The use of a ZVC for nr_inactive and nr_active allows a simplification of some
    counter operations. More ZVC functionality is used for sums etc in the
    following patches.

    [akpm@osdl.org: UP build fix]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
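
The counting scheme described in this commit can be sketched in a few lines of C. This is a minimal userspace illustration, not the kernel's code: the names `mod_zone_stat`, `dirtyable_pages` and `dirty_threshold`, the enum values and `MAX_ZONES` are all hypothetical, and the per-CPU batching that real ZVC counters use is left out. It only shows the idea of keeping a global sum in step with the per-zone counts, so that "free + active + inactive" can be read without walking every zone.

```c
#include <assert.h>

/* Hypothetical counter indices; the kernel's real enum differs. */
enum stat_item { NR_FREE, NR_ACTIVE, NR_INACTIVE, NR_ITEMS };

#define MAX_ZONES 4  /* arbitrary size for the sketch */

struct zone_stats {
    long count[NR_ITEMS];
};

static struct zone_stats zones[MAX_ZONES];
static long global_count[NR_ITEMS];  /* kept in sync on every update */

/* Apply a delta to one zone and to the running global sum. */
static void mod_zone_stat(int zone, enum stat_item item, long delta)
{
    assert(zone >= 0 && zone < MAX_ZONES);
    zones[zone].count[item] += delta;
    global_count[item] += delta;
}

/* Pages that could become dirty: free + active + inactive. */
static long dirtyable_pages(void)
{
    return global_count[NR_FREE] + global_count[NR_ACTIVE] +
           global_count[NR_INACTIVE];
}

/* Threshold in pages for a given percentage of a base page count. */
static long dirty_threshold(long base_pages, int ratio_percent)
{
    return base_pages * ratio_percent / 100;
}
```

As an illustration with made-up numbers: on a system with 1000 pages of which only 600 are dirtyable, a 40% limit computed on the total is 400 pages, which is two thirds of everything that can actually be dirtied. Computed on the dirtyable count it is 240 pages.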
     

22 Mar, 2006

1 commit

  • In the page release paths, we can be sure that nobody will mess with our
    page->flags because the refcount has dropped to 0. So no need for atomic
    operations here.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
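
The reasoning in this commit can be modelled in userspace with C11 atomics. The sketch below is an assumption-laden illustration, not the kernel's page-flag code: `struct page`, the bit values and all function names are hypothetical. It contrasts an atomic read-modify-write on the flags word with a plain load and store, which is only safe once the caller is the sole owner of the page.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PG_LRU    (1UL << 0)  /* hypothetical bit positions */
#define PG_ACTIVE (1UL << 1)

struct page {
    atomic_int refcount;
    _Atomic unsigned long flags;
};

/* Atomic read-modify-write: safe while other holders may touch flags. */
static inline void clear_flag_atomic(struct page *p, unsigned long bits)
{
    atomic_fetch_and(&p->flags, ~bits);
}

/* Separate load and store, no atomic RMW: needs exclusive ownership. */
static inline void clear_flag_nonatomic(struct page *p, unsigned long bits)
{
    unsigned long f = atomic_load_explicit(&p->flags, memory_order_relaxed);
    atomic_store_explicit(&p->flags, f & ~bits, memory_order_relaxed);
}

/* Drop a reference; returns true if this was the last one. */
static inline bool put_page_sketch(struct page *p)
{
    assert(atomic_load(&p->refcount) > 0);
    if (atomic_fetch_sub(&p->refcount, 1) != 1)
        return false;
    /* The refcount is now 0, so nobody else can reach the page. */
    clear_flag_nonatomic(p, PG_LRU | PG_ACTIVE);
    return true;
}
```

The saving is that the release path avoids a locked instruction per flag update; the cost is that the cheaper variant must never be used while the page can still be reached by anyone else.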
     

19 Jan, 2006

1 commit

  • Migration code currently does not take a reference to target page
    properly, so between unlocking the pte and trying to take a new
    reference to the page with isolate_lru_page, anything could happen to
    it.

    Fix this by holding the pte lock until we get a chance to elevate the
    refcount.

    Other small cleanups while we're here.

    Signed-off-by: Nick Piggin
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
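
The ordering this fix relies on, taking the reference before dropping the lock that protects the mapping, can be shown with a small userspace model. Everything here is hypothetical: `struct pte_slot` and its spinlock merely stand in for a page table entry and the pte lock, and `lookup_and_pin` is not a kernel function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct page {
    atomic_int refcount;
};

struct pte_slot {
    atomic_flag lock;   /* stand-in for the pte lock */
    struct page *page;  /* mapping protected by the lock */
};

static inline void slot_lock(struct pte_slot *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;  /* spin until the lock is free */
}

static inline void slot_unlock(struct pte_slot *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Look up the page and pin it; the reference is taken before unlocking. */
static inline struct page *lookup_and_pin(struct pte_slot *s)
{
    struct page *p;

    slot_lock(s);
    p = s->page;
    if (p)
        atomic_fetch_add(&p->refcount, 1);  /* still under the lock */
    slot_unlock(s);
    return p;
}
```

If the increment were moved after `slot_unlock()`, another thread could unmap and free the page in the gap, which is the window the commit closes.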
     

09 Jan, 2006

1 commit

  • This is the start of the `swap migration' patch series.

    Swap migration allows the moving of the physical location of pages between
    nodes in a numa system while the process is running. This means that the
    virtual addresses that the process sees do not change. However, the system
    rearranges the physical location of those pages.

    The main intent of page migration patches here is to reduce the latency of
    memory access by moving pages near to the processor where the process
    accessing that memory is running.

    The patchset allows a process to manually change the node on which its
    pages are located through the MPOL_MF_MOVE and MPOL_MF_MOVE_ALL options
    while setting a new memory policy.

    The pages of a process can also be relocated from another process using the
    sys_migrate_pages() function call. Requires CAP_SYS_ADMIN. The migrate_pages
    function call takes two sets of nodes and moves pages of a process that are
    located on the from nodes to the destination nodes.

    Manual migration is very useful if for example the scheduler has relocated a
    process to a processor on a distant node. A batch scheduler or an
    administrator can detect the situation and move the pages of the process
    nearer to the new processor.

    sys_migrate_pages() could be used on non-numa machines as well, to force all
    of a particular process's pages out to swap, if someone thinks that's useful.

    Larger installations usually partition the system using cpusets into sections
    of nodes. Paul has equipped cpusets with the ability to move pages when a
    task is moved to another cpuset. This allows automatic control over locality
    of a process. If a task is moved to a new cpuset then all its pages are also
    moved with it so that the performance of the process does not sink
    dramatically (as is the case today).

    Swap migration works by simply evicting the page. The pages must be faulted
    back in. The pages are then typically reallocated by the system near the node
    where the process is executing.

    For swap migration the destination of the move is controlled by the allocation
    policy. Cpusets set the allocation policy before calling sys_migrate_pages()
    in order to move the pages as intended.

    No allocation policy changes are performed for sys_migrate_pages(). This
    means that the pages may not be faulted in to the specified nodes if no
    allocation policy was set by other means. The pages will just end up near the
    node where the fault occurred.

    There's another patch series in the pipeline which implements "direct
    migration".

    The direct migration patchset extends the migration functionality to avoid
    going through swap. The destination node of the relocation is controllable
    during the actual moving of pages. The crutch of using the allocation policy
    to relocate is not necessary and the pages are moved directly to the target.
    It's also faster since swap is not used.

    And sys_migrate_pages() can then move pages directly to the specified node.
    Implement functions to isolate pages from the LRU and put them back later.

    This patch:

    An earlier implementation was provided by Hirokazu Takahashi and IWAMOTO
    Toshihiro for the memory hotplug project.

    From: Magnus

    This breaks out isolate_lru_page() and putback_lru_page(). Needed for swap
    migration.

    Signed-off-by: Magnus Damm
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
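
The two LRU operations this patch breaks out can be pictured with a toy list. The sketch below is a userspace model under stated assumptions, not the kernel's isolate_lru_page() or putback_lru_page(): it has no locking, no reference counting and a single list, and the names `isolate_page`, `putback_page`, `lru_add` and `lru_len` are hypothetical. It only shows the contract: an isolated page is off the list and can be processed safely, and is put back afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page {
    struct page *prev, *next;
    bool on_lru;
};

/* Circular doubly linked list with a sentinel head. */
struct lru_list {
    struct page head;
};

static inline void lru_init(struct lru_list *l)
{
    l->head.prev = l->head.next = &l->head;
    l->head.on_lru = false;
}

/* Insert a page at the front of the list. */
static inline void lru_add(struct lru_list *l, struct page *p)
{
    assert(!p->on_lru);
    p->next = l->head.next;
    p->prev = &l->head;
    l->head.next->prev = p;
    l->head.next = p;
    p->on_lru = true;
}

/* Take a page off the list; returns false if it was not on it. */
static inline bool isolate_page(struct page *p)
{
    if (!p->on_lru)
        return false;
    p->prev->next = p->next;
    p->next->prev = p->prev;
    p->prev = p->next = NULL;
    p->on_lru = false;
    return true;
}

/* Return a previously isolated page to the list. */
static inline void putback_page(struct lru_list *l, struct page *p)
{
    lru_add(l, p);
}

/* Count the pages currently on the list. */
static inline size_t lru_len(const struct lru_list *l)
{
    size_t n = 0;
    for (const struct page *p = l->head.next; p != &l->head; p = p->next)
        n++;
    return n;
}
```

Returning a boolean from `isolate_page` lets a caller skip pages that someone else has already taken off the list, which is the situation a migration pass has to tolerate.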
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds