30 Apr, 2013

1 commit

  • Currently page_action() does not check the dirty flag to determine
    whether the error page is a "clean mlocked/unevictable LRU" page. This
    causes no misjudgement, because we match against "dirty
    mlocked/unevictable LRU" just before that check. But to keep the code
    consistent and to avoid potential regressions, we had better check the
    dirty flag explicitly.
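
    For reference, a sketch of the error_states[] matching this refers to
    (struct layout as in mm/memory-failure.c of that era; dirty/mlock are
    that file's shorthands for (1UL << PG_dirty) and (1UL << PG_mlocked);
    the table is abbreviated and illustrative):

        /* an entry matches when (page->flags & mask) == res */
        static struct page_state {
                unsigned long mask;
                unsigned long res;
                char *msg;
                int (*action)(struct page *p, unsigned long pfn);
        } error_states[] = {
                { mlock|dirty, mlock|dirty, "dirty mlocked LRU page", me_pagecache_dirty },
                /* with this change the "clean" entry also inspects
                 * PG_dirty instead of relying on table order alone: */
                { mlock|dirty, mlock,       "clean mlocked LRU page", me_pagecache_clean },
        };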

    Signed-off-by: Naoya Horiguchi
    Suggested-by: Chen Gong
    Cc: Andi Kleen
    Cc: Tony Luck
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

24 Feb, 2013

8 commits

  • error_states[] has two separate states, "unevictable LRU page" and
    "mlocked LRU page", and the former currently has the higher priority.
    Because of that, the latter is rarely chosen, since pages with
    PG_mlocked very likely have PG_unevictable set as well. On the other
    hand, PG_unevictable without PG_mlocked is common for ramfs or
    SHM_LOCKed shared memory, so reversing the priority of these two states
    helps us clearly distinguish them.
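
    Since error_states[] is scanned front to back and the first matching
    entry wins, "priority" here is simply array order; after this change
    the mlocked entries sit in front of the unevictable ones (sketch,
    reusing the shorthand flag macros from mm/memory-failure.c):

        { mlock|dirty,   mlock|dirty,   "dirty mlocked LRU page",     me_pagecache_dirty },
        { mlock,         mlock,         "clean mlocked LRU page",     me_pagecache_clean },
        { unevict|dirty, unevict|dirty, "dirty unevictable LRU page", me_pagecache_dirty },
        { unevict,       unevict,       "clean unevictable LRU page", me_pagecache_clean },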

    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Chen Gong
    Cc: Tony Luck
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • memory_failure() can't handle memory errors on mlocked pages correctly,
    because page_action() judges such errors as ones on "unknown pages"
    instead of on an "unevictable LRU page" or a "mlocked LRU page". To
    determine the page_state, page_action() checks the page flags at the
    time of the judgement, but those flags are no longer the same as they
    were just after memory_failure() was called, because memory_failure()
    unmaps the error page before calling page_action(). This unmapping
    changes the page state; in particular, page_remove_rmap() (called from
    try_to_unmap_one()) clears PG_mlocked, so page_action() can't catch
    mlocked pages after that.

    With this patch, we store the page flags of the error page before
    unmapping, and (only) if the first check, using the flags at the time
    of the judgement, classified the error page as unknown, we do a second
    check against the stored flags. This implementation doesn't change
    error handling for page types whose state the first check can already
    determine correctly.
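
    A condensed sketch of the resulting two-pass classification (based on
    the memory_failure() flow of that era, heavily simplified; the
    terminating error_states entry has mask == 0, so both loops always
    stop):

        unsigned long page_flags = p->flags;    /* snapshot before unmap */

        hwpoison_user_mappings(p, pfn, trapno, flags); /* may clear PG_mlocked */

        for (ps = error_states;; ps++)          /* first pass: live flags */
                if ((p->flags & ps->mask) == ps->res)
                        break;
        if (!ps->mask)                          /* matched "unknown page": */
                for (ps = error_states;; ps++)  /* retry with the snapshot */
                        if ((page_flags & ps->mask) == ps->res)
                                break;
        res = page_action(ps, p, pfn);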

    [akpm@linux-foundation.org: tweak comments]
    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Tony Luck
    Cc: Chen Gong
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • No functional change, but the only purpose of the offlining argument to
    migrate_pages() etc. was to ensure that __unmap_and_move() could migrate
    a KSM page for memory hot-remove (which took ksm_thread_mutex) but not
    for other callers. Now all cases are safe, so remove the argument.

    Signed-off-by: Hugh Dickins
    Cc: Rik van Riel
    Cc: Petr Holasek
    Cc: Andrea Arcangeli
    Cc: Izik Eidus
    Cc: Gerald Schaefer
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • num_poisoned_pages counts the number of pages isolated by memory
    errors. But for thp, only one subpage is isolated because the memory
    error handler splits it, so it's wrong to add (1 << compound_trans_order).
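
    The accounting change amounts to roughly this (illustrative):

        /* before: counted every subpage of the thp */
        atomic_long_add(1 << compound_trans_order(hpage), &num_poisoned_pages);

        /* after: the handler splits the thp and isolates a single 4kB
         * subpage, so only one page is counted */
        atomic_long_inc(&num_poisoned_pages);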

    [akpm@linux-foundation.org: tweak comment]
    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Tony Luck
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • Currently soft_offline_page() is hard to maintain because it has many
    return points and goto statements. All of this mess comes from
    get_any_page().

    This function should only get the page refcount, as the name implies,
    but it also performs page-isolating actions such as SetPageHWPoison()
    and dequeuing a hugepage. This patch corrects that and introduces some
    internal subroutines to make the soft-offlining code more readable and
    maintainable.
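
    The reworked top level looks roughly like this (__soft_offline_page()
    is one of the new subroutines; hugepage handling and error paths are
    elided):

        int soft_offline_page(struct page *page, int flags)
        {
                int ret = get_any_page(page, page_to_pfn(page), flags);

                if (ret < 0)            /* could not pin the page */
                        return ret;
                if (ret > 0)            /* in-use page, refcount held */
                        return __soft_offline_page(page, flags);

                /* free page: poison it directly, no migration needed */
                SetPageHWPoison(page);
                atomic_long_inc(&num_poisoned_pages);
                return 0;
        }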

    Signed-off-by: Naoya Horiguchi
    Reviewed-by: Andi Kleen
    Cc: Tony Luck
    Cc: Wu Fengguang
    Cc: Xishi Qiu
    Cc: Jiang Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • Since MCE is an x86 concept, and this code is in mm/, it would be better
    to use the name num_poisoned_pages instead of mce_bad_pages.

    [akpm@linux-foundation.org: fix mm/sparse.c]
    Signed-off-by: Xishi Qiu
    Signed-off-by: Jiang Liu
    Suggested-by: Borislav Petkov
    Reviewed-by: Wanpeng Li
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xishi Qiu
     
  • There are too many return points randomly intermingled with some "goto
    done" return points. So adjust the function structure: one path for
    success, the other for failure. Use atomic_long_inc instead of
    atomic_long_add.

    Signed-off-by: Xishi Qiu
    Signed-off-by: Jiang Liu
    Suggested-by: Andrew Morton
    Cc: Borislav Petkov
    Cc: Wanpeng Li
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xishi Qiu
     
  • When doing

    $ echo paddr > /sys/devices/system/memory/soft_offline_page

    to offline a *free* page, the value of mce_bad_pages is incremented,
    and the page gets the HWPoison flag set, but it is still managed by the
    buddy page allocator.

    $ cat /proc/meminfo | grep HardwareCorrupted

    shows the value.

    If we offline the same page, the value of mce_bad_pages is incremented
    *again*, which means the value is now incorrect. Assume the page is
    still free during this short window:

    soft_offline_page()
      get_any_page()
        "else if (is_free_buddy_page(p))" branch returns 0
      "goto done";
      "atomic_long_add(1, &mce_bad_pages);"

    This patch:

    Move the poisoned-page check to the beginning of the function in order
    to fix the error.
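
    The fix amounts to an early bail-out (sketch; the exact message text
    may differ):

        /* now done first thing in soft_offline_page(), before the page
         * can be counted a second time: */
        if (PageHWPoison(page)) {
                pr_info("soft offline: %#lx page already poisoned\n", pfn);
                return -EBUSY;
        }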

    Signed-off-by: Xishi Qiu
    Signed-off-by: Jiang Liu
    Tested-by: Naoya Horiguchi
    Cc: Borislav Petkov
    Cc: Wanpeng Li
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xishi Qiu
     

17 Dec, 2012

1 commit

  • Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
    "There are three implementations for NUMA balancing, this tree
    (balancenuma), numacore which has been developed in tip/master and
    autonuma which is in aa.git.

    In almost all respects balancenuma is the dumbest of the three because
    its main impact is on the VM side with no attempt to be smart about
    scheduling. In the interest of getting the ball rolling, it would be
    desirable to see this much merged for 3.8 with the view to building
    scheduler smarts on top and adapting the VM where required for 3.9.

    The most recent set of comparisons available from different people are

    mel: https://lkml.org/lkml/2012/12/9/108
    mingo: https://lkml.org/lkml/2012/12/7/331
    tglx: https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

    The results are a mixed bag. In my own tests, balancenuma does
    reasonably well. It's dumb as rocks and does not regress against
    mainline. On the other hand, Ingo's tests shows that balancenuma is
    incapable of converging for the workloads driven by perf, which is bad
    but is potentially explained by the lack of scheduler smarts. Thomas'
    results show balancenuma improves on mainline but falls far short of
    numacore or autonuma. Srikar's results indicate we all suffer on a
    large machine with imbalanced node sizes.

    My own testing showed that recent numacore results have improved
    dramatically, particularly in the last week but not universally.
    We've butted heads heavily on system CPU usage and high levels of
    migration even when it shows that overall performance is better.
    There are also cases where it regresses. Of interest is that for
    specjbb in some configurations it will regress for lower numbers of
    warehouses and show gains for higher numbers, which is not reported by
    the tool by default and is sometimes missed in reports. Recently I
    reported for numacore that the JVM was crashing with
    NullPointerExceptions but currently it's unclear what the source of
    this problem is. Initially I thought it was in how numacore batch
    handles PTEs, but I no longer think this is the case. It's possible
    numacore is just able to trigger it due to higher rates of migration.

    These reports were quite late in the cycle so I/we would like to start
    with this tree as it contains much of the code we can agree on and has
    not changed significantly over the last 2-3 weeks."

    * tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
    mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
    mm/rmap: Convert the struct anon_vma::mutex to an rwsem
    mm: migrate: Account a transhuge page properly when rate limiting
    mm: numa: Account for failed allocations and isolations as migration failures
    mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
    mm: numa: Add THP migration for the NUMA working set scanning fault case.
    mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
    mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
    mm: sched: numa: Control enabling and disabling of NUMA balancing
    mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
    mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely tasknode relationships
    mm: numa: migrate: Set last_nid on newly allocated page
    mm: numa: split_huge_page: Transfer last_nid on tail page
    mm: numa: Introduce last_nid to the page frame
    sched: numa: Slowly increase the scanning period as NUMA faults are handled
    mm: numa: Rate limit setting of pte_numa if node is saturated
    mm: numa: Rate limit the amount of memory that is migrated between nodes
    mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
    mm: numa: Migrate pages handled during a pmd_numa hinting fault
    mm: numa: Migrate on reference policy
    ...

    Linus Torvalds
     

12 Dec, 2012

2 commits

  • action_result() fails to print out "dirty" even if an error occurred on
    a dirty pagecache page, because by the time we check PageDirty in
    action_result() the flag has already been cleared by page isolation,
    even if the page was dirty before error handling. This can break
    applications that monitor this message, so it should be fixed.

    There are several callers of action_result() besides page_action(), but
    none of them are for LRU pages; they are for free pages or kernel
    pages, so we don't have to consider dirtiness for them.

    Note that PG_dirty can be set outside page locks, as described in commit
    6746aff74da2 ("HWPOISON: shmem: call set_page_dirty() with locked
    page"), so this patch does not completely close the race window, but
    just narrows it.

    Signed-off-by: Naoya Horiguchi
    Reviewed-by: Andi Kleen
    Cc: Tony Luck
    Cc: "Jun'ichi Nomura"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • The HWPoisoned flag may be set when we offline a page via the sysfs
    interface /sys/devices/system/memory/soft_offline_page or
    /sys/devices/system/memory/hard_offline_page. If we don't clear this
    flag when onlining pages, such a page can't be freed and will not be in
    the free list, so we can't offline it again. We should therefore skip
    HWPoisoned pages when offlining.

    Signed-off-by: Wen Congyang
    Cc: David Rientjes
    Cc: Jiang Liu
    Cc: Len Brown
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Christoph Lameter
    Cc: Minchan Kim
    Cc: KOSAKI Motohiro
    Cc: Yasuaki Ishimatsu
    Cc: Andi Kleen
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wen Congyang
     

11 Dec, 2012

2 commits

  • rmap_walk_anon() and try_to_unmap_anon() appear to be too
    careful about locking the anon vma: while they need protection
    against anon vma list modifications, they do not need exclusive
    access to the list itself.

    Transforming this exclusive lock to a read-locked rwsem removes
    a global lock from the hot path of page-migration intense
    threaded workloads which can cause pathological performance like
    this:

    96.43% process 0 [kernel.kallsyms] [k] perf_trace_sched_switch
    |
    --- perf_trace_sched_switch
    __schedule
    schedule
    schedule_preempt_disabled
    __mutex_lock_common.isra.6
    __mutex_lock_slowpath
    mutex_lock
    |
    |--50.61%-- rmap_walk
    | move_to_new_page
    | migrate_pages
    | migrate_misplaced_page
    | __do_numa_page.isra.69
    | handle_pte_fault
    | handle_mm_fault
    | __do_page_fault
    | do_page_fault
    | page_fault
    | __memset_sse2
    | |
    | --100.00%-- worker_thread
    | |
    | --100.00%-- start_thread
    |
    --49.39%-- page_lock_anon_vma
    try_to_unmap_anon
    try_to_unmap
    migrate_pages
    migrate_misplaced_page
    __do_numa_page.isra.69
    handle_pte_fault
    handle_mm_fault
    __do_page_fault
    do_page_fault
    page_fault
    __memset_sse2
    |
    --100.00%-- worker_thread
    start_thread

    With this change applied the profile is now nicely flat
    and there's no anon-vma related scheduling/blocking.

    Rename anon_vma_[un]lock() => anon_vma_[un]lock_write(),
    to make it clearer that it's an exclusive write-lock in
    that case - suggested by Rik van Riel.
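
    In code terms (helper names as introduced by these patches; the
    anon_vma's struct mutex becomes a struct rw_semaphore):

        /* writers, i.e. anyone modifying the anon vma lists: */
        static inline void anon_vma_lock_write(struct anon_vma *anon_vma)
        {
                down_write(&anon_vma->root->rwsem);
        }

        /* rmap walkers, who only need the lists to stay stable: */
        static inline void anon_vma_lock_read(struct anon_vma *anon_vma)
        {
                down_read(&anon_vma->root->rwsem);
        }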

    Suggested-by: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Paul Turner
    Cc: Lee Schermerhorn
    Cc: Christoph Lameter
    Cc: Rik van Riel
    Cc: Mel Gorman
    Cc: Andrea Arcangeli
    Cc: Johannes Weiner
    Cc: Hugh Dickins
    Signed-off-by: Ingo Molnar
    Signed-off-by: Mel Gorman

    Ingo Molnar
     
  • The pgmigrate_success and pgmigrate_fail vmstat counters tell the user
    about migration activity but not the type or the reason. This patch adds
    a tracepoint to identify the type of page migration and why the page is
    being migrated.
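
    The tracepoint and the reason codes look roughly like this (as added to
    include/trace/events/migrate.h and include/linux/migrate.h):

        /* emitted once per migrate_pages() call: */
        trace_mm_migrate_pages(nr_succeeded, nr_failed, mode, reason);

        enum migrate_reason {
                MR_COMPACTION,
                MR_MEMORY_FAILURE,
                MR_MEMORY_HOTPLUG,
                MR_SYSCALL,             /* also applies to cpusets */
                MR_MEMPOLICY_MBIND,
                MR_NUMA_MISPLACED,
                MR_CMA
        };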

    Signed-off-by: Mel Gorman
    Reviewed-by: Rik van Riel

    Mel Gorman
     

01 Dec, 2012

1 commit

  • When we try to soft-offline a thp tail page, put_page() is called on the
    tail page unthinkingly and VM_BUG_ON is triggered in put_compound_page().

    This patch splits thp before going into the main body of soft-offlining.
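
    The split happens at the start of soft_offline_page() (sketch; error
    message abbreviated):

        if (PageTransHuge(hpage)) {
                if (split_huge_page(hpage)) {
                        pr_info("soft offline: %#lx: failed to split THP\n",
                                pfn);
                        return -EBUSY;
                }
                /* from here on we operate on a 4kB page */
        }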

    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Tony Luck
    Cc: Wu Fengguang
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

09 Oct, 2012

2 commits

  • When a large VMA (anon or private file mapping) is first touched, which
    will populate its anon_vma field, and then split into many regions through
    the use of mprotect(), the original anon_vma ends up linking all of the
    vmas on a linked list. This can cause rmap to become inefficient, as we
    have to walk potentially thousands of irrelevant vmas before finding the
    one a given anon page might fall into.

    By replacing the same_anon_vma linked list with an interval tree (where
    each avc's interval is determined by its vma's start and last pgoffs), we
    can make rmap efficient for this use case again.

    While the change is large, all of its pieces are fairly simple.

    Most places that were walking the same_anon_vma list were looking for a
    known pgoff, so they can just use the anon_vma_interval_tree_foreach()
    interval tree iterator instead. The exception here is ksm, where the
    page's index is not known. It would probably be possible to rework ksm so
    that the index would be known, but for now I have decided to keep things
    simple and just walk the entirety of the interval tree there.

    When updating vma's that already have an anon_vma assigned, we must take
    care to re-index the corresponding avc's on their interval tree. This is
    done through the use of anon_vma_interval_tree_pre_update_vma() and
    anon_vma_interval_tree_post_update_vma(), which remove the avc's from
    their interval tree before the update and re-insert them after the update.
    The anon_vma stays locked during the update, so there is no chance that
    rmap would miss the vmas that are being updated.
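
    A typical rmap walk then becomes (pattern as in mm/rmap.c after this
    change):

        pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
        struct anon_vma_chain *avc;

        /* visits only vmas whose interval can contain pgoff, instead of
         * every vma on the old same_anon_vma list */
        anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
                struct vm_area_struct *vma = avc->vma;
                unsigned long address = vma_address(page, vma);
                /* ... handle one relevant mapping ... */
        }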

    Signed-off-by: Michel Lespinasse
    Cc: Andrea Arcangeli
    Cc: Rik van Riel
    Cc: Peter Zijlstra
    Cc: Daniel Santos
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • Implement an interval tree as a replacement for the VMA prio_tree. The
    algorithms are similar to lib/interval_tree.c; however that code can't be
    directly reused as the interval endpoints are not explicitly stored in the
    VMA. So instead, the common algorithm is moved into a template and the
    details (node type, how to get interval endpoints from the node, etc) are
    filled in using the C preprocessor.

    Once the interval tree functions are available, using them as a
    replacement to the VMA prio tree is a relatively simple, mechanical job.
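
    Instantiating the template for vmas then looks roughly like this
    (following mm/interval_tree.c; the arguments are node type, rb-node
    member, endpoint type, cached subtree-last member, start/last
    accessors, storage class, and function-name prefix):

        #define vma_start_pgoff(v) ((v)->vm_pgoff)
        #define vma_last_pgoff(v)  ((v)->vm_pgoff + \
                (((v)->vm_end - (v)->vm_start) >> PAGE_SHIFT) - 1)

        INTERVAL_TREE_DEFINE(struct vm_area_struct, shared.linear.rb,
                             unsigned long, shared.linear.rb_subtree_last,
                             vma_start_pgoff, vma_last_pgoff, /* not static */,
                             vma_interval_tree)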

    Signed-off-by: Michel Lespinasse
    Cc: Rik van Riel
    Cc: Hillf Danton
    Cc: Peter Zijlstra
    Cc: Catalin Marinas
    Cc: Andrea Arcangeli
    Cc: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     

01 Aug, 2012

2 commits

  • Sanity:

    CONFIG_CGROUP_MEM_RES_CTLR -> CONFIG_MEMCG
    CONFIG_CGROUP_MEM_RES_CTLR_SWAP -> CONFIG_MEMCG_SWAP
    CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -> CONFIG_MEMCG_SWAP_ENABLED
    CONFIG_CGROUP_MEM_RES_CTLR_KMEM -> CONFIG_MEMCG_KMEM

    [mhocko@suse.cz: fix missed bits]
    Cc: Glauber Costa
    Acked-by: Michal Hocko
    Cc: Johannes Weiner
    Cc: KAMEZAWA Hiroyuki
    Cc: Hugh Dickins
    Cc: Tejun Heo
    Cc: Aneesh Kumar K.V
    Cc: David Rientjes
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Since we migrate only one hugepage, don't use a linked list for passing
    the page around. Directly pass the page that needs to be migrated as an
    argument. This also removes the use of page->lru in the migrate path.

    Signed-off-by: Aneesh Kumar K.V
    Reviewed-by: KAMEZAWA Hiroyuki
    Cc: David Rientjes
    Cc: Hillf Danton
    Reviewed-by: Michal Hocko
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar K.V
     

31 Jul, 2012

1 commit

  • Commit a6bc32b89922 ("mm: compaction: introduce sync-light migration for
    use by compaction") changed the declaration of migrate_pages() and
    migrate_huge_pages().

    But it missed changing the argument of migrate_huge_pages() in
    soft_offline_huge_page(). In this case, we should call
    migrate_huge_pages() with MIGRATE_SYNC.

    Additionally, there is a mismatch between the type of the argument and
    the function declaration for migrate_pages().

    Signed-off-by: Joonsoo Kim
    Cc: Christoph Lameter
    Cc: Mel Gorman
    Acked-by: David Rientjes
    Cc: "Aneesh Kumar K.V"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     

12 Jul, 2012

1 commit

  • In commit dad1743e5993f1 ("x86/mce: Only restart instruction after machine
    check recovery if it is safe") we fixed mce_notify_process() to force a
    signal to the current process if it was not restartable (RIPV bit not
    set in MCG_STATUS). But doing it here means that the process doesn't
    get told the virtual address of the fault via siginfo_t->si_addr. This
    would prevent application level recovery from the fault.

    Make a new MF_MUST_KILL flag bit for memory_failure() et al. to use so
    that we will provide the right information with the signal.
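
    In the machine check handler this becomes roughly (simplified;
    "restartable" stands for the RIPV bit being set in MCG_STATUS):

        int flags = MF_ACTION_REQUIRED;

        /* the signal must still kill, but routing it through
         * memory_failure() fills in siginfo_t->si_addr */
        if (!restartable)
                flags |= MF_MUST_KILL;
        memory_failure(pfn, MCE_VECTOR, flags);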

    Signed-off-by: Tony Luck
    Acked-by: Borislav Petkov
    Cc: stable@kernel.org # 3.4+

    Tony Luck
     

30 May, 2012

1 commit


21 May, 2012

1 commit

  • This commit changes various functions that move pages and pageblocks
    between the MIGRATE_ISOLATE and MIGRATE_MOVABLE migrate types so that
    they can also work with the MIGRATE_CMA migrate type.

    Signed-off-by: Michal Nazarewicz
    Signed-off-by: Marek Szyprowski
    Reviewed-by: KAMEZAWA Hiroyuki
    Tested-by: Rob Clark
    Tested-by: Ohad Ben-Cohen
    Tested-by: Benjamin Gaignard
    Tested-by: Robert Nelson
    Tested-by: Barry Song

    Michal Nazarewicz
     

23 Mar, 2012

1 commit

  • Pull MCE changes from Ingo Molnar.

    * 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mce: Fix return value of mce_chrdev_read() when erst is disabled
    x86/mce: Convert static array of pointers to per-cpu variables
    x86/mce: Replace hard coded hex constants with symbolic defines
    x86/mce: Recognise machine check bank signature for data path error
    x86/mce: Handle "action required" errors
    x86/mce: Add mechanism to safely save information in MCE handler
    x86/mce: Create helper function to save addr/misc when needed
    HWPOISON: Add code to handle "action required" errors.
    HWPOISON: Clean up memory_failure() vs. __memory_failure()

    Linus Torvalds
     

22 Mar, 2012

1 commit

  • Andrea Arcangeli pointed out to me that a check in __memory_failure()
    which was intended to prevent THP tail pages from being checked for the
    absence of the PG_lru flag (something that is always the case), was also
    preventing THP head pages from being checked.

    A THP head page could actually benefit from the call to shake_page() by
    ending up being put back on an LRU, provided it had been waiting in a
    pagevec array.

    Andrea suggested that the "!PageTransCompound(p)" in the if-statement
    should be replaced by a "!PageTransTail(p)", thus allowing THP head pages
    to be checked and possibly shaken.
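
    In diff terms the fix is essentially (surrounding context simplified):

        -       if (!PageTransCompound(p))
        +       if (!PageTransTail(p))
                        shake_page(p, 0);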

    Signed-off-by: Dean Nelson
    Cc: Jin Dongming
    Reviewed-by: Andrea Arcangeli
    Cc: Andi Kleen
    Cc: Hidetoshi Seto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dean Nelson
     

13 Jan, 2012

1 commit

  • This patch adds a lightweight sync migrate mode, MIGRATE_SYNC_LIGHT,
    that avoids writing back pages to backing storage. Async compaction
    maps to MIGRATE_ASYNC while sync compaction maps to MIGRATE_SYNC_LIGHT.
    For other migrate_pages users such as memory hotplug, MIGRATE_SYNC is
    used.

    This avoids sync compaction stalling for an excessive length of time,
    particularly when copying files to a USB stick where there might be a
    large number of dirty pages backed by a filesystem that does not support
    ->writepages.
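
    The resulting mode set (as in include/linux/migrate_mode.h):

        enum migrate_mode {
                MIGRATE_ASYNC,          /* never block */
                MIGRATE_SYNC_LIGHT,     /* may block, but no ->writepage */
                MIGRATE_SYNC,           /* may block, including on writeback */
        };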

    [aarcange@redhat.com: This patch is heavily based on Andrea's work]
    [akpm@linux-foundation.org: fix fs/nfs/write.c build]
    [akpm@linux-foundation.org: fix fs/btrfs/disk-io.c build]
    Signed-off-by: Mel Gorman
    Reviewed-by: Rik van Riel
    Cc: Andrea Arcangeli
    Cc: Minchan Kim
    Cc: Dave Jones
    Cc: Jan Kara
    Cc: Andy Isaacson
    Cc: Nai Xia
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

04 Jan, 2012

2 commits

  • Add a new flag bit "MF_ACTION_REQUIRED" to be used by machine check
    code to force a signal with si_code = BUS_MCEERR_AR in the case
    where the error occurs in processor execution context. Pass the
    flags argument along the call chain:

        memory_failure()
          hwpoison_user_mappings()
            kill_procs()
              kill_proc()

    Drop the "_ao" suffix from kill_procs_ao() and kill_proc_ao() since
    they can now handle "action required" as well as "action optional" errors.
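
    At the bottom of that chain the flag selects the si_code (sketch;
    details simplified):

        /* in kill_proc(): */
        si.si_signo = SIGBUS;
        si.si_addr  = (void *)addr;
        if (flags & MF_ACTION_REQUIRED) {
                si.si_code = BUS_MCEERR_AR;     /* error in execution context */
                ret = force_sig_info(SIGBUS, &si, current);
        } else {
                si.si_code = BUS_MCEERR_AO;     /* action optional */
                ret = send_sig_info(SIGBUS, &si, t);
        }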

    Acked-by: Borislav Petkov
    Signed-off-by: Tony Luck

    Tony Luck
     
  • There is only one caller of memory_failure(); all other users call
    __memory_failure() and pass in the flags argument explicitly. The
    lone user of memory_failure() will soon need to pass flags too.

    Add the flags argument to the callsite in mce.c. Delete the old
    memory_failure() function, and then rename __memory_failure() by
    dropping the leading "__".

    Provide a clearer message when action-optional memory errors are
    ignored.
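
    After the cleanup there is a single entry point (signature as of this
    series):

        int memory_failure(unsigned long pfn, int trapno, int flags);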

    Acked-by: Borislav Petkov
    Signed-off-by: Tony Luck

    Tony Luck
     

07 Nov, 2011

1 commit

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     

01 Nov, 2011

1 commit

  • Commit fb46e73520940b ("HWPOISON: Convert pr_debugs to pr_info")
    authored by Andi Kleen converted a number of pr_debug()s to pr_info()s.

    About the same time additional code with pr_debug()s was added by two
    other commits 8c6c2ecb4466 ("HWPOSION, hugetlb: recover from free hugepage
    error when !MF_COUNT_INCREASED") and d950b95882f3d ("HWPOISON, hugetlb:
    soft offlining for hugepage"). And these pr_debug()s failed to get
    converted to pr_info()s.

    This patch converts them as well, and does some minor related
    whitespace cleanup.

    Signed-off-by: Dean Nelson
    Reviewed-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dean Nelson
     

03 Aug, 2011

1 commit

  • memory_failure() is the entry point for HWPoison memory error
    recovery. It must be called in process context, but hardware memory
    errors are commonly notified via MCE or NMI, so some delayed execution
    mechanism must be used. In the MCE handler, a work queue + ring
    buffer mechanism is used.

    In addition to MCE, APEI (ACPI Platform Error Interface) GHES
    (Generic Hardware Error Source) can now be used to report memory errors
    too. To add memory recovery support to APEI GHES, a mechanism similar
    to that of MCE is implemented. memory_failure_queue() is the new
    entry point that can be called in IRQ context. The next step is to
    make the MCE handler use this interface too.
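
    The deferred path looks roughly like this (simplified from the patch;
    drain_ring() is an illustrative stand-in for the per-cpu ring buffer
    handling):

        struct memory_failure_entry {
                unsigned long pfn;
                int trapno;
                int flags;
        };

        /* callable in IRQ context: record the error, kick a work item */
        void memory_failure_queue(unsigned long pfn, int trapno, int flags);

        /* the work item later runs in process context and drains the
         * ring buffer into memory_failure(): */
        static void memory_failure_work_func(struct work_struct *work)
        {
                struct memory_failure_entry entry;

                while (drain_ring(&entry))
                        memory_failure(entry.pfn, entry.trapno, entry.flags);
        }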

    Signed-off-by: Huang Ying
    Cc: Andi Kleen
    Cc: Wu Fengguang
    Cc: Andrew Morton
    Signed-off-by: Len Brown

    Huang Ying
     

16 Jun, 2011

1 commit

  • Pages isolated for migration are accounted with the vmstat counters
    NR_ISOLATE_[ANON|FILE]. Callers of migrate_pages() are expected to
    increment these counters when pages are isolated from the LRU. Once the
    pages have been migrated, they are put back on the LRU or freed and the
    isolated count is decremented.

    Memory failure is not properly accounting for pages it isolates,
    causing the NR_ISOLATED counters to go negative. On SMP builds, this goes
    unnoticed as negative counters are treated as 0 due to expected per-cpu
    drift. On UP builds, the counter is treated by too_many_isolated() as a
    large value causing processes to enter D state during page reclaim or
    compaction. This patch accounts for pages isolated by memory failure
    correctly.
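
    The missing accounting is the usual isolation bookkeeping (API of that
    era):

        /* when memory failure isolates a page for migration: */
        inc_zone_page_state(page, NR_ISOLATED_ANON +
                                  page_is_file_cache(page));

        /* migrate_pages()/putback_lru_pages() decrement the counter
         * again once the page is migrated or put back on the LRU */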

    [mel@csn.ul.ie: rewrote changelog]
    Reviewed-by: Andrea Arcangeli
    Signed-off-by: Minchan Kim
    Cc: Andi Kleen
    Acked-by: Mel Gorman
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     

25 May, 2011

4 commits

  • Change each shrinker's API by consolidating the existing parameters
    into a shrink_control struct. This will simplify adding further
    features without touching every shrinker.
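
    The consolidated struct is small (as merged; more fields can be added
    later without changing every shrinker again):

        struct shrink_control {
                gfp_t gfp_mask;
                /* how many objects shrink_slab() asks the shrinker to
                 * scan and try to reclaim */
                unsigned long nr_to_scan;
        };

        /* the callback takes the struct instead of loose arguments: */
        int (*shrink)(struct shrinker *, struct shrink_control *sc);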

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: fix warning]
    [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
    [akpm@linux-foundation.org: fix xfs warning]
    [akpm@linux-foundation.org: update gfs2]
    Signed-off-by: Ying Han
    Cc: KOSAKI Motohiro
    Cc: Minchan Kim
    Acked-by: Pavel Emelyanov
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Acked-by: Rik van Riel
    Cc: Johannes Weiner
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Steven Whitehouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ying Han
     
  • Consolidate the existing parameters to shrink_slab() into a new
    shrink_control struct. This is needed later to pass the same struct to
    shrinkers.

    Signed-off-by: Ying Han
    Cc: KOSAKI Motohiro
    Cc: Minchan Kim
    Acked-by: Pavel Emelyanov
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Acked-by: Rik van Riel
    Cc: Johannes Weiner
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ying Han
     
  • Drop the first page reference only after calling isolate_lru_page(), to
    keep a stable reference on the page while it is being isolated.

    Signed-off-by: Konstantin Khlebnikov
    Cc: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: Mel Gorman
    Cc: Lee Schermerhorn
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • Straightforward conversion of i_mmap_lock to a mutex.

    Signed-off-by: Peter Zijlstra
    Acked-by: Hugh Dickins
    Cc: Benjamin Herrenschmidt
    Cc: David Miller
    Cc: Martin Schwidefsky
    Cc: Russell King
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Tony Luck
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Namhyung Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
