26 Jul, 2011

40 commits

  • 2.6.36's 7e496299d4d2 ("tmpfs: make tmpfs scalable with percpu_counter for
    used blocks") to make tmpfs scalable with percpu_counter used
    inode->i_lock in place of sbinfo->stat_lock around i_blocks updates; but
    that was adverse to scalability, and unnecessary, since info->lock is
    already held there in the fast paths.

    Remove those uses of i_lock, and add info->lock in the three error paths
    where it's then needed across shmem_free_blocks(). It's not actually
    needed across shmem_unacct_blocks(), but they're so often paired that it
    looks wrong to split them apart.

    Signed-off-by: Hugh Dickins
    Acked-by: Tim Chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • truncate_inode_pages_range()'s final loop has a nice pincer property,
    bringing start and end together, squeezing out the last pages. But the
    range handling missed out on that, just sliding up the range, perhaps
    letting pages come in behind it. Add one more test to give it the same
    pincer effect.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Make the pagevec_lookup loops in truncate_inode_pages_range(),
    invalidate_mapping_pages() and invalidate_inode_pages2_range() more
    consistent with each other.

    They were relying upon page->index of an unlocked page, but apologizing
    for it: accept it, embrace it, add comments and WARN_ONs, and simplify the
    index handling.

    invalidate_inode_pages2_range() had special handling for a wrapped
    page->index + 1 = 0 case; but MAX_LFS_FILESIZE doesn't let us anywhere
    near there, and a corrupt page->index in the radix_tree could cause more
    trouble than that would catch. Remove that wrapped handling.

    invalidate_inode_pages2_range() uses min() to limit the pagevec_lookup
    when near the end of the range: copy that into the other two, although
    it's less useful than you might think (it limits the use of the buffer,
    rather than the indices looked up).
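
    A minimal userspace sketch of the clamp described above. It assumes
    PAGEVEC_SIZE is 14 and that end is an inclusive page index;
    lookup_count() is a hypothetical helper name, not a kernel function.

```c
#include <assert.h>

#define PAGEVEC_SIZE 14  /* assumed pagevec capacity */

/*
 * How many pages to ask the lookup for when walking [index, end],
 * with end inclusive. Taking the minimum before adding 1 keeps the
 * arithmetic from wrapping when end is the largest possible index.
 */
static unsigned long lookup_count(unsigned long index, unsigned long end)
{
    unsigned long remaining;
    unsigned long cap = PAGEVEC_SIZE - 1;

    assert(index <= end);
    remaining = end - index;
    return (remaining < cap ? remaining : cap) + 1;
}
```

    As the message notes, this only limits how much of the buffer is
    filled; it does not restrict which indices the lookup may return.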

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Use consistent variable names in truncate_pagecache(), truncate_setsize(),
    vmtruncate() and vmtruncate_range().

    unmap_mapping_range() and vmtruncate_range() have mismatched interfaces:
    don't change either, but make the vmtruncates more precise about what they
    expect unmap_mapping_range() to do.

    vmtruncate_range() is currently called only with page-aligned start and
    end+1: can handle unaligned start, but unaligned end+1 would hit BUG_ON in
    truncate_inode_pages_range() (lacks partial clearing of the end page).

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Correct comment on truncate_inode_pages*() in linux/mm.h; and remove
    declaration of page_unuse(), it didn't exist even in 2.2.26 or 2.4.0!

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • The often-NULL data arg to read_cache_page() and read_mapping_page()
    functions is misdescribed as "destination for read data": no, it's the
    first arg to the filler function, often struct file * to ->readpage().

    Satisfy checkpatch.pl on those filler prototypes, and tidy up the
    declarations in linux/pagemap.h.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Signed-off-by: David S. Miller
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David S. Miller
     
  • Luckily there are still a few software PTE bits remaining and they even
    match up in both the sun4u and sun4v pte layouts.

    Signed-off-by: David S. Miller
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David S. Miller
     
  • Make use of the generic RCU page table freeing on Sparc64; doing so allows
    for race-free software page-table walkers like gup_fast().

    Signed-off-by: David S. Miller
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David S. Miller
     
  • With the recent mmu_gather changes that included generic RCU freeing of
    page-tables, it is now quite straightforward to implement gup_fast() on
    sparc64.

    This patch:

    Remove the page table quicklists. They are pointless and make it harder
    to use RCU page table freeing and share code with other architectures.

    BTW, this is the second time this has happened, see commit 3c936465249f
    ("[SPARC64]: Kill pgtable quicklists and use SLAB.")

    Signed-off-by: David S. Miller
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David S. Miller
     
  • - shmem pages are not immediately available, but they are not
    potentially available either, even if we swap them out, they will just
    relocate from memory into swap, total amount of immediate and
    potentially available memory is not going to be affected, so we
    shouldn't count them as potentially free in the first place.

    - nr_free_pages() is not an expensive operation anymore, there is no
    need to split the decision making in two halves and repeat code.

    Signed-off-by: Dmitry Fink
    Reviewed-by: Minchan Kim
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Fink
     
  • RED_INACTIVE is a slab thing, and reusing it for memblock was
    inappropriate, because memblock is dealing with phys_addr_t's which have a
    Kconfigurable sizeof().

    Create a new poison type for this application. Fixes the sparse warning

    warning: cast truncates bits from constant value (9f911029d74e35b becomes 9d74e35b)
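
    The truncation sparse complains about can be reproduced in userspace.
    This is only a sketch: RED_INACTIVE_64 and phys_addr32_t are hypothetical
    names standing in for the slab poison constant and for phys_addr_t on a
    configuration with 32-bit physical addresses.

```c
#include <stdint.h>

/* The 64-bit slab red-zone poison value quoted in the warning. */
#define RED_INACTIVE_64 0x09F911029D74E35BULL

/* Stand-in for phys_addr_t when physical addresses are 32 bits wide. */
typedef uint32_t phys_addr32_t;

/* Returns what is left of the poison once it is stored in 32 bits. */
static phys_addr32_t poison_as_phys32(void)
{
    return (phys_addr32_t)RED_INACTIVE_64;
}
```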

    Reported-by: H Hartley Sweeten
    Tested-by: H Hartley Sweeten
    Acked-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • /proc/pid/oom_adj is deprecated and scheduled for removal in August 2012
    according to Documentation/feature-removal-schedule.txt.

    This patch makes the warning more verbose by making it appear as a more
    serious problem (the presence of a stack trace and being multiline should
    attract more attention) so that applications still using the old interface
    can get fixed.

    Very popular users of the old interface have been converted since the oom
    killer rewrite has been introduced. udevd switched to the
    /proc/pid/oom_score_adj interface for v162, kde switched in 4.6.1, and
    opensshd switched in 5.7p1.

    At the start of 2012, this should be changed into a WARN() to emit all
    such incidents and then finally remove the tunable in August 2012 as
    scheduled.

    Signed-off-by: David Rientjes
    Cc: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • The badness() function in the oom killer was renamed to oom_badness() in
    a63d83f427fb ("oom: badness heuristic rewrite") for clarity, since it is a
    globally exported function.

    The prototype for the old function still existed in linux/oom.h, so remove
    it. There are no existing users.

    Also fixes documentation and comment references to badness() and adjusts
    them accordingly.

    Signed-off-by: David Rientjes
    Reviewed-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • ZAP_BLOCK_SIZE became unused in the preemptible-mmu_gather work ("mm:
    Remove i_mmap_lock lockbreak"). So zap it.

    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Fix coding style issues flagged by checkpatch.pl

    Signed-off-by: Chris Forbes
    Acked-by: Eric B Munson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Forbes
     
  • The lock is released first thing in all three branches. Simplify this by
    unconditionally releasing lock and remove else clause which was only there
    to be sure lock was released.

    Signed-off-by: Chris Wright
    Reviewed-by: Michal Hocko
    Cc: Andrea Arcangeli
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Wright
     
  • Commit a539f3533b78e3 ("mm: add SECTION_ALIGN_UP() and
    SECTION_ALIGN_DOWN() macro") introduced the SECTION_ALIGN_UP() and
    SECTION_ALIGN_DOWN() macros. Use those macros to increase code
    readability.
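
    A rough userspace sketch of what such alignment macros compute. The
    section size here (2^15 pages) is an assumption for illustration, since
    the real value depends on architecture and configuration, and
    sections_spanned() is a hypothetical helper.

```c
#include <assert.h>

/* Assumed for illustration: 2^15 pages per memory section. */
#define PFN_SECTION_SHIFT   15
#define PAGES_PER_SECTION   (1UL << PFN_SECTION_SHIFT)
#define PAGE_SECTION_MASK   (~(PAGES_PER_SECTION - 1))

/* Round a page frame number up or down to a section boundary. */
#define SECTION_ALIGN_UP(pfn) \
    (((pfn) + PAGES_PER_SECTION - 1) & PAGE_SECTION_MASK)
#define SECTION_ALIGN_DOWN(pfn) ((pfn) & PAGE_SECTION_MASK)

/* Number of sections touched by the pfn range [start, end). */
static unsigned long sections_spanned(unsigned long start, unsigned long end)
{
    assert(start <= end);
    return (SECTION_ALIGN_UP(end) - SECTION_ALIGN_DOWN(start))
            >> PFN_SECTION_SHIFT;
}
```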

    Signed-off-by: Daniel Kiper
    Acked-by: David Rientjes
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Kiper
     
  • In commit a2c8990aed5ab ("memsw: remove noswapaccount kernel parameter"),
    Michal forgot to remove some leftover pieces of noswapaccount in the tree;
    this patch removes them all.

    Signed-off-by: WANG Cong
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    WANG Cong
     
  • Commit bae9c19bf1 ("thp: split_huge_page_mm/vma") changed locking behavior
    of walk_page_range(). Thus this patch changes the comment too.

    Signed-off-by: KOSAKI Motohiro
    Cc: Naoya Horiguchi
    Cc: Hiroyuki Kamezawa
    Cc: Andrea Arcangeli
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Originally, walk_hugetlb_range() didn't require the caller to take any
    lock. But commit d33b9f45bd ("mm: hugetlb: fix hugepage memory leak in
    walk_page_range") changed that rule, because it added a find_vma() call
    in walk_hugetlb_range().

    Any commit that changes a locking rule should document it too.

    [akpm@linux-foundation.org: clarify comment]
    Signed-off-by: KOSAKI Motohiro
    Cc: Naoya Horiguchi
    Cc: Hiroyuki Kamezawa
    Cc: Andrea Arcangeli
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Currently, walk_page_range() calls find_vma() on every page-table walk
    iteration, but that is completely unnecessary if walk->hugetlb_entry is
    unused. And we shouldn't assume find_vma() is a lightweight
    operation. So this patch checks walk->hugetlb_entry and avoids the
    find_vma() call if possible.

    This patch also makes some cleanups: 1) remove the ugly
    uninitialized_var() and 2) remove the #ifdef in the function body.

    Signed-off-by: KOSAKI Motohiro
    Cc: Naoya Horiguchi
    Cc: Hiroyuki Kamezawa
    Cc: Andrea Arcangeli
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • The doc of find_vma() says,

    /* Look up the first VMA which satisfies addr < vm_end, NULL if none. */
    struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
    {
    (snip)

    Thus, the caller should confirm that the returned vma is the desired one.
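
    The pitfall can be illustrated with a simplified userspace model. The
    struct and both helpers below are hypothetical stand-ins, with a sorted
    array replacing the kernel's VMA lookup structures.

```c
#include <stddef.h>

/* Simplified stand-in for a VMA covering [vm_start, vm_end). */
struct vma {
    unsigned long vm_start;
    unsigned long vm_end;
};

/*
 * Same contract as the quoted comment: the first area with
 * addr < vm_end, or NULL if none. The array is assumed to be
 * sorted by address and non-overlapping.
 */
static const struct vma *find_vma_sketch(const struct vma *areas, size_t n,
                                         unsigned long addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr < areas[i].vm_end)
            return &areas[i];
    return NULL;
}

/* The check a caller must add: addr may sit in the hole before the area. */
static const struct vma *find_vma_containing(const struct vma *areas,
                                             size_t n, unsigned long addr)
{
    const struct vma *vma = find_vma_sketch(areas, n, addr);

    if (vma && addr >= vma->vm_start)
        return vma;
    return NULL;
}
```

    With areas [0x1000, 0x2000) and [0x5000, 0x6000), a lookup of 0x3000
    returns the second area even though the address lies in the gap between
    them, which is why the vm_start comparison is needed.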

    Signed-off-by: KOSAKI Motohiro
    Cc: Naoya Horiguchi
    Cc: Hiroyuki Kamezawa
    Cc: Andrea Arcangeli
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Document some swap token aging design decisions.

    Signed-off-by: KOSAKI Motohiro
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • global_faults and last_aging are only used in grab_swap_token(). Move
    them into grab_swap_token().

    Signed-off-by: KOSAKI Motohiro
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • http://www.cs.wm.edu/~sjiang/token.pdf is now dead. Replace it with a
    live alternative.

    Signed-off-by: KOSAKI Motohiro
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Memory hotplug support for Xen balloon driver. It should be mentioned
    that hotplugged memory is not onlined automatically. It should be onlined
    by the user through the standard sysfs interface.

    Memory can be hotplugged in the following steps:

    1) dom0: xl mem-max <domain> <maxmem>
    where <maxmem> is >= requested memory size,

    2) dom0: xl mem-set <domain> <memory>
    where <memory> is requested memory size; alternatively memory
    could be added by writing proper value to
    /sys/devices/system/xen_memory/xen_memory0/target or
    /sys/devices/system/xen_memory/xen_memory0/target_kb on domU,

    3) domU: for i in /sys/devices/system/memory/memory*/state; do \
    [ "`cat "$i"`" = offline ] && echo online > "$i"; done

    Memory can be onlined automatically on domU by adding the following line
    to the udev rules:

    SUBSYSTEM=="memory", ACTION=="add", RUN+="/bin/sh -c '[ -f /sys$devpath/state ] && echo online > /sys$devpath/state'"

    In that case step 3 should be omitted.

    Signed-off-by: Daniel Kiper
    Acked-by: Konrad Rzeszutek Wilk
    Cc: Ian Campbell
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Kiper
     
  • This patch contains online_page_callback and appropriate functions for
    registering/unregistering online page callbacks. It allows
    machine-specific tasks to be performed during the online page stage,
    which is required to implement memory hotplug in virtual machines.
    Currently this patch is required by the latest memory hotplug support for
    Xen balloon driver patch, which will be posted soon.

    Additionally, the original online_page() function was split into the
    following functions doing "atomic" operations:

    - __online_page_set_limits() - set new limits for memory management code,
    - __online_page_increment_counters() - increment totalram_pages and totalhigh_pages,
    - __online_page_free() - free page to allocator.

    It was done to:
    - not duplicate existing code,
    - ease hotplug code development by usage of well defined interface,
    - avoid stupid bugs which are unavoidable when the same code
    (by design) is developed in many places.
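
    The registration scheme can be sketched in userspace as a single
    function-pointer slot with a generic default. The names below are
    modeled on what this patch describes but should be read as assumptions,
    and the bodies are simplified stand-ins with no locking.

```c
#include <errno.h>

/* Stand-in for struct page; records which handler onlined it. */
struct page {
    int onlined_by;   /* 0 = generic path, 1 = driver callback */
};

typedef void (*online_page_callback_t)(struct page *page);

/* Default behaviour when no driver has registered a callback. */
static void generic_online_page(struct page *page)
{
    page->onlined_by = 0;
}

static online_page_callback_t online_page_callback = generic_online_page;

/* Install a callback; refuse if another one is already installed. */
static int set_online_page_callback(online_page_callback_t callback)
{
    if (online_page_callback != generic_online_page)
        return -EINVAL;
    online_page_callback = callback;
    return 0;
}

/* Restore the default; only the current owner may do so. */
static int restore_online_page_callback(online_page_callback_t callback)
{
    if (online_page_callback != callback)
        return -EINVAL;
    online_page_callback = generic_online_page;
    return 0;
}

/* Onlining goes through the slot, using explicit indirect-call syntax. */
static void online_one_page(struct page *page)
{
    (*online_page_callback)(page);
}

/* Hypothetical driver-side handler, e.g. for a balloon driver. */
static void balloon_online_page(struct page *page)
{
    page->onlined_by = 1;
}
```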

    [akpm@linux-foundation.org: use explicit indirect-call syntax]
    Signed-off-by: Daniel Kiper
    Reviewed-by: Konrad Rzeszutek Wilk
    Cc: Ian Campbell
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Kiper
     
  • Since commit a19a6ee "backlight: Allow properties to be passed at
    registration" and commit bb7ca74 "backlight: add backlight type", we can
    set backlight type and max_brightness before backlights are registered.
    Some newly added drivers did not set it properly, let's fix it.

    Signed-off-by: Axel Lin
    Cc: Matthew Garrett
    Cc: Jingoo Han
    Cc: Donghwa Lee
    Cc: InKi Dae
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Axel Lin
     
  • Add the ams369fg06 amoled panel driver. The ams369fg06 amoled panel (480
    x 800) driver uses a 3-wire SPI interface. The brightness can be
    controlled by the gamma setting of the amoled panel.

    [sfr@canb.auug.org.au: fix build error]
    [axel.lin@gmail.com: unregister backlight device when unloading the module]
    [axel.lin@gmail.com: staticize ams369fg06_shutdown]
    Signed-off-by: Jingoo Han
    Cc: Richard Purdie
    Cc: Inki Dae
    Cc: anish singh
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Axel Lin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jingoo Han
     
  • We have set props.max_brightness before registering backlight device.

    Signed-off-by: Axel Lin
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Axel Lin
     
  • - Fix checking of wrong return value for backlight_device_register()

    - Properly free allocated resources in ld9040_probe() error path and
    ld9040_remove().

    Signed-off-by: Axel Lin
    Cc: Donghwa Lee
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Axel Lin
     
  • Vito said:

    : The system has many usb disks coming and going day to day, with their
    : respective bdi's having min_ratio set to 1 when inserted. It works for
    : some time until eventually min_ratio can no longer be set, even when the
    : active set of bdi's seen in /sys/class/bdi/*/min_ratio doesn't add up to
    : anywhere near 100.
    :
    : This then leads to an unrelated starvation problem caused by write-heavy
    : fuse mounts being used atop the usb disks, a problem the min_ratio setting
    : at the underlying devices bdi effectively prevents.

    Fix this leakage by resetting the bdi min_ratio when unregistering the
    BDI.
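
    A simplified userspace model of the leak and the fix, assuming the
    per-device minimums are summed in one global counter that must stay
    below 100. struct bdi, total_min_ratio and bdi_unregister_sketch() are
    illustrative names, not the kernel's definitions.

```c
#include <errno.h>

/* Stand-in for a backing device with a reserved minimum share. */
struct bdi {
    unsigned int min_ratio;
};

/* Sum of min_ratio over every device that has set one. */
static unsigned int total_min_ratio;

/* Change one device's share; the global sum must stay below 100. */
static int bdi_set_min_ratio(struct bdi *bdi, unsigned int min_ratio)
{
    unsigned int total = total_min_ratio - bdi->min_ratio + min_ratio;

    if (total >= 100)
        return -EINVAL;
    total_min_ratio = total;
    bdi->min_ratio = min_ratio;
    return 0;
}

/* The fix: hand the reserved share back before the device goes away. */
static void bdi_unregister_sketch(struct bdi *bdi)
{
    bdi_set_min_ratio(bdi, 0);
}
```

    Without the reset on unregister, every removed disk would leave its
    share in the global sum, and after enough insertions no device could
    set a min_ratio again.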

    Signed-off-by: Peter Zijlstra
    Reported-by: Vito Caputo
    Cc: Wu Fengguang
    Cc: Miklos Szeredi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • If dmi_get_system_info() returns NULL, pch_phub_probe() will dereference
    a NULL pointer.

    This oops was observed on an Atom-based board which has no BIOS, but a
    bootloader which doesn't provide DMI data.
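
    The defensive pattern amounts to testing the string for NULL before
    handing it to any string function. board_name_matches() below is a
    hypothetical helper used only to illustrate this; it is not taken from
    the driver.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns 1 if board_name contains needle, 0 otherwise. board_name
 * may be NULL when the firmware provides no DMI data, so it is
 * checked before strstr() is allowed to touch it.
 */
static int board_name_matches(const char *board_name, const char *needle)
{
    return board_name != NULL && strstr(board_name, needle) != NULL;
}
```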

    Signed-off-by: Alexander Stein
    Cc: Tomoya MORINAGA
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Stein
     
  • Fix the following build error:

    arch/xtensa/include/asm/uaccess.h:403: error: implicit declaration of function 'prefetch'
    arch/xtensa/include/asm/uaccess.h:412: error: implicit declaration of function 'prefetchw'

    Signed-off-by: WANG Cong
    Cc: Chris Zankel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    WANG Cong
     
  • Prevent an arbitrary kernel read. Check the user pointer with access_ok()
    before copying data in.

    [akpm@linux-foundation.org: s/EIO/EFAULT/]
    Signed-off-by: Dan Rosenberg
    Cc: Christian Zankel
    Cc: Oleg Nesterov
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Rosenberg
     
  • In a subsequent patch I have a const struct page in my hand...

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Ian Campbell
    Cc: Andrea Arcangeli
    Cc: Rik van Riel
    Cc: Martin Schwidefsky
    Cc: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Campbell
     
  • These uses are read-only and in a subsequent patch I have a const struct
    page in my hand...

    [akpm@linux-foundation.org: fix warnings in lowmem_page_address()]
    Signed-off-by: Ian Campbell
    Cc: Rik van Riel
    Cc: Andrea Arcangeli
    Cc: Mel Gorman
    Cc: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ian Campbell
     
  • This is needed on HIGHMEM systems - we don't always have a virtual
    address so store the physical address and map it in as needed.

    [akpm@linux-foundation.org: cleanup]
    Signed-off-by: Becky Bruce
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Becky Bruce
     
  • This:

    vma->vm_pgoff & ~(huge_page_mask(h) >> PAGE_SHIFT)

    is incorrect on 32-bit. It causes us to & the pgoff with something that
    looks like this (for a 4M hugepage): 0xfff003ff. The mask should be
    flipped and *then* shifted, to give you 0x0000_03ff.
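
    The arithmetic can be checked in userspace with 32-bit integers. This
    sketch assumes 4 KiB base pages (PAGE_SHIFT of 12); the helper names are
    hypothetical and only model what huge_page_mask() returns.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumed 4 KiB base pages */

/* Mask with the in-hugepage offset bits cleared, 32 bits wide. */
static uint32_t huge_page_mask32(uint32_t huge_page_size)
{
    /* The size must be a non-zero power of two. */
    assert(huge_page_size && (huge_page_size & (huge_page_size - 1)) == 0);
    return ~(huge_page_size - 1);
}

/* Shift first, then flip: the high bits end up set. */
static uint32_t pgoff_mask_buggy(uint32_t huge_page_size)
{
    return ~(huge_page_mask32(huge_page_size) >> PAGE_SHIFT);
}

/* Flip first, then shift: only the low page-offset bits remain. */
static uint32_t pgoff_mask_fixed(uint32_t huge_page_size)
{
    return ~huge_page_mask32(huge_page_size) >> PAGE_SHIFT;
}
```

    For a 4M hugepage the first form yields 0xfff003ff and the second
    yields 0x000003ff.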

    Signed-off-by: Becky Bruce
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Becky Bruce