25 Jul, 2008

40 commits

  • This patch simplifies the memory bitmap manipulations.

    - remove the member size in struct bm_block

    It is not necessary for struct bm_block to have the number of bit chunks that
    can be calculated by using end_pfn and start_pfn.

    - use find_next_bit() for memory_bm_next_pfn

    No need to invent the bitmap library only for the memory bitmap.

    Signed-off-by: Akinobu Mita
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • This patch (as1112) adds some new PM_EVENT_* codes for use by kernel
    subsystems. They describe runtime power-state transitions of the sort already
    implemented by the USB subsystem.

    Signed-off-by: Alan Stern
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Stern
     
  • Drop unnecessary includes from include/linux/pm.h .

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Remove some obsolete PM documentation.

    The majority of contents of Documentation/power/pm.txt are
    outdated. Remove the outdated parts of this file and move the rest
    to Documentation/power/apm-acpi.txt . Update the index in
    Documentation/power/ as appropriate.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Remove the remaining obsolete definitions from include/linux/pm.h and move
    the definitions of PM_SUSPEND and PM_RESUME to the header of h3600 which
    is the only user of them.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Remove the definition of 'struct pm_dev', which is not used any more,
    along with some related stuff from include/linux/pm.h .

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Remove the obsolete and no longer used include/linux/pm_legacy.h

    Reviewed-by: Robert P. J. Day
    Signed-off-by: Adrian Bunk
    Cc: Pavel Machek
    Acked-by: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Boot-time test for system suspend states (STR or standby). The generic
    RTC framework triggers wakeup alarms, which are used to exit those states.

    - Measures some aspects of suspend time ... this uses "jiffies" until
    someone converts it to use a timebase that works properly even while
    timer IRQs are disabled.

    - Triggered by a command line parameter. By default nothing even
    vaguely troublesome will happen, but "test_suspend=mem" will give
    you a brief STR test during system boot. (Or you may need to use
    "test_suspend=standby" instead, if your hardware needs that.)

    This isn't without problems. It fires early enough during boot that for
    example both PCMCIA and MMC stacks have misbehaved. The workaround in
    those cases was to boot without such media cards inserted.

    [matthltc@us.ibm.com: fix compile failure in boot time suspend selftest]
    Signed-off-by: David Brownell
    Cc: Ingo Molnar
    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Brownell
     
  • Tell the user about the no_console_suspend option, so that we don't have to
    tell each bug reporter personally.

    [akpm@linux-foundation.org: clarify the text a little]
    Signed-off-by: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • The real option is named AGP_ALPHA_CORE.

    Reviewed-by: Robert P. J. Day
    Signed-off-by: Adrian Bunk
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • This patch removes the unused include/asm-h8300/keyboard.h

    Signed-off-by: Adrian Bunk
    Acked-by: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • ifd->offset is unsigned. gigaset_isowbuf_getbytes() may return signed
    unnoticed. Revised version of patch originally submitted by Roel Kluin
    .

    Signed-off-by: Tilman Schmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tilman Schmidt
     
  • The info() / warn() / err() macros from usb.h for generating kernel
    messages are considered inferior to dev_info() / dev_warn() / dev_err()
    from device.h. Replace them where possible. Also correct the severity
    level and improve the text of one message.

    Signed-off-by: Tilman Schmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tilman Schmidt
     
  • Why would linux/security.h need forward declarations for nfsctl_arg and
    swap_info_struct? It's hard to imagine: remove them.

    Signed-off-by: Hugh Dickins
    Acked-by: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Filesystem capabilities have come of age. Remove the experimental tag for
    configuring filesystem capabilities.

    Signed-off-by: Andrew G. Morgan
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew G. Morgan
     
  • To date, we've tried hard to confine filesystem support for capabilities
    to the security modules. This has left a lot of the code in
    kernel/capability.c in a state where it looks like it supports something
    that filesystem support for capabilities actually suppresses when the LSM
    security/commmoncap.c code runs. What is left is a lot of code that uses
    sub-optimal locking in the main kernel

    With this change we refactor the main kernel code and make it explicit
    which locks are needed and that the only remaining kernel races in this
    area are associated with non-filesystem capability code.

    Signed-off-by: Andrew G. Morgan
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew G. Morgan
     
  • When cap_bset suppresses some of the forced (fP) capabilities of a file,
    it is generally only safe to execute the program if it understands how to
    recognize it doesn't have enough privilege to work correctly. For legacy
    applications (fE!=0), which have no non-destructive way to determine that
    they are missing privilege, we fail to execute (EPERM) any executable that
    requires fP capabilities, but would otherwise get pP' < fP. This is a
    fail-safe permission check.

    For some discussion of why it is problematic for (legacy) privileged
    applications to run with less than the set of capabilities requested for
    them, see:

    http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html

    With this iteration of this support, we do not include setuid-0 based
    privilege protection from the bounding set. That is, the admin can still
    (ab)use the bounding set to suppress the privileges of a setuid-0 program.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: cleanup]
    Signed-off-by: Andrew G. Morgan
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew G. Morgan
     
  • Vegard Nossum has noticed the ever-decreasing negative priority in a
    swapon /swapoff loop, which eventually would misprioritize when int wraps
    positive. Not worth spending much code on, but probably better fixed.

    It's easy to handle the swapping on and off of just one area, but there's
    not much point if a pair or more still misbehave. To handle the general
    case, swapoff should compact negative priorities, keeping them always from
    -1 to -MAX_SWAPFILES. That's a change, but should cause no regression,
    since these negative (unspecified) priorities are disjoint from the the
    positive specified priorities 0 to 32767.

    One small functional difference, which seems appropriate: when swapoff
    fails to free all swap from a negative priority area, that area is now
    reinserted at lowest priority, rather than at its original priority.

    In moving down swapon's setting of priority, I notice that an area is
    visible to /proc/swaps when it has swap_map set, yet that was being set
    before all the visible fields were properly filled in: corrected.

    Signed-off-by: Hugh Dickins
    Reviewed-by: KOSAKI Motohiro
    Reported-by: Vegard Nossum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • We'd like to support CONFIG_MEMORY_HOTREMOVE on s390, which depends on
    CONFIG_MIGRATION. So far, CONFIG_MIGRATION is only available with NUMA
    support.

    This patch makes CONFIG_MIGRATION selectable for architectures that define
    ARCH_ENABLE_MEMORY_HOTREMOVE. When MIGRATION is enabled w/o NUMA, the
    kernel won't compile because migrate_vmas() does not know about
    vm_ops->migrate() and vma_migratable() does not know about policy_zone.
    To fix this, those two functions can be restricted to '#ifdef CONFIG_NUMA'
    because they are not being used w/o NUMA. vma_migratable() is moved over
    from migrate.h to mempolicy.h.

    [kosaki.motohiro@jp.fujitsu.com: build fix]
    Acked-by: Christoph Lameter
    Signed-off-by: Gerald Schaefer
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: KOSAKI Motorhiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gerald Schaefer
     
  • While in all cases in the kernel we know the size of the elements to be
    created, we don't always know the count of elements. By commuting the size
    and count in the overflow check, the compiler can reduce the runtime division
    of size_t with a compare to a (unique) constant in these cases.

    Signed-off-by: Milton Miller
    Cc: Takashi Iwai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Milton Miller
     
  • Memory may be hot-removed on a per-memory-block basis, particularly on
    POWER where the SPARSEMEM section size often matches the memory-block
    size. A user-level agent must be able to identify which sections of
    memory are likely to be removable before attempting the potentially
    expensive operation. This patch adds a file called "removable" to the
    memory directory in sysfs to help such an agent. In this patch, a memory
    block is considered removable if;

    o It contains only MOVABLE pageblocks
    o It contains only pageblocks with free pages regardless of pageblock type

    On the other hand, a memory block starting with a PageReserved() page will
    never be considered removable. Without this patch, the user-agent is
    forced to choose a memory block to remove randomly.

    Sample output of the sysfs files:

    ./memory/memory0/removable: 0
    ./memory/memory1/removable: 0
    ./memory/memory2/removable: 0
    ./memory/memory3/removable: 0
    ./memory/memory4/removable: 0
    ./memory/memory5/removable: 0
    ./memory/memory6/removable: 0
    ./memory/memory7/removable: 1
    ./memory/memory8/removable: 0
    ./memory/memory9/removable: 0
    ./memory/memory10/removable: 0
    ./memory/memory11/removable: 0
    ./memory/memory12/removable: 0
    ./memory/memory13/removable: 0
    ./memory/memory14/removable: 0
    ./memory/memory15/removable: 0
    ./memory/memory16/removable: 0
    ./memory/memory17/removable: 1
    ./memory/memory18/removable: 1
    ./memory/memory19/removable: 1
    ./memory/memory20/removable: 1
    ./memory/memory21/removable: 1
    ./memory/memory22/removable: 1

    Signed-off-by: Badari Pulavarty
    Signed-off-by: Mel Gorman
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     
  • If zonelist is required to be rebuilt in online_pages(), there is no need
    to recalculate vm_total_pages in that function, as it has been updated in
    the call build_all_zonelists().

    Signed-off-by: Kent Liu
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Liu
     
  • - Change some naming
    * Magic -> types
    * MIX_INFO -> MIX_SECTION_INFO
    * Change definition of bootmem type from direct hex value

    - __free_pages_bootmem() becomes __meminit.

    Signed-off-by: Yasunori Goto
    Cc: Andy Whitcroft
    Cc: Badari Pulavarty
    Cc: Yinghai Lu
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • Usemaps are allocated on the section which has pgdat by this.

    Because usemap size is very small, many other sections usemaps are
    allocated on only one page. If a section has usemap, it can't be removed
    until removing other sections. This dependency is not desirable for
    memory removing.

    Pgdat has similar feature. When a section has pgdat area, it must be the
    last section for removing on the node. So, if section A has pgdat and
    section B has usemap for section A, Both sections can't be removed due to
    dependency each other.

    To solve this issue, this patch collects usemap on same section with pgdat
    as much as possible. If other sections doesn't have any dependency, this
    section will be able to be removed finally.

    Signed-off-by: Yasunori Goto
    Cc: Mel Gorman
    Cc: Andy Whitcroft
    Cc: David Miller
    Cc: Badari Pulavarty
    Cc: Heiko Carstens
    Cc: Hiroyuki KAMEZAWA
    Cc: Tony Breeds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • This was required by some old, no-longer-used gcc on sparc.

    Signed-off-by: Vegard Nossum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vegard Nossum
     
  • On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
    boundary. For example:

    u64 val = PAGE_ALIGN(size);

    always returns a value < 4GB even if size is greater than 4GB.

    The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
    example):

    #define PAGE_SHIFT 12
    #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
    #define PAGE_MASK (~(PAGE_SIZE-1))
    ...
    #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK)

    The "~" is performed on a 32-bit value, so everything in "and" with
    PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
    Using the ALIGN() macro seems to be the right way, because it uses
    typeof(addr) for the mask.

    Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h in
    include/linux/mm.h.

    See also lkml discussion: http://lkml.org/lkml/2008/6/11/237

    [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
    [akpm@linux-foundation.org: fix v850]
    [akpm@linux-foundation.org: fix powerpc]
    [akpm@linux-foundation.org: fix arm]
    [akpm@linux-foundation.org: fix mips]
    [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
    [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
    [akpm@linux-foundation.org: fix powerpc]
    Signed-off-by: Andrea Righi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Righi
     
  • Make the needlessly global register_page_bootmem_info_section() static.

    Signed-off-by: Adrian Bunk
    Acked-by: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • This patch contains the following cleanups:
    - make the following needlessly global variables static:
    - required_kernelcore
    - zone_movable_pfn[]
    - make the following needlessly global functions static:
    - move_freepages()
    - move_freepages_block()
    - setup_pageset()
    - find_usable_zone_for_movable()
    - adjust_zone_range_for_zone_movable()
    - __absent_pages_in_range()
    - find_min_pfn_for_node()
    - find_zone_movable_pfns_for_nodes()

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • alloc_pages_exact() is similar to alloc_pages(), except that it allocates
    the minimum number of pages to fulfill the request. This is useful if you
    want to allocate a very large buffer that is slightly larger than an even
    power-of-two number of pages. In that case, alloc_pages() will waste a
    lot of memory.

    I have a video driver that wants to allocate a 5MB buffer. alloc_pages()
    wiill waste 3MB of physically-contiguous memory.

    Signed-off-by: Timur Tabi
    Cc: Andi Kleen
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Timur Tabi
     
  • Almost all users of this field need a PFN instead of a physical address,
    so replace node_boot_start with node_min_pfn.

    [Lee.Schermerhorn@hp.com: fix spurious BUG_ON() in mark_bootmem()]
    Signed-off-by: Johannes Weiner
    Cc:
    Signed-off-by: Lee Schermerhorn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Since alloc_bootmem_core does no goal-fallback anymore and just returns
    NULL if the allocation fails, we might now use it in alloc_bootmem_section
    without all the fixup code for a misplaced allocation.

    Also, the limit can be the first PFN of the next section as the semantics
    is that the limit is _above_ the allocated region, not within.

    Signed-off-by: Johannes Weiner
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • __alloc_bootmem_node already does this, make the interface consistent.

    Signed-off-by: Johannes Weiner
    Cc: Ingo Molnar
    Cc: Yinghai Lu
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • The old node-agnostic code tried allocating on all nodes starting from the
    one with the lowest range. alloc_bootmem_core retried without the goal if
    it could not satisfy it and so the goal was only respected at all when it
    happened to be on the first (lowest page numbers) node (or theoretically
    if allocations failed on all nodes before to the one holding the goal).

    Introduce a non-panicking helper that starts allocating from the node
    holding the goal and falls back only after all thes tries failed, thus
    moving the goal fallback code out of alloc_bootmem_core.

    Make all other allocation functions benefit from this new helper.

    Signed-off-by: Johannes Weiner
    Cc: Ingo Molnar
    Cc: Yinghai Lu
    Cc: Andi Kleen
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Introduce new helpers that mark a range that resides completely on a node
    or node-agnostic ranges that might also span node boundaries.

    The free/reserve API functions will then directly use these helpers.

    Note that the free/reserve semantics become more strict: while the prior
    code took basically arbitrary range arguments and marked the PFNs that
    happen to fall into that range, the new code requires node-specific ranges
    to be completely on the node. The node-agnostic requests might span node
    boundaries as long as the nodes are contiguous.

    Passing ranges that do not satisfy these criteria is a bug.

    [akpm@linux-foundation.org: fix printk warnings]
    Signed-off-by: Johannes Weiner
    Cc: Ingo Molnar
    Cc: Yinghai Lu
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Factor out the common operation of marking a range on the bitmap.

    [akpm@linux-foundation.org: fix various warnings]
    Signed-off-by: Johannes Weiner
    Cc: Ingo Molnar
    Cc: Yinghai Lu
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • alloc_bootmem_core has become quite nasty to read over time. This is a
    clean rewrite that keeps the semantics.

    bdata->last_pos has been dropped.

    bdata->last_success has been renamed to hint_idx and it is now an index
    relative to the node's range. Since further block searching might start
    at this index, it is now set to the end of a succeeded allocation rather
    than its beginning.

    bdata->last_offset has been renamed to last_end_off to be more clear that
    it represents the ending address of the last allocation relative to the
    node.

    [y-goto@jp.fujitsu.com: fix new alloc_bootmem_core()]
    Signed-off-by: Johannes Weiner
    Signed-off-by: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Rewrite the code in a more concise way using less variables.

    [akpm@linux-foundation.org: fix printk warnings]
    Signed-off-by: Johannes Weiner
    Cc: Ingo Molnar
    Cc: Yinghai Lu
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • link_bootmem handles an insertion of a new descriptor into the sorted list
    in more or less three explicit branches; empty list, insert in between and
    append. These cases can be expressed implicite.

    Also mark the sorted list as initdata as it can be thrown away after boot
    as well.

    Signed-off-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Reincarnate get_mapsize as bootmap_bytes and implement
    bootmem_bootmap_pages on top of it.

    Adjust users of these helpers and make free_all_bootmem_core use
    bootmem_bootmap_pages instead of open-coding it.

    Signed-off-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Introduce the bootmem_debug kernel parameter that enables very verbose
    diagnostics regarding all range operations of bootmem as well as the
    initialization and release of nodes.

    [akpm@linux-foundation.org: fix printk warnings]
    Signed-off-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner