27 Sep, 2006

40 commits

  • Set the backing device info capabilities for /dev/mem and /dev/kmem to
    permit direct sharing under no-MMU conditions and full mapping capabilities
    under MMU conditions. Make the BDI used by these available to all directly
    mappable character devices.

    Also comment the capabilities for /dev/zero.

    [akpm@osdl.org: ifdef reductions]
    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Check that access_process_vm() is accessing a valid mapping in the target
    process.

    This limits ptrace() accesses and accesses through /proc/<pid>/maps to only
    those regions actually mapped by a program.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Signed-off-by: Haavard Skinnemoen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Haavard Skinnemoen
     
    The function is exported but not used from anywhere else. It's also marked as
    "not for driver use", so no one out there should really care.

    Signed-off-by: Rolf Eike Beer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rolf Eike Beer
     
  • The empty line between the short description and the first argument
    description causes a section to appear twice in the generated manpage.
    Also the short description should really be short: the script can't handle
    multiple lines.

    Signed-off-by: Rolf Eike Beer
    Acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rolf Eike Beer
     
  • Use NULL instead of 0 for pointer value, eliminate sparse warnings.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Implement the special memory driver (mspec) based on the do_no_pfn
    approach. The driver is currently used only on SN2 hardware with special
    fetchop support but could be beneficial on other architectures using the
    uncached mode.

    Signed-off-by: Jes Sorensen
    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jes Sorensen
     
  • Implement do_no_pfn() for handling mapping of memory without a struct page
    backing it. This avoids creating fake page table entries for regions which
    are not backed by real memory.

    This feature is used by the MSPEC driver and other users, where it is
    highly undesirable to have a struct page sitting behind the page (for
    instance if the page is accessed in cached mode via the struct page in
    parallel to the driver accessing it uncached, which can result in data
    corruption on some architectures, such as ia64).

    This version uses specific NOPFN_{SIGBUS,OOM} return values, rather than
    expecting all negative pfn values to be errors. It also BUG()s on COW
    mappings, as these would not work with the VM.

    [akpm@osdl.org: micro-optimise]
    Signed-off-by: Jes Sorensen
    Cc: Hugh Dickins
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jes Sorensen
     
  • Now that we have the node in the hot zone of struct zone we can avoid
    accessing zone_pgdat in zone_statistics.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • We do not need to allocate pagesets for unpopulated zones.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Add the node in order to optimize zone_to_nid.

    Signed-off-by: Christoph Lameter
    Acked-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
    This patch ensures that the slab node lists in the NUMA case only contain
    slabs that belong to that specific node. All slab allocations use
    GFP_THISNODE when calling into the page allocator. If an allocation fails
    then we fall back in the slab allocator according to the zonelists appropriate
    for a certain context.

    This allows a replication of the behavior of alloc_pages and alloc_pages_node
    in the slab layer.

    Currently allocations requested from the page allocator may be redirected via
    cpusets to other nodes. This results in remote pages on nodelists and that in
    turn results in interrupt latency issues during cache draining. Plus the slab
    is handing out memory as local when it is really remote.

    Fallback for slab memory allocations will occur within the slab allocator and
    not in the page allocator. This is necessary in order to be able to use the
    existing pools of objects on the nodes that we fall back to before adding more
    pages to a slab.

    The fallback function ensures that the nodes we fall back to obey cpuset
    restrictions of the current context. We do not allocate objects from outside
    of the current cpuset context like before.

    Note that the implementation of locality constraints within the slab allocator
    requires importing logic from the page allocator. This is a mishmash that is
    not that great. Other allocators (uncached allocator, vmalloc, huge pages)
    face similar problems and have similar minimal reimplementations of the basic
    fallback logic of the page allocator. There is another way of implementing a
    slab by avoiding per-node lists (see modular slab), but this won't work within
    the existing slab.

    V1->V2:
    - Use NUMA_BUILD to avoid #ifdef CONFIG_NUMA
    - Exploit GFP_THISNODE being 0 in the NON_NUMA case to avoid another
    #ifdef

    [akpm@osdl.org: build fix]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
    GFP_THISNODE must be set to 0 in the non-NUMA case, otherwise we disable retry
    and warnings for failing allocations in the SMP and UP cases.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
    The NUMA_BUILD constant is always available and will be set to 1 on NUMA
    builds. That way checks valid only under CONFIG_NUMA can easily be done
    without #ifdef CONFIG_NUMA.

    F.e.

    if (NUMA_BUILD && <some NUMA-only condition>) {
    ...
    }

    [akpm: not a thing we'd normally do, but CONFIG_NUMA is special: it is
    causing ifdef explosion in core kernel, so let's see if this is a comfortable
    way in which to control that]

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • On larger systems, the amount of output dumped on the console when you do
    SysRq-M is beyond insane. This patch is trying to reduce it somewhat as
    even with the smaller NUMA systems that have hit the desktop this seems to
    be a fair thing to do.

    The philosophy I have taken is as follows:
    1) If a zone is empty, don't tell; we don't need yet another line
    telling us so. The information is available since one can look up
    how many zones were initialized in the first place.
    2) Put as much information on a line as possible; if it can be done
    in one line, rather than two, then do it in one. I tried to format
    the temperature stuff for easy reading.

    Change show_free_areas() to not print lines for empty zones. If no zone
    output is printed, the zone is empty. This reduces the number of lines
    dumped to the console in sysrq on a large system by several thousand lines.

    Change the zone temperature printouts to use one line per CPU instead of
    two lines (one hot, one cold). On a 1024 CPU, 1024 node system, this
    reduces the console output by over a million lines of output.

    While this is a bigger problem on large NUMA systems, it is also applicable
    to smaller desktop sized and mid range NUMA systems.

    Old format:

    Mem-info:
    Node 0 DMA per-cpu:
    cpu 0 hot: high 42, batch 7 used:24
    cpu 0 cold: high 14, batch 3 used:1
    cpu 1 hot: high 42, batch 7 used:34
    cpu 1 cold: high 14, batch 3 used:0
    cpu 2 hot: high 42, batch 7 used:0
    cpu 2 cold: high 14, batch 3 used:0
    cpu 3 hot: high 42, batch 7 used:0
    cpu 3 cold: high 14, batch 3 used:0
    cpu 4 hot: high 42, batch 7 used:0
    cpu 4 cold: high 14, batch 3 used:0
    cpu 5 hot: high 42, batch 7 used:0
    cpu 5 cold: high 14, batch 3 used:0
    cpu 6 hot: high 42, batch 7 used:0
    cpu 6 cold: high 14, batch 3 used:0
    cpu 7 hot: high 42, batch 7 used:0
    cpu 7 cold: high 14, batch 3 used:0
    Node 0 DMA32 per-cpu: empty
    Node 0 Normal per-cpu: empty
    Node 0 HighMem per-cpu: empty
    Node 1 DMA per-cpu:
    [snip]
    Free pages: 5410688kB (0kB HighMem)
    Active:9536 inactive:4261 dirty:6 writeback:0 unstable:0 free:338168 slab:1931 mapped:1900 pagetables:208
    Node 0 DMA free:1676304kB min:3264kB low:4080kB high:4896kB active:128048kB inactive:61568kB present:1970880kB pages_scanned:0 all_unreclaimable? no
    lowmem_reserve[]: 0 0 0 0
    Node 0 DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
    lowmem_reserve[]: 0 0 0 0
    Node 0 Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
    lowmem_reserve[]: 0 0 0 0
    Node 0 HighMem free:0kB min:512kB low:512kB high:512kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
    lowmem_reserve[]: 0 0 0 0
    Node 1 DMA free:1951728kB min:3280kB low:4096kB high:4912kB active:5632kB inactive:1504kB present:1982464kB pages_scanned:0 all_unreclaimable? no
    lowmem_reserve[]: 0 0 0 0
    ....

    New format:

    Mem-info:
    Node 0 DMA per-cpu:
    CPU 0: Hot: hi: 42, btch: 7 usd: 41 Cold: hi: 14, btch: 3 usd: 2
    CPU 1: Hot: hi: 42, btch: 7 usd: 40 Cold: hi: 14, btch: 3 usd: 1
    CPU 2: Hot: hi: 42, btch: 7 usd: 0 Cold: hi: 14, btch: 3 usd: 0
    CPU 3: Hot: hi: 42, btch: 7 usd: 0 Cold: hi: 14, btch: 3 usd: 0
    CPU 4: Hot: hi: 42, btch: 7 usd: 0 Cold: hi: 14, btch: 3 usd: 0
    CPU 5: Hot: hi: 42, btch: 7 usd: 0 Cold: hi: 14, btch: 3 usd: 0
    CPU 6: Hot: hi: 42, btch: 7 usd: 0 Cold: hi: 14, btch: 3 usd: 0
    CPU 7: Hot: hi: 42, btch: 7 usd: 0 Cold: hi: 14, btch: 3 usd: 0
    Node 1 DMA per-cpu:
    [snip]
    Free pages: 5411088kB (0kB HighMem)
    Active:9558 inactive:4233 dirty:6 writeback:0 unstable:0 free:338193 slab:1942 mapped:1918 pagetables:208
    Node 0 DMA free:1677648kB min:3264kB low:4080kB high:4896kB active:129296kB inactive:58864kB present:1970880kB pages_scanned:0 all_unreclaimable? no
    lowmem_reserve[]: 0 0 0 0
    Node 1 DMA free:1948448kB min:3280kB low:4096kB high:4912kB active:6864kB inactive:3536kB present:1982464kB pages_scanned:0 all_unreclaimable? no
    lowmem_reserve[]: 0 0 0 0

    Signed-off-by: Jes Sorensen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jes Sorensen
     
  • kmalloc_node() falls back to ___cache_alloc() under certain conditions and
    at that point memory policies may be applied redirecting the allocation
    away from the current node. Therefore kmalloc_node(...,numa_node_id()) or
    kmalloc_node(...,-1) may not return memory from the local node.

    Fix this by doing the policy check in __cache_alloc() instead of
    ____cache_alloc().

    This version here is a cleanup of Kiran's patch.

    - Tested on ia64.
    - Extra material removed.
    - Consolidate the exit path if alternate_node_alloc() returned an object.

    [akpm@osdl.org: warning fix]
    Signed-off-by: Alok N Kataria
    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Shai Fultheim
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Clean up the invalidate code, and use a common function to safely remove
    the page from pagecache.

    Signed-off-by: Nick Piggin
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • This moves the definition of struct page from mm.h to its own header file
    page-struct.h. This is a prereq to fix SetPageUptodate which is broken on
    s390:

    #define SetPageUptodate(_page)                                      \
        do {                                                            \
            struct page *__page = (_page);                              \
            if (!test_and_set_bit(PG_uptodate, &__page->flags))         \
                page_test_and_clear_dirty(_page);                       \
        } while (0)

    _page gets used twice in this macro which can cause subtle bugs. Using
    __page for the page_test_and_clear_dirty call doesn't work since it causes
    yet another problem with the page_test_and_clear_dirty macro as well.

    In order to avoid all these problems caused by macros it seems to be a good
    idea to get rid of them and convert them to static inline functions.
    Because of header file include order it's necessary to have a separate
    header file for the struct page definition.

    Cc: Martin Schwidefsky
    Signed-off-by: Heiko Carstens
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     
  • The VM is supposed to minimise the number of pages which get written off the
    LRU (for IO scheduling efficiency, and for high reclaim-success rates). But
    we don't actually have a clear way of showing how true this is.

    So add `nr_vmscan_write' to /proc/vmstat and /proc/zoneinfo - the number of
    pages which have been written by the vm scanner in this zone and globally.

    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
    Arch-independent zone-sizing determines the size of a node
    (pgdat->node_spanned_pages) based on the physical memory that was
    registered by the architecture. However, when
    CONFIG_MEMORY_HOTPLUG_RESERVE is set, the architecture expects that the
    spanned_pages will be much larger and that a mem_map will be allocated
    that is used later on memory hot-add.

    This patch allows an architecture that sets CONFIG_MEMORY_HOTPLUG_RESERVE
    to call push_node_boundaries() which will set the node beginning and end to
    at *least* the requested boundary.

    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
    absent_pages_in_range() made the assumption that users of the API would not
    care about holes beyond the end of physical memory. This was not the
    case. This patch will account for ranges outside of physical memory as
    holes correctly.

    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
    The x86_64 code accounted for memmap and some portions of the DMA zone as
    holes. This was because those areas would never be reclaimed and accounting
    for them as memory affects min watermarks. This patch will account for the
    memmap as a memory hole. Architectures may optionally use set_dma_reserve()
    if they wish to account for a portion of memory in ZONE_DMA as a hole.

    Signed-off-by: Mel Gorman
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Size zones and holes in an architecture independent manner for ia64.

    [bob.picco@hp.com: fix ia64 FLATMEM+VIRTUAL_MEM_MAP]
    Signed-off-by: Mel Gorman
    Signed-off-by: Bob Picco
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Bob Picco
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Size zones and holes in an architecture independent manner for x86_64.

    Signed-off-by: Mel Gorman
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Size zones and holes in an architecture independent manner for x86.

    [akpm@osdl.org: build fix]
    Signed-off-by: Mel Gorman
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Size zones and holes in an architecture independent manner for Power.

    [judith@osdl.org: build fix]
    Signed-off-by: Mel Gorman
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • At a basic level, architectures define structures to record where active
    ranges of page frames are located. Once located, the code to calculate zone
    sizes and holes in each architecture is very similar. Some of this zone and
    hole sizing code is difficult to read for no good reason. This set of patches
    eliminates the similar-looking architecture-specific code.

    The patches introduce a mechanism where architectures register where the
    active ranges of page frames are with add_active_range(). When all areas have
    been discovered, free_area_init_nodes() is called to initialise the pgdat and
    zones. The zone sizes and holes are then calculated in an architecture
    independent manner.

    Patch 1 introduces the mechanism for registering and initialising PFN ranges
    Patch 2 changes ppc to use the mechanism - 139 arch-specific LOC removed
    Patch 3 changes x86 to use the mechanism - 136 arch-specific LOC removed
    Patch 4 changes x86_64 to use the mechanism - 74 arch-specific LOC removed
    Patch 5 changes ia64 to use the mechanism - 52 arch-specific LOC removed
    Patch 6 accounts for mem_map as a memory hole as the pages are not reclaimable.
    It adjusts the watermarks slightly

    Tony Luck has successfully tested for ia64 on Itanium with tiger_defconfig,
    gensparse_defconfig and defconfig. Bob Picco has also tested and debugged on
    IA64. Jack Steiner successfully boot tested on a mammoth SGI IA64-based
    machine. These were on patches against 2.6.17-rc1 and release 3 of these
    patches but there have been no ia64-changes since release 3.

    There are differences in the zone sizes for x86_64 as the arch-specific code
    for x86_64 accounts the kernel image and the starting mem_maps as memory holes
    but the architecture-independent code accounts the memory as present.

    The big benefit of this set of patches is a sizable reduction of
    architecture-specific code, some of which is very hairy. There should be a
    greater reduction when other architectures use the same mechanisms for zone
    and hole sizing but I lack the hardware to test on.

    Additional credit:
    Dave Hansen for the initial suggestion and comments on early patches
    Andy Whitcroft for reviewing early versions and catching numerous
    errors
    Tony Luck for testing and debugging on IA64
    Bob Picco for fixing bugs related to pfn registration, reviewing a
    number of patch revisions, providing a number of suggestions
    on future direction and testing heavily
    Jack Steiner and Robin Holt for testing on IA64 and clarifying
    issues related to memory holes
    Yasunori for testing on IA64
    Andi Kleen for reviewing and feeding back about x86_64
    Christian Kujau for providing valuable information related to ACPI
    problems on x86_64 and testing potential fixes

    This patch:

    Define the structure to represent an active range of page frames within a node
    in an architecture independent manner. Architectures are expected to register
    active ranges of PFNs using add_active_range(nid, start_pfn, end_pfn) and call
    free_area_init_nodes() passing the PFNs of the end of each zone.

    Signed-off-by: Mel Gorman
    Signed-off-by: Bob Picco
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • We need processor.h for cpu_relax().

    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
    un-, de-, -free, -destroy, -exit, etc. functions should in general return
    void. Also,

    There is very little, say, filesystem driver code can do upon failed
    kmem_cache_destroy(). If it will be decided to BUG in this case, BUG
    should be put in generic code, instead.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
    * Roughly half of callers already do it by not checking return value
    * Code in drivers/acpi/osl.c does the following to be sure:

    (void)kmem_cache_destroy(cache);

    * Those who check it printk something, however, slab_error already printed
    the name of failed cache.
    * XFS BUGs on failed kmem_cache_destroy, which is not a decision the
    low-level filesystem driver should make. Converted to ignore.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • * Removing useless casts
    * Removing useless wrapper
    * Conversion from kmalloc+memset to kzalloc

    Signed-off-by: Panagiotis Issaris
    Acked-by: Dave Kleikamp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Panagiotis Issaris
     
  • Conversions from kmalloc+memset to kzalloc.

    Signed-off-by: Panagiotis Issaris
    Jffs2-bit-acked-by: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Panagiotis Issaris
     
  • Some of the changes in balloc.c are just cosmetic, as Andreas pointed out -
    if they overflow they'll then underflow and things are fine.

    5th hunk actually fixes an overflow problem.

    Also check for potential overflows in inode & block counts when resizing.

    Signed-off-by: Eric Sandeen
    Cc: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     
  • Fixing up some endian-ness warnings in preparation to clone ext4 from ext3.

    Signed-off-by: Dave Kleikamp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Kleikamp
     
  • More white space cleanups in preparation of cloning ext4 from ext3.
    Removing spaces that precede a tab.

    Signed-off-by: Dave Kleikamp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Kleikamp
     
  • SWsoft Virtuozzo/OpenVZ Linux kernel team has discovered that ext3 error
    behavior was broken in linux kernels since 2.5.x versions by the following
    patch:

    2002/10/31 02:15:26-05:00 tytso@snap.thunk.org
    Default mount options from superblock for ext2/3 filesystems
    http://linux.bkbits.net:8080/linux-2.6/gnupatch@3dc0d88eKbV9ivV4ptRNM8fBuA3JBQ

    In case the ext3 file system is mounted with errors=continue
    (EXT3_ERRORS_CONTINUE), errors should be ignored when possible. However,
    at present, in case of any error the kernel aborts the journal and remounts
    the filesystem read-only. Such behavior was hit a number of times and noted
    to differ from that of 2.4.x kernels.

    This patch fixes this:
    - do nothing in case of EXT3_ERRORS_CONTINUE,
    - set EXT3_MOUNT_ABORT and call journal_abort() in all other cases
    - panic() should be called after ext3_commit_super() to save
    sb marked as EXT3_ERROR_FS

    Signed-off-by: Vasily Averin
    Acked-by: Kirill Korotaev
    Cc: Theodore Ts'o
    Cc: "Stephen C. Tweedie"
    Cc: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vasily Averin
     
  • Signed-off-by: Mingming Cao
    Acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mingming Cao
     
  • In the past there were a few kernel panics related to block reservation
    tree operations failure (insert/remove etc). It would be very useful to
    get the block allocation reservation map info when such error happens.

    Signed-off-by: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mingming Cao
     
  • These are a few places I've found in jbd that look like they may not be
    16T-safe, or consistent with the use of unsigned longs for block
    containers. Problems here would be somewhat hard to hit, would require
    journal blocks past the 8T boundary, which would not be terribly common.
    Still, should fix.

    (some of these have come from the ext4 work on jbd as well).

    I think there's one more possibility that the wrap() function may not be
    safe IF your last block in the journal butts right up against the 2^32 block
    boundary, but that seems like a VERY remote possibility, and I'm not
    worrying about it at this point.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     
  • This is primarily format string fixes, with changes to ialloc.c where large
    inode counts could overflow, and also pass around journal_inum as an
    unsigned long, just to be pedantic about it....

    Signed-off-by: Eric Sandeen
    Cc: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen