27 Sep, 2006

40 commits

  • GFP_THISNODE must be set to 0 in the non-NUMA case; otherwise we disable
    retry and warnings for failing allocations in the SMP and UP cases.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • The NUMA_BUILD constant is always available and will be set to 1 on
    NUMA_BUILDs. That way checks valid only under CONFIG_NUMA can easily be done
    without #ifdef CONFIG_NUMA

    For example:

    if (NUMA_BUILD && ) {
    ...
    }

    [akpm: not a thing we'd normally do, but CONFIG_NUMA is special: it is
    causing ifdef explosion in core kernel, so let's see if this is a comfortable
    way in which to control that]
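
    A minimal userspace sketch of the pattern, assuming NUMA_BUILD is derived
    from CONFIG_NUMA in one place. node_is_remote() and pick_policy() are
    hypothetical names used only for illustration. Because NUMA_BUILD is a
    compile-time constant, the compiler can discard the branch when it is 0,
    while the code inside is still parsed and type-checked in every
    configuration:

```c
#include <stdbool.h>

/* Assumption: CONFIG_NUMA stands in for the kernel's Kconfig symbol. */
#ifdef CONFIG_NUMA
#define NUMA_BUILD 1
#else
#define NUMA_BUILD 0
#endif

/* Hypothetical predicate that only makes sense on NUMA machines. */
static bool node_is_remote(int nid, int local_nid)
{
	return nid != local_nid;
}

/* Hypothetical caller: no #ifdef is needed around the NUMA-only check. */
static const char *pick_policy(int nid, int local_nid)
{
	if (NUMA_BUILD && node_is_remote(nid, local_nid))
		return "remote";
	return "local";
}
```

    Built without CONFIG_NUMA defined, pick_policy() always takes the "local"
    path, whatever node ids are passed in.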

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • This moves the definition of struct page from mm.h to its own header file
    page-struct.h. This is a prereq to fix SetPageUptodate which is broken on
    s390:

    #define SetPageUptodate(_page)                                      \
        do {                                                            \
            struct page *__page = (_page);                              \
            if (!test_and_set_bit(PG_uptodate, &__page->flags))         \
                page_test_and_clear_dirty(_page);                       \
        } while (0)

    _page gets used twice in this macro which can cause subtle bugs. Using
    __page for the page_test_and_clear_dirty call doesn't work since it causes
    yet another problem with the page_test_and_clear_dirty macro as well.

    In order to avoid all these problems caused by macros it seems to be a good
    idea to get rid of them and convert them to static inline functions.
    Because of header file include order it's necessary to have a separate
    header file for the struct page definition.
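
    To illustrate why a function avoids the double-evaluation problem, here
    is a simplified userspace sketch. struct page_sketch, PG_UPTODATE_BIT and
    the helper names are all hypothetical stand-ins, and the flag update is
    deliberately not atomic, unlike the kernel's test_and_set_bit():

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel's layout. */
struct page_sketch {
	unsigned long flags;
	bool hw_dirty;
};

#define PG_UPTODATE_BIT 3UL /* illustrative bit number */

/* Non-atomic model of test_and_set_bit(): returns the previous bit value. */
static inline bool test_and_set_flag(unsigned long *flags, unsigned long bit)
{
	unsigned long mask = 1UL << bit;
	bool was_set = (*flags & mask) != 0;

	*flags |= mask;
	return was_set;
}

/* Stand-in for page_test_and_clear_dirty(). */
static inline void clear_hw_dirty(struct page_sketch *page)
{
	page->hw_dirty = false;
}

/*
 * Function form: the argument expression is evaluated exactly once at the
 * call site, so set_page_uptodate_sketch(p++) advances p only once.
 */
static inline void set_page_uptodate_sketch(struct page_sketch *page)
{
	if (!test_and_set_flag(&page->flags, PG_UPTODATE_BIT))
		clear_hw_dirty(page);
}
```

    With the macro form quoted above, an argument such as p++ would be
    expanded twice, so the dirty bit could be cleared on a different page
    than the one that was marked up to date.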

    Cc: Martin Schwidefsky
    Signed-off-by: Heiko Carstens
    Cc: Roman Zippel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     
  • The VM is supposed to minimise the number of pages which get written off the
    LRU (for IO scheduling efficiency, and for high reclaim-success rates). But
    we don't actually have a clear way of showing how true this is.

    So add `nr_vmscan_write' to /proc/vmstat and /proc/zoneinfo - the number of
    pages which have been written by the vm scanner in this zone and globally.
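
    A small sketch of how the new counter could be read from userspace,
    assuming the usual "name value" line layout of /proc/vmstat.
    read_counter() is a hypothetical helper, not part of this patch:

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan "name value" pairs from a stream and fetch the value for `name`.
 * Returns 0 and stores the value on success, -1 if the name is not found.
 */
static int read_counter(FILE *stream, const char *name,
			unsigned long long *value)
{
	char key[64];
	unsigned long long v;

	while (fscanf(stream, "%63s %llu", key, &v) == 2) {
		if (strcmp(key, name) == 0) {
			*value = v;
			return 0;
		}
	}
	return -1;
}
```

    A caller would open /proc/vmstat with fopen() and pass "nr_vmscan_write"
    as the name; sampling it twice gives the number of pages written by the
    scanner in between.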

    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Arch-independent zone-sizing determines the size of a node
    (pgdat->node_spanned_pages) based on the physical memory that was
    registered by the architecture. However, when
    CONFIG_MEMORY_HOTPLUG_RESERVE is set, the architecture expects that the
    spanned_pages will be much larger and that a mem_map will be allocated for
    later use on memory hot-add.

    This patch allows an architecture that sets CONFIG_MEMORY_HOTPLUG_RESERVE
    to call push_node_boundaries() which will set the node beginning and end to
    at *least* the requested boundary.
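
    The "at least" semantics can be modelled as a span that only ever grows.
    This is a sketch under that assumption, not the kernel implementation;
    struct node_span and push_boundaries_sketch() are hypothetical names:

```c
/* Hypothetical model of a node's PFN span: [start_pfn, end_pfn). */
struct node_span {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Widen the span so that it covers at least [start_pfn, end_pfn).
 * A request that lies inside the current span changes nothing.
 */
static void push_boundaries_sketch(struct node_span *span,
				   unsigned long start_pfn,
				   unsigned long end_pfn)
{
	if (start_pfn < span->start_pfn)
		span->start_pfn = start_pfn;
	if (end_pfn > span->end_pfn)
		span->end_pfn = end_pfn;
}
```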

    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • The x86_64 code accounted for memmap and some portions of the DMA zone as
    holes. This was because those areas would never be reclaimed and accounting
    for them as memory affects min watermarks. This patch will account for the
    memmap as a memory hole. Architectures may optionally use set_dma_reserve()
    if they wish to account for a portion of memory in ZONE_DMA as a hole.
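
    As a rough worked example of the accounting, the sketch below subtracts
    the pages occupied by the memmap (and an optional DMA reserve) from a
    zone's present pages. The helper names, the round-up, and the sizes in
    the usage note are assumptions made for illustration only:

```c
/* Pages needed to hold one page descriptor per spanned page (rounded up). */
static unsigned long memmap_pages(unsigned long spanned_pages,
				  unsigned long descriptor_size,
				  unsigned long page_size)
{
	return (spanned_pages * descriptor_size + page_size - 1) / page_size;
}

/* Pages left for watermark calculations once the holes are removed. */
static unsigned long reclaimable_pages(unsigned long present_pages,
				       unsigned long memmap,
				       unsigned long dma_reserve)
{
	unsigned long overhead = memmap + dma_reserve;

	return present_pages > overhead ? present_pages - overhead : 0;
}
```

    For instance, with 262144 spanned pages, a hypothetical 56-byte
    descriptor and 4096-byte pages, the memmap takes 3584 pages, leaving
    258560 pages to base the watermarks on.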

    Signed-off-by: Mel Gorman
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Size zones and holes in an architecture independent manner for ia64.

    [bob.picco@hp.com: fix ia64 FLATMEM+VIRTUAL_MEM_MAP]
    Signed-off-by: Mel Gorman
    Signed-off-by: Bob Picco
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Bob Picco
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Size zones and holes in an architecture independent manner for x86_64.

    Signed-off-by: Mel Gorman
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • At a basic level, architectures define structures to record where active
    ranges of page frames are located. Once located, the code to calculate zone
    sizes and holes in each architecture is very similar. Some of this zone and
    hole sizing code is difficult to read for no good reason. This set of patches
    eliminates the similar-looking architecture-specific code.

    The patches introduce a mechanism where architectures register where the
    active ranges of page frames are with add_active_range(). When all areas have
    been discovered, free_area_init_nodes() is called to initialise the pgdat and
    zones. The zone sizes and holes are then calculated in an architecture
    independent manner.

    Patch 1 introduces the mechanism for registering and initialising PFN ranges
    Patch 2 changes ppc to use the mechanism - 139 arch-specific LOC removed
    Patch 3 changes x86 to use the mechanism - 136 arch-specific LOC removed
    Patch 4 changes x86_64 to use the mechanism - 74 arch-specific LOC removed
    Patch 5 changes ia64 to use the mechanism - 52 arch-specific LOC removed
    Patch 6 accounts for mem_map as a memory hole as the pages are not reclaimable.
    It adjusts the watermarks slightly

    Tony Luck has successfully tested for ia64 on Itanium with tiger_defconfig,
    gensparse_defconfig and defconfig. Bob Picco has also tested and debugged on
    IA64. Jack Steiner successfully boot tested on a mammoth SGI IA64-based
    machine. These were on patches against 2.6.17-rc1 and release 3 of these
    patches but there have been no ia64-changes since release 3.

    There are differences in the zone sizes for x86_64 as the arch-specific code
    for x86_64 accounts the kernel image and the starting mem_maps as memory holes
    but the architecture-independent code accounts the memory as present.

    The big benefit of this set of patches is a sizable reduction of
    architecture-specific code, some of which is very hairy. There should be a
    greater reduction when other architectures use the same mechanisms for zone
    and hole sizing but I lack the hardware to test on.

    Additional credit:
    Dave Hansen for the initial suggestion and comments on early patches
    Andy Whitcroft for reviewing early versions and catching numerous
    errors
    Tony Luck for testing and debugging on IA64
    Bob Picco for fixing bugs related to pfn registration, reviewing a
    number of patch revisions, providing a number of suggestions
    on future direction and testing heavily
    Jack Steiner and Robin Holt for testing on IA64 and clarifying
    issues related to memory holes
    Yasunori for testing on IA64
    Andi Kleen for reviewing and feeding back about x86_64
    Christian Kujau for providing valuable information related to ACPI
    problems on x86_64 and testing potential fixes

    This patch:

    Define the structure to represent an active range of page frames within a node
    in an architecture independent manner. Architectures are expected to register
    active ranges of PFNs using add_active_range(nid, start_pfn, end_pfn) and call
    free_area_init_nodes() passing the PFNs of the end of each zone.
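
    The idea can be sketched in userspace as follows: ranges are registered
    first, and zone sizes and holes are then derived by intersecting them
    with each zone's PFN window. This is a simplified model that assumes
    non-overlapping ranges; add_range_sketch(), present_pages_in() and
    hole_pages_in() are hypothetical names, not the kernel functions:

```c
#include <stddef.h>

#define MAX_RANGES 16

/* One registered range of usable page frames: [start_pfn, end_pfn). */
struct active_range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

static struct active_range ranges[MAX_RANGES];
static size_t nr_ranges;

/* Register a range for a node; returns 0 on success, -1 on bad input. */
static int add_range_sketch(int nid, unsigned long start_pfn,
			    unsigned long end_pfn)
{
	if (nr_ranges == MAX_RANGES || start_pfn >= end_pfn)
		return -1;
	ranges[nr_ranges].nid = nid;
	ranges[nr_ranges].start_pfn = start_pfn;
	ranges[nr_ranges].end_pfn = end_pfn;
	nr_ranges++;
	return 0;
}

/* Pages of node `nid` that are actually present inside a zone window. */
static unsigned long present_pages_in(int nid, unsigned long zone_start,
				      unsigned long zone_end)
{
	unsigned long present = 0;

	for (size_t i = 0; i < nr_ranges; i++) {
		unsigned long lo, hi;

		if (ranges[i].nid != nid)
			continue;
		lo = ranges[i].start_pfn > zone_start ?
			ranges[i].start_pfn : zone_start;
		hi = ranges[i].end_pfn < zone_end ?
			ranges[i].end_pfn : zone_end;
		if (lo < hi)
			present += hi - lo;
	}
	return present;
}

/* Holes are whatever the zone window spans but no range covers. */
static unsigned long hole_pages_in(int nid, unsigned long zone_start,
				   unsigned long zone_end)
{
	return (zone_end - zone_start) -
		present_pages_in(nid, zone_start, zone_end);
}
```

    With ranges [0, 100) and [200, 300) registered on node 0, a zone window
    of [0, 256) contains 156 present pages and 100 pages of holes, computed
    without any architecture-specific code.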

    Signed-off-by: Mel Gorman
    Signed-off-by: Bob Picco
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • We need processor.h for cpu_relax().

    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • un-, de-, -free, -destroy, -exit, etc. functions should in general return
    void.

    Also, there is very little that, say, filesystem driver code can do upon a
    failed kmem_cache_destroy(). If it is decided to BUG in this case, the BUG
    should be put in generic code instead.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Fixing up some endian-ness warnings in preparation to clone ext4 from ext3.

    Signed-off-by: Dave Kleikamp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Kleikamp
     
  • More white space cleanups in preparation of cloning ext4 from ext3.
    Removing spaces that precede a tab.

    Signed-off-by: Dave Kleikamp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Kleikamp
     
  • These are a few places I've found in jbd that look like they may not be
    16T-safe, or consistent with the use of unsigned longs for block
    containers. Problems here would be somewhat hard to hit, would require
    journal blocks past the 8T boundary, which would not be terribly common.
    Still, should fix.

    (some of these have come from the ext4 work on jbd as well).

    I think there's one more possibility that the wrap() function may not be
    safe IF your last block in the journal butts right up against the 2^32 block
    boundary, but that seems like a VERY remote possibility, and I'm not
    worrying about it at this point.
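
    The 8T and 16T figures follow from the block number width. The sketch
    below works through the arithmetic, assuming 4 KiB blocks (an assumption;
    the message does not state the block size). Both helper names are
    hypothetical:

```c
#include <stdint.h>

/* Byte offset at which a given block starts. */
static uint64_t byte_offset_of_block(uint64_t block, uint32_t block_size)
{
	return block * block_size;
}

/* True if the block number survives storage in a signed 32-bit int. */
static int fits_in_signed_32(uint64_t block)
{
	return block <= INT32_MAX;
}
```

    With 4 KiB blocks, block 2^31 starts at 8 TiB, which is where a signed
    32-bit block number stops being usable, and block 2^32 starts at 16 TiB,
    the limit of an unsigned 32-bit block number.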

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     
  • Remove whitespace from ext3 and jbd, before we clone ext4.

    Signed-off-by: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mingming Cao
     
  • Fix build error introduced by 3212fe1594e577463bc8601d28aa008f520c3377

    Non-NUMA case should be handled.

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/i2c-2.6: (30 commits)
    i2c: Drop unimplemented slave functions
    i2c: Constify i2c_algorithm declarations, part 2
    i2c: Constify i2c_algorithm declarations, part 1
    i2c: Let drivers constify i2c_algorithm data
    i2c-isa: Restore driver owner
    i2c-viapro: Add support for the VT8237A and VT8251
    i2c: Warn on i2c client creation failure
    i2c-core: Drop useless bitmaskings
    i2c-algo-pcf: Discard the mdelay data struct member
    i2c-algo-bit: Cleanups
    i2c-isa: Fail adding driver on attach_adapter error
    i2c: __must_check fixes (chip drivers)
    i2c-dev: attach/detach_adapter cleanups
    i2c-stub: Chip address as a module parameter
    i2c: Plan i2c-isa for removal
    i2c: New bus driver for TI OMAP boards
    i2c-algo-bit: Discard the mdelay data struct member
    i2c-matroxfb: Struct init conversion
    i2c: Fix copy-n-paste in subsystem Kconfig
    i2c-au1550: Add I2C support for Au1200
    ...

    Linus Torvalds
     
  • * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (28 commits)
    pciehp - fix wrong return value
    IA64: PCI: dont disable irq which is not enabled
    acpiphp: add support for ioapic hot-remove
    PCI: assign ioapic resource at hotplug
    acpiphp: disable bridges
    acpiphp: stop bus device before acpi_bus_trim
    PCI: add pci_stop_bus_device
    acpiphp: do not initialize existing ioapics
    acpiphp: initialize ioapics before starting devices
    acpiphp: set hpp values before starting devices
    PCI Hotplug: cleanup pcihp skeleton code.
    PCI: Restore PCI Express capability registers after PM event
    PCI: drivers/pci/hotplug/acpiphp_glue.c: make a function static
    PCI: Multiprobe sanitizer
    PCI: fix __must_check warnings
    PCI Hotplug: fix __must_check warnings
    SHPCHP: fix __must_check warnings
    PCI-Express AER implemetation: pcie_portdrv error handler
    PCI-Express AER implemetation: AER core and aerdriver
    PCI-Express AER implemetation: export pcie_port_bus_type
    ...

    Linus Torvalds
     
  • Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • MIPS is the only port to call its fstatat()-related syscalls
    "__NR_fstatat". Now I can see why that might be seen as every
    other port being wrong, but I think for o32, it is at best confusing.
    __NR_fstat provides a plain (32-bit) stat while __NR_fstatat provides a
    64-bit stat. Changing the name to __NR_fstatat64 would make things more
    explicit, match x86, and make the glibc port slightly easier.

    The current name is more appropriate for n32 and n64, but it would be
    appropriate for other 64-bit targets too, and those targets have chosen
    to call it __NR_newfstatat instead. Using the same name for MIPS would
    again be more consistent and make the glibc port slightly easier.

    I'm not wedded to this idea if the current names are preferred,
    but FWIW...

    Signed-off-by: Richard Sandiford
    Signed-off-by: Ralf Baechle

    Richard Sandiford
     
  • Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • Mostly based on patch by Chris Dearman and cleanups from Yoichi.

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • The code in pgtable-64.h assumes TASK_SIZE is always bigger than a first
    level PGDIR_SIZE. This is not the case for 64K pages, where task size is
    40 bits (1TB) and a pgd entry can map 42 bits. This leads to
    USER_PTRS_PER_PGD being zero for 64K pages.
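
    The arithmetic behind the zero can be reproduced with a short sketch. It
    assumes USER_PTRS_PER_PGD is computed as TASK_SIZE / PGDIR_SIZE, as the
    description implies; the function names are hypothetical, and the
    clamped variant is only one possible remedy, not necessarily the one
    this patch uses:

```c
#include <stdint.h>

/* Integer division of 2^task_bits by 2^pgdir_shift, as a naive macro would. */
static uint64_t user_ptrs_naive(unsigned int task_bits,
				unsigned int pgdir_shift)
{
	return (1ULL << task_bits) / (1ULL << pgdir_shift);
}

/* One possible guard: never report fewer than one user pgd entry. */
static uint64_t user_ptrs_clamped(unsigned int task_bits,
				  unsigned int pgdir_shift)
{
	uint64_t n = user_ptrs_naive(task_bits, pgdir_shift);

	return n ? n : 1;
}
```

    For the 64K page case, user_ptrs_naive(40, 42) divides 2^40 by 2^42 and
    truncates to 0, while the clamped form yields 1.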

    Signed-off-by: Peter Watkins
    Signed-off-by: Ralf Baechle

    Peter Watkins
     
  • excite_fpga.h, like all platform headers, really belongs in the
    platform header directory.

    Signed-off-by: Thomas Koeller
    Signed-off-by: Ralf Baechle

    thomas@koeller.dyndns.org
     
  • Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • Signed-off-by: Atsushi Nemoto
    Signed-off-by: Ralf Baechle

    Atsushi Nemoto
     
  • Signed-off-by: Yoichi Yuasa
    Signed-off-by: Ralf Baechle

    Yoichi Yuasa
     
  • Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • Signed-off-by: Yoichi Yuasa
    Signed-off-by: Ralf Baechle

    Yoichi Yuasa
     
  • The following change updates the Atlas interrupt handling to match that
    of Malta. Tested with a 5Kc and a 34Kf successfully.

    Signed-off-by: Maciej W. Rozycki
    Signed-off-by: Ralf Baechle

    Maciej W. Rozycki
     
  • Atlas maps its RTC chip in the host mmio space rather than using the
    "traditional" location in the PCI/ISA port space. A change to the generic
    RTC header now requires ARCH_RTC_LOCATION to be defined.

    Signed-off-by: Maciej W. Rozycki
    Signed-off-by: Ralf Baechle

    Maciej W. Rozycki
     
  • Signed-off-by: Atsushi Nemoto
    Signed-off-by: Ralf Baechle

    Atsushi Nemoto
     
  • * export asm/sgidefs.h
    * include asm/isadep.h only if in kernel
    * do not export contents of asm/timex.h and asm/user.h

    Signed-off-by: Atsushi Nemoto
    Signed-off-by: Ralf Baechle

    Atsushi Nemoto
     
  • Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • generic__raw_read_trylock() is a defective generic function that actually
    does a __raw_read_lock ...
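
    For contrast, a trylock must report failure instead of waiting. The
    following is a userspace sketch using C11 atomics, with a hypothetical
    lock layout (negative state means a writer holds the lock, otherwise the
    state counts readers); it is not the MIPS implementation:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: < 0 means write-locked, >= 0 is the reader count. */
struct rw_sketch {
	atomic_int state;
};

/*
 * Try to take the lock for reading. Returns true on success and false as
 * soon as a writer is seen; it never waits for the writer to leave.
 */
static bool read_trylock_sketch(struct rw_sketch *lock)
{
	int cur = atomic_load(&lock->state);

	while (cur >= 0) {
		/* On failure, cur is reloaded and the writer check is redone. */
		if (atomic_compare_exchange_weak(&lock->state, &cur, cur + 1))
			return true;
	}
	return false;
}
```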

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • On the 34K the redundant cache operations were causing excessive stalls
    resulting in realtime code running on the second VPE missing its deadline.
    For all other platforms this patch is just a significant performance
    improvement, as illustrated by the benchmark numbers below.

    Processor, Processes - times in microseconds - smaller is better
    -----------------------------------------------------------------------------
    Host     OS             Mhz null null      open slct  sig  sig fork exec   sh
                                call  I/O stat clos  TCP inst hndl proc proc proc
    -------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
    25Kf     2.6.18-rc4     533 0.49 1.16 7.57 33.4 30.5 1.34 12.4 5497 17.K 54.K
    25Kf     2.6.18-rc4-p   533 0.49 1.16 6.68 23.0 30.7 1.36 8.55 5030 16.K 48.K
    4Kc      2.6.18-rc4      80 4.21 15.0 131. 289. 261. 16.5 258. 18.K 70.K 227K
    4Kc      2.6.18-rc4-p    80 4.34 13.1 128. 285. 262. 18.2 258. 12.K 52.K 176K
    34Kc     2.6.18-rc4      40 5.01 14.0 61.6 90.0 477. 17.9 94.7 29.K 108K 342K
    34Kc     2.6.18-rc4-p    40 4.98 13.9 61.2 89.7 475. 17.6 93.7 8758 44.K 158K
    BCM1480  2.6.18-rc4     700 0.28 0.60 3.68 5.92 16.0 0.78 5.08 931. 3163 15.K
    BCM1480  2.6.18-rc4-p   700 0.28 0.61 3.65 5.85 16.0 0.79 5.20 395. 1464 8385
    TX49-16K 2.6.18-rc3     197 0.73 2.41 19.0 37.8 82.9 2.94 17.5 4438 14.K 56.K
    TX49-16K 2.6.18-rc3-p   197 0.73 2.40 19.9 36.3 82.9 2.94 23.4 2577 9103 38.K
    TX49-32K 2.6.18-rc3     396 0.36 1.19 6.80 11.8 41.0 1.46 8.17 2738 8465 32.K
    TX49-32K 2.6.18-rc3-p   396 0.36 1.19 6.82 10.2 41.0 1.46 8.18 1330 4638 18.K

    Original patch by me with enhancements by Atsushi Nemoto.

    Signed-off-by: Ralf Baechle
    Signed-off-by: Atsushi Nemoto

    Ralf Baechle
     
  • CONFIG_IRQ_PER_CPU now controls the IRQ_PER_CPU stuff.

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • This patch adds pci_stop_bus_device() which stops a PCI device (detach
    the driver, remove from the global list and so on) and any children.
    This is needed for ACPI based PCI-to-PCI bridge hot-remove, and it will
    also be needed for ACPI based PCI root bridge hot-remove.
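
    The "device and any children" part amounts to a walk over the device
    tree. The sketch below models that with a hypothetical struct dev_sketch;
    stopping children before their parent is an assumption of this sketch,
    and none of the names are taken from the PCI core:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device node: first child plus next sibling links. */
struct dev_sketch {
	bool bound; /* true while a driver is attached */
	struct dev_sketch *child;
	struct dev_sketch *sibling;
};

/*
 * Stop every device below `dev`, then `dev` itself.
 * Returns how many devices were actually detached.
 */
static unsigned int stop_device_tree(struct dev_sketch *dev)
{
	unsigned int stopped = 0;

	for (struct dev_sketch *c = dev->child; c; c = c->sibling)
		stopped += stop_device_tree(c);

	if (dev->bound) {
		dev->bound = false; /* models detaching the driver */
		stopped++;
	}
	return stopped;
}
```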

    Signed-off-by: Kenji Kaneshige
    Signed-off-by: MUNEDA Takahiro
    Signed-off-by: Satoru Takeuchi
    Signed-off-by: Kristen Carlson Accardi
    Signed-off-by: Greg Kroah-Hartman

    Satoru Takeuchi
     
  • There are numerous drivers that can use multithreaded probing, but having
    some kind of global flag as the way to control this makes migration to
    threaded probing hard, since it enables it everywhere, and is almost as
    likely to cause serious pain as holding a clog dance in a minefield.

    If we have a pci_driver multithread_probe flag to inherit you can turn
    it on for one driver at a time.

    From playing so far however I think we need a different model at the
    device layer which serializes until the called probe function says "ok
    you can start another one now". That would need some kind of flag and
    semaphore plus a helper function.

    Anyway, in the absence of that, this is a starting point to usefully play
    with this stuff.

    Signed-off-by: Alan Cox
    Signed-off-by: Greg Kroah-Hartman

    Alan Cox