22 Apr, 2008

3 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/juhl/trivial: (24 commits)
    DOC: A couple corrections and clarifications in USB doc.
    Generate a slightly more informative error msg for bad HZ
    fix typo "is" -> "if" in Makefile
    ext*: spelling fix prefered -> preferred
    DOCUMENTATION: Use newer DEFINE_SPINLOCK macro in docs.
    KEYS: Fix the comment to match the file name in rxrpc-type.h.
    RAID: remove trailing space from printk line
    DMA engine: typo fixes
    Remove unused MAX_NODES_SHIFT
    MAINTAINERS: Clarify access to OCFS2 development mailing list.
    V4L: Storage class should be before const qualifier (sn9c102)
    V4L: Storage class should be before const qualifier
    sonypi: Storage class should be before const qualifier
    intel_menlow: Storage class should be before const qualifier
    DVB: Storage class should be before const qualifier
    arm: Storage class should be before const qualifier
    ALSA: Storage class should be before const qualifier
    acpi: Storage class should be before const qualifier
    firmware_sample_driver.c: fix coding style
    MAINTAINERS: Add ati_remote2 driver
    ...

    Fixed up trivial conflicts in firmware_sample_driver.c

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-2.6: (36 commits)
    SCSI: convert struct class_device to struct device
    DRM: remove unused dev_class
    IB: rename "dev" to "srp_dev" in srp_host structure
    IB: convert struct class_device to struct device
    memstick: convert struct class_device to struct device
    driver core: replace remaining __FUNCTION__ occurrences
    sysfs: refill attribute buffer when reading from offset 0
    PM: Remove destroy_suspended_device()
    Firmware: add iSCSI iBFT Support
    PM: Remove legacy PM (fix)
    Kobject: Replace list_for_each() with list_for_each_entry().
    SYSFS: Explicitly include required header file slab.h.
    Driver core: make device_is_registered() work for class devices
    PM: Convert wakeup flag accessors to inline functions
    PM: Make wakeup flags available whenever CONFIG_PM is set
    PM: Fix misuse of wakeup flag accessors in serial core
    Driver core: Call device_pm_add() after bus_add_device() in device_add()
    PM: Handle device registrations during suspend/resume
    block: send disk "change" event for rescan_partitions()
    sysdev: detect multiple driver registrations
    ...

    Fixed trivial conflict in include/linux/memory.h due to semaphore header
    file change (made irrelevant by the change to mutex).

    Linus Torvalds
     
  • These are small cleanups all over the tree.

    Trivial style and comment changes to
    fs/select.c, kernel/signal.c, kernel/stop_machine.c & mm/pdflush.c

    Signed-off-by: Pavel Machek
    Signed-off-by: Jesper Juhl

    Pavel Machek
     

20 Apr, 2008

4 commits

  • Signed-off-by: Daniel Walker
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Daniel Walker
     
  • * Use new node_to_cpumask_ptr. This creates a pointer to the
    cpumask for a given node. This definition is in mm patch:

    asm-generic-add-node_to_cpumask_ptr-macro.patch

    * Use new set_cpus_allowed_ptr function.

    Depends on:
    [mm-patch]: asm-generic-add-node_to_cpumask_ptr-macro.patch
    [sched-devel]: sched: add new set_cpus_allowed_ptr function
    [x86/latest]: x86: add cpus_scnprintf function

    Cc: Greg Kroah-Hartman
    Cc: Greg Banks
    Cc: H. Peter Anvin
    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
     
  • * Modify cpuset_cpus_allowed to return the currently allowed cpuset
    via a pointer argument instead of as the function return value.

    * Use new set_cpus_allowed_ptr function.

    * Cleanup CPU_MASK_ALL and NODE_MASK_ALL uses.

    Depends on:
    [sched-devel]: sched: add new set_cpus_allowed_ptr function

    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
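
    As a rough illustration of why returning the mask through a pointer
    helps, here is a minimal userspace sketch. `demo_cpumask`,
    `DEMO_NR_CPUS` and the functions below are hypothetical stand-ins, not
    the kernel's definitions. The assumption is only that the mask is a
    large bitmap: returning it by value copies the whole structure, while
    the pointer form fills storage the caller already owns.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical stand-in for cpumask_t built with a large NR_CPUS. */
#define DEMO_NR_CPUS 4096
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_LONGS ((DEMO_NR_CPUS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_cpumask {
	unsigned long bits[DEMO_LONGS];
};

/* Old style: the whole 512-byte mask is returned by value. */
struct demo_cpumask demo_allowed_by_value(void)
{
	struct demo_cpumask mask;

	memset(&mask, 0, sizeof(mask));
	mask.bits[0] = 0xfUL;	/* pretend CPUs 0-3 are allowed */
	return mask;
}

/* New style: the caller supplies the storage, nothing large is copied back. */
void demo_allowed_ptr(struct demo_cpumask *mask)
{
	memset(mask, 0, sizeof(*mask));
	mask->bits[0] = 0xfUL;	/* pretend CPUs 0-3 are allowed */
}

/* Test a single CPU bit through a pointer to the mask. */
int demo_cpu_isset(int cpu, const struct demo_cpumask *mask)
{
	return (mask->bits[cpu / DEMO_BITS_PER_LONG] >>
		(cpu % DEMO_BITS_PER_LONG)) & 1UL;
}
```

    Both variants produce the same mask; only the way it reaches the caller
    differs.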
     
  • * Replace usages of CPU_MASK_NONE, CPU_MASK_ALL, NODE_MASK_NONE,
    NODE_MASK_ALL to reduce stack requirements for large NR_CPUS
    and MAXNODES counts.

    * In some cases, the cpumask variable was initialized but then overwritten
    with another value. This is the case for changes like this:

    - cpumask_t oldmask = CPU_MASK_ALL;
    + cpumask_t oldmask;

    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
     

18 Apr, 2008

4 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-kgdb:
    kgdb: always use icache flush for sw breakpoints
    kgdb: fix SMP NMI kgdb_handle_exception exit race
    kgdb: documentation fixes
    kgdb: allow static kgdbts boot configuration
    kgdb: add documentation
    kgdb: Kconfig fix
    kgdb: add kgdb internal test suite
    kgdb: fix several kgdb regressions
    kgdb: kgdboc pl011 I/O module
    kgdb: fix optional arch functions and probe_kernel_*
    kgdb: add x86 HW breakpoints
    kgdb: print breakpoint removed on exception
    kgdb: clocksource watchdog
    kgdb: fix NMI hangs
    kgdb: fix kgdboc dynamic module configuration
    kgdb: document parameters
    x86: kgdb support
    consoles: polling support, kgdboc
    kgdb: core
    uaccess: add probe_kernel_write()

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
    slub: No need for per node slab counters if !SLUB_DEBUG
    slub: Move map/flag clearing to __free_slab
    slub: Fixes to per cpu stat output in sysfs
    slub: Deal with config variable dependencies
    slub: Reduce #ifdef ZONE_DMA by moving kmalloc_caches_dma near dma logic
    slub: Initialize per-cpu stats

    Linus Torvalds
     
  • Fix two regressions dealing with the kgdb core.

    1) kgdb_skipexception and kgdb_post_primary_code are optional
    functions that are only required on archs that need special exception
    fixups.

    2) The kernel address space scope must be set on any probe_kernel_*
    function or archs such as ARCH=arm will not allow access to the kernel
    memory space. As an example, access to the full kernel address space
    is required when you use the kernel debugger to inspect a system
    call.

    Signed-off-by: Jason Wessel
    Signed-off-by: Ingo Molnar

    Jason Wessel
     
  • add probe_kernel_read() and probe_kernel_write().

    Uninlined and restricted to kernel range memory only, as suggested
    by Linus.

    Signed-off-by: Ingo Molnar
    Reviewed-by: Thomas Gleixner

    Ingo Molnar
     

16 Apr, 2008

3 commits

    In a5d76b54a3f3a40385d7f76069a2feac9f1bad63 (memory unplug: page isolation by
    KAMEZAWA Hiroyuki), the "isolate" migratetype was added, but unfortunately
    the /proc/pagetypeinfo display logic was not updated to handle it.

    This patch adds "Isolate" to the pagetype name field.

    /proc/pagetypeinfo
    before:
    ------------------------------------------------------------------------------------------------------------------------
    Free pages count per migrate type at order    0   1   2   3   4   5   6   7   8   9  10
    Node    0, zone      DMA, type    Unmovable   1   2   2   2   1   2   2   1   1   0   0
    Node    0, zone      DMA, type  Reclaimable   0   0   0   0   0   0   0   0   0   0   0
    Node    0, zone      DMA, type      Movable   2   3   3   1   3   3   2   0   0   0   0
    Node    0, zone      DMA, type      Reserve   0   0   0   0   0   0   0   0   0   0   1
    Node    0, zone      DMA, type                0   0   0   0   0   0   0   0   0   0   0
    Node    0, zone   Normal, type    Unmovable   1   9   7   4   1   1   1   1   0   0   0
    Node    0, zone   Normal, type  Reclaimable   5   2   0   0   1   1   0   0   0   1   0
    Node    0, zone   Normal, type      Movable   0   1   1   0   0   0   1   0   0   1  60
    Node    0, zone   Normal, type      Reserve   0   0   0   0   0   0   0   0   0   0   1
    Node    0, zone   Normal, type                0   0   0   0   0   0   0   0   0   0   0
    Node    0, zone  HighMem, type    Unmovable   0   0   1   1   1   0   1   1   2   2   0
    Node    0, zone  HighMem, type  Reclaimable   0   0   0   0   0   0   0   0   0   0   0
    Node    0, zone  HighMem, type      Movable 236  62   6   2   2   1   1   0   1   1  16
    Node    0, zone  HighMem, type      Reserve   0   0   0   0   0   0   0   0   0   0   1
    Node    0, zone  HighMem, type                0   0   0   0   0   0   0   0   0   0   0

    Number of blocks type    Unmovable Reclaimable     Movable     Reserve
    Node 0, zone      DMA            1           0           2           1           0
    Node 0, zone   Normal           10          40         169           1           0
    Node 0, zone  HighMem            2           0         283           1           0

    after:
    ------------------------------------------------------------------------------------------------------------------------
    Free pages count per migrate type at order    0   1   2   3   4   5   6   7   8   9  10
    Node    0, zone      DMA, type    Unmovable   1   2   2   2   1   2   2   1   1   0   0
    Node    0, zone      DMA, type  Reclaimable   0   0   0   0   0   0   0   0   0   0   0
    Node    0, zone      DMA, type      Movable   2   3   3   1   3   3   2   0   0   0   0
    Node    0, zone      DMA, type      Reserve   0   0   0   0   0   0   0   0   0   0   1
    Node    0, zone      DMA, type      Isolate   0   0   0   0   0   0   0   0   0   0   0
    Node    0, zone   Normal, type    Unmovable   0   2   1   1   0   1   0   0   0   0   0
    Node    0, zone   Normal, type  Reclaimable   1   1   1   1   1   0   1   1   1   0   0
    Node    0, zone   Normal, type      Movable   0   1   1   1   0   1   0   1   0   0 196
    Node    0, zone   Normal, type      Reserve   0   0   0   0   0   0   0   0   0   0   1
    Node    0, zone   Normal, type      Isolate   0   0   0   0   0   0   0   0   0   0   0
    Node    0, zone  HighMem, type    Unmovable   0   1   0   0   0   1   1   1   2   2   0
    Node    0, zone  HighMem, type  Reclaimable   0   0   0   0   0   0   0   0   0   0   0
    Node    0, zone  HighMem, type      Movable   1   0   1   1   0   0   0   0   1   0 200
    Node    0, zone  HighMem, type      Reserve   0   0   0   0   0   0   0   0   0   0   1
    Node    0, zone  HighMem, type      Isolate   0   0   0   0   0   0   0   0   0   0   0

    Number of blocks type    Unmovable Reclaimable     Movable     Reserve     Isolate
    Node 0, zone      DMA            1           0           2           1           0
    Node 0, zone   Normal            8           4         207           1           0
    Node 0, zone  HighMem            2           0         283           1           0

    Signed-off-by: KOSAKI Motohiro
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • When I used a test program to fork many processes and immediately move them to
    a cgroup where the memory limit is low enough to trigger an oom kill, I got an oops:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000808
    IP: [] _spin_lock_irqsave+0x8/0x18
    PGD 4c95f067 PUD 4406c067 PMD 0
    Oops: 0002 [1] SMP
    CPU 2
    Modules linked in:

    Pid: 11973, comm: a.out Not tainted 2.6.25-rc7 #5
    RIP: 0010:[] [] _spin_lock_irqsave+0x8/0x18
    RSP: 0018:ffff8100448c7c30 EFLAGS: 00010002
    RAX: 0000000000000202 RBX: 0000000000000009 RCX: 000000000001c9f3
    RDX: 0000000000000100 RSI: 0000000000000001 RDI: 0000000000000808
    RBP: ffff81007e444080 R08: 0000000000000000 R09: ffff8100448c7900
    R10: ffff81000105f480 R11: 00000100ffffffff R12: ffff810067c84140
    R13: 0000000000000001 R14: ffff8100441d0018 R15: ffff81007da56200
    FS: 00007f70eb1856f0(0000) GS:ffff81007fbad3c0(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 0000000000000808 CR3: 000000004498a000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process a.out (pid: 11973, threadinfo ffff8100448c6000, task ffff81007da533e0)
    Stack: ffffffff8023ef5a 00000000000000d0 ffffffff80548dc0 00000000000000d0
    ffff810067c84140 ffff81007e444080 ffffffff8026cef9 00000000000000d0
    ffff8100441d0000 00000000000000d0 ffff8100441d0000 ffff8100505445c0
    Call Trace:
    [] ? force_sig_info+0x25/0xb9
    [] ? oom_kill_task+0x77/0xe2
    [] ? mem_cgroup_out_of_memory+0x55/0x67
    [] ? mem_cgroup_charge_common+0xec/0x202
    [] ? handle_mm_fault+0x24e/0x77f
    [] ? default_wake_function+0x0/0xe
    [] ? get_user_pages+0x2ce/0x3af
    [] ? mem_cgroup_charge_common+0x2d/0x202
    [] ? make_pages_present+0x8e/0xa4
    [] ? mmap_region+0x373/0x429
    [] ? do_mmap_pgoff+0x2ff/0x364
    [] ? sys_mmap+0xe5/0x111
    [] ? tracesys+0xdc/0xe1

    Code: 00 00 01 48 8b 3c 24 e9 46 d4 dd ff f0 ff 07 48 8b 3c 24 e9 3a d4 dd ff fe 07 48 8b 3c 24 e9 2f d4 dd ff 9c 58 fa ba 00 01 00 00 66 0f c1 17 38 f2 74 06 f3 90 8a 17 eb f6 c3 fa b8 00 01 00
    RIP [] _spin_lock_irqsave+0x8/0x18
    RSP
    CR2: 0000000000000808
    ---[ end trace c3702fa668021ea4 ]---

    It's reproducible on an x86_64 box, but doesn't happen on x86_32.

    This is because tsk->sighand is not guarded by RCU, so we have to
    hold tasklist_lock, just as out_of_memory() does.

    Signed-off-by: Li Zefan
    Cc: KAMEZAWA Hiroyuki
    Acked-by: Balbir Singh
    Cc: Pavel Emelianov
    Cc: Paul Menage
    Cc: Oleg Nesterov
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Fix memory corruption and crash on 32-bit x86 systems.

    If a !PAE x86 kernel is booted on a 32-bit system with more than 4GB of
    RAM, then we call memory_present() with a start/end that goes outside
    the scope of MAX_PHYSMEM_BITS.

    That causes this loop to happily walk over the limit of the sparse
    memory section map:

    for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
        unsigned long section = pfn_to_section_nr(pfn);
        struct mem_section *ms;

        sparse_index_init(section, nid);
        set_section_nid(section, nid);

        ms = __nr_to_section(section);
        if (!ms->section_mem_map)
            ms->section_mem_map = sparse_encode_early_nid(nid) |
                                  SECTION_MARKED_PRESENT;
    }

    'ms' will be out of bounds and we'll corrupt a small amount of memory by
    encoding the node ID and writing SECTION_MARKED_PRESENT (==0x1) over it.

    The corruption might happen when encoding a non-zero node ID, or due to
    the SECTION_MARKED_PRESENT which is 0x1:

    mmzone.h:#define SECTION_MARKED_PRESENT (1UL<<0)

    Tested-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Mel Gorman
    Cc: Nick Piggin
    Cc: Andrew Morton
    Cc: Rafael J. Wysocki
    Cc: Yinghai Lu
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Linus Torvalds

    Ingo Molnar
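
    The text of this entry is cut off before it describes the fix. One
    plausible guard, sketched here in userspace, is to clamp the range to
    the highest PFN the section map can describe before walking it. All
    names and sizes below (`demo_*`, `DEMO_*`) are hypothetical and this is
    not claimed to be the code that was merged.

```c
/* Hypothetical sizes; the kernel derives its limits from MAX_PHYSMEM_BITS. */
#define DEMO_PAGES_PER_SECTION 1024UL
#define DEMO_NR_SECTIONS 64UL
#define DEMO_MAX_PFN (DEMO_NR_SECTIONS * DEMO_PAGES_PER_SECTION)

static unsigned long demo_section_map[DEMO_NR_SECTIONS];

/* Clamp [start, end) so that no PFN maps past the section table. */
void demo_clamp_range(unsigned long *start, unsigned long *end)
{
	if (*start > DEMO_MAX_PFN)
		*start = DEMO_MAX_PFN;
	if (*end > DEMO_MAX_PFN)
		*end = DEMO_MAX_PFN;
}

/* Mark sections present; returns how many sections were newly marked. */
unsigned long demo_memory_present(unsigned long start, unsigned long end)
{
	unsigned long pfn, marked = 0;

	demo_clamp_range(&start, &end);
	start -= start % DEMO_PAGES_PER_SECTION;	/* align down to a section */

	for (pfn = start; pfn < end; pfn += DEMO_PAGES_PER_SECTION) {
		unsigned long section = pfn / DEMO_PAGES_PER_SECTION;

		if (!demo_section_map[section]) {
			demo_section_map[section] = 1UL;	/* "present" bit */
			marked++;
		}
	}
	return marked;
}
```

    With the clamp in place, a range that extends past the end of the
    table is trimmed and a range that lies entirely beyond it marks
    nothing.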
     

14 Apr, 2008

6 commits

  • The per node counters are used mainly for showing data through the sysfs API.
    If that API is not compiled in then there is no point in keeping track of this
    data. Disable counters for the number of slabs and the number of total slabs
    if !SLUB_DEBUG. Incrementing the per node counters is also accessing a
    potentially contended cacheline so this could actually be a performance
    benefit to embedded systems.

    SLABINFO support is also affected. It now must depend on SLUB_DEBUG (which
    is on by default).

    Patch also avoids a check for a NULL kmem_cache_node pointer in new_slab()
    if the system is not compiled with NUMA support.

    [penberg@cs.helsinki.fi: fix oops and move ->nr_slabs into CONFIG_SLUB_DEBUG]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
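
    The idea of compiling the counters out can be sketched as follows.
    This is a simplified userspace model: `demo_cache_node`,
    `DEMO_SLUB_DEBUG` and the helper names are hypothetical, chosen only to
    mirror the description above.

```c
/* Build with -DDEMO_SLUB_DEBUG to compile the counter in. */
struct demo_cache_node {
	unsigned long nr_partial;
#ifdef DEMO_SLUB_DEBUG
	unsigned long nr_slabs;	/* only kept when someone can read it */
#endif
};

/* Called when a slab is added to a node; a no-op without the debug option. */
void demo_inc_slabs_node(struct demo_cache_node *n)
{
#ifdef DEMO_SLUB_DEBUG
	n->nr_slabs++;
#else
	(void)n;
#endif
}

/* Reports the counter, or 0 when it has been compiled out. */
unsigned long demo_slabs_node(const struct demo_cache_node *n)
{
#ifdef DEMO_SLUB_DEBUG
	return n->nr_slabs;
#else
	(void)n;
	return 0;
#endif
}
```

    Without the option, the structure shrinks and the increment on the
    allocation path disappears entirely.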
     
  • __free_slab does some diagnostics. The resetting of mapcount etc
    in discard_slab() can interfere with debug processing. So move
    the reset immediately before the page is freed.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Only output per cpu stats if the kernel is built for SMP.

    Use a capital "C" as a leading character for the processor number
    (same as the numa statistics that also use a capital letter "N").

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • count_partial() is used by both slabinfo and the sysfs proc support. Move
    the function directly before the beginning of the sysfs code so that it can
    be easily found. Rework the preprocessor conditional to take into account
    that slub sysfs support depends on CONFIG_SYSFS *and* CONFIG_SLUB_DEBUG.

    Make CONFIG_SLUB_STATS depend on CONFIG_SLUB_DEBUG and CONFIG_SYSFS. There
    is no point in keeping statistics if no one can retrieve them.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • Move the definition of kmalloc_caches_dma() into a later #ifdef CONFIG_ZONE_DMA.
    This saves one #ifdef and leaves us with a total of two #ifdefs for dma slab support.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
     
  • As spotted by kmemcheck, we need to initialize the per-CPU ->stat array before
    using it.

    [kmem_cache_cpu structures are usually allocated from arrays defined via
    DEFINE_PER_CPU that are zeroed so we have not noticed this so far --cl].

    Reported-by: Vegard Nossum
    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Pekka Enberg
     

09 Apr, 2008

1 commit

  • This should be N_NORMAL_MEMORY.

    N_NORMAL_MEMORY is "true" if a node has memory for the kernel. N_HIGH_MEMORY
    is "true" if a node has memory for HIGHMEM. (If CONFIG_HIGHMEM=n, always
    "true")

    This check is used for testing whether we can use kmalloc_node() on a node.
    If there is a node which only contains HIGHMEM, the system will call
    kmalloc_node() on a node which doesn't contain memory for the kernel. If
    that happens under SLUB, the kernel will panic. I think this only happens
    on x86_32-numa.

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Balbir Singh
    Cc: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
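
    The difference between the two checks can be modelled in a few lines.
    The state bits and the two-node table below are hypothetical; they
    follow the wording of this entry rather than the kernel's actual node
    state implementation.

```c
/* Hypothetical per-node state bits, named after the flags in the text. */
enum demo_node_state {
	DEMO_N_NORMAL_MEMORY = 1 << 0,	/* node has memory for the kernel */
	DEMO_N_HIGH_MEMORY   = 1 << 1,	/* node has memory for HIGHMEM */
};

#define DEMO_NR_NODES 2

static const unsigned int demo_node_states[DEMO_NR_NODES] = {
	DEMO_N_NORMAL_MEMORY | DEMO_N_HIGH_MEMORY,	/* node 0 */
	DEMO_N_HIGH_MEMORY,				/* node 1: HIGHMEM only */
};

int demo_node_state(int nid, unsigned int state)
{
	return (demo_node_states[nid] & state) != 0;
}

/* Buggy check: accepts a HIGHMEM-only node for a kernel allocation. */
int demo_can_kmalloc_on_buggy(int nid)
{
	return demo_node_state(nid, DEMO_N_HIGH_MEMORY);
}

/* Fixed check: only nodes with kernel-usable memory qualify. */
int demo_can_kmalloc_on(int nid)
{
	return demo_node_state(nid, DEMO_N_NORMAL_MEMORY);
}
```

    Node 1 passes the buggy check but is correctly rejected by the fixed
    one.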
     

05 Apr, 2008

1 commit

  • A boot option for the memory controller was discussed on lkml. It is a good
    idea to add it, since it saves memory for people who want to turn off the
    memory controller.

    By default the option is on for the following two reasons:

    1. It provides compatibility with the current scheme where the memory
    controller turns on if the config option is enabled
    2. It allows for wider testing of the memory controller, once the config
    option is enabled

    We still allow the create, destroy callbacks to succeed, since they are not
    aware of boot options. We do not populate the directory with memory resource
    controller specific files.

    Signed-off-by: Balbir Singh
    Cc: Paul Menage
    Cc: Balbir Singh
    Cc: Pavel Emelyanov
    Cc: KAMEZAWA Hiroyuki
    Cc: Hugh Dickins
    Cc: Sudhir Kumar
    Cc: YAMAMOTO Takashi
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Balbir Singh
     

02 Apr, 2008

1 commit

  • Small typo in the patch recently merged to avoid the unused symbol
    message for count_partial(). Discussion thread with confirmation of fix at
    http://marc.info/?t=120696854400001&r=1&w=2

    Typo in the check if we need the count_partial function that was
    introduced by 53625b4204753b904addd40ca96d9ba802e6977d

    Signed-off-by: Christoph Lameter
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

31 Mar, 2008

1 commit


28 Mar, 2008

1 commit


27 Mar, 2008

4 commits

  • Running the counters testcase from libhugetlbfs results in the following on
    2.6.25-rc5 and 2.6.25-rc5-mm1:

    BUG: soft lockup - CPU#3 stuck for 61s! [counters:10531]
    NIP: c0000000000d1f3c LR: c0000000000d1f2c CTR: c0000000001b5088
    REGS: c000005db12cb360 TRAP: 0901 Not tainted (2.6.25-rc5-autokern1)
    MSR: 8000000000009032 CR: 48008448 XER: 20000000
    TASK = c000005dbf3d6000[10531] 'counters' THREAD: c000005db12c8000 CPU: 3
    GPR00: 0000000000000004 c000005db12cb5e0 c000000000879228 0000000000000004
    GPR04: 0000000000000010 0000000000000000 0000000000200200 0000000000100100
    GPR08: c0000000008aba10 000000000000ffff 0000000000000004 0000000000000000
    GPR12: 0000000028000442 c000000000770080
    NIP [c0000000000d1f3c] .return_unused_surplus_pages+0x84/0x18c
    LR [c0000000000d1f2c] .return_unused_surplus_pages+0x74/0x18c
    Call Trace:
    [c000005db12cb5e0] [c000005db12cb670] 0xc000005db12cb670 (unreliable)
    [c000005db12cb670] [c0000000000d24c4] .hugetlb_acct_memory+0x2e0/0x354
    [c000005db12cb740] [c0000000001b5048] .truncate_hugepages+0x1d4/0x214
    [c000005db12cb890] [c0000000001b50a4] .hugetlbfs_delete_inode+0x1c/0x3c
    [c000005db12cb920] [c000000000103fd8] .generic_delete_inode+0xf8/0x1c0
    [c000005db12cb9b0] [c0000000001b5100] .hugetlbfs_drop_inode+0x3c/0x24c
    [c000005db12cba50] [c00000000010287c] .iput+0xdc/0xf8
    [c000005db12cbad0] [c0000000000fee54] .dentry_iput+0x12c/0x194
    [c000005db12cbb60] [c0000000000ff050] .d_kill+0x6c/0xa4
    [c000005db12cbbf0] [c0000000000ffb74] .dput+0x18c/0x1b0
    [c000005db12cbc70] [c0000000000e9e98] .__fput+0x1a4/0x1e8
    [c000005db12cbd10] [c0000000000e61ec] .filp_close+0xb8/0xe0
    [c000005db12cbda0] [c0000000000e62d0] .sys_close+0xbc/0x134
    [c000005db12cbe30] [c00000000000872c] syscall_exit+0x0/0x40
    Instruction dump:
    ebbe8038 38800010 e8bf0002 3bbd0008 7fa3eb78 38a50001 7ca507b4 4818df25
    60000000 38800010 38a00000 7c601b78 2f800010 409d0008 38000010

    This was tracked down to a potential livelock in
    return_unused_surplus_pages(). In the case where we have surplus
    pages on some node, but no free pages on the same node, we may never
    break out of the loop. To avoid this livelock, terminate the search if
    we iterate a number of times equal to the number of online nodes without
    freeing a page.

    Thanks to Andy Whitcroft and Adam Litke for helping with debugging and
    the patch.

    Signed-off-by: Nishanth Aravamudan
    Signed-off-by: Linus Torvalds

    Nishanth Aravamudan
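
    The termination rule described above can be sketched as a round-robin
    loop with a miss counter. The per-node arrays and `demo_return_surplus`
    are hypothetical; this is a userspace model of the idea, not the
    hugetlb code itself.

```c
#define DEMO_NR_NODES 4

/* Hypothetical per-node counters. */
static unsigned int demo_surplus[DEMO_NR_NODES];
static unsigned int demo_free[DEMO_NR_NODES];

/*
 * Try to release up to 'want' surplus pages, visiting nodes round-robin.
 * Returns the number of pages actually released.
 */
unsigned int demo_return_surplus(unsigned int want)
{
	unsigned int released = 0;
	unsigned int misses = 0;	/* consecutive nodes with nothing to free */
	int nid = 0;

	while (released < want) {
		if (demo_surplus[nid] && demo_free[nid]) {
			demo_surplus[nid]--;
			demo_free[nid]--;
			released++;
			misses = 0;
		} else if (++misses >= DEMO_NR_NODES) {
			break;	/* a full pass freed nothing: give up */
		}
		nid = (nid + 1) % DEMO_NR_NODES;
	}
	return released;
}
```

    Without the miss counter, a node holding surplus pages but no free
    pages would keep the loop spinning forever once every other node has
    been drained.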
     
  • Currently we show the surplus hugetlb pool state in /proc/meminfo, but
    not in the per-node meminfo files, even though we track the information
    on a per-node basis. Printing it there can help track down dynamic pool
    bugs including the one in the follow-on patch.

    Signed-off-by: Nishanth Aravamudan
    Signed-off-by: Linus Torvalds

    Nishanth Aravamudan
     
  • Commit 556a169dab38b5100df6f4a45b655dddd3db94c1 ("slab: fix bootstrap on
    memoryless node") introduced bootstrap-time cache_cache list3s for all nodes
    but forgot that initkmem_list3 needs to be accessed by [somevalue + node]. This
    patch fixes list_add() corruption in mm/slab.c seen on the ES7000.

    Cc: Mel Gorman
    Cc: Olaf Hering
    Cc: Christoph Lameter
    Signed-off-by: Dan Yeisley
    Signed-off-by: Pekka Enberg
    Signed-off-by: Christoph Lameter

    Daniel Yeisley
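
    A small model of the "[somevalue + node]" indexing may help. It
    assumes a single flat bootstrap array with one region per cache kind
    and one slot per node inside each region. The `DEMO_*` constants and
    both functions are hypothetical, and the "buggy" variant reflects one
    reading of the description above, namely that the node offset was left
    out.

```c
#define DEMO_MAX_NUMNODES 4

/* One flat bootstrap array holds a per-node slot for each cache kind. */
#define DEMO_CACHE_CACHE    0
#define DEMO_SIZE_AC        DEMO_MAX_NUMNODES
#define DEMO_SIZE_L3        (2 * DEMO_MAX_NUMNODES)
#define DEMO_NUM_INIT_LISTS (3 * DEMO_MAX_NUMNODES)

/* Correct: slot for cache kind 'base' on node 'node'. */
int demo_list3_index(int base, int node)
{
	return base + node;
}

/* Bug pattern: the node offset is omitted, so all nodes share one slot. */
int demo_list3_index_buggy(int base, int node)
{
	(void)node;
	return base;
}
```

    When every node points at the same list structure, list heads are
    initialised and linked more than once, which is the kind of misuse
    that shows up as list_add() corruption.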
     
  • Avoid warnings about unused functions if neither SLUB_DEBUG nor CONFIG_SLABINFO
    is defined. This patch will be reversed when slab defrag is merged since slab
    defrag requires count_partial() to determine the fragmentation status of
    slab caches.

    Signed-off-by: Christoph Lameter

    Christoph Lameter
     

25 Mar, 2008

3 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    [PATCH] get stack footprint of pathname resolution back to relative sanity
    [PATCH] double iput() on failure exit in hugetlb
    [PATCH] double dput() on failure exit in tiny-shmem
    [PATCH] fix up new filp allocators
    [PATCH] check for null vfsmount in dentry_open()
    [PATCH] reiserfs: eliminate private use of struct file in xattr
    [PATCH] sanitize hppfs
    hppfs pass vfsmount to dentry_open()
    [PATCH] restore export of do_kern_mount()

    Linus Torvalds
     
  • Revert commit f1a9ee758de7de1e040de849fdef46e6802ea117:

    Author: Rik van Riel
    Date: Thu Feb 7 00:14:08 2008 -0800

    kswapd should only wait on IO if there is IO

    The current kswapd (and try_to_free_pages) code has an oddity where the
    code will wait on IO, even if there is no IO in flight. This problem is
    notable especially when the system scans through many unfreeable pages,
    causing unnecessary stalls in the VM.

    Additionally, tasks without __GFP_FS or __GFP_IO in the direct reclaim path
    will sleep if a significant number of pages are encountered that should be
    written out. This gives kswapd a chance to write out those pages, while
    the direct reclaim task sleeps.

    Signed-off-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Because of large latencies and interactivity problems reported by Carlos,
    here: http://lkml.org/lkml/2008/3/22/211

    Cc: Rik van Riel
    Cc: "Carlos R. Mafra"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • With NUMA enabled, some callers could have a range of memory on one node
    but try to free it on another node. This can cause some pages to be
    freed wrongly.

    For example: when we try to allocate 128g boot ram early for
    gart/swiotlb, and free that range later so gart/swiotlb can get some
    range afterwards.

    With this patch, we don't need to care which node holds the range, just
    loop to call free_bootmem_node for all online nodes.

    This patch makes free_bootmem_core() more robust by trimming the sidx
    and eidx according to the RAM range that the node has.

    It also makes free_bootmem_core() handle this out-of-range case. We could
    use bdata_list to make sure the range can be freed for sure. So next
    time, we don't need to loop online nodes and could use free_bootmem
    directly.

    Signed-off-by: Yinghai Lu
    Cc: Andi Kleen
    Cc: Yasunori Goto
    Cc: KAMEZAWA Hiroyuki
    Acked-by: Ingo Molnar
    Tested-by: Ingo Molnar
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yinghai Lu
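
    The trimming step can be illustrated with a small helper that
    intersects a requested range with the range a node owns. The structure
    and function names are hypothetical; `sidx` and `eidx` are used here as
    node-relative page indices, which is an assumption about their meaning
    based on the text above.

```c
/* Hypothetical description of the page-frame range owned by one node. */
struct demo_bootmem_node {
	unsigned long node_start_pfn;	/* first PFN of the node */
	unsigned long node_end_pfn;	/* one past the last PFN */
};

/*
 * Trim [start_pfn, end_pfn) to the part that lies on 'node'.
 * On success stores node-relative indices in *sidx and *eidx and
 * returns 1; returns 0 if the range does not touch the node at all.
 */
int demo_trim_range(const struct demo_bootmem_node *node,
		    unsigned long start_pfn, unsigned long end_pfn,
		    unsigned long *sidx, unsigned long *eidx)
{
	if (start_pfn < node->node_start_pfn)
		start_pfn = node->node_start_pfn;
	if (end_pfn > node->node_end_pfn)
		end_pfn = node->node_end_pfn;
	if (start_pfn >= end_pfn)
		return 0;

	*sidx = start_pfn - node->node_start_pfn;
	*eidx = end_pfn - node->node_start_pfn;
	return 1;
}
```

    A caller can then loop over all online nodes with the same range and
    let each node free only the part it actually holds.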
     

20 Mar, 2008

7 commits

  • Fix kernel-doc notation in mm/readahead.c.

    Change ":" to ";" so that it doesn't get treated as a doc section heading.
    Move the comment block ending "*/" to a line by itself so that the text on
    that last line is not lost (dropped).

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • The check t->pid == t->tgid is not the blessed way to check whether a task is a
    group leader.

    This is not only about code beauty, but also about pid namespace fixes
    - both the tgid and the pid fields on the task_struct are (slowly :( )
    becoming deprecated.

    Besides, the thread_group_leader() macro makes only one dereference :)

    Signed-off-by: Pavel Emelyanov
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
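
    The two ways of testing for a group leader can be compared on a
    stripped-down model. `demo_task` and both helpers are hypothetical;
    they only mirror the fields named in this entry.

```c
/* Hypothetical, stripped-down task structure. */
struct demo_task {
	int pid;
	int tgid;
	struct demo_task *group_leader;
};

/* Discouraged: compares two numeric IDs, so it loads two fields. */
int demo_is_leader_by_ids(const struct demo_task *t)
{
	return t->pid == t->tgid;
}

/* Preferred: one load, and no reliance on the numeric pid/tgid fields. */
int demo_thread_group_leader(const struct demo_task *t)
{
	return t->group_leader == t;
}
```

    Both checks agree on ordinary tasks; the pointer comparison simply
    avoids depending on fields that are being phased out.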
     
  • Correct kernel-doc function names and parameters in rmap.c.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Add kernel-doc comments to highmem.c.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Fix kernel-doc notation in oom_kill.c.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Convert tiny-shmem.c function comments to kernel-doc. Add parameters and
    convert/fix other kernel-doc in shmem.c.

    Signed-off-by: Randy Dunlap
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Fix various kernel-doc notation in mm/:

    filemap.c: add function short description; convert 2 to kernel-doc
    fremap.c: change parameter 'prot' to @prot
    pagewalk.c: change "-" in function parameters to ":"
    slab.c: fix short description of kmem_ptr_validate()
    swap.c: fix description & parameters of put_pages_list()
    swap_state.c: fix function parameters
    vmalloc.c: change "@returns" to "Returns:" since that is not a parameter

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
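
    For readers unfamiliar with the notation these fixes refer to, here is
    a small example of a kernel-doc style comment on a made-up function.
    `demo_count_set` is hypothetical; the comment layout follows the
    conventions mentioned in the entries above ("@name:" for parameters, a
    "Returns:" section instead of "@returns", and the closing "*/" on its
    own line).

```c
#include <stddef.h>

/**
 * demo_count_set - count non-zero entries in an array
 * @values: array to scan
 * @len: number of entries in @values
 *
 * Parameters are introduced with "@name:" (a colon, not a dash).
 * The closing marker sits on a line of its own so that no text is lost.
 *
 * Returns: the number of non-zero entries.
 */
size_t demo_count_set(const int *values, size_t len)
{
	size_t i, count = 0;

	for (i = 0; i < len; i++)
		if (values[i])
			count++;
	return count;
}
```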
     

19 Mar, 2008

1 commit