20 Jul, 2008

1 commit


16 Jul, 2008

6 commits


15 Jul, 2008

7 commits

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
    ext4: Documention update for new ordered mode and delayed allocation
    ext4: do not set extents feature from the kernel
    ext4: Don't allow nonextenst mount option for large filesystem
    ext4: Enable delalloc by default.
    ext4: delayed allocation i_blocks fix for stat
    ext4: fix delalloc i_disksize early update issue
    ext4: Handle page without buffers in ext4_*_writepage()
    ext4: Add ordered mode support for delalloc
    ext4: Invert lock ordering of page_lock and transaction start in delalloc
    mm: Add range_cont mode for writeback
    ext4: delayed allocation ENOSPC handling
    percpu_counter: new function percpu_counter_sum_and_set
    ext4: Add delayed allocation support in data=writeback mode
    vfs: add hooks for ext4's delayed allocation support
    jbd2: Remove data=ordered mode support using jbd buffer heads
    ext4: Use new framework for data=ordered mode in JBD2
    jbd2: Implement data=ordered mode handling via inodes
    vfs: export filemap_fdatawrite_range()
    ext4: Fix lock inversion in ext4_ext_truncate()
    ext4: Invert the locking order of page_lock and transaction start
    ...

    Linus Torvalds
     
  • Manual fixup of:

    arch/powerpc/Kconfig

    Benjamin Herrenschmidt
     
  • * 'tracing/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (228 commits)
    ftrace: build fix for ftraced_suspend
    ftrace: separate out the function enabled variable
    ftrace: add ftrace_kill_atomic
    ftrace: use current CPU for function startup
    ftrace: start wakeup tracing after setting function tracer
    ftrace: check proper config for preempt type
    ftrace: trace schedule
    ftrace: define function trace nop
    ftrace: move sched_switch enable after markers
    ftrace: prevent ftrace modifications while being kprobe'd, v2
    fix "ftrace: store mcount address in rec->ip"
    mmiotrace broken in linux-next (8-bit writes only)
    ftrace: avoid modifying kprobe'd records
    ftrace: freeze kprobe'd records
    kprobes: enable clean usage of get_kprobe
    ftrace: store mcount address in rec->ip
    ftrace: build fix with gcc 4.3
    namespacecheck: fixes
    ftrace: fix "notrace" filtering priority
    ftrace: fix printout
    ...

    Linus Torvalds
     
  • * 'x86/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (821 commits)
    x86: make 64bit hpet_set_mapping to use ioremap too, v2
    x86: get x86_phys_bits early
    x86: max_low_pfn_mapped fix #4
    x86: change _node_to_cpumask_ptr to return const ptr
    x86: I/O APIC: remove an IRQ2-mask hack
    x86: fix numaq_tsc_disable calling
    x86, e820: remove end_user_pfn
    x86: max_low_pfn_mapped fix, #3
    x86: max_low_pfn_mapped fix, #2
    x86: max_low_pfn_mapped fix, #1
    x86_64: fix delayed signals
    x86: remove conflicting nx6325 and nx6125 quirks
    x86: Recover timer_ack lost in the merge of the NMI watchdog
    x86: I/O APIC: Never configure IRQ2
    x86: L-APIC: Always fully configure IRQ0
    x86: L-APIC: Set IRQ0 as edge-triggered
    x86: merge dwarf2 headers
    x86: use AS_CFI instead of UNWIND_INFO
    x86: use ignore macro instead of hash comment
    x86: use matching CFI_ENDPROC
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6: (31 commits)
    avr32: Fix typo of IFSR in a comment in the PIO header file
    avr32: Power Management support ("standby" and "mem" modes)
    avr32: Add system device for the internal interrupt controller (intc)
    avr32: Add simple SRAM allocator
    avr32: Enable SDRAMC clock at startup
    rtc-at32ap700x: Enable wakeup
    macb: Basic suspend/resume support
    atmel_serial: Drain console TX shifter before suspending
    atmel_serial: Fix build on avr32 with CONFIG_PM enabled
    avr32: Use a quicklist for PTE allocation as well
    avr32: Use a quicklist for PGD allocation
    avr32: Cover the kernel page tables in the user PGDs
    avr32: Store virtual addresses in the PGD
    avr32: Remove useless zeroing of swapper_pg_dir at startup
    avr32: Clean up and optimize the TLB operations
    avr32: Rename at32ap.c -> pdc.c
    avr32: Move setup_platform() into chip-specific file
    avr32: Kill special exception handler sections
    avr32: Kill unneeded #include from asm/mmu_context.h
    avr32: Clean up time.c #includes
    ...

    Linus Torvalds
     
  • * 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (71 commits)
    [S390] sclp_tty: Fix scheduling while atomic bug.
    [S390] sclp_tty: remove ioctl interface.
    [S390] Remove P390 support.
    [S390] Cleanup vmcp printk messages.
    [S390] Cleanup lcs printk messages.
    [S390] Cleanup kprobes printk messages.
    [S390] Cleanup vmwatch printk messages.
    [S390] Cleanup dcssblk printk messages.
    [S390] Cleanup zfcp dumper printk messages.
    [S390] Cleanup vmlogrdr printk messages.
    [S390] Cleanup s390 debug feature print messages.
    [S390] Cleanup monreader printk messages.
    [S390] Cleanup appldata printk messages.
    [S390] Cleanup smsgiucv printk messages.
    [S390] Cleanup cpacf printk messages.
    [S390] Cleanup qeth print messages.
    [S390] Cleanup netiucv printk messages.
    [S390] Cleanup iucv printk messages.
    [S390] Cleanup sclp printk messages.
    [S390] Cleanup zcrypt printk messages.
    ...

    Linus Torvalds
     
  • This simplifies the code significantly, and was the whole point of the
    exercise.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

14 Jul, 2008

3 commits


12 Jul, 2008

3 commits

  • Conflicts:

    arch/x86/mm/ioremap.c

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Filesystems like ext4 need to start a new transaction in
    writepages for block allocation. This happens with delayed
    allocation, and there is a limit to how many credits we can request
    from the journal layer. So we call write_cache_pages multiple
    times with wbc->nr_to_write set to the maximum possible value,
    limited by the max journal credits available.

    Add a new mode to writeback that enables us to handle this
    behaviour. In the new mode we update wbc->range_start
    to point to the new offset to be written. The next call to
    write_cache_pages will start writeout from the specified
    range_start offset. In the new mode we also limit writing
    to the specified wbc->range_end.

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Mingming Cao
    Acked-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
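
The chunked writeout described above can be modeled in user space. This is a minimal sketch, not the kernel code: `struct wbc_model`, `write_pages_model()` and `write_range_in_chunks()` are hypothetical names, offsets are counted in pages rather than bytes, and `max_per_txn` stands in for the limit imposed by the available journal credits.

```c
#include <assert.h>

/* Hypothetical stand-in for struct writeback_control; only the fields
 * named in the commit message are modeled, all in page units. */
struct wbc_model {
    long nr_to_write;   /* pages the caller allows this call to write */
    long range_start;   /* first page offset to write */
    long range_end;     /* last page offset to write (inclusive) */
    int  range_cont;    /* when set, range_start is advanced after a call */
};

/* Model of one write_cache_pages() call: writes at most nr_to_write pages
 * starting at range_start and never goes past range_end.
 * Returns the number of pages written. */
static long write_pages_model(struct wbc_model *wbc)
{
    long written = 0;
    long index = wbc->range_start;

    while (index <= wbc->range_end && wbc->nr_to_write > 0) {
        index++;
        written++;
        wbc->nr_to_write--;
    }
    /* The new mode: remember where the next call should resume. */
    if (wbc->range_cont)
        wbc->range_start = index;
    return written;
}

/* Caller loop: every pass is capped at max_per_txn pages.
 * Returns how many calls were needed to cover [start, end]. */
static int write_range_in_chunks(long start, long end, long max_per_txn)
{
    struct wbc_model wbc = { 0, start, end, 1 };
    int calls = 0;

    assert(max_per_txn > 0);
    while (wbc.range_start <= wbc.range_end) {
        wbc.nr_to_write = max_per_txn;
        write_pages_model(&wbc);
        calls++;
    }
    return calls;
}
```

With ten pages and a cap of four pages per call, the loop makes three calls (4 + 4 + 2 pages), each resuming at the updated `range_start`.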
     
  • Make filemap_fdatawrite_range() function public, so that it can later
    be used in ordered mode rewrite by JBD/JBD2.

    Signed-off-by: Jan Kara

    Jan Kara
     

11 Jul, 2008

1 commit

  • Vegard Nossum reported a crash in kmem_cache_alloc():

    BUG: unable to handle kernel paging request at da87d000
    IP: [] kmem_cache_alloc+0xc7/0xe0
    *pde = 28180163 *pte = 1a87d160
    Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC
    Pid: 3850, comm: grep Not tainted (2.6.26-rc9-00059-gb190333 #5)
    EIP: 0060:[] EFLAGS: 00210203 CPU: 0
    EIP is at kmem_cache_alloc+0xc7/0xe0
    EAX: 00000000 EBX: da87c100 ECX: 1adad71a EDX: 6b6b6b6b
    ESI: 00200282 EDI: da87d000 EBP: f60bfe74 ESP: f60bfe54
    DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068

    and analyzed it:

    "The register %ecx looks innocent but is very important here. The disassembly:

    mov %edx,%ecx
    shr $0x2,%ecx
    rep stos %eax,%es:(%edi)

    (0x6b6b6b6b >> 2 == 0x1adadada.)

    %ecx is the counter for the memset, from here:

    memset(object, 0, c->objsize);

    i.e. %ecx was loaded from c->objsize, so "c" must have been freed.
    Where did "c" come from? Uh-oh...

    c = get_cpu_slab(s, smp_processor_id());

    This looks like it has very much to do with CPU hotplug/unplug. Is
    there a race between SLUB/hotplug since the CPU slab is used after it
    has been freed?"

    Good analysis.

    Yeah, it's possible that a caller of kmem_cache_alloc() -> slab_alloc()
    can be migrated to another CPU right after local_irq_restore() and
    before memset(). The initial CPU can go offline in the meantime (or
    the migration is a consequence of the CPU going offline), so its
    'kmem_cache_cpu' structure gets freed (slab_cpuup_callback).

    At some point the caller continues on another CPU with an
    obsolete pointer...

    Signed-off-by: Dmitry Adamushko
    Reported-by: Vegard Nossum
    Acked-by: Ingo Molnar
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Dmitry Adamushko
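
The arithmetic in the quoted analysis can be checked with a few lines of C. This is a sketch only: 0x6b is the byte pattern used to poison freed slab objects, and the helper names are hypothetical. The last helper assumes that EBX in the oops held the object address, which the report does not state.

```c
#include <assert.h>
#include <stdint.h>

#define POISON_FREE_BYTE 0x6bu  /* byte pattern written into freed objects */

/* The 32-bit value read from a field of a freed, poisoned object. */
static uint32_t poisoned_word(void)
{
    uint32_t b = POISON_FREE_BYTE;
    return b | (b << 8) | (b << 16) | (b << 24);
}

/* Count of 4-byte stores, as computed by "shr $0x2,%ecx". */
static uint32_t stos_count(uint32_t objsize)
{
    return objsize >> 2;
}

/* Number of 4-byte words stored between the start of the object and the
 * faulting address (assumption: stores proceed upward from the object). */
static uint32_t words_already_stored(uint32_t object, uint32_t fault_addr)
{
    assert(fault_addr >= object);
    return (fault_addr - object) / 4;
}
```

Under that assumption, the ECX value in the register dump (1adad71a) is consistent with a counter that started at 0x1adadada and was decremented once per word stored between EBX (da87c100) and the faulting EDI (da87d000).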
     

10 Jul, 2008

1 commit


09 Jul, 2008

1 commit

  • This patch allows architectures to define functions to deal with
    additional protection bits for mmap() and mprotect().

    arch_calc_vm_prot_bits() maps additional protection bits to vm_flags
    arch_vm_get_page_prot() maps additional vm_flags to the vma's vm_page_prot
    arch_validate_prot() checks for valid values of the protection bits

    Note: vm_get_page_prot() is now pretty ugly, but the generated code
    should be identical for architectures that don't define additional
    protection bits.

    Signed-off-by: Dave Kleikamp
    Acked-by: Andrew Morton
    Acked-by: Hugh Dickins
    Signed-off-by: Benjamin Herrenschmidt

    Dave Kleikamp
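
How the three hooks could fit together can be sketched in a user-space model. All bit values below are hypothetical and chosen for illustration, the `_M` suffix marks model-only names, and `calc_vm_flags_model()` and `page_prot_model()` are simplified stand-ins for the generic code, not the kernel implementation.

```c
#include <assert.h>

/* Hypothetical bit values, for this model only. */
#define PROT_READ_M    0x1UL
#define PROT_WRITE_M   0x2UL
#define PROT_EXEC_M    0x4UL
#define PROT_ARCH_M    0x10UL    /* hypothetical extra protection bit */
#define VM_ARCH_M      0x100UL   /* hypothetical matching vm_flags bit */
#define PGPROT_ARCH_M  0x1000UL  /* hypothetical matching page-prot bit */
#define PROT_BASE_M    (PROT_READ_M | PROT_WRITE_M | PROT_EXEC_M)

/* Maps the additional protection bit to a vm_flags bit. */
static unsigned long arch_calc_vm_prot_bits(unsigned long prot)
{
    return (prot & PROT_ARCH_M) ? VM_ARCH_M : 0;
}

/* Maps the additional vm_flags bit to a page-protection bit. */
static unsigned long arch_vm_get_page_prot(unsigned long vm_flags)
{
    return (vm_flags & VM_ARCH_M) ? PGPROT_ARCH_M : 0;
}

/* Accepts only the base bits plus the architecture's extra bit. */
static int arch_validate_prot(unsigned long prot)
{
    return (prot & ~(PROT_BASE_M | PROT_ARCH_M)) == 0;
}

/* Simplified generic side: base bits plus whatever the arch hook adds. */
static unsigned long calc_vm_flags_model(unsigned long prot)
{
    return (prot & PROT_BASE_M) | arch_calc_vm_prot_bits(prot);
}

static unsigned long page_prot_model(unsigned long vm_flags)
{
    return (vm_flags & PROT_BASE_M) | arch_vm_get_page_prot(vm_flags);
}
```

An architecture without extra bits would make both mapping hooks return 0, so the generic results are unchanged, which matches the note that generated code should be identical in that case.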
     

08 Jul, 2008

9 commits


05 Jul, 2008

5 commits

  • Flags considered internal to the mempolicy kernel code are stored as part
    of the "flags" member of struct mempolicy.

    Before exposing a policy type to userspace via get_mempolicy(), these
    internal flags must be masked. Flags exposed to userspace, however,
    should still be returned to the user.

    Signed-off-by: David Rientjes
    Signed-off-by: Linus Torvalds

    David Rientjes
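
The masking step can be illustrated with a small model. The flag values and names below are hypothetical and are not the kernel's definitions; the point is only that internal bits are dropped while user-visible mode flags are kept.

```c
#include <assert.h>

/* Hypothetical flag layout, for illustration only. */
#define FLAG_USER_STATIC_M    (1u << 15)  /* user-visible mode flag */
#define FLAG_USER_RELATIVE_M  (1u << 14)  /* user-visible mode flag */
#define FLAG_INTERNAL_M       (1u << 0)   /* kernel-internal flag */
#define USER_MODE_FLAGS_M     (FLAG_USER_STATIC_M | FLAG_USER_RELATIVE_M)

struct mempolicy_model {
    unsigned short mode;   /* policy type */
    unsigned short flags;  /* user-visible and internal flags together */
};

/* Value reported to userspace: the mode plus only the flags that
 * userspace is allowed to see. */
static unsigned int policy_for_user(const struct mempolicy_model *pol)
{
    return pol->mode | (pol->flags & USER_MODE_FLAGS_M);
}

/* Convenience wrapper so the result can be checked without local state. */
static unsigned int report(unsigned short mode, unsigned short flags)
{
    struct mempolicy_model pol = { mode, flags };
    return policy_for_user(&pol);
}
```

In this model, an unmasked internal bit would change the low bits of the reported value, so userspace would see a wrong policy type.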
     
  • get_user_pages() must not return the error when i != 0. When pages !=
    NULL we have i get_page()'ed pages.

    Signed-off-by: Oleg Nesterov
    Acked-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
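
The return-value rule can be sketched as follows. `pin_pages_model()` is a hypothetical user-space model, not get_user_pages() itself: `fail_at` simulates a fault at a given page index, and a negative value means no fault.

```c
#include <assert.h>
#include <errno.h>

/* Tries to pin nr pages. If page fail_at cannot be pinned, report the
 * number of pages already pinned, and return the error only when
 * nothing has been pinned yet. */
static int pin_pages_model(int nr, int fail_at)
{
    int i;

    for (i = 0; i < nr; i++) {
        if (i == fail_at)
            return i ? i : -EFAULT;
        /* page i is considered pinned from here on */
    }
    return i;
}
```

Returning the count rather than the error lets the caller release the `i` pages that already hold a reference.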
     
  • Dirty page accounting accurately measures the amount of dirty pages in
    writable shared mappings by mapping the pages RO (as indicated by
    vma_wants_writenotify). We then trap on first write and call
    set_page_dirty() on the page, after which we map the page RW and
    continue execution.

    When we launder dirty pages, we call clear_page_dirty_for_io(), which
    both clears the dirty flag and maps the page RO again before we start
    writeout, so that the story can repeat itself.

    vma_wants_writenotify() excludes VM_PFNMAP on the basis that we cannot
    do the regular dirty page stuff on raw PFNs and the memory isn't going
    anywhere anyway.

    The recently introduced VM_MIXEDMAP mixes both !pfn_valid() and
    pfn_valid() pages in a single mapping.

    We can't do dirty page accounting on !pfn_valid() pages as stated
    above, and mapping them RO causes them to be COW'ed on write, which
    breaks VM_SHARED semantics.

    Excluding VM_MIXEDMAP in vma_wants_writenotify() would mean we don't do
    the regular dirty page accounting for the pfn_valid() pages, which
    would bring back all the headaches from inaccurate dirty page
    accounting.

    So instead, we let the !pfn_valid() pages get mapped RO, but fix them
    up unconditionally in the fault path.

    Signed-off-by: Peter Zijlstra
    Cc: Nick Piggin
    Acked-by: Hugh Dickins
    Cc: "Jared Hulbert"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Remove all clameter@sgi.com addresses from the kernel tree since they will
    become invalid on June 27th. Change my maintainer email address for the
    slab allocators to cl@linux-foundation.org (which will be the new email
    address for the future).

    Signed-off-by: Christoph Lameter
    Signed-off-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Stephen Rothwell
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
    slub: Do not use 192 byte sized cache if minimum alignment is 128 byte

    Linus Torvalds
     

04 Jul, 2008

2 commits

  • The non-NUMA case of build_zonelist_cache() would initialize the
    zlcache_ptr for both node_zonelists[] to NULL.

    This is problematic, since non-NUMA only has a single node_zonelists[]
    entry, and trying to zero the non-existent second one just overwrote the
    nr_zones field instead.

    As kswapd uses this value to determine what reclaim work is necessary,
    the result is that kswapd never reclaims. This causes processes to
    stall frequently in low-memory situations as they always direct reclaim.
    This patch initialises zlcache_ptr correctly.

    Signed-off-by: Mel Gorman
    Tested-by: Dan Williams
    [ Simplified patch a bit ]
    Signed-off-by: Linus Torvalds

    Mel Gorman
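
The bug class described above, writing to an array slot that does not exist and clobbering the neighbouring field, can be avoided by bounding the initialisation by the array size. This is a sketch with hypothetical model types, not the patch itself: `MAX_ZONELISTS_M` is 1 to mimic the non-NUMA case.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ZONELISTS_M 1  /* non-NUMA: a single zonelist per node */

struct zonelist_model {
    void *zlcache_ptr;
};

struct pgdat_model {
    struct zonelist_model node_zonelists[MAX_ZONELISTS_M];
    int nr_zones;  /* the neighbouring field that must not be clobbered */
};

/* Clears zlcache_ptr only for entries that actually exist. */
static void build_zonelist_cache_model(struct pgdat_model *pgdat)
{
    size_t i;

    for (i = 0; i < MAX_ZONELISTS_M; i++)
        pgdat->node_zonelists[i].zlcache_ptr = NULL;
}

/* Helper for checking: returns nr_zones after the initialisation. */
static int nr_zones_after_init(int nr_zones)
{
    struct pgdat_model pgdat = { { { NULL } }, nr_zones };

    build_zonelist_cache_model(&pgdat);
    return pgdat.nr_zones;
}
```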
     
  • The 192-byte cache is not necessary if we have a basic alignment of 128
    bytes. If it were used, the 192 bytes would be aligned up to the next 128-byte
    boundary, which would result in another 256-byte cache. Two 256-byte kmalloc caches
    cause sysfs to complain about a duplicate entry.

    MIPS needs 128-byte-aligned kmalloc caches and spits out warnings on boot without
    this patch.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Pekka Enberg

    Christoph Lameter
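
The size collision is plain round-up arithmetic. A minimal sketch, with `align_up()` as a hypothetical helper that requires a power-of-two alignment:

```c
#include <assert.h>

/* Rounds size up to the next multiple of align (a power of two). */
static unsigned long align_up(unsigned long size, unsigned long align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

With 128-byte alignment, a 192-byte object size rounds up to 256 bytes and so duplicates the existing 256-byte cache. With 64-byte alignment it stays at 192.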
     

02 Jul, 2008

1 commit