12 Jun, 2009

18 commits

  • * 'for-linus' of git://linux-arm.org/linux-2.6:
    kmemleak: Add the corresponding MAINTAINERS entry
    kmemleak: Simple testing module for kmemleak
    kmemleak: Enable the building of the memory leak detector
    kmemleak: Remove some of the kmemleak false positives
    kmemleak: Add modules support
    kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash
    kmemleak: Add the vmalloc memory allocation/freeing hooks
    kmemleak: Add the slub memory allocation/freeing hooks
    kmemleak: Add the slob memory allocation/freeing hooks
    kmemleak: Add the slab memory allocation/freeing hooks
    kmemleak: Add documentation on the memory leak detector
    kmemleak: Add the base support

    Manual conflict resolution (with the slab/earlyboot changes) in:
    drivers/char/vt.c
    init/main.c
    mm/slab.c

    Linus Torvalds
     
  • …el/git/tip/linux-2.6-tip

    * 'perfcounters-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (574 commits)
    perf_counter: Turn off by default
    perf_counter: Add counter->id to the throttle event
    perf_counter: Better align code
    perf_counter: Rename L2 to LL cache
    perf_counter: Standardize event names
    perf_counter: Rename enums
    perf_counter tools: Clean up u64 usage
    perf_counter: Rename perf_counter_limit sysctl
    perf_counter: More paranoia settings
    perf_counter: powerpc: Implement generalized cache events for POWER processors
    perf_counters: powerpc: Add support for POWER7 processors
    perf_counter: Accurate period data
    perf_counter: Introduce struct for sample data
    perf_counter tools: Normalize data using per sample period data
    perf_counter: Annotate exit ctx recursion
    perf_counter tools: Propagate signals properly
    perf_counter tools: Small frequency related fixes
    perf_counter: More aggressive frequency adjustment
    perf_counter/x86: Fix the model number of Intel Core2 processors
    perf_counter, x86: Correct some event and umask values for Intel processors
    ...

    Linus Torvalds
     
  • …/git/penberg/slab-2.6

    * 'topic/slab/earlyboot' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
    vgacon: use slab allocator instead of the bootmem allocator
    irq: use kcalloc() instead of the bootmem allocator
    sched: use slab in cpupri_init()
    sched: use alloc_cpumask_var() instead of alloc_bootmem_cpumask_var()
    memcg: don't use bootmem allocator in setup code
    irq/cpumask: make memoryless node zero happy
    x86: remove some alloc_bootmem_cpumask_var calling
    vt: use kzalloc() instead of the bootmem allocator
    sched: use kzalloc() instead of the bootmem allocator
    init: introduce mm_init()
    vmalloc: use kzalloc() instead of alloc_bootmem()
    slab: setup allocators earlier in the boot sequence
    bootmem: fix slab fallback on numa
    bootmem: use slab if bootmem is no longer available

    Linus Torvalds
     
  • * 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits)
    block: add request clone interface (v2)
    floppy: fix hibernation
    ramdisk: remove long-deprecated "ramdisk=" boot-time parameter
    fs/bio.c: add missing __user annotation
    block: prevent possible io_context->refcount overflow
    Add serial number support for virtio_blk, V4a
    block: Add missing bounce_pfn stacking and fix comments
    Revert "block: Fix bounce limit setting in DM"
    cciss: decode unit attention in SCSI error handling code
    cciss: Remove no longer needed sendcmd reject processing code
    cciss: change SCSI error handling routines to work with interrupts enabled.
    cciss: separate error processing and command retrying code in sendcmd_withirq_core()
    cciss: factor out fix target status processing code from sendcmd functions
    cciss: simplify interface of sendcmd() and sendcmd_withirq()
    cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code
    cciss: Use schedule_timeout_uninterruptible in SCSI error handling code
    block: needs to set the residual length of a bidi request
    Revert "block: implement blkdev_readpages"
    block: Fix bounce limit setting in DM
    Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt
    ...

    Manually fix conflicts with tracing updates in:
    block/blk-sysfs.c
    drivers/ide/ide-atapi.c
    drivers/ide/ide-cd.c
    drivers/ide/ide-floppy.c
    drivers/ide/ide-tape.c
    include/trace/events/block.h
    kernel/trace/blktrace.c

    Linus Torvalds
     
  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (44 commits)
    nommu: Provide mmap_min_addr definition.
    TOMOYO: Add description of lists and structures.
    TOMOYO: Remove unused field.
    integrity: ima audit dentry_open failure
    TOMOYO: Remove unused parameter.
    security: use mmap_min_addr indepedently of security models
    TOMOYO: Simplify policy reader.
    TOMOYO: Remove redundant markers.
    SELinux: define audit permissions for audit tree netlink messages
    TOMOYO: Remove unused mutex.
    tomoyo: avoid get+put of task_struct
    smack: Remove redundant initialization.
    integrity: nfsd imbalance bug fix
    rootplug: Remove redundant initialization.
    smack: do not beyond ARRAY_SIZE of data
    integrity: move ima_counts_get
    integrity: path_check update
    IMA: Add __init notation to ima functions
    IMA: Minimal IMA policy and boot param for TCB IMA policy
    selinux: remove obsolete read buffer limit from sel_read_bool
    ...

    Linus Torvalds
     
  • The bootmem allocator is no longer available for page_cgroup_init() because we
    set up the kernel slab allocator much earlier now.

    Cc: Ingo Molnar
    Cc: Johannes Weiner
    Cc: Linus Torvalds
    Signed-off-by: Yinghai Lu
    Signed-off-by: Pekka Enberg

    Yinghai Lu
     
  • We can call vmalloc_init() after kmem_cache_init() and use kzalloc() instead of
    the bootmem allocator when initializing vmalloc data structures.

    Acked-by: Johannes Weiner
    Acked-by: Linus Torvalds
    Acked-by: Nick Piggin
    Cc: Ingo Molnar
    Cc: Yinghai Lu
    Signed-off-by: Pekka Enberg

    Pekka Enberg
     
  • This patch makes kmalloc() available earlier in the boot sequence so we can get
    rid of some bootmem allocations. The bulk of the changes are due to
    kmem_cache_init() being called with interrupts disabled, which requires some
    changes to the allocator bootstrap code.

    Note: 32-bit x86 performs the WP-protect test in mem_init(), so we must set up
    traps before we call mem_init() during boot, as reported by Ingo Molnar:

    We have a hard crash in the WP-protect code:

    [ 0.000000] Checking if this processor honours the WP bit even in supervisor mode...BUG: Int 14: CR2 ffcff000
    [ 0.000000] EDI 00000188 ESI 00000ac7 EBP c17eaf9c ESP c17eaf8c
    [ 0.000000] EBX 000014e0 EDX 0000000e ECX 01856067 EAX 00000001
    [ 0.000000] err 00000003 EIP c10135b1 CS 00000060 flg 00010002
    [ 0.000000] Stack: c17eafa8 c17fd410 c16747bc c17eafc4 c17fd7e5 000011fd f8616000 c18237cc
    [ 0.000000] 00099800 c17bb000 c17eafec c17f1668 000001c5 c17f1322 c166e039 c1822bf0
    [ 0.000000] c166e033 c153a014 c18237cc 00020800 c17eaff8 c17f106a 00020800 01ba5003
    [ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip-02161-g7a74539-dirty #52203
    [ 0.000000] Call Trace:
    [ 0.000000] [] ? printk+0x14/0x16
    [ 0.000000] [] ? do_test_wp_bit+0x19/0x23
    [ 0.000000] [] ? test_wp_bit+0x26/0x64
    [ 0.000000] [] ? mem_init+0x1ba/0x1d8
    [ 0.000000] [] ? start_kernel+0x164/0x2f7
    [ 0.000000] [] ? unknown_bootoption+0x0/0x19c
    [ 0.000000] [] ? __init_begin+0x6a/0x6f

    Acked-by: Johannes Weiner
    Acked-by: Linus Torvalds
    Cc: Christoph Lameter
    Cc: Ingo Molnar
    Cc: Matt Mackall
    Cc: Nick Piggin
    Cc: Yinghai Lu
    Signed-off-by: Pekka Enberg

    Pekka Enberg
     
  • If the user requested bootmem allocation on a specific node, we should use
    kzalloc_node() for the fallback allocation.

    Cc: Ingo Molnar
    Cc: Johannes Weiner
    Cc: Linus Torvalds
    Cc: Yinghai Lu
    Signed-off-by: Pekka Enberg

    Pekka Enberg
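
The fallback described in this commit, together with the "use slab if bootmem is no longer available" change it builds on, can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: `slab_ready`, `early_alloc_node()`, `fake_bootmem_node()` and `fake_kzalloc_node()` are hypothetical stand-ins rather than kernel functions, and the node is merely recorded instead of influencing placement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model: which allocator served a request, and for which node. */
enum source { FROM_BOOTMEM, FROM_SLAB };

struct early_buf {
    enum source src;
    int node;   /* node the caller asked for */
    void *mem;  /* zeroed memory, or NULL on failure */
};

/* Stand-in for a "slab is up" check; set to true once the model's slab is ready. */
static bool slab_ready;

/* Stand-in for a node-specific bootmem allocation (returns zeroed memory). */
static void *fake_bootmem_node(int node, size_t size)
{
    (void)node;
    return calloc(1, size);
}

/* Stand-in for kzalloc_node(): zeroed memory for a given node. */
static void *fake_kzalloc_node(size_t size, int node)
{
    (void)node;
    return calloc(1, size);
}

/*
 * Node-aware early allocation: once slab is available, fall back to the
 * node-aware slab call instead of failing or ignoring the requested node.
 */
static struct early_buf early_alloc_node(int node, size_t size)
{
    struct early_buf b = { FROM_BOOTMEM, node, NULL };

    assert(size > 0);
    if (slab_ready) {
        b.src = FROM_SLAB;
        b.mem = fake_kzalloc_node(size, node);
    } else {
        b.mem = fake_bootmem_node(node, size);
    }
    return b;
}
```

The point of the sketch is that the caller's node request survives the fallback: the same `node` argument is passed on to whichever allocator ends up serving the request.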
     
  • As a preparation for initializing the slab allocator early, make sure the
    bootmem allocator does not crash and burn if someone calls it after slab is up;
    otherwise we'd need a flag day for switching to early slab.

    Acked-by: Johannes Weiner
    Acked-by: Linus Torvalds
    Cc: Christoph Lameter
    Cc: Ingo Molnar
    Cc: Matt Mackall
    Cc: Nick Piggin
    Cc: Yinghai Lu
    Signed-off-by: Pekka Enberg

    Pekka Enberg
     
  • This patch adds a loadable module that deliberately leaks memory. It
    is used for testing various memory leaking scenarios.

    Signed-off-by: Catalin Marinas

    Catalin Marinas
     
  • This patch adds the Kconfig.debug and Makefile entries needed for
    building kmemleak into the kernel.

    Signed-off-by: Catalin Marinas

    Catalin Marinas
     
  • The alloc_large_system_hash() function is called from various places in
    the kernel, and the tables it allocates contain pointers to other
    allocated structures. Its allocations therefore need to be traced by kmemleak.

    Signed-off-by: Catalin Marinas

    Catalin Marinas
     
  • This patch adds the callbacks to kmemleak_(alloc|free) functions from
    vmalloc/vfree.

    Signed-off-by: Catalin Marinas

    Catalin Marinas
     
  • This patch adds the callbacks to kmemleak_(alloc|free) functions from the
    slub allocator.

    Signed-off-by: Catalin Marinas
    Cc: Christoph Lameter
    Reviewed-by: Pekka Enberg

    Catalin Marinas
     
  • This patch adds the callbacks to kmemleak_(alloc|free) functions from the
    slob allocator.

    Signed-off-by: Catalin Marinas
    Acked-by: Matt Mackall
    Acked-by: Pekka Enberg

    Catalin Marinas
     
  • This patch adds the callbacks to kmemleak_(alloc|free) functions from
    the slab allocator. The patch also adds the SLAB_NOLEAKTRACE flag to
    avoid recursive calls to kmemleak when it allocates its own data
    structures.

    Signed-off-by: Catalin Marinas
    Reviewed-by: Pekka Enberg

    Catalin Marinas
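
Why a "do not trace" flag is needed can be seen in a small userspace model. This is a sketch, not the slab code: `fake_cache`, `cache_alloc()`, `track_alloc()` and the `FAKE_NOLEAKTRACE` value are hypothetical names chosen for illustration. The tracker allocates one metadata object per tracked allocation; if that metadata cache were itself traced, every allocation would recurse without end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FAKE_NOLEAKTRACE 0x1u  /* hypothetical flag value, not the kernel's */

/* Per-object metadata kept by the model tracker. */
struct meta {
    void *obj;
    struct meta *next;
};

struct fake_cache {
    unsigned flags;
    size_t obj_size;
};

static struct meta *meta_list;  /* all tracked objects */
static int tracked_objects;     /* how many objects the tracker knows about */
static int hook_depth;          /* current nesting of track_alloc() */
static int max_hook_depth;      /* deepest nesting observed */

/* The tracker's own cache is flagged so that its allocations are not traced. */
static struct fake_cache meta_cache = { FAKE_NOLEAKTRACE, sizeof(struct meta) };

static void *cache_alloc(struct fake_cache *c);

/* Allocation hook: record obj, which itself requires an allocation. */
static void track_alloc(void *obj)
{
    hook_depth++;
    if (hook_depth > max_hook_depth)
        max_hook_depth = hook_depth;

    struct meta *m = cache_alloc(&meta_cache);
    m->obj = obj;
    m->next = meta_list;
    meta_list = m;
    tracked_objects++;

    hook_depth--;
}

/* Allocate from a cache and call the hook unless the cache opts out. */
static void *cache_alloc(struct fake_cache *c)
{
    void *obj = calloc(1, c->obj_size);

    assert(obj != NULL);
    if (!(c->flags & FAKE_NOLEAKTRACE))
        track_alloc(obj);
    return obj;
}
```

With the flag set on `meta_cache`, an ordinary allocation enters the hook exactly once; clearing the flag in this model would make `track_alloc()` call itself until the stack overflows.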
     
  • This patch adds the base support for the kernel memory leak
    detector. It traces memory allocation/freeing in a way similar to
    Boehm's conservative garbage collector, the difference being that
    the unreferenced objects are not freed but only shown in
    /sys/kernel/debug/kmemleak. Enabling this feature introduces an
    overhead to memory allocations.

    Signed-off-by: Catalin Marinas
    Cc: Ingo Molnar
    Acked-by: Pekka Enberg
    Cc: Andrew Morton
    Reviewed-by: Paul E. McKenney

    Catalin Marinas
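
The conservative-scanning idea mentioned above can be sketched in userspace C. This is a toy model under stated assumptions, not kmemleak's implementation: the fixed-size table, the recursive scan, and the names `track()`, `lookup()`, `scan_block()` and `count_unreferenced()` are all hypothetical. Every aligned word in the root area and in each reachable block is treated as a possible pointer; blocks never reached are reported rather than freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OBJS 16

/* One tracked allocation: address range plus a mark bit. */
struct tracked {
    uintptr_t start;
    size_t size;
    bool reachable;
};

static struct tracked objs[MAX_OBJS];
static int nr_objs;

/* Register an allocation with the tracker. */
static void track(void *p, size_t size)
{
    assert(nr_objs < MAX_OBJS);
    objs[nr_objs++] = (struct tracked){ (uintptr_t)p, size, false };
}

/* Return the tracked object containing addr (interior pointers count). */
static struct tracked *lookup(uintptr_t addr)
{
    for (int i = 0; i < nr_objs; i++)
        if (addr >= objs[i].start && addr - objs[i].start < objs[i].size)
            return &objs[i];
    return NULL;
}

/* Treat every aligned word in [mem, mem + len) as a potential pointer. */
static void scan_block(const void *mem, size_t len)
{
    const uintptr_t *w = mem;

    for (size_t i = 0; i < len / sizeof(*w); i++) {
        struct tracked *t = lookup(w[i]);

        if (t && !t->reachable) {
            t->reachable = true;
            /* Follow the reference: the block may point to others. */
            scan_block((const void *)t->start, t->size);
        }
    }
}

/* Mark from the roots, then count blocks that nothing refers to. */
static int count_unreferenced(const void *roots, size_t len)
{
    int n = 0;

    for (int i = 0; i < nr_objs; i++)
        objs[i].reachable = false;
    scan_block(roots, len);
    for (int i = 0; i < nr_objs; i++)
        if (!objs[i].reachable)
            n++;
    return n;
}
```

Because any word that happens to look like an address keeps a block alive, a scan of this kind can miss real leaks (false negatives), which is the usual trade-off of conservative approaches.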
     

11 Jun, 2009

5 commits

  • Conflicts:
    arch/x86/kernel/irqinit.c
    arch/x86/kernel/irqinit_64.c
    arch/x86/kernel/traps.c
    arch/x86/mm/fault.c
    include/linux/sched.h
    kernel/exit.c

    Ingo Molnar
     
  • * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits)
    Revert "x86, bts: reenable ptrace branch trace support"
    tracing: do not translate event helper macros in print format
    ftrace/documentation: fix typo in function grapher name
    tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK
    tracing: add protection around module events unload
    tracing: add trace_seq_vprint interface
    tracing: fix the block trace points print size
    tracing/events: convert block trace points to TRACE_EVENT()
    ring-buffer: fix ret in rb_add_time_stamp
    ring-buffer: pass in lockdep class key for reader_lock
    tracing: add annotation to what type of stack trace is recorded
    tracing: fix multiple use of __print_flags and __print_symbolic
    tracing/events: fix output format of user stack
    tracing/events: fix output format of kernel stack
    tracing/trace_stack: fix the number of entries in the header
    ring-buffer: discard timestamps that are at the start of the buffer
    ring-buffer: try to discard unneeded timestamps
    ring-buffer: fix bug in ring_buffer_discard_commit
    ftrace: do not profile functions when disabled
    tracing: make trace pipe recognize latency format flag
    ...

    Linus Torvalds
     
  • James Morris
     
  • * 'percpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    percpu: remove rbtree and use page->index instead
    percpu: don't put the first chunk in reverse-map rbtree

    Linus Torvalds
     
  • * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
    x86: fix system without memory on node0
    x86, mm: Fix node_possible_map logic
    mm, x86: remove MEMORY_HOTPLUG_RESERVE related code
    x86: make sparse mem work in non-NUMA mode
    x86: process.c, remove useless headers
    x86: merge process.c a bit
    x86: use sparse_memory_present_with_active_regions() on UMA
    x86: unify 64-bit UMA and NUMA paging_init()
    x86: Allow 1MB of slack between the e820 map and SRAT, not 4GB
    x86: Sanity check the e820 against the SRAT table using e820 map only
    x86: clean up and and print out initial max_pfn_mapped
    x86/pci: remove rounding quirk from e820_setup_gap()
    x86, e820, pci: reserve extra free space near end of RAM
    x86: fix typo in address space documentation
    x86: 46 bit physical address support on 64 bits
    x86, mm: fault.c, use printk_once() in is_errata93()
    x86: move per-cpu mmu_gathers to mm/init.c
    x86: move max_pfn_mapped and max_low_pfn_mapped to setup.c
    x86: unify noexec handling
    x86: remove (null) in /sys kernel_page_tables
    ...

    Linus Torvalds
     

10 Jun, 2009

2 commits

  • With the "security: use mmap_min_addr indepedently of security models"
    change, mmap_min_addr is used in common areas, which subsequently blows
    up the nommu build. This stubs in the definition in the nommu case as
    well.

    Signed-off-by: Paul Mundt

    --

    mm/nommu.c | 3 +++
    1 file changed, 3 insertions(+)
    Signed-off-by: James Morris

    Paul Mundt
     
  • TRACE_EVENT is a more generic way to define tracepoints. Doing so adds
    these new capabilities to this tracepoint:

    - zero-copy and per-cpu splice() tracing
    - binary tracing without printf overhead
    - structured logging records exposed under /debug/tracing/events
    - trace events embedded in function tracer output and other plugins
    - user-defined, per tracepoint filter expressions
    ...

    Cons:

    - no dev_t info for the output of plug, unplug_timer and unplug_io events.
    no dev_t info for getrq and sleeprq events if bio == NULL.
    no dev_t info for rq_abort,...,rq_requeue events if rq->rq_disk == NULL.

    This is mainly because we can't get the device from a request queue.
    But this may change in the future.

    - A packet command is converted to a string in TP_assign, not TP_print,
    whereas blktrace does the conversion just before output.

    Since pc requests should be rather rare, this is not a big issue.

    - In blktrace, an event can have 2 different print formats, but a TRACE_EVENT
    has a unique format, which means we have some unused data in a trace entry.

    The overhead is minimized by using __dynamic_array() instead of __array().

    I've benchmarked the ioctl blktrace vs the splice based TRACE_EVENT tracing:

         dd                  dd + ioctl blktrace   dd + TRACE_EVENT (splice)
    1    7.36s, 42.7 MB/s    7.50s, 42.0 MB/s      7.41s, 42.5 MB/s
    2    7.43s, 42.3 MB/s    7.48s, 42.1 MB/s      7.43s, 42.4 MB/s
    3    7.38s, 42.6 MB/s    7.45s, 42.2 MB/s      7.41s, 42.5 MB/s

    So the overhead of tracing is very small, and there is no regression when
    using those trace events vs blktrace.

    And the binary output of TRACE_EVENT is much smaller than blktrace:

    # ls -l -h
    -rw-r--r-- 1 root root 8.8M 06-09 13:24 sda.blktrace.0
    -rw-r--r-- 1 root root 195K 06-09 13:24 sda.blktrace.1
    -rw-r--r-- 1 root root 2.7M 06-09 13:25 trace_splice.out

    Following are some comparisons between TRACE_EVENT and blktrace:

    plug:
    kjournald-480 [000] 303.084981: block_plug: [kjournald]
    kjournald-480 [000] 303.084981: 8,0 P N [kjournald]

    unplug_io:
    kblockd/0-118 [000] 300.052973: block_unplug_io: [kblockd/0] 1
    kblockd/0-118 [000] 300.052974: 8,0 U N [kblockd/0] 1

    remap:
    kjournald-480 [000] 303.085042: block_remap: 8,0 W 102736992 + 8 [...]

    Changelog from v2 -> v3:

    - use the newly introduced __dynamic_array().

    Changelog from v1 -> v2:

    - use __string() instead of __array() to minimize the memory required
    to store hex dump of rq->cmd().

    - support large pc requests.

    - add missing blk_fill_rwbs_rq() in block_rq_requeue TRACE_EVENT.

    - some cleanups.

    Signed-off-by: Li Zefan
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Li Zefan
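
The space argument for `__dynamic_array()` over `__array()` has a rough analogue in plain C: a record with a worst-case fixed buffer versus one whose tail is sized per entry. The sketch below is a userspace analogy only, not the TRACE_EVENT macros; the struct names, field names and the 32-byte bound are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FIXED_CMD_MAX 32  /* hypothetical worst-case command string size */

/* Fixed layout: every entry pays for the worst case. */
struct fixed_entry {
    unsigned sector;
    char cmd[FIXED_CMD_MAX];
};

/* Dynamic layout: the string is stored in a tail sized per entry. */
struct dyn_entry {
    unsigned sector;
    unsigned short cmd_len;  /* bytes in cmd[], including the NUL */
    char cmd[];              /* C99 flexible array member */
};

/* Bytes needed to store one dynamic entry holding cmd. */
static size_t dyn_entry_size(const char *cmd)
{
    return sizeof(struct dyn_entry) + strlen(cmd) + 1;
}

/* Allocate and fill a dynamic entry; the caller frees it. */
static struct dyn_entry *dyn_entry_new(unsigned sector, const char *cmd)
{
    size_t len = strlen(cmd) + 1;
    struct dyn_entry *e = malloc(sizeof(*e) + len);

    assert(e != NULL);
    e->sector = sector;
    e->cmd_len = (unsigned short)len;
    memcpy(e->cmd, cmd, len);
    return e;
}
```

For events that carry no packet command at all, the dynamic entry stores only an empty string, which is where the saving relative to the fixed layout comes from in this model.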
     

09 Jun, 2009

2 commits


05 Jun, 2009

1 commit


04 Jun, 2009

2 commits

  • In the name of keeping it simple, only track mmap events. Userspace
    will have to remove old overlapping maps when it encounters them.

    Signed-off-by: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • This patch removes the dependency of mmap_min_addr on CONFIG_SECURITY.
    It also sets a default mmap_min_addr of 4096.

    mmapping of addresses below 4096 will only be possible for processes
    with CAP_SYS_RAWIO.

    Signed-off-by: Christoph Lameter
    Acked-by: Eric Paris
    Looks-ok-by: Linus Torvalds
    Signed-off-by: James Morris

    Christoph Lameter
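
The policy stated in this commit message reduces to a single comparison. The following is a minimal userspace sketch of that rule, assuming a hypothetical helper `may_map_at()` and a boolean standing in for the CAP_SYS_RAWIO capability check; it is not the kernel's actual code path.

```c
#include <assert.h>
#include <stdbool.h>

/* Default lower bound from the commit message; tunable in the real kernel. */
static unsigned long min_addr = 4096;

/*
 * Hypothetical helper: may a process map at addr?
 * has_cap_sys_rawio stands in for the capability check.
 */
static bool may_map_at(unsigned long addr, bool has_cap_sys_rawio)
{
    if (addr < min_addr && !has_cap_sys_rawio)
        return false;
    return true;
}
```

With the default of 4096, an unprivileged request for address 0 is refused in this model, while the same request from a process holding the capability is allowed.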
     

01 Jun, 2009

1 commit


29 May, 2009

4 commits

  • Fix build warning, "mem_cgroup_is_obsolete defined but not used" when
    CONFIG_DEBUG_VM is not set. Also avoid checking for !mem again and again.

    Signed-off-by: Nikanth Karthikesan
    Acked-by: Pekka Enberg
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nikanth Karthikesan
     
  • Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13302

    hugetlbfs reserves huge pages but does not fault them at mmap() time to
    ensure that future faults succeed. The reservation behaviour differs
    depending on whether the mapping was mapped MAP_SHARED or MAP_PRIVATE.
    For MAP_SHARED mappings, hugepages are reserved when mmap() is first
    called and are tracked based on information associated with the inode.
    Other processes mapping MAP_SHARED use the same reservation. MAP_PRIVATE
    mappings track their reservations based on the VMA created as part of the mmap()
    operation. Each process mapping MAP_PRIVATE must make its own
    reservation.

    hugetlbfs currently checks if a VMA is MAP_SHARED with the VM_SHARED flag
    and not VM_MAYSHARE. For file-backed mappings, such as hugetlbfs,
    VM_SHARED is set only if the mapping is MAP_SHARED and the file was opened
    read-write. If a shared memory mapping was mapped shared-read-write for
    populating of data and mapped shared-read-only by other processes, then
    hugetlbfs would account for the mapping as if it was MAP_PRIVATE. This
    causes processes to fail to map the file MAP_SHARED even though it should
    succeed as the reservation is there.

    This patch alters mm/hugetlb.c and replaces VM_SHARED with VM_MAYSHARE
    when the intent of the code was to check whether the VMA was mapped
    MAP_SHARED or MAP_PRIVATE.

    Signed-off-by: Mel Gorman
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc:
    Cc: Lee Schermerhorn
    Cc: KOSAKI Motohiro
    Cc:
    Cc: Eric B Munson
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
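
The difference between the two flag checks can be shown with a simplified model of how a MAP_SHARED file mapping gets its flags. This is a sketch under assumptions: the flag values are chosen to mirror the kernel's but should be treated as illustrative, and `shared_file_vm_flags()`, `old_check_is_shared()` and `new_check_is_shared()` are hypothetical helpers, not functions from mm/hugetlb.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits (assumed to mirror the kernel's values). */
#define VM_SHARED   0x08u
#define VM_MAYWRITE 0x20u
#define VM_MAYSHARE 0x80u

/*
 * Simplified model of flag setup for a MAP_SHARED file mapping:
 * a file not opened for writing loses VM_SHARED but keeps VM_MAYSHARE.
 */
static unsigned shared_file_vm_flags(bool file_opened_for_write)
{
    unsigned flags = VM_SHARED | VM_MAYSHARE | VM_MAYWRITE;

    if (!file_opened_for_write)
        flags &= ~(VM_MAYWRITE | VM_SHARED);
    return flags;
}

/* The check before the patch: misclassifies read-only shared mappings. */
static bool old_check_is_shared(unsigned flags)
{
    return (flags & VM_SHARED) != 0;
}

/* The check after the patch: true for every MAP_SHARED mapping. */
static bool new_check_is_shared(unsigned flags)
{
    return (flags & VM_MAYSHARE) != 0;
}
```

In this model a shared mapping of a read-only file descriptor fails the old check and would be accounted as private, which matches the failure mode the commit message describes.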
     
  • mapping->tree_lock can be acquired from interrupt context, so the
    following deadlock can occur.

    Assume "A" as a page.

    CPU0:
    lock_page_cgroup(A)
    interrupted
    -> take mapping->tree_lock.
    CPU1:
    take mapping->tree_lock
    -> lock_page_cgroup(A)

    This patch tries to fix the above deadlock by moving memcg's hook out of
    mapping->tree_lock. Charge/uncharge of pagecache/swapcache is protected
    by the page lock, not tree_lock.

    After this patch, lock_page_cgroup() is not called under mapping->tree_lock.

    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Daisuke Nishimura
    Cc: Balbir Singh
    Cc: Daisuke Nishimura
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daisuke Nishimura
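
The CPU0/CPU1 scenario above is a classic lock-order inversion, and it can be detected mechanically by remembering which lock was taken while another was held. The sketch below is a toy illustration in the spirit of lock-order checkers; the enum, the `order` matrix and `acquire_while_holding()` are hypothetical and perform no real locking.

```c
#include <assert.h>
#include <stdbool.h>

/* The two locks from the scenario, as abstract identifiers. */
enum lock_id { PAGE_CGROUP_LOCK, TREE_LOCK, NR_LOCKS };

/* order[a][b] is true once some context took b while holding a. */
static bool order[NR_LOCKS][NR_LOCKS];

/*
 * Record that `wanted` is being taken while `held` is held.
 * Returns true if the opposite order was recorded earlier, i.e. the
 * two orders together can deadlock.
 */
static bool acquire_while_holding(enum lock_id held, enum lock_id wanted)
{
    assert(held != wanted);
    order[held][wanted] = true;
    return order[wanted][held];
}
```

Feeding in CPU0's sequence (page cgroup lock, then tree_lock from the interrupt) followed by CPU1's sequence (tree_lock, then page cgroup lock) makes the second call report an inversion; once the hook runs outside tree_lock, the second ordering never occurs.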
     
  • When /proc/sys/vm/oom_dump_tasks is enabled, it is possible to get a NULL
    pointer for tasks that have detached mm's since task_lock() is not held
    during the tasklist scan. Add the task_lock().

    Acked-by: Nick Piggin
    Acked-by: Mel Gorman
    Cc: Rik van Riel
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

23 May, 2009

1 commit


22 May, 2009

4 commits

  • Conflicts:
    fs/exec.c

    Removed IMA changes (the IMA checks are now performed via may_open()).

    Signed-off-by: James Morris

    James Morris
     
  • Based on discussion on lkml (Andrew Morton and Eric Paris),
    move ima_counts_get down a layer into shmem/hugetlb__file_setup().
    Resolves drm shmem_file_setup() usage case as well.

    HD comment:
    I still think you're doing this at the wrong level, but recognize
    that you probably won't be persuaded until a few more users of
    alloc_file() emerge, all wanting your ima_counts_get().

    Resolving GEM's shmem_file_setup() is an improvement, so I'll say

    Acked-by: Hugh Dickins
    Signed-off-by: Mimi Zohar
    Signed-off-by: James Morris

    Mimi Zohar
     
  • - Add support in ima_path_check() for integrity checking without
    incrementing the counts. (Required for nfsd.)
    - rename and export opencount_get to ima_counts_get
    - replace ima_shm_check calls with ima_counts_get
    - export ima_path_check

    Signed-off-by: Mimi Zohar
    Signed-off-by: James Morris

    Mimi Zohar
     
  • My old address will shut down in a few days' time: remove it from the tree,
    and add a tmpfs (shmem filesystem) maintainer entry with the new address.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins