31 Jul, 2009

1 commit

  • sg_miter_start() is currently unaware of the direction of the copy
    process (to or from the scatter list). The direction matters because,
    on cache-incoherent architectures, the page has to be flushed after a
    write so that the data is visible through a different mapping in
    userland.

    Signed-off-by: Sebastian Andrzej Siewior
    Acked-by: FUJITA Tomonori
    Acked-by: Tejun Heo
    Signed-off-by: Pierre Ossman

    Sebastian Andrzej Siewior
     

30 Jul, 2009

2 commits

  • Once a structure goes over PAGE_SIZE*2, we see occasional allocation
    failures. Some people have chosen to switch over to things like vmalloc()
    that will let them keep array-like access to such large structures.
    But, vmalloc() has plenty of downsides.

    Here's an alternative. I think it's what Andrew was suggesting here:

    http://lkml.org/lkml/2009/7/2/518

    I call it a flexible array. It does all of its work in PAGE_SIZE-sized
    pieces, so it never does an order>0 allocation. The base level has
    PAGE_SIZE-2*sizeof(int) bytes of storage for pointers to the second level.
    So, with a 32-bit arch, you get about 4MB (4186112 bytes) of total
    storage when the objects pack nicely into a page. It is half that on
    64-bit because the pointers are twice the size. There's a table detailing
    this in the code.
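
    As a rough check of the sizing argument above, the following userspace
    sketch computes how many second-level pointers fit in the base page and
    the capacity that results. The 4096-byte page size used in the usage
    note and the helper name flex_capacity_bytes() are assumptions made for
    illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: total bytes addressable by a two-level layout in
 * which the base page reserves header_bytes and fills the rest with
 * pointers to second-level pages of page_size bytes each.
 */
static size_t flex_capacity_bytes(size_t page_size, size_t header_bytes,
                                  size_t pointer_size)
{
    assert(page_size > header_bytes && pointer_size > 0);

    size_t nr_pointers = (page_size - header_bytes) / pointer_size;

    return nr_pointers * page_size;
}
```

    Calling flex_capacity_bytes(4096, 8, 4) models the 32-bit case (1022
    pointers) and flex_capacity_bytes(4096, 8, 8) the 64-bit case (511
    pointers), which is where the "half that on 64-bit" observation comes
    from.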

    There are kerneldocs for the functions, but here's an
    overview:

    flex_array_alloc()      - dynamically allocate a base structure
    flex_array_free()       - free the array and all of the
                              second-level pages
    flex_array_free_parts() - free the second-level pages, but
                              not the base (for static bases)
    flex_array_put()        - copy into the array at the given index
    flex_array_get()        - copy out of the array at the given index
    flex_array_prealloc()   - preallocate the second-level pages
                              between the given indexes to
                              guarantee no allocs will occur at
                              put() time.
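
    To make the two-level idea concrete, here is a minimal userspace sketch
    of the put/get path. It is not the kernel implementation: the
    flex_sketch_* names, the fixed 4096-byte SKETCH_PAGE_SIZE, the use of
    calloc() in place of page allocation, and the -1 error returns are all
    assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096 /* assumed page size */
/* Pointers that fit in one page after the two int header fields. */
#define SKETCH_NR_PARTS ((SKETCH_PAGE_SIZE - 2 * sizeof(int)) / sizeof(void *))

struct flex_sketch {
    int element_size;
    int total_nr_elements;
    void *parts[SKETCH_NR_PARTS]; /* second-level pages, allocated lazily */
};

static struct flex_sketch *flex_sketch_alloc(int element_size, int total)
{
    assert(element_size > 0 && element_size <= SKETCH_PAGE_SIZE);

    size_t per_part = SKETCH_PAGE_SIZE / (size_t)element_size;

    if (total < 0 || (size_t)total > per_part * SKETCH_NR_PARTS)
        return NULL;

    struct flex_sketch *fa = calloc(1, sizeof(*fa));

    if (fa) {
        fa->element_size = element_size;
        fa->total_nr_elements = total;
    }
    return fa;
}

/* Locate element nr; allocate its second-level page when alloc is nonzero. */
static char *flex_sketch_slot(struct flex_sketch *fa, int nr, int alloc)
{
    if (nr < 0 || nr >= fa->total_nr_elements)
        return NULL;

    size_t per_part = SKETCH_PAGE_SIZE / (size_t)fa->element_size;
    void **part = &fa->parts[(size_t)nr / per_part];

    if (!*part && alloc)
        *part = calloc(1, SKETCH_PAGE_SIZE);
    if (!*part)
        return NULL;
    return (char *)*part + ((size_t)nr % per_part) * (size_t)fa->element_size;
}

/* Copy one element into the array at index nr. */
static int flex_sketch_put(struct flex_sketch *fa, int nr, const void *src)
{
    char *slot = flex_sketch_slot(fa, nr, 1);

    if (!slot)
        return -1;
    memcpy(slot, src, (size_t)fa->element_size);
    return 0;
}

/* Copy one element out of the array; fails if its page was never written. */
static int flex_sketch_get(struct flex_sketch *fa, int nr, void *dst)
{
    char *slot = flex_sketch_slot(fa, nr, 0);

    if (!slot)
        return -1;
    memcpy(dst, slot, (size_t)fa->element_size);
    return 0;
}

/* Free every second-level page, then the base. */
static void flex_sketch_free(struct flex_sketch *fa)
{
    if (!fa)
        return;
    for (size_t i = 0; i < SKETCH_NR_PARTS; i++)
        free(fa->parts[i]);
    free(fa);
}
```

    Every allocation in the sketch is at most one page in size, which is
    the property the description above is after. Elements never straddle a
    page because each page holds a whole number of them.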

    We could also potentially just pass the "element_size" into each of the
    API functions instead of storing it internally. That would get us one
    more base pointer on 32-bit.

    I've been testing this by running it in userspace. The header and patch
    that I've been using are here, as well as the little script I'm using to
    generate the size table which goes in the kerneldocs.

    http://sr71.net/~dave/linux/flexarray/

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Dave Hansen
    Reviewed-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • The generic atomic64_t implementation in lib/ did not export the functions
    it defined, which means that modules that use atomic64_t would not link on
    platforms that use the generic implementation (such as 32-bit powerpc). For
    example, trying to build a kernel with CONFIG_NET_RDS on such a platform
    would fail with:

    ERROR: "atomic64_read" [net/rds/rds.ko] undefined!
    ERROR: "atomic64_set" [net/rds/rds.ko] undefined!

    Fix this by exporting the atomic64_t functions to modules. (I export the
    entire API even if it's not all currently used by in-tree modules to avoid
    having to continue fixing this in dribs and drabs)

    Signed-off-by: Roland Dreier
    Acked-by: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland Dreier
     

29 Jul, 2009

1 commit


11 Jul, 2009

2 commits

  • …el/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    dma-debug: Fix the overlap() function to be correct and readable
    oprofile: reset bt_lost_no_mapping with other stats
    x86/oprofile: rename kernel parameter for architectural perfmon to arch_perfmon
    signals: declare sys_rt_tgsigqueueinfo in syscalls.h
    rcu: Mark Hierarchical RCU no longer experimental
    dma-debug: Put all hash-chain locks into the same lock class
    dma-debug: fix off-by-one error in overlap function

    Linus Torvalds
     
  • Linus noticed how unclean and buggy the overlap() function is:

    - It uses convoluted (and bug-causing) positive checks for
      range overlap - instead of using a more natural negative
      check.

    - Even the positive checks are buggy: a positive intersection
      check has four natural cases while we checked only for three,
      missing the (addr < start && addr2 == end) case for example.

    - The variables are mis-named, making it non-obvious how the
      check was done.

    - It needlessly uses u64 instead of unsigned long. Since these
      are kernel memory pointers and we explicitly exclude highmem
      ranges anyway we cannot ever overflow 32 bits, even if we
      could. (and on 64-bit it doesn't matter anyway)

    All in all, this function needs a total revamp. I used Linus's
    suggestions minus the paranoid checks (we cannot overflow really
    because if we get totally bad DMA ranges passed far more things
    break in the systems than just DMA debugging). I also fixed a
    few other small details I noticed.
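
    The "negative check" mentioned above can be sketched in userspace as
    follows. This is an illustration rather than the dma-debug code: the
    name ranges_overlap_sketch(), the parameter names, and the choice of
    half-open ranges are assumptions, and, like the description above, the
    sketch does no overflow checking.

```c
#include <stdbool.h>

/*
 * Two half-open ranges [a_start, a_start + a_len) and [b_start, b_end)
 * are disjoint exactly when one of them ends at or before the point where
 * the other begins. Overlap is the negation of that single condition, so
 * no case analysis over the possible intersections is needed.
 */
static bool ranges_overlap_sketch(unsigned long a_start, unsigned long a_len,
                                  unsigned long b_start, unsigned long b_end)
{
    unsigned long a_end = a_start + a_len;

    return !(a_end <= b_start || a_start >= b_end);
}
```

    With this form, a range that starts before the second one and ends
    exactly at its end is reported as overlapping without needing a
    dedicated case.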

    Reported-by: Linus Torvalds
    Cc: Joerg Roedel
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

03 Jul, 2009

1 commit


25 Jun, 2009

1 commit

  • (feature suggested by Sergey Senozhatsky)

    Kmemleak needs to track all the memory allocations but some of these
    happen before kmemleak is initialised. These are stored in an internal
    buffer which may be exceeded in some kernel configurations. This patch
    adds a configuration option with a default value of 400 and also removes
    the stack dump when the early log buffer is exceeded.

    Signed-off-by: Catalin Marinas
    Acked-by: Sergey Senozhatsky

    Catalin Marinas
     

24 Jun, 2009

1 commit


23 Jun, 2009

1 commit

  • Selecting DEBUG_SLAB or SLUB_DEBUG by the KMEMLEAK menu entry may cause
    issues with other dependencies (KMEMCHECK). These configuration options
    aren't strictly needed by kmemleak but they may increase the chances of
    finding leaks. This patch also updates the KMEMLEAK config entry help
    text.

    Signed-off-by: Catalin Marinas
    Acked-by: Pekka Enberg

    Catalin Marinas
     

22 Jun, 2009

1 commit


21 Jun, 2009

1 commit

  • x86 stack traces are a piece of crap without frame pointers, and it's not
    like the 'performance gain' of not having frame pointers matters when you
    have selected lockdep.

    Reported-by: Andrew Morton
    LKML-Reference:
    Cc:
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

19 Jun, 2009

2 commits

  • The new generic checksum code has a small dependency on endianness and
    worked only on big-endian systems. I could not find a nice efficient
    way to express this, so I added an #ifdef. Using
    'result += le16_to_cpu(*buff);' would have worked as well, but
    would be slightly less efficient on big-endian systems and IMHO
    would not be clearer.

    Also fix a bug that prevents this from working on 64-bit machines.
    If you have a 64-bit CPU and want to use the generic checksum
    code, you should probably do some more optimizations anyway, but
    at least the code should not break.
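
    To illustrate the kind of endianness dependency described above, here
    is a small userspace sketch of a 16-bit ones' complement sum. It is not
    the code from lib/checksum.c: the name csum16_sketch() is hypothetical,
    and the __BYTE_ORDER__ / __ORDER_LITTLE_ENDIAN__ macros are assumed to
    be predefined by the compiler (as GCC and Clang do).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Sum a byte buffer as native-endian 16-bit words with end-around carry.
 * Whole words need no special handling; only a trailing odd byte has to be
 * placed in the half of the word it would occupy in memory, and that half
 * depends on the byte order.
 */
static uint16_t csum16_sketch(const unsigned char *buf, size_t len)
{
    uint64_t sum = 0;

    while (len >= 2) {
        uint16_t word;

        memcpy(&word, buf, sizeof(word)); /* alignment-safe native load */
        sum += word;
        buf += 2;
        len -= 2;
    }
    if (len) {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
        sum += *buf;                 /* first byte in memory is the low byte */
#else
        sum += (uint64_t)*buf << 8;  /* first byte in memory is the high byte */
#endif
    }
    while (sum >> 16)                /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

    The result is in native byte order; copying it back to memory yields
    the same two bytes on either kind of machine, which is the property
    that makes the native-word approach workable.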

    Reported-by: Mike Frysinger
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     
  • This patch adds lib/gcd.c which contains a greatest common divisor
    implementation taken from sound/core/pcm_timer.c.

    Several usages of this new library function will be sent to subsystem
    maintainers.
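
    For readers unfamiliar with the technique, a greatest common divisor
    helper of this kind is typically Euclid's algorithm. The following is a
    userspace sketch, not the contents of lib/gcd.c: the name gcd_sketch()
    and the explicit zero guard are assumptions for illustration.

```c
/* Euclid's algorithm: repeatedly replace (a, b) with (b, a mod b). */
static unsigned long gcd_sketch(unsigned long a, unsigned long b)
{
    unsigned long r;

    if (a < b) {
        /* Keep the larger value in a. */
        unsigned long tmp = a;

        a = b;
        b = tmp;
    }
    if (b == 0)
        return a; /* sketch-only guard so that a % b never divides by zero */

    while ((r = a % b) != 0) {
        a = b;
        b = r;
    }
    return b;
}
```

    For instance, gcd_sketch(48000, 44100) reduces through the remainders
    3900, 1200 and 300 before the remainder reaches zero.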

    [akpm@linux-foundation.org: use swap() (pointed out by Joe)]
    [akpm@linux-foundation.org: just add gcd.o to obj-y, remove Kconfig changes]
    Signed-off-by: Florian Fainelli
    Cc: Sergei Shtylyov
    Cc: Takashi Iwai
    Cc: Simon Horman
    Cc: Julius Volz
    Cc: David S. Miller
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Florian Fainelli
     

17 Jun, 2009

12 commits

  • Alan Cox reported that lockdep runs out of its stack-trace entries
    with certain configs:

    BUG: MAX_STACK_TRACE_ENTRIES too low

    This happens because there are 1024 hash buckets, each with a
    separate lock. Lockdep puts each lock into a separate lock class and
    tracks them independently.

    But in reality we never take more than one of the bucket locks at a
    time, so they really belong in a single lock class. Annotate the hash
    bucket lock init accordingly.

    [ Impact: reduce the lockdep footprint of dma-debug ]

    Reported-by: Alan Cox
    Signed-off-by: Ingo Molnar
    Signed-off-by: Joerg Roedel

    Ingo Molnar
     
  • * akpm: (182 commits)
    fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
    fbdev: *bfin*: fix __dev{init,exit} markings
    fbdev: *bfin*: drop unnecessary calls to memset
    fbdev: bfin-t350mcqb-fb: drop unused local variables
    fbdev: blackfin has __raw I/O accessors, so use them in fb.h
    fbdev: s1d13xxxfb: add accelerated bitblt functions
    tcx: use standard fields for framebuffer physical address and length
    fbdev: add support for handoff from firmware to hw framebuffers
    intelfb: fix a bug when changing video timing
    fbdev: use framebuffer_release() for freeing fb_info structures
    radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
    s3c-fb: CPUFREQ frequency scaling support
    s3c-fb: fix resource releasing on error during probing
    carminefb: fix possible access beyond end of carmine_modedb[]
    acornfb: remove fb_mmap function
    mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
    mb862xxfb: restrict compliation of platform driver to PPC
    Samsung SoC Framebuffer driver: add Alpha Channel support
    atmel-lcdc: fix pixclock upper bound detection
    offb: use framebuffer_alloc() to allocate fb_info struct
    ...

    Manually fix up conflicts due to kmemcheck in mm/slab.c

    Linus Torvalds
     
  • Furthermore, notice that the initial checks:

        if (!node->rb_left)
                child = node->rb_right;
        else if (!node->rb_right)
                child = node->rb_left;
        else
        {
                ...
        }

    guarantee that old->rb_right is set in the final else branch, therefore
    we can omit checking that again.

    Signed-off-by: Wolfram Strepp
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wolfram Strepp
     
  • There are two cases when a node with two children is erased:
    'normal case': the successor is not the right-hand child of the node to be erased
    'special case': the successor is the right-hand child of the node to be erased

    Here is some ASCII art, using the following symbols (referring to the code):
    O: node to be deleted
    N: the successor of O
    P: parent of N
    C: child of N
    L: some other node

    normal case:

              O                         N
             / \                       / \
            /   \                     /   \
           L     \                   L     \
          / \     P      ---->      / \     P
                 / \                       / \
                /                         /
               N                         C
                \                       / \
                 \
                  C
                 / \

    special case:
              O|P                       N
             / \                       / \
            /   \                     /   \
           L     \                   L     \
          / \     N      ---->      /       C
                   \                       / \
                    \
                     C
                    / \

    Notice that for the special case we don't have to reconnect C to N.

    Signed-off-by: Wolfram Strepp
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wolfram Strepp
     
  • First, move some code around in order to make the next change more obvious.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Wolfram Strepp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wolfram Strepp
     
  • There is a call to write_lock() in gen_pool_destroy which is not balanced
    by any corresponding write_unlock(). This causes problems with preemption
    because the preemption-disable counter is incremented in the write_lock()
    call, but never decremented by any call to write_unlock(). This bug is
    normally masked by the fact that there are only two callers of
    gen_pool_destroy, and one of them is non-x86 arch-specific code.

    Signed-off-by: Zygo Blaxell
    Cc: Jiri Kosina
    Cc: Steve Wise
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zygo Blaxell
     
  • For example:
    hex_dump_to_buffer("AB", 2, 16, 1, buf, 100, 0);
    pr_info("[%s]\n", buf);

    I'd expect the output to be "[41 42]", but actually it's "[41 42 ]"

    This patch also reduces the required buffer size to the minimum. To print
    the hex format of "AB", a buf of size 6 should be sufficient, but
    hex_dump_to_buffer() required at least 8.
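
    The sizing argument can be illustrated with a small userspace sketch.
    It is not hex_dump_to_buffer() itself: the name hex_bytes_sketch(), its
    simplified parameter list, and the -1 error return are assumptions for
    illustration.

```c
#include <stddef.h>

/*
 * Write len bytes of src as two-digit hex values separated by single
 * spaces, with no trailing space. Each byte takes two digits plus one more
 * character, which is a separator for all but the last byte and the
 * terminating NUL for the last, so 3 * len bytes of output are enough.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int hex_bytes_sketch(const unsigned char *src, size_t len,
                            char *buf, size_t buflen)
{
    static const char digits[] = "0123456789abcdef";
    size_t pos = 0;

    if (buflen == 0)
        return -1;
    if (len == 0) {
        buf[0] = '\0';
        return 0;
    }
    if (buflen < 3 * len)
        return -1;

    for (size_t i = 0; i < len; i++) {
        buf[pos++] = digits[src[i] >> 4];
        buf[pos++] = digits[src[i] & 0x0f];
        buf[pos++] = ' ';
    }
    buf[pos - 1] = '\0'; /* the last separator becomes the terminator */
    return 0;
}
```

    For the two bytes of "AB" this needs exactly 6 bytes of output, in line
    with the figure given above.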

    Signed-off-by: Li Zefan
    Acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • radix_tree_lookup() and radix_tree_lookup_slot() have much the
    same code except for the return value.

    Introduce radix_tree_lookup_element() to do the real work.

    /*
     * is_slot == 1 : search for the slot.
     * is_slot == 0 : search for the node.
     */
    static void *radix_tree_lookup_element(struct radix_tree_root *root,
                            unsigned long index, int is_slot);

    Signed-off-by: Huang Shijie
    Cc: Nick Piggin
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Shijie
     
  • _atomic_dec_and_lock() should not unconditionally take the lock before
    calling atomic_dec_and_test() in the UP case. For consistency reasons it
    should behave exactly like in the SMP case.

    Besides that, this works around the problem that, with
    CONFIG_DEBUG_SPINLOCK, the function spins in __spin_lock_debug() if the
    lock is already taken, even if the counter doesn't drop to 0.
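
    The intended behaviour (only touch the lock when the counter is about
    to reach zero) can be sketched in userspace with C11 atomics. This is
    not the kernel's _atomic_dec_and_lock(): the toy_lock type, the
    dec_unless_one() helper, and the dec_and_lock_sketch() name are
    assumptions for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal test-and-set spinlock standing in for a real spinlock. */
struct toy_lock {
    atomic_flag held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
    while (atomic_flag_test_and_set(&l->held))
        ; /* spin until the flag was previously clear */
}

static void toy_lock_release(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

/* Decrement *v unless it is currently 1; report whether we decremented. */
static bool dec_unless_one(atomic_int *v)
{
    int old = atomic_load(v);

    while (old != 1) {
        /* On failure, old is reloaded and the loop condition is rechecked. */
        if (atomic_compare_exchange_weak(v, &old, old - 1))
            return true;
    }
    return false;
}

/*
 * Returns true with the lock held if the counter dropped to zero,
 * otherwise returns false with the lock not held by the caller.
 */
static bool dec_and_lock_sketch(atomic_int *v, struct toy_lock *l)
{
    if (dec_unless_one(v))
        return false; /* fast path: the lock is never touched */

    toy_lock_acquire(l);
    if (atomic_fetch_sub(v, 1) == 1)
        return true;  /* reached zero: caller now owns the lock */
    toy_lock_release(l);
    return false;
}
```

    Because the fast path never takes the lock, a call that merely lowers
    the counter from 2 to 1 completes even while someone else holds the
    lock.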

    Signed-off-by: Jan Blunck
    Acked-by: Paul E. McKenney
    Acked-by: Nick Piggin
    Cc: Valerie Aurora
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
     
  • The counterpart of radix_tree_next_hole(). To be used by context readahead.

    Signed-off-by: Wu Fengguang
    Cc: Vladislav Bolkhovitin
    Cc: Jens Axboe
    Cc: Jeff Moyer
    Cc: Nick Piggin
    Cc: Ying Han
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck: (39 commits)
    signal: fix __send_signal() false positive kmemcheck warning
    fs: fix do_mount_root() false positive kmemcheck warning
    fs: introduce __getname_gfp()
    trace: annotate bitfields in struct ring_buffer_event
    net: annotate struct sock bitfield
    c2port: annotate bitfield for kmemcheck
    net: annotate inet_timewait_sock bitfields
    ieee1394/csr1212: fix false positive kmemcheck report
    ieee1394: annotate bitfield
    net: annotate bitfields in struct inet_sock
    net: use kmemcheck bitfields API for skbuff
    kmemcheck: introduce bitfield API
    kmemcheck: add opcode self-testing at boot
    x86: unify pte_hidden
    x86: make _PAGE_HIDDEN conditional
    kmemcheck: make kconfig accessible for other architectures
    kmemcheck: enable in the x86 Kconfig
    kmemcheck: add hooks for the page allocator
    kmemcheck: add hooks for page- and sg-dma-mappings
    kmemcheck: don't track page tables
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (64 commits)
    debugfs: use specified mode to possibly mark files read/write only
    debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem.
    xen: remove driver_data direct access of struct device from more drivers
    usb: gadget: at91_udc: remove driver_data direct access of struct device
    uml: remove driver_data direct access of struct device
    block/ps3: remove driver_data direct access of struct device
    s390: remove driver_data direct access of struct device
    parport: remove driver_data direct access of struct device
    parisc: remove driver_data direct access of struct device
    of_serial: remove driver_data direct access of struct device
    mips: remove driver_data direct access of struct device
    ipmi: remove driver_data direct access of struct device
    infiniband: ehca: remove driver_data direct access of struct device
    ibmvscsi: gadget: at91_udc: remove driver_data direct access of struct device
    hvcs: remove driver_data direct access of struct device
    xen block: remove driver_data direct access of struct device
    thermal: remove driver_data direct access of struct device
    scsi: remove driver_data direct access of struct device
    pcmcia: remove driver_data direct access of struct device
    PCIE: remove driver_data direct access of struct device
    ...

    Manually fix up trivial conflicts due to different direct driver_data
    direct access fixups in drivers/block/{ps3disk.c,ps3vram.c}

    Linus Torvalds
     

16 Jun, 2009

2 commits


15 Jun, 2009

6 commits


13 Jun, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
    add generic lib/checksum.c
    asm-generic: add a generic uaccess.h
    asm-generic: add generic NOMMU versions of some headers
    asm-generic: add generic atomic.h and io.h
    asm-generic: add legacy I/O header files
    asm-generic: add generic versions of common headers
    asm-generic: make bitops.h usable
    asm-generic: make pci.h usable directly
    asm-generic: make get_rtc_time overridable
    asm-generic: rename page.h and uaccess.h
    asm-generic: rename atomic.h to atomic-long.h
    asm-generic: add a generic unistd.h
    asm-generic: add generic ABI headers
    asm-generic: add generic sysv ipc headers
    asm-generic: introduce asm/bitsperlong.h
    asm-generic: rename termios.h, signal.h and mman.h

    Linus Torvalds
     

12 Jun, 2009

5 commits

  • It's theoretically possible that there are exception table entries
    which point into the (freed) init text of modules. These could cause
    future problems if other modules get loaded into that memory and cause
    an exception as we'd see the wrong fixup. The only case I know of is
    kvm-intel.ko (when CONFIG_CC_OPTIMIZE_FOR_SIZE=n).

    Amerigo fixed this long-standing FIXME in the x86 version, but this
    patch is more general.

    This implements trim_init_extable(); most archs are simple since they
    use the standard lib/extable.c sort code. Alpha and IA64 use relative
    addresses in their fixups, so their trimming is a slight variation.

    Sparc32 is unique; it doesn't seem to define ARCH_HAS_SORT_EXTABLE,
    yet it defines its own sort_extable() which overrides the one in lib.
    It doesn't sort, so we have to mark deleted entries instead of
    actually trimming them.
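
    As an illustration of trimming a sorted table, here is a userspace
    sketch. It is not the patch's trim_init_extable(): the struct layouts,
    the in_range() helper, and the function name are hypothetical, and the
    sketch assumes the entries pointing into the freed region sit together
    at one end of the sorted table.

```c
#include <stdbool.h>
#include <stddef.h>

struct ex_entry_sketch {
    unsigned long insn;  /* address of the instruction that may fault */
    unsigned long fixup; /* address to continue at after a fault */
};

struct ex_table_sketch {
    struct ex_entry_sketch *entries; /* sorted by insn */
    size_t count;
};

static bool in_range(unsigned long addr, unsigned long start,
                     unsigned long end)
{
    return addr >= start && addr < end;
}

/*
 * Drop entries at the head and tail of the table whose insn lies in the
 * freed region [init_start, init_end). Nothing is moved: the table start
 * and length are simply adjusted.
 */
static void trim_init_sketch(struct ex_table_sketch *t,
                             unsigned long init_start, unsigned long init_end)
{
    while (t->count && in_range(t->entries[0].insn, init_start, init_end)) {
        t->entries++;
        t->count--;
    }
    while (t->count &&
           in_range(t->entries[t->count - 1].insn, init_start, init_end))
        t->count--;
}
```

    A table that is not kept sorted cannot be trimmed this way, which is
    why the Sparc32 case described above marks entries as deleted instead.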

    Inspired-by: Amerigo Wang
    Signed-off-by: Rusty Russell
    Cc: linux-alpha@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org

    Rusty Russell
     
  • Fixes a merge conflict against the x86 tree caused by a fix to
    atomic.h, which I renamed to atomic-long.h.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     
  • * 'for-linus' of git://linux-arm.org/linux-2.6:
    kmemleak: Add the corresponding MAINTAINERS entry
    kmemleak: Simple testing module for kmemleak
    kmemleak: Enable the building of the memory leak detector
    kmemleak: Remove some of the kmemleak false positives
    kmemleak: Add modules support
    kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash
    kmemleak: Add the vmalloc memory allocation/freeing hooks
    kmemleak: Add the slub memory allocation/freeing hooks
    kmemleak: Add the slob memory allocation/freeing hooks
    kmemleak: Add the slab memory allocation/freeing hooks
    kmemleak: Add documentation on the memory leak detector
    kmemleak: Add the base support

    Manual conflict resolution (with the slab/earlyboot changes) in:
    drivers/char/vt.c
    init/main.c
    mm/slab.c

    Linus Torvalds
     
  • …/git/penberg/slab-2.6

    * 'topic/slab/earlyboot' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
    vgacon: use slab allocator instead of the bootmem allocator
    irq: use kcalloc() instead of the bootmem allocator
    sched: use slab in cpupri_init()
    sched: use alloc_cpumask_var() instead of alloc_bootmem_cpumask_var()
    memcg: don't use bootmem allocator in setup code
    irq/cpumask: make memoryless node zero happy
    x86: remove some alloc_bootmem_cpumask_var calling
    vt: use kzalloc() instead of the bootmem allocator
    sched: use kzalloc() instead of the bootmem allocator
    init: introduce mm_init()
    vmalloc: use kzalloc() instead of alloc_bootmem()
    slab: setup allocators earlier in the boot sequence
    bootmem: fix slab fallback on numa
    bootmem: use slab if bootmem is no longer available

    Linus Torvalds
     
  • Add a generic (unoptimized) implementation of checksum.c in pure C
    for use by all architectures that cannot be bothered with implementing
    their own version.

    Based on microblaze code by Michal Simek

    Cc: Michal Simek
    Signed-off-by: Remis Lima Baima
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann