28 Aug, 2009

1 commit

  • We call lmb_end_of_DRAM() to test whether a DMA mask is ok on a machine
    without IOMMU, but this function is marked as __init.

    I don't think there's a clean way to get the top of RAM: max_pfn doesn't
    appear to include highmem, or I missed something (or we have a bug :-).
    So for now, let's just avoid having a broken 2.6.31 by making this
    function non-__init, and we can revisit later.

    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
     

27 Aug, 2009

3 commits

  • It's problematic to allow signed element_nr or total values to be passed
    as part of the flex array API.

    flex_array_alloc() allows total_nr_elements to be set to a negative
    quantity, which is obviously erroneous.

    flex_array_get() and flex_array_put() allow negative array indices when
    dereferencing an array part, which could address memory mapped before
    struct flex_array.

    The fix is to convert all existing element_nr formals to be qualified as
    unsigned. Existing checks to compare it to total_nr_elements or the max
    array size based on element_size need not be changed.
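
    As a rough illustration of why the existing checks need not change, here
    is a minimal userspace sketch. The names demo_array and demo_index_ok()
    are hypothetical stand-ins, not the kernel's flex array code. Once the
    index is unsigned, a negative value passed by a caller converts to a very
    large number and fails the ordinary upper-bound comparison.

```c
/* Hypothetical stand-in for the flex array bookkeeping, not kernel code. */
struct demo_array {
    unsigned int total_nr_elements;
};

/* Returns 1 if element_nr is a valid index, 0 otherwise. */
static int demo_index_ok(const struct demo_array *fa, unsigned int element_nr)
{
    /*
     * A negative int supplied by the caller is converted to a huge
     * unsigned value, so this single comparison rejects it as well.
     */
    return element_nr < fa->total_nr_elements;
}
```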

    Signed-off-by: David Rientjes
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • flex_array_free_parts() does not take `src' or `element_nr' formals, so
    remove their respective comments.

    Signed-off-by: David Rientjes
    Acked-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • If all array elements fit into the base structure and data is copied using
    flex_array_put() starting at a non-zero index, flex_array_get() will fail
    to return the data.

    This fixes the bug by only checking for NULL parts when all elements do
    not fit in the base structure when flex_array_get() is used. Otherwise,
    fa_element_to_part_nr() will always be 0 since there are no parts
    structures needed and such element may never have been put. Thus, it will
    remain NULL due to the kzalloc() of the base.

    Additionally, flex_array_put() now only checks for a NULL part when all
    elements do not fit in the base structure. This is otherwise unnecessary
    since the base structure is guaranteed to exist (or we would have already
    hit a NULL pointer).

    Signed-off-by: David Rientjes
    Acked-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

26 Aug, 2009

1 commit


22 Aug, 2009

1 commit

  • When 'and'ing two bitmasks (where 'andnot' is a variation on it), some
    callers want to know whether the result is the empty set or not. In
    particular, the TLB IPI sending code wants to do cpumask operations and
    determine if there are any CPUs left in the final set.

    So this just makes the bitmask (and cpumask) functions return a boolean
    for whether the result has any bits set.
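
    The idea can be sketched in userspace as below. This is a simplified
    illustration, not the kernel's bitmap code: demo_bitmap_and() is a
    hypothetical name, and it assumes the bitmap occupies whole words, so it
    does not mask off unused bits in the last word.

```c
#include <stddef.h>

/*
 * dst = a & b over nwords words.
 * Returns 1 if any bit is set in the result, 0 if the result is empty.
 */
static int demo_bitmap_and(unsigned long *dst, const unsigned long *a,
                           const unsigned long *b, size_t nwords)
{
    unsigned long any = 0;
    size_t i;

    for (i = 0; i < nwords; i++) {
        dst[i] = a[i] & b[i];
        any |= dst[i];      /* accumulate every result bit */
    }
    return any != 0;
}
```

    A caller can then write "if (!demo_bitmap_and(...)) return;" instead of
    making a second pass over the result to test for emptiness.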

    Cc: stable@kernel.org (2.6.30, needed by TLB shootdown fix)
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

21 Aug, 2009

1 commit

  • While it's debatable whether or not a NULL device argument to
    the DMA API functions is valid... since it certainly isn't
    valid on devices with an IOMMU... dma-debug really shouldn't be
    dereferencing null pointers either.

    Guard against that in err_printk and the driver_filter
    functions. A Fedora rawhide user was seeing this in one of the
    dvb drivers resulting in an oops on boot.

    [ A patch has been sent for testing to the driver, but I feel
    the dma debugging support should be fixed as well. (There's
    still a pile of legacy garbage in the kernel passing null
    pointers to dma_{alloc,free}_*. :( ) ]

    Signed-off-by: Kyle McMartin
    Cc: mchehab@infradead.org
    Cc: Joerg Roedel
    Signed-off-by: Ingo Molnar

    Kyle McMartin
     

08 Aug, 2009

3 commits

  • These includes were added by 079effb6933f34b9b1b67b08bd4fd7fb672d16ef
    ("kmemtrace, kbuild: fix slab.h dependency problem in
    lib/decompress_inflate.c") to fix the build when using kmemtrace. However,
    they are not necessary when the file is used to create a compressed
    kernel, and they actually create issues (they pull in a lot of things that
    are unavailable in the decompression environment), so don't include them
    if STATIC is defined.

    Signed-off-by: Albin Tonnerre
    Cc: Sam Ravnborg
    Cc: Russell King
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Pekka Enberg
    Cc: Eduard - Gabriel Munteanu
    Cc: Phillip Lougher
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Albin Tonnerre
     
  • decompress_bunzip2 and decompress_unlzma have a nasty hack that subtracts
    4 from the input length if being called in the pre-boot environment.

    This is a nasty hack because it relies on the fact that flush = NULL only
    when called from the pre-boot environment (i.e.
    arch/x86/boot/compressed/misc.c). initramfs.c/do_mounts_rd.c pass in a
    flush buffer (flush != NULL).

    This hack prevents the decompressors from being used with flush = NULL by
    other callers unless knowledge of the hack is propagated to them.

    This patch removes the hack by making decompress (called only from the
    pre-boot environment) a wrapper function that subtracts 4 from the input
    length before calling the decompressor.

    Signed-off-by: Phillip Lougher
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Phillip Lougher
     
  • Fix and improve comments in decompress/generic.h that describe the
    decompressor API. Also remove an unused definition, and rename INBUF_LEN
    in lib/decompress_inflate.c to conform to bzip2/lzma naming.

    Signed-off-by: Phillip Lougher
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Phillip Lougher
     

05 Aug, 2009

1 commit


31 Jul, 2009

1 commit

  • sg_miter_start() is currently unaware of the direction of the copy
    process (to or from the scatter list). It is important to know the
    direction because the page has to be flushed in case the data written
    is seen on a different mapping in user land on cache incoherent
    architectures.

    Signed-off-by: Sebastian Andrzej Siewior
    Acked-by: FUJITA Tomonori
    Acked-by: Tejun Heo
    Signed-off-by: Pierre Ossman

    Sebastian Andrzej Siewior
     

30 Jul, 2009

2 commits

  • Once a structure goes over PAGE_SIZE*2, we see occasional allocation
    failures. Some people have chosen to switch over to things like vmalloc()
    that will let them keep array-like access to such large structures.
    But vmalloc() has plenty of downsides.

    Here's an alternative. I think it's what Andrew was suggesting here:

    http://lkml.org/lkml/2009/7/2/518

    I call it a flexible array. It does all of its work in PAGE_SIZE pieces,
    so it never does an order>0 allocation. The base level has
    PAGE_SIZE-2*sizeof(int) bytes of storage for pointers to the second level.
    So, with a 32-bit arch and 4096-byte pages, you get (4096-8)/4 = 1022
    pointers, or about 4MB (4186112 bytes) of total storage when the objects
    pack nicely into a page. It is half that on 64-bit because the pointers
    are twice the size. There's a table detailing this in the code.

    There are kerneldocs for the functions, but here's an
    overview:

    flex_array_alloc()      - dynamically allocate a base structure
    flex_array_free()       - free the array and all of the second-level
                              pages
    flex_array_free_parts() - free the second-level pages, but not the base
                              (for static bases)
    flex_array_put()        - copy into the array at the given index
    flex_array_get()        - copy out of the array at the given index
    flex_array_prealloc()   - preallocate the second-level pages between the
                              given indexes to guarantee no allocs will occur
                              at put() time.

    We could also potentially just pass the "element_size" into each of the
    API functions instead of storing it internally. That would get us one
    more base pointer on 32-bit.
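
    To make the two-level layout concrete, here is a small userspace sketch
    of the same idea. It is not the kernel implementation: every demo_* name
    is hypothetical, DEMO_PART_SIZE stands in for PAGE_SIZE, the base holds a
    fixed 64 part pointers rather than filling a page, and calloc() replaces
    the kernel allocators. Parts are allocated lazily on the first put, and
    no single allocation exceeds one part.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_PART_SIZE 4096u  /* stand-in for PAGE_SIZE */
#define DEMO_NR_PARTS  64u    /* assumption: the kernel base holds far more */

struct demo_flex_array {
    unsigned int element_size;
    unsigned int total_nr_elements;
    unsigned char *parts[DEMO_NR_PARTS];  /* second-level "pages" */
};

static struct demo_flex_array *demo_fa_alloc(unsigned int element_size,
                                             unsigned int total)
{
    struct demo_flex_array *fa;

    if (element_size == 0 || element_size > DEMO_PART_SIZE)
        return NULL;
    if (total > DEMO_NR_PARTS * (DEMO_PART_SIZE / element_size))
        return NULL;
    fa = calloc(1, sizeof(*fa));  /* all part pointers start out NULL */
    if (!fa)
        return NULL;
    fa->element_size = element_size;
    fa->total_nr_elements = total;
    return fa;
}

/* Copy one element into the array, allocating its part on first use. */
static int demo_fa_put(struct demo_flex_array *fa, unsigned int nr,
                       const void *src)
{
    unsigned int per_part = DEMO_PART_SIZE / fa->element_size;
    unsigned int part = nr / per_part;
    unsigned int off = (nr % per_part) * fa->element_size;

    if (nr >= fa->total_nr_elements)
        return -1;
    if (!fa->parts[part]) {
        fa->parts[part] = calloc(1, DEMO_PART_SIZE);
        if (!fa->parts[part])
            return -1;
    }
    memcpy(fa->parts[part] + off, src, fa->element_size);
    return 0;
}

/* Copy one element out; fails if its part was never allocated. */
static int demo_fa_get(const struct demo_flex_array *fa, unsigned int nr,
                       void *dst)
{
    unsigned int per_part = DEMO_PART_SIZE / fa->element_size;
    unsigned int part = nr / per_part;
    unsigned int off = (nr % per_part) * fa->element_size;

    if (nr >= fa->total_nr_elements || !fa->parts[part])
        return -1;
    memcpy(dst, fa->parts[part] + off, fa->element_size);
    return 0;
}

/* Free every second-level part, then the base. */
static void demo_fa_free(struct demo_flex_array *fa)
{
    unsigned int i;

    for (i = 0; i < DEMO_NR_PARTS; i++)
        free(fa->parts[i]);
    free(fa);
}
```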

    I've been testing this by running it in userspace. The header and patch
    that I've been using are here, as well as the little script I'm using to
    generate the size table which goes in the kerneldocs.

    http://sr71.net/~dave/linux/flexarray/

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Dave Hansen
    Reviewed-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • The generic atomic64_t implementation in lib/ did not export the functions
    it defined, which means that modules that use atomic64_t would not link on
    platforms that rely on the generic implementation (such as 32-bit
    powerpc). For example, trying to build a kernel with CONFIG_NET_RDS on
    such a platform would fail with:

    ERROR: "atomic64_read" [net/rds/rds.ko] undefined!
    ERROR: "atomic64_set" [net/rds/rds.ko] undefined!

    Fix this by exporting the atomic64_t functions to modules. (I export the
    entire API even if it's not all currently used by in-tree modules to avoid
    having to continue fixing this in dribs and drabs)

    Signed-off-by: Roland Dreier
    Acked-by: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland Dreier
     

29 Jul, 2009

1 commit


11 Jul, 2009

2 commits

  • …el/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    dma-debug: Fix the overlap() function to be correct and readable
    oprofile: reset bt_lost_no_mapping with other stats
    x86/oprofile: rename kernel parameter for architectural perfmon to arch_perfmon
    signals: declare sys_rt_tgsigqueueinfo in syscalls.h
    rcu: Mark Hierarchical RCU no longer experimental
    dma-debug: Put all hash-chain locks into the same lock class
    dma-debug: fix off-by-one error in overlap function

    Linus Torvalds
     
  • Linus noticed how unclean and buggy the overlap() function is:

    - It uses convoluted (and bug-causing) positive checks for
    range overlap - instead of using a more natural negative
    check.

    - Even the positive checks are buggy: a positive intersection
    check has four natural cases while we checked only for three,
    missing the (addr < start && addr2 == end) case for example.

    - The variables are mis-named, making it non-obvious how the
    check was done.

    - It needlessly uses u64 instead of unsigned long. Since these
    are kernel memory pointers and we explicitly exclude highmem
    ranges anyway we cannot ever overflow 32 bits, even if we
    could. (and on 64-bit it doesn't matter anyway)

    All in all, this function needs a total revamp. I used Linus's
    suggestions minus the paranoid checks (we cannot really overflow,
    because if we get totally bad DMA ranges passed, far more things
    break in the system than just DMA debugging). I also fixed a
    few other small details I noticed.
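
    The "negative check" mentioned above can be sketched as follows. This
    is an illustration, not the patched function itself: demo_overlap() is a
    hypothetical name, it takes plain unsigned long values instead of
    pointers, and it assumes both ranges are half-open, [a1, a1 + len) and
    [a2, b2). Two ranges overlap exactly when neither one ends before the
    other begins.

```c
#include <stdbool.h>

/*
 * Returns true if [a1, a1 + len) intersects [a2, b2).
 * Assumes a1 + len does not wrap around.
 */
static bool demo_overlap(unsigned long a1, unsigned long len,
                         unsigned long a2, unsigned long b2)
{
    unsigned long b1 = a1 + len;

    /* Disjoint means: first ends before second starts, or vice versa. */
    return !(b1 <= a2 || a1 >= b2);
}
```

    Negating the two "disjoint" conditions covers every intersection case
    at once, which is what makes it harder to miss one than with a list of
    positive checks.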

    Reported-by: Linus Torvalds
    Cc: Joerg Roedel
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

03 Jul, 2009

1 commit


25 Jun, 2009

1 commit

  • (feature suggested by Sergey Senozhatsky)

    Kmemleak needs to track all the memory allocations but some of these
    happen before kmemleak is initialised. These are stored in an internal
    buffer which may be exceeded in some kernel configurations. This patch
    adds a configuration option with a default value of 400 and also removes
    the stack dump when the early log buffer is exceeded.

    Signed-off-by: Catalin Marinas
    Acked-by: Sergey Senozhatsky

    Catalin Marinas
     

24 Jun, 2009

1 commit


23 Jun, 2009

1 commit

  • Selecting DEBUG_SLAB or SLUB_DEBUG by the KMEMLEAK menu entry may cause
    issues with other dependencies (KMEMCHECK). These configuration options
    aren't strictly needed by kmemleak but they may increase the chances of
    finding leaks. This patch also updates the KMEMLEAK config entry help
    text.

    Signed-off-by: Catalin Marinas
    Acked-by: Pekka Enberg

    Catalin Marinas
     

22 Jun, 2009

1 commit


21 Jun, 2009

1 commit

  • x86 stack traces are a piece of crap without frame pointers, and it's not
    like the 'performance gain' of not having frame pointers matters when you
    selected lockdep.

    Reported-by: Andrew Morton
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

19 Jun, 2009

2 commits

  • The new generic checksum code has a small dependency on endianness and
    worked only on big-endian systems. I could not find a nice efficient
    way to express this, so I added an #ifdef. Using
    'result += le16_to_cpu(*buff);' would have worked as well, but
    would be slightly less efficient on big-endian systems and IMHO
    would not be clearer.

    Also fix a bug that prevents this from working on 64-bit machines.
    If you have a 64-bit CPU and want to use the generic checksum
    code, you should probably do some more optimizations anyway, but
    at least the code should not break.

    Reported-by: Mike Frysinger
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     
  • This patch adds lib/gcd.c, which contains a greatest common divisor
    implementation taken from sound/core/pcm_timer.c.

    Several usages of this new library function will be sent to subsystem
    maintainers.
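
    For readers unfamiliar with the algorithm, a Euclidean gcd can be
    sketched in userspace as below. This is an illustration rather than a
    copy of lib/gcd.c: demo_gcd() is a hypothetical name, and the explicit
    zero check is an assumption added here so the sketch never divides by
    zero.

```c
/* Greatest common divisor by repeated remainder (Euclid's algorithm). */
static unsigned long demo_gcd(unsigned long a, unsigned long b)
{
    unsigned long r;

    /* Keep the larger value in a. */
    if (a < b) {
        unsigned long tmp = a;
        a = b;
        b = tmp;
    }
    /* Added guard: gcd(a, 0) is a, and it avoids a % 0 below. */
    if (b == 0)
        return a;

    while ((r = a % b) != 0) {
        a = b;
        b = r;
    }
    return b;
}
```

    For example, demo_gcd(48000, 44100) walks through the remainders 3900,
    1200 and 300 before reaching 0, so it returns 300.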

    [akpm@linux-foundation.org: use swap() (pointed out by Joe)]
    [akpm@linux-foundation.org: just add gcd.o to obj-y, remove Kconfig changes]
    Signed-off-by: Florian Fainelli
    Cc: Sergei Shtylyov
    Cc: Takashi Iwai
    Cc: Simon Horman
    Cc: Julius Volz
    Cc: David S. Miller
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Florian Fainelli
     

17 Jun, 2009

12 commits

  • Alan Cox reported that lockdep runs out of its stack-trace entries
    with certain configs:

    BUG: MAX_STACK_TRACE_ENTRIES too low

    This happens because there are 1024 hash buckets, each with a
    separate lock. Lockdep puts each lock into a separate lock class and
    tracks them independently.

    But in reality we never take more than one of the buckets, so they
    really belong in a single lock class. Annotate the hash bucket lock
    init accordingly.

    [ Impact: reduce the lockdep footprint of dma-debug ]

    Reported-by: Alan Cox
    Signed-off-by: Ingo Molnar
    Signed-off-by: Joerg Roedel

    Ingo Molnar
     
  • * akpm: (182 commits)
    fbdev: bf54x-lq043fb: use kzalloc over kmalloc/memset
    fbdev: *bfin*: fix __dev{init,exit} markings
    fbdev: *bfin*: drop unnecessary calls to memset
    fbdev: bfin-t350mcqb-fb: drop unused local variables
    fbdev: blackfin has __raw I/O accessors, so use them in fb.h
    fbdev: s1d13xxxfb: add accelerated bitblt functions
    tcx: use standard fields for framebuffer physical address and length
    fbdev: add support for handoff from firmware to hw framebuffers
    intelfb: fix a bug when changing video timing
    fbdev: use framebuffer_release() for freeing fb_info structures
    radeon: P2G2CLK_ALWAYS_ONb tested twice, should 2nd be P2G2CLK_DAC_ALWAYS_ONb?
    s3c-fb: CPUFREQ frequency scaling support
    s3c-fb: fix resource releasing on error during probing
    carminefb: fix possible access beyond end of carmine_modedb[]
    acornfb: remove fb_mmap function
    mb862xxfb: use CONFIG_OF instead of CONFIG_PPC_OF
    mb862xxfb: restrict compliation of platform driver to PPC
    Samsung SoC Framebuffer driver: add Alpha Channel support
    atmel-lcdc: fix pixclock upper bound detection
    offb: use framebuffer_alloc() to allocate fb_info struct
    ...

    Manually fix up conflicts due to kmemcheck in mm/slab.c

    Linus Torvalds
     
  • Furthermore, notice that the initial checks:

        if (!node->rb_left)
                child = node->rb_right;
        else if (!node->rb_right)
                child = node->rb_left;
        else
        {
                ...
        }

    guarantee that old->rb_right is set in the final else branch, therefore
    we can omit checking that again.

    Signed-off-by: Wolfram Strepp
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wolfram Strepp
     
  • There are two cases when a node with two children is erased:
    'normal case': the successor is not the right-hand child of the node to be erased
    'special case': the successor is the right-hand child of the node to be erased

    Here is some ASCII art, with the following symbols (referring to the code):
    O: node to be deleted
    N: the successor of O
    P: parent of N
    C: child of N
    L: some other node

    normal case:

           O                         N
          / \                       / \
         /   \                     /   \
        L     \                   L     \
       / \     P      ---->      / \     P
              / \                       / \
             /                         /
            N                         C
             \                       / \
              \
               C
              / \

    special case:
          O|P                        N
          / \                       / \
         /   \                     /   \
        L     \                   L     \
       / \     N      ---->      /       C
                \                       / \
                 \
                  C
                 / \

    Notice that for the special case we don't have to reconnect C to N.

    Signed-off-by: Wolfram Strepp
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wolfram Strepp
     
  • First, move some code around in order to make the next change more obvious.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Peter Zijlstra
    Signed-off-by: Wolfram Strepp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wolfram Strepp
     
  • There is a call to write_lock() in gen_pool_destroy which is not balanced
    by any corresponding write_unlock(). This causes problems with preemption
    because the preemption-disable counter is incremented in the write_lock()
    call, but never decremented by any call to write_unlock(). This bug is
    difficult to observe in the field because only two in-tree drivers call
    gen_pool_destroy, and one of them is non-x86 arch-specific code.

    Signed-off-by: Zygo Blaxell
    Cc: Jiri Kosina
    Cc: Steve Wise
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zygo Blaxell
     
  • For example:
    hex_dump_to_buffer("AB", 2, 16, 1, buf, 100, 0);
    pr_info("[%s]\n", buf);

    I'd expect the output to be "[41 42]", but actually it's "[41 42 ]"

    This patch also reduces the required buf size to the minimum. To print
    the hex format of "AB", a buf of size 6 should be sufficient, but
    hex_dump_to_buffer() required at least 8.
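
    The size arithmetic can be checked with a small userspace sketch. This is
    not hex_dump_to_buffer() itself: demo_hex_row() is a hypothetical helper
    that only handles a group size of 1 and no ASCII column. Each byte needs
    two hex digits plus either a separating space or the terminating NUL, so
    two bytes fit in exactly 6 bytes of buffer with no trailing space.

```c
#include <stddef.h>

/*
 * Write the bytes of src as space-separated hex pairs into out.
 * Stops early if the next byte would not fit. Always NUL-terminates
 * when outsz > 0. Returns the number of characters written, not
 * counting the NUL.
 */
static size_t demo_hex_row(const unsigned char *src, size_t len,
                           char *out, size_t outsz)
{
    static const char digits[] = "0123456789abcdef";
    size_t i, pos = 0;

    if (outsz == 0)
        return 0;

    for (i = 0; i < len; i++) {
        /* Two digits, plus a leading space for every byte but the first. */
        size_t need = (i > 0) ? 3 : 2;

        if (pos + need + 1 > outsz)  /* keep room for the NUL */
            break;
        if (i > 0)
            out[pos++] = ' ';
        out[pos++] = digits[src[i] >> 4];
        out[pos++] = digits[src[i] & 0x0f];
    }
    out[pos] = '\0';
    return pos;
}
```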

    Signed-off-by: Li Zefan
    Acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • radix_tree_lookup() and radix_tree_lookup_slot() have much the
    same code except for the return value.

    Introduce radix_tree_lookup_element() to do the real work.

    /*
     * is_slot == 1 : search for the slot.
     * is_slot == 0 : search for the node.
     */
    static void *radix_tree_lookup_element(struct radix_tree_root *root,
                                           unsigned long index, int is_slot);

    Signed-off-by: Huang Shijie
    Cc: Nick Piggin
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Shijie
     
  • _atomic_dec_and_lock() should not unconditionally take the lock before
    calling atomic_dec_and_test() in the UP case. For consistency reasons it
    should behave exactly like in the SMP case.

    Besides that, this works around the problem that with
    CONFIG_DEBUG_SPINLOCK this spins in __spin_lock_debug() if the lock is
    already taken, even if the counter doesn't drop to 0.
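
    The intended behaviour (only touch the lock when the counter is about to
    reach 0) can be sketched in userspace with C11 atomics. This is an
    illustration under stated assumptions, not the kernel code: all demo_*
    names are hypothetical, and a simple atomic_flag spinlock stands in for
    spinlock_t.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal spinlock stand-in built on atomic_flag. */
struct demo_lock {
    atomic_flag held;
};

static void demo_lock_acquire(struct demo_lock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;  /* spin until the flag was previously clear */
}

static void demo_lock_release(struct demo_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Decrement *counter unless it is 1. Returns true if it decremented. */
static bool demo_dec_unless_one(atomic_int *counter)
{
    int old = atomic_load(counter);

    while (old != 1) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(counter, &old, old - 1))
            return true;
    }
    return false;
}

/*
 * Returns true with the lock held if the counter dropped to 0.
 * Returns false without holding the lock otherwise.
 */
static bool demo_dec_and_lock(atomic_int *counter, struct demo_lock *lock)
{
    /* Fast path: the counter stays above 0, so the lock is never taken. */
    if (demo_dec_unless_one(counter))
        return false;

    /* Slow path: the counter may reach 0, so decide under the lock. */
    demo_lock_acquire(lock);
    if (atomic_fetch_sub(counter, 1) == 1)
        return true;
    demo_lock_release(lock);
    return false;
}
```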

    Signed-off-by: Jan Blunck
    Acked-by: Paul E. McKenney
    Acked-by: Nick Piggin
    Cc: Valerie Aurora
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
     
  • The counterpart of radix_tree_next_hole(). To be used by context readahead.

    Signed-off-by: Wu Fengguang
    Cc: Vladislav Bolkhovitin
    Cc: Jens Axboe
    Cc: Jeff Moyer
    Cc: Nick Piggin
    Cc: Ying Han
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/vegard/kmemcheck: (39 commits)
    signal: fix __send_signal() false positive kmemcheck warning
    fs: fix do_mount_root() false positive kmemcheck warning
    fs: introduce __getname_gfp()
    trace: annotate bitfields in struct ring_buffer_event
    net: annotate struct sock bitfield
    c2port: annotate bitfield for kmemcheck
    net: annotate inet_timewait_sock bitfields
    ieee1394/csr1212: fix false positive kmemcheck report
    ieee1394: annotate bitfield
    net: annotate bitfields in struct inet_sock
    net: use kmemcheck bitfields API for skbuff
    kmemcheck: introduce bitfield API
    kmemcheck: add opcode self-testing at boot
    x86: unify pte_hidden
    x86: make _PAGE_HIDDEN conditional
    kmemcheck: make kconfig accessible for other architectures
    kmemcheck: enable in the x86 Kconfig
    kmemcheck: add hooks for the page allocator
    kmemcheck: add hooks for page- and sg-dma-mappings
    kmemcheck: don't track page tables
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (64 commits)
    debugfs: use specified mode to possibly mark files read/write only
    debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem.
    xen: remove driver_data direct access of struct device from more drivers
    usb: gadget: at91_udc: remove driver_data direct access of struct device
    uml: remove driver_data direct access of struct device
    block/ps3: remove driver_data direct access of struct device
    s390: remove driver_data direct access of struct device
    parport: remove driver_data direct access of struct device
    parisc: remove driver_data direct access of struct device
    of_serial: remove driver_data direct access of struct device
    mips: remove driver_data direct access of struct device
    ipmi: remove driver_data direct access of struct device
    infiniband: ehca: remove driver_data direct access of struct device
    ibmvscsi: gadget: at91_udc: remove driver_data direct access of struct device
    hvcs: remove driver_data direct access of struct device
    xen block: remove driver_data direct access of struct device
    thermal: remove driver_data direct access of struct device
    scsi: remove driver_data direct access of struct device
    pcmcia: remove driver_data direct access of struct device
    PCIE: remove driver_data direct access of struct device
    ...

    Manually fix up trivial conflicts due to different driver_data
    direct access fixups in drivers/block/{ps3disk.c,ps3vram.c}

    Linus Torvalds
     

16 Jun, 2009

2 commits


15 Jun, 2009

1 commit