25 Apr, 2010

3 commits

  • Add a missing EXPORT_SYMBOL.

    I must be the first person who wants to use this function :-)

    Signed-off-by: Hans Verkuil
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hans Verkuil
     
  • This patch fixes 2 issues with the LZO decompressor:

    - It doesn't handle the case where a block isn't compressed at all. In
    this case, calling lzo1x_decompress_safe() will fail, so we need to just
    use memcpy() instead (the upstream LZO code does something similar).

    - Since commit 54291362d2a5738e1b0495df2abcb9e6b0563a3f ("initramfs: add
    missing decompressor error check"), the decompressor return code is
    checked in init/initramfs.c. The LZO decompressor didn't return the
    expected value, causing the initramfs code to falsely believe a
    decompression error occurred.

    Signed-off-by: Albin Tonnerre
    Tested-by: bert schulze
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Albin Tonnerre
     
  • memset() is called with the wrong address and the kernel panics.

    Signed-off-by: Changli Gao
    Cc: Patrick McHardy
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Changli Gao
     

16 Apr, 2010

1 commit

  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86/gart: Disable GART explicitly before initialization
    dma-debug: Cleanup for copy-loop in filter_write()
    x86/amd-iommu: Remove obsolete parameter documentation
    x86/amd-iommu: use for_each_pci_dev
    Revert "x86: disable IOMMUs on kernel crash"
    x86/amd-iommu: warn when issuing command to uninitialized cmd buffer
    x86/amd-iommu: enable iommu before attaching devices
    x86/amd-iommu: Use helper function to destroy domain
    x86/amd-iommu: Report errors in acpi parsing functions upstream
    x86/amd-iommu: Pt mode fix for domain_destroy
    x86/amd-iommu: Protect IOMMU-API map/unmap path
    x86/amd-iommu: Remove double NULL check in check_device

    Linus Torvalds
     

15 Apr, 2010

1 commit

  • Commit ef0658f3de484bf9b173639cd47544584e01efa5 changed precision
    from int to s8.

    There is existing kernel code that uses a larger precision.

    An example from the audit code:
    vsnprintf(...,..., " msg='%.1024s'", (char *)data);
    which overflows precision and truncates to nothing.

    Extending precision size fixes the audit system issue.

    Other changes:

    - Change the size of struct printf_spec.type from u16 to u8 so
    sizeof(struct printf_spec) stays as small as possible.

    - Reorder the struct members so sizeof(struct printf_spec) remains 64
    bits without alignment holes.

    - Document the struct members a bit more.

    Original-patch-by: Eric Paris
    Signed-off-by: Joe Perches
    Tested-by: Justin P. Mattock
    Signed-off-by: Linus Torvalds

    Joe Perches
     

13 Apr, 2010

3 commits


10 Apr, 2010

2 commits

  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits)
    cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch
    loop: Update mtime when writing using aops
    block: expose the statistics in blkio.time and blkio.sectors for the root cgroup
    backing-dev: Handle class_create() failure
    Block: Fix block/elevator.c elevator_get() off-by-one error
    drbd: lc_element_by_index() never returns NULL
    cciss: unlock on error path
    cfq-iosched: Do not merge queues of BE and IDLE classes
    cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging
    i2o: Remove the dangerous kobj_to_i2o_device macro
    block: remove 16 bytes of padding from struct request on 64bits
    cfq-iosched: fix a kbuild regression
    block: make CONFIG_BLK_CGROUP visible
    Remove GENHD_FL_DRIVERFS
    block: Export max number of segments and max segment size in sysfs
    block: Finalize conversion of block limits functions
    block: Fix overrun in lcm() and move it to lib
    vfs: improve writeback_inodes_wb()
    paride: fix off-by-one test
    drbd: fix al-to-on-disk-bitmap for 4k logical_block_size
    ...

    Linus Torvalds
     
  • radix_tree_tag_get() is not safe to use concurrently with radix_tree_tag_set()
    or radix_tree_tag_clear(). The problem is that the double tag_get() in
    radix_tree_tag_get():

        if (!tag_get(node, tag, offset))
                saw_unset_tag = 1;
        if (height == 1) {
                int ret = tag_get(node, tag, offset);

    may see the value change due to the action of set/clear. RCU is no protection
    against this as no pointers are being changed, no nodes are being replaced
    according to a COW protocol - set/clear alter the node directly.

    The documentation in linux/radix-tree.h, however, says that
    radix_tree_tag_get() is an exception to the rule that "any function modifying
    the tree or tags (...) must exclude other modifications, and exclude any
    functions reading the tree".

    The problem is that the next statement in radix_tree_tag_get() checks that the
    tag doesn't vary over time:

    BUG_ON(ret && saw_unset_tag);

    This has been seen happening in FS-Cache:

    https://www.redhat.com/archives/linux-cachefs/2010-April/msg00013.html

    To this end, remove the BUG_ON() from radix_tree_tag_get() and note in various
    comments that the value of the tag may change whilst the RCU read lock is held,
    and thus that the return value of radix_tree_tag_get() may not be relied upon
    unless radix_tree_tag_set/clear() and radix_tree_delete() are excluded from
    running concurrently with it.

    Reported-by: Romain DEGEZ
    Signed-off-by: David Howells
    Acked-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    David Howells
     

08 Apr, 2010

1 commit

  • rwsems can be used with IRQs disabled, particularly in early boot
    before IRQs are enabled. Currently the spin_unlock_irq() usage in the
    slowpath will unconditionally enable interrupts and cause problems
    since interrupts are not yet initialized or enabled.

    This patch uses save/restore versions of IRQ spinlocks in the slowpath
    to ensure interrupts are not unintentionally enabled.

    Signed-off-by: Kevin Hilman
    Signed-off-by: Linus Torvalds

    Kevin Hilman
     

07 Apr, 2010

5 commits


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following:

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition, and for others adding it to an
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

25 Mar, 2010

1 commit


19 Mar, 2010

1 commit


15 Mar, 2010

4 commits


14 Mar, 2010

1 commit

  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf: Provide generic perf_sample_data initialization
    MAINTAINERS: Add Arnaldo as tools/perf/ co-maintainer
    perf trace: Don't use pager if scripting
    perf trace/scripting: Remove extraneous header read
    perf, ARM: Modify kuser rmb() call to compile for Thumb-2
    x86/stacktrace: Don't dereference bad frame pointers
    perf archive: Don't try to collect files without a build-id
    perf_events, x86: Fixup fixed counter constraints
    perf, x86: Restrict the ANY flag
    perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE
    perf, x86: add some IBS macros to perf_event.h
    perf, x86: make IBS macros available in perf_event.h
    hw-breakpoints: Remove stub unthrottle callback
    x86/hw-breakpoints: Remove the name field
    perf: Remove pointless breakpoint union
    perf lock: Drop the buffers multiplexing dependency
    perf lock: Fix and add misc documentally things
    percpu: Add __percpu sparse annotations to hw_breakpoint

    Linus Torvalds
     

13 Mar, 2010

2 commits

  • inflate_fast() can do either POST INC or PRE INC on its pointers walking
    the memory to decompress. Default is PRE INC.

    The sout pointer offset was miscalculated in one case as the calculation
    assumed sout was a char *. This breaks inflate_fast() iff configured to
    do POST INC.

    Signed-off-by: Joakim Tjernlund
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joakim Tjernlund
     
  • Commit 6846ee5ca68d81e6baccf0d56221d7a00c1be18b ("zlib: Fix build of
    powerpc boot wrapper") made the new optimized inflate only available on
    arch's that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS.

    This patch will again enable the optimization for all arch's by defining
    our own endian independent version of unaligned access. As an added
    bonus, arch's that define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS do a
    plain load instead.

    Signed-off-by: Joakim Tjernlund
    Cc: Anton Blanchard
    Cc: Benjamin Herrenschmidt
    Cc: David Woodhouse
    Cc: Kumar Gala
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joakim Tjernlund
     

10 Mar, 2010

1 commit


08 Mar, 2010

3 commits

  • Constify struct sysfs_ops.

    This is part of the ops structure constification
    effort started by Arjan van de Ven et al.

    Benefits of this constification:

    * prevents modification of data that is shared
    (referenced) by many other structure instances
    at runtime

    * detects/prevents accidental (but not intentional)
    modification attempts on archs that enforce
    read-only kernel data at runtime

    * potentially better optimized code as the compiler
    can assume that the const data cannot be changed

    * the compiler/linker move const data into .rodata
    and therefore exclude them from false sharing

    Signed-off-by: Emese Revfy
    Acked-by: David Teigland
    Acked-by: Matt Domsch
    Acked-by: Maciej Sosnowski
    Acked-by: Hans J. Koch
    Acked-by: Pekka Enberg
    Acked-by: Jens Axboe
    Acked-by: Stephen Hemminger
    Signed-off-by: Greg Kroah-Hartman

    Emese Revfy
     
  • Constify struct kset_uevent_ops.

    This is part of the ops structure constification
    effort started by Arjan van de Ven et al.

    Benefits of this constification:

    * prevents modification of data that is shared
    (referenced) by many other structure instances
    at runtime

    * detects/prevents accidental (but not intentional)
    modification attempts on archs that enforce
    read-only kernel data at runtime

    * potentially better optimized code as the compiler
    can assume that the const data cannot be changed

    * the compiler/linker move const data into .rodata
    and therefore exclude them from false sharing

    Signed-off-by: Emese Revfy
    Signed-off-by: Greg Kroah-Hartman

    Emese Revfy
     
  • This reverts commit a069c266ae5fdfbf5b4aecf2c672413aa33b2504.

    It turns out that not only was it missing a case (XFS) that needed it,
    but perhaps more importantly, people sometimes want to enable new
    modules that they hadn't had enabled before, and if such a module uses
    list_sort(), it can't easily be inserted any more.

    So rather than add a "select LIST_SORT" to the XFS case, just leave it
    compiled in. It's not all _that_ big, after all, and the inconvenience
    isn't worth it.

    Requested-by: Alexey Dobriyan
    Cc: Christoph Hellwig
    Cc: Don Mullis
    Cc: Andrew Morton
    Cc: Dave Chinner
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

07 Mar, 2010

10 commits

  • This adds separate I/O and memory specs, so we don't have to change the
    field width in a shared spec, which then lets us make all the specs const
    and static, since they never change.

    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Linus Torvalds

    Bjorn Helgaas
     
  • Add clues about what the SMALL and SPECIAL flags do.

    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Linus Torvalds

    Bjorn Helgaas
     
  • Reducing the size of struct printf_spec is a good thing because multiple
    instances are commonly passed on stack.

    It's possible for type to be u8 and field_width to be s8, but this is
    likely small enough for now.

    Signed-off-by: Joe Perches
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs:
    [LogFS] Change magic number
    [LogFS] Remove h_version field
    [LogFS] Check feature flags
    [LogFS] Only write journal if dirty
    [LogFS] Fix bdev erases
    [LogFS] Silence gcc
    [LogFS] Prevent 64bit divisions in hash_index
    [LogFS] Plug memory leak on error paths
    [LogFS] Add MAINTAINERS entry
    [LogFS] add new flash file system

    Fixed up trivial conflict in lib/Kconfig, and a semantic conflict in
    fs/logfs/inode.c introduced by write_inode() being changed to use
    writeback_control by commit a9185b41a4f84971b930c519f0c63bd450c4810d
    ("pass writeback_control to ->write_inode")

    Linus Torvalds
     
  • Signed-off-by: Joakim Tjernlund
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joakim Tjernlund
     
  • Replace open-coded loop with for_each_set_bit().

    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • The function name must be followed by a space, hyphen, space, and a short
    description.

    Signed-off-by: Ben Hutchings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Hutchings
     
  • Build list_sort() only for configs that need it -- configs that don't
    need it save ~581 bytes (i386).

    Signed-off-by: Don Mullis
    Cc: Dave Airlie
    Cc: Andi Kleen
    Cc: Dave Chinner
    Cc: Artem Bityutskiy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Don Mullis
     
  • Clarify and correct header comment of list_sort().

    Signed-off-by: Don Mullis
    Cc: Dave Airlie
    Cc: Andi Kleen
    Cc: Dave Chinner
    Cc: Artem Bityutskiy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Don Mullis
     
  • XFS and UBIFS can pass long lists to list_sort(); this alternative
    implementation scales better, reaching ~3x performance gain when list
    length exceeds the L2 cache size.

    Stand-alone program timings were run on a Core 2 duo L1=32KB L2=4MB,
    gcc-4.4, with flags extracted from an Ubuntu kernel build. Object size is
    581 bytes compared to 455 for Mark J. Roberts' code.

    Worst case for either implementation is a list length just over a power of
    two, and to roughly the same degree, so here are timing results for a
    range of 2^N+1 lengths. List elements were 16 bytes each including malloc
    overhead; initial order was random.

    time (msec):

    loop_count    length  Tatham-Roberts  generic-Mullis-v2  ratio
       4000000         2             206                294  1.427
       2000000         3             176                227  1.289
       1000000         5             199                172  0.864
        500000         9             235                178  0.757
        250000        17             243                182  0.748
        125000        33             261                196  0.750
         62500        65             277                209  0.754
         31250       129             292                219  0.75
         15625       257             317                235  0.741
          7812       513             340                252  0.741
          3906      1025             362                267  0.737
          1953      2049             388                283  0.729  ~ L1 size
           976      4097             556                323  0.580
           488      8193             678                361  0.532
           244     16385             773                395  0.510
           122     32769             844                418  0.495
            61     65537             917                454  0.495
            30    131073            1128                543  0.481
            15    262145            2355                869  0.369  ~ L2 size
             7    524289            5597               1714  0.306
             3   1048577            6218               2022  0.325

    Mark's code does not actually implement the usual or generic mergesort,
    but rather a variant from Simon Tatham described here:

    http://www.chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html

    Simon's algorithm performs O(log N) passes over the entire input list,
    doing merges of sublists that double in size on each pass. The generic
    algorithm instead merges pairs of equal length lists as early as possible,
    in recursive order. For either algorithm, the elements that extend the
    list beyond power-of-two length are a special case, handled as nearly as
    possible as a "rounding-up" to a full POT.

    Some intuition for the locality of reference implications of merge order
    may be gotten by watching this animation:

    http://www.sorting-algorithms.com/merge-sort

    Simon's algorithm requires only O(1) extra space rather than the generic
    algorithm's O(log N), but in my non-recursive implementation the actual
    O(log N) data is merely a vector of ~20 pointers, which I've put on the
    stack.

    Long-running list_sort() calls: If the list passed in may be long, or the
    client's cmp() callback function is slow, the client's cmp() may
    periodically invoke cond_resched() to voluntarily yield the CPU. All
    inner loops of list_sort() call back to cmp().

    Stability of the sort: distinct elements that compare equal emerge from
    the sort in the same order as with Mark's code, for simple test cases. A
    boot-time test is provided to verify this and other correctness
    requirements.

    A kernel that uses drm.ko appears to run normally with this change; I have
    no suitable hardware to similarly test the use by UBIFS.

    [akpm@linux-foundation.org: style tweaks, fix comment, make list_sort_test __init]
    Signed-off-by: Don Mullis
    Cc: Dave Airlie
    Cc: Andi Kleen
    Cc: Dave Chinner
    Cc: Artem Bityutskiy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Don Mullis