18 Dec, 2012

8 commits

  • Fix this warning:

    lib/rbtree_test.c: In function `check':
    lib/rbtree_test.c:121: warning: `blacks' may be used uninitialized in this function

    Signed-off-by: Cong Ding
    Cc: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cong Ding
     
  • Currently only block_dev and uprobes use percpu_rw_semaphore,
    add the config option selected by BLOCK || UPROBES.

    Signed-off-by: Oleg Nesterov
    Cc: Anton Arapov
    Cc: Ingo Molnar
    Cc: Linus Torvalds
    Cc: Michal Marek
    Cc: Mikulas Patocka
    Cc: "Paul E. McKenney"
    Cc: Peter Zijlstra
    Cc: Srikar Dronamraju
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • Add lockdep annotations. Not only can this help to find potential
    problems, we also do not want false warnings if, say, a task takes two
    different percpu_rw_semaphores for reading. IOW, at least ->rw_sem
    should not use a single class.

    This patch exposes this internal lock to lockdep so that it represents the
    whole percpu_rw_semaphore. This way we do not need to add another "fake"
    ->lockdep_map and lock_class_key. More importantly, this also makes the
    output from lockdep much more understandable if it finds the problem.

    In short, with this patch from lockdep pov percpu_down_read() and
    percpu_up_read() acquire/release ->rw_sem for reading, this matches the
    actual semantics. This abuses __up_read() but I hope this is fine and in
    fact I'd like to have down_read_no_lockdep() as well,
    percpu_down_read_recursive_readers() will need it.

    Signed-off-by: Oleg Nesterov
    Cc: Anton Arapov
    Cc: Ingo Molnar
    Cc: Linus Torvalds
    Cc: Michal Marek
    Cc: Mikulas Patocka
    Cc: "Paul E. McKenney"
    Cc: Peter Zijlstra
    Cc: Srikar Dronamraju
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • percpu_rw_semaphore->writer_mutex was only added to simplify the initial
    rewrite, the only thing it protects is clear_fast_ctr() which otherwise
    could be called by multiple writers. ->rw_sem is enough to serialize the
    writers.

    Kill this mutex and add "atomic_t write_ctr" instead. The writers
    increment/decrement this counter, the readers check it is zero instead of
    mutex_is_locked().

    Move atomic_add(clear_fast_ctr(), slow_read_ctr) under down_write() to
    avoid the race with other writers. This is a bit sub-optimal, only the
    first writer needs this and we do not need to exclude the readers at this
    stage. But this is simple, we do not want another internal lock until we
    add more features.

    And this speeds up the write-contended case. Before this patch the racing
    writers sleep in synchronize_sched_expedited() sequentially, with this
    patch multiple synchronize_sched_expedited's can "overlap" with each
    other. Note: we can do more optimizations, this is only the first step.

    Signed-off-by: Oleg Nesterov
    Cc: Anton Arapov
    Cc: Ingo Molnar
    Cc: Linus Torvalds
    Cc: Michal Marek
    Cc: Mikulas Patocka
    Cc: "Paul E. McKenney"
    Cc: Peter Zijlstra
    Cc: Srikar Dronamraju
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
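
The counter-based check described in the entry above can be sketched in userspace C11. This is only an illustration: the `toy_` names are hypothetical, and the sketch leaves out ->rw_sem, the per-CPU fast counters, and synchronize_sched_expedited() entirely. It shows just why a counter, unlike a mutex, lets several writers be "in flight" at once while readers still see a single yes/no answer.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the semaphore; only the writer counter is modelled. */
struct toy_rwsem {
    atomic_int write_ctr;   /* number of writers currently active or pending */
};

static void toy_init(struct toy_rwsem *sem)
{
    atomic_init(&sem->write_ctr, 0);
}

/* A writer announces itself before doing any slow work. */
static void toy_writer_begin(struct toy_rwsem *sem)
{
    atomic_fetch_add(&sem->write_ctr, 1);
}

/* A writer retires; readers may use the fast path once the count is zero. */
static void toy_writer_end(struct toy_rwsem *sem)
{
    atomic_fetch_sub(&sem->write_ctr, 1);
}

/* Readers test the counter where the old code tested mutex_is_locked(). */
static bool toy_reader_fast_path_ok(struct toy_rwsem *sem)
{
    return atomic_load(&sem->write_ctr) == 0;
}
```

With two overlapping writers the fast path stays disabled until both have called `toy_writer_end()`.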
     
  • Currently the writer does msleep() plus synchronize_sched() 3 times to
    acquire/release the semaphore, and during this time the readers are
    blocked completely. Even if the "write" section was not actually started
    or if it was already finished.

    With this patch down_write/up_write does synchronize_sched() twice and
    down_read/up_read are still possible during this time, just they use the
    slow path.

    percpu_down_write() first forces the readers to use rw_semaphore and
    increment the "slow" counter to take the lock for reading, then it
    takes that rw_semaphore for writing and blocks the readers.

    Also. With this patch the code relies on the documented behaviour of
    synchronize_sched(), it doesn't try to pair synchronize_sched() with
    barrier.

    Signed-off-by: Oleg Nesterov
    Reviewed-by: Paul E. McKenney
    Cc: Linus Torvalds
    Cc: Mikulas Patocka
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Srikar Dronamraju
    Cc: Ananth N Mavinakayanahalli
    Cc: Anton Arapov
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • This is another step towards better standard conformance. Rather than
    adding a local buffer to store the specified portion of the string (with
    the need to enforce an arbitrary maximum supported width to limit the
    buffer size), do a maximum width conversion and then drop as much of it as
    is necessary to meet the caller's request.

    Also fail on negative field widths.

    Uses the deprecated simple_strto*() functions because kstrtoXX() fail on
    non-zero terminated strings.

    Signed-off-by: Jan Beulich
    Cc: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
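
The "convert at maximum width, then drop the excess" idea from the entry above can be sketched in userspace as follows. This is not the kernel's vsscanf() code: `toy_parse_width` is a hypothetical helper, it uses the C library's strtoull() in place of simple_strtoull(), and it assumes an explicit non-zero base and an input short enough not to overflow during the full-width conversion.

```c
#include <stdlib.h>

/*
 * Convert the unsigned number at the start of `s`, then trim it back to at
 * most `width` characters. A `width` of 0 means "no limit" in this sketch.
 * On return, *end points just past the last character that was kept.
 */
static unsigned long long toy_parse_width(const char *s, int width,
                                          unsigned int base, const char **end)
{
    char *next;
    unsigned long long val = strtoull(s, &next, (int)base);

    /* Each division by the base discards exactly one trailing digit. */
    while (width > 0 && next - s > width) {
        val /= base;
        --next;
    }
    *end = next;
    return val;
}
```

For the input "12345" with a width of 3 and base 10, the full conversion yields 12345 and two divisions by 10 leave 123, with `*end` pointing at the '4'. No intermediate buffer, and therefore no arbitrary maximum width, is needed.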
     
  • Remove the custom implementation of the functionality similar to kbasename().

    Signed-off-by: Andy Shevchenko
    Cc: Jason Baron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
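
For readers unfamiliar with the helper, a kbasename()-like function returns the part of a path after its last '/'. The userspace sketch below uses the hypothetical name `toy_basename` and is meant only to show the behaviour that the removed custom code duplicated.

```c
#include <string.h>

/* Return the final path component, or the whole string if it has no '/'. */
static const char *toy_basename(const char *path)
{
    const char *tail = strrchr(path, '/');

    return tail ? tail + 1 : path;
}
```

Note that the result points into the original string, so nothing is allocated or copied.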
     
  • Documentation/printk-formats.txt says to use %zd for a ssize_t argument
    and some drivers do. Unfortunately this prints a positive number for
    negative values eg:

    tpm_tis 70030000.tpm_tis: tpm_transmit: tpm_send: error 4294967234

    Add a case that fetches the argument with va_arg() as ssize_t when the
    interpretation should be signed.

    Tested on PPC32.

    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jason Gunthorpe
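
The number in the tpm_tis message above is what -62 looks like when a 32-bit signed value is read back as unsigned (4294967296 - 62 = 4294967234). The sketch below reproduces that in userspace with fixed-width types; `toy_format_ssize32` is a hypothetical helper standing in for the signed/unsigned choice the formatter has to make, and it assumes a 32-bit ssize_t as on PPC32.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit value either as signed (the fixed behaviour) or as
 * unsigned (the behaviour that produced the misleading error number).
 */
static void toy_format_ssize32(char *buf, size_t len, int32_t value,
                               int treat_as_signed)
{
    if (treat_as_signed)
        snprintf(buf, len, "%" PRId32, value);
    else
        snprintf(buf, len, "%" PRIu32, (uint32_t)value);
}
```

Called with -62, the unsigned branch yields "4294967234" and the signed branch yields "-62".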
     

17 Dec, 2012

1 commit

  • …ernel/git/konrad/swiotlb

    Pull swiotlb update from Konrad Rzeszutek Wilk:
    "Feature:
    - Use dma addresses instead of the virt_to_phys and vice versa
    functions.

    Remove the multitude of phys_to_virt/virt_to_phys calls and instead
    operate on the physical addresses instead of virtual in many of the
    internal functions. This does provide a speed up in interrupt
    handlers that do DMA operations and use SWIOTLB."

    * tag 'stable/for-linus-3.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
    swiotlb: Do not export swiotlb_bounce since there are no external consumers
    swiotlb: Use physical addresses instead of virtual in swiotlb_tbl_sync_single
    swiotlb: Use physical addresses for swiotlb_tbl_unmap_single
    swiotlb: Return physical addresses when calling swiotlb_tbl_map_single
    swiotlb: Make io_tlb_overflow_buffer a physical address
    swiotlb: Make io_tlb_start a physical address instead of a virtual one
    swiotlb: Make io_tlb_end a physical address instead of a virtual one

    Linus Torvalds
     

15 Dec, 2012

1 commit

  • Pull x86 ACPI update from Peter Anvin:
    "This is a patchset which didn't make the last merge window. It adds a
    debugging capability to feed ACPI tables via the initramfs.

    On a grander scope, it formalizes using the initramfs protocol for
    feeding arbitrary blobs which need to be accessed early to the kernel:
    they are fed first in the initramfs blob (lots of bootloaders can
    concatenate this at boot time, others can use a single file) in an
    uncompressed cpio archive using filenames starting with "kernel/".

    The ACPI maintainers requested that this patchset be fed via the x86
    tree rather than the ACPI tree as the footprint in the general x86
    code is much bigger than in the ACPI code proper."

    * 'x86-acpi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    X86 ACPI: Use #ifdef not #if for CONFIG_X86 check
    ACPI: Fix build when disabled
    ACPI: Document ACPI table overriding via initrd
    ACPI: Create acpi_table_taint() function to avoid code duplication
    ACPI: Implement physical address table override
    ACPI: Store valid ACPI tables passed via early initrd in reserved memblock areas
    x86, acpi: Introduce x86 arch specific arch_reserve_mem_area() for e820 handling
    lib: Add early cpio decoder

    Linus Torvalds
     

14 Dec, 2012

1 commit

  • Pull trivial branch from Jiri Kosina:
    "Usual stuff -- comment/printk typo fixes, documentation updates, dead
    code elimination."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
    HOWTO: fix double words typo
    x86 mtrr: fix comment typo in mtrr_bp_init
    propagate name change to comments in kernel source
    doc: Update the name of profiling based on sysfs
    treewide: Fix typos in various drivers
    treewide: Fix typos in various Kconfig
    wireless: mwifiex: Fix typo in wireless/mwifiex driver
    messages: i2o: Fix typo in messages/i2o
    scripts/kernel-doc: check that non-void fcts describe their return value
    Kernel-doc: Convention: Use a "Return" section to describe return values
    radeon: Fix typo and copy/paste error in comments
    doc: Remove unnecessary declarations from Documentation/accounting/getdelays.c
    various: Fix spelling of "asynchronous" in comments.
    Fix misspellings of "whether" in comments.
    eisa: Fix spelling of "asynchronous".
    various: Fix spelling of "registered" in comments.
    doc: fix quite a few typos within Documentation
    target: iscsi: fix comment typos in target/iscsi drivers
    treewide: fix typo of "suport" in various comments and Kconfig
    treewide: fix typo of "suppport" in various comments
    ...

    Linus Torvalds
     

12 Dec, 2012

4 commits

  • Pull RCU update from Ingo Molnar:
    "The major features of this tree are:

    1. A first version of no-callbacks CPUs. This version prohibits
    offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y.
    Relaxing this constraint is in progress, but not yet ready
    for prime time. These commits were posted to LKML at
    https://lkml.org/lkml/2012/10/30/724.

    2. Changes to SRCU that allow statically initialized srcu_struct
    structures. These commits were posted to LKML at
    https://lkml.org/lkml/2012/10/30/296.

    3. Restructuring of RCU's debugfs output. These commits were posted
    to LKML at https://lkml.org/lkml/2012/10/30/341.

    4. Additional CPU-hotplug/RCU improvements, posted to LKML at
    https://lkml.org/lkml/2012/10/30/327.
    Note that the commit eliminating __stop_machine() was judged to
    be too-high of risk, so is deferred to 3.9.

    5. Changes to RCU's idle interface, most notably a new module
    parameter that redirects normal grace-period operations to
    their expedited equivalents. These were posted to LKML at
    https://lkml.org/lkml/2012/10/30/739.

    6. Additional diagnostics for RCU's CPU stall warning facility,
    posted to LKML at https://lkml.org/lkml/2012/10/30/315.
    The most notable change reduces the
    default RCU CPU stall-warning time from 60 seconds to 21 seconds,
    so that it once again happens sooner than the softlockup timeout.

    7. Documentation updates, which were posted to LKML at
    https://lkml.org/lkml/2012/10/30/280.
    A couple of late-breaking changes were posted at
    https://lkml.org/lkml/2012/11/16/634 and
    https://lkml.org/lkml/2012/11/16/547.

    8. Miscellaneous fixes, which were posted to LKML at
    https://lkml.org/lkml/2012/10/30/309.

    9. Finally, a fix for a lockdep-RCU splat was posted to LKML
    at https://lkml.org/lkml/2012/11/7/486."

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
    context_tracking: New context tracking susbsystem
    sched: Mark RCU reader in sched_show_task()
    rcu: Separate accounting of callbacks from callback-free CPUs
    rcu: Add callback-free CPUs
    rcu: Add documentation for the new rcuexp debugfs trace file
    rcu: Update documentation for TREE_RCU debugfs tracing
    rcu: Reduce default RCU CPU stall warning timeout
    rcu: Fix TINY_RCU rcu_is_cpu_rrupt_from_idle check
    rcu: Clarify memory-ordering properties of grace-period primitives
    rcu: Add new rcutorture module parameters to start/end test messages
    rcu: Remove list_for_each_continue_rcu()
    rcu: Fix batch-limit size problem
    rcu: Add tracing for synchronize_sched_expedited()
    rcu: Remove old debugfs interfaces and also RCU flavor name
    rcu: split 'rcuhier' to each flavor
    rcu: split 'rcugp' to each flavor
    rcu: split 'rcuboost' to each flavor
    rcu: split 'rcubarrier' to each flavor
    rcu: Fix tracing formatting
    rcu: Remove the interface "rcudata.csv"
    ...

    Linus Torvalds
     
  • Merge misc updates from Andrew Morton:
    "About half of most of MM. Going very early this time due to
    uncertainty over the coreautounifiednumasched things. I'll send the
    other half of most of MM tomorrow. The rest of MM awaits a slab merge
    from Pekka."

    * emailed patches from Andrew Morton: (71 commits)
    memory_hotplug: ensure every online node has NORMAL memory
    memory_hotplug: handle empty zone when online_movable/online_kernel
    mm, memory-hotplug: dynamic configure movable memory and portion memory
    drivers/base/node.c: cleanup node_state_attr[]
    bootmem: fix wrong call parameter for free_bootmem()
    avr32, kconfig: remove HAVE_ARCH_BOOTMEM
    mm: cma: remove watermark hacks
    mm: cma: skip watermarks check for already isolated blocks in split_free_page()
    mm, oom: fix race when specifying a thread as the oom origin
    mm, oom: change type of oom_score_adj to short
    mm: cleanup register_node()
    mm, mempolicy: remove duplicate code
    mm/vmscan.c: try_to_freeze() returns boolean
    mm: introduce putback_movable_pages()
    virtio_balloon: introduce migration primitives to balloon pages
    mm: introduce compaction and migration for ballooned pages
    mm: introduce a common interface for balloon pages mobility
    mm: redefine address_space.assoc_mapping
    mm: adjust address_space_operations.migratepage() return code
    arch/sparc/kernel/sys_sparc_64.c: s/COLOUR/COLOR/
    ...

    Linus Torvalds
     
  • It is strange that alloc_bootmem() returns a virtual address and
    free_bootmem() requires a physical address. Anyway, free_bootmem()'s
    first parameter should be physical address.

    There are some call sites for free_bootmem() with virtual address. So fix
    them.

    [akpm@linux-foundation.org: improve free_bootmem() and free_bootmem_late() documentation]
    Signed-off-by: Joonsoo Kim
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Johannes Weiner
    Cc: FUJITA Tomonori
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Pull driver core updates from Greg Kroah-Hartman:
    "Here's the large driver core updates for 3.8-rc1.

    The biggest thing here is the various __dev* marking removals. This
    is going to be a pain for the merge with different subsystem trees, I
    know, but all of the patches included here have been ACKed by their
    various subsystem maintainers, as they wanted them to go through here.

    If this is too much of a pain, I can pull all of them out of this tree
    and just send you one with the other fixes/updates and then, after
    3.8-rc1 is out, do the rest of the removals to ensure we catch them
    all, it's up to you. The merges should all be trivial, and Stephen
    has been doing them all in linux-next for a few weeks now quite
    easily.

    Other than the __dev* marking removals, there's nothing major here,
    some firmware loading updates and other minor things in the driver
    core.

    All of these have (much to Stephen's annoyance), been in linux-next
    for a while.

    Signed-off-by: Greg Kroah-Hartman"

    Fixed up trivial conflicts in drivers/gpio/gpio-{em,stmpe}.c due to gpio
    update.

    * tag 'driver-core-3.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (93 commits)
    modpost.c: Stop checking __dev* section mismatches
    init.h: Remove __dev* sections from the kernel
    acpi: remove use of __devinit
    PCI: Remove __dev* markings
    PCI: Always build setup-bus when PCI is enabled
    PCI: Move pci_uevent into pci-driver.c
    PCI: Remove CONFIG_HOTPLUG ifdefs
    unicore32/PCI: Remove CONFIG_HOTPLUG ifdefs
    sh/PCI: Remove CONFIG_HOTPLUG ifdefs
    powerpc/PCI: Remove CONFIG_HOTPLUG ifdefs
    mips/PCI: Remove CONFIG_HOTPLUG ifdefs
    microblaze/PCI: Remove CONFIG_HOTPLUG ifdefs
    dma: remove use of __devinit
    dma: remove use of __devexit_p
    firewire: remove use of __devinitdata
    firewire: remove use of __devinit
    leds: remove use of __devexit
    leds: remove use of __devinit
    leds: remove use of __devexit_p
    mmc: remove use of __devexit
    ...

    Linus Torvalds
     

06 Dec, 2012

2 commits

  • I've legally changed my name with New York State, the US Social Security
    Administration, et al. This patch propagates the name change and change
    in initials and login to comments in the kernel source as well.

    Signed-off-by: Nadia Yvette Chambers
    Signed-off-by: Jiri Kosina

    Nadia Yvette Chambers
     
  • It is $(obj)/oid_registry.o that is dependent on $(obj)/oid_registry_data.c.
    The object file cannot be built until $(obj)/oid_registry_data.c has been
    generated.

    A periodic and hard to reproduce parallel build failure is due to
    this incorrect lib/Makefile dependency. The compile error is completely
    disingenuous.

    GEN lib/oid_registry_data.c
    Compiling 49 OIDs
    CC lib/oid_registry.o
    gcc: error: lib/oid_registry.c: No such file or directory
    gcc: fatal error: no input files
    compilation terminated.
    make[3]: *** [lib/oid_registry.o] Error 4

    Cc: Andrew Morton
    Cc: Akinobu Mita
    Cc: Michel Lespinasse
    Cc: David Howells
    Cc: "David S. Miller"
    Signed-off-by: Tim Gardner
    Signed-off-by: Rusty Russell

    Tim Gardner
     

05 Dec, 2012

1 commit

  • Fix an error in asn1_find_indefinite_length() whereby small definite length
    elements of size 0x7f are incorrectly classified as non-small. Without this
    fix, an error will be given as the length of the length will be perceived as
    being very much greater than the maximum supported size.

    Signed-off-by: David Howells
    Signed-off-by: Rusty Russell

    David Howells
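
The boundary in question comes from the BER/DER length encoding: a first length octet of 0x00 to 0x7f is the length itself (short form), 0x80 marks an indefinite length, and 0x81 and above give the number of length octets that follow. The sketch below is a standalone decoder written for illustration, not the code of asn1_find_indefinite_length(); all `toy_` names are hypothetical.

```c
#include <stddef.h>

enum toy_len_kind { TOY_LEN_DEFINITE, TOY_LEN_INDEFINITE, TOY_LEN_ERROR };

/*
 * Decode the length octets at `p` (with `avail` bytes readable).
 * On success, *len is the content length and *consumed the number of
 * length octets read.
 */
static enum toy_len_kind toy_der_length(const unsigned char *p, size_t avail,
                                        size_t *len, size_t *consumed)
{
    size_t n, i, value = 0;

    if (avail < 1)
        return TOY_LEN_ERROR;

    if (p[0] <= 0x7f) {             /* short form; 0x7f itself is included */
        *len = p[0];
        *consumed = 1;
        return TOY_LEN_DEFINITE;
    }
    if (p[0] == 0x80) {             /* indefinite form, ended by end-of-contents */
        *len = 0;
        *consumed = 1;
        return TOY_LEN_INDEFINITE;
    }

    n = p[0] & 0x7f;                /* long form: n length octets follow */
    if (n > sizeof(size_t) || avail < 1 + n)
        return TOY_LEN_ERROR;
    for (i = 0; i < n; i++)
        value = (value << 8) | p[1 + i];
    *len = value;
    *consumed = 1 + n;
    return TOY_LEN_DEFINITE;
}
```

Writing the short-form test as `< 0x7f` instead of `<= 0x7f` would push a 127-byte element into the long-form branch, where 0x7f is read as "127 length octets follow", which matches the symptom described in the entry.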
     

03 Dec, 2012

2 commits

  • Correct spelling typo within various Kconfig.

    Signed-off-by: Masanari Iida
    Signed-off-by: Jiri Kosina

    Masanari Iida
     
  • …/linux-rcu into core/rcu

    Conflicts:
    arch/x86/kernel/ptrace.c

    Pull the latest RCU tree from Paul E. McKenney:

    " The major features of this series are:

    1. A first version of no-callbacks CPUs. This version prohibits
    offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y.
    Relaxing this constraint is in progress, but not yet ready
    for prime time. These commits were posted to LKML at
    https://lkml.org/lkml/2012/10/30/724, and are at branch rcu/nocb.

    2. Changes to SRCU that allow statically initialized srcu_struct
    structures. These commits were posted to LKML at
    https://lkml.org/lkml/2012/10/30/296, and are at branch rcu/srcu.

    3. Restructuring of RCU's debugfs output. These commits were posted
    to LKML at https://lkml.org/lkml/2012/10/30/341, and are at
    branch rcu/tracing.

    4. Additional CPU-hotplug/RCU improvements, posted to LKML at
    https://lkml.org/lkml/2012/10/30/327, and are at branch rcu/hotplug.
    Note that the commit eliminating __stop_machine() was judged to
    be too-high of risk, so is deferred to 3.9.

    5. Changes to RCU's idle interface, most notably a new module
    parameter that redirects normal grace-period operations to
    their expedited equivalents. These were posted to LKML at
    https://lkml.org/lkml/2012/10/30/739, and are at branch rcu/idle.

    6. Additional diagnostics for RCU's CPU stall warning facility,
    posted to LKML at https://lkml.org/lkml/2012/10/30/315, and
    are at branch rcu/stall. The most notable change reduces the
    default RCU CPU stall-warning time from 60 seconds to 21 seconds,
    so that it once again happens sooner than the softlockup timeout.

    7. Documentation updates, which were posted to LKML at
    https://lkml.org/lkml/2012/10/30/280, and are at branch rcu/doc.
    A couple of late-breaking changes were posted at
    https://lkml.org/lkml/2012/11/16/634 and
    https://lkml.org/lkml/2012/11/16/547.

    8. Miscellaneous fixes, which were posted to LKML at
    https://lkml.org/lkml/2012/10/30/309, along with a late-breaking
    change posted at Fri, 16 Nov 2012 11:26:25 -0800 with message-ID
    <20121116192625.GA447@linux.vnet.ibm.com>, but which lkml.org
    seems to have missed. These are at branch rcu/fixes.

    9. Finally, a fix for a lockdep-RCU splat was posted to LKML
    at https://lkml.org/lkml/2012/11/7/486. This is at rcu/next. "

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

29 Nov, 2012

1 commit


24 Nov, 2012

1 commit

  • Since 4.4 GCC on MIPS no longer recognizes the "h" constraint,
    leading to this build failure:

    CC lib/mpi/generic_mpih-mul1.o
    lib/mpi/generic_mpih-mul1.c: In function 'mpihelp_mul_1':
    lib/mpi/generic_mpih-mul1.c:50:3: error: impossible constraint in 'asm'

    This patch updates MPI with the latest umul_ppmm implementations for MIPS.

    Signed-off-by: Manuel Lauss
    Cc: Linux-MIPS
    Cc: Dmitry Kasatkin
    Cc: James Morris
    Patchwork: https://patchwork.linux-mips.org/patch/4612/
    Signed-off-by: Ralf Baechle

    Manuel Lauss
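
The patch itself is not shown in this log. As background, the primitive being updated multiplies two machine words and returns the full double-word product as a high and a low half. A portable sketch for 32-bit words, using a wider integer type instead of inline assembly, might look like this (`toy_umul_ppmm` is a hypothetical name, and this is not claimed to be the MIPS implementation):

```c
#include <stdint.h>

/* Multiply two 32-bit words and split the 64-bit product into two halves. */
static void toy_umul_ppmm(uint32_t *hi, uint32_t *lo, uint32_t u, uint32_t v)
{
    uint64_t product = (uint64_t)u * v;

    *hi = (uint32_t)(product >> 32);
    *lo = (uint32_t)product;
}
```

Because nothing here names a specific register, a formulation like this does not depend on an asm constraint such as "h".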
     

14 Nov, 2012

1 commit


30 Oct, 2012

7 commits

  • Currently swiotlb is the only consumer for swiotlb_bounce. Since that is the
    case it doesn't make much sense to be exporting it so make it a static
    function only.

    In addition we can save a few more lines of code by making it so that it
    accepts the DMA address as a physical address instead of a virtual one. This
    is the last piece in essentially pushing all of the DMA address values to use
    physical addresses in swiotlb.

    In order to clarify things since we now have 2 physical addresses in use
    inside of swiotlb_bounce I am renaming phys to orig_addr, and dma_addr to
    tlb_addr. This way it should be clear that orig_addr is contained within
    io_orig_addr and tlb_addr is an address within the io_tlb_addr buffer.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
     
  • This change makes it so that the sync functionality also uses physical
    addresses. This helps to further reduce the use of virt_to_phys and
    phys_to_virt functions.

    In order to clarify things since we now have 2 physical addresses in use
    inside of swiotlb_tbl_sync_single I am renaming phys to orig_addr, and
    dma_addr to tlb_addr. This way it should be clear that orig_addr is
    contained within io_orig_addr and tlb_addr is an address within the
    io_tlb_addr buffer.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
     
  • This change makes it so that the unmap functionality also uses physical
    addresses. This helps to further reduce the use of virt_to_phys and
    phys_to_virt functions.

    In order to clarify things since we now have 2 physical addresses in use
    inside of swiotlb_tbl_unmap_single I am renaming phys to orig_addr, and
    dma_addr to tlb_addr. This way it should be clear that orig_addr is
    contained within io_orig_addr and tlb_addr is an address within the
    io_tlb_addr buffer.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
     
  • This change makes it so that swiotlb_tbl_map_single will return a physical
    address instead of a virtual address when called. The advantage to this once
    again is that we are avoiding a number of virt_to_phys and phys_to_virt
    translations by working with everything as a physical address.

    One change I had to make in order to support using physical addresses is that
    I could no longer trust 0 to be an invalid physical address on all platforms.
    So instead I made it so that ~0 is returned on error. This should never be a
    valid return value as it implies that only one byte would be available for
    use.

    In order to clarify things since we now have 2 physical addresses in use
    inside of swiotlb_tbl_map_single I am renaming phys to orig_addr, and
    dma_addr to tlb_addr. This way it should be clear that orig_addr is
    contained within io_orig_addr and tlb_addr is an address within the
    io_tlb_addr buffer.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
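
The switch from 0 to ~0 as the error value can be illustrated with a toy allocator. Everything below is hypothetical (the `toy_` types, the bump-style pool, the 64-bit address type); it only shows why an all-ones sentinel stays unambiguous when physical address 0 is a legitimate result.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t toy_phys_addr_t;           /* assumed width, for illustration */

#define TOY_MAP_ERROR (~(toy_phys_addr_t)0) /* all ones: never a usable start */

struct toy_pool {
    toy_phys_addr_t start;  /* physical address of the first byte */
    size_t size;            /* total bytes in the pool */
    size_t used;            /* bytes already handed out */
};

/* Hand out `len` bytes from the pool, or TOY_MAP_ERROR if they do not fit. */
static toy_phys_addr_t toy_map_single(struct toy_pool *pool, size_t len)
{
    toy_phys_addr_t addr;

    if (len == 0 || len > pool->size - pool->used)
        return TOY_MAP_ERROR;

    addr = pool->start + pool->used;
    pool->used += len;
    return addr;
}
```

With a pool that starts at physical address 0, the first successful mapping returns 0. A caller that tested for failure with `!addr` would misread that result, whereas a comparison against `TOY_MAP_ERROR` does not.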
     
  • This change makes it so that we can avoid virt_to_phys overhead when using the
    io_tlb_overflow_buffer. My original plan was to completely remove the value
    and replace it with a constant but I had seen that there were recent patches
    that stated this couldn't be done until all device drivers that depended on
    that functionality be updated.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
     
  • This change replaces all references to the virtual address for io_tlb_start
    with references to the physical address io_tlb_start. The main advantage of
    replacing the virtual address with a physical address is that we can avoid
    having to do multiple translations from the virtual address to the physical
    one needed for testing an existing DMA address.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
     
  • This change replaces all references to the virtual address for io_tlb_end
    with references to the physical address io_tlb_end. The main advantage of
    replacing the virtual address with a physical address is that we can avoid
    having to do multiple translations from the virtual address to the physical
    one needed for testing an existing DMA address.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
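
The benefit described in the two entries above is easiest to see in the "is this DMA address inside the bounce buffer?" test. Once both bounds are stored as physical addresses, the check is a plain range comparison with no address translation. The sketch below uses hypothetical `toy_` names and an assumed 64-bit physical address type; it is not the kernel's function.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t toy_phys_addr_t;   /* assumed width, for illustration */

/* Bounce-buffer bounds kept as physical addresses: [start, end). */
struct toy_tlb_range {
    toy_phys_addr_t start;
    toy_phys_addr_t end;
};

/* True if `paddr` lies inside the bounce buffer. */
static bool toy_is_bounce_buffer(const struct toy_tlb_range *r,
                                 toy_phys_addr_t paddr)
{
    return paddr >= r->start && paddr < r->end;
}
```

With virtual bounds, each side of the comparison would first need to be translated to a physical address on every call.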
     

26 Oct, 2012

1 commit

  • The genalloc code uses the bitmap API from include/linux/bitmap.h and
    lib/bitmap.c, which is based on long values. Both bitmap_set from
    lib/bitmap.c and bitmap_set_ll, which is the lockless version from
    genalloc.c, use BITMAP_LAST_WORD_MASK to set the first bits in a long in
    the bitmap.

    That one uses (1 << bits) - 1, 0b111, if you are setting the first three
    bits. This means that the API counts from the least significant bits
    (LSB from now on) to the MSB. The LSB in the first long is bit 0, then.
    The same works for the lookup functions.

    The genalloc code uses longs for the bitmap, as it should. In
    include/linux/genalloc.h, struct gen_pool_chunk has unsigned long
    bits[0] as its last member. When allocating the struct, genalloc should
    reserve enough space for the bitmap. This should be a proper number of
    longs that can fit the amount of bits in the bitmap.

    However, genalloc allocates an integer number of bytes that fit the
    amount of bits, but may not be an integer amount of longs. 9 bytes, for
    example, could be allocated for 70 bits.

    This is a problem in itself if the Least Significant Bit in a long is in
    the byte with the largest address, which happens in Big Endian machines.
    This means genalloc is not allocating the byte in which it will try to
    set or check for a bit.

    This may end up in memory corruption, where genalloc will try to set the
    bits it has not allocated. In fact, genalloc may not set these bits
    because it may find them already set, because they were not zeroed since
    they were not allocated. And that's what causes a BUG when
    gen_pool_destroy is called and checks for any set bits.

    What really happens is that genalloc uses kmalloc_node with __GFP_ZERO
    on gen_pool_add_virt. With SLAB and SLUB, this means the whole slab
    will be cleared, not only the requested bytes. Since struct
    gen_pool_chunk has a size that is a multiple of 8, and slab sizes are
    multiples of 8, we get lucky and allocate and clear the right amount of
    bytes.

    However, this is not the case with SLOB or with older code that did memset
    after allocating instead of using __GFP_ZERO.

    So, a simple module as this (running 3.6.0), will cause a crash when
    rmmod'ed.

    [root@phantom-lp2 foo]# cat foo.c
    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/init.h>
    #include <linux/genalloc.h>

    MODULE_LICENSE("GPL");
    MODULE_VERSION("0.1");

    static struct gen_pool *foo_pool;

    static __init int foo_init(void)
    {
    int ret;
    foo_pool = gen_pool_create(10, -1);
    if (!foo_pool)
    return -ENOMEM;
    ret = gen_pool_add(foo_pool, 0xa0000000, 32 << 10, -1);
    if (ret) {
    gen_pool_destroy(foo_pool);
    return ret;
    }
    return 0;
    }

    static __exit void foo_exit(void)
    {
    gen_pool_destroy(foo_pool);
    }

    module_init(foo_init);
    module_exit(foo_exit);
    [root@phantom-lp2 foo]# zcat /proc/config.gz | grep SLOB
    CONFIG_SLOB=y
    [root@phantom-lp2 foo]# insmod ./foo.ko
    [root@phantom-lp2 foo]# rmmod foo
    ------------[ cut here ]------------
    kernel BUG at lib/genalloc.c:243!
    cpu 0x4: Vector: 700 (Program Check) at [c0000000bb0e7960]
    pc: c0000000003cb50c: .gen_pool_destroy+0xac/0x110
    lr: c0000000003cb4fc: .gen_pool_destroy+0x9c/0x110
    sp: c0000000bb0e7be0
    msr: 8000000000029032
    current = 0xc0000000bb0e0000
    paca = 0xc000000006d30e00 softe: 0 irq_happened: 0x01
    pid = 13044, comm = rmmod
    kernel BUG at lib/genalloc.c:243!
    [c0000000bb0e7ca0] d000000004b00020 .foo_exit+0x20/0x38 [foo]
    [c0000000bb0e7d20] c0000000000dff98 .SyS_delete_module+0x1a8/0x290
    [c0000000bb0e7e30] c0000000000097d4 syscall_exit+0x0/0x94
    --- Exception: c00 (System Call) at 000000800753d1a0
    SP (fffd0b0e640) is in userspace

    Signed-off-by: Thadeu Lima de Souza Cascardo
    Cc: Paul Gortmaker
    Cc: Benjamin Gaignard
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thadeu Lima de Souza Cascardo
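
The size mismatch at the heart of the entry above is simple arithmetic: rounding a bit count up to whole bytes can give less storage than rounding it up to whole longs. The sketch below compares the two calculations in userspace; the `toy_` helpers are hypothetical and only mirror the rounding, not genalloc's actual allocation code.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bytes needed when the bitmap is sized in whole bytes (the buggy sizing). */
static size_t toy_bitmap_bytes_by_byte(size_t nbits)
{
    return (nbits + CHAR_BIT - 1) / CHAR_BIT;
}

/* Bytes needed when the bitmap is sized in whole longs, as the bitmap API expects. */
static size_t toy_bitmap_bytes_by_long(size_t nbits)
{
    size_t nlongs = (nbits + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG;

    return nlongs * sizeof(unsigned long);
}
```

For 70 bits the byte-based sizing gives 9 bytes, while the long-based sizing gives 16 bytes with 64-bit longs (12 with 32-bit longs). On a big-endian machine the low-order bits of the last long live in the bytes that the smaller allocation never covered.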
     

20 Oct, 2012

1 commit

  • If there is only one match, the unique matched entry should be returned.

    Without the fix, the upcoming dma debug interfaces ("dma-debug: new
    interfaces to debug dma mapping errors") can't work reliably because
    only device and dma_addr are passed to dma_mapping_error().

    Signed-off-by: Ming Lei
    Reported-by: Wu Fengguang
    Cc: Joerg Roedel
    Tested-by: Shuah Khan
    Cc: Paul Gortmaker
    Cc: Jakub Kicinski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ming Lei
     

15 Oct, 2012

1 commit

  • Pull module signing support from Rusty Russell:
    "module signing is the highlight, but it's an all-over David Howells frenzy..."

    Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.

    * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
    X.509: Fix indefinite length element skip error handling
    X.509: Convert some printk calls to pr_devel
    asymmetric keys: fix printk format warning
    MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
    MODSIGN: Make mrproper should remove generated files.
    MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
    MODSIGN: Use the same digest for the autogen key sig as for the module sig
    MODSIGN: Sign modules during the build process
    MODSIGN: Provide a script for generating a key ID from an X.509 cert
    MODSIGN: Implement module signature checking
    MODSIGN: Provide module signing public keys to the kernel
    MODSIGN: Automatically generate module signing keys if missing
    MODSIGN: Provide Kconfig options
    MODSIGN: Provide gitignore and make clean rules for extra files
    MODSIGN: Add FIPS policy
    module: signature checking hook
    X.509: Add a crypto key parser for binary (DER) X.509 certificates
    MPILIB: Provide a function to read raw data into an MPI
    X.509: Add an ASN.1 decoder
    X.509: Add simple ASN.1 grammar compiler
    ...

    Linus Torvalds
     

11 Oct, 2012

3 commits

  • Merge misc fixes from Andrew Morton:
    "Followups, fixes and some random stuff I found on the internet."

    * emailed patches from Andrew Morton: (11 patches)
    perf: fix duplicate header inclusion
    memcg, kmem: fix build error when CONFIG_INET is disabled
    rtc: kconfig: fix RTC_INTF defaults connected to RTC_CLASS
    rapidio: fix comment
    lib/kasprintf.c: use kmalloc_track_caller() to get accurate traces for kvasprintf
    rapidio: update for destination ID allocation
    rapidio: update asynchronous discovery initialization
    rapidio: use msleep in discovery wait
    mm: compaction: fix bit ranges in {get,clear,set}_pageblock_skip()
    arch/powerpc/platforms/pseries/hotplug-memory.c: section removal cleanups
    arch/powerpc/platforms/pseries/hotplug-memory.c: fix section handling code

    Linus Torvalds
     
  • Pull block IO update from Jens Axboe:
    "Core block IO bits for 3.7. Not a huge round this time, it contains:

    - First series from Kent cleaning up and generalizing bio allocation
    and freeing.

    - WRITE_SAME support from Martin.

    - Mikulas patches to prevent O_DIRECT crashes when someone changes
    the block size of a device.

    - Make bio_split() work on data-less bio's (like trim/discards).

    - A few other minor fixups."

    Fixed up silent semantic mis-merge as per Mikulas Patocka and Andrew
    Morton. It is due to the VM no longer using a prio-tree (see commit
    6b2dbba8b6ac: "mm: replace vma prio_tree with an interval tree").

    So make set_blocksize() use mapping_mapped() instead of open-coding the
    internal VM knowledge that has changed.

    * 'for-3.7/core' of git://git.kernel.dk/linux-block: (26 commits)
    block: makes bio_split support bio without data
    scatterlist: refactor the sg_nents
    scatterlist: add sg_nents
    fs: fix include/percpu-rwsem.h export error
    percpu-rw-semaphore: fix documentation typos
    fs/block_dev.c:1644:5: sparse: symbol 'blkdev_mmap' was not declared
    blockdev: turn a rw semaphore into a percpu rw semaphore
    Fix a crash when block device is read and block size is changed at the same time
    block: fix request_queue->flags initialization
    block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
    block: ioctl to zero block ranges
    block: Make blkdev_issue_zeroout use WRITE SAME
    block: Implement support for WRITE SAME
    block: Consolidate command flag and queue limit checks for merges
    block: Clean up special command handling logic
    block/blk-tag.c: Remove useless kfree
    block: remove the duplicated setting for congestion_threshold
    block: reject invalid queue attribute values
    block: Add bio_clone_bioset(), bio_clone_kmalloc()
    block: Consolidate bio_alloc_bioset(), bio_kmalloc()
    ...

    Linus Torvalds
     
  • Previously kvasprintf() allocation was being done through kmalloc(),
    thus producing an inaccurate trace report.

    This is a common problem: in order to get accurate callsite tracing, a
    lib/utils function shouldn't allocate through kmalloc() but should
    instead use kmalloc_track_caller().

    Signed-off-by: Ezequiel Garcia
    Cc: Sam Ravnborg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ezequiel Garcia
     

10 Oct, 2012

1 commit

  • asn1_find_indefinite_length() returns an error indicator of -1, which the
    caller asn1_ber_decoder() places in a size_t (which is usually unsigned) and
    then checks to see whether it is less than 0 (which it can't be). This can
    lead to the following warning:

    lib/asn1_decoder.c:320 asn1_ber_decoder()
    warn: unsigned 'len' is never less than zero.

    Instead, make asn1_find_indefinite_length() update the caller's idea of
    the data cursor and length separately from returning the error code.

    Reported-by: Dan Carpenter
    Signed-off-by: David Howells
    Signed-off-by: Rusty Russell

    David Howells
     

09 Oct, 2012

3 commits

  • Add a CONFIG_DEBUG_VM_RB build option for the previously existing
    DEBUG_MM_RB code. Now that Andi Kleen modified it to avoid using
    recursive algorithms, we can expose it a bit more.

    Also extend this code to validate_mm() after stack expansion, and to check
    that the vma's start and last pgoffs have not changed since the nodes were
    inserted on the anon vma interval tree (as it is important that the nodes
    be reindexed after each such update).

    Signed-off-by: Michel Lespinasse
    Cc: Andrea Arcangeli
    Cc: Rik van Riel
    Cc: Peter Zijlstra
    Cc: Daniel Santos
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • Update the generic interval tree code that was introduced in "mm: replace
    vma prio_tree with an interval tree".

    Changes:

    - fixed 'endpoing' typo noticed by Andrew Morton

    - replaced include/linux/interval_tree_tmpl.h, which was used as a
    template (including it automatically defined the interval tree
    functions) with include/linux/interval_tree_generic.h, which only
    defines a preprocessor macro INTERVAL_TREE_DEFINE(), which itself
    defines the interval tree functions when invoked. Now that is a very
    long macro which is unfortunate, but it does make the usage sites
    (lib/interval_tree.c and mm/interval_tree.c) a bit nicer than previously.

    - make use of RB_DECLARE_CALLBACKS() in the INTERVAL_TREE_DEFINE() macro,
    instead of duplicating that code in the interval tree template.

    - replaced vma_interval_tree_add(), which was actually handling the
    nonlinear and interval tree cases, with vma_interval_tree_insert_after()
    which handles only the interval tree case and has an API that is more
    consistent with the other interval tree handling functions.
    The nonlinear case is now handled explicitly in kernel/fork.c dup_mmap().

    Signed-off-by: Michel Lespinasse
    Cc: Andrea Arcangeli
    Cc: Rik van Riel
    Cc: Peter Zijlstra
    Cc: Daniel Santos
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • Provide rb_insert_augmented() and rb_erase_augmented() through a new
    rbtree_augmented.h include file. rb_erase_augmented() is defined there as
    an __always_inline function, in order to allow inlining of augmented
    rbtree callbacks into it. Since this generates a relatively large
    function, each augmented rbtree user should make sure to have a single
    call site.

    Signed-off-by: Michel Lespinasse
    Cc: Rik van Riel
    Cc: Hillf Danton
    Cc: Peter Zijlstra
    Cc: Catalin Marinas
    Cc: Andrea Arcangeli
    Cc: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse