04 Aug, 2011
1 commit
-
For versions of the device that implement operation-types 0x87, 0x88
(IOAT_OP_XOR, IOAT_OP_XOR_VAL) this map determines whether a given
source is located in the base or extended descriptor. Source addresses
6 through 8 require an extended descriptor, hence 0xe0, not 0xd0. No
shipping hardware currently implements these operation types.

Reported-by: Evgueni Smogailov
Signed-off-by: Dan Williams
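
The arithmetic behind 0xe0 versus 0xd0 can be checked with a small sketch. It assumes that bit i of the map corresponds to the (i+1)-th source address; the helper names are hypothetical and not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: build a map with one bit set for each source that
 * lives in the extended descriptor. Sources are numbered from 1, bits
 * from 0 (assumption made for this sketch). */
static uint8_t ext_desc_map(unsigned first_src, unsigned last_src)
{
    uint8_t map = 0;

    for (unsigned src = first_src; src <= last_src; src++)
        map |= (uint8_t)(1u << (src - 1));
    return map;
}

/* Hypothetical helper: true if 1-based source `src` is marked as living
 * in the extended descriptor. */
static bool src_in_ext_desc(uint8_t map, unsigned src)
{
    return (map >> (src - 1)) & 1u;
}
```

Under that assumption, sources 6 through 8 give bits 5, 6 and 7, which is 0xe0. The value 0xd0 sets bits 4, 6 and 7 instead, so it marks source 5 and misses source 6.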
23 Jul, 2011
2 commits
-
const __read_mostly is not legal and causes section type conflicts.
That's because the read.mostly section is not read only.
Simply drop the __read_mostly designation.

Signed-off-by: Andi Kleen
[drop __read_mostly instead of const]
Signed-off-by: Dan Williams
-
Adding to pci_id.h and the device table for ioat.
Signed-off-by: Dave Jiang
Signed-off-by: Dan Williams
29 May, 2011
1 commit
-
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (33 commits)
x86: poll waiting for I/OAT DMA channel status
maintainers: add dma engine tree details
dmaengine: add TODO items for future work on dma drivers
dmaengine: Add API documentation for slave dma usage
dmaengine/dw_dmac: Update maintainer-ship
dmaengine: move link order
dmaengine/dw_dmac: implement pause and resume in dwc_control
dmaengine/dw_dmac: Replace spin_lock* with irqsave variants and enable submission from callback
dmaengine/dw_dmac: Divide one sg to many desc, if sg len is greater than DWC_MAX_COUNT
dmaengine/dw_dmac: set residue as total len in dwc_tx_status if status is !DMA_SUCCESS
dmaengine/dw_dmac: don't call callback routine in case dmaengine_terminate_all() is called
dmaengine: at_hdmac: pause: no need to wait for FIFO empty
pch_dma: modify pci device table definition
pch_dma: Support new device ML7223 IOH
pch_dma: Support I2S for ML7213 IOH
pch_dma: Fix DMA setting issue
pch_dma: modify for checkpatch
pch_dma: fix dma direction issue for ML7213 IOH video-in
dmaengine: at_hdmac: use descriptor chaining help function
dmaengine: at_hdmac: implement pause and resume in atc_control
...

Fix up trivial conflict in drivers/dma/dw_dmac.c
27 May, 2011
1 commit
-
For certain system configurations a 5 usec udelay before checking I/OAT DMA
channel status is sometimes not sufficient, resulting in a false failure
status and unnecessary freeing of channel resources. Conversely, for many
configurations 5 usec is longer than necessary.

Loop for up to 20 usec waiting for successful status before failing.
Signed-off-by: Dimitri Sivanich
Signed-off-by: Dan Williams
23 May, 2011
1 commit
-
After discovering that wide use of prefetch on modern CPUs
could be a net loss instead of a win, net drivers which were
relying on the implicit inclusion of prefetch.h via the list
headers showed up in the resulting cleanup fallout. Give
them an explicit include via the following $0.02 script.

=========================================
#!/bin/bash
MANUAL=""
for i in `git grep -l 'prefetch(.*)' .` ; do
	grep -q '<linux/prefetch.h>' $i
	if [ $? = 0 ] ; then
		continue
	fi

	(	echo '?^#include <linux/?a'
		echo '#include <linux/prefetch.h>'
		echo .
		echo w
		echo q
	) | ed -s $i > /dev/null 2>&1
	if [ $? != 0 ]; then
		echo $i needs manual fixup
		MANUAL="$i $MANUAL"
	fi
done
echo ------------------- 8\
[ Fixed up some incorrect #include placements, and added some
non-network drivers and the fib_trie.c case - Linus ]
Signed-off-by: Linus Torvalds
05 Dec, 2010
1 commit
-
Changed Makefile to use -y instead of -objs, following
Documentation/kbuild/makefiles.txt.

Signed-off-by: Tracey Dent
Signed-off-by: Andrew Morton
Signed-off-by: Dan Williams
14 Oct, 2010
1 commit
-
Commit 0793448 "DMAENGINE: generic channel status v2" changed the interface for
how dma channel progress is retrieved. It inadvertently exported an internal
helper function ioat_tx_status() instead of ioat_dma_tx_status(). The latter
polls the hardware to get the latest completion state, while the helper just
evaluates the current state without touching hardware. The effect is that we
end up waiting for completion timeouts or descriptor allocation errors before
the completion state is updated.

iperf (before fix):
[SUM] 0.0-41.3 sec 364 MBytes 73.9 Mbits/sec

iperf (after fix):
[SUM] 0.0- 4.5 sec 499 MBytes 940 Mbits/sec

This is a regression starting with 2.6.35.
Cc:
Cc: Dave Jiang
Cc: Jesse Brandeburg
Cc: Linus Walleij
Cc: Maciej Sosnowski
Reported-by: Richard Scobie
Signed-off-by: Dan Williams
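
The difference between the two entry points can be illustrated with a small simulation. Everything here is hypothetical (structure, field and function names); it only models "evaluate cached state" versus "poll hardware, then evaluate", and it ignores cookie wraparound.

```c
#include <stdbool.h>

/* Simulated channel (hypothetical). `hw_completed` is what the hardware
 * has finished; `cached_completed` is the driver's last snapshot of it. */
struct sim_chan {
    unsigned hw_completed;
    unsigned cached_completed;
};

/* Helper: judges completion from the cached snapshot only and never
 * touches the (simulated) hardware. */
static bool cookie_done_cached(const struct sim_chan *c, unsigned cookie)
{
    return cookie <= c->cached_completed;
}

/* Full status call: if the cached state says "not done", refresh the
 * snapshot from the hardware and evaluate again. */
static bool cookie_done_polled(struct sim_chan *c, unsigned cookie)
{
    if (cookie_done_cached(c, cookie))
        return true;                       /* fast path */
    c->cached_completed = c->hw_completed; /* poll the hardware */
    return cookie_done_cached(c, cookie);
}
```

If only the cached variant is exposed to callers, a finished transfer keeps looking unfinished until something else refreshes the snapshot, which matches the stalls described above.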
05 Aug, 2010
1 commit
-
On some platforms (MacPro3,1) the BIOS assigns the ioatdma device to the
incorrect iommu causing faults when the driver initializes. Add a quirk
to catch this misconfiguration and try falling back to untranslated
operation (which works in the MacPro3,1 case).

Assuming there are other platforms with misconfigured iommus teach the
ioatdma driver to treat initialization failures as non-fatal (just fail
the driver load and emit a warning instead of triggering a BUG_ON).

This can be classified as a boot regression since 2.6.32 on affected
platforms since the ioatdma module did not autoload prior to that
kernel.

Cc:
Acked-by: David Woodhouse
Reported-by: Chris Li
Tested-by: Chris Li
Signed-off-by: Dan Williams
18 May, 2010
1 commit
03 May, 2010
1 commit
-
The memory for ioatdma_device structure is being allocated in
alloc_ioatdma()

Signed-off-by: Minskey Guo
Signed-off-by: Dan Williams
02 May, 2010
3 commits
-
There are cases where cacheline-unaligned raid operations can hang the
dma channel. Simply disable these operations by increasing the
alignment constraints published to async_tx. The raid456 driver always
issues page aligned requests, so the only in-kernel user of the ioatdma
driver that is affected by this change is dmatest.

Signed-off-by: Dan Williams
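
A sketch of what an alignment constraint of this kind means in practice. The helper below is hypothetical, and the 64-byte cache line (shift of 6) used in the note is an assumption, not a value stated in the commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: an operation is accepted only if the source
 * offset, destination offset and length are all multiples of
 * (1 << align_shift). */
static bool op_is_aligned(unsigned align_shift, size_t src_off,
                          size_t dst_off, size_t len)
{
    size_t mask = ((size_t)1 << align_shift) - 1;

    /* OR-ing the values lets one mask test cover all three. */
    return ((src_off | dst_off | len) & mask) == 0;
}
```

With a shift of 6, a page-aligned 4096-byte request passes, while a request starting at offset 32 is rejected and would have to be handled by another path.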
-
Use separate locks for the descriptor prep (producer) and descriptor
cleanup (consumer) paths. Allows the producer path to run concurrently
with the cleanup path. Inspired by Documentation/circular-buffer.txt.

Cc: David Howells
Cc: Paul E. McKenney
Cc: Maciej Sosnowski
Signed-off-by: Dan Williams
-
Use the common power-of-2 circular buffer macros.
Signed-off-by: Dan Williams
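
The index arithmetic behind a power-of-2 ring with separate producer and consumer sides can be sketched as follows. The two macros are redefined locally in the style of the kernel's circular buffer macros so the example is self-contained; the ring structure and function names are hypothetical, and the sketch is single-threaded (a concurrent version needs the locks and memory barriers that are omitted here).

```c
#include <stdbool.h>

#define RING_SIZE 8u /* must be a power of two */

/* Local definitions in the style of the power-of-2 circular buffer macros. */
#define CIRC_CNT(head, tail, size)   (((head) - (tail)) & ((size) - 1))
#define CIRC_SPACE(head, tail, size) CIRC_CNT((tail), ((head) + 1), (size))

struct ring {
    unsigned head;       /* advanced only by the producer (prep path) */
    unsigned tail;       /* advanced only by the consumer (cleanup path) */
    int slot[RING_SIZE];
};

/* Producer side: in a driver this would run under the "prep" lock. */
static bool ring_produce(struct ring *r, int v)
{
    if (CIRC_SPACE(r->head, r->tail, RING_SIZE) == 0)
        return false; /* ring full */
    r->slot[r->head & (RING_SIZE - 1)] = v;
    r->head++;
    return true;
}

/* Consumer side: in a driver this would run under the "cleanup" lock. */
static bool ring_consume(struct ring *r, int *v)
{
    if (CIRC_CNT(r->head, r->tail, RING_SIZE) == 0)
        return false; /* ring empty */
    *v = r->slot[r->tail & (RING_SIZE - 1)];
    r->tail++;
    return true;
}
```

Because each index has a single writer, the producer and consumer can take different locks. One slot is always left unused, so a ring of 8 entries holds at most 7.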
30 Mar, 2010
1 commit
-
…it slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
27 Mar, 2010
2 commits
-
Simple conditional struct filler to cut out some duplicated code.
Signed-off-by: Dan Williams
-
Convert the device_is_tx_complete() operation on the
DMA engine to a generic device_tx_status() operation which
can return three states, DMA_TX_RUNNING, DMA_TX_COMPLETE,
DMA_TX_PAUSED.

[dan.j.williams@intel.com: update for timberdale]
Signed-off-by: Linus Walleij
Acked-by: Mark Brown
Cc: Maciej Sosnowski
Cc: Nicolas Ferre
Cc: Pavel Machek
Cc: Li Yang
Cc: Guennadi Liakhovetski
Cc: Paul Mundt
Cc: Ralf Baechle
Cc: Haavard Skinnemoen
Cc: Magnus Damm
Cc: Liam Girdwood
Cc: Joe Perches
Cc: Roland Dreier
Signed-off-by: Dan Williams
08 Mar, 2010
1 commit
-
Constify struct sysfs_ops.
This is part of the ops structure constification
effort started by Arjan van de Ven et al.

Benefits of this constification:

* prevents modification of data that is shared
(referenced) by many other structure instances
at runtime

* detects/prevents accidental (but not intentional)
modification attempts on archs that enforce
read-only kernel data at runtime

* potentially better optimized code as the compiler
can assume that the const data cannot be changed

* the compiler/linker move const data into .rodata
and therefore exclude them from false sharing

Signed-off-by: Emese Revfy
Acked-by: David Teigland
Acked-by: Matt Domsch
Acked-by: Maciej Sosnowski
Acked-by: Hans J. Koch
Acked-by: Pekka Enberg
Acked-by: Jens Axboe
Acked-by: Stephen Hemminger
Signed-off-by: Greg Kroah-Hartman
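
As a rough userspace illustration of ops-structure constification, the sketch below declares a table of function pointers as const. The structure, the callback and the "1.0" string are all hypothetical and only stand in for a real sysfs_ops instance.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical ops table, loosely modeled on a "show" style callback. */
struct attr_ops {
    int (*show)(char *buf, size_t len);
};

/* Hypothetical callback: writes an arbitrary version string. */
static int show_version(char *buf, size_t len)
{
    return snprintf(buf, len, "%d.%d", 1, 0);
}

/* Declared const: the table can be placed in read-only data and its
 * members cannot be reassigned through this object. */
static const struct attr_ops version_ops = {
    .show = show_version,
};

/* Uncommenting the next statement inside a function would fail to
 * compile, because it assigns to a member of a const object:
 *     version_ops.show = NULL;
 */
```

The compile-time error on reassignment is the "accidental modification" protection listed above; runtime enforcement depends on the architecture, as the message notes.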
07 Mar, 2010
1 commit
-
Rename for_each_bit to for_each_set_bit in the kernel source tree. To
permit for_each_clear_bit(), should that ever be added.

The patch includes a macro to map the old for_each_bit() onto the new
for_each_set_bit(). This is a (very) temporary thing to ease the migration.

[akpm@linux-foundation.org: add temporary for_each_bit()]
Suggested-by: Alexey Dobriyan
Suggested-by: Andrew Morton
Signed-off-by: Akinobu Mita
Cc: "David S. Miller"
Cc: Russell King
Cc: David Woodhouse
Cc: Artem Bityutskiy
Cc: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
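
To show what "for each set bit" iteration does, here is a userspace sketch over a single word. The macro and helper names are hypothetical; the kernel macro operates on a bitmap pointer and a size in bits, not on a single value.

```c
#include <limits.h>

/* Hypothetical stand-in: index of the next set bit in `word` at or after
 * `start`, or `nbits` if there is none. */
static unsigned next_set_bit(unsigned long word, unsigned nbits,
                             unsigned start)
{
    for (unsigned bit = start; bit < nbits; bit++)
        if (word & (1UL << bit))
            return bit;
    return nbits;
}

/* Visit every set bit of a single word, lowest index first. */
#define for_each_set_bit_in_word(bit, word, nbits)          \
    for ((bit) = next_set_bit((word), (nbits), 0);          \
         (bit) < (nbits);                                   \
         (bit) = next_set_bit((word), (nbits), (bit) + 1))

/* Example use: sum the indices of all set bits. */
static unsigned sum_set_bit_indices(unsigned long word)
{
    const unsigned nbits = sizeof(word) * CHAR_BIT;
    unsigned bit, sum = 0;

    for_each_set_bit_in_word(bit, word, nbits)
        sum += bit;
    return sum;
}
```

The "set" in the name makes room for a symmetric iterator over clear bits, which is the motivation given for the rename.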
04 Mar, 2010
7 commits
-
If the calling convention of ->timer_fn() and ->cleanup_fn() are unified
across hardware versions we can drop parameters to ioat_init_channel() and
unify ioat_is_dma_complete() implementations.

Both ->timer_fn() and ->cleanup_fn() are modified to expect a struct
dma_chan pointer.

Signed-off-by: Dan Williams
-
The hardware automatically disables further interrupts after each event
until rearmed. This allows a delay to be injected between the occurrence
of the interrupt and the running of the cleanup routine. The delay is
scaled by the descriptor backlog and then written to the INTRDELAY
register which specifies the number of microseconds to hold off
interrupt delivery after an interrupt event occurs. According to
powertop this reduces the interrupt rate from ~5000 intr/s to ~150
intr/s without affecting throughput (simple dd to a raid6 array).

Signed-off-by: Dan Williams
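
A sketch of scaling a hold-off delay by the backlog and clamping it to a register field. The scale factor and the field limit below are made-up values for illustration, not the driver's constants, and the function name is hypothetical.

```c
#include <stdint.h>

#define INTRDELAY_MAX_USEC 0x3fffu /* hypothetical register field limit */
#define USEC_PER_PENDING   5u      /* hypothetical scale factor */

/* Delay grows linearly with the number of pending descriptors and
 * saturates at the largest value the (assumed) field can hold. */
static uint16_t intr_delay_usec(unsigned backlog)
{
    /* Compare before multiplying so a huge backlog cannot overflow. */
    if (backlog > INTRDELAY_MAX_USEC / USEC_PER_PENDING)
        return (uint16_t)INTRDELAY_MAX_USEC;
    return (uint16_t)(backlog * USEC_PER_PENDING);
}
```

With a deep backlog the interrupt is held off longer, so one cleanup pass retires many descriptors; with an empty ring the delay is zero and completion latency is unaffected.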
-
Since ioat_cleanup_preamble() and the update of the last completed
descriptor are not synchronized there is a chance that two cleanup threads
can see descriptors to clean. If the first cleans up all pending
descriptors then the second will trigger the BUG_ON.

Signed-off-by: Dan Williams
-
The pending == 2 case no longer exists in the driver so, we can use
ioat2_ring_pending() outside the lock to determine if there might be any
descriptors in the ring that the hardware has not seen.

Signed-off-by: Dan Williams
-
Replace open coded ioat2_quiesce() call in ioat3_restart_channel
Signed-off-by: Dan Williams
-
We already disallow raid operations while DCA is globally enabled, so
having it locally enabled is a nop and confusing when reading the code.

Signed-off-by: Dan Williams
03 Feb, 2010
1 commit
-
Fix typo in ioat2_quiesce. Check that 'tmo' is zero, not 'end'. Also applies
to 2.6.32.3

Cc:
Signed-off-by: Dan Williams
20 Dec, 2009
1 commit
-
Put the ioat2 and ioat3 state machines in the halted state with all
errors cleared.

The ioat1 init path is not disturbed for stability, there are no
reported ioat1 initialization issues.

Cc:
Reported-by: Roland Dreier
Tested-by: Roland Dreier
Acked-by: Simon Horman
Signed-off-by: Dan Williams
18 Dec, 2009
1 commit
-
When continuing a pq calculation the driver needs 3 extra sources. The
driver can perform a 3 source calculation with a single descriptor, but
needs an extended descriptor to process up to 8 sources in one
operation. However, in the p-disabled case only one extra source is
needed. When continuing a p-disabled operation there are occasions
(i.e. 0 < src_cnt % 8 < 3) where the tail operation does not need an
extended descriptor. Properly account for this fact otherwise invalid
'dmacount' values will be written to hardware usually causing the
channel to halt with 'invalid descriptor' errors.

Cc:
Signed-off-by: Dan Williams
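
The source-count rule described above can be written down as a small sketch. The function names and boolean parameters are hypothetical; the numbers (3 extra sources when continuing, 1 when P is disabled, 3 sources per single descriptor) come from the text.

```c
#include <stdbool.h>

/* Number of sources the hardware effectively processes: continuing a pq
 * calculation adds 3 sources, or only 1 in the p-disabled case. */
static unsigned effective_src_cnt(unsigned src_cnt, bool is_continue,
                                  bool p_disabled)
{
    if (!is_continue)
        return src_cnt;
    return src_cnt + (p_disabled ? 1u : 3u);
}

/* A single descriptor covers up to 3 sources; anything larger needs an
 * extended descriptor. */
static bool needs_ext_desc(unsigned src_cnt, bool is_continue,
                           bool p_disabled)
{
    return effective_src_cnt(src_cnt, is_continue, p_disabled) > 3;
}
```

For a tail operation with 1 or 2 sources, a p-disabled continuation totals 2 or 3 sources and fits in a single descriptor, whereas a regular continuation totals 4 or 5 and does not.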
20 Nov, 2009
6 commits
-
The completion of a pq operation is notified with a null descriptor
appended to the end of the chain. This descriptor needs to be visible
to dma clients otherwise the client is precluded from ensuring all
operations are quiesced before freeing channel resources, i.e. due to
descriptor polling it may get the completion notification ahead of the
interrupt delivered by the null descriptor.

Signed-off-by: Dan Williams
-
ioat3.2 does not support asynchronous error notifications which makes
the driver experience latencies when non-zero pq validate results are
expected. Provide a mechanism for turning off async_xor_val and
async_syndrome_val via Kconfig. This approach is generally useful for
any driver that specifies ASYNC_TX_DISABLE_CHANNEL_SWITCH and would like
to force the async_tx api to fall back to the synchronous path for
certain operations.

Signed-off-by: Dan Williams
-
Modify is_ioat_bug() to catch all errors that are uncorrectable, or not
currently handled.

Signed-off-by: Dan Williams
-
Although disabled, hardware still checks address validity, so duplicate
the known address.

Signed-off-by: Dan Williams
-
Error interrupts and error completions may cause channel hangs, so
poll the channel status register after a timeout.

Signed-off-by: Dan Williams
-
RAID operations cause a system hang on platforms with DCA
(Direct-Cache-Access) enabled. So turn off RAID capabilities in this
case.

Signed-off-by: Dan Williams
18 Nov, 2009
1 commit
-
Turning off dca is not an "error", and the dca-enabled state can be
viewed from sysfs.

Signed-off-by: Dan Williams
22 Sep, 2009
2 commits
-
drivers/dma/ioat/dma_v3.c: In function 'ioat3_prep_memset_lock':
drivers/dma/ioat/dma_v3.c:439: warning: 'fill' may be used uninitialized in this function
drivers/dma/ioat/dma_v3.c:437: warning: 'desc' may be used uninitialized in this function
drivers/dma/ioat/dma_v3.c: In function '__ioat3_prep_xor_lock':
drivers/dma/ioat/dma_v3.c:489: warning: 'xor' may be used uninitialized in this function
drivers/dma/ioat/dma_v3.c:486: warning: 'desc' may be used uninitialized in this function
drivers/dma/ioat/dma_v3.c: In function '__ioat3_prep_pq_lock':
drivers/dma/ioat/dma_v3.c:631: warning: 'pq' may be used uninitialized in this function
drivers/dma/ioat/dma_v3.c:628: warning: 'desc' may be used uninitialized in this function

gcc-4.0, unlike gcc-4.3, does not see that these variables are
initialized before use. Convert the descriptor loops to do-while to make
this initialization apparent.

Signed-off-by: Dan Williams
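
The loop shape in question can be sketched outside the driver. The descriptor type, function name and chunking logic below are hypothetical; the point is only that a do-while body runs at least once, so the compiler can see that `last` is assigned before it is returned.

```c
#include <stddef.h>

/* Hypothetical descriptor: only records a transfer length. */
struct desc {
    size_t len;
};

/* Split `total` bytes into chunks of at most `max_xfer` bytes (which must
 * be non-zero), filling descs[] in order. The caller provides enough
 * entries. Returns the last descriptor written. */
static struct desc *fill_descs(struct desc *descs, size_t total,
                               size_t max_xfer)
{
    struct desc *last; /* assigned on the first pass of the loop */
    size_t i = 0;

    do {
        size_t chunk = total < max_xfer ? total : max_xfer;

        last = &descs[i++];
        last->len = chunk;
        total -= chunk;
    } while (total);

    return last;
}
```

With a `for` or `while` loop the same code relies on the caller never passing a zero count, which an older compiler cannot prove and therefore warns about.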
-
drivers/dma/ioat/dma_v2.c: In function 'ioat2_dma_prep_memcpy_lock':
drivers/dma/ioat/dma_v2.c:680: warning: 'hw' may be used uninitialized in this function
drivers/dma/ioat/dma_v2.c:681: warning: 'desc' may be used uninitialized in this function

Cc: Maciej Sosnowski
Signed-off-by: Andrew Morton
Signed-off-by: Dan Williams
17 Sep, 2009
1 commit
-
With the addition of ioat_max_alloc_order it is not clear what the
maximum allocation order is, so document that in the modinfo. Also take
an opportunity to kill a stray semicolon.

Signed-off-by: Maciej Sosnowski
Signed-off-by: Dan Williams
11 Sep, 2009
1 commit
-
A new ring implementation and the addition of raid functionality
constitutes a bump in the driver major version number.

Signed-off-by: Maciej Sosnowski
Signed-off-by: Dan Williams