05 Jun, 2019
1 commit
-
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms and conditions of the gnu general public license
version 2 as published by the free software foundation this program
is distributed in the hope it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not write to the free
software foundation inc 51 franklin st fifth floor boston ma 02110
1301 usa
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 111 file(s).
Signed-off-by: Thomas Gleixner
Reviewed-by: Alexios Zavras
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190530000436.567572064@linutronix.de
Signed-off-by: Greg Kroah-Hartman
07 Jan, 2016
1 commit
-
These async_XX functions are called from md/raid5 in an atomic
section, between get_cpu() and put_cpu(), so they must not sleep.
So use GFP_NOWAIT rather than GFP_IO.
Dan Williams writes: Longer term async_tx needs to be merged into md
directly as we can allocate this unmap data statically per-stripe
rather than per request.
Fixes: 7476bd79fc01 ("async_pq: convert to dmaengine_unmap_data")
Cc: stable@vger.kernel.org (v3.13+)
Reported-and-tested-by: Stanislav Samsonov
Acked-by: Dan Williams
Signed-off-by: NeilBrown
Signed-off-by: Vinod Koul
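A minimal sketch of the kind of change described above (the call site is
illustrative; the real ones live under crypto/async_tx/):

        struct dmaengine_unmap_data *unmap = NULL;

        /* between get_cpu() and put_cpu() sleeping is not allowed, so the
         * unmap data must be allocated with GFP_NOWAIT rather than a flag
         * that may block on I/O */
        if (device)
                unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOWAIT);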
15 Nov, 2013
2 commits
-
Remove no longer needed DMA unmap flags:
- DMA_COMPL_SKIP_SRC_UNMAP
- DMA_COMPL_SKIP_DEST_UNMAP
- DMA_COMPL_SRC_UNMAP_SINGLE
- DMA_COMPL_DEST_UNMAP_SINGLE
Cc: Vinod Koul
Cc: Tomasz Figa
Cc: Dave Jiang
Signed-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Kyungmin Park
Acked-by: Jon Mason
Acked-by: Mark Brown
[djbw: clean up straggling skip unmap flags in ntb]
Signed-off-by: Dan Williams
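A hedged before/after sketch of a caller; only the flag names come from
this commit, the surrounding combination is illustrative:

        /* before: callers opted out of the engine's automatic unmap */
        flags = DMA_CTRL_ACK | DMA_COMPL_SKIP_SRC_UNMAP |
                DMA_COMPL_SKIP_DEST_UNMAP;
        /* after: the DMA_COMPL_* flags are gone; unmapping is driven by
         * the generic dmaengine_unmap_data object instead (next commit) */
        flags = DMA_CTRL_ACK;
-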
Use the generic unmap object to unmap dma buffers.
Cc: Vinod Koul
Cc: Tomasz Figa
Cc: Dave Jiang
Reported-by: Bartlomiej Zolnierkiewicz
[bzolnier: add missing unmap->len initialization]
[bzolnier: fix whitespace damage]
Signed-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Kyungmin Park
[djbw: add DMA_ENGINE=n support]
Signed-off-by: Dan Williams
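A minimal sketch of the generic unmap object in a memcpy-style user
(the helpers are from include/linux/dmaengine.h; the surrounding
variables are illustrative):

        struct dmaengine_unmap_data *unmap;

        unmap = dmaengine_get_unmap_data(device->dev, 2, GFP_NOWAIT);
        if (unmap) {
                unmap->to_cnt = 1;
                unmap->from_cnt = 1;
                unmap->len = len;       /* the initialization noted above */
                unmap->addr[0] = dma_map_page(device->dev, src, src_offset,
                                              len, DMA_TO_DEVICE);
                unmap->addr[1] = dma_map_page(device->dev, dest, dest_offset,
                                              len, DMA_FROM_DEVICE);
                tx = device->device_prep_dma_memcpy(chan, unmap->addr[1],
                                                    unmap->addr[0], len,
                                                    DMA_PREP_INTERRUPT);
                if (tx)
                        dma_set_unmap(tx, unmap); /* tx takes a reference */
                dmaengine_unmap_put(unmap);       /* drop ours */
        }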
08 Jan, 2013
1 commit
-
Do DMA unmap on ->device_prep_dma_memcpy failure.
Cc: Dan Williams
Cc: Tomasz Figa
Signed-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Kyungmin Park
Signed-off-by: Dan Williams
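A sketch of the error path this commit adds (dma_map_single() and
dma_unmap_single() are the standard DMA API; the surrounding code is
illustrative):

        dma_src = dma_map_single(dev, src, len, DMA_TO_DEVICE);
        dma_dst = dma_map_single(dev, dest, len, DMA_FROM_DEVICE);
        tx = device->device_prep_dma_memcpy(chan, dma_dst, dma_src, len,
                                            flags);
        if (!tx) {
                /* the fix: undo the mappings instead of leaking them */
                dma_unmap_single(dev, dma_src, len, DMA_TO_DEVICE);
                dma_unmap_single(dev, dma_dst, len, DMA_FROM_DEVICE);
                return NULL;
        }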
20 Mar, 2012
1 commit
-
Signed-off-by: Cong Wang
01 Nov, 2011
1 commit
-
Part of the include cleanups means that the implicit
inclusion of module.h via device.h is going away. So
fix things up in advance.
Signed-off-by: Paul Gortmaker
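The fix amounts to making the dependency explicit, e.g.:

        #include <linux/module.h>       /* no longer implicit via device.h */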
27 Oct, 2010
1 commit
-
Ensure kmap_atomic() usage is strictly nested
Signed-off-by: Peter Zijlstra
Reviewed-by: Rik van Riel
Acked-by: Chris Metcalf
Cc: David Howells
Cc: Hugh Dickins
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Steven Rostedt
Cc: Russell King
Cc: Ralf Baechle
Cc: David Miller
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
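A minimal sketch of what "strictly nested" means here, using the modern
single-argument kmap_atomic() (last mapped is first unmapped, which a
stack-based kmap_atomic implementation requires):

        a = kmap_atomic(page_a);
        b = kmap_atomic(page_b);
        /* ... use a and b ... */
        kunmap_atomic(b);       /* unmap in reverse order of mapping */
        kunmap_atomic(a);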
09 Sep, 2009
2 commits
-
Some engines have transfer size and address alignment restrictions. Add
a per-operation alignment property to struct dma_device that the async
routines and dmatest can use to check alignment capabilities.
Signed-off-by: Dan Williams
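A sketch of the check (is_dma_copy_aligned() and the per-operation
*_align fields are from include/linux/dmaengine.h; the surrounding
variables are illustrative):

        struct dma_device *device = chan ? chan->device : NULL;

        /* only take the async path if src, dest and len all satisfy the
         * engine's alignment; copy_align is the log2 of the requirement */
        if (device && !is_dma_copy_aligned(device, src_offset, dest_offset,
                                           len))
                device = NULL;  /* fall back to the synchronous path */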
-
Some engines optimize operation by reading ahead in the descriptor chain
such that descriptor2 may start execution before descriptor1 completes.
If descriptor2 depends on the result from descriptor1 then a fence is
required (on descriptor2) to disable this optimization. The async_tx
api could implicitly identify dependencies via the 'depend_tx'
parameter, but that would constrain cases where the dependency chain
only specifies a completion order rather than a data dependency. So,
provide an ASYNC_TX_FENCE to explicitly identify data dependencies.
Signed-off-by: Dan Williams
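A sketch of flagging a data dependency (init_async_submit() and
async_xor() are async_tx entry points; the particular chain is
illustrative):

        struct async_submit_ctl submit;

        /* the xor consumes data produced by tx, so fence it; a pure
         * completion-order dependency would omit ASYNC_TX_FENCE */
        init_async_submit(&submit, ASYNC_TX_FENCE, tx, NULL, NULL, NULL);
        tx = async_xor(dest, srcs, 0, src_cnt, len, &submit);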
30 Aug, 2009
1 commit
-
If module_init and module_exit are nops then neither need to be defined.
[ Impact: pure cleanup ]
Reviewed-by: Andre Noll
Acked-by: Maciej Sosnowski
Signed-off-by: Dan Williams
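The pattern being removed, sketched (the stub names are illustrative):

        static int __init async_tx_init(void)
        {
                return 0;       /* does nothing */
        }
        static void __exit async_tx_exit(void)
        {
        }
        module_init(async_tx_init);
        module_exit(async_tx_exit);
        /* all of the above can simply be deleted */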
04 Jun, 2009
2 commits
-
Prepare the api for the arrival of a new parameter, 'scribble'. This
will allow callers to identify scratchpad memory for dma address or page
address conversions. As this adds yet another parameter, take this
opportunity to convert the common submission parameters (flags,
dependency, callback, and callback argument) into an object that is
passed by reference.
Also, take this opportunity to fix up the kerneldoc and add notes about
the relevant ASYNC_TX_* flags for each routine.
[ Impact: moves api pass-by-value parameters to a pass-by-reference struct ]
Signed-off-by: Andre Noll
Acked-by: Maciej Sosnowski
Signed-off-by: Dan Williams
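A sketch of the resulting calling convention (struct async_submit_ctl
and init_async_submit() are from include/linux/async_tx.h; NDISKS, cb
and cb_arg are illustrative):

        struct async_submit_ctl submit;
        addr_conv_t addr_conv[NDISKS];  /* the new 'scribble' scratchpad */

        /* flags, dependency, callback, callback argument and scribble
         * now travel together by reference */
        init_async_submit(&submit, ASYNC_TX_ACK, depend_tx, cb, cb_arg,
                          addr_conv);
        tx = async_xor(dest, srcs, 0, src_cnt, len, &submit);
-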
In support of inter-channel chaining async_tx utilizes an ack flag to
gate whether a dependent operation can be chained to another. While the
flag is not set the chain can be considered open for appending. Setting
the ack flag closes the chain and flags the descriptor for garbage
collection. The ASYNC_TX_DEP_ACK flag essentially means "close the
chain after adding this dependency". Since each operation can only have
one child the api now implicitly sets the ack flag at dependency
submission time. This removes an unnecessary management burden from
clients of the api.
[ Impact: clean up and enforce one dependency per operation ]
Reviewed-by: Andre Noll
Acked-by: Maciej Sosnowski
Signed-off-by: Dan Williams
18 Jul, 2008
2 commits
-
All callers of async_tx_sync_epilog have called async_tx_quiesce on the
depend_tx, so async_tx_sync_epilog need only call the callback to
complete the operation.
Signed-off-by: Dan Williams
-
Replace open coded "wait and acknowledge" instances with async_tx_quiesce.
Signed-off-by: Dan Williams
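The pattern being factored out, sketched against the helper's behavior
(dma_wait_for_async_tx(), async_tx_test_ack() and async_tx_ack() are
existing dmaengine/async_tx interfaces):

        /* before: open-coded wait and acknowledge */
        if (depend_tx) {
                BUG_ON(async_tx_test_ack(depend_tx));
                if (dma_wait_for_async_tx(depend_tx) == DMA_ERROR)
                        panic("%s: DMA_ERROR waiting for depend_tx\n",
                              __func__);
                async_tx_ack(depend_tx);
        }

        /* after: */
        async_tx_quiesce(&depend_tx);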
18 Apr, 2008
1 commit
-
'ack' is currently a simple integer that flags whether or not a client is done
touching fields in the given descriptor. It is effectively just a single bit
of information. Converting this to a flags parameter allows the other bits to
be put to use to control completion actions, like dma-unmap, and capture
results, like xor-zero-sum == 0.
Changes are one of:
1/ convert all open-coded ->ack manipulations to use async_tx_ack
and async_tx_test_ack.
2/ set the ack bit at prep time where possible
3/ make drivers store the flags at prep time
4/ add flags to the device_prep_dma_interrupt prototype
Acked-by: Maciej Sosnowski
Signed-off-by: Dan Williams
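The accessors referred to in 1/, roughly as they appear in
include/linux/dmaengine.h:

        static inline void async_tx_ack(struct dma_async_tx_descriptor *tx)
        {
                tx->flags |= DMA_CTRL_ACK;
        }

        static inline bool async_tx_test_ack(struct dma_async_tx_descriptor *tx)
        {
                return tx->flags & DMA_CTRL_ACK;
        }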
14 Mar, 2008
1 commit
-
Signed-off-by: Dan Williams
07 Feb, 2008
4 commits
-
The source and destination addresses are included to allow channel
selection based on address alignment.
Signed-off-by: Dan Williams
Reviewed-by: Haavard Skinnemoen
Pass a full set of flags to drivers' per-operation 'prep' routines.
Currently the only flag passed is DMA_PREP_INTERRUPT. The expectation is
that arch-specific async_tx_find_channel() implementations can exploit this
capability to find the best channel for an operation.
Signed-off-by: Dan Williams
Acked-by: Shannon Nelson
Reviewed-by: Haavard Skinnemoen
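A sketch of a prep call carrying the new flags (device_prep_dma_memcpy
is the dmaengine prep hook; the variables are illustrative):

        unsigned long flags = cb_fn ? DMA_PREP_INTERRUPT : 0;

        tx = device->device_prep_dma_memcpy(chan, dma_dest, dma_src, len,
                                            flags);
-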
The tx_set_src and tx_set_dest methods were originally implemented to allow
an array of addresses to be passed down from async_xor to the dmaengine
driver while minimizing stack overhead. Removing these methods allows
drivers to have all transaction parameters available at 'prep' time, saves
two function pointers in struct dma_async_tx_descriptor, and reduces the
number of indirect branches.
A consequence of moving this data to the 'prep' routine is that
multi-source routines like async_xor need temporary storage to convert an
array of linear addresses into an array of dma addresses. In order to keep
the same stack footprint of the previous implementation the input array is
reused as storage for the dma addresses. This requires that
sizeof(dma_addr_t) be less than or equal to sizeof(void *). As a
consequence CONFIG_DMADEVICES now depends on !CONFIG_HIGHMEM64G. It also
requires that drivers be able to make descriptor resources available when
the 'prep' routine is polled.
Signed-off-by: Dan Williams
Acked-by: Shannon Nelson
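A sketch of the in-place conversion trick (the cast is legal only under
the sizeof(dma_addr_t) <= sizeof(void *) constraint noted above; names
are illustrative):

        /* reuse the page-pointer array as dma_addr_t storage so the stack
         * footprint matches the old tx_set_src/tx_set_dest scheme */
        dma_addr_t *dma_src = (dma_addr_t *)src_list;
        int i;

        for (i = 0; i < src_cnt; i++)
                dma_src[i] = dma_map_page(device->dev, src_list[i], offset,
                                          len, DMA_TO_DEVICE);
-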
Remove the unused ASYNC_TX_ASSUME_COHERENT flag. Async_tx is
meant to hide the difference between asynchronous hardware and synchronous
software operations; this flag requires clients to understand the cache
coherency consequences of the async path.
Signed-off-by: Dan Williams
Reviewed-by: Haavard Skinnemoen
20 Jul, 2007
1 commit
-
Andrew Morton:
[async_memcpy] is very wrong if both ASYNC_TX_KMAP_DST and
ASYNC_TX_KMAP_SRC can ever be set. We'll end up using the same kmap
slot for both src and dest and we get either corrupted data or a BUG.
Evgeniy Polyakov:
Btw, shouldn't it always be kmap_atomic() even if the flag is not set?
Those pages are usually ones returned by alloc_page().
So fix the usage of kmap_atomic and kill the ASYNC_TX_KMAP_DST and
ASYNC_TX_KMAP_SRC flags.
Cc: Andrew Morton
Cc: Evgeniy Polyakov
Signed-off-by: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
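A sketch of the fixed synchronous copy (the modern single-argument
kmap_atomic() is shown; the 2007 fix used distinct KM_USER0/KM_USER1
slots to the same effect):

        void *dest_buf, *src_buf;

        /* map src and dest separately, strictly nested */
        dest_buf = kmap_atomic(dest) + dest_offset;
        src_buf = kmap_atomic(src) + src_offset;

        memcpy(dest_buf, src_buf, len);

        kunmap_atomic(src_buf);
        kunmap_atomic(dest_buf);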
13 Jul, 2007
1 commit
-
The async_tx api provides methods for describing a chain of asynchronous
bulk memory transfers/transforms with support for inter-transactional
dependencies. It is implemented as a dmaengine client that smooths over
the details of different hardware offload engine implementations. Code
that is written to the api can optimize for asynchronous operation and the
api will fit the chain of operations to the available offload resources.
    I imagine that any piece of ADMA hardware would register with the
    'async_*' subsystem, and a call to async_X would be routed as
    appropriate, or be run in-line. - Neil Brown
async_tx exploits the capabilities of struct dma_async_tx_descriptor to
provide an api of the following general format:

struct dma_async_tx_descriptor *
async_<operation>(..., struct dma_async_tx_descriptor *depend_tx,
                  dma_async_tx_callback cb_fn, void *cb_param)
{
        struct dma_chan *chan = async_tx_find_channel(depend_tx, <operation>);
        struct dma_device *device = chan ? chan->device : NULL;
        int int_en = cb_fn ? 1 : 0;
        struct dma_async_tx_descriptor *tx = device ?
                device->device_prep_dma_<operation>(chan, len, int_en) : NULL;

        if (tx) { /* run <operation> asynchronously */
                ...
                tx->tx_set_dest(addr, tx, index);
                ...
                tx->tx_set_src(addr, tx, index);
                ...
                async_tx_submit(chan, tx, flags, depend_tx, cb_fn, cb_param);
        } else { /* run <operation> synchronously */
                ...
                <operation>
                ...
                async_tx_sync_epilog(flags, depend_tx, cb_fn, cb_param);
        }

        return tx;
}

async_tx_find_channel() returns a capable channel from its pool. The
channel pool is organized as a per-cpu array of channel pointers. The
async_tx_rebalance() routine is tasked with managing these arrays. In the
uniprocessor case async_tx_rebalance() tries to spread responsibility
evenly over channels of similar capabilities. For example if there are two
copy+xor channels, one will handle copy operations and the other will
handle xor. In the SMP case async_tx_rebalance() attempts to spread the
operations evenly over the cpus, e.g. cpu0 gets copy channel0 and xor
channel0 while cpu1 gets copy channel 1 and xor channel 1. When a
dependency is specified async_tx_find_channel defaults to keeping the
operation on the same channel. A xor->copy->xor chain will stay on one
channel if it supports both operation types, otherwise the transaction will
transition between a copy and a xor resource.
Currently the raid5 implementation in the MD raid456 driver has been
converted to the async_tx api. A driver for the offload engines on the
Intel Xscale series of I/O processors, iop-adma, is provided in a later
commit. With the iop-adma driver and async_tx, raid456 is able to offload
copy, xor, and xor-zero-sum operations to hardware engines.
On iop342 tiobench showed higher throughput for sequential writes (20 - 30%
improvement) and sequential reads to a degraded array (40 - 55%
improvement). For the other cases performance was roughly equal, +/- a few
percentage points. On a x86-smp platform the performance of the async_tx
implementation (in synchronous mode) was also +/- a few percentage points
of the original implementation. According to 'top' on iop342 CPU
utilization drops from ~50% to ~15% during a 'resync' while the speed
according to /proc/mdstat doubles from ~25 MB/s to ~50 MB/s.
The tiobench command line used for testing was: tiobench --size 2048
--block 4096 --block 131072 --dir /mnt/raid --numruns 5
* iop342 had 1GB of memory available
Details:
* if CONFIG_DMA_ENGINE=n the asynchronous path is compiled away by making
async_tx_find_channel a static inline routine that always returns NULL
(see the sketch after this entry)
* when a callback is specified for a given transaction an interrupt will
fire at operation completion time and the callback will occur in a
tasklet. if the channel does not support interrupts then a live
polling wait will be performed
* the api is written as a dmaengine client that requests all available
channels
* In support of dependencies the api implicitly schedules channel-switch
interrupts. The interrupt triggers the cleanup tasklet which causes
pending operations to be scheduled on the next channel
* Xor engines treat an xor destination address differently than a software
xor routine. To the software routine the destination address is an implied
source, whereas engines treat it as a write-only destination. This patch
modifies the xor_blocks routine to take an explicit destination address
to mirror the hardware.
Changelog:
* fixed a leftover debug print
* don't allow callbacks in async_interrupt_cond
* fixed xor_block changes
* fixed usage of ASYNC_TX_XOR_DROP_DEST
* drop dma mapping methods, suggested by Chris Leech
* printk warning fixups from Andrew Morton
* don't use inline in C files, Adrian Bunk
* select the API when MD is enabled
* BUG_ON xor source counts
Signed-off-by: Dan Williams
Acked-By: NeilBrown
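The CONFIG_DMA_ENGINE=n detail from the list above, sketched (the
two-argument signature matches this commit's api; later commits add
address parameters):

        #ifdef CONFIG_DMA_ENGINE
        struct dma_chan *
        async_tx_find_channel(struct dma_async_tx_descriptor *depend_tx,
                              enum dma_transaction_type tx_type);
        #else
        static inline struct dma_chan *
        async_tx_find_channel(struct dma_async_tx_descriptor *depend_tx,
                              enum dma_transaction_type tx_type)
        {
                return NULL;    /* every async_<operation> runs synchronously */
        }
        #endif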