02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.
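    For illustration, an SPDX tag is a single comment at the very top of a
    source file; in a C file under the kernel's default license it looks
    like this (a generic sketch, not a line quoted from any particular file):

```c
// SPDX-License-Identifier: GPL-2.0
```

    Header files conventionally use the `/* ... */` comment form for the same tag.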

    This patch is based on work done by Thomas Gleixner, Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and where references to a
    license had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX license identifier should be
    applied to a file was done in a spreadsheet of side-by-side results
    from the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files, created by Philippe Ombredanne.
    Philippe prepared the base worksheet and did an initial spot review
    of a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file-by-file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    should be applied to each file. She confirmed any determination that
    was not immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source.
    - File already had some variant of a license header in it (even if <5
    lines).
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

08 Sep, 2017

1 commit

  • Pull MD updates from Shaohua Li:
    "This update mainly fixes bugs:

    - Make raid5 ppl support several ppls, from Pawel

    - Several raid5-cache bug fixes from Song

    - Bitmap fixes from Neil and me

    - One raid1/10 regression fix since 4.12 from me

    - Other small fixes and cleanup"

    * tag 'md/4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
    md/bitmap: disable bitmap_resize for file-backed bitmaps.
    raid5-ppl: Recovery support for multiple partial parity logs
    md: Runtime support for multiple ppls
    md/raid0: attach correct cgroup info in bio
    lib/raid6: align AVX512 constants to 512 bits, not bytes
    raid5: remove raid5_build_block
    md/r5cache: call mddev_lock/unlock() in r5c_journal_mode_show
    md: replace seq_release_private with seq_release
    md: notify about new spare disk in the container
    md/raid1/10: reset bio allocated from mempool
    md/raid5: release/flush io in raid5_do_work()
    md/bitmap: copy correct data for bitmap super

    Linus Torvalds
     

26 Aug, 2017

1 commit


10 Aug, 2017

2 commits

  • Provide a NEON accelerated implementation of the recovery algorithm,
    which supersedes the default byte-by-byte one.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Catalin Marinas

    Ard Biesheuvel
     
  • The P/Q left side optimization in the delta syndrome simply involves
    repeatedly multiplying a value by polynomial 'x' in GF(2^8). Given
    that 'x * x * x * x' equals 'x^4' even in the polynomial world, we
    can accelerate this substantially by performing up to 4 such operations
    at once, using the NEON instructions for polynomial multiplication.

    Results on a Cortex-A57 running in 64-bit mode:

    Before:
    -------
    raid6: neonx1 xor() 1680 MB/s
    raid6: neonx2 xor() 2286 MB/s
    raid6: neonx4 xor() 3162 MB/s
    raid6: neonx8 xor() 3389 MB/s

    After:
    ------
    raid6: neonx1 xor() 2281 MB/s
    raid6: neonx2 xor() 3362 MB/s
    raid6: neonx4 xor() 3787 MB/s
    raid6: neonx8 xor() 4239 MB/s

    While we're at it, simplify MASK() by using a signed shift rather than
    a vector compare involving a temp register.

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Catalin Marinas

    Ard Biesheuvel
     

16 May, 2017

1 commit

  • The raid6_gfexp table represents {2}^n values for 0 <= n < 256. The
    Linux async_tx framework passes values from raid6_gfexp as coefficients
    for each source to the prep_dma_pq() callback of a DMA channel with PQ
    capability. This creates a problem for RAID6 offload engines (such as
    Broadcom SBA) which take the disk position (i.e. the log of {2}) instead
    of multiplicative coefficients from the raid6_gfexp table.

    This patch adds a raid6_gflog table holding the log-of-2 value for any
    given x such that 0 <= x < 256. For any given disk coefficient x, the
    corresponding disk position is given by raid6_gflog[x]. The RAID6
    offload engine driver can use this newly added raid6_gflog table to
    get the disk position from the multiplicative coefficient.

    Signed-off-by: Anup Patel
    Reviewed-by: Scott Branden
    Reviewed-by: Ray Jui
    Acked-by: Shaohua Li
    Signed-off-by: Vinod Koul

    Anup Patel
     

08 Nov, 2016

1 commit


08 Oct, 2016

1 commit

  • Pull MD updates from Shaohua Li:
    "This update includes:

    - new AVX512 instruction based raid6 gen/recovery algorithm

    - a couple of md-cluster related bug fixes

    - fix a potential deadlock

    - set nonrotational bit for raid array with SSD

    - set correct max_hw_sectors for raid5/6, which hopefully can improve
    performance a little bit

    - other minor fixes"

    * tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
    md: set rotational bit
    raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays
    raid5: handle register_shrinker failure
    raid5: fix to detect failure of register_shrinker
    md: fix a potential deadlock
    md/bitmap: fix wrong cleanup
    raid5: allow arbitrary max_hw_sectors
    lib/raid6: Add AVX512 optimized xor_syndrome functions
    lib/raid6/test/Makefile: Add avx512 gen_syndrome and recovery functions
    lib/raid6: Add AVX512 optimized recovery functions
    lib/raid6: Add AVX512 optimized gen_syndrome functions
    md-cluster: make resync lock also could be interruptted
    md-cluster: introduce dlm_lock_sync_interruptible to fix tasks hang
    md-cluster: convert the completion to wait queue
    md-cluster: protect md_find_rdev_nr_rcu with rcu lock
    md-cluster: clean related infos of cluster
    md: changes for MD_STILL_CLOSED flag
    md-cluster: remove some unnecessary dlm_unlock_sync
    md-cluster: use FORCEUNLOCK in lockres_free
    md-cluster: call md_kick_rdev_from_array once ack failed

    Linus Torvalds
     

27 Sep, 2016

1 commit

  • Specify the aligned attributes on the char data[NDISKS][PAGE_SIZE],
    char recovi[PAGE_SIZE] and char recovj[PAGE_SIZE] arrays, so that all
    malloc'd memory is page-boundary aligned.

    Without these alignment attributes, the test causes a segfault in
    userspace when NDISKS is changed from 16 to 4.

    The RAID stripes will be page aligned anyway, so we want to test what
    the kernel actually will execute.

    Cc: H. Peter Anvin
    Cc: Yu-cheng Yu
    Signed-off-by: Gayatri Kammela
    Reviewed-by: H. Peter Anvin
    Signed-off-by: Shaohua Li

    Gayatri Kammela
     

22 Sep, 2016

4 commits

  • Optimize RAID6 xor_syndrome functions to take advantage of the 512-bit
    ZMM integer instructions introduced in AVX512.

    The AVX512-optimized xor_syndrome functions are simply based on
    sse2.c, written by hpa.

    The patch was tested and benchmarked before submission on hardware
    that has the AVX512 flags to support such instructions.

    Cc: H. Peter Anvin
    Cc: Jim Kukunas
    Cc: Fenghua Yu
    Cc: Megha Dey
    Signed-off-by: Gayatri Kammela
    Reviewed-by: Fenghua Yu
    Signed-off-by: Shaohua Li

    Gayatri Kammela
     
  • Add avx512 gen_syndrome and recovery functions so that the code can
    be compiled and tested successfully in userspace.

    This patch was tested in userspace, and an improvement in performance
    was observed.

    Cc: H. Peter Anvin
    Cc: Jim Kukunas
    Cc: Fenghua Yu
    Signed-off-by: Megha Dey
    Signed-off-by: Gayatri Kammela
    Reviewed-by: Fenghua Yu
    Signed-off-by: Shaohua Li

    Gayatri Kammela
     
  • Optimize RAID6 recovery functions to take advantage of
    the 512-bit ZMM integer instructions introduced in AVX512.

    The AVX512-optimized recovery functions are simply based on
    recov_avx2.c, written by Jim Kukunas.

    This patch was tested and benchmarked before submission on hardware
    that has the AVX512 flags to support such instructions.

    Cc: Jim Kukunas
    Cc: H. Peter Anvin
    Cc: Fenghua Yu
    Signed-off-by: Megha Dey
    Signed-off-by: Gayatri Kammela
    Reviewed-by: Fenghua Yu
    Signed-off-by: Shaohua Li

    Gayatri Kammela
     
  • Optimize the RAID6 gen_syndrome functions to take advantage of
    the 512-bit ZMM integer instructions introduced in AVX512.

    The AVX512-optimized gen_syndrome functions are simply based on
    avx2.c, written by Yuanhan Liu, and sse2.c, written by hpa.

    The patch was tested and benchmarked before submission on hardware
    that has the AVX512 flags to support such instructions.

    Cc: H. Peter Anvin
    Cc: Jim Kukunas
    Cc: Fenghua Yu
    Signed-off-by: Megha Dey
    Signed-off-by: Gayatri Kammela
    Reviewed-by: Fenghua Yu
    Signed-off-by: Shaohua Li

    Gayatri Kammela
     

01 Sep, 2016

1 commit


29 Aug, 2016

1 commit

  • Using vector registers is substantially faster:

    raid6: vx128x8 gen() 19705 MB/s
    raid6: vx128x8 xor() 11886 MB/s
    raid6: using algorithm vx128x8 gen() 19705 MB/s
    raid6: .... xor() 11886 MB/s, rmw enabled

    vs the software algorithms:

    raid6: int64x1 gen() 3018 MB/s
    raid6: int64x1 xor() 1429 MB/s
    raid6: int64x2 gen() 4661 MB/s
    raid6: int64x2 xor() 3143 MB/s
    raid6: int64x4 gen() 5392 MB/s
    raid6: int64x4 xor() 3509 MB/s
    raid6: int64x8 gen() 4441 MB/s
    raid6: int64x8 xor() 3207 MB/s
    raid6: using algorithm int64x4 gen() 5392 MB/s
    raid6: .... xor() 3509 MB/s, rmw enabled

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

01 Dec, 2015

1 commit

  • The enable_kernel_*() functions leave the relevant MSR bits enabled
    until we exit the kernel sometime later. Create disable versions
    that wrap the kernel use of FP, Altivec, VSX or SPE.

    While we don't want to disable it normally for performance reasons
    (MSR writes are slow), it will be used for a debug boot option that
    does this and catches bad uses in other areas of the kernel.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Michael Ellerman

    Anton Blanchard
     

01 Sep, 2015

1 commit

  • This implements XOR syndrome calculation using NEON intrinsics.
    As before, the module can be built for ARM and arm64 from the
    same source.

    Relative performance on a Cortex-A57 based system:

    raid6: int64x1 gen() 905 MB/s
    raid6: int64x1 xor() 881 MB/s
    raid6: int64x2 gen() 1343 MB/s
    raid6: int64x2 xor() 1286 MB/s
    raid6: int64x4 gen() 1896 MB/s
    raid6: int64x4 xor() 1321 MB/s
    raid6: int64x8 gen() 1773 MB/s
    raid6: int64x8 xor() 1165 MB/s
    raid6: neonx1 gen() 1834 MB/s
    raid6: neonx1 xor() 1278 MB/s
    raid6: neonx2 gen() 2528 MB/s
    raid6: neonx2 xor() 1942 MB/s
    raid6: neonx4 gen() 2888 MB/s
    raid6: neonx4 xor() 2334 MB/s
    raid6: neonx8 gen() 2957 MB/s
    raid6: neonx8 xor() 2232 MB/s
    raid6: using algorithm neonx8 gen() 2957 MB/s
    raid6: .... xor() 2232 MB/s, rmw enabled

    Cc: Markus Stockhausen
    Cc: Neil Brown
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: NeilBrown

    Ard Biesheuvel
     

24 Jun, 2015

1 commit

  • Pull powerpc updates from Michael Ellerman:

    - disable the 32-bit vdso when building LE, so we can build with a
    64-bit only toolchain.

    - EEH fixes from Gavin & Richard.

    - enable the sys_kcmp syscall from Laurent.

    - sysfs control for fastsleep workaround from Shreyas.

    - expose OPAL events as an irq chip by Alistair.

    - MSI ops moved to pci_controller_ops by Daniel.

    - fix for kernel to userspace backtraces for perf from Anton.

    - merge pseries and pseries_le defconfigs from Cyril.

    - CXL in-kernel API from Mikey.

    - OPAL prd driver from Jeremy.

    - fix for DSCR handling & tests from Anshuman.

    - Powernv flash mtd driver from Cyril.

    - dynamic DMA Window support on powernv from Alexey.

    - LLVM clang fixes & workarounds from Anton.

    - reworked version of the patch to abort syscalls when transactional.

    - fix the swap encoding to support 4TB, from Aneesh.

    - various fixes as usual.

    - Freescale updates from Scott: Highlights include more 8xx
    optimizations, an e6500 hugetlb optimization, QMan device tree nodes,
    t1024/t1023 support, and various fixes and cleanup.

    * tag 'powerpc-4.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (180 commits)
    cxl: Fix typo in debug print
    cxl: Add CXL_KERNEL_API config option
    powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma()
    powerpc/mm: Change the swap encoding in pte.
    powerpc/mm: PTE_RPN_MAX is not used, remove the same
    powerpc/tm: Abort syscalls in active transactions
    powerpc/iommu/ioda2: Enable compile with IOV=on and IOMMU_API=off
    powerpc/include: Add opal-prd to installed uapi headers
    powerpc/powernv: fix construction of opal PRD messages
    powerpc/powernv: Increase opal-irqchip initcall priority
    powerpc: Make doorbell check preemption safe
    powerpc/powernv: pnv_init_idle_states() should only run on powernv
    macintosh/nvram: Remove as unused
    powerpc: Don't use gcc specific options on clang
    powerpc: Don't use -mno-strict-align on clang
    powerpc: Only use -mtraceback=no, -mno-string and -msoft-float if toolchain supports it
    powerpc: Only use -mabi=altivec if toolchain supports it
    powerpc: Fix duplicate const clang warning in user access code
    vfio: powerpc/spapr: Support Dynamic DMA windows
    vfio: powerpc/spapr: Register memory and define IOMMU v2
    ...

    Linus Torvalds
     

11 Jun, 2015

1 commit


19 May, 2015

1 commit

  • We already have fpu/types.h, move i387.h to fpu/api.h.

    The file name has become a misnomer anyway: it offers generic FPU APIs,
    but is not limited to i387 functionality.

    Reviewed-by: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Fenghua Yu
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

22 Apr, 2015

4 commits

  • The second (and last) optimized XOR syndrome calculation. This version
    supports right and left side optimization. All CPUs with architecture
    older than Haswell will benefit from it.

    It should be noted that SSE2 movntdq kills performance for memory areas
    that are read and written simultaneously in chunks smaller than cache
    line size. So use movdqa instead for P/Q writes in sse21 and sse22 XOR
    functions.

    Signed-off-by: Markus Stockhausen
    Signed-off-by: NeilBrown

    Markus Stockhausen
     
  • Start the algorithms with the very basic one. It is left and right
    optimized. That means we can avoid all calculations for unneeded pages
    above the right stop offset. For pages below the left start offset we
    still need the syndrome multiplication but without reading data pages.

    Signed-off-by: Markus Stockhausen
    Signed-off-by: NeilBrown

    Markus Stockhausen
     
  • It is always helpful to have a test tool in place when implementing
    new data-critical algorithms. So add some test routines to the raid6
    checker that can prove whether the new xor_syndrome() works as
    expected.

    Run through all permutations of start/stop pages per algorithm and
    simulate an xor_syndrome()-assisted rmw run. After each rmw, check
    whether the recovery algorithm still confirms that the stripe is fine.

    Signed-off-by: Markus Stockhausen
    Signed-off-by: NeilBrown

    Markus Stockhausen
     
  • v3: s-o-b comment, explanation of performance and decision for
    the start/stop implementation

    Implementing rmw functionality for RAID6 requires optimized syndrome
    calculation. Up to now we can only generate a complete syndrome. The
    target P/Q pages are always overwritten. With this patch we provide
    a framework for in-place P/Q modification. For now, those functions
    are simply filled in with NULL values.

    xor_syndrome() has two additional parameters: start & stop. These
    will indicate the first and last page that are changing during a
    rmw run. That makes it possible to avoid several unnecessary loops
    and speed up calculation. The caller needs to implement the following
    logic to make the functions work.

    1) xor_syndrome(disks, start, stop, ...): "Remove" all data of source
    blocks inside P/Q between (and including) start and end.

    2) Modify any block with start <= block <= stop.

    3) xor_syndrome(disks, start, stop, ...): "Reinsert" all data of source
    blocks into P/Q between (and including) start and end.
    Signed-off-by: NeilBrown

    Markus Stockhausen
     

04 Feb, 2015

1 commit


14 Oct, 2014

1 commit


11 Sep, 2013

1 commit

  • Pull md update from Neil Brown:
    "Headline item is multithreading for RAID5 so that more IO/sec can be
    supported on fast (SSD) devices. Also TILE-Gx SIMD support for RAID6
    calculations and an assortment of bug fixes"

    * tag 'md/3.12' of git://neil.brown.name/md:
    raid5: only wakeup necessary threads
    md/raid5: flush out all pending requests before proceeding with reshape.
    md/raid5: use seqcount to protect access to shape in make_request.
    raid5: sysfs entry to control worker thread number
    raid5: offload stripe handle to workqueue
    raid5: fix stripe release order
    raid5: make release_stripe lockless
    md: avoid deadlock when dirty buffers during md_stop.
    md: Don't test all of mddev->flags at once.
    md: Fix apparent cut-and-paste error in super_90_validate
    raid6/test: replace echo -e with printf
    RAID: add tilegx SIMD implementation of raid6
    md: fix safe_mode buglet.
    md: don't call md_allow_write in get_bitmap_file.

    Linus Torvalds
     

27 Aug, 2013

2 commits

  • -e is a non-standard echo option, and echo output is
    implementation-dependent when it is used. Replace echo -e with
    printf, as suggested by the POSIX echo manual.

    Cc: NeilBrown
    Cc: Jim Kukunas
    Cc: "H. Peter Anvin"
    Cc: Yuanhan Liu
    Acked-by: H. Peter Anvin
    Signed-off-by: Max Filippov
    Signed-off-by: NeilBrown

    Max Filippov
     
  • This change adds TILE-Gx SIMD instructions to the software raid
    (md), modeling the Altivec implementation. This is only for Syndrome
    generation; there is more that could be done to improve recovery,
    as in the recent Intel SSE3 recovery implementation.

    The code unrolls 8 times; this turns out to be the best on tilegx
    hardware among the set 1, 2, 4, 8 or 16. The code reads one
    cache-line of data from each disk, stores P and Q then goes to the
    next cache-line.

    The test code in sys/linux/lib/raid6/test reports 2008 MB/s data
    read rate for syndrome generation using 18 disks (16 data and 2
    parity). It was 1512 MB/s before these SIMD optimizations. This is
    running on 1 core with all the data in cache.

    This is based on the paper The Mathematics of RAID-6.
    (http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf).

    Signed-off-by: Ken Steele
    Signed-off-by: Chris Metcalf
    Signed-off-by: NeilBrown

    Ken Steele
     

09 Jul, 2013

1 commit

  • Rebased/reworked a patch contributed by Rob Herring that uses
    NEON intrinsics to perform the RAID-6 syndrome calculations.
    It uses the existing unroll.awk code to generate several
    unrolled versions of which the best performing one is selected
    at boot time.

    Signed-off-by: Ard Biesheuvel
    Acked-by: Nicolas Pitre
    Cc: hpa@linux.intel.com

    Ard Biesheuvel
     

13 Dec, 2012

3 commits


28 May, 2012

1 commit

  • Make the recovery functions static to fix the following sparse warnings:

    lib/raid6/recov.c:25:6: warning: symbol 'raid6_2data_recov_intx1' was
    not declared. Should it be static?
    lib/raid6/recov.c:69:6: warning: symbol 'raid6_datap_recov_intx1' was
    not declared. Should it be static?
    lib/raid6/recov_ssse3.c:22:6: warning: symbol 'raid6_2data_recov_ssse3'
    was not declared. Should it be static?
    lib/raid6/recov_ssse3.c:197:6: warning: symbol 'raid6_datap_recov_ssse3'
    was not declared. Should it be static?

    Reported-by: Fengguang Wu
    Signed-off-by: Jim Kukunas
    Signed-off-by: NeilBrown

    Jim Kukunas
     

22 May, 2012

4 commits


29 Mar, 2012

2 commits