20 Jan, 2021
1 commit
-
[ Upstream commit 0c36d88cff4d72149f94809303c5180b6f716d39 ]
Older versions of BSD awk are fussy about the order of '-v' and '-f'
flags, and require a space after the flag name. This causes build
failures on platforms with an old awk, such as macOS and NetBSD.Since GNU awk and modern versions of BSD awk (distributed with
FreeBSD/OpenBSD) are fine with either form, the definition of
'cmd_unroll' can be trivially tweaked to let the lib/raid6 Makefile
work with both old and new awk flag dialects.Signed-off-by: John Millikin
Signed-off-by: Masahiro Yamada
Signed-off-by: Sasha Levin
04 Feb, 2020
1 commit
-
In old days, the "host-progs" syntax was used for specifying host
programs. It was renamed to the current "hostprogs-y" in 2004.It is typically useful in scripts/Makefile because it allows Kbuild to
selectively compile host programs based on the kernel configuration.This commit renames like follows:
always -> always-y
hostprogs-y -> hostprogsSo, scripts/Makefile will look like this:
always-$(CONFIG_BUILD_BIN2C) += ...
always-$(CONFIG_KALLSYMS) += ...
...
hostprogs := $(always-y) $(always-m)I think this makes more sense because a host program is always a host
program, irrespective of the kernel configuration. We want to specify
which ones to compile by CONFIG options, so always-y will be handier.The "always", "hostprogs-y", "hostprogs-m" will be kept for backward
compatibility for a while.Signed-off-by: Masahiro Yamada
31 Jul, 2019
1 commit
-
The following four files are every time rebuilt:
UNROLL lib/raid6/vpermxor1.c
UNROLL lib/raid6/vpermxor2.c
UNROLL lib/raid6/vpermxor4.c
UNROLL lib/raid6/vpermxor8.cFix the suffixes in the targets.
Fixes: 72ad21075df8 ("lib/raid6: refactor unroll rules with pattern rules")
Signed-off-by: Masahiro Yamada
24 Jun, 2019
2 commits
-
This Makefile repeats very similar rules.
Let's use pattern rules. $(UNROLL) can be replaced with $*.
No intended change in behavior.
Signed-off-by: Masahiro Yamada
-
No intended change in behavior.
Signed-off-by: Masahiro Yamada
12 Feb, 2019
1 commit
-
While building arm32 allyesconfig, I ran into the following errors:
arch/arm/lib/xor-neon.c:17:2: error: You should compile this file with
'-mfloat-abi=softfp -mfpu=neon'In file included from lib/raid6/neon1.c:27:
/home/nathan/cbl/prebuilt/lib/clang/8.0.0/include/arm_neon.h:28:2:
error: "NEON support not enabled"Building V=1 showed NEON_FLAGS getting passed along to Clang but
__ARM_NEON__ was not getting defined. Ultimately, it boils down to Clang
only defining __ARM_NEON__ when targeting armv7, rather than armv6k,
which is the '-march' value for allyesconfig.>From lib/Basic/Targets/ARM.cpp in the Clang source:
// This only gets set when Neon instructions are actually available, unlike
// the VFP define, hence the soft float and arch check. This is subtly
// different from gcc, we follow the intent which was that it should be set
// when Neon instructions are actually available.
if ((FPU & NeonFPU) && !SoftFloat && ArchVersion >= 7) {
Builder.defineMacro("__ARM_NEON", "1");
Builder.defineMacro("__ARM_NEON__");
// current AArch32 NEON implementations do not support double-precision
// floating-point even when it is present in VFP.
Builder.defineMacro("__ARM_NEON_FP",
"0x" + Twine::utohexstr(HW_FP & ~HW_FP_DP));
}Ard Biesheuvel recommended explicitly adding '-march=armv7-a' at the
beginning of the NEON_FLAGS definitions so that __ARM_NEON__ always gets
definined by Clang. This doesn't functionally change anything because
that code will only run where NEON is supported, which is implicitly
armv7.Link: https://github.com/ClangBuiltLinux/linux/issues/287
Suggested-by: Ard Biesheuvel
Signed-off-by: Nathan Chancellor
Acked-by: Nicolas Pitre
Reviewed-by: Nick Desaulniers
Reviewed-by: Stefan Agner
Signed-off-by: Russell King
06 Jan, 2019
1 commit
-
Since commit 9c2af1c7377a ("kbuild: add .DELETE_ON_ERROR special
target"), the target file is automatically deleted on failure.The boilerplate code
... || { rm -f $@; false; }
is unneeded.
Signed-off-by: Masahiro Yamada
20 Dec, 2018
1 commit
-
We cannot build these files with clang as it does not allow altivec
instructions in assembly when -msoft-float is passed.Jinsong Ji wrote:
> We currently disable Altivec/VSX support when enabling soft-float. So
> any usage of vector builtins will break.
>
> Enable Altivec/VSX with soft-float may need quite some clean up work, so
> I guess this is currently a limitation.
>
> Removing -msoft-float will make it work (and we are lucky that no
> floating point instructions will be generated as well).This is a workaround until the issue is resolved in clang.
Link: https://bugs.llvm.org/show_bug.cgi?id=31177
Link: https://github.com/ClangBuiltLinux/linux/issues/239
Signed-off-by: Joel Stanley
Reviewed-by: Nick Desaulniers
Signed-off-by: Michael Ellerman
08 Apr, 2018
1 commit
-
Pull powerpc updates from Michael Ellerman:
"Notable changes:- Support for 4PB user address space on 64-bit, opt-in via mmap().
- Removal of POWER4 support, which was accidentally broken in 2016
and no one noticed, and blocked use of some modern instructions.- Workarounds so that the hypervisor can enable Transactional Memory
on Power9.- A series to disable the DAWR (Data Address Watchpoint Register) on
Power9.- More information displayed in the meltdown/spectre_v1/v2 sysfs
files.- A vpermxor (Power8 Altivec) implementation for the raid6 Q
Syndrome.- A big series to make the allocation of our pacas (per cpu area),
kernel page tables, and per-cpu stacks NUMA aware when using the
Radix MMU on Power9.And as usual many fixes, reworks and cleanups.
Thanks to: Aaro Koskinen, Alexandre Belloni, Alexey Kardashevskiy,
Alistair Popple, Andy Shevchenko, Aneesh Kumar K.V, Anshuman Khandual,
Balbir Singh, Benjamin Herrenschmidt, Christophe Leroy, Christophe
Lombard, Cyril Bur, Daniel Axtens, Dave Young, Finn Thain, Frederic
Barrat, Gustavo Romero, Horia Geantă, Jonathan Neuschäfer, Kees Cook,
Larry Finger, Laurent Dufour, Laurent Vivier, Logan Gunthorpe,
Madhavan Srinivasan, Mark Greer, Mark Hairgrove, Markus Elfring,
Mathieu Malaterre, Matt Brown, Matt Evans, Mauricio Faria de Oliveira,
Michael Neuling, Naveen N. Rao, Nicholas Piggin, Paul Mackerras,
Philippe Bergheaud, Ram Pai, Rob Herring, Sam Bobroff, Segher
Boessenkool, Simon Guo, Simon Horman, Stewart Smith, Sukadev
Bhattiprolu, Suraj Jitindar Singh, Thiago Jung Bauermann, Vaibhav
Jain, Vaidyanathan Srinivasan, Vasant Hegde, Wei Yongjun"* tag 'powerpc-4.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (207 commits)
powerpc/64s/idle: Fix restore of AMOR on POWER9 after deep sleep
powerpc/64s: Fix POWER9 DD2.2 and above in cputable features
powerpc/64s: Fix pkey support in dt_cpu_ftrs, add CPU_FTR_PKEY bit
powerpc/64s: Fix dt_cpu_ftrs to have restore_cpu clear unwanted LPCR bits
Revert "powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead"
powerpc: iomap.c: introduce io{read|write}64_{lo_hi|hi_lo}
powerpc: io.h: move iomap.h include so that it can use readq/writeq defs
cxl: Fix possible deadlock when processing page faults from cxllib
powerpc/hw_breakpoint: Only disable hw breakpoint if cpu supports it
powerpc/mm/radix: Update command line parsing for disable_radix
powerpc/mm/radix: Parse disable_radix commandline correctly.
powerpc/mm/hugetlb: initialize the pagetable cache correctly for hugetlb
powerpc/mm/radix: Update pte fragment count from 16 to 256 on radix
powerpc/mm/keys: Update documentation and remove unnecessary check
powerpc/64s/idle: POWER9 ESL=0 stop avoid save/restore overhead
powerpc/64s/idle: Consolidate power9_offline_stop()/power9_idle_stop()
powerpc/powernv: Always stop secondaries before reboot/shutdown
powerpc: hard disable irqs in smp_send_stop loop
powerpc: use NMI IPI for smp_send_stop
powerpc/powernv: Fix SMT4 forcing idle code
...
26 Mar, 2018
1 commit
-
The Tile architecture is getting removed, so we no longer need this either.
Acked-by: Ard Biesheuvel
Signed-off-by: Arnd Bergmann
20 Mar, 2018
1 commit
-
This patch uses the vpermxor instruction to optimise the raid6 Q
syndrome. This instruction was made available with POWER8, ISA version
2.07. It allows for both vperm and vxor instructions to be done in a
single instruction. This has been tested for correctness on a ppc64le
vm with a basic RAID6 setup containing 5 drives.The performance benchmarks are from the raid6test in the
/lib/raid6/test directory. These results are from an IBM Firestone
machine with ppc64le architecture. The benchmark results show a 35%
speed increase over the best existing algorithm for powerpc (altivec).
The raid6test has also been run on a big-endian ppc64 vm to ensure it
also works for big-endian architectures.Performance benchmarks:
raid6: altivecx4 gen() 18773 MB/s
raid6: altivecx8 gen() 19438 MB/sraid6: vpermxor4 gen() 25112 MB/s
raid6: vpermxor8 gen() 26279 MB/sSigned-off-by: Matt Brown
Reviewed-by: Daniel Axtens
[mpe: Add VPERMXOR macro so we can build with old binutils]
Signed-off-by: Michael Ellerman
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
10 Aug, 2017
1 commit
-
Provide a NEON accelerated implementation of the recovery algorithm,
which supersedes the default byte-by-byte one.Signed-off-by: Ard Biesheuvel
Signed-off-by: Catalin Marinas
08 Oct, 2016
1 commit
-
Pull MD updates from Shaohua Li:
"This update includes:- new AVX512 instruction based raid6 gen/recovery algorithm
- a couple of md-cluster related bug fixes
- fix a potential deadlock
- set nonrotational bit for raid array with SSD
- set correct max_hw_sectors for raid5/6, which hopefuly can improve
performance a little bit- other minor fixes"
* tag 'md/4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
md: set rotational bit
raid6/test/test.c: bug fix: Specify aligned(alignment) attributes to the char arrays
raid5: handle register_shrinker failure
raid5: fix to detect failure of register_shrinker
md: fix a potential deadlock
md/bitmap: fix wrong cleanup
raid5: allow arbitrary max_hw_sectors
lib/raid6: Add AVX512 optimized xor_syndrome functions
lib/raid6/test/Makefile: Add avx512 gen_syndrome and recovery functions
lib/raid6: Add AVX512 optimized recovery functions
lib/raid6: Add AVX512 optimized gen_syndrome functions
md-cluster: make resync lock also could be interruptted
md-cluster: introduce dlm_lock_sync_interruptible to fix tasks hang
md-cluster: convert the completion to wait queue
md-cluster: protect md_find_rdev_nr_rcu with rcu lock
md-cluster: clean related infos of cluster
md: changes for MD_STILL_CLOSED flag
md-cluster: remove some unnecessary dlm_unlock_sync
md-cluster: use FORCEUNLOCK in lockres_free
md-cluster: call md_kick_rdev_from_array once ack failed
22 Sep, 2016
2 commits
-
Optimize RAID6 recovery functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.AVX512 optimized recovery functions, which is simply based
on recov_avx2.c written by Jim KukunasThis patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructionsCc: Jim Kukunas
Cc: H. Peter Anvin
Cc: Fenghua Yu
Signed-off-by: Megha Dey
Signed-off-by: Gayatri Kammela
Reviewed-by: Fenghua Yu
Signed-off-by: Shaohua Li -
Optimize RAID6 gen_syndrom functions to take advantage of
the 512-bit ZMM integer instructions introduced in AVX512.AVX512 optimized gen_syndrom functions, which is simply based
on avx2.c written by Yuanhan Liu and sse2.c written by hpa.The patch was tested and benchmarked before submission on
a hardware that has AVX512 flags to support such instructionsCc: H. Peter Anvin
Cc: Jim Kukunas
Cc: Fenghua Yu
Signed-off-by: Megha Dey
Signed-off-by: Gayatri Kammela
Reviewed-by: Fenghua Yu
Signed-off-by: Shaohua Li
01 Sep, 2016
1 commit
-
The XC instruction can be used to improve the speed of the raid6
recovery. The loops now operate on blocks of 256 bytes.Signed-off-by: Martin Schwidefsky
29 Aug, 2016
1 commit
-
Using vector registers is slightly faster:
raid6: vx128x8 gen() 19705 MB/s
raid6: vx128x8 xor() 11886 MB/s
raid6: using algorithm vx128x8 gen() 19705 MB/s
raid6: .... xor() 11886 MB/s, rmw enabledvs the software algorithms:
raid6: int64x1 gen() 3018 MB/s
raid6: int64x1 xor() 1429 MB/s
raid6: int64x2 gen() 4661 MB/s
raid6: int64x2 xor() 3143 MB/s
raid6: int64x4 gen() 5392 MB/s
raid6: int64x4 xor() 3509 MB/s
raid6: int64x8 gen() 4441 MB/s
raid6: int64x8 xor() 3207 MB/s
raid6: using algorithm int64x4 gen() 5392 MB/s
raid6: .... xor() 3509 MB/s, rmw enabledSigned-off-by: Martin Schwidefsky
11 Jun, 2015
1 commit
-
The -mabi=altivec option is not recognised on LLVM, so use call cc-option
to check for support.Signed-off-by: Anton Blanchard
Signed-off-by: Michael Ellerman
11 Sep, 2013
1 commit
-
Pull md update from Neil Brown:
"Headline item is multithreading for RAID5 so that more IO/sec can be
supported on fast (SSD) devices. Also TILE-Gx SIMD suppor for RAID6
calculations and an assortment of bug fixes"* tag 'md/3.12' of git://neil.brown.name/md:
raid5: only wakeup necessary threads
md/raid5: flush out all pending requests before proceeding with reshape.
md/raid5: use seqcount to protect access to shape in make_request.
raid5: sysfs entry to control worker thread number
raid5: offload stripe handle to workqueue
raid5: fix stripe release order
raid5: make release_stripe lockless
md: avoid deadlock when dirty buffers during md_stop.
md: Don't test all of mddev->flags at once.
md: Fix apparent cut-and-paste error in super_90_validate
raid6/test: replace echo -e with printf
RAID: add tilegx SIMD implementation of raid6
md: fix safe_mode buglet.
md: don't call md_allow_write in get_bitmap_file.
27 Aug, 2013
1 commit
-
This change adds TILE-Gx SIMD instructions to the software raid
(md), modeling the Altivec implementation. This is only for Syndrome
generation; there is more that could be done to improve recovery,
as in the recent Intel SSE3 recovery implementation.The code unrolls 8 times; this turns out to be the best on tilegx
hardware among the set 1, 2, 4, 8 or 16. The code reads one
cache-line of data from each disk, stores P and Q then goes to the
next cache-line.The test code in sys/linux/lib/raid6/test reports 2008 MB/s data
read rate for syndrome generation using 18 disks (16 data and 2
parity). It was 1512 MB/s before this SIMD optimizations. This is
running on 1 core with all the data in cache.This is based on the paper The Mathematics of RAID-6.
(http://kernel.org/pub/linux/kernel/people/hpa/raid6.pdf).Signed-off-by: Ken Steele
Signed-off-by: Chris Metcalf
Signed-off-by: NeilBrown
09 Jul, 2013
1 commit
-
Rebased/reworked a patch contributed by Rob Herring that uses
NEON intrinsics to perform the RAID-6 syndrome calculations.
It uses the existing unroll.awk code to generate several
unrolled versions of which the best performing one is selected
at boot time.Signed-off-by: Ard Biesheuvel
Acked-by: Nicolas Pitre
Cc: hpa@linux.intel.com
13 Dec, 2012
3 commits
-
sse and avx2 stuff only exist on x86 arch, and we don't need to build
altivec on x86. And we can do that at lib/raid6/Makefile.Proposed-by: H. Peter Anvin
Signed-off-by: Yuanhan Liu
Reviewed-by: H. Peter Anvin
Signed-off-by: Jim Kukunas
Signed-off-by: NeilBrown -
Add AVX2 optimized gen_syndrom functions, which is simply based on
sse2.c written by hpa.Signed-off-by: Yuanhan Liu
Reviewed-by: H. Peter Anvin
Signed-off-by: Jim Kukunas
Signed-off-by: NeilBrown -
Optimize RAID6 recovery functions to take advantage of
the 256-bit YMM integer instructions introduced in AVX2.The patch was tested and benchmarked before submission.
However hardware is not yet released so benchmark numbers
cannot be reported.Acked-by: "H. Peter Anvin"
Signed-off-by: Jim Kukunas
Signed-off-by: NeilBrown
22 May, 2012
1 commit
-
Add SSSE3 optimized recovery functions, as well as a system
for selecting the most appropriate recovery functions to use.Originally-by: H. Peter Anvin
Signed-off-by: Jim Kukunas
Signed-off-by: NeilBrown
11 Aug, 2010
1 commit
-
Linus asks 'why "raid6" twice?'. No reason.
Signed-off-by: David Woodhouse
09 Aug, 2010
1 commit
-
Conflicts:
drivers/md/Makefile
lib/raid6/unroll.pl
29 Oct, 2009
1 commit
-
We'll want to use these in btrfs too.
Signed-off-by: David Woodhouse