12 May, 2018
1 commit
-
test_find_first_bit() is intentionally sub-optimal, and may cause soft
lockup due to long time of run on some systems. So decrease length of
bitmap to traverse to avoid lockup.With the change below, time of test execution doesn't exceed 0.2 seconds
on my testing system.Link: http://lkml.kernel.org/r/20180420171949.15710-1-ynorov@caviumnetworks.com
Fixes: 4441fca0a27f5 ("lib: test module for find_*_bit() functions")
Signed-off-by: Yury Norov
Reviewed-by: Andrew Morton
Reported-by: Fengguang Wu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
07 Feb, 2018
3 commits
-
We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and().
It's essentially a joined iteration in search for a non-zero bit, which is
currently implemented as a lookup join (find a nonzero bit on the lhs,
lookup the rhs to see if it's set there).Implement a direct join (find a nonzero bit on the incrementally built
join). Also add generic bitmap benchmarks in the new `test_find_bit`
module for new function (see `find_next_and_bit` in [2] and [3] below).For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x
faster with a geometric mean of 2.1 on 32 CPUs [1]. No impact on memory
usage. Note that on Arm, the new pure-C implementation still outperforms
the old one that uses a mix of C and asm (`find_next_bit`) [3].[1] Approximate benchmark code:
```
unsigned long src1p[nr_cpumask_longs] = {pattern1};
unsigned long src2p[nr_cpumask_longs] = {pattern2};
for (/*a bunch of repetitions*/) {
for (int n = -1; n ]
Link: http://lkml.kernel.org/r/1512556816-28627-1-git-send-email-geert@linux-m68k.org
Link: http://lkml.kernel.org/r/20171128131334.23491-1-courbet@google.com
Signed-off-by: Clement Courbet
Signed-off-by: Geert Uytterhoeven
Cc: Yury Norov
Cc: Geert Uytterhoeven
Cc: Alexey Dobriyan
Cc: Rasmus Villemoes
Signed-off-by: Andrew MortonSigned-off-by: Linus Torvalds
-
As suggested in review comments:
* printk: align numbers using whitespaces instead of tabs;
* return error value from init() to avoid calling rmmod if testing again;
* use ktime_get instead of get_cycles as some arches don't support it;The output in dmesg (on QEMU arm64):
[ 38.823430] Start testing find_bit() with random-filled bitmap
[ 38.845358] find_next_bit: 20138448 ns, 163968 iterations
[ 38.856217] find_next_zero_bit: 10615328 ns, 163713 iterations
[ 38.863564] find_last_bit: 7111888 ns, 163967 iterations
[ 40.944796] find_first_bit: 2081007216 ns, 163968 iterations
[ 40.944975]
[ 40.944975] Start testing find_bit() with sparse bitmap
[ 40.945268] find_next_bit: 73216 ns, 656 iterations
[ 40.967858] find_next_zero_bit: 22461008 ns, 327025 iterations
[ 40.968047] find_last_bit: 62320 ns, 656 iterations
[ 40.978060] find_first_bit: 9889360 ns, 656 iterationsLink: http://lkml.kernel.org/r/20171124143040.a44jvhmnaiyedg2i@yury-thinkpad
Signed-off-by: Yury Norov
Tested-by: Geert Uytterhoeven
Cc: Alexey Dobriyan
Cc: Clement Courbet
Cc: Matthew Wilcox
Cc: Rasmus Villemoes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As suggested in review comments, rename test_find_bit.c to
find_bit_benchmark.c.Link: http://lkml.kernel.org/r/20171124143040.a44jvhmnaiyedg2i@yury-thinkpad
Signed-off-by: Yury Norov
Tested-by: Geert Uytterhoeven
Cc: Alexey Dobriyan
Cc: Clement Courbet
Cc: Matthew Wilcox
Cc: Rasmus Villemoes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds