17 Oct, 2020
1 commit
-
kernel.h has been used as a dump for all kinds of stuff for a long time.
Here is an attempt to start cleaning it up by splitting out the min()/max()
et al. helpers.

At the same time, convert users in the header and lib folders to use the new header.
For the time being, though, include the new header back in kernel.h to avoid
twisted indirect includes for other existing users.

Signed-off-by: Andy Shevchenko
Signed-off-by: Andrew Morton
Cc: "Rafael J. Wysocki"
Cc: Steven Rostedt
Cc: Rasmus Villemoes
Cc: Joe Perches
Cc: Linus Torvalds
Link: https://lkml.kernel.org/r/20200910164152.GA1891694@smile.fi.intel.com
Signed-off-by: Linus Torvalds
01 Feb, 2020
3 commits
-
It saves 25% of .text for arm64, and more for BE architectures.

Before:
$ size lib/find_bit.o
   text    data     bss     dec     hex filename
   1012      56       0    1068     42c lib/find_bit.o

After:
$ size lib/find_bit.o
   text    data     bss     dec     hex filename
    776      56       0     832     340 lib/find_bit.o

Link: http://lkml.kernel.org/r/20200103202846.21616-3-yury.norov@gmail.com
Signed-off-by: Yury Norov
Cc: Thomas Gleixner
Cc: Allison Randal
Cc: William Breathitt Gray
Cc: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
_find_next_bit and _find_next_bit_le are very similar functions. It's
possible to join them by adding 1 parameter and a couple of simple
checks. This simplifies maintenance and makes it possible to shrink the size
of .text by un-inlining the unified function (in the following patch).

Link: http://lkml.kernel.org/r/20200103202846.21616-2-yury.norov@gmail.com
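
The unification described above can be sketched in userspace C. This is a minimal illustration, not the kernel implementation: the names sketch_find_next_bit, swap_bytes, lowest_set_bit and WORD_BITS are hypothetical, and the extra `le` parameter stands in for the "1 parameter" the message mentions. When `le` is set, the first-word mask and the found word are byte-swapped so that a little-endian bitmap is searched correctly on any host.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Reverse the byte order of one word (portable stand-in for swab()). */
static unsigned long swap_bytes(unsigned long x)
{
    unsigned long r = 0;

    for (size_t i = 0; i < sizeof(x); i++) {
        r = (r << CHAR_BIT) | (x & 0xFFUL);
        x >>= CHAR_BIT;
    }
    return r;
}

/* Index of the lowest set bit; the caller guarantees x != 0. */
static unsigned long lowest_set_bit(unsigned long x)
{
    unsigned long n = 0;

    while (!(x & 1UL)) {
        x >>= 1;
        n++;
    }
    return n;
}

/* One search routine for both layouts; 'le' selects byte-swapped words. */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
                                          unsigned long nbits,
                                          unsigned long start, int le)
{
    unsigned long tmp, mask;

    if (start >= nbits)
        return nbits;

    /* Ignore the bits below 'start' in the first word. */
    mask = ~0UL << (start % WORD_BITS);
    if (le)
        mask = swap_bytes(mask);
    tmp = addr[start / WORD_BITS] & mask;

    start -= start % WORD_BITS;
    while (!tmp) {
        start += WORD_BITS;
        if (start >= nbits)
            return nbits;
        tmp = addr[start / WORD_BITS];
    }

    if (le)
        tmp = swap_bytes(tmp);

    start += lowest_set_bit(tmp);
    return start < nbits ? start : nbits;
}
```

Passing `le` as 0 gives the native-order search; passing 1 gives the byte-swapped variant, so only one function body has to be maintained.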
Signed-off-by: Yury Norov
Cc: Allison Randal
Cc: Joe Perches
Cc: Thomas Gleixner
Cc: William Breathitt Gray
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
ext2_swab() is defined locally in lib/find_bit.c. However, it is not
specific to ext2, nor to bitmaps.

There are many potential users of it, so rename it to just swab() and
move it to include/uapi/linux/swab.h.

The ABI guarantees that the size of unsigned long corresponds to BITS_PER_LONG,
therefore drop the unneeded cast.

Link: http://lkml.kernel.org/r/20200103202846.21616-1-yury.norov@gmail.com
Signed-off-by: Yury Norov
Cc: Allison Randal
Cc: Joe Perches
Cc: Thomas Gleixner
Cc: William Breathitt Gray
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
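
As a rough userspace illustration of a byte swap that works on unsigned long regardless of its width, the sketch below dispatches on ULONG_MAX. The names sketch_swab32, sketch_swab64 and sketch_swab are hypothetical; the real helper lives in include/uapi/linux/swab.h and may differ in detail.

```c
#include <limits.h>
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
static inline uint32_t sketch_swab32(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}

/* Reverse the eight bytes of a 64-bit value using two 32-bit swaps. */
static inline uint64_t sketch_swab64(uint64_t x)
{
    return ((uint64_t)sketch_swab32((uint32_t)x) << 32) |
           sketch_swab32((uint32_t)(x >> 32));
}

/* Pick the swap that matches the width of unsigned long on this target. */
static inline unsigned long sketch_swab(unsigned long y)
{
#if ULONG_MAX == 0xFFFFFFFFUL
    return sketch_swab32((uint32_t)y);
#else
    return (unsigned long)sketch_swab64(y);
#endif
}
```

Because the width is resolved at compile time, callers need no cast to a fixed-width type.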
05 Dec, 2019
1 commit
-
Patch series "Introduce the for_each_set_clump8 macro", v18.

While adding GPIO get_multiple/set_multiple callback support for various
drivers, I noticed a recurring looping pattern that would be useful
standardized as a macro.

This patchset introduces the for_each_set_clump8 macro and utilizes it
in several GPIO drivers. The for_each_set_clump8 macro facilitates a
for-loop syntax that iterates over a memory region entire groups of set
bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example:     10111110 00000000 11111111 00110011
First loop:  10111110 00000000 11111111 XXXXXXXX
Second loop: 10111110 00000000 XXXXXXXX 00110011
Third loop:  XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

The for_each_set_clump8 macro has four parameters:
* start: set to the bit offset of the current clump
* clump: set to the current clump value
* bits: bitmap to search within
* size: bitmap size in number of bits

In this version of the patchset, the for_each_set_clump macro has been
reimplemented and simplified based on the suggestions provided by Rasmus
Villemoes and Andy Shevchenko in the version 4 submission.

In particular, the function of the for_each_set_clump macro has been
restricted to handle only 8-bit clumps; the drivers that use the
for_each_set_clump macro only handle 8-bit ports so a generic
for_each_set_clump implementation is not necessary. Thus, a solution
for large clumps (i.e. those larger than the width of a bitmap word)
can be postponed until a driver appears that actually requires such a
generic for_each_set_clump implementation.

For what it's worth, a semi-generic for_each_set_clump (i.e. for clumps
smaller than the width of a bitmap word) can be implemented by simply
replacing the hardcoded '8' and '0xFF' instances with respective
variables. I have not yet had a need for such an implementation, and
since it falls short of a true generic for_each_set_clump function, I
have decided to forgo such an implementation for now.

In addition, the bitmap_get_value8 and bitmap_set_value8 functions are
introduced to get and set 8-bit values respectively. Their use is based
on the behavior suggested in the patchset version 4 review.

This patch (of 14):

This macro iterates for each 8-bit group of bits (clump) with set bits,
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed to by "clump". Additionally, the
bitmap_get_value8 and bitmap_set_value8 functions are introduced to
respectively get and set an 8-bit value in a bitmap memory region.

[gustavo@embeddedor.com: fix potential sign-extension overflow]
Link: http://lkml.kernel.org/r/20191015184657.GA26541@embeddedor
[akpm@linux-foundation.org: s/ULL/UL/, per Joe]
[vilhelm.gray@gmail.com: add for_each_set_clump8 documentation]
Link: http://lkml.kernel.org/r/20191016161825.301082-1-vilhelm.gray@gmail.com
Link: http://lkml.kernel.org/r/893c3b4f03266c9496137cc98ac2b1bd27f92c73.1570641097.git.vilhelm.gray@gmail.com
Signed-off-by: William Breathitt Gray
Signed-off-by: Gustavo A. R. Silva
Suggested-by: Andy Shevchenko
Suggested-by: Rasmus Villemoes
Suggested-by: Lukas Wunner
Tested-by: Andy Shevchenko
Cc: Arnd Bergmann
Cc: Linus Walleij
Cc: Bartosz Golaszewski
Cc: Masahiro Yamada
Cc: Geert Uytterhoeven
Cc: Phil Reid
Cc: Geert Uytterhoeven
Cc: Mathias Duckeck
Cc: Morten Hein Tiljeset
Cc: Sean Nyekjaer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
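
The 32-bit walk-through in the cover letter above can be reproduced with a small userspace sketch. It is an illustration under stated assumptions, not the kernel code: the names sketch_get_value8, sketch_next_clump8 and sketch_for_each_set_clump8 are hypothetical, the size is assumed to be a multiple of 8, and clumps are scanned one by one rather than located through find_next_bit.

```c
#include <limits.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Read the 8-bit clump that starts at bit offset 'start' (a multiple of 8). */
static unsigned long sketch_get_value8(const unsigned long *map,
                                       unsigned long start)
{
    return (map[start / WORD_BITS] >> (start % WORD_BITS)) & 0xFF;
}

/*
 * Return the offset of the next non-zero clump at or after 'offset',
 * storing its value in *clump, or 'size' if no such clump exists.
 */
static unsigned long sketch_next_clump8(unsigned long *clump,
                                        const unsigned long *map,
                                        unsigned long size,
                                        unsigned long offset)
{
    for (; offset < size; offset += 8) {
        *clump = sketch_get_value8(map, offset);
        if (*clump)
            return offset;
    }
    return size;
}

/* Same four parameters as described above: start, clump, bits, size. */
#define sketch_for_each_set_clump8(start, clump, bits, size)              \
    for ((start) = sketch_next_clump8(&(clump), (bits), (size), 0);       \
         (start) < (size);                                                \
         (start) = sketch_next_clump8(&(clump), (bits), (size), (start) + 8))
```

For the example value 10111110 00000000 11111111 00110011 (0xBE00FF33), the loop body runs three times, with start set to 0, 8 and 24, and the all-zero group at offset 16 is skipped.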
31 May, 2019
1 commit
-
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman
07 Feb, 2018
1 commit
-
We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and().
It's essentially a joined iteration in search for a non-zero bit, which is
currently implemented as a lookup join (find a nonzero bit on the lhs,
lookup the rhs to see if it's set there).

Implement a direct join (find a nonzero bit on the incrementally built
join). Also add generic bitmap benchmarks in the new `test_find_bit`
module for new function (see `find_next_and_bit` in [2] and [3] below).

For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x
faster with a geometric mean of 2.1 on 32 CPUs [1]. No impact on memory
usage. Note that on Arm, the new pure-C implementation still outperforms
the old one that uses a mix of C and asm (`find_next_bit`) [3].

[1] Approximate benchmark code:
```
unsigned long src1p[nr_cpumask_longs] = {pattern1};
unsigned long src2p[nr_cpumask_longs] = {pattern2};
for (/*a bunch of repetitions*/) {
    for (int n = -1; n [...]
```
Link: http://lkml.kernel.org/r/1512556816-28627-1-git-send-email-geert@linux-m68k.org
Link: http://lkml.kernel.org/r/20171128131334.23491-1-courbet@google.com
Signed-off-by: Clement Courbet
Signed-off-by: Geert Uytterhoeven
Cc: Yury Norov
Cc: Geert Uytterhoeven
Cc: Alexey Dobriyan
Cc: Rasmus Villemoes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
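
The "direct join" described in this commit can be sketched as follows: instead of finding a set bit in one bitmap and then looking it up in the other, each pair of words is ANDed first and the search runs on the result. This is a simplified userspace illustration; sketch_find_next_and_bit, lowest_set_bit and WORD_BITS are hypothetical names, not the kernel's definitions.

```c
#include <limits.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Index of the lowest set bit; the caller guarantees x != 0. */
static unsigned long lowest_set_bit(unsigned long x)
{
    unsigned long n = 0;

    while (!(x & 1UL)) {
        x >>= 1;
        n++;
    }
    return n;
}

/*
 * Find the first bit at or after 'start' that is set in both bitmaps,
 * or return 'nbits' if there is none.
 */
static unsigned long sketch_find_next_and_bit(const unsigned long *a,
                                              const unsigned long *b,
                                              unsigned long nbits,
                                              unsigned long start)
{
    unsigned long tmp;

    if (start >= nbits)
        return nbits;

    /* Build the join for the first word and drop bits below 'start'. */
    tmp = a[start / WORD_BITS] & b[start / WORD_BITS];
    tmp &= ~0UL << (start % WORD_BITS);

    start -= start % WORD_BITS;
    while (!tmp) {
        start += WORD_BITS;
        if (start >= nbits)
            return nbits;
        /* Build the join incrementally, one word at a time. */
        tmp = a[start / WORD_BITS] & b[start / WORD_BITS];
    }

    start += lowest_set_bit(tmp);
    return start < nbits ? start : nbits;
}
```

A whole word in which the two bitmaps share no bits is skipped with a single AND and compare, which is where the speedup over the lookup join comes from.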
25 Feb, 2017
1 commit
-
This saves 32 bytes on my x86-64 build, mostly due to alignment
considerations and sharing more code between find_next_bit and
find_next_zero_bit, but it does save a couple of instructions.

There are really two parts to this commit:
- First, the first half of the test: (!nbits || start >= nbits) is
trivially a subset of the second half, since nbits and start are both
unsigned
- Second, while looking at the disassembly, I noticed that GCC was
predicting the branch taken. Since this is a failure case, it's
clearly the less likely of the two branches, so add an unlikely() to
override GCC's heuristics.

[mawilcox@microsoft.com: v2]
Link: http://lkml.kernel.org/r/1483709016-1834-1-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox
Acked-by: Yury Norov
Acked-by: Rasmus Villemoes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
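
Both parts of this commit can be illustrated with a short sketch. With unsigned operands, nbits == 0 implies start >= nbits, so the first half of the old test is redundant. The names out_of_range_old, out_of_range_new and sketch_unlikely are hypothetical; the kernel's unlikely() is assumed here to behave like the GCC/Clang __builtin_expect hint, with a plain fallback for other compilers.

```c
#if defined(__GNUC__) || defined(__clang__)
/* Hint that the condition is expected to be false. */
#define sketch_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define sketch_unlikely(x) (x)
#endif

/* Old form of the check: two comparisons. */
static int out_of_range_old(unsigned long nbits, unsigned long start)
{
    return !nbits || start >= nbits;
}

/*
 * New form: since both values are unsigned, nbits == 0 already makes
 * start >= nbits true, so a single comparison suffices.
 */
static int out_of_range_new(unsigned long nbits, unsigned long start)
{
    return sketch_unlikely(start >= nbits);
}
```

The hint does not change the result of the comparison; it only tells the compiler which branch to lay out as the fall-through path.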
17 Apr, 2015
1 commit
-
This file contains the implementation of all find_*_bit{,_le} functions,
so giving it a more generic name looks reasonable.

Signed-off-by: Yury Norov
Reviewed-by: Rasmus Villemoes
Reviewed-by: George Spelvin
Cc: Alexey Klimov
Cc: David S. Miller
Cc: Daniel Borkmann
Cc: Hannes Frederic Sowa
Cc: Lai Jiangshan
Cc: Mark Salter
Cc: AKASHI Takahiro
Cc: Thomas Graf
Cc: Valentin Rothberg
Cc: Chris Wilson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds