Eric Lee / smarc-fsl-linux-kernel

31 May, 2016

2 commits

bc9dc9d5e lib/uuid.c: use correct offset in uuid parser ... Browse Code »

Use '+ 0' and '+ 1' as offsets, like they were intended, instead of
adding to the result.

Fixes: 2b1b0d66704a ("lib/uuid.c: introduce a few more generic helpers")
Signed-off-by: Bjørn Mork
Signed-off-by: Andy Shevchenko
Signed-off-by: Linus Torvalds

Bjørn Mork
2016-05-31 06:26:57 +0800
cfaff0e51 lib/uuid: add a test module ... Browse Code »

It appears that somehow I missed a test of the latest UUID rework which
landed in the kernel. Present a small test module to avoid such cases
in the future.

Signed-off-by: Andy Shevchenko
Signed-off-by: Linus Torvalds

Andy Shevchenko
2016-05-31 06:26:57 +0800

29 May, 2016

2 commits

7e0fb73c5 Merge branch 'hash' of git://ftp.sciencehorizons.net/linux ... Browse Code »

Pull string hash improvements from George Spelvin:
"This series does several related things:

- Makes the dcache hash (fs/namei.c) useful for general kernel use.

(Thanks to Bruce for noticing the zero-length corner case)

- Converts the string hashes in to use the
above.

- Avoids 64-bit multiplies in hash_64() on 32-bit platforms. Two
32-bit multiplies will do well enough.

- Rids the world of the bad hash multipliers in hash_32.

This finishes the job started in commit 689de1d6ca95 ("Minimal
fix-up of bad hashing behavior of hash_64()")

The vast majority of Linux architectures have hardware support for
32x32-bit multiply and so derive no benefit from "simplified"
multipliers.

The few processors that do not (68000, h8/300 and some models of
Microblaze) have arch-specific implementations added. Those
patches are last in the series.

- Overhauls the dcache hash mixing.

The patch in commit 0fed3ac866ea ("namei: Improve hash mixing if
CONFIG_DCACHE_WORD_ACCESS") was an off-the-cuff suggestion.
Replaced with a much more careful design that's simultaneously
faster and better. (My own invention, as there was noting suitable
in the literature I could find. Comments welcome!)

- Modify the hash_name() loop to skip the initial HASH_MIX(). This
would let us salt the hash if we ever wanted to.

- Sort out partial_name_hash().

The hash function is declared as using a long state, even though
it's truncated to 32 bits at the end and the extra internal state
contributes nothing to the result. And some callers do odd things:

- fs/hfs/string.c only allocates 32 bits of state
- fs/hfsplus/unicode.c uses it to hash 16-bit unicode symbols not bytes

- Modify bytemask_from_count to handle inputs of 1..sizeof(long)
rather than 0..sizeof(long)-1. This would simplify users other
than full_name_hash"

Special thanks to Bruce Fields for testing and finding bugs in v1. (I
learned some humbling lessons about "obviously correct" code.)

On the arch-specific front, the m68k assembly has been tested in a
standalone test harness, I've been in contact with the Microblaze
maintainers who mostly don't care, as the hardware multiplier is never
omitted in real-world applications, and I haven't heard anything from
the H8/300 world"

* 'hash' of git://ftp.sciencehorizons.net/linux:
h8300: Add
microblaze: Add
m68k: Add
: Add support for architecture-specific functions
fs/namei.c: Improve dcache hash function
Eliminate bad hash multipliers from hash_32() and hash_64()
Change hash_64() return value to 32 bits
: Define hash_str() in terms of hashlen_string()
fs/namei.c: Add hashlen_string() function
Pull out string hash to

Linus Torvalds
2016-05-29 07:15:25 +0800
468a94285 <linux/hash.h>: Add support for architecture-specific functions ... Browse Code »

This is just the infrastructure; there are no users yet.

This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
the existence of .

That file may define its own versions of various functions, and define
HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.

Included is a self-test (in lib/test_hash.c) that verifies the basics.
It is NOT in general required that the arch-specific functions compute
the same thing as the generic, but if a HAVE_* symbol is defined with
the value 1, then equality is tested.

Signed-off-by: George Spelvin
Cc: Geert Uytterhoeven
Cc: Greg Ungerer
Cc: Andreas Schwab
Cc: Philippe De Muyter
Cc: linux-m68k@lists.linux-m68k.org
Cc: Alistair Francis
Cc: Michal Simek
Cc: Yoshinori Sato
Cc: uclinux-h8-devel@lists.sourceforge.jp

George Spelvin
2016-05-29 03:48:31 +0800

27 May, 2016

1 commit

3017cd63f dma-debug: avoid spinlock recursion when disabling dma-debug ... Browse Code »

With netconsole (at least) the pr_err("... disablingn") call can
recurse back into the dma-debug code, where it'll try to grab
free_entries_lock again. Avoid the problem by doing the printk after
dropping the lock.

Link: http://lkml.kernel.org/r/1463678421-18683-1-git-send-email-ville.syrjala@linux.intel.com
Signed-off-by: Ville Syrjälä
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ville Syrjälä
2016-05-27 06:35:44 +0800

26 May, 2016

2 commits

0985b65d3 Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs iov_iter regression fix from Al Viro:
"Fix for braino in 'fold checks into iterate_and_advance()'"

* 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
do "fold checks into iterate_and_advance()" right

Linus Torvalds
2016-05-26 06:59:09 +0800
19f184593 do "fold checks into iterate_and_advance()" right ... Browse Code »

the only case when we should skip the iterate_and_advance() guts
is when nothing's left in the iterator, _not_ just when requested
amount is 0. Said guts will do nothing in the latter case anyway;
the problem we tried to deal with in the aforementioned commit is
that when there's nothing left *and* the amount requested is 0,
we might end up deferencing one iovec too many; the value we fetch
from there is discarded in that case, but theoretically it might
oops if the iovec array ends exactly at the end of page with the
next page not mapped.

Bailing out on zero size requested had an unexpected side effect -
zero-length segment in the beginning of iovec array ended up
throwing do_loop_readv_writev() into infinite spin; we do not
advance past the empty segment at all. Reproducer is trivial:
echo '#include ' >a.c
echo 'main() {char c; struct iovec v[] = {{&c,0},{&c,1}}; readv(0,v,2);}' >>a.c
cc a.c && ./a.out

Al Viro
2016-05-26 06:16:02 +0800

24 May, 2016

1 commit

c5d2cac0f kgdb: depends on VT ... Browse Code »

With VT=n, the kernel build fails with:

drivers/built-in.o: In function `kgdboc_pre_exp_handler':
kgdboc.c:(.text+0x7b5aa): undefined reference to `fg_console'
kgdboc.c:(.text+0x7b5ce): undefined reference to `vc_cons'
kgdboc.c:(.text+0x7b5d5): undefined reference to `vc_cons'

kgdboc.o is built when KGDB_SERIAL_CONSOLE is set. So make
KGDB_SERIAL_CONSOLE depend on HW_CONSOLE which includes those symbols.

Link: http://lkml.kernel.org/r/1459412955-4696-1-git-send-email-jslaby@suse.cz
Signed-off-by: Jiri Slaby
Reported-by: "Jim Davis"
Acked-by: Jason Wessel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jiri Slaby
2016-05-24 08:04:14 +0800

21 May, 2016

32 commits

5469dc270 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge more updates from Andrew Morton:

- the rest of MM

- KASAN updates

- procfs updates

- exit, fork updates

- printk updates

- lib/ updates

- radix-tree testsuite updates

- checkpatch updates

- kprobes updates

- a few other misc bits

* emailed patches from Andrew Morton : (162 commits)
samples/kprobes: print out the symbol name for the hooks
samples/kprobes: add a new module parameter
kprobes: add the "tls" argument for j_do_fork
init/main.c: simplify initcall_blacklisted()
fs/efs/super.c: fix return value
checkpatch: improve --git shortcut
checkpatch: reduce number of `git log` calls with --git
checkpatch: add support to check already applied git commits
checkpatch: add --list-types to show message types to show or ignore
checkpatch: advertise the --fix and --fix-inplace options more
checkpatch: whine about ACCESS_ONCE
checkpatch: add test for keywords not starting on tabstops
checkpatch: improve CONSTANT_COMPARISON test for structure members
checkpatch: add PREFER_IS_ENABLED test
lib/GCD.c: use binary GCD algorithm instead of Euclidean
radix-tree: free up the bottom bit of exceptional entries for reuse
dax: move RADIX_DAX_ definitions to dax.c
radix-tree: make radix_tree_descend() more useful
radix-tree: introduce radix_tree_replace_clear_tags()
radix-tree: tidy up __radix_tree_create()
...

Linus Torvalds
2016-05-21 13:31:33 +0800
3aa2fc166 Merge tag 'driver-core-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core ... Browse Code »

Pull driver core updates from Greg KH:
"Here's the "big" driver core update for 4.7-rc1.

Mostly just debugfs changes, the long-known and messy races with
removing debugfs files should be fixed thanks to the great work of
Nicolai Stange. We also have some isa updates in here (the x86
maintainers told me to take it through this tree), a new warning when
we run out of dynamic char major numbers, and a few other assorted
changes, details in the shortlog.

All have been in linux-next for some time with no reported issues"

* tag 'driver-core-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (32 commits)
Revert "base: dd: don't remove driver_data in -EPROBE_DEFER case"
gpio: ws16c48: Utilize the ISA bus driver
gpio: 104-idio-16: Utilize the ISA bus driver
gpio: 104-idi-48: Utilize the ISA bus driver
gpio: 104-dio-48e: Utilize the ISA bus driver
watchdog: ebc-c384_wdt: Utilize the ISA bus driver
iio: stx104: Utilize the module_isa_driver and max_num_isa_dev macros
iio: stx104: Add X86 dependency to STX104 Kconfig option
Documentation: Add ISA bus driver documentation
isa: Implement the max_num_isa_dev macro
isa: Implement the module_isa_driver macro
pnp: pnpbios: Add explicit X86_32 dependency to PNPBIOS
isa: Decouple X86_32 dependency from the ISA Kconfig option
driver-core: use 'dev' argument in dev_dbg_ratelimited stub
base: dd: don't remove driver_data in -EPROBE_DEFER case
kernfs: Move faulting copy_user operations outside of the mutex
devcoredump: add scatterlist support
debugfs: unproxify files created through debugfs_create_u32_array()
debugfs: unproxify files created through debugfs_create_blob()
debugfs: unproxify files created through debugfs_create_bool()
...

Linus Torvalds
2016-05-21 12:26:15 +0800
fff7fb0b2 lib/GCD.c: use binary GCD algorithm instead of Euclidean ... Browse Code »

The binary GCD algorithm is based on the following facts:
1. If a and b are all evens, then gcd(a,b) = 2 * gcd(a/2, b/2)
2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
3. If a and b are all odds, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)

Even on x86 machines with reasonable division hardware, the binary
algorithm runs about 25% faster (80% the execution time) than the
division-based Euclidian algorithm.

On platforms like Alpha and ARMv6 where division is a function call to
emulation code, it's even more significant.

There are two variants of the code here, depending on whether a fast
__ffs (find least significant set bit) instruction is available. This
allows the unpredictable branches in the bit-at-a-time shifting loop to
be eliminated.

If fast __ffs is not available, the "even/odd" GCD variant is used.

I use the following code to benchmark:

#include
#include
#include
#include
#include
#include

#define swap(a, b) \
do { \
a ^= b; \
b ^= a; \
a ^= b; \
} while (0)

unsigned long gcd0(unsigned long a, unsigned long b)
{
unsigned long r;

if (a < b) {
swap(a, b);
}

if (b == 0)
return a;

while ((r = a % b) != 0) {
a = b;
b = r;
}

return b;
}

unsigned long gcd1(unsigned long a, unsigned long b)
{
unsigned long r = a | b;

if (!a || !b)
return r;

b >>= __builtin_ctzl(b);

for (;;) {
a >>= __builtin_ctzl(a);
if (a == b)
return a << __builtin_ctzl(r);

if (a < b)
swap(a, b);
a -= b;
}
}

unsigned long gcd2(unsigned long a, unsigned long b)
{
unsigned long r = a | b;

if (!a || !b)
return r;

r &= -r;

while (!(b & r))
b >>= 1;

for (;;) {
while (!(a & r))
a >>= 1;
if (a == b)
return a;

if (a < b)
swap(a, b);
a -= b;
a >>= 1;
if (a & r)
a += b;
a >>= 1;
}
}

unsigned long gcd3(unsigned long a, unsigned long b)
{
unsigned long r = a | b;

if (!a || !b)
return r;

b >>= __builtin_ctzl(b);
if (b == 1)
return r & -r;

for (;;) {
a >>= __builtin_ctzl(a);
if (a == 1)
return r & -r;
if (a == b)
return a << __builtin_ctzl(r);

if (a < b)
swap(a, b);
a -= b;
}
}

unsigned long gcd4(unsigned long a, unsigned long b)
{
unsigned long r = a | b;

if (!a || !b)
return r;

r &= -r;

while (!(b & r))
b >>= 1;
if (b == r)
return r;

for (;;) {
while (!(a & r))
a >>= 1;
if (a == r)
return r;
if (a == b)
return a;

if (a < b)
swap(a, b);
a -= b;
a >>= 1;
if (a & r)
a += b;
a >>= 1;
}
}

static unsigned long (*gcd_func[])(unsigned long a, unsigned long b) = {
gcd0, gcd1, gcd2, gcd3, gcd4,
};

#define TEST_ENTRIES (sizeof(gcd_func) / sizeof(gcd_func[0]))

#if defined(__x86_64__)

#define rdtscll(val) do { \
unsigned long __a,__d; \
__asm__ __volatile__("rdtsc" : "=a" (__a), "=d" (__d)); \
(val) = ((unsigned long long)__a) | (((unsigned long long)__d)<= start)
ret = end - start;
else
ret = ~0ULL - start + 1 + end;

*res = gcd_res;
return ret;
}

#else

static inline struct timespec read_time(void)
{
struct timespec time;
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &time);
return time;
}

static inline unsigned long long diff_time(struct timespec start, struct timespec end)
{
struct timespec temp;

if ((end.tv_nsec - start.tv_nsec) < 0) {
temp.tv_sec = end.tv_sec - start.tv_sec - 1;
temp.tv_nsec = 1000000000ULL + end.tv_nsec - start.tv_nsec;
} else {
temp.tv_sec = end.tv_sec - start.tv_sec;
temp.tv_nsec = end.tv_nsec - start.tv_nsec;
}

return temp.tv_sec * 1000000000ULL + temp.tv_nsec;
}

static unsigned long long benchmark_gcd_func(unsigned long (*gcd)(unsigned long, unsigned long),
unsigned long a, unsigned long b, unsigned long *res)
{
struct timespec start, end;
unsigned long gcd_res;

start = read_time();
gcd_res = gcd(a, b);
end = read_time();

*res = gcd_res;
return diff_time(start, end);
}

#endif

static inline unsigned long get_rand()
{
if (sizeof(long) == 8)
return (unsigned long)rand() << 32 | rand();
else
return rand();
}

int main(int argc, char **argv)
{
unsigned int seed = time(0);
int loops = 100;
int repeats = 1000;
unsigned long (*res)[TEST_ENTRIES];
unsigned long long elapsed[TEST_ENTRIES];
int i, j, k;

for (;;) {
int opt = getopt(argc, argv, "n:r:s:");
/* End condition always first */
if (opt == -1)
break;

switch (opt) {
case 'n':
loops = atoi(optarg);
break;
case 'r':
repeats = atoi(optarg);
break;
case 's':
seed = strtoul(optarg, NULL, 10);
break;
default:
/* You won't actually get here. */
break;
}
}

res = malloc(sizeof(unsigned long) * TEST_ENTRIES * loops);
memset(elapsed, 0, sizeof(elapsed));

srand(seed);
for (j = 0; j < loops; j++) {
unsigned long a = get_rand();
/* Do we have args? */
unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
unsigned long long min_elapsed[TEST_ENTRIES];
for (k = 0; k < repeats; k++) {
for (i = 0; i < TEST_ENTRIES; i++) {
unsigned long long tmp = benchmark_gcd_func(gcd_func[i], a, b, &res[j][i]);
if (k == 0 || min_elapsed[i] > tmp)
min_elapsed[i] = tmp;
}
}
for (i = 0; i < TEST_ENTRIES; i++)
elapsed[i] += min_elapsed[i];
}

for (i = 0; i < TEST_ENTRIES; i++)
printf("gcd%d: elapsed %llu\n", i, elapsed[i]);

k = 0;
srand(seed);
for (j = 0; j < loops; j++) {
unsigned long a = get_rand();
unsigned long b = argc > optind ? strtoul(argv[optind], NULL, 10) : get_rand();
for (i = 1; i < TEST_ENTRIES; i++) {
if (res[j][i] != res[j][0])
break;
}
if (i < TEST_ENTRIES) {
if (k == 0) {
k = 1;
fprintf(stderr, "Error:\n");
}
fprintf(stderr, "gcd(%lu, %lu): ", a, b);
for (i = 0; i < TEST_ENTRIES; i++)
fprintf(stderr, "%ld%s", res[j][i], i < TEST_ENTRIES - 1 ? ", " : "\n");
}
}

if (k == 0)
fprintf(stderr, "PASS\n");

free(res);

return 0;
}

Compiled with "-O2", on "VirtualBox 4.4.0-22-generic #38-Ubuntu x86_64" got:

zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 10174
gcd1: elapsed 2120
gcd2: elapsed 2902
gcd3: elapsed 2039
gcd4: elapsed 2812
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9309
gcd1: elapsed 2280
gcd2: elapsed 2822
gcd3: elapsed 2217
gcd4: elapsed 2710
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9589
gcd1: elapsed 2098
gcd2: elapsed 2815
gcd3: elapsed 2030
gcd4: elapsed 2718
PASS
zhaoxiuzeng@zhaoxiuzeng-VirtualBox:~/develop$ ./gcd -r 500000 -n 10
gcd0: elapsed 9914
gcd1: elapsed 2309
gcd2: elapsed 2779
gcd3: elapsed 2228
gcd4: elapsed 2709
PASS

[akpm@linux-foundation.org: avoid #defining a CONFIG_ variable]
Signed-off-by: Zhaoxiu Zeng
Signed-off-by: George Spelvin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Zhaoxiu Zeng
2016-05-21 08:58:30 +0800
9e85d8111 radix-tree: make radix_tree_descend() more useful ... Browse Code »

Now that the shift amount is stored in the node, radix_tree_descend()
can calculate offset itself from index, which removes several lines of
code from each of the tree walkers.

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
d604c3245 radix-tree: introduce radix_tree_replace_clear_tags() ... Browse Code »

In addition to replacing the entry, we also clear all associated tags.
This is really a one-off special for page_cache_tree_delete() which had
far too much detailed knowledge about how the radix tree works.

For efficiency, factor node_tag_clear() out of radix_tree_tag_clear() It
can be used by radix_tree_delete_item() as well as
radix_tree_replace_clear_tags().

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
89148aa40 radix-tree: tidy up __radix_tree_create() ... Browse Code »

1. Rename the existing variable 'slot' to 'child'.
2. Introduce a new variable called 'slot' which is the address of the
slot we're dealing with. This lets us simplify the tree insertion,
and removes the recalculation of 'slot' at the end of the function.
3. Using 'slot' in the sibling pointer insertion part makes the code
more readable.

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
a8e4da25d radix-tree: tidy up range_tag_if_tagged ... Browse Code »

Convert radix_tree_range_tag_if_tagged to name the nodes parent, node
and child instead of node & slot.

Use parent->offset instead of playing games with 'upindex'.

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
8c1244de0 radix-tree: tidy up next_chunk ... Browse Code »

Convert radix_tree_next_chunk to use 'child' instead of 'slot' as the
name of the child node. Also use node_maxindex() where it makes sense.

The 'rnode' variable was unnecessary; it doesn't overlap in usage with
'node', so we can just use 'node' the whole way through the function.

Improve the testcase to start the walk from every index in the carefully
constructed tree, and to accept any index within the range covered by
the entry.

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
af49a63e1 radix-tree: change naming conventions in radix_tree_shrink ... Browse Code »

Use the more standard 'node' and 'child' instead of 'to_free' and
'slot'.

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
b194d16c2 radix-tree: rename radix_tree_is_indirect_ptr() ... Browse Code »

As with indirect_to_ptr(), ptr_to_indirect() and
RADIX_TREE_INDIRECT_PTR, change radix_tree_is_indirect_ptr() to
radix_tree_is_internal_node().

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
4dd6c0987 radix-tree: rename indirect_to_ptr() to entry_to_node() ... Browse Code »

Mirrors the earlier commit introducing node_to_entry().

Also change the type returned to be a struct radix_tree_node pointer.
That lets us simplify a couple of places in the radix tree shrink &
extend paths where we could convert an entry into a pointer, modify the
node, then convert the pointer back into an entry.

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
a4db4dcea radix-tree: rename ptr_to_indirect() to node_to_entry() ... Browse Code »

ptr_to_indirect() was a bad name. What it really means is "Convert this
pointer to a node into an entry suitable for storing in the radix tree".
So node_to_entry() seemed like a better name.

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
30ff46ccb radix-tree: rename INDIRECT_PTR to INTERNAL_NODE ... Browse Code »

The name RADIX_TREE_INDIRECT_PTR doesn't really match the meaning.
RADIX_TREE_INTERNAL_NODE is a better name.

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
d0891265b radix-tree: remove root->height ... Browse Code »

The only remaining references to root->height were in extend and shrink,
where it was updated. Now we can remove it entirely.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
fb209019c radix-tree: remove a use of root->height from delete_node ... Browse Code »

If radix_tree_shrink returns whether it managed to shrink, then
__radix_tree_delete_node doesn't ned to query the tree to find out
whether it did any work or not.

Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
c12e51b07 radix-tree: replace node->height with node->shift ... Browse Code »

node->shift represents the shift necessary for looking in the slots
array at this level. It is equal to the old (node->height - 1) *
RADIX_TREE_MAP_SHIFT.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
0c7fa0a84 radix-tree: split node->path into offset and height ... Browse Code »

Neither piece of information we're storing in node->path can be larger
than 64, so store each in its own unsigned char instead of shifting and
masking to store them both in an unsigned int.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
2fcd9005c radix-tree: miscellaneous fixes ... Browse Code »

Typos, whitespace, grammar, line length, using the correct types, etc.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
6b053b8e5 radix-tree: add copyright statements ... Browse Code »

The multiorder support is a sufficiently large feature to be worth
adding copyrigt lines for.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
0796c5832 radix-tree: fix radix_tree_dump() for multi-order entries ... Browse Code »

- Print which indices are covered by every leaf entry
- Print sibling entries
- Print the node pointer instead of the slot entry
- Build by default in userspace, and make it accessible to the test-suite

Signed-off-by: Ross Zwisler
Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ross Zwisler
2016-05-21 08:58:30 +0800
070c5ac27 radix-tree: fix radix_tree_range_tag_if_tagged() for multiorder entries ... Browse Code »

I had previously decided that tagging a single multiorder entry would
count as tagging 2^order entries for the purposes of 'nr_to_tag'. I now
believe that decision to be a mistake, and it should count as a single
entry. That's more likely to be what callers expect.

When walking back up the tree from a newly-tagged entry, the current
code assumed we were starting from the lowest level of the tree; if we
have a multiorder entry with an order at least RADIX_TREE_MAP_SHIFT in
size then we need to shift the index by 'shift' before we start walking
back up the tree, or we will end up not setting tags on higher entries,
and then mistakenly thinking that entries below a certain point in the
tree are not tagged.

If the first index we examine is a sibling entry of a tagged multiorder
entry, we were not tagging it. We need to examine the canonical entry,
and the easiest way to do that is to use radix_tree_descend(). We then
have to skip over sibling slots when looking for the next entry in the
tree or we will end up walking back to the canonical entry.

Add several tests for radix_tree_range_tag_if_tagged().

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
0a2efc6c8 radix-tree: rewrite radix_tree_locate_item ... Browse Code »

Use the new multi-order support functions to rewrite
radix_tree_locate_item(). Modify the locate tests to test multiorder
entries too.

[hughd@google.com: radix_tree_locate_item() is often returning the wrong index]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1605012108490.1166@eggly.anvils
Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
8a14f4d83 radix-tree: fix radix_tree_create for sibling entries ... Browse Code »

If the radix tree user attempted to insert a colliding entry with an
existing multiorder entry, then radix_tree_create() could encounter a
sibling entry when walking down the tree to look for a slot. Use
radix_tree_descend() to fix the problem, and add a test-case to make
sure the problem doesn't come back in future.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
4589ba6d0 radix-tree: rewrite radix_tree_tag_get ... Browse Code »

Use the new multi-order support functions to rewrite
radix_tree_tag_get()

Signed-off-by: Ross Zwisler
Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ross Zwisler
2016-05-21 08:58:30 +0800
00f47b581 radix-tree: rewrite radix_tree_tag_clear ... Browse Code »

Use the new multi-order support functions to rewrite
radix_tree_tag_clear()

Signed-off-by: Ross Zwisler
Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ross Zwisler
2016-05-21 08:58:30 +0800
fb969909d radix-tree: rewrite radix_tree_tag_set ... Browse Code »

Use the new multi-order support functions to rewrite
radix_tree_tag_set()

Signed-off-by: Ross Zwisler
Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ross Zwisler
2016-05-21 08:58:30 +0800
21ef53393 radix-tree: add support for multi-order iterating ... Browse Code »

This enables the macros radix_tree_for_each_slot() and friends to be
used with multi-order entries.

The way that this works is that we treat all entries in a given slots[]
array as a single chunk. If the index given to radix_tree_next_chunk()
happens to point us to a sibling entry, we will back up iter->index so
that it points to the canonical entry, and that will be the place where
we start our iteration.

As we're processing a chunk in radix_tree_next_slot(), we process
canonical entries, skip over sibling entries, and restart the chunk
lookup if we find a non-sibling indirect pointer. This drops back to
the radix_tree_next_chunk() code, which will re-walk the tree and look
for another chunk.

This allows us to properly handle multi-order entries mixed with other
entries that are at various heights in the radix tree.

Signed-off-by: Ross Zwisler
Signed-off-by: Matthew Wilcox
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ross Zwisler
2016-05-21 08:58:30 +0800
7b60e9ad5 radix-tree: fix multiorder BUG_ON in radix_tree_insert ... Browse Code »

These BUG_ON tests are to ensure that all the tags are clear when
inserting a new entry. If we insert a multiorder entry, we'll end up
looking at the tags for a different node, and so the BUG_ON can end up
triggering spuriously.

Also, we now have three tags, not two, so check all three are clear, and
check all the root tags with a single call to BUG_ON since the bits are
stored contiguously.

Include a test-case to ensure this problem does not reoccur.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
858299544 radix-tree: rewrite __radix_tree_lookup ... Browse Code »

Use the new multi-order support functions to rewrite __radix_tree_lookup()

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
afe0e395b radix-tree: fix several shrinking bugs with multiorder entries ... Browse Code »

Setting the indirect bit on the user data entry used to be unambiguous
because the tree walking code knew not to expect internal nodes in the
last level of the tree. Multiorder entries can appear at any level of
the tree, and a leaf with the indirect bit set is indistinguishable from
a pointer to a node.

Introduce a special entry (RADIX_TREE_RETRY) which is neither a valid
user entry, nor a valid pointer to a node. The radix_tree_deref_retry()
function continues to work the same way, but tree walking code can
distinguish it from a pointer to a node.

Also fix the condition for setting slot->parent to NULL; it does not
matter what height the tree is, it only matters whether slot is an
indirect pointer. Move this code above the comment which is referring
to the assignment to root->rnode.

Also fix the condition for preventing the tree from shrinking to a
single entry if it's a multiorder entry.

Add a test-case to the test suite that checks that the tree goes back
down to its original height after an item is inserted & deleted from a
higher index in the tree.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
49ea6ebcd radix-tree: fix extending the tree for multi-order entries at offset 0 ... Browse Code »

The current code will insert entries at each level, as if we're going to
add a new entry at the bottom level, so we then get an -EEXIST when we
try to insert the entry into the tree. The best way to fix this is to
not check 'order' when inserting into an empty tree.

We still need to 'extend' the tree to the height necessary for the maximum
index corresponding to this entry, so pass that value to
radix_tree_extend() rather than the index we're asked to create, or we
won't create a tree that's deep enough.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800
1456a439f radix-tree: introduce radix_tree_load_root() ... Browse Code »

All the tree walking functions start with some variant of this code;
centralise it in one place so we're not chasing subtly different bugs
everywhere.

Signed-off-by: Matthew Wilcox
Reviewed-by: Ross Zwisler
Cc: Konstantin Khlebnikov
Cc: Kirill Shutemov
Cc: Jan Kara
Cc: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-05-21 08:58:30 +0800