09 Jul, 2015

1 commit

  • If rhashtable_walk_next detects a resize operation in progress, it jumps
    to the new table and continues walking that one. But it fails to drop
    the reference to its current item, so it keeps traversing the bucket of
    the new table into which the current item was sorted and, after reaching
    that bucket's end, continues with the new table's second bucket instead
    of the first one, thereby potentially missing items.

    This fixes the rhashtable runtime test for me. The bug was probably
    introduced by Herbert Xu's patch eddee5ba ("rhashtable: Fix walker
    behaviour during rehash"), although this was not explicitly tested.
    (A caller-side sketch of the walk API follows this entry.)

    Fixes: eddee5ba ("rhashtable: Fix walker behaviour during rehash")
    Signed-off-by: Phil Sutter
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Phil Sutter
     
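    A caller-side sketch of the walk API discussed above, assuming a
    hypothetical struct test_obj element type; any walker has to tolerate
    -EAGAIN and possible duplicates whenever a resize is detected:

        #include <linux/err.h>
        #include <linux/rhashtable.h>

        struct test_obj {                       /* hypothetical element */
                int id;
                struct rhash_head node;
        };

        static void walk_table(struct rhashtable *ht)
        {
                struct rhashtable_iter iter;
                struct test_obj *obj;

                if (rhashtable_walk_init(ht, &iter))
                        return;

                if (rhashtable_walk_start(&iter) == -EAGAIN)
                        pr_info("resize since last walk: duplicates possible\n");

                while ((obj = rhashtable_walk_next(&iter)) != NULL) {
                        if (IS_ERR(obj)) {
                                /* ERR_PTR(-EAGAIN): a resize was detected */
                                if (PTR_ERR(obj) == -EAGAIN)
                                        continue;
                                break;
                        }
                        pr_info("saw object %d\n", obj->id);
                }

                rhashtable_walk_stop(&iter);
                rhashtable_walk_exit(&iter);
        }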

23 May, 2015

1 commit

  • Conflicts:
    drivers/net/ethernet/cadence/macb.c
    drivers/net/phy/phy.c
    include/linux/skbuff.h
    net/ipv4/tcp.c
    net/switchdev/switchdev.c

    Switchdev was a case of RTNH_F_{EXTERNAL --> OFFLOAD}
    renaming overlapping with net-next changes of various
    sorts.

    phy.c was a case of two changes, one adding a local
    variable to a function whilst the second removed one.

    tcp.c overlapped a deadlock fix with the addition of new tcp_info
    statistic values.

    macb.c involved the addition of two Zynq device entries.

    skbuff.h involved adding back ipv4_daddr to nf_bridge_info
    whilst net-next changes put two other existing members of
    that struct into a union.

    Signed-off-by: David S. Miller

    David S. Miller
     

17 May, 2015

1 commit

  • We currently have no limit on the number of elements in a hash table.
    This is a problem because some users (tipc) set a ceiling on the
    maximum table size, and when that is reached the hash table may
    degenerate. Others may encounter OOM when growing, and if we allow
    insertions when that happens the hash table performance may also
    suffer.

    This patch adds a new parameter, insecure_max_entries, which becomes
    the cap on the table. If unset, it defaults to max_size * 2; if that
    too is zero, there is no cap on the number of elements in the table.
    However, the table will grow whenever the utilisation hits 100%, and
    if that growth fails, you will get ENOMEM on insertion. (A params
    sketch follows this entry.)

    As allowing oversubscription is potentially dangerous, the name
    contains the word insecure.

    Note that the cap is not a hard limit. This is done for performance
    reasons as enforcing a hard limit will result in use of atomic ops
    that are heavier than the ones we currently use.

    The reasoning is that we're only guarding against a gross over-
    subscription of the table, rather than a small breach of the limit.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
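    A hedged sketch of the new parameter, using a hypothetical element
    type (my_obj); insecure_max_entries is the field this patch
    introduces:

        #include <linux/rhashtable.h>

        struct my_obj {                         /* hypothetical element */
                u32 key;
                struct rhash_head node;
        };

        static const struct rhashtable_params my_params = {
                .head_offset          = offsetof(struct my_obj, node),
                .key_offset           = offsetof(struct my_obj, key),
                .key_len              = sizeof(u32),
                .max_size             = 1 << 16,
                /* soft cap on elements; when left unset it defaults to
                 * max_size * 2, and insertions beyond it fail */
                .insecure_max_entries = 1 << 17,
        };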

23 Apr, 2015

2 commits

  • The current code only stops inserting rehashes into the
    chain when no resizes are currently scheduled. As long as resizes
    are scheduled, and while inserting above the utilization watermark,
    more and more rehashes will be scheduled.

    This led to a perfect DoS storm with thousands of rehashes
    scheduled, which led to thousands of spinlocks being taken
    sequentially.

    Instead, only allow either a series of resizes or a single rehash.
    Drop any further rehashes and return -EBUSY (a caller-side sketch
    follows this entry).

    Fixes: ccd57b1bd324 ("rhashtable: Add immediate rehash during insertion")
    Signed-off-by: Thomas Graf
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Thomas Graf
     
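    A caller-side sketch of the new behaviour, reusing the hypothetical
    my_obj/my_params setup from the earlier sketch (ht and obj assumed in
    scope):

        int err = rhashtable_insert_fast(&ht, &obj->node, my_params);

        if (err == -EBUSY) {
                /* a rehash is already scheduled: treat as transient
                 * pressure and drop (or queue for retry) */
                kfree(obj);
        }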
  • When rhashtable_insert_rehash() fails with ENOMEM, this indicates that
    we can't allocate the necessary memory in the current context, but the
    limits as set by the user would still allow the table to grow.

    Thus, attempt an async resize in the background, where we can allocate
    using GFP_KERNEL, which is more likely to succeed. The insertion itself
    will still fail, in order to indicate pressure (a sketch of this
    fallback follows the entry).

    This fixes a bug where the table would never continue growing once the
    utilization is above 100%.

    Fixes: ccd57b1bd324 ("rhashtable: Add immediate rehash during insertion")
    Signed-off-by: Thomas Graf
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Thomas Graf
     
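    A simplified sketch of the fallback pattern described above, not the
    verbatim kernel code (bucket_table_alloc and run_work are the
    implementation's internals; new_tbl, size and ht assumed in scope):

        new_tbl = bucket_table_alloc(ht, size, GFP_ATOMIC);
        if (new_tbl == NULL) {
                /* punt the grow to process context, where GFP_KERNEL
                 * allocations are more likely to succeed */
                schedule_work(&ht->run_work);
                return -ENOMEM;         /* insertion still signals pressure */
        }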

26 Mar, 2015

1 commit

  • nftables sets will be converted to use so-called set extensions,
    moving the key to a non-fixed position. To hash it, obj_hashfn must
    be used; however, so far it doesn't receive the length parameter.

    Pass the key length to obj_hashfn() and convert existing users (a
    sketch of the new signature follows this entry).

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
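    A sketch of an obj_hashfn under the new signature, assuming a
    hypothetical element type whose key position is resolved at runtime
    (my_elem_key() is a made-up accessor):

        static u32 my_obj_hash(const void *data, u32 len, u32 seed)
        {
                const struct my_elem *elem = data;

                /* key located per object rather than at a fixed offset */
                return jhash(my_elem_key(elem), len, seed);
        }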

24 Mar, 2015

7 commits

  • The commit 963ecbd41a1026d99ec7537c050867428c397b89 ("rhashtable:
    Fix use-after-free in rhashtable_walk_stop") fixed a real bug
    but created another one because we may end up sleeping inside an
    RCU critical section.

    This patch fixes it properly by replacing the mutex with a spinlock
    that specifically protects the walker lists (see the sketch below).

    Reported-by: Sasha Levin
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
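    A sketch of the resulting pattern; unlike a mutex, a spinlock may be
    taken inside an RCU read-side critical section (ht->lock and
    tbl->walkers follow the rhashtable implementation):

        spin_lock(&ht->lock);
        list_add(&walker->list, &tbl->walkers);
        spin_unlock(&ht->lock);
        /* a mutex here could sleep, which is illegal under rcu_read_lock() */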
  • This patch reintroduces immediate rehash during insertion. If
    we find during insertion that the table is full or the chain
    length exceeds a set limit (currently 16, but it may be disabled
    with insecure_elasticity; see the params sketch after this
    entry), then we will force an immediate rehash. The rehash will
    contain an expansion if the table utilisation exceeds 75%.

    If this rehash fails then the insertion will fail. Otherwise the
    insertion will be reattempted in the new hash table.

    Signed-off-by: Herbert Xu
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Herbert Xu
     
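    A hedged params sketch for the knob mentioned above, reusing the
    hypothetical my_obj element type:

        static const struct rhashtable_params unlimited_chain_params = {
                .head_offset         = offsetof(struct my_obj, node),
                .key_offset          = offsetof(struct my_obj, key),
                .key_len             = sizeof(u32),
                /* opt out of the chain-length limit (default 16) */
                .insecure_elasticity = true,
        };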
  • This patch adds the ability to allocate bucket table with GFP_ATOMIC
    instead of GFP_KERNEL. This is needed when we perform an immediate
    rehash during insertion.

    Signed-off-by: Herbert Xu
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch adds the missing bits to allow multiple rehashes. The
    read-side as well as remove already handle this correctly. So it's
    only the rehasher and insertion that need modification to handle
    this.

    Note that this patch doesn't actually enable it so for now rehashing
    is still only performed by the worker thread.

    This patch also disables the explicit expand/shrink interface because
    the table is meant to expand and shrink automatically, and continuing
    to export these interfaces unnecessarily complicates the life of the
    rehasher since the rehash process is now composed of two parts.

    Signed-off-by: Herbert Xu
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch changes rhashtable_shrink to shrink to the smallest
    size possible rather than halving the table. This is needed
    because with multiple rehashing we will defer shrinking until
    all other rehashing is done, meaning that when we do shrink
    we may be able to shrink a lot.

    Signed-off-by: Herbert Xu
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Since every current rhashtable user uses jhash as their hash
    function, the fact that jhash is an inline function causes each
    user to generate a copy of its code.

    This patch solves the problem by allowing hashfn to be left unset,
    in which case rhashtable will automatically set it to jhash.
    Furthermore, if the key length is a multiple of 4, it will switch
    over to jhash2 (see the params sketch after this entry).

    Signed-off-by: Herbert Xu
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Herbert Xu
     
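    A sketch with hashfn left unset, reusing the hypothetical my_obj
    type: rhashtable then picks jhash automatically, or jhash2 when
    key_len is a multiple of 4:

        static const struct rhashtable_params auto_hash_params = {
                .head_offset = offsetof(struct my_obj, node),
                .key_offset  = offsetof(struct my_obj, key),
                .key_len     = sizeof(u32),     /* multiple of 4: jhash2 */
                /* .hashfn deliberately left unset */
        };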
  • The walker is a lockless reader so it too needs an smp_rmb before
    reading the future_tbl field in order to see any new tables that
    may contain elements that we should have walked over.

    Signed-off-by: Herbert Xu
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Herbert Xu
     

21 Mar, 2015

3 commits

  • Now that all rhashtable users have been converted over to the
    inline interface, this patch removes the unused out-of-line
    interface.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch deals with the complaint that we make indirect function
    calls on the fast paths unnecessarily in rhashtable. We resolve
    it by moving the fast paths into inline functions that take struct
    rhashtable_param (which obviously must be the same set of parameters
    supplied to rhashtable_init) as an argument.

    The only remaining indirect call is to obj_hashfn (or key_hashfn if
    obj_hashfn is unset) on the rehash as well as on the insert-during-
    rehash slow path (a caller-side sketch of the inlined fast path
    follows this entry).

    This patch also extends the support of variable-length keys to
    include those where the key is fixed in length but scattered in the
    object. For example, in netlink we want to key off the namespace and
    the portid, but they're not next to each other.

    This patch does this by directly using the object hash function
    as the indicator of whether the key is accessible or not. It
    also adds a new function obj_cmpfn to compare a key against an
    object. This means that the caller no longer needs to supply
    explicit compare functions.

    All this is done in a backwards compatible manner so no existing
    users are affected until they convert to the new interface.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
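    A caller-side sketch of the inlined fast path, reusing the earlier
    hypothetical my_obj/my_params examples (ht and new_obj assumed in
    scope): the same constant params handed to rhashtable_init is passed
    to each call so the compiler can fold it:

        u32 key = 42;
        struct my_obj *found;
        int err = 0;

        found = rhashtable_lookup_fast(&ht, &key, my_params);
        if (!found)
                err = rhashtable_insert_fast(&ht, &new_obj->node,
                                             my_params);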
  • This patch marks the rhashtable_init params argument const as
    there is no reason to modify it since we will always make a copy
    of it in the rhashtable.

    This patch also fixes a bug where we don't actually round up the
    value of min_size unless it is less than HASH_MIN_SIZE.

    Signed-off-by: Herbert Xu
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Herbert Xu
     

20 Mar, 2015

1 commit

  • Round min_size up and max_size down, respectively, to the next power
    of two, to make sure we always respect the limits specified by the
    user. This is required because we compare the table size against the
    limit before we expand or shrink.

    This patch also fixes a minor bug where we modified min_size in the
    params provided instead of the copy stored in struct rhashtable (a
    sketch of the rounding follows this entry).

    Signed-off-by: Thomas Graf
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Thomas Graf
     
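    A sketch of the rounding described above; roundup_pow_of_two() and
    rounddown_pow_of_two() are the stock linux/log2.h helpers, while the
    exact placement inside rhashtable_init is paraphrased:

        if (params->min_size)
                ht->p.min_size = roundup_pow_of_two(params->min_size);
        if (params->max_size)
                ht->p.max_size = rounddown_pow_of_two(params->max_size);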

16 Mar, 2015

2 commits

  • The commit 9d901bc05153bbf33b5da2cd6266865e531f0545 ("rhashtable:
    Free bucket tables asynchronously after rehash") causes gratuitous
    failures in rhashtable_remove.

    The reason is that it inadvertently introduced multiple rehashing
    from the perspective of readers. IOW it is now possible to see
    more than two tables during a single RCU critical section.

    Fortunately, the other reader, rhashtable_lookup, already deals with
    this correctly thanks to c4db8848af6af92f90462258603be844baeab44d
    ("rhashtable: Move future_tbl into struct bucket_table"), so only
    rhashtable_remove is broken by this change.

    This patch fixes this by looping over every table from the first
    one to the last, or until we find the element that we were trying
    to delete (a loop sketch follows this entry).

    Incidentally the simple test for detecting rehashing to prevent
    starting another shrinking no longer works. Since it isn't needed
    anyway (the work queue and the mutex serves as a natural barrier
    to unnecessary rehashes) I've simply killed the test.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
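    A sketch of the loop shape described above, with __rhashtable_remove()
    standing in as a hypothetical per-table helper (ht and obj assumed in
    scope):

        struct bucket_table *tbl;
        int err = -ENOENT;

        tbl = rht_dereference_rcu(ht->tbl, ht);
        do {
                err = __rhashtable_remove(ht, tbl, obj);
        } while (err == -ENOENT &&
                 (tbl = rht_dereference_rcu(tbl->future_tbl, ht)) != NULL);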
  • The commit c4db8848af6af92f90462258603be844baeab44d ("rhashtable:
    Move future_tbl into struct bucket_table") introduced a use-after-
    free bug in rhashtable_walk_stop because it dereferences tbl after
    dropping the RCU read lock.

    This patch fixes it by moving the RCU read unlock down to the bottom
    of rhashtable_walk_stop. In fact this was how I had it originally
    but it got dropped while rearranging patches because this one
    depended on the async freeing of bucket_table.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

15 Mar, 2015

6 commits

  • This patch moves future_tbl to open up the possibility of having
    multiple rehashes on the same table.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch adds a rehash counter to bucket_table to indicate
    the last bucket that has been rehashed. This serves two purposes:

    1. Any bucket that has been rehashed can never gain a new object.
    2. If the rehash counter reaches the size of the table, the table
       will forever remain empty.

    This patch also downsizes bucket_table->size to an unsigned int,
    since we do not support sizes greater than 32 bits yet (an abridged
    sketch of the affected fields follows this entry).

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
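    The affected fields, sketched in abridged form (the remaining members
    of the real struct are elided):

        struct bucket_table {
                unsigned int    size;           /* now unsigned int */
                unsigned int    rehash;         /* last bucket rehashed */
                /* ... remaining members elided ... */
        };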
  • There is in fact no need to wait for an RCU grace period in the
    rehash function, since all insertions are guaranteed to go into
    the new table through spin locks.

    This patch uses call_rcu to free the old/rehashed table at our
    leisure (a one-line sketch follows this entry).

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
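    A sketch of the deferred freeing described above, assuming struct
    bucket_table embeds a struct rcu_head named rcu and a matching
    bucket_table_free_rcu() callback:

        call_rcu(&old_tbl->rcu, bucket_table_free_rcu);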
  • It seems that I have already made every rehash redo the random
    seed even though my commit message indicated otherwise :)

    Since we have already taken that step, this patch goes one step
    further and moves the seed initialisation into bucket_table_alloc.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • We only nest one level deep; there is no need to roll our own
    subclasses.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Previously, whenever the walker encountered a resize, it simply
    snapped back to the beginning and started again. However, this only
    worked if the rehash started and completed while the walker was
    idle.

    If the walker attempts to restart while the rehash is still ongoing,
    we may miss objects that we shouldn't have.

    This patch fixes this by making the walker walk the old table
    followed by the new table just like all other readers. If a
    rehash is detected we will still signal our caller of the fact
    so they can prepare for duplicates but we will simply continue
    the walk onto the new table after the old one is finished either
    by us or by the rehasher.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

13 Mar, 2015

3 commits

  • This patch fixes a typo in rhashtable_lookup_compare where we fail
    to recompute the hash when looking up the new table. This causes
    elements to be missed and potentially a crash during a resize.

    Reported-by: Thomas Graf
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker
    queue") changed ht->shift to be atomic, which is actually unnecessary.

    Instead of leaving the current shift in the core rhashtable structure,
    it can be cached inside the individual bucket tables.

    There, it will only be initialized once during a new table allocation
    in the shrink/expansion slow path, and from then onward it stays
    immutable for the rest of the bucket table's lifetime.

    That allows shift to be non-atomic. The patch also moves hash_rnd
    management into the table setup. The rhashtable structure now consumes
    3 instead of 4 cachelines.

    Signed-off-by: Daniel Borkmann
    Cc: Ying Xue
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • There is a potential race condition between readers and the rehasher.
    In particular, the rehasher could have started a rehash while the
    reader finishes a scan of the old table but fails to see the new
    table pointer.

    This patch closes this window by adding an smp_wmb/smp_rmb pair
    (a simplified sketch follows this entry).

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
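
    A simplified sketch of the barrier pairing described above (pointer
    naming follows the code of the time, when future_tbl still lived in
    struct rhashtable; the writer publishes the new table before
    migrating any entries):

        /* rehasher (writer) */
        rcu_assign_pointer(ht->future_tbl, new_tbl);
        smp_wmb();              /* publish before any entry migration */
        /* ... move entries from the old table into the new one ... */

        /* reader, after scanning an old bucket to its end */
        smp_rmb();              /* pairs with the smp_wmb above */
        tbl = rht_dereference_rcu(ht->future_tbl, ht);
        /* if the scan saw entries already migrated away, the future
         * table pointer is guaranteed to be visible here */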