25 Feb, 2018

1 commit

  • commit 7dc68e98757a8eccf8ca7a53a29b896f1eef1f76 upstream.

    rateest_hash is supposed to be protected by xt_rateest_mutex,
    and, as suggested by Eric, lookup and insert should be atomic,
    so we should acquire the xt_rateest_mutex once for both.

    So introduce a non-locking helper for internal use and keep the
    locking one for external callers.
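
    A rough sketch of the resulting split (helper names are illustrative, not
    necessarily the exact ones in net/netfilter/xt_RATEEST.c); the checkentry
    path can then hold xt_rateest_mutex across both the lookup and the
    hlist_add_head() insert:

    /* Caller must hold xt_rateest_mutex (internal, non-locking helper). */
    static struct xt_rateest *__xt_rateest_lookup(const char *name)
    {
            struct xt_rateest *est;

            hlist_for_each_entry(est, &rateest_hash[xt_rateest_hash(name)], list) {
                    if (strcmp(est->name, name) == 0) {
                            est->refcnt++;
                            return est;
                    }
            }
            return NULL;
    }

    /* Locking variant kept for external callers. */
    struct xt_rateest *xt_rateest_lookup(const char *name)
    {
            struct xt_rateest *est;

            mutex_lock(&xt_rateest_mutex);
            est = __xt_rateest_lookup(name);
            mutex_unlock(&xt_rateest_mutex);
            return est;
    }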

    Reported-by:
    Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target")
    Signed-off-by: Cong Wang
    Reviewed-by: Florian Westphal
    Reviewed-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman


10 Jan, 2017

1 commit

  • In matches and targets that define a kernel-only tail to their
    xt_match and xt_target data structs, add a field .usersize that
    specifies up to where data is to be shared with userspace.

    Performed a search for comment "Used internally by the kernel" to find
    relevant matches and targets. Manually inspected the structs to derive
    a valid offsetof.
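
    As a concrete illustration (a sketch modeled on xt_RATEEST; treat the exact
    fields as illustrative), the kernel-only trailing pointer is excluded by
    setting .usersize to the offset where the kernel-only tail begins:

    struct xt_rateest_target_info {
            char                    name[IFNAMSIZ];
            __s8                    interval;
            __u32                   ewma_log;

            /* Used internally by the kernel */
            struct xt_rateest       *est __attribute__((aligned(8)));
    };

    static struct xt_target xt_rateest_tg_reg __read_mostly = {
            .name       = "RATEEST",
            .targetsize = sizeof(struct xt_rateest_target_info),
            /* copy back to userspace only up to the kernel-only tail */
            .usersize   = offsetof(struct xt_rateest_target_info, est),
            /* other fields (.family, .target, .checkentry) omitted here */
    };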

    Signed-off-by: Willem de Bruijn
    Signed-off-by: Pablo Neira Ayuso


06 Dec, 2016

1 commit

  • 1) Old code was hard to maintain, due to complex lock chains.
    (We probably will be able to remove some kfree_rcu() in callers)

    2) Using a single timer to update all estimators does not scale.

    3) Code was buggy on 32-bit kernels (WRITE_ONCE() on a 64-bit quantity
    is not supposed to work well)

    In this rewrite:

    - I removed the RB tree that had to be scanned in
    gen_estimator_active(). qdisc dumps should be much faster.

    - Each estimator has its own timer.

    - Estimations are maintained in net_rate_estimator structure,
    instead of dirtying the qdisc. Minor, but part of the simplification.

    - Reading the estimator uses RCU and a seqcount to provide proper
    support for 32-bit kernels (a reader sketch follows this list).

    - We reduce memory need when estimators are not used, since
    we store a pointer, instead of the bytes/packets counters.

    - xt_rateest_mt() no longer has to grab a spinlock.
    (In the future, xt_rateest_tg() could be switched to per cpu counters)
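
    Roughly how the lockless reader side can look under this scheme (close in
    spirit to gen_estimator_read(); field and function names are illustrative):

    static bool read_rate_est(struct net_rate_estimator __rcu **rate_est,
                              struct gnet_stats_rate_est64 *sample)
    {
            struct net_rate_estimator *est;
            unsigned int seq;

            rcu_read_lock();
            est = rcu_dereference(*rate_est);
            if (!est) {
                    rcu_read_unlock();
                    return false;
            }
            do {    /* seqcount retry gives consistent 64-bit reads on 32-bit */
                    seq = read_seqcount_begin(&est->seq);
                    sample->bps = est->avbps >> 8;
                    sample->pps = est->avpps >> 8;
            } while (read_seqcount_retry(&est->seq, seq));
            rcu_read_unlock();
            return true;
    }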

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller


23 Sep, 2016

1 commit


08 Jun, 2016

1 commit

  • Large tc dumps (tc -s {qdisc|class} sh dev ethX) done by the Google BwE host
    agent [1] are problematic at scale:

    For each qdisc/class found in the dump, we currently lock the root qdisc
    spinlock in order to get stats. Sampling stats every 5 seconds from
    thousands of HTB classes is a challenge when the root qdisc spinlock is
    under high pressure. Not only do the dumps take time, they also slow
    down the fast path (queue/dequeue packets) by 10% to 20% in some cases.

    An audit of existing qdiscs showed that sch_fq_codel is the only qdisc
    that might need the qdisc lock in fq_codel_dump_stats() and
    fq_codel_dump_class_stats().

    In v2 of this patch, I now use the Qdisc running seqcount to provide
    consistent reads of packets/bytes counters, regardless of 32/64 bit arches.

    I also changed rate estimators to use the same infrastructure
    so that they no longer need to lock root qdisc lock.

    [1]
    http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43838.pdf
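
    A rough sketch of the idea (names follow the qdisc code but should be read
    as illustrative): the dequeue path, which already owns the qdisc, brackets
    its counter updates with the per-qdisc running seqcount, so a dump can
    retry its read instead of grabbing the root spinlock:

    /* The fast path already brackets its work with the running seqcount,
     * roughly:
     *
     *     qdisc_run_begin(q);    takes  &q->running  (seqcount write side)
     *     dequeue, update q->bstats.bytes / q->bstats.packets
     *     qdisc_run_end(q);      releases &q->running
     *
     * so the dump side can read the counters consistently without the lock:
     */
    static void dump_qdisc_bstats(struct Qdisc *q, u64 *bytes, u64 *packets)
    {
            unsigned int seq;

            do {
                    seq      = read_seqcount_begin(&q->running);
                    *bytes   = q->bstats.bytes;
                    *packets = q->bstats.packets;
            } while (read_seqcount_retry(&q->running, seq));
    }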

    Signed-off-by: Eric Dumazet
    Cc: Cong Wang
    Cc: Jamal Hadi Salim
    Cc: John Fastabend
    Cc: Kevin Athey
    Cc: Xiaotian Pei
    Signed-off-by: David S. Miller


30 Sep, 2014

1 commit

  • In order to run qdiscs without locking, statistics and estimators
    need to be handled correctly.

    To resolve this for bstats, make the statistics per-CPU. And because this
    is only needed for qdiscs that run without locks, which will not be the
    case for most qdiscs in the near future, only create per-CPU stats when
    a qdisc sets the TCQ_F_CPUSTATS flag.

    Next, because estimators use the bstats to calculate packets per
    second and bytes per second, the estimator code paths are updated
    to use the per-CPU statistics.
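
    A rough sketch of how per-CPU byte/packet counters can be folded into one
    total for dumps and estimators (struct and helper names are modeled on the
    gnet_stats code but are illustrative):

    static void sum_cpu_bstats(struct gnet_stats_basic_packed *sum,
                               struct gnet_stats_basic_cpu __percpu *cpu)
    {
            int i;

            for_each_possible_cpu(i) {
                    struct gnet_stats_basic_cpu *bcpu = per_cpu_ptr(cpu, i);
                    unsigned int start;
                    u64 bytes, packets;

                    do {    /* u64_stats gives consistent reads on 32-bit */
                            start   = u64_stats_fetch_begin(&bcpu->syncp);
                            bytes   = bcpu->bstats.bytes;
                            packets = bcpu->bstats.packets;
                    } while (u64_stats_fetch_retry(&bcpu->syncp, start));

                    sum->bytes   += bytes;
                    sum->packets += packets;
            }
    }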

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller


28 Feb, 2013

1 commit

  • I'm not sure why, but the hlist for-each-entry iterators were conceived
    differently from the list iterator:

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not only
    do they not really need it, it also prevents the iterator from looking
    exactly like the list iterator, which is unfortunate.
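
    An illustrative before/after of a caller (the struct and the use() call
    are placeholders):

    struct foo *f;
    struct hlist_node *pos;         /* extra cursor, old API only */

    /* before this change: */
    hlist_for_each_entry(f, pos, head, member)
            use(f);

    /* after this change, matching list_for_each_entry(): */
    hlist_for_each_entry(f, head, member)
            use(f);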

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small number of places were using the 'node' parameter; these
    were modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch, which is mostly the work of Peter Senna Tschudin, is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foundation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds


21 Jul, 2011

1 commit


12 Jun, 2010

1 commit

  • The gen_kill_estimator() API is incomplete or not well documented, since
    the caller should make sure an RCU grace period is respected before
    freeing stats_lock.

    This was partially addressed in commit 5d944c640b4
    (gen_estimator: deadlock fix), but the same problem exists for all
    gen_kill_estimator() users if the lock they use is not already RCU
    protected.

    A code review shows xt_RATEEST.c, act_api.c and act_police.c have this
    problem. Others are OK because they use the qdisc lock, which is already
    RCU protected.
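
    A rough sketch of the pattern such callers need (names are illustrative):
    since the estimator timer can still take stats_lock under RCU, the memory
    embedding that lock must not be freed before a grace period has elapsed:

    gen_kill_estimator(&p->bstats, &p->rate_est);

    /* wait for concurrent est_timer() users before freeing the object
     * that embeds stats_lock ... */
    synchronize_rcu();
    kfree(p);

    /* ... or defer the free itself, e.g. with kfree_rcu(p, rcu_head) or
     * call_rcu(&p->rcu_head, free_cb). */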

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller


12 May, 2010

1 commit


20 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the following
    script is used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h (a small example follows this list).

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered:
    alphabetical, Christmas tree, rev-Xmas-tree, or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.
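
    As a trivial illustration of the resulting edits (not a literal hunk from
    this patch), a file that only uses the slab allocator now pulls in slab.h
    itself rather than inheriting it through percpu.h:

    #include <linux/slab.h>         /* kmalloc(), kfree() */

    struct foo {                    /* placeholder type */
            int val;
    };

    static struct foo *foo_alloc(void)
    {
            return kmalloc(sizeof(struct foo), GFP_KERNEL);
    }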

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition, while for others adding it to an
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>


25 Mar, 2010

3 commits


04 Jan, 2010

1 commit


18 Aug, 2009

1 commit

  • In 5e140dfc1fe87eae27846f193086724806b33c7d "net: reorder struct Qdisc
    for better SMP performance" the definition of struct gnet_stats_basic
    changed incompatibly, as copies of this struct are shipped to
    userland via netlink.

    Restoring the old behavior is not welcome, for performance reasons.

    The fix is to use a private structure for the kernel, and to
    teach gnet_stats_copy_basic() to convert from the kernel layout to
    the userland one, using the legacy structure (struct gnet_stats_basic).

    Based on a report and initial patch from Michael Spang.
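
    A rough sketch of the conversion (close to, but not necessarily identical
    to, the actual patch): the kernel keeps its own packed layout and copies it
    into the legacy userland struct when dumping:

    /* kernel-private layout, free to differ from the netlink ABI */
    struct gnet_stats_basic_packed {
            __u64   bytes;
            __u32   packets;
    } __attribute__((packed));

    int gnet_stats_copy_basic(struct gnet_dump *d,
                              struct gnet_stats_basic_packed *b)
    {
            if (d->compat_tc_stats) {
                    d->tc_stats.bytes   = b->bytes;
                    d->tc_stats.packets = b->packets;
            }
            if (d->tail) {
                    /* legacy layout shipped to userland via netlink */
                    struct gnet_stats_basic sb;

                    memset(&sb, 0, sizeof(sb));
                    sb.bytes   = b->bytes;
                    sb.packets = b->packets;
                    return gnet_stats_copy(d, TCA_STATS_BASIC, &sb, sizeof(sb));
            }
            return 0;
    }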

    Reported-by: Michael Spang
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller


08 Oct, 2008

5 commits


14 Apr, 2008

1 commit


29 Jan, 2008

3 commits