04 Feb, 2017
1 commit
-
[ Upstream commit 0faa9cb5b3836a979864a6357e01d2046884ad52 ]
Demonstrating the issue:
.. add a drop action
$sudo $TC actions add action drop index 10.. retrieve it
$ sudo $TC -s actions get action gact index 10action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 0 installed 29 sec used 29 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0... bug 1 above: reference is two.
Reference is actually 1 but we forget to subtract 1.... do a GET again and we see the same issue
try a few times and nothing changes
~$ sudo $TC -s actions get action gact index 10action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 0 installed 31 sec used 31 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0... lets try to bind the action to a filter..
$ sudo $TC qdisc add dev lo ingress
$ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10... and now a few GETs:
$ sudo $TC -s actions get action gact index 10action order 1: gact action drop
random type none pass val 0
index 10 ref 3 bind 1 installed 204 sec used 204 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 4 bind 1 installed 206 sec used 206 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 5 bind 1 installed 235 sec used 235 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0.... as can be observed the reference count keeps going up.
After the fix
$ sudo $TC actions add action drop index 10
$ sudo $TC -s actions get action gact index 10action order 1: gact action drop
random type none pass val 0
index 10 ref 1 bind 0 installed 4 sec used 4 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 1 bind 0 installed 6 sec used 6 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0$ sudo $TC qdisc add dev lo ingress
$ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 1 installed 32 sec used 32 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 1 installed 33 sec used 33 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0Fixes: aecc5cefc389 ("net sched actions: fix GETing actions")
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
30 Oct, 2016
1 commit
-
Use nla_parse_nested instead of open-coding the call to
nla_parse() with the attribute data/len.Signed-off-by: Johannes Berg
Acked-by: Cong Wang
Signed-off-by: David S. Miller
13 Oct, 2016
1 commit
-
Krister reported a kernel NULL pointer dereference after
tcf_action_init_1() invokes a_o->init(), it is a race condition
where one thread calling tcf_register_action() to initialize
the netns data after putting act ops in the global list and
the other thread searching the list and then calling
a_o->init(net, ...).Fix this by moving the pernet ops registration before making
the action ops visible. This is fine because: a) we don't
rely on act_base in pernet ops->init(), b) in the worst case we
have a fully initialized netns but ops is still not ready so
new actions still can't be created.Reported-by: Krister Johansen
Tested-by: Krister Johansen
Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
21 Sep, 2016
1 commit
-
With the batch changes that translated transient actions into
a temporary list lost in the translation was the fact that
tcf_action_destroy() will eventually delete the action from
the permanent location if the refcount is zero.Example of what broke:
...add a gact action to drop
sudo $TC actions add action drop index 10
...now retrieve it, looks good
sudo $TC actions get action gact index 10
...retrieve it again and find it is gone!
sudo $TC actions get action gact index 10Fixes: 22dc13c837c3 ("net_sched: convert tcf_exts from list to pointer array"),
Fixes: 824a7e8863b3 ("net_sched: remove an unnecessary list_del()")
Fixes: f07fed82ad79 ("net_sched: remove the leftover cleanup_a()")Acked-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
20 Sep, 2016
1 commit
-
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
18 Aug, 2016
3 commits
-
As pointed out by Jamal, an action could be shared by
multiple filters, so we can't use list to chain them
any more after we get rid of the original tc_action.
Instead, we could just save pointers to these actions
in tcf_exts, since they are refcount'ed, so convert
the list to an array of pointers.The "ugly" part is the action API still accepts list
as a parameter, I just introduce a helper function to
convert the array of pointers to a list, instead of
relying on the C99 feature to iterate the array.Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim
Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
This list_del() for tc action is not needed actually,
because we only use this list to chain bulk operations,
therefore should not be carried for latter operations.Fixes: ec0595cc4495 ("net_sched: get rid of struct tcf_common")
Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
After refactoring tc_action into tcf_common, we no
longer need to cleanup temporary "actions" in list,
they are permanently stored in the hashtable.Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common")
Reported-by: Jamal Hadi Salim
Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
26 Jul, 2016
2 commits
-
After the previous patch, struct tc_action should be enough
to represent the generic tc action, tcf_common is not necessary
any more. This patch gets rid of it to make tc action code
more readable.Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller -
struct tc_action is confusing, currently we use it for two purposes:
1) Pass in arguments and carry out results from helper functions
2) A generic representation for tc actionsThe first one is error-prone, since we need to make sure we don't
miss anything. This patch aims to get rid of this use, by moving
tc_action into tcf_common, so that they are allocated together
in hashtable and can be cast'ed easily.And together with the following patch, we could really make
tc_action a generic representation for all tc actions and each
type of action can inherit from it.Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
30 Jun, 2016
1 commit
-
Several cases of overlapping changes, except the packet scheduler
conflicts which deal with the addition of the free list parameter
to qdisc_enqueue().Signed-off-by: David S. Miller
16 Jun, 2016
2 commits
-
This refers to commands to direct action access as follows:
sudo tc actions add action drop index 12
sudo tc actions add action pipe index 10And then dumping them like so:
sudo tc actions ls action gactiproute2 worked because it depended on absence of TCA_ACT_TAB TLV
as end of message.
This fix has been tested with iproute2 and is backward compatible.Signed-off-by: Jamal Hadi Salim
Acked-by: Cong Wang
Signed-off-by: David S. Miller -
Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
08 Jun, 2016
3 commits
-
Large tc dumps (tc -s {qdisc|class} sh dev ethX) done by Google BwE host
agent [1] are problematic at scale :For each qdisc/class found in the dump, we currently lock the root qdisc
spinlock in order to get stats. Sampling stats every 5 seconds from
thousands of HTB classes is a challenge when the root qdisc spinlock is
under high pressure. Not only the dumps take time, they also slow
down the fast path (queue/dequeue packets) by 10 % to 20 % in some cases.An audit of existing qdiscs showed that sch_fq_codel is the only qdisc
that might need the qdisc lock in fq_codel_dump_stats() and
fq_codel_dump_class_stats()In v2 of this patch, I now use the Qdisc running seqcount to provide
consistent reads of packets/bytes counters, regardless of 32/64 bit arches.I also changed rate estimators to use the same infrastructure
so that they no longer need to lock root qdisc lock.[1]
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43838.pdfSigned-off-by: Eric Dumazet
Cc: Cong Wang
Cc: Jamal Hadi Salim
Cc: John Fastabend
Cc: Kevin Athey
Cc: Xiaotian Pei
Signed-off-by: David S. Miller -
Signed-off-by: Jamal Hadi Salim
Acked-by: Cong Wang -
Useful to know when the action was first used for accounting
(and debugging)Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
27 Apr, 2016
1 commit
-
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
26 Feb, 2016
2 commits
-
Currently tc actions are stored in a per-module hashtable,
therefore are visible to all network namespaces. This is
probably the last part of the tc subsystem which is not
aware of netns now. This patch makes them per-netns,
several tc action API's need to be adjusted for this.The tc action API code is ugly due to historical reasons,
we need to refactor that code in the future.Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
We only release the memory of the hashtable itself, not its
entries inside. This is not a problem yet since we only call
it in module release path, and module is refcount'ed by
actions. This would be a problem after we move the per module
hinfo into per netns in the latter patch.Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
27 Aug, 2015
1 commit
-
tcf_hash_destroy() used once. Make it static.
Signed-off-by: Alexei Starovoitov
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller
01 Aug, 2015
1 commit
-
Conflicts:
arch/s390/net/bpf_jit_comp.c
drivers/net/ethernet/ti/netcp_ethss.c
net/bridge/br_multicast.c
net/ipv4/ip_fragment.cAll four conflicts were cases of simple overlapping
changes.Signed-off-by: David S. Miller
31 Jul, 2015
1 commit
-
Since commit 55334a5db5cd ("net_sched: act: refuse to remove bound action
outside"), we end up with a wrong reference count for a tc action.Test case 1:
FOO="1,6 0 0 4294967295,"
BAR="1,6 0 0 4294967294,"
tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 \
action bpf bytecode "$FOO"
tc actions show action bpf
action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
index 1 ref 1 bind 1
tc actions replace action bpf bytecode "$BAR" index 1
tc actions show action bpf
action order 0: bpf bytecode '1,6 0 0 4294967294' default-action pipe
index 1 ref 2 bind 1
tc actions replace action bpf bytecode "$FOO" index 1
tc actions show action bpf
action order 0: bpf bytecode '1,6 0 0 4294967295' default-action pipe
index 1 ref 3 bind 1Test case 2:
FOO="1,6 0 0 4294967295,"
tc filter add dev foo parent 1: bpf bytecode "$FOO" flowid 1:1 action ok
tc actions show action gact
action order 0: gact action pass
random type none pass val 0
index 1 ref 1 bind 1
tc actions add action drop index 1
RTNETLINK answers: File exists [...]
tc actions show action gact
action order 0: gact action pass
random type none pass val 0
index 1 ref 2 bind 1
tc actions add action drop index 1
RTNETLINK answers: File exists [...]
tc actions show action gact
action order 0: gact action pass
random type none pass val 0
index 1 ref 3 bind 1What happens is that in tcf_hash_check(), we check tcf_common for a given
index and increase tcfc_refcnt and conditionally tcfc_bindcnt when we've
found an existing action. Now there are the following cases:1) We do a late binding of an action. In that case, we leave the
tcfc_refcnt/tcfc_bindcnt increased and are done with the ->init()
handler. This is correctly handeled.2) We replace the given action, or we try to add one without replacing
and find out that the action at a specific index already exists
(thus, we go out with error in that case).In case of 2), we have to undo the reference count increase from
tcf_hash_check() in the tcf_hash_check() function. Currently, we fail to
do so because of the 'tcfc_bindcnt > 0' check which bails out early with
an -EPERM error.Now, while commit 55334a5db5cd prevents 'tc actions del action ...' on an
already classifier-bound action to drop the reference count (which could
then become negative, wrap around etc), this restriction only accounts for
invocations outside a specific action's ->init() handler.One possible solution would be to add a flag thus we possibly trigger
the -EPERM ony in situations where it is indeed relevant.After the patch, above test cases have correct reference count again.
Fixes: 55334a5db5cd ("net_sched: act: refuse to remove bound action outside")
Signed-off-by: Daniel Borkmann
Reviewed-by: Cong Wang
Signed-off-by: David S. Miller
09 Jul, 2015
1 commit
-
Reuse existing percpu infrastructure John Fastabend added for qdisc.
This patch adds a new cpustats parameter to tcf_hash_create() and all
actions pass false, meaning this patch should have no effect yet.Signed-off-by: Eric Dumazet
Cc: Alexei Starovoitov
Acked-by: Jamal Hadi Salim
Acked-by: John Fastabend
Signed-off-by: David S. Miller
03 May, 2015
1 commit
-
Not used.
pedit sets TC_MUNGED when packet content was altered, but all the core
does is unset MUNGED again and then set OK2MUNGE.And the latter isn't tested anywhere. So lets remove both
TC_MUNGED and TC_OK2MUNGE.Signed-off-by: Florian Westphal
Acked-by: Alexei Starovoitov
Acked-by: Daniel Borkmann
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
30 Sep, 2014
3 commits
-
After previous patches to simplify qstats the qstats can be
made per cpu with a packed union in Qdisc struct.Signed-off-by: John Fastabend
Signed-off-by: David S. Miller -
This removes the use of qstats->qlen variable from the classifiers
and makes it an explicit argument to gnet_stats_copy_queue().The qlen represents the qdisc queue length and is packed into
the qstats at the last moment before passnig to user space. By
handling it explicitely we avoid, in the percpu stats case, having
to figure out which per_cpu variable to put it in.It would probably be best to remove it from qstats completely
but qstats is a user space ABI and can't be broken. A future
patch could make an internal only qstats structure that would
avoid having to allocate an additional u32 variable on the
Qdisc struct. This would make the qstats struct 128bits instead
of 128+32.Signed-off-by: John Fastabend
Signed-off-by: David S. Miller -
In order to run qdisc's without locking statistics and estimators
need to be handled correctly.To resolve bstats make the statistics per cpu. And because this is
only needed for qdiscs that are running without locks which is not
the case for most qdiscs in the near future only create percpu
stats when qdiscs set the TCQ_F_CPUSTATS flag.Next because estimators use the bstats to calculate packets per
second and bytes per second the estimator code paths are updated
to use the per cpu statistics.Signed-off-by: John Fastabend
Signed-off-by: David S. Miller
25 Apr, 2014
1 commit
-
It is possible by passing a netlink socket to a more privileged
executable and then to fool that executable into writing to the socket
data that happens to be valid netlink message to do something that
privileged executable did not intend to do.To keep this from happening replace bare capable and ns_capable calls
with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
Which act the same as the previous calls except they verify that the
opener of the socket had the desired permissions as well.Reported-by: Andy Lutomirski
Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
13 Feb, 2014
5 commits
-
We could allocate tc_action on stack in tca_action_flush(),
since it is not large.Also, we could use create_a() in tcf_action_get_1().
Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
When an action is bonnd to a filter, there is no point to
remove it outside. Currently we just silently decrease the refcnt,
we should reject this explicitly with EPERM.Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
For bindcnt and refcnt etc., they are common for all actions,
not need to repeat such operations for their own, they can be unified
now. Actions just need to do its specific cleanup if needed.Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
Now we can totally hide it from modules. tcf_hash_*() API's
will operate on struct tc_action, modules don't need to care about
the details.Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
22 Jan, 2014
2 commits
-
So that we will not expose struct tcf_common to modules.
Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
Every action ops has a pointer to hash info, so we don't need to
hard-code it in each module.Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
14 Jan, 2014
3 commits
-
It is not necessary at all.
Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
Refactor tcf_add_notify() and factor out tcf_del_notify().
Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
There is no need to store the index separatedly
since tcf_hashinfo is allocated statically too.Cc: Jamal Hadi Salim
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller
07 Jan, 2014
2 commits
-
action flushing missaccounting
Account only for deleted actionsSigned-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
Remove unnecessary checks for act->ops
(suggested by Eric Dumazet).Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller