Eric Lee / smarc-fsl-linux-kernel

06 Oct, 2020

1 commit

8b0308fe3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net ... Browse Code »

Rejecting non-native endian BTF overlapped with the addition
of support for it.

The rest were more simple overlapping changes, except the
renesas ravb binding update, which had to follow a file
move as well as a YAML conversion.

Signed-off-by: David S. Miller

David S. Miller
2020-10-06 09:40:01 +0800

05 Oct, 2020

1 commit

580e4273d net_sched: check error pointer in tcf_dump_walker() ... Browse Code »

Although we take RTNL on dump path, it is possible to
skip RTNL on insertion path. So the following race condition
is possible:

rtnl_lock() // no rtnl lock
mutex_lock(&idrinfo->lock);
// insert ERR_PTR(-EBUSY)
mutex_unlock(&idrinfo->lock);
tc_dump_action()
rtnl_unlock()

So we have to skip those temporary -EBUSY entries on dump path
too.

Reported-and-tested-by: syzbot+b47bc4f247856fb4d9e1@syzkaller.appspotmail.com
Fixes: 0fedc63fadf0 ("net_sched: commit action insertions together")
Cc: Vlad Buslov
Cc: Jamal Hadi Salim
Cc: Jiri Pirko
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

Cong Wang
2020-10-05 05:53:06 +0800

29 Sep, 2020

1 commit

1aad80499 net_sched: remove a redundant goto chain check ... Browse Code »

All TC actions call tcf_action_check_ctrlact() to validate
goto chain, so this check in tcf_action_init_1() is actually
redundant. Remove it to save troubles of leaking memory.

Fixes: e49d8c22f126 ("net_sched: defer tcf_idr_insert() in tcf_action_init_1()")
Reported-by: Vlad Buslov
Suggested-by: Davide Caratti
Cc: Jamal Hadi Salim
Cc: Jiri Pirko
Signed-off-by: Cong Wang
Reviewed-by: Davide Caratti
Signed-off-by: David S. Miller

Cong Wang
2020-09-29 03:58:45 +0800

25 Sep, 2020

2 commits

0fedc63fa net_sched: commit action insertions together ... Browse Code »

syzbot is able to trigger a failure case inside the loop in
tcf_action_init(), and when this happens we clean up with
tcf_action_destroy(). But, as these actions are already inserted
into the global IDR, other parallel process could free them
before tcf_action_destroy(), then we will trigger a use-after-free.

Fix this by deferring the insertions even later, after the loop,
and committing all the insertions in a separate loop, so we will
never fail in the middle of the insertions any more.

One side effect is that the window between alloction and final
insertion becomes larger, now it is more likely that the loop in
tcf_del_walker() sees the placeholder -EBUSY pointer. So we have
to check for error pointer in tcf_del_walker().

Reported-and-tested-by: syzbot+2287853d392e4b42374a@syzkaller.appspotmail.com
Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action")
Cc: Vlad Buslov
Cc: Jamal Hadi Salim
Cc: Jiri Pirko
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

Cong Wang
2020-09-25 10:46:21 +0800
e49d8c22f net_sched: defer tcf_idr_insert() in tcf_action_init_1() ... Browse Code »

All TC actions call tcf_idr_insert() for new action at the end
of their ->init(), so we can actually move it to a central place
in tcf_action_init_1().

And once the action is inserted into the global IDR, other parallel
process could free it immediately as its refcnt is still 1, so we can
not fail after this, we need to move it after the goto action
validation to avoid handling the failure case after insertion.

This is found during code review, is not directly triggered by syzbot.
And this prepares for the next patch.

Cc: Vlad Buslov
Cc: Jamal Hadi Salim
Cc: Jiri Pirko
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

Cong Wang
2020-09-25 10:46:21 +0800

09 Sep, 2020

1 commit

c1f1f16c4 net: sched: skip an unnecessay check ... Browse Code »

Reviewing the error handling in tcf_action_init_1()
most of the early handling uses

err_out:
if (cookie) {
kfree(cookie->data);
kfree(cookie);
}

before cookie could ever be set.

So skip the unnecessay check.

Signed-off-by: Tom Rix
Signed-off-by: David S. Miller

Tom Rix
2020-09-09 10:34:36 +0800

21 Jun, 2020

1 commit

8bf153951 Remove redundant skb null check ... Browse Code »

Remove the redundant null check for skb.

Signed-off-by: Gaurav Singh
Signed-off-by: David S. Miller

Gaurav Singh
2020-06-21 12:29:27 +0800

20 Jun, 2020

1 commit

4b61d3e8d net: qos offload add flow status with dropped count ... Browse Code »

This patch adds a drop frames counter to tc flower offloading.
Reporting h/w dropped frames is necessary for some actions.
Some actions like police action and the coming introduced stream gate
action would produce dropped frames which is necessary for user. Status
update shows how many filtered packets increasing and how many dropped
in those packets.

v2: Changes
- Update commit comments suggest by Jiri Pirko.

Signed-off-by: Po Liu
Reviewed-by: Simon Horman
Reviewed-by: Vlad Buslov
Signed-off-by: David S. Miller

Po Liu
2020-06-20 03:53:30 +0800

16 May, 2020

1 commit

ca44b738e net: sched: implement terse dump support in act ... Browse Code »

Extend tcf_action_dump() with boolean argument 'terse' that is used to
request terse-mode action dump. In terse mode only essential data needed to
identify particular action (action kind, cookie, etc.) and its stats is put
to resulting skb and everything else is omitted. Implement
tcf_exts_terse_dump() helper in cls API that is intended to be used to
request terse dump of all exts (actions) attached to the filter.

Signed-off-by: Vlad Buslov
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller

Vlad Buslov
2020-05-16 01:23:11 +0800

01 May, 2020

1 commit

47a1494b8 netlink: remove type-unsafe validation_data pointer ... Browse Code »

In the netlink policy, we currently have a void *validation_data
that's pointing to different things:
* a u32 value for bitfield32,
* the netlink policy for nested/nested array
* the string for NLA_REJECT

Remove the pointer and place appropriate type-safe items in the
union instead.

While at it, completely dissolve the pointer for the bitfield32
case and just put the value there directly.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2020-05-01 08:51:41 +0800

31 Mar, 2020

2 commits

93a129eb8 net: sched: expose HW stats types per action used by drivers ... Browse Code »

It may be up to the driver (in case ANY HW stats is passed) to select
which type of HW stats he is going to use. Add an infrastructure to
expose this information to user.

$ tc filter add dev enp3s0np1 ingress proto ip handle 1 pref 1 flower dst_ip 192.168.1.1 action drop
$ tc -s filter show dev enp3s0np1 ingress
filter protocol ip pref 1 flower chain 0
filter protocol ip pref 1 flower chain 0 handle 0x1
eth_type ipv4
dst_ip 192.168.1.1
in_hw in_hw_count 2
action order 1: gact action drop
random type none pass val 0
index 1 ref 1 bind 1 installed 10 sec used 10 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
used_hw_stats immediate <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2020-03-31 02:06:49 +0800
8953b0770 net: introduce nla_put_bitfield32() helper and use it ... Browse Code »

Introduce a helper to pass value and selector to. The helper packs them
into struct and puts them into netlink message.

Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2020-03-31 02:06:49 +0800

24 Mar, 2020

1 commit

0dfb2d82a net: sched: rename more stats_types ... Browse Code »

Commit 53eca1f3479f ("net: rename flow_action_hw_stats_types* ->
flow_action_hw_stats*") renamed just the flow action types and
helpers. For consistency rename variables, enums, struct members
and UAPI too (note that this UAPI was not in any official release,
yet).

Signed-off-by: Jakub Kicinski
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller

Jakub Kicinski
2020-03-24 11:54:23 +0800

09 Mar, 2020

1 commit

44f865801 sched: act: allow user to specify type of HW stats for a filter ... Browse Code »

Currently, user who is adding an action expects HW to report stats,
however it does not have exact expectations about the stats types.
That is aligned with TCA_ACT_HW_STATS_TYPE_ANY.

Allow user to specify the type of HW stats for an action and require it.

Pass the information down to flow_offload layer.

Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2020-03-09 12:07:48 +0800

27 Feb, 2020

1 commit

1521a67e6 sched: act: count in the size of action flags bitfield ... Browse Code »

The put of the flags was added by the commit referenced in fixes tag,
however the size of the message was not extended accordingly.

Fix this by adding size of the flags bitfield to the message size.

Fixes: e38226786022 ("net: sched: update action implementations to support flags")
Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller

Jiri Pirko
2020-02-27 09:10:44 +0800

27 Nov, 2019

1 commit

1ae78780e Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull RCU updates from Ingo Molnar:
"The main changes in this cycle were:

- Dynamic tick (nohz) updates, perhaps most notably changes to force
the tick on when needed due to lengthy in-kernel execution on CPUs
on which RCU is waiting.

- Linux-kernel memory consistency model updates.

- Replace rcu_swap_protected() with rcu_prepace_pointer().

- Torture-test updates.

- Documentation updates.

- Miscellaneous fixes"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer()
net/sched: Replace rcu_swap_protected() with rcu_replace_pointer()
net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer()
net/core: Replace rcu_swap_protected() with rcu_replace_pointer()
bpf/cgroup: Replace rcu_swap_protected() with rcu_replace_pointer()
fs/afs: Replace rcu_swap_protected() with rcu_replace_pointer()
drivers/scsi: Replace rcu_swap_protected() with rcu_replace_pointer()
drm/i915: Replace rcu_swap_protected() with rcu_replace_pointer()
x86/kvm/pmu: Replace rcu_swap_protected() with rcu_replace_pointer()
rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer()
rcu: Suppress levelspread uninitialized messages
rcu: Fix uninitialized variable in nocb_gp_wait()
rcu: Update descriptions for rcu_future_grace_period tracepoint
rcu: Update descriptions for rcu_nocb_wake tracepoint
rcu: Remove obsolete descriptions for rcu_barrier tracepoint
rcu: Ensure that ->rcu_urgent_qs is set before resched IPI
workqueue: Convert for_each_wq to use built-in list check
rcu: Several rcu_segcblist functions can be static
rcu: Remove unused function hlist_bl_del_init_rcu()
Documentation: Rename rcu_node_context_switch() to rcu_note_context_switch()
...

Linus Torvalds
2019-11-27 07:42:43 +0800

13 Nov, 2019

1 commit

e0e2b35b7 net/sched: actions: remove unused 'order' ... Browse Code »

after commit 4097e9d250fb ("net: sched: don't use tc_action->order during
action dump"), 'act->order' is initialized but then it's no more read, so
we can just remove this member of struct tc_action.

CC: Ivan Vecera
Signed-off-by: Davide Caratti
Acked-by: Jiri Pirko
Reviewed-by: Ivan Vecera
Signed-off-by: David S. Miller

Davide Caratti
2019-11-13 04:11:22 +0800

06 Nov, 2019

1 commit

b33e699fe net_sched: add TCA_STATS_PKT64 attribute ... Browse Code »

Now the kernel uses 64bit packet counters in scheduler layer,
we want to export these counters to user space.

Instead risking breaking user space by adding fields
to struct gnet_stats_basic, add a new TCA_STATS_PKT64.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2019-11-06 10:20:55 +0800

31 Oct, 2019

5 commits

43e0ae7ae Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmc… ... Browse Code »

…k/linux-rcu into core/rcu

Pull RCU and LKMM changes from Paul E. McKenney:

- Documentation updates.

- Miscellaneous fixes.

- Dynamic tick (nohz) updates, perhaps most notably changes to
force the tick on when needed due to lengthy in-kernel execution
on CPUs on which RCU is waiting.

- Replace rcu_swap_protected() with rcu_prepace_pointer().

- Torture-test updates.

- Linux-kernel memory consistency model updates.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2019-10-31 16:33:19 +0800
e38226786 net: sched: update action implementations to support flags ... Browse Code »

Extend struct tc_action with new "tcfa_flags" field. Set the field in
tcf_idr_create() function and provide new helper
tcf_idr_create_from_flags() that derives 'cpustats' boolean from flags
value. Update individual hardware-offloaded actions init() to pass their
"flags" argument to new helper in order to skip percpu stats allocation
when user requested it through flags.

Signed-off-by: Vlad Buslov
Signed-off-by: David S. Miller

Vlad Buslov
2019-10-31 09:07:51 +0800
abbb0d336 net: sched: extend TCA_ACT space with TCA_ACT_FLAGS ... Browse Code »

Extend TCA_ACT space with nla_bitfield32 flags. Add
TCA_ACT_FLAGS_NO_PERCPU_STATS as the only allowed flag. Parse the flags in
tcf_action_init_1() and pass resulting value as additional argument to
a_o->init().

Signed-off-by: Vlad Buslov
Signed-off-by: David S. Miller

Vlad Buslov
2019-10-31 09:07:50 +0800
5e174d5e7 net: sched: modify stats helper functions to support regular stats ... Browse Code »

Modify stats update helper functions introduced in previous patches in this
series to fallback to regular tc_action->tcfa_{b|q}stats if cpu stats are
not allocated for the action argument. If regular non-percpu allocated
counters are in use, then obtain action tcfa_lock while modifying them.

Signed-off-by: Vlad Buslov
Acked-by: Jiri Pirko
Signed-off-by: David S. Miller

Vlad Buslov
2019-10-31 09:07:50 +0800
c8ecebd04 net: sched: extract common action counters update code into function ... Browse Code »

Currently, all implementations of tc_action_ops->stats_update() callback
have almost exactly the same implementation of counters update
code (besides gact which also updates drop counter). In order to simplify
support for using both percpu-allocated and regular action counters
depending on run-time flag in following patches, extract action counters
update code into standalone function in act API.

This commit doesn't change functionality.

Signed-off-by: Vlad Buslov
Acked-by: Jiri Pirko
Signed-off-by: David S. Miller

Vlad Buslov
2019-10-31 09:07:50 +0800

30 Oct, 2019

1 commit

445d37493 net/sched: Replace rcu_swap_protected() with rcu_replace_pointer() ... Browse Code »

This commit replaces the use of rcu_swap_protected() with the more
intuitively appealing rcu_replace_pointer() as a step towards removing
rcu_swap_protected().

Link: https://lore.kernel.org/lkml/CAHk-=wiAsJLw1egFEE=Z7-GGtM6wcvtyytXZA1+BHqta4gg6Hw@mail.gmail.com/
Reported-by: Linus Torvalds
[ paulmck: From rcu_replace() to rcu_replace_pointer() per Ingo Molnar. ]
Signed-off-by: Paul E. McKenney
Cc: Jamal Hadi Salim
Cc: Cong Wang
Cc: Jiri Pirko
Cc: "David S. Miller"
Cc:
Cc:

Paul E. McKenney
2019-10-30 23:45:48 +0800

16 Oct, 2019

1 commit

39f13ea2f net: avoid potential infinite loop in tc_ctl_action() ... Browse Code »

tc_ctl_action() has the ability to loop forever if tcf_action_add()
returns -EAGAIN.

This special case has been done in case a module needed to be loaded,
but it turns out that tcf_add_notify() could also return -EAGAIN
if the socket sk_rcvbuf limit is hit.

We need to separate the two cases, and only loop for the module
loading case.

While we are at it, add a limit of 10 attempts since unbounded
loops are always scary.

syzbot repro was something like :

socket(PF_NETLINK, SOCK_RAW|SOCK_NONBLOCK, NETLINK_ROUTE) = 3
write(3, ..., 38) = 38
setsockopt(3, SOL_SOCKET, SO_RCVBUF, [0], 4) = 0
sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{..., 388}], msg_controllen=0, msg_flags=0x10}, ...)

NMI backtrace for cpu 0
CPU: 0 PID: 1054 Comm: khungtaskd Not tainted 5.4.0-rc1+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
watchdog+0x9d0/0xef0 kernel/hung_task.c:289
kthread+0x361/0x430 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 8859 Comm: syz-executor910 Not tainted 5.4.0-rc1+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:arch_local_save_flags arch/x86/include/asm/paravirt.h:751 [inline]
RIP: 0010:lockdep_hardirqs_off+0x1df/0x2e0 kernel/locking/lockdep.c:3453
Code: 5c 08 00 00 5b 41 5c 41 5d 5d c3 48 c7 c0 58 1d f3 88 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 0f 85 d3 00 00 00 83 3d 21 9e 99 07 00 0f 84 b9 00 00 00 9c 58 0f 1f 44 00 00 f6
RSP: 0018:ffff8880a6f3f1b8 EFLAGS: 00000046
RAX: 1ffffffff11e63ab RBX: ffff88808c9c6080 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff88808c9c6914
RBP: ffff8880a6f3f1d0 R08: ffff88808c9c6080 R09: fffffbfff16be5d1
R10: fffffbfff16be5d0 R11: 0000000000000003 R12: ffffffff8746591f
R13: ffff88808c9c6080 R14: ffffffff8746591f R15: 0000000000000003
FS: 00000000011e4880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffff600400 CR3: 00000000a8920000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
trace_hardirqs_off+0x62/0x240 kernel/trace/trace_preemptirq.c:45
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:108 [inline]
_raw_spin_lock_irqsave+0x6f/0xcd kernel/locking/spinlock.c:159
__wake_up_common_lock+0xc8/0x150 kernel/sched/wait.c:122
__wake_up+0xe/0x10 kernel/sched/wait.c:142
netlink_unlock_table net/netlink/af_netlink.c:466 [inline]
netlink_unlock_table net/netlink/af_netlink.c:463 [inline]
netlink_broadcast_filtered+0x705/0xb80 net/netlink/af_netlink.c:1514
netlink_broadcast+0x3a/0x50 net/netlink/af_netlink.c:1534
rtnetlink_send+0xdd/0x110 net/core/rtnetlink.c:714
tcf_add_notify net/sched/act_api.c:1343 [inline]
tcf_action_add+0x243/0x370 net/sched/act_api.c:1362
tc_ctl_action+0x3b5/0x4bc net/sched/act_api.c:1410
rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5386
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5404
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:657
___sys_sendmsg+0x803/0x920 net/socket.c:2311
__sys_sendmsg+0x105/0x1d0 net/socket.c:2356
__do_sys_sendmsg net/socket.c:2365 [inline]
__se_sys_sendmsg net/socket.c:2363 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440939

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet
Reported-by: syzbot+cf0adbb9c28c8866c788@syzkaller.appspotmail.com
Signed-off-by: David S. Miller

Eric Dumazet
2019-10-16 11:20:22 +0800

09 Oct, 2019

1 commit

4b793fecc net_sched: fix backward compatibility for TCA_ACT_KIND ... Browse Code »

For TCA_ACT_KIND, we have to keep the backward compatibility too,
and rely on nla_strlcpy() to check and terminate the string with
a NUL.

Note for TC actions, nla_strcmp() is already used to compare kind
strings, so we don't need to fix other places.

Fixes: 199ce850ce11 ("net_sched: add policy validation for action attributes")
Reported-by: Marcelo Ricardo Leitner
Cc: Jamal Hadi Salim
Cc: Jiri Pirko
Signed-off-by: Cong Wang
Reviewed-by: Marcelo Ricardo Leitner
Signed-off-by: Jakub Kicinski

Cong Wang
2019-10-09 07:29:35 +0800

22 Sep, 2019

1 commit

199ce850c net_sched: add policy validation for action attributes ... Browse Code »

Similar to commit 8b4c3cdd9dd8
("net: sched: Add policy validation for tc attributes"), we need
to add proper policy validation for TC action attributes too.

Cc: David Ahern
Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Acked-by: Jiri Pirko
Signed-off-by: Jakub Kicinski

Cong Wang
2019-09-22 10:33:13 +0800

02 Jul, 2019

1 commit

e33d2b74d idr: fix overflow case for idr_for_each_entry_ul() ... Browse Code »

idr_for_each_entry_ul() is buggy as it can't handle overflow
case correctly. When we have an ID == UINT_MAX, it becomes an
infinite loop. This happens when running on 32-bit CPU where
unsigned long has the same size with unsigned int.

There is no better way to fix this than casting it to a larger
integer, but we can't just 64 bit integer on 32 bit CPU. Instead
we could just use an additional integer to help us to detect this
overflow case, that is, adding a new parameter to this macro.
Fortunately tc action is its only user right now.

Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
Reported-by: Li Shuang
Tested-by: Davide Caratti
Cc: Matthew Wilcox
Cc: Chris Mi
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

Cong Wang
2019-07-02 10:15:46 +0800

31 May, 2019

2 commits

2f4c53349 Merge tag 'spdx-5.2-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core ... Browse Code »

Pull yet more SPDX updates from Greg KH:
"Here is another set of reviewed patches that adds SPDX tags to
different kernel files, based on a set of rules that are being used to
parse the comments to try to determine that the license of the file is
"GPL-2.0-or-later" or "GPL-2.0-only". Only the "obvious" versions of
these matches are included here, a number of "non-obvious" variants of
text have been found but those have been postponed for later review
and analysis.

There is also a patch in here to add the proper SPDX header to a bunch
of Kbuild files that we have missed in the past due to new files being
added and forgetting that Kbuild uses two different file names for
Makefiles. This issue was reported by the Kbuild maintainer.

These patches have been out for review on the linux-spdx@vger mailing
list, and while they were created by automatic tools, they were
hand-verified by a bunch of different people, all whom names are on
the patches are reviewers"

* tag 'spdx-5.2-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (82 commits)
treewide: Add SPDX license identifier - Kbuild
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 225
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 224
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 223
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 222
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 221
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 220
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 218
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 217
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 216
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 215
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 214
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 213
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 211
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 210
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 207
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 203
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 201
...

Linus Torvalds
2019-05-31 23:34:32 +0800
2874c5fd2 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 ... Browse Code »

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 of the license or at
your option any later version

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 3029 file(s).

Signed-off-by: Thomas Gleixner
Reviewed-by: Allison Randal
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2019-05-31 02:26:32 +0800

25 May, 2019

1 commit

4097e9d25 net: sched: don't use tc_action->order during action dump ... Browse Code »

Function tcf_action_dump() relies on tc_action->order field when starting
nested nla to send action data to userspace. This approach breaks in
several cases:

- When multiple filters point to same shared action, tc_action->order field
is overwritten each time it is attached to filter. This causes filter
dump to output action with incorrect attribute for all filters that have
the action in different position (different order) from the last set
tc_action->order value.

- When action data is displayed using tc action API (RTM_GETACTION), action
order is overwritten by tca_action_gd() according to its position in
resulting array of nl attributes, which will break filter dump for all
filters attached to that shared action that expect it to have different
order value.

Don't rely on tc_action->order when dumping actions. Set nla according to
action position in resulting array of actions instead.

Signed-off-by: Vlad Buslov
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller

Vlad Buslov
2019-05-25 04:27:52 +0800

28 Apr, 2019

2 commits

8cb081746 netlink: make validation more configurable for future strictness ... Browse Code »

We currently have two levels of strict validation:

1) liberal (default)
- undefined (type >= max) & NLA_UNSPEC attributes accepted
- attribute length >= expected accepted
- garbage at end of message accepted
2) strict (opt-in)
- NLA_UNSPEC attributes accepted
- attribute length >= expected accepted

Split out parsing strictness into four different options:
* TRAILING - check that there's no trailing data after parsing
attributes (in message or nested)
* MAXTYPE - reject attrs > max known type
* UNSPEC - reject attributes with NLA_UNSPEC policy entries
* STRICT_ATTRS - strictly validate attribute size

The default for future things should be *everything*.
The current *_strict() is a combination of TRAILING and MAXTYPE,
and is renamed to _deprecated_strict().
The current regular parsing has none of this, and is renamed to
*_parse_deprecated().

Additionally it allows us to selectively set one of the new flags
even on old policies. Notably, the UNSPEC flag could be useful in
this case, since it can be arranged (by filling in the policy) to
not be an incompatible userspace ABI change, but would then going
forward prevent forgetting attribute entries. Similar can apply
to the POLICY flag.

We end up with the following renames:
* nla_parse -> nla_parse_deprecated
* nla_parse_strict -> nla_parse_deprecated_strict
* nlmsg_parse -> nlmsg_parse_deprecated
* nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
* nla_parse_nested -> nla_parse_nested_deprecated
* nla_validate_nested -> nla_validate_nested_deprecated

Using spatch, of course:
@@
expression TB, MAX, HEAD, LEN, POL, EXT;
@@
-nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
+nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)

@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)

@@
expression NLH, HDRLEN, TB, MAX, POL, EXT;
@@
-nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
+nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)

@@
expression TB, MAX, NLA, POL, EXT;
@@
-nla_parse_nested(TB, MAX, NLA, POL, EXT)
+nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)

@@
expression START, MAX, POL, EXT;
@@
-nla_validate_nested(START, MAX, POL, EXT)
+nla_validate_nested_deprecated(START, MAX, POL, EXT)

@@
expression NLH, HDRLEN, MAX, POL, EXT;
@@
-nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
+nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)

For this patch, don't actually add the strict, non-renamed versions
yet so that it breaks compile if I get it wrong.

Also, while at it, make nla_validate and nla_parse go down to a
common __nla_validate_parse() function to avoid code duplication.

Ultimately, this allows us to have very strict validation for every
new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
next patch, while existing things will continue to work as is.

In effect then, this adds fully strict validation for any new command.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2019-04-28 05:07:21 +0800
ae0be8de9 netlink: make nla_nest_start() add NLA_F_NESTED flag ... Browse Code »

Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
netlink based interfaces (including recently added ones) are still not
setting it in kernel generated messages. Without the flag, message parsers
not aware of attribute semantics (e.g. wireshark dissector or libmnl's
mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
the structure of their contents.

Unfortunately we cannot just add the flag everywhere as there may be
userspace applications which check nlattr::nla_type directly rather than
through a helper masking out the flags. Therefore the patch renames
nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
are rewritten to use nla_nest_start().

Except for changes in include/net/netlink.h, the patch was generated using
this semantic patch:

@@ expression E1, E2; @@
-nla_nest_start(E1, E2)
+nla_nest_start_noflag(E1, E2)

@@ expression E1, E2; @@
-nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
+nla_nest_start(E1, E2)

Signed-off-by: Michal Kubecek
Acked-by: Jiri Pirko
Acked-by: David Ahern
Signed-off-by: David S. Miller

Michal Kubecek
2019-04-28 05:03:44 +0800

22 Mar, 2019

2 commits

ee3bbfe80 net/sched: let actions use RCU to access 'goto_chain' ... Browse Code »

use RCU when accessing the action chain, to avoid use after free in the
traffic path when 'goto chain' is replaced on existing TC actions (see
script below). Since the control action is read in the traffic path
without holding the action spinlock, we need to explicitly ensure that
a->goto_chain is not NULL before dereferencing (i.e it's not sufficient
to rely on the value of TC_ACT_GOTO_CHAIN bits). Not doing so caused NULL
dereferences in tcf_action_goto_chain_exec() when the following script:

# tc chain add dev dd0 chain 42 ingress protocol ip flower \
> ip_proto udp action pass index 4
# tc filter add dev dd0 ingress protocol ip flower \
> ip_proto udp action csum udp goto chain 42 index 66
# tc chain del dev dd0 chain 42 ingress
(start UDP traffic towards dd0)
# tc action replace action csum udp pass index 66

was run repeatedly for several hours.

Suggested-by: Cong Wang
Suggested-by: Vlad Buslov
Signed-off-by: Davide Caratti
Signed-off-by: David S. Miller

Davide Caratti
2019-03-22 04:26:42 +0800
85d0966fa net/sched: prepare TC actions to properly validate the control action ... Browse Code »

- pass a pointer to struct tcf_proto in each actions's init() handler,
to allow validating the control action, checking whether the chain
exists and (eventually) refcounting it.
- remove code that validates the control action after a successful call
to the action's init() handler, and replace it with a test that forbids
addition of actions having 'goto_chain' and NULL goto_chain pointer at
the same time.
- add tcf_action_check_ctrlact(), that will validate the control action
and eventually allocate the action 'goto_chain' within the init()
handler.
- add tcf_action_set_ctrlact(), that will assign the control action and
swap the current 'goto_chain' pointer with the new given one.

This disallows 'goto_chain' on actions that don't initialize it properly
in their init() handler, i.e. calling tcf_action_check_ctrlact() after
successful IDR reservation and then calling tcf_action_set_ctrlact()
to assign 'goto_chain' and 'tcf_action' consistently.

By doing this, the kernel does not leak anymore refcounts when a valid
'goto chain' handle is replaced in TC actions, causing kmemleak splats
like the following one:

# tc chain add dev dd0 chain 42 ingress protocol ip flower \
> ip_proto tcp action drop
# tc chain add dev dd0 chain 43 ingress protocol ip flower \
> ip_proto udp action drop
# tc filter add dev dd0 ingress matchall \
> action gact goto chain 42 index 66
# tc filter replace dev dd0 ingress matchall \
> action gact goto chain 43 index 66
# echo scan >/sys/kernel/debug/kmemleak

unreferenced object 0xffff93c0ee09f000 (size 1024):
comm "tc", pid 2565, jiffies 4295339808 (age 65.426s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 08 00 06 00 00 00 00 00 00 00 00 00 ................
backtrace:
[] tc_ctl_chain+0x3d2/0x4c0
[] rtnetlink_rcv_msg+0x263/0x2d0
[] netlink_rcv_skb+0x4a/0x110
[] netlink_unicast+0x1a0/0x250
[] netlink_sendmsg+0x2c1/0x3c0
[] sock_sendmsg+0x36/0x40
[] ___sys_sendmsg+0x280/0x2f0
[] __sys_sendmsg+0x5e/0xa0
[] do_syscall_64+0x5b/0x180
[] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[] 0xffffffffffffffff

Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain")
Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values")
Signed-off-by: Davide Caratti
Signed-off-by: David S. Miller

Davide Caratti
2019-03-22 04:26:41 +0800

11 Feb, 2019

1 commit

eddd2cf19 net: Change TCA_ACT_* to TCA_ID_* to match that of TCA_ID_POLICE ... Browse Code »

Modify the kernel users of the TCA_ACT_* macros to use TCA_ID_*. For
example, use TCA_ID_GACT instead of TCA_ACT_GACT. This will align with
TCA_ID_POLICE and also differentiates these identifier, used in struct
tc_action_ops type field, from other macros starting with TCA_ACT_.

To make things clearer, we name the enum defining the TCA_ID_*
identifiers and also change the "type" field of struct tc_action to
id.

Signed-off-by: Eli Cohen
Acked-by: Jiri Pirko
Signed-off-by: David S. Miller

Eli Cohen
2019-02-11 01:28:43 +0800

11 Dec, 2018

1 commit

69bd48404 net/sched: Remove egdev mechanism ... Browse Code »

The egdev mechanism was replaced by the TC indirect block notifications
platform.

Signed-off-by: Oz Shlomo
Reviewed-by: Eli Britstein
Reviewed-by: Jiri Pirko
Cc: John Hurley
Cc: Jakub Kicinski
Signed-off-by: Saeed Mahameed

Oz Shlomo
2018-12-11 07:54:34 +0800

09 Oct, 2018

1 commit

dac9c9790 net: Add extack to nlmsg_parse ... Browse Code »

Make sure extack is passed to nlmsg_parse where easy to do so.
Most of these are dump handlers and leveraging the extack in
the netlink_callback.

Signed-off-by: David Ahern
Acked-by: Christian Brauner
Signed-off-by: David S. Miller

David Ahern
2018-10-09 01:39:04 +0800

05 Oct, 2018

1 commit

95278ddaa net_sched: convert idrinfo->lock from spinlock to a mutex ... Browse Code »

In commit ec3ed293e766 ("net_sched: change tcf_del_walker() to take idrinfo->lock")
we move fl_hw_destroy_tmplt() to a workqueue to avoid blocking
with the spinlock held. Unfortunately, this causes a lot of
troubles here:

1. tcf_chain_destroy() could be called right after we queue the work
but before the work runs. This is a use-after-free.

2. The chain refcnt is already 0, we can't even just hold it again.
We can check refcnt==1 but it is ugly.

3. The chain with refcnt 0 is still visible in its block, which means
it could be still found and used!

4. The block has a refcnt too, we can't hold it without introducing a
proper API either.

We can make it working but the end result is ugly. Instead of wasting
time on reviewing it, let's just convert the troubling spinlock to
a mutex, which allows us to use non-atomic allocations too.

Fixes: ec3ed293e766 ("net_sched: change tcf_del_walker() to take idrinfo->lock")
Reported-by: Ido Schimmel
Cc: Jamal Hadi Salim
Cc: Vlad Buslov
Cc: Jiri Pirko
Signed-off-by: Cong Wang
Tested-by: Ido Schimmel
Signed-off-by: David S. Miller

Cong Wang
2018-10-05 15:36:31 +0800

25 Sep, 2018

1 commit

28169abad net/sched: Add hardware specific counters to TC actions ... Browse Code »

Add additional counters that will store the bytes/packets processed by
hardware. These will be exported through the netlink interface for
displaying by the iproute2 tc tool

Signed-off-by: Eelco Chaudron
Signed-off-by: David S. Miller

Eelco Chaudron
2018-09-25 03:18:42 +0800