Eric Lee / smarc-fsl-linux-kernel

07 Mar, 2019

1 commit

5b4f21b2a bpf: hbm: fix spelling mistake "deault" -> "default"

There are a couple of typos, fix these.

Signed-off-by: Colin Ian King
Acked-by: Song Liu
Signed-off-by: Daniel Borkmann

Colin Ian King
2019-03-07 17:35:00 +0800

03 Mar, 2019

3 commits

4ffd44cfd bpf: HBM test script

Script for testing HBM (Host Bandwidth Manager) framework.
It creates a cgroup to use for testing and loads a BPF program to limit
egress bandwidth. It then uses iperf3 or netperf to create
loads. The output is the goodput in Mbps (unless -D is used).

It can work on a single host using loopback or between two hosts (with netperf).
When using loopback, it is recommended to also introduce a delay of at least
1ms (-d=1), otherwise the assigned bandwidth is likely to be underutilized.

USAGE: $name [out] [-b=|--bpf=] [-c=|--cc=] [-D]
[-d=|--delay=] [--debug] [-E]
[-f=|--flows=] [-h] [-i=|--id=] [-l]
[-N] [-p=|--port=] [-P] [-q=]
[-R] [-s=|--server=|--time=] [-w] [cubic|dctcp]
Where:
out Egress (default egress)
-b or --bpf BPF program filename to load and attach.
Default is nrm_out_kern.o for egress,
-c or --cc TCP congestion control (cubic or dctcp)
-d or --delay Add a delay in ms using netem
-D In addition to the goodput in Mbps, it also outputs
other detailed information. This information is
test dependent (i.e. iperf3 or netperf).
--debug Print BPF trace buffer
-E Enable ECN (not required for dctcp)
-f or --flows Number of concurrent flows (default=1)
-i or --id cgroup id (an integer, default is 1)
-l Do not limit flows using loopback
-N Use netperf instead of iperf3
-h Help
-p or --port iperf3 port (default is 5201)
-P Use an iperf3 instance for each flow
-q Use the specified qdisc.
-r or --rate Rate in Mbps (default is 1Gbps)
-R Use TCP_RR for netperf. 1st flow has req
size of 10KB, rest of 1MB. Reply in all
cases is 1 byte.
More detailed output for each flow can be found
in the files netperf.., where is the
cgroup id as specified with the -i flag, and
is the flow id starting at 1 and increasing by 1 for
each flow (as specified by -f).
-s or --server hostname of netperf server. Used to create netperf
test traffic between two hosts (default is within host)
netserver must be running on the host.
--stats Get HBM stats (marked, dropped, etc.)
-t or --time duration of iperf3 in seconds (default=5)
-w Work conserving flag. cgroup can increase its
bandwidth beyond the rate limit specified
while there is available bandwidth. Current
implementation assumes there is only one NIC
(eth0), but can be extended to support multiple
NICs. This is just a proof of concept.
cubic or dctcp specify TCP CC to use

Examples:
./do_hbm_test.sh -l -d=1 -D --stats
Runs a 5 second test, using a single iperf3 flow and with the default
rate limit of 1Gbps and a delay of 1ms (using netem) using the default
TCP congestion control on the loopback device (hence we use "-l" to
enforce bandwidth limit on loopback device). Since no direction is
specified, it defaults to egress. Since no TCP CC algorithm is
specified it uses the system default (Cubic for this test).
With no -D flag, only the value of the AGGREGATE OUTPUT would show.
id refers to the cgroup id and is useful when running multi cgroup
tests (supported by a future patch).
This patchset does not support calling TCP's congestion window
reduction, even when packets are dropped by the BPF program, resulting
in a large number of packets dropped. It is recommended that the current
HBM implementation only be used with ECN enabled flows. A future patch
will add support for reducing TCP's cwnd and will increase the
performance of non-ECN enabled flows.
Output:
Details for HBM in cgroup 1
id:1
rate_mbps:493
duration:4.8 secs
packets:11355
bytes_MB:590
pkts_dropped:4497
bytes_dropped_MB:292
pkts_marked_percent: 39.60
bytes_marked_percent: 49.49
pkts_dropped_percent: 39.60
bytes_dropped_percent: 49.49
PING AVG DELAY:2.075
AGGREGATE_GOODPUT:505
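
The drop percentages in the output above can be cross-checked against the raw counters. The following is a minimal sketch in C, assuming (the commit message does not say so) that each percentage is simply the dropped count divided by the total; `percent_of` and `print_drop_percentages` are hypothetical names used only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: share of `part` in `total`, as a percentage.
 * Returns 0 when total is 0 to avoid dividing by zero. */
static double percent_of(double part, double total)
{
	return total > 0.0 ? 100.0 * part / total : 0.0;
}

/* Recompute the drop percentages from the counters printed above
 * (pkts_dropped/packets and bytes_dropped_MB/bytes_MB). */
static void print_drop_percentages(void)
{
	printf("pkts_dropped_percent:  %.2f\n", percent_of(4497, 11355));
	printf("bytes_dropped_percent: %.2f\n", percent_of(292, 590));
}
```

Under that assumption the two values come out as 39.60 and 49.49, which agrees with the reported pkts_dropped_percent and bytes_dropped_percent.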

./do_nrm_test.sh -l -d=1 -D --stats dctcp
Same as above but using dctcp. Note that fewer bytes are dropped
(0.01% vs. 49%).
Output:
Details for HBM in cgroup 1
id:1
rate_mbps:945
duration:4.9 secs
packets:16859
bytes_MB:578
pkts_dropped:1
bytes_dropped_MB:0
pkts_marked_percent: 28.74
bytes_marked_percent: 45.15
pkts_dropped_percent: 0.01
bytes_dropped_percent: 0.01
PING AVG DELAY:2.083
AGGREGATE_GOODPUT:965

./do_nrm_test.sh -d=1 -D --stats
As first example, but without limiting loopback device (i.e. no
"-l" flag). Since there is no bandwidth limiting, no details for
HBM are printed out.
Output:
Details for HBM in cgroup 1
PING AVG DELAY:2.019
AGGREGATE_GOODPUT:42655

./do_hbm.sh -l -d=1 -D --stats -f=2
Uses iperf3 and does 2 flows
./do_hbm.sh -l -d=1 -D --stats -f=4 -P
Uses iperf3 and does 4 flows, each flow as a separate process.
./do_hbm.sh -l -d=1 -D --stats -f=4 -N
Uses netperf, 4 flows
./do_hbm.sh -f=1 -r=2000 -t=5 -N -D --stats dctcp -s=
Uses netperf between two hosts. The remote host name is specified
with -s= and you need to start the program netserver manually on
the remote host. It will use 1 flow, a rate limit of 2Gbps and dctcp.
./do_hbm.sh -f=1 -r=2000 -t=5 -N -D --stats -w dctcp \
-s=
As previous, but allows use of extra bandwidth. For this test the
rate is 8Gbps vs. 1Gbps of the previous test.

Signed-off-by: Lawrence Brakmo
Signed-off-by: Alexei Starovoitov

brakmo
2019-03-03 02:48:27 +0800
a1270fe95 bpf: User program for testing HBM

The program nrm creates a cgroup and attaches a BPF program to the
cgroup for testing HBM (Host Bandwidth Manager) for egress traffic.
One still needs to create network traffic. This can be done through
netesto, netperf or iperf3.
A follow-up patch contains a script to create traffic.

USAGE: hbm [-d] [-l] [-n ] [-r ] [-s] [-t ]
[-w] [-h] [prog]
Where:
-d Print BPF trace debug buffer
-l Also limit flows doing loopback
-n To create cgroup "/hbm#" and attach prog. Default is /nrm1
This is convenient when testing HBM in more than 1 cgroup
-r Rate limit in Mbps
-s Get HBM stats (marked, dropped, etc.)
-t Exit after specified seconds (default is 0)
-w Work conserving flag. cgroup can increase its bandwidth
beyond the rate limit specified while there is available
bandwidth. Current implementation assumes there is only
one NIC (eth0), but can be extended to support multiple NICs.
Currently only supported for egress. Note, this is just
a proof of concept.
-h Print this info
prog BPF program file name. Name defaults to hbm_out_kern.o

More information about HBM can be found in the paper "BPF Host Resource
Management" presented at the 2018 Linux Plumbers Conference, Networking Track
(http://vger.kernel.org/lpc_net2018_talks/LPC%20BPF%20Network%20Resource%20Paper.pdf)

Signed-off-by: Lawrence Brakmo
Signed-off-by: Alexei Starovoitov

brakmo
2019-03-03 02:48:27 +0800
187d0738f bpf: Sample HBM BPF program to limit egress bw

A cgroup skb BPF program to limit cgroup output bandwidth.
It uses a modified virtual token bucket queue to limit average
egress bandwidth. The implementation uses credits instead of tokens.
Negative credits imply that queueing would have happened (this is
a virtual queue, so no queueing is done by it; however, queueing may
occur at the actual qdisc, which is not used for rate limiting).

This implementation uses 3 thresholds, one to start marking packets and
the other two to drop packets:
                            CREDIT
  - ------------------------------------------------------ +
        |               |               |             0
        |           Large pkt           |
        |           drop thresh         |
  Small pkt drop                   Mark threshold
      thresh

The effect of marking depends on the type of packet:
a) If the packet is ECN enabled, then the packet is ECN CE marked.
The current mark threshold is tuned for DCTCP.
b) Else, it is dropped if it is a large packet.

If the credit is below the drop threshold, the packet is dropped.
Note that dropping a packet through the BPF program does not trigger CWR
(Congestion Window Reduction) in TCP packets. A future patch will add
support for triggering CWR.

This BPF program actually uses 2 drop thresholds, one threshold
for larger packets (>= 120 bytes) and another for smaller packets. This
protects smaller packets such as SYNs, ACKs, etc.
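
The credit scheme and the three thresholds described above can be modelled in user space. This is a sketch under stated assumptions, not the sample BPF program itself: every constant except the 120-byte size boundary is invented for illustration, the names (`struct vqueue`, `vqueue_packet`, `enum verdict`) are hypothetical, and refunding the credit of a dropped packet is a simplification chosen for this model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only (assumptions). Thresholds are negative:
 * credit below zero means the virtual queue would be holding data. */
#define MAX_CREDIT            (100 * 1000)   /* bytes, cap on saved credit */
#define MARK_THRESH           (-40 * 1000)
#define LARGE_PKT_DROP_THRESH (-80 * 1000)
#define SMALL_PKT_DROP_THRESH (-120 * 1000)
#define LARGE_PKT_BYTES       120            /* ">= 120 bytes" from the text */

enum verdict { PASS, MARK_CE, DROP };

struct vqueue {
	int64_t  credit;              /* bytes; may go negative */
	uint64_t last_ns;             /* time of the previous packet */
	uint64_t rate_bytes_per_sec;  /* configured rate limit */
};

static enum verdict vqueue_packet(struct vqueue *q, uint64_t now_ns,
				  uint32_t len, bool ecn_capable)
{
	bool large = len >= LARGE_PKT_BYTES;
	enum verdict v = PASS;

	/* Earn credit for the time elapsed since the last packet, capped.
	 * The product fits in 64 bits for gaps of about a second at Gbps rates. */
	q->credit += (int64_t)((now_ns - q->last_ns) * q->rate_bytes_per_sec
			       / 1000000000ULL);
	if (q->credit > MAX_CREDIT)
		q->credit = MAX_CREDIT;
	q->last_ns = now_ns;

	/* Charge the packet against the credit. */
	q->credit -= (int64_t)len;

	if (q->credit < SMALL_PKT_DROP_THRESH)
		v = DROP;                         /* everything is dropped */
	else if (large && q->credit < LARGE_PKT_DROP_THRESH)
		v = DROP;                         /* only large packets are dropped */
	else if (q->credit < MARK_THRESH)
		v = ecn_capable ? MARK_CE : (large ? DROP : PASS);

	/* Assumption of this sketch: a dropped packet is refunded. */
	if (v == DROP)
		q->credit += (int64_t)len;
	return v;
}
```

Because small packets only hit the lowest threshold, SYNs and ACKs keep passing in this model while full-size segments are already being marked or dropped.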

The default bandwidth limit is set at 1Gbps but this can be changed by
a user program through a shared BPF map. In addition, by default this BPF
program does not limit connections using loopback. This behavior can be
overridden by the user program. There is also an option to calculate
some statistics, such as percent of packets marked or dropped, which
the user program can access.

A later patch provides such a program (hbm.c).

Signed-off-by: Lawrence Brakmo
Signed-off-by: Alexei Starovoitov

brakmo
2019-03-03 02:48:27 +0800

02 Mar, 2019

1 commit

b74e21ab7 samples/bpf: silence compiler warning for xdpsock_user.c

Compiling xdpsock_user.c with 4.8.5, I hit the following
compilation warning:
HOSTCC samples/bpf/xdpsock_user.o
/data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c: In function ‘main’:
/data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c:449:6: warning: ‘idx_cq’ may be used uninitialized in this function [-Wmaybe-uninitialized]
u32 idx_cq, idx_fq;
^
/data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c:606:7: warning: ‘idx_rx’ may be used uninitialized in this function [-Wmaybe-uninitialized]
u32 idx_rx, idx_tx = 0;
^
/data/users/yhs/work/net-next/samples/bpf/xdpsock_user.c:506:6: warning: ‘idx_rx’ may be used uninitialized in this function [-Wmaybe-uninitialized]
u32 idx_rx, idx_fq = 0;

As an example, the code pattern looks like:
u32 idx_fq;
...
ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq);
if (ret) {
...
}
... idx_fq ...
The compiler warns since it does not know whether &idx_fq is assigned
or not inside the library function xsk_ring_prod__reserve().

Let us assign an initial value 0 to such auto variables to silence
compiler warning.
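
The pattern and its fix can be sketched in isolation. In the sketch below, `struct ring`, `ring_reserve` and `first_reserved_slot` are hypothetical stand-ins (they are not the libbpf API); the only point is that an out-parameter written on the success path alone is given a defined initial value by the caller.

```c
#include <stdint.h>

/* Hypothetical ring: `next` is the next free slot index. */
struct ring {
	uint32_t next;
	uint32_t free_slots;
};

/* Stand-in for a library call that reserves `nb` slots and writes the
 * first slot index through `idx` only when the reservation succeeds. */
static uint32_t ring_reserve(struct ring *r, uint32_t nb, uint32_t *idx)
{
	if (nb > r->free_slots)
		return 0;             /* failure: *idx is left untouched */
	*idx = r->next;
	r->next += nb;
	r->free_slots -= nb;
	return nb;
}

/* Returns the first reserved index, or UINT32_MAX on failure. */
static uint32_t first_reserved_slot(struct ring *r, uint32_t nb)
{
	uint32_t idx = 0;             /* initialised, so no path reads garbage */

	if (ring_reserve(r, nb, &idx) != nb)
		return UINT32_MAX;
	return idx;
}
```

When the callee lives in a separate library the compiler cannot see that `idx` is always written before use, so the explicit `= 0` is what silences the warning.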

Fixes: 248c7f9c0e21 ("samples/bpf: convert xdpsock to use libbpf for AF_XDP access")
Signed-off-by: Yonghong Song
Acked-by: Jonathan Lemon
Acked-by: Song Liu
Signed-off-by: Daniel Borkmann

Yonghong Song
2019-03-02 08:07:10 +0800

01 Mar, 2019

3 commits

1a9b268c9 samples: bpf: use libbpf where easy

Some samples don't really need the magic of bpf_load,
switch them to libbpf.

v2: - specify program types.

Signed-off-by: Jakub Kicinski
Reviewed-by: Quentin Monnet
Acked-by: Andrii Nakryiko
Signed-off-by: Daniel Borkmann

Jakub Kicinski
2019-03-01 07:53:45 +0800
ea9b63620 samples: bpf: remove load_sock_ops in favour of bpftool

bpftool can do all the things load_sock_ops used to do, and more.
Point users to bpftool instead of maintaining this sample utility.

Signed-off-by: Jakub Kicinski
Reviewed-by: Quentin Monnet
Acked-by: Andrii Nakryiko
Signed-off-by: Daniel Borkmann

Jakub Kicinski
2019-03-01 07:53:45 +0800
5c3cf87d4 samples: bpf: force IPv4 in ping

ping localhost may default to IPv6 on modern systems, but
samples are trying to only parse IPv4. Force IPv4.

samples/bpf/tracex1_user.c doesn't interpret the packet so
we don't care which IP version will be used there.

Signed-off-by: Jakub Kicinski
Reviewed-by: Quentin Monnet
Acked-by: Andrii Nakryiko
Signed-off-by: Daniel Borkmann

Jakub Kicinski
2019-03-01 07:53:45 +0800

28 Feb, 2019

1 commit

d2e614cb0 samples: bpf: fix: broken sample regarding removed function

Currently, running the samples "task_fd_query" and "tracex3" produces the
following error. On kernel v5.0-rc* these samples are unavailable
due to the removal of the function 'blk_start_request' in commit "a1ce35f"
(the function was removed because the "Single Queue IO scheduler" no longer exists).

$ sudo ./task_fd_query
failed to create kprobe 'blk_start_request' error 'No such file or
directory'

This commit will change the function 'blk_start_request' to
'blk_mq_start_request' to fix the broken sample.

Signed-off-by: Daniel T. Lee
Signed-off-by: Daniel Borkmann

Daniel T. Lee
2019-02-28 00:27:22 +0800

26 Feb, 2019

1 commit

248c7f9c0 samples/bpf: convert xdpsock to use libbpf for AF_XDP access

This commit converts the xdpsock sample application to use the AF_XDP
functions present in libbpf. This cuts down the size of it by nearly
300 lines of code.

The default ring sizes and the batch size have been increased and the
size of the umem area has been decreased, so that the sample application
will provide higher throughput. Note also that the shared umem code
has been removed from the sample as this is not supported by libbpf
at this point in time.

Tested-by: Björn Töpel
Signed-off-by: Magnus Karlsson
Signed-off-by: Daniel Borkmann

Magnus Karlsson
2019-02-26 06:21:42 +0800

22 Feb, 2019

1 commit

915654fd7 samples/bpf: Fix dummy program unloading for xdp_redirect samples

The xdp_redirect and xdp_redirect_map sample programs both load a dummy
program onto the egress interfaces. However, the unload code checks these
programs against the wrong id number, and thus refuses to unload them. Fix
the comparison to avoid this.

Fixes: 3b7a8ec2dec3 ("samples/bpf: Check the prog id before exiting")
Signed-off-by: Toke Høiland-Jørgensen
Acked-by: Maciej Fijalkowski
Acked-by: Martin KaFai Lau
Signed-off-by: Daniel Borkmann

Toke Høiland-Jørgensen
2019-02-22 23:21:59 +0800

02 Feb, 2019

5 commits

3b7a8ec2d samples/bpf: Check the prog id before exiting

Check the program id within the signal handler on polling xdp samples
that were previously converted to libbpf usage. Avoid the situation of
unloading a program that was not attached by the sample that is exiting.
Handle also the case where bpf_get_link_xdp_id didn't exit with an error
but the xdp program was not found on an interface.
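
The decision made on exit can be written as a small pure function. This is only a sketch of the logic described above: `enum exit_action` and `on_exit_action` are hypothetical names, and the current program id is assumed to have been queried beforehand (for instance with bpf_get_link_xdp_id), with 0 meaning that no program was found on the interface.

```c
#include <stdint.h>

enum exit_action {
	DETACH_OURS,        /* the interface still runs the program we attached */
	LEAVE_FOREIGN,      /* someone else's program is attached; do not touch */
	NOTHING_ATTACHED    /* query succeeded but no program is on the interface */
};

/* our_id: id of the program this sample attached.
 * cur_id: id currently reported for the interface (0 = none, an assumption
 *         of this sketch). */
static enum exit_action on_exit_action(uint32_t our_id, uint32_t cur_id)
{
	if (cur_id == 0)
		return NOTHING_ATTACHED;
	if (cur_id == our_id)
		return DETACH_OURS;
	return LEAVE_FOREIGN;
}
```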

Reported-by: Michal Papaj
Reported-by: Jakub Spizewski
Signed-off-by: Maciej Fijalkowski
Reviewed-by: Jakub Kicinski
Signed-off-by: Daniel Borkmann

Maciej Fijalkowski
2019-02-02 06:37:51 +0800
743e568c1 samples/bpf: Add a "force" flag to XDP samples

Make xdp samples consistent with iproute2 behavior and set the
XDP_FLAGS_UPDATE_IF_NOEXIST by default when setting the xdp program on
interface. Provide an option for user to force the program loading,
which as a result will not include the mentioned flag in
bpf_set_link_xdp_fd call.
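
A minimal sketch of how the flag word could be derived from a "force" option. The constant mirrors XDP_FLAGS_UPDATE_IF_NOEXIST from <linux/if_link.h> but is redefined under a local, hypothetical name so that the sketch builds without kernel headers; `xdp_attach_flags` is likewise an illustrative name, not a function from the samples.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit as XDP_FLAGS_UPDATE_IF_NOEXIST in <linux/if_link.h>;
 * local name used only for this sketch. */
#define SKETCH_XDP_FLAGS_UPDATE_IF_NOEXIST (1U << 0)

/* Default: refuse to replace a program that is already attached.
 * With force: clear the flag so an existing program gets replaced. */
static uint32_t xdp_attach_flags(bool force)
{
	uint32_t flags = SKETCH_XDP_FLAGS_UPDATE_IF_NOEXIST;

	if (force)
		flags &= ~SKETCH_XDP_FLAGS_UPDATE_IF_NOEXIST;
	return flags;
}
```

The resulting value is what would be passed as the flags argument when attaching the program to the interface.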

Signed-off-by: Maciej Fijalkowski
Reviewed-by: Jakub Kicinski
Acked-by: John Fastabend
Signed-off-by: Daniel Borkmann

Maciej Fijalkowski
2019-02-02 06:37:51 +0800
6a5457618 samples/bpf: Extend RLIMIT_MEMLOCK for xdp_{sample_pkts, router_ipv4}

There is a common problem with xdp samples that happens when user wants
to run a particular sample and some bpf program is already loaded. The
default 64kb RLIMIT_MEMLOCK resource limit will cause the following error
(assuming that xdp sample that is failing was converted to libbpf
usage):

libbpf: Error in bpf_object__probe_name():Operation not permitted(1).
Couldn't load basic 'r0 = 0' BPF program.
libbpf: failed to load object './xdp_sample_pkts_kern.o'

Fix it in xdp_sample_pkts and xdp_router_ipv4 by setting RLIMIT_MEMLOCK
to RLIM_INFINITY.
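
Raising the limit is a single setrlimit() call. The helper name `raise_memlock_limit` is hypothetical; note that raising the hard limit normally needs root (CAP_SYS_RESOURCE), so the call can fail for an unprivileged user and the return value should be checked.

```c
#include <sys/resource.h>

/* Set both the soft and hard RLIMIT_MEMLOCK to "unlimited".
 * Returns 0 on success, -1 on failure with errno set by setrlimit(). */
static int raise_memlock_limit(void)
{
	struct rlimit r = { RLIM_INFINITY, RLIM_INFINITY };

	return setrlimit(RLIMIT_MEMLOCK, &r);
}
```

A sample would typically call this once at start-up, before loading any BPF object, and bail out with perror() if it returns -1.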

Signed-off-by: Maciej Fijalkowski
Reviewed-by: Jakub Kicinski
Acked-by: Jesper Dangaard Brouer
Acked-by: John Fastabend
Signed-off-by: Daniel Borkmann

Maciej Fijalkowski
2019-02-02 06:37:51 +0800
bbaf6029c samples/bpf: Convert XDP samples to libbpf usage

Some of XDP samples that are attaching the bpf program to the interface
via libbpf's bpf_set_link_xdp_fd are still using the bpf_load.c for
loading and manipulating the ebpf program and maps. Convert them to do
this through libbpf usage and remove bpf_load from the picture.

While at it remove what looks like debug leftover in
xdp_redirect_map_user.c

In xdp_redirect_cpu, change the way that the program to be loaded onto
interface is chosen - user now needs to pass the program's section name
instead of the relative number. In case of typo print out the section
names to choose from.

Signed-off-by: Maciej Fijalkowski
Reviewed-by: Jakub Kicinski
Acked-by: Jesper Dangaard Brouer
Signed-off-by: Daniel Borkmann

Maciej Fijalkowski
2019-02-02 06:37:51 +0800
7313798b1 samples/bpf: xdp_redirect_cpu have not need for read_trace_pipe

The sample xdp_redirect_cpu is not using helper bpf_trace_printk.
Thus it makes no sense that the --debug option is reading
from /sys/kernel/debug/tracing/trace_pipe via read_trace_pipe.
Simply remove it.

Signed-off-by: Jesper Dangaard Brouer
Acked-by: John Fastabend
Signed-off-by: Daniel Borkmann

Jesper Dangaard Brouer
2019-02-02 06:37:51 +0800

29 Jan, 2019

1 commit

ec7146db1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2019-01-29

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Teach verifier dead code removal, this also allows for optimizing /
removing conditional branches around dead code and to shrink the
resulting image. Code store constrained architectures like nfp would
have a hard time doing this at JIT level, from Jakub.

2) Add JMP32 instructions to BPF ISA in order to allow for optimizing
code generation for 32-bit sub-registers. Evaluation shows that this
can result in code reduction of ~5-20% compared to 64 bit-only code
generation. Also add implementation for most JITs, from Jiong.

3) Add support for __int128 types in BTF which is also needed for
vmlinux's BTF conversion to work, from Yonghong.

4) Add a new command to bpftool in order to dump a list of BPF-related
parameters from the system or for a specific network device e.g. in
terms of available prog/map types or helper functions, from Quentin.

5) Add AF_XDP sock_diag interface for querying sockets from user
space which provides information about the RX/TX/fill/completion
rings, umem, memory usage etc, from Björn.

6) Add skb context access for skb_shared_info->gso_segs field, from Eric.

7) Add support for testing flow dissector BPF programs by extending
existing BPF_PROG_TEST_RUN infrastructure, from Stanislav.

8) Split BPF kselftest's test_verifier into various subgroups of tests
in order to better deal with merge conflicts in this area, from Jakub.

9) Add support for queue/stack manipulations in bpftool, from Stanislav.

10) Document BTF, from Yonghong.

11) Dump supported ELF section names in libbpf on program load
failure, from Taeung.

12) Silence a false positive compiler warning in verifier's BTF
handling, from Peter.

13) Fix help string in bpftool's feature probing, from Prashant.

14) Remove duplicate includes in BPF kselftests, from Yue.
====================

Signed-off-by: David S. Miller

David S. Miller
2019-01-29 11:38:33 +0800

27 Jan, 2019

1 commit

6ea848b5c selftests: bpf: functional and min/max reasoning unit tests for JMP32

This patch adds unit tests for new JMP32 instructions.

This patch also added the new BPF_JMP32_REG and BPF_JMP32_IMM macros to
samples/bpf/bpf_insn.h so that JMP32 insn builders are available to tests
under 'samples' directory.

Reviewed-by: Jakub Kicinski
Signed-off-by: Jiong Wang
Signed-off-by: Alexei Starovoitov

Jiong Wang
2019-01-27 05:33:02 +0800

16 Jan, 2019

1 commit

6bf3bbe1f samples/bpf: workaround clang asm goto compilation errors

x86 compilation has required asm goto support since 4.17.
Since clang does not support asm goto, at 4.17,
Commit b1ae32dbab50 ("x86/cpufeature: Guard asm_volatile_goto usage
for BPF compilation") worked around the issue by permitting an
alternative implementation without asm goto for clang.

At 5.0, more asm goto usages appeared.
[yhs@148 x86]$ egrep -r asm_volatile_goto
include/asm/cpufeature.h: asm_volatile_goto("1: jmp 6f\n"
include/asm/jump_label.h: asm_volatile_goto("1:"
include/asm/jump_label.h: asm_volatile_goto("1:"
include/asm/rmwcc.h: asm_volatile_goto (fullop "; j" #cc " %l[cc_label]" \
include/asm/uaccess.h: asm_volatile_goto("\n" \
include/asm/uaccess.h: asm_volatile_goto("\n" \
[yhs@148 x86]$

Compiling samples/bpf directories, most bpf programs failed
compilation with error messages like:
In file included from /home/yhs/work/bpf-next/samples/bpf/xdp_sample_pkts_kern.c:2:
In file included from /home/yhs/work/bpf-next/include/linux/ptrace.h:6:
In file included from /home/yhs/work/bpf-next/include/linux/sched.h:15:
In file included from /home/yhs/work/bpf-next/include/linux/sem.h:5:
In file included from /home/yhs/work/bpf-next/include/uapi/linux/sem.h:5:
In file included from /home/yhs/work/bpf-next/include/linux/ipc.h:9:
In file included from /home/yhs/work/bpf-next/include/linux/refcount.h:72:
/home/yhs/work/bpf-next/arch/x86/include/asm/refcount.h:70:9: error: 'asm goto' constructs are not supported yet
return GEN_BINARY_SUFFIXED_RMWcc(LOCK_PREFIX "subl",
^
/home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:67:2: note: expanded from macro 'GEN_BINARY_SUFFIXED_RMWcc'
__GEN_RMWcc(op " %[val], %[var]\n\t" suffix, var, cc, \
^
/home/yhs/work/bpf-next/arch/x86/include/asm/rmwcc.h:21:2: note: expanded from macro '__GEN_RMWcc'
asm_volatile_goto (fullop "; j" #cc " %l[cc_label]" \
^
/home/yhs/work/bpf-next/include/linux/compiler_types.h:188:37: note: expanded from macro 'asm_volatile_goto'
#define asm_volatile_goto(x...) asm goto(x)

Most call sites do not even provide an alternative
implementation, and it is not practical to make changes
for each call site.

This patch works around the asm goto issue by redefining the macro as below:
#define asm_volatile_goto(x...) asm volatile("invalid use of asm_volatile_goto")

If asm_volatile_goto is not used by bpf programs, which is typically the case, nothing bad
will happen. If asm_volatile_goto is used by bpf programs, which is incorrect, the compiler
will issue an error since "invalid use of asm_volatile_goto" is not valid assembly code.

With this patch, all bpf programs under samples/bpf can pass compilation.

Note that bpf programs under tools/testing/selftests/bpf/ compiled fine as
they do not access kernel internal headers.

Fixes: e769742d3584 ("Revert "x86/jump-labels: Macrofy inline assembly code to work around GCC inlining bugs"")
Fixes: 18fe58229d80 ("x86, asm: change the GEN_*_RMWcc() macros to not quote the condition")
Acked-by: Alexei Starovoitov
Signed-off-by: Yonghong Song
Signed-off-by: Daniel Borkmann

Yonghong Song
2019-01-16 03:57:30 +0800

10 Jan, 2019

1 commit

11b36abc2 samples: bpf: user proper argument index

Use optind as the index into argv instead of a hardcoded value.
When the program is given options, the hardcoded index leads to improper
parameter handling.

Fixes: dc378a1ab5b6 ("samples: bpf: get ifindex from ifname")
Signed-off-by: Ioana Ciornei
Acked-by: Matteo Croce
Signed-off-by: Daniel Borkmann

Ioana Ciornei
2019-01-10 22:54:47 +0800

08 Jan, 2019

1 commit

a8911d6d5 selftests/bpf: fix incorrect users of create_and_get_cgroup

We have some tests that assume create_and_get_cgroup returns -1 on error
which is incorrect (it returns 0 on error). Since fd might be zero in
general case, change create_and_get_cgroup to return -1 on error
and fix the users that assume 0 on error.
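
Why -1 rather than 0 is the right error value can be shown with any function that returns a file descriptor. The sketch below uses hypothetical helpers (`open_dir_fd`, `dir_is_usable`) built on plain open(2); it is not the selftest code.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns a descriptor >= 0 on success, -1 on error. */
static int open_dir_fd(const char *path)
{
	int fd = open(path, O_RDONLY);

	return fd < 0 ? -1 : fd;
}

/* Returns 1 if `path` could be opened, 0 otherwise. */
static int dir_is_usable(const char *path)
{
	int fd = open_dir_fd(path);

	if (fd < 0)       /* not `if (!fd)`: descriptor 0 is a valid result */
		return 0;
	close(fd);
	return 1;
}
```

A caller that tested `if (!fd)` would treat a legitimate descriptor 0 as a failure and would miss real failures reported as -1.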

Fixes: f269099a7e7a ("tools/bpf: add a selftest for bpf_get_current_cgroup_id() helper")
Fixes: 7d2c6cfc5411 ("bpf: use --cgroup in test_suite if supplied")

v2:
- instead of fixing the uses that assume -1 on error, convert the users
that assume 0 on error (fd might be zero in general case)

Signed-off-by: Stanislav Fomichev
Signed-off-by: Alexei Starovoitov

Stanislav Fomichev
2019-01-08 05:15:55 +0800

30 Dec, 2018

1 commit

668c35f69 Merge tag 'kbuild-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:
"Kbuild core:
- remove unneeded $(call cc-option,...) switches
- consolidate Clang compiler flags into CLANG_FLAGS
- announce the deprecation of SUBDIRS
- fix single target build for external module
- simplify the dependencies of 'prepare' stage targets
- allow fixdep to directly write to .*.cmd files
- simplify dependency generation for CONFIG_TRIM_UNUSED_KSYMS
- change if_changed_rule to accept multi-line recipe
- move .SECONDARY special target to scripts/Kbuild.include
- remove redundant 'set -e'
- improve parallel execution for CONFIG_HEADERS_CHECK
- misc cleanups

Treewide fixes and cleanups
- set Clang flags correctly for PowerPC boot images
- fix UML build error with CONFIG_GCC_PLUGINS
- remove unneeded patterns from .gitignore files
- refactor firmware/Makefile
- remove unneeded rules for *offsets.s
- avoid unneeded regeneration of intermediate .s files
- clean up ./Kbuild

Modpost:
- remove unused -M, -K options
- fix false positive warnings about section mismatch
- use simple devtable lookup instead of linker magic
- misc cleanups

Coccinelle:
- relax boolinit.cocci checks for overall consistency
- fix warning messages of boolinit.cocci

Other tools:
- improve -dirty check of scripts/setlocalversion
- add a tool to generate compile_commands.json from .*.cmd files"

* tag 'kbuild-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (51 commits)
kbuild: remove unused cmd_gentimeconst
kbuild: remove $(obj)/ prefixes in ./Kbuild
treewide: add intermediate .s files to targets
treewide: remove explicit rules for *offsets.s
firmware: refactor firmware/Makefile
firmware: remove unnecessary patterns from .gitignore
scripts: remove unnecessary ihex2fw and check-lc_ctypes from .gitignore
um: remove unused filechk_gen_header in Makefile
scripts: add a tool to produce a compile_commands.json file
kbuild: add -Werror=implicit-int flag unconditionally
kbuild: add -Werror=strict-prototypes flag unconditionally
kbuild: add -fno-PIE flag unconditionally
scripts: coccinelle: Correct warning message
scripts: coccinelle: only suggest true/false in files that already use them
kbuild: handle part-of-module correctly for *.ll and *.symtypes
kbuild: refactor part-of-module
kbuild: refactor quiet_modtag
kbuild: remove redundant quiet_modtag for $(obj-m)
kbuild: refactor Makefile.asm-generic
user/Makefile: Fix typo and capitalization in comment section
...

Linus Torvalds
2018-12-30 04:03:17 +0800

23 Dec, 2018

2 commits

2c667d77f treewide: add intermediate .s files to targets

Avoid unneeded recreation of these in the incremental build.

Signed-off-by: Masahiro Yamada

Masahiro Yamada
2018-12-23 09:12:08 +0800
4d4b5c2e3 treewide: remove explicit rules for *offsets.s

These explicit rules are unneeded because scripts/Makefile.build
provides a pattern rule to create %.s from %.c

Signed-off-by: Masahiro Yamada

Masahiro Yamada
2018-12-23 09:12:03 +0800

04 Dec, 2018

1 commit

d59dd69d5 samples: bpf: fix: seg fault with NULL pointer arg

When a NULL pointer is accidentally passed to write_kprobe_events,
a segmentation fault happens due to strlen(NULL).
The changed code returns -1 to deal with this situation.

Bug found with Smatch static analysis.

Signed-off-by: Daniel T. Lee
Acked-by: Song Liu
Signed-off-by: Daniel Borkmann

Daniel T. Lee
2018-12-04 06:58:03 +0800

01 Dec, 2018

2 commits

dc378a1ab samples: bpf: get ifindex from ifname

Find the ifindex with if_nametoindex() instead of requiring the
numeric ifindex.
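
A minimal sketch of the lookup. `resolve_ifindex` is a hypothetical wrapper; if_nametoindex() itself returns 0 when the interface does not exist, which the wrapper turns into -1 plus a diagnostic.

```c
#include <errno.h>
#include <net/if.h>
#include <stdio.h>
#include <string.h>

/* Look up `ifname` and store its index in *ifindex.
 * Returns 0 on success, -1 if no such interface exists. */
static int resolve_ifindex(const char *ifname, unsigned int *ifindex)
{
	unsigned int idx = if_nametoindex(ifname);

	if (idx == 0) {
		fprintf(stderr, "unknown interface '%s': %s\n",
			ifname, strerror(errno));
		return -1;
	}
	*ifindex = idx;
	return 0;
}
```

A sample can then be started with an interface name such as `eth0` instead of a number looked up by hand.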

Signed-off-by: Matteo Croce
Signed-off-by: Alexei Starovoitov

Matteo Croce
2018-12-01 14:06:41 +0800
d606ee5c1 samples: bpf: improve xdp1 example

Store only the total packet count for every protocol, instead of the
whole per-cpu array.
Use bpf_map_get_next_key() to iterate the map, instead of looking up
all the protocols.

Signed-off-by: Matteo Croce
Signed-off-by: Alexei Starovoitov

Matteo Croce
2018-12-01 14:06:41 +0800

24 Nov, 2018

1 commit

5a8638132 samples: bpf: fix: error handling regarding kprobe_events

Currently, a kprobe_events failure is not handled properly.
Because system() is called to write to kprobe_events indirectly,
it cannot be determined whether an error comes from kprobe or from system().

// buf = "echo '%c:%s %s' >> /s/k/d/t/kprobe_events"
err = system(buf);
if (err < 0) {
printf("failed to create kprobe ..");
return -1;
}

For example, when running the ./tracex7 sample on an ext4 partition,
"echo p:open_ctree open_ctree >> /s/k/d/t/kprobe_events"
makes system() return error code 256.
=> The error comes from kprobe, but it's not handled correctly.

According to the system(3) man page, its return value
just passes on the termination status of the child shell
rather than reporting the error as -1 (regardless of whether the command succeeded).

Which means, currently it's not working as desired.
(According to the upper code snippet)
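
The difference between "system() failed" and "the command failed" can be made explicit with the wait-status macros. This is a sketch; `shell_cmd_failed` is a hypothetical helper, not code from the samples.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Returns 1 when `cmd` could not be run or exited with a non-zero status,
 * 0 when it ran and exited with status 0. */
static int shell_cmd_failed(const char *cmd)
{
	int status = system(cmd);

	if (status == -1)            /* system() itself failed, e.g. fork error */
		return 1;
	if (!WIFEXITED(status))      /* the shell was terminated by a signal */
		return 1;
	return WEXITSTATUS(status) != 0;
}
```

A plain `err < 0` test only catches the first case; a command that exits with status 1 produces a positive wait status and slips through.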

e.g. running ./tracex7 in an ext4 environment:
# Current Output
sh: echo: I/O error
failed to open event open_ctree

# Desired Output
failed to create kprobe 'open_ctree' error 'No such file or directory'

The problem is that one cannot tell whether the error comes from the child
process or from system() itself. But using write() directly can verify the command
failure, and it will treat all error as -1. So I suggest using
write() directly to 'kprobe_events' rather than calling system().
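
A sketch of the write()-based approach. `write_event_cmd` is a hypothetical helper and the path is passed in by the caller; the NULL guard is an extra precaution of this sketch, since strlen(NULL) would crash.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Append `cmd` to the file at `path`.
 * Returns 0 on success, -1 on any error with errno describing the cause. */
static int write_event_cmd(const char *path, const char *cmd)
{
	size_t len;
	int fd, ret = 0, saved_errno;

	if (!cmd) {
		errno = EINVAL;
		return -1;
	}
	len = strlen(cmd);

	fd = open(path, O_WRONLY | O_APPEND);
	if (fd < 0)
		return -1;

	if (write(fd, cmd, len) != (ssize_t)len)
		ret = -1;

	saved_errno = errno;     /* close() must not hide the write error */
	close(fd);
	errno = saved_errno;
	return ret;
}
```

On failure the caller can print strerror(errno), which is how a message such as 'No such file or directory' in the desired output above could be produced.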

Signed-off-by: Daniel T. Lee
Signed-off-by: Daniel Borkmann

Daniel T. Lee
2018-11-24 05:39:09 +0800

21 Nov, 2018

1 commit

9ce6ae22c tools/bpf: do not use pahole if clang/llvm can generate BTF sections

Add additional checks in tools/testing/selftests/bpf and
samples/bpf such that if clang/llvm compiler can generate
BTF sections, do not use pahole.

Signed-off-by: Yonghong Song
Signed-off-by: Martin KaFai Lau
Signed-off-by: Alexei Starovoitov

Yonghong Song
2018-11-21 02:54:39 +0800

08 Nov, 2018

1 commit

bce6a1499 bpf_load: add map name to load_maps error message

To help when debugging bpf/xdp load issues, have the load_map()
error message include the number and name of the map that
failed.

Signed-off-by: Shannon Nelson
Acked-by: John Fastabend
Acked-by: Song Liu
Signed-off-by: Daniel Borkmann

Shannon Nelson
2018-11-08 05:34:54 +0800

04 Oct, 2018

1 commit

20cdeb540 bpf, tracex3_user: erase "ARRAY_SIZE" redefined

There is a warning when compiling bpf sample programs in samples/bpf:

make -C /home/foo/bpf/samples/bpf/../../tools/lib/bpf/ RM='rm -rf' LDFLAGS= srctree=/home/foo/bpf/samples/bpf/../../ O=
HOSTCC /home/foo/bpf/samples/bpf/tracex3_user.o
/home/foo/bpf/samples/bpf/tracex3_user.c:20:0: warning: "ARRAY_SIZE" redefined
#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))

In file included from /home/foo/bpf/samples/bpf/tracex3_user.c:18:0:
./tools/testing/selftests/bpf/bpf_util.h:48:0: note: this is the location of the previous definition
# define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
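
The commit title indicates that the duplicate definition is erased from tracex3_user.c. Where a local fallback is still wanted, an include guard around the macro avoids the redefinition warning; the sketch below shows that alternative, with `slots` and `slot_count` as hypothetical names.

```c
#include <stddef.h>

/* Define ARRAY_SIZE only if no earlier header has already done so. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif

static const int slots[] = { 10, 20, 30, 40 };

/* Number of elements in the static array above. */
static size_t slot_count(void)
{
	return ARRAY_SIZE(slots);
}
```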

Signed-off-by: Bo YU
Signed-off-by: Daniel Borkmann

Bo YU
2018-10-04 22:31:57 +0800

01 Oct, 2018

1 commit

5fcbd29b3 samples/bpf: extend test_cgrp2_attach2 test to use per-cpu cgroup storage

This commit extends the test_cgrp2_attach2 test to cover per-cpu
cgroup storage. Bpf program will use shared and per-cpu cgroup
storages simultaneously, so a better coverage of corresponding
core code will be achieved.

Expected output:
$ ./test_cgrp2_attach2
Attached DROP prog. This ping in cgroup /foo should fail...
ping: sendmsg: Operation not permitted
Attached DROP prog. This ping in cgroup /foo/bar should fail...
ping: sendmsg: Operation not permitted
Attached PASS prog. This ping in cgroup /foo/bar should pass...
Detached PASS from /foo/bar while DROP is attached to /foo.
This ping in cgroup /foo/bar should fail...
ping: sendmsg: Operation not permitted
Attached PASS from /foo/bar and detached DROP from /foo.
This ping in cgroup /foo/bar should pass...
### override:PASS
### multi:PASS

Signed-off-by: Roman Gushchin
Acked-by: Song Liu
Cc: Daniel Borkmann
Cc: Alexei Starovoitov
Signed-off-by: Daniel Borkmann

Roman Gushchin
2018-10-01 22:18:33 +0800

22 Sep, 2018

1 commit

32c009798 samples/bpf: fix compilation failure

The following commit:
commit d58e468b1112 ("flow_dissector: implements flow dissector BPF hook")
added struct bpf_flow_keys, which conflicts with the struct of the
same name in sockex2_kern.c and sockex3_kern.c.

Similar to commit:
commit 534e0e52bc23 ("samples/bpf: fix a compilation failure")
we tried to rename it to "flow_keys", but that also conflicted with the
struct of the same name in include/net/flow_dissector.h. Hence the
struct is renamed to "flow_key_record". Also, that commit doesn't fix
the compilation error completely, because a similar struct is present
in sockex3_kern.c. Hence it is renamed in both files, sockex3_user.c and
sockex3_kern.c.

Signed-off-by: Prashant Bhole
Acked-by: Song Liu
Signed-off-by: Daniel Borkmann

Prashant Bhole
2018-09-22 04:51:16 +0800

18 Sep, 2018

2 commits

534e0e52b samples/bpf: fix a compilation failure

samples/bpf build failed with the following errors:

$ make samples/bpf/
...
HOSTCC samples/bpf/sockex3_user.o
/data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:16:8: error: redefinition of ‘struct bpf_flow_keys’
struct bpf_flow_keys {
^
In file included from /data/users/yhs/work/net-next/samples/bpf/sockex3_user.c:4:0:
./usr/include/linux/bpf.h:2338:9: note: originally defined here
struct bpf_flow_keys *flow_keys;
^
make[3]: *** [samples/bpf/sockex3_user.o] Error 1

Commit d58e468b1112d ("flow_dissector: implements flow dissector BPF hook")
introduced struct bpf_flow_keys in include/uapi/linux/bpf.h and hence
caused the naming conflict with samples/bpf/sockex3_user.c.

The fix is to rename struct bpf_flow_keys in samples/bpf/sockex3_user.c
to flow_keys to avoid the conflict.

Signed-off-by: Yonghong Song
Signed-off-by: Daniel Borkmann

Yonghong Song
2018-09-18 23:50:02 +0800

664e78784 samples/bpf: remove duplicated includes

Remove duplicated includes.

Signed-off-by: YueHaibing
Acked-by: Yonghong Song
Signed-off-by: Daniel Borkmann

YueHaibing
2018-09-18 23:49:33 +0800

01 Sep, 2018

2 commits

11c3f5113 samples/bpf: xdpsock, minor fixes

- xsks_map size was fixed to 4, changed it to MAX_SOCKS
- Remove redundant definition of MAX_SOCKS in xdpsock_user.c
- In dump_stats(), add NULL check for xsks[i]

Signed-off-by: Prashant Bhole
Acked-by: Björn Töpel
Signed-off-by: Daniel Borkmann

Prashant Bhole
2018-09-01 07:36:08 +0800
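
The NULL check mentioned for dump_stats() guards against dereferencing socket slots that were never populated. The sketch below illustrates the pattern only; it is not the xdpsock code. The struct `demo_sock`, the array `demo_socks`, the helper `total_rx_packets`, and the value chosen for MAX_SOCKS are all assumptions made for this example.

```c
#include <stddef.h>

#define MAX_SOCKS 8	/* assumed value for this sketch */

/* Hypothetical per-socket state; only the counter matters here. */
struct demo_sock {
	unsigned long rx_npkts;
};

/* Slots that were never opened stay NULL. */
static struct demo_sock *demo_socks[MAX_SOCKS];

/* Sum received packets over the populated slots only. */
static unsigned long total_rx_packets(void)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < MAX_SOCKS; i++) {
		if (!demo_socks[i])
			continue;	/* the NULL check: skip unused slots */
		total += demo_socks[i]->rx_npkts;
	}
	return total;
}
```

Sizing both the array and the loop bound from the same MAX_SOCKS constant avoids the kind of mismatch that a hard-coded map size of 4 could introduce.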

acb4ea956 bpf: add TCP_SAVE_SYN/TCP_SAVED_SYN sample program

Sample program which shows a TCP_SAVE_SYN/TCP_SAVED_SYN usage example:
a bpf program which does TOS/TCLASS reflection (the server replies
with the same TOS/TCLASS as the client).

Signed-off-by: Nikita V. Shirokov
Signed-off-by: Alexei Starovoitov
Signed-off-by: Daniel Borkmann

Nikita V. Shirokov
2018-09-01 07:36:04 +0800

30 Aug, 2018

1 commit

58c50ae4a samples/bpf: add -c/--copy -z/--zero-copy flags to xdpsock

The -c/--copy and -z/--zero-copy flags enforce either copy or zero-copy
mode.

Signed-off-by: Björn Töpel
Signed-off-by: Alexei Starovoitov

Björn Töpel
2018-08-30 03:25:53 +0800

17 Aug, 2018

1 commit

817b89beb samples/bpf: all XDP samples should unload xdp/bpf prog on SIGTERM

It is common XDP practice to unload/detach the XDP bpf program
when the XDP sample program is Ctrl-C interrupted (SIGINT) or
killed (SIGTERM).

The samples/bpf programs xdp_redirect_cpu and xdp_rxq_info
forgot to trap signal SIGTERM (which is the default signal used
by the kill command).

This was discovered by Red Hat QA, whose automated scripts depend
on killing the XDP sample program after a timeout period.

Fixes: fad3917e361b ("samples/bpf: add cpumap sample program xdp_redirect_cpu")
Fixes: 0fca931a6f21 ("samples/bpf: program demonstrating access to xdp_rxq_info")
Reported-by: Jean-Tsung Hsiao
Signed-off-by: Jesper Dangaard Brouer
Acked-by: Yonghong Song
Signed-off-by: Daniel Borkmann

Jesper Dangaard Brouer
2018-08-17 03:55:32 +0800
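
The practice described in this commit, trapping both SIGINT and SIGTERM so the program can clean up before exiting, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the sample's actual code: the names `stop_requested`, `on_exit_signal`, and `install_exit_handlers` are hypothetical, and the step that detaches the XDP program is left as a comment because it depends on the loader in use.

```c
#include <signal.h>

/* Set by the handler; a main loop would poll this flag and then detach
 * the XDP program before exiting (detach step not shown here). */
static volatile sig_atomic_t stop_requested;

static void on_exit_signal(int sig)
{
	(void)sig;		/* same action for either signal */
	stop_requested = 1;
}

/* Register one handler for SIGINT (Ctrl-C) and SIGTERM (the default
 * signal sent by kill). Returns 0 on success, -1 on failure. */
static int install_exit_handlers(void)
{
	if (signal(SIGINT, on_exit_signal) == SIG_ERR)
		return -1;
	if (signal(SIGTERM, on_exit_signal) == SIG_ERR)
		return -1;
	return 0;
}
```

Registering only SIGINT is the omission this commit addresses: a script that stops the sample with a plain kill would terminate it without running any cleanup.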

16 Aug, 2018

1 commit

9a76aba02 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Pull networking updates from David Miller:
"Highlights:

- Gustavo A. R. Silva keeps working on the implicit switch fallthru
changes.

- Support 802.11ax High-Efficiency wireless in cfg80211 et al, from
Luca Coelho.

- Re-enable ASPM in r8169, from Kai-Heng Feng.

- Add virtual XFRM interfaces, which avoids all of the limitations of
existing IPSEC tunnels. From Steffen Klassert.

- Convert GRO over to use a hash table, so that when we have many
flows active we don't traverse a long list during accumulation.

- Many new self tests for routing, TC, tunnels, etc. Too many
contributors to mention them all, but I'm really happy to keep
seeing this stuff.

- Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu.

- Lots of cleanups and fixes in L2TP code from Guillaume Nault.

- Add IPSEC offload support to netdevsim, from Shannon Nelson.

- Add support for slotting with non-uniform distribution to netem
packet scheduler, from Yousuk Seung.

- Add UDP GSO support to mlx5e, from Boris Pismenny.

- Support offloading of Team LAG in NFP, from John Hurley.

- Allow to configure TX queue selection based upon RX queue, from
Amritha Nambiar.

- Support ethtool ring size configuration in aquantia, from Anton
Mikaev.

- Support DSCP and flowlabel per-transport in SCTP, from Xin Long.

- Support list based batching and stack traversal of SKBs, this is
very exciting work. From Edward Cree.

- Busyloop optimizations in vhost_net, from Toshiaki Makita.

- Introduce the ETF qdisc, which allows time based transmissions. IGB
can offload this in hardware. From Vinicius Costa Gomes.

- Add parameter support to devlink, from Moshe Shemesh.

- Several multiplication and division optimizations for BPF JIT in
nfp driver, from Jiong Wang.

- Lots of preparatory work to make more of the packet scheduler layer
lockless, when possible, from Vlad Buslov.

- Add ACK filter and NAT awareness to sch_cake packet scheduler, from
Toke Høiland-Jørgensen.

- Support regions and region snapshots in devlink, from Alex Vesker.

- Allow to attach XDP programs to both HW and SW at the same time on
a given device, with initial support in nfp. From Jakub Kicinski.

- Add TLS RX offload and support in mlx5, from Ilya Lesokhin.

- Use PHYLIB in r8169 driver, from Heiner Kallweit.

- All sorts of changes to support Spectrum 2 in mlxsw driver, from
Ido Schimmel.

- PTP support in mv88e6xxx DSA driver, from Andrew Lunn.

- Make TCP_USER_TIMEOUT socket option more accurate, from Jon
Maxwell.

- Support for templates in packet scheduler classifier, from Jiri
Pirko.

- IPV6 support in RDS, from Ka-Cheong Poon.

- Native tproxy support in nf_tables, from Máté Eckl.

- Maintain IP fragment queue in an rbtree, but optimize properly for
in-order frags. From Peter Oskolkov.

- Improve handling of ACKs on hole repairs, from Yuchung Cheng"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits)
bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT"
hv/netvsc: Fix NULL dereference at single queue mode fallback
net: filter: mark expected switch fall-through
xen-netfront: fix warn message as irq device name has '/'
cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0
net: dsa: mv88e6xxx: missing unlock on error path
rds: fix building with IPV6=m
inet/connection_sock: prefer _THIS_IP_ to current_text_addr
net: dsa: mv88e6xxx: bitwise vs logical bug
net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd()
ieee802154: hwsim: using right kind of iteration
net: hns3: Add vlan filter setting by ethtool command -K
net: hns3: Set tx ring' tc info when netdev is up
net: hns3: Remove tx ring BD len register in hns3_enet
net: hns3: Fix desc num set to default when setting channel
net: hns3: Fix for phy link issue when using marvell phy driver
net: hns3: Fix for information of phydev lost problem when down/up
net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero
net: hns3: Add support for serdes loopback selftest
bnxt_en: take coredump_record structure off stack
...

Linus Torvalds
2018-08-16 06:04:25 +0800