15 May, 2018

3 commits

  • We can trivially save 4 bytes in the prologue for cBPF since tail
    calls can never be used from there. The register push/pop is
    pairwise here, x25 (fp) and x26 (tcc), so there is no point in
    changing that; only the reset to zero is not needed.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     
  • Improve the JIT to emit 64 and 32 bit immediates: the current
    algorithm is not optimal and we often emit more instructions
    than actually needed. arm64 has movz, movn and movk variants,
    but for 64 bit immediates we currently only use movz with a
    series of movk when needed.

    For example loading ffffffffffffabab emits the following 4
    instructions in the JIT today:

    * movz: abab, shift: 0, result: 000000000000abab
    * movk: ffff, shift: 16, result: 00000000ffffabab
    * movk: ffff, shift: 32, result: 0000ffffffffabab
    * movk: ffff, shift: 48, result: ffffffffffffabab

    Whereas after the patch the same load only needs a single
    instruction:

    * movn: 5454, shift: 0, result: ffffffffffffabab

    Another example where two extra instructions can be saved:

    * movz: abab, shift: 0, result: 000000000000abab
    * movk: 1f2f, shift: 16, result: 000000001f2fabab
    * movk: ffff, shift: 32, result: 0000ffff1f2fabab
    * movk: ffff, shift: 48, result: ffffffff1f2fabab

    After the patch:

    * movn: e0d0, shift: 16, result: ffffffff1f2fffff
    * movk: abab, shift: 0, result: ffffffff1f2fabab

    Another example with movz, before:

    * movz: 0000, shift: 0, result: 0000000000000000
    * movk: fea0, shift: 32, result: 0000fea000000000

    After:

    * movz: fea0, shift: 32, result: 0000fea000000000

    Moreover, reuse emit_a64_mov_i() for 32 bit immediates that are
    loaded via emit_a64_mov_i64(), which is a similar optimization to
    the one done in 6fe8b9c1f41d ("bpf, x64: save several bytes by
    using mov over movabsq when possible"). On arm64, the latter
    allows using a single movn instruction thanks to zero extension
    where otherwise two instructions would be needed. Last but not
    least, add a missing optimization in emit_a64_mov_i() where movn
    is used but the subsequent movk is not needed. With some of the
    Cilium programs in use, this shrinks the needed instructions by
    about three percent. Tested on Cavium ThunderX CN8890.
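
    As a rough standalone sketch of the selection idea (this is not
    the kernel's emit_a64_mov_i64(); the helper below is hypothetical),
    one can count how many 16-bit chunks of the value differ from an
    all-zeros versus an all-ones background and pick movz or movn
    accordingly:

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch: estimate how many mov instructions a 64 bit immediate
     * needs when starting from movz (untouched chunks stay 0x0000)
     * versus movn (untouched chunks stay 0xffff). The JIT then emits
     * movz/movn for the first differing chunk and movk for every
     * remaining differing chunk.
     */
    static int insns_needed(uint64_t val, uint16_t background)
    {
            int i, n = 0;

            for (i = 0; i < 4; i++)
                    if (((val >> (i * 16)) & 0xffff) != background)
                            n++;
            return n ? n : 1; /* all-background still needs one insn */
    }

    int main(void)
    {
            uint64_t val = 0xffffffffffffababULL;

            printf("movz path: %d insns, movn path: %d insns\n",
                   insns_needed(val, 0x0000), insns_needed(val, 0xffff));
            /* prints "movz path: 4 insns, movn path: 1 insns" for the
             * first example above
             */
            return 0;
    }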

    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     
  • Follow-up to 816d9ef32a8b ("bpf, arm64: remove ld_abs/ld_ind"):
    the extra 4 byte JIT scratchpad is not needed anymore, since it
    was only used by ld_abs/ld_ind as a stack buffer for
    bpf_load_pointer().

    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

04 May, 2018

1 commit

  • Since the LD_ABS/LD_IND instructions are now removed from the core
    and reimplemented through a combination of inlined BPF instructions
    and a slow-path helper, we can get rid of the corresponding
    complexity in the arm64 JIT.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

23 Feb, 2018

1 commit

  • I recently noticed a crash on arm64 when feeding a bogus index
    into the BPF tail call helper. The crash would not occur when the
    interpreter is used, but only in the case of JIT. Output looks as
    follows:

    [ 347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
    [...]
    [ 347.043065] [fffb850e96492510] address between user and kernel address ranges
    [ 347.050205] Internal error: Oops: 96000004 [#1] SMP
    [...]
    [ 347.190829] x13: 0000000000000000 x12: 0000000000000000
    [ 347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
    [ 347.201427] x9 : 0000000000000000 x8 : 0000000000000000
    [ 347.206726] x7 : 0000000000000000 x6 : 001c991738000000
    [ 347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
    [ 347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
    [ 347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
    [ 347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
    [ 347.235221] Call trace:
    [ 347.237656] 0xffff000002f3a4fc
    [ 347.240784] bpf_test_run+0x78/0xf8
    [ 347.244260] bpf_prog_test_run_skb+0x148/0x230
    [ 347.248694] SyS_bpf+0x77c/0x1110
    [ 347.251999] el0_svc_naked+0x30/0x34
    [ 347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
    [...]

    In this case the index used in BPF r3 is the same as in r1
    at the time of the call, meaning we fed a pointer as index;
    here, it had the value 0xffff808fd7cf0500 which sits in x2.

    While I found tail calls to be working in general (also for
    hitting the error cases), I noticed the following in the code
    emission:

    # bpftool p d j i 988
    [...]
    38: ldr w10, [x1,x10]
    3c: cmp w2, w10
    40: b.ge 0x000000000000007c
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

21 Jan, 2018

1 commit

  • Alexei Starovoitov says:

    ====================
    pull-request: bpf-next 2018-01-19

    The following pull-request contains BPF updates for your *net-next* tree.

    The main changes are:

    1) bpf array map HW offload, from Jakub.

    2) support for bpf_get_next_key() for LPM map, from Yonghong.

    3) test_verifier now runs loaded programs, from Alexei.

    4) xdp cpumap monitoring, from Jesper.

    5) variety of tests, cleanups and small x64 JIT optimization, from Daniel.

    6) user space can now retrieve HW JITed program, from Jiong.

    Note there is a minor conflict between Russell's arm32 JIT fixes
    and removal of bpf_jit_enable variable by Daniel which should
    be resolved by keeping Russell's comment and removing that variable.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

20 Jan, 2018

2 commits

  • The BPF verifier conflict was a minor contextual issue.

    The TUN conflict was less trivial. Cong Wang fixed a memory leak of
    tfile->tx_array in 'net'. This is an skb_array. But meanwhile in
    net-next, tun changed tfile->tx_array into tfile->tx_ring, which is
    a ptr_ring.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Having a pure_initcall() callback just to permanently enable BPF
    JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
    a small race window in the future where the JIT is still disabled
    on boot. Since we know about the setting at compilation time
    anyway, just initialize it properly there. Also consolidate all
    the individual bpf_jit_enable variables into a single one and move
    it into one location. Moreover, don't allow setting unspecified
    garbage values on it.
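
    A minimal sketch of the idea (the exact kernel initializer may
    differ; this is an assumption, not the literal diff): give the
    consolidated variable its final value at its definition instead of
    flipping it from a pure_initcall():

    /* Sketch only: enable the JIT at compile time when
     * CONFIG_BPF_JIT_ALWAYS_ON is set, rather than from an initcall.
     */
    int bpf_jit_enable __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_ALWAYS_ON);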

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

17 Jan, 2018

1 commit

  • Using dynamic stack_depth tracking in the arm64 JIT is currently
    broken in combination with tail calls. In the prologue, we cache
    ctx->stack_size and adjust the SP register to set up the function
    call stack, and we tear it down again in the epilogue. The problem
    is that when doing a tail call, the cached ctx->stack_size might
    not be the same for the program we jump into.

    One way to fix the problem with minimal overhead is to re-adjust SP in
    emit_bpf_tail_call() and properly adjust it to the current program's
    ctx->stack_size. Tested on Cavium ThunderX ARMv8.

    Fixes: f1c9eed7f437 ("bpf, arm64: take advantage of stack_depth tracking")
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

19 Dec, 2017

1 commit

  • Fix the following issue:
    arch/arm64/net/bpf_jit_comp.c: In function 'bpf_int_jit_compile':
    arch/arm64/net/bpf_jit_comp.c:982:18: error: 'image_size' may be used
    uninitialized in this function [-Werror=maybe-uninitialized]

    Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs")
    Reported-by: Arnd Bergmann
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann

    Alexei Starovoitov
     

18 Dec, 2017

2 commits

  • Similar to x64, add support for bpf-to-bpf calls.
    When a program has calls to in-kernel helpers, the target call
    offset is known at JIT time and the arm64 architecture needs 2
    passes. With bpf-to-bpf calls the dynamically allocated function
    start is unknown until all functions of the program are JITed.
    Therefore (just like x64) the arm64 JIT needs one extra pass over
    the program to emit correct call offsets.

    Implementation detail:
    Avoid being too clever in 64-bit immediate moves and always use 4
    instructions (instead of 3-4 depending on the address) to make
    sure only one extra pass is needed. If some future optimization
    made it worthwhile to optimize 'call 64-bit imm' further, the JIT
    would need to do 4 passes over the program instead of 3 as in this
    patch. For a typical bpf program address the mov needs 3 or 4
    insns, so using an unconditional 4 insns to save the extra pass is
    a worthwhile trade-off at this stage of the JIT.
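
    As a hedged standalone sketch of the "always 4 instructions"
    choice (the register and helper name are placeholders, not the
    kernel's own emitter), a call address can be emitted
    unconditionally as one movz plus three movk:

    #include <stdint.h>
    #include <stdio.h>

    /* Sketch: emit a 64 bit call address as a fixed movz + 3x movk
     * sequence, so the emitted length never depends on the address
     * value and image offsets stay stable across the extra pass.
     */
    static void emit_addr_mov_sketch(uint64_t addr)
    {
            int shift;

            printf("movz x10, #0x%04x\n", (unsigned int)(addr & 0xffff));
            for (shift = 16; shift < 64; shift += 16)
                    printf("movk x10, #0x%04x, lsl #%d\n",
                           (unsigned int)((addr >> shift) & 0xffff), shift);
    }

    int main(void)
    {
            emit_addr_mov_sketch(0xffff000008f0a4fcULL); /* arbitrary example */
            return 0;
    }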

    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: Daniel Borkmann

    Alexei Starovoitov
     
  • The global bpf_jit_enable variable is tested multiple times in the
    JITs, blinding and verifier core. A malicious root can try to
    toggle it while loading programs. This race condition was
    accounted for and there should be no issues, but it's safer to
    avoid it altogether.

    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: Daniel Borkmann

    Alexei Starovoitov
     

06 Jul, 2017

1 commit

  • Pull arm64 updates from Will Deacon:

    - RAS reporting via GHES/APEI (ACPI)

    - Indirect ftrace trampolines for modules

    - Improvements to kernel fault reporting

    - Page poisoning

    - Sigframe cleanups and preparation for SVE context

    - Core dump fixes

    - Sparse fixes (mainly relating to endianness)

    - xgene SoC PMU v3 driver

    - Misc cleanups and non-critical fixes

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (75 commits)
    arm64: fix endianness annotation for 'struct jit_ctx' and friends
    arm64: cpuinfo: constify attribute_group structures.
    arm64: ptrace: Fix incorrect get_user() use in compat_vfp_set()
    arm64: ptrace: Remove redundant overrun check from compat_vfp_set()
    arm64: ptrace: Avoid setting compat FP[SC]R to garbage if get_user fails
    arm64: fix endianness annotation for __apply_alternatives()/get_alt_insn()
    arm64: fix endianness annotation in get_kaslr_seed()
    arm64: add missing conversion to __wsum in ip_fast_csum()
    arm64: fix endianness annotation in acpi_parking_protocol.c
    arm64: use readq() instead of readl() to read 64bit entry_point
    arm64: fix endianness annotation for reloc_insn_movw() & reloc_insn_imm()
    arm64: fix endianness annotation for aarch64_insn_write()
    arm64: fix endianness annotation in aarch64_insn_read()
    arm64: fix endianness annotation in call_undef_hook()
    arm64: fix endianness annotation for debug-monitors.c
    ras: mark stub functions as 'inline'
    arm64: pass endianness info to sparse
    arm64: ftrace: fix !CONFIG_ARM64_MODULE_PLTS kernels
    arm64: signal: Allow expansion of the signal frame
    acpi: apei: check for pending errors when probing GHES entries
    ...

    Linus Torvalds
     

01 Jul, 2017

1 commit

  • struct jit_ctx::image is used to store a pointer to the jitted
    instructions, which are always little-endian. These instructions
    are thus correctly converted from native order to little-endian
    before being stored, but the pointer 'image' is declared as
    holding native-order values.

    Fix this by declaring the field as __le32* instead of u32*.
    Do the same for the pointer used in jit_fill_hole() to initialize
    the image.
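
    A minimal sketch of the endianness-clean store pattern this
    enforces (simplified and illustrative, not the exact arm64 JIT
    code; the struct and function names here are assumptions):

    #include <linux/types.h>
    #include <asm/byteorder.h>

    struct jit_ctx_sketch {
            int idx;
            __le32 *image;  /* jited insns are always little-endian */
    };

    static inline void emit_insn(u32 insn, struct jit_ctx_sketch *ctx)
    {
            if (ctx->image)
                    /* explicit conversion keeps sparse happy */
                    ctx->image[ctx->idx] = cpu_to_le32(insn);
            ctx->idx++;
    }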

    Signed-off-by: Luc Van Oostenryck
    Signed-off-by: Will Deacon

    Luc Van Oostenryck
     

08 Jun, 2017

1 commit

  • Will reported that in BPF_XADD we must use a different register in
    the stxr instruction for the status flag, since otherwise we get
    CONSTRAINED UNPREDICTABLE behavior per the architecture. The
    reference manual says [1]:

    If s == t, then one of the following behaviors must occur:

    * The instruction is UNDEFINED.
    * The instruction executes as a NOP.
    * The instruction performs the store to the specified address, but
    the value stored is UNKNOWN.

    Thus, use a different temporary register for the status flag to fix it.

    Disassembly extract from test 226/STX_XADD_DW from test_bpf.ko:

    [...]
    0000003c: c85f7d4b ldxr x11, [x10]
    00000040: 8b07016b add x11, x11, x7
    00000044: c80c7d4b stxr w12, x11, [x10]
    00000048: 35ffffac cbnz w12, 0x0000003c
    [...]

    [1] https://static.docs.arm.com/ddi0487/b/DDI0487B_a_armv8_arm.pdf, p.6132

    Fixes: 85f68fe89832 ("bpf, arm64: implement jiting of BPF_XADD")
    Reported-by: Will Deacon
    Signed-off-by: Daniel Borkmann
    Acked-by: Will Deacon
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

12 May, 2017

1 commit

  • Shubham was recently asking on netdev why in the arm64 JIT we don't
    multiply the index for accessing the tail call map by 8. That led
    me to test the arm64 JIT wrt tail calls, and it turned out I got a
    NULL pointer dereference on the tail call.

    The buggy access is at:

    prog = array->ptrs[index];
    if (prog == NULL)
    goto out;

    [...]
    00000060: d2800e0a mov x10, #0x70 // #112
    00000064: f86a682a ldr x10, [x1,x10]
    00000068: f862694b ldr x11, [x10,x2]
    0000006c: b40000ab cbz x11, 0x00000080
    [...]

    The code triggering the crash is f862694b. x1 at the time contains
    the address of the bpf array, x10 holds offsetof(struct bpf_array,
    ptrs). Meaning, above we load the pointer to the program at map
    slot 0 into x10. x10 can then be NULL if the slot is not occupied,
    which we later try to access with a user-given offset in x2 that is
    the map index.

    Fix this by emitting the following instead:

    [...]
    00000060: d2800e0a mov x10, #0x70 // #112
    00000064: 8b0a002a add x10, x1, x10
    00000068: d37df04b lsl x11, x2, #3
    0000006c: f86b694b ldr x11, [x10,x11]
    00000070: b40000ab cbz x11, 0x00000084
    [...]

    This basically adds the offset of ptrs to the base address of the
    bpf array we got, and we then access the map with an index * 8
    offset relative to that. The tail call map itself is basically one
    large area with meta data at the head followed by the array of
    prog pointers. This makes tail calls work again; tested on Cavium
    ThunderX ARMv8.

    Fixes: ddb55992b04d ("arm64: bpf: implement bpf_tail_call() helper")
    Reported-by: Shubham Bansal
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

09 May, 2017

1 commit

  • The set_memory_* functions have moved to set_memory.h. Use that header
    explicitly.

    Link: http://lkml.kernel.org/r/1488920133-27229-4-git-send-email-labbott@redhat.com
    Signed-off-by: Laura Abbott
    Acked-by: Catalin Marinas
    Acked-by: Mark Rutland
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laura Abbott
     

03 May, 2017

2 commits

  • When the instruction right before the branch destination is
    a 64 bit load immediate, we currently calculate the wrong
    jump offset in the ctx->offset[] array, as we only account for
    one instruction slot for the 64 bit load immediate although
    it uses two BPF instruction slots. Fix it up by setting the offset
    into the right slot after we have incremented the index.

    Before (ldimm64 test 1):

    [...]
    00000020: 52800007 mov w7, #0x0 // #0
    00000024: d2800060 mov x0, #0x3 // #3
    00000028: d2800041 mov x1, #0x2 // #2
    0000002c: eb01001f cmp x0, x1
    00000030: 54ffff82 b.cs 0x00000020
    00000034: d29fffe7 mov x7, #0xffff // #65535
    00000038: f2bfffe7 movk x7, #0xffff, lsl #16
    0000003c: f2dfffe7 movk x7, #0xffff, lsl #32
    00000040: f2ffffe7 movk x7, #0xffff, lsl #48
    00000044: d29dddc7 mov x7, #0xeeee // #61166
    00000048: f2bdddc7 movk x7, #0xeeee, lsl #16
    0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32
    00000050: f2fdddc7 movk x7, #0xeeee, lsl #48
    [...]

    After (ldimm64 test 1):

    [...]
    00000020: 52800007 mov w7, #0x0 // #0
    00000024: d2800060 mov x0, #0x3 // #3
    00000028: d2800041 mov x1, #0x2 // #2
    0000002c: eb01001f cmp x0, x1
    00000030: 540000a2 b.cs 0x00000044
    00000034: d29fffe7 mov x7, #0xffff // #65535
    00000038: f2bfffe7 movk x7, #0xffff, lsl #16
    0000003c: f2dfffe7 movk x7, #0xffff, lsl #32
    00000040: f2ffffe7 movk x7, #0xffff, lsl #48
    00000044: d29dddc7 mov x7, #0xeeee // #61166
    00000048: f2bdddc7 movk x7, #0xeeee, lsl #16
    0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32
    00000050: f2fdddc7 movk x7, #0xeeee, lsl #48
    [...]

    Also, add a couple of test cases to make sure JITs pass
    this test. Tested on Cavium ThunderX ARMv8. The added
    test cases all pass after the fix.
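
    A hedged sketch of the bookkeeping behind the fix (illustrative
    only, not the literal kernel diff; the loop shape merely
    approximates the JIT's build_body() and its helpers):

    static int build_body_sketch(struct jit_ctx *ctx,
                                 const struct bpf_prog *prog)
    {
            int i;

            for (i = 0; i < prog->len; i++) {
                    const struct bpf_insn *insn = &prog->insnsi[i];
                    int ret = build_insn(insn, ctx);

                    if (ret > 0) {
                            /* ldimm64 consumed a second BPF slot: fill
                             * that slot's offset as well, so branches
                             * landing right after it resolve correctly.
                             */
                            i++;
                            ctx->offset[i] = ctx->idx;
                            continue;
                    }
                    ctx->offset[i] = ctx->idx;
                    if (ret)
                            return ret;
            }
            return 0;
    }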

    Fixes: 8eee539ddea0 ("arm64: bpf: fix out-of-bounds read in bpf2a64_offset()")
    Reported-by: David S. Miller
    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Cc: Xi Wang
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • This work adds BPF_XADD for BPF_W/BPF_DW to the arm64 JIT and
    therefore completes JITing of all BPF instructions, meaning we can
    remove the 'notyet' label and no longer need to fall back to the
    interpreter when BPF_XADD is used in a program!

    This also brings the arm64 JIT in line with x86_64, s390x, ppc64
    and sparc64, where all current eBPF features are supported.

    BPF_W example from test_bpf:

    .u.insns_int = {
    BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
    BPF_ST_MEM(BPF_W, R10, -40, 0x10),
    BPF_STX_XADD(BPF_W, R10, R0, -40),
    BPF_LDX_MEM(BPF_W, R0, R10, -40),
    BPF_EXIT_INSN(),
    },

    [...]
    00000020: 52800247 mov w7, #0x12 // #18
    00000024: 928004eb mov x11, #0xffffffffffffffd8 // #-40
    00000028: d280020a mov x10, #0x10 // #16
    0000002c: b82b6b2a str w10, [x25,x11]
    // start of xadd mapping:
    00000030: 928004ea mov x10, #0xffffffffffffffd8 // #-40
    00000034: 8b19014a add x10, x10, x25
    00000038: f9800151 prfm pstl1strm, [x10]
    0000003c: 885f7d4b ldxr w11, [x10]
    00000040: 0b07016b add w11, w11, w7
    00000044: 880b7d4b stxr w11, w11, [x10]
    00000048: 35ffffab cbnz w11, 0x0000003c
    // end of xadd mapping:
    [...]

    BPF_DW example from test_bpf:

    .u.insns_int = {
    BPF_ALU32_IMM(BPF_MOV, R0, 0x12),
    BPF_ST_MEM(BPF_DW, R10, -40, 0x10),
    BPF_STX_XADD(BPF_DW, R10, R0, -40),
    BPF_LDX_MEM(BPF_DW, R0, R10, -40),
    BPF_EXIT_INSN(),
    },

    [...]
    00000020: 52800247 mov w7, #0x12 // #18
    00000024: 928004eb mov x11, #0xffffffffffffffd8 // #-40
    00000028: d280020a mov x10, #0x10 // #16
    0000002c: f82b6b2a str x10, [x25,x11]
    // start of xadd mapping:
    00000030: 928004ea mov x10, #0xffffffffffffffd8 // #-40
    00000034: 8b19014a add x10, x10, x25
    00000038: f9800151 prfm pstl1strm, [x10]
    0000003c: c85f7d4b ldxr x11, [x10]
    00000040: 8b07016b add x11, x11, x7
    00000044: c80b7d4b stxr w11, x11, [x10]
    00000048: 35ffffab cbnz w11, 0x0000003c
    // end of xadd mapping:
    [...]

    Tested on Cavium ThunderX ARMv8, test suite results after the patch:

    No JIT: [ 3751.855362] test_bpf: Summary: 311 PASSED, 0 FAILED, [0/303 JIT'ed]
    With JIT: [ 3573.759527] test_bpf: Summary: 311 PASSED, 0 FAILED, [303/303 JIT'ed]

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

22 Feb, 2017

1 commit

  • Eric and Willem reported that they recently saw random crashes when
    the JIT was in use and bisected this to 74451e66d516 ("bpf: make
    jited programs visible in traces"). The issue was that the
    consolidation part added bpf_jit_binary_unlock_ro(), which would
    unlock previously read-only memory back to read-write. However,
    DEBUG_SET_MODULE_RONX cannot be used to test for the presence of
    the set_memory_*() functions. We need to use ARCH_HAS_SET_MEMORY
    instead to fix this; also add the corresponding
    bpf_jit_binary_lock_ro() to filter.h.
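
    A hedged sketch of the resulting guard in filter.h (simplified;
    the exact helper bodies may differ from mainline):

    #ifdef CONFIG_ARCH_HAS_SET_MEMORY
    static inline void bpf_jit_binary_lock_ro(struct bpf_binary_header *hdr)
    {
            set_memory_ro((unsigned long)hdr, hdr->pages);
    }

    static inline void bpf_jit_binary_unlock_ro(struct bpf_binary_header *hdr)
    {
            set_memory_rw((unsigned long)hdr, hdr->pages);
    }
    #else
    static inline void bpf_jit_binary_lock_ro(struct bpf_binary_header *hdr)
    {
    }

    static inline void bpf_jit_binary_unlock_ro(struct bpf_binary_header *hdr)
    {
    }
    #endif /* CONFIG_ARCH_HAS_SET_MEMORY */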

    Fixes: 74451e66d516 ("bpf: make jited programs visible in traces")
    Reported-by: Eric Dumazet
    Reported-by: Willem de Bruijn
    Bisected-by: Eric Dumazet
    Signed-off-by: Daniel Borkmann
    Tested-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

18 Feb, 2017

2 commits

  • A long-standing issue with JITed programs is that stack traces from
    function tracing check whether a given address is kernel code
    through {__,}kernel_text_address(), which checks for code in core
    kernel, modules and dynamically allocated ftrace trampolines. But
    what is still missing is BPF JITed programs (interpreted programs
    are not an issue as __bpf_prog_run() will be attributed to them),
    thus when a stack trace is triggered, the code walking the stack
    won't see any of the JITed ones. The same goes for address
    correlation done from user space by reading /proc/kallsyms. This is
    read by
    tools like perf, but the latter is also useful for permanent live
    tracing with eBPF itself in combination with stack maps when other
    eBPF types are part of the callchain. See offwaketime example on
    dumping stack from a map.

    This work tries to tackle that issue by making the addresses and
    symbols known to the kernel. The lookup from *kernel_text_address()
    is implemented through a latched RB tree that can be read under
    RCU in fast-path that is also shared for symbol/size/offset lookup
    for a specific given address in kallsyms. The slow-path iteration
    through all symbols in the seq file is done via an RCU list, which
    holds a tiny fraction of all exported ksyms, usually below 0.1
    percent. Function symbols are exported as bpf_prog_, in order to
    aid debugging and attribution. This facility is currently enabled
    for
    root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
    is active in any mode. The rationale behind this is that still a lot
    of systems ship with world read permissions on kallsyms thus addresses
    should not get suddenly exposed for them. If that situation gets
    much better in future, we always have the option to change the
    default on this. Likewise, unprivileged programs are not allowed
    to add entries there either, but that is less of a concern as most
    such program types relevant in this context are for root-only anyway.
    If enabled, call graphs and stack traces will then show a correct
    attribution; one example is illustrated below, where the trace is
    now visible in tooling such as perf script --kallsyms=/proc/kallsyms
    and friends.

    Before:

    7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
    f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)

    After:

    7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
    [...]
    7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
    f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Remove the dummy bpf_jit_compile() stubs for eBPF JITs and make
    that a single __weak function in the core that can be overridden
    similarly to the eBPF one. Also remove stale pr_err() mentions
    of bpf_jit_compile.
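
    The pattern described above, roughly (a sketch of the single weak
    core definition; architectures with a cBPF JIT keep overriding it):

    #include <linux/filter.h>

    /* Weak default in the core: does nothing, so eBPF-only
     * architectures no longer need their own empty stub.
     */
    void __weak bpf_jit_compile(struct bpf_prog *prog)
    {
    }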

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

11 Jun, 2016

3 commits

  • Remove superfluous stack frame, saving us 3 instructions for every
    LD_ABS or LD_IND.

    Signed-off-by: Zi Shen Lim
    Signed-off-by: David S. Miller

    Zi Shen Lim
     
  • Remove superfluous stack frame, saving us 3 instructions for
    every JMP_CALL.

    Signed-off-by: Zi Shen Lim
    Signed-off-by: David S. Miller

    Zi Shen Lim
     
  • Add support for JMP_CALL_X (tail call) introduced by commit 04fd61ab36ec
    ("bpf: allow bpf programs to tail-call other bpf programs").

    bpf_tail_call() arguments:
    ctx - context pointer passed to next program
    array - pointer to a map of type BPF_MAP_TYPE_PROG_ARRAY
    index - index inside the array that selects the specific program to run

    In this implementation the arm64 JIT jumps into the callee program
    after the prologue, so the callee program reuses the same stack.
    For tail_call_cnt, we use the callee-saved R26 (which was already
    saved/restored but previously unused by the JIT).

    With this patch a tail call generates the following code on arm64:

    if (index >= array->map.max_entries)
    goto out;

    34: mov x10, #0x10 // #16
    38: ldr w10, [x1,x10]
    3c: cmp w2, w10
    40: b.ge 0x0000000000000074

    if (tail_call_cnt > MAX_TAIL_CALL_CNT)
    goto out;
    tail_call_cnt++;

    44: mov x10, #0x20 // #32
    48: cmp x26, x10
    4c: b.gt 0x0000000000000074
    50: add x26, x26, #0x1

    prog = array->ptrs[index];
    if (prog == NULL)
    goto out;

    54: mov x10, #0x68 // #104
    58: ldr x10, [x1,x10]
    5c: ldr x11, [x10,x2]
    60: cbz x11, 0x0000000000000074

    goto *(prog->bpf_func + prologue_size);

    64: mov x10, #0x20 // #32
    68: ldr x10, [x11,x10]
    6c: add x10, x10, #0x20
    70: br x10
    74:

    Signed-off-by: Zi Shen Lim
    Signed-off-by: David S. Miller

    Zi Shen Lim
     

18 May, 2016

1 commit

  • In the current implementation of the ARM64 eBPF JIT, R23 and R24
    are used as tmp registers, which are callee-saved registers. This
    leads to a variable-sized JIT prologue and epilogue. The recent
    constant blinding change prefers a constant-sized prologue and
    epilogue. The AAPCS reserves R9 ~ R15 as temp registers which do
    not need to be saved/restored across function calls. So, replace
    R23 and R24 with R10 and R11, and remove the tmp_used flag, saving
    2 instructions for some jited BPF programs.

    CC: Daniel Borkmann
    Acked-by: Zi Shen Lim
    Signed-off-by: Yang Shi
    Acked-by: Catalin Marinas
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Yang Shi
     

17 May, 2016

3 commits

  • This patch adds the recently added constant blinding helpers into
    the arm64 eBPF JIT. In the bpf_int_jit_compile() path, the
    requirements are to use the bpf_jit_blind_constants()/
    bpf_jit_prog_release_other() pair for rewriting the program into a
    blinded one, and to map the BPF_REG_AX register to a CPU register.
    The mapping is to x9.
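
    A hedged sketch of the blinding hook-up pattern in a JIT's
    bpf_int_jit_compile() (error handling and the actual code
    generation trimmed; shape follows the description above):

    #include <linux/filter.h>

    struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
    {
            struct bpf_prog *orig_prog = prog;
            bool tmp_blinded = false;
            struct bpf_prog *tmp;

            tmp = bpf_jit_blind_constants(prog);
            if (IS_ERR(tmp))
                    return orig_prog; /* blinding failed, use interpreter */
            if (tmp != prog) {
                    tmp_blinded = true; /* JIT the rewritten (blinded) prog */
                    prog = tmp;
            }

            /* ... actual JIT passes over 'prog' go here ... */

            if (tmp_blinded)
                    bpf_jit_prog_release_other(prog, prog == orig_prog ?
                                               tmp : orig_prog);
            return prog;
    }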

    Signed-off-by: Daniel Borkmann
    Acked-by: Zi Shen Lim
    Acked-by: Yang Shi
    Tested-by: Yang Shi
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Since the blinding is strictly only called from inside eBPF JITs,
    we need to change the signatures of bpf_int_jit_compile() and
    bpf_prog_select_runtime() first, in order to prepare for the fact
    that the eBPF program we're dealing with can change underneath us.
    Hence, for call sites, we need to return the latest prog. No
    functional change in this patch.
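
    For reference, the resulting signatures as described (both now
    hand back the latest, possibly replaced, prog):

    struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
    struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err);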

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • There is never a situation where bpf_int_jit_compile() is called
    with either prog as NULL or len as 0, so the checks are unnecessary
    and confusing, as people would just copy them. s390 doesn't have
    them, so no change is needed there.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

15 May, 2016

1 commit

  • The original implementation commit e54bcde3d69d ("arm64: eBPF JIT
    compiler") had the relevant code paths, but due to an oversight
    would always fail jiting.

    As a result, we had been falling back to the BPF interpreter
    whenever a BPF program had JMP_JSET_{X,K} instructions.

    With this fix, we confirm that the corresponding tests in
    lib/test_bpf continue to pass, and are now also jited.

    ...
    [ 2.784553] test_bpf: #30 JSET jited:1 188 192 197 PASS
    [ 2.791373] test_bpf: #31 tcpdump port 22 jited:1 325 677 625 PASS
    [ 2.808800] test_bpf: #32 tcpdump complex jited:1 323 731 991 PASS
    ...
    [ 3.190759] test_bpf: #237 JMP_JSET_K: if (0x3 & 0x2) return 1 jited:1 110 PASS
    [ 3.192524] test_bpf: #238 JMP_JSET_K: if (0x3 & 0xffffffff) return 1 jited:1 98 PASS
    [ 3.211014] test_bpf: #249 JMP_JSET_X: if (0x3 & 0x2) return 1 jited:1 120 PASS
    [ 3.212973] test_bpf: #250 JMP_JSET_X: if (0x3 & 0xffffffff) return 1 jited:1 89 PASS
    ...

    Fixes: e54bcde3d69d ("arm64: eBPF JIT compiler")
    Signed-off-by: Zi Shen Lim
    Acked-by: Will Deacon
    Acked-by: Yang Shi
    Signed-off-by: David S. Miller

    Zi Shen Lim
     

18 Jan, 2016

1 commit

  • Code generation functions in arch/arm64/kernel/insn.c previously
    hit a BUG_ON() for invalid parameters. Following the change of that
    behavior, we now need to handle the error case where
    AARCH64_BREAK_FAULT is returned.

    Instead of error-handling on every emit() in the JIT, we add a new
    validation pass at the end of JIT compilation. There's no point in
    running JITed code at run-time only to trap due to
    AARCH64_BREAK_FAULT. Instead, we drop this failed JIT compilation
    and allow the system to gracefully fall back to the BPF
    interpreter.
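
    A sketch of such a validation pass (close in spirit to the arm64
    JIT's final check, though the names here are illustrative):

    /* Scan the finished image; any AARCH64_BREAK_FAULT means some
     * emit() was handed invalid parameters, so reject the JIT result
     * and let the interpreter run the program instead.
     */
    static int validate_code_sketch(struct jit_ctx *ctx)
    {
            int i;

            for (i = 0; i < ctx->idx; i++) {
                    u32 a64_insn = le32_to_cpu(ctx->image[i]);

                    if (a64_insn == AARCH64_BREAK_FAULT)
                            return -1;
            }
            return 0;
    }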

    Signed-off-by: Zi Shen Lim
    Suggested-by: Alexei Starovoitov
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Zi Shen Lim
     

19 Dec, 2015

1 commit

  • Back in the days where eBPF (or back then "internal BPF" ;->) was not
    exposed to user space, and only the classic BPF programs internally
    translated into eBPF programs, we missed the fact that for classic BPF
    A and X needed to be cleared. It was fixed back then via 83d5b7ef99c9
    ("net: filter: initialize A and X registers"), and thus classic BPF
    specifics were added to the eBPF interpreter core to work around it.

    This added some confusion for JIT developers later on that take the
    eBPF interpreter code as an example for deriving their JIT. F.e. in
    f75298f5c3fe ("s390/bpf: clear correct BPF accumulator register"), at
    least X could leak stack memory. Furthermore, since this is only needed
    for classic BPF translations and not for eBPF (verifier takes care
    that read access to regs cannot be done uninitialized), more complexity
    is added to JITs as they need to determine whether they deal with
    migrations or native eBPF where they can just omit clearing A/X in
    their prologue and thus reduce image size a bit, see f.e. cde66c2d88da
    ("s390/bpf: Only clear A and X for converted BPF programs"). In other
    cases (x86, arm64), A and X is being cleared in the prologue also for
    eBPF case, which is unnecessary.

    Let's move this into the BPF migration in bpf_convert_filter()
    where it actually belongs, while the number of eBPF JITs is still
    small. It can thus be done generically, allowing us to remove the
    quirk from __bpf_prog_run() and to slightly reduce JIT image size
    in the eBPF case, while reducing code duplication on this matter
    in current (and future) eBPF JITs.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Reviewed-by: Michael Holzheu
    Tested-by: Michael Holzheu
    Cc: Zi Shen Lim
    Cc: Yang Shi
    Acked-by: Yang Shi
    Acked-by: Zi Shen Lim
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

04 Dec, 2015

1 commit

  • aarch64 doesn't have a native store-immediate instruction; such an
    operation has to be implemented with the instruction sequence
    below:

    Load immediate into a register
    Store the register
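
    For instance, a word-sized BPF store of immediate 0x10 to fp-40
    lowers to a pair like the one below (taken from the BPF_XADD
    example earlier in this log; register choice is up to the JIT's
    temporaries):

    d280020a mov x10, #0x10 // #16, immediate into temp register
    b82b6b2a str w10, [x25,x11] // store temp register to [fp + off]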

    Signed-off-by: Yang Shi
    CC: Zi Shen Lim
    CC: Xi Wang
    Reviewed-by: Zi Shen Lim
    Signed-off-by: David S. Miller

    Yang Shi