17 Sep, 2020

1 commit

  • Running the eBPF test_verifier leads to random errors looking like this:

    [ 6525.735488] Unexpected kernel BRK exception at EL1
    [ 6525.735502] Internal error: ptrace BRK handler: f2000100 [#1] SMP
    [ 6525.741609] Modules linked in: nls_utf8 cifs libdes libarc4 dns_resolver fscache binfmt_misc nls_ascii nls_cp437 vfat fat aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce gf128mul efi_pstore sha2_ce sha256_arm64 sha1_ce evdev efivars efivarfs ip_tables x_tables autofs4 btrfs blake2b_generic xor xor_neon zstd_compress raid6_pq libcrc32c crc32c_generic ahci xhci_pci libahci xhci_hcd igb libata i2c_algo_bit nvme realtek usbcore nvme_core scsi_mod t10_pi netsec mdio_devres of_mdio gpio_keys fixed_phy libphy gpio_mb86s7x
    [ 6525.787760] CPU: 3 PID: 7881 Comm: test_verifier Tainted: G W 5.9.0-rc1+ #47
    [ 6525.796111] Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS build #1 Jun 6 2020
    [ 6525.804812] pstate: 20000005 (nzCv daif -PAN -UAO BTYPE=--)
    [ 6525.810390] pc : bpf_prog_c3d01833289b6311_F+0xc8/0x9f4
    [ 6525.815613] lr : bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c
    [ 6525.820832] sp : ffff8000130cbb80
    [ 6525.824141] x29: ffff8000130cbbb0 x28: 0000000000000000
    [ 6525.829451] x27: 000005ef6fcbf39b x26: 0000000000000000
    [ 6525.834759] x25: ffff8000130cbb80 x24: ffff800011dc7038
    [ 6525.840067] x23: ffff8000130cbd00 x22: ffff0008f624d080
    [ 6525.845375] x21: 0000000000000001 x20: ffff800011dc7000
    [ 6525.850682] x19: 0000000000000000 x18: 0000000000000000
    [ 6525.855990] x17: 0000000000000000 x16: 0000000000000000
    [ 6525.861298] x15: 0000000000000000 x14: 0000000000000000
    [ 6525.866606] x13: 0000000000000000 x12: 0000000000000000
    [ 6525.871913] x11: 0000000000000001 x10: ffff8000000a660c
    [ 6525.877220] x9 : ffff800010951810 x8 : ffff8000130cbc38
    [ 6525.882528] x7 : 0000000000000000 x6 : 0000009864cfa881
    [ 6525.887836] x5 : 00ffffffffffffff x4 : 002880ba1a0b3e9f
    [ 6525.893144] x3 : 0000000000000018 x2 : ffff8000000a4374
    [ 6525.898452] x1 : 000000000000000a x0 : 0000000000000009
    [ 6525.903760] Call trace:
    [ 6525.906202] bpf_prog_c3d01833289b6311_F+0xc8/0x9f4
    [ 6525.911076] bpf_prog_d53bb52e3f4483f9_F+0x38/0xc8c
    [ 6525.915957] bpf_dispatcher_xdp_func+0x14/0x20
    [ 6525.920398] bpf_test_run+0x70/0x1b0
    [ 6525.923969] bpf_prog_test_run_xdp+0xec/0x190
    [ 6525.928326] __do_sys_bpf+0xc88/0x1b28
    [ 6525.932072] __arm64_sys_bpf+0x24/0x30
    [ 6525.935820] el0_svc_common.constprop.0+0x70/0x168
    [ 6525.940607] do_el0_svc+0x28/0x88
    [ 6525.943920] el0_sync_handler+0x88/0x190
    [ 6525.947838] el0_sync+0x140/0x180
    [ 6525.951154] Code: d4202000 d4202000 d4202000 d4202000 (d4202000)
    [ 6525.957249] ---[ end trace cecc3f93b14927e2 ]---

    The reason is the way the offset[] array is created and later used
    while building the eBPF body. The code currently omits the first
    instruction, since build_insn() increases ctx->idx before saving it.
    That was fine until bounded eBPF loops were introduced. Since then,
    offset[0] must be the offset of the end of the prologue, which is the
    start of the 1st insn, while offset[n] holds the offset of the end of
    the n-th insn.

    When "taken loop with back jump to 1st insn" test runs, it will
    eventually call bpf2a64_offset(-1, 2, ctx). Since negative indexing is
    permitted, the current outcome depends on the value stored in
    ctx->offset[-1], which has nothing to do with our array.
    If the value happens to be 0 the tests will work. If not this error
    triggers.

    commit 7c2e988f400e ("bpf: fix x64 JIT code generation for jmp to 1st insn")
    fixed an identical bug on x86 when eBPF bounded loops were introduced.

    So let's fix it by creating ctx->offset[] differently: track the
    beginning of each instruction and account for the extra instruction while
    calculating the arm instruction offsets.
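
    A small user-space sketch of the indexing scheme (values and names here are
    illustrative, not the kernel patch itself): offsets record where each eBPF
    instruction starts, so a branch back to the 1st insn resolves through
    offset[0] instead of reading outside the array.

    #include <stdio.h>

    #define PROLOGUE_LEN 4          /* arm64 insns emitted before the body (example) */

    int main(void)
    {
            int len[3] = { 2, 1, 3 };   /* arm64 insns generated per eBPF insn (example) */
            int offset[4];              /* start of each eBPF insn, plus one end marker */
            int idx = PROLOGUE_LEN, i;

            for (i = 0; i < 3; i++) {
                    offset[i] = idx;    /* offset[0] == end of prologue == start of insn 0 */
                    idx += len[i];
            }
            offset[i] = idx;            /* end of the last instruction */

            /*
             * An eBPF branch at insn 2 with off = -3 targets insn 2 + (-3) + 1 = 0.
             * With "start of insn" offsets the target is simply offset[0]; with the
             * old "end of insn" scheme the same lookup needed offset[-1].
             */
            int from = 2, off = -3;
            printf("arm64 offset of branch target: %d\n", offset[from + off + 1]);
            return 0;
    }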

    Fixes: 2589726d12a1 ("bpf: introduce bounded loops")
    Reported-by: Naresh Kamboju
    Reported-by: Jiri Olsa
    Co-developed-by: Jean-Philippe Brucker
    Co-developed-by: Yauheni Kaliuta
    Signed-off-by: Jean-Philippe Brucker
    Signed-off-by: Yauheni Kaliuta
    Signed-off-by: Ilias Apalodimas
    Acked-by: Will Deacon
    Link: https://lore.kernel.org/r/20200917084925.177348-1-ilias.apalodimas@linaro.org
    Signed-off-by: Catalin Marinas

    Ilias Apalodimas
     

31 Jul, 2020

1 commit

  • When a tracing BPF program attempts to read memory without using the
    bpf_probe_read() helper, the verifier marks the load instruction with
    the BPF_PROBE_MEM flag. Since the arm64 JIT does not currently recognize
    this flag, it falls back to the interpreter.

    Add support for BPF_PROBE_MEM by appending an exception table to the
    BPF program. If the load instruction causes a data abort, the fixup
    infrastructure finds the exception table and fixes up the fault by
    clearing the destination register and jumping over the faulting
    instruction.

    To keep the compact exception table entry format, inspect the pc in
    fixup_exception(). A more generic solution would add a "handler" field
    to the table entry, like on x86 and s390.
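
    As an illustration only (the struct fields and the toy register array are
    assumptions, not the arm64 data structures), the fixup idea can be sketched
    in plain C:

    #include <stdint.h>
    #include <stdio.h>

    struct ex_entry {
            uintptr_t insn;         /* address of the potentially faulting load */
            uintptr_t fixup;        /* where to resume after the fault */
            int dst_reg;            /* destination register to clear */
    };

    /* On a data abort, look up the faulting pc; return the new pc, or 0. */
    static uintptr_t fixup_fault(uintptr_t pc, uint64_t *regs,
                                 const struct ex_entry *tbl, int n)
    {
            for (int i = 0; i < n; i++) {
                    if (tbl[i].insn == pc) {
                            regs[tbl[i].dst_reg] = 0;  /* the load "returns" zero */
                            return tbl[i].fixup;       /* skip the faulting insn */
                    }
            }
            return 0;                                  /* no entry: genuine oops */
    }

    int main(void)
    {
            uint64_t regs[32] = { 0 };
            struct ex_entry tbl[] = { { 0x1000, 0x1004, 7 } };

            printf("resume at %#lx\n", (unsigned long)fixup_fault(0x1000, regs, tbl, 1));
            return 0;
    }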

    Signed-off-by: Jean-Philippe Brucker
    Signed-off-by: Daniel Borkmann
    Acked-by: Song Liu
    Link: https://lore.kernel.org/bpf/20200728152122.1292756-2-jean-philippe@linaro.org

    Jean-Philippe Brucker
     

29 May, 2020

1 commit

  • Support for Branch Target Identification (BTI) in user and kernel
    (Mark Brown and others)
    * for-next/bti: (39 commits)
    arm64: vdso: Fix CFI directives in sigreturn trampoline
    arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
    arm64: bti: Fix support for userspace only BTI
    arm64: kconfig: Update and comment GCC version check for kernel BTI
    arm64: vdso: Map the vDSO text with guarded pages when built for BTI
    arm64: vdso: Force the vDSO to be linked as BTI when built for BTI
    arm64: vdso: Annotate for BTI
    arm64: asm: Provide a mechanism for generating ELF note for BTI
    arm64: bti: Provide Kconfig for kernel mode BTI
    arm64: mm: Mark executable text as guarded pages
    arm64: bpf: Annotate JITed code for BTI
    arm64: Set GP bit in kernel page tables to enable BTI for the kernel
    arm64: asm: Override SYM_FUNC_START when building the kernel with BTI
    arm64: bti: Support building kernel C code using BTI
    arm64: Document why we enable PAC support for leaf functions
    arm64: insn: Report PAC and BTI instructions as skippable
    arm64: insn: Don't assume unrecognized HINTs are skippable
    arm64: insn: Provide a better name for aarch64_insn_is_nop()
    arm64: insn: Add constants for new HINT instruction decode
    arm64: Disable old style assembly annotations
    ...

    Will Deacon
     

11 May, 2020

2 commits

  • The current code for BPF_{ADD,SUB} BPF_K loads the BPF immediate to a
    temporary register before performing the addition/subtraction. Similarly,
    BPF_JMP BPF_K cases load the immediate to a temporary register before
    comparison.

    This patch introduces optimizations that use arm64 immediate add, sub,
    cmn, or cmp instructions when the BPF immediate fits. If the immediate
    does not fit, it falls back to using a temporary register.

    Example of generated code for BPF_ALU64_IMM(BPF_ADD, R0, 2):

    without optimization:

    24: mov x10, #0x2
    28: add x7, x7, x10

    with optimization:

    24: add x7, x7, #0x2

    The code could use A64_{ADD,SUB}_I directly and check if it returns
    AARCH64_BREAK_FAULT, similar to how logical immediates are handled.
    However, aarch64_insn_gen_add_sub_imm from insn.c prints error messages
    when the immediate does not fit, and it's simpler to check if the
    immediate fits ahead of time.
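
    The fits-or-not check itself is small. A user-space sketch (assuming the A64
    ADD/SUB immediate form: a 12-bit value, optionally shifted left by 12):

    #include <stdbool.h>
    #include <stdio.h>

    /* True if imm is a plain 12-bit value or a 12-bit value shifted left by 12. */
    static bool is_addsub_imm(unsigned int imm)
    {
            return !(imm & ~0xfffU) || !(imm & ~0xfff000U);
    }

    int main(void)
    {
            printf("%d %d %d\n",
                   is_addsub_imm(0x2),       /* fits: plain 12-bit */
                   is_addsub_imm(0xabc000),  /* fits: 12-bit << 12 */
                   is_addsub_imm(0x123456)); /* does not fit */
            return 0;
    }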

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Acked-by: Daniel Borkmann
    Link: https://lore.kernel.org/r/20200508181547.24783-4-luke.r.nels@gmail.com
    Signed-off-by: Will Deacon

    Luke Nelson
     
  • The current code for BPF_{AND,OR,XOR,JSET} BPF_K loads the immediate to
    a temporary register before use.

    This patch changes the code to avoid using a temporary register
    when the BPF immediate is encodable using an arm64 logical immediate
    instruction. If the encoding fails (due to the immediate not being
    encodable), it falls back to using a temporary register.

    Example of generated code for BPF_ALU32_IMM(BPF_AND, R0, 0x80000001):

    without optimization:

    24: mov w10, #0x8000ffff
    28: movk w10, #0x1
    2c: and w7, w7, w10

    with optimization:

    24: and w7, w7, #0x80000001

    Since the encoding process is quite complex, the JIT reuses existing
    functionality in arch/arm64/kernel/insn.c for encoding logical immediates
    rather than duplicating it in the JIT.
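
    For illustration only (the kernel reuses insn.c rather than open-coding
    this), the encodability test can be re-derived in standalone C: a value is a
    valid A64 logical immediate if it is a single, possibly rotated, run of ones
    replicated in 2/4/8/16/32/64-bit elements, excluding 0 and all-ones.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Rotate the low e bits of x right by one position. */
    static uint64_t ror1(uint64_t x, unsigned e)
    {
            uint64_t mask = (e == 64) ? ~0ULL : (1ULL << e) - 1;

            x &= mask;
            return ((x >> 1) | (x << (e - 1))) & mask;
    }

    static bool is_logical_imm(uint64_t v)
    {
            if (v == 0 || v == ~0ULL)
                    return false;

            for (unsigned e = 2; e <= 64; e <<= 1) {
                    uint64_t mask = (e == 64) ? ~0ULL : (1ULL << e) - 1;
                    uint64_t elem = v & mask;
                    bool repeats = true;

                    for (unsigned i = e; i < 64; i += e)
                            if (((v >> i) & mask) != elem)
                                    repeats = false;
                    if (!repeats)
                            continue;
                    /* one circular run of ones <=> exactly two 0/1 boundaries */
                    return __builtin_popcountll(elem ^ ror1(elem, e)) == 2;
            }
            return false;
    }

    int main(void)
    {
            uint64_t and_imm = 0x80000001ULL;  /* the 32-bit immediate from the example,
                                                  replicated to 64 bits below */

            printf("%d %d\n",
                   is_logical_imm(and_imm << 32 | and_imm),  /* encodable: single AND */
                   is_logical_imm(0x1234));                  /* not encodable: use tmp */
            return 0;
    }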

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Acked-by: Daniel Borkmann
    Link: https://lore.kernel.org/r/20200508181547.24783-3-luke.r.nels@gmail.com
    Signed-off-by: Will Deacon

    Luke Nelson
     

08 May, 2020

1 commit

  • In order to extend the protection offered by BTI to all code executing in
    kernel mode we need to annotate JITed BPF code appropriately for BTI. To
    do this we need to add a landing pad to the start of each BPF function and
    also immediately after the function prologue if we are emitting a function
    which can be tail called. Jumps within BPF functions are all to immediate
    offsets and therefore do not require landing pads.
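
    A rough sketch of where the landing pads go (the macro and helper names are
    illustrative of the arm64 JIT's style, not quoted from the patch):

    static void build_prologue(struct jit_ctx *ctx, bool tail_call_target)
    {
            emit(A64_BTI_C, ctx);           /* landing pad for the indirect entry call */

            /* ... save callee-saved registers, set up the frame pointer ... */

            if (tail_call_target)
                    emit(A64_BTI_J, ctx);   /* tail calls branch to just past the prologue */
    }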

    Signed-off-by: Mark Brown
    Reviewed-by: Catalin Marinas
    Link: https://lore.kernel.org/r/20200506195138.22086-6-broonie@kernel.org
    Signed-off-by: Will Deacon

    Mark Brown
     

03 Sep, 2019

1 commit


09 Jul, 2019

1 commit

  • Pull arm64 updates from Catalin Marinas:

    - arm64 support for syscall emulation via PTRACE_SYSEMU{,_SINGLESTEP}

    - Wire up VM_FLUSH_RESET_PERMS for arm64, allowing the core code to
    manage the permissions of executable vmalloc regions more strictly

    - Slight performance improvement by keeping softirqs enabled while
    touching the FPSIMD/SVE state (kernel_neon_begin/end)

    - Expose a couple of ARMv8.5 features to user (HWCAP): CondM (new
    XAFLAG and AXFLAG instructions for floating point comparison flags
    manipulation) and FRINT (rounding floating point numbers to integers)

    - Re-instate ARM64_PSEUDO_NMI support which was previously marked as
    BROKEN due to some bugs (now fixed)

    - Improve parking of stopped CPUs and implement an arm64-specific
    panic_smp_self_stop() to avoid warning on not being able to stop
    secondary CPUs during panic

    - perf: enable the ARM Statistical Profiling Extensions (SPE) on ACPI
    platforms

    - perf: DDR performance monitor support for iMX8QXP

    - cache_line_size() can now be set from DT or ACPI/PPTT if provided to
    cope with a system cache info not exposed via the CPUID registers

    - Avoid warning on hardware cache line size greater than
    ARCH_DMA_MINALIGN if the system is fully coherent

    - arm64 do_page_fault() and hugetlb cleanups

    - Refactor set_pte_at() to avoid redundant READ_ONCE(*ptep)

    - Ignore ACPI 5.1 FADTs reported as 5.0 (infer from the
    'arm_boot_flags' introduced in 5.1)

    - CONFIG_RANDOMIZE_BASE now enabled in defconfig

    - Allow the selection of ARM64_MODULE_PLTS, currently only done via
    RANDOMIZE_BASE (and an erratum workaround), allowing modules to spill
    over into the vmalloc area

    - Make ZONE_DMA32 configurable

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (54 commits)
    perf: arm_spe: Enable ACPI/Platform automatic module loading
    arm_pmu: acpi: spe: Add initial MADT/SPE probing
    ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
    ACPI/PPTT: Modify node flag detection to find last IDENTICAL
    x86/entry: Simplify _TIF_SYSCALL_EMU handling
    arm64: rename dump_instr as dump_kernel_instr
    arm64/mm: Drop [PTE|PMD]_TYPE_FAULT
    arm64: Implement panic_smp_self_stop()
    arm64: Improve parking of stopped CPUs
    arm64: Expose FRINT capabilities to userspace
    arm64: Expose ARMv8.5 CondM capability to userspace
    arm64: defconfig: enable CONFIG_RANDOMIZE_BASE
    arm64: ARM64_MODULES_PLTS must depend on MODULES
    arm64: bpf: do not allocate executable memory
    arm64/kprobes: set VM_FLUSH_RESET_PERMS on kprobe instruction pages
    arm64/mm: wire up CONFIG_ARCH_HAS_SET_DIRECT_MAP
    arm64: module: create module allocations without exec permissions
    arm64: Allow user selection of ARM64_MODULE_PLTS
    acpi/arm64: ignore 5.1 FADTs that are reported as 5.0
    arm64: Allow selecting Pseudo-NMI again
    ...

    Linus Torvalds
     

25 Jun, 2019

1 commit

  • The BPF code now takes care of mapping the code pages executable
    after mapping them read-only, to ensure that no RWX mapped regions
    are needed, even transiently. This means we can drop the executable
    permissions from the mapping at allocation time.

    Acked-by: Will Deacon
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Catalin Marinas

    Ard Biesheuvel
     

19 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation this program is
    distributed in the hope that it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with this program if not see http www gnu org
    licenses

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 503 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Alexios Zavras
    Reviewed-by: Allison Randal
    Reviewed-by: Enrico Weigelt
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 May, 2019

1 commit


27 Apr, 2019

2 commits

  • Since the ARMv8.1 supplement introduced LSE atomic instructions back in
    2016, let's add support for STADD and use that in favor of an LDXR/STXR loop for
    the XADD mapping if available. STADD is encoded as an alias for LDADD with
    XZR as the destination register, therefore add LDADD to the instruction
    encoder along with STADD as special case and use it in the JIT for CPUs
    that advertise LSE atomics in CPUID register. If immediate offset in the
    BPF XADD insn is 0, then use dst register directly instead of temporary
    one.
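
    A hedged sketch of the emission choice (cpu_has_lse_atomics() is a
    hypothetical stand-in for the CPUID-based capability check; the A64_*
    macros follow the arm64 JIT's naming style):

    if (cpu_has_lse_atomics()) {
            /* LSE: a single store-add, no loop and no old-value temporary */
            emit(A64_STADD(isdw, dst_addr, src), ctx);
    } else {
            /* LL/SC fallback: exclusive load, add, exclusive store, retry */
            emit(A64_LDXR(isdw, tmp, dst_addr), ctx);
            emit(A64_ADD(isdw, tmp, tmp, src), ctx);
            emit(A64_STXR(isdw, tmp, dst_addr, tmp3), ctx);
            emit(A64_CBNZ(0, tmp3, -3), ctx);  /* branch back to the LDXR */
    }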

    Signed-off-by: Daniel Borkmann
    Acked-by: Jean-Philippe Brucker
    Acked-by: Will Deacon
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     
  • Prefetch-with-intent-to-write is currently part of the XADD mapping in
    the AArch64 JIT and follows the kernel's implementation of atomic_add.
    This may interfere with other threads executing the LDXR/STXR loop,
    leading to potential starvation and fairness issues. Drop the optional
    prefetch instruction.

    Fixes: 85f68fe89832 ("bpf, arm64: implement jiting of BPF_XADD")
    Reported-by: Will Deacon
    Signed-off-by: Daniel Borkmann
    Acked-by: Jean-Philippe Brucker
    Acked-by: Will Deacon
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

27 Jan, 2019

1 commit


12 Dec, 2018

1 commit


05 Dec, 2018

1 commit

  • The arm64 module region is a 128 MB region that is kept close to
    the core kernel, in order to ensure that relative branches are
    always in range. So using the same region for programs that do
    not have this restriction is wasteful, and preferably avoided.

    Now that the core BPF JIT code permits the alloc/free routines to
    be overridden, implement them by vmalloc()/vfree() calls from a
    dedicated 128 MB region set aside for BPF programs. This ensures
    that BPF programs are still in branching range of each other, which
    is something the JIT currently depends upon (and is not guaranteed
    when using module_alloc() on KASLR kernels like we do currently).
    It also ensures that placement of BPF programs does not correlate
    with the placement of the core kernel or modules, making it less
    likely that leaking the former will reveal the latter.

    This also solves an issue under KASAN, where shadow memory is
    needlessly allocated for all BPF programs (which don't require KASAN
    shadow pages since they are not KASAN instrumented).
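
    A sketch of the overridden hooks (the region macros and the exact
    __vmalloc_node_range() argument list are assumptions about that kernel
    version, not a verbatim quote of the patch):

    void *bpf_jit_alloc_exec(unsigned long size)
    {
            /* allocate from the dedicated BPF region instead of module space */
            return __vmalloc_node_range(size, PAGE_SIZE,
                                        BPF_JIT_REGION_START, BPF_JIT_REGION_END,
                                        GFP_KERNEL, PAGE_KERNEL_EXEC, 0,
                                        NUMA_NO_NODE, __builtin_return_address(0));
    }

    void bpf_jit_free_exec(void *addr)
    {
            return vfree(addr);
    }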

    Signed-off-by: Ard Biesheuvel
    Acked-by: Will Deacon
    Signed-off-by: Daniel Borkmann

    Ard Biesheuvel
     

30 Nov, 2018

1 commit

  • On arm64, all executable code is guaranteed to reside in the vmalloc
    space (or the module space), and so jump targets will only use 48
    bits at most, with the remaining upper bits guaranteed to be set to 1.

    This means we can generate an immediate jump address using a sequence
    of one MOVN (move wide negated) and two MOVK instructions, where the
    first one sets the lower 16 bits but also sets all top bits to 1.
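
    A small user-space simulation of the MOVN/MOVK semantics, showing that an
    address whose top 16 bits are all ones round-trips through that
    three-instruction sequence (the example address is arbitrary):

    #include <stdint.h>
    #include <stdio.h>

    /* MOVN: reg = ~(imm16 << shift); MOVK: replace 16 bits at 'shift' with imm16. */
    static uint64_t movn(uint64_t imm16, unsigned shift)
    {
            return ~(imm16 << shift);
    }

    static uint64_t movk(uint64_t reg, uint64_t imm16, unsigned shift)
    {
            return (reg & ~(0xffffULL << shift)) | (imm16 << shift);
    }

    int main(void)
    {
            uint64_t target = 0xffff987654321abcULL;  /* arbitrary vmalloc-range address */
            uint64_t r;

            r = movn(~target & 0xffff, 0);            /* low 16 bits set, all others 1 */
            r = movk(r, (target >> 16) & 0xffff, 16);
            r = movk(r, (target >> 32) & 0xffff, 32);

            printf("match: %d\n", r == target);       /* bits [63:48] were already ones */
            return 0;
    }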

    Signed-off-by: Ard Biesheuvel
    Acked-by: Will Deacon
    Acked-by: Daniel Borkmann
    Signed-off-by: Daniel Borkmann

    Ard Biesheuvel
     

27 Nov, 2018

1 commit

  • The arm64 JIT has the same issue as the ppc64 JIT in that the relative BPF
    to BPF call offset can be too far away from the core kernel, so that the
    relative encoding into imm is not sufficient and could potentially be
    truncated; see also fd045f6cd98e ("arm64: add support for module PLTs"), which adds
    spill-over space for module_alloc() and therefore bpf_jit_binary_alloc().
    Therefore, use the recently added bpf_jit_get_func_addr() helper for
    properly fetching the address through prog->aux->func[off]->bpf_func
    instead. This also has the benefit of optimizing normal helper calls since
    their address can use the optimized emission. Tested on Cavium ThunderX
    CN8890.

    Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs")
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

15 May, 2018

3 commits

  • We can trivially save 4 bytes in the prologue for cBPF since tail calls
    can never be used from there. The register push/pop is pairwise,
    here x25 (fp) and x26 (tcc), so there is no point in changing that; only
    the reset to zero is not needed.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     
  • Improve the JIT to emit 64 and 32 bit immediates. The current
    algorithm is not optimal and we often emit more instructions
    than actually needed. arm64 has movz, movn, movk variants, but
    for the current 64 bit immediates we only use movz with a
    series of movk when needed.

    For example loading ffffffffffffabab emits the following 4
    instructions in the JIT today:

    * movz: abab, shift: 0, result: 000000000000abab
    * movk: ffff, shift: 16, result: 00000000ffffabab
    * movk: ffff, shift: 32, result: 0000ffffffffabab
    * movk: ffff, shift: 48, result: ffffffffffffabab

    Whereas after the patch the same load only needs a single
    instruction:

    * movn: 5454, shift: 0, result: ffffffffffffabab

    Another example where two extra instructions can be saved:

    * movz: abab, shift: 0, result: 000000000000abab
    * movk: 1f2f, shift: 16, result: 000000001f2fabab
    * movk: ffff, shift: 32, result: 0000ffff1f2fabab
    * movk: ffff, shift: 48, result: ffffffff1f2fabab

    After the patch:

    * movn: e0d0, shift: 16, result: ffffffff1f2fffff
    * movk: abab, shift: 0, result: ffffffff1f2fabab

    Another example with movz, before:

    * movz: 0000, shift: 0, result: 0000000000000000
    * movk: fea0, shift: 32, result: 0000fea000000000

    After:

    * movz: fea0, shift: 32, result: 0000fea000000000

    Moreover, reuse emit_a64_mov_i() for 32 bit immediates that
    are loaded via emit_a64_mov_i64() which is a similar optimization
    as done in 6fe8b9c1f41d ("bpf, x64: save several bytes by using
    mov over movabsq when possible"). On arm64, the latter allows to
    use a single instruction with movn due to zero extension where
    otherwise two would be needed. And last but not least, add a
    missing optimization in emit_a64_mov_i() where movn is used but
    the subsequent movk is not needed. With some of the Cilium programs
    in use, this shrinks the needed instructions by about three
    percent. Tested on Cavium ThunderX CN8890.
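
    A user-space sketch of the selection idea (a simplification of the real
    heuristic): start from MOVN when more 16-bit chunks are 0xffff than 0x0000,
    otherwise start from MOVZ, so identical chunks come for free and fewer
    MOVKs are needed.

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
            uint64_t vals[] = { 0xffffffffffffababULL, 0xffffffff1f2fababULL,
                                0x0000fea000000000ULL };

            for (int v = 0; v < 3; v++) {
                    int ones = 0, zeros = 0;

                    for (int s = 0; s < 64; s += 16) {
                            uint16_t chunk = vals[v] >> s;

                            ones  += (chunk == 0xffff);
                            zeros += (chunk == 0x0000);
                    }
                    /* the base insn covers one chunk; matching chunks need no movk */
                    printf("%#018llx: %s + %d movk\n", (unsigned long long)vals[v],
                           ones > zeros ? "movn" : "movz",
                           4 - 1 - (ones > zeros ? ones : zeros));
            }
            return 0;
    }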

    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     
  • Follow-up to 816d9ef32a8b ("bpf, arm64: remove ld_abs/ld_ind") in
    that the extra 4 byte JIT scratchpad is not needed anymore, since it
    was only used by ld_abs/ld_ind as a stack buffer for bpf_load_pointer().

    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

04 May, 2018

1 commit

  • Since LD_ABS/LD_IND instructions are now removed from the core and
    reimplemented through a combination of inlined BPF instructions and
    a slow-path helper, we can get rid of the complexity in the arm64 JIT.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

23 Feb, 2018

1 commit

  • I recently noticed a crash on arm64 when feeding a bogus index
    into the BPF tail call helper. The crash would not occur when the
    interpreter is used, but only with the JIT. Output looks as
    follows:

    [ 347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
    [...]
    [ 347.043065] [fffb850e96492510] address between user and kernel address ranges
    [ 347.050205] Internal error: Oops: 96000004 [#1] SMP
    [...]
    [ 347.190829] x13: 0000000000000000 x12: 0000000000000000
    [ 347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
    [ 347.201427] x9 : 0000000000000000 x8 : 0000000000000000
    [ 347.206726] x7 : 0000000000000000 x6 : 001c991738000000
    [ 347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
    [ 347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
    [ 347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
    [ 347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
    [ 347.235221] Call trace:
    [ 347.237656] 0xffff000002f3a4fc
    [ 347.240784] bpf_test_run+0x78/0xf8
    [ 347.244260] bpf_prog_test_run_skb+0x148/0x230
    [ 347.248694] SyS_bpf+0x77c/0x1110
    [ 347.251999] el0_svc_naked+0x30/0x34
    [ 347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
    [...]

    In this case the index used in BPF r3 is the same as in r1
    at the time of the call, meaning we fed a pointer as index;
    here, it had the value 0xffff808fd7cf0500 which sits in x2.

    While I found tail calls to be working in general (also for
    hitting the error cases), I noticed the following in the code
    emission:

    # bpftool p d j i 988
    [...]
    38: ldr w10, [x1,x10]
    3c: cmp w2, w10
    40: b.ge 0x000000000000007c
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

27 Jan, 2018

1 commit


21 Jan, 2018

1 commit

  • Alexei Starovoitov says:

    ====================
    pull-request: bpf-next 2018-01-19

    The following pull-request contains BPF updates for your *net-next* tree.

    The main changes are:

    1) bpf array map HW offload, from Jakub.

    2) support for bpf_get_next_key() for LPM map, from Yonghong.

    3) test_verifier now runs loaded programs, from Alexei.

    4) xdp cpumap monitoring, from Jesper.

    5) variety of tests, cleanups and small x64 JIT optimization, from Daniel.

    6) user space can now retrieve HW JITed program, from Jiong.

    Note there is a minor conflict between Russell's arm32 JIT fixes
    and removal of bpf_jit_enable variable by Daniel which should
    be resolved by keeping Russell's comment and removing that variable.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

20 Jan, 2018

2 commits

  • The BPF verifier conflict was some minor contextual issue.

    The TUN conflict was less trivial. Cong Wang fixed a memory leak of
    tfile->tx_array in 'net'. This is an skb_array. But meanwhile in
    net-next tun changed tfile->tx_array into tfile->tx_ring, which is a
    ptr_ring.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Having a pure_initcall() callback just to permanently enable BPF
    JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
    a small race window in the future where the JIT is still disabled on boot.
    Since we know about the setting at compilation time anyway, just
    initialize it properly there. Also consolidate all the individual
    bpf_jit_enable variables into a single one and move them under one
    location. Moreover, don't allow for setting unspecified garbage
    values on them.
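
    The consolidated knob then boils down to a compile-time initializer rather
    than an initcall; roughly (a sketch of the idea, not the full diff):

    /* Enabled at build time when CONFIG_BPF_JIT_ALWAYS_ON is set; no initcall,
     * hence no boot-time window where the JIT is still off. */
    int bpf_jit_enable __read_mostly = IS_BUILTIN(CONFIG_BPF_JIT_ALWAYS_ON);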

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

17 Jan, 2018

1 commit

  • Using dynamic stack_depth tracking in the arm64 JIT is currently broken in
    combination with tail calls. In the prologue, we cache ctx->stack_size and
    adjust the SP register to set up the function call stack, and tear it down
    again in the epilogue. The problem is that when doing a tail call, the
    cached ctx->stack_size might not be the same.

    One way to fix the problem with minimal overhead is to re-adjust SP in
    emit_bpf_tail_call() and properly adjust it to the current program's
    ctx->stack_size. Tested on Cavium ThunderX ARMv8.
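
    Sketched in the arm64 JIT's style (illustrative, not the exact diff): the
    tail-call sequence gives the caller's stack back before branching, so the
    program being entered can establish its own stack_size.

    /* end of emit_bpf_tail_call(): */
    emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx); /* undo prologue SUB */
    emit(A64_BR(tmp), ctx);                                   /* jump into target  */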

    Fixes: f1c9eed7f437 ("bpf, arm64: take advantage of stack_depth tracking")
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

19 Dec, 2017

1 commit

  • fix the following issue:
    arch/arm64/net/bpf_jit_comp.c: In function 'bpf_int_jit_compile':
    arch/arm64/net/bpf_jit_comp.c:982:18: error: 'image_size' may be used
    uninitialized in this function [-Werror=maybe-uninitialized]

    Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs")
    Reported-by: Arnd Bergmann
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann

    Alexei Starovoitov
     

18 Dec, 2017

2 commits

  • Similar to x64, add support for bpf-to-bpf calls.
    When a program has calls to in-kernel helpers, the target call offset
    is known at JIT time and the arm64 architecture needs 2 passes.
    With bpf-to-bpf calls the dynamically allocated function start
    is unknown until all functions of the program are JITed.
    Therefore (just like x64) arm64 JIT needs one extra pass over
    the program to emit correct call offsets.

    Implementation detail:
    Avoid being too clever in 64-bit immediate moves and
    always use 4 instructions (instead of 3-4 depending on the address)
    to make sure only one extra pass is needed.
    If some future optimization would make it worthwhile to optimize
    'call 64-bit imm' further, the JIT would need to do 4 passes
    over the program instead of 3 as in this patch.
    For a typical bpf program address the mov needs 3 or 4 insns,
    so an unconditional 4 insns to save an extra pass is a worthy
    trade-off at this state of the JIT.
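
    A sketch of the fixed-length emission (the helper name and the A64_MOVZ/
    A64_MOVK macros follow the JIT's naming style; treat this as illustrative):

    static void emit_addr_mov_i64(int reg, u64 addr, struct jit_ctx *ctx)
    {
            /* always 4 insns, so every JIT pass produces the same code size */
            emit(A64_MOVZ(1, reg,  addr        & 0xffff, 0),  ctx);
            emit(A64_MOVK(1, reg, (addr >> 16) & 0xffff, 16), ctx);
            emit(A64_MOVK(1, reg, (addr >> 32) & 0xffff, 32), ctx);
            emit(A64_MOVK(1, reg, (addr >> 48) & 0xffff, 48), ctx);
    }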

    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: Daniel Borkmann

    Alexei Starovoitov
     
  • The global bpf_jit_enable variable is tested multiple times in JITs,
    blinding and verifier core. A malicious root can try to toggle
    it while loading the programs. This race condition was accounted
    for and there should be no issues, but it's safer to avoid
    it altogether.

    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: Daniel Borkmann

    Alexei Starovoitov
     

10 Aug, 2017

1 commit


06 Jul, 2017

1 commit

  • Pull arm64 updates from Will Deacon:

    - RAS reporting via GHES/APEI (ACPI)

    - Indirect ftrace trampolines for modules

    - Improvements to kernel fault reporting

    - Page poisoning

    - Sigframe cleanups and preparation for SVE context

    - Core dump fixes

    - Sparse fixes (mainly relating to endianness)

    - xgene SoC PMU v3 driver

    - Misc cleanups and non-critical fixes

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (75 commits)
    arm64: fix endianness annotation for 'struct jit_ctx' and friends
    arm64: cpuinfo: constify attribute_group structures.
    arm64: ptrace: Fix incorrect get_user() use in compat_vfp_set()
    arm64: ptrace: Remove redundant overrun check from compat_vfp_set()
    arm64: ptrace: Avoid setting compat FP[SC]R to garbage if get_user fails
    arm64: fix endianness annotation for __apply_alternatives()/get_alt_insn()
    arm64: fix endianness annotation in get_kaslr_seed()
    arm64: add missing conversion to __wsum in ip_fast_csum()
    arm64: fix endianness annotation in acpi_parking_protocol.c
    arm64: use readq() instead of readl() to read 64bit entry_point
    arm64: fix endianness annotation for reloc_insn_movw() & reloc_insn_imm()
    arm64: fix endianness annotation for aarch64_insn_write()
    arm64: fix endianness annotation in aarch64_insn_read()
    arm64: fix endianness annotation in call_undef_hook()
    arm64: fix endianness annotation for debug-monitors.c
    ras: mark stub functions as 'inline'
    arm64: pass endianness info to sparse
    arm64: ftrace: fix !CONFIG_ARM64_MODULE_PLTS kernels
    arm64: signal: Allow expansion of the signal frame
    acpi: apei: check for pending errors when probing GHES entries
    ...

    Linus Torvalds
     

01 Jul, 2017

1 commit

  • struct jit_ctx::image is used to store a pointer to the jitted
    instructions, which are always little-endian. These instructions
    are thus correctly converted from native order to little-endian
    before being stored, but the pointer 'image' is declared as for
    native order values.

    Fix this by declaring the field as __le32* instead of u32*.
    Same for the pointer used in jit_fill_hole() to initialize
    the image.
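
    The resulting declaration and store look roughly like this (the field list
    is trimmed to the relevant members):

    struct jit_ctx {
            int idx;
            __le32 *image;          /* JITed instructions are always little-endian */
            /* ... */
    };

    static inline void emit(const u32 insn, struct jit_ctx *ctx)
    {
            if (ctx->image != NULL)
                    ctx->image[ctx->idx] = cpu_to_le32(insn);  /* convert on store */
            ctx->idx++;
    }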

    Signed-off-by: Luc Van Oostenryck
    Signed-off-by: Will Deacon

    Luc Van Oostenryck
     

15 Jun, 2017

1 commit


12 Jun, 2017

1 commit


08 Jun, 2017

1 commit

  • Will reported that in BPF_XADD we must use a different register in the stxr
    instruction for the status flag, due to otherwise CONSTRAINED UNPREDICTABLE
    behavior per the architecture. The reference manual says [1]:

    If s == t, then one of the following behaviors must occur:

    * The instruction is UNDEFINED.
    * The instruction executes as a NOP.
    * The instruction performs the store to the specified address, but
    the value stored is UNKNOWN.

    Thus, use a different temporary register for the status flag to fix it.

    Disassembly extract from test 226/STX_XADD_DW from test_bpf.ko:

    [...]
    0000003c: c85f7d4b ldxr x11, [x10]
    00000040: 8b07016b add x11, x11, x7
    00000044: c80c7d4b stxr w12, x11, [x10]
    00000048: 35ffffac cbnz w12, 0x0000003c
    [...]

    [1] https://static.docs.arm.com/ddi0487/b/DDI0487B_a_armv8_arm.pdf, p.6132

    Fixes: 85f68fe89832 ("bpf, arm64: implement jiting of BPF_XADD")
    Reported-by: Will Deacon
    Signed-off-by: Daniel Borkmann
    Acked-by: Will Deacon
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

07 Jun, 2017

1 commit


01 Jun, 2017

1 commit


12 May, 2017

1 commit

  • Shubham was recently asking on netdev why in arm64 JIT we don't multiply
    the index for accessing the tail call map by 8. That led me to test
    the arm64 JIT wrt tail calls, and it turned out I got a NULL pointer
    dereference on the tail call.

    The buggy access is at:

    prog = array->ptrs[index];
    if (prog == NULL)
    goto out;

    [...]
    00000060: d2800e0a mov x10, #0x70 // #112
    00000064: f86a682a ldr x10, [x1,x10]
    00000068: f862694b ldr x11, [x10,x2]
    0000006c: b40000ab cbz x11, 0x00000080
    [...]

    The code triggering the crash is f862694b. x1 at the time contains the
    address of the bpf array, x10 offsetof(struct bpf_array, ptrs). Meaning,
    above we load the pointer to the program at map slot 0 into x10. x10
    can then be NULL if the slot is not occupied, which we later on try to
    access with a user given offset in x2 that is the map index.

    Fix this by emitting the following instead:

    [...]
    00000060: d2800e0a mov x10, #0x70 // #112
    00000064: 8b0a002a add x10, x1, x10
    00000068: d37df04b lsl x11, x2, #3
    0000006c: f86b694b ldr x11, [x10,x11]
    00000070: b40000ab cbz x11, 0x00000084
    [...]

    This basically adds the offset of ptrs to the base address of the bpf
    array we got, and we later access the map with an index * 8 offset
    relative to that. The tail call map itself is basically one large area
    with meta data at the head followed by the array of prog pointers.
    This makes tail calls work again, tested on Cavium ThunderX ARMv8.

    Fixes: ddb55992b04d ("arm64: bpf: implement bpf_tail_call() helper")
    Reported-by: Shubham Bansal
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann