04 May, 2020

2 commits

  • This patch adds an optimization that uses the asr immediate instruction
    for BPF_ALU BPF_ARSH BPF_K, rather than loading the immediate into
    a temporary register. This is similar to the existing code for handling
    BPF_ALU BPF_{LSH,RSH} BPF_K. This optimization saves two instructions
    and is more consistent with LSH and RSH.

    Example of the code generated for BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5)
    before the optimization:

    2c: mov r8, #5
    30: mov r9, #0
    34: asr r0, r0, r8

    and after optimization:

    2c: asr r0, r0, #5
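
    The selection logic can be modeled with a small stand-alone C sketch
    (a hypothetical emit_arsh_k() helper; the stub "emitters" just print
    the arm instruction they stand for, while the real encoding lives in
    arch/arm/net/bpf_jit_32.c):

    #include <stdio.h>
    #include <stdint.h>

    static void emit_arsh_k(int rd, uint32_t imm)
    {
        if (imm >= 1 && imm <= 31) {
            /* optimized path: a single asr-immediate instruction */
            printf("asr r%d, r%d, #%u\n", rd, rd, imm);
        } else {
            /* old path: load the immediate into scratch registers */
            printf("mov r8, #%u\n", imm);
            printf("mov r9, #0\n");
            printf("asr r%d, r%d, r8\n", rd, rd);
        }
    }

    int main(void)
    {
        emit_arsh_k(0, 5);      /* prints: asr r0, r0, #5 */
        return 0;
    }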

    Tested on QEMU using lib/test_bpf and test_verifier.

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20200501020210.32294-3-luke.r.nels@gmail.com

    Luke Nelson
  • This patch optimizes the code generated by emit_a32_arsh_r64, which
    handles the BPF_ALU64 BPF_ARSH BPF_X instruction.

    The original code uses a conditional B followed by an unconditional ORR.
    The optimization saves one instruction by removing the B instruction
    and using a conditional ORR (with an inverted condition).

    Example of the code generated for BPF_ALU64_REG(BPF_ARSH, BPF_REG_0,
    BPF_REG_1), before optimization:

    34: rsb ip, r2, #32
    38: subs r9, r2, #32
    3c: lsr lr, r0, r2
    40: orr lr, lr, r1, lsl ip
    44: bmi 0x4c
    48: orr lr, lr, r1, asr r9
    4c: asr ip, r1, r2
    50: mov r0, lr
    54: mov r1, ip

    and after optimization:

    34: rsb ip, r2, #32
    38: subs r9, r2, #32
    3c: lsr lr, r0, r2
    40: orr lr, lr, r1, lsl ip
    44: orrpl lr, lr, r1, asr r9
    48: asr ip, r1, r2
    4c: mov r0, lr
    50: mov r1, ip
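
    Why one conditional ORR suffices can be checked against a portable C
    model of the emitted sequence (a sketch, not kernel code; it assumes
    arm's register-shift semantics, where a logical shift by 32 or more
    yields 0 and an arithmetic shift by 32 or more fills with the sign
    bit, plus an arithmetic >> on signed values, as with gcc/clang):

    #include <stdint.h>
    #include <stdio.h>

    /* 64-bit arithmetic right shift of the pair (hi:lo) by 0 < s < 64. */
    static uint64_t arsh64(uint32_t lo, uint32_t hi, unsigned s)
    {
        uint32_t new_lo, new_hi;

        if (s < 32) {
            /* lsr lr, r0, r2 ; orr lr, lr, r1, lsl ip   (ip = 32 - s);
             * subs gives a negative s - 32, so the orrpl is skipped */
            new_lo = (lo >> s) | (hi << (32 - s));
        } else {
            /* s >= 32: the surviving term is the conditional
             * orrpl lr, lr, r1, asr r9   (r9 = s - 32, non-negative) */
            new_lo = (uint32_t)((int32_t)hi >> (s - 32));
        }
        /* asr ip, r1, r2: shifts of 32..63 fill with the sign bit */
        new_hi = (s < 32) ? (uint32_t)((int32_t)hi >> s)
                          : (uint32_t)((int32_t)hi >> 31);
        return ((uint64_t)new_hi << 32) | new_lo;
    }

    int main(void)
    {
        int64_t v = -0x123456789abcdefLL;

        for (unsigned s = 1; s < 64; s++)
            if (arsh64((uint32_t)v, (uint32_t)((uint64_t)v >> 32), s)
                != (uint64_t)(v >> s))
                printf("mismatch at s=%u\n", s);
        return 0;
    }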

    Tested on QEMU using lib/test_bpf and test_verifier.

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20200501020210.32294-2-luke.r.nels@gmail.com

    Luke Nelson

15 Apr, 2020

1 commit

  • This patch fixes an incorrect check in how immediate memory offsets are
    computed for BPF_DW on arm.

    For BPF_LDX/ST/STX + BPF_DW, the 32-bit arm JIT breaks down an 8-byte
    access into two separate 4-byte accesses using off+0 and off+4. If off
    fits in imm12, the JIT emits a ldr/str instruction with the immediate
    and avoids the use of a temporary register. While the current check
    off <= 0xfff ensures that the first immediate off+0 does not
    overflow imm12, it is not sufficient for the second immediate off+4,
    which can make the second half of the BPF_DW access read or write
    the wrong address. The fix tightens the bound for BPF_DW so that
    off+4 still fits in the immediate field.
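
    A sketch of the corrected bound (a hypothetical helper, simplified
    from what the kernel does; arm's ldr/str apply imm12 with an
    add/subtract bit, hence the symmetric range):

    #include <stdbool.h>
    #include <stdint.h>

    /* A BPF_DW access is split into two 4-byte accesses at off and
     * off + 4, so both displacements must fit the imm12 field; plain
     * word accesses only need off itself to fit. */
    static bool offset_fits_imm12(int32_t off, bool is_dw)
    {
        int32_t max = is_dw ? 0xfff - 4 : 0xfff;

        return -max <= off && off <= max;
    }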

    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20200409221752.28448-1-luke.r.nels@gmail.com

    Luke Nelson

09 Apr, 2020

1 commit

  • The current arm BPF JIT does not correctly compile RSH or ARSH when the
    immediate shift amount is 0. This causes the "rsh64 by 0 imm" and "arsh64
    by 0 imm" BPF selftests to hang the kernel by reaching an instruction
    the verifier determines to be unreachable.

    The root cause is in how immediate right shifts are encoded on arm.
    For LSR and ASR (logical and arithmetic right shift), a bit-pattern
    of 00000 in the immediate encodes a shift amount of 32. When the BPF
    immediate is 0, the generated code shifts by 32 instead of the expected
    behavior (a no-op).

    This patch fixes the bugs by adding an additional check for the case
    where the BPF immediate is 0. After the change, the above-mentioned
    BPF selftests pass.
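
    A minimal sketch of the corrected decision for the logical-shift
    half (a hypothetical emit_rsh_i64() helper with stub printf
    "emitters"; the real code is in arch/arm/net/bpf_jit_32.c):

    #include <stdio.h>
    #include <stdint.h>

    /* 64-bit logical right shift of the pair (rd_hi:rd_lo) by a
     * constant 0 <= imm <= 63. The imm == 0 case must emit no shift at
     * all, because the 5-bit immediate field of lsr/asr encodes 0 as
     * "shift by 32". */
    static void emit_rsh_i64(int rd_hi, int rd_lo, uint32_t imm)
    {
        if (imm == 0) {
            /* shift by 0 is a no-op: the pair is already in place */
        } else if (imm < 32) {
            printf("lsr r%d, r%d, #%u\n", rd_lo, rd_lo, imm);
            printf("orr r%d, r%d, r%d, lsl #%u\n",
                   rd_lo, rd_lo, rd_hi, 32 - imm);
            printf("lsr r%d, r%d, #%u\n", rd_hi, rd_hi, imm);
        } else if (imm == 32) {
            printf("mov r%d, r%d\n", rd_lo, rd_hi);
            printf("mov r%d, #0\n", rd_hi);
        } else {
            printf("lsr r%d, r%d, #%u\n", rd_lo, rd_hi, imm - 32);
            printf("mov r%d, #0\n", rd_hi);
        }
    }

    int main(void)
    {
        emit_rsh_i64(1, 0, 0);  /* prints nothing: shift by 0 is a no-op */
        return 0;
    }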

    Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20200408181229.10909-1-luke.r.nels@gmail.com

    Luke Nelson

11 Dec, 2019

1 commit

  • Improve the prologue code sequence to take advantage of 64-bit
    (strd) stores, changing the code from:

    push {r4, r5, r6, r7, r8, r9, fp, lr}
    mov fp, sp
    sub ip, sp, #80 ; 0x50
    sub sp, sp, #600 ; 0x258
    str ip, [fp, #-100] ; 0xffffff9c
    mov r6, #0
    str r6, [fp, #-96] ; 0xffffffa0
    mov r4, #0
    mov r3, r4
    mov r2, r0
    str r4, [fp, #-104] ; 0xffffff98
    str r4, [fp, #-108] ; 0xffffff94

    to the tighter:

    push {r4, r5, r6, r7, r8, r9, fp, lr}
    mov fp, sp
    mov r3, #0
    sub r2, sp, #80 ; 0x50
    sub sp, sp, #600 ; 0x258
    strd r2, [fp, #-100] ; 0xffffff9c
    mov r2, #0
    strd r2, [fp, #-108] ; 0xffffff94
    mov r2, r0

    resulting in a saving of three instructions; strd stores an adjacent
    even/odd register pair (here r2/r3) in a single instruction.

    Signed-off-by: Russell King
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/E1ieH2g-0004ih-Rb@rmk-PC.armlinux.org.uk

    Russell King

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation version 2 of the license

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 315 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Armijn Hemel
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190531190115.503150771@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner

27 Jan, 2019

1 commit

  • This patch implements code-gen for the new JMP32 instructions on arm.

    For JSET, "ands" (AND with flags updated) is used, so a corresponding
    encoding helper is added.
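
    The semantics that the "ands" implements are simply the following
    (illustrative C, not JIT code):

    #include <stdbool.h>
    #include <stdint.h>

    /* BPF_JSET: branch when (dst & src) != 0. The JMP32 variants look
     * only at the low 32 bits; "ands" sets the Z flag from the AND
     * result and the JIT then branches on ne. */
    static bool jset32_taken(uint32_t dst, uint32_t src)
    {
        return (dst & src) != 0;
    }

    int main(void)
    {
        return jset32_taken(0x3, 0x5) ? 0 : 1;  /* taken: 0x3 & 0x5 != 0 */
    }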

    Cc: Shubham Bansal
    Signed-off-by: Jiong Wang
    Signed-off-by: Alexei Starovoitov

    Jiong Wang

30 Jun, 2018

1 commit

  • Any eBPF JIT whose underlying arch supports ARCH_HAS_SET_MEMORY
    needs to use the bpf_jit_binary_{un,}lock_ro() pair instead of the
    set_memory_{ro,rw}() pair directly, as otherwise changes to the
    former might break. arm32's eBPF conversion missed this change, so
    fix it up here.

    Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann

05 Jun, 2018

2 commits

  • The names for BPF_ALU64 | BPF_ARSH are emit_a32_arsh_*,
    the names for BPF_ALU64 | BPF_LSH are emit_a32_lsh_*, but
    the names for BPF_ALU64 | BPF_RSH are emit_a32_lsr_*.

    For consistency, rename emit_a32_lsr_* to emit_a32_rsh_*.

    This patch also corrects a wrong comment.

    Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
    Signed-off-by: Wang YanQing
    Cc: Shubham Bansal
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux@armlinux.org.uk
    Signed-off-by: Daniel Borkmann

    Wang YanQing
  • imm24 is signed, so the right range is:

    [-(1<<(24 - 1)), (1<<(24 - 1)) - 1]
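
    The same range as a small check (a sketch with a hypothetical
    helper name, mirroring the corrected bound):

    #include <stdbool.h>
    #include <stdint.h>

    /* A signed 24-bit field holds values in [-(1<<23), (1<<23) - 1]. */
    static bool fits_imm24(int32_t imm)
    {
        return imm >= -(1 << 23) && imm <= (1 << 23) - 1;
    }

    int main(void)
    {
        return fits_imm24(-(1 << 23)) && !fits_imm24(1 << 23) ? 0 : 1;
    }
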
    Cc: Shubham Bansal
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux@armlinux.org.uk
    Signed-off-by: Daniel Borkmann

    Wang YanQing

04 May, 2018

1 commit

  • Since LD_ABS/LD_IND instructions are now removed from the core and
    reimplemented through a combination of inlined BPF instructions and
    a slow-path helper, we can get rid of the complexity from arm32 JIT.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann

21 Jan, 2018

1 commit

  • Alexei Starovoitov says:

    ====================
    pull-request: bpf-next 2018-01-19

    The following pull-request contains BPF updates for your *net-next* tree.

    The main changes are:

    1) bpf array map HW offload, from Jakub.

    2) support for bpf_get_next_key() for LPM map, from Yonghong.

    3) test_verifier now runs loaded programs, from Alexei.

    4) xdp cpumap monitoring, from Jesper.

    5) variety of tests, cleanups and small x64 JIT optimization, from Daniel.

    6) user space can now retrieve HW JITed program, from Jiong.

    Note there is a minor conflict between Russell's arm32 JIT fixes
    and the removal of the bpf_jit_enable variable by Daniel, which
    should be resolved by keeping Russell's comment and removing that
    variable.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller

20 Jan, 2018

2 commits

  • The BPF verifier conflict was some minor contextual issue.

    The TUN conflict was less trivial. Cong Wang fixed a memory leak of
    tfile->tx_array in 'net'. This is an skb_array. But meanwhile in
    net-next, tun changed tfile->tx_array into tfile->tx_ring, which is
    a ptr_ring.

    Signed-off-by: David S. Miller

    David S. Miller
  • Having a pure_initcall() callback just to permanently enable BPF
    JITs under CONFIG_BPF_JIT_ALWAYS_ON is unnecessary and could leave
    a small race window in the future where the JIT is still disabled
    on boot.
    Since we know about the setting at compilation time anyway, just
    initialize it properly there. Also consolidate all the individual
    bpf_jit_enable variables into a single one and move them under one
    location. Moreover, don't allow for setting unspecified garbage
    values on them.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann

18 Jan, 2018

3 commits

  • As per 90caccdd8cc0 ("bpf: fix bpf_tail_call() x64 JIT"), the index used
    for array lookup is defined to be 32-bit wide. Update a misleading
    comment that suggests it is 64-bit wide.

    Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
    Signed-off-by: Russell King

    Russell King
  • When the source and destination register are identical, our JIT does not
    generate correct code, which leads to kernel oopses.

    Fix this by (a) generating more efficient code, and (b) making use of
    the temporary earlier if we will overwrite the address register.
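
    A generic illustration of the hazard (plain C, not the kernel's
    code; regs models the emulated 32-bit BPF register halves, and a
    pointer is assumed to fit in one 32-bit register, as on arm):

    #include <stdint.h>

    /* Splitting an 8-byte load into two 4-byte loads breaks when the
     * low destination half aliases the address register. */
    static void ldx_dw_buggy(uint32_t *regs, int dst, int src)
    {
        regs[dst]     = *(uint32_t *)(uintptr_t)regs[src];
        /* if dst == src, regs[src] now holds loaded data, not the address */
        regs[dst + 1] = *(uint32_t *)((uintptr_t)regs[src] + 4);
    }

    static void ldx_dw_fixed(uint32_t *regs, int dst, int src)
    {
        uintptr_t addr = (uintptr_t)regs[src]; /* grab the address first */

        regs[dst]     = *(uint32_t *)addr;
        regs[dst + 1] = *(uint32_t *)(addr + 4);
    }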

    Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
    Signed-off-by: Russell King

    Russell King
  • When an eBPF program tail-calls another eBPF program, it enters it after
    the prologue to avoid complex stack manipulations. This can lead
    to kernel oopses and similar failures.

    Resolve this by always using a fixed stack layout, a CPU register frame
    pointer, and using this when reloading registers before returning.

    Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
    Signed-off-by: Russell King

    Russell King