24 Aug, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants
    with the new pseudo-keyword macro fallthrough [1]. Also remove
    fall-through markings where they are unnecessary.

    [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
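
    As a minimal before/after sketch of the conversion (the switch body
    and names here are hypothetical, not from any converted file):

    /* Before: a comment marks the intentional fall-through. */
    switch (cmd) {
    case CMD_RESET:
            reset_state();
            /* fall through */
    case CMD_START:
            start_engine();
            break;
    }

    /* After: the fallthrough pseudo-keyword, which maps to
     * __attribute__((__fallthrough__)) where the compiler supports it. */
    switch (cmd) {
    case CMD_RESET:
            reset_state();
            fallthrough;
    case CMD_START:
            start_engine();
            break;
    }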

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     

22 Jul, 2020

3 commits

  • This patch uses the RVC support and encodings from bpf_jit.h to
    optimize the RV64 JIT.

    The optimizations work by replacing emit(rv_X(...)) with a call to a
    helper function emit_X, which will emit a compressed version of the
    instruction when possible, and when RVC is enabled.

    The JIT continues to pass all tests in lib/test_bpf.c, and introduces
    no new failures in test_verifier, both with and without RVC enabled.

    Most changes are straightforward replacements of emit(rv_X(...), ctx)
    with emit_X(..., ctx), with the following exceptions worth mentioning:

    * Change emit_imm to sign-extend the value in "lower", since the
    checks for RVC (and the instructions themselves) treat the value as
    signed; see the sketch after this list. Otherwise, small negative
    immediates will not be recognized as encodable using an RVC
    instruction. For example, without this change, emit_imm(rd, -1, ctx)
    would cause lower to become 4095, which is not a 6b int even though a
    "c.li rd, -1" instruction suffices.

    * For {BPF_MOV,BPF_ADD} BPF_X, drop using addiw,addw in the 32-bit
    cases since the values are zero-extended into the upper 32 bits in
    the following instructions anyway, and the addition commutes with
    zero-extension. (BPF_SUB BPF_X must still use subw since subtraction
    does not commute with zero-extension.)
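
    To make the first exception concrete, extracting the low 12 bits of
    an immediate with sign extension can be written as below (a minimal
    sketch; the helper name is hypothetical, kernel s64/u64 types
    assumed):

    /* Shift the low 12 bits to the top of the word, then arithmetic-
     * shift back down, replicating bit 11. For val = -1 this yields -1
     * (a 6b int usable by c.li), whereas a plain (val & 0xfff) mask
     * would yield 4095. */
    static s64 lower_12b_sext(s64 val)
    {
            return (s64)((u64)val << 52) >> 52;
    }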

    This patch avoids optimizing branches and jumps to use RVC instructions
    since surrounding code often makes assumptions about the sizes of
    emitted instructions. Optimizing these will require changing these
    functions (e.g., emit_branch) to dynamically compute jump offsets.

    The following are examples of the JITed code for the verifier selftest
    "direct packet read test#3 for CGROUP_SKB OK", without and with RVC
    enabled, respectively. The former uses 178 bytes, and the latter uses 112,
    for a ~37% reduction in code size for this example.

    Without RVC:

    0: 02000813 addi a6,zero,32
    4: fd010113 addi sp,sp,-48
    8: 02813423 sd s0,40(sp)
    c: 02913023 sd s1,32(sp)
    10: 01213c23 sd s2,24(sp)
    14: 01313823 sd s3,16(sp)
    18: 01413423 sd s4,8(sp)
    1c: 03010413 addi s0,sp,48
    20: 03056683 lwu a3,48(a0)
    24: 02069693 slli a3,a3,0x20
    28: 0206d693 srli a3,a3,0x20
    2c: 03456703 lwu a4,52(a0)
    30: 02071713 slli a4,a4,0x20
    34: 02075713 srli a4,a4,0x20
    38: 03856483 lwu s1,56(a0)
    3c: 02049493 slli s1,s1,0x20
    40: 0204d493 srli s1,s1,0x20
    44: 03c56903 lwu s2,60(a0)
    48: 02091913 slli s2,s2,0x20
    4c: 02095913 srli s2,s2,0x20
    50: 04056983 lwu s3,64(a0)
    54: 02099993 slli s3,s3,0x20
    58: 0209d993 srli s3,s3,0x20
    5c: 09056a03 lwu s4,144(a0)
    60: 020a1a13 slli s4,s4,0x20
    64: 020a5a13 srli s4,s4,0x20
    68: 00900313 addi t1,zero,9
    6c: 006a7463 bgeu s4,t1,0x74
    70: 00000a13 addi s4,zero,0
    74: 02d52823 sw a3,48(a0)
    78: 02e52a23 sw a4,52(a0)
    7c: 02952c23 sw s1,56(a0)
    80: 03252e23 sw s2,60(a0)
    84: 05352023 sw s3,64(a0)
    88: 00000793 addi a5,zero,0
    8c: 02813403 ld s0,40(sp)
    90: 02013483 ld s1,32(sp)
    94: 01813903 ld s2,24(sp)
    98: 01013983 ld s3,16(sp)
    9c: 00813a03 ld s4,8(sp)
    a0: 03010113 addi sp,sp,48
    a4: 00078513 addi a0,a5,0
    a8: 00008067 jalr zero,0(ra)

    With RVC:

    0: 02000813 addi a6,zero,32
    4: 7179 c.addi16sp sp,-48
    6: f422 c.sdsp s0,40(sp)
    8: f026 c.sdsp s1,32(sp)
    a: ec4a c.sdsp s2,24(sp)
    c: e84e c.sdsp s3,16(sp)
    e: e452 c.sdsp s4,8(sp)
    10: 1800 c.addi4spn s0,sp,48
    12: 03056683 lwu a3,48(a0)
    16: 1682 c.slli a3,0x20
    18: 9281 c.srli a3,0x20
    1a: 03456703 lwu a4,52(a0)
    1e: 1702 c.slli a4,0x20
    20: 9301 c.srli a4,0x20
    22: 03856483 lwu s1,56(a0)
    26: 1482 c.slli s1,0x20
    28: 9081 c.srli s1,0x20
    2a: 03c56903 lwu s2,60(a0)
    2e: 1902 c.slli s2,0x20
    30: 02095913 srli s2,s2,0x20
    34: 04056983 lwu s3,64(a0)
    38: 1982 c.slli s3,0x20
    3a: 0209d993 srli s3,s3,0x20
    3e: 09056a03 lwu s4,144(a0)
    42: 1a02 c.slli s4,0x20
    44: 020a5a13 srli s4,s4,0x20
    48: 4325 c.li t1,9
    4a: 006a7363 bgeu s4,t1,0x50
    4e: 4a01 c.li s4,0
    50: d914 c.sw a3,48(a0)
    52: d958 c.sw a4,52(a0)
    54: dd04 c.sw s1,56(a0)
    56: 03252e23 sw s2,60(a0)
    5a: 05352023 sw s3,64(a0)
    5e: 4781 c.li a5,0
    60: 7422 c.ldsp s0,40(sp)
    62: 7482 c.ldsp s1,32(sp)
    64: 6962 c.ldsp s2,24(sp)
    66: 69c2 c.ldsp s3,16(sp)
    68: 6a22 c.ldsp s4,8(sp)
    6a: 6145 c.addi16sp sp,48
    6c: 853e c.mv a0,a5
    6e: 8082 c.jr ra

    Signed-off-by: Luke Nelson
    Signed-off-by: Alexei Starovoitov
    Cc: Björn Töpel
    Link: https://lore.kernel.org/bpf/20200721025241.8077-4-luke.r.nels@gmail.com

    Luke Nelson
     
  • This patch adds functions for encoding and emitting compressed
    RISC-V (RVC) instructions to the BPF JIT.

    Some regular RISC-V instructions can be compressed into an RVC instruction
    if the instruction fields meet some requirements. For example, "add rd,
    rs1, rs2" can be compressed into "c.add rd, rs2" when rd == rs1.

    To make using RVC encodings simpler, this patch also adds helper
    functions that emit either a compressed or a regular instruction,
    preferring the compressed form when the operands allow it.

    For example, emit_add will produce a "c.add" if possible and regular
    "add" otherwise.

    Signed-off-by: Luke Nelson
    Signed-off-by: Alexei Starovoitov
    Link: https://lore.kernel.org/bpf/20200721025241.8077-3-luke.r.nels@gmail.com

    Luke Nelson
     
  • This patch makes the necessary changes to struct rv_jit_context and
    to bpf_int_jit_compile to support compressed RISC-V (RVC)
    instructions in the BPF JIT.

    It changes the JIT image to an array of u16 instead of u32, since
    RVC instructions are 2 bytes as opposed to 4.

    It also changes ctx->offset and ctx->ninsns to refer to 2-byte
    instructions rather than 4-byte ones. The RISC-V PC is required to be
    16-bit aligned with or without RVC, so this is sufficient to refer to
    any valid RISC-V offset.

    The code for computing jump offsets in bytes is updated accordingly,
    and factored into a new "ninsns_rvoff" function to simplify the code.
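
    Since ctx->ninsns now counts 2-byte units, converting an instruction
    count to a byte offset reduces to a single shift; roughly:

    /* Convert a count of 2-byte RVC units into a byte offset. */
    static inline int ninsns_rvoff(int ninsns)
    {
            return ninsns << 1;
    }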

    Signed-off-by: Luke Nelson
    Signed-off-by: Alexei Starovoitov
    Link: https://lore.kernel.org/bpf/20200721025241.8077-2-luke.r.nels@gmail.com

    Luke Nelson
     

06 May, 2020

4 commits

  • This patch optimizes BPF_JSET BPF_K by using a RISC-V andi instruction
    when the BPF immediate fits in 12 bits, instead of first loading the
    immediate to a temporary register.

    Examples of generated code with and without this optimization:

    BPF_JMP_IMM(BPF_JSET, R1, 2, 1) without optimization:

    20: li t1,2
    24: and t1,a0,t1
    28: bnez t1,0x30

    BPF_JMP_IMM(BPF_JSET, R1, 2, 1) with optimization:

    20: andi t1,a0,2
    24: bnez t1,0x2c

    BPF_JMP32_IMM(BPF_JSET, R1, 2, 1) without optimization:

    20: li t1,2
    24: mv t2,a0
    28: slli t2,t2,0x20
    2c: srli t2,t2,0x20
    30: slli t1,t1,0x20
    34: srli t1,t1,0x20
    38: and t1,t2,t1
    3c: bnez t1,0x44

    BPF_JMP32_IMM(BPF_JSET, R1, 2, 1) with optimization:

    20: andi t1,a0,2
    24: bnez t1,0x2c

    In these examples, because the upper 32 bits of the sign-extended
    immediate are 0, BPF_JMP BPF_JSET and BPF_JMP32 BPF_JSET are equivalent
    and therefore the JIT produces identical code for them.
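
    A simplified sketch of the emitter logic, assuming the JIT's
    is_12b_int/emit_imm helpers (the real code also handles the
    BPF_JMP32 zero-extension paths shown above):

    if (is_12b_int(imm)) {
            /* One instruction: andi takes a 12-bit immediate. */
            emit(rv_andi(RV_REG_T1, rd, imm), ctx);
    } else {
            /* Fallback: materialize imm first (up to six insns). */
            emit_imm(RV_REG_T1, imm, ctx);
            emit(rv_and(RV_REG_T1, rd, RV_REG_T1), ctx);
    }
    emit_branch(BPF_JNE, RV_REG_T1, RV_REG_ZERO, rvoff, ctx);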

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Björn Töpel
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20200506000320.28965-5-luke.r.nels@gmail.com

    Luke Nelson
     
  • This patch adds an optimization to BPF_JMP (32- and 64-bit) BPF_K for
    when the BPF immediate is zero.

    When the immediate is zero, the code can directly use the RISC-V zero
    register instead of loading a zero immediate to a temporary register
    first.
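
    Conceptually, the comparison operand is now chosen like this
    (simplified sketch):

    /* Compare against the architectural zero register when imm == 0
     * instead of materializing 0 in a temporary first. */
    if (imm) {
            emit_imm(RV_REG_T1, imm, ctx);
            rs = RV_REG_T1;
    } else {
            rs = RV_REG_ZERO;
    }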

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Björn Töpel
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20200506000320.28965-4-luke.r.nels@gmail.com

    Luke Nelson
     
  • This patch adds two optimizations for BPF_ALU BPF_END BPF_FROM_LE in
    the RV64 BPF JIT.

    First, it enables the verifier zero-extension optimization to avoid zero
    extension when imm == 32. Second, it avoids generating code for imm ==
    64, since it is equivalent to a no-op.
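
    In outline (a simplified sketch; helper names follow the RV64
    emitter, exact details may differ):

    case BPF_ALU | BPF_END | BPF_FROM_LE:
            switch (imm) {
            case 16:
                    emit(rv_slli(rd, rd, 48), ctx); /* keep low 16 bits */
                    emit(rv_srli(rd, rd, 48), ctx);
                    break;
            case 32:
                    if (!aux->verifier_zext) /* skip if verifier zexts */
                            emit_zext_32(rd, ctx);
                    break;
            case 64:
                    break; /* no-op on a little-endian target */
            }
            break;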

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Björn Töpel
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20200506000320.28965-3-luke.r.nels@gmail.com

    Luke Nelson
     
  • Commit 66d0d5a854a6 ("riscv: bpf: eliminate zero extension code-gen")
    added support for the verifier zero-extension optimization on RV64 and
    commit 46dd3d7d287b ("bpf, riscv: Enable zext optimization for more
    RV64G ALU ops") enabled it for more instruction cases.

    However, BPF_LSH BPF_X and BPF_{LSH,RSH,ARSH} BPF_K are still missing
    the optimization.

    This patch enables the zero-extension optimization for these remaining
    cases.
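
    The pattern, sketched for one of the cases (simplified; the real
    code folds the 32- and 64-bit variants together):

    case BPF_ALU | BPF_LSH | BPF_X: /* dst <<= src, 32-bit */
            emit(rv_sllw(rd, rd, rs), ctx);
            if (!aux->verifier_zext) /* zext only when still needed */
                    emit_zext_32(rd, ctx);
            break;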

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Björn Töpel
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20200506000320.28965-2-luke.r.nels@gmail.com

    Luke Nelson
     

30 Apr, 2020

1 commit

  • This patch fixes issues with stackframe unwinding and alignment in the
    current stack layout for BPF programs on RV32.

    In the current layout, RV32 fp points to the JIT scratch registers, rather
    than to the callee-saved registers. This breaks stackframe unwinding,
    which expects fp to point just above the saved ra and fp registers.

    This patch fixes the issue by moving the callee-saved registers to be
    stored on the top of the stack, pointed to by fp. This satisfies the
    assumptions of stackframe unwinding.

    This patch also fixes an issue with the old layout: the stack was
    not aligned to 16 bytes.

    Stacktrace from JITed code using the old stack layout:

    [ 12.196249 ] [] walk_stackframe+0x0/0x96

    Stacktrace using the new stack layout:

    [ 13.062888 ] [] walk_stackframe+0x0/0x96
    [ 13.063028 ] [] show_stack+0x28/0x32
    [ 13.063253 ] [] bpf_prog_82b916b2dfa00464+0x80/0x908
    [ 13.063417 ] [] bpf_test_run+0x124/0x39a
    [ 13.063553 ] [] bpf_prog_test_run_skb+0x234/0x448
    [ 13.063704 ] [] __do_sys_bpf+0x766/0x13b4
    [ 13.063840 ] [] sys_bpf+0xc/0x14
    [ 13.063961 ] [] ret_from_syscall+0x0/0x2

    The new code is also simpler to understand and includes an ASCII diagram
    of the stack layout.
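
    The new layout looks roughly as follows (adapted from the
    description above; field widths and names are indicative):

    high
    RV32 fp =>  +----------+
                | saved ra |
                | saved fp |  RV32 callee-saved registers
                |   ...    |
                +----------+
                |  hi(R6)  |
                |  lo(R6)  |  JIT scratch space for stacked
                |   ...    |  BPF registers
    BPF fp  =>  +----------+
                |   ...    |  BPF program stack
    RV32 sp =>  +----------+
                |   ...    |  Function call stack
                +----------+
    low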

    Tested on riscv32 QEMU virt machine.

    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Acked-by: Xi Wang
    Link: https://lore.kernel.org/bpf/20200430005127.2205-1-luke.r.nels@gmail.com

    Luke Nelson
     

26 Apr, 2020

1 commit

  • This patch fixes an off-by-one error in the RV32 JIT's handling of
    BPF tail calls. Currently, the code decrements TCC before checking
    whether it is less than zero. This limits the maximum number of tail
    calls to 32, instead of the 33 allowed by other JITs. The fix is to
    instead check the old value of TCC before decrementing, as sketched
    below.
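
    In pseudo-C, the control-flow change is:

    /* Before (off by one): the decremented value is tested, so the
     * 33rd tail call is refused. */
    if (--tcc < 0)
            goto out;

    /* After: the old value is tested before decrementing, matching
     * the "if (TCC-- < 0) goto out" semantics of other JITs. */
    if (tcc-- < 0)
            goto out;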

    Fixes: 5f316b65e99f ("riscv, bpf: Add RV32G eBPF JIT")
    Signed-off-by: Luke Nelson
    Signed-off-by: Alexei Starovoitov
    Acked-by: Xi Wang
    Link: https://lore.kernel.org/bpf/20200421002804.5118-1-luke.r.nels@gmail.com

    Luke Nelson
     

08 Apr, 2020

1 commit

  • The existing code in emit_call on RV64 checks that the PC-relative offset
    to the function fits in 32 bits before calling emit_jump_and_link to emit
    an auipc+jalr pair. However, this check is incorrect because offsets in
    the range [2^31 - 2^11, 2^31 - 1] cannot be encoded using auipc+jalr on
    RV64 (see discussion [1]). The RISC-V spec has recently been updated
    to reflect this fact [2, 3].

    This patch fixes the problem by moving the check on the offset into
    emit_jump_and_link and modifying it to the correct range of encodable
    offsets, which is [-2^31 - 2^11, 2^31 - 2^11). This also extends the
    offset check to other uses of emit_jump_and_link (e.g., BPF_JA).
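
    A sketch of the corrected range check (the constant arithmetic
    follows the range stated above):

    static bool in_auipc_jalr_range(s64 val)
    {
            /*
             * auipc+jalr can reach any signed PC-relative offset in
             * the range [-2^31 - 2^11, 2^31 - 2^11).
             */
            return (-(1L << 31) - (1L << 11)) <= val &&
                   val < ((1L << 31) - (1L << 11));
    }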

    Currently, this bug is unlikely to be triggered, because the memory
    region from which JITed images are allocated is close enough to
    kernel text for the offsets not to become too large, and because the
    bounds on BPF program size are small enough. This patch prevents the
    problem from becoming an issue if either of these changes.

    [1]: https://groups.google.com/a/groups.riscv.org/forum/#!topic/isa-dev/bwWFhBnnZFQ
    [2]: https://github.com/riscv/riscv-isa-manual/commit/b1e42e09ac55116dbf9de5e4fb326a5a90e4a993
    [3]: https://github.com/riscv/riscv-isa-manual/commit/4c1b2066ebd2965a422e41eb262d0a208a7fea07

    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20200406221604.18547-1-luke.r.nels@gmail.com

    Luke Nelson
     

05 Mar, 2020

2 commits

  • This is an eBPF JIT for RV32G, adapted from the JIT for RV64G and
    the 32-bit ARM JIT.

    There are two main changes required for this to work compared to
    the RV64 JIT.

    First, eBPF registers are 64-bit, while RV32G registers are 32-bit.
    BPF registers either map directly to 2 RISC-V registers, or reside
    in stack scratch space and are saved and restored when used.

    Second, many 64-bit ALU operations do not map trivially to 32-bit
    operations. Operations that move bits between the high and low words,
    such as ADD, LSH, and MUL, must emulate the 64-bit behavior in terms
    of 32-bit instructions.
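
    For example, a 64-bit ADD must propagate a carry from the low word
    into the high word; in C terms (pseudo-code for the emitted
    sequence, not the emitter itself):

    u32 lo = dst_lo + src_lo;
    u32 carry = lo < dst_lo;          /* unsigned overflow => carry */
    u32 hi = dst_hi + src_hi + carry;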

    This patch also makes related changes to bpf_jit.h, such
    as adding RISC-V instructions required by the RV32 JIT.

    Supported features:

    The RV32 JIT supports the same features and instructions as the
    RV64 JIT, with the following exceptions:

    - ALU64 DIV/MOD: Requires loops to implement on 32-bit hardware.

    - BPF_XADD | BPF_DW: There's no 8-byte atomic instruction in RV32.

    These features are also unsupported on other BPF JITs for 32-bit
    architectures.

    Testing:

    - lib/test_bpf.c
    test_bpf: Summary: 378 PASSED, 0 FAILED, [349/366 JIT'ed]
    test_bpf: test_skb_segment: Summary: 2 PASSED, 0 FAILED

    The tests that are not JITed all use 64-bit div/mod or 64-bit xadd.

    - tools/testing/selftests/bpf/test_verifier.c
    Summary: 1415 PASSED, 122 SKIPPED, 43 FAILED

    Tested both with and without BPF JIT hardening.

    This is the same set of tests that pass using the BPF interpreter
    with the JIT disabled.

    Verification and synthesis:

    We developed the RV32 JIT using our automated verification tool,
    Serval. We have used Serval in the past to verify patches to the
    RV64 JIT. We also used Serval to superoptimize the resulting code
    through program synthesis.

    You can find the tool and a guide to the approach and results here:
    https://github.com/uw-unsat/serval-bpf/tree/rv32-jit-v5

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Björn Töpel
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20200305050207.4159-3-luke.r.nels@gmail.com

    Luke Nelson
     
  • This patch factors out code that can be used by both the RV64 and RV32
    BPF JITs to a common bpf_jit.h and bpf_jit_core.c.

    Move struct definitions and macro-like functions to the header.
    Rename rv_sb_insn/rv_uj_insn to rv_b_insn/rv_j_insn to match the
    RISC-V specification.

    Move reusable functions emit_body() and bpf_int_jit_compile() to
    bpf_jit_core.c with minor simplifications. Rename emit_insn() and
    build_{prologue,epilogue}() to be prefixed with "bpf_jit_" as they are
    no longer static.

    Rename bpf_jit_comp.c to bpf_jit_comp64.c to be more explicit.

    Co-developed-by: Xi Wang
    Signed-off-by: Xi Wang
    Signed-off-by: Luke Nelson
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Björn Töpel
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20200305050207.4159-2-luke.r.nels@gmail.com

    Luke Nelson
     

28 Dec, 2019

1 commit

  • Daniel Borkmann says:

    ====================
    pull-request: bpf-next 2019-12-27

    The following pull-request contains BPF updates for your *net-next* tree.

    We've added 127 non-merge commits during the last 17 day(s) which contain
    a total of 110 files changed, 6901 insertions(+), 2721 deletions(-).

    There are three merge conflicts. Conflicts and resolutions look as follows:

    1) Merge conflict in net/bpf/test_run.c:

    There was a tree-wide cleanup c593642c8be0 ("treewide: Use sizeof_field() macro")
    which conflicts with b590cb5f802d ("bpf: Switch to offsetofend in
    BPF_PROG_TEST_RUN"):

    <<<<<<< HEAD
    if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) +
    sizeof_field(struct __sk_buff, priority),
    =======
    if (!range_is_zero(__skb, offsetofend(struct __sk_buff, priority),
    >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c

    There are a few occurrences that look similar to this. Always take the
    chunk with offsetofend(). Note that there is one where the fields differ:

    <<<<<<< HEAD
    if (!range_is_zero(__skb, offsetof(struct __sk_buff, tstamp) +
    sizeof_field(struct __sk_buff, tstamp),
    =======
    if (!range_is_zero(__skb, offsetofend(struct __sk_buff, gso_segs),
    >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c

    Just take the one with offsetofend() /and/ gso_segs. The latter is correct
    due to 850a88cc4096 ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN").

    2) Merge conflict in arch/riscv/net/bpf_jit_comp.c:

    (I'm keeping Bjorn in Cc here for a double-check in case I got it wrong.)

    <<<<<<< HEAD
    if (is_13b_check(off, insn))
    return -1;
    emit(rv_blt(tcc, RV_REG_ZERO, off >> 1), ctx);
    =======
    emit_branch(BPF_JSLT, RV_REG_T1, RV_REG_ZERO, off, ctx);
    >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c

    Result should look like:

    emit_branch(BPF_JSLT, tcc, RV_REG_ZERO, off, ctx);

    3) Merge conflict in arch/riscv/include/asm/pgtable.h:

    <<<<<<< HEAD
    =======
    #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
    #define VMALLOC_END (PAGE_OFFSET - 1)
    #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE)

    #define BPF_JIT_REGION_SIZE (SZ_128M)
    #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
    #define BPF_JIT_REGION_END (VMALLOC_END)

    /*
    * Roughly size the vmemmap space to be large enough to fit enough
    * struct pages to map half the virtual address space. Then
    * position vmemmap directly below the VMALLOC region.
    */
    #define VMEMMAP_SHIFT \
    (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
    #define VMEMMAP_SIZE BIT(VMEMMAP_SHIFT)
    #define VMEMMAP_END (VMALLOC_START - 1)
    #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE)

    #define vmemmap ((struct page *)VMEMMAP_START)

    >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c

    Only take the BPF_* defines from there and move them higher up in the
    same file. Remove the rest from the chunk. The VMALLOC_* etc defines
    got moved via 01f52e16b868 ("riscv: define vmemmap before pfn_to_page
    calls"). Result:

    [...]
    #define __S101 PAGE_READ_EXEC
    #define __S110 PAGE_SHARED_EXEC
    #define __S111 PAGE_SHARED_EXEC

    #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
    #define VMALLOC_END (PAGE_OFFSET - 1)
    #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE)

    #define BPF_JIT_REGION_SIZE (SZ_128M)
    #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
    #define BPF_JIT_REGION_END (VMALLOC_END)

    /*
    * Roughly size the vmemmap space to be large enough to fit enough
    * struct pages to map half the virtual address space. Then
    * position vmemmap directly below the VMALLOC region.
    */
    #define VMEMMAP_SHIFT \
    (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT)
    #define VMEMMAP_SIZE BIT(VMEMMAP_SHIFT)
    #define VMEMMAP_END (VMALLOC_START - 1)
    #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE)

    [...]

    Let me know if there are any other issues.

    Anyway, the main changes are:

    1) Extend bpftool to produce a struct (aka "skeleton") tailored and specific
    to a provided BPF object file. This provides an alternative, simplified API
    compared to standard libbpf interaction. Also, add libbpf extern variable
    resolution for .kconfig section to import Kconfig data, from Andrii Nakryiko.

    2) Add BPF dispatcher for XDP which is a mechanism to avoid indirect calls by
    generating a branch funnel as discussed back in bpfconf'19 at LSF/MM. Also,
    add various BPF riscv JIT improvements, from Björn Töpel.

    3) Extend bpftool to allow matching BPF programs and maps by name,
    from Paul Chaignon.

    4) Support for replacing cgroup BPF programs attached with BPF_F_ALLOW_MULTI
    flag for allowing updates without service interruption, from Andrey Ignatov.

    5) Cleanup and simplification of ring access functions for AF_XDP with a
    bonus of 0-5% performance improvement, from Magnus Karlsson.

    6) Enable BPF JITs for x86-64 and arm64 by default. Also, the final version
    of audit support for BPF, from Daniel Borkmann, the latter with Jiri Olsa.

    7) Move and extend test_select_reuseport into BPF program tests under
    BPF selftests, from Jakub Sitnicki.

    8) Various BPF sample improvements for xdpsock for customizing parameters
    to set up and benchmark AF_XDP, from Jay Jayatheerthan.

    9) Improve libbpf to provide a ulimit hint on permission denied errors.
    Also change XDP sample programs to attach in driver mode by default,
    from Toke Høiland-Jørgensen.

    10) Extend BPF test infrastructure to allow changing skb mark from tc BPF
    programs, from Nikita V. Shirokov.

    11) Optimize prologue code sequence in BPF arm32 JIT, from Russell King.

    12) Fix xdp_redirect_cpu BPF sample to manually attach to tracepoints after
    libbpf conversion, from Jesper Dangaard Brouer.

    13) Minor misc improvements from various others.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Dec, 2019

7 commits

  • Instead of using emit_imm() and emit_jalr(), which can expand to six
    instructions, start using jal or auipc+jalr.

    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20191216091343.23260-8-bjorn.topel@gmail.com

    Björn Töpel
     
  • This commit makes sure that the JIT image is kept close to the kernel
    text, so BPF calls can use relative calls (auipc/jalr or jal) instead
    of loading the full 64-bit address and using jalr.

    The BPF JIT image region is 128 MB before the kernel text.

    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20191216091343.23260-7-bjorn.topel@gmail.com

    Björn Töpel
     
  • Remove one addi by using the immediate offset field of jalr instead.

    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20191216091343.23260-6-bjorn.topel@gmail.com

    Björn Töpel
     
  • This commit adds support for far (offset > 21b) jumps and exits.

    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Luke Nelson
    Link: https://lore.kernel.org/bpf/20191216091343.23260-5-bjorn.topel@gmail.com

    Björn Töpel
     
  • Start using the emit_branch() function in the tail call emitter in
    order to support far branching.

    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20191216091343.23260-4-bjorn.topel@gmail.com

    Björn Töpel
     
  • This commit adds branch relaxation to the BPF JIT, and with that,
    support for far (offset greater than 12b) branching.

    The branch relaxation requires more than two passes to converge. For
    most programs it is three passes, but for larger programs it can be
    more.
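
    Schematically, code generation is re-run until the emitted
    instruction count stops changing (a simplified sketch; the iteration
    cap is illustrative):

    for (i = 0; i < MAX_PASSES; i++) {
            ctx->ninsns = 0;
            if (build_body(ctx, extra_pass))
                    goto out;
            build_prologue(ctx);
            build_epilogue(ctx);
            if (ctx->ninsns == prev_ninsns)
                    break;          /* offsets have converged */
            prev_ninsns = ctx->ninsns;
    }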

    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Luke Nelson
    Link: https://lore.kernel.org/bpf/20191216091343.23260-3-bjorn.topel@gmail.com

    Björn Töpel
     
  • The BPF JIT incorrectly clobbered the a0 register, and did not flag
    usage of the s5 register when the BPF stack was being used.

    Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G")
    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20191216091343.23260-2-bjorn.topel@gmail.com

    Björn Töpel
     

11 Dec, 2019

1 commit

  • All BPF JIT compilers except RISC-V's and MIPS' enforce a limit of
    33 tail calls at runtime. In addition, a test was recently added, in
    tailcalls2, to check this limit.

    This patch updates the tail call limit in RISC-V's JIT compiler to
    allow 33 tail calls. I tested it using the above selftest on an
    emulated RISC-V 64-bit system.

    Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G")
    Reported-by: Mahshid Khezri
    Signed-off-by: Paul Chaignon
    Signed-off-by: Daniel Borkmann
    Acked-by: Björn Töpel
    Acked-by: Martin KaFai Lau
    Link: https://lore.kernel.org/bpf/966fe384383bf23a0ee1efe8d7291c78a3fb832b.1575916815.git.paul.chaignon@gmail.com

    Paul Chaignon
     

06 Jul, 2019

1 commit

  • Commit 66d0d5a854a6 ("riscv: bpf: eliminate zero extension code-gen")
    added the new zero-extension optimization for some BPF ALU operations.

    Since then, bug fixes merged through the bpf tree have added the
    always-on zero-extension code to more operations: commit 1e692f09e091
    ("bpf, riscv: clear high 32 bits for ALU32 add/sub/neg/lsh/rsh/arsh")
    and commit fe121ee531d1 ("bpf, riscv: clear target register high
    32-bits for and/or/xor on ALU32").

    Now that these fixes have been merged into bpf-next, the zext
    optimization can be enabled for the fixed operations.

    Signed-off-by: Luke Nelson
    Cc: Song Liu
    Cc: Jiong Wang
    Cc: Xi Wang
    Acked-by: Björn Töpel
    Acked-by: Jiong Wang
    Signed-off-by: Daniel Borkmann

    Luke Nelson
     

08 Jun, 2019

1 commit

  • Daniel Borkmann says:

    ====================
    pull-request: bpf 2019-06-07

    The following pull-request contains BPF updates for your *net* tree.

    The main changes are:

    1) Fix several bugs in riscv64 JIT code emission that forgot to clear the
    high 32 bits for alu32 ops, from Björn and Luke, with selftests covering
    all relevant BPF alu ops, from Björn and Jiong.

    2) Two fixes for UDP BPF reuseport that avoid calling the program in case of
    __udp6_lib_err and UDP GRO which broke reuseport_select_sock() assumption
    that skb->data is pointing to transport header, from Martin.

    3) Two fixes for BPF sockmap: a use-after-free from sleep in psock's backlog
    workqueue, and a missing restore of sk_write_space when psock gets dropped,
    from Jakub and John.

    4) Fix unconnected UDP sendmsg hook API which is insufficient as-is since it
    breaks standard applications like DNS if reverse NAT is not performed upon
    receive, from Daniel.

    5) Fix an out-of-bounds read in __bpf_skc_lookup which in case of AF_INET6
    fails to verify that the length of the tuple is long enough, from Lorenz.

    6) Fix libbpf's libbpf__probe_raw_btf to return an fd instead of 0/1 (for
    {un,}successful probe) as that is expected to be propagated as an fd to
    load_sk_storage_btf() and thus closing the wrong descriptor otherwise,
    from Michal.

    7) Fix bpftool's JSON output for the case when a lookup fails, from Krzesimir.

    8) Minor misc fixes in docs, samples and selftests, from various others.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

01 Jun, 2019

1 commit

  • In BPF, 32-bit ALU operations should zero-extend their results into
    the 64-bit registers.

    The current BPF JIT on RISC-V emits incorrect instructions that perform
    sign extension only (e.g., addw, subw) on 32-bit add, sub, lsh, rsh,
    arsh, and neg. This behavior diverges from the interpreter and JITs
    for other architectures.

    This patch fixes the bugs by performing zero extension on the destination
    register of 32-bit ALU operations.
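
    The fix appends an explicit zero-extension of the low 32 bits after
    the affected ALU32 operations, via a helper along these lines:

    /* Clear the upper 32 bits of @reg: shift up, logical-shift down. */
    static void emit_zext_32(u8 reg, struct rv_jit_context *ctx)
    {
            emit(rv_slli(reg, reg, 32), ctx);
            emit(rv_srli(reg, reg, 32), ctx);
    }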

    Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G")
    Cc: Xi Wang
    Signed-off-by: Luke Nelson
    Acked-by: Song Liu
    Acked-by: Björn Töpel
    Reviewed-by: Palmer Dabbelt
    Signed-off-by: Alexei Starovoitov

    Luke Nelson
     

23 May, 2019

1 commit

  • When using 32-bit subregisters (ALU32), the RISC-V JIT would not
    clear the high 32 bits of the target register and would therefore
    generate incorrect code.

    E.g., in the following code:

    $ cat test.c
    unsigned int f(unsigned long long a,
                   unsigned int b)
    {
            return (unsigned int)a & b;
    }

    $ clang-9 -target bpf -O2 -emit-llvm -S test.c -o - | \
    llc-9 -mattr=+alu32 -mcpu=v3
    .text
    .file "test.c"
    .globl f
    .p2align 3
    .type f,@function
    f:
    r0 = r1
    w0 &= w2
    exit
    .Lfunc_end0:
    .size f, .Lfunc_end0-f

    The JIT would not clear the high 32 bits of r0 after the and
    operation, which in this case might give an incorrect return value.

    After this patch, that is not the case, and the upper 32-bits are
    cleared.

    Reported-by: Jiong Wang
    Fixes: 2353ecc6f91f ("bpf, riscv: add BPF JIT for RV64G")
    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann

    Björn Töpel
     

05 Feb, 2019

1 commit

  • This commit adds a BPF JIT for RV64G.

    The JIT is a two-pass JIT with a dynamic prologue/epilogue (similar
    to the MIPS64 BPF JIT) instead of a static one (as in, e.g., x86_64).

    At the moment the RISC-V Linux port does not support
    CONFIG_HAVE_KPROBES, which means that CONFIG_BPF_EVENTS is not
    supported. Thus, no tests involving BPF_PROG_TYPE_TRACEPOINT,
    BPF_PROG_TYPE_PERF_EVENT, BPF_PROG_TYPE_KPROBE and
    BPF_PROG_TYPE_RAW_TRACEPOINT pass.

    The implementation does not support "far branching" (>4KiB).

    Test results:
    # modprobe test_bpf
    test_bpf: Summary: 378 PASSED, 0 FAILED, [366/366 JIT'ed]

    # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
    # ./test_verifier
    ...
    Summary: 761 PASSED, 507 SKIPPED, 2 FAILED

    Note that "test_verifier" was run with one build with
    CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y and one without, otherwise
    many of the the tests that require unaligned access were skipped.

    CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y:
    # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
    # ./test_verifier | grep -c 'NOTE.*unknown align'
    0

    No CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS:
    # echo 1 > /proc/sys/kernel/unprivileged_bpf_disabled
    # ./test_verifier | grep -c 'NOTE.*unknown align'
    59

    The two failing test_verifier tests are:
    "ld_abs: vlan + abs, test 1"
    "ld_abs: jump around ld_abs"

    This is due to the "far branching" involved in those tests.

    All tests were done on QEMU (QEMU emulator version 3.1.50
    (v3.1.0-688-g8ae951fbc106)).

    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann

    Björn Töpel