09 Dec, 2014

1 commit


31 Oct, 2014

1 commit

  • nmap generates classic BPF programs to filter ARP packets with given target MAC
    which triggered a bug in eBPF x64 JIT. The bug was fixed in
    commit e0ee9c12157d ("x86: bpf_jit: fix two bugs in eBPF JIT compiler")
    This patch adds a test case written in eBPF instructions (those that
    were generated by the classic->eBPF converter) to be processed by the JIT.
    The test primarily targets the JIT compiler.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     

23 Sep, 2014

1 commit

  • old gcc 4.2 used by avr32 architecture produces warnings:

    lib/test_bpf.c:1741: warning: integer constant is too large for 'long' type
    lib/test_bpf.c:1741: warning: integer constant is too large for 'long' type
    lib/test_bpf.c: In function '__run_one':
    lib/test_bpf.c:1897: warning: 'ret' may be used uninitialized in this function

    silence these warnings.
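
    A minimal sketch of how warnings of this kind are commonly silenced. It
    is not taken from the patch: the constant and the names
    imm64_test_value() and run_one_sketch() are assumptions made for
    illustration only.

```c
#include <stdint.h>

/* An explicit LL suffix gives the literal the type 'long long', so an old
 * compiler on a 32-bit target does not try to fit it into 'long'. */
static uint64_t imm64_test_value(void)
{
	return 0x567800001234LL;
}

/* Initializing 'ret' at its declaration removes the "may be used
 * uninitialized" warning, even when the loop body never runs. */
static int run_one_sketch(const int *results, int n)
{
	int ret = 0;
	int i;

	for (i = 0; i < n; i++)
		ret = results[i];
	return ret;
}
```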

    Fixes: 02ab695bb37e ("net: filter: add "load 64-bit immediate" eBPF instruction")
    Reported-by: Fengguang Wu
    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     

10 Sep, 2014

1 commit

  • add BPF_LD_IMM64 instruction to load 64-bit immediate value into a register.
    All previous instructions were 8-byte. This is the first 16-byte instruction.
    Two consecutive 'struct bpf_insn' blocks are interpreted as a single instruction:
    insn[0].code = BPF_LD | BPF_DW | BPF_IMM
    insn[0].dst_reg = destination register
    insn[0].imm = lower 32-bit
    insn[1].code = 0
    insn[1].imm = upper 32-bit
    All unused fields must be zero.
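
    The encoding described above can be sketched in plain C as follows.
    'struct bpf_insn_sketch' is a local stand-in for the kernel's
    'struct bpf_insn', and emit_ld_imm64()/decode_ld_imm64() are
    hypothetical helper names, not kernel functions.

```c
#include <stdint.h>
#include <string.h>

/* Local stand-in for the 8-byte 'struct bpf_insn'. */
struct bpf_insn_sketch {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

/* Opcode field values as defined in the BPF uapi headers. */
#define SK_BPF_LD  0x00
#define SK_BPF_DW  0x18
#define SK_BPF_IMM 0x00

/* Fill two consecutive slots; every unused field is left at zero. */
static void emit_ld_imm64(struct bpf_insn_sketch insn[2], uint8_t dst,
			  uint64_t val)
{
	memset(insn, 0, 2 * sizeof(insn[0]));
	insn[0].code = SK_BPF_LD | SK_BPF_DW | SK_BPF_IMM;
	insn[0].dst_reg = dst;
	insn[0].imm = (int32_t)(uint32_t)val;         /* lower 32 bits */
	insn[1].imm = (int32_t)(uint32_t)(val >> 32); /* upper 32 bits */
}

/* Reassemble the 64-bit immediate from the two halves. */
static uint64_t decode_ld_imm64(const struct bpf_insn_sketch insn[2])
{
	return (uint64_t)(uint32_t)insn[0].imm |
	       ((uint64_t)(uint32_t)insn[1].imm << 32);
}
```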

    Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM
    which loads 32-bit immediate value into a register.

    x64 JITs it as a single 'movabsq $imm64, %rax'
    arm64 may JIT it as a sequence of four 'movk x0, #imm16, lsl #shift' insns

    Note that old eBPF programs are binary compatible with the new interpreter.

    It helps eBPF programs load a 64-bit constant into a register with one
    instruction instead of using two registers and 4 instructions:
    BPF_MOV32_IMM(R1, imm32)
    BPF_ALU64_IMM(BPF_LSH, R1, 32)
    BPF_MOV32_IMM(R2, imm32)
    BPF_ALU64_REG(BPF_OR, R1, R2)
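
    What the four-instruction sequence computes can be checked with a small
    user-space emulation. This is a sketch only: registers are modelled as
    uint64_t variables and the function name is hypothetical.

```c
#include <stdint.h>

/* Emulates the four-instruction sequence on two 64-bit "registers". */
static uint64_t load_imm64_four_insns(uint32_t upper, uint32_t lower)
{
	uint64_t r1, r2;

	r1 = upper;   /* BPF_MOV32_IMM(R1, imm32): 32-bit move, zero-extended */
	r1 <<= 32;    /* BPF_ALU64_IMM(BPF_LSH, R1, 32) */
	r2 = lower;   /* BPF_MOV32_IMM(R2, imm32) */
	r1 |= r2;     /* BPF_ALU64_REG(BPF_OR, R1, R2) */
	return r1;
}
```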

    User space generated programs will use this instruction to load constants only.

    To tell the kernel that user space needs a pointer, a _pseudo_ variant of
    this instruction may be added later, which will use extra bits of the encoding
    to indicate what type of pointer user space is asking the kernel to provide.
    For example, the 'off' or 'src_reg' fields can be used for this purpose.
    src_reg = 1 could mean that user space is asking the kernel to validate and
    load an in-kernel map pointer.
    src_reg = 2 could mean that user space needs a read-only data section pointer.
    src_reg = 3 could mean that user space needs a pointer to per-cpu local data.
    All such future pseudo instructions will not be carrying the actual pointer
    as part of the instruction, but rather will be treated as a request to kernel
    to provide one. The kernel will verify the request_for_a_pointer, then
    will drop _pseudo_ marking and will store actual internal pointer inside
    the instruction, so the end result is the interpreter and JITs never
    see pseudo BPF_LD_IMM64 insns and only operate on generic BPF_LD_IMM64 that
    loads 64-bit immediate into a register. User space never operates on direct
    pointers and verifier can easily recognize request_for_pointer vs other
    instructions.
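
    The rewrite step described above could look roughly like the sketch
    below. Everything in it is hypothetical: the pseudo variant is only
    proposed here, so the src_reg value, the lookup callback and all names
    are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct insn_sketch {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

#define SK_LD_IMM64   0x18 /* BPF_LD | BPF_DW | BPF_IMM */
#define SK_PSEUDO_MAP 1    /* assumed meaning: "give me a map pointer" */

/* Hypothetical lookup from a user-supplied handle to an internal object. */
typedef void *(*lookup_fn)(int32_t handle);

/* Demo lookup used for illustration: only handle 7 is known. */
static int demo_object;
static void *demo_lookup(int32_t handle)
{
	return handle == 7 ? &demo_object : NULL;
}

/* Returns 0 on success, -1 if the request is malformed or unknown. */
static int resolve_pseudo_ld_imm64(struct insn_sketch insn[2], lookup_fn lookup)
{
	uint64_t addr;
	void *ptr;

	if (insn[0].code != SK_LD_IMM64 || insn[1].code != 0)
		return -1;
	if (insn[0].src_reg == 0)
		return 0;                /* plain constant load */
	if (insn[0].src_reg != SK_PSEUDO_MAP)
		return -1;               /* unknown request type */

	ptr = lookup(insn[0].imm);       /* verify the request */
	if (ptr == NULL)
		return -1;

	/* Store the real pointer and drop the pseudo marking. */
	addr = (uint64_t)(uintptr_t)ptr;
	insn[0].imm = (int32_t)(uint32_t)addr;
	insn[1].imm = (int32_t)(uint32_t)(addr >> 32);
	insn[0].src_reg = 0;
	return 0;
}
```

    After a successful call the two slots form an ordinary BPF_LD_IMM64, so
    an interpreter or JIT running later needs no knowledge of the request.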

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     

06 Sep, 2014

1 commit

  • With eBPF being extended further and exposure to user space on its way,
    hardening the memory range the interpreter uses to steer its command flow
    seems appropriate. This patch moves the bytecode to be interpreted to
    read-only pages.

    In case we execute a corrupted BPF interpreter image for some reason, e.g.
    caused by an attacker who got past the verifier stage, it would not only
    provide arbitrary read/write memory access but arbitrary function calls
    as well. After setting up the BPF interpreter image, its contents do not
    change until destruction time, thus we can set up the image on pages
    made immutable in order to mitigate modifications to that code. The idea
    is derived from commit 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit
    against spraying attacks").

    This is possible because bpf_prog is not part of sk_filter anymore.
    After setup bpf_prog cannot be altered during its life-time. This prevents
    any modifications to the entire bpf_prog structure (incl. function/JIT
    image pointer).

    Every eBPF program (including migrated classic BPF programs) has to call
    bpf_prog_select_runtime() to select either interpreter or a JIT image
    as a last setup step, and they all are being freed via bpf_prog_free(),
    including non-JIT. Therefore, we can easily integrate this into the
    eBPF life-time, plus since we directly allocate a bpf_prog, we have no
    performance penalty.
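
    The "set up once, then make immutable" life-time can be illustrated with
    a user-space analogy. The patch itself works on kernel pages; the sketch
    below uses mmap()/mprotect() instead, and 'struct ro_image' and both
    function names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

struct ro_image {
	void  *base; /* page-aligned mapping */
	size_t size; /* mapping size, a multiple of the page size */
};

/* Copy 'len' bytes of bytecode into fresh pages, then drop write access. */
static int ro_image_create(struct ro_image *img, const void *code, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t size = (len + page - 1) / page * page;

	if (size == 0)
		return -1;
	img->base = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (img->base == MAP_FAILED)
		return -1;
	memcpy(img->base, code, len);
	if (mprotect(img->base, size, PROT_READ) != 0) {
		munmap(img->base, size);
		return -1;
	}
	img->size = size;
	return 0;
}

/* The image is only released as a whole, never modified in place. */
static void ro_image_destroy(struct ro_image *img)
{
	munmap(img->base, img->size);
}
```

    Any write to the image after ro_image_create() returns faults instead of
    silently changing the program, which is the property the patch is after.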

    Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
    inspection of kernel_page_tables. Brad Spengler proposed the same idea
    via Twitter during development of this patch.

    Joint work with Hannes Frederic Sowa.

    Suggested-by: Brad Spengler
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Hannes Frederic Sowa
    Cc: Alexei Starovoitov
    Cc: Kees Cook
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

26 Aug, 2014

1 commit


03 Aug, 2014

1 commit

  • clean up names related to socket filtering and bpf in the following way:
    - everything that deals with sockets keeps 'sk_*' prefix
    - everything that is pure BPF is changed to 'bpf_*' prefix

    split 'struct sk_filter' into
        struct sk_filter {
                atomic_t refcnt;
                struct rcu_head rcu;
                struct bpf_prog *prog;
        };
    and
        struct bpf_prog {
                u32 jited:1,
                    len:31;
                struct sock_fprog_kern *orig_prog;
                unsigned int (*bpf_func)(const struct sk_buff *skb,
                                         const struct bpf_insn *filter);
                union {
                        struct sock_filter insns[0];
                        struct bpf_insn insnsi[0];
                        struct work_struct work;
                };
        };
    so that 'struct bpf_prog' can be used independent of sockets and cleans up
    'unattached' bpf use cases

    split SK_RUN_FILTER macro into:
    SK_RUN_FILTER to be used with 'struct sk_filter *' and
    BPF_PROG_RUN to be used with 'struct bpf_prog *'

    __sk_filter_release(struct sk_filter *) gains
    __bpf_prog_release(struct bpf_prog *) helper function

    also perform related renames for the functions that work
    with 'struct bpf_prog *', since they're on the same lines:

    sk_filter_size -> bpf_prog_size
    sk_filter_select_runtime -> bpf_prog_select_runtime
    sk_filter_free -> bpf_prog_free
    sk_unattached_filter_create -> bpf_prog_create
    sk_unattached_filter_destroy -> bpf_prog_destroy
    sk_store_orig_filter -> bpf_prog_store_orig_filter
    sk_release_orig_filter -> bpf_release_orig_filter
    __sk_migrate_filter -> bpf_migrate_filter
    __sk_prepare_filter -> bpf_prepare_filter

    API for attaching classic BPF to a socket stays the same:
    sk_attach_filter(prog, struct sock *)/sk_detach_filter(struct sock *)
    and SK_RUN_FILTER(struct sk_filter *, ctx) to execute a program
    which is used by sockets, tun, af_packet

    API for 'unattached' BPF programs becomes:
    bpf_prog_create(struct bpf_prog **)/bpf_prog_destroy(struct bpf_prog *)
    and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program
    which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf
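
    How the two run macros relate after the split can be sketched with
    user-space stand-ins. All '_sketch' types and macros below are
    assumptions that mirror the shapes listed above; they are not the kernel
    definitions.

```c
#include <stddef.h>

struct ctx_sketch { unsigned int len; }; /* stand-in for struct sk_buff */
struct insn_stub  { int unused; };       /* stand-in for struct bpf_insn */

/* Pure BPF object: no socket state at all. */
struct bpf_prog_sketch {
	unsigned int len;
	unsigned int (*bpf_func)(const struct ctx_sketch *ctx,
				 const struct insn_stub *insns);
	struct insn_stub insnsi[1];
};

/* Socket-side wrapper: reference count plus a pointer to the program. */
struct sk_filter_sketch {
	int refcnt;
	struct bpf_prog_sketch *prog;
};

/* Run a bare program; the socket variant is expressed in terms of it. */
#define BPF_PROG_RUN_SKETCH(p, ctx)  ((p)->bpf_func((ctx), (p)->insnsi))
#define SK_RUN_FILTER_SKETCH(f, ctx) BPF_PROG_RUN_SKETCH((f)->prog, (ctx))

/* Trivial "program": accept the whole packet. */
static unsigned int accept_all(const struct ctx_sketch *ctx,
			       const struct insn_stub *insns)
{
	(void)insns;
	return ctx->len;
}
```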

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     

25 Jul, 2014

1 commit


11 Jun, 2014

1 commit


03 Jun, 2014

1 commit

  • The current probe_filter_length() (the function that calculates the
    length of a test BPF filter) behavior is to declare the end of the
    filter as soon as it finds {0, *, *, 0}. This is actually a valid
    insn ("ld #0"), so any filter which includes "BPF_STMT(BPF_LD | BPF_IMM, 0)"
    fails (its length is cut short).

    We are changing probe_filter_length() so as to start from the end, and
    declare the end of the filter as the first instruction which is not
    {0, *, *, 0}. This solution produces a simpler patch than the
    alternative of using an explicit end-of-filter mark. It is technically
    incorrect if your filter ends with "ld #0", but that should not
    happen anyway.
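
    The backward scan can be sketched as below, assuming the filter lives in
    a fixed-size, zero-padded array. The array size and the '_sketch' names
    are assumptions made for this example.

```c
#include <stdint.h>

#define MAX_INSNS_SKETCH 16

/* Local stand-in for 'struct sock_filter': {code, jt, jf, k}. */
struct sock_filter_sketch {
	uint16_t code;
	uint8_t  jt;
	uint8_t  jf;
	uint32_t k;
};

/* Walk back from the last slot and stop at the first instruction that is
 * not {0, *, *, 0}; everything up to and including it is the filter. */
static int probe_filter_length_sketch(const struct sock_filter_sketch *fp)
{
	int len;

	for (len = MAX_INSNS_SKETCH - 1; len > 0; --len)
		if (fp[len].code != 0 || fp[len].k != 0)
			break;
	return len + 1;
}
```

    With this scan a leading "ld #0" followed by "ret a" is measured as two
    instructions, whereas a forward scan would have stopped at slot 0.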

    We also add a new test (LD_IMM_0) that includes ld #0 (does not work
    without this patch).

    Signed-off-by: Chema Gonzalez
    Acked-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Chema Gonzalez
     

02 Jun, 2014

2 commits

  • This test checks that overloading BPF_LD | BPF_ABS with an
    always-invalid BPF extension, namely SKF_AD_MAX, is rejected, to
    make sure classic BPF behaviour is correct in the filter checker.

    Also, we add a test for loading at packet offset SKF_AD_OFF-1
    which should pass the filter, but later on fail during runtime.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Also add a test for the scratch memory store that first fills
    all slots and then successively reads all of them back, adding
    them up in A and eventually returning A. This and the previous
    M[] test with alternating fill/spill will detect possible JIT
    errors on M[].

    Suggested-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

31 May, 2014

2 commits

  • This patch converts raw opcodes for tcpdump tests into
    BPF_STMT()/BPF_JUMP() combinations, which brings them into
    conformity with the rest of the patches and also makes it
    easier to grasp what's going on in these particular
    test cases if they ever fail. Also arrange the payload from
    the jump+holes test in the same way as the other packet
    payloads in the test suite.
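
    The difference between the two notations can be seen on a small filter
    that accepts IPv4 frames only. This is a sketch and not one of the
    converted tests: the 'X_' macros are local copies of the uapi
    definitions so that the example stands alone.

```c
#include <stdint.h>

struct filter_insn { uint16_t code; uint8_t jt; uint8_t jf; uint32_t k; };

/* Classic BPF opcode fields, values as in the uapi headers. */
#define X_BPF_LD  0x00
#define X_BPF_JMP 0x05
#define X_BPF_RET 0x06
#define X_BPF_H   0x08
#define X_BPF_ABS 0x20
#define X_BPF_JEQ 0x10
#define X_BPF_K   0x00

#define X_BPF_STMT(code, k)         { (uint16_t)(code), 0, 0, (k) }
#define X_BPF_JUMP(code, k, jt, jf) { (uint16_t)(code), (jt), (jf), (k) }

/* Raw form, as printed by a filter compiler. */
static const struct filter_insn raw_form[] = {
	{ 0x28, 0, 0, 0x0000000c },
	{ 0x15, 0, 1, 0x00000800 },
	{ 0x06, 0, 0, 0x0000ffff },
	{ 0x06, 0, 0, 0x00000000 },
};

/* Same program, readable: load ethertype, compare, accept or drop. */
static const struct filter_insn macro_form[] = {
	X_BPF_STMT(X_BPF_LD | X_BPF_H | X_BPF_ABS, 12),
	X_BPF_JUMP(X_BPF_JMP | X_BPF_JEQ | X_BPF_K, 0x0800, 0, 1),
	X_BPF_STMT(X_BPF_RET | X_BPF_K, 0xffff),
	X_BPF_STMT(X_BPF_RET | X_BPF_K, 0),
};

/* Field-by-field comparison of two programs of 'n' instructions. */
static int same_program(const struct filter_insn *a,
			const struct filter_insn *b, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (a[i].code != b[i].code || a[i].jt != b[i].jt ||
		    a[i].jf != b[i].jf || a[i].k != b[i].k)
			return 0;
	return 1;
}
```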

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • This test for classic BPF probes store and load combinations
    via X on all 16 registers of the scratch memory store. It
    initially loads the integer 100 and passes this value around
    to each register while incrementing it every time, thus we
    expect to have 116 as a result. Might be useful for JIT
    testing.
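
    The expected result can be reproduced with a plain C emulation of the
    A and X registers and the M[] store. The exact instruction order inside
    the loop is an assumption about how such a test could be written, and
    the function name is hypothetical.

```c
#include <stdint.h>

#define BPF_MEMWORDS_SKETCH 16

/* Pass a value through every scratch slot, adding 1 after each slot. */
static uint32_t run_scratch_chain(uint32_t start)
{
	uint32_t M[BPF_MEMWORDS_SKETCH];
	uint32_t A = 0;
	uint32_t X = start;   /* ldx #start */
	int i;

	for (i = 0; i < BPF_MEMWORDS_SKETCH; i++) {
		M[i] = X;     /* stx M[i] */
		X = M[i];     /* ldx M[i] */
		A = X;        /* txa */
		A += 1;       /* add #1 */
		X = A;        /* tax */
	}
	return A;             /* ret a */
}
```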

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

24 May, 2014

4 commits

  • This patch adds three more test cases:

    1) long jumps with holes of unreachable code
    2) ret x
    3) ldx + ret x

    All three tests are for classic BPF and make sure that
    changes will not break exotic behaviour that has probably
    existed for decades. The last two tests are expected to
    be rejected by the BPF checker already, as in classic BPF only K
    or A are allowed to be returned. Thus, there are now 52 test
    cases for BPF.
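
    The rule that only K or A may be returned can be sketched as a small
    predicate. The kernel checker is organised differently; the function
    below only illustrates the rule, and the 'C_' macros are local copies of
    the uapi values.

```c
#include <stdint.h>

#define C_BPF_CLASS(code) ((code) & 0x07)
#define C_BPF_RVAL(code)  ((code) & 0x18)
#define C_BPF_RET 0x06
#define C_BPF_K   0x00
#define C_BPF_X   0x08
#define C_BPF_A   0x10

/* Returns 1 for "ret #k" and "ret a", 0 for anything else. */
static int classic_ret_is_valid(uint16_t code)
{
	if (C_BPF_CLASS(code) != C_BPF_RET)
		return 0;
	return C_BPF_RVAL(code) == C_BPF_K || C_BPF_RVAL(code) == C_BPF_A;
}
```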

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • This patch simplifies and refactors the test case code a
    bit and also adds a summary of all tests that passed or
    failed to the kernel log, so that it's easier to spot if
    something has failed.

    Future work could further extend the test framework to also
    support different input 'stimuli' i.e. related structures
    to seccomp.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • The sk_unattached_filter_create() API is used by BPF filters that
    are not directly attached or related to sockets, and are used in
    team, ptp, xt_bpf, cls_bpf, etc. As such, all users do their own
    internal management of obtaining filter blocks and thus already
    have them set up in kernel memory before calling into
    sk_unattached_filter_create(). As a result, due to the __user annotation
    in sock_fprog, sparse triggers false positives (incorrect type in
    assignment [different address space]) when filters are set up before
    passing them to sk_unattached_filter_create(). Therefore, let
    sk_unattached_filter_create() API use sock_fprog_kern to overcome
    this issue.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Older gcc's (mine is gcc-4.4.4) make a mess of this.

    lib/test_bpf.c:74: error: unknown field 'insns' specified in initializer
    lib/test_bpf.c:75: warning: missing braces around initializer
    lib/test_bpf.c:75: warning: (near initialization for 'tests[0].<anonymous>.insns[0]')
    lib/test_bpf.c:76: error: extra brace group at end of initializer
    lib/test_bpf.c:76: error: (near initialization for 'tests[0].<anonymous>')
    lib/test_bpf.c:76: warning: excess elements in union initializer
    lib/test_bpf.c:76: warning: (near initialization for 'tests[0].<anonymous>')
    lib/test_bpf.c:77: error: extra brace group at end of initializer
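
    Errors like these come from designated initializers that reach into an
    anonymous union. One common workaround, shown here as a sketch and not
    necessarily the exact change made by this patch, is to give the union a
    name. All type and field names below are assumptions.

```c
struct insn_pair { unsigned short code; unsigned int k; };

struct test_sketch {
	const char *descr;
	union {
		struct insn_pair insns[4];
		unsigned char    raw[4 * sizeof(struct insn_pair)];
	} u;    /* named, so '.u.insns' is an ordinary nested designator */
	unsigned int expected;
};

static const struct test_sketch tests_sketch[] = {
	{
		.descr = "RET_K",
		.u.insns = {
			{ 0x06, 7 }, /* ret #7 */
		},
		.expected = 7,
	},
};
```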

    Cc: Alexei Starovoitov
    Cc: David S. Miller
    Signed-off-by: Andrew Morton
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Andrew Morton
     

22 May, 2014

1 commit

  • Kernel API for classic BPF socket filters is:

    sk_unattached_filter_create() - validate classic BPF, convert, JIT
    SK_RUN_FILTER() - run it
    sk_unattached_filter_destroy() - destroy socket filter

    Clean up the internal BPF kernel API as follows:

    sk_filter_select_runtime() - final step of internal BPF creation.
    Try to JIT internal BPF program, if JIT is not available select interpreter
    SK_RUN_FILTER() - run it
    sk_filter_free() - free internal BPF program

    Disallow direct calls to BPF interpreter. Execution of the BPF program should
    be done with SK_RUN_FILTER() macro.

    Example of internal BPF create, run, destroy:

    struct sk_filter *fp;

    fp = kzalloc(sk_filter_size(prog_len), GFP_KERNEL);
    memcpy(fp->insnsi, prog, prog_len * sizeof(fp->insnsi[0]));
    fp->len = prog_len;

    sk_filter_select_runtime(fp);

    SK_RUN_FILTER(fp, ctx);

    sk_filter_free(fp);

    Sockets, seccomp, testsuite, tracing are using different ways to populate
    sk_filter, so first steps of program creation are not common.
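
    The "try JIT, otherwise fall back to the interpreter" step can be
    sketched with function pointers. This is a user-space illustration
    only: the JIT hook, the two demo hooks and all names are hypothetical.

```c
#include <stddef.h>

struct prog_sketch;
typedef unsigned int (*run_fn)(const struct prog_sketch *prog, const void *ctx);
typedef run_fn (*jit_hook)(const struct prog_sketch *prog);

struct prog_sketch {
	int    jited;
	run_fn bpf_func;
};

/* Stand-in for the interpreter entry point. */
static unsigned int interp_run(const struct prog_sketch *prog, const void *ctx)
{
	(void)prog; (void)ctx;
	return 1;
}

/* Stand-in for JIT-generated native code. */
static unsigned int native_run(const struct prog_sketch *prog, const void *ctx)
{
	(void)prog; (void)ctx;
	return 2;
}

/* Demo hooks: one architecture without a JIT, one with. */
static run_fn jit_unavailable(const struct prog_sketch *prog)
{
	(void)prog;
	return NULL;
}

static run_fn jit_available(const struct prog_sketch *prog)
{
	(void)prog;
	return native_run;
}

/* Default to the interpreter, then let the JIT override it if it can. */
static void select_runtime_sketch(struct prog_sketch *prog, jit_hook jit)
{
	run_fn native;

	prog->bpf_func = interp_run;
	prog->jited = 0;

	native = jit(prog);
	if (native != NULL) {
		prog->bpf_func = native;
		prog->jited = 1;
	}
}
```

    Callers only ever go through 'bpf_func', which is why direct calls to
    the interpreter can be disallowed.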

    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     

12 May, 2014

2 commits

  • All tests should pass with and without JIT.

    Example output:
    test_bpf: #0 TAX 35 16 16 PASS
    test_bpf: #1 TXA 7 7 7 PASS
    test_bpf: #2 ADD_SUB_MUL_K 10 PASS
    test_bpf: #3 DIV_KX 33 PASS
    test_bpf: #4 AND_OR_LSH_K 10 10 PASS
    test_bpf: #5 LD_IND 8 8 8 PASS
    test_bpf: #6 LD_ABS 8 8 8 PASS
    test_bpf: #7 LD_ABS_LL 13 14 PASS
    test_bpf: #8 LD_IND_LL 12 12 12 PASS
    test_bpf: #9 LD_ABS_NET 10 12 PASS
    test_bpf: #10 LD_IND_NET 11 12 12 PASS
    ...

    Numbers are times in nsec per filter for given input data.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     
  • The testsuite covers classic and internal BPF instructions.
    It is particularly useful for JIT compiler developers.
    It is added to the "net" selftest target.

    The testsuite can be used as a set of micro-benchmarks.
    It measures execution time of each BPF program in nsec.

    This patch adds core framework.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov