31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of version 2 of the gnu general public license as
    published by the free software foundation

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 107 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Richard Fontana
    Reviewed-by: Steve Winslow
    Reviewed-by: Alexios Zavras
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190528171438.615055994@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

15 May, 2018

1 commit


29 Sep, 2017

1 commit

  • This patch extends the libbpf to provide API support to
    allow specifying BPF object name.

    In tools/lib/bpf/libbpf, the C symbols of the function
    and the map are used as their names. Regarding the section name, all maps are
    under the same section named "maps". Hence, the section name
    is not a good choice for a map's name. To be consistent with
    maps, bpf_prog follows suit and uses its function symbol as
    the prog's name.

    This patch adds logic to collect the functions' symbols in libbpf.
    There is existing code to collect the maps' symbols, so no change
    is needed there.

    The bpf_load_program_name() and bpf_map_create_name() are
    added to take the name argument. For the other bpf_map_create_xxx()
    variants, a name argument is directly added to them.

    In samples/bpf, bpf_load.c in particular, the symbol is also
    used as the map's name and the map symbols have already been
    collected in the existing code. For bpf_prog, bpf_load.c does
    not collect the function symbol name. We can consider collecting
    them later if there is a need to continue supporting bpf_load.c.

    Signed-off-by: Martin KaFai Lau
    Acked-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     

22 Sep, 2017

1 commit

  • When cross-compiling the bpf sample map_perf_test for aarch64, I find that
    __NR_getpgrp is undefined. This causes build errors. This syscall is deprecated
    and requires defining __ARCH_WANT_SYSCALL_DEPRECATED. To avoid having to define
    that, just use a different syscall (getppid) for the array map stress test.

    Acked-by: Alexei Starovoitov
    Signed-off-by: Joel Fernandes
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Joel Fernandes
     

02 Sep, 2017

1 commit

  • Create a new case to test the LRU lookup performance.

    At the beginning, the LRU map is fully loaded (i.e. the number of keys
    is equal to map->max_entries). The lookup is done through key 0
    to num_map_entries and then repeats from 0 again.

    This patch also creates an anonymous struct to properly
    name the test params in stress_lru_hmap_alloc() in map_perf_test_kern.c.

    Signed-off-by: Martin KaFai Lau
    Acked-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     

20 Aug, 2017

1 commit


03 May, 2017

1 commit

  • Do this change before others start to use this callback.
    Change map_perf_test_user.c which seems to be the only user.

    This patch extends capabilities of commit 9fd63d05f3e8 ("bpf:
    Allow bpf sample programs (*_user.c) to change bpf_map_def").

    Give the fixup callback access to struct bpf_map_data, instead of
    only struct bpf_map_def. This adds flexibility by allowing userspace
    to reassign the map file descriptor, which is very useful for
    sharing maps between several bpf programs.

    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
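
To illustrate the callback described in the commit above, here is a minimal sketch of a fixup routine that overrides max_entries and hands an existing descriptor to one map. The struct layouts, the names toy_map_def, toy_map_data and toy_fixup_map, the "shared_map" name, and the convention that a pre-set fd means "reuse this map" are all assumptions made for illustration; they are not taken from bpf_load.c.

```c
#include <string.h>

/* Hypothetical stand-ins for the sample loader's types; the field
 * names are assumptions chosen for this illustration only. */
struct toy_map_def {
    unsigned int type;
    unsigned int key_size;
    unsigned int value_size;
    unsigned int max_entries;
};

struct toy_map_data {
    int fd;                 /* -1 until a descriptor is assigned */
    const char *name;       /* symbol name of the map */
    struct toy_map_def def; /* definition the loader will create from */
};

/* Descriptor of an already-created map to share; -1 means none. */
static int shared_map_fd = -1;
/* Size requested by the user; 0 keeps the compiled-in value. */
static unsigned int requested_max_entries;

/* Assumed to be called once per map, before the map is created. */
static void toy_fixup_map(struct toy_map_data *map, int idx)
{
    (void)idx; /* the index is unused in this sketch */

    /* Access to the definition allows resizing at runtime. */
    if (requested_max_entries)
        map->def.max_entries = requested_max_entries;

    /* Access to the whole map record also allows reassigning the fd. */
    if (shared_map_fd >= 0 && strcmp(map->name, "shared_map") == 0)
        map->fd = shared_map_fd;
}
```

A loader built this way would skip creation for any map whose fd is already set; whether the real loader behaves like that is not stated in the commit message.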
     

18 Apr, 2017

3 commits

  • This patch adds a map-in-map LRU example.
    If we know only a subset of cores will use the
    LRU, we can allocate a common LRU list per targeted core
    and store it into an array-of-hashes.

    It allows using the common LRU map with map-update performance
    comparable to the BPF_F_NO_COMMON_LRU map but without wasting memory
    on the unused cores that we know will never access the LRU map.

    BPF_F_NO_COMMON_LRU:
    > map_perf_test 32 8 10000000 10000000 | awk '{sum += $3}END{print sum}'
    9234314 (9.23M/s)

    map-in-map LRU:
    > map_perf_test 512 8 1260000 80000000 | awk '{sum += $3}END{print sum}'
    9962743 (9.96M/s)

    Note that the max_entries for the map-in-map LRU test is 1260000 which
    is the max_entries for each inner LRU map. 8 processes have been
    started, so 8 * 1260000 = 10080000 (~10M) which is close to what is
    used in the BPF_F_NO_COMMON_LRU test.

    Signed-off-by: Martin KaFai Lau
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • The current bpf_map_def is statically defined during compile
    time. This patch allows the *_user.c program to change it during
    runtime. It is done by adding load_bpf_file_fixup_map() which
    takes a callback. The callback will be called before creating
    each map so that it has a chance to modify the bpf_map_def.

    The current usecase is to change max_entries in map_perf_test.
    It is interesting to test with a much bigger map size in
    some cases (e.g. the following patch on bpf_lru_map.c).
    However, it is hard to find one size that fits all testing
    environments. Hence, it is handy to take the max_entries
    as a cmdline arg and then configure the bpf_map_def during
    runtime.

    This patch adds two cmdline args. One is to configure
    the map's max_entries. Another is to configure the max_cnt
    which controls how many times a syscall is called.

    Signed-off-by: Martin KaFai Lau
    Acked-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • One more LRU test will be added later in this patch series.
    In this patch, we first move all existing LRU map tests into
    a single syscall (connect) so that the new
    LRU test can be added without hunting for another syscall.

    One of the map names is also changed from percpu_lru_hash_map
    to nocommon_lru_hash_map to avoid confusion with percpu_hash_map.

    Signed-off-by: Martin KaFai Lau
    Acked-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     

17 Mar, 2017

1 commit

  • $ map_perf_test 128
    speed of HASH bpf_map_lookup_elem() in lookups per second
              w/o JIT   w/JIT
    before    46M       58M
    after     42M       74M

    perf report
    before:
    54.23% map_perf_test [kernel.kallsyms] [k] __htab_map_lookup_elem
    14.24% map_perf_test [kernel.kallsyms] [k] lookup_elem_raw
    8.84% map_perf_test [kernel.kallsyms] [k] htab_map_lookup_elem
    5.93% map_perf_test [kernel.kallsyms] [k] bpf_map_lookup_elem
    2.30% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2
    1.49% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler

    after:
    60.03% map_perf_test [kernel.kallsyms] [k] __htab_map_lookup_elem
    18.07% map_perf_test [kernel.kallsyms] [k] lookup_elem_raw
    2.91% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2
    1.94% map_perf_test [kernel.kallsyms] [k] _einittext
    1.90% map_perf_test [kernel.kallsyms] [k] __audit_syscall_exit
    1.72% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler

    Notice that bpf_map_lookup_elem() and htab_map_lookup_elem() are trivial
    functions, yet they take a sizeable amount of cpu time.
    htab_map_gen_lookup() removes bpf_map_lookup_elem() and converts
    htab_map_lookup_elem() into three BPF insns, which causes the cpu time
    for bpf_prog_da4fc6a3f41761a2() to increase slightly.

    $ map_perf_test 256
    speed of ARRAY bpf_map_lookup_elem() in lookups per second
    w/o JIT w/JIT
    before    97M       174M
    after     64M       280M

    before:
    37.33% map_perf_test [kernel.kallsyms] [k] array_map_lookup_elem
    13.95% map_perf_test [kernel.kallsyms] [k] bpf_map_lookup_elem
    6.54% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2
    4.57% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler

    after:
    32.86% map_perf_test [kernel.kallsyms] [k] bpf_prog_da4fc6a3f41761a2
    6.54% map_perf_test [kernel.kallsyms] [k] kprobe_ftrace_handler

    array_map_gen_lookup() removes calls to array_map_lookup_elem()
    and bpf_map_lookup_elem() and replaces them with 7 bpf insns.

    The performance without JIT is slower, since executing extra insns
    in the interpreter is slower than running native C code,
    but with JIT the performance gains are obvious,
    since native C->x86 code is replaced with fewer bpf->x86 instructions.

    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Alexei Starovoitov
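
As a rough illustration of what an inlined array lookup has to compute, the sketch below expresses it in plain C: a bounds check on the index followed by pointer arithmetic into a flat value area. The commit message does not list the seven instructions, so this is an assumption about their effect, and toy_array_map and toy_array_lookup are hypothetical names rather than kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an array map: a flat block of
 * max_entries values, each elem_size bytes long. */
struct toy_array_map {
    uint32_t max_entries;
    uint32_t elem_size;
    unsigned char *values;
};

/* Returns a pointer to the value for *key, or NULL if the index is
 * out of range. No function call is needed once this is inlined. */
static void *toy_array_lookup(const struct toy_array_map *map,
                              const uint32_t *key)
{
    uint32_t index = *key;

    if (index >= map->max_entries)
        return NULL;

    /* Widen before multiplying so large maps do not overflow 32 bits. */
    return map->values + (size_t)index * map->elem_size;
}
```

With the call overhead gone, the remaining work is a compare, a multiply (or shift) and an add, which is consistent with the large gain the commit reports once the JIT is enabled.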
     

24 Jan, 2017

1 commit

  • Extend the map_perf_test_{user,kern}.c infrastructure to stress test
    lpm-trie lookups. We hook into the kprobe on sys_gettid() and measure
    the latency depending on trie size and lookup count.

    On my Intel Haswell i7-6400U, a single gettid() syscall with an empty
    bpf program takes roughly 6.5us. Lookups in empty tries
    take ~1.8us on the first try, ~0.9us on retries. Lookups in tries with 8192
    entries take ~7.1us (on the first _and_ any subsequent try).

    Signed-off-by: David Herrmann
    Reviewed-by: Daniel Mack
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    David Herrmann
     

16 Nov, 2016

1 commit

  • This patch has some unit tests and a test_lru_dist.

    The test_lru_dist reads in the numeric keys from a file.
    The files used here are generated by a modified fio-genzipf tool
    originating from the fio test suite. The sample data file can be
    found here: https://github.com/iamkafai/bpf-lru

    The zipf.* data files have 100k numeric keys, and the keys
    range from 1 to 100k.

    The test_lru_dist outputs the number of unique keys (nr_unique).
    For example, nr_unique:61239 in the output below means that 61239 of
    the 100k keys are unique. nr_misses counts the keys that could not be
    found in the LRU map, so nr_misses must be >= nr_unique.
    test_lru_dist also simulates a perfect LRU map as a comparison:

    [root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
    /root/zipf.100k.a1_01.out 4000 1
    ...
    test_parallel_lru_dist (map_type:9 map_flags:0x0):
    task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31603(/100000)
    task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)
    ....
    test_parallel_lru_dist (map_type:9 map_flags:0x2):
    task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31710(/100000)
    task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)

    [root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
    /root/zipf.100k.a0_01.out 40000 1
    ...
    test_parallel_lru_dist (map_type:9 map_flags:0x0):
    task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67054(/100000)
    task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)
    ...
    test_parallel_lru_dist (map_type:9 map_flags:0x2):
    task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67068(/100000)
    task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)

    LRU map has also been added to map_perf_test:
    /* Global LRU */
    [root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
    ./map_perf_test 16 $i | awk '{r += $3}END{print r " updates"}'; done
    1 cpus: 2934082 updates
    4 cpus: 7391434 updates
    8 cpus: 6500576 updates

    /* Percpu LRU */
    [root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
    ./map_perf_test 32 $i | awk '{r += $3}END{print r " updates"}'; done
    1 cpus: 2896553 updates
    4 cpus: 9766395 updates
    8 cpus: 17460553 updates

    Signed-off-by: Martin KaFai Lau
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Martin KaFai Lau
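
The nr_unique and nr_misses counters described in the commit above can be reproduced with a small exact-LRU simulation. The sketch below is not the code from test_lru_dist; the names lru_stats and simulate_perfect_lru are hypothetical, and the quadratic scans are chosen for brevity, so it is only suitable for short traces.

```c
#include <stdlib.h>

struct lru_stats {
    unsigned int nr_unique; /* distinct keys seen in the trace */
    unsigned int nr_misses; /* lookups that did not find the key cached */
};

/* Replays keys[0..n) against an exact LRU cache with `capacity` slots.
 * Returns 0 on success, -1 on bad arguments or allocation failure. */
static int simulate_perfect_lru(const unsigned int *keys, size_t n,
                                size_t capacity, struct lru_stats *out)
{
    unsigned int *slot_key;
    size_t *slot_last_use;
    size_t used = 0, i, j;

    if (!keys || !out || capacity == 0)
        return -1;

    slot_key = malloc(capacity * sizeof(*slot_key));
    slot_last_use = malloc(capacity * sizeof(*slot_last_use));
    if (!slot_key || !slot_last_use) {
        free(slot_key);
        free(slot_last_use);
        return -1;
    }

    out->nr_unique = 0;
    out->nr_misses = 0;

    for (i = 0; i < n; i++) {
        size_t hit = capacity; /* capacity means "not found" */
        size_t victim = 0;     /* least recently used slot so far */
        int seen_before = 0;

        /* Count a key as unique the first time it appears. */
        for (j = 0; j < i; j++) {
            if (keys[j] == keys[i]) {
                seen_before = 1;
                break;
            }
        }
        if (!seen_before)
            out->nr_unique++;

        /* Search the cache and track the oldest slot along the way. */
        for (j = 0; j < used; j++) {
            if (slot_key[j] == keys[i]) {
                hit = j;
                break;
            }
            if (slot_last_use[j] < slot_last_use[victim])
                victim = j;
        }

        if (hit == capacity) {
            out->nr_misses++;
            /* Fill a free slot first, otherwise evict the oldest. */
            hit = (used < capacity) ? used++ : victim;
            slot_key[hit] = keys[i];
        }
        slot_last_use[hit] = i;
    }

    free(slot_key);
    free(slot_last_use);
    return 0;
}
```

Because the first access to every distinct key is necessarily a miss, nr_misses can never be smaller than nr_unique, which matches the relation stated in the commit message.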
     

07 Apr, 2016

1 commit

  • Building BPF samples is failing with the below error:

    samples/bpf/map_perf_test_user.c: In function ‘main’:
    samples/bpf/map_perf_test_user.c:134:9: error: variable ‘r’ has
    initializer but incomplete type
    struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
    ^
    samples/bpf/map_perf_test_user.c:134:21: error: ‘RLIM_INFINITY’
    undeclared (first use in this function)
    struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
    ^
    samples/bpf/map_perf_test_user.c:134:21: note: each undeclared
    identifier is reported only once for each function it appears in
    samples/bpf/map_perf_test_user.c:134:9: warning: excess elements in
    struct initializer [enabled by default]
    struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
    ^
    samples/bpf/map_perf_test_user.c:134:9: warning: (near initialization
    for ‘r’) [enabled by default]
    samples/bpf/map_perf_test_user.c:134:9: warning: excess elements in
    struct initializer [enabled by default]
    samples/bpf/map_perf_test_user.c:134:9: warning: (near initialization
    for ‘r’) [enabled by default]
    samples/bpf/map_perf_test_user.c:134:16: error: storage size of ‘r’
    isn’t known
    struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
    ^
    samples/bpf/map_perf_test_user.c:139:2: warning: implicit declaration of
    function ‘setrlimit’ [-Wimplicit-function-declaration]
    setrlimit(RLIMIT_MEMLOCK, &r);
    ^
    samples/bpf/map_perf_test_user.c:139:12: error: ‘RLIMIT_MEMLOCK’
    undeclared (first use in this function)
    setrlimit(RLIMIT_MEMLOCK, &r);
    ^
    samples/bpf/map_perf_test_user.c:134:16: warning: unused variable ‘r’
    [-Wunused-variable]
    struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
    ^
    make[2]: *** [samples/bpf/map_perf_test_user.o] Error 1

    Fix this by including the necessary header file.

    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: David S. Miller
    Cc: Ananth N Mavinakayanahalli
    Cc: Michael Ellerman
    Acked-by: Alexei Starovoitov
    Signed-off-by: Naveen N. Rao
    Signed-off-by: David S. Miller

    Naveen N. Rao
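
The commit message above does not name the missing header. On POSIX systems, struct rlimit, RLIM_INFINITY and setrlimit() are declared in <sys/resource.h>, and Linux defines RLIMIT_MEMLOCK there as well, so that is presumably the include in question. A minimal sketch of the failing lines with that header in place (raise_memlock_limit is a hypothetical helper name, not a function from the sample):

```c
/* Declares struct rlimit, RLIM_INFINITY, RLIMIT_MEMLOCK and setrlimit(). */
#include <sys/resource.h>

/* Tries to lift the locked-memory limit for this process.
 * Returns 0 on success, -1 on failure with errno set; raising the
 * hard limit may fail for unprivileged processes. */
static int raise_memlock_limit(void)
{
    struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};

    return setrlimit(RLIMIT_MEMLOCK, &r);
}
```

Without the include, the compiler sees struct rlimit as an incomplete type, which explains the "storage size of 'r' isn't known" error in the log.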
     

09 Mar, 2016

1 commit