04 Oct, 2018

1 commit

  • There is a warning when compiling bpf sample programs in samples/bpf:

    make -C /home/foo/bpf/samples/bpf/../../tools/lib/bpf/ RM='rm -rf' LDFLAGS= srctree=/home/foo/bpf/samples/bpf/../../ O=
    HOSTCC /home/foo/bpf/samples/bpf/tracex3_user.o
    /home/foo/bpf/samples/bpf/tracex3_user.c:20:0: warning: "ARRAY_SIZE" redefined
    #define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))

    In file included from /home/foo/bpf/samples/bpf/tracex3_user.c:18:0:
    ./tools/testing/selftests/bpf/bpf_util.h:48:0: note: this is the location of the previous definition
    # define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
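
The message above does not say how the redefinition was resolved. One common way to avoid this kind of warning, shown here only as a sketch, is to define the macro only when no earlier header has already provided it (the other obvious option is to drop the local definition and rely on the one from bpf_util.h). The table and helper names below are hypothetical and exist only to exercise the macro.

```c
#include <stddef.h>

/* Define ARRAY_SIZE only if an earlier header (for example bpf_util.h)
 * has not already defined it, so the compiler never sees two definitions.
 */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif

/* Hypothetical table, used only to demonstrate the macro. */
static const char *const example_names[] = { "read", "write", "sync" };

/* Returns the number of entries in example_names. */
static size_t example_name_count(void)
{
	return ARRAY_SIZE(example_names);
}
```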

    Signed-off-by: Bo YU
    Signed-off-by: Daniel Borkmann

    Bo YU
     

15 May, 2018

1 commit


03 May, 2017

1 commit


16 Dec, 2016

1 commit

  • Switch all of the sample code to use the function names from
    tools/lib/bpf, so that the two are consistent, and to declare their
    own log buffers. This allows the next commit to be devoted purely to
    getting rid of the duplicate library in samples/bpf.

    Committer notes:

    Testing it:

    On a fedora rawhide container, with clang/llvm 3.9, sharing the host
    linux kernel git tree:

    # make O=/tmp/build/linux/ headers_install
    # make O=/tmp/build/linux -C samples/bpf/

    Since I forgot to make it privileged, just tested it outside the
    container, using what it generated:

    # uname -a
    Linux jouet 4.9.0-rc8+ #1 SMP Mon Dec 12 11:20:49 BRT 2016 x86_64 x86_64 x86_64 GNU/Linux
    # cd /var/lib/docker/devicemapper/mnt/c43e09a53ff56c86a07baf79847f00e2cc2a17a1e2220e1adbf8cbc62734feda/rootfs/tmp/build/linux/samples/bpf/
    # ls -la offwaketime
    -rwxr-xr-x. 1 root root 24200 Dec 15 12:19 offwaketime
    # file offwaketime
    offwaketime: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=c940d3f127d5e66cdd680e42d885cb0b64f8a0e4, not stripped
    # readelf -SW offwaketime_kern.o | grep PROGBITS
    [ 2] .text PROGBITS 0000000000000000 000040 000000 00 AX 0 0 4
    [ 3] kprobe/try_to_wake_up PROGBITS 0000000000000000 000040 0000d8 00 AX 0 0 8
    [ 5] tracepoint/sched/sched_switch PROGBITS 0000000000000000 000118 000318 00 AX 0 0 8
    [ 7] maps PROGBITS 0000000000000000 000430 000050 00 WA 0 0 4
    [ 8] license PROGBITS 0000000000000000 000480 000004 00 WA 0 0 1
    [ 9] version PROGBITS 0000000000000000 000484 000004 00 WA 0 0 4
    # ./offwaketime | head -5
    swapper/1;start_secondary;cpu_startup_entry;schedule_preempt_disabled;schedule;__schedule;-;---;; 106
    CPU 0/KVM;entry_SYSCALL_64_fastpath;sys_ioctl;do_vfs_ioctl;kvm_vcpu_ioctl;kvm_arch_vcpu_ioctl_run;kvm_vcpu_block;schedule;__schedule;-;try_to_wake_up;swake_up_locked;swake_up;apic_timer_expired;apic_timer_fn;__hrtimer_run_queues;hrtimer_interrupt;local_apic_timer_interrupt;smp_apic_timer_interrupt;__irqentry_text_start;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary;;swapper/3 2
    Compositor;entry_SYSCALL_64_fastpath;sys_futex;do_futex;futex_wait;futex_wait_queue_me;schedule;__schedule;-;try_to_wake_up;futex_requeue;do_futex;sys_futex;entry_SYSCALL_64_fastpath;;SoftwareVsyncTh 5
    firefox;entry_SYSCALL_64_fastpath;sys_poll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;pollwake;__wake_up_common;__wake_up_sync_key;pipe_write;__vfs_write;vfs_write;sys_write;entry_SYSCALL_64_fastpath;;Timer 13
    JS Helper;entry_SYSCALL_64_fastpath;sys_futex;do_futex;futex_wait;futex_wait_queue_me;schedule;__schedule;-;try_to_wake_up;do_futex;sys_futex;entry_SYSCALL_64_fastpath;;firefox 2
    #

    Signed-off-by: Joe Stringer
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Wang Nan
    Cc: netdev@vger.kernel.org
    Link: http://lkml.kernel.org/r/20161214224342.12858-2-joe@ovn.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Joe Stringer
     

28 Nov, 2016

1 commit

  • 1) test_lru_map and test_lru_dist fail to build on my machine because
    the sys/resource.h header is not included.

    2) test_verifier fails in one test case where we try to call an invalid
    function, since the verifier log output changed wrt printing function
    names.

    3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for
    retrieving the number of possible CPUs. This is broken at least in our
    scenario and really just doesn't work.

    glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF.
    First it tries the equivalent of
    ls -d /sys/devices/system/cpu/cpu[0-9]* | wc -l; if that fails,
    depending on the config, it either tries to count CPUs in
    /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead.
    If /proc/cpuinfo has some issue, it returns just 1 in the worst case.
    This oddity is nothing new [1], but the semantics/behaviour seem to be
    settled. _SC_NPROCESSORS_ONLN will parse
    /sys/devices/system/cpu/online; if that fails, it looks into
    /proc/stat for cpuX entries, and if that also fails for some reason,
    /proc/cpuinfo is consulted (returning 1 in the unlikely case that
    everything breaks down).

    While that might match num_possible_cpus() from the kernel in some
    cases, it's really not guaranteed with CPU hotplugging, and can result
    in a buffer overflow: the array in user space could have too few
    slots, and on a percpu map lookup the kernel will write beyond the
    end of that value buffer.
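
To make the overflow concrete, here is a minimal sketch of how large the user-space value buffer for a per-CPU map lookup has to be. It assumes the kernel copies one value per possible CPU, each rounded up to 8 bytes; the helper names are hypothetical and not part of the selftests or samples.

```c
#include <stddef.h>
#include <stdlib.h>

/* Bytes the kernel writes on a per-CPU map lookup: one slot per possible
 * CPU, with each slot rounded up to a multiple of 8 bytes (assumption
 * stated in the text above).
 */
static size_t percpu_lookup_buf_size(size_t value_size,
				     unsigned int possible_cpus)
{
	size_t slot = (value_size + 7) & ~(size_t)7;

	return slot * possible_cpus;
}

/* Allocates a zeroed buffer large enough for one per-CPU lookup.
 * Returns NULL on allocation failure; the caller frees the buffer.
 */
static void *alloc_percpu_values(size_t value_size,
				 unsigned int possible_cpus)
{
	return calloc(1, percpu_lookup_buf_size(value_size, possible_cpus));
}
```

With 8-byte values, a buffer sized for 2 CPUs holds 16 bytes, while a kernel with 4 possible CPUs writes 32, which is exactly the mismatch described above.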

    William Tu reported such mismatches:

    [...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu()
    happens when CPU hotadd is enabled. For example, in Fusion when
    setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64
    -smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf()
    will be 2 [2]. [...]

    Documentation/cputopology.txt says /sys/devices/system/cpu/possible
    outputs cpu_possible_mask. That is the same mask num_possible_cpus()
    uses, so the first step would be to replace the _SC_NPROCESSORS_CONF
    calls with our own implementation. Later, we could add support to
    bpf(2) for passing a mask via CPU_SET(3), for example, to just select
    a subset of CPUs.

    BPF samples code needs this fix as well (at least so that people stop
    copying this). Thus, define bpf_num_possible_cpus() once in selftests
    and import it from there for the sample code to avoid duplicating it.
    The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated.
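
As a rough illustration of the approach, the sketch below counts the CPUs listed in /sys/devices/system/cpu/possible, which holds a CPU list such as "0-3". It is not the bpf_num_possible_cpus() helper this commit adds; the function names are hypothetical, and the parser also accepts comma-separated lists (for example "0,2-5") as a defensive assumption.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Counts the CPUs named in a CPU-list string such as "0-3" or "0,2-5".
 * Returns 0 if the string is empty or malformed.
 */
static unsigned int count_cpus_in_list(const char *list)
{
	unsigned int total = 0;
	const char *p = list;

	while (*p != '\0' && *p != '\n') {
		unsigned long start, end;
		char *next;

		if (!isdigit((unsigned char)*p))
			return 0;
		start = strtoul(p, &next, 10);
		end = start;
		if (*next == '-') {		/* range "start-end" */
			p = next + 1;
			if (!isdigit((unsigned char)*p))
				return 0;
			end = strtoul(p, &next, 10);
		}
		if (end < start)
			return 0;
		total += (unsigned int)(end - start + 1);
		p = next;
		if (*p == ',')			/* more entries follow */
			p++;
	}
	return total;
}

/* Reads the possible-CPU list from sysfs; returns 0 on any failure. */
static unsigned int possible_cpus_from_sysfs(void)
{
	char buf[256];
	unsigned int n = 0;
	FILE *fp = fopen("/sys/devices/system/cpu/possible", "r");

	if (!fp)
		return 0;
	if (fgets(buf, sizeof(buf), fp))
		n = count_cpus_in_list(buf);
	fclose(fp);
	return n;
}
```

A caller would treat a return value of 0 as an error rather than falling back to sysconf(), since an undersized buffer is the very problem being avoided.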

    After all three issues are fixed, the test suite runs fine again:

    # make run_tests | grep self
    selftests: test_verifier [PASS]
    selftests: test_maps [PASS]
    selftests: test_lru_map [PASS]
    selftests: test_kmod.sh [PASS]

    [1] https://www.sourceware.org/ml/libc-alpha/2011-06/msg00079.html
    [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg121183.html

    Fixes: 3059303f59cf ("samples/bpf: update tracex[23] examples to use per-cpu maps")
    Fixes: 86af8b4191d2 ("Add sample for adding simple drop program to link")
    Fixes: df570f577231 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY")
    Fixes: e15596717948 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_HASH")
    Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id")
    Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab")
    Signed-off-by: Daniel Borkmann
    Cc: William Tu
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

06 Feb, 2016

1 commit


02 Apr, 2015

1 commit

  • BPF C program attaches to
    blk_mq_start_request()/blk_update_request() kprobe events to
    calculate IO latency.

    For every completed block IO event it computes the time delta
    in nsec and records it in a histogram map:

    map[log10(delta)*10]++
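
BPF programs cannot call a floating-point log10(), so the slot index has to be computed with integer arithmetic. The sketch below is one possible approximation, not necessarily what the sample does: it uses log10(x)*10 ≈ log2(x)*3 (since 10/log2(10) ≈ 3.01), with a fixed-point log2 that interpolates linearly inside each power of two. The function names and the 100-slot limit are assumptions made for illustration.

```c
#include <stdint.h>

#define SLOTS 100	/* assumed histogram size: 10 slots per decade */

/* Integer floor(log2(n)) for n > 0. */
static unsigned int ilog2_u64(uint64_t n)
{
	unsigned int l = 0;

	while (n >>= 1)
		l++;
	return l;
}

/* Approximates log10(delta_ns) * 10 using integers only. */
static unsigned int latency_slot(uint64_t delta_ns)
{
	unsigned int l, idx;
	uint64_t base, frac, log2_fp;

	if (delta_ns == 0)
		return 0;

	l = ilog2_u64(delta_ns);
	base = 1ULL << l;
	/* Fractional part of log2 in 1/64 units, by linear interpolation
	 * between base and 2*base; shifts avoid overflow for large deltas.
	 */
	if (l >= 6)
		frac = (delta_ns - base) >> (l - 6);
	else
		frac = (delta_ns - base) << (6 - l);
	log2_fp = (uint64_t)l * 64 + frac;

	idx = (unsigned int)(log2_fp * 3 / 64);
	return idx < SLOTS ? idx : SLOTS - 1;
}
```

With this approximation, 1 us (1000 ns) lands in slot 29, 1 ms in slot 59 and 1 s in slot 89, so each decade of latency spans ten slots as the formula intends.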

    User space reads this histogram map every 2 seconds and prints
    it as a 'heatmap' using the gray shades of a text terminal. Black
    spaces have many events and white spaces have very few events.
    The leftmost space is the smallest latency, the rightmost space is
    the largest latency in the range.

    Usage:

    $ sudo ./tracex3
    and do 'sudo dd if=/dev/sda of=/dev/null' in another terminal.

    Observe IO latencies and how different activities (like 'make
    kernel') affect them.

    Similar experiments can be done for network transmit latencies,
    syscalls, etc.

    The '-t' flag prints the heatmap using plain ASCII characters:

    $ sudo ./tracex3 -t
    heatmap of IO latency
    # - many events with this latency
    - few events
    |1us |10us |100us |1ms |10ms |100ms |1s |10s
    *ooo. *O.#. # 221
    . *# . # 125
    .. .o#*.. # 55
    . . . . .#O # 37
    .# # 175
    .#*. # 37
    # # 199
    . . *#*. # 55
    *#..* # 42
    # # 266
    ...***Oo#*OO**o#* . # 629
    # # 271
    . .#o* o.*o* # 221
    . . o* *#O.. # 50
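
The text-mode rows above can be produced by mapping each slot's event count to a character. The sketch below is a guess at the general idea, not the sample's actual code: the shade set " .o*O#" is inferred from the characters visible in the output, and the scaling relative to the busiest slot in the row is an assumption.

```c
#include <stddef.h>

/* Assumed shade ramp, from "no events" (space) to "many events" ('#'). */
static const char shades[] = " .o*O#";
#define MAX_LEVEL (sizeof(shades) - 2)	/* index of '#' */

/* Renders one heatmap row. Writes n characters plus a terminating NUL
 * into out, so out must have room for n + 1 bytes.
 */
static void render_row(const unsigned long *counts, size_t n, char *out)
{
	unsigned long max = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (counts[i] > max)
			max = counts[i];

	for (i = 0; i < n; i++) {
		size_t level = 0;

		/* Round up so any non-zero count gets a visible mark. */
		if (max > 0)
			level = (counts[i] * MAX_LEVEL + max - 1) / max;
		out[i] = shades[level];
	}
	out[n] = '\0';
}
```

Rounding up keeps rare events visible as '.', while only the busiest slot in a row is drawn as '#'.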

    Signed-off-by: Alexei Starovoitov
    Cc: Arnaldo Carvalho de Melo
    Cc: Daniel Borkmann
    Cc: David S. Miller
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Masami Hiramatsu
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1427312966-8434-9-git-send-email-ast@plumgrid.com
    Signed-off-by: Ingo Molnar

    Alexei Starovoitov