04 Jan, 2017

1 commit


30 Dec, 2016

4 commits


24 Dec, 2016

1 commit

  • Pull perf fixes from Ingo Molnar:
    "On the kernel side there's two x86 PMU driver fixes and a uprobes fix,
    plus on the tooling side there's a number of fixes and some late
    updates"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    perf sched timehist: Fix invalid period calculation
    perf sched timehist: Remove hardcoded 'comm_width' check at print_summary
    perf sched timehist: Enlarge default 'comm_width'
    perf sched timehist: Honour 'comm_width' when aligning the headers
    perf/x86: Fix overlap counter scheduling bug
    perf/x86/pebs: Fix handling of PEBS buffer overflows
    samples/bpf: Move open_raw_sock to separate header
    samples/bpf: Remove perf_event_open() declaration
    samples/bpf: Be consistent with bpf_load_program bpf_insn parameter
    tools lib bpf: Add bpf_prog_{attach,detach}
    samples/bpf: Switch over to libbpf
    perf diff: Do not overwrite valid build id
    perf annotate: Don't throw error for zero length symbols
    perf bench futex: Fix lock-pi help string
    perf trace: Check if MAP_32BIT is defined (again)
    samples/bpf: Make perf_event_read() static
    uprobes: Fix uprobes on MIPS, allow for a cache flush after ixol breakpoint creation
    samples/bpf: Make samples more libbpf-centric
    tools lib bpf: Add flags to bpf_create_map()
    tools lib bpf: use __u32 from linux/types.h
    ...

    Linus Torvalds
     

20 Dec, 2016

6 commits

  • This function was declared in libbpf.c and was the only remaining
    function in this library, but has nothing to do with BPF. Shift it out
    into a new header, sock_example.h, and include it from the relevant
    samples.

    Signed-off-by: Joe Stringer
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Wang Nan
    Link: http://lkml.kernel.org/r/20161209024620.31660-8-joe@ovn.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Joe Stringer
     
  • This declaration was made in samples/bpf/libbpf.c for convenience, but
    there's already one in tools/perf/perf-sys.h. Reuse that one.

    Committer notes:

    Testing it:

    $ make -j4 O=../build/v4.9.0-rc8+ samples/bpf/
    make[1]: Entering directory '/home/build/v4.9.0-rc8+'
    CHK include/config/kernel.release
    GEN ./Makefile
    CHK include/generated/uapi/linux/version.h
    Using /home/acme/git/linux as source for kernel
    CHK include/generated/utsrelease.h
    CHK include/generated/timeconst.h
    CHK include/generated/bounds.h
    CHK include/generated/asm-offsets.h
    CALL /home/acme/git/linux/scripts/checksyscalls.sh
    HOSTCC samples/bpf/test_verifier.o
    HOSTCC samples/bpf/libbpf.o
    HOSTCC samples/bpf/../../tools/lib/bpf/bpf.o
    HOSTCC samples/bpf/test_maps.o
    HOSTCC samples/bpf/sock_example.o
    HOSTCC samples/bpf/bpf_load.o

    HOSTLD samples/bpf/trace_event
    HOSTLD samples/bpf/sampleip
    HOSTLD samples/bpf/tc_l2_redirect
    make[1]: Leaving directory '/home/build/v4.9.0-rc8+'
    $

    Also tested the offwaketime resulting from the rebuild, seems to work as
    before.

    Signed-off-by: Joe Stringer
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Wang Nan
    Link: http://lkml.kernel.org/r/20161209024620.31660-7-joe@ovn.org
    [ Use -I$(srctree)/tools/lib/ to support out of source code tree builds ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Joe Stringer
     
  • Only one of the examples declares the bpf_insn bpf proggie as a const:

    $ grep 'struct bpf_insn [a-z]' samples/bpf/*.c
    samples/bpf/fds_example.c: static const struct bpf_insn insns[] = {
    samples/bpf/sock_example.c: struct bpf_insn prog[] = {
    samples/bpf/test_cgrp2_attach2.c: struct bpf_insn prog[] = {
    samples/bpf/test_cgrp2_attach.c: struct bpf_insn prog[] = {
    samples/bpf/test_cgrp2_sock.c: struct bpf_insn prog[] = {
    $

    Which causes this warning:

    [root@f5065a7d6272 linux]# make -j4 O=/tmp/build/linux samples/bpf/

    HOSTCC samples/bpf/fds_example.o
    /git/linux/samples/bpf/fds_example.c: In function 'bpf_prog_create':
    /git/linux/samples/bpf/fds_example.c:63:6: warning: passing argument 2 of 'bpf_load_program' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
    insns, insns_cnt, "GPL", 0,
    ^~~~~
    In file included from /git/linux/samples/bpf/libbpf.h:5:0,
    from /git/linux/samples/bpf/bpf_load.h:4,
    from /git/linux/samples/bpf/fds_example.c:15:
    /git/linux/tools/lib/bpf/bpf.h:31:5: note: expected 'struct bpf_insn *' but argument is of type 'const struct bpf_insn *'
    int bpf_load_program(enum bpf_prog_type type, struct bpf_insn *insns,
    ^~~~~~~~~~~~~~~~
    HOSTCC samples/bpf/sockex1_user.o

    So just ditch that 'const' to reduce build noise, leaving changing the
    bpf_load_program() bpf_insn parameter to const to a later patch, if deemed
    adequate.
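
    For illustration only, a minimal sketch of what a const-qualified
    instruction parameter could look like. struct demo_insn and
    demo_count_opcodes() are made-up names, not part of tools/lib/bpf:

```c
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_insn. */
struct demo_insn {
	unsigned char code;
	int imm;
};

/*
 * Because the parameter is a pointer to const, callers may pass either a
 * const or a non-const array without a discarded-qualifier warning.
 */
size_t demo_count_opcodes(const struct demo_insn *insns, size_t cnt)
{
	size_t i, n = 0;

	for (i = 0; i < cnt; i++)
		if (insns[i].code != 0)
			n++;
	return n;
}
```

    With such a prototype, both a 'static const struct demo_insn insns[]'
    and a plain 'struct demo_insn prog[]' can be passed unchanged.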

    Cc: Joe Stringer
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-1z5xee8n3oa66jf62bpv16ed@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Commit d8c5b17f2bc0 ("samples: bpf: add userspace example for attaching
    eBPF programs to cgroups") added these functions to samples/libbpf, but
    during this merge all of the samples libbpf functionality is shifting to
    tools/lib/bpf. Shift these functions there.

    Committer notes:

    Use bzero + attr.FIELD = value instead of 'attr = { .FIELD = value }', just
    like the other wrapper calls to sys_bpf with bpf_attr, to make this build
    in older toolchains, such as the ones in CentOS 5 and 6.
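
    A rough sketch of the zero-then-assign pattern, assuming a made-up
    union demo_attr in place of the real union bpf_attr (the layout and
    the helper name are hypothetical):

```c
#include <string.h>	/* memset() */

/* Hypothetical stand-in for union bpf_attr; not the real UAPI layout. */
union demo_attr {
	struct {
		unsigned int target_fd;
		unsigned int attach_bpf_fd;
		unsigned int attach_type;
	};
	unsigned long long pad[8];
};

/*
 * Zero the whole union first, then assign the fields one at a time,
 * instead of using a designated initializer on anonymous members.
 */
void demo_fill_attach_attr(union demo_attr *attr, unsigned int prog_fd,
			   unsigned int target_fd, unsigned int type)
{
	memset(attr, 0, sizeof(*attr));	/* same effect as bzero() */
	attr->target_fd = target_fd;
	attr->attach_bpf_fd = prog_fd;
	attr->attach_type = type;
}
```

    Clearing the full union also guarantees that bytes not covered by the
    assigned fields are zero before the structure is handed to a syscall.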

    Signed-off-by: Joe Stringer
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-au2zvtsh55vqeo3v3uw7jr4c@git.kernel.org
    Link: https://github.com/joestringer/linux/commit/353e6f298c3d0a92fa8bfa61ff898c5050261a12.patch
    Signed-off-by: Arnaldo Carvalho de Melo

    Joe Stringer
     
  • Now that libbpf under tools/lib/bpf/* is synced with the version from
    samples/bpf, we can get rid of most of the libbpf library here.

    Committer notes:

    Built it in a docker fedora rawhide container and ran it in the f25 host, seems
    to work just like it did before this patch, i.e. the switch to tools/lib/bpf/
    doesn't seem to have introduced problems and Joe said he tested it with
    all the entries in samples/bpf/ and other code he found:

    [root@f5065a7d6272 linux]# make -j4 O=/tmp/build/linux headers_install

    [root@f5065a7d6272 linux]# rm -rf /tmp/build/linux/samples/bpf/
    [root@f5065a7d6272 linux]# make -j4 O=/tmp/build/linux samples/bpf/
    make[1]: Entering directory '/tmp/build/linux'
    CHK include/config/kernel.release
    HOSTCC scripts/basic/fixdep
    GEN ./Makefile
    CHK include/generated/uapi/linux/version.h
    Using /git/linux as source for kernel
    CHK include/generated/utsrelease.h
    HOSTCC scripts/basic/bin2c
    HOSTCC arch/x86/tools/relocs_32.o
    HOSTCC arch/x86/tools/relocs_64.o
    LD samples/bpf/built-in.o

    HOSTCC samples/bpf/fds_example.o
    HOSTCC samples/bpf/sockex1_user.o
    /git/linux/samples/bpf/fds_example.c: In function 'bpf_prog_create':
    /git/linux/samples/bpf/fds_example.c:63:6: warning: passing argument 2 of 'bpf_load_program' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
    insns, insns_cnt, "GPL", 0,
    ^~~~~
    In file included from /git/linux/samples/bpf/libbpf.h:5:0,
    from /git/linux/samples/bpf/bpf_load.h:4,
    from /git/linux/samples/bpf/fds_example.c:15:
    /git/linux/tools/lib/bpf/bpf.h:31:5: note: expected 'struct bpf_insn *' but argument is of type 'const struct bpf_insn *'
    int bpf_load_program(enum bpf_prog_type type, struct bpf_insn *insns,
    ^~~~~~~~~~~~~~~~
    HOSTCC samples/bpf/sockex2_user.o

    HOSTCC samples/bpf/xdp_tx_iptunnel_user.o
    clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.2.1/include -I/git/linux/arch/x86/include -I./arch/x86/include/generated/uapi -I./arch/x86/include/generated -I/git/linux/include -I./include -I/git/linux/arch/x86/include/uapi -I/git/linux/include/uapi -I./include/generated/uapi -include /git/linux/include/linux/kconfig.h \
    -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
    -Wno-compare-distinct-pointer-types \
    -Wno-gnu-variable-sized-type-not-at-end \
    -Wno-address-of-packed-member -Wno-tautological-compare \
    -O2 -emit-llvm -c /git/linux/samples/bpf/sockex1_kern.c -o -| llc -march=bpf -filetype=obj -o samples/bpf/sockex1_kern.o
    HOSTLD samples/bpf/tc_l2_redirect

    HOSTLD samples/bpf/lwt_len_hist
    HOSTLD samples/bpf/xdp_tx_iptunnel
    make[1]: Leaving directory '/tmp/build/linux'
    [root@f5065a7d6272 linux]#

    And then, in the host:

    [root@jouet bpf]# mount | grep "docker.*devicemapper\/"
    /dev/mapper/docker-253:0-1705076-9bd8aa1e0af33adce89ff42090847868ca676932878942be53941a06ec5923f9 on /var/lib/docker/devicemapper/mnt/9bd8aa1e0af33adce89ff42090847868ca676932878942be53941a06ec5923f9 type xfs (rw,relatime,context="system_u:object_r:container_file_t:s0:c73,c276",nouuid,attr2,inode64,sunit=1024,swidth=1024,noquota)
    [root@jouet bpf]# cd /var/lib/docker/devicemapper/mnt/9bd8aa1e0af33adce89ff42090847868ca676932878942be53941a06ec5923f9/rootfs/tmp/build/linux/samples/bpf/
    [root@jouet bpf]# file offwaketime
    offwaketime: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=f423d171e0487b2f802b6a792657f0f3c8f6d155, not stripped
    [root@jouet bpf]# readelf -SW offwaketime
    offwaketime offwaketime_kern.o offwaketime_user.o
    [root@jouet bpf]# readelf -SW offwaketime_kern.o
    There are 11 section headers, starting at offset 0x700:

    Section Headers:
    [Nr] Name Type Address Off Size ES Flg Lk Inf Al
    [ 0] NULL 0000000000000000 000000 000000 00 0 0 0
    [ 1] .strtab STRTAB 0000000000000000 000658 0000a8 00 0 0 1
    [ 2] .text PROGBITS 0000000000000000 000040 000000 00 AX 0 0 4
    [ 3] kprobe/try_to_wake_up PROGBITS 0000000000000000 000040 0000d8 00 AX 0 0 8
    [ 4] .relkprobe/try_to_wake_up REL 0000000000000000 0005a8 000020 10 10 3 8
    [ 5] tracepoint/sched/sched_switch PROGBITS 0000000000000000 000118 000318 00 AX 0 0 8
    [ 6] .reltracepoint/sched/sched_switch REL 0000000000000000 0005c8 000090 10 10 5 8
    [ 7] maps PROGBITS 0000000000000000 000430 000050 00 WA 0 0 4
    [ 8] license PROGBITS 0000000000000000 000480 000004 00 WA 0 0 1
    [ 9] version PROGBITS 0000000000000000 000484 000004 00 WA 0 0 4
    [10] .symtab SYMTAB 0000000000000000 000488 000120 18 1 4 8
    Key to Flags:
    W (write), A (alloc), X (execute), M (merge), S (strings)
    I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
    O (extra OS processing required) o (OS specific), p (processor specific)
    [root@jouet bpf]# ./offwaketime | head -3
    qemu-system-x86;entry_SYSCALL_64_fastpath;sys_ppoll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;hrtimer_wakeup;__hrtimer_run_queues;hrtimer_interrupt;local_apic_timer_interrupt;smp_apic_timer_interrupt;__irqentry_text_start;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel;start_cpu;;swapper/0 4
    firefox;entry_SYSCALL_64_fastpath;sys_poll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;pollwake;__wake_up_common;__wake_up_sync_key;pipe_write;__vfs_write;vfs_write;sys_write;entry_SYSCALL_64_fastpath;;Timer 1
    swapper/2;start_cpu;start_secondary;cpu_startup_entry;schedule_preempt_disabled;schedule;__schedule;-;---;; 61
    [root@jouet bpf]#

    Signed-off-by: Joe Stringer
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Wang Nan
    Cc: netdev@vger.kernel.org
    Link: https://github.com/joestringer/linux/commit/5c40f54a52b1f437123c81e21873f4b4b1f9bd55.patch
    Link: http://lkml.kernel.org/n/tip-xr8twtx7sjh5821g8qw47yxk@git.kernel.org
    [ Use -I$(srctree)/tools/lib/ to support out of source code tree builds, as noticed by Wang Nan ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Joe Stringer
     
  • While testing Joe's conversion of samples/bpf/ to use tools/lib/bpf/ I noticed
    this warning when building samples/bpf/ in a Fedora Rawhide container with
    clang/llvm 3.9:

    [root@1e797fdfbf4f linux]# make -j4 O=/tmp/build/linux/ samples/bpf/
    make[1]: Entering directory '/tmp/build/linux'
    CHK include/config/kernel.release
    GEN ./Makefile
    CHK include/generated/uapi/linux/version.h
    Using /git/linux as source for kernel

    HOSTCC samples/bpf/trace_output_user.o
    /git/linux/samples/bpf/trace_output_user.c:64:6: warning: no previous
    prototype for 'perf_event_read' [-Wmissing-prototypes]
    void perf_event_read(print_fn fn)
    ^~~~~~~~~~~~~~~
    HOSTLD samples/bpf/trace_output
    make[1]: Leaving directory '/tmp/build/linux'

    Shut up the compiler by making that function static.
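
    A minimal sketch of the idea, using made-up names (demo_event_read,
    demo_print_len) rather than the sample's own code: a function used
    only within one file gets internal linkage, so the compiler no longer
    expects a prototype in a header.

```c
#include <string.h>	/* strlen() */

typedef int (*print_fn)(const char *data);

/* Hypothetical callback: reports the length of the data it was given. */
static int demo_print_len(const char *data)
{
	return (int)strlen(data);
}

/*
 * 'static' limits the function to this translation unit, which is what
 * silences -Wmissing-prototypes for a file-local helper.
 */
static int demo_event_read(print_fn fn)
{
	return fn("sample");
}
```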

    Acked-by: Daniel Borkmann
    Cc: Alexei Starovoitov
    Cc: Joe Stringer
    Cc: Wang Nan
    Link: http://lkml.kernel.org/r/20161215152927.GC6866@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

18 Dec, 2016

1 commit

  • Pull kbuild updates from Michal Marek:

    - prototypes for x86 asm-exported symbols (Adam Borowski) and a warning
    about missing CRCs (Nick Piggin)

    - asm-exports fix for LTO (Nicolas Pitre)

    - thin archives improvements (Nick Piggin)

    - linker script fix for CONFIG_LD_DEAD_CODE_DATA_ELIMINATION (Nick
    Piggin)

    - genksyms support for __builtin_va_list keyword

    - misc minor fixes

    * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
    x86/kbuild: enable modversions for symbols exported from asm
    kbuild: fix scripts/adjust_autoksyms.sh* for the no modules case
    scripts/kallsyms: remove last remnants of --page-offset option
    make use of make variable CURDIR instead of calling pwd
    kbuild: cmd_export_list: tighten the sed script
    kbuild: minor improvement for thin archives build
    kbuild: modpost warn if export version crc is missing
    kbuild: keep data tables through dead code elimination
    kbuild: improve linker compatibility with lib-ksyms.o build
    genksyms: Regenerate parser
    kbuild/genksyms: handle va_list type
    kbuild: thin archives for multi-y targets
    kbuild: kallsyms allow 3-pass generation if symbols size has changed

    Linus Torvalds
     

16 Dec, 2016

2 commits

  • Pull tracing updates from Steven Rostedt:
    "This release has a few updates:

    - STM can hook into the function tracer
    - Function filtering now supports more advanced glob matching
    - Ftrace selftests updates and added tests
    - Softirq tag in traces now shows only softirqs
    - ARM nop added to non traced locations at compile time
    - New trace_marker_raw file that allows for binary input
    - Optimizations to the ring buffer
    - Removal of kmap in trace_marker
    - Wakeup and irqsoff tracers now adhere to the set_graph_notrace file
    - Other various fixes and clean ups"

    * tag 'trace-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (42 commits)
    selftests: ftrace: Shift down default message verbosity
    kprobes/trace: Fix kprobe selftest for newer gcc
    tracing/kprobes: Add a helper method to return number of probe hits
    tracing/rb: Init the CPU mask on allocation
    tracing: Use SOFTIRQ_OFFSET for softirq dectection for more accurate results
    tracing/fgraph: Have wakeup and irqsoff tracers ignore graph functions too
    fgraph: Handle a case where a tracer ignores set_graph_notrace
    tracing: Replace kmap with copy_from_user() in trace_marker writing
    ftrace/x86_32: Set ftrace_stub to weak to prevent gcc from using short jumps to it
    tracing: Allow benchmark to be enabled at early_initcall()
    tracing: Have system enable return error if one of the events fail
    tracing: Do not start benchmark on boot up
    tracing: Have the reg function allow to fail
    ring-buffer: Force rb_end_commit() and rb_set_commit_to_write() inline
    ring-buffer: Froce rb_update_write_stamp() to be inlined
    ring-buffer: Force inline of hotpath helper functions
    tracing: Make __buffer_unlock_commit() always_inline
    tracing: Make tracepoint_printk a static_key
    ring-buffer: Always inline rb_event_data()
    ring-buffer: Make rb_reserve_next_event() always inlined
    ...

    Linus Torvalds
     
  • Switch all of the sample code to use the function names from
    tools/lib/bpf so that they're consistent with that, and to declare their
    own log buffers. This allows the next commit to be purely devoted to
    getting rid of the duplicate library in samples/bpf.

    Committer notes:

    Testing it:

    On a fedora rawhide container, with clang/llvm 3.9, sharing the host
    linux kernel git tree:

    # make O=/tmp/build/linux/ headers_install
    # make O=/tmp/build/linux -C samples/bpf/

    Since I forgot to make it privileged, just tested it outside the
    container, using what it generated:

    # uname -a
    Linux jouet 4.9.0-rc8+ #1 SMP Mon Dec 12 11:20:49 BRT 2016 x86_64 x86_64 x86_64 GNU/Linux
    # cd /var/lib/docker/devicemapper/mnt/c43e09a53ff56c86a07baf79847f00e2cc2a17a1e2220e1adbf8cbc62734feda/rootfs/tmp/build/linux/samples/bpf/
    # ls -la offwaketime
    -rwxr-xr-x. 1 root root 24200 Dec 15 12:19 offwaketime
    # file offwaketime
    offwaketime: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=c940d3f127d5e66cdd680e42d885cb0b64f8a0e4, not stripped
    # readelf -SW offwaketime_kern.o | grep PROGBITS
    [ 2] .text PROGBITS 0000000000000000 000040 000000 00 AX 0 0 4
    [ 3] kprobe/try_to_wake_up PROGBITS 0000000000000000 000040 0000d8 00 AX 0 0 8
    [ 5] tracepoint/sched/sched_switch PROGBITS 0000000000000000 000118 000318 00 AX 0 0 8
    [ 7] maps PROGBITS 0000000000000000 000430 000050 00 WA 0 0 4
    [ 8] license PROGBITS 0000000000000000 000480 000004 00 WA 0 0 1
    [ 9] version PROGBITS 0000000000000000 000484 000004 00 WA 0 0 4
    # ./offwaketime | head -5
    swapper/1;start_secondary;cpu_startup_entry;schedule_preempt_disabled;schedule;__schedule;-;---;; 106
    CPU 0/KVM;entry_SYSCALL_64_fastpath;sys_ioctl;do_vfs_ioctl;kvm_vcpu_ioctl;kvm_arch_vcpu_ioctl_run;kvm_vcpu_block;schedule;__schedule;-;try_to_wake_up;swake_up_locked;swake_up;apic_timer_expired;apic_timer_fn;__hrtimer_run_queues;hrtimer_interrupt;local_apic_timer_interrupt;smp_apic_timer_interrupt;__irqentry_text_start;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary;;swapper/3 2
    Compositor;entry_SYSCALL_64_fastpath;sys_futex;do_futex;futex_wait;futex_wait_queue_me;schedule;__schedule;-;try_to_wake_up;futex_requeue;do_futex;sys_futex;entry_SYSCALL_64_fastpath;;SoftwareVsyncTh 5
    firefox;entry_SYSCALL_64_fastpath;sys_poll;do_sys_poll;poll_schedule_timeout;schedule_hrtimeout_range;schedule_hrtimeout_range_clock;schedule;__schedule;-;try_to_wake_up;pollwake;__wake_up_common;__wake_up_sync_key;pipe_write;__vfs_write;vfs_write;sys_write;entry_SYSCALL_64_fastpath;;Timer 13
    JS Helper;entry_SYSCALL_64_fastpath;sys_futex;do_futex;futex_wait;futex_wait_queue_me;schedule;__schedule;-;try_to_wake_up;do_futex;sys_futex;entry_SYSCALL_64_fastpath;;firefox 2
    #

    Signed-off-by: Joe Stringer
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: Wang Nan
    Cc: netdev@vger.kernel.org
    Link: http://lkml.kernel.org/r/20161214224342.12858-2-joe@ovn.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Joe Stringer
     

15 Dec, 2016

1 commit

  • Pull security subsystem updates from James Morris:
    "Generally pretty quiet for this release. Highlights:

    Yama:
    - allow ptrace access for original parent after re-parenting

    TPM:
    - add documentation
    - many bugfixes & cleanups
    - define a generic open() method for ascii & bios measurements

    Integrity:
    - Harden against malformed xattrs

    SELinux:
    - bugfixes & cleanups

    Smack:
    - Remove unnecessary smack_known_invalid label
    - Do not apply star label in smack_setprocattr hook
    - parse mnt opts after privileges check (fixes unpriv DoS vuln)"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (56 commits)
    Yama: allow access for the current ptrace parent
    tpm: adjust return value of tpm_read_log
    tpm: vtpm_proxy: conditionally call tpm_chip_unregister
    tpm: Fix handling of missing event log
    tpm: Check the bios_dir entry for NULL before accessing it
    tpm: return -ENODEV if np is not set
    tpm: cleanup of printk error messages
    tpm: replace of_find_node_by_name() with dev of_node property
    tpm: redefine read_log() to handle ACPI/OF at runtime
    tpm: fix the missing .owner in tpm_bios_measurements_ops
    tpm: have event log use the tpm_chip
    tpm: drop tpm1_chip_register(/unregister)
    tpm: replace dynamically allocated bios_dir with a static array
    tpm: replace symbolic permission with octal for securityfs files
    char: tpm: fix kerneldoc tpm2_unseal_trusted name typo
    tpm_tis: Allow tpm_tis to be bound using DT
    tpm, tpm_vtpm_proxy: add kdoc comments for VTPM_PROXY_IOC_NEW_DEV
    tpm: Only call pm_runtime_get_sync if device has a parent
    tpm: define a generic open() method for ascii & bios measurements
    Documentation: tpm: add the Physical TPM device tree binding documentation
    ...

    Linus Torvalds
     

14 Dec, 2016

1 commit

  • Pull VFIO updates from Alex Williamson:

    - VFIO updates for v4.10 primarily include a new Mediated Device
    interface, which essentially allows software defined devices to be
    exposed to users through VFIO. The host vendor driver providing this
    virtual device polices, or mediates user access to the device.

    These devices often incorporate portions of real devices, for
    instance the primary initial users of this interface expose vGPUs
    which allow the user to map mediated devices, or mdevs, to a portion
    of a physical GPU. QEMU composes these mdevs into PCI representations
    using the existing VFIO user API. This enables both Intel KVM-GT
    support, which is also expected to arrive into Linux mainline during
    the v4.10 merge window, as well as NVIDIA vGPU, and also Channel I/O
    devices (aka CCW devices) for s390 virtualization support. (Kirti
    Wankhede, Neo Jia)

    - Drop unnecessary uses of pcibios_err_to_errno() (Cao Jin)

    - Fixes to VFIO capability chain handling (Eric Auger)

    - Error handling fixes for fallout from mdev (Christophe JAILLET)

    - Notifiers to expose struct kvm to mdev vendor drivers (Jike Song)

    - type1 IOMMU model search fixes (Kirti Wankhede, Neo Jia)

    * tag 'vfio-v4.10-rc1' of git://github.com/awilliam/linux-vfio: (30 commits)
    vfio iommu type1: Fix size argument to vfio_find_dma() in pin_pages/unpin_pages
    vfio iommu type1: Fix size argument to vfio_find_dma() during DMA UNMAP.
    vfio iommu type1: WARN_ON if notifier block is not unregistered
    kvm: set/clear kvm to/from vfio_group when group add/delete
    vfio: support notifier chain in vfio_group
    vfio: vfio_register_notifier: classify iommu notifier
    vfio: Fix handling of error returned by 'vfio_group_get_from_dev()'
    vfio: fix vfio_info_cap_add/shift
    vfio/pci: Drop unnecessary pcibios_err_to_errno()
    MAINTAINERS: Add entry VFIO based Mediated device drivers
    docs: Sample driver to demonstrate how to use Mediated device framework.
    docs: Sysfs ABI for mediated device framework
    docs: Add Documentation for Mediated devices
    vfio: Define device_api strings
    vfio_platform: Updated to use vfio_set_irqs_validate_and_prepare()
    vfio_pci: Updated to use vfio_set_irqs_validate_and_prepare()
    vfio: Introduce vfio_set_irqs_validate_and_prepare()
    vfio_pci: Update vfio_pci to use vfio_info_add_capability()
    vfio: Introduce common function to add capabilities
    vfio iommu: Add blocking notifier to notify DMA_UNMAP
    ...

    Linus Torvalds
     

11 Dec, 2016

1 commit


09 Dec, 2016

2 commits

  • Some tracepoints have a registration function that gets enabled when the
    tracepoint is enabled. There may be cases in which the registration function
    must fail (for example, when it can't allocate enough memory). In this case, the
    tracepoint should also fail to register, otherwise the user would not know
    why the tracepoint is not working.
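
    The behaviour described above can be sketched in plain C as follows.
    struct demo_tracepoint and its helpers are hypothetical; the kernel's
    actual tracepoint structures are not reproduced here.

```c
#include <errno.h>

/* Hypothetical tracepoint with an optional registration callback. */
struct demo_tracepoint {
	int (*regfunc)(void);	/* returns 0 or a negative errno */
	int enabled;
};

/* Enable the tracepoint, propagating a failing registration callback. */
int demo_tracepoint_enable(struct demo_tracepoint *tp)
{
	if (tp->regfunc) {
		int ret = tp->regfunc();

		if (ret < 0)
			return ret;	/* the tracepoint stays disabled */
	}
	tp->enabled = 1;
	return 0;
}

/* Two example callbacks: one succeeds, one simulates an allocation failure. */
int demo_reg_ok(void)    { return 0; }
int demo_reg_nomem(void) { return -ENOMEM; }
```

    The caller sees -ENOMEM instead of a tracepoint that silently never
    fires, which is the point of letting the callback return an error.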

    Cc: David Howells
    Cc: Seiji Aguchi
    Cc: Anton Blanchard
    Cc: Mathieu Desnoyers
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • The XDP prog checks if the incoming packet matches any VIP:PORT
    combination in the BPF hashmap. If it does, it will encapsulate
    the packet with an IPv4/v6 header as instructed by the value of
    the BPF hashmap and then XDP_TX it out.

    The VIP:PORT -> IP-Encap-Info can be specified by the cmd args
    of the user prog.
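
    A userspace sketch of the lookup step only, under several
    assumptions: a linear array stands in for the BPF hashmap, only IPv4
    is modelled, all demo_* names are invented, and passing unmatched
    packets on (DEMO_PASS) is an assumption since the message does not
    say what happens to them.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_vip    { uint32_t addr; uint16_t port; };	/* lookup key */
struct demo_tunnel { uint32_t saddr; uint32_t daddr; };	/* encap info */
struct demo_entry  { struct demo_vip key; struct demo_tunnel val; };

enum demo_action { DEMO_PASS, DEMO_TX };

/*
 * Look up the packet's destination address and port. On a match, hand
 * back the tunnel endpoints to encapsulate with and request transmit.
 */
enum demo_action demo_handle(const struct demo_entry *tbl, size_t n,
			     uint32_t dst, uint16_t dport,
			     struct demo_tunnel *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].key.addr == dst && tbl[i].key.port == dport) {
			*out = tbl[i].val;
			return DEMO_TX;
		}
	}
	return DEMO_PASS;
}
```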

    Acked-by: Alexei Starovoitov
    Signed-off-by: Martin KaFai Lau
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     

04 Dec, 2016

4 commits

  • This patch adds the sample program test_cgrp2_attach2. This program is
    similar to test_cgrp2_attach, but it performs automated testing of the
    cgroupv2 BPF attached filters. It runs the following checks:
    * Simple filter attachment
    * Application of filters to child cgroups
    * Overriding filters on child cgroups
    * Checking that this still works when the parent filter is removed

    The filters that are used here are simply allow all / deny all filters, so
    it isn't checking the actual functionality of the filters, but rather
    the behaviour around detachment / attachment. If net_cls is enabled,
    this test will fail.

    Signed-off-by: Sargun Dhillon
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Sargun Dhillon
     
  • This patch modifies test_current_task_under_cgroup_user. The test has
    several helpers around creating a temporary environment for cgroup
    testing, and moving the current task around cgroups. This set of
    helpers can then be used in other tests.

    Signed-off-by: Sargun Dhillon
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Sargun Dhillon
     
  • silence some of the clang compiler warnings like:
    include/linux/fs.h:2693:9: warning: comparison of unsigned enum expression < 0 is always false
    arch/x86/include/asm/processor.h:491:30: warning: taking address of packed member 'sp0' of class or structure 'x86_hw_tss' may result in an unaligned pointer value
    include/linux/cgroup-defs.h:326:16: warning: field 'cgrp' with variable sized type 'struct cgroup' not at the end of a struct or class is a GNU extension
    since they add too much noise to samples/bpf/ build.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     
  • Couple conflicts resolved here:

    1) In the MACB driver, a bug fix to properly initialize the
    RX tail pointer overlapped with some changes
    to support variable sized rings.

    2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix
    overlapping with a reorganization of the driver to support
    ACPI, OF, as well as PCI variants of the chip.

    3) In 'net' we had several probe error path bug fixes to the
    stmmac driver, meanwhile a lot of this code was cleaned up
    and reorganized in 'net-next'.

    4) The cls_flower classifier obtained a helper function in
    'net-next' called __fl_delete() and this overlapped with
    Daniel Borkmann's bug fix to use RCU for object destruction
    in 'net'. It also overlapped with Jiri's change to guard
    the rhashtable_remove_fast() call with a check against
    tc_skip_sw().

    5) In mlx4, a revert bug fix in 'net' overlapped with some
    unrelated changes in 'net-next'.

    6) In geneve, a stale header pointer after pskb_expand_head()
    bug fix in 'net' overlapped with a large reorganization of
    the same code in 'net-next'. Since the 'net-next' code no
    longer had the bug in question, there was nothing to do
    other than to simply take the 'net-next' hunks.

    Signed-off-by: David S. Miller

    David S. Miller
     

03 Dec, 2016

3 commits


02 Dec, 2016

1 commit

  • Adds a series of tests to verify the functionality of attaching
    BPF programs at LWT hooks.

    Also adds a sample which collects a histogram of packet sizes which
    pass through an LWT hook.

    $ ./lwt_len_hist.sh
    Starting netserver with host 'IN(6)ADDR_ANY' port '12865' and family AF_UNSPEC
    MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.253.2 () port 0 AF_INET : demo
    Recv Send Send
    Socket Socket Message Elapsed
    Size Size Size Time Throughput
    bytes bytes bytes secs. 10^6bits/sec

    87380 16384 16384 10.00 39857.69
    1 -> 1 : 0 | |
    2 -> 3 : 0 | |
    4 -> 7 : 0 | |
    8 -> 15 : 0 | |
    16 -> 31 : 0 | |
    32 -> 63 : 22 | |
    64 -> 127 : 98 | |
    128 -> 255 : 213 | |
    256 -> 511 : 1444251 |******** |
    512 -> 1023 : 660610 |*** |
    1024 -> 2047 : 535241 |** |
    2048 -> 4095 : 19 | |
    4096 -> 8191 : 180 | |
    8192 -> 16383 : 5578023 |************************************* |
    16384 -> 32767 : 632099 |*** |
    32768 -> 65535 : 6575 | |
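
    The buckets above are powers of two: row k covers lengths from 2^k
    to 2^(k+1) - 1. A minimal sketch of how a length could be mapped to
    its row, using a hypothetical helper name (the sample's own
    implementation is not shown in this message):

```c
#include <stdint.h>

/*
 * Return floor(log2(len)), i.e. the index of the histogram row whose
 * range [2^k, 2^(k+1) - 1] contains len. A length of 0 maps to row 0
 * in this sketch.
 */
unsigned int demo_log2_slot(uint64_t len)
{
	unsigned int slot = 0;

	while (len > 1) {
		len >>= 1;
		slot++;
	}
	return slot;
}
```

    For example, a 300-byte packet lands in row 8 (256 -> 511) and a
    16383-byte packet in row 13 (8192 -> 16383).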

    Signed-off-by: Thomas Graf
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Thomas Graf
     

01 Dec, 2016

1 commit

  • Fix the following build error:
    HOSTCC samples/bpf/test_lru_dist.o
    ../samples/bpf/test_lru_dist.c:25:22: fatal error: bpf_util.h: No such file or directory

    This is due to objtree != srctree.
    Use srctree, since that's where bpf_util.h is located.

    Fixes: e00c7b216f34 ("bpf: fix multiple issues in selftest suite and samples")
    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Tested-by: David Ahern
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     

30 Nov, 2016

1 commit

  • This patch modifies test_cgrp2_attach to use getopt so we can use standard
    command line parsing.

    It also adds an option to run the program in detach only mode. This does
    not attach a new filter at the cgroup, but only runs the detach command.

    Lastly, it changes the attach code to not detach and then attach. It relies
    on the 'hotswap' behaviour of CGroup BPF programs to be able to change
    in-place. If detach-then-attach behaviour needs to be tested, the example
    can be run in detach only mode prior to attachment.

    Signed-off-by: Sargun Dhillon
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Sargun Dhillon
     

29 Nov, 2016

1 commit

  • The files "sampleip_kern.c" and "trace_event_kern.c" directly access
    "ctx->regs.ip" which is not available on s390x. Fix this and use the
    PT_REGS_IP() macro instead.

    Also fix the macro for s390x and use "psw.addr" from "pt_regs".
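
    A sketch of the accessor-macro idea with mock register structures.
    DEMO_S390X, struct demo_pt_regs and DEMO_PT_REGS_IP are invented for
    this illustration; only the field names "ip" and "psw.addr" come
    from the message above.

```c
#include <stdint.h>

#if defined(DEMO_S390X)
/* Mock of a register layout that keeps the instruction address in psw.addr. */
struct demo_pt_regs {
	struct { uint64_t mask; uint64_t addr; } psw;
};
#define DEMO_PT_REGS_IP(x) ((x)->psw.addr)
#else
/* Mock of a register layout with a plain ip field. */
struct demo_pt_regs {
	uint64_t ip;
};
#define DEMO_PT_REGS_IP(x) ((x)->ip)
#endif

/* Portable code goes through the macro and never names the field. */
uint64_t demo_sample_ip(const struct demo_pt_regs *regs)
{
	return DEMO_PT_REGS_IP(regs);
}
```

    Code that reads the instruction pointer through the macro compiles
    unchanged for either layout; only the macro definition differs.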

    Reported-by: Zvonko Kosic
    Signed-off-by: Michael Holzheu
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Michael Holzheu
     

28 Nov, 2016

1 commit

  • 1) test_lru_map and test_lru_dist fail to build on my machine since
    the sys/resource.h header is not included.

    2) test_verifier fails in one test case where we try to call an invalid
    function, since the verifier log output changed wrt printing function
    names.

    3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for
    retrieving the number of possible CPUs. This is broken at least in our
    scenario and really just doesn't work.

    glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF.
    First it tries the equivalent of /sys/devices/system/cpu/cpu[0-9]* | wc -l;
    if that fails, depending on the config, it either tries to count CPUs
    in /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead.
    If /proc/cpuinfo has some issue, it returns just 1 in the worst case. This
    oddity is nothing new [1], but the semantics/behaviour seem to be settled.
    _SC_NPROCESSORS_ONLN will parse /sys/devices/system/cpu/online; if
    that fails, it looks into /proc/stat for cpuX entries, and if that also
    fails for some reason, /proc/cpuinfo is consulted (returning 1 in the
    unlikely case that everything breaks down).

    While that might match num_possible_cpus() from the kernel in some
    cases, it's really not guaranteed with CPU hotplugging, and can result
    in a buffer overflow since the array in user space could have too few
    slots, and on percpu map lookup, the kernel will write beyond
    the end of the value buffer.

    William Tu reported such mismatches:

    [...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu()
    happens when CPU hotadd is enabled. For example, in Fusion when
    setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64
    -smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf()
    will be 2 [2]. [...]

    Documentation/cputopology.txt says /sys/devices/system/cpu/possible
    outputs cpu_possible_mask. That is the same mask num_possible_cpus()
    uses, so the first step is to replace the _SC_NPROCESSORS_CONF calls
    with our own implementation. Later, we could add support to bpf(2) for
    passing a mask via CPU_SET(3), for example, to just select a subset of CPUs.

    BPF samples code needs this fix as well (at least so that people stop
    copying this). Thus, define bpf_num_possible_cpus() once in selftests
    and import it from there for the sample code to avoid duplicating it.
    The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated.

    After all three issues are fixed, the test suite runs fine again:

    # make run_tests | grep self
    selftests: test_verifier [PASS]
    selftests: test_maps [PASS]
    selftests: test_lru_map [PASS]
    selftests: test_kmod.sh [PASS]

    [1] https://www.sourceware.org/ml/libc-alpha/2011-06/msg00079.html
    [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg121183.html

    Fixes: 3059303f59cf ("samples/bpf: update tracex[23] examples to use per-cpu maps")
    Fixes: 86af8b4191d2 ("Add sample for adding simple drop program to link")
    Fixes: df570f577231 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY")
    Fixes: e15596717948 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_HASH")
    Fixes: ebb676daa1a3 ("bpf: Print function name in addition to function id")
    Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab")
    Signed-off-by: Daniel Borkmann
    Cc: William Tu
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
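
    As an illustration of counting possible CPUs from sysfs instead of
    sysconf(), here is a hedged sketch. It is not the helper added by the
    patch: count_cpu_list() and num_possible_cpus_sketch() are hypothetical
    names, and the parser assumes the usual sysfs CPU list syntax such as
    "0-3" or "0,2-4".

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Count the CPUs named in a list such as "0-3" or "0,2-4".
 * Returns the count, or -1 if the string cannot be parsed.
 */
int count_cpu_list(const char *s)
{
	int total = 0;

	while (*s && *s != '\n') {
		char *end;
		long lo, hi;

		lo = strtol(s, &end, 10);
		if (end == s || lo < 0)
			return -1;
		hi = lo;
		s = end;

		if (*s == '-') { /* a range "lo-hi" */
			s++;
			hi = strtol(s, &end, 10);
			if (end == s || hi < lo)
				return -1;
			s = end;
		}
		total += (int)(hi - lo + 1);

		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}
	return total > 0 ? total : -1;
}

/* Read the possible-CPU mask from sysfs; returns -1 on any failure. */
int num_possible_cpus_sketch(void)
{
	char buf[256];
	FILE *f = fopen("/sys/devices/system/cpu/possible", "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return count_cpu_list(buf);
}
```

    A per-CPU value buffer sized with this count, rather than with
    sysconf(_SC_NPROCESSORS_CONF), avoids the undersized array described
    above when hot-add makes the two numbers differ.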
     

26 Nov, 2016

1 commit

  • Add a simple userspace program to demonstrate the new API to attach eBPF
    programs to cgroups. This is what it does:

    * Create arraymap in kernel with 4 byte keys and 8 byte values

    * Load eBPF program

    The eBPF program accesses the map passed in to store two pieces of
    information. The number of invocations of the program, which maps
    to the number of packets received, is stored to key 0. Key 1 is
    incremented on each iteration by the number of bytes stored in
    the skb.

    * Detach any eBPF program previously attached to the cgroup

    * Attach the new program to the cgroup using BPF_PROG_ATTACH

    * Once a second, read map[0] and map[1] to see how many packets and
    bytes were seen on any socket of tasks in the given cgroup.

    The program takes a cgroup path as 1st argument, and either "ingress"
    or "egress" as 2nd. Optionally, "drop" can be passed as 3rd argument,
    which will make the generated eBPF program return 0 instead of 1, so
    the kernel will drop the packet.

    libbpf gained two new wrappers for the new syscall commands.

    Signed-off-by: Daniel Mack
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Mack
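
    The accounting done by the eBPF program can be modelled in plain C as
    below. This is only a userspace model of the logic described in the
    message (key 0 counts packets, key 1 counts bytes, return 0 to drop and
    1 to pass). It is not eBPF, it ignores concurrency, and
    account_packet() is a hypothetical name.

```c
#include <stdint.h>

enum {
	KEY_PACKETS = 0, /* number of program invocations */
	KEY_BYTES   = 1, /* sum of skb lengths */
};

/*
 * Model of one program invocation against a two-slot array map with
 * 8 byte values. Returns the verdict: 1 lets the packet pass, 0 drops it.
 */
int account_packet(uint64_t map[2], uint32_t skb_len, int drop)
{
	map[KEY_PACKETS] += 1;
	map[KEY_BYTES] += skb_len;
	return drop ? 0 : 1;
}
```

    The userspace side then only needs to read both slots once a second
    and print them.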
     

25 Nov, 2016

2 commits

  • llvm can emit relocations into sections other than program code
    (like debug info sections). Ignore them while parsing the ELF file.

    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     
  • Since llvm commit "Do not expand UNDEF SDNode during insn selection lowering"
    llvm will generate code that uses uninitialized registers for cases
    where the C code actually uses uninitialized data.
    So this sockex2 example is technically broken.
    Fix it by fully initializing the on-stack variable.
    Also increase the verifier buffer limit, since verifier output
    may not fit in 64k for this sockex2 code depending on llvm version.

    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Alexei Starovoitov
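
    The kind of fix described above, sketched in plain C. struct
    flow_sketch and parse_proto_only() are hypothetical; the point is only
    that every byte of the on-stack variable is written before any of it
    is read or copied out.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flow record; the real sample uses its own struct. */
struct flow_sketch {
	uint32_t src;
	uint32_t dst;
	uint16_t port16[2];
	uint8_t ip_proto;
};

/* Fills in only the protocol, but hands back a fully defined struct. */
uint8_t parse_proto_only(uint8_t proto, struct flow_sketch *out)
{
	struct flow_sketch flow;

	/* Zero the whole object, padding included, before using it. */
	memset(&flow, 0, sizeof(flow));
	flow.ip_proto = proto;

	*out = flow;
	return flow.ip_proto;
}
```

    Without the memset(), the fields that are never assigned would be
    copied out with indeterminate values.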
     

18 Nov, 2016

1 commit


16 Nov, 2016

1 commit

  • This patch has some unit tests and a test_lru_dist.

    The test_lru_dist reads in the numeric keys from a file.
    The files used here are generated by a modified fio-genzipf tool
    originating from the fio test suite. The sample data file can be
    found here: https://github.com/iamkafai/bpf-lru

    The zipf.* data files have 100k numeric keys and the keys also
    range from 1 to 100k.

    The test_lru_dist outputs the number of unique keys (nr_unique).
    E.g. in the second run below, 61239 out of the 100k keys are unique.
    nr_misses counts lookups that cannot be found in the LRU map, so
    nr_misses must be >= nr_unique. test_lru_dist also simulates a perfect
    LRU map as a comparison:

    [root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
    /root/zipf.100k.a1_01.out 4000 1
    ...
    test_parallel_lru_dist (map_type:9 map_flags:0x0):
    task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31603(/100000)
    task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)
    ....
    test_parallel_lru_dist (map_type:9 map_flags:0x2):
    task:0 BPF LRU: nr_unique:23093(/100000) nr_misses:31710(/100000)
    task:0 Perfect LRU: nr_unique:23093(/100000 nr_misses:34328(/100000)

    [root@arch-fb-vm1 ~]# ~/devshare/fb-kernel/linux/samples/bpf/test_lru_dist \
    /root/zipf.100k.a0_01.out 40000 1
    ...
    test_parallel_lru_dist (map_type:9 map_flags:0x0):
    task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67054(/100000)
    task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)
    ...
    test_parallel_lru_dist (map_type:9 map_flags:0x2):
    task:0 BPF LRU: nr_unique:61239(/100000) nr_misses:67068(/100000)
    task:0 Perfect LRU: nr_unique:61239(/100000 nr_misses:66993(/100000)

    LRU map has also been added to map_perf_test:
    /* Global LRU */
    [root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
    ./map_perf_test 16 $i | awk '{r += $3}END{print r " updates"}'; done
    1 cpus: 2934082 updates
    4 cpus: 7391434 updates
    8 cpus: 6500576 updates

    /* Percpu LRU */
    [root@kerneltest003.31.prn1 ~]# for i in 1 4 8; do echo -n "$i cpus: "; \
    ./map_perf_test 32 $i | awk '{r += $3}END{print r " updates"}'; done
    1 cpus: 2896553 updates
    4 cpus: 9766395 updates
    8 cpus: 17460553 updates

    Signed-off-by: Martin KaFai Lau
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Martin KaFai Lau
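
    A small sketch of what a "perfect LRU" comparison can compute. It is
    not the code from test_lru_dist: simulate_perfect_lru() and struct
    lru_stats are hypothetical, and the eviction scan is deliberately
    naive. It also shows why nr_misses >= nr_unique: the first access to
    any key is always a miss.

```c
#include <stdlib.h>

struct lru_stats {
	unsigned int nr_unique; /* distinct keys seen */
	unsigned int nr_misses; /* accesses that were not in the cache */
};

/*
 * Replay keys (each in 1..max_key) against an exact LRU cache holding
 * at most `capacity` entries. Returns 0 on success, -1 on bad input.
 */
int simulate_perfect_lru(const unsigned int *keys, size_t nr_keys,
			 unsigned int max_key, unsigned int capacity,
			 struct lru_stats *out)
{
	unsigned long *last_use; /* 0 means "never seen" */
	unsigned char *cached;
	unsigned int in_cache = 0;
	size_t i;

	if (!capacity)
		return -1;
	last_use = calloc((size_t)max_key + 1, sizeof(*last_use));
	cached = calloc((size_t)max_key + 1, 1);
	if (!last_use || !cached)
		goto fail;

	out->nr_unique = 0;
	out->nr_misses = 0;

	for (i = 0; i < nr_keys; i++) {
		unsigned int k = keys[i];

		if (k == 0 || k > max_key)
			goto fail;
		if (!last_use[k])
			out->nr_unique++;
		if (!cached[k]) {
			out->nr_misses++;
			if (in_cache == capacity) {
				unsigned int victim = 0, j;

				/* Evict the entry with the oldest stamp. */
				for (j = 1; j <= max_key; j++)
					if (cached[j] && (!victim ||
					    last_use[j] < last_use[victim]))
						victim = j;
				cached[victim] = 0;
				in_cache--;
			}
			cached[k] = 1;
			in_cache++;
		}
		last_use[k] = i + 1; /* stamps start at 1 */
	}
	free(last_use);
	free(cached);
	return 0;
fail:
	free(last_use);
	free(cached);
	return -1;
}
```

    Replaying the same key file through this model and through the BPF
    LRU map gives two miss counts that can be compared directly.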
     

15 Nov, 2016

1 commit


14 Nov, 2016

1 commit


13 Nov, 2016

1 commit

  • The test creates two netns, ns1 and ns2. The host (the default netns)
    has an ipip or ip6tnl dev configured for tunneling traffic to the ns2.

    ping VIPs from ns1 <-> host <-tunnel-> ns2 (VIPs at loopback)

    The test has ns1 ping the VIPs configured on the loopback
    interface in ns2.

    The VIPs are 10.10.1.102 and 2401:face::66 (which are configured
    at lo@ns2). [Note: 0x66 => 102].

    At ns1, the VIPs are routed _via_ the host.

    At the host, bpf programs are installed at the veth to redirect packets
    from a veth to the ipip/ip6tnl. The test is configured in a way so
    that both ingress and egress can be tested.

    At ns2, the ipip/ip6tnl dev is configured with the local and remote address
    specified. The return path is routed to the dev ipip/ip6tnl.

    During egress test, the host also locally tests pinging the VIPs to ensure
    that bpf_redirect at egress also works for the direct egress (i.e. not
    forwarding from dev ve1 to ve2).

    Acked-by: Alexei Starovoitov
    Signed-off-by: Martin KaFai Lau
    Signed-off-by: David S. Miller

    Martin KaFai Lau