18 Jun, 2017

2 commits


17 Jun, 2017

1 commit

  • The PC returned by dwfl_frame_pc() may map into a not-yet-reported
    module. We have to report it before we continue unwinding. But when we
    query for the isactivation flag in dwfl_frame_pc, libdw will actually do
    one more unwinding step internally which can then break and lead to
    missed frames or broken stacks.

    With libunwind we get e.g.:

    ~~~~~
    heaptrack_gui 2228 135073.400474: 613969 cycles:
    108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
    1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
    109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
    10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
    1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
    92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
    2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
    f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
    1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
    78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
    20439 __libc_start_main (/usr/lib/libc-2.25.so)
    78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)

    heaptrack_gui 2228 135073.401156: 569521 cycles:
    131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
    1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
    21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
    2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
    279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
    e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
    f5a1c QGuiApplicationPrivate::createPlatformIntegration (/usr/lib/libQt5Gui.so.5.8.0)
    f650c QGuiApplicationPrivate::createEventDispatcher (/usr/lib/libQt5Gui.so.5.8.0)
    298524 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
    f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
    1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0)
    78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui)
    20439 __libc_start_main (/usr/lib/libc-2.25.so)
    78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)
    ~~~~~

    Note the two frames 1589e8 and 78622 in the first sample. These are
    missing when unwinding with libdw. The second sample's breakage is
    more obvious:

    ~~~~~
    heaptrack_gui 2228 135073.400474: 613969 cycles:
    108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
    1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0)
    109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0)
    10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0)
    1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0)
    92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0)
    2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0)
    f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0)
    20439 __libc_start_main (/usr/lib/libc-2.25.so)
    78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui)

    heaptrack_gui 2228 135073.401156: 569521 cycles:
    131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0)
    1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0)
    21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0)
    1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0)
    2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0)
    279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0)
    e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0)
    723dbf [unknown] ([unknown])
    ~~~~~

    This patch fixes this issue, so that the libdw unwinder mimics the
    libunwind behavior more closely.

    Signed-off-by: Milian Wolff
    Acked-by: Jan Kratochvil
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/20170602143753.16907-2-milian.wolff@kdab.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Milian Wolff
     

16 Jun, 2017

1 commit

  • CONFIG_FORTIFY_SOURCE=y implements fortify_panic() as a __noreturn function,
    so objtool needs to know about it too.
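
    As an illustration of the idea only, a static checker can treat calls to
    known __noreturn functions as dead ends by looking the callee up in a
    table of names. The sketch below is a simplified, hypothetical version;
    the table contents and the helper demo_is_noreturn() are assumptions made
    for this example and are not objtool's actual code.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical table of functions known never to return to their caller. */
    static const char *const demo_noreturns[] = {
        "panic",
        "do_exit",
        "__stack_chk_fail",
        "fortify_panic",    /* the kind of entry this commit is about */
    };

    /* Return true if 'name' is listed as a noreturn function. */
    static bool demo_is_noreturn(const char *name)
    {
        size_t i;

        assert(name != NULL);

        for (i = 0; i < sizeof(demo_noreturns) / sizeof(demo_noreturns[0]); i++) {
            if (strcmp(name, demo_noreturns[i]) == 0)
                return true;
        }
        return false;
    }
    ```

    A checker built this way stops following the control flow after a call
    for which demo_is_noreturn() is true, instead of warning about code that
    appears to fall through.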

    Suggested-by: Daniel Micay
    Tested-by: Stephen Rothwell
    Signed-off-by: Kees Cook
    Signed-off-by: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1497532835-32704-1-git-send-email-jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Kees Cook
     

15 Jun, 2017

3 commits

  • Pull networking fixes from David Miller:

    1) The netlink attribute passed in to dev_set_alias() is not
    necessarily NULL terminated, don't use strlcpy() on it. From
    Alexander Potapenko.

    2) Fix implementation of atomics in arm64 bpf JIT, from Daniel
    Borkmann.

    3) Correct the release of netdevs and driver private data in certain
    circumstances.

    4) Sanitize netlink message length properly in decnet, from Mateusz
    Jurczyk.

    5) Don't leak kernel data in rtnl_fill_vfinfo() netlink blobs. From
    Yuval Mintz.

    6) Hash secret is never initialized in ipv6 ILA translation code, from
    Arnd Bergmann. I guess those clang warnings about unused inline
    functions are useful for something!

    7) Fix endian selection in bpf_endian.h, from Daniel Borkmann.

    8) Sanitize sockaddr length before dereferencing any fields in AF_UNIX
    and CAIF. From Mateusz Jurczyk.

    9) Fix timestamping for GMAC3 chips in stmmac driver, from Mario
    Molitor.

    10) Do not leak netdev on dev_alloc_name() errors in mac80211, from
    Johannes Berg.

    11) Fix locking in sctp_for_each_endpoint(), from Xin Long.

    12) Fix wrong memset size on 32-bit in snmp6, from Christian Perle.

    13) Fix use after free in ip_mc_clear_src(), from WANG Cong.

    14) Fix regressions caused by ICMP rate limiting changes in 4.11, from
    Jesper Dangaard Brouer.
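
    To illustrate item 1: strlcpy() scans its source for a terminating NUL,
    so it can read past the end of a buffer that carries an explicit length
    and no terminator. A minimal sketch of the safer pattern follows. The
    names demo_set_alias() and DEMO_ALIAS_MAX are hypothetical and this is
    not the kernel's dev_set_alias() code.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    #define DEMO_ALIAS_MAX 256  /* hypothetical buffer size for the example */

    /*
     * Copy 'len' bytes from 'src', which may lack a terminating NUL, into
     * 'dst' and terminate the result. Returns the copied length, or -1 if
     * the data plus terminator does not fit.
     */
    static int demo_set_alias(char dst[DEMO_ALIAS_MAX], const char *src, size_t len)
    {
        assert(dst != NULL && src != NULL);

        if (len >= DEMO_ALIAS_MAX)
            return -1;          /* no room for the terminator */

        memcpy(dst, src, len);  /* reads exactly 'len' bytes, never more */
        dst[len] = '\0';
        return (int)len;
    }
    ```

    The caller passes the attribute length alongside the data, so the copy
    never depends on a terminator being present in the source.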

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (91 commits)
    i40e: Fix a sleep-in-atomic bug
    net: don't global ICMP rate limit packets originating from loopback
    net/act_pedit: fix an error code
    net: update undefined ->ndo_change_mtu() comment
    net_sched: move tcf_lock down after gen_replace_estimator()
    caif: Add sockaddr length check before accessing sa_family in connect handler
    qed: fix dump of context data
    qmi_wwan: new Telewell and Sierra device IDs
    net: phy: Fix MDIO_THUNDER dependencies
    netconsole: Remove duplicate "netconsole: " logging prefix
    igmp: acquire pmc lock for ip_mc_clear_src()
    r8152: give the device version
    net: rps: fix uninitialized symbol warning
    mac80211: don't send SMPS action frame in AP mode when not needed
    mac80211/wpa: use constant time memory comparison for MACs
    mac80211: set bss_info data before configuring the channel
    mac80211: remove 5/10 MHz rate code from station MLME
    mac80211: Fix incorrect condition when checking rx timestamp
    mac80211: don't look at the PM bit of BAR frames
    i40e: fix handling of HW ATR eviction
    ...
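
    The sockaddr length checks mentioned above (item 8 and the caif entry)
    follow a common pattern: validate the caller-supplied length before
    reading any field. A rough sketch under assumed names; struct
    demo_sockaddr, DEMO_AF_EXAMPLE and demo_check_addr() are hypothetical
    and do not come from the AF_UNIX or CAIF code.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical address layout, standing in for a sockaddr variant. */
    struct demo_sockaddr {
        unsigned short family;
        unsigned char  data[14];
    };

    #define DEMO_AF_EXAMPLE 37  /* hypothetical family value */

    /*
     * Validate a caller-supplied address buffer. The length is checked
     * before any field is read, so a short buffer is rejected instead of
     * being read out of bounds. Returns 0 on success, -1 on error.
     */
    static int demo_check_addr(const void *uaddr, size_t addr_len)
    {
        struct demo_sockaddr addr;

        assert(uaddr != NULL);

        if (addr_len < sizeof(addr))
            return -1;                      /* too short: do not touch it */

        memcpy(&addr, uaddr, sizeof(addr)); /* safe: length verified above */
        if (addr.family != DEMO_AF_EXAMPLE)
            return -1;

        return 0;
    }
    ```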

    Linus Torvalds
     
  • With commit: 0a943cb10ce78 (tools build: Add HOSTARCH Makefile variable)
    when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of
    ARCH=x86, so the perf build process searches for header files in
    tools/arch/x86_64/include, which doesn't exist.

    The following build failure is seen:

    In file included from util/event.c:2:0:
    tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory
    compilation terminated.

    Fix this issue by using SRCARCH instead of ARCH in perf, just like the
    main kernel Makefile and tools/objtool's.

    Signed-off-by: Jiada Wang
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Eugeniu Rosca
    Cc: Jan Stancek
    Cc: Masami Hiramatsu
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Cc: Rui Teng
    Cc: Sukadev Bhattiprolu
    Cc: Wang Nan
    Fixes: 0a943cb10ce7 ("tools build: Add HOSTARCH Makefile variable")
    Link: http://lkml.kernel.org/r/1491793357-14977-2-git-send-email-jiada_wang@mentor.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiada Wang
     
    Commit 18e7a45af91a ("perf/x86: Reject non sampling events with
    precise_ip") makes sys_perf_event_open() return -EINVAL for an attribute
    with (attr.precise_ip > 0 && attr.sample_period == 0), which is exactly
    what the routine used to probe the max precise level passes when no
    events were given to 'perf record' or 'perf top', i.e.:

    perf_evsel__new_cycles()
    perf_event_attr__set_max_precise_ip()

    The x86 code in x86_pmu_hw_config(), which is called all the way from
    sys_perf_event_open(), does the following, starting with the
    aforementioned commit:

    /* There's no sense in having PEBS for non sampling events: */
    if (!is_sampling_event(event))
            return -EINVAL;

    Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using
    just the non precise cycles variant.

    To make sure that this is the case, I tested it, before this patch,
    with:

    # perf probe -L x86_pmu_hw_config

    0 int x86_pmu_hw_config(struct perf_event *event)
    1 {
    2 if (event->attr.precise_ip) {

    17 if (event->attr.precise_ip > precise)
    18 return -EOPNOTSUPP;

    /* There's no sense in having PEBS for non sampling events: */
    21 if (!is_sampling_event(event))
    22 return -EINVAL;
    }

    # perf probe x86_pmu_hw_config:22
    Added new events:
    probe:x86_pmu_hw_config (on x86_pmu_hw_config:22)
    probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22)

    You can now use it in all perf tools, such as:

    perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1

    # perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1
    0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
    0.015 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
    x86_pmu_hw_config ([kernel.kallsyms])
    hsw_hw_config ([kernel.kallsyms])
    x86_pmu_event_init ([kernel.kallsyms])
    perf_try_init_event ([kernel.kallsyms])
    perf_event_alloc ([kernel.kallsyms])
    SYSC_perf_event_open ([kernel.kallsyms])
    sys_perf_event_open ([kernel.kallsyms])
    do_syscall_64 ([kernel.kallsyms])
    return_from_SYSCALL_64 ([kernel.kallsyms])
    syscall (/usr/lib64/libc-2.24.so)
    perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
    perf_evsel__new_cycles (/home/acme/bin/perf)
    perf_evlist__add_default (/home/acme/bin/perf)
    cmd_record (/home/acme/bin/perf)
    run_builtin (/home/acme/bin/perf)
    handle_internal_command (/home/acme/bin/perf)
    0.000 ( 0.021 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
    0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
    0.025 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
    x86_pmu_hw_config ([kernel.kallsyms])
    hsw_hw_config ([kernel.kallsyms])
    x86_pmu_event_init ([kernel.kallsyms])
    perf_try_init_event ([kernel.kallsyms])
    perf_event_alloc ([kernel.kallsyms])
    SYSC_perf_event_open ([kernel.kallsyms])
    sys_perf_event_open ([kernel.kallsyms])
    do_syscall_64 ([kernel.kallsyms])
    return_from_SYSCALL_64 ([kernel.kallsyms])
    syscall (/usr/lib64/libc-2.24.so)
    perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
    perf_evsel__new_cycles (/home/acme/bin/perf)
    perf_evlist__add_default (/home/acme/bin/perf)
    cmd_record (/home/acme/bin/perf)
    run_builtin (/home/acme/bin/perf)
    handle_internal_command (/home/acme/bin/perf)
    0.023 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
    0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ...
    0.030 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1))
    x86_pmu_hw_config ([kernel.kallsyms])
    hsw_hw_config ([kernel.kallsyms])
    x86_pmu_event_init ([kernel.kallsyms])
    perf_try_init_event ([kernel.kallsyms])
    perf_event_alloc ([kernel.kallsyms])
    SYSC_perf_event_open ([kernel.kallsyms])
    sys_perf_event_open ([kernel.kallsyms])
    do_syscall_64 ([kernel.kallsyms])
    return_from_SYSCALL_64 ([kernel.kallsyms])
    syscall (/usr/lib64/libc-2.24.so)
    perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
    perf_evsel__new_cycles (/home/acme/bin/perf)
    perf_evlist__add_default (/home/acme/bin/perf)
    cmd_record (/home/acme/bin/perf)
    run_builtin (/home/acme/bin/perf)
    handle_internal_command (/home/acme/bin/perf)
    0.028 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument
    41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
    41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
    41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
    41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
    41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
    41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
    41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ]
    #

    I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times.

    So fix it by just setting attr.sample_period.

    Now, after this patch:

    # perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1
    [ perf record: Woken up 1 times to write data ]
    0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_event_open_cloexec_flag (/home/acme/bin/perf)
    0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_evlist__config (/home/acme/bin/perf)
    0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_evlist__config (/home/acme/bin/perf)
    0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1 ) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_event_attr__set_max_precise_ip (/home/acme/bin/perf)
    0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4
    syscall (/usr/lib64/libc-2.24.so)
    perf_evsel__open (/home/acme/bin/perf)
    0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5
    syscall (/usr/lib64/libc-2.24.so)
    perf_evsel__open (/home/acme/bin/perf)
    0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6
    syscall (/usr/lib64/libc-2.24.so)
    perf_evsel__open (/home/acme/bin/perf)
    0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8
    syscall (/usr/lib64/libc-2.24.so)
    perf_evsel__open (/home/acme/bin/perf)
    [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
    #

    The probe one called from perf_event_attr__set_max_precise_ip() works
    the first time, with attr.precise_ip = 3, with the next ones being the
    per cpu ones for the cycles:ppp event.

    And here is the text from a report and alternative proposed patch by
    Thomas-Mich Richter:

    ---

    On s390 the counter and sampling facility do not support a precise IP
    skid level and sometimes return EOPNOTSUPP when structure member
    precise_ip in struct perf_event_attr is not set to zero.

    On s390 the command 'perf record -- true' fails with error EOPNOTSUPP.
    This happens only when no events are specified on the command line.

    The functions called are
    ...
    --> perf_evlist__add_default
    --> perf_evsel__new_cycles
    --> perf_event_attr__set_max_precise_ip

    The last function determines the value of structure member precise_ip by
    invoking the perf_event_open() system call and checking the return code.
    The first successful open is the value for precise_ip.

    However the value is determined without setting member sample_period and
    indicates no sampling.

    On s390 the counter facility and sampling facility are different. The
    above procedure determines a precise_ip value of 3 using the counter
    facility. Later it uses the sampling facility with a value of 3 and
    fails with EOPNOTSUPP.

    ---

    v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members
    of unnamed union members in the container struct initialization, so
    move from:

    struct perf_event_attr attr = {
            ...
            .sample_period = 1,
    };

    to right after it as:

    struct perf_event_attr attr = {
            ...
    };

    attr.sample_period = 1;

    v3: We need to reset .sample_period to 0 to let the users of
    perf_evsel__new_cycles() properly set up attr.sample_period or
    attr.sample_freq. Reported by Ingo Molnar.
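
    Putting v2 and v3 together, the probing loop can be sketched roughly as
    below. This is a simplified model, not the perf source: struct demo_attr
    stands in for struct perf_event_attr, and demo_fake_open() is a
    hypothetical replacement for sys_perf_event_open() that rejects non
    sampling precise events and, for the sake of the example, any precise_ip
    above 2.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-in for struct perf_event_attr (fields used here only). */
    struct demo_attr {
        uint32_t precise_ip;
        union {                     /* anonymous union, as in the real struct */
            uint64_t sample_period;
            uint64_t sample_freq;
        };
    };

    /* Callback standing in for sys_perf_event_open(): returns >= 0 on success. */
    typedef int (*demo_open_fn)(const struct demo_attr *attr);

    /* Fake "kernel": rejects non sampling precise events and levels above 2. */
    static int demo_fake_open(const struct demo_attr *attr)
    {
        if (attr->precise_ip > 0 && attr->sample_period == 0)
            return -1;
        return attr->precise_ip <= 2 ? 3 : -1;
    }

    /* Probe the highest precise_ip level that 'try_open' accepts. */
    static void demo_set_max_precise_ip(struct demo_attr *attr, demo_open_fn try_open)
    {
        assert(attr != NULL && try_open != NULL);

        attr->precise_ip = 3;
        attr->sample_period = 1;    /* v2: plain assignment; probe as a sampling event */

        while (attr->precise_ip != 0) {
            if (try_open(attr) >= 0)
                break;              /* first level that opens wins */
            attr->precise_ip--;
        }

        attr->sample_period = 0;    /* v3: callers pick sample_period or sample_freq */
    }
    ```

    With the fake open routine above, the probe settles on precise_ip = 2 and
    hands the attribute back with sample_period reset to 0.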

    Reported-and-Acked-by: Thomas-Mich Richter
    Acked-by: Hendrik Brueckner
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Fixes: 18e7a45af91a ("perf/x86: Reject non sampling events with precise_ip")
    Link: http://lkml.kernel.org/n/tip-yv6nnkl7tzqocrm0hl3x7vf1@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

09 Jun, 2017

10 commits

  • I noticed that test_l4lb was failing in selftests:

    # ./test_progs
    test_pkt_access:PASS:ipv4 77 nsec
    test_pkt_access:PASS:ipv6 44 nsec
    test_xdp:PASS:ipv4 2933 nsec
    test_xdp:PASS:ipv6 1500 nsec
    test_l4lb:PASS:ipv4 377 nsec
    test_l4lb:PASS:ipv6 544 nsec
    test_l4lb:FAIL:stats 6297600000 200000
    test_tcp_estats:PASS: 0 nsec
    Summary: 7 PASSED, 1 FAILED

    Tracking down the issue actually revealed that endianness selection
    in bpf_endian.h is broken when compiled with clang with bpf target.
    test_pkt_access.c, test_l4lb.c is compiled with __BYTE_ORDER as
    __BIG_ENDIAN, test_xdp.c as __LITTLE_ENDIAN! test_l4lb noticeably
    fails, because the test accounts bytes via bpf_ntohs(ip6h->payload_len)
    and bpf_ntohs(iph->tot_len), and compares them against a defined
    value and given a wrong endianness, the test outcome is different,
    of course.

    Turns out that there are actually two bugs: i) when we do __BYTE_ORDER
    comparison with __LITTLE_ENDIAN/__BIG_ENDIAN, then depending on the
    include order we see different outcomes. Reason is that __BYTE_ORDER
    is undefined due to missing endian.h include. Before we include the
    asm/byteorder.h (e.g. through linux/in.h), then __BYTE_ORDER equals
    __LITTLE_ENDIAN since both are undefined, after the include which
    correctly pulls in linux/byteorder/little_endian.h, __LITTLE_ENDIAN
    is defined, but given __BYTE_ORDER is still undefined, we match on
    __BYTE_ORDER equals to __BIG_ENDIAN since __BIG_ENDIAN is also
    undefined at that point, sigh. ii) But even that would be wrong,
    since when compiling the test cases with clang, one can select between
    bpfeb and bpfel targets for cross compilation. Hence, we can also not
    rely on what the system's endian.h provides, but we need to look at
    the compiler's defined endianness. The compiler defines __BYTE_ORDER__,
    and we can match __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__,
    which also reflects targets bpf (native), bpfel, bpfeb correctly,
    thus really only rely on that. After patch:

    # ./test_progs
    test_pkt_access:PASS:ipv4 74 nsec
    test_pkt_access:PASS:ipv6 42 nsec
    test_xdp:PASS:ipv4 2340 nsec
    test_xdp:PASS:ipv6 1461 nsec
    test_l4lb:PASS:ipv4 400 nsec
    test_l4lb:PASS:ipv6 530 nsec
    test_tcp_estats:PASS: 0 nsec
    Summary: 7 PASSED, 0 FAILED
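
    For illustration, selecting the byte order from the compiler's own
    macros can look roughly like the sketch below. The demo_* names are
    hypothetical and this is not the bpf_endian.h source; it assumes a GCC
    or clang compatible compiler that provides __BYTE_ORDER__ and
    __builtin_bswap16(). The explicit #ifndef guard matters because an
    undefined macro evaluates to 0 inside #if, which is how two undefined
    names ended up comparing equal in the bug described above.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <string.h>

    /* Refuse to guess: undefined macros would silently compare as 0 == 0. */
    #ifndef __BYTE_ORDER__
    # error "Compiler does not define __BYTE_ORDER__"
    #endif

    /* Pick the conversion from the compiler's target, not the host's endian.h. */
    #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    # define demo_ntohs(x) ((uint16_t)__builtin_bswap16(x))
    #elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    # define demo_ntohs(x) ((uint16_t)(x))
    #else
    # error "Unknown __BYTE_ORDER__"
    #endif
    #define demo_htons(x) demo_ntohs(x)

    /* Read a 16-bit network-order field (such as tot_len) from a packet buffer. */
    static uint16_t demo_read_be16(const unsigned char *p)
    {
        uint16_t wire;

        assert(p != NULL);

        memcpy(&wire, p, sizeof(wire));   /* avoids unaligned access */
        return demo_ntohs(wire);
    }
    ```

    Given the two bytes 0x12 0x34 in packet order, demo_read_be16() yields
    0x1234 regardless of the endianness of the target being compiled for.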

    Fixes: 43bcf707ccdc ("bpf: fix _htons occurences in test_progs")
    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
    Commit e7ee40475760 ("perf symbols: Fix symbols searching for module
    in buildid-cache") added a function to check whether kernel modules
    reside in the build-id cache. This was because there was no way to
    identify a DSO that is actually a kernel module, so it looked at the
    link name of the file for a ".ko" suffix.

    But this does not work for compressed kernel modules, and such DSOs
    now have the correct symtab_type. So there is no need to check it
    anymore. This patch essentially reverts the commit.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170608073109.30699-10-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
    symsrc__init() overwrites dso->symtab_type with symsrc->type in
    dso__load_sym(). But for compressed kernel modules in the build-id
    cache, the original symtab type must be kept so that the module can be
    decompressed as needed.

    This fixes perf annotate to show disassembly of the function properly.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170608073109.30699-9-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
    If a kernel module is compressed, it should be decompressed before
    running objdump so that the binary data is parsed correctly. This fixes
    a failure of the object code reading test for me.

    Signed-off-by: Namhyung Kim
    Acked-by: Adrian Hunter
    Acked-by: Jiri Olsa
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170608073109.30699-8-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • On failure, it should free the 'name', so clean up the error path using
    goto.
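
    The pattern referred to here is the usual single-exit cleanup. A minimal
    sketch, assuming a hypothetical helper demo_join_len() that is not part
    of perf: every exit after the allocation goes through the 'out' label,
    so 'name' is freed on both the success and the failure path.

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /*
     * Build "<dir>/<file>" in a temporary buffer and report its length via
     * *out_len. Fails with -1 if the joined name is longer than 'max_len'.
     */
    static int demo_join_len(const char *dir, const char *file,
                             size_t max_len, size_t *out_len)
    {
        int ret = -1;
        size_t size, len;
        char *name;

        assert(dir != NULL && file != NULL && out_len != NULL);

        size = strlen(dir) + strlen(file) + 2;  /* '/' and terminating NUL */
        name = malloc(size);
        if (name == NULL)
            return -1;                          /* nothing to free yet */

        snprintf(name, size, "%s/%s", dir, file);
        len = strlen(name);

        if (len > max_len)
            goto out;                           /* failure still frees 'name' */

        *out_len = len;
        ret = 0;
    out:
        free(name);
        return ret;
    }
    ```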

    Signed-off-by: Namhyung Kim
    Suggested-by: Arnaldo Carvalho de Melo
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170608073109.30699-7-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
    Currently perf decompresses kernel modules when loading the symbol table,
    but it fails to do so when reading raw data.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170608073109.30699-6-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Convert open-coded decompress routine to use the function.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170608073109.30699-5-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Move decompress_kmodule() to util/dso.c and split it into two functions
    returning fd and (decompressed) file path. The existing user only wants
    the fd version but the path version will be used soon.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170608073109.30699-4-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The 'name' variable should be freed on the error path.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170608073109.30699-3-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
    Commit 6ebd2547dd24 ("perf annotate: Fix a bug following symbolic
    link of a build-id file") changed the code to use dirname to follow the
    symlink. But it only considers new-style build-id cache names, so old
    names fail on readlink() and force the use of the system path, which
    might not be available.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: Taeung Song
    Cc: Wang Nan
    Cc: kernel-team@lge.com
    Fixes: 6ebd2547dd24 ("perf annotate: Fix a bug following symbolic link of a build-id file")
    Link: http://lkml.kernel.org/r/20170608073109.30699-2-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     

08 Jun, 2017

6 commits

    A few shell command examples in perf-script-python.txt have minor
    problems, including:

    - The tools/perf/scripts/python directory listing command is
    unnecessarily repeated.
    - A few examples contain additional information in the command prompt,
    unnecessarily and inconsistently.

    This commit fixes them to enhance readability of the document.

    Signed-off-by: SeongJae Park
    Cc: Alexander Shishkin
    Cc: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Tom Zanussi
    Fixes: cff68e582237 ("perf/scripts: Add perf-trace-python Documentation")
    Link: http://lkml.kernel.org/r/20170530111827.21732-4-sj38.park@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    SeongJae Park
     
  • Default function signature of trace_unhandled() got changed to include a
    field dict, but its documentation, perf-script-python.txt has not been
    updated. Fix it.

    Signed-off-by: SeongJae Park
    Cc: Alexander Shishkin
    Cc: Peter Zijlstra
    Cc: Pierre Tardy
    Fixes: c02514850d67 ("perf scripts python: Give field dict to unhandled callback")
    Link: http://lkml.kernel.org/r/20170530111827.21732-6-sj38.park@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    SeongJae Park
     
  • This commit fixes wrong code snippets for trace_begin() and trace_end()
    function example definition.

    Signed-off-by: SeongJae Park
    Cc: Alexander Shishkin
    Cc: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Tom Zanussi
    Fixes: cff68e582237 ("perf/scripts: Add perf-trace-python Documentation")
    Link: http://lkml.kernel.org/r/20170530111827.21732-5-sj38.park@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    SeongJae Park
     
  • This commit fixes two errors in documents for perf-script-python and
    perf-script-perl as below:

    - /sys/kernel/debug/tracing events -> /sys/kernel/debug/tracing/events/
    - trace_handled -> trace_unhandled

    Signed-off-by: SeongJae Park
    Cc: Alexander Shishkin
    Cc: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Tom Zanussi
    Fixes: cff68e582237 ("perf/scripts: Add perf-trace-python Documentation")
    Link: http://lkml.kernel.org/r/20170530111827.21732-3-sj38.park@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    SeongJae Park
     
  • Script generated by the '--gen-script' option contains an outdated
    comment. It mentions a 'perf-trace-python' document while it has been
    renamed to 'perf-script-python'. Fix it.

    Signed-off-by: SeongJae Park
    Cc: Alexander Shishkin
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: 133dc4c39c57 ("perf: Rename 'perf trace' to 'perf script'")
    Link: http://lkml.kernel.org/r/20170530111827.21732-2-sj38.park@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    SeongJae Park
     
    The example in the perf-probe documentation for adding probes by
    function name pattern does not provide an example command for that case.

    This commit fixes the example to give an appropriate example command.

    Signed-off-by: SeongJae Park
    Acked-by: Masami Hiramatsu
    Cc: Peter Zijlstra
    Cc: Taeung Song
    Fixes: ee391de876ae ("perf probe: Update perf probe document")
    Link: http://lkml.kernel.org/r/20170507103642.30560-1-sj38.park@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    SeongJae Park
     

07 Jun, 2017

1 commit

  • …linux/kernel/git/acme/linux into perf/urgent

    Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

    - Only print NMI watchdog hint in 'perf stat' when it is enabled (Andi Kleen)

    - Fix sys_mmap/sys_old_mmap handling in s390 in 'perf trace' (Jiri Olsa)

    - Disable breakpoint signal tests in powerpc, that lacks the perf kernel
    glue to set breakpoint events and makes 'perf test' always fail (Jiri Olsa)

    - Fix 'perf annotate' for branch instruction with multiple operands (Kim Phillips)

    - Add missing powerpc triplet when disassembling with 'objdump' in 'perf
    annotate' (Kim Phillips)

    - Do not throw away partially unwound stacks when using libdw, making
    callchains produced with it similar to those produced when linked with
    the other DWARF unwind library supported in perf, libunwind (Milian Wolff)

    - Fixes to properly handle kernel modules when processing build-id meta
    events (Namhyung Kim)

    - Fix handling of compressed modules in the build-id cache (Namhyung Kim)

    - Fix 'perf annotate' failure when filename has special chars (Ravi Bangoria)
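
    The libdw item above boils down to keeping the frames collected before
    an error instead of discarding them. A rough sketch of that policy, with
    hypothetical names throughout: demo_unwind() and the frame callback are
    assumptions for this example and are not the perf or libdw API.

    ```c
    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical frame source: stores the next PC in *pc, returns 0 or -1. */
    typedef int (*demo_next_frame_fn)(void *state, unsigned long *pc);

    /*
     * Collect up to 'max' frames. An error after at least one frame does not
     * discard the frames gathered so far. Returns the number of frames, or
     * -1 if not even one frame could be unwound.
     */
    static int demo_unwind(demo_next_frame_fn next, void *state,
                           unsigned long *pcs, size_t max)
    {
        size_t n = 0;

        assert(next != NULL && pcs != NULL);

        while (n < max) {
            if (next(state, &pcs[n]) != 0)
                break;              /* stop here, but keep what we have */
            n++;
        }
        return n > 0 ? (int)n : -1;
    }

    /* Fake source for the example: yields two frames, then reports an error. */
    static int demo_two_frames(void *state, unsigned long *pc)
    {
        int *calls = state;

        if (*calls >= 2)
            return -1;
        *pc = 0x1000ul + (unsigned long)*calls;
        (*calls)++;
        return 0;
    }
    ```

    With the fake source, demo_unwind() reports two frames even though the
    third step fails, which mirrors the partial stacks described above.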

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

06 Jun, 2017

7 commits

  • In some situations the libdw unwinder stopped working properly. I.e.
    with libunwind we see:

    ~~~~~
    heaptrack_gui 2228 135073.400112: 641314 cycles:
    e8ed _dl_fixup (/usr/lib/ld-2.25.so)
    15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
    ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
    608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
    f199 call_init.part.0 (/usr/lib/ld-2.25.so)
    f2a5 _dl_init (/usr/lib/ld-2.25.so)
    db9 _dl_start_user (/usr/lib/ld-2.25.so)
    ~~~~~

    But with libdw and without this patch this sample is not properly
    unwound:

    ~~~~~
    heaptrack_gui 2228 135073.400112: 641314 cycles:
    e8ed _dl_fixup (/usr/lib/ld-2.25.so)
    15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
    ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
    ~~~~~

    Debug output showed me that libdw found a module for the last frame
    address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch
    double-checks what libdw sees and what perf knows. If the mappings
    mismatch, we now report the elf known to perf. This fixes the situation
    above, and the libdw unwinder produces the same stack as libunwind.
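
    The cross-check described above can be sketched roughly as follows. This is
    not the perf code: `Mapping` and `module_to_report` are hypothetical names,
    and comparing start address plus ELF path is an assumption about what
    counts as a mismatch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    """One mapped ELF file; hypothetical stand-in for a libdw module or a perf map."""
    start: int
    elf_path: str


def module_to_report(unwinder_view: Optional[Mapping],
                     perf_view: Mapping) -> Optional[Mapping]:
    """Return the mapping to (re-)report to the unwinder, or None if none is needed."""
    if unwinder_view == perf_view:
        return None  # both sides agree, the module is already known
    # The unwinder has no module for this address, or a mismatching one:
    # report the ELF that perf knows about.
    return perf_view
```

    In the failing sample, the unwinder's view would name /usr/lib/ld-2.25.so
    while perf's view names libKF5KIOWidgets.so.5.35.0, so the sketch returns
    perf's mapping.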

    Signed-off-by: Milian Wolff
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wolff@kdab.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Milian Wolff
     
  • So far the whole stack was thrown away when any error occurred before
    the maximum stack depth was unwound. This is actually a very common
    scenario though. The stacks that got unwound so far are still
    interesting. This removes a large chunk of differences when comparing
    perf script output for libunwind and libdw perf unwinding.

    E.g. with libunwind:

    ~~~~~
    heaptrack_gui 2228 135073.388524: 479408 cycles:
    ffffffff811749ed perf_iterate_ctx ([kernel.kallsyms])
    ffffffff81181662 perf_event_mmap ([kernel.kallsyms])
    ffffffff811cf5ed mmap_region ([kernel.kallsyms])
    ffffffff811cfe6b do_mmap ([kernel.kallsyms])
    ffffffff811b0dca vm_mmap_pgoff ([kernel.kallsyms])
    ffffffff811cdb0c sys_mmap_pgoff ([kernel.kallsyms])
    ffffffff81033acb sys_mmap ([kernel.kallsyms])
    ffffffff81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms])
    192ca mmap64 (/usr/lib/ld-2.25.so)
    59a9 _dl_map_object_from_fd (/usr/lib/ld-2.25.so)
    83d0 _dl_map_object (/usr/lib/ld-2.25.so)
    cda1 openaux (/usr/lib/ld-2.25.so)
    1834f _dl_catch_error (/usr/lib/ld-2.25.so)
    cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so)
    3481 dl_main (/usr/lib/ld-2.25.so)
    17387 _dl_sysdep_start (/usr/lib/ld-2.25.so)
    4d37 _dl_start (/usr/lib/ld-2.25.so)
    d87 _start (/usr/lib/ld-2.25.so)

    heaptrack_gui 2228 135073.388677: 611329 cycles:
    1a3e0 strcmp (/usr/lib/ld-2.25.so)
    82b2 _dl_map_object (/usr/lib/ld-2.25.so)
    cda1 openaux (/usr/lib/ld-2.25.so)
    1834f _dl_catch_error (/usr/lib/ld-2.25.so)
    cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so)
    3481 dl_main (/usr/lib/ld-2.25.so)
    17387 _dl_sysdep_start (/usr/lib/ld-2.25.so)
    4d37 _dl_start (/usr/lib/ld-2.25.so)
    d87 _start (/usr/lib/ld-2.25.so)
    ~~~~~

    With libdw without this patch:

    ~~~~~
    heaptrack_gui 2228 135073.388524: 479408 cycles:
    ffffffff811749ed perf_iterate_ctx ([kernel.kallsyms])
    ffffffff81181662 perf_event_mmap ([kernel.kallsyms])
    ffffffff811cf5ed mmap_region ([kernel.kallsyms])
    ffffffff811cfe6b do_mmap ([kernel.kallsyms])
    ffffffff811b0dca vm_mmap_pgoff ([kernel.kallsyms])
    ffffffff811cdb0c sys_mmap_pgoff ([kernel.kallsyms])
    ffffffff81033acb sys_mmap ([kernel.kallsyms])
    ffffffff81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms])

    heaptrack_gui 2228 135073.388677: 611329 cycles:
    ~~~~~

    With this patch applied, the libdw unwinder will produce the same
    output as the libunwind unwinder.
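
    The behavioural change can be illustrated with a small sketch. It is only a
    model of the idea, not the libdw code path: `UnwindError`, `unwind` and the
    `step` callback are hypothetical names.

```python
from typing import Callable, List, Optional


class UnwindError(Exception):
    """Raised by a step function when the next frame cannot be computed."""


def unwind(first_pc: int,
           step: Callable[[int], Optional[int]],
           max_depth: int) -> List[int]:
    """Collect frames until the stack ends, max_depth is hit, or a step fails."""
    frames = [first_pc]
    pc = first_pc
    while len(frames) < max_depth:
        try:
            caller = step(pc)
        except UnwindError:
            break  # keep the frames unwound so far instead of discarding them
        if caller is None:
            break  # regular end of the stack
        frames.append(caller)
        pc = caller
    return frames
```

    Before the change, the equivalent of the `except` branch would have
    returned an empty list, which is why whole samples showed up without any
    frames.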

    Signed-off-by: Milian Wolff
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lkml.kernel.org/r/20170601210021.20046-1-milian.wolff@kdab.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Milian Wolff
     
  • On an Ubuntu xenial system, 'perf annotate' says to install powerpc
    objdump on a system that already has binutils-powerpc-linux-gnu
    installed. Make perf aware of the missing triplet for the
    powerpc-linux-gnu target.

    Signed-off-by: Kim Phillips
    Cc: Alexander Shishkin
    Cc: He Kuang
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Ravi Bangoria
    Link: http://lkml.kernel.org/r/20170529142754.7fbfb1152fd8f2663de0ea70@arm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kim Phillips
     
  • The following tests are failing on powerpc:

    # perf test break
    18: Breakpoint overflow signal handler : FAILED!
    19: Breakpoint overflow sampling : FAILED!

    The powerpc kernel so far does not have support to even create
    instruction breakpoints using the perf event interface, so those tests
    fail early in the config phase.

    I added a '->is_supported()' callback to the test struct to be able to
    disable specific tests. It seems better than putting ifdefs directly into
    the test array.
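
    A rough model of the callback approach, written as a sketch rather than
    the actual test harness: the `Test` class, `run_tests` and the result
    strings are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Test:
    desc: str
    func: Callable[[], bool]
    # Optional hook: when present and returning False, the test is skipped.
    is_supported: Optional[Callable[[], bool]] = None


def run_tests(tests: List[Test]) -> List[Tuple[str, str]]:
    """Run each test unless its is_supported() hook says it cannot work here."""
    results = []
    for test in tests:
        if test.is_supported is not None and not test.is_supported():
            results.append((test.desc, "Skip"))
            continue
        results.append((test.desc, "Ok" if test.func() else "FAILED!"))
    return results
```

    Tests without the hook behave as before, so only the entries that need an
    architecture check have to change.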

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20170601205450.GA398@krava
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • decompress_kmodule() decompresses kernel modules in order to load
    symbols from them. In the DSO_BINARY_TYPE__BUILD_ID_CACHE case, it needs
    the full file path to extract the file extension to determine the
    decompression method. But overwriting 'name' makes the decompression
    fail, since it might then point to an old file that no longer exists.

    Instead, use dso->long_name for having the correct extension and use the
    real filename to decompress.

    In the DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP case, both names should
    be the same. This allows resolving symbols in the old modules.
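
    The split between the two names can be sketched like this. It is an
    illustration only: `decompress_module`, the extension table and the
    temporary-file handling are assumptions, not the perf implementation.

```python
import gzip
import lzma
import os
import shutil
import tempfile

# Extension -> opener; which formats are handled is an assumption of this sketch.
_OPENERS = {".gz": gzip.open, ".xz": lzma.open}


def decompress_module(filename: str, long_name: str) -> str:
    """Decompress `filename`, choosing the method from `long_name`'s extension.

    `filename` is the file that really exists (for example a build-id cache
    entry), `long_name` is the original module path carrying the extension.
    Returns the path of a temporary file holding the decompressed data.
    """
    ext = os.path.splitext(long_name)[1]
    opener = _OPENERS.get(ext)
    if opener is None:
        raise ValueError(f"unknown compression extension: {ext!r}")
    fd, out_path = tempfile.mkstemp(prefix="kmod-")
    with opener(filename, "rb") as src, os.fdopen(fd, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

    When both names are the same, as in the
    DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP case, the function degenerates to
    the simple "open the file by its own extension" behaviour.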

    Before:

    $ perf report -i perf.data.old | grep scsi_mod
    0.00% cc1 [scsi_mod] [k] 0x0000000000004aa6
    0.00% as [scsi_mod] [k] 0x00000000000099e1
    0.00% cc1 [scsi_mod] [k] 0x0000000000009830
    0.00% cc1 [scsi_mod] [k] 0x0000000000001b8f

    After:

    0.00% cc1 [scsi_mod] [k] scsi_handle_queue_ramp_up
    0.00% as [scsi_mod] [k] scsi_sg_alloc
    0.00% cc1 [scsi_mod] [k] scsi_setup_cmnd
    0.00% cc1 [scsi_mod] [k] scsi_get_command

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170531120105.21731-3-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Like machine__findnew_module_dso(), it should set necessary info for
    kernel modules to find symbol info from the file. Factor out
    dso__set_module_info() to do it.

    This is needed for dso__needs_decompress() to detect such DSOs.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170531120105.21731-2-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • When perf processes a build-id event, it creates DSOs with the build-id.
    But it didn't set the module short name (like '[module-name]'), so when
    processing a kernel mmap event for the module, it cannot find the DSO, as
    it only checks the short names.

    This leads perf to create a duplicate DSO without the build-id info, and
    it will look up the system path even if the DSO is already in the build-id
    cache. After the kernel is updated, perf cannot find the DSO and cannot
    show symbols in it anymore.
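
    To make the lookup problem concrete, here is a minimal sketch of a DSO
    table keyed by short name. `module_short_name` and `DsoTable` are
    hypothetical, and the way the short name is derived from the path is an
    assumption for illustration.

```python
import os
from typing import Dict, Optional

_COMPRESSED = (".gz", ".xz")  # assumed set of compression suffixes


def module_short_name(path: str) -> str:
    """Turn '/.../scsi_mod.ko.gz' into '[scsi_mod]'."""
    base = os.path.basename(path)
    for ext in _COMPRESSED:
        if base.endswith(ext):
            base = base[:-len(ext)]
            break
    if base.endswith(".ko"):
        base = base[:-len(".ko")]
    return f"[{base}]"


class DsoTable:
    """DSOs indexed by short name, the only key the mmap path looks at."""

    def __init__(self) -> None:
        self._by_short_name: Dict[str, dict] = {}

    def add_from_build_id_event(self, path: str, build_id: str) -> dict:
        dso = {"long_name": path, "build_id": build_id}
        # Registering the short name here is the step that was missing.
        self._by_short_name[module_short_name(path)] = dso
        return dso

    def find(self, short_name: str) -> Optional[dict]:
        return self._by_short_name.get(short_name)
```

    With the short name registered, a later lookup for '[scsi_mod]' finds the
    DSO that already carries the build-id instead of creating a second one.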

    You can see this if you have an old data file (w/ old kernel version):

    $ perf report -i perf.data.old -v |& grep scsi_mod
    build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz : cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
    Failed to open /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz, continuing without symbols
    ...

    The second message didn't show the build-id. With this patch:

    $ perf report -i perf.data.old -v |& grep scsi_mod
    build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz: cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
    /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz with build id cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 not found, continuing without symbols
    ...

    Now it shows the build-id but still cannot load the symbol table. This
    is a different problem which will be fixed in the next patch.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Peter Zijlstra
    Cc: kernel-team@lge.com
    Link: http://lkml.kernel.org/r/20170531120105.21731-1-namhyung@kernel.org
    [ Fix the build on older compilers (debian

    Namhyung Kim
     

02 Jun, 2017

2 commits

  • Only print the NMI watchdog hint when that watchdog is actually enabled.

    This avoids printing the hint unnecessarily.
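
    A minimal sketch of such a check, assuming the watchdog state is read from
    /proc/sys/kernel/nmi_watchdog. The function names and the hint text are
    placeholders, not the strings perf prints.

```python
def nmi_watchdog_enabled(path: str = "/proc/sys/kernel/nmi_watchdog") -> bool:
    """Return True only if the sysctl file exists and holds a non-zero value."""
    try:
        with open(path) as f:
            return int(f.read().strip()) > 0
    except (OSError, ValueError):
        return False  # missing or unreadable: treat the watchdog as disabled


def not_counted_hint(path: str = "/proc/sys/kernel/nmi_watchdog") -> str:
    """Return the hint text, or an empty string when the hint would be misleading."""
    if nmi_watchdog_enabled(path):
        return "hint: the NMI watchdog is enabled and may occupy a counter"
    return ""
```

    The `path` parameter exists only so the check can be exercised against a
    temporary file.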

    Signed-off-by: Andi Kleen
    Acked-by: Jiri Olsa
    Link: http://lkml.kernel.org/n/tip-lnw7edxnqsphkmeew857wz1i@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • 'perf annotate' is dropping the cr* fields from branch instructions.

    Fix it by adding support to display branch instructions having
    multiple operands.

    Power Arch objdump of int_sqrt:

    20.36 | c0000000004d2694: subf r10,r10,r3
    | c0000000004d2698: v bgt cr6,c0000000004d26a0
    1.82 | c0000000004d269c: mr r3,r10
    29.18 | c0000000004d26a0: mr r10,r8
    | c0000000004d26a4: v bgt cr7,c0000000004d26ac
    | c0000000004d26a8: mr r10,r7

    Power Arch Before Patch:

    20.36 | subf r10,r10,r3
    | v bgt 40
    1.82 | mr r3,r10
    29.18 | 40: mr r10,r8
    | v bgt 4c
    | mr r10,r7

    Power Arch After patch:

    20.36 | subf r10,r10,r3
    | v bgt cr6,40
    1.82 | mr r3,r10
    29.18 | 40: mr r10,r8
    | v bgt cr7,4c
    | mr r10,r7
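
    The idea of keeping the leading operands while rewriting only the branch
    target can be sketched as below. `simplify_branch` is a hypothetical
    helper; it assumes the target is always the last comma-separated operand
    and that any trailing symbol annotation has already been stripped. The
    function start address used in the usage note is derived from the offsets
    in the listings above (0x...26a0 is shown as 40).

```python
def simplify_branch(mnemonic: str, operands: str, func_start: int) -> str:
    """Rewrite the last operand (an absolute hex target) as a function offset."""
    head, sep, tail = operands.rpartition(",")
    target = tail.strip()
    # Preserve whitespace after the comma, as in "w0, <target>".
    pad = tail[: len(tail) - len(tail.lstrip())]
    offset = int(target, 16) - func_start
    return f"{mnemonic} {head}{sep}{pad}{offset:x}"
```

    For example, `simplify_branch("bgt", "cr6,c0000000004d26a0",
    0xc0000000004d2660)` yields `bgt cr6,40`, and a single-operand branch still
    produces the old form `bgt 40`.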

    Also support AArch64 conditional branch instructions, which can
    have up to three operands:

    Aarch64 Non-simplified (raw objdump) view:

    │ffff0000083cd11c: ↑ cbz w0, ffff0000083cd100

    Tested-by: Ravi Bangoria
    Reported-by: Anton Blanchard
    Reported-by: Robin Murphy
    Signed-off-by: Kim Phillips
    Cc: Alexander Shishkin
    Cc: Christian Borntraeger
    Cc: Mark Rutland
    Cc: Peter Zijlstra
    Cc: Taeung Song
    Link: http://lkml.kernel.org/r/20170601092959.f60d98912e8a1b66fd1e4c0e@arm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kim Phillips
     

01 Jun, 2017

1 commit

  • The s390 architecture maps sys_mmap (nr 90) into sys_old_mmap. For this
    reason perf trace can't find the proper syscall event to get args format
    from and displays it wrongly as 'continued'.

    To fix that, fill the "alias" field with "old_mmap" for trace's mmap record
    to get the correct translation.
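
    The alias fallback can be modelled with a short sketch. `SyscallFmt` and
    `find_enter_event` are hypothetical names, and the set of available events
    is made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class SyscallFmt:
    name: str
    alias: Optional[str] = None  # alternative syscall name to try


def find_enter_event(fmt: SyscallFmt, available: Set[str]) -> Optional[str]:
    """Look up the sys_enter event by name first, then by alias."""
    for candidate in (fmt.name, fmt.alias):
        if candidate is None:
            continue
        event = f"sys_enter_{candidate}"
        if event in available:
            return event
    return None
```

    On a system that only exposes `sys_enter_old_mmap`, a record for `mmap`
    without an alias finds nothing, while the same record with
    `alias="old_mmap"` resolves to the event that carries the argument format.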

    Before:
    0.042 ( 0.011 ms): vest/43052 fstat(statbuf: 0x3ffff89fd90 ) = 0
    0.042 ( 0.028 ms): vest/43052 ... [continued]: mmap()) = 0x3fffd6e2000
    0.072 ( 0.025 ms): vest/43052 read(buf: 0x3fffd6e2000, count: 4096 ) = 6

    After:
    0.045 ( 0.011 ms): fstat(statbuf: 0x3ffff8a0930 ) = 0
    0.057 ( 0.018 ms): mmap(arg: 0x3ffff8a0858 ) = 0x3fffd14a000
    0.076 ( 0.025 ms): read(buf: 0x3fffd14a000, count: 4096 ) = 6

    Signed-off-by: Jiri Olsa
    Cc: David Ahern
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20170531113557.19175-1-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

28 May, 2017

2 commits

  • Pull powerpc fixes from Michael Ellerman:
    "Fix running SPU programs on Cell, and a few other minor fixes.

    Thanks to Alistair Popple, Jeremy Kerr, Michael Neuling, Nicholas
    Piggin"

    * tag 'powerpc-4.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc: Add PPC_FEATURE userspace bits for SCV and DARN instructions
    powerpc/spufs: Fix hash faults for kernel regions
    powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N
    powerpc/powernv/npu-dma.c: Fix opal_npu_destroy_context() call
    selftests/powerpc: Fix TM resched DSCR test with some compilers

    Linus Torvalds
     
  • Pull perf tooling fixes from Thomas Gleixner:

    - Synchronization of tools and kernel headers

    - A series of fixes for perf report addressing various failures:
    * Handle invalid maps properly
    * Plug a memory leak
    * Handle frames and callchain order correctly

    - Fixes for handling inlines and children mode

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    tools/include: Sync kernel ABI headers with tooling headers
    perf tools: Put caller above callee in --children mode
    perf report: Do not drop last inlined frame
    perf report: Always honor callchain order for inlined nodes
    perf script: Add --inline option for debugging
    perf report: Fix off-by-one for non-activation frames
    perf report: Fix memory leak in addr2line when called by addr2inlines
    perf report: Don't crash on invalid maps in `-g srcline` mode

    Linus Torvalds
     

27 May, 2017

3 commits

  • Pull ftrace fixes from Steven Rostedt:
    "There's been a few memory issues found with ftrace.

    One was simply a memory leak where not all was being freed that should
    have been in releasing a file pointer on set_graph_function.

    Then Thomas found that the ftrace trampolines were marked for
    read/write as well as execute. To shrink the possible attack surface,
    he added calls to set them to ro. Which also uncovered some other
    issues with freeing module allocated memory that had its permissions
    changed.

    Kprobes had a similar issue which is fixed and a selftest was added to
    trigger that issue again"

    * tag 'trace-v4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    x86/ftrace: Make sure that ftrace trampolines are not RWX
    x86/mm/ftrace: Do not bug in early boot on irqs_disabled in cpu_flush_range()
    selftests/ftrace: Add a testcase for many kprobe events
    kprobes/x86: Fix to set RWX bits correctly before releasing trampoline
    ftrace: Fix memory leak in ftrace_graph_release()

    Linus Torvalds
     
  • When the filename contains special characters, perf annotate fails
    with an error:

    $ perf annotate --vmlinux ./vmlinux\(test\) --stdio native_safe_halt
    sh: -c: line 0: syntax error near unexpected token `('
    sh: -c: line 0: `objdump --start-address=0xffffffff8184e840
    --stop-address=0xffffffff8184e848 -l -d --no-show-raw -S -C
    ./vmlinux(test) 2>/dev/null|grep -v ./vmlinux(test):|expand'

    Fix it by surrounding filename in double quotes.
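
    For comparison, the same class of bug in a Python tool would typically be
    avoided with `shlex.quote`, which is a different technique from the double
    quotes used in this patch. The sketch below rebuilds only the objdump part
    of the command shown above; `objdump_command` is a hypothetical helper.

```python
import shlex


def objdump_command(filename: str, start: int, stop: int) -> str:
    """Build a shell command line with the filename quoted for sh."""
    quoted = shlex.quote(filename)  # wraps names with shell metacharacters
    return (
        f"objdump --start-address={start:#x} --stop-address={stop:#x} "
        f"-l -d --no-show-raw -S -C {quoted} 2>/dev/null"
    )
```

    With `./vmlinux(test)` as the filename, the parentheses end up inside
    single quotes, so the shell no longer parses them as syntax.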

    Signed-off-by: Ravi Bangoria
    Cc: Adam Stylinski
    Cc: Alexander Shishkin
    Cc: Christian Borntraeger
    Cc: Peter Zijlstra
    Cc: Taeung Song
    Link: http://lkml.kernel.org/r/20170505101417.2117-1-ravi.bangoria@linux.vnet.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ravi Bangoria
     
  • Add a testcase to test kprobes via the ftrace interface
    with many concurrent kprobe events.

    This tries to add many kprobe events (up to 256) on
    kernel functions. To avoid creating ftrace-based
    kprobes (kprobes on fentry), it skips the first N bytes
    (on x86 N=5, on ppc or arm N=4) of the function entry.
    After that, it enables all those events, disables them,
    and removes them.

    Since reclaiming of the unoptimization buffer is
    delayed, it waits long enough after removing the
    events.
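
    How the probe definitions could be generated is sketched below. This is
    not the selftest script: the helper name, the event names and the function
    list are hypothetical; the per-architecture offsets come from the text
    above, and the line format follows the `p:GROUP/EVENT SYMBOL+offset` form
    accepted by kprobe_events.

```python
from typing import List

# Bytes to skip at function entry, as given in the description above.
SKIP_BYTES = {"x86": 5, "ppc": 4, "arm": 4}


def kprobe_definitions(functions: List[str], arch: str, limit: int = 256) -> List[str]:
    """Build one kprobe_events line per function, probing past the entry bytes."""
    offset = SKIP_BYTES[arch]
    return [
        f"p:kprobes/testprobe_{index} {func}+{offset}"
        for index, func in enumerate(functions[:limit])
    ]
```

    Each returned line would then be appended to the kprobe_events file; the
    `limit` argument mirrors the "up to 256" bound.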

    Link: http://lkml.kernel.org/r/149577388470.11702.11832460851769204511.stgit@devbox

    Signed-off-by: Masami Hiramatsu
    Suggested-by: Steven Rostedt
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     

26 May, 2017

1 commit

  • This patch adds various verifier test cases:

    1) A test case for the pruning issue when tracking alignment
    is used.
    2) Various PTR_TO_MAP_VALUE_OR_NULL tests to make sure pointer
    arithmetic turns such register into UNKNOWN_VALUE type.
    3) Test cases for the special treatment of LD_ABS/LD_IND to
    make sure verifier doesn't break calling convention here.
    The latter is needed since, e.g., the arm64 JIT uses r1 - r5 for
    storing temporary data, so they really must be marked as
    NOT_INIT.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann