18 May, 2020

4 commits

  • Add ddr bandwidth usage metric support for i.MX8DXL.

    Metric:
    imx8dxl_lpddr4.bandwidth_usage
    imx8dxl_ddr3l.bandwidth_usage

    Example:
    root@imx8dxlevk:~# ./perf stat -a -I 1000 -M imx8dxl_lpddr4.bandwidth_usage dd if=/dev/zero of=/dev/null bs=100M count=1000000
    1.000242625 1444320 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 15.1 % imx8dxl_lpddr4.bandwidth_usage
    1.000242625 88964544 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/
    1.000242625 1000242625 ns duration_time
    2.001170500 297392 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 16.0 % imx8dxl_lpddr4.bandwidth_usage
    2.001170500 95684315 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/
    2.001170500 1000927875 ns duration_time
    3.001840125 320798 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/ # 16.0 % imx8dxl_lpddr4.bandwidth_usage
    3.001840125 95655155 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000,axi_channel=0x0/
    3.001840125 1000669625 ns duration_time

    Reviewed-by: Fugang Duan
    Signed-off-by: Joakim Zhang

    Joakim Zhang
     
  • Add ddr bandwidth usage metric support for i.MX8MP.

    Metric:
    imx8mp-lpddr4-bandwidth-usage

    Example:
    root@imx8mpevk:~# ./perf stat -a -I 1000 -M imx8mp-lpddr4-bandwidth-usage dd if=/dev/zero of=/dev/null bs=100M count=1000000
    1.000770875 18081664 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000/ # 37.0 % imx8mp-lpddr4-bandwidth-usage
    1.000770875 5895351484 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000/
    1.000770875 1000770875 ns duration_time
    2.001780250 11137456 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000/ # 39.0 % imx8mp-lpddr4-bandwidth-usage
    2.001780250 6232776052 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000/
    2.001780250 1001009375 ns duration_time
    3.002748125 10643520 imx8_ddr0/axid-read,axi_mask=0xffff,axi_id=0x0000/ # 39.0 % imx8mp-lpddr4-bandwidth-usage
    3.002748125 6229700768 imx8_ddr0/axid-write,axi_mask=0xffff,axi_id=0x0000/
    3.002748125 1000967875 ns duration_time

    Reviewed-by: Fugang Duan
    Signed-off-by: Joakim Zhang

    Joakim Zhang
     
  • In interval mode, if the metric expression contains the duration_time
    event, a command with an interval such as -I 5000 can trigger this cast issue.
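
    The message does not say which cast is involved. A minimal sketch,
    assuming the interval in milliseconds is converted to nanoseconds and
    passes through a 32-bit unsigned value (an assumption, not confirmed by
    the text above), shows why -I 5000 would be affected while -I 1000 is not:

```python
# Assumption (not stated in the commit message): duration_time is derived as
# interval_ms * 1_000_000 and is truncated to 32 bits somewhere on the way.
U32_MASK = 2**32 - 1
U64_MASK = 2**64 - 1


def ns_as_u32(interval_ms: int) -> int:
    """Nanosecond value after truncation to an unsigned 32-bit integer."""
    return (interval_ms * 1_000_000) & U32_MASK


def ns_as_u64(interval_ms: int) -> int:
    """Nanosecond value kept in an unsigned 64-bit integer."""
    return (interval_ms * 1_000_000) & U64_MASK


for ms in (1000, 4000, 5000):
    # The two columns agree up to 4000 ms and diverge at 5000 ms.
    print(ms, ns_as_u32(ms), ns_as_u64(ms))
```

    Under that assumption any interval above roughly 4294 ms wraps, so a
    metric dividing by duration_time would see a far smaller duration.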

    Reviewed-by: Fugang Duan
    Signed-off-by: Joakim Zhang

    Joakim Zhang
     
  • For interval mode, the metric is printed after the '#' character if it
    exists. But it's not calculated by the counts generated in this
    interval.

    See the following examples:

    root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
    # time counts unit events
    1.000422803 764,809 inst_retired.any # 2.9 CPI
    1.000422803 2,234,932 cycles
    2.001464585 1,960,061 inst_retired.any # 1.6 CPI
    2.001464585 4,022,591 cycles

    The second CPI should not be 1.6 (4,022,591/1,960,061 is 2.1)

    root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
    # time counts unit events
    1.000429493 2,869,311 cycles
    1.000429493 816,875 instructions # 0.28 insn per cycle
    2.001516426 9,260,973 cycles
    2.001516426 5,250,634 instructions # 0.87 insn per cycle

    The second 'insn per cycle' should not be 0.87 (5,250,634/9,260,973 is
    0.57).

    The current code uses a global variable 'rt_stat' for tracking and
    updating the std dev of runtime stat. Unlike the counts, which are reset
    for each interval, 'rt_stat' is not reset.

    perf_stat_process_counter()
    {
    if (config->interval)
    init_stats(ps->res_stats);
    }

    So for interval mode, the 'rt_stat' variable should be reset too.

    This patch resets 'rt_stat' before read_counters(), so the runtime stat
    is only calculated by the counts generated in this interval.

    With this patch:

    root@kbl-ppc:~# perf stat -M CPI -I1000 --interval-count 2
    # time counts unit events
    1.000420924 2,408,818 inst_retired.any # 2.1 CPI
    1.000420924 5,010,111 cycles
    2.001448579 2,798,407 inst_retired.any # 1.6 CPI
    2.001448579 4,599,861 cycles

    root@kbl-ppc:~# perf stat -e cycles,instructions -I1000 --interval-count 2
    # time counts unit events
    1.000428555 2,769,714 cycles
    1.000428555 774,462 instructions # 0.28 insn per cycle
    2.001471562 3,595,904 cycles
    2.001471562 1,243,703 instructions # 0.35 insn per cycle

    Now the second 'insn per cycle' and CPI are calculated by the counts
    generated in this interval.
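
    The effect of a stat that is not reset can be sketched with a simplified
    model. RunningStat and cpi_per_interval are hypothetical names, not perf
    code, and the model does not try to reproduce the exact stale value perf
    printed. It only shows that a running mean carried across intervals mixes
    them, while a reset yields the per-interval ratio quoted above:

```python
class RunningStat:
    """Incrementally updated mean, a simplified stand-in for a runtime stat."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0

    def update(self, value):
        self.n += 1
        self.mean += (value - self.mean) / self.n


def cpi_per_interval(intervals, reset_each_interval):
    """Return the CPI reported after each (instructions, cycles) interval."""
    cycles, insts = RunningStat(), RunningStat()
    reported = []
    for inst_count, cycle_count in intervals:
        if reset_each_interval:
            cycles.reset()
            insts.reset()
        cycles.update(cycle_count)
        insts.update(inst_count)
        reported.append(round(cycles.mean / insts.mean, 1))
    return reported


# Counts taken from the first "perf stat -M CPI" example in this message.
intervals = [(764_809, 2_234_932), (1_960_061, 4_022_591)]
print(cpi_per_interval(intervals, reset_each_interval=False))
print(cpi_per_interval(intervals, reset_each_interval=True))
```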

    Signed-off-by: Jin Yao
    Acked-by: Jiri Olsa
    Tested-By: Kajol Jain
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Jin Yao
    Cc: Kan Liang
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200420145417.6864-1-yao.jin@linux.intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Joakim cherry pick from perf/core commit: 197ba86fdc888dc0d3d6b89b402c9c6851d4c6fb
    Reviewed-by: Fugang Duan
    Signed-off-by: Joakim Zhang

    Jin Yao
     

01 Apr, 2020

2 commits


08 Mar, 2020

1 commit

  • Merge Linux stable release v5.4.24 into imx_5.4.y

    * tag 'v5.4.24': (3306 commits)
    Linux 5.4.24
    blktrace: Protect q->blk_trace with RCU
    kvm: nVMX: VMWRITE checks unsupported field before read-only field
    ...

    Signed-off-by: Jason Liu

    Conflicts:
    arch/arm/boot/dts/imx6sll-evk.dts
    arch/arm/boot/dts/imx7ulp.dtsi
    arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
    drivers/clk/imx/clk-composite-8m.c
    drivers/gpio/gpio-mxc.c
    drivers/irqchip/Kconfig
    drivers/mmc/host/sdhci-of-esdhc.c
    drivers/mtd/nand/raw/gpmi-nand/gpmi-nand.c
    drivers/net/can/flexcan.c
    drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
    drivers/net/ethernet/mscc/ocelot.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    drivers/net/phy/realtek.c
    drivers/pci/controller/mobiveil/pcie-mobiveil-host.c
    drivers/perf/fsl_imx8_ddr_perf.c
    drivers/tee/optee/shm_pool.c
    drivers/usb/cdns3/gadget.c
    kernel/sched/cpufreq.c
    net/core/xdp.c
    sound/soc/fsl/fsl_esai.c
    sound/soc/fsl/fsl_sai.c
    sound/soc/sof/core.c
    sound/soc/sof/imx/Kconfig
    sound/soc/sof/loader.c

    Jason Liu
     

07 Mar, 2020

1 commit

  • libbfd has changed the bfd_section_* macros to inline functions
    bfd_section_ since 2019-09-18. See below two commits:
    o http://www.sourceware.org/ml/gdb-cvs/2019-09/msg00064.html
    o https://www.sourceware.org/ml/gdb-cvs/2019-09/msg00072.html

    This fix makes perf able to build with both old and new libbfd.

    Signed-off-by: Changbin Du
    Acked-by: Jiri Olsa
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200128152938.31413-1-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     

05 Mar, 2020

4 commits

  • commit 604e2139a1026793b8c2172bd92c7e9d039a5cf0 upstream.

    When we moved zalloc.o to the library we missed gtk library which needs
    it compiled in, otherwise the missing __zfree symbol will cause the
    library to fail to load.

    Adding the zalloc object to the gtk library build.

    Fixes: 7f7c536f23e6 ("tools lib: Adopt zalloc()/zfree() from tools/perf")
    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Jelle van der Waa
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200113104358.123511-1-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Jiri Olsa
     
  • commit 3f7774033e6820d25beee5cf7aefa11d4968b951 upstream.

    We need to set actions->ms.map since 599a2f38a989 ("perf hists browser:
    Check sort keys before hot key actions"), as in that patch we bail out
    if map is NULL.

    Reviewed-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Namhyung Kim
    Fixes: 599a2f38a989 ("perf hists browser: Check sort keys before hot key actions")
    Link: https://lkml.kernel.org/n/tip-wp1ssoewy6zihwwexqpohv0j@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Arnaldo Carvalho de Melo
     
  • commit b9167c8078c3527de6da241c8a1a75a9224ed90a upstream.

    Commit 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second
    timeout per test") added a 45 second timeout for tests, and also added
    a way for tests to customise the timeout via a settings file.

    For example the ftrace tests take multiple minutes to run, so they
    were given longer in commit b43e78f65b1d ("tracing/selftests: Turn off
    timeout setting").

    This works when the tests are run from the source tree. However if the
    tests are installed with "make -C tools/testing/selftests install",
    the settings files are not copied into the install directory. When the
    tests are then run from the install directory the longer timeouts are
    not applied and the tests timeout incorrectly.

    So add the settings files to TEST_FILES of the appropriate Makefiles
    to cause the settings files to be installed using the existing install
    logic.
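
    A minimal sketch of the failure mode, in Python rather than the shell
    used by the real runner. load_timeout is a hypothetical helper, and the
    "timeout=<seconds>" line format and the value 300 are assumptions made
    for illustration:

```python
import os
import tempfile

DEFAULT_TIMEOUT = 45  # seconds, the per-test default named above


def load_timeout(test_dir):
    """Return the timeout for test_dir, honoring an optional 'settings' file."""
    timeout = DEFAULT_TIMEOUT
    path = os.path.join(test_dir, "settings")
    if os.path.isfile(path):
        with open(path) as fh:
            for line in fh:
                key, sep, value = line.strip().partition("=")
                if sep and key == "timeout":
                    timeout = int(value)
    return timeout


with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    # Source tree: the settings file sits next to the tests.
    with open(os.path.join(src, "settings"), "w") as fh:
        fh.write("timeout=300\n")  # illustrative value
    source_tree_timeout = load_timeout(src)
    # Install directory before the fix: the settings file was never copied.
    install_dir_timeout = load_timeout(dst)

print(source_tree_timeout, install_dir_timeout)
```

    Once the settings file is installed alongside the tests, both lookups
    would return the customised value.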

    Fixes: 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second timeout per test")
    Signed-off-by: Michael Ellerman
    Signed-off-by: Shuah Khan
    Signed-off-by: Greg Kroah-Hartman

    Michael Ellerman
     
  • [ Upstream commit e404b8c7cfb31654c9024d497cec58a501501692 ]

    After commit 27596472473a ("ipv6: fix ECMP route replacement") it is no
    longer possible to replace an ECMP-able route by a non ECMP-able route.
    For example,
    ip route add 2001:db8::1/128 via fe80::1 dev dummy0
    ip route replace 2001:db8::1/128 dev dummy0
    does not work as expected.

    Tweak the replacement logic so that point 3 in the log of the above commit
    becomes:
    3. If the new route is not ECMP-able, and no matching non-ECMP-able route
    exists, replace matching ECMP-able route (if any) or add the new route.

    We can now summarize the entire replace semantics to:
    When doing a replace, prefer replacing a matching route of the same
    "ECMP-able-ness" as the replace argument. If there is no such candidate,
    fallback to the first route found.
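
    The summarized semantics can be sketched as a small selection function.
    Route and pick_replace_target are hypothetical names, and "matching"
    stands for the existing routes that already match the replace argument.
    This is an illustration of the rule, not the kernel implementation:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    name: str
    ecmp_able: bool


def pick_replace_target(matching: List[Route], new_is_ecmp_able: bool) -> Optional[Route]:
    """Pick which existing route a replace should overwrite.

    Prefer a route with the same ECMP-able-ness as the new route; otherwise
    fall back to the first matching route. None means nothing matched, so
    the new route is simply added.
    """
    for route in matching:
        if route.ecmp_able == new_is_ecmp_able:
            return route
    return matching[0] if matching else None


ecmp = Route("via fe80::1 dev dummy0", ecmp_able=True)
plain = Route("dev dummy0", ecmp_able=False)

# Replacing with a non-ECMP-able route when only an ECMP-able one exists
# now replaces that route instead of failing to find a candidate.
print(pick_replace_target([ecmp], new_is_ecmp_able=False))
```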

    Fixes: 27596472473a ("ipv6: fix ECMP route replacement")
    Signed-off-by: Benjamin Poirier
    Reviewed-by: Michal Kubecek
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Benjamin Poirier
     

29 Feb, 2020

1 commit

  • commit f2e97dc126b712c0d21219ed0c42710006c1cf52 upstream.

    Fix following build error. We could push a tcp.h header into one of the
    include paths, but I think its easy enough to simply pull in the three
    defines we need here. If we end up using more of tcp.h at some point
    we can pull it in later.

    /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c: In function ‘connected_socket_v4’:
    /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:20:11: error: ‘TCP_REPAIR_ON’ undeclared (first use in this function)
    repair = TCP_REPAIR_ON;
    ^
    /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:20:11: note: each undeclared identifier is reported only once for each function it appears in
    /home/john/git/bpf/tools/testing/selftests/bpf/prog_tests/sockmap_basic.c:29:11: error: ‘TCP_REPAIR_OFF_NO_WP’ undeclared (first use in this function)
    repair = TCP_REPAIR_OFF_NO_WP;

    Then with fix,

    $ ./test_progs -n 44
    #44/1 sockmap create_update_free:OK
    #44/2 sockhash create_update_free:OK
    #44 sockmap_basic:OK

    Fixes: 5d3919a953c3c ("selftests/bpf: Test freeing sockmap/sockhash with a socket in it")
    Signed-off-by: John Fastabend
    Signed-off-by: Alexei Starovoitov
    Reviewed-by: Jakub Sitnicki
    Link: https://lore.kernel.org/bpf/158131347731.21414.12120493483848386652.stgit@john-Precision-5820-Tower
    Signed-off-by: Greg Kroah-Hartman

    John Fastabend
     

24 Feb, 2020

10 commits

  • [ Upstream commit 414f50434aa2463202a5b35e844f4125dd1a7101 ]

    Some newer cards supported by aacraid can take up to 40s to recover
    after an EEH event. This causes spurious failures in the basic EEH
    self-test since the current maximum timeout is only 30s.

    Fix the immediate issue by bumping the timeout to a default of 60s,
    and allow the wait time to be specified via an environmental variable
    (EEH_MAX_WAIT).
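
    The override pattern can be sketched as follows. The self-test itself is
    not written in Python, and max_wait_seconds is a hypothetical helper
    that only illustrates "default of 60s unless EEH_MAX_WAIT is set":

```python
import os

DEFAULT_MAX_WAIT = 60  # seconds, the new default described above


def max_wait_seconds(env=None):
    """Return the wait limit, letting EEH_MAX_WAIT override the default."""
    if env is None:
        env = os.environ
    return int(env.get("EEH_MAX_WAIT", DEFAULT_MAX_WAIT))


print(max_wait_seconds({}))                       # default applies
print(max_wait_seconds({"EEH_MAX_WAIT": "90"}))   # explicit override
```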

    Reported-by: Steve Best
    Suggested-by: Douglas Miller
    Signed-off-by: Oliver O'Halloran
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20200122031125.25991-1-oohall@gmail.com
    Signed-off-by: Sasha Levin

    Oliver O'Halloran
     
  • [ Upstream commit 51bad0f05616c43d6d34b0a19bcc9bdab8e8fb39 ]

    Currently, there are a lot of false positives if a single reuseport test
    fails. This is because expected_results and the result map are not cleared.

    Zero both after individual test runs, which fixes the mentioned false
    positives.

    Fixes: 91134d849a0e ("bpf: Test BPF_PROG_TYPE_SK_REUSEPORT")
    Signed-off-by: Lorenz Bauer
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Jakub Sitnicki
    Acked-by: Martin KaFai Lau
    Acked-by: John Fastabend
    Link: https://lore.kernel.org/bpf/20200124112754.19664-5-lmb@cloudflare.com
    Signed-off-by: Sasha Levin

    Lorenz Bauer
     
  • [ Upstream commit 8b7e20a7ba54836076ff35a28349dabea4cec48f ]

    Add the TEST opcode to Group3-2 reg=001b, the same as Group3-1 does.

    Commit

    12a78d43de76 ("x86/decoder: Add new TEST instruction pattern")

    added a TEST opcode assignment to f6 XX/001/XXX (Group 3-1), but did
    not add f7 XX/001/XXX (Group 3-2).

    Actually, this TEST opcode variant (ModRM.reg /1) is not described in
    the Intel SDM Vol2 but in AMD64 Architecture Programmer's Manual Vol.3,
    Appendix A.2 Table A-6. ModRM.reg Extensions for the Primary Opcode Map.

    Without this fix, Randy found a warning by insn_decoder_test related
    to this issue as below.

    HOSTCC arch/x86/tools/insn_decoder_test
    HOSTCC arch/x86/tools/insn_sanity
    TEST posttest
    arch/x86/tools/insn_decoder_test: warning: Found an x86 instruction decoder bug, please report this.
    arch/x86/tools/insn_decoder_test: warning: ffffffff81000bf1: f7 0b 00 01 08 00 testl $0x80100,(%rbx)
    arch/x86/tools/insn_decoder_test: warning: objdump says 6 bytes, but insn_get_length() says 2
    arch/x86/tools/insn_decoder_test: warning: Decoded and checked 11913894 instructions with 1 failures
    TEST posttest
    arch/x86/tools/insn_sanity: Success: decoded and checked 1000000 random instructions with 0 errors (seed:0x871ce29c)

    To fix this error, add the TEST opcode according to AMD64 APM Vol.3.

    [ bp: Massage commit message. ]

    Reported-by: Randy Dunlap
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Borislav Petkov
    Acked-by: Randy Dunlap
    Tested-by: Randy Dunlap
    Link: https://lkml.kernel.org/r/157966631413.9580.10311036595431878351.stgit@devnote2
    Signed-off-by: Sasha Levin

    Masami Hiramatsu
     
  • [ Upstream commit 8580bed7e751e6d4f17881e059daf3cb37ba4717 ]

    Building objtool with ARCH=x86_64 fails with:

    $make ARCH=x86_64 -C tools/objtool
    ...
    CC arch/x86/decode.o
    arch/x86/decode.c:10:22: fatal error: asm/insn.h: No such file or directory
    #include <asm/insn.h>
    ^
    compilation terminated.
    mv: cannot stat ‘arch/x86/.decode.o.tmp’: No such file or directory
    make[2]: *** [arch/x86/decode.o] Error 1
    ...

    The root cause is that the command-line variable 'ARCH' cannot be
    overridden. It can be replaced by 'SRCARCH', which is defined in
    'tools/scripts/Makefile.arch'.

    Signed-off-by: Shile Zhang
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Ingo Molnar
    Reviewed-by: Kamalesh Babulal
    Link: https://lore.kernel.org/r/d5d11370ae116df6c653493acd300ec3d7f5e925.1579543924.git.jpoimboe@redhat.com
    Signed-off-by: Sasha Levin

    Shile Zhang
     
  • [ Upstream commit 585c91f40d201bc564d4e76b83c05b3b5363fe7e ]

    Fix unsafe unaligned pointer usage in usbip network interfaces. usbip tool
    build fails with new gcc -Werror=address-of-packed-member checks.

    usbip_network.c: In function ‘usbip_net_pack_usb_device’:
    usbip_network.c:79:32: error: taking address of packed member of ‘struct usbip_usb_device’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
    79 | usbip_net_pack_uint32_t(pack, &udev->busnum);

    Fix with minor changes to pass by value instead of by address.

    Signed-off-by: Shuah Khan
    Link: https://lore.kernel.org/r/20200109012416.2875-1-skhan@linuxfoundation.org
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

    Shuah Khan
     
  • [ Upstream commit 6794200fa3c9c3e6759dae099145f23e4310f4f7 ]

    GCC9 introduced string hardening mechanisms, which produce the following
    error during fs api compilation:

    error: '__builtin_strncpy' specified bound 4096 equals destination size
    [-Werror=stringop-truncation]

    This happens when the copy length passed to strncpy is equal to the
    destination size, which could potentially lead to a buffer overflow.

    Mitigate this potential issue by limiting the copy length to the
    destination size minus 1 and explicitly terminating the destination with NUL.

    Signed-off-by: Andrey Zhizhikin
    Reviewed-by: Petr Mladek
    Acked-by: Jiri Olsa
    Cc: Alexei Starovoitov
    Cc: Andrii Nakryiko
    Cc: Daniel Borkmann
    Cc: Kefeng Wang
    Cc: Martin KaFai Lau
    Cc: Petr Mladek
    Cc: Sergey Senozhatsky
    Cc: Song Liu
    Cc: Yonghong Song
    Cc: bpf@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Link: http://lore.kernel.org/lkml/20191211080109.18765-1-andrey.zhizhikin@leica-geosystems.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin

    Andrey Zhizhikin
     
  • [ Upstream commit 1162f844030ac1ac7321b5e8f6c9badc7a11428f ]

    Currently, when bpftool cgroup show has an error, no error
    message is printed. This is confusing because the user may think the
    result is empty.

    Before the change:

    $ bpftool cgroup show /sys/fs/cgroup
    ID AttachType AttachFlags Name
    $ echo $?
    255

    After the change:
    $ ./bpftool cgroup show /sys/fs/cgroup
    Error: can't query bpf programs attached to /sys/fs/cgroup: Operation
    not permitted

    v2: Rename check_query_cgroup_progs to cgroup_has_attached_progs

    Signed-off-by: Hechao Li
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20191224011742.3714301-1-hechaol@fb.com
    Signed-off-by: Sasha Levin

    Hechao Li
     
  • [ Upstream commit ea6a547669b37453f2b1a5d85188d75b3613dfaa ]

    The SO_TXTIME test depends on accurate timers. In some virtualized
    environments the test has been reported to be flaky. This is easily
    reproduced by disabling kvm acceleration in Qemu.

    Allow greater variance in a run and retry to further reduce flakiness.

    Observed errors are one of two kinds: either the packet arrives too
    early or late at recv(), or it was dropped in the qdisc itself and the
    recv() call times out.

    In the latter case, the qdisc queues a notification to the error
    queue of the send socket. Also explicitly report this cause.

    Link: https://lore.kernel.org/netdev/CA+FuTSdYOnJCsGuj43xwV1jxvYsaoa_LzHQF9qMyhrkLrivxKw@mail.gmail.com
    Reported-by: Naresh Kamboju
    Signed-off-by: Willem de Bruijn
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Sasha Levin

    Willem de Bruijn
     
  • [ Upstream commit ac87813d4372f4c005264acbe3b7f00c1dee37c4 ]

    Commit 852c8cbf34d3 ("selftests/kselftest/runner.sh: Add 45 second
    timeout per test") adds support for a new per-test-directory "settings"
    file. But this only works for tests not in sub-subdirectories, e.g.

    - tools/testing/selftests/rtc (rtc) is OK,
    - tools/testing/selftests/net/mptcp (net/mptcp) is not.

    We have to increase the timeout for net/mptcp tests which are not
    upstreamed yet but this fix is valid for other tests if they need to add
    a "settings" file, see the full list with:

    tools/testing/selftests/*/*/**/Makefile

    Note that this patch changes the text header message printed at the end
    of the execution but this text is modified only for the tests that are
    in sub-subdirectories, e.g.

    ok 1 selftests: net/mptcp: mptcp_connect.sh

    Before we had:

    ok 1 selftests: mptcp: mptcp_connect.sh

    But showing the full target name is probably better, just in case a
    subsubdir has the same name as another one in another subdirectory.
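
    The naming difference can be sketched like this. short_name and
    full_name are hypothetical helpers, and the "foo/mptcp" directory is an
    invented example of a name clash, not an existing test:

```python
import posixpath

SELFTESTS = "tools/testing/selftests"


def short_name(test_dir):
    """Old header: only the last path component of the test directory."""
    return posixpath.basename(test_dir)


def full_name(test_dir, root=SELFTESTS):
    """New header: the directory relative to the selftests root."""
    return posixpath.relpath(test_dir, root)


mptcp = SELFTESTS + "/net/mptcp"
other = SELFTESTS + "/foo/mptcp"  # invented directory, for illustration only

print(short_name(mptcp), short_name(other))  # ambiguous
print(full_name(mptcp), full_name(other))    # distinct
```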

    Fixes: 852c8cbf34d3 (selftests/kselftest/runner.sh: Add 45 second timeout per test)
    Signed-off-by: Matthieu Baerts
    Reviewed-by: Kees Cook
    Signed-off-by: Shuah Khan
    Signed-off-by: Sasha Levin

    Matthieu Baerts
     
  • [ Upstream commit 6b64a650f0b2ae3940698f401732988699eecf7a ]

    It was observed[1] on arm64 that __builtin_strlen led to an infinite
    loop in the get_size selftest. This is because __builtin_strlen (and
    other builtins) may sometimes result in a call to the C library
    function. The C library implementation of strlen uses an IFUNC
    resolver to load the most efficient strlen implementation for the
    underlying machine and hence has a PLT indirection even for static
    binaries. Because this binary avoids the C library startup routines,
    the PLT initialization never happens and hence the program gets stuck
    in an infinite loop.

    On x86_64 the __builtin_strlen just happens to expand inline and avoid
    the call but that is not always guaranteed.

    Further, while testing on x86_64 (Fedora 31), it was observed that the
    test also failed with a segfault inside write() because the generated
    code for the write function in glibc seems to access TLS before the
    syscall (probably due to the cancellation point check) and fails
    because TLS is not initialised.

    To mitigate these problems, this patch reduces the interface with the
    C library to just the syscall function. The syscall function still
    sets errno on failure, which is undesirable but for now it only
    affects cases where syscalls fail.

    [1] https://bugs.linaro.org/show_bug.cgi?id=5479

    Signed-off-by: Siddhesh Poyarekar
    Reported-by: Masami Hiramatsu
    Tested-by: Masami Hiramatsu
    Reviewed-by: Tim Bird
    Signed-off-by: Shuah Khan
    Signed-off-by: Sasha Levin

    Siddhesh Poyarekar
     

20 Feb, 2020

1 commit

  • commit 80cc7bb6c104d733bff60ddda09f19139c61507c upstream.

    For data collected on machines with front end stalled cycles supported,
    such as found on modern AMD CPU families, commit 146540fb545b ("perf
    stat: Always separate stalled cycles per insn") introduces a new line in
    CSV output with a leading comma that upsets some automated scripts.
    Scripts have to use "-e ex_ret_instr" to work around this issue, after
    upgrading to a version of perf with that commit.

    We could add "if (have_frontend_stalled && !config->csv_sep)" to the not
    (total && avg) else clause, to emphasize that CSV users are usually
    scripts, and are written to do only what is needed, i.e., they wouldn't
    typically invoke "perf stat" without specifying an explicit event list.

    But - let alone CSV output - why should users now tolerate a constant
    0-reporting extra line in regular terminal output?:

    BEFORE:

    $ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1

    Performance counter stats for 'system wide':

    181,110,981 instructions # 0.58 insn per cycle
    # 0.00 stalled cycles per insn
    309,876,469 cycles

    1.002202582 seconds time elapsed

    The user would not like to see the now permanent:

    "0.00 stalled cycles per insn"

    line fixture, as it gives no useful information.

    So this patch removes the printing of the zeroed stalled cycles line
    altogether, almost reverting the very original commit fb4605ba47e7
    ("perf stat: Check for frontend stalled for metrics"), which seems like
    it was written to normalize --metric-only column output of common Intel
    machines at the time: modern Intel machines have ceased to support the
    genericised frontend stalled metrics AFAICT.

    AFTER:

    $ sudo perf stat --all-cpus -einstructions,cycles -- sleep 1

    Performance counter stats for 'system wide':

    244,071,432 instructions # 0.69 insn per cycle
    355,353,490 cycles

    1.001862516 seconds time elapsed

    Output behaviour when stalled cycles is indeed measured is not affected
    (BEFORE == AFTER):

    $ sudo perf stat --all-cpus -einstructions,cycles,stalled-cycles-frontend -- sleep 1

    Performance counter stats for 'system wide':

    247,227,799 instructions # 0.63 insn per cycle
    # 0.26 stalled cycles per insn
    394,745,636 cycles
    63,194,485 stalled-cycles-frontend # 16.01% frontend cycles idle

    1.002079770 seconds time elapsed

    Fixes: 146540fb545b ("perf stat: Always separate stalled cycles per insn")
    Signed-off-by: Kim Phillips
    Acked-by: Andi Kleen
    Acked-by: Jiri Olsa
    Acked-by: Song Liu
    Cc: Alexander Shishkin
    Cc: Cong Wang
    Cc: Davidlohr Bueso
    Cc: Jin Yao
    Cc: Kan Liang
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200207230613.26709-1-kim.phillips@amd.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Kim Phillips
     

15 Feb, 2020

3 commits

  • commit 1985f8c7f9a42a651a9750d6fcadc74336d182df upstream.

    If we compile tools/acpi target in the top source directory, we'd get a
    compilation error as shown below:

    # make tools/acpi
    DESCEND power/acpi
    DESCEND tools/acpidbg
    CC tools/acpidbg/acpidbg.o
    Assembler messages:
    Fatal error: can't create /home/lzy/kernel-upstream/power/acpi/\
    tools/acpidbg/acpidbg.o: No such file or directory
    ../../Makefile.rules:26: recipe for target '/home/lzy/kernel-upstream/\
    power/acpi/tools/acpidbg/acpidbg.o' failed
    make[3]: *** [/home/lzy/kernel-upstream//power/acpi/tools/acpidbg/\
    acpidbg.o] Error 1
    Makefile:19: recipe for target 'acpidbg' failed
    make[2]: *** [acpidbg] Error 2
    Makefile:54: recipe for target 'acpi' failed
    make[1]: *** [acpi] Error 2
    Makefile:1607: recipe for target 'tools/acpi' failed
    make: *** [tools/acpi] Error 2

    Fixes: d5a4b1a540b8 ("tools/power/acpi: Remove direct kernel source include reference")
    Signed-off-by: Zhengyuan Liu
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Zhengyuan Liu
     
  • commit 5d3919a953c3c96c02fc7a337f8376cde43ae31f upstream.

    Commit 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear
    down") introduced sleeping issues inside RCU critical sections and while
    holding a spinlock on sockmap/sockhash tear-down. There has to be at least
    one socket in the map for the problem to surface.

    This adds a test that triggers the warnings for broken locking rules. Not a
    fix per se, but rather tooling to verify the accompanying fixes. Run on a
    VM with 1 vCPU to reproduce the warnings.

    Fixes: 7e81a3530206 ("bpf: Sockmap, ensure sock lock held during tear down")
    Signed-off-by: Jakub Sitnicki
    Signed-off-by: Daniel Borkmann
    Acked-by: John Fastabend
    Link: https://lore.kernel.org/bpf/20200206111652.694507-4-jakub@cloudflare.com
    Signed-off-by: Greg Kroah-Hartman

    Jakub Sitnicki
     
  • commit d95f1e8b462c4372ac409886070bb8719d8a4d3a upstream.

    Turns out the xlated program instructions can also be missing if
    kptr_restrict sysctl is set. This means that the previous fix to check the
    jited_prog_insns pointer was insufficient; add another check of the
    xlated_prog_insns pointer as well.

    Fixes: 5b79bcdf0362 ("bpftool: Don't crash on missing jited insns or ksyms")
    Fixes: cae73f233923 ("bpftool: use bpf_program__get_prog_info_linear() in prog.c:do_dump()")
    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Quentin Monnet
    Link: https://lore.kernel.org/bpf/20200206102906.112551-1-toke@redhat.com
    Signed-off-by: Greg Kroah-Hartman

    Toke Høiland-Jørgensen
     

12 Feb, 2020

6 commits

  • Add SocName in DDR JSON file, so that metric/metricgroup can filter by this
    property.

    Reviewed-by: Fugang Duan
    Signed-off-by: Joakim Zhang

    Joakim Zhang
     
  • All metrics under one CPUID are loaded by the perf tool when the CPUID
    of the SoC matches. So users could see other platforms' metrics from one
    platform, which is very confusing. We can match metric/metricgroup with
    SOCNAME if needed.
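
    The filtering idea can be sketched as below. The dictionary layout, the
    visible_metrics helper and the placeholder cpuid/socname values are all
    assumptions made for illustration. They are not the JSON schema or the
    perf implementation:

```python
def visible_metrics(metrics, cpuid, socname):
    """Return metric names that apply to this CPUID and, if given, this SoC."""
    names = []
    for metric in metrics:
        if metric["cpuid"] != cpuid:
            continue
        wanted_soc = metric.get("socname")
        # A metric without a socname applies to every SoC with that CPUID.
        if wanted_soc is None or wanted_soc == socname:
            names.append(metric["name"])
    return names


# Placeholder entries: two SoCs sharing one CPUID, plus a generic metric.
metrics = [
    {"name": "soc_a_ddr_bandwidth", "cpuid": "cpuid-x", "socname": "soc-a"},
    {"name": "soc_b_ddr_bandwidth", "cpuid": "cpuid-x", "socname": "soc-b"},
    {"name": "generic_metric", "cpuid": "cpuid-x"},
]

print(visible_metrics(metrics, "cpuid-x", "soc-a"))
```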

    Reviewed-by: Fugang Duan
    Signed-off-by: Joakim Zhang

    Joakim Zhang
     
  • Add support for checking socname for ARCH arm64.

    Reviewed-by: Fugang Duan
    Signed-off-by: Joakim Zhang

    Joakim Zhang
     
  • Add socname to struct pmu_event so that we can distinguish different SoCs
    by this property.

    Reviewed-by: Fugang Duan
    Signed-off-by: Joakim Zhang

    Joakim Zhang
     
  • [ Upstream commit eb573e746b9d4f0921dcb2449be3df41dae3caea ]

    Commit f01642e4912b ("perf metricgroup: Support multiple events for
    metricgroup") introduced support for multiple events in a metric group.
    But with the current upstream, metric event names are not printed
    properly.

    In power9 platform:

    command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
    1.000208486
    2.000368863
    2.001400558

    Similarly in skylake platform:

    command:./perf stat --metric-only -M Power -I 1000
    1.000579994
    2.002189493

    With the current upstream version, the issue is in the event name
    comparison logic in find_evsel_group(). That logic compares the events
    belonging to a metric group with the events in perf_evlist. Because the
    break statement is missing from the comparison loop, the loop keeps
    running even after a complete match and ends up discarding the matches.

    When only a single metric event belongs to the metric group, it works
    fine, because by the time all events have been compared the loop has
    reached the end of perf_evlist.

    Example for single metric event in power9 platform:

    command:# ./perf stat --metric-only -M branches_per_inst -I 1000 sleep 1
    1.000094653 0.2
    1.001337059 0.0

    This patch fixes the issue by breaking out of the loop in
    find_evsel_group() as soon as all events belonging to the metric have
    been matched.
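
    The loop behaviour can be sketched with a simplified model. find_group
    is a hypothetical stand-in, not the body of find_evsel_group(), and the
    event names are placeholders:

```python
def find_group(evlist, ids, stop_when_complete):
    """Match the ordered event names in ids against consecutive evlist entries.

    stop_when_complete models the added break: leave the loop as soon as
    every event of the metric has been matched.
    """
    matched = []
    for ev in evlist:
        if len(matched) < len(ids) and ev == ids[len(matched)]:
            matched.append(ev)
            if stop_when_complete and len(matched) == len(ids):
                break
        else:
            # Mismatch: discard what was collected and start over.
            matched = [ev] if ev == ids[0] else []
    return matched if len(matched) == len(ids) else None


evlist = ["ev_a", "ev_b", "ev_c", "ev_d"]

# Multi-event metric followed by other events: lost without the break.
print(find_group(evlist, ["ev_a", "ev_b"], stop_when_complete=False))
print(find_group(evlist, ["ev_a", "ev_b"], stop_when_complete=True))

# Single event at the end of the list: found either way.
print(find_group(evlist, ["ev_d"], stop_when_complete=False))
```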

    With this patch:
    In power9 platform:

    command:# ./perf stat --metric-only -M translation -C 0 -I 1000 sleep 2
    result:#
    time derat_4k_miss_rate_percent derat_4k_miss_ratio derat_miss_ratio derat_64k_miss_rate_percent derat_64k_miss_ratio dslb_miss_rate_percent islb_miss_rate_percent
    1.000135672 0.0 0.3 1.0 0.0 0.2 0.0 0.0
    2.000380617 0.0 0.0 0.0 0.0 0.0 0.0 0.0

    Similarly in skylake platform:

    command:# ./perf stat --metric-only -M Power -I 1000
    result:#
    time Turbo_Utilization C3_Core_Residency C6_Core_Residency C7_Core_Residency C2_Pkg_Residency C3_Pkg_Residency C6_Pkg_Residency C7_Pkg_Residency
    1.000563580 0.3 0.0 2.6 44.2 21.9 0.0 0.0 0.0
    2.002235027 0.4 0.0 2.7 43.0 20.7 0.0 0.0 0.0

    Committer testing:

    Before:

    [root@seventh ~]# perf stat --metric-only -M Power -I 1000
    # time
    1.000383223
    2.001168182
    3.001968545
    4.002741200
    5.003442022
    ^C 5.777687244

    [root@seventh ~]#

    After the patch:

    [root@seventh ~]# perf stat --metric-only -M Power -I 1000
    # time Turbo_Utilization C3_Core_Residency C6_Core_Residency C7_Core_Residency C2_Pkg_Residency C3_Pkg_Residency C6_Pkg_Residency C7_Pkg_Residency
    1.000406577 0.4 0.1 1.4 97.0 0.0 0.0 0.0 0.0
    2.001481572 0.3 0.0 0.6 97.9 0.0 0.0 0.0 0.0
    3.002332585 0.2 0.0 1.0 97.5 0.0 0.0 0.0 0.0
    4.003196624 0.2 0.0 0.3 98.6 0.0 0.0 0.0 0.0
    5.004063851 0.3 0.0 0.7 97.7 0.0 0.0 0.0 0.0
    ^C 5.471260276 0.2 0.0 0.5 49.3 0.0 0.0 0.0 0.0

    [root@seventh ~]#
    [root@seventh ~]# dmesg | grep -i skylake
    [ 0.187807] Performance Events: PEBS fmt3+, Skylake events, 32-deep LBR, full-width counters, Intel PMU driver.
    [root@seventh ~]#

    Fixes: f01642e4912b ("perf metricgroup: Support multiple events for metricgroup")
    Signed-off-by: Kajol Jain
    Reviewed-by: Ravi Bangoria
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Andi Kleen
    Cc: Anju T Sudhakar
    Cc: Jin Yao
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Madhavan Srinivasan
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20191120084059.24458-1-kjain@linux.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin

    Joakim cherry pick from upstream:
    3635b27cc058 perf metricgroup: Fix printing event names of metric group with multiple events

    Signed-off-by: Joakim Zhang

    Kajol Jain
     
  • Currently when cross compiling perf tool for ARM64 on my x86 machine I
    get this error:

    arch/arm64/util/sym-handling.c:9:10: fatal error: gelf.h: No such file or directory
    #include <gelf.h>

    For the build, libelf is reported off:

    Auto-detecting system features:
    ...
    ... libelf: [ OFF ]

    Indeed, test-libelf is not built successfully:

    more ./build/feature/test-libelf.make.output
    test-libelf.c:2:10: fatal error: libelf.h: No such file or directory
    #include <libelf.h>
    ^~~~~~~~~~
    compilation terminated.

    I have no such problems natively compiling on ARM64, and I did not
    previously have this issue for cross compiling. Fix by relocating the
    gelf.h include.

    Signed-off-by: John Garry
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lore.kernel.org/lkml/1573045254-39833-1-git-send-email-john.garry@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Joakim cherry pick from upstream:
    1302caaef52a perf tools: Fix cross compile for ARM64

    Signed-off-by: Joakim Zhang

    John Garry
     

11 Feb, 2020

7 commits

  • commit 5fcf3a55a62afb0760ccb6f391d62f20bce4a42f upstream.

    The filter name is fixed to "exit_reason" for some kvm_exit events, no
    matter what architecture we have. Actually, the filter name ("exit_reason")
    is only applicable to x86, meaning it's broken on other architectures
    including aarch64.

    This fixes the issue by providing various kvm_exit filter names, depending
    on the architecture we're on. Afterwards, the variable filter name is picked
    and applied through ioctl(fd, SET_FILTER).

    Reported-by: Andrew Jones
    Signed-off-by: Gavin Shan
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Gavin Shan
     
  • commit 8bec4f665e0baecb5f1b683379fc10b3745eb612 upstream.

    The reuseport tests currently suffer from a race condition: FIN
    packets count towards DROP_ERR_SKB_DATA, since they don't contain
    a valid struct cmd. Tests will spuriously fail depending on whether
    check_results is called before or after the FIN is processed.

    Exit the BPF program early if FIN is set.
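
    The early exit can be sketched in plain C as follows. This is not the
    selftest's BPF program: classify_segment(), the verdict enum and
    FLAG_FIN are hypothetical names, and the payload-length check merely
    stands in for "does not contain a valid struct cmd".

```c
#include <stddef.h>
#include <stdint.h>

/* FIN is the lowest bit of the TCP flags byte. */
#define FLAG_FIN 0x01u

/* Hypothetical verdicts for this sketch only. */
enum verdict {
    VERDICT_PASS,          /* let the segment through, nothing counted */
    VERDICT_DROP_ERR_DATA, /* payload too short to hold a command */
    VERDICT_PROCESS        /* payload can be parsed as a command */
};

static enum verdict classify_segment(uint8_t tcp_flags, size_t payload_len,
                                     size_t cmd_size)
{
    /* A FIN carries no command, so return before any data check runs. */
    if (tcp_flags & FLAG_FIN)
        return VERDICT_PASS;

    if (payload_len < cmd_size)
        return VERDICT_DROP_ERR_DATA;

    return VERDICT_PROCESS;
}
```

    Because the FIN check comes first, an empty FIN segment is never counted
    as a data error, regardless of when the results are inspected.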

    Fixes: 91134d849a0e ("bpf: Test BPF_PROG_TYPE_SK_REUSEPORT")
    Signed-off-by: Lorenz Bauer
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Jakub Sitnicki
    Acked-by: Martin KaFai Lau
    Acked-by: John Fastabend
    Link: https://lore.kernel.org/bpf/20200124112754.19664-3-lmb@cloudflare.com
    Signed-off-by: Greg Kroah-Hartman

    Lorenz Bauer
     
  • commit c31dbb1e41d1857b403f9bf58c87f5898519a0bc upstream.

    Use a proper temporary file for sendpage tests. This means that running
    the tests doesn't clutter the working directory, and allows running the
    test on read-only filesystems.
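
    A minimal sketch of the idea, assuming a writable /tmp. The helper name
    open_scratch_fd() and the file name template are made up for
    illustration and are not taken from the selftest.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/*
 * Create a uniquely named scratch file under /tmp and return an open
 * descriptor for it, or -1 on failure. The name is unlinked right away,
 * so nothing is left behind once the descriptor is closed.
 */
static int open_scratch_fd(void)
{
    char path[] = "/tmp/sendpage-sketch.XXXXXX"; /* hypothetical template */
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;

    unlink(path);
    return fd;
}
```

    The working directory is never touched, so it can be read-only as long
    as /tmp is writable.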

    Fixes: 16962b2404ac ("bpf: sockmap, add selftests")
    Signed-off-by: Lorenz Bauer
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Jakub Sitnicki
    Acked-by: Martin KaFai Lau
    Acked-by: John Fastabend
    Link: https://lore.kernel.org/bpf/20200124112754.19664-2-lmb@cloudflare.com
    Signed-off-by: Greg Kroah-Hartman

    Lorenz Bauer
     
  • commit f1c3656c6d9c147d07d16614455aceb34932bdeb upstream.

    As with commit 4e59afbbed96 ("selftests/bpf: skip nmi test when perf
    hw events are disabled"), it makes more sense to skip the
    test_stacktrace_build_id_nmi test if the setup (e.g. virtual machines) has
    disabled hardware perf events.
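
    The skip decision could look roughly like the sketch below. The helper
    name is hypothetical, and treating ENOENT from the perf event open call
    as "hardware events are unavailable" is an assumption of this sketch,
    not something stated in this message.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether to skip rather than fail, given the descriptor returned
 * by the perf event open call and the errno value saved right after it.
 * Assumption: ENOENT signals that the hardware event does not exist here.
 */
static bool should_skip_hw_perf_test(int perf_fd, int saved_errno)
{
    return perf_fd < 0 && saved_errno == ENOENT;
}
```

    Any other error still counts as a real failure, so genuine regressions
    are not hidden by the skip.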

    Fixes: 13790d1cc72c ("bpf: add selftest for stackmap with build_id in NMI context")
    Signed-off-by: Hangbin Liu
    Signed-off-by: Daniel Borkmann
    Acked-by: John Fastabend
    Link: https://lore.kernel.org/bpf/20200117100656.10359-1-liuhangbin@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Hangbin Liu
     
  • commit 580205dd4fe800b1e95be8b6df9e2991f975a8ad upstream.

    Fix two issues in test_attach_probe:

    1. it was not able to parse /proc/self/maps beyond the first line,
    since %s means parse string until white space.
    2. offset has to be accounted for, otherwise the uprobed address is incorrect.
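
    Both points can be illustrated with a small sketch. find_exec_base() is
    a hypothetical helper, not the selftest's code: it consumes the rest of
    each line with %*[^\n] so that parsing continues past the first line,
    and it subtracts the mapping's file offset from its start address.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Scan /proc/<pid>/maps style lines from f and return (start - offset) of
 * the first executable mapping, or -1 if none is found.
 */
static ssize_t find_exec_base(FILE *f)
{
    size_t start, end, offset;
    char perms[8];

    /* "%*[^\n]" discards the remainder of the line (dev, inode, path). */
    while (fscanf(f, "%zx-%zx %7s %zx %*[^\n]\n",
                  &start, &end, perms, &offset) == 4) {
        if (strcmp(perms, "r-xp") == 0)
            return (ssize_t)(start - offset);
    }
    return -1;
}
```

    Under these assumptions, a uprobe offset for a function would be its
    address minus the returned base.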

    Fixes: 1e8611bbdfc9 ("selftests/bpf: add kprobe/uprobe selftests")
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Acked-by: Yonghong Song
    Acked-by: Andrii Nakryiko
    Link: https://lore.kernel.org/bpf/20191219020442.1922617-1-ast@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Alexei Starovoitov
     
  • commit 7145fcfffef1fad4266aaf5ca96727696916edb7 upstream.

    When the following command is run on a fresh clone of the kernel tree,

    [root@f31 tc-testing]# ./tdc.py -c bpf

    test cases that need to build the eBPF sample program fail systematically,
    because 'buildebpfPlugin' is unable to install the kernel headers (i.e., the
    'khdr' target fails). Pass the correct environment to 'make', in place of
    ENVIR, to allow running these tests.

    Fixes: 4c2d39bd40c1 ("tc-testing: use a plugin to build eBPF program")
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • commit 35b9211c0a2427e8f39e534f442f43804fc8d5ca upstream.

    Fix a bug that requested an invalid size for the reallocated array when
    constructing the CO-RE relocation candidate list. This can cause problems
    if there are many potential candidates and very fine-grained memory
    allocator bucket sizes are used.
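
    One common way to get an array resize wrong is to pass an element count
    where a byte count is expected. The sketch below assumes that kind of
    mistake for illustration only; struct id_list and id_list_append() are
    hypothetical and are not libbpf's code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable list of 32-bit ids. */
struct id_list {
    uint32_t *ids;
    size_t len;
};

/* Append one id, growing the array by one element. Returns 0 or -1. */
static int id_list_append(struct id_list *list, uint32_t id)
{
    uint32_t *tmp;

    /* Guard against overflow in the byte-size computation below. */
    if (list->len + 1 > SIZE_MAX / sizeof(*list->ids))
        return -1;

    /* Request bytes, not elements: count times element size. */
    tmp = realloc(list->ids, (list->len + 1) * sizeof(*list->ids));
    if (!tmp)
        return -1;

    tmp[list->len] = id;
    list->ids = tmp;
    list->len++;
    return 0;
}
```

    An undersized request may go unnoticed when the allocator rounds sizes
    up generously, which fits the note above about fine-grained bucket
    sizes making the problem visible.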

    Fixes: ddc7c3042614 ("libbpf: implement BPF CO-RE offset relocation algorithm")
    Reported-by: William Smith
    Signed-off-by: Andrii Nakryiko
    Signed-off-by: Daniel Borkmann
    Acked-by: Yonghong Song
    Link: https://lore.kernel.org/bpf/20200124201847.212528-1-andriin@fb.com
    Signed-off-by: Greg Kroah-Hartman

    Andrii Nakryiko