16 Aug, 2020

2 commits

  • Pull arch/sh updates from Rich Felker:
    "Cleanup, SECCOMP_FILTER support, message printing fixes, and other
    changes to arch/sh"

    * tag 'sh-for-5.9' of git://git.libc.org/linux-sh: (34 commits)
    sh: landisk: Add missing initialization of sh_io_port_base
    sh: bring syscall_set_return_value in line with other architectures
    sh: Add SECCOMP_FILTER
    sh: Rearrange blocks in entry-common.S
    sh: switch to copy_thread_tls()
    sh: use the generic dma coherent remap allocator
    sh: don't allow non-coherent DMA for NOMMU
    dma-mapping: consolidate the NO_DMA definition in kernel/dma/Kconfig
    sh: unexport register_trapped_io and match_trapped_io_handler
    sh: don't include in
    sh: move the ioremap implementation out of line
    sh: move ioremap_fixed details out of
    sh: remove __KERNEL__ ifdefs from non-UAPI headers
    sh: sort the selects for SUPERH alphabetically
    sh: remove -Werror from Makefiles
    sh: Replace HTTP links with HTTPS ones
    arch/sh/configs: remove obsolete CONFIG_SOC_CAMERA*
    sh: stacktrace: Remove stacktrace_ops.stack()
    sh: machvec: Modernize printing of kernel messages
    sh: pci: Modernize printing of kernel messages
    ...

    Linus Torvalds
     
  • Pull more perf tools updates from Arnaldo Carvalho de Melo:
    "Fixes:
    - Fixes for 'perf bench numa'.

    - Always memset source before memcpy in 'perf bench mem'.

    - Quote CC and CXX for their arguments to fix build in environments
    using those variables to pass more than just the compiler names.

    - Fix module symbol processing, addressing regression detected via
    "perf test".

    - Allow multiple probes in record+script_probe_vfs_getname.sh 'perf
    test' entry.

    Improvements:
    - Add script to autogenerate socket family name id->string table from
    copy of kernel header, used so far in 'perf trace'.

    - 'perf ftrace' improvements to provide similar options for this
    utility so that one can go from 'perf record', 'perf trace', etc to
    'perf ftrace' just by changing the name of the subcommand.

    - Prefer new "sched:sched_waking" trace event when it exists in 'perf
    sched' post processing.

    - Update POWER9 metrics to utilize other metrics.

    - Fall back to querying debuginfod if debuginfo not found locally.

    Miscellaneous:
    - Sync various kvm headers with kernel sources"

    * tag 'perf-tools-2020-08-14' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (40 commits)
    perf ftrace: Make option description initials all capital letters
    perf build-ids: Fall back to debuginfod query if debuginfo not found
    perf bench numa: Remove dead code in parse_nodes_opt()
    perf stat: Update POWER9 metrics to utilize other metrics
    perf ftrace: Add change log
    perf: ftrace: Add set_tracing_options() to set all trace options
    perf ftrace: Add option --tid to filter by thread id
    perf ftrace: Add option -D/--delay to delay tracing
    perf: ftrace: Allow set graph depth by '--graph-opts'
    perf ftrace: Add support for trace option tracing_thresh
    perf ftrace: Add option 'verbose' to show more info for graph tracer
    perf ftrace: Add support for tracing option 'irq-info'
    perf ftrace: Add support for trace option funcgraph-irqs
    perf ftrace: Add support for trace option sleep-time
    perf ftrace: Add support for tracing option 'func_stack_trace'
    perf tools: Add general function to parse sublevel options
    perf ftrace: Add option '--inherit' to trace children processes
    perf ftrace: Show trace column header
    perf ftrace: Add option '-m/--buffer-size' to set per-cpu buffer size
    perf ftrace: Factor out function write_tracing_file_int()
    ...

    Linus Torvalds
     

15 Aug, 2020

3 commits

  • Since commit 61a47c1ad3a4dc ("sysctl: Remove the sysctl system call"),
    sys_sysctl is actually unavailable: any input can only return an error.

    We have been warning people about using the sysctl system call for years
    and believe there are no more users. Even if there are users of this
    interface, if they have not complained or fixed their code by now, they
    probably are not going to, so there is no point in warning them any
    longer.

    So completely remove sys_sysctl on all architectures.

    [nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu]
    Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.com

    Signed-off-by: Xiaoming Ni
    Signed-off-by: Andrew Morton
    Acked-by: Will Deacon [arm/arm64]
    Acked-by: "Eric W. Biederman"
    Cc: Aleksa Sarai
    Cc: Alexander Shishkin
    Cc: Al Viro
    Cc: Andi Kleen
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Arnaldo Carvalho de Melo
    Cc: Arnd Bergmann
    Cc: Benjamin Herrenschmidt
    Cc: Bin Meng
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Catalin Marinas
    Cc: chenzefeng
    Cc: Christian Borntraeger
    Cc: Christian Brauner
    Cc: Chris Zankel
    Cc: David Howells
    Cc: David S. Miller
    Cc: Diego Elio Pettenò
    Cc: Dmitry Vyukov
    Cc: Dominik Brodowski
    Cc: Fenghua Yu
    Cc: Geert Uytterhoeven
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Iurii Zaikin
    Cc: Ivan Kokshaysky
    Cc: James Bottomley
    Cc: Jens Axboe
    Cc: Jiri Olsa
    Cc: Kars de Jong
    Cc: Kees Cook
    Cc: Krzysztof Kozlowski
    Cc: Luis Chamberlain
    Cc: Marco Elver
    Cc: Mark Rutland
    Cc: Martin K. Petersen
    Cc: Masahiro Yamada
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Miklos Szeredi
    Cc: Minchan Kim
    Cc: Namhyung Kim
    Cc: Naveen N. Rao
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Cc: Olof Johansson
    Cc: Paul Burton
    Cc: "Paul E. McKenney"
    Cc: Paul Mackerras
    Cc: Peter Zijlstra (Intel)
    Cc: Randy Dunlap
    Cc: Ravi Bangoria
    Cc: Richard Henderson
    Cc: Rich Felker
    Cc: Russell King
    Cc: Sami Tolvanen
    Cc: Sargun Dhillon
    Cc: Stephen Rothwell
    Cc: Sudeep Holla
    Cc: Sven Schnelle
    Cc: Thiago Jung Bauermann
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vasily Gorbik
    Cc: Vlastimil Babka
    Cc: Yoshinori Sato
    Cc: Zhou Yanjie
    Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.com
    Signed-off-by: Linus Torvalds

    Xiaoming Ni
     
  • Make sure execve() returns the expected errno values for non-regular
    files.

    Signed-off-by: Kees Cook
    Signed-off-by: Andrew Morton
    Cc: Marc Zyngier
    Link: http://lkml.kernel.org/r/20200813231723.2725102-3-keescook@chromium.org
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Port sh to use the new SECCOMP_FILTER code.

    Signed-off-by: Michael Karcher
    Tested-by: John Paul Adrian Glaubitz
    Signed-off-by: Rich Felker

    Michael Karcher
     

14 Aug, 2020

23 commits

  • Also improve the -m description a bit to state that a B/K/M/G suffix is
    needed.

    Cc: Changbin Du
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • During a perf-record, use the -ldebuginfod API to query a debuginfod
    server, should the debug data not be found in the usual system
    locations. If successful, the usual $HOME/.debug dir is populated.

    Tested with:

    $ find .
    .
    ./ctags-debuginfo-5.8-26.fc31.x86_64.rpm
    ./usr
    ./usr/lib
    ./usr/lib/debug
    ./usr/lib/debug/.build-id
    ./usr/lib/debug/.build-id/ca
    ./usr/lib/debug/.build-id/ca/46f6ae6a0cee57d85abc1d461c49074248908d
    ./usr/lib/debug/.build-id/ca/46f6ae6a0cee57d85abc1d461c49074248908d.debug
    ./usr/lib/debug/usr
    ./usr/lib/debug/usr/bin
    ./usr/lib/debug/usr/bin/ctags-5.8-26.fc31.x86_64.debug

    $ debuginfod -F .
    ...

    $ rm -rf ~/.debug/ ; mkdir ~/.debug

    $ perf record make tags
    BUILD: Doing 'make -j8' parallel build
    GEN tags
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.107 MB perf.data (1483 samples) ]

    $ find ~/.debug | grep ctags
    /home/jolsa/.debug/usr/bin/ctags
    /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d
    /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/elf
    /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/probes

    $ rm -rf ~/.debug/ ; mkdir ~/.debug

    $ DEBUGINFOD_URLS=http://localhost:8002 perf record make tags
    BUILD: Doing 'make -j8' parallel build
    GEN tags
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.108 MB perf.data (1531 samples) ]

    $ find ~/.debug | grep ctag
    /home/jolsa/.debug/usr/bin/ctags
    /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d
    /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/debug
    /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/elf
    /home/jolsa/.debug/usr/bin/ctags/ca46f6ae6a0cee57d85abc1d461c49074248908d/probes

    Note the 'debug' file is created in the last run.

    Note that the debuginfo data are currently downloaded only on the record
    path; we still need to add this support to perf build-id/report, and test ;-)
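
    The fallback order described above can be sketched as follows. This is a
    minimal illustration, not the perf implementation: find_debuginfo(),
    cached_debug_path() and the two lookup callables are hypothetical names,
    and the remote lookup merely stands in for a debuginfod query. Storing
    the result in the cache whichever lookup succeeds is a simplification of
    this sketch.

```python
import tempfile
from pathlib import Path


def cached_debug_path(cache_dir, binary, build_id):
    """Cache location following the layout shown in the listing above."""
    return Path(cache_dir) / binary.lstrip("/") / build_id / "debug"


def find_debuginfo(cache_dir, binary, build_id, local_lookup, remote_lookup):
    """Try the local lookup first, then fall back to the remote query.

    Both lookups are hypothetical callables that take a build-id and
    return the debuginfo bytes, or None when nothing is found.
    """
    data = local_lookup(build_id)
    if data is None:
        data = remote_lookup(build_id)  # stands in for the debuginfod query
    if data is None:
        return None
    target = cached_debug_path(cache_dir, binary, build_id)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


build_id = "ca46f6ae6a0cee57d85abc1d461c49074248908d"
with tempfile.TemporaryDirectory() as cache:
    found = find_debuginfo(
        cache, "/usr/bin/ctags", build_id,
        local_lookup=lambda _id: None,              # nothing found locally
        remote_lookup=lambda _id: b"fake debuginfo",
    )
    relative = found.relative_to(cache)
    print(relative)
```

    Passing the lookups in as callables keeps the sketch runnable without a
    debuginfod server.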

    Tested-by: Jiri Olsa
    Signed-off-by: Jiri Olsa
    Signed-off-by: Frank Ch. Eigler
    Signed-off-by: Arnaldo Carvalho de Melo

    Frank Ch. Eigler
     
  • In the function parse_nodes_opt(), the statement "return 0;" is dead
    code; remove it.

    Signed-off-by: Peng Fan
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/1597401894-27549-1-git-send-email-fanpeng@loongson.cn
    Signed-off-by: Arnaldo Carvalho de Melo

    Peng Fan
     
  • These changes take advantage of the new capability added in merge commit
    00e4db51259a5f936fec1424b884f029479d3981 "Allow using computed metrics
    in calculating other metrics".

    The net is a simplification of the expressions for a handful of metrics,
    but no functional change.

    Signed-off-by: Paul Clarke
    Reviewed-by: Kajol Jain
    Acked-by: Ian Rogers
    Cc: Jiri Olsa
    Cc: Madhavan Srinivasan
    Link: http://lore.kernel.org/lkml/20200813222155.268183-1-pc@us.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Paul A. Clarke
     
  • Add a change log after previous enhancements.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-19-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • __cmd_ftrace() has become a bit long. This moves the trace option
    setting code to a separate function, set_tracing_options().

    Suggested-by: Namhyung Kim
    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-18-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This allows us to trace a single thread instead of the whole process.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-17-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '-D/--delay' to allow us to start tracing some time
    after the workload is launched.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-16-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This is to have a consistent view of all graph tracer options.
    The original option '--graph-depth' is marked as deprecated.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-15-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '--graph-opts thresh' to set up a trace duration
    threshold for the funcgraph tracer.

    $ sudo ./perf ftrace -G '*' --graph-opts thresh=100
    3) ! 184.060 us | } /* schedule */
    3) ! 185.600 us | } /* exit_to_usermode_loop */
    2) ! 225.989 us | } /* schedule_idle */
    2) # 4140.051 us | } /* do_idle */

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-14-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • Sometimes we want ftrace to display more detailed information about the
    trace.

    $ sudo perf ftrace -G '*'
    2) 0.979 us | mutex_unlock();
    2) 1.540 us | __fsnotify_parent();
    2) 0.433 us | fsnotify();

    $ sudo perf ftrace -G '*' --graph-opts verbose
    14160.770883 | 0) -47814 | .... | 1.289 us | mutex_unlock();
    14160.770886 | 0) -47814 | .... | 1.624 us | __fsnotify_parent();
    14160.770887 | 0) -47814 | .... | 0.636 us | fsnotify();
    14160.770888 | 0) -47814 | .... | 0.328 us | __sb_end_write();
    14160.770888 | 0) -47814 | d... | 0.430 us | fpregs_assert_state_consistent();
    14160.770889 | 0) -47814 | d... | | do_syscall_64() {
    14160.770889 | 0) -47814 | .... | | __x64_sys_close() {

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-13-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds support to display irq context info for function tracer. To do
    this, just specify a '--func-opts irq-info' option.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-12-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '--graph-opts noirqs' to filter out functions executed
    in irq context.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-11-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '--graph-opts nosleep-time' which allows us to
    measure only on-CPU time. This option applies to the function_graph
    tracer only.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-10-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds support to display call trace for function tracer. To do this,
    just specify a '--func-opts call-graph' option.

    Example:

    $ sudo perf ftrace -T vfs_read --func-opts call-graph
    iio-sensor-prox-855 [003] 6168.369657: vfs_read
    => vfs_read
    => ksys_read
    => __x64_sys_read
    => do_syscall_64
    => entry_SYSCALL_64_after_hwframe
    ...

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-9-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This factors out a general function perf_parse_sublevel_options() to
    parse sublevel options. A 'sublevel' option is something like the
    '--debug' option, which accepts further sublevel options.
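
    As a rough illustration of what parsing sublevel options involves, here
    is a minimal sketch. It is not the C implementation of
    perf_parse_sublevel_options(); the function name, the comma-separated
    'name' / 'name=value' syntax, and recording a bare name as 1 are
    assumptions made for this example. The option names are the
    '--graph-opts' sublevels mentioned elsewhere in this log.

```python
def parse_sublevel_options(text, known):
    """Parse comma-separated 'name' or 'name=value' items.

    `known` maps each accepted option name to its default value.
    A bare name is recorded as 1; unknown names raise ValueError.
    """
    result = dict(known)
    for item in filter(None, (part.strip() for part in text.split(","))):
        name, sep, value = item.partition("=")
        if name not in known:
            raise ValueError(f"unknown sublevel option: {name}")
        result[name] = int(value) if sep else 1
    return result


# Defaults for the sublevel options named in this log (all off).
graph_opts = {"nosleep-time": 0, "noirqs": 0, "verbose": 0, "thresh": 0}

parsed = parse_sublevel_options("verbose,thresh=100", graph_opts)
print(parsed)
```

    Keeping the accepted names in one table means a new sublevel option only
    needs a new table entry, which is the point of a shared parser.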

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-8-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '--inherit' to allow us to trace child
    processes spawned by our target.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-7-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This makes 'perf ftrace' display column header before printing trace.

    $ sudo perf ftrace
    # tracer: function
    #
    # entries-in-buffer/entries-written: 0/0 #P:8
    #
    # TASK-PID CPU# TIMESTAMP FUNCTION
    # | | | | |
    -9246 [006] 10726.262760: mutex_unlock
    -9246 [006] 10726.262764: __fsnotify_parent
    -9246 [006] 10726.262765: fsnotify
    -9246 [006] 10726.262766: __sb_end_write
    -9246 [006] 10726.262767: fpregs_assert_state_consistent

    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-6-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '-m/--buffer-size' to allow us to set the size of
    the per-cpu tracing buffer.

    Committer testing:

    Before running with this option:

    # find /sys/kernel/tracing/ -name buffer_size_kb | xargs cat
    1408
    1408
    1408
    1408
    1408
    1408
    1408
    1408
    1408
    #

    Then, run:

    # perf ftrace -m 2048K | head -10
    2) | mutex_unlock() {
    2) ==========> |
    2) | smp_irq_work_interrupt() {
    2) | irq_enter() {
    2) 0.121 us | rcu_irq_enter();
    2) 0.128 us | irqtime_account_irq();
    2) 0.719 us | }
    2) | __wake_up() {
    2) | __wake_up_common_lock() {
    2) 0.105 us | _raw_spin_lock_irqsave();
    #

    Now look at those tracefs knobs:

    # find /sys/kernel/tracing/ -name buffer_size_kb | xargs cat
    2048
    2048
    2048
    2048
    2048
    2048
    2048
    2048
    2048
    #

    This should be similar to the -m option in the other perf tools, such as
    'perf record', 'perf trace', etc.
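
    The conversion from a suffixed size such as '2048K' to the kilobyte value
    seen in buffer_size_kb can be sketched like this. It is an illustration
    only: buffer_size_kb() is a hypothetical helper, and accepting lowercase
    suffixes and rounding down to whole kilobytes are assumptions of the
    sketch, not documented perf behaviour.

```python
UNITS = {"B": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def buffer_size_kb(arg):
    """Convert a '<number><B|K|M|G>' string to whole kilobytes."""
    number, unit = arg[:-1], arg[-1:].upper()
    if unit not in UNITS or not number.isdigit():
        raise ValueError(f"expected <number><B|K|M|G>, got {arg!r}")
    # Integer division rounds sizes below 1 KB down to zero.
    return int(number) * UNITS[unit] // 1024


print(buffer_size_kb("2048K"))
print(buffer_size_kb("2M"))
```

    Both calls give 2048, the value shown in the tracefs knobs after the run
    above.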

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-5-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • We will reuse this function later.

    Signed-off-by: Changbin Du
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-4-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • This adds an option '-F/--funcs' to list all available functions to
    trace, which is read from tracing file 'available_filter_functions'.

    $ sudo ./perf ftrace -F | head
    trace_initcall_finish_cb
    initcall_blacklisted
    do_one_initcall
    do_one_initcall
    trace_initcall_start_cb
    run_init_process
    try_to_run_init_process
    match_dev_by_label
    match_dev_by_uuid
    rootfs_init_fs_context
    $

    Committer notes:

    This is the same command line option and for the same purpose as in
    'perf probe'.

    Signed-off-by: Changbin Du
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-3-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • The '-g/-G' options already imply that the function_graph tracer should
    be used instead of the function tracer, so we don't need the extra
    '--tracer' option in this case.

    This patch changes the behavior as below:

    - If '-g' or '-G' option is on, then function_graph tracer is used.
    - If '-T' or '-N' option is on, then function tracer is used.
    - The function_graph has priority over function tracer.
    - The option '--tracer' only takes effect if neither -g/-G nor -T/-N
    is specified.
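
    The precedence rules above can be sketched as a small function. This is a
    hypothetical helper written for illustration, not code from perf; the
    parameter names and the 'function_graph' default for '--tracer' are
    assumptions of the sketch.

```python
def select_tracer(graph_funcs=(), nograph_funcs=(), trace_funcs=(),
                  notrace_funcs=(), tracer_opt="function_graph"):
    """Return the tracer name implied by the filter options."""
    if graph_funcs or nograph_funcs:   # -G / -g: function_graph wins
        return "function_graph"
    if trace_funcs or notrace_funcs:   # -T / -N
        return "function"
    return tracer_opt                  # --tracer only applies here


print(select_tracer(graph_funcs=["vfs_read"]))
print(select_tracer(trace_funcs=["vfs_read"]))
print(select_tracer(graph_funcs=["*"], trace_funcs=["vfs_read"]))
print(select_tracer(tracer_opt="function"))
```

    Checking the graph filters first is what gives function_graph priority
    when both kinds of filter are present.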

    Here are some examples.

    This will start tracing all functions using default tracer:

    $ sudo perf ftrace

    This will trace all functions using function graph tracer:

    $ sudo perf ftrace -G '*'

    This will trace function vfs_read using function graph tracer:

    $ sudo perf ftrace -G vfs_read

    This will trace function vfs_read using function tracer:

    $ sudo perf ftrace -T vfs_read

    Committer notes:

    Using '-h -G' will tell what that option is about, so to further clarify
    the above examples:

    # perf ftrace -h -G

    -G, --graph-funcs Set graph filter on given functions

    # perf ftrace -h -g

    -g, --nograph-funcs Set nograph filter on given functions

    # perf ftrace -h -T

    -T, --trace-funcs trace given functions only

    # perf ftrace -h -N

    -N, --notrace-funcs do not trace given functions

    #

    Signed-off-by: Changbin Du
    Tested-by: Arnaldo Carvalho de Melo
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Cc: Steven Rostedt (VMware)
    Link: http://lore.kernel.org/lkml/20200808023141.14227-2-changbin.du@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Changbin Du
     
  • Pull networking fixes from David Miller:
    "Some merge window fallout, some longer term fixes:

    1) Handle headroom properly in lapbether and x25_asy drivers, from
    Xie He.

    2) Fetch MAC address from correct r8152 device node, from Thierry
    Reding.

    3) In the sw kTLS path we should allow MSG_CMSG_COMPAT in sendmsg,
    from Rouven Czerwinski.

    4) Correct fdputs in socket layer, from Miaohe Lin.

    5) Revert troublesome sockptr_t optimization, from Christoph Hellwig.

    6) Fix TCP TFO key reading on big endian, from Jason Baron.

    7) Missing CAP_NET_RAW check in nfc, from Qingyu Li.

    8) Fix inet fastreuse optimization with tproxy sockets, from Tim
    Froidcoeur.

    9) Fix 64-bit divide in new SFC driver, from Edward Cree.

    10) Add a tracepoint for prandom_u32 so that we can more easily
    perform usage analysis. From Eric Dumazet.

    11) Fix rwlock imbalance in AF_PACKET, from John Ogness"

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
    net: openvswitch: introduce common code for flushing flows
    af_packet: TPACKET_V3: fix fill status rwlock imbalance
    random32: add a tracepoint for prandom_u32()
    Revert "ipv4: tunnel: fix compilation on ARCH=um"
    net: accept an empty mask in /sys/class/net/*/queues/rx-*/rps_cpus
    net: ethernet: stmmac: Disable hardware multicast filter
    net: stmmac: dwmac1000: provide multicast filter fallback
    ipv4: tunnel: fix compilation on ARCH=um
    vsock: fix potential null pointer dereference in vsock_poll()
    sfc: fix ef100 design-param checking
    net: initialize fastreuse on inet_inherit_port
    net: refactor bind_bucket fastreuse into helper
    net: phy: marvell10g: fix null pointer dereference
    net: Fix potential memory leak in proto_register()
    net: qcom/emac: add missed clk_disable_unprepare in error path of emac_clks_phase1_init
    ionic_lif: Use devm_kcalloc() in ionic_qcq_alloc()
    net/nfc/rawsock.c: add CAP_NET_RAW check.
    hinic: fix strncpy output truncated compile warnings
    drivers/net/wan/x25_asy: Added needed_headroom and a skb->len check
    net/tls: Fix kmap usage
    ...

    Linus Torvalds
     

13 Aug, 2020

12 commits

  • It is currently assumed that each node contains at most nr_cpus/nr_nodes
    CPUs and nodes' CPU ranges do not overlap.

    That assumption is generally incorrect, as there are archs where a CPU
    number does not depend on its node number.

    This update removes the described assumption by simply calling
    numa_node_to_cpus() interface and using the returned mask for binding
    CPUs to nodes.

    Also, variable types and names are made consistent in functions using
    cpumask.
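
    To see why the contiguous-range assumption breaks, consider the
    following sketch. The topology is made up for illustration, and
    cpus_by_range() and cpus_by_mask() are hypothetical helpers; the second
    only mimics the idea of asking for the node's CPU mask, as
    numa_node_to_cpus() does.

```python
def cpus_by_range(node, nr_cpus, nr_nodes):
    """Old assumption: node N owns one contiguous slice of CPU numbers."""
    per_node = nr_cpus // nr_nodes
    return set(range(node * per_node, (node + 1) * per_node))


def cpus_by_mask(node, node_to_cpus):
    """New approach: use the CPU mask reported for the node."""
    return set(node_to_cpus[node])


# Hypothetical topology where CPU numbers interleave across two nodes.
topology = {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}

guessed = cpus_by_range(0, nr_cpus=8, nr_nodes=2)
actual = cpus_by_mask(0, topology)
print(sorted(guessed))
print(sorted(actual))
print(sorted(guessed - actual))  # CPUs wrongly attributed to node 0
```

    On this topology the range guess would bind node 0 to CPUs 1 and 3,
    which belong to node 1.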

    Signed-off-by: Alexander Gordeev
    Reviewed-by: Srikar Dronamraju
    Cc: Alexander Shishkin
    Cc: Balamuruhan S
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Naveen N. Rao
    Cc: Peter Zijlstra
    Cc: Satheesh Rajendran
    Link: http://lore.kernel.org/lkml/20200813113247.GA2014@oc3871087118.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexander Gordeev
     
  • Couple numa_allocate_cpumask() and numa_free_cpumask() functions

    Signed-off-by: Alexander Gordeev
    Reviewed-by: Srikar Dronamraju
    Cc: Alexander Shishkin
    Cc: Balamuruhan S
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Naveen N. Rao
    Cc: Peter Zijlstra
    Cc: Satheesh Rajendran
    Link: http://lore.kernel.org/lkml/20200813113041.GA1685@oc3871087118.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Alexander Gordeev
     
  • When using a cross-compilation environment, such as OpenEmbedded,
    the CC and CXX variables are set to something more than just a
    command: there are arguments (such as --sysroot) that need to be
    passed on to the compiler so that the right set of headers and
    libraries are used.

    For the particular case that our systems detected, CC is set to
    the following:

    export CC="aarch64-linaro-linux-gcc --sysroot=/oe/build/tmp/work/machine/perf/1.0-r9/recipe-sysroot"

    Without quotes, detection is as follows:

    Auto-detecting system features:
    ... dwarf: [ OFF ]
    ... dwarf_getlocations: [ OFF ]
    ... glibc: [ OFF ]
    ... gtk2: [ OFF ]
    ... libbfd: [ OFF ]
    ... libcap: [ OFF ]
    ... libelf: [ OFF ]
    ... libnuma: [ OFF ]
    ... numa_num_possible_cpus: [ OFF ]
    ... libperl: [ OFF ]
    ... libpython: [ OFF ]
    ... libcrypto: [ OFF ]
    ... libunwind: [ OFF ]
    ... libdw-dwarf-unwind: [ OFF ]
    ... zlib: [ OFF ]
    ... lzma: [ OFF ]
    ... get_cpuid: [ OFF ]
    ... bpf: [ OFF ]
    ... libaio: [ OFF ]
    ... libzstd: [ OFF ]
    ... disassembler-four-args: [ OFF ]

    Makefile.config:414: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
    Makefile.perf:230: recipe for target 'sub-make' failed
    make[1]: *** [sub-make] Error 2
    Makefile:69: recipe for target 'all' failed
    make: *** [all] Error 2

    With CC and CXX quoted, some of those features are now detected.
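
    The effect of the missing quotes can be reproduced with shell-style word
    splitting. This is only an illustration of the splitting problem, not
    the build system's code; the 'feature-check' target name and the
    shortened sysroot path are made up for the example.

```python
import shlex

# Shortened, hypothetical value in the style of the CC shown above.
cc = "aarch64-linaro-linux-gcc --sysroot=/oe/sysroot"

# Unquoted: the shell splits the value, so make sees only the compiler
# name in CC and treats --sysroot=... as a separate argument.
unquoted = shlex.split(f"make CC={cc} feature-check")

# Quoted: the whole value stays in one word and reaches make intact.
quoted = shlex.split(f'make CC="{cc}" feature-check')

print(unquoted)
print(quoted)
```

    In the unquoted case the --sysroot argument never reaches the compiler
    used for feature detection, which is consistent with every feature
    showing as OFF in the output above.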

    Fixes: e3232c2f39ac ("tools build feature: Use CC and CXX from parent")
    Signed-off-by: Daniel Díaz
    Reviewed-by: Thomas Hebb
    Cc: Alexei Starovoitov
    Cc: Andrii Nakryiko
    Cc: Daniel Borkmann
    Cc: Jiri Olsa
    Cc: John Fastabend
    Cc: KP Singh
    Cc: Martin KaFai Lau
    Cc: Namhyung Kim
    Cc: Song Liu
    Cc: Stephane Eranian
    Cc: Yonghong Song
    Link: http://lore.kernel.org/lkml/20200812221518.2869003-1-daniel.diaz@linaro.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Daniel Díaz
     
  • The 'dso->kernel' condition is true also for kernel modules now,
    and there are several places that were omitted by the initial change:

    - we need to identify modules separately in dso__process_kernel_symbol
    - we need to set 'dso->kernel' for module from buildid table
    - there's no need to use 'dso->kernel || kmodule' in one condition

    Committer testing:

    Before:

    # perf test -v object

    Objdump command is: objdump -z -d --start-address=0xffffffff813e682f --stop-address=0xffffffff813e68af /usr/lib/debug/lib/modules/5.7.14-200.fc32.x86_64/vmlinux
    Bytes read match those read by objdump
    Reading object code for memory address: 0xffffffffc02dc257
    File is: /lib/modules/5.7.14-200.fc32.x86_64/kernel/arch/x86/crypto/crc32c-intel.ko.xz
    On file address is: 0xffffffffc02dc2e7
    dso__data_read_offset failed
    test child finished with -1
    ---- end ----
    Object code reading: FAILED!
    #

    After:

    # perf test object
    26: Object code reading : Ok
    # perf test object
    26: Object code reading : Ok
    # perf test object
    26: Object code reading : Ok
    # perf test object
    26: Object code reading : Ok
    # perf test object
    26: Object code reading : Ok
    #

    Fixes: 02213cec64bb ("perf maps: Mark module DSOs with kernel type")
    Reported-by: Arnaldo Carvalho de Melo
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Adrian Hunter
    Cc: Namhyung Kim
    Signed-off-by: Jiri Olsa

    Jiri Olsa
     
  • Rename enum dso_kernel_type to enum dso_space_type, which seems like
    a better fit.

    Committer notes:

    This is used with 'struct dso'->kernel, which once was a boolean, so
    DSO_SPACE__USER is zero, !zero means some sort of kernel space, be it
    the host kernel space or a guest kernel space.
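
    The "zero means user, non-zero means some kernel space" convention can
    be modelled as below. This is a Python analogy of the C enum, not the
    perf source: the class and member names are shortened, and the values 1
    and 2 are assumptions; only the zero value for user space comes from the
    note above.

```python
from enum import IntEnum


class DsoSpace(IntEnum):
    USER = 0          # DSO_SPACE__USER is zero, per the note above
    KERNEL = 1        # host kernel space (value assumed)
    KERNEL_GUEST = 2  # guest kernel space (value assumed)


def is_kernel_space(space):
    """Old boolean-style test: anything non-zero is some kernel space."""
    return bool(space)


for space in DsoSpace:
    print(space.name, is_kernel_space(space))
```

    Because user space stays at zero, existing code that tests the field as
    a boolean keeps working after the rename.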

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Fix various typos and inconsistent capitalization of CPU in the libperf
    man pages.

    Signed-off-by: Rob Herring
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/20200807193241.3904545-1-robh@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Rob Herring
     
  • Sometimes when adding a kprobe by perf, it results in multiple probe
    points, such as the following:

    # ./perf probe -l
    probe:vfs_getname (on getname_flags:73@fs/namei.c with pathname)
    probe:vfs_getname_1 (on getname_flags:73@fs/namei.c with pathname)
    probe:vfs_getname_2 (on getname_flags:73@fs/namei.c with pathname)
    # cat /sys/kernel/debug/tracing/kprobe_events
    p:probe/vfs_getname _text+5501804 pathname=+0(+0(%gpr31)):string
    p:probe/vfs_getname_1 _text+5505388 pathname=+0(+0(%gpr31)):string
    p:probe/vfs_getname_2 _text+5508396 pathname=+0(+0(%gpr31)):string

    In this test, we need to record all of them and expect any of them in
    the perf-script output, since it's not clear which one will be used for
    the desired syscall:

    # perf stat -e probe:vfs_getname\* -- touch /tmp/nic

    Performance counter stats for 'touch /tmp/nic':

    31 probe:vfs_getname_2
    0 probe:vfs_getname_1
    1 probe:vfs_getname
    0.001421826 seconds time elapsed

    0.001506000 seconds user
    0.000000000 seconds sys

    If the test relies only on probe:vfs_getname, it might easily miss the
    relevant data.
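
    Accepting any of the probe variants in the output amounts to a pattern
    match like the one sketched here. The regular expression and the sample
    lines are made up for illustration; they are not taken from the test
    script itself.

```python
import re

# Matches probe:vfs_getname and numbered variants such as probe:vfs_getname_2.
PROBE_RE = re.compile(r"\bprobe:vfs_getname(?:_\d+)?\b")

# Hypothetical perf-script style lines, invented for this sketch.
script_output = [
    'touch 4242 [001] 100.000001: probe:vfs_getname_2: (0) pathname="/tmp/nic"',
    "touch 4242 [001] 100.000002: sched:sched_switch: prev_comm=touch",
]

matches = [line for line in script_output if PROBE_RE.search(line)]
print(len(matches))
```

    A check that looks only for the literal 'probe:vfs_getname:' would miss
    the first line, which is the failure mode described above.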

    Signed-off-by: Michael Petlan
    Cc: Jiri Olsa
    LPU-Reference: 20200722135845.29958-1-mpetlan@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Michael Petlan
     
  • For memcpy, the source pages are memset to zero only when --cycles is
    used. This leads to wildly different results with or without --cycles,
    since all source pages are likely to be mapped to the same zero page
    without explicit writes.

    Before this fix:

    $ export cmd="./perf stat -e LLC-loads -- ./perf bench \
    mem memcpy -s 1024MB -l 100 -f default"
    $ $cmd

    2,935,826 LLC-loads
    3.821677452 seconds time elapsed

    $ $cmd --cycles

    217,533,436 LLC-loads
    8.616725985 seconds time elapsed

    After this fix:

    $ $cmd

    214,459,686 LLC-loads
    8.674301124 seconds time elapsed

    $ $cmd --cycles

    214,758,651 LLC-loads
    8.644480006 seconds time elapsed

    Fixes: 47b5757bac03c338 ("perf bench mem: Move boilerplate memory allocation to the infrastructure")
    Signed-off-by: Vincent Whitchurch
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: kernel@axis.com
    Link: http://lore.kernel.org/lkml/20200810133404.30829-1-vincent.whitchurch@axis.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Vincent Whitchurch
     
  • Commit fbd705a0c618 ("sched: Introduce the 'trace_sched_waking'
    tracepoint") added the sched_waking tracepoint, which should be
    preferred over sched_wakeup when analyzing scheduling delays.

    Update 'perf sched record' to collect sched_waking events if the
    tracepoint exists and fall back to sched_wakeup if it does not.
    Similarly, update the timehist command to skip sched_wakeup events if
    the session includes sched_waking (i.e., sched_waking is preferred over
    sched_wakeup).

    Signed-off-by: David Ahern
    Acked-by: Namhyung Kim
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20200807164844.44870-1-dsahern@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    David Ahern
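
The preference rule in the commit above can be sketched as follows. This is only an illustration of the selection logic, not perf's implementation: the function names and the `(name, payload)` event representation are assumptions made for this example.

```python
WAKING = "sched:sched_waking"
WAKEUP = "sched:sched_wakeup"


def pick_wakeup_event(available):
    """Prefer sched_waking; fall back to sched_wakeup when it is absent."""
    return WAKING if WAKING in available else WAKEUP


def filter_for_timehist(events):
    """Drop sched_wakeup samples when the session also has sched_waking.

    `events` is a list of (name, payload) tuples, a hypothetical stand-in
    for the samples in a recorded session.
    """
    has_waking = any(name == WAKING for name, _ in events)
    if has_waking:
        return [e for e in events if e[0] != WAKEUP]
    return list(events)
```

Older sessions that contain only sched_wakeup pass through the filter unchanged, so both kinds of recording remain usable.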
     
  • Merge more updates from Andrew Morton:

    - most of the rest of MM (memcg, hugetlb, vmscan, proc, compaction,
    mempolicy, oom-kill, hugetlbfs, migration, thp, cma, util,
    memory-hotplug, cleanups, uaccess, migration, gup, pagemap),

    - various other subsystems (alpha, misc, sparse, bitmap, lib, bitops,
    checkpatch, autofs, minix, nilfs, ufs, fat, signals, kmod, coredump,
    exec, kdump, rapidio, panic, kcov, kgdb, ipc).

    * emailed patches from Andrew Morton: (164 commits)
    mm/gup: remove task_struct pointer for all gup code
    mm: clean up the last pieces of page fault accountings
    mm/xtensa: use general page fault accounting
    mm/x86: use general page fault accounting
    mm/sparc64: use general page fault accounting
    mm/sparc32: use general page fault accounting
    mm/sh: use general page fault accounting
    mm/s390: use general page fault accounting
    mm/riscv: use general page fault accounting
    mm/powerpc: use general page fault accounting
    mm/parisc: use general page fault accounting
    mm/openrisc: use general page fault accounting
    mm/nios2: use general page fault accounting
    mm/nds32: use general page fault accounting
    mm/mips: use general page fault accounting
    mm/microblaze: use general page fault accounting
    mm/m68k: use general page fault accounting
    mm/ia64: use general page fault accounting
    mm/hexagon: use general page fault accounting
    mm/csky: use general page fault accounting
    ...

    Linus Torvalds
     
  • Patch series "kmod/umh: a few fixes".

    Tiezhu Yang had sent out a patch set with a slew of kmod selftest fixes,
    and one patch which modified kmod to return 254 when a module was not
    found. This opened up Pandora's box about what that value was being used
    for, and lo and behold, it's because when UMH_WAIT_PROC is used we call
    kernel_wait4() but have never unwrapped the error code. The commit
    log for that fix details the rationale for the approach taken. I'd
    appreciate some review on that, in particular from nfs folks, as it seems
    such a case was never really hit before.

    This patch (of 5):

    Use the variable NAME instead of "\000" directly in kmod_test_0001().

    Signed-off-by: Tiezhu Yang
    Signed-off-by: Luis Chamberlain
    Signed-off-by: Andrew Morton
    Acked-by: Luis Chamberlain
    Cc: Greg Kroah-Hartman
    Cc: Al Viro
    Cc: Philipp Reisner
    Cc: Lars Ellenberg
    Cc: Jens Axboe
    Cc: J. Bruce Fields
    Cc: Chuck Lever
    Cc: Roopa Prabhu
    Cc: Nikolay Aleksandrov
    Cc: David S. Miller
    Cc: Jakub Kicinski
    Cc: David Howells
    Cc: Jarkko Sakkinen
    Cc: James Morris
    Cc: "Serge E. Hallyn"
    Cc: Christian Brauner
    Cc: Sergei Trofimovich
    Cc: Alexei Starovoitov
    Cc: Kees Cook
    Cc: Josh Triplett
    Cc: Sergey Kvachonok
    Cc: Tony Vroon
    Cc: Shuah Khan
    Cc: Christoph Hellwig
    Link: http://lkml.kernel.org/r/20200610154923.27510-1-mcgrof@kernel.org
    Link: http://lkml.kernel.org/r/20200610154923.27510-2-mcgrof@kernel.org
    Signed-off-by: Linus Torvalds

    Tiezhu Yang
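
The "unwrapped error code" issue mentioned in the series description above has a direct userspace analogue: the status returned by a wait call encodes the exit code and is not the exit code itself. The sketch below shows this with Python's `os` module on a Unix system. It is an analogy only, not the kernel code, and the helper names `run_child_and_wait` and `unwrap_status` are assumptions made for this example.

```python
import os


def run_child_and_wait(exit_code: int) -> int:
    """Fork a child that exits with exit_code; return the raw wait status."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit immediately with the given code
    _, status = os.waitpid(pid, 0)
    return status


def unwrap_status(status: int) -> int:
    """Extract the exit code from a raw wait status."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    raise ValueError("child did not exit normally")


if __name__ == "__main__":
    raw = run_child_and_wait(254)
    print("raw status:", raw, "exit code:", unwrap_status(raw))
```

Comparing the raw status against an expected exit code such as 254 fails, which is why a caller must unwrap it first.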
     
  • Add a migrate_vma_*() self test for mmap(MAP_SHARED) to verify that
    !vma_anonymous() ranges won't be migrated.

    Signed-off-by: Ralph Campbell
    Signed-off-by: Andrew Morton
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Christoph Hellwig
    Cc: Jason Gunthorpe
    Cc: "Bharata B Rao"
    Cc: Shuah Khan
    Link: http://lkml.kernel.org/r/20200710194840.7602-3-rcampbell@nvidia.com
    Link: http://lkml.kernel.org/r/20200709165711.26584-3-rcampbell@nvidia.com
    Signed-off-by: Linus Torvalds

    Ralph Campbell