Eric Lee / smarc-fsl-linux-kernel

28 Jan, 2017

1 commit

1b1bc42c1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) GTP fixes from Andreas Schultz (missing genl module alias, clear IP
DF on transmit).

2) Netfilter needs to reflect the fwmark when sending resets, from Pau
Espin Pedrol.

3) nftable dump OOPS fix from Liping Zhang.

4) Fix erroneous setting of VIRTIO_NET_HDR_F_DATA_VALID on transmit,
from Rolf Neugebauer.

5) Fix build error of ipt_CLUSTERIP when procfs is disabled, from Arnd
Bergmann.

6) Fix regression in handling of NETIF_F_SG in harmonize_features(),
from Eric Dumazet.

7) Fix RTNL deadlock wrt. lwtunnel module loading, from David Ahern.

8) tcp_fastopen_create_child() needs to setup tp->max_window, from
Alexey Kodanev.

9) Missing kmemdup() failure check in ipv6 segment routing code, from
Eric Dumazet.

10) Don't execute unix_bind() under the bindlock, otherwise we deadlock
with splice. From WANG Cong.

11) ip6_tnl_parse_tlv_enc_lim() potentially reallocates the skb buffer,
therefore callers must reload cached header pointers into that skb.
Fix from Eric Dumazet.

12) Fix various bugs in legacy IRQ fallback handling in alx driver, from
Tobias Regnery.

13) Do not allow lwtunnel drivers to be unloaded while they are
referenced by active instances, from Robert Shearman.

14) Fix truncated PHY LED trigger names, from Geert Uytterhoeven.

15) Fix a few regressions from virtio_net XDP support, from John
Fastabend and Jakub Kicinski.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (102 commits)
ISDN: eicon: silence misleading array-bounds warning
net: phy: micrel: add support for KSZ8795
gtp: fix cross netns recv on gtp socket
gtp: clear DF bit on GTP packet tx
gtp: add genl family modules alias
tcp: don't annotate mark on control socket from tcp_v6_send_response()
ravb: unmap descriptors when freeing rings
virtio_net: reject XDP programs using header adjustment
virtio_net: use dev_kfree_skb for small buffer XDP receive
r8152: check rx after napi is enabled
r8152: re-schedule napi for tx
r8152: avoid start_xmit to schedule napi when napi is disabled
r8152: avoid start_xmit to call napi_schedule during autosuspend
net: dsa: Bring back device detaching in dsa_slave_suspend()
net: phy: leds: Fix truncated LED trigger names
net: phy: leds: Break dependency of phy.h on phy_led_triggers.h
net: phy: leds: Clear phy_num_led_triggers on failure to avoid crash
net-next: ethernet: mediatek: change the compatible string
Documentation: devicetree: change the mediatek ethernet compatible string
bnxt_en: Fix RTNL lock usage on bnxt_get_port_module_status().
...

Linus Torvalds
2017-01-28 04:54:16 +0800

23 Jan, 2017

1 commit

585457fc8 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost ... Browse Code »

Pull virtio/vhost fixes from Michael Tsirkin:
"Random fixes and cleanups that accumulated over the time"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio/s390: virtio: constify virtio_config_ops structures
virtio/s390: add missing \n to end of dev_err message
virtio/s390: support READ_STATUS command for virtio-ccw
tools/virtio/ringtest: tweaks for s390
tools/virtio/ringtest: fix run-on-all.sh for offline cpus
virtio_console: fix a crash in config_work_handler
vhost/scsi: silence uninitialized variable warning
vhost: scsi: constify target_core_fabric_ops structures

Linus Torvalds
2017-01-23 04:40:09 +0800

22 Jan, 2017

1 commit

83fd57a74 Merge tag 'powerpc-4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux ... Browse Code »

Pull powerpc fixes from Michael Ellerman:
"Two fixes for fallout from the hugetlb changes we merged this cycle.

Ten other fixes, four only affect Power9, and the rest are a bit of a
mixture though nothing terrible.

Thanks to: Aneesh Kumar K.V, Anton Blanchard, Benjamin Herrenschmidt,
Dave Martin, Gavin Shan, Madhavan Srinivasan, Nicholas Piggin, Reza
Arbab"

* tag 'powerpc-4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Ignore reserved field in DCSR and PVR reads and writes
powerpc/ptrace: Preserve previous TM fprs/vsrs on short regset write
powerpc/ptrace: Preserve previous fprs/vsrs on short regset write
powerpc/perf: Use MSR to report privilege level on P9 DD1
selftest/powerpc: Wrong PMC initialized in pmc56_overflow test
powerpc/eeh: Enable IO path on permanent error
powerpc/perf: Fix PM_BRU_CMPL event code for power9
powerpc/mm: Fix little-endian 4K hugetlb
powerpc/mm/hugetlb: Don't panic when we don't find the default huge page size
powerpc: Fix pgtable pmd cache init
powerpc/icp-opal: Fix missing KVM case and harden replay
powerpc/mm: Fix memory hotplug BUG() on radix

Linus Torvalds
2017-01-22 09:58:45 +0800

20 Jan, 2017

2 commits

47a4c49af tools/virtio/ringtest: tweaks for s390 ... Browse Code »

Make ringtest work on s390 too.

Signed-off-by: Halil Pasic
Acked-by: Sascha Silbe
Signed-off-by: Cornelia Huck

Halil Pasic
2017-01-20 05:46:32 +0800
21f5eda9b tools/virtio/ringtest: fix run-on-all.sh for offline cpus ... Browse Code »

Since ef1b144d ("tools/virtio/ringtest: fix run-on-all.sh to work
without /dev/cpu") run-on-all.sh uses seq 0 $HOST_AFFINITY as the list
of ids of the CPUs to run the command on (assuming ids of online CPUs
are consecutive and start from 0), where $HOST_AFFINITY is the highest
CPU id in the system previously determined using lscpu. This can fail
on systems with offline CPUs.

Instead let's use lscpu to determine the list of online CPUs.

Signed-off-by: Halil Pasic
Fixes: ef1b144d ("tools/virtio/ringtest: fix run-on-all.sh to work without
/dev/cpu")
Reviewed-by: Sascha Silbe
Signed-off-by: Cornelia Huck

Halil Pasic
2017-01-20 05:46:31 +0800

18 Jan, 2017

3 commits

df21d2fa7 selftest/powerpc: Wrong PMC initialized in pmc56_overflow test ... Browse Code »

Test uses PMC2 to count the event. But PMC1 is being initialized.
Patch to fix it.

Fixes: 3752e453f6ba ('selftests/powerpc: Add tests of PMU EBBs')
Signed-off-by: Madhavan Srinivasan
Signed-off-by: Michael Ellerman

Madhavan Srinivasan
2017-01-18 13:03:34 +0800
3fbfadce6 bpf: Fix test_lru_sanity5() in test_lru_map.c ... Browse Code »

test_lru_sanity5() fails when the number of online cpus
is fewer than the number of possible cpus. It can be
reproduced with qemu by using cmd args "--smp cpus=2,maxcpus=8".

The problem is the loop in test_lru_sanity5() is testing
'i' which is incorrect.

This patch:
1. Make sched_next_online() always return -1 if it cannot
find a next cpu to schedule the process.
2. In test_lru_sanity5(), the parent process does
sched_setaffinity() first (through sched_next_online())
and the forked process will inherit it according to
the 'man sched_setaffinity'.

Fixes: 5db58faf989f ("bpf: Add tests for the LRU bpf_htab")
Reported-by: Daniel Borkmann
Signed-off-by: Martin KaFai Lau
Acked-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Martin KaFai Lau
2017-01-18 04:39:39 +0800
31f5260a7 Merge tag 'perf-urgent-for-mingo-4.10-20170117' of git://git.kernel.org/pub/scm/… ... Browse Code »

…linux/kernel/git/acme/linux into perf/urgent

Pull 'perf probe' fixes from Arnaldo Carvalho de Melo <acme@redhat.com>

- Show correct locations for 'perf probe' on modules (Masami Hiramatsu)

- Correctly handle 'perf probe's on GCC generated functions in modules (Masami Hiramatsu)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2017-01-18 00:03:16 +0800

17 Jan, 2017

3 commits

613f050d6 perf probe: Fix to probe on gcc generated functions in modules ... Browse Code »

Fix to probe on gcc generated functions on modules. Since
probing on a module is based on its symbol name, it should
be adjusted on actual symbols.

E.g. without this fix, perf probe shows probe definition
on non-exist symbol as below.

$ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -F in_range*
in_range.isra.12
$ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
p:probe/in_range nf_nat:in_range+0

With this fix, perf probe correctly shows a probe on
gcc-generated symbol.

$ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range
p:probe/in_range nf_nat:in_range.isra.12+0

This also fixes same problem on online module as below.

$ perf probe -m i915 -D assert_plane
p:probe/assert_plane i915:assert_plane.constprop.134+0

Signed-off-by: Masami Hiramatsu
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/148411450673.9978.14905987549651656075.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo

Masami Hiramatsu
2017-01-17 02:43:04 +0800
3e96dac7c perf probe: Add error checks to offline probe post-processing ... Browse Code »

Add error check codes on post processing and improve it for offline
probe events as:

- post processing fails if no matched symbol found in map(-ENOENT)
or strdup() failed(-ENOMEM).

- Even if the symbol name is the same, it updates symbol address
and offset.

Signed-off-by: Masami Hiramatsu
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/148411443738.9978.4617979132625405545.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo

Masami Hiramatsu
2017-01-17 02:35:25 +0800
d2d4edbeb perf probe: Fix to show correct locations for events on modules ... Browse Code »

Fix to show correct locations for events on modules by relocating given
address instead of retrying after failure.

This happens when the module text size is big enough, bigger than
sh_addr, because the original code retries with given address + sh_addr
if it failed to find CU DIE at the given address.

Any address smaller than sh_addr always fails and it retries with the
correct address, but addresses bigger than sh_addr will get a CU DIE
which is on the given address (not adjusted by sh_addr).

In my environment(x86-64), the sh_addr of ".text" section is 0x10030.
Since i915 is a huge kernel module, we can see this issue as below.

$ grep "[Tt] .*\[i915\]" /proc/kallsyms | sort | head -n1
ffffffffc0270000 t i915_switcheroo_can_switch [i915]

ffffffffc0270000 + 0x10030 = ffffffffc0280030, so we'll check
symbols cross this boundary.

$ grep "[Tt] .*\[i915\]" /proc/kallsyms | grep -B1 ^ffffffffc028\
| head -n 2
ffffffffc027ff80 t haswell_init_clock_gating [i915]
ffffffffc0280110 t valleyview_init_clock_gating [i915]

So setup probes on both function and see what happen.

$ sudo ./perf probe -m i915 -a haswell_init_clock_gating \
-a valleyview_init_clock_gating
Added new events:
probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

You can now use it in all perf tools, such as:

perf record -e probe:valleyview_init_clock_gating -aR sleep 1

$ sudo ./perf probe -l
probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
probe:valleyview_init_clock_gating (on i915_vga_set_decode:4@gpu/drm/i915/i915_drv.c in i915)

As you can see, haswell_init_clock_gating is correctly shown,
but valleyview_init_clock_gating is not.

With this patch, both events are shown correctly.

$ sudo ./perf probe -l
probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)

Committer notes:

In my case:

# perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
Added new events:
probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

You can now use it in all perf tools, such as:

perf record -e probe:valleyview_init_clock_gating -aR sleep 1

# perf probe -l
probe:haswell_init_clock_gating (on i915_getparam+432@gpu/drm/i915/i915_drv.c in i915)
probe:valleyview_init_clock_gating (on __i915_printk+240@gpu/drm/i915/i915_drv.c in i915)
#

# readelf -SW /lib/modules/4.9.0+/build/vmlinux | egrep -w '.text|Name'
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 1] .text PROGBITS ffffffff81000000 200000 822fd3 00 AX 0 0 4096
#

So both are b0rked, now with the fix:

# perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating
Added new events:
probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915)

You can now use it in all perf tools, such as:

perf record -e probe:valleyview_init_clock_gating -aR sleep 1

# perf probe -l
probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915)
#

Both looks correct.

Signed-off-by: Masami Hiramatsu
Tested-by: Arnaldo Carvalho de Melo
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/148411436777.9978.1440275861947194930.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo

Masami Hiramatsu
2017-01-17 02:14:06 +0800

16 Jan, 2017

1 commit

79078c53b Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf fixes from Ingo Molnar:
"Misc race fixes uncovered by fuzzing efforts, a Sparse fix, two PMU
driver fixes, plus miscellanous tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Reject non sampling events with precise_ip
perf/x86/intel: Account interrupts for PEBS errors
perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race
perf/core: Fix sys_perf_event_open() vs. hotplug
perf/x86/intel: Use ULL constant to prevent undefined shift behaviour
perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code
perf/x86: Set pmu->module in Intel PMU modules
perf probe: Fix to probe on gcc generated symbols for offline kernel
perf probe: Fix --funcs to show correct symbols for offline module
perf symbols: Robustify reading of build-id from sysfs
perf tools: Install tools/lib/traceevent plugins with install-bin
tools lib traceevent: Fix prev/next_prio for deadline tasks
perf record: Fix --switch-output documentation and comment
perf record: Make __record_options static
tools lib subcmd: Add OPT_STRING_OPTARG_SET option
perf probe: Fix to get correct modname from elf header
samples/bpf trace_output_user: Remove duplicate sys/ioctl.h include
samples/bpf sock_example: Avoid getting ethhdr from two includes
perf sched timehist: Show total scheduling time

Linus Torvalds
2017-01-16 03:37:43 +0800

12 Jan, 2017

1 commit

ba836a6f5 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge fixes from Andrew Morton:
"27 fixes.

There are three patches that aren't actually fixes. They're simple
function renamings which are nice-to-have in mainline as ongoing net
development depends on them."

* akpm: (27 commits)
timerfd: export defines to userspace
mm/hugetlb.c: fix reservation race when freeing surplus pages
mm/slab.c: fix SLAB freelist randomization duplicate entries
zram: support BDI_CAP_STABLE_WRITES
zram: revalidate disk under init_lock
mm: support anonymous stable page
mm: add documentation for page fragment APIs
mm: rename __page_frag functions to __page_frag_cache, drop order from drain
mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free
mm, memcg: fix the active list aging for lowmem requests when memcg is enabled
mm: don't dereference struct page fields of invalid pages
mailmap: add codeaurora.org names for nameless email commits
signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
mm: pmd dirty emulation in page fault handler
ipc/sem.c: fix incorrect sem_lock pairing
lib/Kconfig.debug: fix frv build failure
mm: get rid of __GFP_OTHER_NODE
mm: fix remote numa hits statistics
mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}
ocfs2: fix crash caused by stale lvb with fsdlm plugin
...

Linus Torvalds
2017-01-12 03:15:15 +0800

11 Jan, 2017

1 commit

41b6167e8 mm: get rid of __GFP_OTHER_NODE ... Browse Code »

The flag was introduced by commit 78afd5612deb ("mm: add
__GFP_OTHER_NODE flag") to allow proper accounting of remote node
allocations done by kernel daemons on behalf of a process - e.g.
khugepaged.

After "mm: fix remote numa hits statistics" we do not need and actually
use the flag so we can safely remove it because all allocations which
are satisfied from their "home" node are accounted properly.

[mhocko@suse.com: fix build]
Link: http://lkml.kernel.org/r/20170106122225.GK5556@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20170102153057.9451-3-mhocko@kernel.org
Signed-off-by: Michal Hocko
Acked-by: Mel Gorman
Acked-by: Vlastimil Babka
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Joonsoo Kim
Cc: Taku Izumi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2017-01-11 10:31:55 +0800

06 Jan, 2017

4 commits

7738789fb selftests: x86/pkeys: fix spelling mistake: "itertation" -> "iteration" ... Browse Code »

Fix spelling mistake in print test pass message.

Signed-off-by: Colin Ian King
Signed-off-by: Shuah Khan

Colin King
2017-01-06 04:24:18 +0800
3659f98b5 selftests: do not require bash to run netsocktests testcase ... Browse Code »

Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.

Signed-off-by: Rolf Eike Beer
Cc: stable@vger.kernel.org
Signed-off-by: Shuah Khan

Rolf Eike Beer
2017-01-06 04:19:53 +0800
d979e13a3 selftests: do not require bash to run bpf tests ... Browse Code »

Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.

Signed-off-by: Rolf Eike Beer
Cc: stable@vger.kernel.org
Acked-by: Daniel Borkmann
Signed-off-by: Shuah Khan

Rolf Eike Beer
2017-01-06 04:19:47 +0800
a2b1e8a20 selftests: do not require bash for the generated test ... Browse Code »

Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.

Signed-off-by: Rolf Eike Beer
Cc: stable@vger.kernel.org
Signed-off-by: Shuah Khan

Rolf Eike Beer
2017-01-06 04:18:32 +0800

05 Jan, 2017

1 commit

4e06d4f08 Merge tag 'perf-urgent-for-mingo-4.10-20170104' of git://git.kernel.org/pub/scm/… ... Browse Code »

…linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes and one improvement from Arnaldo Carvalho de Melo:

Fixes:

- Fix prev/next_prio formatting for deadline tasks in libtraceevent (Daniel Bristot de Oliveira)

- Robustify reading of build-ids from /sys/kernel/note (Arnaldo Carvalho de Melo)

- Fix building some sample/bpf in Alpine Linux 3.4 (Arnaldo Carvalho de Melo)

- Fix 'make install-bin' to install libtraceevent plugins (Arnaldo Carvalho de Melo)

- Fix 'perf record --switch-output' documentation and comment (Jiri Olsa)

- Fix 'perf probe' for cross arch probing (Masami Hiramatsu)

Improvement:

- Show total scheduling time in 'perf sched timehist' (Namhyumg Kim)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Ingo Molnar
2017-01-05 15:33:02 +0800

04 Jan, 2017

5 commits

8a937a25a perf probe: Fix to probe on gcc generated symbols for offline kernel ... Browse Code »

Fix perf-probe to show probe definition on gcc generated symbols for
offline kernel (including cross-arch kernel image).

gcc sometimes optimizes functions and generate new symbols with suffixes
such as ".constprop.N" or ".isra.N" etc. Since those symbol names are
not recorded in DWARF, we have to find correct generated symbols from
offline ELF binary to probe on it (kallsyms doesn't correct it). For
online kernel or uprobes we don't need it because those are rebased on
_text, or a section relative address.

E.g. Without this:

$ perf probe -k build-arm/vmlinux -F __slab_alloc*
__slab_alloc.constprop.9
$ perf probe -k build-arm/vmlinux -D __slab_alloc
p:probe/__slab_alloc __slab_alloc+0

If you put above definition on target machine, it should fail
because there is no __slab_alloc in kallsyms.

With this fix, perf probe shows correct probe definition on
__slab_alloc.constprop.9:

$ perf probe -k build-arm/vmlinux -D __slab_alloc
p:probe/__slab_alloc __slab_alloc.constprop.9+0

Signed-off-by: Masami Hiramatsu
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/148350060434.19001.11864836288580083501.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo

Masami Hiramatsu
2017-01-04 22:44:22 +0800
eebc509b2 perf probe: Fix --funcs to show correct symbols for offline module ... Browse Code »

Fix --funcs (-F) option to show correct symbols for offline module.
Since previous perf-probe uses machine__findnew_module_map() for offline
module, even if user passes a module file (with full path) which is for
other architecture, perf-probe always tries to load symbol map for
current kernel module.

This fix uses dso__new_map() to load the map from given binary as same
as a map for user applications.

Signed-off-by: Masami Hiramatsu
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/148350053478.19001.15435255244512631545.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo

Masami Hiramatsu
2017-01-04 22:15:09 +0800
7934c98a6 perf symbols: Robustify reading of build-id from sysfs ... Browse Code »

Markus reported that perf segfaults when reading /sys/kernel/notes from
a kernel linked with GNU gold, due to what looks like a gold bug, so do
some bounds checking to avoid crashing in that case.

Reported-by: Markus Trippelsdorf
Report-Link: http://lkml.kernel.org/r/20161219161821.GA294@x4
Cc: Adrian Hunter
Cc: David Ahern
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Wang Nan
Link: http://lkml.kernel.org/n/tip-ryhgs6a6jxvz207j2636w31c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2017-01-04 03:11:13 +0800
30a9c6444 perf tools: Install tools/lib/traceevent plugins with install-bin ... Browse Code »

Those are binaries as well, so should be installed by:

make -C tools/perf install-bin'

too.

Cc: Alexander Shishkin
Cc: Daniel Bristot de Oliveira
Cc: Jiri Olsa
Cc: Peter Zijlstra
Cc: Steven Rostedt
Link: http://lkml.kernel.org/n/tip-3841b37u05evxrs1igkyu6ks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Arnaldo Carvalho de Melo
2017-01-04 03:11:13 +0800
074859184 tools lib traceevent: Fix prev/next_prio for deadline tasks ... Browse Code »

Currently, the sched:sched_switch tracepoint reports deadline tasks with
priority -1. But when reading the trace via perf script I've got the
following output:

# ./d & # (d is a deadline task, see [1])
# perf record -e sched:sched_switch -a sleep 1
# perf script
...
swapper 0 [000] 2146.962441: sched:sched_switch: swapper/0:0 [120] R ==> d:2593 [4294967295]
d 2593 [000] 2146.972472: sched:sched_switch: d:2593 [4294967295] R ==> g:2590 [4294967295]

The task d reports the wrong priority [4294967295]. This happens because
the "int prio" is stored in an unsigned long long val. Although it is
set as a %lld, as int is shorter than unsigned long long,
trace_seq_printf prints it as a positive number.

The fix is just to cast the val as an int, and print it as a %d,
as in the sched:sched_switch tracepoint's "format".

The output with the fix is:

# ./d &
# perf record -e sched:sched_switch -a sleep 1
# perf script
...
swapper 0 [000] 4306.374037: sched:sched_switch: swapper/0:0 [120] R ==> d:10941 [-1]
d 10941 [000] 4306.383823: sched:sched_switch: d:10941 [-1] R ==> swapper/0:0 [120]

[1] d.c

---
#include
#include
#include
#include
#include

struct sched_attr {
__u32 size, sched_policy;
__u64 sched_flags;
__s32 sched_nice;
__u32 sched_priority;
__u64 sched_runtime, sched_deadline, sched_period;
};

int sched_setattr(pid_t pid, const struct sched_attr *attr, unsigned int flags)
{
return syscall(__NR_sched_setattr, pid, attr, flags);
}

int main(void)
{
struct sched_attr attr = {
.size = sizeof(attr),
.sched_policy = SCHED_DEADLINE, /* This creates a 10ms/30ms reservation */
.sched_runtime = 10 * 1000 * 1000,
.sched_period = attr.sched_deadline = 30 * 1000 * 1000,
};

if (sched_setattr(0, &attr, 0) < 0) {
perror("sched_setattr");
return -1;
}

for(;;);
}
---

Committer notes:

Got the program from the provided URL, http://bristot.me/lkml/d.c,
trimmed it and included in the cset log above, so that we have
everything needed to test it in one place.

Signed-off-by: Daniel Bristot de Oliveira
Acked-by: Steven Rostedt
Tested-by: Arnaldo Carvalho de Melo
Cc: Alexander Shishkin
Cc: Daniel Bristot de Oliveira
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/866ef75bcebf670ae91c6a96daa63597ba981f0d.1483443552.git.bristot@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo

Daniel Bristot de Oliveira
2017-01-04 03:11:12 +0800

03 Jan, 2017

4 commits

60437ac02 perf record: Fix --switch-output documentation and comment ... Browse Code »

There's no --signal-trigger option, also adding the code comment into
record man page.

Signed-off-by: Jiri Olsa
Tested-by: Wang Nan
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1483431600-19887-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2017-01-03 22:11:38 +0800
efd213071 perf record: Make __record_options static ... Browse Code »

There's no need for this one to be global.

Signed-off-by: Jiri Olsa
Tested-by: Wang Nan
Cc: David Ahern
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1483431600-19887-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2017-01-03 22:11:10 +0800
b66fb1da5 tools lib subcmd: Add OPT_STRING_OPTARG_SET option ... Browse Code »

To allow string options with a default argument and variable set when
the option is used.

Signed-off-by: Jiri Olsa
Tested-by: Wang Nan
Cc: David Ahern
Cc: Josh Poimboeuf
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/1483431600-19887-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Jiri Olsa
2017-01-03 22:10:38 +0800
1f2ed153b perf probe: Fix to get correct modname from elf header ... Browse Code »

Since 'perf probe' supports cross-arch probes, it is possible to analyze
different arch kernel image which has different bits-per-long.

In that case, it fails to get the module name because it uses the
MOD_NAME_OFFSET macro based on the host machine bits-per-long, instead
of the target arch bits-per-long.

This fixes above issue by changing modname-offset based on the target
archs bit width. This is ok because linux kernel uses LP64 model on
64bit arch.

E.g. without this (on x86_64, and target module is arm32):

$ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
p:probe/configfs_lookup :configfs_lookup+0
^-Here is an empty module name.

With this fix, you can see correct module name:

$ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup
p:probe/configfs_lookup configfs:configfs_lookup+0

Signed-off-by: Masami Hiramatsu
Cc: Jiri Olsa
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/148337043836.6752.383495516397005695.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo

Masami Hiramatsu
2017-01-03 01:09:17 +0800

28 Dec, 2016

1 commit

9396c9cb0 perf sched timehist: Show total scheduling time ... Browse Code »

Show length of analyzed sample time and rate of idle task running.
This also takes care of time range given by --time option.

$ perf sched timehist -sI | tail
Samples do not have callchains.
Idle stats:
CPU 0 idle for 930.316 msec ( 92.93%)
CPU 1 idle for 963.614 msec ( 96.25%)
CPU 2 idle for 885.482 msec ( 88.45%)
CPU 3 idle for 938.635 msec ( 93.76%)

Total number of unique tasks: 118
Total number of context switches: 2337
Total run time (msec): 3718.048
Total scheduling time (msec): 1001.131 (x 4)

Suggested-by: David Ahern
Signed-off-by: Namhyung Kim
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20161222060350.17655-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2016-12-28 08:47:57 +0800

26 Dec, 2016

1 commit

10bbe7599 Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux ... Browse Code »

Pull turbostat updates from Len Brown.

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: remove obsolete -M, -m, -C, -c options
tools/power turbostat: Make extensible via the --add parameter
tools/power turbostat: Denverton uses a 25 MHz crystal, not 19.2 MHz
tools/power turbostat: line up headers when -M is used
tools/power turbostat: fix SKX PKG_CSTATE_LIMIT decoding
tools/power turbostat: Support Knights Mill (KNM)
tools/power turbostat: Display HWP OOB status
tools/power turbostat: fix Denverton BCLK
tools/power turbostat: use intel-family.h model strings
tools/power/turbostat: Add Denverton RAPL support
tools/power/turbostat: Add Denverton support
tools/power/turbostat: split core MSR support into status + limit
tools/power turbostat: fix error case overflow read of slm_freq_table[]
tools/power turbostat: Allocate correct amount of fd and irq entries
tools/power turbostat: switch to tab delimited output
tools/power turbostat: Gracefully handle ACPI S3
tools/power turbostat: tidy up output on Joule counter overflow

Linus Torvalds
2016-12-26 06:01:28 +0800

25 Dec, 2016

2 commits

6886fee4d tools/power turbostat: remove obsolete -M, -m, -C, -c options ... Browse Code »

The new --add option has replaced the -M, -m, -C, -c options
Eg.

-M 0x10 is now --add msr0x10,raw
-m 0x10 is now --add msr0x10,raw,u32
-C 0x10 is now --add msr0x10,delta
-c 0x10 is now --add msr0x10,delta,u32

The --add option can be repeated to add any number of counters,
while the previous options were limited to adding one of each type.

In addition, the --add option can accept a column label,
and can also display a counter as a percentage of elapsed cycles.

Eg. --add msr0x3fe,core,percent,MY_CC3

Signed-off-by: Len Brown

Len Brown
2016-12-25 04:38:09 +0800
388e9c813 tools/power turbostat: Make extensible via the --add parameter ... Browse Code »

Create the "--add" parameter. This can be used to teach an existing
turbostat binary about any number of any type of counter.

turbostat(8) details the syntax for --add.

Signed-off-by: Len Brown

Len Brown
2016-12-25 04:16:10 +0800

24 Dec, 2016

1 commit

00198dab3 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf fixes from Ingo Molnar:
"On the kernel side there's two x86 PMU driver fixes and a uprobes fix,
plus on the tooling side there's a number of fixes and some late
updates"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
perf sched timehist: Fix invalid period calculation
perf sched timehist: Remove hardcoded 'comm_width' check at print_summary
perf sched timehist: Enlarge default 'comm_width'
perf sched timehist: Honour 'comm_width' when aligning the headers
perf/x86: Fix overlap counter scheduling bug
perf/x86/pebs: Fix handling of PEBS buffer overflows
samples/bpf: Move open_raw_sock to separate header
samples/bpf: Remove perf_event_open() declaration
samples/bpf: Be consistent with bpf_load_program bpf_insn parameter
tools lib bpf: Add bpf_prog_{attach,detach}
samples/bpf: Switch over to libbpf
perf diff: Do not overwrite valid build id
perf annotate: Don't throw error for zero length symbols
perf bench futex: Fix lock-pi help string
perf trace: Check if MAP_32BIT is defined (again)
samples/bpf: Make perf_event_read() static
uprobes: Fix uprobes on MIPS, allow for a cache flush after ixol breakpoint creation
samples/bpf: Make samples more libbpf-centric
tools lib bpf: Add flags to bpf_create_map()
tools lib bpf: use __u32 from linux/types.h
...

Linus Torvalds
2016-12-24 08:49:12 +0800

23 Dec, 2016

4 commits

bdd75729e perf sched timehist: Fix invalid period calculation ... Browse Code »

When --time option is given with a value outside recorded time, the last
sample time (tprev) was set to that value and run time calculation might
be incorrect. This is a problem of the first samples for each cpus
since it would skip the runtime update when tprev is 0. But with --time
option it had non-zero (which is invalid) value so the calculation is
also incorrect.

For example, let's see the followging:

$ perf sched timehist
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
3195.968367 [0003] 0.000 0.000 0.000
3195.968386 [0002] Timer[4306/4277] 0.000 0.000 0.018
3195.968397 [0002] Web Content[4277] 0.000 0.000 0.000
3195.968595 [0001] JS Helper[4302/4277] 0.000 0.000 0.000
3195.969217 [0000] 0.000 0.000 0.621
3195.969251 [0001] kworker/1:1H[291] 0.000 0.000 0.033

The sample starts at 3195.968367 but when I gave a time interval from
3194 to 3196 (in sec) it will calculate the whole 2 second as runtime.
In below, 2 cpus accounted it as runtime, other 2 cpus accounted it as
idle time.

Before:

$ perf sched timehist --time 3194,3196 -s | tail
Idle stats:
CPU 0 idle for 1995.991 msec
CPU 1 idle for 20.793 msec
CPU 2 idle for 30.191 msec
CPU 3 idle for 1999.852 msec

Total number of unique tasks: 23
Total number of context switches: 128
Total run time (msec): 3724.940

After:

$ perf sched timehist --time 3194,3196 -s | tail
Idle stats:
CPU 0 idle for 10.811 msec
CPU 1 idle for 20.793 msec
CPU 2 idle for 30.191 msec
CPU 3 idle for 18.337 msec

Total number of unique tasks: 23
Total number of context switches: 128
Total run time (msec): 18.139

Committer notes:

Further testing:

Before:

Idle stats:
CPU 0 idle for 229.785 msec
CPU 1 idle for 937.944 msec
CPU 2 idle for 188.931 msec
CPU 3 idle for 986.185 msec

After:

# perf sched timehist --time 40602,40603 -s | tail

Idle stats:
CPU 0 idle for 229.785 msec
CPU 1 idle for 175.407 msec
CPU 2 idle for 188.931 msec
CPU 3 idle for 223.657 msec

Total number of unique tasks: 68
Total number of context switches: 814
Total run time (msec): 97.688

# for cpu in `seq 0 3` ; do echo -n "CPU $cpu idle for " ; perf sched timehist --time 40602,40603 | grep "\[000${cpu}\].*\" | tr -s ' ' | cut -d' ' -f7 | awk '{entries++ ; s+=$1} END {print s " msec (entries: " entries ")"}' ; done
CPU 0 idle for 229.721 msec (entries: 123)
CPU 1 idle for 175.381 msec (entries: 65)
CPU 2 idle for 188.903 msec (entries: 56)
CPU 3 idle for 223.61 msec (entries: 102)

Difference due to the idle stats being accounted at nanoseconds precision while
the entries in 'perf sched timehist' are trucated at msec.usec.

Signed-off-by: Namhyung Kim
Tested-by: Arnaldo Carvalho de Melo
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Fixes: 853b74071110 ("perf sched timehist: Add option to specify time window of interest")
Link: http://lkml.kernel.org/r/20161222060350.17655-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2016-12-23 03:35:46 +0800
4fa0d1aa2 perf sched timehist: Remove hardcoded 'comm_width' check at print_summary ... Browse Code »

Now that the default 'comm_width' value is 30, no need to check that at
print_summary,

Signed-off-by: Namhyung Kim
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20161222060350.17655-1-namhyung@kernel.org
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2016-12-23 03:35:46 +0800
9b8087d72 perf sched timehist: Enlarge default 'comm_width' ... Browse Code »

Current default value is 20 but it's easily changed to a bigger value as
task has a long name and different tid and pid. And it makes the output
not aligned. So change it to have a large value as summary shows.

Committer notes:

Before:

# perf sched record
^C
# perf sched timehist

40602.770537 [0001] rcuos/2[29] 7.970 0.002 0.020
40602.771512 [0003] 0.003 0.000 0.986
40602.771586 [0001] 0.020 0.000 1.049
40602.771606 [0001] qemu-system-x86[3593/3510] 0.000 0.002 0.020
40602.771629 [0003] qemu-system-x86[3510] 0.000 0.003 0.116
40602.771776 [0000] 0.001 0.000 1.892

After:

# perf sched timehist

40602.770537 [0001] rcuos/2[29] 7.970 0.002 0.020
40602.771512 [0003] 0.003 0.000 0.986
40602.771586 [0001] 0.020 0.000 1.049
40602.771606 [0001] qemu-system-x86[3593/3510] 0.000 0.002 0.020
40602.771629 [0003] qemu-system-x86[3510] 0.000 0.003 0.116

Signed-off-by: Namhyung Kim
Tested-by: Arnaldo Carvalho de Melo
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20161222060350.17655-1-namhyung@kernel.org
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2016-12-23 03:35:45 +0800
0e6758e82 perf sched timehist: Honour 'comm_width' when aligning the headers ... Browse Code »

Current default value is 20, but that may change in the future, so make
places where we have 20 hardcoded use 'comm_width'.

Signed-off-by: Namhyung Kim
Cc: David Ahern
Cc: Jiri Olsa
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20161222060350.17655-1-namhyung@kernel.org
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo

Namhyung Kim
2016-12-23 03:35:45 +0800

20 Dec, 2016

3 commits

5dc880de6 tools lib bpf: Add bpf_prog_{attach,detach} ... Browse Code »

Commit d8c5b17f2bc0 ("samples: bpf: add userspace example for attaching
eBPF programs to cgroups") added these functions to samples/libbpf, but
during this merge all of the samples libbpf functionality is shifting to
tools/lib/bpf. Shift these functions there.

Committer notes:

Use bzero + attr.FIELD = value instead of 'attr = { .FIELD = value, just
like the other wrapper calls to sys_bpf with bpf_attr to make this build
in older toolchais, such as the ones in CentOS 5 and 6.

Signed-off-by: Joe Stringer
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Wang Nan
Link: http://lkml.kernel.org/n/tip-au2zvtsh55vqeo3v3uw7jr4c@git.kernel.org
Link: https://github.com/joestringer/linux/commit/353e6f298c3d0a92fa8bfa61ff898c5050261a12.patch
Signed-off-by: Arnaldo Carvalho de Melo

Joe Stringer
2016-12-20 23:00:39 +0800
ed6c166cc perf diff: Do not overwrite valid build id ... Browse Code »

Fixes a perf diff regression issue which was introduced by commit
5baecbcd9c9a ("perf symbols: we can now read separate debug-info files
based on a build ID")

The binary name could be same when perf diff different binaries. Build
id is used to distinguish between them.
However, the previous patch assumes the same binary name has same build
id. So it overwrites the build id according to the binary name,
regardless of whether the build id is set or not.

Check the has_build_id in dso__load. If the build id is already set, use
it.

Before the fix:

$ perf diff 1.perf.data 2.perf.data
# Event 'cycles'
#
# Baseline Delta Shared Object Symbol
# ........ ....... ................ .............................
#
99.83% -99.80% tchain_edit [.] f2
0.12% +99.81% tchain_edit [.] f3
0.02% -0.01% [ixgbe] [k] ixgbe_read_reg

After the fix:
$ perf diff 1.perf.data 2.perf.data
# Event 'cycles'
#
# Baseline Delta Shared Object Symbol
# ........ ....... ................ .............................
#
99.83% +0.10% tchain_edit [.] f3
0.12% -0.08% tchain_edit [.] f2

Signed-off-by: Kan Liang
Cc: Andi Kleen
CC: Dima Kogan
Cc: Jiri Olsa
Cc: Namhyung Kim
Fixes: 5baecbcd9c9a ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1481642984-13593-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo

Kan Liang
2016-12-20 23:00:38 +0800
edee44be5 perf annotate: Don't throw error for zero length symbols ... Browse Code »

'perf report --tui' exits with error when it finds a sample of zero
length symbol (i.e. addr == sym->start == sym->end). Actually these are
valid samples. Don't exit TUI and show report with such symbols.

Reported-and-Tested-by: Anton Blanchard
Link: https://lkml.org/lkml/2016/10/8/189
Signed-off-by: Ravi Bangoria
Cc: Alexander Shishkin
Cc: Benjamin Herrenschmidt
Cc: Chris Riyder
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Masami Hiramatsu
Cc: Michael Ellerman
Cc: Nicholas Piggin
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: stable@kernel.org # v4.9+
Link: http://lkml.kernel.org/r/1479804050-5028-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo

Ravi Bangoria
2016-12-20 23:00:32 +0800