Eric Lee / smarc-fsl-linux-kernel

18 Nov, 2015

1 commit

7f151f1d8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) Fix list tests in netfilter ingress support, from Florian Westphal.

2) Fix reversal of input and output interfaces in ingress hook
invocation, from Pablo Neira Ayuso.

3) We have a use after free in r8169, caught by Dave Jones, fixed by
Francois Romieu.

4) Splice use-after-free fix in AF_UNIX frmo Hannes Frederic Sowa.

5) Three ipv6 route handling bug fixes from Martin KaFai Lau:
a) Don't create clone routes not managed by the fib6 tree
b) Don't forget to check expiration of DST_NOCACHE routes.
c) Handle rt->dst.from == NULL properly.

6) Several AF_PACKET fixes wrt transport header setting and SKB
protocol setting, from Daniel Borkmann.

7) Fix thunder driver crash on shutdown, from Pavel Fedin.

8) Several Mellanox driver fixes (max MTU calculations, use of correct
DMA unmap in TX path, etc.) from Saeed Mahameed, Tariq Toukan, Doron
Tsur, Achiad Shochat, Eran Ben Elisha, and Noa Osherovich.

9) Several mv88e6060 DSA driver fixes (wrong bit definitions for
certain registers, etc.) from Neil Armstrong.

10) Make sure to disable preemption while updating per-cpu stats of ip
tunnels, from Jason A. Donenfeld.

11) Various ARM64 bpf JIT fixes, from Yang Shi.

12) Flush icache properly in ARM JITs, from Daniel Borkmann.

13) Fix masking of RX and TX interrupts in ravb driver, from Masaru
Nagai.

14) Fix netdev feature propagation for devices not implementing
->ndo_set_features(). From Nikolay Aleksandrov.

15) Big endian fix in vmxnet3 driver, from Shrikrishna Khare.

16) RAW socket code increments incorrect SNMP counters, fix from Ben
Cartwright-Cox.

17) IPv6 multicast SNMP counters are bumped twice, fix from Neil Horman.

18) Fix handling of VLAN headers on stacked devices when REORDER is
disabled. From Vlad Yasevich.

19) Fix SKB leaks and use-after-free in ipvlan and macvlan drivers, from
Sabrina Dubroca.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits)
MAINTAINERS: Update Mellanox's Eth NIC driver entries
net/core: revert "net: fix __netdev_update_features return.." and add comment
af_unix: take receive queue lock while appending new skb
rtnetlink: fix frame size warning in rtnl_fill_ifinfo
net: use skb_clone to avoid alloc_pages failure.
packet: Use PAGE_ALIGNED macro
packet: Don't check frames_per_block against negative values
net: phy: Use interrupts when available in NOLINK state
phy: marvell: Add support for 88E1540 PHY
arm64: bpf: make BPF prologue and epilogue align with ARM64 AAPCS
macvlan: fix leak in macvlan_handle_frame
ipvlan: fix use after free of skb
ipvlan: fix leak in ipvlan_rcv_frame
vlan: Do not put vlan headers back on bridge and macvlan ports
vlan: Fix untag operations of stacked vlans with REORDER_HEADER off
via-velocity: unconditionally drop frames with bad l2 length
ipg: Remove ipg driver
dl2k: Add support for IP1000A-based cards
snmp: Remove duplicate OUTMCAST stat increment
net: thunder: Check for driver data in nicvf_remove()
...

Linus Torvalds
2015-11-18 05:52:59 +0800

17 Nov, 2015

1 commit

30b50aa61 bpf: samples: exclude asm/sysreg.h for arm64 ... Browse Code »

commit 338d4f49d6f7114a017d294ccf7374df4f998edc
("arm64: kernel: Add support for Privileged Access Never") includes sysreg.h
into futex.h and uaccess.h. But, the inline assembly used by asm/sysreg.h is
incompatible with llvm so it will cause BPF samples build failure for ARM64.
Since sysreg.h is useless for BPF samples, just exclude it from Makefile via
defining __ASM_SYSREG_H.

Signed-off-by: Yang Shi
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Yang Shi
2015-11-17 03:40:49 +0800

14 Nov, 2015

1 commit

9aa3d651a Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending ... Browse Code »

Pull SCSI target updates from Nicholas Bellinger:
"This series contains HCH's changes to absorb configfs attribute
->show() + ->store() function pointer usage from it's original
tree-wide consumers, into common configfs code.

It includes usb-gadget, target w/ drivers, netconsole and ocfs2
changes to realize the improved simplicity, that now renders the
original include/target/configfs_macros.h CPP magic for fabric drivers
and others, unnecessary and obsolete.

And with common code in place, new configfs attributes can be added
easier than ever before.

Note, there are further improvements in-flight from other folks for
v4.5 code in configfs land, plus number of target fixes for post -rc1
code"

In the meantime, a new user of the now-removed old configfs API came in
through the char/misc tree in commit 7bd1d4093c2f ("stm class: Introduce
an abstraction for System Trace Module devices").

This merge resolution comes from Alexander Shishkin, who updated his stm
class tracing abstraction to account for the removal of the old
show_attribute and store_attribute methods in commit 517982229f78
("configfs: remove old API") from this pull. As Alexander says about
that patch:

"There's no need to keep an extra wrapper structure per item and the
awkward show_attribute/store_attribute item ops are no longer needed.

This patch converts policy code to the new api, all the while making
the code quite a bit smaller and easier on the eyes.

Signed-off-by: Alexander Shishkin "

That patch was folded into the merge so that the tree should be fully
bisectable.

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (23 commits)
configfs: remove old API
ocfs2/cluster: use per-attribute show and store methods
ocfs2/cluster: move locking into attribute store methods
netconsole: use per-attribute show and store methods
target: use per-attribute show and store methods
spear13xx_pcie_gadget: use per-attribute show and store methods
dlm: use per-attribute show and store methods
usb-gadget/f_serial: use per-attribute show and store methods
usb-gadget/f_phonet: use per-attribute show and store methods
usb-gadget/f_obex: use per-attribute show and store methods
usb-gadget/f_uac2: use per-attribute show and store methods
usb-gadget/f_uac1: use per-attribute show and store methods
usb-gadget/f_mass_storage: use per-attribute show and store methods
usb-gadget/f_sourcesink: use per-attribute show and store methods
usb-gadget/f_printer: use per-attribute show and store methods
usb-gadget/f_midi: use per-attribute show and store methods
usb-gadget/f_loopback: use per-attribute show and store methods
usb-gadget/ether: use per-attribute show and store methods
usb-gadget/f_acm: use per-attribute show and store methods
usb-gadget/f_hid: use per-attribute show and store methods
...

Linus Torvalds
2015-11-14 12:04:17 +0800

07 Nov, 2015

1 commit

22402cd0a Merge tag 'trace-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace ... Browse Code »

Pull tracking updates from Steven Rostedt:
"Most of the changes are clean ups and small fixes. Some of them have
stable tags to them. I searched through my INBOX just as the merge
window opened and found lots of patches to pull. I ran them through
all my tests and they were in linux-next for a few days.

Features added this release:
----------------------------

- Module globbing. You can now filter function tracing to several
modules. # echo '*:mod:*snd*' > set_ftrace_filter (Dmitry Safonov)

- Tracer specific options are now visible even when the tracer is not
active. It was rather annoying that you can only see and modify
tracer options after enabling the tracer. Now they are in the
options/ directory even when the tracer is not active. Although
they are still only visible when the tracer is active in the
trace_options file.

- Trace options are now per instance (although some of the tracer
specific options are global)

- New tracefs file: set_event_pid. If any pid is added to this file,
then all events in the instance will filter out events that are not
part of this pid. sched_switch and sched_wakeup events handle next
and the wakee pids"

* tag 'trace-v4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (68 commits)
tracefs: Fix refcount imbalance in start_creating()
tracing: Put back comma for empty fields in boot string parsing
tracing: Apply tracer specific options from kernel command line.
tracing: Add some documentation about set_event_pid
ring_buffer: Remove unneeded smp_wmb() before wakeup of reader benchmark
tracing: Allow dumping traces without tracking trace started cpus
ring_buffer: Fix more races when terminating the producer in the benchmark
ring_buffer: Do no not complete benchmark reader too early
tracing: Remove redundant TP_ARGS redefining
tracing: Rename max_stack_lock to stack_trace_max_lock
tracing: Allow arch-specific stack tracer
recordmcount: arm64: Replace the ignored mcount call into nop
recordmcount: Fix endianness handling bug for nop_mcount
tracepoints: Fix documentation of RCU lockdep checks
tracing: ftrace_event_is_function() can return boolean
tracing: is_legal_op() can return boolean
ring-buffer: rb_event_is_commit() can return boolean
ring-buffer: rb_per_cpu_empty() can return boolean
ring_buffer: ring_buffer_empty{cpu}() can return boolean
ring-buffer: rb_is_reader_page() can return boolean
...

Linus Torvalds
2015-11-07 05:30:20 +0800

03 Nov, 2015

2 commits

42984d7c1 bpf: add sample usages for persistent maps/progs ... Browse Code »

This patch adds a couple of stand-alone examples on how BPF_OBJ_PIN
and BPF_OBJ_GET commands can be used.

Example with maps:

# ./fds_example -F /sys/fs/bpf/m -P -m -k 1 -v 42
bpf: map fd:3 (Success)
bpf: pin ret:(0,Success)
bpf: fd:3 u->(1:42) ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m -G -m -k 1
bpf: get fd:3 (Success)
bpf: fd:3 l->(1):42 ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m -G -m -k 1 -v 24
bpf: get fd:3 (Success)
bpf: fd:3 u->(1:24) ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m -G -m -k 1
bpf: get fd:3 (Success)
bpf: fd:3 l->(1):24 ret:(0,Success)

# ./fds_example -F /sys/fs/bpf/m2 -P -m
bpf: map fd:3 (Success)
bpf: pin ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m2 -G -m -k 1
bpf: get fd:3 (Success)
bpf: fd:3 l->(1):0 ret:(0,Success)
# ./fds_example -F /sys/fs/bpf/m2 -G -m
bpf: get fd:3 (Success)

Example with progs:

# ./fds_example -F /sys/fs/bpf/p -P -p
bpf: prog fd:3 (Success)
bpf: pin ret:(0,Success)
bpf sock:4
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Daniel Borkmann
2015-11-03 11:48:39 +0800
67aedeb85 Sample: Trace_event: Correct the comments ... Browse Code »

The commit 889204278ccf ("tracing: Update trace-event-sample with
TRACE_SYSTEM_VAR documentation") changed TRACE_SYSTEM to 'sample-trace',
but didn't make the according change of its name in the comments.

Link: http://lkml.kernel.org/r/1443599650-23680-1-git-send-email-zhang.chunyan@linaro.org

Signed-off-by: Chunyan Zhang
Signed-off-by: Steven Rostedt

Chunyan Zhang
2015-11-03 03:09:15 +0800

01 Nov, 2015

1 commit

b75ec3af2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Browse Code »

David S. Miller
2015-11-01 12:15:30 +0800

28 Oct, 2015

1 commit

85ff8a43f bpf: sample: define aarch64 specific registers ... Browse Code »

Define aarch64 specific registers for building bpf samples correctly.

Signed-off-by: Yang Shi
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Yang Shi
2015-10-28 10:51:10 +0800

22 Oct, 2015

1 commit

39111695b samples: bpf: add bpf_perf_event_output example ... Browse Code »

Performance test and example of bpf_perf_event_output().
kprobe is attached to sys_write() and trivial bpf program streams
pid+cookie into userspace via PERF_COUNT_SW_BPF_OUTPUT event.

Usage:
$ sudo ./bld_x64/samples/bpf/trace_output
recv 2968913 events per sec

Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-10-22 21:42:15 +0800

14 Oct, 2015

1 commit

517982229 configfs: remove old API ... Browse Code »

Remove the old show_attribute and store_attribute methods and update
the documentation. Also replace the two C samples with a single new
one in the proper samples directory where people expect to find it.

Signed-off-by: Christoph Hellwig
Signed-off-by: Nicholas Bellinger

Christoph Hellwig
2015-10-14 13:17:57 +0800

13 Oct, 2015

1 commit

bf5088773 bpf: add unprivileged bpf tests ... Browse Code »

Add new tests samples/bpf/test_verifier:

unpriv: return pointer
checks that pointer cannot be returned from the eBPF program

unpriv: add const to pointer
unpriv: add pointer to pointer
unpriv: neg pointer
checks that pointer arithmetic is disallowed

unpriv: cmp pointer with const
unpriv: cmp pointer with pointer
checks that comparison of pointers is disallowed
Only one case allowed 'void *value = bpf_map_lookup_elem(..); if (value == 0) ...'

unpriv: check that printk is disallowed
since bpf_trace_printk is not available to unprivileged

unpriv: pass pointer to helper function
checks that pointers cannot be passed to functions that expect integers
If function expects a pointer the verifier allows only that type of pointer.
Like 1st argument of bpf_map_lookup_elem() must be pointer to map.
(applies to non-root as well)

unpriv: indirectly pass pointer on stack to helper function
checks that pointer stored into stack cannot be used as part of key
passed into bpf_map_lookup_elem()

unpriv: mangle pointer on stack 1
unpriv: mangle pointer on stack 2
checks that writing into stack slot that already contains a pointer
is disallowed

unpriv: read pointer from stack in small chunks
checks that < 8 byte read from stack slot that contains a pointer is
disallowed

unpriv: write pointer into ctx
checks that storing pointers into skb->fields is disallowed

unpriv: write pointer into map elem value
checks that storing pointers into element values is disallowed
For example:
int bpf_prog(struct __sk_buff *skb)
{
u32 key = 0;
u64 *value = bpf_map_lookup_elem(&map, &key);
if (value)
*value = (u64) skb;
}
will be rejected.

unpriv: partial copy of pointer
checks that doing 32-bit register mov from register containing
a pointer is disallowed

unpriv: pass pointer to tail_call
checks that passing pointer as an index into bpf_tail_call
is disallowed

unpriv: cmp map pointer with zero
checks that comparing map pointer with constant is disallowed

unpriv: write into frame pointer
checks that frame pointer is read-only (applies to root too)

unpriv: cmp of frame pointer
checks that R10 cannot be using in comparison

unpriv: cmp of stack pointer
checks that Rx = R10 - imm is ok, but comparing Rx is not

unpriv: obfuscate stack pointer
checks that Rx = R10 - imm is ok, but Rx -= imm is not

Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-10-13 10:13:37 +0800

02 Oct, 2015

2 commits

f6d3125fa Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Conflicts:
net/dsa/slave.c

net/dsa/slave.c simply had overlapping changes.

Signed-off-by: David S. Miller

David S. Miller
2015-10-02 22:21:25 +0800
54aea4542 kprobes: use _do_fork() in samples to make them work again ... Browse Code »

Commit 3033f14ab78c ("clone: support passing tls argument via C rather
than pt_regs magic") introduced _do_fork() that allowed to pass @tls
parameter.

The old do_fork() is defined only for architectures that are not ready
to use this way and do not define HAVE_COPY_THREAD_TLS.

Let's use _do_fork() in the kprobe examples to make them work again on
all architectures.

Signed-off-by: Petr Mladek
Cc: Ingo Molnar
Cc: Masami Hiramatsu
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: "H. Peter Anvin"
Cc: Thomas Gleixner
Cc: Thiago Macieira
Cc: Jiri Kosina
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Petr Mladek
2015-10-02 09:42:35 +0800

18 Sep, 2015

1 commit

27b29f630 bpf: add bpf_redirect() helper ... Browse Code »

Existing bpf_clone_redirect() helper clones skb before redirecting
it to RX or TX of destination netdev.
Introduce bpf_redirect() helper that does that without cloning.

Benchmarked with two hosts using 10G ixgbe NICs.
One host is doing line rate pktgen.
Another host is configured as:
$ tc qdisc add dev $dev ingress
$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
action bpf run object-file tcbpf1_kern.o section clone_redirect_xmit drop
so it receives the packet on $dev and immediately xmits it on $dev + 1
The section 'clone_redirect_xmit' in tcbpf1_kern.o file has the program
that does bpf_clone_redirect() and performance is 2.0 Mpps

$ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \
action bpf run object-file tcbpf1_kern.o section redirect_xmit drop
which is using bpf_redirect() - 2.4 Mpps

and using cls_bpf with integrated actions as:
$ tc filter add dev $dev root pref 10 \
bpf run object-file tcbpf1_kern.o section redirect_xmit integ_act classid 1
performance is 2.5 Mpps

To summarize:
u32+act_bpf using clone_redirect - 2.0 Mpps
u32+act_bpf using redirect - 2.4 Mpps
cls_bpf using redirect - 2.5 Mpps

For comparison linux bridge in this setup is doing 2.1 Mpps
and ixgbe rx + drop in ip_rcv - 7.8 Mpps

Signed-off-by: Alexei Starovoitov
Acked-by: Daniel Borkmann
Acked-by: John Fastabend
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-09-18 12:09:07 +0800

13 Aug, 2015

1 commit

5ed3ccbd5 bpf: fix build warnings and add function read_trace_pipe() ... Browse Code »

There are two improvements in this patch:
1. Fix the build warnings;
2. Add function read_trace_pipe() to print the result on
the screen;

Before this patch, we can get the result through /sys/kernel/de
bug/tracing/trace_pipe and get nothing on the screen.
By applying this patch, the result can be printed on the screen.
$ ./tracex6
...
tracex6-705 [003] d..1 131.428593: : CPU-3 19981414
sshd-683 [000] d..1 131.428727: : CPU-0 221682321
sshd-683 [000] d..1 131.428821: : CPU-0 221808766
sshd-683 [000] d..1 131.428950: : CPU-0 221982984
sshd-683 [000] d..1 131.429045: : CPU-0 222111851
tracex6-705 [003] d..1 131.429168: : CPU-3 20757551
sshd-683 [000] d..1 131.429170: : CPU-0 222281240
sshd-683 [000] d..1 131.429261: : CPU-0 222403340
sshd-683 [000] d..1 131.429378: : CPU-0 222561024
...

Signed-off-by: Kaixu Xia
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Kaixu Xia
2015-08-13 07:39:12 +0800

10 Aug, 2015

1 commit

47efb3027 samples/bpf: example of get selected PMU counter value ... Browse Code »

This is a simple example and shows how to use the new ability
to get the selected Hardware PMU counter value.

Signed-off-by: Kaixu Xia
Signed-off-by: David S. Miller

Kaixu Xia
2015-08-10 13:50:06 +0800

27 Jul, 2015

1 commit

24b4d2abd ebpf: Allow dereferences of PTR_TO_STACK registers ... Browse Code »

mov %rsp, %r1 ; r1 = rsp
add $-8, %r1 ; r1 = rsp - 8
store_q $123, -8(%rsp) ; *(u64*)r1 = 123
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Alex Gartrell
2015-07-27 15:54:10 +0800

23 Jul, 2015

1 commit

c5e40ee28 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Conflicts:
net/bridge/br_mdb.c

br_mdb.c conflict was a function call being removed to fix a bug in
'net' but whose signature was changed in 'net-next'.

Signed-off-by: David S. Miller

David S. Miller
2015-07-23 15:41:16 +0800

18 Jul, 2015

1 commit

d6726c814 tracing: Fix sample output of dynamic arrays ... Browse Code »

He Kuang noticed that the trace event samples for arrays was broken:

"The output result of trace_foo_bar event in traceevent samples is
wrong. This problem can be reproduced as following:

(Build kernel with SAMPLE_TRACE_EVENTS=m)

$ insmod trace-events-sample.ko

$ echo 1 > /sys/kernel/debug/tracing/events/sample-trace/foo_bar/enable

$ cat /sys/kernel/debug/tracing/trace

event-sample-980 [000] .... 43.649559: foo_bar: foo hello 21 0x15
BIT1|BIT3|0x10 {0x1,0x6f6f6e53,0xff007970,0xffffffff} Snoopy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The array length is not right, should be {0x1}.
(ffffffff,ffffffff)

event-sample-980 [000] .... 44.653827: foo_bar: foo hello 22 0x16
BIT2|BIT3|0x10
{0x1,0x2,0x646e6147,0x666c61,0xffffffff,0xffffffff,0x750aeffe,0x7}
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The array length is not right, should be {0x1,0x2}.
Gandalf (ffffffff,ffffffff)"

This was caused by an update to have __print_array()'s second parameter
be the count of items in the array and not the size of the array.

As there is already users of __print_array(), it can not change. But
the sample code can and we can also improve on the documentation about
__print_array() and __get_dynamic_array_len().

Link: http://lkml.kernel.org/r/1436839171-31527-2-git-send-email-hekuang@huawei.com

Fixes: ac01ce1410fc2 ("tracing: Make ftrace_print_array_seq compute buf_len")
Reported-by: He Kuang
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2015-07-18 02:15:13 +0800

09 Jul, 2015

1 commit

d912557b3 samples: bpf: enable trace samples for s390x ... Browse Code »

The trace bpf samples do not compile on s390x because they use x86
specific fields from the "pt_regs" structure.

Fix this and access the fields via new PT_REGS macros.

Signed-off-by: Michael Holzheu
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Michael Holzheu
2015-07-09 06:17:45 +0800

23 Jun, 2015

1 commit

0fb1170ee bpf: BPF based latency tracing ... Browse Code »

BPF offers another way to generate latency histograms. We attach
kprobes at trace_preempt_off and trace_preempt_on and calculate the
time it takes to from seeing the off/on transition.

The first array is used to store the start time stamp. The key is the
CPU id. The second array stores the log2(time diff). We need to use
static allocation here (array and not hash tables). The kprobes
hooking into trace_preempt_on|off should not calling any dynamic
memory allocation or free path. We need to avoid recursivly
getting called. Besides that, it reduces jitter in the measurement.

CPU 0
latency
1 -> 1
2 -> 3
4 -> 7
8 -> 15
16 -> 31
32 -> 63
64 -> 127
128 -> 255
256 -> 511
512 -> 1023 : 0
1024 -> 2047 : 0
2048 -> 4095
4096 -> 8191
8192 -> 16383
16384 -> 32767
32768 -> 65535
65536 -> 131071 : 179
131072 -> 262143 : 18
262144 -> 524287 : 4
524288 -> 1048575 : 1363
CPU 1
latency
1 -> 1
2 -> 3
4 -> 7
8 -> 15
16 -> 31
32 -> 63
64 -> 127
128 -> 255
256 -> 511
512 -> 1023 : 0
1024 -> 2047 : 0
2048 -> 4095
4096 -> 8191
8192 -> 16383
16384 -> 32767
32768 -> 65535
65536 -> 131071 : 29
131072 -> 262143 : 4
262144 -> 524287 : 1
524288 -> 1048575 : 364
CPU 2
latency
1 -> 1
2 -> 3
4 -> 7
8 -> 15
16 -> 31
32 -> 63
64 -> 127
128 -> 255
256 -> 511
512 -> 1023 : 0
1024 -> 2047 : 0
2048 -> 4095
4096 -> 8191
8192 -> 16383
16384 -> 32767
32768 -> 65535 : 59
65536 -> 131071 : 2
131072 -> 262143 : 0
262144 -> 524287 : 1
524288 -> 1048575 : 174
CPU 3
latency
1 -> 1
2 -> 3
4 -> 7
8 -> 15
16 -> 31
32 -> 63
64 -> 127
128 -> 255
256 -> 511
512 -> 1023 : 0
1024 -> 2047 : 0
2048 -> 4095
4096 -> 8191
8192 -> 16383
16384 -> 32767
32768 -> 65535 : 72
65536 -> 131071 : 32
131072 -> 262143 : 26
262144 -> 524287 : 12
524288 -> 1048575 : 298 : count distribution : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | | | | | : 166723 |*************************************** | : 19870 |*** | : 6324 | | : 1098 | | : 190 | | | | | | | | | | : count distribution : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | | | | | : 114042 |*************************************** | : 9587 |** | : 4140 | | : 673 | | : 179 | | | | | | | | | | : count distribution : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | | | | | : 40147 |*************************************** | : 2300 |* | : 828 | | : 178 | | | | | | | | | | | | : count distribution : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | : 0 | | | | | | : 29626 |*************************************** | : 2704 |** | : 1090 | | : 160 | | | | | | | | | | | |

All this is based on the trace3 examples written by
Alexei Starovoitov .

Signed-off-by: Daniel Wagner
Cc: Alexei Starovoitov
Cc: Alexei Starovoitov
Cc: "David S. Miller"
Cc: Daniel Borkmann
Cc: Ingo Molnar
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Alexei Starovoitov
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller

Daniel Wagner
2015-06-23 21:09:58 +0800

16 Jun, 2015

1 commit

ffeedafbf bpf: introduce current->pid, tgid, uid, gid, comm accessors ... Browse Code »

eBPF programs attached to kprobes need to filter based on
current->pid, uid and other fields, so introduce helper functions:

u64 bpf_get_current_pid_tgid(void)
Return: current->tgid << 32 | current->pid

u64 bpf_get_current_uid_gid(void)
Return: current_gid << 32 | current_uid

bpf_get_current_comm(char *buf, int size_of_buf)
stores current->comm into buf

They can be used from the programs attached to TC as well to classify packets
based on current task fields.

Update tracex2 example to print histogram of write syscalls for each process
instead of aggregated for all.

Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-06-16 06:53:50 +0800

07 Jun, 2015

2 commits

d691f9e8d bpf: allow programs to write to certain skb fields ... Browse Code »

allow programs read/write skb->mark, tc_index fields and
((struct qdisc_skb_cb *)cb)->data.

mark and tc_index are generically useful in TC.
cb[0]-cb[4] are primarily used to pass arguments from one
program to another called via bpf_tail_call() which can
be seen in sockex3_kern.c example.

All fields of 'struct __sk_buff' are readable to socket and tc_cls_act progs.
mark, tc_index are writeable from tc_cls_act only.
cb[0]-cb[4] are writeable by both sockets and tc_cls_act.

Add verifier tests and improve sample code.

Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-06-07 17:01:33 +0800
3431205e0 bpf: make programs see skb->data == L2 for ingress and egress ... Browse Code »

eBPF programs attached to ingress and egress qdiscs see inconsistent skb->data.
For ingress L2 header is already pulled, whereas for egress it's present.
This is known to program writers which are currently forced to use
BPF_LL_OFF workaround.
Since programs don't change skb internal pointers it is safe to do
pull/push right around invocation of the program and earlier taps and
later pt->func() will not be affected.
Multiple taps via packet_rcv(), tpacket_rcv() are doing the same trick
around run_filter/BPF_PROG_RUN even if skb_shared.

This fix finally allows programs to use optimized LD_ABS/IND instructions
without BPF_LL_OFF for higher performance.
tc ingress + cls_bpf + samples/bpf/tcbpf1_kern.o
w/o JIT w/JIT
before 20.5 23.6 Mpps
after 21.8 26.6 Mpps

Old programs with BPF_LL_OFF will still work as-is.

We can now undo most of the earlier workaround commit:
a166151cbe33 ("bpf: fix bpf helpers to use skb->mac_header relative offsets")

Signed-off-by: Alexei Starovoitov
Acked-by: Jamal Hadi Salim
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-06-07 17:01:33 +0800

23 May, 2015

5 commits

05a14d5e1 pktgen: add benchmark script pktgen_bench_xmit_mode_netif_receive.sh ... Browse Code »

This script pktgen_bench_xmit_mode_netif_receive.sh is a benchmark
script, which can be used for benchmarking part of the network stack.
This can be used for performance improving or catching regression in
that area.

The script is developed for benchmarking ingress qdisc path, original
idea by Alexei Starovoitov. This script don't really need any
hardware. This is achieved via the recently introduced stack inject
feature "xmit_mode netif_receive". See commit 62f64aed622b6 ("pktgen:
introduce xmit_mode ''").

Signed-off-by: Jesper Dangaard Brouer
Signed-off-by: David S. Miller

Jesper Dangaard Brouer
2015-05-23 11:59:17 +0800
1d73ba16a pktgen: add sample script pktgen_sample03_burst_single_flow.sh ... Browse Code »

Add the pktgen samples script pktgen_sample03_burst_single_flow.sh
that demonstrates how to acheive maximum performance.

If correctly tuned[1] single CPU 10Gbit/s wirespeed small pkts is
possible[2] which is 14.88Mpps. The trick is to take advantage of the
"burst" feature introduced in commit 38b2cf2982dc73 ("net: pktgen:
packet bursting via skb->xmit_more").

[1] http://netoptimizer.blogspot.dk/2014/06/pktgen-for-network-overload-testing.html
[2] http://netoptimizer.blogspot.dk/2014/10/unlocked-10gbps-tx-wirespeed-smallest.html

Signed-off-by: Jesper Dangaard Brouer
Signed-off-by: David S. Miller

Jesper Dangaard Brouer
2015-05-23 11:59:17 +0800
282fb5894 pktgen: add sample script pktgen_sample02_multiqueue.sh ... Browse Code »

Add the pktgen samples script pktgen_sample02_multiqueue.sh that
demonstrates generating packets on multiqueue NICs.

Specifically notice the options "-t" that specifies how many
kernel threads to activate. Also notice the flag QUEUE_MAP_CPU,
which cause the SKB TX queue to be mapped to the CPU running the
kernel thread. For best scalability people are also encourage to
map NIC IRQ /proc/irq/*/smp_affinity to CPU number.

Usage example with "-t" 4 threads and help:
./pktgen_sample02_multiqueue.sh -i eth4 -m 00:1B:21:3C:9D:F8 -t 4

Usage: ./pktgen_sample02_multiqueue.sh [-vx] -i ethX
-i : ($DEV) output interface/device (required)
-s : ($PKT_SIZE) packet size
-d : ($DEST_IP) destination IP
-m : ($DST_MAC) destination MAC-addr
-t : ($THREADS) threads to start
-c : ($SKB_CLONE) SKB clones send before alloc new SKB
-b : ($BURST) HW level bursting of SKBs
-v : ($VERBOSE) verbose
-x : ($DEBUG) debug

Removing pktgen.conf-2-1 and pktgen.conf-2-2 as these examples
should be covered now.

Signed-off-by: Jesper Dangaard Brouer
Signed-off-by: David S. Miller

Jesper Dangaard Brouer
2015-05-23 11:59:17 +0800
6f0947975 pktgen: add sample script pktgen_sample01_simple.sh ... Browse Code »

Add the first basic pktgen samples script pktgen_sample01_simple.sh,
which demonstrates the a simple use of the helper functions.
Removing pktgen.conf-1-1 as that example should be covered now.

The naming scheme pktgen_sampleNN, where NN is a number, should encourage
reading the samples in a specific order.

Script cause pktgen sending with a single thread and single interface,
and introduce flow variation via random UDP source port.

Usage example and help:
./pktgen_sample01_simple.sh -i eth4 -m 00:1B:21:3C:9D:F8 -d 192.168.8.2

Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
-i : ($DEV) output interface/device (required)
-s : ($PKT_SIZE) packet size
-d : ($DEST_IP) destination IP
-m : ($DST_MAC) destination MAC-addr
-c : ($SKB_CLONE) SKB clones send before alloc new SKB
-v : ($VERBOSE) verbose
-x : ($DEBUG) debug

Signed-off-by: Jesper Dangaard Brouer
Signed-off-by: David S. Miller

Jesper Dangaard Brouer
2015-05-23 11:59:17 +0800
b64b0d1e6 pktgen: new pktgen helper functions for samples scripts ... Browse Code »

Preparing for removing existing samples/pktgen/ scripts, and
replacing these with easier to use samples.

This commit provides two helper shell files, that can
be "included" by shell source'ing. Namely "functions.sh"
and "parameters.sh".

The parameters.sh file support easy and consistant parameter
parsing across the sample scripts. Usage example is printed on
errors.

The functions.sh file provides, three new shell functions for
configuring the different components of pktgen: pg_ctrl(),
pg_thread() and pg_set(). A slightly improved version of the old
pgset() function is also provided for backwards compat.

The new functions correspond to pktgens different components.
* pg_ctrl() control "pgctrl" (/proc/net/pktgen/pgctrl)
* pg_thread() control the kernel threads and binding to devices
* pg_set() control setup of individual devices

These changes are borrowed from:
https://github.com/netoptimizer/network-testing/tree/master/pktgen

Signed-off-by: Jesper Dangaard Brouer
Signed-off-by: David S. Miller

Jesper Dangaard Brouer
2015-05-23 11:59:16 +0800

22 May, 2015

2 commits

530b2c861 samples/bpf: bpf_tail_call example for networking ... Browse Code »

Usage:
$ sudo ./sockex3
IP src.port -> dst.port bytes packets
127.0.0.1.42010 -> 127.0.0.1.12865 1568 8
127.0.0.1.59526 -> 127.0.0.1.33778 11422636 173070
127.0.0.1.33778 -> 127.0.0.1.59526 11260224828 341974
127.0.0.1.12865 -> 127.0.0.1.42010 1832 12
IP src.port -> dst.port bytes packets
127.0.0.1.42010 -> 127.0.0.1.12865 1568 8
127.0.0.1.59526 -> 127.0.0.1.33778 23198092 351486
127.0.0.1.33778 -> 127.0.0.1.59526 22972698518 698616
127.0.0.1.12865 -> 127.0.0.1.42010 1832 12

this example is similar to sockex2 in a way that it accumulates per-flow
statistics, but it does packet parsing differently.
sockex2 inlines full packet parser routine into single bpf program.
This sockex3 example have 4 independent programs that parse vlan, mpls, ip, ipv6
and one main program that starts the process.
bpf_tail_call() mechanism allows each program to be small and be called
on demand potentially multiple times, so that many vlan, mpls, ip in ip,
gre encapsulations can be parsed. These and other protocol parsers can
be added or removed at runtime. TLVs can be parsed in similar manner.
Note, tail_call_cnt dynamic check limits the number of tail calls to 32.

Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-05-22 05:07:59 +0800
5bacd7805 samples/bpf: bpf_tail_call example for tracing ... Browse Code »

kprobe example that demonstrates how future seccomp programs may look like.
It attaches to seccomp_phase1() function and tail-calls other BPF programs
depending on syscall number.

Existing optimized classic BPF seccomp programs generated by Chrome look like:
if (sd.nr < 121) {
if (sd.nr < 57) {
if (sd.nr < 22) {
if (sd.nr < 7) {
if (sd.nr < 4) {
if (sd.nr < 1) {
check sys_read
} else {
if (sd.nr < 3) {
check sys_write and sys_open
} else {
check sys_close
}
}
} else {
} else {
} else {
} else {
} else {
}

the future seccomp using native eBPF may look like:
bpf_tail_call(&sd, &syscall_jmp_table, sd.nr);
which is simpler, faster and leaves more room for per-syscall checks.

Usage:
$ sudo ./tracex5
-366 [001] d... 4.870033: : read(fd=1, buf=00007f6d5bebf000, size=771)
-369 [003] d... 4.870066: : mmap
-369 [003] d... 4.870077: : syscall=110 (one of get/set uid/pid/gid)
-369 [003] d... 4.870089: : syscall=107 (one of get/set uid/pid/gid)
sh-369 [000] d... 4.891740: : read(fd=0, buf=00000000023d1000, size=512)
sh-369 [000] d... 4.891747: : write(fd=1, buf=00000000023d3000, size=512)
sh-369 [000] d... 4.891747: : read(fd=1, buf=00000000023d3000, size=512)

Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-05-22 05:07:59 +0800

13 May, 2015

1 commit

b88c06e36 samples/bpf: fix in-source build of samples with clang ... Browse Code »

in-source build of 'make samples/bpf/' was incorrectly
using default compiler instead of invoking clang/llvm.
out-of-source build was ok.

Fixes: a80857822b0c ("samples: bpf: trivial eBPF program in C")
Signed-off-by: Brenden Blanco
Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Brenden Blanco
2015-05-13 11:15:25 +0800

17 Apr, 2015

2 commits

725f9dcd5 bpf: fix two bugs in verification logic when accessing 'ctx' pointer ... Browse Code »

1.
first bug is a silly mistake. It broke tracing examples and prevented
simple bpf programs from loading.

In the following code:
if (insn->imm == 0 && BPF_SIZE(insn->code) == BPF_W) {
} else if (...) {
// this part should have been executed when
// insn->code == BPF_W and insn->imm != 0
}

Obviously it's not doing that. So simple instructions like:
r2 = *(u64 *)(r1 + 8)
will be rejected. Note the comments in the code around these branches
were and still valid and indicate the true intent.

Replace it with:
if (BPF_SIZE(insn->code) != BPF_W)
continue;

if (insn->imm == 0) {
} else if (...) {
// now this code will be executed when
// insn->code == BPF_W and insn->imm != 0
}

2.
second bug is more subtle.
If malicious code is using the same dest register as source register,
the checks designed to prevent the same instruction to be used with different
pointer types will fail to trigger, since we were assigning src_reg_type
when it was already overwritten by check_mem_access().
The fix is trivial. Just move line:
src_reg_type = regs[insn->src_reg].type;
before check_mem_access().
Add new 'access skb fields bad4' test to check this case.

Fixes: 9bac3d6d548e ("bpf: allow extended BPF programs access skb fields")
Signed-off-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-04-17 02:08:49 +0800
a166151cb bpf: fix bpf helpers to use skb->mac_header relative offsets ... Browse Code »

For the short-term solution, lets fix bpf helper functions to use
skb->mac_header relative offsets instead of skb->data in order to
get the same eBPF programs with cls_bpf and act_bpf work on ingress
and egress qdisc path. We need to ensure that mac_header is set
before calling into programs. This is effectively the first option
from below referenced discussion.

More long term solution for LD_ABS|LD_IND instructions will be more
intrusive but also more beneficial than this, and implemented later
as it's too risky at this point in time.

I.e., we plan to look into the option of moving skb_pull() out of
eth_type_trans() and into netif_receive_skb() as has been suggested
as second option. Meanwhile, this solution ensures ingress can be
used with eBPF, too, and that we won't run into ABI troubles later.
For dealing with negative offsets inside eBPF helper functions,
we've implemented bpf_skb_clone_unwritable() to test for unwriteable
headers.

Reference: http://thread.gmane.org/gmane.linux.network/359129/focus=359694
Fixes: 608cd71a9c7c ("tc: bpf: generalize pedit action")
Fixes: 91bc4822c3d6 ("tc: bpf: add checksum helpers")
Signed-off-by: Alexei Starovoitov
Signed-off-by: Daniel Borkmann
Signed-off-by: David S. Miller

Alexei Starovoitov
2015-04-17 02:08:49 +0800

16 Apr, 2015

1 commit

6c373ca89 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next ... Browse Code »

Pull networking updates from David Miller:

1) Add BQL support to via-rhine, from Tino Reichardt.

2) Integrate SWITCHDEV layer support into the DSA layer, so DSA drivers
can support hw switch offloading. From Floria Fainelli.

3) Allow 'ip address' commands to initiate multicast group join/leave,
from Madhu Challa.

4) Many ipv4 FIB lookup optimizations from Alexander Duyck.

5) Support EBPF in cls_bpf classifier and act_bpf action, from Daniel
Borkmann.

6) Remove the ugly compat support in ARP for ugly layers like ax25,
rose, etc. And use this to clean up the neigh layer, then use it to
implement MPLS support. All from Eric Biederman.

7) Support L3 forwarding offloading in switches, from Scott Feldman.

8) Collapse the LOCAL and MAIN ipv4 FIB tables when possible, to speed
up route lookups even further. From Alexander Duyck.

9) Many improvements and bug fixes to the rhashtable implementation,
from Herbert Xu and Thomas Graf. In particular, in the case where
an rhashtable user bulk adds a large number of items into an empty
table, we expand the table much more sanely.

10) Don't make the tcp_metrics hash table per-namespace, from Eric
Biederman.

11) Extend EBPF to access SKB fields, from Alexei Starovoitov.

12) Split out new connection request sockets so that they can be
established in the main hash table. Much less false sharing since
hash lookups go direct to the request sockets instead of having to
go first to the listener then to the request socks hashed
underneath. From Eric Dumazet.

13) Add async I/O support for crytpo AF_ALG sockets, from Tadeusz Struk.

14) Support stable privacy address generation for RFC7217 in IPV6. From
Hannes Frederic Sowa.

15) Hash network namespace into IP frag IDs, also from Hannes Frederic
Sowa.

16) Convert PTP get/set methods to use 64-bit time, from Richard
Cochran.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1816 commits)
fm10k: Bump driver version to 0.15.2
fm10k: corrected VF multicast update
fm10k: mbx_update_max_size does not drop all oversized messages
fm10k: reset head instead of calling update_max_size
fm10k: renamed mbx_tx_dropped to mbx_tx_oversized
fm10k: update xcast mode before synchronizing multicast addresses
fm10k: start service timer on probe
fm10k: fix function header comment
fm10k: comment next_vf_mbx flow
fm10k: don't handle mailbox events in iov_event path and always process mailbox
fm10k: use separate workqueue for fm10k driver
fm10k: Set PF queues to unlimited bandwidth during virtualization
fm10k: expose tx_timeout_count as an ethtool stat
fm10k: only increment tx_timeout_count in Tx hang path
fm10k: remove extraneous "Reset interface" message
fm10k: separate PF only stats so that VF does not display them
fm10k: use hw->mac.max_queues for stats
fm10k: only show actual queues, not the maximum in hardware
fm10k: allow creation of VLAN on default vid
fm10k: fix unused warnings
...

Linus Torvalds
2015-04-16 00:00:47 +0800

15 Apr, 2015

3 commits

6c8a53c9e Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf changes from Ingo Molnar:
"Core kernel changes:

- One of the more interesting features in this cycle is the ability
to attach eBPF programs (user-defined, sandboxed bytecode executed
by the kernel) to kprobes.

This allows user-defined instrumentation on a live kernel image
that can never crash, hang or interfere with the kernel negatively.
(Right now it's limited to root-only, but in the future we might
allow unprivileged use as well.)

(Alexei Starovoitov)

- Another non-trivial feature is per event clockid support: this
allows, amongst other things, the selection of different clock
sources for event timestamps traced via perf.

This feature is sought by people who'd like to merge perf generated
events with external events that were measured with different
clocks:

- cluster wide profiling

- for system wide tracing with user-space events,

- JIT profiling events

etc. Matching perf tooling support is added as well, available via
the -k, --clockid parameter to perf record et al.

(Peter Zijlstra)

Hardware enablement kernel changes:

- x86 Intel Processor Trace (PT) support: which is a hardware tracer
on steroids, available on Broadwell CPUs.

The hardware trace stream is directly output into the user-space
ring-buffer, using the 'AUX' data format extension that was added
to the perf core to support hardware constraints such as the
necessity to have the tracing buffer physically contiguous.

This patch-set was developed for two years and this is the result.
A simple way to make use of this is to use BTS tracing, the PT
driver emulates BTS output - available via the 'intel_bts' PMU.
More explicit PT specific tooling support is in the works as well -
will probably be ready by 4.2.

(Alexander Shishkin, Peter Zijlstra)

- x86 Intel Cache QoS Monitoring (CQM) support: this is a hardware
feature of Intel Xeon CPUs that allows the measurement and
allocation/partitioning of caches to individual workloads.

These kernel changes expose the measurement side as a new PMU
driver, which exposes various QoS related PMU events. (The
partitioning change is work in progress and is planned to be merged
as a cgroup extension.)

(Matt Fleming, Peter Zijlstra; CPU feature detection by Peter P
Waskiewicz Jr)

- x86 Intel Haswell LBR call stack support: this is a new Haswell
feature that allows the hardware recording of call chains, plus
tooling support. To activate this feature you have to enable it
via the new 'lbr' call-graph recording option:

perf record --call-graph lbr
perf report

or:

perf top --call-graph lbr

This hardware feature is a lot faster than stack walk or dwarf
based unwinding, but has some limitations:

- It reuses the current LBR facility, so LBR call stack and
branch record can not be enabled at the same time.

- It is only available for user-space callchains.

(Yan, Zheng)

- x86 Intel Broadwell CPU support and various event constraints and
event table fixes for earlier models.

(Andi Kleen)

- x86 Intel HT CPUs event scheduling workarounds. This is a complex
CPU bug affecting the SNB,IVB,HSW families that results in counter
value corruption. The mitigation code is automatically enabled and
is transparent.

(Maria Dimakopoulou, Stephane Eranian)

The perf tooling side had a ton of changes in this cycle as well, so
I'm only able to list the user visible changes here, in addition to
the tooling changes outlined above:

User visible changes affecting all tools:

- Improve support of compressed kernel modules (Jiri Olsa)
- Save DSO loading errno to better report errors (Arnaldo Carvalho de Melo)
- Bash completion for subcommands (Yunlong Song)
- Add 'I' event modifier for perf_event_attr.exclude_idle bit (Jiri Olsa)
- Support missing -f to override perf.data file ownership. (Yunlong Song)
- Show the first event with an invalid filter (David Ahern, Arnaldo Carvalho de Melo)

User visible changes in individual tools:

'perf data':

New tool for converting perf.data to other formats, initially
for the CTF (Common Trace Format) from LTTng (Jiri Olsa,
Sebastian Siewior)

'perf diff':

Add --kallsyms option (David Ahern)

'perf list':

Allow listing events with 'tracepoint' prefix (Yunlong Song)

Sort the output of the command (Yunlong Song)

'perf kmem':

Respect -i option (Jiri Olsa)

Print big numbers using thousands' group (Namhyung Kim)

Allow -v option (Namhyung Kim)

Fix alignment of slab result table (Namhyung Kim)

'perf probe':

Support multiple probes on different binaries on the same command line (Masami Hiramatsu)

Support unnamed union/structure members data collection. (Masami Hiramatsu)

Check kprobes blacklist when adding new events. (Masami Hiramatsu)

'perf record':

Teach 'perf record' about perf_event_attr.clockid (Peter Zijlstra)

Support recording running/enabled time (Andi Kleen)

'perf sched':

Improve the performance of 'perf sched replay' on high CPU core count machines (Yunlong Song)

'perf report' and 'perf top':

Allow annotating entries in callchains in the hists browser (Arnaldo Carvalho de Melo)

Indicate which callchain entries are annotated in the
TUI hists browser (Arnaldo Carvalho de Melo)

Add pid/tid filtering to 'report' and 'script' commands (David Ahern)

Consider PERF_RECORD_ events with cpumode == 0 in 'perf top', removing one
cause of long term memory usage buildup, i.e. not processing PERF_RECORD_EXIT
events (Arnaldo Carvalho de Melo)

'perf stat':

Report unsupported events properly (Suzuki K. Poulose)

Output running time and run/enabled ratio in CSV mode (Andi Kleen)

'perf trace':

Handle legacy syscalls tracepoints (David Ahern, Arnaldo Carvalho de Melo)

Only insert blank duration bracket when tracing syscalls (Arnaldo Carvalho de Melo)

Filter out the trace pid when no threads are specified (Arnaldo Carvalho de Melo)

Dump stack on segfaults (Arnaldo Carvalho de Melo)

No need to explicitely enable evsels for workload started from perf, let it
be enabled via perf_event_attr.enable_on_exec, removing some events that take
place in the 'perf trace' before a workload is really started by it.
(Arnaldo Carvalho de Melo)

Allow mixing with tracepoints and suppressing plain syscalls. (Arnaldo Carvalho de Melo)

There's also been a ton of infrastructure work done, such as the
split-out of perf's build system into tools/build/ and other changes -
see the shortlog and changelog for details"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (358 commits)
perf/x86/intel/pt: Clean up the control flow in pt_pmu_hw_init()
perf evlist: Fix type for references to data_head/tail
perf probe: Check the orphaned -x option
perf probe: Support multiple probes on different binaries
perf buildid-list: Fix segfault when show DSOs with hits
perf tools: Fix cross-endian analysis
perf tools: Fix error path to do closedir() when synthesizing threads
perf tools: Fix synthesizing fork_event.ppid for non-main thread
perf tools: Add 'I' event modifier for exclude_idle bit
perf report: Don't call map__kmap if map is NULL.
perf tests: Fix attr tests
perf probe: Fix ARM 32 building error
perf tools: Merge all perf_event_attr print functions
perf record: Add clockid parameter
perf sched replay: Use replay_repeat to calculate the runavg of cpu usage instead of the default value 10
perf sched replay: Support using -f to override perf.data file ownership
perf sched replay: Fix the EMFILE error caused by the limitation of the maximum open files
perf sched replay: Handle the dead halt of sem_wait when create_tasks() fails for any task
perf sched replay: Fix the segmentation fault problem caused by pr_err in threads
perf sched replay: Realloc the memory of pid_to_task stepwise to adapt to the different pid_max configurations
...

Linus Torvalds
2015-04-15 05:37:47 +0800
eeee78cf7 Merge tag 'trace-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace ... Browse Code »

Pull tracing updates from Steven Rostedt:
"Some clean ups and small fixes, but the biggest change is the addition
of the TRACE_DEFINE_ENUM() macro that can be used by tracepoints.

Tracepoints have helper functions for the TP_printk() called
__print_symbolic() and __print_flags() that lets a numeric number be
displayed as a a human comprehensible text. What is placed in the
TP_printk() is also shown in the tracepoint format file such that user
space tools like perf and trace-cmd can parse the binary data and
express the values too. Unfortunately, the way the TRACE_EVENT()
macro works, anything placed in the TP_printk() will be shown pretty
much exactly as is. The problem arises when enums are used. That's
because unlike macros, enums will not be changed into their values by
the C pre-processor. Thus, the enum string is exported to the format
file, and this makes it useless for user space tools.

The TRACE_DEFINE_ENUM() solves this by converting the enum strings in
the TP_printk() format into their number, and that is what is shown to
user space. For example, the tracepoint tlb_flush currently has this
in its format file:

__print_symbolic(REC->reason,
{ TLB_FLUSH_ON_TASK_SWITCH, "flush on task switch" },
{ TLB_REMOTE_SHOOTDOWN, "remote shootdown" },
{ TLB_LOCAL_SHOOTDOWN, "local shootdown" },
{ TLB_LOCAL_MM_SHOOTDOWN, "local mm shootdown" })

After adding:

TRACE_DEFINE_ENUM(TLB_FLUSH_ON_TASK_SWITCH);
TRACE_DEFINE_ENUM(TLB_REMOTE_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_SHOOTDOWN);
TRACE_DEFINE_ENUM(TLB_LOCAL_MM_SHOOTDOWN);

Its format file will contain this:

__print_symbolic(REC->reason,
{ 0, "flush on task switch" },
{ 1, "remote shootdown" },
{ 2, "local shootdown" },
{ 3, "local mm shootdown" })"

* tag 'trace-v4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (27 commits)
tracing: Add enum_map file to show enums that have been mapped
writeback: Export enums used by tracepoint to user space
v4l: Export enums used by tracepoints to user space
SUNRPC: Export enums in tracepoints to user space
mm: tracing: Export enums in tracepoints to user space
irq/tracing: Export enums in tracepoints to user space
f2fs: Export the enums in the tracepoints to userspace
net/9p/tracing: Export enums in tracepoints to userspace
x86/tlb/trace: Export enums in used by tlb_flush tracepoint
tracing/samples: Update the trace-event-sample.h with TRACE_DEFINE_ENUM()
tracing: Allow for modules to convert their enums to values
tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values
tracing: Update trace-event-sample with TRACE_SYSTEM_VAR documentation
tracing: Give system name a pointer
brcmsmac: Move each system tracepoints to their own header
iwlwifi: Move each system tracepoints to their own header
mac80211: Move message tracepoints to their own header
tracing: Add TRACE_SYSTEM_VAR to xhci-hcd
tracing: Add TRACE_SYSTEM_VAR to kvm-s390
tracing: Add TRACE_SYSTEM_VAR to intel-sst
...

Linus Torvalds
2015-04-15 01:49:03 +0800
8de29a35d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid ... Browse Code »

Pull HID updates from Jiri Kosina:

- quite a few firmware fixes for RMI driver by Andrew Duggan

- huion and uclogic drivers have been substantially overlaping in
functionality laterly. This redundancy is fixed by hid-huion driver
being merged into hid-uclogic; work done by Benjamin Tissoires and
Nikolai Kondrashov

- i2c-hid now supports ACPI GPIO interrupts; patch from Mika Westerberg

- Some of the quirks, that got separated into individual drivers, have
historically had EXPERT dependency. As HID subsystem matured (as
well as the individual drivers), this made less and less sense. This
dependency is now being removed by patch from Jean Delvare

- Logitech lg4ff driver received a couple of improvements for mode
switching, by Michal Malý

- multitouch driver now supports clickpads, patches by Benjamin
Tissoires and Seth Forshee

- hid-sensor framework received a substantial update; namely support
for Custom and Generic pages is being added; work done by Srinivas
Pandruvada

- wacom driver received substantial update; it now supports
i2c-conntected devices (Mika Westerberg), Bamboo PADs are now
properly supported (Benjamin Tissoires), much improved battery
reporting (Jason Gerecke) and pen proximity cleanups (Ping Cheng)

- small assorted fixes and device ID additions

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (68 commits)
HID: sensor: Update document for custom sensor
HID: sensor: Custom and Generic sensor support
HID: debug: fix error handling in hid_debug_events_read()
Input - mt: Fix input_mt_get_slot_by_key
HID: logitech-hidpp: fix error return code
HID: wacom: Add support for Cintiq 13HD Touch
HID: logitech-hidpp: add a module parameter to keep firmware gestures
HID: usbhid: yet another mouse with ALWAYS_POLL
HID: usbhid: more mice with ALWAYS_POLL
HID: wacom: set stylus_in_proximity before checking touch_down
HID: wacom: use wacom_wac_finger_count_touches to set touch_down
HID: wacom: remove hardcoded WACOM_QUIRK_MULTI_INPUT
HID: pidff: effect can't be NULL
HID: add quirk for PIXART OEM mouse used by HP
HID: add HP OEM mouse to quirk ALWAYS_POLL
HID: wacom: ask for a in-prox report when it was missed
HID: hid-sensor-hub: Fix sparse warning
HID: hid-sensor-hub: fix attribute read for logical usage id
HID: plantronics: fix Kconfig default
HID: pidff: support more than one concurrent effect
...

Linus Torvalds
2015-04-15 00:25:26 +0800

14 Apr, 2015

1 commit

05f6d0252 Merge branches 'for-4.0/upstream-fixes', 'for-4.1/genius', 'for-4.1/huion-uclogi… ... Browse Code »

…c-merge', 'for-4.1/i2c-hid', 'for-4.1/kconfig-drop-expert-dependency', 'for-4.1/logitech', 'for-4.1/multitouch', 'for-4.1/rmi', 'for-4.1/sony', 'for-4.1/upstream' and 'for-4.1/wacom' into for-linus

Jiri Kosina
2015-04-14 05:41:15 +0800

08 Apr, 2015

1 commit

32eb3d0d0 tracing/samples: Update the trace-event-sample.h with TRACE_DEFINE_ENUM() ... Browse Code »

Document the use of TRACE_DEFINE_ENUM() by adding enums to the
trace-event-sample.h and using this macro to convert them in the format
files.

Also update the comments and sho the use of __print_symbolic() and
__print_flags() as well as adding comments abount __print_array().

Link: http://lkml.kernel.org/r/20150403013802.220157513@goodmis.org

Reviewed-by: Masami Hiramatsu
Tested-by: Masami Hiramatsu
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2015-04-08 21:39:58 +0800