05 Apr, 2014

1 commit


03 Apr, 2014

3 commits

  • Pull networking updates from David Miller:
    "Here is my initial pull request for the networking subsystem during
    this merge window:

    1) Support for ESN in AH (RFC 4302) from Fan Du.

    2) Add full kernel doc for ethtool command structures, from Ben
    Hutchings.

    3) Add BCM7xxx PHY driver, from Florian Fainelli.

    4) Export computed TCP rate information in netlink socket dumps, from
    Eric Dumazet.

    5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
    Dichtel.

    6) Convert many drivers to pci_enable_msix_range(), from Alexander
    Gordeev.

    7) Record SKB timestamps more efficiently, from Eric Dumazet.

    8) Switch to microsecond resolution for TCP round trip times, also
    from Eric Dumazet.

    9) Clean up and fix 6lowpan fragmentation handling by making use of
    the existing inet_frag API for its implementation.

    10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.

    11) Auto size SKB lengths when composing netlink messages based upon
    past message sizes used, from Eric Dumazet.

    12) qdisc dumps can take a long time, add a cond_resched(), from Eric
    Dumazet.

    13) Sanitize netpoll core and drivers wrt. SKB handling semantics.
    Get rid of never-used-in-tree netpoll RX handling. From Eric W
    Biederman.

    14) Support inter-address-family and namespace changing in VTI tunnel
    driver(s). From Steffen Klassert.

    15) Add Altera TSE driver, from Vince Bridgers.

    16) Optimizing csum_replace2() so that it doesn't adjust the checksum
    by checksumming the entire header, from Eric Dumazet.

    17) Expand BPF internal implementation for faster interpreting, more
    direct translations into JIT'd code, and much cleaner uses of BPF
    filtering in non-socket contexts. From Daniel Borkmann and Alexei
    Starovoitov"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
    netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
    net: Add a test to see if a skb is freeable in irq context
    qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
    net: ptp: move PTP classifier in its own file
    net: sxgbe: make "core_ops" static
    net: sxgbe: fix logical vs bitwise operation
    net: sxgbe: sxgbe_mdio_register() frees the bus
    Call efx_set_channels() before efx->type->dimension_resources()
    xen-netback: disable rogue vif in kthread context
    net/mlx4: Set proper build dependancy with vxlan
    be2net: fix build dependency on VxLAN
    mac802154: make csma/cca parameters per-wpan
    mac802154: allow only one WPAN to be up at any given time
    net: filter: minor: fix kdoc in __sk_run_filter
    netlink: don't compare the nul-termination in nla_strcmp
    can: c_can: Avoid led toggling for every packet.
    can: c_can: Simplify TX interrupt cleanup
    can: c_can: Store dlc private
    can: c_can: Reduce register access
    can: c_can: Make the code readable
    ...

    Linus Torvalds
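
Item 16 in the pull message above refers to adjusting a checksum when a single 16-bit field changes, rather than re-summing the whole header. The following is a minimal userspace sketch of that idea using the incremental-update formula from RFC 1624 (eq. 3). The helper names fold16, checksum_full and checksum_replace16 are illustrative assumptions and are not the kernel's csum_replace2() implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Full recomputation: complement of the ones' complement sum of all words. */
static uint16_t checksum_full(const uint16_t *words, size_t n)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return (uint16_t)~fold16(sum);
}

/* Incremental update, RFC 1624 eq. 3: HC' = ~(~HC + ~m + m'). */
static uint16_t checksum_replace16(uint16_t check, uint16_t old_word,
                                   uint16_t new_word)
{
    uint32_t sum = (uint16_t)~check;

    sum += (uint16_t)~old_word;
    sum += new_word;
    return (uint16_t)~fold16(sum);
}
```

For a header with at least one non-zero word, checksum_replace16() applied to the old checksum should agree with checksum_full() over the modified words, while touching only three values instead of the whole header.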
     
  • Pull virtio updates from Rusty Russell:
    "Nothing exciting: virtio-blk users might see a bit of a boost from the
    doubling of the default queue length though"

    * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    virtio-blk: base queue-depth on virtqueue ringsize or module param
    Revert a02bbb1ccfe8: MAINTAINERS: add virtio-dev ML for virtio
    virtio: fail adding buffer on broken queues.
    virtio-rng: don't crash if virtqueue is broken.
    virtio_balloon: don't crash if virtqueue is broken.
    virtio_blk: don't crash, report error if virtqueue is broken.
    virtio_net: don't crash if virtqueue is broken.
    virtio_balloon: don't softlockup on huge balloon changes.
    virtio: Use pci_enable_msix_exact() instead of pci_enable_msix()
    MAINTAINERS: virtio-dev is subscribers only
    tools/virtio: add a missing )
    tools/virtio: fix missing kmemleak_ignore symbol
    tools/virtio: update internal copies of headers

    Linus Torvalds
     
  • Pull main powerpc updates from Ben Herrenschmidt:
    "This time around, the powerpc merges are going to be a little bit more
    complicated than usual.

    This is the main pull request with most of the work for this merge
    window. I will describe it a bit more further down.

    There is some additional cpuidle driver work, however I haven't
    included it in this tree as it depends on some work in tip/timer-core
    which Thomas accidentally forgot to put in a topic branch. Since I
    didn't want to carry all of that tip timer stuff in powerpc -next, I
    set up a separate branch on top of Thomas's tree with just that cpuidle
    driver in it, and Stephen has been carrying that in next separately
    for a while now. I'll send a separate pull request for it.

    Additionally, two new pieces in this tree add users for a sysfs API
    that Tejun and Greg have been deprecating in drivers-core-next.
    Thankfully Greg reverted the patch that removes the old API so this
    merge can happen cleanly, but once merged, I will send a patch
    adjusting our new code to the new API so that Greg can send you the
    removal patch.

    Now as for the content of this branch, we have a lot of perf work for
    power8 new counters including support for our new "nest" counters
    (also called 24x7) under pHyp (not natively yet).

    We have new functionality when running under the OPAL firmware
    (non-virtualized or KVM host), such as access to the firmware error
    logs and service processor dumps, system parameters and sensors, along
    with a hwmon driver for the latter.

    There's also a bunch of bug fixes across the board, some LE fixes,
    and a nice set of selftests for validating our various types of copy
    loops.

    On the Freescale side, we see mostly new chip/board revisions, some
    clock updates, better support for machine checks and debug exceptions,
    etc..."

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (70 commits)
    powerpc/book3s: Fix CFAR clobbering issue in machine check handler.
    powerpc/compat: 32-bit little endian machine name is ppcle, not ppc
    powerpc/le: Big endian arguments for ppc_rtas()
    powerpc: Use default set of netfilter modules (CONFIG_NETFILTER_ADVANCED=n)
    powerpc/defconfigs: Enable THP in pseries defconfig
    powerpc/mm: Make sure a local_irq_disable prevent a parallel THP split
    powerpc: Rate-limit users spamming kernel log buffer
    powerpc/perf: Fix handling of L3 events with bank == 1
    powerpc/perf/hv_{gpci, 24x7}: Add documentation of device attributes
    powerpc/perf: Add kconfig option for hypervisor provided counters
    powerpc/perf: Add support for the hv 24x7 interface
    powerpc/perf: Add support for the hv gpci (get performance counter info) interface
    powerpc/perf: Add macros for defining event fields & formats
    powerpc/perf: Add a shared interface to get gpci version and capabilities
    powerpc/perf: Add 24x7 interface headers
    powerpc/perf: Add hv_gpci interface header
    powerpc: Add hvcalls for 24x7 and gpci (Get Performance Counter Info)
    sysfs: create bin_attributes under the requested group
    powerpc/perf: Enable BHRB access for EBB events
    powerpc/perf: Add BHRB constraint and IFM MMCRA handling for EBB
    ...

    Linus Torvalds
     

02 Apr, 2014

1 commit

  • Pull char/misc driver patches from Greg KH:
    "Here's the big char/misc driver updates for 3.15-rc1.

    Lots of various things here, including the new mcb driver subsystem.

    All of these have been in linux-next for a while"

    * tag 'char-misc-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (118 commits)
    extcon: Move OF helper function to extcon core and change function name
    extcon: of: Remove unnecessary function call by using the name of device_node
    extcon: gpio: Use SIMPLE_DEV_PM_OPS macro
    extcon: palmas: Use SIMPLE_DEV_PM_OPS macro
    mei: don't use deprecated DEFINE_PCI_DEVICE_TABLE macro
    mei: amthif: fix checkpatch error
    mei: client.h fix checkpatch errors
    mei: use cl_dbg where appropriate
    mei: fix Unnecessary space after function pointer name
    mei: report consistently copy_from/to_user failures
    mei: drop pr_fmt macros
    mei: make me hw headers private to me hw.
    mei: fix memory leak of pending write cb objects
    mei: me: do not reset when less than expected data is received
    drivers: mcb: Fix build error discovered by 0-day bot
    cs5535-mfgpt: Simplify dependencies
    spmi: pm: drop bus-level PM suspend/resume routines
    spmi: pmic_arb: make selectable on ARCH_QCOM
    Drivers: hv: vmbus: Increase the limit on the number of pfns we can handle
    pch_phub: Report error writing MAC back to user
    ...

    Linus Torvalds
     

01 Apr, 2014

2 commits

  • Pull perf changes from Ingo Molnar:
    "Main changes:

    Kernel side changes:

    - Add SNB/IVB/HSW client uncore memory controller support (Stephane
    Eranian)

    - Fix various x86/P4 PMU driver bugs (Don Zickus)

    Tooling, user visible changes:

    - Add several futex 'perf bench' microbenchmarks (Davidlohr Bueso)

    - Speed up thread map generation (Don Zickus)

    - Introduce 'perf kvm --list-cmds' command line option for use by
    scripts (Ramkumar Ramachandra)

    - Print the evsel name in the annotate stdio output, prep to fix
    support outputting annotation for multiple events, not just for the
    first one (Arnaldo Carvalho de Melo)

    - Allow setting preferred callchain method in .perfconfig (Jiri Olsa)

    - Show in what binaries/modules 'perf probe's are set (Masami
    Hiramatsu)

    - Support distro-style debuginfo for uprobe in 'perf probe' (Masami
    Hiramatsu)

    Tooling, internal changes and fixes:

    - Use tid in mmap/mmap2 events to find maps (Don Zickus)

    - Record the reason for filtering an address_location (Namhyung Kim)

    - Apply all filters to an addr_location (Namhyung Kim)

    - Merge al->filtered with hist_entry->filtered in report/hists
    (Namhyung Kim)

    - Fix memory leak when synthesizing thread records (Namhyung Kim)

    - Use ui__has_annotation() in 'report' (Namhyung Kim)

    - hists browser refactorings to reuse code across UIs (Namhyung Kim)

    - Add support for the new DWARF unwinder library in elfutils (Jiri
    Olsa)

    - Fix build race in the generation of bison files (Jiri Olsa)

    - Further streamline the feature detection display, trimming it a bit
    to show just the libraries detected, using VF=1 gets a more verbose
    output, showing the less interesting feature checks as well (Jiri
    Olsa).

    - Check compatible symtab type before loading dso (Namhyung Kim)

    - Check return value of filename__read_debuglink() (Stephane Eranian)

    - Move some hashing and fs related code from tools/perf/util/ to
    tools/lib/ so that it can be used by more tools/ living utilities
    (Borislav Petkov)

    - Prepare DWARF unwinding code for using an elfutils alternative
    unwinding library (Jiri Olsa)

    - Fix DWARF unwind max_stack processing (Jiri Olsa)

    - Add dwarf unwind 'perf test' entry (Jiri Olsa)

    - 'perf probe' improvements including memory leak fixes, sharing the
    intlist class with other tools, uprobes/kprobes code sharing and
    use of ref_reloc_sym (Masami Hiramatsu)

    - Shorten sample symbol resolving by adding cpumode to struct
    addr_location (Arnaldo Carvalho de Melo)

    - Fix synthesizing mmaps for threads (Don Zickus)

    - Fix invalid output on event group stdio report (Namhyung Kim)

    - Fixup header alignment in 'perf sched latency' output (Ramkumar
    Ramachandra)

    - Fix off-by-one error in 'perf timechart record' argv handling
    (Ramkumar Ramachandra)

    Tooling, cleanups:

    - Remove unused thread__find_map function (Jiri Olsa)

    - Remove unused simple_strtoul() function (Ramkumar Ramachandra)

    Tooling, documentation updates:

    - Update function names in debug messages (Ramkumar Ramachandra)

    - Update some code references in design.txt (Ramkumar Ramachandra)

    - Clarify load-latency information in the 'perf mem' docs (Andi
    Kleen)

    - Clarify x86 register naming in 'perf probe' docs (Andi Kleen)"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (96 commits)
    perf tools: Remove unused simple_strtoul() function
    perf tools: Update some code references in design.txt
    perf evsel: Update function names in debug messages
    perf tools: Remove thread__find_map function
    perf annotate: Print the evsel name in the stdio output
    perf report: Use ui__has_annotation()
    perf tools: Fix memory leak when synthesizing thread records
    perf tools: Use tid in mmap/mmap2 events to find maps
    perf report: Merge al->filtered with hist_entry->filtered
    perf symbols: Apply all filters to an addr_location
    perf symbols: Record the reason for filtering an address_location
    perf sched: Fixup header alignment in 'latency' output
    perf timechart: Fix off-by-one error in 'record' argv handling
    perf machine: Factor machine__find_thread to take tid argument
    perf tools: Speed up thread map generation
    perf kvm: introduce --list-cmds for use by scripts
    perf ui hists: Pass evsel to hpp->header/width functions explicitly
    perf symbols: Introduce thread__find_cpumode_addr_location
    perf session: Change header.misc dump from decimal to hex
    perf ui/tui: Reuse generic __hpp__fmt() code
    ...

    Linus Torvalds
     
  • Pull RCU updates from Ingo Molnar:
    "Main changes:

    - Torture-test changes, including refactoring of rcutorture and
    introduction of a vestigial locktorture.

    - Real-time latency fixes.

    - Documentation updates.

    - Miscellaneous fixes"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (77 commits)
    rcu: Provide grace-period piggybacking API
    rcu: Ensure kernel/rcu/rcu.h can be sourced/used stand-alone
    rcu: Fix sparse warning for rcu_expedited from kernel/ksysfs.c
    notifier: Substitute rcu_access_pointer() for rcu_dereference_raw()
    Documentation/memory-barriers.txt: Clarify release/acquire ordering
    rcutorture: Save kvm.sh output to log
    rcutorture: Add a lock_busted to test the test
    rcutorture: Place kvm-test-1-run.sh output into res directory
    rcutorture: Rename TREE_RCU-Kconfig.txt
    locktorture: Add kvm-recheck.sh plug-in for locktorture
    rcutorture: Gracefully handle NULL cleanup hooks
    locktorture: Add vestigial locktorture configuration
    rcutorture: Introduce "rcu" directory level underneath configs
    rcutorture: Rename kvm-test-1-rcu.sh
    rcutorture: Remove RCU dependencies from ver_functions.sh API
    rcutorture: Create CFcommon file for common Kconfig parameters
    rcutorture: Create config files for scripted test-the-test testing
    rcutorture: Add an rcu_busted to test the test
    locktorture: Add a lock-torture kernel module
    rcutorture: Abstract kvm-recheck.sh
    ...

    Linus Torvalds
     

26 Mar, 2014

1 commit


19 Mar, 2014

13 commits

  • Moreover, the corresponding function in include/linux/kernel.h is marked
    obsolete.

    Signed-off-by: Ramkumar Ramachandra
    Cc: David Ahern
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1395176715-4465-1-git-send-email-artagnon@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ramkumar Ramachandra
     
  • Update the names of some functions and enums in design.txt. The document
    still has some stale information, but the motivation behind this patch
    is to allow a developer to quickly grep and learn about the associated
    structures.

    Signed-off-by: Ramkumar Ramachandra
    Cc: David Ahern
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1395169804-1293-1-git-send-email-artagnon@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ramkumar Ramachandra
     
  • perf_event_open() was renamed to sys_perf_event_open(); update the debug
    messages to reflect this.

    Signed-off-by: Ramkumar Ramachandra
    Cc: David Ahern
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1395169842-1399-1-git-send-email-artagnon@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ramkumar Ramachandra
     
  • Because it's not used any more.

    Signed-off-by: Jiri Olsa
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Don Zickus
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1395154016-26709-3-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • So that when showing multiple events annotations, we can figure out
    which is which:

    # perf record -a -e instructions,cycles sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.826 MB perf.data (~36078 samples) ]
    # perf evlist
    instructions
    cycles
    # perf annotate intel_idle 2> /dev/null | head -1
    Percent | Source code & Disassembly of vmlinux for instructions
    #

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-n1r51l329434js84qtb2c6l9@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Since we introduced the ui__has_annotation() for that, don't open code
    it.

    Signed-off-by: Namhyung Kim
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1395124359-11744-2-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Checking default guest machine should be done before allocating event
    structures otherwise it'll leak memory.

    Signed-off-by: Namhyung Kim
    Cc: Don Zickus
    Cc: Jiri Olsa
    Cc: Joe Mario
    Link: http://lkml.kernel.org/r/87ob15tx6a.fsf@sejong.aot.lge.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Now that we can properly synthesize threads system-wide, make sure the
    mmap and mmap2 events use tids instead of pids to locate their maps.

    Signed-off-by: Don Zickus
    Cc: Jiri Olsa
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1393429527-167840-3-git-send-email-dzickus@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Don Zickus
     
  • I.e. don't drop al->filtered entries, create the hist_entries and use
    its ->filtered bitmap, that is kept with the same semantics for its
    bitmap, leaving the filtering to be done at the hist_entry level, i.e.
    in the UIs.

    This will allow zooming in/out the filters.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-xeyhkepu7plw716lrtb0zlnu@git.kernel.org
    [ yanked this out of a previous patch ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Instead of bailing out as soon as we find a filter that applies, go on
    checking all of them so that we can zoom in/out filters.

    We also need to make sure we only update al->filtered after
    thread__find_addr_map(), because there is where al->filtered gets
    initialized to zero.

    This will increase the cost of processing when we don't need this
    toggling, but will provide flexibility for the TUI and GTK+ interfaces,
    which will then need to create the hist_entries just once.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-fhv9lhzdjxgp9w3w3668lsfw@git.kernel.org
    [ yanked this out of a previous patch ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • By turning the addr_location->filtered member from a boolean to a u8
    bitmap, reusing (and extending) the hist_filter enum for that.

    This patch doesn't change the logic at all, as it keeps the meaning of
    al->filtered !0 to mean that the entry _was_ filtered, so no change in
    how this value is interpreted needs to be done at this point.

    This will soon be used in upcoming patches.

    Signed-off-by: Namhyung Kim
    Acked-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-89hmfgtr9t22sky1lyg7nw7l@git.kernel.org
    [ yanked this out of a previous patch ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
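
The entry above turns a boolean "filtered" flag into a bitmap of reasons, and the follow-up change checks every filter instead of stopping at the first match. A minimal sketch of that pattern is shown here. The enum values, struct layouts and function names are hypothetical and chosen for illustration; they are not perf's actual hist_filter or addr_location definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical filter reasons; each one maps to a bit position. */
enum filter_reason {
    FILTER_DSO,
    FILTER_THREAD,
};

struct sample_location {
    const char *dso;
    int tid;
    uint8_t filtered;   /* bitmap: bit N set means reason N applied */
};

struct filters {
    const char *dso;    /* NULL means "no DSO filter" */
    int tid;            /* -1 means "no thread filter" */
};

/* Check every filter and record all reasons, instead of returning early. */
static uint8_t apply_filters(const struct sample_location *loc,
                             const struct filters *f)
{
    uint8_t filtered = 0;

    if (f->dso && strcmp(loc->dso, f->dso) != 0)
        filtered |= 1u << FILTER_DSO;
    if (f->tid != -1 && loc->tid != f->tid)
        filtered |= 1u << FILTER_THREAD;
    return filtered;
}

/* Old boolean meaning is preserved: any non-zero value means "filtered". */
static bool is_filtered(uint8_t filtered)
{
    return filtered != 0;
}
```

Because each reason keeps its own bit, a UI could later clear one filter and re-test the remaining bits without re-processing the samples.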
     
  • Before:

    ---------------------------------------------------------------------------------------------------------------
    Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at |
    ---------------------------------------------------------------------------------------------------------------
    ... | | | | |
    git:24540 | 336.622 ms | 10 | avg: 0.032 ms | max: 0.062 ms | max at: 115610.111046 s
    git:24541 | 0.457 ms | 1 | avg: 0.000 ms | max: 0.000 ms | max at: 0.000000 s
    -----------------------------------------------------------------------------------------
    TOTAL: | 396.542 ms | 353 |
    ---------------------------------------------------

    After:

    -----------------------------------------------------------------------------------------------------------------
    Task | Runtime ms | Switches | Average delay ms | Maximum delay ms | Maximum delay at |
    -----------------------------------------------------------------------------------------------------------------
    ... | | | | |
    git:24540 | 336.622 ms | 10 | avg: 0.032 ms | max: 0.062 ms | max at: 115610.111046 s
    git:24541 | 0.457 ms | 1 | avg: 0.000 ms | max: 0.000 ms | max at: 0.000000 s
    -----------------------------------------------------------------------------------------------------------------
    TOTAL: | 396.542 ms | 353 |
    ---------------------------------------------------

    Signed-off-by: Ramkumar Ramachandra
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1395065901-25740-1-git-send-email-artagnon@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ramkumar Ramachandra
     
  • Since 367b315 (perf timechart: Add support for -P and -T in timechart
    recording, 2013-11-01), the 'perf timechart record' command stopped
    working:

    $ perf timechart record -- git status
    Workload failed: No such file or directory

    This happens because of an off-by-one error while preparing the argv for
    cmd_record(): it attempts to execute the command 'status' and complains
    that it doesn't exist. Fix this error.

    Signed-off-by: Ramkumar Ramachandra
    Acked-by: Stanislav Fomichev
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Stanislav Fomichev
    Link: http://lkml.kernel.org/r/1394985965-2332-1-git-send-email-artagnon@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ramkumar Ramachandra
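
The off-by-one described in the entry above is of the kind that appears when a fixed argument list and the user's workload are concatenated into a new argv. The sketch below shows one plausible shape of the corrected logic. The function build_record_argv() and its parameters are assumptions made for illustration and are not taken from the perf source.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Build "fixed arguments followed by the user's workload" as a
 * NULL-terminated vector. Returns NULL on allocation failure.
 * The caller frees the returned vector (not the strings).
 */
static const char **build_record_argv(const char *const *fixed, int nfixed,
                                      const char *const *workload,
                                      int nworkload)
{
    const char **out = calloc((size_t)nfixed + (size_t)nworkload + 1,
                              sizeof(*out));
    const char **p = out;

    if (!out)
        return NULL;
    for (int i = 0; i < nfixed; i++)
        *p++ = fixed[i];
    /* Start at index 0: starting at 1 would drop the program name. */
    for (int i = 0; i < nworkload; i++)
        *p++ = workload[i];
    return out;         /* calloc() already left the last slot NULL */
}
```

With a workload of { "git", "status" }, a copy loop that started at index 1 would leave "status" as the command to execute, which matches the failure reported in the commit message.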
     

18 Mar, 2014

1 commit

  • …it/acme/linux into perf/urgent

    Pull two 'perf bench' fixes from Arnaldo:

    * Make 'perf bench mem' (i.e. no args) mean 'run all tests' so that we can run
    all tests, not stopping at the numa ones. (Arnaldo Carvalho de Melo)

    * Fix NULL pointer dereference after last test in "perf bench all" (Patrick Palka)

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

17 Mar, 2014

1 commit


15 Mar, 2014

12 commits

  • Conflicts:
    drivers/net/usb/r8152.c
    drivers/net/xen-netback/netback.c

    Both the r8152 and netback conflicts were simple overlapping
    changes.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Forcing the code to always search thread by pid/tid pair.

    The PID value will be needed in future to determine the process thread
    leader for map groups sharing.

    Signed-off-by: Jiri Olsa
    Acked-by: Adrian Hunter
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Don Zickus
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1394805606-25883-3-git-send-email-jolsa@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • When trying to capture perf data on a system running SPECjbb2013, perf
    hung for about 15 minutes. This is because it took that long to gather
    about 10,000 thread maps and process them.

    I don't think a user wants to wait that long.

    Instead, recognize that thread maps are roughly equivalent to pid maps
    and just quickly copy those instead.

    To do this, I synthesize 'fork' events, which eventually call
    thread__fork() and copy the maps over.

    The overhead goes from 15 minutes down to a few seconds.

    --
    V2: based on Jiri's comments, moved malloc up a level
    and made sure the memory was freed

    Signed-off-by: Don Zickus
    Acked-by: Jiri Olsa
    Cc: Jiri Olsa
    Cc: Joe Mario
    Link: http://lkml.kernel.org/r/1394808224-113774-1-git-send-email-dzickus@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Don Zickus
     
  • Introduce

    $ perf kvm --list-cmds

    to dump a raw list of commands for use by the completion script. In
    order to do this, introduce parse_options_subcommand() for handling
    subcommands as a special case in the parse-options machinery.

    Signed-off-by: Ramkumar Ramachandra
    Acked-by: David Ahern
    Acked-by: Jiri Olsa
    Cc: David Ahern
    Link: http://lkml.kernel.org/r/1393896396-10427-1-git-send-email-artagnon@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ramkumar Ramachandra
     
  • Those functions need evsel to investigate event group and it's passed
    via hpp->ptr. However, as that can easily be missed, it's better to
    pass it via an argument IMHO.

    Signed-off-by: Namhyung Kim
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1394437440-11609-2-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • It's one level up from thread__find_addr_location, where it will look in
    different domains for a sample: user, kernel, hypervisor, etc.

    Will soon be used by a patchkit by Andi Kleen.

    Cc: Adrian Hunter
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Mike Galbraith
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/n/tip-so6nxkh7xj48bc5kq4jpj991@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • When printing the raw dump of a data file, the header.misc is
    printed as a decimal. Unfortunately, that field is a bit mask, so
    it is hard to interpret as a decimal.

    Print in hex, so the user can easily see what bits are set and more
    importantly what type of info it is conveying.

    V2: add 0x in front per Jiri Olsa

    Signed-off-by: Don Zickus
    Cc: Jiri Olsa
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1393386227-149412-3-git-send-email-dzickus@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Don Zickus
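
As a small illustration of why hex helps for a bit-mask field like the one in the entry above, the sketch below formats a 16-bit flags value with a leading 0x. The function name format_misc() and the sample value are hypothetical; this is not the perf dump code.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a 16-bit flag field as hex with a leading 0x. */
static int format_misc(char *buf, size_t len, uint16_t misc)
{
    return snprintf(buf, len, "0x%x", (unsigned int)misc);
}
```

For a value with bits 14 and 1 set, the hex form 0x4002 shows both bits directly, whereas the decimal form 16386 does not.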
     
  • The __hpp__color_fmt used in the TUI code can be replaced by the generic
    code with a small change in the print_fn callback. This also requires moving
    the callback function to the generic __hpp__fmt().

    No functional changes intended.

    Signed-off-by: Namhyung Kim
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1393809254-4480-5-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • Instead of the pointer to the buffer and its size, so that it can also get
    a private argument passed along with hpp.

    This is preparation for further changes.

    Signed-off-by: Namhyung Kim
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1393809254-4480-4-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • The __hpp__color_fmt used in the GTK code can be replaced by the generic
    code with a small change in the print_fn callback.

    This is preparation for upcoming changes; no functional changes are
    intended.

    Signed-off-by: Namhyung Kim
    Acked-by: Pekka Enberg
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Pekka Enberg
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1393809254-4480-3-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
     
  • When some group members have 0 overhead, the previous percentage was
    printed instead of 0.00%. This happened because an integer 0 was passed as
    the percentage rather than a double 0.0, so the remaining bits came from
    garbage. The TUI and GTK don't have this problem since they pass 0.0.

    Before:

    # Samples: 845 of event 'anon group { cycles, cache-references, cache-misses }'
    # Event count (approx.): 174775051
    #
    # Overhead Samples
    # ........................ ....................................
    #
    20.32% 8.58% 73.51% 45 30 138
    6.87% 6.87% 6.87% 21 0 0
    5.29% 0.31% 0.31% 10 1 0
    1.89% 1.89% 1.89% 6 0 0
    1.76% 1.76% 1.76% 2 0 0

    After:

    # Overhead Samples
    # ........................ ....................................
    #
    20.32% 8.58% 73.51% 45 30 138
    6.87% 0.00% 0.00% 21 0 0
    5.29% 0.31% 0.00% 10 1 0
    1.89% 0.00% 0.00% 6 0 0
    1.76% 0.00% 0.00% 2 0 0

    Signed-off-by: Namhyung Kim
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Ingo Molnar
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1393809254-4480-2-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Namhyung Kim
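
The bug class in the entry above comes from C's variadic calling convention: a %f conversion reads a double, so the caller must pass a double. The sketch below shows the correct form under that assumption. The helpers format_percent() and format_zero_percent() are hypothetical stand-ins and not the perf stdio code.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical printf-style helper that forwards to vsnprintf(). */
static int format_percent(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int ret;

    va_start(ap, fmt);
    ret = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return ret;
}

/*
 * Pass the double literal 0.0 for a %f conversion. Passing the int 0
 * through "..." would be undefined behaviour, and in practice the callee
 * may read stale bits left over from an earlier call.
 */
static int format_zero_percent(char *buf, size_t len)
{
    return format_percent(buf, len, "%.2f%%", 0.0);
}
```

The compiler cannot convert the argument automatically here because the parameter type is hidden behind the ellipsis, which is why the mistake goes unnoticed unless format checking is enabled for the helper.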
     
  • The for_each_bench() macro must check that the "benchmarks" field of a
    collection is not NULL before dereferencing it because the "all"
    collection in particular has a NULL "benchmarks" field (signifying that
    it has no benchmarks to iterate over).

    This fixes this NULL pointer dereference when running "perf bench all":

    [root@ssdandy ~]# perf bench all

    # Running mem/memset benchmark...
    # Copying 1MB Bytes ...

    2.453675 GB/Sec
    12.056327 GB/Sec (with prefault)

    Segmentation fault (core dumped)
    [root@ssdandy ~]#

    Signed-off-by: Patrick Palka
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1394664051-6037-1-git-send-email-patrick@parcs.ath.cx
    Signed-off-by: Arnaldo Carvalho de Melo

    Patrick Palka
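
A minimal sketch of the guarded iteration described in the entry above follows. The struct layouts are simplified assumptions (a table terminated by an entry with a NULL name), and the macro is written for illustration rather than copied from the perf source.

```c
#include <stddef.h>

struct bench {
    const char *name;           /* NULL name terminates the table */
};

struct collection {
    const char *name;
    struct bench *benchmarks;   /* may be NULL, e.g. for an "all" entry */
};

/* Stop immediately when the collection has no benchmark table at all. */
#define for_each_bench(coll, b) \
    for ((b) = (coll)->benchmarks; (b) && (b)->name; (b)++)

static int count_benches(struct collection *coll)
{
    struct bench *b;
    int n = 0;

    for_each_bench(coll, b)
        n++;
    return n;
}
```

Without the `(b) &&` test, the loop condition would dereference a NULL pointer for a collection whose benchmarks field is NULL.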
     

14 Mar, 2014

5 commits

  • Currently if a process creates a bunch of threads using pthread_create
    and then perf is run in system_wide mode, the mmaps for those threads
    are not captured with a synthesized mmap event.

    The reason is those threads are not visible when walking the /proc/
    directory looking for /proc/<pid>/maps files. Instead they are
    discovered using the /proc/<pid>/task directory (which the synthesized comm
    event uses).

    This causes problems when a program is trying to map a data address to a
    tid. Because the tid has no maps, the event is dropped. Changing the
    program to look up using the pid instead of the tid, finds the correct
    maps but creates ugly hacks in the program to carry the correct tid
    around.

    Fix this by moving the walking of /proc/<pid>/task up a level (out
    of the comm function) based on Arnaldo's suggestion.

    Tweaked things a bit to special case the 'full' bit and 'guest' check.

    Signed-off-by: Don Zickus
    Cc: Jiri Olsa
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1393429527-167840-2-git-send-email-dzickus@redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Don Zickus
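
To illustrate where the extra threads in the entry above are found, the sketch below counts the thread IDs listed in a process's task directory under procfs on Linux. The function count_threads() is a hypothetical helper, not part of perf, and it assumes /proc is mounted.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Count thread IDs under /proc/<pid>/task; returns -1 if unreadable. */
static int count_threads(pid_t pid)
{
    char path[64];
    struct dirent *ent;
    DIR *dir;
    int n = 0;

    snprintf(path, sizeof(path), "/proc/%d/task", (int)pid);
    dir = opendir(path);
    if (!dir)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        char *end;

        /* Only purely numeric names are thread IDs; skip "." and "..". */
        (void)strtol(ent->d_name, &end, 10);
        if (end != ent->d_name && *end == '\0')
            n++;
    }
    closedir(dir);
    return n;
}
```

Calling count_threads(getpid()) from a single-threaded program should report at least one entry, the thread group leader itself. Threads created with pthread_create appear here even though they have no top-level /proc entry in a directory listing.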
     
  • Clarify how to specify x86 registers in perf probe. I recently ran into
    this problem and had to figure it out from the source.

    Signed-off-by: Andi Kleen
    Acked-by: Masami Hiramatsu
    Cc: Masami Hiramatsu
    Link: http://lkml.kernel.org/r/1393596135-4227-3-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • Clarify in the documentation that 'perf mem report' reports use-latency,
    not load/store-latency on Intel systems.

    This often causes confusion with users.

    Signed-off-by: Andi Kleen
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1393596135-4227-2-git-send-email-andi@firstfloor.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Andi Kleen
     
  • Block a bunch of threads on a futex and requeue them on another, N at a
    time.

    This program is particularly useful to measure the latency of nthread
    requeues without waking up any tasks -- thus mimicking a regular
    futex_wait.

    An example run:

    $ perf bench futex requeue -r 100 -t 64
    Run summary [PID 151011]: Requeuing 64 threads (from 0x7d15c4 to 0x7d15c8), 1 at a time.

    [Run 1]: Requeued 64 of 64 threads in 0.0400 ms
    [Run 2]: Requeued 64 of 64 threads in 0.0390 ms
    [Run 3]: Requeued 64 of 64 threads in 0.0400 ms
    ...
    [Run 100]: Requeued 64 of 64 threads in 0.0390 ms
    Requeued 64 of 64 threads in 0.0399 ms (+-0.37%)

    Signed-off-by: Davidlohr Bueso
    Acked-by: Darren Hart
    Cc: Aswin Chandramouleeswaran
    Cc: Darren Hart
    Cc: Ingo Molnar
    Cc: Jason Low
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/1387081917-9102-4-git-send-email-davidlohr@hp.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Davidlohr Bueso
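
The requeue operation measured by the benchmark above maps onto the FUTEX_CMP_REQUEUE operation of the Linux futex system call. The wrapper below is a minimal sketch of that call. The name futex_cmp_requeue() and its parameter order are assumptions for illustration and do not reproduce the helper used by perf bench.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Wake up to nr_wake waiters on uaddr and move up to nr_requeue of the
 * remaining waiters to uaddr2, provided *uaddr still equals expected.
 * Returns the number of tasks woken plus requeued, or -1 with errno set
 * (EAGAIN if the value check fails).
 */
static long futex_cmp_requeue(uint32_t *uaddr, uint32_t expected,
                              uint32_t *uaddr2, int nr_wake, int nr_requeue)
{
    /* The requeue limit travels in the argument slot used for the timeout. */
    return syscall(SYS_futex, uaddr, FUTEX_CMP_REQUEUE, nr_wake,
                   (long)nr_requeue, uaddr2, expected);
}
```

Passing nr_wake as 0 and a small nr_requeue moves waiters between the two futexes without waking any of them, which is the situation the commit message says the benchmark is meant to time.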
     
  • Block a bunch of threads on a futex and wake them up, N at a time.

    This program is particularly useful to measure the latency of nthread
    wakeups in non-error situations: all waiters are queued and all wake
    calls wake up one or more tasks.

    An example run:

    $ perf bench futex wake -t 512 -r 100
    Run summary [PID 27823]: blocking on 512 threads (at futex 0x7e10d4), waking up 1 at a time.

    [Run 1]: Wokeup 512 of 512 threads in 6.0080 ms
    [Run 2]: Wokeup 512 of 512 threads in 5.2280 ms
    [Run 3]: Wokeup 512 of 512 threads in 4.8300 ms
    ...
    [Run 100]: Wokeup 512 of 512 threads in 5.0100 ms
    Wokeup 512 of 512 threads in 5.0109 ms (+-2.25%)

    Signed-off-by: Davidlohr Bueso
    Acked-by: Darren Hart
    Cc: Aswin Chandramouleeswaran
    Cc: Darren Hart
    Cc: Ingo Molnar
    Cc: Jason Low
    Cc: Peter Zijlstra
    Cc: Scott J Norton
    Cc: Thomas Gleixner
    Cc: Waiman Long
    Link: http://lkml.kernel.org/r/1387081917-9102-3-git-send-email-davidlohr@hp.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Davidlohr Bueso
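
Waking blocked threads "N at a time", as in the benchmark above, corresponds to repeated FUTEX_WAKE calls on Linux. The sketch below shows that loop in isolation. The names futex_wake() and wake_in_batches() are hypothetical and the code is not the perf bench implementation.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Wake up to nr_wake waiters on uaddr; returns how many were woken. */
static long futex_wake(uint32_t *uaddr, int nr_wake)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAKE, nr_wake, NULL, NULL, 0);
}

/* Issue `rounds` wake calls of `batch` waiters each; returns total woken. */
static long wake_in_batches(uint32_t *uaddr, int batch, int rounds)
{
    long total = 0;

    for (int i = 0; i < rounds; i++) {
        long n = futex_wake(uaddr, batch);

        if (n < 0)
            return -1;
        total += n;
    }
    return total;
}
```

A real measurement would first block worker threads on the futex word with FUTEX_WAIT and time the loop; with no waiters present, each call simply reports zero tasks woken.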