11 Dec, 2018

1 commit

  • The trace_add/remove_event_call_nolock() functions were added to allow
    the trace_add/remove_event_call() code to be called when the event_mutex
    lock was already taken. Now that all callers run within the
    event_mutex, there's no reason to have two different interfaces.

    Remove the current wrapper trace_add/remove_event_call()s and rename the
    _nolock versions back to the original names.
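
    In effect this removes a locking wrapper; a minimal sketch of the
    pattern being dropped (the body here is illustrative, not the exact
    kernel code):

        /* Before: trace_add_event_call() just wrapped the _nolock version. */
        int trace_add_event_call(struct trace_event_call *call)
        {
                int ret;

                mutex_lock(&event_mutex);
                ret = trace_add_event_call_nolock(call);
                mutex_unlock(&event_mutex);
                return ret;
        }

    After the rename, trace_add_event_call() is the former _nolock body and
    callers are expected to already hold event_mutex.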

    Link: http://lkml.kernel.org/r/154140866955.17322.2081425494660638846.stgit@devbox

    Acked-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

09 Dec, 2018

1 commit

  • The synthetic event code uses synth_event_mutex to protect
    synth_event_list, and the event_trigger_write() path acquires locks in
    the order below.

    event_trigger_write(event_mutex)
      ->trigger_process_regex(trigger_cmd_mutex)
        ->event_hist_trigger_func(synth_event_mutex)

    On the other hand, the synthetic event creation and deletion paths
    call trace_add_event_call() and trace_remove_event_call(),
    which acquire event_mutex. In that case, if we keep
    synth_event_mutex locked while registering/unregistering synthetic
    events, the lock dependency is inverted.

    To avoid this issue, the current synthetic event code uses a two-phase
    process to create/delete events. For example, it searches existing
    events under synth_event_mutex to check for event-name conflicts,
    unlocks synth_event_mutex, then registers the new event with
    event_mutex locked. Finally, it locks synth_event_mutex and tries to
    add the new event to the list. But this adds complexity and a window
    for name conflicts.

    To solve this more simply, introduce trace_add_event_call_nolock()
    and trace_remove_event_call_nolock(), which don't acquire
    event_mutex internally. The synthetic event code can then lock
    event_mutex before synth_event_mutex, resolving the lock dependency
    issue.
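
    With the _nolock variants, the nesting becomes straightforward; a
    minimal sketch (the event variable and surrounding code are
    illustrative, not the exact kernel code):

        mutex_lock(&event_mutex);
        mutex_lock(&synth_event_mutex);
        /* Check for name conflicts and register in one critical section. */
        ret = trace_add_event_call_nolock(&event->call);
        mutex_unlock(&synth_event_mutex);
        mutex_unlock(&event_mutex);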

    Link: http://lkml.kernel.org/r/154140844377.17322.13781091165954002713.stgit@devbox

    Reviewed-by: Tom Zanussi
    Tested-by: Tom Zanussi
    Signed-off-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt (VMware)

    Masami Hiramatsu
     

11 Aug, 2018

1 commit

  • Now that some trace events can be protected by srcu_read_lock(tracepoint_srcu),
    we need to make sure all locations that depend on this are also protected.
    There were many places that did a synchronize_sched() thinking that it was
    enough to protect against access to trace events. This used to be the case,
    but now that we use SRCU for _rcuidle() trace events, they may not be
    protected by synchronize_sched(), as they may be called in paths that RCU
    is not watching, even when preemption is disabled.
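
    The shape of the fix, as a hedged sketch (the kernel's
    tracepoint_synchronize_unregister() helper pairs the two waits in
    roughly this way):

        /* Before: assumed sched-RCU protected all trace events. */
        synchronize_sched();

        /* After: also wait for SRCU readers of _rcuidle() trace events. */
        synchronize_srcu(&tracepoint_srcu);
        synchronize_sched();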

    Fixes: e6753f23d961d ("tracepoint: Make rcuidle tracepoint callers use SRCU")
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

03 Aug, 2018

1 commit

  • Since we switched to using SRCU for tracepoints used in the idle path,
    we can no longer use rcu_dereference_sched() for dereferencing pointers
    in trace-event hooks.

    Since tracepoints can now use either SRCU or sched-RCU, just use
    rcu_dereference_raw() for trace events, as we already do when
    dereferencing the tracepoint table.
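
    A hedged sketch of the change in the trace-event pid filtering path
    (the dereference follows the warning below; the exact lines may
    differ):

        /* Before: warns when reached from an SRCU-protected rcuidle path. */
        pid_list = rcu_dereference_sched(tr->filtered_pids);

        /* After: valid whether the tracepoint used SRCU or sched-RCU. */
        pid_list = rcu_dereference_raw(tr->filtered_pids);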

    This prevents an RCU warning reported by Masami:

    [ 282.060593] WARNING: can't dereference registers at 00000000f3c7f62b
    [ 282.063200] =============================
    [ 282.064082] WARNING: suspicious RCU usage
    [ 282.064963] 4.18.0-rc6+ #15 Tainted: G W
    [ 282.066048] -----------------------------
    [ 282.066923] /home/mhiramat/ksrc/linux/kernel/trace/trace_events.c:242
    suspicious rcu_dereference_check() usage!
    [ 282.068974]
    [ 282.068974] other info that might help us debug this:
    [ 282.068974]
    [ 282.070770]
    [ 282.070770] RCU used illegally from idle CPU!
    [ 282.070770] rcu_scheduler_active = 2, debug_locks = 1
    [ 282.072938] RCU used illegally from extended quiescent state!
    [ 282.074183] no locks held by swapper/0/0.
    [ 282.075071]
    [ 282.075071] stack backtrace:
    [ 282.076121] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W
    [ 282.077782] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
    [ 282.079604] Call Trace:
    [ 282.080212] <IRQ>
    [ 282.080755] dump_stack+0x85/0xcb
    [ 282.081523] trace_event_ignore_this_pid+0x66/0x70
    [ 282.082541] trace_event_raw_event_preemptirq_template+0xa2/0xb0
    [ 282.083774] ? interrupt_entry+0xc4/0xe0
    [ 282.084665] ? trace_hardirqs_off_thunk+0x1a/0x1c
    [ 282.085669] trace_hardirqs_off_caller+0x90/0xd0
    [ 282.086597] trace_hardirqs_off_thunk+0x1a/0x1c
    [ 282.087433] ? call_function_interrupt+0xa/0x20
    [ 282.088201] interrupt_entry+0xc4/0xe0
    [ 282.088848] ? call_function_interrupt+0xa/0x20
    [ 282.089579] </IRQ>
    [ 282.090029] ? native_safe_halt+0x2/0x10
    [ 282.090695] ? default_idle+0x1f/0x160
    [ 282.091330] ? default_idle_call+0x24/0x40
    [ 282.091997] ? do_idle+0x210/0x250
    [ 282.092658] ? cpu_startup_entry+0x6f/0x80
    [ 282.093338] ? start_kernel+0x49d/0x4bd
    [ 282.093987] ? secondary_startup_64+0xa5/0xb0

    Link: http://lkml.kernel.org/r/20180803023407.225852-1-joel@joelfernandes.org

    Reported-by: Masami Hiramatsu
    Tested-by: Masami Hiramatsu
    Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Steven Rostedt (VMware)

    Joel Fernandes (Google)
     

29 May, 2018

3 commits

  • The filter files of the ftrace internal events, like
    /sys/kernel/tracing/events/ftrace/function/filter, are not attached to any
    functionality. Do not create them, as they are meaningless.

    In the future, if an ftrace internal event gets filter functionality, then
    it will need to create it directly.

    Reviewed-by: Namhyung Kim
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • Instead of having both trace_init_tracefs() and event_trace_init() be called
    by fs_initcall() routines, have event_trace_init() called directly by
    trace_init_tracefs(). This will guarantee order of how the events are
    created with respect to the rest of the ftrace infrastructure. This is
    needed to be able to associate event files with ftrace internal events,
    such as the trace_marker.

    Reviewed-by: Namhyung Kim
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • By adding the function __find_event_file() that can search for files without
    restrictions, such as if the event associated with the file has a reg
    function, or if it has the "ignore" flag set, the files that are associated
    to ftrace internal events (like trace_marker and function events) can be
    found and used.

    find_event_file() still returns a "filtered" file, as most callers need a
    valid trace event file: one created by the trace_events.h macros, not one
    created for parsing ftrace-specific events.
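
    A hedged sketch of the relationship between the two (close to, but not
    necessarily identical to, the kernel code):

        struct trace_event_file *
        find_event_file(struct trace_array *tr, const char *system,
                        const char *event)
        {
                struct trace_event_file *file;

                file = __find_event_file(tr, system, event);
                if (!file || !file->event_call->class->reg ||
                    file->event_call->flags & TRACE_EVENT_FL_IGNORE_ENABLE)
                        return NULL;

                return file;
        }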

    Reviewed-by: Namhyung Kim
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

24 Jan, 2018

1 commit

  • Always mark the parsed string with a terminating nul '\0' character. This
    removes the need for users to append the '\0' themselves before using the
    parsed string.
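
    A hedged sketch of the idea (the parser fields are assumptions based on
    the tracing parser; the exact diff may differ):

        /* The parser terminates the token itself ... */
        parser->buffer[parser->idx] = '\0';

        /* ... so callers no longer need lines like this one: */
        /* buf[len] = '\0'; */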

    Link: http://lkml.kernel.org/r/1516093350-12045-4-git-send-email-changbin.du@intel.com

    Acked-by: Namhyung Kim
    Signed-off-by: Changbin Du
    Signed-off-by: Steven Rostedt (VMware)

    Changbin Du
     

19 Jan, 2018

1 commit

  • Since enums do not get converted by the TRACE_EVENT macro into their values,
    the event format displays the enum name and not the value. This breaks
    tools like perf and trace-cmd that need to interpret the raw binary data. To
    solve this, an enum map was created to convert these enums into their actual
    numbers on boot up. This is done by adding TRACE_DEFINE_ENUM() macros for
    the enums used in trace events.

    Some enums were not being converted. This was caused by an optimization
    that had a bug in it.

    All calls get checked against this enum map to see if they should be
    converted or not, and the check compares the call's system to the system
    that the enum map was created under. If they match, then the call is
    processed.

    To cut down on the number of iterations needed to find the maps with a
    matching system, since calls and maps are grouped by system, when a match is
    made, the index into the map array is saved, so that the next call, if it
    belongs to the same system as the previous call, could start right at that
    array index and not have to scan all the previous arrays.

    The problem was that the saved index was used as the variable to know if
    this is a call in a new system or not. If the index was zero, it was
    assumed that the call is in a new system and the code would keep
    incrementing the saved index until it found a matching system. The issue
    arises when the first matching system was at index zero. The next map, if
    it belonged to the same system, would then think it was the first match
    and increment the index to one. If the next call belongs to the same
    system, it would begin its search of the maps off by one, and miss the
    first enum that should be converted. This left a single enum not converted
    properly.
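
    A hedged sketch of the bug and the fix (variable names illustrative):

        /* Before: index 0 doubled as "no match saved yet". */
        if (!last_i)
                last_i = i;     /* never saved when the match is at i == 0 */

        /* After: track the first match of a system explicitly. */
        if (first) {
                last_i = i;
                first = false;
        }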

    Also add a comment to describe exactly what that index was for. It took me a
    bit too long to figure out what I was thinking when debugging this issue.

    Link: http://lkml.kernel.org/r/717BE572-2070-4C1E-9902-9F2E0FEDA4F8@oracle.com

    Cc: stable@vger.kernel.org
    Fixes: 0c564a538aa93 ("tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values")
    Reported-by: Chuck Lever
    Tested-by: Chuck Lever
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

04 Oct, 2017

1 commit

  • In order to make future changes where we need to call
    tracing_set_clock() from within an event command, the order of
    trace_types_lock and event_mutex must be reversed, as the event command
    will hold event_mutex and the trace_types_lock is taken from within
    tracing_set_clock().
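
    A minimal sketch of the resulting nesting (illustrative only):

        /* New order: event_mutex is taken before trace_types_lock. */
        mutex_lock(&event_mutex);
        mutex_lock(&trace_types_lock);
        /* ... e.g. tracing_set_clock() can now run here ... */
        mutex_unlock(&trace_types_lock);
        mutex_unlock(&event_mutex);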

    Link: http://lkml.kernel.org/r/20170921162249.0dde3dca@gandalf.local.home

    Requested-by: Tom Zanussi
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

06 Sep, 2017

1 commit

  • When disabling one trace event, the RECORDED_TGID flag in the event
    file is not correctly cleared: the code clears the RECORDED_CMD flag
    when it should clear the RECORDED_TGID flag.
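
    A hedged sketch of the one-line fix (the flag bit names are from the
    tracing code; surrounding context elided):

        /* Before (buggy): the wrong flag was cleared on disable. */
        clear_bit(EVENT_FILE_FL_RECORDED_CMD_BIT, &file->flags);

        /* After: clear the tgid-recording flag instead. */
        clear_bit(EVENT_FILE_FL_RECORDED_TGID_BIT, &file->flags);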

    Link: http://lkml.kernel.org/r/1504589806-8425-1-git-send-email-chuhu@redhat.com

    Cc: Joel Fernandes
    Cc: stable@vger.kernel.org
    Fixes: d914ba37d7 ("tracing: Add support for recording tgid of tasks")
    Signed-off-by: Chunyu Hu
    Signed-off-by: Steven Rostedt (VMware)

    Chunyu Hu
     

01 Sep, 2017

1 commit

  • Currently, when an event from a module has been enabled and that module is
    removed, all ring buffers are cleared. This is to prevent another module
    from being loaded and having one of its trace event IDs reuse a trace
    event ID of the removed module. That could cause undesirable effects, as
    the trace event of the new module would be using its own processing
    algorithms to process raw data of another event. To prevent this, when a
    module is removed, if any of its events have been used (signified by the
    WAS_ENABLED event call flag, which is never cleared), all ring buffers are
    cleared, just in case any one of them contains event data of the removed
    event.

    The problem is, there's no reason to clear all ring buffers if only one
    (or fewer than all of them) used the module's events. Instead, only clear
    the ring buffers that recorded the events of the module being removed.

    To do this, instead of keeping the WAS_ENABLED flag with the trace event
    call, move it to the per instance (per ring buffer) event file descriptor.
    The event file descriptor maps each event to a separate ring buffer
    instance. Then when the module is removed, only the ring buffers that
    activated one of the module's events get cleared. The rest are not touched.
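
    A hedged sketch of the flag move (the flag setting is simplified; the
    real code uses the event file's flag bits):

        /* Before: one flag on the event call, shared by all instances. */
        call->flags |= TRACE_EVENT_FL_WAS_ENABLED;

        /* After: one flag per instance's event file. */
        file->flags |= EVENT_FILE_FL_WAS_ENABLED;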

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

28 Jun, 2017

1 commit

  • In order to support recording of tgid, the following changes are made:

    * Introduce a new API (tracing_record_taskinfo) to additionally record the
    tgid along with the task's comm at the same time. This has the benefit of
    not setting trace_cmdline_save before all the information for a task is
    saved.
    * Add a new API (tracing_record_taskinfo_sched_switch) to record task
    information for two tasks at a time (previous and next) and use it from
    the sched_switch probe.
    * Preserve the old API (tracing_record_cmdline) and create it as a wrapper
    around the new one so that existing callers aren't affected.
    * Reuse the existing sched_switch and sched_wakeup probes to record tgid
    information and add a new option 'record-tgid' to enable recording of
    tgid.

    When the record-tgid option isn't enabled to begin with, we take care to
    make sure that there isn't memory or runtime overhead.
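
    A hedged sketch of the resulting API shape (signatures approximate):

        void tracing_record_taskinfo(struct task_struct *task, int flags);
        void tracing_record_taskinfo_sched_switch(struct task_struct *prev,
                                                  struct task_struct *next,
                                                  int flags);

        /* Old API preserved as a wrapper around the new one. */
        void tracing_record_cmdline(struct task_struct *task)
        {
                tracing_record_taskinfo(task, TRACE_RECORD_CMDLINE);
        }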

    Link: http://lkml.kernel.org/r/20170627020155.5139-1-joelaf@google.com

    Cc: kernel-team@android.com
    Cc: Ingo Molnar
    Tested-by: Michael Sartain
    Signed-off-by: Joel Fernandes
    Signed-off-by: Steven Rostedt (VMware)

    Joel Fernandes
     

21 Apr, 2017

8 commits

  • With the redesign of the registration and execution of the function probes
    (triggers), data can now be passed from the setup of the probe to the probe
    callers that are specific to the trace_array it is on. Although all probes
    still only affect the top-level trace array, this change will allow for
    instances to have their own probes separated from other instances and the
    top array.

    That is, something like the stacktrace probe can be set to trace only in an
    instance and not the top-level trace array. This isn't implemented yet, but
    this change lays the ground work for it.

    When a probe callback is triggered (someone writes the probe format into
    set_ftrace_filter), it calls register_ftrace_function_probe() passing in
    init_data that will be used to initialize the probe. Then for every matching
    function, register_ftrace_function_probe() will call the probe_ops->init()
    function with the init data that was passed to it, as well as an address to
    a place holder that is associated with the probe and the instance. The first
    occurrence will have a NULL in the pointer. The init() function will then
    initialize it. If other probes are added, or more functions are part of the
    probe, the place holder will be passed to the init() function along with
    the data it was initialized to the last time.

    Then this place_holder is passed to each of the other probe_ops functions,
    where it can be used in the function callback. When the probe_ops free()
    function is called, it can be called either with the rip of the function
    that is being removed from the probe, or zero, indicating that there are no
    more functions attached to the probe, and the place holder is about to be
    freed. This gives the probe_ops a way to free the data it assigned to the
    place holder if it was allocated during the first init call.
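
    A hedged sketch of the callback shape after the redesign (signatures
    approximate; see kernel/trace/trace.h for the real definitions):

        struct ftrace_probe_ops {
                void (*func)(unsigned long ip, unsigned long parent_ip,
                             struct trace_array *tr,
                             struct ftrace_probe_ops *ops, void *data);
                int  (*init)(struct ftrace_probe_ops *ops,
                             struct trace_array *tr, unsigned long ip,
                             void *init_data, void **data);
                void (*free)(struct ftrace_probe_ops *ops,
                             struct trace_array *tr, unsigned long ip,
                             void *data);
                int  (*print)(struct seq_file *m, unsigned long ip,
                              struct ftrace_probe_ops *ops, void *data);
        };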

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • In order to eventually have each trace_array instance have its own unique
    set of function probes (triggers), the trace array needs to hold the ops and
    the filters for the probes.

    This is the first step to accomplish this. Instead of having the private
    data of the probe ops point to the trace_array, create a separate list that
    the trace_array holds. There's only one private_data for a probe, but we need
    one per trace_array. The probe ftrace_ops will be dynamically created for
    each instance, instead of being static.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • Pass the trace_array associated with an ftrace_probe_ops into the probe_ops
    func(), init() and free() functions. The trace_array is the descriptor that
    describes a tracing instance. This will help create the infrastructure that
    will allow having function probes unique to tracing instances.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • Add a linked list to the trace_array to hold the func probes that are registered.
    Currently, all function probes are the same for all instances as it was
    before, that is, only the top level trace_array holds the function probes.
    But this lays the ground work to have function probes be attached to
    individual instances, and having the event trigger only affect events in the
    given instance. But that work is still to be done.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • Currently unregister_ftrace_function_probe_func() is a void function. It
    does not give any feedback if an error occurred or no item was found to
    remove and nothing was done.

    Change it to return a status, with success if it removed something. Also
    update the callers to return that feedback to the user.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • No users of the function probes use the data field anymore. Remove it, and
    change the init function to take a void *data parameter instead of a
    void **data, because the init will just get the data that was passed to
    the registering function, and there's no state after it is called.

    The other functions for ftrace_probe_ops still take the data parameter, but
    it will currently only be passed NULL. It will stay as a parameter for
    future data to be passed to these functions.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • In order to move the ops to the function probes directly, they need a way to
    map function ips to their own data without depending on the infrastructure
    of the function probes, as the data field will be going away.

    New helper functions are added that are based on the ftrace_hash code.
    ftrace_func_mapper functions are there to let the probes map ips to their
    data. These can be allocated by the probe ops, and referenced in the
    function callbacks.
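
    A hedged sketch of the helper API (names follow the ftrace_func_mapper
    family this work adds; signatures approximate):

        struct ftrace_func_mapper *mapper;
        void **val;

        mapper = allocate_ftrace_func_mapper();

        /* Associate probe-private data with a function ip ... */
        ftrace_func_mapper_add_ip(mapper, ip, data);

        /* ... and look it up again from the function callback. */
        val = ftrace_func_mapper_find_ip(mapper, ip);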

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • In preparation for cleaning up the probe function registration code, the
    "data" parameter will eventually be removed from the probe->func() call.
    Instead it will receive its own "ops" function, in which it can set up its
    own data that it needs to map.

    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     

24 Nov, 2016

1 commit

  • Currently, when tracepoint_printk is set (enabled by the "tp_printk" kernel
    command line), it causes trace events to print via printk(). This is a very
    dangerous operation, but is useful for debugging.

    The issue is, it's seldom used, but it is always checked even if it's not
    enabled by the kernel command line. Instead of having this feature invoked
    by a branch against a variable, turn that variable into a static key,
    which removes the test and jump.
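
    A hedged sketch of the static-key pattern (the key name is
    illustrative):

        DEFINE_STATIC_KEY_FALSE(tracepoint_printk_key);

        /* Patched to a no-op when tp_printk is off; no test and jump. */
        if (static_branch_unlikely(&tracepoint_printk_key))
                output_printk(fbuffer);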

    To simplify things, the functions output_printk() and
    trace_event_buffer_commit() were moved from trace_events.c to trace.c.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

23 Nov, 2016

1 commit

  • The creation of the set_event_pid file was assigned to a variable "entry"
    but that variable was never used. Ideally, it should be used to check if the
    file was created and warn if it was not.

    The header_page and header_event files should also be checked, with a
    warning issued if they fail to be created.

    The "enable" file was moved up, as it is a more crucial file to have and a
    hard failure (return -ENOMEM) should be returned if it is not created.

    Reported-by: David Binderman
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

19 May, 2016

1 commit

  • Pull tracing updates from Steven Rostedt:
    "This includes two new updates for the ftrace infrastructure.

    - With the changing of the code for filtering events by pid, from a
    list of pids to a bitmask, we can now easily implement following
    forks. The new tracing option "event-fork", when set, causes tasks
    with pids in set_event_pid to have their child pids added to
    set_event_pid when they fork, so the children are traced as well.

    Note, if "event-fork" is set and a task with its pid in
    set_event_pid exits, its pid will be removed from set_event_pid

    - The addition of Tom Zanussi's hist triggers. This includes very
    thorough documentation on how to use the hist triggers with events.
    This introduces a quick and easy way to get histogram data from
    events and their fields.

    Some other cleanups and updates were added as well. For example, Masami
    Hiramatsu added test cases for the event trigger and hist triggers, and
    I added a speed up of filtering by using a temp buffer when filters are
    set"

    * tag 'trace-v4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (45 commits)
    tracing: Use temp buffer when filtering events
    tracing: Remove TRACE_EVENT_FL_USE_CALL_FILTER logic
    tracing: Remove unused function trace_current_buffer_lock_reserve()
    tracing: Remove one use of trace_current_buffer_lock_reserve()
    tracing: Have trace_buffer_unlock_commit() call the _regs version with NULL
    tracing: Remove unused function trace_current_buffer_discard_commit()
    tracing: Move trace_buffer_unlock_commit{_regs}() to local header
    tracing: Fold filter_check_discard() into its only user
    tracing: Make filter_check_discard() local
    tracing: Move event_trigger_unlock_commit{_regs}() to local header
    tracing: Don't use the address of the buffer array name in copy_from_user
    tracing: Handle tracing_map_alloc_elts() error path correctly
    tracing: Add check for NULL event field when creating hist field
    tracing: checking for NULL instead of IS_ERR()
    tracing: Do not inherit event-fork option for instances
    tracing: Fix unsigned comparison to zero in hist trigger code
    kselftests/ftrace: Add a test for log2 modifier of hist trigger
    tracing: Add hist trigger 'log2' modifier
    kselftests/ftrace: Add hist trigger testcases
    kselftests/ftrace : Add event trigger testcases
    ...

    Linus Torvalds
     

04 May, 2016

2 commits

  • Filtering of events requires the data to be written to the ring buffer
    before it can be decided whether to filter it or not. This is because the
    parameters of
    the filter are based on the result that is written to the ring buffer and
    not on the parameters that are passed into the trace functions.

    The ftrace ring buffer is optimized for writing into the ring buffer and
    committing. The discard procedure, used when filtering decides the event
    should be discarded, is much more heavyweight. Thus, using a temporary
    buffer when filtering events can speed things up drastically.

    Without a temp buffer we have:

    # trace-cmd start -p nop
    # perf stat -r 10 hackbench 50
    0.790706626 seconds time elapsed ( +- 0.71% )

    # trace-cmd start -e all
    # perf stat -r 10 hackbench 50
    1.566904059 seconds time elapsed ( +- 0.27% )

    # trace-cmd start -e all -f 'common_preempt_count==20'
    # perf stat -r 10 hackbench 50
    1.690598511 seconds time elapsed ( +- 0.19% )

    # trace-cmd start -e all -f 'common_preempt_count!=20'
    # perf stat -r 10 hackbench 50
    1.707486364 seconds time elapsed ( +- 0.30% )

    The first run above is without any tracing, just to get a baseline figure.
    hackbench takes ~0.79 seconds to run on the system.

    The second run enables tracing all events where nothing is filtered. This
    increases the time by 100% and hackbench takes 1.57 seconds to run.

    The third run filters all events where the preempt count will equal "20"
    (this should never happen) thus all events are discarded. This takes 1.69
    seconds to run. This is 10% slower than just committing the events!

    The last run enables all events and filters where the filter will commit all
    events, and this takes 1.70 seconds to run. The filtering overhead is
    approximately 10%. Thus, the discard and commit of an event from the ring
    buffer may be about the same time.

    With this patch, the numbers change:

    # trace-cmd start -p nop
    # perf stat -r 10 hackbench 50
    0.778233033 seconds time elapsed ( +- 0.38% )

    # trace-cmd start -e all
    # perf stat -r 10 hackbench 50
    1.582102692 seconds time elapsed ( +- 0.28% )

    # trace-cmd start -e all -f 'common_preempt_count==20'
    # perf stat -r 10 hackbench 50
    1.309230710 seconds time elapsed ( +- 0.22% )

    # trace-cmd start -e all -f 'common_preempt_count!=20'
    # perf stat -r 10 hackbench 50
    1.786001924 seconds time elapsed ( +- 0.20% )

    The first run is again the base with no tracing.

    The second run is all tracing with no filtering. It is a little slower, but
    that may be well within the noise.

    The third run shows that discarding all events only took 1.3 seconds. This
    is a speed up of 23%! The discard is much faster than even the commit.

    The one downside is shown in the last run. Events that are not discarded
    by the filter will take longer to add; this is due to the extra copy of
    the event.

    Cc: Alexei Starovoitov
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Currently, register functions for events are called
    through the 'reg' field of the event class directly,
    without any check, when setting up triggers.

    Triggers for events that don't support registration through
    debugfs (events under events/ftrace are for trace-cmd to
    read event formats, and most of them don't have a register
    function except events/ftrace/functionx) can't be enabled
    at all, and an oops is hit when setting up a trigger for
    those events, so just not creating them is an easy way to
    avoid the oops.
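
    A hedged sketch of the guard (condition approximate):

        /*
         * Only create a "trigger" file for events that have a register
         * function and are not flagged to be ignored.
         */
        if (call->class->reg &&
            !(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE))
                trace_create_file("trigger", 0644, file->dir, file,
                                  &event_trigger_fops);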

    Link: http://lkml.kernel.org/r/1462275274-3911-1-git-send-email-chuhu@redhat.com

    Cc: stable@vger.kernel.org # 3.14+
    Fixes: 85f2b08268c01 ("tracing: Add basic event trigger framework")
    Signed-off-by: Chunyu Hu
    Signed-off-by: Steven Rostedt

    Chunyu Hu
     

20 Apr, 2016

1 commit

  • 'hist' triggers allow users to continually aggregate trace events,
    which can then be viewed afterwards by simply reading a 'hist' file
    containing the aggregation in a human-readable format.

    The basic idea is very simple and boils down to a mechanism whereby
    trace events, rather than being exhaustively dumped in raw form and
    viewed directly, are automatically 'compressed' into meaningful tables
    completely defined by the user.

    This is done strictly via single-line command-line commands and
    without the aid of any kind of programming language or interpreter.

    A surprising number of typical use cases can be accomplished by users
    via this simple mechanism. In fact, a large number of the tasks that
    users typically do using the more complicated script-based tracing
    tools, at least during the initial stages of an investigation, can be
    accomplished by simply specifying a set of keys and values to be used
    in the creation of a hash table.

    The Linux kernel trace event subsystem happens to provide an extensive
    list of keys and values ready-made for such a purpose in the form of
    the event format files associated with each trace event. By simply
    consulting the format file for field names of interest and by plugging
    them into the hist trigger command, users can create an endless number
    of useful aggregations to help with investigating various properties
    of the system. See Documentation/trace/events.txt for examples.

    hist triggers are implemented on top of the existing event trigger
    infrastructure, and as such are consistent with the existing triggers
    from a user's perspective as well.

    The basic syntax follows the existing trigger syntax. Users start an
    aggregation by writing a 'hist' trigger to the event of interest's
    trigger file:

    # echo hist:keys=xxx [ if filter] > event/trigger

    Once a hist trigger has been set up, by default it continually
    aggregates every matching event into a hash table using the event key
    and a value field named 'hitcount'.

    To view the aggregation at any point in time, simply read the 'hist'
    file in the same directory as the 'trigger' file:

    # cat event/hist

    The detailed syntax provides additional options for user control, and
    is described exhaustively in Documentation/trace/events.txt and in the
    virtual tracing/README file in the tracing subsystem.

    Link: http://lkml.kernel.org/r/72d263b5e1853fe9c314953b65833c3aa75479f2.1457029949.git.tom.zanussi@linux.intel.com

    Signed-off-by: Tom Zanussi
    Tested-by: Masami Hiramatsu
    Reviewed-by: Namhyung Kim
    Signed-off-by: Steven Rostedt

    Tom Zanussi