23 Sep, 2010

1 commit


14 May, 2010

1 commit

  • This patch adds data to be passed to tracepoint callbacks.

    The created functions from DECLARE_TRACE() now need a mandatory data
    parameter. For example:

    DECLARE_TRACE(mytracepoint, int value, value)

    Will create the register function:

    int register_trace_mytracepoint((void(*)(void *data, int value))probe,
    void *data);

    All callbacks (probes) must now take a (void *data) parameter as
    their first argument. So a callback for the above tracepoint will
    look like:

    void myprobe(void *data, int value)
    {
    }

    The callback may choose to ignore the data parameter.

    This change allows callbacks to register a private data pointer along
    with the function probe.

    void mycallback(void *data, int value);

    register_trace_mytracepoint(mycallback, mydata);

    mycallback() will then receive "mydata" as its first parameter,
    before the tracepoint arguments.

    A more detailed example:

    DECLARE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status));

    /* In the C file */

    DEFINE_TRACE(mytracepoint, TP_PROTO(int status), TP_ARGS(status));

    [...]

    trace_mytracepoint(status);

    /* In a file registering this tracepoint */

    int my_callback(void *data, int status)
    {
    struct my_struct *my_data = data;
    [...]
    }

    [...]
    my_data = kmalloc(sizeof(*my_data), GFP_KERNEL);
    init_my_data(my_data);
    register_trace_mytracepoint(my_callback, my_data);

    The same callback can also be registered to the same tracepoint as long
    as the data registered is different. Note, the data must also be used
    to unregister the callback:

    unregister_trace_mytracepoint(my_callback, my_data);
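
    Continuing the example, here is a minimal sketch of a module that
    registers the same probe twice with two private contexts and tears
    both down with the matching data pointer (struct my_struct and
    my_callback() are the ones sketched above; everything else here is
    illustrative):

    #include <linux/module.h>
    #include <linux/slab.h>

    static struct my_struct *session_a, *session_b;

    static int __init my_mod_init(void)
    {
        session_a = kzalloc(sizeof(*session_a), GFP_KERNEL);
        session_b = kzalloc(sizeof(*session_b), GFP_KERNEL);
        if (!session_a || !session_b) {
            kfree(session_a);
            kfree(session_b);
            return -ENOMEM;
        }

        /* same probe, two independent private contexts
         * (error handling of register_* elided for brevity) */
        register_trace_mytracepoint(my_callback, session_a);
        register_trace_mytracepoint(my_callback, session_b);
        return 0;
    }

    static void __exit my_mod_exit(void)
    {
        /* the data pointer selects which registration is removed */
        unregister_trace_mytracepoint(my_callback, session_a);
        unregister_trace_mytracepoint(my_callback, session_b);
        kfree(session_a);
        kfree(session_b);
    }

    module_init(my_mod_init);
    module_exit(my_mod_exit);
    MODULE_LICENSE("GPL");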

    Because of the data parameter, tracepoints declared this way cannot
    be declared without arguments. That is:

    DECLARE_TRACE(mytracepoint, TP_PROTO(void), TP_ARGS());

    will cause an error.

    If no arguments are needed, a new macro can be used instead:

    DECLARE_TRACE_NOARGS(mytracepoint);

    Since there are no arguments, the proto and args fields are left out.
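
    A minimal sketch of the no-argument form, assuming the macro behaves
    as described above (the probe and tracepoint names are illustrative):

    #include <linux/kernel.h>
    #include <linux/tracepoint.h>

    DECLARE_TRACE_NOARGS(mytracepoint_noargs);

    static void my_noargs_probe(void *data)
    {
        struct my_struct *my_data = data;

        /* no tracepoint arguments, only the private data pointer */
        pr_debug("noargs tracepoint fired, private data %p\n", my_data);
    }

    /*
     * Registration and unregistration are unchanged:
     *   register_trace_mytracepoint_noargs(my_noargs_probe, my_data);
     *   unregister_trace_mytracepoint_noargs(my_noargs_probe, my_data);
     */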

    This is part of a series to make the tracepoint footprint smaller:

       text    data     bss     dec     hex filename
    4913961 1088356  861512 6863829  68bbd5 vmlinux.orig
    4914025 1088868  861512 6864405  68be15 vmlinux.class
    4918492 1084612  861512 6864616  68bee8 vmlinux.tracepoint

    Again, this patch also increases the size of the kernel, but lays
    the groundwork for decreasing it.

    v5: Fixed net/core/drop_monitor.c to handle these updates.

    v4: Moved the DECLARE_TRACE() and DECLARE_TRACE_NOARGS() definitions
    out of the #ifdef CONFIG_TRACEPOINTS, since the two are the same in
    both cases. It is __DECLARE_TRACE() that changes.
    Thanks to Frederic Weisbecker for pointing this out.

    v3: Made all register_* functions require data to be passed and
    all callbacks take a void * parameter as their first argument.
    This makes the calling functions comply with C standards.

    Also added more comments to the modifications of DECLARE_TRACE().

    v2: Made the DECLARE_TRACE() have the ability to pass arguments
    and added a new DECLARE_TRACE_NOARGS() for tracepoints that
    do not need any arguments.

    Acked-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Acked-by: Frederic Weisbecker
    Cc: Neil Horman
    Cc: David S. Miller
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming their availability. As
    this conversion needs to touch a large number of source files, the
    following script is used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following:

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    include gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It is put in the include block which contains
    core kernel includes, in the same order that the rest are ordered:
    alphabetical, Christmas tree, reverse-Christmas-tree, or at the end
    if there doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints
    out an error message indicating which .h file needs to be added to
    the file.
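
    For illustration, the kind of edit the sweep produces in a typical
    .c file looks like this (a hypothetical hunk, not taken from the
    patch):

    /*
     * A file that calls kmalloc()/kfree() used to build only because
     * slab.h and gfp.h arrived implicitly through percpu.h; after the
     * sweep it names that dependency itself.
     */
    #include <linux/sched.h>    /* already there, pulls in percpu.h */
    #include <linux/slab.h>     /* added by the sweep: kmalloc(), kfree() */

    static int *alloc_counter(void)
    {
        /* GFP_KERNEL comes from gfp.h, which slab.h includes */
        return kmalloc(sizeof(int), GFP_KERNEL);
    }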

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition, and for others adding it to an
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files, but without automatically
    editing them, as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored, as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each slab.h
    inclusion directive was examined and added manually as necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and
    failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests
    (as my distributed build env didn't work with gcov compiles) and a
    few more options had to be turned off depending on the arch to make
    things build (like ipr on powerpc/64, which failed due to missing
    writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the
    arch headers, which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

10 Jul, 2009

1 commit

  • The stat entries can be freed while the stat file is being read.
    Worse, the pointer can be freed immediately after it is returned
    from workqueue_stat_start/next().

    Add a refcnt to struct cpu_workqueue_stats to avoid use-after-free.
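
    A hedged sketch of the idea, using a kref for brevity where the
    actual patch may use a plain counter (field and helper names are
    illustrative):

    #include <linux/kernel.h>
    #include <linux/kref.h>
    #include <linux/list.h>
    #include <linux/slab.h>
    #include <linux/types.h>

    struct cpu_workqueue_stats {
        struct list_head    list;
        struct kref         kref;   /* pins the entry while a reader holds it */
        int                 cpu;
        pid_t               pid;
        unsigned int        inserted;
        unsigned int        executed;
    };

    static void cpu_workqueue_stat_release(struct kref *kref)
    {
        kfree(container_of(kref, struct cpu_workqueue_stats, kref));
    }

    /*
     * stat_start()/stat_next() take a reference before handing an entry
     * to the reader; the reader drops it when done and the removal path
     * drops the list's own reference, so the last user frees the entry.
     */
    static void put_cpu_workqueue_stat(struct cpu_workqueue_stats *cws)
    {
        kref_put(&cws->kref, cpu_workqueue_stat_release);
    }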

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Li Zefan
    Acked-by: Frederic Weisbecker
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     

02 Jun, 2009

4 commits

  • The blank line between each cpu's workqueue stats is not necessary,
    because the cpu number is enough to tell them apart by eye.
    The old style also caused a blank line below the headline, and made
    the code more complex by using a lock, irq disabling and get_cpu_var().

    Old style:
    # CPU  INSERTED  EXECUTED    NAME
    # |       |         |          |

      0      8644      8644       events/0
      0         0         0       cpuset
      ...
      0         1         1       kdmflush

      1     35365     35365       events/1
      ...

    New style:
    # CPU  INSERTED  EXECUTED    NAME
    # |       |         |          |

      0      8644      8644       events/0
      0         0         0       cpuset
      ...
      0         1         1       kdmflush
      1     35365     35365       events/1
      ...

    [ Impact: provide more readable code ]

    Signed-off-by: Zhao Lei
    Cc: KOSAKI Motohiro
    Cc: Steven Rostedt
    Cc: Tom Zanussi
    Cc: Oleg Nesterov
    Cc: Andrew Morton
    Signed-off-by: Frederic Weisbecker

    Zhaolei
     
  • cpu_workqueue_stats->first_entry is useless because we can retrieve the
    header of a cpu workqueue using:
    if (&cpu_workqueue_stats->list == workqueue_cpu_stat(cpu)->list.next)

    [ Impact: cleanup ]

    Signed-off-by: Zhao Lei
    Cc: Steven Rostedt
    Cc: Tom Zanussi
    Cc: Oleg Nesterov
    Cc: Andrew Morton
    Signed-off-by: Frederic Weisbecker

    Zhaolei
     
  • No need to use list_for_each_entry_safe() in an iteration that does
    not delete any node; we can use list_for_each_entry() instead (see
    the sketch below).
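
    A sketch of the difference on the stats list used by this tracer
    (the struct is abridged and the counting helper is hypothetical):

    #include <linux/list.h>

    struct cpu_workqueue_stats {        /* abridged */
        struct list_head list;
        unsigned int inserted;
        unsigned int executed;
    };

    /*
     * list_for_each_entry_safe() keeps a spare cursor so the current
     * node may be freed during the walk; for a read-only walk the plain
     * iterator is enough and reads more clearly.
     */
    static unsigned long count_workqueue_stats(struct list_head *head)
    {
        struct cpu_workqueue_stats *node;
        unsigned long n = 0;

        list_for_each_entry(node, head, list)
            n++;

        return n;
    }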

    [ Impact: cleanup ]

    Signed-off-by: Zhao Lei
    Cc: Steven Rostedt
    Cc: Tom Zanussi
    Cc: Oleg Nesterov
    Cc: Andrew Morton
    Signed-off-by: Frederic Weisbecker

    Zhaolei
     
  • v3: zhaolei@cn.fujitsu.com: Change TRACE_EVENT definition to new format
    introduced by Steven Rostedt: consolidate trace and trace_event headers
    v2: kosaki@jp.fujitsu.com: print the function names instead of addr, and zap
    the work addr
    v1: zhaolei@cn.fujitsu.com: Make workqueue tracepoints use TRACE_EVENT macro

    TRACE_EVENT is a more generic way to define tracepoints.
    Doing so adds these new capabilities to the tracepoints:

    - zero-copy and per-cpu splice() tracing
    - binary tracing without printf overhead
    - structured logging records exposed under /debug/tracing/events
    - trace events embedded in function tracer output and other plugins
    - user-defined, per tracepoint filter expressions

    Then, this patch converts DEFINE_TRACE to TRACE_EVENT in
    workqueue-related tracepoints.
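
    A hedged sketch of what such a conversion looks like; the event
    below mirrors a workqueue insertion event of that era, but the exact
    names and fields in the patch may differ:

    /* normally placed under include/trace/events/ and pulled in
     * through the define_trace.h machinery */
    #include <linux/sched.h>
    #include <linux/tracepoint.h>
    #include <linux/workqueue.h>

    TRACE_EVENT(workqueue_insertion,

        TP_PROTO(struct task_struct *wq_thread, struct work_struct *work),

        TP_ARGS(wq_thread, work),

        TP_STRUCT__entry(
            __array(char,           thread_comm,    TASK_COMM_LEN)
            __field(pid_t,          thread_pid)
            __field(work_func_t,    func)
        ),

        TP_fast_assign(
            memcpy(__entry->thread_comm, wq_thread->comm, TASK_COMM_LEN);
            __entry->thread_pid = wq_thread->pid;
            __entry->func       = work->func;
        ),

        TP_printk("thread=%s:%d func=%pf", __entry->thread_comm,
                  __entry->thread_pid, __entry->func)
    );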

    [ Impact: expand workqueue tracer to events tracing ]

    Signed-off-by: Zhao Lei
    Cc: Steven Rostedt
    Cc: Tom Zanussi
    Cc: Oleg Nesterov
    Cc: Andrew Morton
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Frederic Weisbecker

    Zhaolei
     

07 Apr, 2009

1 commit


26 Mar, 2009

1 commit

  • Empty lines separate the per-cpu stats. After the previous fix
    (trace_stat: keep original order) was applied, the empty lines are
    displayed at the wrong position.

    Signed-off-by: Lai Jiangshan
    Acked-by: Steven Rostedt
    Acked-by: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     

25 Mar, 2009

1 commit

  • Currently, if a trace_stat user wants a handle to some private data,
    the trace_stat infrastructure does not supply a way to do that.

    This patch passes the trace_stat structure to the start function of
    the trace_stat code.
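
    A minimal sketch of how a trace_stat user can exploit this, assuming
    the start callback now receives the struct tracer_stat pointer (the
    session structure is illustrative):

    #include <linux/kernel.h>
    #include "trace_stat.h"     /* struct tracer_stat, kernel/trace/ */

    struct my_stat_session {
        struct tracer_stat  stat;       /* registered with the trace_stat core */
        void                *private;   /* per-session state */
    };

    static void *my_stat_start(struct tracer_stat *trace)
    {
        struct my_stat_session *session =
            container_of(trace, struct my_stat_session, stat);

        /* use session->private to produce the first entry */
        return session->private;
    }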

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

13 Mar, 2009

1 commit


11 Mar, 2009

2 commits


10 Mar, 2009

1 commit


09 Mar, 2009

1 commit

  • Impact: improve workqueue tracer output

    Currently, /sys/kernel/debug/tracing/trace_stat/workqueues can display
    wrong and strange thread names.

    Why?

    Currently, ftrace has the tracing_record_cmdline()/trace_find_cmdline()
    convenience functions that implement a task->comm string cache.

    This avoids unnecessary memcpy overhead, and the workqueue tracer
    uses it.

    However, in general, any trace statistics feature shouldn't use
    tracing_record_cmdline(), because trace statistics can display very
    old processes; the comm cache can then return the wrong string
    because a recent process has overwritten the cache entry.

    Fortunately, the workqueue tracer guarantees that the displayed
    processes are live. Thus we can look up the comm string from the PID
    at display time (a sketch of such a lookup follows the two listings
    below).

    % cat workqueues
    # CPU  INSERTED  EXECUTED    NAME
    # |       |         |          |

      7    431913    431913       kondemand/7
      7         0         0       tail
      7        21        21       git
      7         0         0       ls
      7         9         9       cat
      7    832632    832632       unix_chkpwd
      7    236292    236292       ls
    Note: tail, git, ls, cat and unix_chkpwd are obviously not workqueue
    threads.

    % cat workqueues
    # CPU  INSERTED  EXECUTED    NAME
    # |       |         |          |

      7       510       510       kondemand/7
      7         0         0       kmpathd/7
      7        15        15       ata/7
      7         0         0       aio/7
      7        11        11       kblockd/7
      7      1063      1063       work_on_cpu/7
      7       167       167       events/7
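
    A hedged sketch of the display-time lookup (the helper name and the
    fallback string are illustrative):

    #include <linux/rcupdate.h>
    #include <linux/sched.h>
    #include <linux/string.h>

    /*
     * Resolve a PID to a live task and copy its current comm. The
     * workqueue tracer only lists live threads, so the lookup should
     * normally succeed; the unlocked read of ->comm is good enough for
     * a statistics display.
     */
    static void workqueue_stat_comm(pid_t pid, char comm[TASK_COMM_LEN])
    {
        struct task_struct *tsk;

        rcu_read_lock();
        tsk = find_task_by_vpid(pid);
        if (tsk)
            strncpy(comm, tsk->comm, TASK_COMM_LEN);
        else
            strncpy(comm, "<...>", TASK_COMM_LEN);
        rcu_read_unlock();
        comm[TASK_COMM_LEN - 1] = '\0';
    }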

    Signed-off-by: KOSAKI Motohiro
    Cc: Lai Jiangshan
    Cc: Steven Rostedt
    Cc: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    KOSAKI Motohiro
     

20 Jan, 2009

1 commit

  • Impact: use percpu data instead of a global structure

    Use:

    static DEFINE_PER_CPU(struct workqueue_global_stats, all_workqueue_stat);

    instead of allocating a global structure.

    percpu data also works well on NUMA.

    Signed-off-by: Lai Jiangshan
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Lai Jiangshan
     

14 Jan, 2009

1 commit

  • Impact: new tracer

    The workqueue tracer provides some statistical information about
    each cpu workqueue thread, such as the number of works inserted and
    executed since its creation. It can help to evaluate the amount of
    work each of them has to perform. For example, it can help a
    developer decide whether to choose a per-cpu workqueue instead of a
    singlethreaded one.

    It only traces statistical information for now, but it will probably
    provide event tracing later as well.

    Such a tracer could also help with, and be improved for, the
    development of rt-priority-sorted workqueues.

    To have a snapshot of the workqueues state at any time, just do

    cat /debugfs/tracing/trace_stat/workqueues

    For example:

      1     125     125   reiserfs/1
      1       0       0   scsi_tgtd/1
      1       0       0   aio/1
      1       0       0   ata/1
      1     114     114   kblockd/1
      1       0       0   kintegrityd/1
      1    2147    2147   events/1

      0       0       0   kpsmoused
      0     105     105   reiserfs/0
      0       0       0   scsi_tgtd/0
      0       0       0   aio/0
      0       0       0   ata_aux
      0       0       0   ata/0
      0       0       0   cqueue
      0       0       0   kacpi_notify
      0       0       0   kacpid
      0     149     149   kblockd/0
      0       0       0   kintegrityd/0
      0    1000    1000   khelper
      0    2270    2270   events/0
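
    The counters above are maintained by probes on the insertion and
    execution paths; a minimal sketch of that bookkeeping follows (the
    struct layout and the stat_for_thread() lookup helper are
    illustrative, not the exact code):

    #include <linux/list.h>
    #include <linux/sched.h>
    #include <linux/workqueue.h>

    struct cpu_workqueue_stats {
        struct list_head    list;       /* one node per workqueue thread */
        pid_t               pid;        /* the per-cpu workqueue thread */
        unsigned int        inserted;   /* works queued to this thread */
        unsigned int        executed;   /* works actually run */
    };

    /* hypothetical lookup from thread to its stats node */
    static struct cpu_workqueue_stats *stat_for_thread(struct task_struct *wq_thread);

    /* probe attached to the workqueue insertion tracepoint */
    static void probe_workqueue_insertion(struct task_struct *wq_thread,
                                          struct work_struct *work)
    {
        struct cpu_workqueue_stats *node = stat_for_thread(wq_thread);

        if (node)
            node->inserted++;   /* the execution probe bumps ->executed */
    }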

    Changes in V2:

    _ Drop the static array based on NR_CPUS and dynamically allocate the
    stat array with num_possible_cpus() and other cpu mask facilities....
    _ Trace workqueue insertion at a slightly lower level (insert_work
    instead of queue_work) to handle even the workqueue barriers.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker