27 Jul, 2011

1 commit

  • This allows us to move duplicated code in <asm/atomic.h>
    (atomic_inc_not_zero() for now) to <linux/atomic.h>.

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

30 Mar, 2010

1 commit

  • …implicit slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree, or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition, while for others adding it to an
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them, as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored, as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on the arch to make
    things build (like ipr on powerpc/64, which failed due to missing
    writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given that I had only a couple of failures from the build tests in
    step 7, I'm fairly confident about the coverage of this conversion
    patch. If there is a breakage, it's likely to be something in one of
    the arch headers, which should be easily discoverable on most builds
    of the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

13 Sep, 2009

1 commit


05 Sep, 2009

1 commit

  • The latency tracers (irqsoff and wakeup) can swap trace buffers
    on the fly. If an event is happening and has reserved data on one of
    the buffers, and the latency tracer swaps the global buffer with the
    max buffer, the result is that the event may commit the data to the
    wrong buffer.

    This patch changes the trace recording API to receive the buffer
    that was used to reserve a commit. This buffer can then be passed
    in to the commit.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

07 Apr, 2009

1 commit


20 Mar, 2009

1 commit


13 Mar, 2009

1 commit

  • Impact: fix callsites with dynamic format strings

    Since its new binary implementation, trace_printk() internally uses
    static containers for the format strings at each callsite. But the
    value is assigned once at build time, which means that it can't take
    dynamic formats.

    So this patch unearths the raw trace_printk implementation for the
    callers that will need trace_printk to be able to carry these dynamic
    format strings. The trace_printk() macro will use the appropriate
    implementation for each callsite. Most of the time, however, the
    binary implementation will still be used.

    The other impact of this patch is that mmiotrace_printk() will use
    the old implementation, because it calls the low-level trace_vprintk
    and we can't guess here whether the format passed to it is dynamic
    or not.

    Some parts of this patch were written by Steven Rostedt (most notably
    the part that chooses the appropriate implementation for each
    callsite).

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Frederic Weisbecker
     

07 Mar, 2009

1 commit

  • Impact: faster and lighter tracing

    Now that we have trace_bprintk(), which is faster, consumes less
    memory than trace_printk() and has the same purpose, we can drop the
    old implementation in favour of the binary one from trace_bprintk().
    This means we move all the implementation of trace_bprintk() to
    trace_printk(), so the API doesn't change, except that we must now
    use trace_seq_bprintk() to print the TRACE_PRINT entries.

    Some changes resulting from this:

    - Previously, trace_bprintk depended on a single tracer and couldn't
    work without it. This tracer has been dropped, and the whole
    implementation of trace_printk() (like the module formats management)
    is now integrated in the tracing core (comes with CONFIG_TRACING),
    though we keep the file trace_printk.c (previously trace_bprintk.c)
    where we can find the module management. Thus we don't overflow
    trace.c.

    - Some parts were changed to use trace_seq_bprintk() to print
    TRACE_PRINT entries.

    - The trace_printk/trace_vprintk macros were changed a bit to support
    non-builtin constant formats, and 'const' qualifier warnings were
    fixed. But this is all transparent for developers.

    - etc...

    V2:

    - Rebase against last changes
    - Fix misspellings in the changelog

    V3:

    - Rebase against last changes (moving trace_printk() to kernel.h)

    Signed-off-by: Frederic Weisbecker
    Acked-by: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

06 Feb, 2009

2 commits

  • Impact: new API

    These new functions do what previously was being open coded, reducing
    the number of details ftrace plugin writers have to worry about.

    It also standardizes the handling of stacktrace, userstacktrace and
    other trace options we may introduce in the future.

    With this patch, for instance, the blk tracer (and some others already
    in the tree) can use the "userstacktrace" /d/tracing/trace_options
    facility.

    $ codiff /tmp/vmlinux.before /tmp/vmlinux.after
    linux-2.6-tip/kernel/trace/trace.c:
    trace_vprintk | -5
    trace_graph_return | -22
    trace_graph_entry | -26
    trace_function | -45
    __ftrace_trace_stack | -27
    ftrace_trace_userstack | -29
    tracing_sched_switch_trace | -66
    tracing_stop | +1
    trace_seq_to_user | -1
    ftrace_trace_special | -63
    ftrace_special | +1
    tracing_sched_wakeup_trace | -70
    tracing_reset_online_cpus | -1
    13 functions changed, 2 bytes added, 355 bytes removed, diff: -353

    linux-2.6-tip/block/blktrace.c:
    __blk_add_trace | -58
    1 function changed, 58 bytes removed, diff: -58

    linux-2.6-tip/kernel/trace/trace.c:
    trace_buffer_lock_reserve | +88
    trace_buffer_unlock_commit | +86
    2 functions changed, 174 bytes added, diff: +174

    /tmp/vmlinux.after:
    16 functions changed, 176 bytes added, 413 bytes removed, diff: -237

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Frédéric Weisbecker
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Impact: API change, cleanup

    From ring_buffer_{lock_reserve,unlock_commit}.

    $ codiff /tmp/vmlinux.before /tmp/vmlinux.after
    linux-2.6-tip/kernel/trace/trace.c:
    trace_vprintk | -14
    trace_graph_return | -14
    trace_graph_entry | -10
    trace_function | -8
    __ftrace_trace_stack | -8
    ftrace_trace_userstack | -8
    tracing_sched_switch_trace | -8
    ftrace_trace_special | -12
    tracing_sched_wakeup_trace | -8
    9 functions changed, 90 bytes removed, diff: -90

    linux-2.6-tip/block/blktrace.c:
    __blk_add_trace | -1
    1 function changed, 1 bytes removed, diff: -1

    /tmp/vmlinux.after:
    10 functions changed, 91 bytes removed, diff: -91

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Frédéric Weisbecker
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     

16 Jan, 2009

1 commit


11 Jan, 2009

1 commit

  • Impact: enhances lost events counting in mmiotrace

    The tracing framework, or the ring buffer facility it uses, has a switch
    to stop recording data. When recording is off, the trace events will be
    lost. The framework does not count these, so mmiotrace has to count them
    itself.

    Signed-off-by: Pekka Paalanen
    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     

29 Dec, 2008

2 commits

  • Impact: simplify/generalize/refactor trace.c

    The trace.c file is becoming more difficult to maintain due to the
    growing number of events. There are several formats in which an event
    may be printed. This patch sets up the infrastructure of an event
    hash to allow events to register how they should be printed.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Impact: cleanup, remove obsolete code

    Now that the ring buffer used by ftrace allows for variable length
    entries, we do not need the 'cont' feature of the buffer. This code
    makes other parts of ftrace more complex and by removing this it
    simplifies the ftrace code.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

19 Dec, 2008

1 commit


04 Dec, 2008

1 commit

  • Handle the TRACE_PRINT entries from the function graph tracer
    and output them as a C comment just below the function that called
    it, as if it were a comment inside this function.

    Example with an ftrace_printk inside might_sleep() function:

    void __might_sleep(char *file, int line)
    {
            static unsigned long prev_jiffy; /* ratelimiting */

            ftrace_printk("Hi I'm a comment in might_sleep() :-)");

    A chunk of a resulting trace:

    0) | _reiserfs_free_block() {
    0) | reiserfs_read_bitmap_block() {
    0) | __bread() {
    0) | __getblk() {
    0) | __find_get_block() {
    0) 0.698 us | mark_page_accessed();
    0) 2.267 us | }
    0) | __might_sleep() {
    0) | /* Hi I'm a comment in might_sleep() :-) */
    0) 1.321 us | }
    0) 5.872 us | }
    0) 7.313 us | }
    0) 8.718 us | }

    And this patch brings two minor fixes:

    - The newline after a switch-out task has disappeared
    - The "|" sign just before the cpu number on task-switch has been deleted.

    0) 0.616 us | pick_next_task_rt();
    0) 1.457 us | _spin_trylock();
    0) 0.653 us | _spin_unlock();
    0) 0.728 us | _spin_trylock();
    0) 0.631 us | _spin_unlock();
    0) 0.729 us | native_load_sp0();
    0) 0.593 us | native_load_tls();
    ------------------------------------------
    0) cat-2834 => migrati-3
    ------------------------------------------

    0) | finish_task_switch() {
    0) 0.841 us | _spin_unlock_irq();
    0) 0.616 us | post_schedule_rt();
    0) 3.882 us | }

    Signed-off-by: Frederic Weisbecker
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

25 Nov, 2008

1 commit


24 Nov, 2008

1 commit

  • Impact: fix mmiotrace overrun tracing

    When ftrace framework moved to use the ring buffer facility, the buffer
    overrun detection was broken after 2.6.27 by commit

    | commit 3928a8a2d98081d1bc3c0a84a2d70e29b90ecf1c
    | Author: Steven Rostedt
    | Date: Mon Sep 29 23:02:41 2008 -0400
    |
    | ftrace: make work with new ring buffer
    |
    | This patch ports ftrace over to the new ring buffer.

    The detection is now fixed by using the ring buffer API.

    When mmiotrace detects a buffer overrun, it will report the number of
    lost events. People reading an mmiotrace log must know if something was
    missed, otherwise the data may not make sense.

    Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     

16 Nov, 2008

1 commit

  • Impact: extend the ->init() method with the ability to fail

    This brings a way to know whether the initialization of a tracer
    succeeded. A tracer must return 0 on success and a conventional
    error code (e.g. -ENOMEM) if it fails.

    If a tracer fails to init, it is free to print a detailed warning.
    The tracing API will not, and switching to a new tracer will just
    return the error from the init callback.

    Note: this will be used for the return tracer.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     

08 Nov, 2008

2 commits


14 Oct, 2008

9 commits

  • With the new ring buffer infrastructure in ftrace, I'm trying to make
    ftrace a little more lightweight.

    This patch converts a lot of the local_irq_save/restore into
    preempt_disable/enable. The original preempt count in a lot of cases
    has to be sent in as a parameter so that it can be recorded correctly.
    Some places were recording it incorrectly before anyway.

    This is also laying the ground work to make ftrace a little bit
    more reentrant, and remove all locking. The function tracers must
    still protect from reentrancy.

    Note: All the function tracers must be careful when using preempt_disable.
    It must do the following:

    resched = need_resched();
    preempt_disable_notrace();
    [...]
    if (resched)
            preempt_enable_no_resched_notrace();
    else
            preempt_enable_notrace();

    The reason is that if this function traces schedule() itself, the
    preempt_enable_notrace() will cause a schedule, which will lead
    us into a recursive failure.

    If we needed to reschedule before calling preempt_disable, we
    should have already scheduled. Since we did not, this is most
    likely that we should not and are probably inside a schedule
    function.

    If resched was not set, we still need to catch the need resched
    flag being set when preemption was off and the if case at the
    end will catch that for us.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • The mmiotrace map had a bug that would typecast the entry from
    the trace to the wrong type. That is a known danger of C typecasts:
    there's absolutely zero checking done on them.

    Help that problem a bit by using a GCC extension to implement a
    type filter that restricts the types that a trace record can be
    cast into, and by adding a dynamic check (in debug mode) to verify
    the type of the entry.

    This patch adds a macro to assign all entries of ftrace using the type
    of the variable and checking the entry id. The typecasts are now done
    in the macro for only those types that it knows about, which should
    be all the types that are allowed to be read from the tracer.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Adapt mmiotrace to the new print_line type.
    By default, it ignores (and consumes) types it doesn't support.

    Signed-off-by: Frederic Weisbecker
    Acked-by: Pekka Paalanen
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Now that the underlying ring buffer for ftrace holds variable length
    entries, we can take advantage of this by only storing the size of
    the actual event into the buffer. This happens to increase the number
    of entries in the buffer dramatically.

    We can also get rid of the "trace_cont" operation, but I'm keeping that
    until we have no more users. Some of the ftrace tracers can now change
    their code to adapt to this new feature.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • This patch ports ftrace over to the new ring buffer.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     
  • Also make trace_seq_print_cont() non-static, and add a newline if the
    seq buffer can't hold all data.

    Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Offer mmiotrace users a function to inject markers from inside the kernel.
    This depends on the trace_vprintk() patch.

    Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Moves the mmiotrace specific functions from trace.c to
    trace_mmiotrace.c. Functions trace_wake_up(), tracing_get_trace_entry(),
    and tracing_generic_entry_update() are therefore made available outside
    trace.c.

    Signed-off-by: Pekka Paalanen
    Acked-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Some tracers will need to work with more than one entry. In order to
    do this, the trace_entry structure was split into two fields: one for
    the start of all entries, and one to continue an existing entry.

    The trace_entry structure now has a "field" entry that consists of the previous
    content of the trace_entry, and a "cont" entry that is just a string buffer
    the size of the "field" entry.

    Thanks to Andrew Morton for suggesting this idea.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Steven Rostedt
     

24 May, 2008

8 commits

  • Signed-off-by: Pekka Paalanen
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Signed-off-by: Pekka Paalanen
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Signed-off-by: Pekka Paalanen
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Now the header is printed only for the `trace_pipe' file.

    Signed-off-by: Pekka Paalanen
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Non-zero pid indicates the MMIO access originated in user space.
    We do not catch that kind of accesses yet, so always print zero for now.

    Signed-off-by: Pekka Paalanen
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Pekka Paalanen
     
  • another weekend, another patch. This should apply on top of my previous patch
    from March 23rd.

    Summary of changes:
    - Print PCI device list in output header
    - work around recursive probe hits on SMP
    - refactor dis/arm_kmmio_fault_page() and add check for page levels
    - remove un/reference_kmmio(), the die notifier hook is registered
    permanently into the list
    - explicitly check for single stepping in die notifier callback

    I have tested this version on my UP Athlon64 desktop with Nouveau, and
    SMP Core 2 Duo laptop with the proprietary nvidia driver. Both systems
    are 64-bit. One previously unknown bug crept into daylight: the ftrace
    framework's output routines print the first entry last after the
    buffer has wrapped around.

    The most important regressions compared to non-ftrace mmiotrace at this
    time are:
    - failure of trace_pipe file
    - illegal lines in output file
    - unaware of losing data due to buffer full

    Personally I'd like to see these three solved before submitting to
    mainline. Other issues may come up once we know when we lose events.

    Signed-off-by: Pekka Paalanen
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Pekka Paalanen
     
  • here is a patch that makes mmiotrace work almost well within the tracing
    framework. The patch applies on top of my previous patch. I have my own
    output formatting in place now.

    Summary of changes:
    - fix the NULL dereference that was due to not calling tracing_reset()
    - add print_line() callback into struct tracer
    - implement print_line() for mmiotrace, producing up-to-spec text
    - add my output header, but that is not really called in the right place
    - rewrote the main structs in mmiotrace
    - added two new trace entry types: TRACE_MMIO_RW and TRACE_MMIO_MAP
    - made some functions in trace.c non-static
    - check current==NULL in tracing_generic_entry_update()
    - fix(?) comparison in trace_seq_printf()

    Things seem to work fine except for a few issues. Markers (text lines
    injected into the mmiotrace log) are missing; I did not feel like
    hacking them in before we have variable length entries. My output
    header is printed only for the 'trace' file, but not 'trace_pipe'.
    For some reason, despite my quick fix, iter->trace is NULL in
    print_trace_line() when called from the 'trace_pipe' file, which
    means I don't get proper output formatting.

    I only tried by loading nouveau.ko, which just detects the card, and that
    is traced fine. I didn't try further. Map, two reads and unmap. Works
    perfectly.

    I am missing the information about overflows; I'd prefer to have a
    counter for lost events. I didn't try, but I guess currently there is
    no way of knowing when it overflows?

    So, not too far from being fully operational, it seems :-)
    And looking at the diffstat, there also is some 700-900 lines of user space
    code that just became obsolete.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Pekka Paalanen
     
  • On Sat, 22 Mar 2008 13:07:47 +0100
    Ingo Molnar wrote:

    > > > i'd suggest the following: pull x86.git and sched-devel.git into a
    > > > single tree [the two will combine without rejects]. Then try to add a
    > > > kernel/tracing/trace_mmiotrace.c ftrace plugin. The trace_sysprof.c
    > > > plugin might be a good example.
    > >
    > > I did this and now I have mmiotrace enabled/disabled via the tracing
    > > framework (what do we call this, since ftrace is one of the tracers?).
    >
    > cool! could you send the patches for that? (even if they are not fully
    > functional yet)

    Patch attached in the end. Nice to see how much code disappeared. I tried
    to mark all the features I had to break with XXX-comments.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Pekka Paalanen