26 May, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd:
    net: fix get_net_ns_by_fd for !CONFIG_NET_NS
    ns proc: Return -ENOENT for a nonexistent /proc/self/ns/ entry.
    ns: Declare sys_setns in syscalls.h
    net: Allow setting the network namespace by fd
    ns proc: Add support for the ipc namespace
    ns proc: Add support for the uts namespace
    ns proc: Add support for the network namespace.
    ns: Introduce the setns syscall
    ns: proc files for namespace naming policy.

    Linus Torvalds
     

25 May, 2011

1 commit


06 May, 2011

1 commit

  • This patch adds a multiple message send syscall and is the send
    version of the existing recvmmsg syscall. This is heavily
    based on the patch by Arnaldo that added recvmmsg.

    I wrote a microbenchmark to test the performance gains of using
    this new syscall:

    http://ozlabs.org/~anton/junkcode/sendmmsg_test.c

    The test was run on a ppc64 box with a 10 Gbit network card. The
    benchmark can send both UDP and RAW ethernet packets.

    64B UDP

    batch   pkts/sec
    1       804570
    2       872800 (+ 8 %)
    4       916556 (+14 %)
    8       939712 (+17 %)
    16      952688 (+18 %)
    32      956448 (+19 %)
    64      964800 (+20 %)

    64B raw socket

    batch   pkts/sec
    1       1201449
    2       1350028 (+12 %)
    4       1461416 (+22 %)
    8       1513080 (+26 %)
    16      1541216 (+28 %)
    32      1553440 (+29 %)
    64      1557888 (+30 %)

    We see a 20% improvement in throughput on UDP send and 30%
    on raw socket send.

    [ Add sparc syscall entries. -DaveM ]

    Signed-off-by: Anton Blanchard
    Signed-off-by: David S. Miller

    Anton Blanchard
     

21 Mar, 2011

1 commit

  • It is frequently useful to sync a single file system, instead of all
    mounted file systems via sync(2):

    - On machines with many mounts, it is not at all uncommon for some of
    them to hang (e.g. unresponsive NFS server). sync(2) will get stuck on
    those and may never get to the one you do care about (e.g., /).
    - Some applications write lots of data to the file system and then
    want to make sure it is flushed to disk. Calling fsync(2) on each
    file introduces unnecessary ordering constraints that result in a large
    amount of sub-optimal writeback/flush/commit behavior by the file
    system.

    There are currently two ways (that I know of) to sync a single super_block:

    - BLKFLSBUF ioctl on the block device: That also invalidates the bdev
    mapping, which isn't usually desirable, and doesn't work for non-block
    file systems.
    - 'mount -o remount,rw' will call sync_filesystem as an artifact of the
    current implementation. Relying on this little-known side effect for
    something like data safety sounds foolish.

    Both of these approaches require root privileges, which some applications
    do not have (nor should they need?) given that sync(2) is an unprivileged
    operation.

    This patch introduces a new system call syncfs(2) that takes an fd and
    syncs only the file system it references. Maybe someday we can

    $ sync /some/path

    and not get

    sync: ignoring all arguments

    The syscall is motivated by comments by Al and Christoph at the last LSF.
    syncfs(2) seems like an appropriate name given statfs(2).

    A similar ioctl was also proposed a while back, see
    http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2

    Signed-off-by: Sage Weil
    Signed-off-by: Al Viro

    Sage Weil
     

16 Mar, 2011

2 commits

  • …l/git/tip/linux-2.6-tip

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits)
    posix-clocks: Check write permissions in posix syscalls
    hrtimer: Remove empty hrtimer_init_hres_timer()
    hrtimer: Update hrtimer->state documentation
    hrtimer: Update base[CLOCK_BOOTTIME].offset correctly
    timers: Export CLOCK_BOOTTIME via the posix timers interface
    timers: Add CLOCK_BOOTTIME hrtimer base
    time: Extend get_xtime_and_monotonic_offset() to also return sleep
    time: Introduce get_monotonic_boottime and ktime_get_boottime
    hrtimers: extend hrtimer base code to handle more then 2 clockids
    ntp: Remove redundant and incorrect parameter check
    mn10300: Switch do_timer() to xtimer_update()
    posix clocks: Introduce dynamic clocks
    posix-timers: Cleanup namespace
    posix-timers: Add support for fd based clocks
    x86: Add clock_adjtime for x86
    posix-timers: Introduce a syscall for clock tuning.
    time: Splitout compat timex accessors
    ntp: Add ADJ_SETOFFSET mode bit
    time: Introduce timekeeping_inject_offset
    posix-timer: Update comment
    ...

    Fix up new system-call-related conflicts in
    arch/x86/ia32/ia32entry.S
    arch/x86/include/asm/unistd_32.h
    arch/x86/include/asm/unistd_64.h
    arch/x86/kernel/syscall_table_32.S
    (name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some
    due to movement of get_jiffies_64() in:
    kernel/time.c

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (184 commits)
    perf probe: Clean up probe_point_lazy_walker() return value
    tracing: Fix irqoff selftest expanding max buffer
    tracing: Align 4 byte ints together in struct tracer
    tracing: Export trace_set_clr_event()
    tracing: Explain about unstable clock on resume with ring buffer warning
    ftrace/graph: Trace function entry before updating index
    ftrace: Add .ref.text as one of the safe areas to trace
    tracing: Adjust conditional expression latency formatting.
    tracing: Fix event alignment: skb:kfree_skb
    tracing: Fix event alignment: mce:mce_record
    tracing: Fix event alignment: kvm:kvm_hv_hypercall
    tracing: Fix event alignment: module:module_request
    tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup
    tracing: Remove lock_depth from event entry
    perf header: Stop using 'self'
    perf session: Use evlist/evsel for managing perf.data attributes
    perf top: Don't let events to eat up whole header line
    perf top: Fix events overflow in top command
    ring-buffer: Remove unused #include <linux/trace_irq.h>
    tracing: Add an 'overwrite' trace_option.
    ...

    Linus Torvalds
     

15 Mar, 2011

2 commits


09 Feb, 2011

1 commit


08 Feb, 2011

1 commit

  • FTRACE_SYSCALLS would create events for each and every system call, even
    if it had failed to map the system call's name with its number. This
    resulted in a number of events being created that would not behave as
    expected.

    This could happen, for example, on architectures whose symbol names are
    unusual and will not match the system call name. It could also happen
    with system calls which were mapped to sys_ni_syscall.

    This patch changes the default system call number in the metadata to -1.
    If the system call name from the metadata is not successfully mapped to
    a system call number during boot, then the event initialisation routine
    will now return an error, preventing the event from being created.

    Signed-off-by: Ian Munsie
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Ian Munsie
     

03 Feb, 2011

2 commits

  • Currently the syscall_meta structures for the syscall tracepoints are
    placed in the __syscall_metadata section, and at link time, the linker
    makes one large array of all these syscall metadata structures. On boot
    up, this array is read (much like the initcall sections) and the syscall
    data is processed.

    The problem is that there is no guarantee that gcc will place complex
    structures nicely together in an array format. Two structures in the
    same file may be placed awkwardly, because gcc has no clue that they
    are supposed to be in an array.

    A hack was used previously to force the alignment to 4, to pack the
    structures together. But this caused alignment issues with other
    architectures (sparc).

    Instead of packing the structures into an array, the structures' addresses
    are now put into the __syscall_metadata section. As pointers are always the
    natural alignment, gcc should always pack them tightly together
    (otherwise initcall, extable, etc would also fail).

    By having the pointers to the structures in the section, we can still
    iterate the trace_events without causing unnecessary alignment problems
    with other architectures, or depending on the current behaviour of
    gcc that will likely change in the future just to tick us kernel developers
    off a little more.

    The __syscall_metadata section is also moved into the .init.data section
    as it is now only needed at boot up.

    Suggested-by: David Miller
    Acked-by: David S. Miller
    Cc: Mathieu Desnoyers
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Currently the trace_event structures are placed in the _ftrace_events
    section, and at link time, the linker makes one large array of all
    the trace_event structures. On boot up, this array is read (much like
    the initcall sections) and the events are processed.

    The problem is that there is no guarantee that gcc will place complex
    structures nicely together in an array format. Two structures in the
    same file may be placed awkwardly, because gcc has no clue that they
    are supposed to be in an array.

    A hack was used previously to force the alignment to 4, to pack the
    structures together. But this caused alignment issues with other
    architectures (sparc).

    Instead of packing the structures into an array, the structures' addresses
    are now put into the _ftrace_events section. As pointers are always the
    natural alignment, gcc should always pack them tightly together
    (otherwise initcall, extable, etc would also fail).

    By having the pointers to the structures in the section, we can still
    iterate the trace_events without causing unnecessary alignment problems
    with other architectures, or depending on the current behaviour of
    gcc that will likely change in the future just to tick us kernel developers
    off a little more.

    The _ftrace_events section is also moved into the .init.data section
    as it is now only needed at boot up.

    Suggested-by: David Miller
    Cc: Mathieu Desnoyers
    Acked-by: David S. Miller
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

02 Feb, 2011

1 commit

  • A new syscall is introduced that allows tuning of a POSIX clock. The
    new call, clock_adjtime, takes two parameters, the clock ID and a
    pointer to a struct timex. Any ADJTIMEX(2) operation may be requested
    via this system call, but various POSIX clocks may or may not support
    tuning.

    [ tglx: Adapted to the posix-timer cleanup series. Avoid copy_to_user
    in the error case ]

    Signed-off-by: Richard Cochran
    Acked-by: John Stultz
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Richard Cochran
     

18 Nov, 2010

2 commits


28 Oct, 2010

1 commit

  • Since the userspace API of the ptrace syscall defines @addr and @data as
    void pointers, it is more appropriate to define them as unsigned long in
    the kernel. The related functions are changed accordingly.

    'unsigned long' is typically used elsewhere in the kernel as an opaque
    data type, and using it here helps clean up a lot of warnings from
    sparse.

    Suggested-by: Arnd Bergmann
    Signed-off-by: Namhyung Kim
    Acked-by: Arnd Bergmann
    Acked-by: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namhyung Kim
     

18 Aug, 2010

1 commit

  • Make do_execve() take a const filename pointer so that kernel_execve() compiles
    correctly on ARM:

    arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type

    This also requires the argv and envp arguments to be consted twice, once for
    the pointer array and once for the strings the array points to. This is
    because do_execve() passes a pointer to the filename (now const) to
    copy_strings_kernel(). A simpler alternative would be to cast the filename
    pointer in do_execve() when it's passed to copy_strings_kernel().

    do_execve() may not change any of the strings it is passed as part of the argv
    or envp lists as some of them are in .rodata, so marking these strings as
    const should be fine.

    Further, kernel_execve() and sys_execve() need to be changed to match.

    This has been test built on x86_64, frv, arm and mips.

    Signed-off-by: David Howells
    Tested-by: Ralf Baechle
    Acked-by: Russell King
    Signed-off-by: Linus Torvalds

    David Howells
     

14 Aug, 2010

1 commit

  • Mark arguments to certain system calls as being const where they should be but
    aren't. The list includes:

    (*) The filename arguments of various stat syscalls, execve(), various utimes
    syscalls and some mount syscalls.

    (*) The filename arguments of some syscall helpers relating to the above.

    (*) The buffer argument of various write syscalls.

    Signed-off-by: David Howells
    Acked-by: David S. Miller
    Signed-off-by: Linus Torvalds

    David Howells
     

11 Aug, 2010

2 commits

  • * 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux:
    unistd: add __NR_prlimit64 syscall numbers
    rlimits: implement prlimit64 syscall
    rlimits: switch more rlimit syscalls to do_prlimit
    rlimits: redo do_setrlimit to more generic do_prlimit
    rlimits: add rlimit64 structure
    rlimits: do security check under task_lock
    rlimits: allow setrlimit to non-current tasks
    rlimits: split sys_setrlimit
    rlimits: selinux, do rlimits changes under task_lock
    rlimits: make sure ->rlim_max never grows in sys_setrlimit
    rlimits: add task_struct to update_rlimit_cpu
    rlimits: security, add task_struct to setrlimit

    Fix up various system call number conflicts. We not only added fanotify
    system calls in the meantime, but asm-generic/unistd.h added a wait4
    along with a range of reserved per-architecture system calls.

    Linus Torvalds
     
  • * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits)
    fanotify: use both marks when possible
    fsnotify: pass both the vfsmount mark and inode mark
    fsnotify: walk the inode and vfsmount lists simultaneously
    fsnotify: rework ignored mark flushing
    fsnotify: remove global fsnotify groups lists
    fsnotify: remove group->mask
    fsnotify: remove the global masks
    fsnotify: cleanup should_send_event
    fanotify: use the mark in handler functions
    audit: use the mark in handler functions
    dnotify: use the mark in handler functions
    inotify: use the mark in handler functions
    fsnotify: send fsnotify_mark to groups in event handling functions
    fsnotify: Exchange list heads instead of moving elements
    fsnotify: srcu to protect read side of inode and vfsmount locks
    fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called
    fsnotify: use _rcu functions for mark list traversal
    fsnotify: place marks on object in order of group memory address
    vfs/fsnotify: fsnotify_close can delay the final work in fput
    fsnotify: store struct file not struct path
    ...

    Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.

    Linus Torvalds
     

28 Jul, 2010

3 commits


22 Jul, 2010

1 commit


16 Jul, 2010

1 commit

  • This patch adds the code to support the sys_prlimit64 syscall which
    modifies-and-returns the rlim values of a selected process atomically.
    The first parameter, pid, being 0 means current process.

    Unlike the current implementation, it is a generic,
    architecture-independent interface, so we no longer need to handle
    compat stuff. In the future, once glibc starts to use this, we can
    deprecate sys_setrlimit and sys_getrlimit and finally clean up the code.

    It also makes it possible to change the limits of other processes. We
    check the user's permissions to do that and, if the check succeeds, the
    new limits take effect on the running process. This is useful for
    large-scale applications such as SAP or databases, where administrators
    need to change limits from time to time (e.g. increase the core size
    after crashes) and restarting the service is unacceptable.

    For safety, all rlim users now either use accessors or don't need
    them due to
    - locking
    - the fact that a process was just forked and nobody else knows about it
    yet (and thus nobody can read/write its limits)
    hence it is safe to modify limits now.

    The limitation is that we currently stay at ulong internal
    representation. So the rlim64_is_infinity check is used where value is
    compared against ULONG_MAX on 32-bit which is the maximum value there.

    And since internally the limits are held in struct rlimit, converters
    which are used before and after do_prlimit call in sys_prlimit64 are
    introduced.

    Signed-off-by: Jiri Slaby

    Jiri Slaby
     

10 Jul, 2010

1 commit

  • For some reason if we declare a static variable and then assign it
    later, and the assignment contains a __attribute__((__aligned__(#))),
    some versions of gcc will ignore it.

    This caused the syscall meta data to not be compact in its section
    and caused a kernel oops when the section was being read.

    The fix for these versions of gcc seems to be to add the aligned
    attribute to the declaration as well.

    This fixes the BZ regression:

    https://bugzilla.kernel.org/show_bug.cgi?id=16353

    Reported-by: Zeev Tarantov
    Tested-by: Zeev Tarantov
    Acked-by: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

29 Jun, 2010

1 commit


05 Jun, 2010

1 commit

  • task_struct->personality is "unsigned int", but sys_personality() paths use
    "unsigned long personality". This means that every assignment or
    comparison is not right. In particular, if this argument does not fit
    into "unsigned int" __set_personality() changes the caller's personality
    and then sys_personality() returns -EINVAL.

    Turn this argument into "unsigned int" and avoid overflows. Obviously,
    this is a user-visible change: we just ignore the upper bits. But this
    can't break a sane application.

    There is another thing which can confuse poorly written applications.
    User-space thinks that this syscall returns int, not long. This means
    that the returned value can be negative and look like the error code. But
    note that libc won't be confused and thus errno won't be set, and with
    this patch the user-space can never get -1 unless sys_personality() really
    fails. And, most importantly, the negative RET != -1 is only possible if
    that app previously called personality(RET).

    Pointed-out-by: Wenming Zhang
    Suggested-by: Linus Torvalds
    Signed-off-by: Oleg Nesterov
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

15 May, 2010

5 commits

  • Currently, every event has its own trace_event structure. This is
    fine since the structure is needed anyway. But the print function
    structure (trace_event_functions) is now separate. Since the output
    of the trace event is done by the class (with the exception of events
    defined by DEFINE_EVENT_PRINT), it makes sense to have the class
    define the print functions that all events in the class can use.

    This makes a bigger difference for the syscall events since all syscall
    events use the same class. The savings here is another 30K.

    text     data     bss     dec      hex     filename
    4913961  1088356  861512  6863829  68bbd5  vmlinux.orig
    4900382  1048964  861512  6810858  67ecea  vmlinux.init
    4900446  1049028  861512  6810986  67ed6a  vmlinux.preprint
    4895024  1023812  861512  6780348  6775bc  vmlinux.print

    To accomplish this, and to let the class know what event is being
    printed, the event structure is embedded in the ftrace_event_call
    structure. This should not be an issue since the event structure
    was created for each event anyway.

    Acked-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Multiple events may use the same method to print their data.
    Instead of having all events have a pointer to their print functions,
    the trace_event structure now points to a trace_event_functions structure
    that will hold the way to print out the event.

    The event itself is now passed to the print function to let the print
    function know what kind of event it should print.

    This opens the door to consolidating the way several events print
    their output.

    text     data     bss     dec      hex     filename
    4913961  1088356  861512  6863829  68bbd5  vmlinux.orig
    4900382  1048964  861512  6810858  67ecea  vmlinux.init
    4900446  1049028  861512  6810986  67ed6a  vmlinux.preprint

    This change slightly increases the size but is needed for the next change.

    v3: Fix the branch tracer events to handle this change.

    v2: Fix the new function graph tracer event calls to handle this change.

    Acked-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The raw_init function pointer in the event is used to initialize
    various kinds of events. The type of initialization needed is usually
    determined by the kind of event it is.

    Two events with the same class will always have the same initialization
    function, so it makes sense to move this to the class structure.

    Perhaps even making a special system structure would work since
    the initialization is the same for all events within a system.
    But since there's no system structure (yet), this will just move it
    to the class.

    text     data     bss     dec      hex     filename
    4913961  1088356  861512  6863829  68bbd5  vmlinux.orig
    4900375  1053380  861512  6815267  67fe23  vmlinux.fields
    4900382  1048964  861512  6810858  67ecea  vmlinux.init

    The text grew very slightly, but this is a constant growth that happened
    with the changing of the C files that call the init code.
    The bigger savings is in the data, and it grows as more events share
    a class.

    Acked-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Move the defined fields from the event to the class structure.
    Since the fields of the event are defined by the class they belong
    to, it makes sense to have the class hold the information instead
    of the individual events. The events of the same class would just
    hold duplicate information.

    After this change the size of the kernel dropped another 3K:

    text     data     bss     dec      hex     filename
    4913961  1088356  861512  6863829  68bbd5  vmlinux.orig
    4900252  1057412  861512  6819176  680d68  vmlinux.regs
    4900375  1053380  861512  6815267  67fe23  vmlinux.fields

    Although the text increased, this was mainly due to the C files
    having to adapt to the change. This is a constant increase, where
    new tracepoints will not increase the text. But the big drop is
    in the data size (as well as needed allocations to hold the fields).
    This will give even more savings as more tracepoints are created.

    Note, if just TRACE_EVENT()s are used and not DECLARE_EVENT_CLASS()
    with several DEFINE_EVENT()s, then the savings will be lost. But
    we are pushing developers to consolidate events with DEFINE_EVENT()
    so this should not be an issue.

    The kprobes define a unique class to every new event, but are dynamic
    so it should not be an issue.

    The syscalls however have a single class but the fields for the individual
    events are different. The syscalls use metadata to define the
    fields. I moved the fields list from the event to the metadata and
    added a "get_fields()" function to the class. This function is used
    to find the fields. For normal events and kprobes, get_fields() just
    returns a pointer to the fields list_head in the class. For syscall
    events, it returns the fields list_head in the metadata for the event.

    v2: Fixed the syscall fields. The syscall metadata needs a list
    of fields for both enter and exit.

    Acked-by: Frederic Weisbecker
    Acked-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Cc: Tom Zanussi
    Cc: Peter Zijlstra
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • This patch removes the register functions of TRACE_EVENT() to enable
    and disable tracepoints. The registering of an event is now done
    directly in the trace_events.c file. The tracepoint_probe_register()
    is now called directly.

    The prototypes are no longer type checked, but this should not be
    an issue since the tracepoints are created automatically by the
    macros. If a prototype is incorrect in the TRACE_EVENT() macro, then
    other macros will catch it.

    The trace_event_class structure now holds the probes to be called
    by the callbacks. This removes needing to have each event have
    a separate pointer for the probe.

    To handle kprobes and syscalls, since they register probes in a
    different manner, a "reg" field is added to the ftrace_event_class
    structure. If the "reg" field is assigned, then it will be called for
    enabling and disabling of the probe for either ftrace or perf. To let
    the reg function know what is happening, a new enum (trace_reg) is
    created that has the type of control that is needed.

    With this new rework, the 82 kernel events and 618 syscall events
    have their footprint dramatically lowered:

    text     data     bss     dec      hex     filename
    4913961  1088356  861512  6863829  68bbd5  vmlinux.orig
    4914025  1088868  861512  6864405  68be15  vmlinux.class
    4918492  1084612  861512  6864616  68bee8  vmlinux.tracepoint
    4900252  1057412  861512  6819176  680d68  vmlinux.regs

    The size went from 6863829 to 6819176, that's a total of 44K
    in savings. With tracepoints being continuously added, it is
    critical that the footprint becomes minimal.

    v5: Added #ifdef CONFIG_PERF_EVENTS around a reference to perf
    specific structure in trace_events.c.

    v4: Fixed trace self tests to check probe because regfunc no longer
    exists.

    v3: Updated to handle void *data in beginning of probe parameters.
    Also added the tracepoint: check_trace_callback_type_##call().

    v2: Changed the callback probes to pass void * and typecast the
    value within the function.

    Acked-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Acked-by: Frederic Weisbecker
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

14 May, 2010

1 commit

  • This patch creates a ftrace_event_class struct that event structs point to.
    This class struct will be made to hold information to modify the
    events. Currently the class struct only holds the event's system name.

    This patch slightly increases the size, but this change lays the groundwork
    for other changes to make the footprint of tracepoints smaller.

    With 82 standard tracepoints, and 618 system call tracepoints
    (two tracepoints per syscall: enter and exit):

    text     data     bss     dec      hex     filename
    4913961  1088356  861512  6863829  68bbd5  vmlinux.orig
    4914025  1088868  861512  6864405  68be15  vmlinux.class

    This patch also cleans up some stale comments in ftrace.h.

    v2: Fixed missing semi-colon in macro.

    Acked-by: Frederic Weisbecker
    Acked-by: Mathieu Desnoyers
    Acked-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

23 Mar, 2010

1 commit


19 Mar, 2010

1 commit

  • …/git/tip/linux-2.6-tip

    * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (35 commits)
    perf: Fix unexported generic perf_arch_fetch_caller_regs
    perf record: Don't try to find buildids in a zero sized file
    perf: export perf_trace_regs and perf_arch_fetch_caller_regs
    perf, x86: Fix hw_perf_enable() event assignment
    perf, ppc: Fix compile error due to new cpu notifiers
    perf: Make the install relative to DESTDIR if specified
    kprobes: Calculate the index correctly when freeing the out-of-line execution slot
    perf tools: Fix sparse CPU numbering related bugs
    perf_event: Fix oops triggered by cpu offline/online
    perf: Drop the obsolete profile naming for trace events
    perf: Take a hot regs snapshot for trace events
    perf: Introduce new perf_fetch_caller_regs() for hot regs snapshot
    perf/x86-64: Use frame pointer to walk on irq and process stacks
    lockdep: Move lock events under lockdep recursion protection
    perf report: Print the map table just after samples for which no map was found
    perf report: Add multiple event support
    perf session: Change perf_session post processing functions to take histogram tree
    perf session: Add storage for seperating event types in report
    perf session: Change add_hist_entry to take the tree root instead of session
    perf record: Add ID and to recorded event data when recording multiple events
    ...

    Linus Torvalds
     

13 Mar, 2010

4 commits

  • Add generic implementations of the old and really old uname system calls.
    Note that sh only implements sys_olduname but not sys_oldolduname, but I'm
    not going to bother with another ifdef for that special case.

    m32r implemented an old uname but never wired it up, so kill it, too.

    Signed-off-by: Christoph Hellwig
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Hirokazu Takata
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: "Luck, Tony"
    Cc: James Morris
    Cc: Andreas Schwab
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Add a generic implementation of the ipc demultiplexer syscall. Except for
    s390 and sparc64 all implementations of the sys_ipc are nearly identical.

    There are slight differences in the types of the parameters, where mips
    and powerpc as the only 64-bit architectures with sys_ipc use unsigned
    long for the "third" argument as it gets cast to a pointer later, while
    it traditionally is an "int" like most other parameters. frv goes even
    further and uses unsigned long for all parameters except for "ptr" which
    is a pointer type everywhere. The change from int to unsigned long for
    "third" and back to "int" for the others on frv should be fine due to the
    in-register calling conventions for syscalls (we already had a similar
    issue with the generic sys_ptrace), but I'd prefer to have the arch
    maintainers look over this in detail.

    Except for that h8300, m68k and m68knommu lack an implementation of the
    semtimedop sub call which this patch adds, and various architectures have
    gets used - at least on i386 it seems superfluous as the compat code on
    x86-64 and ia64 doesn't even bother to implement it.

    [akpm@linux-foundation.org: add sys_ipc to sys_ni.c]
    Signed-off-by: Christoph Hellwig
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Hirokazu Takata
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Reviewed-by: H. Peter Anvin
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: "Luck, Tony"
    Cc: James Morris
    Cc: Andreas Schwab
    Acked-by: Jesper Nilsson
    Acked-by: Russell King
    Acked-by: David Howells
    Acked-by: Kyle McMartin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Add a generic implementation of the old mmap() syscall, which expects its
    argument in a memory block and switch all architectures over to use it.

    Signed-off-by: Christoph Hellwig
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Hirokazu Takata
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Reviewed-by: H. Peter Anvin
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: "Luck, Tony"
    Cc: James Morris
    Cc: Andreas Schwab
    Acked-by: Jesper Nilsson
    Acked-by: Russell King
    Acked-by: Greg Ungerer
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Add a generic implementation of the old select() syscall, which expects
    its argument in a memory block and switch all architectures over to use
    it.

    Signed-off-by: Christoph Hellwig
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Hirokazu Takata
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Reviewed-by: H. Peter Anvin
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: "Luck, Tony"
    Cc: James Morris
    Acked-by: Andreas Schwab
    Acked-by: Russell King
    Acked-by: Greg Ungerer
    Acked-by: David Howells
    Cc: Andreas Schwab
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig