15 Mar, 2013

18 commits

  • Move the logic to wake up on ring buffer data into the ring buffer
    code itself. This simplifies the tracing code a lot and also has the
    added benefit that waiters on one of the instance buffers can be woken
    only when data is added to that instance instead of data added to
    any instance.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • If the ring buffer is empty, a read to trace_pipe_raw won't block.
    The tracing code has the infrastructure to wake up waiting readers,
    but the trace_pipe_raw doesn't take advantage of that.

    When a read is done to trace_pipe_raw without the O_NONBLOCK flag
    set, have the read block until there's data in the requested buffer.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The trace_pipe_raw never implemented polling and this was causing
    issues for several utilities. This is now implemented.

    Blocking reads are still on the TODO list.

    Reported-by: Mauro Carvalho Chehab
    Tested-by: Mauro Carvalho Chehab
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Currently only the splice NONBLOCK flag is checked to determine if
    the splice read should block or not. But the file descriptor NONBLOCK
    flag also needs to be checked.

    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • The names used to display the field and type in the event format
    files are copied, as well as the system name that is displayed.

    All these names are created from constant values passed in.
    If one of these values were to be removed by a module, the module
    would also be required to remove any event it created.

    By using the strings directly, we can save over 100K of memory.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The event structures used by the trace events are mostly persistent,
    but they are also allocated by kmalloc, which is not the best at
    allocating space for what is used. By converting these kmallocs
    into kmem_cache_allocs, we can save over 50K of space that is
    permanently allocated.

    After boot we have:

    slab name           active  allocated  size  objs/slab  pages/slab
    ------------------  ------  ---------  ----  ---------  ----------
    ftrace_event_file      979       1005    56         67           1
    ftrace_event_field    2301       2310    48         77           1

    The ftrace_event_file has at boot up 979 active objects out of
    1005 allocated in the slabs. Each object is 56 bytes. In a normal
    kmalloc, that would allocate 64 bytes for each object.

    1005 - 979 = 26 objects not used
    26 * 56 = 1456 bytes wasted

    But if we used kmalloc:

    64 - 56 = 8 bytes unused per allocation
    8 * 979 = 7832 bytes wasted

    7832 - 1456 = 6376 bytes in savings

    Doing the same for ftrace_event_field where there's 2301 objects
    allocated in a slab that can hold 2310 with 48 bytes each we have:

    2310 - 2301 = 9 objects not used
    9 * 48 = 432 bytes wasted

    A kmalloc would also use 64 bytes per object:

    64 - 48 = 16 bytes unused per allocation
    16 * 2301 = 36816 bytes wasted!

    36816 - 432 = 36384 bytes in savings

    This change gives us a total of 42760 bytes in savings. At least
    on my machine, but as there's a lot of these persistent objects
    for all configurations that use trace points, this is a net win.

    Thanks to Ezequiel Garcia for his trace_analyze presentation which
    pointed out the wasted space in my code.

    Cc: Ezequiel Garcia
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • With the new descriptors used to allow multiple buffers in the
    tracing directory added, the kernel command line parameter
    trace_events=... no longer works. This is because the top level
    (global) trace array now has a list of descriptors associated
    with the events and the files in the debugfs directory. But in
    early bootup, when the command line is processed and the events
    enabled, the trace array list of events has not been set up yet.

    Without the list of events in the trace array, the setting of
    events to record will fail because it would not match any events.

    The solution is to set up the top level array in two stages.
    The first is to add the ftrace file descriptors that point to the
    events. This will allow events to be enabled and start tracing.
    The second stage is called after the filesystem is set up, and this
    stage will create the debugfs event files and directories associated
    with the trace array events.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Add a method to the hijacked dentry descriptor of the
    "instances" directory to allow for rmdir to remove an
    instance of a multibuffer.

    Example:

    cd /debug/tracing/instances
    mkdir hello
    ls
    hello/
    rmdir hello
    ls

    Like the mkdir method, the i_mutex is dropped for the instances
    directory. The instances directory is created at boot up and can
    not be renamed or removed. The trace_types_lock mutex is used to
    synchronize adding and removing of instances.

    I've run several stress tests with different threads trying to
    create and delete directories of the same name, and it has stood
    up fine.

    Cc: Al Viro
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Add the interface ("instances" directory) to add multiple buffers
    to ftrace. To create a new instance, simply do a mkdir in the
    instances directory. This will create a directory with the
    following:

    # cd instances
    # mkdir foo
    # ls foo
    buffer_size_kb free_buffer trace_clock trace_pipe
    buffer_total_size_kb set_event trace_marker tracing_enabled
    events/ trace trace_options tracing_on

    Currently only events can be enabled, and there isn't yet a way
    to delete a buffer once one is created.

    Note, the i_mutex lock is dropped from the parent "instances"
    directory during the mkdir operation. As the "instances" directory
    can not be renamed or deleted (created on boot), I do not see
    any harm in dropping the lock. The creation of the sub directories
    is protected by trace_types_lock mutex, which only lets one
    instance get into the code path at a time. If two tasks try to
    create or delete directories of the same name, only one will occur
    and the other will fail with -EEXIST.

    Cc: Al Viro
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Currently the syscall events record into the global buffer. But if
    multiple buffers are in place, then we need to have syscall events
    record in the proper buffers.

    By adding descriptors to pass to the syscall event functions, the
    syscall events can now record into the buffers that have been assigned
    to them (one event may be applied to multiple buffers).

    This will allow tracing high volume syscalls along with seldom occurring
    syscalls without losing the seldom syscall events.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The global and max-tr currently use static per_cpu arrays for the CPU data
    descriptors. But in order to support newly allocated trace_arrays, they
    need allocated per_cpu arrays. Instead of using the static arrays, switch
    the global and max-tr to use allocated data.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Pass the struct ftrace_event_file *ftrace_file to
    trace_event_buffer_lock_reserve() (a new function that replaces
    trace_current_buffer_lock_reserve()).

    The ftrace_file holds a pointer to the trace_array that is in use.
    In the case of multiple buffers with different trace_arrays, this
    allows different events to be recorded into different buffers.

    Also fixed some of the stale comments in include/trace/ftrace.h

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The global_trace variable in kernel/trace/trace.c has been kept 'static' and
    local to that file so that it would not be used too much outside of that
    file. This has paid off, even though there were lots of changes to make
    the trace_array structure more generic (not depending on global_trace).

    Removal of a lot of direct usages of global_trace is needed to be able to
    create more trace_arrays such that we can add multiple buffers.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • Both RING_BUFFER_ALL_CPUS and TRACE_PIPE_ALL_CPU are defined as
    -1 and used to say that all the ring buffers are to be modified
    or read (instead of just a single cpu, which would be >= 0).

    There's no reason to keep TRACE_PIPE_ALL_CPU, as it has also started
    to be used for more than it was created for, and now that the ring
    buffer code has added a generic RING_BUFFER_ALL_CPUS define, we can
    clean up the trace code to use that instead and remove the
    TRACE_PIPE_ALL_CPU macro.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The trace events for ftrace are all defined via global variables.
    The arrays of events and event systems are linked to a global list.
    This prevents having multiple users of the event system (each
    deciding what to enable and what not to).

    By adding descriptors to represent the event/file relation, as well
    as to which trace_array descriptor they are associated with, allows
    for more than one set of events to be defined. Once the trace events
    files have a link between the trace event and the trace_array they
    are associated with, we can create multiple trace_arrays that can
    record separate events in separate buffers.

    Signed-off-by: Steven Rostedt

    Steven Rostedt
     
  • The latency tracers require the buffers to be in overwrite mode,
    otherwise they get screwed up. Force the buffers to stay in overwrite
    mode when latency tracers are enabled.

    Added a flag_changed() method to the tracer structure to allow
    the tracers to see what flags are being changed, and also be able
    to prevent the change from happening.

    Cc: stable@vger.kernel.org
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Changing the overwrite mode for the ring buffer via the trace
    option only sets the normal buffer. But the snapshot buffer could
    swap with it, and then the snapshot would be in non overwrite mode
    and the normal buffer would be in overwrite mode, even though the
    option flag states otherwise.

    Keep the two buffers overwrite modes in sync.

    Cc: stable@vger.kernel.org
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Seems that the tracer flags have never been protected from
    concurrent writes. Luckily, admins don't usually modify the
    tracing flags via two different tasks. But if scripts were to
    be used to modify them, then they could get corrupted.

    Move the trace_types_lock that protects against tracers changing
    to also protect the flags being set.

    Cc: stable@vger.kernel.org
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

14 Mar, 2013

1 commit

  • Because function tracing is very invasive, and can even trace
    calls to rcu_read_lock(), RCU access in function tracing is done
    with preempt_disable_notrace(). This requires a synchronize_sched()
    for updates and not a synchronize_rcu().

    Function probes (traceon, traceoff, etc) must be freed after
    a synchronize_sched() after its entry has been removed from the
    hash. But call_rcu() is used. Fix this by using call_rcu_sched().

    Also fix the usage to use hlist_del_rcu() instead of hlist_del().

    Cc: stable@vger.kernel.org
    Cc: Paul McKenney
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

12 Mar, 2013

1 commit

  • Although the swap is wrapped with a spin_lock, the assignment
    of the temp buffer used to swap is not within that lock.
    It needs to be moved into that lock, otherwise two swaps
    happening on two different CPUs, can end up using the wrong
    temp buffer to assign in the swap.

    Luckily, all current callers of the swap function appear to have
    their own locks. But in case something is added that allows two
    different callers to call the swap, then there's a chance that
    this race can trigger and corrupt the buffers.

    New code is coming soon that will allow for this race to trigger.

    I've Cc'd stable, so this bug will not show up if someone backports
    one of the changes that can trigger this bug.

    Cc: stable@vger.kernel.org
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

08 Mar, 2013

1 commit

  • Now, the "snapshot" file returns success on a reset of the snapshot
    buffer even if the buffer wasn't allocated, instead of returning
    -EINVAL. This patch updates the snapshot description according to
    the change.

    Link: http://lkml.kernel.org/r/51399409.4090207@hitachi.com

    Signed-off-by: Hiraku Toyooka
    Signed-off-by: Steven Rostedt

    Hiraku Toyooka
     

07 Mar, 2013

2 commits

  • To use the tracing snapshot feature, writing a '1' into the snapshot
    file causes the snapshot buffer to be allocated if it has not already
    been allocated, and does a 'swap' with the main buffer, so that the
    snapshot now contains what was in the main buffer, and the main buffer
    now writes to what was the snapshot buffer.

    To free the snapshot buffer, a '0' is written into the snapshot file.

    To clear the snapshot buffer, any number but a '0' or '1' is written
    into the snapshot file. But if the buffer is not allocated, it returns
    the -EINVAL error code. This is rather pointless. It is better just to
    do nothing and return success.

    Acked-by: Hiraku Toyooka
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • When cat'ing the snapshot file, instead of showing an empty trace
    header like the trace file does, show how to use the snapshot
    feature.

    Also, this is a good place to show if the snapshot has been allocated
    or not. Users may want to "pre allocate" the snapshot to have a fast
    "swap" of the current buffer. Otherwise, a swap would be slow and might
    fail as it would need to allocate the snapshot buffer, and that might
    fail under tight memory constraints.

    Here's what it looked like before:

    # tracer: nop
    #
    # entries-in-buffer/entries-written: 0/0 #P:4
    #
    # _-----=> irqs-off
    # / _----=> need-resched
    # | / _---=> hardirq/softirq
    # || / _--=> preempt-depth
    # ||| / delay
    # TASK-PID CPU# |||| TIMESTAMP FUNCTION
    # | | | |||| | |

    Here's what it looks like now:

    # tracer: nop
    #
    #
    # * Snapshot is freed *
    #
    # Snapshot commands:
    # echo 0 > snapshot : Clears and frees snapshot buffer
    # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
    # Takes a snapshot of the main buffer.
    # echo 2 > snapshot : Clears snapshot buffer (but does not allocate)
    # (Doesn't have to be '2' works with any number that
    # is not a '0' or '1')

    Acked-by: Hiraku Toyooka
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

28 Feb, 2013

2 commits

  • …rostedt/linux-trace into perf/urgent

    Pull an ftrace Kconfig help text fix from Steve Rostedt.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     
  • The prompt to enable DYNAMIC_FTRACE (the ability to nop and
    enable function tracing at run time) had a confusing statement:

    "enable/disable ftrace tracepoints dynamically"

    This was written before tracepoints were added to the kernel,
    but now that tracepoints have been added, this is very confusing
    and has confused people enough to give wrong information during
    presentations.

    Not only that, I looked at the help text, and it still references
    that dreaded daemon that used to wake up once a second to update
    the nop locations and brick NICs, which hasn't been around for over
    five years.

    Time to bring the text up to the current decade.

    Cc: stable@vger.kernel.org
    Reported-by: Ezequiel Garcia
    Signed-off-by: Steven Rostedt

    Steven Rostedt
     

20 Feb, 2013

15 commits

  • …stedt/linux-trace into perf/urgent

    Pull two fixes from Steven Rostedt.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     
  • The Intel IvyBridge processor has different constraints compared
    to SandyBridge. Therefore it needs its own constraint table.
    This patch adds the constraint table.

    Without this patch, the events listed in the patch may not be
    scheduled correctly and bogus counts may be collected.

    Signed-off-by: Stephane Eranian
    Cc: peterz@infradead.org
    Cc: ak@linux.intel.com
    Cc: acme@redhat.com
    Cc: jolsa@redhat.com
    Cc: namhyung.kim@lge.com
    Link: http://lkml.kernel.org/r/1361355312-3323-1-git-send-email-eranian@google.com
    Signed-off-by: Ingo Molnar

    Stephane Eranian
     
  • Pull async changes from Tejun Heo:
    "These are followups for the earlier deadlock issue involving async
    ending up waiting for itself through block requesting module[1]. The
    following changes are made by these commits.

    - Instead of requesting default elevator on each request_queue init,
    block now requests it once early during boot.

    - Kmod triggers warning if invoked from an async worker.

    - Async synchronization implementation has been reimplemented. It's
    a lot simpler now."

    * 'for-3.9-async' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    async: initialise list heads to fix crash
    async: replace list of active domains with global list of pending items
    async: keep pending tasks on async_domain and remove async_pending
    async: use ULLONG_MAX for infinity cookie value
    async: bring sanity to the use of words domain and running
    async, kmod: warn on synchronous request_module() from async workers
    block: don't request module during elevator init
    init, block: try to load default elevator module early during boot

    Linus Torvalds
     
  • Pull workqueue changes from Tejun Heo:
    "A lot of reorganization is going on mostly to prepare for worker pools
    with custom attributes so that workqueue can replace custom pool
    implementations in places including writeback and btrfs and make CPU
    assignment in crypto more flexible.

    workqueue evolved from purely per-cpu design and implementation, so
    there are a lot of assumptions regarding being bound to CPUs and even
    unbound workqueues are implemented as an extension of the model -
    workqueues running on the special unbound CPU. Bulk of changes this
    round are about promoting worker_pools as the top level abstraction
    replacing global_cwq (global cpu workqueue). At this point, I'm
    fairly confident about getting custom worker pools working pretty soon
    and ready for the next merge window.

    Lai's patches are replacing the convoluted mb() dancing workqueue has
    been doing with much simpler mechanism which only depends on
    assignment atomicity of long. For details, please read the commit
    message of 0b3dae68ac ("workqueue: simplify is-work-item-queued-here
    test"). While the change ends up adding one pointer to struct
    delayed_work, the inflation in percentage is less than five percent
    and it decouples delayed_work logic a lot more cleanly from usual work
    handling, removes the unusual memory barrier dancing, and allows for
    further simplification, so I think the trade-off is acceptable.

    There will be two more workqueue related pull requests and there are
    some shared commits among them. I'll write further pull requests
    assuming this pull request is pulled first."

    * 'for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (37 commits)
    workqueue: un-GPL function delayed_work_timer_fn()
    workqueue: rename cpu_workqueue to pool_workqueue
    workqueue: reimplement is_chained_work() using current_wq_worker()
    workqueue: fix is_chained_work() regression
    workqueue: pick cwq instead of pool in __queue_work()
    workqueue: make get_work_pool_id() cheaper
    workqueue: move nr_running into worker_pool
    workqueue: cosmetic update in try_to_grab_pending()
    workqueue: simplify is-work-item-queued-here test
    workqueue: make work->data point to pool after try_to_grab_pending()
    workqueue: add delayed_work->wq to simplify reentrancy handling
    workqueue: make work_busy() test WORK_STRUCT_PENDING first
    workqueue: replace WORK_CPU_NONE/LAST with WORK_CPU_END
    workqueue: post global_cwq removal cleanups
    workqueue: rename nr_running variables
    workqueue: remove global_cwq
    workqueue: remove worker_pool->gcwq
    workqueue: replace for_each_worker_pool() with for_each_std_worker_pool()
    workqueue: make freezing/thawing per-pool
    workqueue: make hotplug processing per-pool
    ...

    Linus Torvalds
     
  • Pull workqueue [delayed_]work_pending() cleanups from Tejun Heo:
    "This is part of on-going cleanups to remove / minimize usages of
    workqueue interfaces which are deprecated and/or misleading.

    This round drops a number of usages of [delayed_]work_pending(), which
    are dangerous as they lack any form of synchronization and thus often
    lead to buggy / unnecessary code. There are a couple legitimate use
    cases in kernel. Hopefully, they can be converted and
    [delayed_]work_pending() can be removed completely. Even if not,
    removing most of misuses should make it more difficult to find
    examples of misuses and thus slow down growth of them.

    These changes are independent from other workqueue changes."

    * 'for-3.9-cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    wimax/i2400m: fix i2400m->wake_tx_skb handling
    kprobes: fix wait_for_kprobe_optimizer()
    ipw2x00: simplify scan_event handling
    video/exynos: don't use [delayed_]work_pending()
    tty/max3100: don't use [delayed_]work_pending()
    x86/mce: don't use [delayed_]work_pending()
    rfkill: don't use [delayed_]work_pending()
    wl1251: don't use [delayed_]work_pending()
    thinkpad_acpi: don't use [delayed_]work_pending()
    mwifiex: don't use [delayed_]work_pending()
    sja1000: don't use [delayed_]work_pending()

    Linus Torvalds
     
  • Pull x86 UV3 support update from Ingo Molnar:
    "Support for the SGI Ultraviolet System 3 (UV3) platform - the upcoming
    third major iteration and upscaling of the SGI UV supercomputing
    platform."

    * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, uv, uv3: Trim MMR register definitions after code changes for SGI UV3
    x86, uv, uv3: Check current gru hub support for SGI UV3
    x86, uv, uv3: Update Time Support for SGI UV3
    x86, uv, uv3: Update x2apic Support for SGI UV3
    x86, uv, uv3: Update Hub Info for SGI UV3
    x86, uv, uv3: Update ACPI Check to include SGI UV3
    x86, uv, uv3: Update MMR register definitions for SGI Ultraviolet System 3 (UV3)

    Linus Torvalds
     
  • Pull x86 platform changes from Ingo Molnar:

    - Support for the Technologic Systems TS-5500 platform, by Vivien
    Didelot

    - Improved NUMA support on AMD systems:

    Add support for federated systems where multiple memory controllers
    can exist and see each other over multiple PCI domains. This
    basically means that AMD node ids can be more than 8 now and the code
    handling this is taught to incorporate PCI domain into those IDs.

    - Support for the Goldfish virtual Android emulator, by Jun Nakajima,
    Intel, Google, et al.

    - Misc fixlets.

    * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: Add TS-5500 platform support
    x86/srat: Simplify memory affinity init error handling
    x86/apb/timer: Remove unnecessary "if"
    goldfish: platform device for x86
    amd64_edac: Fix type usage in NB IDs and memory ranges
    amd64_edac: Fix PCI function lookup
    x86, AMD, NB: Use u16 for northbridge IDs in amd_get_nb_id
    x86, AMD, NB: Add multi-domain support

    Linus Torvalds
     
  • Pull x86/hyperv changes from Ingo Molnar:
    "The biggest change is support for Windows 8's improved hypervisor
    interrupt model on the Linux Hyper-V guest subsystem code side.

    Smallish fixes otherwise."

    * 'x86-hyperv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, hyperv: HYPERV depends on X86_LOCAL_APIC
    X86: Handle Hyper-V vmbus interrupts as special hypervisor interrupts
    X86: Add a check to catch Xen emulation of Hyper-V
    x86: Hyper-V: register clocksource only if its advertised

    Linus Torvalds
     
  • Pull x86/debug changes from Ingo Molnar:
    "Two init annotations and a built-in memtest speedup"

    * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/memtest: Shorten time for tests
    x86: Convert a few mistaken __cpuinit annotations to __init
    x86/EFI: Properly init-annotate BGRT code

    Linus Torvalds
     
  • Pull x86 cleanup patches from Ingo Molnar:
    "Misc smaller cleanups"

    * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: ptrace.c only needs export.h and not the full module.h
    x86, apb_timer: remove unused variable percpu_timer
    um: don't compare a pointer to 0
    arch/x86/platform/uv: use ARRAY_SIZE where possible

    Linus Torvalds
     
  • Pull two x86 kernel build changes from Ingo Molnar:
    "The first change modifies how 'make oldconfig' works on cross-bitness
    situations on x86. It was felt the new behavior of preserving the
    bitness of the .config is more logical. This is a leftover of the
    merge.

    The second change eliminates a Perl warning. (There's another, more
    complete fix resulting of this warning fix, which second fix in flight
    to you via the kbuild tree, which will remove the timeconst.pl script
    altogether.)"

    * 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    timeconst.pl: Eliminate Perl warning
    x86: Default to ARCH=x86 to avoid overriding CONFIG_64BIT

    Linus Torvalds
     
  • Pull x86 bootup changes from Ingo Molnar:
    "Deal with bootloaders which fail to initialize unknown fields in
    boot_params to zero, by sanitizing boot params passed in.

    This unbreaks versions of kexec-utils. Other bootloaders do not
    appear to show sensitivity to this change, but it's a possibility for
    breakage nevertheless."

    * 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, boot: Sanitize boot_params if not zeroed on creation

    Linus Torvalds
     
  • Pull x86/asm changes from Ingo Molnar:
    "The biggest change (by line count) is the unification of the XOR code
    and then the introduction of an additional SSE based XOR assembly
    method.

    The other bigger change is the head_32.S rework/cleanup by Borislav
    Petkov.

    Last but not least there's the usual laundry list of small but
    dangerous (and hopefully perfectly tested) changes to subtle low level
    x86 code, plus cleanups."

    * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, head_32: Give the 6 label a real name
    x86, head_32: Remove second CPUID detection from default_entry
    x86: Detect CPUID support early at boot
    x86, head_32: Remove i386 pieces
    x86: Require MOVBE feature in cpuid when we use it
    x86: Enable ARCH_USE_BUILTIN_BSWAP
    x86/xor: Add alternative SSE implementation only prefetching once per 64-byte line
    x86/xor: Unify SSE-base xor-block routines
    x86: Fix a typo
    x86/mm: Fix the argument passed to sync_global_pgds()
    x86/mm: Convert update_mmu_cache() and update_mmu_cache_pmd() to functions
    ix86: Tighten asmlinkage_protect() constraints

    Linus Torvalds
     
  • Pull x86/apic changes from Ingo Molnar:
    "Main changes:

    - Multiple MSI support added to the APIC, PCI and AHCI code - acked
    by all relevant maintainers, by Alexander Gordeev.

    The advantage is that multiple AHCI ports can have multiple MSI
    irqs assigned, and can thus spread to multiple CPUs.

    [ Drivers can make use of this new facility via the
    pci_enable_msi_block_auto() method ]

    - x86 IOAPIC code from interrupt remapping cleanups from Joerg
    Roedel:

    These patches move all interrupt remapping specific checks out of
    the x86 core code and replaces the respective call-sites with
    function pointers. As a result the interrupt remapping code is
    better abstracted from x86 core interrupt handling code.

    - Various smaller improvements, fixes and cleanups."

    * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
    x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess
    x86, kvm: Fix intialization warnings in kvm.c
    x86, irq: Move irq_remapped out of x86 core code
    x86, io_apic: Introduce eoi_ioapic_pin call-back
    x86, msi: Introduce x86_msi.compose_msi_msg call-back
    x86, irq: Introduce setup_remapped_irq()
    x86, irq: Move irq_remapped() check into free_remapped_irq
    x86, io-apic: Remove !irq_remapped() check from __target_IO_APIC_irq()
    x86, io-apic: Move CONFIG_IRQ_REMAP code out of x86 core
    x86, irq: Add data structure to keep AMD specific irq remapping information
    x86, irq: Move irq_remapping_enabled declaration to iommu code
    x86, io_apic: Remove irq_remapping_enabled check in setup_timer_IRQ0_pin
    x86, io_apic: Move irq_remapping_enabled checks out of check_timer()
    x86, io_apic: Convert setup_ioapic_entry to function pointer
    x86, io_apic: Introduce set_affinity function pointer
    x86, msi: Use IRQ remapping specific setup_msi_irqs routine
    x86, hpet: Introduce x86_msi_ops.setup_hpet_msi
    x86, io_apic: Introduce x86_io_apic_ops.print_entries for debugging
    x86, io_apic: Introduce x86_io_apic_ops.disable()
    x86, apic: Mask IO-APIC and PIC unconditionally on LAPIC resume
    ...

    Linus Torvalds
     
  • Pull timer changes from Ingo Molnar:
    "Main changes:

    - ntp: Add CONFIG_RTC_SYSTOHC: a generic RTC driver facility
    complementing the existing CONFIG_RTC_HCTOSYS, which uses NTP to
    keep the hardware clock updated.

    - posix-timers: Fix clock_adjtime to always return timex data on
    success. This is changing the ABI, but no breakage was expected
    and found - caution is warranted nevertheless.

    - platform persistent clock improvements/cleanups.

    - clockevents: refactor timer broadcast handling to be more generic
    and less duplicated with matching architecture code (mostly ARM
    motivated.)

    - various fixes and cleanups"

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    timers/x86/hpet: Use HPET_COUNTER to specify the hpet counter in vread_hpet()
    posix-cpu-timers: Fix nanosleep task_struct leak
    clockevents: Fix generic broadcast for FEAT_C3STOP
    time, Fix setting of hardware clock in NTP code
    hrtimer: Prevent hrtimer_enqueue_reprogram race
    clockevents: Add generic timer broadcast function
    clockevents: Add generic timer broadcast receiver
    timekeeping: Switch HAS_PERSISTENT_CLOCK to ALWAYS_USE_PERSISTENT_CLOCK
    x86/time/rtc: Don't print extended CMOS year when reading RTC
    x86: Select HAS_PERSISTENT_CLOCK on x86
    timekeeping: Add CONFIG_HAS_PERSISTENT_CLOCK option
    rtc: Skip the suspend/resume handling if persistent clock exist
    timekeeping: Add persistent_clock_exist flag
    posix-timers: Fix clock_adjtime to always return timex data on success
    Round the calculated scale factor in set_cyc2ns_scale()
    NTP: Add a CONFIG_RTC_SYSTOHC configuration
    MAINTAINERS: Update John Stultz's email
    time: create __getnstimeofday for WARNless calls

    Linus Torvalds