20 Dec, 2014

1 commit

  • Pull perf fixes and cleanups from Ingo Molnar:
    "A kernel fix plus mostly tooling fixes, but also some tooling
    restructuring and cleanups"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
    perf: Fix building warning on ARM 32
    perf symbols: Fix use after free in filename__read_build_id
    perf evlist: Use roundup_pow_of_two
    tools: Adopt roundup_pow_of_two
    perf tools: Make the mmap length autotuning more robust
    tools: Adopt rounddown_pow_of_two and deps
    tools: Adopt fls_long and deps
    tools: Move bitops.h from tools/perf/util to tools/
    tools: Introduce asm-generic/bitops.h
    tools lib: Move asm-generic/bitops/find.h code to tools/include and tools/lib
    tools: Whitespace prep patches for moving bitops.h
    tools: Move code originally from asm-generic/atomic.h into tools/include/asm-generic/
    tools: Move code originally from linux/log2.h to tools/include/linux/
    tools: Move __ffs implementation to tools/include/asm-generic/bitops/__ffs.h
    perf evlist: Do not use hard coded value for a mmap_pages default
    perf trace: Let the perf_evlist__mmap autosize the number of pages to use
    perf evlist: Improve the strerror_mmap method
    perf evlist: Clarify sterror_mmap variable names
    perf evlist: Fixup brown paper bag on "hint" for --mmap-pages cmdline arg
    perf trace: Provide a better explanation when mmap fails
    ...

    Linus Torvalds
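
Several commits in this pull bring power-of-two helpers (fls_long, roundup_pow_of_two, rounddown_pow_of_two) into tools/. As a rough illustration of what such helpers compute, here is a minimal userspace sketch in C. The `_sketch` names are hypothetical and this is not the code those commits moved; it only assumes the usual "find last set bit" definition.

```c
#include <assert.h>
#include <limits.h>

/* Position of the most significant set bit, counting from 1; 0 when x == 0. */
static unsigned int fls_long_sketch(unsigned long x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Smallest power of two >= n. n must be non-zero and the result representable. */
static unsigned long roundup_pow_of_two_sketch(unsigned long n)
{
	assert(n != 0 && n <= (ULONG_MAX >> 1) + 1);
	return 1UL << fls_long_sketch(n - 1);
}

/* Largest power of two <= n. n must be non-zero. */
static unsigned long rounddown_pow_of_two_sketch(unsigned long n)
{
	assert(n != 0);
	return 1UL << (fls_long_sketch(n) - 1);
}
```

A caller sizing a ring buffer would typically round a requested page count up, for example from 5 pages to 8, so that the length is a power of two.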
     

16 Dec, 2014

1 commit

  • Pull drm updates from Dave Airlie:
    "Highlights:

    - AMD KFD driver merge

    This is the AMD HSA interface for exposing a low-level interface for
    GPGPU use. They have an open source userspace built on top of this
    interface, and the code looks as good as it was going to get out of
    tree.

    - Initial atomic modesetting work

    The need for an atomic modesetting interface to allow userspace to
    try and send a complete set of modesetting state to the driver has
    arisen, and been suffering from neglect this past year. No more,
    the start of the common code and changes for msm driver to use it
    are in this tree. Ongoing work to get the userspace ioctl finished
    and the code clean will probably wait until next kernel.

    - DisplayID 1.3 and tiled monitor exposed to userspace.

    Tiled monitor property is now exposed for userspace to make use of.

    - Rockchip drm driver merged.

    - imx gpu driver moved out of staging

    Other stuff:

    - core:
    panel - MIPI DSI + new panels.
    expose suggested x/y properties for virtual GPUs

    - i915:
    Initial Skylake (SKL) support
    gen3/4 reset work
    start of dri1/ums removal
    infoframe tracking
    fixes for lots of things.

    - nouveau:
    tegra k1 voltage support
    GM204 modesetting support
    GT21x memory reclocking work

    - radeon:
    CI dpm fixes
    GPUVM improvements
    Initial DPM fan control

    - rcar-du:
    HDMI support added
    removed some support for old boards
    slave encoder driver for Analog Devices adv7511

    - exynos:
    Exynos4415 SoC support

    - msm:
    a4xx gpu support
    atomic helper conversion

    - tegra:
    iommu support
    universal plane support
    ganged-mode DSI support

    - sti:
    HDMI i2c improvements

    - vmwgfx:
    some late fixes.

    - qxl:
    use suggested x/y properties"

    * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (969 commits)
    drm: sti: fix module compilation issue
    drm/i915: save/restore GMBUS freq across suspend/resume on gen4
    drm: sti: correctly cleanup CRTC and planes
    drm: sti: add HQVDP plane
    drm: sti: add cursor plane
    drm: sti: enable auxiliary CRTC
    drm: sti: fix delay in VTG programming
    drm: sti: prepare sti_tvout to support auxiliary crtc
    drm: sti: use drm_crtc_vblank_{on/off} instead of drm_vblank_{on/off}
    drm: sti: fix hdmi avi infoframe
    drm: sti: remove event lock while disabling vblank
    drm: sti: simplify gdp code
    drm: sti: clear all mixer control
    drm: sti: remove gpio for HDMI hot plug detection
    drm: sti: allow to change hdmi ddc i2c adapter
    drm/doc: Document drm_add_modes_noedid() usage
    drm/i915: Remove '& 0xffff' from the mask given to WA_REG()
    drm/i915: Invert the mask and val arguments in wa_add() and WA_REG()
    drm: Zero out DRM object memory upon cleanup
    drm/i915/bdw: Fix the write setting up the WIZ hashing mode
    ...

    Linus Torvalds
     

14 Dec, 2014

3 commits

  • Both register and unregister call build_map_info() in order to create the
    list of mappings before installing or removing breakpoints for every mm
    which maps file backed memory. As such, there is no reason to hold the
    i_mmap_rwsem exclusively, so share it and allow concurrent readers to
    build the mapping data.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Srikar Dronamraju
    Acked-by: "Kirill A. Shutemov"
    Cc: Oleg Nesterov
    Acked-by: Hugh Dickins
    Acked-by: Peter Zijlstra (Intel)
    Cc: Rik van Riel
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     
  • The i_mmap_mutex is a close cousin of the anon vma lock, both protecting
    similar data, one for file backed pages and the other for anon memory. To
    this end, this lock can also be a rwsem. In addition, there are some
    important opportunities to share the lock when there are no tree
    modifications.

    This conversion is straightforward. For now, all users take the write
    lock.

    [sfr@canb.auug.org.au: update fremap.c]
    Signed-off-by: Davidlohr Bueso
    Reviewed-by: Rik van Riel
    Acked-by: "Kirill A. Shutemov"
    Acked-by: Hugh Dickins
    Cc: Oleg Nesterov
    Acked-by: Peter Zijlstra (Intel)
    Cc: Srikar Dronamraju
    Acked-by: Mel Gorman
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     
  • Convert all open coded mutex_lock/unlock calls to the
    i_mmap_[lock/unlock]_write() helpers.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Rik van Riel
    Acked-by: "Kirill A. Shutemov"
    Acked-by: Hugh Dickins
    Cc: Oleg Nesterov
    Acked-by: Peter Zijlstra (Intel)
    Cc: Srikar Dronamraju
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     

12 Dec, 2014

1 commit


11 Dec, 2014

2 commits

  • We allow the PMU driver to change the cpu on which the event
    should be installed. This happened in patch:

    e2d37cd213dc ("perf: Allow the PMU driver to choose the CPU on which to install events")

    This patch also forces all the group members to follow
    the currently opened events cpu if the group happened
    to be moved.

    This and the change of event->cpu in perf_install_in_context()
    function introduced in:

    0cda4c023132 ("perf: Introduce perf_pmu_migrate_context()")

    forces group members to change their event->cpu,
    if the currently-opened-event's PMU changed the cpu
    and there is a group move.

    The above behaviour causes a problem for breakpoint events,
    which use event->cpu to touch cpu specific data for
    breakpoint accounting. By changing event->cpu, some
    breakpoint slots were wrongly accounted for the given
    cpu.

    Vince's perf fuzzer hit this issue and caused the following
    WARN on my setup:

    WARNING: CPU: 0 PID: 20214 at arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0x142/0x150()
    Can't find any breakpoint slot
    [...]

    This patch changes the group moving code to keep the event's
    original cpu.

    Reported-by: Vince Weaver
    Signed-off-by: Jiri Olsa
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Vince Weaver
    Cc: Yan, Zheng
    Cc:
    Link: http://lkml.kernel.org/r/1418243031-20367-3-git-send-email-jolsa@kernel.org
    Signed-off-by: Ingo Molnar

    Jiri Olsa
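
To see why rewriting event->cpu can unbalance per-cpu accounting, consider this toy model in C. It is an assumption-laden sketch, not the kernel's breakpoint code: the slot array, `struct event_sketch`, and both "move" helpers are hypothetical.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 2

/* Toy event: only the cpu field matters for this illustration. */
struct event_sketch {
	int cpu;
};

/* Toy per-cpu count of reserved breakpoint slots. */
static int bp_slots_used_sketch[NR_CPUS_SKETCH];

static void reserve_bp_slot_sketch(const struct event_sketch *e)
{
	assert(e->cpu >= 0 && e->cpu < NR_CPUS_SKETCH);
	bp_slots_used_sketch[e->cpu]++;
}

static void release_bp_slot_sketch(const struct event_sketch *e)
{
	assert(e->cpu >= 0 && e->cpu < NR_CPUS_SKETCH);
	bp_slots_used_sketch[e->cpu]--;
}

/* Problematic group move: overwrites the cpu the slot was reserved on. */
static void move_group_overwrite_cpu_sketch(struct event_sketch *e, int new_cpu)
{
	e->cpu = new_cpu;
}

/* Group move that keeps the event's original cpu, as the fix describes. */
static void move_group_keep_cpu_sketch(struct event_sketch *e, int new_cpu)
{
	(void)e;
	(void)new_cpu;
}
```

In the first variant, a slot reserved on cpu 0 is later released against cpu 1, so cpu 0 leaks a slot and cpu 1 goes negative. In the second, reserve and release stay paired on the same cpu.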
     
  • Pull VFS changes from Al Viro:
    "First pile out of several (there _definitely_ will be more). Stuff in
    this one:

    - unification of d_splice_alias()/d_materialize_unique()

    - iov_iter rewrite

    - killing a bunch of ->f_path.dentry users (and f_dentry macro).

    Getting that completed will make life much simpler for
    unionmount/overlayfs, since then we'll be able to limit the places
    sensitive to file _dentry_ to reasonably few. Which allows to have
    file_inode(file) pointing to inode in a covered layer, with dentry
    pointing to (negative) dentry in union one.

    Still not complete, but much closer now.

    - crapectomy in lustre (dead code removal, mostly)

    - "let's make seq_printf return nothing" preparations

    - assorted cleanups and fixes

    There _definitely_ will be more piles"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    copy_from_iter_nocache()
    new helper: iov_iter_kvec()
    csum_and_copy_..._iter()
    iov_iter.c: handle ITER_KVEC directly
    iov_iter.c: convert copy_to_iter() to iterate_and_advance
    iov_iter.c: convert copy_from_iter() to iterate_and_advance
    iov_iter.c: get rid of bvec_copy_page_{to,from}_iter()
    iov_iter.c: convert iov_iter_zero() to iterate_and_advance
    iov_iter.c: convert iov_iter_get_pages_alloc() to iterate_all_kinds
    iov_iter.c: convert iov_iter_get_pages() to iterate_all_kinds
    iov_iter.c: convert iov_iter_npages() to iterate_all_kinds
    iov_iter.c: iterate_and_advance
    iov_iter.c: macros for iterating over iov_iter
    kill f_dentry macro
    dcache: fix kmemcheck warning in switch_names
    new helper: audit_file()
    nfsd_vfs_write(): use file_inode()
    ncpfs: use file_inode()
    kill f_dentry uses
    lockd: get rid of ->f_path.dentry->d_sb
    ...

    Linus Torvalds
     

10 Dec, 2014

1 commit

  • Pull perf events update from Ingo Molnar:
    "On the kernel side there are few changes, the one that stands out is
    PEBS machine state sampling support on x86, by Stephane Eranian.

    On the tooling side:

    User visible tooling changes:

    - Don't open the DWARF info multiple times, keeping instead a dwfl
    handle in struct dso, greatly speeding up 'perf report' on powerpc.
    (Sukadev Bhattiprolu)

    - Introduce PARSE_OPT_DISABLED option flag and use it to avoid
    showing undesired options in tools that provide frontends to
    'perf record', like sched, kvm, etc (Namhyung Kim)

    - Fallback to kallsyms when using the minimal 'ELF' loader (Arnaldo
    Carvalho de Melo)

    - Fix annotation with kcore (Adrian Hunter)

    - Support source line numbers in annotate using a hotkey (Andi Kleen)

    - Callchain improvements including:
    * Enable printing the srcline in the history
    * Make get_srcline fall back to sym+offset (Andi Kleen)

    - TUI hist_entry browser fixes, including showing missing overhead
    value for first level callchain. Detected comparing the output of
    --stdio/--gui (that matched) with --tui, that had this problem.
    (Namhyung Kim)

    - Support handling complete branch stacks as histograms (Andi Kleen)

    Tooling infrastructure changes:

    - Prep work for supporting per-pkg and snapshot counters in 'perf
    stat' (Jiri Olsa)

    - 'perf stat' refactorings, moving stuff from it to evsel.c to use in
    per-pkg/snapshot format changes (Jiri Olsa)

    - Add per-pkg format file parsing (Matt Fleming)

    - Clean up libelf feature support code (Namhyung Kim)

    - Add gzip decompression support for kernel modules (Namhyung Kim)

    - More prep patches for Intel PT, including a thread stack and more
    stuff made available via the database export mechanism (Adrian
    Hunter)

    - More Intel PT work, including a facility to export sample data
    (comms, threads, symbol names, etc) in a database friendly way,
    with a script to use this to create a postgresql database.
    (Adrian Hunter)

    - Make sure that thread->mg->machine points to the machine where the
    thread exists (it was being set only for the kmaps kernel modules
    case, do it as well for the mmaps) and use it to shorten function
    signatures (Arnaldo Carvalho de Melo)

    ... and lots of other fixes and smaller improvements"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (91 commits)
    perf report: In branch stack mode use address history sorting
    perf report: Add --branch-history option
    perf callchain: Support handling complete branch stacks as histograms
    perf stat: Add support for snapshot counters
    perf stat: Add support for per-pkg counters
    perf tools: Remove perf_evsel__read interface
    perf stat: Use read_counter in read_counter_aggr
    perf stat: Make read_counter work over the thread dimension
    perf stat: Use perf_evsel__read_cb in read_counter
    perf tools: Add snapshot format file parsing
    perf tools: Add per-pkg format file parsing
    perf evsel: Introduce perf_evsel__read_cb function
    perf evsel: Introduce perf_counts_values__scale function
    perf evsel: Introduce perf_evsel__compute_deltas function
    perf tools: Allow to force redirect pr_debug to stderr.
    perf tools: Fix segfault due to invalid kernel dso access
    perf callchain: Make get_srcline fall back to sym+offset
    perf symbols: Move bfd_demangle stubbing to its only user
    perf callchain: Enable printing the srcline in the history
    perf tools: Collapse first level callchain entry if it has sibling
    ...

    Linus Torvalds
     

09 Dec, 2014

1 commit


02 Dec, 2014

1 commit

  • This fixes a bunch of conflicts prior to merging i915 tree.

    Linux 3.18-rc7

    Conflicts:
    drivers/gpu/drm/exynos/exynos_drm_drv.c
    drivers/gpu/drm/i915/i915_drv.c
    drivers/gpu/drm/i915/intel_pm.c
    drivers/gpu/drm/tegra/dc.c

    Dave Airlie
     

24 Nov, 2014

1 commit

  • x86 calls do_notify_resume on paranoid returns if TIF_UPROBE is set but
    not on non-paranoid returns. I suspect that this is a mistake and that
    the code only works because int3 is paranoid.

    Setting _TIF_NOTIFY_RESUME in the uprobe code was probably a workaround
    for the x86 bug. With that bug fixed, we can remove _TIF_NOTIFY_RESUME
    from the uprobes code.

    Reported-by: Oleg Nesterov
    Acked-by: Srikar Dronamraju
    Acked-by: Borislav Petkov
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
     

20 Nov, 2014

1 commit


16 Nov, 2014

3 commits

  • This patch reorders fields in the perf_sample_data struct in order to
    minimize the number of cachelines touched in perf_sample_data_init().
    It also removes some initializations which are redundant with the code
    in kernel/events/core.c

    Signed-off-by: Peter Zijlstra (Intel)
    Link: http://lkml.kernel.org/r/1411559322-16548-7-git-send-email-eranian@google.com
    Cc: cebbert.lkml@gmail.com
    Cc: Arnaldo Carvalho de Melo
    Cc: jolsa@redhat.com
    Cc: Linus Torvalds
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
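
The effect of grouping the fields an init routine writes can be illustrated with a small C sketch. The struct below is hypothetical (it is not the layout of perf_sample_data), and the 64-byte cacheline size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SKETCH 64

/* Hypothetical sample record: fields written by the init helper come first. */
struct sample_data_sketch {
	/* Written on every init. */
	uint64_t addr;
	uint64_t period;
	uint64_t weight;
	uint64_t txn;
	/* Filled in later, only when the matching sample type is requested. */
	uint64_t ip;
	uint64_t time;
	uint64_t callchain_words[16];
};

/* Touches only the leading group of fields. */
static void sample_data_init_sketch(struct sample_data_sketch *d,
				    uint64_t addr, uint64_t period)
{
	assert(d != NULL);
	d->addr = addr;
	d->period = period;
	d->weight = 0;
	d->txn = 0;
}

/* Number of leading bytes the init helper can write to. */
static size_t init_bytes_touched_sketch(void)
{
	return offsetof(struct sample_data_sketch, txn) + sizeof(uint64_t);
}
```

Because the initialized fields span 32 contiguous bytes at the start of the struct, the init helper stays within a single cacheline whenever the struct itself is cacheline aligned.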
     
  • Enable capture of interrupted machine state for each sample.

    Registers to sample are passed per event in the sample_regs_intr bitmask.

    To sample interrupt machine state, PERF_SAMPLE_REGS_INTR must be set in
    sample_type.

    The list of available registers is arch dependent and provided by asm/perf_regs.h

    Registers are laid out as u64 values in the bit order of sample_regs_intr.

    This patch also adds a new ABI version PERF_ATTR_SIZE_VER4 because we extend
    the perf_event_attr struct with a new u64 field.

    Reviewed-by: Jiri Olsa
    Signed-off-by: Stephane Eranian
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: cebbert.lkml@gmail.com
    Cc: Arnaldo Carvalho de Melo
    Cc: Linus Torvalds
    Cc: linux-api@vger.kernel.org
    Link: http://lkml.kernel.org/r/1411559322-16548-2-git-send-email-eranian@google.com
    Signed-off-by: Ingo Molnar

    Stephane Eranian
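
Since the sampled registers are packed as u64 values in the bit order of the mask, a consumer can locate a given register by counting the lower set bits. The following C sketch shows that lookup; the helper names are hypothetical, and `regs` is assumed to point at the first register value (past any leading ABI word in the record).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of set bits in v. */
static unsigned int popcount64_sketch(uint64_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Look up register number 'reg' in a packed register array.
 * Returns 0 and stores the value in *out if 'reg' is in 'mask',
 * or -1 if that register was not sampled.
 */
static int sample_reg_lookup_sketch(uint64_t mask, const uint64_t *regs,
				    unsigned int reg, uint64_t *out)
{
	uint64_t below;

	assert(regs != NULL && out != NULL);
	if (reg >= 64 || !(mask & (UINT64_C(1) << reg)))
		return -1;

	/* Registers with a lower bit number come earlier in the array. */
	below = mask & ((UINT64_C(1) << reg) - 1);
	*out = regs[popcount64_sketch(below)];
	return 0;
}
```

For a mask with bits 0, 3 and 7 set, the array holds three values, and register 7 is found at index 2.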
     
  • When a CPU is hotplugged out, we call perf_remove_from_context() (via
    perf_event_exit_cpu()) to rip each CPU-bound event out of its PMU's cpu
    context, but leave siblings grouped together. Freeing of these events is
    left to the mercy of the usual refcounting.

    When a CPU-bound event's refcount drops to zero we cross-call to
    __perf_remove_from_context() to clean it up, detaching grouped siblings.

    This works when the relevant CPU is online, but will fail if the CPU is
    currently offline, and we won't detach the event from its siblings
    before freeing the event, leaving the sibling list corrupt. If the
    sibling list is later walked (e.g. because the CPU came online again
    before a remaining sibling's refcount drops to zero), we will walk the
    now corrupted siblings list, potentially dereferencing garbage values.

    Given that the events should never be scheduled again (as we removed
    them from their context), we can simply detach siblings when the CPU
    goes down in the first place. If the CPU comes back online, the
    redundant call to __perf_remove_from_context() is safe.

    Reported-by: Drew Richardson
    Signed-off-by: Mark Rutland
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: vincent.weaver@maine.edu
    Cc: Vince Weaver
    Cc: Will Deacon
    Cc: Arnaldo Carvalho de Melo
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1415203904-25308-2-git-send-email-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland
     

28 Oct, 2014

1 commit

  • Andy reported that the current state of event_idx is rather confused.
    So remove all but the x86_pmu implementation and change the default to
    return 0 (the safe option).

    Reported-by: Andy Lutomirski
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Benjamin Herrenschmidt
    Cc: Christoph Lameter
    Cc: Cody P Schafer
    Cc: Heiko Carstens
    Cc: Hendrik Brueckner
    Cc: Himangi Saraogi
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Michael Ellerman
    Cc: Paul Gortmaker
    Cc: Paul Mackerras
    Cc: sukadev@linux.vnet.ibm.com
    Cc: Thomas Huth
    Cc: Vince Weaver
    Cc: linux390@de.ibm.com
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-s390@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

15 Oct, 2014

1 commit

  • Pull percpu consistent-ops changes from Tejun Heo:
    "Way back, before the current percpu allocator was implemented, static
    and dynamic percpu memory areas were allocated and handled separately
    and had their own accessors. The distinction has been gone for many
    years now; however, the now duplicate two sets of accessors remained
    with the pointer based ones - this_cpu_*() - evolving various other
    operations over time. During the process, we also accumulated other
    inconsistent operations.

    This pull request contains Christoph's patches to clean up the
    duplicate accessor situation. __get_cpu_var() uses are replaced with
    this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

    Unfortunately, the former sometimes is tricky thanks to C being a bit
    messy with the distinction between lvalues and pointers, which led to
    a rather ugly solution for cpumask_var_t involving the introduction of
    this_cpu_cpumask_var_ptr().

    This converts most of the uses but not all. Christoph will follow up
    with the remaining conversions in this merge window and hopefully
    remove the obsolete accessors"

    * 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
    irqchip: Properly fetch the per cpu offset
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
    ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
    percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
    Revert "powerpc: Replace __get_cpu_var uses"
    percpu: Remove __this_cpu_ptr
    clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
    sparc: Replace __get_cpu_var uses
    avr32: Replace __get_cpu_var with __this_cpu_write
    blackfin: Replace __get_cpu_var uses
    tile: Use this_cpu_ptr() for hardware counters
    tile: Replace __get_cpu_var uses
    powerpc: Replace __get_cpu_var uses
    alpha: Replace __get_cpu_var
    ia64: Replace __get_cpu_var uses
    s390: cio driver &__get_cpu_var replacements
    s390: Replace __get_cpu_var uses
    mips: Replace __get_cpu_var uses
    MIPS: Replace __get_cpu_var uses in FPU emulator.
    arm: Replace __this_cpu_ptr with raw_cpu_ptr
    ...

    Linus Torvalds
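
The lvalue-versus-pointer distinction mentioned in this pull can be shown with a toy userspace model in C. Everything here is hypothetical: the `_toy` macros only mimic the shape of an lvalue-style accessor and a pointer-style accessor over a plain array, and they are not the kernel's percpu implementation.

```c
#include <assert.h>

#define NR_CPUS_TOY 4

/* Toy notion of "the cpu we are running on". */
static int cur_cpu_toy;

/* A "per-cpu variable" is modelled as one array slot per cpu. */
#define DEFINE_PER_CPU_TOY(type, name) type name[NR_CPUS_TOY]

/* Old style: the accessor yields an lvalue for this cpu's slot. */
#define get_cpu_var_toy(var) ((var)[cur_cpu_toy])

/* New style: the accessor takes an address and yields a pointer to the slot. */
#define this_cpu_ptr_toy(ptr) (&(*(ptr))[cur_cpu_toy])

static DEFINE_PER_CPU_TOY(int, hits_toy);

static void count_hit_old_style_toy(void)
{
	assert(cur_cpu_toy >= 0 && cur_cpu_toy < NR_CPUS_TOY);
	get_cpu_var_toy(hits_toy)++;
}

static void count_hit_new_style_toy(void)
{
	int *p;

	assert(cur_cpu_toy >= 0 && cur_cpu_toy < NR_CPUS_TOY);
	p = this_cpu_ptr_toy(&hits_toy);
	(*p)++;
}
```

A conversion from the first form to the second is mechanical for scalars like this one; it becomes awkward for types that are sometimes arrays and sometimes pointers, which is the cpumask_var_t case the pull message describes.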
     

13 Oct, 2014

2 commits

  • Pull perf fixes from Ingo Molnar:
    "Two leftover fixes from the v3.17 cycle - these will be forwarded to
    stable as well, if they prove problem-free in wider testing as well"

    [ Side note: the "fix perf bug in fork()" fix had also come in through
    Andrew's patch-bomb - Linus ]

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf: Fix perf bug in fork()
    perf: Fix unclone_ctx() vs. locking

    Linus Torvalds
     
  • Pull perf updates from Ingo Molnar:
    "Kernel side updates:

    - Fix and enhance poll support (Jiri Olsa)

    - Re-enable inheritance optimization (Jiri Olsa)

    - Enhance Intel memory events support (Stephane Eranian)

    - Refactor the Intel uncore driver to be more maintainable (Zheng
    Yan)

    - Enhance and fix Intel CPU and uncore PMU drivers (Peter Zijlstra,
    Andi Kleen)

    - [ plus various smaller fixes/cleanups ]

    User visible tooling updates:

    - Add +field argument support for --fields option, so that one can add
    fields to the default list of fields to show, i.e. now one can just
    do:

    perf report --fields +pid

    And the pid will appear in addition to the default fields (Jiri
    Olsa)

    - Add +field argument support for --sort option (Jiri Olsa)

    - Honour -w in the report tools (report, top), allowing to specify
    the widths for the histogram entries columns (Namhyung Kim)

    - Properly show submicrosecond times in 'perf kvm stat' (Christian
    Borntraeger)

    - Add beautifier for mremap flags param in 'trace' (Alex Snast)

    - perf script: Allow callchains if any event samples them

    - Don't truncate Intel style addresses in 'annotate' (Alex Converse)

    - Allow profiling when kptr_restrict == 1 for non root users, kernel
    samples will just remain unresolved (Andi Kleen)

    - Allow configuring default options for callchains in config file
    (Namhyung Kim)

    - Support operations for shared futexes. (Davidlohr Bueso)

    - "perf kvm stat report" improvements by Alexander Yarygin:
    - Save pid string in opts.target.pid
    - Enable the target.system_wide flag
    - Unify the title bar output

    - [ plus lots of other fixes and small improvements. ]

    Tooling infrastructure changes:

    - Refactor unit and scale function parameters for PMU parsing
    routines (Matt Fleming)

    - Improve DSO long names lookup with rbtree, resulting in great
    speedup for workloads with lots of DSOs (Waiman Long)

    - We were not handling POLLHUP notifications for event file
    descriptors

    Fix it by filtering entries in the events file descriptor array
    after poll() returns, refcounting mmaps so that when the last fd
    pointing to a perf mmap goes away we do the unmap (Arnaldo Carvalho
    de Melo)

    - Intel PT prep work, from Adrian Hunter, including:
    - Let a user specify a PMU event without any config terms
    - Add perf-with-kcore script
    - Let default config be defined for a PMU
    - Add perf_pmu__scan_file()
    - Add a 'perf test' for tracking with sched_switch
    - Add 'flush' callback to scripting API

    - Use ring buffer consume method to look like other tools (Arnaldo
    Carvalho de Melo)

    - hists browser (used in top and report) refactorings, getting rid of
    unused variables and reducing source code size by handling similar
    cases in fewer functions (Namhyung Kim).

    - Replace thread unsafe strerror() with strerror_r() across the
    whole tools/perf/ tree (Masami Hiramatsu)

    - Rename ordered_samples to ordered_events and allow setting a queue
    size for ordering events (Jiri Olsa)

    - [ plus lots of fixes, cleanups and other improvements ]"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (198 commits)
    perf/x86: Tone down kernel messages when the PMU check fails in a virtual environment
    perf/x86/intel/uncore: Fix minor race in box set up
    perf record: Fix error message for --filter option not coming after tracepoint
    perf tools: Fix build breakage on arm64 targets
    perf symbols: Improve DSO long names lookup speed with rbtree
    perf symbols: Encapsulate dsos list head into struct dsos
    perf bench futex: Sanitize -q option in requeue
    perf bench futex: Support operations for shared futexes
    perf trace: Fix mmap return address truncation to 32-bit
    perf tools: Refactor unit and scale function parameters
    perf tools: Fix line number in the config file error message
    perf tools: Convert {record,top}.call-graph option to call-graph.record-mode
    perf tools: Introduce perf_callchain_config()
    perf callchain: Move some parser functions to callchain.c
    perf tools: Move callchain config from record_opts to callchain_param
    perf hists browser: Fix callchain print bug on TUI
    perf tools: Use ACCESS_ONCE() instead of volatile cast
    perf tools: Modify error code for when perf_session__new() fails
    perf tools: Fix perf record as non root with kptr_restrict == 1
    perf stat: Fix --per-core on multi socket systems
    ...

    Linus Torvalds
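
The POLLHUP handling described in this pull, filtering the file descriptor array after poll() returns, can be sketched as an in-place compaction. This is a minimal C illustration with a hypothetical helper name; it leaves out the mmap refcounting that the pull message also mentions.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/*
 * Drop every entry whose revents reports POLLHUP, keeping the order of the
 * remaining entries. Returns the number of entries left in fds[].
 */
static size_t filter_hup_sketch(struct pollfd *fds, size_t nr)
{
	size_t kept = 0;
	size_t i;

	assert(fds != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		if (fds[i].revents & POLLHUP)
			continue;	/* peer went away: stop polling this fd */
		if (kept != i)
			fds[kept] = fds[i];
		kept++;
	}
	return kept;
}
```

A caller would run this after each poll() and stop its loop once the returned count reaches zero.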
     

10 Oct, 2014

1 commit

  • Pull cgroup updates from Tejun Heo:
    "Nothing too interesting. Just a handful of cleanup patches"

    * 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    Revert "cgroup: remove redundant variable in cgroup_mount()"
    cgroup: remove redundant variable in cgroup_mount()
    cgroup: fix missing unlock in cgroup_release_agent()
    cgroup: remove CGRP_RELEASABLE flag
    perf/cgroup: Remove perf_put_cgroup()
    cgroup: remove redundant check in cgroup_ino()
    cpuset: simplify proc_cpuset_show()
    cgroup: simplify proc_cgroup_show()
    cgroup: use a per-cgroup work for release agent
    cgroup: remove bogus comments
    cgroup: remove redundant code in cgroup_rmdir()
    cgroup: remove some useless forward declarations
    cgroup: fix a typo in comment.

    Linus Torvalds
     

03 Oct, 2014

3 commits

  • Oleg noticed that a cleanup by Sylvain actually uncovered a bug; by
    calling perf_event_free_task() when failing sched_fork() we will not yet
    have done the memset() on ->perf_event_ctxp[] and will therefore try and
    'free' the inherited contexts, which are still in use by the parent
    process.

    This is bad and might explain some outstanding fuzzer failures ...

    Suggested-by: Oleg Nesterov
    Reported-by: Oleg Nesterov
    Reported-by: Sylvain 'ythier' Hitier
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Aaron Tomlin
    Cc: Andrew Morton
    Cc: Arnaldo Carvalho de Melo
    Cc: Daeseok Youn
    Cc: David Rientjes
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Paul Mackerras
    Cc: Rik van Riel
    Cc: Vladimir Davydov
    Cc:
    Link: http://lkml.kernel.org/r/20140929101201.GE5430@worktop
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The idiot who did 4a1c0f262f88 ("perf: Fix lockdep warning on process exit")
    forgot to pay attention and fix all similar cases. Do so now.

    In particular, unclone_ctx() must be called while holding ctx->lock,
    therefore all such sites are broken for the same reason. Pull the
    put_ctx() call out from under ctx->lock.

    Reported-by: Sasha Levin
    Probably-also-reported-by: Vince Weaver
    Fixes: 4a1c0f262f88 ("perf: Fix lockdep warning on process exit")
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Sasha Levin
    Cc: Cong Wang
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20140930172308.GI4241@worktop.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Oleg noticed that a cleanup by Sylvain actually uncovered a bug; by
    calling perf_event_free_task() when failing sched_fork() we will not yet
    have done the memset() on ->perf_event_ctxp[] and will therefore try and
    'free' the inherited contexts, which are still in use by the parent
    process. This is bad.

    Suggested-by: Oleg Nesterov
    Reported-by: Oleg Nesterov
    Reported-by: Sylvain 'ythier' Hitier
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Ingo Molnar
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

24 Sep, 2014

3 commits

  • This reverts commit 1f9a7268c67f0290837aada443d28fd953ddca90.

    With the fix of the initial state for the cloned event we now correctly
    handle the error described in:

    1f9a7268c67f perf: Do not allow optimized switch for non-cloned events

    so we can revert it.

    I made an automated test for this, but it's not suitable for the automated
    perf tests framework. It needs to be customized for each machine (the
    more cpus, the higher the numbers for GROUPS/WORKERS/BYTES) and it could take
    a long time to hit the issue.

    Signed-off-by: Jiri Olsa
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Frederic Weisbecker
    Cc: Stephane Eranian
    Cc: Jiri Olsa
    Cc: Arnaldo Carvalho de Melo
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20140910143535.GD2409@krava.brq.redhat.com
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     
  • Currently we initialize the child event based on the original
    parent state. This is wrong, because the original parent event
    (and its state) is not related to current fork and also could
    be already gone.

    We need to initialize the child state based on the immediate
    parent event state.

    Signed-off-by: Jiri Olsa
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Frederic Weisbecker
    Cc: Stephane Eranian
    Cc: Jiri Olsa
    Cc: Arnaldo Carvalho de Melo
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1410520708-19275-2-git-send-email-jolsa@kernel.org
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     
  • Currently we return POLLHUP in event polling if the monitored
    process is done, but we didn't consider possible children
    that might still be running and producing data.

    Before returning POLLHUP, make sure that:

    1) the monitored task has exited and that
    2) we don't have any children to monitor

    Also add a parent wakeup when the child event is gone.

    Suggested-by: Peter Zijlstra
    Signed-off-by: Jiri Olsa
    Signed-off-by: Peter Zijlstra (Intel)
    Link: http://lkml.kernel.org/r/1410520708-19275-1-git-send-email-jolsa@kernel.org
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: Stephane Eranian
    Cc: Linus Torvalds
    Signed-off-by: Ingo Molnar

    Jiri Olsa
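
The two conditions for reporting POLLHUP can be written down as a small predicate. This C sketch uses a hypothetical `struct event_poll_sketch` with made-up field names; it models the decision only, not the kernel's poll implementation.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a monitored event, reduced to what the check needs. */
struct event_poll_sketch {
	bool task_exited;		/* the monitored task has exited */
	unsigned int nr_children;	/* inherited child events still alive */
	bool data_ready;		/* unread data in the buffer */
};

/* Report hangup only when the task is gone and no child can produce data. */
static short event_poll_mask_sketch(const struct event_poll_sketch *e)
{
	assert(e != NULL);
	if (e->task_exited && e->nr_children == 0)
		return POLLHUP;
	return e->data_ready ? POLLIN : 0;
}
```

With a live child, an exited parent task no longer yields POLLHUP, so a reader keeps polling until the last child event is gone.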
     

19 Sep, 2014

1 commit


16 Sep, 2014

1 commit

  • Revert PERF_EVENT_STATE_EXIT check on read syscall path.
    It breaks the standard way to read a counter, which is to open
    the counter, wait for the monitored process to die and
    read the counter.

    Reported-by: Stephane Eranian
    Signed-off-by: Jiri Olsa
    Acked-by: Stephane Eranian
    Acked-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Stephane Eranian
    Cc: David Ahern
    Link: http://lkml.kernel.org/r/20140908143107.GG17728@krava.brq.redhat.com
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     

09 Sep, 2014

4 commits

  • We saw a kernel soft lockup in perf_remove_from_context();
    it looks like the `perf` process, when exiting, could not get
    out of the retry loop. Meanwhile, the target process was forking
    a child. So either the target process should execute the smp
    function call to deactivate the event (if it was running) or it should
    do a context switch, which deactivates the event.

    It seems we optimize out a context switch in perf_event_context_sched_out()
    and, more importantly, we still test an obsolete task pointer when
    retrying, so no one would actually deactivate that event in this situation.
    Fix it directly by reloading the task pointer in perf_remove_from_context().

    This should cure the above soft lockup.

    Signed-off-by: Cong Wang
    Signed-off-by: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1409696840-843-1-git-send-email-xiyou.wangcong@gmail.com
    Signed-off-by: Ingo Molnar

    Cong Wang
     
  • The use of "rcu_assign_pointer()" is NULLing out the pointer.
    According to RCU_INIT_POINTER()'s block comment:

    "1. This use of RCU_INIT_POINTER() is NULLing out the pointer"

    it is better to use RCU_INIT_POINTER() instead of rcu_assign_pointer()
    here because it has lower overhead.

    The following Coccinelle semantic patch was used:
    @@
    @@

    - rcu_assign_pointer
    + RCU_INIT_POINTER
    (..., NULL)

    Signed-off-by: Andreea-Cristina Bernat
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Link: http://lkml.kernel.org/r/20140822132605.GA20130@ada
    Signed-off-by: Ingo Molnar

    Andreea-Cristina Bernat
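
As a rough userspace analogy for why NULLing a pointer can use the cheaper primitive, the sketch below uses C11 atomics. It is not the kernel implementation: `struct config`, `shared_cfg`, `publish_cfg()`, `clear_cfg()` and `read_cfg()` are hypothetical names, the release store only stands in for the ordering that rcu_assign_pointer() provides, and the relaxed store stands in for RCU_INIT_POINTER().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload that readers reach through the shared pointer. */
struct config {
    int value;
};

/* Shared pointer; static storage means it starts out as NULL. */
static _Atomic(struct config *) shared_cfg;

/*
 * Publishing a non-NULL pointer: readers will dereference it, so the
 * initialisation of *cfg must be ordered before the pointer store.
 * A release store plays the role of rcu_assign_pointer() in this analogy.
 */
void publish_cfg(struct config *cfg)
{
    atomic_store_explicit(&shared_cfg, cfg, memory_order_release);
}

/*
 * Storing NULL: there is nothing behind the pointer for a reader to
 * dereference, so no ordering is needed. A relaxed store plays the role
 * of RCU_INIT_POINTER() in this analogy.
 */
void clear_cfg(void)
{
    atomic_store_explicit(&shared_cfg, NULL, memory_order_relaxed);
}

/* Reader side; an acquire load is used here as a conservative stand-in. */
struct config *read_cfg(void)
{
    return atomic_load_explicit(&shared_cfg, memory_order_acquire);
}
```

The saving comes from skipping the ordering guarantee on the NULL store, which matters on architectures where a release store is more expensive than a plain one.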
     
  • The use of "rcu_assign_pointer()" is NULLing out the pointer.
    According to RCU_INIT_POINTER()'s block comment:

    "1. This use of RCU_INIT_POINTER() is NULLing out the pointer"

    it is better to use RCU_INIT_POINTER() instead of rcu_assign_pointer()
    here because it has lower overhead.

    The following Coccinelle semantic patch was used:
    @@
    @@

    - rcu_assign_pointer
    + RCU_INIT_POINTER
    (..., NULL)

    Signed-off-by: Andreea-Cristina Bernat
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: paulmck@linux.vnet.ibm.com
    Cc: Arnaldo Carvalho de Melo
    Link: http://lkml.kernel.org/r/20140822141536.GA32051@ada
    Signed-off-by: Ingo Molnar

    Andreea-Cristina Bernat
     
  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     

27 Aug, 2014

1 commit


25 Aug, 2014

2 commits


24 Aug, 2014

2 commits

  • Adding a new perf event state to indicate that the monitored task has
    exited. In this case the event stays alive until the owner task exits
    or closes the event fd, while providing the last data through the read
    syscall and ring buffer.

    Instead it needs to propagate the error info (monitored task has died)
    via poll and read syscalls by returning POLLHUP and 0 respectively.

    Signed-off-by: Jiri Olsa
    Acked-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140811120102.GY9918@twins.programming.kicks-ass.net
    Cc: Adrian Hunter
    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jean Pihet
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-t5y3w8jjx6tfo5w8y6oajsjq@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
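
The "POLLHUP from poll, 0 from read" convention described in the commit above is the same one userspace already sees on other file descriptors whose producer has gone away. The sketch below demonstrates it with an ordinary pipe, used here purely as a stand-in for a perf event fd; `reader_sees_hangup()` is a hypothetical helper name, and nothing in it touches the perf subsystem.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Returns 1 if, once the write end of a pipe is closed, poll() reports
 * POLLHUP and read() returns 0 on the read end; 0 if either does not
 * happen; -1 if the pipe or the poll() call could not be set up.
 */
int reader_sees_hangup(void)
{
    int fds[2];
    char buf[16];
    struct pollfd pfd;
    int hup, eof;

    if (pipe(fds) != 0)
        return -1;

    close(fds[1]);              /* the "producer" goes away */

    pfd.fd = fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, 0) < 0) { /* timeout 0: do not block */
        close(fds[0]);
        return -1;
    }

    hup = (pfd.revents & POLLHUP) != 0;             /* hangup reported by poll */
    eof = read(fds[0], buf, sizeof(buf)) == 0;      /* end of data from read */

    close(fds[0]);
    return hup && eof;
}
```

A monitoring loop would typically treat either signal as "no more data will arrive" and stop polling that descriptor.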
     
  • Currently perf_poll returns POLL_HUP in case of error, which is wrong,
    because the poll syscall expects POLLHUP. POLL_HUP is meant to be used
    for the SIGIO state.

    Signed-off-by: Jiri Olsa
    Acked-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20140811120102.GY9918@twins.programming.kicks-ass.net
    Cc: Adrian Hunter
    Cc: Arnaldo Carvalho de Melo
    Cc: Corey Ashford
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Jean Pihet
    Cc: Namhyung Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-0ywfthh4lh65swe15f6w2x2q@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

20 Aug, 2014

1 commit

  • When running a 32-bit userspace on a 64-bit kernel (e.g. an i386
    application on an x86_64 kernel or a 32-bit arm userspace on an arm64
    kernel), some of the perf ioctls must be treated with special
    care, as they have a pointer size encoded in the command.

    For example, PERF_EVENT_IOC_ID in the 32-bit world will be encoded
    as 0x80042407, but a 64-bit kernel will expect 0x80082407. As a
    result, the ioctl will fail, returning -ENOTTY.

    This patch solves the problem by adding code that fixes up the
    size in a compat_ioctl file operation.

    Reported-by: Drew Richardson
    Signed-off-by: Pawel Moll
    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1402671812-9078-1-git-send-email-pawel.moll@arm.com
    Signed-off-by: Ingo Molnar

    Pawel Moll
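
The two command values quoted in the commit above follow from the generic Linux ioctl number layout (number in bits 0-7, type in bits 8-15, argument size in bits 16-29, direction in bits 30-31). The sketch below reproduces them and shows the kind of size fixup the commit describes. The `TOY_*` macros and `toy_*` functions are hypothetical names, the layout is assumed to be the generic one used on x86 and arm, and unlike a real compat handler this sketch rewrites any command with a 4-byte size field rather than only the pointer-carrying perf ioctls.

```c
#include <stdint.h>

/* Generic ioctl number layout (assumed; some architectures differ). */
#define TOY_IOC_NRSHIFT    0
#define TOY_IOC_TYPESHIFT  8
#define TOY_IOC_SIZESHIFT  16
#define TOY_IOC_DIRSHIFT   30
#define TOY_IOC_SIZEMASK   (UINT32_C(0x3fff) << TOY_IOC_SIZESHIFT)
#define TOY_IOC_READ       UINT32_C(2)

/* Build an ioctl command number from its four fields. */
uint32_t toy_ioc(uint32_t dir, uint32_t type, uint32_t nr, uint32_t size)
{
    return (dir << TOY_IOC_DIRSHIFT) | (size << TOY_IOC_SIZESHIFT) |
           (type << TOY_IOC_TYPESHIFT) | (nr << TOY_IOC_NRSHIFT);
}

/* Extract the argument size encoded in a command number. */
uint32_t toy_ioc_size(uint32_t cmd)
{
    return (cmd & TOY_IOC_SIZEMASK) >> TOY_IOC_SIZESHIFT;
}

/*
 * If the command carries a 32-bit pointer size (4), rewrite the size
 * field to the 64-bit pointer size (8) and leave everything else alone.
 */
uint32_t toy_compat_fixup(uint32_t cmd)
{
    if (toy_ioc_size(cmd) == 4) {
        cmd &= ~TOY_IOC_SIZEMASK;
        cmd |= UINT32_C(8) << TOY_IOC_SIZESHIFT;
    }
    return cmd;
}
```

With type `'$'` (0x24), number 7 and the read direction, `toy_ioc()` gives 0x80042407 for a 4-byte argument and 0x80082407 for an 8-byte one, and `toy_compat_fixup()` maps the former to the latter.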
     

13 Aug, 2014

1 commit

  • One should first enqueue to the waitqueue and then check for the
    condition. If the condition becomes true after mutex_unlock() but before
    poll_wait(), then we lose it and would have to wait for another wakeup.

    This has been like this since v2.6.31-rc1 commit c7138f37f9 ("perf_counter:
    fix perf_poll()"). Before that it was slightly worse. I guess we get enough
    wakeups so if we miss one here it doesn't really matter. It is still a
    bad example.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1407159068-1478-1-git-send-email-bigeasy@linutronix.de
    Cc: Arnaldo Carvalho de Melo
    Cc: Linus Torvalds
    Signed-off-by: Ingo Molnar

    Sebastian Andrzej Siewior
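
The ordering problem described in the commit above can be illustrated with a deterministic toy model. This is not kernel code and involves no real concurrency: `struct toy_waitq`, `toy_produce()` and `toy_poll_notices_event()` are hypothetical names, and the producer is simply called at the point where the race window would be, so both orderings can be compared side by side.

```c
#include <stdbool.h>

/* Hypothetical single-waiter model of a waitqueue plus a condition. */
struct toy_waitq {
    bool waiter_registered; /* stands in for being on the waitqueue */
    bool data_ready;        /* the condition the poller checks */
    bool waiter_woken;      /* a wakeup reached a registered waiter */
};

/* Producer: make the condition true and wake a waiter if one is registered. */
void toy_produce(struct toy_waitq *wq)
{
    wq->data_ready = true;
    if (wq->waiter_registered)
        wq->waiter_woken = true;
}

/*
 * Run one poll attempt with the producer firing inside the race window.
 * Returns true if the poller learns about the event, either by seeing
 * the condition during its check or by being woken afterwards.
 */
bool toy_poll_notices_event(bool register_first)
{
    struct toy_waitq wq = { false, false, false };
    bool seen;

    if (register_first) {
        wq.waiter_registered = true;   /* enqueue first */
        toy_produce(&wq);              /* event arrives in the window */
        seen = wq.data_ready;          /* the check observes it */
    } else {
        seen = wq.data_ready;          /* check first: nothing yet */
        toy_produce(&wq);              /* event arrives, nobody to wake */
        wq.waiter_registered = true;   /* enqueue too late */
    }
    return seen || wq.waiter_woken;
}
```

In this model the enqueue-then-check order always notices the event, while the check-then-enqueue order misses it and would have to wait for a later wakeup.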