17 Oct, 2015

1 commit

  • Pull powerpc fixes from Michael Ellerman:
    - Re-enable CONFIG_SCSI_DH in our defconfigs
    - Remove unused os_area_db_id_video_mode
    - cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew
    - cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew
    - cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew
    - cxl: Workaround malformed pcie packets on some cards from Philippe
    - cxl: Fix number of allocated pages in SPA from Christophe Lombard
    - Fix checkstop in native_hpte_clear() with lockdep from Cyril
    - Panic on unhandled Machine Check on powernv from Daniel
    - selftests/powerpc: Fix build failure of load_unaligned_zeropad test

    * tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    selftests/powerpc: Fix build failure of load_unaligned_zeropad test
    powerpc/powernv: Panic on unhandled Machine Check
    powerpc: Fix checkstop in native_hpte_clear() with lockdep
    cxl: Fix number of allocated pages in SPA
    cxl: Workaround malformed pcie packets on some cards
    cxl: fix leak of ctx->mapping when releasing kernel API contexts
    cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API
    cxl: fix leak of IRQ names in cxl_free_afu_irqs()
    powerpc/ps3: Remove unused os_area_db_id_video_mode
    powerpc/configs: Re-enable CONFIG_SCSI_DH

    Linus Torvalds
     

13 Oct, 2015

1 commit

  • Commit 7a5692e6e533 ("arch/powerpc: provide zero_bytemask() for
    big-endian") added a call to __fls() in our word-at-a-time.h. That was
    fine for the kernel build but missed the fact that we also use
    word-at-a-time.h in a userspace test.

    Pulling in the kernel version of __fls() gets messy, so just define our
    own; it's unlikely to change often.
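
    As a rough illustration of what such a local helper computes (a Python
    sketch, not the C code from the patch): __fls() returns the zero-based
    index of the most significant set bit of a non-zero word. The name
    fls_index and the ValueError on zero are assumptions made for this sketch;
    the kernel helper leaves the result undefined when no bit is set.

```python
def fls_index(word: int) -> int:
    """Return the zero-based index of the most significant set bit.

    Hypothetical stand-in for __fls(). The kernel helper is undefined for
    zero, so this sketch rejects non-positive input instead of guessing.
    """
    if word <= 0:
        raise ValueError("word must have at least one bit set")
    return word.bit_length() - 1


if __name__ == "__main__":
    print(fls_index(0x1))       # lowest bit only
    print(fls_index(0x80))      # top bit of the low byte
    print(fls_index(1 << 63))   # top bit of a 64-bit word
```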

    Fixes: 7a5692e6e533 ("arch/powerpc: provide zero_bytemask() for big-endian")
    Signed-off-by: Michael Ellerman

    Michael Ellerman
     

08 Oct, 2015

1 commit


07 Oct, 2015

1 commit

  • perf_regs.c does not get built on Powerpc as CONFIG_PERF_REGS is false.
    So the weak definition for 'sample_regs_masks' doesn't get picked up.

    Adding perf_regs.o to util/Build unconditionally exposes a redefinition
    error for the 'perf_reg_value()' function (due to the static inline version
    in util/perf_regs.h). So use '#ifdef HAVE_PERF_REGS_SUPPORT' around that
    function.

    Signed-off-by: Sukadev Bhattiprolu
    Acked-by: Jiri Olsa
    Cc: Naveen N. Rao
    Cc: Stephane Eranian
    Cc: linuxppc-dev@ozlabs.org
    Link: http://lkml.kernel.org/r/20150930182836.GA27858@us.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Sukadev Bhattiprolu
     

02 Oct, 2015

1 commit

  • Pull power management and ACPI fixes from Rafael Wysocki:
    "These are fixes mostly, for a few changes made in this cycle (the
    intel_idle driver, the OPP library, the ACPI EC driver, turbostat) and
    for some issues that have just been discovered (ACPI PCI IRQ
    management, PCI power management documentation, turbostat), with a
    couple of cleanups on top of them.

    Specifics:

    - intel_idle driver fixup for the recently added Skylake chips
    support (Len Brown).

    - Operating Performance Points (OPP) library fix related to the
    recently added support for new DT bindings and a fix for a typo in
    a comment (Viresh Kumar, Stephen Boyd).

    - ACPI EC driver fix for a recently introduced memory leak in an
    error code path (Lv Zheng).

    - ACPI PCI IRQ management fix for the issue where an ISA IRQ is
    shared with a PCI device which requires it to be configured in a
    different way and may cause an interrupt storm to happen as a
    result with an extra ACPI SCI IRQ handling simplification on top of
    it (Jiang Liu).

    - Update of the PCI power management documentation that became
    outdated and started to actively confuse the readers to make it
    actually reflect the code (Rafael J Wysocki).

    - turbostat fixes including an IVB Xeon regression fix (related to
    the --debug command line option), Skylake adjustment for the TSC
    running at a frequency that doesn't match the base one exactly, and
    a Knights Landing quirk to account for the fact that it only
    updates APERF and MPERF every 1024 clock cycles plus bumping up the
    turbostat version number (Len Brown, Hubert Chrzaniuk)"

    * tag 'pm+acpi-4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    tools/power turbosat: update version number
    tools/power turbostat: SKL: Adjust for TSC difference from base frequency
    tools/power turbostat: KNL workaround for %Busy and Avg_MHz
    tools/power turbostat: IVB Xeon: fix --debug regression
    ACPI / PCI: Remove duplicated penalty on SCI IRQ
    ACPI, PCI, irq: Do not share PCI IRQ with ISA IRQ
    ACPI / EC: Fix a memory leak issue in acpi_ec_query()
    PM / OPP: Fix typo modifcation -> modification
    PCI / PM: Update runtime PM documentation for PCI devices
    PM / OPP: of_property_count_u32_elems() can return errors
    intel_idle: Skylake Client Support - updated

    Linus Torvalds
     

28 Sep, 2015

1 commit

  • Pull perf fixes from Thomas Gleixner:
    "Another pile of fixes for perf:

    - Plug overflows and races in the core code

    - Sanitize the flow of the perf syscall so we error out before
    handling the more complex and hard to undo setups

    - Improve and fix Broadwell and Skylake hardware support

    - Revert a fix which broke what it tried to fix in perf tools

    - A couple of smaller fixes in various places of perf tools"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf tools: Fix copying of /proc/kcore
    perf intel-pt: Remove no_force_psb from documentation
    perf probe: Use existing routine to look for a kernel module by dso->short_name
    perf/x86: Change test_aperfmperf() and test_intel() to static
    tools lib traceevent: Fix string handling in heterogeneous arch environments
    perf record: Avoid infinite loop at buildid processing with no samples
    perf: Fix races in computing the header sizes
    perf: Fix u16 overflows
    perf: Restructure perf syscall point of no return
    perf/x86/intel: Fix Skylake FRONTEND MSR extrareg mask
    perf/x86/intel/pebs: Add PEBS frontend profiling for Skylake
    perf/x86/intel: Make the CYCLE_ACTIVITY.* constraint on Broadwell more specific
    perf tools: Bool functions shouldn't return -1
    tools build: Add test for presence of __get_cpuid() gcc builtin
    tools build: Add test for presence of numa_num_possible_cpus() in libnuma
    Revert "perf symbols: Fix mismatched declarations for elf_getphdrnum"
    perf stat: Fix per-pkg event reporting bug

    Linus Torvalds
     

27 Sep, 2015

1 commit


26 Sep, 2015

4 commits

  • Signed-off-by: Len Brown

    Len Brown
     
  • On a Skylake with 1500MHz base frequency,
    the TSC runs at 1512MHz.

    This is because the TSC is no longer in the n*100 MHz BCLK domain,
    but is now in the m*24MHz crystal clock domain. (24 MHz * 63 = 1512 MHz)

    This adds error to several calculations in turbostat,
    unless the TSC sample sizes are adjusted for this difference.

    Note that calculations in the time domain are immune
    from this issue, as the timing sub-system has already
    calibrated the TSC against a known wall clock.

    AVG_MHz = APERF_delta/measurement_interval

    needs no adjustment. APERF_delta is in the BCLK domain,
    and measurement_interval is in the time domain.

    TSC_MHz = TSC_delta/measurement_interval

    needs no adjustment -- as we really do want to report
    the actual measured TSC delta here, and measurement_interval
    is in the accurate time domain.

    %Busy = MPERF_delta/TSC_delta

    needs adjustment to use TSC_BCLK_DOMAIN_delta.
    TSC_BCLK_DOMAIN_delta = TSC_delta * base_hz / tsc_hz

    Bzy_MHz = TSC_delta*APERF_delta/MPERF_delta/measurement_interval

    needs adjustment as above.

    No other metrics in turbostat need to be adjusted.

    Before:

    CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
      -     550   24.84    2216    1512
      0    2191   98.73    2219    1514
      2       0    0.01    2130    1512
      1       9    0.43    2016    1512
      3       2    0.08    2016    1512

    After:

    CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz
      -     550   25.05    2198    1512
      0    2190   99.62    2199    1512
      2       0    0.01    2152    1512
      1       9    0.46    2000    1512
      3       2    0.10    2000    1512

    Note that in this example, the "Before" Bzy_MHz
    was reported as exceeding the 2200 max turbo rate.
    Also, even a pinned spin loop would not be reported
    as over 99% busy.
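
    The adjustment can be checked with a small Python sketch. The function
    names and the counter deltas are hypothetical, picked so that a 1-second
    sample lands near the CPU 0 rows above; only the 1500 MHz and 1512 MHz
    frequencies and the TSC_BCLK_DOMAIN_delta formula come from the
    description. The sketch assumes Bzy_MHz is the TSC delta scaled by
    APERF_delta/MPERF_delta and divided by the interval.

```python
def tsc_in_bclk_domain(tsc_delta: int, base_hz: int, tsc_hz: int) -> float:
    """TSC_BCLK_DOMAIN_delta = TSC_delta * base_hz / tsc_hz."""
    return tsc_delta * base_hz / tsc_hz


def busy_percent(mperf_delta: int, tsc_delta: float) -> float:
    """%Busy = MPERF_delta / TSC_delta, expressed as a percentage."""
    return 100.0 * mperf_delta / tsc_delta


def bzy_mhz(tsc_delta: float, aperf_delta: int, mperf_delta: int,
            interval_s: float) -> float:
    """Assumed form: TSC_delta * APERF_delta / MPERF_delta / interval, in MHz."""
    return tsc_delta * aperf_delta / mperf_delta / interval_s / 1e6


if __name__ == "__main__":
    base_hz, tsc_hz = 1_500_000_000, 1_512_000_000
    # Hypothetical deltas for a nearly fully busy CPU over a 1 s interval.
    tsc_delta, aperf_delta, mperf_delta = 1_512_000_000, 2_190_000_000, 1_494_300_000

    adjusted = tsc_in_bclk_domain(tsc_delta, base_hz, tsc_hz)
    print("before:", busy_percent(mperf_delta, tsc_delta),
          bzy_mhz(tsc_delta, aperf_delta, mperf_delta, 1.0))
    print("after: ", busy_percent(mperf_delta, adjusted),
          bzy_mhz(adjusted, aperf_delta, mperf_delta, 1.0))
```

    With these made-up inputs the unadjusted Bzy_MHz exceeds 2200 while the
    adjusted value stays below it, which is the same pattern as in the
    Before/After output.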

    Signed-off-by: Len Brown

    Len Brown
     
  • KNL increments APERF and MPERF every 1024 clocks.
    This is compliant with the architecture specification,
    which requires that only the ratio of APERF/MPERF need be valid.

    However, turbostat takes advantage of the fact that these
    two MSRs increment every un-halted clock
    at the actual and base frequency:

    AVG_MHz = APERF_delta/measurement_interval

    %Busy = MPERF_delta/TSC_delta

    This quirk is needed for these calculations to also work on KNL,
    which would otherwise show a value 1024x smaller than expected.
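
    A minimal Python sketch of the scaling, assuming the quirk amounts to
    multiplying the raw counter deltas by 1024 before they enter the formulas
    above. The names scaled_delta and avg_mhz and the sample counter value are
    hypothetical; only the factor 1024 comes from the description.

```python
KNL_APERF_MPERF_MULTIPLIER = 1024  # KNL counts once per 1024 clocks


def scaled_delta(raw_delta: int, multiplier: int = 1) -> int:
    """Convert a raw APERF/MPERF delta into un-halted clock cycles."""
    return raw_delta * multiplier


def avg_mhz(aperf_delta: int, interval_s: float) -> float:
    """AVG_MHz = APERF_delta / measurement_interval, in MHz."""
    return aperf_delta / interval_s / 1e6


if __name__ == "__main__":
    raw_aperf = 1_464_844  # hypothetical raw delta read over 1 s on KNL
    print(avg_mhz(scaled_delta(raw_aperf), 1.0))  # without the quirk
    print(avg_mhz(scaled_delta(raw_aperf, KNL_APERF_MPERF_MULTIPLIER), 1.0))
```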

    Signed-off-by: Hubert Chrzaniuk
    Signed-off-by: Len Brown

    Hubert Chrzaniuk
     
  • Starting in Linux-4.3-rc1,
    commit 6fb3143b561c ("tools/power turbostat: dump CONFIG_TDP")
    touches MSR 0x648, which is not supported on IVB-Xeon.
    This results in "turbostat --debug" exiting on those systems:

    turbostat: /dev/cpu/2/msr offset 0x648 read failed: Input/output error

    Remove IVB-Xeon from the list of machines supporting that MSR.

    Signed-off-by: Len Brown

    Len Brown
     

25 Sep, 2015

3 commits

  • A copy of /proc/kcore containing the kernel text can be made to the
    buildid cache, e.g.:

    perf buildid-cache -v -k /proc/kcore

    To work around objdump limitations, a copy is also made when annotating
    against /proc/kcore.

    The copying process stops working from libelf about v1.62 onwards (the
    problem was found with v1.63).

    The cause is that a call to gelf_getphdr() in kcore__add_phdr() fails
    because additional validation has been added to gelf_getphdr().

    The use of gelf_getphdr() is a misguided attempt to get default
    initialization of the Gelf_Phdr structure. That should not be
    necessary because every member of the Gelf_Phdr structure is
    subsequently assigned. So just remove the call to gelf_getphdr().

    Similarly, a call to gelf_getehdr() in gelf_kcore__init() can be
    removed also.

    Committer notes:

    Note to stable@kernel.org, from Adrian in the cover letter for this
    patchkit:

    The "Fix copying of /proc/kcore" problem goes back to v3.13 if you think
    it is important enough for stable.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Cc: stable@kernel.org
    Link: http://lkml.kernel.org/r/1443089122-19082-3-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • no_force_psb was dropped as a late change to the kernel driver.
    Consequently, remove it from the documentation.

    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/1443089122-19082-2-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • We have map_groups__find_by_name() to look at the list of modules that
    are in place for a given machine, so use it instead of traversing the
    machine dso list, which also includes DSOs for userspace.

    When merging the user and kernel DSO lists a bug was introduced where
    'perf probe' stopped being able to add probes to modules using its short
    name:

    # perf probe -m usbnet --add usbnet_start_xmit
    usbnet_start_xmit is out of .text, skip it.
    Error: Failed to add events.
    #

    With this fix it works again:

    # perf probe -m usbnet --add usbnet_start_xmit
    Added new event:
    probe:usbnet_start_xmit (on usbnet_start_xmit in usbnet)

    You can now use it in all perf tools, such as:

    perf record -e probe:usbnet_start_xmit -aR sleep 1
    #

    Reported-by: Wang Nan
    Acked-by: Masami Hiramatsu
    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Stephane Eranian
    Fixes: 3d39ac538629 ("perf machine: No need to have two DSOs lists")
    Link: http://lkml.kernel.org/r/20150924015008.GE1897@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

23 Sep, 2015

7 commits

  • We don't need to specify an explicit rule in the Makefile, the implicit
    one will do the same. The "__EXPORTED_HEADERS__" define is not needed,
    because we build the test against the installed kernel headers, not the
    in-tree kernel headers. Re-use "$(TEST_PROGS)" in the clean target
    rather than spelling the executable name twice. Include
    rather than the rather specific . Include
    rather than . In both cases, the former
    header is located in a standard location and includes the latter.

    Signed-off-by: Mathieu Desnoyers
    Acked-by: Michael Ellerman
    Cc: Pranith Kumar
    Cc: Shuah Khan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • On ppc big endian this check fails: the mutex doesn't necessarily need
    to be identical for all pages after pthread_mutex_lock/unlock cycles.
    The count verification (outside of the pthread_mutex_t structure)
    suffices and that is retained.

    Signed-off-by: Andrea Arcangeli
    Cc: Dr. David Alan Gilbert
    Cc: Michael Ellerman
    Cc: Shuah Khan
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • This will report the error in the exit code, in addition to the fprintf.

    Signed-off-by: Andrea Arcangeli
    Cc: Dr. David Alan Gilbert
    Cc: Michael Ellerman
    Cc: Shuah Khan
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • Keep a non-zero placeholder after the count, for the my_bcmp comparison
    of the page against the zeropage. The lockless increment from 255 to
    256 against a lockless my_bcmp could otherwise return false positives on
    ppc32le.

    Signed-off-by: Andrea Arcangeli
    Tested-by: Michael Ellerman
    Cc: Dr. David Alan Gilbert
    Cc: Shuah Khan
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • If __NR_userfaultfd is not yet defined by the arch, warn but still build
    and run the userfaultfd selftest successfully.

    Signed-off-by: Michael Ellerman
    Signed-off-by: Andrea Arcangeli
    Cc: Dr. David Alan Gilbert
    Cc: Shuah Khan
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Ellerman
     
  • Depend on "make headers_install" to create proper headers to include and
    provide syscall numbers.

    Signed-off-by: Andrea Arcangeli
    Cc: Dr. David Alan Gilbert
    Cc: Michael Ellerman
    Cc: Shuah Khan
    Cc: Thierry Reding
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • Add the usr/include subdirectory of the top-level tree to the include
    path, and make sure to include headers without relative paths to make
    sure the sanitized headers get picked up. Otherwise the compiler will
    not be able to find the linux/compiler.h header included by the non-
    sanitized include/uapi/linux/userfaultfd.h.

    While at it, make sure to only hardcode the syscall numbers on x86 and
    PowerPC if they haven't been properly picked up from the headers.

    Signed-off-by: Thierry Reding
    Acked-by: Michael Ellerman
    Cc: Shuah Khan
    Signed-off-by: Andrea Arcangeli
    Cc: Dr. David Alan Gilbert
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thierry Reding
     

22 Sep, 2015

2 commits

  • When a trace recorded on a 32-bit device is processed with a 64-bit
    binary, the higher 32 bits of the address need to be ignored.

    Without this, the 32-bit address lookup in find_printk() fails and the
    raw 64-bit pointer value is printed in the trace.

    Before:

    burn-1778 [003] 548.600305: bputs: 0xc0046db2s: 2cec5c058d98c

    After:

    burn-1778 [003] 548.600305: bputs: 0xc0046db2s: RT throttling activated

    The problem occurs in PRINT_FIELD when the field is recognized as a
    pointer to a string (of the type const char *).

    The following heterogeneous architecture cases can arise and should be handled:

    * Traces recorded using 32-bit addresses processed on a 64-bit machine
    * Traces recorded using 64-bit addresses processed on a 32-bit machine
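
    The masking step can be sketched in Python as follows. This is not the
    traceevent code: normalize_address, the dictionary standing in for the
    printk table, and the pairing of the masked address with the
    "RT throttling activated" string are assumptions for illustration. The
    raw value is the one printed in the "Before" output.

```python
from typing import Dict, Optional


def normalize_address(addr: int, long_size: int) -> int:
    """Drop the upper 32 bits when the trace was recorded with 4-byte longs."""
    if long_size == 4:
        return addr & 0xFFFFFFFF
    return addr


def find_printk(printk_map: Dict[int, str], addr: int,
                long_size: int) -> Optional[str]:
    """Look up a format string by address after normalizing its width."""
    return printk_map.get(normalize_address(addr, long_size))


if __name__ == "__main__":
    # Hypothetical table: the string is assumed to live at the masked address.
    printk_map = {0xC058D98C: "RT throttling activated"}
    raw = 0x2CEC5C058D98C  # value printed in the "Before" trace line

    print(find_printk(printk_map, raw, long_size=8))  # lookup on the raw value
    print(find_printk(printk_map, raw, long_size=4))  # lookup after masking
```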

    Reported-by: Juri Lelli
    Signed-off-by: Kapileshwar Singh
    Reviewed-by: Steven Rostedt
    Cc: David Ahern
    Cc: Javi Merino
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/1442928123-13824-1-git-send-email-kapileshwar.singh@arm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Kapileshwar Singh
     
  • Pull s390 fixes from Martin Schwidefsky:
    "A couple of system call updates. The two new system calls userfaultfd
    and membarrier have been added, as well as the 17 direct calls for the
    multiplexed socket system calls.

    In addition the system call compat wrappers have been flagged as
    notrace functions and a few wrappers could be removed.

    And bug fixes for the vector register handling, cpu_mf, suspend/resume,
    compat signals, SMT cputime accounting and the zfcp dumper"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390: wire up separate socketcalls system calls
    s390/compat: remove superfluous compat wrappers
    s390/compat: do not trace compat wrapper functions
    s390/s390x: allocate sys_membarrier system call number
    s390/configs//zfcpdump_defconfig: Remove CONFIG_MEMSTICK
    s390: wire up userfaultfd system call
    s390/vtime: correct scaled cputime for SMT
    s390/cpum_cf: Corrected return code for unauthorized counter sets
    s390/compat: correct uc_sigmask of the compat signal frame
    s390: fix floating point register corruption
    s390/hibernate: fix save and restore of vector registers

    Linus Torvalds
     

20 Sep, 2015

1 commit

  • …/git/shuah/linux-kselftest

    Pull kselftest fixes from Shuah Khan:
    "This update contains 7 fixes for problems ranging from build failures
    to incorrect error reporting"

    * tag 'linux-kselftest-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
    selftests: exec: revert to default emit rule
    selftests: change install command to rsync
    selftests: mqueue: simplify the Makefile
    selftests: mqueue: allow extra cflags
    selftests: rename jump label to static_keys
    selftests/seccomp: add support for s390
    seltests/zram: fix syntax error

    Linus Torvalds
     

19 Sep, 2015

1 commit


18 Sep, 2015

8 commits

  • If a session contains no events, we can get stuck in an infinite loop in
    __perf_session__process_events, with a non-zero file_size and data_offset, but
    a zero data_size.

    In this case, we can mmap the entirety of the file (consisting of the file and
    attribute headers), and fetch_mmaped_event will correctly refuse to read any
    (unmapped and non-existent) event headers. This causes
    __perf_session__process_events to unmap the file and retry with the exact same
    parameters, getting stuck in an infinite loop.

    This has been observed to result in an exit-time hang when counting
    rare/unschedulable events with perf record, and can be triggered artificially
    with the script below:

    ----
    #!/bin/sh
    printf "REPRO: launching perf\n";
    ./perf record -e software/config=9/ sleep 1 &
    PERF_PID=$!;
    sleep 0.002;
    kill -2 $PERF_PID;
    printf "REPRO: waiting for perf (%d) to exit...\n" "$PERF_PID";
    wait $PERF_PID;
    printf "REPRO: perf exited\n";
    ----

    To avoid this, have __perf_session__process_events bail out early when
    the file has no data (i.e. it has no events).

    Committer note:

    I only managed to reproduce this when setting
    /proc/sys/kernel/kptr_restrict to '1' and changing the code to
    purposefully not process any samples and no synthesized samples, i.e.
    kptr_restrict prevents 'record' from synthesizing the kernel mmaps for
    vmlinux + modules and since it is a workload started from perf, we don't
    synthesize mmap/comm records for existing threads.

    Adrian Hunter managed to reproduce it in his environment, though.

    Signed-off-by: Mark Rutland
    Tested-by: Arnaldo Carvalho de Melo
    Tested-by: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1442423929-12253-1-git-send-email-mark.rutland@arm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Mark Rutland
     
  • …it/acme/linux into perf/urgent

    Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

    User visible changes:

    - When perf_event_open() returned EBUSY and we could not opendir the
    procfs mount point, we would tell the user that the oprofile daemon was
    found, because a bool function returned -1. Fix it; found with
    Coccinelle. (Peter Senna Tschudin)

    - Fix per-pkg event reporting bug in 'perf stat'. (Stephane Eranian)

    Developer visible changes:

    - Fix the missing prototype for a function that is provided when the
    installed libelf lacks it, fixing the build on RHEL/CentOS 5.1 systems,
    for instance. (Arnaldo Carvalho de Melo)

    - Detect if the gcc and libnuma have the features needed to avoid requiring
    the use of NO_LIBNUMA and/or NO_AUXTRACE to build on older systems.
    (Arnaldo Carvalho de Melo)

    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     
  • Returning a negative value from a boolean function seems to have the
    undesired effect of returning true. Replace -1 by false in a
    bool-returning function.

    The diff of the .s file before and after the change (for x86_64):

    3907c3907
    < movl $1, %ebx
    ---
    > xorl %ebx, %ebx

    while if -1 is replaced by true, the diff is empty.
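
    The effect follows from the C conversion rule for bool: any value that
    compares unequal to zero becomes true. A minimal Python model of that rule
    (to_c_bool is a hypothetical helper, not part of perf):

```python
def to_c_bool(value: int) -> bool:
    """Model C's scalar-to-bool conversion: zero is false, anything else true."""
    return value != 0


if __name__ == "__main__":
    # "return -1;" in a bool function behaves like "return true;".
    print(to_c_bool(-1))
    print(to_c_bool(1))
    print(to_c_bool(0))
```

    This is why the generated assembly is identical for -1 and true, and only
    changes once the value is replaced by false.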

    This issue was found by the following Coccinelle semantic patch:


    @@
    identifier f;
    constant C;
    typedef bool;
    @@
    bool f (...){
    <+...
    * return -C;
    ...+>
    }

    Signed-off-by: Peter Senna Tschudin
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Matt Fleming
    Cc: Milos Vyletel
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/1442484533-19742-1-git-send-email-peter.senna@gmail.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Peter Senna Tschudin
     
  • Pull x86 fixes from Ingo Molnar:
    - misc fixes all around the map
    - block non-root vm86(old) if mmap_min_addr != 0
    - two small debuggability improvements
    - removal of obsolete paravirt op

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/platform: Fix Geode LX timekeeping in the generic x86 build
    x86/apic: Serialize LVTT and TSC_DEADLINE writes
    x86/ioapic: Force affinity setting in setup_ioapic_dest()
    x86/paravirt: Remove the unused pv_time_ops::get_tsc_khz method
    x86/ldt: Fix small LDT allocation for Xen
    x86/vm86: Fix the misleading CONFIG_VM86 Kconfig help text
    x86/cpu: Print family/model/stepping in hex
    x86/vm86: Block non-root vm86(old) if mmap_min_addr != 0
    x86/alternatives: Make optimize_nops() interrupt safe and synced
    x86/mm/srat: Print non-volatile flag in SRAT
    x86/cpufeatures: Enable cpuid for Intel SHA extensions

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "Mostly tooling fixes, but also two x86 PMU driver fixes"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf tests: Fix software clock events test setting maps
    perf tests: Fix task exit test setting maps
    perf evlist: Fix create_syswide_maps() not propagating maps
    perf evlist: Fix add() not propagating maps
    perf evlist: Factor out a function to propagate maps for a single evsel
    perf evlist: Make create_maps() use set_maps()
    perf evlist: Make set_maps() more resilient
    perf evsel: Add own_cpus member
    perf evlist: Fix missing thread_map__put in propagate_maps()
    perf evlist: Fix splice_list_tail() not setting evlist
    perf evlist: Add has_user_cpus member
    perf evlist: Remove redundant validation from propagate_maps()
    perf evlist: Simplify set_maps() logic
    perf evlist: Simplify propagate_maps() logic
    perf top: Fix segfault pressing -> with no hist entries
    perf header: Fixup reading of HEADER_NRCPUS feature
    perf/x86/intel: Fix constraint access
    perf/x86/intel/bts: Set event->hw.itrace_started in pmu::start to match the new logic
    perf tools: Fix use of wrong event when processing exit events
    perf tools: Fix parse_events_add_pmu caller

    Linus Torvalds
     
  • The auxtrace code needed by Intel PT uses the __get_cpuid() gcc builtin,
    which is not present on old systems, breaking the build.

    Add a test to check for that builtin and disable AUXTRACE in those
    systems.

    [acme@rhel5 linux]$ make NO_LIBPERL=1 -C tools/perf O=/tmp/build/perf install-bin
    make: Entering directory `/home/acme/git/linux/tools/perf'
    BUILD: Doing 'make -j2' parallel build

    Auto-detecting system features:

    ... lzma: [ on ]
    ... get_cpuid: [ OFF ]

    config/Makefile:630: Your gcc lacks the __get_cpuid() builtin, disables support for auxtrace/Intel PT, please install a newer gcc
    MKDIR /tmp/build/perf/util/

    This fixes the build on old systems such as RHEL/CentOS 5.11.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: "Naveen N. Rao"
    Cc: Peter Zijlstra
    Cc: Srikar Dronamraju
    Cc: Stephane Eranian
    Cc: Victor Kamensky
    Cc: Vinson Lee
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-d4puslul0jltoodzpx9r4sje@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • The existing numa test checks only if numa.h and numa_available() are
    present, but that can be satisfied with an old libnuma that is not
    enough for the 'perf bench numa' entry, so add a test to check for that:

    [acme@rhel5 linux]$ make NO_AUXTRACE=1 NO_LIBPERL=1 -C tools/perf O=/tmp/build/perf install-bin
    make: Entering directory `/home/acme/git/linux/tools/perf'
    BUILD: Doing 'make -j2' parallel build

    Auto-detecting system features:
    ... libelf: [ on ]
    ... libnuma: [ on ]
    ... numa_num_possible_cpus: [ OFF ]
    ... libperl: [ on ]


    config/Makefile:577: Old numa library found, disables 'perf bench numa mem' benchmark, please install numactl-devel/libnuma-devel/libnuma-dev >= 2.0.8
    INSTALL binaries

    This fixes the build on old systems such as RHEL/CentOS 5.11.

    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: "Naveen N. Rao"
    Cc: Peter Zijlstra
    Cc: Srikar Dronamraju
    Cc: Stephane Eranian
    Cc: Victor Kamensky
    Cc: Vinson Lee
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-zqriqkezppi2de2iyjin1tnc@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • This reverts commit f785f2357673d520a0b7b468973cdd197f336494.

    We have a test to check if elf_getphdrnum() is present, so, if it fails,
    we'll get:

    [acme@rhel5 linux]$ cat /tmp/build/perf/feature/test-libelf-getphdrnum.make.output
    cc1: warnings being treated as errors
    test-libelf-getphdrnum.c: In function ‘main’:
    test-libelf-getphdrnum.c:7: warning: implicit declaration of function ‘elf_getphdrnum’
    [acme@rhel5 linux]$

    And this block will not be compiled:

    #ifndef HAVE_ELF_GETPHDRNUM_SUPPORT
    static int elf_getphdrnum(Elf *elf, size_t *dst)
    ...
    #endif

    So, if elf_getphdrnum() is being defined somewhere, there is a problem
    with the test that is not detecting that function, go fix it.

    Reported-by: Vinson Lee
    Cc: Adrian Hunter
    Cc: Borislav Petkov
    Cc: David Ahern
    Cc: Frederic Weisbecker
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: "Naveen N. Rao"
    Cc: Peter Zijlstra
    Cc: Srikar Dronamraju
    Cc: Stephane Eranian
    Cc: Victor Kamensky
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-qn459fal6acvcvm50i8zxx9k@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

17 Sep, 2015

2 commits

  • Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Per-pkg events need to be captured once per processor socket. The code
    in check_per_pkg() ensures only one value per processor package is used.
    However there is a problem with this function in case the first CPU of
    the package does not measure anything for the per-pkg event, but other
    CPUs do.

    Consider the following:

    $ create cgroup FOO; echo $$ >FOO/tasks; taskset -c 1 noploop &
    $ perf stat -a -I 1000 -e intel_cqm/llc_occupancy/ -G FOO sleep 100
    1.00000 Bytes intel_cqm/llc_occupancy/ FOO

    The reason for this is that CPU0 in the cgroup has nothing running on it.
    Yet check_per_pkg() will mark socket0 as processed and no other event
    value will be considered for the socket.

    This patch fixes the problem by having check_per_pkg() only consider
    events which actually ran.
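
    A Python sketch of the idea, under stated assumptions: the Count tuple,
    the per_pkg_total function, the skip_not_run switch and the sample values
    are all hypothetical, and the real check_per_pkg() operates on perf's own
    data structures. The sketch only shows why a CPU that never ran the event
    must not be allowed to claim its socket.

```python
from typing import Iterable, NamedTuple


class Count(NamedTuple):
    cpu: int
    socket: int
    val: int   # counter value read on this CPU
    ena: int   # time the event was enabled
    run: int   # time the event was actually running


def per_pkg_total(counts: Iterable[Count], skip_not_run: bool = True) -> int:
    """Sum one value per socket, taking the first usable CPU of each socket."""
    seen_sockets = set()
    total = 0
    for c in counts:
        if skip_not_run and not (c.run and c.ena):
            continue  # event never ran here: leave the socket unclaimed
        if c.socket in seen_sockets:
            continue  # this socket already contributed a value
        seen_sockets.add(c.socket)
        total += c.val
    return total


if __name__ == "__main__":
    counts = [
        Count(cpu=0, socket=0, val=0, ena=0, run=0),                  # idle in the cgroup
        Count(cpu=1, socket=0, val=4_718_592, ena=1_000, run=1_000),  # ran the workload
    ]
    print(per_pkg_total(counts, skip_not_run=False))  # old behaviour
    print(per_pkg_total(counts))                      # with the fix
```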

    Signed-off-by: Stephane Eranian
    Cc: Adrian Hunter
    Cc: Andi Kleen
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1441286620-10117-1-git-send-email-eranian@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Stephane Eranian
     

16 Sep, 2015

1 commit


15 Sep, 2015

4 commits

  • The test titled "Test software clock events have valid period values"
    was setting cpu/thread maps directly. Make it use the proper function
    perf_evlist__set_maps() especially now that it also propagates the maps.

    Signed-off-by: Adrian Hunter
    Acked-by: Jiri Olsa
    Cc: Kan Liang
    Link: http://lkml.kernel.org/r/1441699142-18905-15-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • The test titled "Test number of exit event of a simple workload" was
    setting cpu/thread maps directly. Make it use the proper function
    perf_evlist__set_maps() especially now that it also propagates the maps.

    Signed-off-by: Adrian Hunter
    Acked-by: Jiri Olsa
    Cc: Kan Liang
    Link: http://lkml.kernel.org/r/1441699142-18905-14-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • Fix it by making it call perf_evlist__set_maps() instead of setting the
    maps itself.

    Signed-off-by: Adrian Hunter
    Acked-by: Jiri Olsa
    Cc: Kan Liang
    Link: http://lkml.kernel.org/r/1441699142-18905-13-git-send-email-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter
     
  • If evsels are added after maps are created, then they won't have any
    maps propagated to them. Fix that.

    Signed-off-by: Adrian Hunter
    Acked-by: Jiri Olsa
    Cc: Kan Liang
    Link: http://lkml.kernel.org/r/1441699142-18905-12-git-send-email-adrian.hunter@intel.com
    [ Moved the moving of propagate_maps() to the patch before, so that this
    one does _just_ the one line fix calling in add()]
    Signed-off-by: Arnaldo Carvalho de Melo

    Adrian Hunter