22 Aug, 2016

3 commits

  • Linus Torvalds
     
  • Pull two parisc fixes from Helge Deller:
    "The first patch ensures that the high-res cr16 clocksource (which was
    added in kernel 4.7) gets chosen as the default clocksource for parisc.

    The second patch moves the #define of EREFUSED down inside errno.h and
    thus unbreaks building the gccgo compiler"

    * 'parisc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: Fix order of EREFUSED define in errno.h
    parisc: Fix automatic selection of cr16 clocksource

    Linus Torvalds
     
  • This is an entirely new driver instead of yet another set of patches
    to sb_edac.c because:

    1) Mapping from PCI devices to socket/memory controller is significantly
    different. Skylake scatters devices on a socket across a number of
    PCI buses.
    2) There is an extra level of interleaving via the "mcroute" register
    that would be a little messy to squeeze into the old driver.
    3) Validation is getting too expensive. Changes to sb_edac need to
    be checked against Sandy Bridge, Ivy Bridge, Haswell, Broadwell and
    Knights Landing.

    Acked-by: Aristeu Rozanski
    Acked-by: Borislav Petkov
    Signed-off-by: Tony Luck
    Signed-off-by: Linus Torvalds

    Tony Luck
     

20 Aug, 2016

8 commits

  • When building gccgo in userspace, errno.h gets parsed and the go include file
    sysinfo.go is generated.

    Since EREFUSED is defined to the same value as ECONNREFUSED, and ECONNREFUSED
    is defined later on in errno.h, this leads to go complaining that EREFUSED
    isn't defined yet.

    Fix this trivial problem by moving the define of EREFUSED down after
    ECONNREFUSED in errno.h (and clean up the indenting while touching this line).
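The ordering described above can be sketched in C. The macro names carry a SKETCH_ prefix and the numeric value is a placeholder assumed for illustration; neither is quoted from the patch. The C preprocessor expands an alias only where it is used, so either order compiles in C, but a tool that translates the defines from top to bottom needs the target to appear first.

```c
/* Placeholder value assumed for illustration; the real one is in errno.h. */
#define SKETCH_ECONNREFUSED 257

/* The alias now follows its target, so a top-to-bottom translator has
 * already seen SKETCH_ECONNREFUSED when it reaches this line. */
#define SKETCH_EREFUSED SKETCH_ECONNREFUSED

/* Hypothetical helpers (not kernel code) that expose both values. */
int sketch_erefused(void)
{
    return SKETCH_EREFUSED;
}

int sketch_econnrefused(void)
{
    return SKETCH_ECONNREFUSED;
}
```

Both helpers return the same number, which is the property the alias is meant to preserve.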

    Signed-off-by: Helge Deller
    Cc: stable@vger.kernel.org

    Helge Deller
     
  • Commit 54b66800907 (parisc: Add native high-resolution sched_clock()
    implementation) added support to use the CPU-internal cr16 counters as a reliable
    clocksource with the help of HAVE_UNSTABLE_SCHED_CLOCK.

    Sadly, the commit failed to remove the hack which prevented cr16 from becoming the
    default clocksource even on SMP systems.

    Signed-off-by: Helge Deller
    Cc: stable@vger.kernel.org # 4.7+

    Helge Deller
     
  • The kernel test robot reported a usercopy failure in the new hardened
    sanity checks, due to a page-crossing copy of the FPU state into the
    task structure.

    This happened because the kernel test robot was testing with SLOB, which
    doesn't actually do the required book-keeping for slab allocations, and
    as a result the hardening code didn't realize that the task struct
    allocation was one single allocation - and the sanity checks fail.

    Since SLOB doesn't even claim to support hardening (and you really
    shouldn't use it), the straightforward solution is to just make the
    usercopy hardening code depend on the allocator supporting it.

    Reported-by: kernel test robot
    Cc: Kees Cook
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Pull i2c fixes from Wolfram Sang:
    "I2C has some pretty standard driver bugfixes and one minor cleanup"

    * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    i2c: meson: Use complete() instead of complete_all()
    i2c: brcmstb: Use complete() instead of complete_all()
    i2c: bcm-kona: Use complete() instead of complete_all()
    i2c: bcm-iproc: Use complete() instead of complete_all()
    i2c: at91: fix support of the "alternative command" feature
    i2c: ocores: add missed clk_disable_unprepare() on failure paths
    i2c: cros-ec-tunnel: Fix usage of cros_ec_cmd_xfer()
    i2c: mux: demux-pinctrl: properly roll back when adding adapter fails

    Linus Torvalds
     
  • Pull device mapper fixes from Mike Snitzer:

    - a stable fix for DM round robin multipath path selector to disable
    preemption before using this_cpu_ptr()

    - a slight increase in DM crypt's mempool reserves to make swap on top
    of DM crypt more performant

    - a few DM raid fixes to issues found while testing changes that were
    merged in v4.8-rc1

    * tag 'dm-4.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    dm raid: support raid0 with missing metadata devices
    dm raid: enhance attempt_restore_of_faulty_devices() to support more devices
    dm raid: fix restoring of failed devices regression
    dm raid: fix frozen recovery regression
    dm crypt: increase mempool reserve to better support swapping
    dm round robin: do not use this_cpu_ptr() without having preemption disabled

    Linus Torvalds
     
  • Pull SCSI fixes from James Bottomley:
    "Six fairly small fixes. The ipr, mpt3sas and ses ones all trigger
    oopses. The megaraid one fixes an attach failure on io mapped only
    cards, the fcoe one is an obvious problem in the error path and the
    aacraid one is a theoretical security issue (ability to trick the
    kernel into a buffer overrun)"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    ses: Fix racy cleanup of /sys in remove_dev()
    mpt3sas: Fix resume on WarpDrive flash cards
    ipr: Fix sync scsi scan
    megaraid_sas: Fix probing cards without io port
    aacraid: Check size values after double-fetch from user
    fcoe: Use kfree_skb() instead of kfree()

    Linus Torvalds
     
  • Pull USB fixes from Greg KH:
    "Here are a number of USB fixes for reported issues for your tree.

    The normal amount of gadget fixes, xhci fixes, new device ids, and a
    few other minor things. All of them have been in linux-next for a
    while, the full details are in the shortlog below"

    * tag 'usb-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (43 commits)
    xhci: don't dereference a xhci member after removing xhci
    usb: xhci: Fix panic if disconnect
    xhci: really enqueue zero length TRBs.
    xhci: always handle "Command Ring Stopped" events
    cdc-acm: fix wrong pipe type on rx interrupt xfers
    usb: misc: usbtest: add fix for driver hang
    usb: dwc3: gadget: stop processing on HWO set
    usb: dwc3: don't set last bit for ISOC endpoints
    usb: gadget: rndis: free response queue during REMOTE_NDIS_RESET_MSG
    usb: udc: core: fix error handling
    usb: gadget: fsl_qe_udc: off by one in setup_received_handle()
    usb/gadget: fix gadgetfs aio support.
    usb: gadget: composite: Fix return value in case of error
    usb: gadget: uvc: Fix return value in case of error
    usb: gadget: fix check in sync read from ep in gadgetfs
    usb: misc: usbtest: usbtest_do_ioctl may return positive integer
    usb: dwc3: fix missing platform_set_drvdata() in dwc3_of_simple_probe()
    usb: phy: omap-otg: Fix missing platform_set_drvdata() in omap_otg_probe()
    usb: gadget: configfs: add mutex lock before unregister gadget
    usb: gadget: u_ether: fix dereference after null check coverify warning
    ...

    Linus Torvalds
     
  • …rnel/git/dgc/linux-xfs

    Pull xfs and iomap fixes from Dave Chinner:
    "Changes in this update:

    Regression fixes for XFS changes introduced in 4.8-rc1:
    - buffer IO accounting assert failure
    - ENOSPC block accounting reservation issue
    - DAX IO path page cache invalidation fix
    - rmapbt on-disk block count in agf
    - correct classification of rmap block type when updating AGFL.
    - iomap support for attribute fork mapping

    Regression fixes for iomap infrastructure in 4.8-rc1:
    - fiemap: honor FIEMAP_FLAG_SYNC
    - fiemap: implement FIEMAP_FLAG_XATTR support to fix XFS regression
    - make mark_page_accessed and pagefault_disable usage consistent with
    other IO paths"

    * tag 'xfs-iomap-for-linus-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
    xfs: remove OWN_AG rmap when allocating a block from the AGFL
    xfs: (re-)implement FIEMAP_FLAG_XATTR
    xfs: simplify xfs_file_iomap_begin
    iomap: mark ->iomap_end as optional
    iomap: prepare iomap_fiemap for attribute mappings
    iomap: fiemap should honor the FIEMAP_FLAG_SYNC flag
    iomap: remove superflous pagefault_disable from iomap_write_actor
    iomap: remove superflous mark_page_accessed from iomap_write_actor
    xfs: store rmapbt block count in the AGF
    xfs: don't invalidate whole file on DAX read/write
    xfs: fix bogus space reservation in xfs_iomap_write_allocate
    xfs: don't assert fail on non-async buffers on ioacct decrement

    Linus Torvalds
     

19 Aug, 2016

15 commits

  • …l/git/groeck/linux-staging

    Pull hwmon fixes from Guenter Roeck:
    "Fix a bug in it87 driver and URLs in ftsteutates driver"

    * tag 'hwmon-for-linus-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
    hwmon: (ftsteutates) Correct ftp urls in driver documentation
    hwmon: (it87) Features mask must be 32 bit wide

    Linus Torvalds
     
  • Pull more drm fixes from Dave Airlie:
    "Daniel pointed out I'd missed some i915 fixes, and I also found a
    single etnaviv fix I missed.

    So here they are"

    * tag 'drm-fixes-for-4.8-rc3-2' of git://people.freedesktop.org/~airlied/linux:
    drm/etnaviv: take GPU lock later in the submit process
    drm/i915: Fix modeset handling during gpu reset, v5.
    drm/i915: fix aliasing_ppgtt leak
    drm/i915: fix WaInsertDummyPushConstPs
    drm/i915: Fix iboost setting for SKL Y/U DP DDI buffer translation entry 2
    drm/i915/gen9: Give one extra block per line for SKL plane WM calculations
    drm/i915: Acquire audio powerwell for HD-Audio registers
    drm/i915: Add missing rpm wakelock to GGTT pread
    drm/i915/fbc: FBC causes display flicker when VT-d is enabled on Skylake
    drm/i915: Clean up the extra RPM ref on CHV with i915.enable_rc6=0
    drm/i915: Program iboost settings for HDMI/DVI on SKL
    drm/i915: Fix iboost setting for DDI with 4 lanes on SKL
    drm/i915: Handle ENOSPC after failing to insert a mappable node
    drm/i915: Flush GT idle status upon reset

    Linus Torvalds
     
  • Pull DeviceTree fixes from Rob Herring:

    - a couple of DT node ref counting fixes

    - fix __unflatten_device_tree for PPC PCI hotplug case

    - rework marking irq controllers as OF_POPULATED in cases where real
    driver is used.

    - disable of_platform_default_populate_init on PPC. The change in
    initcall order causes problems which need to be sorted out later.

    * tag 'devicetree-fixes-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
    of: fix reference counting in of_graph_get_endpoint_by_regs
    of/platform: disable the of_platform_default_populate_init() for all the ppc boards
    ARM: imx6: mark GPC node as not populated after irq init to probe pm domain driver
    of/irq: Mark interrupt controllers as populated before initialisation
    drivers/of: Validate device node in __unflatten_device_tree()
    of: Delete an unnecessary check before the function call "of_node_put"

    Linus Torvalds
     
  • Pull documentation fixes from Jonathan Corbet:
    "Three small fixes for Sphinx-formatted documentation generation"

    * tag '4.8-doc-fixes' of git://git.lwn.net/linux:
    doc-rst: customize RTD theme, drop padding of inline literal
    docs: kernel-documentation: remove some highlight directives
    docs: Set the Sphinx default highlight language to "guess"

    Linus Torvalds
     
  • Collection of i915 fixes.

    * tag 'drm-intel-fixes-2016-08-15' of git://anongit.freedesktop.org/drm-intel:
    drm/i915: Fix modeset handling during gpu reset, v5.
    drm/i915: fix aliasing_ppgtt leak
    drm/i915: fix WaInsertDummyPushConstPs
    drm/i915: Fix iboost setting for SKL Y/U DP DDI buffer translation entry 2
    drm/i915/gen9: Give one extra block per line for SKL plane WM calculations
    drm/i915: Acquire audio powerwell for HD-Audio registers
    drm/i915: Add missing rpm wakelock to GGTT pread
    drm/i915/fbc: FBC causes display flicker when VT-d is enabled on Skylake
    drm/i915: Clean up the extra RPM ref on CHV with i915.enable_rc6=0
    drm/i915: Program iboost settings for HDMI/DVI on SKL
    drm/i915: Fix iboost setting for DDI with 4 lanes on SKL
    drm/i915: Handle ENOSPC after failing to insert a mappable node
    drm/i915: Flush GT idle status upon reset

    Dave Airlie
     
  • Single GPU recovery fix
    * 'drm-etnaviv-fixes' of git://git.pengutronix.de/git/lst/linux:
    drm/etnaviv: take GPU lock later in the submit process

    Dave Airlie
     
  • Pull x86 fixes from Ingo Molnar:
    "An initrd microcode loading fix, and an SMP bootup topology setup fix
    to resolve crashes on SGI/UV systems if the BIOS is configured in a
    certain way"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/smp: Fix __max_logical_packages value setup
    x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y

    Linus Torvalds
     
  • Pull timer fixes from Ingo Molnar:
    "Three clocksource driver fixes"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    clocksource/drivers/mips-gic-timer: Make gic_clocksource_of_init() return int
    clocksource/drivers/kona: Fix get_counter() error handling
    clocksource/drivers/time-armada-370-xp: Fix the clock reference

    Linus Torvalds
     
  • Pull scheduler fixes from Ingo Molnar:
    "Two cputime fixes - hopefully the last ones"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/cputime: Resync steal time when guest & host lose sync
    sched/cputime: Fix NO_HZ_FULL getrusage() monotonicity regression

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "Mostly tooling fixes, but also start/stop filter related fixes, a perf
    event read() fix, a fix uncovered by fuzzing, and an uprobes leak fix"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/core: Check return value of the perf_event_read() IPI
    perf/core: Enable mapping of the stop filters
    perf/core: Update filters only on executable mmap
    perf/core: Fix file name handling for start/stop filters
    perf/core: Fix event_function_local()
    uprobes: Fix the memcg accounting
    perf intel-pt: Fix occasional decoding errors when tracing system-wide
    tools: Sync kvm related header files for arm64 and s390
    perf probe: Release resources on error when handling exit paths
    perf probe: Check for dup and fdopen failures
    perf symbols: Fix annotation of objects with debuginfo files
    perf script: Don't disable use_callchain if input is pipe
    perf script: Show proper message when failed list scripts
    perf jitdump: Add the right header to get the major()/minor() definitions
    perf ppc64le: Fix build failure when libelf is not present
    perf tools mem: Fix -t store option for record command
    perf intel-pt: Fix ip compression

    Linus Torvalds
     
  • Pull locking fixes from Ingo Molnar:
    "Two lockless_dereference() related fixes"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    locking/barriers: Suppress sparse warnings in lockless_dereference()
    Revert "drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference"

    Linus Torvalds
     
  • Pull arm64 fixes from Catalin Marinas:

    - Avoid a literal load with the MMU off on the CPU resume path
    (potential inconsistency between cache and RAM)

    - Build error with CONFIG_ACPI=n fixed

    - Compiler warning in the arch/arm64/mm/dump.c code fixed

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: Fix shift warning in arch/arm64/mm/dump.c
    arm64: kernel: avoid literal load of virtual address with MMU off
    arm64: Fix NUMA build error when !CONFIG_ACPI

    Linus Torvalds
     
  • Pull ARM fixes from Russell King:
    "Only three fixes this time:

    - Emil found an overflow problem with the memory layout sanity check.

    - Ard Biesheuvel noticed that late-allocated page tables (for EFI)
    weren't being properly constructed.

    - Guenter Roeck reported a problem found on qemu caused by the recent
    addr_limit changes"

    * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
    ARM: fix address limit restoration for undefined instructions
    ARM: 8591/1: mm: use fully constructed struct pages for EFI pgd allocations
    ARM: 8590/1: sanity_check_meminfo(): avoid overflow on vmalloc_limit

    Linus Torvalds
     
  • Pull power management fixes from Rafael Wysocki:
    "More hibernation-related material: one fix for a recent regression in
    the core, one small cleanup of the x86-64 resume code and a
    documentation update.

    Specifics:

    - Fix a hibernate core regression resulting from uncovering a latent
    bug in its implementation of memory bitmaps by a recent commit
    (James Morse).

    - Use __pa() to compute a physical address in the x86-64 code
    finalizing resume from hibernation (Rafael Wysocki).

    - Update power management documentation related to system sleep
    states to remove outdated information from it and to add a
    description of a recently introduced hibernation debug feature to
    it (Rafael Wysocki)"

    * tag 'pm-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
    x86/power/64: Use __pa() for physical address computation
    PM / sleep: Update some system sleep documentation

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "Pretty quiet so far:

    - a few amdgpu/radeon fixup for pcie pm changes
    - a couple of amdgpu fixes
    - some build fixes
    - printk fix"

    * tag 'drm-fixes-for-4.8-rc3' of git://people.freedesktop.org/~airlied/linux:
    drm/amdgpu: Change GART offset to 64-bit
    drm/mediatek: add ARM_SMCCC dependency
    drm/mediatek: add CONFIG_OF dependency
    drm/mediatek: add COMMON_CLK dependency
    drm/amdgpu: Fix memory trashing if UVD ring test fails
    drm/amdgpu: fix vm init error path
    drm/amdkfd: print doorbell offset as a hex value
    Revert "drm/radeon: work around lack of upstream ACPI support for D3cold"
    Revert "drm/amdgpu: work around lack of upstream ACPI support for D3cold"

    Linus Torvalds
     

18 Aug, 2016

14 commits

  • After Peter's commit:

    331b6d8c7afc ("locking/barriers: Validate lockless_dereference() is used on a pointer type")

    ... we get a lot of sparse warnings (one for every rcu_dereference, and more)
    since the expression here is assigning to the wrong address space.

    Rather than validating that 'p' is a pointer this way, make
    compilation fail when it's not by using sizeof(*(p)). This will
    not cause any sparse warnings (tested, likely since the address
    space is irrelevant for sizeof), and will fail compilation when
    'p' isn't a pointer type.
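A minimal sketch of the sizeof(*(p)) idea follows. The macro and function names are hypothetical, and this is not the kernel's lockless_dereference() implementation; it only shows how the expression rejects non-pointer arguments at compile time.

```c
/*
 * Hypothetical macro, not the kernel's lockless_dereference().
 * sizeof(*(p)) is only valid when p can be dereferenced, so passing a
 * non-pointer breaks the build. The operand of sizeof is not evaluated
 * here, so nothing is loaded from *p at run time.
 */
#define SKETCH_LOAD_POINTER(p) ((void)sizeof(*(p)), (p))

static int sketch_value = 42;

/* Returns the pointer unchanged once the compile-time check has passed. */
int *sketch_load(void)
{
    int *ptr = &sketch_value;

    return SKETCH_LOAD_POINTER(ptr);
}

/*
 * Uncommenting the function below would fail to compile, because an
 * int cannot be dereferenced:
 *
 * int sketch_bad(void) { int n = 0; return SKETCH_LOAD_POINTER(n); }
 */
```

Because the check only needs the pointed-to type's size, it also requires that type to be complete, which is the reason the revert described in the next commit was needed.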

    Tested-by: Paul E. McKenney
    Signed-off-by: Johannes Berg
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Chris Wilson
    Cc: Daniel Vetter
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: 331b6d8c7afc ("locking/barriers: Validate lockless_dereference() is used on a pointer type")
    Link: http://lkml.kernel.org/r/1470909022-687-2-git-send-email-johannes@sipsolutions.net
    Signed-off-by: Ingo Molnar

    Johannes Berg
     
  • This reverts commit:

    fa7d81bb3c269 ("drm/fb-helper: Reduce READ_ONCE(master) to lockless_dereference")

    As Peter explained:

    [...] lockless_dereference() is _stronger_ than READ_ONCE(), not weaker.

    [...]

    Also, clue is in the name: 'dereference', you don't actually dereference
    the pointer here, only load it.

    My next patch breaks the compile without this revert, because it assumes
    you want to dereference and thus also need the struct type visible (which
    it isn't here), so revert it.

    Tested-by: Paul E. McKenney
    Signed-off-by: Johannes Berg
    Signed-off-by: Peter Zijlstra (Intel)
    Reviewed-by: Daniel Vetter
    Cc: Andrew Morton
    Cc: Chris Wilson
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1470909022-687-1-git-send-email-johannes@sipsolutions.net
    Signed-off-by: Ingo Molnar

    Johannes Berg
     
  • When building with 48-bit VAs and 16K page configuration, it's possible
    to get the following warning when building the arm64 page table dumping
    code:

    arch/arm64/mm/dump.c: In function ‘walk_pud’:
    arch/arm64/mm/dump.c:274:102: warning: right shift count >= width of type [-Wshift-count-overflow]

    This is because pud_offset(pgd, 0) performs a shift to the right by 36
    while the value 0 has the type 'int' by default, therefore 32-bit.

    This patch modifies all the p*_offset() uses in arch/arm64/mm/dump.c to
    use 0UL for the address argument.
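The type issue can be illustrated with a small stand-in for the index computation. The macro names and the entries-per-table constant are assumptions for this sketch (16K pages with 48-bit VAs); only the shift of 36 comes from the text above. uint64_t is used here so the sketch does not depend on the width of unsigned long.

```c
#include <stdint.h>

/* Assumed for illustration: a PUD shift of 36 (the shift named in the
 * commit text) and 2048 entries per table. */
#define SKETCH_PUD_SHIFT    36
#define SKETCH_PTRS_PER_PUD 2048

/* Hypothetical stand-in for the index computation inside pud_offset(). */
#define sketch_pud_index(addr) \
    (((addr) >> SKETCH_PUD_SHIFT) & (SKETCH_PTRS_PER_PUD - 1))

/* A 64-bit zero keeps the shift count below the operand width.
 * A plain 0 has type int (32 bits), which is what triggered the warning. */
uint64_t sketch_index_of_zero(void)
{
    return sketch_pud_index(UINT64_C(0));
}

/* The same computation for an arbitrary 64-bit address. */
uint64_t sketch_index_of(uint64_t addr)
{
    return sketch_pud_index(addr);
}
```

Replacing UINT64_C(0) with a bare 0 in sketch_index_of_zero() would shift a 32-bit int by 36, which is the situation the warning reports.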

    Acked-by: Mark Rutland
    Signed-off-by: Catalin Marinas

    Catalin Marinas
     
  • Commit:

    57430218317e ("sched/cputime: Count actually elapsed irq & softirq time")

    ... fixed a bug but also triggered a regression:

    On an i5 laptop with 4 pCPUs and 4 vCPUs for one full dynticks guest, four
    CPU hog processes (for loops) run in the guest. I hot-unplug the pCPUs
    on the host one by one until there is only one left, then observe CPU utilization
    via 'top' in the guest. It shows:

    100% st for cpu0(housekeeping)
    75% st for other CPUs (nohz full mode)

    However, w/o this commit it shows the correct 75% for all four CPUs.

    When a guest is interrupted for a longer amount of time, missed clock ticks
    are not redelivered later. Because of that, we should not limit the amount
    of steal time accounted to the amount of time that the calling functions
    think have passed.

    However, the interval returned by account_other_time() is NOT rounded down
    to the nearest jiffy, while the base interval in get_vtime_delta() it is
    subtracted from is, so the max cputime limit is required to avoid underflow.

    This patch fixes the regression by limiting the account_other_time() from
    get_vtime_delta() to avoid underflow, and lets the other three call sites
    (in account_other_time() and steal_account_process_time()) account however
    much steal time the host told us elapsed.
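The underflow concern can be sketched as follows. All names are hypothetical and the functions are a simplified model, not the scheduler code: one caller subtracts the accounted time from a rounded-down interval and therefore needs a cap, while the other callers pass the reported steal time through unchanged.

```c
#include <stdint.h>

/* Hypothetical helper: cap the "other" time (steal, irq, ...) at max. */
static uint64_t sketch_cap(uint64_t other, uint64_t max)
{
    return other < max ? other : max;
}

/* Model of the get_vtime_delta() case: delta is rounded down to a jiffy
 * while other is not, so other may exceed delta. Capping keeps the
 * unsigned subtraction from wrapping around. */
uint64_t sketch_vtime_delta(uint64_t delta, uint64_t other)
{
    return delta - sketch_cap(other, delta);
}

/* Model of the remaining call sites: account whatever the host reported. */
uint64_t sketch_steal_uncapped(uint64_t reported_steal)
{
    return reported_steal;
}
```

With delta = 10 and other = 25, the capped version yields 0 instead of a huge wrapped value.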

    Suggested-by: Rik van Riel
    Suggested-by: Paolo Bonzini
    Signed-off-by: Wanpeng Li
    Reviewed-by: Rik van Riel
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Radim Krcmar
    Cc: Thomas Gleixner
    Cc: kvm@vger.kernel.org
    Link: http://lkml.kernel.org/r/1471399546-4069-1-git-send-email-wanpeng.li@hotmail.com
    [ Improved the changelog. ]
    Signed-off-by: Ingo Molnar

    Wanpeng Li
     
  • Mike reports:

    Roughly 10% of the time, ltp testcase getrusage04 fails:
    getrusage04 0 TINFO : Expected timers granularity is 4000 us
    getrusage04 0 TINFO : Using 1 as multiply factor for max [us]time increment (1000+4000us)!
    getrusage04 0 TINFO : utime: 0us; stime: 179us
    getrusage04 0 TINFO : utime: 3751us; stime: 0us
    getrusage04 1 TFAIL : getrusage04.c:133: stime increased > 5000us:

    And tracked it down to the case where the task simply doesn't get
    _any_ [us]time ticks.

    Update the code to assume all rtime is utime when we lack information,
    thus ensuring a task that elides the tick gets time accounted.
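A simplified model of that rule is sketched below. The struct and function names are hypothetical, and the proportional split is a plain integer division assumed for illustration, not the kernel's scaling code. The point is the first branch: with no system-time ticks sampled (including no ticks at all), the whole runtime is reported as user time.

```c
#include <stdint.h>

/* Hypothetical result type for the sketch. */
struct sketch_split {
    uint64_t utime;
    uint64_t stime;
};

/* Split the precise runtime (rtime) using the tick-based samples. */
struct sketch_split sketch_adjust(uint64_t utime_ticks, uint64_t stime_ticks,
                                  uint64_t rtime)
{
    struct sketch_split out;

    if (stime_ticks == 0) {
        /* No system ticks, or no ticks at all: everything is user time. */
        out.utime = rtime;
        out.stime = 0;
    } else if (utime_ticks == 0) {
        /* Only system ticks were sampled. */
        out.utime = 0;
        out.stime = rtime;
    } else {
        /* Plain proportional split; the product may overflow for very
         * large inputs, which this sketch does not handle. */
        out.stime = rtime * stime_ticks / (utime_ticks + stime_ticks);
        out.utime = rtime - out.stime;
    }
    return out;
}
```

In every branch utime + stime equals rtime, so a task that never sees a tick still has its time accounted.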

    Reported-by: Mike Galbraith
    Tested-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Frederic Weisbecker
    Cc: Fredrik Markstrom
    Cc: Linus Torvalds
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: Radim
    Cc: Rik van Riel
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Cc: Wanpeng Li
    Cc: stable@vger.kernel.org # 4.3+
    Fixes: 9d7fb0427648 ("sched/cputime: Guarantee stime + utime == rtime")
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The call to smp_call_function_single in perf_event_read() may fail if
    an invalid or offline CPU index is passed. Warn the user if such a bug is
    present and return an error.

    Signed-off-by: David Carrillo-Cisneros
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Kan Liang
    Cc: Linus Torvalds
    Cc: Paul Turner
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vegard Nossum
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/1471467307-61171-2-git-send-email-davidcc@google.com
    Signed-off-by: Ingo Molnar

    David Carrillo-Cisneros
     
  • At this time the perf_addr_filter_needs_mmap() function will _not_
    return true on a user space 'stop' filter. But stop filters need
    exactly the same kind of mapping that range and start filters get.

    Signed-off-by: Mathieu Poirier
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/1468860187-318-4-git-send-email-mathieu.poirier@linaro.org
    Signed-off-by: Ingo Molnar

    Mathieu Poirier
     
  • Function perf_event_mmap() is called by the MM subsystem each time
    part of a binary is loaded in memory. There can be several mappings
    for a binary, many times unrelated to the code section.

    Each time a section of a binary is mapped address filters are
    updated, even when the map doesn't pertain to the code section.
    The end result is that filters are configured based on the last map
    event that was received rather than the last mapping of the code
    segment.

    For example, suppose we have an executable 'main' that calls library
    'libcstest.so.1.0', and we want to collect traces on code
    that is in that library. The perf cmd line for this scenario
    would be:

    perf record -e cs_etm// --filter 'filter 0x72c/0x40@/opt/lib/libcstest.so.1.0' --per-thread ./main

    Resulting in binaries being mapped this way:

    root@linaro-nano:~# cat /proc/1950/maps
    00400000-00401000 r-xp 00000000 08:02 33169 /home/linaro/main
    00410000-00411000 r--p 00000000 08:02 33169 /home/linaro/main
    00411000-00412000 rw-p 00001000 08:02 33169 /home/linaro/main
    7fa2464000-7fa2474000 rw-p 00000000 00:00 0
    7fa2474000-7fa25a4000 r-xp 00000000 08:02 543 /lib/aarch64-linux-gnu/libc-2.21.so
    7fa25a4000-7fa25b3000 ---p 00130000 08:02 543 /lib/aarch64-linux-gnu/libc-2.21.so
    7fa25b3000-7fa25b7000 r--p 0012f000 08:02 543 /lib/aarch64-linux-gnu/libc-2.21.so
    7fa25b7000-7fa25b9000 rw-p 00133000 08:02 543 /lib/aarch64-linux-gnu/libc-2.21.so
    7fa25b9000-7fa25bd000 rw-p 00000000 00:00 0
    7fa25bd000-7fa25be000 r-xp 00000000 08:02 38308 /opt/lib/libcstest.so.1.0
    7fa25be000-7fa25cd000 ---p 00001000 08:02 38308 /opt/lib/libcstest.so.1.0
    7fa25cd000-7fa25ce000 r--p 00000000 08:02 38308 /opt/lib/libcstest.so.1.0
    7fa25ce000-7fa25cf000 rw-p 00001000 08:02 38308 /opt/lib/libcstest.so.1.0
    7fa25cf000-7fa25eb000 r-xp 00000000 08:02 574 /lib/aarch64-linux-gnu/ld-2.21.so
    7fa25ef000-7fa25f2000 rw-p 00000000 00:00 0
    7fa25f7000-7fa25f9000 rw-p 00000000 00:00 0
    7fa25f9000-7fa25fa000 r--p 00000000 00:00 0 [vvar]
    7fa25fa000-7fa25fb000 r-xp 00000000 00:00 0 [vdso]
    7fa25fb000-7fa25fc000 r--p 0001c000 08:02 574 /lib/aarch64-linux-gnu/ld-2.21.so
    7fa25fc000-7fa25fe000 rw-p 0001d000 08:02 574 /lib/aarch64-linux-gnu/ld-2.21.so
    7ff2ea8000-7ff2ec9000 rw-p 00000000 00:00 0 [stack]
    root@linaro-nano:~#

    Before 'main()' can execute 'libcstest.so.1.0' has to be loaded in
    memory. Once that has been done perf_event_mmap() has been called
    4 times, with the last map starting at address 0x7fa25ce000 and
    the address filter configured to start filtering when the
    IP has passed over address 0x7fa25ce72c (0x7fa25ce000 + 0x72c).

    But that is wrong since the code segment for library 'libcstest.so.1.0'
    has been mapped at 0x7fa25bd000, resulting in traces not being
    collected.

    This patch corrects the situation by requesting that address
    filters be updated only if the mapped event is for a code
    segment.
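The effect of the check can be sketched with the four libcstest.so.1.0 mappings from the listing above. The struct, flag, and function names are hypothetical and stand in for the kernel's data structures; only the addresses and the 0x72c offset come from the commit text.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag standing in for the "x" permission of a mapping. */
#define SKETCH_EXEC 0x4u

struct sketch_mapping {
    uint64_t start;
    unsigned int prot;
};

/* Update the filter start only for executable mappings. */
void sketch_update_filter(uint64_t *filter_start,
                          const struct sketch_mapping *m, uint64_t offset)
{
    if (!(m->prot & SKETCH_EXEC))
        return;
    *filter_start = m->start + offset;
}

/* Replay the four map events for the library in the order listed. */
uint64_t sketch_replay_libcstest(void)
{
    static const struct sketch_mapping maps[] = {
        { UINT64_C(0x7fa25bd000), SKETCH_EXEC }, /* r-xp, code segment */
        { UINT64_C(0x7fa25be000), 0 },           /* ---p */
        { UINT64_C(0x7fa25cd000), 0 },           /* r--p */
        { UINT64_C(0x7fa25ce000), 0 },           /* rw-p */
    };
    uint64_t filter_start = 0;
    size_t i;

    for (i = 0; i < sizeof(maps) / sizeof(maps[0]); i++)
        sketch_update_filter(&filter_start, &maps[i], 0x72c);
    return filter_start;
}
```

Without the permission check, the last map event would win and the filter would start at 0x7fa25ce72c; with it, the filter stays at 0x7fa25bd72c inside the code segment.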

    Signed-off-by: Mathieu Poirier
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/1468860187-318-3-git-send-email-mathieu.poirier@linaro.org
    Signed-off-by: Ingo Molnar

    Mathieu Poirier
     
  • Binary file names have to be supplied for both range and start/stop
    filters but the current code only processes the filename if an
    address range filter is specified. This code adds processing of
    the filename for start/stop filters.

    Signed-off-by: Mathieu Poirier
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Link: http://lkml.kernel.org/r/1468860187-318-2-git-send-email-mathieu.poirier@linaro.org
    Signed-off-by: Ingo Molnar

    Mathieu Poirier
     
  • Vincent reported triggering the WARN_ON_ONCE() in event_function_local().

    While thinking through cases I noticed that by using event_function()
    directly, we miss the inactive case usually handled by
    event_function_call().

    Therefore construct a blend of event_function_call() and
    event_function() that handles the cases relevant to
    event_function_local().

    Reported-by: Vince Weaver
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alexander Shishkin
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: stable@vger.kernel.org # 4.5+
    Fixes: fae3fde65138 ("perf: Collapse and fix event_function_call() users")
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Frank reported a kernel panic when he disabled several cores in the BIOS
    via the following option:

    Core Disable Bitmap(Hex) [0]

    with number 0xFFE, which leaves 16 CPUs in the system (out of 48).

    The kernel panic below goes along with the following messages:

    smpboot: Max logical packages: 2
    smpboot: APIC(0) Converting physical 0 to logical package 0
    smpboot: APIC(20) Converting physical 1 to logical package 1
    smpboot: APIC(40) Package 2 exceeds logical package map
    smpboot: CPU 8 APICId 40 disabled
    smpboot: APIC(60) Package 3 exceeds logical package map
    smpboot: CPU 12 APICId 60 disabled
    ...
    general protection fault: 0000 [#1] SMP
    Modules linked in:
    CPU: 15 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc5+ #1
    Hardware name: SGI UV300/UV300, BIOS SGI UV 300 series BIOS 05/25/2016
    task: ffff8801673e0000 ti: ffff8801673ac000 task.ti: ffff8801673ac000
    RIP: 0010:[] [] uncore_change_context+0xd4/0x180
    ...
    [] uncore_event_init_cpu+0x6c/0x70
    [] intel_uncore_init+0x1c2/0x2dd
    [] ? uncore_cpu_setup+0x17/0x17
    [] do_one_initcall+0x50/0x190
    [] ? parse_args+0x293/0x480
    [] kernel_init_freeable+0x1a5/0x249
    [] ? set_debug_rodata+0x12/0x12
    [] kernel_init+0xe/0x110
    [] ret_from_fork+0x1f/0x40
    [] ? rest_init+0x80/0x80

    The reason for the panic is a wrong value of __max_logical_packages,
    which leaves logical_package_map uninitialized, while the uncore code
    relies on this map being properly initialized (maybe we should
    add some safety checks there as well).

    The __max_logical_packages is computed as:

    DIV_ROUND_UP(total_cpus, ncpus);
    - ncpus being number of cores

    With the above BIOS setup we get total_cpus == 16, which sets
    __max_logical_packages to 2 (ncpus is 12).
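
    The arithmetic can be checked with a small stand-alone sketch. The macro
    below matches the usual kernel definition of DIV_ROUND_UP; the helper
    name estimate_max_packages() is hypothetical and exists only for this
    illustration.

```c
#include <assert.h>

/* Same shape as the kernel's DIV_ROUND_UP: integer division, rounded up. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper (not a kernel function): the package estimate
 * described in the commit message, DIV_ROUND_UP(total_cpus, ncpus).
 */
unsigned int estimate_max_packages(unsigned int total_cpus, unsigned int ncpus)
{
	assert(ncpus != 0); /* a zero divisor would be a caller bug */
	return DIV_ROUND_UP(total_cpus, ncpus);
}
```

    With total_cpus == 16 and ncpus == 12 the helper returns 2, while the
    boot log above shows physical packages 0 through 3, so the estimate is
    too small for this machine.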

    Once topology_update_package_map() processes a CPU whose logical
    package lies beyond that maximum, we display the above messages and
    fail to initialize the physical_to_logical_pkg map, which makes the
    uncore code crash.

    The fix is to remove the logical_package_map bitmap completely
    and to maintain and update the logical_packages count instead.

    After we enumerate all the present CPUs, we check if the
    enumerated logical packages count is within its computed
    maximum from BIOS data.

    If it's not the case, we set this maximum to the new enumerated
    value and freeze any new addition of logical packages.

    The freeze is needed because a lot of init code, such as
    uncore/rapl/cqm, depends on the maximum logical package value being
    set when it allocates its data, so we can't change it later on.
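
    The count-and-freeze scheme described above can be sketched roughly as
    follows. This is a user-space model under stated assumptions, not the
    patch itself: struct pkg_state, MAX_PHYS_PKGS and the three functions
    are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PHYS_PKGS 64 /* illustrative bound, not taken from the patch */

/* Hypothetical model of the state the commit message describes. */
struct pkg_state {
	int phys_to_logical[MAX_PHYS_PKGS]; /* -1 means "not mapped" */
	unsigned int logical_packages;      /* packages enumerated so far */
	unsigned int max_logical_packages;  /* estimate, corrected later */
	bool frozen;                        /* no new packages once set */
};

void pkg_state_init(struct pkg_state *s, unsigned int estimated_max)
{
	assert(s != NULL);
	for (size_t i = 0; i < MAX_PHYS_PKGS; i++)
		s->phys_to_logical[i] = -1;
	s->logical_packages = 0;
	s->max_logical_packages = estimated_max;
	s->frozen = false;
}

/* Map a physical package id to a logical one; returns -1 on refusal. */
int pkg_map_update(struct pkg_state *s, unsigned int phys_pkg)
{
	if (phys_pkg >= MAX_PHYS_PKGS)
		return -1;
	if (s->phys_to_logical[phys_pkg] >= 0)
		return s->phys_to_logical[phys_pkg]; /* already known */
	if (s->frozen)
		return -1; /* additions are frozen after enumeration */
	s->phys_to_logical[phys_pkg] = (int)s->logical_packages;
	return (int)s->logical_packages++;
}

/* Called once all present CPUs have been enumerated. */
void pkg_enumeration_done(struct pkg_state *s)
{
	if (s->logical_packages > s->max_logical_packages) {
		/* The estimate was too small: adopt the real count, freeze. */
		s->max_logical_packages = s->logical_packages;
		s->frozen = true;
	}
}
```

    In this model, starting from an estimate of 2 and enumerating physical
    packages 0 to 3 ends with a maximum of 4, and any later attempt to add
    a fifth package is refused, so code that sized its data from the
    maximum is never invalidated.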

    Prarit Bhargava tested the patch and confirms that it solves
    the problem:

    From dmidecode:
    Core Count: 24
    Core Enabled: 24
    Thread Count: 48

    Orig kernel boot log:

    [ 0.464981] smpboot: Max logical packages: 19
    [ 0.469861] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.477261] smpboot: APIC(40) Converting physical 1 to logical package 1
    [ 0.484760] smpboot: APIC(80) Converting physical 2 to logical package 2
    [ 0.492258] smpboot: APIC(c0) Converting physical 3 to logical package 3

    1. nr_cpus=8, should stop enumerating in package 0:

    [ 0.533664] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.539596] smpboot: Max logical packages: 19

    2. maxcpus=8, should still enumerate all packages:

    [ 0.526494] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.532428] smpboot: APIC(40) Converting physical 1 to logical package 1
    [ 0.538456] smpboot: APIC(80) Converting physical 2 to logical package 2
    [ 0.544486] smpboot: APIC(c0) Converting physical 3 to logical package 3
    [ 0.550524] smpboot: Max logical packages: 19

    3. nr_cpus=49 (2 sockets + 1 core on the 3rd socket), should stop
    enumerating in package 2:

    [ 0.521378] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.527314] smpboot: APIC(40) Converting physical 1 to logical package 1
    [ 0.533345] smpboot: APIC(80) Converting physical 2 to logical package 2
    [ 0.539368] smpboot: Max logical packages: 19

    4. maxcpus=49, should still enumerate all packages:

    [ 0.525591] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.531525] smpboot: APIC(40) Converting physical 1 to logical package 1
    [ 0.537547] smpboot: APIC(80) Converting physical 2 to logical package 2
    [ 0.543579] smpboot: APIC(c0) Converting physical 3 to logical package 3
    [ 0.549624] smpboot: Max logical packages: 19

    5. kdump (nr_cpus=1) works as well.

    Reported-by: Frank Ramsay
    Tested-by: Prarit Bhargava
    Signed-off-by: Jiri Olsa
    Reviewed-by: Prarit Bhargava
    Acked-by: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20160815101700.GA30090@krava
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     
  • Similar to:

    efaad554b4ff ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")

    ... fix microcode loading from the initrd on AMD by adding the
    randomization offset to the microcode patch container within the initrd.

    Reported-and-tested-by: Brian Gerst
    Signed-off-by: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-tip-commits@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160817113314.GA19221@nazgul.tnic
    Signed-off-by: Ingo Molnar

    Borislav Petkov
     
  • __replace_page() wrongly calls mem_cgroup_cancel_charge() in the "success"
    path; it should only do this if page_check_address() fails.

    This means that every enable/disable leads to an unbalanced
    mem_cgroup_uncharge() from put_page(old_page); it is trivial to underflow
    page_counter->count and trigger OOM.
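
    The imbalance can be illustrated with a toy counter. This is only a
    model of the accounting, assuming one charge per replaced page; the
    toy_* names are hypothetical and do not correspond to the memcg API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a charge counter; signed so a deficit is visible. */
struct toy_counter {
	long count;
};

void toy_try_charge(struct toy_counter *c)    { c->count += 1; }
void toy_cancel_charge(struct toy_counter *c) { c->count -= 1; }
void toy_uncharge(struct toy_counter *c)      { c->count -= 1; }

/*
 * One replace operation followed by the eventual page release.
 * check_ok models the address check succeeding; buggy selects the
 * behaviour described in the commit message (cancel on success too).
 */
void toy_replace_cycle(struct toy_counter *c, bool check_ok, bool buggy)
{
	assert(c != NULL);
	toy_try_charge(c);
	if (!check_ok) {
		toy_cancel_charge(c); /* correct: undo the charge on failure */
		return;
	}
	if (buggy)
		toy_cancel_charge(c); /* bug: charge dropped despite success */
	toy_uncharge(c);          /* later release drops the charge again */
}
```

    In the fixed variant every cycle nets to zero. In the buggy variant
    each successful cycle removes one charge too many, so repeated
    enable/disable drives the counter below its starting value.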

    Reported-and-tested-by: Brenden Blanco
    Signed-off-by: Oleg Nesterov
    Reviewed-by: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: Alexander Shishkin
    Cc: Alexei Starovoitov
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Vladimir Davydov
    Cc: stable@vger.kernel.org # 3.17+
    Fixes: 00501b531c47 ("mm: memcontrol: rewrite charge API")
    Link: http://lkml.kernel.org/r/20160817153629.GB29724@redhat.com
    Signed-off-by: Ingo Molnar

    Oleg Nesterov
     
  • Single 64-bit GART size fix.

    * 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux:
    drm/amdgpu: Change GART offset to 64-bit

    Dave Airlie