10 Oct, 2016

4 commits

  • Pull blk-mq irq/cpu mapping updates from Jens Axboe:
    "This is the block-irq topic branch for 4.9-rc. It's mostly from
    Christoph, and it allows drivers to specify their own mappings, and
    more importantly, to share the blk-mq mappings with the IRQ affinity
    mappings. It's a good step towards making this work better out of the
    box"

    * 'for-4.9/block-irq' of git://git.kernel.dk/linux-block:
    blk_mq: linux/blk-mq.h does not include all the headers it depends on
    blk-mq: kill unused blk_mq_create_mq_map()
    blk-mq: get rid of the cpumask in struct blk_mq_tags
    nvme: remove the post_scan callout
    nvme: switch to use pci_alloc_irq_vectors
    blk-mq: provide a default queue mapping for PCI device
    blk-mq: allow the driver to pass in a queue mapping
    blk-mq: remove ->map_queue
    blk-mq: only allocate a single mq_map per tag_set
    blk-mq: don't redistribute hardware queues on a CPU hotplug event

    Linus Torvalds
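A minimal user-space sketch of the idea behind sharing the queue mapping with the IRQ affinity mapping. It assumes a per-CPU table (called `mq_map` here, after the commit titles above) that names the hardware queue serving each CPU. The function names, the 64-CPU bitmask representation, and the modulo-based default spread are illustrative assumptions, not the kernel's actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_CPUS 64 /* one bit per CPU in a uint64_t mask (assumption) */

/* Simplified default spread: CPU i is served by queue i % nr_queues. */
static void model_map_default(unsigned int *mq_map, unsigned int nr_cpus,
                              unsigned int nr_queues)
{
    assert(mq_map != NULL && nr_queues > 0);
    for (unsigned int cpu = 0; cpu < nr_cpus; cpu++)
        mq_map[cpu] = cpu % nr_queues;
}

/*
 * Affinity-driven mapping: queue q serves exactly the CPUs whose bits are
 * set in affinity[q] (bit n stands for CPU n), so a request and its
 * completion interrupt stay on the same set of CPUs.
 * Returns 0 if every CPU ended up with a queue, -1 otherwise.
 */
static int model_map_from_affinity(unsigned int *mq_map, unsigned int nr_cpus,
                                   const uint64_t *affinity,
                                   unsigned int nr_queues)
{
    assert(mq_map != NULL && affinity != NULL);
    assert(nr_cpus > 0 && nr_cpus <= MODEL_MAX_CPUS);

    uint64_t covered = 0;
    for (unsigned int q = 0; q < nr_queues; q++) {
        for (unsigned int cpu = 0; cpu < nr_cpus; cpu++) {
            uint64_t bit = UINT64_C(1) << cpu;
            if (affinity[q] & bit) {
                mq_map[cpu] = q;
                covered |= bit;
            }
        }
    }

    /* Mask with one bit set for each of the nr_cpus CPUs. */
    uint64_t all = (nr_cpus == MODEL_MAX_CPUS)
                       ? UINT64_MAX
                       : (UINT64_C(1) << nr_cpus) - 1;
    return (covered & all) == all ? 0 : -1;
}
```

With four CPUs and two queues whose interrupt vectors are affine to CPUs 0-1 and 2-3, the affinity-driven variant keeps each pair on its own queue, whereas the simplified default alternates queues across neighbouring CPUs.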
     
  • Pull device mapper updates from Mike Snitzer:

    - various fixes and cleanups for request-based DM core

    - add support for delaying the requeue of requests; used by DM
    multipath when all paths have failed and 'queue_if_no_path' is
    enabled

    - DM cache improvements to speed up the loading of metadata and the
    writing of the hint array

    - fix potential for a dm-crypt crash on device teardown

    - remove dm_bufio_cond_resched() and just use cond_resched()

    - change DM multipath to return a reservation conflict error
    immediately, rather than failing the path and retrying (potentially
    indefinitely)

    * tag 'dm-4.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (24 commits)
    dm mpath: always return reservation conflict without failing over
    dm bufio: remove dm_bufio_cond_resched()
    dm crypt: fix crash on exit
    dm cache metadata: switch to using the new cursor api for loading metadata
    dm array: introduce cursor api
    dm btree: introduce cursor api
    dm cache policy smq: distribute entries to random levels when switching to smq
    dm cache: speed up writing of the hint array
    dm array: add dm_array_new()
    dm mpath: delay the requeue of blk-mq requests while all paths down
    dm mpath: use dm_mq_kick_requeue_list()
    dm rq: introduce dm_mq_kick_requeue_list()
    dm rq: reduce arguments passed to map_request() and dm_requeue_original_request()
    dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests
    dm: convert wait loops to use autoremove_wake_function()
    dm: use signal_pending_state() in dm_wait_for_completion()
    dm: rename task state function arguments
    dm: add two lockdep_assert_held() statements
    dm rq: simplify dm_old_stop_queue()
    dm mpath: check if path's request_queue is dying in activate_path()
    ...

    Linus Torvalds
     
  • Pull main rdma updates from Doug Ledford:
    "This is the main pull request for the rdma stack this release. The
    code has been through 0day and I had it tagged for linux-next testing
    for a couple days.

    Summary:

    - updates to mlx5

    - updates to mlx4 (two conflicts, both minor and easily resolved)

    - updates to iw_cxgb4 (one conflict, not so obvious to resolve,
    proper resolution is to keep the code in cxgb4_main.c as it is in
    Linus' tree as attach_uld was refactored and moved into
    cxgb4_uld.c)

    - improvements to uAPI (moved vendor specific API elements to uAPI
    area)

    - add hns-roce driver and hns and hns-roce ACPI reset support

    - conversion of all rdma code away from deprecated
    create_singlethread_workqueue

    - security improvement: remove unsafe ib_get_dma_mr (breaks lustre in
    staging)"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits)
    staging/lustre: Disable InfiniBand support
    iw_cxgb4: add fast-path for small REG_MR operations
    cxgb4: advertise support for FR_NSMR_TPTE_WR
    IB/core: correctly handle rdma_rw_init_mrs() failure
    IB/srp: Fix infinite loop when FMR sg[0].offset != 0
    IB/srp: Remove an unused argument
    IB/core: Improve ib_map_mr_sg() documentation
    IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets
    IB/mthca: Move user vendor structures
    IB/nes: Move user vendor structures
    IB/ocrdma: Move user vendor structures
    IB/mlx4: Move user vendor structures
    IB/cxgb4: Move user vendor structures
    IB/cxgb3: Move user vendor structures
    IB/mlx5: Move and decouple user vendor structures
    IB/{core,hw}: Add constant for node_desc
    ipoib: Make ipoib_warn ratelimited
    IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue
    IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue
    IB/ipoib: Remove deprecated create_singlethread_workqueue
    ...

    Linus Torvalds
     
  • Pull more rdma updates from Doug Ledford:
    "Minor updates for rxe driver"

    [ Starting to do merge window pulls again - the current -git tree does
    appear to have some netfilter use-after-free issues, but I've sent
    off the report to the proper channels, and I don't want to delay merge
    window activity any more ]

    * tag 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
    IB/rxe: improved debug prints & code cleanup
    rdma_rxe: Ensure rdma_rxe init occurs at correct time
    IB/rxe: Properly honor max IRD value for rd/atomic.
    IB/{rxe,core,rdmavt}: Fix kernel crash for reg MR
    IB/rxe: Fix sending out loopback packet on netdev interface.
    IB/rxe: Avoid scheduling tasklet for userspace QP

    Linus Torvalds
     

08 Oct, 2016

36 commits

  • Merge updates from Andrew Morton:

    - fsnotify updates

    - ocfs2 updates

    - all of MM

    * emailed patches from Andrew Morton: (127 commits)
    console: don't prefer first registered if DT specifies stdout-path
    cred: simpler, 1D supplementary groups
    CREDITS: update Pavel's information, add GPG key, remove snail mail address
    mailmap: add Johan Hovold
    .gitattributes: set git diff driver for C source code files
    uprobes: remove function declarations from arch/{mips,s390}
    spelling.txt: "modeled" is spelt correctly
    nmi_backtrace: generate one-line reports for idle cpus
    arch/tile: adopt the new nmi_backtrace framework
    nmi_backtrace: do a local dump_stack() instead of a self-NMI
    nmi_backtrace: add more trigger_*_cpu_backtrace() methods
    min/max: remove sparse warnings when they're nested
    Documentation/filesystems/proc.txt: add more description for maps/smaps
    mm, proc: fix region lost in /proc/self/smaps
    proc: fix timerslack_ns CAP_SYS_NICE check when adjusting self
    proc: add LSM hook checks to /proc//timerslack_ns
    proc: relax /proc//timerslack_ns capability requirements
    meminfo: break apart a very long seq_printf with #ifdefs
    seq/proc: modify seq_put_decimal_[u]ll to take a const char *, not char
    proc: faster /proc/*/status
    ...

    Linus Torvalds
     
  • Pull ARM SoC late DT updates from Arnd Bergmann:
    "These updates have been kept in a separate branch mostly because they
    rely on updates to the respective clk drivers to keep the shared
    header files in sync.

    - The Renesas r8a7796 (R-Car M3-W) platform gets added, this is an
    automotive SoC similar to the r8a7795 chip we already support, but
    the dts changes rely on a clock driver change that has been merged
    for v4.9 through the clk tree.

    - The Amlogic meson-gxbb (S905) platform gains support for a few
    drivers merged through our tree, in particular the network and usb
    driver changes are required and included here, and also the clk
    tree changes.

    - The Allwinner platforms have seen a large-scale change to their clk
    drivers and the dts file updates must come after that. This
    includes the newly added Nextthing GR8 platform, which is derived
    from sun5i/A13.

    - Some integrator (arm32) changes rely on clk driver changes.

    - A single patch for lpc32xx has no such dependency but wasn't added
    until just before the merge window"

    * tag 'armsoc-late' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (99 commits)
    ARM: dts: lpc32xx: add device node for IRAM on-chip memory
    ARM: dts: sun8i: Add accelerometer to polaroid-mid2407pxe03
    ARM: dts: sun8i: enable UART1 for iNet D978 Rev2 board
    ARM: dts: sun8i: add pinmux for UART1 at PG
    dts: sun8i-h3: add I2C0-2 peripherals to H3 SOC
    dts: sun8i-h3: add pinmux definitions for I2C0-2
    dts: sun8i-h3: associate exposed UARTs on Orange Pi Boards
    dts: sun8i-h3: split off RTS/CTS for UART1 in seperate pinmux
    dts: sun8i-h3: add pinmux definitions for UART2-3
    ARM: dts: sun9i: a80-optimus: Disable EHCI1
    ARM: dts: sun9i: cubieboard4: Add AXP806 PMIC device node and regulators
    ARM: dts: sun9i: a80-optimus: Add AXP806 PMIC device node and regulators
    ARM: dts: sun9i: cubieboard4: Declare AXP809 SW regulator as unused
    ARM: dts: sun9i: a80-optimus: Declare AXP809 SW regulator as unused
    ARM: dts: sun8i: Add touchscreen node for sun8i-a33-ga10h
    ARM: dts: sun8i: Add touchscreen node for sun8i-a23-polaroid-mid2809pxe04
    ARM: dts: sun8i: Add touchscreen node for sun8i-a23-polaroid-mid2407pxe03
    ARM: dts: sun8i: Add touchscreen node for sun8i-a23-inet86dz
    ARM: dts: sun8i: Add touchscreen node for sun8i-a23-gt90h
    ARM64: dts: meson-gxbb-vega-s95: Enable USB Nodes
    ...

    Linus Torvalds
     
  • Pull ARM 64-bit DT updates from Arnd Bergmann:
    "The 64-bit DT changes are surprisingly small this time, we only add
    two SoC platforms: the ZTE ZX296718 Set-top-box SoC and the SocioNext
    UniPhier LD11 TV SoC, each with their reference boards.

    There are three new machines added for existing SoC platforms:

    - The Marvell Armada 8040 development board is an impressive
    quad-core Cortex-A72 machine with three 10gbit ethernet interfaces

    - Qualcomm's DragonBoard 820c single-board computer is their current
    high-end phone platform in the 96boards form factor

    - Rockchip: Tronsmart Orion r86 set-top-box is a popular mid-range
    Android box based on the 8-core rk3368 SoC"

    * tag 'armsoc-dt64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (91 commits)
    arm64: dts: berlin4ct: Add L2 cache topology
    arm64: dts: berlin4ct: enable all wdt nodes unconditionally
    arm64: dts: berlin4ct: switch to Cortex-A53 specific pmu nodes
    arm64: dts: Add ZTE ZX296718 SoC dts and Makefile
    arm64: dts: apm: Add DT node for APM X-Gene 2 CPU clocks
    arm64: dts: apm: Add X-Gene SoC hwmon to device tree
    arm64: dts: apm: Fix interrupt polarity for X-Gene PCIe legacy interrupts
    arm64: dts: apm: Add APM X-Gene v2 SoC PMU DTS entries
    arm64: dts: apm: Add APM X-Gene SoC PMU DTS entries
    arm64: dts: marvell: enable MSI for PCIe on Armada 7K/8K
    arm64: dts: ls2080a: Add 'dma-coherent' for ls2080a PCI nodes
    arm64: dts: rockchip: add Type-C phy for RK3399
    arm64: dts: rockchip: enable the gmac for rk3399 evb board
    arm64: dts: rockchip: add the gmac needed node for rk3399
    arm64: dts: rockchip: support the pmu node for rk3399
    arm64: dts: rockchip: change all interrupts cells to 4 on rk3399 SoCs
    arm64: dts: rockchip: add the tcpc for rk3399 power domain
    arm64: dts: rockchip: add efuse0 device node for rk3399
    arm64: dts: rockchip: configure PCIe support for rk3399-evb
    arm64: dts: rockchip: add the PCIe controller support for RK3399
    ...

    Linus Torvalds
     
  • Pull ARM DT updates from Arnd Bergmann:
    "These are as usual a very large number of mostly boring updates to
    enable devices in existing machines, or to fix minor bugs. Notably, an
    ongoing treewide effort to fix warnings caused by an update to the
    device tree compiler. These are enabled with "make W=1" at the moment
    but can hopefully become the default once all issues have been
    addressed.

    No new SoC platform is added this time around (Armada 395 and Orion
    mv88f5181 are slight variations of existing ones), but a significant
    number of new dts files are added, which I list by platform:

    - Allwinner: Empire Electronix M712 and iNet d978 Rev2 tablets,
    Orange Pi PC Plus, Orange Pi 2, Orange Pi Plus 2E, Orange Pi Lite,
    Olimex A33-Olinuxino, and Nano Pi Neo single-board computers

    - ARM Realview: all supported machines (ported from board files)

    - Broadcom: BCM958525er, BCM958522er, BCM988312hr, BCM958623hr and
    BCM958622hr reference boards for Northstar platform, Raspberry Pi
    Zero single-board computer

    - Marvell EBU: Netgear WNR854T router (ported from board file),
    Armada 395 SoC platform and GP board Armada 390 DB development
    board

    - NXP i.MX: imx7s Warp7 reference board, Gateworks Ventana GW553x
    single-board computer, Technologic Systems TS-4900 and Engicam
    IMX6UL GEA M6UL computer-on-module, Inverse Path USB armory board

    - Qualcomm: LG Nexus 5 Phone

    - Renesas: r8a7792/wheat and r7s72100/rskrza1 development boards

    - Rockchip: Rockchip RK3288 Fennec reference board, Firefly RK3288
    Reload platform

    - ST Microelectronics STi: B2260 (96boards) single-board computer

    - TI Davinci: OMAP-L138 LCDK Development kit

    - TI OMAP: beagleboard-x15 rev B1 single-board computer"

    * tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (390 commits)
    ARM: dts: sony-nsz-gs7: add missing unit name to /memory node
    ARM: dts: chromecast: add missing unit name to /memory node
    ARM: dts: berlin2q-marvell-dmp: add missing unit name to /memory node
    ARM: dts: berlin2: Add missing unit name to /soc node
    ARM: dts: berlin2cd: Add missing unit name to /soc node
    ARM: dts: berlin2q: Add missing unit name to /soc node
    ARM: dts: berlin2: Remove skeleton.dtsi inclusion
    ARM: dts: berlin2cd: Remove skeleton.dtsi inclusion
    ARM: dts: berlin2q: Remove skeleton.dtsi inclusion
    arm: dts: berlin2q: enable all wdt nodes unconditionally
    arm: dts: berlin2: enable all wdt nodes unconditionally
    ARM: dts: omap5-igep0050.dts: Use tabs for indentation
    ARM: dts: Fix igepv5 power button GPIO direction
    ARM: dts: am335x-evmsk: Add blue-and-red-wiring -property to lcdc node
    ARM: dts: am335x-evmsk: Whitespace cleanup of lcdc related nodes
    ARM: dts: am335x-evm: Add blue-and-red-wiring -property to lcdc node
    ARM: dts: s3c64xx: Use macros for pinctrl configuration
    ARM: dts: s3c2416: Use macros for pinctrl configuration
    ARM: dts: s5pv210: Use macros for pinctrl configuration
    ARM: dts: s3c64xx: Use common macros for pinctrl configuration
    ...

    Linus Torvalds
     
  • Pull ARM SoC driver updates from Arnd Bergmann:
    "Driver updates for ARM SoCs, including a couple of newly added
    drivers:

    - The Qualcomm external bus interface 2 (EBI2), used in some of their
    mobile phone chips for connecting flash memory, LCD displays or
    other peripherals

    - Secure monitor firmware for Amlogic SoCs, and an NVMEM driver for
    the EFUSE based on that firmware interface.

    - Perf support for the AppliedMicro X-Gene performance monitor unit

    - Reset driver for STMicroelectronics STM32

    - Reset driver for SocioNext UniPhier SoCs

    Aside from these, there are minor updates to SoC-specific bus,
    clocksource, firmware, pinctrl, reset, rtc and pmic drivers"

    * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (50 commits)
    bus: qcom-ebi2: depend on HAS_IOMEM
    pinctrl: mvebu: orion5x: Generalise mv88f5181l support for 88f5181
    clk: mvebu: Add clk support for the orion5x SoC mv88f5181
    dt-bindings: EXYNOS: Add Exynos5433 PMU compatible
    clocksource: exynos_mct: Add the support for ARM64
    perf: xgene: Add APM X-Gene SoC Performance Monitoring Unit driver
    Documentation: Add documentation for APM X-Gene SoC PMU DTS binding
    MAINTAINERS: Add entry for APM X-Gene SoC PMU driver
    bus: qcom: add EBI2 driver
    bus: qcom: add EBI2 device tree bindings
    rtc: rtc-pm8xxx: Add support for pm8018 rtc
    nvmem: amlogic: Add Amlogic Meson EFUSE driver
    firmware: Amlogic: Add secure monitor driver
    soc: qcom: smd: Reset rx tail rather than tx
    memory: atmel-sdramc: fix a possible NULL dereference
    reset: hi6220: allow to compile test driver on other architectures
    reset: zynq: add driver Kconfig option
    reset: sunxi: add driver Kconfig option
    reset: stm32: add driver Kconfig option
    reset: socfpga: add driver Kconfig option
    ...

    Linus Torvalds
     
  • Pull ARM SoC 64-bit updates from Arnd Bergmann:
    "Changes to platform code for 64-bit ARM platforms.

    Nearly all of these are defconfig updates to enable new drivers or old
    drivers still used on these 64-bit platforms.

    Aside from that, we gain initial support for two set-top-box
    platforms, both of which already have 32-bit support in arch/arm:

    - Broadcom adds abstract support for the bcm7xxx/brcmstb platform,
    presumably the respective dts files and more information will
    follow at a later point.

    - The ZTE ZX296718 SoC for set-top-boxes, a relative of the 32-bit
    ZX296702 SoC that we already support"

    * tag 'armsoc-arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
    arm64: add ZTE ZX SoC family
    arm64: defconfig: enable ZTE ZX related config
    arm64: defconfig: enable common modules for power management
    arm64: defconfig: enable meson I2C
    arm64: defconfig: enable meson SPI as module
    arm64: defconfig: enable meson WDT as modules
    arm64: defconfig: enable HW random as module
    arm64: defconfig: Enable SDHI and GPIO_REGULATOR
    arm64: configs: enable PCIe driver for Aardvark
    Kconfig: ARCH_HISI: Add PINCTRL to HISI platform
    arm64: defconfig: enable bluetooth supports as modules
    arm64: defconfig: enable CONFIG_INPUT_HISI_POWERKEY for HiKey
    arm64: defconfig: Enable HiSilicon kirin drm, adv7533 for HiKey
    arm64: defconfig: Enable Hisi SAS and HNS
    arm64: defconfig: Enable QDF2432 config options
    arm64: sunxi: Kconfig: add essential pinctrl driver
    arm64: defconfig: Add Renesas R-Car HSUSB driver support as module
    arm64: Add Broadcom Set Top Box Kconfig entry point
    arm64: defconfig: enable xhci-platform

    Linus Torvalds
     
  • Pull ARM SoC defconfig updates from Arnd Bergmann:
    "Defconfig additions, removals, etc. Most of these are small changes
    adding the options for newly upstreamed drivers, or drivers needed for
    new board support. Nothing specifically sticks out this time"

    * tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (25 commits)
    ARM: multi_v7_defconfig: enable CONFIG_EFI
    ARM: multi_v7_defconfig: Build Atmel maXTouch driver as a module
    ARM: defconfig: update the Integrator defconfig
    ARM: keystone: defconfig: Fix USB configuration
    ARM: imx_v6_v7_defconfig: Select the wm8960 codec driver
    ARM: omap2plus_defconfig: switch to the IIO BMP085 driver
    ARM: mvebu_v5_defconfig: use MV88E6XXX
    ARM: davinci_all_defconfig: Enable some UBI modules
    ARM: davinci_all_defconfig: Enable AEMIF as a module
    ARM: multi_v7_defconfig: Enable SECCOMP
    ARM: exynos_defconfig: Enable SECCOMP
    ARM: imx_v6_v7_defconfig: Add CONFIG_MPL3115
    ARM: imx_v6_v7_defconfig: Enable GPU support
    ARM: s3c2410_defconfig: Remove CONFIG_IPV6_PRIVACY
    ARM: exynos_defconfig: Enable PM_DEBUG
    ARM: exynos_defconfig: Enable bus frequency scaling with devfreq
    ARM: imx_v6_v7_defconfig: enable more USB configurations
    ARM: davinci_all_defconfig: enable SMSC ethernet PHY
    ARM: davinci_all_defconfig: enable RTC driver as module
    ARM: multi_v7_defconfig: Enable ARM_IMX6Q_CPUFREQ
    ...

    Linus Torvalds
     
  • Pull ARM SoC platform updates from Arnd Bergmann:
    "These are updates for platform specific code on 32-bit ARM machines,
    essentially anything that can not (yet) be expressed using DT files.

    Noteworthy changes include:

    - We get support for running in big-endian mode on two platforms:
    sunxi (Allwinner) and s3c24xx (old Samsung).

    - The recently added Uniphier platform now uses standard PSCI methods
    for SMP booting and we remove support for old bootloader versions
    that did not support it yet.

    - In sunxi, we gain support for the "Nextthing GR8" SoC, which is a
    close relative of the Allwinner A13 and R8 chips.

    - PXA completes its move over to the generic dmaengine framework and
    removes its old private API

    - mach-bcm gains support for BCM47189/BCM53573, their first ARM SoC
    with integrated 802.11ac wireless networking"

    * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (54 commits)
    ARM: imx legacy: pca100: move peripheral initialization to .init_late
    ARM: imx legacy: mx27ads: move peripheral initialization to .init_late
    ARM: imx legacy: mx21ads: move peripheral initialization to .init_late
    ARM: imx legacy: pcm043: move peripheral initialization to .init_late
    ARM: imx legacy: mx35-3ds: move peripheral initialization to .init_late
    ARM: imx legacy: mx27-3ds: move peripheral initialization to .init_late
    ARM: imx legacy: imx27-visstrim-m10: move peripheral initialization to .init_late
    ARM: imx legacy: vpr200: move peripheral initialization to .init_late
    ARM: imx legacy: mx31moboard: move peripheral initialization to .init_late
    ARM: imx legacy: armadillo5x0: move peripheral initialization to .init_late
    ARM: imx legacy: qong: move peripheral initialization to .init_late
    ARM: imx legacy: mx31-3ds: move peripheral initialization to .init_late
    ARM: imx legacy: pcm037: move peripheral initialization to .init_late
    ARM: imx legacy: mx31lilly: move peripheral initialization to .init_late
    ARM: imx legacy: mx31ads: move peripheral initialization to .init_late
    ARM: imx legacy: mx31lite: move peripheral initialization to .init_late
    ARM: imx legacy: kzm: move peripheral initialization to .init_late
    MAINTAINERS: update list of Oxnas maintainers
    ARM: orion5x: remove extraneous NO_IRQ
    ARM: orion: simplify orion_ge00_switch_init
    ...

    Linus Torvalds
     
  • Pull ARM SoC cleanups from Arnd Bergmann:
    "The cleanups for v4.9 are a little larger than usual, but thankfully
    that is almost exclusively due to removing a significant number of
    files that have become obsolete after the still ongoing conversion of
    old board files to devicetree.

    - for mach-omap2, which is still the largest platform in arch/arm/,
    the conversion to DT is finally complete after the Nokia N900 is
    now fully supported there, along with the omap3 LDP, and we can
    remove those two board files. If no regressions are found, another
    large cleanup for the platform will happen as a follow-up, removing
    dead code and restructuring the platform based on being DT-only.

    - In mach-imx, similar work is ongoing, but has not come that far.
    This time, we remove the obsolete board file for the i.MX1
    generation, which like i.MX25, i.MX5, i.MX6, and i.MX7 is now
    DT-only. The remaining board files are for i.MX2 and i.MX3 machines
    based on old ARM926 or ARM1136 cores that should work with DT in
    principle.

    - realview has just been converted from board files to DT, and a lot
    of code gets removed in the process. This is the last
    ARM/Keil/Versatile derived platform that was still using board
    files, the other ones being integrator, versatile and vexpress. We
    can probably merge the remaining code into a single directory in
    the near future.

    - clps711x had completed the conversion in v4.8, but we accidentally
    left the files in place that should have been deleted then"

    * tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (21 commits)
    ARM: select PCI_DOMAINS config from ARCH_MULTIPLATFORM
    ARM: stop *MIGHT_HAVE_PCI* config from being selected redundantly
    ARM: imx: (trivial) fix typo and grammar
    ARM: clps711x: remove extraneous files
    ARM: imx: use IS_ENABLED() instead of checking for built-in or module
    ARM: OMAP2+: use IS_ENABLED() instead of checking for built-in or module
    ARM: OMAP1: use IS_ENABLED() instead of checking for built-in or module
    ARM: imx: remove platform-mxc_rnga
    ARM: realview: imply device tree boot
    ARM: realview: no need to select SMP_ON_UP explicitly
    ARM: realview: delete the RealView board files
    ARM: imx: no need to select SMP_ON_UP explicitly
    ARM: i.MX: Move SOC_IMX1 into 'Device tree only'
    ARM: i.MX: Remove i.MX1 non-DT support
    ARM: i.MX: Remove i.MX1 Synertronixx SCB9328 board support
    ARM: i.MX: Remove i.MX1 Armadeus APF9328 board support
    ARM: mxs: remove obsolete startup code for TX28
    ARM: i.MX31 iomux: remove duplicates with alternate name
    ARM: i.MX31 iomux: remove plain duplicates
    ARM: OMAP2+: Drop legacy board file for LDP
    ...

    Linus Torvalds
     
  • Pull parisc updates from Helge Deller:
    "Changes include:

    - Fix boot of 32bit SMP kernel (initial kernel mapping was too small)

    - Added hardened usercopy checks

    - Drop bootmem and switch to memblock and NO_BOOTMEM implementation

    - Drop the BROKEN_RODATA config option (and thus remove the relevant
    code from the generic headers and files because parisc was the last
    architecture which used this config option)

    - Improve segfault reporting by printing human readable error strings

    - Various smaller changes, e.g. dwarf debug support for assembly
    code, update comments regarding copy_user_page_asm, switch to
    kmalloc_array()"

    * 'parisc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: Increase KERNEL_INITIAL_SIZE for 32-bit SMP kernels
    parisc: Drop bootmem and switch to memblock
    parisc: Add hardened usercopy feature
    parisc: Add cfi_startproc and cfi_endproc to assembly code
    parisc: Move hpmc stack into page aligned bss section
    parisc: Fix self-detected CPU stall warnings on Mako machines
    parisc: Report trap type as human readable string
    parisc: Update comment regarding implementation of copy_user_page_asm
    parisc: Use kmalloc_array() in add_system_map_addresses()
    parisc: Check return value of smp_boot_one_cpu()
    parisc: Drop BROKEN_RODATA config option

    Linus Torvalds
     
  • Pull avr32 update from Hans-Christian Noren Egtvedt.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
    avr32: migrate exception table users off module.h and onto extable.h

    Linus Torvalds
     
  • Pull powerpc updates from Michael Ellerman:
    "Highlights:
    - Major rework of Book3S 64-bit exception vectors (Nicholas Piggin)
    - Use gas sections for arranging exception vectors et al.
    - Large set of TM cleanups and selftests (Cyril Bur)
    - Enable transactional memory (TM) lazily for userspace (Cyril Bur)
    - Support for XZ compression in the zImage wrapper (Oliver
    O'Halloran)
    - Add support for bpf constant blinding (Naveen N. Rao)
    - Beginnings of upstream support for PA Semi Nemo motherboards
    (Darren Stevens)

    Fixes:
    - Ensure .mem(init|exit).text are within _stext/_etext (Michael
    Ellerman)
    - xmon: Don't use ld on 32-bit (Michael Ellerman)
    - vdso64: Use double word compare on pointers (Anton Blanchard)
    - powerpc/nvram: Fix an incorrect partition merge (Pan Xinhui)
    - powerpc: Fix usage of _PAGE_RO in hugepage (Christophe Leroy)
    - powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K
    (Aneesh Kumar K.V)
    - Fix memory leak in queue_hotplug_event() error path (Andrew
    Donnellan)
    - Replay hypervisor maintenance interrupt first (Nicholas Piggin)

    Various performance optimisations (Anton Blanchard):
    - Align hot loops of memset() and backwards_memcpy()
    - During context switch, check before setting mm_cpumask
    - Remove static branch prediction in atomic{, 64}_add_unless
    - Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little
    endian
    - Set default CPU type to POWER8 for little endian builds

    Cleanups & features:
    - Sparse fixes/cleanups (Daniel Axtens)
    - Preserve CFAR value on SLB miss caused by access to bogus address
    (Paul Mackerras)
    - Radix MMU fixups for POWER9 (Aneesh Kumar K.V)
    - Support for setting used_(vsr|vr|spe) in sigreturn path (for CRIU)
    (Simon Guo)
    - Optimise syscall entry for virtual, relocatable case (Nicholas
    Piggin)
    - Optimise MSR handling in exception handling (Nicholas Piggin)
    - Support for kexec with Radix MMU (Benjamin Herrenschmidt)
    - powernv EEH fixes (Russell Currey)
    - Surprise PCI hotplug support for powernv (Gavin Shan)
    - Endian/sparse fixes for powernv PCI (Gavin Shan)
    - Defconfig updates (Anton Blanchard)
    - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (Balbir Singh)
    - cxl: Flush PSL cache before resetting the adapter (Frederic Barrat)
    - cxl: replace loop with for_each_child_of_node(), remove unneeded
    of_node_put() (Andrew Donnellan)
    - Fix HV facility unavailable to use correct handler (Nicholas
    Piggin)
    - Remove unnecessary syscall trampoline (Nicholas Piggin)
    - fadump: Fix build break when CONFIG_PROC_VMCORE=n (Michael
    Ellerman)
    - Quieten EEH message when no adapters are found (Anton Blanchard)
    - powernv: Add PHB register dump debugfs handle (Russell Currey)
    - Use kprobe blacklist for exception handlers & asm functions
    (Nicholas Piggin)
    - Document the syscall ABI (Nicholas Piggin)
    - MAINTAINERS: Update cxl maintainers (Michael Neuling)
    - powerpc: Remove all usages of NO_IRQ (Michael Ellerman)

    Minor cleanups:
    - Andrew Donnellan, Christophe Leroy, Colin Ian King, Cyril Bur,
    Frederic Barrat, Pan Xinhui, PrasannaKumar Muralidharan, Rui Teng,
    Simon Guo"

    * tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (156 commits)
    powerpc/bpf: Add support for bpf constant blinding
    powerpc/bpf: Implement support for tail calls
    powerpc/bpf: Introduce accessors for using the tmp local stack space
    powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n
    powerpc: tm: Enable transactional memory (TM) lazily for userspace
    powerpc/tm: Add TM Unavailable Exception
    powerpc: Remove do_load_up_transact_{fpu,altivec}
    powerpc: tm: Rename transct_(*) to ck(\1)_state
    powerpc: tm: Always use fp_state and vr_state to store live registers
    selftests/powerpc: Add checks for transactional VSXs in signal contexts
    selftests/powerpc: Add checks for transactional VMXs in signal contexts
    selftests/powerpc: Add checks for transactional FPUs in signal contexts
    selftests/powerpc: Add checks for transactional GPRs in signal contexts
    selftests/powerpc: Check that signals always get delivered
    selftests/powerpc: Add TM tcheck helpers in C
    selftests/powerpc: Allow tests to extend their kill timeout
    selftests/powerpc: Introduce GPR asm helper header file
    selftests/powerpc: Move VMX stack frame macros to header file
    selftests/powerpc: Rework FPU stack placement macros and move to header file
    selftests/powerpc: Check for VSX preservation across userspace preemption
    ...

    Linus Torvalds
     
  • If a device tree specifies a preferred device for kernel console output
    via the stdout-path or linux,stdout-path chosen node properties or the
    stdout alias then the kernel ought to honor it & output the kernel
    console to that device. As it stands, this isn't the case. Whilst we
    parse the stdout-path properties & set an of_stdout variable from
    of_alias_scan(), and use that from of_console_check() to determine
    whether to add a console device as a preferred console whilst
    registering it, we also prefer the first registered console if no other
    has been selected at the time of its registration.

    This means that if a console other than the one the device tree selects
    via stdout-path is registered first, we will switch to using it & when
    the stdout-path console is later registered the call to
    add_preferred_console() via of_console_check() is too late to do
    anything useful. In practice this seems to mean that we switch to the
    dummy console device fairly early & see no further console output:

    Console: colour dummy device 80x25
    console [tty0] enabled
    bootconsole [ns16550a0] disabled

    Fix this by not automatically preferring the first registered console if
    one is specified by the device tree. This allows consoles to be
    registered but not enabled, and once the driver for the console selected
    by stdout-path calls of_console_check() the driver will be added to the
    list of preferred consoles before any other console has been enabled.
    When that console is then registered via register_console() it will be
    enabled as expected.

    Link: http://lkml.kernel.org/r/20160809151937.26118-1-paul.burton@imgtec.com
    Signed-off-by: Paul Burton
    Cc: Ralf Baechle
    Cc: Paul Burton
    Cc: Tejun Heo
    Cc: Sergey Senozhatsky
    Cc: Jiri Slaby
    Cc: Daniel Vetter
    Cc: Ivan Delalande
    Cc: Thierry Reding
    Cc: Borislav Petkov
    Cc: Jan Kara
    Cc: Petr Mladek
    Cc: Joe Perches
    Cc: Greg Kroah-Hartman
    Cc: Rob Herring
    Cc: Frank Rowand
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Burton
     
  • Current supplementary groups code can massively overallocate memory and
    is implemented in a way so that access to individual gid is done via 2D
    array.

    If number of gids is
    Cc: Vasily Kulikov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Link: http://lkml.kernel.org/r/20161003082312.GA20634@amd
    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • Add two entries to map to my primary address.

    Link: http://lkml.kernel.org/r/1473850348-19177-1-git-send-email-johan@kernel.org
    Signed-off-by: Johan Hovold
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johan Hovold
     
  • Git can be told to apply language-specific rules when generating diffs.
    Enable this for C source code files (*.c and *.h) so that function names
    are printed right. Specifically, doing so prevents "git diff" from
    mistakenly considering unindented goto labels as function names.

    Link: http://lkml.kernel.org/r/20160907143403.1449324f@endymion
    Signed-off-by: Jean Delvare
    Cc: Peter Zijlstra
    Cc: Joe Perches
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jean Delvare
     
  • The declarations of arch-specific functions have been moved to a common
    header in commit 3820b4d2789f ('uprobes: Move function declarations out
    of arch'), but MIPS and S390 have added them to their own trees later.
    Remove the unnecessary duplicates.

    Link: http://lkml.kernel.org/r/1472804384-17830-1-git-send-email-marcin.nowakowski@imgtec.com
    Signed-off-by: Marcin Nowakowski
    Acked-by: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marcin Nowakowski
     
  • No need to correct the correct.

    Link: http://lkml.kernel.org/r/1472490791.3425.38.camel@perches.com
    Signed-off-by: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • When doing an nmi backtrace of many cores, most of which are idle, the
    output is a little overwhelming and very uninformative. Suppress
    messages for cpus that are idling when they are interrupted and just
    emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".

    We do this by grouping all the cpuidle code together into a new
    .cpuidle.text section, and then checking the address of the interrupted
    PC to see if it lies within that section.
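
    The address check described above can be sketched in plain C. This is
    only a userspace illustration: struct text_section, pc_in_section() and
    describe_cpu() are hypothetical names, the section bounds would really
    come from linker-provided symbols, and the non-idle message is a
    placeholder rather than the kernel's output.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the bounds of the .cpuidle.text section. */
struct text_section {
    uintptr_t start; /* first address inside the section */
    uintptr_t end;   /* one past the last address inside the section */
};

/* True if the interrupted PC lies inside [start, end). */
static bool pc_in_section(const struct text_section *s, uintptr_t pc)
{
    return pc >= s->start && pc < s->end;
}

/*
 * Emit the one-line summary for an idle cpu, or a placeholder line
 * (assumed text, not from the patch) for a cpu that needs a full backtrace.
 */
static int describe_cpu(char *buf, size_t len, int cpu, uintptr_t pc,
                        const struct text_section *idle)
{
    if (pc_in_section(idle, pc))
        return snprintf(buf, len,
                        "NMI backtrace for %d skipped: idling at pc 0x%lx",
                        cpu, (unsigned long)pc);
    return snprintf(buf, len, "NMI backtrace for %d: pc 0x%lx",
                    cpu, (unsigned long)pc);
}
```

    The half-open range means a PC equal to the section end is treated as
    outside the idle code.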

    This commit suitably tags x86 and tile idle routines, and only adds in
    the minimal framework for other architectures.

    Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
    Signed-off-by: Chris Metcalf
    Acked-by: Peter Zijlstra (Intel)
    Tested-by: Peter Zijlstra (Intel)
    Tested-by: Daniel Thompson [arm]
    Tested-by: Petr Mladek
    Cc: Aaron Tomlin
    Cc: Peter Zijlstra (Intel)
    Cc: "Rafael J. Wysocki"
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
     
  • Previously tile was rolling its own method of capturing backtrace data
    in the NMI handlers, but it was relying on running printk() from the NMI
    handler, which is not always safe. So adopt the nmi_backtrace model
    (with the new cpumask extension) instead.

    So we can call the nmi_backtrace code directly from the nmi handler,
    move the nmi_enter()/exit() into the top-level tile NMI handler.

    The semantics of the routine change slightly since it is now synchronous
    with the remote cores completing the backtraces. Previously it was
    asynchronous, but with protection to avoid starting a new remote
    backtrace if the old one was still in progress.

    Link: http://lkml.kernel.org/r/1472487169-14923-4-git-send-email-cmetcalf@mellanox.com
    Signed-off-by: Chris Metcalf
    Cc: Daniel Thompson [arm]
    Cc: Petr Mladek
    Cc: Aaron Tomlin
    Cc: Peter Zijlstra (Intel)
    Cc: "Rafael J. Wysocki"
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
     
  • Currently on arm there is code that checks whether it should call
    dump_stack() explicitly, to avoid trying to raise an NMI when the
    current context is not preemptible by the backtrace IPI. Similarly, the
    forthcoming arch/tile support uses an IPI mechanism that does not
    support generating an NMI to self.

    Accordingly, move the code that guards this case into the generic
    mechanism, and invoke it unconditionally whenever we want a backtrace of
    the current cpu. It seems plausible that in all cases, dump_stack()
    will generate better information than generating a stack from the NMI
    handler. The register state will be missing, but that state is likely
    not particularly helpful in any case.

    Or, if we think it is helpful, we should be capturing and emitting the
    current register state in all cases when regs == NULL is passed to
    nmi_cpu_backtrace().

    Link: http://lkml.kernel.org/r/1472487169-14923-3-git-send-email-cmetcalf@mellanox.com
    Signed-off-by: Chris Metcalf
    Tested-by: Daniel Thompson [arm]
    Reviewed-by: Petr Mladek
    Acked-by: Aaron Tomlin
    Cc: "Rafael J. Wysocki"
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
     
  • Patch series "improvements to the nmi_backtrace code" v9.

    This patch series modifies the trigger_xxx_backtrace() NMI-based remote
    backtracing code to make it more flexible, and makes a few small
    improvements along the way.

    The motivation comes from the task isolation code, where there are
    scenarios where we want to be able to diagnose a case where some cpu is
    about to interrupt a task-isolated cpu. It can be helpful to see both
    where the interrupting cpu is, and also an approximation of where the
    cpu that is being interrupted is. The nmi_backtrace framework allows us
    to discover the stack of the interrupted cpu.

    I've tested that the change works as desired on tile, and build-tested
    x86, arm, mips, and sparc64. For x86 I confirmed that the generic
    cpuidle stuff as well as the architecture-specific routines are in the
    new cpuidle section. For arm, mips, and sparc I just build-tested it
    and made sure the generic cpuidle routines were in the new cpuidle
    section, but I didn't attempt to figure out which the platform-specific
    idle routines might be. That might be more usefully done by someone
    with platform experience in follow-up patches.

    This patch (of 4):

    Currently you can only request a backtrace of either all cpus, or all
    cpus but yourself. It can also be helpful to request a remote backtrace
    of a single cpu, and since we want that, the logical extension is to
    support a cpumask as the underlying primitive.
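
    As a rough illustration of why a mask is the natural primitive, the
    sketch below builds "all", "all but self" and "single cpu" requests on
    top of one mask-taking routine. It is a toy model: toy_cpumask_t is a
    64-bit integer rather than the kernel's cpumask type, and every
    function name here is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy cpumask: one bit per cpu, limited to cpus 0..63 in this sketch. */
typedef uint64_t toy_cpumask_t;

/* Mask selecting a single cpu (cpu must be below 64). */
static toy_cpumask_t mask_of(unsigned int cpu)
{
    return (toy_cpumask_t)1 << cpu;
}

/* Mask selecting cpus 0..nr_cpus-1. */
static toy_cpumask_t mask_all(unsigned int nr_cpus)
{
    return nr_cpus >= 64 ? ~(toy_cpumask_t)0
                         : ((toy_cpumask_t)1 << nr_cpus) - 1;
}

/* Mask selecting every cpu except the caller's. */
static toy_cpumask_t mask_all_but_self(unsigned int nr_cpus, unsigned int self)
{
    return mask_all(nr_cpus) & ~mask_of(self);
}

/*
 * The single primitive: visit every cpu set in the mask and return how
 * many were visited. The visit callback may be NULL to only count.
 */
static unsigned int backtrace_mask(toy_cpumask_t mask,
                                   void (*visit)(unsigned int cpu))
{
    unsigned int n = 0;

    for (unsigned int cpu = 0; cpu < 64; cpu++) {
        if (mask & mask_of(cpu)) {
            if (visit)
                visit(cpu);
            n++;
        }
    }
    return n;
}
```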

    This change modifies the existing lib/nmi_backtrace.c code to take a
    cpumask as its basic primitive, and modifies the linux/nmi.h code to use
    the new "cpumask" method instead.

    The existing clients of nmi_backtrace (arm and x86) are converted to
    using the new cpumask approach in this change.

    The other users of the backtracing API (sparc64 and mips) are converted
    to use the cpumask approach rather than the all/allbutself approach.
    The mips code ignored the "include_self" boolean but with this change it
    will now also dump a local backtrace if requested.

    Link: http://lkml.kernel.org/r/1472487169-14923-2-git-send-email-cmetcalf@mellanox.com
    Signed-off-by: Chris Metcalf
    Tested-by: Daniel Thompson [arm]
    Reviewed-by: Aaron Tomlin
    Reviewed-by: Petr Mladek
    Cc: "Rafael J. Wysocki"
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Ralf Baechle
    Cc: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
     
  • Currently, when min/max are nested within themselves, sparse will warn:

    warning: symbol '_min1' shadows an earlier one
    originally declared here
    warning: symbol '_min1' shadows an earlier one
    originally declared here
    warning: symbol '_min2' shadows an earlier one
    originally declared here

    This also immediately happens when min3() or max3() are used.

    Since sparse implements __COUNTER__, we can use __UNIQUE_ID() to
    generate unique variable names, avoiding this.
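
    A minimal sketch of the idea, assuming a compiler that supports
    __COUNTER__, __typeof__ and GNU statement expressions (gcc and clang
    do). UNIQUE_NAME(), MIN() and MIN3() are illustrative stand-ins, not
    the kernel's __UNIQUE_ID(), min() and min3() definitions.

```c
/* Two-level paste so that __COUNTER__ is expanded before concatenation. */
#define CONCAT_(a, b) a##b
#define CONCAT(a, b) CONCAT_(a, b)
#define UNIQUE_NAME(prefix) CONCAT(prefix, __COUNTER__)

/*
 * The temporaries' names are passed in as parameters, so each expansion
 * of MIN() gets its own pair and a nested use cannot shadow the outer one.
 */
#define MIN_IMPL(x, y, a, b) ({ \
    __typeof__(x) a = (x);      \
    __typeof__(y) b = (y);      \
    a < b ? a : b; })

#define MIN(x, y) \
    MIN_IMPL(x, y, UNIQUE_NAME(min_a_), UNIQUE_NAME(min_b_))

/* Nesting is what triggered the shadowing warnings with fixed names. */
#define MIN3(x, y, z) MIN(MIN(x, y), z)
```

    Each argument is still evaluated exactly once as a value, which is the
    reason the temporaries exist in the first place.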

    Link: http://lkml.kernel.org/r/1471519773-29882-1-git-send-email-johannes@sipsolutions.net
    Signed-off-by: Johannes Berg
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Berg
     
    Add some more description of the limitations of smaps/maps reads, as
    well as some guarantees we can make.

    Link: http://lkml.kernel.org/r/1475296958-27652-2-git-send-email-robert.hu@intel.com
    Signed-off-by: Robert Ho
    Acked-by: Michal Hocko
    Cc: Dave Hansen
    Cc: Xiao Guangrong
    Cc: Robert Hu
    Cc: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Dan Williams
    Cc: Gleb Natapov
    Cc: Marcelo Tosatti
    Cc: Stefan Hajnoczi
    Cc: Ross Zwisler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert Ho
     
    Recently, Red Hat reported that the nvml test suite failed on QEMU/KVM;
    for more details, please refer to:

    https://bugzilla.redhat.com/show_bug.cgi?id=1365721

    Actually, this bug is not specific to NVDIMM/DAX; it affects any other
    file system as well. This simple test case, extracted from nvml, easily
    reproduces the bug in a common environment:

    -------------------------- testcase.c -----------------------------

    int
    is_pmem_proc(const void *addr, size_t len)
    {
        const char *caddr = addr;

        FILE *fp;
        if ((fp = fopen("/proc/self/smaps", "r")) == NULL) {
            printf("!/proc/self/smaps");
            return 0;
        }

        int retval = 0; /* assume false until proven otherwise */
        char line[PROCMAXLEN]; /* for fgets() */
        char *lo = NULL; /* beginning of current range in smaps file */
        char *hi = NULL; /* end of current range in smaps file */
        int needmm = 0; /* looking for mm flag for current range */
        while (fgets(line, PROCMAXLEN, fp) != NULL) {
            static const char vmflags[] = "VmFlags:";
            static const char mm[] = " wr";

            /* check for range line */
            if (sscanf(line, "%p-%p", &lo, &hi) == 2) {
                if (needmm) {
                    /* last range matched, but no mm flag found */
                    printf("never found mm flag.\n");
                    break;
                } else if (caddr < lo) {
                    /* never found the range for caddr */
                    printf("#######no match for addr %p.\n", caddr);
                    break;
                } else if (caddr < hi) {
                    /* start address is in this range */
                    size_t rangelen = (size_t)(hi - caddr);

                    /* remember that matching has started */
                    needmm = 1;

                    /* calculate remaining range to search for */
                    if (len > rangelen) {
                        len -= rangelen;
                        caddr += rangelen;
                        printf("matched %zu bytes in range "
                               "%p-%p, %zu left over.\n",
                               rangelen, lo, hi, len);
                    } else {
                        len = 0;
                        printf("matched all bytes in range "
                               "%p-%p.\n", lo, hi);
                    }
                }
            } else if (needmm && strncmp(line, vmflags,
                                         sizeof(vmflags) - 1) == 0) {
                if (strstr(&line[sizeof(vmflags) - 1], mm) != NULL) {
                    printf("mm flag found.\n");
                    if (len == 0) {
                        /* entire range matched */
                        retval = 1;
                        break;
                    }
                    needmm = 0; /* saw what was needed */
                } else {
                    /* mm flag not set for some or all of range */
                    printf("range has no mm flag.\n");
                    break;
                }
            }
        }

        fclose(fp);

        printf("returning %d.\n", retval);
        return retval;
    }

    void *Addr;
    size_t Size;

    /*
     * worker -- the work each thread performs
     */
    static void *
    worker(void *arg)
    {
        int *ret = (int *)arg;
        *ret = is_pmem_proc(Addr, Size);
        return NULL;
    }

    int main(int argc, char *argv[])
    {
        if (argc < 2 || argc > 3) {
            printf("usage: %s file [env].\n", argv[0]);
            return -1;
        }

        int fd = open(argv[1], O_RDWR);

        struct stat stbuf;
        fstat(fd, &stbuf);

        Size = stbuf.st_size;
        Addr = mmap(0, stbuf.st_size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);

        close(fd);

        pthread_t threads[NTHREAD];
        int ret[NTHREAD];

        /* kick off NTHREAD threads */
        for (int i = 0; i < NTHREAD; i++)
            pthread_create(&threads[i], NULL, worker, &ret[i]);

        /* wait for all the threads to complete */
        for (int i = 0; i < NTHREAD; i++)
            pthread_join(threads[i], NULL);

        /* verify that all the threads return the same value */
        for (int i = 1; i < NTHREAD; i++) {
            if (ret[0] != ret[i]) {
                printf("Error i %d ret[0] = %d ret[i] = %d.\n", i,
                       ret[0], ret[i]);
            }
        }

        printf("%d", ret[0]);
        return 0;
    }

    It fails because some threads cannot find, in "/proc/self/smaps", the
    memory region that was allocated in the main process.

    This is caused by proc fs using 'file->version' to record the last VMA
    that has already been handled by the read() system call. When the next
    read() is issued, it uses 'version' to find that VMA, and the VMA after
    it is the one we want to handle. The related code is as follows:

    if (last_addr) {
        vma = find_vma(mm, last_addr);
        if (vma && (vma = m_next_vma(priv, vma)))
            return vma;
    }

    However, a VMA will be lost if the last VMA is gone, e.g.:

    The process VMA list is A->B->C->D

    CPU 0                                   CPU 1
    read() system call
      handle VMA B
      version = B
    return to userspace

                                            unmap VMA B

    issue read() again to continue to get
    the region info
      find_vma(version) will get VMA C
      m_next_vma(C) will get VMA D
      handle D
      !!! VMA C is lost !!!

    In order to fix this bug, we make 'file->version' indicate the end address
    of the current VMA. m_start will then look up a vma with vma_start
    < last_vm_end and move on to the next vma if it finds the same or an
    overlapping vma. This guarantees that we will not miss an exclusive
    vma, but we can still miss one if the previous vma was shrunk. This is
    acceptable because guaranteeing "never miss a vma" is simply not feasible.
    Users have to cope with some inconsistencies if the file is not read in
    one go.
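
    The difference between the two resume strategies can be modelled in
    userspace. The sketch below is a toy under stated assumptions: VMAs are
    a sorted array of [start, end) ranges, and find_vma(), next_vma(),
    resume_old() and resume_new() are simplified stand-ins written for this
    illustration, not the kernel functions of similar names.

```c
#include <stddef.h>

struct vma {
    unsigned long start; /* inclusive */
    unsigned long end;   /* exclusive */
};

/* Toy find_vma(): first range whose end lies above addr, or NULL. */
static const struct vma *find_vma(const struct vma *v, size_t n,
                                  unsigned long addr)
{
    for (size_t i = 0; i < n; i++)
        if (v[i].end > addr)
            return &v[i];
    return NULL;
}

/* Range following cur in the array, or NULL at the end. */
static const struct vma *next_vma(const struct vma *v, size_t n,
                                  const struct vma *cur)
{
    size_t i = (size_t)(cur - v) + 1;

    return i < n ? &v[i] : NULL;
}

/* Old scheme: remember the start of the last reported range. */
static const struct vma *resume_old(const struct vma *v, size_t n,
                                    unsigned long last_start)
{
    const struct vma *vma = find_vma(v, n, last_start);

    return vma ? next_vma(v, n, vma) : NULL;
}

/* New scheme: remember the end of the last reported range. */
static const struct vma *resume_new(const struct vma *v, size_t n,
                                    unsigned long last_end)
{
    const struct vma *vma = find_vma(v, n, last_end - 1);

    /* Skip only a range that is the same as, or overlaps, the last one. */
    if (vma && vma->start < last_end)
        vma = next_vma(v, n, vma);
    return vma;
}
```

    With ranges A, B, C, D and B unmapped after it was reported, the old
    scheme resumes at D while the new one resumes at C.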

    [mhocko@suse.com: changelog fixes]
    Link: http://lkml.kernel.org/r/1475296958-27652-1-git-send-email-robert.hu@intel.com
    Acked-by: Dave Hansen
    Signed-off-by: Xiao Guangrong
    Signed-off-by: Robert Hu
    Acked-by: Michal Hocko
    Acked-by: Oleg Nesterov
    Cc: Paolo Bonzini
    Cc: Dan Williams
    Cc: Gleb Natapov
    Cc: Marcelo Tosatti
    Cc: Stefan Hajnoczi
    Cc: Ross Zwisler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert Ho
     
    In changing from checking ptrace_may_access(p, PTRACE_MODE_ATTACH_FSCREDS)
    to capable(CAP_SYS_NICE), I missed that ptrace_may_access succeeds when p
    == current, but the CAP_SYS_NICE check doesn't.

    Thus while the previous commit was intended to loosen the needed
    privileges to modify a process's timerslack, it needlessly restricted a
    task modifying its own timerslack via the proc//timerslack_ns
    (which is permitted also via the PR_SET_TIMERSLACK method).

    This patch corrects this by checking if p == current before checking the
    CAP_SYS_NICE value.
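
    The ordering of the two checks can be sketched as follows. This is a
    simplified model: struct task and may_set_timerslack() are hypothetical,
    and the boolean field merely stands in for a capable(CAP_SYS_NICE)
    result.

```c
#include <stdbool.h>

/* Hypothetical, minimal task model for illustration only. */
struct task {
    int pid;
    bool has_cap_sys_nice; /* stands in for capable(CAP_SYS_NICE) */
};

/*
 * A task may always change its own timerslack; changing another task's
 * value requires the capability.
 */
static bool may_set_timerslack(const struct task *current_task,
                               const struct task *p)
{
    if (p == current_task)
        return true;
    return current_task->has_cap_sys_nice;
}
```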

    This patch applies on top of my two previous patches currently in -mm

    Link: http://lkml.kernel.org/r/1471906870-28624-1-git-send-email-john.stultz@linaro.org
    Signed-off-by: John Stultz
    Acked-by: Kees Cook
    Cc: "Serge E. Hallyn"
    Cc: Thomas Gleixner
    Cc: Arjan van de Ven
    Cc: Oren Laadan
    Cc: Ruchi Kandoi
    Cc: Rom Lemarchand
    Cc: Todd Kjos
    Cc: Colin Cross
    Cc: Nick Kralevich
    Cc: Dmitry Shmidt
    Cc: Elliott Hughes
    Cc: Android Kernel Team
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Stultz
     
  • As requested, this patch checks the existing LSM hooks
    task_getscheduler/task_setscheduler when reading or modifying the task's
    timerslack value.

    Previous versions added new get/settimerslack LSM hooks, but since they
    checked the same PROCESS__SET/GETSCHED values as existing hooks, it was
    suggested we just use the existing ones.

    Link: http://lkml.kernel.org/r/1469132667-17377-2-git-send-email-john.stultz@linaro.org
    Signed-off-by: John Stultz
    Cc: Kees Cook
    Cc: "Serge E. Hallyn"
    Cc: Thomas Gleixner
    Cc: Arjan van de Ven
    Cc: Oren Laadan
    Cc: Ruchi Kandoi
    Cc: Rom Lemarchand
    Cc: Todd Kjos
    Cc: Colin Cross
    Cc: Nick Kralevich
    Cc: Dmitry Shmidt
    Cc: Elliott Hughes
    Cc: James Morris
    Cc: Android Kernel Team
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Stultz
     
    When an interface to allow a task to change another task's timerslack was
    first proposed, it was suggested that something greater than
    CAP_SYS_NICE would be needed, as a task could be delayed further than
    what normally could be done with nice adjustments.

    So CAP_SYS_PTRACE was adopted instead for what became the
    /proc//timerslack_ns interface. However, for Android (where this
    feature originates), giving the system_server CAP_SYS_PTRACE would allow
    it to observe and modify all tasks' memory. This is considered too high
    a privilege level for only needing to change the timerslack.

    After some discussion, it was realized that a CAP_SYS_NICE process can
    set a task as SCHED_FIFO, so they could fork some spinning processes and
    set them all SCHED_FIFO 99, in effect delaying all other tasks for an
    infinite amount of time.

    So as a CAP_SYS_NICE task can already cause trouble for other tasks,
    using it as a required capability for accessing and modifying
    /proc//timerslack_ns seems sufficient.

    Thus, this patch loosens the capability requirements to CAP_SYS_NICE and
    removes CAP_SYS_PTRACE, simplifying some of the code flow as well.

    This is technically an ABI change, but as the feature just landed in
    4.6, I suspect no one is yet using it.

    Link: http://lkml.kernel.org/r/1469132667-17377-1-git-send-email-john.stultz@linaro.org
    Signed-off-by: John Stultz
    Reviewed-by: Nick Kralevich
    Acked-by: Serge Hallyn
    Acked-by: Kees Cook
    Cc: Kees Cook
    Cc: "Serge E. Hallyn"
    Cc: Thomas Gleixner
    Cc: Arjan van de Ven
    Cc: Oren Laadan
    Cc: Ruchi Kandoi
    Cc: Rom Lemarchand
    Cc: Todd Kjos
    Cc: Colin Cross
    Cc: Nick Kralevich
    Cc: Dmitry Shmidt
    Cc: Elliott Hughes
    Cc: Android Kernel Team
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Stultz
     
  • Use a specific routine to emit most lines so that the code is easier to
    read and maintain.

    akpm:
       text    data     bss     dec     hex filename
       2976       8       0    2984     ba8 fs/proc/meminfo.o before
       2669       8       0    2677     a75 fs/proc/meminfo.o after

    Link: http://lkml.kernel.org/r/8fce7fdef2ba081a4ef531594e97da8a9feebb58.1470810406.git.joe@perches.com
    Signed-off-by: Joe Perches
    Cc: Andi Kleen
    Cc: Alexey Dobriyan
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Allow some seq_puts removals by taking a string instead of a single
    char.

    [akpm@linux-foundation.org: update vmstat_show(), per Joe]
    Link: http://lkml.kernel.org/r/667e1cf3d436de91a5698170a1e98d882905e956.1470704995.git.joe@perches.com
    Signed-off-by: Joe Perches
    Cc: Joe Perches
    Cc: Andi Kleen
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • top(1) opens the following files for every PID:

    /proc/*/stat
    /proc/*/statm
    /proc/*/status

    This patch switches /proc/*/status away from seq_printf().
    The result is 13.5% speedup.

    Benchmark is open("/proc/self/status")+read+close 1.000.000 times.

    BEFORE
    $ perf stat -r 10 taskset -c 3 ./proc-self-status

    Performance counter stats for 'taskset -c 3 ./proc-self-status' (10 runs):

    10748.474301 task-clock (msec) # 0.954 CPUs utilized ( +- 0.91% )
    12 context-switches # 0.001 K/sec ( +- 1.09% )
    1 cpu-migrations # 0.000 K/sec
    104 page-faults # 0.010 K/sec ( +- 0.45% )
    37,424,127,876 cycles # 3.482 GHz ( +- 0.04% )
    8,453,010,029 stalled-cycles-frontend # 22.59% frontend cycles idle ( +- 0.12% )
    3,747,609,427 stalled-cycles-backend # 10.01% backend cycles idle ( +- 0.68% )
    65,632,764,147 instructions # 1.75 insn per cycle
    # 0.13 stalled cycles per insn ( +- 0.00% )
    13,981,324,775 branches # 1300.773 M/sec ( +- 0.00% )
    138,967,110 branch-misses # 0.99% of all branches ( +- 0.18% )

    11.263885428 seconds time elapsed ( +- 0.04% )
    ^^^^^^^^^^^^

    AFTER
    $ perf stat -r 10 taskset -c 3 ./proc-self-status

    Performance counter stats for 'taskset -c 3 ./proc-self-status' (10 runs):

    9010.521776 task-clock (msec) # 0.925 CPUs utilized ( +- 1.54% )
    11 context-switches # 0.001 K/sec ( +- 1.54% )
    1 cpu-migrations # 0.000 K/sec ( +- 11.11% )
    103 page-faults # 0.011 K/sec ( +- 0.60% )
    32,352,310,603 cycles # 3.591 GHz ( +- 0.07% )
    7,849,199,578 stalled-cycles-frontend # 24.26% frontend cycles idle ( +- 0.27% )
    3,269,738,842 stalled-cycles-backend # 10.11% backend cycles idle ( +- 0.73% )
    56,012,163,567 instructions # 1.73 insn per cycle
    # 0.14 stalled cycles per insn ( +- 0.00% )
    11,735,778,795 branches # 1302.453 M/sec ( +- 0.00% )
    98,084,459 branch-misses # 0.84% of all branches ( +- 0.28% )

    9.741247736 seconds time elapsed ( +- 0.07% )
    ^^^^^^^^^^^

    Link: http://lkml.kernel.org/r/20160806125608.GB1187@p183.telecom.by
    Signed-off-by: Alexey Dobriyan
    Cc: Joe Perches
    Cc: Andi Kleen
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
    Every current KDE system has a process named ksysguardd polling the files
    below once every several seconds:

    $ strace -e trace=open -p $(pidof ksysguardd)
    Process 1812 attached
    open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 8
    open("/etc/mtab", O_RDONLY|O_CLOEXEC) = 8
    open("/proc/net/dev", O_RDONLY) = 8
    open("/proc/net/wireless", O_RDONLY) = -1 ENOENT (No such file or directory)
    open("/proc/stat", O_RDONLY) = 8
    open("/proc/vmstat", O_RDONLY) = 8

    Hell knows what it is doing but speed up reading /proc/vmstat by 33%!

    Benchmark is open+read+close 1.000.000 times.

    BEFORE
    $ perf stat -r 10 taskset -c 3 ./proc-vmstat

    Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):

    13146.768464 task-clock (msec) # 0.960 CPUs utilized ( +- 0.60% )
    15 context-switches # 0.001 K/sec ( +- 1.41% )
    1 cpu-migrations # 0.000 K/sec ( +- 11.11% )
    104 page-faults # 0.008 K/sec ( +- 0.57% )
    45,489,799,349 cycles # 3.460 GHz ( +- 0.03% )
    9,970,175,743 stalled-cycles-frontend # 21.92% frontend cycles idle ( +- 0.10% )
    2,800,298,015 stalled-cycles-backend # 6.16% backend cycles idle ( +- 0.32% )
    79,241,190,850 instructions # 1.74 insn per cycle
    # 0.13 stalled cycles per insn ( +- 0.00% )
    17,616,096,146 branches # 1339.956 M/sec ( +- 0.00% )
    176,106,232 branch-misses # 1.00% of all branches ( +- 0.18% )

    13.691078109 seconds time elapsed ( +- 0.03% )
    ^^^^^^^^^^^^

    AFTER
    $ perf stat -r 10 taskset -c 3 ./proc-vmstat

    Performance counter stats for 'taskset -c 3 ./proc-vmstat' (10 runs):

    8688.353749 task-clock (msec) # 0.950 CPUs utilized ( +- 1.25% )
    10 context-switches # 0.001 K/sec ( +- 2.13% )
    1 cpu-migrations # 0.000 K/sec
    104 page-faults # 0.012 K/sec ( +- 0.56% )
    30,384,010,730 cycles # 3.497 GHz ( +- 0.07% )
    12,296,259,407 stalled-cycles-frontend # 40.47% frontend cycles idle ( +- 0.13% )
    3,370,668,651 stalled-cycles-backend # 11.09% backend cycles idle ( +- 0.69% )
    28,969,052,879 instructions # 0.95 insn per cycle
    # 0.42 stalled cycles per insn ( +- 0.01% )
    6,308,245,891 branches # 726.058 M/sec ( +- 0.00% )
    214,685,502 branch-misses # 3.40% of all branches ( +- 0.26% )

    9.146081052 seconds time elapsed ( +- 0.07% )
    ^^^^^^^^^^^

    vsnprintf() is slow because:

    1. format_decode() is busy looking for a format specifier: 2 branches
    per character (not in this case, but in others)

    2. approximately a million branches while parsing the format mini
    language and everywhere

    3. just look at what string() does

    /proc/vmstat is a good case because most of its content is strings
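
    To make the contrast concrete, here is a userspace sketch of emitting a
    "name value" line without any format-string parsing. It is not the
    kernel's seq_file code: put_decimal() and emit_line() are hypothetical
    helpers written for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Write v in decimal at dst (no terminator); returns digits written. */
static size_t put_decimal(char *dst, unsigned long long v)
{
    char tmp[20]; /* enough for the largest 64-bit value */
    size_t n = 0;

    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v);

    /* Digits were produced least significant first; reverse them. */
    for (size_t i = 0; i < n; i++)
        dst[i] = tmp[n - 1 - i];
    return n;
}

/*
 * Emit "name value\n" into dst using memcpy and put_decimal only.
 * Returns the line length, or 0 if the worst case would not fit in cap.
 */
static size_t emit_line(char *dst, size_t cap, const char *name,
                        unsigned long long v)
{
    size_t len = strlen(name);

    /* name + space + up to 20 digits + newline + terminator */
    if (len + 1 + 20 + 1 + 1 > cap)
        return 0;

    memcpy(dst, name, len);
    dst[len++] = ' ';
    len += put_decimal(dst + len, v);
    dst[len++] = '\n';
    dst[len] = '\0';
    return len;
}
```

    The string part is a single memcpy() and the number needs one loop,
    with no per-character scan for a format specifier.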

    Link: http://lkml.kernel.org/r/20160806125455.GA1187@p183.telecom.by
    Signed-off-by: Alexey Dobriyan
    Cc: Joe Perches
    Cc: Andi Kleen
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • This came to light when implementing native 64-bit atomics for ARCv2.

    The atomic64 self-test code uses CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
    to check whether atomic64_dec_if_positive() is available. It seems it
    was needed when not every arch defined it. However, as of current code
    the Kconfig option seems needless:

    - for CONFIG_GENERIC_ATOMIC64 it is auto-enabled in lib/Kconfig and a
    generic definition of the API is present in lib/atomic64.c
    - arches with native 64-bit atomics select it in arch/*/Kconfig and
    define the API in their headers

    So I see no point in keeping the Kconfig option.

    Compile tested for:
    - blackfin (CONFIG_GENERIC_ATOMIC64)
    - x86 (!CONFIG_GENERIC_ATOMIC64)
    - ia64

    Link: http://lkml.kernel.org/r/1473703083-8625-3-git-send-email-vgupta@synopsys.com
    Signed-off-by: Vineet Gupta
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Ralf Baechle
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: "David S. Miller"
    Cc: Chris Metcalf
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Vineet Gupta
    Cc: Zhaoxiu Zeng
    Cc: Linus Walleij
    Cc: Alexander Potapenko
    Cc: Andrey Ryabinin
    Cc: Herbert Xu
    Cc: Ming Lin
    Cc: Arnd Bergmann
    Cc: Geert Uytterhoeven
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Andi Kleen
    Cc: Boqun Feng
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vineet Gupta
     
    This is based on the s390 version and is needed to get rid of
    CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE

    Link: http://lkml.kernel.org/r/1473703083-8625-2-git-send-email-vgupta@synopsys.com
    Signed-off-by: Vineet Gupta
    Reported-by: kbuild test robot
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vineet Gupta
     
    The macro PAGE_ALIGNED() is error-prone because it doesn't follow the
    convention of parenthesizing the parameter @addr within the macro body.
    For example, given

        unsigned long *ptr = kmalloc(...);
        PAGE_ALIGNED(ptr + 16);

    the left parameter of macro IS_ALIGNED() should be
    (unsigned long)(ptr + 16), but the actual one is (unsigned long)ptr + 16.

    It is fixed by simply canonicalizing macro PAGE_ALIGNED() definition.
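
    The precedence problem can be reproduced in userspace with the sketch
    below. The PAGE_SIZE value, the IS_ALIGNED() definition and the
    _OLD/_NEW macro names are assumptions made for the illustration, and it
    also assumes that unsigned long is wide enough to hold a pointer.

```c
#define PAGE_SIZE 4096UL /* assumed page size for this sketch */

/* True if x is a multiple of a, where a is a power of two. */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Parameter not parenthesized: the cast binds to the first operand only. */
#define PAGE_ALIGNED_OLD(addr) IS_ALIGNED((unsigned long)addr, PAGE_SIZE)

/* Parameter parenthesized: the whole expression is converted. */
#define PAGE_ALIGNED_NEW(addr) IS_ALIGNED((unsigned long)(addr), PAGE_SIZE)
```

    With an unsigned long *ptr, PAGE_ALIGNED_OLD(ptr + 16) tests the address
    of ptr plus 16 bytes, whereas PAGE_ALIGNED_NEW(ptr + 16) tests the
    address 16 elements past ptr.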

    Link: http://lkml.kernel.org/r/57EA6AE7.7090807@zoho.com
    Signed-off-by: zijun_hu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    zijun_hu