18 May, 2019

4 commits

  • Pull sound fixes from Takashi Iwai:
    "Just a few HD-audio fixes, most of which are specific to Realtek
    codecs"

    * tag 'sound-fix-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: hda/realtek - Fix for Lenovo B50-70 inverted internal microphone bug
    ALSA: hda: Fix race between creating and refreshing sysfs entries
    ALSA: hda/realtek - Corrected fixup for System76 Gazelle (gaze14)
    ALSA: hda/realtek - Avoid superfluous COEF EAPD setups
    ALSA: hda/realtek - Fixup headphone noise via runtime suspend

    Linus Torvalds
     
  • Pull KVM updates from Paolo Bonzini:
    "ARM:
    - support for SVE and Pointer Authentication in guests
    - PMU improvements

    POWER:
    - support for direct access to the POWER9 XIVE interrupt controller
    - memory and performance optimizations

    x86:
    - support for accessing memory not backed by struct page
    - fixes and refactoring

    Generic:
    - dirty page tracking improvements"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (155 commits)
    kvm: fix compilation on aarch64
    Revert "KVM: nVMX: Expose RDPMC-exiting only when guest supports PMU"
    kvm: x86: Fix L1TF mitigation for shadow MMU
    KVM: nVMX: Disable intercept for FS/GS base MSRs in vmcs02 when possible
    KVM: PPC: Book3S: Remove useless checks in 'release' method of KVM device
    KVM: PPC: Book3S HV: XIVE: Fix spelling mistake "acessing" -> "accessing"
    KVM: PPC: Book3S HV: Make sure to load LPID for radix VCPUs
    kvm: nVMX: Set nested_run_pending in vmx_set_nested_state after checks complete
    tests: kvm: Add tests for KVM_SET_NESTED_STATE
    KVM: nVMX: KVM_SET_NESTED_STATE - Tear down old EVMCS state before setting new state
    tests: kvm: Add tests for KVM_CAP_MAX_VCPUS and KVM_CAP_MAX_CPU_ID
    tests: kvm: Add tests to .gitignore
    KVM: Introduce KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2
    KVM: Fix kvm_clear_dirty_log_protect off-by-(minus-)one
    KVM: Fix the bitmap range to copy during clear dirty
    KVM: arm64: Fix ptrauth ID register masking logic
    KVM: x86: use direct accessors for RIP and RSP
    KVM: VMX: Use accessors for GPRs outside of dedicated caching logic
    KVM: x86: Omit caching logic for always-available GPRs
    kvm, x86: Properly check whether a pfn is an MMIO or not
    ...

    Linus Torvalds
     
  • Pull more s390 updates from Martin Schwidefsky:

    - Enhancements for the QDIO layer

    - Remove the RCP trace event

    - Avoid three build issues

    - Move the defconfig to the configs directory

    * tag 's390-5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390: move arch/s390/defconfig to arch/s390/configs/defconfig
    s390/qdio: optimize state inspection of HW-owned SBALs
    s390/qdio: use get_buf_state() in debug_get_buf_state()
    s390/qdio: allow to scan all Output SBALs in one go
    s390/cio: Remove tracing for rchp instruction
    s390/kasan: adapt disabled_wait usage to avoid build error
    latent_entropy: avoid build error when plugin cflags are not set
    s390/boot: fix compiler error due to missing awk strtonum

    Linus Torvalds
     
  • Pull more vfs mount updates from Al Viro:
    "Propagation of new syscalls to other architectures + cosmetic change
    from Christian (fscontext didn't follow the convention for anon inode
    names)"

    * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    uapi: Wire up the mount API syscalls on non-x86 arches [ver #2]
    uapi, x86: Fix the syscall numbering of the mount API syscalls [ver #2]
    uapi, fsopen: use square brackets around "fscontext" [ver #2]

    Linus Torvalds
     

17 May, 2019

12 commits

  • Pull more block updates from Jens Axboe:
    "This is mainly some late lightnvm changes that came in just before the
    merge window, as well as fixes that have been queued up since the
    initial pull request was frozen.

    This contains:

    - lightnvm changes, fixing race conditions, improving memory
    utilization, and improving pblk compatibility (Chansol, Igor,
    Marcin)

    - NVMe pull request with minor fixes all over the map (via Christoph)

    - remove redundant error print in sata_rcar (Geert)

    - struct_size() cleanup (Jackie)

    - dasd CONFIG_LBADF warning fix (Ming)

    - brd cond_resched() improvement (Mikulas)"

    * tag 'for-5.2/block-post-20190516' of git://git.kernel.dk/linux-block: (41 commits)
    block/bio-integrity: use struct_size() in kmalloc()
    nvme: validate cntlid during controller initialisation
    nvme: change locking for the per-subsystem controller list
    nvme: trace all async notice events
    nvme: fix typos in nvme status code values
    nvme-fabrics: remove unused argument
    nvme-multipath: avoid crash on invalid subsystem cntlid enumeration
    nvme-fc: use separate work queue to avoid warning
    nvme-rdma: remove redundant reference between ib_device and tagset
    nvme-pci: mark expected switch fall-through
    nvme-pci: add known admin effects to augument admin effects log page
    nvme-pci: init shadow doorbell after each reset
    brd: add cond_resched to brd_free_pages
    sata_rcar: Remove ata_host_alloc() error printing
    s390/dasd: fix build warning in dasd_eckd_build_cp_raw
    lightnvm: pblk: use nvm_rq_to_ppa_list()
    lightnvm: pblk: simplify partial read path
    lightnvm: do not remove instance under global lock
    lightnvm: track inflight target creations
    lightnvm: pblk: recover only written metadata
    ...

    Linus Torvalds
     
  • Pull more clk framework updates from Stephen Boyd:
    "One more patch to remove io.h from clk-provider.h.

    We used to need this include when we had clk_readl() and clk_writel(),
    but those are gone now so this patch pushes the dependency out to the
    users of clk-provider.h"

    * tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
    clk: Remove io.h from clk-provider.h

    Linus Torvalds
     
  • Pull misc AFS fixes from David Howells:
    "This fixes a set of miscellaneous issues in the afs filesystem,
    including:

    - leak of keys on file close.

    - broken error handling in xattr functions.

    - missing locking when updating VL server list.

    - volume location server DNS lookup whereby preloaded cells may not
    ever get a lookup and regular DNS lookups to maintain server lists
    consume power unnecessarily.

    - incorrect error propagation and handling in the fileserver
    iteration code causes operations to sometimes apparently succeed.

    - interruption of server record check/update side op during
    fileserver iteration causes uninterruptible main operations to fail
    unexpectedly.

    - callback promise expiry time miscalculation.

    - over invalidation of the callback promise on directories.

    - double locking on callback break waking up file locking waiters.

    - double increment of the vnode callback break counter.

    Note that it makes some changes outside of the afs code, including:

    - an extra parameter to dns_query() to allow the dns_resolver key
    just accessed to be immediately invalidated. AFS is caching the
    results itself, so the key can be discarded.

    - an interruptible version of wait_var_event().

    - an rxrpc function to allow the maximum lifespan to be set on a
    call.

    - a way for an rxrpc call to be marked as non-interruptible"

    * tag 'afs-fixes-20190516' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    afs: Fix double inc of vnode->cb_break
    afs: Fix lock-wait/callback-break double locking
    afs: Don't invalidate callback if AFS_VNODE_DIR_VALID not set
    afs: Fix calculation of callback expiry time
    afs: Make dynamic root population wait uninterruptibly for proc_cells_lock
    afs: Make some RPC operations non-interruptible
    rxrpc: Allow the kernel to mark a call as being non-interruptible
    afs: Fix error propagation from server record check/update
    afs: Fix the maximum lifespan of VL and probe calls
    rxrpc: Provide kernel interface to set max lifespan on a call
    afs: Fix "kAFS: AFS vnode with undefined type 0"
    afs: Fix cell DNS lookup
    Add wait_var_event_interruptible()
    dns_resolver: Allow used keys to be invalidated
    afs: Fix afs_cell records to always have a VL server list record
    afs: Fix missing lock when replacing VL server list
    afs: Fix afs_xattr_get_yfs() to not try freeing an error value
    afs: Fix incorrect error handling in afs_xattr_get_acl()
    afs: Fix key leak in afs_release() and afs_evict_inode()

    Linus Torvalds
     
  • Pull ceph updates from Ilya Dryomov:
    "On the filesystem side we have:

    - a fix to enforce quotas set above the mount point (Luis Henriques)

    - support for exporting snapshots through NFS (Zheng Yan)

    - proper statx implementation (Jeff Layton). statx flags are mapped
    to MDS caps, with AT_STATX_{DONT,FORCE}_SYNC taken into account.

    - some follow-up dentry name handling fixes, in particular
    elimination of our hand-rolled helper and the switch to __getname()
    as suggested by Al (Jeff Layton)

    - a set of MDS client cleanups in preparation for async MDS requests
    in the future (Jeff Layton)

    - a fix to sync the filesystem before remounting (Jeff Layton)

    On the rbd side, work is on-going on object-map and fast-diff image
    features"

    * tag 'ceph-for-5.2-rc1' of git://github.com/ceph/ceph-client: (29 commits)
    ceph: flush dirty inodes before proceeding with remount
    ceph: fix unaligned access in ceph_send_cap_releases
    libceph: make ceph_pr_addr take an struct ceph_entity_addr pointer
    libceph: fix unaligned accesses in ceph_entity_addr handling
    rbd: don't assert on writes to snapshots
    rbd: client_mutex is never nested
    ceph: print inode number in __caps_issued_mask debugging messages
    ceph: just call get_session in __ceph_lookup_mds_session
    ceph: simplify arguments and return semantics of try_get_cap_refs
    ceph: fix comment over ceph_drop_caps_for_unlink
    ceph: move wait for mds request into helper function
    ceph: have ceph_mdsc_do_request call ceph_mdsc_submit_request
    ceph: after an MDS request, do callback and completions
    ceph: use pathlen values returned by set_request_path_attr
    ceph: use __getname/__putname in ceph_mdsc_build_path
    ceph: use ceph_mdsc_build_path instead of clone_dentry_name
    ceph: fix potential use-after-free in ceph_mdsc_build_path
    ceph: dump granular cap info in "caps" debugfs file
    ceph: make iterate_session_caps a public symbol
    ceph: fix NULL pointer deref when debugging is enabled
    ...

    Linus Torvalds
     
  • Pull thermal management updates from Zhang Rui:

    - Remove the 'module' Kconfig option for the thermal subsystem
    framework, because the thermal framework is required to be ready as
    early as possible to avoid overheating at boot time (Daniel Lezcano)

    - Fix a bug where the thermal framework pokes disabled thermal zones
    upon resume (Wei Wang)

    - A couple of cleanups and trivial fixes on int340x thermal drivers
    (Srinivas Pandruvada, Zhang Rui, Sumeet Pawnikar)

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
    drivers: thermal: processor_thermal: Downgrade error message
    mlxsw: Remove obsolete dependency on THERMAL=m
    hwmon/drivers/core: Simplify complex dependency
    thermal/drivers/core: Fix typo in the option name
    thermal/drivers/core: Remove depends on THERMAL in Kconfig
    thermal/drivers/core: Remove module unload code
    thermal/drivers/core: Remove the module Kconfig's option
    thermal: core: skip update disabled thermal zones after suspend
    thermal: make device_register's type argument const
    thermal: intel: int340x: processor_thermal_device: simplify to get driver data
    thermal/int3403_thermal: favor _TMP instead of PTYP

    Linus Torvalds
     
  • Pull device mapper updates from Mike Snitzer:

    - Improve DM snapshot target's scalability by using finer grained
    locking. Requires some list_bl interface improvements.

    - Add ability for DM integrity to use a bitmap mode, that tracks
    regions where data and metadata are out of sync, instead of using a
    journal.

    - Improve DM thin provisioning target to not write metadata changes to
    disk if the thin-pool and associated thin devices are merely
    activated but not used. This avoids metadata corruption due to
    concurrent activation of thin devices across different OS instances
    (e.g. split brain scenarios, which ultimately would be avoided if
    proper device filters were used -- but not having proper filtering
    has proven a very common configuration mistake)

    - Fix missing call to path selector type->end_io in DM multipath. This
    fixes reported performance problems due to inaccurate path selector
    IO accounting causing an imbalance of IO (e.g. avoiding issuing IO to
    particular path due to it seemingly being heavily used).

    - Fix bug in DM cache metadata's loading of its discard bitset that
    could lead to all cache blocks being discarded if the very first
    cache block was discarded (thankfully in practice the first cache
    block is generally in use; be it FS superblock, partition table, disk
    label, etc).

    - Add testing-only DM dust target which simulates a device that has
    failing sectors and/or read failures.

    - Fix a DM init error path reference count hang that caused boot hangs
    if the user supplied malformed input on the kernel command line.

    - Fix a couple issues with DM crypt target's logging being overly
    verbose or lacking context.

    - Various other small fixes to DM init, DM multipath, DM zoned, and DM
    crypt.

    * tag 'for-5.2/dm-changes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (42 commits)
    dm: fix a couple brace coding style issues
    dm crypt: print device name in integrity error message
    dm crypt: move detailed message into debug level
    dm ioctl: fix hang in early create error condition
    dm integrity: whitespace, coding style and dead code cleanup
    dm integrity: implement synchronous mode for reboot handling
    dm integrity: handle machine reboot in bitmap mode
    dm integrity: add a bitmap mode
    dm integrity: introduce a function add_new_range_and_wait()
    dm integrity: allow large ranges to be described
    dm ingerity: pass size to dm_integrity_alloc_page_list()
    dm integrity: introduce rw_journal_sectors()
    dm integrity: update documentation
    dm integrity: don't report unused options
    dm integrity: don't check null pointer before kvfree and vfree
    dm integrity: correctly calculate the size of metadata area
    dm dust: Make dm_dust_init and dm_dust_exit static
    dm dust: remove redundant unsigned comparison to less than zero
    dm mpath: always free attached_handler_name in parse_path()
    dm init: fix max devices/targets checks
    ...

    Linus Torvalds
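
    The bitmap mode described above (tracking regions where data and
    metadata may be out of sync, instead of journaling) can be pictured
    with a small userspace model. This is only a sketch under stated
    assumptions, not dm-integrity's implementation: the region size, the
    64-region limit and every toy_* name are hypothetical and chosen for
    illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_REGIONS 64u            /* hypothetical: one uint64_t covers the device */
#define SECTORS_PER_REGION 1024u  /* hypothetical region granularity */

/* One bit per region; a set bit means data and metadata may disagree. */
struct toy_region_bitmap {
	uint64_t bits;
};

static uint64_t toy_region_of(uint64_t sector)
{
	return sector / SECTORS_PER_REGION;
}

/* Mark every region touched by a write before the write is issued. */
static void toy_bitmap_mark_write(struct toy_region_bitmap *bm,
				  uint64_t sector, uint64_t nr_sectors)
{
	uint64_t r, last;

	assert(nr_sectors > 0);
	last = toy_region_of(sector + nr_sectors - 1);
	assert(last < NR_REGIONS);
	for (r = toy_region_of(sector); r <= last; r++)
		bm->bits |= UINT64_C(1) << r;
}

/* Clear a region once both its data and its metadata are known to be stable. */
static void toy_bitmap_region_synced(struct toy_region_bitmap *bm, uint64_t region)
{
	assert(region < NR_REGIONS);
	bm->bits &= ~(UINT64_C(1) << region);
}

/* After a crash, only regions whose bit is still set need rechecking. */
static bool toy_bitmap_needs_recalc(const struct toy_region_bitmap *bm, uint64_t region)
{
	assert(region < NR_REGIONS);
	return (bm->bits >> region) & 1u;
}

/* Number of regions that would have to be rechecked after a crash. */
static unsigned int toy_bitmap_count_dirty(const struct toy_region_bitmap *bm)
{
	unsigned int r, n = 0;

	for (r = 0; r < NR_REGIONS; r++)
		if (bm->bits & (UINT64_C(1) << r))
			n++;
	return n;
}
```

    The trade-off this model illustrates: a write only has to set a bit
    rather than journal the data, and recovery rechecks whole regions
    that were marked rather than replaying a journal.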
     
  • It turned out that DEBUG_SLAB_LEAK is still broken even after recent
    rescue efforts: when there is a large number of objects like
    kmemleak_object, which is normal on a debug kernel,

    # grep kmemleak /proc/slabinfo
    kmemleak_object 2243606 3436210 ...

    reading /proc/slab_allocators could easily loop forever while processing
    the kmemleak_object cache, and any additional freeing or allocating of
    objects will trigger a reprocessing. To make the situation worse,
    soft-lockups could easily happen in this situation, which will call
    printk() to allocate more kmemleak objects and so guarantee an infinite
    loop.

    Also, since it seems no one had noticed when it was totally broken
    more than 2 years ago - see the commit fcf88917dd43 ("slab: fix a crash
    by reading /proc/slab_allocators") - probably nobody cares about it
    anymore due to the decline of SLAB. Just remove it entirely.

    Suggested-by: Vlastimil Babka
    Suggested-by: Linus Torvalds
    Signed-off-by: Qian Cai
    Signed-off-by: Linus Torvalds

    Qian Cai
     
  • Pull media fixes from Mauro Carvalho Chehab:
    "Some fixes for some platform drivers (rockchip, atmel, omap, daVinci,
    tegra-cec, coda and rcar).

    Also includes a fix to one of the V4L2 uAPI docs, explaining a corner
    case"

    * tag 'media/v5.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
    media: rockchip/vpu: Fix/re-order probe-error/remove path
    media: rockchip/vpu: Initialize mdev->bus_info
    media: rockchip/vpu: Get vdev from the file arg in vidioc_querycap()
    media: rockchip/vpu: Add missing dont_use_autosuspend() calls
    media: rockchip/vpu: Do not request id 0 for our video device
    media: tegra-cec: fix cec_notifier_parse_hdmi_phandle return check
    media: davinci/vpbe: array underflow in vpbe_enum_outputs()
    media: field-order.rst: clarify FIELD_ANY and FIELD_NONE
    media: staging/imx: add media device to capture register
    media: rcar-csi2: Propagate the FLD signal for NTSC and PAL
    media: rcar-csi2: restart CSI-2 link if error is detected
    media: omap_vout: potential buffer overflow in vidioc_dqbuf()
    media: coda: fix unset field and fail on invalid field in buf_prepare
    media: atmel: atmel-isc: fix asd memory allocation
    media: atmel: atmel-isc: fix INIT_WORK misplacement
    media: atmel: atmel-isc: limit incoming pixels per frame

    Linus Torvalds
     
  • Pull nommu generic uaccess updates from Arnd Bergmann:
    "asm-generic: kill <asm/segment.h> and improve nommu generic uaccess helpers

    Christoph Hellwig writes:

    This is a series doing two somewhat intertwined things. It improves
    the asm-generic nommu uaccess helper to optionally be entirely
    generic and not require any arch helpers for the actual uaccess.
    For the generic uaccess.h to actually be generically useful I also
    had to kill off the mess we made of <asm/segment.h>, which really
    shouldn't exist on most architectures"

    * tag 'asm-generic-nommu' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
    asm-generic: optimize generic uaccess for 8-byte loads and stores
    asm-generic: provide entirely generic nommu uaccess
    arch: mostly remove <asm/segment.h>
    asm-generic: don't include <asm/segment.h> from <asm/uaccess.h>

    Linus Torvalds
     
  • Pull core fixes from Ingo Molnar:
    "A handful of objtool updates, plus a documentation addition for
    __ab_c_size()"

    * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    objtool: Fix whitelist documentation typo
    objtool: Fix function fallthrough detection
    objtool: Don't use ignore flag for fake jumps
    overflow.h: Add comment documenting __ab_c_size()

    Linus Torvalds
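
    For readers unfamiliar with __ab_c_size(): it is the saturating
    "a * b + c" size helper in overflow.h that struct_size() builds on.
    The following is a userspace sketch of that idea, not the kernel
    code itself. It assumes the GCC/Clang overflow builtins, and the
    names ab_c_size, sample_vec and sample_vec_size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Saturating "a * b + c": returns SIZE_MAX if either the multiplication
 * or the addition would overflow, so that an allocation sized with the
 * result fails instead of silently being too small.
 */
static size_t ab_c_size(size_t a, size_t b, size_t c)
{
	size_t bytes;

	if (__builtin_mul_overflow(a, b, &bytes))
		return SIZE_MAX;
	if (__builtin_add_overflow(bytes, c, &bytes))
		return SIZE_MAX;
	return bytes;
}

/* Hypothetical structure ending in a flexible array member. */
struct sample_vec {
	unsigned int count;
	uint64_t items[];
};

/* Bytes needed for a sample_vec with n items, or SIZE_MAX on overflow. */
static size_t sample_vec_size(size_t n)
{
	return ab_c_size(n, sizeof(uint64_t), sizeof(struct sample_vec));
}

/* Small usage example exercising the normal and the saturating paths. */
static void ab_c_size_selfcheck(void)
{
	assert(ab_c_size(3, 4, 5) == 17);
	assert(ab_c_size(SIZE_MAX, 2, 0) == SIZE_MAX);  /* a * b overflows */
	assert(ab_c_size(SIZE_MAX, 1, 1) == SIZE_MAX);  /* + c overflows */
	assert(sample_vec_size(4) ==
	       sizeof(struct sample_vec) + 4 * sizeof(uint64_t));
}
```

    Saturating to SIZE_MAX rather than wrapping is the point of the
    design: an allocator will refuse a SIZE_MAX request, whereas a
    wrapped small value would be granted and later overrun.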
     
  • Wire up the mount API syscalls on non-x86 arches.

    Reported-by: Arnd Bergmann
    Signed-off-by: David Howells
    Reviewed-by: Arnd Bergmann
    Signed-off-by: Al Viro

    David Howells
     
  • Pull ARM SoC-related driver updates from Olof Johansson:
    "Various driver updates for platforms and a couple of the small driver
    subsystems we merge through our tree:

    Among the larger pieces:

    - Power management improvements for TI am335x and am437x (RTC
    suspend/wake)

    - Misc new additions for Amlogic (socinfo updates)

    - ZynqMP FPGA manager

    - Nvidia improvements for reset/powergate handling

    - PMIC wrapper for Mediatek MT8516

    - Misc fixes/improvements for ARM SCMI, TEE, NXP i.MX SCU drivers"

    * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (57 commits)
    soc: aspeed: fix Kconfig
    soc: add aspeed folder and misc drivers
    spi: zynqmp: Fix build break
    soc: imx: Add generic i.MX8 SoC driver
    MAINTAINERS: Update email for Qualcomm SoC maintainer
    memory: tegra: Fix a typos for "fdcdwr2" mc client
    Revert "ARM: tegra: Restore memory arbitration on resume from LP1 on Tegra30+"
    memory: tegra: Replace readl-writel with mc_readl-mc_writel
    memory: tegra: Fix integer overflow on tick value calculation
    memory: tegra: Fix missed registers values latching
    ARM: tegra: cpuidle: Handle tick broadcasting within cpuidle core on Tegra20/30
    optee: allow to work without static shared memory
    soc/tegra: pmc: Move powergate initialisation to probe
    soc/tegra: pmc: Remove reset sysfs entries on error
    soc/tegra: pmc: Fix reset sources and levels
    soc: amlogic: meson-gx-pwrc-vpu: Add support for G12A
    soc: amlogic: meson-gx-pwrc-vpu: Fix power on/off register bitmask
    fpga manager: Adding FPGA Manager support for Xilinx zynqmp
    dt-bindings: fpga: Add bindings for ZynqMP fpga driver
    firmware: xilinx: Add fpga API's
    ...

    Linus Torvalds
     

16 May, 2019

14 commits

  • Pull ARM Device-tree updates from Olof Johansson:
    "Besides new bindings and additional descriptions of hardware blocks
    for various SoCs and boards, the main new contents here is:

    SoCs:
    - Intel Agilex (SoCFPGA)
    - NXP i.MX8MM (Quad Cortex-A53 with media/graphics focus)

    New boards:
    - Allwinner:
    + RerVision H3-DVK (H3)
    + Oceanic 5205 5inMFD (H6)
    + Beelink GS2 (H6)
    + Orange Pi 3 (H6)
    - Rockchip:
    + Orange Pi RK3399
    + Nanopi NEO4
    + Veyron-Mighty Chromebook variant
    - Amlogic:
    + SEI Robotics SEI510
    - ST Micro:
    + stm32mp157a discovery1
    + stm32mp157c discovery2
    - NXP:
    + Eckelmann ci4x10 (i.MX6DL)
    + i.MX8MM EVK (i.MX8MM)
    + ZII i.MX7 RPU2 (i.MX7)
    + ZII SPB4 (VF610)
    + Zii Ultra (i.MX8M)
    + TQ TQMa7S (i.MX7Solo)
    + TQ TQMa7D (i.MX7Dual)
    + Kobo Aura (i.MX50)
    + Menlosystems M53 (i.MX53)
    - Nvidia:
    + Jetson Nano (Tegra T210)"

    * tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (593 commits)
    arm64: dts: bitmain: Add UART pinctrl support for Sophon Edge
    arm64: dts: bitmain: Add pinctrl support for BM1880 SoC
    arm64: dts: bitmain: Add GPIO Line names for Sophon Edge board
    arm64: dts: bitmain: Add GPIO support for BM1880 SoC
    ARM: dts: gemini: Indent DIR-685 partition table
    dt-bindings: hwmon (pwm-fan) Remove dead "cooling-*-state" properties
    ARM: dts: qcom-apq8064: Set 'cxo_board' as ref clock of the DSI PHY
    arm64: dts: msm8998: thermal: Restrict thermal zone name length to under 20
    arm64: dts: msm8998: thermal: Fix number of supported sensors
    arm64: dts: msm8998-mtp: thermal: Remove skin and battery thermal zones
    arm64: dts: exynos: Move fixed-clocks out of soc
    arm64: dts: exynos: Move pmu and timer nodes out of soc
    ARM: dts: s5pv210: Fix camera clock provider on Goni board
    ARM: dts: exynos: Properly override node to use MDMA0 on Universal C210
    ARM: dts: exynos: Move fixed-clocks out of soc on Exynos3250
    ARM: dts: exynos: Remove unneeded address/size cells from fixed-clock on Exynos3250
    ARM: dts: exynos: Move pmu and timer nodes out of soc
    arm64: dts: rockchip: fix IO domain voltage setting of APIO5 on rockpro64
    arm64: dts: db820c: Add sound card support
    arm64: dts: apq8096-db820c: Add HDMI display support
    ...

    Linus Torvalds
     
  • Pull ARM SoC platform updates from Olof Johansson:
    "SoC updates, mostly refactorings and cleanups of old legacy platforms.

    Major themes this release:

    - Conversion of ixp4xx to a modern platform (drivers, DT, bindings)

    - Moving some of the ep93xx headers around to get it closer to
    multiplatform enabled.

    - Cleanups of Davinci

    This also contains a few patches that were queued up as fixes before
    5.1 but that I didn't get sent in before release"

    * tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (123 commits)
    ARM: debug-ll: add default address for digicolor
    ARM: u300: regulator: add MODULE_LICENSE()
    ARM: ep93xx: move private headers out of mach/*
    ARM: ep93xx: move pinctrl interfaces into include/linux/soc
    ARM: ep93xx: keypad: stop using mach/platform.h
    ARM: ep93xx: move network platform data to separate header
    ARM: stm32: add AMBA support for stm32 family
    MAINTAINERS: update arch/arm/mach-davinci
    ARM: rockchip: add missing of_node_put in rockchip_smp_prepare_pmu
    ARM: dts: Add queue manager and NPE to the IXP4xx DTSI
    soc: ixp4xx: qmgr: Add DT probe code
    soc: ixp4xx: qmgr: Add DT bindings for IXP4xx qmgr
    soc: ixp4xx: npe: Add DT probe code
    soc: ixp4xx: Add DT bindings for IXP4xx NPE
    soc: ixp4xx: qmgr: Pass resources
    soc: ixp4xx: Remove unused functions
    soc: ixp4xx: Uninline several functions
    soc: ixp4xx: npe: Pass addresses as resources
    ARM: ixp4xx: Turn the QMGR into a platform device
    ARM: ixp4xx: Turn the NPE into a platform device
    ...

    Linus Torvalds
     
  • Allow kernel services using AF_RXRPC to indicate that a call should be
    non-interruptible. This allows kafs to make things like lock-extension and
    writeback data storage calls non-interruptible.

    If this is set, signals will be ignored for operations on that call where
    possible - such as waiting to get a call channel on an rxrpc connection.

    It doesn't prevent UDP sendmsg from being interrupted, but that will be
    handled by packet retransmission.

    rxrpc_kernel_recv_data() isn't affected by this since that never waits,
    preferring instead to return -EAGAIN and leave the waiting to the caller.

    Userspace initiated calls can't be set to be uninterruptible at this time.

    Signed-off-by: David Howells

    David Howells
     
  • Pull thermal soc updates from Eduardo Valentin:

    - thermal core has a new devm_* API for registering cooling devices. I
    took the entire series, that is why you see changes on drivers/hwmon
    in this pull (Guenter Roeck)

    - rockchip thermal driver gains support for the PX30 SoC (Elaine Zhang)

    - the generic-adc thermal driver now considers the lookup table DT
    property as optional (Jean-Francois Dagenais)

    - Refactoring of tsens thermal driver (Amit Kucheria)

    - Cleanups on cpu cooling driver (Daniel Lezcano)

    - broadcom thermal driver dropped support for ACPI (Srinath Mannam)

    - tegra thermal driver gains support for OC hw throttle and GPU
    throttle (Wei Ni)

    - Fixes in several thermal drivers.

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: (59 commits)
    hwmon: (pwm-fan) Use devm_thermal_of_cooling_device_register
    hwmon: (npcm750-pwm-fan) Use devm_thermal_of_cooling_device_register
    hwmon: (mlxreg-fan) Use devm_thermal_of_cooling_device_register
    hwmon: (gpio-fan) Use devm_thermal_of_cooling_device_register
    hwmon: (aspeed-pwm-tacho) Use devm_thermal_of_cooling_device_register
    thermal: rcar_gen3_thermal: Fix to show correct trip points number
    thermal: rcar_thermal: update calculation formula for R-Car Gen3 SoCs
    thermal: cpu_cooling: Actually trace CPU load in thermal_power_cpu_get_power
    thermal: rockchip: Support the PX30 SoC in thermal driver
    dt-bindings: rockchip-thermal: Support the PX30 SoC compatible
    thermal: rockchip: fix up the tsadc pinctrl setting error
    thermal: broadcom: Remove ACPI support
    thermal: Fix build error of missing devm_ioremap_resource on UM
    thermal/drivers/cpu_cooling: Remove pointless field
    thermal/drivers/cpu_cooling: Add Software Package Data Exchange (SPDX)
    thermal/drivers/cpu_cooling: Fixup the header and copyright
    thermal/drivers/cpu_cooling: Remove pointless test in power2state()
    thermal: rcar_gen3_thermal: disable interrupt in .remove
    thermal: rcar_gen3_thermal: fix interrupt type
    thermal: Introduce devm_thermal_of_cooling_device_register
    ...

    Linus Torvalds
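
    The devm_* registration mentioned above follows the kernel's
    device-managed resource pattern: the release step is recorded
    against the device when the resource is registered, and runs
    automatically at device teardown. The toy model below sketches
    that pattern in plain userspace C. It is not the thermal or devres
    API; all toy_* names and the fixed MAX_ACTIONS capacity are
    hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8  /* hypothetical fixed capacity for this sketch */

typedef void (*toy_release_fn)(void *data);

/* A device that remembers how to undo everything registered against it. */
struct toy_device {
	toy_release_fn fn[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t nr;
};

/* Record a release action; returns 0 on success, -1 if the table is full. */
static int toy_devm_add_action(struct toy_device *dev, toy_release_fn fn, void *data)
{
	assert(fn != NULL);
	if (dev->nr == MAX_ACTIONS)
		return -1;
	dev->fn[dev->nr] = fn;
	dev->data[dev->nr] = data;
	dev->nr++;
	return 0;
}

/* Device teardown: run the recorded actions in reverse registration order. */
static void toy_device_release_all(struct toy_device *dev)
{
	while (dev->nr > 0) {
		dev->nr--;
		dev->fn[dev->nr](dev->data[dev->nr]);
	}
}

/* Hypothetical stand-in for a cooling device. */
struct toy_cooling_device {
	int registered;
};

static void toy_cooling_unregister(void *data)
{
	struct toy_cooling_device *cdev = data;

	cdev->registered = 0;
}

/* Managed registration: the caller never has to unregister explicitly. */
static int toy_devm_cooling_register(struct toy_device *dev,
				     struct toy_cooling_device *cdev)
{
	cdev->registered = 1;
	if (toy_devm_add_action(dev, toy_cooling_unregister, cdev) != 0) {
		cdev->registered = 0;  /* undo on failure */
		return -1;
	}
	return 0;
}
```

    With this shape a driver's error paths and remove callback no longer
    need a matching unregister call, which is the kind of cleanup the
    hwmon conversions in this pull appear to be after.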
     
  • Provide an interface to set max lifespan on a call from inside of the
    kernel without having to call kernel_sendmsg().

    Signed-off-by: David Howells

    David Howells
     
  • Merge in a few pending fixes from pre-5.1 that didn't get sent in:

    MAINTAINERS: update arch/arm/mach-davinci
    ARM: dts: ls1021: Fix SGMII PCS link remaining down after PHY disconnect
    ARM: dts: imx6q-logicpd: Reduce inrush current on USBH1
    ARM: dts: imx6q-logicpd: Reduce inrush current on start
    ARM: dts: imx: Fix the AR803X phy-mode
    ARM: dts: sun8i: a33: Reintroduce default pinctrl muxing
    arm64: dts: allwinner: a64: Rename hpvcc-supply to cpvdd-supply
    ARM: sunxi: fix a leaked reference by adding missing of_node_put
    ARM: sunxi: fix a leaked reference by adding missing of_node_put

    Signed-off-by: Olof Johansson

    Olof Johansson
     
  • Pull power supply and reset updates from Sebastian Reichel:
    "Core:
    - Add over-current health state
    - Add standard, adaptive and custom charge types
    - Add new properties for start/end charge threshold

    New Drivers / Hardware:
    - UCS1002 Programmable USB Port Power Controller
    - Ingenic JZ47xx Battery Fuel Gauge
    - AXP20x USB Power: Add AXP813 support
    - AT91 poweroff: Add SAM9X60 support
    - OLPC battery: Add XO-1.5 and XO-1.75 support

    Misc Changes:
    - syscon-reboot: support mask property
    - AXP288 fuel gauge: Blacklist ACEPC T8/T11. Looks like some vendor
    thought it's a good idea to build a desktop system with a fuel
    gauge, that slowly "discharges"...
    - cpcap-battery: Fix calculation errors
    - misc fixes"

    * tag 'for-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply: (54 commits)
    power: supply: olpc_battery: force the le/be casts
    power: supply: ucs1002: Fix build error without CONFIG_REGULATOR
    power: supply: ucs1002: Fix wrong return value checking
    power: supply: Add driver for Microchip UCS1002
    dt-bindings: power: supply: Add bindings for Microchip UCS1002
    power: supply: core: Add POWER_SUPPLY_HEALTH_OVERCURRENT constant
    power: supply: core: fix clang -Wunsequenced
    power: supply: core: Add missing documentation for CHARGE_CONTROL_* properties
    power: supply: core: Add CHARGE_CONTROL_{START_THRESHOLD,END_THRESHOLD} properties
    power: supply: core: Add Standard, Adaptive, and Custom charge types
    power: supply: axp288_fuel_gauge: Add ACEPC T8 and T11 mini PCs to the blacklist
    power: supply: bq27xxx_battery: Notify also about status changes
    power: supply: olpc_battery: Have the framework register sysfs files for us
    power: supply: olpc_battery: Add OLPC XO 1.75 support
    power: supply: olpc_battery: Avoid using platform_info
    power: supply: olpc_battery: Use devm_power_supply_register()
    power: supply: olpc_battery: Move priv data to a struct
    power: supply: olpc_battery: Use DT to get battery version
    x86/platform/olpc: Use a correct version when making up a battery node
    x86/platform/olpc: Trivial code move in DT fixup
    ...

    Linus Torvalds
     
  • Pull nfsd updates from Bruce Fields:
    "This consists mostly of nfsd container work:

    Scott Mayhew revived an old api that communicates with a userspace
    daemon to manage some on-disk state that's used to track clients
    across server reboots. We've been using a usermode_helper upcall for
    that, but it's tough to run those with the right namespaces, so a
    daemon is much friendlier to container use cases.

    Trond fixed nfsd's handling of user credentials in user namespaces. He
    also contributed patches that allow containers to support different
    sets of NFS protocol versions.

    The only remaining container bug I'm aware of is that the NFS reply
    cache is shared between all containers. If anyone's aware of other
    gaps in our container support, let me know.

    The rest of this is miscellaneous bugfixes"

    * tag 'nfsd-5.2' of git://linux-nfs.org/~bfields/linux: (23 commits)
    nfsd: update callback done processing
    locks: move checks from locks_free_lock() to locks_release_private()
    nfsd: fh_drop_write in nfsd_unlink
    nfsd: allow fh_want_write to be called twice
    nfsd: knfsd must use the container user namespace
    SUNRPC: rsi_parse() should use the current user namespace
    SUNRPC: Fix the server AUTH_UNIX userspace mappings
    lockd: Pass the user cred from knfsd when starting the lockd server
    SUNRPC: Temporary sockets should inherit the cred from their parent
    SUNRPC: Cache the process user cred in the RPC server listener
    nfsd: Allow containers to set supported nfs versions
    nfsd: Add custom rpcbind callbacks for knfsd
    SUNRPC: Allow further customisation of RPC program registration
    SUNRPC: Clean up generic dispatcher code
    SUNRPC: Add a callback to initialise server requests
    SUNRPC/nfs: Fix return value for nfs4_callback_compound()
    nfsd: handle legacy client tracking records sent by nfsdcld
    nfsd: re-order client tracking method selection
    nfsd: keep a tally of RECLAIM_COMPLETE operations when using nfsdcld
    nfsd: un-deprecate nfsdcld
    ...

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "The major changes in this tracing update include:

    - Removal of non-DYNAMIC_FTRACE from 32bit x86

    - Removal of mcount support from x86

    - Emulating a call from int3 on x86_64, fixes live kernel patching

    - Consolidated Tracing Error logs file

    Minor updates:

    - Removal of klp_check_compiler_support()

    - kdb ftrace dumping output changes

    - Accessing and creating ftrace instances from inside the kernel

    - Clean up of #define if macro

    - Introduction of TRACE_EVENT_NOP() to disable trace events based on
    config options

    And other minor fixes and clean ups"

    * tag 'trace-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
    x86: Hide the int3_emulate_call/jmp functions from UML
    livepatch: Remove klp_check_compiler_support()
    ftrace/x86: Remove mcount support
    ftrace/x86_32: Remove support for non DYNAMIC_FTRACE
    tracing: Simplify "if" macro code
    tracing: Fix documentation about disabling options using trace_options
    tracing: Replace kzalloc with kcalloc
    tracing: Fix partial reading of trace event's id file
    tracing: Allow RCU to run between postponed startup tests
    tracing: Fix white space issues in parse_pred() function
    tracing: Eliminate const char[] auto variables
    ring-buffer: Fix mispelling of Calculate
    tracing: probeevent: Fix to make the type of $comm string
    tracing: probeevent: Do not accumulate on ret variable
    tracing: uprobes: Re-enable $comm support for uprobe events
    ftrace/x86_64: Emulate call function while updating in breakpoint handler
    x86_64: Allow breakpoints to emulate call instructions
    x86_64: Add gap to int3 to allow for call emulation
    tracing: kdb: Allow ftdump to skip all but the last few entries
    tracing: Add trace_total_entries() / trace_total_entries_cpu()
    ...

    Linus Torvalds
     
  • KVM/arm updates for 5.2

    - guest SVE support
    - guest Pointer Authentication support
    - Better discrimination of perf counters between host and guests

    Conflicts:
    include/uapi/linux/kvm.h

    Paolo Bonzini
     
  • …paulus/powerpc into HEAD

    PPC KVM update for 5.2

    * Support for guests to access the new POWER9 XIVE interrupt controller
    hardware directly, reducing interrupt latency and overhead for guests.

    * In-kernel implementation of the H_PAGE_INIT hypercall.

    * Reduce memory usage of sparsely-populated IOMMU tables.

    * Several bug fixes.

    Second PPC KVM update for 5.2

    * Fix a bug, fix a spelling mistake, remove some useless code.

    Paolo Bonzini
     
  • Now that we've gotten rid of clk_readl() we can remove io.h from the
    clk-provider header and push out the io.h include to any code that isn't
    already including the io.h header but using things like readl/writel,
    etc.

    Found with this grep:

    git grep -l clk-provider.h | grep '.c$' | xargs git grep -L 'linux/io.h' | \
    xargs git grep -l \
    -e '\' --or \
    ...
    -e '\'

    I also reordered a couple includes when they weren't alphabetical and
    removed clk.h from kona, replacing it with clk-provider.h because
    that driver doesn't use clk consumer APIs.

    Acked-by: Geert Uytterhoeven
    Cc: Chen-Yu Tsai
    Acked-by: Maxime Ripard
    Acked-by: Tero Kristo
    Acked-by: Sekhar Nori
    Cc: Krzysztof Kozlowski
    Acked-by: Mark Brown
    Cc: Chris Zankel
    Acked-by: Max Filippov
    Acked-by: John Crispin
    Acked-by: Heiko Stuebner
    Signed-off-by: Stephen Boyd

    Stephen Boyd
     
  • Add wait_var_event_interruptible() to allow interruptible waits for events.

    Signed-off-by: David Howells
    Acked-by: Peter Zijlstra (Intel)

    David Howells
     
  • Allow used DNS resolver keys to be invalidated after use if the caller is
    doing its own caching of the results. This reduces the amount of resources
    required.

    Fix AFS to invalidate DNS results to kill off permanent failure records
    that get lodged in the resolver keyring and prevent future lookups from
    happening.
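
    The failure mode described above can be illustrated with a small user-space model. This is only a sketch: `ResolverKeyring`, its `query()` method, the `invalidate` argument and the host name in the usage note are hypothetical names chosen for illustration, not the kernel's keyring or DNS resolver API.

```python
class ResolverKeyring:
    """Toy stand-in for a keyring that caches DNS lookups, failures included."""

    def __init__(self, backend):
        self._backend = backend  # callable: name -> address, raises LookupError
        self._keys = {}          # name -> ("ok", address) or ("fail", None)

    def query(self, name, invalidate=False):
        # Only go to the backend when no key is cached for this name.
        if name not in self._keys:
            try:
                self._keys[name] = ("ok", self._backend(name))
            except LookupError:
                # A failure is cached too, which is what makes it "stick".
                self._keys[name] = ("fail", None)
        status, address = self._keys[name]
        if invalidate:
            # The caller does its own caching, so drop the key after use.
            del self._keys[name]
        if status == "fail":
            raise LookupError(name)
        return address

    def cached_names(self):
        """Names that currently hold a key (successful or failed)."""
        return sorted(self._keys)
```

    In this model, a lookup that fails once keeps failing for as long as the key stays cached, even after the backend recovers. Passing `invalidate=True` leaves nothing behind, so the next query reaches the backend again and the cache holds no keys between calls.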

    Fixes: 0a5143f2f89c ("afs: Implement VL server rotation")
    Signed-off-by: David Howells

    David Howells
     

15 May, 2019

10 commits

  • Pull more ACPI updates from Rafael Wysocki:
    "These fix two regressions introduced during the 5.0 cycle, in ACPICA
    and in device PM, cause the values returned by _ADR to be stored in 64
    bits and fix two ACPI documentation issues.

    Specifics:

    - Update the ACPICA code in the kernel to upstream revision 20190509
    including one regression fix:
    * Prevent excessive ACPI debug messages from being printed by
    moving the ACPI_DEBUG_DEFAULT definition to the right place
    (Erik Schmauss).

    - Set the enable_for_wake bits for wakeup GPEs during suspend to idle
    to allow acpi_enable_all_wakeup_gpes() to enable them as
    appropriate and make wakeup devices signaling events through ACPI
    GPEs work with suspend-to-idle again (Rajat Jain).

    - Use 64 bits to store the return values of _ADR which are assumed to
    be 64-bit by some bus specs and may contain nonzero bits in the
    upper 32 bits part for some devices (Pierre-Louis Bossart).

    - Fix two minor issues with the ACPI documentation (Sakari Ailus)"
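
    The _ADR item above comes down to integer truncation. A minimal arithmetic sketch, assuming a made-up address value (the helper names and the sample values are hypothetical, not taken from ACPICA or any bus specification):

```python
U32_MAX = 0xFFFF_FFFF
U64_MAX = 0xFFFF_FFFF_FFFF_FFFF


def store_u32(value: int) -> int:
    """Keep only the low 32 bits, as a 32-bit field would."""
    return value & U32_MAX


def store_u64(value: int) -> int:
    """Keep the low 64 bits, as a 64-bit field would."""
    return value & U64_MAX


def upper_bits_lost(value: int) -> bool:
    """True when 32-bit storage would change the value."""
    return store_u32(value) != store_u64(value)
```

    For a value such as `0x0000_0010_0000_0001`, `store_u32()` returns `0x1` while `store_u64()` returns the value unchanged, so `upper_bits_lost()` is true. A value that fits in 32 bits is unaffected either way.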

    * tag 'acpi-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    ACPI: PM: Set enable_for_wake for wakeup GPEs during suspend-to-idle
    Documentation: ACPI: Direct references are allowed to devices only
    Documentation: ACPI: Use tabs for graph ASL indentation
    ACPICA: Update version to 20190509
    ACPICA: Linux: move ACPI_DEBUG_DEFAULT flag out of ifndef
    ACPI: bus: change _ADR representation to 64 bits

    Linus Torvalds
     
  • Pull more power management updates from Rafael Wysocki:
    "These fix a recent regression causing kernels built with CONFIG_PM
    unset to crash on systems that support the Performance and Energy Bias
    Hint (EPB), clean up the cpufreq core and some users of transition
    notifiers and introduce a new power domain flag into the generic power
    domains framework (genpd).

    Specifics:

    - Fix a recent regression causing kernels built with CONFIG_PM unset
    to crash on systems that support the Performance and Energy Bias
    Hint (EPB) by not compiling the EPB-related code that depends on
    CONFIG_PM when it is unset (Rafael Wysocki).

    - Clean up the transition notifier invocation code in the cpufreq
    core and change some users of cpufreq transition notifiers
    accordingly (Viresh Kumar).

    - Change MAINTAINERS to cover the schedutil governor as part of
    cpufreq (Viresh Kumar).

    - Simplify cpufreq_init_policy() to avoid redundant computations (Yue
    Hu).

    - Add explanatory comment to the cpufreq core (Rafael Wysocki).

    - Introduce a new flag, GENPD_FLAG_RPM_ALWAYS_ON, to the generic
    power domains (genpd) framework along with the first user of it
    (Leonard Crestez)"

    * tag 'pm-5.2-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    soc: imx: gpc: Use GENPD_FLAG_RPM_ALWAYS_ON for ERR009619
    PM / Domains: Add GENPD_FLAG_RPM_ALWAYS_ON flag
    cpufreq: Update MAINTAINERS to include schedutil governor
    cpufreq: Don't find governor for setpolicy drivers in cpufreq_init_policy()
    cpufreq: Explain the kobject_put() in cpufreq_policy_alloc()
    cpufreq: Call transition notifier only once for each policy
    x86: intel_epb: Take CONFIG_PM into account

    Linus Torvalds
     
  • * pm-cpufreq:
    cpufreq: Update MAINTAINERS to include schedutil governor
    cpufreq: Don't find governor for setpolicy drivers in cpufreq_init_policy()
    cpufreq: Explain the kobject_put() in cpufreq_policy_alloc()
    cpufreq: Call transition notifier only once for each policy

    * pm-domains:
    soc: imx: gpc: Use GENPD_FLAG_RPM_ALWAYS_ON for ERR009619
    PM / Domains: Add GENPD_FLAG_RPM_ALWAYS_ON flag

    Rafael J. Wysocki
     
  • * acpi-bus:
    ACPI: bus: change _ADR representation to 64 bits

    * acpi-doc:
    Documentation: ACPI: Direct references are allowed to devices only
    Documentation: ACPI: Use tabs for graph ASL indentation

    * acpi-pm:
    ACPI: PM: Set enable_for_wake for wakeup GPEs during suspend-to-idle

    Rafael J. Wysocki
     
  • Pull more rdma updates from Jason Gunthorpe:
    "This is being sent to get a fix for the gcc 9.1 build warnings, and
    I've also pulled in some bug fix patches that were posted in the last
    two weeks.

    - Avoid the gcc 9.1 warning about overflowing a union member

    - Fix the wrong callback type for a single response netlink to doit

    - Bug fixes from more usage of the mlx5 devx interface"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
    net/mlx5: Set completion EQs as shared resources
    IB/mlx5: Verify DEVX general object type correctly
    RDMA/core: Change system parameters callback from dumpit to doit
    RDMA: Directly cast the sockaddr union to sockaddr

    Linus Torvalds
     
  • Merge more updates from Andrew Morton:

    - a couple of hotfixes

    - almost all of the rest of MM

    - lib/ updates

    - binfmt_elf updates

    - autofs updates

    - quite a lot of misc fixes and updates
    - reiserfs, fatfs
    - signals
    - exec
    - cpumask
    - rapidio
    - sysctl
    - pids
    - eventfd
    - gcov
    - panic
    - pps

    - gdb script updates

    - ipc updates

    * emailed patches from Andrew Morton: (126 commits)
    mm: memcontrol: fix NUMA round-robin reclaim at intermediate level
    mm: memcontrol: fix recursive statistics correctness & scalabilty
    mm: memcontrol: move stat/event counting functions out-of-line
    mm: memcontrol: make cgroup stats and events query API explicitly local
    drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctl
    drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl
    mm, memcg: rename ambiguously named memory.stat counters and functions
    arch: remove and
    treewide: replace #include with #include
    fs/block_dev.c: Remove duplicate header
    fs/cachefiles/namei.c: remove duplicate header
    include/linux/sched/signal.h: replace `tsk' with `task'
    fs/coda/psdev.c: remove duplicate header
    ipc: do cyclic id allocation for the ipc object.
    ipc: conserve sequence numbers in ipcmni_extend mode
    ipc: allow boot time extension of IPCMNI from 32k to 16M
    ipc/mqueue: optimize msg_get()
    ipc/mqueue: remove redundant wq task assignment
    ipc: prevent lockup on alloc_msg and free_msg
    scripts/gdb: print cached rate in lx-clk-summary
    ...

    Linus Torvalds
     
  • Right now, when somebody needs to know the recursive memory statistics
    and events of a cgroup subtree, they need to walk the entire subtree and
    sum up the counters manually.

    There are two issues with this:

    1. When a cgroup gets deleted, its stats are lost. The state counters
    should all be 0 at that point, of course, but the events are not.
    When this happens, the event counters, which are supposed to be
    monotonic, can go backwards in the parent cgroups.

    2. During regular operation, we always have a certain number of lazily
    freed cgroups sitting around that have been deleted, have no tasks,
    but have a few cache pages remaining. These groups' statistics do not
    change until we eventually hit memory pressure, but somebody
    watching, say, memory.stat on an ancestor has to iterate those every
    time.

    This patch addresses both issues by introducing recursive counters at
    each level that are propagated from the write side when stats change.

    Upward propagation happens when the per-cpu caches spill over into the
    local atomic counter. This is the same thing we do during charge and
    uncharge, except that the latter uses atomic RMWs, which are more
    expensive; stat changes happen at around the same rate. In a sparse
    file test (page faults and reclaim at maximum CPU speed) with 5 cgroup
    nesting levels, perf shows __mod_memcg_state() at ~1%.
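
    The batching described above can be modelled in a few lines of user-space Python. This is a sketch under stated assumptions, not the kernel code: the `Group` class, `mod_state()`, `read_recursive()` and the `BATCH` threshold of 32 are all hypothetical stand-ins, and a plain list plays the role of the per-cpu caches.

```python
BATCH = 32  # assumed spill threshold; stands in for the per-cpu batch size


class Group:
    """Toy cgroup: per-CPU caches plus a local and a recursive total."""

    def __init__(self, parent=None, ncpus=4):
        self.parent = parent
        self.percpu = [0] * ncpus  # deltas not yet spilled
        self.local = 0             # this group's own count
        self.recursive = 0         # this group plus all descendants


def mod_state(group, cpu, delta):
    """Write side: accumulate in the CPU-local cache, spill when it grows large."""
    pending = group.percpu[cpu] + delta
    if abs(pending) > BATCH:
        # Spill: fold the cached delta into the local total, then into the
        # recursive total of this group and of every ancestor.
        group.local += pending
        node = group
        while node is not None:
            node.recursive += pending
            node = node.parent
        pending = 0
    group.percpu[cpu] = pending


def read_recursive(group):
    """Read side: O(1), no subtree walk; may lag by the unspilled deltas."""
    return group.recursive
```

    The design trade-off in this model is that readers never walk the subtree, while writers pay for the ancestor walk only once per batch rather than on every update. The value a reader sees can trail the true count by whatever is still sitting in the per-CPU caches.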

    Link: http://lkml.kernel.org/r/20190412151507.2769-4-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Reviewed-by: Shakeel Butt
    Reviewed-by: Roman Gushchin
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • These are getting too big to be inlined in every callsite. They were
    stolen from vmstat.c, which already out-of-lines them, and they have
    only been growing since. The callsites aren't that hot, either.

    Move __mod_memcg_state(),
    __mod_lruvec_state() and
    __count_memcg_events() out of line and add kerneldoc comments.

    Link: http://lkml.kernel.org/r/20190412151507.2769-3-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Reviewed-by: Shakeel Butt
    Reviewed-by: Roman Gushchin
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Patch series "mm: memcontrol: memory.stat cost & correctness".

    The cgroup memory.stat file holds recursive statistics for the entire
    subtree. The current implementation does this tree walk on-demand
    whenever the file is read. This is giving us problems in production.

    1. The cost of aggregating the statistics on-demand is high. A lot of
    system service cgroups are mostly idle and their stats don't change
    between reads, yet we always have to check them. There are also always
    some lazily-dying cgroups sitting around that are pinned by a handful
    of remaining page cache; the same applies to them.

    In an application that periodically monitors memory.stat in our
    fleet, we have seen the aggregation consume up to 5% CPU time.

    2. When cgroups die and disappear from the cgroup tree, so do their
    accumulated vm events. The result is that the event counters at
    higher-level cgroups can go backwards and confuse some of our
    automation, let alone people looking at the graphs over time.

    To address both issues, this patch series changes the stat
    implementation to spill counts upwards when the counters change.

    The upward spilling is batched using the existing per-cpu cache. In a
    sparse file stress test with 5 level cgroup nesting, the additional cost
    of the flushing was negligible (a little under 1% of CPU at 100% CPU
    utilization, compared to the 5% of reading memory.stat during regular
    operation).

    This patch (of 4):

    memcg_page_state(), lruvec_page_state(), memcg_sum_events() are
    currently returning the state of the local memcg or lruvec, not the
    recursive state.

    In practice there is a demand for both versions, although the callers
    that want the recursive counts currently sum them up by hand.

    Per default, cgroups are considered recursive entities and generally we
    expect more users of the recursive counters, with the local counts being
    special cases. To reflect that in the name, add a _local suffix to the
    current implementations.

    The following patch will re-incarnate these functions with recursive
    semantics, but with an O(1) implementation.
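
    The difference between local counts, counts summed by hand, and recursive counts kept on the write side can be sketched with a toy tree. Everything here is hypothetical and for illustration only: `Memcg`, `count_event()`, `sum_events_by_walk()` and `remove()` are made-up names, and no batching is modelled.

```python
class Memcg:
    """Toy cgroup node with a local and a recursive event counter."""

    def __init__(self, parent=None):
        self.parent = parent
        self.children = []
        self.events_local = 0      # events raised in this group only
        self.events_recursive = 0  # maintained on the write side
        if parent is not None:
            parent.children.append(self)


def count_event(memcg, n=1):
    """Record n events locally and in every ancestor's recursive counter."""
    memcg.events_local += n
    node = memcg
    while node is not None:
        node.events_recursive += n
        node = node.parent


def sum_events_by_walk(memcg):
    """The by-hand approach: O(subtree size), and it only sees live groups."""
    return memcg.events_local + sum(sum_events_by_walk(c) for c in memcg.children)


def remove(memcg):
    """Delete a group from the tree; its local counters go with it."""
    memcg.parent.children.remove(memcg)
```

    In this model, removing a child makes the hand-summed total at the parent drop, which is the "event counters go backwards" problem. The recursive counter at the parent is unaffected because the child's events were already added to it when they happened.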

    [hannes@cmpxchg.org: fix bisection hole]
    Link: http://lkml.kernel.org/r/20190417160347.GC23013@cmpxchg.org
    Link: http://lkml.kernel.org/r/20190412151507.2769-2-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Reviewed-by: Shakeel Butt
    Reviewed-by: Roman Gushchin
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • I spent literally an hour trying to work out why an earlier version of
    my memory.events aggregation code doesn't work properly, only to find
    out I was calling memcg->events instead of memcg->memory_events, which
    is fairly confusing.

    This naming seems in need of reworking, so make it harder to do the
    wrong thing by using vmevents instead of events, which makes it more
    clear that these are vm counters rather than memcg-specific counters.

    There are also a few other inconsistent names in both the percpu and
    aggregated structs, so these are all cleaned up to be more coherent and
    easy to understand.

    This commit contains code cleanup only: there are no logic changes.

    [akpm@linux-foundation.org: fix it for preceding changes]
    Link: http://lkml.kernel.org/r/20190208224319.GA23801@chrisdown.name
    Signed-off-by: Chris Down
    Acked-by: Johannes Weiner
    Cc: Michal Hocko
    Cc: Tejun Heo
    Cc: Roman Gushchin
    Cc: Dennis Zhou
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Down