03 Jun, 2020

40 commits

  • Pull btrfs updates from David Sterba:
    "Highlights:

    - speedup dead root detection during orphan cleanup, eg. when there
    are many deleted subvolumes waiting to be cleaned, the trees are
    now looked up in radix tree instead of a O(N^2) search

    - snapshot creation with inherited qgroup will mark the qgroup
    inconsistent, requires a rescan

    - send will emit file capabilities after chown, this produces a
    stream that does not need postprocessing to set the capabilities
    again

    - direct io ported to iomap infrastructure, cleaned up and simplified
    code, notably removing last use of struct buffer_head in btrfs code

    Core changes:

    - factor out backreference iteration, to be used by ordinary
    backreferences and relocation code

    - improved global block reserve utilization
    * better logic to serialize requests
    * increased maximum available for unlink
    * improved handling on large pages (64K)

    - direct io cleanups and fixes
    * simplify layering, where cloned bios were unnecessarily created
    for some cases
    * error handling fixes (submit, endio)
    * remove repair worker thread, used to avoid deadlocks during
    repair

    - refactored block group reading code, preparatory work for new type
    of block group storage that should improve mount time on large
    filesystems

    Cleanups:

    - cleaned up (and slightly sped up) set/get helpers for metadata data
    structure members

    - root bit REF_COWS got renamed to SHAREABLE to reflect that the
    blocks of the tree get shared either among subvolumes or with the
    relocation trees

    Fixes:

    - when subvolume deletion fails due to ENOSPC, the filesystem is not
    turned read-only

    - device scan deals with devices from other filesystems that changed
    ownership due to overwrite (mkfs)

    - fix a race between scrub and block group removal/allocation

    - fix long standing bug of a runaway balance operation, printing the
    same line to the syslog, caused by a stale status bit on a reloc
    tree that prevented progress

    - fix corrupt log due to concurrent fsync of inodes with shared
    extents

    - fix space underflow for NODATACOW and buffered writes when it for
    some reason needs to fall back to COW mode"

    * tag 'for-5.8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (133 commits)
    btrfs: fix space_info bytes_may_use underflow during space cache writeout
    btrfs: fix space_info bytes_may_use underflow after nocow buffered write
    btrfs: fix wrong file range cleanup after an error filling dealloc range
    btrfs: remove redundant local variable in read_block_for_search
    btrfs: open code key_search
    btrfs: split btrfs_direct_IO to read and write part
    btrfs: remove BTRFS_INODE_READDIO_NEED_LOCK
    fs: remove dio_end_io()
    btrfs: switch to iomap_dio_rw() for dio
    iomap: remove lockdep_assert_held()
    iomap: add a filesystem hook for direct I/O bio submission
    fs: export generic_file_buffered_read()
    btrfs: turn space cache writeout failure messages into debug messages
    btrfs: include error on messages about failure to write space/inode caches
    btrfs: remove useless 'fail_unlock' label from btrfs_csum_file_blocks()
    btrfs: do not ignore error from btrfs_next_leaf() when inserting checksums
    btrfs: make checksum item extension more efficient
    btrfs: fix corrupt log due to concurrent fsync of inodes with shared extents
    btrfs: unexport btrfs_compress_set_level()
    btrfs: simplify iget helpers
    ...

    Linus Torvalds
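
    The dead root speedup in the highlights above comes down to replacing
    a repeated list scan with a keyed lookup. A minimal Python sketch of
    that idea, not btrfs code: the function names are hypothetical, and a
    plain set stands in for the kernel radix tree.

```python
from typing import List, Set


def dead_roots_by_scan(orphan_ids: List[int], dead_list: List[int]) -> List[int]:
    """Check each orphan id by scanning the whole dead-root list.

    With N orphans and N dead roots this performs on the order of N^2
    comparisons.
    """
    return [oid for oid in orphan_ids if any(d == oid for d in dead_list)]


def dead_roots_by_index(orphan_ids: List[int], dead_index: Set[int]) -> List[int]:
    """Check each orphan id with one keyed lookup.

    The set is an assumption standing in for a radix tree keyed by root id;
    each membership test is a single lookup instead of a list scan.
    """
    return [oid for oid in orphan_ids if oid in dead_index]
```

    Both functions return the same ids; only the cost of each membership
    check differs, which matters when many deleted subvolumes are waiting
    to be cleaned.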
     
  • Pull DAX updates part two from Darrick Wong:
    "This time around, we're hoisting the DONTCACHE flag from XFS into the
    VFS so that we can make the incore DAX mode changes become effective
    sooner.

    We can't change the file data access mode on a live inode because we
    don't have a safe way to change the file ops pointers. The incore
    state change becomes effective at inode loading time, which can happen
    if the inode is evicted. Therefore, we're making it so that
    filesystems can ask the VFS to evict the inode as soon as the last
    holder drops.

    The per-fs changes to make this call will be in subsequent pull
    requests from Ted and myself.

    Summary:

    - Introduce DONTCACHE flags for dentries and inodes. This hint will
    cause the VFS to drop the associated objects immediately after the
    last put, so that we can change the file access mode (DAX or page
    cache) on the fly"

    * tag 'vfs-5.8-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
    fs: Introduce DCACHE_DONTCACHE
    fs: Lift XFS_IDONTCACHE to the VFS layer

    Linus Torvalds
     
  • Pull DAX updates part one from Darrick Wong:
    "After many years of LKML-wrangling about how to enable programs to
    query and influence the file data access mode (DAX) when a filesystem
    resides on storage devices such as persistent memory, Ira Weiny has
    emerged with a proposed set of standard behaviors that has not been
    shot down by anyone! We're more or less standardizing on the current
    XFS behavior and adapting ext4 to do the same.

    This is the first of a handful of pull requests that will make ext4 and
    XFS present a consistent interface for user programs that care about
    DAX. We add a statx attribute that programs can check to see if DAX is
    enabled on a particular file. Then, we update the DAX documentation to
    spell out the user-visible behaviors that filesystems will guarantee
    (until the next storage industry shakeup). The on-disk inode flag has
    been in XFS for a few years now.

    Summary:

    - Clean up io_is_direct.

    - Add a new statx flag to indicate when file data access is being
    done via DAX (as opposed to the page cache).

    - Update the documentation for how system administrators and
    application programmers can take advantage of the (still
    experimental) DAX feature"

    Link: https://lore.kernel.org/lkml/20200505002016.1085071-1-ira.weiny@intel.com/

    * tag 'vfs-5.8-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
    Documentation/dax: Update Usage section
    fs/stat: Define DAX statx attribute
    fs: Remove unneeded IS_DAX() check in io_is_direct()

    Linus Torvalds
     
  • Pull xfs updates from Darrick Wong:
    "Most of the changes this cycle are refactoring of existing code in
    preparation for things landing in the future.

    We also fixed various problems and deficiencies in the quota
    implementation, and (I hope) the last of the stale read vectors by
    forcing write allocations to go through the unwritten state until the
    write completes.

    Summary:

    - Various cleanups to remove dead code, unnecessary conditionals,
    asserts, etc.

    - Fix a linker warning caused by xfs stuffing '-g' into CFLAGS
    redundantly.

    - Tighten up our dmesg logging to ensure that everything is prefixed
    with 'XFS' for easier grepping.

    - Kill a bunch of typedefs.

    - Refactor the deferred ops code to reduce indirect function calls.

    - Increase type-safety with the deferred ops code.

    - Make the DAX mount options a tri-state.

    - Fix some error handling problems in the inode flush code and clean
    up other inode flush warts.

    - Refactor log recovery so that each log item's recovery functions now
    live with the other log item processing code.

    - Fix some SPDX forms.

    - Fix quota counter corruption if the fs crashes after running
    quotacheck but before any dquots get logged.

    - Don't fail metadata verification on zero-entry attr leaf blocks,
    since they're just part of the disk format now due to a historic
    lack of log atomicity.

    - Don't allow SWAPEXT between files with different [ugp]id when
    quotas are enabled.

    - Refactor inode fork reading and verification to run directly from
    the inode-from-disk function. This means that we now actually
    guarantee that _iget'ted inodes are totally verified and ready to
    go.

    - Move the incore inode fork format and extent counts to the ifork
    structure.

    - Scalability improvements by reducing cacheline pingponging in
    struct xfs_mount.

    - More scalability improvements by removing m_active_trans from the
    hot path.

    - Fix inode counter update sanity checking to run /only/ on debug
    kernels.

    - Fix longstanding inconsistency in what error code we return when a
    program hits project quota limits (ENOSPC).

    - Fix group quota returning the wrong error code when a program hits
    group quota limits.

    - Fix per-type quota limits and grace periods for group and project
    quotas so that they actually work.

    - Allow extension of individual grace periods.

    - Refactor the non-reclaim inode radix tree walking code to remove a
    bunch of stupid little functions and straighten out the
    inconsistent naming schemes.

    - Fix a bug in speculative preallocation where we measured a new
    allocation based on the last extent mapping in the file instead of
    looking farther for the last contiguous space allocation.

    - Force delalloc writes to unwritten extents. This closes a stale
    disk contents exposure vector if the system goes down before the
    write completes.

    - More lockdep whackamole"

    * tag 'xfs-5.8-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (129 commits)
    xfs: more lockdep whackamole with kmem_alloc*
    xfs: force writes to delalloc regions to unwritten
    xfs: refactor xfs_iomap_prealloc_size
    xfs: measure all contiguous previous extents for prealloc size
    xfs: don't fail unwritten extent conversion on writeback due to edquot
    xfs: rearrange xfs_inode_walk_ag parameters
    xfs: straighten out all the naming around incore inode tree walks
    xfs: move xfs_inode_ag_iterator to be closer to the perag walking code
    xfs: use bool for done in xfs_inode_ag_walk
    xfs: fix inode ag walk predicate function return values
    xfs: refactor eofb matching into a single helper
    xfs: remove __xfs_icache_free_eofblocks
    xfs: remove flags argument from xfs_inode_ag_walk
    xfs: remove xfs_inode_ag_iterator_flags
    xfs: remove unused xfs_inode_ag_iterator function
    xfs: replace open-coded XFS_ICI_NO_TAG
    xfs: move eofblocks conversion function to xfs_ioctl.c
    xfs: allow individual quota grace period extension
    xfs: per-type quota timers and warn limits
    xfs: switch xfs_get_defquota to take explicit type
    ...

    Linus Torvalds
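
    The speculative preallocation fix in the summary above changes what
    gets measured: all trailing extents that are contiguous, not just the
    last mapping. A minimal Python sketch of that backward walk, not the
    XFS implementation: the Extent record and the function name are
    hypothetical, and "contiguous" is assumed to mean adjacent both in
    file offset and on disk.

```python
from typing import List, NamedTuple


class Extent(NamedTuple):
    file_off: int    # starting file offset, in blocks (hypothetical layout)
    disk_block: int  # starting disk block
    length: int      # length in blocks


def contiguous_tail_blocks(extents: List[Extent]) -> int:
    """Sum the lengths of the trailing run of contiguous extents.

    Walks backwards from the last extent and stops at the first gap,
    either in file offset or in disk placement.
    """
    if not extents:
        return 0
    total = extents[-1].length
    for i in range(len(extents) - 1, 0, -1):
        prev, cur = extents[i - 1], extents[i]
        logically_adjacent = prev.file_off + prev.length == cur.file_off
        physically_adjacent = prev.disk_block + prev.length == cur.disk_block
        if not (logically_adjacent and physically_adjacent):
            break
        total += prev.length
    return total
```

    Measuring only the last mapping would report just extents[-1].length,
    which understates the allocation when the tail of the file was built
    from several adjacent extents.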
     
  • Pull lockdown update from James Morris:
    "An update for the security subsystem to allow unprivileged users
    to see the status of the lockdown feature. From Jeremy Cline"

    Also an added comment to describe CAP_SETFCAP.

    * 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    capabilities: add description for CAP_SETFCAP
    lockdown: Allow unprivileged users to see lockdown status

    Linus Torvalds
     
  • Pull SELinux updates from Paul Moore:
    "The highlights:

    - A number of improvements to various SELinux internal data
    structures to help improve performance. We move the role
    transitions into a hash table. In the context structure we shift
    from hashing the context string (aka SELinux label) to the
    structure itself, when it is valid. This last change not only
    offers a speedup, but it helps us simplify the code some as well.

    - Add a new SELinux policy version which allows for a more space
    efficient way of storing the filename transitions in the binary
    policy. Given the default Fedora SELinux policy with the unconfined
    module enabled, this change drops the policy size from ~7.6MB to
    ~3.3MB. The kernel policy load time dropped as well.

    - Some fixes to the error handling code in the policy parser to
    properly return error codes when things go wrong"

    * tag 'selinux-pr-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
    selinux: netlabel: Remove unused inline function
    selinux: do not allocate hashtabs dynamically
    selinux: fix return value on error in policydb_read()
    selinux: simplify range_write()
    selinux: fix error return code in policydb_read()
    selinux: don't produce incorrect filename_trans_count
    selinux: implement new format of filename transitions
    selinux: move context hashing under sidtab
    selinux: hash context structure directly
    selinux: store role transitions in a hash table
    selinux: drop unnecessary smp_load_acquire() call
    selinux: fix warning Comparison to bool

    Linus Torvalds
     
  • Pull audit updates from Paul Moore:
    "Summary of the significant patches:

    - Record information about binds/unbinds to the audit multicast
    socket. This helps identify which processes have/had access to the
    information in the audit stream.

    - Clean up and add some additional information to the netfilter
    configuration events collected by audit.

    - Fix some of the audit error handling code so we don't leak network
    namespace references"

    * tag 'audit-pr-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
    audit: add subj creds to NETFILTER_CFG record to
    audit: Replace zero-length array with flexible-array
    audit: make symbol 'audit_nfcfgs' static
    netfilter: add audit table unregister actions
    audit: tidy and extend netfilter_cfg x_tables
    audit: log audit netlink multicast bind and unbind
    audit: fix a net reference leak in audit_list_rules_send()
    audit: fix a net reference leak in audit_send_reply()

    Linus Torvalds
     
  • Pull tomoyo update from Tetsuo Handa:
    "One patch for suppressing coccicheck's warning"

    * tag 'tomoyo-pr-20200601' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
    tomoyo: use true for bool variable

    Linus Torvalds
     
  • Document the purpose of CAP_SETFCAP. For some reason this capability
    had no description while the others did.

    Signed-off-by: Stefan Hajnoczi
    Signed-off-by: James Morris

    Stefan Hajnoczi
     
  • Pull io_uring updates from Jens Axboe:
    "A relatively quiet round, mostly just fixes and code improvements. In
    particular:

    - Make statx just use the generic statx handler, instead of open
    coding it. We don't need that anymore, as we always call it async
    safe (Bijan)

    - Enable closing of the ring itself. Also fixes O_PATH closure (me)

    - Properly name completion members (me)

    - Batch reap of dead file registrations (me)

    - Allow IORING_OP_POLL with double waitqueues (me)

    - Add tee(2) support (Pavel)

    - Remove double off read (Pavel)

    - Fix overflow cancellations (Pavel)

    - Improve CQ timeouts (Pavel)

    - Async defer drain fixes (Pavel)

    - Add support for enabling/disabling notifications on a registered
    eventfd (Stefano)

    - Remove dead state parameter (Xiaoguang)

    - Disable SQPOLL submit on dying ctx (Xiaoguang)

    - Various code cleanups"

    * tag 'for-5.8/io_uring-2020-06-01' of git://git.kernel.dk/linux-block: (29 commits)
    io_uring: fix overflowed reqs cancellation
    io_uring: off timeouts based only on completions
    io_uring: move timeouts flushing to a helper
    statx: hide interfaces no longer used by io_uring
    io_uring: call statx directly
    statx: allow system call to be invoked from io_uring
    io_uring: add io_statx structure
    io_uring: get rid of manual punting in io_close
    io_uring: separate DRAIN flushing into a cold path
    io_uring: don't re-read sqe->off in timeout_prep()
    io_uring: simplify io_timeout locking
    io_uring: fix flush req->refs underflow
    io_uring: don't submit sqes when ctx->refs is dying
    io_uring: async task poll trigger cleanup
    io_uring: add tee(2) support
    splice: export do_tee()
    io_uring: don't repeat valid flag list
    io_uring: rename io_file_put()
    io_uring: remove req->needs_fixed_files
    io_uring: cleanup io_poll_remove_one() logic
    ...

    Linus Torvalds
     
  • Pull block driver updates from Jens Axboe:
    "On top of the core changes, here are the block driver changes for this
    merge window:

    - NVMe changes:
    - NVMe over Fibre Channel protocol updates, which also reach
    over to drivers/scsi/lpfc (James Smart)
    - namespace revalidation support on the target (Anthony
    Iliopoulos)
    - gcc zero length array fix (Arnd Bergmann)
    - nvmet cleanups (Chaitanya Kulkarni)
    - misc cleanups and fixes (me, Keith Busch, Sagi Grimberg)
    - use a SRQ per completion vector (Max Gurtovoy)
    - fix handling of runtime changes to the queue count (Weiping
    Zhang)
    - t10 protection information support for nvme-rdma and
    nvmet-rdma (Israel Rukshin and Max Gurtovoy)
    - target side AEN improvements (Chaitanya Kulkarni)
    - various fixes and minor improvements all over, including the
    nvme part of the lpfc driver

    - Floppy code cleanup series (Willy, Denis)

    - Floppy contention fix (Jiri)

    - Loop CONFIGURE support (Martijn)

    - bcache fixes/improvements (Coly, Joe, Colin)

    - q->queuedata cleanups (Christoph)

    - Get rid of ioctl_by_bdev (Christoph, Stefan)

    - md/raid5 allocation fixes (Coly)

    - zero length array fixes (Gustavo)

    - swim3 task state fix (Xu)"

    * tag 'for-5.8/drivers-2020-06-01' of git://git.kernel.dk/linux-block: (166 commits)
    bcache: configure the asynchronous registertion to be experimental
    bcache: asynchronous devices registration
    bcache: fix refcount underflow in bcache_device_free()
    bcache: Convert pr_ uses to a more typical style
    bcache: remove redundant variables i and n
    lpfc: Fix return value in __lpfc_nvme_ls_abort
    lpfc: fix axchg pointer reference after free and double frees
    lpfc: Fix pointer checks and comments in LS receive refactoring
    nvme: set dma alignment to qword
    nvmet: cleanups the loop in nvmet_async_events_process
    nvmet: fix memory leak when removing namespaces and controllers concurrently
    nvmet-rdma: add metadata/T10-PI support
    nvmet: add metadata support for block devices
    nvmet: add metadata/T10-PI support
    nvme: add Metadata Capabilities enumerations
    nvmet: rename nvmet_check_data_len to nvmet_check_transfer_len
    nvmet: rename nvmet_rw_len to nvmet_rw_data_len
    nvmet: add metadata characteristics for a namespace
    nvme-rdma: add metadata/T10-PI support
    nvme-rdma: introduce nvme_rdma_sgl structure
    ...

    Linus Torvalds
     
  • Pull block updates from Jens Axboe:
    "Core block changes that have been queued up for this release:

    - Remove dead blk-throttle and blk-wbt code (Guoqing)

    - Include pid in blktrace note traces (Jan)

    - Don't spew I/O errors on wouldblock termination (me)

    - Zone append addition (Johannes, Keith, Damien)

    - IO accounting improvements (Konstantin, Christoph)

    - blk-mq hardware map update improvements (Ming)

    - Scheduler dispatch improvement (Salman)

    - Inline block encryption support (Satya)

    - Request map fixes and improvements (Weiping)

    - blk-iocost tweaks (Tejun)

    - Fix for timeout failing with error injection (Keith)

    - Queue re-run fixes (Douglas)

    - CPU hotplug improvements (Christoph)

    - Queue entry/exit improvements (Christoph)

    - Move DMA drain handling to the few drivers that use it (Christoph)

    - Partition handling cleanups (Christoph)"

    * tag 'for-5.8/block-2020-06-01' of git://git.kernel.dk/linux-block: (127 commits)
    block: mark bio_wouldblock_error() bio with BIO_QUIET
    blk-wbt: rename __wbt_update_limits to wbt_update_limits
    blk-wbt: remove wbt_update_limits
    blk-throttle: remove tg_drain_bios
    blk-throttle: remove blk_throtl_drain
    null_blk: force complete for timeout request
    blk-mq: drain I/O when all CPUs in a hctx are offline
    blk-mq: add blk_mq_all_tag_iter
    blk-mq: open code __blk_mq_alloc_request in blk_mq_alloc_request_hctx
    blk-mq: use BLK_MQ_NO_TAG in more places
    blk-mq: rename BLK_MQ_TAG_FAIL to BLK_MQ_NO_TAG
    blk-mq: move more request initialization to blk_mq_rq_ctx_init
    blk-mq: simplify the blk_mq_get_request calling convention
    blk-mq: remove the bio argument to ->prepare_request
    nvme: force complete cancelled requests
    blk-mq: blk-mq: provide forced completion method
    block: fix a warning when blkdev.h is included for !CONFIG_BLOCK builds
    block: blk-crypto-fallback: remove redundant initialization of variable err
    block: reduce part_stat_lock() scope
    block: use __this_cpu_add() instead of access by smp_processor_id()
    ...

    Linus Torvalds
     
  • Just finished bisecting mmotm, to find why a test which used to take
    four minutes now took more than an hour: the __buffer_migrate_page()
    cleanup left behind a get_page() which attach_page_private() now does.

    Fixes: cd0f37154443 ("mm/migrate.c: call detach_page_private to cleanup code")
    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Pull drm updates from Dave Airlie:
    "Highlights:

    - Core DRM had a lot of refactoring around managed drm resources to
    make drivers simpler.

    - Intel Tigerlake support is on by default

    - amdgpu now supports p2p PCI buffer sharing and encrypted GPU memory

    Details:

    core:
    - uapi: error out EBUSY when existing master
    - uapi: rework SET/DROP MASTER permission handling
    - remove drm_pci.h
    - drm_pci* are now legacy
    - introduced managed DRM resources
    - subclassing support for drm_framebuffer
    - simple encoder helper
    - edid improvements
    - vblank + writeback documentation improved
    - drm/mm - optimise tree searches
    - port drivers to use devm_drm_dev_alloc

    dma-buf:
    - add flag for p2p buffer support

    mst:
    - ACT timeout improvements
    - remove drm_dp_mst_has_audio
    - don't use 2nd TX slot - spec recommends against it

    bridge:
    - dw-hdmi various improvements
    - chrontel ch7033 support
    - fix stack issues with old gcc

    hdmi:
    - add unpack function for drm infoframe

    fbdev:
    - misc fbdev driver fixes

    i915:
    - uapi: global sseu pinning
    - uapi: OA buffer polling
    - uapi: remove generated perf code
    - uapi: per-engine default property values in sysfs
    - Tigerlake GEN12 enabled.
    - Lots of gem refactoring
    - Tigerlake enablement patches
    - move to drm_device logging
    - Icelake gamma HW readout
    - push MST link retrain to hotplug work
    - bandwidth atomic helpers
    - ICL fixes
    - RPS/GT refactoring
    - Cherryview full-ppgtt support
    - i915 locking guidelines documented
    - require linear fb stride to be 512 multiple on gen9
    - Tigerlake SAGV support

    amdgpu:
    - uapi: encrypted GPU memory handling
    - uapi: add MEM_SYNC IB flag
    - p2p dma-buf support
    - export VRAM dma-bufs
    - FRU chip access support
    - RAS/SR-IOV updates
    - Powerplay locking fixes
    - VCN DPG (powergating) enablement
    - GFX10 clockgating fixes
    - DC fixes
    - GPU reset fixes
    - navi SDMA fix
    - expose FP16 for modesetting
    - DP 1.4 compliance fixes
    - gfx10 soft recovery
    - Improved Critical Thermal Faults handling
    - resizable BAR on gmc10

    amdkfd:
    - uapi: GWS resource management
    - track GPU memory per process
    - report PCI domain in topology

    radeon:
    - safe reg list generator fixes

    nouveau:
    - HD audio fixes on recent systems
    - vGPU detection (fail probe if we're on one, for now)
    - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it)
    - SVM improvements/fixes
    - NVIDIA format modifier support
    - Misc other fixes.

    adv7511:
    - HDMI SPDIF support

    ast:
    - allocate crtc state size
    - fix double assignment
    - fix suspend

    bochs:
    - drop connector register

    cirrus:
    - move to tiny drivers.

    exynos:
    - fix imported dma-buf mapping
    - enable runtime PM
    - fixes and cleanups

    mediatek:
    - DPI pin mode swap
    - config mipi_tx current/impedance

    lima:
    - devfreq + cooling device support
    - task handling improvements
    - runtime PM support

    pl111:
    - vexpress init improvements
    - fix module auto-load

    rcar-du:
    - DT bindings conversion to YAML
    - Planes zpos sanity check and fix
    - MAINTAINERS entry for LVDS panel driver

    mcde:
    - fix return value

    mgag200:
    - use managed config init

    stm:
    - read endpoints from DT

    vboxvideo:
    - use PCI managed functions
    - drop WC mtrr

    vkms:
    - enable cursor by default

    rockchip:
    - afbc support

    virtio:
    - various cleanups

    qxl:
    - fix cursor notify port

    hisilicon:
    - 128-byte stride alignment fix

    sun4i:
    - improved format handling"

    * tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm: (1401 commits)
    drm/amd/display: Fix potential integer wraparound resulting in a hang
    drm/amd/display: drop cursor position check in atomic test
    drm/amdgpu: fix device attribute node create failed with multi gpu
    drm/nouveau: use correct conflicting framebuffer API
    drm/vblank: Fix -Wformat compile warnings on some arches
    drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode
    drm/amd/display: Handle GPU reset for DC block
    drm/amdgpu: add apu flags (v2)
    drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven
    drm/amdgpu: fix pm sysfs node handling (v2)
    drm/amdgpu: move gpu_info parsing after common early init
    drm/amdgpu: move discovery gfx config fetching
    drm/nouveau/dispnv50: fix runtime pm imbalance on error
    drm/nouveau: fix runtime pm imbalance on error
    drm/nouveau: fix runtime pm imbalance on error
    drm/nouveau/debugfs: fix runtime pm imbalance on error
    drm/nouveau/nouveau/hmm: fix migrate zero page to GPU
    drm/nouveau/nouveau/hmm: fix nouveau_dmem_chunk allocations
    drm/nouveau/kms/nv50-: Share DP SST mode_valid() handling with MST
    drm/nouveau/kms/nv50-: Move 8BPC limit for MST into nv50_mstc_get_modes()
    ...

    Linus Torvalds
     
  • Pull hmm updates from Jason Gunthorpe:
    "This series adds a selftest for hmm_range_fault() and several of the
    DEVICE_PRIVATE migration related actions, and another simplification
    for hmm_range_fault()'s API.

    - Simplify hmm_range_fault() with a simpler return code, no
    HMM_PFN_SPECIAL, and no customizable output PFN format

    - Add a selftest for hmm_range_fault() and DEVICE_PRIVATE related
    functionality"

    * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
    MAINTAINERS: add HMM selftests
    mm/hmm/test: add selftests for HMM
    mm/hmm/test: add selftest driver for HMM
    mm/hmm: remove the customizable pfn format from hmm_range_fault
    mm/hmm: remove HMM_PFN_SPECIAL
    drm/amdgpu: remove dead code after hmm_range_fault()
    mm/hmm: make hmm_range_fault return 0 or -1

    Linus Torvalds
     
  • Pull PNP update from Rafael Wysocki:
    "Replace a zero-length array with a flexible-array (Gustavo A. R.
    Silva)"

    * tag 'pnp-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PNPBIOS: Replace zero-length array with flexible-array

    Linus Torvalds
     
  • Pull ACPI updates from Rafael Wysocki:
    "These update the ACPICA code in the kernel to upstream revision
    20200430, fix several reference counting errors related to ACPI
    tables, add _Exx / _Lxx support to the GED driver, add a new
    acpi_evaluate_reg() helper, add new DPTF battery participant driver
    and extend the DPTF power participant driver, improve the handling of
    memory failures in the APEI code, add a blacklist entry to the
    backlight driver, update the PMIC driver and the processor idle
    driver, fix two kobject reference count leaks, and make a few janitorial
    changes.

    Specifics:

    - Update the ACPICA code in the kernel to upstream revision 20200430:

    - Move acpi_gbl_next_cmd_num definition (Erik Kaneda).

    - Ignore AE_ALREADY_EXISTS status in the disassembler when parsing
    create operators (Erik Kaneda).

    - Add status checks to the dispatcher (Erik Kaneda).

    - Fix required parameters for _NIG and _NIH (Erik Kaneda).

    - Make acpi_protocol_lengths static (Yue Haibing).

    - Fix ACPI table reference counting errors in several places, mostly
    in error code paths (Hanjun Guo).

    - Extend the Generic Event Device (GED) driver to support _Exx and
    _Lxx handler methods (Ard Biesheuvel).

    - Add new acpi_evaluate_reg() helper and modify the ACPI PCI hotplug
    code to use it (Hans de Goede).

    - Add new DPTF battery participant driver and make the DPTF power
    participant driver create more sysfs device attributes (Srinivas
    Pandruvada).

    - Improve the handling of memory failures in APEI (James Morse).

    - Add new blacklist entry for Acer TravelMate 5735Z to the backlight
    driver (Paul Menzel).

    - Add i2c address for thermal control to the PMIC driver (Mauro
    Carvalho Chehab).

    - Allow the ACPI processor idle driver to work on platforms with only
    one ACPI C-state present (Zhang Rui).

    - Fix kobject reference count leaks in error code paths in two places
    (Qiushi Wu).

    - Delete unused proc filename macros and make some symbols static
    (Pascal Terjan, Zheng Zengkai, Zou Wei)"

    * tag 'acpi-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
    ACPI: CPPC: Fix reference count leak in acpi_cppc_processor_probe()
    ACPI: sysfs: Fix reference count leak in acpi_sysfs_add_hotplug_profile()
    ACPI: GED: use correct trigger type field in _Exx / _Lxx handling
    ACPI: DPTF: Add battery participant driver
    ACPI: DPTF: Additional sysfs attributes for power participant driver
    ACPI: video: Use native backlight on Acer TravelMate 5735Z
    arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
    ACPI: APEI: Kick the memory_failure() queue for synchronous errors
    mm/memory-failure: Add memory_failure_queue_kick()
    ACPI / PMIC: Add i2c address for thermal control
    ACPI: GED: add support for _Exx / _Lxx handler methods
    ACPI: Delete unused proc filename macros
    ACPI: hotplug: PCI: Use the new acpi_evaluate_reg() helper
    ACPI: utils: Add acpi_evaluate_reg() helper
    ACPI: debug: Make two functions static
    ACPI: sleep: Put the FACS table after using it
    ACPI: scan: Put SPCR and STAO table after using it
    ACPI: EC: Put the ACPI table after using it
    ACPI: APEI: Put the HEST table for error path
    ACPI: APEI: Put the error record serialization table for error path
    ...

    Linus Torvalds
     
  • Pull power management updates from Rafael Wysocki:
    "These rework the system-wide PM driver flags, make runtime switching
    of cpuidle governors easier, improve the user space hibernation
    interface code, add intel-speed-select interface documentation, add
    more debug messages to the ACPI code handling suspend to idle, update
    the cpufreq core and drivers, fix a minor issue in the cpuidle core
    and update two cpuidle drivers, improve the PM-runtime framework,
    update the Intel RAPL power capping driver, update devfreq core and
    drivers, and clean up the cpupower utility.

    Specifics:

    - Rework the system-wide PM driver flags to make them easier to
    understand and use and update their documentation (Rafael Wysocki,
    Alan Stern).

    - Allow cpuidle governors to be switched at run time regardless of
    the kernel configuration and update the related documentation
    accordingly (Hanjun Guo).

    - Improve the resume device handling in the user space hibernation
    interface code (Domenico Andreoli).

    - Document the intel-speed-select sysfs interface (Srinivas
    Pandruvada).

    - Make the ACPI code handling suspend to idle print more debug
    messages to help diagnose issues with it (Rafael Wysocki).

    - Fix a helper routine in the cpufreq core and correct a typo in the
    struct cpufreq_driver kerneldoc comment (Rafael Wysocki, Wang
    Wenhu).

    - Update cpufreq drivers:

    - Make the intel_pstate driver start in the passive mode by
    default on systems without HWP (Rafael Wysocki).

    - Add i.MX7ULP support to the imx-cpufreq-dt driver and add
    i.MX7ULP to the cpufreq-dt-platdev blacklist (Peng Fan).

    - Convert the qoriq cpufreq driver to a platform one, make the
    platform code create a suitable device object for it and add
    platform dependencies to it (Mian Yousaf Kaukab, Geert
    Uytterhoeven).

    - Fix wrong compatible binding in the qcom driver (Ansuel Smith).

    - Build the omap driver by default for ARCH_OMAP2PLUS (Anders
    Roxell).

    - Add r8a7742 SoC support to the dt cpufreq driver (Lad
    Prabhakar).

    - Update cpuidle core and drivers:

    - Fix three reference count leaks in error code paths in the
    cpuidle core (Qiushi Wu).

    - Convert Qualcomm SPM to a generic cpuidle driver (Stephan
    Gerhold).

    - Fix up the execution order when entering a domain idle state in
    the PSCI driver (Ulf Hansson).

    - Fix a reference counting issue related to clock management and
    clean up two oddities in the PM-runtime framework (Rafael Wysocki,
    Andy Shevchenko).

    - Add ElkhartLake support to the Intel RAPL power capping driver and
    remove an unused local MSR definition from it (Jacob Pan, Sumeet
    Pawnikar).

    - Update devfreq core and drivers:

    - Replace strncpy() with strscpy() in the devfreq core and use
    lockdep asserts instead of manual checks for a locked mutex in
    it (Dmitry Osipenko, Krzysztof Kozlowski).

    - Add a generic imx bus scaling driver and make it register an
    interconnect device (Leonard Crestez, Gustavo A. R. Silva).

    - Make the cpufreq notifier in the tegra30 driver take boosting
    into account and delete an unnecessary error message from that
    driver (Dmitry Osipenko, Markus Elfring).

    - Remove unneeded semicolon from the cpupower code (Zou Wei)"
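
    The strncpy() to strscpy() replacement mentioned above matters because
    strncpy() leaves the destination unterminated when the source does not
    fit. The following is a minimal user-space sketch of the contract that
    strscpy() is generally documented to have (always NUL-terminate, report
    truncation with -E2BIG). The name model_strscpy() is hypothetical and
    this is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space model of a strscpy()-style copy:
 * writes at most size bytes to dst, always NUL-terminates when size > 0,
 * and returns the number of characters copied (excluding the NUL) or
 * -E2BIG if the source had to be truncated.
 */
long model_strscpy(char *dst, const char *src, size_t size)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    if (size == 0)
        return -E2BIG;          /* no room even for the terminator */

    for (i = 0; i < size; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            return (long)i;     /* whole string fit, NUL included */
    }

    dst[size - 1] = '\0';       /* truncated: force termination */
    return -E2BIG;
}
```

    A caller can check the return value for a negative result instead of
    inspecting the last byte of the buffer by hand, which is the usual
    pattern needed after strncpy().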

    * tag 'pm-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (51 commits)
    cpuidle: Fix three reference count leaks
    PM: runtime: Replace pm_runtime_callbacks_present()
    PM / devfreq: Use lockdep asserts instead of manual checks for locked mutex
    PM / devfreq: imx-bus: Fix inconsistent IS_ERR and PTR_ERR
    PM / devfreq: Replace strncpy with strscpy
    PM / devfreq: imx: Register interconnect device
    PM / devfreq: Add generic imx bus scaling driver
    PM / devfreq: tegra30: Delete an error message in tegra_devfreq_probe()
    PM / devfreq: tegra30: Make CPUFreq notifier to take into account boosting
    PM: hibernate: Restrict writes to the resume device
    PM: runtime: clk: Fix clk_pm_runtime_get() error path
    cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver
    ACPI: EC: PM: s2idle: Extend GPE dispatching debug message
    ACPI: PM: s2idle: Print type of wakeup debug messages
    powercap: RAPL: remove unused local MSR define
    PM: runtime: Make clear what we do when conditions are wrong in rpm_suspend()
    Documentation: admin-guide: pm: Document intel-speed-select
    PM: hibernate: Split off snapshot dev option
    PM: hibernate: Incorporate concurrency handling
    Documentation: ABI: make current_governer_ro as a candidate for removal
    ...

    Linus Torvalds
     
  • Pull x86 platform driver updates from Andy Shevchenko:

    - Add support for the media keys on the ASUS laptop UX325JA/UX425JA

    - ASUS WMI driver can now handle 2-in-1 models T100TA, T100CHI, T100HA,
    T200TA

    - The Intel SCU driver has been substantially refactored and Elkhart
    Lake support has been added

    - Slim Bootloader firmware update signaling WMI driver has been added

    - Thinkpad ACPI driver can handle dual fan configuration on new P and X
    models

    - Touchscreen DMI driver has been extended to support
    - MP-man MPWIN895CL tablet
    - ONDA V891 v5 tablet
    - techBite Arc 11.6
    - Trekstor Twin 10.1
    - Trekstor Yourbook C11B
    - Vinga J116

    - Virtual Button driver got a few fixes to detect the mode of 2-in-1
    tablet models

    - Intel Speed Select tools update

    - Plenty of small cleanups here and there

    * tag 'platform-drivers-x86-v5.8-1' of git://git.infradead.org/linux-platform-drivers-x86: (89 commits)
    platform/x86: dcdbas: Check SMBIOS for protected buffer address
    platform/x86: asus_wmi: Reserve more space for struct bias_args
    platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type
    platform/x86: intel-hid: Add a quirk to support HP Spectre X2 (2015)
    platform/x86: touchscreen_dmi: Update Trekstor Twin 10.1 entry
    platform/x86: touchscreen_dmi: Add info for the Trekstor Yourbook C11B
    platform/x86: hp-wmi: Introduce HPWMI_POWER_FW_OR_HW as convenient shortcut
    platform/x86: hp-wmi: Convert simple_strtoul() to kstrtou32()
    platform/x86: hp-wmi: Refactor postcode_store() to follow standard patterns
    platform/x86: acerhdf: replace space by * in modalias
    platform/x86: ISST: Increase timeout
    tools/power/x86/intel-speed-select: Fix invalid core mask
    tools/power/x86/intel-speed-select: Increase CPU count
    tools/power/x86/intel-speed-select: Fix json perf-profile output output
    platform/x86: dell-wmi: Ignore keyboard attached / detached events
    platform/x86: dell-laptop: don't register micmute LED if there is no token
    platform/x86: thinkpad_acpi: Replace custom approach by kstrtoint()
    platform/x86: thinkpad_acpi: Use strndup_user() in dispatch_proc_write()
    platform/x86: thinkpad_acpi: Replace next_cmd(&buf) with strsep(&buf, ",")
    platform/x86: intel-vbtn: Detect switch position before registering the input-device
    ...

    Linus Torvalds
     
  • Pull MMC updates from Ulf Hansson:
    "MMC core:
    - Enable erase/discard/trim support for all (e)MMC/SD hosts
    - Export information through sysfs about enhanced RPMB support (eMMC v5.1+)
    - Align the initialization commands for SDIO cards
    - Fix SDIO initialization to prevent memory leaks and NULL pointer errors
    - Do not export undefined MMC_NAME/MODALIAS for SDIO cards
    - Export device/vendor field from common CIS for SDIO cards
    - Move SDIO IDs from functional drivers to the common SDIO header
    - Introduce the ->request_atomic() host ops

    MMC host:
    - Improve support for HW busy signaling for several hosts
    - Convert some DT bindings to json-schema
    - meson-mx-sdhc: Add driver and DT doc for the Amlogic Meson SDHC controller
    - meson-mx-sdio: Run a soft reset to recover from timeout/CRC error
    - mmci: Convert to use mmc_regulator_set_vqmmc()
    - mmci_stm32_sdmmc: Fix a couple of DMA bugs
    - mmci_stm32_sdmmc: Fix power on issue
    - renesas,mmcif,sdhci: Document r8a7742 DT bindings
    - renesas_sdhi: Add support for M3-W ES1.2 and 1.3 revisions
    - renesas_sdhi: Improvements to the TAP selection
    - renesas_sdhi/tmio: Further fixup runtime PM management at ->remove()
    - sdhci: Introduce ops to dump vendor specific registers
    - sdhci-cadence: Fix PHY write sequence
    - sdhci-esdhc-imx: Improve tunings
    - sdhci-esdhc-imx: Enable GPIO card detect as system wakeup
    - sdhci-esdhc-imx: Add HS400 support for i.MX6SLL
    - sdhci-esdhc-mcf: Add driver for the Coldfire/M5441X esdhc controller
    - m68k: mcf5441x: Add platform data to enable esdhc mmc controller
    - sdhci-msm: Improve HS400 tuning
    - sdhci-msm: Dump vendor specific registers at error
    - sdhci-msm: Add support for DLL/DDR properties provided from DT
    - sdhci-msm: Add support for the sm8250 variant
    - sdhci-msm: Add support for DVFS by converting to dev_pm_opp_set_rate()
    - sdhci-of-arasan: Add support for Intel Keem Bay variant
    - sdhci-of-arasan: Add support for Xilinx Versal SD variant
    - sdhci-of-dwcmshc: Add support for system suspend/resume
    - sdhci-of-dwcmshc: Fix UHS signaling support
    - sdhci-of-esdhc: Fix tuning for eMMC HS400 mode
    - sdhci-pci-gli: Add Genesys Logic GL9763E support
    - sdhci-sprd: Add support for the ->request_atomic() ops
    - sdhci-tegra: Avoid reading autocal timeout values when not applicable

    MEMSTICK:
    - Minor trivial update"

    * tag 'mmc-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (127 commits)
    dt-bindings: mmc: Convert sdhci-pxa to json-schema
    mmc: sdhci-msm: Clear tuning done flag while hs400 tuning
    mmc: core: Export device/vendor ids from Common CIS for SDIO cards
    mmc: core: Do not export MMC_NAME= and MODALIAS=mmc:block for SDIO cards
    mmc: sdhci-of-at91: fix CALCR register being rewritten
    mmc: sdhci-esdhc-imx: disable the CMD CRC check for standard tuning
    mmc: sdhci-esdhc-imx: fix the mask for tuning start point
    mmc: host: sdhci-esdhc-imx: add wakeup feature for GPIO CD pin
    mmc: mmci_sdmmc: fix DMA API warning max segment size
    mmc: mmci_sdmmc: fix DMA API warning overlapping mappings
    mmc: sdhci-of-arasan: Add support for Intel Keem Bay
    dt-bindings: mmc: arasan: Add compatible strings for Intel Keem Bay
    mmc: sdhci-cadence: fix PHY write
    mmc: sdio: Sort all SDIO IDs in common include file
    mmc: sdio: Fix Cypress SDIO IDs macros in common include file
    mmc: sdio: Move SDIO IDs from b43-sdio driver to common include file
    mmc: sdio: Move SDIO IDs from ath10k driver to common include file
    mmc: sdio: Move SDIO IDs from ath6kl driver to common include file
    mmc: sdio: Move SDIO IDs from smssdio driver to common include file
    mmc: sdio: Move SDIO IDs from btmtksdio driver to common include file
    ...

    Linus Torvalds
     
  • Merge updates from Andrew Morton:
    "A few little subsystems and a start of a lot of MM patches.

    Subsystems affected by this patch series: squashfs, ocfs2, parisc,
    vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup,
    swap, memcg, pagemap, memory-failure, vmalloc, kasan"

    * emailed patches from Andrew Morton: (128 commits)
    kasan: move kasan_report() into report.c
    mm/mm_init.c: report kasan-tag information stored in page->flags
    ubsan: entirely disable alignment checks under UBSAN_TRAP
    kasan: fix clang compilation warning due to stack protector
    x86/mm: remove vmalloc faulting
    mm: remove vmalloc_sync_(un)mappings()
    x86/mm/32: implement arch_sync_kernel_mappings()
    x86/mm/64: implement arch_sync_kernel_mappings()
    mm/ioremap: track which page-table levels were modified
    mm/vmalloc: track which page-table levels were modified
    mm: add functions to track page directory modifications
    s390: use __vmalloc_node in stack_alloc
    powerpc: use __vmalloc_node in alloc_vm_stack
    arm64: use __vmalloc_node in arch_alloc_vmap_stack
    mm: remove vmalloc_user_node_flags
    mm: switch the test_vmalloc module to use __vmalloc_node
    mm: remove __vmalloc_node_flags_caller
    mm: remove both instances of __vmalloc_node_flags
    mm: remove the prot argument to __vmalloc_node
    mm: remove the pgprot argument to __vmalloc
    ...

    Linus Torvalds
     
  • The kasan_report() function belongs in report.c, as it is a common
    function that does error reporting.

    Reported-by: Leon Romanovsky
    Signed-off-by: Andrey Konovalov
    Signed-off-by: Andrew Morton
    Tested-by: Leon Romanovsky
    Cc: Andrey Ryabinin
    Cc: Alexander Potapenko
    Cc: Dmitry Vyukov
    Cc: Leon Romanovsky
    Link: http://lkml.kernel.org/r/78a81fde6eeda9db72a7fd55fbc33173a515e4b1.1589297433.git.andreyknvl@google.com
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
     
  • The pageflags_layout_usage output, shown by means of mminit_loglevel,
    is incorrect when KASAN runs in software tag-based mode (enabled with
    CONFIG_KASAN_SW_TAGS). This patch corrects it and reports kasan-tag
    information.

    Signed-off-by: Jing Xia
    Signed-off-by: Andrew Morton
    Cc: Chunyan Zhang
    Cc: Orson Zhai
    Cc: Andrey Ryabinin
    Cc: Alexander Potapenko
    Cc: Dmitry Vyukov
    Link: http://lkml.kernel.org/r/1586929370-10838-1-git-send-email-jing.xia.mail@gmail.com
    Signed-off-by: Linus Torvalds

    Jing Xia
     
  • Commit 8d58f222e85f ("ubsan: disable UBSAN_ALIGNMENT under
    COMPILE_TEST") tried to fix the pathological results of UBSAN_ALIGNMENT
    with UBSAN_TRAP (which objtool would rightly scream about), but it made
    an assumption about how COMPILE_TEST gets set (it is not set for
    randconfig). As a result, we need a bigger hammer here: just don't
    allow the alignment checks with the trap mode.

    Fixes: 8d58f222e85f ("ubsan: disable UBSAN_ALIGNMENT under COMPILE_TEST")
    Reported-by: Randy Dunlap
    Signed-off-by: Kees Cook
    Signed-off-by: Andrew Morton
    Acked-by: Randy Dunlap
    Cc: Josh Poimboeuf
    Cc: Dmitry Vyukov
    Cc: Elena Petrova
    Link: http://lkml.kernel.org/r/202005291236.000FCB6@keescook
    Link: https://lore.kernel.org/lkml/742521db-1e8c-0d7a-1ed4-a908894fb497@infradead.org/
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • KASAN uses a single cc-option invocation to disable both conserve-stack
    and stack-protector flags. The former flag is not present in Clang,
    which causes cc-option to fail, and results in stack-protector being
    enabled.

    Fix by using separate cc-option calls for each flag. Also collect all
    flags in a variable to avoid calling cc-option multiple times for
    different files.

    Reported-by: Qian Cai
    Signed-off-by: Andrey Konovalov
    Signed-off-by: Andrew Morton
    Reviewed-by: Marco Elver
    Link: http://lkml.kernel.org/r/c2f0c8e4048852ae014f4a391d96ca42d27e3255.1590779332.git.andreyknvl@google.com
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
     
  • Remove fault handling on vmalloc areas, as the vmalloc code now takes
    care of synchronizing changes to all page-tables in the system.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Andrew Morton
    Acked-by: Andy Lutomirski
    Acked-by: Peter Zijlstra (Intel)
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: Dave Hansen
    Cc: "H . Peter Anvin"
    Cc: Ingo Molnar
    Cc: Matthew Wilcox (Oracle)
    Cc: Michal Hocko
    Cc: "Rafael J. Wysocki"
    Cc: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Cc: Vlastimil Babka
    Link: http://lkml.kernel.org/r/20200515140023.25469-8-joro@8bytes.org
    Signed-off-by: Linus Torvalds

    Joerg Roedel
     
  • These functions are not needed anymore because the vmalloc and ioremap
    mappings are now synchronized when they are created or torn down.

    Remove all callers and function definitions.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Andrew Morton
    Tested-by: Steven Rostedt (VMware)
    Acked-by: Andy Lutomirski
    Acked-by: Peter Zijlstra (Intel)
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: Dave Hansen
    Cc: "H . Peter Anvin"
    Cc: Ingo Molnar
    Cc: Matthew Wilcox (Oracle)
    Cc: Michal Hocko
    Cc: "Rafael J. Wysocki"
    Cc: Thomas Gleixner
    Cc: Vlastimil Babka
    Link: http://lkml.kernel.org/r/20200515140023.25469-7-joro@8bytes.org
    Signed-off-by: Linus Torvalds

    Joerg Roedel
     
  • Implement the function to sync changes in vmalloc and ioremap ranges to
    all page-tables.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Andrew Morton
    Acked-by: Andy Lutomirski
    Acked-by: Peter Zijlstra (Intel)
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: Dave Hansen
    Cc: "H . Peter Anvin"
    Cc: Ingo Molnar
    Cc: Matthew Wilcox (Oracle)
    Cc: Michal Hocko
    Cc: "Rafael J. Wysocki"
    Cc: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Cc: Vlastimil Babka
    Link: http://lkml.kernel.org/r/20200515140023.25469-6-joro@8bytes.org
    Signed-off-by: Linus Torvalds

    Joerg Roedel
     
  • Implement the function to sync changes in vmalloc and ioremap ranges to
    all page-tables.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Andrew Morton
    Acked-by: Andy Lutomirski
    Acked-by: Peter Zijlstra (Intel)
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: Dave Hansen
    Cc: "H . Peter Anvin"
    Cc: Ingo Molnar
    Cc: Matthew Wilcox (Oracle)
    Cc: Michal Hocko
    Cc: "Rafael J. Wysocki"
    Cc: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Cc: Vlastimil Babka
    Link: http://lkml.kernel.org/r/20200515140023.25469-5-joro@8bytes.org
    Signed-off-by: Linus Torvalds

    Joerg Roedel
     
  • Track at which levels in the page-table entries were modified by
    ioremap_page_range().

    After the page-table has been modified, use that information to decide
    whether the new arch_sync_kernel_mappings() needs to be called. The
    iounmap path re-uses vunmap(), which has already been taken care of.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Andrew Morton
    Acked-by: Andy Lutomirski
    Acked-by: Peter Zijlstra (Intel)
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: Dave Hansen
    Cc: "H . Peter Anvin"
    Cc: Ingo Molnar
    Cc: Matthew Wilcox (Oracle)
    Cc: Michal Hocko
    Cc: "Rafael J. Wysocki"
    Cc: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Cc: Vlastimil Babka
    Link: http://lkml.kernel.org/r/20200515140023.25469-4-joro@8bytes.org
    Signed-off-by: Linus Torvalds

    Joerg Roedel
     
  • Track at which levels in the page-table entries were modified by
    vmap/vunmap.

    After the page-table has been modified, use that information to decide
    whether the new arch_sync_kernel_mappings() needs to be called.

    [akpm@linux-foundation.org: map_kernel_range_noflush() needs the arch_sync_kernel_mappings() call]
    Signed-off-by: Joerg Roedel
    Signed-off-by: Andrew Morton
    Acked-by: Andy Lutomirski
    Acked-by: Peter Zijlstra (Intel)
    Cc: Arnd Bergmann
    Cc: Christoph Hellwig
    Cc: Dave Hansen
    Cc: "H . Peter Anvin"
    Cc: Ingo Molnar
    Cc: Matthew Wilcox (Oracle)
    Cc: Michal Hocko
    Cc: "Rafael J. Wysocki"
    Cc: Steven Rostedt (VMware)
    Cc: Thomas Gleixner
    Cc: Vlastimil Babka
    Link: http://lkml.kernel.org/r/20200515140023.25469-3-joro@8bytes.org
    Signed-off-by: Linus Torvalds

    Joerg Roedel
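
    The tracking described in the commit above can be sketched in user
    space as a bitmask that accumulates which page-table levels were
    touched, followed by a single check against a per-architecture mask.
    All names below (MOD_*, MODEL_SYNC_MASK, model_arch_sync(),
    finish_mapping()) are illustrative assumptions rather than the kernel's
    actual identifiers, and the choice of syncing only on top-level changes
    is an assumed policy.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flags, one per page-table level (names are assumptions). */
enum {
    MOD_PTE = 1u << 0,
    MOD_PMD = 1u << 1,
    MOD_PUD = 1u << 2,
    MOD_P4D = 1u << 3,
    MOD_PGD = 1u << 4,
};

typedef unsigned int mod_mask_t;

/* Hypothetical per-arch policy: levels whose modification needs a sync. */
#define MODEL_SYNC_MASK (MOD_PGD)

/* Counts how often the stand-in sync function was invoked. */
unsigned int sync_calls;

/* Stand-in for arch_sync_kernel_mappings(); it only counts invocations. */
void model_arch_sync(unsigned long start, unsigned long end)
{
    (void)start;
    (void)end;
    sync_calls++;
}

/* Record that a new entry was installed at the given level. */
void note_modified(mod_mask_t *mask, mod_mask_t level)
{
    *mask |= level;
}

/* After mapping a range: sync only if a level the arch cares about changed. */
bool finish_mapping(unsigned long start, unsigned long end, mod_mask_t mask)
{
    assert(start <= end);

    if (mask & MODEL_SYNC_MASK) {
        model_arch_sync(start, end);
        return true;
    }
    return false;
}
```

    In this model, a mapping that only installs lower-level entries never
    triggers the sync, so the common case pays for one mask test and
    nothing else.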
     
  • Patch series "mm: Get rid of vmalloc_sync_(un)mappings()", v3.

    After the recent issue with vmalloc and tracing code[1] on x86 and a
    long history of previous issues related to the vmalloc_sync_mappings()
    interface, I thought the time has come to remove it. Please see [2],
    [3], and [4] for some other issues in the past.

    The patches add tracking of page-table directory changes to the vmalloc
    and ioremap code. Depending on the page-table levels at which changes
    have been made, a new per-arch function is called:
    arch_sync_kernel_mappings().

    On x86-64 with 4-level paging, this function will not be called more
    than 64 times in a system's runtime (because vmalloc-space takes 64 PGD
    entries which are only populated, but never cleared).

    As a side effect this also makes it possible to get rid of vmalloc faults on x86,
    making it safe to touch vmalloc'ed memory in the page-fault handler.
    Note that this potentially includes per-cpu memory.

    This patch (of 7):

    Add page-table allocation functions which will keep track of changed
    directory entries. They are needed for new PGD, P4D, PUD, and PMD
    entries and will be used in vmalloc and ioremap code to decide whether
    any changes in the kernel mappings need to be synchronized between
    page-tables in the system.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Andrew Morton
    Acked-by: Andy Lutomirski
    Acked-by: Peter Zijlstra (Intel)
    Cc: "H . Peter Anvin"
    Cc: Dave Hansen
    Cc: "Rafael J. Wysocki"
    Cc: Arnd Bergmann
    Cc: Steven Rostedt (VMware)
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Matthew Wilcox (Oracle)
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Christoph Hellwig
    Link: http://lkml.kernel.org/r/20200515140023.25469-1-joro@8bytes.org
    Link: http://lkml.kernel.org/r/20200515140023.25469-2-joro@8bytes.org
    Signed-off-by: Linus Torvalds

    Joerg Roedel
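
    The figure of 64 PGD entries quoted in the cover letter above can be
    checked with a short calculation. The sketch assumes the usual x86-64
    4-level layout (4 KiB pages, 9 index bits per level, so one PGD entry
    spans 2^39 bytes) and a 32 TiB vmalloc/ioremap area. Both values are
    assumptions taken from the commonly documented memory map, not from
    this commit, and the MODEL_* names and pgd_entries_for() are
    hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 4-level layout: 4 KiB pages, 9 index bits per level. */
#define MODEL_PAGE_SHIFT    12
#define MODEL_BITS_PER_LVL  9
#define MODEL_PGDIR_SHIFT   (MODEL_PAGE_SHIFT + 3 * MODEL_BITS_PER_LVL)

/* Assumed size of the vmalloc/ioremap area: 32 TiB. */
#define MODEL_VMALLOC_BYTES (UINT64_C(32) << 40)

static_assert(MODEL_PGDIR_SHIFT == 39,
              "one PGD entry should cover 2^39 bytes (512 GiB)");

/*
 * Number of top-level (PGD) entries needed to span a region of the given
 * size, assuming the region starts on a PGD-entry boundary.
 */
uint64_t pgd_entries_for(uint64_t bytes)
{
    uint64_t per_entry = UINT64_C(1) << MODEL_PGDIR_SHIFT;

    return (bytes + per_entry - 1) / per_entry;  /* round up */
}
```

    Under these assumptions pgd_entries_for(MODEL_VMALLOC_BYTES) evaluates
    to 64, and since those entries are only ever populated and never
    cleared, each one can trigger the synchronization at most once.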
     
  • stack_alloc can use a slightly higher level vmalloc function.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Acked-by: Christian Borntraeger
    Acked-by: Peter Zijlstra (Intel)
    Cc: "K. Y. Srinivasan"
    Cc: Haiyang Zhang
    Cc: Stephen Hemminger
    Cc: Wei Liu
    Cc: David Airlie
    Cc: Laura Abbott
    Cc: Sumit Semwal
    Cc: Sakari Ailus
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: Gao Xiang
    Cc: Greg Kroah-Hartman
    Cc: Johannes Weiner
    Cc: Mark Rutland
    Cc: Michael Kelley
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20200414131348.444715-30-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • alloc_vm_stack can use a slightly higher level vmalloc function.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Acked-by: Peter Zijlstra (Intel)
    Cc: Christian Borntraeger
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: David Airlie
    Cc: Gao Xiang
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Johannes Weiner
    Cc: "K. Y. Srinivasan"
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Michael Kelley
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Sakari Ailus
    Cc: Stephen Hemminger
    Cc: Sumit Semwal
    Cc: Wei Liu
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20200414131348.444715-29-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • arch_alloc_vmap_stack can use a slightly higher level vmalloc function.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Acked-by: Peter Zijlstra (Intel)
    Cc: Christian Borntraeger
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: David Airlie
    Cc: Gao Xiang
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Johannes Weiner
    Cc: "K. Y. Srinivasan"
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Michael Kelley
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Sakari Ailus
    Cc: Stephen Hemminger
    Cc: Sumit Semwal
    Cc: Wei Liu
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20200414131348.444715-28-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Open code it in __bpf_map_area_alloc, which is the only caller. Also
    clean up __bpf_map_area_alloc to have a single vmalloc call with slightly
    different flags instead of the current two different calls.

    For this to compile for the nommu case add a __vmalloc_node_range stub to
    nommu.c.

    [akpm@linux-foundation.org: fix nommu.c build]
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Acked-by: Peter Zijlstra (Intel)
    Acked-by: Johannes Weiner
    Cc: Christian Borntraeger
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: David Airlie
    Cc: Gao Xiang
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: "K. Y. Srinivasan"
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Michael Kelley
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Sakari Ailus
    Cc: Stephen Hemminger
    Cc: Sumit Semwal
    Cc: Wei Liu
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Cc: Stephen Rothwell
    Link: http://lkml.kernel.org/r/20200414131348.444715-27-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • No need to export the very low-level __vmalloc_node_range when the test
    module can use a slightly higher level variant.

    [akpm@linux-foundation.org: add missing `node' arg]
    [akpm@linux-foundation.org: fix riscv nommu build]
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Acked-by: Peter Zijlstra (Intel)
    Cc: Christian Borntraeger
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: David Airlie
    Cc: Gao Xiang
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Johannes Weiner
    Cc: "K. Y. Srinivasan"
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Michael Kelley
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Sakari Ailus
    Cc: Stephen Hemminger
    Cc: Sumit Semwal
    Cc: Wei Liu
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20200414131348.444715-26-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Just use __vmalloc_node instead, which gets an extra argument. To be
    able to use __vmalloc_node in all callers, make it available outside of
    vmalloc and implement it in nommu.c.

    [akpm@linux-foundation.org: fix nommu build]
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Acked-by: Peter Zijlstra (Intel)
    Cc: Christian Borntraeger
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: David Airlie
    Cc: Gao Xiang
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Johannes Weiner
    Cc: "K. Y. Srinivasan"
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Michael Kelley
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Sakari Ailus
    Cc: Stephen Hemminger
    Cc: Sumit Semwal
    Cc: Wei Liu
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Cc: Stephen Rothwell
    Link: http://lkml.kernel.org/r/20200414131348.444715-25-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • The real version just had a few callers that can open code it and remove
    one layer of indirection. The nommu stub was public but only had a single
    caller, so remove it and avoid a CONFIG_MMU ifdef in vmalloc.h.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Acked-by: Peter Zijlstra (Intel)
    Cc: Christian Borntraeger
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: David Airlie
    Cc: Gao Xiang
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Johannes Weiner
    Cc: "K. Y. Srinivasan"
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Michael Kelley
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Sakari Ailus
    Cc: Stephen Hemminger
    Cc: Sumit Semwal
    Cc: Wei Liu
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20200414131348.444715-24-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • This is always PAGE_KERNEL now.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Acked-by: Peter Zijlstra (Intel)
    Cc: Christian Borntraeger
    Cc: Christophe Leroy
    Cc: Daniel Vetter
    Cc: David Airlie
    Cc: Gao Xiang
    Cc: Greg Kroah-Hartman
    Cc: Haiyang Zhang
    Cc: Johannes Weiner
    Cc: "K. Y. Srinivasan"
    Cc: Laura Abbott
    Cc: Mark Rutland
    Cc: Michael Kelley
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Robin Murphy
    Cc: Sakari Ailus
    Cc: Stephen Hemminger
    Cc: Sumit Semwal
    Cc: Wei Liu
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Heiko Carstens
    Cc: Paul Mackerras
    Cc: Vasily Gorbik
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/20200414131348.444715-23-hch@lst.de
    Signed-off-by: Linus Torvalds

    Christoph Hellwig