09 Dec, 2020

1 commit


07 Dec, 2020

1 commit

  • On success, mmap should return the start address of the newly mapped
    area, but the patch "mm: mmap: merge vma after call_mmap() if possible"
    assigned the vm_start of the newly merged vma to the return value addr.
    Users of mmap will therefore get the wrong address if the vma is merged
    after call_mmap(). Fix this by moving the assignment to addr before
    merging the vma.

    We have a driver which changes vm_flags, and this bug was found by our
    test cases.

    Fixes: d70cec898324 ("mm: mmap: merge vma after call_mmap() if possible")
    Signed-off-by: Liu Zixian
    Signed-off-by: Andrew Morton
    Reviewed-by: Jason Gunthorpe
    Reviewed-by: David Hildenbrand
    Cc: Miaohe Lin
    Cc: Hongxiang Lou
    Cc: Hu Shiyuan
    Cc: Matthew Wilcox
    Link: https://lkml.kernel.org/r/20201203085350.22624-1-liuzixian4@huawei.com
    Signed-off-by: Linus Torvalds

    Liu Zixian
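
    The user-visible contract this fix restores can be checked from user
    space. A minimal sketch (map_anon is a made-up helper, not from the
    patch): on success, mmap(2) returns the start address of the new
    mapping, which must stay correct regardless of any vma merging done
    inside the kernel.

    ```c
    #include <assert.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Map len bytes of anonymous memory and return the start address,
     * or MAP_FAILED on error. */
    static void *map_anon(size_t len)
    {
        return mmap(NULL, len, PROT_READ | PROT_WRITE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }

    int main(void)
    {
        long page = sysconf(_SC_PAGESIZE);
        char *p = map_anon((size_t)page);

        /* On success mmap returns the start of the new mapping... */
        assert(p != MAP_FAILED);
        /* ...which is page-aligned and immediately usable. */
        assert(((uintptr_t)p % (uintptr_t)page) == 0);
        p[0] = 'x';
        assert(p[0] == 'x');

        assert(munmap(p, (size_t)page) == 0);
        return 0;
    }
    ```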
     

26 Oct, 2020

2 commits


25 Oct, 2020

1 commit


24 Oct, 2020

1 commit


21 Oct, 2020

1 commit


19 Oct, 2020

2 commits

  • There are two locations that have a block of code for munmapping a vma
    range. Change those two locations to use a function and add meaningful
    comments about what happens to the arguments, which was unclear in the
    previous code.

    Signed-off-by: Liam R. Howlett
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200818154707.2515169-2-Liam.Howlett@Oracle.com
    Signed-off-by: Linus Torvalds

    Liam R. Howlett
     
  • There are three places where the next vma is required, all using the
    same block of code. Replace the block with a function and add comments
    on what happens when NULL is encountered.

    Signed-off-by: Liam R. Howlett
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200818154707.2515169-1-Liam.Howlett@Oracle.com
    Signed-off-by: Linus Torvalds

    Liam R. Howlett
     

17 Oct, 2020

2 commits

  • The preceding patches have ensured that core dumping properly takes the
    mmap_lock. Thanks to that, we can now remove mmget_still_valid() and all
    its users.

    Signed-off-by: Jann Horn
    Signed-off-by: Andrew Morton
    Acked-by: Linus Torvalds
    Cc: Christoph Hellwig
    Cc: Alexander Viro
    Cc: "Eric W . Biederman"
    Cc: Oleg Nesterov
    Cc: Hugh Dickins
    Link: http://lkml.kernel.org/r/20200827114932.3572699-8-jannh@google.com
    Signed-off-by: Linus Torvalds

    Jann Horn
     
  • Commit 1da177e4c3f4 ("Linux-2.6.12-rc2") introduced the helper
    put_write_access() to wrap the atomic_dec operation on the i_writecount
    field, but __vma_link_file() and dup_mmap() open-code the operation
    instead of using this helper.

    Signed-off-by: Miaohe Lin
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200924115235.5111-1-linmiaohe@huawei.com
    Signed-off-by: Linus Torvalds

    Miaohe Lin
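
    The pattern behind this and the related i_writecount cleanups can be
    sketched outside the kernel. This is not kernel code; struct inode_like
    and the *_like helpers are invented names for illustration: a named
    helper replaces open-coded atomic operations at every call site, so the
    call sites read as intent rather than mechanism.

    ```c
    #include <assert.h>
    #include <stdatomic.h>

    /* User-space analogue of the i_writecount pattern. */
    struct inode_like {
        atomic_int i_writecount;
    };

    static void get_write_access_like(struct inode_like *inode)
    {
        atomic_fetch_add(&inode->i_writecount, 1);   /* was open-coded */
    }

    static void put_write_access_like(struct inode_like *inode)
    {
        atomic_fetch_sub(&inode->i_writecount, 1);   /* was open-coded */
    }

    int main(void)
    {
        struct inode_like ino;
        atomic_init(&ino.i_writecount, 0);

        get_write_access_like(&ino);
        assert(atomic_load(&ino.i_writecount) == 1);

        put_write_access_like(&ino);
        assert(atomic_load(&ino.i_writecount) == 0);
        return 0;
    }
    ```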
     

14 Oct, 2020

9 commits

  • Replace do_brk with do_brk_flags in the comment of insert_vm_struct(),
    since do_brk was removed by the commit referenced below.

    Fixes: bb177a732c4369 ("mm: do not bug_on on incorrect length in __mm_populate()")
    Signed-off-by: Liao Pingfang
    Signed-off-by: Yi Wang
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/1600650778-43230-1-git-send-email-wang.yi59@zte.com.cn
    Signed-off-by: Linus Torvalds

    Liao Pingfang
     
  • Commit 1da177e4c3f4 ("Linux-2.6.12-rc2") introduced the helper
    allow_write_access() to wrap the atomic_inc operation on the
    i_writecount field, but __remove_shared_vm_struct() open-codes the
    operation instead of using this helper.

    Signed-off-by: Miaohe Lin
    Signed-off-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200921115814.39680-1-linmiaohe@huawei.com
    Signed-off-by: Linus Torvalds

    Miaohe Lin
     
  • Commit 4bb5f5d9395b ("mm: allow drivers to prevent new writable mappings")
    changed i_mmap_writable from unsigned int to atomic_t and added the helper
    function mapping_allow_writable() to atomic_inc i_mmap_writable. But it
    forgot to use this helper function in dup_mmap() and __vma_link_file().

    Signed-off-by: Miaohe Lin
    Signed-off-by: Andrew Morton
    Cc: Christian Brauner
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: "Eric W. Biederman"
    Cc: Christian Kellner
    Cc: Suren Baghdasaryan
    Cc: Adrian Reber
    Cc: Shakeel Butt
    Cc: Aleksa Sarai
    Cc: Thomas Gleixner
    Link: https://lkml.kernel.org/r/20200917112736.7789-1-linmiaohe@huawei.com
    Signed-off-by: Linus Torvalds

    Miaohe Lin
     
  • In __vma_adjust(), we check *root* to decide whether to adjust the
    address_space. It is more meaningful to check *file* itself: we are
    adjusting this data because the vma is file-backed.

    Since we already assume the address_space is valid for a file-backed
    vma, let's just replace *root* with *file* here.

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200913133631.37781-2-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • *root*, of type struct rb_root_cached, is an element of *mapping*, of
    type struct address_space. This implies that when we have a valid
    *root* it must be part of a valid *mapping*.

    So we can merge these two checks together to make the code easier to
    read and to save some CPU cycles.

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200913133631.37781-1-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • Instead of converting adjust_next back and forth between bytes and a
    number of pages, let's just store the virtual address into adjust_next.

    Also, this patch fixes one typo in the comment of vma_adjust_trans_huge().

    [vbabka@suse.cz: changelog tweak]

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Acked-by: Vlastimil Babka
    Cc: Mike Kravetz
    Link: http://lkml.kernel.org/r/20200828081031.11306-1-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
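
    The before/after shape of the adjust_next change can be sketched in a
    few lines. All names here are hypothetical and PAGE_SHIFT is assumed to
    be 12 (4 KiB pages): storing a byte distance instead of a page count
    removes the shift at every use site.

    ```c
    #include <assert.h>

    #define PAGE_SHIFT 12                      /* assumed 4 KiB pages */
    #define PAGE_SIZE  (1UL << PAGE_SHIFT)

    /* Before: adjust_next held a page count, so users shifted every time. */
    static unsigned long end_from_pages(unsigned long vm_end,
                                        long adjust_next_pages)
    {
        return vm_end + (unsigned long)(adjust_next_pages << PAGE_SHIFT);
    }

    /* After: adjust_next holds the byte distance; the shift disappears. */
    static unsigned long end_from_bytes(unsigned long vm_end,
                                        long adjust_next_bytes)
    {
        return vm_end + (unsigned long)adjust_next_bytes;
    }

    int main(void)
    {
        unsigned long vm_end = 0x7f0000001000UL;
        long pages = 3;

        /* Both representations describe the same boundary. */
        assert(end_from_pages(vm_end, pages) ==
               end_from_bytes(vm_end, pages * (long)PAGE_SIZE));
        return 0;
    }
    ```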
     
  • These two functions share the same logic except that each ignores a
    different vma.

    Let's reuse the code.

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200809232057.23477-2-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • __vma_unlink_common() and __vma_unlink() are counterparts. Since there
    is no function named __vma_unlink(), rename __vma_unlink_common() to
    __vma_unlink() to make the code self-explanatory and easier for the
    audience to understand.

    Otherwise, the name suggests that several variants of vma_unlink()
    exist and that __vma_unlink_common() is shared by them.

    Signed-off-by: Wei Yang
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: https://lkml.kernel.org/r/20200809232057.23477-1-richard.weiyang@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    Wei Yang
     
  • Pull block updates from Jens Axboe:

    - Series of merge handling cleanups (Baolin, Christoph)

    - Series of blk-throttle fixes and cleanups (Baolin)

    - Series cleaning up BDI, separating the block device from the
    backing_dev_info (Christoph)

    - Removal of bdget() as a generic API (Christoph)

    - Removal of blkdev_get() as a generic API (Christoph)

    - Cleanup of is-partition checks (Christoph)

    - Series reworking disk revalidation (Christoph)

    - Series cleaning up bio flags (Christoph)

    - bio crypt fixes (Eric)

    - IO stats inflight tweak (Gabriel)

    - blk-mq tags fixes (Hannes)

    - Buffer invalidation fixes (Jan)

    - Allow soft limits for zone append (Johannes)

    - Shared tag set improvements (John, Kashyap)

    - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel)

    - DM no-wait support (Mike, Konstantin)

    - Request allocation improvements (Ming)

    - Allow md/dm/bcache to use IO stat helpers (Song)

    - Series improving blk-iocost (Tejun)

    - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang,
    Xianting, Yang, Yufen, yangerkun)

    * tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits)
    block: fix uapi blkzoned.h comments
    blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue
    blk-mq: get rid of the dead flush handle code path
    block: get rid of unnecessary local variable
    block: fix comment and add lockdep assert
    blk-mq: use helper function to test hw stopped
    block: use helper function to test queue register
    block: remove redundant mq check
    block: invoke blk_mq_exit_sched no matter whether have .exit_sched
    percpu_ref: don't refer to ref->data if it isn't allocated
    block: ratelimit handle_bad_sector() message
    blk-throttle: Re-use the throtl_set_slice_end()
    blk-throttle: Open code __throtl_de/enqueue_tg()
    blk-throttle: Move service tree validation out of the throtl_rb_first()
    blk-throttle: Move the list operation after list validation
    blk-throttle: Fix IO hang for a corner case
    blk-throttle: Avoid tracking latency if low limit is invalid
    blk-throttle: Avoid getting the current time if tg->last_finish_time is 0
    blk-throttle: Remove a meaningless parameter for throtl_downgrade_state()
    block: Remove redundant 'return' statement
    ...

    Linus Torvalds
     

13 Oct, 2020

1 commit

  • Pull arm64 updates from Will Deacon:
    "There's quite a lot of code here, but much of it is due to the
    addition of a new PMU driver as well as some arm64-specific selftests
    which is an area where we've traditionally been lagging a bit.

    In terms of exciting features, this includes support for the Memory
    Tagging Extension which narrowly missed 5.9, hopefully allowing
    userspace to run with use-after-free detection in production on CPUs
    that support it. Work is ongoing to integrate the feature with KASAN
    for 5.11.

    Another change that I'm excited about (assuming they get the hardware
    right) is preparing the ASID allocator for sharing the CPU page-table
    with the SMMU. Those changes will also come in via Joerg with the
    IOMMU pull.

    We do stray outside of our usual directories in a few places, mostly
    due to core changes required by MTE. Although much of this has been
    Acked, there were a couple of places where we unfortunately didn't get
    any review feedback.

    Other than that, we ran into a handful of minor conflicts in -next,
    but nothing that should pose any issues.

    Summary:

    - Userspace support for the Memory Tagging Extension introduced by
    Armv8.5. Kernel support (via KASAN) is likely to follow in 5.11.

    - Selftests for MTE, Pointer Authentication and FPSIMD/SVE context
    switching.

    - Fix and subsequent rewrite of our Spectre mitigations, including
    the addition of support for PR_SPEC_DISABLE_NOEXEC.

    - Support for the Armv8.3 Pointer Authentication enhancements.

    - Support for ASID pinning, which is required when sharing
    page-tables with the SMMU.

    - MM updates, including treating flush_tlb_fix_spurious_fault() as a
    no-op.

    - Perf/PMU driver updates, including addition of the ARM CMN PMU
    driver and also support to handle CPU PMU IRQs as NMIs.

    - Allow prefetchable PCI BARs to be exposed to userspace using normal
    non-cacheable mappings.

    - Implementation of ARCH_STACKWALK for unwinding.

    - Improve reporting of unexpected kernel traps due to BPF JIT
    failure.

    - Improve robustness of user-visible HWCAP strings and their
    corresponding numerical constants.

    - Removal of TEXT_OFFSET.

    - Removal of some unused functions, parameters and prototypes.

    - Removal of MPIDR-based topology detection in favour of firmware
    description.

    - Cleanups to handling of SVE and FPSIMD register state in
    preparation for potential future optimisation of handling across
    syscalls.

    - Cleanups to the SDEI driver in preparation for support in KVM.

    - Miscellaneous cleanups and refactoring work"

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
    Revert "arm64: initialize per-cpu offsets earlier"
    arm64: random: Remove no longer needed prototypes
    arm64: initialize per-cpu offsets earlier
    kselftest/arm64: Check mte tagged user address in kernel
    kselftest/arm64: Verify KSM page merge for MTE pages
    kselftest/arm64: Verify all different mmap MTE options
    kselftest/arm64: Check forked child mte memory accessibility
    kselftest/arm64: Verify mte tag inclusion via prctl
    kselftest/arm64: Add utilities and a test to validate mte memory
    perf: arm-cmn: Fix conversion specifiers for node type
    perf: arm-cmn: Fix unsigned comparison to less than zero
    arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
    arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
    arm64: Add support for PR_SPEC_DISABLE_NOEXEC prctl() option
    arm64: Pull in task_stack_page() to Spectre-v4 mitigation code
    KVM: arm64: Allow patching EL2 vectors even with KASLR is not enabled
    arm64: Get rid of arm64_ssbd_state
    KVM: arm64: Convert ARCH_WORKAROUND_2 to arm64_get_spectre_v4_state()
    KVM: arm64: Get rid of kvm_arm_have_ssbd()
    KVM: arm64: Simplify handling of ARCH_WORKAROUND_2
    ...

    Linus Torvalds
     

12 Oct, 2020

2 commits

  • Linux 5.9

    Change-Id: Ic4308a3e2a4015058efdac52bd51794b604c8435
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • syzbot reported the below general protection fault:

    general protection fault, probably for non-canonical address
    0xe00eeaee0000003b: 0000 [#1] PREEMPT SMP KASAN
    KASAN: maybe wild-memory-access in range [0x00777770000001d8-0x00777770000001df]
    CPU: 1 PID: 10488 Comm: syz-executor721 Not tainted 5.9.0-rc3-syzkaller #0
    RIP: 0010:unlink_file_vma+0x57/0xb0 mm/mmap.c:164
    Call Trace:
    free_pgtables+0x1b3/0x2f0 mm/memory.c:415
    exit_mmap+0x2c0/0x530 mm/mmap.c:3184
    __mmput+0x122/0x470 kernel/fork.c:1076
    mmput+0x53/0x60 kernel/fork.c:1097
    exit_mm kernel/exit.c:483 [inline]
    do_exit+0xa8b/0x29f0 kernel/exit.c:793
    do_group_exit+0x125/0x310 kernel/exit.c:903
    get_signal+0x428/0x1f00 kernel/signal.c:2757
    arch_do_signal+0x82/0x2520 arch/x86/kernel/signal.c:811
    exit_to_user_mode_loop kernel/entry/common.c:136 [inline]
    exit_to_user_mode_prepare+0x1ae/0x200 kernel/entry/common.c:167
    syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:242
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    It's because the ->mmap() callback can change vma->vm_file and fput the
    original file. But commit d70cec898324 ("mm: mmap: merge vma after
    call_mmap() if possible") failed to catch this case and always fput()s
    the original file, resulting in an extra fput().

    [ Thanks Hillf for pointing this extra fput() out. ]

    Fixes: d70cec898324 ("mm: mmap: merge vma after call_mmap() if possible")
    Reported-by: syzbot+c5d5a51dcbb558ca0cb5@syzkaller.appspotmail.com
    Signed-off-by: Miaohe Lin
    Signed-off-by: Andrew Morton
    Cc: Christian König
    Cc: Hongxiang Lou
    Cc: Chris Wilson
    Cc: Dave Airlie
    Cc: Daniel Vetter
    Cc: Sumit Semwal
    Cc: Matthew Wilcox (Oracle)
    Cc: John Hubbard
    Link: https://lkml.kernel.org/r/20200916090733.31427-1-linmiaohe@huawei.com
    Signed-off-by: Linus Torvalds

    Miaohe Lin
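
    The bug class is easier to see in a small user-space model. Everything
    below is invented for illustration (the ref/put_ref names, the two-slot
    counter standing in for real fput() accounting): a callback may drop the
    reference it was handed and install its own, so afterwards the caller
    must release whatever is installed now, not the saved original.

    ```c
    #include <assert.h>

    /* Model references with a per-object put counter instead of fput(). */
    struct ref { int id; };

    static int put_count[2];

    static void put_ref(struct ref *r)
    {
        put_count[r->id]++;
    }

    struct vma_like { struct ref *file; };

    static struct ref original    = { .id = 0 };
    static struct ref replacement = { .id = 1 };

    /* Like a driver's ->mmap(): drops the original ref, installs its own. */
    static void callback_swaps_file(struct vma_like *vma)
    {
        put_ref(vma->file);
        vma->file = &replacement;
    }

    int main(void)
    {
        struct vma_like vma = { .file = &original };

        callback_swaps_file(&vma);

        /* Correct cleanup releases what is installed NOW.  Releasing the
         * saved original again would be the extra put the fix removes. */
        put_ref(vma.file);

        assert(put_count[0] == 1);   /* original dropped exactly once */
        assert(put_count[1] == 1);   /* replacement dropped exactly once */
        return 0;
    }
    ```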
     

25 Sep, 2020

1 commit


04 Sep, 2020

1 commit

  • Similarly to arch_validate_prot() called from do_mprotect_pkey(), an
    architecture may need to sanity-check the new vm_flags.

    Define a dummy function always returning true. In addition to
    do_mprotect_pkey(), also invoke it from mmap_region() prior to updating
    vma->vm_page_prot to allow the architecture code to veto potentially
    inconsistent vm_flags.

    Signed-off-by: Catalin Marinas
    Acked-by: Andrew Morton

    Catalin Marinas
     

08 Aug, 2020

5 commits

  • commit 60500a42286d ("ANDROID: mm: add a field to store names for
    private anonymous memory") changed the parameters to vma_merge() which
    causes any new use of that function upstream to break the build.

    So fix up the new call by adding the needed extra parameter.

    Maybe someday this patch could be dropped to prevent this.

    Bug: 120441514
    Cc: Colin Cross
    Cc: Dmitry Shmidt
    Cc: Amit Pundir
    Signed-off-by: Greg Kroah-Hartman
    Change-Id: I05629d408449124215ef9181223a686f4855cbf6

    Greg Kroah-Hartman
     
  • …kernel/git/sre/linux-power-supply") into android-mainline

    Merges along the way to 5.9-rc1

    resolves conflicts in:
    Documentation/ABI/testing/sysfs-class-power
    drivers/power/supply/power_supply_sysfs.c
    fs/crypto/inline_crypt.c

    Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
    Change-Id: Ia087834f54fb4e5269d68c3c404747ceed240701

    Greg Kroah-Hartman
     
  • The current split between do_mmap() and do_mmap_pgoff() was introduced in
    commit 1fcfd8db7f82 ("mm, mpx: add "vm_flags_t vm_flags" arg to
    do_mmap_pgoff()") to support MPX.

    The wrapper function do_mmap_pgoff() always passed 0 as the value of the
    vm_flags argument to do_mmap(). However, MPX support has subsequently
    been removed from the kernel and there were no more direct callers of
    do_mmap(); all calls were going via do_mmap_pgoff().

    Simplify the code by removing do_mmap_pgoff() and changing all callers to
    directly call do_mmap(), which now no longer takes a vm_flags argument.

    Signed-off-by: Peter Collingbourne
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Reviewed-by: David Hildenbrand
    Link: http://lkml.kernel.org/r/20200727194109.1371462-1-pcc@google.com
    Signed-off-by: Linus Torvalds

    Peter Collingbourne
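
    The shape of this cleanup, with hypothetical *_like names rather than
    the real kernel signatures: a wrapper that only ever forwards a
    constant argument adds nothing once no caller needs the extra
    parameter, so it can be deleted and its callers switched over.

    ```c
    #include <assert.h>

    /* Stand-in for the real mapping routine; just folds its inputs. */
    static unsigned long do_mmap_like(unsigned long addr, unsigned long len,
                                      unsigned long vm_flags)
    {
        return addr + len + vm_flags;
    }

    /* The removed wrapper: every caller passed vm_flags == 0 through it. */
    static unsigned long do_mmap_pgoff_like(unsigned long addr,
                                            unsigned long len)
    {
        return do_mmap_like(addr, len, 0);
    }

    int main(void)
    {
        /* The wrapper is behaviorally identical to a direct call with 0,
         * which is why it could be removed without functional change. */
        assert(do_mmap_pgoff_like(0x1000, 0x2000) ==
               do_mmap_like(0x1000, 0x2000, 0));
        return 0;
    }
    ```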
     
  • The vm_flags may be changed after call_mmap() because drivers may set
    some flags for their own purpose. As a result, we fail to merge the
    adjacent vma because its vm_flags differ and userspace cannot pass in
    matching ones. Try to merge the vma after call_mmap() to fix this issue.

    Signed-off-by: Hongxiang Lou
    Signed-off-by: Miaohe Lin
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: http://lkml.kernel.org/r/1594954065-23733-1-git-send-email-linmiaohe@huawei.com
    Signed-off-by: Linus Torvalds

    Miaohe Lin
     
  • Look at the pseudo code below. It's very clear that the judgement
    "!is_file_hugepages(file)" at 3) duplicates the one at 1), so we can
    use "else if" to avoid it. And the assignment "retval = -EINVAL" at 2)
    is only needed by branch 3), because "retval" will be overwritten at 4).

    No functional change, but it reduces the code size. Maybe clearer?

    Before:
       text    data     bss     dec     hex filename
      28733    1590       1   30324    7674 mm/mmap.o

    After:
       text    data     bss     dec     hex filename
      28701    1590       1   30292    7654 mm/mmap.o

    ====pseudo code====:
    if (!(flags & MAP_ANONYMOUS)) {
            ...
    1)      if (is_file_hugepages(file))
                    len = ALIGN(len, huge_page_size(hstate_file(file)));
    2)      retval = -EINVAL;
    3)      if (unlikely(flags & MAP_HUGETLB && !is_file_hugepages(file)))
                    goto out_fput;
    } else if (flags & MAP_HUGETLB) {
            ...
    }
    ...

    4) retval = vm_mmap_pgoff(file, addr, len, prot, flags, pgoff);
    out_fput:
    ...
    return retval;

    Signed-off-by: Zhen Lei
    Signed-off-by: Andrew Morton
    Reviewed-by: Andrew Morton
    Link: http://lkml.kernel.org/r/20200705080112.1405-1-thunder.leizhen@huawei.com
    Signed-off-by: Linus Torvalds

    Zhen Lei
     

07 Aug, 2020

1 commit


31 Jul, 2020

1 commit


25 Jul, 2020

2 commits

  • Partial 5.8-rc7 merge to make the final merge easier.

    Signed-off-by: Greg Kroah-Hartman
    Change-Id: I95f1b0a379e3810333300a70c5a93f449d945c54

    Greg Kroah-Hartman
     
  • VMAs with the VM_GROWSDOWN or VM_GROWSUP flag set can change their size
    under mmap_read_lock(). This can lead to a race with __do_munmap():

        Thread A                        Thread B
    __do_munmap()
      detach_vmas_to_be_unmapped()
      mmap_write_downgrade()
                                    expand_downwards()
                                      vma->vm_start = address;
                                      // The VMA now overlaps with
                                      // VMAs detached by Thread A
                                    // page fault populates expanded part
                                    // of the VMA
      unmap_region()
        // Zaps pagetables partly
        // populated by Thread B

    Similar race exists for expand_upwards().

    The fix is to avoid downgrading mmap_lock in __do_munmap() if detached
    VMAs are next to VM_GROWSDOWN or VM_GROWSUP VMA.

    [akpm@linux-foundation.org: s/mmap_sem/mmap_lock/ in comment]

    Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
    Reported-by: Jann Horn
    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Reviewed-by: Yang Shi
    Acked-by: Vlastimil Babka
    Cc: Oleg Nesterov
    Cc: Matthew Wilcox
    Cc: [4.20+]
    Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel.com
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

30 Jun, 2020

1 commit

  • A large process running on a heavily loaded system can encounter the
    following RCU CPU stall warning:

    rcu: INFO: rcu_sched self-detected stall on CPU
    rcu: 3-....: (20998 ticks this GP) idle=4ea/1/0x4000000000000002 softirq=556558/556558 fqs=5190
    (t=21013 jiffies g=1005461 q=132576)
    NMI backtrace for cpu 3
    CPU: 3 PID: 501900 Comm: aio-free-ring-w Kdump: loaded Not tainted 5.2.9-108_fbk12_rc3_3858_gb83b75af7909 #1
    Hardware name: Wiwynn HoneyBadger/PantherPlus, BIOS HBM6.71 02/03/2016
    Call Trace:

    dump_stack+0x46/0x60
    nmi_cpu_backtrace.cold.3+0x13/0x50
    ? lapic_can_unplug_cpu.cold.27+0x34/0x34
    nmi_trigger_cpumask_backtrace+0xba/0xca
    rcu_dump_cpu_stacks+0x99/0xc7
    rcu_sched_clock_irq.cold.87+0x1aa/0x397
    ? tick_sched_do_timer+0x60/0x60
    update_process_times+0x28/0x60
    tick_sched_timer+0x37/0x70
    __hrtimer_run_queues+0xfe/0x270
    hrtimer_interrupt+0xf4/0x210
    smp_apic_timer_interrupt+0x5e/0x120
    apic_timer_interrupt+0xf/0x20

    RIP: 0010:kmem_cache_free+0x223/0x300
    Code: 88 00 00 00 0f 85 ca 00 00 00 41 8b 55 18 31 f6 f7 da 41 f6 45 0a 02 40 0f 94 c6 83 c6 05 9c 41 5e fa e8 a0 a7 01 00 41 56 9d 8b 47 08 a8 03 0f 85 87 00 00 00 65 48 ff 08 e9 3d fe ff ff 65
    RSP: 0018:ffffc9000e8e3da8 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
    RAX: 0000000000020000 RBX: ffff88861b9de960 RCX: 0000000000000030
    RDX: fffffffffffe41e8 RSI: 000060777fe3a100 RDI: 000000000001be18
    RBP: ffffea00186e7780 R08: ffffffffffffffff R09: ffffffffffffffff
    R10: ffff88861b9dea28 R11: ffff88887ffde000 R12: ffffffff81230a1f
    R13: ffff888854684dc0 R14: 0000000000000206 R15: ffff8888547dbc00
    ? remove_vma+0x4f/0x60
    remove_vma+0x4f/0x60
    exit_mmap+0xd6/0x160
    mmput+0x4a/0x110
    do_exit+0x278/0xae0
    ? syscall_trace_enter+0x1d3/0x2b0
    ? handle_mm_fault+0xaa/0x1c0
    do_group_exit+0x3a/0xa0
    __x64_sys_exit_group+0x14/0x20
    do_syscall_64+0x42/0x100
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    And on a PREEMPT=n kernel, the "while (vma)" loop in exit_mmap() can run
    for a very long time given a large process. This commit therefore adds
    a cond_resched() to this loop, providing RCU any needed quiescent states.

    Cc: Andrew Morton
    Cc:
    Reviewed-by: Shakeel Butt
    Reviewed-by: Joel Fernandes (Google)
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
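
    The fix itself is one line of kernel code, but its shape can be
    sketched in user space. All names below are invented, and
    sched_yield() merely stands in for the kernel's cond_resched(): a long
    pointer-chasing teardown loop periodically offers the scheduler a
    chance to run something else.

    ```c
    #include <assert.h>
    #include <sched.h>
    #include <stdlib.h>

    struct node { struct node *next; };

    /* Tear down a long list, yielding periodically so one task cannot
     * monopolize the CPU for the whole walk. */
    static unsigned long free_all(struct node *head)
    {
        unsigned long n = 0;

        while (head) {
            struct node *next = head->next;
            free(head);
            head = next;
            if ((++n & 1023) == 0)
                sched_yield();  /* user-space stand-in for cond_resched() */
        }
        return n;
    }

    int main(void)
    {
        struct node *head = NULL;

        for (int i = 0; i < 5000; i++) {
            struct node *n = malloc(sizeof(*n));
            assert(n);
            n->next = head;
            head = n;
        }
        assert(free_all(head) == 5000);
        return 0;
    }
    ```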
     

24 Jun, 2020

2 commits


10 Jun, 2020

3 commits

  • Convert comments that reference mmap_sem to reference mmap_lock instead.

    [akpm@linux-foundation.org: fix up linux-next leftovers]
    [akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
    [akpm@linux-foundation.org: more linux-next fixups, per Michel]

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Vlastimil Babka
    Reviewed-by: Daniel Jordan
    Cc: Davidlohr Bueso
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Laurent Dufour
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • Convert comments that reference old mmap_sem APIs to reference
    corresponding new mmap locking APIs instead.

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Vlastimil Babka
    Reviewed-by: Davidlohr Bueso
    Reviewed-by: Daniel Jordan
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Laurent Dufour
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-12-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • Rename the mmap_sem field to mmap_lock. Any new uses of this lock should
    now go through the new mmap locking api. The mmap_lock is still
    implemented as a rwsem, though this could change in the future.

    [akpm@linux-foundation.org: fix it for mm-gup-might_lock_readmmap_sem-in-get_user_pages_fast.patch]

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Vlastimil Babka
    Reviewed-by: Davidlohr Bueso
    Reviewed-by: Daniel Jordan
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Laurent Dufour
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-11-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
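
    A user-space sketch of the API idea, with a pthread rwlock standing in
    for the kernel rwsem and the *_like names invented for illustration:
    call sites go through named mm-level helpers, so the underlying lock
    type stays an implementation detail that can change later without
    touching callers.

    ```c
    #include <assert.h>
    #include <pthread.h>

    struct mm_like {
        pthread_rwlock_t mmap_lock;   /* an rwlock today, maybe not later */
    };

    /* The locking API callers use; the lock type hides behind it. */
    static void mmap_read_lock_like(struct mm_like *mm)
    {
        pthread_rwlock_rdlock(&mm->mmap_lock);
    }

    static void mmap_read_unlock_like(struct mm_like *mm)
    {
        pthread_rwlock_unlock(&mm->mmap_lock);
    }

    static void mmap_write_lock_like(struct mm_like *mm)
    {
        pthread_rwlock_wrlock(&mm->mmap_lock);
    }

    static void mmap_write_unlock_like(struct mm_like *mm)
    {
        pthread_rwlock_unlock(&mm->mmap_lock);
    }

    int main(void)
    {
        struct mm_like mm;

        assert(pthread_rwlock_init(&mm.mmap_lock, NULL) == 0);

        /* Readers can share the lock... */
        mmap_read_lock_like(&mm);
        assert(pthread_rwlock_tryrdlock(&mm.mmap_lock) == 0);
        pthread_rwlock_unlock(&mm.mmap_lock);
        /* ...but a writer cannot get in while a reader holds it. */
        assert(pthread_rwlock_trywrlock(&mm.mmap_lock) != 0);
        mmap_read_unlock_like(&mm);

        mmap_write_lock_like(&mm);
        mmap_write_unlock_like(&mm);

        assert(pthread_rwlock_destroy(&mm.mmap_lock) == 0);
        return 0;
    }
    ```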