03 Nov, 2020

14 commits

  • Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler")
    removed various __user annotations from function signatures as part of
    its refactoring.

    It also removed the __user annotation for proc_dohung_task_timeout_secs()
    at its declaration in sched/sysctl.h, but not at its definition in
    kernel/hung_task.c.

    Hence, sparse complains:

    kernel/hung_task.c:271:5: error: symbol 'proc_dohung_task_timeout_secs' redeclared with different type (incompatible argument 3 (different address spaces))

    Adjust the annotation at the definition to match that refactoring. This
    makes sparse happy again and also resolves this warning from sparse:

    kernel/hung_task.c:277:52: warning: incorrect type in argument 3 (different address spaces)
    kernel/hung_task.c:277:52: expected void *
    kernel/hung_task.c:277:52: got void [noderef] __user *buffer

    No functional change. No change in object code.

    Signed-off-by: Lukas Bulwahn
    Signed-off-by: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Tetsuo Handa
    Cc: Al Viro
    Cc: Andrey Ignatov
    Link: https://lkml.kernel.org/r/20201028130541.20320-1-lukas.bulwahn@gmail.com
    Signed-off-by: Linus Torvalds

    Lukas Bulwahn
     
  • Add a test case to ensure an event is observed by at least one poller
    when an epoll timeout is used.

    Signed-off-by: Guantao Liu
    Signed-off-by: Soheil Hassas Yeganeh
    Signed-off-by: Andrew Morton
    Reviewed-by: Eric Dumazet
    Reviewed-by: Khazhismel Kumykov
    Acked-by: Willem de Bruijn
    Cc: Al Viro
    Cc: Davidlohr Bueso
    Link: https://lkml.kernel.org/r/20201028180202.952079-2-soheil.kdev@gmail.com
    Signed-off-by: Linus Torvalds

    Soheil Hassas Yeganeh
     
  • The purpose of io_remap_pfn_range() is to map IO memory, such as a
    memory mapped IO exposed through a PCI BAR. IO devices do not
    understand encryption, so this memory must always be decrypted.
    Automatically call pgprot_decrypted() as part of the generic
    implementation.

    This fixes a bug where enabling AMD SME causes subsystems, such as RDMA,
    using io_remap_pfn_range() to expose BAR pages to user space to fail.
    The CPU will encrypt access to those BAR pages instead of passing
    unencrypted IO directly to the device.

    Places not mapping IO should use remap_pfn_range().

    Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption")
    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Tom Lendacky
    Cc: Thomas Gleixner
    Cc: Andrey Ryabinin
    Cc: Borislav Petkov
    Cc: Brijesh Singh
    Cc: Jonathan Corbet
    Cc: Dmitry Vyukov
    Cc: "Dave Young"
    Cc: Alexander Potapenko
    Cc: Konrad Rzeszutek Wilk
    Cc: Andy Lutomirski
    Cc: Larry Woodman
    Cc: Matt Fleming
    Cc: Ingo Molnar
    Cc: "Michael S. Tsirkin"
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Toshimitsu Kani
    Cc:
    Link: https://lkml.kernel.org/r/0-v1-025d64bdf6c4+e-amd_sme_fix_jgg@nvidia.com
    Signed-off-by: Linus Torvalds

    Jason Gunthorpe
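
    The idea of decrypting inside the generic IO remap helper can be modeled
    in user space as below. This is only a sketch: demo_pgprot_t,
    DEMO_PROT_ENC and the demo_* functions are hypothetical stand-ins, and
    the real encryption bit and page-table handling are CPU-specific.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pgprot_t: a plain flag word. */
typedef struct { uint64_t bits; } demo_pgprot_t;

/* Hypothetical encryption flag; the real bit position depends on the CPU. */
#define DEMO_PROT_ENC (UINT64_C(1) << 47)

/* Model of pgprot_decrypted(): clear the encryption flag. */
static demo_pgprot_t demo_pgprot_decrypted(demo_pgprot_t prot)
{
    prot.bits &= ~DEMO_PROT_ENC;
    return prot;
}

/* Model of the non-IO helper: the protection is used as given. */
static demo_pgprot_t demo_remap_pfn_range(demo_pgprot_t prot)
{
    return prot;
}

/* Model of the IO helper: always decrypt before remapping. */
static demo_pgprot_t demo_io_remap_pfn_range(demo_pgprot_t prot)
{
    return demo_remap_pfn_range(demo_pgprot_decrypted(prot));
}
```

    In this model a caller of the IO helper cannot end up with the
    encryption flag set, while callers of the non-IO helper keep whatever
    protection they passed in.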
     
  • For oom_score_adj values in the range [942,999], the current
    calculations will print 16 for oom_adj. This patch simply limits the
    output so that it is in line with the docs.

    Signed-off-by: Charles Haithcock
    Signed-off-by: Andrew Morton
    Acked-by: Michal Hocko
    Cc: Alexey Dobriyan
    Link: https://lkml.kernel.org/r/20201020165130.33927-1-chaithco@redhat.com
    Signed-off-by: Linus Torvalds

    Charles Haithcock
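
    The arithmetic behind the [942,999] range can be checked with a small
    sketch. It assumes the conversion scales oom_score_adj by 17/1000 using
    integer division, which reproduces the range quoted above; the constant
    values are, to my understanding, the ones from the kernel's oom.h, and
    the function names are hypothetical.

```c
/* Constants assumed to match the kernel's oom.h definitions. */
#define OOM_DISABLE        (-17)
#define OOM_ADJUST_MAX     15
#define OOM_SCORE_ADJ_MAX  1000

/* Legacy scaling without a clamp: 942..999 come out as 16. */
static int demo_oom_adj_unclamped(int oom_score_adj)
{
    if (oom_score_adj == OOM_SCORE_ADJ_MAX)
        return OOM_ADJUST_MAX;
    return (oom_score_adj * -OOM_DISABLE) / OOM_SCORE_ADJ_MAX;
}

/* Same scaling, limited to the documented maximum. */
static int demo_oom_adj_clamped(int oom_score_adj)
{
    int oom_adj = demo_oom_adj_unclamped(oom_score_adj);

    if (oom_adj > OOM_ADJUST_MAX)
        oom_adj = OOM_ADJUST_MAX;
    return oom_adj;
}
```

    For example, 941 * 17 / 1000 truncates to 15, while 942 * 17 / 1000
    truncates to 16, which is where the out-of-range output starts.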
     
  • There is a small race window when a delayed work is being canceled and
    the work still might be queued from the timer_fn:

            CPU0                                  CPU1
    kthread_cancel_delayed_work_sync()
      __kthread_cancel_work_sync()
        __kthread_cancel_work()
          work->canceling++;
                                          kthread_delayed_work_timer_fn()
                                            kthread_insert_work();

    BUG: kthread_insert_work() should not get called when work->canceling is
    set.

    Signed-off-by: Zqiang
    Signed-off-by: Andrew Morton
    Reviewed-by: Petr Mladek
    Acked-by: Tejun Heo
    Cc:
    Link: https://lkml.kernel.org/r/20201014083030.16895-1-qiang.zhang@windriver.com
    Signed-off-by: Linus Torvalds

    Zqiang
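
    One plausible shape for closing this window is to have the timer
    callback check the canceling counter before queueing. The single-threaded
    model below is a sketch under that assumption, not the patch itself; all
    demo_* names are hypothetical, and real code would do the check under
    the worker's lock.

```c
#include <stdbool.h>

struct demo_work {
    int canceling;   /* greater than zero while a cancel is in progress */
    bool queued;     /* set once the work has been inserted */
};

static void demo_insert_work(struct demo_work *w)
{
    w->queued = true;
}

/* Timer callback: queue the work only if nobody is canceling it. */
static void demo_timer_fn(struct demo_work *w)
{
    if (!w->canceling)
        demo_insert_work(w);
}

static void demo_cancel_begin(struct demo_work *w)
{
    w->canceling++;
}

static void demo_cancel_end(struct demo_work *w)
{
    w->canceling--;
}
```

    With the check in place, a timer that fires between demo_cancel_begin()
    and demo_cancel_end() leaves the work unqueued.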
     
  • Fix the following sparse warning:

    mm/truncate.c:531:15: warning: symbol '__invalidate_mapping_pages' was not declared. Should it be static?

    Fixes: eb1d7a65f08a ("mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEED")
    Signed-off-by: Jason Yan
    Signed-off-by: Andrew Morton
    Reviewed-by: Yafang Shao
    Link: https://lkml.kernel.org/r/20201015054808.2445904-1-yanaijie@huawei.com
    Signed-off-by: Linus Torvalds

    Jason Yan
     
  • Commit 4d004099a668 ("lockdep: Fix lockdep recursion") uncovered the
    following issue in lib/crc32test reported on s390:

    BUG: using __this_cpu_read() in preemptible [00000000] code: swapper/0/1
    caller is lockdep_hardirqs_on_prepare+0x48/0x270
    CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.9.0-next-20201015-15164-g03d992bd2de6 #19
    Hardware name: IBM 3906 M04 704 (LPAR)
    Call Trace:
    lockdep_hardirqs_on_prepare+0x48/0x270
    trace_hardirqs_on+0x9c/0x1b8
    crc32_test.isra.0+0x170/0x1c0
    crc32test_init+0x1c/0x40
    do_one_initcall+0x40/0x130
    do_initcalls+0x126/0x150
    kernel_init_freeable+0x1f6/0x230
    kernel_init+0x22/0x150
    ret_from_fork+0x24/0x2c
    no locks held by swapper/0/1.

    Remove the extra local_irq_disable()/local_irq_enable() helper calls.

    Fixes: 5fb7f87408f1 ("lib: add module support to crc32 tests")
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Greg Kroah-Hartman
    Link: https://lkml.kernel.org/r/patch.git-4369da00c06e.your-ad-here.call-01602859837-ext-1679@work.hours
    Signed-off-by: Linus Torvalds

    Vasily Gorbik
     
  • This testcase

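    #include <stdio.h>
    #include <unistd.h>
    #include <signal.h>
    #include <sys/ptrace.h>
    #include <sys/wait.h>
    #include <pthread.h>
    #include <assert.h>

    void *tf(void *arg)
    {
        return NULL;
    }

    int main(void)
    {
        int pid = fork();
        if (!pid) {
            /* Child: stop itself, then create a thread once resumed. */
            kill(getpid(), SIGSTOP);

            pthread_t th;
            pthread_create(&th, NULL, tf, NULL);

            return 0;
        }

        /* Wait until the child has stopped. */
        waitpid(pid, NULL, WSTOPPED);

        /* Attach to the stopped child and ask to trace clone events. */
        ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_TRACECLONE);
        waitpid(pid, NULL, 0);

        ptrace(PTRACE_CONT, pid, 0, 0);
        waitpid(pid, NULL, 0);

        /* The next report must come from the newly cloned thread. */
        int status;
        int thread = waitpid(-1, &status, 0);
        assert(thread > 0 && thread != pid);
        assert(status == 0x80137f);

        return 0;
    }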

    fails and triggers WARN_ON_ONCE(!signr) in do_jobctl_trap().

    This is because task_join_group_stop() has 2 problems when current is traced:

    1. We can't rely on the "JOBCTL_STOP_PENDING" check, a stopped tracee
    can be woken up by debugger and it can clone another thread which
    should join the group-stop.

    We need to check group_stop_count || SIGNAL_STOP_STOPPED.

    2. If SIGNAL_STOP_STOPPED is already set, we should not increment
    sig->group_stop_count and add JOBCTL_STOP_CONSUME. The new thread
    should stop without another do_notify_parent_cldstop() report.

    To clarify, the problem is very old and we should blame
    ptrace_init_task(). But now that we have task_join_group_stop() it makes
    more sense to fix this helper to avoid the code duplication.

    Reported-by: syzbot+3485e3773f7da290eecc@syzkaller.appspotmail.com
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Cc: Jens Axboe
    Cc: Christian Brauner
    Cc: "Eric W . Biederman"
    Cc: Zhiqiang Liu
    Cc: Tejun Heo
    Cc:
    Link: https://lkml.kernel.org/r/20201019134237.GA18810@redhat.com
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
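
    The status value 0x80137f asserted in the testcase packs three fields.
    The sketch below splits it with plain bit arithmetic; the struct and
    function names are hypothetical. On x86 and Arm Linux, signal 19 is
    SIGSTOP, and event 128 corresponds to PTRACE_EVENT_STOP.

```c
/* The three fields of a ptrace stop status. */
struct demo_wait_status {
    int stopped;   /* 1 if the low byte is 0x7f, meaning "stopped" */
    int signo;     /* bits 8..15: the stop signal */
    int event;     /* bits 16..23: the ptrace event, if any */
};

static struct demo_wait_status demo_decode_status(int status)
{
    struct demo_wait_status d;

    d.stopped = (status & 0xff) == 0x7f;
    d.signo = (status >> 8) & 0xff;
    d.event = (status >> 16) & 0xff;
    return d;
}
```

    Decoding 0x80137f this way yields stopped = 1, signo = 0x13 (19) and
    event = 0x80 (128).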
     
  • When the flags passed to queue_pages_pte_range() have neither the
    MPOL_MF_MOVE nor the MPOL_MF_MOVE_ALL bit set, the loop breaks early,
    and passing the original pte - 1 to pte_unmap_unlock() is not a good idea.

    queue_pages_pte_range() can run in MPOL_MF_MOVE_ALL mode which doesn't
    migrate misplaced pages but returns with EIO when encountering such a
    page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return
    -EIO when MPOL_MF_STRICT is specified"), an early break on the first pte
    in the range results in pte_unmap_unlock() on an underflowed pte. This can
    lead to lockups later on when somebody tries to lock the pte, or
    respectively the page_table_lock, again.

    Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
    Signed-off-by: Shijie Luo
    Signed-off-by: Miaohe Lin
    Signed-off-by: Andrew Morton
    Reviewed-by: Oscar Salvador
    Acked-by: Michal Hocko
    Cc: Miaohe Lin
    Cc: Feilong Lin
    Cc: Shijie Luo
    Cc:
    Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com
    Signed-off-by: Linus Torvalds

    Shijie Luo
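
    The underflow can be modeled with array indices instead of pte pointers.
    This is a sketch only: the demo_* functions are hypothetical, index 0
    stands for the first pte that was mapped, and the returned index stands
    for the pointer that would be handed to pte_unmap_unlock(). Remembering
    the mapping start is one way to avoid the problem and is assumed here,
    not quoted from the patch.

```c
#include <stddef.h>

/* Unlock with "cursor - 1": wrong when the loop breaks on entry 0. */
static ptrdiff_t demo_unlock_index_buggy(const int *misplaced, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (misplaced[i])
            break;   /* early exit: the cursor was not advanced */
    }
    return (ptrdiff_t)i - 1;
}

/* Unlock with the remembered start of the mapping instead. */
static ptrdiff_t demo_unlock_index_fixed(const int *misplaced, size_t n)
{
    const size_t mapped = 0;   /* remembered when the range was mapped */
    size_t i;

    for (i = 0; i < n; i++) {
        if (misplaced[i])
            break;
    }
    return (ptrdiff_t)mapped;
}
```

    When the first entry is misplaced, the buggy variant returns -1, an
    index in front of the mapped range, while the other variant always
    returns 0.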
     
  • Now that we have KASAN-KUNIT tests integration, it's easy to see that
    some KASAN tests are not adapted to the SW_TAGS mode and are failing.

    Adjust the allocation size for kasan_memchr() and kasan_memcmp() by
    rounding it up to OOB_TAG_OFF so the bad access ends up in a separate
    memory granule.

    Add a new kmalloc_uaf_16() test that relies on UAF, and a new
    kasan_bitops_tags() test that is tailored to tag-based mode, as it's
    hard to adapt the existing kmalloc_oob_16() and kasan_bitops_generic()
    (renamed from kasan_bitops()) without losing the precision.

    Add new kmalloc_uaf_16() and kasan_bitops_uaf() tests that rely on UAFs,
    as it's hard to adapt the existing kmalloc_oob_16() and
    kasan_bitops_oob() (renamed from kasan_bitops()) without losing the
    precision.

    Disable kasan_global_oob() and kasan_alloca_oob_left/right() as SW_TAGS
    mode doesn't instrument globals nor dynamic allocas.

    Signed-off-by: Andrey Konovalov
    Signed-off-by: Andrew Morton
    Tested-by: David Gow
    Link: https://lkml.kernel.org/r/76eee17b6531ca8b3ca92b240cb2fd23204aaff7.1603129942.git.andreyknvl@google.com
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
     
  • Richard reported a warning which can be reproduced by running the LTP
    madvise6 test (cgroup v1 in the non-hierarchical mode should be used):

    WARNING: CPU: 0 PID: 12 at mm/page_counter.c:57 page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
    Modules linked in:
    CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.9.0-rc7-22-default #77
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812d-rebuilt.opensuse.org 04/01/2014
    Workqueue: events drain_local_stock
    RIP: 0010:page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
    Call Trace:
    __memcg_kmem_uncharge (mm/memcontrol.c:3022)
    drain_obj_stock (./include/linux/rcupdate.h:689 mm/memcontrol.c:3114)
    drain_local_stock (mm/memcontrol.c:2255)
    process_one_work (./arch/x86/include/asm/jump_label.h:25 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:108 kernel/workqueue.c:2274)
    worker_thread (./include/linux/list.h:282 kernel/workqueue.c:2416)
    kthread (kernel/kthread.c:292)
    ret_from_fork (arch/x86/entry/entry_64.S:300)

    The problem occurs because in the non-hierarchical mode non-root page
    counters are not linked to root page counters, so the charge is not
    propagated to the root memory cgroup.

    After the removal of the original memory cgroup and reparenting of the
    object cgroup, the root cgroup might be uncharged by draining an objcg
    stock, for example. It leads to an eventual underflow of the charge and
    triggers a warning.

    Fix it by linking all page counters to corresponding root page counters
    in the non-hierarchical mode.

    Please note that in the non-hierarchical mode all objcgs are always
    reparented to the root memory cgroup, even if the hierarchy has more
    than 1 level. This patch doesn't change it.

    The patch also doesn't affect how the hierarchical mode is working,
    which is the only sane and truly supported mode now.

    Thanks to Richard for reporting, debugging and providing an alternative
    version of the fix!

    Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
    Reported-by:
    Signed-off-by: Roman Gushchin
    Signed-off-by: Andrew Morton
    Reviewed-by: Shakeel Butt
    Reviewed-by: Michal Koutný
    Acked-by: Johannes Weiner
    Cc: Michal Hocko
    Cc:
    Link: https://lkml.kernel.org/r/20201026231326.3212225-1-guro@fb.com
    Debugged-by: Richard Palethorpe
    Signed-off-by: Linus Torvalds

    Roman Gushchin
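
    The underflow described above can be reproduced with a toy model of
    linked counters. This is a sketch with hypothetical names (demo_counter,
    demo_charge, demo_uncharge); the real page_counter code also handles
    limits and atomicity, which are left out here.

```c
#include <stddef.h>

struct demo_counter {
    long usage;
    struct demo_counter *parent;   /* NULL when nothing is linked above */
};

/* Charge this counter and every ancestor linked above it. */
static void demo_charge(struct demo_counter *c, long n)
{
    for (; c; c = c->parent)
        c->usage += n;
}

/* Uncharge likewise; returns 1 if any counter went negative. */
static int demo_uncharge(struct demo_counter *c, long n)
{
    int underflow = 0;

    for (; c; c = c->parent) {
        c->usage -= n;
        if (c->usage < 0)
            underflow = 1;
    }
    return underflow;
}
```

    If a child counter is not linked to the root, a charge on the child
    never reaches the root, so a later uncharge applied to the root (as
    happens after reparenting) drives it below zero. With the link in
    place, the same sequence balances out.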
     
  • memcg_page_state() returns the specified counter for the hierarchical
    memcg. If the item is NR_ANON_THPS, the value counts huge pages, so it
    should be multiplied by HPAGE_PMD_NR rather than treated as single pages.

    [akpm@linux-foundation.org: fix printk warning]
    [akpm@linux-foundation.org: use u64 cast, per Michal]

    Fixes: 468c398233da ("mm: memcontrol: switch to native NR_ANON_THPS counter")
    Signed-off-by: zhongjiang-ali
    Signed-off-by: Andrew Morton
    Acked-by: Johannes Weiner
    Acked-by: Michal Hocko
    Link: https://lkml.kernel.org/r/1603722395-72443-1-git-send-email-zhongjiang-ali@linux.alibaba.com
    Signed-off-by: Linus Torvalds

    zhongjiang-ali
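
    As a worked example of the unit conversion, the sketch below turns a raw
    counter into bytes. The values assume 4 KiB base pages and 2 MiB huge
    pages (512 base pages each), and every demo_* or DEMO_* name is
    hypothetical.

```c
#include <stdint.h>

/* Assumed configuration: 4 KiB pages, 2 MiB PMD-sized huge pages. */
#define DEMO_PAGE_SIZE     UINT64_C(4096)
#define DEMO_HPAGE_PMD_NR  UINT64_C(512)

enum demo_item { DEMO_NR_FILE_PAGES, DEMO_NR_ANON_THPS };

/* Convert a raw counter value to bytes, widening to 64 bits first. */
static uint64_t demo_stat_bytes(enum demo_item item, unsigned long raw)
{
    uint64_t nr = raw;

    /* A huge-page count must be scaled to base pages first. */
    if (item == DEMO_NR_ANON_THPS)
        nr *= DEMO_HPAGE_PMD_NR;
    return nr * DEMO_PAGE_SIZE;
}
```

    Under these assumptions, three anonymous huge pages come out as
    3 * 512 * 4096 = 6291456 bytes, whereas three ordinary pages are
    12288 bytes.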
     
  • Michal Privoznik was using "free page reporting" in QEMU/virtio-balloon
    with hugetlbfs and hit the warning below. QEMU with free page hinting
    uses fallocate(FALLOC_FL_PUNCH_HOLE) to discard pages that are reported
    as free by a VM. Reporting is done at pageblock granularity.
    So when the guest reports 2M chunks, we fallocate(FALLOC_FL_PUNCH_HOLE)
    one huge page in QEMU.

    WARNING: CPU: 7 PID: 6636 at mm/page_counter.c:57 page_counter_uncharge+0x4b/0x50
    Modules linked in: ...
    CPU: 7 PID: 6636 Comm: qemu-system-x86 Not tainted 5.9.0 #137
    Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO/X570 AORUS PRO, BIOS F21 07/31/2020
    RIP: 0010:page_counter_uncharge+0x4b/0x50
    ...
    Call Trace:
    hugetlb_cgroup_uncharge_file_region+0x4b/0x80
    region_del+0x1d3/0x300
    hugetlb_unreserve_pages+0x39/0xb0
    remove_inode_hugepages+0x1a8/0x3d0
    hugetlbfs_fallocate+0x3c4/0x5c0
    vfs_fallocate+0x146/0x290
    __x64_sys_fallocate+0x3e/0x70
    do_syscall_64+0x33/0x40
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Investigation of the issue uncovered bugs in hugetlb cgroup reservation
    accounting. This patch addresses the found issues.

    Fixes: 075a61d07a8e ("hugetlb_cgroup: add accounting for shared mappings")
    Reported-by: Michal Privoznik
    Co-developed-by: David Hildenbrand
    Signed-off-by: David Hildenbrand
    Signed-off-by: Mike Kravetz
    Signed-off-by: Andrew Morton
    Tested-by: Michal Privoznik
    Reviewed-by: Mina Almasry
    Acked-by: Michael S. Tsirkin
    Cc:
    Cc: David Hildenbrand
    Cc: Michal Hocko
    Cc: Muchun Song
    Cc: "Aneesh Kumar K . V"
    Cc: Tejun Heo
    Link: https://lkml.kernel.org/r/20201021204426.36069-1-mike.kravetz@oracle.com
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     
  • Commit 6f42193fd86e ("memremap: don't use a separate devm action for
    devmap_managed_enable_get") changed the static key updates such that we
    now call devmap_managed_enable_put() without doing the equivalent
    devmap_managed_enable_get().

    devmap_managed_enable_get() is only called for MEMORY_DEVICE_PRIVATE and
    MEMORY_DEVICE_FS_DAX, but memunmap_pages() gets called for other pgmap
    types too. This results in the warning below when switching between
    system-ram and devdax mode for a devdax namespace.

    jump label: negative count!
    WARNING: CPU: 52 PID: 1335 at kernel/jump_label.c:235 static_key_slow_try_dec+0x88/0xa0
    Modules linked in:
    ....

    NIP static_key_slow_try_dec+0x88/0xa0
    LR static_key_slow_try_dec+0x84/0xa0
    Call Trace:
    static_key_slow_try_dec+0x84/0xa0
    __static_key_slow_dec_cpuslocked+0x34/0xd0
    static_key_slow_dec+0x54/0xf0
    memunmap_pages+0x36c/0x500
    devm_action_release+0x30/0x50
    release_nodes+0x2f4/0x3e0
    device_release_driver_internal+0x17c/0x280
    bus_remove_device+0x124/0x210
    device_del+0x1d4/0x530
    unregister_dev_dax+0x48/0xe0
    devm_action_release+0x30/0x50
    release_nodes+0x2f4/0x3e0
    device_release_driver_internal+0x17c/0x280
    unbind_store+0x130/0x170
    drv_attr_store+0x40/0x60
    sysfs_kf_write+0x6c/0xb0
    kernfs_fop_write+0x118/0x280
    vfs_write+0xe8/0x2a0
    ksys_write+0x84/0x140
    system_call_exception+0x120/0x270
    system_call_common+0xf0/0x27c

    Reported-by: Aneesh Kumar K.V
    Signed-off-by: Ralph Campbell
    Signed-off-by: Andrew Morton
    Tested-by: Sachin Sant
    Reviewed-by: Aneesh Kumar K.V
    Reviewed-by: Ira Weiny
    Reviewed-by: Christoph Hellwig
    Cc: Dan Williams
    Cc: Jason Gunthorpe
    Link: https://lkml.kernel.org/r/20201023183222.13186-1-rcampbell@nvidia.com
    Signed-off-by: Linus Torvalds

    Ralph Campbell
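
    The imbalance can be pictured with a counter that stands in for the
    static key. In this sketch (hypothetical demo_* names, not the kernel
    code) both the get and the put side apply the same type check, which is
    one way to keep the count from going negative.

```c
enum demo_type { DEMO_PRIVATE, DEMO_FS_DAX, DEMO_GENERIC };

/* Stand-in for the static key's enable count. */
static int demo_key_count;

static int demo_type_is_managed(enum demo_type t)
{
    return t == DEMO_PRIVATE || t == DEMO_FS_DAX;
}

/* Take a reference only for the managed types. */
static void demo_enable_get(enum demo_type t)
{
    if (demo_type_is_managed(t))
        demo_key_count++;
}

/* Drop a reference under exactly the same condition. */
static void demo_enable_put(enum demo_type t)
{
    if (demo_type_is_managed(t))
        demo_key_count--;
}
```

    A get/put pair on an unmanaged type leaves the count untouched, and a
    pair on a managed type returns it to zero.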
     

02 Nov, 2020

11 commits

  • Linus Torvalds
     
  • Pull x86 fixes from Thomas Gleixner:
    "Three fixes all related to #DB:

    - Handle the BTF bit correctly so it doesn't get lost due to a kernel
    #DB

    - Only clear and set the virtual DR6 value used by ptrace on user
    space triggered #DB. A kernel #DB must leave it alone to ensure
    data consistency for ptrace.

    - Make the bitmasking of the virtual DR6 storage correct so it does
    not lose DR_STEP"

    * tag 'x86-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/debug: Fix DR_STEP vs ptrace_get_debugreg(6)
    x86/debug: Only clear/set ->virtual_dr6 for userspace #DB
    x86/debug: Fix BTF handling

    Linus Torvalds
     
  • Pull timer fixes from Thomas Gleixner:
    "A few fixes for timers/timekeeping:

    - Prevent undefined behaviour in the timespec64_to_ns() conversion
    which is used for converting user supplied time input to
    nanoseconds. It lacked overflow protection.

    - Mark sched_clock_read_begin/retry() to prevent recursion in the
    tracer

    - Remove unused debug functions in the hrtimer and timerlist code"

    * tag 'timers-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    time: Prevent undefined behaviour in timespec64_to_ns()
    timers: Remove unused inline funtion debug_timer_free()
    hrtimer: Remove unused inline function debug_hrtimer_free()
    time/sched_clock: Mark sched_clock_read_begin/retry() as notrace

    Linus Torvalds
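
    The overflow protection mentioned above for the seconds-to-nanoseconds
    conversion can be sketched by saturating before the multiplication. The
    names below are hypothetical, the limits are simply the int64_t range,
    and tv_nsec is assumed to lie in [0, 999999999].

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC  INT64_C(1000000000)
#define DEMO_SEC_MAX       (INT64_MAX / DEMO_NSEC_PER_SEC)
#define DEMO_SEC_MIN       (INT64_MIN / DEMO_NSEC_PER_SEC)

struct demo_timespec64 {
    int64_t tv_sec;
    long tv_nsec;   /* assumed to be within [0, 999999999] */
};

/* Convert to nanoseconds, saturating instead of overflowing. */
static int64_t demo_timespec64_to_ns(const struct demo_timespec64 *ts)
{
    if (ts->tv_sec >= DEMO_SEC_MAX)
        return INT64_MAX;
    if (ts->tv_sec <= DEMO_SEC_MIN)
        return INT64_MIN;
    return ts->tv_sec * DEMO_NSEC_PER_SEC + ts->tv_nsec;
}
```

    Because the range checks run first, the multiplication only ever sees
    second counts whose product, plus the nanosecond part, fits in 64 bits.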
     
  • Pull smp fix from Thomas Gleixner:
    "A single fix for stop machine.

    Mark functions notrace to prevent a crash caused by recursion when
    enabling or disabling a tracer on RISC-V (probably all architectures
    which patch through stop machine)"

    * tag 'smp-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    stop_machine, rcu: Mark functions as notrace

    Linus Torvalds
     
  • Pull locking fixes from Thomas Gleixner:
    "A couple of locking fixes:

    - Fix incorrect failure injection handling in the futex code

    - Prevent a preemption warning in lockdep when tracking
    local_irq_enable() and interrupts are already enabled

    - Remove more raw_cpu_read() usage from lockdep which causes state
    corruption on !X86 architectures.

    - Make the nr_unused_locks accounting in lockdep correct again"

    * tag 'locking-urgent-2020-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    lockdep: Fix nr_unused_locks accounting
    locking/lockdep: Remove more raw_cpu_read() usage
    futex: Fix incorrect should_fail_futex() handling
    lockdep: Fix preemption WARN for spurious IRQ-enable

    Linus Torvalds
     
  • Pull char/misc fixes/removals from Greg KH:
    "Here's some small fixes for 5.10-rc2 and a big driver removal.

    The fixes are for some reported issues in the interconnect and
    coresight drivers, nothing major.

    The "big" driver removal is the MIC drivers have been asked to be
    removed as the hardware never shipped and Intel no longer wants to
    maintain something that no one can use. This is welcomed by many as
    the DMA usage of these drivers was "interesting" and the security
    people were starting to question some issues that were starting to be
    found in the codebase.

    Note, one of the subsystems for this driver, the "VOP" code, will
    probably come back in future kernel versions as it was looking to
    potentially solve some PCIe virtualization issues that a number of
    other vendors were wanting to solve. But as-is, this codebase didn't
    work for anyone else so no actual functionality is being removed.

    All of these have been in linux-next with no reported issues"

    * tag 'char-misc-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
    coresight: cti: Initialize dynamic sysfs attributes
    coresight: Fix uninitialised pointer bug in etm_setup_aux()
    coresight: add module license
    misc: mic: remove the MIC drivers
    interconnect: qcom: use icc_sync state for sm8[12]50
    interconnect: qcom: Ensure that the floor bandwidth value is enforced
    interconnect: qcom: sc7180: Init BCMs before creating the nodes
    interconnect: qcom: sdm845: Init BCMs before creating the nodes
    interconnect: Aggregate before setting initial bandwidth
    interconnect: qcom: sdm845: Enable keepalive for the MM1 BCM

    Linus Torvalds
     
  • Pull driver core and documentation fixes from Greg KH:
    "Here is one tiny debugfs change to fix up an API where the last user
    was successfully fixed up in 5.10-rc1 (so it couldn't be merged
    earlier), and a much larger Documentation/ABI/ update to the files so
    they can be automatically parsed by our tools.

    The Documentation/ABI/ updates are just formatting issues, small ones
    to bring the files into parsable format, and have been acked by
    numerous subsystem maintainers and the documentation maintainer. I
    figured it was good to get this into 5.10-rc2 to help with the merge
    issues that would arise if these were to stick in linux-next until
    5.11-rc1.

    The debugfs change has been in linux-next for a long time, and the
    Documentation updates only for the last linux-next release"

    * tag 'driver-core-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (40 commits)
    scripts: get_abi.pl: assume ReST format by default
    docs: ABI: sysfs-class-led-trigger-pattern: remove hw_pattern duplication
    docs: ABI: sysfs-class-backlight: unify ABI documentation
    docs: ABI: sysfs-c2port: remove a duplicated entry
    docs: ABI: sysfs-class-power: unify duplicated properties
    docs: ABI: unify /sys/class/leds//brightness documentation
    docs: ABI: stable: remove a duplicated documentation
    docs: ABI: change read/write attributes
    docs: ABI: cleanup several ABI documents
    docs: ABI: sysfs-bus-nvdimm: use the right format for ABI
    docs: ABI: vdso: use the right format for ABI
    docs: ABI: fix syntax to be parsed using ReST notation
    docs: ABI: convert testing/configfs-acpi to ReST
    docs: Kconfig/Makefile: add a check for broken ABI files
    docs: abi-testing.rst: enable --rst-sources when building docs
    docs: ABI: don't escape ReST-incompatible chars from obsolete and removed
    docs: ABI: create a 2-depth index for ABI
    docs: ABI: make it parse ABI/stable as ReST-compatible files
    docs: ABI: sysfs-uevent: make it compatible with ReST output
    docs: ABI: testing: make the files compatible with ReST output
    ...

    Linus Torvalds
     
  • Pull staging driver fixes from Greg KH:
    "Here are some small staging driver fixes for issues that have been
    reported in 5.10-rc1:

    - octeon driver fixes

    - wfx driver fixes

    - memory leak fix in vchiq driver

    - fieldbus driver bugfix

    - comedi driver bugfix

    All of these have been in linux-next with no reported issues"

    * tag 'staging-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
    staging: fieldbus: anybuss: jump to correct label in an error path
    staging: wfx: fix test on return value of gpiod_get_value()
    staging: wfx: fix use of uninitialized pointer
    staging: mmal-vchiq: Fix memory leak for vchiq_instance
    staging: comedi: cb_pcidas: Allow 2-channel commands for AO subdevice
    staging: octeon: Drop on uncorrectable alignment or FCS error
    staging: octeon: repair "fixed-link" support

    Linus Torvalds
     
  • Pull tty/serial fixes from Greg KH:
    "Here are some small TTY and Serial driver fixes for reported issues
    for 5.10-rc2. They include:

    - vt ioctl bugfix for reported problems

    - fsl_lpuart serial driver fix

    - 21285 serial driver bugfix

    All have been in linux-next with no reported issues"

    * tag 'tty-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
    vt_ioctl: fix GIO_UNIMAP regression
    vt: keyboard, extend func_buf_lock to readers
    vt: keyboard, simplify vt_kdgkbsent
    tty: serial: fsl_lpuart: LS1021A has a FIFO size of 16 words, like LS1028A
    tty: serial: 21285: fix lockup on open

    Linus Torvalds
     
  • Pull USB driver fixes from Greg KH:
    "Here are a number of small bugfixes for reported issues in some USB
    drivers. They include:

    - typec bugfixes

    - xhci bugfixes and lockdep warning fixes

    - cdc-acm driver regression fix

    - kernel doc fixes

    - cdns3 driver bugfixes for a bunch of reported issues

    - other tiny USB driver fixes

    All have been in linux-next with no reported issues"

    * tag 'usb-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
    usb: cdns3: gadget: own the lock wrongly at the suspend routine
    usb: cdns3: Fix on-chip memory overflow issue
    usb: cdns3: gadget: suspicious implicit sign extension
    xhci: Don't create stream debugfs files with spinlock held.
    usb: xhci: Workaround for S3 issue on AMD SNPS 3.0 xHC
    xhci: Fix sizeof() mismatch
    usb: typec: stusb160x: fix signedness comparison issue with enum variables
    usb: typec: add missing MODULE_DEVICE_TABLE() to stusb160x
    USB: apple-mfi-fastcharge: don't probe unhandled devices
    usbcore: Check both id_table and match() when both available
    usb: host: ehci-tegra: Fix error handling in tegra_ehci_probe()
    usb: typec: stusb160x: fix an IS_ERR() vs NULL check in probe
    usb: typec: tcpm: reset hard_reset_count for any disconnect
    usb: cdc-acm: fix cooldown mechanism
    usb: host: fsl-mph-dr-of: check return of dma_set_mask()
    usb: fix kernel-doc markups
    usb: typec: stusb160x: fix some signedness bugs
    usb: cdns3: Variable 'length' set but not used

    Linus Torvalds
     
  • Pull kvm fixes from Paolo Bonzini:
    "ARM:
    - selftest fix
    - force PTE mapping on device pages provided via VFIO
    - fix detection of cacheable mapping at S2
    - fallback to PMD/PTE mappings for composite huge pages
    - fix accounting of Stage-2 PGD allocation
    - fix AArch32 handling of some of the debug registers
    - simplify host HYP entry
    - fix stray pointer conversion on nVHE TLB invalidation
    - fix initialization of the nVHE code
    - simplify handling of capabilities exposed to HYP
    - nuke VCPUs caught using a forbidden AArch32 EL0

    x86:
    - new nested virtualization selftest
    - miscellaneous fixes
    - make W=1 fixes
    - reserve new CPUID bit in the KVM leaves"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: vmx: remove unused variable
    KVM: selftests: Don't require THP to run tests
    KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
    KVM: selftests: test behavior of unmapped L2 APIC-access address
    KVM: x86: Fix NULL dereference at kvm_msr_ignored_check()
    KVM: x86: replace static const variables with macros
    KVM: arm64: Handle Asymmetric AArch32 systems
    arm64: cpufeature: upgrade hyp caps to final
    arm64: cpufeature: reorder cpus_have_{const, final}_cap()
    KVM: arm64: Factor out is_{vhe,nvhe}_hyp_code()
    KVM: arm64: Force PTE mapping on fault resulting in a device mapping
    KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes
    KVM: arm64: Fix masks in stage2_pte_cacheable()
    KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR
    KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT
    KVM: arm64: Drop useless PAN setting on host EL1 to EL2 transition
    KVM: arm64: Remove leftover kern_hyp_va() in nVHE TLB invalidation
    KVM: arm64: Don't corrupt tpidr_el2 on failed HVC call
    x86/kvm: Reserve KVM_FEATURE_MSI_EXT_DEST_ID

    Linus Torvalds
     

01 Nov, 2020

4 commits

  • Pull vhost fixes from Michael Tsirkin:
    "Fixes all over the place.

    A new UAPI is borderline: can also be considered a new feature but
    also seems to be the only way we could come up with to fix addressing
    for userspace - and it seems important to switch to it now before
    userspace making assumptions about addressing ability of devices is
    set in stone"

    * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
    vdpasim: allow to assign a MAC address
    vdpasim: fix MAC address configuration
    vdpa: handle irq bypass register failure case
    vdpa_sim: Fix DMA mask
    Revert "vhost-vdpa: fix page pinning leakage in error path"
    vdpa/mlx5: Fix error return in map_direct_mr()
    vhost_vdpa: Return -EFAULT if copy_from_user() fails
    vdpa_sim: implement get_iova_range()
    vhost: vdpa: report iova range
    vdpa: introduce config op to get valid iova range

    Linus Torvalds
     
  • …linux/kernel/git/gustavoars/linux

    Pull more flexible-array member conversions from Gustavo A. R. Silva:
    "Replace zero-length arrays with flexible-array members"

    * tag 'flexible-array-conversions-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
    printk: ringbuffer: Replace zero-length array with flexible-array member
    net/smc: Replace zero-length array with flexible-array member
    net/mlx5: Replace zero-length array with flexible-array member
    mei: hw: Replace zero-length array with flexible-array member
    gve: Replace zero-length array with flexible-array member
    Bluetooth: btintel: Replace zero-length array with flexible-array member
    scsi: target: tcmu: Replace zero-length array with flexible-array member
    ima: Replace zero-length array with flexible-array member
    enetc: Replace zero-length array with flexible-array member
    fs: Replace zero-length array with flexible-array member
    Bluetooth: Replace zero-length array with flexible-array member
    params: Replace zero-length array with flexible-array member
    tracepoint: Replace zero-length array with flexible-array member
    platform/chrome: cros_ec_proto: Replace zero-length array with flexible-array member
    platform/chrome: cros_ec_commands: Replace zero-length array with flexible-array member
    mailbox: zynqmp-ipi-message: Replace zero-length array with flexible-array member
    dmaengine: ti-cppi5: Replace zero-length array with flexible-array member

    Linus Torvalds
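
    For readers unfamiliar with the conversion: a zero-length array such as
    data[0] is a GNU extension, while a flexible-array member data[] is
    standard C99. The sketch below shows the standard form; struct demo_msg
    and demo_msg_new() are hypothetical and not taken from any of the
    converted files.

```c
#include <stdlib.h>
#include <string.h>

struct demo_msg {
    size_t len;
    unsigned char data[];   /* flexible-array member: last, no size given */
};

/* Allocate the header plus len payload bytes in one block. */
static struct demo_msg *demo_msg_new(const void *src, size_t len)
{
    struct demo_msg *m = malloc(sizeof(*m) + len);

    if (!m)
        return NULL;
    m->len = len;
    memcpy(m->data, src, len);
    return m;
}
```

    The caller releases the whole object with a single free(). Since the
    member carries no size of its own, the allocation must account for the
    payload explicitly, as sizeof(*m) + len does here.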
     
  • Pull dma-mapping fix from Christoph Hellwig:
    "Fix an integer overflow on 32-bit platforms in the new DMA range code
    (Geert Uytterhoeven)"

    * tag 'dma-mapping-5.10-2' of git://git.infradead.org/users/hch/dma-mapping:
    dma-mapping: fix 32-bit overflow with CONFIG_ARM_LPAE=n

    Linus Torvalds
     
  • Pull SCSI fixes from James Bottomley:
    "Four driver fixes and one core fix.

    The core fix closes a race window where we could kick off a second
    asynchronous scan because the test and set of the variable preventing
    it isn't atomic"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    scsi: hisi_sas: Stop using queue #0 always for v2 hw
    scsi: ibmvscsi: Fix potential race after loss of transport
    scsi: mptfusion: Fix null pointer dereferences in mptscsih_remove()
    scsi: qla2xxx: Return EBUSY on fcport deletion
    scsi: core: Don't start concurrent async scan on same host

    Linus Torvalds
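
The race described in the SCSI pull above comes from checking a flag and setting it in two separate steps. A minimal user-space sketch of one way to close such a window uses a C11 atomic test-and-set. The `struct host` type and the helper names are hypothetical, and the actual kernel patch may well serialize the check with a lock instead:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a host object; not the SCSI core's real type. */
struct host {
    atomic_flag scan_in_progress;   /* set while an async scan is running */
};

#define HOST_INIT { ATOMIC_FLAG_INIT }

/*
 * Returns true if the caller won the right to start a scan. The test and
 * the set happen as one atomic operation, so two racing callers cannot
 * both observe "no scan running".
 */
static bool host_try_begin_scan(struct host *h)
{
    return !atomic_flag_test_and_set(&h->scan_in_progress);
}

/* Marks the scan as finished so that a later scan may start. */
static void host_end_scan(struct host *h)
{
    atomic_flag_clear(&h->scan_in_progress);
}
```

A caller that gets `false` from `host_try_begin_scan()` would fall back to whatever the non-concurrent path is, for example waiting for the running scan to finish.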
     

31 Oct, 2020

11 commits

  • Reported-by: kernel test robot
    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
     
  • Unless we want to test with THP, we shouldn't require it to be
    configured by the host kernel. Unfortunately, even advising with
    MADV_NOHUGEPAGE requires THP, so check for THP first in order
    to avoid madvise() failing with EINVAL.

    Signed-off-by: Andrew Jones
    Signed-off-by: Paolo Bonzini

    Andrew Jones
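
The "check for THP first" idea can be sketched in user space as follows. This is an illustration under assumptions, not the selftest's code: the helper names are hypothetical, and it assumes that the sysfs directory `/sys/kernel/mm/transparent_hugepage` is present exactly when the kernel was built with THP support.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Assumed marker: this sysfs directory exists only on THP-enabled kernels. */
#define THP_SYSFS_DIR "/sys/kernel/mm/transparent_hugepage"

/* Returns 1 if THP appears to be configured, 0 if not, -1 on other errors. */
static int thp_configured(void)
{
    struct stat st;

    if (stat(THP_SYSFS_DIR, &st) == 0)
        return 1;
    return errno == ENOENT ? 0 : -1;
}

/*
 * Hypothetical helper: advise against huge pages only when THP exists, so
 * madvise() is never issued on a kernel where it would fail with EINVAL.
 * Returns 0 on success or when there is nothing to do, -1 on error.
 */
static int advise_no_hugepage(void *addr, size_t len)
{
    int thp = thp_configured();

    if (thp <= 0)
        return thp;     /* 0: THP absent, skip; -1: stat() failed */
    return madvise(addr, len, MADV_NOHUGEPAGE);
}
```

A test that explicitly wants THP would skip this guard and treat a missing THP configuration as a hard failure.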
     
  • It was noticed that evmcs_sanitize_exec_ctrls() is no longer being
    executed, even though the code checking the 'enable_evmcs' static key
    looks correct. It turns out that static keys don't work in an '__init'
    section (it is unclear when this changed), but setup_vmcs_config() is
    called only once per CPU, so we don't really need a static key there.
    Switch to checking 'enlightened_vmcs' instead; it is supposed to be in
    sync with 'enable_evmcs'.

    Opportunistically make evmcs_sanitize_exec_ctrls() '__init' and drop
    the unneeded extra newline from it.

    Reported-by: Yang Weijiang
    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: Paolo Bonzini

    Vitaly Kuznetsov
     
  • Add a regression test for commit 671ddc700fd0 ("KVM: nVMX: Don't leak
    L1 MMIO regions to L2").

    First, check to see that an L2 guest can be launched with a valid
    APIC-access address that is backed by a page of L1 physical memory.

    Next, set the APIC-access address to a (valid) L1 physical address
    that is not backed by memory. KVM can't handle this situation, so
    resuming L2 should result in a KVM exit for internal error
    (emulation).

    Signed-off-by: Jim Mattson
    Reviewed-by: Ricardo Koller
    Reviewed-by: Peter Shier
    Signed-off-by: Paolo Bonzini

    Jim Mattson
     
  • Pull block fixes from Jens Axboe:

    - null_blk zone fixes (Damien, Kanchan)

    - NVMe pull request from Christoph:
        - improve zone revalidation (Keith Busch)
        - gracefully handle zero length messages in nvme-rdma (zhenwei pi)
        - nvme-fc error handling fixes (James Smart)
        - nvmet tracing NULL pointer dereference fix (Chaitanya Kulkarni)

    - xsysace platform fixes (Andy)

    - scatterlist type cleanup (David)

    - blk-cgroup memory fixes (Gabriel)

    - nbd block size update fix (Ming)

    - Flush completion state fix (Ming)

    - bio_add_hw_page() iteration fix (Naohiro)

    * tag 'block-5.10-2020-10-30' of git://git.kernel.dk/linux-block:
    blk-mq: mark flush request as IDLE in flush_end_io()
    lib/scatterlist: use consistent sg_copy_buffer() return type
    xsysace: use platform_get_resource() and platform_get_irq_optional()
    null_blk: Fix locking in zoned mode
    null_blk: Fix zone reset all tracing
    nbd: don't update block size after device is started
    block: advance iov_iter on bio_add_hw_page failure
    null_blk: synchronization fix for zoned device
    nvmet: fix a NULL pointer dereference when tracing the flush command
    nvme-fc: remove nvme_fc_terminate_io()
    nvme-fc: eliminate terminate_io use by nvme_fc_error_recovery
    nvme-fc: remove err_work work item
    nvme-fc: track error_recovery while connecting
    nvme-rdma: handle unexpected nvme completion data length
    nvme: ignore zone validate errors on subsequent scans
    blk-cgroup: Pre-allocate tree node on blkg_conf_prep
    blk-cgroup: Fix memleak on error path

    Linus Torvalds
     
  • There is a regular need in the kernel to provide a way to declare having a
    dynamically sized set of trailing elements in a structure. Kernel code should
    always use “flexible array members”[1] for these cases. The older style of
    one-element or zero-length arrays should no longer be used[2].

    [1] https://en.wikipedia.org/wiki/Flexible_array_member
    [2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
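
The difference between the two declaration styles can be sketched in plain C. The struct and function names below are made up for illustration and do not come from any of the converted files:

```c
#include <stddef.h>
#include <stdint.h>

/* Deprecated style: a one-element array used as a variable-length trailer. */
struct msg_old {
    uint32_t len;
    uint8_t  data[1];   /* sizeof() counts this phantom element */
};

/* Preferred style: a C99 flexible array member, which must come last. */
struct msg_new {
    uint32_t len;
    uint8_t  data[];    /* contributes no elements to sizeof() */
};

/* Bytes needed for a msg_new carrying n payload bytes. */
static size_t msg_new_bytes(size_t n)
{
    return offsetof(struct msg_new, data) + n;
}
```

With the flexible array member, `sizeof(struct msg_new)` covers only the header, so size calculations no longer need to subtract the placeholder element that the one-element form adds.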
     
  • There is a regular need in the kernel to provide a way to declare having a
    dynamically sized set of trailing elements in a structure. Kernel code should
    always use “flexible array members”[1] for these cases. The older style of
    one-element or zero-length arrays should no longer be used[2].

    [1] https://en.wikipedia.org/wiki/Flexible_array_member
    [2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     
  • There is a regular need in the kernel to provide a way to declare having a
    dynamically sized set of trailing elements in a structure. Kernel code should
    always use “flexible array members”[1] for these cases. The older style of
    one-element or zero-length arrays should no longer be used[2].

    [1] https://en.wikipedia.org/wiki/Flexible_array_member
    [2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     
  • There is a regular need in the kernel to provide a way to declare having a
    dynamically sized set of trailing elements in a structure. Kernel code should
    always use “flexible array members”[1] for these cases. The older style of
    one-element or zero-length arrays should no longer be used[2].

    [1] https://en.wikipedia.org/wiki/Flexible_array_member
    [2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
     
  • There is a regular need in the kernel to provide a way to declare having a
    dynamically sized set of trailing elements in a structure. Kernel code
    should always use “flexible array members”[1] for these cases. The
    older style of one-element or zero-length arrays should no longer be
    used[2].

    Refactor the code according to the use of a flexible-array member in
    struct gve_stats_report, instead of a zero-length array, and use the
    struct_size() helper to calculate the size for the resource allocation.

    [1] https://en.wikipedia.org/wiki/Flexible_array_member
    [2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva
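
The size calculation that a struct_size()-style helper performs can be sketched in user space as shown here. This is an approximation under assumptions: `struct stats_report` is a hypothetical layout (not the driver's real struct gve_stats_report), and the saturate-to-SIZE_MAX behaviour is written out by hand rather than taken from the kernel macro.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical report layout: a fixed header plus trailing entries. */
struct stats_report {
    uint64_t written_count;
    uint64_t stats[];       /* flexible array member */
};

/*
 * Computes sizeof(header) + count * sizeof(element). On overflow the
 * result saturates to SIZE_MAX, so a following allocation fails instead
 * of silently returning a buffer that is too small.
 */
static size_t stats_report_size(size_t count)
{
    const size_t head = sizeof(struct stats_report);
    const size_t elem = sizeof(uint64_t);

    if (count > (SIZE_MAX - head) / elem)
        return SIZE_MAX;
    return head + count * elem;
}

/* Allocates a zeroed report with room for count entries, or returns NULL. */
static struct stats_report *stats_report_alloc(size_t count)
{
    return calloc(1, stats_report_size(count));
}
```

Keeping the arithmetic in one helper avoids open-coded `sizeof(*p) + n * sizeof(p->stats[0])` expressions, which are easy to get wrong and do not check for overflow.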
     
  • There is a regular need in the kernel to provide a way to declare having a
    dynamically sized set of trailing elements in a structure. Kernel code should
    always use “flexible array members”[1] for these cases. The older style of
    one-element or zero-length arrays should no longer be used[2].

    [1] https://en.wikipedia.org/wiki/Flexible_array_member
    [2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

    Signed-off-by: Gustavo A. R. Silva

    Gustavo A. R. Silva