11 Dec, 2019

8 commits

  • Remove references to unused functions, standardize language, update to
    reflect new functionality, migrate to rst format, and fix all kernel-doc
    warnings.

    Fixes: 815613da6a67 ("kernel/padata.c: removed unused code")
    Signed-off-by: Daniel Jordan
    Cc: Eric Biggers
    Cc: Herbert Xu
    Cc: Jonathan Corbet
    Cc: Steffen Klassert
    Cc: linux-crypto@vger.kernel.org
    Cc: linux-doc@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Daniel Jordan
    Signed-off-by: Herbert Xu

    Daniel Jordan
     
  • reorder_objects is unused since the rework of padata's flushing, so
    remove it.

    Signed-off-by: Daniel Jordan
    Cc: Eric Biggers
    Cc: Herbert Xu
    Cc: Steffen Klassert
    Cc: linux-crypto@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Herbert Xu

    Daniel Jordan
     
  • Since commit 63d3578892dc ("crypto: pcrypt - remove padata cpumask
    notifier") this feature is unused, so get rid of it.

    Signed-off-by: Daniel Jordan
    Cc: Eric Biggers
    Cc: Herbert Xu
    Cc: Jonathan Corbet
    Cc: Steffen Klassert
    Cc: linux-crypto@vger.kernel.org
    Cc: linux-doc@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Herbert Xu

    Daniel Jordan
     
  • lockdep complains when padata's paths to update cpumasks via CPU hotplug
    and sysfs are both taken:

    # echo 0 > /sys/devices/system/cpu/cpu1/online
    # echo ff > /sys/kernel/pcrypt/pencrypt/parallel_cpumask

    ======================================================
    WARNING: possible circular locking dependency detected
    5.4.0-rc8-padata-cpuhp-v3+ #1 Not tainted
    ------------------------------------------------------
    bash/205 is trying to acquire lock:
    ffffffff8286bcd0 (cpu_hotplug_lock.rw_sem){++++}, at: padata_set_cpumask+0x2b/0x120

    but task is already holding lock:
    ffff8880001abfa0 (&pinst->lock){+.+.}, at: padata_set_cpumask+0x26/0x120

    which lock already depends on the new lock.

    padata doesn't take cpu_hotplug_lock and pinst->lock in a consistent
    order. Which should be first? CPU hotplug calls into padata with
    cpu_hotplug_lock already held, so it should have priority.
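
    The ordering rule can be illustrated in userspace with two pthread
    mutexes. This is only a sketch under stated assumptions, not padata's
    code: hotplug_lock and inst_lock are hypothetical stand-ins for
    cpu_hotplug_lock and pinst->lock, and both update paths acquire them in
    the same order (hotplug lock first).

```c
#include <pthread.h>

/* Hypothetical stand-ins for cpu_hotplug_lock and pinst->lock. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t inst_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long cpumask;

/* sysfs-style path: take the outer (hotplug) lock first, then the instance lock. */
static void set_cpumask(unsigned long mask)
{
	pthread_mutex_lock(&hotplug_lock);
	pthread_mutex_lock(&inst_lock);
	cpumask = mask;
	pthread_mutex_unlock(&inst_lock);
	pthread_mutex_unlock(&hotplug_lock);
}

/* Hotplug-style path: the caller already holds hotplug_lock on entry. */
static void cpu_offline_locked(int cpu)
{
	pthread_mutex_lock(&inst_lock);
	cpumask &= ~(1UL << cpu);
	pthread_mutex_unlock(&inst_lock);
}
```

    Because no path ever takes inst_lock before hotplug_lock, the two
    locks cannot form a cycle.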

    Fixes: 6751fb3c0e0c ("padata: Use get_online_cpus/put_online_cpus")
    Signed-off-by: Daniel Jordan
    Cc: Eric Biggers
    Cc: Herbert Xu
    Cc: Steffen Klassert
    Cc: linux-crypto@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Herbert Xu

    Daniel Jordan
     
  • Configuring an instance's parallel mask without any online CPUs...

    echo 2 > /sys/kernel/pcrypt/pencrypt/parallel_cpumask
    echo 0 > /sys/devices/system/cpu/cpu1/online

    ...makes tcrypt mode=215 crash like this:

    divide error: 0000 [#1] SMP PTI
    CPU: 4 PID: 283 Comm: modprobe Not tainted 5.4.0-rc8-padata-doc-v2+ #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20191013_105130-anatol 04/01/2014
    RIP: 0010:padata_do_parallel+0x114/0x300
    Call Trace:
    pcrypt_aead_encrypt+0xc0/0xd0 [pcrypt]
    crypto_aead_encrypt+0x1f/0x30
    do_mult_aead_op+0x4e/0xdf [tcrypt]
    test_mb_aead_speed.constprop.0.cold+0x226/0x564 [tcrypt]
    do_test+0x28c2/0x4d49 [tcrypt]
    tcrypt_mod_init+0x55/0x1000 [tcrypt]
    ...

    cpumask_weight() in padata_cpu_hash() returns 0 because the mask has no
    CPUs. The problem is __padata_remove_cpu() checks for valid masks too
    early and so doesn't mark the instance PADATA_INVALID as expected, which
    would have made padata_do_parallel() return error before doing the
    division.

    Fix by introducing a second padata CPU hotplug state before
    CPUHP_BRINGUP_CPU so that __padata_remove_cpu() sees the online mask
    without @cpu. No need for the second argument to padata_replace() since
    @cpu is now already missing from the online mask.
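
    The failure mode can be reproduced in miniature. The sketch below is
    not the patch's fix (which reorders the hotplug callbacks); it only
    shows why an empty mask must be rejected before it is used as a
    divisor. pick_cpu() is a hypothetical helper, the 64-bit mask is a
    simplification of a cpumask, and __builtin_popcountll() assumes GCC or
    Clang.

```c
#include <errno.h>

/*
 * Hypothetical helper: map a sequence number onto one of the CPUs set in
 * a 64-bit mask. Returns the CPU number, or -EINVAL for an empty mask.
 */
static int pick_cpu(unsigned long long mask, unsigned int seq)
{
	int weight = __builtin_popcountll(mask);
	int target, cpu;

	if (weight == 0)
		return -EINVAL;	/* refuse before dividing by zero */

	target = seq % weight;	/* index among the set bits */
	for (cpu = 0; cpu < 64; cpu++) {
		if (!(mask & (1ULL << cpu)))
			continue;
		if (target-- == 0)
			return cpu;
	}
	return -EINVAL;	/* not reached when weight > 0 */
}
```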

    Fixes: 33e54450683c ("padata: Handle empty padata cpumasks")
    Signed-off-by: Daniel Jordan
    Cc: Eric Biggers
    Cc: Herbert Xu
    Cc: Sebastian Andrzej Siewior
    Cc: Steffen Klassert
    Cc: Thomas Gleixner
    Cc: linux-crypto@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Herbert Xu

    Daniel Jordan
     
  • If the pcrypt template is used multiple times in an algorithm, then a
    deadlock occurs because all pcrypt instances share the same
    padata_instance, which completes requests in the order submitted. That
    is, the inner pcrypt request waits for the outer pcrypt request while
    the outer request is already waiting for the inner.

    This patch fixes this by allocating a set of queues for each pcrypt
    instance instead of using two global queues. In order to maintain
    the existing user-space interface, the pinst structure remains global
    so any sysfs modifications will apply to every pcrypt instance.

    Note that when an update occurs we have to allocate memory for
    every pcrypt instance. Should one of the allocations fail we
    will abort the update without rolling back changes already made.

    The new per-instance data structure is called padata_shell and is
    essentially a wrapper around parallel_data.

    Reproducer:

    #include <linux/if_alg.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main()
    {
    struct sockaddr_alg addr = {
    .salg_type = "aead",
    .salg_name = "pcrypt(pcrypt(rfc4106-gcm-aesni))"
    };
    int algfd, reqfd;
    char buf[32] = { 0 };

    algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
    bind(algfd, (void *)&addr, sizeof(addr));
    setsockopt(algfd, SOL_ALG, ALG_SET_KEY, buf, 20);
    reqfd = accept(algfd, 0, 0);
    write(reqfd, buf, 32);
    read(reqfd, buf, 16);
    }

    Reported-by: syzbot+56c7151cad94eec37c521f0e47d2eee53f9361c4@syzkaller.appspotmail.com
    Fixes: 5068c7a883d1 ("crypto: pcrypt - Add pcrypt crypto parallelization wrapper")
    Signed-off-by: Herbert Xu
    Tested-by: Eric Biggers
    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • The function padata_remove_cpu was supposed to have been removed
    along with padata_add_cpu but somehow it remained behind. Let's
    kill it now as it doesn't even have a prototype anymore.

    Fixes: 815613da6a67 ("kernel/padata.c: removed unused code")
    Signed-off-by: Herbert Xu
    Reviewed-by: Daniel Jordan
    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • The function padata_flush_queues is fundamentally broken because
    it cannot force padata users to complete the request that is
    underway. IOW padata has to passively wait for the completion
    of any outstanding work.

    As it stands flushing is used in two places. Its use in padata_stop
    is simply unnecessary because nothing depends on the queues to
    be flushed afterwards.

    The other use in padata_replace is more substantial as we depend
    on it to free the old pd structure. This patch instead uses the
    pd->refcnt to dynamically free the pd structure once all requests
    are complete.
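
    The reference-counting idea can be sketched in userspace with C11
    atomics. This is an illustration under stated assumptions rather than
    the kernel code: struct pd, pd_alloc(), pd_get() and pd_put() are
    hypothetical names, and pd_freed exists only so the behaviour can be
    observed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parallel_data-like object. */
struct pd {
	atomic_int refcnt;
};

static int pd_freed;	/* counts frees so the behaviour can be observed */

static struct pd *pd_alloc(void)
{
	struct pd *pd = malloc(sizeof(*pd));

	if (pd)
		atomic_init(&pd->refcnt, 1);	/* reference held by the owner */
	return pd;
}

/* Each in-flight request takes a reference. */
static void pd_get(struct pd *pd)
{
	atomic_fetch_add(&pd->refcnt, 1);
}

/* Drop one reference; whoever drops the last one frees the object. */
static bool pd_put(struct pd *pd)
{
	if (atomic_fetch_sub(&pd->refcnt, 1) == 1) {
		free(pd);
		pd_freed++;
		return true;
	}
	return false;
}
```

    With this scheme the owner can drop its reference when replacing the
    object and never has to wait: the object goes away when the last
    outstanding request finishes.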

    Fixes: 2b73b07ab8a4 ("padata: Flush the padata queues actively")
    Cc:
    Signed-off-by: Herbert Xu
    Reviewed-by: Daniel Jordan
    Signed-off-by: Herbert Xu

    Herbert Xu
     

09 Dec, 2019

1 commit

  • Pull networking fixes from David Miller:

    1) More jumbo frame fixes in r8169, from Heiner Kallweit.

    2) Fix bpf build in minimal configuration, from Alexei Starovoitov.

    3) Use after free in slcan driver, from Jouni Hogander.

    4) Flower classifier port ranges don't work properly in the HW offload
    case, from Yoshiki Komachi.

    5) Use after free in hns3_nic_maybe_stop_tx(), from Yunsheng Lin.

    6) Out of bounds access in mqprio_dump(), from Vladyslav Tarasiuk.

    7) Fix flow dissection in dsa TX path, from Alexander Lobakin.

    8) Stale syncookie timestamp fixes from Guillaume Nault.

    [ Did an evil merge to silence a warning introduced by this pull - Linus ]

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (84 commits)
    r8169: fix rtl_hw_jumbo_disable for RTL8168evl
    net_sched: validate TCA_KIND attribute in tc_chain_tmplt_add()
    r8169: add missing RX enabling for WoL on RTL8125
    vhost/vsock: accept only packets with the right dst_cid
    net: phy: dp83867: fix hfs boot in rgmii mode
    net: ethernet: ti: cpsw: fix extra rx interrupt
    inet: protect against too small mtu values.
    gre: refetch erspan header from skb->data after pskb_may_pull()
    pppoe: remove redundant BUG_ON() check in pppoe_pernet
    tcp: Protect accesses to .ts_recent_stamp with {READ,WRITE}_ONCE()
    tcp: tighten acceptance of ACKs not matching a child socket
    tcp: fix rejected syncookies due to stale timestamps
    lpc_eth: kernel BUG on remove
    tcp: md5: fix potential overestimation of TCP option space
    net: sched: allow indirect blocks to bind to clsact in TC
    net: core: rename indirect block ingress cb function
    net-sysfs: Call dev_hold always in netdev_queue_add_kobject
    net: dsa: fix flow dissection on Tx path
    net/tls: Fix return values to avoid ENOTSUPP
    net: avoid an indirect call in ____sys_recvmsg()
    ...

    Linus Torvalds
     

06 Dec, 2019

3 commits

  • Pull modules updates from Jessica Yu:
    "Summary of modules changes for the 5.5 merge window:

    - Refactor include/linux/export.h and remove code duplication between
    EXPORT_SYMBOL and EXPORT_SYMBOL_NS to make it more readable.

    The most notable change is that no namespace is represented by an
    empty string "" rather than NULL.

    - Fix a module load/unload race where waiter(s) trying to load the
    same module weren't being woken up when a module finally goes away"

    * tag 'modules-for-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
    kernel/module.c: wakeup processes in module_wq on module unload
    moduleparam: fix parameter description mismatch
    export: avoid code duplication in include/linux/export.h

    Linus Torvalds
     
  • Pull thermal management updates from Zhang Rui:

    - Fix a deadlock regression in thermal core framework, which was
    introduced in 5.3 (Wei Wang)

    - Initialize thermal control framework earlier to enable thermal
    mitigation during boot (Amit Kucheria)

    - Convert the Intelligent Power Allocator (IPA) thermal governor to
    follow the generic PM_EM instead of its own Energy Model (Quentin
    Perret)

    - Introduce a new Amlogic soc thermal driver (Guillaume La Roque)

    - Add interrupt support for tsens thermal driver (Amit Kucheria)

    - Add support for MSM8956/8976 in tsens thermal driver
    (AngeloGioacchino Del Regno)

    - Add support for r8a774b1 in rcar thermal driver (Biju Das)

    - Add support for Thermal Monitor Unit v2 in qoriq thermal driver
    (Yuantian Tang)

    - Some other fixes/cleanups on thermal core framework and soc thermal
    drivers (Colin Ian King, Daniel Lezcano, Hsin-Yi Wang, Tian Tao)

    * 'thermal/next' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux: (32 commits)
    thermal: Fix deadlock in thermal thermal_zone_device_check
    thermal: cpu_cooling: Migrate to using the EM framework
    thermal: cpu_cooling: Make the power-related code depend on IPA
    PM / EM: Declare EM data types unconditionally
    arm64: defconfig: Enable CONFIG_ENERGY_MODEL
    drivers: thermal: tsens: fix potential integer overflow on multiply
    thermal: cpu_cooling: Reorder the header file
    thermal: cpu_cooling: Remove pointless dependency on CONFIG_OF
    thermal: no need to set .owner when using module_platform_driver
    thermal: qcom: tsens-v1: Fix kfree of a non-pointer value
    cpufreq: qcom-hw: Move driver initialization earlier
    clk: qcom: Initialize clock drivers earlier
    cpufreq: Initialize cpufreq-dt driver earlier
    cpufreq: Initialize the governors in core_initcall
    thermal: Initialize thermal subsystem earlier
    thermal: Remove netlink support
    dt: thermal: tsens: Document compatible for MSM8976/56
    thermal: qcom: tsens-v1: Add support for MSM8956 and MSM8976
    MAINTAINERS: add entry for Amlogic Thermal driver
    thermal: amlogic: Add thermal driver to support G12 SoCs
    ...

    Linus Torvalds
     
  • Merge more updates from Andrew Morton:
    "Most of the rest of MM and various other things. Some Kconfig rework
    still awaits merges of dependent trees from linux-next.

    Subsystems affected by this patch series: mm/hotfixes, mm/memcg,
    mm/vmstat, mm/thp, procfs, sysctl, misc, notifiers, core-kernel,
    bitops, lib, checkpatch, epoll, binfmt, init, rapidio, uaccess, kcov,
    ubsan, ipc, bitmap, mm/pagemap"

    * akpm: (86 commits)
    mm: remove __ARCH_HAS_4LEVEL_HACK and include/asm-generic/4level-fixup.h
    um: add support for folded p4d page tables
    um: remove unused pxx_offset_proc() and addr_pte() functions
    sparc32: use pgtable-nopud instead of 4level-fixup
    parisc/hugetlb: use pgtable-nopXd instead of 4level-fixup
    parisc: use pgtable-nopXd instead of 4level-fixup
    nds32: use pgtable-nopmd instead of 4level-fixup
    microblaze: use pgtable-nopmd instead of 4level-fixup
    m68k: mm: use pgtable-nopXd instead of 4level-fixup
    m68k: nommu: use pgtable-nopud instead of 4level-fixup
    c6x: use pgtable-nopud instead of 4level-fixup
    arm: nommu: use pgtable-nopud instead of 4level-fixup
    alpha: use pgtable-nopud instead of 4level-fixup
    gpio: pca953x: tighten up indentation
    gpio: pca953x: convert to use bitmap API
    gpio: pca953x: use input from regs structure in pca953x_irq_pending()
    gpio: pca953x: remove redundant variable and check in IRQ handler
    lib/bitmap: introduce bitmap_replace() helper
    lib/test_bitmap: fix comment about this file
    lib/test_bitmap: move exp1 and exp2 upper for others to use
    ...

    Linus Torvalds
     

05 Dec, 2019

10 commits

  • For a jited bpf program, if the subprogram count is 1, i.e.,
    there are no callees in the program, prog->aux->func will be NULL
    and prog->bpf_func points to the image address of the program.

    If there is more than one subprogram, prog->aux->func is populated,
    and subprogram 0 can be accessed through either prog->bpf_func or
    prog->aux->func[0]. Other subprograms should be accessed through
    prog->aux->func[subprog_id].
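
    The access rule can be sketched as a small lookup helper. The structs
    below are hypothetical, heavily simplified mirrors of the fields named
    above (the image address is modelled as a plain pointer), and
    subprog_addr() is an illustrative name, not a kernel function.

```c
#include <stddef.h>

struct prog;

/* Hypothetical, simplified mirror of the fields named above. */
struct prog_aux {
	struct prog **func;	/* NULL when there is a single subprogram */
	unsigned int func_cnt;
};

struct prog {
	const void *bpf_func;	/* image address */
	struct prog_aux *aux;
};

/* Return the image address of a subprogram, or NULL if it does not exist. */
static const void *subprog_addr(const struct prog *prog, unsigned int subprog)
{
	if (subprog == 0)
		return prog->bpf_func;	/* valid whether or not func is populated */
	if (!prog->aux->func || subprog >= prog->aux->func_cnt)
		return NULL;
	return prog->aux->func[subprog]->bpf_func;
}
```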

    This patch fixed a bug in check_attach_btf_id(), where
    prog->aux->func[subprog_id] is used to access any subprogram which
    caused a segfault like below:
    [79162.619208] BUG: kernel NULL pointer dereference, address: 0000000000000000
    ......
    [79162.634255] Call Trace:
    [79162.634974] ? _cond_resched+0x15/0x30
    [79162.635686] ? kmem_cache_alloc_trace+0x162/0x220
    [79162.636398] ? selinux_bpf_prog_alloc+0x1f/0x60
    [79162.637111] bpf_prog_load+0x3de/0x690
    [79162.637809] __do_sys_bpf+0x105/0x1740
    [79162.638488] do_syscall_64+0x5b/0x180
    [79162.639147] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    ......

    Fixes: 5b92a28aae4d ("bpf: Support attaching tracing BPF program to other BPF programs")
    Reported-by: Eelco Chaudron
    Signed-off-by: Yonghong Song
    Signed-off-by: Alexei Starovoitov
    Link: https://lore.kernel.org/bpf/20191205010606.177774-1-yhs@fb.com

    Yonghong Song
     
  • Patch series "kcov: collect coverage from usb and vhost", v3.

    This patchset extends kcov to allow collecting coverage from background
    kernel threads. This extension requires custom annotations for each of
    the places where coverage collection is desired. This patchset
    implements this for hub events in the USB subsystem and for vhost
    workers. See the first patch description for details about the kcov
    extension. The other two patches apply this kcov extension to USB and
    vhost.

    Examples of other subsystems that might potentially benefit from this
    when custom annotations are added (the list is based on
    process_one_work() callers for bugs recently reported by syzbot):

    1. fs: writeback wb_workfn() worker,
    2. net: addrconf_dad_work()/addrconf_verify_work() workers,
    3. net: neigh_periodic_work() worker,
    4. net/p9: p9_write_work()/p9_read_work() workers,
    5. block: blk_mq_run_work_fn() worker.

    These patches have been used to enable coverage-guided USB fuzzing with
    syzkaller for the last few years, see the details here:

    https://github.com/google/syzkaller/blob/master/docs/linux/external_fuzzing_usb.md

    This patchset has been pushed to the public Linux kernel Gerrit
    instance:

    https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/1524

    This patch (of 3):

    Add background thread coverage collection ability to kcov.

    With KCOV_ENABLE coverage is collected only for syscalls that are issued
    from the current process. With KCOV_REMOTE_ENABLE it's possible to
    collect coverage for arbitrary parts of the kernel code, provided that
    those parts are annotated with kcov_remote_start()/kcov_remote_stop().

    This allows collecting coverage from two types of kernel background
    threads: the global ones, that are spawned during kernel boot in a
    limited number of instances (e.g. one USB hub_event() worker thread is
    spawned per USB HCD); and the local ones, that are spawned when a user
    interacts with some kernel interface (e.g. vhost workers).

    To enable collecting coverage from a global background thread, a unique
    global handle must be assigned and passed to the corresponding
    kcov_remote_start() call. Then a userspace process can pass a list of
    such handles to the KCOV_REMOTE_ENABLE ioctl in the handles array field
    of the kcov_remote_arg struct. This will attach the used kcov device to
    the code sections, that are referenced by those handles.

    Since there might be many local background threads spawned from
    different userspace processes, we can't use a single global handle per
    annotation. Instead, the userspace process passes a non-zero handle
    through the common_handle field of the kcov_remote_arg struct. This
    common handle gets saved to the kcov_handle field in the current
    task_struct and needs to be passed to the newly spawned threads via
    custom annotations. Those threads should in turn be annotated with
    kcov_remote_start()/kcov_remote_stop().

    Internally kcov stores handles as u64 integers. The top byte of a
    handle is used to denote the id of a subsystem that this handle belongs
    to, and the lower 4 bytes are used to denote the id of a thread instance
    within that subsystem. A reserved value 0 is used as a subsystem id for
    common handles as they don't belong to a particular subsystem. The
    bytes 4-6 are currently reserved and must be zero. In the future the
    number of bytes used for the subsystem or handle ids might be increased.
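
    The handle layout can be sketched with a few bit operations. The
    macro and function names below are hypothetical and chosen for
    illustration; only the layout (subsystem id in the top byte, instance
    id in the lower 4 bytes, everything in between zero) is taken from the
    description above.

```c
#include <stdint.h>

#define SUBSYS_SHIFT	56
#define SUBSYS_MASK	(0xffULL << SUBSYS_SHIFT)	/* top byte */
#define INST_MASK	0xffffffffULL			/* lower 4 bytes */

/* Compose a handle; returns -1 if either id does not fit its field. */
static int make_handle(uint64_t subsys_id, uint64_t inst_id, uint64_t *out)
{
	if (subsys_id > 0xff || inst_id > INST_MASK)
		return -1;
	*out = (subsys_id << SUBSYS_SHIFT) | inst_id;
	return 0;
}

/* A handle is well-formed only if the reserved bits in between are zero. */
static int handle_is_valid(uint64_t handle)
{
	return (handle & ~(SUBSYS_MASK | INST_MASK)) == 0;
}

static unsigned int handle_subsys(uint64_t handle)
{
	return (unsigned int)(handle >> SUBSYS_SHIFT);
}

static uint32_t handle_inst(uint64_t handle)
{
	return (uint32_t)(handle & INST_MASK);
}
```

    A subsystem id of 0 yields a common handle in this layout.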

    When a particular userspace process collects coverage via a common
    handle, kcov will collect coverage for each code section that is
    annotated to use the common handle obtained as kcov_handle from the
    current task_struct. However, non-common handles allow collecting
    coverage selectively from different subsystems.

    Link: http://lkml.kernel.org/r/e90e315426a384207edbec1d6aa89e43008e4caf.1572366574.git.andreyknvl@google.com
    Signed-off-by: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Greg Kroah-Hartman
    Cc: Alan Stern
    Cc: "Michael S. Tsirkin"
    Cc: Jason Wang
    Cc: Arnd Bergmann
    Cc: Steven Rostedt
    Cc: David Windsor
    Cc: Elena Reshetova
    Cc: Anders Roxell
    Cc: Alexander Potapenko
    Cc: Marco Elver
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
     
  • Follow the kernel conventions, rename addr_in_gen_pool to
    gen_pool_has_addr.

    [sjhuang@iluvatar.ai: fix Documentation/ too]
    Link: http://lkml.kernel.org/r/20181229015914.5573-1-sjhuang@iluvatar.ai
    Link: http://lkml.kernel.org/r/20181228083950.20398-1-sjhuang@iluvatar.ai
    Signed-off-by: Huang Shijie
    Reviewed-by: Andrew Morton
    Cc: Russell King
    Cc: Arnd Bergmann
    Cc: Greg Kroah-Hartman
    Cc: Christoph Hellwig
    Cc: Marek Szyprowski
    Cc: Robin Murphy
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Shijie
     
  • Initialization is not guaranteed to zero padding bytes so use an
    explicit memset instead to avoid leaking any kernel content in any
    possible padding bytes.

    Link: http://lkml.kernel.org/r/dfa331c00881d61c8ee51577a082d8bebd61805c.camel@perches.com
    Signed-off-by: Joe Perches
    Cc: Dan Carpenter
    Cc: Julia Lawall
    Cc: Thomas Gleixner
    Cc: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • When building with clang + -Wtautological-pointer-compare, these
    instances pop up:

    kernel/profile.c:339:6: warning: comparison of array 'prof_cpu_mask' not equal to a null pointer is always true [-Wtautological-pointer-compare]
    if (prof_cpu_mask != NULL)
    ^~~~~~~~~~~~~ ~~~~
    kernel/profile.c:376:6: warning: comparison of array 'prof_cpu_mask' not equal to a null pointer is always true [-Wtautological-pointer-compare]
    if (prof_cpu_mask != NULL)
    ^~~~~~~~~~~~~ ~~~~
    kernel/profile.c:406:26: warning: comparison of array 'prof_cpu_mask' not equal to a null pointer is always true [-Wtautological-pointer-compare]
    if (!user_mode(regs) && prof_cpu_mask != NULL &&
    ^~~~~~~~~~~~~ ~~~~
    3 warnings generated.

    This can be addressed with the cpumask_available helper, introduced in
    commit f7e30f01a9e2 ("cpumask: Add helper cpumask_available()") to fix
    warnings like this while keeping the code the same.

    Link: https://github.com/ClangBuiltLinux/linux/issues/747
    Link: http://lkml.kernel.org/r/20191022191957.9554-1-natechancellor@gmail.com
    Signed-off-by: Nathan Chancellor
    Reviewed-by: Andrew Morton
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nathan Chancellor
     
  • blocking_notifier_chain_cond_register() does not consider system_booting
    state, which is the only difference between this function and
    blocking_notifier_chain_register(). This can be a bug and is a piece of
    duplicate code.

    Delete blocking_notifier_chain_cond_register()

    Link: http://lkml.kernel.org/r/1568861888-34045-4-git-send-email-nixiaoming@huawei.com
    Signed-off-by: Xiaoming Ni
    Reviewed-by: Andrew Morton
    Cc: Alan Stern
    Cc: Alexey Dobriyan
    Cc: Andy Lutomirski
    Cc: Anna Schumaker
    Cc: Arjan van de Ven
    Cc: Chuck Lever
    Cc: David S. Miller
    Cc: Ingo Molnar
    Cc: J. Bruce Fields
    Cc: Jeff Layton
    Cc: Nadia Derbey
    Cc: "Paul E. McKenney"
    Cc: Sam Protsenko
    Cc: Thomas Gleixner
    Cc: Trond Myklebust
    Cc: Vasily Averin
    Cc: Viresh Kumar
    Cc: YueHaibing
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiaoming Ni
     
  • The only difference between notifier_chain_cond_register() and
    notifier_chain_register() is the lack of warning hints for duplicate
    registrations. Use notifier_chain_register() instead of
    notifier_chain_cond_register() to avoid duplicate code

    Link: http://lkml.kernel.org/r/1568861888-34045-3-git-send-email-nixiaoming@huawei.com
    Signed-off-by: Xiaoming Ni
    Reviewed-by: Andrew Morton
    Cc: Alan Stern
    Cc: Alexey Dobriyan
    Cc: Andy Lutomirski
    Cc: Anna Schumaker
    Cc: Arjan van de Ven
    Cc: Chuck Lever
    Cc: David S. Miller
    Cc: Ingo Molnar
    Cc: J. Bruce Fields
    Cc: Jeff Layton
    Cc: Nadia Derbey
    Cc: "Paul E. McKenney"
    Cc: Sam Protsenko
    Cc: Thomas Gleixner
    Cc: Trond Myklebust
    Cc: Vasily Averin
    Cc: Viresh Kumar
    Cc: YueHaibing
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiaoming Ni
     
  • Registering the same notifier to a hook repeatedly can cause the hook
    list to form a ring or lose other members of the list.

    case1: An infinite loop in notifier_chain_register() can cause soft lockup
    atomic_notifier_chain_register(&test_notifier_list, &test1);
    atomic_notifier_chain_register(&test_notifier_list, &test1);
    atomic_notifier_chain_register(&test_notifier_list, &test2);

    case2: An infinite loop in notifier_call_chain() can cause soft lockup
    atomic_notifier_chain_register(&test_notifier_list, &test1);
    atomic_notifier_chain_register(&test_notifier_list, &test1);
    atomic_notifier_call_chain(&test_notifier_list, 0, NULL);

    case3: lose other hook test2
    atomic_notifier_chain_register(&test_notifier_list, &test1);
    atomic_notifier_chain_register(&test_notifier_list, &test2);
    atomic_notifier_chain_register(&test_notifier_list, &test1);

    case4: Unregister returns 0, but the hook is still in the linked list,
    and it is not really registered. If you call
    notifier_call_chain after the module (ko) is unloaded, it will trigger an oops.

    If the system is configured with softlockup_panic and the same hook is
    repeatedly registered on the panic_notifier_list, it will cause a loop
    panic.

    Add a check in notifier_chain_register(), intercepting duplicate
    registrations to avoid infinite loops
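
    The check can be sketched on a simplified, priority-ordered singly
    linked list. struct nb, chain_register() and chain_length() are
    hypothetical userspace names, and returning -EEXIST for a duplicate
    is an assumption of this sketch, not necessarily what the patch does.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified notifier block on a priority-ordered list. */
struct nb {
	int priority;
	struct nb *next;
};

/* Insert n by descending priority, refusing a node that is already listed. */
static int chain_register(struct nb **head, struct nb *n)
{
	struct nb **nl = head;

	while (*nl != NULL) {
		if (*nl == n)
			return -EEXIST;	/* duplicate: leave the list untouched */
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}

/* Walk the list; this terminates only if the list has not become a ring. */
static int chain_length(const struct nb *head)
{
	int len = 0;

	for (; head != NULL; head = head->next)
		len++;
	return len;
}
```

    Without the duplicate test, re-inserting a listed node would
    overwrite its next pointer, which is how the ring in case1 and the
    lost entry in case3 arise.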

    Link: http://lkml.kernel.org/r/1568861888-34045-2-git-send-email-nixiaoming@huawei.com
    Signed-off-by: Xiaoming Ni
    Reviewed-by: Vasily Averin
    Reviewed-by: Andrew Morton
    Cc: Alexey Dobriyan
    Cc: Anna Schumaker
    Cc: Arjan van de Ven
    Cc: J. Bruce Fields
    Cc: Chuck Lever
    Cc: David S. Miller
    Cc: Jeff Layton
    Cc: Andy Lutomirski
    Cc: Ingo Molnar
    Cc: Nadia Derbey
    Cc: "Paul E. McKenney"
    Cc: Sam Protsenko
    Cc: Alan Stern
    Cc: Thomas Gleixner
    Cc: Trond Myklebust
    Cc: Viresh Kumar
    Cc: Xiaoming Ni
    Cc: YueHaibing
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiaoming Ni
     
  • Pull more tracing updates from Steven Rostedt:
    "Two fixes and one patch that was missed:

    Fixes:

    - Missing __print_hex_dump undef for processing new function in trace
    events

    - Stop WARN_ON messages when lockdown disables tracing on boot up

    Enhancement:

    - Debug option to inject trace events from userspace (for rasdaemon)"

    The enhancement has its own config option and is non invasive. It's been
    discussed for several months and should have been added to my original
    push, but I never pulled it into my queue.

    * tag 'trace-v5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Do not create directories if lockdown is in affect
    tracing: Introduce trace event injection
    tracing: Fix __print_hex_dump scope

    Linus Torvalds
     
  • Pull additional power management updates from Rafael Wysocki:
    "These fix an ACPI EC driver bug exposed by the recent rework of the
    suspend-to-idle code flow, reintroduce frequency constraints into
    device PM QoS (in preparation for adding QoS support to devfreq), drop
    a redundant field from struct cpuidle_state and clean up Kconfig in
    some places.

    Specifics:

    - Avoid a race condition in the ACPI EC driver that may cause systems
    to be unable to leave suspend-to-idle (Rafael Wysocki)

    - Drop the "disabled" field, which is redundant, from struct
    cpuidle_state (Rafael Wysocki)

    - Reintroduce device PM QoS frequency constraints (temporarily
    introduced and then dropped during the 5.4 cycle) in preparation
    for adding QoS support to devfreq (Leonard Crestez)

    - Clean up indentation (in multiple places) and the cpuidle drivers
    help text in Kconfig (Krzysztof Kozlowski, Randy Dunlap)"

    * tag 'pm-5.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    ACPI: PM: s2idle: Rework ACPI events synchronization
    ACPI: EC: Rework flushing of pending work
    PM / devfreq: Add missing locking while setting suspend_freq
    PM / QoS: Restore DEV_PM_QOS_MIN/MAX_FREQUENCY
    PM / QoS: Reorder pm_qos/freq_qos/dev_pm_qos structs
    PM / QoS: Initial kunit test
    PM / QoS: Redefine FREQ_QOS_MAX_DEFAULT_VALUE to S32_MAX
    power: avs: Fix Kconfig indentation
    cpufreq: Fix Kconfig indentation
    cpuidle: minor Kconfig help text fixes
    cpuidle: Drop disabled field from struct cpuidle_state
    cpuidle: Fix Kconfig indentation

    Linus Torvalds
     

04 Dec, 2019

3 commits

  • If lockdown is disabling tracing on boot up, it prevents the tracing files
    from even being created. But when that happens, there are several places
    that will give a warning that the files were not created, as that is
    usually a sign of a bug.

    Add checks in strategic locations to see if tracing is disabled by
    lockdown, and if it is, do not go further, and fail silently (but print
    that tracing is disabled by lockdown, without doing a WARN_ON()).

    Cc: Matthew Garrett
    Fixes: 17911ff38aa5 ("tracing: Add locked_down checks to the open calls of files created for tracefs")
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • Pull timer updates from Ingo Molnar:
    "The main changes in the timer code in this cycle were:

    - Clockevent updates:

    - timer-of framework cleanups. (Geert Uytterhoeven)

    - Use timer-of for the renesas-ostm and the device name to prevent
    name collision in case of multiple timers. (Geert Uytterhoeven)

    - Check if there is an error after calling of_clk_get in asm9260
    (Chuhong Yuan)

    - ABI fix: Zero out high order bits of nanoseconds on compat
    syscalls. This got broken a year ago, with apparently no side
    effects so far.

    Since the kernel would use random data otherwise I don't think we'd
    have other options but to fix the bug, even if there was a side
    effect to applications (Dmitry Safonov)

    - Optimize ns_to_timespec64() on 32-bit systems: move away from
    div_s64_rem() which can be slow, to div_u64_rem() which is faster
    (Arnd Bergmann)

    - Annotate KCSAN-reported false positive data races in
    hrtimer_is_queued() users by moving timer->state handling over to
    the READ_ONCE()/WRITE_ONCE() APIs. This documents these accesses
    (Eric Dumazet)

    - Misc cleanups and small fixes"
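
    The ns_to_timespec64() item above can be illustrated in userspace.
    This is a sketch of the general technique (doing the division on an
    unsigned magnitude and fixing up the sign afterwards), not a copy of
    the kernel function; struct ts64 and ns_to_ts64() are hypothetical
    names.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

struct ts64 {
	int64_t sec;
	long nsec;	/* always in [0, NSEC_PER_SEC) */
};

static struct ts64 ns_to_ts64(int64_t nsec)
{
	struct ts64 ts = { 0, 0 };
	uint64_t mag;

	if (nsec > 0) {
		mag = (uint64_t)nsec;
		ts.sec = (int64_t)(mag / NSEC_PER_SEC);
		ts.nsec = (long)(mag % NSEC_PER_SEC);
	} else if (nsec < 0) {
		/*
		 * For negative times, sec points to the earlier second and
		 * nsec counts forward from it. -(nsec + 1) equals -nsec - 1
		 * and cannot overflow, even for INT64_MIN.
		 */
		mag = (uint64_t)(-(nsec + 1));
		ts.sec = -(int64_t)(mag / NSEC_PER_SEC) - 1;
		ts.nsec = (long)(NSEC_PER_SEC - 1 - mag % NSEC_PER_SEC);
	}
	return ts;
}
```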

    [ I undid the "ABI fix" and updated the comments instead. The reason
    there were apparently no side effects is that the fix was a no-op.

    The updated comment is to say _why_ it was a no-op. - Linus ]

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    time: Zero the upper 32-bits in __kernel_timespec on 32-bit
    time: Rename tsk->real_start_time to ->start_boottime
    hrtimer: Remove the comment about not used HRTIMER_SOFTIRQ
    time: Fix spelling mistake in comment
    time: Optimize ns_to_timespec64()
    hrtimer: Annotate lockless access to timer->state
    clocksource/drivers/asm9260: Add a check for of_clk_get
    clocksource/drivers/renesas-ostm: Use unique device name instead of ostm
    clocksource/drivers/renesas-ostm: Convert to timer_of
    clocksource/drivers/timer-of: Use unique device name instead of timer
    clocksource/drivers/timer-of: Convert last full_name to %pOF

    Linus Torvalds
     
  • Pull irq updates from Ingo Molnar:
    "Most of the IRQ subsystem changes in this cycle were irq-chip driver
    updates:

    - Qualcomm PDC wakeup interrupt support

    - Layerscape external IRQ support

    - Broadcom bcm7038 PM and wakeup support

    - Ingenic driver cleanup and modernization

    - GICv3 ITS preparation for GICv4.1 updates

    - GICv4 fixes

    There's also the series from Frederic Weisbecker that fixes memory
    ordering bugs for the irq-work logic, whose primary fix is to turn
    work->irq_work.flags into an atomic variable and then convert the
    complex (and buggy) atomic_cmpxchg() loop in irq_work_claim() into a
    much simpler atomic_fetch_or() call.

    There are also various smaller cleanups"
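
    The atomic_fetch_or() approach to claiming work can be sketched with
    C11 atomics. The flag values, struct work and work_claim() are
    hypothetical userspace stand-ins used for illustration; this is not
    the kernel's irq_work code.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define WORK_PENDING	(1 << 0)
#define WORK_BUSY	(1 << 1)
#define WORK_CLAIMED	(WORK_PENDING | WORK_BUSY)

struct work {
	atomic_int flags;
};

/*
 * Set both claim bits in one atomic step and inspect the previous value.
 * Returns true if this caller claimed the work, false if it was already
 * pending and someone else will run it.
 */
static bool work_claim(struct work *w)
{
	int old = atomic_fetch_or(&w->flags, WORK_CLAIMED);

	return !(old & WORK_PENDING);
}
```

    A single read-modify-write replaces the retry loop that a
    compare-and-exchange version needs, so there is no window in which
    another CPU can change the flags between the read and the update.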

    * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
    pinctrl/sdm845: Add PDC wakeup interrupt map for GPIOs
    pinctrl/msm: Setup GPIO chip in hierarchy
    irqchip/qcom-pdc: Add irqchip set/get state calls
    irqchip/qcom-pdc: Add irqdomain for wakeup capable GPIOs
    irqchip/qcom-pdc: Do not toggle IRQ_ENABLE during mask/unmask
    irqchip/qcom-pdc: Update max PDC interrupts
    of/irq: Document properties for wakeup interrupt parent
    genirq: Introduce irq_chip_get/set_parent_state calls
    irqdomain: Add bus token DOMAIN_BUS_WAKEUP
    genirq: Fix function documentation of __irq_alloc_descs()
    irq_work: Fix IRQ_WORK_BUSY bit clearing
    irqchip/ti-sci-inta: Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...))
    irq_work: Slightly simplify IRQ_WORK_PENDING clearing
    irq_work: Fix irq_work_claim() memory ordering
    irq_work: Convert flags to atomic_t
    irqchip: Ingenic: Add process for more than one irq at the same time.
    irqchip: ingenic: Alloc generic chips from IRQ domain
    irqchip: ingenic: Get virq number from IRQ domain
    irqchip: ingenic: Error out if IRQ domain creation failed
    irqchip: ingenic: Drop redundant irq_suspend / irq_resume functions
    ...

    Linus Torvalds
     

03 Dec, 2019

3 commits

  • Pull Kbuild updates from Masahiro Yamada:

    - remove unneeded asm headers from hexagon, ia64

    - add 'dir-pkg' target, which works like 'tar-pkg' but skips archiving

    - add 'helpnewconfig' target, which shows help for new CONFIG options

    - support 'make nsdeps' for external modules

    - make rebuilds faster by deleting $(wildcard $^) checks

    - remove compile tests for kernel-space headers

    - refactor modpost to simplify modversion handling

    - make single target builds faster

    - optimize and clean up scripts/kallsyms.c

    - refactor various Makefiles and scripts

    * tag 'kbuild-v5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (59 commits)
    MAINTAINERS: update Kbuild/Kconfig maintainer's email address
    scripts/kallsyms: remove redundant initializers
    scripts/kallsyms: put check_symbol_range() calls close together
    scripts/kallsyms: make check_symbol_range() void function
    scripts/kallsyms: move ignored symbol types to is_ignored_symbol()
    scripts/kallsyms: move more patterns to the ignored_prefixes array
    scripts/kallsyms: skip ignored symbols very early
    scripts/kallsyms: add const qualifiers where possible
    scripts/kallsyms: make find_token() return (unsigned char *)
    scripts/kallsyms: replace prefix_underscores_count() with strspn()
    scripts/kallsyms: add sym_name() to mitigate cast ugliness
    scripts/kallsyms: remove unneeded length check for prefix matching
    scripts/kallsyms: remove redundant is_arm_mapping_symbol()
    scripts/kallsyms: set relative_base more effectively
    scripts/kallsyms: shrink table before sorting it
    scripts/kallsyms: fix definitely-lost memory leak
    scripts/kallsyms: remove unneeded #ifndef ARRAY_SIZE
    kbuild: make single target builds even faster
    modpost: respect the previous export when 'exported twice' is warned
    modpost: do not set ->preloaded for symbols from Module.symvers
    ...

    Linus Torvalds
     
  • Daniel Borkmann says:

    ====================
    pull-request: bpf 2019-12-02

    The following pull-request contains BPF updates for your *net* tree.

    We've added 10 non-merge commits during the last 6 day(s) which contain
    a total of 10 files changed, 60 insertions(+), 51 deletions(-).

    The main changes are:

    1) Fix vmlinux BTF generation for binutils pre v2.25, from Stanislav Fomichev.

    2) Fix libbpf global variable relocation to take symbol's st_value offset
    into account, from Andrii Nakryiko.

    3) Fix libbpf build on powerpc where check_abi target fails due to different
    readelf output format, from Aurelien Jarno.

    4) Don't set BPF insns RO for the case when they are JITed in order to avoid
    fragmenting the direct map, from Daniel Borkmann.

    5) Fix static checker warning in btf_distill_func_proto() as well as a build
    error due to empty enum when BPF is compiled out, from Alexei Starovoitov.

    6) Fix up generation of bpf_helper_defs.h for perf, from Arnaldo Carvalho de Melo.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • We have been trying to use rasdaemon to monitor hardware errors like
    correctable memory errors. rasdaemon uses trace events to monitor
    various hardware errors. In order to test it, we have to inject some
    hardware errors, but unfortunately not all of them provide error
    injection. MCE does provide a way to inject MCE errors, but errors
    like PCI errors and devlink errors don't, and it is not easy to add
    error injection to each of them. Instead, it is relatively easy to
    just allow users to inject trace events in a generic way so that all
    trace events can be injected.

    This patch introduces trace event injection, where a new 'inject' file
    is added to each tracepoint directory. Users can write key=value pairs
    into this file to specify the value of each field of the trace event.
    All unspecified fields are set to zero by default.

    For example, for the net/net_dev_queue tracepoint, we can inject:

    INJECT=/sys/kernel/debug/tracing/events/net/net_dev_queue/inject
    echo "" > $INJECT
    echo "name='test'" > $INJECT
    echo "name='test' len=1024" > $INJECT
    cat /sys/kernel/debug/tracing/trace
    ...
    <...>-614 [000] .... 36.571483: net_dev_queue: dev= skbaddr=00000000fbf338c2 len=0
    <...>-614 [001] .... 136.588252: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=0
    <...>-614 [001] .N.. 208.431878: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=1024

    Triggers fire as usual too:

    echo "stacktrace if len == 1025" > /sys/kernel/debug/tracing/events/net/net_dev_queue/trigger
    echo "len=1025" > $INJECT
    cat /sys/kernel/debug/tracing/trace
    ...
    bash-614 [000] .... 36.571483: net_dev_queue: dev= skbaddr=00000000fbf338c2 len=0
    bash-614 [001] .... 136.588252: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=0
    bash-614 [001] .N.. 208.431878: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=1024
    bash-614 [001] .N.1 284.236349: <stack trace>
    => event_inject_write
    => vfs_write
    => ksys_write
    => do_syscall_64
    => entry_SYSCALL_64_after_hwframe

    The only thing that can't be injected is string pointers: they must
    point to constant strings, which can't be set up at run time.

    Link: http://lkml.kernel.org/r/20191130045218.18979-1-xiyou.wangcong@gmail.com

    Cc: Ingo Molnar
    Signed-off-by: Cong Wang
    Signed-off-by: Steven Rostedt (VMware)

    Cong Wang
     

02 Dec, 2019

5 commits

  • Merge updates from Andrew Morton:
    "Incoming:

    - a small number of updates to scripts/, ocfs2 and fs/buffer.c

    - most of MM

    I still have quite a lot of material (mostly not MM) staged after
    linux-next due to -next dependencies. I'll send those across next week
    as the prerequisites get merged up"

    * emailed patches from Andrew Morton: (135 commits)
    mm/page_io.c: annotate refault stalls from swap_readpage
    mm/Kconfig: fix trivial help text punctuation
    mm/Kconfig: fix indentation
    mm/memory_hotplug.c: remove __online_page_set_limits()
    mm: fix typos in comments when calling __SetPageUptodate()
    mm: fix struct member name in function comments
    mm/shmem.c: cast the type of unmap_start to u64
    mm: shmem: use proper gfp flags for shmem_writepage()
    mm/shmem.c: make array 'values' static const, makes object smaller
    userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK
    fs/userfaultfd.c: wp: clear VM_UFFD_MISSING or VM_UFFD_WP during userfaultfd_register()
    userfaultfd: wrap the common dst_vma check into an inlined function
    userfaultfd: remove unnecessary WARN_ON() in __mcopy_atomic_hugetlb()
    userfaultfd: use vma_pagesize for all huge page size calculation
    mm/madvise.c: use PAGE_ALIGN[ED] for range checking
    mm/madvise.c: replace with page_size() in madvise_inject_error()
    mm/mmap.c: make vma_merge() comment more easy to understand
    mm/hwpoison-inject: use DEFINE_DEBUGFS_ATTRIBUTE to define debugfs fops
    autonuma: reduce cache footprint when scanning page tables
    autonuma: fix watermark checking in migrate_balanced_pgdat()
    ...

    Linus Torvalds
     
  • Pull y2038 cleanups from Arnd Bergmann:
    "y2038 syscall implementation cleanups

    This is a series of cleanups for the y2038 work, mostly intended for
    namespace cleaning: the kernel defines the traditional time_t, timeval
    and timespec types that often lead to y2038-unsafe code. Even though
    the unsafe usage is mostly gone from the kernel, having the types and
    associated functions around means that we can still grow new users,
    and that we may be missing conversions to safe types that actually
    matter.

    There are still a number of driver specific patches needed to get the
    last users of these types removed, those have been submitted to the
    respective maintainers"

    Link: https://lore.kernel.org/lkml/20191108210236.1296047-1-arnd@arndb.de/

    * tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (26 commits)
    y2038: alarm: fix half-second cut-off
    y2038: ipc: fix x32 ABI breakage
    y2038: fix typo in powerpc vdso "LOPART"
    y2038: allow disabling time32 system calls
    y2038: itimer: change implementation to timespec64
    y2038: move itimer reset into itimer.c
    y2038: use compat_{get,set}_itimer on alpha
    y2038: itimer: compat handling to itimer.c
    y2038: time: avoid timespec usage in settimeofday()
    y2038: timerfd: Use timespec64 internally
    y2038: elfcore: Use __kernel_old_timeval for process times
    y2038: make ns_to_compat_timeval use __kernel_old_timeval
    y2038: socket: use __kernel_old_timespec instead of timespec
    y2038: socket: remove timespec reference in timestamping
    y2038: syscalls: change remaining timeval to __kernel_old_timeval
    y2038: rusage: use __kernel_old_timeval
    y2038: uapi: change __kernel_time_t to __kernel_old_time_t
    y2038: stat: avoid 'time_t' in 'struct stat'
    y2038: ipc: remove __kernel_time_t reference from headers
    y2038: vdso: powerpc: avoid timespec references
    ...

    Linus Torvalds
     
  • Pull sysctl system call removal from Eric Biederman:
    "As far as I can tell we have reached the point where no one enables
    the sysctl system call anymore. It is still enabled in a few
    defconfigs, but they are mostly the rarely used ones, and in asking
    people about that it was more cut & paste enabled than anything else.

    This is a single commit that just deletes code, leaving just enough
    code so that the deprecated sysctl warning continues to be printed. If
    my analysis turns out to be wrong and someone actually cares it will
    be easy to revert this commit and have the system call again.

    There was one new xtensa defconfig in linux-next that enabled the
    system call this cycle and when asked about it the maintainer of the
    code replied that it was not enabled on purpose. As of today's
    linux-next tree that defconfig no longer enables the system call.

    What we saw in the review discussion was that if we go a step farther
    than my patch and mess with uapi headers there are pieces of code that
    won't compile, but nothing minds the system call actually disappearing
    from the kernel"

    Link: https://lore.kernel.org/lkml/201910011140.EA0181F13@keescook/

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    sysctl: Remove the sysctl system call

    Linus Torvalds
     
  • Currently, the drop_caches proc file and sysctl read back the last value
    written, suggesting this is somehow a stateful setting instead of a
    one-time command. Make it write-only, like e.g. compact_memory.

    While mitigating a VM problem at scale in our fleet, there was confusion
    about whether writing to this file will permanently switch the kernel into
    a non-caching mode. This influences the decision making in a tense
    situation, where tens of people are trying to fix tens of thousands of
    affected machines: Do we need a rollback strategy? What are the
    performance implications of operating in a non-caching state for several
    days? It also caused confusion when the kernel team said we may need to
    write the file several times to make sure it's effective ("But it already
    reads back 3?").

    Link: http://lkml.kernel.org/r/20191031221602.9375-1-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Acked-by: Chris Down
    Acked-by: Vlastimil Babka
    Acked-by: David Hildenbrand
    Acked-by: Michal Hocko
    Acked-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Supporting VMAP_STACK with KASAN_VMALLOC is straightforward:

    - clear the shadow region of vmapped stacks when swapping them in
    - tweak Kconfig to allow VMAP_STACK to be turned on with KASAN

    Link: http://lkml.kernel.org/r/20191031093909.9228-4-dja@axtens.net
    Signed-off-by: Daniel Axtens
    Reviewed-by: Dmitry Vyukov
    Reviewed-by: Andrey Ryabinin
    Cc: Alexander Potapenko
    Cc: Christophe Leroy
    Cc: Mark Rutland
    Cc: Vasily Gorbik
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Axtens
     

01 Dec, 2019

7 commits

  • get_unmapped_area() returns an address or -errno on failure. Historically
    we have checked for the failure with offset_in_page(), which is correct
    but quite hard to read. Newer code started using IS_ERR_VALUE(), which is
    much easier to read. Convert the remaining users of offset_in_page() as well.
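
    Why the two checks are equivalent for this return value can be shown
    with a small userspace sketch. The constants and helper names below
    are stand-ins chosen for illustration (a 4096-byte page and a maximum
    errno of 4095 are assumptions), not the kernel macros themselves:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size */
#define SKETCH_MAX_ERRNO 4095UL /* assumed largest errno value */

/* Old style: a successful result is page aligned, while -errno
 * (-1..-4095) always has some of the low 12 bits set. */
static bool old_check_failed(unsigned long addr)
{
    return (addr & (SKETCH_PAGE_SIZE - 1)) != 0;
}

/* New style: -errno values occupy the top 4095 values of the
 * unsigned long range, so a range check says what is meant. */
static bool new_check_failed(unsigned long addr)
{
    return addr >= (unsigned long)-SKETCH_MAX_ERRNO;
}

/* For a page-aligned address or an -errno value both checks agree. */
static bool mapping_failed(unsigned long addr)
{
    assert(old_check_failed(addr) == new_check_failed(addr));
    return new_check_failed(addr);
}
```

    The range check reads as "is this an error code?", whereas the
    alignment check only works because of how the two kinds of value
    happen to be laid out.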

    [mhocko@suse.com: rewrite changelog]
    [mhocko@kernel.org: fix mremap.c and uprobes.c sites also]
    Link: http://lkml.kernel.org/r/20191012102512.28051-1-pugaowei@gmail.com
    Signed-off-by: Gaowei Pu
    Reviewed-by: Andrew Morton
    Acked-by: Michal Hocko
    Cc: Vlastimil Babka
    Cc: Wei Yang
    Cc: Konstantin Khlebnikov
    Cc: Kirill A. Shutemov
    Cc: "Jérôme Glisse"
    Cc: Mike Kravetz
    Cc: Rik van Riel
    Cc: Qian Cai
    Cc: Shakeel Butt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gaowei Pu
     
  • Pull seccomp updates from Kees Cook:
    "Mostly this is implementing the new flag SECCOMP_USER_NOTIF_FLAG_CONTINUE,
    but there are cleanups as well.

    - implement SECCOMP_USER_NOTIF_FLAG_CONTINUE (Christian Brauner)

    - fixes to selftests (Christian Brauner)

    - remove secure_computing() argument (Christian Brauner)"

    * tag 'seccomp-v5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    seccomp: rework define for SECCOMP_USER_NOTIF_FLAG_CONTINUE
    seccomp: fix SECCOMP_USER_NOTIF_FLAG_CONTINUE test
    seccomp: simplify secure_computing()
    seccomp: test SECCOMP_USER_NOTIF_FLAG_CONTINUE
    seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE
    seccomp: avoid overflow in implicit constant conversion

    Linus Torvalds
     
  • Pull audit updates from Paul Moore:
    "Audit is back for v5.5, albeit with only two patches:

    - Allow for the auditing of suspicious O_CREAT usage via the new
    AUDIT_ANOM_CREAT record.

    - Remove a redundant if-conditional check found during code analysis.
    It's a minor change, but when the pull request is only two patches
    long, you need filler in the pull request email"

    [ Heh on the pull request filler. I wish more people tried to write
    better pull request messages, even if maybe it's not worth it for the
    trivial cases ;^) - Linus ]

    * tag 'audit-pr-20191126' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
    audit: remove redundant condition check in kauditd_thread()
    audit: Report suspicious O_CREAT usage

    Linus Torvalds
     
  • Pull kgdb updates from Daniel Thompson:
    "The major change here is the work from Douglas Anderson that reworks
    the way kdb stack traces are handled on SMP systems. The effect is to
    allow all CPUs to issue their stack trace, which reduces the need for
    architecture-specific code to support stack tracing.

    Also included are general cleanups from Doug and myself:

    - Remove some unused variables or arguments.

    - Tidy up the kdb escape handling code and fix a couple of odd corner
    cases.

    - Better ignore escape characters that do not form part of an escape
    sequence. This mostly benefits vi users since they are most likely
    to press escape as a nervous habit but it won't harm anyone else"

    * tag 'kgdb-5.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/danielt/linux:
    kdb: Tweak escape handling for vi users
    kdb: Improve handling of characters from different input sources
    kdb: Remove special case logic from kdb_read()
    kdb: Simplify code to fetch characters from console
    kdb: Tidy up code to handle escape sequences
    kdb: Avoid array subscript warnings on non-SMP builds
    kdb: Fix stack crawling on 'running' CPUs that aren't the master
    kdb: Fix "btc " crash if the CPU didn't round up
    kdb: Remove unused "argcount" param from kdb_bt1(); make btaprompt bool
    kgdb: Remove unused DCPU_SSTEP definition

    Linus Torvalds
     
  • Pull parisc updates from Helge Deller:
    "Just trivial small updates: An assembler register optimization in the
    inlined networking checksum functions, a compiler warning fix, and
    avoiding an unnecessary runtime warning on machines which wouldn't
    be affected anyway"

    * 'parisc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: Avoid spurious inequivalent alias kernel error messages
    kexec: Fix pointer-to-int-cast warnings
    parisc: Do not hardcode registers in checksum functions

    Linus Torvalds
     
  • …ux/kernel/git/dhowells/linux-fs

    Pull pipe rework from David Howells:
    "This is my set of preparatory patches for building a general
    notification queue on top of pipes. It makes a number of significant
    changes:

    - It removes the nr_exclusive argument from __wake_up_sync_key() as
    this is always 1. This prepares for the next step:

    - Adds wake_up_interruptible_sync_poll_locked() so that poll can be
    woken up from a function that's holding the poll waitqueue
    spinlock.

    - Change the pipe buffer ring to be managed in terms of unbounded
    head and tail indices rather than bounded index and length. This
    means that reading the pipe only needs to modify one index, not
    two.

    - A selection of helper functions are provided to query the state of
    the pipe buffer, plus a couple to apply updates to the pipe
    indices.

    - The pipe ring is allowed to have kernel-reserved slots. This allows
    many notification messages to be spliced in by the kernel without
    allowing userspace to pin too many pages if it writes to the same
    pipe.

    - Advance the head and tail indices inside the pipe waitqueue lock
    and use wake_up_interruptible_sync_poll_locked() to poke poll
    without having to take the lock twice.

    - Rearrange pipe_write() to preallocate the buffer it is going to
    write into and then drop the spinlock. This allows kernel
    notifications to then be added to the ring whilst it is filling the
    buffer it allocated. The read side is stalled because the pipe
    mutex is still held.

    - Don't wake up readers on a pipe if there was already data in it
    when we added more.

    - Don't wake up writers on a pipe if the ring wasn't full before we
    removed a buffer"
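
    The head/tail scheme from the list above can be sketched as a small
    userspace ring. This is a minimal illustration assuming a
    power-of-two slot count; struct ring and the ring_*() helpers are
    hypothetical names, not the kernel's pipe helpers:

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SLOTS 8u /* assumed size; must be a power of two */
_Static_assert((RING_SLOTS & (RING_SLOTS - 1)) == 0, "power of two");

struct ring {
    unsigned int head; /* next slot to write; only ever incremented */
    unsigned int tail; /* next slot to read; only ever incremented */
    int slot[RING_SLOTS];
};

/* Unsigned subtraction stays correct when the indices wrap around. */
static unsigned int ring_occupancy(const struct ring *r)
{
    return r->head - r->tail;
}

static bool ring_empty(const struct ring *r)
{
    return r->head == r->tail;
}

static bool ring_full(const struct ring *r)
{
    return ring_occupancy(r) >= RING_SLOTS;
}

/* Writing touches only head. */
static bool ring_push(struct ring *r, int value)
{
    if (ring_full(r))
        return false;
    r->slot[r->head & (RING_SLOTS - 1)] = value;
    r->head++;
    return true;
}

/* Reading touches only tail, i.e. one index rather than two. */
static bool ring_pop(struct ring *r, int *value)
{
    assert(value);
    if (ring_empty(r))
        return false;
    *value = r->slot[r->tail & (RING_SLOTS - 1)];
    r->tail++;
    return true;
}
```

    With an index-plus-length representation a read has to advance the
    index and decrement the length; here the reader and the writer each
    own exactly one counter.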

    * tag 'notifications-pipe-prep-20191115' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    pipe: Remove sync on wake_ups
    pipe: Increase the writer-wakeup threshold to reduce context-switch count
    pipe: Check for ring full inside of the spinlock in pipe_write()
    pipe: Remove redundant wakeup from pipe_write()
    pipe: Rearrange sequence in pipe_write() to preallocate slot
    pipe: Conditionalise wakeup in pipe_read()
    pipe: Advance tail pointer inside of wait spinlock in pipe_read()
    pipe: Allow pipes to have kernel-reserved slots
    pipe: Use head and tail pointers for the ring, not cursor and length
    Add wake_up_interruptible_sync_poll_locked()
    Remove the nr_exclusive argument from __wake_up_sync_key()
    pipe: Reduce #inclusion of pipe_fs_i.h

    Linus Torvalds
     
  • Pull hmm updates from Jason Gunthorpe:
    "This is another round of bug fixing and cleanup. This time the focus
    is on the driver pattern to use mmu notifiers to monitor a VA range.
    This code is lifted out of many drivers and hmm_mirror directly into
    the mmu_notifier core and written using the best ideas from all the
    driver implementations.

    This removes many bugs from the drivers and has a very pleasing
    diffstat. More drivers can still be converted, but that is for another
    cycle.

    - A shared branch with RDMA reworking the RDMA ODP implementation

    - New mmu_interval_notifier API. This is focused on the use case of
    monitoring a VA and simplifies the process for drivers

    - A common seq-count locking scheme built into the
    mmu_interval_notifier API usable by drivers that call
    get_user_pages() or hmm_range_fault() with the VA range

    - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen
    GntDev drivers to the new API. This deletes a lot of wonky driver
    code.

    - Two improvements for hmm_range_fault(), from testing done by Ralph"
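
    The sequence-count idea behind the locking scheme in the list above
    can be sketched as a retry loop. This is a single-threaded userspace
    model under stated assumptions: struct tracked_range and the range_*()
    and snapshot_range() helpers are hypothetical stand-ins, and a real
    driver would perform the retry check under its own lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical tracked VA range with an invalidation counter. */
struct tracked_range {
    atomic_ulong seq;
};

/* Reader: remember the counter before looking at the range. */
static unsigned long range_read_begin(struct tracked_range *r)
{
    return atomic_load(&r->seq);
}

/* Reader: true if an invalidation happened since range_read_begin(). */
static bool range_read_retry(struct tracked_range *r, unsigned long seq)
{
    return atomic_load(&r->seq) != seq;
}

/* Invalidation side: bump the counter so in-flight readers retry. */
static void range_invalidate(struct tracked_range *r)
{
    atomic_fetch_add(&r->seq, 1);
}

/* Simulate a snapshot that races with a given number of invalidations
 * and return how many attempts were needed. */
static int snapshot_range(struct tracked_range *r, int racing_invalidations)
{
    int attempts = 0;

    assert(racing_invalidations >= 0);
    for (;;) {
        unsigned long seq = range_read_begin(r);

        attempts++;
        /* A driver would look up or fault in the pages here. */
        if (racing_invalidations > 0) {
            range_invalidate(r);
            racing_invalidations--;
        }
        if (!range_read_retry(r, seq))
            return attempts;
    }
}
```

    The reader never blocks the invalidation side; it just discards its
    result and tries again whenever the counter has moved.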

    * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
    mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap
    mm/hmm: make full use of walk_page_range()
    xen/gntdev: use mmu_interval_notifier_insert
    mm/hmm: remove hmm_mirror and related
    drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
    drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
    drm/amdgpu: Call find_vma under mmap_sem
    nouveau: use mmu_interval_notifier instead of hmm_mirror
    nouveau: use mmu_notifier directly for invalidate_range_start
    drm/radeon: use mmu_interval_notifier_insert
    RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
    RDMA/odp: Use mmu_interval_notifier_insert()
    mm/hmm: define the pre-processor related parts of hmm.h even if disabled
    mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror
    mm/mmu_notifier: add an interval tree notifier
    mm/mmu_notifier: define the header pre-processor parts even if disabled
    mm/hmm: allow snapshot of the special zero page

    Linus Torvalds