07 Dec, 2020

1 commit

  • …t/masahiroy/linux-kbuild

    Pull Kbuild fixes from Masahiro Yamada:

    - Move -Wcast-align to W=3, because it tends to produce false positives
    and there is no tree-wide solution.

    - Pass -fmacro-prefix-map to KBUILD_CPPFLAGS because it is a
    preprocessor option and makes sense for .S files as well.

    - Disable -gdwarf-2 for Clang's integrated assembler to avoid warnings.

    - Disable --orphan-handling=warn for LLD 10.0.1 to avoid warnings.

    - Fix undesirable line breaks in *.mod files.

    * tag 'kbuild-fixes-v5.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    kbuild: avoid split lines in .mod files
    kbuild: Disable CONFIG_LD_ORPHAN_WARN for ld.lld 10.0.1
    kbuild: Hoist '--orphan-handling' into Kconfig
    Kbuild: do not emit debug info for assembly with LLVM_IAS=1
    kbuild: use -fmacro-prefix-map for .S sources
    Makefile.extrawarn: move -Wcast-align to W=3

    Linus Torvalds
     

01 Dec, 2020

2 commits

  • ld.lld 10.0.1 spews a bunch of various warnings about .rela sections,
    along with a few others. Newer versions of ld.lld do not have these
    warnings. As a result, do not add '--orphan-handling=warn' to
    LDFLAGS_vmlinux if ld.lld's version is not new enough.
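
    The version gate described above can be sketched as follows. This is
    an illustration only, not the kernel's implementation: the function
    names and the 11.0.0 threshold are assumptions, since the message says
    only that 10.0.1 warns and newer versions do not.

```python
def parse_version(text):
    """Turn a dotted version string such as '10.0.1' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# Assumed threshold: the message only states that 10.0.1 is too old.
MIN_LLD_FOR_ORPHAN_WARN = (11, 0, 0)


def vmlinux_ldflags(linker, version):
    """Return the extra vmlinux linker flags for a (linker, version) pair."""
    flags = []
    is_lld = linker == "ld.lld"
    # Other linkers keep the warning; ld.lld gets it only when new enough.
    if not is_lld or parse_version(version) >= MIN_LLD_FOR_ORPHAN_WARN:
        flags.append("--orphan-handling=warn")
    return flags


print(vmlinux_ldflags("ld.lld", "10.0.1"))
print(vmlinux_ldflags("ld.lld", "11.0.0"))
```

    The sketch leaves out any check that the linker supports the option
    at all.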

    Link: https://github.com/ClangBuiltLinux/linux/issues/1187
    Link: https://github.com/ClangBuiltLinux/linux/issues/1193
    Reported-by: Arvind Sankar
    Reported-by: kernelci.org bot
    Reported-by: Mark Brown
    Reviewed-by: Kees Cook
    Signed-off-by: Nathan Chancellor
    Reviewed-by: Nick Desaulniers
    Signed-off-by: Masahiro Yamada

    Nathan Chancellor
     
  • Currently, '--orphan-handling=warn' is spread out across four different
    architectures in their respective Makefiles, which makes it a little
    unruly to deal with in case it needs to be disabled for a specific
    linker version (in this case, ld.lld 10.0.1).

    To make it easier to control this, hoist this warning into Kconfig and
    the main Makefile so that disabling it is simpler, as the warning will
    only be enabled in a couple places (main Makefile and a couple of
    compressed boot folders that blow away LDFLAGS_vmlinux) and making it
    conditional is easier due to Kconfig syntax. One small additional
    benefit of this is saving a call to ld-option on incremental builds
    because we will have already evaluated it for CONFIG_LD_ORPHAN_WARN.

    To keep the list of supported architectures the same, introduce
    CONFIG_ARCH_WANT_LD_ORPHAN_WARN, which an architecture can select to
    gain this automatically after all of the sections are specified and size
    asserted. A special thanks to Kees Cook for the help text on this
    config.

    Link: https://github.com/ClangBuiltLinux/linux/issues/1187
    Acked-by: Kees Cook
    Acked-by: Michael Ellerman (powerpc)
    Reviewed-by: Nick Desaulniers
    Tested-by: Nick Desaulniers
    Signed-off-by: Nathan Chancellor
    Signed-off-by: Masahiro Yamada

    Nathan Chancellor
     

03 Nov, 2020

1 commit

  • Currently, LOG_BUF_SHIFT defaults to 17, which is 2 ^ 17 bytes = 128 KB,
    and LOG_CPU_MAX_BUF_SHIFT defaults to 12, which is 2 ^ 12 bytes = 4 KB.

    Half of 128 KB is 64 KB, so more than 16 CPUs are required for the value
    to be used, as then the sum of contributions is greater than 64 KB for
    the first time. My guess is that the description was written with the
    configuration values used in SUSE in mind.
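
    The arithmetic above can be checked with a short sketch. It follows
    the wording of this message (sum of per-CPU contributions compared
    with half the default buffer); the helper name is hypothetical, and
    the kernel's own calculation may count CPUs differently.

```python
LOG_BUF_SHIFT = 17          # default from the message: 2**17 bytes = 128 KB
LOG_CPU_MAX_BUF_SHIFT = 12  # default from the message: 2**12 bytes = 4 KB

log_buf_len = 1 << LOG_BUF_SHIFT
per_cpu_len = 1 << LOG_CPU_MAX_BUF_SHIFT
threshold = log_buf_len // 2  # half of the default buffer, 64 KB


def first_cpu_count_over_threshold():
    """Smallest CPU count whose summed contributions exceed the threshold."""
    cpus = 1
    while cpus * per_cpu_len <= threshold:
        cpus += 1
    return cpus


print(log_buf_len // 1024, per_cpu_len // 1024, threshold // 1024)
print(first_cpu_count_over_threshold())
```

    With 16 CPUs the sum is exactly 64 KB, which is not greater than the
    threshold, so under this reading 17 is the first count that qualifies.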

    Fixes: 23b2899f7f194f06e ("printk: allow increasing the ring buffer depending on the number of CPUs")
    Cc: Luis R. Rodriguez
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Paul Menzel
    Reviewed-by: Petr Mladek
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200811092924.6256-1-pmenzel@molgen.mpg.de

    Paul Menzel
     

16 Oct, 2020

1 commit

  • Pull networking updates from Jakub Kicinski:

    - Add redirect_neigh() BPF packet redirect helper, allowing to limit
    stack traversal in common container configs and improving TCP
    back-pressure.

    Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.

    - Expand netlink policy support and improve policy export to user
    space. (Ge)netlink core performs request validation according to
    declared policies. Expand the expressiveness of those policies
    (min/max length and bitmasks). Allow dumping policies for particular
    commands. This is used for feature discovery by user space (instead
    of kernel version parsing or trial and error).

    - Support IGMPv3/MLDv2 multicast listener discovery protocols in
    bridge.

    - Allow more than 255 IPv4 multicast interfaces.

    - Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
    packets of TCPv6.

    - In Multipath TCP (MPTCP) support concurrent transmission of data on
    multiple subflows in a load balancing scenario. Enhance advertising
    addresses via the RM_ADDR/ADD_ADDR options.

    - Support SMC-Dv2 version of SMC, which enables multi-subnet
    deployments.

    - Allow more calls to same peer in RxRPC.

    - Support two new Controller Area Network (CAN) protocols - CAN-FD and
    ISO 15765-2:2016.

    - Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
    kernel problem.

    - Add TC actions for implementing MPLS L2 VPNs.

    - Improve nexthop code - e.g. handle various corner cases when nexthop
    objects are removed from groups better, skip unnecessary
    notifications and make it easier to offload nexthops into HW by
    converting to a blocking notifier.

    - Support adding and consuming TCP header options by BPF programs,
    opening the doors for easy experimental and deployment-specific TCP
    option use.

    - Reorganize TCP congestion control (CC) initialization to simplify
    life of TCP CC implemented in BPF.

    - Add support for shipping BPF programs with the kernel and loading
    them early on boot via the User Mode Driver mechanism, hence reusing
    all the user space infra we have.

    - Support sleepable BPF programs, initially targeting LSM and tracing.

    - Add bpf_d_path() helper for returning full path for given 'struct
    path'.

    - Make bpf_tail_call compatible with bpf-to-bpf calls.

    - Allow BPF programs to call map_update_elem on sockmaps.

    - Add BPF Type Format (BTF) support for type and enum discovery, as
    well as support for using BTF within the kernel itself (current use
    is for pretty printing structures).

    - Support listing and getting information about bpf_links via the bpf
    syscall.

    - Enhance kernel interfaces around NIC firmware update. Allow
    specifying overwrite mask to control if settings etc. are reset
    during update; report expected max time operation may take to users;
    support firmware activation without machine reboot incl. limits of
    how much impact reset may have (e.g. dropping link or not).

    - Extend ethtool configuration interface to report IEEE-standard
    counters, to limit the need for per-vendor logic in user space.

    - Adopt or extend devlink use for debug, monitoring, fw update in many
    drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx,
    dpaa2-eth).

    - In mlxsw expose critical and emergency SFP module temperature alarms.
    Refactor port buffer handling to make the defaults more suitable and
    support setting these values explicitly via the DCBNL interface.

    - Add XDP support for Intel's igb driver.

    - Support offloading TC flower classification and filtering rules to
    mscc_ocelot switches.

    - Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
    fixed interval period pulse generator and one-step timestamping in
    dpaa-eth.

    - Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
    offload.

    - Add Lynx PHY/PCS MDIO module, and convert various drivers which have
    this HW to use it. Convert mvpp2 to split PCS.

    - Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
    7-port Mediatek MT7531 IP.

    - Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
    and wcn3680 support in wcn36xx.

    - Improve performance for packets which don't require much offloads on
    recent Mellanox NICs by 20% by making multiple packets share a
    descriptor entry.

    - Move chelsio inline crypto drivers (for TLS and IPsec) from the
    crypto subtree to drivers/net. Move MDIO drivers out of the phy
    directory.

    - Clean up a lot of W=1 warnings, reportedly the actively developed
    subsections of networking drivers should now build W=1 warning free.

    - Make sure drivers don't use in_interrupt() to dynamically adapt their
    code. Convert tasklets to use new tasklet_setup API (sadly this
    conversion is not yet complete).

    * tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits)
    Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH"
    net, sockmap: Don't call bpf_prog_put() on NULL pointer
    bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo
    bpf, sockmap: Add locking annotations to iterator
    netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements
    net: fix pos incrementment in ipv6_route_seq_next
    net/smc: fix invalid return code in smcd_new_buf_create()
    net/smc: fix valid DMBE buffer sizes
    net/smc: fix use-after-free of delayed events
    bpfilter: Fix build error with CONFIG_BPFILTER_UMH
    cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr
    net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info
    bpf: Fix register equivalence tracking.
    rxrpc: Fix loss of final ack on shutdown
    rxrpc: Fix bundle counting for exclusive connections
    netfilter: restore NF_INET_NUMHOOKS
    ibmveth: Identify ingress large send packets.
    ibmveth: Switch order of ibmveth_helper calls.
    cxgb4: handle 4-tuple PEDIT to NAT mode translation
    selftests: Add VRF route leaking tests
    ...

    Linus Torvalds
     

14 Oct, 2020

1 commit

  • Pull printk updates from Petr Mladek:
    "The big new thing is the fully lockless ringbuffer implementation,
    including the support for continuous lines. It will allow storing and
    reading messages in any situation without the risk of deadlocks and
    without the need of temporary per-CPU buffers.

    The access is still serialized by logbuf_lock. It synchronizes a few
    more operations, for example, temporary buffer for formatting the
    message, syslog and kmsg_dump operations. The lock removal is being
    discussed and should be ready for the next release.

    The continuous lines are handled exactly the same way as before to
    avoid regressions in user space. It means that they are appended to
    the last message when the caller is the same. Only the last message
    can be extended.

    The data ring includes plain text of the messages. Except for an
    integer at the beginning of each message that points back to the
    descriptor ring with other metadata.

    The dictionary has to stay. journalctl uses it to filter the log. It
    allows to show messages related to a given device. The dictionary
    values are stored in the descriptor ring with the other metadata.

    This is the first part of the printk rework as discussed at Plumbers
    2019, see https://lore.kernel.org/r/87k1acz5rx.fsf@linutronix.de. The
    next big step will be handling consoles by kthreads during the normal
    system operation. It will require special handling of situations when
    the kthreads could not get scheduled, for example, early boot,
    suspend, panic.

    Other changes:

    - Add John Ogness as a reviewer for printk subsystem. He is author of
    the rework and is familiar with the code and history.

    - Fix locking in serial8250_do_startup() to prevent lockdep report.

    - A few code cleanups"

    * tag 'printk-for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (27 commits)
    printk: Use fallthrough pseudo-keyword
    printk: reduce setup_text_buf size to LOG_LINE_MAX
    printk: avoid and/or handle record truncation
    printk: remove dict ring
    printk: move dictionary keys to dev_printk_info
    printk: move printk_info into separate array
    printk: reimplement log_cont using record extension
    printk: ringbuffer: add finalization/extension support
    printk: ringbuffer: change representation of states
    printk: ringbuffer: clear initial reserved fields
    printk: ringbuffer: add BLK_DATALESS() macro
    printk: ringbuffer: relocate get_data()
    printk: ringbuffer: avoid memcpy() on state_var
    printk: ringbuffer: fix setting state in desc_read()
    kernel.h: Move oops_in_progress to printk.h
    scripts/gdb: update for lockless printk ringbuffer
    scripts/gdb: add utils.read_ulong()
    docs: vmcoreinfo: add lockless printk ringbuffer vmcoreinfo
    printk: reduce LOG_BUF_SHIFT range for H8300
    printk: ringbuffer: support dataless records
    ...

    Linus Torvalds
     

12 Oct, 2020

1 commit


25 Sep, 2020

1 commit

  • nommu-mmap.rst was moved to Documentation/admin-guide/mm; this patch
    updates the remaining stale references to Documentation/mm.

    Fixes: 800c02f5d030 ("docs: move nommu-mmap.txt to admin-guide and rename to ReST")
    Signed-off-by: Stephen Kitt
    Link: https://lore.kernel.org/r/20200812092230.27541-1-steve@sk2.org
    Signed-off-by: Jonathan Corbet

    Stephen Kitt
     

08 Sep, 2020

1 commit

  • The .bss section for the h8300 is relatively small. A value of
    CONFIG_LOG_BUF_SHIFT that is larger than 19 will create a static
    printk ringbuffer that is too large. Limit the range appropriately
    for the H8300.

    Reported-by: kernel test robot
    Signed-off-by: John Ogness
    Reviewed-by: Sergey Senozhatsky
    Acked-by: Steven Rostedt (VMware)
    Signed-off-by: Petr Mladek
    Link: https://lore.kernel.org/r/20200812073122.25412-1-john.ogness@linutronix.de

    John Ogness
     

29 Aug, 2020

1 commit

  • Introduce sleepable BPF programs that can request such property for themselves
    via BPF_F_SLEEPABLE flag at program load time. In such case they will be able
    to use helpers like bpf_copy_from_user() that might sleep. At present only
    fentry/fexit/fmod_ret and lsm programs can request to be sleepable and only
    when they are attached to kernel functions that are known to allow sleeping.

    The non-sleepable programs are relying on implicit rcu_read_lock() and
    migrate_disable() to protect life time of programs, maps that they use and
    per-cpu kernel structures used to pass info between bpf programs and the
    kernel. The sleepable programs cannot be enclosed into rcu_read_lock().
    migrate_disable() maps to preempt_disable() in non-RT kernels, so the progs
    should not be enclosed in migrate_disable() as well. Therefore
    rcu_read_lock_trace is used to protect the life time of sleepable progs.

    There are many networking and tracing program types. In many cases the
    'struct bpf_prog *' pointer itself is rcu protected within some other kernel
    data structure and the kernel code is using rcu_dereference() to load that
    program pointer and call BPF_PROG_RUN() on it. All these cases are not touched.
    Instead sleepable bpf programs are allowed with bpf trampoline only. The
    program pointers are hard-coded into generated assembly of bpf trampoline and
    synchronize_rcu_tasks_trace() is used to protect the life time of the program.
    The same trampoline can hold both sleepable and non-sleepable progs.

    When rcu_read_lock_trace is held it means that some sleepable bpf program is
    running from bpf trampoline. Those programs can use bpf arrays and preallocated
    hash/lru maps. These map types are waiting on programs to complete via
    synchronize_rcu_tasks_trace();

    Updates to the trampoline now have to call synchronize_rcu_tasks_trace()
    and synchronize_rcu_tasks() to wait for sleepable progs to finish and for
    trampoline assembly to finish.

    This is the first step of introducing sleepable progs. Eventually dynamically
    allocated hash maps can be allowed and networking program types can become
    sleepable too.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Josef Bacik
    Acked-by: Andrii Nakryiko
    Acked-by: KP Singh
    Link: https://lore.kernel.org/bpf/20200827220114.69225-3-alexei.starovoitov@gmail.com

    Alexei Starovoitov
     

20 Aug, 2020

1 commit

  • Add kernel module with user mode driver that populates bpffs with
    BPF iterators.

    $ mount bpffs /my/bpffs/ -t bpf
    $ ls -la /my/bpffs/
    total 4
    drwxrwxrwt 2 root root 0 Jul 2 00:27 .
    drwxr-xr-x 19 root root 4096 Jul 2 00:09 ..
    -rw------- 1 root root 0 Jul 2 00:27 maps.debug
    -rw------- 1 root root 0 Jul 2 00:27 progs.debug

    The user mode driver will load BPF Type Formats, create BPF maps, populate BPF
    maps, load two BPF programs, attach them to BPF iterators, and finally send two
    bpf_link IDs back to the kernel.
    The kernel will pin two bpf_links into newly mounted bpffs instance under
    names "progs.debug" and "maps.debug". These two files become human readable.

    $ cat /my/bpffs/progs.debug
    id name attached
    11 dump_bpf_map bpf_iter_bpf_map
    12 dump_bpf_prog bpf_iter_bpf_prog
    27 test_pkt_access
    32 test_main test_pkt_access test_pkt_access
    33 test_subprog1 test_pkt_access_subprog1 test_pkt_access
    34 test_subprog2 test_pkt_access_subprog2 test_pkt_access
    35 test_subprog3 test_pkt_access_subprog3 test_pkt_access
    36 new_get_skb_len get_skb_len test_pkt_access
    37 new_get_skb_ifindex get_skb_ifindex test_pkt_access
    38 new_get_constant get_constant test_pkt_access

    The BPF program dump_bpf_prog() in iterators.bpf.c is printing this data about
    all BPF programs currently loaded in the system. This information is unstable
    and will change from kernel to kernel as ".debug" suffix conveys.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20200819042759.51280-4-alexei.starovoitov@gmail.com

    Alexei Starovoitov
     

08 Aug, 2020

1 commit

  • Patch series "mm: Expand CONFIG_SLAB_FREELIST_HARDENED to include SLAB"

    In reviewing Vlastimil Babka's latest slub debug series, I realized[1]
    that several checks under CONFIG_SLAB_FREELIST_HARDENED weren't being
    applied to SLAB. Fix this by expanding the Kconfig coverage, and adding a
    simple double-free test for SLAB.

    This patch (of 2):

    Include SLAB caches when performing kmem_cache pointer verification. A
    defense against such corruption[1] should be applied to all the
    allocators. With this added, the "SLAB_FREE_CROSS" and "SLAB_FREE_PAGE"
    LKDTM tests now pass on SLAB:

    lkdtm: Performing direct entry SLAB_FREE_CROSS
    lkdtm: Attempting cross-cache slab free ...
    ------------[ cut here ]------------
    cache_from_obj: Wrong slab cache. lkdtm-heap-b but object is from lkdtm-heap-a
    WARNING: CPU: 2 PID: 2195 at mm/slab.h:530 kmem_cache_free+0x8d/0x1d0
    ...
    lkdtm: Performing direct entry SLAB_FREE_PAGE
    lkdtm: Attempting non-Slab slab free ...
    ------------[ cut here ]------------
    virt_to_cache: Object is not a Slab page!
    WARNING: CPU: 1 PID: 2202 at mm/slab.h:489 kmem_cache_free+0x196/0x1d0
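
    The cross-cache check exercised by SLAB_FREE_CROSS can be pictured
    with a toy model. Everything below is hypothetical (class names,
    exception type); it only mirrors the idea that a free is refused when
    the object came from a different cache. The kernel emits a WARNING
    rather than raising an exception.

```python
class ToyObject:
    """An allocation that remembers the cache it came from."""

    def __init__(self, owner):
        self.owner = owner


class ToyCache:
    """A named cache that verifies ownership before accepting a free."""

    def __init__(self, name):
        self.name = name

    def alloc(self):
        return ToyObject(self)

    def free(self, obj):
        # Refuse objects that were allocated from some other cache.
        if obj.owner is not self:
            raise ValueError(
                f"Wrong slab cache. {self.name} but object is from {obj.owner.name}"
            )
        obj.owner = None  # mark as released in this toy model


cache_a = ToyCache("lkdtm-heap-a")
cache_b = ToyCache("lkdtm-heap-b")
obj = cache_a.alloc()
try:
    cache_b.free(obj)
except ValueError as err:
    print(err)
```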

    Additionally clean up neighboring Kconfig entries for clarity,
    readability, and redundant option removal.

    [1] https://github.com/ThomasKing2014/slides/raw/master/Building%20universal%20Android%20rooting%20with%20a%20type%20confusion%20vulnerability.pdf

    Fixes: 598a0717a816 ("mm/slab: validate cache membership under freelist hardening")
    Signed-off-by: Kees Cook
    Signed-off-by: Andrew Morton
    Acked-by: Vlastimil Babka
    Cc: Alexander Popov
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Jann Horn
    Cc: Joonsoo Kim
    Cc: Matthew Garrett
    Cc: Pekka Enberg
    Cc: Roman Gushchin
    Cc: Vijayanand Jitta
    Cc: Vinayak Menon
    Link: http://lkml.kernel.org/r/20200625215548.389774-1-keescook@chromium.org
    Link: http://lkml.kernel.org/r/20200625215548.389774-2-keescook@chromium.org
    Signed-off-by: Linus Torvalds

    Kees Cook
     

05 Aug, 2020

1 commit

  • Pull documentation updates from Jonathan Corbet:
    "It's been a busy cycle for documentation - hopefully the busiest for a
    while to come. Changes include:

    - Some new Chinese translations

    - Progress on the battle against double words words and non-HTTPS
    URLs

    - Some block-mq documentation

    - More RST conversions from Mauro. At this point, that task is
    essentially complete, so we shouldn't see this kind of churn again
    for a while. Unless we decide to switch to asciidoc or
    something...:)

    - Lots of typo fixes, warning fixes, and more"

    * tag 'docs-5.9' of git://git.lwn.net/linux: (195 commits)
    scripts/kernel-doc: optionally treat warnings as errors
    docs: ia64: correct typo
    mailmap: add entry for
    doc/zh_CN: add cpu-load Chinese version
    Documentation/admin-guide: tainted-kernels: fix spelling mistake
    MAINTAINERS: adjust kprobes.rst entry to new location
    devices.txt: document rfkill allocation
    PCI: correct flag name
    docs: filesystems: vfs: correct flag name
    docs: filesystems: vfs: correct sync_mode flag names
    docs: path-lookup: markup fixes for emphasis
    docs: path-lookup: more markup fixes
    docs: path-lookup: fix HTML entity mojibake
    CREDITS: Replace HTTP links with HTTPS ones
    docs: process: Add an example for creating a fixes tag
    doc/zh_CN: add Chinese translation prefer section
    doc/zh_CN: add clearing-warn-once Chinese version
    doc/zh_CN: add admin-guide index
    doc:it_IT: process: coding-style.rst: Correct __maybe_unused compiler label
    futex: MAINTAINERS: Re-add selftests directory
    ...

    Linus Torvalds
     

04 Aug, 2020

1 commit

  • Pull x86 boot updates from Ingo Molnar:
    "The main change in this cycle was to add support for ZSTD-compressed
    kernel and initrd images.

    ZSTD has a very fast decompressor, yet it compresses better than gzip"

    * tag 'x86-boot-2020-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    Documentation: dontdiff: Add zstd compressed files
    .gitignore: Add ZSTD-compressed files
    x86: Add support for ZSTD compressed kernel
    x86: Bump ZO_z_extra_bytes margin for zstd
    usr: Add support for zstd compressed initramfs
    init: Add support for zstd compressed kernel
    lib: Add zstd support to decompress
    lib: Prepare zstd for preboot environment, improve performance

    Linus Torvalds
     

31 Jul, 2020

1 commit

  • - Add the zstd and zstd22 cmds to scripts/Makefile.lib

    - Add the HAVE_KERNEL_ZSTD and KERNEL_ZSTD options

    Architecture specific support is still needed for decompression.

    Signed-off-by: Nick Terrell
    Signed-off-by: Ingo Molnar
    Tested-by: Sedat Dilek
    Reviewed-by: Kees Cook
    Link: https://lore.kernel.org/r/20200730190841.2071656-4-nickrterrell@gmail.com

    Nick Terrell
     

29 Jul, 2020

1 commit

  • Qian reported that the current setup forgoes the Kconfig dependencies and
    results in warnings such as:

    WARNING: unmet direct dependencies detected for SCHED_THERMAL_PRESSURE
    Depends on [n]: SMP [=y] && CPU_FREQ_THERMAL [=n]
    Selected by [y]:
    - ARM64 [=y]

    Revert commit

    e17ae7fea871 ("arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE")

    and re-implement it by making the option default to 'y' for arm64 and arm,
    which respects Kconfig dependencies (i.e. will remain 'n' if
    CPU_FREQ_THERMAL=n).
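
    The difference between the two approaches can be sketched with a toy
    model. This is not how Kconfig is implemented; the two helper
    functions are hypothetical and capture only the behaviour described
    above: a select forces the symbol on and warns about unmet
    dependencies, while a default applies only when they are met.

```python
def resolve_with_select(depends_ok, selected):
    """Return (value, warning) for a symbol that another symbol selects."""
    value = selected                       # select ignores dependencies
    warning = selected and not depends_ok  # and produces a warning instead
    return value, warning


def resolve_with_default(depends_ok, default_on):
    """Return (value, warning) for a symbol that merely defaults to 'y'."""
    value = depends_ok and default_on      # the default respects dependencies
    return value, False


# Situation from the report: SMP=y but CPU_FREQ_THERMAL=n.
depends_ok = True and False
print(resolve_with_select(depends_ok, selected=True))
print(resolve_with_default(depends_ok, default_on=True))
```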

    Fixes: e17ae7fea871 ("arm, arm64: Select CONFIG_SCHED_THERMAL_PRESSURE")
    Reported-by: Qian Cai
    Signed-off-by: Valentin Schneider
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200729135718.1871-1-valentin.schneider@arm.com

    Valentin Schneider
     

22 Jul, 2020

1 commit

  • As Russell pointed out [1], this option is severely lacking in the
    documentation department, and figuring out if one has the required
    dependencies to benefit from turning it on is not straightforward.

    Make it non user-visible, and add a bit of help to it. While at it, make it
    depend on CPU_FREQ_THERMAL.

    [1]: https://lkml.kernel.org/r/20200603173150.GB1551@shell.armlinux.org.uk

    Signed-off-by: Valentin Schneider
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/20200712165917.9168-3-valentin.schneider@arm.com

    Valentin Schneider
     

01 Jul, 2020

1 commit

  • scripts/cc-can-link.sh tests if the compiler can link userspace
    programs.

    When $(CC) is GCC, it is checked against the target architecture
    because the toolchain prefix is specified as a part of $(CC).

    When $(CC) is Clang, it is checked against the host architecture
    because --target option is missing.

    Pass $(CLANG_FLAGS) to scripts/cc-can-link.sh to evaluate the link
    capability for the target architecture.

    Signed-off-by: Masahiro Yamada
    Reviewed-by: Nathan Chancellor

    Masahiro Yamada
     

27 Jun, 2020

1 commit

  • The nommu-mmap.txt file describes user-visible behaviour, so move it
    to the admin-guide.

    As it is already in ReST format, also rename it.

    Suggested-by: Mike Rapoport
    Suggested-by: Jonathan Corbet
    Signed-off-by: Mauro Carvalho Chehab
    Link: https://lore.kernel.org/r/3a63d1833b513700755c85bf3bda0a6c4ab56986.1592918949.git.mchehab+huawei@kernel.org
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     

14 Jun, 2020

3 commits

  • Pull more Kbuild updates from Masahiro Yamada:

    - fix build rules in binderfs sample

    - fix build errors when Kbuild recurses to the top Makefile

    - convert '---help---' in Kconfig to 'help'

    * tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    treewide: replace '---help---' in Kconfig files with 'help'
    kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables
    samples: binderfs: really compile this sample and fix build issues

    Linus Torvalds
     
  • Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over
    '---help---'"), the number of '---help---' has been gradually
    decreasing, but there are still more than 2400 instances.

    This commit finishes the conversion. While I touched the lines,
    I also fixed the indentation.

    There are a variety of indentation styles found.

    a) 4 spaces + '---help---'
    b) 7 spaces + '---help---'
    c) 8 spaces + '---help---'
    d) 1 space + 1 tab + '---help---'
    e) 1 tab + '---help---' (correct indentation)
    f) 1 tab + 1 space + '---help---'
    g) 1 tab + 2 spaces + '---help---'

    In order to convert all of them to 1 tab + 'help', I ran the
    following command:

    $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
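
    For readers who want to see the effect of the substitution on each
    indentation style listed above, here is an in-memory sketch. It is
    not a replacement for the command; the helper name and the sample
    strings are made up for illustration, and the pattern is narrowed to
    spaces and tabs because it runs over multi-line text.

```python
import re

# Leading spaces or tabs, then '---help---', anchored at each line start.
HELP_RE = re.compile(r"^[ \t]*---help---", flags=re.MULTILINE)


def convert_help(text):
    """Rewrite every '---help---' line so it becomes one tab plus 'help'."""
    return HELP_RE.sub("\thelp", text)


samples = [
    "    ---help---",      # a) 4 spaces
    "       ---help---",   # b) 7 spaces
    "        ---help---",  # c) 8 spaces
    " \t---help---",       # d) 1 space + 1 tab
    "\t---help---",        # e) 1 tab
    "\t ---help---",       # f) 1 tab + 1 space
    "\t  ---help---",      # g) 1 tab + 2 spaces
]
converted = [convert_help(line) for line in samples]
print(converted)
```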

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • …git/dhowells/linux-fs

    Pull notification queue from David Howells:
    "This adds a general notification queue concept and adds an event
    source for keys/keyrings, such as linking and unlinking keys and
    changing their attributes.

    Thanks to Debarshi Ray, we do have a pull request to use this to fix a
    problem with gnome-online-accounts - as mentioned last time:

    https://gitlab.gnome.org/GNOME/gnome-online-accounts/merge_requests/47

    Without this, g-o-a has to constantly poll a keyring-based kerberos
    cache to find out if kinit has changed anything.

    [ There are other notification pending: mount/sb fsinfo notifications
    for libmount that Karel Zak and Ian Kent have been working on, and
    Christian Brauner would like to use them in lxc, but let's see how
    this one works first ]

    LSM hooks are included:

    - A set of hooks are provided that allow an LSM to rule on whether or
    not a watch may be set. Each of these hooks takes a different
    "watched object" parameter, so they're not really shareable. The
    LSM should use current's credentials. [Wanted by SELinux & Smack]

    - A hook is provided to allow an LSM to rule on whether or not a
    particular message may be posted to a particular queue. This is
    given the credentials from the event generator (which may be the
    system) and the watch setter. [Wanted by Smack]

    I've provided SELinux and Smack with implementations of some of these
    hooks.

    WHY
    ===

    Key/keyring notifications are desirable because if you have your
    kerberos tickets in a file/directory, your Gnome desktop will monitor
    that using something like fanotify and tell you if your credentials
    cache changes.

    However, we also have the ability to cache your kerberos tickets in
    the session, user or persistent keyring so that it isn't left around
    on disk across a reboot or logout. Keyrings, however, cannot currently
    be monitored asynchronously, so the desktop has to poll for it - not
    so good on a laptop. This facility will allow the desktop to avoid the
    need to poll.

    DESIGN DECISIONS
    ================

    - The notification queue is built on top of a standard pipe. Messages
    are effectively spliced in. The pipe is opened with a special flag:

    pipe2(fds, O_NOTIFICATION_PIPE);

    The special flag has the same value as O_EXCL (which doesn't seem
    like it will ever be applicable in this context)[?]. It is given up
    front to make it a lot easier to prohibit splice&co from accessing
    the pipe.

    [?] Should this be done some other way? I'd rather not use up a new
    O_* flag if I can avoid it - should I add a pipe3() system call
    instead?

    The pipe is then configured::

    ioctl(fds[1], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);
    ioctl(fds[1], IOC_WATCH_QUEUE_SET_FILTER, &filter);

    Messages are then read out of the pipe using read().

    - It should be possible to allow write() to insert data into the
    notification pipes too, but this is currently disabled as the
    kernel has to be able to insert messages into the pipe *without*
    holding pipe->mutex and the code to make this work needs careful
    auditing.

    - sendfile(), splice() and vmsplice() are disabled on notification
    pipes because of the pipe->mutex issue and also because they
    sometimes want to revert what they just did - but one or more
    notification messages might've been interleaved in the ring.

    - The kernel inserts messages with the wait queue spinlock held. This
    means that pipe_read() and pipe_write() have to take the spinlock
    to update the queue pointers.

    - Records in the buffer are binary, typed and have a length so that
    they can be of varying size.

    This allows multiple heterogeneous sources to share a common
    buffer; there are 16 million types available, of which I've used
    just a few, so there is scope for others to be used. Tags may be
    specified when a watchpoint is created to help distinguish the
    sources.

    - Records are filterable as types have up to 256 subtypes that can be
    individually filtered. Other filtration is also available.

    - Notification pipes don't interfere with each other; each may be
    bound to a different set of watches. Any particular notification
    will be copied to all the queues that are currently watching for it
    - and only those that are watching for it.

    - When recording a notification, the kernel will not sleep, but will
    rather mark a queue as having lost a message if there's
    insufficient space. read() will fabricate a loss notification
    message at an appropriate point later.

    - The notification pipe is created and then watchpoints are attached
    to it, using one of:

    keyctl_watch_key(KEY_SPEC_SESSION_KEYRING, fds[1], 0x01);
    watch_mount(AT_FDCWD, "/", 0, fd, 0x02);
    watch_sb(AT_FDCWD, "/mnt", 0, fd, 0x03);

    where in each case, fd indicates the queue and the number after is
    a tag between 0 and 255.

    - Watches are removed if either the notification pipe is destroyed or
    the watched object is destroyed. In the latter case, a message will
    be generated indicating the enforced watch removal.
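
    The per-type subtype filtering mentioned above can be pictured as a
    256-bit bitmap with one bit per subtype. The following is a minimal
    sketch under that assumption only; struct subtype_filter and the
    filter_*() helpers are hypothetical names for illustration and are not
    the kernel's UAPI definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical filter for one notification type: one bit per subtype. */
struct subtype_filter {
    uint32_t type;          /* 24-bit notification type this filter applies to */
    uint32_t subtypes[8];   /* 8 x 32 bits = 256 subtype bits */
};

/* Start a filter that accepts no subtypes of the given type. */
static void filter_init(struct subtype_filter *f, uint32_t type)
{
    memset(f, 0, sizeof(*f));
    f->type = type & 0xffffff;
}

/* Allow one subtype (0..255) through the filter. */
static void filter_allow(struct subtype_filter *f, uint8_t subtype)
{
    f->subtypes[subtype >> 5] |= UINT32_C(1) << (subtype & 31);
}

/* Return true if a record with this type and subtype passes the filter. */
static bool filter_match(const struct subtype_filter *f,
                         uint32_t type, uint8_t subtype)
{
    if (type != f->type)
        return false;
    return (f->subtypes[subtype >> 5] >> (subtype & 31)) & 1;
}
```

    A caller would build one such filter per type of interest, call
    filter_allow() for each wanted subtype, and drop any record for which
    filter_match() returns false.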

    Things I want to avoid:

    - Introducing features that make the core VFS dependent on the
    network stack or networking namespaces (ie. usage of netlink).

    - Dumping all this stuff into dmesg and having a daemon that sits
    there parsing the output and distributing it as this then puts the
    responsibility for security into userspace and makes handling
    namespaces tricky. Further, dmesg might not exist or might be
    inaccessible inside a container.

    - Letting users see events they shouldn't be able to see.

    TESTING AND MANPAGES
    ====================

    - The keyutils tree has a pipe-watch branch that has keyctl commands
    for making use of notifications. Proposed manual pages can also be
    found on this branch, though a couple of them really need to go to
    the main manpages repository instead.

    If the kernel supports the watching of keys, then running "make
    test" on that branch will cause the testing infrastructure to spawn
    a monitoring process on the side that monitors a notifications pipe
    for all the key/keyring changes induced by the tests and they'll
    all be checked off to make sure they happened.

    https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/log/?h=pipe-watch

    - A test program is provided (samples/watch_queue/watch_test) that
    can be used to monitor for keyrings, mount and superblock events.
    Information on the notifications is simply logged to stdout"

    * tag 'notifications-20200601' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    smack: Implement the watch_key and post_notification hooks
    selinux: Implement the watch_key security hook
    keys: Make the KEY_NEED_* perms an enum rather than a mask
    pipe: Add notification lossage handling
    pipe: Allow buffers to be marked read-whole-or-error for notifications
    Add sample notification program
    watch_queue: Add a key/keyring notification facility
    security: Add hooks to rule on setting a watch
    pipe: Add general notification queue support
    pipe: Add O_NOTIFICATION_PIPE
    security: Add a hook for the point of notification insertion
    uapi: General notification queue definitions

    Linus Torvalds
     

11 Jun, 2020

1 commit

  • Pull READ/WRITE_ONCE rework from Will Deacon:
    "This is the READ_ONCE rework I've been working on for a while, which
    bumps the minimum GCC version and improves code-gen on arm64 when
    stack protector is enabled"

    [ Side note: I'm _really_ tempted to raise the minimum gcc version to
    4.9, so that we can just say that we require _Generic() support.

    That would allow us to more cleanly handle a lot of the cases where we
    depend on very complex macros with 'sizeof' or __builtin_choose_expr()
    with __builtin_types_compatible_p() etc.

    This branch has a workaround for sparse not handling _Generic(),
    either, but that was already fixed in the sparse development branch,
    so it's really just gcc-4.9 that we'd require. - Linus ]
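
    For readers who have not used _Generic() (C11), here is a small sketch
    of the kind of compile-time type dispatch it offers in place of nested
    __builtin_choose_expr()/__builtin_types_compatible_p() macros. The
    macro and function names are hypothetical and are not taken from the
    kernel headers.

```c
#include <stdbool.h>

/*
 * Classify an expression by type at compile time. _Generic selects the
 * association whose type matches the controlling expression.
 */
#define is_long_sized_int(x) _Generic((x),  \
        long: true,                         \
        unsigned long: true,                \
        default: false)

/* A long matches the first association. */
static bool demo_long(void)
{
    long v = 0;
    return is_long_sized_int(v);
}

/* A char matches none of the listed types and takes the default. */
static bool demo_char(void)
{
    char c = 0;
    return is_long_sized_int(c);
}
```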

    * 'rwonce/rework' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux:
    compiler_types.h: Use unoptimized __unqual_scalar_typeof for sparse
    compiler_types.h: Optimize __unqual_scalar_typeof compilation time
    compiler.h: Enforce that READ_ONCE_NOCHECK() access size is sizeof(long)
    compiler-types.h: Include naked type in __pick_integer_type() match
    READ_ONCE: Fix comment describing 2x32-bit atomicity
    gcov: Remove old GCC 3.4 support
    arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros
    locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
    READ_ONCE: Drop pointer qualifiers when reading from scalar types
    READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
    READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
    arm64: csum: Disable KASAN for do_csum()
    fault_inject: Don't rely on "return value" from WRITE_ONCE()
    net: tls: Avoid assigning 'const' pointer to non-const pointer
    netfilter: Avoid assigning 'const' pointer to non-const pointer
    compiler/gcc: Raise minimum GCC version for kernel builds to 4.8

    Linus Torvalds
     

07 Jun, 2020

1 commit

  • Pull Kbuild updates from Masahiro Yamada:

    - fix warnings in 'make clean' for ARCH=um, hexagon, h8300, unicore32

    - ensure all objects are rebuilt when the compiler is upgraded

    - exclude system headers from dependency tracking and fixdep processing

    - fix potential bit-size mismatch between the kernel and BPF user-mode
    helper

    - add the new syntax 'userprogs' to build user-space programs for the
    target architecture (the same arch as the kernel)

    - compile user-space sample code under samples/ for the target arch
    instead of the host arch

    - make headers_install fail if a CONFIG option is leaked to user-space

    - sanitize the output format of scripts/checkstack.pl

    - handle ARM 'push' instruction in scripts/checkstack.pl

    - error out before modpost if a module name conflict is found

    - error out when multiple directories are passed to M= because this
    feature has been broken for a long time

    - add CONFIG_DEBUG_INFO_COMPRESSED to support compressed debug info

    - a lot of cleanups of modpost

    - dump vmlinux symbols out into vmlinux.symvers, and reuse it in the
    second pass of modpost

    - do not run the second pass of modpost if nothing in modules is
    updated

    - install modules.builtin(.modinfo) by 'make install' as well as by
    'make modules_install' because it is useful even when
    CONFIG_MODULES=n

    - add new command line variables, GZIP, BZIP2, LZOP, LZMA, LZ4, and XZ
    to allow users to use alternatives such as pigz, pbzip2, etc.

    * tag 'kbuild-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (96 commits)
    kbuild: add variables for compression tools
    Makefile: install modules.builtin even if CONFIG_MODULES=n
    mksysmap: Fix the mismatch of '.L' symbols in System.map
    kbuild: doc: rename LDFLAGS to KBUILD_LDFLAGS
    modpost: change elf_info->size to size_t
    modpost: remove is_vmlinux() helper
    modpost: strip .o from modname before calling new_module()
    modpost: set have_vmlinux in new_module()
    modpost: remove mod->skip struct member
    modpost: add mod->is_vmlinux struct member
    modpost: remove is_vmlinux() call in check_for_{gpl_usage,unused}()
    modpost: remove mod->is_dot_o struct member
    modpost: move -d option in scripts/Makefile.modpost
    modpost: remove -s option
    modpost: remove get_next_text() and make {grab,release_}file static
    modpost: use read_text_file() and get_line() for reading text files
    modpost: avoid false-positive file open error
    modpost: fix potential mmap'ed file overrun in get_src_version()
    modpost: add read_text_file() and get_line() helpers
    modpost: do not call get_modinfo() for vmlinux(.o)
    ...

    Linus Torvalds
     

05 Jun, 2020

2 commits

  • This allows C code to make use of compilers with support for output
    variables along the fallthrough path via preprocessor define:

    CONFIG_CC_HAS_ASM_GOTO_OUTPUT

    [ This is not used anywhere yet, and currently released compilers don't
    support this yet, but it's coming, and I have some local experimental
    patches to take advantage of it when it does - Linus ]
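
    As a rough illustration of how code might key off this define, here
    is a sketch with a plain C fallback. copy_via_asm() is a hypothetical
    helper, the asm template is deliberately empty, and the #ifdef branch
    only builds on compilers that accept output operands in "asm goto".

```c
/*
 * Hypothetical helper: pass a value through an inline asm statement.
 * Only the #ifdef branch uses "asm goto" with an output operand.
 */
static int copy_via_asm(int in, int *out)
{
#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
    int tmp;

    /* Empty template; the "0" constraint ties the input to the output. */
    __asm__ goto("" : "=r"(tmp) : "0"(in) : : failed);
    *out = tmp;     /* tmp is only valid on the fallthrough path */
    return 0;
failed:
    return -1;
#else
    *out = in;      /* plain C fallback for compilers without the feature */
    return 0;
#endif
}
```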

    Signed-off-by: Nick Desaulniers
    Signed-off-by: Linus Torvalds

    Nick Desaulniers
     
  • Some init systems (eg. systemd) have init at their own paths, for
    example, /usr/lib/systemd/systemd. A compatibility symlink to one of the
    hardcoded init paths is provided by another package, usually named
    something like systemd-sysvcompat or similar.

    Currently distro maintainers who are hands-off on the bootloader are more
    or less required to include those compatibility links as part of their
    base distribution, because it's hard to migrate away from them since
    there's a risk some users will not get the message to set init= on the
    kernel command line appropriately.

    Moreover, for distributions where the init system is something the
    distribution itself is opinionated about (eg. Arch, which has systemd in
    the required `base` package), we could usually reasonably configure this
    ahead of time when building the distribution kernel. However, we
    currently simply don't have any way to configure the kernel to do this.
    Here's an example discussion where removing sysvcompat was discussed by
    distro maintainers[0].

    This patch adds a new Kconfig tunable, CONFIG_DEFAULT_INIT, which if set
    is tried before the hardcoded fallback list. So the order of precedence
    is now thus:

    1. init= on command line (on failure: panic)
    2. CONFIG_DEFAULT_INIT (on failure: try #3)
    3. Hardcoded fallback list (on failure: panic)
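
    The order of precedence above can be sketched as follows. This is an
    illustration only, not the code in init/main.c: pick_init(), the
    run_init_fn callback and only_sh_exists() are hypothetical, and the
    fallback paths listed are an assumption about the hardcoded list. A
    NULL return stands for the cases where the kernel would panic.

```c
#include <stddef.h>
#include <string.h>

/* Callback that tries to start one init binary; returns 0 on success. */
typedef int (*run_init_fn)(const char *path);

/*
 * Returns the path that was started, or NULL where the kernel would
 * panic. An empty or NULL string means "not set".
 */
static const char *pick_init(const char *cmdline_init,
                             const char *config_default_init,
                             run_init_fn run)
{
    /* Assumed hardcoded fallback list, tried in order. */
    static const char *const fallbacks[] = {
        "/sbin/init", "/etc/init", "/bin/init", "/bin/sh",
    };
    size_t i;

    /* 1. init= on the command line: no fallback if it fails. */
    if (cmdline_init && cmdline_init[0])
        return run(cmdline_init) == 0 ? cmdline_init : NULL;

    /* 2. CONFIG_DEFAULT_INIT: on failure, continue with the list. */
    if (config_default_init && config_default_init[0] &&
        run(config_default_init) == 0)
        return config_default_init;

    /* 3. Hardcoded fallback list. */
    for (i = 0; i < sizeof(fallbacks) / sizeof(fallbacks[0]); i++)
        if (run(fallbacks[i]) == 0)
            return fallbacks[i];

    return NULL;
}

/* Demo callback: pretend that only /bin/sh exists on this system. */
static int only_sh_exists(const char *path)
{
    return strcmp(path, "/bin/sh") == 0 ? 0 : -1;
}
```

    With only_sh_exists() as the callback, a failing init= yields NULL
    (panic), while a failing CONFIG_DEFAULT_INIT falls through to /bin/sh.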

    This new config parameter will allow distribution maintainers to move away
    from these compatibility links safely, without having to worry that their
    users might not have the right init=.

    There are also two other benefits of this over having the distribution
    maintain a symlink:

    1. One of the value propositions over simply having distributions
    maintain a /sbin/init symlink via a package is that it also frees
    distributions which have a preferred default, but not mandatory, init
    system from having their package manager fight with their users for
    control of /{s,}bin/init. Instead, the distribution simply makes
    their preference known in CONFIG_DEFAULT_INIT, and if the user
    installs another init system and uninstalls the default one they can
    still make use of /{s,}bin/init and friends for their own uses. This
    makes more cases Just Work(tm) without the user having to perform
    extra configuration via init=.

    2. Since before this we don't know which path the distribution actually
    _intends_ to serve init from, we don't pr_err if it is simply
    missing, and usually will just silently put the user in a /bin/sh
    shell. Now that the distribution can make a declaration of intent, we
    can be more vocal when this init system fails to launch for any
    reason, even if it's simply because no file exists at that location,
    speeding up the palaver of init/mount dependency/etc debugging a bit.

    [0]: https://lists.archlinux.org/pipermail/arch-dev-public/2019-January/029435.html

    Signed-off-by: Chris Down
    Signed-off-by: Andrew Morton
    Cc: Greg Kroah-Hartman
    Cc: Masami Hiramatsu
    Link: http://lkml.kernel.org/r/20200522160234.GA1487022@chrisdown.name
    Signed-off-by: Linus Torvalds

    Chris Down
     

04 Jun, 2020

3 commits

  • Merge more updates from Andrew Morton:
    "More mm/ work, plenty more to come

    Subsystems affected by this patch series: slub, memcg, gup, kasan,
    pagealloc, hugetlb, vmscan, tools, mempolicy, memblock, hugetlbfs,
    thp, mmap, kconfig"

    * akpm: (131 commits)
    arm64: mm: use ARCH_HAS_DEBUG_WX instead of arch defined
    x86: mm: use ARCH_HAS_DEBUG_WX instead of arch defined
    riscv: support DEBUG_WX
    mm: add DEBUG_WX support
    drivers/base/memory.c: cache memory blocks in xarray to accelerate lookup
    mm/thp: rename pmd_mknotpresent() as pmd_mkinvalid()
    powerpc/mm: drop platform defined pmd_mknotpresent()
    mm: thp: don't need to drain lru cache when splitting and mlocking THP
    hugetlbfs: get unmapped area below TASK_UNMAPPED_BASE for hugetlbfs
    sparc32: register memory occupied by kernel as memblock.memory
    include/linux/memblock.h: fix minor typo and unclear comment
    mm, mempolicy: fix up gup usage in lookup_node
    tools/vm/page_owner_sort.c: filter out unneeded line
    mm: swap: memcg: fix memcg stats for huge pages
    mm: swap: fix vmstats for huge pages
    mm: vmscan: limit the range of LRU type balancing
    mm: vmscan: reclaim writepage is IO cost
    mm: vmscan: determine anon/file pressure balance at the reclaim root
    mm: balance LRU lists based on relative thrashing
    mm: only count actual rotations as LRU reclaim cost
    ...

    Linus Torvalds
     
  • Without swap page tracking, users that are otherwise memory controlled can
    easily escape their containment and allocate significant amounts of memory
    that they're not being charged for. That's because swap does readahead,
    but without the cgroup records of who owned the page at swapout, readahead
    pages don't get charged until somebody actually faults them into their
    page table and we can identify an owner task. This can be maliciously
    exploited with MADV_WILLNEED, which triggers arbitrary readahead
    allocations without charging the pages.

    Make swap page tracking an integral part of memcg and remove the
    Kconfig options. In the first place, it was only made configurable to
    allow users to save some memory. But the overhead of tracking cgroup
    ownership per swap page is minimal - 2 bytes per page, or 512k per 1G of
    swap, or 0.04%. Saving that at the expense of broken containment
    semantics is not something we should present as a coequal option.
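
    The overhead figure can be checked with a few lines of arithmetic,
    assuming 4 KiB pages (the page size is an assumption here, as are the
    names below): 1 GiB / 4 KiB = 262144 swap pages, times 2 bytes each,
    gives 512 KiB, which is 2/4096 or just under 0.05% of the swap size.

```c
#include <stdint.h>

#define PAGE_SIZE_BYTES   4096u   /* assumed page size */
#define RECORD_BYTES      2u      /* ownership record per swap page */

/* Bytes of tracking data needed for a swap area of the given size. */
static uint64_t swap_tracking_bytes(uint64_t swap_bytes)
{
    return swap_bytes / PAGE_SIZE_BYTES * RECORD_BYTES;
}
```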

    The swapaccount=0 boot option will continue to exist, and it will
    eliminate the page_counter overhead and hide the swap control files, but
    it won't disable swap slot ownership tracking.

    This patch makes sure we always have the cgroup records at swapin time;
    the next patch will fix the actual bug by charging readahead swap pages at
    swapin time rather than at fault time.

    v2: fix double swap charge bug in cgroup1/cgroup2 code gating

    [hannes@cmpxchg.org: fix crash with cgroup_disable=memory]
    Link: http://lkml.kernel.org/r/20200521215855.GB815153@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Reviewed-by: Joonsoo Kim
    Cc: Alex Shi
    Cc: "Kirill A. Shutemov"
    Cc: Roman Gushchin
    Cc: Shakeel Butt
    Cc: Balbir Singh
    Cc: Naresh Kamboju
    Link: http://lkml.kernel.org/r/20200508183105.225460-16-hannes@cmpxchg.org
    Debugged-by: Hugh Dickins
    Debugged-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Pull MIPS updates from Thomas Bogendoerfer:

    - added support for MIPSr5 and P5600 cores

    - converted Loongson PCI driver into a PCI host driver using the
    generic PCI framework

    - added emulation of CPUCFG command for Loongson64 cpus

    - removed support for LASAT, PMC MSP71xx and NEC MARKEINS/EMMA

    - ioremap cleanup

    - fix for a race between two threads faulting the same page

    - various cleanups and fixes

    * tag 'mips_5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (143 commits)
    MIPS: ralink: drop ralink_clk_init for mt7621
    MIPS: ralink: bootrom: mark a function as __init to save some memory
    MIPS: Loongson64: Reorder CPUCFG model match arms
    MIPS: Expose Loongson CPUCFG availability via HWCAP
    MIPS: Loongson64: Guard against future cores without CPUCFG
    MIPS: Fix build warning about "PTR_STR" redefinition
    MIPS: Loongson64: Remove not used pci.c
    MIPS: Loongson64: Define PCI_IOBASE
    MIPS: CPU_LOONGSON2EF need software to maintain cache consistency
    MIPS: DTS: Fix build errors used with various configs
    MIPS: Loongson64: select NO_EXCEPT_FILL
    MIPS: Fix IRQ tracing when call handle_fpe() and handle_msa_fpe()
    MIPS: mm: add page valid judgement in function pte_modify
    mm/memory.c: Add memory read privilege on page fault handling
    mm/memory.c: Update local TLB if PTE entry exists
    MIPS: Do not flush tlb page when updating PTE entry
    MIPS: ingenic: Default to a generic board
    MIPS: ingenic: Add support for GCW Zero prototype
    MIPS: ingenic: DTS: Add memory info of GCW Zero
    MIPS: Loongson64: Switch to generic PCI driver
    ...

    Linus Torvalds
     

19 May, 2020

1 commit

  • Make it possible to have a general notification queue built on top of a
    standard pipe. Notifications are 'spliced' into the pipe and then read
    out. splice(), vmsplice() and sendfile() are forbidden on pipes used for
    notifications as post_one_notification() cannot take pipe->mutex. This
    means that notifications could be posted in between individual pipe
    buffers, making iov_iter_revert() difficult to effect.

    The way the notification queue is used is:

    (1) An application opens a pipe with a special flag and indicates the
    number of messages it wishes to be able to queue at once (this can
    only be set once):

    pipe2(fds, O_NOTIFICATION_PIPE);
    ioctl(fds[0], IOC_WATCH_QUEUE_SET_SIZE, queue_depth);

    (2) The application then uses poll() and read() as normal to extract data
    from the pipe. read() will return multiple notifications if the
    buffer is big enough, but it will not split a notification across
    buffers - rather it will return a short read or EMSGSIZE.

    Notification messages include a length in the header so that the
    caller can split them up.

    Each message has a header that describes it:

    struct watch_notification {
    __u32 type:24;
    __u32 subtype:8;
    __u32 info;
    };

    The type indicates the source (eg. mount tree changes, superblock events,
    keyring changes, block layer events) and the subtype indicates the event
    type (eg. mount, unmount; EIO, EDQUOT; link, unlink). The info field
    indicates a number of things, including the entry length, an ID assigned to
    a watchpoint contributing to this buffer and type-specific flags.

    Supplementary data, such as the key ID that generated an event, can be
    attached in additional slots. The maximum message size is 127 bytes.
    Messages may not be padded or aligned, so there is no guarantee, for
    example, that the notification type will be on a 4-byte boundary.
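
    A reader of such a pipe has to split the buffer returned by read()
    into records itself. The sketch below shows one way to do that,
    under stated assumptions: the record length is taken from the low 7
    bits of the info field (consistent with the 127-byte maximum), the
    24-bit type sits in the low bits of the first word as on a
    little-endian bitfield layout, and walk_records(), record_fn and
    INFO_LENGTH_MASK are hypothetical names. Because records may be
    unaligned, the header words are copied out with memcpy().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INFO_LENGTH_MASK 0x7fu   /* assumed: low 7 bits of info = length */

/* Called once per record; rec points at the record's header. */
typedef void (*record_fn)(uint32_t type, uint8_t subtype,
                          const uint8_t *rec, size_t len, void *ctx);

/*
 * Walk a buffer and call fn (if not NULL) for each complete record.
 * Returns the number of records, or -1 if a length is malformed or
 * the buffer ends in the middle of a record.
 */
static int walk_records(const uint8_t *buf, size_t size,
                        record_fn fn, void *ctx)
{
    size_t off = 0;
    int count = 0;

    while (size - off >= 8) {
        uint32_t word0, info;
        size_t len;

        /* Copy the two header words out; they may be unaligned. */
        memcpy(&word0, buf + off, sizeof(word0));
        memcpy(&info, buf + off + 4, sizeof(info));

        len = info & INFO_LENGTH_MASK;
        if (len < 8 || len > size - off)
            return -1;

        if (fn)
            fn(word0 & 0xffffff, (uint8_t)(word0 >> 24),
               buf + off, len, ctx);
        off += len;
        count++;
    }
    return off == size ? count : -1;
}
```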

    Signed-off-by: David Howells

    David Howells
     

17 May, 2020

2 commits

  • On Fedora, linking static glibc requires the glibc-static RPM package,
    which is not part of the glibc-devel package.

    CONFIG_CC_CAN_LINK does not check the capability of static linking,
    so you can enable CONFIG_BPFILTER_UMH, then fail to build:

    HOSTLD net/bpfilter/bpfilter_umh
    /usr/bin/ld: cannot find -lc
    collect2: error: ld returned 1 exit status

    Add CONFIG_CC_CAN_LINK_STATIC, and make CONFIG_BPFILTER_UMH depend
    on it.

    Reported-by: Valdis Kletnieks
    Signed-off-by: Masahiro Yamada
    Acked-by: Alexei Starovoitov

    Masahiro Yamada
     
  • bpfilter_umh is built for the default machine bit of the compiler,
    which may not match the bit size of the kernel.

    This happens in the scenario below:

    You can use biarch GCC that defaults to 64-bit for building the 32-bit
    kernel. In this case, Kbuild passes -m32 to teach the compiler to
    produce 32-bit kernel space objects. However, it is missing when
    building bpfilter_umh. It is built as a 64-bit ELF, and then embedded
    into the 32-bit kernel.

    The 32-bit kernel and 64-bit umh is a bad combination.

    In theory, we can have 32-bit umh running on 64-bit kernel, but we do
    not have a good reason to support such a usecase.

    The best is to match the bit size between them.

    Pass -m32 or -m64 to the umh build command if it is found in
    $(KBUILD_CFLAGS). Evaluate CC_CAN_LINK against the kernel bit-size.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     

16 May, 2020

1 commit

  • Pull networking fixes from David Miller:

    1) Fix sk_psock reference count leak on receive, from Xiyu Yang.

    2) CONFIG_HNS should be invisible, from Geert Uytterhoeven.

    3) Don't allow locking route MTUs in ipv6, RFCs actually forbid this,
    from Maciej Żenczykowski.

    4) ipv4 route redirect backoff wasn't actually enforced, from Paolo
    Abeni.

    5) Fix netprio cgroup v2 leak, from Zefan Li.

    6) Fix infinite loop on rmmod in conntrack, from Florian Westphal.

    7) Fix tcp SO_RCVLOWAT hangs, from Eric Dumazet.

    8) Various bpf probe handling fixes, from Daniel Borkmann.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (68 commits)
    selftests: mptcp: pm: rm the right tmp file
    dpaa2-eth: properly handle buffer size restrictions
    bpf: Restrict bpf_trace_printk()'s %s usage and add %pks, %pus specifier
    bpf: Add bpf_probe_read_{user, kernel}_str() to do_refine_retval_range
    bpf: Restrict bpf_probe_read{, str}() only to archs where they work
    MAINTAINERS: Mark networking drivers as Maintained.
    ipmr: Add lockdep expression to ipmr_for_each_table macro
    ipmr: Fix RCU list debugging warning
    drivers: net: hamradio: Fix suspicious RCU usage warning in bpqether.c
    net: phy: broadcom: fix BCM54XX_SHD_SCR3_TRDDAPD value for BCM54810
    tcp: fix error recovery in tcp_zerocopy_receive()
    MAINTAINERS: Add Jakub to networking drivers.
    MAINTAINERS: another add of Karsten Graul for S390 networking
    drivers: ipa: fix typos for ipa_smp2p structure doc
    pppoe: only process PADT targeted at local interfaces
    selftests/bpf: Enforce returning 0 for fentry/fexit programs
    bpf: Enforce returning 0 for fentry/fexit progs
    net: stmmac: fix num_por initialization
    security: Fix the default value of secid_to_secctx hook
    libbpf: Fix register naming in PT_REGS s390 macros
    ...

    Linus Torvalds
     

15 May, 2020

1 commit

  • Given the legacy bpf_probe_read{,str}() BPF helpers are broken on archs
    with overlapping address ranges, we should really take the next step to
    disable them from BPF use there.

    To generally fix the situation, we've recently added new helper variants
    bpf_probe_read_{user,kernel}() and bpf_probe_read_{user,kernel}_str().
    For details on them, see 6ae08ae3dea2 ("bpf: Add probe_read_{user, kernel}
    and probe_read_{user,kernel}_str helpers").

    Given bpf_probe_read{,str}() have been around for ~5 years by now, there
    are plenty of users at least on x86 still relying on them today, so we
    cannot remove them entirely w/o breaking the BPF tracing ecosystem.

    However, their use should be restricted to archs with non-overlapping
    address ranges where they are working in their current form. Therefore,
    move this behind a CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE and
    have x86, arm64, arm select it (other archs supporting it can follow-up
    on it as well).

    The remaining archs can work around this easily by relying on the
    feature probe from bpftool, which emits defines that can be used from
    BPF C code to implement the drop-in replacement for old/new kernels
    via: bpftool feature probe macro

    Suggested-by: Linus Torvalds
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov
    Reviewed-by: Masami Hiramatsu
    Acked-by: Linus Torvalds
    Cc: Brendan Gregg
    Cc: Christoph Hellwig
    Link: https://lore.kernel.org/bpf/20200515101118.6508-2-daniel@iogearbox.net

    Daniel Borkmann
     

12 May, 2020

3 commits

  • Similarly to the CC_IS_CLANG config, add LD_IS_LLD to avoid GNU ld
    specific logic such as ld-version or ld-ifversion and gain the
    ability to select potential features that depend on the linker at
    configuration time such as LTO.

    Signed-off-by: Sami Tolvanen
    Acked-by: Masahiro Yamada
    [nc: Reword commit message]
    Signed-off-by: Nathan Chancellor
    Tested-by: Sedat Dilek
    Reviewed-by: Sedat Dilek
    Signed-off-by: Thomas Bogendoerfer

    Sami Tolvanen
     
  • Commit 21c54b774744 ("kconfig: show compiler version text in the top
    comment") added the environment variable, CC_VERSION_TEXT in the comment
    of the top Kconfig file. It can detect the compiler update, and invoke
    the syncconfig because all environment variables referenced in Kconfig
    files are recorded in include/config/auto.conf.cmd

    This commit makes it a CONFIG option in order to ensure the full rebuild
    when the compiler is updated.

    This works as follows:

    include/linux/kconfig.h contains "CONFIG_CC_VERSION_TEXT" in the comment
    block.

    The top Makefile specifies "-include $(srctree)/include/linux/kconfig.h"
    to guarantee it is included from all kernel source files.

    fixdep parses every source file and all headers included from it,
    searching for words prefixed with "CONFIG_". Then, fixdep finds
    CONFIG_CC_VERSION_TEXT in include/linux/kconfig.h and adds
    include/config/cc/version/text.h into every .*.cmd file.

    When the compiler is updated, syncconfig is invoked because init/Kconfig
    contains the reference to the environment variable CC_VERSION_TEXT.
    CONFIG_CC_VERSION_TEXT is updated to the new version string, and
    include/config/cc/version/text.h is touched.

    In the next rebuild, Make will rebuild every file since the timestamp
    of include/config/cc/version/text.h is newer than that of the target.
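
    The mapping from a CONFIG_ word to the header path that ends up in
    the .*.cmd files can be sketched as below. This only illustrates the
    naming rule implied by the example above (strip the prefix,
    lower-case, turn '_' into '/'); config_to_header() is a hypothetical
    helper and not fixdep's actual code.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Map "CONFIG_FOO_BAR" to "include/config/foo/bar.h".
 * Returns 0 on success, -1 if the prefix is missing or out is too small.
 */
static int config_to_header(const char *name, char *out, size_t outsz)
{
    static const char prefix[] = "CONFIG_";
    static const char dir[] = "include/config/";
    const char *opt;
    char *dst;
    size_t i, len;
    int n;

    if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    opt = name + sizeof(prefix) - 1;

    /* Lay out the full path first, then rewrite the option part. */
    n = snprintf(out, outsz, "%s%s.h", dir, opt);
    if (n < 0 || (size_t)n >= outsz)
        return -1;

    /* Lower-case the option name and turn '_' into '/'. */
    dst = out + sizeof(dir) - 1;
    len = strlen(opt);
    for (i = 0; i < len; i++)
        dst[i] = (opt[i] == '_') ? '/' : (char)tolower((unsigned char)opt[i]);
    return 0;
}
```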

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • The result of '$(CC) --version | head -n 1' has already been computed
    by the top Makefile, and stored in the environment variable,
    CC_VERSION_TEXT.

    'echo' is cheaper than the two commands $(CC) and 'head', although the
    gain from this optimization is not noticeable.

    Signed-off-by: Masahiro Yamada
    Reviewed-by: Nathan Chancellor
    Tested-by: Nathan Chancellor

    Masahiro Yamada
     

10 May, 2020

1 commit

  • We have some rather random rules about when we accept the
    "maybe-initialized" warnings, and when we don't.

    For example, we consider it unreliable for gcc versions < 4.9, but also
    if -O3 is enabled, or if optimizing for size. And then various kernel
    config options disabled it, because they know that they trigger that
    warning by confusing gcc sufficiently (ie PROFILE_ALL_BRANCHES).

    And now gcc-10 seems to be introducing a lot of those warnings too, so
    it falls under the same heading as 4.9 did.

    At the same time, we have a very straightforward way to _enable_ that
    warning when wanted: use "W=2" to enable more warnings.

    So stop playing these ad-hoc games, and just disable that warning by
    default, with the known and straight-forward "if you want to work on the
    extra compiler warnings, use W=123".

    Would it be great to have code that is always so obvious that it never
    confuses the compiler whether a variable is used initialized or not?
    Yes, it would. In a perfect world, the compilers would be smarter, and
    our source code would be simpler.

    That's currently not the world we live in, though.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

16 Apr, 2020

1 commit

  • It is very rare to see versions of GCC prior to 4.8 being used to build
    the mainline kernel. These old compilers are also known to have codegen
    issues which can lead to silent miscompilation:

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145

    Raise the minimum GCC version for kernel build to 4.8 and remove some
    tautological Kconfig dependencies as a consequence.

    Cc: Masahiro Yamada
    Acked-by: Arnd Bergmann
    Reviewed-by: Nick Desaulniers
    Signed-off-by: Will Deacon

    Will Deacon
     

10 Apr, 2020

1 commit

  • Pull arm64 fixes from Catalin Marinas:

    - Ensure that the compiler and linker versions are aligned so that ld
    doesn't complain about not understanding a .note.gnu.property section
    (emitted when pointer authentication is enabled).

    - Force -mbranch-protection=none when the feature is not enabled, in
    case a compiler may choose a different default value.

    - Remove CONFIG_DEBUG_ALIGN_RODATA. It was never in defconfig and
    rarely enabled.

    - Fix the mask used to check for 16-bit Thumb-2 instructions in the
    emulation of the SETEND instruction (it could match the bottom half
    of a 32-bit Thumb-2 instruction).

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: armv8_deprecated: Fix undef_hook mask for thumb setend
    arm64: remove CONFIG_DEBUG_ALIGN_RODATA feature
    arm64: Always force a branch protection mode when the compiler has one
    arm64: Kconfig: ptrauth: Add binutils version check to fix mismatch
    init/kconfig: Add LD_VERSION Kconfig

    Linus Torvalds