24 Apr, 2019

1 commit

  • Pull syscall numbering updates from Arnd Bergmann:
    "arch: add pidfd and io_uring syscalls everywhere

    This comes a bit late, but should be in 5.1 anyway: we want the newly
    added system calls to be synchronized across all architectures in the
    release.

    I hope that in the future, any newly added system calls can be added
    to all architectures at the same time, and tested there while they are
    in linux-next, avoiding dependencies between the architecture
    maintainer trees and the tree that contains the new system call"

    * tag 'syscalls-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
    arch: add pidfd and io_uring syscalls everywhere

    Linus Torvalds
     

15 Apr, 2019

1 commit

  • Add the io_uring and pidfd_send_signal system calls to all architectures.

    These system calls are designed to handle both native and compat tasks,
    so all entries are the same across architectures; only arm-compat and
    the generic table still use an old format.

    Acked-by: Michael Ellerman (powerpc)
    Acked-by: Heiko Carstens (s390)
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

05 Apr, 2019

2 commits

  • After removing the start and count arguments of syscall_get_arguments() it
    seems reasonable to remove them from syscall_set_arguments(). Note, as of
    today, there are no users of syscall_set_arguments(). But we are told that
    there will be soon. But for now, at least make it consistent with
    syscall_get_arguments().

    Link: http://lkml.kernel.org/r/20190327222014.GA32540@altlinux.org

    Cc: Oleg Nesterov
    Cc: Kees Cook
    Cc: Andy Lutomirski
    Cc: Dominik Brodowski
    Cc: Dave Martin
    Cc: "Dmitry V. Levin"
    Cc: x86@kernel.org
    Cc: linux-snps-arc@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Cc: linux-hexagon@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-mips@vger.kernel.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linux-arch@vger.kernel.org
    Acked-by: Max Filippov # For xtensa changes
    Acked-by: Will Deacon # For the arm64 bits
    Reviewed-by: Thomas Gleixner # for x86
    Reviewed-by: Dmitry V. Levin
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (VMware)
     
  • At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
    function call syscall_get_arguments() implemented in x86 was horribly
    written and not optimized for the standard case of passing in 0 and 6 for
    the starting index and the number of arguments to get. When looking at
    all the users of this function, I discovered that all instances pass in only
    0 and 6 for these arguments. Instead of having this function handle
    different cases that are never used, simply rewrite it to return the first 6
    arguments of a system call.

    This should help out the performance of tracing system calls by ptrace,
    ftrace and perf.
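
    The simplification described above can be sketched in userspace C. This
    is only an illustration: struct toy_regs and both toy_* functions are
    hypothetical stand-ins, not the kernel's pt_regs or its real
    syscall_get_arguments() implementations, which differ per architecture.

```c
#include <string.h>

/* Hypothetical, simplified register file holding the six syscall arguments. */
struct toy_regs {
    unsigned long arg[6];
};

/* Old shape: the caller picks a start index and a count, so the helper
 * has to validate and clamp a range on every call. */
void toy_get_arguments_ranged(const struct toy_regs *regs,
                              unsigned int start, unsigned int count,
                              unsigned long *args)
{
    if (start >= 6)
        return;
    if (count > 6 - start)
        count = 6 - start;
    memcpy(args, &regs->arg[start], count * sizeof(args[0]));
}

/* New shape: always copy all six arguments, with no range handling. */
void toy_get_arguments(const struct toy_regs *regs, unsigned long *args)
{
    memcpy(args, regs->arg, sizeof(regs->arg));
}
```

    Calling the ranged variant with 0 and 6 gives the same result as the
    fixed variant, which is the only case the commit message says was ever
    used.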

    Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org

    Cc: Oleg Nesterov
    Cc: Kees Cook
    Cc: Andy Lutomirski
    Cc: Dominik Brodowski
    Cc: Dave Martin
    Cc: "Dmitry V. Levin"
    Cc: x86@kernel.org
    Cc: linux-snps-arc@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-c6x-dev@linux-c6x.org
    Cc: uclinux-h8-devel@lists.sourceforge.jp
    Cc: linux-hexagon@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-mips@vger.kernel.org
    Cc: nios2-dev@lists.rocketboards.org
    Cc: openrisc@lists.librecores.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: linux-riscv@lists.infradead.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-sh@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Cc: linux-um@lists.infradead.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linux-arch@vger.kernel.org
    Acked-by: Paul Burton # MIPS parts
    Acked-by: Max Filippov # For xtensa changes
    Acked-by: Will Deacon # For the arm64 bits
    Reviewed-by: Thomas Gleixner # for x86
    Reviewed-by: Dmitry V. Levin
    Reported-by: Andy Lutomirski
    Signed-off-by: Steven Rostedt (VMware)

    Steven Rostedt (Red Hat)
     

29 Mar, 2019

1 commit

  • I do not see any consistency about headers_install of <linux/kvm_para.h>
    and <asm/kvm_para.h>.

    According to my analysis of Linux 5.1-rc1, there are 3 groups:

    [1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported

    alpha, arm, hexagon, mips, powerpc, s390, sparc, x86

    [2] <asm/kvm_para.h> is exported, but <linux/kvm_para.h> is not

    arc, arm64, c6x, h8300, ia64, m68k, microblaze, nios2, openrisc,
    parisc, sh, unicore32, xtensa

    [3] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported

    csky, nds32, riscv

    This does not match the actual KVM support. At least, [2] is
    half-baked.

    Nor do arch maintainers look like they care about this. For example,
    commit 0add53713b1c ("microblaze: Add missing kvm_para.h to Kbuild")
    exported <asm/kvm_para.h> to user-space in order to fix an in-kernel
    build error.

    We have two ways to make this consistent:

    [A] export both <linux/kvm_para.h> and <asm/kvm_para.h> for all
    architectures, irrespective of the KVM support

    [B] Match the header export of <linux/kvm_para.h> and <asm/kvm_para.h>
    to the KVM support

    My first attempt was [A] because the code looks cleaner, but Paolo
    suggested [B].

    So, this commit goes with [B].

    For most architectures, <asm/kvm_para.h> was moved to the kernel-space.
    I changed include/uapi/linux/Kbuild so that it checks generated
    asm/kvm_para.h as well as checked-in ones.

    After this commit, there will be two groups:

    [1] Both <linux/kvm_para.h> and <asm/kvm_para.h> are exported

    arm, arm64, mips, powerpc, s390, x86

    [2] Neither <linux/kvm_para.h> nor <asm/kvm_para.h> is exported

    alpha, arc, c6x, csky, h8300, hexagon, ia64, m68k, microblaze,
    nds32, nios2, openrisc, parisc, riscv, sh, sparc, unicore32, xtensa

    Signed-off-by: Masahiro Yamada
    Acked-by: Cornelia Huck
    Signed-off-by: Paolo Bonzini

    Masahiro Yamada
     

18 Mar, 2019

1 commit

  • Pull more Kbuild updates from Masahiro Yamada:

    - add more Build-Depends to Debian source package

    - prefix header search paths with $(srctree)/

    - make modpost show verbose section mismatch warnings

    - avoid hard-coded CROSS_COMPILE for h8300

    - fix regression for Debian make-kpkg command

    - add semantic patch to detect missing put_device()

    - fix some warnings of 'make deb-pkg'

    - optimize NOSTDINC_FLAGS evaluation

    - add warnings about redundant generic-y

    - clean up Makefiles and scripts

    * tag 'kbuild-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    kconfig: remove stale lxdialog/.gitignore
    kbuild: force all architectures except um to include mandatory-y
    kbuild: warn redundant generic-y
    Revert "modsign: Abort modules_install when signing fails"
    kbuild: Make NOSTDINC_FLAGS a simply expanded variable
    kbuild: deb-pkg: avoid implicit effects
    coccinelle: semantic code search for missing put_device()
    kbuild: pkg: grep include/config/auto.conf instead of $KCONFIG_CONFIG
    kbuild: deb-pkg: introduce is_enabled and if_enabled_echo to builddeb
    kbuild: deb-pkg: add CONFIG_ prefix to kernel config options
    kbuild: add workaround for Debian make-kpkg
    kbuild: source include/config/auto.conf instead of ${KCONFIG_CONFIG}
    unicore32: simplify linker script generation for decompressor
    h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux-
    kbuild: move archive command to scripts/Makefile.lib
    modpost: always show verbose warning for section mismatch
    ia64: prefix header search path with $(srctree)/
    libfdt: prefix header search paths with $(srctree)/
    deb-pkg: generate correct build dependencies

    Linus Torvalds
     

17 Mar, 2019

3 commits

  • Currently, every arch/*/include/uapi/asm/Kbuild explicitly includes
    the common Kbuild.asm file. Factor out the duplicated include directives
    to scripts/Makefile.asm-generic so that no architecture would opt out
    of the mandatory-y mechanism.

    um is not forced to include mandatory-y since it is a very exceptional
    case which does not support UAPI.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • The generic-y is redundant under the following condition:

    - arch has its own implementation

    - the same header is added to generated-y

    - the same header is added to mandatory-y

    If a redundant generic-y is found, a warning like the following is displayed:

    scripts/Makefile.asm-generic:20: redundant generic-y found in arch/arm/include/asm/Kbuild: timex.h

    I fixed up arch Kbuild files found by this.

    Suggested-by: Sam Ravnborg
    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • Pull more SCSI updates from James Bottomley:
    "This is the final round of mostly small fixes and performance
    improvements to our initial submit.

    The main regression fix is the ia64 simscsi build failure which was
    missed in the serial number elimination conversion"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
    scsi: ia64: simscsi: use request tag instead of serial_number
    scsi: aacraid: Fix performance issue on logical drives
    scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup()
    scsi: libiscsi: Hold back_lock when calling iscsi_complete_task
    scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink
    scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port
    scsi: hisi_sas: Set PHY linkrate when disconnected
    scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw
    scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
    scsi: hisi_sas: Change return variable type in phy_up_v3_hw()
    scsi: qla2xxx: check for kstrtol() failure
    scsi: lpfc: fix 32-bit format string warning
    scsi: lpfc: fix unused variable warning
    scsi: target: tcmu: Switch to bitmap_zalloc()
    scsi: libiscsi: fall back to sendmsg for slab pages
    scsi: qla2xxx: avoid printf format warning
    scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset
    scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check
    scsi: ufs: hisi: fix ufs_hba_variant_ops passing
    scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show
    ...

    Linus Torvalds
     

16 Mar, 2019

1 commit


14 Mar, 2019

1 commit

  • Currently, the Kbuild core manipulates header search paths in a crazy
    way [1].

    To fix this mess, I want all Makefiles to add explicit $(srctree)/ to
    the search paths in the srctree. Some Makefiles are already written in
    that way, but not all. The goal of this work is to make the notation
    consistent, and finally get rid of the gross hacks.

    Having whitespaces after -I does not matter since commit 48f6e3cf5bc6
    ("kbuild: do not drop -I without parameter").

    I removed some header search paths because I was able to build ia64
    without them.

    [1]: https://patchwork.kernel.org/patch/9632347/

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     

13 Mar, 2019

3 commits

  • Add panic() calls if memblock_alloc*() returns NULL.

    Most of the changes are simply addition of

        if (!ptr)
                panic();

    statements after the calls to memblock_alloc*() variants.

    Exceptions are create_mem_map_page_table() and ia64_log_init() that were
    slightly refactored to accommodate the change.
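
    As a rough userspace sketch of the check-and-panic pattern described
    above: calloc() stands in for memblock_alloc() (both hand back zeroed
    memory or NULL), and toy_panic() stands in for the kernel's panic().
    All toy_* names are hypothetical and exist only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-in for panic(): report the failure and abort. */
void toy_panic(const char *what, size_t size)
{
    fprintf(stderr, "%s: failed to allocate %zu bytes\n", what, size);
    abort();
}

/* Stand-in for memblock_alloc(): returns zeroed memory, or NULL on failure. */
void *toy_memblock_alloc(size_t size)
{
    return calloc(1, size);
}

/* The pattern added after each allocation: check the pointer, panic on NULL. */
void *toy_alloc_or_panic(const char *what, size_t size)
{
    void *ptr = toy_memblock_alloc(size);

    if (!ptr)
        toy_panic(what, size);
    return ptr;
}
```

    Because the allocator already returns zeroed memory, a caller of
    toy_alloc_or_panic() needs neither a NULL check nor a memset().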

    Link: http://lkml.kernel.org/r/1548057848-15136-15-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Guo Ren [c-sky]
    Cc: Heiko Carstens
    Cc: Juergen Gross [Xen]
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Paul Burton
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • memblock_alloc() already clears the allocated memory, no point in doing
    it twice.

    Link: http://lkml.kernel.org/r/1548057848-15136-14-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Acked-by: Geert Uytterhoeven [m68k]
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Guo Ren [c-sky]
    Cc: Heiko Carstens
    Cc: Juergen Gross [Xen]
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Paul Burton
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • The last parameter of memblock_alloc_from() is the lower limit for the
    memory allocation. When it is 0, the call is equivalent to
    memblock_alloc().

    Link: http://lkml.kernel.org/r/1548057848-15136-13-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Acked-by: Paul Burton # MIPS part
    Cc: Catalin Marinas
    Cc: Christophe Leroy
    Cc: Christoph Hellwig
    Cc: "David S. Miller"
    Cc: Dennis Zhou
    Cc: Geert Uytterhoeven
    Cc: Greentime Hu
    Cc: Greg Kroah-Hartman
    Cc: Guan Xuetao
    Cc: Guo Ren
    Cc: Guo Ren [c-sky]
    Cc: Heiko Carstens
    Cc: Juergen Gross [Xen]
    Cc: Mark Salter
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Petr Mladek
    Cc: Richard Weinberger
    Cc: Rich Felker
    Cc: Rob Herring
    Cc: Rob Herring
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

07 Mar, 2019

1 commit

  • Merge misc updates from Andrew Morton:

    - a few misc things

    - ocfs2 updates

    - most of MM

    * emailed patches from Andrew Morton: (159 commits)
    tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
    proc: more robust bulk read test
    proc: test /proc/*/maps, smaps, smaps_rollup, statm
    proc: use seq_puts() everywhere
    proc: read kernel cpu stat pointer once
    proc: remove unused argument in proc_pid_lookup()
    fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
    fs/proc/self.c: code cleanup for proc_setup_self()
    proc: return exit code 4 for skipped tests
    mm,mremap: bail out earlier in mremap_to under map pressure
    mm/sparse: fix a bad comparison
    mm/memory.c: do_fault: avoid usage of stale vm_area_struct
    writeback: fix inode cgroup switching comment
    mm/huge_memory.c: fix "orig_pud" set but not used
    mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
    mm/memcontrol.c: fix bad line in comment
    mm/cma.c: cma_declare_contiguous: correct err handling
    mm/page_ext.c: fix an imbalance with kmemleak
    mm/compaction: pass pgdat to too_many_isolated() instead of zone
    mm: remove zone_lru_lock() function, access ->lru_lock directly
    ...

    Linus Torvalds
     

06 Mar, 2019

4 commits

  • In the old days, remap_pfn_range() required pages to be marked as
    PG_reserved, so they would e.g. never get swapped out. This was
    required for special mappings. Nowadays, this is fully handled via the
    VMA (VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP inside
    remap_pfn_range() to be precise). PG_reserved is no longer required but
    only a relic from the past.

    So only architecture specific MM handling might require it (e.g. to
    detect them as MMIO pages). As there are no architecture specific
    checks for PageReserved() apart from MCA handling in ia64 code, this can
    go. Use simple vzalloc()/vfree() instead.

    Note that before calling vzalloc(), size has already been aligned to
    PAGE_SIZE, no need to align again.

    Link: http://lkml.kernel.org/r/20190114125903.24845-9-david@redhat.com
    Signed-off-by: David Hildenbrand
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Oleg Nesterov
    Cc: David Hildenbrand
    Cc: David Howells
    Cc: Mike Rapoport
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Patch series "Replace all open encodings for NUMA_NO_NODE", v3.

    All these places for replacement were found by running the following
    grep patterns on the entire kernel code. Please let me know if this
    might have missed some instances. This might also have replaced some
    false positives. I will appreciate suggestions, inputs and review.

    1. git grep "nid == -1"
    2. git grep "node == -1"
    3. git grep "nid = -1"
    4. git grep "node = -1"

    This patch (of 2):

    At present there are multiple places where invalid node number is
    encoded as -1. Even though implicitly understood it is always better to
    have macros in there. Replace these open encodings for an invalid node
    number with the global macro NUMA_NO_NODE. This helps remove NUMA
    related assumptions like 'invalid node' from various places redirecting
    them to a common definition.
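
    A minimal sketch of the replacement described above. The kernel defines
    NUMA_NO_NODE as (-1); it is repeated here only so the example builds on
    its own. The toy_* functions are hypothetical and not taken from the
    patch.

```c
#include <stdbool.h>

/* Repeated here for a standalone build; the kernel provides this macro. */
#define NUMA_NO_NODE (-1)

/* Before: the invalid node is open-coded as a bare -1. */
bool toy_node_is_unset_open_coded(int nid)
{
    return nid == -1;
}

/* After: the same test, spelled with the shared macro. */
bool toy_node_is_unset(int nid)
{
    return nid == NUMA_NO_NODE;
}

/* Hypothetical caller: use a fallback node when none was requested. */
int toy_pick_node(int requested, int fallback)
{
    return requested == NUMA_NO_NODE ? fallback : requested;
}
```

    Both predicates behave identically; the macro only makes the intent
    visible at the call site and easy to find with grep.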

    Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com
    Signed-off-by: Anshuman Khandual
    Reviewed-by: David Hildenbrand
    Acked-by: Jeff Kirsher [ixgbe]
    Acked-by: Jens Axboe [mtip32xx]
    Acked-by: Vinod Koul [dmaengine.c]
    Acked-by: Michael Ellerman [powerpc]
    Acked-by: Doug Ledford [drivers/infiniband]
    Cc: Joseph Qi
    Cc: Hans Verkuil
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anshuman Khandual
     
  • Pull year 2038 updates from Thomas Gleixner:
    "Another round of changes to make the kernel ready for 2038. After lots
    of preparatory work this is the first set of syscalls which are 2038
    safe:

    403 clock_gettime64
    404 clock_settime64
    405 clock_adjtime64
    406 clock_getres_time64
    407 clock_nanosleep_time64
    408 timer_gettime64
    409 timer_settime64
    410 timerfd_gettime64
    411 timerfd_settime64
    412 utimensat_time64
    413 pselect6_time64
    414 ppoll_time64
    416 io_pgetevents_time64
    417 recvmmsg_time64
    418 mq_timedsend_time64
    419 mq_timedreceive_time64
    420 semtimedop_time64
    421 rt_sigtimedwait_time64
    422 futex_time64
    423 sched_rr_get_interval_time64

    The syscall numbers are identical all over the architectures"

    * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    riscv: Use latest system call ABI
    checksyscalls: fix up mq_timedreceive and stat exceptions
    unicore32: Fix __ARCH_WANT_STAT64 definition
    asm-generic: Make time32 syscall numbers optional
    asm-generic: Drop getrlimit and setrlimit syscalls from default list
    32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
    compat ABI: use non-compat openat and open_by_handle_at variants
    y2038: add 64-bit time_t syscalls to all 32-bit architectures
    y2038: rename old time and utime syscalls
    y2038: remove struct definition redirects
    y2038: use time32 syscall names on 32-bit
    syscalls: remove obsolete __IGNORE_ macros
    y2038: syscalls: rename y2038 compat syscalls
    x86/x32: use time64 versions of sigtimedwait and recvmmsg
    timex: change syscalls to use struct __kernel_timex
    timex: use __kernel_timex internally
    sparc64: add custom adjtimex/clock_adjtime functions
    time: fix sys_timer_settime prototype
    time: Add struct __kernel_timex
    time: make adjtime compat handling available for 32 bit
    ...

    Linus Torvalds
     
  • Pull networking updates from David Miller:
    "Here we go, another merge window full of networking and #ebpf changes:

    1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP
    range without dealing with floods of ARP traffic, from Linus
    Lüssing.

    2) Throttle buffered multicast packet transmission in mt76, from
    Felix Fietkau.

    3) Support adaptive interrupt moderation in ice, from Brett Creeley.

    4) A lot of struct_size conversions, from Gustavo A. R. Silva.

    5) Add peek/push/pop commands to bpftool, as well as bash completion,
    from Stanislav Fomichev.

    6) Optimize sk_msg_clone(), from Vakul Garg.

    7) Add SO_BINDTOIFINDEX, from David Herrmann.

    8) Be more conservative with local resends due to local congestion,
    from Yuchung Cheng.

    9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata.

    10) Add health buffer support to devlink, from Eran Ben Elisha.

    11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen.

    12) Add statistics to basic packet scheduler filter, from Cong Wang.

    13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan.

    14) Lots of new IP tunneling forwarding tests, also from Nir Dotan.

    15) Add 3ad stats to bonding, from Nikolay Aleksandrov.

    16) Lots of probing improvements for bpftool, from Quentin Monnet.

    17) Various nfp driver #ebpf JIT improvements from Jakub Kicinski.

    18) Allow #ebpf programs to access gso_segs from skb shared info, from
    Eric Dumazet.

    19) Add sock_diag support for AF_XDP sockets, from Björn Töpel.

    20) Support 22260 iwlwifi devices, from Luca Coelho.

    21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov.

    22) Add JMP32 instruction class support to #ebpf, from Jiong Wang.

    23) Add spinlock support to #ebpf, from Alexei Starovoitov.

    24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson.

    25) Add device information API to devlink, from Jakub Kicinski.

    26) Add new timestamping socket options which are y2038 safe, from
    Deepa Dinamani.

    27) Add RX checksum offloading for various sh_eth chips, from Sergei
    Shtylyov.

    28) Flow offload infrastructure, from Pablo Neira Ayuso.

    29) Numerous cleanups, improvements, and bug fixes to the PHY layer
    and many drivers from Heiner Kallweit.

    30) Lots of changes to try and make packet scheduler classifiers run
    lockless as much as possible, from Vlad Buslov.

    31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows.

    32) Add concurrency tests to tc-tests infrastructure, from Vlad
    Buslov.

    33) Add hwmon support to aquantia, from Heiner Kallweit.

    34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet.

    And I would be remiss if I didn't thank the various major networking
    subsystem maintainers for integrating much of this work before I even
    saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso,
    Johannes Berg, Kalle Valo, and many others. Thank you!"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits)
    net/sched: avoid unused-label warning
    net: ignore sysctl_devconf_inherit_init_net without SYSCTL
    phy: mdio-mux: fix Kconfig dependencies
    net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg
    net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
    selftest/net: Remove duplicate header
    sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
    net/mlx5e: Update tx reporter status in case channels were successfully opened
    devlink: Add support for direct reporter health state update
    devlink: Update reporter state to error even if recover aborted
    sctp: call iov_iter_revert() after sending ABORT
    team: Free BPF filter when unregistering netdev
    ip6mr: Do not call __IP6_INC_STATS() from preemptible context
    isdn: mISDN: Fix potential NULL pointer dereference of kzalloc
    net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs
    cxgb4/chtls: Prefix adapter flags with CXGB4
    net-sysfs: Switch to bitmap_zalloc()
    mellanox: Switch to bitmap_zalloc()
    bpf: add test cases for non-pointer sanitiation logic
    mlxsw: i2c: Extend initialization by querying resources data
    ...

    Linus Torvalds
     

05 Mar, 2019

1 commit

  • Every in-kernel use of this function defined it to KERNEL_DS (either as
    an actual define, or as an inline function). It's an entirely
    historical artifact, and long long long ago used to actually read the
    segment selector value of '%ds' on x86.

    Which in the kernel is always KERNEL_DS.

    Inspired by a patch from Jann Horn that just did this for a very small
    subset of users (the ones in fs/), along with Al who suggested a script.
    I then just took it to the logical extreme and removed all the remaining
    gunk.

    Roughly scripted with

    git grep -l '(get_ds())' -- :^tools/ | xargs sed -i 's/(get_ds())/(KERNEL_DS)/'
    git grep -lw 'get_ds' -- :^tools/ | xargs sed -i '/^#define get_ds()/d'

    plus manual fixups to remove a few unusual usage patterns, the couple of
    inline function cases and to fix up a comment that had become stale.

    The 'get_ds()' function remains in an x86 kvm selftest, since in user
    space it actually does something relevant.

    Inspired-by: Jann Horn
    Inspired-by: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

11 Feb, 2019

2 commits

  • …/arnd/playground into timers/2038

    Pull y2038 - time64 system calls from Arnd Bergmann:

    This series finally gets us to the point of having system calls with 64-bit
    time_t on all architectures, after a long time of incremental preparation
    patches.

    There was actually one conversion that I missed during the summer,
    i.e. Deepa's timex series, which I now updated based on the 5.0-rc1 changes
    and review comments.

    The following system calls are now added on all 32-bit architectures using
    the same system call numbers:

    403 clock_gettime64
    404 clock_settime64
    405 clock_adjtime64
    406 clock_getres_time64
    407 clock_nanosleep_time64
    408 timer_gettime64
    409 timer_settime64
    410 timerfd_gettime64
    411 timerfd_settime64
    412 utimensat_time64
    413 pselect6_time64
    414 ppoll_time64
    416 io_pgetevents_time64
    417 recvmmsg_time64
    418 mq_timedsend_time64
    419 mq_timedreceive_time64
    420 semtimedop_time64
    421 rt_sigtimedwait_time64
    422 futex_time64
    423 sched_rr_get_interval_time64

    Each one of these corresponds directly to an existing system call that
    includes a 'struct timespec' argument, or a structure containing a timespec
    or (in case of clock_adjtime) timeval. Not included here are new versions
    of getitimer/setitimer and getrusage/waitid, which are planned for the
    future but only needed to make a consistent API rather than for correct
    operation beyond y2038. These four system calls are based on 'timeval', and
    it has not been finally decided what the replacement kernel interface will
    use instead.

    So far, I have done a lot of build testing across most architectures, which
    has found a number of bugs. Runtime testing so far included testing LTP on
    32-bit ARM with the existing system calls, to ensure we do not regress for
    existing binaries, and a test with a 32-bit x86 build of LTP against a
    modified version of the musl C library that has been adapted to the new
    system call interface [3]. This library can be used for testing on all
    architectures supported by musl-1.1.21, but it is not how the support is
    getting integrated into the official musl release. Official musl support is
    planned but will require more invasive changes to the library.

    Link: https://lore.kernel.org/lkml/20190110162435.309262-1-arnd@arndb.de/T/
    Link: https://lore.kernel.org/lkml/20190118161835.2259170-1-arnd@arndb.de/
    Link: https://git.linaro.org/people/arnd/musl-y2038.git/ [3]

    Thomas Gleixner
     
  • …git/arnd/playground into timers/2038

    Pull preparatory work for y2038 changes from Arnd Bergmann:

    System call unification and cleanup

    The system call tables have diverged a bit over the years, and a number of
    the recent additions never made it into all architectures, for one reason
    or another.

    This is an attempt to clean it up as far as we can without breaking
    compatibility, doing a number of steps:

    - Add system calls that have not yet been integrated into all architectures
    but that we definitely want there. This includes {,f}statfs64() and
    get{eg,eu,g,p,u,pp}id() on alpha, which have been missing traditionally.

    - The s390 compat syscall handling is cleaned up to be more like what we
    do on other architectures, while keeping the 31-bit pointer
    extension. This was merged as a shared branch by the s390 maintainers
    and is included here in order to base the other patches on top.

    - Add the separate ipc syscalls on all architectures that traditionally
    only had sys_ipc(). This version is done without the support for IPC_OLD
    that we have in sys_ipc. The new semtimedop_time64 syscall will only
    be added here, not in sys_ipc.

    - Add syscall numbers for a couple of syscalls that we probably don't need
    everywhere, in particular pkey_* and rseq, for the purpose of symmetry:
    if it's in asm-generic/unistd.h, it makes sense to have it everywhere. I
    expect that any future system calls will get assigned on all platforms
    together, even when they appear to be specific to a single architecture.

    - Prepare for having the same system call numbers for any future calls. In
    combination with the generated tables, this hopefully makes it easier to
    add new calls across all architectures together.

    All of the above are technically separate from the y2038 work, but are done
    as preparation before we add the new 64-bit time_t system calls everywhere,
    providing a common baseline set of system calls.

    I expect that glibc and other libraries that want to use 64-bit time_t will
    require linux-5.1 kernel headers for building in the future, and at a much
    later point may also require linux-5.1 or a later version as the minimum
    kernel at runtime. Having a common baseline then allows the removal of many
    architecture or kernel version specific workarounds.

    Thomas Gleixner
     

07 Feb, 2019

1 commit

  • This adds 21 new system calls on each ABI that has 32-bit time_t
    today. All of these have the exact same semantics as their existing
    counterparts, and the new ones all have macro names that end in 'time64'
    for clarification.

    This gets us to the point of being able to safely use a C library
    that has 64-bit time_t in user space. There are still a couple of
    loose ends to tie up in various areas of the code, but this is the
    big one, and should be entirely uncontroversial at this point.

    In particular, there are four system calls (getitimer, setitimer,
    waitid, and getrusage) that don't have a 64-bit counterpart yet,
    but these can all be safely implemented in the C library by wrapping
    around the existing system calls because the 32-bit time_t they
    pass only counts elapsed time, not time since the epoch. They
    will be dealt with later.

    Signed-off-by: Arnd Bergmann
    Acked-by: Heiko Carstens
    Acked-by: Geert Uytterhoeven
    Acked-by: Catalin Marinas

    Arnd Bergmann
     

04 Feb, 2019

1 commit

  • Many architectures maintain an arch specific copy of the
    file even though there are no differences with the asm-generic
    one. Allow these architectures to use the generic one instead.

    Signed-off-by: Deepa Dinamani
    Acked-by: Max Filippov
    Acked-by: Heiko Carstens
    Acked-by: Willem de Bruijn
    Cc: chris@zankel.net
    Cc: fenghua.yu@intel.com
    Cc: tglx@linutronix.de
    Cc: schwidefsky@de.ibm.com
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linux-s390@vger.kernel.org
    Signed-off-by: David S. Miller

    Deepa Dinamani
     

26 Jan, 2019

4 commits

  • Most architectures define system call numbers for the rseq and pkey system
    calls, even when they don't support the features, and perhaps never will.

    Only a few architectures are missing these, so just define them anyway
    for consistency. If we decide to add them later to one of these, the
    system call numbers won't get out of sync then.

    Signed-off-by: Arnd Bergmann
    Acked-by: Heiko Carstens
    Acked-by: Geert Uytterhoeven

    Arnd Bergmann
     
  • Most architectures have assigned numbers for both seccomp and
    perf_event_open, even when they do not implement either.

    ia64 is an exception here, so for consistency let's add numbers for both
    of them. Unless CONFIG_PERF_EVENTS and CONFIG_SECCOMP are implemented,
    the system calls just return -ENOSYS.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     
  • All architectures should implement these two, so assign numbers
    and hook them up on ia64.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     
  • Other architectures commonly use __NR_umount2 for sys_umount,
    only ia64 and alpha use __NR_umount here. In order to synchronize
    the generated tables, use umount2 like everyone else, and add back
    the old name from asm/unistd.h for compatibility.

    The __IGNORE_* lines are now all obsolete and can be removed as
    a side-effect.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

22 Jan, 2019

1 commit


18 Jan, 2019

1 commit

  • This introduces a new generic SOL_SOCKET-level socket option called
    SO_BINDTOIFINDEX. It behaves similarly to SO_BINDTODEVICE, but takes a
    network interface index as argument, rather than the network interface
    name.

    User-space often refers to network-interfaces via their index, but has
    to temporarily resolve it to a name for a call into SO_BINDTODEVICE.
    This might pose problems when the network-device is renamed
    asynchronously by other parts of the system. When this happens, the
    SO_BINDTODEVICE might either fail, or worse, it might bind to the wrong
    device.

    In most cases user-space only ever operates on devices which they
    either manage themselves, or otherwise have a guarantee that the device
    name will not change (e.g., devices that are UP cannot be renamed).
    However, particularly in libraries this guarantee is non-obvious and it
    would be nice if that race-condition would simply not exist. It would
    make it easier for those libraries to operate even in situations where
    the device-name might change under the hood.

    A real use-case that we recently hit is trying to start the network
    stack early in the initrd but make it survive into the real system.
    Existing distributions rename network-interfaces during the transition
    from initrd into the real system. This, obviously, cannot affect
    devices that are up and running (unless you also consider moving them
    between network-namespaces). However, the network manager now has to
    make sure its management engine for dormant devices will not run in
    parallel to these renames. Particularly, when you offload operations
    like DHCP into separate processes, these might set up their sockets
    early, and thus have to resolve the device-name possibly running into
    this race-condition.

    By avoiding a call to resolve the device-name, we no longer depend on
    the name and can run network setup of dormant devices in parallel to
    the transition off the initrd. The SO_BINDTOIFINDEX socket option plugs this
    race.

    Reviewed-by: Tom Gundersen
    Signed-off-by: David Herrmann
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    David Herrmann
     

16 Jan, 2019

1 commit


06 Jan, 2019

3 commits

  • Now that Kbuild automatically creates asm-generic wrappers for missing
    mandatory headers, it is redundant to list the same headers in
    generic-y and mandatory-y.

    Suggested-by: Sam Ravnborg
    Signed-off-by: Masahiro Yamada
    Acked-by: Sam Ravnborg

    Masahiro Yamada
     
  • These comments are leftovers of commit fcc8487d477a ("uapi: export all
    headers under uapi directories").

    Prior to that commit, exported headers had to be explicitly added to
    header-y. Now, all headers under the uapi/ directories are exported.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • Merge more updates from Andrew Morton:

    - procfs updates

    - various misc bits

    - lib/ updates

    - epoll updates

    - autofs

    - fatfs

    - a few more MM bits

    * emailed patches from Andrew Morton: (58 commits)
    mm/page_io.c: fix polled swap page in
    checkpatch: add Co-developed-by to signature tags
    docs: fix Co-Developed-by docs
    drivers/base/platform.c: kmemleak ignore a known leak
    fs: don't open code lru_to_page()
    fs/: remove caller signal_pending branch predictions
    mm/: remove caller signal_pending branch predictions
    arch/arc/mm/fault.c: remove caller signal_pending_branch predictions
    kernel/sched/: remove caller signal_pending branch predictions
    kernel/locking/mutex.c: remove caller signal_pending branch predictions
    mm: select HAVE_MOVE_PMD on x86 for faster mremap
    mm: speed up mremap by 20x on large regions
    mm: treewide: remove unused address argument from pte_alloc functions
    initramfs: cleanup incomplete rootfs
    scripts/gdb: fix lx-version string output
    kernel/kcov.c: mark write_comp_data() as notrace
    kernel/sysctl: add panic_print into sysctl
    panic: add options to print system info when panic happens
    bfs: extra sanity checking and static inode bitmap
    exec: separate MM_ANONPAGES and RLIMIT_STACK accounting
    ...

    Linus Torvalds
     

05 Jan, 2019

3 commits

  • Some non-generic ia64 configs don't build swiotlb, and thus should not
    pull in the generic non-coherent DMA infrastructure.

    Fixes: 68c608345c ("swiotlb: remove dma_mark_clean")
    Reported-by: Tony Luck
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Tony Luck
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Patch series "Add support for fast mremap".

    This series speeds up the mremap(2) syscall by copying page tables at
    the PMD level even for non-THP systems. There is concern that the extra
    'address' argument that mremap passes to pte_alloc may do something
    subtle architecture related in the future that may make the scheme not
    work. Also we find that there is no point in passing the 'address' to
    pte_alloc since it's unused. This patch therefore removes this argument
    tree-wide, resulting in a nice negative diff as well. It also ensures
    along the way that the enabled architectures do not do anything funky
    with the 'address' argument that goes unnoticed by the optimization.

    Build and boot tested on x86-64. Build tested on arm64. The config
    enablement patch for arm64 will be posted in the future after more
    testing.

    The changes were obtained by applying the following Coccinelle script.
    (thanks Julia for answering all Coccinelle questions!).
    Following fix ups were done manually:
    * Removal of address argument from pte_fragment_alloc
    * Removal of pte_alloc_one_fast definitions from m68k and microblaze.

    // Options: --include-headers --no-includes
    // Note: I split the 'identifier fn' line, so if you are manually
    // running it, please unsplit it so it runs for you.

    virtual patch

    @pte_alloc_func_def depends on patch exists@
    identifier E2;
    identifier fn =~
    "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
    type T2;
    @@

    fn(...
    - , T2 E2
    )
    { ... }

    @pte_alloc_func_proto_noarg depends on patch exists@
    type T1, T2, T3, T4;
    identifier fn =~ "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
    @@

    (
    - T3 fn(T1, T2);
    + T3 fn(T1);
    |
    - T3 fn(T1, T2, T4);
    + T3 fn(T1, T2);
    )

    @pte_alloc_func_proto depends on patch exists@
    identifier E1, E2, E4;
    type T1, T2, T3, T4;
    identifier fn =~
    "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
    @@

    (
    - T3 fn(T1 E1, T2 E2);
    + T3 fn(T1 E1);
    |
    - T3 fn(T1 E1, T2 E2, T4 E4);
    + T3 fn(T1 E1, T2 E2);
    )

    @pte_alloc_func_call depends on patch exists@
    expression E2;
    identifier fn =~
    "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
    @@

    fn(...
    -, E2
    )

    @pte_alloc_macro depends on patch exists@
    identifier fn =~
    "^(__pte_alloc|pte_alloc_one|pte_alloc|__pte_alloc_kernel|pte_alloc_one_kernel)$";
    identifier a, b, c;
    expression e;
    position p;
    @@

    (
    - #define fn(a, b, c) e
    + #define fn(a, b) e
    |
    - #define fn(a, b) e
    + #define fn(a) e
    )

    Link: http://lkml.kernel.org/r/20181108181201.88826-2-joelaf@google.com
    Signed-off-by: Joel Fernandes (Google)
    Suggested-by: Kirill A. Shutemov
    Acked-by: Kirill A. Shutemov
    Cc: Michal Hocko
    Cc: Julia Lawall
    Cc: Kirill A. Shutemov
    Cc: William Kucharski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joel Fernandes (Google)
     
  • When testing in userspace, UBSAN pointed out that shifting into the sign
    bit is undefined behaviour. It doesn't really make sense to ask for the
    highest set bit of a negative value, so just turn the argument type into
    an unsigned int.

    Some architectures (eg ppc) already had it declared as an unsigned int,
    so I don't expect too many problems.

    Link: http://lkml.kernel.org/r/20181105221117.31828-1-willy@infradead.org
    Signed-off-by: Matthew Wilcox
    Acked-by: Thomas Gleixner
    Acked-by: Geert Uytterhoeven
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

04 Jan, 2019

1 commit

  • Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
    of the user address range verification function since we got rid of the
    old racy i386-only code to walk page tables by hand.

    It existed because the original 80386 would not honor the write protect
    bit when in kernel mode, so you had to do COW by hand before doing any
    user access. But we haven't supported that in a long time, and these
    days the 'type' argument is a purely historical artifact.

    A discussion about extending 'user_access_begin()' to do the range
    checking resulted in this patch, because there is no way we're going to
    move the old VERIFY_xyz interface to that model. And it's best done at
    the end of the merge window when I've done most of my merges, so let's
    just get this done once and for all.

    This patch was mostly done with a sed-script, with manual fix-ups for
    the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

    There were a couple of notable cases:

    - csky still had the old "verify_area()" name as an alias.

    - the iter_iov code had magical hardcoded knowledge of the actual
    values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
    really used it)

    - microblaze used the type argument for a debug printout

    but other than those oddities this should be a total no-op patch.

    I tried to fix up all architectures, did fairly extensive grepping for
    access_ok() uses, and the changes are trivial, but I may have missed
    something. Any missed conversion should be trivially fixable, though.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

30 Dec, 2018

2 commits

  • Pull Kconfig file consolidation from Masahiro Yamada:
    "Consolidation of bus (PCI, PCMCIA, EISA, RapidIO) config entries by
    Christoph Hellwig.

    Currently, every architecture that wants to provide common peripheral
    busses needs to add some boilerplate code and include the right
    Kconfig files. This series instead just selects the presence (when
    needed) and then handles everything in the bus-specific Kconfig file
    under drivers/"

    * tag 'kconfig-v4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    pcmcia: remove per-arch PCMCIA config entry
    eisa: consolidate EISA Kconfig entry in drivers/eisa
    rapidio: consolidate RAPIDIO config entry in drivers/rapidio
    pcmcia: allow PCMCIA support independent of the architecture
    PCI: consolidate the PCI_SYSCALL symbol
    PCI: consolidate the PCI_DOMAINS and PCI_DOMAINS_GENERIC config options
    PCI: consolidate PCI config entry in drivers/pci
    MIPS: remove the HT_PCI config option

    Linus Torvalds
     
  • Pull Kconfig updates from Masahiro Yamada:

    - support -y option for merge_config.sh to avoid downgrading =y to =m

    - remove S_OTHER symbol type, and touch include/config/*.h files correctly

    - fix file name and line number in lexer warnings

    - fix memory leak when EOF is encountered in quotation

    - resolve all shift/reduce conflicts of the parser

    - warn no new line at end of file

    - make 'source' statement more strict to take only string literal

    - rewrite the lexer and remove the keyword lookup table

    - convert to SPDX License Identifier

    - compile C files independently instead of including them from zconf.y

    - fix various warnings of gconfig

    - misc cleanups

    * tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
    kconfig: surround dbg_sym_flags with #ifdef DEBUG to fix gconf warning
    kconfig: split images.c out of qconf.cc/gconf.c to fix gconf warnings
    kconfig: add static qualifiers to fix gconf warnings
    kconfig: split the lexer out of zconf.y
    kconfig: split some C files out of zconf.y
    kconfig: convert to SPDX License Identifier
    kconfig: remove keyword lookup table entirely
    kconfig: update current_pos in the second lexer
    kconfig: switch to ASSIGN_VAL state in the second lexer
    kconfig: stop associating kconf_id with yylval
    kconfig: refactor end token rules
    kconfig: stop supporting '.' and '/' in unquoted words
    treewide: surround Kconfig file paths with double quotes
    microblaze: surround string default in Kconfig with double quotes
    kconfig: use T_WORD instead of T_VARIABLE for variables
    kconfig: use specific tokens instead of T_ASSIGN for assignments
    kconfig: refactor scanning and parsing "option" properties
    kconfig: use distinct tokens for type and default properties
    kconfig: remove redundant token defines
    kconfig: rename depends_list to comment_option_list
    ...

    Linus Torvalds