02 Nov, 2020

1 commit


23 Oct, 2020

1 commit


03 Oct, 2020

1 commit


08 Aug, 2020

1 commit

  • When refactoring the SCM_RIGHTS code, I accidentally mis-merged my
    native/compat diffs, which entirely broke using SCM_RIGHTS in compat
    mode. Use the correct helper.

    Reported-by: Christian Zigotzky
    Link: https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-August/216156.html
    Reported-by: "Alex Xu (Hello71)"
    Link: https://lore.kernel.org/lkml/1596812929.lz7fuo8r2w.none@localhost/
    Suggested-by: Thadeu Lima de Souza Cascardo
    Fixes: c0029de50982 ("net/scm: Regularize compat handling of scm_detach_fds()")
    Tested-by: Alex Xu (Hello71)
    Acked-by: Thadeu Lima de Souza Cascardo
    Signed-off-by: Kees Cook

    Kees Cook
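
For context, SCM_RIGHTS is the ancillary-message type that passes open file descriptors over an AF_UNIX socket, and the regression above hit 32-bit (compat) callers of that receive path. The following is a minimal userspace sketch of the round trip, not code from the patch; `send_fd()` and `recv_fd()` are hypothetical helper names.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one file descriptor over a connected AF_UNIX socket. Returns 0 or -1. */
static int send_fd(int sock, int fd)
{
    char byte = 0;                       /* at least one data byte must travel */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;            /* forces correct alignment of buf */
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor. Returns the new descriptor or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;
    int fd;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}
```

Building a test program like this with -m32 on a 64-bit kernel is one way to exercise the compat path the fix restores.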
     

06 Aug, 2020

1 commit

  • Pull networking updates from David Miller:

    1) Support 6GHz band in ath11k driver, from Rajkumar Manoharan.

    2) Support UDP segmentation in core TSO code, from Eric Dumazet.

    3) Allow flashing different flash images in cxgb4 driver, from Vishal
    Kulkarni.

    4) Add drop frames counter and flow status to tc flower offloading,
    from Po Liu.

    5) Support n-tuple filters in cxgb4, from Vishal Kulkarni.

    6) Various new indirect call avoidance, from Eric Dumazet and Brian
    Vazquez.

    7) Fix BPF verifier failures on 32-bit pointer arithmetic, from
    Yonghong Song.

    8) Support querying and setting hardware address of a port function via
    devlink, use this in mlx5, from Parav Pandit.

    9) Support hw ipsec offload on bonding slaves, from Jarod Wilson.

    10) Switch qca8k driver over to phylink, from Jonathan McDowell.

    11) In bpftool, show list of processes holding BPF FD references to
    maps, programs, links, and btf objects. From Andrii Nakryiko.

    12) Several conversions over to generic power management, from Vaibhav
    Gupta.

    13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry
    Yakunin.

    14) Various https url conversions, from Alexander A. Klimov.

    15) Timestamping and PHC support for mscc PHY driver, from Antoine
    Tenart.

    16) Support bpf iterating over tcp and udp sockets, from Yonghong Song.

    17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov.

    18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan.

    19) Fix the ->ndo_start_xmit() return type to be netdev_tx_t in several
    drivers. From Luc Van Oostenryck.

    20) XDP support for xen-netfront, from Denis Kirjanov.

    21) Support receive buffer autotuning in MPTCP, from Florian Westphal.

    22) Support EF100 chip in sfc driver, from Edward Cree.

    23) Add XDP support to mvpp2 driver, from Matteo Croce.

    24) Support MPTCP in sock_diag, from Paolo Abeni.

    25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic
    infrastructure, from Jakub Kicinski.

    26) Several pci_ --> dma_ API conversions, from Christophe JAILLET.

    27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel.

    28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki.

    29) Refactor a lot of networking socket option handling code in order to
    avoid set_fs() calls, from Christoph Hellwig.

    30) Add rfc4884 support to icmp code, from Willem de Bruijn.

    31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei.

    32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin.

    33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin.

    34) Support TCP syncookies in MPTCP, from Florian Westphal.

    35) Fix several tricky cases of PMTU handling wrt. bridging, from Stefano
    Brivio.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits)
    net: thunderx: initialize VF's mailbox mutex before first usage
    usb: hso: remove bogus check for EINPROGRESS
    usb: hso: no complaint about kmalloc failure
    hso: fix bailout in error case of probe
    ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
    selftests/net: relax cpu affinity requirement in msg_zerocopy test
    mptcp: be careful on subflow creation
    selftests: rtnetlink: make kci_test_encap() return sub-test result
    selftests: rtnetlink: correct the final return value for the test
    net: dsa: sja1105: use detected device id instead of DT one on mismatch
    tipc: set ub->ifindex for local ipv6 address
    ipv6: add ipv6_dev_find()
    net: openvswitch: silence suspicious RCU usage warning
    Revert "vxlan: fix tos value before xmit"
    ptp: only allow phase values lower than 1 period
    farsync: switch from 'pci_' to 'dma_' API
    wan: wanxl: switch from 'pci_' to 'dma_' API
    hv_netvsc: do not use VF device if link is down
    dpaa2-eth: Fix passing zero to 'PTR_ERR' warning
    net: macb: Properly handle phylink on at91sam9x
    ...

    Linus Torvalds
     

05 Aug, 2020

1 commit

  • Pull seccomp updates from Kees Cook:
    "There are a bunch of clean ups and selftest improvements along with
    two major updates to the SECCOMP_RET_USER_NOTIF filter return:
    EPOLLHUP support to more easily detect the death of a monitored
    process, and being able to inject fds when intercepting syscalls that
    expect an fd-opening side-effect (needed by both container folks and
    Chrome). The latter continued the refactoring of __scm_install_fd()
    started by Christoph, and in the process found and fixed a handful of
    bugs in various callers.

    - Improved selftest coverage, timeouts, and reporting

    - Add EPOLLHUP support for SECCOMP_RET_USER_NOTIF (Christian Brauner)

    - Refactor __scm_install_fd() into __receive_fd() and fix buggy
    callers

    - Introduce 'addfd' command for SECCOMP_RET_USER_NOTIF (Sargun
    Dhillon)"

    * tag 'seccomp-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
    selftests/seccomp: Test SECCOMP_IOCTL_NOTIF_ADDFD
    seccomp: Introduce addfd ioctl to seccomp user notifier
    fs: Expand __receive_fd() to accept existing fd
    pidfd: Replace open-coded receive_fd()
    fs: Add receive_fd() wrapper for __receive_fd()
    fs: Move __scm_install_fd() to __receive_fd()
    net/scm: Regularize compat handling of scm_detach_fds()
    pidfd: Add missing sock updates for pidfd_getfd()
    net/compat: Add missing sock updates for SCM_RIGHTS
    selftests/seccomp: Check ENOSYS under tracing
    selftests/seccomp: Refactor to use fixture variants
    selftests/harness: Clean up kern-doc for fixtures
    seccomp: Use -1 marker for end of mode 1 syscall list
    seccomp: Fix ioctl number for SECCOMP_IOCTL_NOTIF_ID_VALID
    selftests/seccomp: Rename user_trap_syscall() to user_notif_syscall()
    selftests/seccomp: Make kcmp() less required
    seccomp: Use pr_fmt
    selftests/seccomp: Improve calibration loop
    selftests/seccomp: use 90s as timeout
    selftests/seccomp: Expand benchmark to per-filter measurements
    ...

    Linus Torvalds
     

02 Aug, 2020

1 commit


28 Jul, 2020

1 commit

  • commit 547ce4cfb34c ("switch cmsghdr_from_user_compat_to_kern() to
    copy_from_user()") missed one of the places where ucmlen should've been
    replaced with cmsg.cmsg_len, now that we are fetching the entire struct
    rather than doing it field-by-field.

    As a result, compat sendmsg() with several different-sized cmsgs
    attached started to fail with EINVAL. Trivial to fix, fortunately.

    Fixes: 547ce4cfb34c ("switch cmsghdr_from_user_compat_to_kern() to copy_from_user()")
    Reported-by: Nick Bowler
    Tested-by: Nick Bowler
    Signed-off-by: Al Viro
    Signed-off-by: David S. Miller

    Al Viro
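
To illustrate why each message's own cmsg_len matters, here is a small userspace sketch of the length translation a compat sendmsg() has to perform when sizing the native control buffer. It is not the kernel code: the constants assume a 64-bit native layout (16-byte header, 8-byte alignment) and a 32-bit compat layout (12-byte header), and `native_control_len()` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define COMPAT_HDR_LEN   12u   /* assumed: u32 len + int level + int type */
#define NATIVE_HDR_LEN   16u   /* assumed: size_t len + int level + int type */
#define NATIVE_ALIGNMENT 8u    /* assumed: native cmsg alignment on 64-bit */

static size_t align_up(size_t len, size_t a)
{
    return (len + a - 1) & ~(a - 1);
}

/*
 * Sum the native space needed for n compat control messages.
 * Each entry's own length is used; reusing a stale length from a
 * previous entry miscounts as soon as the sizes differ.
 * Returns 0 if any entry is shorter than a compat header.
 */
static size_t native_control_len(const uint32_t *compat_lens, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if (compat_lens[i] < COMPAT_HDR_LEN)
            return 0;
        size_t payload = compat_lens[i] - COMPAT_HDR_LEN;
        total += align_up(NATIVE_HDR_LEN + payload, NATIVE_ALIGNMENT);
    }
    return total;
}
```

With two messages of compat length 16 and 24 (payloads of 4 and 12 bytes), the native buffer needs 24 + 32 = 56 bytes under these assumptions.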
     

20 Jul, 2020

2 commits

  • Now that the ->compat_{get,set}sockopt proto_ops methods are gone
    there is no good reason left to keep the compat syscalls separate.

    This fixes the odd use of unsigned int for the compat_setsockopt
    optlen and the missing sock_use_custom_sol_socket.

    It would also easily allow running the eBPF hooks for the compat
    syscalls, but such a large change in behavior does not belong in
    a consolidation patch like this one.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
     
  • Add a helper that copies either a native or compat bpf_fprog from
    userspace after verifying the length, and remove the compat setsockopt
    handlers that now aren't required.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
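
The idea of a single helper that accepts either layout can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel helper: the struct names, the `compat` flag standing in for the kernel's compat-syscall check, and the plain memcpy() standing in for a user copy are all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Native layout: a length and a real pointer. */
struct demo_fprog {
    unsigned short len;
    void *filter;
};

/* 32-bit layout: the pointer is carried as a 32-bit integer. */
struct demo_fprog32 {
    uint16_t len;
    uint32_t filter;
};

/*
 * Copy a filter program description from src into dst, checking that
 * optlen matches the layout the caller claims to use.
 */
static int demo_copy_fprog(struct demo_fprog *dst, const void *src,
                           size_t optlen, bool compat)
{
    if (compat) {
        struct demo_fprog32 f32;

        if (optlen != sizeof(f32))
            return -EINVAL;
        memcpy(&f32, src, sizeof(f32));
        memset(dst, 0, sizeof(*dst));
        dst->len = f32.len;
        /* Widen the 32-bit user address to a native pointer. */
        dst->filter = (void *)(uintptr_t)f32.filter;
    } else {
        if (optlen != sizeof(*dst))
            return -EINVAL;
        memcpy(dst, src, sizeof(*dst));
    }
    return 0;
}
```

Because the length check and the widening live in one place, callers no longer need a separate compat handler per socket option.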
     

14 Jul, 2020

4 commits

  • For both pidfd and seccomp, the __user pointer is not used. Update
    __receive_fd() to make writing to ufd optional via a NULL check. However,
    for the receive_fd_user() wrapper, ufd is NULL checked so an -EFAULT
    can be returned to avoid changing the SCM_RIGHTS interface behavior. Add
    new wrapper receive_fd() for pidfd and seccomp that does not use the ufd
    argument. For the new helper, the allocated fd needs to be returned on
    success. Update the existing callers to handle it.

    Cc: Alexander Viro
    Cc: linux-fsdevel@vger.kernel.org
    Reviewed-by: Sargun Dhillon
    Acked-by: Christian Brauner
    Signed-off-by: Kees Cook

    Kees Cook
     
  • In preparation for users of the "install a received file" logic outside
    of net/ (pidfd and seccomp), relocate and rename __scm_install_fd() from
    net/core/scm.c to __receive_fd() in fs/file.c, and provide a wrapper
    named receive_fd_user(), as future patches will change the interface
    to __receive_fd().

    Additionally add a comment to fd_install() as a counterpoint to how
    __receive_fd() interacts with fput().

    Cc: Alexander Viro
    Cc: "David S. Miller"
    Cc: Jakub Kicinski
    Cc: Dmitry Kadashev
    Cc: Jens Axboe
    Cc: Arnd Bergmann
    Cc: Sargun Dhillon
    Cc: Ido Schimmel
    Cc: Ioana Ciornei
    Cc: linux-fsdevel@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Reviewed-by: Sargun Dhillon
    Acked-by: Christian Brauner
    Signed-off-by: Kees Cook

    Kees Cook
     
  • Duplicate the cleanups from commit 2618d530dd8b ("net/scm: cleanup
    scm_detach_fds") into the compat code.

    Replace open-coded __receive_sock() with a call to the helper.

    Move the check added in commit 1f466e1f15cf ("net: cleanly handle kernel
    vs user buffers for ->msg_control") to before the compat call, even
    though it should be impossible for an in-kernel call to also be compat.

    Correct the int "flags" argument to unsigned int to match fd_install()
    and similar APIs.

    Regularize any remaining differences, including a whitespace issue,
    a checkpatch warning, and add the check from commit 6900317f5eff ("net,
    scm: fix PaX detected msg_controllen overflow in scm_detach_fds") which
    fixed an overflow unique to 64-bit. To avoid confusion when comparing
    the compat handler to the native handler, just include the same check
    in the compat handler.

    Cc: Christoph Hellwig
    Cc: Sargun Dhillon
    Cc: Jakub Kicinski
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Acked-by: Christian Brauner
    Signed-off-by: Kees Cook

    Kees Cook
     
  • Add missed sock updates to compat path via a new helper, which will be
    used more in coming patches. (The net/core/scm.c code is left as-is here
    to assist with -stable backports for the compat path.)

    Cc: Christoph Hellwig
    Cc: Sargun Dhillon
    Cc: Jakub Kicinski
    Cc: stable@vger.kernel.org
    Fixes: 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
    Fixes: d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
    Acked-by: Christian Brauner
    Signed-off-by: Kees Cook

    Kees Cook
     

02 Jun, 2020

1 commit


21 May, 2020

3 commits

  • not used anymore

    Signed-off-by: Al Viro

    Al Viro
     
  • now we can do MCAST_MSFILTER in compat ->getsockopt() without
    playing silly buggers with copying things back and forth.
    We can form a native struct group_filter (sans the variable-length
    tail) on stack, pass that + pointer to the tail of original request
    to the helper doing the bulk of the work, then do the rest of
    copyout - same as the native getsockopt() does.

    Signed-off-by: Al Viro

    Al Viro
     
  • We want to get rid of compat_mc_[sg]etsockopt() and to have that stuff
    handled without compat_alloc_user_space(), extra copying through
    userland, etc. To do that we'll need ipv4 and ipv6 instances of
    ->compat_[sg]etsockopt() to manipulate the 32bit variants of mcast
    requests, so we need to move the definitions of those out of net/compat.c
    and into a public header.

    This patch just does a mechanical move to include/net/compat.h

    Signed-off-by: Al Viro

    Al Viro
     

12 May, 2020

1 commit

  • The msg_control field in struct msghdr can either contain a user
    pointer when used with the recvmsg system call, or a kernel pointer
    when used with sendmsg. To complicate things further kernel_recvmsg
    can stuff a kernel pointer in and then use set_fs to make the uaccess
    helpers accept it.

    Replace it with a union of a kernel pointer msg_control field and
    a user pointer msg_control_user field, and allow kernel_recvmsg to
    operate on a proper kernel pointer, using a bitfield to override the
    normal choice of a user pointer for recvmsg.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
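
A rough userspace model of the union-plus-flag layout described above is shown below. It only mirrors the shape of the idea: `struct demo_msghdr`, `demo_copy_to_user()`, and `demo_put_control()` are hypothetical, and the "user copy" is a counted memcpy() rather than a real uaccess helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_msghdr {
    union {
        void *msg_control;        /* kernel pointer */
        void *msg_control_user;   /* would be a __user pointer in the kernel */
    };
    bool msg_control_is_user : 1; /* selects which union member is valid */
    size_t msg_controllen;
};

static int user_copies;           /* counts how often the "user" path ran */

static int demo_copy_to_user(void *dst, const void *src, size_t len)
{
    user_copies++;
    memcpy(dst, src, len);
    return 0;
}

/* Append len bytes of control data, dispatching on the pointer kind. */
static int demo_put_control(struct demo_msghdr *msg, const void *data,
                            size_t len)
{
    if (len > msg->msg_controllen)
        return -1;

    if (msg->msg_control_is_user) {
        if (demo_copy_to_user(msg->msg_control_user, data, len))
            return -1;
        msg->msg_control_user = (char *)msg->msg_control_user + len;
    } else {
        memcpy(msg->msg_control, data, len);
        msg->msg_control = (char *)msg->msg_control + len;
    }
    msg->msg_controllen -= len;
    return 0;
}
```

The point of the flag is that the choice between the two copy routines is explicit in the data structure instead of depending on a global address-limit override.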
     

10 Mar, 2020

1 commit


15 Nov, 2019

1 commit

  • The 'timespec' type definition and helpers like ktime_to_timespec()
    or timespec64_to_timespec() should no longer be used in the kernel so
    we can remove them and avoid introducing y2038 issues in new code.

    Change the socket code that needs to pass a timespec to user space for
    backward compatibility to use __kernel_old_timespec instead. This type
    has the same layout but a more clearly defined name.

    Slightly reformat tcp_recv_timestamp() for consistency after the removal
    of timespec64_to_timespec().

    Acked-by: Deepa Dinamani
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

01 Jun, 2019

1 commit


21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

20 Apr, 2019

1 commit

  • The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
    socket protocol handlers, and all of those end up calling the same
    sock_get_timestamp()/sock_get_timestampns() helper functions, which
    results in a lot of duplicate code.

    With the introduction of 64-bit time_t on 32-bit architectures, this
    gets worse, as we then need four different ioctl commands in each
    socket protocol implementation.

    To simplify that, let's add a new .gettstamp() operation in
    struct proto_ops, and move ioctl implementation into the common
    sock_ioctl()/compat_sock_ioctl_trans() functions that these all go
    through.

    We can reuse the sock_get_timestamp() implementation, but generalize
    it so it can deal with both native and compat mode, as well as
    timeval and timespec structures.

    Acked-by: Stefan Schmidt
    Acked-by: Neil Horman
    Acked-by: Marc Kleine-Budde
    Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/
    Signed-off-by: Arnd Bergmann
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

06 Mar, 2019

1 commit

  • Pull year 2038 updates from Thomas Gleixner:
    "Another round of changes to make the kernel ready for 2038. After lots
    of preparatory work this is the first set of syscalls which are 2038
    safe:

    403 clock_gettime64
    404 clock_settime64
    405 clock_adjtime64
    406 clock_getres_time64
    407 clock_nanosleep_time64
    408 timer_gettime64
    409 timer_settime64
    410 timerfd_gettime64
    411 timerfd_settime64
    412 utimensat_time64
    413 pselect6_time64
    414 ppoll_time64
    416 io_pgetevents_time64
    417 recvmmsg_time64
    418 mq_timedsend_time64
    419 mq_timedreceive_time64
    420 semtimedop_time64
    421 rt_sigtimedwait_time64
    422 futex_time64
    423 sched_rr_get_interval_time64

    The syscall numbers are identical all over the architectures"

    * 'timers-2038-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    riscv: Use latest system call ABI
    checksyscalls: fix up mq_timedreceive and stat exceptions
    unicore32: Fix __ARCH_WANT_STAT64 definition
    asm-generic: Make time32 syscall numbers optional
    asm-generic: Drop getrlimit and setrlimit syscalls from default list
    32-bit userspace ABI: introduce ARCH_32BIT_OFF_T config option
    compat ABI: use non-compat openat and open_by_handle_at variants
    y2038: add 64-bit time_t syscalls to all 32-bit architectures
    y2038: rename old time and utime syscalls
    y2038: remove struct definition redirects
    y2038: use time32 syscall names on 32-bit
    syscalls: remove obsolete __IGNORE_ macros
    y2038: syscalls: rename y2038 compat syscalls
    x86/x32: use time64 versions of sigtimedwait and recvmmsg
    timex: change syscalls to use struct __kernel_timex
    timex: use __kernel_timex internally
    sparc64: add custom adjtimex/clock_adjtime functions
    time: fix sys_timer_settime prototype
    time: Add struct __kernel_timex
    time: make adjtime compat handling available for 32 bit
    ...

    Linus Torvalds
     

04 Mar, 2019

1 commit

  • Add __user attributes in some of the casts in this function to avoid
    the following sparse warnings:

    net/compat.c:592:57: warning: cast removes address space of expression
    net/compat.c:592:57: warning: incorrect type in initializer (different address spaces)
    net/compat.c:592:57: expected struct compat_group_req [noderef] *gr32
    net/compat.c:592:57: got void *
    net/compat.c:613:65: warning: cast removes address space of expression
    net/compat.c:613:65: warning: incorrect type in initializer (different address spaces)
    net/compat.c:613:65: expected struct compat_group_source_req [noderef] *gsr32
    net/compat.c:613:65: got void *
    net/compat.c:634:60: warning: cast removes address space of expression
    net/compat.c:634:60: warning: incorrect type in initializer (different address spaces)
    net/compat.c:634:60: expected struct compat_group_filter [noderef] *gf32
    net/compat.c:634:60: got void *
    net/compat.c:672:52: warning: cast removes address space of expression
    net/compat.c:672:52: warning: incorrect type in initializer (different address spaces)
    net/compat.c:672:52: expected struct compat_group_filter [noderef] *gf32
    net/compat.c:672:52: got void *

    Signed-off-by: Ben Dooks
    Signed-off-by: David S. Miller

    Ben Dooks
     

25 Feb, 2019

1 commit

  • Three conflicts, one of which, for marvell10g.c is non-trivial and
    requires some follow-up from Heiner or someone else.

    The issue is that Heiner converted the marvell10g driver over to
    use the generic c45 code as much as possible.

    However, in 'net' a bug fix appeared which makes sure that a new
    local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0
    is cleared.

    Signed-off-by: David S. Miller

    David S. Miller
     

23 Feb, 2019

1 commit


07 Feb, 2019

1 commit

  • A lot of system calls that pass a time_t somewhere have an implementation
    using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have
    been reworked so that this implementation can now be used on 32-bit
    architectures as well.

    The missing step is to redefine them using the regular SYSCALL_DEFINEx()
    to get them out of the compat namespace and make it possible to build them
    on 32-bit architectures.

    Any system call that ends in 'time' gets a '32' suffix on its name for
    that version, while the others get a '_time32' suffix, to distinguish
    them from the normal version, which takes a 64-bit time argument in the
    future.

    In this step, only 64-bit architectures are changed, doing this rename
    first lets us avoid touching the 32-bit architectures twice.

    Acked-by: Catalin Marinas
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
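
The suffix rule stated above is mechanical enough to write down. The helper below is only an illustration of that rule in userspace; `time32_name()` is a hypothetical function and is not part of the kernel or its build scripts.

```c
#include <stdio.h>
#include <string.h>

/*
 * Derive the 32-bit-time variant of a syscall name:
 * names ending in "time" get "32" appended, all others get "_time32".
 * Returns 0 on success, -1 if the result does not fit in out.
 */
static int time32_name(const char *name, char *out, size_t outlen)
{
    size_t n = strlen(name);
    int ends_in_time = n >= 4 && strcmp(name + n - 4, "time") == 0;
    int written = snprintf(out, outlen, "%s%s", name,
                           ends_in_time ? "32" : "_time32");

    return (written < 0 || (size_t)written >= outlen) ? -1 : 0;
}
```

For example, clock_gettime maps to clock_gettime32, while recvmmsg maps to recvmmsg_time32.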
     

04 Feb, 2019

3 commits

  • As part of y2038 solution, all internal uses of
    struct timeval are replaced by struct __kernel_old_timeval
    and struct compat_timeval by struct old_timeval32.
    Make socket timestamps use these new types.

    This is mainly to be able to verify that the kernel build
    is y2038 safe when such non y2038 safe types are not
    supported anymore.

    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Cc: isdn@linux-pingi.de
    Signed-off-by: David S. Miller

    Deepa Dinamani
     
  • SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING options, the
    way they are currently defined, are not y2038 safe.
    Subsequent patches in the series add new y2038 safe versions
    of these options which provide 64 bit timestamps on all
    architectures uniformly.
    Hence, rename existing options with OLD tag suffixes.

    Also note that kernel will not use the untagged SO_TIMESTAMP*
    and SCM_TIMESTAMP* options internally anymore.

    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Cc: deller@gmx.de
    Cc: dhowells@redhat.com
    Cc: jejb@parisc-linux.org
    Cc: ralf@linux-mips.org
    Cc: rth@twiddle.net
    Cc: linux-afs@lists.infradead.org
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linux-rdma@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Signed-off-by: David S. Miller

    Deepa Dinamani
     
  • This is a cleanup to prepare for the addition of 64-bit time_t
    in SO_SNDTIMEO/SO_RCVTIMEO. The existing compat handler seems
    unnecessarily complex and error-prone, moving it all into the
    main setsockopt()/getsockopt() implementation requires half
    as much code and is easier to extend.

    32-bit user space can now use old_timeval32 on both 32-bit
    and 64-bit machines, while 64-bit code can use
    __kernel_old_timeval.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

04 Jan, 2019

2 commits

  • Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
    of the user address range verification function since we got rid of the
    old racy i386-only code to walk page tables by hand.

    It existed because the original 80386 would not honor the write protect
    bit when in kernel mode, so you had to do COW by hand before doing any
    user access. But we haven't supported that in a long time, and these
    days the 'type' argument is a purely historical artifact.

    A discussion about extending 'user_access_begin()' to do the range
    checking resulted in this patch, because there is no way we're going to
    move the old VERIFY_xyz interface to that model. And it's best done at
    the end of the merge window when I've done most of my merges, so let's
    just get this done once and for all.

    This patch was mostly done with a sed-script, with manual fix-ups for
    the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.

    There were a couple of notable cases:

    - csky still had the old "verify_area()" name as an alias.

    - the iter_iov code had magical hardcoded knowledge of the actual
    values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
    really used it)

    - microblaze used the type argument for a debug printout

    but other than those oddities this should be a total no-op patch.

    I tried to fix up all architectures, did fairly extensive grepping for
    access_ok() uses, and the changes are trivial, but I may have missed
    something. Any missed conversion should be trivially fixable, though.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Pull networking fixes from David Miller:
    "Several fixes here. Basically split down the line between newly
    introduced regressions and long existing problems:

    1) Double free in tipc_enable_bearer(), from Cong Wang.

    2) Many fixes to nf_conncount, from Florian Westphal.

    3) op->get_regs_len() can throw an error, check it, from Yunsheng
    Lin.

    4) Need to use GFP_ATOMIC in *_add_hash_mac_address() of fsl/fman
    driver, from Scott Wood.

    5) Infinite loop in fib_empty_table(), from Yue Haibing.

    6) Use after free in ax25_fillin_cb(), from Cong Wang.

    7) Fix socket locking in nr_find_socket(), also from Cong Wang.

    8) Fix WoL wakeup enable in r8169, from Heiner Kallweit.

    9) On 32-bit sock->sk_stamp is not thread-safe, from Deepa Dinamani.

    10) Fix ptr_ring wrap during queue swap, from Cong Wang.

    11) Missing shutdown callback in hinic driver, from Xue Chaojing.

    12) Need to return NULL on error from ip6_neigh_lookup(), from Stefano
    Brivio.

    13) BPF out of bounds speculation fixes from Daniel Borkmann"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits)
    ipv6: Consider sk_bound_dev_if when binding a socket to an address
    ipv6: Fix dump of specific table with strict checking
    bpf: add various test cases to selftests
    bpf: prevent out of bounds speculation on pointer arithmetic
    bpf: fix check_map_access smin_value test when pointer contains offset
    bpf: restrict unknown scalars of mixed signed bounds for unprivileged
    bpf: restrict stack pointer arithmetic for unprivileged
    bpf: restrict map value pointer arithmetic for unprivileged
    bpf: enable access to ax register also from verifier rewrite
    bpf: move tmp variable into ax register in interpreter
    bpf: move {prev_,}insn_idx into verifier env
    isdn: fix kernel-infoleak in capi_unlocked_ioctl
    ipv6: route: Fix return value of ip6_neigh_lookup() on neigh_create() error
    net/hamradio/6pack: use mod_timer() to rearm timers
    net-next/hinic:add shutdown callback
    net: hns3: call hns3_nic_net_open() while doing HNAE3_UP_CLIENT
    ip: validate header length on virtual device xmit
    tap: call skb_probe_transport_header after setting skb->dev
    ptr_ring: wrap back ->producer in __ptr_ring_swap_queue()
    net: rds: remove unnecessary NULL check
    ...

    Linus Torvalds
     

02 Jan, 2019

1 commit

  • Al Viro mentioned that there is probably a race condition
    lurking in accesses of sk_stamp on 32-bit machines.

    sock->sk_stamp is of type ktime_t which is always an s64.
    On a 32 bit architecture, we might run into situations of
    unsafe access as the access to the field becomes non atomic.

    Use seqlocks for synchronization.
    This allows us to avoid using spinlocks for readers as
    readers do not need mutual exclusion.

    Another approach to solve this is to require sk_lock for all
    modifications of the timestamps. The current approach allows
    for timestamps to have their own lock: sk_stamp_lock.
    This allows for the patch to not compete with already
    existing critical sections, and side effects are limited
    to the paths in the patch.

    The addition of the new field maintains the data locality
    optimizations from
    commit 9115e8cd2a0c ("net: reorganize struct sock for better data
    locality")

    Note that all the instances of the sk_stamp accesses
    are either through the ioctl or the syscall recvmsg.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: David S. Miller

    Deepa Dinamani
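
The reader/writer protocol behind a sequence lock can be sketched in portable C11. This is a simplified userspace model, not the kernel's seqlock_t: it assumes a single writer (the kernel version also serializes writers), uses sequentially consistent atomics for simplicity, and all names are hypothetical. The 64-bit value is deliberately split into two 32-bit halves to mimic a non-atomic 64-bit access on a 32-bit machine.

```c
#include <stdatomic.h>
#include <stdint.h>

struct demo_stamp {
    atomic_uint seq;           /* even: stable, odd: write in progress */
    _Atomic uint32_t lo, hi;   /* 64-bit value stored as two halves */
};

/* Single writer: bump seq to odd, update both halves, bump seq to even. */
static void demo_stamp_write(struct demo_stamp *s, int64_t v)
{
    atomic_fetch_add(&s->seq, 1);
    atomic_store(&s->lo, (uint32_t)v);
    atomic_store(&s->hi, (uint32_t)((uint64_t)v >> 32));
    atomic_fetch_add(&s->seq, 1);
}

/* Readers take no lock; they retry if a write overlapped the read. */
static int64_t demo_stamp_read(struct demo_stamp *s)
{
    unsigned int before, after;
    uint32_t lo, hi;

    do {
        before = atomic_load(&s->seq);
        lo = atomic_load(&s->lo);
        hi = atomic_load(&s->hi);
        after = atomic_load(&s->seq);
    } while ((before & 1u) || before != after);

    return (int64_t)(((uint64_t)hi << 32) | lo);
}
```

A reader that observes an odd or changed sequence number discards what it read and tries again, so it never returns a value stitched together from two different writes.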
     

18 Dec, 2018

1 commit

  • recvmmsg() takes two arguments to pointers of structures that differ
    between 32-bit and 64-bit architectures: mmsghdr and timespec.

    For y2038 compatibility, we are changing the native system call from
    timespec to __kernel_timespec with a 64-bit time_t (in another patch),
    and use the existing compat system call on both 32-bit and 64-bit
    architectures for compatibility with traditional 32-bit user space.

    As we now have two variants of recvmmsg() for 32-bit tasks that are both
    different from the variant that we use on 64-bit tasks, this means we
    also require two compat system calls!

    The solution I picked is to flip things around: The existing
    compat_sys_recvmmsg() call gets moved from net/compat.c into net/socket.c
    and now handles the case for old user space on all architectures that
    have set CONFIG_COMPAT_32BIT_TIME. A new compat_sys_recvmmsg_time64()
    call gets added in the old place for 64-bit architectures only, this
    one handles the case of a compat mmsghdr structure combined with
    __kernel_timespec.

    In the indirect sys_socketcall(), we now need to call either
    do_sys_recvmmsg() or __compat_sys_recvmmsg(), depending on what kind of
    architecture we are on. For compat_sys_socketcall(), no such change is
    needed, we always call __compat_sys_recvmmsg().

    I decided to not add a new SYS_RECVMMSG_TIME64 socketcall: Any libc
    implementation for 64-bit time_t will need significant changes including
    an updated asm/unistd.h, and it seems better to consistently use the
    separate syscalls in that configuration, leaving the socketcall only for
    backward compatibility with 32-bit time_t based libc.

    The naming is asymmetric for the moment, so both existing syscalls
    entry points keep their names, while the new ones are recvmmsg_time32
    and compat_recvmmsg_time64 respectively. I expect that we will rename
    the compat syscalls later as we start using generated syscall tables
    everywhere and add these entry points.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

29 Aug, 2018

1 commit

  • This converts the recvmmsg() system call in all its variations to use
    'timespec64' internally for its timeout, and have a __kernel_timespec
    argument in the native entry point. This lets us change the type to use
    64-bit time_t at a later point while using the 32-bit compat system call
    emulation for existing user space.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

27 Aug, 2018

1 commit

  • Christoph Hellwig suggested a slightly different path for handling
    backwards compatibility with the 32-bit time_t based system calls:

    Rather than simply reusing the compat_sys_* entry points on 32-bit
    architectures unchanged, we get rid of those entry points and the
    compat_time types by renaming them to something that makes more sense
    on 32-bit architectures (which don't have a compat mode otherwise),
    and then share the entry points under the new name with the 64-bit
    architectures that use them for implementing the compatibility.

    The following types and interfaces are renamed here, and moved
    from linux/compat_time.h to linux/time32.h:

    old new
    --- ---
    compat_time_t old_time32_t
    struct compat_timeval struct old_timeval32
    struct compat_timespec struct old_timespec32
    struct compat_itimerspec struct old_itimerspec32
    ns_to_compat_timeval() ns_to_old_timeval32()
    get_compat_itimerspec64() get_old_itimerspec32()
    put_compat_itimerspec64() put_old_itimerspec32()
    compat_get_timespec64() get_old_timespec32()
    compat_put_timespec64() put_old_timespec32()

    As we already have aliases in place, this patch addresses only the
    instances that are relevant to the system call interface in particular,
    not those that occur in device drivers and other modules. Those
    will get handled separately, while providing the 64-bit version
    of the respective interfaces.

    I'm not renaming the timex, rusage and itimerval structures, as we are
    still debating what the new interface will look like, and whether we
    will need a replacement at all.

    This also doesn't change the names of the syscall entry points, which can
    be done more easily when we actually switch over the 32-bit architectures
    to use them, at that point we need to change COMPAT_SYSCALL_DEFINEx to
    SYSCALL_DEFINEx with a new name, e.g. with a _time32 suffix.

    Suggested-by: Christoph Hellwig
    Link: https://lore.kernel.org/lkml/20180705222110.GA5698@infradead.org/
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

07 Aug, 2018

1 commit


28 Apr, 2018

1 commit

  • For the x32 ABI, struct timeval has two 64-bit fields. However
    the kernel currently interprets the user-space values used for
    the SO_RCVTIMEO and SO_SNDTIMEO socket options as having a pair
    of 32-bit fields.

    When the seconds portion of the requested timeout is less than 2**32,
    the seconds portion of the effective timeout is correct but the
    microseconds portion is zero. When the seconds portion of the
    requested timeout is zero and the microseconds portion is non-zero,
    the kernel interprets the timeout as zero (never timeout).

    Fix by using 64-bit time for SO_RCVTIMEO/SO_SNDTIMEO as required
    for the ABI.

    The code included below demonstrates the problem.

    Results before patch:
    $ gcc -m64 -Wall -O2 -o socktmo socktmo.c && ./socktmo
    recv time: 2.008181 seconds
    send time: 2.015985 seconds

    $ gcc -m32 -Wall -O2 -o socktmo socktmo.c && ./socktmo
    recv time: 2.016763 seconds
    send time: 2.016062 seconds

    $ gcc -mx32 -Wall -O2 -o socktmo socktmo.c && ./socktmo
    recv time: 1.007239 seconds
    send time: 1.023890 seconds

    Results after patch:
    $ gcc -m64 -O2 -Wall -o socktmo socktmo.c && ./socktmo
    recv time: 2.010062 seconds
    send time: 2.015836 seconds

    $ gcc -m32 -O2 -Wall -o socktmo socktmo.c && ./socktmo
    recv time: 2.013974 seconds
    send time: 2.015981 seconds

    $ gcc -mx32 -O2 -Wall -o socktmo socktmo.c && ./socktmo
    recv time: 2.030257 seconds
    send time: 2.013383 seconds

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <sys/types.h>

    /* Exit with a diagnostic if a call returned a negative result. */
    void checkrc(char *str, int rc)
    {
        if (rc >= 0)
            return;

        perror(str);
        exit(1);
    }

    static char buf[1024];
    int main(int argc, char **argv)
    {
        int rc;
        int socks[2];
        struct timeval tv;
        struct timeval start, end, delta;

        rc = socketpair(AF_UNIX, SOCK_STREAM, 0, socks);
        checkrc("socketpair", rc);

        /* set timeout to 1.999999 seconds */
        tv.tv_sec = 1;
        tv.tv_usec = 999999;
        rc = setsockopt(socks[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
        checkrc("setsockopt(SO_RCVTIMEO)", rc);
        rc = setsockopt(socks[0], SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof tv);
        checkrc("setsockopt(SO_SNDTIMEO)", rc);

        /* measure actual receive timeout */
        gettimeofday(&start, NULL);
        rc = recv(socks[0], buf, sizeof buf, 0);
        gettimeofday(&end, NULL);
        timersub(&end, &start, &delta);

        printf("recv time: %ld.%06ld seconds\n",
               (long)delta.tv_sec, (long)delta.tv_usec);

        /* fill send buffer */
        do {
            rc = send(socks[0], buf, sizeof buf, 0);
        } while (rc > 0);

        /* measure actual send timeout */
        gettimeofday(&start, NULL);
        rc = send(socks[0], buf, sizeof buf, 0);
        gettimeofday(&end, NULL);
        timersub(&end, &start, &delta);

        printf("send time: %ld.%06ld seconds\n",
               (long)delta.tv_sec, (long)delta.tv_usec);
        exit(0);
    }

    Fixes: 515c7af85ed9 ("x32: Use compat shims for {g,s}etsockopt")
    Reported-by: Gopal RajagopalSai
    Signed-off-by: Lance Richardson
    Signed-off-by: David S. Miller

    Lance Richardson
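
The "1.0 second instead of 2.0" result in the x32 run above follows directly from reading a pair of 64-bit fields as a pair of 32-bit fields. The sketch below reproduces that misreading in userspace; it assumes a little-endian machine, and the struct and function names are hypothetical stand-ins rather than kernel types.

```c
#include <stdint.h>
#include <string.h>

/* Layout the x32 caller actually passes: two 64-bit fields. */
struct demo_timeval64 {
    int64_t tv_sec;
    int64_t tv_usec;
};

/* Layout the pre-fix compat path assumed: two 32-bit fields. */
struct demo_timeval32 {
    int32_t tv_sec;
    int32_t tv_usec;
};

/* Reinterpret the first 8 bytes of a 64-bit timeval as a 32-bit one. */
static struct demo_timeval32 demo_misread(const struct demo_timeval64 *tv)
{
    struct demo_timeval32 out;

    memcpy(&out, tv, sizeof(out));
    return out;
}
```

On little-endian hardware the low half of tv_sec lands in the 32-bit tv_sec and the (zero) high half lands in tv_usec, so 1.999999 seconds is seen as 1.000000 and a sub-second timeout is seen as zero.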