26 Oct, 2020

1 commit


12 Oct, 2020

1 commit


06 Oct, 2020

1 commit


03 Oct, 2020

1 commit

  • If a page sent into kernel_sendpage() is a slab page or has no
    ref_count, it is improper to send with the zero-copy sendpage()
    method. Otherwise such a page might be unexpectedly released in the
    network code path, causing an unpredictable panic due to corruption of
    kernel memory management data structures.

    This patch adds a WARN_ON() on the page before it is passed to the
    concrete zero-copy sendpage() method. If the page is improper for the
    zero-copy sendpage() method, a warning message can be observed before
    the consequent unpredictable kernel panic.

    This patch does not change the existing kernel_sendpage() behavior for
    an improper page zero-copy send; it only provides a warning message as a
    hint for a potential following panic due to kernel memory heap corruption.

    Signed-off-by: Coly Li
    Cc: Cong Wang
    Cc: Christoph Hellwig
    Cc: David S. Miller
    Cc: Sridhar Samudrala
    Signed-off-by: David S. Miller

    Coly Li
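
The check described in this commit can be sketched in userspace as below. This is a minimal model, not the kernel code: `struct model_page`, `page_ok_for_zero_copy()` and `warn_if_improper()` are hypothetical names introduced for illustration, and the warning text is made up. It only shows the "warn, but do not change behavior" idea from the message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct page; only the two
 * properties named in the commit message are modeled. */
struct model_page {
	bool is_slab;   /* page belongs to the slab allocator */
	int  ref_count; /* number of references held on the page */
};

/* A page is proper for zero-copy sending only if it is not a slab page
 * and holds at least one reference. */
static bool page_ok_for_zero_copy(const struct model_page *page)
{
	return !page->is_slab && page->ref_count >= 1;
}

/* Emit a warning for an improper page. Returns true if a warning was
 * printed. The caller still proceeds with the send, mirroring the
 * statement that existing behavior is unchanged. */
static bool warn_if_improper(const struct model_page *page)
{
	if (!page_ok_for_zero_copy(page)) {
		fprintf(stderr, "warning: improper page for zero-copy send\n");
		return true;
	}
	return false;
}
```

A caller would invoke `warn_if_improper()` just before handing the page to the zero-copy path, and continue regardless of the result.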
     

07 Sep, 2020

1 commit


05 Sep, 2020

1 commit

  • We got slightly different patches removing a double word
    in a comment in net/ipv4/raw.c - picked the version from net.

    Simple conflict in drivers/net/ethernet/ibm/ibmvnic.c. Use cached
    values instead of VNIC login response buffer (following what
    commit 507ebe6444a4 ("ibmvnic: Fix use-after-free of VNIC login
    response buffer") did).

    Signed-off-by: Jakub Kicinski

    Jakub Kicinski
     

27 Aug, 2020

1 commit


25 Aug, 2020

1 commit

  • For TCP tx zero-copy, the kernel notifies the process of completions by
    queuing completion notifications on the socket error queue. This patch
    allows reading these notifications via recvmsg to support TCP tx
    zero-copy.

    Ancillary data was originally disallowed due to privilege escalation
    via io_uring's offloading of sendmsg() onto a kernel thread with kernel
    credentials (https://crbug.com/project-zero/1975). So, we must ensure
    that the socket type is one where the ancillary data types that are
    delivered on recvmsg are plain data (no file descriptors or values that
    are translated based on the identity of the calling process).

    This was tested by using io_uring to call recvmsg on the MSG_ERRQUEUE
    with tx zero-copy enabled. Before this patch, we received -EINVAL from
    this specific code path. After this patch, we could read TCP tx
    zero-copy completion notifications from the MSG_ERRQUEUE.

    Signed-off-by: Soheil Hassas Yeganeh
    Signed-off-by: Arjun Roy
    Acked-by: Eric Dumazet
    Reviewed-by: Jann Horn
    Reviewed-by: Jens Axboe
    Signed-off-by: Luke Hsiao
    Signed-off-by: David S. Miller

    Luke Hsiao
     

14 Aug, 2020

1 commit


11 Aug, 2020

2 commits

  • This reverts commits 6d04fe15f78acdf8e32329e208552e226f7a8ae6 and
    a31edb2059ed4e498f9aa8230c734b59d0ad797a.

    It turns out the idea to share a single pointer for both kernel and user
    space address causes various kinds of problems. So use the slightly less
    optimal version that uses an extra bit, but which is guaranteed to be safe
    everywhere.

    Fixes: 6d04fe15f78a ("net: optimize the sockptr_t for unified kernel/user address spaces")
    Reported-by: Eric Dumazet
    Reported-by: John Stultz
    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller
    (cherry picked from commit 519a8a6cf91dda095be2d36216fc4ebc525270a1
    https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git master)
    Signed-off-by: John Stultz
    Change-Id: I645a8226be732b72cf8a404957754e3408dfc4bc

    Christoph Hellwig
     
  • This reverts commits 6d04fe15f78acdf8e32329e208552e226f7a8ae6 and
    a31edb2059ed4e498f9aa8230c734b59d0ad797a.

    It turns out the idea to share a single pointer for both kernel and user
    space address causes various kinds of problems. So use the slightly less
    optimal version that uses an extra bit, but which is guaranteed to be safe
    everywhere.

    Fixes: 6d04fe15f78a ("net: optimize the sockptr_t for unified kernel/user address spaces")
    Reported-by: Eric Dumazet
    Reported-by: John Stultz
    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
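
The "extra bit" layout mentioned above can be modeled in userspace roughly as follows. This is a sketch under assumptions: the `model_` names are hypothetical, `model_copy_from_user()` is a stand-in that only rejects NULL, and the real kernel type and helpers may differ in detail.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One pointer slot shared by both address spaces, plus an explicit tag
 * bit saying which interpretation is valid. */
typedef struct {
	union {
		void *kernel;
		void *user;
	};
	bool is_kernel : 1;
} model_sockptr_t;

static model_sockptr_t model_kernel_sockptr(void *p)
{
	return (model_sockptr_t){ .kernel = p, .is_kernel = true };
}

static model_sockptr_t model_user_sockptr(void *p)
{
	return (model_sockptr_t){ .user = p, .is_kernel = false };
}

/* Stand-in for a checked user copy; here it only rejects NULL. */
static int model_copy_from_user(void *dst, const void *src, size_t n)
{
	if (src == NULL)
		return -EFAULT;
	memcpy(dst, src, n);
	return 0;
}

/* Dispatch on the tag bit instead of guessing from the address value. */
static int model_copy_from_sockptr(void *dst, model_sockptr_t src, size_t n)
{
	if (!src.is_kernel)
		return model_copy_from_user(dst, src.user, n);
	memcpy(dst, src.kernel, n);
	return 0;
}
```

Because the tag travels with the pointer, the copy helper never has to infer the address space from the pointer value, which is the property the revert calls safe everywhere.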
     

09 Aug, 2020

4 commits


07 Aug, 2020

1 commit


29 Jul, 2020

1 commit

  • Make sure not just the pointer itself but the whole range lies in
    the user address space. For that pass the length and then use
    the access_ok helper to do the check.

    Fixes: 6d04fe15f78a ("net: optimize the sockptr_t for unified kernel/user address spaces")
    Reported-by: David Laight
    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
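
To illustrate why the length matters, here is a small userspace model of a whole-range check. It is not the kernel's access_ok implementation; `model_range_in_user_space()` and the `user_limit` parameter are hypothetical and exist only to show the arithmetic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true only if the whole range [addr, addr + len) lies at or
 * below user_limit. The comparison is arranged so that addr + len is
 * never computed, which avoids wrap-around on overflow.
 */
static bool model_range_in_user_space(uintptr_t addr, size_t len,
				      uintptr_t user_limit)
{
	return len <= user_limit && addr <= user_limit - len;
}
```

A check on `addr` alone would accept a pointer just below the limit even when `len` pushes the end of the buffer past it.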
     

25 Jul, 2020

3 commits

  • For architectures like x86 and arm64 we don't need the separate bit to
    indicate that a pointer is a kernel pointer as the address spaces are
    unified. That way the sockptr_t can be reduced to a union of two
    pointers, which leads to nicer calling conventions.

    The only caveat is that we need to check that users don't pass in kernel
    address and thus gain access to kernel memory. Thus the USER_SOCKPTR
    helper is replaced with a init_user_sockptr function that does this check
    and returns an error if it fails.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
     
  • Rework the remaining setsockopt code to pass a sockptr_t instead of a
    plain user pointer. This removes the last remaining set_fs(KERNEL_DS)
    outside of architecture specific code.

    Signed-off-by: Christoph Hellwig
    Acked-by: Stefan Schmidt [ieee802154]
    Acked-by: Matthieu Baerts
    Signed-off-by: David S. Miller

    Christoph Hellwig
     
  • Pass a sockptr_t to prepare for set_fs-less handling of the kernel
    pointer from bpf-cgroup.

    Signed-off-by: Christoph Hellwig
    Acked-by: Matthieu Baerts
    Signed-off-by: David S. Miller

    Christoph Hellwig
     

20 Jul, 2020

4 commits


14 Jul, 2020

1 commit


05 Jul, 2020

1 commit

  • setsockopt(mptcp_fd, SOL_SOCKET, ...)... appears to work (returns 0),
    but it has no effect -- this is because the MPTCP layer never has a
    chance to copy the settings to the subflow socket.

    Skip the generic handling for the mptcp case and call the
    mptcp specific handler for SOL_SOCKET too.

    Next patch adds more specific handling for SOL_SOCKET to mptcp.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
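
The dispatch change can be pictured with the following userspace model. All names here (`model_sock`, `custom_sol_socket`, `pick_setsockopt_handler()`, the `MODEL_` constants) are hypothetical; the sketch only shows the decision described in the message, not the kernel's actual code.

```c
#include <stdbool.h>

enum { MODEL_SOL_SOCKET = 1, MODEL_SOL_TCP = 6 };

enum model_handler { HANDLER_GENERIC, HANDLER_PROTOCOL };

/* Hypothetical per-socket flag: set for sockets (such as MPTCP in this
 * model) that want to see SOL_SOCKET options themselves. */
struct model_sock {
	bool custom_sol_socket;
};

/* SOL_SOCKET normally goes to the generic handler; a socket that opted
 * in gets its protocol-specific handler for SOL_SOCKET as well. */
static enum model_handler
pick_setsockopt_handler(const struct model_sock *sk, int level)
{
	if (level == MODEL_SOL_SOCKET && !sk->custom_sol_socket)
		return HANDLER_GENERIC;
	return HANDLER_PROTOCOL;
}
```

With this shape, the protocol handler gets the chance to propagate the setting to its subflow sockets instead of the option being silently absorbed by the generic code.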
     

22 Jun, 2020

1 commit


30 May, 2020

1 commit


28 May, 2020

1 commit


19 May, 2020

2 commits


12 May, 2020

1 commit

  • The msg_control field in struct msghdr can either contain a user
    pointer when used with the recvmsg system call, or a kernel pointer
    when used with sendmsg. To complicate things further kernel_recvmsg
    can stuff a kernel pointer in and then use set_fs to make the uaccess
    helpers accept it.

    Replace it with a union of a kernel pointer msg_control field, and
    a user pointer msg_control_user one, and allow kernel_recvmsg to operate
    on a proper kernel pointer using a bitfield to override the normal
    choice of a user pointer for recvmsg.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
     

31 Mar, 2020

2 commits

  • …/scm/linux/kernel/git/tip/tip into android-mainline

    In a quest to make the huge -rc1 merge easier to handle and bisect,
    merge the first chunk of 5.7-rc1 patches into android-mainline.

    Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
    Change-Id: Ib54436e9515660a4c0c25c49c21bfb399eb57921

    Greg Kroah-Hartman
     
  • Pull io_uring updates from Jens Axboe:
    "Here are the io_uring changes for this merge window. Light on new
    features this time around (just splice + buffer selection), lots of
    cleanups, fixes, and improvements to existing support. In particular,
    this contains:

    - Cleanup fixed file update handling for stack fallback (Hillf)

    - Re-work of how pollable async IO is handled, we no longer require
    thread offload to handle that. Instead we rely on using poll to drive
    this, with task_work execution.

    - In conjunction with the above, allow expendable buffer selection,
    so that poll+recv (for example) no longer has to be a split
    operation.

    - Make sure we honor RLIMIT_FSIZE for buffered writes

    - Add support for splice (Pavel)

    - Linked work inheritance fixes and optimizations (Pavel)

    - Async work fixes and cleanups (Pavel)

    - Improve io-wq locking (Pavel)

    - Hashed link write improvements (Pavel)

    - SETUP_IOPOLL|SETUP_SQPOLL improvements (Xiaoguang)"

    * tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-block: (54 commits)
    io_uring: cleanup io_alloc_async_ctx()
    io_uring: fix missing 'return' in comment
    io-wq: handle hashed writes in chains
    io-uring: drop 'free_pfile' in struct io_file_put
    io-uring: drop completion when removing file
    io_uring: Fix ->data corruption on re-enqueue
    io-wq: close cancel gap for hashed linked work
    io_uring: make spdxcheck.py happy
    io_uring: honor original task RLIMIT_FSIZE
    io-wq: hash dependent work
    io-wq: split hashing and enqueueing
    io-wq: don't resched if there is no work
    io-wq: remove duplicated cancel code
    io_uring: fix truncated async read/readv and write/writev retry
    io_uring: dual license io_uring.h uapi header
    io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled
    io_uring: Fix unused function warnings
    io_uring: add end-of-bits marker and build time verify it
    io_uring: provide means of removing buffers
    io_uring: add IOSQE_BUFFER_SELECT support for IORING_OP_RECVMSG
    ...

    Linus Torvalds
     

23 Mar, 2020

1 commit


20 Mar, 2020

1 commit

  • Just like commit 4022e7af86be, this fixes the fact that
    IORING_OP_ACCEPT ends up using get_unused_fd_flags(), which checks
    current->signal->rlim[] for limits.

    Add an extra argument to __sys_accept4_file() that allows us to pass
    in the proper nofile limit, and grab it at request prep time.

    Acked-by: David S. Miller
    Signed-off-by: Jens Axboe

    Jens Axboe
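
The idea of capturing the limit at request prep time and passing it explicitly can be sketched in userspace like this. The `model_` names are hypothetical and the fd allocation is reduced to a single comparison; the point is only that the later step uses the stored limit rather than looking up the limit of whichever task happens to run it.

```c
#include <errno.h>
#include <sys/resource.h>

/* Hypothetical request object that remembers the submitter's limit. */
struct model_accept_req {
	unsigned long nofile;
};

/* Prep step, run in the submitting task: record its RLIMIT_NOFILE. */
static int model_accept_prep(struct model_accept_req *req)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;
	req->nofile = (unsigned long)rl.rlim_cur;
	return 0;
}

/* Allocation step, possibly run elsewhere: takes the limit as an
 * argument. Returns the fd, or -EMFILE if it would exceed the limit. */
static int model_alloc_fd(int lowest_free_fd, unsigned long nofile)
{
	if ((unsigned long)lowest_free_fd >= nofile)
		return -EMFILE;
	return lowest_free_fd;
}
```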
     

10 Mar, 2020

1 commit


31 Jan, 2020

1 commit


09 Jan, 2020

1 commit

  • When procfs is disabled, the fdinfo code causes a harmless
    warning:

    net/socket.c:1000:13: error: 'sock_show_fdinfo' defined but not used [-Werror=unused-function]
    static void sock_show_fdinfo(struct seq_file *m, struct file *f)

    Move the function definition up so we can use a single #ifdef
    around it.

    Fixes: b4653342b151 ("net: Allow to show socket-specific information in /proc/[pid]/fdinfo/[fd]")
    Suggested-by: Al Viro
    Acked-by: Kirill Tkhai
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
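
The single-#ifdef arrangement can be sketched as below. This is a userspace model with hypothetical names (`MODEL_CONFIG_PROC_FS`, `model_show_fdinfo`, `struct model_file_ops`); it shows one common way to keep a static function and its only user under the same condition so the compiler never sees a defined-but-unused function.

```c
#include <stddef.h>
#include <stdio.h>

struct model_file_ops {
	void (*show_fdinfo)(FILE *out, int fd);
};

/* The function is defined before its user and wrapped in one #ifdef.
 * When the option is off, the name expands to NULL instead, so no
 * unused static function remains. */
#ifdef MODEL_CONFIG_PROC_FS
static void model_show_fdinfo(FILE *out, int fd)
{
	fprintf(out, "fd:\t%d\n", fd);
}
#else
#define model_show_fdinfo NULL
#endif

static const struct model_file_ops model_socket_ops = {
	.show_fdinfo = model_show_fdinfo,
};
```

Building with `-DMODEL_CONFIG_PROC_FS` fills in the callback; building without it leaves the slot NULL and compiles cleanly under `-Werror=unused-function`.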
     

23 Dec, 2019

1 commit