18 Jan, 2020

1 commit

  • commit 9d7bf41fafa5b5ddd4c13eb39446b0045f0a8167 upstream.

    Unlike the normal SIOCOUTQ, SIOCOUTQNSD was never handled in compat
    mode. Add it to the common socket compat handler along with similar
    ones.

    Fixes: 2f4e1b397097 ("tcp: ioctl type SIOCOUTQNSD returns amount of data not sent")
    Cc: Eric Dumazet
    Cc: netdev@vger.kernel.org
    Cc: "David S. Miller"
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
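
    For context, a minimal userspace sketch of the two ioctls this entry mentions. It assumes a Linux TCP socket; the helper name query_send_queue() is hypothetical and not part of the patch.

```c
#include <linux/sockios.h>   /* SIOCOUTQ, SIOCOUTQNSD */
#include <sys/ioctl.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: query a TCP socket's send queue.
 *   *outq    - result of SIOCOUTQ (data queued for sending)
 *   *notsent - result of SIOCOUTQNSD (queued data not sent yet)
 * Returns 0 on success, -1 with errno set if either ioctl fails.
 */
int query_send_queue(int fd, int *outq, int *notsent)
{
	if (ioctl(fd, SIOCOUTQ, outq) < 0)
		return -1;
	if (ioctl(fd, SIOCOUTQNSD, notsent) < 0)
		return -1;
	return 0;
}
```

    A 32-bit program on a 64-bit kernel reaches these ioctls through the compat path, which is the path the commit above describes as missing SIOCOUTQNSD.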
     

09 Jan, 2020

1 commit

  • [ Upstream commit ebfcd8955c0b52eb793bcbc9e71140e3d0cdb228 ]

    The socket read/write helpers only look at the file O_NONBLOCK, not
    the iocb IOCB_NOWAIT flag. This breaks users like preadv2/pwritev2
    and io_uring that rely on not having the file itself marked nonblocking,
    but rather the iocb itself.

    Cc: netdev@vger.kernel.org
    Acked-by: David Miller
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Jens Axboe
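
    The decision described above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel code: SKETCH_IOCB_NOWAIT is a hypothetical stand-in for the kernel-internal IOCB_NOWAIT flag, and sketch_msg_flags() is an invented name.

```c
#include <fcntl.h>        /* O_NONBLOCK */
#include <sys/socket.h>   /* MSG_DONTWAIT */

/* Hypothetical stand-in for the kernel-internal IOCB_NOWAIT bit. */
#define SKETCH_IOCB_NOWAIT 0x1

/*
 * Pick the message flags for one read or write request.
 * The request is non-blocking if EITHER the file is marked O_NONBLOCK
 * OR this particular request asked not to wait.
 */
int sketch_msg_flags(int file_flags, int iocb_flags)
{
	if ((file_flags & O_NONBLOCK) || (iocb_flags & SKETCH_IOCB_NOWAIT))
		return MSG_DONTWAIT;
	return 0;
}
```

    Checking only the first condition is the behavior the commit message reports as broken for callers that set the per-request flag.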
     

05 Dec, 2019

2 commits

  • [ Upstream commit d69e07793f891524c6bbf1e75b9ae69db4450953 ]

    Only io_uring uses (and added) these, and we want to disallow the
    use of sendmsg/recvmsg for anything but regular data transfers.
    Use the newly added prep helper to split the msghdr copy out from
    the core function, to check for msg_control and msg_controllen
    settings. If either is set, we return -EINVAL.

    Acked-by: David S. Miller
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Jens Axboe
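
    A minimal sketch of the validation rule stated above, written against the userspace struct msghdr. The function name check_plain_msghdr() is hypothetical; the kernel performs the equivalent check on its own copy of the header.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: accept only headers that describe a plain data
 * transfer. If either control field is set, return -EINVAL.
 */
int check_plain_msghdr(const struct msghdr *msg)
{
	if (msg->msg_control != NULL || msg->msg_controllen != 0)
		return -EINVAL;
	return 0;
}
```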
     
  • [ Upstream commit 4257c8ca13b084550574b8c9a667d9c90ff746eb ]

    This is in preparation for enabling the io_uring helpers for sendmsg
    and recvmsg to first copy the header for validation before continuing
    with the operation.

    There should be no functional changes in this patch.

    Acked-by: David S. Miller
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Jens Axboe
     

20 Jul, 2019

1 commit

  • Pull vfs mount updates from Al Viro:
    "The first part of mount updates.

    Convert filesystems to use the new mount API"

    * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    mnt_init(): call shmem_init() unconditionally
    constify ksys_mount() string arguments
    don't bother with registering rootfs
    init_rootfs(): don't bother with init_ramfs_fs()
    vfs: Convert smackfs to use the new mount API
    vfs: Convert selinuxfs to use the new mount API
    vfs: Convert securityfs to use the new mount API
    vfs: Convert apparmorfs to use the new mount API
    vfs: Convert openpromfs to use the new mount API
    vfs: Convert xenfs to use the new mount API
    vfs: Convert gadgetfs to use the new mount API
    vfs: Convert oprofilefs to use the new mount API
    vfs: Convert ibmasmfs to use the new mount API
    vfs: Convert qib_fs/ipathfs to use the new mount API
    vfs: Convert efivarfs to use the new mount API
    vfs: Convert configfs to use the new mount API
    vfs: Convert binfmt_misc to use the new mount API
    convenience helper: get_tree_single()
    convenience helper get_tree_nodev()
    vfs: Kill sget_userns()
    ...

    Linus Torvalds
     

14 Jul, 2019

1 commit

  • Pull io_uring updates from Jens Axboe:
    "This contains:

    - Support for recvmsg/sendmsg as first class opcodes.

    I don't envision going much further down this path, as there are
    plans in progress to support potentially any system call in an
    async fashion through io_uring. But I think it does make sense to
    have certain core ops available directly, especially those that can
    support a "try this non-blocking" flag/mode. (me)

    - Handle generic short reads automatically.

    This can happen fairly easily if parts of the buffered read are
    cached. Since the application needs to issue another request for
    the remainder, just do this internally and save kernel/user
    roundtrip while providing a nicer more robust API. (me)

    - Support for linked SQEs.

    This allows SQEs to depend on each other, enabling an application
    to eg queue a read-from-this-file,write-to-that-file pair. (me)

    - Fix race in stopping SQ thread (Jackie)"

    * tag 'for-5.3/io_uring-20190711' of git://git.kernel.dk/linux-block:
    io_uring: fix io_sq_thread_stop running in front of io_sq_thread
    io_uring: add support for recvmsg()
    io_uring: add support for sendmsg()
    io_uring: add support for sqe links
    io_uring: punt short reads to async context
    uio: make import_iovec()/compat_import_iovec() return bytes on success

    Linus Torvalds
     

10 Jul, 2019

2 commits

  • This is done through IORING_OP_RECVMSG. This opcode uses the same
    sqe->msg_flags that IORING_OP_SENDMSG added, and we pass in the
    msghdr struct in the sqe->addr field as well.

    We use MSG_DONTWAIT to force an inline fast path if recvmsg() doesn't
    block, and punt to async execution if it would have.

    Acked-by: David S. Miller
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This is done through IORING_OP_SENDMSG. There's a new sqe->msg_flags
    for the flags argument, and the msghdr struct is passed in the
    sqe->addr field.

    We use MSG_DONTWAIT to force an inline fast path if sendmsg() doesn't
    block, and punt to async execution if it would have.

    Acked-by: David S. Miller
    Signed-off-by: Jens Axboe

    Jens Axboe
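
    The "try inline, punt if it would block" pattern from the two entries above can be sketched in userspace like this. It is an analogy only: recvmsg_try_inline() is a hypothetical name, and where io_uring hands the request to async context, this sketch just reports that the caller should do so.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>      /* struct iovec, for callers building a msghdr */

/*
 * Hypothetical helper: attempt recvmsg() without blocking.
 * On the fast path the result is returned directly and *punted is false.
 * If the call would have blocked, *punted is set to true and -1 is
 * returned; the caller is expected to retry from a context that may block.
 */
ssize_t recvmsg_try_inline(int fd, struct msghdr *msg, int flags, bool *punted)
{
	ssize_t n = recvmsg(fd, msg, flags | MSG_DONTWAIT);

	*punted = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));
	return n;
}
```

    The same shape applies to sendmsg(); only the system call changes.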
     

09 Jul, 2019

2 commits

  • socket->wq is assign-once, set when we are initializing both
    struct socket it's in and struct socket_wq it points to. As a
    matter of fact, the only reason for separate allocation was the
    ability to RCU-delay freeing of socket_wq. RCU-delaying the
    freeing of socket itself gets rid of that need, so we can just
    fold struct socket_wq into the end of struct socket and simplify
    the life both for sock_alloc_inode() (one allocation instead of
    two) and for tun/tap oddballs, where we used to embed struct socket
    and struct socket_wq into the same structure (now - embedding just
    the struct socket).

    Note that reference to struct socket_wq in struct sock does remain
    a reference - that's unchanged.

    Signed-off-by: Al Viro
    Signed-off-by: David S. Miller

    Al Viro
     
  • we do have an RCU-delayed part there already (freeing the wq),
    so it's not like the pipe situation; moreover, it might be
    worth considering coallocating wq with the rest of struct sock_alloc.
    ->sk_wq in struct sock would remain a pointer as it is, but
    the object it normally points to would be coallocated with
    struct socket...

    Signed-off-by: Al Viro
    Signed-off-by: David S. Miller

    Al Viro
     

05 Jul, 2019

1 commit

  • Daniel Borkmann says:

    ====================
    pull-request: bpf-next 2019-07-03

    The following pull-request contains BPF updates for your *net-next* tree.

    There is a minor merge conflict in mlx5 due to 8960b38932be ("linux/dim:
    Rename externally used net_dim members") which has been pulled into your
    tree in the meantime, but resolution seems not that bad ... getting current
    bpf-next out now before there's coming more on mlx5. ;) I'm Cc'ing Saeed
    just so he's aware of the resolution below:

    ** First conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:

    <<<<<<< HEAD
    static int mlx5e_open_cq(struct mlx5e_channel *c,
    struct dim_cq_moder moder,
    struct mlx5e_cq_param *param,
    struct mlx5e_cq *cq)
    =======
    int mlx5e_open_cq(struct mlx5e_channel *c, struct net_dim_cq_moder moder,
    struct mlx5e_cq_param *param, struct mlx5e_cq *cq)
    >>>>>>> e5a3e259ef239f443951d401db10db7d426c9497

    Resolution is to take the second chunk and rename net_dim_cq_moder into
    dim_cq_moder. Also the signature for mlx5e_open_cq() in ...

    drivers/net/ethernet/mellanox/mlx5/core/en.h +977

    ... and in mlx5e_open_xsk() ...

    drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.c +64

    ... needs the same rename from net_dim_cq_moder into dim_cq_moder.

    ** Second conflict in drivers/net/ethernet/mellanox/mlx5/core/en_main.c:

    <<<<<<< HEAD
    int cpu = cpumask_first(mlx5_comp_irq_get_affinity_mask(priv->mdev, ix));
    struct dim_cq_moder icocq_moder = {0, 0};
    struct net_device *netdev = priv->netdev;
    struct mlx5e_channel *c;
    unsigned int irq;
    =======
    struct net_dim_cq_moder icocq_moder = {0, 0};
    >>>>>>> e5a3e259ef239f443951d401db10db7d426c9497

    Take the second chunk and rename net_dim_cq_moder into dim_cq_moder
    as well.

    Let me know if you run into any issues. Anyway, the main changes are:

    1) Long-awaited AF_XDP support for mlx5e driver, from Maxim.

    2) Addition of two new per-cgroup BPF hooks for getsockopt and
    setsockopt along with a new sockopt program type which allows more
    fine-grained pass/reject settings for containers. Also add a sock_ops
    callback that can be selectively enabled on a per-socket basis and is
    executed for every RTT to help tracking TCP statistics, both features
    from Stanislav.

    3) Follow-up fix from loops in precision tracking which was not propagating
    precision marks and as a result verifier assumed that some branches were
    not taken and therefore wrongly removed as dead code, from Alexei.

    4) Fix BPF cgroup release synchronization race which could lead to a
    double-free if a leaf's cgroup_bpf object is released and a new BPF
    program is attached to one of the ancestor cgroups in parallel, from Roman.

    5) Support for bulking XDP_TX on veth devices which improves performance
    in some cases by around 9%, from Toshiaki.

    6) Allow for lookups into BPF devmap and improve feedback when calling into
    bpf_redirect_map() as lookup is now performed right away in the helper
    itself, from Toke.

    7) Add support for fq's Earliest Departure Time to the Host Bandwidth
    Manager (HBM) sample BPF program, from Lawrence.

    8) Various cleanups and minor fixes all over the place from many others.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

04 Jul, 2019

1 commit

  • After the previous patch we have ipv{6,4} variants for {recv,send}msg,
    we should use the generic _INET ICW variant to call into the proper
    built-in.

    This also allows dropping the now unused and rather ugly _INET4 ICW macro

    v1 -> v2:
    - use ICW macro to declare inet6_{recv,send}msg
    - fix a couple of checkpatch offenders in the code context

    Signed-off-by: Paolo Abeni
    Signed-off-by: David S. Miller

    Paolo Abeni
     

28 Jun, 2019

1 commit

  • Implement new BPF_PROG_TYPE_CGROUP_SOCKOPT program type and
    BPF_CGROUP_{G,S}ETSOCKOPT cgroup hooks.

    BPF_CGROUP_SETSOCKOPT can modify user setsockopt arguments before
    passing them down to the kernel or bypass kernel completely.
    BPF_CGROUP_GETSOCKOPT can inspect/modify getsockopt arguments that
    kernel returns.
    Both hooks reuse existing PTR_TO_PACKET{,_END} infrastructure.

    The buffer memory is pre-allocated (because I don't think there is
    a precedent for working with __user memory from bpf). This might be
    slow to do for each {s,g}etsockopt call, that's why I've added
    __cgroup_bpf_prog_array_is_empty that exits early if there is nothing
    attached to a cgroup. Note, however, that there is a race between
    __cgroup_bpf_prog_array_is_empty and BPF_PROG_RUN_ARRAY where cgroup
    program layout might have changed; this should not be a problem
    because in general there is a race between multiple calls to
    {s,g}etsockopt and user adding/removing bpf progs from a cgroup.

    The return code of the BPF program is handled as follows:
    * 0: EPERM
    * 1: success, continue with next BPF program in the cgroup chain

    v9:
    * allow overwriting setsockopt arguments (Alexei Starovoitov):
    * use set_fs (same as kernel_setsockopt)
    * buffer is always kzalloc'd (no small on-stack buffer)

    v8:
    * use s32 for optlen (Andrii Nakryiko)

    v7:
    * return only 0 or 1 (Alexei Starovoitov)
    * always run all progs (Alexei Starovoitov)
    * use optval=0 as kernel bypass in setsockopt (Alexei Starovoitov)
    (decided to use optval=-1 instead, optval=0 might be a valid input)
    * call getsockopt hook after kernel handlers (Alexei Starovoitov)

    v6:
    * rework cgroup chaining; stop as soon as bpf program returns
    0 or 2; see patch with the documentation for the details
    * drop Andrii's and Martin's Acked-by (not sure they are comfortable
    with the new state of things)

    v5:
    * skip copy_to_user() and put_user() when ret == 0 (Martin Lau)

    v4:
    * don't export bpf_sk_fullsock helper (Martin Lau)
    * size != sizeof(__u64) for uapi pointers (Martin Lau)
    * offsetof instead of bpf_ctx_range when checking ctx access (Martin Lau)

    v3:
    * typos in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY comments (Andrii Nakryiko)
    * reverse christmas tree in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY (Andrii
    Nakryiko)
    * use __bpf_md_ptr instead of __u32 for optval{,_end} (Martin Lau)
    * use BPF_FIELD_SIZEOF() for consistency (Martin Lau)
    * new CG_SOCKOPT_ACCESS macro to wrap repeated parts

    v2:
    * moved bpf_sockopt_kern fields around to remove a hole (Martin Lau)
    * aligned bpf_sockopt_kern->buf to 8 bytes (Martin Lau)
    * bpf_prog_array_is_empty instead of bpf_prog_array_length (Martin Lau)
    * added [0,2] return code check to verifier (Martin Lau)
    * dropped unused buf[64] from the stack (Martin Lau)
    * use PTR_TO_SOCKET for bpf_sockopt->sk (Martin Lau)
    * dropped bpf_target_off from ctx rewrites (Martin Lau)
    * use return code for kernel bypass (Martin Lau & Andrii Nakryiko)

    Cc: Andrii Nakryiko
    Cc: Martin Lau
    Signed-off-by: Stanislav Fomichev
    Signed-off-by: Alexei Starovoitov

    Stanislav Fomichev
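
    The return-code handling listed above, combined with the v7 note "always run all progs", can be sketched in plain C. This is a userspace model under assumptions: sockopt_prog_fn, prog_allow(), prog_deny() and run_sockopt_chain() are hypothetical names standing in for attached BPF programs and the kernel's array runner.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one attached program: returns 0 or 1. */
typedef int (*sockopt_prog_fn)(void *ctx);

/* Example programs; each counts its own invocation through ctx. */
int prog_allow(void *ctx) { ++*(int *)ctx; return 1; }
int prog_deny(void *ctx)  { ++*(int *)ctx; return 0; }

/*
 * Run every program in the chain. A single 0 verdict makes the whole
 * call fail with -EPERM, but later programs still run.
 */
int run_sockopt_chain(const sockopt_prog_fn *progs, size_t n, void *ctx)
{
	int allow = 1;

	for (size_t i = 0; i < n; i++)
		allow &= progs[i](ctx);
	return allow ? 0 : -EPERM;
}
```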
     

08 Jun, 2019

1 commit


06 Jun, 2019

1 commit


01 Jun, 2019

1 commit


31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

26 May, 2019

2 commits

  • Convert the sockfs filesystem to the new internal mount API as the old
    one will be obsoleted and removed. This allows greater flexibility in
    communication of mount parameters between userspace, the VFS and the
    filesystem.

    See Documentation/filesystems/mount_api.txt for more information.

    Signed-off-by: David Howells
    cc: netdev@vger.kernel.org
    Signed-off-by: Al Viro

    David Howells
     
  • Once upon a time we used to set ->d_name of e.g. pipefs root
    so that d_path() on pipes would work. These days it's
    completely pointless - dentries of pipes are not even connected
    to pipefs root. However, mount_pseudo() had set the root
    dentry name (passed as the second argument) and callers
    kept inventing names to pass to it. Including those that
    didn't *have* any non-root dentries to start with...

    All of that had been pointless for about 8 years now; it's
    time to get rid of that cargo-culting...

    Signed-off-by: Al Viro

    Al Viro
     

20 May, 2019

1 commit

  • Fix kernel-doc warnings by moving the kernel-doc notation to be
    immediately above the functions that it describes.

    Fixes these warnings for sock_sendmsg() and sock_recvmsg():

    ../net/socket.c:658: warning: Excess function parameter 'sock' description in 'INDIRECT_CALLABLE_DECLARE'
    ../net/socket.c:658: warning: Excess function parameter 'msg' description in 'INDIRECT_CALLABLE_DECLARE'
    ../net/socket.c:889: warning: Excess function parameter 'sock' description in 'INDIRECT_CALLABLE_DECLARE'
    ../net/socket.c:889: warning: Excess function parameter 'msg' description in 'INDIRECT_CALLABLE_DECLARE'
    ../net/socket.c:889: warning: Excess function parameter 'flags' description in 'INDIRECT_CALLABLE_DECLARE'

    Signed-off-by: Randy Dunlap
    Signed-off-by: David S. Miller

    Randy Dunlap
     

06 May, 2019

1 commit


26 Apr, 2019

1 commit

  • Add missing break statement in order to prevent the code from falling
    through to cases SIOCGSTAMP_NEW and SIOCGSTAMPNS_NEW.

    This bug was found thanks to the ongoing efforts to enable
    -Wimplicit-fallthrough.

    Fixes: 0768e17073dc ("net: socket: implement 64-bit timestamps")
    Signed-off-by: Gustavo A. R. Silva
    Reported-by: Dan Carpenter
    Acked-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
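
    A small illustration of the bug class fixed here. The command names and the bitmask return value are hypothetical; only the role of the break statement mirrors the commit.

```c
/* Hypothetical stand-ins for the old and new timestamp commands. */
enum stamp_cmd { STAMP_OLD, STAMP_NEW };

/*
 * Returns a bitmask of the handlers that ran:
 * bit 0 = old-layout handler, bit 1 = new-layout handler.
 */
unsigned int dispatch_stamp(enum stamp_cmd cmd)
{
	unsigned int ran = 0;

	switch (cmd) {
	case STAMP_OLD:
		ran |= 1u;
		break;	/* without this break, control falls into STAMP_NEW */
	case STAMP_NEW:
		ran |= 2u;
		break;
	}
	return ran;
}
```

    With the first break removed, dispatch_stamp(STAMP_OLD) would run both handlers, which is the kind of silent fall-through that -Wimplicit-fallthrough reports.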
     

20 Apr, 2019

2 commits

  • The 'timeval' and 'timespec' data structures used for socket timestamps
    are going to be redefined in user space based on 64-bit time_t in future
    versions of the C library to deal with the y2038 overflow problem,
    which breaks the ABI definition.

    Unlike many modern ioctl commands, SIOCGSTAMP and SIOCGSTAMPNS do not
    use the _IOR() macro to encode the size of the transferred data, so it
    remains ambiguous whether the application uses the old or new layout.

    The best workaround I could find is rather ugly: we redefine the command
    code based on the size of the respective data structure with a ternary
    operator. This lets it get evaluated as late as possible, hopefully after
    that structure is visible to the caller. We cannot use an #ifdef here,
    because linux/sockios.h might have been included before any libc header
    that could determine the size of time_t.

    The ioctl implementation now interprets the new command codes as always
    referring to the 64-bit structure on all architectures, while the old
    architecture specific command code still refers to the old architecture
    specific layout. The new command number is only used when they are
    actually different.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
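
    The ternary trick described above can be sketched with stand-in types. Everything here is illustrative: struct stamp32, struct stamp64, the STAMP_CMD_* values and the STAMP_CMD_FOR() macro are hypothetical, not the definitions in the real header.

```c
#include <stdint.h>

/* Hypothetical layouts: two 32-bit fields vs. two 64-bit fields. */
struct stamp32 { int32_t sec; int32_t usec; };	/* 8 bytes  */
struct stamp64 { int64_t sec; int64_t usec; };	/* 16 bytes */

/* Hypothetical command codes. */
#define STAMP_CMD_OLD 1
#define STAMP_CMD_NEW 2

/*
 * The size test sits inside the macro expansion, so it is evaluated
 * where the macro is used, not where it is defined. The type only has
 * to be complete at the point of use.
 */
#define STAMP_CMD_FOR(type) \
	(sizeof(type) == 8 ? STAMP_CMD_OLD : STAMP_CMD_NEW)
```

    An #ifdef cannot make this choice, because the preprocessor runs before the size of any structure is known.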
     
  • The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
    socket protocol handlers, and all of those end up calling the same
    sock_get_timestamp()/sock_get_timestampns() helper functions, which
    results in a lot of duplicate code.

    With the introduction of 64-bit time_t on 32-bit architectures, this
    gets worse, as we then need four different ioctl commands in each
    socket protocol implementation.

    To simplify that, let's add a new .gettstamp() operation in
    struct proto_ops, and move ioctl implementation into the common
    sock_ioctl()/compat_sock_ioctl_trans() functions that these all go
    through.

    We can reuse the sock_get_timestamp() implementation, but generalize
    it so it can deal with both native and compat mode, as well as
    timeval and timespec structures.

    Acked-by: Stefan Schmidt
    Acked-by: Neil Horman
    Acked-by: Marc Kleine-Budde
    Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/
    Signed-off-by: Arnd Bergmann
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

16 Mar, 2019

1 commit

  • Adds missing sphinx documentation to the
    socket.c's functions. Also fixes some whitespaces.

    I also changed the style of older documentation as an
    effort to have a uniform documentation style.

    Signed-off-by: Pedro Tammela
    Signed-off-by: David S. Miller

    Pedro Tammela
     

03 Mar, 2019

1 commit


26 Feb, 2019

1 commit

  • Commit 9060cb719e61 ("net: crypto set sk to NULL when af_alg_release.")
    fixed a use-after-free in sockfs_setattr() when an AF_ALG socket is
    closed concurrently with fchownat(). However, it ignored that many
    other proto_ops::release() methods don't set sock->sk to NULL and
    therefore allow the same use-after-free:

    - base_sock_release
    - bnep_sock_release
    - cmtp_sock_release
    - data_sock_release
    - dn_release
    - hci_sock_release
    - hidp_sock_release
    - iucv_sock_release
    - l2cap_sock_release
    - llcp_sock_release
    - llc_ui_release
    - rawsock_release
    - rfcomm_sock_release
    - sco_sock_release
    - svc_release
    - vcc_release
    - x25_release

    Rather than fixing all these and relying on every socket type to get
    this right forever, just make __sock_release() set sock->sk to NULL
    itself after calling proto_ops::release().

    Reproducer that produces the KASAN splat when any of these socket types
    are configured into the kernel:

    #include <pthread.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <unistd.h>

    pthread_t t;
    volatile int fd;

    void *close_thread(void *arg)
    {
        for (;;) {
            usleep(rand() % 100);
            close(fd);
        }
    }

    int main()
    {
        pthread_create(&t, NULL, close_thread, NULL);
        for (;;) {
            fd = socket(rand() % 50, rand() % 11, 0);
            /* 0x1000 is AT_EMPTY_PATH: operate on fd itself */
            fchownat(fd, "", 1000, 1000, 0x1000);
            close(fd);
        }
    }

    Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.")
    Signed-off-by: Eric Biggers
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    Eric Biggers
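
    The shape of the fix, clearing the pointer once in the common release path instead of in every protocol, can be modeled in userspace. All names below (sketch_sock, sketch_socket, sketch_ops, forgetful_release, and so on) are hypothetical, and the single-threaded model does not reproduce the race itself.

```c
#include <errno.h>
#include <stddef.h>

struct sketch_sock { int uid; };
struct sketch_socket;
struct sketch_ops { void (*release)(struct sketch_socket *sock); };
struct sketch_socket {
	struct sketch_sock *sk;
	const struct sketch_ops *ops;
};

/* A protocol release method that does not clear sock->sk. */
void forgetful_release(struct sketch_socket *sock) { (void)sock; }
const struct sketch_ops forgetful_ops = { .release = forgetful_release };

/* Common release path: clear sk here so no protocol has to remember. */
void sketch_sock_release(struct sketch_socket *sock)
{
	if (sock->ops) {
		sock->ops->release(sock);
		sock->sk = NULL;
		sock->ops = NULL;
	}
}

/* A later accessor can then detect the released socket. */
int sketch_get_uid(const struct sketch_socket *sock, int *uid)
{
	if (!sock->sk)
		return -ENOENT;
	*uid = sock->sk->uid;
	return 0;
}
```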
     

09 Feb, 2019

1 commit


04 Feb, 2019

4 commits

  • Add SO_TIMESTAMPING_NEW variant of socket timestamp options.
    This is the y2038 safe version of SO_TIMESTAMPING_OLD
    for all architectures.

    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Cc: chris@zankel.net
    Cc: fenghua.yu@intel.com
    Cc: rth@twiddle.net
    Cc: tglx@linutronix.de
    Cc: ubraun@linux.ibm.com
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-s390@vger.kernel.org
    Cc: linux-xtensa@linux-xtensa.org
    Cc: sparclinux@vger.kernel.org
    Signed-off-by: David S. Miller

    Deepa Dinamani
     
  • Add SO_TIMESTAMP_NEW and SO_TIMESTAMPNS_NEW variants of
    socket timestamp options.
    These are the y2038 safe versions of the SO_TIMESTAMP_OLD
    and SO_TIMESTAMPNS_OLD for all architectures.

    Note that the format of scm_timestamping.ts[0] is not changed
    in this patch.

    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Cc: jejb@parisc-linux.org
    Cc: ralf@linux-mips.org
    Cc: rth@twiddle.net
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linux-rdma@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Signed-off-by: David S. Miller

    Deepa Dinamani
     
  • As part of y2038 solution, all internal uses of
    struct timeval are replaced by struct __kernel_old_timeval
    and struct compat_timeval by struct old_timeval32.
    Make socket timestamps use these new types.

    This is mainly to be able to verify that the kernel build
    is y2038 safe when such non y2038 safe types are not
    supported anymore.

    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Cc: isdn@linux-pingi.de
    Signed-off-by: David S. Miller

    Deepa Dinamani
     
  • SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING options, the
    way they are currently defined, are not y2038 safe.
    Subsequent patches in the series add new y2038 safe versions
    of these options which provide 64 bit timestamps on all
    architectures uniformly.
    Hence, rename existing options with OLD tag suffixes.

    Also note that kernel will not use the untagged SO_TIMESTAMP*
    and SCM_TIMESTAMP* options internally anymore.

    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Cc: deller@gmx.de
    Cc: dhowells@redhat.com
    Cc: jejb@parisc-linux.org
    Cc: ralf@linux-mips.org
    Cc: rth@twiddle.net
    Cc: linux-afs@lists.infradead.org
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linux-rdma@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Signed-off-by: David S. Miller

    Deepa Dinamani
     

31 Jan, 2019

4 commits

  • Same story as before, these use struct ifreq and thus need
    to be read with the shorter version to not cause faults.

    Cc: stable@vger.kernel.org
    Fixes: f92d4fc95341 ("kill bond_ioctl()")
    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • As reported by Robert O'Callahan in
    https://bugzilla.kernel.org/show_bug.cgi?id=202273
    reverting the previous changes in this area broke
    the SIOCGIFNAME ioctl in compat again (I'd previously
    fixed it after his previous report of breakage in
    https://bugzilla.kernel.org/show_bug.cgi?id=199469).

    This is obviously because I fixed SIOCGIFNAME more or
    less by accident.

    Fix it explicitly now by making it pass through the
    restored compat translation code.

    Cc: stable@vger.kernel.org
    Fixes: 4cf808e7ac32 ("kill dev_ifname32()")
    Reported-by: Robert O'Callahan
    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • This reverts commit bf4405737f9f ("kill dev_ifsioc()").

    This wasn't really unused as implied by the original commit,
    it still handles the copy to/from user differently, and the
    commit thus caused issues such as
    https://bugzilla.kernel.org/show_bug.cgi?id=199469
    and
    https://bugzilla.kernel.org/show_bug.cgi?id=202273

    However, deviating from a strict revert, rename dev_ifsioc()
    to compat_ifreq_ioctl() to be clearer as to its purpose and
    add a comment.

    Cc: stable@vger.kernel.org
    Fixes: bf4405737f9f ("kill dev_ifsioc()")
    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • This reverts commit 1cebf8f143c2 ("socket: fix struct ifreq
    size in compat ioctl"), it's a bugfix for another commit that
    I'll revert next.

    This is not a 'perfect' revert, I'm keeping some coding style
    intact rather than revert to the state with indentation errors.

    Cc: stable@vger.kernel.org
    Fixes: 1cebf8f143c2 ("socket: fix struct ifreq size in compat ioctl")
    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

29 Dec, 2018

1 commit

  • Pull y2038 updates from Arnd Bergmann:
    "More syscalls and cleanups

    This concludes the main part of the system call rework for 64-bit
    time_t, which has spread over most of year 2018, the last six system
    calls being

    - ppoll
    - pselect6
    - io_pgetevents
    - recvmmsg
    - futex
    - rt_sigtimedwait

    As before, nothing changes for 64-bit architectures, while 32-bit
    architectures gain another entry point that differs only in the layout
    of the timespec structure. Hopefully in the next release we can wire
    up all 22 of those system calls on all 32-bit architectures, which
    gives us a baseline version for glibc to start using them.

    This does not include the clock_adjtime, getrusage/waitid, and
    getitimer/setitimer system calls. I still plan to have new versions of
    those as well, but they are not required for correct operation of the
    C library since they can be emulated using the old 32-bit time_t based
    system calls.

    Aside from the system calls, there are also a few cleanups here,
    removing old kernel internal interfaces that have become unused after
    all references got removed. The arch/sh cleanups are part of this,
    these were posted several times over the past year without a reaction
    from the maintainers, while the corresponding changes made it into all
    other architectures"

    * tag 'y2038-for-4.21' of ssh://gitolite.kernel.org:/pub/scm/linux/kernel/git/arnd/playground:
    timekeeping: remove obsolete time accessors
    vfs: replace current_kernel_time64 with ktime equivalent
    timekeeping: remove timespec_add/timespec_del
    timekeeping: remove unused {read,update}_persistent_clock
    sh: remove board_time_init() callback
    sh: remove unused rtc_sh_get/set_time infrastructure
    sh: sh03: rtc: push down rtc class ops into driver
    sh: dreamcast: rtc: push down rtc class ops into driver
    y2038: signal: Add compat_sys_rt_sigtimedwait_time64
    y2038: signal: Add sys_rt_sigtimedwait_time32
    y2038: socket: Add compat_sys_recvmmsg_time64
    y2038: futex: Add support for __kernel_timespec
    y2038: futex: Move compat implementation into futex.c
    io_pgetevents: use __kernel_timespec
    pselect6: use __kernel_timespec
    ppoll: use __kernel_timespec
    signal: Add restore_user_sigmask()
    signal: Add set_user_sigmask()

    Linus Torvalds
     

18 Dec, 2018

1 commit

    recvmmsg() takes two arguments that are pointers to structures that differ
    between 32-bit and 64-bit architectures: mmsghdr and timespec.

    For y2038 compatibility, we are changing the native system call from
    timespec to __kernel_timespec with a 64-bit time_t (in another patch),
    and use the existing compat system call on both 32-bit and 64-bit
    architectures for compatibility with traditional 32-bit user space.

    As we now have two variants of recvmmsg() for 32-bit tasks that are both
    different from the variant that we use on 64-bit tasks, this means we
    also require two compat system calls!

    The solution I picked is to flip things around: The existing
    compat_sys_recvmmsg() call gets moved from net/compat.c into net/socket.c
    and now handles the case for old user space on all architectures that
    have set CONFIG_COMPAT_32BIT_TIME. A new compat_sys_recvmmsg_time64()
    call gets added in the old place for 64-bit architectures only, this
    one handles the case of a compat mmsghdr structure combined with
    __kernel_timespec.

    In the indirect sys_socketcall(), we now need to call either
    do_sys_recvmmsg() or __compat_sys_recvmmsg(), depending on what kind of
    architecture we are on. For compat_sys_socketcall(), no such change is
    needed, we always call __compat_sys_recvmmsg().

    I decided to not add a new SYS_RECVMMSG_TIME64 socketcall: Any libc
    implementation for 64-bit time_t will need significant changes including
    an updated asm/unistd.h, and it seems better to consistently use the
    separate syscalls for that configuration, leaving the socketcall only for
    backward compatibility with 32-bit time_t based libc.

    The naming is asymmetric for the moment, so both existing syscall
    entry points keep their names, while the new ones are recvmmsg_time32
    and compat_recvmmsg_time64 respectively. I expect that we will rename
    the compat syscalls later as we start using generated syscall tables
    everywhere and add these entry points.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

18 Nov, 2018

1 commit


02 Nov, 2018

1 commit

  • Pull AFS updates from Al Viro:
    "AFS series, with some iov_iter bits included"

    * 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
    missing bits of "iov_iter: Separate type from direction and use accessor functions"
    afs: Probe multiple fileservers simultaneously
    afs: Fix callback handling
    afs: Eliminate the address pointer from the address list cursor
    afs: Allow dumping of server cursor on operation failure
    afs: Implement YFS support in the fs client
    afs: Expand data structure fields to support YFS
    afs: Get the target vnode in afs_rmdir() and get a callback on it
    afs: Calc callback expiry in op reply delivery
    afs: Fix FS.FetchStatus delivery from updating wrong vnode
    afs: Implement the YFS cache manager service
    afs: Remove callback details from afs_callback_break struct
    afs: Commit the status on a new file/dir/symlink
    afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS
    afs: Don't invoke the server to read data beyond EOF
    afs: Add a couple of tracepoints to log I/O errors
    afs: Handle EIO from delivery function
    afs: Fix TTL on VL server and address lists
    afs: Implement VL server rotation
    afs: Improve FS server rotation error handling
    ...

    Linus Torvalds