10 Mar, 2020

1 commit

  • The sock map code checks that a socket does not have an active upper
    layer protocol before inserting it into the map. This requires casting
    via inet_csk, which isn't valid for UDP sockets.

    Guard checks for ULP by checking inet_sk(sk)->is_icsk first.

    Signed-off-by: Lorenz Bauer
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Jakub Sitnicki
    Acked-by: John Fastabend
    Link: https://lore.kernel.org/bpf/20200309111243.6982-2-lmb@cloudflare.com

    Lorenz Bauer
     

25 Feb, 2020

1 commit

  • tcp_ulp_list is traversed using list_for_each_entry_rcu
    outside an RCU read-side critical section but under the protection
    of tcp_ulp_list_lock.

    Hence, add corresponding lockdep expression to silence false-positive
    warnings, and harden RCU lists.t

    Signed-off-by: Amol Grover
    Signed-off-by: David S. Miller

    Amol Grover
     

22 Feb, 2020

1 commit

  • sk_msg and ULP frameworks override protocol callbacks pointer in
    sk->sk_prot, while tcp accesses it locklessly when cloning the listening
    socket, that is with neither sk_lock nor sk_callback_lock held.

    Once we enable use of listening sockets with sockmap (and hence sk_msg),
    there will be shared access to sk->sk_prot if socket is getting cloned
    while being inserted/deleted to/from the sockmap from another CPU:

    Read side:

    tcp_v4_rcv
    sk = __inet_lookup_skb(...)
    tcp_check_req(sk)
    inet_csk(sk)->icsk_af_ops->syn_recv_sock
    tcp_v4_syn_recv_sock
    tcp_create_openreq_child
    inet_csk_clone_lock
    sk_clone_lock
    READ_ONCE(sk->sk_prot)

    Write side:

    sock_map_ops->map_update_elem
    sock_map_update_elem
    sock_map_update_common
    sock_map_link_no_progs
    tcp_bpf_init
    tcp_bpf_update_sk_prot
    sk_psock_update_proto
    WRITE_ONCE(sk->sk_prot, ops)

    sock_map_ops->map_delete_elem
    sock_map_delete_elem
    __sock_map_delete
    sock_map_unref
    sk_psock_put
    sk_psock_drop
    sk_psock_restore_proto
    tcp_update_ulp
    WRITE_ONCE(sk->sk_prot, proto)

    Mark the shared access with READ_ONCE/WRITE_ONCE annotations.

    Signed-off-by: Jakub Sitnicki
    Signed-off-by: Daniel Borkmann
    Link: https://lore.kernel.org/bpf/20200218171023.844439-2-jakub@cloudflare.com

    Jakub Sitnicki
     

16 Jan, 2020

1 commit

  • When sockmap sock with TLS enabled is removed we cleanup bpf/psock state
    and call tcp_update_ulp() to push updates to TLS ULP on top. However, we
    don't push the write_space callback up and instead simply overwrite the
    op with the psock stored previous op. This may or may not be correct so
    to ensure we don't overwrite the TLS write space hook pass this field to
    the ULP and have it fixup the ctx.

    This completes a previous fix that pushed the ops through to the ULP
    but at the time missed doing this for write_space, presumably because
    write_space TLS hook was added around the same time.

    Fixes: 95fa145479fbc ("bpf: sockmap/tls, close can race with map free")
    Signed-off-by: John Fastabend
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Jakub Sitnicki
    Acked-by: Jonathan Lemon
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/bpf/20200111061206.8028-4-john.fastabend@gmail.com

    John Fastabend
     

21 Nov, 2019

1 commit

  • snprintf returns the number of chars that would be written, not number
    of chars that were actually written. As such, 'offs' may get larger than
    'tbl.maxlen', causing the 'tbl.maxlen - offs' being < 0, and since the
    parameter is size_t, it would overflow.

    Since using scnprintf may hide the limit error, while the buffer is still
    enough now, let's just add a WARN_ON_ONCE in case it reach the limit
    in future.

    v2: Use WARN_ON_ONCE as Jiri and Eric suggested.

    Suggested-by: Jiri Benc
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
     

22 Jul, 2019

1 commit

  • When a map free is called and in parallel a socket is closed we
    have two paths that can potentially reset the socket prot ops, the
    bpf close() path and the map free path. This creates a problem
    with which prot ops should be used from the socket closed side.

    If the map_free side completes first then we want to call the
    original lowest level ops. However, if the tls path runs first
    we want to call the sockmap ops. Additionally there was no locking
    around prot updates in TLS code paths so the prot ops could
    be changed multiple times once from TLS path and again from sockmap
    side potentially leaving ops pointed at either TLS or sockmap
    when psock and/or tls context have already been destroyed.

    To fix this race first only update ops inside callback lock
    so that TLS, sockmap and lowest level all agree on prot state.
    Second and a ULP callback update() so that lower layers can
    inform the upper layer when they are being removed allowing the
    upper layer to reset prot ops.

    This gets us close to allowing sockmap and tls to be stacked
    in arbitrary order but will save that patch for *next trees.

    v4:
    - make sure we don't free things for device;
    - remove the checks which swap the callbacks back
    only if TLS is at the top.

    Reported-by: syzbot+06537213db7ba2745c4a@syzkaller.appspotmail.com
    Fixes: 02c558b2d5d6 ("bpf: sockmap, support for msg_peek in sk_msg with redirect ingress")
    Signed-off-by: John Fastabend
    Signed-off-by: Jakub Kicinski
    Reviewed-by: Dirk van der Merwe
    Signed-off-by: Daniel Borkmann

    John Fastabend
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
    initial scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

17 Oct, 2018

1 commit

  • Eric reported that syzkaller triggered a splat in tcp_cleanup_ulp()
    where assertion sock_owned_by_me() failed. This happened through
    inet_csk_prepare_forced_close() first releasing the socket lock,
    then calling into tcp_done(newsk) which is called after the
    inet_csk_prepare_forced_close() and therefore without the socket
    lock held. The sock_owned_by_me() assertion can generally be
    removed as the only place where tcp_cleanup_ulp() is called from
    now is out of inet_csk_destroy_sock() -> sk->sk_prot->destroy()
    where socket is in dead state and unreachable. Therefore, add a
    comment why the check is not needed instead.

    Fixes: 8b9088f806e1 ("tcp, ulp: enforce sock_owned_by_me upon ulp init and cleanup")
    Reported-by: Eric Dumazet
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

16 Oct, 2018

2 commits

  • In order to prepare sockmap logic to be used in combination with kTLS
    we need to detangle it from ULP, and further split it in later commits
    into a generic API.

    Joint work with John.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: John Fastabend
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     
  • Whenever the ULP data on the socket is mangled, enforce that the
    caller has the socket lock held as otherwise things may race with
    initialization and cleanup callbacks from ulp ops as both would
    mangle internal socket state.

    Joint work with John.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: John Fastabend
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

17 Aug, 2018

2 commits

  • I found that in BPF sockmap programs once we either delete a socket
    from the map or we updated a map slot and the old socket was purged
    from the map that these socket can never get reattached into a map
    even though their related psock has been dropped entirely at that
    point.

    Reason is that tcp_cleanup_ulp() leaves the old icsk->icsk_ulp_ops
    intact, so that on the next tcp_set_ulp_id() the kernel returns an
    -EEXIST thinking there is still some active ULP attached.

    BPF sockmap is the only one that has this issue as the other user,
    kTLS, only calls tcp_cleanup_ulp() from tcp_v4_destroy_sock() whereas
    sockmap semantics allow dropping the socket from the map with all
    related psock state being cleaned up.

    Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
    Signed-off-by: Daniel Borkmann
    Acked-by: John Fastabend
    Acked-by: Song Liu
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     
  • Lets not turn the TCP ULP lookup into an arbitrary module loader as
    we only intend to load ULP modules through this mechanism, not other
    unrelated kernel modules:

    [root@bar]# cat foo.c
    #include
    #include
    #include
    #include

    int main(void)
    {
    int sock = socket(PF_INET, SOCK_STREAM, 0);
    setsockopt(sock, IPPROTO_TCP, TCP_ULP, "sctp", sizeof("sctp"));
    return 0;
    }

    [root@bar]# gcc foo.c -O2 -Wall
    [root@bar]# lsmod | grep sctp
    [root@bar]# ./a.out
    [root@bar]# lsmod | grep sctp
    sctp 1077248 4
    libcrc32c 16384 3 nf_conntrack,nf_nat,sctp
    [root@bar]#

    Fix it by adding module alias to TCP ULP modules, so probing module
    via request_module() will be limited to tcp-ulp-[name]. The existing
    modules like kTLS will load fine given tcp-ulp-tls alias, but others
    will fail to load:

    [root@bar]# lsmod | grep sctp
    [root@bar]# ./a.out
    [root@bar]# lsmod | grep sctp
    [root@bar]#

    Sockmap is not affected from this since it's either built-in or not.

    Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
    Signed-off-by: Daniel Borkmann
    Acked-by: John Fastabend
    Acked-by: Song Liu
    Signed-off-by: Alexei Starovoitov

    Daniel Borkmann
     

06 Feb, 2018

1 commit

  • Create a UID field and enum that can be used to assign ULPs to
    sockets. This saves a set of string comparisons if the ULP id
    is known.

    For sockmap, which is added in the next patches, a ULP is used to
    hook into TCP sockets close state. In this case the ULP being added
    is done at map insert time and the ULP is known and done on the kernel
    side. In this case the named lookup is not needed. Because we don't
    want to expose psock internals to user space socket options a user
    visible flag is also added. For TLS this is set for BPF it will be
    cleared.

    Alos remove pr_notice, user gets an error code back and should check
    that rather than rely on logs.

    Signed-off-by: John Fastabend
    Signed-off-by: Daniel Borkmann

    John Fastabend
     

15 Aug, 2017

1 commit


24 Jun, 2017

1 commit

  • KASAN reports out-of-bound access in proc_dostring() coming from
    proc_tcp_available_ulp() because in case TCP ULP list is empty
    the buffer allocated for the response will not have anything
    printed into it. Set the first byte to zero to avoid strlen()
    going out-of-bounds.

    Fixes: 734942cc4ea6 ("tcp: ULP infrastructure")
    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jakub Kicinski
     

16 Jun, 2017

1 commit

  • Add the infrustructure for attaching Upper Layer Protocols (ULPs) over TCP
    sockets. Based on a similar infrastructure in tcp_cong. The idea is that any
    ULP can add its own logic by changing the TCP proto_ops structure to its own
    methods.

    Example usage:

    setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));

    modules will call:
    tcp_register_ulp(&tcp_tls_ulp_ops);

    to register/unregister their ulp, with an init function and name.

    A list of registered ulps will be returned by tcp_get_available_ulp, which is
    hooked up to /proc. Example:

    $ cat /proc/sys/net/ipv4/tcp_available_ulp
    tls

    There is currently no functionality to remove or chain ULPs, but
    it should be possible to add these in the future if needed.

    Signed-off-by: Boris Pismenny
    Signed-off-by: Dave Watson
    Signed-off-by: David S. Miller

    Dave Watson