09 Feb, 2016

1 commit

  • Commit ed5a377d87dc ("sctp: translate host order to network order when
    setting a hmacid") corrected the hmacid byte-order when setting a hmacid.
    but the same issue also exists on getting a hmacid.

    We fix it by changing hmacids to host order when users get them with
    getsockopt.

    Fixes: Commit ed5a377d87dc ("sctp: translate host order to network order when setting a hmacid")
    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     

29 Jan, 2016

3 commits

  • After we use refcnt to check if transport is alive, the dead can be
    removed from sctp_transport.

    The traversal of transport_addr_list in procfs dump is using
    list_for_each_entry_rcu, no need to check if it has been freed.

    sctp_generate_t3_rtx_event and sctp_generate_heartbeat_event is
    protected by sock lock, it's not necessary to check dead, either.
    also, the timers are cancelled when sctp_transport_free() is
    called, that it doesn't wait for refcnt to reach 0 to cancel them.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     
  • Previously, before rhashtable, /proc assoc listing was done by
    read-locking the entire hash entry and dumping all assocs at once, so we
    were sure that the assoc wasn't freed because it wouldn't be possible to
    remove it from the hash meanwhile.

    Now we use rhashtable to list transports, and dump entries one by one.
    That is, now we have to check if the assoc is still a good one, as the
    transport we got may be being freed.

    Signed-off-by: Xin Long
    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     
  • Now when __sctp_lookup_association is running in BH, it will try to
    check if t->dead is set, but meanwhile other CPUs may be freeing this
    transport and this assoc and if it happens that
    __sctp_lookup_association checked t->dead a bit too early, it may think
    that the association is still good while it was already freed.

    So we fix this race by using atomic_add_unless in sctp_transport_hold.
    After we get one transport from hashtable, we will hold it only when
    this transport's refcnt is not 0, so that we can make sure t->asoc
    cannot be freed before we hold the asoc again.

    Note that sctp association is not freed using RCU so we can't use
    atomic_add_unless() with it as it may just be too late for that either.

    Fixes: 4f0087812648 ("sctp: apply rhashtable api to send/recv path")
    Reported-by: Vlad Yasevich
    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     

25 Jan, 2016

1 commit

  • This patch extends commit b93d6471748d ("sctp: implement the sender side
    for SACK-IMMEDIATELY extension") as it didn't white list
    SCTP_SACK_IMMEDIATELY on sctp_msghdr_parse(), causing it to be
    understood as an invalid flag and returning -EINVAL to the application.

    Note that the actual handling of the flag is already there in
    sctp_datamsg_from_user().

    https://tools.ietf.org/html/rfc7053#section-7

    Fixes: b93d6471748d ("sctp: implement the sender side for SACK-IMMEDIATELY extension")
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

18 Jan, 2016

1 commit

  • Re-establish the previous behavior and avoid hashing temporary asocs by
    checking t->asoc->temp in sctp_(un)hash_transport. Also, remove the
    check of t->asoc->temp in __sctp_lookup_association, since they are
    never hashed now.

    Fixes: 4f0087812648 ("sctp: apply rhashtable api to send/recv path")
    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Reported-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Xin Long
     

16 Jan, 2016

2 commits

  • net/sctp/proc.c: In function ‘sctp_transport_get_idx’:
    net/sctp/proc.c:313: warning: ‘obj’ may be used uninitialized in this function

    This is currently a false positive, as all callers check for a zero
    offset first, and handle this case in the exact same way.

    Move the check and handling into sctp_transport_get_idx() to kill the
    compiler warning, and avoid future bugs.

    Signed-off-by: Geert Uytterhoeven
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Geert Uytterhoeven
     
  • Now, when we sendmsg, we translate the ep to laddr by selecting the
    first element of the list, and then do a lookup for a transport.

    But sctp_hash_cmp() will compare it against asoc addr_list, which may
    be a subset of ep addr_list, meaning that this chosen laddr may not be
    there, and thus making it impossible to find the transport.

    So we fix it by using ep + paddr to lookup transports in hashtable. In
    sctp_hash_cmp, if .ep is set, we will check if this ep == asoc->ep,
    or we will do the laddr check.

    Fixes: d6c0256a60e6 ("sctp: add the rhashtable apis for sctp global transport hashtable")
    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Reported-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Xin Long
     

12 Jan, 2016

2 commits

  • Conflicts:
    drivers/net/bonding/bond_main.c
    drivers/net/ethernet/mellanox/mlxsw/spectrum.h
    drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c

    The bond_main.c and mellanox switch conflicts were cases of
    overlapping changes.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Dmitry Vyukov reported a use-after-free in the code expanded by the
    macro debug_post_sfx, which is caused by the use of the asoc pointer
    after it was freed within sctp_side_effect() scope.

    This patch fixes it by allowing sctp_side_effect to clear that asoc
    pointer when the TCB is freed.

    As Vlad explained, we also have to cover the SCTP_DISPOSITION_ABORT case
    because it will trigger DELETE_TCB too on that same loop.

    Also, there were places issuing SCTP_CMD_INIT_FAILED and ASSOC_FAILED
    but returning SCTP_DISPOSITION_CONSUME, which would fool the scheme
    above. Fix it by returning SCTP_DISPOSITION_ABORT instead.

    The macro is already prepared to handle such NULL pointer.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

11 Jan, 2016

1 commit


06 Jan, 2016

5 commits

  • sctp_endpoint_lookup_assoc is called in the protection of sock lock
    there is no need to call local_bh_disable in this function. so remove
    them.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     
  • transport hashtable will replace the association hashtable,
    so association hashtable is not used in sctp any more, so
    drop the codes about that.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     
  • Traversal the transport rhashtable, get the association only once through
    the condition assoc->peer.primary_path != transport.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     
  • apply lookup apis to two functions, for __sctp_endpoint_lookup_assoc
    and __sctp_lookup_association, it's invoked in the protection of sock
    lock, it will be safe, but sctp_lookup_association need to call
    rcu_read_lock() and to detect the t->dead to protect it.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     
  • tranport hashtbale will replace the association hashtable to do the
    lookup for transport, and then get association by t->assoc, rhashtable
    apis will be used because of it's resizable, scalable and using rcu.

    lport + rport + paddr will be the base hashkey to locate the chain,
    with net to protect one netns from another, then plus the laddr to
    compare to get the target.

    this patch will provider the lookup functions:
    - sctp_epaddr_lookup_transport
    - sctp_addrs_lookup_transport

    hash/unhash functions:
    - sctp_hash_transport
    - sctp_unhash_transport

    init/destroy functions:
    - sctp_transport_hashtable_init
    - sctp_transport_hashtable_destroy

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     

01 Jan, 2016

1 commit


31 Dec, 2015

1 commit

  • In sctp_close, sctp_make_abort_user may return NULL because of memory
    allocation failure. If this happens, it will bypass any state change
    and never free the assoc. The assoc has no chance to be freed and it
    will be kept in memory with the state it had even after the socket is
    closed by sctp_close().

    So if sctp_make_abort_user fails to allocate memory, we should abort
    the asoc via sctp_primitive_ABORT as well. Just like the annotation in
    sctp_sf_cookie_wait_prm_abort and sctp_sf_do_9_1_prm_abort said,
    "Even if we can't send the ABORT due to low memory delete the TCB.
    This is a departure from our typical NOMEM handling".

    But then the chunk is NULL (low memory) and the SCTP_CMD_REPLY cmd would
    dereference the chunk pointer, and system crash. So we should add
    SCTP_CMD_REPLY cmd only when the chunk is not NULL, just like other
    places where it adds SCTP_CMD_REPLY cmd.

    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     

28 Dec, 2015

2 commits

  • Accepted or peeled off sockets were missing a security label (e.g.
    SELinux) which means that socket was in "unlabeled" state.

    This patch clones the sock's label from the parent sock and resolves the
    issue (similar to AF_BLUETOOTH protocol family).

    Cc: Paul Moore
    Cc: David Teigland
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Paul Moore
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     
  • Commit cacc06215271 ("sctp: use GFP_USER for user-controlled kmalloc")
    missed two other spots.

    For connectx, as it's more likely to be used by kernel users of the API,
    it detects if GFP_USER should be used or not.

    Fixes: cacc06215271 ("sctp: use GFP_USER for user-controlled kmalloc")
    Reported-by: Dmitry Vyukov
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

18 Dec, 2015

1 commit


16 Dec, 2015

3 commits

  • As we all know, the value of pf_retrans >= max_retrans_path can
    disable pf state. The variables of pf_retrans and max_retrans_path
    can be changed by the userspace application.

    Sometimes the user expects to disable pf state while the 2
    variables are changed to enable pf state. So it is necessary to
    introduce a new variable to disable pf state.

    According to the suggestions from Vlad Yasevich, extra1 and extra2
    are removed. The initialization of pf_enable is added.

    Acked-by: Vlad Yasevich
    Signed-off-by: Zhu Yanjun
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Zhu Yanjun
     
  • modules init functions being called from process context, we better
    use GFP_KERNEL allocations to increase our chances to get these
    high-order pages we want for SCTP hash tables.

    This mostly matters if SCTP module is loaded once memory got fragmented.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • The SCTP checksum is really a CRC and is very different from the
    standards 1's complement checksum that serves as the checksum
    for IP protocols. This offload interface is also very different.
    Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these
    differences. The term CSUM should be reserved in the stack to refer
    to the standard 1's complement IP checksum.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

12 Dec, 2015

1 commit

  • SCTP is lacking proper np->opt cloning at accept() time.

    TCP and DCCP use ipv6_dup_options() helper, do the same
    in SCTP.

    We might later factorize this code in a common helper to avoid
    future mistakes.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Eric Dumazet
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Eric Dumazet
     

08 Dec, 2015

1 commit

  • While cooking the sctp np->opt rcu fixes, I forgot to move
    one rcu_read_unlock() after the added rcu_dereference() in
    sctp_v6_get_dst()

    This gave lockdep warnings reported by Dave Jones.

    Fixes: c836a8ba9386 ("ipv6: sctp: add rcu protection around np->opt")
    Reported-by: Dave Jones
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Dec, 2015

3 commits

  • when A sends a data to B, then A close() and enter into SHUTDOWN_PENDING
    state, if B neither claim his rwnd is 0 nor send SACK for this data, A
    will keep retransmitting this data until t5 timeout, Max.Retrans times
    can't work anymore, which is bad.

    if B's rwnd is not 0, it should send abort after Max.Retrans times, only
    when B's rwnd == 0 and A's retransmitting beyonds Max.Retrans times, A
    will start t5 timer, which is also commit f8d960524328 ("sctp: Enforce
    retransmission limit during shutdown") means, but it lacks the condition
    peer rwnd == 0.

    so fix it by adding a bit (zero_window_announced) in peer to record if
    the last rwnd is 0. If it was, zero_window_announced will be set. and use
    this bit to decide if start t5 timer when local.state is SHUTDOWN_PENDING.

    Fixes: commit f8d960524328 ("sctp: Enforce retransmission limit during shutdown")
    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    lucien
     
  • If the chunks are enqueued successfully but sctp_cmd_interpreter()
    return err to sctp_sendmsg() (mainly because of no mem), the chunks will
    get re-queued, but we are dropping the reference and freeing them.

    The fix is to just drop the reference on the datamsg just as it had
    succeeded, as:
    - if the chunks weren't queued, this is enough to get them freed.
    - if they were queued, they will get freed when they finally get out or
    discarded.

    Signed-off-by: Xin Long
    Marcelo Ricardo Leitner
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    lucien
     
  • When a msg is sent, sctp will hold the chunks of this msg and then try
    to enqueue them. But if the chunks are not enqueued in sctp_outq_tail()
    because of the invalid state, sctp_cmd_interpreter() may still return
    success to sctp_sendmsg() after calling sctp_outq_flush(), these chunks
    will become orphans and will leak.

    So we fix them by moving sctp_chunk_hold() to sctp_outq_tail(), where we
    are sure that the chunk is going to get queued.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    lucien
     

06 Dec, 2015

3 commits

  • As we are keeping timestamps on when copying the socket, we also have to
    copy sk_tsflags.

    This is needed since b9f40e21ef42 ("net-timestamp: move timestamp flags
    out of sk_flags").

    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     
  • Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
    related to disabling sock timestamp.

    When SCTP accepts an association or peel one off, it copies sock flags
    but forgot to call net_enable_timestamp() if a packet timestamping flag
    was copied, leading to extra calls to net_disable_timestamp() whenever
    such clones were closed.

    The fix is to call net_enable_timestamp() whenever we copy a sock with
    that flag on, like tcp does.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     
  • SCTP echoes a cookie o INIT ACK chunks that contains a timestamp, for
    detecting stale cookies. This cookie is echoed back to the server by the
    client and then that timestamp is checked.

    Thing is, if the listening socket is using packet timestamping, the
    cookie is encoded with ktime_get() value and checked against
    ktime_get_real(), as done by __net_timestamp().

    The fix is to sctp also use ktime_get_real(), so we can compare bananas
    with bananas later no matter if packet timestamping was enabled or not.

    Fixes: 52db882f3fc2 ("net: sctp: migrate cookie life from timeval to ktime")
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

04 Dec, 2015

3 commits


03 Dec, 2015

1 commit

  • Dmitry Vyukov reported that the user could trigger a kernel warning by
    using a large len value for getsockopt SCTP_GET_LOCAL_ADDRS, as that
    value directly affects the value used as a kmalloc() parameter.

    This patch thus switches the allocation flags from all user-controllable
    kmalloc size to GFP_USER to put some more restrictions on it and also
    disables the warn, as they are not necessary.

    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

02 Dec, 2015

2 commits

  • Dmitry provided a syzkaller (http://github.com/google/syzkaller)
    triggering a fault in sock_wake_async() when async IO is requested.

    Said program stressed af_unix sockets, but the issue is generic
    and should be addressed in core networking stack.

    The problem is that by the time sock_wake_async() is called,
    we should not access the @flags field of 'struct socket',
    as the inode containing this socket might be freed without
    further notice, and without RCU grace period.

    We already maintain an RCU protected structure, "struct socket_wq"
    so moving SOCKWQ_ASYNC_NOSPACE & SOCKWQ_ASYNC_WAITDATA into it
    is the safe route.

    It also reduces number of cache lines needing dirtying, so might
    provide a performance improvement anyway.

    In followup patches, we might move remaining flags (SOCK_NOSPACE,
    SOCK_PASSCRED, SOCK_PASSSEC) to save 8 bytes and let 'struct socket'
    being mostly read and let it being shared between cpus.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • This patch is a cleanup to make following patch easier to
    review.

    Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
    from (struct socket)->flags to a (struct socket_wq)->flags
    to benefit from RCU protection in sock_wake_async()

    To ease backports, we rename both constants.

    Two new helpers, sk_set_bit(int nr, struct sock *sk)
    and sk_clear_bit(int net, struct sock *sk) are added so that
    following patch can change their implementation.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

01 Dec, 2015

1 commit

  • The memory barrier in the helper wq_has_sleeper is needed by just
    about every user of waitqueue_active. This patch generalises it
    by making it take a wait_queue_head_t directly. The existing
    helper is renamed to skwq_has_sleeper.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

16 Nov, 2015

1 commit

  • now sctp auth cannot work well when setting a hmacid manually, which
    is caused by that we didn't use the network order for hmacid, so fix
    it by adding the transformation in sctp_auth_ep_set_hmacs.

    even we set hmacid with the network order in userspace, it still
    can't work, because of this condition in sctp_auth_ep_set_hmacs():

    if (id > SCTP_AUTH_HMAC_ID_MAX)
    return -EOPNOTSUPP;

    so this wasn't working before and thus it won't break compatibility.

    Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Neil Horman
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    lucien