22 Apr, 2017

1 commit


31 Mar, 2017

1 commit

  • When sending a msg without an asoc established, sctp will send an INIT
    packet first and then enqueue chunks.

    Before receiving INIT_ACK, stream info is not yet allocated. But enqueuing
    chunks needs to access stream info, like out stream state and out stream
    cnt.

    This patch fixes it by allocating out stream info when initializing an
    asoc, then allocating in stream and re-allocating out stream when
    processing init.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

20 Feb, 2017

3 commits


10 Feb, 2017

2 commits


08 Feb, 2017

1 commit

  • Add new transport flag to allow sockets to confirm neighbour.
    When same struct dst_entry can be used for many different
    neighbours we can not use it for pending confirmations.
    The flag is propagated from transport to every packet.
    It is reset when cached dst is reset.

    Reported-by: YueHaibing
    Fixes: 5110effee8fd ("net: Do delayed neigh confirmation.")
    Fixes: f2bb4bedf35d ("ipv4: Cache output routes in fib_info nexthops.")
    Signed-off-by: Julian Anastasov
    Acked-by: Eric Dumazet
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Julian Anastasov
     

19 Jan, 2017

2 commits

  • This patch is to add a reconf_enable field in each of asoc, ep and netns
    to indicate if they support stream reset.

    When initializing, asoc reconf_enable gets its default value from ep
    reconf_enable, which in turn defaults to netns reconf_enable.

    It is also to add reconf_capable in asoc peer part to know if peer
    supports reconf_enable, the value is set if ext params have reconf
    chunk support when processing init chunk, just as rfc6525 section
    5.1.1 demands.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • This patch is to add asoc strreset_outseq and strreset_inseq for
    saving the reconf request sequence, initialize them when creating
    the asoc and processing init, and also to define the Incoming and
    Outgoing SSN Reset Request Parameters described in rfc6525 sections
    4.1 and 4.2. As they can be in the same chunk, as rfc6525 section
    3.1-3 describes, they are built in one function.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

17 Jan, 2017

1 commit


07 Jan, 2017

1 commit

  • sctp stream reconf, described in RFC 6525, needs a structure to
    save per stream information in assoc, like stream state.

    In the future, sctp stream scheduler also needs it to save some
    stream scheduler params and queues.

    This patchset is to prepare the stream array in assoc for stream
    reconf. It defines sctp_stream that includes stream arrays inside
    to replace ssnmap.

    Note that we use different structures for IN and OUT streams, as
    the members in per OUT stream will get more and more different
    from per IN stream.

    v1->v2:
    - put these patches into a smaller group.
    v2->v3:
    - define sctp_stream to contain stream arrays, and create stream.c
    to put stream-related functions.
    - merge 3 patches into 1, as the new sctp_stream has the same name
    as before.

    Signed-off-by: Xin Long
    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     

03 Oct, 2016

1 commit


30 Sep, 2016

1 commit

  • Now sctp uses chunk->prsctp_param to save the prsctp param for all the
    prsctp policies, but we didn't need to introduce prsctp_param to sctp_chunk.
    We can just use chunk->sinfo.sinfo_timetolive for RTX and BUF policies,
    and reuse msg->expires_at for TTL policy, as the prsctp policies and old
    expires policy are mutually exclusive.

    This patch is to remove prsctp_param from sctp_chunk, and reuse msg's
    expires_at for TTL and chunk's sinfo.sinfo_timetolive for RTX and BUF
    policies.

    Note that sctp can't use chunk's sinfo.sinfo_timetolive for TTL policy,
    as it needs a u64 variable to save the expires_at time.

    This one also fixes the "netperf-Throughput_Mbps -37.2% regression"
    issue.

    Fixes: a6c2f792873a ("sctp: implement prsctp TTL policy")
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

22 Sep, 2016

1 commit

  • Rename the WORD_TRUNC and WORD_ROUND macros to something more meaningful
    these days, especially because they work on packet headers or lengths,
    which are tied not to any CPU arch but to the protocol itself.

    So, WORD_TRUNC becomes SCTP_TRUNC4 and WORD_ROUND becomes SCTP_PAD4.

    Reported-by: David Laight
    Reported-by: David Miller
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
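
The renamed helpers round a length to a 4-byte boundary, which is what the SCTP wire format uses for chunk and parameter padding. A minimal sketch, assuming the usual mask-based definitions; the macro bodies and the pad_bytes() helper are illustrative and not copied from the patch:

```c
#include <stddef.h>

/* Round a length up to the next multiple of 4 (wire padding). */
#define SCTP_PAD4(s)   (((s) + 3) & ~(size_t)3)

/* Round a length down to a multiple of 4. */
#define SCTP_TRUNC4(s) ((s) & ~(size_t)3)

/* Hypothetical helper: padding bytes that follow a parameter of
 * length len on the wire. */
static size_t pad_bytes(size_t len)
{
	return SCTP_PAD4(len) - len;
}
```

With these definitions a 255-byte parameter occupies 256 bytes on the wire, while a 16-byte one needs no padding.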
     

14 Jul, 2016

1 commit

  • Identifying address family operations during rx path is not something
    expensive but it's ugly to the eye to have it done multiple times,
    especially when we already validated it during initial rx processing.

    This patch takes advantage of the now shared sctp_input_cb and makes the
    pointer to the operations readily available.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

12 Jul, 2016

4 commits

  • prsctp PRIO policy is a policy to abandon lower priority chunks when
    asoc doesn't have enough snd buffer, so that the current chunk with
    higher priority can be queued successfully.

    Similar to TTL/RTX policy, we will set the priority of the chunk to
    prsctp_param with sinfo->sinfo_timetolive in sctp_set_prsctp_policy().
    So if PRIO policy is enabled, msg->expire_at won't work.

    asoc->sent_cnt_removable will record how many chunks can be checked to
    remove. If priority policy is enabled, when the chunk is queued into
    the out_queue, we will increase sent_cnt_removable. When the chunk is
    moved to abandon_queue or dequeue and free, we will decrease
    sent_cnt_removable.

    In sctp_sendmsg, we will check whether there is enough snd buffer for the
    current msg and whether sent_cnt_removable is not 0. If the buffer is
    insufficient and there are removable chunks, sendmsg calls
    sctp_prune_prsctp to abandon chunks from the retransmit/transmitted queue
    and free chunks from out_queue in the right order until the abandon+free
    size > msg_len - sctp_wfree. For the abandoned size, we have to wait until
    it sends FORWARD TSN, receives the sack and the chunks are really freed.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
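
The pruning loop described in the PRIO commit can be sketched roughly as follows. This is a simplified userspace model, not the kernel code: the struct, the function name, and the convention that a larger prio value means lower priority are all assumptions made for illustration, and the stop condition uses "at least needed bytes" instead of the exact comparison in the commit text.

```c
#include <stddef.h>

/* Hypothetical model of a queued chunk. */
struct queued_chunk {
	unsigned int prio;   /* assumption: larger value = lower priority */
	size_t len;          /* bytes this chunk holds in the send buffer */
	int abandoned;       /* set once the chunk has been pruned */
};

/* Abandon queued chunks whose priority is lower than new_prio until at
 * least `needed` bytes have been reclaimed. Returns the bytes reclaimed. */
static size_t prune_lower_prio(struct queued_chunk *q, size_t n,
			       unsigned int new_prio, size_t needed)
{
	size_t freed = 0;

	for (size_t i = 0; i < n && freed < needed; i++) {
		/* Keep chunks already pruned or at least as important. */
		if (q[i].abandoned || q[i].prio <= new_prio)
			continue;
		q[i].abandoned = 1;
		freed += q[i].len;
	}
	return freed;
}
```

A counter like sent_cnt_removable lets the caller skip this scan entirely when no queued chunk is eligible.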
     
  • prsctp RTX policy is a policy to abandon chunks when they are
    retransmitted beyond the max count.

    This patch uses sent_count to count how many times one chunk has
    been sent, and prsctp_param is the max rtx count, which is from
    sinfo->sinfo_timetolive in sctp_set_prsctp_policy(). So similar
    to TTL policy, if RTX policy is enabled, msg->expire_at won't
    work.

    Then in sctp_chunk_abandoned, this patch checks if chunk->sent_count
    is bigger than chunk->prsctp_param to abandon this chunk.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • prsctp TTL policy is a policy to abandon chunks when they expire
    at a specific time in the local stack. It's similar to expires_at
    in struct sctp_datamsg.

    This patch uses sinfo->sinfo_timetolive to set the specific time for
    TTL policy. sinfo->sinfo_timetolive is also used for msg->expires_at.
    So if prsctp_enable or TTL policy is not enabled, msg->expires_at
    still works as before.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
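
The TTL check here and the RTX check from the commit above can be pictured as one abandonment test. A minimal sketch under stated assumptions: the enum, struct and function names are hypothetical, time is an abstract tick count, and "expired" is taken to mean strictly past the deadline.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy selector; not the kernel's flag values. */
enum pr_policy { PR_NONE, PR_TTL, PR_RTX };

struct chunk_state {
	enum pr_policy policy;
	uint64_t expires_at;   /* TTL: absolute deadline, in ticks */
	uint32_t max_rtx;      /* RTX: maximum allowed send count */
	uint32_t sent_count;   /* how many times the chunk was sent */
};

/* Returns true if the chunk should be abandoned at time `now`. */
static bool chunk_abandoned(const struct chunk_state *c, uint64_t now)
{
	switch (c->policy) {
	case PR_TTL:
		return now > c->expires_at;
	case PR_RTX:
		return c->sent_count > c->max_rtx;
	default:
		return false;
	}
}
```

Only one policy applies per chunk, which is why a single field taken from sinfo_timetolive can carry either the lifetime or the retransmission limit.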
     
  • According to section 4.5 of rfc7496, prsctp_enable should be per asoc.
    We will add prsctp_enable to both asoc and ep, and replace the places
    where it used net.sctp->prsctp_enable with asoc->prsctp_enable.

    ep->prsctp_enable will be initialized with net.sctp->prsctp_enable, and
    asoc->prsctp_enable will be initialized with ep->prsctp_enable. We can
    also modify its value through sockopt SCTP_PR_SUPPORTED.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

11 Apr, 2016

1 commit

  • Currently on high rate SCTP streams the heartbeat timer refresh can
    consume quite a lot of resources as timer updates are costly and it
    contains a random factor, which a) is also costly and b) invalidates
    mod_timer() optimization for not editing a timer to the same value.
    It may even cause the timer to be slightly advanced, for no good reason.

    As suggested by David Laight this patch now removes this timer update
    from hot path by leaving the timer on and re-evaluating upon its
    expiration if the heartbeat is still needed or not, similarly to what is
    done for TCP. If it's not needed anymore the timer is re-scheduled to
    the new timeout, considering the time already elapsed.

    For this, we now record the last tx timestamp per transport, updated in
    the same spots as hb timer was restarted on tx. Also split up
    sctp_transport_reset_timers into sctp_transport_reset_t3_rtx and
    sctp_transport_reset_hb_timer, so we can re-arm T3 without re-arming the
    heartbeat one.

    On loopback with MTU of 65535 and data chunks with 1636, so that we
    have a considerable amount of chunks without stressing system calls,
    netperf -t SCTP_STREAM -l 30, perf looked like this before:

    Samples: 103K of event 'cpu-clock', Event count (approx.): 25833000000
    Overhead Command Shared Object Symbol
    + 6,15% netperf [kernel.vmlinux] [k] copy_user_enhanced_fast_string
    - 5,43% netperf [kernel.vmlinux] [k] _raw_write_unlock_irqrestore
    - _raw_write_unlock_irqrestore
    - 96,54% _raw_spin_unlock_irqrestore
    - 36,14% mod_timer
    + 97,24% sctp_transport_reset_timers
    + 2,76% sctp_do_sm
    + 33,65% __wake_up_sync_key
    + 28,77% sctp_ulpq_tail_event
    + 1,40% del_timer
    - 1,84% mod_timer
    + 99,03% sctp_transport_reset_timers
    + 0,97% sctp_do_sm
    + 1,50% sctp_ulpq_tail_event

    And after this patch, now with netperf -l 60:

    Samples: 230K of event 'cpu-clock', Event count (approx.): 57707250000
    Overhead Command Shared Object Symbol
    + 5,65% netperf [kernel.vmlinux] [k] memcpy_erms
    + 5,59% netperf [kernel.vmlinux] [k] copy_user_enhanced_fast_string
    - 5,05% netperf [kernel.vmlinux] [k] _raw_spin_unlock_irqrestore
    - _raw_spin_unlock_irqrestore
    + 49,89% __wake_up_sync_key
    + 45,68% sctp_ulpq_tail_event
    - 2,85% mod_timer
    + 76,51% sctp_transport_reset_t3_rtx
    + 23,49% sctp_do_sm
    + 1,55% del_timer
    + 2,50% netperf [sctp] [k] sctp_datamsg_from_user
    + 2,26% netperf [sctp] [k] sctp_sendmsg

    Throughput-wise, from 6800mbps without the patch to 7050mbps with it,
    ~3.7%.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
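
The lazy re-arm scheme described above can be sketched as follows. This is a userspace model with abstract tick values; the struct and function names are hypothetical, and the random jitter the real timer applies is left out.

```c
#include <stdbool.h>
#include <stdint.h>

struct hb_state {
	uint64_t last_tx;      /* time of the last transmission */
	uint64_t hb_interval;  /* idle time after which a heartbeat is due */
};

/* Hot path: on every transmit, only record the timestamp.
 * The timer itself is left untouched. */
static void hb_note_tx(struct hb_state *t, uint64_t now)
{
	t->last_tx = now;
}

/* Timer expiry at time `now`. Returns true if a heartbeat is due.
 * Otherwise stores in *next_expiry when the timer should fire again,
 * which accounts for the idle time already elapsed. */
static bool hb_timer_expired(const struct hb_state *t, uint64_t now,
			     uint64_t *next_expiry)
{
	uint64_t due = t->last_tx + t->hb_interval;

	if (now >= due)
		return true;
	*next_expiry = due;
	return false;
}
```

The cost of touching the timer moves from once per transmitted packet to once per expiry, which matches the drop of mod_timer in the second profile.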
     

20 Mar, 2016

2 commits

  • David S. Miller
     
  • Pull networking updates from David Miller:
    "Highlights:

    1) Support more Realtek wireless chips, from Jes Sorenson.

    2) New BPF types for per-cpu hash and array maps, from Alexei
    Starovoitov.

    3) Make several TCP sysctls per-namespace, from Nikolay Borisov.

    4) Allow the use of SO_REUSEPORT in order to do per-thread processing
    of incoming TCP/UDP connections. The muxing can be done using a
    BPF program which hashes the incoming packet. From Craig Gallek.

    5) Add a multiplexer for TCP streams, to provide a message-based
    interface. BPF programs can be used to determine the message
    boundaries. From Tom Herbert.

    6) Add 802.1AE MACSEC support, from Sabrina Dubroca.

    7) Avoid factorial complexity when taking down an inetdev interface
    with lots of configured addresses. We were doing things like
    traversing the entire address list for each address removed, and
    flushing the entire netfilter conntrack table for every address as
    well.

    8) Add and use SKB bulk free infrastructure, from Jesper Brouer.

    9) Allow offloading u32 classifiers to hardware, and implement for
    ixgbe, from John Fastabend.

    10) Allow configuring IRQ coalescing parameters on a per-queue basis,
    from Kan Liang.

    11) Extend ethtool so that larger link mode masks can be supported.
    From David Decotigny.

    12) Introduce devlink, which can be used to configure port link types
    (ethernet vs Infiniband, etc.), port splitting, and switch device
    level attributes as a whole. From Jiri Pirko.

    13) Hardware offload support for flower classifiers, from Amir Vadai.

    14) Add "Local Checksum Offload". Basically, for a tunneled packet
    the checksum of the outer header is 'constant' (because with the
    checksum field filled into the inner protocol header, the payload
    of the outer frame checksums to 'zero'), and we can take advantage
    of that in various ways. From Edward Cree"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
    bonding: fix bond_get_stats()
    net: bcmgenet: fix dma api length mismatch
    net/mlx4_core: Fix backward compatibility on VFs
    phy: mdio-thunder: Fix some Kconfig typos
    lan78xx: add ndo_get_stats64
    lan78xx: handle statistics counter rollover
    RDS: TCP: Remove unused constant
    RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
    net: smc911x: convert pxa dma to dmaengine
    team: remove duplicate set of flag IFF_MULTICAST
    bonding: remove duplicate set of flag IFF_MULTICAST
    net: fix a comment typo
    ethernet: micrel: fix some error codes
    ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
    bpf, dst: add and use dst_tclassid helper
    bpf: make skb->tc_classid also readable
    net: mvneta: bm: clarify dependencies
    cls_bpf: reset class and reuse major in da
    ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
    ldmvsw: Add ldmvsw.c driver code
    ...

    Linus Torvalds
     

14 Mar, 2016

1 commit

  • Currently sctp_sendmsg() triggers some calls that will allocate memory
    with GFP_ATOMIC even when not necessary. In the case of
    sctp_packet_transmit it will allocate a linear skb that will be used to
    construct the packet and this may cause sends to fail due to ENOMEM more
    often than anticipated, especially with big MTUs.

    This patch thus allows it to inherit gfp flags from upper calls so that
    it can use GFP_KERNEL if it was triggered by a sctp_sendmsg call or
    similar. All others, like retransmits or flushes started from BH, are
    still allocated using GFP_ATOMIC.

    In netperf tests this didn't result in any performance drawbacks when
    memory is not too fragmented and made it trigger ENOMEM way less often.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

09 Mar, 2016

1 commit

  • Dmitry reported that sctp_add_bind_addr may read more bytes than
    expected in case the parameter is an IPv4 addr supplied by the user
    through calls such as sctp_bindx_add(), because it always copies
    sizeof(union sctp_addr) while the buffer may be just a struct
    sockaddr_in, which is smaller.

    This patch then fixes it by limiting the memcpy to the min between the
    union size and a (new parameter) provided addr size. Where possible this
    parameter still is the size of that union, except when reading from
    user-provided buffers, where it accounts for the protocol type.

    Reported-by: Dmitry Vyukov
    Tested-by: Dmitry Vyukov
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
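
The bounded copy described above can be sketched in userspace like this. The union and the function name are stand-ins chosen for illustration (the kernel uses union sctp_addr and a different signature), so treat it as a sketch of the idea only.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Stand-in for a union large enough to hold either address family. */
union addr_any {
	struct sockaddr_in v4;
	struct sockaddr_in6 v6;
};

/* Copy at most src_len bytes, and never more than the union holds.
 * Returns the number of bytes copied. */
static size_t copy_addr(union addr_any *dst, const void *src, size_t src_len)
{
	size_t n = src_len < sizeof(*dst) ? src_len : sizeof(*dst);

	memset(dst, 0, sizeof(*dst));   /* no stale bytes past the copy */
	memcpy(dst, src, n);
	return n;
}
```

A caller holding only a struct sockaddr_in passes sizeof(struct sockaddr_in), so the copy stops at the end of the caller's buffer instead of reading the full union size.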
     

27 Jan, 2016

1 commit


06 Dec, 2015

1 commit

  • SCTP echoes a cookie on INIT ACK chunks that contains a timestamp, for
    detecting stale cookies. This cookie is echoed back to the server by the
    client and then that timestamp is checked.

    Thing is, if the listening socket is using packet timestamping, the
    cookie is encoded with ktime_get() value and checked against
    ktime_get_real(), as done by __net_timestamp().

    The fix is for sctp to also use ktime_get_real(), so we can compare bananas
    with bananas later no matter if packet timestamping was enabled or not.

    Fixes: 52db882f3fc2 ("net: sctp: migrate cookie life from timeval to ktime")
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
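
Why mixing the two clocks breaks the staleness check can be shown with a small model. The function below is hypothetical and works on plain nanosecond values; it only illustrates that a timestamp taken from a since-boot clock and one taken from a wall clock are not comparable.

```c
#include <stdbool.h>
#include <stdint.h>

/* A cookie is stale when more than lifetime_ns has passed since it was
 * created. Both timestamps must come from the same clock. */
static bool cookie_is_stale(int64_t created_ns, int64_t now_ns,
			    int64_t lifetime_ns)
{
	return now_ns - created_ns > lifetime_ns;
}
```

If created_ns is a monotonic value (a few seconds since boot) and now_ns is a wall-clock value (seconds since 1970), the difference is decades, so a cookie created an instant ago is judged stale.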
     

05 Oct, 2015

1 commit

  • We want to avoid using time_t in the kernel because of the y2038
    overflow problem. The use in sctp is not for storing seconds at
    all, but instead uses microseconds and is passed as 32-bit
    on all machines.

    This patch changes the type to u32, which better fits the use.

    Signed-off-by: Arnd Bergmann
    Cc: Vlad Yasevich
    Cc: Neil Horman
    Cc: linux-sctp@vger.kernel.org
    Acked-by: Neil Horman
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

29 Aug, 2015

1 commit

  • RFC 5061:
    This is an opaque integer assigned by the sender to identify each
    request parameter. The receiver of the ASCONF Chunk will copy this
    32-bit value into the ASCONF Response Correlation ID field of the
    ASCONF-ACK response parameter. The sender of the ASCONF can use this
    same value in the ASCONF-ACK to find which request the response is
    for. Note that the receiver MUST NOT change this 32-bit value.

    Address Parameter: TLV

    This field contains an IPv4 or IPv6 address parameter, as described
    in Section 3.3.2.1 of [RFC4960].

    ASCONF chunk with Error Cause Indication Parameter (Unresolvable Address)
    should be sent if the Delete IP Address is not part of the association.

    Endpoint A Endpoint B
    (ESTABLISHED) (ESTABLISHED)

    ASCONF ----------------->
    (Delete IP Address)

    Acked-by: Vlad Yasevich
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    lucien
     

28 Aug, 2015

1 commit

  • In sctp_process_asconf(), we get the address parameter from the beginning
    of the addip params, but we never check that it is really there. If the
    addr param is missing, the chunk can still pass sctp_verify_asconf() and
    then be handled by sctp_process_asconf(), which is not safe.

    So add a check in sctp_verify_asconf() that the address parameter is at
    the beginning, or return false to send an abort.

    Note that this can also detect multiple address parameters and reject them.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    lucien
     

31 Jan, 2015

1 commit

  • When making use of RFC5061, section 4.2.4. for setting the primary IP
    address, we're passing a wrong parameter header to param_type2af(),
    resulting always in NULL being returned.

    At this point, param.p points to a sctp_addip_param struct, containing
    a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation
    id. Followed by that, as also presented in RFC5061 section 4.2.4., comes
    the actual sctp_addr_param, which also contains a sctp_paramhdr, but
    this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that
    param_type2af() can make use of. Since we already hold a pointer to
    addr_param from previous line, just reuse it for param_type2af().

    Fixes: d6de3097592b ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
    Signed-off-by: Saran Maruti Ramanara
    Signed-off-by: Daniel Borkmann
    Acked-by: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Saran Maruti Ramanara
     

24 Nov, 2014

2 commits


12 Nov, 2014

1 commit

  • An SCTP server doing ASCONF will panic on malformed INIT ping-of-death
    in the form of:

    ------------ INIT[PARAM: SET_PRIMARY_IP] ------------>

    While the INIT chunk parameter verification dissects through many things
    in order to detect malformed input, it fails to actually check parameters
    inside of parameters. E.g. RFC5061, section 4.2.4 proposes a 'set primary
    IP address' parameter in ASCONF, which has as a subparameter an address
    parameter.

    So an attacker may send a parameter type other than SCTP_PARAM_IPV4_ADDRESS
    or SCTP_PARAM_IPV6_ADDRESS, param_type2af() will subsequently return 0
    and thus sctp_get_af_specific() returns NULL, too, which we then happily
    dereference unconditionally through af->from_addr_param().

    The trace for the log:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
    IP: [] sctp_process_init+0x492/0x990 [sctp]
    PGD 0
    Oops: 0000 [#1] SMP
    [...]
    Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1 Bochs Bochs
    RIP: 0010:[] [] sctp_process_init+0x492/0x990 [sctp]
    [...]
    Call Trace:

    [] ? sctp_bind_addr_copy+0x5d/0xe0 [sctp]
    [] sctp_sf_do_5_1B_init+0x21b/0x340 [sctp]
    [] sctp_do_sm+0x71/0x1210 [sctp]
    [] ? sctp_endpoint_lookup_assoc+0xc9/0xf0 [sctp]
    [] sctp_endpoint_bh_rcv+0x116/0x230 [sctp]
    [] sctp_inq_push+0x56/0x80 [sctp]
    [] sctp_rcv+0x982/0xa10 [sctp]
    [] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
    [] ? nf_iterate+0x69/0xb0
    [] ? ip_local_deliver_finish+0x0/0x2d0
    [] ? nf_hook_slow+0x76/0x120
    [] ? ip_local_deliver_finish+0x0/0x2d0
    [...]

    A minimal way to address this is to check for NULL as we do on all
    other such occasions where we know sctp_get_af_specific() could
    possibly return with NULL.

    Fixes: d6de3097592b ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
    Signed-off-by: Daniel Borkmann
    Cc: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Daniel Borkmann
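
The failure mode and the minimal fix can be modelled in a few lines. Everything below is a hypothetical userspace stand-in for the kernel lookup: only the parameter type values 5 and 6 come from RFC 4960, the rest is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define PARAM_IPV4_ADDRESS 5   /* RFC 4960 IPv4 Address parameter type */
#define PARAM_IPV6_ADDRESS 6   /* RFC 4960 IPv6 Address parameter type */

/* Hypothetical per-family operations table. */
struct af_ops {
	int family;
	size_t addr_len;
};

static const struct af_ops v4_ops = { 4, 4 };
static const struct af_ops v6_ops = { 6, 16 };

/* Returns NULL for any parameter type that is not an address. */
static const struct af_ops *af_for_param(uint16_t type)
{
	switch (type) {
	case PARAM_IPV4_ADDRESS:
		return &v4_ops;
	case PARAM_IPV6_ADDRESS:
		return &v6_ops;
	default:
		return NULL;
	}
}

/* Returns 0 on success, -1 if the embedded parameter is not an address. */
static int process_set_primary(uint16_t inner_type, size_t *addr_len)
{
	const struct af_ops *af = af_for_param(inner_type);

	if (af == NULL)        /* the kind of check the fix adds */
		return -1;
	*addr_len = af->addr_len;
	return 0;
}
```

Without the NULL test, an attacker-chosen inner type reaches the af->addr_len dereference directly, which is the crash shown in the trace.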
     

15 Oct, 2014

1 commit

  • Commit 6f4c618ddb0 ("SCTP : Add paramters validity check for
    ASCONF chunk") added basic verification of ASCONF chunks, however,
    it is still possible to remotely crash a server by sending a
    specially crafted ASCONF chunk, even up to pre 2.6.12 kernels:

    skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768
    head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950
    end:0x440 dev:
    ------------[ cut here ]------------
    kernel BUG at net/core/skbuff.c:129!
    [...]
    Call Trace:

    [] skb_put+0x5c/0x70
    [] sctp_addto_chunk+0x63/0xd0 [sctp]
    [] sctp_process_asconf+0x1af/0x540 [sctp]
    [] ? _read_unlock_bh+0x15/0x20
    [] sctp_sf_do_asconf+0x168/0x240 [sctp]
    [] sctp_do_sm+0x71/0x1210 [sctp]
    [] ? fib_rules_lookup+0xad/0xf0
    [] ? sctp_cmp_addr_exact+0x32/0x40 [sctp]
    [] sctp_assoc_bh_rcv+0xd3/0x180 [sctp]
    [] sctp_inq_push+0x56/0x80 [sctp]
    [] sctp_rcv+0x982/0xa10 [sctp]
    [] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
    [] ? nf_iterate+0x69/0xb0
    [] ? ip_local_deliver_finish+0x0/0x2d0
    [] ? nf_hook_slow+0x76/0x120
    [] ? ip_local_deliver_finish+0x0/0x2d0
    [] ip_local_deliver_finish+0xdd/0x2d0
    [] ip_local_deliver+0x98/0xa0
    [] ip_rcv_finish+0x12d/0x440
    [] ip_rcv+0x275/0x350
    [] __netif_receive_skb+0x4ab/0x750
    [] netif_receive_skb+0x58/0x60

    This can be triggered e.g., through a simple scripted nmap
    connection scan injecting the chunk after the handshake, for
    example, ...

    -------------- INIT[ASCONF; ASCONF_ACK] ------------->

    ... where ASCONF chunk of length 280 contains 2 parameters ...

    1) Add IP address parameter (param length: 16)
    2) Add/del IP address parameter (param length: 255)

    ... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the
    Address Parameter in the ASCONF chunk is even missing, too.
    This is just an example and similarly-crafted ASCONF chunks
    could be used just as well.

    The ASCONF chunk passes through sctp_verify_asconf() as all
    parameters passed sanity checks, and after walking, we ended
    up successfully at the chunk end boundary, and thus may invoke
    sctp_process_asconf(). Parameter walking is done with
    WORD_ROUND() to take padding into account.

    In sctp_process_asconf()'s TLV processing, we may fail in
    sctp_process_asconf_param() e.g., due to removal of the IP
    address that is also the source address of the packet containing
    the ASCONF chunk, and thus we need to add all TLVs after the
    failure to our ASCONF response to remote via helper function
    sctp_add_asconf_response(), which basically invokes a
    sctp_addto_chunk() adding the error parameters to the given
    skb.

    When walking to the next parameter this time, we proceed
    with ...

    length = ntohs(asconf_param->param_hdr.length);
    asconf_param = (void *)asconf_param + length;

    ... instead of the WORD_ROUND()'ed length, thus resulting here
    in an off-by-one that leads to reading the follow-up garbage
    parameter length of 12336, and thus throwing an skb_over_panic
    for the reply when trying to sctp_addto_chunk() next time,
    which implicitly calls the skb_put() with that length.

    Fix it by using sctp_walk_params() [ which is also used in
    INIT parameter processing ] macro in the verification *and*
    in ASCONF processing: it will make sure we don't spill over,
    that we walk parameters WORD_ROUND()'ed. Moreover, we're being
    more defensive and guard against unknown parameter types and
    missized addresses.

    Joint work with Vlad Yasevich.

    Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.")
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Daniel Borkmann
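
The difference between stepping by the raw length and stepping by the padded length is easiest to see in a small TLV walker. This is a simplified userspace sketch, not sctp_walk_params(): the function names are hypothetical and each parameter is assumed to be a 2-byte type, a 2-byte big-endian length that includes the 4-byte header, then the value.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 16-bit value. */
static uint16_t be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Count the parameters in buf[0..len), advancing by the padded length.
 * Returns -1 if a parameter is malformed or overruns the buffer. */
static int count_params(const uint8_t *buf, size_t len)
{
	size_t off = 0;
	int n = 0;

	while (off < len) {
		size_t plen, padded;

		if (len - off < 4)
			return -1;              /* no room for a header */
		plen = be16(buf + off + 2);
		if (plen < 4 || plen > len - off)
			return -1;              /* bogus or overlong */
		n++;
		padded = (plen + 3) & ~(size_t)3;
		if (padded > len - off)         /* last one may lack padding */
			padded = len - off;
		off += padded;
	}
	return n;
}
```

Stepping by the unpadded length instead would land the next header read inside the padding of a 255-byte parameter, one byte early, so whatever bytes sit there get interpreted as the next length.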
     

12 Jun, 2014

1 commit


19 Apr, 2014

1 commit

  • Currently, it is possible to create an SCTP socket, then switch
    auth_enable via sysctl setting to 1 and crash the system on connect:

    Oops[#1]:
    CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.1-mipsgit-20140415 #1
    task: ffffffff8056ce80 ti: ffffffff8055c000 task.ti: ffffffff8055c000
    [...]
    Call Trace:
    [] sctp_auth_asoc_set_default_hmac+0x68/0x80
    [] sctp_process_init+0x5e0/0x8a4
    [] sctp_sf_do_5_1B_init+0x234/0x34c
    [] sctp_do_sm+0xb4/0x1e8
    [] sctp_endpoint_bh_rcv+0x1c4/0x214
    [] sctp_rcv+0x588/0x630
    [] sctp6_rcv+0x10/0x24
    [] ip6_input+0x2c0/0x440
    [] __netif_receive_skb_core+0x4a8/0x564
    [] process_backlog+0xb4/0x18c
    [] net_rx_action+0x12c/0x210
    [] __do_softirq+0x17c/0x2ac
    [] irq_exit+0x54/0xb0
    [] ret_from_irq+0x0/0x4
    [] rm7k_wait_irqoff+0x24/0x48
    [] cpu_startup_entry+0xc0/0x148
    [] start_kernel+0x37c/0x398
    Code: dd0900b8 000330f8 0126302d 50c0fff1 0047182a a48306a0
    03e00008 00000000
    ---[ end trace b530b0551467f2fd ]---
    Kernel panic - not syncing: Fatal exception in interrupt

    What happens while auth_enable=0 in that case is, that
    ep->auth_hmacs is initialized to NULL in sctp_auth_init_hmacs()
    when endpoint is being created.

    After that point, if an admin switches over to auth_enable=1,
    the machine can crash due to NULL pointer dereference during
    reception of an INIT chunk. When we enter sctp_process_init()
    via sctp_sf_do_5_1B_init() in order to respond to an INIT chunk,
    the INIT verification succeeds and while we walk and process
    all INIT params via sctp_process_param() we find that
    net->sctp.auth_enable is set, therefore do not fall through,
    but invoke sctp_auth_asoc_set_default_hmac() instead, and thus,
    dereference what we have set to NULL during endpoint
    initialization phase.

    The fix is to make auth_enable immutable by caching its value
    during endpoint initialization, so that its original value is
    being carried along until destruction. The bug seems to originate
    from the very first days.

    Fix in joint work with Daniel Borkmann.

    Reported-by: Joshua Kinard
    Signed-off-by: Vlad Yasevich
    Signed-off-by: Daniel Borkmann
    Acked-by: Neil Horman
    Tested-by: Joshua Kinard
    Signed-off-by: David S. Miller

    Vlad Yasevich
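
The idea of the fix, taking a snapshot of the setting at creation time and consulting only the snapshot afterwards, can be sketched like this. All names are hypothetical and the string stands in for whatever state is only set up when auth is enabled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Global knob that an admin may flip at any time. */
static bool sysctl_auth_enable;

struct endpoint {
	bool auth_enable;    /* snapshot taken at creation */
	const char *hmacs;   /* only set up if auth was enabled then */
};

static void endpoint_init(struct endpoint *ep)
{
	ep->auth_enable = sysctl_auth_enable;
	ep->hmacs = ep->auth_enable ? "hmac-list" : NULL;
}

/* Later code consults the cached value, never the global, so the
 * answer always agrees with how the endpoint was initialized. */
static bool endpoint_uses_auth(const struct endpoint *ep)
{
	return ep->auth_enable;
}
```

An endpoint created while the knob was off keeps reporting "no auth" after the knob is turned on, so its NULL hmacs pointer is never dereferenced.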
     

06 Mar, 2014

1 commit

  • While working on ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to
    verify if we/peer is AUTH capable"), we noticed that there's a skb
    memory leakage in the error path.

    Running the same reproducer as in ec0223ec48a9 and by unconditionally
    jumping to the error label (to simulate an error condition) in
    sctp_sf_do_5_1D_ce() receive path lets kmemleak detector bark about
    the unfreed chunk->auth_chunk skb clone:

    Unreferenced object 0xffff8800b8f3a000 (size 256):
    comm "softirq", pid 0, jiffies 4294769856 (age 110.757s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    89 ab 75 5e d4 01 58 13 00 00 00 00 00 00 00 00 ..u^..X.........
    backtrace:
    [] kmemleak_alloc+0x4e/0xb0
    [] kmem_cache_alloc+0xc8/0x210
    [] skb_clone+0x49/0xb0
    [] sctp_endpoint_bh_rcv+0x1d9/0x230 [sctp]
    [] sctp_inq_push+0x4c/0x70 [sctp]
    [] sctp_rcv+0x82e/0x9a0 [sctp]
    [] ip_local_deliver_finish+0xa8/0x210
    [] nf_reinject+0xbf/0x180
    [] nfqnl_recv_verdict+0x1d2/0x2b0 [nfnetlink_queue]
    [] nfnetlink_rcv_msg+0x14b/0x250 [nfnetlink]
    [] netlink_rcv_skb+0xa9/0xc0
    [] nfnetlink_rcv+0x23f/0x408 [nfnetlink]
    [] netlink_unicast+0x168/0x250
    [] netlink_sendmsg+0x2e1/0x3f0
    [] sock_sendmsg+0x8b/0xc0
    [] ___sys_sendmsg+0x369/0x380

    What happens is that commit bbd0d59809f9 clones the skb containing
    the AUTH chunk in sctp_endpoint_bh_rcv() when having the edge case
    that an endpoint requires COOKIE-ECHO chunks to be authenticated:

    ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->

    auth_chunk, we could hit the "goto nomem_init" path from
    an error condition and thus leave the cloned skb around w/o
    freeing it.

    The fix is to centrally free such clones in sctp_chunk_destroy()
    handler that is invoked from sctp_chunk_free() after all refs have
    dropped; and also move both kfree_skb(chunk->auth_chunk) there,
    so that chunk->auth_chunk is either NULL (since sctp_chunkify()
    allocs new chunks through kmem_cache_zalloc()) or non-NULL with
    a valid skb pointer. chunk->skb and chunk->auth_chunk are the
    only skbs in the sctp_chunk structure that need to be handled.

    While at it, we should use consume_skb() for both. It is the same
    as dev_kfree_skb() but more appropriately named as we are not
    a device but a protocol. Also, this effectively replaces the
    kfree_skb() from both invocations into consume_skb(). Functions
    are the same only that kfree_skb() assumes that the frame was
    being dropped after a failure (e.g. for tools like drop monitor),
    usage of consume_skb() seems more appropriate in function
    sctp_chunk_destroy() though.

    Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk")
    Signed-off-by: Daniel Borkmann
    Cc: Vlad Yasevich
    Cc: Neil Horman
    Acked-by: Vlad Yasevich
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Daniel Borkmann
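
The "free everything in one destroy function" pattern from the fix can be sketched in userspace as follows. The struct and function names are stand-ins, with calloc() playing the role of a zeroing allocator and free() the role of the skb release call.

```c
#include <stdlib.h>

struct chunk {
	void *skb;          /* main buffer */
	void *auth_chunk;   /* optional clone, NULL if never set */
};

/* Zeroed allocation guarantees auth_chunk starts out NULL. */
static struct chunk *chunk_new(void)
{
	return calloc(1, sizeof(struct chunk));
}

/* Single place where both buffers are released, so no error path
 * has to remember the optional clone. */
static void chunk_destroy(struct chunk *c)
{
	if (c == NULL)
		return;
	free(c->auth_chunk);   /* free(NULL) is a no-op */
	free(c->skb);
	free(c);
}
```

Because the pointer is either NULL or valid, error paths only need to drop their reference to the chunk; the clone goes away with it.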
     

14 Jan, 2014

1 commit


27 Dec, 2013

1 commit