09 Jul, 2019

1 commit

  • Processes can request ipv6 flowlabels with cmsg IPV6_FLOWINFO.
    If not set, by default an autogenerated flowlabel is selected.

    Explicit flowlabels require a control operation per label plus a
    datapath check on every connection (every datagram if unconnected).
    This is particularly expensive on unconnected sockets multiplexing
    many flows, such as QUIC.

    In the common case, where no lease is exclusive, the check can be
    safely elided, as both lease request and check trivially succeed.
    Indeed, autoflowlabel does the same even with exclusive leases.

    Elide the check if no process has requested an exclusive lease.

    fl6_sock_lookup previously returned either a reference to a lease or
    NULL to denote failure. Modify it to return a real error and update
    all callers. On a NULL return, they can use the label and will elide
    the atomic_dec in fl6_sock_release.

    This is an optimization. Robust applications still have to revert to
    requesting leases if the fast path fails due to an exclusive lease.

    Changes RFC->v1:
    - use static_key_false_deferred to rate limit jump label operations
    - call static_key_deferred_flush to stop timers on exit
    - move decrement out of RCU context
    - defer optimization also if opt data is associated with a lease
    - updated all fl6_sock_lookup callers, not just udp

    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
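
The lookup contract described in this commit (NULL means "no lease needed", an error pointer means failure, anything else is a held lease) can be sketched in userspace C. This is only an illustration under stated assumptions: `lease_lookup`, `lease_release`, `exclusive_leases`, and the `err_ptr` helpers are hypothetical stand-ins, not the kernel's fl6_sock_lookup or its static key.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers (assumed shape). */
#define MAX_ERRNO 4095
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }

struct lease {
	uint32_t label;
	int users;	/* reference count held by callers */
};

/* Hypothetical counter standing in for the deferred static key. */
static int exclusive_leases;

/*
 * Returns NULL when no exclusive lease exists (fast path: the caller may
 * use the label without holding a lease), a lease with its count bumped
 * on a match, or an error pointer when the label has no lease.
 */
struct lease *lease_lookup(struct lease *table, size_t n, uint32_t label)
{
	assert(table != NULL || n == 0);

	if (exclusive_leases == 0)
		return NULL;

	for (size_t i = 0; i < n; i++) {
		if (table[i].label == label) {
			table[i].users++;
			return &table[i];
		}
	}
	return err_ptr(-ENOENT);
}

/* Releasing NULL is a no-op, which is how the fast path skips the decrement. */
void lease_release(struct lease *l)
{
	if (l)
		l->users--;
}
```

A caller would check `is_err()` first, then treat NULL and a real lease alike until release.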
     

24 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this sctp implementation is free software you can redistribute it
    and or modify it under the terms of the gnu general public license
    as published by the free software foundation either version 2 or at
    your option any later version this sctp implementation is
    distributed in the hope that it will be useful but without any
    warranty without even the implied warranty of merchantability or
    fitness for a particular purpose see the gnu general public license
    for more details you should have received a copy of the gnu general
    public license along with gnu cc see the file copying if not see
    http www gnu org licenses

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 42 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Kate Stewart
    Reviewed-by: Richard Fontana
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190523091649.683323110@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

20 Apr, 2019

1 commit

  • The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
    socket protocol handlers, and all of those end up calling the same
    sock_get_timestamp()/sock_get_timestampns() helper functions, which
    results in a lot of duplicate code.

    With the introduction of 64-bit time_t on 32-bit architectures, this
    gets worse, as we then need four different ioctl commands in each
    socket protocol implementation.

    To simplify that, let's add a new .gettstamp() operation in
    struct proto_ops, and move ioctl implementation into the common
    sock_ioctl()/compat_sock_ioctl_trans() functions that these all go
    through.

    We can reuse the sock_get_timestamp() implementation, but generalize
    it so it can deal with both native and compat mode, as well as
    timeval and timespec structures.

    Acked-by: Stefan Schmidt
    Acked-by: Neil Horman
    Acked-by: Marc Kleine-Budde
    Link: https://lore.kernel.org/lkml/CAK8P3a038aDQQotzua_QtKGhq8O9n+rdiz2=WDCp82ys8eUT+A@mail.gmail.com/
    Signed-off-by: Arnd Bergmann
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Arnd Bergmann
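
The generalization described in this commit, one helper that serves timeval and timespec in both native and compat layouts, might look roughly like the sketch below. The struct layouts and the name `demo_fill_stamp` are assumptions made for illustration and do not reproduce the kernel's sock_get_timestamp() signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit and 32-bit output layouts (seconds plus a fraction). */
struct demo_stamp64 { int64_t sec; int64_t frac; };
struct demo_stamp32 { int32_t sec; int32_t frac; };

/*
 * Split a nanosecond timestamp into seconds plus a fraction.
 * timeval selects microseconds instead of nanoseconds for the fraction;
 * time32 selects the narrow layout for the output buffer.
 */
void demo_fill_stamp(int64_t ns, bool timeval, bool time32, void *out)
{
	assert(ns >= 0 && out != NULL);

	int64_t sec = ns / 1000000000;
	int64_t frac = ns % 1000000000;

	if (timeval)
		frac /= 1000;	/* nanoseconds to microseconds */

	if (time32) {
		struct demo_stamp32 *o = out;
		o->sec = (int32_t)sec;
		o->frac = (int32_t)frac;
	} else {
		struct demo_stamp64 *o = out;
		o->sec = sec;
		o->frac = frac;
	}
}
```

With a single helper like this, each of the four ioctl variants becomes a different pair of flags rather than a separate per-protocol implementation.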
     

25 Jan, 2019

1 commit

  • Now sctp_transport_pmtu() passes transport->saddr into .get_dst() to set
    flow sport from 'saddr'. However, transport->saddr is set only when
    transport->dst exists in sctp_transport_route().

    If sctp_transport_pmtu() is called without transport->saddr set, as
    when transport->dst doesn't exist, the flow sport will be set to 0
    from transport->saddr, which will cause the wrong route to be selected.

    Commit 6e91b578bf3f ("sctp: re-use sctp_transport_pmtu in
    sctp_transport_route") made the issue easier to trigger,
    since sctp_transport_pmtu() is called in sctp_transport_route()
    after that commit.

    In general, fl4->fl4_sport should always be set to
    htons(asoc->base.bind_addr.port), unless transport->asoc doesn't exist
    in sctp_v4/6_get_dst(), which is the case for:

    sctp_ootb_pkt_new() ->
    sctp_transport_route()

    For that, we can simply handle it by setting flow sport from saddr only
    when it's 0 in sctp_v4/6_get_dst().

    Fixes: 6e91b578bf3f ("sctp: re-use sctp_transport_pmtu in sctp_transport_route")
    Reported-by: Ying Xu
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
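
The selection rule from this commit (prefer the association's bound port, fall back to the source address port only when the flow's port is still 0) can be sketched as follows. `demo_asoc`, `demo_flow`, and `demo_set_sport` are hypothetical names for illustration, not the kernel structures.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_asoc { uint16_t bind_port; };	/* host byte order */
struct demo_flow { uint16_t sport; };		/* network byte order */

/*
 * Set the flow source port from the association when there is one,
 * otherwise from the source address, and only if it is still unset.
 */
void demo_set_sport(struct demo_flow *fl, const struct demo_asoc *asoc,
		    uint16_t saddr_port_be)
{
	assert(fl != NULL);

	fl->sport = asoc ? htons(asoc->bind_port) : 0;
	if (!fl->sport)
		fl->sport = saddr_port_be;
}
```

The fallback only matters on paths with no association, such as the out-of-the-blue packet path named in the commit message.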
     

17 Jan, 2019

1 commit

  • An issue similar to the one fixed in commit 4a2eb0c37b47 ("sctp: initialize
    sin6_flowinfo for ipv6 addrs in sctp_inet6addr_event") also exists
    in sctp_inetaddr_event, as Alexander noticed.

    To fix it, allocate sctp_sockaddr_entry with kzalloc for both sctp
    ipv4 and ipv6 addresses, as is done in sctp_v4/6_copy_addrlist().

    Reported-by: Alexander Potapenko
    Signed-off-by: Xin Long
    Reported-by: syzbot+ae0c70c0c2d40c51bb92@syzkaller.appspotmail.com
    Acked-by: Marcelo Ricardo Leitner
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Xin Long
     

21 Dec, 2018

1 commit


11 Dec, 2018

1 commit

  • syzbot reported a kernel-infoleak, which is caused by an uninitialized
    field (sin6_flowinfo) of addr->a.v6 in sctp_inet6addr_event().
    The call trace is as follows:

    BUG: KMSAN: kernel-infoleak in _copy_to_user+0x19a/0x230 lib/usercopy.c:33
    CPU: 1 PID: 8164 Comm: syz-executor2 Not tainted 4.20.0-rc3+ #95
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x32d/0x480 lib/dump_stack.c:113
    kmsan_report+0x12c/0x290 mm/kmsan/kmsan.c:683
    kmsan_internal_check_memory+0x32a/0xa50 mm/kmsan/kmsan.c:743
    kmsan_copy_to_user+0x78/0xd0 mm/kmsan/kmsan_hooks.c:634
    _copy_to_user+0x19a/0x230 lib/usercopy.c:33
    copy_to_user include/linux/uaccess.h:183 [inline]
    sctp_getsockopt_local_addrs net/sctp/socket.c:5998 [inline]
    sctp_getsockopt+0x15248/0x186f0 net/sctp/socket.c:7477
    sock_common_getsockopt+0x13f/0x180 net/core/sock.c:2937
    __sys_getsockopt+0x489/0x550 net/socket.c:1939
    __do_sys_getsockopt net/socket.c:1950 [inline]
    __se_sys_getsockopt+0xe1/0x100 net/socket.c:1947
    __x64_sys_getsockopt+0x62/0x80 net/socket.c:1947
    do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x63/0xe7

    sin6_flowinfo is not really used by SCTP, so it will be fixed by simply
    setting it to 0.

    The issue has existed since the very beginning.
    Thanks to Alexander for providing the reproducer.

    Reported-by: syzbot+ad5d327e6936a2e284be@syzkaller.appspotmail.com
    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Xin Long
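
The class of bug fixed here is an address structure with a field that is never written and is later copied to userspace. A minimal userspace sketch of the defensive pattern, zeroing the whole structure before filling in the fields that matter, is shown below. `demo_fill_v6` is a hypothetical helper, not the kernel's code.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Build an IPv6 socket address. Zeroing first guarantees that fields the
 * caller does not care about (sin6_flowinfo, sin6_scope_id) never carry
 * stale memory contents.
 */
void demo_fill_v6(struct sockaddr_in6 *out, const struct in6_addr *addr,
		  uint16_t port_be)
{
	assert(out != NULL && addr != NULL);

	memset(out, 0, sizeof(*out));
	out->sin6_family = AF_INET6;
	out->sin6_port = port_be;
	out->sin6_addr = *addr;
}
```

Zeroing the whole structure is the userspace analogue of allocating with kzalloc instead of setting each unused field by hand.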
     

09 Nov, 2018

1 commit

  • We'll need this to handle ICMP errors for tunnels without a sending socket
    (i.e. FoU and GUE). There, we might have to look up different types of IP
    tunnels, registered as network protocols, before we get a match, so we
    want this for the error handlers of IPPROTO_IPIP and IPPROTO_IPV6 in both
    inet_protos and inet6_protos. These error codes will be used in the next
    patch.

    For consistency, return sensible error codes in protocol error handlers
    whenever handlers can't handle errors because, even if valid, they don't
    match a protocol or any of its states.

    This has no effect on existing error handling paths.

    Signed-off-by: Stefano Brivio
    Reviewed-by: Sabrina Dubroca
    Signed-off-by: David S. Miller

    Stefano Brivio
     

04 Jul, 2018

2 commits

  • A transport with an illegal flowlabel should not be allowed to send
    packets. Other transport protocols already deny this.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • Like some other per-transport params, flowlabel and dscp are added
    to transport, asoc and sctp_sock. By default, a transport takes its
    value from the asoc, and the asoc takes it from the sctp_sock. flowlabel
    only works for ipv6 transports.

    Besides being passed down in sctp_xmit, they also need to be set in
    flow4/6 before the route lookup in get_dst.

    Note that it uses '& 0x100000' to check if flowlabel is set and
    '& 0x1' (the 1st bit of tos is unused) to check if dscp is set by users,
    so that they can be set to 0 by sockopt in the next patch.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
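
The "set" bits mentioned in this commit make an explicit value of 0 distinguishable from "never configured". The sketch below illustrates the encoding with the two masks quoted in the message. The value masks (20 bits for the flowlabel, the upper 6 bits of the tos byte for dscp) and all names are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_FLOWLABEL_SET_MASK 0x100000u	/* bit just above the 20-bit label */
#define DEMO_FLOWLABEL_VAL_MASK 0x0fffffu
#define DEMO_DSCP_SET_MASK      0x1u		/* lowest tos bit, unused by dscp */
#define DEMO_DSCP_VAL_MASK      0xfcu		/* dscp occupies the upper 6 bits */

/* Store a flowlabel together with the marker that it was set by the user. */
uint32_t demo_flowlabel_store(uint32_t label)
{
	assert((label & ~DEMO_FLOWLABEL_VAL_MASK) == 0);
	return label | DEMO_FLOWLABEL_SET_MASK;
}

bool demo_flowlabel_is_set(uint32_t stored)
{
	return (stored & DEMO_FLOWLABEL_SET_MASK) != 0;
}

uint32_t demo_flowlabel_value(uint32_t stored)
{
	return stored & DEMO_FLOWLABEL_VAL_MASK;
}

/* Same idea for dscp, using the tos byte's lowest bit as the marker. */
uint8_t demo_dscp_store(uint8_t dscp)
{
	assert((dscp & ~DEMO_DSCP_VAL_MASK) == 0);
	return (uint8_t)(dscp | DEMO_DSCP_SET_MASK);
}

bool demo_dscp_is_set(uint8_t stored)
{
	return (stored & DEMO_DSCP_SET_MASK) != 0;
}

uint8_t demo_dscp_value(uint8_t stored)
{
	return (uint8_t)(stored & DEMO_DSCP_VAL_MASK);
}
```

With this encoding, a stored value of plain 0 means "inherit from the parent", while `demo_flowlabel_store(0)` means "explicitly 0".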
     

29 Jun, 2018

1 commit

  • The poll() changes were not well thought out, and completely
    unexplained. They also caused a huge performance regression, because
    "->poll()" was no longer a trivial file operation that just called down
    to the underlying file operations, but instead did at least two indirect
    calls.

    Indirect calls are sadly slow now with the Spectre mitigation, but the
    performance problem could at least be largely mitigated by changing the
    "->get_poll_head()" operation to just have a per-file-descriptor pointer
    to the poll head instead. That gets rid of one of the new indirections.

    But that doesn't fix the new complexity that is completely unwarranted
    for the regular case. The (undocumented) reason for the poll() changes
    was some alleged AIO poll race fixing, but we don't make the common case
    slower and more complex for some uncommon special case, so this all
    really needs way more explanations and most likely a fundamental
    redesign.

    [ This revert is a revert of about 30 different commits, not reverted
    individually because that would just be unnecessarily messy - Linus ]

    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

05 Jun, 2018

1 commit

  • Pull aio updates from Al Viro:
    "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.

    The only thing I'm holding back for a day or so is Adam's aio ioprio -
    his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
    but let it sit in -next for decency sake..."

    * 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    aio: sanitize the limit checking in io_submit(2)
    aio: fold do_io_submit() into callers
    aio: shift copyin of iocb into io_submit_one()
    aio_read_events_ring(): make a bit more readable
    aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
    aio: take list removal to (some) callers of aio_complete()
    aio: add missing break for the IOCB_CMD_FDSYNC case
    random: convert to ->poll_mask
    timerfd: convert to ->poll_mask
    eventfd: switch to ->poll_mask
    pipe: convert to ->poll_mask
    crypto: af_alg: convert to ->poll_mask
    net/rxrpc: convert to ->poll_mask
    net/iucv: convert to ->poll_mask
    net/phonet: convert to ->poll_mask
    net/nfc: convert to ->poll_mask
    net/caif: convert to ->poll_mask
    net/bluetooth: convert to ->poll_mask
    net/sctp: convert to ->poll_mask
    net/tipc: convert to ->poll_mask
    ...

    Linus Torvalds
     

26 May, 2018

1 commit


23 May, 2018

1 commit

  • sctp currently uses inet_dgram_connect as its proto_ops .connect, so the
    flags param can't be passed into its proto .connect, where the flags are
    really needed.

    sctp works around it by getting flags from socket file in __sctp_connect.
    It works for connecting from userspace, as inherently the user sock has
    socket file and it passes f_flags as the flags param into the proto_ops
    .connect.

    However, the sock created by sock_create_kern doesn't have a socket file,
    and it passes the flags (like O_NONBLOCK) by using the flags param in
    kernel_connect, which calls proto_ops .connect later.

    So to fix it, this patch defines a new proto_ops .connect for sctp,
    sctp_inet_connect, which calls __sctp_connect() directly with this
    flags param. After this, the sctp's proto .connect can be removed.

    Note that sctp_inet_connect skips some checks that are not needed
    for sctp, which makes it a better fit than inet_dgram_connect.

    Suggested-by: Marcelo Ricardo Leitner
    Signed-off-by: Xin Long
    Acked-by: Neil Horman
    Acked-by: Marcelo Ricardo Leitner
    Reviewed-by: Michal Kubecek
    Signed-off-by: David S. Miller

    Xin Long
     

28 Apr, 2018

1 commit

  • Since sctp ipv6 socket also supports v4 addrs, it's possible to
    compare two v4 addrs in pf v6 .cmp_addr, sctp_inet6_cmp_addr.

    However, after commit 1071ec9d453a ("sctp: do not check port in
    sctp_inet6_cmp_addr"), it no longer calls af1->cmp_addr, which
    in this case is sctp_v4_cmp_addr, but calls __sctp_v6_cmp_addr,
    which handles them as two v6 addrs. This can cause an out-of-bounds
    crash.

    syzbot found this crash when trying to bind two v4 addrs to a
    v6 socket.

    This patch fixes it by adding handling for two v4 addrs in
    sctp_inet6_cmp_addr.

    Fixes: 1071ec9d453a ("sctp: do not check port in sctp_inet6_cmp_addr")
    Reported-by: syzbot+cd494c1dd681d4d93ebb@syzkaller.appspotmail.com
    Signed-off-by: Xin Long
    Acked-by: Neil Horman
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     

13 Apr, 2018

1 commit

  • pf->cmp_addr() is called before binding a v6 address to the sock. It
    should not check ports, as in sctp_inet_cmp_addr.

    But sctp_inet6_cmp_addr checks the addr by invoking af(6)->cmp_addr,
    sctp_v6_cmp_addr, which also compares the ports.

    As a result, setsockopt(SCTP_SOCKOPT_BINDX_ADD) could bind
    multiple duplicate IPv6 addresses after commit 40b4f0fd74e4 ("sctp:
    lack the check for ports in sctp_v6_cmp_addr").

    This patch removes the af->cmp_addr call from sctp_inet6_cmp_addr
    and instead does the proper check for both v6 addrs and v4mapped addrs.

    v1->v2:
    - define __sctp_v6_cmp_addr to do the common address comparison
    used for both pf and af v6 cmp_addr.

    Fixes: 40b4f0fd74e4 ("sctp: lack the check for ports in sctp_v6_cmp_addr")
    Reported-by: Jianwen Ji
    Signed-off-by: Xin Long
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Xin Long
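
A port-agnostic comparison that also covers two v4 addresses and the v4 versus v4-mapped case can be sketched in userspace as below. This is a simplified illustration: `demo_addr`, `demo_make_addr`, and `demo_addr_equal_no_port` are hypothetical names, and scope ids are deliberately ignored here.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

union demo_addr {
	struct sockaddr sa;
	struct sockaddr_in v4;
	struct sockaddr_in6 v6;
};

/* Parse text as IPv4 or IPv6; the family stays AF_UNSPEC if neither parses. */
union demo_addr demo_make_addr(const char *text, uint16_t port)
{
	union demo_addr a;

	assert(text != NULL);
	memset(&a, 0, sizeof(a));
	if (inet_pton(AF_INET, text, &a.v4.sin_addr) == 1) {
		a.v4.sin_family = AF_INET;
		a.v4.sin_port = htons(port);
		return a;
	}
	memset(&a, 0, sizeof(a));
	if (inet_pton(AF_INET6, text, &a.v6.sin6_addr) == 1) {
		a.v6.sin6_family = AF_INET6;
		a.v6.sin6_port = htons(port);
		return a;
	}
	memset(&a, 0, sizeof(a));
	return a;
}

/* True when v6 is the v4-mapped form (::ffff:a.b.c.d) of v4. */
static bool demo_v4_matches_mapped(const struct sockaddr_in *v4,
				   const struct sockaddr_in6 *v6)
{
	return IN6_IS_ADDR_V4MAPPED(&v6->sin6_addr) &&
	       memcmp(&v6->sin6_addr.s6_addr[12], &v4->sin_addr, 4) == 0;
}

/* Compare addresses only; ports never influence the result. */
bool demo_addr_equal_no_port(const union demo_addr *a, const union demo_addr *b)
{
	assert(a != NULL && b != NULL);

	if (a->sa.sa_family == AF_INET && b->sa.sa_family == AF_INET)
		return a->v4.sin_addr.s_addr == b->v4.sin_addr.s_addr;
	if (a->sa.sa_family == AF_INET && b->sa.sa_family == AF_INET6)
		return demo_v4_matches_mapped(&a->v4, &b->v6);
	if (a->sa.sa_family == AF_INET6 && b->sa.sa_family == AF_INET)
		return demo_v4_matches_mapped(&b->v4, &a->v6);
	if (a->sa.sa_family == AF_INET6 && b->sa.sa_family == AF_INET6)
		return memcmp(&a->v6.sin6_addr, &b->v6.sin6_addr,
			      sizeof(struct in6_addr)) == 0;
	return false;
}
```

Checking the address family before touching any v6 field is the point of the sketch: treating a v4 address as a v6 one reads past the end of the smaller structure.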
     

10 Apr, 2018

1 commit

  • Pull networking fixes from David Miller:

    1) The sockmap code has to free socket memory on close if there is
    corked data, from John Fastabend.

    2) Tunnel names coming from userspace need to be length validated. From
    Eric Dumazet.

    3) arp_filter() has to take VRFs properly into account, from Miguel
    Fadon Perlines.

    4) Fix oops in error path of tcf_bpf_init(), from Davide Caratti.

    5) Missing idr_remove() in u32_delete_key(), from Cong Wang.

    6) More syzbot stuff. Several use of uninitialized value fixes all
    over, from Eric Dumazet.

    7) Do not leak kernel memory to userspace in sctp, also from Eric
    Dumazet.

    8) Discard frames from unused ports in DSA, from Andrew Lunn.

    9) Fix DMA mapping and reset/failover problems in ibmvnic, from Thomas
    Falcon.

    10) Do not access dp83640 PHY registers prematurely after reset, from
    Esben Haabendal.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
    vhost-net: set packet weight of tx polling to 2 * vq size
    net: thunderx: rework mac addresses list to u64 array
    inetpeer: fix uninit-value in inet_getpeer
    dp83640: Ensure against premature access to PHY registers after reset
    devlink: convert occ_get op to separate registration
    ARM: dts: ls1021a: Specify TBIPA register address
    net/fsl_pq_mdio: Allow explicit speficition of TBIPA address
    ibmvnic: Do not reset CRQ for Mobility driver resets
    ibmvnic: Fix failover case for non-redundant configuration
    ibmvnic: Fix reset scheduler error handling
    ibmvnic: Zero used TX descriptor counter on reset
    ibmvnic: Fix DMA mapping mistakes
    tipc: use the right skb in tipc_sk_fill_sock_diag()
    sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6
    net: dsa: Discard frames from unused ports
    sctp: do not leak kernel memory to user space
    soreuseport: initialise timewait reuseport field
    ipv4: fix uninit-value in ip_route_output_key_hash_rcu()
    dccp: initialize ireq->ir_mark
    net: fix uninit-value in __hw_addr_add_ex()
    ...

    Linus Torvalds
     

08 Apr, 2018

1 commit

  • syzbot produced a nice report [1]

    Issue here is that a recvmmsg() managed to leak 8 bytes of kernel memory
    to user space, because sin_zero (padding field) was not properly cleared.

    [1]
    BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline]
    BUG: KMSAN: uninit-value in move_addr_to_user+0x32e/0x530 net/socket.c:227
    CPU: 1 PID: 3586 Comm: syzkaller481044 Not tainted 4.16.0+ #82
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0x185/0x1d0 lib/dump_stack.c:53
    kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
    kmsan_internal_check_memory+0x164/0x1d0 mm/kmsan/kmsan.c:1176
    kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
    copy_to_user include/linux/uaccess.h:184 [inline]
    move_addr_to_user+0x32e/0x530 net/socket.c:227
    ___sys_recvmsg+0x4e2/0x810 net/socket.c:2211
    __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313
    SYSC_recvmmsg+0x29b/0x3e0 net/socket.c:2394
    SyS_recvmmsg+0x76/0xa0 net/socket.c:2378
    do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    RIP: 0033:0x4401c9
    RSP: 002b:00007ffc56f73098 EFLAGS: 00000217 ORIG_RAX: 000000000000012b
    RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401c9
    RDX: 0000000000000001 RSI: 0000000020003ac0 RDI: 0000000000000003
    RBP: 00000000006ca018 R08: 0000000020003bc0 R09: 0000000000000010
    R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000401af0
    R13: 0000000000401b80 R14: 0000000000000000 R15: 0000000000000000

    Local variable description: ----addr@___sys_recvmsg
    Variable was created at:
    ___sys_recvmsg+0xd5/0x810 net/socket.c:2172
    __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313

    Bytes 8-15 of 16 are uninitialized

    ==================================================================
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 1 PID: 3586 Comm: syzkaller481044 Tainted: G B 4.16.0+ #82
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0x185/0x1d0 lib/dump_stack.c:53
    panic+0x39d/0x940 kernel/panic.c:183
    kmsan_report+0x238/0x240 mm/kmsan/kmsan.c:1083
    kmsan_internal_check_memory+0x164/0x1d0 mm/kmsan/kmsan.c:1176
    kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
    copy_to_user include/linux/uaccess.h:184 [inline]
    move_addr_to_user+0x32e/0x530 net/socket.c:227
    ___sys_recvmsg+0x4e2/0x810 net/socket.c:2211
    __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313
    SYSC_recvmmsg+0x29b/0x3e0 net/socket.c:2394
    SyS_recvmmsg+0x76/0xa0 net/socket.c:2378
    do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Eric Dumazet
    Cc: Vlad Yasevich
    Cc: Neil Horman
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Apr, 2018

1 commit

  • Pull SELinux updates from Paul Moore:
    "A bigger than usual pull request for SELinux, 13 patches (lucky!)
    along with a scary looking diffstat.

    Although if you look a bit closer, excluding the usual minor
    tweaks/fixes, there are really only two significant changes in this
    pull request: the addition of proper SELinux access controls for SCTP
    and the encapsulation of a lot of internal SELinux state.

    The SCTP changes are the result of a multi-month effort (maybe even a
    year or longer?) between the SELinux folks and the SCTP folks to add
    proper SELinux controls. A special thanks go to Richard for seeing
    this through and keeping the effort moving forward.

    The state encapsulation work is a bit of janitorial work that came out
    of some early work on SELinux namespacing. The question of namespacing
    is still an open one, but I believe there is some real value in the
    encapsulation work so we've split that out and are now sending that up
    to you"

    * tag 'selinux-pr-20180403' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
    selinux: wrap AVC state
    selinux: wrap selinuxfs state
    selinux: fix handling of uninitialized selinux state in get_bools/classes
    selinux: Update SELinux SCTP documentation
    selinux: Fix ltp test connect-syscall failure
    selinux: rename the {is,set}_enforcing() functions
    selinux: wrap global selinux state
    selinux: fix typo in selinux_netlbl_sctp_sk_clone declaration
    selinux: Add SCTP support
    sctp: Add LSM hooks
    sctp: Add ip option support
    security: Add support for SCTP security hooks
    netlabel: If PF_INET6, check sk_buff ip header version

    Linus Torvalds
     

27 Feb, 2018

1 commit

  • Add ip option support to allow LSM security modules to utilise CIPSO/IPv4
    and CALIPSO/IPv6 services.

    Signed-off-by: Richard Haines
    Acked-by: Neil Horman
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: Paul Moore

    Richard Haines
     

13 Feb, 2018

1 commit

  • Changes since v1:
    Added changes in these files:
    drivers/infiniband/hw/usnic/usnic_transport.c
    drivers/staging/lustre/lnet/lnet/lib-socket.c
    drivers/target/iscsi/iscsi_target_login.c
    drivers/vhost/net.c
    fs/dlm/lowcomms.c
    fs/ocfs2/cluster/tcp.c
    security/tomoyo/network.c

    Before:
    All these functions either return a negative error indicator,
    or store length of sockaddr into "int *socklen" parameter
    and return zero on success.

    "int *socklen" parameter is awkward. For example, if caller does not
    care, it still needs to provide on-stack storage for the value
    it does not need.

    None of the many FOO_getname() functions of various protocols
    ever used old value of *socklen. They always just overwrite it.

    This change drops this parameter, and makes all these functions, on success,
    return length of sockaddr. It's always >= 0 and can be differentiated
    from an error.

    Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.

    rpc_sockname() lost "int buflen" parameter, since its only use was
    to be passed to kernel_getsockname() as &buflen and subsequently
    not used in any way.

    Userspace API is not changed.

    text data bss dec hex filename
    30108430 2633624 873672 33615726 200ef6e vmlinux.before.o
    30108109 2633612 873672 33615393 200ee21 vmlinux.o

    Signed-off-by: Denys Vlasenko
    CC: David S. Miller
    CC: linux-kernel@vger.kernel.org
    CC: netdev@vger.kernel.org
    CC: linux-bluetooth@vger.kernel.org
    CC: linux-decnet-user@lists.sourceforge.net
    CC: linux-wireless@vger.kernel.org
    CC: linux-rdma@vger.kernel.org
    CC: linux-sctp@vger.kernel.org
    CC: linux-nfs@vger.kernel.org
    CC: linux-x25@vger.kernel.org
    Signed-off-by: David S. Miller

    Denys Vlasenko
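
The calling convention this commit moves to, returning the address length on success and a negative errno on failure, can be sketched in userspace as follows. `demo_getname` is a hypothetical function and the choice of -ENOTCONN for the failure case is an assumption for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Copy the bound address into out.
 * Returns the address length (>= 0) on success or a negative errno, so
 * callers test "ret < 0" instead of passing an int *socklen they may
 * not need.
 */
int demo_getname(const struct sockaddr_in *bound, struct sockaddr_storage *out)
{
	assert(out != NULL);

	if (!bound)
		return -ENOTCONN;

	memset(out, 0, sizeof(*out));
	memcpy(out, bound, sizeof(*bound));
	return (int)sizeof(*bound);
}
```

A caller that only wants the address can ignore a non-negative return value, which is the awkwardness the old out-parameter did not allow.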
     

06 Feb, 2018

1 commit

  • When going through the bind address list in sctp_v6_get_dst() and
    the previously found address is better ('matchlen > bmatchlen'),
    the code continues to the next iteration without releasing currently
    held destination.

    Fix it by releasing 'bdst' before continuing to the next iteration, and
    instead of introducing one more '!IS_ERR(bdst)' check for dst_release(),
    move the existing one right after ip6_dst_lookup_flow(), i.e. we
    shouldn't proceed further if we get an error for the route lookup.

    Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6")
    Signed-off-by: Alexey Kodanev
    Acked-by: Neil Horman
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Alexey Kodanev
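
The leak pattern fixed here, a `continue` that skips the release of a reference taken earlier in the same iteration, is easy to reproduce in miniature. The sketch below uses hypothetical `demo_route` objects with a plain counter in place of the kernel's dst reference counting.

```c
#include <assert.h>
#include <stddef.h>

struct demo_route {
	int refs;	/* references currently held */
	int matchlen;	/* how well this candidate matches */
};

struct demo_route *demo_route_get(struct demo_route *r)
{
	r->refs++;
	return r;
}

void demo_route_put(struct demo_route *r)
{
	if (r)
		r->refs--;
}

/* Return the best candidate, holding exactly one reference on it. */
struct demo_route *demo_pick_best(struct demo_route *cands, size_t n)
{
	struct demo_route *best = NULL;

	assert(cands != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		struct demo_route *cur = demo_route_get(&cands[i]);

		if (best && best->matchlen > cur->matchlen) {
			/* Without this put, cur's reference would leak. */
			demo_route_put(cur);
			continue;
		}
		demo_route_put(best);	/* drop the previous winner, if any */
		best = cur;
	}
	return best;
}
```

After the loop, every candidate except the returned one should be back at a count of zero, which makes the invariant straightforward to check.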
     

16 Jan, 2018

1 commit


16 Nov, 2017

1 commit

  • Alexander Potapenko, while testing the kernel with KMSAN and syzkaller,
    discovered that in some configurations sctp would leak 4 bytes of
    kernel stack.

    Working with his reproducer, I discovered that the 4 leaked bytes
    are the scope id of an ipv6 address returned by recvmsg.

    With a little code inspection and a shrewd guess I discovered that
    sctp_inet6_skb_msgname only initializes the scope_id field for link
    local ipv6 addresses to the interface index the link local address
    pertains to instead of initializing the scope_id field for all ipv6
    addresses.

    That is almost reasonable as scope_ids are meaningful only for link
    local addresses. Set the scope_id in all other cases to 0 which is
    not a valid interface index to make it clear there is nothing useful
    in the scope_id field.

    There should be no danger of breaking userspace as the stack leak
    guaranteed that previously meaningless random data was being returned.

    Fixes: 372f525b495c ("SCTP: Resync with LKSCTP tree.")
    History-tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
    Reported-by: Alexander Potapenko
    Tested-by: Alexander Potapenko
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
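
The fix amounts to writing the scope id on every path rather than only for link-local addresses. A small userspace sketch of that rule is shown below. `demo_set_scope_id` is a hypothetical helper, and `IN6_IS_ADDR_LINKLOCAL` is the standard macro from `<netinet/in.h>`.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/*
 * Always write sin6_scope_id: the interface index for link-local
 * addresses, and 0 (never a valid interface index) for everything else.
 */
void demo_set_scope_id(struct sockaddr_in6 *sa, unsigned int ifindex)
{
	assert(sa != NULL);

	if (IN6_IS_ADDR_LINKLOCAL(&sa->sin6_addr))
		sa->sin6_scope_id = ifindex;
	else
		sa->sin6_scope_id = 0;
}
```

Because both branches assign the field, no caller can observe whatever happened to be in memory beforehand.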
     

29 Oct, 2017

1 commit

  • These warnings were found by running 'make C=2 M=net/sctp/'.
    They have been there since the very beginning.

    Note that after this patch, there is still one warning left in
    sctp_outq_flush():
    sctp_chunk_fail(chunk, SCTP_ERROR_INV_STRM)

    Since it has been moved to sctp_stream_outq_migrate on net-next,
    I will post the fix for it after the merge is done, to avoid
    extra work when merging net-next to net.

    Reported-by: Eric Dumazet
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

24 Oct, 2017

1 commit

  • Commit 9b9742022888 ("sctp: support ipv6 nonlocal bind")
    introduced support for the ip_nonlocal_bind and IP_FREEBIND options,
    as v4 sctp already had, by patching sctp_v6_available().

    In the v4 implementation it's enough, because
    sctp_inet_bind_verify() just returns with sctp_v4_available().
    However sctp_inet6_bind_verify() has an extra check before that
    for link-local scope_id, which won't respect the above options.

    Added the checks before calling ipv6_chk_addr(), but
    not before the validation of scope_id.

    before (w/ both options):
    ./v6test fe80::10 sctp
    bind failed, errno: 99 (Cannot assign requested address)
    ./v6test fe80::10 tcp
    bind success, errno: 0 (Success)

    after (w/ both options):
    ./v6test fe80::10 sctp
    bind success, errno: 0 (Success)

    Signed-off-by: Laszlo Toth
    Reviewed-by: Xin Long
    Signed-off-by: David S. Miller

    Laszlo Toth
     

22 Aug, 2017

1 commit


19 Aug, 2017

1 commit

  • KMSAN reported use of uninitialized sctp_addr->v4.sin_addr.s_addr and
    sctp_addr->v6.sin6_scope_id in sctp_v6_cmp_addr() (see below).
    Make sure all fields of an IPv6 address are initialized, which
    guarantees that the IPv4 fields are also initialized.

    ==================================================================
    BUG: KMSAN: use of uninitialized memory in sctp_v6_cmp_addr+0x8d4/0x9f0
    net/sctp/ipv6.c:517
    CPU: 2 PID: 31056 Comm: syz-executor1 Not tainted 4.11.0-rc5+ #2944
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
    01/01/2011
    Call Trace:
    dump_stack+0x172/0x1c0 lib/dump_stack.c:42
    is_logbuf_locked mm/kmsan/kmsan.c:59 [inline]
    kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:938
    native_save_fl arch/x86/include/asm/irqflags.h:18 [inline]
    arch_local_save_flags arch/x86/include/asm/irqflags.h:72 [inline]
    arch_local_irq_save arch/x86/include/asm/irqflags.h:113 [inline]
    __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:467
    sctp_v6_cmp_addr+0x8d4/0x9f0 net/sctp/ipv6.c:517
    sctp_v6_get_dst+0x8c7/0x1630 net/sctp/ipv6.c:290
    sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
    sctp_assoc_add_peer+0x66d/0x16f0 net/sctp/associola.c:651
    sctp_sendmsg+0x35a5/0x4f90 net/sctp/socket.c:1871
    inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
    sock_sendmsg_nosec net/socket.c:633 [inline]
    sock_sendmsg net/socket.c:643 [inline]
    SYSC_sendto+0x608/0x710 net/socket.c:1696
    SyS_sendto+0x8a/0xb0 net/socket.c:1664
    entry_SYSCALL_64_fastpath+0x13/0x94
    RIP: 0033:0x44b479
    RSP: 002b:00007f6213f21c08 EFLAGS: 00000286 ORIG_RAX: 000000000000002c
    RAX: ffffffffffffffda RBX: 0000000020000000 RCX: 000000000044b479
    RDX: 0000000000000041 RSI: 0000000020edd000 RDI: 0000000000000006
    RBP: 00000000007080a8 R08: 0000000020b85fe4 R09: 000000000000001c
    R10: 0000000000040005 R11: 0000000000000286 R12: 00000000ffffffff
    R13: 0000000000003760 R14: 00000000006e5820 R15: 0000000000ff8000
    origin description: ----dst_saddr@sctp_v6_get_dst
    local variable created at:
    sk_fullsock include/net/sock.h:2321 [inline]
    inet6_sk include/linux/ipv6.h:309 [inline]
    sctp_v6_get_dst+0x91/0x1630 net/sctp/ipv6.c:241
    sctp_transport_route+0x101/0x570 net/sctp/transport.c:292
    ==================================================================

    Signed-off-by: Alexander Potapenko
    Reviewed-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Alexander Potapenko
     

07 Aug, 2017

1 commit


17 Jul, 2017

1 commit


06 Jul, 2017

1 commit

  • …find the correct route entry.

    If several otherwise identical route entries differ only in their outgoing
    net device, and the application's socket specifies the oif through
    setsockopt with SO_BINDTODEVICE, sctpv6 should choose the route entry whose
    outgoing net device is the oif specified by the socket. Set flowi6_oif to
    sk->sk_bound_dev_if so that sctp_v6_get_dst finds the correct route
    entry.

    Signed-off-by: Zheng Li <james.z.li@ericsson.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

    Zheng Li
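
The selection rule described in the commit above can be illustrated with a small userspace sketch. It is not the kernel's route lookup: `struct route_entry` and `pick_route()` are hypothetical names invented for this example, and the only thing modelled is "prefer the entry whose outgoing device equals the socket's bound device".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified route entry: a destination prefix label and the
 * index of the outgoing device. This is not a kernel structure. */
struct route_entry {
    const char *prefix;
    int oif;
};

/* Return the first entry that leaves through bound_dev_if.
 * A bound_dev_if of 0 models "socket not bound to a device", in which case
 * the first entry wins, as it would without any oif hint. */
static const struct route_entry *
pick_route(const struct route_entry *tbl, size_t n, int bound_dev_if)
{
    assert(tbl != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (bound_dev_if == 0 || tbl[i].oif == bound_dev_if)
            return &tbl[i];
    }
    return NULL; /* no entry leaves through the bound device */
}
```

With two entries for the same prefix on devices 2 and 3, an unbound socket gets the first entry, while a socket bound to device 3 gets the second.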
     

18 May, 2017

1 commit


12 May, 2017

1 commit

  • Commit 0ca50d12fe46 ("sctp: fix src address selection if using secondary
    addresses") fixed a src address selection issue when using secondary
    addresses for ipv4.

    sctp ipv6 has a similar issue. When using a secondary address,
    sctp_v6_get_dst tries to choose the saddr that shares the most leading
    bits with the daddr, as computed by sctp_v6_addr_match_len. In some
    setups this picks the wrong source address.

    hostA:
    [1] fd21:356b:459a:cf10::11 (eth1)
    [2] fd21:356b:459a:cf20::11 (eth2)

    hostB:
    [a] fd21:356b:459a:cf30::2 (eth1)
    [b] fd21:356b:459a:cf40::2 (eth2)

    route from hostA to hostB:
    fd21:356b:459a:cf30::/64 dev eth1 metric 1024 mtu 1500

    The expected path is:
    fd21:356b:459a:cf10::11 <-> fd21:356b:459a:cf30::2
    But addr[2] matches more bits of addr[a] than addr[1] does, according to
    sctp_v6_addr_match_len, so the path becomes:
    fd21:356b:459a:cf20::11 <-> fd21:356b:459a:cf30::2

    This patch fixes it the same way as Marcelo's fix for sctp ipv4.
    As there is no ip_dev_find for ipv6, this patch uses ipv6_chk_addr to
    check whether the saddr is on a dev instead.

    Note that for backwards compatibility, it still does the addr_match_len
    check when no optimal saddr is found.

    Reported-by: Patrick Talbert
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
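
The bit-matching behaviour behind the example addresses in the commit above can be reproduced in userspace. The sketch below is not the kernel's sctp_v6_addr_match_len; `common_prefix_bits()` and `match_len()` are hypothetical helpers that count the leading bits two IPv6 addresses share, which is enough to see why addr[2] beats addr[1] against addr[a].

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <sys/socket.h>

/* Number of leading bits two IPv6 addresses have in common (0..128). */
static int common_prefix_bits(const struct in6_addr *a,
                              const struct in6_addr *b)
{
    for (int i = 0; i < 16; i++) {
        uint8_t diff = a->s6_addr[i] ^ b->s6_addr[i];

        if (diff) {
            int bits = i * 8;

            /* Count equal bits in the first differing byte, MSB first. */
            while (!(diff & 0x80)) {
                diff <<= 1;
                bits++;
            }
            return bits;
        }
    }
    return 128; /* identical addresses */
}

/* Convenience wrapper (hypothetical) that parses textual addresses. */
static int match_len(const char *s1, const char *s2)
{
    struct in6_addr a, b;
    int ok1 = inet_pton(AF_INET6, s1, &a);
    int ok2 = inet_pton(AF_INET6, s2, &b);

    assert(ok1 == 1 && ok2 == 1);
    return common_prefix_bits(&a, &b);
}
```

For the addresses listed above, cf10 and cf30 first differ in bit 58 while cf20 and cf30 first differ in bit 59, so a pure longest-match rule prefers the address on eth2 even though the only route to hostB leaves through eth1.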
     

10 Mar, 2017

1 commit

  • Lockdep issues a circular dependency warning when AFS issues an operation
    through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.

    The theory lockdep comes up with is as follows:

    (1) If the pagefault handler decides it needs to read pages from AFS, it
    calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
    creating a call requires the socket lock:

    mmap_sem must be taken before sk_lock-AF_RXRPC

    (2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind()
    binds the underlying UDP socket whilst holding its socket lock.
    inet_bind() takes its own socket lock:

    sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET

    (3) Reading from a TCP socket into a userspace buffer might cause a fault
    and thus cause the kernel to take the mmap_sem, but the TCP socket is
    locked whilst doing this:

    sk_lock-AF_INET must be taken before mmap_sem

    However, lockdep's theory is wrong in this instance because it deals only
    with lock classes and not individual locks. The AF_INET lock in (2) isn't
    really equivalent to the AF_INET lock in (3) as the former deals with a
    socket entirely internal to the kernel that never sees userspace. This is
    a limitation in the design of lockdep.

    Fix the general case by:

    (1) Double up all the locking keys used in sockets so that one set is
    used if the socket is created by userspace and the other set is used
    if the socket is created by the kernel.

    (2) Store the kern parameter passed to sk_alloc() in a variable in the
    sock struct (sk_kern_sock). This informs sock_lock_init(),
    sock_init_data() and sk_clone_lock() as to the lock keys to be used.

    Note that the child created by sk_clone_lock() inherits the parent's
    kern setting.

    (3) Add a 'kern' parameter to ->accept() that is analogous to the one
    passed in to ->create() that distinguishes whether kernel_accept() or
    sys_accept4() was the caller and can be passed to sk_alloc().

    Note that a lot of accept functions merely dequeue an already
    allocated socket. I haven't touched these as the new socket already
    exists before we get the parameter.

    Note also that there are a couple of places where I've made the accepted
    socket unconditionally kernel-based:

    irda_accept()
    rds_tcp_accept_one()
    tcp_accept_from_sock()

    because they follow a sock_create_kern() and accept off of that.

    Whilst creating this, I noticed that lustre and ocfs don't create sockets
    through sock_create_kern() and thus they aren't marked as for-kernel,
    though they appear to be internal. I wonder if these should do that so
    that they use the new set of lock keys.

    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
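
The idea of doubling the lock keys in the commit above can be sketched in plain C. This models only the selection logic; every name here (`sketch_sock`, `sketch_lock_init()`, `sketch_clone()`, `SKETCH_AF_MAX`) is hypothetical and does not correspond to the kernel's lockdep structures.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_AF_MAX 4 /* hypothetical, far smaller than the kernel's AF_MAX */

/* Stand-in for a lock class key: only its address matters. */
struct sketch_lock_key { char unused; };

/* One key per address family, duplicated for user and kernel sockets. */
static struct sketch_lock_key user_keys[SKETCH_AF_MAX];
static struct sketch_lock_key kern_keys[SKETCH_AF_MAX];

struct sketch_sock {
    int family;
    bool kern_sock; /* plays the role of sk_kern_sock */
    const struct sketch_lock_key *key;
};

/* Pick the key set according to who created the socket. */
static void sketch_lock_init(struct sketch_sock *sk, int family, bool kern)
{
    assert(family >= 0 && family < SKETCH_AF_MAX);
    sk->family = family;
    sk->kern_sock = kern;
    sk->key = kern ? &kern_keys[family] : &user_keys[family];
}

/* A cloned child inherits the parent's kern setting, and so its key set. */
static void sketch_clone(struct sketch_sock *child,
                         const struct sketch_sock *parent)
{
    sketch_lock_init(child, parent->family, parent->kern_sock);
}
```

Two sockets of the same family then end up in different lock classes when one is created by userspace and the other by the kernel, which is what breaks the false dependency chain.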
     

28 Jan, 2017

1 commit


27 Jan, 2017

1 commit

  • Unlike ipv4, this control socket is shared by all cpus, so we cannot use
    it as a scratchpad area to annotate the mark that we pass to ip6_xmit().

    Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket
    family caches the flowi6 structure in the sctp_transport structure, so
    we cannot use it to carry the mark unless we reset it afterwards, an
    approach I discarded because it looks ugly to me.

    Fixes: bf99b4ded5f8 ("tcp: fix mark propagation with fwmark_reflect enabled")
    Suggested-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Pablo Neira
     

29 Dec, 2016

1 commit


25 Dec, 2016

1 commit


01 Nov, 2016

1 commit

  • Prior to this patch, the rx path had to hold the assoc obtained from
    __sctp_lookup_association before calling lock_sock, in case another
    place freed or put the assoc.

    But __sctp_lookup_association looks up and holds the transport, gets the
    assoc through transport->assoc, then holds the assoc and puts the
    transport. This means the transport is returned without a reference
    held, and it is later assigned directly to chunk->transport.

    Without the protection of the sock lock, the transport may be freed or
    put elsewhere, which would cause a use-after-free.

    This patch fixes the issue by holding the transport instead of the assoc.
    Holding the transport also makes access to the assoc safe, and since the
    assoc is actually found by searching the transport rhashtable, holding
    the transport here makes more sense.

    Note that the function will be renamed later in another patch.

    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
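
The "hold what you return" rule from the commit above can be sketched as follows. This is a single-threaded illustration with plain integer counters, not the kernel's refcounting; `sketch_transport`, `transport_hold()`, `transport_put()` and `lookup_transport()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_assoc { int id; };

struct sketch_transport {
    int refcnt; /* plain int here; a real implementation needs atomics */
    struct sketch_assoc *asoc;
};

static void transport_hold(struct sketch_transport *t)
{
    t->refcnt++;
}

static void transport_put(struct sketch_transport *t)
{
    assert(t->refcnt > 0);
    t->refcnt--; /* a real put would free the object on reaching zero */
}

/* Look up by association id and return the transport with a reference
 * held for the caller, so that both the transport and t->asoc stay valid
 * until the caller drops it with transport_put(). */
static struct sketch_transport *
lookup_transport(struct sketch_transport *tbl, size_t n, int assoc_id)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].asoc && tbl[i].asoc->id == assoc_id) {
            transport_hold(&tbl[i]);
            return &tbl[i];
        }
    }
    return NULL;
}
```

The caller reaches the assoc through the held transport and releases the reference once it is done, instead of receiving a transport pointer that nothing keeps alive.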
     

26 Jul, 2016

1 commit

  • Commit 486bdee0134c ("sctp: add support for RPS and RFS")
    saves skb->hash into sk->sk_rxhash so that the inet_* functions
    can record it in the flow table.

    But sctp uses sock_common_recvmsg as .recvmsg instead of
    inet_recvmsg, and sock_common_recvmsg doesn't invoke
    sock_rps_record_flow to record the flow. As a result, a
    receiver that never sends a message or polls the socket
    has no chance to record the flow.

    So this patch fixes it by using inet_recvmsg as .recvmsg
    in sctp.

    Fixes: 486bdee0134c ("sctp: add support for RPS and RFS")
    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long