05 Aug, 2020

1 commit

  • commit bbc8a99e952226c585ac17477a85ef1194501762 upstream.

    rds_notify_queue_get() is potentially copying uninitialized kernel stack
    memory to userspace since the compiler may leave a 4-byte hole at the end
    of `cmsg`.

    In 2016 we tried to fix this issue by doing `= { 0 };` on `cmsg`, which
    unfortunately does not always initialize that 4-byte hole. Fix it by using
    memset() instead.
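
    A minimal userspace sketch of the memset() approach follows. The
    struct and helper names are hypothetical stand-ins, not the kernel's
    definitions. It clears the whole object, padding included, before
    filling the named members, and then counts non-zero bytes after the
    last member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the notifier passed to put_cmsg(). On common
 * 64-bit ABIs the compiler adds 4 bytes of tail padding after `status`. */
struct demo_notify {
    uint64_t user_token;
    int32_t  status;
};

/* Clear every byte of the object (members and padding) with memset(),
 * then assign the named members. */
void demo_notify_fill(struct demo_notify *n, uint64_t token, int32_t status)
{
    assert(n != NULL);
    memset(n, 0, sizeof(*n));
    n->user_token = token;
    n->status = status;
}

/* Count non-zero bytes located after the last named member. */
size_t demo_notify_dirty_tail(const struct demo_notify *n)
{
    const unsigned char *bytes = (const unsigned char *)n;
    size_t first_pad = offsetof(struct demo_notify, status) + sizeof(n->status);
    size_t dirty = 0;

    for (size_t i = first_pad; i < sizeof(*n); i++)
        dirty += (bytes[i] != 0);
    return dirty;
}
```

    With `= { 0 }` the compiler is only required to zero the named
    members, so the tail bytes may keep whatever was on the stack.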

    Cc: stable@vger.kernel.org
    Fixes: f037590fff30 ("rds: fix a leak of kernel memory")
    Fixes: bdbe6fbc6a2f ("RDS: recv.c")
    Suggested-by: Dan Carpenter
    Signed-off-by: Peilin Ye
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Peilin Ye
     

03 Sep, 2019

1 commit


28 Aug, 2019

1 commit

  • The rds6_inc_info_copy() function has a couple of struct members that
    leak stack information. The ->tos field should hold actual
    information and the ->flags field needs to be zeroed out.

    Fixes: 3eb450367d08 ("rds: add type of service(tos) infrastructure")
    Fixes: b7ff8b1036f0 ("rds: Extend RDS API for IPv6 support")
    Reported-by: 黄ID蝴蝶
    Signed-off-by: Dan Carpenter
    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
     

24 Aug, 2019

1 commit

  • Add the RDMA cookie and RX timestamp to the usercopy whitelist.

    After the introduction of hardened usercopy whitelisting
    (https://lwn.net/Articles/727322/), a warning is displayed when the
    RDMA cookie or RX timestamp is copied to userspace:

    kernel: WARNING: CPU: 3 PID: 5750 at
    mm/usercopy.c:81 usercopy_warn+0x8e/0xa6
    [...]
    kernel: Call Trace:
    kernel: __check_heap_object+0xb8/0x11b
    kernel: __check_object_size+0xe3/0x1bc
    kernel: put_cmsg+0x95/0x115
    kernel: rds_recvmsg+0x43d/0x620 [rds]
    kernel: sock_recvmsg+0x43/0x4a
    kernel: ___sys_recvmsg+0xda/0x1e6
    kernel: ? __handle_mm_fault+0xcae/0xf79
    kernel: __sys_recvmsg+0x51/0x8a
    kernel: SyS_recvmsg+0x12/0x1c
    kernel: do_syscall_64+0x79/0x1ae

    When the whitelisting feature was introduced, the memory for the RDMA
    cookie and RX timestamp in RDS was not added to the whitelist, causing
    the warning above.

    Signed-off-by: Dag Moxnes
    Tested-by: Jenny
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Dag Moxnes
     

05 Feb, 2019

1 commit

  • The RDS service type (TOS) is user-defined and needs to be configured
    via the RDS ioctl interface. It must be set before initiating any
    traffic, and once set the TOS cannot be changed. All outgoing
    traffic from the socket will be associated with its TOS.
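
    The set-once rule can be modelled in userspace roughly as below.
    This is a sketch only: struct demo_sock, demo_set_tos(), the use of
    0 for "not configured" and the -EINVAL return value are all
    assumptions for illustration, not the kernel's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the per-socket TOS state. */
struct demo_sock {
    uint8_t tos;             /* 0 means "not configured yet" in this model */
    bool    traffic_started; /* set once the socket has sent anything */
};

/* Accept a TOS only if none was set and no traffic has been initiated. */
int demo_set_tos(struct demo_sock *s, uint8_t tos)
{
    assert(s != NULL);
    if (s->tos != 0 || s->traffic_started)
        return -EINVAL;
    s->tos = tos;
    return 0;
}
```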

    Reviewed-by: Sowmini Varadhan
    Signed-off-by: Santosh Shilimkar
    [yanjun.zhu@oracle.com: Adapted original patch with ipv6 changes]
    Signed-off-by: Zhu Yanjun

    Santosh Shilimkar
     

04 Feb, 2019

3 commits

  • Add SO_TIMESTAMP_NEW and SO_TIMESTAMPNS_NEW variants of
    socket timestamp options.
    These are the y2038 safe versions of the SO_TIMESTAMP_OLD
    and SO_TIMESTAMPNS_OLD for all architectures.

    Note that the format of scm_timestamping.ts[0] is not changed
    in this patch.
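
    To illustrate why the old formats are not y2038 safe, here is a
    small sketch. The struct and function names are hypothetical. A
    signed 32-bit seconds field can represent at most 2147483647, which
    is 03:14:07 UTC on 19 January 2038; a 64-bit field has no such
    limit in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative 64-bit timestamp layout; not a kernel type. */
struct demo_timeval64 {
    int64_t tv_sec;
    int64_t tv_usec;
};

/* True if the seconds value would still fit a signed 32-bit time field. */
bool demo_fits_32bit_time(const struct demo_timeval64 *tv)
{
    assert(tv != NULL);
    return tv->tv_sec >= INT32_MIN && tv->tv_sec <= INT32_MAX;
}
```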

    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Cc: jejb@parisc-linux.org
    Cc: ralf@linux-mips.org
    Cc: rth@twiddle.net
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linux-rdma@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Signed-off-by: David S. Miller

    Deepa Dinamani
     
  • As part of the y2038 solution, all internal uses of
    struct timeval are replaced by struct __kernel_old_timeval
    and struct compat_timeval by struct old_timeval32.
    Make socket timestamps use these new types.

    This is mainly to be able to verify that the kernel build
    is y2038 safe when such non y2038 safe types are not
    supported anymore.

    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Cc: isdn@linux-pingi.de
    Signed-off-by: David S. Miller

    Deepa Dinamani
     
  • SO_TIMESTAMP, SO_TIMESTAMPNS and SO_TIMESTAMPING options, the
    way they are currently defined, are not y2038 safe.
    Subsequent patches in the series add new y2038 safe versions
    of these options which provide 64 bit timestamps on all
    architectures uniformly.
    Hence, rename existing options with OLD tag suffixes.

    Also note that the kernel will not use the untagged SO_TIMESTAMP*
    and SCM_TIMESTAMP* options internally anymore.

    Signed-off-by: Deepa Dinamani
    Acked-by: Willem de Bruijn
    Cc: deller@gmx.de
    Cc: dhowells@redhat.com
    Cc: jejb@parisc-linux.org
    Cc: ralf@linux-mips.org
    Cc: rth@twiddle.net
    Cc: linux-afs@lists.infradead.org
    Cc: linux-alpha@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linux-rdma@vger.kernel.org
    Cc: sparclinux@vger.kernel.org
    Signed-off-by: David S. Miller

    Deepa Dinamani
     

17 Sep, 2018

1 commit

  • rds_inc_init() runs in the receive path. Initializing with memset()
    makes it noticeably faster.
    The test result:

    Before:
    1) + 24.950 us | rds_inc_init [rds]();
    After:
    1) + 10.990 us | rds_inc_init [rds]();

    Acked-by: Santosh Shilimkar
    Signed-off-by: Zhu Yanjun
    Signed-off-by: David S. Miller

    Zhu Yanjun
     

02 Sep, 2018

1 commit

  • rds is the last in-kernel user of the old do_gettimeofday()
    function. Convert it over to ktime_get_real() to make it
    work more like the generic socket timestamps, and to let
    us kill off do_gettimeofday().

    A follow-up patch will have to change the user space interface
    to deal better with 32-bit tasks, which may use an incompatible
    layout for 'struct timespec'.

    Signed-off-by: Arnd Bergmann
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

02 Aug, 2018

1 commit


24 Jul, 2018

3 commits

  • There are many data structures (RDS socket options) used by RDS apps
    which use a 32 bit integer to store IP address. To support IPv6,
    struct in6_addr needs to be used. To ensure backward compatibility, a
    new data structure is introduced for each of those data structures
    which use a 32 bit integer to represent an IP address. And new socket
    options are introduced to use those new structures. This means that
    existing apps should work without a problem with the new RDS module.
    For apps which want to use IPv6, those new data structures and socket
    options can be used. IPv4 mapped address is used to represent IPv4
    address in the new data structures.
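
    A userspace sketch of the IPv4-mapped representation
    (::ffff:a.b.c.d), using only standard socket headers. The demo_*
    helper names are hypothetical and are not part of the RDS API.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Store an IPv4 address (network byte order) as ::ffff:a.b.c.d. */
void demo_v4_to_mapped(struct in_addr v4, struct in6_addr *out)
{
    assert(out != NULL);
    memset(out, 0, sizeof(*out));
    out->s6_addr[10] = 0xff;
    out->s6_addr[11] = 0xff;
    memcpy(&out->s6_addr[12], &v4.s_addr, sizeof(v4.s_addr));
}

/* Return 1 and fill *v4 if addr is IPv4-mapped, otherwise return 0. */
int demo_mapped_to_v4(const struct in6_addr *addr, struct in_addr *v4)
{
    if (!IN6_IS_ADDR_V4MAPPED(addr))
        return 0;
    memcpy(&v4->s_addr, &addr->s6_addr[12], sizeof(v4->s_addr));
    return 1;
}
```

    An application using the new structures would carry an IPv4 peer in
    this form, while a native IPv6 address is stored unchanged.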

    v4: Revert changes to SO_RDS_TRANSPORT

    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
     
  • This patch enables RDS to use IPv6 addresses. For RDS/TCP, the
    listener is now an IPv6 endpoint which accepts both IPv4 and IPv6
    connection requests. RDS/RDMA/IB uses a private data (struct
    rds_ib_connect_private) exchange between endpoints at RDS connection
    establishment time to support RDMA. This private data exchange uses a
    32 bit integer to represent an IP address. This needs to be changed in
    order to support IPv6. A new private data struct
    rds6_ib_connect_private is introduced to handle this. To ensure
    backward compatibility, an IPv6 capable RDS stack uses another RDMA
    listener port (RDS_CM_PORT) to accept IPv6 connection. And it
    continues to use the original RDS_PORT for IPv4 RDS connections. When
    it needs to communicate with an IPv6 peer, it uses the RDS_CM_PORT to
    send the connection set up request.

    v5: Fixed syntax problem (David Miller).

    v4: Changed port history comments in rds.h (Sowmini Varadhan).

    v3: Added support to set up IPv4 connection using mapped address
    (David Miller).
    Added support to set up connection between link local and non-link
    addresses.
    Various review comments from Santosh Shilimkar and Sowmini Varadhan.

    v2: Fixed bound and peer address scope mismatched issue.
    Added back rds_connect() IPv6 changes.

    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
     
  • This patch changes the internal representation of an IP address to use
    struct in6_addr. An IPv4 address is stored as an IPv4-mapped address.
    All the functions which take an IP address as an argument are also
    changed to use struct in6_addr. The RDS socket layer is not modified,
    however, so it still does not accept an IPv6 address from an
    application, and the RDS layer neither accepts nor initiates IPv6
    connections.

    v2: Fixed sparse warnings.

    Signed-off-by: Ka-Cheong Poon
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Ka-Cheong Poon
     

15 Jun, 2018

1 commit

  • For the loop transport, which is a self-loopback, the remote port
    congestion update isn't relevant. In fact, the xmit path already
    ignores it. The receive path needs to do the same.

    Reported-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com
    Reviewed-by: Sowmini Varadhan
    Signed-off-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Santosh Shilimkar
     

03 May, 2018

1 commit

  • syzbot/KMSAN reported an uninit-value in put_cmsg(), originating
    from rds_cmsg_recv().

    Simply clear the structure, since we have holes there, or since
    rx_traces might be smaller than RDS_MSG_RX_DGRAM_TRACE_MAX.

    BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline]
    BUG: KMSAN: uninit-value in put_cmsg+0x600/0x870 net/core/scm.c:242
    CPU: 0 PID: 4459 Comm: syz-executor582 Not tainted 4.16.0+ #87
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0x185/0x1d0 lib/dump_stack.c:53
    kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
    kmsan_internal_check_memory+0x135/0x1e0 mm/kmsan/kmsan.c:1157
    kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
    copy_to_user include/linux/uaccess.h:184 [inline]
    put_cmsg+0x600/0x870 net/core/scm.c:242
    rds_cmsg_recv net/rds/recv.c:570 [inline]
    rds_recvmsg+0x2db5/0x3170 net/rds/recv.c:657
    sock_recvmsg_nosec net/socket.c:803 [inline]
    sock_recvmsg+0x1d0/0x230 net/socket.c:810
    ___sys_recvmsg+0x3fb/0x810 net/socket.c:2205
    __sys_recvmsg net/socket.c:2250 [inline]
    SYSC_recvmsg+0x298/0x3c0 net/socket.c:2262
    SyS_recvmsg+0x54/0x80 net/socket.c:2257
    do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

    Fixes: 3289025aedc0 ("RDS: add receive message trace used by application")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Cc: Santosh Shilimkar
    Cc: linux-rdma
    Signed-off-by: David S. Miller

    Eric Dumazet
     

08 Mar, 2018

1 commit

  • Commit 401910db4cd4 ("rds: deliver zerocopy completion notification
    with data") removes support for zerocopy completion notification
    on the sk_error_queue, thus we no longer need to track the cookie
    information in sk_buff structures.

    This commit replaces the struct sk_buff_head rs_zcookie_queue with
    a simpler list that results in a smaller memory footprint as well
    as more efficient memory allocation time.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

28 Feb, 2018

1 commit

  • This commit is an optimization over commit 01883eda72bd
    ("rds: support for zcopy completion notification") for PF_RDS sockets.

    RDS applications are predominantly request-response transactions, so
    it is more efficient to reduce the number of system calls and have
    zerocopy completion notification delivered as ancillary data on the
    POLLIN channel.

    Cookies are passed up as ancillary data (at level SOL_RDS) in a
    struct rds_zcopy_cookies when the returned value of recvmsg() is
    greater than, or equal to, 0. A max of RDS_MAX_ZCOOKIES may be passed
    with each message.
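
    On the receiving side, an application would walk the control
    messages returned by recvmsg() to find the cookies. The sketch
    below takes the cmsg level and type as parameters so that it does
    not depend on the RDS headers; a real caller would presumably pass
    SOL_RDS and the zerocopy completion type. struct demo_zcopy_cookies
    and DEMO_MAX_COOKIES are hypothetical stand-ins for
    struct rds_zcopy_cookies and RDS_MAX_ZCOOKIES, and the layout shown
    is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

#define DEMO_MAX_COOKIES 8 /* hypothetical stand-in for RDS_MAX_ZCOOKIES */

/* Hypothetical stand-in for struct rds_zcopy_cookies. */
struct demo_zcopy_cookies {
    uint32_t num;
    uint32_t cookies[DEMO_MAX_COOKIES];
};

/* Scan the ancillary data for a cmsg with the given level and type.
 * Returns the number of cookies copied into *out, or -1 if no matching,
 * well-formed control message is present. */
int demo_find_cookies(struct msghdr *msg, int level, int type,
                      struct demo_zcopy_cookies *out)
{
    struct cmsghdr *cmsg;

    assert(msg != NULL && out != NULL);
    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level != level || cmsg->cmsg_type != type)
            continue;
        if (cmsg->cmsg_len < CMSG_LEN(sizeof(*out)))
            return -1; /* truncated payload */
        memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
        if (out->num > DEMO_MAX_COOKIES)
            return -1; /* count exceeds the array */
        return (int)out->num;
    }
    return -1;
}
```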

    This commit removes support for zerocopy completion notification on
    MSG_ERRQUEUE for PF_RDS sockets.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Willem de Bruijn
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

17 Feb, 2018

1 commit

  • RDS removes a datagram (rds_message) from the retransmit queue when
    an ACK is received. The ACK indicates that the receiver has queued
    the RDS datagram, so that the sender can safely forget the datagram.
    When all references to the rds_message are quiesced, rds_message_purge
    is called to release resources used by the rds_message.

    If the datagram to be removed had pinned pages set up, add
    an entry to the rs->rs_znotify_queue so that the notification
    will be sent up via rds_rm_zerocopy_callback() when the
    rds_message is eventually freed by rds_message_purge.

    rds_rm_zerocopy_callback() attempts to batch the number of cookies
    sent with each notification to a max of SO_EE_ORIGIN_MAX_ZCOOKIES.
    This is achieved by checking the tail skb in the sk_error_queue:
    if this has room for one more cookie, the cookie from the
    current notification is added; else a new skb is added to the
    sk_error_queue. Every invocation of rds_rm_zerocopy_callback() will
    trigger a ->sk_error_report to notify the application.

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

05 Jul, 2017

1 commit


22 Jun, 2017

1 commit

  • The RDS handshake ping probe added by commit 5916e2c1554f
    ("RDS: TCP: Enable multipath RDS for TCP") is sent from rds_sendmsg()
    before the first data packet is sent to a peer. If the conversation
    is not bidirectional (i.e., one side is always passive and never
    invokes rds_sendmsg()) and the passive side restarts its rds_tcp
    module, a new HS ping probe needs to be sent, so that the number
    of paths can be re-established.

    This patch achieves that by sending a HS ping probe from
    rds_tcp_accept_one() when c_npaths is 0 (i.e., we have not done
    a handshake probe with this peer yet).

    Signed-off-by: Sowmini Varadhan
    Tested-by: Jenny Xu
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

17 Jun, 2017

1 commit

  • Found when testing between sparc and x86 machines on different
    subnets, so the address comparison patterns hit the corner cases and
    brought out some bugs fixed by this patch.

    Signed-off-by: Sowmini Varadhan
    Tested-by: Imanti Mendez
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

22 Apr, 2017

1 commit


03 Jan, 2017

2 commits


18 Nov, 2016

1 commit

  • The RDS transport has to be able to distinguish between
    two types of failure events:
    (a) when the transport fails (e.g., TCP connection reset)
    but the RDS socket/connection layer on both sides stays
    the same
    (b) when the peer's RDS layer itself resets (e.g., due to module
    reload or machine reboot at the peer)
    In case (a) both sides must reconnect and continue the RDS messaging
    without any message loss or disruption to the message sequence numbers,
    and this is achieved by rds_send_path_reset().

    In case (b) we should reset all rds_connection state to the
    new incarnation of the peer. Examples of state that needs to
    be reset are next expected rx sequence number from, or messages to be
    retransmitted to, the new incarnation of the peer.

    To achieve this, the RDS handshake probe added as part of
    commit 5916e2c1554f ("RDS: TCP: Enable multipath RDS for TCP")
    is enhanced so that sender and receiver of the RDS ping-probe
    will add a generation number as part of the RDS_EXTHDR_GEN_NUM
    extension header. Each peer stores local and remote generation
    numbers as part of each rds_connection. Changes in generation
    number will be detected via incoming handshake probe ping
    request or response and will allow the receiver to reset rds_connection
    state.
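
    The generation-number check can be sketched as follows. This is a
    simplified userspace model: struct demo_conn, its fields and the
    convention that 0 means "no probe seen yet" are assumptions, and
    the real reset covers more state than one sequence number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-connection state. */
struct demo_conn {
    uint32_t peer_gen_num; /* 0 = no handshake probe seen yet (assumption) */
    uint64_t next_rx_seq;  /* next expected receive sequence number */
};

/* Process the generation number carried by an incoming handshake probe.
 * Returns true when the peer was found to be a new incarnation, in which
 * case the per-connection state has been reset. */
bool demo_handle_probe(struct demo_conn *c, uint32_t gen)
{
    bool restarted;

    assert(c != NULL);
    restarted = (c->peer_gen_num != 0 && c->peer_gen_num != gen);
    if (restarted)
        c->next_rx_seq = 0; /* case (b): start over with the new peer */
    c->peer_gen_num = gen;
    return restarted;
}
```

    A transport-only failure, case (a), leaves the generation number
    unchanged, so the sequence state is kept.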

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

16 Jul, 2016

1 commit

  • Use RDS probe-ping to compute how many paths may be used with
    the peer, and to synchronously start the multiple paths. When
    multipath RDS (mprds) is supported by the transport, hash outgoing
    traffic to one of the multiple paths in rds_sendmsg().
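
    A sketch of picking a path by hashing. The hash input (the bound
    port) and the multiplicative hash are assumptions chosen for
    illustration; the patch itself may hash differently.

```c
#include <assert.h>
#include <stdint.h>

/* Map a socket's bound port onto one of npaths paths. With zero or one
 * path there is nothing to choose, so path 0 is always returned. */
unsigned int demo_pick_path(uint16_t bound_port, unsigned int npaths)
{
    uint32_t h;

    if (npaths <= 1)
        return 0;
    h = (uint32_t)bound_port * 2654435761u; /* multiplicative hash */
    return h % npaths;
}
```

    Hashing on a per-socket value keeps all messages from one socket on
    the same path, which preserves their ordering.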

    CC: Santosh Shilimkar
    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

02 Jul, 2016

1 commit

  • RDS ping messages are sent with a non-zero src port to a zero
    dst port, so that the rds pong messages can be sent back to the
    originators src port. However if a confused/malicious sender
    sends a ping with a 0 src port, we'd have an infinite ping-pong
    loop. To avoid this, the receiver should ignore ping messages
    with a 0 src port.
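
    The receive-side rule can be written as a small predicate. The
    function name is hypothetical and ports are taken in host byte
    order for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether an incoming message should be answered with a pong.
 * A ping is a message addressed to destination port 0. A ping whose
 * source port is also 0 is ignored, because the pong sent back to
 * port 0 would itself look like a ping. */
bool demo_should_send_pong(uint16_t sport, uint16_t dport)
{
    if (dport != 0)
        return false; /* ordinary data, not a ping */
    if (sport == 0)
        return false; /* would start an endless ping-pong loop */
    return true;
}
```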

    Acked-by: Santosh Shilimkar
    Signed-off-by: Sowmini Varadhan
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     

15 Jun, 2016

4 commits


03 Jun, 2016

1 commit


03 Mar, 2016

1 commit


03 Mar, 2015

1 commit

  • Now that TIPC no longer depends on the iocb argument in its internal
    implementations of the sendmsg() and recvmsg() hooks defined in the
    proto structure, nothing uses the iocb argument in them anymore.
    We can therefore drop the redundant iocb argument from all
    implementations of both sendmsg() and recvmsg() in the entire
    networking stack.

    Cc: Christoph Hellwig
    Suggested-by: Al Viro
    Signed-off-by: Ying Xue
    Signed-off-by: David S. Miller

    Ying Xue
     

10 Dec, 2014

1 commit

  • Note that the code _using_ ->msg_iter at that point will be very
    unhappy with anything other than unshifted iovec-backed iov_iter.
    We still need to convert users to proper primitives.

    Signed-off-by: Al Viro

    Al Viro
     

24 Nov, 2014

1 commit


19 Jan, 2014

1 commit

  • This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg
    handler msg_name and msg_namelen logic").

    DECLARE_SOCKADDR validates that the structure we use for writing the
    name information to is not larger than the buffer which is reserved
    for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
    consistently in sendmsg code paths.
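
    A userspace analogue of this compile-time size check, using C11
    _Static_assert. DEMO_CHECK_SOCKADDR is a hypothetical macro written
    for illustration; it is not the kernel's DECLARE_SOCKADDR, and
    struct sockaddr_storage stands in for the reserved name buffer.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Fail the build if an address type cannot fit in the name buffer. */
#define DEMO_CHECK_SOCKADDR(type)                                   \
    _Static_assert(sizeof(type) <= sizeof(struct sockaddr_storage), \
                   "address type larger than the name buffer")

DEMO_CHECK_SOCKADDR(struct sockaddr_in);
DEMO_CHECK_SOCKADDR(struct sockaddr_in6);

/* Size of the buffer that every checked address type must fit into. */
size_t demo_name_buffer_size(void)
{
    return sizeof(struct sockaddr_storage);
}
```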

    Signed-off-by: Steffen Hurrle
    Suggested-by: Hannes Frederic Sowa
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Steffen Hurrle
     

21 Nov, 2013

1 commit


23 Jul, 2012

1 commit

  • Jay Fenlason (fenlason@redhat.com) found a bug:
    recvfrom() on an RDS socket can return the contents of random kernel
    memory to userspace if it is called with an address length larger than
    sizeof(struct sockaddr_in).
    rds_recvmsg() also fails to set the addr_len parameter properly before
    returning, but that's just a bug.
    There are also a number of cases where recvfrom() can return an entirely bogus
    address. Anything in rds_recvmsg() that returns a non-negative value but does
    not go through the "sin = (struct sockaddr_in *)msg->msg_name;" code path
    at the end of the while(1) loop will return up to 128 bytes of kernel memory
    to userspace.

    I wrote two test programs to reproduce this bug. You will see that in
    rds_server, the copy runs past fromAddr and the sock_fd that follows
    it is corrupted.
    Yes, it is the programmer's fault to set msg_namelen incorrectly, but it is
    better to make the kernel copy the real length of the address to user space in
    such a case.

    How to run the test programs?
    I tested them on a 32-bit x86 system, 3.5.0-rc7.

    1 compile
    gcc -o rds_client rds_client.c
    gcc -o rds_server rds_server.c

    2 run ./rds_server on one console

    3 run ./rds_client on another console

    4 you will see something like:
    server is waiting to receive data...
    old socket fd=3
    server received data from client:data from client
    msg.msg_namelen=32
    new socket fd=-1067277685
    sendmsg()
    : Bad file descriptor

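    /***************** rds_client.c ********************/

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void)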
    {
    int sock_fd;
    struct sockaddr_in serverAddr;
    struct sockaddr_in toAddr;
    char recvBuffer[128] = "data from client";
    struct msghdr msg;
    struct iovec iov;

    sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
    if (sock_fd < 0) {
    perror("create socket error\n");
    exit(1);
    }

    memset(&serverAddr, 0, sizeof(serverAddr));
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
    serverAddr.sin_port = htons(4001);

    if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
    perror("bind() error\n");
    close(sock_fd);
    exit(1);
    }

    memset(&toAddr, 0, sizeof(toAddr));
    toAddr.sin_family = AF_INET;
    toAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
    toAddr.sin_port = htons(4000);
    msg.msg_name = &toAddr;
    msg.msg_namelen = sizeof(toAddr);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_iov->iov_base = recvBuffer;
    msg.msg_iov->iov_len = strlen(recvBuffer) + 1;
    msg.msg_control = 0;
    msg.msg_controllen = 0;
    msg.msg_flags = 0;

    if (sendmsg(sock_fd, &msg, 0) == -1) {
    perror("sendto() error\n");
    close(sock_fd);
    exit(1);
    }

    printf("client send data:%s\n", recvBuffer);

    memset(recvBuffer, '\0', 128);

    msg.msg_name = &toAddr;
    msg.msg_namelen = sizeof(toAddr);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_iov->iov_base = recvBuffer;
    msg.msg_iov->iov_len = 128;
    msg.msg_control = 0;
    msg.msg_controllen = 0;
    msg.msg_flags = 0;
    if (recvmsg(sock_fd, &msg, 0) == -1) {
    perror("recvmsg() error\n");
    close(sock_fd);
    exit(1);
    }

    printf("receive data from server:%s\n", recvBuffer);

    close(sock_fd);

    return 0;
    }

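    /***************** rds_server.c ********************/

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    int main(void)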
    {
    struct sockaddr_in fromAddr;
    int sock_fd;
    struct sockaddr_in serverAddr;
    unsigned int addrLen;
    char recvBuffer[128];
    struct msghdr msg;
    struct iovec iov;

    sock_fd = socket(AF_RDS, SOCK_SEQPACKET, 0);
    if(sock_fd < 0) {
    perror("create socket error\n");
    exit(0);
    }

    memset(&serverAddr, 0, sizeof(serverAddr));
    serverAddr.sin_family = AF_INET;
    serverAddr.sin_addr.s_addr = inet_addr("127.0.0.1");
    serverAddr.sin_port = htons(4000);
    if (bind(sock_fd, (struct sockaddr*)&serverAddr, sizeof(serverAddr)) < 0) {
    perror("bind error\n");
    close(sock_fd);
    exit(1);
    }

    printf("server is waiting to receive data...\n");
    msg.msg_name = &fromAddr;

    /*
    * I add 16 to sizeof(fromAddr), ie 32,
    * and pay attention to the definition of fromAddr,
    * recvmsg() will overwrite sock_fd,
    * since kernel will copy 32 bytes to userspace.
    *
    * If you just use sizeof(fromAddr), it works fine.
    * */
    msg.msg_namelen = sizeof(fromAddr) + 16;
    /* msg.msg_namelen = sizeof(fromAddr); */
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_iov->iov_base = recvBuffer;
    msg.msg_iov->iov_len = 128;
    msg.msg_control = 0;
    msg.msg_controllen = 0;
    msg.msg_flags = 0;

    while (1) {
    printf("old socket fd=%d\n", sock_fd);
    if (recvmsg(sock_fd, &msg, 0) == -1) {
    perror("recvmsg() error\n");
    close(sock_fd);
    exit(1);
    }
    printf("server received data from client:%s\n", recvBuffer);
    printf("msg.msg_namelen=%d\n", msg.msg_namelen);
    printf("new socket fd=%d\n", sock_fd);
    strcat(recvBuffer, "--data from server");
    if (sendmsg(sock_fd, &msg, 0) == -1) {
    perror("sendmsg()\n");
    close(sock_fd);
    exit(1);
    }
    }

    close(sock_fd);
    return 0;
    }

    Signed-off-by: Weiping Pan
    Signed-off-by: David S. Miller

    Weiping Pan