15 Jan, 2017

1 commit

  • [ Upstream commit 39b2dd765e0711e1efd1d1df089473a8dd93ad48 ]

    Socket cmsg IP(V6)_RECVORIGDSTADDR checks that port range lies within
    the packet. For sockets that have transport headers pulled, transport
    offset can be negative. Use signed comparison to avoid overflow.

    Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
    Reported-by: Nisar Jagabar
    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Willem de Bruijn
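
    The overflow described in the entry above can be reproduced outside the
    kernel. The sketch below is a model only: it assumes the check has the
    form "transport offset + 4 <= packet length" (4 bytes covering the two
    16-bit ports), and the function names are hypothetical. It emulates C's
    conversion of a signed int to unsigned 32-bit with a mask.

```python
U32_MASK = 0xFFFFFFFF  # emulate C's unsigned 32-bit arithmetic


def port_range_fits_unsigned(transport_offset, skb_len, span=4):
    """Model of the buggy check: the signed sum is converted to unsigned,
    so a negative offset wraps around to a very large value."""
    return ((transport_offset + span) & U32_MASK) <= (skb_len & U32_MASK)


def port_range_fits_signed(transport_offset, skb_len, span=4):
    """Model of the fixed check: both sides are compared as signed values."""
    return transport_offset + span <= skb_len


if __name__ == "__main__":
    # Headers already pulled: the transport header lies before skb->data.
    print(port_range_fits_unsigned(-8, 100))  # wraps, check fails
    print(port_range_fits_signed(-8, 100))    # check passes
```

    Both forms agree for non-negative offsets; they differ only once the
    transport header has been pulled and the offset goes negative.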
     

01 Dec, 2016

1 commit


17 May, 2016

1 commit

  • __sock_cmsg_send() might return different error codes, not only -EINVAL.

    Fixes: 24025c465f77 ("ipv4: process socket-level control messages in IPv4")
    Fixes: ad1e46a83716 ("ipv6: process socket-level control messages in IPv6")
    Signed-off-by: Eric Dumazet
    Cc: Soheil Hassas Yeganeh
    Acked-by: Soheil Hassas Yeganeh
    Signed-off-by: David S. Miller

    Eric Dumazet
     

04 May, 2016

1 commit

  • In the sendmsg function of UDP, raw, ICMP and l2tp sockets, we use local
    variables like hlimits, tclass, opt and dontfrag and pass them to corresponding
    functions like ip6_make_skb, ip6_append_data and xxx_push_pending_frames.
    This is not a good practice and makes it hard to add new parameters.
    This fix introduces a new struct ipcm6_cookie, similar to ipcm_cookie in
    ipv4, that includes the above mentioned variables. Only a pointer to this
    structure is passed to the corresponding functions. This makes it easier
    to add new parameters in the future and makes the functions cleaner.

    Signed-off-by: Wei Wang
    Signed-off-by: David S. Miller

    Wei Wang
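
    The refactoring in the entry above is the usual "parameter object"
    pattern. As a rough illustration in Python rather than kernel C: the
    class below borrows the field names from the commit text, while the
    default value of -1 and the function make_skb_sketch() are assumptions
    made for this example only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ipcm6Cookie:
    """Illustrative stand-in for struct ipcm6_cookie (not the kernel layout)."""
    hlimit: int = -1      # -1 is an assumed "not set" marker
    tclass: int = -1
    dontfrag: int = -1
    opt: Optional[dict] = None


def make_skb_sketch(payload: bytes, cookie: Ipcm6Cookie) -> dict:
    """Hypothetical consumer: takes one cookie instead of four loose arguments."""
    return {
        "payload": payload,
        "hlimit": cookie.hlimit,
        "tclass": cookie.tclass,
        "dontfrag": cookie.dontfrag,
        "opt": cookie.opt,
    }


if __name__ == "__main__":
    print(make_skb_sketch(b"data", Ipcm6Cookie(hlimit=64)))
```

    Adding a new per-message parameter then means adding one field to the
    cookie; the signatures of the functions that receive it stay the same.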
     

26 Apr, 2016

1 commit


24 Apr, 2016

1 commit


15 Apr, 2016

4 commits

  • This patch adds a release_cb for UDPv6. It does a route lookup
    and updates sk->sk_dst_cache if it is needed. It picks up the
    left-over job from ip6_sk_update_pmtu() if the sk was owned
    by user during the pmtu update.

    It takes a rcu_read_lock to protect the __sk_dst_get() operations
    because another thread may do ip6_dst_store() without taking the
    sk lock (e.g. sendmsg).

    Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
    Signed-off-by: Martin KaFai Lau
    Reported-by: Wei Wang
    Cc: Cong Wang
    Cc: Eric Dumazet
    Cc: Wei Wang
    Signed-off-by: David S. Miller

    Martin KaFai Lau
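
    The division of labour between the pmtu-update path and the release
    callback can be modelled in a few lines. This is a simplified sketch,
    not the kernel logic: the class, its attributes and the explicit
    "pending" flag are all assumptions (the commit says the callback does a
    route lookup and updates the cache only if needed, which is not modelled
    here).

```python
class SockModel:
    """Toy model of a socket whose dst cache is refreshed on pmtu update."""

    def __init__(self):
        self.owned_by_user = False
        self.dst_update_pending = False   # assumed bookkeeping flag
        self.dst_cache = "old-route"

    def _update_dst(self):
        # Stands in for the route lookup plus the dst cache store.
        self.dst_cache = "rtf-cache-clone"

    def pmtu_update(self):
        if not self.owned_by_user:
            self._update_dst()            # safe to update right away
        else:
            self.dst_update_pending = True  # leave the job for release()

    def release(self):
        """Stands in for the release callback run when the owner lets go."""
        self.owned_by_user = False
        if self.dst_update_pending:
            self.dst_update_pending = False
            self._update_dst()


if __name__ == "__main__":
    sk = SockModel()
    sk.owned_by_user = True
    sk.pmtu_update()
    print(sk.dst_cache)   # still the old route while the user owns the sk
    sk.release()
    print(sk.dst_cache)   # updated once the socket is released
```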
     
  • There is a case in connected UDP socket such that
    getsockopt(IPV6_MTU) will return a stale MTU value. The reproducible
    sequence could be the following:
    1. Create a connected UDP socket
    2. Send some datagrams out
    3. Receive an ICMPV6_PKT_TOOBIG
    4. No new outgoing datagrams to trigger the sk_dst_check()
    logic to update the sk->sk_dst_cache.
    5. getsockopt(IPV6_MTU) returns the mtu from the invalid
    sk->sk_dst_cache instead of the newly created RTF_CACHE clone.

    This patch updates the sk->sk_dst_cache for a connected datagram sk
    during pmtu-update code path.

    Note that the sk->sk_v6_daddr is used to do the route lookup
    instead of skb->data (i.e. iph). This is because a UDP socket can become
    connected after sending out some datagrams in un-connected state, or
    it can be connected multiple times to different destinations. Hence,
    iph may not be related to where sk is currently connected to.

    It is done under '!sock_owned_by_user(sk)' condition because
    the user may make another ip6_datagram_connect() (i.e. changing
    the sk->sk_v6_daddr) while dst lookup is happening in the pmtu-update
    code path.

    For the sock_owned_by_user(sk) == true case, the next patch will
    introduce a release_cb() which will update the sk->sk_dst_cache.

    Test:

    Server (Connected UDP Socket):
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Route Details:
    [root@arch-fb-vm1 ~]# ip -6 r show | egrep '2fac'
    2fac::/64 dev eth0 proto kernel metric 256 pref medium
    2fac:face::/64 via 2fac::face dev eth0 metric 1024 pref medium

    A simple Python script to create a connected UDP socket:

    import socket
    import errno

    HOST = '2fac::1'
    PORT = 8080
    IPV6_MTU = 24  # socket option number from <linux/in6.h>

    s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    s.bind((HOST, PORT))
    s.connect(('2fac:face::face', 53))
    print("connected")
    while True:
        try:
            data = s.recv(1024)
        except socket.error as se:
            # Stop once recv() fails with EMSGSIZE and report the path MTU.
            if se.errno == errno.EMSGSIZE:
                pmtu = s.getsockopt(socket.IPPROTO_IPV6, IPV6_MTU)
                print("PMTU:%d" % pmtu)
                break
    s.close()

    Python program output after getting an ICMPV6_PKT_TOOBIG:
    [root@arch-fb-vm1 ~]# python2 ~/devshare/kernel/tasks/fib6/udp-connect-53-8080.py
    connected
    PMTU:1300

    Cache routes after receiving TOOBIG:
    [root@arch-fb-vm1 ~]# ip -6 r show table cache
    2fac:face::face via 2fac::face dev eth0 metric 0
    cache expires 463sec mtu 1300 pref medium

    Client (Send the ICMPV6_PKT_TOOBIG):
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    scapy is used to generate the TOOBIG message. Here is the scapy script I have
    used:

    >>> p=Ether(src='da:75:4d:36:ac:32', dst='52:54:00:12:34:66', type=0x86dd)/IPv6(src='2fac::face', dst='2fac::1')/ICMPv6PacketTooBig(mtu=1300)/IPv6(src='2fac::1',dst='2fac:face::face', nh='UDP')/UDP(sport=8080,dport=53)
    >>> sendp(p, iface='qemubr0')

    Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
    Signed-off-by: Martin KaFai Lau
    Reported-by: Wei Wang
    Cc: Cong Wang
    Cc: Eric Dumazet
    Cc: Wei Wang
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • This patch moves the route lookup and update code for connected
    datagram sk to a newly created function ip6_datagram_dst_update().

    It will be reused during the pmtu update in the later patch.

    Signed-off-by: Martin KaFai Lau
    Cc: Cong Wang
    Cc: Eric Dumazet
    Cc: Wei Wang
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     
  • Move flowi6 init code for connected datagram sk to a newly created
    function ip6_datagram_flow_key_init().

    Notes:
    1. fl6_flowlabel is used instead of fl6.flowlabel in __ip6_datagram_connect
    2. ipv6_addr_is_multicast(&fl6->daddr) is used instead of
    (addr_type & IPV6_ADDR_MULTICAST) in ip6_datagram_flow_key_init()

    This new function will be reused during pmtu update in the later patch.

    Signed-off-by: Martin KaFai Lau
    Cc: Cong Wang
    Cc: Eric Dumazet
    Cc: Wei Wang
    Signed-off-by: David S. Miller

    Martin KaFai Lau
     

05 Apr, 2016

1 commit

  • Process socket-level control messages by invoking
    __sock_cmsg_send in ip6_datagram_send_ctl for control messages on
    the SOL_SOCKET layer.

    This makes sure whenever ip6_datagram_send_ctl is called for
    udp and raw, we also process socket-level control messages.

    This is a bit uglier than IPv4, since IPv6 does not have
    something like ipcm_cookie. Perhaps we can later create
    a control message cookie for IPv6?

    Note that this commit interprets new control messages that
    were ignored before. As such, this commit does not change
    the behavior of IPv6 control messages.

    Signed-off-by: Soheil Hassas Yeganeh
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Soheil Hassas Yeganeh
     

30 Jan, 2016

1 commit

  • Currently, the egress interface index specified via IPV6_PKTINFO
    is ignored by __ip6_datagram_connect(), so that RFC 3542 section 6.7
    can be subverted when the user space application calls connect()
    before sendmsg().
    Fix it by initializing properly flowi6_oif in connect() before
    performing the route lookup.

    Signed-off-by: Paolo Abeni
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Paolo Abeni
     

03 Dec, 2015

1 commit

  • This patch addresses multiple problems :

    UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions
    while socket is not locked : Other threads can change np->opt
    concurrently. Dmitry posted a syzkaller
    (http://github.com/google/syzkaller) program demonstrating
    use-after-free.

    Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock()
    and dccp_v6_request_recv_sock() also need to use RCU protection
    to dereference np->opt once (before calling ipv6_dup_options())

    This patch adds full RCU protection to np->opt

    Reported-by: Dmitry Vyukov
    Signed-off-by: Eric Dumazet
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Eric Dumazet
     

26 Sep, 2015

1 commit

  • This is to document that socket lock might not be held at this point.

    skb_set_owner_w() and ipv6_local_error() are using proper atomic ops
    or spinlocks, so we promote the socket to non const when calling them.

    netfilter hooks should never assume socket lock is held,
    we also promote the socket to non const.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

30 Jul, 2015

1 commit

  • This patch creates sk_set_txhash and eliminates protocol specific
    inet_set_txhash and ip6_set_txhash. sk_set_txhash simply sets a
    random number instead of performing flow dissection. sk_set_txhash
    is also allowed to be called multiple times for the same socket;
    we'll need this when redoing the hash for negative routing advice.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

23 Jul, 2015

1 commit


16 Jul, 2015

1 commit

  • ip6_datagram_connect() is doing a lot of socket changes without
    socket being locked.

    This looks wrong, at least for udp_lib_rehash() which could corrupt
    lists because of concurrent udp_sk(sk)->udp_portaddr_hash accesses.

    Signed-off-by: Eric Dumazet
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Eric Dumazet
     

10 Jul, 2015

1 commit


24 Jun, 2015

1 commit

  • ICMP messages can trigger ICMP and local errors. In this case
    serr->port is 0 and starting from Linux 4.0 we do not return
    the original target address to the error queue readers.
    Add function to define which errors provide addr_offset.
    With this fix my ping command is not silent anymore.

    Fixes: c247f0534cc5 ("ip: fix error queue empty skb handling")
    Signed-off-by: Julian Anastasov
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Julian Anastasov
     

01 Apr, 2015

1 commit

  • The ipv6 code uses a mixture of coding styles. In some instances check for NULL
    pointer is done as x == NULL and sometimes as !x. !x is preferred according to
    checkpatch and this patch makes the code consistent by adopting the latter
    form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

09 Mar, 2015

1 commit

  • When reading from the error queue, msg_name and msg_control are only
    populated for some errors. A new exception for empty timestamp skbs
    added a false positive on icmp errors without payload.

    `traceroute -M udpconn` only displayed gateways that return payload
    with the icmp error: the embedded network headers are pulled before
    sock_queue_err_skb, leaving an skb with skb->len == 0 otherwise.

    Fix this regression by refining when msg_name and msg_control
    branches are taken. The solutions for the two fields are independent.

    msg_name only makes sense for errors that configure serr->port and
    serr->addr_offset. Test the first instead of skb->len. This also fixes
    another issue. saddr could hold the wrong data, as serr->addr_offset
    is not initialized in some code paths, pointing to the start of the
    network header. It is only valid when serr->port is set (non-zero).

    msg_control support differs between IPv4 and IPv6. IPv4 only honors
    requests for ICMP and timestamps with SOF_TIMESTAMPING_OPT_CMSG. The
    skb->len test can simply be removed, because skb->dev is also tested
    and never true for empty skbs. IPv6 honors requests for all errors
    aside from local errors and timestamps on empty skbs.

    In both cases, make the policy more explicit by moving this logic to
    a new function that decides whether to process msg_control and that
    optionally prepares the necessary fields in skb->cb[]. After this
    change, the IPv4 and IPv6 paths are more similar.

    The last case is rxrpc. Here, simply refine to only match timestamps.

    Fixes: 49ca0d8bfaf3 ("net-timestamp: no-payload option")

    Reported-by: Jan Niehusmann
    Signed-off-by: Willem de Bruijn

    ----

    Changes
    v1->v2
    - fix local origin test inversion in ip6_datagram_support_cmsg
    - make v4 and v6 code paths more similar by introducing analogous
    ipv4_datagram_support_cmsg
    - fix compile bug in rxrpc
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

03 Feb, 2015

1 commit

  • Add timestamping option SOF_TIMESTAMPING_OPT_TSONLY. For transmit
    timestamps, this loops timestamps on top of empty packets.

    Doing so reduces the pressure on SO_RCVBUF. Payload inspection and
    cmsg reception (aside from timestamps) are no longer possible. This
    works together with a follow on patch that allows administrators to
    only allow tx timestamping if it does not loop payload or metadata.

    Signed-off-by: Willem de Bruijn

    ----

    Changes (rfc -> v1)
    - add documentation
    - remove unnecessary skb->len test (thanks to Richard Cochran)
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

16 Jan, 2015

1 commit

  • The sockaddr is returned in IP(V6)_RECVERR as part of errhdr. That
    structure is defined and allocated on the stack as

    struct {
    struct sock_extended_err ee;
    struct sockaddr_in(6) offender;
    } errhdr;

    The second part is only initialized for certain SO_EE_ORIGIN values.
    Always initialize it completely.

    An MTU exceeded error on a SOCK_RAW/IPPROTO_RAW is one example that
    would return uninitialized bytes.

    Signed-off-by: Willem de Bruijn

    ----

    Also verified that there is no padding between errhdr.ee and
    errhdr.offender that could leak additional kernel data.
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

11 Dec, 2014

1 commit


09 Dec, 2014

1 commit

  • Allow reading of timestamps and cmsg at the same time on all relevant
    socket families. One use is to correlate timestamps with egress
    device, by asking for cmsg IP_PKTINFO.

    On AF_INET sockets, call the relevant function (ip_cmsg_recv). To
    avoid changing legacy expectations, only do so if the caller sets a
    new timestamping flag SOF_TIMESTAMPING_OPT_CMSG.

    On AF_INET6 sockets, IPV6_PKTINFO and all other recv cmsg are already
    returned for all origins. The only change is to set ifindex, which is
    not initialized for all error origins.

    In both cases, only generate the pktinfo message if an ifindex is
    known. This is not the case for ACK timestamps.

    The difference between the protocol families is probably a historical
    accident as a result of the different conditions for generating cmsg
    in the relevant ip(v6)_recv_error function:

    ipv4: if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP) {
    ipv6: if (serr->ee.ee_origin != SO_EE_ORIGIN_LOCAL) {

    At one time, this was the same test bar for the ICMP/ICMP6
    distinction. This is no longer true.

    Signed-off-by: Willem de Bruijn

    ----

    Changes
    v1 -> v2
    large rewrite
    - integrate with existing pktinfo cmsg generation code
    - on ipv4: only send with new flag, to maintain legacy behavior
    - on ipv6: send at most a single pktinfo cmsg
    - on ipv6: initialize fields if not yet initialized

    The recv cmsg interfaces are also relevant to the discussion of
    whether looping packet headers is problematic. For v6, cmsgs that
    identify many headers are already returned. This patch expands
    that to v4. If it sounds reasonable, I will follow with patches

    1. request timestamps without payload with SOF_TIMESTAMPING_OPT_TSONLY
    (http://patchwork.ozlabs.org/patch/366967/)
    2. sysctl to conditionally drop all timestamps that have payload or
    cmsg from users without CAP_NET_RAW.
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

12 Nov, 2014

1 commit

  • Use the more common dynamic_debug capable net_dbg_ratelimited
    and remove the LIMIT_NETDEBUG macro.

    All messages are still ratelimited.

    Some KERN_ uses are changed to KERN_DEBUG.

    This may have some negative impact on messages that were
    emitted at KERN_INFO that are not enabled at all unless
    DEBUG is defined or dynamic_debug is enabled. Even so,
    these messages are now _not_ emitted by default.

    This also eliminates the use of the net_msg_warn sysctl
    "/proc/sys/net/core/warnings". For backward compatibility,
    the sysctl is not removed, but it has no function. The extern
    declaration of net_msg_warn is removed from sock.h and made
    static in net/core/sysctl_net_core.c

    Miscellanea:

    o Update the sysctl documentation
    o Remove the embedded uses of pr_fmt
    o Coalesce format fragments
    o Realign arguments

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

06 Nov, 2014

1 commit

  • This encapsulates all of the skb_copy_datagram_iovec() callers
    with call argument signature "skb, offset, msghdr->msg_iov, length".

    When we move to iov_iters in the networking, the iov_iter object will
    sit in the msghdr.

    Having a helper like this means there will be fewer places to touch
    during that transformation.

    Based upon descriptions and patch from Al Viro.

    Signed-off-by: David S. Miller

    David S. Miller
     

02 Sep, 2014

1 commit

  • sk->sk_error_queue is dequeued in four locations. All share the
    exact same logic. Deduplicate.

    Also collapse the two critical sections for dequeue (at the top of
    the recv handler) and signal (at the bottom).

    This moves signal generation for the next packet forward, which should
    be harmless.

    It also changes the behavior if the recv handler exits early with an
    error. Previously, a signal for follow-up packets on the errqueue
    would then not be scheduled. The new behavior, to always signal, is
    arguably a bug fix.

    For rxrpc, the change causes the same function to be called repeatedly
    for each queued packet (because the recv handler == sk_error_report).
    It is likely that all packets will fail for the same reason (e.g.,
    memory exhaustion).

    This code runs without sk_lock held, so it is not safe to trust that
    sk->sk_err is immutable between releasing q->lock and the subsequent
    test. Introduce int err just to avoid this potential race.

    Signed-off-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

25 Aug, 2014

1 commit

  • This patch makes no changes to the logic of the code but simply addresses
    coding style issues as detected by checkpatch.

    Both objdump and diff -w show no differences.

    A number of items are addressed in this patch:
    * Multiple spaces converted to tabs
    * Spaces before tabs removed.
    * Spaces in pointer typing cleansed (char *)foo etc.
    * Remove space after sizeof
    * Ensure spacing around comparators such as if statements.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

08 Jul, 2014

1 commit

  • For a connected socket we can precompute the flow hash for setting
    in skb->hash on output. This is a performance advantage over
    calculating the skb->hash for every packet on the connection. The
    computation is done using the common hash algorithm to be consistent
    with computations done for packets of the connection in other states
    where there is no socket (e.g. time-wait, syn-recv, syn-cookies).

    This patch adds sk_txhash to the sock structure. inet_set_txhash and
    ip6_set_txhash functions are added which are called from points in
    TCP and UDP where socket moves to established state.

    skb_set_hash_from_sk is a function which sets skb->hash from the
    sock txhash value. This is called in UDP and TCP transmit path when
    transmitting within the context of a socket.

    Tested: ran super_netperf with 200 TCP_RR streams over a vxlan
    interface (in this case skb_get_hash called on every TX packet to
    create a UDP source port).

    Before fix:

    95.02% CPU utilization
    154/256/505 90/95/99% latencies
    1.13042e+06 tps

    Time in functions:
    0.28% skb_flow_dissect
    0.21% __skb_get_hash

    After fix:

    94.95% CPU utilization
    156/254/485 90/95/99% latencies
    1.15447e+06

    Neither __skb_get_hash nor skb_flow_dissect appear in perf

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

23 Jan, 2014

1 commit

  • This change allows an anycast address to be considered valid as a source
    address when given via an IPV6_PKTINFO or IPV6_2292PKTINFO ancillary data
    item. So, when sending a datagram with ancillary data, the unicast and
    anycast addresses are handled in the same way.

    - Adds ipv6_chk_acast_addr_src() to check if an anycast address is link-local
    on given interface or is global.
    - Uses it in ip6_datagram_send_ctl().

    Signed-off-by: Francois-Xavier Le Bail
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    FX Le Bail
     

22 Jan, 2014

1 commit

  • Some ipv6 protocols cannot handle ipv4 addresses, so we must not allow
    connecting and binding to them. sendmsg logic does already check msg->name
    for this but must trust already connected sockets which could be set up
    for connection to ipv4 address family.

    Per-socket flag ipv6only is of no use here, as it is under users control
    by setsockopt.

    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

20 Jan, 2014

1 commit

  • We currently don't report IPV6_RECVPKTINFO in cmsg access ancillary data
    for IPv4 datagrams on IPv6 sockets.

    This patch splits the ip6_datagram_recv_ctl into two functions, one
    which handles both protocol families, AF_INET and AF_INET6, while the
    ip6_datagram_recv_specific_ctl only handles IPv6 cmsg data.

    ip6_datagram_recv_*_ctl never reported back any errors, so we can make
    them return void. Also provide a helper for protocols which don't offer dual
    personality to further use ip6_datagram_recv_ctl, which is exported to
    modules.

    I needed to shuffle the code for ping around a bit to make it easier to
    implement dual personality for ping ipv6 sockets in future.

    Reported-by: Gert Doering
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

19 Jan, 2014

1 commit

  • This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg
    handler msg_name and msg_namelen logic").

    DECLARE_SOCKADDR validates that the structure we use for writing the
    name information to is not larger than the buffer which is reserved
    for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
    consistently in sendmsg code paths.

    Signed-off-by: Steffen Hurrle
    Suggested-by: Hannes Frederic Sowa
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Steffen Hurrle
     

20 Dec, 2013

1 commit

  • Steffen Klassert says:

    ====================
    pull request (net-next): ipsec-next 2013-12-19

    1) Use the user supplied policy index instead of a generated one
    if present. From Fan Du.

    2) Make xfrm migration namespace aware. From Fan Du.

    3) Make the xfrm state and policy locks namespace aware. From Fan Du.

    4) Remove ancient sleeping when the SA is in acquire state,
    we now queue packets to the policy instead. This replaces the
    sleeping code.

    5) Remove FLOWI_FLAG_CAN_SLEEP. This was used to notify xfrm about the
    possibility to sleep. The sleeping code is gone, so remove it.

    6) Check user specified spi for IPComp. The spi for IPComp is only
    16 bit wide, so check for a valid value. From Fan Du.

    7) Export verify_userspi_info to check for valid user supplied spi ranges
    with pfkey and netlink. From Fan Du.

    8) RFC3173 states that if the total size of a compressed payload and the IPComp
    header is not smaller than the size of the original payload, the IP datagram
    must be sent in the original non-compressed form. These packets are dropped
    by the inbound policy check because they are not transformed. Document the need
    to set 'level use' for IPcomp to receive such packets anyway. From Fan Du.

    Please pull or let me know if there are problems.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

11 Dec, 2013

1 commit

  • This patch is following b579035ff766c9412e2b92abf5cab794bff102b6
    "ipv6: remove old conditions on flow label sharing"

    Since there is no reason to restrict a label to a
    destination, we should not erase the destination value of a
    socket with the value contained in the flow label storage.

    This patch makes it possible to use the same flow label for more
    than one destination.

    Signed-off-by: Florent Fourcot
    Reviewed-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Florent Fourcot
     

06 Dec, 2013

1 commit


24 Nov, 2013

2 commits

  • Offenders don't have port numbers, so set it to 0.

    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • Commit bceaa90240b6019ed73b49965eac7d167610be69 ("inet: prevent leakage
    of uninitialized memory to user in recv syscalls") conditionally updated
    addr_len if the msg_name is written to. The recv_error and rxpmtu
    functions relied on the recvmsg functions to set up addr_len before.

    As this does not happen any more we have to pass addr_len to those
    functions as well and set it to the size of the corresponding sockaddr
    length.

    This broke traceroute and such.

    Fixes: bceaa90240b6 ("inet: prevent leakage of uninitialized memory to user in recv syscalls")
    Reported-by: Brad Spengler
    Reported-by: Tom Labanowski
    Cc: mpb
    Cc: David S. Miller
    Cc: Eric Dumazet
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

09 Oct, 2013

1 commit

  • TCP listener refactoring, part 4 :

    To speed up inet lookups, we moved IPv4 addresses from inet to struct
    sock_common

    Now is time to do the same for IPv6, because it permits us to have fast
    lookups for all kind of sockets, including upcoming SYN_RECV.

    Getting IPv6 addresses in TCP lookups currently requires two extra cache
    lines, plus a dereference (and memory stall).

    inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6

    This patch is way bigger than its IPv4 counter part, because for IPv4,
    we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
    it's not doable easily.

    inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
    inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr

    And timewait sockets also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
    at the same offset.

    We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
    macro.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet