13 Oct, 2012

1 commit


30 Aug, 2012

1 commit

  • The IPv6 conntrack fragmentation currently has a couple of shortcomings.
    Fragments are collected in PREROUTING/OUTPUT, are defragmented, the
    defragmented packet is then passed to conntrack, the resulting conntrack
    information is attached to each original fragment and the fragments then
    continue their way through the stack.

    Helper invocation occurs in the POSTROUTING hook, at which point only
    the original fragments are available. The result of this is that
    fragmented packets are never passed to helpers.

    This patch improves the situation in the following way:

    - If a reassembled packet belongs to a connection that has a helper
    assigned, the reassembled packet is passed through the stack instead
    of the original fragments.

    - During defragmentation, the largest received fragment size is stored.
    On output, the packet is refragmented if required. If the largest
    received fragment size exceeds the outgoing MTU, a "packet too big"
    message is generated, thus behaving as if the original fragments
    were passed through the stack from an outside point of view.

    - The ipv6_helper() hook function can't receive fragments anymore for
    connections using a helper, so it is switched to use ipv6_skip_exthdr()
    instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the
    reassembled packets are passed to connection tracking helpers.

    The result of this is that we can properly track fragmented packets, but
    still generate ICMPv6 Packet too big messages if we would have before.

    This patch is also required as a precondition for IPv6 NAT, where NAT
    helpers might enlarge packets up to a point that they require
    fragmentation. In that case we can't generate Packet too big messages
    since the proper MTU can't be calculated in all cases (e.g. when
    changing the textual representation of a variable number of addresses),
    so the packet is transparently fragmented iff the original packet or
    fragments would have fit the outgoing MTU.

    IPVS parts by Jesper Dangaard Brouer.

    Signed-off-by: Patrick McHardy

    Patrick McHardy
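
The output-side rule described in this commit message can be sketched as a small pure function. This is a userspace illustration only, not the kernel code: the names `out_action`, `output_decision` and the convention that `frag_max_size == 0` means "the packet never was fragmented" are assumptions made for the example, and the NAT case where a helper enlarges the packet is not modelled.

```c
#include <stddef.h>

/* Hypothetical result codes for the output path. */
enum out_action {
    OUT_SEND,         /* packet fits, send unchanged */
    OUT_REFRAGMENT,   /* reassembled packet must be split again */
    OUT_PKT_TOO_BIG   /* answer with ICMPv6 "packet too big" */
};

/*
 * pkt_len:       length of the (possibly reassembled) packet
 * frag_max_size: largest fragment seen during defragmentation,
 *                0 if the packet arrived unfragmented (assumption)
 * mtu:           MTU of the outgoing link
 */
enum out_action output_decision(size_t pkt_len, size_t frag_max_size, size_t mtu)
{
    /* An original fragment would not have fit: behave as if it had been
     * forwarded itself and report the problem to the sender. */
    if (frag_max_size > mtu)
        return OUT_PKT_TOO_BIG;

    if (pkt_len <= mtu)
        return OUT_SEND;

    /* Too large as a whole, but the original fragments would have fit. */
    if (frag_max_size != 0)
        return OUT_REFRAGMENT;

    /* Unfragmented packet larger than the MTU. */
    return OUT_PKT_TOO_BIG;
}
```

For example, a 3000-byte reassembled packet whose largest fragment was 1400 bytes is refragmented on a 1500-byte link, but triggers "packet too big" on a 1280-byte link if its largest fragment was 1500 bytes.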
     

07 Aug, 2012

1 commit

  • IPv6 needs a cookie in dst_check() call.

    We need to add rx_dst_cookie and provide a family independent
    sk_rx_dst_set(sk, skb) method to properly support IPv6 TCP early demux.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
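
The idea of validating a cached receive route with a cookie can be modelled in userspace as follows. The commit message does not say what the cookie contains; this sketch assumes it is a generation counter that changes whenever the route becomes stale. All type and function names (`model_route`, `model_sock`, `model_rx_dst_set`, `model_rx_dst_check`) are hypothetical and only mirror the roles of `rx_dst_cookie` and `sk_rx_dst_set()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a routing entry; not a kernel type. */
struct model_route {
    uint32_t serial;                   /* assumed to change when the route goes stale */
};

/* Hypothetical stand-in for a socket with a cached input route. */
struct model_sock {
    const struct model_route *rx_dst;  /* cached route, may be NULL */
    uint32_t rx_dst_cookie;            /* serial seen when the route was cached */
};

/* Cache a route together with the cookie needed to revalidate it later. */
void model_rx_dst_set(struct model_sock *sk, const struct model_route *rt)
{
    sk->rx_dst = rt;
    sk->rx_dst_cookie = rt->serial;
}

/* Return the cached route if it is still current; otherwise drop it. */
const struct model_route *model_rx_dst_check(struct model_sock *sk)
{
    if (sk->rx_dst != NULL && sk->rx_dst->serial == sk->rx_dst_cookie)
        return sk->rx_dst;

    sk->rx_dst = NULL;
    return NULL;
}
```

Storing the cookie next to the cached pointer is what lets the check run without consulting the routing table again on the fast path.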
     

18 Jul, 2012

1 commit

  • We should pass a struct flowi6 pointer to inet6_csk_route_socket(),
    so that inet6_csk_xmit() works correctly instead of sending garbage.

    Also add some consts.

    Signed-off-by: Eric Dumazet
    Reported-by: Yuchung Cheng
    Cc: Neal Cardwell
    Signed-off-by: David S. Miller

    Eric Dumazet
     

11 Jul, 2012

1 commit


13 Feb, 2012

2 commits

  • Currently, it is not easily possible to get TOS/DSCP value of packets from
    an incoming TCP stream. The mechanism is there, IP_PKTOPTIONS getsockopt
    with IP_RECVTOS set, the same way as incoming TTL can be queried. This is
    not actually implemented for TOS, though.

    This patch adds this functionality, both for IPv4 (IP_PKTOPTIONS) and IPv6
    (IPV6_2292PKTOPTIONS). For IPv4, like in the IP_RECVTTL case, the value of
    the TOS field is stored from the other party's ACK.

    This is needed for proxies which require DSCP transparency. One such example
    is at http://zph.bratcheda.org/.

    Signed-off-by: Jiri Benc
    Signed-off-by: David S. Miller

    Jiri Benc
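
A userspace sketch of how a proxy might read the value on IPv4, assuming a Linux system with glibc headers. The helper names (`enable_recvtos`, `tos_from_pktoptions`, `query_tcp_tos`) are made up for this example. The parser accepts either a one-byte or an int-sized IP_TOS payload, since the payload size is not stated in the commit message.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Ask the kernel to record the TOS of incoming packets on this socket. */
int enable_recvtos(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &on, sizeof(on));
}

/* Scan a buffer of control messages for IP_TOS.
 * Returns the TOS byte (0..255), or -1 if no such message is present. */
int tos_from_pktoptions(void *buf, size_t len)
{
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_control = buf;
    msg.msg_controllen = len;

    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg != NULL; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level != IPPROTO_IP || cmsg->cmsg_type != IP_TOS)
            continue;
        if (cmsg->cmsg_len >= CMSG_LEN(sizeof(int))) {
            int value;
            memcpy(&value, CMSG_DATA(cmsg), sizeof(value));
            return value & 0xff;
        }
        if (cmsg->cmsg_len >= CMSG_LEN(1))
            return *(unsigned char *)CMSG_DATA(cmsg);
    }
    return -1;
}

/* Fetch the stored packet options of a connected TCP socket and extract TOS. */
int query_tcp_tos(int fd)
{
    union {
        struct cmsghdr align;          /* forces suitable alignment */
        unsigned char bytes[256];
    } u;
    socklen_t len = sizeof(u.bytes);

    if (getsockopt(fd, IPPROTO_IP, IP_PKTOPTIONS, u.bytes, &len) < 0)
        return -1;
    return tos_from_pktoptions(u.bytes, len);
}
```

A proxy would call `enable_recvtos()` on the accepted socket and later `query_tcp_tos()` to copy the DSCP bits onto its outgoing connection.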
     
  • Implement helper inline function to get traffic class from IPv6 header.

    Signed-off-by: Jiri Benc
    Signed-off-by: David S. Miller

    Jiri Benc
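
For illustration, the traffic class occupies the eight bits that follow the 4-bit version field in the IPv6 header. The function below is a userspace sketch operating on raw header bytes; its name is hypothetical and it is not the kernel helper itself.

```c
#include <stdint.h>

/* Extract the traffic class from the first two bytes of an IPv6 header.
 * Byte 0: version (high nibble) + upper 4 bits of the traffic class.
 * Byte 1: lower 4 bits of the traffic class + top of the flow label. */
uint8_t ipv6_traffic_class(const uint8_t *hdr)
{
    return (uint8_t)(((hdr[0] & 0x0f) << 4) | (hdr[1] >> 4));
}
```

A header starting with the bytes 0x6b 0x80 therefore carries traffic class 0xb8.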
     

09 Feb, 2012

1 commit


12 Dec, 2011

1 commit


25 Nov, 2010

1 commit

  • ipv6_sk_mc_lock rwlock becomes a spinlock.

    Readers (inet6_mc_check()) now take rcu_read_lock() instead of the read
    lock. Writers don't need to disable BH anymore.

    struct ipv6_mc_socklist objects are reclaimed after one RCU grace
    period.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Oct, 2010

1 commit


23 Aug, 2010

1 commit

  • __packed is only defined in kernel space, so we should use
    __attribute__((packed)) for the code shared between kernel and user space.

    Two __attribute() annotations are replaced with __attribute__() too.

    Signed-off-by: Changli Gao
    Signed-off-by: David S. Miller

    Changli Gao
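
A minimal illustration of the spelled-out attribute, which GCC and Clang accept in userspace without any kernel headers. The structure and its fields are invented for the example.

```c
#include <stdint.h>

/* Hypothetical wire-format structure shared between kernel and user space.
 * The full attribute spelling works without the kernel-only __packed macro. */
struct example_wire_hdr {
    uint8_t  type;
    uint32_t value;
} __attribute__((packed));

/* Without packing, padding after 'type' would typically make this 8 bytes. */
unsigned long example_wire_hdr_size(void)
{
    return (unsigned long)sizeof(struct example_wire_hdr);
}
```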
     

20 Jul, 2010

1 commit

  • Even with jumbograms I cannot see any way in which we would need
    to record a next-header offset larger than 65535.

    The maximum extension header length is (256 << 3) == 2048.
    There are only a handful of extension headers specified which
    we'd even accept (say 5 or 6), therefore the largest next-header
    offset we'd ever have to contend with is something less than
    say 16k.

    Therefore make it a u16 instead of a u32.

    Signed-off-by: David S. Miller

    David S. Miller
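
The arithmetic behind this estimate can be checked with a few lines of C. The length field of an extension header is 8 bits and counts 8-octet units beyond the first, so the maximum length is (255 + 1) << 3. The function names and the figure of six headers (taken from the "5 or 6" in the message) are illustrative.

```c
#include <stdint.h>

/* Length in bytes of an extension header with the given 8-bit length field. */
uint32_t ext_hdr_len(uint8_t hdrlen_field)
{
    return ((uint32_t)hdrlen_field + 1u) << 3;
}

/* Upper bound on the next-header offset for a chain of 'count' extension
 * headers, each of maximum size. */
uint32_t max_chain_offset(uint32_t count)
{
    return count * ext_hdr_len(255);
}
```

Six maximum-size headers give 12288 bytes, which is below 16k and well within the range of a u16.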
     

03 Jun, 2010

1 commit


11 May, 2010

2 commits

  • This patch adds support for multiple independent multicast routing instances,
    named "tables".

    Userspace multicast routing daemons can bind to a specific table instance by
    issuing a setsockopt call using a new option MRT6_TABLE. The table number is
    stored in the raw socket data and affects all following ip6mr setsockopt(),
    getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT)
    is created with a default routing rule pointing to it. Newly created pim6reg
    devices have the table number appended ("pim6regX"), with the exception of
    devices created in the default table, which are named just "pim6reg" for
    compatibility reasons.

    Packets are directed to a specific table instance using routing rules,
    similar to how regular routing rules work. Currently iif, oif and mark
    are supported as keys; support for source and destination addresses
    could be added later.

    Example usage:

    - bind pimd/xorp/... to a specific table:

    uint32_t table = 123;
    setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table));

    - create routing rules directing packets to the new table:

    # ip -6 mrule add iif eth0 lookup 123
    # ip -6 mrule add oif eth0 lookup 123

    Signed-off-by: Patrick McHardy

    Patrick McHardy
     
  • Conflicts:
    net/bridge/br_device.c
    net/bridge/br_forward.c

    Signed-off-by: Patrick McHardy

    Patrick McHardy
     

24 Apr, 2010

2 commits

  • Finally add support to detect a local IPV6_DONTFRAG event
    and return the relevant data to the user if they've enabled
    IPV6_RECVPATHMTU on the socket. The next recvmsg() will
    return no data, but have an IPV6_PATHMTU as ancillary data.

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
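
A userspace sketch of how an application might use these options, assuming Linux with glibc headers that provide `struct ip6_mtuinfo` (RFC 3542). The helper names `enable_pathmtu_reports` and `pathmtu_from_cmsgs` are invented for the example. The second function only parses ancillary data that a prior `recvmsg()` call has already filled in.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc: exposes struct ip6_mtuinfo under strict -std modes */
#endif
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Disable local fragmentation and request path MTU notifications. */
int enable_pathmtu_reports(int fd)
{
    int on = 1;

    if (setsockopt(fd, IPPROTO_IPV6, IPV6_DONTFRAG, &on, sizeof(on)) < 0)
        return -1;
    return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPATHMTU, &on, sizeof(on));
}

/* Look for an IPV6_PATHMTU item in the ancillary data of a received message.
 * Returns 1 and stores the MTU in *mtu if found, 0 otherwise. */
int pathmtu_from_cmsgs(struct msghdr *msg, uint32_t *mtu)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        struct ip6_mtuinfo info;

        if (cmsg->cmsg_level != IPPROTO_IPV6 || cmsg->cmsg_type != IPV6_PATHMTU)
            continue;
        if (cmsg->cmsg_len < CMSG_LEN(sizeof(info)))
            continue;
        memcpy(&info, CMSG_DATA(cmsg), sizeof(info));
        *mtu = info.ip6m_mtu;
        return 1;
    }
    return 0;
}
```

After a send that exceeds the path MTU, the application would call `recvmsg()` with a control buffer and pass the resulting `msghdr` to `pathmtu_from_cmsgs()`.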
     
  • Add underlying data structure changes and basic setsockopt()
    and getsockopt() support for IPV6_RECVPATHMTU, IPV6_PATHMTU,
    and IPV6_DONTFRAG. IPV6_PATHMTU is actually fully functional
    at this point.

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
     

23 Apr, 2010

1 commit

  • This patch adds IPv6 support for RFC5082 Generalized TTL Security Mechanism.

    Note to users of mapped addresses: the IPv6 and IPv4 socket options are
    separate. The server has to deal with both IPv4 and IPv6 socket options,
    and the client has to handle the different options for each family.

    On client:
        int ttl = 255;
        getaddrinfo(argv[1], argv[2], &hint, &result);

        for (rp = result; rp != NULL; rp = rp->ai_next) {
            s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
            if (s < 0) continue;

            /* send with the maximum TTL / hop limit */
            if (rp->ai_family == AF_INET) {
                setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
            } else if (rp->ai_family == AF_INET6) {
                setsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
                           &ttl, sizeof(ttl));
            }

            if (connect(s, rp->ai_addr, rp->ai_addrlen) == 0) {
                ...

    On server:
        int minttl = 255 - maxhops;

        getaddrinfo(NULL, port, &hints, &result);
        for (rp = result; rp != NULL; rp = rp->ai_next) {
            s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
            if (s < 0) continue;

            /* reject packets that arrive with a lower TTL / hop count */
            if (rp->ai_family == AF_INET6)
                setsockopt(s, IPPROTO_IPV6, IPV6_MINHOPCOUNT,
                           &minttl, sizeof(minttl));
            setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl, sizeof(minttl));

            if (bind(s, rp->ai_addr, rp->ai_addrlen) == 0)
                break;
            ...

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

13 Apr, 2010

1 commit


19 Oct, 2009

1 commit

  • In order to have better cache layouts of struct sock (separate zones
    for rx/tx paths), we need this preliminary patch.

    The goal is to transfer the fields used at lookup time into the first
    read-mostly cache line (inside struct sock_common) and move sk_refcnt
    to a separate cache line (only written by the rx path).

    This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
    sport and id fields. This allows a future patch to define these
    fields as macros, like sk_refcnt, without name clashes.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
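
Why the prefix matters can be seen in a small model. Once a field name becomes a macro, every use of that identifier is rewritten by the preprocessor, so a short name such as `daddr` would also rewrite unrelated structures that have a member called `daddr`. The types below are hypothetical and only imitate the layout idea; they are not the kernel definitions.

```c
#include <stdint.h>

/* Hypothetical read-mostly area holding the lookup keys. */
struct model_sock_common {
    uint32_t skc_daddr;
};

/* Hypothetical socket type embedding the common area. */
struct model_inet_sock {
    struct model_sock_common common;
    uint16_t inet_id;
};

/* The prefixed name can become a macro that redirects into the common area.
 * An unprefixed "#define daddr ..." would also rewrite the member below. */
#define inet_daddr common.skc_daddr

/* An unrelated structure that legitimately has a member named daddr. */
struct model_ip_header {
    uint32_t daddr;
};
```

Code that writes `inet.inet_daddr` keeps compiling unchanged while the storage moves, and `struct model_ip_header` is unaffected.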
     

09 Oct, 2009

1 commit

  • (This patch fixes a bug in commit f7734fdf61ec6bb848e0bafc1fb8bad2c124bb50,
    titled "make TLLAO option for NA packets configurable".)

    When the IPv6 conf is used, the function sysctl_set_parent() is called with
    the array addrconf_sysctl as a parameter.

    The above commit added the new conf entry "force_tllao" to the array
    addrconf_sysctl, but the size of the array was not modified: the statically
    allocated size is DEVCONF_MAX + 1 but the real size is DEVCONF_MAX + 2, so
    sysctl_set_parent() accessed a wrong address.

    I got the following information.
    Call Trace:
    [] sysctl_set_parent+0x29/0x3e
    [] sysctl_set_parent+0x29/0x3e
    [] sysctl_set_parent+0x29/0x3e
    [] sysctl_set_parent+0x29/0x3e
    [] sysctl_set_parent+0x29/0x3e
    [] __register_sysctl_paths+0xde/0x272
    [] ? __kmalloc_track_caller+0x16e/0x180
    [] ? __addrconf_sysctl_register+0xc5/0x144 [ipv6]
    [] register_net_sysctl_table+0x48/0x4b
    [] __addrconf_sysctl_register+0xf7/0x144 [ipv6]
    [] addrconf_init_net+0xd4/0x104 [ipv6]
    [] setup_net+0x35/0x82
    [] copy_net_ns+0x76/0xe0
    [] create_new_namespaces+0xf0/0x16e
    [] copy_namespaces+0x65/0x9f
    [] copy_process+0xb2c/0x12c3
    [] do_fork+0x14b/0x2d2
    [] ? up_read+0xe/0x10
    [] ? do_page_fault+0x27a/0x2aa
    [] sys_clone+0x28/0x2a
    [] stub_clone+0x13/0x20
    [] ? system_call_fastpath+0x16/0x1b

    The IPv6-related options in .config are as follows:
    CONFIG_IPV6=m
    CONFIG_IPV6_PRIVACY=y
    CONFIG_IPV6_ROUTER_PREF=y
    CONFIG_IPV6_ROUTE_INFO=y
    CONFIG_IPV6_OPTIMISTIC_DAD=y
    CONFIG_IPV6_MIP6=m
    CONFIG_IPV6_SIT=m
    # CONFIG_IPV6_SIT_6RD is not set
    CONFIG_IPV6_NDISC_NODETYPE=y
    CONFIG_IPV6_TUNNEL=m
    CONFIG_IPV6_MULTIPLE_TABLES=y
    CONFIG_IPV6_SUBTREES=y
    CONFIG_IPV6_MROUTE=y
    CONFIG_IPV6_PIMSM_V2=y
    # CONFIG_IP_VS_IPV6 is not set
    CONFIG_NF_CONNTRACK_IPV6=m
    CONFIG_IP6_NF_MATCH_IPV6HEADER=m

    I confirmed this patch fixes this problem.

    Signed-off-by: Jin Dongming
    Signed-off-by: David S. Miller

    Jin Dongming
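
The class of bug described here, a terminated table whose declared size no longer covers all entries plus the terminator, can be illustrated in userspace. Everything below is a simplified model with hypothetical names; the entry list and `MODEL_DEVCONF_MAX` do not correspond to the real kernel table.

```c
#include <stddef.h>

#define MODEL_DEVCONF_MAX 3   /* hypothetical number of original settings */

struct model_ctl_entry {
    const char *procname;     /* NULL marks the end of the table */
};

/* Three original entries, one added entry, one terminator: the declared
 * size has to grow together with the initializer list. */
static const struct model_ctl_entry model_table[MODEL_DEVCONF_MAX + 2] = {
    { "forwarding" },
    { "hop_limit" },
    { "mtu" },
    { "force_tllao" },
    { NULL },
};

/* Compile-time guard so a future entry cannot outgrow the array silently. */
_Static_assert(sizeof(model_table) / sizeof(model_table[0]) == MODEL_DEVCONF_MAX + 2,
               "table size must cover all entries plus the terminator");

/* Walk the table until the terminator, as a registration routine would. */
size_t model_count_entries(const struct model_ctl_entry *table)
{
    size_t n = 0;

    while (table[n].procname != NULL)
        n++;
    return n;
}
```

If the terminator slot is missing, a walker like `model_count_entries()` reads past the end of the array, which matches the symptom reported in the call trace.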
     

07 Oct, 2009

1 commit

  • On Friday 02 October 2009 20:53:51 you wrote:

    > This is good although I would have shortened the name.

    Ah, I knew I forgot something :) Here is v4.

    tavi

    >From 24d96d825b9fa832b22878cc6c990d5711968734 Mon Sep 17 00:00:00 2001
    From: Octavian Purdila
    Date: Fri, 2 Oct 2009 00:51:15 +0300
    Subject: [PATCH] ipv6: new sysctl for sending TLLAO with unicast NAs

    Neighbor advertisements responding to unicast neighbor solicitations
    did not include the target link-layer address option. This patch adds
    a new sysctl option (disabled by default) which controls whether this
    option should be sent even with unicast NAs.

    The need for this arose because certain routers expect the TLLAO in
    some situations even as a response to unicast NS packets.

    Moreover, RFC 2461 recommends sending this option to avoid a race
    condition (section 4.4, Target link-layer address).

    Signed-off-by: Cosmin Ratiu
    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
     

01 Jun, 2009

1 commit

  • Add 'autoconf' and 'disable_ipv6' parameters to the IPv6 module.

    The first controls if IPv6 addresses are autoconfigured from
    prefixes received in Router Advertisements. The IPv6 loopback
    (::1) and link-local addresses are still configured.

    The second controls if IPv6 addresses are desired at all. No
    IPv6 addresses will be added to any interfaces.

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
     

31 Jan, 2009

1 commit


16 Dec, 2008

1 commit

  • There are three reasons for me to add this support:
    1. When no interface is specified in an IPV6_PKTINFO ancillary data
    item, the interface specified in an IPV6_PKTINFO sticky option
    is used.

    RFC3542:
    6.7. Summary of Outgoing Interface Selection

    This document and [RFC-3493] specify various methods that affect the
    selection of the packet's outgoing interface. This subsection
    summarizes the ordering among those in order to ensure deterministic
    behavior.

    For a given outgoing packet on a given socket, the outgoing interface
    is determined in the following order:

    1. if an interface is specified in an IPV6_PKTINFO ancillary data
    item, the interface is used.

    2. otherwise, if an interface is specified in an IPV6_PKTINFO sticky
    option, the interface is used.

    2. When no IPV6_PKTINFO ancillary data is received, getsockopt() should
    return the sticky option value which was set with setsockopt().

    RFC 3542:
    Issuing getsockopt() for the above options will return the sticky
    option value i.e., the value set with setsockopt(). If no sticky
    option value has been set getsockopt() will return the following
    values:

    3. Make the setsockopt implementation POSIX compliant.

    Signed-off-by: Yang Hongyang
    Signed-off-by: David S. Miller

    Yang Hongyang
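
The two selection steps quoted from RFC 3542 section 6.7 can be written as a small function. This is a sketch with a hypothetical name; it covers only the two steps quoted above and treats an interface index of 0 as "not specified", which is the convention RFC 3542 uses for `ipi6_ifindex`.

```c
/* Pick the outgoing interface index for one packet.
 * ancillary_ifindex: index from an IPV6_PKTINFO ancillary data item, 0 if none
 * sticky_ifindex:    index from the IPV6_PKTINFO sticky option, 0 if none
 * Returns 0 when neither source specifies an interface, so that the caller
 * can fall back to the remaining selection methods. */
unsigned int select_outgoing_ifindex(unsigned int ancillary_ifindex,
                                     unsigned int sticky_ifindex)
{
    if (ancillary_ifindex != 0)
        return ancillary_ifindex;   /* step 1: per-packet ancillary data wins */
    if (sticky_ifindex != 0)
        return sticky_ifindex;      /* step 2: otherwise the sticky option */
    return 0;
}
```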
     

22 Jul, 2008

1 commit


03 Jul, 2008

2 commits


11 Jun, 2008

1 commit


14 Apr, 2008

2 commits


05 Apr, 2008

1 commit


26 Mar, 2008

1 commit


25 Mar, 2008

4 commits


01 Feb, 2008

2 commits