20 Sep, 2012

2 commits

  • Cc: Herbert Xu
    Cc: Michal Kubeček
    Cc: David Miller
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Amerigo Wang
     
  • Two years ago, Shan Wei tried to fix this:
    http://patchwork.ozlabs.org/patch/43905/

    The problem is that RFC2460 says an ICMP Time
    Exceeded -- Fragment Reassembly Time Exceeded message should be
    sent to the source of that fragment if the defragmentation
    times out.

    "
    If insufficient fragments are received to complete reassembly of a
    packet within 60 seconds of the reception of the first-arriving
    fragment of that packet, reassembly of that packet must be
    abandoned and all the fragments that have been received for that
    packet must be discarded. If the first fragment (i.e., the one
    with a Fragment Offset of zero) has been received, an ICMP Time
    Exceeded -- Fragment Reassembly Time Exceeded message should be
    sent to the source of that fragment.
    "

    As Herbert suggested, we could actually use the standard IPv6
    reassembly code which follows RFC2460.

    With this patch applied, I can see an ICMP Time Exceeded message
    sent from the receiver when the sender sends out only 3 of the 4
    fragments of an IPv6 UDP packet.

    Cc: Herbert Xu
    Cc: Michal Kubeček
    Cc: David Miller
    Cc: Hideaki YOSHIFUJI
    Cc: Patrick McHardy
    Cc: Pablo Neira Ayuso
    Cc: netfilter-devel@vger.kernel.org
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Amerigo Wang
     

25 Aug, 2012

1 commit


15 Aug, 2012

2 commits

  • Correct a long standing omission and use struct pid in the owner
    field of struct ip6_flowlabel when the share type is IPV6_FL_S_PROCESS.
    This guarantees we don't have issues when pid wraparound occurs.

    Use a kuid_t in the owner field of struct ip6_flowlabel when the
    share type is IPV6_FL_S_USER to add user namespace support.

    In /proc/net/ip6_flowlabel capture the current pid namespace when
    opening the file and release the pid namespace when the file is
    closed, ensuring we print the pid owner value that is meaningful to
    the reader of the file. Similarly use from_kuid_munged to print
    uid values that are meaningful to the reader of the file.

    This requires exporting pid_nr_ns so that ipv6 can continue to be built
    as a module. Yoiks, what silliness.

    Acked-by: David S. Miller
    Acked-by: Serge Hallyn
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     
  • GRE over IPv6 implementation.

    Signed-off-by: Dmitry Kozlov
    Signed-off-by: David S. Miller

    xeb@mail.ru
     

19 Jul, 2012

1 commit

  • Introduce ipv6_addr_hash() helper doing a XOR on all bits
    of an IPv6 address, with an optimized x86_64 version.

    Use it in flow dissector, as suggested by Andrew McGregor,
    to reduce hash collision probabilities in fq_codel (and other
    users of flow dissector)

    Use it in ip6_tunnel.c and use more bit shuffling, as suggested
    by David Laight, as existing hash was ignoring most of them.

    Use it in sunrpc and use more bit shuffling, using hash_32().

    Use it in net/ipv6/addrconf.c, using hash_32() as well.

    As a cleanup, use it in net/ipv4/tcp_metrics.c

    Signed-off-by: Eric Dumazet
    Reported-by: Andrew McGregor
    Cc: Dave Taht
    Cc: Tom Herbert
    Cc: David Laight
    Cc: Joe Perches
    Signed-off-by: David S. Miller

    Eric Dumazet
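
The XOR fold described in this commit can be sketched in userspace C. This is an illustration only, not the kernel code: `struct addr6`, `addr6_hash()` and `hash32()` are hypothetical names, and the multiplier in `hash32()` is an assumed golden-ratio constant standing in for the kernel's hash_32().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for an IPv6 address: 16 raw bytes (hypothetical type). */
struct addr6 {
    uint8_t bytes[16];
};

/* XOR-fold all 128 address bits into a single 32-bit value. */
static uint32_t addr6_hash(const struct addr6 *a)
{
    uint32_t w[4];

    assert(a != NULL);
    /* memcpy avoids alignment and aliasing problems when reading words. */
    memcpy(w, a->bytes, sizeof(w));
    return w[0] ^ w[1] ^ w[2] ^ w[3];
}

/*
 * Extra bit shuffling: multiplicative hash reduced to `bits` bits.
 * The constant is an assumption made for this sketch.
 */
static uint32_t hash32(uint32_t val, unsigned int bits)
{
    assert(bits >= 1 && bits <= 32);
    return (val * 0x9E3779B1u) >> (32 - bits);
}
```

A caller would typically combine the two, for example `hash32(addr6_hash(&addr), 8)` to pick one of 256 buckets. Because every bit of the address feeds the XOR, addresses that differ only in their low bytes no longer land in the same bucket by construction.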
     

12 Jul, 2012

1 commit


11 Jul, 2012

1 commit

  • On 64 bit arches having efficient unaligned accesses (eg x86_64) we can
    use long words to reduce number of instructions for free.

    Joe Perches suggested changing ipv6_masked_addr_cmp() to return a bool
    instead of 'int', to make sure ipv6_masked_addr_cmp() cannot be used
    in a sorting function.

    Signed-off-by: Eric Dumazet
    Cc: Joe Perches
    Signed-off-by: David S. Miller

    Eric Dumazet
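
A rough userspace sketch of the long-word comparison follows. The name `masked_addr_differs()` is hypothetical and the code assumes addresses and masks are passed as 16 raw bytes. It returns true when the masked addresses differ, which is why a bool (rather than an ordering value) is the natural return type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Compare two 128-bit addresses under a mask using two 64-bit words
 * instead of four 32-bit words. Returns true if any masked bit differs.
 */
static bool masked_addr_differs(const uint8_t *a1, const uint8_t *mask,
                                const uint8_t *a2)
{
    uint64_t x[2], m[2], y[2];

    assert(a1 != NULL && mask != NULL && a2 != NULL);
    /* memcpy keeps the sketch safe on machines without unaligned loads. */
    memcpy(x, a1, sizeof(x));
    memcpy(m, mask, sizeof(m));
    memcpy(y, a2, sizeof(y));

    /* XOR and AND are bitwise, so host byte order does not matter here. */
    return (((x[0] ^ y[0]) & m[0]) | ((x[1] ^ y[1]) & m[1])) != 0;
}
```

The result only says "same" or "different" under the mask. It carries no less-than or greater-than information, so it cannot serve as a sort comparator.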
     

19 May, 2012

1 commit


18 May, 2012

2 commits


21 Apr, 2012

2 commits


16 Apr, 2012

1 commit


04 Dec, 2011

1 commit

  • While parsing through IPv6 extension headers, fragment headers are
    skipped, making them invisible to the caller. This reports the
    fragment offset of the last header in order to make it possible to
    determine whether the packet is fragmented and, if so, whether it is
    a first or last fragment.

    Signed-off-by: Jesse Gross

    Jesse Gross
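
How a caller might use the reported offset can be sketched as below. This is an assumption-laden illustration: `classify_fragment()` and `struct frag_info` are hypothetical, and the input is the 16-bit offset/flags field of the RFC 2460 Fragment header already converted to host byte order (upper 13 bits are the offset in 8-octet units, bit 0 is the M flag).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAG_OFFSET_MASK 0xFFF8u /* upper 13 bits: offset, already in bytes */
#define FRAG_MORE_FLAG   0x0001u /* M flag: more fragments follow */

struct frag_info {
    bool fragmented;     /* a Fragment header was seen */
    bool first;          /* offset is zero */
    bool last;           /* M flag is clear */
    unsigned int offset; /* byte offset of this fragment's payload */
};

/* Classify a packet from the presence and contents of its Fragment header. */
static struct frag_info classify_fragment(bool has_frag_hdr,
                                          uint16_t frag_off_host)
{
    struct frag_info info = { false, false, false, 0 };

    if (!has_frag_hdr)
        return info;

    info.fragmented = true;
    info.offset = frag_off_host & FRAG_OFFSET_MASK;
    info.first = (info.offset == 0);
    info.last = (frag_off_host & FRAG_MORE_FLAG) == 0;
    assert(info.offset % 8 == 0); /* offsets are multiples of 8 octets */
    return info;
}
```

Note that a packet with offset zero and the M flag clear counts as both first and last fragment in this sketch.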
     

23 Nov, 2011

1 commit


14 Nov, 2011

1 commit

  • Reading /proc/net/snmp6 on a machine with a lot of cpus is very
    expensive (can be ~88000 us).

    This is because ICMPV6MSG MIB uses 4096 bytes per cpu, and folding
    values for all possible cpus can read 16 Mbytes of memory (32MBytes on
    non x86 arches)

    ICMP messages are not considered as fast path on a typical server, and
    eventually few cpus handle them anyway. We can afford an atomic
    operation instead of using percpu data.

    This saves 4096 bytes per cpu and per network namespace.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

27 Oct, 2011

1 commit

  • commit 66b13d99d96a (ipv4: tcp: fix TOS value in ACK messages sent from
    TIME_WAIT) fixed IPv4 only.

    This part is for the IPv6 side, adding a tclass param to ip6_xmit()

    We alias tw_tclass and tw_tos, if socket family is INET6.

    [ if the socket is IPv4-mapped, only the IP_TOS socket option is used to
    fill the TOS field, TCLASS is not taken into account ]

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

22 Jul, 2011

1 commit

  • IPv6 fragment identification generation is way behind what we use for
    IPv4: it uses a single generator. It's not scalable and allows DoS
    attacks.

    Now that inetpeer is IPv6 aware, we can use it to provide a more secure
    and scalable frag ident generator (per destination, instead of system wide)

    This patch :
    1) defines a new secure_ipv6_id() helper
    2) extends inet_getid() to provide 32bit results
    3) extends ipv6_select_ident() with a new dest parameter

    Reported-by: Fernando Gont
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

20 May, 2011

1 commit

  • ipv6 has per device ICMP SNMP counters, taking too much space because
    they use percpu storage.

    needed size per device is :
    (512+4)*sizeof(long)*number_of_possible_cpus*2

    On a 32bit kernel, 16 possible cpus, this wastes more than 64kbytes of
    memory per ipv6 enabled network device, taken in vmalloc pool.

    Since ICMP messages are rare, just use shared counters (atomic_long_t)

    Per network namespace ICMP counters are still using percpu memory, we
    might also convert them to shared counters in a future patch.

    Signed-off-by: Eric Dumazet
    CC: Denys Fedoryshchenko
    Signed-off-by: David S. Miller

    Eric Dumazet
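
The size formula quoted in this commit can be checked with a few lines of C. The function name is hypothetical; the formula itself is taken verbatim from the message above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Per-device percpu ICMP counter footprint, using the formula from the
 * commit message: (512+4) * sizeof(long) * number_of_possible_cpus * 2.
 * long_size is passed in so that a 32-bit kernel can be modelled on any host.
 */
static size_t percpu_icmp_bytes(size_t long_size, size_t possible_cpus)
{
    assert(long_size > 0 && possible_cpus > 0);
    return (512 + 4) * long_size * possible_cpus * 2;
}
```

With a 4-byte long and 16 possible cpus this gives 516 * 4 * 16 * 2 = 66048 bytes, just over 64 kbytes (65536), which matches the claim in the message.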
     

25 Apr, 2011

1 commit

  • These header files are never installed for user consumption, so any
    __KERNEL__ cpp checks are superfluous.

    Projects should also not copy these files into their userland utility
    sources and try to use them there. If they insist on doing so, the
    onus is on them to sanitize the headers as needed.

    Signed-off-by: David S. Miller

    David S. Miller
     

23 Apr, 2011

1 commit


13 Mar, 2011

1 commit


04 Mar, 2011

1 commit


02 Mar, 2011

4 commits

  • That way we don't have to potentially do this in every xfrm_lookup()
    caller.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Return a dst pointer which is potentially error encoded.

    Don't pass original dst pointer by reference, pass a struct net
    instead of a socket, and elide the flow argument since it is
    unnecessary.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Since it indicates whether we are invoked from a sleepable
    context or not.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Route lookups follow a general pattern in the ipv6 code wherein
    we first find the non-IPSEC route, potentially override the
    flow destination address due to ipv6 options settings, and then
    finally make an IPSEC search using either xfrm_lookup() or
    __xfrm_lookup().

    __xfrm_lookup() is used when we want to generate a blackhole route
    if the key manager needs to resolve the IPSEC rules (in this case
    -EREMOTE is returned and the original 'dst' is left unchanged).

    Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC
    resolution is necessary, we simply fail the lookup completely.

    All of these cases are encapsulated into two routines,
    ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow. The latter of which
    handles unconnected UDP datagram sockets.

    Signed-off-by: David S. Miller

    David S. Miller
     

23 Feb, 2011

1 commit


24 Sep, 2010

1 commit


01 Jul, 2010

1 commit

  • /proc/net/snmp and /proc/net/netstat expose SNMP counters.

    Width of these counters is either 32 or 64 bits, depending on the size
    of "unsigned long" in kernel.

    This means a user program parsing these files must already be prepared to
    deal with 64bit values, regardless of whether the program is 32 or 64 bit.

    This patch introduces 64bit snmp values for IPSTAT mib, where some
    counters can wrap pretty fast if they are 32bit wide.

    # netstat -s|egrep "InOctets|OutOctets"
    InOctets: 244068329096
    OutOctets: 244069348848

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
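
To see how fast a 32bit octet counter can wrap, consider the small sketch below. Both helpers are hypothetical and only illustrate the arithmetic; the 1 Gbit/s rate in the note that follows is an assumed example, not a figure from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Whole seconds until a 32-bit byte counter wraps at a constant rate. */
static uint64_t seconds_until_wrap32(uint64_t bytes_per_second)
{
    assert(bytes_per_second > 0);
    return (UINT64_C(1) << 32) / bytes_per_second;
}

/* What a 32-bit counter would report for a given true byte count. */
static uint32_t truncate32(uint64_t true_count)
{
    return (uint32_t)true_count;
}
```

At 1 Gbit/s, which is 125000000 bytes per second, a 32bit counter wraps after about 34 seconds. The InOctets value shown above (244068329096) is larger than 2^32, so it could not be represented in a 32bit counter at all.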
     

02 Jun, 2010

1 commit

  • There are more than a dozen occurrences of the following code in the
    IPv6 stack:

        if (opt && opt->srcrt) {
                struct rt0_hdr *rt0 = (struct rt0_hdr *) opt->srcrt;
                ipv6_addr_copy(&final, &fl.fl6_dst);
                ipv6_addr_copy(&fl.fl6_dst, rt0->addr);
                final_p = &final;
        }

    Replace those with a helper. Note that the helper overrides final_p
    in all cases. This is ok as final_p was previously initialized to
    NULL when declared.

    Signed-off-by: Arnaud Ebalard
    Signed-off-by: David S. Miller

    Arnaud Ebalard
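
The shape of such a helper can be sketched in userspace C as follows. The types and the name `update_dst()` are hypothetical stand-ins, not the kernel definitions. The point is the return convention: the helper yields NULL when no routing header is present, so assigning its result to final_p overrides final_p in all cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures used in the snippet. */
struct addr6 {
    uint8_t bytes[16];
};

struct rt0_hdr_sketch {
    struct addr6 first_hop; /* first address listed in the routing header */
};

struct txoptions_sketch {
    struct rt0_hdr_sketch *srcrt; /* NULL when no routing header is set */
};

/*
 * If a source route is configured, save the real destination in *orig,
 * redirect the flow to the first hop and return orig. Otherwise leave the
 * flow untouched and return NULL.
 */
static struct addr6 *update_dst(struct addr6 *flow_dst,
                                const struct txoptions_sketch *opt,
                                struct addr6 *orig)
{
    assert(flow_dst != NULL && orig != NULL);
    if (opt == NULL || opt->srcrt == NULL)
        return NULL;

    *orig = *flow_dst;
    *flow_dst = opt->srcrt->first_hop;
    return orig;
}
```

A call site then shrinks to a single line such as `final_p = update_dst(&dst, opt, &final);`.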
     

25 May, 2010

1 commit


24 Apr, 2010

2 commits


16 Apr, 2010

1 commit

  • As Herbert Xu said: we should be able to simply replace ipfragok
    with skb->local_df. commit f88037 (sctp: Drop ipfargok in sctp_xmit function)
    has dropped ipfragok and set the local_df value properly.

    The patch kills the ipfragok parameter of .queue_xmit().

    Signed-off-by: Shan Wei
    Signed-off-by: David S. Miller

    Shan Wei
     

31 Mar, 2010

1 commit


26 Feb, 2010

1 commit

  • RFC 4291 section 2.4 states that all uncategorized addresses
    should be considered as Global Unicast.

    This will remove IPV6_ADDR_RESERVED completely
    and return IPV6_ADDR_UNICAST in ipv6_addr_type() instead.

    Signed-off-by: Ulrich Weber
    Signed-off-by: David S. Miller

    Ulrich Weber
     

17 Feb, 2010

1 commit

  • On Tue, 2010-02-16 at 16:47 +0100, Patrick McHardy wrote:
    > Joe Perches wrote:
    > >> @@ -246,6 +246,8 @@ extern int ipv6_opt_accepted(struct sock *sk, struct sk_buff *skb);
    > >> int ip6_frag_nqueues(struct net *net);
    > >> int ip6_frag_mem(struct net *net);
    > >>
    > >> +#define IPV6_FRAG_HIGH_THRESH 262144 /* == 256*1024 */
    > >> +#define IPV6_FRAG_LOW_THRESH 196608 /* == 192*1024 */
    > >> #define IPV6_FRAG_TIMEOUT (60*HZ) /* 60 seconds */
    > >
    > > 196608 isn't a number I want to remember.
    > > Is this better as:
    > >
    > > #define IPV6_FRAG_HIGH_THRESH (256 * 1024) /* 262144 */
    > > #define IPV6_FRAG_LOW_THRESH (192 * 1024) /* 196608 */
    >
    > Please send a patch, I'll apply it once these patches are in Dave's
    > tree.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

16 Feb, 2010

1 commit