10 Dec, 2013
2 commits
-
tclass information is now already stored in rcv_flowinfo.
We do not need to store the same information twice.

Signed-off-by: Florent Fourcot
Reviewed-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
-
The current implementation of IPV6_FLOWINFO only gives a
result if pktoptions is available (thanks to the
ip6_datagram_recv_ctl function).
It gives inconsistent results to user space: sometimes
there is a result for getsockopt(IPV6_FLOWINFO), sometimes
not.

This patch adds rcv_flowinfo to store it, and returns it to
userspace in the same way as other pkt_options.

Signed-off-by: Florent Fourcot
Reviewed-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
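As an illustration of why a stored flowinfo word makes a separate tclass field redundant, here is a minimal userspace sketch. It assumes the flowinfo value is the first 32-bit word of the IPv6 header in network byte order (4 bits version, 8 bits traffic class, 20 bits flow label). The helper names and mask macros are hypothetical and are not the kernel's own.

```c
#include <arpa/inet.h>  /* ntohl() */
#include <stdint.h>

/* Host-order masks for the first 32-bit word of an IPv6 header:
 * 4 bits version, 8 bits traffic class, 20 bits flow label.
 * These names are illustrative only. */
#define FLOWINFO_TCLASS_MASK  0x0FF00000u
#define FLOWINFO_TCLASS_SHIFT 20
#define FLOWINFO_LABEL_MASK   0x000FFFFFu

/* Extract the traffic class from a network-order flowinfo word. */
static inline uint8_t flowinfo_tclass(uint32_t flowinfo_be)
{
    return (uint8_t)((ntohl(flowinfo_be) & FLOWINFO_TCLASS_MASK)
                     >> FLOWINFO_TCLASS_SHIFT);
}

/* Extract the 20-bit flow label from a network-order flowinfo word. */
static inline uint32_t flowinfo_label(uint32_t flowinfo_be)
{
    return ntohl(flowinfo_be) & FLOWINFO_LABEL_MASK;
}
```

Since both values can be derived from the one stored word, keeping a second copy of the traffic class adds nothing.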
06 Dec, 2013
1 commit
-
The code to detect fragments in checksum_setup() was missing for IPv4 and
too eager for IPv6. (It transpires that Windows seems to send IPv6 packets
with a fragment header even if they are not a fragment, i.e. offset is zero
and M bit is not set.)

This patch also incorporates a fix to callers of maybe_pull_tail() where
skb->network_header was being erroneously added to the length argument.

Signed-off-by: Paul Durrant
Signed-off-by: Zoltan Kiss
Cc: Wei Liu
Cc: Ian Campbell
Cc: David Vrabel
cc: David Miller
Acked-by: Wei Liu
Signed-off-by: David S. Miller
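The "fragment header present, but not actually a fragment" case described above can be sketched as a small predicate. This is not the driver's code: the function and macro names are hypothetical. The field layout is the standard IPv6 fragment header, where the 16-bit offset/flags field carries the offset in its upper 13 bits and the M bit in its lowest bit.

```c
#include <arpa/inet.h>  /* ntohs() */
#include <stdbool.h>
#include <stdint.h>

/* Layout of the 16-bit offset/flags field of the IPv6 fragment header
 * (host order). Macro names are illustrative. */
#define FRAG_OFFSET_MASK 0xFFF8u /* upper 13 bits: offset in 8-byte units */
#define FRAG_MORE_FLAG   0x0001u /* lowest bit: M (more fragments) */

/* Returns true only if the header describes a real fragment, i.e. the
 * offset is non-zero or more fragments follow. A header with offset 0
 * and M clear is treated as an unfragmented packet. */
static inline bool is_real_fragment(uint16_t frag_off_be)
{
    uint16_t v = ntohs(frag_off_be);

    return (v & (FRAG_OFFSET_MASK | FRAG_MORE_FLAG)) != 0;
}
```

The two reserved bits between the offset and the M flag are deliberately ignored here.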
29 Oct, 2013
1 commit
-
The code for privacy extensions is very mature, and making it
configurable only gives marginal memory/code savings in exchange
for obfuscation and hard to read code via CPP ifdef'ery.

Signed-off-by: David S. Miller
10 Oct, 2013
1 commit
-
TCP listener refactoring, part 5:

We want to be able to insert request sockets (SYN_RECV) into main
ehash table instead of the per listener hash table to allow RCU
lookups and remove listener lock contention.

This patch includes the needed struct sock_common in front
of struct request_sock.

This means there is no more inet6_request_sock IPv6 specific
structure.

The following inet_request_sock fields were renamed as they became
macros to reference fields from struct sock_common.
Prefix ir_ was chosen to avoid name collisions.

loc_port -> ir_loc_port
loc_addr -> ir_loc_addr
rmt_addr -> ir_rmt_addr
rmt_port -> ir_rmt_port
iif -> ir_iif

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
09 Oct, 2013
1 commit
-
TCP listener refactoring, part 4:

To speed up inet lookups, we moved IPv4 addresses from inet to struct
sock_common.

Now is time to do the same for IPv6, because it permits us to have fast
lookups for all kinds of sockets, including upcoming SYN_RECV.

Getting IPv6 addresses in TCP lookups currently requires two extra cache
lines, plus a dereference (and memory stall).

inet6_sk(sk) does the dereference of inet_sk(__sk)->pinet6

This patch is way bigger than its IPv4 counterpart, because for IPv4,
we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
it's not doable easily.

inet6_sk(sk)->daddr becomes sk->sk_v6_daddr
inet6_sk(sk)->rcv_saddr becomes sk->sk_v6_rcv_saddr

And timewait sockets also have tw->tw_v6_daddr & tw->tw_v6_rcv_saddr
at the same offset.

We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
macro.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
04 Oct, 2013
1 commit
-
TCP listener refactoring, part 2:

We can use a generic lookup, sockets being in whatever state, if
we are sure all relevant fields are at the same place in all socket
types (ESTABLISH, TIME_WAIT, SYN_RECV).

This patch removes these macros:

inet_addrpair, inet_portpair, tw_addrpair, tw_portpair

And adds:

sk_portpair, sk_addrpair, sk_daddr, sk_rcv_saddr

Then, INET_TW_MATCH() is really the same as INET_MATCH().
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
30 Aug, 2013
1 commit
-
This patch implements RFC6980: Drop fragmented ndisc packets by
default. If a fragmented ndisc packet is received the user is informed
that it is possible to disable the check.

Cc: Fernando Gont
Cc: YOSHIFUJI Hideaki
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
27 Aug, 2013
1 commit
-
Conflicts:
drivers/net/wireless/iwlwifi/pcie/trans.c
include/linux/inetdevice.h

The inetdevice.h conflict involves moving the IPV4_DEVCONF values
into a UAPI header, overlapping additions of some new entries.

The iwlwifi conflict is a context overlap.
Signed-off-by: David S. Miller
20 Aug, 2013
1 commit
-
It is not allowed for an ipv6 packet to contain multiple fragmentation
headers. So discard packets which were already reassembled by
fragmentation logic and send back a parameter problem icmp.

The updates for RFC 6980 will come in later, I have to do a bit more
research here.

Cc: YOSHIFUJI Hideaki
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
14 Aug, 2013
1 commit
-
Commit cab70040dfd95ee32144f02fade64f0cb94f31a0 ("net: igmp:
Reduce Unsolicited report interval to 1s when using IGMPv3") and
2690048c01f32bf45d1c1e1ab3079bc10ad2aea7 ("net: igmp: Allow user-space
configuration of igmp unsolicited report interval") by William Manley made
igmp unsolicited report intervals configurable per interface and corrected
the interval of unsolicited igmpv3 report message resends to 1s.

The same needs to be done for IPv6:

MLDv1 (RFC2710 7.10.): 10 seconds
MLDv2 (RFC3810 9.11.): 1 second

Both intervals are configurable via new procfs knobs
mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval.

(also added .force_mld_version to ipv6_devconf_dflt to bring structs in
line without semantic changes)

v2:
a) Joined documentation update for IPv4 and IPv6 MLD/IGMP
unsolicited_report_interval procfs knobs.
b) incorporate stylistic feedback from William Manley

v3:
a) add new DEVCONF_* values to the end of the enum (thanks to David
Miller)

Cc: Cong Wang
Cc: William Manley
Cc: Benjamin LaHaise
Cc: YOSHIFUJI Hideaki
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
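A minimal sketch of how userspace might read one of the two new knobs. Only the knob names come from the commit message. The per-interface location under /proc/sys/net/ipv6/conf/ and the reading of the value as milliseconds are assumptions, and both helper names are hypothetical.

```c
#include <stdio.h>

/* Build the (assumed) procfs path of the MLDv1 or MLDv2 knob for one
 * interface. Returns 0 on success, -1 on a bad version or short buffer. */
static inline int mld_knob_path(char *buf, size_t len,
                                const char *ifname, int mld_version)
{
    int n;

    if (mld_version != 1 && mld_version != 2)
        return -1;
    n = snprintf(buf, len,
                 "/proc/sys/net/ipv6/conf/%s/mldv%d_unsolicited_report_interval",
                 ifname, mld_version);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read the current interval; returns -1 if the knob cannot be read
 * (for example on a kernel without this patch). */
static inline long mld_read_interval(const char *ifname, int mld_version)
{
    char path[256];
    long value = -1;
    FILE *f;

    if (mld_knob_path(path, sizeof(path), ifname, mld_version) != 0)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (fscanf(f, "%ld", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

A caller would use something like mld_read_interval("eth0", 2) and treat -1 as "knob not available".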
31 Jan, 2013
1 commit
-
Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller
14 Jan, 2013
2 commits
-
Router Alert option is very small and we can store the value
itself in the skb.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller
-
Commit 7a3198a8 ("ipv6: helper function to get tclass") introduced
ipv6_tclass(), but a similar function is already available as
ipv6_get_dsfield().

We might be able to call ipv6_tclass() from ipv6_get_dsfield(),
but it is confusing to have two versions.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller
09 Dec, 2012
1 commit
-
This patch adds support in the kernel for offloading in the NIC Tx and Rx
checksumming for encapsulated packets (such as VXLAN and IP GRE).

For Tx encapsulation offload, the driver will need to set the right bits
in netdev->hw_enc_features. The protocol driver will have to set the
skb->encapsulation bit and populate the inner headers, so the NIC driver will
use those inner headers to calculate the csum in hardware.

For Rx encapsulation offload, the driver will need to set again the
skb->encapsulation flag and the skb->ip_summed to CHECKSUM_UNNECESSARY.
In that case the protocol driver should push the decapsulated packet up
to the stack, again with CHECKSUM_UNNECESSARY. In either case, the protocol
driver should set the skb->encapsulation flag back to zero. Finally the
protocol driver should have NETIF_F_RXCSUM flag set in its features.

Signed-off-by: Joseph Gasparakis
Signed-off-by: Peter P Waskiewicz Jr
Signed-off-by: Alexander Duyck
Signed-off-by: David S. Miller
01 Dec, 2012
1 commit
-
Commit 68835aba4d9b ("net: optimize INET input path further")
moved some fields used for tcp/udp sockets lookup in the first cache
line of struct sock_common.

This patch moves inet_dport/inet_num as well, filling a 32bit hole
on 64 bit arches and reducing number of cache line misses in lookups.

Also change INET_MATCH()/INET_TW_MATCH() to perform the ports match
before addresses match, as this check is more discriminant.

Remove the hash check from MATCH() macros because we don't need to
revalidate the hash value after taking a refcount on socket, and
use likely/unlikely compiler hints, as the sk_hash/hash check
makes the following conditional tests 100% predicted by cpu.

Introduce skc_addrpair/skc_portpair pair values to better
document the alignment requirements of the port/addr pairs
used in the various MATCH() macros, and remove some casts.

The namespace check can also be done last.
This slightly improves TCP/UDP lookup times.

IP/TCP early demux needs inet->rx_dst_ifindex and
TCP needs inet->min_ttl, let's group them together in same cache line.

With help from Ben Hutchings & Joe Perches.

Idea of this patch came after Ling Ma proposal to move skc_hash
to the beginning of struct sock_common, and should allow him
to submit a final version of his patch. My tests show an improvement
doing so.

Signed-off-by: Eric Dumazet
Cc: Ben Hutchings
Cc: Joe Perches
Cc: Ling Ma
Signed-off-by: David S. Miller
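The comparison order described above (ports first, then addresses, then bound device, namespace last) can be sketched in plain C. This is only an illustration under simplifying assumptions. The struct, the packing helpers and the fixed packing order are hypothetical. The kernel's real skc_addrpair/skc_portpair layout depends on endianness and is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the lookup fields of struct sock_common. */
struct flow_key {
    uint64_t    addrpair;     /* remote and local IPv4 address in one word */
    uint32_t    portpair;     /* remote and local port in one word */
    int         bound_dev_if; /* 0 means "not bound to a device" */
    const void *net;          /* stand-in for the network namespace */
};

static inline uint32_t make_portpair(uint16_t dport, uint16_t num)
{
    return ((uint32_t)dport << 16) | num;
}

static inline uint64_t make_addrpair(uint32_t daddr, uint32_t rcv_saddr)
{
    return ((uint64_t)daddr << 32) | rcv_saddr;
}

/* Cheapest and most discriminating test first: one 32-bit compare for
 * both ports, then one 64-bit compare for both addresses, then the
 * device, and the namespace check last. */
static inline bool flow_match(const struct flow_key *sk, const void *net,
                              uint64_t addrpair, uint32_t portpair, int dif)
{
    return sk->portpair == portpair &&
           sk->addrpair == addrpair &&
           (sk->bound_dev_if == 0 || sk->bound_dev_if == dif) &&
           sk->net == net;
}
```

Packing each pair into a single word is what lets two fields be compared with one instruction, which is why their alignment matters.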
14 Nov, 2012
1 commit
-
This patch introduces a new knob ndisc_notify. If enabled, the kernel
will transmit an unsolicited neighbour advertisement on link-layer address
change to update the neighbour tables of the corresponding hosts more quickly.

This is the equivalent to arp_notify in ipv4 world.
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
13 Oct, 2012
1 commit
-
Signed-off-by: David Howells
Acked-by: Arnd Bergmann
Acked-by: Thomas Gleixner
Acked-by: Michael Kerrisk
Acked-by: Paul E. McKenney
Acked-by: Dave Jones
30 Aug, 2012
1 commit
-
The IPv6 conntrack fragmentation currently has a couple of shortcomings.
Fragments are collected in PREROUTING/OUTPUT, are defragmented, the
defragmented packet is then passed to conntrack, the resulting conntrack
information is attached to each original fragment and the fragments then
continue their way through the stack.

Helper invocation occurs in the POSTROUTING hook, at which point only
the original fragments are available. The result of this is that
fragmented packets are never passed to helpers.

This patch improves the situation in the following way:

- If a reassembled packet belongs to a connection that has a helper
  assigned, the reassembled packet is passed through the stack instead
  of the original fragments.

- During defragmentation, the largest received fragment size is stored.
  On output, the packet is refragmented if required. If the largest
  received fragment size exceeds the outgoing MTU, a "packet too big"
  message is generated, thus behaving as if the original fragments
  were passed through the stack from an outside point of view.

- The ipv6_helper() hook function can't receive fragments anymore for
  connections using a helper, so it is switched to use ipv6_skip_exthdr()
  instead of the netfilter specific nf_ct_ipv6_skip_exthdr() and the
  reassembled packets are passed to connection tracking helpers.

The result of this is that we can properly track fragmented packets, but
still generate ICMPv6 Packet too big messages if we would have before.

This patch is also required as a precondition for IPv6 NAT, where NAT
helpers might enlarge packets up to a point that they require
fragmentation. In that case we can't generate Packet too big messages
since the proper MTU can't be calculated in all cases (e.g. when
changing textual representation of a variable amount of addresses),
so the packet is transparently fragmented iff the original packet or
fragments would have fit the outgoing MTU.

IPVS parts by Jesper Dangaard Brouer.
Signed-off-by: Patrick McHardy
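The output-side decision for a reassembled packet, as described in the second bullet above, can be sketched as a pure function. This is an illustrative model, not the netfilter code. The enum, the function name and the parameter names are hypothetical, and it covers only packets that were reassembled from fragments.

```c
#include <assert.h>
#include <stddef.h>

enum out_action {
    OUT_SEND_AS_IS,   /* reassembled packet fits the outgoing MTU */
    OUT_REFRAGMENT,   /* original fragments fit; split again on output */
    OUT_PKT_TOO_BIG   /* original fragments would not have fit */
};

/* pkt_len:       length of the reassembled packet
 * frag_max_size: largest fragment seen during defragmentation (> 0)
 * mtu:           MTU of the outgoing device */
static inline enum out_action reassembled_output_action(size_t pkt_len,
                                                        size_t frag_max_size,
                                                        size_t mtu)
{
    assert(frag_max_size > 0); /* only meaningful for reassembled packets */

    if (frag_max_size > mtu)
        return OUT_PKT_TOO_BIG; /* same outcome as forwarding the fragments */
    if (pkt_len > mtu)
        return OUT_REFRAGMENT;
    return OUT_SEND_AS_IS;
}
```

Checking the largest original fragment before the total length is what keeps the externally visible behaviour the same as if the fragments had never been reassembled.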
07 Aug, 2012
1 commit
-
IPv6 needs a cookie in dst_check() call.
We need to add rx_dst_cookie and provide a family independent
sk_rx_dst_set(sk, skb) method to properly support IPv6 TCP early demux.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
18 Jul, 2012
1 commit
-
We should provide to inet6_csk_route_socket a struct flowi6 pointer,
so that inet6_csk_xmit() works correctly instead of sending garbage.

Also add some consts
Signed-off-by: Eric Dumazet
Reported-by: Yuchung Cheng
Cc: Neal Cardwell
Signed-off-by: David S. Miller
11 Jul, 2012
1 commit
-
Fixes build when ipv6 is disabled.
Reported-by: Fengguang Wu
Signed-off-by: David S. Miller
13 Feb, 2012
2 commits
-
Currently, it is not easily possible to get TOS/DSCP value of packets from
an incoming TCP stream. The mechanism is there, IP_PKTOPTIONS getsockopt
with IP_RECVTOS set, the same way as incoming TTL can be queried. This is
not actually implemented for TOS, though.

This patch adds this functionality, both for IPv4 (IP_PKTOPTIONS) and IPv6
(IPV6_2292PKTOPTIONS). For IPv4, like in the IP_RECVTTL case, the value of
the TOS field is stored from the other party's ACK.

This is needed for proxies which require DSCP transparency. One such example
is at http://zph.bratcheda.org/.

Signed-off-by: Jiri Benc
Signed-off-by: David S. Miller
-
Implement helper inline function to get traffic class from IPv6 header.
Signed-off-by: Jiri Benc
Signed-off-by: David S. Miller
09 Feb, 2012
1 commit
-
The IPV6_UNICAST_IF feature is the IPv6 complement to IP_UNICAST_IF.
Signed-off-by: Erich E. Hoover
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
12 Dec, 2011
1 commit
-
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
25 Nov, 2010
1 commit
-
ipv6_sk_mc_lock rwlock becomes a spinlock.
Readers (inet6_mc_check()) now take rcu_read_lock() instead of read
lock. Writers don't need to disable BH anymore.

struct ipv6_mc_socklist objects are reclaimed after one RCU grace
period.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
21 Oct, 2010
1 commit
-
Support for IPV6_RECVORIGDSTADDR sockopt for UDP sockets was contributed by
Harry Mason.

Signed-off-by: Balazs Scheidler
Signed-off-by: KOVACS Krisztian
Signed-off-by: Patrick McHardy
23 Aug, 2010
1 commit
-
__packed is only defined in kernel space, so we should use
__attribute__((packed)) for the code shared between kernel and user space.

Two __attribute() annotations are replaced with __attribute__() too.
Signed-off-by: Changli Gao
Signed-off-by: David S. Miller
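To show what the attribute does in a header that userspace can also compile, here is a small sketch assuming a GCC or Clang compatible compiler in C11 mode. The two structs are hypothetical examples and do not come from the header touched by this commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural alignment: the compiler may insert padding before 'value'. */
struct opt_plain {
    uint8_t  type;
    uint8_t  len;
    uint32_t value;
};

/* Spelled with __attribute__((packed)) rather than the kernel-only
 * __packed shorthand, so the definition also builds in userspace. */
struct opt_packed {
    uint8_t  type;
    uint8_t  len;
    uint32_t value;
} __attribute__((packed));

/* With packing, the fields are laid out back to back. */
_Static_assert(sizeof(struct opt_packed) == 6, "no padding expected");
_Static_assert(offsetof(struct opt_packed, value) == 2, "value follows len");
```

The packed layout is what makes such a struct safe to overlay on wire-format data, at the cost of possibly unaligned accesses to value.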
20 Jul, 2010
1 commit
-
Even with jumbograms I cannot see any way in which we would need
to record a larger than 65535 valued next-header offset.

The maximum extension header length is (256 << 3) == 2048.
There are only a handful of extension headers specified which
we'd even accept (say 5 or 6), therefore the largest next-header
offset we'd ever have to contend with is something less than
say 16k.

Therefore make it a u16 instead of a u32.
Signed-off-by: David S. Miller
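The bound argued above can be written out as compile-time arithmetic. This is a sketch: the macro and function names are hypothetical, the count of 6 headers is the commit message's own "say 5 or 6" estimate, and adding the 40-byte fixed IPv6 header is an extra assumption about where the offset is measured from.

```c
#include <stdint.h>

#define IPV6_FIXED_HDR_LEN 40u          /* fixed IPv6 header size in bytes */
#define EXTHDR_MAX_LEN     (256u << 3)  /* 8-bit length field, 8-byte units */
#define EXTHDR_MAX_COUNT   6u           /* "say 5 or 6" accepted headers */

/* Worst-case offset of the last next-header field under these assumptions. */
static inline uint32_t worst_case_nexthdr_offset(void)
{
    return IPV6_FIXED_HDR_LEN + EXTHDR_MAX_COUNT * EXTHDR_MAX_LEN;
}

_Static_assert(EXTHDR_MAX_LEN == 2048u, "(256 << 3) == 2048");
_Static_assert(IPV6_FIXED_HDR_LEN + EXTHDR_MAX_COUNT * EXTHDR_MAX_LEN < 16384u,
               "stays below 16k");
_Static_assert(IPV6_FIXED_HDR_LEN + EXTHDR_MAX_COUNT * EXTHDR_MAX_LEN <= UINT16_MAX,
               "fits in a u16");
```

Under these numbers the worst case is 40 + 6 * 2048 bytes, comfortably inside the range of a 16-bit field.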
03 Jun, 2010
1 commit
-
cleanup patch.
Use new __packed annotation in net/ and include/
(except netfilter)

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
11 May, 2010
2 commits
-
This patch adds support for multiple independent multicast routing instances,
named "tables".

Userspace multicast routing daemons can bind to a specific table instance by
issuing a setsockopt call using a new option MRT6_TABLE. The table number is
stored in the raw socket data and affects all following ip6mr setsockopt(),
getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT)
is created with a default routing rule pointing to it. Newly created pim6reg
devices have the table number appended ("pim6regX"), with the exception of
devices created in the default table, which are named just "pim6reg" for
compatibility reasons.

Packets are directed to a specific table instance using routing rules,
similar to how regular routing rules work. Currently iif, oif and mark
are supported as keys, source and destination addresses could be supported
additionally.

Example usage:

- bind pimd/xorp/... to a specific table:

    uint32_t table = 123;
    setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table));

- create routing rules directing packets to the new table:

    # ip -6 mrule add iif eth0 lookup 123
    # ip -6 mrule add oif eth0 lookup 123

Signed-off-by: Patrick McHardy
-
Conflicts:
net/bridge/br_device.c
net/bridge/br_forward.c

Signed-off-by: Patrick McHardy
24 Apr, 2010
2 commits
-
Finally add support to detect a local IPV6_DONTFRAG event
and return the relevant data to the user if they've enabled
IPV6_RECVPATHMTU on the socket. The next recvmsg() will
return no data, but have an IPV6_PATHMTU as ancillary data.

Signed-off-by: Brian Haley
Signed-off-by: David S. Miller
-
Add underlying data structure changes and basic setsockopt()
and getsockopt() support for IPV6_RECVPATHMTU, IPV6_PATHMTU,
and IPV6_DONTFRAG. IPV6_PATHMTU is actually fully functional
at this point.

Signed-off-by: Brian Haley
Signed-off-by: David S. Miller
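A minimal userspace sketch of how an application might opt in to the behaviour these two commits describe. The option names come from the commit messages. The helper name is hypothetical, and the sketch assumes a libc whose <netinet/in.h> exposes IPV6_DONTFRAG and IPV6_RECVPATHMTU.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Ask the stack not to fragment locally, and to report the path MTU as
 * ancillary data when a datagram turns out to be too large.
 * Returns 0 on success, -1 if either option cannot be set. */
static inline int enable_pmtu_reporting(int fd)
{
    int on = 1;

    if (setsockopt(fd, IPPROTO_IPV6, IPV6_DONTFRAG, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IPV6, IPV6_RECVPATHMTU, &on, sizeof(on)) < 0)
        return -1;
    return 0;
}
```

After this, a recvmsg() on the socket may return zero bytes of data with an IPV6_PATHMTU control message attached, so callers need to inspect the ancillary data rather than treat the empty read as an error.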
23 Apr, 2010
1 commit
-
This patch adds IPv6 support for RFC5082 Generalized TTL Security Mechanism.

Note to users of mapped addresses: the IPV6 and IPV4 socket options are separate.
The server does have to deal with both IPv4 and IPv6 socket options
and the client has to handle the different options for each family.

On client:

    int ttl = 255;
    getaddrinfo(argv[1], argv[2], &hint, &result);

    for (rp = result; rp != NULL; rp = rp->ai_next) {
        s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
        if (s < 0) continue;

        if (rp->ai_family == AF_INET) {
            setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
        } else if (rp->ai_family == AF_INET6) {
            setsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
                       &ttl, sizeof(ttl));
        }

        if (connect(s, rp->ai_addr, rp->ai_addrlen) == 0) {
            ...

On server:

    int minttl = 255 - maxhops;

    getaddrinfo(NULL, port, &hints, &result);
    for (rp = result; rp != NULL; rp = rp->ai_next) {
        s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
        if (s < 0) continue;

        if (rp->ai_family == AF_INET6)
            setsockopt(s, IPPROTO_IPV6, IPV6_MINHOPCOUNT,
                       &minttl, sizeof(minttl));
        setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl, sizeof(minttl));

        if (bind(s, rp->ai_addr, rp->ai_addrlen) == 0)
            break;
        ...

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
13 Apr, 2010
1 commit
-
Similar to how IPv4's ip_output.c works, have ip6_output also check
the IPSKB_REROUTED flag. It will be set from xt_TEE for cloned packets
since Xtables can currently only deal with a single packet in flight
at a time.

Signed-off-by: Jan Engelhardt
Acked-by: David S. Miller
[Patrick: changed to use an IP6SKB value instead of IPSKB]
Signed-off-by: Patrick McHardy
19 Oct, 2009
1 commit
-
In order to have better cache layouts of struct sock (separate zones
for rx/tx paths), we need this preliminary patch.

Goal is to transfer fields used at lookup time in the first
read-mostly cache line (inside struct sock_common) and move sk_refcnt
to a separate cache line (only written by rx path).

This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
sport and id fields. This allows a future patch to define these
fields as macros, like sk_refcnt, without name clashes.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
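Why the prefix matters once a field becomes a macro can be seen in a reduced sketch. All struct names below are hypothetical stand-ins and not the kernel definitions. Only the idea, a prefixed macro that expands to a member of an embedded common struct, is taken from the commit message.

```c
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct sock_common_sketch {
    uint32_t skc_daddr;
    uint32_t skc_rcv_saddr;
};

struct sock_sketch {
    struct sock_common_sketch common;
};

struct inet_sock_sketch {
    struct sock_sketch sk;
    uint32_t           inet_saddr; /* still an ordinary member */
};

/* The prefixed names are now macros that reach into the common part. */
#define inet_daddr     sk.common.skc_daddr
#define inet_rcv_saddr sk.common.skc_rcv_saddr

/* A parameter called plain 'daddr' is untouched by the macros above. Had
 * the macro itself been named 'daddr', this function would not compile. */
static inline int same_peer(const struct inet_sock_sketch *inet, uint32_t daddr)
{
    return inet->inet_daddr == daddr;
}
```

Existing code that writes inet->inet_daddr keeps working unchanged when the storage moves into the common struct.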
09 Oct, 2009
1 commit
-
(This patch fixes a bug in commit f7734fdf61ec6bb848e0bafc1fb8bad2c124bb50,
titled "make TLLAO option for NA packets configurable".)

When the IPV6 conf is used, the function sysctl_set_parent is called and the
array addrconf_sysctl is used as a parameter of the function.

The above patch added the new conf "force_tllao" to the array addrconf_sysctl,
but the size of the array was not modified: the statically allocated size is
DEVCONF_MAX + 1 but the real size is DEVCONF_MAX + 2, so the problem is
that the function sysctl_set_parent accessed a wrong address.

I got the following information.
Call Trace:
[] sysctl_set_parent+0x29/0x3e
[] sysctl_set_parent+0x29/0x3e
[] sysctl_set_parent+0x29/0x3e
[] sysctl_set_parent+0x29/0x3e
[] sysctl_set_parent+0x29/0x3e
[] __register_sysctl_paths+0xde/0x272
[] ? __kmalloc_track_caller+0x16e/0x180
[] ? __addrconf_sysctl_register+0xc5/0x144 [ipv6]
[] register_net_sysctl_table+0x48/0x4b
[] __addrconf_sysctl_register+0xf7/0x144 [ipv6]
[] addrconf_init_net+0xd4/0x104 [ipv6]
[] setup_net+0x35/0x82
[] copy_net_ns+0x76/0xe0
[] create_new_namespaces+0xf0/0x16e
[] copy_namespaces+0x65/0x9f
[] copy_process+0xb2c/0x12c3
[] do_fork+0x14b/0x2d2
[] ? up_read+0xe/0x10
[] ? do_page_fault+0x27a/0x2aa
[] sys_clone+0x28/0x2a
[] stub_clone+0x13/0x20
[] ? system_call_fastpath+0x16/0x1b

And the information of IPV6 in .config is as follows.
IPV6 in .config:
CONFIG_IPV6=m
CONFIG_IPV6_PRIVACY=y
CONFIG_IPV6_ROUTER_PREF=y
CONFIG_IPV6_ROUTE_INFO=y
CONFIG_IPV6_OPTIMISTIC_DAD=y
CONFIG_IPV6_MIP6=m
CONFIG_IPV6_SIT=m
# CONFIG_IPV6_SIT_6RD is not set
CONFIG_IPV6_NDISC_NODETYPE=y
CONFIG_IPV6_TUNNEL=m
CONFIG_IPV6_MULTIPLE_TABLES=y
CONFIG_IPV6_SUBTREES=y
CONFIG_IPV6_MROUTE=y
CONFIG_IPV6_PIMSM_V2=y
# CONFIG_IP_VS_IPV6 is not set
CONFIG_NF_CONNTRACK_IPV6=m
CONFIG_IP6_NF_MATCH_IPV6HEADER=m

I confirmed this patch fixes this problem.
Signed-off-by: Jin Dongming
Signed-off-by: David S. Miller
07 Oct, 2009
1 commit
-
On Friday 02 October 2009 20:53:51 you wrote:
> This is good although I would have shortened the name.
Ah, I knew I forgot something :) Here is v4.
tavi
>From 24d96d825b9fa832b22878cc6c990d5711968734 Mon Sep 17 00:00:00 2001
From: Octavian Purdila
Date: Fri, 2 Oct 2009 00:51:15 +0300
Subject: [PATCH] ipv6: new sysctl for sending TLLAO with unicast NAs

Neighbor advertisements responding to unicast neighbor solicitations
did not include the target link-layer address option. This patch adds
a new sysctl option (disabled by default) which controls whether this
option should be sent even with unicast NAs.

The need for this arose because certain routers expect the TLLAO in
some situations even as a response to unicast NS packets.

Moreover, RFC 2461 recommends sending this to avoid a race condition
(section 4.4, Target link-layer address).

Signed-off-by: Cosmin Ratiu
Signed-off-by: Octavian Purdila
Signed-off-by: David S. Miller