Eric Lee / smarc-fsl-linux-kernel

11 Oct, 2007

11 commits

8f4c1f9b0 [NETLINK]: Introduce nested and byteorder flag to netlink attribute

This change allows the generic attribute interface to be used within
the netfilter subsystem where this flag was initially introduced.

The byte-order flag is as yet unused; its intended use is to
allow automatic byte order conversions for all atomic types.

Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller

Thomas Graf
2007-10-11 07:49:16 +0800
881d966b4 [NET]: Make the device list and device lookups per namespace.

This patch makes most of the generic device layer network
namespace safe. This patch makes dev_base_head a
network namespace variable, and then it picks up
a few associated variables. The functions:
dev_getbyhwaddr
dev_getfirsthwbytype
dev_get_by_flags
dev_get_by_name
__dev_get_by_name
dev_get_by_index
__dev_get_by_index
dev_ioctl
dev_ethtool
dev_load
wireless_process_ioctl

were modified to take a network namespace argument, and
deal with it.

vlan_ioctl_set and brioctl_set were modified so their
hooks will receive a network namespace argument.

So basically anything in the core of the network stack that was
affected by the change of dev_base was modified to handle
multiple network namespaces. The rest of the network stack was
simply modified to explicitly use &init_net, the initial network
namespace. This can be fixed when those components of the network
stack are modified to handle multiple network namespaces.

For now the ifindex generator is left global.

Fundamentally ifindex numbers are per namespace, or else
we will have corner case problems with migration when
we get that far.

At the same time there are assumptions in the network stack
that the ifindex of a network device won't change. Making
the ifindex number global seems a good compromise until
the network stack can cope with ifindex changes when
you change namespaces, and the like.

Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller

Eric W. Biederman
2007-10-11 07:49:10 +0800
b4b510290 [NET]: Support multiple network namespaces with netlink

Each netlink socket will live in exactly one network namespace,
this includes the controlling kernel sockets.

This patch updates all of the existing netlink protocols
to only support the initial network namespace. Requests
by clients in other namespaces will get -ECONNREFUSED,
as they would if the kernel did not have support for
that netlink protocol compiled in.

As each netlink protocol is updated to be multiple network
namespace safe it can register multiple kernel sockets
to acquire a presence in the rest of the network namespaces.

The implementation in af_netlink is a simple filter implementation
at hash table insertion and hash table look up time.
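
The filtering idea can be sketched in user-space C. This is only an illustration under stated assumptions: struct net, struct nl_sock, struct nl_table, nl_lookup() and nl_insert() are hypothetical stand-ins, and a flat array replaces the real hash table. The point it shows is that both lookup and insertion compare the namespace as well as the port id, so the same id can be bound once per namespace.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's namespace and netlink socket. */
struct net { int id; };

struct nl_sock {
    const struct net *net;  /* namespace the socket lives in */
    unsigned int pid;       /* netlink port id */
};

#define NL_TABLE_SIZE 8

/* A flat array stands in for the real hash table. */
struct nl_table {
    struct nl_sock *slots[NL_TABLE_SIZE];
};

/* Lookup filters on the namespace as well as on the port id. */
struct nl_sock *nl_lookup(const struct nl_table *t, const struct net *net,
                          unsigned int pid)
{
    for (size_t i = 0; i < NL_TABLE_SIZE; i++) {
        struct nl_sock *sk = t->slots[i];
        if (sk && sk->net == net && sk->pid == pid)
            return sk;
    }
    return NULL;
}

/*
 * Insertion applies the same filter: a pid conflicts only with a socket
 * in the same namespace. Returns 0 on success, -1 on conflict or when
 * the table is full.
 */
int nl_insert(struct nl_table *t, struct nl_sock *sk)
{
    if (nl_lookup(t, sk->net, sk->pid))
        return -1;
    for (size_t i = 0; i < NL_TABLE_SIZE; i++) {
        if (!t->slots[i]) {
            t->slots[i] = sk;
            return 0;
        }
    }
    return -1;
}
```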

Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller

Eric W. Biederman
2007-10-11 07:49:09 +0800
e9dc86534 [NET]: Make device event notification network namespace safe

Every user of the network device notifiers is either a protocol
stack or a pseudo device. If a protocol stack that does not have
support for multiple network namespaces receives an event for a
device that is not in the initial network namespace it quite possibly
can get confused and do the wrong thing.

To avoid problems until all of the protocol stacks are converted
this patch modifies all netdev event handlers to ignore events on
devices that are not in the initial network namespace.

As the rest of the code is made network namespace aware these
checks can be removed.

Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller

Eric W. Biederman
2007-10-11 07:49:09 +0800
e730c1551 [NET]: Make packet reception network namespace safe

This patch modifies every packet receive function
registered with dev_add_pack() to drop packets if they
are not from the initial network namespace.

This should ensure that the various network stacks do
not receive packets in anything but the initial network
namespace until the code has been converted and is ready
for them.

Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller

Eric W. Biederman
2007-10-11 07:49:08 +0800
1b8d7ae42 [NET]: Make socket creation namespace safe.

This patch passes in the namespace a new socket should be created in
and has the socket code do the appropriate reference counting. By
virtue of this all socket create methods are touched. In addition
the socket create methods are modified so that they will fail if
you attempt to create a socket in a non-default network namespace.

Failing if we attempt to create a socket outside of the default
network namespace ensures that as we incrementally make the network stack
network namespace aware we will not export functionality that someone
has not audited and made certain is network namespace safe.
This allows us to partially enable network namespaces before all of the
exotic protocols are supported.
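
A minimal user-space sketch of the guard described above, assuming a hypothetical create method named example_proto_create() and a stand-in struct net. The errno value used here is an assumption for illustration; the message itself only says that creation fails.

```c
#include <errno.h>

/* Hypothetical stand-in for the kernel's struct net. */
struct net { int id; };

/* Stand-in for the initial network namespace. */
static struct net init_net = { 0 };

/*
 * Hypothetical create method for a protocol that has not yet been
 * audited for namespace safety: refuse anything but the initial
 * namespace. The errno choice is an assumption.
 */
int example_proto_create(struct net *net, int protocol)
{
    (void)protocol;             /* unused in this sketch */

    if (net != &init_net)
        return -EAFNOSUPPORT;

    /* ... normal socket setup would follow here ... */
    return 0;
}
```

Because the namespace is an explicit parameter, a protocol whose create method was not updated stops compiling instead of silently running in the wrong namespace.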

Any protocol layers I have missed will fail to compile because I now
pass an extra parameter into the socket creation code.

[ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]

Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller

Eric W. Biederman
2007-10-11 07:49:07 +0800
457c4cbc5 [NET]: Make /proc/net per network namespace

This patch makes /proc/net per network namespace. It modifies the global
variables proc_net and proc_net_stat to be per network namespace.
The proc_net file helpers are modified to take a network namespace argument,
and all of their callers are fixed to pass &init_net for that argument.
This ensures that all of the /proc/net files are only visible and
usable in the initial network namespace until the code behind them
has been updated to handle multiple network namespaces.

Making /proc/net per namespace is necessary as at least some files
in /proc/net depend upon the set of network devices which is per
network namespace, and even more files in /proc/net have contents
that are relevant to a single network namespace.

Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller

Eric W. Biederman
2007-10-11 07:49:06 +0800
1dfcae776 [IPV6]: Remove unneeded pointer iph from ipcomp6_input() in net/ipv6/ipcomp6.c

This trivial patch removes the unneeded pointer iph, which is never used.

Signed-off-by: Micah Gruber
Signed-off-by: David S. Miller

Micah Gruber
2007-10-11 07:48:58 +0800
1e5dc1461 [IPV6] IPSEC: Omit redirect for tunnelled packet.

An IPv6 IPsec tunnel gateway incorrectly sends a redirect to the
router or sender when the network device on which the IPsec tunnelled
packet arrived is the same as the one on which the decapsulated packet
is sent.

With this patch, the redirect is not sent when the forwarded
skbuff carries a secpath, since such an skbuff should be assumed to be
a packet decapsulated from an IPsec tunnel by the gateway itself.

It may be a rare case for an IPsec security gateway; however,
it is not rare when the gateway is a MIPv6 Home Agent, since
the other tunnel end-point is a Mobile Node, which changes
its attached network.

Signed-off-by: Masahide NAKAMURA
Signed-off-by: David S. Miller

Masahide NAKAMURA
2007-10-11 07:48:33 +0800
a47ed4cd8 [IPV6] XFRM: Fix connected socket to use transformation.

When XFRM policy and state are ready after TCP connection is started,
the traffic should be transformed immediately, however it does not
on IPv6 TCP.

It depends on a dst cache replacement policy with connected socket.
It seems that the replacement is always done for IPv4, however, on
IPv6 case it is done only when routing cookie is changed.

This patch fixes that, so a non-transformation dst can be replaced by a
transformation one.
This behavior is required by MIPv6 and improves IPv6 IPsec.

Fixes by Masahide NAKAMURA.

Signed-off-by: Noriaki TAKAMIYA
Signed-off-by: Masahide NAKAMURA
Signed-off-by: David S. Miller

Noriaki TAKAMIYA
2007-10-11 07:48:32 +0800
e773e4faa [IPV6]: Add v4mapped address inline

Add v4mapped address inline to avoid calls to ipv6_addr_type().
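
For illustration, a user-space sketch of what such an inline check can look like. The helper name addr_is_v4mapped() is hypothetical and this is not the kernel's implementation; it only shows that an IPv4-mapped address (::ffff:a.b.c.d) can be recognised by comparing the first 96 bits against a fixed prefix, with no general address classification needed.

```c
#include <netinet/in.h>
#include <string.h>

/*
 * Returns 1 if addr is an IPv4-mapped IPv6 address, i.e. its first
 * 80 bits are zero and the next 16 bits are all ones; 0 otherwise.
 */
static inline int addr_is_v4mapped(const struct in6_addr *addr)
{
    static const unsigned char prefix[12] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
    };

    return memcmp(addr->s6_addr, prefix, sizeof(prefix)) == 0;
}
```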

Signed-off-by: Brian Haley
Signed-off-by: David S. Miller

Brian Haley
2007-10-11 07:48:32 +0800

08 Oct, 2007

1 commit

bf0b48dfc [IPv6]: Fix ICMPv6 redirect handling with target multicast address

When the ICMPv6 Target address is multicast, Linux processes the
redirect instead of dropping it. The problem is in this code in
ndisc_redirect_rcv():

    if (ipv6_addr_equal(dest, target)) {
        on_link = 1;
    } else if (!(ipv6_addr_type(target) & IPV6_ADDR_LINKLOCAL)) {
        ND_PRINTK2(KERN_WARNING
                   "ICMPv6 Redirect: target address is not link-local.\n");
        return;
    }

This second check will succeed if the Target address is, for example,
FF02::1 because it has link-local scope. Instead, it should be checking
if it's a unicast link-local address, as stated in RFC 2461/4861 Section
8.1:

- The ICMP Target Address is either a link-local address (when
redirected to a router) or the same as the ICMP Destination
Address (when redirected to the on-link destination).

I know this doesn't explicitly say unicast link-local address, but it's
implied.
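
The intended rule can be sketched in user-space C with the standard address macros from <netinet/in.h>. The function name redirect_target_ok() is hypothetical and this is not the patch itself; it only illustrates that a target such as FF02::1 must be rejected even though it has link-local scope, while FE80::/10 unicast targets are accepted.

```c
#include <netinet/in.h>
#include <string.h>

/*
 * Returns 1 if `target` is acceptable as the Target Address of a
 * redirect whose Destination Address is `dest`, 0 otherwise.
 */
int redirect_target_ok(const struct in6_addr *dest,
                       const struct in6_addr *target)
{
    /* Redirect to the on-link destination itself. */
    if (memcmp(dest, target, sizeof(*dest)) == 0)
        return 1;

    /* Multicast targets are never valid, whatever their scope. */
    if (IN6_IS_ADDR_MULTICAST(target))
        return 0;

    /* Otherwise the target must be a unicast link-local address. */
    return IN6_IS_ADDR_LINKLOCAL(target) ? 1 : 0;
}
```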

This bug is preventing Linux kernels from achieving IPv6 Logo Phase II
certification because of a recent error that was found in the TAHI test
suite - Neighbor Discovery suite test 206 (v6LC.2.3.6_G) had the
multicast address in the Destination field instead of Target field, so
we were passing the test. This won't be the case anymore.

The patch below fixes this problem, and also fixes ndisc_send_redirect()
to not send an invalid redirect with a multicast address in the Target
field. I re-ran the TAHI Neighbor Discovery section to make sure Linux
passes all 245 tests now.

Signed-off-by: Brian Haley
Acked-by: David L Stevens
Signed-off-by: David S. Miller

Brian Haley
2007-10-08 15:12:05 +0800

29 Sep, 2007

1 commit

f8ab18d2d [TCP]: Fix MD5 signature handling on big-endian.

Based upon a report and initial patch by Peter Lieven.

tcp4_md5sig_key and tcp6_md5sig_key need to start with
the exact same members as tcp_md5sig_key, because they
are both cast to that type by tcp_v{4,6}_md5_do_lookup().

Unfortunately tcp{4,6}_md5sig_key use a u16 for the key
length instead of a u8, which is what tcp_md5sig_key
uses. This just so happens to work by accident on
little-endian, but on big-endian it doesn't.

Instead of casting, just place tcp_md5sig_key as the first member of
the address-family specific structures, adjust the access sites, and
kill off the ugly casts.
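
The embedding approach can be sketched as follows. The struct and function names (md5_key, md5_key_v4, md5_lookup_v4) are simplified, hypothetical stand-ins rather than the kernel definitions. With the common structure as the first member, the lookup returns a pointer to that member, so the field types can no longer drift apart between the generic and the address-family specific layouts.

```c
#include <stddef.h>
#include <stdint.h>

/* Common part shared by all address families (simplified stand-in). */
struct md5_key {
    uint8_t *key;
    uint8_t  keylen;
};

/* Address-family specific entry: the common part comes first. */
struct md5_key_v4 {
    struct md5_key base;
    uint32_t       addr;
};

/*
 * Look up the key for `addr`. The result is the address of the
 * embedded member, so no cast between struct types is involved.
 */
struct md5_key *md5_lookup_v4(struct md5_key_v4 *keys, size_t n,
                              uint32_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (keys[i].addr == addr)
            return &keys[i].base;
    }
    return NULL;
}
```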

Signed-off-by: David S. Miller

David S. Miller
2007-09-29 06:18:35 +0800

17 Sep, 2007

1 commit

6ae5f983c [IPV6]: Fix source address selection.

The commit 95c385 broke proper source address selection for cases in which
there is an address which is marked 'deprecated'. The commit mistakenly
changed ifa->flags to ifa_result->flags (probably copy/paste error from a
few lines above) in the 'Rule 3' address selection code.

The patch restores the previous RFC-compliant behavior.

Signed-off-by: Jiri Kosina
Signed-off-by: David S. Miller

Jiri Kosina
2007-09-17 05:48:21 +0800

15 Sep, 2007

2 commits

cd562c985 [IPV6]: Just increment OutDatagrams once per a datagram.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2007-09-15 08:15:01 +0800
3ef9d943d [IPV6]: Fix unbalanced socket reference with MSG_CONFIRM.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2007-09-15 07:45:40 +0800

11 Sep, 2007

3 commits

e1f52208b [IPv6]: Fix NULL pointer dereference in ip6_flush_pending_frames

Some of the skbs in sk->write_queue do not have skb->dst because
we do not fill skb->dst when we allocate new skb in append_data().

BTW, I think we may not need to (or we should not) increment some stats
when using corking; if 100 sendmsg() (with MSG_MORE) result in 2 packets,
how many should we increment?

If 100, we should set skb->dst for every queued skb.

If 1 (or 2 (*)), we increment the stats for the first queued skb and
we should just skip incrementing OutDiscards for the rest of the queued skbs,
and we should also implement these semantics in other places;
e.g., we should increment other stats just once, not 100 times.

*: depends on the place we are discarding the datagram.

I guess we should just increment by 1 (or 2).

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2007-09-11 17:31:43 +0800
16fcec35e [NETFILTER]: Fix/improve deadlock condition on module removal netfilter

So I've had a deadlock reported to me. I've found that the sequence of
events goes like this:

1) process A (modprobe) runs to remove ip_tables.ko

2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket,
increasing the ip_tables socket_ops use count

3) process A acquires a file lock on the file ip_tables.ko, calls remove_module
in the kernel, which in turn executes the ip_tables module cleanup routine,
which calls nf_unregister_sockopt

4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the
calling process into uninterruptible sleep, expecting the process using the
socket option code to wake it up when it exits the kernel

5) the user of the socket option code (process B) in do_ipt_get_ctl, calls
ipt_find_table_lock, which in this case calls request_module to load
ip_tables_nat.ko

6) request_module forks a copy of modprobe (process C) to load the module and
blocks until modprobe exits.

7) Process C, forked by request_module, processes the dependencies of
ip_tables_nat.ko, of which ip_tables.ko is one.

8) Process C attempts to lock the requested module and all its dependencies; it
blocks when it attempts to lock ip_tables.ko (which was previously locked in
step 3)

There's not really any great permanent solution to this that I can see, but I've
developed a two part solution that corrects the problem

Part 1) Modifies the nf_sockopt registration code so that, instead of using a
use counter internal to the nf_sockopt_ops structure, we instead use a pointer
to the registering module's owner to do module reference counting when nf_sockopt
calls a module's set/get routine. This prevents the deadlock by preventing step 4
from happening.

Part 2) Enhances the modprobe utility so that by default it performs non-blocking
remove operations (the same way rmmod does), and adds an option to explicitly
request blocking operation. So if you select blocking operation in modprobe you
can still cause the above deadlock, but only if you explicitly try (and since
root can do any old stupid thing it would like.... :) ).

Signed-off-by: Neil Horman
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Neil Horman
2007-09-11 17:28:26 +0800
9e3be4b34 [IPV6]: Freeing alive inet6 address

From: Denis V. Lunev

addrconf_dad_failure calls addrconf_dad_stop which takes referenced address
and drops the count. So, in6_ifa_put performed at out: is extra. This
results in message: "Freeing alive inet6 address" and not released dst entries.

Signed-off-by: Denis V. Lunev
Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller

Denis V. Lunev
2007-09-11 17:04:49 +0800

27 Aug, 2007

1 commit

a96fb49be [NET]: Fix IP_ADD/DROP_MEMBERSHIP to handle only connectionless

Fix IP[V6]_ADD_MEMBERSHIP and IP[V6]_DROP_MEMBERSHIP to
return -EPROTO for connection oriented sockets.

Signed-off-by: Flavio Leitner
Signed-off-by: David S. Miller

Flavio Leitner
2007-08-27 09:35:35 +0800

22 Aug, 2007

1 commit

8984e41d1 [IPV6]: Fix kernel panic while send SCTP data with IP fragments

If an ICMPv6 "Packet Too Big" message is received after sending SCTP DATA,
a kernel panic will occur when the SCTP DATA is sent again.

This is because of a bad destination address in the call to skb_copy_bits().

The message sequence is like this:

Endpoint A Endpoint B

(Packet Too Big pmtu=1280)
] Not tainted VLI
EFLAGS: 00010282 (2.6.23-rc2 #1)
EIP is at skb_copy_bits+0x4f/0x1ef
eax: 000004d0 ebx: ce12a980 ecx: 00000134 edx: cfd5a880
esi: c8246858 edi: 00000000 ebp: c0759b14 esp: c0759adc
ds: 007b es: 007b fs: 00d8 gs: 0000 ss: 0068
Process swapper (pid: 0, ti=c0759000 task=c06d0340 task.ti=c0713000)
Stack: c0759b88 c0405867 ce12a980 c8bff838 c789c084 00000000 00000028 cfd5a880
d09f1890 000005dc 0000007b ce12a980 cfd5a880 c8bff838 c0759b88 d09bc521
000004d0 fffff96c 00000200 00000100 c0759b50 cfd5a880 00000246 c0759bd4
Call Trace:
[] show_trace_log_lvl+0x1a/0x2f
[] show_stack_log_lvl+0x9b/0xa3
[] show_registers+0x1b8/0x289
[] die+0x113/0x246
[] do_page_fault+0x4ad/0x57e
[] error_code+0x72/0x78
[] ip6_output+0x8e5/0xab2 [ipv6]
[] ip6_xmit+0x2ea/0x3a3 [ipv6]
[] sctp_v6_xmit+0x248/0x253 [sctp]
[] sctp_packet_transmit+0x53f/0x5ae [sctp]
[] sctp_outq_flush+0x555/0x587 [sctp]
[] sctp_retransmit+0xf8/0x10f [sctp]
[] sctp_icmp_frag_needed+0x57/0x5b [sctp]
[] sctp_v6_err+0xcd/0x148 [sctp]
[] icmpv6_notify+0xe6/0x167 [ipv6]
[] icmpv6_rcv+0x7d7/0x849 [ipv6]
[] ip6_input+0x1dc/0x310 [ipv6]
[] ipv6_rcv+0x294/0x2df [ipv6]
[] netif_receive_skb+0x2d2/0x335
[] process_backlog+0x7f/0xd0
[] net_rx_action+0x96/0x17e
[] __do_softirq+0x64/0xcd
[] do_softirq+0x5c/0xac
=======================
Code: 00 00 29 ca 89 d0 2b 45 e0 89 55 ec 85 c0 7e 35 39 45 08 8b 55 e4 0f 4e 45 08 8b 75 e0 8b 7d dc 89 c1 c1 e9 02 03 b2 a0 00 00 00 a5 89 c1 83 e1 03 74 02 f3 a4 29 45 08 0f 84 7b 01 00 00 01
EIP: [] skb_copy_bits+0x4f/0x1ef SS:ESP 0068:c0759adc
Kernel panic - not syncing: Fatal exception in interrupt

Arnaldo says:
====================
Thanks! I'm to blame for this one, problem was introduced in:

b0e380b1d8a8e0aca215df97702f99815f05c094

@@ -761,7 +762,7 @@ slow_path:
             /*
              * Copy a block of the IP datagram.
              */
-            if (skb_copy_bits(skb, ptr, frag->h.raw, len))
+            if (skb_copy_bits(skb, ptr, skb_transport_header(skb), len))
                 BUG();
             left -= len;
====================

Signed-off-by: Wei Yongjun
Acked-by: YOSHIFUJI Hideaki
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: David S. Miller

Wei Yongjun
2007-08-22 11:59:08 +0800

16 Aug, 2007

1 commit

660adc6e6 [IPv6]: Invalid semicolon after if statement

A similar fix to netfilter from Eric Dumazet inspired me to
look around a bit using some grep/sed stuff, as looking for
this kind of bug seemed easy to automate. This is one of those
I found where it looks like the semicolon is not valid.

Signed-off-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Ilpo Järvinen
2007-08-16 06:07:30 +0800

14 Aug, 2007

1 commit

703310e64 [IPV6]: Clean up duplicate includes in net/ipv6/

This patch cleans up duplicate includes in
net/ipv6/

Signed-off-by: Jesper Juhl
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

Jesper Juhl
2007-08-14 13:52:03 +0800

03 Aug, 2007

1 commit

3516ffb0f [TCP]: Invoke tcp_sendmsg() directly, do not use inet_sendmsg().

As discovered by Evgeniy Polyakov, if we try to sendmsg after
a connection reset, we can do incredibly stupid things.

The core issue is that inet_sendmsg() tries to autobind the
socket, but we should never do that for TCP. Instead we should
just go straight into TCP's sendmsg() code which will do all
of the necessary state and pending socket error checks.

TCP's sendpage already directly vectors to tcp_sendpage(), so this
merely brings sendmsg() in line with that.

Signed-off-by: David S. Miller

David S. Miller
2007-08-03 10:42:28 +0800

31 Jul, 2007

4 commits

1a3a206f7 [NETFILTER]: Make nf_ct_ipv6_skip_exthdr() static.

nf_ct_ipv6_skip_exthdr() can now become static.

Signed-off-by: Adrian Bunk
Signed-off-by: David S. Miller

Adrian Bunk
2007-07-31 17:28:26 +0800
c61a7d10e [IPV6]: ipv6_addr_type() doesn't know about RFC4193 addresses.

ipv6_addr_type() doesn't check for 'Unique Local IPv6 Unicast
Addresses' (RFC4193) and returns IPV6_ADDR_RESERVED for that range.

SCTP uses this function and will fail bind() and connect() calls that
use RFC4193 addresses, SCTP will also ignore inbound connections from
RFC4193 addresses if listening on IPV6_ADDR_ANY.

There may be other users of ipv6_addr_type() that could also have
problems.
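
For reference, RFC 4193 assigns the prefix FC00::/7 to Unique Local addresses, so the missing classification comes down to a mask on the first byte. A user-space sketch, with addr_is_unique_local() as a hypothetical helper name (this is not the kernel's ipv6_addr_type() code):

```c
#include <netinet/in.h>

/*
 * Returns 1 if addr lies in fc00::/7, the RFC 4193 Unique Local
 * IPv6 Unicast range, and 0 otherwise. Only the top seven bits of
 * the first byte are compared.
 */
int addr_is_unique_local(const struct in6_addr *addr)
{
    return (addr->s6_addr[0] & 0xfe) == 0xfc;
}
```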

Signed-off-by: Dave Johnson
Acked-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

Dave Johnson
2007-07-31 17:28:21 +0800
b217d616a [IPV4/IPV6]: Fail registration if inet device construction fails

Now that netdev notifications can fail, we can use this to signal
errors during registration for IPv4/IPv6. In particular, if we
fail to allocate memory for the inet device, we can fail the netdev
registration.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-07-31 17:28:16 +0800
566cfd8f0 [IPV6]: Don't update ADVMSS on routes where the MTU is not also updated

The ADVMSS value was incorrectly updated for ALL routes when the MTU
is updated because it's outside the effect of the if statement's
condition.
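
The shape of the fix can be sketched with a simplified, hypothetical route structure. The condition used here (only lower the MTU) and the 60-byte overhead (40-byte IPv6 header plus 20-byte TCP header) are assumptions for illustration; the point is only that both assignments sit inside the same braces.

```c
/* Hypothetical, simplified route entry. */
struct route {
    unsigned int mtu;
    unsigned int advmss;
};

/* Assumed overhead: 40-byte IPv6 header plus 20-byte TCP header. */
#define IPV6_TCP_OVERHEAD 60u

/*
 * Apply a new MTU to a route. Both fields are updated inside the same
 * block, so a route whose MTU is left alone keeps its ADVMSS too.
 * Assumes new_mtu is at least the IPv6 minimum MTU (1280).
 */
void route_apply_mtu(struct route *rt, unsigned int new_mtu)
{
    /* Simplified stand-in for the real condition. */
    if (rt->mtu > new_mtu) {
        rt->mtu = new_mtu;
        rt->advmss = new_mtu - IPV6_TCP_OVERHEAD;
    }
}
```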

Signed-off-by: Simon Arlott
Signed-off-by: David S. Miller

Simon Arlott
2007-07-31 17:28:04 +0800

27 Jul, 2007

1 commit

704eae1f3 ip6_tunnel - endianness annotations

Convert rel_info to host-endian before calling ip6_tnl_err().
Things become much more straightforward that way.
The key observation (and the reason why that code actually
worked) is that after ip6_tnl_err() we either immediately
bailed out or had rel_info set to 0 or had it set to host-endian
and guaranteed to hit
(rel_type == ICMP_DEST_UNREACH && rel_code == ICMP_FRAG_NEEDED)
case. So inconsistent endianness didn't really lead to bugs,
but it had been subtle and prone to breakage. New variant is
saner and obviously safe.

Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2007-07-27 02:11:56 +0800

25 Jul, 2007

2 commits

7e2acc7e2 [NETFILTER]: Fix logging regression

Loading one of the LOG targets fails if a different target has already
registered itself as backend for the same family. This can affect the
ipt_LOG and ipt_ULOG modules when both are loaded.

Reported and tested by:

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-25 06:29:55 +0800
ca983cefd [TCPv6] MD5SIG: Ensure to reset allocation count to avoid panic.

After clearing all passwords for IPv6 peers, we need to
set allocation count to zero as well as we free the storage.
Otherwise, we panic when a user tries to (re)add a password.
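
Why the stale count matters can be sketched with a small growable list. All names here (key_list, key_list_add, key_list_clear) are hypothetical and the entries are plain ints rather than real key records. If the allocation count survived the clear, the next add would skip the reallocation and store through a NULL pointer.

```c
#include <stdlib.h>

/* Hypothetical growable list; ints stand in for key entries. */
struct key_list {
    int    *keys;
    size_t  entries;    /* slots in use */
    size_t  alloced;    /* slots allocated */
};

/* Append a key, growing the storage when it is full. 0 on success. */
int key_list_add(struct key_list *kl, int key)
{
    if (kl->entries == kl->alloced) {
        size_t n = kl->alloced + 1;
        int *p = realloc(kl->keys, n * sizeof(*p));

        if (!p)
            return -1;
        kl->keys = p;
        kl->alloced = n;
    }
    kl->keys[kl->entries++] = key;
    return 0;
}

/* Drop all keys and release the storage. */
void key_list_clear(struct key_list *kl)
{
    free(kl->keys);
    kl->keys = NULL;
    kl->entries = 0;
    /* Reset together with the free, or the next add skips realloc. */
    kl->alloced = 0;
}
```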

Discovered and fixed by MIYAJIMA Mitsuharu.

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2007-07-25 06:27:30 +0800

22 Jul, 2007

1 commit

b77f2fa62 [IPV6]: endianness bug in ip6_tunnel

Signed-off-by: Al Viro
Signed-off-by: David S. Miller

Al Viro
2007-07-22 10:09:41 +0800

20 Jul, 2007

1 commit

20c2df83d mm: Remove slab destructors from kmem_cache_create().

Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt

Paul Mundt
2007-07-20 09:11:58 +0800

15 Jul, 2007

7 commits

063ed369c [IPV6]: Call inet6addr_chain notifiers on link down

Currently if the link is brought down via ip link or ifconfig down,
the inet6addr_chain notifiers are not called even though all
the addresses are removed from the interface. This caused SCTP
to add duplicate addresses to its list.

Signed-off-by: Vlad Yasevich
Signed-off-by: David S. Miller

Vlad Yasevich
2007-07-15 15:16:35 +0800
f13ec93fb [IPV6]: MSG_ERRQUEUE messages do not pass to connected raw sockets

From: Dmitry Butskoy

Taken from http://bugzilla.kernel.org/show_bug.cgi?id=8747

Problem Description:

It is related to the possibility to obtain MSG_ERRQUEUE messages from the udp
and raw sockets, both connected and unconnected.

There is a little typo in net/ipv6/icmp.c code, which prevents such messages
from being delivered to the errqueue of the corresponding raw socket, when the
socket is CONNECTED. The typo is due to swap of local/remote addresses.

Consider __raw_v6_lookup() function from net/ipv6/raw.c. When a raw socket is
looked up the usual way, it is something like:

sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif);

where "daddr" is a destination address of the incoming packet (IOW our local
address), "saddr" is a source address of the incoming packet (the remote end).

But when the raw socket is looked up for some icmp error report, in
net/ipv6/icmp.c:icmpv6_notify() , daddr/saddr are obtained from the echoed
fragment of the "bad" packet, i.e. "daddr" is the original destination
address of that packet, "saddr" is our local address. Hence
icmpv6_notify() must use "saddr, daddr" in its arguments, not "daddr, saddr"
...
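
The argument order can be illustrated with a toy matcher. Everything here is a hypothetical simplification: addresses are 32-bit ids, 0 means "not set", and raw_match(), match_incoming() and match_icmp_error() are not kernel functions. The sketch shows why the embedded packet in an ICMP error needs its source treated as the local address.

```c
#include <stdint.h>

/* Hypothetical raw socket: 0 means unbound / unconnected. */
struct raw_sock {
    uint32_t local;
    uint32_t remote;
};

/* Source and destination of a packet, as toy 32-bit ids. */
struct pkt_addrs {
    uint32_t saddr;
    uint32_t daddr;
};

/* Match a socket against a (local, remote) pair; unset fields match all. */
static int raw_match(const struct raw_sock *sk, uint32_t local,
                     uint32_t remote)
{
    if (sk->local != 0 && sk->local != local)
        return 0;
    return sk->remote == 0 || sk->remote == remote;
}

/* Incoming packet: its destination is our local address. */
int match_incoming(const struct raw_sock *sk, const struct pkt_addrs *p)
{
    return raw_match(sk, p->daddr, p->saddr);
}

/*
 * ICMP error: `inner` is our own outgoing packet echoed back, so its
 * source is our local address and its destination is the peer.
 */
int match_icmp_error(const struct raw_sock *sk, const struct pkt_addrs *inner)
{
    return raw_match(sk, inner->saddr, inner->daddr);
}
```

With a connected socket, passing the echoed packet through the incoming-packet order fails to match, which corresponds to the lost error messages described in the report; an unconnected socket matches either way.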

Steps to reproduce:

Create some raw socket, connect it to an address, and cause some error
situation: f.e. set ttl=1 where the remote address is more than 1 hop to reach.
Set IPV6_RECVERR .
Then send something and wait for the error (f.e. poll() with POLLERR|POLLIN).
You should receive "time exceeded" icmp message (because of "ttl=1"), but the
socket does not receive it.

If you do not connect your raw socket, you will receive MSG_ERRQUEUE
successfully. (The reason is that for unconnected socket there are no actual
checks for local/remote addresses).

Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

Dmitry Butskoy
2007-07-15 14:53:08 +0800
61075af51 [NETFILTER]: nf_conntrack: mark protocols __read_mostly

Also remove two unnecessary EXPORT_SYMBOLs and move the
nf_conntrack_l3proto_ipv4 declaration to the correct file.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-15 11:48:19 +0800
a887c1c14 [NETFILTER]: Lower *tables printk severity

Lower ip6tables, arptables and ebtables printk severity similar to
Dan Aloni's patch for iptables.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-15 11:46:15 +0800
e2a3123fb [NETFILTER]: nf_conntrack: Introduces nf_ct_get_tuplepr and uses it

nf_ct_get_tuple() requires the offset to transport header and that bothers
callers such as icmp[v6] l4proto modules. This introduces a new function
to simplify them.

Signed-off-by: Yasuyuki Kozakai
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Yasuyuki Kozakai
2007-07-15 11:45:14 +0800
ffc306904 [NETFILTER]: nf_conntrack: make l3proto->prepare() generic and renames it

The icmp[v6] l4proto modules parse headers in an ICMP[v6] error to get the tuple.
But they have to find the offset to the transport protocol header before that.
Their processing is almost the same as prepare() of the l3proto modules.
This makes prepare() more generic to simplify the icmp[v6] l4proto modules
later.

Signed-off-by: Yasuyuki Kozakai
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Yasuyuki Kozakai
2007-07-15 11:44:50 +0800
d87d8469e [NETFILTER]: nf_conntrack: Increment error count on parsing IPv4 header

Signed-off-by: Yasuyuki Kozakai
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Yasuyuki Kozakai
2007-07-15 11:44:23 +0800