Doug / smarc-fsl-linux-kernel | Embedian Git Server

12 Jul, 2007

3 commits

0e06877c6 [RTNETLINK]: rtnl_link: allow specifying initial device address

Drivers need to validate the initial addresses in their netlink attribute
validation function or manually reject them if they can't support this.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-12 10:45:36 +0800
2d85cba2b [RTNETLINK]: rtnl_link API simplification

All drivers need to unregister their devices in the module unload function.
While doing so they must hold the rtnl and atomically unregister the
rtnl_link ops as well. This makes the rtnl_link_unregister function that
takes the rtnl itself completely useless.

Provide default newlink/dellink functions, make __rtnl_link_unregister and
rtnl_link_unregister unregister all devices with matching rtnl_link_ops and
change the existing users to take advantage of that.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-12 10:45:33 +0800
29578624e [NET]: Fix races in net_rx_action vs netpoll.

Keep netpoll/poll_napi from messing with the poll_list.
Only net_rx_action is allowed to manipulate the list.

Signed-off-by: Olaf Kirch
Signed-off-by: David S. Miller

Olaf Kirch
2007-07-12 10:32:02 +0800

11 Jul, 2007

21 commits

6b25d30bf [NET]: Fix gen_estimator timer removal race

As noticed by Jarek Poplawski, the timer removal in
gen_kill_estimator races with the timer function rearming the timer.

Check whether the timer list is empty before rearming the timer
in the timer function to fix this.

Signed-off-by: Patrick McHardy
Acked-by: Jarek Poplawski
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:19:03 +0800
1498b3f19 [NETPOLL]: Fix a leak-n-bug in netpoll_cleanup()

93ec2c723e3f8a216dde2899aeb85c648672bc6b applied excessive duct tape to
the netpoll beast's netpoll_cleanup(), thus substituting one leak with
another, and opening up a little buglet :-)

net_device->npinfo (netpoll_info) is a shared and refcounted object and
cannot simply be set NULL the first time netpoll_cleanup() is called.
Otherwise, further netpoll_cleanup()'s see np->dev->npinfo == NULL and
become no-ops, thus leaking. And it's a bug too: the first call to
netpoll_cleanup() would thus (annoyingly) "disable" other (still alive)
netpolls too. Maybe nobody noticed this because netconsole (only user
of netpoll) never supported multiple netpoll objects earlier.

This is a trivial and obvious one-line fixlet.
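
The refcounting rule described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the actual one-line fix (the diff is not shown here): the toy_* types and functions are hypothetical stand-ins for net_device, netpoll_info and netpoll_cleanup(), and the plain int counter ignores the locking the kernel would need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for netpoll_info, net_device and netpoll. */
struct toy_npinfo { int refcnt; };
struct toy_dev { struct toy_npinfo *npinfo; };
struct toy_netpoll { struct toy_dev *dev; };

/* Attach a netpoll to a device, creating the shared info on first use. */
static int toy_netpoll_setup(struct toy_netpoll *np, struct toy_dev *dev)
{
    if (!dev->npinfo) {
        dev->npinfo = calloc(1, sizeof(*dev->npinfo));
        if (!dev->npinfo)
            return -1;
    }
    dev->npinfo->refcnt++;
    np->dev = dev;
    return 0;
}

/* Detach a netpoll. The shared info is released and the device pointer
 * cleared only when the last user goes away, so other netpolls on the
 * same device keep working. */
static void toy_netpoll_cleanup(struct toy_netpoll *np)
{
    struct toy_dev *dev = np->dev;

    if (!dev)
        return;
    if (dev->npinfo && --dev->npinfo->refcnt == 0) {
        free(dev->npinfo);
        dev->npinfo = NULL;
    }
    np->dev = NULL;
}
```

With two toy netpolls attached to one device, the first cleanup leaves dev->npinfo in place with a count of one, and only the second cleanup frees it.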

Signed-off-by: Satyam Sharma
Signed-off-by: David S. Miller

Satyam Sharma
2007-07-11 13:19:02 +0800
6f11df835 [NET]: "wrong timeout value in sk_wait_data()": cleanups

- save 4 bytes

- it's read-mostly.

Signed-off-by: Andrew Morton
Acked-by: Vasily Averin
Signed-off-by: David S. Miller

Andrew Morton
2007-07-11 13:18:50 +0800
60f0438a8 [NET]: Make some network-related proc files use seq_list_xxx helpers

This includes /proc/net/protocols, /proc/net/rxrpc_calls and
/proc/net/rxrpc_connections files.

All three need seq_list_start_head to show some header.

Signed-off-by: Pavel Emelianov
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

Pavel Emelianov
2007-07-11 13:18:49 +0800
ba9dda3ab [NETFILTER]: x_tables: add TRACE target

The TRACE target can be used to follow IP and IPv6 packets through
the ruleset.

Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Jozsef Kadlecsik
2007-07-11 13:17:14 +0800
a553e4a63 [PKTGEN]: IPSEC support

Added transport mode ESP support for starters. I will send more of
these modes and types once I have resolved the tunnel mode issues.

Signed-off-by: Jamal Hadi Salim
Signed-off-by: Robert Olsson
Signed-off-by: David S. Miller

Jamal Hadi Salim
2007-07-11 13:16:36 +0800
007a531b0 [PKTGEN]: Introduce sequential flows

By default all flows in pktgen are randomly selected.
This patch introduces the ability to send all defined flows
sequentially. Robert defined randomness to be the
default behavior.
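
The difference between the two selection modes can be sketched in userspace as follows. This is an illustrative model only: struct toy_flow_state and toy_next_flow() are hypothetical names, and the real pktgen bookkeeping (for example per-flow packet counts) is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flow-selection state; not pktgen's actual data structure. */
struct toy_flow_state {
    unsigned int nflows;   /* number of defined flows */
    unsigned int curfl;    /* next flow (sequential) or last pick (random) */
    int sequential;        /* non-zero: walk the flows in order */
};

/* Return the flow index to use for the next packet. */
static unsigned int toy_next_flow(struct toy_flow_state *s)
{
    assert(s->nflows > 0);

    if (s->sequential) {
        unsigned int flow = s->curfl;

        /* Advance and wrap so every flow is visited in turn. */
        s->curfl = (s->curfl + 1) % s->nflows;
        return flow;
    }

    /* Default behavior: pick any defined flow at random. */
    s->curfl = (unsigned int)rand() % s->nflows;
    return s->curfl;
}
```

In sequential mode with three flows, successive calls yield 0, 1, 2 and then wrap back to 0; in random mode the only guarantee is that the index stays below nflows.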

Signed-off-by: Jamal Hadi Salim
Signed-off-by: Robert Olsson
Signed-off-by: David S. Miller

Jamal Hadi Salim
2007-07-11 13:16:27 +0800
16dab72f6 [PKTGEN]: Centralize packet overhead tracking

Track the extra packet overhead for VLAN tags, MPLS, IPSEC etc.

Signed-off-by: Jamal Hadi Salim
Signed-off-by: Robert Olsson
Signed-off-by: David S. Miller

Jamal Hadi Salim
2007-07-11 13:16:26 +0800
61cbc2fca [NET]: Fix secondary unicast/multicast address count maintenance

When a reference to an existing address is increased or decreased without
hitting zero, the address count is incorrectly adjusted.
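
The intended invariant can be sketched with a small refcounted address table. This is a userspace model under assumptions, not the kernel's dev_addr_list code: the toy_* names, the fixed-size array and the 6-byte address length are all illustrative choices. The point is that the list count changes only when an entry is created or when its last reference is dropped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ADDRS 8
#define TOY_ADDR_LEN  6

/* Hypothetical refcounted address entry and list. */
struct toy_addr { unsigned char addr[TOY_ADDR_LEN]; int users; };
struct toy_addr_list { struct toy_addr entries[TOY_MAX_ADDRS]; int count; };

static struct toy_addr *toy_addr_find(struct toy_addr_list *l,
                                      const unsigned char *a)
{
    for (int i = 0; i < l->count; i++)
        if (memcmp(l->entries[i].addr, a, TOY_ADDR_LEN) == 0)
            return &l->entries[i];
    return NULL;
}

static int toy_addr_add(struct toy_addr_list *l, const unsigned char *a)
{
    struct toy_addr *e = toy_addr_find(l, a);

    if (e) {
        e->users++;          /* existing entry: count stays unchanged */
        return 0;
    }
    if (l->count == TOY_MAX_ADDRS)
        return -1;
    e = &l->entries[l->count++];  /* new entry: count goes up by one */
    memcpy(e->addr, a, TOY_ADDR_LEN);
    e->users = 1;
    return 0;
}

static int toy_addr_delete(struct toy_addr_list *l, const unsigned char *a)
{
    struct toy_addr *e = toy_addr_find(l, a);

    if (!e)
        return -1;
    if (--e->users > 0)
        return 0;            /* still referenced: count stays unchanged */
    *e = l->entries[--l->count];  /* last reference: fill the hole, shrink */
    return 0;
}
```

Adding the same address twice and deleting it once leaves the count at one; only the second delete brings it back to zero.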

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:16:23 +0800
f25f4e448 [CORE] Stack changes to add multiqueue hardware support API

Add the multiqueue hardware device support API to the core network
stack. Allow drivers to allocate multiple queues and manage them at
the netdev level if they choose to do so.

Added a new field to sk_buff, namely queue_mapping, for drivers to
know which tx_ring to select based on OS classification of the flow.

Signed-off-by: Peter P Waskiewicz Jr
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Peter P Waskiewicz Jr
2007-07-11 13:16:21 +0800
a298830cd [NET]: Fix TX checksum feature check

This patch fixes a boolean error in the new TX checksum check
that causes bogus TSO packets to be generated.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-07-11 13:16:19 +0800
4417da668 [NET]: dev: secondary unicast address support

Add support for configuring secondary unicast addresses on network
devices. To support this, devices capable of filtering multiple
unicast addresses need to change their set_multicast_list function
to configure unicast filters as well and assign it to dev->set_rx_mode
instead of dev->set_multicast_list. Other devices are put into promiscuous
mode when secondary unicast addresses are present.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:15:56 +0800
3fba5a8b1 [NET]: dev_mcast: switch to generic net_device address lists

Use generic net_device address lists for multicast list handling.
Some defines are used to keep drivers working.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:15:55 +0800
bf742482d [NET]: dev: introduce generic net_device address lists

Introduce struct dev_addr_list and list maintenance functions
based on dev_mc_list and the related functions. This will be
used by follow-up patches for both multicast and secondary
unicast addresses.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:15:54 +0800
75ebe8f73 [NET]: dev_mcast: unexport dev_mc_upload

dev_mc_add/dev_mc_delete take care of uploading the list when
necessary, and that's the only interface other code should use.
Also remove two incorrect calls in DECnet.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:15:53 +0800
d212f87b0 [NET]: IPV6 checksum offloading in network devices

The existing model for checksum offload does not correctly handle
devices that can offload IPV4 and IPV6 only. The NETIF_F_HW_CSUM flag
implies the device can checksum any arbitrary protocol.

This patch:
* adds NETIF_F_IPV6_CSUM for those devices
* fixes bnx2 and tg3 devices that need it
* adds NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
* fixes assumptions about NETIF_F_ALL_CSUM in nat
* adjusts bridge union of checksumming computation
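
The per-protocol model described above can be sketched as a feature check. This is a simplified userspace illustration, not the kernel's code: the TOY_F_* bit values are arbitrary placeholders for NETIF_F_IP_CSUM, NETIF_F_HW_CSUM and NETIF_F_IPV6_CSUM, and toy_can_offload_csum() is a hypothetical helper. Only the EtherType values (0x0800 for IPv4, 0x86DD for IPv6) are real constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder feature bits; the real NETIF_F_* values differ. */
#define TOY_F_IP_CSUM   (1u << 0)  /* IPv4 only */
#define TOY_F_HW_CSUM   (1u << 1)  /* any protocol */
#define TOY_F_IPV6_CSUM (1u << 2)  /* IPv6 only */

#define TOY_ETH_P_IP    0x0800u
#define TOY_ETH_P_IPV6  0x86DDu

/* Can a device with these feature bits checksum a packet of this protocol? */
static bool toy_can_offload_csum(unsigned int features, unsigned int protocol)
{
    if (features & TOY_F_HW_CSUM)
        return true;                         /* generic: any protocol */
    if (protocol == TOY_ETH_P_IP)
        return (features & TOY_F_IP_CSUM) != 0;
    if (protocol == TOY_ETH_P_IPV6)
        return (features & TOY_F_IPV6_CSUM) != 0;
    return false;                            /* fall back to software */
}
```

A device advertising only the IPv4 and IPv6 bits is then accepted for those two protocols and rejected for everything else, which is the case the generic flag could not express.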

Signed-off-by: David S. Miller

Stephen Hemminger
2007-07-11 13:15:52 +0800
2371baa4b [RTNETLINK]: Fix rtnetlink compat attribute patch

Sent the wrong patch previously.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:15:40 +0800
afdc3238e [RTNETLINK]: Add nested compat attribute

Add a nested compat attribute type that can be used to convert
attributes that contain a structure to nested attributes in a
backwards compatible way.

The attribute looks like this:

struct {
[ compat contents ]
struct rtattr {
.rta_len = total size,
.rta_type = type,
} rta;
struct old_structure struct;

[ nested top-level attribute ]
struct rtattr {
.rta_len = nest size,
.rta_type = type,
} nest_attr;

[ optional 0 .. n nested attributes ]
struct rtattr {
.rta_len = private attribute len,
.rta_type = private attribute type,
} nested_attr;
struct nested_data data;
};

Since both userspace and kernel deal correctly with attributes that are
larger than expected old versions will just parse the compat part and
ignore the rest.
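
The layout above can be modelled in userspace to show why an old parser keeps working. This is a sketch under assumptions: struct toy_rtattr mirrors the two 16-bit fields of struct rtattr, the 4-byte alignment matches rtnetlink attribute alignment, and struct old_structure, the toy_* helpers and the single 32-bit private attribute are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ALIGN(len) (((size_t)(len) + 3) & ~(size_t)3)

struct toy_rtattr { unsigned short rta_len; unsigned short rta_type; };
struct old_structure { unsigned int value; };   /* hypothetical legacy payload */

#define TOY_HDRLEN TOY_ALIGN(sizeof(struct toy_rtattr))

/* Build: outer header, old structure, nest header, one private attribute. */
static size_t toy_build_compat(unsigned char *buf, unsigned short type,
                               const struct old_structure *old,
                               unsigned short priv_type, unsigned int priv_val)
{
    size_t off = TOY_HDRLEN;                 /* room for the outer header */
    memcpy(buf + off, old, sizeof(*old));
    off += TOY_ALIGN(sizeof(*old));

    size_t nest = off;                       /* nested top-level attribute */
    off += TOY_HDRLEN;

    struct toy_rtattr priv = { (unsigned short)(TOY_HDRLEN + sizeof(priv_val)), priv_type };
    memcpy(buf + off, &priv, sizeof(priv));
    memcpy(buf + off + TOY_HDRLEN, &priv_val, sizeof(priv_val));
    off += TOY_ALIGN(priv.rta_len);

    struct toy_rtattr nest_hdr = { (unsigned short)(off - nest), type };
    memcpy(buf + nest, &nest_hdr, sizeof(nest_hdr));
    struct toy_rtattr outer = { (unsigned short)off, type };  /* total size */
    memcpy(buf, &outer, sizeof(outer));
    return off;
}

/* Old parser: reads only the compat structure and ignores trailing bytes. */
static int toy_old_parse(const unsigned char *buf, struct old_structure *out)
{
    struct toy_rtattr rta;
    memcpy(&rta, buf, sizeof(rta));
    if (rta.rta_len < TOY_HDRLEN + sizeof(*out))
        return -1;
    memcpy(out, buf + TOY_HDRLEN, sizeof(*out));
    return 0;
}

/* New parser: skips the compat part and reads the nested private value. */
static int toy_new_parse(const unsigned char *buf, unsigned int *priv_val)
{
    size_t nest = TOY_HDRLEN + TOY_ALIGN(sizeof(struct old_structure));
    struct toy_rtattr outer, inner;

    memcpy(&outer, buf, sizeof(outer));
    if (outer.rta_len < nest + 2 * TOY_HDRLEN + sizeof(*priv_val))
        return -1;
    memcpy(&inner, buf + nest + TOY_HDRLEN, sizeof(inner));
    if (inner.rta_len < TOY_HDRLEN + sizeof(*priv_val))
        return -1;
    memcpy(priv_val, buf + nest + 2 * TOY_HDRLEN, sizeof(*priv_val));
    return 0;
}
```

The old parser only requires the attribute to be at least as long as the legacy structure, so the extra nested bytes are harmless to it, while the new parser finds the nested attributes at a fixed offset behind the compat part.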

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:15:39 +0800
334a8132d [SKBUFF]: Keep track of writable header len of headerless clones

Currently NAT (and others) that want to modify cloned skbs copy them,
even if in the vast majority of cases it's not necessary because the
skb is a clone made by TCP and the portion NAT wants to modify is
actually writable because TCP releases the header reference before
cloning.

The problem is that there is no clean way for NAT to find out how
long the writable header area is, so this patch introduces skb->hdr_len
to hold this length. When a headerless skb is cloned skb->hdr_len
is set to the current headroom, for regular clones it is copied from
the original. A new function skb_clone_writable(skb, len) returns
whether the skb is writable up to len bytes from skb->data. To avoid
enlarging the skb the mac_len field is reduced to 16 bit and the
new hdr_len field is put in the remaining 16 bit.
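
The hdr_len bookkeeping described above can be modelled in a few lines of userspace C. This is a sketch based only on the description in this message, not the kernel source: struct toy_skb, its fields and the toy_* helpers are hypothetical, and the header_shared flag stands in for the kernel's own check of whether the header area is still referenced elsewhere.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few sk_buff fields involved. */
struct toy_skb {
    bool nohdr;            /* owner released its header reference */
    bool header_shared;    /* header area still referenced by someone else */
    unsigned int headroom; /* bytes between buffer start and data */
    unsigned int hdr_len;  /* writable header length */
};

/* Clone: a headerless original hands its current headroom to the clone as
 * writable header space; a regular clone just copies hdr_len. */
static struct toy_skb toy_clone(const struct toy_skb *orig)
{
    struct toy_skb c = *orig;

    c.hdr_len = orig->nohdr ? orig->headroom : orig->hdr_len;
    c.header_shared = !orig->nohdr;
    c.nohdr = false;
    return c;
}

/* Prepend a header of len bytes, consuming headroom. */
static void toy_push(struct toy_skb *skb, unsigned int len)
{
    assert(len <= skb->headroom);
    skb->headroom -= len;
}

/* Is the clone writable for len bytes starting at its data pointer? */
static bool toy_clone_writable(const struct toy_skb *skb, unsigned int len)
{
    return !skb->header_shared && skb->headroom + len <= skb->hdr_len;
}
```

For a headerless original with 128 bytes of headroom, a clone that pushes 40 bytes of headers is writable for exactly those 40 bytes and no further, because anything beyond them is payload shared with the original.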

I've done a few rough benchmarks of NAT (not with this exact patch,
but a very similar one). As expected it saves huge amounts of system
time in case of sendfile, bringing it down to basically the same
amount as without NAT, with sendmsg it only helps on loopback,
probably because of the large MTU.

Transmit a 1GB file using sendfile/sendmsg over eth0/lo with and
without NAT:

- sendfile eth0, no NAT: sys 0m0.388s
- sendfile eth0, NAT: sys 0m1.835s
- sendfile eth0, NAT + patch: sys 0m0.370s (~ -80%)

- sendfile lo, no NAT: sys 0m0.258s
- sendfile lo, NAT: sys 0m2.609s
- sendfile lo, NAT + patch: sys 0m0.260s (~ -90%)

- sendmsg eth0, no NAT: sys 0m2.508s
- sendmsg eth0, NAT: sys 0m2.539s
- sendmsg eth0, NAT + patch: sys 0m2.445s (no change)

- sendmsg lo, no NAT: sys 0m2.151s
- sendmsg lo, NAT: sys 0m3.557s
- sendmsg lo, NAT + patch: sys 0m2.159s (~ -40%)

I expect other users can see a similar performance improvement;
packet mangling iptables targets, ipip and ip_gre come to mind.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:15:37 +0800
38f7b870d [RTNETLINK]: Link creation API

Add rtnetlink API for creating, changing and deleting software devices.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:14:20 +0800
0157f60c0 [RTNETLINK]: Split up rtnl_setlink

Split up rtnl_setlink into a function performing validation and a function
performing the actual changes. This allows sharing the modification logic
with rtnl_newlink, which is introduced by the next patch.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-07-11 13:14:16 +0800

06 Jul, 2007

3 commits

25442cafb [NETPOLL]: Fixups for 'fix soft lockup when removing module'

From my recent patch:

> > #1
> > Until kernel ver. 2.6.21 (inclusive) cancel_rearming_delayed_work()
> > required that a work function always (unconditionally) rearm with
> > delay > 0 - otherwise it would endlessly loop. This patch replaces
> > this function with cancel_delayed_work(). Later kernel versions don't
> > require this, so here it's only for uniformity.

But Oleg Nesterov found:

> But 2.6.22 doesn't need this change, why it was merged?
>
> In fact, I suspect this change adds a race,
...

His description was right (thanks), so this patch reverts #1.

Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2007-07-06 08:42:44 +0800
94b83419e [NET]: net/core/netevent.c should #include <net/netevent.h>

Every file should include the headers containing the prototypes for
its global functions.

Signed-off-by: Adrian Bunk
Signed-off-by: David S. Miller

Adrian Bunk
2007-07-06 08:40:27 +0800
2cd052e44 [NET] skbuff: remove export of static symbol

skb_clone_fraglist is static so it shouldn't be exported.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2007-07-06 08:40:19 +0800

29 Jun, 2007

1 commit

17200811c [NETPOLL] netconsole: fix soft lockup when removing module

#1
Until kernel ver. 2.6.21 (inclusive) cancel_rearming_delayed_work()
required that a work function always (unconditionally) rearm with
delay > 0 - otherwise it would endlessly loop. This patch replaces
this function with cancel_delayed_work(). Later kernel versions don't
require this, so here it's only for uniformity.

#2
After deleting a timer in cancel_[rearming_]delayed_work(), a last skb
could remain queued in npinfo->txq, causing a memory leak after
kfree(npinfo).

Initial patch & testing by: Jason Wessel

Signed-off-by: Jarek Poplawski
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

Jarek Poplawski
2007-06-29 13:11:47 +0800

27 Jun, 2007

1 commit

0db3dc73f [NETPOLL]: tx lock deadlock fix

If the sky2 device poll routine is called from netpoll_send_skb, it would
deadlock. netpoll_send_skb held the netif_tx_lock, and the poll
routine could acquire it to clean up skbs. Other drivers might use
the same locking model.

The driver is correct; netpoll should not introduce more locking
problems than it causes already. So change the code to drop the lock
before calling the poll handler.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2007-06-27 15:39:42 +0800

24 Jun, 2007

3 commits

5b5a60da2 [NET]: Make skb_seq_read unmap the last fragment

Having walked through the entire skbuff, skb_seq_read would leave the
last fragment mapped. As a consequence, the unwary caller would leak
kmaps, and proceed with preempt_count off by one. The only (kind of
non-intuitive) workaround is to use skb_seq_read_abort.

This patch makes sure skb_seq_read always unmaps frag_data after
having cycled through the skb's paged part.

Signed-off-by: Olaf Kirch
Signed-off-by: David S. Miller

Olaf Kirch
2007-06-24 14:11:52 +0800
515e06c45 [NET]: Re-enable irqs before pushing pending DMA requests

This moves the local_irq_enable() call in net_rx_action() to before
calling the CONFIG_NET_DMA's dma_async_memcpy_issue_pending() rather
than after. This shortens the irq disabled window and allows for DMA
drivers that need to do their own irq hold.

Signed-off-by: Shannon Nelson
Signed-off-by: David S. Miller

Shannon Nelson
2007-06-24 14:09:23 +0800
dbbeb2f99 [SKBUFF]: Fix incorrect config #ifdef around skb_copy_secmark

secmark doesn't depend on CONFIG_NET_SCHED.

Signed-off-by: Patrick McHardy
Acked-by: James Morris
Signed-off-by: David S. Miller

Patrick McHardy
2007-06-24 13:58:34 +0800

08 Jun, 2007

4 commits

7c355f532 [NET]: Avoid duplicate netlink notification when changing link state

When changing the link state from userspace without affecting any other
flags, two duplicate notifications are sent: once as an action
in the NETDEV_UP/NETDEV_DOWN notification chain and a second time
when comparing old and new device flags after the change has been
completed. Although harmless, the duplicates should be avoided.

Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller

Thomas Graf
2007-06-08 04:40:56 +0800
51055be81 [RTNETLINK]: ifindex 0 does not exist

ifindex == 0 does not exist and implies we should do a lookup by name if
one was given.
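
The lookup rule can be sketched as below. This is a userspace illustration only: struct toy_dev and toy_find_dev() are hypothetical, and the real code looks devices up through the kernel's own index and name lookup helpers rather than by scanning an array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device table entry. */
struct toy_dev { int ifindex; const char *name; };

/* A positive ifindex selects by index. ifindex 0 never matches a device,
 * so it means "look the device up by name, if a name was given". */
static const struct toy_dev *toy_find_dev(const struct toy_dev *devs, size_t n,
                                          int ifindex, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (ifindex > 0) {
            if (devs[i].ifindex == ifindex)
                return &devs[i];
        } else if (name && name[0] != '\0' &&
                   strcmp(devs[i].name, name) == 0) {
            return &devs[i];
        }
    }
    return NULL;
}
```

A request with ifindex 0 and the name "eth0" resolves by name, while ifindex 0 with no name finds nothing.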

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-06-08 04:40:11 +0800
ef7c79ed6 [NETLINK]: Mark netlink policies const

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-06-08 04:40:10 +0800
c4b1010f4 [NET]: Merge dst_discard_in and dst_discard_out.

Signed-off-by: Denis Cheng
Signed-off-by: David S. Miller

Denis Cheng
2007-06-08 04:39:46 +0800

04 Jun, 2007

1 commit

4fcd6b991 [NET] gso: Fix GSO feature mask in sk_setup_caps

This isn't a bug just yet as only TCP uses sk_setup_caps for GSO.
However, if and when UDP or something else starts using it this is
likely to cause a problem if we forget to add software emulation
for it at the same time.

The problem is that right now we translate GSO emulation to the
bitmask NETIF_F_GSO_MASK, which includes every protocol, even
ones that we cannot emulate.

This patch makes it provide only the ones that we can emulate.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-06-04 09:08:49 +0800

31 May, 2007

2 commits

83f03fa5a [NET]: parse ip:port strings correctly in in4_pton

in4_pton converts a textual representation of an ip4 address
into an integer representation. However, when the textual representation
is in the form ip:port, e.g. 192.168.1.1:5060, and 'delim' is set to
-1, the function bails out with an error when reading the colon.

It makes sense to allow the colon as a delimiting character without
explicitly having to set it through the 'delim' variable as there can be
no ambiguity in the point where the ip address is completely parsed. This
function is indeed called from nf_conntrack_sip.c in this way to parse
textual ip:port combinations which fails due to the reason stated above.
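
The behavior argued for above can be illustrated with a simplified parser. This is not the kernel's in4_pton: toy_in4_pton() is a hypothetical userspace function with a reduced signature, written only to show that a colon after the fourth octet is an unambiguous end of the address even when 'delim' is -1.

```c
#include <assert.h>
#include <stddef.h>

/* Parse a dotted-quad IPv4 address from a NUL-terminated string.
 * Returns 1 on success and 0 on error. Parsing stops at the end of the
 * string, at 'delim' (if not -1), or at a ':' that follows the address.
 * On success *end (if non-NULL) points at the stopping character. */
static int toy_in4_pton(const char *src, unsigned char dst[4],
                        int delim, const char **end)
{
    const char *p = src;
    int octets = 0;

    while (octets < 4) {
        unsigned int val = 0;
        int digits = 0;

        while (*p >= '0' && *p <= '9') {
            val = val * 10 + (unsigned int)(*p - '0');
            if (val > 255 || ++digits > 3)
                return 0;
            p++;
        }
        if (digits == 0)
            return 0;
        dst[octets++] = (unsigned char)val;

        if (octets < 4) {
            if (*p != '.')
                return 0;
            p++;
        }
    }

    /* After four octets a colon can only start a port, so accept it. */
    if (*p != '\0' && *p != ':' && !(delim != -1 && *p == delim))
        return 0;
    if (end)
        *end = p;
    return 1;
}
```

Called on "192.168.1.1:5060" with delim set to -1, the sketch succeeds and leaves *end pointing at the colon, so the caller can go on to parse the port.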

Signed-off-by: Jerome Borsboom
Signed-off-by: David S. Miller

Jerome Borsboom
2007-05-31 16:23:27 +0800
01e67d08f [XFRM]: Allow XFRM_ACQ_EXPIRES to be tunable via sysctl.

Signed-off-by: David S. Miller

David S. Miller
2007-05-31 16:23:23 +0800

25 May, 2007

1 commit

14e50e57a [XFRM]: Allow packet drops during larval state resolution.

The current IPSEC rule resolution behavior we have does not work for a
lot of people, even though technically it's an improvement from the
-EAGAIN business we had before.

Right now we'll block until the key manager resolves the route. That
works for simple cases, but many folks would rather packets get
silently dropped until the key manager resolves the IPSEC rules.

We can't tell these folks to "set the socket non-blocking" because
they don't have control over the non-block setting of things like the
sockets used to resolve DNS deep inside of the resolver libraries in
libc.

With that in mind I coded up the patch below with some help from
Herbert Xu which provides packet-drop behavior during larval state
resolution, controllable via sysctl and off by default.

This lays the framework to either:

1) Make this default at some point or...

2) Move this logic into xfrm{4,6}_policy.c and implement the
ARP-like resolution queue we've all been dreaming of.
The idea would be to queue packets to the policy, then
once the larval state is resolved by the key manager we
re-resolve the route and push the packets out. The
packets would timeout if the rule didn't get resolved
in a certain amount of time.

Signed-off-by: David S. Miller

David S. Miller
2007-05-25 09:17:54 +0800