Eric Lee / smarc-fsl-linux-kernel

01 Nov, 2007

16 commits

49259d34c [IRDA] IRNET: Fix build when TCGETS2 is defined.

Signed-off-by: David S. Miller

David S. Miller
2007-11-01 17:26:38 +0800
3b582cc14 [NET]: docbook fixes for netif_ functions

Documentation updates for network interfaces.

1. Add doc for netif_napi_add
2. Remove doc for unused returns from netif_rx
3. Add doc for netif_receive_skb

[ Incorporated minor mods from Randy Dunlap -DaveM ]

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

Stephen Hemminger
2007-11-01 17:21:47 +0800
d57a9212e [NET]: Hide the net_ns kmem cache

This cache is only required to create new namespaces,
but we won't have them in CONFIG_NET_NS=n case.

Hide it under the appropriate ifdef.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:46:50 +0800
1a2ee93d2 [NET]: Mark the setup_net as __net_init

The setup_net is called for the init net namespace
only (in the CONFIG_NET_NS=n of course) from the __init
function, so mark it as __net_init to disappear with the
caller after the boot.

Yet again, in the perfect world this has to be under
#ifdef CONFIG_NET_NS, but it isn't guaranteed that every
subsystem is registered *after* the init_net_ns is set
up. After we are sure, that we don't start registering
them before the init net setup, we'll be able to move
this code under the ifdef.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:45:59 +0800
6a1a3b9f6 [NET]: Hide the dead code in the net_namespace.c

The namespace creation/destruction code is never called
if the CONFIG_NET_NS is n, so it's OK to move it under
appropriate ifdef.

The copy_net_ns() in the "n" case checks for flags and
returns -EINVAL when new net ns is requested. In a perfect
world this stub must be in net_namespace.h, but this
function needs to know the CLONE_NEWNET value and thus
requires sched.h. On the other hand this header is to be
injected into almost every .c file in the networking code,
and making all this code depend on the sched.h is a
suicidal attempt.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:44:50 +0800
1dba323b3 [NETNS]: Make the init/exit hooks checks outside the loop

When the new pernet something (subsys, device or operations) is
being registered, the init callback is to be called for each
namespace that currently exists in the system. During the
unregister, the same is to be done with the exit callback.

However, not every pernet something has both calls, but the
check for the appropriate pointer to be not NULL is performed
inside the for_each_net() loop.

This is (at least) strange, so tune this.
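
A minimal sketch of the hoisting described above, written as plain C rather than kernel code. The names `struct net_like`, `struct pernet_ops_like`, `register_ops()` and `count_init()` are hypothetical stand-ins, and the unwinding that a real registration path would do on failure is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a namespace list and a set of pernet hooks. */
struct net_like {
    int id;
    struct net_like *next;
};

struct pernet_ops_like {
    int (*init)(struct net_like *net);
    void (*exit)(struct net_like *net);
};

static int init_calls;

/* Example hook that only counts how often it is invoked. */
static int count_init(struct net_like *net)
{
    (void)net;
    init_calls++;
    return 0;
}

/* The NULL check runs once, before the loop, instead of once per namespace. */
static int register_ops(struct pernet_ops_like *ops, struct net_like *first)
{
    struct net_like *net;
    int err;

    if (!ops->init)
        return 0;

    for (net = first; net; net = net->next) {
        err = ops->init(net);
        if (err)
            return err;
    }
    return 0;
}
```

The unregister side would follow the same shape, with the `exit` pointer tested once before walking the namespaces.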

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:42:43 +0800
6257ff217 [NET]: Forget the zero_it argument of sk_alloc()

Finally, the zero_it argument can be completely removed from
the callers and from the function prototype.

Besides, fix the checkpatch.pl warnings about using the
assignments inside if-s.

This patch is rather big, and it is a part of the previous one.
I split it wishing to make the patches more readable. Hope
this particular split helped.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:39:31 +0800
154adbc84 [NET]: Remove bogus zero_it argument from sk_alloc

At this point nobody calls the sk_alloc() with zero_it == 0,
so remove unneeded checks from it.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:38:43 +0800
8fd1d178a [NET]: Make the sk_clone() lighter

The sk_prot_alloc() already performs all the stuff needed by the
sk_clone(). Besides, the sk_prot_alloc() requires about half as
many arguments as the sk_alloc() does, so call the sk_prot_alloc()
saving the stack a bit.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:37:32 +0800
2e4afe7b3 [NET]: Move some core sock setup into sk_prot_alloc

The security_sk_alloc() and the module_get is a part of the
object allocations - move it in the proper place.

Note, that since we do not reset the newly allocated sock
in the sk_alloc() (memset() is removed with the previous
patch) we can safely do this.

Also fix the error path in sk_prot_alloc() - release the security
context if needed.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:36:26 +0800
3f0666ee3 [NET]: Auto-zero the allocated sock object

We have a __GFP_ZERO flag that allocates a zeroed chunk of memory.
Use it in the sk_alloc() and avoid a hand-made memset().

This is a temporary patch that will help us in the nearest future :)
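
As a user-space analogy only (the kernel allocators and `__GFP_ZERO` are not used here), the change resembles replacing `malloc()` plus a hand-made `memset()` with `calloc()`. The names `struct sock_like`, `alloc_then_memset()` and `alloc_zeroed()` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the allocated object. */
struct sock_like {
    int state;
    unsigned long flags;
};

/* Old shape: allocate, then clear by hand. */
static struct sock_like *alloc_then_memset(void)
{
    struct sock_like *sk = malloc(sizeof(*sk));

    if (sk)
        memset(sk, 0, sizeof(*sk));
    return sk;
}

/* New shape: ask the allocator for zeroed memory in the first place. */
static struct sock_like *alloc_zeroed(void)
{
    return calloc(1, sizeof(struct sock_like));
}
```

Both helpers hand back an object whose bytes are all zero; the second simply leaves the clearing to the allocator.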

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:34:42 +0800
c308c1b20 [NET]: Cleanup the allocation/freeing of the sock object

The sock object is allocated either from the generic cache with
the kmalloc, or from the proc->slab cache.

Move this logic into an isolated set of helpers and make the
sk_alloc/sk_free look a bit nicer.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:33:50 +0800
1e2e6b89f [NET]: Move the get_net() from sock_copy()

The sock_copy() is supposed to just clone the socket. In a perfect
world it has to be just memcpy, but we have to handle the security
mark correctly. All the extra setup must be performed in sk_clone()
call, so move the get_net() into more proper place.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:31:26 +0800
f1a6c4da1 [NET]: Move the sock_copy() from the header

The sock_copy() call is not used outside the sock.c file,
so just move it into a sock.c

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-11-01 15:29:45 +0800
261ab365f [TCP]: Another TAGBITS -> SACKED_ACKED|LOST conversion

Similar to commit 3eec0047d9bdd, point of this is to avoid
skipping R-bit skbs.

Signed-off-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Ilpo Järvinen
2007-11-01 15:10:18 +0800
e56d6cd60 [TCP]: Process DSACKs that reside within a SACK block

DSACKs inside another SACK block were missed if start_seq of DSACK
was larger than SACK block's because sorting prioritizes full
processing of the SACK block before DSACK. After SACK block
sorting situation is like this:

SSSSSSSSS
     D
            SSSSSS
                   SSSSSSS

Because write_queue is walked in-order, when the first SACK block
has been processed, TCP is already past the skb for which the
DSACK arrived and we haven't taught it to backtrack (nor should
we), so TCP just continues processing by going to the next SACK
block after the DSACK (if any).

Whenever such DSACK is present, do an embedded checking during
the previous SACK block.

If the DSACK is below snd_una, there won't be overlapping SACK
block, and thus no problem in that case. Also if start_seq of
the DSACK is equal to the actual block, it will be processed
first.

Tested this by using netem to duplicate 15% of packets, and
by printing SACK block when found_dup_sack is true and the
selected skb in the dup_sack = 1 branch (if taken):

SACK block 0: 4344-5792 (relative to snd_una 2019137317)
SACK block 1: 4344-5792 (relative to snd_una 2019137317)

equal start seqnos => next_dup = 0, dup_sack = 1 won't occur...

SACK block 0: 5792-7240 (relative to snd_una 2019214061)
SACK block 1: 2896-7240 (relative to snd_una 2019214061)
DSACK skb match 5792-7240 (relative to snd_una)

...and next_dup = 1 case (after the not shown start_seq sort),
went to dup_sack = 1 branch.
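
The condition under which the embedded check matters can be sketched as follows. This is an illustration, not the code from the patch: `seq_before()` and `dsack_needs_embedded_check()` are hypothetical helpers, and the comparison uses the usual wrap-safe signed-difference trick for 32-bit sequence numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Wrap-safe sequence comparison: true if a comes before b. */
static int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/*
 * Hypothetical helper: the DSACK is only at risk of being skipped when it
 * starts strictly after the start of a SACK block and before that block's
 * end. With equal start sequence numbers the DSACK is processed first.
 */
static int dsack_needs_embedded_check(uint32_t dsack_start,
                                      uint32_t sack_start,
                                      uint32_t sack_end)
{
    return seq_before(sack_start, dsack_start) &&
           seq_before(dsack_start, sack_end);
}
```

With the relative values from the test output above, a DSACK at 5792-7240 inside a SACK block 2896-7240 qualifies, while the 4344-5792 pair with equal start sequence numbers does not.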

Signed-off-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Ilpo Järvinen
2007-11-01 15:09:37 +0800

31 Oct, 2007

7 commits

298bb6217 [AF_KEY]: suppress a warning for 64k pages.

On PowerPC allmodconfig build we get this:

net/key/af_key.c:400: warning: comparison is always false due to limited range of data type

Signed-off-by: Stephen Rothwell
Signed-off-by: David S. Miller

Stephen Rothwell
2007-10-31 14:57:05 +0800
51c739d1f [NET]: Fix incorrect sg_mark_end() calls.

This fixes scatterlist corruptions added by

commit 68e3f5dd4db62619fdbe520d36c9ebf62e672256
[CRYPTO] users: Fix up scatterlist conversion errors

The issue is that the code calls sg_mark_end() which clobbers the
sg_page() pointer of the final scatterlist entry.

The first part of the fix makes skb_to_sgvec() do __sg_mark_end().

After considering all skb_to_sgvec() call sites the most correct
solution is to call __sg_mark_end() in skb_to_sgvec() since that is
what all of the callers would end up doing anyways.

I suspect this might have fixed some problems in virtio_net which is
the sole non-crypto user of skb_to_sgvec().

Other similar sg_mark_end() cases were converted over to
__sg_mark_end() as well.

Arguably sg_mark_end() is a poorly named function because it doesn't
just "mark", it clears out the page pointer as a side effect, which is
what led to these bugs in the first place.

The one remaining plain sg_mark_end() call is in scsi_alloc_sgtable()
and arguably it could be converted to __sg_mark_end() if only so that
we can delete this confusing interface from linux/scatterlist.h

Signed-off-by: David S. Miller

David S. Miller
2007-10-31 12:29:29 +0800
07afa0402 [IPVS]: Remove /proc/net/ip_vs_lblcr

It's under CONFIG_IP_VS_LBLCR_DEBUG option which never existed.

Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller

Alexey Dobriyan
2007-10-31 12:16:27 +0800
1675c7b25 [IPV6]: remove duplicate call to proc_net_remove

The file /proc/net/if_inet6 is removed twice.
First time in:
    inet6_exit
        -> addrconf_cleanup
And followed a few lines after by:
    inet6_exit
        -> if6_proc_exit

Signed-off-by: Daniel Lezcano
Signed-off-by: David S. Miller

Daniel Lezcano
2007-10-31 12:16:24 +0800
310928d96 [NETNS]: fix net released by rcu callback

When a network namespace reference is held by a network subsystem,
and when this reference is decremented in a rcu update callback, we
must ensure that there is no more outstanding rcu update before
trying to free the network namespace.

In the normal case, the rcu_barrier is called when the network namespace
is exiting in the cleanup_net function.

But when a network namespace creation fails, and the subsystems are
undone (like the cleanup), the rcu_barrier is missing.

This patch adds the missing rcu_barrier.

Signed-off-by: Daniel Lezcano
Signed-off-by: David S. Miller

Daniel Lezcano
2007-10-31 12:16:21 +0800
93ee31f14 [NET]: Fix free_netdev on register_netdev failure.

Point 1:
The unregistering of a network device schedules a netdev_run_todo.
This function calls dev->destructor when it is set and the
destructor calls free_netdev.

Point 2:
In the case of an initialization of a network device the usual code
is:
* alloc_netdev
* register_netdev
  -> if this one fails, call free_netdev and exit with error.

Point 3:
In the register_netdevice function at the later state, when the device
is at the registered state, a call to the netdevice_notifiers is made.
If one of the notifications returns an error, a rollback to the
registered state is done using unregister_netdevice.

Conclusion:
When a network device fails to register during initialization because
one network subsystem returned an error during a notification call
chain, the network device is freed twice because of points 1 and 2.
The second free_netdev will be done with an invalid pointer.

Proposed solution:
The following patch moves all the code of unregister_netdevice *except*
the call to net_set_todo, to a new function "rollback_registered".

The following functions are changed in this way:
* register_netdevice: calls rollback_registered when a notification fails
* unregister_netdevice: calls rollback_registered + net_set_todo; the call
  to net_set_todo now comes last. Since it just adds an element to a
  list, that should not break anything.
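
A simplified model of the split, assuming hypothetical names (`struct netdev_like`, `register_dev()`, `unregister_dev()`) and reducing the todo list to a single flag. It only illustrates why the failure path must not schedule the deferred destructor; it is not the kernel implementation.

```c
#include <assert.h>

/* Hypothetical model of a device: registration state plus a "todo" flag. */
struct netdev_like {
    int registered;
    int on_todo;   /* set when the deferred destructor would run later */
};

/* Undo the registration without scheduling any deferred work. */
static void rollback_registered(struct netdev_like *dev)
{
    dev->registered = 0;
}

/* Normal teardown: roll back, then queue the deferred work last. */
static void unregister_dev(struct netdev_like *dev)
{
    rollback_registered(dev);
    dev->on_todo = 1;
}

/* Registration: on a notifier error, roll back only. */
static int register_dev(struct netdev_like *dev, int notifier_err)
{
    dev->registered = 1;
    if (notifier_err) {
        rollback_registered(dev);
        return notifier_err;
    }
    return 0;
}
```

After a failed `register_dev()` nothing is queued, so the caller's own free is the only one that happens.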

Signed-off-by: Daniel Lezcano
Signed-off-by: David S. Miller

Daniel Lezcano
2007-10-31 12:16:18 +0800
e403149c9 Kbuild/doc: fix links to Documentation files

Fix links to files in Documentation/* in various Kconfig files

Signed-off-by: Dirk Hohndel
Signed-off-by: Linus Torvalds

Dirk Hohndel
2007-10-31 05:26:30 +0800

30 Oct, 2007

11 commits

521c2a43b [SUNRPC]: fix rpc debugging

Commit baa3a2a0d24ebcf1c451bec8e5bee3d3467f4cbb, by removing initialization
of the ctl_name field, broke this conditional, preventing the display of
rpc_tasks that you previously got when turning on rpc debugging.

[akpm@linux-foundation.org: coding-style fixes]

Signed-off-by: J. Bruce Fields
Acked-by: "Eric W. Biederman"
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

J. Bruce Fields
2007-10-30 16:07:15 +0800
0ccfe6180 [TCP]: Saner thash_entries default with much memory.

On systems with a very large amount of memory, the heuristics in
alloc_large_system_hash() result in a very large TCP established hash
table: 16 millions of entries for a 128 GB ia64 system. This makes
reading from /proc/net/tcp pretty slow (well over a second) and as a
result netstat is slow on these machines. I know that /proc/net/tcp is
deprecated in favor of tcp_diag, however at the moment netstat only
knows of the former.

I am skeptical that such a large TCP established hash is often needed.
Just because a system has a lot of memory doesn't imply that it will
have several millions of concurrent TCP connections. Thus I believe
that we should put an arbitrary high limit to the size of the TCP
established hash by default. Users who really need a bigger hash can
always use the thash_entries boot parameter to get more.

I propose 2 millions of entries as the arbitrary high limit. This
makes /proc/net/tcp reasonably fast on the system in question (0.2 s)
while being still large enough for me to be confident that network
performance won't suffer.

This is just one way to limit the hash size, there are others; I am not
familiar enough with the TCP code to decide which is best. Thus, I
would welcome the proposals of alternatives.

[ 2 million is still too large, thus I've modified the limit in the
change to be '512 * 1024'. -DaveM ]
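
The resulting policy can be sketched like this, assuming (as the message suggests) that an explicit thash_entries value overrides the default cap. `pick_ehash_entries()` and `DEFAULT_MAX_ENTRIES` are hypothetical names, not identifiers from the patch.

```c
#include <assert.h>

/* Default upper bound chosen in the final change: 512 * 1024 entries. */
#define DEFAULT_MAX_ENTRIES (512UL * 1024UL)

/*
 * Hypothetical helper: an explicit user request is honoured as-is,
 * otherwise the heuristic size is clamped to the default limit.
 */
static unsigned long pick_ehash_entries(unsigned long heuristic,
                                        unsigned long user_entries)
{
    if (user_entries)
        return user_entries;
    if (heuristic > DEFAULT_MAX_ENTRIES)
        return DEFAULT_MAX_ENTRIES;
    return heuristic;
}
```

For the 128 GB system mentioned above, a heuristic result of 16 million entries would be reduced to 524288 unless the boot parameter asks for more.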

Signed-off-by: Jean Delvare
Signed-off-by: David S. Miller

Jean Delvare
2007-10-30 15:59:25 +0800
e08a132b0 [SUNRPC] rpc_rdma: we need to cast u64 to unsigned long long for printing

as some architectures have unsigned long for u64.

net/sunrpc/xprtrdma/rpc_rdma.c: In function 'rpcrdma_create_chunks':
net/sunrpc/xprtrdma/rpc_rdma.c:222: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'
net/sunrpc/xprtrdma/rpc_rdma.c:234: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'u64'
net/sunrpc/xprtrdma/rpc_rdma.c: In function 'rpcrdma_count_chunks':
net/sunrpc/xprtrdma/rpc_rdma.c:577: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'u64'

Noticed on PowerPC pseries_defconfig build.
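
The same fix in portable user-space C, as a minimal sketch: a 64-bit value is cast to `unsigned long long` so that `%llx` matches regardless of how the 64-bit type is defined on the platform. `format_hex64()` is a hypothetical helper, and `uint64_t` stands in for the kernel's `u64`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a 64-bit value as hex. The cast makes the argument type match
 * "%llx" even where the 64-bit type is a plain unsigned long.
 */
static int format_hex64(char *buf, size_t len, uint64_t value)
{
    return snprintf(buf, len, "%llx", (unsigned long long)value);
}
```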

Signed-off-by: Stephen Rothwell
Signed-off-by: David S. Miller

Stephen Rothwell
2007-10-30 15:44:32 +0800
064f3605b [IPv4] SNMP: Refer correct memory location to display ICMP out-going statistics

While displaying ICMP out-going statistics as Out counters in
/proc/net/snmp, the memory location for ICMP in-coming statistics
was referred by mistake.

Signed-off-by: Mitsuru Chinen
Acked-by: David L Stevens
Signed-off-by: David S. Miller

Mitsuru Chinen
2007-10-30 13:37:36 +0800
bf3c23d17 [NET]: Fix error reporting in sys_socketpair().

If either of the two sock_alloc_fd() calls fail, we
forget to update 'err' and thus we'll erroneously
return zero in these cases.

Based upon a report and patch from Rich Paul, and
commentary from Chuck Ebbert.
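
The bug pattern can be reproduced in a few lines of ordinary C. This is a sketch under assumptions: `fake_alloc_fd()` is a hypothetical stand-in for sock_alloc_fd(), and the surrounding socket setup is omitted.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in: returns a descriptor or a negative errno. */
static int fake_alloc_fd(int fail)
{
    return fail ? -EMFILE : 3;
}

/* Buggy shape: the failure is detected but 'err' is never updated. */
static int socketpair_buggy(int fail_second)
{
    int err = 0;   /* earlier steps succeeded */
    int fd1, fd2;

    fd1 = fake_alloc_fd(0);
    if (fd1 < 0)
        goto out;
    fd2 = fake_alloc_fd(fail_second);
    if (fd2 < 0)
        goto out;  /* err is still 0 here */
out:
    return err;
}

/* Fixed shape: propagate the negative return value before bailing out. */
static int socketpair_fixed(int fail_second)
{
    int err = 0;
    int fd1, fd2;

    fd1 = fake_alloc_fd(0);
    if (fd1 < 0) {
        err = fd1;
        goto out;
    }
    fd2 = fake_alloc_fd(fail_second);
    if (fd2 < 0) {
        err = fd2;
        goto out;
    }
out:
    return err;
}
```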

Signed-off-by: David S. Miller

David S. Miller
2007-10-30 13:37:34 +0800
29b67497f [NETFILTER]: nf_ct_alloc_hashtable(): use __GFP_NOWARN

This allocation is expected to fail and we handle it by fallback to vmalloc().

So don't scare people with nasty messages like
http://bugzilla.kernel.org/show_bug.cgi?id=9190

Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

Andrew Morton
2007-10-30 13:37:31 +0800
0a7606c12 [NET]: Fix race between poll_napi() and net_rx_action()

netpoll_poll_lock() synchronizes the ->poll() invocation
code paths, but once we have the lock we have to make
sure that NAPI_STATE_SCHED is still set. Otherwise we
get:

cpu 0                   cpu 1

net_rx_action()         poll_napi()
netpoll_poll_lock()     ... spin on ->poll_lock
->poll()
  netif_rx_complete
netpoll_poll_unlock()   acquire ->poll_lock()
                        ->poll()
                         netif_rx_complete()
                         CRASH

Based upon a bug report from Tina Yang.
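
The idea of re-checking the scheduled state once the lock is held can be sketched in single-threaded form. `struct napi_like`, `poll_complete()` and `poll_locked()` are hypothetical; the lock itself is only implied by the comment, and no real NAPI interfaces are used.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a "scheduled" bit and a count of poll runs. */
struct napi_like {
    bool scheduled;
    int polls;
};

/* Completing a poll clears the scheduled state. */
static void poll_complete(struct napi_like *n)
{
    n->scheduled = false;
}

/*
 * Caller is assumed to hold the poll lock. The state is checked again
 * after the lock was taken, because the previous holder may already have
 * polled and completed. Returns true if a poll actually ran.
 */
static bool poll_locked(struct napi_like *n)
{
    if (!n->scheduled)
        return false;
    n->polls++;
    poll_complete(n);
    return true;
}
```

Calling `poll_locked()` twice in a row models the two CPUs in the diagram: only the first call polls, and the second backs off instead of completing again.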

Signed-off-by: David S. Miller

David S. Miller
2007-10-30 13:37:28 +0800
b0a713e9e [TCP] MD5: Remove some more unnecessary casting.

While reviewing the tcp_md5-related code further I came across
another two of these casts which you probably have missed. I don't
actually think that they pose a problem by now, but as you said we
should remove them.

Signed-off-by: Matthias M. Dellweg
Signed-off-by: David S. Miller

Matthias M. Dellweg
2007-10-30 13:37:27 +0800
c940587bf [TCP] vegas: Fix a bug in disabling slow start by gamma parameter.

TCP Vegas implementation has a bug in the process of disabling
slow-start with gamma parameter. The bug may lead to extreme
unfairness in the presence of early packet loss. See details in:
http://www.cs.caltech.edu/~weixl/technical/ns2linux/known_linux/index.html#vegas

Switch the order of "if (tp->snd_cwnd <= tp->snd_ssthresh)" statement
and "if (diff > gamma)" statement to eliminate the problem.

Signed-off-by: Xiaoliang (David) Wei
Signed-off-by: David S. Miller

Xiaoliang (David) Wei
2007-10-30 13:37:25 +0800
5c81833c2 [IPVS]: use proper timeout instead of fixed value

Instead of using the default timeout of 3 minutes, this uses the timeout
specific to the protocol used for the connection. The 3 minute timeout
seems somewhat arbitrary (though I know it is used other places in the
ipvs code) and when failing over it would be much nicer to use one of
the configured timeout values.

Signed-off-by: Andy Gospodarek
Acked-by: Simon Horman
Signed-off-by: David S. Miller

Andy Gospodarek
2007-10-30 13:37:23 +0800
ad02ac145 [IPV6] NDISC: Fix setting base_reachable_time_ms variable.

This bug was introduced by the commit
d12af679bcf8995a237560bdf7a4d734f8df5dbb (sysctl: fix neighbour table
sysctls).

Signed-off-by: YOSHIFUJI Hideaki
Signed-off-by: David S. Miller

YOSHIFUJI Hideaki
2007-10-30 13:37:22 +0800

29 Oct, 2007

2 commits

d06f60826 SCTP endianness annotations regression

Signed-off-by: Al Viro
Acked-by: David S. Miller
Signed-off-by: Linus Torvalds

Al Viro
2007-10-29 22:41:32 +0800
2d8a97266 SUNRPC endianness annotations

rpcrdma stuff lacks endianness annotations for on-the-wire data.

Signed-off-by: Al Viro
Acked-by: David S. Miller
Signed-off-by: Linus Torvalds

Al Viro
2007-10-29 22:41:32 +0800

27 Oct, 2007

4 commits

68e3f5dd4 [CRYPTO] users: Fix up scatterlist conversion errors

This patch fixes the errors made in the users of the crypto layer during
the sg_init_table conversion. It also adds a few conversions that were
missing altogether.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-10-27 15:52:07 +0800
ceaa79c43 [NETNS]: Fix get_net_ns_by_pid

The pid namespace patches changed the semantics of
find_task_by_pid without breaking the compile resulting
in get_net_ns_by_pid doing the wrong thing.

So switch to using the intended find_task_by_vpid.

Combined with Denis' earlier patch to make netlink traffic
fully synchronous the inadvertent race I introduced with
accessing current is actually removed.

Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller

Eric W. Biederman
2007-10-27 13:56:12 +0800
2b008b0a8 [NET]: Marking struct pernet_operations __net_initdata was inappropriate

It is not safe to place struct pernet_operations in a special section.
We need struct pernet_operations to last until we call unregister_pernet_subsys.
Which doesn't happen until module unload.

So marking struct pernet_operations is a disaster for modules in two ways.
- We discard it before we call the exit method it points to.
- Because I keep struct pernet_operations on a linked list discarding
  it for compiled in code removes elements in the middle of a linked
  list and does horrible things for linked insert.

So this looks safe assuming __exit_refok is not discarded
for modules.

Signed-off-by: Eric W. Biederman
Signed-off-by: David S. Miller

Eric W. Biederman
2007-10-27 13:54:53 +0800
72998d8c8 [INET] ESP: Must #include <linux/scatterlist.h>

This patch fixes the following compile errors in some configurations:

...
CC net/ipv4/esp4.o
/home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c: In function 'esp_output':
/home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv4/esp4.c:113: error: implicit declaration of function 'sg_init_table'
make[3]: *** [net/ipv4/esp4.o] Error 1
...
/home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c: In function 'esp6_output':
/home/bunk/linux/kernel-2.6/git/linux-2.6/net/ipv6/esp6.c:112: error: implicit declaration of function 'sg_init_table'
make[3]: *** [net/ipv6/esp6.o] Error 1

Signed-off-by: Adrian Bunk
Signed-off-by: David S. Miller

Adrian Bunk
2007-10-27 13:53:58 +0800