Eric Lee / smarc-fsl-linux-kernel

19 Oct, 2007

6 commits

f429cd37a sysctl: properly register the irda binary sysctl numbers

Grumble. These numbers should have been in sysctl.h from the beginning if we
ever expected anyone to use them. Oh well, put them there now so we can find
them and make maintenance easier.

Signed-off-by: Eric W. Biederman
Acked-by: Samuel Ortiz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-10-19 05:37:23 +0800
064b5bba0 sysctl: remove broken netfilter binary sysctls

No one has bothered to set the strategy routine for the netfilter sysctls that
return jiffies to be sysctl_jiffies.

So it appears the sys_sysctl path is unused and untested, so this patch
removes the binary sysctl numbers.

This fixes the netfilter oops in 2.6.23-rc2-mm2 for me.

Signed-off-by: Eric W. Biederman
Cc: Patrick McHardy
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-10-19 05:37:23 +0800
49641b58a sysctl: ipv4 remove binary sysctl paths where they are broken

Currently tcp_available_congestion_control does not even attempt being read
from sys_sysctl, and ipfrag_max_dist, while it works, allows setting of invalid
values using sys_sysctl.

So just kill the binary sys_sysctl support for these sysctls. If the support
is not important enough to test and get right it probably isn't important
enough to keep.

Signed-off-by: Eric W. Biederman
Cc: Alexey Dobriyan
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-10-19 05:37:23 +0800
baa3a2a0d sysctl: remove broken sunrpc debug binary sysctls

This is debug code so no need to support binary sysctl, and the binary sysctls
as they were written were not consistent with what showed up in /proc so
remove the binary sysctl support.

Signed-off-by: Eric W. Biederman
Cc: Alexey Dobriyan
Cc: "David S. Miller"
Cc: Trond Myklebust
Cc: Neil Brown
Cc: "J. Bruce Fields"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-10-19 05:37:22 +0800
428b367bf sysctl: ipv6 route flushing (kill binary path)

We don't properly support the sysctl binary path for flushing the ipv6
routes. So remove support for a binary path.

Signed-off-by: Eric W. Biederman
Cc: Alexey Dobriyan
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-10-19 05:37:22 +0800
d12af679b sysctl: fix neighbour table sysctls.

- In ipv6, fix ndisc_ifinfo_sysctl_change so it doesn't depend on binary
sysctl names for a function that works with proc.

- In neighbour.c reorder the table to put the possibly unused entries
at the end so we can remove them by terminating the table early.

- In neighbour.c kill the entries with questionable binary sysctl
handling behavior.

- In neighbour.c, if we don't have a strategy routine, remove the
binary path, so we don't run the default sysctl strategy routine
on data that is not ready for it.

Signed-off-by: Eric W. Biederman
Cc: Alexey Dobriyan
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-10-19 05:37:22 +0800

18 Oct, 2007

4 commits

982c37cfb 9p: remove sysctl

A sysctl method was added to enable and disable debugging levels. After
further review, it was decided that there are better approaches to doing this
and the sysctl methodology isn't really desirable. This patch removes the
sysctl code from 9p.

Signed-off-by: Eric Van Hensbergen

Eric Van Hensbergen
2007-10-18 03:35:15 +0800
fb0466c3a 9p: fix bad kconfig cross-dependency

This patch moves transport dynamic registration and matching to the net
module to prevent a bad Kconfig dependency between the net and fs 9p modules.

Signed-off-by: Eric Van Hensbergen

Eric Van Hensbergen
2007-10-18 03:31:07 +0800
ba17674fe 9p: attach-per-user

The 9P2000 protocol requires the authentication and permission checks to be
done in the file server. For that reason every user that accesses the file
server tree has to authenticate and attach to the server separately.
Multiple users can share the same connection to the server.

Currently v9fs does a single attach and executes all I/O operations as a
single user. This makes using v9fs in a multiuser environment unsafe, as it
depends on the client doing the permission checking.

This patch improves the 9P2000 support by allowing every user to attach
separately. The patch defines three modes of access (new mount option
'access'):

- attach-per-user (access=user) (default mode for 9P2000.u)
If a user tries to access a file served by v9fs for the first time, v9fs
sends an attach command to the server (Tattach) specifying the user. If
the attach succeeds, the user can access the v9fs tree.
As there is no uname->uid (string->integer) mapping yet, this mode works
only with the 9P2000.u dialect.

- allow only one user to access the tree (access=<uid>)
Only the user with uid <uid> can access the v9fs tree. Other users that
attempt to access it will get an EPERM error.

- do all operations as a single user (access=any) (default for 9P2000)
V9fs does a single attach and all operations are done as a single user.
If this mode is selected, the v9fs behavior is identical with the current
one.
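
The three access modes above could be represented roughly as follows. This is a userspace sketch, not the v9fs code: the names demo_access, demo_parse_access and demo_may_access are hypothetical, and the assumption is only that the option value is "user", "any", or a decimal uid, as the list describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the three 'access' modes described above. */
enum demo_access_mode { DEMO_ACCESS_USER, DEMO_ACCESS_SINGLE, DEMO_ACCESS_ANY };

struct demo_access {
    enum demo_access_mode mode;
    unsigned long uid;          /* only meaningful for DEMO_ACCESS_SINGLE */
};

/* Parse the value of an "access=" option: "user", "any", or a decimal uid. */
static int demo_parse_access(const char *value, struct demo_access *out)
{
    char *end;
    unsigned long uid;

    if (strcmp(value, "user") == 0) {
        out->mode = DEMO_ACCESS_USER;
        return 0;
    }
    if (strcmp(value, "any") == 0) {
        out->mode = DEMO_ACCESS_ANY;
        return 0;
    }

    /* Anything else must be a plain decimal uid. */
    if (value[0] < '0' || value[0] > '9')
        return -EINVAL;
    errno = 0;
    uid = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0')
        return -EINVAL;

    out->mode = DEMO_ACCESS_SINGLE;
    out->uid = uid;
    return 0;
}

/* In single-user mode, every uid other than the configured one gets EPERM. */
static int demo_may_access(const struct demo_access *a, unsigned long uid)
{
    if (a->mode == DEMO_ACCESS_SINGLE && a->uid != uid)
        return -EPERM;
    return 0;
}
```

In the other two modes the sketch allows every caller through, since the server-side attach (per user, or once for everybody) is what decides access there.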

Signed-off-by: Latchesar Ionkov
Signed-off-by: Eric Van Hensbergen

Latchesar Ionkov
2007-10-18 03:31:07 +0800
a80d923e1 9p: Make transports dynamic

This patch abstracts out the interfaces to underlying transports so that
new transports can be added as modules. This should also allow kernel
configuration of transports without ifdef-hell.

Signed-off-by: Eric Van Hensbergen

Eric Van Hensbergen
2007-10-18 03:31:07 +0800

17 Oct, 2007

5 commits

ce8d2cdf3 r/o bind mounts: filesystem helpers for custom 'struct file's

Why do we need r/o bind mounts?

This feature allows a read-only view into a read-write filesystem. In the
process of doing that, it also provides infrastructure for keeping track of
the number of writers to any given mount.

This has a number of uses. It allows chroots to have parts of filesystems
writable. It will be useful for containers in the future because users may
have root inside a container, but should not be allowed to write to
some filesystems. This also replaces patches that vserver has had out of the
tree for several years.

It allows security enhancement by making sure that parts of your filesystem
are read-only (such as when you don't trust your FTP server), when you don't want
to have entire new filesystems mounted, or when you want atime selectively
updated. I've been using the following script to test that the feature is
working as desired. It takes a directory and makes a regular bind and a r/o
bind mount of it. It then performs some normal filesystem operations on the
three directories, including ones that are expected to fail, like creating a
file on the r/o mount.

This patch:

Some filesystems forgo the vfs and may_open() and create their own 'struct
file's.

This patch creates a couple of helper functions which can be used by these
filesystems, and will provide a unified place which the r/o bind mount code
may patch.

Also, rename an existing, static-scope init_file() to a less generic name.

Signed-off-by: Dave Hansen
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dave Hansen
2007-10-17 23:43:04 +0800
76181c134 KEYS: Make request_key() and co fundamentally asynchronous

Make request_key() and co fundamentally asynchronous to make it easier for
NFS to make use of them. There are now accessor functions that do
asynchronous constructions, a wait function to wait for construction to
complete, and a completion function for the key type to indicate completion
of construction.

Note that the construction queue is now gone. Instead, keys under
construction are linked in to the appropriate keyring in advance, and
anyone encountering one must wait for it to be complete before they can use
it. This is done automatically for userspace.

The following auxiliary changes are also made:

(1) Key type implementation stuff is split from linux/key.h into
linux/key-type.h.

(2) AF_RXRPC provides a way to allocate null rxrpc-type keys so that AFS does
not need to call key_instantiate_and_link() directly.

(3) Adjust the debugging macros so that they're -Wformat checked even if
they are disabled, and make it so they can be enabled simply by defining
__KDEBUG to be consistent with other code of mine.

(4) Documentation.
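
The point about debugging macros being -Wformat checked even when disabled can be illustrated with a userspace sketch. It assumes nothing about the real macro bodies in the keys code: kdebug here is built on fprintf, and demo_key_lookup is a hypothetical caller. Only the __KDEBUG switch name comes from the message above.

```c
#include <assert.h>
#include <stdio.h>

/* Uncomment to enable the debugging output. */
/* #define __KDEBUG */

#ifdef __KDEBUG
/* Enabled: print the message followed by a newline. */
#define kdebug(...) \
    do { fprintf(stderr, __VA_ARGS__); fputc('\n', stderr); } while (0)
#else
/*
 * Disabled: the call sits behind "if (0)", so it never runs and the
 * compiler can drop it, but the format string is still checked against
 * its arguments at compile time.
 */
#define kdebug(...) \
    do { if (0) fprintf(stderr, __VA_ARGS__); } while (0)
#endif

/* Hypothetical caller: a mismatched format here would warn in both builds. */
static int demo_key_lookup(int serial)
{
    kdebug("looking up key %d", serial);
    return serial >= 0;
}
```

With an empty disabled macro, a wrong format specifier would only show up once somebody turned debugging on; keeping the call visible to the compiler avoids that.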

[alan@lxorguk.ukuu.org.uk: keys: missing word in documentation]
Signed-off-by: David Howells
Signed-off-by: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2007-10-17 23:42:57 +0800
af49d9248 Remove "unsafe" from module struct

Adrian Bunk points out that "unsafe" was used to mark modules touched by
the deprecated MOD_INC_USE_COUNT interface, which has long gone. It's time
to remove the member from the module structure, as well.

If you want a module which can't unload, don't register an exit function.

(Vlad Yasevich says SCTP is now safe to unload, so just remove the
__unsafe there).

Signed-off-by: Rusty Russell
Acked-by: Shannon Nelson
Acked-by: Dan Williams
Acked-by: Vlad Yasevich
Cc: Sridhar Samudrala
Cc: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rusty Russell
2007-10-17 23:42:49 +0800
4ba9b9d0b Slab API: remove useless ctor parameter and reorder parameters

Slab constructors currently have a flags parameter that is never used. And
the order of the arguments is opposite to other slab functions. The object
pointer is placed before the kmem_cache pointer.

Convert

ctor(void *object, struct kmem_cache *s, unsigned long flags)

to

ctor(struct kmem_cache *s, void *object)

throughout the kernel
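
A constructor written against the new signature might look like the sketch below. It is a userspace stand-in: this struct kmem_cache is a simplified mock with made-up fields, and demo_obj, demo_ctor and demo_cache_init_object are hypothetical names. Only the parameter order (cache first, object second, no flags) is taken from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified mock of the kernel's struct kmem_cache (fields are assumptions). */
struct kmem_cache {
    size_t object_size;
    void (*ctor)(struct kmem_cache *s, void *object);
};

/* Hypothetical object type stored in the cache. */
struct demo_obj {
    int refcount;
    char name[16];
};

/* New-style constructor: cache pointer first, object second, no flags. */
static void demo_ctor(struct kmem_cache *s, void *object)
{
    struct demo_obj *obj = object;

    memset(obj, 0, s->object_size);
    obj->refcount = 1;
}

/* Hypothetical helper mimicking how an allocator would invoke the ctor. */
static void demo_cache_init_object(struct kmem_cache *s, void *object)
{
    if (s->ctor)
        s->ctor(s, object);
}
```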

[akpm@linux-foundation.org: coupla fixes]
Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-10-17 23:42:45 +0800
a56daeb7d net/sunrpc/xprtrdma/verbs.c printk warning fix

sparc64:

net/sunrpc/xprtrdma/verbs.c:1264: warning: long long unsigned int format, u64 arg (arg 3)
net/sunrpc/xprtrdma/verbs.c:1264: warning: long long unsigned int format, u64 arg (arg 4)

Cc: Trond Myklebust
Cc: "David S. Miller"
Cc: "J. Bruce Fields"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2007-10-17 00:43:23 +0800

16 Oct, 2007

25 commits

4acad72de [IPV6]: Consolidate the ip6_pol_route_(input|output) pair

The difference in both functions is in the "id" passed to
the rt6_select, so just pass it as an extra argument from
two outer helpers.

This is minus 60 lines of code and 360 bytes of .text

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 04:02:51 +0800
4ae289444 [NEIGH]: Ensure that pneigh_lookup is protected with RTNL

The pneigh_lookup is used to look up proxy entries and to
create them in case the lookup failed.

However, the "creation" code does not perform the re-lookup
after GFP_KERNEL allocation. This is done because the code
is expected to be protected with the RTNL lock, so add the
assertion (mainly to address future questions from new network
developers like me :) ).

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:54:15 +0800
f1673ca52 [INET]: kmalloc+memset -> kzalloc in frag_alloc_queue

kmalloc + memset -> kzalloc in frag_alloc_queue
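
The same transformation can be shown in userspace, with malloc/memset/calloc standing in for kmalloc/memset/kzalloc. This is an analogy rather than the patched kernel code; demo_queue and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a fragment queue. */
struct demo_queue {
    int len;
    void *fragments;
};

/* Before: allocate, then clear by hand (the kmalloc + memset shape). */
static struct demo_queue *demo_queue_alloc_old(void)
{
    struct demo_queue *q = malloc(sizeof(*q));

    if (!q)
        return NULL;
    memset(q, 0, sizeof(*q));
    return q;
}

/* After: one call that returns zeroed memory (the kzalloc shape). */
static struct demo_queue *demo_queue_alloc_new(void)
{
    return calloc(1, sizeof(struct demo_queue));
}
```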

Signed-off-by: Denis V. Lunev
Signed-off-by: David S. Miller

Denis V. Lunev
2007-10-16 03:53:13 +0800
e5bbef20e [IPV6]: Replace sk_buff ** with sk_buff * in input handlers

With all the users of the double pointers removed from the IPv6 input path,
this patch converts all occurrences of sk_buff ** to sk_buff * in IPv6 input
handlers.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-10-16 03:50:28 +0800
762cc4080 [INET]: Consolidate the xxx_put

These ones use the generic data types too, so move
them in one place.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:26:43 +0800
4b6cb5d8e [INET]: Small cleanup for xxx_put after evictor consolidation

After the evictor code is consolidated there is no need to
pass the extra pointer to the xxx_put() functions.

The only place where it made sense was the evictor code itself.

Maybe this change should go with the previous (or with the
next) patch, but I try to make them as short as
possible to simplify the review (but they are still large
anyway), so this change goes in a separate patch.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:26:43 +0800
8e7999c44 [INET]: Consolidate the xxx_evictor

The evictors collect some statistics for ipv4 and ipv6,
so make it return the number of evicted queues and account
them all at once in the caller.

The XXX_ADD_STATS_BH() macros are just for this case,
but maybe there are places in code, that can make use of
them as well.
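
The shape of this change, a generic evictor that returns how many queues it dropped so the caller can account for them in one step, could be sketched as follows. All names here (demo_frags, demo_evictor, demo_ip4_evictor, demo_reasm_fails) are hypothetical, the queue list is a plain singly linked list, and a simple counter stands in for the XXX_ADD_STATS_BH() macros.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified queue and per-protocol bookkeeping. */
struct demo_frag_queue {
    struct demo_frag_queue *next;
    int mem;                        /* memory charged to this queue */
};

struct demo_frags {
    struct demo_frag_queue *lru;    /* oldest queue first */
    int mem;                        /* total memory in use */
    int low_thresh;                 /* evict until mem drops to this */
};

/* Generic evictor: frees old queues and reports how many were evicted. */
static int demo_evictor(struct demo_frags *f)
{
    int evicted = 0;

    while (f->mem > f->low_thresh && f->lru) {
        struct demo_frag_queue *q = f->lru;

        f->lru = q->next;
        f->mem -= q->mem;
        free(q);
        evicted++;
    }
    return evicted;
}

/* Stand-in for a per-protocol statistics counter. */
static long demo_reasm_fails;

/* Protocol wrapper: account for all evicted queues at once. */
static void demo_ip4_evictor(struct demo_frags *f)
{
    demo_reasm_fails += demo_evictor(f);
}
```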

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:26:42 +0800
1e4b82873 [INET]: Consolidate the xxx_frag_destroy

To make it possible we need to know the exact frag queue
size for inet_frags->mem management and two callbacks:

* to destroy the skb (optional, used in conntracks only)
* to free the queue itself (mandatory, but later I plan to
move the allocation and the destruction of frag_queues
into the common place, so this callback will most likely
be optional too).

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:26:42 +0800
321a3a99e [INET]: Consolidate the xxx_secret_rebuild

This code works with the generic data types as well, so
move this into inet_fragment.c

This move makes it possible to hide the secret_timer
management and the secret_rebuild routine completely in
the inet_fragment.c

Introduce the ->hashfn() callback in inet_frags() to get
the hashfun for a given inet_frag_queue() object.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:26:41 +0800
277e650dd [INET]: Consolidate the xxx_frag_kill

Since all the xxx_frag_kill functions now work
with the generic inet_frag_queue data type, this can
be moved into a common place.

The xxx_unlink() code is moved as well.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:26:41 +0800
04128f233 [INET]: Collect common frag sysctl variables together

Some sysctl variables are used to tune the frag queues
management and it will be useful to work with them in
a common way in the future, so move them into one
structure, moreover they are the same for all the frag
management codes.

I don't place them in the existing inet_frags object,
introduced in the previous patch for two reasons:

1. to keep them in the __read_mostly section;
2. not to export the whole inet_frags objects outside.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:26:40 +0800
7eb95156d [INET]: Collect frag queues management objects together

There are some objects that are common in all the places
which are used to keep track of frag queues, they are:

* hash table
* LRU list
* rw lock
* rnd number for hash function
* the number of queues
* the amount of memory occupied by queues
* secret timer

Move all this stuff into one structure (struct inet_frags)
to make it possible to use them uniformly in the future. Like
with the previous patch this mostly consists of hunks like

- write_lock(&ipfrag_lock);
+ write_lock(&ip4_frags.lock);

To address the issue with exporting the number of queues and
the amount of memory occupied by queues outside the .c file
they are declared in, I introduce a couple of helpers.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:26:39 +0800
5ab11c98d [INET]: Move common fields from frag_queues in one place.

Introduce the struct inet_frag_queue in include/net/inet_frag.h
file and place there all the common fields from three structs:

* struct ipq in ipv4/ip_fragment.c
* struct nf_ct_frag6_queue in nf_conntrack_reasm.c
* struct frag_queue in ipv6/reassembly.c

After this, replace these fields on appropriate structures with
this structure instance and fix the users to use correct names
i.e. hunks like

- atomic_dec(&fq->refcnt);
+ atomic_dec(&fq->q.refcnt);

(these occupy most of the patch)
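
The embedding pattern described here can be sketched in userspace as below. The structures are heavily simplified and prefixed with demo_ to mark them as hypothetical: the real ones carry more fields, and the kernel's refcnt is an atomic_t rather than a plain int. Only the idea of a common struct placed as a member named q is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Fields shared by every kind of fragment queue (simplified). */
struct demo_inet_frag_queue {
    int refcnt;     /* plain int here; the kernel uses atomic_t */
    int len;
};

/* Protocol-specific queues embed the common part as member 'q'. */
struct demo_ipq {
    struct demo_inet_frag_queue q;
    unsigned int saddr, daddr;
};

struct demo_frag_queue6 {
    struct demo_inet_frag_queue q;
    unsigned int id;
};

/* Generic helper: works on any queue type, returns 1 on the last put. */
static int demo_frag_put(struct demo_inet_frag_queue *q)
{
    return --q->refcnt == 0;
}

/* Recover the enclosing structure from a pointer to its 'q' member. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
```

Callers that used to touch fq->refcnt now go through fq->q.refcnt, which is why hunks of that shape make up most of the patch.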

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2007-10-16 03:26:38 +0800
f885c5b08 [TCP]: high_seq parameter removed (all callers use tp->high_seq)

Signed-off-by: Ilpo Järvinen
Signed-off-by: David S. Miller

Ilpo Järvinen
2007-10-16 03:26:37 +0800
ad643a793 [IPV6]: Uninline netfilter okfns

Uninline netfilter okfns for those cases where gcc can generate tail-calls.

Before:
text data bss dec hex filename
8994153 1016524 524652 10535329 a0c1a1 vmlinux

After:
text data bss dec hex filename
8992761 1016524 524652 10533937 a0bc31 vmlinux
-------------------------------------------------------
-1392

All cases have been verified to generate tail-calls with and without netfilter.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-10-16 03:26:36 +0800
9c2842bd9 [BRIDGE]: Remove SKB share checks in br_nf_pre_routing().

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-10-16 03:26:35 +0800
861d04860 [IPV4]: Uninline netfilter okfns

Now that we don't pass double skb pointers to nf_hook_slow anymore, gcc
can generate tail calls for some of the netfilter hook okfn invocations,
so there is no need to inline the functions anymore. This caused huge
code bloat since we ended up with one inlined version and one out-of-line
version since we pass the address to nf_hook_slow.

Before:
text data bss dec hex filename
8997385 1016524 524652 10538561 a0ce41 vmlinux

After:
text data bss dec hex filename
8994009 1016524 524652 10535185 a0c111 vmlinux
-------------------------------------------------------
-3376

All cases have been verified to generate tail-calls with and without
netfilter. The okfns in ipmr and xfrm4_input still remain inline because
gcc can't generate tail-calls for them.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2007-10-16 03:26:35 +0800
a030847e9 [NET]: Avoid copying TCP packets unnecessarily

TCP packets all have writable heads, that is, even though it's cloned, it is
writable up to the end of the TCP header. This patch makes skb_checksum_help
aware of this fact by using skb_clone_writable and avoiding a copy for TCP.

I've also modified the BUG_ON tests to be unsigned. The only case where this
makes a difference is if csum_start points to a location before skb->data.
Since skb->data should always include the header where the checksum field
is (and all current callers adhere to that), this change is safe and may
uncover bugs later.
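
Why an unsigned comparison matters for a bounds check can be shown with a small sketch. The function names are hypothetical and this is not the skb_checksum_help code; it only demonstrates that a negative offset (a checksum start lying before the data pointer) slips through a signed comparison but is caught once the comparison is unsigned.

```c
#include <assert.h>
#include <stdbool.h>

/* Signed check: a negative offset looks "small" and passes unnoticed. */
static bool demo_offset_bad_signed(int offset, int headlen)
{
    return offset >= headlen;
}

/*
 * Unsigned check: a negative offset converts to a very large value,
 * so one comparison rejects both "too far" and "before the start".
 */
static bool demo_offset_bad_unsigned(int offset, unsigned int headlen)
{
    return (unsigned int)offset >= headlen;
}
```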

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-10-16 03:26:34 +0800
172a863f2 [NET]: Fix csum_start update in pskb_expand_head

I got confused by the dual nature of the off variable in the
function pskb_expand_head. The csum_start offset should use
nhead instead of off which can change depending on whether we
are using offsets or pointers.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-10-16 03:26:33 +0800
f937f1f46 [NETLINK]: Don't leak 'listeners' in netlink_kernel_create()

The Coverity checker spotted that we'll leak the storage allocated
to 'listeners' in netlink_kernel_create() when the
if (!nl_table[unit].registered)
check is false.

This patch avoids the leak.
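
The general shape of such a fix, freeing the early allocation on the path that does not take ownership of it, might look like the userspace sketch below. It is an assumption about the pattern, not the actual netlink code: demo_table and demo_register are hypothetical, and calloc/free stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one slot of a registration table. */
struct demo_table {
    int registered;
    unsigned long *listeners;
};

/* Returns 0 on success, -1 if the allocation fails. */
static int demo_register(struct demo_table *t, size_t nwords)
{
    unsigned long *listeners = calloc(nwords, sizeof(*listeners));

    if (!listeners)
        return -1;

    if (!t->registered) {
        /* First registration: the table takes ownership. */
        t->listeners = listeners;
        t->registered = 1;
    } else {
        /* Already registered: nobody will use this buffer, so free it. */
        free(listeners);
    }
    return 0;
}
```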

Signed-off-by: Jesper Juhl
Acked-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Jesper Juhl
2007-10-16 03:26:32 +0800
1dff92e09 [IPV6] __inet6_csk_dst_store(): fix check-after-use

The Coverity checker spotted that we have already oops'ed if "dst" was
NULL.

Since "dst" being NULL doesn't seem to be possible at this point this
patch removes the NULL check.

Signed-off-by: Adrian Bunk
Acked-by: Masahide NAKAMURA
Acked-by: Noriaki TAKAMIYA
Signed-off-by: David S. Miller

Adrian Bunk
2007-10-16 03:26:32 +0800
65c884666 [IPV6]: Avoid skb_copy/pskb_copy/skb_realloc_headroom on input

This patch replaces unnecessary uses of skb_copy by pskb_expand_head
on the IPv6 input path.

This allows us to remove the double pointers later.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-10-16 03:26:31 +0800
f61944efd [IPV6]: Make ipv6_frag_rcv return the same packet

This patch implements the same change that was done to ip_defrag. It
makes ipv6_frag_rcv return the last packet received of a train of fragments
rather than the head of that sequence.

This allows us to get rid of the sk_buff ** argument later.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-10-16 03:26:30 +0800
3db05fea5 [NETFILTER]: Replace sk_buff ** with sk_buff *

With all the users of the double pointers removed, this patch mops up by
finally replacing all occurrences of sk_buff ** in the netfilter API by
sk_buff *.

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-10-16 03:26:29 +0800
2ca7b0ac0 [NETFILTER]: Avoid skb_copy/pskb_copy/skb_realloc_headroom

This patch replaces unnecessary uses of skb_copy, pskb_copy and
skb_realloc_headroom by functions such as skb_make_writable and
pskb_expand_head.

This allows us to remove the double pointers later.
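
Why avoiding whole-buffer copies makes the double pointers unnecessary can be sketched in userspace. The demo_buf type and both hooks are hypothetical and far simpler than sk_buff: the old-style hook replaces the buffer with a copy, so the caller's pointer must be updated, while the new-style hook swaps only the data area and leaves the buffer's address unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, minimal packet buffer. */
struct demo_buf {
    unsigned char *data;
    size_t len;
};

/* Old style: work on a full copy, so the caller's pointer must change. */
static int demo_hook_old(struct demo_buf **pbuf)
{
    struct demo_buf *old = *pbuf;
    struct demo_buf *copy = malloc(sizeof(*copy));

    if (!copy)
        return -1;
    copy->data = malloc(old->len);
    if (!copy->data) {
        free(copy);
        return -1;
    }
    memcpy(copy->data, old->data, old->len);
    copy->len = old->len;

    free(old->data);
    free(old);
    *pbuf = copy;               /* this is why a double pointer was needed */
    copy->data[0] ^= 0xff;      /* the actual modification */
    return 0;
}

/* New style: replace only the data area; the buffer itself stays put. */
static int demo_hook_new(struct demo_buf *buf)
{
    unsigned char *data = malloc(buf->len);

    if (!data)
        return -1;
    memcpy(data, buf->data, buf->len);
    free(buf->data);
    buf->data = data;
    buf->data[0] ^= 0xff;       /* the actual modification */
    return 0;
}
```

Both hooks assume a buffer with at least one byte of data.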

Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller

Herbert Xu
2007-10-16 03:26:28 +0800