Eric Lee / smarc-fsl-linux-kernel

21 Oct, 2016

1 commit

7034b566a netfilter: fix nf_queue handling

nf_queue handling is broken since e3b37f11e6e4 ("netfilter: replace
list_head with single linked list") for two reasons:

1) If the bypass flag is set, there are no userspace listeners, and
there are more hook entries to iterate over, then jump to the
next hook. Otherwise accept the packet. On the nf_reinject() path,
okfn() needs to be invoked.

2) We should not re-enter the same hook on packet reinjection. If the
packet is accepted, we have to skip the current hook from where the
packet was enqueued, otherwise the packet gets enqueued over and
over again.

This restores the previous list_for_each_entry_continue() behaviour
happening from nf_iterate() that was dealing with these two cases.
This patch introduces a new nf_queue() wrapper function so this fix
becomes simpler.
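
The two cases above can be sketched in userspace C. This is a minimal illustration under stated assumptions, not the kernel code: `hook_entry`, `run_hooks`, `reinject_accept`, the `bypass` field and the `have_listener` argument are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
enum verdict { V_DROP, V_ACCEPT, V_QUEUE };

struct hook_entry {
    enum verdict (*hook)(int pkt);
    bool bypass;                 /* models the queue-bypass flag */
    struct hook_entry *next;     /* singly linked: no prev pointer */
};

/* Two trivial example hooks. */
enum verdict hook_accept(int pkt) { (void)pkt; return V_ACCEPT; }
enum verdict hook_queue(int pkt)  { (void)pkt; return V_QUEUE; }

/*
 * Walk the list starting at 'entry'. On V_QUEUE with no listener, a
 * bypass entry falls through to the next hook (or to the final accept
 * if it was the last one); without bypass the packet is dropped.
 */
enum verdict run_hooks(struct hook_entry *entry, int pkt, bool have_listener,
                       struct hook_entry **queued_at)
{
    for (; entry != NULL; entry = entry->next) {
        enum verdict v = entry->hook(pkt);

        if (v == V_ACCEPT)
            continue;
        if (v == V_QUEUE) {
            if (!have_listener) {
                if (entry->bypass)
                    continue;
                return V_DROP;
            }
            *queued_at = entry;
        }
        return v;
    }
    return V_ACCEPT;             /* the real code would invoke okfn() here */
}

/* Reinjection with an accept verdict: resume after the queueing hook. */
enum verdict reinject_accept(struct hook_entry *queued_at, int pkt,
                             bool have_listener,
                             struct hook_entry **requeued_at)
{
    assert(queued_at != NULL);
    return run_hooks(queued_at->next, pkt, have_listener, requeued_at);
}
```

Resuming from `queued_at->next` rather than `queued_at` is what keeps the packet from being enqueued again by the same hook.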

Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2016-10-21 01:59:59 +0800

25 Sep, 2016

1 commit

e3b37f11e netfilter: replace list_head with single linked list

The netfilter hook list never uses the prev pointer, and so can be trimmed to
be a simple singly-linked list.

In addition to having a more lightweight structure for hook traversal,
struct net becomes 5568 bytes (down from 6400) and struct net_device becomes
2176 bytes (down from 2240).
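
As a rough illustration of where the saving comes from, the sketch below compares a doubly-linked list head with a single next pointer. The struct names are hypothetical. The observation that 6400 - 5568 = 832 bytes matches 104 heads each losing one 8-byte pointer on a 64-bit build is an inference from the arithmetic, not something the commit states.

```c
#include <assert.h>
#include <stddef.h>

/* Shape of a doubly-linked list head (two pointers). */
struct dlist_head {
    struct dlist_head *next, *prev;
};

/* Shape of a singly-linked entry (one pointer). */
struct slist_entry {
    struct slist_entry *next;
};

/* Bytes saved when 'heads' embedded list heads each drop the prev pointer. */
size_t bytes_saved(size_t heads)
{
    assert(sizeof(struct dlist_head) == 2 * sizeof(struct slist_entry));
    return heads * (sizeof(struct dlist_head) - sizeof(struct slist_entry));
}
```

With 8-byte pointers, `bytes_saved(104)` evaluates to 832.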

Signed-off-by: Aaron Conole
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Aaron Conole
2016-09-25 20:38:48 +0800

25 May, 2016

1 commit

dc3ee32e9 netfilter: nf_queue: Make the queue_handler pernet

Florian Weber reported:
> Under full load (unshare() in loop -> OOM conditions) we can
> get kernel panic:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> IP: [] nfqnl_nf_hook_drop+0x35/0x70
> [..]
> task: ffff88012dfa3840 ti: ffff88012dffc000 task.ti: ffff88012dffc000
> RIP: 0010:[] [] nfqnl_nf_hook_drop+0x35/0x70
> RSP: 0000:ffff88012dfffd80 EFLAGS: 00010206
> RAX: 0000000000000008 RBX: ffffffff81add0c0 RCX: ffff88013fd80000
> [..]
> Call Trace:
> [] nf_queue_nf_hook_drop+0x18/0x20
> [] nf_unregister_net_hook+0xdb/0x150
> [] netfilter_net_exit+0x2f/0x60
> [] ops_exit_list.isra.4+0x38/0x60
> [] setup_net+0xc2/0x120
> [] copy_net_ns+0x79/0x120
> [] create_new_namespaces+0x11b/0x1e0
> [] unshare_nsproxy_namespaces+0x57/0xa0
> [] SyS_unshare+0x1b2/0x340
> [] entry_SYSCALL_64_fastpath+0x1e/0xa8
> Code: 65 00 48 89 e5 41 56 41 55 41 54 53 83 e8 01 48 8b 97 70 12 00 00 48 98 49 89 f4 4c 8b 74 c2 18 4d 8d 6e 08 49 81 c6 88 00 00 00 8b 5d 00 48 85 db 74 1a 48 89 df 4c 89 e2 48 c7 c6 90 68 47
>

The simple fix for this requires a new pernet variable for struct
nf_queue that indicates when it is safe to use the dynamically
allocated nf_queue state.

As we need a variable anyway make nf_register_queue_handler and
nf_unregister_queue_handler pernet. This allows the existing logic of
when it is safe to use the state from the nfnetlink_queue module to be
reused with no changes except for making it per net.

The synchronize_rcu from nf_unregister_queue_handler is moved to a new
function nfnl_queue_net_exit_batch so that the worst case of having a
synchronize_rcu in the pernet exit path is not experienced in batch
mode.

Reported-by: Florian Westphal
Signed-off-by: "Eric W. Biederman"
Acked-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Eric W. Biederman
2016-05-25 17:54:22 +0800

17 Oct, 2015

3 commits

81b4325eb netfilter: nf_queue: remove rcu_read_lock calls

All verdict handlers make use of the nfnetlink .call_rcu callback
so rcu readlock is already held.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2015-10-17 00:22:41 +0800

ed78d09d5 netfilter: make nf_queue_entry_get_refs return void

We don't care if module is being unloaded anymore since hook unregister
handling will destroy queue entries using that hook.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2015-10-17 00:22:23 +0800

2ffbceb2b netfilter: remove hook owner refcounting

Since commit 8405a8fff3f8 ("netfilter: nf_qeueue: Drop queue entries on
nf_unregister_hook") all pending queued entries are discarded.

So we can simply remove all of the owner handling -- when a module is
removed it also needs to unregister all its hooks.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2015-10-17 00:21:39 +0800

13 Oct, 2015

1 commit

7ceebfe46 netfilter: nfqueue: don't use prev pointer

Usage of ->prev seems buggy. While the packet is out, our hook cannot be
removed, but we have no way to know if the previous one is still valid.

So better not use ->prev at all. Since NF_REPEAT just asks to invoke the
same hook function again, just do so, and continue with nf_iterate
if we get an ACCEPT verdict.

A side effect of this change is that if nf_reinject(NF_REPEAT) causes
another REPEAT we will now drop the skb instead of looping in the kernel.

However, NF_REPEAT loops would be a bug so this should not happen anyway.
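
The reinjection logic described above can be sketched as follows. This is a simplified userspace model, not the kernel implementation: `hook_entry`, `iterate_from` and `reinject` are hypothetical names, and the packet is reduced to an integer counter so the example hooks can record how often they ran.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
enum verdict { V_DROP, V_ACCEPT, V_REPEAT };

struct hook_entry {
    enum verdict (*hook)(int *pkt);
    struct hook_entry *next;
};

/* Example hooks: one counts its invocations and accepts, one always repeats. */
enum verdict hook_count_accept(int *pkt) { (*pkt)++; return V_ACCEPT; }
enum verdict hook_always_repeat(int *pkt) { (void)pkt; return V_REPEAT; }

/* Run the remaining hooks; the first non-accept verdict is returned. */
enum verdict iterate_from(struct hook_entry *entry, int *pkt)
{
    for (; entry != NULL; entry = entry->next) {
        enum verdict v = entry->hook(pkt);
        if (v != V_ACCEPT)
            return v;
    }
    return V_ACCEPT;
}

/* Apply the verdict that came back for a packet queued at 'queued_at'. */
enum verdict reinject(struct hook_entry *queued_at, enum verdict v, int *pkt)
{
    assert(queued_at != NULL);
    if (v == V_REPEAT) {
        v = queued_at->hook(pkt);    /* invoke the same hook once more */
        if (v == V_REPEAT)
            return V_DROP;           /* second REPEAT: drop, do not loop */
    }
    if (v == V_ACCEPT)
        v = iterate_from(queued_at->next, pkt);  /* never touches prev */
    return v;
}
```

Only the queued entry and its `next` pointer are used, so the validity of the previous hook no longer matters.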

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2015-10-13 18:03:24 +0800

30 Sep, 2015

1 commit

d815d90bb netfilter: Push struct net down into nf_afinfo.reroute

The network namespace is needed when routing a packet.
Stop making nf_afinfo.reroute guess which network namespace
is the proper namespace to route the packet in.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: Pablo Neira Ayuso

Eric W. Biederman
2015-09-30 02:21:31 +0800

18 Sep, 2015

1 commit

0c4b51f00 netfilter: Pass net into okfn

This is immediately motivated by the bridge code that chains functions that
call into netfilter. Without passing net into the okfns the bridge code would
need to guess about the best expression for the network namespace to process
packets in.

As net is frequently one of the first things computed in continuation functions
after netfilter has done its job, passing in the desired network namespace is in
many cases a code simplification.

To support this change the function dst_output_okfn is introduced to
simplify passing dst_output as an okfn. For the moment dst_output_okfn
just silently drops the struct net.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Eric W. Biederman
2015-09-18 08:18:37 +0800

23 Jul, 2015

1 commit

2385eb0c5 netfilter: nf_queue: fix nf_queue_nf_hook_drop()

This function reacquires the rtnl_lock() which is already held by
nf_unregister_hook().

This can be triggered via: modprobe nf_conntrack_ipv4 && rmmod nf_conntrack_ipv4

[ 720.628746] INFO: task rmmod:3578 blocked for more than 120 seconds.
[ 720.628749] Not tainted 4.2.0-rc2+ #113
[ 720.628752] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 720.628754] rmmod D ffff8800ca46fd58 0 3578 3571 0x00000080
[...]
[ 720.628783] Call Trace:
[ 720.628790] [] schedule+0x6b/0x90
[ 720.628795] [] schedule_preempt_disabled+0x13/0x20
[ 720.628799] [] mutex_lock_nested+0x1f5/0x380
[ 720.628803] [] ? rtnl_lock+0x12/0x20
[ 720.628807] [] ? rtnl_lock+0x12/0x20
[ 720.628812] [] rtnl_lock+0x12/0x20
[ 720.628817] [] nf_queue_nf_hook_drop+0x15/0x160
[ 720.628825] [] nf_unregister_net_hook+0x168/0x190
[ 720.628831] [] nf_unregister_hook+0x64/0x80
[ 720.628837] [] nf_unregister_hooks+0x20/0x30
[...]

Moreover, nf_unregister_net_hook() should only destroy the queue for this
netns, not for every netns.

Reported-by: Fengguang Wu
Fixes: 085db2c04557 ("netfilter: Per network namespace netfilter hooks.")
Signed-off-by: Pablo Neira Ayuso
Acked-by: "Eric W. Biederman"

Pablo Neira Ayuso
2015-07-23 22:17:58 +0800

02 Jul, 2015

1 commit

f307170d6 netfilter: nf_queue: Don't recompute the hook_list head

If someone sends packets from one of the netdevice ingress hooks to
a userspace queue, and then userspace later accepts the packet,
the netfilter code can enter an infinite loop as the list head will
never be found.

Pass in the saved list_head to avoid this.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: Pablo Neira Ayuso

Eric W. Biederman
2015-07-02 21:03:13 +0800

23 Jun, 2015

1 commit

8405a8fff netfilter: nf_qeueue: Drop queue entries on nf_unregister_hook

Add code to nf_unregister_hook to flush the nf_queue when a hook is
unregistered. This guarantees that the pointer that the nf_queue code
retains into the nf_hook list will remain valid while a packet is
queued.

I tested what would happen if we do not flush queued packets and was
trivially able to obtain the oops below. All that was required was
to stop the nf_queue listening process, to delete all of the nf_tables,
and to awaken the nf_queue listening process.

> BUG: unable to handle kernel paging request at 0000000100000001
> IP: [] 0x100000001
> PGD b9c35067 PUD 0
> Oops: 0010 [#1] SMP
> Modules linked in:
> CPU: 0 PID: 519 Comm: lt-nfqnl_test Not tainted
> task: ffff8800b9c8c050 ti: ffff8800ba9d8000 task.ti: ffff8800ba9d8000
> RIP: 0010:[] [] 0x100000001
> RSP: 0018:ffff8800ba9dba40 EFLAGS: 00010a16
> RAX: ffff8800bab48a00 RBX: ffff8800ba9dba90 RCX: ffff8800ba9dba90
> RDX: ffff8800b9c10128 RSI: ffff8800ba940900 RDI: ffff8800bab48a00
> RBP: ffff8800b9c10128 R08: ffffffff82976660 R09: ffff8800ba9dbb28
> R10: dead000000100100 R11: dead000000200200 R12: ffff8800ba940900
> R13: ffffffff8313fd50 R14: ffff8800b9c95200 R15: 0000000000000000
> FS: 00007fb91fc34700(0000) GS:ffff8800bfa00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000100000001 CR3: 00000000babfb000 CR4: 00000000000007f0
> Stack:
> ffffffff8206ab0f ffffffff82982240 ffff8800bab48a00 ffff8800b9c100a8
> ffff8800b9c10100 0000000000000001 ffff8800ba940900 ffff8800b9c10128
> ffffffff8206bd65 ffff8800bfb0d5e0 ffff8800bab48a00 0000000000014dc0
> Call Trace:
> [] ? nf_iterate+0x4f/0xa0
> [] ? nf_reinject+0x125/0x190
> [] ? nfqnl_recv_verdict+0x255/0x360
> [] ? nla_parse+0x80/0xf0
> [] ? nfnetlink_rcv_msg+0x13c/0x240
> [] ? __memcg_kmem_get_cache+0x4c/0x150
> [] ? nfnl_lock+0x20/0x20
> [] ? netlink_rcv_skb+0xa9/0xc0
> [] ? netlink_unicast+0x12f/0x1c0
> [] ? netlink_sendmsg+0x28e/0x650
> [] ? sock_sendmsg+0x44/0x50
> [] ? ___sys_sendmsg+0x2ab/0x2c0
> [] ? __wake_up+0x43/0x70
> [] ? tty_write+0x1c4/0x2a0
> [] ? __sys_sendmsg+0x44/0x80
> [] ? system_call_fastpath+0x12/0x6a
> Code: Bad RIP value.
> RIP [] 0x100000001
> RSP
> CR2: 0000000100000001
> ---[ end trace 08eb65d42362793f ]---

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Eric W. Biederman
2015-06-23 21:23:23 +0800

09 Apr, 2015

1 commit

aadd51aa7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

Resolve conflicts between 5888b93 ("Merge branch 'nf-hook-compress'") and
Florian Westphal's br_netfilter work.

Conflicts:
net/bridge/br_netfilter.c

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2015-04-09 00:30:21 +0800

08 Apr, 2015

3 commits

c737b7c45 netfilter: bridge: add helpers for fetching physin/outdev

Right now we store this in the nf_bridge_info struct, accessible
via skb->nf_bridge. This patch prepares removal of this pointer from skb:

Instead of using skb->nf_bridge->x, we use helpers to obtain the in/out
device (or ifindexes).

Followup patches to netfilter will then allow nf_bridge_info to be
obtained by a call into the br_netfilter core, rather than keeping a
pointer to it in sk_buff.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2015-04-08 22:49:08 +0800

7026b1ddb netfilter: Pass socket pointer down through okfn().

On the output paths in particular, we sometimes have to deal with two
socket contexts. The first, usually skb->sk, is the local socket that
generated the frame.

The second is potentially the socket used to control a tunneling
socket, such as one that encapsulates using UDP.

We do not want to disassociate skb->sk when encapsulating in order
to fix this, because that would break socket memory accounting.

The most extreme case where this can cause huge problems is an
AF_PACKET socket transmitting over a vxlan device. We hit code
paths doing checks that assume they are dealing with an ipv4
socket, but are actually operating upon the AF_PACKET one.

Signed-off-by: David S. Miller

David Miller
2015-04-08 03:25:55 +0800

1c984f8a5 netfilter: Add socket pointer to nf_hook_state.

It is currently always set to NULL, but nf_queue is adjusted to be
prepared for it being set to a real socket by taking and releasing a
reference to that socket when necessary.

Signed-off-by: David S. Miller

David Miller
2015-04-08 03:25:55 +0800

05 Apr, 2015

2 commits

1d1de89b9 netfilter: Use nf_hook_state in nf_queue_entry.

That way we don't have to reinstantiate another nf_hook_state
on the stack of the nf_reinject() path.

Signed-off-by: David S. Miller

David S. Miller
2015-04-05 00:25:22 +0800

cfdfab314 netfilter: Create and use nf_hook_state.

Instead of passing a large number of arguments down into the nf_hook()
entry points, create a structure which carries this state down through
the hook processing layers.

This makes it so that if we want to change the types or signatures of
any of these pieces of state, there are fewer places that need to be
changed.
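
The pattern can be illustrated with a small sketch. The struct below is a hypothetical, much reduced stand-in (its field names and the integer packet are assumptions made for the example), but it shows why intermediate layers stop caring about the individual pieces of state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state bundle; the real nf_hook_state carries other fields. */
struct hook_state {
    unsigned int hook;
    int family;
    const char *in_dev;
    const char *out_dev;
    int (*okfn)(int pkt);        /* continuation run once hooks accept */
};

void hook_state_init(struct hook_state *s, unsigned int hook, int family,
                     const char *in_dev, const char *out_dev,
                     int (*okfn)(int pkt))
{
    assert(okfn != NULL);
    s->hook = hook;
    s->family = family;
    s->in_dev = in_dev;
    s->out_dev = out_dev;
    s->okfn = okfn;
}

/* Example continuation. */
int deliver(int pkt) { return pkt + 1; }

/* Layers only forward the pointer; adding a field leaves them untouched. */
int layer_inner(const struct hook_state *s, int pkt) { return s->okfn(pkt); }
int layer_outer(const struct hook_state *s, int pkt) { return layer_inner(s, pkt); }
```

Adding another member to `struct hook_state` would change `hook_state_init()` and its users, while `layer_outer()` and `layer_inner()` keep their signatures.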

Signed-off-by: David S. Miller

David S. Miller
2015-04-05 00:17:40 +0800

03 Oct, 2014

1 commit

1109a90c0 netfilter: use IS_ENABLED(CONFIG_BRIDGE_NETFILTER)

In 34666d4 ("netfilter: bridge: move br_netfilter out of the core"),
the bridge netfilter code has been modularized.

Use IS_ENABLED instead of ifdef to cover the module case.
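
A small sketch of why a plain ifdef misses the module case. The option name `CONFIG_EXAMPLE_FEATURE` is hypothetical, and the `defined(...) || defined(..._MODULE)` test is an equivalent of what IS_ENABLED() checks, not the kernel's actual macro definition.

```c
#include <assert.h>

/* Hypothetical option built as a module: only the _MODULE symbol exists. */
#define CONFIG_EXAMPLE_FEATURE_MODULE 1

/* A plain #ifdef on the option name does not see a modular build. */
int visible_with_ifdef(void)
{
#ifdef CONFIG_EXAMPLE_FEATURE
    return 1;
#else
    return 0;
#endif
}

/* Testing both symbols covers built-in and modular builds. */
int visible_with_both(void)
{
#if defined(CONFIG_EXAMPLE_FEATURE) || defined(CONFIG_EXAMPLE_FEATURE_MODULE)
    return 1;
#else
    return 0;
#endif
}

/* Sanity check for the modular configuration defined above. */
void self_check(void)
{
    assert(visible_with_both() == 1);
}
```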

Fixes: 34666d4 ("netfilter: bridge: move br_netfilter out of the core")
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2014-10-03 00:30:54 +0800

30 Apr, 2013

2 commits

a5fedd43d netfilter: move skb_gso_segment into nfnetlink_queue module

skb_gso_segment is expensive, so it would be nice if we could
avoid it in the future. However, userspace needs to be prepared
to receive larger-than-mtu-packets (which will also have incorrect
l3/l4 checksums), so we cannot simply remove it.

The plan is to add a per-queue feature flag that userspace can
set when binding the queue.

The problem is that in nf_queue, we only have a queue number,
not the queue context/configuration settings.

This patch should have no impact other than the skb_gso_segment
call now being in a function that has access to the queue config
data.

A new size attribute in nf_queue_entry is needed so
nfnetlink_queue can duplicate the entry of the gso skb
when segmenting the skb while also copying the route key.

The follow-up patch adds a switch to disable skb_gso_segment when
the queue config says so.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2013-04-30 02:09:05 +0800

4bd60443c netfilter: nf_queue: move device refcount bump to extra function

Required by a future patch that will need to duplicate the
nf_queue_entry, bumping refcounts of the copy.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2013-04-30 02:09:04 +0800

19 Apr, 2013

1 commit

f229f6ce4 netfilter: add my copyright statements

Add copyright statements to all netfilter files which have had significant
changes done by myself in the past.

Some notes:

- nf_conntrack_ecache.c was incorrectly attributed to Rusty and Netfilter
Core Team when it got split out of nf_conntrack_core.c. The copyrights
even state a date which lies six years before it was written. It was
written in 2005 by Harald and myself.

- net/ipv{4,6}/netfilter.c, net/netfilter/nf_queue.c were missing copyright
statements. I've added the copyright statement from net/netfilter/core.c,
where this code originated.

- for nf_conntrack_proto_tcp.c I've also added Jozsef, since I didn't want
it to give the wrong impression

Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso

Patrick McHardy
2013-04-19 02:27:55 +0800

03 Dec, 2012

1 commit

0360ae412 netfilter: kill support for per-af queue backends

We used to have several queueing backends, but nowadays only
nfnetlink_queue remains.

In light of this there doesn't seem to be a good reason to
support per-af registering -- just hook up nfnetlink_queue on module
load and remove it on unload.

This means that the userspace BIND/UNBIND_PF commands are now obsolete;
the kernel will ignore them.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2012-12-03 22:07:48 +0800

03 Sep, 2012

2 commits

1c15b6770 netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_queue()

Since 'list_for_each_continue_rcu' has already been replaced by
'list_for_each_entry_continue_rcu', passing 'list_head' to nf_queue() as a
parameter no longer benefits us.

This patch replaces 'list_head' with 'nf_hook_ops' as the parameter of
nf_queue() and __nf_queue() to save code.

Signed-off-by: Michael Wang
Signed-off-by: Pablo Neira Ayuso

Michael Wang
2012-09-03 19:52:54 +0800

2a6decfd8 netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_iterate()

Since 'list_for_each_continue_rcu' has already been replaced by
'list_for_each_entry_continue_rcu', passing 'list_head' to nf_iterate() as a
parameter no longer benefits us.

This patch replaces 'list_head' with 'nf_hook_ops' as the parameter of
nf_iterate() to save code.

Signed-off-by: Michael Wang
Signed-off-by: Pablo Neira Ayuso

Michael Wang
2012-09-03 19:52:44 +0800

10 Feb, 2012

1 commit

a8db7b2d1 netfilter: nf_queue: fix queueing of bridged gro skbs

When trying to nf_queue GRO/GSO skbs, nf_queue uses skb_gso_segment
to split the skb.

However, if nf_queue is called via bridge netfilter, the mac header
won't be preserved -- packets will thus contain a bogus mac header.

Fix this by setting skb->data to the mac header when skb->nf_bridge
is set and restoring skb->data afterwards for all segments.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2012-02-10 03:47:53 +0800

13 Jan, 2012

1 commit

cf778b00e net: reintroduce missing rcu_assign_pointer() calls

commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to
RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
y).

We miss needed barriers, even on x86, when y is not NULL.
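
The distinction can be modelled in userspace with C11 atomics. This is an analogy under stated assumptions, not the kernel primitives: a release store stands in for rcu_assign_pointer(), a relaxed store of NULL stands in for RCU_INIT_POINTER(), and an acquire load stands in for the reader side. All names below are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct config {
    int value;
};

static _Atomic(struct config *) shared;

/*
 * Publishing a non-NULL pointer needs release ordering so that the
 * initialisation of *c is visible before the pointer itself.
 */
void publish(struct config *c)
{
    assert(c != NULL);
    atomic_store_explicit(&shared, c, memory_order_release);
}

/* Storing NULL publishes no data, so no ordering is required. */
void clear(void)
{
    atomic_store_explicit(&shared, (struct config *)NULL,
                          memory_order_relaxed);
}

/* Reader: load the pointer, then dereference only if it is set. */
int read_value(int fallback)
{
    struct config *c = atomic_load_explicit(&shared, memory_order_acquire);
    return c != NULL ? c->value : fallback;
}
```

Using the relaxed form for a non-NULL pointer is the kind of conversion this commit reverts: without the ordering, a reader could observe the pointer before the data it points to.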

Signed-off-by: Eric Dumazet
CC: Stephen Hemminger
CC: Paul E. McKenney
Signed-off-by: David S. Miller

Eric Dumazet
2012-01-13 04:26:56 +0800

08 Aug, 2011

2 commits

19fd61785 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net

David S. Miller
2011-08-08 14:20:26 +0800

fad544404 netfilter: avoid double free in nf_reinject

NF_STOLEN means skb was already freed

Signed-off-by: Julian Anastasov
Signed-off-by: David S. Miller

Julian Anastasov
2011-08-08 13:11:15 +0800

02 Aug, 2011

1 commit

a9b3cd7f3 rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER

When assigning a NULL value to an RCU protected pointer, no barrier
is needed. rcu_assign_pointer used to handle that, but will soon
change to not handle the special case.

Convert all rcu_assign_pointer of NULL value.

//smpl
@@ expression P; @@

- rcu_assign_pointer(P, NULL)
+ RCU_INIT_POINTER(P, NULL)

//

Signed-off-by: Stephen Hemminger
Acked-by: Paul E. McKenney
Signed-off-by: David S. Miller

Stephen Hemminger
2011-08-02 19:29:23 +0800

31 Mar, 2011

1 commit

25985edce Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi

Lucas De Marchi
2011-03-31 22:26:23 +0800

18 Jan, 2011

4 commits

94b27cc36 netfilter: allow NFQUEUE bypass if no listener is available

If an skb is to be NF_QUEUE'd, but no program has opened the queue, the
packet is dropped.

This adds a v2 target revision of xt_NFQUEUE that allows packets to
continue through the ruleset instead.

Because the actual queueing happens outside of the target context, the
'bypass' flag has to be communicated back to the netfilter core.

Unfortunately the only choice to do this without adding a new function
argument is to use the target function return value (i.e. the verdict).

In the NF_QUEUE case, the upper 16bit already contain the queue number
to use. The previous patch reduced NF_VERDICT_MASK to 0xff, i.e.
we now have extra room for a new flag.

If a hook issued a NF_QUEUE verdict, then the netfilter core will
continue packet processing if the queueing hook
returns -ESRCH (== "this queue does not exist") and the new
NF_VERDICT_FLAG_QUEUE_BYPASS flag is set in the verdict value.

Note: If the queue exists, but userspace does not consume packets fast
enough, the skb will still be dropped.
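
The verdict layout and the bypass decision can be sketched as below. The constants carry an `X_` prefix because they are redefined locally so the example compiles on its own; their values follow the description above (queue number in the upper 16 bits, verdict in the low 8 bits) and the bypass bit value 0x8000 is my understanding of the uapi header, so treat it as an assumption. `on_queue_failure` and `enum outcome` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local copies for illustration; see the lead-in for the assumptions. */
#define X_NF_QUEUE                      3u
#define X_NF_VERDICT_MASK               0x000000ffu
#define X_NF_VERDICT_FLAG_QUEUE_BYPASS  0x00008000u
#define X_NF_VERDICT_QBITS              16

enum outcome { OUTCOME_DROP, OUTCOME_CONTINUE };

/* Build an NF_QUEUE verdict for 'queue', optionally with the bypass flag. */
uint32_t make_queue_verdict(uint16_t queue, int bypass)
{
    uint32_t v = ((uint32_t)queue << X_NF_VERDICT_QBITS) | X_NF_QUEUE;

    if (bypass)
        v |= X_NF_VERDICT_FLAG_QUEUE_BYPASS;
    return v;
}

/* Extract the queue number from the upper 16 bits. */
unsigned int verdict_queue_num(uint32_t verdict)
{
    return verdict >> X_NF_VERDICT_QBITS;
}

/* Decide what happens when queueing failed with a negative errno. */
enum outcome on_queue_failure(uint32_t verdict, int err)
{
    assert(err < 0);
    if (err == -ESRCH && (verdict & X_NF_VERDICT_FLAG_QUEUE_BYPASS))
        return OUTCOME_CONTINUE;     /* no listener and bypass requested */
    return OUTCOME_DROP;             /* any other failure still drops */
}
```

For queue 5 with bypass set, `make_queue_verdict(5, 1)` yields 0x00058003: queue 5 in the upper half, the flag at 0x8000, and verdict 3 in the low byte.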

Signed-off-by: Florian Westphal
Signed-off-by: Patrick McHardy

Florian Westphal
2011-01-18 23:08:30 +0800

f615df76e netfilter: reduce NF_VERDICT_MASK to 0xff

NF_VERDICT_MASK is currently 0xffff. This is because the upper
16 bits are used to store errno (for NF_DROP) or the queue number
(NF_QUEUE verdict).

As there are up to 0xffff different queues available, there is no more
room to store additional flags.

At the moment there are only 6 different verdicts, i.e. we can reduce
NF_VERDICT_MASK to 0xff to allow storing additional flags in the 0xff00 space.

NF_VERDICT_BITS would then be reduced to 8, but because the value is
exported to userspace, this might cause breakage; e.g.
'queuenr = (1 << NF_VERDICT_BITS) | NF_QUEUE' would now break.

Thus, remove NF_VERDICT_BITS usage in the kernel and move the old value
to the 'userspace compat' section.
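
The userspace breakage mentioned above can be shown with a few lines of arithmetic. The helper names are hypothetical. The sketch assumes, as the message states, that the queue number lives in the upper 16 bits, and uses 3 as the numeric value of the queue verdict.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_VERDICT 3u     /* numeric value assumed for NF_QUEUE */

/* What a userspace program computes with a given NF_VERDICT_BITS value. */
uint32_t encode_queue(unsigned int queue, unsigned int verdict_bits)
{
    assert(verdict_bits < 32);
    return ((uint32_t)queue << verdict_bits) | QUEUE_VERDICT;
}

/* What the receiving side reads back: the upper 16 bits. */
unsigned int decode_queue(uint32_t verdict)
{
    return verdict >> 16;
}
```

With the old value of 16, `decode_queue(encode_queue(1, 16))` gives queue 1. If the exported constant silently became 8, the same expression would produce 0x103, which decodes as queue 0 with stray bits in the flag area. That is why the old value is kept for userspace.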

Signed-off-by: Florian Westphal
Signed-off-by: Patrick McHardy

Florian Westphal
2011-01-18 22:52:14 +0800

06cdb6349 netfilter: nfnetlink_queue: do not free skb on error

Move free responsibility from nf_queue to caller.
This enables more flexible error handling; we can now accept the skb
instead of freeing it.

Signed-off-by: Florian Westphal
Signed-off-by: Patrick McHardy

Florian Westphal
2011-01-18 22:28:38 +0800

f15850861 netfilter: nfnetlink_queue: return error number to caller

Instead of returning -1 on error, return an error number to allow the
caller to handle some errors differently.

ECANCELED is used to indicate that the hook is going away and should be
ignored.

A followup patch will introduce more 'ignore this hook' conditions,
(depending on queue settings) and will move kfree_skb responsibility
to the caller.

Signed-off-by: Florian Westphal
Signed-off-by: Patrick McHardy

Florian Westphal
2011-01-18 22:27:28 +0800

16 Nov, 2010

1 commit

0e60ebe04 netfilter: add __rcu annotations

Add some __rcu annotations and use helpers to reduce number of sparse
warnings (CONFIG_SPARSE_RCU_POINTER=y)

Signed-off-by: Eric Dumazet
Signed-off-by: Patrick McHardy

Eric Dumazet
2010-11-16 01:17:21 +0800

20 Aug, 2010

1 commit

0906a372f net/netfilter: __rcu annotations

Signed-off-by: Arnd Bergmann
Signed-off-by: Paul E. McKenney
Acked-by: Patrick McHardy
Cc: "David S. Miller"
Cc: Eric Dumazet
Reviewed-by: Josh Triplett

Arnd Bergmann
2010-08-20 08:18:01 +0800

18 May, 2010

1 commit

7fee226ad net: add a noref bit on skb dst

Use low order bit of skb->_skb_dst to tell dst is not refcounted.

Change _skb_dst to _skb_refdst to make sure all uses are caught.

skb_dst() returns the dst, regardless of noref bit set or not, but
with a lockdep check to make sure a noref dst is not given if current
user is not rcu protected.

New skb_dst_set_noref() helper to set a non-refcounted dst on a skb.
(with lockdep check)

skb_dst_drop() drops a reference only if skb dst was refcounted.

skb_dst_force() helper is used to force a refcount on dst, when skb
is queued and no longer RCU protected.

Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if
!IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in
sock_queue_rcv_skb(), in __nf_queue().

Use skb_dst_force() in dev_requeue_skb().

Note: dst_use_noref() still dirties dst, we might transform it
later to do one dirtying per jiffies.
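
The low-order-bit technique can be sketched in userspace as follows. Everything here is a hypothetical model: `struct dst`, `struct pkt` and the `pkt_dst_*` helpers only mirror the roles of the helpers named above, the reference count is a plain integer, and the lockdep checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DST_NOREF   ((uintptr_t)1)   /* low bit set: not refcounted */
#define DST_PTRMASK (~DST_NOREF)

struct dst { int refcnt; };
struct pkt { uintptr_t refdst; };    /* pointer and flag in one word */

/* Attach a dst whose reference the caller hands over to the packet. */
void pkt_dst_set(struct pkt *p, struct dst *d)
{
    p->refdst = (uintptr_t)d;
}

/* Attach a dst without taking a reference; tag the pointer instead. */
void pkt_dst_set_noref(struct pkt *p, struct dst *d)
{
    assert(((uintptr_t)d & DST_NOREF) == 0);   /* alignment frees bit 0 */
    p->refdst = (uintptr_t)d | DST_NOREF;
}

/* Return the dst regardless of the noref bit. */
struct dst *pkt_dst(const struct pkt *p)
{
    return (struct dst *)(p->refdst & DST_PTRMASK);
}

bool pkt_dst_is_noref(const struct pkt *p)
{
    return (p->refdst & DST_NOREF) != 0 && pkt_dst(p) != NULL;
}

/* Before queueing: turn a borrowed dst into a counted reference. */
void pkt_dst_force(struct pkt *p)
{
    if (pkt_dst_is_noref(p)) {
        pkt_dst(p)->refcnt++;
        p->refdst &= DST_PTRMASK;
    }
}

/* Drop a reference only if the packet actually holds one. */
void pkt_dst_drop(struct pkt *p)
{
    if (p->refdst != 0) {
        if ((p->refdst & DST_NOREF) == 0)
            pkt_dst(p)->refcnt--;
        p->refdst = 0;
    }
}
```

The force step is the part that matters for queueing: once a packet leaves the protected section, the borrowed pointer must become a counted one.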

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-05-18 08:18:50 +0800

13 May, 2010

1 commit

736d58e3a netfilter: remove unnecessary returns from void function()s

This patch removes from net/ netfilter files
all the unnecessary return; statements that precede the
last closing brace of void functions.

It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.

Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'

Signed-off-by: Joe Perches
[Patrick: changed to keep return statements in otherwise empty function bodies]
Signed-off-by: Patrick McHardy

Joe Perches
2010-05-13 21:16:27 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800