31 May, 2014
1 commit
-
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
This small patchset contains three accumulated Netfilter/IPVS updates,
they are:
1) Refactor common NAT code by encapsulating it into a helper
function, similarly to what we do in other conntrack extensions,
from Florian Westphal.
2) A minor format string mismatch fix for IPVS, from Masanari Iida.
3) Add quota support to the netfilter accounting infrastructure, now
you can add quotas to accounting objects via the nfnetlink interface
and use them from iptables. You can also listen to quota
notifications from userspace. This enhancement is from Mathieu Poirier.
====================
Signed-off-by: David S. Miller
13 May, 2014
1 commit
-
As suggested by several people, rename local_df to ignore_df,
since it means "ignore df bit if it is set".
Cc: Maciej Żenczykowski
Cc: Florian Westphal
Cc: David S. Miller
Cc: Eric Dumazet
Signed-off-by: Cong Wang
Acked-by: Maciej Żenczykowski
Signed-off-by: David S. Miller
04 May, 2014
1 commit
-
else we may fail to forward skb even if original fragments do fit
outgoing link mtu:
1. remote sends 2k packets in two 1000 byte frags, DF set
2. we want to forward but only see '2k > mtu and DF set'
3. we then send icmp error saying that outgoing link is 1500
But original sender never sent a packet that would not fit
the outgoing link.
Setting local_df makes outgoing path test size vs.
IPCB(skb)->frag_max_size, so we will still send the correct
error in case the largest original size did not fit
outgoing link mtu.
Reported-by: Maxime Bizon
Suggested-by: Maxime Bizon
Fixes: 5f2d04f1f9 (ipv4: fix path MTU discovery with connection tracking)
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
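The decision described in the commit above can be sketched as follows. This is a simplified illustration, not the kernel code: `struct pkt_info`, its fields, and `needs_frag_needed_error()` are hypothetical names, and only the comparison against the largest original fragment size is taken from the commit text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a reassembled packet about to be forwarded. */
struct pkt_info {
    size_t len;           /* total length after defragmentation */
    size_t frag_max_size; /* largest original fragment, 0 if never fragmented */
    bool df;              /* DF bit was set by the sender */
    bool ignore_df;       /* called local_df in the commit text */
};

/* Returns true when an ICMP "fragmentation needed" error should be sent. */
bool needs_frag_needed_error(const struct pkt_info *p, size_t mtu)
{
    if (p->len <= mtu)
        return false; /* fits as-is, nothing to fragment */
    if (p->df && !p->ignore_df)
        return true;  /* only sees "len > mtu and DF set" */
    /* With ignore_df set, judge by what the sender actually put on the wire. */
    return p->frag_max_size != 0 && p->frag_max_size > mtu;
}
```

With a 2000-byte packet reassembled from two 1000-byte fragments and an MTU of 1500, the sketch reports an error only when `ignore_df` is not set.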
30 Apr, 2014
1 commit
-
Reduce copy-paste a bit by adding a common helper.
Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso
17 Apr, 2014
1 commit
-
As suggested by Julian:
Simply, flowi4_iif must not contain 0, it does not
look logical to ignore all ip rules with specified iif.
because in fib_rule_match() we do:
    if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
        goto out;
flowi4_iif should be LOOPBACK_IFINDEX by default.
We need to move LOOPBACK_IFINDEX to include/net/flow.h:
1) It is mostly used by flowi_iif
2) Fix the following compile error if we use it in flow.h
by the later patches:
In file included from include/linux/netfilter.h:277:0,
from include/net/netns/netfilter.h:5,
from include/net/net_namespace.h:21,
from include/linux/netdevice.h:43,
from include/linux/icmpv6.h:12,
from include/linux/ipv6.h:61,
from include/net/ipv6.h:16,
from include/linux/sunrpc/clnt.h:27,
from include/linux/nfs_fs.h:30,
from init/do_mounts.c:32:
include/net/flow.h: In function ‘flowi4_init_output’:
include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)
Cc: Eric Biederman
Cc: Julian Anastasov
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
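A minimal sketch of why a zero input interface is a problem, assuming the check quoted from fib_rule_match() is the only criterion. `struct rule`, `struct flow`, and `rule_matches()` are hypothetical stand-ins for the kernel structures; `LOOPBACK_IFINDEX` is defined here as 1, the index normally held by the loopback device.

```c
#include <stdbool.h>

#define LOOPBACK_IFINDEX 1 /* assumed value for this sketch */

struct rule { int iifindex; }; /* 0 means the rule names no iif */
struct flow { int iif; };

/* Mirrors the quoted test: a rule with an iif only matches that iif. */
bool rule_matches(const struct rule *r, const struct flow *fl)
{
    if (r->iifindex && r->iifindex != fl->iif)
        return false;
    return true;
}
```

A flow initialised with `iif = 0` can never match a rule that names an input interface, including `iif lo`, whereas a flow initialised with `LOOPBACK_IFINDEX` matches the loopback rule as expected.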
05 Apr, 2014
1 commit
-
All xtables variants suffer from the defect that the copy_to_user()
to copy the counters to user memory may fail after the table has
already been exchanged and thus exposed. Returning an error at this
point will result in freeing the already exposed table. Any
subsequent packet processing will result in a kernel panic.
We can't copy the counters before exposing the new tables as we
want to provide the counter state after the old table has been
unhooked. Therefore convert this into a silent error.
Cc: Florian Westphal
Signed-off-by: Thomas Graf
Signed-off-by: Pablo Neira Ayuso
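The shape of the fix can be sketched in user-space C. Everything here is hypothetical (`struct table`, `copy_counters()` standing in for copy_to_user(), and `replace_table()`); the only point taken from the commit is that a failed counter copy after the swap is reported but no longer returned as an error.

```c
#include <stdio.h>
#include <string.h>

struct table { const void *active; };

/* Hypothetical stand-in for copy_to_user(); non-zero means failure. */
int copy_counters(void *dst, const void *src, size_t n)
{
    if (dst == NULL)
        return -1;
    memcpy(dst, src, n);
    return 0;
}

int replace_table(struct table *t, const void *newinfo,
                  void *user_counters, const void *counters, size_t n)
{
    t->active = newinfo; /* the new table is exposed from here on */

    /* Too late to fail: the caller would free a table that is in use. */
    if (copy_counters(user_counters, counters, n) != 0)
        fprintf(stderr, "counters copy to user failed, ignoring\n");

    return 0;
}
```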
14 Feb, 2014
1 commit
-
The solution was found by Patrick in 2.4 kernel sources.
Cc: Patrick McHardy
Signed-off-by: Francois-Xavier Le Bail
Acked-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
06 Feb, 2014
3 commits
-
Add a reject module for NFPROTO_INET. It does nothing but dispatch
to the AF-specific modules based on the hook family.
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
-
Currently the nft_reject module depends on symbols from ipv6. This is
wrong since no generic module should force IPv6 support to be loaded.
Split up the module into AF-specific and a generic part.
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
-
Similar bug fixed in SIP module in 3f509c6 ("netfilter: nf_nat_sip: fix
incorrect handling of EBUSY for RTCP expectation").
BUG: unable to handle kernel paging request at 00100104
IP: [] nf_ct_unlink_expect_report+0x57/0xf0 [nf_conntrack]
...
Call Trace:
[] ? del_timer+0x48/0x70
[] nf_ct_remove_expectations+0x47/0x60 [nf_conntrack]
[] nf_ct_delete_from_lists+0x59/0x90 [nf_conntrack]
[] death_by_timeout+0x14e/0x1c0 [nf_conntrack]
[] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
[] call_timer_fn+0x1d/0x80
[] run_timer_softirq+0x18e/0x1a0
[] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
[] __do_softirq+0xa3/0x170
[] ? __local_bh_enable+0x70/0x70
[] ? irq_exit+0x67/0xa0
[] ? do_IRQ+0x46/0xb0
[] ? clockevents_notify+0x35/0x110
[] ? common_interrupt+0x2c/0x40
[] ? cpuidle_enter_state+0x41/0xf0
[] ? cpuidle_idle_call+0x8b/0x100
[] ? arch_cpu_idle+0x8/0x30
[] ? cpu_idle_loop+0x4b/0x140
[] ? cpu_startup_entry+0x18/0x20
[] ? rest_init+0x5d/0x70
[] ? start_kernel+0x2ec/0x2f2
[] ? repair_env_string+0x5b/0x5b
[] ? i386_start_kernel+0x33/0x35
Signed-off-by: Alexey Dobriyan
Signed-off-by: Pablo Neira Ayuso
10 Jan, 2014
5 commits
-
We have to unregister chain type if this fails to register netns.
Signed-off-by: Pablo Neira Ayuso
-
We don't encode argument types into function names and since besides
nft_do_chain() there are only AF-specific versions, there is no risk
of confusion.
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
-
Minor nf_chain_type cleanups:
- reorder struct to plug a hole
- rename struct module member to "owner" for consistency
- rename nf_hookfn array to "hooks" for consistency
- reorder initializers for better readability
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
-
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
-
In some cases we neither take a reference to the AF info nor to the
chain type, allowing the module to be unloaded while in use.
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
08 Jan, 2014
3 commits
-
This patch adds a new table family and a new filter chain that you can
use to attach IPv4 and IPv6 rules. This should help to simplify
rule-set maintenance in dual-stack setups.
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
-
Add support to register chains to multiple hooks for different address
families for mixed IPv4/IPv6 tables.
Signed-off-by: Patrick McHardy
-
Currently the AF-specific hook functions override the chain-type specific
hook functions. That doesn't make too much sense since the chain types
are a special case of the AF-specific hooks.
Make the AF-specific hook functions the default and make the optional
chain type hooks override them.
As a side effect, the necessary code restructuring reduces the code size,
f.i. in case of nf_tables_ipv4.o:
nf_tables_ipv4_init_net | -24
nft_do_chain_ipv4 | -113
2 functions changed, 137 bytes removed, diff: -137
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
07 Jan, 2014
1 commit
-
Pablo Neira Ayuso says:
====================
nftables updates for net-next
The following patchset contains nftables updates for your net-next tree,
they are:
* Add set operation to the meta expression by means of the select_ops()
infrastructure, this allows us to set the packet mark among other things.
From Arturo Borrero Gonzalez.
* Fix wrong format in sscanf in nf_tables_set_alloc_name(), from Daniel
Borkmann.
* Add new queue expression to nf_tables. This comes with two previous patches
to prepare this new feature, one to add mask in nf_tables_core to
evaluate the queue verdict appropriately and another to refactor common
code with xt_NFQUEUE, from Eric Leblond.
* Do not hide nftables from Kconfig if nfnetlink is not enabled, also from
Eric Leblond.
* Add the reject expression to nf_tables, this adds the missing TCP RST
support. It comes with an initial patch to refactor common code with
xt_NFQUEUE, again from Eric Leblond.
* Remove an unused variable assignment in nf_tables_dump_set(), from Michal
Nazarewicz.
* Remove the nft_meta_target code, now that Arturo added the set operation
to the meta expression, from me.
* Add help information for nf_tables to Kconfig, also from me.
* Allow to dump all sets by specifying NFPROTO_UNSPEC, similar feature is
available to other nf_tables objects, requested by Arturo, from me.
* Expose the table usage counter, so we can know how many chains are using
this table without dumping the list of chains, from Tomasz Bursztyka.
====================
Signed-off-by: David S. Miller
06 Jan, 2014
1 commit
-
Pablo Neira Ayuso says:
====================
netfilter/IPVS updates for net-next
The following patchset contains Netfilter updates for your net-next tree,
they are:
* Add full port randomization support. Some crazy researchers found a way
to reconstruct the secure ephemeral ports that are allocated in random mode
by sending off-path bursts of UDP packets to overrun the socket buffer of
the DNS resolver to trigger retransmissions, then if the timing for the
DNS resolution done by a client is larger than usual, then they conclude
that the port that received the burst of UDP packets is the one that was
opened. It seems a rather aggressive method to me but it seems to work for
them. As a result, Daniel Borkmann and Hannes Frederic Sowa came up with a
new NAT mode to fully randomize ports using prandom.
* Add a new classifier to x_tables based on the socket net_cls set via
cgroups. This includes two patches to prepare the field as requested by
Zefan Li. Also from Daniel Borkmann.
* Use prandom instead of get_random_bytes in several locations of the
netfilter code, from Florian Westphal.
* Allow to use the CTA_MARK_MASK in ctnetlink when mangling the conntrack
mark, also from Florian Westphal.
* Fix compilation warning due to unused variable in IPVS, from Geert
Uytterhoeven.
* Add support for UID/GID via nfnetlink_queue, from Valentina Giusti.
* Add IPComp extension to x_tables, from Fan Du.
====================
Signed-off-by: David S. Miller
04 Jan, 2014
1 commit
-
The following code is not used in current upstream code.
Some of this seems to be old hooks, other might be used by some
out of tree module (which I don't care about breaking), and
the need_ipv4_conntrack was used by old NAT code but no longer
called.
Signed-off-by: Stephen Hemminger
Signed-off-by: Pablo Neira Ayuso
02 Jan, 2014
1 commit
-
Signed-off-by: Pablo Neira Ayuso
31 Dec, 2013
1 commit
-
This patch moves nft_reject_ipv4 to nft_reject and adds support
for IPv6 protocol. This patch uses functions included in nf_reject.h
to implement reject by TCP reset.
The code has to be built as a module if NF_TABLES_IPV6 is also a
module to avoid compilation error due to usage of IPv6 functions.
This has been done in Kconfig by using the construct:
    depends on NF_TABLES_IPV6 || !NF_TABLES_IPV6
This seems a bit weird in terms of syntax but works perfectly.
Signed-off-by: Eric Leblond
Signed-off-by: Pablo Neira Ayuso
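Why the `X || !X` construct is not a no-op becomes clearer once Kconfig's tristate arithmetic is written out. The sketch below assumes the usual Kconfig expression rules (n = 0, m = 1, y = 2, `!a` is `2 - a`, `a || b` is the maximum); the C names are made up for illustration.

```c
/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

enum tristate tri_not(enum tristate a)
{
    return (enum tristate)(TRI_Y - a);
}

enum tristate tri_or(enum tristate a, enum tristate b)
{
    return a > b ? a : b;
}

/* Upper limit that "depends on X || !X" places on the dependent symbol. */
enum tristate dep_limit(enum tristate x)
{
    return tri_or(x, tri_not(x));
}
```

With X = n or X = y the limit is y, so the dependent option is unrestricted. With X = m the limit is m, which is what forces the reject code to be modular whenever NF_TABLES_IPV6 is a module.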
30 Dec, 2013
1 commit
-
This patch prepares the addition of TCP reset support in
the nft_reject module by moving reusable code into a header
file.
Signed-off-by: Eric Leblond
Signed-off-by: Pablo Neira Ayuso
27 Dec, 2013
1 commit
-
Signed-off-by: Weilong Chen
Signed-off-by: David S. Miller
19 Dec, 2013
1 commit
-
Conflicts:
drivers/net/ethernet/intel/i40e/i40e_main.c
drivers/net/macvtap.c
Both minor merge hassles, simple overlapping changes.
Signed-off-by: David S. Miller
12 Dec, 2013
1 commit
-
The dump function in nft_reject_ipv4 was not converting a u32
field to network order before sending it to userspace, this
needs to happen for consistency with other nf_tables and
nfnetlink subsystems.
Signed-off-by: Eric Leblond
Signed-off-by: Pablo Neira Ayuso
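As a user-space illustration of the conversion this commit adds, the following sketch writes a 32-bit value into a buffer in network byte order using the POSIX htonl(). `put_be32()` is a hypothetical helper, not a netlink API.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a host-order u32 as a big-endian payload. */
void put_be32(unsigned char out[4], uint32_t host_value)
{
    uint32_t be = htonl(host_value); /* host order to network order */
    memcpy(out, &be, sizeof(be));
}
```

The bytes in `out` are the same on little-endian and big-endian hosts, which is the consistency the commit message asks for.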
11 Dec, 2013
1 commit
-
Fix a crash in synproxy_send_tcp() when using the SYNPROXY target in the
PREROUTING chain caused by missing routing information.
Reported-by: Nicki P.
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
07 Dec, 2013
1 commit
-
Several files refer to an old address for the Free Software Foundation
in the file header comment. Resolve by replacing the address with
the URL so that we do not have to keep
updating the header comments anytime the address changes.
CC: Alexey Kuznetsov
CC: James Morris
CC: Hideaki YOSHIFUJI
CC: Patrick McHardy
Signed-off-by: Jeff Kirsher
Signed-off-by: David S. Miller
18 Nov, 2013
1 commit
-
When the synproxy_parse_options is called on the client ack the mss
option will not be present. Consequently mss won't be included in the
backend syn packet, which falls back to 536 bytes mss.
Therefore XT_SYNPROXY_OPT_MSS is explicitly flagged when recovering mss
value from cookie.
Signed-off-by: Martin Topholm
Reviewed-by: Jesper Dangaard Brouer
Signed-off-by: Pablo Neira Ayuso
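A small sketch of the behaviour described above, with made-up names: `OPT_MSS` stands in for XT_SYNPROXY_OPT_MSS, and `struct opts`, `recover_from_cookie()`, and `backend_syn_mss()` are hypothetical. It only shows that the recovered value is used when the flag is set and that 536 is the fallback otherwise.

```c
#include <stdint.h>

#define OPT_MSS     0x01 /* hypothetical stand-in for XT_SYNPROXY_OPT_MSS */
#define DEFAULT_MSS 536  /* fallback mentioned in the commit text */

struct opts {
    uint8_t  options; /* bitmask of options considered present */
    uint16_t mss;
};

/* The fix: flag the MSS option explicitly when the value comes from the cookie. */
void recover_from_cookie(struct opts *o, uint16_t cookie_mss)
{
    o->mss = cookie_mss;
    o->options |= OPT_MSS;
}

/* MSS that would be advertised in the SYN sent to the backend. */
uint16_t backend_syn_mss(const struct opts *o)
{
    return (o->options & OPT_MSS) ? o->mss : DEFAULT_MSS;
}
```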
05 Nov, 2013
3 commits
-
Pablo Neira Ayuso says:
====================
This batch contains five nf_tables patches for your net-next tree,
they are:
* Fix possible use after free in the module removal path of the
x_tables compatibility layer, from Dan Carpenter.
* Add filter chain type for the bridge family, from myself.
* Fix Kconfig dependencies of the nf_tables bridge family with
the core, from myself.
* Fix sparse warnings in nft_nat, from Tomasz Bursztyka.
* Remove duplicated include in the IPv4 family support for nf_tables,
from Wei Yongjun.
====================
Signed-off-by: David S. Miller
-
Pablo Neira Ayuso says:
====================
This is another batch containing Netfilter/IPVS updates for your net-next
tree, they are:
* Six patches to make the ipt_CLUSTERIP target support netnamespace,
from Gao feng.
* Two cleanups for the nf_conntrack_acct infrastructure, introducing
a new structure to encapsulate conntrack counters, from Holger
Eitzenberger.
* Fix missing verdict in SCTP support for IPVS, from Daniel Borkmann.
* Skip checksum recalculation in SCTP support for IPVS, also from
Daniel Borkmann.
* Fix behavioural change in xt_socket after IP early demux, from
Florian Westphal.
* Fix bogus large memory allocation in the bitmap port set type in ipset,
from Jozsef Kadlecsik.
* Fix possible compilation issues in the hash netnet set type in ipset,
also from Jozsef Kadlecsik.
* Define constants to identify netlink callback data in ipset dumps,
again from Jozsef Kadlecsik.
* Use sock_gen_put() in xt_socket to replace xt_socket_put_sk,
from Eric Dumazet.
* Improvements for the SH scheduler in IPVS, from Alexander Frolkin.
* Remove extra delay due to unneeded rcu barrier in IPVS net namespace
cleanup path, from Julian Anastasov.
* Save some cycles in ip6t_REJECT by skipping checksum validation in
packets leaving from our stack, from Stanislav Fomichev.
* Fix IPVS_CMD_ATTR_MAX definition in IPVS, larger than required, from
Julian Anastasov.
====================
Signed-off-by: David S. Miller
-
Conflicts:
drivers/net/ethernet/emulex/benet/be.h
drivers/net/netconsole.c
net/bridge/br_private.h
Three mostly trivial conflicts.
The net/bridge/br_private.h conflict was a function signature (argument
addition) change overlapping with the extern removals from Joe Perches.
In drivers/net/netconsole.c we had one change adjusting a printk message
whilst another changed "printk(KERN_INFO" into "pr_info(".
Lastly, the emulex change was a new inline function addition overlapping
with Joe Perches's extern removals.
Signed-off-by: David S. Miller
04 Nov, 2013
1 commit
-
Remove duplicated include.
Signed-off-by: Wei Yongjun
Signed-off-by: Pablo Neira Ayuso
22 Oct, 2013
1 commit
-
During kernel stability testing on an SMP ARMv7 system, Yalin Wang
reported the following panic from the netfilter code:
1fe0: 0000001c 5e2d3b10 4007e779 4009e110 60000010 00000032 ff565656 ff545454
[] (ipt_do_table+0x448/0x584) from [] (nf_iterate+0x48/0x7c)
[] (nf_iterate+0x48/0x7c) from [] (nf_hook_slow+0x58/0x104)
[] (nf_hook_slow+0x58/0x104) from [] (ip_local_deliver+0x88/0xa8)
[] (ip_local_deliver+0x88/0xa8) from [] (ip_rcv_finish+0x418/0x43c)
[] (ip_rcv_finish+0x418/0x43c) from [] (__netif_receive_skb+0x4cc/0x598)
[] (__netif_receive_skb+0x4cc/0x598) from [] (process_backlog+0x84/0x158)
[] (process_backlog+0x84/0x158) from [] (net_rx_action+0x70/0x1dc)
[] (net_rx_action+0x70/0x1dc) from [] (__do_softirq+0x11c/0x27c)
[] (__do_softirq+0x11c/0x27c) from [] (do_softirq+0x44/0x50)
[] (do_softirq+0x44/0x50) from [] (local_bh_enable_ip+0x8c/0xd0)
[] (local_bh_enable_ip+0x8c/0xd0) from [] (inet_stream_connect+0x164/0x298)
[] (inet_stream_connect+0x164/0x298) from [] (sys_connect+0x88/0xc8)
[] (sys_connect+0x88/0xc8) from [] (ret_fast_syscall+0x0/0x30)
Code: 2a000021 e59d2028 e59de01c e59f011c (e7824103)
---[ end trace da227214a82491bd ]---
Kernel panic - not syncing: Fatal exception in interrupt
This comes about because CPU1 is executing xt_replace_table in response
to a setsockopt syscall, resulting in:
    ret = xt_jumpstack_alloc(newinfo);
        --> newinfo->jumpstack = kzalloc(size, GFP_KERNEL);
    [...]
    table->private = newinfo;
    newinfo->initial_entries = private->initial_entries;
Meanwhile, CPU0 is handling the network receive path and ends up in
ipt_do_table, resulting in:
    private = table->private;
    [...]
    jumpstack = (struct ipt_entry **)private->jumpstack[cpu];
On weakly ordered memory architectures, the writes to table->private
and newinfo->jumpstack from CPU1 can be observed out of order by CPU0.
Furthermore, on architectures which don't respect ordering of address
dependencies (i.e. Alpha), the reads from CPU0 can also be re-ordered.
This patch adds an smp_wmb() before the assignment to table->private
(which is essentially publishing newinfo) to ensure that all writes to
newinfo will be observed before plugging it into the table structure.
A dependent-read barrier is also added on the consumer sides, to ensure
the same ordering requirements are also respected there.
Cc: Paul E. McKenney
Reported-by: Wang, Yalin
Tested-by: Wang, Yalin
Signed-off-by: Will Deacon
Acked-by: Eric Dumazet
Signed-off-by: Pablo Neira Ayuso
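The publish/consume ordering this commit enforces can be approximated in portable C11 atomics. This is an analogy rather than the kernel patch: a release store plays the role of smp_wmb() followed by the pointer assignment, and an acquire load stands in for the dependent-read barrier on the reader side. `struct table_info`, `publish()`, and `lookup()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

struct table_info {
    void *jumpstack;
    unsigned int initial_entries;
};

/* Stand-in for table->private. */
static _Atomic(struct table_info *) table_private;

/* Writer side (CPU1 in the commit text). */
void publish(struct table_info *newinfo, void *jumpstack)
{
    newinfo->jumpstack = jumpstack; /* finish initialising first */
    /* Release: all writes to *newinfo become visible before the pointer does. */
    atomic_store_explicit(&table_private, newinfo, memory_order_release);
}

/* Reader side (CPU0 in the commit text). */
struct table_info *lookup(void)
{
    /* Acquire: reads through the returned pointer see the initialised fields. */
    return atomic_load_explicit(&table_private, memory_order_acquire);
}
```

Without the ordering on both sides, a reader on a weakly ordered CPU could observe the new pointer while `jumpstack` still appears uninitialised, which matches the panic shown in the trace.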
17 Oct, 2013
5 commits
-
we can allow users in uninit net namespace to operate ipt_CLUSTERIP
now.
Signed-off-by: Gao feng
Signed-off-by: Pablo Neira Ayuso
-
Create proc entries under the ipt_CLUSTERIP directory of proper
net namespace.
Signed-off-by: Gao feng
Signed-off-by: Pablo Neira Ayuso
-
In order to find clusterip_config in net namespace.
Signed-off-by: Gao feng
Signed-off-by: Pablo Neira Ayuso
-
This lock is used for protecting clusterip_configs of per
net namespace, it should be per net namespace too.
Signed-off-by: Gao feng
Signed-off-by: Pablo Neira Ayuso
-
clusterip_configs should be per net namespace, so operating on a
cluster in one net namespace won't affect other net
namespaces. Right now, only allow operating on the clusterip_configs
of init net namespace.
Signed-off-by: Gao feng
Signed-off-by: Pablo Neira Ayuso