10 Jan, 2012
1 commit
-
* 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: Remove irqsafe_cpu_xxx variants

Fix up conflict in arch/x86/include/asm/percpu.h due to clash with
cebef5beed3d ("x86: Fix and improve percpu_cmpxchg{8,16}b_double()")
which edited the (now removed) irqsafe_cpu_cmpxchg*_double code.
28 Dec, 2011
2 commits
-
Use the new macro and struct names in xt_ecn.h, and put the old
definitions into a definition-forwarding ipt_ecn.h.
Signed-off-by: Jan Engelhardt
Signed-off-by: Pablo Neira Ayuso
-
Prepare the ECN match for augmentation by an IPv6 counterpart. Since
no symbol dependencies to ipv6.ko are added, having a single ecn match
module is all the more welcome.
Signed-off-by: Jan Engelhardt
Signed-off-by: Pablo Neira Ayuso
25 Dec, 2011
2 commits
-
This patch adds the match that allows performing extended
accounting. It requires the new nfnetlink_acct infrastructure.

# iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
# iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic
Signed-off-by: Pablo Neira Ayuso
-
We currently have two ways to account traffic in netfilter:

- iptables chain and rule counters:

# iptables -L -n -v
Chain INPUT (policy DROP 3 packets, 867 bytes)
pkts bytes target prot opt in out source destination
8 1104 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0

- use flow-based accounting provided by ctnetlink:

# conntrack -L
tcp 6 431999 ESTABLISHED src=192.168.1.130 dst=212.106.219.168 sport=58152 dport=80 packets=47 bytes=7654 src=212.106.219.168 dst=192.168.1.130 sport=80 dport=58152 packets=49 bytes=66340 [ASSURED] mark=0 use=1

While trying to display real-time accounting statistics, we need
to poll the kernel periodically to obtain this information. This is
OK if the number of flows is relatively low. However, if
the number of flows is huge, we can spend a considerable number of
cycles iterating over the list of flows that have been obtained.

Moreover, if we want to obtain the sum of the flow accounting results
that match some criteria, we have to iterate over the whole list of
existing flows, look for matches and update the counters.

This patch adds the extended accounting infrastructure for
nfnetlink, which aims to allow displaying real-time traffic accounting
without the need for a complicated and resource-consuming implementation
in user-space. Basically, this new infrastructure allows you to create
accounting objects. One accounting object is composed of packet and
byte counters.

In order to manipulate accounting objects, you need the
new libnetfilter_acct library. It contains several examples of use:

libnetfilter_acct/examples# ./nfacct-add http-traffic
libnetfilter_acct/examples# ./nfacct-get
http-traffic = { pkts = 000000000000, bytes = 000000000000 };

Then, you can use one of these accounting objects in several iptables
rules using the new nfacct match (which comes in a follow-up patch):

# iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
# iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic

The idea is simple: if one packet matches the rule, the nfacct match
updates the counters.

Thanks to Patrick McHardy, Eric Dumazet, Changli Gao for reviewing and
providing feedback for this contribution.
Signed-off-by: Pablo Neira Ayuso
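The accounting-object idea in the commit above (one named pair of packet and byte counters shared by several rules) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: the class `AcctObject`, the `objects` registry and the two rule functions are hypothetical names for illustration, not part of nfnetlink_acct or libnetfilter_acct.

```python
from dataclasses import dataclass


@dataclass
class AcctObject:
    """Hypothetical model of one accounting object: packet and byte counters."""
    name: str
    pkts: int = 0
    nbytes: int = 0

    def update(self, packet_len: int) -> None:
        # One call per matching packet, mirroring what the nfacct match does.
        self.pkts += 1
        self.nbytes += packet_len


# Hypothetical registry of accounting objects, keyed by name.
objects = {"http-traffic": AcctObject("http-traffic")}


def rule_input(sport: int, length: int) -> None:
    # Models: -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
    if sport == 80:
        objects["http-traffic"].update(length)


def rule_output(dport: int, length: int) -> None:
    # Models: -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic
    if dport == 80:
        objects["http-traffic"].update(length)


rule_input(80, 1500)   # matches, counted
rule_output(80, 60)    # matches, counted in the same object
rule_input(22, 100)    # does not match, not counted

acct = objects["http-traffic"]
print(f"{acct.name} = {{ pkts = {acct.pkts:012d}, bytes = {acct.nbytes:012d} }};")
```

Because both rules update the same object, reading the total is a single lookup by name instead of a walk over every flow.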
23 Dec, 2011
3 commits
-
Export the NAT definitions to userspace. So far userspace (specifically,
iptables) has been copying the header files from include/net. Also
rename some structures and definitions in preparation for IPv6 NAT.
Since these have never been officially exported, this doesn't affect
existing userspace code.
Signed-off-by: Patrick McHardy
Signed-off-by: Pablo Neira Ayuso
-
This partially reworks bc01befdcf3e40979eb518085a075cbf0aacede0
which added userspace expectation support.

This patch removes the nf_ct_userspace_expect_list since we now
force the use of the new iptables CT target feature to add the helper
extension for conntracks that have attached expectations from
userspace.

A new version of the proof-of-concept code to implement userspace
helpers is available at:

http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-POC.tar.bz2

This patch also modifies the CT target to allow setting the
conntrack's userspace helper status flags. This flag is used
to tell the conntrack system to explicitly allocate the helper
extension.

This helper extension is useful to link the userspace expectations
with the master conntrack that is being tracked from one userspace
helper.

This feature fixes a problem in the current approach of the
userspace helper support. Basically, if the master conntrack that
has got a userspace expectation vanishes, the expectations point to
an invalid memory address, thus triggering an oops in the
expectation deletion event path.

I decided not to add a new revision of the CT target because
I only needed to add a new flag for it. I'll document this
issue in the iptables manpage. I have also changed the return
value from EINVAL to EOPNOTSUPP if an unsupported flag is
specified. Thus, in the future, new features that only
require a new flag can be added without a new revision.

There is no official code in userspace (apart from
the proof-of-concept) that uses this infrastructure but there
will be some by the beginning of 2012.
Reported-by: Sam Roberts
Signed-off-by: Pablo Neira Ayuso
-
We simply say that regular this_cpu use must be safe regardless of
preemption and interrupt state. That has no material change for x86
and s390 implementations of this_cpu operations. However, arches that
do not provide their own implementation for this_cpu operations will
now get code generated that disables interrupts instead of preemption.

-tj: This is part of on-going percpu API cleanup. For detailed
discussion of the subject, please refer to the following thread.

http://thread.gmane.org/gmane.linux.kernel/1222078
Signed-off-by: Christoph Lameter
Signed-off-by: Tejun Heo
LKML-Reference:
05 Dec, 2011
1 commit
-
This tries to do the same thing as fib_validate_source(), but differs
in several aspects.

The most important difference is that the reverse path filter built into
fib_validate_source uses the oif as iif when performing the reverse
lookup. We do not do this, as the oif is not yet known by the time the
PREROUTING hook is invoked.

We can't wait until FORWARD chain because by the time FORWARD is invoked
ipv4 forward path may have already sent icmp messages in response
to to-be-discarded-via-rpfilter packets.

To avoid such an additional lookup in PREROUTING, Patrick McHardy
suggested attaching the path information directly in the match
(i.e., just do what the standard ipv4 path does a bit earlier in PREROUTING).

This works, but it also has a few caveats. Most importantly, when using
marks in PREROUTING to re-route traffic based on the nfmark, -m rpfilter
would have to be used after the nfmark has been set; otherwise the nfmark
would have no effect (because the route is already attached).

Another problem would be interaction with -j TPROXY, as this target sets an
nfmark and uses ACCEPT instead of continue, i.e. such a version of
-m rpfilter cannot be used for the initial to-be-intercepted packets.

In case it turns out that the oif is required, we can add Patrick's
suggestion with a new match option (e.g. --rpf-use-oif) to keep ruleset
compatibility.

Another difference to current builtin ipv4 rpfilter is that packets subject to ipsec
transformation are not automatically excluded. If you want this, simply
combine -m rpfilter with the policy match.

Packets arriving on loopback interfaces always match.
Signed-off-by: Florian Westphal
Acked-by: David S. Miller
Signed-off-by: Pablo Neira Ayuso
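The basic reverse-path idea behind the rpfilter match described above can be sketched in userspace as follows. This is a simplified model, not the kernel code: the `ROUTES` table, `route_lookup()` and `rpfilter_match()` are hypothetical names, and the sketch ignores nfmark, TPROXY and ipsec handling. It only shows the strict check (would a reply to the source leave through the interface the packet arrived on?) and the loopback exception.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing interface).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),
    (ipaddress.ip_network("192.168.1.0/24"), "eth1"),
]


def route_lookup(addr):
    """Longest-prefix match over ROUTES; returns an interface name or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, iface in ROUTES:
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None


def rpfilter_match(src, iif):
    """True if a packet from src arriving on iif passes the reverse-path test."""
    # Packets arriving on loopback interfaces always match.
    if iif == "lo":
        return True
    # Reverse lookup: which interface would be used to reach the sender?
    return route_lookup(src) == iif


print(rpfilter_match("192.168.1.5", "eth1"))  # arrives where replies would leave
print(rpfilter_match("192.168.1.5", "eth0"))  # arrives on an unexpected interface
```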
27 Aug, 2011
1 commit
-
Various headers use union nf_inet_addr, defined in .
Signed-off-by: Ben Hutchings
Acked-by: Patrick McHardy
Signed-off-by: David S. Miller
22 Jul, 2011
1 commit
21 Jul, 2011
3 commits
-
Some gcc versions warn about prototypes without "inline" when the declaration
includes the "inline" keyword. The fix generates a false error message
"marked inline, but without a definition" with sparse below 0.4.2.
Signed-off-by: Chris Friesen
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
If overlapping networks with different interfaces were added to
the set, the type did not handle it properly. Example

ipset create test hash:net,iface
ipset add test 192.168.0.0/16,eth0
ipset add test 192.168.0.0/24,eth1

Now, if a packet was sent from 192.168.0.0/24,eth0, the type returned
a match.

In the patch the algorithm is fixed in order to correctly handle
overlapping networks.

Limitation: the same network cannot be stored with more than 64 different
interfaces in a single set.
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
19 Jul, 2011
1 commit
-
Introduces a new nfnetlink type that applies a given
verdict to all queued packets with an id
Signed-off-by: Patrick McHardy
18 Jul, 2011
1 commit
-
The goal of this patch is to let nfnetlink providers avoid requiring
that nfnl_mutex be held while nfnetlink_rcv_msg() calls them.

If struct nfnl_callback contains a non-NULL call_rcu(), then
nfnetlink_rcv_msg() will use it instead of the call() field, holding
rcu_read_lock instead of nfnl_mutex.
Signed-off-by: Eric Dumazet
CC: Florian Westphal
CC: Eric Leblond
Signed-off-by: Patrick McHardy
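The dispatch rule described in this commit (prefer call_rcu() when present, otherwise fall back to call() under the mutex) can be modelled roughly as below. This is a userspace sketch only: Python has no RCU, so `rcu_read_lock()` here is a stand-in that merely records that the read-side path was taken, and `Callback`, `rcv_msg()` and `trace` are hypothetical names rather than kernel symbols.

```python
import threading
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Callable, Optional

nfnl_mutex = threading.Lock()  # stand-in for the kernel's nfnl_mutex
trace = []                     # records which protection each call used


@contextmanager
def rcu_read_lock():
    # Python has no RCU; this only marks the read-side critical section.
    trace.append("rcu")
    yield


@dataclass
class Callback:
    """Hypothetical model of struct nfnl_callback with its two handlers."""
    call: Optional[Callable[[str], int]] = None
    call_rcu: Optional[Callable[[str], int]] = None


def rcv_msg(cb: Callback, msg: str) -> int:
    # A non-None call_rcu takes precedence and runs without the mutex.
    if cb.call_rcu is not None:
        with rcu_read_lock():
            return cb.call_rcu(msg)
    if cb.call is None:
        raise ValueError("callback provides neither call nor call_rcu")
    with nfnl_mutex:
        trace.append("mutex")
        return cb.call(msg)


legacy = Callback(call=len)                  # only the mutex-protected handler
lockless = Callback(call=len, call_rcu=len)  # provides the RCU variant as well

r1 = rcv_msg(legacy, "abc")
r2 = rcv_msg(lockless, "abcd")
print(r1, r2, trace)
```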
21 Jun, 2011
1 commit
-
Conflicts:
drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
drivers/net/wireless/rtlwifi/pci.c
net/netfilter/ipvs/ip_vs_core.c
17 Jun, 2011
11 commits
-
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
The hash:net,iface type makes it possible to store network address and
interface name pairs in a set. It's mostly suitable for egress
and ingress filtering. Examples:

# ipset create test hash:net,iface
# ipset add test 192.168.0.0/16,eth0
# ipset add test 192.168.0.0/24,eth1
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
With the change the sets can use any parameter available for the match
and target extensions, like input/output interface. It's required for
the hash:net,iface set type.
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
The patch "Fix adding ranges to hash types" contained a typo
in the timeout variant of the hash types, which actually made
the patch ineffective. Fixed!
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
The range is internally converted to the network(s) equal to the range.
Example:

# ipset new test hash:net
# ipset add test 10.2.0.0-10.2.1.12
# ipset list test
Name: test
Type: hash:net
Header: family inet hashsize 1024 maxelem 65536
Size in memory: 16888
References: 0
Members:
10.2.1.12
10.2.1.0/29
10.2.0.0/24
10.2.1.8/30
Signed-off-by: Jozsef Kadlecsik
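The range-to-network conversion shown in the listing above can be reproduced in userspace with Python's standard `ipaddress` module. This is not ipset's own code, only an independent illustration of the same decomposition: the range 10.2.0.0-10.2.1.12 is split into the smallest list of CIDR blocks that covers exactly those addresses.

```python
import ipaddress

first = ipaddress.IPv4Address("10.2.0.0")
last = ipaddress.IPv4Address("10.2.1.12")

# summarize_address_range yields the networks that together cover
# exactly the addresses first..last, in ascending order.
networks = list(ipaddress.summarize_address_range(first, last))

for net in networks:
    print(net)

# Sanity check: the blocks contain as many addresses as the range itself.
total = sum(net.num_addresses for net in networks)
print(total == int(last) - int(first) + 1)
```

The result is 10.2.0.0/24, 10.2.1.0/29, 10.2.1.8/30 and 10.2.1.12/32, the same four members ipset lists (in hash order, with the /32 printed as a plain address).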
Signed-off-by: Patrick McHardy
-
A set type may have multiple revisions, for example when syntax is
extended. Support continuous revision ranges in set types.
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
When ranges are added to hash types, the elements may trigger rehashing
the set. However, the last successfully added element was not tracked,
so adding started again with the first element after the rehashing.

Bug reported by Mr Dash Four.
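The fix described above amounts to keeping a cursor that survives the rehash. A minimal sketch of that pattern follows, assuming a toy set that signals "table full" with an exception. `TinyHashSet`, `NeedRehash` and `add_range()` are hypothetical names for illustration and do not correspond to ipset's internal functions.

```python
class NeedRehash(Exception):
    """Raised when the table must grow before another element fits."""


class TinyHashSet:
    """Toy set with a fixed capacity that doubles on rehash."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.members = set()
        self.add_calls = 0  # counts every insertion attempt

    def add(self, value):
        self.add_calls += 1
        if len(self.members) >= self.capacity:
            raise NeedRehash
        self.members.add(value)

    def rehash(self):
        self.capacity *= 2


def add_range(s, first, last):
    nxt = first  # first element not yet added; kept across a rehash
    while nxt <= last:
        try:
            s.add(nxt)
        except NeedRehash:
            s.rehash()
            continue  # retry from nxt, not from first
        nxt += 1


s = TinyHashSet(capacity=2)
add_range(s, 1, 5)
print(sorted(s.members), s.add_calls)
```

With the cursor, five elements and two rehashes cost seven insertion attempts; restarting from the first element after each rehash would repeat work already done.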
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
The current listing makes it possible to list sets with full content only.
The patch adds support for partial listings, i.e. listing just
the existing setnames or listing set headers, without set members.
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
This makes it possible to specify the timeout value for
the SET target and a flag to reset the timeout for already existing
entries.
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
-
When an element is added to a set with timeout, one can change the timeout
by "readding" the element with the "-exist" flag. That means the timeout
value is reset to the specified one (or to the default from the set
specification if the "timeout n" option is not used). Example

ipset add foo 1.2.3.4 timeout 10
ipset add foo 1.2.3.4 timeout 600 -exist
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
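The re-adding semantics described in this commit can be modelled with a small userspace sketch. It assumes a hypothetical `TimeoutSet` class with an explicit `now` argument instead of a real clock, and a made-up default timeout of 300 seconds. It is not ipset code, only an illustration of how an "exist" flag turns a rejected duplicate add into a timeout reset.

```python
class TimeoutSet:
    """Toy model of a set whose elements expire after a timeout."""

    def __init__(self, default_timeout):
        self.default_timeout = default_timeout
        self.expires = {}  # element -> absolute expiry time

    def add(self, elem, now, timeout=None, exist=False):
        """Return True on success, False if elem is live and exist is unset."""
        live = elem in self.expires and self.expires[elem] > now
        if live and not exist:
            return False
        if timeout is None:
            # No "timeout n" given: fall back to the set's default.
            timeout = self.default_timeout
        self.expires[elem] = now + timeout
        return True


foo = TimeoutSet(default_timeout=300)  # 300 is an arbitrary example value

r1 = foo.add("1.2.3.4", now=0, timeout=10)               # new element
r2 = foo.add("1.2.3.4", now=5, timeout=600)              # duplicate, rejected
r3 = foo.add("1.2.3.4", now=5, timeout=600, exist=True)  # timeout reset
print(r1, r2, r3, foo.expires["1.2.3.4"])

foo.add("1.2.3.4", now=6, exist=True)  # reset to the default timeout
print(foo.expires["1.2.3.4"])
```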
06 Jun, 2011
1 commit
-
The following warning is raised (and other similar ones):

net/ipv4/netfilter/nf_nat_standalone.c: In function ‘nf_nat_fn’:
net/ipv4/netfilter/nf_nat_standalone.c:119:2: warning: case value ‘4’
not in enumerated type ‘enum ip_conntrack_info’

gcc barfs on adding two enum values and getting a result that is not
an enumerated value:

case IP_CT_RELATED+IP_CT_IS_REPLY:

Add missing enum values
Signed-off-by: Eric Dumazet
CC: David Miller
Signed-off-by: Pablo Neira Ayuso
27 May, 2011
2 commits
-
Variable 'ret' is set in type_pf_tdel() but not used, remove.
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso
-
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Pablo Neira Ayuso
20 Apr, 2011
1 commit
13 Apr, 2011
1 commit
-
SCTP and UDPLITE port support added to the hash:*port* set types.
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
11 Apr, 2011
1 commit
-
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits)
net: Add support for SMSC LAN9530, LAN9730 and LAN89530
mlx4_en: Restoring RX buffer pointer in case of failure
mlx4: Sensing link type at device initialization
ipv4: Fix "Set rt->rt_iif more sanely on output routes."
MAINTAINERS: add entry for Xen network backend
be2net: Fix suspend/resume operation
be2net: Rename some struct members for clarity
pppoe: drop PPPOX_ZOMBIEs in pppoe_flush_dev
dsa/mv88e6131: add support for mv88e6085 switch
ipv6: Enable RFS sk_rxhash tracking for ipv6 sockets (v2)
be2net: Fix a potential crash during shutdown.
bna: Fix for handling firmware heartbeat failure
can: mcp251x: Allow pass IRQ flags through platform data.
smsc911x: fix mac_lock acquision before calling smsc911x_mac_read
iwlwifi: accept EEPROM version 0x423 for iwl6000
rt2x00: fix cancelling uninitialized work
rtlwifi: Fix some warnings/bugs
p54usb: IDs for two new devices
wl12xx: fix potential buffer overflow in testmode nvs push
zd1211rw: reset rx idle timer from tasklet
...
04 Apr, 2011
2 commits
-
We currently use a percpu spinlock to 'protect' rule bytes/packets
counters, after various attempts to use RCU instead.

Lately we added a seqlock so that get_counters() can run without
blocking BH or 'writers'. But we really only need the seqcount in it.

Spinlock itself is only locked by the current/owner cpu, so we can
remove it completely.

This cleans up the API, using correct 'writer' vs 'reader' semantics.
At replace time, the get_counters() call makes sure all cpus are done
using the old table.
Signed-off-by: Eric Dumazet
Cc: Jan Engelhardt
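The writer/reader protocol behind a sequence count, as used for the per-cpu counters in this commit, can be sketched in a single-threaded model. The class `SeqCounter` and its methods are hypothetical names, and the sketch deliberately leaves out memory barriers and real concurrency. It only shows the rule: the writer makes the sequence odd while updating, and a reader discards any snapshot taken while the sequence was odd or changed.

```python
class SeqCounter:
    """Toy per-cpu counter pair guarded by a sequence count (one writer)."""

    def __init__(self):
        self.seq = 0
        self.pkts = 0
        self.nbytes = 0

    def write_begin(self):
        self.seq += 1  # odd: an update is in progress

    def write_end(self):
        self.seq += 1  # even: counters are consistent again

    def add(self, length):
        self.write_begin()
        self.pkts += 1
        self.nbytes += length
        self.write_end()

    def read_once(self):
        """Return a snapshot, or None if it might be inconsistent."""
        start = self.seq
        if start & 1:
            return None  # writer active
        snapshot = (self.pkts, self.nbytes)
        if self.seq != start:
            return None  # a write happened while reading
        return snapshot

    def read(self):
        # Readers never block the writer; they simply retry.
        while True:
            snapshot = self.read_once()
            if snapshot is not None:
                return snapshot


def get_counters(percpu):
    """Sum consistent snapshots of every cpu's counters."""
    pkts = nbytes = 0
    for counter in percpu:
        p, b = counter.read()
        pkts += p
        nbytes += b
    return pkts, nbytes


cpus = [SeqCounter(), SeqCounter()]
cpus[0].add(100)
cpus[0].add(50)
cpus[1].add(1500)
total = get_counters(cpus)

cpus[0].write_begin()
mid_write = cpus[0].read_once()  # rejected: sequence is odd
cpus[0].write_end()
print(total, mid_write)
```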
Signed-off-by: Patrick McHardy
-
The timeout variant of the list:set type must reference the member sets.
However, its garbage collector runs at timer interrupt so the mutex
protection of the references is a no go. Therefore the reference protection
is converted to rwlock.
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
31 Mar, 2011
1 commit
-
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi
20 Mar, 2011
1 commit
-
The hash:*port* types with IPv4 silently ignored address ranges when
elements with non-TCP/UDP protocols were added to or deleted from the set,
and used only the first address from the range.
Signed-off-by: Jozsef Kadlecsik
Signed-off-by: Patrick McHardy
19 Mar, 2011
1 commit
-
Now that we finally have __aligned_xx exported to userspace, convert
the headers that get exported over to the proper type.
Signed-off-by: Mike Frysinger
Signed-off-by: David S. Miller
16 Mar, 2011
1 commit
-
The kernel will refuse certain types that do not work in ipv6 mode.
We can then add these features incrementally without risk of userspace
breakage.
Signed-off-by: Florian Westphal
Signed-off-by: Patrick McHardy