28 Dec, 2011

1 commit


25 Dec, 2011

2 commits

  • This patch adds the match that allows extended accounting to be
    performed. It requires the new nfnetlink_acct infrastructure.

    # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
    # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • We currently have two ways to account traffic in netfilter:

    - iptables chain and rule counters:

    # iptables -L -n -v
    Chain INPUT (policy DROP 3 packets, 867 bytes)
     pkts bytes target     prot opt in     out     source               destination
        8  1104 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0

    - use flow-based accounting provided by ctnetlink:

    # conntrack -L
    tcp 6 431999 ESTABLISHED src=192.168.1.130 dst=212.106.219.168 sport=58152 dport=80 packets=47 bytes=7654 src=212.106.219.168 dst=192.168.1.130 sport=80 dport=58152 packets=49 bytes=66340 [ASSURED] mark=0 use=1

    To display real-time accounting statistics, we need to poll the
    kernel periodically to obtain this information. This is OK if the
    number of flows is relatively low. However, if the number of flows
    is huge, we can spend a considerable number of cycles iterating
    over the list of flows that have been obtained.

    Moreover, if we want to obtain the sum of the flow accounting results
    that match some criteria, we have to iterate over the whole list of
    existing flows, look for matches and update the counters.

    This patch adds the extended accounting infrastructure for
    nfnetlink, which aims to allow displaying real-time traffic accounting
    without the need for a complicated and resource-consuming implementation
    in user-space. Basically, this new infrastructure allows you to create
    accounting objects. One accounting object is composed of packet and
    byte counters.

    In order to create and manipulate accounting objects, you need the
    new libnetfilter_acct library. It contains several examples of use:

    libnetfilter_acct/examples# ./nfacct-add http-traffic
    libnetfilter_acct/examples# ./nfacct-get
    http-traffic = { pkts = 000000000000, bytes = 000000000000 };

    Then, you can use one of these accounting objects in several iptables
    rules using the new nfacct match (which comes in a follow-up patch):

    # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
    # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic

    The idea is simple: if one packet matches the rule, the nfacct match
    updates the counters.
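
    The counter-sharing idea can be sketched in Python. This is only a toy
    model, not the kernel code: the AcctObject and Rule classes and the
    packet dictionaries are hypothetical names chosen for illustration. It
    assumes two rules that reference one named object, as in the iptables
    example above.

```python
from dataclasses import dataclass


@dataclass
class AcctObject:
    """Toy accounting object: a name plus packet and byte counters."""
    name: str
    pkts: int = 0
    nbytes: int = 0

    def update(self, length: int) -> None:
        self.pkts += 1
        self.nbytes += length


@dataclass
class Rule:
    """Hypothetical rule: a port test that feeds a shared accounting object."""
    field: str          # "sport" or "dport"
    port: int
    acct: AcctObject

    def process(self, packet: dict) -> bool:
        # Only a packet that matches the rule updates the counters.
        if packet.get(self.field) != self.port:
            return False
        self.acct.update(packet["len"])
        return True


http = AcctObject("http-traffic")
input_rule = Rule("sport", 80, http)    # models the INPUT rule
output_rule = Rule("dport", 80, http)   # models the OUTPUT rule

input_rule.process({"sport": 80, "dport": 58152, "len": 1500})
output_rule.process({"sport": 58152, "dport": 80, "len": 60})
output_rule.process({"sport": 58152, "dport": 22, "len": 100})  # no match

print(f"{http.name} = {{ pkts = {http.pkts}, bytes = {http.nbytes} }};")
```

    Because both rules hold a reference to the same object, reading one
    pair of counters gives the total without walking any list of flows.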

    Thanks to Patrick McHardy, Eric Dumazet, Changli Gao for reviewing and
    providing feedback for this contribution.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

16 Mar, 2011

1 commit


03 Feb, 2011

1 commit


01 Feb, 2011

2 commits

  • The patch adds the combined module of the "SET" target and "set" match
    to netfilter. Both the previous and the current revisions are supported.

    Signed-off-by: Jozsef Kadlecsik
    Signed-off-by: Patrick McHardy

    Jozsef Kadlecsik
     
  • The patch adds the IP set core support to the kernel.

    The IP set core implements a netlink (nfnetlink) based protocol by which
    one can create, destroy, flush, rename, swap, list, save, restore sets,
    and add, delete, test elements from userspace. For simplicity, for
    backward compatibility, and to avoid forcing ip(6)tables to be linked
    with a netlink library, a small getsockopt-based protocol is also kept
    in order to communicate with the ip(6)tables match and target.

    The netlink protocol passes all multi-byte values (u16, etc.) in network
    order with the NLA_F_NET_BYTEORDER flag set. The protocol enforces the
    proper use of the NLA_F_NESTED and NLA_F_NET_BYTEORDER flags.
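
    As an illustration of what network order means for a 16-bit value, here
    is a minimal Python sketch using only the standard struct module. It
    does not build a real netlink attribute, and the value 8080 is an
    arbitrary example.

```python
import struct

value = 8080  # arbitrary u16 example (0x1f90)

wire = struct.pack("!H", value)       # network (big-endian) byte order
little = struct.pack("<H", value)     # little-endian, e.g. x86 host order

print(wire.hex())     # most significant byte first
print(little.hex())   # least significant byte first

# The receiver decodes with the same network-order format.
(decoded,) = struct.unpack("!H", wire)
print(decoded)
```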

    For other kernel subsystems (netfilter match and target) the API contains
    the functions to add, delete and test elements in sets and the required calls
    to get/put references to the sets before those operations can be performed.

    The set types (which are implemented in independent modules) are stored
    in a simple RCU protected list. A set type may have variants: for example
    without timeout or with timeout support, for IPv4 or for IPv6. The sets
    (i.e. the pointers to the sets) are stored in an array. The sets are
    identified by their index in the array, which makes swapping sets
    easy and fast. The array is protected indirectly by the nfnl
    mutex from nfnetlink. The content of each set is protected by the
    rwlock of the set.
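
    Why index-based identification makes swapping cheap can be shown with a
    toy Python model. The SetTable class and the set names are hypothetical,
    locking is left out entirely, and the real implementation may handle
    names and references differently.

```python
class SetTable:
    """Toy model: sets live in an array and users refer to them by index."""

    def __init__(self, size: int) -> None:
        self.slots = [None] * size

    def create(self, name: str) -> int:
        index = self.slots.index(None)          # first free slot
        self.slots[index] = {"name": name, "elements": set()}
        return index

    def swap(self, i: int, j: int) -> None:
        # Only two array entries change places; no element is copied and
        # users holding index i or j need not be updated.
        self.slots[i], self.slots[j] = self.slots[j], self.slots[i]


table = SetTable(4)
live = table.create("blocklist")
staging = table.create("blocklist-new")
table.slots[live]["elements"].add("192.0.2.1")
table.slots[staging]["elements"].update({"192.0.2.1", "198.51.100.7"})

rule_index = live                 # a rule keeps the index, not the set
table.swap(live, staging)
print(sorted(table.slots[rule_index]["elements"]))
```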

    There are functional differences between the add/del/test functions
    for the kernel and userspace:

    - kernel add/del/test: works on the current packet (i.e. one element)
    - kernel test: may trigger an "add" operation in order to fill
    out unspecified parts of the element from the packet (like MAC address)
    - userspace add/del: works on the netlink message and thus possibly
    on multiple elements from the IPSET_ATTR_ADT container attribute.
    - userspace add: may trigger resizing of a set

    Signed-off-by: Jozsef Kadlecsik
    Signed-off-by: Patrick McHardy

    Jozsef Kadlecsik
     

19 Jan, 2011

2 commits

  • This patch adds flow-based timestamping for conntracks. Basically,
    we use two 64-bit variables: one stores the creation timestamp once
    the conntrack has been confirmed, and the other stores the deletion
    time. This conntrack extension is disabled by default; to enable it,
    you have to:

    echo 1 > /proc/sys/net/netfilter/nf_conntrack_timestamp

    This patch allows user-space flow-based loggers such as ulogd2 to
    save memory. In short, ulogd2 does not need to keep a hashtable
    with the conntracks in user-space to know when they were created
    and destroyed; instead, we use the kernel timestamp. If we want to
    have a sane IPFIX implementation in user-space, these
    nanosecond-resolution timestamps are also useful. Other custom
    user-space applications can benefit from this via
    libnetfilter_conntrack.

    This patch modifies the /proc output to display the delta time
    in seconds since the flow start. You can also obtain the
    flow-start date by means of the conntrack-tools.
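
    A rough Python model of the two timestamps and the delta shown for a
    flow follows. The FlowTimestamp class and its field names are
    hypothetical, and the use of 0 for "not destroyed yet" is an assumption
    made for this sketch.

```python
from dataclasses import dataclass

NSEC_PER_SEC = 1_000_000_000


@dataclass
class FlowTimestamp:
    """Toy model: two 64-bit nanosecond timestamps per flow."""
    start_ns: int = 0   # set once the flow has been confirmed
    stop_ns: int = 0    # set when the flow is destroyed (0 while alive)

    def delta_seconds(self, now_ns: int) -> int:
        # Whole seconds since the flow started; a destroyed flow stops
        # ageing at its deletion time.
        end = self.stop_ns or now_ns
        return max(0, end - self.start_ns) // NSEC_PER_SEC


flow = FlowTimestamp(start_ns=5 * NSEC_PER_SEC)
print(flow.delta_seconds(now_ns=65_400_000_000))

flow.stop_ns = 125 * NSEC_PER_SEC
print(flow.delta_seconds(now_ns=999 * NSEC_PER_SEC))
```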

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Patrick McHardy

    Pablo Neira Ayuso
     
  • Add support for SNMP broadcast connection tracking. The SNMP
    broadcast requests are now paired with the SNMP responses,
    which allows SNMP broadcasts to be used with the firewall enabled.

    Please refer to the following conversation:
    http://marc.info/?l=netfilter-devel&m=125992205006600&w=2

    Patrick McHardy wrote:
    > > The best solution would be to add generic broadcast tracking, the
    > > use of expectations for this is a bit of abuse.
    > > The second best choice I guess would be to move the help() function
    > > to a shared module and generalize it so it can be used for both.
    This patch implements the "second best choice".

    Since the netbios-ns conntrack module uses the same helper
    functionality as the snmp, only one helper function is added
    for both snmp and netbios-ns modules into the new object -
    nf_conntrack_broadcast.

    Signed-off-by: Jiri Olsa
    Signed-off-by: Patrick McHardy

    Jiri Olsa
     

17 Jan, 2011

1 commit

  • This patch adds a new netfilter target which creates audit records
    for packets traversing a certain chain.

    It can be used to record packets which are rejected administratively
    as follows:

    -N AUDIT_DROP
    -A AUDIT_DROP -j AUDIT --type DROP
    -A AUDIT_DROP -j DROP

    A rule which would typically drop or reject a packet would then
    invoke the new chain to record packets before dropping them.

    -j AUDIT_DROP

    The module is protocol independent and works for iptables, ip6tables
    and ebtables.

    The following information is logged:
    - netfilter hook
    - packet length
    - incoming/outgoing interface
    - MAC src/dst/proto for ethernet packets
    - src/dst/protocol address for IPv4/IPv6
    - src/dst port for TCP/UDP/UDPLITE
    - icmp type/code

    Cc: Patrick McHardy
    Cc: Eric Paris
    Cc: Al Viro
    Signed-off-by: Thomas Graf
    Signed-off-by: Patrick McHardy

    Thomas Graf
     

23 Jul, 2010

2 commits

  • In some situations, a CPU match permits better spreading of
    connections, or selecting targets only for a given CPU.

    With Receive Packet Steering or a multiqueue NIC and appropriate IRQ
    affinities, we can distribute traffic across the available CPUs, per
    session (all RX packets for a given flow are handled by a given CPU).

    Since some legacy applications are not SMP friendly, one way to scale
    a server is to run multiple copies of them.

    Instead of randomly choosing an instance, we can use the CPU number as a
    key so that the softirq handler for a whole instance runs on a single
    CPU, maximizing cache effects in the TCP/UDP stacks.

    Using NAT for example, a four-way machine might run four copies of a
    server application, using a separate listening port for each instance,
    but still presenting a unique external port:

    iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
    -j REDIRECT --to-port 8080

    iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
    -j REDIRECT --to-port 8081

    iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
    -j REDIRECT --to-port 8082

    iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
    -j REDIRECT --to-port 8083
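
    The four rules follow one pattern, instance port = 8080 + CPU number,
    so they can be generated for any CPU count. The helper below is a
    hypothetical Python sketch that only builds the command strings (it
    does not run iptables); the option names are the ones used in the
    rules above.

```python
def cpu_redirect_rules(num_cpus: int, external_port: int = 80,
                       base_port: int = 8080) -> list:
    """Build one REDIRECT rule per CPU, mapping CPU n to base_port + n."""
    rules = []
    for cpu in range(num_cpus):
        rules.append(
            f"iptables -t nat -A PREROUTING -p tcp --dport {external_port} "
            f"-m cpu --cpu {cpu} -j REDIRECT --to-port {base_port + cpu}"
        )
    return rules


for rule in cpu_redirect_rules(4):
    print(rule)
```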

    Signed-off-by: Eric Dumazet
    Signed-off-by: Patrick McHardy

    Eric Dumazet
     
  • This implements the kernel-space side of the netfilter matcher xt_ipvs.

    [ minor fixes by Simon Horman ]
    Signed-off-by: Hannes Eder
    Signed-off-by: Simon Horman
    [ Patrick: added xt_ipvs.h to Kbuild ]
    Signed-off-by: Patrick McHardy

    Hannes Eder
     

15 Jul, 2010

1 commit

  • This adds a `CHECKSUM' target, which can be used in the iptables mangle
    table.

    You can use this target to compute and fill in the checksum in
    a packet that lacks a checksum. This is particularly useful
    if you need to work around old applications, such as DHCP clients,
    that do not work well with checksum offloads, but you don't want to
    disable checksum offload in your device.

    The problem happens in the field with virtualized applications.
    For reference, see Red Hat bz 605555, as well as
    http://www.spinics.net/lists/kvm/msg37660.html

    Typical expected use (helps old dhclient binary running in a VM):
    iptables -A POSTROUTING -t mangle -p udp --dport bootpc \
    -j CHECKSUM --checksum-fill
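
    For readers unfamiliar with what "filling in the checksum" involves,
    here is a minimal Python sketch of the standard Internet checksum
    arithmetic (RFC 1071). It is not the code of the target: a real UDP
    checksum also covers a pseudo-header, which this sketch leaves out,
    and the function name is chosen for illustration only.

```python
def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                 # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                  # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample bytes taken from the worked example in RFC 1071.
payload = bytes.fromhex("0001f203f4f5f6f7")
csum = internet_checksum(payload)
print(hex(csum))

# A receiver that sums the data together with the checksum gets zero.
print(internet_checksum(payload + csum.to_bytes(2, "big")))
```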

    Includes fixes by Jan Engelhardt

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Patrick McHardy

    Michael S. Tsirkin
     

15 Jun, 2010

1 commit

  • This patch implements an idletimer Xtables target that can be used to
    identify when interfaces have been idle for a certain period of time.

    Timers are identified by labels and are created when a rule is set with a new
    label. The rules also take a timeout value (in seconds) as an option. If
    more than one rule uses the same timer label, the timer will be restarted
    whenever any of the rules get a hit.

    One entry for each timer is created in sysfs. This attribute contains the
    time remaining until the timer expires. The attributes are located under
    the xt_idletimer class:

    /sys/class/xt_idletimer/timers/

    When the timer expires, the target module sends a sysfs notification to
    userspace, which can then decide what to do (e.g. disconnect to save power).
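
    The label and restart behaviour described above can be modelled in a
    few lines of Python. This is a sketch with an explicit clock argument
    instead of real kernel timers; the IdleTimers class, its method names
    and the "wlan0-idle" label are hypothetical, and arming the timer at
    rule creation is an assumption.

```python
class IdleTimers:
    """Toy model: one timer per label, shared by every rule using it."""

    def __init__(self) -> None:
        self.expires_at = {}            # label -> absolute expiry (seconds)

    def add_rule(self, label: str, timeout: int, now: int) -> None:
        # A new label creates a timer; an existing label is simply reused.
        self.expires_at.setdefault(label, now + timeout)

    def hit(self, label: str, timeout: int, now: int) -> None:
        # Any rule that matches restarts the shared timer.
        self.expires_at[label] = now + timeout

    def remaining(self, label: str, now: int) -> int:
        # Models the value exposed per timer: time left until expiry.
        return max(0, self.expires_at[label] - now)

    def expired(self, label: str, now: int) -> bool:
        return now >= self.expires_at[label]


timers = IdleTimers()
timers.add_rule("wlan0-idle", timeout=60, now=0)   # first rule
timers.add_rule("wlan0-idle", timeout=60, now=0)   # second rule, same label
timers.hit("wlan0-idle", timeout=60, now=10)
timers.hit("wlan0-idle", timeout=60, now=50)
print(timers.remaining("wlan0-idle", now=100))
print(timers.expired("wlan0-idle", now=110))
```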

    Cc: Timo Teras
    Signed-off-by: Luciano Coelho
    Signed-off-by: Patrick McHardy

    Luciano Coelho
     

19 Apr, 2010

1 commit

  • xt_TEE can be used to clone and reroute a packet. For example,
    this can be used to copy traffic at a router to another dedicated
    machine for logging purposes.

    References: http://www.gossamer-threads.com/lists/iptables/devel/68781
    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Jan Engelhardt
     

17 Mar, 2010

2 commits

  • Signed-off-by: Jan Engelhardt

    Jan Engelhardt
     
  • Two arguments for combining the two:
    - xt_mark is pretty useless without xt_MARK
    - the actual code is so small anyway that the kmod metadata and the module
    in its loaded state totally outweigh the combined actual code size.

    i586-before:
    -rw-r--r-- 1 jengelh users 3821 Feb 10 01:01 xt_MARK.ko
    -rw-r--r-- 1 jengelh users 2592 Feb 10 00:04 xt_MARK.o
    -rw-r--r-- 1 jengelh users 3274 Feb 10 01:01 xt_mark.ko
    -rw-r--r-- 1 jengelh users 2108 Feb 10 00:05 xt_mark.o
       text    data     bss     dec     hex filename
        354     264       0     618     26a xt_MARK.o
        223     176       0     399     18f xt_mark.o
    And the runtime size is like 14 KB.

    i586-after:
    -rw-r--r-- 1 jengelh users 3264 Feb 18 17:28 xt_mark.o

    Signed-off-by: Jan Engelhardt

    Jan Engelhardt
     

04 Feb, 2010

1 commit

  • Add a new target for the raw table, which can be used to specify conntrack
    parameters for specific connections, e.g. the conntrack helper.

    The target attaches a "template" connection tracking entry to the skb, which
    is used by the conntrack core when initializing a new conntrack.

    Signed-off-by: Patrick McHardy

    Patrick McHardy
     

08 Jun, 2009

1 commit

  • The passive OS fingerprinting netfilter module allows passively
    detecting the remote OS and performing various netfilter actions based
    on that knowledge. This module compares some data (WS, MSS, options and
    their order, ttl, df and others) from packets with the SYN bit set
    against dynamically loaded OS fingerprints.

    Fingerprint matching rules can be downloaded from the OpenBSD source
    tree or found in the archive, and loaded into the kernel through the
    netfilter netlink subsystem with a special utility found in the archive.

    The archive contains a library file (also attached), which was shipped
    with iptables extensions some time ago (at least when ipt_osf existed
    in patch-o-matic).

    Following changes were made in this release:
    * added NLM_F_CREATE/NLM_F_EXCL checks
    * dropped _rcu list traversing helpers in the protected add/remove calls
    * dropped unneeded structures, debug prints, obscure comment and check

    Fingerprints can be downloaded from
    http://www.openbsd.org/cgi-bin/cvsweb/src/etc/pf.os
    or can be found in archive

    Example usage:
    -d switch removes fingerprints

    Please consider for inclusion.
    Thank you.

    Passive OS fingerprint homepage (archives, examples):
    http://www.ioremap.net/projects/osf

    Signed-off-by: Evgeniy Polyakov
    Signed-off-by: Patrick McHardy

    Evgeniy Polyakov
     

17 Mar, 2009

1 commit

  • This patch adds the iptables cluster match. This match can be used
    to deploy gateway and back-end load-sharing clusters. The cluster
    can be composed of at most 32 nodes (although I have only tested
    this with two nodes, so I cannot tell what the real scalability
    limit of this solution is in terms of cluster nodes).

    Assuming that all the nodes see all packets (see below for an
    example on how to do that if your switch does not allow this), the
    cluster match decides if this node has to handle a packet given:

    (jhash(source IP) % total_nodes) & node_mask
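
    One way to read this expression is sketched below in Python. Several
    things here are assumptions rather than facts from the patch:
    zlib.crc32 merely stands in for jhash, node_mask is treated as a bitmask
    with one bit per node handled locally, and nodes are numbered from 1 as
    in the --cluster-local-node option used further down. The function
    names are hypothetical.

```python
import ipaddress
import zlib


def bucket(src_ip: str, total_nodes: int) -> int:
    """Map a source address to a node slot in [0, total_nodes)."""
    packed = ipaddress.ip_address(src_ip).packed
    return zlib.crc32(packed) % total_nodes    # crc32 stands in for jhash


def local_node_mask(local_node: int) -> int:
    """Bitmask for a single node, numbered from 1 (assumption)."""
    return 1 << (local_node - 1)


def handled_locally(src_ip: str, total_nodes: int, node_mask: int) -> bool:
    """True if this node's mask contains the slot chosen by the hash."""
    return bool((1 << bucket(src_ip, total_nodes)) & node_mask)


for ip in ("192.168.1.130", "192.168.1.131", "10.0.0.7"):
    owners = [node for node in (1, 2)
              if handled_locally(ip, 2, local_node_mask(node))]
    print(ip, "-> node", owners[0])
```

    Every node computes the same hash over the same packet, so exactly one
    node accepts each source address and the others can drop it.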

    For related connections, the master conntrack is used. The following
    is an example of its use to deploy a gateway cluster composed of two
    nodes (where this is the node 1):

    iptables -I PREROUTING -t mangle -i eth1 -m cluster \
    --cluster-total-nodes 2 --cluster-local-node 1 \
    --cluster-proc-name eth1 -j MARK --set-mark 0xffff
    iptables -A PREROUTING -t mangle -i eth1 \
    -m mark ! --mark 0xffff -j DROP
    iptables -A PREROUTING -t mangle -i eth2 -m cluster \
    --cluster-total-nodes 2 --cluster-local-node 1 \
    --cluster-proc-name eth2 -j MARK --set-mark 0xffff
    iptables -A PREROUTING -t mangle -i eth2 \
    -m mark ! --mark 0xffff -j DROP

    And the following commands to make all nodes see the same packets:

    ip maddr add 01:00:5e:00:01:01 dev eth1
    ip maddr add 01:00:5e:00:01:02 dev eth2
    arptables -I OUTPUT -o eth1 --h-length 6 \
    -j mangle --mangle-mac-s 01:00:5e:00:01:01
    arptables -I INPUT -i eth1 --h-length 6 \
    --destination-mac 01:00:5e:00:01:01 \
    -j mangle --mangle-mac-d 00:zz:yy:xx:5a:27
    arptables -I OUTPUT -o eth2 --h-length 6 \
    -j mangle --mangle-mac-s 01:00:5e:00:01:02
    arptables -I INPUT -i eth2 --h-length 6 \
    --destination-mac 01:00:5e:00:01:02 \
    -j mangle --mangle-mac-d 00:zz:yy:xx:5a:27

    In the case of TCP connections, the pickup facility has to be disabled
    to avoid marking TCP ACK packets coming in the reply direction as
    valid.

    echo 0 > /proc/sys/net/netfilter/nf_conntrack_tcp_loose

    BTW, some final notes:

    * This match mangles the skbuff pkt_type if it detects
    PACKET_MULTICAST for a non-multicast address. This could instead be
    done in a PKTTYPE target dedicated to this sole purpose.
    * This match supersedes the CLUSTERIP target.

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Patrick McHardy

    Pablo Neira Ayuso
     

20 Feb, 2009

1 commit

  • Kernel module providing implementation of LED netfilter target. Each
    instance of the target appears as a led-trigger device, which can be
    associated with one or more LEDs in /sys/class/leds/

    Signed-off-by: Adam Nielsen
    Acked-by: Richard Purdie
    Signed-off-by: Patrick McHardy

    Adam Nielsen
     

19 Feb, 2009

2 commits


09 Oct, 2008

1 commit


08 Oct, 2008

4 commits


07 Oct, 2008

1 commit

  • Since IPVS now has partial IPv6 support, this patch moves IPVS from
    net/ipv4/ipvs to net/netfilter/ipvs. It's a result of:

    $ git mv net/ipv4/ipvs net/netfilter

    and adapting the relevant Kconfigs/Makefiles to the new path.

    Signed-off-by: Julius Volz
    Signed-off-by: Simon Horman

    Julius Volz
     

22 Jul, 2008

1 commit

  • Initially netfilter had 64-bit counters for conntrack-based accounting, but
    this was changed in 2.6.14 to save memory. Unfortunately, in-kernel 64-bit
    counters are still required, for example for the "connbytes" extension.
    However, 64-bit counters waste a lot of memory, and it was not possible to
    enable/disable them at runtime.

    This patch:
    - reimplements accounting with respect to the extension infrastructure,
    - makes one global version of seq_print_acct() instead of two seq_print_counters(),
    - makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n),
    - makes it possible to enable/disable it at runtime by sysctl or sysfs,
    - extends counters from 32bit to 64bit,
    - renames ip_conntrack_counter -> nf_conn_counter,
    - enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT),
    - sets the initial accounting enable state based on CONFIG_NF_CT_ACCT,
    - removes buggy IPCT_COUNTER_FILLING event handling.

    If accounting is enabled, newly created connections get an additional acct
    extend area. Old connections are not changed, as it is not possible to add
    a ct_extend area to a confirmed conntrack. Accounting is performed for all
    connections with an acct extend area, regardless of the current state of
    "net.netfilter.nf_conntrack_acct".
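
    Why 32-bit byte counters are not enough for long-lived or fast flows can
    be shown with a small Python sketch of wrap-around arithmetic. The
    helper name and the 5 GiB figure are illustrative only.

```python
def add_wrapping(counter: int, amount: int, bits: int) -> int:
    """Add to a fixed-width unsigned counter, wrapping on overflow."""
    return (counter + amount) & ((1 << bits) - 1)


transferred = 5 * 2**30        # a flow that carried 5 GiB

bytes32 = add_wrapping(0, transferred, bits=32)
bytes64 = add_wrapping(0, transferred, bits=64)

print(bytes32)   # wrapped: only the low 32 bits survive
print(bytes64)   # full value
```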

    Signed-off-by: Krzysztof Piotr Oledzki
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Krzysztof Piotr Oledzki
     

14 Apr, 2008

1 commit


29 Jan, 2008

6 commits

  • Since there now is generic support for shared sysctl paths, the only
    remaining paths are net/netfilter and net/ipv4/netfilter. Move them
    to net/netfilter/core.c and net/ipv4/netfilter.c and kill nf_sysctl.c.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • This patch moves ipt_iprange to xt_iprange, in preparation for adding
    IPv6 support to xt_iprange.

    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Jan Engelhardt
     
  • Add rate estimator match. The rate estimator match can match on
    rates estimated by the RATEEST target. It supports matching on
    absolute bps/pps values, comparing two rate estimators and matching
    on the difference between two rate estimators.

    This is what I use to route outgoing data connections from an FTP
    server over two lines based on the available bandwidth:

    # estimate outgoing rates
    iptables -t mangle -A POSTROUTING -o eth0 -j RATEEST --rateest-name eth0 \
    --rateest-interval 250ms \
    --rateest-ewma 0.5s
    iptables -t mangle -A POSTROUTING -o ppp0 -j RATEEST --rateest-name ppp0 \
    --rateest-interval 250ms \
    --rateest-ewma 0.5s

    # mark based on available bandwidth
    iptables -t mangle -A BALANCE -m state --state NEW \
    -m helper --helper ftp \
    -m rateest --rateest-delta \
    --rateest1 eth0 \
    --rateest-bps1 2.5mbit \
    --rateest-gt \
    --rateest2 ppp0 \
    --rateest-bps2 2mbit \
    -j CONNMARK --set-mark 0x1

    iptables -t mangle -A BALANCE -m state --state NEW \
    -m helper --helper ftp \
    -m rateest --rateest-delta \
    --rateest1 ppp0 \
    --rateest-bps1 2mbit \
    --rateest-gt \
    --rateest2 eth0 \
    --rateest-bps2 2.5mbit \
    -j CONNMARK --set-mark 0x2

    iptables -t mangle -A BALANCE -j CONNMARK --restore-mark
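
    The decision made by the two BALANCE rules can be sketched in Python.
    This assumes that delta mode compares the headroom of each line,
    computed as max(0, configured rate - estimated rate); that reading and
    the function names are assumptions, not taken from the patch. The
    2.5mbit and 2mbit figures come from the rules above.

```python
from typing import Optional


def headroom(configured_bps: float, estimated_bps: float) -> float:
    """Unused bandwidth on a line, never negative (assumed semantics)."""
    return max(0.0, configured_bps - estimated_bps)


def pick_mark(eth0_bps: float, ppp0_bps: float,
              eth0_cap: float = 2_500_000,
              ppp0_cap: float = 2_000_000) -> Optional[int]:
    """Return the connmark for a new connection, or None if no rule hits."""
    eth0_free = headroom(eth0_cap, eth0_bps)
    ppp0_free = headroom(ppp0_cap, ppp0_bps)
    if eth0_free > ppp0_free:
        return 0x1      # first rule: eth0 has more room
    if ppp0_free > eth0_free:
        return 0x2      # second rule: ppp0 has more room
    return None         # a tie matches neither "greater than" rule


print(pick_mark(eth0_bps=2_000_000, ppp0_bps=500_000))
print(pick_mark(eth0_bps=0, ppp0_bps=0))
```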

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Add new rate estimator target (using gen_estimator). In combination with
    the rateest match (next patch) this can be used for load-based multipath
    routing.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • xt_owner merges ipt_owner and ip6t_owner, and adds a flag to match
    on socket (non-)existence.

    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Jan Engelhardt
     
  • Signed-off-by: Sven Schnelle
    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Sven Schnelle
     

07 Nov, 2007

1 commit


11 Oct, 2007

1 commit

  • This is ipt_time from POM-ng enhanced by the following:

    * xtables/ipv6 support
    * second granularity for daytime
    * day-of-month support (for example "match on the 15th of each month")
    * match against UTC or local timezone
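
    The kind of test such a match performs can be sketched in Python with
    the standard datetime module. The time_match function and its
    parameters are hypothetical and only model day-of-month plus a daytime
    window with second granularity; they are not the module's options.

```python
from datetime import datetime, time, timezone


def time_match(now: datetime, monthdays=None,
               start: time = time(0, 0, 0),
               stop: time = time(23, 59, 59)) -> bool:
    """True if 'now' falls on an allowed day of month and inside the window."""
    if monthdays is not None and now.day not in monthdays:
        return False
    return start <= now.time() <= stop


# "Match on the 15th of each month", between 09:00:00 and 17:00:00 UTC.
when = datetime(2007, 10, 15, 9, 30, 15, tzinfo=timezone.utc)
print(time_match(when, monthdays={15}, start=time(9, 0), stop=time(17, 0)))
```

    Passing a datetime already converted to the local timezone instead of
    UTC gives the local-time variant of the same check.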

    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Jan Engelhardt
     

15 Jul, 2007

1 commit