10 Jan, 2012

1 commit


28 Dec, 2011

2 commits


25 Dec, 2011

2 commits

  • This patch adds the nfacct match, which allows performing extended
    accounting. It requires the new nfnetlink_acct infrastructure.

    # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
    # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • We currently have two ways to account traffic in netfilter:

    - iptables chain and rule counters:

    # iptables -L -n -v
    Chain INPUT (policy DROP 3 packets, 867 bytes)
    pkts bytes target prot opt in out source destination
    8 1104 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0

    - use flow-based accounting provided by ctnetlink:

    # conntrack -L
    tcp 6 431999 ESTABLISHED src=192.168.1.130 dst=212.106.219.168 sport=58152 dport=80 packets=47 bytes=7654 src=212.106.219.168 dst=192.168.1.130 sport=80 dport=58152 packets=49 bytes=66340 [ASSURED] mark=0 use=1

    To display real-time accounting statistics, we need to poll the
    kernel periodically to obtain this information. This is fine if the
    number of flows is relatively low. However, if the number of flows
    is huge, we can spend a considerable number of cycles iterating over
    the list of flows that has been obtained.

    Moreover, if we want to obtain the sum of the flow accounting results
    that match some criteria, we have to iterate over the whole list of
    existing flows, look for matches and update the counters.

    This patch adds the extended accounting infrastructure for
    nfnetlink, which aims to allow displaying real-time traffic accounting
    without the need for a complicated and resource-consuming
    implementation in user-space. Basically, this new infrastructure
    allows you to create accounting objects. An accounting object is
    composed of packet and byte counters.

    To create and manipulate accounting objects, you need the new
    libnetfilter_acct library. It contains several usage examples:

    libnetfilter_acct/examples# ./nfacct-add http-traffic
    libnetfilter_acct/examples# ./nfacct-get
    http-traffic = { pkts = 000000000000, bytes = 000000000000 };

    Then, you can use one of these accounting objects in several iptables
    rules using the new nfacct match (which comes in a follow-up patch):

    # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic
    # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic

    The idea is simple: if one packet matches the rule, the nfacct match
    updates the counters.

    Thanks to Patrick McHardy, Eric Dumazet, Changli Gao for reviewing and
    providing feedback for this contribution.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
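
As a rough illustration of the accounting objects described in the commit above, the following C sketch models one object as a name plus packet and byte counters, and bumps both whenever a packet matches a rule that references it. The names `acct_obj`, `acct_init` and `acct_update` are hypothetical; this is not the kernel's or libnetfilter_acct's API, where the real objects live in the kernel and are managed over nfnetlink.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model of an accounting object:
 * a name plus packet and byte counters. */
struct acct_obj {
    char     name[32];
    uint64_t pkts;
    uint64_t bytes;
};

/* Create an object with zeroed counters (name is truncated if too long). */
static void acct_init(struct acct_obj *obj, const char *name)
{
    assert(obj != NULL && name != NULL);
    memset(obj, 0, sizeof(*obj));
    strncpy(obj->name, name, sizeof(obj->name) - 1);
}

/* What a match would do for every packet that hits the rule:
 * count one packet and add its length to the byte counter. */
static void acct_update(struct acct_obj *obj, uint32_t pkt_len)
{
    assert(obj != NULL);
    obj->pkts  += 1;
    obj->bytes += pkt_len;
}
```

Because several rules can reference the same object by name, the INPUT and OUTPUT rules shown in the commit message would both end up calling the equivalent of `acct_update()` on one shared `http-traffic` object.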
     

23 Dec, 2011

3 commits

  • Export the NAT definitions to userspace. So far userspace (specifically,
    iptables) has been copying the headers files from include/net. Also
    rename some structures and definitions in preparation for IPv6 NAT.
    Since these have never been officially exported, this doesn't affect
    existing userspace code.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • This partially reworks bc01befdcf3e40979eb518085a075cbf0aacede0
    which added userspace expectation support.

    This patch removes nf_ct_userspace_expect_list, since we now
    require the new iptables CT target feature to add the helper
    extension to conntracks that have expectations attached from
    userspace.

    A new version of the proof-of-concept code that implements
    userspace helpers is available at:

    http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-POC.tar.bz2

    This patch also modifies the CT target to allow setting the
    conntrack's userspace helper status flag. This flag is used
    to tell the conntrack system to explicitly allocate the helper
    extension.

    This helper extension is useful to link the userspace expectations
    with the master conntrack that is being tracked by a userspace
    helper.

    This feature fixes a problem in the current approach to
    userspace helper support. Basically, if the master conntrack that
    has a userspace expectation vanishes, the expectations point to
    an invalid memory address, which triggers an oops in the
    expectation deletion event path.

    I decided not to add a new revision of the CT target because
    I only needed to add a new flag to it. I'll document this
    in the iptables manpage. I have also changed the return
    value from EINVAL to EOPNOTSUPP if an unsupported flag is
    specified. Thus, future features that only require a new flag
    can be added without a new revision.

    There is no official userspace code (apart from the
    proof-of-concept) that uses this infrastructure yet, but there
    will be some by the beginning of 2012.

    Reported-by: Sam Roberts
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • We simply say that regular this_cpu use must be safe regardless of
    preemption and interrupt state. This makes no material difference to
    the x86 and s390 implementations of this_cpu operations. However,
    arches that do not provide their own implementation of this_cpu
    operations will now get generated code that disables interrupts
    instead of preemption.

    -tj: This is part of on-going percpu API cleanup. For detailed
    discussion of the subject, please refer to the following thread.

    http://thread.gmane.org/gmane.linux.kernel/1222078

    Signed-off-by: Christoph Lameter
    Signed-off-by: Tejun Heo
    LKML-Reference:

    Christoph Lameter
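
The CT target commit in this group (the userspace expectation rework) mentions returning EOPNOTSUPP rather than EINVAL when an unsupported flag is passed, so that new flags can be added later without a new target revision. A minimal sketch of that kind of check is shown below. The `DEMO_CT_F_*` names, their values and `demo_ct_check_flags()` are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define DEMO_CT_F_NOTRACK          (1u << 0)
#define DEMO_CT_F_USERSPACE_HELPER (1u << 1)
#define DEMO_CT_F_MASK \
    (DEMO_CT_F_NOTRACK | DEMO_CT_F_USERSPACE_HELPER)

static_assert(DEMO_CT_F_MASK == 0x3u,
              "mask must cover exactly the known flags");

/* Reject any bit outside the known mask with -EOPNOTSUPP, so that
 * userspace can tell "flag not supported here" apart from a
 * malformed request. */
static int demo_ct_check_flags(uint32_t flags)
{
    if (flags & ~DEMO_CT_F_MASK)
        return -EOPNOTSUPP;
    return 0;
}
```

With this shape, supporting a new flag only means widening the mask; a caller running against an older implementation gets a distinct error code for the unknown bit.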
     

05 Dec, 2011

1 commit

  • This tries to do the same thing as fib_validate_source(), but differs
    in several aspects.

    The most important difference is that the reverse path filter built into
    fib_validate_source uses the oif as iif when performing the reverse
    lookup. We do not do this, as the oif is not yet known by the time the
    PREROUTING hook is invoked.

    We can't wait until the FORWARD chain because by the time FORWARD is
    invoked, the ipv4 forward path may have already sent icmp messages in
    response to to-be-discarded-via-rpfilter packets.

    To avoid such an additional lookup in PREROUTING, Patrick McHardy
    suggested attaching the path information directly in the match
    (i.e., just do what the standard ipv4 path does a bit earlier in PREROUTING).

    This works, but it also has a few caveats. Most importantly, when using
    marks in PREROUTING to re-route traffic based on the nfmark, -m rpfilter
    would have to be used after the nfmark has been set; otherwise the nfmark
    would have no effect (because the route is already attached).

    Another problem would be interaction with -j TPROXY, as this target sets an
    nfmark and uses ACCEPT instead of continue, i.e. such a version of
    -m rpfilter cannot be used for the initial to-be-intercepted packets.

    In case it turns out that the oif is required, we can add Patrick's
    suggestion with a new match option (e.g. --rpf-use-oif) to keep ruleset
    compatibility.

    Another difference from the current builtin ipv4 rpfilter is that packets
    subject to ipsec transformation are not automatically excluded. If you
    want this, simply combine -m rpfilter with the policy match.

    Packets arriving on loopback interfaces always match.

    Signed-off-by: Florian Westphal
    Acked-by: David S. Miller
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
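
To make the idea behind the rpfilter match more concrete, here is a much simplified reverse-path check in C: look up a route back to the packet's source address and see whether it leaves through the interface the packet arrived on. The toy routing table, `demo_route_lookup()`, `demo_rpfilter_match()` and `DEMO_LOOPBACK_IFINDEX` are all hypothetical. The sketch does not model nfmark, oif or ipsec handling, and it is not how the kernel's FIB lookup is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy route: prefix/mask in host byte order. */
struct demo_route {
    uint32_t prefix;
    uint32_t mask;
    int      ifindex;
};

#define DEMO_LOOPBACK_IFINDEX 1

/* Longest-prefix match; returns the outgoing ifindex or -1. */
static int demo_route_lookup(const struct demo_route *tbl, size_t n,
                             uint32_t addr)
{
    bool found = false;
    uint32_t best_mask = 0;
    int best_if = -1;

    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if ((addr & tbl[i].mask) != tbl[i].prefix)
            continue;
        /* Contiguous masks: numerically larger means longer prefix. */
        if (!found || tbl[i].mask > best_mask) {
            found = true;
            best_mask = tbl[i].mask;
            best_if = tbl[i].ifindex;
        }
    }
    return best_if;
}

/* True if the route back to saddr uses the interface the packet
 * arrived on. Packets arriving on loopback always match. */
static bool demo_rpfilter_match(const struct demo_route *tbl, size_t n,
                                uint32_t saddr, int iif)
{
    if (iif == DEMO_LOOPBACK_IFINDEX)
        return true;
    return demo_route_lookup(tbl, n, saddr) == iif;
}
```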
     

27 Aug, 2011

1 commit


22 Jul, 2011

1 commit


21 Jul, 2011

3 commits


19 Jul, 2011

1 commit


18 Jul, 2011

1 commit

  • The goal of this patch is to allow nfnetlink providers not to require
    nfnl_mutex to be held while nfnetlink_rcv_msg() calls them.

    If struct nfnl_callback contains a non-NULL call_rcu(), then
    nfnetlink_rcv_msg() will use it instead of the call() field, holding
    rcu_read_lock instead of nfnl_mutex.

    Signed-off-by: Eric Dumazet
    CC: Florian Westphal
    CC: Eric Leblond
    Signed-off-by: Patrick McHardy

    Eric Dumazet
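
The dispatch rule described in the commit above can be sketched in plain C as follows. This is a userspace model under stated assumptions: `demo_callback`, `demo_rcv_msg()` and the two counters standing in for nfnl_mutex and rcu_read_lock are hypothetical, and no real locking takes place.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_CTX_NONE, DEMO_CTX_MUTEX, DEMO_CTX_RCU };

/* Hypothetical model of a callback entry with two entry points. */
struct demo_callback {
    int (*call)(int msg);      /* expects the stand-in mutex held   */
    int (*call_rcu)(int msg);  /* expects only the RCU read section */
};

/* Counters standing in for lock state; not real locks. */
static int demo_mutex_held;
static int demo_rcu_readers;

/* Example handler: reports which context it was invoked in. */
static int demo_report_ctx(int msg)
{
    (void)msg;
    if (demo_rcu_readers > 0)
        return DEMO_CTX_RCU;
    if (demo_mutex_held > 0)
        return DEMO_CTX_MUTEX;
    return DEMO_CTX_NONE;
}

/* Prefer call_rcu when it is set; otherwise fall back to call
 * under the (stand-in) mutex. */
static int demo_rcv_msg(const struct demo_callback *cb, int msg)
{
    int ret = -1;

    assert(cb != NULL);
    if (cb->call_rcu) {
        demo_rcu_readers++;          /* models rcu_read_lock()   */
        ret = cb->call_rcu(msg);
        demo_rcu_readers--;          /* models rcu_read_unlock() */
    } else if (cb->call) {
        demo_mutex_held++;           /* models taking the mutex  */
        ret = cb->call(msg);
        demo_mutex_held--;           /* models releasing it      */
    }
    return ret;
}
```

A provider that only fills in `call` keeps the old behaviour, while one that also sets `call_rcu` is never invoked with the mutex held in this model.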
     

21 Jun, 2011

1 commit


17 Jun, 2011

11 commits


06 Jun, 2011

1 commit

  • The following warning is raised (along with other similar ones):

    net/ipv4/netfilter/nf_nat_standalone.c: In function ‘nf_nat_fn’:
    net/ipv4/netfilter/nf_nat_standalone.c:119:2: warning: case value ‘4’
    not in enumerated type ‘enum ip_conntrack_info’

    gcc complains when two enum values are added and the result is not
    an enumerated value:

    case IP_CT_RELATED+IP_CT_IS_REPLY:

    Add the missing enum values.

    Signed-off-by: Eric Dumazet
    CC: David Miller
    Signed-off-by: Pablo Neira Ayuso

    Eric Dumazet
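
The fix described above can be illustrated with a small standalone enum. The sketch uses hypothetical `DEMO_CT_*` names rather than the kernel's definitions; the point is only that once the sums have their own enumerators, a `switch` over the enum can name them without the compiler warning about a case value outside the enumerated type.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a conntrack info enum. */
enum demo_ct_info {
    DEMO_CT_ESTABLISHED,
    DEMO_CT_RELATED,
    DEMO_CT_NEW,
    DEMO_CT_IS_REPLY,
    /* Named reply-direction values, so the sums are enumerators. */
    DEMO_CT_ESTABLISHED_REPLY = DEMO_CT_ESTABLISHED + DEMO_CT_IS_REPLY,
    DEMO_CT_RELATED_REPLY     = DEMO_CT_RELATED + DEMO_CT_IS_REPLY,
    DEMO_CT_NEW_REPLY         = DEMO_CT_NEW + DEMO_CT_IS_REPLY
};

/* The value that triggered the warning in the commit message. */
static_assert(DEMO_CT_RELATED_REPLY == 4, "related + reply is 4");

/* Every case label is now a member of the enum. */
static bool demo_ct_is_reply(enum demo_ct_info info)
{
    switch (info) {
    case DEMO_CT_ESTABLISHED_REPLY:
    case DEMO_CT_RELATED_REPLY:
    case DEMO_CT_NEW_REPLY:
        return true;
    case DEMO_CT_ESTABLISHED:
    case DEMO_CT_RELATED:
    case DEMO_CT_NEW:
        return false;
    }
    return false;
}
```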
     

27 May, 2011

2 commits


20 Apr, 2011

1 commit


13 Apr, 2011

1 commit


11 Apr, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (34 commits)
    net: Add support for SMSC LAN9530, LAN9730 and LAN89530
    mlx4_en: Restoring RX buffer pointer in case of failure
    mlx4: Sensing link type at device initialization
    ipv4: Fix "Set rt->rt_iif more sanely on output routes."
    MAINTAINERS: add entry for Xen network backend
    be2net: Fix suspend/resume operation
    be2net: Rename some struct members for clarity
    pppoe: drop PPPOX_ZOMBIEs in pppoe_flush_dev
    dsa/mv88e6131: add support for mv88e6085 switch
    ipv6: Enable RFS sk_rxhash tracking for ipv6 sockets (v2)
    be2net: Fix a potential crash during shutdown.
    bna: Fix for handling firmware heartbeat failure
    can: mcp251x: Allow pass IRQ flags through platform data.
    smsc911x: fix mac_lock acquision before calling smsc911x_mac_read
    iwlwifi: accept EEPROM version 0x423 for iwl6000
    rt2x00: fix cancelling uninitialized work
    rtlwifi: Fix some warnings/bugs
    p54usb: IDs for two new devices
    wl12xx: fix potential buffer overflow in testmode nvs push
    zd1211rw: reset rx idle timer from tasklet
    ...

    Linus Torvalds
     

04 Apr, 2011

2 commits

  • We currently use a percpu spinlock to 'protect' rule bytes/packets
    counters, after various attempts to use RCU instead.

    Lately we added a seqlock so that get_counters() can run without
    blocking BH or 'writers'. But we really only need the seqcount in it.

    The spinlock itself is only locked by the current/owner cpu, so we can
    remove it completely.

    This cleans up the API, using correct 'writer' vs 'reader' semantics.

    At replace time, the get_counters() call makes sure all cpus are done
    using the old table.

    Signed-off-by: Eric Dumazet
    Cc: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Eric Dumazet
     
  • The timeout variant of the list:set type must reference the member sets.
    However, its garbage collector runs in timer interrupt context, so mutex
    protection of the references is not an option. Therefore the reference
    protection is converted to an rwlock.

    Signed-off-by: Jozsef Kadlecsik
    Signed-off-by: Patrick McHardy

    Jozsef Kadlecsik
     

31 Mar, 2011

1 commit


20 Mar, 2011

1 commit


19 Mar, 2011

1 commit


16 Mar, 2011

1 commit