24 Aug, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
    netfilter: fix CONFIG_COMPAT support
    isdn/avm: fix build when PCMCIA is not enabled
    header: fix broken headers for user space
    e1000e: don't check for alternate MAC addr on parts that don't support it
    e1000e: disable ASPM L1 on 82573
    ll_temac: Fix poll implementation
    netxen: fix a race in netxen_nic_get_stats()
    qlnic: fix a race in qlcnic_get_stats()
    irda: fix a race in irlan_eth_xmit()
    net: sh_eth: remove unused variable
    netxen: update version 4.0.74
    netxen: fix inconsistent lock state
    vlan: Match underlying dev carrier on vlan add
    ibmveth: Fix opps during MTU change on an active device
    ehea: Fix synchronization between HW and SW send queue
    bnx2x: Update bnx2x version to 1.52.53-4
    bnx2x: Fix PHY locking problem
    rds: fix a leak of kernel memory
    netlink: fix compat recvmsg
    netfilter: fix userspace header warning
    ...

    Linus Torvalds
     

23 Aug, 2010

1 commit

  • __packed is only defined in kernel space, so we should use
    __attribute__((packed)) for the code shared between kernel and user space.

    Two __attribute() annotations are replaced with __attribute__() too.

    Signed-off-by: Changli Gao
    Signed-off-by: David S. Miller

    Changli Gao
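
    The effect of the attribute can be sketched in plain user-space C. The struct and function names below are hypothetical and are not taken from the patched headers; the sketch only shows that __attribute__((packed)) is understood by GCC and Clang without any kernel-only macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record; the compiler is free to pad after 'kind'. */
struct demo_unpacked {
    uint8_t  kind;
    uint32_t value;
};

/* Same fields, annotated with the compiler attribute directly rather
 * than the kernel-only __packed shorthand, so user space can build it. */
struct demo_packed {
    uint8_t  kind;
    uint32_t value;
} __attribute__((packed));

size_t demo_unpacked_size(void) { return sizeof(struct demo_unpacked); }
size_t demo_packed_size(void)   { return sizeof(struct demo_packed); }
```

    With the attribute the struct occupies exactly the sum of its member sizes, which is what a layout shared between kernel and user space usually relies on.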
     

19 Aug, 2010

1 commit

  • "make headers_check" issued the following warning:

    CHECK include/linux/netfilter (64 files)
    usr/include/linux/netfilter/xt_ipvs.h:19: found __[us]{8,16,32,64} type without #include

    Fix this, as suggested, by including linux/types.h.

    Signed-off-by: Sam Ravnborg
    Signed-off-by: David S. Miller

    Sam Ravnborg
     

15 Aug, 2010

1 commit


23 Jul, 2010

3 commits

  • We should copy the initial value to userspace for iptables-save and
    to allow removal of specific quota rules.

    Signed-off-by: Changli Gao
    Signed-off-by: Patrick McHardy

    Changli Gao
     
  • In some situations a CPU match permits a better spreading of
    connections, or selecting targets only for a given cpu.

    With Receive Packet Steering or a multiqueue NIC and appropriate IRQ
    affinities, we can distribute traffic on available cpus, per session.
    (all RX packets for a given flow are handled by a given cpu)

    Some legacy applications are not SMP friendly, so one way to scale a
    server is to run multiple copies of them.

    Instead of randomly choosing an instance, we can use the cpu number as a
    key so that softirq handler for a whole instance is running on a single
    cpu, maximizing cache effects in TCP/UDP stacks.

    Using NAT for example, a four-way machine might run four copies of a
    server application, using a separate listening port for each instance,
    but still presenting a unique external port:

    iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 0 \
    -j REDIRECT --to-port 8080

    iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 1 \
    -j REDIRECT --to-port 8081

    iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 2 \
    -j REDIRECT --to-port 8082

    iptables -t nat -A PREROUTING -p tcp --dport 80 -m cpu --cpu 3 \
    -j REDIRECT --to-port 8083

    Signed-off-by: Eric Dumazet
    Signed-off-by: Patrick McHardy

    Eric Dumazet
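
    The matching logic behind the four rules above can be sketched as follows. This is a user-space model under stated assumptions: the struct, its fields and both function names are hypothetical, and the current CPU is passed in as a parameter instead of being read from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical rule data: the CPU to match and an inversion flag. */
struct cpu_rule {
    uint32_t cpu;
    bool     invert;
};

/* True when the packet is being processed on the rule's CPU
 * (or on any other CPU when the rule is inverted). */
bool cpu_rule_matches(const struct cpu_rule *rule, uint32_t current_cpu)
{
    return (rule->cpu == current_cpu) != rule->invert;
}

/* Port selection implied by the four REDIRECT rules: CPU n -> 8080 + n.
 * Returns 0 when no rule covers the CPU. */
uint16_t redirect_port_for_cpu(uint32_t current_cpu)
{
    for (uint32_t n = 0; n < 4; n++) {
        struct cpu_rule rule = { .cpu = n, .invert = false };

        if (cpu_rule_matches(&rule, current_cpu))
            return (uint16_t)(8080 + n);
    }
    return 0;
}
```

    Because a flow stays on one CPU, every packet of that flow selects the same listening port, which is what keeps a connection tied to a single server instance.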
     
  • This implements the kernel-space side of the netfilter matcher xt_ipvs.

    [ minor fixes by Simon Horman ]
    Signed-off-by: Hannes Eder
    Signed-off-by: Simon Horman
    [ Patrick: added xt_ipvs.h to Kbuild ]
    Signed-off-by: Patrick McHardy

    Hannes Eder
     

16 Jul, 2010

1 commit


15 Jul, 2010

2 commits

  • This adds a `CHECKSUM' target, which can be used in the iptables mangle
    table.

    You can use this target to compute and fill in the checksum in
    a packet that lacks a checksum. This is particularly useful
    if you need to work around old applications, such as dhcp clients,
    that do not work well with checksum offloads, but you don't want to
    disable checksum offload in your device.

    The problem happens in the field with virtualized applications.
    For reference, see Red Hat bz 605555, as well as
    http://www.spinics.net/lists/kvm/msg37660.html

    Typical expected use (helps old dhclient binary running in a VM):
    iptables -A POSTROUTING -t mangle -p udp --dport bootpc \
    -j CHECKSUM --checksum-fill

    Includes fixes by Jan Engelhardt

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Patrick McHardy

    Michael S. Tsirkin
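
    For readers unfamiliar with what "filling in" a checksum involves, here is a minimal sketch of the ones' complement Internet checksum in the style of RFC 1071. It is not the code path the target uses inside the kernel; the function name is hypothetical and the buffer is assumed small enough (well under 128 KiB) that the 32-bit accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement checksum over a byte buffer (RFC 1071 style).
 * The buffer is read as big-endian 16-bit words; the return value is
 * the number to store in the checksum field. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)                     /* odd trailing byte, zero padded */
        sum += (uint32_t)data[i] << 8;

    while (sum >> 16)                /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

    For the eight bytes 00 01 f2 03 f4 f5 f6 f7 used as the example in RFC 1071, the folded sum is 0xddf2 and the function returns 0x220d.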
     
  • This patch moves the NFULNL_COPY_DISABLED definition from
    linux/netfilter/nfnetlink_log.h to net/netfilter/nfnetlink_log.h
    since this copy mode is only for internal use.

    I have also changed the value from 0x03 to 0xff. This avoids a gap
    in the values visible from user-space, which could confuse users
    if we add new copy modes in the future.

    This change was introduced in:
    http://www.spinics.net/lists/netfilter-devel/msg13535.html

    Since this change is not included in any stable Linux kernel,
    I think it's safe to make this change now. Anyway, this copy
    mode does not make any sense from user-space, so this patch
    should not break any existing setup.

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Patrick McHardy

    Pablo Neira Ayuso
     

15 Jun, 2010

2 commits

  • Conflicts:
    include/net/netfilter/xt_rateest.h
    net/bridge/br_netfilter.c
    net/netfilter/nf_conntrack_core.c

    Signed-off-by: Patrick McHardy

    Patrick McHardy
     
  • This patch implements an idletimer Xtables target that can be used to
    identify when interfaces have been idle for a certain period of time.

    Timers are identified by labels and are created when a rule is set with a new
    label. The rules also take a timeout value (in seconds) as an option. If
    more than one rule uses the same timer label, the timer will be restarted
    whenever any of the rules get a hit.

    One entry for each timer is created in sysfs. This attribute contains the
    time remaining before the timer expires. The attributes are located under
    the xt_idletimer class:

    /sys/class/xt_idletimer/timers/

    When the timer expires, the target module sends a sysfs notification to
    userspace, which can then decide what to do (e.g. disconnect to save power).

    Cc: Timo Teras
    Signed-off-by: Luciano Coelho
    Signed-off-by: Patrick McHardy

    Luciano Coelho
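
    The label and restart semantics described above can be modelled in user space as follows. Everything here is a hypothetical sketch: the table size, the label length limit and the function names are assumptions, time is passed in explicitly, and timer creation (which the commit ties to adding a rule) is folded into the same call as a hit for brevity.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TIMERS 8     /* assumed table size */
#define LABEL_LEN  28    /* assumed label buffer size */

/* Hypothetical model of one labelled idle timer. */
struct idle_timer {
    char label[LABEL_LEN];
    long expires_at;     /* absolute time in seconds */
    int  in_use;
};

static struct idle_timer timers[MAX_TIMERS];

/* Find a timer by label, or NULL if none exists yet. */
struct idle_timer *timer_find(const char *label)
{
    for (size_t i = 0; i < MAX_TIMERS; i++)
        if (timers[i].in_use && strcmp(timers[i].label, label) == 0)
            return &timers[i];
    return NULL;
}

/* A rule with this label was added or got a hit: create the timer if
 * needed and (re)start it to expire 'timeout' seconds after 'now'.
 * Returns 0 on success, -1 if the table is full or the label too long. */
int timer_hit(const char *label, long timeout, long now)
{
    struct idle_timer *t = timer_find(label);

    if (!t) {
        if (strlen(label) >= LABEL_LEN)
            return -1;
        for (size_t i = 0; i < MAX_TIMERS && !t; i++)
            if (!timers[i].in_use)
                t = &timers[i];
        if (!t)
            return -1;
        strcpy(t->label, label);
        t->in_use = 1;
    }
    t->expires_at = now + timeout;
    return 0;
}

/* Seconds left before expiry, in the spirit of the sysfs attribute:
 * 0 once expired, -1 for an unknown label. */
long timer_remaining(const char *label, long now)
{
    const struct idle_timer *t = timer_find(label);

    if (!t)
        return -1;
    return t->expires_at > now ? t->expires_at - now : 0;
}
```

    Two rules that share a label end up calling timer_hit() on the same entry, so a hit on either one pushes the expiry out again.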
     

14 Jun, 2010

1 commit

  • - must use atomic_inc_not_zero() in instance_lookup_get()

    - must use hlist_add_head_rcu() instead of hlist_add_head()

    - must use hlist_del_rcu() instead of hlist_del()

    - Introduce NFULNL_COPY_DISABLED to stop lockless reader from using an
    instance, before we do final instance_put() on it.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Patrick McHardy

    Eric Dumazet
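
    Why the lookup must use an increment-unless-zero operation can be illustrated with C11 atomics. This is a user-space analogue, not the kernel's atomic_inc_not_zero() implementation, and the function name is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the object is still alive (count > 0).
 * A lockless reader may find an instance whose last reference is
 * being dropped; a plain increment would resurrect it. */
bool ref_get_unless_zero(atomic_int *refcount)
{
    int old = atomic_load(refcount);

    while (old != 0) {
        /* On failure 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(refcount, &old, old + 1))
            return true;
    }
    return false;
}
```

    A reader that gets false back has to treat the instance as already gone, even though it was still reachable through the list.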
     

08 Jun, 2010

1 commit

  • NOTRACK makes all cpus share a cache line on nf_conntrack_untracked
    twice per packet. This is bad for performance.
    The __read_mostly annotation is also a bad choice.

    This patch introduces an IPS_UNTRACKED bit so that we can later use a
    per_cpu untracked structure more easily.

    A new helper, nf_ct_untracked_get() returns a pointer to
    nf_conntrack_untracked.

    Another one, nf_ct_untracked_status_or() is used by nf_nat_init() to add
    IPS_NAT_DONE_MASK bits to untracked status.

    nf_ct_is_untracked() prototype is changed to work on a nf_conn pointer.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Patrick McHardy

    Eric Dumazet
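
    One plausible reading of these helpers, sketched in user-space C. The bit position, the struct and all demo_ names are hypothetical stand-ins; only the overall shape (a status bit, a getter, and a helper that ORs bits into the untracked entry) follows the description above.

```c
#include <stdbool.h>

/* Hypothetical bit position; the kernel's real value may differ. */
#define DEMO_UNTRACKED_BIT 12
#define DEMO_UNTRACKED     (1UL << DEMO_UNTRACKED_BIT)

struct demo_conn {
    unsigned long status;
};

/* Single shared "untracked" entry. */
static struct demo_conn demo_untracked = { .status = DEMO_UNTRACKED };

struct demo_conn *demo_untracked_get(void)
{
    return &demo_untracked;
}

/* OR extra bits into the untracked entry's status. */
void demo_untracked_status_or(unsigned long bits)
{
    demo_untracked.status |= bits;
}

/* Decide from the status bit rather than by comparing pointers, so
 * that several (for example per-CPU) untracked entries could exist. */
bool demo_is_untracked(const struct demo_conn *ct)
{
    return (ct->status & DEMO_UNTRACKED) != 0;
}
```

    Testing a bit instead of an address is what makes a later switch to per-CPU entries straightforward: callers no longer depend on there being exactly one untracked object.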
     

31 May, 2010

1 commit

  • commit f3c5c1bfd4 (netfilter: xtables: make ip_tables reentrant)
    introduced a performance regression, because the stackptr array is
    shared by all cpus, adding cache line ping pongs. (16 cpus share a
    64-byte cache line)

    Fix this using alloc_percpu().

    Signed-off-by: Eric Dumazet
    Acked-By: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Eric Dumazet
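
    The layout problem can be sketched in user space. Here cache-line alignment stands in for alloc_percpu(), which is a kernel facility; the 64-byte line size, the CPU count and all names are assumptions taken from or added to the figures quoted above.

```c
#include <stdalign.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64   /* assumed cache-line size in bytes */
#define DEMO_NR_CPUS    16

/* Before: sixteen small slots packed next to each other, so a write
 * by any CPU invalidates the same cache line for all the others. */
struct shared_stackptrs {
    unsigned int ptr[DEMO_NR_CPUS];
};

/* After: each CPU's slot sits in its own cache line. */
struct percpu_stackptr {
    alignas(DEMO_CACHE_LINE) unsigned int ptr;
};

static struct percpu_stackptr percpu_slots[DEMO_NR_CPUS];

/* Return the slot private to 'cpu' (cpu must be < DEMO_NR_CPUS). */
unsigned int *stackptr_for_cpu(unsigned int cpu)
{
    return &percpu_slots[cpu].ptr;
}
```

    The cost is memory: one full line per CPU instead of a few bytes, in exchange for writes that no longer bounce a shared line between CPUs.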
     

20 May, 2010

1 commit


12 May, 2010

4 commits


27 Apr, 2010

1 commit

  • There has been quite some confusion in userspace about
    XT_FUNCTION_MAXNAMELEN; because struct xt_entry_match used MAX-1,
    userspace would have to do an awkward MAX-2 for maximum length
    checking (due to '\0'). This patch adds a new define that matches the
    definition of XT_TABLE_MAXNAMELEN - being the size of the actual
    struct member, not one off.

    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Jan Engelhardt
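
    The off-by-one arithmetic can be made concrete with a small sketch. The DEMO_ constants and the struct are illustrative stand-ins rather than the real header contents, and the values 30 and 29 are assumptions chosen to show the relationship between the two defines.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_FUNCTION_MAXNAMELEN  30  /* older constant (assumed value) */
#define DEMO_EXTENSION_MAXNAMELEN 29  /* size of the name member itself */

struct demo_entry_match {
    char          name[DEMO_EXTENSION_MAXNAMELEN];
    unsigned char revision;
};

/* A name fits if it leaves room for the terminating '\0'. */
bool demo_name_fits(const char *name)
{
    return strlen(name) < DEMO_EXTENSION_MAXNAMELEN;
}
```

    With a define that equals the member size, the usual "strlen(name) < SIZE" check is correct as written, and the awkward MAX-2 disappears.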
     

23 Apr, 2010

1 commit


20 Apr, 2010

2 commits


19 Apr, 2010

2 commits

  • Currently, the table traverser stores return addresses in the ruleset
    itself (struct ip6t_entry->comefrom). This has a well-known drawback:
    the jumpstack is overwritten on reentry, making it necessary for
    targets to return absolute verdicts. Also, the ruleset (which might
    be heavy memory-wise) needs to be replicated for each CPU that can
    possibly invoke ip6t_do_table.

    This patch decouples the jumpstack from struct ip6t_entry and instead
    puts it into xt_table_info. Not being restricted by 'comefrom'
    anymore, we can set up a stack as needed. By default, there is room
    allocated for two entries into the traverser.

    arp_tables is not touched though, because there is just one/two
    modules and further patches seek to collapse the table traverser
    anyhow.

    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Jan Engelhardt
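
    The idea of keeping return addresses outside the ruleset can be sketched with a toy traverser. This is not the ip6_tables code: the rule format, the stack depth, the verdict names and the choice to accept on a return from the base chain are all assumptions made for illustration.

```c
#include <stddef.h>

enum demo_action {
    DEMO_CONTINUE, DEMO_JUMP, DEMO_RETURN, DEMO_ACCEPT, DEMO_DROP
};

struct demo_rule {
    enum demo_action action;
    size_t           target;   /* rule index, used by DEMO_JUMP only */
};

#define DEMO_STACK_DEPTH 8     /* assumed maximum jump depth */

/* Walk the rules from 'start'. Return addresses go onto a stack that
 * lives outside the rule array, so the rules themselves stay read-only.
 * Yields DEMO_ACCEPT or DEMO_DROP; malformed rulesets yield DEMO_DROP. */
enum demo_action demo_do_table(const struct demo_rule *rules, size_t nrules,
                               size_t start)
{
    size_t stack[DEMO_STACK_DEPTH];
    size_t depth = 0;
    size_t pos = start;

    while (pos < nrules) {
        const struct demo_rule *r = &rules[pos];

        switch (r->action) {
        case DEMO_JUMP:
            if (depth == DEMO_STACK_DEPTH)
                return DEMO_DROP;
            stack[depth++] = pos + 1;   /* resume after the jump */
            pos = r->target;
            break;
        case DEMO_RETURN:
            if (depth == 0)
                return DEMO_ACCEPT;     /* assumed base-chain policy */
            pos = stack[--depth];
            break;
        case DEMO_ACCEPT:
        case DEMO_DROP:
            return r->action;
        default:                        /* DEMO_CONTINUE */
            pos++;
            break;
        }
    }
    return DEMO_DROP;
}
```

    Since the rule array is only read, one copy can be shared by every caller, and each invocation (including a re-entrant one) gets its own stack.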
     
  • xt_TEE can be used to clone and reroute a packet. This can for
    example be used to copy traffic at a router for logging purposes
    to another dedicated machine.

    References: http://www.gossamer-threads.com/lists/iptables/devel/68781
    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Jan Engelhardt
     

13 Apr, 2010

1 commit

  • XT_ALIGN() was rewritten in terms of ALIGN() by commit 42107f5009da223daa800d6da6904d77297ae829
    "netfilter: xtables: symmetric COMPAT_XT_ALIGN definition".
    ALIGN() is not exported in userspace headers, which created a compile problem for tc(8)
    and will create problems for iptables(8).

    We can't export the generic-looking name ALIGN(), but we can export the less generic
    __ALIGN_KERNEL() (suggested by Ben Hutchings).
    Google knows nothing about __ALIGN_KERNEL().

    COMPAT_XT_ALIGN() is changed for symmetry.

    Reported-by: Andreas Henriksson
    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Patrick McHardy

    Alexey Dobriyan
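
    What such an alignment helper computes can be shown with a user-space stand-in. The DEMO_ macros are hypothetical and are not the exported kernel definitions, and the alignment of 8 used by demo_xt_align() is an assumption for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for an exported alignment helper: round x up
 * to the next multiple of a, where a must be a power of two. */
#define DEMO_ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
#define DEMO_ALIGN(x, a)         DEMO_ALIGN_MASK((x), (size_t)(a) - 1)

/* Round a length up to an assumed 8-byte alignment. */
size_t demo_xt_align(size_t len)
{
    return DEMO_ALIGN(len, 8);
}
```

    Giving the macro a prefixed, kernel-specific name is the point of the change: a bare ALIGN() in an exported header would be likely to collide with a macro of the same name in application code.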
     

25 Mar, 2010

3 commits


21 Mar, 2010

1 commit


18 Mar, 2010

4 commits


17 Mar, 2010

4 commits

  • Signed-off-by: Tim Gardner
    Signed-off-by: Patrick McHardy

    Tim Gardner
     
  • One of the problems with the way xt_recent is implemented is that
    there is no efficient way to remove expired entries. Of course,
    one can write a rule '-m recent --remove', but you have to know
    beforehand which entry to delete. This commit adds reaper
    logic which checks the head of the LRU list when a rule
    is invoked that has a '--seconds' value and XT_RECENT_REAP set. If an
    entry ceases to accumulate time stamps, then it will eventually bubble
    to the top of the LRU list where it is then reaped.

    Signed-off-by: Tim Gardner
    Signed-off-by: Eric Dumazet
    Signed-off-by: Patrick McHardy

    Tim Gardner
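
    The reaping idea can be sketched with a small array-backed LRU list. This is a user-space model, not the xt_recent data structure: the table size, the single time stamp per entry and every demo_ name are assumptions, and time is passed in explicitly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_ENTRIES 16

struct demo_entry {
    unsigned int addr;
    long         last_seen;   /* most recent time stamp, in seconds */
};

/* Entries kept in LRU order: index 0 is the least recently updated. */
struct demo_table {
    struct demo_entry lru[DEMO_MAX_ENTRIES];
    size_t            count;
};

/* Record a hit: move (or append) the entry to the tail of the list.
 * Returns false if a new entry is needed but the table is full. */
bool demo_touch(struct demo_table *t, unsigned int addr, long now)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        if (t->lru[i].addr == addr)
            break;
    if (i == t->count) {
        if (t->count == DEMO_MAX_ENTRIES)
            return false;
        t->count++;
    } else {
        memmove(&t->lru[i], &t->lru[i + 1],
                (t->count - i - 1) * sizeof(t->lru[0]));
    }
    t->lru[t->count - 1] = (struct demo_entry){ addr, now };
    return true;
}

/* Reaper: look only at the head of the list and drop it if its newest
 * time stamp is more than 'seconds' old. Returns true if it reaped. */
bool demo_reap_head(struct demo_table *t, long now, long seconds)
{
    if (t->count == 0 || now - t->lru[0].last_seen <= seconds)
        return false;
    memmove(&t->lru[0], &t->lru[1], (t->count - 1) * sizeof(t->lru[0]));
    t->count--;
    return true;
}
```

    Checking only the head keeps the per-packet cost constant: an entry that stops receiving hits drifts to the front as others are touched, and is removed the next time a reaping rule runs.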
     
  • Signed-off-by: Jan Engelhardt

    Jan Engelhardt
     
  • Two arguments for combining the two:
    - xt_mark is pretty useless without xt_MARK
    - the actual code is so small anyway that the kmod metadata and the module
    in its loaded state totally outweigh the combined actual code size.

    i586-before:
    -rw-r--r-- 1 jengelh users 3821 Feb 10 01:01 xt_MARK.ko
    -rw-r--r-- 1 jengelh users 2592 Feb 10 00:04 xt_MARK.o
    -rw-r--r-- 1 jengelh users 3274 Feb 10 01:01 xt_mark.ko
    -rw-r--r-- 1 jengelh users 2108 Feb 10 00:05 xt_mark.o
    text data bss dec hex filename
    354 264 0 618 26a xt_MARK.o
    223 176 0 399 18f xt_mark.o
    And the runtime size is like 14 KB.

    i586-after:
    -rw-r--r-- 1 jengelh users 3264 Feb 18 17:28 xt_mark.o

    Signed-off-by: Jan Engelhardt

    Jan Engelhardt
     

08 Mar, 2010

1 commit