22 Sep, 2009

1 commit

  • Sizing of memory allocations shouldn't depend on the number of physical
    pages found in a system, as that generally includes (perhaps a huge amount
    of) non-RAM pages. The amount of what actually is usable as storage
    should instead be used as a basis here.

    Some of the calculations (i.e. those not intending to use high memory)
    should likely even use (totalram_pages - totalhigh_pages).

    Signed-off-by: Jan Beulich
    Acked-by: Rusty Russell
    Acked-by: Ingo Molnar
    Cc: Dave Airlie
    Cc: Kyle McMartin
    Cc: Jeremy Fitzhardinge
    Cc: Pekka Enberg
    Cc: Hugh Dickins
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
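
The sizing rule in the commit above can be sketched in user-space C. This is an illustration only, not the kernel change itself: `struct mem_info`, `pages_for()`, `alloc_fits()` and the 4 KiB page size are assumptions made for the example. Only the counter names `totalram_pages` and `totalhigh_pages` come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical snapshot of the counters named in the commit message. */
struct mem_info {
    unsigned long num_physpages;   /* all page frames, may include non-RAM pages */
    unsigned long totalram_pages;  /* pages actually usable as storage */
    unsigned long totalhigh_pages; /* part of totalram_pages that is high memory */
};

/* Number of pages needed to hold `size` bytes, rounded up. */
static unsigned long pages_for(size_t size)
{
    return (size + (1UL << PAGE_SHIFT) - 1) >> PAGE_SHIFT;
}

/*
 * Accept a request only if it fits into usable RAM. Callers that cannot
 * use high memory are checked against (totalram_pages - totalhigh_pages).
 * num_physpages is deliberately not consulted.
 */
static bool alloc_fits(const struct mem_info *mi, size_t size, bool may_use_highmem)
{
    unsigned long limit = mi->totalram_pages;

    if (!may_use_highmem)
        limit -= mi->totalhigh_pages;
    return pages_for(size) <= limit;
}
```

With a large `num_physpages` but only 1000 usable pages, a request for 1001 pages is rejected even though it would have passed a check against the physical page count.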
     

24 Aug, 2009

1 commit


13 Jun, 2009

1 commit


08 May, 2009

1 commit


29 Apr, 2009

1 commit

  • The x_tables are organized with a table structure and per-cpu copies
    of the counters and rules. On older kernels there was a reader/writer
    lock per table, which was a performance bottleneck. In 2.6.30-rc, this
    was converted to use RCU for the counters/rules, which solved the performance
    problems for do_table but made replacing rules much slower because of
    the necessary RCU grace period.

    This version uses a per-cpu set of spinlocks and counters to allow
    table processing to proceed without the cache thrashing of a global
    reader lock, while keeping the same performance for table updates.

    Signed-off-by: Stephen Hemminger
    Acked-by: Linus Torvalds
    Signed-off-by: David S. Miller

    Stephen Hemminger
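
A minimal user-space sketch of the per-cpu locking idea described above, using C11 atomics in place of kernel spinlocks. The names (`struct percpu_slot`, `count_packet()`, `sum_counters()`) and the fixed `NR_CPUS` are hypothetical and do not reproduce the kernel code. The point is only that the packet path touches its own CPU's lock, while the reader of the counters visits each CPU's lock in turn.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS 4 /* assumption: fixed CPU count for the sketch */

struct counter {
    unsigned long long pcnt; /* packets */
    unsigned long long bcnt; /* bytes */
};

/* One lock and one counter per CPU (hypothetical layout). */
struct percpu_slot {
    atomic_bool locked;
    struct counter cnt;
};

static struct percpu_slot slots[NR_CPUS];

static void slot_lock(int cpu)
{
    /* Spin until the previous value was false, i.e. we took the lock. */
    while (atomic_exchange_explicit(&slots[cpu].locked, true,
                                    memory_order_acquire))
        ;
}

static void slot_unlock(int cpu)
{
    atomic_store_explicit(&slots[cpu].locked, false, memory_order_release);
}

/* Packet path: only the local CPU's lock and counter are touched. */
static void count_packet(int cpu, unsigned int bytes)
{
    slot_lock(cpu);
    slots[cpu].cnt.pcnt += 1;
    slots[cpu].cnt.bcnt += bytes;
    slot_unlock(cpu);
}

/* Read path: take each CPU's lock in turn while sampling its counter. */
static struct counter sum_counters(void)
{
    struct counter total = { 0, 0 };

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        slot_lock(cpu);
        total.pcnt += slots[cpu].cnt.pcnt;
        total.bcnt += slots[cpu].cnt.bcnt;
        slot_unlock(cpu);
    }
    return total;
}
```

Because no lock is shared between CPUs on the packet path, there is no single cache line that every CPU has to write, which is the cache thrashing the commit message refers to.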
     

25 Mar, 2009

1 commit


20 Feb, 2009

1 commit

  • The reader/writer lock in ip_tables is acquired in the critical path of
    processing packets and is one of the reasons just loading iptables can cause
    a 20% performance loss. The rwlock serves two functions:

    1) it prevents changes to table state (xt_replace) while the table is in use.
    This is now handled by using RCU on the xt_table. When a table is
    replaced, the new table(s) are put in and the old table(s) are freed
    after an RCU grace period.

    2) it provides synchronization when accessing the counter values.
    This is now handled by swapping in new table_info entries for each cpu
    then summing the old values, and putting the result back onto one
    cpu. On a busy system it may cause sampling to occur at different
    times on each cpu, but no packet/byte counts are lost in the process.

    Signed-off-by: Stephen Hemminger

    Successfully tested on my dual quad core machine too, but iptables only (no ipv6 here)
    BTW, my new "tbench 8" result is 2450 MB/s, (it was 2150 MB/s not so long ago)

    Acked-by: Eric Dumazet
    Signed-off-by: Patrick McHardy

    Stephen Hemminger
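
The counter handling in point 2 above (swap in new per-cpu entries, sum the old values, put the result back onto one cpu) can be sketched as follows. This is a single-threaded illustration under stated assumptions: the type and function names are hypothetical, and the pointer swap stands in for what the kernel publishes under RCU.

```c
#include <string.h>

#define NR_CPUS  4 /* assumption: fixed sizes for the sketch */
#define NR_RULES 2

struct ctr {
    unsigned long long pcnt; /* packets */
    unsigned long long bcnt; /* bytes */
};

/* One CPU's private copy of the rule counters. */
struct cpu_entries {
    struct ctr rule[NR_RULES];
};

/* The table points at the entries each CPU is currently updating. */
struct table {
    struct cpu_entries *cpu[NR_CPUS];
};

/*
 * Swap zeroed entries in for every CPU, sum the old values into `out`,
 * then add the totals onto CPU 0 so that no count is lost.
 */
static void snapshot_counters(struct table *t,
                              struct cpu_entries fresh[NR_CPUS],
                              struct ctr out[NR_RULES])
{
    struct cpu_entries *old[NR_CPUS];

    memset(out, 0, sizeof(struct ctr) * NR_RULES);

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        memset(&fresh[cpu], 0, sizeof(fresh[cpu]));
        old[cpu] = t->cpu[cpu];
        t->cpu[cpu] = &fresh[cpu]; /* kernel: published via RCU */
    }

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        for (int r = 0; r < NR_RULES; r++) {
            out[r].pcnt += old[cpu]->rule[r].pcnt;
            out[r].bcnt += old[cpu]->rule[r].bcnt;
        }
    }

    /* Use += because packets may already have hit the fresh entries. */
    for (int r = 0; r < NR_RULES; r++) {
        t->cpu[0]->rule[r].pcnt += out[r].pcnt;
        t->cpu[0]->rule[r].bcnt += out[r].bcnt;
    }
}
```

After the call, the per-rule totals live on CPU 0 and every other CPU starts again from zero, so the sum across CPUs is unchanged.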
     

18 Feb, 2009

1 commit


13 Jan, 2009

1 commit

  • Commit 55b69e91 (netfilter: implement NFPROTO_UNSPEC as a wildcard
    for extensions) broke revision probing for matches and targets that
    are registered with NFPROTO_UNSPEC.

    Fix by continuing the search on the NFPROTO_UNSPEC list if nothing
    is found on the af-specific lists.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
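
The fallback described in the fix above can be sketched like this. It is a simplified user-space model, not the kernel function: the extension table, `rev_lookup_one()` and `rev_lookup()` are hypothetical names, and the family constants are used only as labels.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { NFPROTO_UNSPEC = 0, NFPROTO_IPV4 = 2 }; /* used only as labels here */

/* Hypothetical registry entry: one extension at one revision. */
struct ext {
    int family;
    const char *name;
    unsigned int revision;
};

static const struct ext registered[] = {
    { NFPROTO_IPV4,   "ah",   0 },
    { NFPROTO_UNSPEC, "mark", 0 },
    { NFPROTO_UNSPEC, "mark", 1 },
};

/* Search one family; track the best revision and report an exact hit. */
static bool rev_lookup_one(int family, const char *name,
                           unsigned int revision, int *best)
{
    bool have_rev = false;

    for (size_t i = 0; i < sizeof(registered) / sizeof(registered[0]); i++) {
        const struct ext *e = &registered[i];

        if (e->family != family || strcmp(e->name, name) != 0)
            continue;
        if ((int)e->revision > *best)
            *best = (int)e->revision;
        if (e->revision == revision)
            have_rev = true;
    }
    return have_rev;
}

/* Probe the af-specific list first, then continue on the wildcard list. */
static bool rev_lookup(int family, const char *name,
                       unsigned int revision, int *best)
{
    bool have_rev = rev_lookup_one(family, name, revision, best);

    if (!have_rev && family != NFPROTO_UNSPEC)
        have_rev = rev_lookup_one(NFPROTO_UNSPEC, name, revision, best);
    return have_rev;
}
```

Without the second lookup, probing "mark" for NFPROTO_IPV4 would find nothing, because that extension is only present on the NFPROTO_UNSPEC list.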
     

08 Oct, 2008

9 commits


02 May, 2008

1 commit


29 Apr, 2008

1 commit


14 Apr, 2008

1 commit


26 Mar, 2008

1 commit


01 Feb, 2008

7 commits

  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     
  • Propagate netns together with AF down to ->start/->next/->stop
    iterators. Choose table based on netns and AF for showing.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     
  • There are many small but still wrong things with /proc/net/*_tables_*
    so I decided to overhaul it, simultaneously making it more suitable for
    a per-netns /proc/net/*_tables_* implementation.

    Fix
    a) xt_get_idx() duplicating now standard seq_list_start/seq_list_next
    iterators
    b) tables/matches/targets list was chosen again and again on every ->next
    c) multiple useless "af >= NPROTO" checks -- we simply don't supply invalid
    AFs there and the registration function should BUG_ON instead.

    Regardless, the one in ->next() is the most useless -- ->next doesn't
    run at all if ->start fails.
    d) Don't use mutex_lock_interruptible() -- it can fail and ->stop is
    executed even if ->start failed, so unlock without lock is possible.

    As a side effect, streamline the code by splitting xt_tgt_ops into xt_target_ops,
    xt_matches_ops, xt_tables_ops.

    xt_tables_ops hooks will be changed by per-netns code. The code of
    xt_matches_ops and xt_target_ops is identical except for the list chosen for
    iterating, but I think consolidating the code for the two files is not worth it
    given the "<< 16" hacks needed for it.

    [Patrick: removed unused enum in x_tables.c]

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Alexey Dobriyan
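
Point d) above is easiest to see with a small model of the iterator call order. The sketch below is an assumption-laden illustration: the lock is a plain counter, the item list is made up, and `walk()` only mimics the order in which ->start, ->next and ->stop are invoked, with ->stop always called as the commit message describes.

```c
#include <stddef.h>

static int lock_depth; /* stands in for the mutex: 1 when held, 0 when free */

static void sketch_lock(void)   { lock_depth++; }
static void sketch_unlock(void) { lock_depth--; }

static const char *items[] = { "filter", "nat", "mangle" };
#define NR_ITEMS (sizeof(items) / sizeof(items[0]))

/* ->start: take the lock unconditionally, then position the iterator. */
static const char **tbl_start(size_t pos)
{
    sketch_lock();
    return pos < NR_ITEMS ? &items[pos] : NULL;
}

/* ->next: advance, or return NULL at the end of the list. */
static const char **tbl_next(const char **v)
{
    return (size_t)(v + 1 - items) < NR_ITEMS ? v + 1 : NULL;
}

/* ->stop: runs even when ->start returned NULL, so it must find the lock held. */
static void tbl_stop(void)
{
    sketch_unlock();
}

/* Minimal driver mimicking the call order; returns the items visited. */
static size_t walk(size_t pos)
{
    size_t n = 0;

    for (const char **v = tbl_start(pos); v != NULL; v = tbl_next(v))
        n++;
    tbl_stop();
    return n;
}
```

If `tbl_start()` could fail before taking the lock, as an interruptible lock acquisition can, `tbl_stop()` would unlock a lock that was never taken and `lock_depth` would go negative.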
     
  • CHECK net/ipv4/netfilter/ip_tables.c
    net/ipv4/netfilter/ip_tables.c:1453:8: warning: incorrect type in argument 3 (different signedness)
    net/ipv4/netfilter/ip_tables.c:1453:8: expected int *size
    net/ipv4/netfilter/ip_tables.c:1453:8: got unsigned int [usertype] *size
    net/ipv4/netfilter/ip_tables.c:1458:44: warning: incorrect type in argument 3 (different signedness)
    net/ipv4/netfilter/ip_tables.c:1458:44: expected int *size
    net/ipv4/netfilter/ip_tables.c:1458:44: got unsigned int [usertype] *size
    net/ipv4/netfilter/ip_tables.c:1603:2: warning: incorrect type in argument 2 (different signedness)
    net/ipv4/netfilter/ip_tables.c:1603:2: expected unsigned int *i
    net/ipv4/netfilter/ip_tables.c:1603:2: got int *
    net/ipv4/netfilter/ip_tables.c:1627:8: warning: incorrect type in argument 3 (different signedness)
    net/ipv4/netfilter/ip_tables.c:1627:8: expected int *size
    net/ipv4/netfilter/ip_tables.c:1627:8: got unsigned int *size
    net/ipv4/netfilter/ip_tables.c:1634:40: warning: incorrect type in argument 3 (different signedness)
    net/ipv4/netfilter/ip_tables.c:1634:40: expected int *size
    net/ipv4/netfilter/ip_tables.c:1634:40: got unsigned int *size
    net/ipv4/netfilter/ip_tables.c:1653:8: warning: incorrect type in argument 5 (different signedness)
    net/ipv4/netfilter/ip_tables.c:1653:8: expected unsigned int *i
    net/ipv4/netfilter/ip_tables.c:1653:8: got int *
    net/ipv4/netfilter/ip_tables.c:1666:2: warning: incorrect type in argument 2 (different signedness)
    net/ipv4/netfilter/ip_tables.c:1666:2: expected unsigned int *i
    net/ipv4/netfilter/ip_tables.c:1666:2: got int *
    CHECK net/ipv4/netfilter/arp_tables.c
    net/ipv4/netfilter/arp_tables.c:1285:40: warning: incorrect type in argument 3 (different signedness)
    net/ipv4/netfilter/arp_tables.c:1285:40: expected int *size
    net/ipv4/netfilter/arp_tables.c:1285:40: got unsigned int *size
    net/ipv4/netfilter/arp_tables.c:1543:44: warning: incorrect type in argument 3 (different signedness)
    net/ipv4/netfilter/arp_tables.c:1543:44: expected int *size
    net/ipv4/netfilter/arp_tables.c:1543:44: got unsigned int [usertype] *size
    CHECK net/ipv6/netfilter/ip6_tables.c
    net/ipv6/netfilter/ip6_tables.c:1481:8: warning: incorrect type in argument 3 (different signedness)
    net/ipv6/netfilter/ip6_tables.c:1481:8: expected int *size
    net/ipv6/netfilter/ip6_tables.c:1481:8: got unsigned int [usertype] *size
    net/ipv6/netfilter/ip6_tables.c:1486:44: warning: incorrect type in argument 3 (different signedness)
    net/ipv6/netfilter/ip6_tables.c:1486:44: expected int *size
    net/ipv6/netfilter/ip6_tables.c:1486:44: got unsigned int [usertype] *size
    net/ipv6/netfilter/ip6_tables.c:1631:2: warning: incorrect type in argument 2 (different signedness)
    net/ipv6/netfilter/ip6_tables.c:1631:2: expected unsigned int *i
    net/ipv6/netfilter/ip6_tables.c:1631:2: got int *
    net/ipv6/netfilter/ip6_tables.c:1655:8: warning: incorrect type in argument 3 (different signedness)
    net/ipv6/netfilter/ip6_tables.c:1655:8: expected int *size
    net/ipv6/netfilter/ip6_tables.c:1655:8: got unsigned int *size
    net/ipv6/netfilter/ip6_tables.c:1662:40: warning: incorrect type in argument 3 (different signedness)
    net/ipv6/netfilter/ip6_tables.c:1662:40: expected int *size
    net/ipv6/netfilter/ip6_tables.c:1662:40: got unsigned int *size
    net/ipv6/netfilter/ip6_tables.c:1680:8: warning: incorrect type in argument 5 (different signedness)
    net/ipv6/netfilter/ip6_tables.c:1680:8: expected unsigned int *i
    net/ipv6/netfilter/ip6_tables.c:1680:8: got int *
    net/ipv6/netfilter/ip6_tables.c:1693:2: warning: incorrect type in argument 2 (different signedness)
    net/ipv6/netfilter/ip6_tables.c:1693:2: expected unsigned int *i
    net/ipv6/netfilter/ip6_tables.c:1693:2: got int *

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • A typical table module registers an xt_table structure (e.g. packet_filter)
    and links it into a list during registration. We can't use one template for
    it because the corresponding list_head would become corrupted. We also can't
    unregister with the template because it wasn't changed at all and thus
    doesn't know which list it is in.

    So, we duplicate the template at the very first step of table registration.
    Table modules will save the copy for use at unregistration time and for
    actual filtering.

    Do it all at once to not screw up bisection.

    P.S.: renaming e.g. packet_filter => __packet_filter is temporary until
    full netnsization of table modules is done.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Alexey Dobriyan
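
A small sketch of why the template is duplicated at registration. Everything here is hypothetical (`struct table`, `struct ns`, `register_table()`, `unregister_table()`), and a singly linked list replaces the kernel's list_head. The idea shown is the one from the commit message: the copy carries the list linkage, the template stays untouched, and the caller keeps the copy for unregistration.

```c
#include <stdlib.h>
#include <string.h>

struct table {
    const char *name;
    struct table *next; /* list linkage, filled in at registration */
};

/* Hypothetical per-namespace list of registered tables. */
struct ns {
    struct table *tables;
};

/* Copy the template and link the copy. Returns the copy, or NULL on failure. */
static struct table *register_table(struct ns *ns, const struct table *tmpl)
{
    struct table *t = malloc(sizeof(*t));

    if (t == NULL)
        return NULL;
    memcpy(t, tmpl, sizeof(*t));
    t->next = ns->tables;
    ns->tables = t;
    return t;
}

/* Unlink and free a copy previously returned by register_table(). */
static void unregister_table(struct ns *ns, struct table *t)
{
    for (struct table **pp = &ns->tables; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == t) {
            *pp = t->next;
            free(t);
            return;
        }
    }
}
```

Registering the same template in two namespaces yields two independent copies, so neither list's linkage can be corrupted by the other.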
     
  • In fact all we want is a per-netns set of rules, however doing that would
    unnecessarily complicate routines such as ipt_hook()/ipt_do_table, so
    make the full xt_table array per-netns.

    Every user is stubbed with init_net for now.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     
  • Switch from 0/-E to ptr/PTR_ERR convention.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Alexey Dobriyan
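
The ptr/PTR_ERR convention mentioned above encodes a negative errno value in the returned pointer instead of returning 0/-E next to an out parameter. The sketch below defines user-space stand-ins modelled on the kernel's `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` helpers. `register_table()` and its validation rule are hypothetical, and the integer-to-pointer casts rely on implementation-defined behaviour that holds on common platforms.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* User-space stand-ins modelled on the kernel helpers of the same names. */
static inline void *ERR_PTR(long err)
{
    return (void *)(intptr_t)err;
}

static inline long PTR_ERR(const void *p)
{
    return (long)(intptr_t)p;
}

static inline int IS_ERR(const void *p)
{
    /* The top MAX_ERRNO addresses are reserved for error codes. */
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct table {
    const char *name;
};

/* Returns a valid table pointer, or an errno value wrapped by ERR_PTR(). */
static struct table *register_table(const char *name)
{
    struct table *t;

    if (name == NULL || *name == '\0')
        return ERR_PTR(-EINVAL);

    t = malloc(sizeof(*t));
    if (t == NULL)
        return ERR_PTR(-ENOMEM);
    t->name = name;
    return t;
}
```

A caller checks `IS_ERR(t)` first and only then either uses the table or extracts the error with `PTR_ERR(t)`.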
     

29 Jan, 2008

3 commits


15 Dec, 2007

1 commit

  • When copying entries to user, the kernel makes two passes through the
    data, first copying all the entries, then fixing up names and counters.
    On the second pass it copies the kernel and match data from userspace
    to the kernel again to find the corresponding structures, expecting
    that kernel pointers contained in the data are still valid.

    This is obviously broken; fix it by avoiding the second pass completely
    and fixing up names and counters while dumping the ruleset, using the
    kernel-internal data structures.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

11 Oct, 2007

1 commit

  • This patch makes /proc/net per network namespace. It modifies the global
    variables proc_net and proc_net_stat to be per network namespace.
    The proc_net file helpers are modified to take a network namespace argument,
    and all of their callers are fixed to pass &init_net for that argument.
    This ensures that all of the /proc/net files are only visible and
    usable in the initial network namespace until the code behind them
    has been updated to handle multiple network namespaces.

    Making /proc/net per namespace is necessary as at least some files
    in /proc/net depend upon the set of network devices which is per
    network namespace, and even more files in /proc/net have contents
    that are relevant to a single network namespace.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
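
The effect of passing &init_net to every helper can be modelled in a few lines. This is a toy registry, not the proc filesystem API: `struct net` here is a made-up structure, and `proc_net_add()` and `proc_net_has()` are hypothetical helpers. It only shows that a file registered against the initial namespace is not visible from any other namespace.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_FILES 8 /* assumption: small fixed registry for the sketch */

/* Toy network namespace holding the names of its /proc/net files. */
struct net {
    const char *files[MAX_FILES];
    int nr_files;
};

static struct net init_net; /* the initial network namespace */

/* Register a file in one namespace. Returns 0, or -1 if the registry is full. */
static int proc_net_add(struct net *net, const char *name)
{
    if (net->nr_files >= MAX_FILES)
        return -1;
    net->files[net->nr_files++] = name;
    return 0;
}

/* Report whether a file is visible in the given namespace. */
static bool proc_net_has(const struct net *net, const char *name)
{
    for (int i = 0; i < net->nr_files; i++) {
        if (strcmp(net->files[i], name) == 0)
            return true;
    }
    return false;
}
```

Code that has not yet been converted keeps calling `proc_net_add(&init_net, ...)`, so its files stay confined to the initial namespace until it learns to register per namespace.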
     

11 Jul, 2007

2 commits


26 Apr, 2007

1 commit


13 Feb, 2007

2 commits


04 Dec, 2006

1 commit