26 Jul, 2008

1 commit

  • Removes legacy reinvent-the-wheel type thing. The generic
    machinery integrates much better to automated debugging aids
    such as kerneloops.org (and others), and is unambiguous due to
    better naming. Non-intuively BUG_TRAP() is actually equal to
    WARN_ON() rather than BUG_ON() though some might actually be
    promoted to BUG_ON() but I left that to future.

    I could make at least one BUILD_BUG_ON conversion.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

28 Jun, 2008

2 commits


12 Jun, 2008

1 commit


20 May, 2008

3 commits


03 May, 2008

1 commit


29 Mar, 2008

1 commit


26 Mar, 2008

1 commit


29 Jan, 2008

13 commits

  • On namespace start we mainly prepare the ctl variables.

    When the namespace is stopped we have to kill all the fragments that
    point to this namespace. The inet_frags_exit_net() handles it.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • The inet_frags.lru_list is used for evicting only, so we have
    to make it per-namespace, to evict only those fragments, who's
    namespace exceeded its high threshold, but not the whole hash.
    Besides, this helps to avoid long loops in evictor.

    The spinlock is not per-namespace because it protects the
    hash table as well, which is global.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Since we have one hashtable to lookup the fragment, having
    different secret_interval-s for hash rebuild doesn't make
    sense, so move this one to inet_frags.

    The inet_frags_ctl becomes empty after this, so remove it.
    The appropriate ctl table is kept read-only in namespaces.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This is the same as with the timeout variable.

    Currently, after exceeding the high threshold _all_
    the fragments are evicted, but it will be fixed in
    later patch.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Move it to the netns_frags, adjust the usage and
    make the appropriate ctl table writable.

    Now fragment, that live in different namespaces can
    live for different times.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Each namespace has to have own tables to tune their
    different parameters, so duplicate the tables and
    register them.

    All the tables in sub-namespaces are temporarily made
    read-only.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This is also simple, but introduces more changes, since
    then mem counter is altered in more places.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This is simple - just move the variable from struct inet_frags
    to struct netns_frags and adjust the usage appropriately.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Since fragment management code is consolidated, we cannot have the
    pointer from inet_frag_queue to struct net, since we must know what
    king of fragment this is.

    So, I introduce the netns_frags structure. This one is currently
    empty, but will be eventually filled with per-namespace
    attributes. Each inet_frag_queue is tagged with this one.

    The conntrack_reasm is not "netns-izated", so it has one static
    netns_frags instance to keep working in init namespace.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This is a preparation for sysctl netns-ization.
    Move the ctl tables to the files, where the tuning
    variables reside. Plus make the helpers to register
    the tables.

    This will simplify the later patches and will keep
    similar things closer to each other.

    ipv4, ipv6 and conntrack_reasm are patched differently,
    but the result is all the tables are in appropriate files.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Alexey Dobriyan reported an oops when unsharing the network
    indefinitely inside a loop. This is because the ip6_frag is not per
    namespace while the ctls are.

    That happens at the fragment timer expiration:
    inet_frag_secret_rebuild function is called and this one restarts the
    timer using the value stored inside the sysctl field.

    "mod_timer(&f->secret_timer, now + f->ctl->secret_interval);"

    When the network is unshared, ip6_frag.ctl is initialized with the new
    sysctl instances, but ip6_frag has only one instance. A race in this
    case will appear because f->ctl can be modified during the read access
    in the timer callback.

    Until the ip6_frag is not per namespace, I discard the assignation to
    the ctl field of ip6_frags in ip6_frag_sysctl_init when the network
    namespace is not the init net.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Daniel Lezcano
     
  • The ip6_frags is moved to the network namespace structure. Because
    there can be multiple instances of the network namespaces, and the
    ip6_frags is no longer a global static variable, a helper function has
    been added to facilitate the initialization of the variables.

    Until the ipv6 protocol is not per namespace, the variables are
    accessed relatively from the initial network namespace.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Daniel Lezcano
     
  • This patch makes the frag_init to return an error code, so the af_inet6
    module can handle the error.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Daniel Lezcano
     

18 Oct, 2007

7 commits

  • Since we now allocate the queues in inet_fragment.c, we
    can safely free it in the same place. The ->destructor
    callback thus becomes optional for inet_frags.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Since this callback is used to check for conflicts in
    hashtable when inserting a newly created frag queue, we can
    do the same by checking for matching the queue with the
    argument, used to create one.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Here we need another callback ->match to check whether the
    entry found in hash matches the key passed. The key used
    is the same as the creation argument for inet_frag_create.

    Yet again, this ->match is the same for netfilter and ipv6.
    Running a frew steps forward - this callback will later
    replace the ->equal one.

    Since the inet_frag_find() uses the already consolidated
    inet_frag_create() remove the xxx_frag_create from protocol
    codes.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This one uses the xxx_frag_intern() and xxx_frag_alloc()
    routines, which are already consolidated, so remove them
    from protocol code (as promised).

    The ->constructor callback is used to init the rest of
    the frag queue and it is the same for netfilter and ipv6.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Just perform the kzalloc() allocation and setup common
    fields in the inet_frag_queue(). Then return the result
    to the caller to initialize the rest.

    The inet_frag_alloc() may return NULL, so check the
    return value before doing the container_of(). This looks
    ugly, but the xxx_frag_alloc() will be removed soon.

    The xxx_expire() timer callbacks are patches,
    because the argument is now the inet_frag_queue, not
    the protocol specific queue.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This routine checks for the existence of a given entry
    in the hash table and inserts the new one if needed.

    The ->equal callback is used to compare two frag_queue-s
    together, but this one is temporary and will be removed
    later. The netfilter code and the ipv6 one use the same
    routine to compare frags.

    The inet_frag_intern() always returns non-NULL pointer,
    so convert the inet_frag_queue into protocol specific
    one (with the container_of) without any checks.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Since the hash value is already calculated in xxx_find, we can
    simply use it later. This is already done in netfilter code,
    so make the same in ipv4 and ipv6.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

16 Oct, 2007

10 commits

  • With all the users of the double pointers removed from the IPv6 input path,
    this patch converts all occurances of sk_buff ** to sk_buff * in IPv6 input
    handlers.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • These ones use the generic data types too, so move
    them in one place.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • After the evictor code is consolidated there is no need in
    passing the extra pointer to the xxx_put() functions.

    The only place when it made sense was the evictor code itself.

    Maybe this change must got with the previous (or with the
    next) patch, but I try to make them shorter as much as
    possible to simplify the review (but they are still large
    anyway), so this change goes in a separate patch.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • The evictors collect some statistics for ipv4 and ipv6,
    so make it return the number of evicted queues and account
    them all at once in the caller.

    The XXX_ADD_STATS_BH() macros are just for this case,
    but maybe there are places in code, that can make use of
    them as well.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • To make in possible we need to know the exact frag queue
    size for inet_frags->mem management and two callbacks:

    * to destoy the skb (optional, used in conntracks only)
    * to free the queue itself (mandatory, but later I plan to
    move the allocation and the destruction of frag_queues
    into the common place, so this callback will most likely
    be optional too).

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • This code works with the generic data types as well, so
    move this into inet_fragment.c

    This move makes it possible to hide the secret_timer
    management and the secret_rebuild routine completely in
    the inet_fragment.c

    Introduce the ->hashfn() callback in inet_frags() to get
    the hashfun for a given inet_frag_queue() object.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Since now all the xxx_frag_kill functions now work
    with the generic inet_frag_queue data type, this can
    be moved into a common place.

    The xxx_unlink() code is moved as well.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Some sysctl variables are used to tune the frag queues
    management and it will be useful to work with them in
    a common way in the future, so move them into one
    structure, moreover they are the same for all the frag
    management codes.

    I don't place them in the existing inet_frags object,
    introduced in the previous patch for two reasons:

    1. to keep them in the __read_mostly section;
    2. not to export the whole inet_frags objects outside.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • There are some objects that are common in all the places
    which are used to keep track of frag queues, they are:

    * hash table
    * LRU list
    * rw lock
    * rnd number for hash function
    * the number of queues
    * the amount of memory occupied by queues
    * secret timer

    Move all this stuff into one structure (struct inet_frags)
    to make it possible use them uniformly in the future. Like
    with the previous patch this mostly consists of hunks like

    - write_lock(&ipfrag_lock);
    + write_lock(&ip4_frags.lock);

    To address the issue with exporting the number of queues and
    the amount of memory occupied by queues outside the .c file
    they are declared in, I introduce a couple of helpers.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Introduce the struct inet_frag_queue in include/net/inet_frag.h
    file and place there all the common fields from three structs:

    * struct ipq in ipv4/ip_fragment.c
    * struct nf_ct_frag6_queue in nf_conntrack_reasm.c
    * struct frag_queue in ipv6/reassembly.c

    After this, replace these fields on appropriate structures with
    this structure instance and fix the users to use correct names
    i.e. hunks like

    - atomic_dec(&fq->refcnt);
    + atomic_dec(&fq->q.refcnt);

    (these occupy most of the patch)

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov