16 Oct, 2007

2 commits


11 Sep, 2007

1 commit

  • So I've had a deadlock reported to me. I've found that the sequence of
    events goes like this:

    1) process A (modprobe) runs to remove ip_tables.ko

    2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket,
    increasing the ip_tables socket_ops use count

    3) process A acquires a file lock on the file ip_tables.ko, calls remove_module
    in the kernel, which in turn executes the ip_tables module cleanup routine,
    which calls nf_unregister_sockopt

    4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the
    calling process into uninterruptible sleep, expecting the process using the
    socket option code to wake it up when it exits the kernel

    5) the user of the socket option code (process B), in do_ipt_get_ctl, calls
    ipt_find_table_lock, which in this case calls request_module to load
    ip_tables_nat.ko

    6) request_module forks a copy of modprobe (process C) to load the module and
    blocks until modprobe exits.

    7) Process C, forked by request_module, processes the dependencies of
    ip_tables_nat.ko, of which ip_tables.ko is one.

    8) Process C attempts to lock the requested module and all its dependencies; it
    blocks when it attempts to lock ip_tables.ko (which was previously locked in
    step 3)

    There's not really any great permanent solution to this that I can see, but I've
    developed a two-part solution that corrects the problem.

    Part 1) Modifies the nf_sockopt registration code so that, instead of using a
    use counter internal to the nf_sockopt_ops structure, we instead use a pointer
    to the registering module's owner to do module reference counting when nf_sockopt
    calls a module's set/get routine. This prevents the deadlock by preventing step 4
    from happening.

    Part 2) Enhances the modprobe utility so that by default it performs non-blocking
    remove operations (the same way rmmod does), and adds an option to explicitly
    request blocking operation. So if you select blocking operation in modprobe you
    can still cause the above deadlock, but only if you explicitly try (and since
    root can do any old stupid thing it would like.... :) ).

    Signed-off-by: Neil Horman
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Neil Horman
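
    Part 1 of the fix above can be pictured with a small user-space model. This
    is only a sketch: struct module, try_module_get() and module_put() below are
    simplified stand-ins written for this example, not the kernel's
    implementations, and sockopt_ops, call_sockopt_get(), try_remove_module() and
    echo_get() are hypothetical names. The point it tries to show is that a
    handler call holds a reference on the owning module, so a non-blocking remove
    fails with "busy" instead of sleeping while a caller is still inside.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct module (assumption). */
struct module {
    int refcnt;      /* references held by in-flight callers */
    bool unloading;  /* set once removal has started */
};

/* Models try_module_get(): refuse new references once unload began. */
static bool try_module_get(struct module *m)
{
    if (m->unloading)
        return false;
    m->refcnt++;
    return true;
}

/* Models module_put(): drop a reference taken earlier. */
static void module_put(struct module *m)
{
    assert(m->refcnt > 0);
    m->refcnt--;
}

/* Hypothetical sockopt registration that carries an owner pointer. */
struct sockopt_ops {
    struct module *owner;
    int (*get)(int optval);
};

/* Example handler used only for illustration. */
static int echo_get(int optval)
{
    return optval;
}

/* Call the handler only while holding a reference on its owner. */
static int call_sockopt_get(struct sockopt_ops *ops, int optval)
{
    int ret;

    if (!try_module_get(ops->owner))
        return -1;               /* module is going away */
    ret = ops->get(optval);
    module_put(ops->owner);
    return ret;
}

/* Models a non-blocking remove: report "busy" instead of sleeping. */
static int try_remove_module(struct module *m)
{
    if (m->refcnt > 0)
        return -1;               /* still in use, caller may retry */
    m->unloading = true;
    return 0;
}
```

    In this model the unregister path never waits on a use counter, so the
    remover cannot end up asleep while holding the lock that process C needs.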
     

11 Jul, 2007

1 commit


26 Apr, 2007

2 commits


13 Feb, 2007

2 commits


05 Dec, 2006

2 commits


03 Dec, 2006

4 commits


23 Sep, 2006

1 commit


26 Apr, 2006

1 commit


10 Apr, 2006

3 commits


21 Mar, 2006

1 commit


16 Feb, 2006

2 commits

  • nf_hook() is supposed to call the netfilter hook and return control of the
    packet back to the caller if it may pass; the okfn is only used for
    queueing.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • When a packet matching an IPsec policy is SNATed so that it no longer
    matches any policy, it loses its xfrm bundle, which makes
    xfrm4_output_finish crash because of a NULL pointer dereference.

    This patch directs these packets to the original output path instead. Since
    the packets have already passed the POST_ROUTING hook, but need to start at
    the beginning of the original output path which includes another
    POST_ROUTING invocation, a flag is added to the IPCB to indicate that the
    packet was rerouted and doesn't need to pass the POST_ROUTING hook again.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
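
    The rerouting flag described above can be sketched in user space as follows.
    Everything here is an illustrative assumption: SKB_REROUTED, struct packet
    and the three functions are hypothetical names, and the hook is reduced to a
    counter so the "visited only once" property is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKB_REROUTED 0x1u   /* hypothetical flag bit in the control block */

struct packet {
    unsigned int cb_flags;      /* stand-in for the per-packet IPCB flags */
    int postrouting_visits;     /* counts hook traversals, for illustration */
};

/* Stand-in for the POST_ROUTING hook invocation. */
static void postrouting_hook(struct packet *p)
{
    assert(p != NULL);
    p->postrouting_visits++;
}

/* Original output path: run POST_ROUTING unless the packet was rerouted. */
static void ip_output_path(struct packet *p)
{
    if (!(p->cb_flags & SKB_REROUTED))
        postrouting_hook(p);
    /* ... transmission would happen here ... */
}

/*
 * IPsec output path: POST_ROUTING (where SNAT happens) runs first. If the
 * packet no longer matches a policy afterwards, mark it as rerouted and
 * hand it to the original output path instead of transforming it.
 */
static void xfrm_output_path(struct packet *p, bool still_matches_policy)
{
    postrouting_hook(p);
    if (!still_matches_policy) {
        p->cb_flags |= SKB_REROUTED;
        ip_output_path(p);
        return;
    }
    /* ... IPsec transforms would be applied here ... */
}
```

    A packet that falls out of its policy passes the hook once in total, while a
    packet entering ip_output_path() directly still passes it as before.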
     

08 Jan, 2006

3 commits

  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Handle NAT of decapsulated IPsec packets by reconstructing the struct flowi
    of the original packet from the conntrack information for IPsec policy
    checks.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Call netfilter hooks before IPsec transforms. Packets visit the
    FORWARD/LOCAL_OUT and POST_ROUTING hook before the first encapsulation
    and the LOCAL_OUT and POST_ROUTING hook before each following tunnel mode
    transform.

    Patch from Herbert Xu:

    Move the loop from dst_output into xfrm4_output/xfrm6_output since they're
    the only ones who need it. xfrm{4,6}_output_one() processes the first SA
    and all subsequent transport mode SAs and is called in a loop that calls
    the netfilter hooks between each two calls.

    In order to avoid the tail call issue, I've added the inline function
    nf_hook which is nf_hook_slow plus the empty list check.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
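
    The "slow path plus empty list check" shape mentioned above might look like
    the sketch below. It is a user-space model under stated assumptions:
    hook_list, hook_slow(), hook_fast(), the verdict constants and the two sample
    hooks are hypothetical names, and a return value of 1 is used here to mean
    "the packet may pass, the caller continues".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOOKS     4
#define SKETCH_DROP   0
#define SKETCH_ACCEPT 1

typedef int (*hook_fn)(int *pkt);

struct hook_list {
    hook_fn hooks[MAX_HOOKS];
    size_t count;
};

static int slow_calls;  /* counts slow-path entries, for illustration */

/* Sample hooks used only for illustration. */
static int accept_hook(int *pkt)
{
    (void)pkt;
    return SKETCH_ACCEPT;
}

static int drop_negative_hook(int *pkt)
{
    return *pkt < 0 ? SKETCH_DROP : SKETCH_ACCEPT;
}

/* Slow path: walk the hooks; 1 means "may pass", -1 means dropped. */
static int hook_slow(const struct hook_list *list, int *pkt)
{
    assert(list->count <= MAX_HOOKS);
    slow_calls++;
    for (size_t i = 0; i < list->count; i++) {
        if (list->hooks[i](pkt) != SKETCH_ACCEPT)
            return -1;
    }
    return 1;
}

/* Inline wrapper: skip the slow path entirely when no hook is registered. */
static inline int hook_fast(const struct hook_list *list, int *pkt)
{
    if (list->count == 0)
        return 1;
    return hook_slow(list, pkt);
}
```

    Because the wrapper returns to its caller instead of invoking a continuation
    function, the caller can loop over several transforms and call the hooks
    between them.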
     

30 Aug, 2005

9 commits

  • Fix gcc-3.4.x warning about implicit operator precedence in NF_QUEUE_NR()

    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Harald Welte
     
  • I obviously wanted to use bitwise-or, not logical or.

    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Harald Welte
     
  • Check whether pf is too large in order to prevent array overflow.

    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Harald Welte
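
    A bounds check of this kind can be sketched as below. The table size
    SKETCH_NPROTO, the queue_outfn type and all function names are assumptions
    made for this example, as are the error codes returned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_NPROTO 32   /* assumed number of protocol families */

typedef int (*queue_outfn)(int queuenum);

static queue_outfn queue_handlers[SKETCH_NPROTO];

/* Sample handler used only for illustration. */
static int sample_outfn(int queuenum)
{
    return queuenum;
}

/* Reject out-of-range families before the array is indexed. */
static int register_queue_handler(int pf, queue_outfn fn)
{
    if (pf < 0 || pf >= SKETCH_NPROTO)
        return -EINVAL;
    if (queue_handlers[pf] != NULL)
        return -EBUSY;
    queue_handlers[pf] = fn;
    return 0;
}

/* Internal accessor: callers are expected to have validated pf already. */
static queue_outfn queue_handler_for(int pf)
{
    assert(pf >= 0 && pf < SKETCH_NPROTO);
    return queue_handlers[pf];
}
```

    Without the first test in register_queue_handler(), a family number equal to
    or larger than the table size would write past the end of the array.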
     
  • This patch adds a /proc/net/netfilter/nf_queue file, similar to the
    recently-added /proc/net/netfilter/nf_log. It indicates which queue
    handler is registered to which protocol family. This is useful since
    there are now multiple queue handlers in the tree (ip[6]_queue,
    nfnetlink_queue).

    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Harald Welte
     
  • This patch is in preparation to nfnetlink_log:
    - loggers now have to register struct nf_logger instead of nf_logfn
    - nf_log_unregister() replaced by nf_log_unregister_pf() and
    nf_log_unregister_logger()
    - add comment to ip[6]t_LOG.h to assure nobody redefines flags
    - add /proc/net/netfilter/nf_log to tell user which logger is currently
    registered for which address family
    - if user has configured logging, but no logging backend (logger) is
    available, always spit a message to syslog, not just the first time.
    - split ip[6]t_LOG.c into two parts:
    Backend: Always try to register as logger for the respective address family
    Frontend: Always log via nf_log_packet() API
    - modify all users of nf_log_packet() to accommodate additional argument

    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Harald Welte
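
    The registration model listed above (one logger structure per address
    family, with separate "unregister by family" and "unregister by logger"
    operations) could be modelled roughly like this. It is a user-space sketch:
    struct logger_sketch, the log_* functions, LOG_NPROTO and the error codes are
    hypothetical and do not reproduce the kernel signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define LOG_NPROTO 32   /* assumed number of address families */

/* Loggers register a structure rather than a bare function pointer. */
struct logger_sketch {
    const char *name;
    void (*logfn)(const char *msg);
};

static const struct logger_sketch *loggers[LOG_NPROTO];
static int logged;      /* counts delivered messages, for illustration */

/* Sample backend used only for illustration. */
static void count_log(const char *msg)
{
    (void)msg;
    logged++;
}

static int log_register(int pf, const struct logger_sketch *l)
{
    if (pf < 0 || pf >= LOG_NPROTO)
        return -EINVAL;
    if (loggers[pf] != NULL)
        return -EBUSY;
    loggers[pf] = l;
    return 0;
}

/* Drop whatever logger is bound to one address family. */
static void log_unregister_pf(int pf)
{
    if (pf >= 0 && pf < LOG_NPROTO)
        loggers[pf] = NULL;
}

/* Drop one logger from every address family it is bound to. */
static void log_unregister_logger(const struct logger_sketch *l)
{
    for (int pf = 0; pf < LOG_NPROTO; pf++) {
        if (loggers[pf] == l)
            loggers[pf] = NULL;
    }
}

/* Frontend: log through the registered backend, or report that none exists. */
static int log_packet(int pf, const char *msg)
{
    assert(msg != NULL);
    if (pf < 0 || pf >= LOG_NPROTO || loggers[pf] == NULL)
        return -ENOENT;  /* the commit above emits a syslog message here */
    loggers[pf]->logfn(msg);
    return 0;
}
```

    Keeping the two unregister operations separate lets a backend module remove
    itself everywhere on unload without knowing which families it was bound to.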
     
  • - split netfilter verdict in 16bit verdict and 16bit queue number
    - add 'queuenum' argument to nf_queue_outfn_t and its users ip[6]_queue
    - move NFNL_SUBSYS_ definitions from enum to #define
    - introduce autoloading for nfnetlink subsystem modules
    - add MODULE_ALIAS_NFNL_SUBSYS macro
    - add nf_unregister_queue_handlers() to unregister all handlers for a given
    nf_queue_outfn_t
    - add more verbose DEBUGP macro definition to nfnetlink.c
    - make nfnetlink_subsys_register fail if subsys already exists
    - add some more comments and debug statements to nfnetlink.c

    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Harald Welte
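
    The 16/16 split of the verdict word in the first item can be illustrated
    with a few macros. This is a sketch, not the kernel header: the macro and
    function names are invented for the example and the numeric value chosen for
    the "queue" verdict is an assumption. The macro argument and the shift are
    fully parenthesized so that operator precedence is explicit and the verdict
    is combined with a bitwise OR.

```c
#include <assert.h>
#include <stdint.h>

#define VERDICT_MASK   0x0000ffffu   /* low 16 bits: the verdict itself */
#define VERDICT_QMASK  0xffff0000u   /* high 16 bits: the queue number */
#define VERDICT_QBITS  16
#define VERDICT_QUEUE  3u            /* assumed value of the "queue" verdict */

/* Build a "queue to queue x" verdict. */
#define QUEUE_NR(x) \
    ((((uint32_t)(x) << VERDICT_QBITS) & VERDICT_QMASK) | VERDICT_QUEUE)

/* Checked helper: queue numbers must fit in 16 bits. */
static uint32_t make_queue_verdict(unsigned int queue)
{
    assert(queue <= 0xffffu);
    return QUEUE_NR(queue);
}

/* Extract the two halves again. */
static uint32_t verdict_of(uint32_t v)
{
    return v & VERDICT_MASK;
}

static uint32_t queue_of(uint32_t v)
{
    return v >> VERDICT_QBITS;
}
```

    With this layout a queue handler can recover the queue number from the
    verdict word alone, which is what the added 'queuenum' argument carries.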
     
  • The rerouting functionality is required by the core, therefore it has
    to be implemented by the core and not in individual queue handlers.

    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Harald Welte
     
  • There is nothing IPv4-specific in it. In fact, it was already used by
    IPv6, too... Upcoming nfnetlink_queue code will use it for any kind
    of packet.

    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Harald Welte
     
  • As discussed at netconf'05, we're trying to save every bit in sk_buff.
    The patch below makes sk_buff 8 bytes smaller. I did some basic
    testing on my notebook and it seems to work.

    The only real in-tree user of nfcache was IPVS, who only needs a
    single bit. Unfortunately I couldn't find some other free bit in
    sk_buff to stuff that bit into, so I introduced a separate field for
    them. Maybe the IPVS guys can resolve that to further save space.

    Initially I wanted to shrink pkt_type to three bits (PACKET_HOST and
    alike are only 6 values defined), but unfortunately the bluetooth code
    overloads pkt_type :(

    The conntrack-event-api (out-of-tree) uses nfcache, but Rusty just
    came up with a way to do it without any skb fields, so it's safe
    to remove it.

    - remove all never-implemented 'nfcache' code
    - don't have ipvs code abuse 'nfcache' field. It currently gets its own
    compile-conditional skb->ipvs_property field. IPVS maintainers can
    decide to move this bit elsewhere, but nfcache needs to die.
    - remove skb->nfcache field to save 4 bytes
    - move skb->nfctinfo into three unused bits to save further 4 bytes

    Signed-off-by: Harald Welte
    Signed-off-by: David S. Miller

    Harald Welte
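
    The space saving described above comes from dropping one 32-bit field and
    folding another into spare bits of an existing bit-field word. The two
    structures below are a reduced illustration only: they are not the real
    sk_buff layout, the neighbouring field names and widths are assumptions, and
    bit-field packing is implementation-defined, so the exact number of bytes
    saved depends on the compiler and ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Before: nfcache and nfctinfo each occupy a full 32-bit word. */
struct skb_before {
    unsigned int local_df:1, cloned:1, ip_summed:2, nohdr:1;
    unsigned int pkt_type:8;
    unsigned int protocol:16;
    uint32_t nfcache;
    uint32_t nfctinfo;
};

/* After: nfcache is gone and nfctinfo lives in three formerly unused bits. */
struct skb_after {
    unsigned int local_df:1, cloned:1, ip_summed:2, nohdr:1, nfctinfo:3;
    unsigned int pkt_type:8;
    unsigned int protocol:16;
};

/* A 3-bit field can only hold the values 0..7. */
static void set_ctinfo(struct skb_after *skb, unsigned int ctinfo)
{
    assert(ctinfo < 8);
    skb->nfctinfo = ctinfo;
}

/* Size difference between the two layouts on the current platform. */
static size_t bytes_saved(void)
{
    return sizeof(struct skb_before) - sizeof(struct skb_after);
}
```

    Calling bytes_saved() on a given platform shows how much the reduced layout
    gains there; the trade-off is that nfctinfo is now limited to three bits.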
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds