14 Jun, 2009

12 commits


13 Jun, 2009

19 commits

  • Signed-off-by: Joe Perches
    Signed-off-by: Patrick McHardy

    Joe Perches
     
  • This patch improves ctnetlink event reliability if one broadcast
    listener has set the NETLINK_BROADCAST_ERROR socket option.

    The logic is the following: if an event delivery fails, we keep
    the undelivered events in the missed event cache. Once the next
    packet arrives, we add the new events (if any) to the missed
    events in the cache and we try a new delivery, and so on. Thus,
    if ctnetlink fails to deliver events, we try to deliver them again
    once we see a new packet. Therefore, we may lose state
    transitions but the userspace process gets in sync at some point.

    In the worst case, if no events were delivered to userspace, we make
    sure that destroy events are successfully delivered. Basically,
    if ctnetlink fails to deliver the destroy event, we remove the
    conntrack entry from the hashes and we insert it in the dying
    list, which contains inactive entries. Then, the conntrack timer
    is added with an extra grace timeout of random32() % 15 seconds
    to trigger the event again (this grace timeout is tunable via
    /proc). The use of a limited random timeout value allows
    distributing the "destroy" resends, thus avoiding the accumulation
    of lots of "destroy" events at the same time. Event delivery may
    be re-ordered, but we can identify events by means of the tuple plus
    the conntrack ID.

    The maximum number of conntrack entries (active or inactive) is
    still handled by nf_conntrack_max. Thus, we may start dropping
    packets at some point if we accumulate a lot of inactive conntrack
    entries that did not successfully report the destroy event to
    userspace.

    During my stress tests consisting of setting a very small buffer
    of 2048 bytes for conntrackd and the NETLINK_BROADCAST_ERROR socket
    flag, and generating lots of very small connections, I noticed
    very few destroy entries on the fly waiting to be resent.

    A simple way to test this patch consists of creating a lot of
    entries, setting a very small Netlink buffer in conntrackd (+ a patch
    which is not in the git tree to set the BROADCAST_ERROR flag)
    and invoking `conntrack -F'.

    For expectations, no changes are introduced in this patch.
    Currently, event delivery is only done for new expectations (no
    events from expectation expiration, removal and confirmation).
    In that case, they need a per-expectation event cache to implement
    the same idea that is exposed in this patch.

    This patch can be useful to provide reliable flow-accounting. We
    still have to add a new conntrack extension to store the creation
    and destroy time.
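
    The retry logic described above can be sketched as a small userspace
    model. This is not the kernel code: the event bits, struct event_cache,
    deliver_cached_events() and the two delivery stubs are hypothetical
    names chosen for illustration. Merging events into a bitmask also shows
    why intermediate state transitions may be lost while the listener still
    gets in sync eventually.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event bits; the real ctnetlink event set is larger. */
enum { EV_NEW = 1u << 0, EV_UPDATE = 1u << 1, EV_DESTROY = 1u << 2 };

/* Per-connection cache of events that could not be delivered yet. */
struct event_cache {
	uint32_t missed;
};

/* Stand-in for a broadcast that may fail, e.g. on a full listener buffer. */
typedef bool (*deliver_fn)(uint32_t events);

/*
 * Called once per packet: merge the new events with the missed ones and
 * try a delivery. On failure everything stays cached for the next packet.
 * Returns true if nothing is left in the cache afterwards.
 */
static bool deliver_cached_events(struct event_cache *c, uint32_t fresh,
				  deliver_fn deliver)
{
	uint32_t pending;

	assert(c != NULL && deliver != NULL);
	pending = c->missed | fresh;
	if (pending == 0)
		return true;		/* nothing to report */
	if (deliver(pending)) {
		c->missed = 0;		/* listener is in sync again */
		return true;
	}
	c->missed = pending;		/* keep for the next attempt */
	return false;
}

/* Delivery stubs used to exercise the model. */
static bool always_fail(uint32_t events) { (void)events; return false; }
static bool always_ok(uint32_t events) { (void)events; return true; }
```

    Calling deliver_cached_events() twice with always_fail and then once
    with always_ok leaves the cache empty, with both earlier events
    reported in a single delivery.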

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Patrick McHardy

    Pablo Neira Ayuso
     
  • This patch adds the hlist_nulls_add_head() function which is
    based on hlist_nulls_add_head_rcu() but without the use of
    rcu_assign_pointer(). It also adds hlist_nulls_del() which is
    exactly the same as hlist_nulls_del_rcu().
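
    As an illustration only, here is a userspace model of a "nulls" list
    with a non-RCU add-head and delete. The nulls_* names and types are
    stand-ins invented for this sketch, not the kernel definitions; the
    point is that the head pointer is updated with a plain store, and that
    the end of the list is an odd-valued marker rather than NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel's nulls list node and head. */
struct nulls_node { struct nulls_node *next, **pprev; };
struct nulls_head { struct nulls_node *first; };

/* End-of-list marker: low bit set, remaining bits carry a value. */
static struct nulls_node *make_nulls(uintptr_t value)
{
	return (struct nulls_node *)(uintptr_t)(1u | (value << 1));
}

static int is_a_nulls(const struct nulls_node *p)
{
	return (int)((uintptr_t)p & 1);
}

static void nulls_init_head(struct nulls_head *h, uintptr_t value)
{
	h->first = make_nulls(value);
}

/* Insert at the head using plain stores (no RCU publication barrier). */
static void nulls_add_head(struct nulls_node *n, struct nulls_head *h)
{
	struct nulls_node *first = h->first;

	n->next = first;
	n->pprev = &h->first;
	h->first = n;
	if (!is_a_nulls(first))
		first->pprev = &n->next;
}

/* Unlink a node; the model clears pprev instead of poisoning it. */
static void nulls_del(struct nulls_node *n)
{
	struct nulls_node *next = n->next;
	struct nulls_node **pprev = n->pprev;

	assert(pprev != NULL);		/* node must be on a list */
	*pprev = next;
	if (!is_a_nulls(next))
		next->pprev = pprev;
	n->pprev = NULL;
}
```

    Plain stores are only safe when no lockless reader can see the list
    concurrently, which is the assumption this sketch makes.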

    Signed-off-by: Pablo Neira Ayuso
    Acked-by: Eric Dumazet
    Signed-off-by: Patrick McHardy

    Pablo Neira Ayuso
     
  • This patch moves the helper destruction to a function that lives
    in nf_conntrack_helper.c. This new function is used in the patch
    to add ctnetlink reliable event delivery.

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Patrick McHardy

    Pablo Neira Ayuso
     
  • This patch reworks the per-cpu event caching to use the conntrack
    extension infrastructure.

    The main drawback is that we consume more memory per conntrack
    if event delivery is enabled. This patch is required by the
    reliable event delivery patch that follows it.

    BTW, this patch allows you to enable/disable event delivery via
    /proc/sys/net/netfilter/nf_conntrack_events at runtime, although
    you can still disable event caching as a compile-time option.

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Patrick McHardy

    Pablo Neira Ayuso
     
  • Use mod_timer_pending() instead of an atomic sequence of del_timer()/
    add_timer(). mod_timer_pending() does not rearm an inactive timer,
    so we don't need the conntrack lock anymore to make sure we don't
    accidentally rearm a timer of a conntrack which is in the process
    of being destroyed.

    With this change, we don't need to take the global lock at all anymore;
    counter updates can be performed under the per-conntrack lock.
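
    The property relied on here can be shown with a tiny single-threaded
    model. The model_* names are hypothetical and there is no locking in
    this sketch; the kernel function performs the equivalent check
    atomically with respect to the timer itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for a timer: armed or not, plus an expiry time. */
struct model_timer {
	bool pending;
	unsigned long expires;
};

/*
 * Update the expiry only if the timer is still armed. An inactive
 * timer (for example one already deleted by the destroy path) is left
 * untouched. Returns true if the timer was rearmed.
 */
static bool model_mod_timer_pending(struct model_timer *t,
				    unsigned long expires)
{
	assert(t != NULL);
	if (!t->pending)
		return false;
	t->expires = expires;
	return true;
}

/* The destroy path deactivates the timer. */
static void model_del_timer(struct model_timer *t)
{
	assert(t != NULL);
	t->pending = false;
}
```

    Once model_del_timer() has run, a late refresh from the packet path
    becomes a no-op, so a conntrack that is being destroyed cannot be
    brought back to life by accident.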

    Signed-off-by: Patrick McHardy

    Patrick McHardy
     
  • Fix regression introduced by 17625274 "netfilter: sysctl support of
    logger choice":

    BUG: sleeping function called from invalid context at /mnt/s390test/linux-2.6-tip/arch/s390/include/asm/uaccess.h:234
    in_atomic(): 1, irqs_disabled(): 0, pid: 3245, name: sysctl
    CPU: 1 Not tainted 2.6.30-rc8-tipjun10-02053-g39ae214 #1
    Process sysctl (pid: 3245, task: 000000007f675da0, ksp: 000000007eb17cf0)
    0000000000000000 000000007eb17be8 0000000000000002 0000000000000000
    000000007eb17c88 000000007eb17c00 000000007eb17c00 0000000000048156
    00000000003e2de8 000000007f676118 000000007eb17f10 0000000000000000
    0000000000000000 000000007eb17be8 000000000000000d 000000007eb17c58
    00000000003e2050 000000000001635c 000000007eb17be8 000000007eb17c30
    Call Trace:
    (show_trace+0x13a/0x148)
     __might_sleep+0x13a/0x164
     proc_dostring+0x134/0x22c
     nf_log_proc_dostring+0xfc/0x188
     proc_sys_call_handler+0xf6/0x118
     proc_sys_read+0x26/0x34
     vfs_read+0xac/0x158
     SyS_read+0x56/0x88
     sysc_noemu+0x10/0x16

    Use the nf_log_mutex instead of RCU to fix this.

    Reported-and-tested-by: Maran Pakkirisamy
    Signed-off-by: Patrick McHardy

    Patrick McHardy
     
  • Convert magic values 1 and -1 to NETDEV_TX_BUSY and NETDEV_TX_LOCKED respectively.

    0 (NETDEV_TX_OK) is not changed to keep the noise down, except in very few cases
    where it is in direct proximity to one of the other values.
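
    A sketch of what such a conversion looks like, using a made-up
    transmit routine. The MODEL_ prefix and model_xmit() are hypothetical;
    only the numeric values (0, 1 and -1) are taken from the description
    above.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins mirroring the values named in the commit message. */
enum model_netdev_tx {
	MODEL_NETDEV_TX_LOCKED = -1,	/* previously a bare -1 */
	MODEL_NETDEV_TX_OK = 0,		/* mostly left as a literal 0 */
	MODEL_NETDEV_TX_BUSY = 1,	/* previously a bare 1 */
};

/* Hypothetical transmit routine after the conversion. */
static int model_xmit(bool lock_contended, bool ring_full)
{
	if (lock_contended)
		return MODEL_NETDEV_TX_LOCKED;	/* was: return -1; */
	if (ring_full)
		return MODEL_NETDEV_TX_BUSY;	/* was: return 1; */
	return 0;				/* unchanged literal */
}
```

    The behaviour is identical before and after; the named constants only
    make the intent of each return statement visible.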

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Fix up USB drivers that return an errno value (result of usb_submit_urb())
    to qdisc_restart(), causing qdisc_restart() to print a warning and requeue/
    retransmit the skb.

    - hso: skb is freed: use after free
    - at76_usb: skb is freed: use after free

    Compile tested only.
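
    The "use after free" case can be pictured with the following userspace
    model. Every name in it is hypothetical (it is not the hso or at76_usb
    code), and freeing is modelled with a flag so the effect can be
    checked. Once the driver has released the buffer, the only safe answer
    to the caller is "consumed"; returning the errno would make the caller
    requeue a buffer that no longer exists.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NETDEV_TX_OK 0

/* Stand-in for a packet buffer; 'freed' models kfree_skb(). */
struct model_skb { bool freed; };
struct model_stats { unsigned long tx_errors; };

/* Stand-in for a submission call that may fail with an errno. */
static int model_submit(bool fail)
{
	return fail ? -EIO : 0;
}

/* Hypothetical transmit routine following the fixed pattern. */
static int model_start_xmit(struct model_skb *skb, struct model_stats *st,
			    bool submit_fails)
{
	int err;

	assert(skb != NULL && st != NULL);
	err = model_submit(submit_fails);
	if (err) {
		st->tx_errors++;
		skb->freed = true;		/* driver drops the packet */
		return MODEL_NETDEV_TX_OK;	/* was: return err; */
	}
	/* On success the buffer now belongs to the in-flight transfer. */
	return MODEL_NETDEV_TX_OK;
}
```

    The failure is still recorded in the error counter, so nothing is
    hidden from statistics; it is just no longer reported through the
    transmit return value.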

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Fix up ATM drivers that return an errno value to qdisc_restart(), causing
    qdisc_restart() to print a warning and requeue/retransmit the skb.

    - lec: condition can only be remedied by userspace, until that retransmissions

    Compile tested only.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Fix up hamradio drivers that return an errno value to dev_queue_xmit(), causing
    it to print a warning and free the skb.

    - bpqether: skb is freed: use after free

    Compile tested only.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Fix up s390 drivers that return an errno value to qdisc_restart(), causing
    qdisc_restart() to print a warning and requeue/retransmit the skb.

    - claw: impossible condition, simply remove it

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Fix up WAN drivers that return an errno value to qdisc_restart(), causing
    qdisc_restart() to print a warning and requeue/retransmit the skb.

    - cycx_x25: intention appears to be to requeue the skb

    Does not compile cleanly for me even without this patch, so untested.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • net: fix network drivers ndo_start_xmit() return values (part 3)

    Fix up wireless drivers that return an errno value to qdisc_restart(), causing
    qdisc_restart() to print a warning and requeue/retransmit the skb.

    - airo: transmission not implemented for chip, intention is to free and abort
    - ipw2200: transmission not implemented for promiscuous mode, intention is to
    drop
    - prism54: intention is to drop
    - wl3501_cs: intention appears to be to drop
    - zd1201: error counter indicates intention is to drop

    All drivers compile tested.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Fix up IRDA drivers that return an errno value to qdisc_restart(), causing
    qdisc_restart() to print a warning and requeue/retransmit the skb.

    - donauboe: intention appears to be to have the skb retransmitted without
    error message
    - irda-usb: intention is to drop silently according to comment
    - kingsun-sir: skb is freed: use after free
    - ks959-sir: skb is freed: use after free
    - ksdazzle-sir: skb is freed: use after free
    - mcs7780: skb is freed: use after free

    All but donauboe compile tested.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Fix up drivers that return an errno value to qdisc_restart(), causing
    qdisc_restart() to print a warning and requeue/retransmit the skb.

    - xpnet: memory allocation error, intention is to drop
    - ethoc: oversized packet, packet must be dropped
    - ibmlana: skb freed: use after free
    - rrunner: skb freed: use after free

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • We now have an IrDA git tree on kernel.org:
    git://git.kernel.org/pub/scm/linux/kernel/git/sameo/irda-2.6.git

    Signed-off-by: Samuel Ortiz

    Samuel Ortiz
     
  • The sir retries count reaches -1 rather than 0.
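
    The driver code is not shown here, so the following is only a guess at
    the general shape of the problem: a polling loop written with a
    post-decrement leaves the counter at -1 when it runs out, so a later
    "retries == 0" timeout check never fires. The function names are
    hypothetical, and the pre-decrement variant is one common remedy, not
    necessarily the one applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for hardware that never becomes ready. */
static bool never_ready(void)
{
	return false;
}

/* Post-decrement: the final failing test still decrements, giving -1. */
static int final_count_postdec(int retries)
{
	assert(retries > 0);
	while (retries-- && !never_ready())
		;
	return retries;
}

/* Pre-decrement: the loop stops exactly when the counter hits 0. */
static int final_count_predec(int retries)
{
	assert(retries > 0);
	while (--retries && !never_ready())
		;
	return retries;
}
```

    Note that the pre-decrement form polls one time fewer for the same
    starting value, which may or may not matter for a given device.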

    Signed-off-by: Roel Kluin
    Signed-off-by: Samuel Ortiz
    Signed-off-by: Andrew Morton

    Roel Kluin
     
  • Signed-off-by: Graff Yang
    Cc: Mike Frysinger
    Cc: Bryan Wu
    Signed-off-by: Samuel Ortiz
    Signed-off-by: Andrew Morton

    Graff Yang
     

12 Jun, 2009

9 commits