11 Jun, 2008

7 commits

  • Step 8.5 in RFC 4340 says for the newly cloned socket

    Initialize S.GAR := S.ISS,

    but what the code (minisocks.c) in fact does is

    Initialize S.GAR := S.ISR,

    which is wrong (typo?) -- fixed by the patch.

    Signed-off-by: Gerrit Renker

    Gerrit Renker
     
  • This fixes a bug in computing the inter-packet-interval t_ipi = s/X:

    scaled_div32(a, b) uses u32 for b, but in "scaled_div32(s, X)" the type of the
    sending rate `X' is u64. Since X is scaled by 2^6, this truncates rates greater
    than 2^26 Bps (~537 Mbps).

    Using full 64-bit division now.

    Signed-off-by: Gerrit Renker

    Gerrit Renker
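
    The truncation can be reproduced in user space. The following is a minimal
    sketch, not the kernel code: t_ipi_truncating() and t_ipi_full() are
    hypothetical helpers that model a divisor parameter declared as 32-bit
    versus 64-bit, and the packet size and rate used with them are made-up
    values.

```c
#include <stdint.h>

/*
 * s is the packet size in bytes; x_scaled is the sending rate X in bytes
 * per second, scaled by 2^6. Both functions return s/X in microseconds.
 * The caller must make sure the divisor is non-zero.
 */

/* Models a divisor parameter declared as u32: the upper bits of X are lost. */
static uint64_t t_ipi_truncating(uint32_t s, uint64_t x_scaled)
{
    uint32_t b = (uint32_t)x_scaled;    /* what a u32 parameter receives */

    return (((uint64_t)s << 6) * 1000000u) / b;
}

/* Same computation with a full 64-bit divisor. */
static uint64_t t_ipi_full(uint32_t s, uint64_t x_scaled)
{
    return (((uint64_t)s << 6) * 1000000u) / x_scaled;
}
```

    With s = 1460 bytes and X = 2^26 + 1000 bytes per second, the truncating
    variant yields 1460000 microseconds, whereas the 64-bit variant yields 21.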
     
  • This fixes a bug in the reverse lookup of p: given a value f(p), instead of p,
    the function returned the smallest tabulated value f(p).

    With

    f(p) = sqrt(2*p/3) + 12 * sqrt(3*p/8) * (32 * p^3 + p),

    the smallest tabulated value of 10^6 * f(p), at p=0.0001, is 8172.

    Since this value is scaled by 10^6, the outcome of this bug is that a loss
    of 8172/10^6 = 0.8172% was reported whenever the input was below the table
    resolution of 0.01%.

    This means that the value was over 80 times too high, resulting in large spikes
    of the initial loss interval, thus unnecessarily reducing the throughput.

    Also corrected the printk format (%u for u32).

    Signed-off-by: Gerrit Renker

    Gerrit Renker
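
    The difference between the buggy and the corrected lookup can be sketched
    with a three-entry table. Everything here is hypothetical except the value
    8172: the other two f(p) entries are approximations computed for this
    sketch, the function names are invented, and returning the smallest
    tabulated p for inputs below the table resolution is an assumption about
    the fix.

```c
#include <stddef.h>
#include <stdint.h>

/* p and f(p), both scaled by 10^6. */
struct fp_entry {
    uint32_t p;
    uint32_t f;
};

static const struct fp_entry table[] = {
    { 100,  8172 },     /* p = 0.0001, value from the commit message */
    { 200, 11568 },     /* p = 0.0002, approximate */
    { 300, 14180 },     /* p = 0.0003, approximate */
};
#define TABLE_LEN (sizeof(table) / sizeof(table[0]))

/* Return p of the first entry whose f(p) is >= fvalue (last entry if none). */
static uint32_t lookup_in_range(uint32_t fvalue)
{
    size_t i;

    for (i = 0; i < TABLE_LEN - 1; i++)
        if (table[i].f >= fvalue)
            break;
    return table[i].p;
}

/* Buggy variant: below the table resolution an f(p) value leaks out as p. */
static uint32_t reverse_lookup_buggy(uint32_t fvalue)
{
    if (fvalue < table[0].f)
        return table[0].f;
    return lookup_in_range(fvalue);
}

/* Corrected variant: below the table resolution, return the smallest p. */
static uint32_t reverse_lookup_fixed(uint32_t fvalue)
{
    if (fvalue < table[0].f)
        return table[0].p;
    return lookup_in_range(fvalue);
}
```

    For an input of 5000, the buggy variant returns 8172 (read as p = 0.8172%),
    while the corrected variant returns 100 (p = 0.01%).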
     
  • This fixes an oversight from an earlier patch, ensuring that Ack Vectors
    are not processed on request sockets.

    The issue is that Ack Vectors must not be parsed on request sockets, since
    the Ack Vector feature depends on the selection of the (TX) CCID. During the
    initial handshake the CCIDs are undefined, and so RFC 4340, 10.3 applies:

    "Using CCID-specific options and feature options during a negotiation
    for the corresponding CCID feature is NOT RECOMMENDED [...]"

    And it is not even possible: when the server receives the Request from the
    client, the CCID and Ack Vector features are undefined; when the Ack finalising
    the 3-way handshake arrives, the request socket has not been cloned yet into a
    full socket. (This order is necessary, since otherwise the newly created socket
    would have to be destroyed whenever an option error occurred - a malicious
    hacker could simply send garbage options and exploit this.)

    Signed-off-by: Gerrit Renker

    Gerrit Renker
     
  • This patch fixes the following sparse warnings:
    * nested min(max()) expression:
    net/dccp/ccids/ccid3.c:91:21: warning: symbol '__x' shadows an earlier one
    net/dccp/ccids/ccid3.c:91:21: warning: symbol '__y' shadows an earlier one

    * Declaration of function prototypes in .c instead of .h file, resulting in
    "should it be static?" warnings.

    * Declared "struct dccpw" static (local to dccp_probe).

    * Disabled dccp_delayed_ack() - not fully removed due to RFC 4340, 11.3
    ("Receivers SHOULD implement delayed acknowledgement timers ...").

    * Used a different local variable name to avoid
    net/dccp/ackvec.c:293:13: warning: symbol 'state' shadows an earlier one
    net/dccp/ackvec.c:238:33: originally declared here

    * Removed unused functions `dccp_ackvector_print' and `dccp_ackvec_print'.

    Signed-off-by: Gerrit Renker

    Gerrit Renker
     
  • In commit $(825de27d9e40b3117b29a79d412b7a4b78c5d815) (from 27th May, commit
    message `dccp ccid-3: Fix "t_ipi explosion" bug'), the CCID-3 window counter
    computation was fixed to cope with RTTs < 4 microseconds.

    Such RTTs can be found e.g. when running CCID-3 over loopback. The fix removed
    a check against RTT < 4, but introduced a divide-by-zero bug.

    All steady-state RTTs in DCCP are filtered using dccp_sample_rtt(), which
    ensures non-zero samples. However, a zero RTT is possible on initialisation,
    when there is no RTT sample from the Request/Response exchange.

    The fix is to use the fallback-RTT from RFC 4340, 3.4.

    This is also better than just fixing update_win_count() since it allows other
    parts of the code to always assume that the RTT is non-zero during the time
    that the CCID is used.

    Signed-off-by: Gerrit Renker

    Gerrit Renker
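
    A minimal sketch of the idea, assuming RTTs are kept in microseconds. The
    macro and function names are invented for illustration; the value is the
    0.2 second default that RFC 4340, 3.4 gives for the case where no RTT
    estimate is available.

```c
#include <stdint.h>

/* Fallback RTT of 0.2 seconds, in microseconds (name chosen for this sketch). */
#define FALLBACK_RTT_US 200000u

/*
 * Hypothetical initialisation helper: use the RTT sample from the
 * Request/Response exchange if there is one, otherwise the fallback,
 * so that later code never sees a zero RTT.
 */
static uint32_t initial_rtt_us(uint32_t handshake_sample_us)
{
    return handshake_sample_us != 0 ? handshake_sample_us : FALLBACK_RTT_US;
}
```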
     
  • Wei Yongjun noticed that we may call reqsk_free on request sock objects whose
    opt fields may not be initialized. Fix it by introducing inet_reqsk_alloc,
    where we initialize ->opt to NULL, and by setting ->pktopts to NULL in
    inet6_reqsk_alloc.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     

27 May, 2008

2 commits

  • The identification of this bug is thanks to Cheng Wei and Tomasz
    Grobelny.

    To avoid divide-by-zero, the implementation previously ignored RTTs
    smaller than 4 microseconds when performing integer division RTT/4.

    When the RTT reached a value less than 4 microseconds (as observed on
    loopback), this prevented the Window Counter CCVal value from
    advancing. As a result, the receiver stopped sending feedback. This in
    turn caused non-ending expiries of the nofeedback timer at the sender,
    so that the sending rate was progressively reduced until reaching the
    minimum of one packet per 64 seconds.

    The patch fixes this bug by handling integer division more
    intelligently. Due to consistent use of dccp_sample_rtt(),
    divide-by-zero-RTT is avoided.

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
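
    The effect of the RTT < 4 check can be illustrated as follows. This is a
    sketch under assumptions, not the patch itself: the function names are
    invented, and multiplying before dividing is only one possible reading of
    "handling integer division more intelligently".

```c
#include <stdint.h>

/*
 * Old scheme as described above: RTTs below 4 microseconds are ignored to
 * avoid dividing by rtt/4 == 0, so the window counter never advances.
 */
static uint32_t quarter_rtts_old(uint32_t delta_us, uint32_t rtt_us)
{
    if (rtt_us < 4)
        return 0;
    return delta_us / (rtt_us / 4);
}

/*
 * Possible rearrangement: multiply before dividing. Only a zero RTT would
 * then be a problem, and the caller is assumed to exclude that case.
 */
static uint32_t quarter_rtts_new(uint32_t delta_us, uint32_t rtt_us)
{
    return (uint32_t)((4 * (uint64_t)delta_us) / rtt_us);
}
```

    With 10 microseconds elapsed and an RTT of 3 microseconds, the old scheme
    reports 0 quarter-RTTs, while the rearranged division reports 13.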
     
  • RFC 4340 says:
    8.5. Pseudocode
    ...
    If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet
    has short sequence numbers), drop packet and return

    But DCCP handles packets with short sequence numbers incorrectly: it
    currently drops a packet only if P.type is Data, Ack, or DataAck and
    P.X == 0.

    Signed-off-by: Wei Yongjun
    Acked-by: Gerrit Renker
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Wei Yongjun
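
    The quoted rule can be written as a small predicate. This is a user-space
    sketch with invented names; only the packet type numbering follows
    RFC 4340.

```c
#include <stdbool.h>

/* DCCP packet types, numbered as in RFC 4340. */
enum pkt_type {
    PKT_REQUEST,    /* 0 */
    PKT_RESPONSE,   /* 1 */
    PKT_DATA,       /* 2 */
    PKT_ACK,        /* 3 */
    PKT_DATAACK,    /* 4 */
    PKT_CLOSEREQ,   /* 5 */
    PKT_CLOSE,      /* 6 */
    PKT_RESET,      /* 7 */
    PKT_SYNC,       /* 8 */
    PKT_SYNCACK     /* 9 */
};

static bool is_data_or_ack(enum pkt_type type)
{
    return type == PKT_DATA || type == PKT_ACK || type == PKT_DATAACK;
}

/*
 * Drop if P.type is not Data, Ack, or DataAck and P.X == 0, i.e. only these
 * three types may carry short sequence numbers.
 */
static bool drop_short_seqno(enum pkt_type type, unsigned int x)
{
    return !is_data_or_ack(type) && x == 0;
}
```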
     

06 May, 2008

1 commit

  • dccp_feat_change() validates the length and returns 1 on error.
    This happens to work since the call chain checks for 0 == success,
    but the value is returned to userspace, so make it a real error value.

    Signed-off-by: Chris Wright
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Chris Wright
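
    The convention in question, sketched in user space. The validator, its
    length rule and the choice of -EINVAL are assumptions made for
    illustration; the commit message only says that a real error value is
    returned.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical length check: returns 0 on success and a negative errno
 * value on failure, so that a caller passing the result on to userspace
 * hands back a meaningful error instead of a bare 1.
 */
static int check_feat_len(size_t len)
{
    if (len == 0)
        return -EINVAL;
    return 0;
}
```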
     

03 May, 2008

1 commit


26 Apr, 2008

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (48 commits)
    net: Fix wrong interpretation of some copy_to_user() results.
    xfrm: alg_key_len & alg_icv_len should be unsigned
    [netdrvr] tehuti: move ioctl perm check closer to function start
    ipv6: Fix typo in net/ipv6/Kconfig
    via-velocity: fix vlan receipt
    tg3: sparse cleanup
    forcedeth: realtek phy crossover detection
    ibm_newemac: Increase MDIO timeouts
    gianfar: Fix skb allocation strategy
    netxen: reduce stack usage of netxen_nic_flash_print
    smc911x: test after postfix decrement fails in smc911x_{reset,drop_pkt}
    net drivers: fix platform driver hotplug/coldplug
    forcedeth: new backoff implementation
    ehea: make things static
    phylib: Add support for board-level PHY fixups
    [netdrvr] atlx: code movement: move atl1 parameter parsing
    atlx: remove flash vendor parameter
    korina: misc cleanup
    korina: fix misplaced return statement
    WAN: Fix confusing insmod error code for C101 too.
    ...

    Linus Torvalds
     

25 Apr, 2008

1 commit


24 Apr, 2008

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
    iwlwifi: Fix built-in compilation of iwlcore
    net: Unexport move_addr_to_{kernel,user}
    rt2x00: Select LEDS_CLASS.
    iwlwifi: Select LEDS_CLASS.
    leds: Do not guard NEW_LEDS with HAS_IOMEM
    [IPSEC]: Fix catch-22 with algorithm IDs above 31
    time: Export set_normalized_timespec.
    tcp: Make use of before macro in tcp_input.c
    hamradio: Remove unneeded and deprecated cli()/sti() calls in dmascc.c
    [NETNS]: Remove empty ->init callback.
    [DCCP]: Convert do_gettimeofday() to getnstimeofday().
    [NETNS]: Don't initialize err variable twice.
    [NETNS]: The ip6_fib_timer can work with garbage on net namespace stop.
    [IPV4]: Convert do_gettimeofday() to getnstimeofday().
    [IPV4]: Make icmp_sk_init() static.
    [IPV6]: Make struct ip6_prohibit_entry_template static.
    tcp: Trivial fix to correct function name in a comment in net/ipv4/tcp.c
    [NET]: Expose netdevice dev_id through sysfs
    skbuff: fix missing kernel-doc notation
    [ROSE]: Fix soft lockup wrt. rose_node_list_lock

    Linus Torvalds
     

22 Apr, 2008

1 commit


19 Apr, 2008

1 commit


18 Apr, 2008

1 commit

  • As far as I can see from the code, the two places (tcp_v6_syn_recv_sock and
    dccp_v6_request_recv_sock) that call this one already run with
    BHs disabled, so it's safe to call __inet_inherit_port there.

    Besides (in case I missed something in the code review), the call trace
    tcp_v6_syn_recv_sock
     `- tcp_v4_syn_recv_sock
         `- __inet_inherit_port
    and the similar one for DCCP are valid, but assume BHs to be disabled.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

14 Apr, 2008

13 commits


13 Apr, 2008

1 commit

  • dev_queue_xmit() and the other IP output functions expect to get a skb
    with clear or properly initialized skb->cb. Unlike TCP and UDP, the
    dccp_skb_cb doesn't contain a struct inet_skb_parm at the beginning,
    so the DCCP-specific data is interpreted by the IP output functions.
    This can cause false negatives for the conditional POST_ROUTING hook
    invocation, making the packet bypass the hook.

    Add an inet_skb_parm/inet6_skb_parm union to the beginning of
    dccp_skb_cb to avoid clashes. Also add a BUILD_BUG_ON to make
    sure it fits in the cb.

    [ Combined with patch from Gerrit Renker to remove two now unnecessary
    memsets of IPCB(skb)->opt ]

    Signed-off-by: Patrick McHardy
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Patrick McHardy
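
    The layout idea can be sketched in user space. All type and field names
    below are simplified stand-ins rather than the kernel definitions, the
    48-byte control block size is an assumption of this sketch, and C11
    _Static_assert takes the place of BUILD_BUG_ON.

```c
#include <stddef.h>
#include <stdint.h>

#define CB_SIZE 48      /* control block size assumed for this sketch */

/* Stand-ins for the per-packet state the IPv4/IPv6 output paths expect. */
struct ipv4_parm {
    uint8_t  opt[16];
    uint16_t flags;
};

struct ipv6_parm {
    int32_t  iif;
    uint16_t flags;
};

/* Protocol-private control block with the IP state reserved up front. */
struct proto_cb {
    union {
        struct ipv4_parm h4;
        struct ipv6_parm h6;
    } header;           /* must stay first: the IP layer reads offset 0 */
    uint64_t seq;
    uint8_t  pkt_type;
};

/* Compile-time checks: the block fits, and the IP state sits at offset 0. */
_Static_assert(sizeof(struct proto_cb) <= CB_SIZE,
               "proto_cb must fit in the control block");
_Static_assert(offsetof(struct proto_cb, header) == 0,
               "IP state must be at the start of the control block");
```

    Because the checks run at compile time, a later field addition that
    overflows the control block fails the build instead of corrupting
    neighbouring data at run time.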
     

10 Apr, 2008

1 commit


04 Apr, 2008

5 commits


29 Mar, 2008

1 commit


23 Mar, 2008

1 commit

  • Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to
    struct proto) from Arnaldo, I made a similar change for the UDP/-Lite IPv4
    and -v6 protocols.

    The result is not that exciting, but it removes some levels of
    indirection in udpxxx_get_port and saves some space in code and text.

    The first step is to put the existing hashinfo and the new udp_hash into a
    union on the struct proto and to give this union a name, since future
    initialization of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo would
    otherwise cause a gcc warning about the inability to initialize an
    anonymous member this way.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
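
    Why the union gets a name can be seen in a small sketch. The structures
    are simplified stand-ins and the member name h is an assumption of this
    sketch; the point is only that a named union member can be addressed
    directly in a designated initializer.

```c
/* Simplified stand-ins for the per-protocol hash tables. */
struct hashinfo_sketch {
    int nbuckets;
};

struct udp_hash_sketch {
    int nbuckets;
};

struct proto_sketch {
    const char *name;
    union {
        struct hashinfo_sketch *hashinfo;
        struct udp_hash_sketch *udp_hash;
    } h;    /* named, so ".h.<member> = ..." works in a static initializer */
};

static struct hashinfo_sketch tcp_hashinfo_sketch = { 16 };
static struct udp_hash_sketch udp_hash_table_sketch = { 128 };

static struct proto_sketch tcp_prot_sketch = {
    .name       = "TCP",
    .h.hashinfo = &tcp_hashinfo_sketch,
};

static struct proto_sketch udp_prot_sketch = {
    .name       = "UDP",
    .h.udp_hash = &udp_hash_table_sketch,
};
```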
     

06 Mar, 2008

1 commit