23 Nov, 2011

1 commit


21 Oct, 2011

1 commit

  • Adding const qualifiers to pointers can ease code review and help spot
    some bugs. It might also allow the compiler to optimize code further.

    For example, is it legal to temporarily write a null cksum into tcphdr
    in tcp_md5_hash_header()? I am afraid a sniffer could catch the
    temporary null value...

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
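
The concern about a temporary null checksum can be illustrated with a minimal sketch. Everything here is hypothetical: `struct demo_hdr`, `toy_hash()` and `hash_header()` are stand-ins invented for the example, not the kernel's `struct tcphdr` or `tcp_md5_hash_header()`. The point is only that a `const` pointer parameter rules out the in-place write, so the field has to be zeroed in a private copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified header; not the kernel's struct tcphdr. */
struct demo_hdr {
    uint16_t source;
    uint16_t dest;
    uint16_t check;   /* checksum field that must be hashed as zero */
};

/* Toy rolling sum standing in for a real hash; illustration only. */
static uint32_t toy_hash(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t sum = 0;

    while (len--)
        sum = sum * 31u + *p++;
    return sum;
}

/*
 * Because 'th' points to const data, writing th->check = 0 here would
 * not compile. The field is zeroed in a local copy instead, so the
 * caller's header never holds a temporary null checksum.
 */
uint32_t hash_header(const struct demo_hdr *th)
{
    struct demo_hdr copy;

    assert(th != NULL);
    memcpy(&copy, th, sizeof(copy));
    copy.check = 0;
    return toy_hash(&copy, sizeof(copy));
}
```

Two headers that differ only in `check` hash to the same value, and the caller's header is left untouched.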
     

11 Aug, 2011

1 commit

  • Using gcc 4.4.3, warnings are emitted for a possibly uninitialized use
    of ecn_ok.

    This can happen if cookie_check_timestamp() returns early because no
    timestamp was seen. Defaulting to ECN off seems like a reasonable thing
    to do in this case, so initialize ecn_ok to false.

    Signed-off-by: Mike Waychison
    Signed-off-by: David S. Miller

    Mike Waychison
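
A minimal sketch of the pattern behind this warning, under stated assumptions: `struct demo_opts`, `demo_check_timestamp()` and `demo_cookie_check()` are hypothetical reductions, and the bit tested for ECN is made up for the example. The callee has an early-return path that never writes the out-parameter, so the caller supplies the default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced option state; not the kernel's structure. */
struct demo_opts {
    bool saw_tstamp;
    unsigned int rcv_tsecr;   /* echoed timestamp value */
};

/*
 * Returns true if the options are acceptable. Note the early return:
 * when no timestamp was seen, *ecn_ok is never written.
 */
bool demo_check_timestamp(const struct demo_opts *opt, bool *ecn_ok)
{
    assert(opt != NULL && ecn_ok != NULL);
    if (!opt->saw_tstamp)
        return true;
    /* The bit position is an assumption made for this sketch. */
    *ecn_ok = (opt->rcv_tsecr & 0x20u) != 0;
    return true;
}

/* Caller: default to "ECN off" so the early-return path is well defined. */
bool demo_cookie_check(const struct demo_opts *opt)
{
    bool ecn_ok = false;

    if (!demo_check_timestamp(opt, &ecn_ok))
        return false;
    return ecn_ok;
}
```

Without the `= false` initializer, the early-return path would leave `ecn_ok` indeterminate, which is the kind of use a compiler can flag.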
     

09 Jun, 2011

1 commit

  • This patch lowers the default initRTO from 3secs to 1sec per
    RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet
    has been retransmitted, AND the TCP timestamp option is not on.

    It also adds support for taking an RTT sample during 3WHS on the
    passive open side, just like its active open counterpart, and uses it,
    if valid, to seed the initRTO for the data transmission phase.

    The patch also resets ssthresh to its initial default at the
    beginning of the data transmission phase, and reduces cwnd to 1 if
    there has been MORE THAN ONE retransmission during 3WHS per RFC5681.

    Signed-off-by: H.K. Jerry Chu
    Signed-off-by: David S. Miller

    Jerry Chu
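
The two decision rules in this message can be sketched as follows. The struct and function names are hypothetical, and the sketch deliberately leaves out the RTT-sample seeding and the ssthresh reset; it only encodes the "retransmitted AND no timestamps" fallback and the "more than one retransmission" cwnd rule as worded above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_INIT_RTO_MS      1000u  /* 1 sec default, per the commit text */
#define DEMO_FALLBACK_RTO_MS  3000u  /* 3 sec fallback, per the commit text */

/* Hypothetical summary of what happened during the handshake (3WHS). */
struct demo_3whs {
    unsigned int retrans;   /* number of SYN or SYN-ACK retransmissions */
    bool tstamp_on;         /* TCP timestamp option in use */
};

/* Fall back to 3 secs only if there was a retransmission AND no timestamps. */
unsigned int demo_init_rto_ms(const struct demo_3whs *hs)
{
    assert(hs != NULL);
    if (hs->retrans > 0 && !hs->tstamp_on)
        return DEMO_FALLBACK_RTO_MS;
    return DEMO_INIT_RTO_MS;
}

/* Reduce cwnd to 1 only after MORE THAN ONE retransmission. */
unsigned int demo_init_cwnd(const struct demo_3whs *hs,
                            unsigned int default_cwnd)
{
    assert(hs != NULL);
    return hs->retrans > 1 ? 1u : default_cwnd;
}
```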
     

23 Apr, 2011

1 commit


13 Mar, 2011

4 commits


02 Mar, 2011

1 commit

  • Route lookups follow a general pattern in the ipv6 code wherein
    we first find the non-IPSEC route, potentially override the
    flow destination address due to ipv6 options settings, and then
    finally make an IPSEC search using either xfrm_lookup() or
    __xfrm_lookup().

    __xfrm_lookup() is used when we want to generate a blackhole route
    if the key manager needs to resolve the IPSEC rules (in this case
    -EREMOTE is returned and the original 'dst' is left unchanged).

    Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC
    resolution is necessary, we simply fail the lookup completely.

    All of these cases are encapsulated into two routines,
    ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow, the latter of which
    handles unconnected UDP datagram sockets.

    Signed-off-by: David S. Miller

    David S. Miller
     

27 Jun, 2010

2 commits

  • Allows use of ECN when syncookies are in effect by encoding ecn_ok
    into the syn-ack tcp timestamp.

    While at it, remove an unneeded #ifdef CONFIG_SYN_COOKIES.
    With CONFIG_SYN_COOKIES=n, want_cookie is ifdef'd to 0 and gcc
    removes the "if (0)".

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
     
  • As pointed out by Fernando Gont there is no need to encode rcv_wscale
    into the cookie.

    We did not use the restored rcv_wscale anyway; it is recomputed
    via tcp_select_initial_window().

    Thus we can save 4 bits in the ts option space by removing rcv_wscale.
    In case window scaling was not supported, we set the (invalid) wscale
    value 0xf.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
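
The two commits above can be pictured with a small encode/decode sketch. The bit layout below is an assumption chosen for illustration (the messages do not spell it out), and all names are hypothetical. What it shows is the idea of spending one bit on ecn_ok and using 0xf, which is not a valid window scale (valid shifts are 0 to 14), to mean "window scaling not supported".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit layout for this sketch; not taken from the commits. */
#define DEMO_WSCALE_MASK  0x0fu   /* 4 bits: peer's window scale */
#define DEMO_WSCALE_NONE  0x0fu   /* invalid scale: wscale unsupported */
#define DEMO_SACK_BIT     0x10u
#define DEMO_ECN_BIT      0x20u

struct demo_opts {
    bool wscale_ok;
    uint8_t snd_wscale;
    bool sack_ok;
    bool ecn_ok;
};

/* Pack the negotiated options into a handful of low-order bits. */
uint32_t demo_encode_opts(const struct demo_opts *o)
{
    uint32_t bits;

    assert(o != NULL);
    assert(!o->wscale_ok || o->snd_wscale <= 14);
    bits = o->wscale_ok ? o->snd_wscale : DEMO_WSCALE_NONE;
    if (o->sack_ok)
        bits |= DEMO_SACK_BIT;
    if (o->ecn_ok)
        bits |= DEMO_ECN_BIT;
    return bits;
}

/* Recover the options from the echoed bits. */
void demo_decode_opts(uint32_t bits, struct demo_opts *o)
{
    uint32_t ws = bits & DEMO_WSCALE_MASK;

    assert(o != NULL);
    o->wscale_ok = ws != DEMO_WSCALE_NONE;
    o->snd_wscale = o->wscale_ok ? (uint8_t)ws : 0;
    o->sack_ok = (bits & DEMO_SACK_BIT) != 0;
    o->ecn_ok = (bits & DEMO_ECN_BIT) != 0;
}
```

Note that no receive window scale is stored at all, matching the observation that it is recomputed rather than restored.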
     

17 Jun, 2010

1 commit

  • Discard the ACK if we find options that do not match current sysctl
    settings.

    Previously it was possible to create a connection with sack, wscale,
    etc. enabled even if the feature was disabled via sysctl.

    Also remove an unneeded call to tcp_sack_reset() in
    cookie_check_timestamp: Both call sites (cookie_v4_check,
    cookie_v6_check) zero "struct tcp_options_received", hand it to
    tcp_parse_options() (which does not change tcp_opt->num_sacks/dsack)
    and then call cookie_check_timestamp().

    Even if num_sacks/dsacks were changed, the structure is allocated on
    the stack and after cookie_check_timestamp returns only a few selected
    members are copied to the inet_request_sock.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
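
A minimal sketch of the check described above, with hypothetical types: `struct demo_sysctl` stands in for the relevant sysctl settings and `struct demo_opts` for the options recovered from the ACK. Including the timestamp option in the check is an assumption covering the "etc." in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the relevant sysctl switches. */
struct demo_sysctl {
    bool timestamps;
    bool sack;
    bool window_scaling;
};

/* Hypothetical subset of the options recovered from the ACK. */
struct demo_opts {
    bool saw_tstamp;
    bool sack_ok;
    bool wscale_ok;
};

/*
 * Returns false (meaning: discard the ACK) if the peer claims an option
 * that is currently disabled by sysctl.
 */
bool demo_options_allowed(const struct demo_opts *opt,
                          const struct demo_sysctl *ctl)
{
    assert(opt != NULL && ctl != NULL);
    if (opt->saw_tstamp && !ctl->timestamps)
        return false;
    if (opt->sack_ok && !ctl->sack)
        return false;
    if (opt->wscale_ok && !ctl->window_scaling)
        return false;
    return true;
}
```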
     

05 Jun, 2010

2 commits


02 Jun, 2010

1 commit

  • There are more than a dozen occurrences of the following code in the
    IPv6 stack:

    if (opt && opt->srcrt) {
            struct rt0_hdr *rt0 = (struct rt0_hdr *) opt->srcrt;
            ipv6_addr_copy(&final, &fl.fl6_dst);
            ipv6_addr_copy(&fl.fl6_dst, rt0->addr);
            final_p = &final;
    }

    Replace those with a helper. Note that the helper overrides final_p
    in all cases. This is ok as final_p was previously initialized to
    NULL when declared.

    Signed-off-by: Arnaud Ebalard
    Signed-off-by: David S. Miller

    Arnaud Ebalard
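
One possible shape for such a helper is sketched below. The message does not name the helper or give its signature, so `demo_update_dst()` and the `demo_*` types are hypothetical, simplified stand-ins. The sketch returns the value the caller assigns to final_p, which is why final_p is overridden in all cases: it becomes NULL when there is no routing header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the kernel types; names are illustrative. */
struct demo_addr { unsigned char bytes[16]; };
struct demo_rt0_hdr { struct demo_addr addr[1]; };
struct demo_txoptions { struct demo_rt0_hdr *srcrt; };
struct demo_flow { struct demo_addr dst; };

/*
 * If a routing header is present, save the real destination in *final,
 * point the flow at the first hop, and return 'final'. Otherwise leave
 * the flow alone and return NULL.
 */
struct demo_addr *demo_update_dst(struct demo_flow *fl,
                                  const struct demo_txoptions *opt,
                                  struct demo_addr *final)
{
    assert(fl != NULL && final != NULL);
    if (!opt || !opt->srcrt)
        return NULL;
    memcpy(final, &fl->dst, sizeof(*final));
    memcpy(&fl->dst, &opt->srcrt->addr[0], sizeof(fl->dst));
    return final;
}
```

A call site would then shrink to a single assignment such as `final_p = demo_update_dst(&fl, opt, &final);`.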
     

24 Dec, 2009

1 commit

  • Add rtnetlink init_rcvwnd to set the TCP initial receive window size
    advertised by passive and active TCP connections.

    The current Linux TCP implementation limits the advertised TCP initial
    receive window to the one prescribed by slow start. For short-lived
    TCP connections used for transaction-type traffic (e.g. HTTP
    requests), bounding the advertised TCP initial receive window results
    in increased latency to complete the transaction.

    Setting the initial congestion window is already supported via
    rtnetlink init_cwnd, but the feature is useless without the ability
    to set a larger TCP initial receive window.

    The rtnetlink init_rcvwnd allows increasing the TCP initial receive
    window, so that a TCP connection can advertise a larger receive
    window than the one bounded by slow start.

    Signed-off-by: Laurent Chavey
    Signed-off-by: David S. Miller

    laurent chavey
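
The effect of such a route metric can be sketched roughly as follows. This is not the kernel's window selection logic: the function name, the assumption that the metric is expressed in segments, and the default bound of 4 segments are all made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed slow-start bound, in segments; chosen for the example only. */
#define DEMO_SLOW_START_SEGS 4u

/*
 * Pick the initial advertised receive window in bytes.
 *   space       - receive buffer space available
 *   mss         - maximum segment size
 *   init_rcvwnd - per-route override in segments, 0 meaning "not set"
 */
uint32_t demo_initial_rcv_wnd(uint32_t space, uint32_t mss,
                              uint32_t init_rcvwnd)
{
    uint64_t cap;

    assert(mss > 0);
    cap = (uint64_t)(init_rcvwnd ? init_rcvwnd : DEMO_SLOW_START_SEGS) * mss;
    /* Never advertise more than the buffer space actually available. */
    return space < cap ? space : (uint32_t)cap;
}
```

With the override unset, the window stays at the slow-start bound; with it set, the window can grow up to the available buffer space.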
     

16 Dec, 2009

1 commit

  • It creates a regression, triggering badness for SYN_RECV
    sockets, for example:

    [19148.022102] Badness at net/ipv4/inet_connection_sock.c:293
    [19148.022570] NIP: c02a0914 LR: c02a0904 CTR: 00000000
    [19148.023035] REGS: eeecbd30 TRAP: 0700 Not tainted (2.6.32)
    [19148.023496] MSR: 00029032 CR: 24002442 XER: 00000000
    [19148.024012] TASK = eee9a820[1756] 'privoxy' THREAD: eeeca000

    This is likely caused by the change in the 'estab' parameter
    passed to tcp_parse_options() when invoked by the functions
    in net/ipv4/tcp_minisocks.c

    But even if that is fixed, the ->conn_request() changes made in
    this patch series are fundamentally wrong. They try to use the
    listening socket's 'dst' to probe the route settings. The
    listening socket doesn't even have a route, and you can't
    get the right route (the child request one) until much later
    after we setup all of the state, and it must be done by hand.

    This stuff really isn't ready, so the best thing to do is a
    full revert. This reverts the following commits:

    f55017a93f1a74d50244b1254b9a2bd7ac9bbf7d
    022c3f7d82f0f1c68018696f2f027b87b9bb45c2
    1aba721eba1d84a2defce45b950272cee1e6c72a
    cda42ebd67ee5fdf09d7057b5a4584d36fe8a335
    345cda2fd695534be5a4494f1b59da9daed33663
    dc343475ed062e13fc260acccaab91d7d80fd5b2
    05eaade2782fb0c90d3034fd7a7d5a16266182bb
    6a2a2d6bf8581216e08be15fcb563cfd6c430e1e

    Signed-off-by: David S. Miller

    David S. Miller
     

03 Dec, 2009

1 commit

  • Parse incoming TCP_COOKIE option(s).

    Calculate TCP_COOKIE option.

    Send optional data.

    This is a significantly revised implementation of an earlier (year-old)
    patch that no longer applies cleanly, with permission of the original
    author (Adam Langley):

    http://thread.gmane.org/gmane.linux.network/102586

    Requires:
    TCPCT part 1a: add request_values parameter for sending SYNACK
    TCPCT part 1b: generate Responder Cookie secret
    TCPCT part 1c: sysctl_tcp_cookie_size, socket option TCP_COOKIE_TRANSACTIONS
    TCPCT part 1d: define TCP cookie option, extend existing struct's
    TCPCT part 1e: implement socket option TCP_COOKIE_TRANSACTIONS
    TCPCT part 1f: Initiator Cookie => Responder

    Signed-off-by: William.Allen.Simpson@gmail.com
    Signed-off-by: David S. Miller

    William Allen Simpson
     

29 Oct, 2009

1 commit


19 Oct, 2009

1 commit

  • In order to have better cache layouts of struct sock (separate zones
    for rx/tx paths), we need this preliminary patch.

    The goal is to transfer fields used at lookup time into the first
    read-mostly cache line (inside struct sock_common) and move sk_refcnt
    to a separate cache line (only written by rx path).

    This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
    sport and id fields. This allows a future patch to define these
    fields as macros, like sk_refcnt, without name clashes.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
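
Why the prefix matters for a later macro conversion can be shown with a small sketch. The structures below are hypothetical reductions, not the kernel's `struct sock_common` or `struct inet_sock`. A macro named plain `daddr` would rewrite every local variable or parameter called `daddr`; a prefixed name such as `inet_daddr` can be turned into a macro without that clash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced layouts, for illustration only. */
struct demo_sock_common {
    uint32_t skc_daddr;
    uint16_t skc_dport;
};

struct demo_inet_sock {
    struct demo_sock_common common;
    uint32_t inet_saddr;
};

/*
 * Field aliases implemented as macros. This is only safe because the
 * names are prefixed: nothing else in scope is called inet_daddr.
 */
#define inet_daddr common.skc_daddr
#define inet_dport common.skc_dport

/* A local named 'daddr' coexists with the macros without any clash. */
uint32_t demo_peer_addr(const struct demo_inet_sock *inet)
{
    uint32_t daddr;

    assert(inet != NULL);
    daddr = inet->inet_daddr;   /* expands to inet->common.skc_daddr */
    return daddr;
}
```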
     

07 Oct, 2009

1 commit

  • Atis Elsts wrote:
    > Not sure if there is need to fill the mark from skb in tunnel xmit functions. In any case, it's not done for GRE or IPIP tunnels at the moment.

    Ok, I'll just drop that part, I'm not sure what should be done in this case.

    > Also, in this patch you are doing that for SIT (v6-in-v4) tunnels only, and not doing it for v4-in-v6 or v6-in-v6 tunnels. Any reason for that?

    I just sent that patch out too quickly, here's a better one with the updates.

    Add support for IPv6 route lookups using sk_mark.

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
     

24 Jun, 2009

2 commits

  • Percpu variable definition is about to be updated such that all percpu
    symbols including the static ones must be unique. Update percpu
    variable definitions accordingly.

    * as,cfq: rename ioc_count uniquely

    * cpufreq: rename cpu_dbs_info uniquely

    * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it

    * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and
    rename it

    * ipv4,6: rename cookie_scratch uniquely

    * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to
    pmc_irq_entry and nmi_entry to pmc_nmi_entry

    * perf_counter: rename disable_count to perf_disable_count

    * ftrace: rename test_event_disable to ftrace_test_event_disable

    * kmemleak: rename test_pointer to kmemleak_test_pointer

    * mce: rename next_interval to mce_next_interval

    [ Impact: percpu usage cleanups, no duplicate static percpu var names ]

    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter
    Cc: Ivan Kokshaysky
    Cc: Jens Axboe
    Cc: Dave Jones
    Cc: Jeremy Fitzhardinge
    Cc: linux-mm
    Cc: David S. Miller
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Li Zefan
    Cc: Catalin Marinas
    Cc: Andi Kleen

    Tejun Heo
     
  • Currently, the following three different ways to define percpu arrays
    are in use.

    1. DEFINE_PER_CPU(elem_type[array_len], array_name);
    2. DEFINE_PER_CPU(elem_type, array_name[array_len]);
    3. DEFINE_PER_CPU(elem_type, array_name)[array_len];

    Unify to #1 which correctly separates the roles of the two parameters
    and thus allows more flexibility in the way percpu variables are
    defined.

    [ Impact: cleanup ]

    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter
    Cc: Ingo Molnar
    Cc: Tony Luck
    Cc: Benjamin Herrenschmidt
    Cc: Thomas Gleixner
    Cc: Jeremy Fitzhardinge
    Cc: linux-mm@kvack.org
    Cc: Christoph Lameter
    Cc: David S. Miller

    Tejun Heo
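
Form 1 from the second commit can be tried out with a toy macro. `DEMO_DEFINE_PER_CPU` below is a hypothetical stand-in that only declares an ordinary variable and ignores per-CPU placement; it relies on `__typeof__`, a GCC/Clang extension. The sketch shows that when the array length is part of the type argument, the name argument stays a bare identifier, which is presumably what gives a macro room to decorate the name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy stand-in for the kernel macro: declares a plain variable.
 * __typeof__ lets an array type such as "unsigned int[16]" be used
 * as the first argument.
 */
#define DEMO_DEFINE_PER_CPU(type, name) __typeof__(type) name

/* Form 1: the array length lives in the type, the name is bare. */
DEMO_DEFINE_PER_CPU(unsigned int[16], demo_scratch);

size_t demo_scratch_len(void)
{
    return sizeof(demo_scratch) / sizeof(demo_scratch[0]);
}

unsigned int *demo_scratch_slot(size_t i)
{
    assert(i < demo_scratch_len());
    return &demo_scratch[i];
}
```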
     

20 Apr, 2009

1 commit

  • last_synq_overflow eats 4 or 8 bytes in struct tcp_sock, even
    though it is only used when a listening socket's syn queue
    is full.

    We can (ab)use rx_opt.ts_recent_stamp to store the same information;
    it is not used otherwise as long as a socket is in listen state.

    Move linger2 around to avoid splitting struct mtu_probe
    across cacheline boundary on 32 bit arches.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
     

26 Nov, 2008

1 commit

  • Pass netns to xfrm_lookup()/__xfrm_lookup(). For that pass netns
    to flow_cache_lookup() and resolver callback.

    Take it from socket or netdevice. Stub DECnet to init_net.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

20 Oct, 2008

1 commit

  • 'tcp: Port redirection support for TCP' (a3116ac5c) added a new member
    to struct inet_request_sock, which inet_csk_clone() makes use of. That
    commit failed to add proper initialization to the IPv6 syncookie code
    and missed a couple of places where the new member should be used
    instead of inet_sk(sk)->sport.

    Signed-off-by: KOVACS Krisztian
    Signed-off-by: David S. Miller

    KOVACS Krisztian
     

04 Aug, 2008

1 commit


26 Jul, 2008

1 commit

  • ecn_ok is not initialized when a connection is established by cookies.
    The cookie syn-ack never sets ECN, so ecn_ok must be set to 0.

    Spotted using ns-3/network simulation cradle simulator and valgrind.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
     

17 Jul, 2008

1 commit


11 Jun, 2008

1 commit


10 Apr, 2008

1 commit

  • Allow the use of SACK and window scaling when syncookies are used
    and the client supports tcp timestamps. Options are encoded into
    the timestamp sent in the syn-ack and restored from the timestamp
    echo when the ack is received.

    Based on earlier work by Glenn Griffin.
    This patch avoids increasing the size of structs by encoding TCP
    options into the least significant bits of the timestamp and
    by not using any 'timestamp offset'.

    The downside is that the timestamp sent in the packet after the synack
    will increase by several seconds.

    Changes since v1:

    * don't duplicate the timestamp echo decoding function; put it into
    ipv4/syncookies.c and have ipv6/syncookies.c use it

    * feedback from Glenn Griffin: fix line indented with spaces, kill
    redundant if ()

    Reviewed-by: Hagen Paul Pfeifer
    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
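
The idea of carrying options in the least significant bits of the timestamp can be sketched as below. The width of 6 bits, the function names, and the choice to step the value backwards when the substituted bits would push it past the current time are all assumptions of this sketch, not details stated in the message.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TSBITS 6u                           /* assumed width */
#define DEMO_TSMASK ((1u << DEMO_TSBITS) - 1u)

/*
 * Build the timestamp for the syn-ack: replace the low bits of 'now'
 * with the option bits. If that would yield a value ahead of 'now',
 * step back by one block so the result never exceeds 'now'.
 * (Unsigned arithmetic wraps if 'now' lies within the first block.)
 */
uint32_t demo_init_timestamp(uint32_t now, uint32_t options)
{
    uint32_t ts;

    assert((options & ~DEMO_TSMASK) == 0);
    ts = (now & ~DEMO_TSMASK) | options;
    if (ts > now)
        ts -= DEMO_TSMASK + 1u;
    return ts;
}

/* On the ACK, the option bits come straight back in the echo. */
uint32_t demo_options_from_echo(uint32_t tsecr)
{
    return tsecr & DEMO_TSMASK;
}
```

Because the low bits no longer track the clock, the first regular timestamp after the syn-ack can appear to jump, which is in line with the downside noted in the message.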
     

24 Mar, 2008

1 commit

  • The first u32 copied from syncookie_secret is overwritten by the
    minute-counter four lines below. After adjusting the destination
    address, the size of syncookie_secret can be reduced accordingly.

    AFAICS, the only other user of syncookie_secret[] is the ipv6
    syncookie support. Because ipv6 syncookies only grab 44 bytes from
    syncookie_secret[], this shouldn't affect them in any way.

    With fixes from Glenn Griffin.

    Signed-off-by: Florian Westphal
    Acked-by: Glenn Griffin
    Signed-off-by: David S. Miller

    Florian Westphal
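
The overlap described above can be reproduced in miniature. The buffer layout here (three addressing words, one counter word, then the secret) and every name are assumptions made for the sketch. Copying the secret to offset 3 and then storing the counter in word 3 would discard the first secret word, so the copy starts at offset 4 and the secret can be one word shorter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_SECRET_WORDS 4u   /* hypothetical size, small for the demo */
#define DEMO_BUF_WORDS    (4u + DEMO_SECRET_WORDS)

/*
 * Fill the hash input. Assumed layout:
 *   buf[0..2]  addressing words
 *   buf[3]     minute counter
 *   buf[4..]   secret
 * Copying the secret to buf + 4 (not buf + 3) means no secret word is
 * clobbered by the counter store.
 */
void demo_fill_hash_input(uint32_t *buf, const uint32_t *secret,
                          uint32_t saddr, uint32_t daddr,
                          uint32_t ports, uint32_t count)
{
    assert(buf != NULL && secret != NULL);
    memcpy(buf + 4, secret, DEMO_SECRET_WORDS * sizeof(uint32_t));
    buf[0] = saddr;
    buf[1] = daddr;
    buf[2] = ports;
    buf[3] = count;
}
```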
     

04 Mar, 2008

1 commit