24 Dec, 2009

1 commit

  • Add rtnetlink init_rcvwnd to set the TCP initial receive window size
    advertised by passive and active TCP connections.
    The current Linux TCP implementation limits the advertised TCP initial
    receive window to the one prescribed by slow start. For short-lived
    TCP connections carrying transaction-type traffic (e.g. HTTP
    requests), bounding the advertised TCP initial receive window
    increases the latency to complete the transaction.
    Setting the initial congestion window is already supported via
    rtnetlink init_cwnd, but the feature is useless without the
    ability to set a larger TCP initial receive window.
    The rtnetlink init_rcvwnd allows increasing the TCP initial receive
    window, allowing TCP connections to advertise a larger TCP receive
    window than the one bounded by slow start.

    Signed-off-by: Laurent Chavey
    Signed-off-by: David S. Miller

    laurent chavey
     

16 Dec, 2009

1 commit

  • It creates a regression, triggering badness for SYN_RECV
    sockets, for example:

    [19148.022102] Badness at net/ipv4/inet_connection_sock.c:293
    [19148.022570] NIP: c02a0914 LR: c02a0904 CTR: 00000000
    [19148.023035] REGS: eeecbd30 TRAP: 0700 Not tainted (2.6.32)
    [19148.023496] MSR: 00029032 CR: 24002442 XER: 00000000
    [19148.024012] TASK = eee9a820[1756] 'privoxy' THREAD: eeeca000

    This is likely caused by the change in the 'estab' parameter
    passed to tcp_parse_options() when invoked by the functions
    in net/ipv4/tcp_minisocks.c

    But even if that is fixed, the ->conn_request() changes made in
    this patch series are fundamentally wrong. They try to use the
    listening socket's 'dst' to probe the route settings. The
    listening socket doesn't even have a route, and you can't
    get the right route (the child request one) until much later,
    after we set up all of the state, and it must be done by hand.

    This stuff really isn't ready, so the best thing to do is a
    full revert. This reverts the following commits:

    f55017a93f1a74d50244b1254b9a2bd7ac9bbf7d
    022c3f7d82f0f1c68018696f2f027b87b9bb45c2
    1aba721eba1d84a2defce45b950272cee1e6c72a
    cda42ebd67ee5fdf09d7057b5a4584d36fe8a335
    345cda2fd695534be5a4494f1b59da9daed33663
    dc343475ed062e13fc260acccaab91d7d80fd5b2
    05eaade2782fb0c90d3034fd7a7d5a16266182bb
    6a2a2d6bf8581216e08be15fcb563cfd6c430e1e

    Signed-off-by: David S. Miller

    David S. Miller
     

03 Dec, 2009

1 commit

  • Parse incoming TCP_COOKIE option(s).

    Calculate TCP_COOKIE option.

    Send optional data.

    This is a significantly revised implementation of an earlier (year-old)
    patch that no longer applies cleanly, with permission of the original
    author (Adam Langley):

    http://thread.gmane.org/gmane.linux.network/102586

    Requires:
    TCPCT part 1a: add request_values parameter for sending SYNACK
    TCPCT part 1b: generate Responder Cookie secret
    TCPCT part 1c: sysctl_tcp_cookie_size, socket option TCP_COOKIE_TRANSACTIONS
    TCPCT part 1d: define TCP cookie option, extend existing struct's
    TCPCT part 1e: implement socket option TCP_COOKIE_TRANSACTIONS
    TCPCT part 1f: Initiator Cookie => Responder

    Signed-off-by: William.Allen.Simpson@gmail.com
    Signed-off-by: David S. Miller

    William Allen Simpson
     

29 Oct, 2009

1 commit


08 Oct, 2009

1 commit


24 Jun, 2009

2 commits

  • Percpu variable definition is about to be updated such that all percpu
    symbols including the static ones must be unique. Update percpu
    variable definitions accordingly.

    * as,cfq: rename ioc_count uniquely

    * cpufreq: rename cpu_dbs_info uniquely

    * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it

    * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and
    rename it

    * ipv4,6: rename cookie_scratch uniquely

    * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to
    pmc_irq_entry and nmi_entry to pmc_nmi_entry

    * perf_counter: rename disable_count to perf_disable_count

    * ftrace: rename test_event_disable to ftrace_test_event_disable

    * kmemleak: rename test_pointer to kmemleak_test_pointer

    * mce: rename next_interval to mce_next_interval

    [ Impact: percpu usage cleanups, no duplicate static percpu var names ]

    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter
    Cc: Ivan Kokshaysky
    Cc: Jens Axboe
    Cc: Dave Jones
    Cc: Jeremy Fitzhardinge
    Cc: linux-mm
    Cc: David S. Miller
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Li Zefan
    Cc: Catalin Marinas
    Cc: Andi Kleen

    Tejun Heo
     
  • Currently, the following three different ways to define percpu arrays
    are in use.

    1. DEFINE_PER_CPU(elem_type[array_len], array_name);
    2. DEFINE_PER_CPU(elem_type, array_name[array_len]);
    3. DEFINE_PER_CPU(elem_type, array_name)[array_len];

    Unify to #1 which correctly separates the roles of the two parameters
    and thus allows more flexibility in the way percpu variables are
    defined.

    [ Impact: cleanup ]

    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter
    Cc: Ingo Molnar
    Cc: Tony Luck
    Cc: Benjamin Herrenschmidt
    Cc: Thomas Gleixner
    Cc: Jeremy Fitzhardinge
    Cc: linux-mm@kvack.org
    Cc: Christoph Lameter
    Cc: David S. Miller

    Tejun Heo
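
The three spellings listed in this commit can be compared with a userspace stand-in for the macro. This is only a sketch under stated assumptions: the `DEFINE_PER_CPU` below is a simplified mock (the real kernel macro also places the variable in a per-CPU section), `__typeof__` is a GCC/Clang extension, and the names `hit_counts`, `hit_counts_len`, `hit_counts_bump` and `NR_SLOTS` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace mock; the kernel macro also adds section attributes. */
#define DEFINE_PER_CPU(type, name) __typeof__(type) name

#define NR_SLOTS 4 /* hypothetical array length */

/*
 * Form 1: the array type is carried entirely by the first parameter,
 * so the second parameter stays a bare identifier.
 */
DEFINE_PER_CPU(unsigned long[NR_SLOTS], hit_counts);

/* hit_counts is an ordinary array of NR_SLOTS elements. */
static size_t hit_counts_len(void)
{
    return sizeof(hit_counts) / sizeof(hit_counts[0]);
}

/* Increment one slot, rejecting out-of-range indices. */
static void hit_counts_bump(size_t slot)
{
    assert(slot < NR_SLOTS);
    hit_counts[slot]++;
}
```

Because the second argument is always a plain name in form 1, a macro is free to decorate that name or wrap the type without having to cope with a trailing `[array_len]`.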
     

20 Apr, 2009

1 commit

  • last_synq_overflow eats 4 or 8 bytes in struct tcp_sock, even
    though it is only used when a listening socket's SYN queue
    is full.

    We can (ab)use rx_opt.ts_recent_stamp to store the same information;
    it is not used otherwise as long as a socket is in listen state.

    Move linger2 around to avoid splitting struct mtu_probe
    across cacheline boundary on 32 bit arches.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
     

28 Mar, 2009

1 commit

  • The current placement of the security_inet_conn_request() hooks does not allow
    individual LSMs to override the IP options of the connection's request_sock.
    This is a problem as both SELinux and Smack have the ability to use labeled
    networking protocols which make use of IP options to carry security attributes
    and the inability to set the IP options at the start of the TCP handshake is
    problematic.

    This patch moves the IPv4 security_inet_conn_request() hooks past the code
    where the request_sock's IP options are set/reset so that the LSM can safely
    manipulate the IP options as needed. This patch intentionally does not change
    the related IPv6 hooks, as IPv6 based labeling protocols which use IPv6 options
    are not currently implemented. Once they are, we will have a better idea of
    the correct placement for the IPv6 hooks.

    Signed-off-by: Paul Moore
    Acked-by: David S. Miller
    Signed-off-by: James Morris

    Paul Moore
     

01 Oct, 2008

2 commits

  • Current TCP code relies on the local port of the listening socket
    being the same as the destination port of the incoming
    connection. Port redirection used by many transparent proxying
    techniques obviously breaks this, so we have to store the original
    destination port.

    This patch extends struct inet_request_sock and stores the incoming
    destination port value there. It also modifies the handshake code to
    use that value as the source port when sending reply packets.

    Signed-off-by: KOVACS Krisztian
    Signed-off-by: David S. Miller

    KOVACS Krisztian
     
  • Netfilter's ip_route_me_harder() tries to re-route packets either
    generated or re-routed by Netfilter. This patch changes
    ip_route_me_harder() to handle packets from non-locally-bound sockets
    with IP_TRANSPARENT set as local and to set the appropriate flowi
    flags when re-doing the routing lookup.

    Signed-off-by: KOVACS Krisztian
    Signed-off-by: David S. Miller

    KOVACS Krisztian
     

26 Jul, 2008

1 commit

  • ecn_ok is not initialized when a connection is established by cookies.
    The cookie syn-ack never sets ECN, so ecn_ok must be set to 0.

    Spotted using ns-3/network simulation cradle simulator and valgrind.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
     

17 Jul, 2008

1 commit


14 Jun, 2008

1 commit


12 Jun, 2008

1 commit


11 Jun, 2008

1 commit


10 Apr, 2008

1 commit

  • Allow the use of SACK and window scaling when syncookies are used
    and the client supports tcp timestamps. Options are encoded into
    the timestamp sent in the syn-ack and restored from the timestamp
    echo when the ack is received.

    Based on earlier work by Glenn Griffin.
    This patch avoids increasing the size of structs by encoding TCP
    options into the least significant bits of the timestamp and
    by not using any 'timestamp offset'.

    The downside is that the timestamp sent in the packet after the synack
    will increase by several seconds.

    Changes since v1:
    * don't duplicate the timestamp echo decoding function; put it into
      ipv4/syncookies.c and have ipv6/syncookies.c use it.
    * feedback from Glenn Griffin: fix line indented with spaces, kill
      redundant if ()

    Reviewed-by: Hagen Paul Pfeifer
    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
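
The idea in this commit, carrying TCP options in the low bits of the SYN-ACK timestamp and reading them back from the echoed value, can be sketched in userspace C. The bit layout here (4 bits of window scale, 1 bit for SACK, `0xF` meaning "no window scaling") and all names (`ts_opts`, `ts_encode`, `ts_decode`, `TS_OPT_BITS`) are assumptions made for illustration, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

#define TS_OPT_BITS  5u                        /* hypothetical: 4 bits wscale + 1 bit SACK */
#define TS_OPT_MASK  ((1u << TS_OPT_BITS) - 1u)
#define TS_NO_WSCALE 0xFu                      /* sentinel: peer did not offer window scaling */

struct ts_opts {
    int     wscale_ok;   /* window scaling negotiated? */
    uint8_t snd_wscale;  /* shift count, 0..14 */
    int     sack_ok;     /* SACK permitted? */
};

/* Replace the low bits of the current timestamp with the option bits. */
static uint32_t ts_encode(uint32_t now, const struct ts_opts *o)
{
    uint32_t bits, ts;

    assert(!o->wscale_ok || o->snd_wscale <= 14);
    bits = o->wscale_ok ? o->snd_wscale : TS_NO_WSCALE;
    bits |= (o->sack_ok ? 1u : 0u) << 4;

    ts = (now & ~TS_OPT_MASK) | bits;
    if (ts > now)                  /* do not send a timestamp from the future */
        ts -= 1u << TS_OPT_BITS;   /* step back one slot (wraps modulo 2^32) */
    return ts;
}

/* Recover the options from the timestamp echoed in the final ACK. */
static struct ts_opts ts_decode(uint32_t tsecr)
{
    struct ts_opts o;
    uint32_t bits = tsecr & TS_OPT_MASK;
    uint32_t ws = bits & 0xFu;

    o.wscale_ok = (ws != TS_NO_WSCALE);
    o.snd_wscale = o.wscale_ok ? (uint8_t)ws : 0;
    o.sack_ok = (int)((bits >> 4) & 1u);
    return o;
}
```

Since the encoded value is rounded down and never exceeds the real clock, the first timestamp sent after the SYN-ACK appears to jump forward, which is the downside the commit message mentions.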
     

24 Mar, 2008

1 commit

  • The first u32 copied from syncookie_secret is overwritten by the
    minute-counter four lines below. After adjusting the destination
    address, the size of syncookie_secret can be reduced accordingly.

    AFAICS, the only other user of syncookie_secret[] is the ipv6
    syncookie support. Because ipv6 syncookies only grab 44 bytes from
    syncookie_secret[], this shouldn't affect them in any way.

    With fixes from Glenn Griffin.

    Signed-off-by: Florian Westphal
    Acked-by: Glenn Griffin
    Signed-off-by: David S. Miller

    Florian Westphal
     

04 Mar, 2008

2 commits


29 Jan, 2008

1 commit


26 Apr, 2007

2 commits


11 Feb, 2007

1 commit


03 Dec, 2006

1 commit


23 Sep, 2006

2 commits

  • This automatically labels the TCP, Unix stream, and dccp child sockets
    as well as openreqs to be at the same MLS level as the peer. This will
    result in the selection of appropriately labeled IPSec Security
    Associations.

    This also uses the sock's sid (as opposed to the isec sid) in SELinux
    enforcement of secmark in rcv_skb and postroute_last hooks.

    Signed-off-by: Venkat Yekkirala
    Signed-off-by: David S. Miller

    Venkat Yekkirala
     
  • This labels the flows that could utilize IPSec xfrms at the points the
    flows are defined so that IPSec policy and SAs at the right label can
    be used.

    The following protos are currently not handled, but they should
    continue to be able to use single-labeled IPSec like they currently
    do.

    ipmr
    ip_gre
    ipip
    igmp
    sit
    sctp
    ip6_tunnel (IPv6 over IPv6 tunnel device)
    decnet

    Signed-off-by: Venkat Yekkirala
    Signed-off-by: David S. Miller

    Venkat Yekkirala
     

04 Jan, 2006

1 commit


30 Aug, 2005

2 commits

  • Of this type, mostly:

    CHECK net/ipv6/netfilter.c
    net/ipv6/netfilter.c:96:12: warning: symbol 'ipv6_netfilter_init' was not declared. Should it be static?
    net/ipv6/netfilter.c:101:6: warning: symbol 'ipv6_netfilter_fini' was not declared. Should it be static?

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • This creates struct inet_connection_sock, moving members out of struct
    tcp_sock that are shareable with other INET connection oriented
    protocols, such as DCCP, that in my private tree already uses most of
    these members.

    The functions that operate on these members were renamed, using an
    inet_csk_ prefix while not being moved yet to a new file, so as to
    ease the review of these changes.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     

19 Jun, 2005

2 commits

  • Ok, this one just renames some stuff to have a better namespace and to
    dissociate it from TCP:

    struct open_request -> struct request_sock
    tcp_openreq_alloc -> reqsk_alloc
    tcp_openreq_free -> reqsk_free
    tcp_openreq_fastfree -> __reqsk_free

    With this, most of the infrastructure closely resembles a subset of
    the struct sock methods.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Kept this first changeset minimal, without changing existing names to
    ease peer review.

    Basically tcp_openreq_alloc now receives the or_calltable, which in
    turn has two new members:

    ->slab, that replaces tcp_openreq_cachep
    ->obj_size, to inform the size of the openreq descendant for
    a specific protocol

    The protocol specific fields in struct open_request were moved to a
    class hierarchy, with the things that are common to all connection
    oriented PF_INET protocols in struct inet_request_sock, the TCP ones
    in tcp_request_sock, that is an inet_request_sock, that is an
    open_request.

    I.e. this uses the same approach used for the struct sock class
    hierarchy, with sk_prot indicating if the protocol wants to use the
    open_request infrastructure by filling in sk_prot->rsk_prot with an
    or_calltable.

    Results? Performance is improved and TCP v4 now uses only 64 bytes per
    open request minisock, down from 96 without this patch :-)

    Next changeset will rename some of the structs, fields and functions
    mentioned above: struct or_calltable is way unclear and is better
    named struct request_sock_ops, s/struct open_request/struct request_sock/g,
    etc.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
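
The class hierarchy described in these two commits relies on C struct embedding: each more specific request type places its parent as its first member, and the ops table records how large the most derived object is. The following is a userspace sketch of that pattern only. It uses the post-rename names from the commit above (`request_sock`, `request_sock_ops`, `reqsk_alloc`, `reqsk_free`), but the field names other than `obj_size`, the `tcp_rsk` helper, and the use of `calloc` in place of a slab cache are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Per-protocol operations; obj_size is the size of the full descendant. */
struct request_sock_ops {
    size_t obj_size;
};

/* Base class: common to every connection-oriented protocol. */
struct request_sock {
    const struct request_sock_ops *rsk_ops;
};

/* PF_INET layer: must embed the base as its FIRST member. */
struct inet_request_sock {
    struct request_sock req;
    uint32_t loc_addr, rmt_addr;   /* hypothetical field names */
    uint16_t rmt_port;
};

/* TCP layer: is an inet_request_sock, which is a request_sock. */
struct tcp_request_sock {
    struct inet_request_sock req;
    uint32_t rcv_isn, snt_isn;     /* hypothetical field names */
};

static const struct request_sock_ops tcp_request_sock_ops = {
    .obj_size = sizeof(struct tcp_request_sock),
};

/* Allocate a zeroed object of the size the protocol asked for. */
static struct request_sock *reqsk_alloc(const struct request_sock_ops *ops)
{
    struct request_sock *req;

    assert(ops->obj_size >= sizeof(struct request_sock));
    req = calloc(1, ops->obj_size);   /* the kernel uses a slab cache instead */
    if (req)
        req->rsk_ops = ops;
    return req;
}

static void reqsk_free(struct request_sock *req)
{
    free(req);
}

/* Downcast; valid because the base sits at offset 0 of each descendant. */
static struct tcp_request_sock *tcp_rsk(struct request_sock *req)
{
    return (struct tcp_request_sock *)req;
}
```

Generic code only ever handles `struct request_sock *`, while each protocol pays for exactly the fields it needs, which is where the per-request memory saving quoted in the commit comes from.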
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds