08 Apr, 2016

1 commit

  • Now that the UDP encapsulation GRO functions have been moved to the UDP
    socket, we no longer need the udp_offload infrastructure, so remove it.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

11 Jan, 2016

1 commit

  • UDP tunnel offloads tend to aggregate datagrams based on inner
    headers. The GRO engine gets notified by tunnel implementations about
    possible offloads. The match is based solely on the port number.

    Imagine a tunnel bound to port 53: the offload will look into all
    DNS packets and try to aggregate them based on the inner data found
    within. This could lead to data corruption and malformed DNS packets.

    While this patch minimizes the problem and helps an administrator find
    the issue by querying ip tunnel/fou, a better way would be to match on
    the specific destination IP address, so a user space socket bound
    to the same address would conflict.

    Cc: Tom Herbert
    Cc: Eric Dumazet
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

24 May, 2014

1 commit

  • It doesn't seem like any protocols are setting anything other
    than the default, and allowing checksums to be arbitrarily disabled
    for a whole protocol seems dangerous. This can be done on a
    per-socket basis.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

22 Jan, 2014

1 commit

  • Add GRO handlers for protocols that do UDP encapsulation, with the intent of
    being able to coalesce packets which encapsulate packets belonging to
    the same TCP session.

    For GRO purposes, the destination UDP port takes the role of the ether type
    field in the ethernet header or the next protocol in the IP header.

    The UDP GRO handler will only attempt to coalesce packets whose destination
    port is registered to have a GRO handler.

    Use a mark on the skb GRO CB data to disallow (flush) running the UDP GRO
    receive code twice on a packet. This solves the problem of UDP-encapsulated
    packets whose inner VM packet is UDP and happens to carry a port which has
    registered offloads.

    Signed-off-by: Shlomo Pongratz
    Signed-off-by: Or Gerlitz
    Signed-off-by: David S. Miller

    Or Gerlitz
     

14 Jan, 2014

1 commit

  • This new ip_no_pmtu_disc mode only allows fragmentation-needed errors
    to be honored by protocols which do more stringent validation on the
    ICMP packet's payload. This knob is useful for people who e.g. want to
    run an unmodified DNS server in a namespace where they need to use PMTU
    for TCP connections (as they are used for zone transfers or as a fallback
    for requests) but don't want to use possibly spoofed UDP PMTU information.

    Currently the whitelisted protocols are TCP, SCTP and DCCP as they check
    if the returned packet is in the window or if the association is valid.

    Cc: Eric Dumazet
    Cc: David Miller
    Cc: John Heffner
    Suggested-by: Florian Weimer
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

23 Sep, 2013

1 commit

  • There is a mix of function prototypes with and without extern
    in the kernel sources. Standardize on not using extern for
    function prototypes.

    Function prototypes don't need to be written with extern.
    extern is assumed by the compiler. Its use is as unnecessary as
    using auto to declare automatic/local variables in a block.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

16 Nov, 2012

6 commits


27 Jul, 2012

1 commit


28 Jun, 2012

3 commits

  • It's completely unnecessary.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This reverts commit c074da2810c118b3812f32d6754bd9ead2f169e7.

    This change has several unwanted side effects:

    1) Sockets will cache the DST_NOCACHE route in sk->sk_rx_dst and we'll
    thus never create a real cached route.

    2) All TCP traffic will use DST_NOCACHE and never use the routing
    cache at all.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • DDoS SYN flood attacks hit the IP route cache badly.

    On typical machines, this cache is allowed to hold up to 8 million dst
    entries, 256 bytes each, for a total of 2GB of memory.

    rt_garbage_collect() triggers and tries to cleanup things.

    Eventually route cache is disabled but machine is under fire and might
    OOM and crash.

    This patch exploits the new TCP early demux to set a nocache
    boolean in case the incoming TCP frame is for a socket that is not yet
    in ESTABLISHED or TIMEWAIT state.

    This 'nocache' boolean is then used, in case the dst entry is not found
    in the route cache, to create an unhashed dst entry (DST_NOCACHE).

    SYN-cookie ACKs use a similar mechanism (ipv4: tcp: dont cache
    output dst for syncookies), so after this patch, a machine is able to
    absorb a DDoS SYN flood attack without polluting its IP route cache.

    Signed-off-by: Eric Dumazet
    Cc: Hans Schillstrom
    Signed-off-by: David S. Miller

    Eric Dumazet
     

20 Jun, 2012

2 commits

  • Input packet processing for local sockets involves two major demuxes.
    One for the route and one for the socket.

    But we can optimize this down to one demux for certain kinds of local
    sockets.

    Currently we only do this for established TCP sockets, but it could
    at least in theory be expanded to other kinds of connections.

    If a TCP socket is established then its identity is fully specified.

    This means that whatever input route was used during the three-way
    handshake must work equally well for the rest of the connection since
    the keys will not change.

    Once we move to established state, we cache the receive packet's input
    route to use later.

    Like the existing cached route in sk->sk_dst_cache used for output
    packets, we have to check for route invalidations using dst->obsolete
    and dst->ops->check().

    Early demux occurs outside of a socket locked section, so when a route
    invalidation occurs we defer the fixup of sk->sk_rx_dst until we are
    actually inside of established state packet processing and thus have
    the socket locked.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Don't pretend that inet_protos[] and inet6_protos[] are hashes; they
    are just straight arrays. Remove all unnecessary hash masking.

    Document MAX_INET_PROTOS.

    Use RAW_HTABLE_SIZE when appropriate.

    Reported-by: Ben Hutchings
    Signed-off-by: David S. Miller

    David S. Miller
     

12 Dec, 2011

1 commit


17 Nov, 2011

1 commit


25 Jan, 2011

1 commit

  • Quoting Ben Hutchings: we presumably won't be defining features that
    can only be enabled on 64-bit architectures.

    Occurrences found by `grep -r` on net/, drivers/net, include/

    [ Move features and vlan_features next to each other in
    struct netdev, as per Eric Dumazet's suggestion -DaveM ]

    Signed-off-by: Michał Mirosław
    Signed-off-by: David S. Miller

    Michał Mirosław
     

28 Oct, 2010

1 commit

  • Add __rcu annotations to:
    struct net_protocol *inet_protos
    struct net_protocol *inet6_protos

    And use appropriate casts to reduce sparse warnings if
    CONFIG_SPARSE_RCU_POINTER=y

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

06 Nov, 2009

1 commit

  • struct can_proto had a capability field which wasn't ever used. It is
    dropped entirely.

    struct inet_protosw had a capability field which can be more clearly
    expressed in the code by just checking if sock->type == SOCK_RAW.

    Signed-off-by: Eric Paris
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Eric Paris
     

04 Nov, 2009

1 commit

  • This cleanup patch puts struct/union/enum opening braces
    on the first line, to ease grep games.

    struct something
    {

    becomes :

    struct something {

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

15 Sep, 2009

2 commits


23 Jun, 2009

1 commit


09 Jan, 2009

1 commit

  • This patch adds GRO support for IPv6. IPv6 GRO supports extension
    headers in the same way as GSO (by using the same infrastructure).
    It's also simpler compared to IPv4 since we no longer have to worry
    about fragmentation attributes or header checksums.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

16 Dec, 2008

1 commit

  • This patch adds GRO support for IPv4.

    The criteria for merging are more stringent than LRO's: in particular,
    we require all fields in the IP header to be identical except for
    the length, ID and checksum. In addition, the IDs must form an
    arithmetic sequence with a difference of one.

    The ID requirement might seem overly strict, however, most hardware
    TSO solutions already obey this rule. Linux itself also obeys this
    whether GSO is in use or not.

    In future we could relax this rule by storing the IDs (or rather
    making sure that we don't drop them when pulling the aggregate
    skb's tail).

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

25 Mar, 2008

1 commit


29 Jan, 2008

1 commit


16 Oct, 2007

1 commit


03 Dec, 2006

2 commits

  • [acme@newtoy net-2.6.20]$ pahole /tmp/tcp_ipv6.o inet_protosw
    /* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/net/protocol.h:69 */
    struct inet_protosw {
            struct list_head         list;       /*  0  8 */
            short unsigned int       type;       /*  8  2 */

            /* XXX 2 bytes hole, try to pack */

            int                      protocol;   /* 12  4 */
            struct proto *           prot;       /* 16  4 */
            const struct proto_ops * ops;        /* 20  4 */
            int                      capability; /* 24  4 */
            char                     no_check;   /* 28  1 */
            unsigned char            flags;      /* 29  1 */
    }; /* size: 32, sum members: 28, holes: 1, sum holes: 2, padding: 2 */

    So we can kill that hole: protocol only goes up to 255 (RAW).

    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Signed-off-by: Al Viro
    Signed-off-by: David S. Miller

    Al Viro
     

09 Jul, 2006

1 commit

  • Certain subsystems in the stack (e.g., netfilter) can break the partial
    checksum on GSO packets. Until they're fixed, this patch allows this to
    work by recomputing the partial checksums through the GSO mechanism.

    Once they've all been converted to update the partial checksum instead of
    clearing it, this workaround can be removed.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

01 Jul, 2006

1 commit


30 Jun, 2006

1 commit

  • When GSO packets come from an untrusted source (e.g., a Xen guest domain),
    we need to verify the header integrity before passing it to the hardware.

    Since the first step in GSO is to verify the header, we can reuse that
    code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this
    bit set can only be fed directly to devices with the corresponding bit
    NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb
    is fed to the GSO engine which will allow the packet to be sent to the
    hardware if it passes the header check.

    This patch changes the sg flag to a full features flag. The same method
    can be used to implement TSO ECN support. We simply have to mark packets
    with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
    NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment
    the packet, or segment the first MTU and pass the rest to the hardware for
    further segmentation.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

23 Jun, 2006

1 commit


26 Apr, 2006

1 commit


08 Jan, 2006

1 commit

  • Move nextheader offset to the IP6CB to make it possible to pass a
    packet to ip6_input_finish multiple times and have it skip already
    parsed headers. As a nice side effect this gets rid of the manual
    hopopts skipping in ip6_input_finish.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy