11 Oct, 2007

3 commits

  • Background: RFC 4293 deprecates the existing individual, named ICMP
    type counters, replacing them with the ICMPMsgStatsTable. This table
    includes entries for both IPv4 and IPv6, and requires counting of all
    ICMP types, whether or not the machine implements the type.

    These patches "remove" (but not really) the existing counters, and
    replace them with the ICMPMsgStats tables for v4 and v6.
    It includes the named counters in the /proc places they were, but gets the
    values for them from the new tables. It also counts packets generated
    from raw socket output (e.g., OutEchoes, MLD queries, RA's from
    radvd, etc).

    Changes:
    1) create icmpmsg_statistics mib
    2) create icmpv6msg_statistics mib
    3) modify existing counters to use these
    4) modify /proc/net/snmp to add "IcmpMsg" with all ICMP types
    listed by number for easy SNMP parsing
    5) modify /proc/net/snmp printing for "Icmp" to get the named data
    from the new counters.
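
    A minimal sketch of the idea (hypothetical names, not the kernel's
    actual data structures): one in/out counter per possible ICMP type,
    with the old named counters derived from the table instead of being
    stored separately:

        /* Hypothetical sketch only: per-type ICMP message counters, as
         * RFC 4293 requires.  The real MIB uses per-CPU storage and
         * different identifiers. */
        struct icmpmsg_mib_sketch {
                unsigned long in[256];   /* one slot per ICMP type received */
                unsigned long out[256];  /* one slot per ICMP type sent     */
        };

        static struct icmpmsg_mib_sketch icmpmsg_stats;

        /* The named /proc counters ("InEchos", "OutEchoReps", ...) become
         * reads of the table rather than independent variables. */
        static unsigned long icmp_in_echos(void)
        {
                return icmpmsg_stats.in[8];     /* ICMP_ECHO */
        }

        static unsigned long icmp_out_echo_reps(void)
        {
                return icmpmsg_stats.out[0];    /* ICMP_ECHOREPLY */
        }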

    Signed-off-by: David L Stevens
    Signed-off-by: David S. Miller

    David L Stevens
     
  • Signed-off-by: Denis Cheng
    Signed-off-by: David S. Miller

    Denis Cheng
     
  • This patch passes in the namespace a new socket should be created in
    and has the socket code do the appropriate reference counting. By
    virtue of this all socket create methods are touched. In addition
    the socket create methods are modified so that they will fail if
    you attempt to create a socket in a non-default network namespace.

    Failing when we attempt to create a socket outside of the default
    network namespace ensures that, as we incrementally make the network
    stack network namespace aware, we will not export functionality that
    has not been audited and confirmed to be network namespace safe.
    This allows us to partially enable network namespaces before all of
    the exotic protocols are supported.

    Any protocol layers I have missed will fail to compile because I now
    pass an extra parameter into the socket creation code.
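
    A rough sketch of the guard described above (the exact signature of
    the create method is an assumption here):

        /* Sketch of the per-protocol guard; details are assumed. */
        static int inet_create(struct net *net, struct socket *sock,
                               int protocol)
        {
                /* Refuse non-default namespaces until this protocol has
                 * been audited and made network namespace safe. */
                if (net != &init_net)
                        return -EAFNOSUPPORT;

                /* ... normal socket creation continues here ... */
                return 0;
        }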

    [ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

03 Aug, 2007

1 commit

  • As discovered by Evgeniy Polyakov, if we try to sendmsg after
    a connection reset, we can do incredibly stupid things.

    The core issue is that inet_sendmsg() tries to autobind the
    socket, but we should never do that for TCP. Instead we should
    just go straight into TCP's sendmsg() code which will do all
    of the necessary state and pending socket error checks.

    TCP's sendpage already directly vectors to tcp_sendpage(), so this
    merely brings sendmsg() in line with that.
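
    Conceptually, the change amounts to pointing the stream ops at TCP's
    own handler, just as is already done for sendpage (a sketch; the
    surrounding fields and exact signatures are elided):

        /* Sketch: inet_stream_ops vectors sendmsg straight to TCP,
         * bypassing the autobind logic in inet_sendmsg(). */
        const struct proto_ops inet_stream_ops_sketch = {
                /* ... other handlers ... */
                .sendmsg  = tcp_sendmsg,   /* was inet_sendmsg */
                .sendpage = tcp_sendpage,  /* already direct   */
        };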

    Signed-off-by: David S. Miller

    David S. Miller
     

11 Jul, 2007

1 commit

  • The existing model for checksum offload does not correctly handle
    devices that can offload IPv4 and IPv6 only. The NETIF_F_HW_CSUM flag
    implies the device can checksum any arbitrary protocol.

    This patch:
    * adds NETIF_F_IPV6_CSUM for those devices
    * fixes the bnx2 and tg3 devices that need it
    * adds NETIF_F_IPV6_CSUM to the IPv6 output path (including GSO)
    * fixes assumptions about NETIF_F_ALL_CSUM in nat
    * adjusts the bridge union of checksumming computation
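
    The resulting feature test reads roughly like this (a sketch of the
    checksum-offload decision; the helper name and exact placement are
    assumptions):

        /* Sketch: can the device checksum this packet in hardware? */
        static int can_offload_csum(struct sk_buff *skb,
                                    struct net_device *dev)
        {
                if (dev->features & NETIF_F_HW_CSUM)
                        return 1;                        /* any protocol */
                if (skb->protocol == htons(ETH_P_IP) &&
                    (dev->features & NETIF_F_IP_CSUM))
                        return 1;                        /* IPv4 only */
                if (skb->protocol == htons(ETH_P_IPV6) &&
                    (dev->features & NETIF_F_IPV6_CSUM))
                        return 1;                        /* IPv6 only (new) */
                return 0;
        }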

    Signed-off-by: David S. Miller

    Stephen Hemminger
     

09 May, 2007

1 commit


26 Apr, 2007

9 commits


11 Feb, 2007

1 commit


09 Feb, 2007

1 commit


10 Jan, 2007

1 commit


09 Jan, 2007

1 commit

  • The inet_create() and inet6_create() functions incorrectly set the
    inet_sock->is_icsk field. Both functions assume that the is_icsk field
    is large enough to hold at least an INET_PROTOSW_ICSK value when it is
    actually only a single bit. This patch corrects the assignment by doing
    a boolean comparison whose result will safely fit into a single-bit
    field.
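
    In spirit the fix is a one-liner (variable names are assumed):

        /* Before: the flag could be truncated in the 1-bit field.
         *     inet->is_icsk = answer_flags & INET_PROTOSW_ICSK;
         * After: store the result of the test, which always fits. */
        inet->is_icsk = (answer_flags & INET_PROTOSW_ICSK) != 0;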

    Signed-off-by: Paul Moore
    Signed-off-by: David S. Miller

    Paul Moore
     

03 Dec, 2006

3 commits

  • Signed-off-by: Al Viro
    Signed-off-by: David S. Miller

    Al Viro
     
  • This is a revision of the previously submitted patch, which alters
    the way files are organized and compiled in the following manner:

    * UDP and UDP-Lite now use separate object files
    * source file dependencies resolved via the header files
      net/ipv{4,6}/udp_impl.h
    * order of include files in udp.c/udplite.c adapted accordingly

    [NET/IPv4]: Support for the UDP-Lite protocol (RFC 3828)

    This patch adds support for UDP-Lite to the IPv4 stack, provided as an
    extension to the existing UDPv4 code:
    * generic routines are all located in net/ipv4/udp.c
    * UDP-Lite specific routines are in net/ipv4/udplite.c
    * MIB/statistics support in /proc/net/snmp and /proc/net/udplite
    * shared API with extensions for partial checksum coverage (see the
      usage sketch at the end of this entry)

    [NET/IPv6]: Extension for UDP-Lite over IPv6

    It extends the existing UDPv6 code base with support for UDP-Lite
    in the same manner as for UDPv4. In particular,
    * UDPv6 generic and shared code is in net/ipv6/udp.c
    * UDP-Litev6 specific extensions are in net/ipv6/udplite.c
    * MIB/statistics support in /proc/net/snmp6 and /proc/net/udplite6
    * support for IPV6_ADDRFORM
    * aligned the coding style of protocol initialisation with af_inet6.c
    * made the error handling in udpv6_queue_rcv_skb consistent,
      returning -1 in all error cases
    * consolidated shared code

    [NET]: UDP-Lite Documentation and basic XFRM/Netfilter support

    The UDP-Lite patch further provides
    * API documentation for UDP-Lite
    * basic xfrm support
    * basic netfilter support for IPv4 and IPv6 (LOG target)
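
    From user space, the partial checksum coverage extension is exercised
    roughly as below (treat this as an illustrative sketch; the constants
    follow the merged RFC 3828 support):

        #include <sys/socket.h>
        #include <netinet/in.h>

        #ifndef IPPROTO_UDPLITE
        #define IPPROTO_UDPLITE    136
        #endif
        #ifndef SOL_UDPLITE
        #define SOL_UDPLITE        136  /* setsockopt() level          */
        #define UDPLITE_SEND_CSCOV  10  /* sender checksum coverage    */
        #define UDPLITE_RECV_CSCOV  11  /* receiver checksum coverage  */
        #endif

        int open_udplite_socket(void)
        {
                /* Cover only the first 20 bytes: the 8-byte UDP-Lite
                 * header plus 12 bytes of payload. */
                int cov = 20;
                int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDPLITE);

                if (fd >= 0)
                        setsockopt(fd, SOL_UDPLITE, UDPLITE_SEND_CSCOV,
                                   &cov, sizeof(cov));
                return fd;
        }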

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     
  • We currently allocate a fixed-size hash table (TCP_SYNQ_HSIZE = 512
    slots) for each LISTEN socket, regardless of various parameters (the
    listen backlog, for example).

    On x86_64, this means order-1 allocations (which might fail), even for
    'small' sockets expecting few connections. Conversely, a huge server
    wanting a backlog of 50000 is slowed down a bit because of this fixed
    limit.

    This patch makes the sizing of the listen hash table a dynamic
    parameter, depending on:
    - the net.core.somaxconn tunable (default: 128)
    - the net.ipv4.tcp_max_syn_backlog tunable (default: 256, 1024 or 128)
    - the backlog value given by the user application (2nd parameter of
      listen())

    For large allocations (bigger than PAGE_SIZE), we use vmalloc() instead of
    kmalloc().

    We still limit memory allocation with the two existing tunables
    (somaxconn & tcp_max_syn_backlog). So for standard setups, this patch
    actually reduces RAM usage.
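
    A standalone sketch of the sizing rule (names and structure here are
    illustrative, not the kernel's exact code):

        /* Clamp the requested backlog by both tunables, then round up
         * to a power of two so the hash can be indexed with a mask. */
        static unsigned int listen_hash_entries(unsigned int backlog,
                                                unsigned int somaxconn,
                                                unsigned int max_syn_backlog)
        {
                unsigned int n = backlog;
                unsigned int size = 8;            /* sane minimum */

                if (n > somaxconn)
                        n = somaxconn;            /* net.core.somaxconn */
                if (n > max_syn_backlog)
                        n = max_syn_backlog;      /* tcp_max_syn_backlog */

                while (size < n)
                        size <<= 1;
                return size;
        }

    The table itself is then obtained with kmalloc() while it fits in a
    page and with vmalloc() once it grows beyond PAGE_SIZE.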

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

29 Sep, 2006

3 commits


23 Sep, 2006

4 commits


09 Jul, 2006

1 commit

  • Certain subsystems in the stack (e.g., netfilter) can break the partial
    checksum on GSO packets. Until they're fixed, this patch allows this to
    work by recomputing the partial checksums through the GSO mechanism.

    Once they've all been converted to update the partial checksum instead of
    clearing it, this workaround can be removed.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

04 Jul, 2006

1 commit


30 Jun, 2006

1 commit

  • When GSO packets come from an untrusted source (e.g., a Xen guest domain),
    we need to verify the header integrity before passing it to the hardware.

    Since the first step in GSO is to verify the header, we can reuse that
    code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this
    bit set can only be fed directly to devices with the corresponding bit
    NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb
    is fed to the GSO engine which will allow the packet to be sent to the
    hardware if it passes the header check.

    This patch changes the sg flag to a full features flag. The same method
    can be used to implement TSO ECN support. We simply have to mark packets
    with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
    NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment
    the packet, or segment the first MTU and pass the rest to the hardware for
    further segmentation.
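
    The gating decision reads roughly as follows (a conceptual sketch,
    not the exact kernel helper):

        /* Sketch: may this GSO skb be handed straight to the NIC? */
        static int gso_ok_for_device(struct sk_buff *skb,
                                     const struct net_device *dev)
        {
                /* Trusted packets keep the old behaviour. */
                if (!(skb_shinfo(skb)->gso_type & SKB_GSO_DODGY))
                        return 1;

                /* Untrusted (e.g. guest-supplied) packets need a device
                 * that re-validates headers itself; otherwise the skb
                 * goes through the software GSO engine, which checks the
                 * headers while segmenting. */
                return (dev->features & NETIF_F_GSO_ROBUST) != 0;
        }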

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

23 Jun, 2006

1 commit


30 Apr, 2006

1 commit


21 Mar, 2006

2 commits


12 Jan, 2006

1 commit


04 Jan, 2006

3 commits

  • Currently all network protocols need to call dev_ioctl as the default
    fallback in their ioctl implementations. This patch adds a fallback
    to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
    This way all the protocol ioctl handlers can be simplified and we don't
    need to export dev_ioctl.
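
    In sock_ioctl() the fallback amounts to something like this (a sketch
    of the control flow, not the exact code):

        default:
                /* Unknown command: let the protocol try it first ... */
                err = sock->ops->ioctl(sock, cmd, arg);
                /* ... and fall back to the device ioctls if the protocol
                 * declines with -ENOIOCTLCMD. */
                if (err == -ENOIOCTLCMD)
                        err = dev_ioctl(cmd, argp);
                break;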

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
     
  • To help in reducing the number of include dependencies, several files were
    touched as they were getting needed headers indirectly for stuff they use.

    Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c
    had linux/dccp.h included twice.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • I noticed that some of the 'struct proto_ops' instances used in the
    kernel may share a cache line with locks or other heavily modified
    data. (The default linker alignment is 32 bytes, and L1_CACHE_LINE is
    at least 64 or 128.)

    This patch makes sure a 'struct proto_ops' can be declared as const,
    so that all cpus can share all parts of it without false sharing.

    This is not mandatory: a driver can still use a read/write structure
    if it needs to (and possibly mark it __read_mostly).

    I made a global substitution to change all existing occurrences to
    const.

    This should reduce the possibility of false sharing on SMP, and
    speed up some socket system calls.
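
    The resulting declarations follow this pattern (a representative
    sketch, with most handlers elided):

        /* Declared const so every CPU can keep a clean, shared,
         * read-only copy of the ops table in its cache. */
        const struct proto_ops inet_stream_ops = {
                .family  = PF_INET,
                .release = inet_release,
                .bind    = inet_bind,
                /* ... remaining handlers unchanged ... */
        };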

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet