26 Jun, 2006

3 commits

  • Fix checksum problems in the GSO code path for CHECKSUM_HW packets.

    The ipv4 TCP pseudo header checksum has to be adjusted for GSO
    segmented packets.

    The adjustment is needed because the length field in the pseudo-header
    changes. However, because we have the inequality oldlen > newlen, we
    know that delta = (u16)~oldlen + newlen is still a 16-bit quantity.
    This also means that htonl(delta) + th->check still fits in 32 bits.
    Therefore we don't have to use csum_add for this operation.

    This is based on a patch by Michael Chan.

    Signed-off-by: Herbert Xu
    Acked-by: Michael Chan
    Signed-off-by: David S. Miller

    Herbert Xu
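
The length adjustment described in this commit can be illustrated with a small user-space sketch. It works in host byte order on a made-up pseudo-header, so it is not the kernel code path: csum_fold16(), pseudo_sum() and csum_adjust_len() are hypothetical helper names and the addresses are invented. It only shows that adding delta = (u16)~oldlen + newlen to an already folded sum gives the same result as re-summing with the new length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit one's-complement sum. */
static uint16_t csum_fold16(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffffu) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * One's-complement sum over a made-up IPv4/TCP pseudo-header
 * (host byte order, illustration only): two addresses, the
 * protocol number and the TCP length.
 */
static uint16_t pseudo_sum(uint16_t len)
{
	const uint16_t words[] = {
		0xc0a8, 0x0001,		/* source address, invented */
		0xc0a8, 0x0002,		/* destination address, invented */
		0x0006,			/* protocol: TCP */
		len,			/* TCP length */
	};
	uint32_t sum = 0;

	for (size_t i = 0; i < sizeof(words) / sizeof(words[0]); i++)
		sum = csum_fold16(sum + words[i]);
	return (uint16_t)sum;
}

/* Replace oldlen by newlen in a folded sum without re-summing. */
static uint16_t csum_adjust_len(uint16_t check, uint16_t oldlen,
				uint16_t newlen)
{
	uint32_t delta;

	assert(oldlen > newlen);	/* a segment is shorter than the original */
	delta = (uint32_t)(uint16_t)~oldlen + newlen;
	assert(delta <= 0xffffu);	/* still a 16-bit quantity */
	return csum_fold16((uint32_t)check + delta);
}
```

With a 3000-byte original and a 1480-byte segment, csum_adjust_len(pseudo_sum(3000), 3000, 1480) should equal pseudo_sum(1480).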
     
  • There are several instances of per_cpu(foo, raw_smp_processor_id()), which
    is semantically equivalent to __get_cpu_var(foo) but without the warning
    that smp_processor_id() can give if CONFIG_DEBUG_PREEMPT is enabled. For
    those architectures with optimized per-cpu implementations, namely ia64,
    powerpc, s390, sparc64 and x86_64, per_cpu() turns into more and slower
    code than __get_cpu_var(), so it would be preferable to use __get_cpu_var
    on those platforms.

    This defines a __raw_get_cpu_var(x) macro which turns into per_cpu(x,
    raw_smp_processor_id()) on architectures that use the generic per-cpu
    implementation, and turns into __get_cpu_var(x) on the architectures that
    have an optimized per-cpu implementation.

    Signed-off-by: Paul Mackerras
    Acked-by: David S. Miller
    Acked-by: Ingo Molnar
    Acked-by: Martin Schwidefsky
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Mackerras
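
A user-space model of the generic definition described in this commit might look like the sketch below. The array-based per_cpu(), the current_cpu variable, NR_CPUS value and count_hits_on() are stand-ins invented for illustration; only the shape of __raw_get_cpu_var(x) as per_cpu(x, raw_smp_processor_id()) comes from the commit text. The optimized architectures would map the macro to __get_cpu_var(x) instead, which is not modelled here.

```c
#include <assert.h>

#define NR_CPUS 4			/* hypothetical CPU count */

static int current_cpu;			/* stand-in for "the CPU we run on" */

static int raw_smp_processor_id(void)
{
	return current_cpu;
}

/* Toy per-cpu storage: one array slot per CPU. */
#define DEFINE_PER_CPU(type, name)	type per_cpu__##name[NR_CPUS]
#define per_cpu(var, cpu)		(per_cpu__##var[(cpu)])

/* Generic form described in the commit message. */
#define __raw_get_cpu_var(var)		per_cpu(var, raw_smp_processor_id())

static DEFINE_PER_CPU(unsigned long, hits);

/* Pretend to run on 'cpu' and bump that CPU's counter n times. */
static unsigned long count_hits_on(int cpu, int n)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	current_cpu = cpu;
	for (int i = 0; i < n; i++)
		__raw_get_cpu_var(hits)++;
	return per_cpu(hits, cpu);
}
```

Each call only touches the slot of the CPU it pretends to run on, so counters for other CPUs stay untouched.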
     
  • Convert a few stragglers over to for_each_possible_cpu(), remove
    for_each_cpu().

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

23 Jun, 2006

3 commits

  • This patch segments GSO packets received by the IPsec stack. This can
    happen when a NIC driver injects GSO packets into the stack which are
    then forwarded to another host.

    The primary application of this is going to be Xen where its backend
    driver may inject GSO packets into dom0.

    Of course this also can be used by other virtualisation schemes such as
    VMWare or UML since the tap device could be modified to inject GSO packets
    received through splice.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch adds the GSO implementation for IPv4 TCP.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
    going to scale if we add any more segmentation methods (e.g., DCCP). So
    let's merge them.

    They were used to tell the protocol of a packet. This function has been
    subsumed by the new gso_type field. This is essentially a set of netdev
    feature bits (shifted by 16 bits) that are required to process a specific
    skb. As such it's easy to tell whether a given device can process a GSO
    skb: you just have to AND the shifted gso_type field with the netdev's
    features field.

    I've made gso_type a conjunction. The idea is that you have a base type
    (e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
    For example, if we add a hardware TSO type that supports ECN, devices would
    declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would
    have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
    packets would be SKB_GSO_TCPV4. This means that only the CWR packets need
    to be emulated in software.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
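
The capability test described in this commit can be sketched as follows. The bit positions chosen for SKB_GSO_TCPV4 and SKB_GSO_TCPV4_ECN are assumptions for illustration, and dev_can_gso() is a hypothetical helper name; the only parts taken from the commit text are the 16-bit shift and the idea of ANDing the required bits against the device features.

```c
#include <assert.h>
#include <stdint.h>

#define NETIF_F_GSO_SHIFT	16

/* Illustrative bit assignments, not necessarily the kernel's. */
#define SKB_GSO_TCPV4		(1u << 0)
#define SKB_GSO_TCPV4_ECN	(1u << 1)

/* Matching device feature bits live 16 bits higher. */
#define NETIF_F_TSO		(SKB_GSO_TCPV4 << NETIF_F_GSO_SHIFT)
#define NETIF_F_TSO_ECN		(SKB_GSO_TCPV4_ECN << NETIF_F_GSO_SHIFT)

/* Non-zero if the device offers every feature the skb requires. */
static int dev_can_gso(uint32_t dev_features, uint32_t gso_type)
{
	uint32_t needed;

	assert((gso_type >> NETIF_F_GSO_SHIFT) == 0);	/* must fit below the shift */
	needed = gso_type << NETIF_F_GSO_SHIFT;
	return (dev_features & needed) == needed;
}
```

A device that only declares NETIF_F_TSO passes the check for plain SKB_GSO_TCPV4 packets and fails it for SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN, so only the latter would need software emulation.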
     

20 Jun, 2006

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (46 commits)
    IB/uverbs: Don't serialize with ib_uverbs_idr_mutex
    IB/mthca: Make all device methods truly reentrant
    IB/mthca: Fix memory leak on modify_qp error paths
    IB/uverbs: Factor out common idr code
    IB/uverbs: Don't decrement usecnt on error paths
    IB/uverbs: Release lock on error path
    IB/cm: Use address handle helpers
    IB/sa: Add ib_init_ah_from_path()
    IB: Add ib_init_ah_from_wc()
    IB/ucm: Get rid of duplicate P_Key parameter
    IB/srp: Factor out common request reset code
    IB/srp: Support SRP rev. 10 targets
    [SCSI] srp.h: Add I/O Class values
    IB/fmr: Use device's max_map_map_per_fmr attribute in FMR pool.
    IB/mthca: Fill in max_map_per_fmr device attribute
    IB/ipath: Add client reregister event generation
    IB/mthca: Add client reregister event generation
    IB: Move struct port_info from ipath to
    IPoIB: Handle client reregister events
    IB: Add client reregister event type
    ...

    Linus Torvalds
     

18 Jun, 2006

33 commits

  • The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
    identically so we test for them in quite a few places. For the sake
    of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We
    also test the disjunct of NETIF_F_IP_CSUM and the other two in various
    places; for that purpose I've added NETIF_F_ALL_CSUM.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
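
As a rough sketch of the two macros named in this commit: the numeric values of the three base flags below are assumptions for illustration, and the two helper functions are hypothetical. Only the grouping (GEN = HW | NO, ALL = IP | GEN) follows the commit text.

```c
#include <assert.h>

/* Base flags; the numeric values here are illustrative assumptions. */
#define NETIF_F_IP_CSUM		0x02u
#define NETIF_F_NO_CSUM		0x04u
#define NETIF_F_HW_CSUM		0x08u

/* The two flags the stack treats identically. */
#define NETIF_F_GEN_CSUM	(NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
/* Any checksum capability at all. */
#define NETIF_F_ALL_CSUM	(NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)

static_assert((NETIF_F_IP_CSUM & NETIF_F_GEN_CSUM) == 0,
	      "IP_CSUM must not overlap the generic group");

/* Hypothetical helpers showing how a test site would shrink. */
static int dev_has_generic_csum(unsigned int features)
{
	return (features & NETIF_F_GEN_CSUM) != 0;
}

static int dev_has_any_csum(unsigned int features)
{
	return (features & NETIF_F_ALL_CSUM) != 0;
}
```

A call site that used to test both NETIF_F_HW_CSUM and NETIF_F_NO_CSUM separately can then test the single combined mask.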
     
  • A lot of people have asked for a way to disable tcp_cwnd_restart(),
    and it seems reasonable to add a sysctl to do that.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • RTT_min is updated each time a timeout event occurs
    in order to cope with hard handovers in wireless scenarios such as UMTS.

    Signed-off-by: Luca De Cicco
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Luca De Cicco
     
  • The bandwidth estimate filter is now initialized with the first
    sample in order to have better performance in the case of small
    file transfers.

    Signed-off-by: Luca De Cicco
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Luca De Cicco
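
The effect of seeding a filter with its first sample can be sketched with a toy low-pass filter. The 7/8 weighting, the explicit seeded flag and the names bw_filter and bw_filter_update() are assumptions made for this illustration and are not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bw_filter {
	uint32_t est;		/* current bandwidth estimate */
	int seeded;		/* non-zero once a sample has been seen */
};

/*
 * Feed one sample into the filter and return the new estimate.
 * The first sample becomes the estimate directly; later samples
 * are blended in with an assumed 7/8 : 1/8 weighting.
 */
static uint32_t bw_filter_update(struct bw_filter *f, uint32_t sample)
{
	assert(f != NULL);
	if (!f->seeded) {
		f->est = sample;
		f->seeded = 1;
	} else {
		f->est = (uint32_t)(((uint64_t)7 * f->est + sample) >> 3);
	}
	return f->est;
}
```

Without the seeding step the estimate would start from zero and need many samples to climb, which matters most for short transfers that only produce a few samples.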
     
  • Clean up some comments and add more references.

    Signed-off-by: Luca De Cicco
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Luca De Cicco
     
  • Need to update send sequence number tracking after first ack.
    Rework of patch from Luca De Cicco.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • The linearisation operation doesn't need to be super-optimised. So we can
    replace __skb_linearize with __pskb_pull_tail which does the same thing but
    is more general.

    Also, most users of skb_linearize end up testing whether the skb is linear
    or not so it helps to make skb_linearize do just that.

    Some callers of skb_linearize also use it to copy cloned data, so it's
    useful to have a new function skb_linearize_cow to copy the data if it's
    either non-linear or cloned.

    Last but not least, I've removed the gfp argument since nobody uses it
    anymore. If it's ever needed we can easily add it back.

    Misc bugs fixed by this patch:

    * via-velocity error handling (also, no SG => no frags)

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
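
The difference between the two helpers described in this commit can be shown on a toy model. The struct toy_skb below, its fields and toy_pull_all() are invented stand-ins and do not reflect the real sk_buff layout; the sketch only mirrors the stated behaviour: linearize when non-linear, and for the _cow variant also when the data is cloned.

```c
#include <assert.h>
#include <stddef.h>

struct toy_skb {
	unsigned int data_len;	/* bytes held in fragments; 0 means linear */
	int cloned;		/* non-zero if the data is shared */
	int copies;		/* how many times the slow path ran */
};

/* Stand-in for the general pull operation that does the real work. */
static int toy_pull_all(struct toy_skb *skb)
{
	skb->data_len = 0;
	skb->cloned = 0;
	skb->copies++;
	return 0;
}

/* Only does work if the buffer is actually non-linear. */
static int toy_skb_linearize(struct toy_skb *skb)
{
	assert(skb != NULL);
	return skb->data_len ? toy_pull_all(skb) : 0;
}

/* Also copies when the data is linear but cloned. */
static int toy_skb_linearize_cow(struct toy_skb *skb)
{
	assert(skb != NULL);
	return (skb->data_len || skb->cloned) ? toy_pull_all(skb) : 0;
}
```

Callers no longer need their own "is it linear?" test in front of the call, since the helper performs it.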
     
  • hashlimit does:

    if (!ht->rnd)
            get_random_bytes(&ht->rnd, 4);

    ignoring that 0 is also a valid random number.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
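
One way to avoid treating 0 as "not yet seeded" is to track initialization separately from the value. The sketch below is a user-space illustration under that assumption: struct toy_htable, the rnd_initialized flag, toy_seed_once() and the zero_rng() test generator are all hypothetical names, and the random source is passed in as a function pointer rather than calling get_random_bytes().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_htable {
	uint32_t rnd;		/* hash seed; any value, including 0, is valid */
	int rnd_initialized;	/* separate flag instead of testing rnd == 0 */
};

typedef uint32_t (*rng_fn)(void);

/* Seed the table exactly once, whatever value the generator returns. */
static void toy_seed_once(struct toy_htable *ht, rng_fn rng)
{
	assert(ht != NULL && rng != NULL);
	if (!ht->rnd_initialized) {
		ht->rnd = rng();
		ht->rnd_initialized = 1;
	}
}

/* Worst-case generator for the old check: it always returns 0. */
static int zero_rng_calls;

static uint32_t zero_rng(void)
{
	zero_rng_calls++;
	return 0;
}
```

With the original `if (!ht->rnd)` test, a generator that happened to return 0 would cause the seed to be drawn again on every call; with the flag it is drawn once.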
     
  • Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • create_proc_entry must not be called with locks held. Use a mutex
    instead to protect data only changed in user context.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Add a secmark field to IP and NF conntracks, so that security markings
    on packets can be copied to their associated connections, and also
    copied back to packets as required. This is similar to the network
    mark field currently used with conntrack, although it is intended for
    enforcement of security policy rather than network policy.

    Signed-off-by: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    James Morris
     
  • Add a secmark field to the skbuff structure, to allow security subsystems to
    place security markings on network packets. This is similar to the nfmark
    field, except that it is intended for implementing security policy rather
    than networking policy.

    This patch was already acked in principle by Dave Miller.

    Signed-off-by: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    James Morris
     
  • It is typed wrong, and it's only assigned and used once.
    So just pass in iph->daddr directly which fixes both problems.

    Based upon a patch by Alexey Dobriyan.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • All users pass 32-bit values as addresses and internally they're
    compared with 32-bit entities. So, change "laddr" and "raddr" types to
    __be32.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     
  • All users except two expect a 32-bit big-endian value. One is of the

    ->multiaddr = ->multiaddr

    variety, and the last one is "%08lX".

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     
  • The suseconds_t et al. are not necessarily any particular type on
    every platform, so cast to unsigned long so that we can use one printf
    format string and avoid warnings across the board.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Implementation of RFC3742 limited slow start. Added as part
    of the TCP highspeed congestion control module.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This adds a new module for tracking TCP state variables non-intrusively
    using kprobes. It has a simple /proc interface that outputs one line
    for each packet received. A sample usage is to collect congestion
    window and ssthresh over time graphs.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Many of the TCP congestion methods just use ssthresh
    as the minimum congestion window on decrease. Rather than
    duplicating the code, just have that be the default if that
    handler in the ops structure is not set.

    Minor behaviour change to TCP compound. It probably wants
    to use this (ssthresh) as lower bound, rather than ssthresh/2
    because the latter causes undershoot on loss.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
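
The "optional handler with a default" pattern described in this commit can be sketched on toy types. struct toy_sock, struct toy_ca_ops, toy_cwnd_min() and the two ops tables are hypothetical; only the rule itself (use the handler if set, otherwise fall back to ssthresh) follows the commit text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_sock;

struct toy_ca_ops {
	/* Optional: lower bound for cwnd on decrease; may be NULL. */
	uint32_t (*min_cwnd)(const struct toy_sock *sk);
};

struct toy_sock {
	uint32_t snd_ssthresh;
	const struct toy_ca_ops *ca_ops;
};

/* Use the module's handler if it has one, else default to ssthresh. */
static uint32_t toy_cwnd_min(const struct toy_sock *sk)
{
	assert(sk != NULL && sk->ca_ops != NULL);
	return sk->ca_ops->min_cwnd ? sk->ca_ops->min_cwnd(sk)
				    : sk->snd_ssthresh;
}

/* A module that wants a different bound still supplies its own hook. */
static uint32_t half_ssthresh(const struct toy_sock *sk)
{
	return sk->snd_ssthresh / 2;
}

static const struct toy_ca_ops default_ops = { .min_cwnd = NULL };
static const struct toy_ca_ops halving_ops = { .min_cwnd = half_ssthresh };
```

Modules that are happy with ssthresh as the bound simply leave the hook unset, which removes the duplicated one-line handlers.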
     
  • The original code did a 64 bit divide directly, which won't work on
    32 bit platforms. Rather than doing a 64 bit square root twice,
    just implement a 4th root function in one pass using Newton's method.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
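
A one-pass integer fourth root by Newton's method could look like the sketch below. This is not the function from the patch: fourth_root() is a hypothetical user-space version that uses an ordinary 64-bit division, which the kernel would have to replace with a helper such as do_div() on 32-bit platforms. It iterates x' = (3x + n / x^3) / 4 from an overestimate until the value stops decreasing.

```c
#include <assert.h>
#include <stdint.h>

/* Floor of the fourth root of n, using Newton's method on x^4 = n. */
static uint32_t fourth_root(uint64_t n)
{
	uint64_t x, y;
	unsigned int bits = 0;

	if (n == 0)
		return 0;

	/* Count significant bits of n to build a starting overestimate. */
	for (uint64_t t = n; t; t >>= 1)
		bits++;
	x = 1ULL << ((bits + 3) / 4);	/* x^4 >= 2^bits > n */

	for (;;) {
		assert(x > 0);		/* the iterate never drops below 1 */
		/* x <= 2^16 here, so x^3 <= 2^48 cannot overflow. */
		y = (3 * x + n / (x * x * x)) / 4;
		if (y >= x)		/* stopped decreasing: x is the floor */
			return (uint32_t)x;
		x = y;
	}
}
```

For example, fourth_root(81) is 3 and fourth_root(80) is 2; the largest possible result, for n = UINT64_MAX, is 65535.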
     
  • TCP Compound is a sender-side only change to TCP that uses
    a mixed Reno/Vegas approach to calculate the cwnd.

    For further details look here:
    ftp://ftp.research.microsoft.com/pub/tr/TR-2005-86.pdf

    Signed-off-by: Angelo P. Castellani
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Angelo P. Castellani
     
  • TCP Veno module is a new congestion control module to improve TCP
    performance over wireless networks. The key innovation in TCP Veno is
    the enhancement of TCP Reno/Sack congestion control algorithm by using
    the estimated state of a connection based on TCP Vegas. This scheme
    significantly reduces "blind" reduction of TCP window regardless of
    the cause of packet loss.

    This work is based on the research paper "TCP Veno: TCP Enhancement
    for Transmission over Wireless Access Networks." C. P. Fu, S. C. Liew,
    IEEE Journal on Selected Areas in Communication, Feb. 2003.

    The original paper and more recent research on Veno:
    http://www.ntu.edu.sg/home/ascpfu/veno/veno.html

    Signed-off-by: Bin Zhou
    Cheng Peng Fu
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Bin Zhou
     
  • TCP Low Priority is a distributed algorithm whose goal is to utilize only
    the excess network bandwidth as compared to the ``fair share`` of
    bandwidth as targeted by TCP. Available from:
    http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf

    Original Author:
    Aleksandar Kuzmanovic

    See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
    As of 2.6.13, Linux supports pluggable congestion control algorithms.
    Due to limitations of the API, we make the following changes to
    the original TCP-LP implementation:
    o We use newReno in most core CA handling. Only add some checking
      within cong_avoid.
    o Error correcting in remote HZ, therefore remote HZ will be kept
      on checking and updating.
    o Handling calculation of One-Way-Delay (OWD) within rtt_sample, since
      OWD has a similar meaning to RTT. Also correct the buggy formula.
    o Handle reaction for Early Congestion Indication (ECI) within
      pkts_acked, as mentioned within pseudo code.
    o OWD is handled in relative format, where local time stamp will be in
      tcp_time_stamp format.

    Port from 2.4.19 to 2.6.16 as module by:
    Wong Hoi Sing Edison
    Hung Hing Lun

    Signed-off-by: Wong Hoi Sing Edison
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Wong Hoi Sing Edison
     
  • GRE keys are 16 bits wide.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     
  • Add SIP connection tracking helper. Originally written by
    Christian Hentschel; some cleanup, minor fixes and bidirectional SIP
    support added by myself.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Call Forwarding doesn't need to create an expectation if both peers can
    reach each other without our help. The internal_net_addr parameter
    lets the user explicitly specify a single network where this is true,
    but is not very flexible and even fails in the common case that calls
    will be forwarded to both outside parties and inside parties. Use an
    optional heuristic based on routing instead; the assumption is that
    if both the outgoing device and the gateway are equal, both peers can
    reach each other directly.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Signed-off-by: Jing Min Zhao
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Jing Min Zhao
     
  • When a port number within a packet is replaced by a differently sized
    number only the packet is resized, but not the copy of the data.
    Following port numbers are rewritten based on their offsets within
    the copy, leading to packet corruption.

    Convert the amanda helper to the textsearch infrastructure to avoid
    the copy entirely.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Instead of skipping search entries for the wrong direction simply index
    them by direction.

    Based on patch by Pablo Neira

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • debug is the debug level, not a bool.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Instead of using the ID to find out where to continue dumping, take a
    reference to the last entry dumped and try to continue there.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • The current configuration only allows configuring one manip and overloads
    conntrack status flags with netlink semantics.

    Signed-off-by: Patrick Mchardy
    Signed-off-by: David S. Miller

    Patrick McHardy