26 Jun, 2006
3 commits
-
Fix checksum problems in the GSO code path for CHECKSUM_HW packets.
The IPv4 TCP pseudo-header checksum has to be adjusted for GSO
segmented packets.
The adjustment is needed because the length field in the pseudo-header
changes. However, because we have the inequality oldlen > newlen, we
know that delta = (u16)~oldlen + newlen is still a 16-bit quantity.
This also means that htonl(delta) + th->check still fits in 32 bits.
Therefore we don't have to use csum_add for this operation.
This is based on a patch by Michael Chan.
Signed-off-by: Herbert Xu
Acked-by: Michael Chan
Signed-off-by: David S. Miller -
There are several instances of per_cpu(foo, raw_smp_processor_id()), which
is semantically equivalent to __get_cpu_var(foo) but without the warning
that smp_processor_id() can give if CONFIG_DEBUG_PREEMPT is enabled. For
those architectures with optimized per-cpu implementations, namely ia64,
powerpc, s390, sparc64 and x86_64, per_cpu() turns into more and slower
code than __get_cpu_var(), so it would be preferable to use __get_cpu_var()
on those platforms.
This defines a __raw_get_cpu_var(x) macro which turns into per_cpu(x,
raw_smp_processor_id()) on architectures that use the generic per-cpu
implementation, and turns into __get_cpu_var(x) on the architectures that
have an optimized per-cpu implementation.
Signed-off-by: Paul Mackerras
Acked-by: David S. Miller
Acked-by: Ingo Molnar
Acked-by: Martin Schwidefsky
Cc: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Convert a few stragglers over to for_each_possible_cpu(), remove
for_each_cpu().
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
23 Jun, 2006
3 commits
-
This patch segments GSO packets received by the IPsec stack. This can
happen when a NIC driver injects GSO packets into the stack which are
then forwarded to another host.
The primary application of this is going to be Xen where its backend
driver may inject GSO packets into dom0.
Of course this also can be used by other virtualisation schemes such as
VMWare or UML since the tap device could be modified to inject GSO packets
received through splice.
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
This patch adds the GSO implementation for IPv4 TCP.
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
going to scale if we add any more segmentation methods (e.g., DCCP). So
let's merge them.
They were used to tell the protocol of a packet. This function has been
subsumed by the new gso_type field. This is essentially a set of netdev
feature bits (shifted by 16 bits) that are required to process a specific
skb. As such it's easy to tell whether a given device can process a GSO
skb: you just have to AND the gso_type field with the netdev's features
field.
I've made gso_type a conjunction. The idea is that you have a base type
(e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
For example, if we add a hardware TSO type that supports ECN, it would
declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would
have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
packets would be SKB_GSO_TCPV4. This means that only the CWR packets need
to be emulated in software.
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller
20 Jun, 2006
1 commit
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (46 commits)
IB/uverbs: Don't serialize with ib_uverbs_idr_mutex
IB/mthca: Make all device methods truly reentrant
IB/mthca: Fix memory leak on modify_qp error paths
IB/uverbs: Factor out common idr code
IB/uverbs: Don't decrement usecnt on error paths
IB/uverbs: Release lock on error path
IB/cm: Use address handle helpers
IB/sa: Add ib_init_ah_from_path()
IB: Add ib_init_ah_from_wc()
IB/ucm: Get rid of duplicate P_Key parameter
IB/srp: Factor out common request reset code
IB/srp: Support SRP rev. 10 targets
[SCSI] srp.h: Add I/O Class values
IB/fmr: Use device's max_map_per_fmr attribute in FMR pool.
IB/mthca: Fill in max_map_per_fmr device attribute
IB/ipath: Add client reregister event generation
IB/mthca: Add client reregister event generation
IB: Move struct port_info from ipath to
IPoIB: Handle client reregister events
IB: Add client reregister event type
...
18 Jun, 2006
33 commits
-
The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
identically so we test for them in quite a few places. For the sake
of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We
also test the disjunction of NETIF_F_IP_CSUM and the other two in various
places; for that purpose I've added NETIF_F_ALL_CSUM.
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
A lot of people have asked for a way to disable tcp_cwnd_restart(),
and it seems reasonable to add a sysctl to do that.
Signed-off-by: David S. Miller
-
RTT_min is updated each time a timeout event occurs
in order to cope with hard handovers in wireless scenarios such as UMTS.
Signed-off-by: Luca De Cicco
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
The bandwidth estimate filter is now initialized with the first
sample in order to have better performance in the case of small
file transfers.
Signed-off-by: Luca De Cicco
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
Clean up some comments and add more references.
Signed-off-by: Luca De Cicco
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
Need to update send sequence number tracking after first ack.
Rework of patch from Luca De Cicco.
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
The sysctl net.ipv4.ip_autoconfig is a legacy value that is not used.
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
The linearisation operation doesn't need to be super-optimised. So we can
replace __skb_linearize with __pskb_pull_tail which does the same thing but
is more general.
Also, most users of skb_linearize end up testing whether the skb is linear
or not so it helps to make skb_linearize do just that.
Some callers of skb_linearize also use it to copy cloned data, so it's
useful to have a new function skb_linearize_cow to copy the data if it's
either non-linear or cloned.
Last but not least, I've removed the gfp argument since nobody uses it
anymore. If it's ever needed we can easily add it back.
Misc bugs fixed by this patch:
* via-velocity error handling (also, no SG => no frags)
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
hashlimit does:
    if (!ht->rnd)
        get_random_bytes(&ht->rnd, 4);
ignoring that 0 is also a valid random number.
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
create_proc_entry must not be called with locks held. Use a mutex
instead to protect data only changed in user context.
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
Add a secmark field to IP and NF conntracks, so that security markings
on packets can be copied to their associated connections, and also
copied back to packets as required. This is similar to the network
mark field currently used with conntrack, although it is intended for
enforcement of security policy rather than network policy.
Signed-off-by: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller -
Add a secmark field to the skbuff structure, to allow security subsystems to
place security markings on network packets. This is similar to the nfmark
field, except it is intended for implementing security policy rather than
networking policy.
This patch was already acked in principle by Dave Miller.
Signed-off-by: James Morris
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller -
It is typed wrong, and it's only assigned and used once.
So just pass in iph->daddr directly, which fixes both problems.
Based upon a patch by Alexey Dobriyan.
Signed-off-by: David S. Miller
-
All users pass 32-bit values as addresses and internally they're
compared with 32-bit entities. So, change "laddr" and "raddr" types to
__be32.
Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller -
All users except two expect a 32-bit big-endian value. One is of the
->multiaddr = ->multiaddr
variety. And the last one is "%08lX".
Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller -
The suseconds_t et al. are not necessarily any particular type on
every platform, so cast to unsigned long so that we can use one printf
format string and avoid warnings across the board.
Signed-off-by: David S. Miller
-
Implementation of RFC3742 limited slow start. Added as part
of the TCP highspeed congestion control module.
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
This adds a new module for tracking TCP state variables non-intrusively
using kprobes. It has a simple /proc interface that outputs one line
for each packet received. A sample usage is to collect congestion
window and ssthresh over time graphs.
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
Many of the TCP congestion methods just use ssthresh
as the minimum congestion window on decrease. Rather than
duplicating the code, just have that be the default if that
handler in the ops structure is not set.
Minor behaviour change to TCP Compound: it probably wants
to use this (ssthresh) as the lower bound, rather than ssthresh/2,
because the latter causes undershoot on loss.
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
The original code did a 64 bit divide directly, which won't work on
32 bit platforms. Rather than doing a 64 bit square root twice,
just implement a 4th root function in one pass using Newton's method.
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
TCP Compound is a sender-side only change to TCP that uses
a mixed Reno/Vegas approach to calculate the cwnd.
For further details look here:
ftp://ftp.research.microsoft.com/pub/tr/TR-2005-86.pdf
Signed-off-by: Angelo P. Castellani
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
TCP Veno module is a new congestion control module to improve TCP
performance over wireless networks. The key innovation in TCP Veno is
the enhancement of TCP Reno/Sack congestion control algorithm by using
the estimated state of a connection based on TCP Vegas. This scheme
significantly reduces "blind" reduction of TCP window regardless of
the cause of packet loss.
This work is based on the research paper "TCP Veno: TCP Enhancement
for Transmission over Wireless Access Networks," C. P. Fu, S. C. Liew,
IEEE Journal on Selected Areas in Communication, Feb. 2003.
Original paper and more recent research work on Veno:
http://www.ntu.edu.sg/home/ascpfu/veno/veno.html
Signed-off-by: Bin Zhou
Cheng Peng Fu
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
TCP Low Priority is a distributed algorithm whose goal is to utilize only
the excess network bandwidth as compared to the ``fair share`` of
bandwidth as targeted by TCP. Available from:
http://www.ece.rice.edu/~akuzma/Doc/akuzma/TCP-LP.pdf
Original Author:
Aleksandar Kuzmanovic
See http://www-ece.rice.edu/networks/TCP-LP/ for their implementation.
As of 2.6.13, Linux supports pluggable congestion control algorithms.
Due to the limitations of the API, we make the following changes from
the original TCP-LP implementation:
  o We use newReno in most core CA handling. Only add some checking
    within cong_avoid.
  o Error correction in remote HZ, therefore remote HZ will be kept
    on checking and updating.
  o Handle calculation of One-Way Delay (OWD) within rtt_sample, since
    OWD has a similar meaning to RTT. Also correct the buggy formula.
  o Handle reaction for Early Congestion Indication (ECI) within
    pkts_acked, as mentioned within the pseudo code.
  o OWD is handled in relative format, where the local time stamp will be
    in tcp_time_stamp format.
Port from 2.4.19 to 2.6.16 as a module by:
Wong Hoi Sing Edison
Hung Hing Lun
Signed-off-by: Wong Hoi Sing Edison
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller -
GRE keys are 16-bit wide.
Signed-off-by: Alexey Dobriyan
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
Add SIP connection tracking helper. Originally written by
Christian Hentschel, with some cleanup, minor
fixes and bidirectional SIP support added by myself.
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
Call Forwarding doesn't need to create an expectation if both peers can
reach each other without our help. The internal_net_addr parameter
lets the user explicitly specify a single network where this is true,
but is not very flexible and even fails in the common case that calls
will both be forwarded to outside parties and inside parties. Use an
optional heuristic based on routing instead; the assumption is that
if both the outgoing device and the gateway are equal, both peers can
reach each other directly.
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
Signed-off-by: Jing Min Zhao
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
When a port number within a packet is replaced by a differently sized
number, only the packet is resized, but not the copy of the data.
Following port numbers are rewritten based on their offsets within
the copy, leading to packet corruption.
Convert the amanda helper to the textsearch infrastructure to avoid
the copy entirely.
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
Instead of skipping search entries for the wrong direction, simply index
them by direction.
Based on patch by Pablo Neira
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
debug is the debug level, not a bool.
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
Instead of using the ID to find out where to continue dumping, take a
reference to the last entry dumped and try to continue there.
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
The current configuration only allows configuring one manip and overloads
conntrack status flags with netlink semantics.
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller