09 Dec, 2006

2 commits


04 Dec, 2006

1 commit


03 Dec, 2006

3 commits

  • [acme@newtoy net-2.6.20]$ pahole net/ipv4/tcp.o hh_cache
    /* /pub/scm/linux/kernel/git/acme/net-2.6.20/include/linux/netdevice.h:190 */
    struct hh_cache {
    struct hh_cache * hh_next; /* 0 4 */
    atomic_t hh_refcnt; /* 4 4 */
    __be16 hh_type; /* 8 2 */

    /* XXX 2 bytes hole, try to pack */

    int hh_len; /* 12 4 */
    int (*hh_output)(); /* 16 4 */
    rwlock_t hh_lock; /* 20 36 */
    long unsigned int hh_data[24]; /* 56 96 */
    }; /* size: 152, sum members: 150, holes: 1, sum holes: 2 */

    [acme@newtoy net-2.6.20]$ find net -name "*.[ch]" | xargs grep 'hh_len.\+=' | sort -u
    net/atm/br2684.c: hh->hh_len = PADLEN + ETH_HLEN;
    net/ethernet/eth.c: hh->hh_len = ETH_HLEN;
    net/ipv4/ipconfig.c: int hh_len = LL_RESERVED_SPACE(dev);
    net/ipv4/ip_output.c: hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
    net/ipv4/ip_output.c: int hh_len = LL_RESERVED_SPACE(dev);
    net/ipv4/netfilter.c: hh_len = (*pskb)->dst->dev->hard_header_len;
    net/ipv4/raw.c: hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
    net/ipv6/ip6_output.c: hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
    net/ipv6/netfilter/ip6t_REJECT.c: hh_len = (dst->dev->hard_header_len + 15)&~15;
    net/ipv6/raw.c: hh_len = LL_RESERVED_SPACE(rt->u.dst.dev);
    [acme@newtoy net-2.6.20]$

    [acme@newtoy net-2.6.20]$ find include -name "*.h" | xargs grep 'define ETH_HLEN'
    include/linux/if_ether.h:#define ETH_HLEN 14 /* Total octets in header. */

    LL_RESERVED_SPACE(dev) expands to:

    (((dev)->hard_header_len&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)

    [acme@newtoy net-2.6.20]$ pahole net/ipv4/tcp.o net_device | grep hard_header_len
    short unsigned int hard_header_len; /* 106 2 */
    [acme@newtoy net-2.6.20]$

    So I think we're safe in turning hh_len into a u16; end result:

    [acme@newtoy net-2.6.20]$ codiff -sV /tmp/tcp.o.before net/ipv4/tcp.o
    /pub/scm/linux/kernel/git/acme/net-2.6.20/net/ipv4/tcp.c:
    struct hh_cache | -4
    hh_len;
    from: int /* 12(0) 4(0) */
    to: u16 /* 10(0) 2(0) */
    1 struct changed
    [acme@newtoy net-2.6.20]$

    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
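
The layout arithmetic in the commit above can be checked outside the kernel. The following is a minimal sketch, assuming a 32-bit layout emulated with fixed-width stand-in types; the struct names `hh_cache_before` and `hh_cache_after` and the `uint32_t` placeholders for the pointer, atomic_t, function pointer and rwlock_t members are illustrative and are not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins sized to match the 32-bit pahole output; not kernel types. */
struct hh_cache_before {
    uint32_t hh_next;      /* 4-byte pointer stand-in,   offset  0 */
    uint32_t hh_refcnt;    /* atomic_t stand-in,         offset  4 */
    uint16_t hh_type;      /* __be16,                    offset  8 */
                           /* 2-byte hole follows                  */
    int32_t  hh_len;       /* int,                       offset 12 */
    uint32_t hh_output;    /* function pointer stand-in, offset 16 */
    uint32_t hh_lock[9];   /* 36-byte rwlock_t stand-in, offset 20 */
    uint32_t hh_data[24];  /* 96 bytes,                  offset 56 */
};

struct hh_cache_after {
    uint32_t hh_next;      /* offset  0 */
    uint32_t hh_refcnt;    /* offset  4 */
    uint16_t hh_type;      /* offset  8 */
    uint16_t hh_len;       /* u16 now fills the former hole, offset 10 */
    uint32_t hh_output;    /* offset 12 */
    uint32_t hh_lock[9];   /* offset 16 */
    uint32_t hh_data[24];  /* offset 52 */
};

/* Compile-time checks mirroring the pahole and codiff numbers. */
static_assert(sizeof(struct hh_cache_before) == 152, "size before");
static_assert(offsetof(struct hh_cache_before, hh_len) == 12, "hole before hh_len");
static_assert(offsetof(struct hh_cache_after, hh_len) == 10, "hh_len fills the hole");
static_assert(sizeof(struct hh_cache_after) == 148, "size after, 4 bytes smaller");
```

If the file compiles, the stand-in layout agrees with the numbers quoted in the commit message: the hole disappears and the struct shrinks by 4 bytes.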
     
  • Throughout the TCP/DCCP (and tunnelling) code, it often happens that the
    return code of a transmit function needs to be tested against NET_XMIT_CN
    which is a value that does not indicate a strict error condition.

    This patch uses a macro for these recurring situations which is consistent
    with the already existing macro net_xmit_errno, saving on duplicated code.

    Signed-off-by: Gerrit Renker
    Signed-off-by: Arnaldo Carvalho de Melo

    Gerrit Renker
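
A rough userspace sketch of the kind of macro the commit above describes. The commit message does not name the new macro, so `net_xmit_eval` is an assumed name here; the numeric return codes and the body shown for `net_xmit_errno` are likewise assumptions made for illustration.

```c
#include <errno.h>

/* Illustrative return codes; the numeric values are assumptions. */
#define NET_XMIT_SUCCESS 0
#define NET_XMIT_DROP    1
#define NET_XMIT_CN      2   /* congestion notification, not a strict error */

/* Assumed shape of the already existing helper: map a result to an errno. */
#define net_xmit_errno(e) ((e) != NET_XMIT_CN ? -ENOBUFS : 0)

/* Assumed name for the new helper: treat NET_XMIT_CN as success,
 * pass every other value through unchanged. */
#define net_xmit_eval(e)  ((e) == NET_XMIT_CN ? 0 : (e))

/* Hypothetical caller: replaces an open-coded "rc == NET_XMIT_CN ? 0 : rc". */
static int send_checked(int xmit_result)
{
    return net_xmit_eval(xmit_result);
}
```

The point of the helper is only to remove the repeated comparison against NET_XMIT_CN at each call site.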
     
  • This patch contains the scheduled removal of the frame diverter.

    Signed-off-by: Adrian Bunk
    Signed-off-by: David S. Miller

    Adrian Bunk
     

29 Nov, 2006

1 commit

  • MAX_HEADER is either set to LL_MAX_HEADER or LL_MAX_HEADER + 48, and
    this is controlled by a set of CONFIG_* ifdef tests.

    It is trying to use LL_MAX_HEADER + 48 when any of the tunnels are
    enabled which set hard_header_len like this:

    dev->hard_header_len = LL_MAX_HEADER + sizeof(struct xxx);

    The correct set of tunnel drivers which do this are:

    ipip
    ip_gre
    ip6_tunnel
    sit

    so make the ifdef test match.

    Noticed by Patrick McHardy and with help from Herbert Xu.

    Signed-off-by: David S. Miller

    David S. Miller
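
The ifdef test described in the commit above might look roughly like the following sketch. The CONFIG_* symbol names are assumptions for the four listed tunnel drivers (module variants are omitted), and the value given to LL_MAX_HEADER is a stand-in, since the real value depends on other configuration options.

```c
/* Stand-in value; the real LL_MAX_HEADER depends on other CONFIG_* options. */
#define LL_MAX_HEADER 32

/* Assumed CONFIG_* names for ipip, ip_gre, sit and ip6_tunnel. */
#if !defined(CONFIG_NET_IPIP) && !defined(CONFIG_NET_IPGRE) && \
    !defined(CONFIG_IPV6_SIT) && !defined(CONFIG_IPV6_TUNNEL)
#define MAX_HEADER LL_MAX_HEADER          /* no tunnel driver enabled */
#else
#define MAX_HEADER (LL_MAX_HEADER + 48)   /* room for a tunnel header */
#endif

/* Hypothetical accessor so the selected value can be inspected at run time. */
static int max_header(void)
{
    return MAX_HEADER;
}
```

Compiling with, for example, `-DCONFIG_NET_IPIP` selects the larger value.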
     

29 Sep, 2006

1 commit


26 Sep, 2006

3 commits


25 Sep, 2006

1 commit

  • * 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6: (217 commits)
    net/ieee80211: fix more crypto-related build breakage
    [PATCH] Spidernet: add ethtool -S (show statistics)
    [NET] GT96100: Delete bitrotting ethernet driver
    [PATCH] mv643xx_eth: restrict to 32-bit PPC_MULTIPLATFORM
    [PATCH] Cirrus Logic ep93xx ethernet driver
    r8169: the MMIO region of the 8167 stands behind BAR#1
    e1000, ixgb: Remove pointless wrappers
    [PATCH] Remove powerpc specific parts of 3c509 driver
    [PATCH] s2io: Switch to pci_get_device
    [PATCH] gt96100: move to pci_get_device API
    [PATCH] ehea: bugfix for register access functions
    [PATCH] e1000 disable device on PCI error
    drivers/net/phy/fixed: #if 0 some incomplete code
    drivers/net: const-ify ethtool_ops declarations
    [PATCH] ethtool: allow const ethtool_ops
    [PATCH] sky2: big endian
    [PATCH] sky2: fiber support
    [PATCH] sky2: tx pause bug fix
    drivers/net: Trim trailing whitespace
    [PATCH] ehea: IBM eHEA Ethernet Device Driver
    ...

    Manually resolved conflicts in drivers/net/ixgb/ixgb_main.c and
    drivers/net/sky2.c related to CHECKSUM_HW/CHECKSUM_PARTIAL changes by
    commit 84fa7933a33f806bbbaae6775e87459b1ec584c0 that just happened to be
    next to unrelated changes in this update.

    Linus Torvalds
     

23 Sep, 2006

1 commit


14 Sep, 2006

1 commit


18 Aug, 2006

2 commits

  • When the bridge recomputes features, it does not maintain the
    constraint that SG/GSO must be off if TX checksum is off.
    This patch adds that constraint.

    On a completely unrelated note, I've also added TSO6 and TSO_ECN
    feature bits if GSO is enabled on the underlying device through
    the new NETIF_F_GSO_SOFTWARE macro.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
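
The constraint added by the commit above can be sketched as a small feature fix-up function. This is only an illustration: the bit values are made up, and `TX_CSUM_MASK` and `fix_features` are hypothetical names, not the bridge code.

```c
/* Illustrative bit values; not the kernel's definitions. */
#define NETIF_F_SG       0x001UL
#define NETIF_F_IP_CSUM  0x002UL
#define NETIF_F_NO_CSUM  0x004UL
#define NETIF_F_HW_CSUM  0x008UL
#define NETIF_F_GSO      0x800UL

/* Hypothetical mask: any form of TX checksum capability. */
#define TX_CSUM_MASK (NETIF_F_IP_CSUM | NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)

/* Enforce: scatter-gather and GSO must be off when TX checksum is off. */
static unsigned long fix_features(unsigned long features)
{
    if (!(features & TX_CSUM_MASK))
        features &= ~(NETIF_F_SG | NETIF_F_GSO);
    return features;
}
```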
     
  • Since __vlan_hwaccel_rx() is essentially bypassing the
    netif_receive_skb() call that would have occurred if we did the VLAN
    decapsulation in software, we are missing the skb_bond() call and the
    associated checks it does.

    Export those checks via an inline function, skb_bond_should_drop(),
    and use this in __vlan_hwaccel_rx().

    Signed-off-by: David S. Miller

    David S. Miller
     

22 Jul, 2006

1 commit


09 Jul, 2006

2 commits

  • Certain subsystems in the stack (e.g., netfilter) can break the partial
    checksum on GSO packets. Until they're fixed, this patch allows this to
    work by recomputing the partial checksums through the GSO mechanism.

    Once they've all been converted to update the partial checksum instead of
    clearing it, this workaround can be removed.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch adds the wrapper function skb_is_gso which can be used instead
    of directly testing skb_shinfo(skb)->gso_size. This makes things a little
    nicer and allows us to change the primary key for indicating whether an skb
    is GSO (if we ever want to do that).

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
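
A minimal sketch of the wrapper described in the commit above, using heavily simplified stand-in structures. The real `struct sk_buff` and `skb_shinfo()` are far more involved; only the `gso_size` test is modelled here.

```c
/* Simplified stand-ins; the real kernel structures have many more fields. */
struct skb_shared_info {
    unsigned short gso_size;   /* 0 means "not a GSO packet" */
};

struct sk_buff {
    struct skb_shared_info shinfo;
};

/* Stand-in for skb_shinfo(); in the kernel the shared info lives at the
 * end of the data buffer rather than inside the sk_buff itself. */
static inline const struct skb_shared_info *skb_shinfo(const struct sk_buff *skb)
{
    return &skb->shinfo;
}

/* Callers ask "is this GSO?" instead of testing gso_size directly, so the
 * underlying key can change later without touching every call site. */
static inline int skb_is_gso(const struct sk_buff *skb)
{
    return skb_shinfo(skb)->gso_size != 0;
}
```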
     

01 Jul, 2006

2 commits

  • This patch adds GSO support for IPv6 and TCPv6. This is based on a patch
    by Ananda Raju. His original description is:

    This patch enables TSO over IPv6. Currently the Linux network stack
    restricts TSO over IPv6 by clearing the NETIF_F_TSO bit from
    "dev->features". This patch removes this restriction.

    This patch introduces a new flag, NETIF_F_TSO6, which is used to
    check whether a device supports TSO over IPv6. If the device supports
    TSO over IPv6 then we don't clear NETIF_F_TSO, which makes the
    TCP layer create TSO packets. Any device supporting TSO over IPv6
    will set the NETIF_F_TSO6 flag in "dev->features" along with NETIF_F_TSO.

    When the user disables TSO using ethtool, NETIF_F_TSO gets
    cleared from "dev->features". So even if we have NETIF_F_TSO6 we don't
    get TSO packets created by the TCP layer.

    SKB_GSO_TCPV4 is renamed to SKB_GSO_TCP to make it a generic GSO packet type.
    SKB_GSO_UDPV4 is renamed to SKB_GSO_UDP as UFO is not an IPv4-specific
    feature; UFO is supported over IPv6 as well.

    The following table shows there is significant improvement in
    throughput with normal frames and CPU usage for both normal and jumbo.

    --------------------------------------------------
    |         |       1500       |       9600        |
    |         |------------------|-------------------|
    |         |  thru   CPU      |  thru   CPU       |
    --------------------------------------------------
    | TSO OFF |  2.00   5.5% id  |  5.66   20.0% id  |
    --------------------------------------------------
    | TSO ON  |  2.63   78.0 id  |  5.67   39.0% id  |
    --------------------------------------------------

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch generalises the TSO-specific bits from sk_setup_caps by adding
    the sk_gso_type member to struct sock. This makes sk_setup_caps generic
    so that it can be used by TCPv6 or UFO.

    The only catch is that whoever uses this must provide a GSO implementation
    for their protocol which I think is a fair deal :) For now UFO continues to
    live without a GSO implementation which is OK since it doesn't use the sock
    caps field at the moment.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

30 Jun, 2006

3 commits

  • In the current TSO implementation, NETIF_F_TSO and ECN cannot be
    turned on together in a TCP connection. The problem is that most
    hardware that supports TSO does not handle CWR correctly if it is set
    in the TSO packet. Correct handling requires CWR to be set in the
    first packet only if it is set in the TSO header.

    This patch adds the ability to turn on NETIF_F_TSO and ECN using
    GSO if necessary to handle TSO packets with CWR set. Hardware
    that handles CWR correctly can turn on NETIF_F_TSO_ECN in the dev->
    features flag.

    All TSO packets with CWR set will have the SKB_GSO_TCPV4_ECN set. If
    the output device does not have the NETIF_F_TSO_ECN feature set, GSO
    will split the packet up correctly with CWR only set in the first
    segment.

    With help from Herbert Xu.

    Since ECN can always be enabled with TSO, the SOCK_NO_LARGESEND sock
    flag is completely removed.

    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Michael Chan
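
The CWR rule stated in the commit above (CWR in the first packet only, and only if it is set in the TSO header) can be written down as a tiny sketch. The function name `split_cwr` and the array-of-flags representation are hypothetical; real segmentation operates on TCP headers, not on a boolean array.

```c
#include <stdbool.h>
#include <stddef.h>

/* For a super-packet split into nsegs segments, decide which segments
 * carry CWR: only segment 0, and only when the original header had it. */
static void split_cwr(bool header_cwr, bool *cwr_out, size_t nsegs)
{
    for (size_t i = 0; i < nsegs; i++)
        cwr_out[i] = header_cwr && (i == 0);
}
```

Hardware that copies the header flags into every segment would instead set CWR on all of them, which is the incorrect behaviour the commit works around in software.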
     
  • The test in skb_gso_ok is backwards. Noticed by Michael Chan.

    Signed-off-by: Herbert Xu
    Acked-by: Michael Chan
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • When GSO packets come from an untrusted source (e.g., a Xen guest domain),
    we need to verify the header integrity before passing it to the hardware.

    Since the first step in GSO is to verify the header, we can reuse that
    code by adding a new bit to gso_type: SKB_GSO_DODGY. Packets with this
    bit set can only be fed directly to devices with the corresponding bit
    NETIF_F_GSO_ROBUST. If the device doesn't have that bit, then the skb
    is fed to the GSO engine which will allow the packet to be sent to the
    hardware if it passes the header check.

    This patch changes the sg flag to a full features flag. The same method
    can be used to implement TSO ECN support. We simply have to mark packets
    with CWR set with SKB_GSO_ECN so that only hardware with a corresponding
    NETIF_F_TSO_ECN can accept them. The GSO engine can either fully segment
    the packet, or segment the first MTU and pass the rest to the hardware for
    further segmentation.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

26 Jun, 2006

1 commit


23 Jun, 2006

3 commits

  • This patch adds a generic segmentation offload toggle that can be turned
    on/off for each net device. For now it only supports TCPv4.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch adds the infrastructure for generic segmentation offload.
    The idea is to tap into the potential savings of TSO without hardware
    support by postponing the allocation of segmented skb's until just
    before the entry point into the NIC driver.

    The same structure can be used to support software IPv6 TSO, as well as
    UFO and segmentation offload for other relevant protocols, e.g., DCCP.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
    going to scale if we add any more segmentation methods (e.g., DCCP). So
    let's merge them.

    They were used to tell the protocol of a packet. This function has been
    subsumed by the new gso_type field. This is essentially a set of netdev
    feature bits (shifted by 16 bits) that are required to process a specific
    skb. As such it's easy to tell whether a given device can process a GSO
    skb: you just have to AND the gso_type field with the netdev's features
    field.

    I've made gso_type a conjunction. The idea is that you have a base type
    (e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
    For example, if we add a hardware TSO type that supports ECN, they would
    declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would
    have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
    packets would be SKB_GSO_TCPV4. This means that only the CWR packets need
    to be emulated in software.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
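
The feature check described in the commit above can be sketched as follows. The bit assignments are illustrative, and `GSO_SHIFT` and `dev_can_handle` are hypothetical names; only the "shift by 16, then AND with the device features" idea comes from the commit message.

```c
/* gso_type bits; the specific bit positions are assumptions. */
#define SKB_GSO_TCPV4      (1u << 0)
#define SKB_GSO_TCPV4_ECN  (1u << 1)

/* Per the commit message, feature bits are gso_type bits shifted by 16. */
#define GSO_SHIFT          16
#define NETIF_F_TSO        (SKB_GSO_TCPV4 << GSO_SHIFT)
#define NETIF_F_TSO_ECN    (SKB_GSO_TCPV4_ECN << GSO_SHIFT)

/* A device can take the skb directly only if it advertises every
 * feature bit that the skb's gso_type requires. */
static int dev_can_handle(unsigned int dev_features, unsigned int gso_type)
{
    unsigned int needed = gso_type << GSO_SHIFT;

    return (dev_features & needed) == needed;
}
```

With this scheme a device that advertises only NETIF_F_TSO accepts plain TSO packets, while packets that also carry SKB_GSO_TCPV4_ECN fall back to software segmentation.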
     

21 Jun, 2006

2 commits

  • * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
    [ATM]: fix broken uses of NIPQUAD in net/atm
    [SCTP]: sctp_unpack_cookie() fix
    [SCTP]: Fix unintentional change to SCTP_ASSERT when !SCTP_DEBUG
    [NET]: Prevent multiple qdisc runs
    [CONNECTOR]: Initialize subsystem earlier.
    [NETFILTER]: xt_sctp: fix endless loop caused by 0 chunk length

    Linus Torvalds
     
  • * git://git.infradead.org/hdrcleanup-2.6: (63 commits)
    [S390] __FD_foo definitions.
    Switch to __s32 types in joystick.h instead of C99 types for consistency.
    Add to headers included for userspace in
    Move inclusion of out of user scope in asm-x86_64/mtrr.h
    Remove struct fddi_statistics from user view in
    Move user-visible parts of drivers/s390/crypto/z90crypt.h to include/asm-s390
    Revert include/media changes: Mauro says those ioctls are only used in-kernel(!)
    Include and use __uXX types in
    Use __uXX types in , include too
    Remove private struct dx_hash_info from public view in
    Include and use __uXX types in
    Use __uXX types in for struct divert_blk et al.
    Use __u32 for elf_addr_t in , not u32. It's user-visible.
    Remove PPP_FCS from user view in , remove __P mess entirely
    Use __uXX types in user-visible structures in
    Don't use 'u32' in user-visible struct ip_conntrack_old_tuple.
    Use __uXX types for S390 DASD volume label definitions which are user-visible
    S390 BIODASDREADCMB ioctl should use __u64 not u64 type.
    Remove unneeded inclusion of from
    Fix private integer types used in V4L2 ioctls.
    ...

    Manually resolve conflict in include/linux/mtd/physmap.h

    Linus Torvalds
     

20 Jun, 2006

1 commit

  • Having two or more qdisc_run's contend against each other is bad because
    it can induce packet reordering if the packets have to be requeued. It
    appears that this is an unintended consequence of relinquishing the queue
    lock while transmitting. That in turn is needed for devices that spend a
    lot of time in their transmit routine.

    There are no advantages to be had as devices with queues are inherently
    single-threaded (the loopback device is not but then it doesn't have a
    queue).

    Even if you were to add a queue to a parallel virtual device (e.g., bolt
    a tbf filter in front of an ipip tunnel device), you would still want to
    process the queue in sequence to ensure that the packets are ordered
    correctly.

    The solution here is to steal a bit from net_device to prevent this.

    BTW, as qdisc_restart is no longer used by anyone as a module inside the
    kernel (IIRC it used to with netif_wake_queue), I have not exported the
    new __qdisc_run function.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
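
The "steal a bit" approach from the commit above can be illustrated in userspace with C11 atomics. This is a sketch under stated assumptions: `QDISC_RUNNING_BIT`, `struct qdisc_dev_sketch` and `qdisc_run_sketch` are hypothetical names, and the `runs` counter stands in for the real dequeue-and-transmit loop.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QDISC_RUNNING_BIT 0x1u   /* hypothetical "a run is in progress" bit */

struct qdisc_dev_sketch {
    atomic_uint state;   /* stand-in for the device state word */
    int runs;            /* counts completed runs, for illustration only */
};

/* Returns true if this caller performed the run, false if another run
 * was already in progress and this one backed off. */
static bool qdisc_run_sketch(struct qdisc_dev_sketch *dev)
{
    /* Atomically set the bit and learn whether it was already set. */
    unsigned int old = atomic_fetch_or(&dev->state, QDISC_RUNNING_BIT);

    if (old & QDISC_RUNNING_BIT)
        return false;

    dev->runs++;   /* stand-in for draining the queue in order */

    /* Clear the bit so a later caller can run. */
    atomic_fetch_and(&dev->state, ~QDISC_RUNNING_BIT);
    return true;
}
```

Because only one caller at a time gets past the bit test, packets are dequeued by a single runner and cannot be reordered by a competing run.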
     

18 Jun, 2006

3 commits

  • The current stack treats NETIF_F_HW_CSUM and NETIF_F_NO_CSUM
    identically so we test for them in quite a few places. For the sake
    of brevity, I'm adding the macro NETIF_F_GEN_CSUM for these two. We
    also test the disjunction of NETIF_F_IP_CSUM and the other two in various
    places; for that purpose I've added NETIF_F_ALL_CSUM.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
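
The two macros described in the commit above compose as shown in this sketch. The individual bit values are illustrative, and the two helper functions are hypothetical examples of call sites.

```c
/* Illustrative bit values for the three checksum feature flags. */
#define NETIF_F_IP_CSUM  0x2UL
#define NETIF_F_NO_CSUM  0x4UL
#define NETIF_F_HW_CSUM  0x8UL

/* The two flags the stack treats identically. */
#define NETIF_F_GEN_CSUM (NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)
/* Any checksum capability at all. */
#define NETIF_F_ALL_CSUM (NETIF_F_IP_CSUM | NETIF_F_GEN_CSUM)

/* Hypothetical call site: device handles checksums for any protocol. */
static int has_generic_csum(unsigned long features)
{
    return (features & NETIF_F_GEN_CSUM) != 0;
}

/* Hypothetical call site: device has some form of checksum offload. */
static int has_any_csum(unsigned long features)
{
    return (features & NETIF_F_ALL_CSUM) != 0;
}
```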
     
  • Various drivers use xmit_lock internally to synchronise with their
    transmission routines. They do so without setting xmit_lock_owner.
    This is fine as long as netpoll is not in use.

    With netpoll it is possible for deadlocks to occur if xmit_lock_owner
    isn't set. This is because a printk that occurs while xmit_lock is held
    and xmit_lock_owner is not set can cause netpoll to attempt to take
    xmit_lock recursively.

    While it is possible to resolve this by getting netpoll to use
    trylock, it is suboptimal because netpoll's sole objective is to
    maximise the chance of getting the printk out on the wire. So
    delaying or dropping the message is to be avoided as much as possible.

    So the only alternative is to always set xmit_lock_owner. The
    following patch does this by introducing the netif_tx_lock family of
    functions that take care of setting/unsetting xmit_lock_owner.

    I renamed xmit_lock to _xmit_lock to indicate that it should not be
    used directly. I didn't provide irq versions of the netif_tx_lock
    functions since xmit_lock is meant to be a BH-disabling lock.

    This is pretty much a straight text substitution except for a small
    bug fix in winbond. It currently uses
    netif_stop_queue/spin_unlock_wait to stop transmission. This is
    unsafe as an IRQ can potentially wake up the queue. So it is safer to
    use netif_tx_disable.

    The hamradio bits used spin_lock_irq but it is unnecessary as
    xmit_lock must never be taken in an IRQ handler.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
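
The idea behind the netif_tx_lock family in the commit above is that taking the lock and recording the owner always happen together. A userspace sketch, assuming a C11 `atomic_flag` spinlock in place of the kernel spinlock, an explicit `cpu` argument in place of the current CPU id, and -1 as the "no owner" value; all `*_sketch` names are hypothetical.

```c
#include <stdatomic.h>

struct tx_dev_sketch {
    atomic_flag _xmit_lock;   /* leading underscore: do not take directly */
    int xmit_lock_owner;      /* CPU holding the lock, or -1 (assumed) */
};

/* Take the lock and record who holds it, as one operation for callers. */
static void netif_tx_lock_sketch(struct tx_dev_sketch *dev, int cpu)
{
    while (atomic_flag_test_and_set_explicit(&dev->_xmit_lock,
                                             memory_order_acquire))
        ;   /* spin until the lock is free */
    dev->xmit_lock_owner = cpu;
}

/* Clear the owner before releasing the lock. */
static void netif_tx_unlock_sketch(struct tx_dev_sketch *dev)
{
    dev->xmit_lock_owner = -1;
    atomic_flag_clear_explicit(&dev->_xmit_lock, memory_order_release);
}

/* What a netpoll-like caller can now check before trying to lock:
 * if this CPU already owns the lock, locking again would deadlock. */
static int tx_lock_held_by(const struct tx_dev_sketch *dev, int cpu)
{
    return dev->xmit_lock_owner == cpu;
}
```

Drivers that take the raw lock without setting the owner defeat the `tx_lock_held_by` style of check, which is the deadlock scenario the commit describes.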
     
  • Attempts to allocate per-CPU DMA channels

    Signed-off-by: Chris Leech
    Signed-off-by: David S. Miller

    Chris Leech
     

24 May, 2006

1 commit


11 May, 2006

1 commit

  • The last step of netdevice registration was being done by a delayed
    call, but because it was delayed, it was impossible to return any error
    code if the class_device registration failed.

    Side effects:
    * one state in registration process is unnecessary.
    * register_netdevice can sleep inside class_device registration/hotplug
    * code in netdev_run_todo only does unregistration so it is simpler.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

09 May, 2006

1 commit


07 May, 2006

1 commit


29 Apr, 2006

1 commit


26 Apr, 2006

1 commit