16 Feb, 2013

2 commits

  • This patch adds a GRE protocol offload handler so that
    skb_gso_segment() can segment GRE packets.
    An SKB GSO CB is added to keep track of the total header length so
    that skb_segment() can push the entire header; e.g. in the case of
    GRE, skb_segment() needs to push the inner and outer headers onto
    every segment.
    A new NETIF_F_GRE_GSO feature is added for devices that support HW
    GRE TSO offload. Currently no device supports it, so GRE GSO always
    falls back to software GSO (a sketch of that fallback follows this
    list).

    [ Compute pkt_len before ip_local_out() invocation. -DaveM ]

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • This function will be used in next GRE_GSO patch. This patch does
    not change any functionality.

    Signed-off-by: Pravin B Shelar
    Acked-by: Eric Dumazet

    Pravin B Shelar
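
    A minimal sketch, not taken from the patch, of the software fallback
    the first commit above describes. It assumes the 3.9-era GSO helpers
    (skb_gso_segment(), netif_skb_features()); the function name and the
    surrounding transmit logic are illustrative only.

        #include <linux/err.h>
        #include <linux/netdevice.h>
        #include <linux/skbuff.h>

        /* Illustrative: segment a GRE GSO skb in software when the device
         * does not advertise NETIF_F_GRE_GSO. */
        static int gre_gso_xmit_fallback(struct sk_buff *skb)
        {
                netdev_features_t features = netif_skb_features(skb);
                struct sk_buff *segs;

                if (!skb_is_gso(skb) || (features & NETIF_F_GRE_GSO))
                        return 0;   /* nothing to do, or hardware handles it */

                /* skb_gso_segment() uses the GSO CB to replay the full
                 * (inner + outer) header onto every resulting segment. */
                segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK);
                if (IS_ERR(segs))
                        return PTR_ERR(segs);

                consume_skb(skb);
                /* ... hand each skb on the 'segs' list to the device ... */
                return 0;
        }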
     

14 Feb, 2013

1 commit

  • Patch cef401de7be8c4e (net: fix possible wrong checksum
    generation) fixed wrong checksum calculation but broke TSO by
    defining a new GSO type without a matching netdev feature;
    net_gso_ok() would not allow hardware checksum/segmentation
    offload of such packets without that feature.

    The following patch fixes both TSO and the wrong checksum. It uses
    the same logic Eric Dumazet used: a new flag, SKBTX_SHARED_FRAG,
    is set if at least one frag can be modified by the user, but the
    flag is kept in the skb shared info tx_flags rather than in
    gso_type.

    tx_flags is a better fit than gso_type, since an skb can have
    shared frags without being a GSO packet. It does not tie
    SHARED_FRAG to GSO, so there is no need to define a netdev feature
    for this.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
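
    A minimal sketch of the tx_flags convention described above, assuming
    a 3.9-era skbuff.h; the helper names here are illustrative and need
    not match the ones the patch actually adds.

        #include <linux/netdevice.h>
        #include <linux/skbuff.h>

        /* Illustrative: mark an skb whose page frags may still be written
         * by user space (zero-copy / sendfile pages). */
        static void mark_frags_shared(struct sk_buff *skb)
        {
                skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG;
        }

        /* Illustrative: a software checksum path copies the data first when
         * the frags are shared, so a later user write cannot corrupt the
         * checksum that is about to be computed. */
        static int checksum_safely(struct sk_buff *skb)
        {
                if ((skb_shinfo(skb)->tx_flags & SKBTX_SHARED_FRAG) &&
                    skb_linearize(skb))
                        return -ENOMEM;
                return skb_checksum_help(skb);
        }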
     

09 Feb, 2013

1 commit

  • In order to address the fact that some devices cannot support the full
    32K frag size, we need to have the value accessible somewhere so that we
    can compare it against what the device can support. As such I am moving
    the values out of skbuff.c and into skbuff.h.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     

28 Jan, 2013

1 commit

  • Pravin Shelar mentioned that GSO could potentially generate
    wrong TX checksum if skb has fragments that are overwritten
    by the user between the checksum computation and transmit.

    He suggested linearizing skbs, but this extra copy can be
    avoided for normal tcp skbs cooked by tcp_sendmsg().

    This patch introduces a new SKB_GSO_SHARED_FRAG flag, set
    in skb_shinfo(skb)->gso_type if at least one frag can be
    modified by the user.

    Typical sources of such possible overwrites are {vm}splice(),
    sendfile(), and macvtap/tun/virtio_net drivers.

    Tested:

    $ netperf -H 7.7.8.84
    MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
    7.7.8.84 () port 0 AF_INET
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    87380  16384   16384    10.00    3959.52

    $ netperf -H 7.7.8.84 -t TCP_SENDFILE
    TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.8.84 ()
    port 0 AF_INET
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    87380  16384   16384    10.00    3216.80

    Performance of the SENDFILE is impacted by the extra allocation and
    copy, and because we use order-0 pages, while the TCP_STREAM uses
    bigger pages.

    Reported-by: Pravin Shelar
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

09 Jan, 2013

1 commit

  • We have skb_mac_header_was_set() helper to tell if mac_header
    was set on a skb. We would like the same for transport_header.

    __netif_receive_skb() doesn't reset the transport header if already
    set by GRO layer.

    Note that network stacks usually reset the transport header anyway,
    after pulling the network header, so this change only allows
    a followup patch to have more precise qdisc pkt_len computation
    for GSO packets at ingress side.

    Signed-off-by: Eric Dumazet
    Cc: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Eric Dumazet
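
    A minimal sketch of the receive-path use described above, assuming
    the new helper mirrors skb_mac_header_was_set(); this is an
    illustration, not the actual hunk from the patch.

        #include <linux/skbuff.h>

        /* Illustrative: only reset the transport header when an earlier
         * layer (e.g. GRO) has not already set it. */
        static void ingress_set_headers(struct sk_buff *skb)
        {
                skb_reset_network_header(skb);
                if (!skb_transport_header_was_set(skb))
                        skb_reset_transport_header(skb);
        }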
     

09 Dec, 2012

1 commit

  • This patch adds support in the kernel for offloading in the NIC Tx and Rx
    checksumming for encapsulated packets (such as VXLAN and IP GRE).

    For Tx encapsulation offload, the driver will need to set the right bits
    in netdev->hw_enc_features. The protocol driver will have to set the
    skb->encapsulation bit and populate the inner headers, so the NIC driver will
    use those inner headers to calculate the csum in hardware.

    For Rx encapsulation offload, the driver will again need to set the
    skb->encapsulation flag and set skb->ip_summed to CHECKSUM_UNNECESSARY.
    In that case the protocol driver should push the decapsulated packet up
    to the stack, again with CHECKSUM_UNNECESSARY. In either case, the
    protocol driver should set the skb->encapsulation flag back to zero.
    Finally the protocol driver should have the NETIF_F_RXCSUM flag set in
    its features.

    Signed-off-by: Joseph Gasparakis
    Signed-off-by: Peter P Waskiewicz Jr
    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Joseph Gasparakis
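
    A minimal sketch of the two sides described above, assuming the
    3.8-era inner-header helpers; the functions and the feature selection
    are illustrative, not taken from the patch.

        #include <linux/netdevice.h>
        #include <linux/skbuff.h>

        /* Illustrative, tunnel (protocol) driver side: mark the skb as
         * encapsulated and record the inner header offsets while the
         * headers still describe the inner frame, before the outer
         * headers are pushed. */
        static void tunnel_mark_inner_headers(struct sk_buff *skb)
        {
                skb->encapsulation = 1;
                skb_reset_inner_headers(skb);
        }

        /* Illustrative, NIC driver side (probe time): advertise what the
         * hardware can checksum inside an encapsulated frame. */
        static void nic_advertise_enc_features(struct net_device *netdev)
        {
                netdev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_SG;
        }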
     

03 Nov, 2012

2 commits

  • Orphaning frags for zero copy skbs needs to allocate data in atomic
    context, so it has a chance to fail. If it does, we currently discard
    the skb, which is safe, but we don't report anything to the caller,
    so it cannot recover by e.g. disabling zero copy.

    Add an API to free skb reporting such errors: this is used
    by tun in case orphaning frags fails.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     
  • Even if skb is marked for zero copy, net core might still decide
    to copy it later which is somewhat slower than a copy in user context:
    besides copying the data we need to pin/unpin the pages.

    Add a parameter reporting such cases through zero copy callback:
    if this happens a lot, device can take this into account
    and switch to copying in user context.

    This patch updates all users but ignores the passed value for now:
    it will be used by follow-up patches.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
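
    A minimal sketch of the callback contract after the second commit
    above; the queue structure and counter are illustrative, and the
    ubuf_info layout is assumed from the 3.7-era skbuff.h.

        #include <linux/skbuff.h>

        struct my_zc_queue {                /* illustrative per-device state */
                unsigned long zcopy_misses;
        };

        /* Illustrative: the completion callback now learns whether the data
         * really left the host zero-copy. A device seeing too many copied
         * completions could stop requesting zero copy. */
        static void my_zerocopy_complete(struct ubuf_info *ubuf,
                                         bool zerocopy_success)
        {
                struct my_zc_queue *q = ubuf->ctx;  /* stored at submit time */

                if (!zerocopy_success)
                        q->zcopy_misses++;
                /* ... unpin the user pages and complete the request ... */
        }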
     

01 Nov, 2012

1 commit

  • Network device drivers can communicate a Toeplitz hash in skb->rxhash,
    but devices differ in their hashing capabilities. All compute a 5-tuple
    hash for TCP over IPv4, but for other connection-oriented protocols,
    they may compute only a 3-tuple. This breaks RPS load balancing, e.g.,
    for TCP over IPv6 flows. Additionally, for GRE and other tunnels,
    the kernel computes a 5-tuple hash over the inner packet if possible,
    but devices do not.

    This patch recomputes the rxhash in software in all cases where it
    cannot be certain that a 5-tuple was computed. Device drivers can avoid
    recomputation by setting the skb->l4_rxhash flag.

    Recomputing adds cycles to each packet when RPS is enabled or the
    packet arrives over a tunnel. A comparison of 200x TCP_STREAM between
    two servers running unmodified netnext with rxhash computation
    in hardware vs software (using ethtool -K eth0 rxhash [on|off]) shows
    how much time is spent in __skb_get_rxhash in this worst case:

    0.03% swapper [kernel.kallsyms] [k] __skb_get_rxhash
    0.03% swapper [kernel.kallsyms] [k] __skb_get_rxhash
    0.05% swapper [kernel.kallsyms] [k] __skb_get_rxhash

    With 200x TCP_RR it increases to

    0.10% netperf [kernel.kallsyms] [k] __skb_get_rxhash
    0.10% netperf [kernel.kallsyms] [k] __skb_get_rxhash
    0.10% netperf [kernel.kallsyms] [k] __skb_get_rxhash

    I considered having the patch explicitly skip recomputation when it
    knows that it will not improve the hash (TCP over IPv4), but that
    conditional complicates the code without saving many cycles in
    practice, because it has to take place after the flow dissector.

    Signed-off-by: Willem de Bruijn
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Willem de Bruijn
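
    A minimal sketch of the trust-the-driver rule described above,
    assuming the 3.7-era __skb_get_rxhash(); it is close to what the
    in-tree skb_get_rxhash() ends up doing, but is written here only as
    an illustration.

        #include <linux/skbuff.h>

        /* Illustrative: keep the device-provided hash only when the driver
         * marked it as a full l4 (5-tuple) hash; otherwise recompute the
         * hash in software over the inner-most flow. */
        static inline __u32 reliable_rxhash(struct sk_buff *skb)
        {
                if (!skb->l4_rxhash)
                        skb->rxhash = __skb_get_rxhash(skb);
                return skb->rxhash;
        }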
     

07 Oct, 2012

1 commit

  • Over time, the skb recycling infrastructure got little interest and
    many bugs. Generic rx path skb allocation now uses page
    fragments for efficient GRO / TCP coalescing, and recycling
    a tx skb for the rx path is not worth the pain.

    The last identified bug is that fat skbs can be recycled
    and can end up using high-order pages after a few iterations.

    With help from Maxime Bizon, who pointed out that commit
    87151b8689d (net: allow pskb_expand_head() to get maximum tailroom)
    introduced this regression for recycled skbs.

    Instead of fixing this bug, let's remove skb recycling.

    Drivers wanting really hot skbs should use build_skb() anyway,
    to allocate/populate the sk_buff right before netif_receive_skb().

    Signed-off-by: Eric Dumazet
    Cc: Maxime Bizon
    Signed-off-by: David S. Miller

    Eric Dumazet
     

04 Aug, 2012

1 commit


01 Aug, 2012

3 commits

  • The skb->pfmemalloc flag gets set to true iff the PFMEMALLOC reserves
    were used during the slab allocation of data in __alloc_skb. If
    page splitting is used, it is possible that pages will be allocated from
    the PFMEMALLOC reserve without propagating this information to the skb.
    This patch propagates page->pfmemalloc from pages allocated for fragments
    to the skb.

    It works by reintroducing and expanding the skb_alloc_page() API to take
    an skb. If the page was allocated from pfmemalloc reserves, the flag is
    automatically copied. If the driver allocates the page before the skb, it
    should call skb_propagate_pfmemalloc() after the skb is allocated to
    ensure the flag is copied properly (a sketch of that call follows this
    list).

    Failure to do so is not critical. The resulting driver may perform slower
    if it is used for swap-over-NBD or swap-over-NFS but it should not result
    in failure.

    [davem@davemloft.net: API rename and consistency]
    Signed-off-by: Mel Gorman
    Acked-by: David S. Miller
    Cc: Neil Brown
    Cc: Peter Zijlstra
    Cc: Mike Christie
    Cc: Eric B Munson
    Cc: Eric Dumazet
    Cc: Sebastian Andrzej Siewior
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • The skb->pfmemalloc flag gets set to true iff the PFMEMALLOC reserves
    were used during the slab allocation of data in __alloc_skb. If the
    packet is fragmented, it is possible that pages will be allocated from the
    PFMEMALLOC reserve without propagating this information to the skb. This
    patch propagates page->pfmemalloc from pages allocated for fragments to
    the skb.

    Signed-off-by: Mel Gorman
    Acked-by: David S. Miller
    Cc: Neil Brown
    Cc: Peter Zijlstra
    Cc: Mike Christie
    Cc: Eric B Munson
    Cc: Eric Dumazet
    Cc: Sebastian Andrzej Siewior
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Change the skb allocation API to indicate RX usage and use this to fall
    back to the PFMEMALLOC reserve when needed. SKBs allocated from the
    reserve are tagged in skb->pfmemalloc. If an SKB is allocated from the
    reserve and the socket is later found to be unrelated to page reclaim, the
    packet is dropped so that the memory remains available for page reclaim.
    Network protocols are expected to recover from this packet loss.

    [a.p.zijlstra@chello.nl: Ideas taken from various patches]
    [davem@davemloft.net: Use static branches, coding style corrections]
    [sebastian@breakpoint.cc: Avoid unnecessary cast, fix !CONFIG_NET build]
    Signed-off-by: Mel Gorman
    Acked-by: David S. Miller
    Cc: Neil Brown
    Cc: Peter Zijlstra
    Cc: Mike Christie
    Cc: Eric B Munson
    Cc: Eric Dumazet
    Cc: Sebastian Andrzej Siewior
    Cc: Mel Gorman
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
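
    A minimal sketch of the rule from the first commit above (the driver
    allocates the page before the skb); the function is illustrative and
    the length/truesize accounting is elided.

        #include <linux/skbuff.h>

        /* Illustrative: once both the page and the skb exist, copy the
         * page's pfmemalloc state onto the skb so the stack knows the
         * memory came from the emergency reserves. */
        static void rx_attach_page(struct sk_buff *skb, struct page *page,
                                   unsigned int len)
        {
                skb_fill_page_desc(skb, 0, page, 0, len);
                /* ... update skb->len, skb->data_len, skb->truesize ... */
                skb_propagate_pfmemalloc(page, skb);
        }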
     

23 Jul, 2012

1 commit

  • Many places do

        if ((skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY))
                skb_copy_ubufs(skb, gfp_mask);

    to copy and invoke frag destructors if necessary.
    Add an inline helper for this.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
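
    A minimal sketch of the inline helper described above; the name here
    is illustrative and may differ from the one actually added.

        #include <linux/skbuff.h>

        /* Illustrative: hide the tx_flags test so callers stop
         * open-coding it. */
        static inline int orphan_zerocopy_frags(struct sk_buff *skb,
                                                gfp_t gfp_mask)
        {
                if (likely(!(skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY)))
                        return 0;
                return skb_copy_ubufs(skb, gfp_mask);
        }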
     

16 Jun, 2012

1 commit

  • Orphaning the skb in dev_hard_start_xmit() makes bonding behavior
    unfriendly for applications sending big UDP bursts: once packets
    pass the bonding device and reach the real device, they might hit a full
    qdisc and be dropped. Without orphaning, the sender is automatically
    throttled because sk->sk_wmem_alloc reaches sk->sk_sndbuf (assuming
    sk_sndbuf is not too big).

    We could try to defer the orphaning adding another test in
    dev_hard_start_xmit(), but all this seems of little gain,
    now that BQL tends to make packets more likely to be parked
    in Qdisc queues instead of NIC TX ring, in cases where performance
    matters.

    Reverts commits :
    fc6055a5ba31 net: Introduce skb_orphan_try()
    87fd308cfc6b net: skb_tx_hash() fix relative to skb_orphan_try()
    and removes SKBTX_DRV_NEEDS_SK_REF flag

    Reported-and-bisected-by: Jean-Michel Hautbois
    Signed-off-by: Eric Dumazet
    Tested-by: Oliver Hartkopp
    Acked-by: Oliver Hartkopp
    Signed-off-by: David S. Miller

    Eric Dumazet
     

30 May, 2012

1 commit

  • At the beginning of __skb_cow, headroom gets set to a minimum of
    NET_SKB_PAD. This causes unnecessary reallocations if the buffer was not
    cloned and the headroom is just below NET_SKB_PAD, but still more than the
    amount requested by the caller.
    This was showing up frequently in my tests on VLAN tx, where
    vlan_insert_tag calls skb_cow_head(skb, VLAN_HLEN).

    Locally generated packets should have enough headroom, and for forward
    paths, we already have NET_SKB_PAD bytes of headroom, so we don't need to
    add any extra space here.

    Signed-off-by: Felix Fietkau
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Felix Fietkau
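
    A minimal sketch of the headroom logic after the change described
    above, simplified from __skb_cow() and written against the 3.5-era
    helpers; treat it as an illustration, not the actual diff.

        #include <linux/kernel.h>
        #include <linux/skbuff.h>

        /* Illustrative: only reallocate when the caller's request really
         * does not fit (or the skb is cloned); do not round the request
         * up to NET_SKB_PAD before the comparison. */
        static int cow_headroom(struct sk_buff *skb, unsigned int headroom)
        {
                int delta = 0;

                if (headroom > skb_headroom(skb))
                        delta = headroom - skb_headroom(skb);

                if (delta || skb_cloned(skb))
                        return pskb_expand_head(skb, ALIGN(delta, NET_SKB_PAD),
                                                0, GFP_ATOMIC);
                return 0;
        }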
     

20 May, 2012

1 commit

  • Move the protocol-independent part of tcp_try_coalesce() to
    skb_try_coalesce().

    skb_try_coalesce() can be used in IPv4 defrag and IPv6 reassembly,
    to build optimized skbs (fewer sk_buffs, and possibly fewer 'headers').

    skb_try_coalesce() is zero copy, unless the copy can fit in the
    destination header (it's a rare case).

    kfree_skb_partial() is also moved to net/core/skbuff.c and exported,
    because IPv6 will need it in patch (ipv6: use skb coalescing in
    reassembly).

    Signed-off-by: Eric Dumazet
    Cc: Alexander Duyck
    Signed-off-by: David S. Miller

    Eric Dumazet
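
    A minimal sketch of how a reassembly path can use the two exported
    helpers; the function and the truesize bookkeeping around it are
    illustrative.

        #include <linux/skbuff.h>

        /* Illustrative: try to merge 'from' into 'to' without copying;
         * on success, account only the truesize delta and free whatever
         * is left of 'from'. */
        static bool coalesce_or_keep(struct sk_buff *to, struct sk_buff *from,
                                     int *queued_truesize)
        {
                bool fragstolen;
                int delta;

                if (!skb_try_coalesce(to, from, &fragstolen, &delta))
                        return false;  /* caller queues 'from' normally */

                *queued_truesize += delta;
                kfree_skb_partial(from, fragstolen);
                return true;
        }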
     

19 May, 2012

1 commit

  • Fix two issues introduced in commit a1c7fff7e18f5
    ( net: netdev_alloc_skb() use build_skb() )

    - Must be IRQ safe (non NAPI drivers can use it)
    - Must not leak the frag if build_skb() fails to allocate sk_buff

    This patch introduces netdev_alloc_frag() for drivers willing to
    use build_skb() instead of __netdev_alloc_skb() variants.

    Factorize code so that :
    __dev_alloc_skb() is a wrapper around __netdev_alloc_skb(), and
    dev_alloc_skb() a wrapper around netdev_alloc_skb()

    Use __GFP_COLD flag.

    Almost all network drivers now benefit from skb->head_frag
    infrastructure.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
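
    A minimal sketch pairing the new netdev_alloc_frag() with
    build_skb(), as the commit suggests for drivers that want skb->head
    to live in a page fragment; the size constant and error handling are
    illustrative.

        #include <linux/mm.h>
        #include <linux/skbuff.h>

        #define RX_FRAG_SIZE 2048             /* illustrative fragment size */

        static struct sk_buff *rx_alloc_skb(void)
        {
                void *data = netdev_alloc_frag(RX_FRAG_SIZE); /* IRQ safe */
                struct sk_buff *skb;

                if (!data)
                        return NULL;

                skb = build_skb(data, RX_FRAG_SIZE);
                if (unlikely(!skb))
                        put_page(virt_to_head_page(data)); /* don't leak the frag */
                return skb;
        }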
     

08 May, 2012

1 commit

  • Conflicts:
    drivers/net/ethernet/intel/e1000e/param.c
    drivers/net/wireless/iwlwifi/iwl-agn-rx.c
    drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
    drivers/net/wireless/iwlwifi/iwl-trans.h

    Resolved the iwlwifi conflict with mainline using 3-way diff posted
    by John Linville and Stephen Rothwell. In 'net' we added a bug
    fix to make iwlwifi report a more accurate skb->truesize but this
    conflicted with RX path changes that happened meanwhile in net-next.

    In e1000e a conflict arose in the validation code for settings of
    adapter->itr. 'net-next' had more sophisticated logic so that
    logic was used.

    Signed-off-by: David S. Miller

    David S. Miller
     

07 May, 2012

1 commit

  • With the recent changes for how we compute the skb truesize, it occurs to
    me we are probably going to have a lot of calls to skb_end_pointer(skb) -
    skb->head. Instead of open-coding that all over the place, it makes more
    sense to provide a separate inline, skb_end_offset(skb), so that we can
    return the correct value without gcc having to do all the optimization to
    cancel out skb->head - skb->head.

    Signed-off-by: Alexander Duyck
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Alexander Duyck
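
    A plausible shape of the helper described above (hedged: it simply
    follows NET_SKBUFF_DATA_USES_OFFSET the same way skb_end_pointer()
    does, and is renamed here to avoid clashing with the real inline).

        #include <linux/skbuff.h>

        #ifdef NET_SKBUFF_DATA_USES_OFFSET
        static inline unsigned int end_offset_sketch(const struct sk_buff *skb)
        {
                return skb->end;              /* 'end' is already an offset */
        }
        #else
        static inline unsigned int end_offset_sketch(const struct sk_buff *skb)
        {
                return skb->end - skb->head;  /* 'end' is a pointer */
        }
        #endif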
     

04 May, 2012

1 commit

  • This patch adds support for a skb_head_is_locked helper function. It is
    meant to be used any time we are considering transferring the head from
    skb->head to a paged frag. If the head is locked it means we cannot remove
    the head from the skb so it must be copied or we must take the skb as a
    whole.

    Signed-off-by: Alexander Duyck
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Alexander Duyck
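
    A plausible implementation consistent with the description above
    (hedged, and renamed here so it does not clash with the real helper).

        #include <linux/skbuff.h>

        /* Illustrative: the head cannot be handed over as a page fragment
         * if it was kmalloc()ed (no head_frag) or if someone else still
         * references it (cloned). */
        static inline bool head_is_locked_sketch(const struct sk_buff *skb)
        {
                return !skb->head_frag || skb_cloned(skb);
        }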
     

01 May, 2012

4 commits

  • fix kernel doc typos in function names

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • remove useless casts and rename variables for less confusion.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • GRO can check if the skb to be merged has its skb->head mapped to a page
    fragment, instead of a kmalloc() area.

    We 'upgrade' skb->head into a fragment in itself.

    This avoids the frag_list fallback, and permits building a true GRO skb
    (one sk_buff and up to 16 fragments), using less memory.

    This reduces the number of cache misses when the user makes a copy, since
    a single sk_buff is fetched.

    This is a followup of patch "net: allow skb->head to be a page fragment"

    Signed-off-by: Eric Dumazet
    Cc: Ilpo Järvinen
    Cc: Herbert Xu
    Cc: Maciej Żenczykowski
    Cc: Neal Cardwell
    Cc: Tom Herbert
    Cc: Jeff Kirsher
    Cc: Ben Hutchings
    Cc: Matt Carlson
    Cc: Michael Chan
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • skb->head is currently allocated from kmalloc(). This is convenient but
    has the drawback that the data cannot be converted to a page fragment if
    needed.

    We have three spots where it hurts:

    1) GRO aggregation

    When a linear skb must be appended to another skb, GRO uses the
    frag_list fallback, very inefficient since we keep all struct sk_buff
    around. So drivers enabling GRO but delivering linear skbs to network
    stack aren't enabling full GRO power.

    2) splice(socket -> pipe).

    We must copy the linear part to a page fragment.
    This kind of defeats splice() purpose (zero copy claim)

    3) TCP coalescing.

    Recently introduced, this permits grouping several contiguous segments
    into a single skb. This shortens queue lengths, saves kernel memory,
    and greatly reduces the probability of TCP collapses. This coalescing
    doesn't work on linear skbs (or we would need to copy data, which would
    be too slow).

    Given all these issues, the following patch introduces the possibility
    of having skb->head be a fragment in itself. We use a new skb flag,
    skb->head_frag to carry this information.

    build_skb() is changed to accept a frag_size argument. Drivers willing
    to provide a page fragment instead of kmalloc() data will set a non-zero
    value, set to the fragment size.

    Then, on situations we need to convert the skb head to a frag in itself,
    we can check if skb->head_frag is set and avoid the copies or various
    fallbacks we have.

    This means drivers currently using frags could be updated to avoid the
    current skb->head allocation and reduce their memory footprint (aka skb
    truesize); that's 512 or 1024 bytes saved per skb. This also makes
    bpf/netfilter faster, since the 'first frag' will be part of the skb
    linear part, with no need to copy data.

    Signed-off-by: Eric Dumazet
    Cc: Ilpo Järvinen
    Cc: Herbert Xu
    Cc: Maciej Żenczykowski
    Cc: Neal Cardwell
    Cc: Tom Herbert
    Cc: Jeff Kirsher
    Cc: Ben Hutchings
    Cc: Matt Carlson
    Cc: Michael Chan
    Signed-off-by: David S. Miller

    Eric Dumazet
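
    A minimal sketch of the copy avoidance that skb->head_frag enables
    (per the last two commits above); the function is illustrative and
    the frag-offset bookkeeping of the real GRO/coalescing code is
    elided.

        #include <linux/mm.h>
        #include <linux/skbuff.h>

        /* Illustrative: when the head itself lives in a page fragment, a
         * consumer (GRO, splice, coalescing) can reference the backing
         * page instead of copying the linear data. */
        static struct page *grab_head_as_frag(struct sk_buff *skb)
        {
                struct page *page;

                if (!skb->head_frag || skb_cloned(skb))
                        return NULL;        /* must fall back to copying */

                page = virt_to_head_page(skb->head);
                get_page(page);             /* destination now holds a reference */
                return page;
        }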
     

24 Apr, 2012

1 commit


20 Apr, 2012

1 commit


14 Apr, 2012

1 commit

  • The skb struct ubuf_info callback gets passed struct ubuf_info
    itself, not the arg value as the field name and the function signature
    seem to imply. Rename the arg field to ctx to match usage,
    add documentation and change the callback argument type
    to make usage clear and to have compiler check correctness.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     

11 Apr, 2012

1 commit

  • Marc Merlin reported many order-1 allocation failures in the TX path on
    his wireless setup, which don't make any sense with an MTU=1500 network
    and non-SG-capable hardware.

    After investigation, it turns out TCP uses sk_stream_alloc_skb() and, by
    convention, uses skb_tailroom(skb) to know how many bytes of data payload
    can be put in the skb (for non-SG-capable devices).

    Note: these skbs used kmalloc-4096 (MTU=1500 + MAX_HEADER +
    sizeof(struct skb_shared_info) being above 2048).

    Later, the mac80211 layer needs to add some bytes at the tail of the skb
    (IEEE80211_ENCRYPT_TAILROOM = 18 bytes) and, since no more tailroom is
    available, has to call pskb_expand_head() and request order-1
    allocations.

    This patch changes sk_stream_alloc_skb() so that only
    sk->sk_prot->max_header bytes of headroom are reserved, and uses a new
    skb field, avail_size, to hold the data payload limit.

    This way, order-0 allocations done by the TCP stack can leave more than
    2 KB of tailroom and no more allocation is performed in the mac80211
    layer (or any layer needing some tailroom).

    avail_size is unioned with mark/dropcount, since mark will be set later
    in the IP stack for output packets. Therefore, the skb size is unchanged.

    Reported-by: Marc MERLIN
    Tested-by: Marc MERLIN
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

29 Mar, 2012

2 commits

  • …m/linux/kernel/git/dhowells/linux-asm_system

    Pull "Disintegrate and delete asm/system.h" from David Howells:
    "Here are a bunch of patches to disintegrate asm/system.h into a set of
    separate bits to relieve the problem of circular inclusion
    dependencies.

    I've built all the working defconfigs from all the arches that I can
    and made sure that they don't break.

    The reason for these patches is that I recently encountered a circular
    dependency problem that came about when I produced some patches to
    optimise get_order() by rewriting it to use ilog2().

    This uses bitops - and on the SH arch asm/bitops.h drags in
    asm-generic/get_order.h by a circuitous route involving asm/system.h.

    The main difficulty seems to be asm/system.h. It holds a number of
    low level bits with no/few dependencies that are commonly used (eg.
    memory barriers) and a number of bits with more dependencies that
    aren't used in many places (eg. switch_to()).

    These patches break asm/system.h up into the following core pieces:

    (1) asm/barrier.h

    Move memory barriers here. This is already done for MIPS and Alpha.

    (2) asm/switch_to.h

    Move switch_to() and related stuff here.

    (3) asm/exec.h

    Move arch_align_stack() here. Other process execution related bits
    could perhaps go here from asm/processor.h.

    (4) asm/cmpxchg.h

    Move xchg() and cmpxchg() here as they're full word atomic ops and
    frequently used by atomic_xchg() and atomic_cmpxchg().

    (5) asm/bug.h

    Move die() and related bits.

    (6) asm/auxvec.h

    Move AT_VECTOR_SIZE_ARCH here.

    Other arch headers are created as needed on a per-arch basis."

    Fixed up some conflicts from other header file cleanups and moving code
    around that has happened in the meantime, so David's testing is somewhat
    weakened by that. We'll find out anything that got broken and fix it..

    * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
    Delete all instances of asm/system.h
    Remove all #inclusions of asm/system.h
    Add #includes needed to permit the removal of asm/system.h
    Move all declarations of free_initmem() to linux/mm.h
    Disintegrate asm/system.h for OpenRISC
    Split arch_align_stack() out from asm-generic/system.h
    Split the switch_to() wrapper out of asm-generic/system.h
    Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
    Create asm-generic/barrier.h
    Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
    Disintegrate asm/system.h for Xtensa
    Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
    Disintegrate asm/system.h for Tile
    Disintegrate asm/system.h for Sparc
    Disintegrate asm/system.h for SH
    Disintegrate asm/system.h for Score
    Disintegrate asm/system.h for S390
    Disintegrate asm/system.h for PowerPC
    Disintegrate asm/system.h for PA-RISC
    Disintegrate asm/system.h for MN10300
    ...

    Linus Torvalds
     
  • Remove all #inclusions of asm/system.h preparatory to splitting and killing
    it. Performed with the following command:

    perl -p -i -e 's!^#\s*include\s*.*\n!!' `grep -Irl '^#\s*include\s*' *`

    Signed-off-by: David Howells

    David Howells
     

28 Mar, 2012

1 commit

  • Pull networking fixes from David Miller:
    1) Name string overrun fix in gianfar driver from Joe Perches.

    2) VHOST bug fixes from Michael S. Tsirkin and Nadav Har'El

    3) Fix dependencies on xt_LOG netfilter module, from Pablo Neira Ayuso.

    4) Fix RCU locking in xt_CT, also from Pablo Neira Ayuso.

    5) Add a parameter to skb_add_rx_frag() so we can fix the truesize
    adjustments in the drivers that use it. The individual drivers
    aren't fixed by this commit, but will be dealt with using follow-on
    commits. From Eric Dumazet.

    6) Add some device IDs to qmi_wwan driver, from Andrew Bird.

    7) Fix a potential rcu_read_lock() imbalance in rt6_fill_node(). From
    Eric Dumazet.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    net: fix a potential rcu_read_lock() imbalance in rt6_fill_node()
    net: add a truesize parameter to skb_add_rx_frag()
    gianfar: Fix possible overrun and simplify interrupt name field creation
    USB: qmi_wwan: Add ZTE (Vodafone) K3570-Z and K3571-Z net interfaces
    USB: option: Ignore ZTE (Vodafone) K3570/71 net interfaces
    USB: qmi_wwan: Add ZTE (Vodafone) K3565-Z and K4505-Z net interfaces
    qlcnic: Bug fix for LRO
    netfilter: nf_conntrack: permanently attach timeout policy to conntrack
    netfilter: xt_CT: fix assignation of the generic protocol tracker
    netfilter: xt_CT: missing rcu_read_lock section in timeout assignment
    netfilter: cttimeout: fix dependency with l4protocol conntrack module
    netfilter: xt_LOG: use CONFIG_IP6_NF_IPTABLES instead of CONFIG_IPV6
    vhost: fix release path lockdep checks
    vhost: don't forget to schedule()
    tools/virtio: stub out strong barriers
    tools/virtio: add linux/hrtimer.h stub
    tools/virtio: add linux/module.h stub

    Linus Torvalds
     

26 Mar, 2012

1 commit

  • skb_add_rx_frag() API is misleading.

    Network skbs built with this helper can use uncharged kernel memory and
    eventually stress/crash machine in OOM.

    Add a 'truesize' parameter and then fix drivers in followup patches.

    Signed-off-by: Eric Dumazet
    Cc: Wey-Yi Guy
    Signed-off-by: David S. Miller

    Eric Dumazet
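
    A minimal sketch of a driver call after the API change, assuming the
    six-argument signature this commit introduces; the page handling
    around it is illustrative.

        #include <linux/mm.h>
        #include <linux/skbuff.h>

        /* Illustrative: tell the stack how much memory the fragment really
         * pins (here a whole page), not just how many bytes were received,
         * so socket memory accounting stays honest. */
        static void rx_add_page(struct sk_buff *skb, struct page *page,
                                unsigned int received_len)
        {
                skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page, 0,
                                received_len, PAGE_SIZE);
        }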
     

25 Mar, 2012

1 commit

  • Pull cleanup from Paul Gortmaker:
    "The changes shown here are to unify linux's BUG support under the one
    file. Due to historical reasons, we have some BUG code
    in bug.h and some in kernel.h -- i.e. the support for BUILD_BUG in
    linux/kernel.h predates the addition of linux/bug.h, but old code in
    kernel.h wasn't moved to bug.h at that time. As a band-aid, kernel.h
    was including to pseudo link them.

    This has caused confusion[1] and general yuck/WTF[2] reactions. Here
    is an example that violates the principle of least surprise:

    CC lib/string.o
    lib/string.c: In function 'strlcat':
    lib/string.c:225:2: error: implicit declaration of function 'BUILD_BUG_ON'
    make[2]: *** [lib/string.o] Error 1
    $
    $ grep linux/bug.h lib/string.c
    #include <linux/bug.h>
    $

    We've included <linux/bug.h> for the BUG infrastructure and yet we
    still get a compile fail! [We've not included kernel.h for
    BUILD_BUG_ON.] Ugh - very confusing for someone who is new to kernel
    development.

    With the above in mind, the goals of this changeset are:

    1) find and fix any include/*.h files that were relying on the
    implicit presence of BUG code.
    2) find and fix any C files that were consuming kernel.h and hence
    relying on implicitly getting some/all BUG code.
    3) Move the BUG related code living in kernel.h to <linux/bug.h>
    4) remove the asm/bug.h include from kernel.h to finally break the chain.

    During development, the order was more like 3-4, build-test, 1-2. But
    to ensure that git history for bisect doesn't get needless build
    failures introduced, the commits have been reorderd to fix the problem
    areas in advance.

    [1] https://lkml.org/lkml/2012/1/3/90
    [2] https://lkml.org/lkml/2012/1/17/414"

    Fix up conflicts (new radeon file, reiserfs header cleanups) as per Paul
    and linux-next.

    * tag 'bug-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
    kernel.h: doesn't explicitly use bug.h, so don't include it.
    bug: consolidate BUILD_BUG_ON with other bug code
    BUG: headers with BUG/BUG_ON etc. need linux/bug.h
    bug.h: add include of it to various implicit C users
    lib: fix implicit users of kernel.h for TAINT_WARN
    spinlock: macroize assert_spin_locked to avoid bug.h dependency
    x86: relocate get/set debugreg fcns to include/asm/debugreg.

    Linus Torvalds
     

20 Mar, 2012

1 commit

  • As suggested by Ben, this adds the clarification on the usage of
    CHECKSUM_UNNECESSARY on the outgoing path. Also add the usage
    description of NETIF_F_FCOE_CRC and CHECKSUM_UNNECESSARY
    for the kernel FCoE protocol driver.

    This is a follow-up to the following:
    http://patchwork.ozlabs.org/patch/147315/

    Signed-off-by: Yi Zou
    Cc: Ben Hutchings
    Cc: Jeff Kirsher
    Cc: www.Open-FCoE.org
    Signed-off-by: David S. Miller

    Yi Zou
     

10 Mar, 2012

1 commit


05 Mar, 2012

1 commit

  • If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
    other BUG variant in a static inline (i.e. not in a #define) then
    that header really should be including <linux/bug.h> and not just
    expecting it to be implicitly present.

    We can make this change risk-free, since if the files using these
    headers didn't have exposure to linux/bug.h already, they would have
    been causing compile failures/warnings.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

27 Feb, 2012

1 commit