12 Oct, 2013

18 commits

  • We do not actually need to set any rx filters for the virtual batman
    soft interface. However a dummy handler enables a user to set static
    multicast listeners for instance.

    Signed-off-by: Linus Lüssing
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Linus Lüssing
     
  • This is a simple batadv_tt_save_orig_buffer() refactoring
    aiming to make it more generic and avoid useless casts.

    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
     
  • Implement batadv_tt_entries() to get the number of entries
    fitting in a given amount of bytes. This computation is done
    several times in the code and therefore it is useful to have
    a helper function.

    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
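
    As a rough illustration of the helper described above, the sketch
    below divides a byte count by the size of one entry. The struct
    layout and both names (tt_change_example, tt_entries_example) are
    assumptions made for this example and are not taken from the
    batman-adv sources.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-in for one translation-table change record. */
    struct tt_change_example {
        uint8_t flags;
        uint8_t reserved;
        uint8_t addr[6];
    };

    /* Number of whole entries that fit in tt_len bytes. */
    static inline uint16_t tt_entries_example(uint16_t tt_len)
    {
        return (uint16_t)(tt_len / sizeof(struct tt_change_example));
    }
    ```

    Keeping the division in one place means a change to the entry size
    only has to be accounted for once.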
     
  • This is not used anymore with the new fragmentation, and it might
    actually mess up the bonding code because find_router() assumes it
    is only called once per packet.

    Signed-off-by: Simon Wunderlich
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Simon Wunderlich
     
  • The module prints a warning when the MTU on the hard interface is too
    small to transfer payload traffic without fragmentation. The required
    MTU is calculated based on the encapsulation header size. If network
    coding is compiled into the module, its header size is taken into
    account as well.

    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Marek Lindner
     
  • The icmp and the icmp_rr packets share the same initial
    fields since they use the same code to be processed and
    forwarded.

    Extract the common fields and put them into a separate
    struct so that future ICMP packets can be easily added
    without bloating the packet definition.

    However, keep the seqno field outside of the newly created
    common header because future ICMP types may require a
    bigger sequence number space.

    This change breaks compatibility due to fields reordering
    in the ICMP headers.

    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
     
  • When comparing a network ordered value with a constant, it
    is better to convert the constant at compile time by means
    of htons() instead of converting the value at runtime using
    ntohs().

    This refactoring may slightly improve the code performance.

    Moreover, substitute __constant_htons() with htons() since
    the latter increases readability and is smart enough to be
    as efficient as the former.

    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner
    Acked-by: Simon Wunderlich

    Antonio Quartulli
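
    The two comparison styles discussed above can be sketched in
    userspace C as follows. The function names and the
    ETH_P_IP_EXAMPLE macro are made up for this example; the value
    0x0800 is the IPv4 EtherType. Whether htons() on a constant is
    folded at compile time depends on the compiler and headers in use.

    ```c
    #include <arpa/inet.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define ETH_P_IP_EXAMPLE 0x0800 /* IPv4 EtherType */

    /* Converts the network-ordered field on every call. */
    static bool is_ipv4_runtime(uint16_t proto_be)
    {
        return ntohs(proto_be) == ETH_P_IP_EXAMPLE;
    }

    /* Converts the constant instead; the field is compared as-is. */
    static bool is_ipv4_const(uint16_t proto_be)
    {
        return proto_be == htons(ETH_P_IP_EXAMPLE);
    }
    ```

    Both functions return the same result for any input; only the
    side of the comparison that gets converted differs.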
     
  • Non-broadcast packets larger than MTU are fragmented and sent with
    an encapsulating header. Up to 16 fragments are supported, which are
    sent in reverse order on the wire to allow minimal memory copying when
    creating fragments.

    Signed-off-by: Martin Hundebøll
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Martin Hundebøll
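
    A minimal sketch of the tail-first ordering described above: it
    only plans payload offsets and lengths and does not model the
    encapsulating header or any skb handling. All names here
    (frag_example, plan_fragments, MAX_FRAGMENTS_EXAMPLE) are
    hypothetical; only the limit of 16 comes from the commit message.

    ```c
    #include <stddef.h>

    #define MAX_FRAGMENTS_EXAMPLE 16

    struct frag_example {
        size_t offset; /* start of the fragment in the original payload */
        size_t len;    /* fragment payload length */
    };

    /*
     * Fill out[] in transmit order, starting from the tail of the payload.
     * Returns the number of fragments, or -1 if max_frag is 0 or more than
     * MAX_FRAGMENTS_EXAMPLE fragments would be needed.
     */
    static int plan_fragments(size_t total, size_t max_frag,
                              struct frag_example out[MAX_FRAGMENTS_EXAMPLE])
    {
        size_t remaining = total;
        int n = 0;

        if (max_frag == 0)
            return -1;

        /* Cut max_frag bytes off the tail until the head fits. */
        while (remaining > max_frag) {
            if (n == MAX_FRAGMENTS_EXAMPLE - 1)
                return -1;
            remaining -= max_frag;
            out[n].offset = remaining;
            out[n].len = max_frag;
            n++;
        }

        /* Whatever is left at the head goes out last. */
        out[n].offset = 0;
        out[n].len = remaining;
        return n + 1;
    }
    ```

    Cutting from the tail means the head of the buffer never has to
    move, which is the property the commit message refers to as
    minimal memory copying.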
     
  • Fragments arriving at their destination are buffered for later merge.
    Merged packets are passed to the main receive function as if they had
    never been fragmented.

    Fragments are forwarded without merging if the MTU of the outgoing
    interface is smaller than the size of the merged packet.

    Signed-off-by: Martin Hundebøll
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Martin Hundebøll
     
  • Remove the existing fragmentation code before adding the new version
    and delete unicast.{h,c}.

    batadv_unicast_send_skb() is moved to send.c and renamed to
    batadv_send_skb_unicast().

    The fragmentation entry in sysfs (bat_priv->fragmentation) is kept for
    use in the new fragmentation code.

    The BATADV_UNICAST_FRAG packet type is renamed to BATADV_FRAG for use in
    the new fragmentation code.

    Signed-off-by: Martin Hundebøll
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Martin Hundebøll
     
  • Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
     
  • In case of a VLAN tagged frame the ethhdr pointer is
    moved forward by 4 bytes so that the offset of h_proto
    in struct ethhdr matches the real
    h_vlan_encapsulated_proto address in the skb. While this
    trickery is correct it makes the code harder to understand
    and may lead to bugs in case of re-use of ethhdr for other
    purposes.

    This patch introduces a proto variable to make things
    cleaner and easier to understand.

    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
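
    The idea of a separate proto variable can be sketched as below,
    assuming a raw Ethernet frame in a byte buffer. This is a
    userspace illustration, not the patched batman-adv code; the
    function name and the *_EXAMPLE macros are hypothetical. The
    offsets follow the standard 802.1Q layout (TPID at byte 12,
    encapsulated EtherType at byte 16).

    ```c
    #include <arpa/inet.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define ETH_HLEN_EXAMPLE    14
    #define VLAN_HLEN_EXAMPLE   4
    #define ETH_P_8021Q_EXAMPLE 0x8100

    /*
     * Return the payload EtherType in network byte order, looking past a
     * single VLAN tag if present. Returns 0 if the frame is too short.
     */
    static uint16_t frame_proto(const uint8_t *frame, size_t len)
    {
        uint16_t proto;

        if (len < ETH_HLEN_EXAMPLE)
            return 0;
        memcpy(&proto, frame + 12, sizeof(proto));

        if (proto == htons(ETH_P_8021Q_EXAMPLE)) {
            if (len < ETH_HLEN_EXAMPLE + VLAN_HLEN_EXAMPLE)
                return 0;
            /* Read the encapsulated type; the header pointer is not moved. */
            memcpy(&proto, frame + 16, sizeof(proto));
        }

        return proto;
    }
    ```

    Because the frame pointer itself is never shifted, later code can
    keep using it to reach the real destination and source addresses.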
     
  • batadv_tt_global_entry_free_ref uses call_rcu to schedule a
    function which will only free the global entry itself.

    For this reason call_rcu is useless and kfree_rcu can be
    used to simplify the code.

    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
     
  • batadv_tt_global_add_orig is neither used nor implemented
    anymore, therefore it is possible to remove its declaration

    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
     
  • batadv_tt_global_add is not used anymore outside of the TT
    code thanks to the TVLV implementation. It can therefore be
    declared as static

    Last user has been removed by 3de4e64df0f1326db7cc0ef25f5af8522850252d
    ("batman-adv: tvlv - convert roaming adv packet to use tvlv unicast packets")

    Moreover make it return bool since its result can be either 0 or 1.

    Reported-by: Simon Wunderlich
    Signed-off-by: Antonio Quartulli
    Signed-off-by: Marek Lindner

    Antonio Quartulli
     
  • Adding host information for record route is only required for ICMP
    requests and replies, and should not be added to just any (future?)
    packet type.

    Signed-off-by: Simon Wunderlich
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Simon Wunderlich
     
  • 1) We need to take a timestamp only for skb that should be cloned.

    Other skbs are not in write queue and no rtt estimation is done on them.

    2) the unlikely() hint is wrong for receivers (they send pure ACK)

    Signed-off-by: Eric Dumazet
    Cc: MF Nowlan
    Cc: Yuchung Cheng
    Cc: Neal Cardwell
    Acked-By: Yuchung Cheng
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Included changes:
    - update emails for A. Quartulli and M. Lindner in MAINTAINERS
    - switch to the next on-the-wire protocol version
    - introduce the T(ype) V(ersion) L(ength) V(alue) framework
    - adjust the existing components to make them use the new TVLV code
    - make the TT component use CRC32 instead of CRC16
    - totally remove the VIS functionality (has been moved to userspace)
    - reorder packet types and flags
    - add static checks on packet format
    - remove __packed from batadv_ogm_packet

    David S. Miller
     

11 Oct, 2013

2 commits

  • Jeff Kirsher says:

    ====================
    This series contains updates to i40e only.

    Alex provides the majority of the patches against i40e, where he
    cleans up the Tx and Rx queues to align the code with the known
    good Tx/Rx queue code in the ixgbe driver.

    Anjali provides an i40e patch to update link events to not print to
    the log until the device is administratively up.

    Catherine provides a patch to update the driver version.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • In commit 634fb979e8f ("inet: includes a sock_common in request_sock")
    I forgot that the two ports in sock_common do not have the same byte order:

    skc_dport is __be16 (network order), but skc_num is __u16 (host order)

    So sparse complains because ir_loc_port (mapped into skc_num) is
    considered as __u16 while it should be __be16

    Let's rename ir_loc_port to ireq->ir_num (by analogy with inet->inet_num),
    and perform appropriate htons/ntohs conversions.

    Signed-off-by: Eric Dumazet
    Reported-by: Wu Fengguang
    Signed-off-by: David S. Miller

    Eric Dumazet
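
    The mixed byte order described above can be illustrated with a
    small userspace sketch. The struct and function names are
    hypothetical and only mirror the idea of one port kept in network
    order and the other in host order; they are not the kernel
    definitions.

    ```c
    #include <arpa/inet.h>
    #include <stdint.h>

    /* Hypothetical pair of ports with different byte orders. */
    struct ports_example {
        uint16_t dport_be; /* remote port, network byte order */
        uint16_t num;      /* local port, host byte order */
    };

    /* Store ports taken from a packet header (both in network order). */
    static void set_ports(struct ports_example *p,
                          uint16_t local_be, uint16_t remote_be)
    {
        p->num = ntohs(local_be); /* convert once when storing */
        p->dport_be = remote_be;  /* kept as received */
    }

    /* Local port as it must appear on the wire. */
    static uint16_t local_port_wire(const struct ports_example *p)
    {
        return htons(p->num);
    }
    ```

    Naming the host-order field differently from the network-order one
    makes a missing conversion easier to spot in review.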
     

10 Oct, 2013

20 commits

  • Update the version number of the driver.

    Signed-off-by: Catherine Sullivan
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Catherine Sullivan
     
  • This change brings support for 64 bit netstats to the driver. Previously
    the stats were 64 bit but highly racy due to the fact that 64 bit
    transactions are not atomic on 32 bit systems. This change makes it so
    that the 64 bit byte and packet stats are reliable on all architectures.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • Allocate the queue pairs individually instead of as a group. This
    allows for much easier queue management as it is possible to dynamically
    resize the queues without having to free and allocate the entire block.

    Ease statistic collection by treating Tx/Rx queue pairs as a single
    unit. Each pair is allocated together and starts with a Tx queue and
    ends with an Rx queue. By ordering them this way it is possible to know
    the Rx offset based on a pointer to the Tx queue.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
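
    A minimal sketch of the pairing idea, assuming each pair is a
    two-element block with the Tx ring first and the Rx ring second.
    The ring_example type and both helpers are hypothetical and far
    simpler than the driver's ring structure.

    ```c
    #include <stdlib.h>

    struct ring_example {
        int is_rx;
        unsigned int queue_index;
    };

    /* Allocate one Tx/Rx pair as a single block: [0] is Tx, [1] is Rx. */
    static struct ring_example *alloc_ring_pair(unsigned int queue_index)
    {
        struct ring_example *tx = calloc(2, sizeof(*tx));

        if (!tx)
            return NULL;

        tx[0].queue_index = queue_index;
        tx[1].queue_index = queue_index;
        tx[1].is_rx = 1;
        return tx;
    }

    /* The Rx ring always sits directly after its Tx ring. */
    static struct ring_example *rx_from_tx(struct ring_example *tx)
    {
        return &tx[1];
    }
    ```

    Each pair is freed with a single free() on the Tx pointer, so one
    queue can be resized without touching the others.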
     
  • This replaces the ring container array with a linked list. The idea is
    to make the logic much easier to deal with since this will allow us to
    call a simple helper function from the q_vectors to go through the
    entire list.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • Allocate the q_vectors individually. The advantage to this is that it
    allows for easier freeing and allocation. In addition it makes it so
    that we could do node specific allocations at some point in the future.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • This makes it so that the Tx and Rx byte and packet counts are
    separated from the rest of the statistics. This allows for better
    isolation of these stats when we move them into the 64 bit statistics.

    Simplify things by re-ordering how the stats display in ethtool.
    Instead of displaying all of the Tx queues as a block, followed by all
    the Rx queues, the new order is Tx[0], Rx[0], Tx[1], Rx[1], ..., Tx[n],
    Rx[n]. This reduces the loops and cleans up the display for testing
    purposes since it is very easy to verify if flow director is doing the
    right thing as the Tx and Rx queue pair are shown in pairs.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • Implement BQL (byte queue limit) support in i40e.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • Drop Tx flag and TXSW which is tested but never set.

    As a result of this change we can drop a complicated check that always
    resulted in the final result of i40e_tx_csum being equal to the
    CHECKSUM_PARTIAL value. As such we can replace the entire function call
    with just a check for skb->ip_summed == CHECKSUM_PARTIAL.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • Sync the fast path for i40e_tx_map and i40e_clean_tx_irq so that they
    are similar to igb and ixgbe.

    - Only update the Tx descriptor ring in tx_map
    - Make skb mapping always on the first buffer in the chain
    - Drop the use of MAPPED_AS_PAGE Tx flag
    - Only store flags on the first buffer_info structure

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • Avoid directly incrementing next_to_use for multiple reasons. The main
    reason being that if we directly increment it then it can attain a state
    where it is equal to the ring count. Technically this is a state it
    should not be able to reach but the way this is written it now can.

    This patch pulls the value off into a register and then increments it
    and writes back either the value or 0 depending on if the value is equal
    to the ring count.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
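
    The increment-and-wrap pattern described above can be sketched as
    follows. The struct and function names are hypothetical; the point
    is that the shared field is written exactly once and never holds a
    value equal to the ring count.

    ```c
    struct ring_ntu_example {
        unsigned short count;       /* number of descriptors in the ring */
        unsigned short next_to_use; /* always in the range 0..count-1 */
    };

    static void advance_next_to_use(struct ring_ntu_example *ring)
    {
        /* Work on a local copy so the field never holds `count`. */
        unsigned int ntu = ring->next_to_use + 1u;

        ring->next_to_use = (ntu < ring->count) ? (unsigned short)ntu : 0;
    }
    ```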
     
  • - drop the mapped_as_page u8 from the Tx buffer info as it was unused
    - use the DMA unmap accessors for Tx DMA
    - replace checks of DMA with checks of the unmap length to verify if an
    unmap is needed
    - update the Tx buffer layout to make it consistent with igb, ixgbe

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • sk_pacing_rate is read by sch_fq packet scheduler at any time,
    with no synchronization, so make sure we update it in a
    sensible way. ACCESS_ONCE() is how we instruct the compiler
    to not do stupid things, like using the memory location
    as a temporary variable.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
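
    A userspace sketch of the pattern, assuming GCC or Clang (it
    relies on the __typeof__ extension). The macro mirrors the usual
    volatile-cast form of ACCESS_ONCE(); the struct, the function and
    the clamping logic are made up for illustration and are not the
    TCP code.

    ```c
    /* Force a single access through a volatile-qualified lvalue. */
    #define ACCESS_ONCE_EXAMPLE(x) (*(volatile __typeof__(x) *)&(x))

    struct sock_example {
        unsigned int pacing_rate; /* read by another context at any time */
    };

    static void update_pacing_rate(struct sock_example *sk,
                                   unsigned long long rate,
                                   unsigned int max_rate)
    {
        /* Do all intermediate computation in a local variable. */
        if (rate > max_rate)
            rate = max_rate;

        /* Publish the final value with exactly one store. */
        ACCESS_ONCE_EXAMPLE(sk->pacing_rate) = (unsigned int)rate;
    }
    ```

    A reader that samples pacing_rate concurrently then sees either
    the old or the new value, never an intermediate result.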
     
  • TCP listener refactoring, part 5 :

    We want to be able to insert request sockets (SYN_RECV) into main
    ehash table instead of the per listener hash table to allow RCU
    lookups and remove listener lock contention.

    This patch includes the needed struct sock_common in front
    of struct request_sock

    This means there is no more inet6_request_sock IPv6 specific
    structure.

    The following inet_request_sock fields were renamed as they became
    macros to reference fields from struct sock_common.
    Prefix ir_ was chosen to avoid name collisions.

    loc_port -> ir_loc_port
    loc_addr -> ir_loc_addr
    rmt_addr -> ir_rmt_addr
    rmt_port -> ir_rmt_port
    iif -> ir_iif

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • skb_gro_receive() is currently limited to 16 or 17 MSS per GRO skb,
    typically 24616 bytes, because it fills up to MAX_SKB_FRAGS frags.

    It's relatively easy to extend the skb using frag_list to allow
    more frags to be appended into the last sk_buff.

    This still builds very efficient skbs, and allows reaching 45 MSS per
    skb.

    (45 MSS GRO packet uses one skb plus a frag_list containing 2 additional
    sk_buff)

    High speed TCP flows benefit from this extension by lowering TCP stack
    cpu usage (less packets stored in receive queue, less ACK packets
    processed)

    Forwarding setups could be hurt, as such skbs will need to be
    linearized, although it's not a new problem, as GRO could already
    provide skbs with a frag_list.

    We could make the 65536 bytes threshold a tunable to mitigate this.

    (First time we need to linearize skb in skb_needs_linearize(), we could
    lower the tunable to ~16*1460 so that following skb_gro_receive() calls
    build smaller skbs)

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
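
    The figures above can be checked with a little arithmetic. The
    sketch assumes an MSS of 1448 bytes, which is inferred from
    24616 / 17 and is not stated in the commit message; the helper
    name is hypothetical.

    ```c
    #include <stddef.h>

    #define MSS_EXAMPLE       1448u  /* assumed: 24616 bytes / 17 segments */
    #define GRO_LIMIT_EXAMPLE 65536u /* byte threshold named in the text */

    /* Whole segments of size mss that fit under a byte limit. */
    static size_t segments_under_limit(size_t limit, size_t mss)
    {
        return mss ? limit / mss : 0;
    }
    ```

    With these assumptions, 17 segments come to 24616 bytes and
    segments_under_limit(GRO_LIMIT_EXAMPLE, MSS_EXAMPLE) evaluates
    to 45, matching the numbers quoted in the message.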
     
  • This is an enhancement.

    For the first node in fib_trie, newpos is 0 and bit is 1.
    pos only needs to be calculated for a leaf or a node with an
    unmatched key.

    Signed-off-by: baker.zhang
    Signed-off-by: David S. Miller

    baker.zhang
     
  • We can only set up a multicast address for a network device when
    net_device_ops->ndo_set_rx_mode is not NULL.

    Some configurations need to add a multicast address to a net
    device, such as the netfilter cluster match module.

    Add a fake ndo_set_rx_mode function to allow this operation.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • The Tx "completed" stat was part of the original rewrite for detecting
    Tx hangs. However some time ago in ixgbe I determined that we could
    just use the packets stat instead. Since then this stat was
    removed from ixgbe and it serves no purpose in i40e so it can be
    dropped.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • Link events should not print to the log until the device is
    administratively up.

    Signed-off-by: Anjali Singhai
    Signed-off-by: Jesse Brandeburg
    Tested-by: Kavindya Deegala
    Signed-off-by: Jeff Kirsher

    Anjali Singhai
     
  • Signed-off-by: Ajit Khaparde
    Signed-off-by: David S. Miller

    Ajit Khaparde
     
  • SkyHawk-R can support RoCE. Add code to display RoCE specific
    counters maintained in hardware.

    Signed-off-by: Ajit Khaparde
    Signed-off-by: David S. Miller

    Ajit Khaparde