08 Jul, 2013

1 commit

  • Continue the approach taken by commit d2b57063e4a ("IB/core: Reserve
    bits in enum ib_qp_create_flags for low-level driver use") and add
    reserved entries to the ib_qp_type and ib_wr_opcode enums. Low-level
    drivers can then define properly named macros for these reserved
    values, keeping the code readable. Also add a range of reserved
    flags to enum ib_send_flags.

    The mlx5 IB driver uses the new additions; a sketch of the pattern
    follows this entry.

    Signed-off-by: Jack Morgenstein
    Signed-off-by: Roland Dreier

    Jack Morgenstein
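
    A minimal sketch of the pattern (enum members abbreviated;
    MLX5_IB_QPT_SPECIAL is a hypothetical driver-side name, not the
    real mlx5 macro):

      enum ib_qp_type {
              IB_QPT_RC,
              IB_QPT_UC,
              IB_QPT_UD,
              /* ... */
              /* opaque values reserved for low-level driver use */
              IB_QPT_RESERVED1,
              IB_QPT_RESERVED2,
              /* ... */
      };

      /* in the low-level driver: give a reserved value a readable,
       * driver-local name */
      #define MLX5_IB_QPT_SPECIAL   IB_QPT_RESERVED1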
     

20 Jun, 2013

39 commits

  • This typedef is unnecessary and should just be removed.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • This typedef is unnecessary and should just be removed.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • This patch removes an empty ifdef from inet_frag_intern()
    in net/ipv4/inet_fragment.c.

    commit b67bfe0d42cac56c512dd5da4b1b347a23f4b70a
    ("hlist: drop the node parameter from iterators") removed the hlist
    usage from net/ipv4/inet_fragment.c, but did not remove the
    enclosing #ifdef, which is now empty.

    Signed-off-by: Rami Rosen
    Signed-off-by: David S. Miller

    Rami Rosen
     
  • htb_sched structures are big, and a source of false sharing on SMP.

    Every time a packet is queued or dequeued, many cache lines must be
    touched because the structures are not laid out properly.

    By carefully splitting htb_sched in two parts, and defining sub
    structures to increase data locality, we can improve performance
    dramatically on SMP.

    The new htb_prio structure can also be used in htb_class to increase
    data locality (see the sketch after this entry).

    I got a 26% performance increase on a 24-thread machine, with 200
    concurrent netperfs in TCP_RR mode, using an HTB hierarchy of 4
    classes.

    Signed-off-by: Eric Dumazet
    Cc: Tom Herbert
    Signed-off-by: David S. Miller

    Eric Dumazet
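
    An illustrative layout (simplified stand-in fields, not the real
    htb members) showing the idea: group all per-priority hot data in
    the new htb_prio sub-structure, and keep the per-packet hot section
    of htb_sched together and cache-line aligned, away from cold
    config/stats fields:

      #define TC_HTB_NUMPRIO 8

      struct htb_prio {            /* per-priority hot data, together */
              unsigned long row_mask;
              void *feed_root;
              void *next_to_serve;
      };

      struct htb_sched {
              /* hot section: touched on every enqueue/dequeue */
              struct htb_prio hprio[TC_HTB_NUMPRIO]
                      __attribute__((aligned(64)));

              /* cold section: configuration and stats, rarely
               * touched, kept off the hot cache lines */
              int defcls;
              unsigned long stats[16];
      };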
     
  • In previous discussions, I tried to find some reasonable heuristics
    for delayed ACK, however this seems not possible, according to Eric:

    "ACKS might also be delayed because of bidirectional
    traffic, and is more controlled by the application
    response time. TCP stack can not easily estimate it."

    "ACK can be incredibly useful to recover from losses in
    a short time.

    The vast majority of TCP sessions are short-lived, and we
    send one ACK per received segment anyway at the beginning or
    on retransmits to let the sender smoothly increase its cwnd,
    so an auto-tuning facility won't help them that much."

    and according to David:

    "ACKs are the only information we have to detect loss.

    And, for the same reasons that TCP VEGAS is fundamentally
    broken, we cannot measure the pipe or some other
    receiver-side-visible piece of information to determine
    when it's "safe" to stretch ACK.

    And even if it's "safe", we should not do it, so that losses are
    accurately detected and we don't spuriously retransmit.

    The only way to know when the bandwidth increases is to
    "test" it, by sending more and more packets until drops happen.
    That's why all successful congestion control algorithms must
    operate on explicitly tested pieces of information.

    Similarly, it's not really possible to universally know if
    it's safe to stretch ACK or not."

    It still makes sense to be able to enable or disable quick ack
    mode, as TCP_QUICKACK does.

    This knob is similar to the TCP_QUICKACK socket option, but is for
    people who can't modify the source code and still want to control
    TCP delayed ACK behavior. As David suggested, it should have
    per-path scope, since different paths may want different behaviors;
    a sketch of the resulting receive-path check follows this entry.

    Cc: Eric Dumazet
    Cc: Rick Jones
    Cc: Stephen Hemminger
    Cc: "David S. Miller"
    Cc: Thomas Graf
    CC: David Laight
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
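
    The change adds a per-route metric (RTAX_QUICKACK, set from
    iproute2 with e.g. "ip route change 10.0.0.0/24 dev eth0 quickack 1")
    that is consulted on the receive path. A rough sketch of the check,
    in kernel context and assuming the usual dst_metric() accessor:

      static bool in_quickack_mode(struct sock *sk)
      {
              const struct inet_connection_sock *icsk = inet_csk(sk);
              const struct dst_entry *dst = __sk_dst_get(sk);

              /* the route metric forces quick ACKs; otherwise fall
               * back to the existing heuristic */
              return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
                     (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
      }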
     
  • Signed-off-by: Dave Jones
    Signed-off-by: David S. Miller

    Dave Jones
     
    The PCI core saves the PM capability register offset in pdev->pm_cap
    in pci_pm_init() during the init path. So we can use pdev->pm_cap
    instead of pci_find_capability(pdev, PCI_CAP_ID_PM), which avoids a
    capability-list walk and simplifies the code (see the sketch after
    this entry).

    Signed-off-by: Yijing Wang
    Cc: Michael Chan
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: David S. Miller

    Yijing Wang
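
    The substitution amounts to the following (kernel context; the
    PMCSR read is shown as a representative use):

      /* before: walk the capability list on every use */
      int pm = pci_find_capability(pdev, PCI_CAP_ID_PM);

      /* after: reuse the offset cached by pci_pm_init() */
      int pm = pdev->pm_cap;

      /* e.g. reading the power management control/status register */
      u16 pmcsr;
      pci_read_config_word(pdev, pdev->pm_cap + PCI_PM_CTRL, &pmcsr);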
     
    The PCI core saves the PM capability register offset in pdev->pm_cap
    in pci_pm_init() during the init path. So we can use pdev->pm_cap
    instead of pci_find_capability(pdev, PCI_CAP_ID_PM), which avoids a
    capability-list walk and simplifies the code.

    Signed-off-by: Yijing Wang
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Cc: Bill Pemberton
    Cc: Greg Kroah-Hartman
    Cc: netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
    Signed-off-by: David S. Miller

    Yijing Wang
     
  • pci_enable_device() already sets the device power state to D0,
    so there is no need to do it again in bnx2x_init_dev().
    Also remove the redundant PM cap lookup, because the PCI core
    has already saved the PM cap offset.

    Signed-off-by: Yijing Wang
    Cc: Eilon Greenstein
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Acked-by: Yuval Mintz
    Signed-off-by: David S. Miller

    Yijing Wang
     
  • ETRAX_ETHERNET selects ETHERNET and MII, which depend on NETDEVICES.
    I don't think anything should select NETDEVICES, so make it a
    dependency. It also doesn't need to select or depend on ETHERNET,
    which has nothing to do with the Ethernet library functions.

    BPCTL selects MII, which depends on NETDEVICES. But everything in the
    drivers/staging/silicom directory is related to net devices, so make
    NET_VENDOR_SILICOM depend on NETDEVICES and remove the now-redundant
    dependencies on NET.

    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • This has no dependency on any of the drivers under NET_CORE.

    Signed-off-by: Ben Hutchings
    Acked-by: Nicolas Ferre
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • All drivers that select MII also need to select NET_CORE because MII
    depends on it. This is a bit ridiculous because NET_CORE is just a
    menu option that doesn't enable any code by itself.

    There is also no need for it to be a visible option, since its users
    all select it.

    Signed-off-by: Ben Hutchings
    Acked-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • Signed-off-by: Weiping Pan
    Signed-off-by: David S. Miller

    Weiping Pan
     
  • Also, clean up bond_alb_handle_active_change() by merging two
    identical ifs.

    Signed-off-by: Veaceslav Falico
    Signed-off-by: David S. Miller

    Veaceslav Falico
     
  • be_find_vfs() is no longer needed as the common PCI calls provide the same
    functionality.

    Signed-off-by: Sathya Perla
    Signed-off-by: David S. Miller

    Sathya Perla
     
  • Use of this attribute was added in 32b8a8e59c9c ("sit: add IPv4 over
    IPv4 support"). It is optional; by default the proto is IPPROTO_IPV6.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • The current situation is that SOCK_MIN_RCVBUF is 2048 + sizeof(struct
    sk_buff) while SOCK_MIN_SNDBUF is 2048. Since in both cases
    skb->truesize is used for sk_{r,w}mem_alloc accounting, we should have
    both sizes adjusted via defining a TCP_SKB_MIN_TRUESIZE.

    Further, as Eric Dumazet points out, the minimal skb truesize in transmit path is
    SKB_TRUESIZE(2048) after commit f07d960df33c5 ("tcp: avoid frag allocation for
    small frames"), and tcp_sendmsg() tries to limit skb size to half the congestion
    window, meaning we try to build two skbs at minimum. Thus, having
    SOCK_MIN_SNDBUF as 2048 can cause a small regression for applications
    setting SO_SNDBUF / SO_RCVBUF too low. Note that we define a
    TCP_SKB_MIN_TRUESIZE, because
    SKB_TRUESIZE(2048) adds SKB_DATA_ALIGN(sizeof(struct skb_shared_info)), but in
    case of TCP skbs, the skb_shared_info is part of the 2048 bytes allocation for
    skb->head.

    The minor adaptation in sk_stream_moderate_sndbuf() is to silence a
    warning, by using a typed max macro as is similarly done at the
    SOCK_MIN_RCVBUF occurrences; the warning would appear otherwise.
    A sketch of the resulting macros follows this entry.

    Suggested-by: Eric Dumazet
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Daniel Borkmann
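
    The new definitions boil down to something like the following
    (sketch; see include/net/sock.h for the authoritative version):

      /* TCP skbs carry skb_shared_info inside the 2048-byte head
       * allocation, so only struct sk_buff is added on top */
      #define TCP_SKB_MIN_TRUESIZE \
              (2048 + SKB_DATA_ALIGN(sizeof(struct sk_buff)))

      /* two minimal skbs on the send side, one on the receive side */
      #define SOCK_MIN_SNDBUF (TCP_SKB_MIN_TRUESIZE * 2)
      #define SOCK_MIN_RCVBUF TCP_SKB_MIN_TRUESIZE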
     
  • thresh and interval are global resources;
    only init_net should be able to change them.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • Though we don't export the /proc/sys/net/ipv[4,6]/neigh/default/
    directory to namespaces other than init_net, we can still use a
    command such as "ip ntable change name arp_cache locktime 129" to
    change the locktime of the default neigh_parms.

    This patch disallows non-init_net namespaces from finding
    neigh_table.parms, so they can no longer influence init_net.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • neigh_table.parms always exists and is initialized, so kmemdup
    can use it to create new neigh_parms; lookup_neigh_parms here
    would return neigh_table.parms anyway.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • Check next-packet availability by validating that the HW has finished
    CQE placement. This saves the latency of another DMA transaction
    performed to update the SB indexes.

    Signed-off-by: Dmitry Kravkov
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Dmitry Kravkov
     
  • Adds an ndo_ll_poll method and locking of the FPs between LL and the
    napi context.

    When receiving a packet we use skb_mark_ll to record the napi it came
    from. Add each napi to the napi_hash right after netif_napi_add();
    see the sketch after this entry.

    Signed-off-by: Dmitry Kravkov
    Signed-off-by: Eilon Greenstein
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Dmitry Kravkov
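
    Based on the commit text, the driver-side hooks amount to roughly
    the following (fp->napi being the per-queue napi context; treat
    the exact call sites as an approximation):

      /* setup: make the napi context discoverable for LL polling */
      netif_napi_add(bp->dev, &fp->napi, bnx2x_poll, NAPI_POLL_WEIGHT);
      napi_hash_add(&fp->napi);

      /* rx path: record which napi the packet came from */
      skb_mark_ll(skb, &fp->napi);
      napi_gro_receive(&fp->napi, skb);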
     
  • Signed-off-by: Amir Vadai
    Signed-off-by: David S. Miller

    Amir Vadai
     
  • Add basic support for LLS.

    Signed-off-by: Amir Vadai
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Amir Vadai
     
  • Pravin B Shelar says:

    ====================
    The following patch series adds support for gre tunneling.
    The first six patches extend the kernel gre and ip_tunnel module
    apis so that there is more code sharing between the gre modules
    and ovs. The rest of the patches add the ovs tunneling
    infrastructure and the gre protocol vport.

    V2 fixes two patches according to comments from Jesse.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Add the gre vport implementation. Most of the gre protocol processing
    is pushed to the gre module. It makes use of the gre demultiplexer,
    therefore it can co-exist with Linux device based gre tunnels.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • The following patch adds a start offset for the sw_flow key, so that
    we can skip tunneling information in the key for non-tunnel flows.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • MAX_ACTIONS_BUFSIZE limits the action list size, and the set-tunnel
    action needs extra space on the action list, so increase the maximum
    action list limit for now.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Add an ovs tunnel interface so that userspace can use the set-tunnel
    action.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Rather than validating actions and then copying all actions
    in one block, the following patch does the same operation in a single
    pass, validating and copying actions one by one. This is required for
    the ovs tunneling patch; a generic sketch of the pattern follows this
    entry.

    This patch does not change any functionality.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
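
    A generic, standalone illustration of the refactor (not the ovs
    code itself; struct elem, validate_one() and copy_one() are
    hypothetical stand-ins):

      #include <stddef.h>

      struct elem { int type; int len; };  /* stand-in element type */

      extern int  validate_one(const struct elem *e);
      extern void copy_one(struct elem *dst, const struct elem *src);

      /* single pass: each element is validated and, if valid,
       * immediately copied, instead of a full validation pass
       * followed by a bulk copy */
      static int validate_and_copy(const struct elem *in, size_t n,
                                   struct elem *out)
      {
              size_t i;

              for (i = 0; i < n; i++) {
                      int err = validate_one(&in[i]);

                      if (err)
                              return err;  /* nothing past i copied */
                      copy_one(&out[i], &in[i]);
              }
              return 0;
      }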
     
  • This flag will be used by ovs tunneling.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Process the skb tunnel header before sending the packet to the
    protocol handler. This allows code sharing between the gre and
    ovs gre modules.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Refactor the various ip tunnel xmit functions and extend iptunnel_xmit()
    so that there is more code sharing.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • This is required for OVS GRE offloading.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • This is required for ovs gre module.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Currently only one user is allowed to register for the gre protocol.
    The following patch adds a demultiplexer, so that multiple modules can
    listen on the gre protocol, e.g. kernel gre devices and ovs; a sketch
    of the listener interface follows this entry.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
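
    The registration interface the series introduces looks roughly like
    this (field and constant names follow the new gre demux code, but
    treat the details as an approximation):

      static int my_gre_rcv(struct sk_buff *skb,
                            const struct tnl_ptk_info *tpi)
      {
              /* return PACKET_RCVD if we consumed the skb, or
               * PACKET_REJECT to let other listeners try */
              return PACKET_REJECT;
      }

      static struct gre_cisco_protocol my_gre_proto = {
              .handler  = my_gre_rcv,
              .priority = 1,  /* demux tries listeners in priority order */
      };

      /* at module init / exit */
      gre_cisco_register(&my_gre_proto);
      gre_cisco_unregister(&my_gre_proto);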
     
  • Use cmpxchg() for atomic protocol registration, which saves
    code and data space; a standalone illustration of the idiom follows
    this entry.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
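
    A standalone illustration of the idiom, with C11 atomics standing
    in for the kernel's cmpxchg() (table size and error codes
    simplified): registration succeeds only if the slot was still NULL,
    with no lock and no separate "is it taken?" test.

      #include <stdatomic.h>

      struct gre_protocol;  /* opaque here */

      static _Atomic(const struct gre_protocol *) gre_proto[2];

      int gre_add_protocol(const struct gre_protocol *proto,
                           unsigned int version)
      {
              const struct gre_protocol *expected = NULL;

              if (version >= 2)
                      return -1;
              /* atomically: if slot == NULL, install proto; else fail */
              return atomic_compare_exchange_strong(&gre_proto[version],
                                                    &expected, proto)
                     ? 0 : -1;
      }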
     
  • The only R8A7740-specific #ifdef hindering the ARM multiplatform build
    is left in sh_eth_rx(): it covers the code shifting Rx buffer
    descriptor word 0 by 16. Get rid of the #ifdef by adding a 'shift_rd0'
    field to 'struct sh_eth_cpu_data', making the shift dependent on it,
    and setting it to 1 for the R8A7740 case...

    Signed-off-by: Sergei Shtylyov
    Signed-off-by: David S. Miller

    Sergei Shtylyov
     
  • Fix the comment for 'enum TD_STS_BIT', reformat the values, and add a
    couple of previously missing values (though they are unused by the
    driver).

    Signed-off-by: Sergei Shtylyov
    Signed-off-by: David S. Miller

    Sergei Shtylyov