08 Jul, 2013

1 commit

  • Continue the approach taken by commit d2b57063e4a ("IB/core: Reserve
    bits in enum ib_qp_create_flags for low-level driver use") and add
    reserved entries to the ib_qp_type and ib_wr_opcode enums. Low-level
    drivers can then define properly named macros for these reserved
    values, keeping the code readable. Also add a range of reserved
    flags to enum ib_send_flags.

    The mlx5 IB driver uses the new additions; a sketch of the pattern
    follows this entry.

    Signed-off-by: Jack Morgenstein
    Signed-off-by: Roland Dreier

    Jack Morgenstein
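
    A minimal sketch of the pattern (enum members abbreviated;
    MLX5_IB_QPT_SPECIAL is a hypothetical driver-side name, not the
    real mlx5 macro):

      enum ib_qp_type {
              IB_QPT_RC,
              IB_QPT_UC,
              IB_QPT_UD,
              /* ... */
              /* opaque values reserved for low-level driver use */
              IB_QPT_RESERVED1,
              IB_QPT_RESERVED2,
              /* ... */
      };

      /* in the low-level driver: give a reserved value a readable,
       * driver-local name */
      #define MLX5_IB_QPT_SPECIAL   IB_QPT_RESERVED1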
     

20 Jun, 2013

39 commits

  • This typedef is unnecessary and should just be removed.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • This typedef is unnecessary and should just be removed.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • This patch removes an empty ifdef from inet_frag_intern()
    in net/ipv4/inet_fragment.c.

    commit b67bfe0d42cac56c512dd5da4b1b347a23f4b70a
    ("hlist: drop the node parameter from iterators") removed the hlist
    usage from net/ipv4/inet_fragment.c, but did not remove the
    enclosing #ifdef, which is now empty.

    Signed-off-by: Rami Rosen
    Signed-off-by: David S. Miller

    Rami Rosen
     
  • htb_sched structures are big, and a source of false sharing on SMP.

    Every time a packet is queued or dequeued, many cache lines must be
    touched because the structures are not laid out properly.

    By carefully splitting htb_sched in two parts, and defining sub
    structures to increase data locality, we can improve performance
    dramatically on SMP.

    The new htb_prio structure can also be used in htb_class to increase
    data locality (see the sketch after this entry).

    I got a 26% performance increase on a 24-thread machine, with 200
    concurrent netperfs in TCP_RR mode, using an HTB hierarchy of 4
    classes.

    Signed-off-by: Eric Dumazet
    Cc: Tom Herbert
    Signed-off-by: David S. Miller

    Eric Dumazet
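
    An illustrative layout (simplified stand-in fields, not the real
    htb members) showing the idea: group all per-priority hot data in
    the new htb_prio sub-structure, and keep the per-packet hot section
    of htb_sched together and cache-line aligned, away from cold
    config/stats fields:

      #define TC_HTB_NUMPRIO 8

      struct htb_prio {            /* per-priority hot data, together */
              unsigned long row_mask;
              void *feed_root;
              void *next_to_serve;
      };

      struct htb_sched {
              /* hot section: touched on every enqueue/dequeue */
              struct htb_prio hprio[TC_HTB_NUMPRIO]
                      __attribute__((aligned(64)));

              /* cold section: configuration and stats, rarely
               * touched, kept off the hot cache lines */
              int defcls;
              unsigned long stats[16];
      };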
     
  • In previous discussions, I tried to find some reasonable heuristics
    for delayed ACK, however this seems not possible, according to Eric:

    "ACKS might also be delayed because of bidirectional
    traffic, and is more controlled by the application
    response time. TCP stack can not easily estimate it."

    "ACK can be incredibly useful to recover from losses in
    a short time.

    The vast majority of TCP sessions are short-lived, and we
    send one ACK per received segment anyway at the beginning or
    on retransmits to let the sender smoothly increase its cwnd,
    so an auto-tuning facility won't help them that much."

    and according to David:

    "ACKs are the only information we have to detect loss.

    And, for the same reasons that TCP VEGAS is fundamentally
    broken, we cannot measure the pipe or some other
    receiver-side-visible piece of information to determine
    when it's "safe" to stretch ACK.

    And even if it's "safe", we should not do it, so that losses are
    accurately detected and we don't spuriously retransmit.

    The only way to know when the bandwidth increases is to
    "test" it, by sending more and more packets until drops happen.
    That's why all successful congestion control algorithms must
    operate on explicitly tested pieces of information.

    Similarly, it's not really possible to universally know if
    it's safe to stretch ACK or not."

    It still makes sense to be able to enable or disable quick ack
    mode, as TCP_QUICKACK does.

    This knob is similar to the TCP_QUICKACK socket option, but is for
    people who can't modify the source code and still want to control
    TCP delayed ACK behavior. As David suggested, it should have
    per-path scope, since different paths may want different behaviors;
    a sketch of the resulting receive-path check follows this entry.

    Cc: Eric Dumazet
    Cc: Rick Jones
    Cc: Stephen Hemminger
    Cc: "David S. Miller"
    Cc: Thomas Graf
    CC: David Laight
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
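
    The change adds a per-route metric (RTAX_QUICKACK, set from
    iproute2 with e.g. "ip route change 10.0.0.0/24 dev eth0 quickack 1")
    that is consulted on the receive path. A rough sketch of the check,
    in kernel context and assuming the usual dst_metric() accessor:

      static bool in_quickack_mode(struct sock *sk)
      {
              const struct inet_connection_sock *icsk = inet_csk(sk);
              const struct dst_entry *dst = __sk_dst_get(sk);

              /* the route metric forces quick ACKs; otherwise fall
               * back to the existing heuristic */
              return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
                     (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
      }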
     
  • Signed-off-by: Dave Jones
    Signed-off-by: David S. Miller

    Dave Jones
     
    The PCI core saves the PM capability register offset in pdev->pm_cap
    in pci_pm_init() during the init path. So we can use pdev->pm_cap
    instead of pci_find_capability(pdev, PCI_CAP_ID_PM), which avoids a
    capability-list walk and simplifies the code (see the sketch after
    this entry).

    Signed-off-by: Yijing Wang
    Cc: Michael Chan
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: David S. Miller

    Yijing Wang
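
    The substitution amounts to the following (kernel context; the
    PMCSR read is shown as a representative use):

      /* before: walk the capability list on every use */
      int pm = pci_find_capability(pdev, PCI_CAP_ID_PM);

      /* after: reuse the offset cached by pci_pm_init() */
      int pm = pdev->pm_cap;

      /* e.g. reading the power management control/status register */
      u16 pmcsr;
      pci_read_config_word(pdev, pdev->pm_cap + PCI_PM_CTRL, &pmcsr);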
     
    The PCI core saves the PM capability register offset in pdev->pm_cap
    in pci_pm_init() during the init path. So we can use pdev->pm_cap
    instead of pci_find_capability(pdev, PCI_CAP_ID_PM), which avoids a
    capability-list walk and simplifies the code.

    Signed-off-by: Yijing Wang
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Cc: Bill Pemberton
    Cc: Greg Kroah-Hartman
    Cc: netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
    Signed-off-by: David S. Miller

    Yijing Wang
     
  • pci_enable_device() already sets the device power state to D0,
    so there is no need to do it again in bnx2x_init_dev().
    Also remove the redundant PM cap lookup, because the PCI core
    has already saved the PM cap offset.

    Signed-off-by: Yijing Wang
    Cc: Eilon Greenstein
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Acked-by: Yuval Mintz
    Signed-off-by: David S. Miller

    Yijing Wang
     
  • ETRAX_ETHERNET selects ETHERNET and MII, which depend on NETDEVICES.
    I don't think anything should select NETDEVICES, so make it a
    dependency. It also doesn't need to select or depend on ETHERNET,
    which has nothing to do with the Ethernet library functions.

    BPCTL selects MII, which depends on NETDEVICES. But everything in the
    drivers/staging/silicom directory is related to net devices, so make
    NET_VENDOR_SILICOM depend on NETDEVICES and remove the now-redundant
    dependencies on NET.

    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • This has no dependency on any of the drivers under NET_CORE.

    Signed-off-by: Ben Hutchings
    Acked-by: Nicolas Ferre
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • All drivers that select MII also need to select NET_CORE because MII
    depends on it. This is a bit ridiculous because NET_CORE is just a
    menu option that doesn't enable any code by itself.

    There is also no need for it to be a visible option, since its users
    all select it.

    Signed-off-by: Ben Hutchings
    Acked-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • Signed-off-by: Weiping Pan
    Signed-off-by: David S. Miller

    Weiping Pan
     
  • Also, clean up bond_alb_handle_active_change() by merging two
    identical ifs.

    Signed-off-by: Veaceslav Falico
    Signed-off-by: David S. Miller

    Veaceslav Falico
     
  • be_find_vfs() is no longer needed as the common PCI calls provide the same
    functionality.

    Signed-off-by: Sathya Perla
    Signed-off-by: David S. Miller

    Sathya Perla
     
  • Use of this attribute was added in 32b8a8e59c9c ("sit: add IPv4 over
    IPv4 support"). It is optional; by default the proto is IPPROTO_IPV6.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • The current situation is that SOCK_MIN_RCVBUF is 2048 + sizeof(struct
    sk_buff) while SOCK_MIN_SNDBUF is 2048. Since in both cases
    skb->truesize is used for sk_{r,w}mem_alloc accounting, we should have
    both sizes adjusted via defining a TCP_SKB_MIN_TRUESIZE.

    Further, as Eric Dumazet points out, the minimal skb truesize in transmit path is
    SKB_TRUESIZE(2048) after commit f07d960df33c5 ("tcp: avoid frag allocation for
    small frames"), and tcp_sendmsg() tries to limit skb size to half the congestion
    window, meaning we try to build two skbs at minimum. Thus, having
    SOCK_MIN_SNDBUF as 2048 can cause a small regression for applications
    setting SO_SNDBUF / SO_RCVBUF too low. Note that we define a
    TCP_SKB_MIN_TRUESIZE, because
    SKB_TRUESIZE(2048) adds SKB_DATA_ALIGN(sizeof(struct skb_shared_info)), but in
    case of TCP skbs, the skb_shared_info is part of the 2048 bytes allocation for
    skb->head.

    The minor adaptation in sk_stream_moderate_sndbuf() is to silence a
    warning, by using a typed max macro as is similarly done at the
    SOCK_MIN_RCVBUF occurrences; the warning would appear otherwise.
    A sketch of the resulting macros follows this entry.

    Suggested-by: Eric Dumazet
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Daniel Borkmann
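
    The new definitions boil down to something like the following
    (sketch; see include/net/sock.h for the authoritative version):

      /* TCP skbs carry skb_shared_info inside the 2048-byte head
       * allocation, so only struct sk_buff is added on top */
      #define TCP_SKB_MIN_TRUESIZE \
              (2048 + SKB_DATA_ALIGN(sizeof(struct sk_buff)))

      /* two minimal skbs on the send side, one on the receive side */
      #define SOCK_MIN_SNDBUF (TCP_SKB_MIN_TRUESIZE * 2)
      #define SOCK_MIN_RCVBUF TCP_SKB_MIN_TRUESIZE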
     
  • thresh and interval are global resources;
    only init_net should be able to change them.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • Though we don't export the /proc/sys/net/ipv[4,6]/neigh/default/
    directory to namespaces other than init_net, we can still use a
    command such as "ip ntable change name arp_cache locktime 129" to
    change the locktime of the default neigh_parms.

    This patch disallows non-init_net namespaces from finding
    neigh_table.parms, so they can no longer influence init_net.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • neigh_table.parms always exists and is initialized, so kmemdup
    can use it to create new neigh_parms; lookup_neigh_parms here
    would return neigh_table.parms anyway.

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • Check next-packet availability by validating that the HW has finished
    CQE placement. This saves the latency of another DMA transaction
    performed to update the SB indexes.

    Signed-off-by: Dmitry Kravkov
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Dmitry Kravkov
     
  • Adds an ndo_ll_poll method and locking of the FPs between LL and the
    napi context.

    When receiving a packet we use skb_mark_ll to record the napi it came
    from. Add each napi to the napi_hash right after netif_napi_add();
    see the sketch after this entry.

    Signed-off-by: Dmitry Kravkov
    Signed-off-by: Eilon Greenstein
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Dmitry Kravkov
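
    Based on the commit text, the driver-side hooks amount to roughly
    the following (fp->napi being the per-queue napi context; treat
    the exact call sites as an approximation):

      /* setup: make the napi context discoverable for LL polling */
      netif_napi_add(bp->dev, &fp->napi, bnx2x_poll, NAPI_POLL_WEIGHT);
      napi_hash_add(&fp->napi);

      /* rx path: record which napi the packet came from */
      skb_mark_ll(skb, &fp->napi);
      napi_gro_receive(&fp->napi, skb);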
     
  • Signed-off-by: Amir Vadai
    Signed-off-by: David S. Miller

    Amir Vadai
     
  • Add basic support for LLS.

    Signed-off-by: Amir Vadai
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Amir Vadai
     
  • Pravin B Shelar says:

    ====================
    The following patch series adds support for gre tunneling.
    The first six patches extend the kernel gre and ip_tunnel module
    apis so that there is more code sharing between the gre modules
    and ovs. The rest of the patches add the ovs tunneling
    infrastructure and the gre protocol vport.

    V2 fixes two patches according to comments from Jesse.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Add the gre vport implementation. Most of the gre protocol processing
    is pushed to the gre module. It makes use of the gre demultiplexer,
    therefore it can co-exist with Linux device based gre tunnels.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • The following patch adds a start offset for the sw_flow key, so that
    we can skip tunneling information in the key for non-tunnel flows.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • MAX_ACTIONS_BUFSIZE limits the action list size, and the set-tunnel
    action needs extra space on the action list, so increase the maximum
    action list limit for now.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Add an ovs tunnel interface so that userspace can use the set-tunnel
    action.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Rather than validating actions and then copying all actions
    in one block, the following patch does the same operation in a single
    pass, validating and copying actions one by one. This is required for
    the ovs tunneling patch; a generic sketch of the pattern follows this
    entry.

    This patch does not change any functionality.

    Signed-off-by: Pravin B Shelar
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
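
    A generic, standalone illustration of the refactor (not the ovs
    code itself; struct elem, validate_one() and copy_one() are
    hypothetical stand-ins):

      #include <stddef.h>

      struct elem { int type; int len; };  /* stand-in element type */

      extern int  validate_one(const struct elem *e);
      extern void copy_one(struct elem *dst, const struct elem *src);

      /* single pass: each element is validated and, if valid,
       * immediately copied, instead of a full validation pass
       * followed by a bulk copy */
      static int validate_and_copy(const struct elem *in, size_t n,
                                   struct elem *out)
      {
              size_t i;

              for (i = 0; i < n; i++) {
                      int err = validate_one(&in[i]);

                      if (err)
                              return err;  /* nothing past i copied */
                      copy_one(&out[i], &in[i]);
              }
              return 0;
      }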
     
  • This flag will be used by ovs tunneling.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Process the skb tunnel header before sending the packet to the
    protocol handler. This allows code sharing between the gre and
    ovs gre modules.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Refactor the various ip tunnel xmit functions and extend iptunnel_xmit()
    so that there is more code sharing.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • This is required for OVS GRE offloading.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • This is required for ovs gre module.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Currently only one user is allowed to register for the gre protocol.
    The following patch adds a demultiplexer, so that multiple modules can
    listen on the gre protocol, e.g. kernel gre devices and ovs; a sketch
    of the listener interface follows this entry.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
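
    The registration interface the series introduces looks roughly like
    this (field and constant names follow the new gre demux code, but
    treat the details as an approximation):

      static int my_gre_rcv(struct sk_buff *skb,
                            const struct tnl_ptk_info *tpi)
      {
              /* return PACKET_RCVD if we consumed the skb, or
               * PACKET_REJECT to let other listeners try */
              return PACKET_REJECT;
      }

      static struct gre_cisco_protocol my_gre_proto = {
              .handler  = my_gre_rcv,
              .priority = 1,  /* demux tries listeners in priority order */
      };

      /* at module init / exit */
      gre_cisco_register(&my_gre_proto);
      gre_cisco_unregister(&my_gre_proto);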
     
  • Use cmpxchg() for atomic protocol registration, which saves
    code and data space; a standalone illustration of the idiom follows
    this entry.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
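
    A standalone illustration of the idiom, with C11 atomics standing
    in for the kernel's cmpxchg() (table size and error codes
    simplified): registration succeeds only if the slot was still NULL,
    with no lock and no separate "is it taken?" test.

      #include <stdatomic.h>

      struct gre_protocol;  /* opaque here */

      static _Atomic(const struct gre_protocol *) gre_proto[2];

      int gre_add_protocol(const struct gre_protocol *proto,
                           unsigned int version)
      {
              const struct gre_protocol *expected = NULL;

              if (version >= 2)
                      return -1;
              /* atomically: if slot == NULL, install proto; else fail */
              return atomic_compare_exchange_strong(&gre_proto[version],
                                                    &expected, proto)
                     ? 0 : -1;
      }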
     
  • The only R8A7740-specific #ifdef hindering the ARM multiplatform build
    is left in sh_eth_rx(): it covers the code shifting Rx buffer
    descriptor word 0 by 16. Get rid of the #ifdef by adding a 'shift_rd0'
    field to 'struct sh_eth_cpu_data', making the shift dependent on it,
    and setting it to 1 for the R8A7740 case...

    Signed-off-by: Sergei Shtylyov
    Signed-off-by: David S. Miller

    Sergei Shtylyov
     
  • Fix the comment for 'enum TD_STS_BIT', reformat the values, and add a
    couple of previously missing values (though they are unused by the
    driver).

    Signed-off-by: Sergei Shtylyov
    Signed-off-by: David S. Miller

    Sergei Shtylyov