25 Dec, 2016
1 commit
-
This was entirely automated, using the script by Al:

PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*'
sed -i -e "s!$PATT!#include !" \
    $(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro
Signed-off-by: Linus Torvalds
18 Nov, 2016
1 commit
-
Make struct pernet_operations::id unsigned.
There are 2 reasons to do so:
1) This field is really an index into a zero-based array and
thus is an unsigned entity. Using a negative value is an out-of-bounds
access by definition.

2) On x86_64, unsigned 32-bit data which are mixed with pointers
via array indexing or offsets added to or subtracted from pointers
are preferred to signed 32-bit data.

"int" being used as an array index needs to be sign-extended
to 64-bit before being used.

void f(long *p, int i)
{
g(p[i]);
}

roughly translates to

movsx rsi, esi
mov rdi, [rsi+...]
call g

MOVSX is a 3-byte instruction which isn't necessary if the variable is
unsigned because x86_64 zero-extends by default.

Now, there is the net_generic() function which, you guessed it right, uses
"int" as an array index:

static inline void *net_generic(const struct net *net, int id)
{
...
ptr = ng->ptr[id - 1];
...
}

And this function is used a lot, so those sign extensions add up.

Patch snipes ~1730 bytes on allyesconfig kernel (without all junk
messing with code generation):

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)

Unfortunately some functions actually grow bigger.
This is a seemingly random artefact of code generation with the register
allocator being used differently. gcc decides that some variable
needs to live in the new r8+ registers and every access now requires a REX
prefix. Or it is shifted into r12, so the [r12+0] addressing mode has to be
used, which is longer than [r8].

However, the overall balance is in the negative direction:

add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730)
function old new delta
nfsd4_lock 3886 3959 +73
tipc_link_build_proto_msg 1096 1140 +44
mac80211_hwsim_new_radio 2776 2808 +32
tipc_mon_rcv 1032 1058 +26
svcauth_gss_legacy_init 1413 1429 +16
tipc_bcbase_select_primary 379 392 +13
nfsd4_exchange_id 1247 1260 +13
nfsd4_setclientid_confirm 782 793 +11
...
put_client_renew_locked 494 480 -14
ip_set_sockfn_get 730 716 -14
geneve_sock_add 829 813 -16
nfsd4_sequence_done 721 703 -18
nlmclnt_lookup_host 708 686 -22
nfsd4_lockt 1085 1063 -22
nfs_get_client 1077 1050 -27
tcf_bpf_init 1106 1076 -30
nfsd4_encode_fattr 5997 5930 -67
Total: Before=154856051, After=154854321, chg -0.00%

Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller
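
The indexing argument in the commit above can be sketched in plain C. This is only an illustration, not the kernel code: `generic_slots`, `lookup_signed` and `lookup_unsigned` are hypothetical names, and the fixed array size is an assumption. Both functions return the same slot for valid ids; the difference is in the machine code a compiler may emit on x86_64, where the `int` version typically needs a sign extension before the address computation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-namespace pointer array. */
struct generic_slots {
	size_t len;
	void *ptr[8];
};

/* Signed id: the index usually has to be sign-extended to 64 bits. */
static void *lookup_signed(const struct generic_slots *ng, int id)
{
	assert(id >= 1 && (size_t)id <= ng->len);
	return ng->ptr[id - 1];
}

/* Unsigned id: 32-bit operations zero-extend implicitly on x86_64. */
static void *lookup_unsigned(const struct generic_slots *ng, unsigned int id)
{
	assert(id >= 1 && id <= ng->len);
	return ng->ptr[id - 1];
}
```

Compiling both functions with optimization and comparing the disassembly is one way to observe the effect the commit message describes; the exact instructions depend on the compiler and its version.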
31 Oct, 2016
1 commit
-
Mostly simple overlapping changes.
For example, David Ahern's adjacency list revamp in 'net-next'
conflicted with an adjacency list traversal bug fix in 'net'.

Signed-off-by: David S. Miller
21 Oct, 2016
2 commits
-
geneve:
- Merge __geneve_change_mtu back into geneve_change_mtu, set max_mtu
- This one isn't quite as straight-forward as others, could use some
  closer inspection and testing

macvlan:
- set min/max_mtu

tun:
- set min/max_mtu, remove tun_net_change_mtu

vxlan:
- Merge __vxlan_change_mtu back into vxlan_change_mtu
- Set max_mtu to IP_MAX_MTU and retain dynamic MTU range checks in
  change_mtu function
- This one is also not as straight-forward and could use closer inspection
  and testing from vxlan folks

bridge:
- set max_mtu of IP_MAX_MTU and retain dynamic MTU range checks in
  change_mtu function

openvswitch:
- set min/max_mtu, remove internal_dev_change_mtu
- note: max_mtu wasn't checked previously, it's been set to 65535, which
  is the largest possible size supported

sch_teql:
- set min/max_mtu (note: max_mtu previously unchecked, used max of 65535)

macsec:
- min_mtu = 0, max_mtu = 65535

macvlan:
- min_mtu = 0, max_mtu = 65535

ntb_netdev:
- min_mtu = 0, max_mtu = 65535

veth:
- min_mtu = 68, max_mtu = 65535

8021q:
- min_mtu = 0, max_mtu = 65535

CC: netdev@vger.kernel.org
CC: Nicolas Dichtel
CC: Hannes Frederic Sowa
CC: Tom Herbert
CC: Daniel Borkmann
CC: Alexander Duyck
CC: Paolo Abeni
CC: Jiri Benc
CC: WANG Cong
CC: Roopa Prabhu
CC: Pravin B Shelar
CC: Sabrina Dubroca
CC: Patrick McHardy
CC: Stephen Hemminger
CC: Pravin Shelar
CC: Maxim Krasnyansky
Signed-off-by: Jarod Wilson
Signed-off-by: David S. Miller
-
Currently, GRO can do unlimited recursion through the gro_receive
handlers. This was fixed for tunneling protocols by limiting tunnel GRO
to one level with encap_mark, but both VLAN and TEB still have this
problem. Thus, the kernel is vulnerable to a stack overflow, if we
receive a packet composed entirely of VLAN headers.

This patch adds a recursion counter to the GRO layer to prevent stack
overflow. When a gro_receive function hits the recursion limit, GRO is
aborted for this skb and it is processed normally. This recursion
counter is put in the GRO CB, but could be turned into a percpu counter
if we run out of space in the CB.

Thanks to Vladimír Beneš for the initial bug report.

Fixes: CVE-2016-7039
Fixes: 9b174d88c257 ("net: Add Transparent Ethernet Bridging GRO support.")
Fixes: 66e5133f19e9 ("vlan: Add GRO support for non hardware accelerated vlan")
Signed-off-by: Sabrina Dubroca
Reviewed-by: Jiri Benc
Acked-by: Hannes Frederic Sowa
Acked-by: Tom Herbert
Signed-off-by: David S. Miller
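
The recursion counter described in the commit above can be sketched as follows. This is a simplified model under stated assumptions, not the kernel implementation: `struct pkt`, `call_receive`, `vlan_receive` and the limit of 15 are hypothetical stand-ins chosen for the illustration. Each handler calls the next level through a wrapper that increments a per-packet counter and abandons aggregation once the limit is exceeded.

```c
#include <assert.h>
#include <stddef.h>

#define RECURSION_LIMIT 15	/* assumed limit for this sketch */

struct pkt {
	size_t vlan_headers_left;	/* nested headers still to parse */
	unsigned int recursion_counter;	/* plays the role of the GRO CB field */
	int flush;			/* set when aggregation is abandoned */
};

static int vlan_receive(struct pkt *p);

/* Wrapper every handler uses to call the next-level handler. */
static int call_receive(int (*cb)(struct pkt *), struct pkt *p)
{
	if (++p->recursion_counter > RECURSION_LIMIT) {
		p->flush = 1;	/* give up on aggregation for this packet */
		return -1;
	}
	return cb(p);
}

/* Handler for one VLAN header; recurses only through the wrapper. */
static int vlan_receive(struct pkt *p)
{
	if (p->vlan_headers_left == 0)
		return 0;	/* reached the payload */
	p->vlan_headers_left--;
	return call_receive(vlan_receive, p);
}
```

With this structure, a packet made of a thousand nested headers stops recursing after the limit instead of growing the stack with every header.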
19 Oct, 2016
1 commit
-
Use sizeof variable instead of literal number to enhance the readability.
Signed-off-by: Gao Feng
Signed-off-by: David S. Miller
18 Oct, 2016
1 commit
-
args.u.name_type is of type unsigned int and is always >= 0.
This fixes the following GCC warning:
net/8021q/vlan.c: In function ‘vlan_ioctl_handler’:
net/8021q/vlan.c:574:14: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]

Signed-off-by: Tobias Klauser
Signed-off-by: David S. Miller
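
A minimal sketch of the kind of check the warning above is about. The names `NAME_TYPE_HIGHEST` and `name_type_valid` are hypothetical and only illustrate the pattern: for an unsigned value, the lower-bound test is redundant and only the upper bound needs checking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical upper bound standing in for the highest valid name type. */
#define NAME_TYPE_HIGHEST 4U

static bool name_type_valid(unsigned int name_type)
{
	/* No "name_type >= 0" test: it is always true for an unsigned value. */
	return name_type < NAME_TYPE_HIGHEST;
}
```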
14 Aug, 2016
1 commit
-
The idea for type_check in dev_get_nest_level() was to count the number
of nested devices of the same type (currently, only macvlan or vlan
devices).
This prevented the false positive lockdep warning on configurations such
as:

eth0
Signed-off-by: David S. Miller
24 Jul, 2016
1 commit
-
Just several instances of overlapping changes.
Signed-off-by: David S. Miller
17 Jul, 2016
1 commit
-
macsec can't cope with mtu-sized frames which need vlan tag insertion, and
a vlan device sets its default mtu equal to the underlying dev's one.
By default, vlan over macsec devices use an invalid mtu, dropping
all the large packets.
This patch adds a netif helper to check if an upper vlan device
needs mtu reduction. The helper is used during vlan device
initialization to set a valid default and during mtu updating to
forbid invalid, too big, mtu values.
The helper currently only checks if the lower dev is a macsec device;
if we get more users, we need to update only the helper (possibly
reserving an additional IFF bit).

Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
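
The helper-based MTU check described in the commit above can be sketched like this. It is an illustration under assumptions, not the kernel code: `struct lower_dev`, its `is_macsec` flag and the function names are hypothetical, and the generic -1 error return stands in for whatever error code the real code uses. The only constant taken as fact is that one 802.1Q tag is 4 bytes long.

```c
#include <assert.h>
#include <stdbool.h>

#define VLAN_HLEN 4	/* size of one 802.1Q tag in bytes */

struct lower_dev {
	unsigned int mtu;
	bool is_macsec;	/* hypothetical flag for this sketch */
};

/* Does a vlan stacked on this device need to leave room for the tag? */
static bool reduces_vlan_mtu(const struct lower_dev *dev)
{
	return dev->is_macsec;
}

/* Largest mtu a vlan on top of "real" may use. */
static unsigned int vlan_max_mtu(const struct lower_dev *real)
{
	unsigned int max = real->mtu;

	if (reduces_vlan_mtu(real) && max >= VLAN_HLEN)
		max -= VLAN_HLEN;
	return max;
}

/* Returns 0 if new_mtu is acceptable, -1 if it is too big. */
static int vlan_change_mtu(const struct lower_dev *real, unsigned int new_mtu)
{
	return new_mtu > vlan_max_mtu(real) ? -1 : 0;
}
```

Keeping the decision in one helper matches the commit's point that additional users would only require updating the helper.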
06 Jul, 2016
1 commit
-
L2 upper device needs to propagate neigh_construct/destroy calls down to
lower devices. Do this by defining default ndo functions and use them in
team, bond, bridge and vlan.

Signed-off-by: Jiri Pirko
Reviewed-by: Ido Schimmel
Signed-off-by: David S. Miller
01 Jun, 2016
1 commit
-
The MAC address of the physical interface is only copied to the VLAN
when it is first created. After the MAC address of the physical
interface changes, only newly created VLANs have an up-to-date MAC,
which is inconsistent.

The VLANs should continue inheriting the MAC address of the physical
interface until the VLAN MAC address is explicitly set to any value.
This allows IPv6 EUI64 addresses for the VLAN to reflect any changes
to the MAC of the physical interface and thus for DAD to behave as
expected.

Signed-off-by: Mike Manning
Signed-off-by: David S. Miller
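
The inheritance rule described in the commit above can be sketched as follows. All names here are hypothetical, and the `addr_set_explicitly` marker is an assumption made for the illustration; the commit message does not say how the kernel records that the address was set by the user.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define ETH_ALEN 6	/* length of an Ethernet MAC address */

struct real_dev {
	unsigned char addr[ETH_ALEN];
};

struct vlan_dev {
	unsigned char addr[ETH_ALEN];
	bool addr_set_explicitly;	/* hypothetical "user set the MAC" marker */
};

/* User explicitly assigns a MAC to the VLAN: inheritance stops. */
static void vlan_set_mac(struct vlan_dev *v, const unsigned char *addr)
{
	memcpy(v->addr, addr, ETH_ALEN);
	v->addr_set_explicitly = true;
}

/* Called when the MAC of the physical interface changes. */
static void real_dev_addr_changed(const struct real_dev *real,
				  struct vlan_dev *v)
{
	if (!v->addr_set_explicitly)
		memcpy(v->addr, real->addr, ETH_ALEN);
}
```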
18 Mar, 2016
1 commit
-
vlan drivers lack proper propagation of gso_max_segs from
lower device.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
26 Feb, 2016
1 commit
-
Signed-off-by: David Decotigny
Signed-off-by: David S. Miller
22 Feb, 2016
1 commit
-
Currently a vlan device inherits the unicast filtering flag from the
underlying device. If the underlying device doesn't support unicast
filtering, this will put the vlan device into promiscuous mode when
it's stacked.

Turn on IFF_UNICAST_FLT on the vlan device in any case so that it does
not go into promiscuous mode needlessly. If the underlying device does not
support unicast filtering, that device will enter promiscuous mode.

Signed-off-by: Zhang Shengju
Signed-off-by: David S. Miller
18 Feb, 2016
1 commit
-
Since function vlan_proc_rem_dev() will only return 0, it's better to
return void instead of int.

Signed-off-by: Zhang Shengju
Signed-off-by: David S. Miller
16 Dec, 2015
3 commits
-
The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
set of features for offloading all checksums. This is a mask of the
checksum offload related feature bits. It is incorrect to set both
NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM at the same time
for features of a device.

This patch:
- Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
  NETIF_F_ALL_CSUM is being used as a mask).
- Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
  use NETIF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.

Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller
-
The SCTP checksum is really a CRC and is very different from the
standard 1's complement checksum that serves as the checksum
for IP protocols. This offload interface is also very different.
Rename NETIF_F_SCTP_CSUM to NETIF_F_SCTP_CRC to highlight these
differences. The term CSUM should be reserved in the stack to refer
to the standard 1's complement IP checksum.

Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller
-
We need to be able to propagate static FDB entries and certain bridge
port attributes (e.g. learning, flooding) down to the port netdev
driver when bridge port is a VLAN interface.

Achieve that by setting ndo_bridge* and ndo_fdb* in vlan_netdev_ops to
the corresponding switchdev_port* functions. This is consistent with
team and bond devices.

Signed-off-by: Ido Schimmel
Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller
18 Nov, 2015
1 commit
-
When a vlan is configured with REORDER_HEADER set to 0, the vlan
header is put back into the packet and makes it appear that
the vlan header is still there even after it's been processed.
This poses a problem for bridge and macvlan ports. The packets
passed to those devices may be forwarded and at the time of the
forward, vlan headers end up being unexpectedly present.

With the patch, we make sure that we do not put the vlan header
back (when REORDER_HEADER is 0) if a bridge or macvlan has
been configured on top of the vlan device.

Signed-off-by: Vladislav Yasevich
Signed-off-by: David S. Miller
04 Nov, 2015
1 commit
-
NIC drivers mark a device as detached during error recovery.
They expect no management hooks to be invoked in this state.
Invoke driver vlan hooks only if the device is present.

Signed-off-by: Padmanabh Ratnakar
Signed-off-by: David S. Miller
19 Aug, 2015
1 commit
-
Signed-off-by: Phil Sutter
Cc: Patrick McHardy
Signed-off-by: David S. Miller
02 Jun, 2015
1 commit
-
Currently packets with non-hardware-accelerated vlan cannot be handled
by GRO. This causes low performance for 802.1ad and stacked vlan, as their
vlan tags are currently not stripped by hardware.

This patch adds GRO support for non-hardware-accelerated vlan and
improves receive performance of them.

Test Environment:
vlan device (.1Q) on vlan device (.1ad) on ixgbe (82599)

Result:
- Before
$ netperf -t TCP_STREAM -H 192.168.20.2 -l 60
Recv   Send   Send
Socket Socket Message Elapsed
Size   Size   Size    Time    Throughput
bytes  bytes  bytes   secs.   10^6bits/sec

87380  16384  16384   60.00   5233.17

Rx side CPU usage:
%usr  %sys  %irq %soft %idle
0.27 58.03  0.00 41.70  0.00

- After
$ netperf -t TCP_STREAM -H 192.168.20.2 -l 60
Recv   Send   Send
Socket Socket Message Elapsed
Size   Size   Size    Time    Throughput
bytes  bytes  bytes   secs.   10^6bits/sec

87380  16384  16384   60.00   7586.85

Rx side CPU usage:
%usr  %sys  %irq %soft %idle
0.50 25.83  0.00 59.53 14.14

[ Register VLAN offloads with priority 10 -DaveM ]
Signed-off-by: Toshiaki Makita
Signed-off-by: David S. Miller
14 May, 2015
1 commit
-
Currently vlan notifier handler will try to update all vlans
for a device when that device comes up. A problem occurs,
however, when the vlan device was set to promiscuous, but not
by the user (ex: a bridge). In that case, dev->gflags are
not updated. What results is that the lower device ends
up with an extra promiscuity count. Here are the
backtraces that prove this:
[62852.052179] [] __dev_set_promiscuity+0x38/0x1e0
[62852.052186] [] ? _raw_spin_unlock_bh+0x1b/0x40
[62852.052188] [] ? dev_set_rx_mode+0x2e/0x40
[62852.052190] [] dev_set_promiscuity+0x24/0x50
[62852.052194] [] vlan_dev_open+0xd5/0x1f0 [8021q]
[62852.052196] [] __dev_open+0xbf/0x140
[62852.052198] [] __dev_change_flags+0x9d/0x170
[62852.052200] [] dev_change_flags+0x29/0x60

The above comes from setting the vlan device to IFF_UP state.

[62852.053569] [] __dev_set_promiscuity+0x38/0x1e0
[62852.053571] [] ? vlan_dev_set_rx_mode+0x2b/0x30 [8021q]
[62852.053573] [] __dev_change_flags+0xe5/0x170
[62852.053645] [] dev_change_flags+0x29/0x60
[62852.053647] [] vlan_device_event+0x18a/0x690 [8021q]
[62852.053649] [] notifier_call_chain+0x4c/0x70
[62852.053651] [] raw_notifier_call_chain+0x16/0x20
[62852.053653] [] call_netdevice_notifiers+0x2d/0x60
[62852.053654] [] __dev_notify_flags+0x33/0xa0
[62852.053656] [] dev_change_flags+0x52/0x60
[62852.053657] [] do_setlink+0x397/0xa40

And this one comes from the notification code. What we end
up with is a vlan with a promiscuity count of 1 and a physical
device with a promiscuity count of 2. They should both have
a count of 1.

To resolve this issue, vlan code can use the dev_get_flags() API,
which correctly masks promiscuity and allmulti flags.

Signed-off-by: Vlad Yasevich
Signed-off-by: David S. Miller
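
The flag masking that the commit above relies on can be sketched as follows. This is a simplified model, not the kernel's dev_get_flags(): `struct dev`, `set_promiscuity` and `get_flags` are hypothetical, and only promiscuous and allmulti handling is modelled. The idea is that a promiscuous request made from inside the kernel (for example by a bridge) shows up in `flags` but not in `gflags`, so code that re-applies flags should start from the masked view rather than from the raw `flags`.

```c
#include <assert.h>

#define IFF_UP       0x1
#define IFF_PROMISC  0x100
#define IFF_ALLMULTI 0x200

struct dev {
	unsigned int flags;	/* effective flags, incl. kernel-internal promisc */
	unsigned int gflags;	/* promisc/allmulti as requested by user space */
	int promiscuity;	/* reference count of promiscuous requests */
};

/* Kernel-internal request, e.g. from a bridge: gflags is not touched. */
static void set_promiscuity(struct dev *d, int inc)
{
	d->promiscuity += inc;
	if (d->promiscuity > 0)
		d->flags |= IFF_PROMISC;
	else
		d->flags &= ~IFF_PROMISC;
}

/* Masked view: report promisc/allmulti only if user space asked for it. */
static unsigned int get_flags(const struct dev *d)
{
	return (d->flags & ~(IFF_PROMISC | IFF_ALLMULTI)) |
	       (d->gflags & (IFF_PROMISC | IFF_ALLMULTI));
}
```

Feeding the raw `flags` back into a flag-change path would treat the bridge's request as a new one and bump the count again, which is the extra promiscuity count the commit message shows.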
03 Apr, 2015
1 commit
-
Don't use dev->iflink anymore.
CC: Patrick McHardy
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
30 Mar, 2015
1 commit
-
Stacked vlan devices currently have few features (GRO, HIGHDMA, LLTX).
Since we have software fallbacks in case the NIC can not handle some
features for multiple vlans, we can add the same features as the lower
vlan devices for stacked vlan devices.

This allows stacked vlan devices to create large (GSO) packets and not to
segment packets. Those packets will be segmented by software on the real
device, or even can be segmented by the NIC once TSO for multiple vlans
becomes enabled by the following patches.

The exception is those related to FCoE, which does not have a software
fallback.

Signed-off-by: Toshiaki Makita
Signed-off-by: David S. Miller
19 Mar, 2015
1 commit
-
When a networking device is taken down that has a non-trivial number
of VLAN devices configured under it, we eat a full synchronize_net()
for every such VLAN device.

This is because of the call chain:

NETDEV_DOWN notifier
--> vlan_device_event()
--> dev_change_flags()
--> __dev_change_flags()
--> __dev_close()
--> __dev_close_many()
--> dev_deactivate_many()
--> synchronize_net()

This is kind of ridiculous because we already have infrastructure for
batching doing operation X to a list of net devices so that we only
incur one sync.

So make use of that by exporting dev_close_many() and adjusting its
interface so that the caller can fully manage the batch list. Use
this in vlan_device_event() and all the overhead goes away.

Reported-by: Salam Noureddine
Signed-off-by: David S. Miller
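
The batching idea in the commit above can be modelled with a counter in place of the expensive wait. This is a sketch only: `expensive_sync`, `close_one` and `close_many` are hypothetical stand-ins, and the real dev_close_many() does considerably more than flipping a flag.

```c
#include <assert.h>
#include <stddef.h>

static unsigned int sync_calls;	/* counts the expensive waits */

/* Stand-in for synchronize_net(). */
static void expensive_sync(void)
{
	sync_calls++;
}

struct dev {
	int up;
};

/* One device at a time: one sync per device. */
static void close_one(struct dev *d)
{
	d->up = 0;
	expensive_sync();
}

/* Batched: mark every device down, then sync once for the whole list. */
static void close_many(struct dev **list, size_t n)
{
	for (size_t i = 0; i < n; i++)
		list[i]->up = 0;
	if (n)
		expensive_sync();
}
```

For N VLAN devices the per-device path pays N waits, while the batched path pays one regardless of N.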
04 Mar, 2015
1 commit
-
Use the built-in function instead of memset.
Signed-off-by: Joe Perches
Signed-off-by: David S. Miller
03 Mar, 2015
1 commit
-
Now that there are no more users, kill dev_rebuild_header and all of its
implementations.

This is long overdue.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
24 Jan, 2015
1 commit
-
Assign rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is
added to rtnetlink messages.

Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
14 Jan, 2015
1 commit
-
The same macros are used for rx as well. So rename it.
Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller
12 Dec, 2014
1 commit
-
Since the real device can segment packets by software, a vlan device
can set TSO/UFO even when the real device doesn't have those features.
Unlike GSO, this allows packets to be segmented after Qdisc.

Signed-off-by: Toshiaki Makita
Signed-off-by: David S. Miller
22 Nov, 2014
2 commits
-
Commit a6111d3c "vlan: Pass SIOC[SG]HWTSTAMP ioctls to real device"
intended to enable hardware time stamping on VLAN interfaces, but
passing SIOCSHWTSTAMP is only half of the story. This patch adds
the second half, by letting user space find out the time stamping
capabilities of the device backing a VLAN interface.

Signed-off-by: Richard Cochran
Signed-off-by: David S. Miller
-
Always returns the same skb it gets, so change to void.
Signed-off-by: Jiri Pirko
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
08 Oct, 2014
1 commit
-
Testing xmit_more support with netperf and connected UDP sockets,
I found strange dst refcount false sharing.

Current handling of IFF_XMIT_DST_RELEASE is not optimal.

Dropping dst in validate_xmit_skb() is certainly too late in case
packet was queued by cpu X but dequeued by cpu Y

The logical point to take care of drop/force is in __dev_queue_xmit()
before even taking qdisc lock.

As Julian Anastasov pointed out, need for skb_dst() might come from some
packet schedulers or classifiers.

This patch adds new helper to cleanly express needs of various drivers
or qdiscs/classifiers.

Drivers that need skb_dst() in their ndo_start_xmit() should call
following helper in their setup instead of the prior :

dev->priv_flags &= ~IFF_XMIT_DST_RELEASE;
->
netif_keep_dst(dev);

Instead of using a single bit, we use two bits, one being
eventually rebuilt in bonding/team drivers.

The other one, is permanent and blocks IFF_XMIT_DST_RELEASE being
rebuilt in bonding/team. Eventually, we could add something
smarter later.

Signed-off-by: Eric Dumazet
Cc: Julian Anastasov
Signed-off-by: David S. Miller
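
The two-bit scheme described in the commit above can be sketched like this. It is one reading of the commit message, not the kernel code: the bit values, `struct dev`, `keep_dst` and `rebuild_release` are hypothetical. One bit is the working flag that a master device may recompute, the other is a permanent permission; clearing both means a later rebuild can never turn the working flag back on.

```c
#include <assert.h>

/* Bit values are placeholders for this sketch, not the kernel's. */
#define XMIT_DST_RELEASE      (1u << 0)	/* may be rebuilt by a master */
#define XMIT_DST_RELEASE_PERM (1u << 1)	/* permanent permission */

struct dev {
	unsigned int priv_flags;
};

/* Driver needs the dst in its transmit path: clear both bits. */
static void keep_dst(struct dev *d)
{
	d->priv_flags &= ~(XMIT_DST_RELEASE | XMIT_DST_RELEASE_PERM);
}

/* Master device recomputing the working bit from its lower devices. */
static void rebuild_release(struct dev *master, int all_lowers_release)
{
	master->priv_flags &= ~XMIT_DST_RELEASE;
	if (all_lowers_release && (master->priv_flags & XMIT_DST_RELEASE_PERM))
		master->priv_flags |= XMIT_DST_RELEASE;
}
```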
12 Aug, 2014
1 commit
-
Currently the functionality to untag traffic on input resides
as part of the vlan module and is built only when VLAN support
is enabled in the kernel. When VLAN is disabled, the function
vlan_untag() turns into a stub and doesn't really untag the
packets. This seems to create an interesting interaction
between VMs supporting checksum offloading and some network drivers.

There are some drivers that do not allow the user to change
tx-vlan-offload feature of the driver. These drivers also seem
to assume that any VLAN-tagged traffic they transmit will
have the vlan information in the vlan_tci and not in the vlan
header already in the skb. When transmitting skbs that already
have tagged data with partial checksum set, the checksum doesn't
appear to be updated correctly by the card thus resulting in a
failure to establish TCP connections.

The following is a packet trace taken on the receiver where a
sender is a VM with a VLAN configured. The host the VM is running on
does not have VLAN support and the outgoing interface on the
host is tg3:
10:12:43.503055 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27243,
offset 0, flags [DF], proto TCP (6), length 60)
10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-> 0x48d9), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294837885 ecr 0,nop,wscale 7], length 0
10:12:44.505556 52:54:00:ae:42:3f > 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27244,
offset 0, flags [DF], proto TCP (6), length 60)
10.0.100.1.58545 > 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-> 0x44ee), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294838888 ecr 0,nop,wscale 7], length 0

This connection finally times out.

I've only access to the TG3 hardware in this configuration thus have
only tested this with TG3 driver. There are a lot of other drivers
that do not permit user changes to vlan acceleration features, and
I don't know if they all suffer from a similar issue.

The patch attempts to fix this another way. It moves the vlan header
stripping code out of the vlan module and always builds it into the
kernel network core. This way, even if vlan is not supported on
a virtualization host, the virtual machines running on top of such
host will still work with VLANs enabled.

CC: Patrick McHardy
CC: Nithin Nayak Sujir
CC: Michael Chan
CC: Jiri Pirko
Signed-off-by: Vladislav Yasevich
Acked-by: Jiri Pirko
Signed-off-by: David S. Miller
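
What "untagging" means at the frame level can be sketched on a plain byte buffer. This is an illustration under assumptions, not the kernel's skb-based code: the function name `untag` and its signature are hypothetical. It recognises an 802.1Q tag after the two MAC addresses, saves the TCI (the value drivers expect to find in vlan_tci) and removes the 4 tag bytes from the frame.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN    6		/* length of one MAC address */
#define VLAN_HLEN   4		/* length of one 802.1Q tag */
#define ETH_P_8021Q 0x8100	/* 802.1Q ethertype */

/*
 * If the frame carries an 802.1Q tag, remove it in place, store the TCI
 * in *tci, set *tagged to 1 and return the new length. Otherwise leave
 * the frame alone, set *tagged to 0 and return len unchanged.
 */
static size_t untag(uint8_t *frame, size_t len, uint16_t *tci, int *tagged)
{
	const size_t tag_off = 2 * ETH_ALEN;	/* tag follows dst + src MAC */
	uint16_t proto;

	*tagged = 0;
	if (len < tag_off + VLAN_HLEN + 2)
		return len;	/* too short for a tag plus inner ethertype */

	proto = (uint16_t)((frame[tag_off] << 8) | frame[tag_off + 1]);
	if (proto != ETH_P_8021Q)
		return len;

	*tci = (uint16_t)((frame[tag_off + 2] << 8) | frame[tag_off + 3]);
	*tagged = 1;

	/* Close the 4-byte gap: inner ethertype and payload move forward. */
	memmove(frame + tag_off, frame + tag_off + VLAN_HLEN,
		len - tag_off - VLAN_HLEN);
	return len - VLAN_HLEN;
}
```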
30 Jul, 2014
1 commit
-
Similarly, vlan will create /proc/net/vlan/, so when we
create dev with name "config", it will conflict with
/proc/net/vlan/config.

Reported-by: Stephane Chazelas
Cc: "David S. Miller"
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
17 Jul, 2014
1 commit
-
Signed-off-by: David S. Miller
16 Jul, 2014
1 commit
-
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
all users to pass NET_NAME_UNKNOWN.

Coccinelle patch:

@@
expression sizeof_priv, name, setup, txqs, rxqs, count;
@@

(
-alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
+alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
|
-alloc_netdev_mq(sizeof_priv, name, setup, count)
+alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
|
-alloc_netdev(sizeof_priv, name, setup)
+alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
)

v9: move comments here from the wrong commit

Signed-off-by: Tom Gundersen
Reviewed-by: David Herrmann
Signed-off-by: David S. Miller
08 Jul, 2014
1 commit
-
This allows applications to enable hardware timestamping without being aware
of it being a vlan device and figuring out the real device.

Signed-off-by: Stefan Sørensen
Signed-off-by: David S. Miller