21 Sep, 2016
2 commits
-
since commit commit db74a3335e0f6 ("openvswitch: use percpu
flow stats") flow alloc resets flow-key. So there is no need
to reset the flow-key again if OVS is using newly allocated
flow-key.Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller -
There is no need to declare separate key on stack,
we can just use sw_flow->key to store the key directly.This commit fixes following warning:
net/openvswitch/datapath.c: In function ‘ovs_flow_cmd_new’:
net/openvswitch/datapath.c:1080:1: warning: the frame size of 1040 bytes
is larger than 1024 bytes [-Wframe-larger-than=]Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller
11 Sep, 2016
1 commit
-
When userspace tries to create datapaths and the module is not loaded,
it will simply fail. With this patch, the module will be automatically
loaded.Signed-off-by: Thadeu Lima de Souza Cascardo
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
23 Jun, 2016
1 commit
-
The commit f2a4d086ed4c ("openvswitch: Add packet truncation support.")
introduces packet truncation before sending to userspace upcall receiver.
This patch passes up the skb->len before truncation so that the upcall
receiver knows the original packet size. Potentially this will be used
by sFlow, where OVS translates sFlow config header=N to a sample action,
truncating packet to N byte in kernel datapath. Thus, only N bytes instead
of full-packet size is copied from kernel to userspace, saving the
kernel-to-userspace bandwidth.Signed-off-by: William Tu
Cc: Pravin Shelar
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
11 Jun, 2016
1 commit
-
The patch adds a new OVS action, OVS_ACTION_ATTR_TRUNC, in order to
truncate packets. A 'max_len' is added for setting up the maximum
packet size, and a 'cutlen' field is to record the number of bytes
to trim the packet when the packet is outputting to a port, or when
the packet is sent to userspace.Signed-off-by: William Tu
Cc: Pravin Shelar
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
27 Apr, 2016
1 commit
-
I also fix commit 8b32ab9e6ef1: use nla_total_size_64bit() for
OVS_FLOW_ATTR_USED in ovs_flow_cmd_msg_size().Fixes: 8b32ab9e6ef1 ("ovs: use nla_put_u64_64bit()")
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
26 Apr, 2016
1 commit
-
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
14 Mar, 2016
1 commit
-
When we want to change a flow using netlink, we have to identify it to
be able to perform a lookup. Both the flow key and unique flow ID
(ufid) are valid identifiers, but we always have to specify the flow
key in the netlink message. When both attributes are there, the ufid
is used. The flow key is used to validate the actions provided by
the userland.This commit allows to use the ufid without having to provide the flow
key, as it is already done in the netlink 'flow get' and 'flow del'
path. The flow key remains mandatory when an action is provided.Signed-off-by: Samuel Gauthier
Reviewed-by: Simon Horman
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
02 Mar, 2016
1 commit
-
This patch implements bookkeeping support to compute the maximum
headroom for all the devices in each datapath. When said value
changes, the underlying devs are notified via the
ndo_set_rx_headroom method.This also increases the internal vports xmit performance.
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
19 Feb, 2016
2 commits
-
This reverts commit bb9b18fb55b0 ("genl: Add genlmsg_new_unicast() for
unicast message allocation")'.Nothing wrong with it; its no longer needed since this was only for
mmapped netlink support.Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller -
revert commit 795449d8b846 ("openvswitch: Enable memory mapped Netlink i/o").
Following the mmaped netlink removal this code can be removed.Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller
11 Feb, 2016
1 commit
-
Operations with the GENL_ADMIN_PERM flag fail permissions checks because
this flag means we call netlink_capable, which uses the init user ns.Instead, let's introduce a new flag, GENL_UNS_ADMIN_PERM for operations
which should be allowed inside a user namespace.The motivation for this is to be able to run openvswitch in unprivileged
containers. I've tested this and it seems to work, but I really have no
idea about the security consequences of this patch, so thoughts would be
much appreciated.v2: use the GENL_UNS_ADMIN_PERM flag instead of a check in each function
v3: use separate ifs for UNS_ADMIN_PERM and ADMIN_PERM, instead of one
massive oneReported-by: James Page
Signed-off-by: Tycho Andersen
CC: Eric Biederman
CC: Pravin Shelar
CC: Justin Pettit
CC: "David S. Miller"
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
16 Jan, 2016
1 commit
-
Skb_gso_segment() uses skb control block during segmentation.
This patch adds 32-bytes room for previous control block which
will be copied into all resulting segments.This patch fixes kernel crash during fragmenting forwarded packets.
Fragmentation requires valid IP CB in skb for clearing ip options.
Also patch removes custom save/restore in ovs code, now it's redundant.Signed-off-by: Konstantin Khlebnikov
Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com
Signed-off-by: David S. Miller
08 Nov, 2015
1 commit
-
Pull trivial updates from Jiri Kosina:
"Trivial stuff from trivial tree that can be trivially summed up as:- treewide drop of spurious unlikely() before IS_ERR() from Viresh
Kumar- cosmetic fixes (that don't really affect basic functionality of the
driver) for pktcdvd and bcache, from Julia Lawall and Petr Mladek- various comment / printk fixes and updates all over the place"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
bcache: Really show state of work pending bit
hwmon: applesmc: fix comment typos
Kconfig: remove comment about scsi_wait_scan module
class_find_device: fix reference to argument "match"
debugfs: document that debugfs_remove*() accepts NULL and error values
net: Drop unlikely before IS_ERR(_OR_NULL)
mm: Drop unlikely before IS_ERR(_OR_NULL)
fs: Drop unlikely before IS_ERR(_OR_NULL)
drivers: net: Drop unlikely before IS_ERR(_OR_NULL)
drivers: misc: Drop unlikely before IS_ERR(_OR_NULL)
UBI: Update comments to reflect UBI_METAONLY flag
pktcdvd: drop null test before destroy functions
24 Oct, 2015
1 commit
-
Conflicts:
net/ipv6/xfrm6_output.c
net/openvswitch/flow_netlink.c
net/openvswitch/vport-gre.c
net/openvswitch/vport-vxlan.c
net/openvswitch/vport.c
net/openvswitch/vport.hThe openvswitch conflicts were overlapping changes. One was
the egress tunnel info fix in 'net' and the other was the
vport ->send() op simplification in 'net-next'.The xfrm6_output.c conflicts was also a simplification
overlapping a bug fix.Signed-off-by: David S. Miller
23 Oct, 2015
1 commit
-
While transitioning to netdev based vport we broke OVS
feature which allows user to retrieve tunnel packet egress
information for lwtunnel devices. Following patch fixes it
by introducing ndo operation to get the tunnel egress info.
Same ndo operation can be used for lwtunnel devices and compat
ovs-tnl-vport devices. So after adding such device operation
we can remove similar operation from ovs-vport.Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device").
Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller
29 Sep, 2015
1 commit
-
IS_ERR(_OR_NULL) already contain an 'unlikely' compiler flag and there
is no need to do that again from its callers. Drop it.Acked-by: Neil Horman
Signed-off-by: Viresh Kumar
Signed-off-by: Jiri Kosina
27 Sep, 2015
1 commit
-
Conflicts:
net/ipv4/arp.cThe net/ipv4/arp.c conflict was one commit adding a new
local variable while another commit was deleting one.Signed-off-by: David S. Miller
25 Sep, 2015
1 commit
-
The genl_notify function has too many arguments for no real reason - all
callers use genl_info to get them anyway. Just pass the genl_info down to
genl_notify.Signed-off-by: Jiri Benc
Signed-off-by: David S. Miller
23 Sep, 2015
1 commit
-
When support for megaflows was introduced, OVS needed to start
installing flows with a mask applied to them. Since masking is an
expensive operation, OVS also had an optimization that would only
take the parts of the flow keys that were covered by a non-zero
mask. The values stored in the remaining pieces should not matter
because they are masked out.While this works fine for the purposes of matching (which must always
look at the mask), serialization to netlink can be problematic. Since
the flow and the mask are serialized separately, the uninitialized
portions of the flow can be encoded with whatever values happen to be
present.In terms of functionality, this has little effect since these fields
will be masked out by definition. However, it leaks kernel memory to
userspace, which is a potential security vulnerability. It is also
possible that other code paths could look at the masked key and get
uninitialized data, although this does not currently appear to be an
issue in practice.This removes the mask optimization for flows that are being installed.
This was always intended to be the case as the mask optimizations were
really targetting per-packet flow operations.Fixes: 03f0d916 ("openvswitch: Mega flow implementation")
Signed-off-by: Jesse Gross
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
01 Sep, 2015
1 commit
-
Currently tun-info options pointer is used in few cases to
pass options around. But tunnel options can be accessed using
ip_tunnel_info_opts() API without using the pointer. Following
patch removes the redundant pointer and consistently make use
of API.Signed-off-by: Pravin B Shelar
Acked-by: Thomas Graf
Reviewed-by: Jesse Gross
Signed-off-by: David S. Miller
30 Aug, 2015
1 commit
-
tun info is passed using skb-dst pointer. Now we have
converted all vports to netdev based implementation so
Now we can remove redundant pointer to tun-info from OVS_CB.Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller
28 Aug, 2015
3 commits
-
Allow matching and setting the ct_label field. As with ct_mark, this is
populated by executing the CT action. The label field may be modified by
specifying a label and mask nested under the CT action. It is stored as
metadata attached to the connection. Label modification occurs after
lookup, and will only persist when the conntrack entry is committed by
providing the COMMIT flag to the CT action. Labels are currently fixed
to 128 bits in size.Signed-off-by: Joe Stringer
Acked-by: Thomas Graf
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller -
Expose the kernel connection tracker via OVS. Userspace components can
make use of the CT action to populate the connection state (ct_state)
field for a flow. This state can be subsequently matched.Exposed connection states are OVS_CS_F_*:
- NEW (0x01) - Beginning of a new connection.
- ESTABLISHED (0x02) - Part of an existing connection.
- RELATED (0x04) - Related to an established connection.
- INVALID (0x20) - Could not track the connection for this packet.
- REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
- TRACKED (0x80) - This packet has been sent through conntrack.When the CT action is executed by itself, it will send the packet
through the connection tracker and populate the ct_state field with one
or more of the connection state flags above. The CT action will always
set the TRACKED bit.When the COMMIT flag is passed to the conntrack action, this specifies
that information about the connection should be stored. This allows
subsequent packets for the same (or related) connections to be
correlated with this connection. Sending subsequent packets for the
connection through conntrack allows the connection tracker to consider
the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.The CT action may optionally take a zone to track the flow within. This
allows connections with the same 5-tuple to be kept logically separate
from connections in other zones. If the zone is specified, then the
"ct_zone" match field will be subsequently populated with the zone id.IP fragments are handled by transparently assembling them as part of the
CT action. The maximum received unit (MRU) size is tracked so that
refragmentation can occur during output.IP frag handling contributed by Andy Zhou.
Based on original design by Justin Pettit.
Signed-off-by: Joe Stringer
Signed-off-by: Justin Pettit
Signed-off-by: Andy Zhou
Acked-by: Thomas Graf
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller -
Previously, we used the kernel-internal netlink actions length to
calculate the size of messages to serialize back to userspace.
However,the sw_flow_actions may not be formatted exactly the same as the
actions on the wire, so store the original actions length when
de-serializing and re-use the original length when serializing.Signed-off-by: Joe Stringer
Acked-by: Pravin B Shelar
Acked-by: Thomas Graf
Signed-off-by: David S. Miller
22 Jul, 2015
3 commits
-
This allows to get rid of the get_name() vport ops later on.
Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller -
This is the first step in representing all OVS vports as regular
struct net_devices. Move the net_device pointer into the vport
structure itself to get rid of struct vport_netdev.Signed-off-by: Thomas Graf
Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller -
Utilize the new metadata dst to attach encapsulation instructions to
the skb. The existing egress_tun_info via the OVS_CB() is left in
place until all tunnel vports have been converted to the new method.Signed-off-by: Thomas Graf
Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller
02 Jun, 2015
1 commit
-
If new optional attribute OVS_USERSPACE_ATTR_ACTIONS is added to an
OVS_ACTION_ATTR_USERSPACE action, then include the datapath actions
in the upcall.This Directly associates the sampled packet with the path it takes
through the virtual switch. Path information currently includes mangling,
encapsulation and decapsulation actions for tunneling protocols GRE,
VXLAN, Geneve, MPLS and QinQ, but this extension requires no further
changes to accommodate datapath actions that may be added in the
future.Adding path information enhances visibility into complex virtual
networks.Signed-off-by: Neil McKee
Signed-off-by: David S. Miller
06 May, 2015
1 commit
-
Replace "ntohs(proto) >= ETH_P_802_3_MIN" w/ eth_proto_is_802_3(proto).
Signed-off-by: Alexander Duyck
Signed-off-by: David S. Miller
13 Mar, 2015
1 commit
-
hold_net and release_net were an idea that turned out to be useless.
The code has been disabled since 2008. Kill the code it is long past due.Signed-off-by: "Eric W. Biederman"
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
21 Feb, 2015
1 commit
-
Open vSwitch allows moving internal vport to different namespace
while still connected to the bridge. But when namespace deleted
OVS does not detach these vports, that results in dangling
pointer to netdevice which causes kernel panic as follows.
This issue is fixed by detaching all ovs ports from the deleted
namespace at net-exit.BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [] ovs_vport_locate+0x35/0x80 [openvswitch]
Oops: 0000 [#1] SMP
Call Trace:
[] lookup_vport+0x21/0xd0 [openvswitch]
[] ovs_vport_cmd_get+0x59/0xf0 [openvswitch]
[] genl_family_rcv_msg+0x1bc/0x3e0
[] genl_rcv_msg+0x79/0xc0
[] netlink_rcv_skb+0xb9/0xe0
[] genl_rcv+0x2c/0x40
[] netlink_unicast+0x12d/0x1c0
[] netlink_sendmsg+0x34a/0x6b0
[] sock_sendmsg+0xa0/0xe0
[] ___sys_sendmsg+0x408/0x420
[] __sys_sendmsg+0x51/0x90
[] SyS_sendmsg+0x12/0x20
[] system_call_fastpath+0x12/0x17Reported-by: Assaf Muller
Fixes: 46df7b81454("openvswitch: Add support for network namespaces.")
Signed-off-by: Pravin B Shelar
Reviewed-by: Thomas Graf
Signed-off-by: David S. Miller
27 Jan, 2015
2 commits
-
Previously, flows were manipulated by userspace specifying a full,
unmasked flow key. This adds significant burden onto flow
serialization/deserialization, particularly when dumping flows.This patch adds an alternative way to refer to flows using a
variable-length "unique flow identifier" (UFID). At flow setup time,
userspace may specify a UFID for a flow, which is stored with the flow
and inserted into a separate table for lookup, in addition to the
standard flow table. Flows created using a UFID must be fetched or
deleted using the UFID.All flow dump operations may now be made more terse with OVS_UFID_F_*
flags. For example, the OVS_UFID_F_OMIT_KEY flag allows responses to
omit the flow key from a datapath operation if the flow has a
corresponding UFID. This significantly reduces the time spent assembling
and transacting netlink messages. With all OVS_UFID_F_OMIT_* flags
enabled, the datapath only returns the UFID and statistics for each flow
during flow dump, increasing ovs-vswitchd revalidator performance by 40%
or more.Signed-off-by: Joe Stringer
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller -
Refactor the ovs_nla_fill_match() function into separate netlink
serialization functions ovs_nla_put_{unmasked_key,mask}(). Modify
ovs_nla_put_flow() to handle attribute nesting and expose the 'is_mask'
parameter - all callers need to nest the flow, and callers have better
knowledge about whether it is serializing a mask or not.Signed-off-by: Joe Stringer
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
18 Jan, 2015
1 commit
-
Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.This makes the very common pattern of
if (genlmsg_end(...) < 0) { ... }
be a whole bunch of dead code. Many places also simply do
return nlmsg_end(...);
and the caller is expected to deal with it.
This also commonly (at least for me) causes errors, because it is very
common to writeif (my_function(...))
/* error condition */and if my_function() does "return nlmsg_end()" this is of course wrong.
Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did- return nlmsg_end(...);
+ nlmsg_end(...);
+ return 0;I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with < 0 with no change in behaviour, so I opted for the more
efficient version.One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for
Signed-off-by: David S. Miller
15 Jan, 2015
2 commits
-
Conflicts:
drivers/net/xen-netfront.cMinor overlapping changes in xen-netfront.c, mostly to do
with some buffer management changes alongside the split
of stats into TX and RX.Signed-off-by: David S. Miller
-
User space is currently sending a OVS_FLOW_ATTR_PROBE for both flow
and packet messages. This leads to an out-of-bounds access in
ovs_packet_cmd_execute() because OVS_FLOW_ATTR_PROBE >
OVS_PACKET_ATTR_MAX.Introduce a new OVS_PACKET_ATTR_PROBE with the same numeric value
as OVS_FLOW_ATTR_PROBE to grow the range of accepted packet attributes
while maintaining to be binary compatible with existing OVS binaries.Fixes: 05da589 ("openvswitch: Add support for OVS_FLOW_ATTR_PROBE.")
Reported-by: Sander Eikelenboom
Tracked-down-by: Florian Westphal
Signed-off-by: Thomas Graf
Reviewed-by: Jesse Gross
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller
14 Jan, 2015
1 commit
-
The same macros are used for rx as well. So rename it.
Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller
27 Dec, 2014
1 commit
-
There's no point to force the caller to know about the internal
genl_sock to use inside struct net, just have them pass the network
namespace. This doesn't really change code generation since it's
an inline, but makes the caller less magic - there's never any
reason to pass another socket.Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller
22 Nov, 2014
1 commit
-
Conflicts:
drivers/net/ieee802154/fakehard.cA bug fix went into 'net' for ieee802154/fakehard.c, which is removed
in 'net-next'.Add build fix into the merge from Stephen Rothwell in openvswitch, the
logging macros take a new initial 'log' argument, a new call was added
in 'net' so when we merge that in here we have to explicitly add the
new 'log' arg to it else the build fails.Signed-off-by: David S. Miller