08 Nov, 2015

1 commit

  • Pull trivial updates from Jiri Kosina:
    "Trivial stuff from trivial tree that can be trivially summed up as:

    - treewide drop of spurious unlikely() before IS_ERR() from Viresh
    Kumar

    - cosmetic fixes (that don't really affect basic functionality of the
    driver) for pktcdvd and bcache, from Julia Lawall and Petr Mladek

    - various comment / printk fixes and updates all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
    bcache: Really show state of work pending bit
    hwmon: applesmc: fix comment typos
    Kconfig: remove comment about scsi_wait_scan module
    class_find_device: fix reference to argument "match"
    debugfs: document that debugfs_remove*() accepts NULL and error values
    net: Drop unlikely before IS_ERR(_OR_NULL)
    mm: Drop unlikely before IS_ERR(_OR_NULL)
    fs: Drop unlikely before IS_ERR(_OR_NULL)
    drivers: net: Drop unlikely before IS_ERR(_OR_NULL)
    drivers: misc: Drop unlikely before IS_ERR(_OR_NULL)
    UBI: Update comments to reflect UBI_METAONLY flag
    pktcdvd: drop null test before destroy functions

    Linus Torvalds
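
The unlikely() wrappers are redundant because the kernel's IS_ERR() helpers already carry the branch hint internally. The following is a minimal user-space sketch of that idea, not the kernel headers themselves: the macro and function names mirror the kernel's, lookup_status() is a hypothetical caller, and __builtin_expect assumes GCC or Clang.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095
#define unlikely(x) __builtin_expect(!!(x), 0)

/* The branch hint lives inside the helper, so callers need not repeat it. */
#define IS_ERR_VALUE(x) unlikely((uintptr_t)(x) >= (uintptr_t)-MAX_ERRNO)

/* Encode a negative errno in the top page of the address space. */
void *ERR_PTR(long error) { return (void *)(intptr_t)error; }

/* Recover the errno from an error pointer. */
long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }

int IS_ERR(const void *ptr) { return IS_ERR_VALUE(ptr); }

/* Hypothetical caller: a plain IS_ERR() test is already hinted as unlikely. */
int lookup_status(const void *obj)
{
    if (IS_ERR(obj))
        return (int)PTR_ERR(obj);
    return 0;
}
```

With these definitions, lookup_status(ERR_PTR(-ENOENT)) yields -ENOENT, while any ordinary object pointer yields 0.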
     

01 Nov, 2015

1 commit


28 Oct, 2015

2 commits

  • nf_ct_frag6_gather() makes a clone of each skb passed to it, and if the
    reassembly is successful, expects the caller to free all of the original
    skbs using nf_ct_frag6_consume_orig(). This call was previously missing,
    meaning that the original fragments were never freed (with the exception
    of the last fragment to arrive).

    Fix this by ensuring that all original fragments except for the last
    fragment are freed via nf_ct_frag6_consume_orig(). The last fragment
    will be morphed into the head, so it must not be freed yet. Furthermore,
    retain the ->next pointer for the head after skb_morph().

    Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
    Reported-by: Florian Westphal
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
     
  • If ip_defrag() returns an error other than -EINPROGRESS, then the skb is
    freed. When handle_fragments() passes this back up to
    do_execute_actions(), it will be freed again. Prevent this double free
    by never freeing the skb in do_execute_actions() for errors returned by
    ovs_ct_execute. Always free it in ovs_ct_execute() error paths instead.

    Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
    Reported-by: Florian Westphal
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
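
The ownership rule described above (the callee frees the buffer on every error path, so the caller must not) can be sketched in user space as follows. All names are hypothetical stand-ins, struct buf is not sk_buff, and the free counter exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a packet buffer; counts how often it is freed. */
struct buf { int *free_count; };

struct buf *buf_alloc(int *free_count)
{
    struct buf *b = malloc(sizeof(*b));
    if (b)
        b->free_count = free_count;
    return b;
}

void buf_free(struct buf *b)
{
    ++*b->free_count;
    free(b);
}

/* Model of the callee: it consumes the buffer on every error path. */
int ct_execute_model(struct buf *b, int simulated_err)
{
    if (simulated_err) {
        buf_free(b);
        return simulated_err;
    }
    return 0;
}

/* Model of the caller: after an error it must not touch the buffer again. */
int execute_actions_model(struct buf *b, int simulated_err)
{
    int err = ct_execute_model(b, simulated_err);

    if (err)
        return err;   /* already freed by the callee */
    buf_free(b);      /* success path: the caller consumes the buffer */
    return 0;
}
```

In both the success and the error case the buffer is freed exactly once, which is the invariant the fix restores.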
     

24 Oct, 2015

1 commit

  • Conflicts:
    net/ipv6/xfrm6_output.c
    net/openvswitch/flow_netlink.c
    net/openvswitch/vport-gre.c
    net/openvswitch/vport-vxlan.c
    net/openvswitch/vport.c
    net/openvswitch/vport.h

    The openvswitch conflicts were overlapping changes. One was
    the egress tunnel info fix in 'net' and the other was the
    vport ->send() op simplification in 'net-next'.

    The xfrm6_output.c conflict was also a simplification
    overlapping a bug fix.

    Signed-off-by: David S. Miller

    David S. Miller
     

23 Oct, 2015

1 commit

  • While transitioning to netdev-based vports, we broke the OVS
    feature that allows the user to retrieve tunnel packet egress
    information for lwtunnel devices. This patch fixes it
    by introducing an ndo operation to get the tunnel egress info.
    The same ndo operation can be used for lwtunnel devices and compat
    ovs-tnl-vport devices, so after adding this device operation
    we can remove the similar operation from ovs-vport.

    Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device").
    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     

22 Oct, 2015

6 commits

  • With the use of lwtunnel, we can call dev_queue_xmit() directly
    rather than calling the netdev vport send operation.
    This change makes the tunnel vport code a bit cleaner.

    Signed-off-by: Pravin B Shelar
    Acked-by: Thomas Graf
    Acked-by: Jiri Benc
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • This patch fixes the following sparse warning:
    net/openvswitch/flow_netlink.c:583:30: warning: incorrect type in assignment (different base types)
    net/openvswitch/flow_netlink.c:583:30: expected restricted __be16 [usertype] ipv4
    net/openvswitch/flow_netlink.c:583:30: got int

    Fixes: 6b26ba3a7d ("openvswitch: netlink attributes for IPv6 tunneling")
    Signed-off-by: Pravin B Shelar
    Acked-by: Thomas Graf
    Acked-by: Jiri Benc
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • If userspace provides a ct action with no nested mark or label, then the
    storage for these fields is zeroed. Later when actions are requested,
    such zeroed fields are serialized even though userspace didn't
    originally specify them. Fix the behaviour by ensuring that no action is
    serialized in this case, and reject actions where userspace attempts to
    set these fields with mask=0. This should make netlink marshalling
    consistent across deserialization/reserialization.

    Reported-by: Jarno Rajahalme
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Joe Stringer
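
A minimal sketch of the rule described above, assuming a hypothetical struct md_mark that stores a value and mask pair: a request with mask=0 is rejected at parse time, and a zeroed (never specified) field is skipped when the action is serialized back.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical storage for a mark value and the mask of bits to set. */
struct md_mark {
    uint32_t value;
    uint32_t mask;
};

/* Reject a request whose mask selects no bits at all. */
int parse_mark(uint32_t value, uint32_t mask, struct md_mark *out)
{
    if (mask == 0)
        return -EINVAL;
    out->value = value;
    out->mask = mask;
    return 0;
}

/* Emit the attribute only if userspace originally supplied it. */
bool should_serialize_mark(const struct md_mark *m)
{
    return m->mask != 0;
}
```

Because a stored mask of zero can now only mean "not specified", a dump of the action reproduces what userspace sent.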
     
  • New, related connections are marked as such as part of ovs_ct_lookup(),
    but they are not marked as "new" if the commit flag is used. Make this
    consistent by setting the "new" flag whenever !nf_ct_is_confirmed(ct).

    Reported-by: Jarno Rajahalme
    Signed-off-by: Joe Stringer
    Acked-by: Thomas Graf
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
     
  • Currently, 0-bits are generated in ct_state where the bit position is
    undefined, and matches are accepted on these bit-positions. If userspace
    requests to match the 0-value for this bit then it may expect only a
    subset of traffic to match this value, whereas currently all packets
    will have this bit set to 0. Fix this by rejecting such masks.

    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Joe Stringer
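
Rejecting masks that cover undefined bit positions comes down to one bitwise test. The sketch below assumes six defined state bits; the CS_F_* names and values are illustrative placeholders rather than the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Illustrative state bits; names and values are assumptions for this sketch. */
#define CS_F_NEW         0x01u
#define CS_F_ESTABLISHED 0x02u
#define CS_F_RELATED     0x04u
#define CS_F_REPLY_DIR   0x08u
#define CS_F_INVALID     0x10u
#define CS_F_TRACKED     0x20u

#define CS_F_SUPPORTED (CS_F_NEW | CS_F_ESTABLISHED | CS_F_RELATED | \
                        CS_F_REPLY_DIR | CS_F_INVALID | CS_F_TRACKED)

/* A mask may only select bit positions that have a defined meaning. */
int validate_ct_state_mask(uint32_t mask)
{
    if (mask & ~CS_F_SUPPORTED)
        return -EINVAL;
    return 0;
}
```

A mask of CS_F_NEW | CS_F_TRACKED passes, while a mask that includes bit 0x40 is refused instead of silently matching every packet.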
     
  • "openvswitch: Remove vport stats" removed the per-vport statistics, in
    order to use the netdev's statistics fields.
    "openvswitch: Fix ovs_vport_get_stats()" fixed the export of these stats
    to user-space, by using the provided netdev_ops to collate them - but ovs
    internal devices still use an unallocated dev->tstats field to count
    packets, which are no longer exported by this api.

    Allocate the dev->tstats field for ovs internal devices, and wire up
    ndo_get_stats64 with the original implementation of
    ovs_vport_get_stats().

    On its own, "openvswitch: Fix ovs_vport_get_stats()" fixes the OOPs,
    unmasking a full-on panic on arm64:

    =============%<==============
    [] internal_dev_recv+0xa8/0x170 [openvswitch]
    [] do_output.isra.31+0x60/0x19c [openvswitch]
    [] do_execute_actions+0x208/0x11c0 [openvswitch]
    [] ovs_execute_actions+0xc8/0x238 [openvswitch]
    [] ovs_packet_cmd_execute+0x21c/0x288 [openvswitch]
    [] genl_family_rcv_msg+0x1b0/0x310
    [] genl_rcv_msg+0xa4/0xe4
    [] netlink_rcv_skb+0xb0/0xdc
    [] genl_rcv+0x38/0x50
    [] netlink_unicast+0x164/0x210
    [] netlink_sendmsg+0x304/0x368
    [] sock_sendmsg+0x30/0x4c
    [SNIP]
    Kernel panic - not syncing: Fatal exception in interrupt
    =============%<==============

    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    James Morse
     

20 Oct, 2015

1 commit


19 Oct, 2015

1 commit

  • If OVS receives a packet from another namespace, then the packet should
    be scrubbed. However, people have already begun to rely on the behaviour
    that skb->mark is preserved across namespaces, so retain this one field.

    This is mainly to address information leakage between namespaces when
    using OVS internal ports, but by placing it in ovs_vport_receive() it is
    more generally applicable, meaning it should not be overlooked if other
    port types are allowed to be moved into namespaces in future.

    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Joe Stringer
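
The receive-path behaviour described above amounts to "save the mark, scrub, restore the mark". Here is a user-space sketch under stated assumptions: struct pkt_model and its fields are hypothetical, and scrub_model() merely stands in for the kernel's packet scrubbing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet with some state that must not cross namespaces. */
struct pkt_model {
    uint32_t mark;
    int has_dst;   /* cached route */
    int has_ct;    /* connection tracking reference */
};

/* Stand-in for a full scrub: drops everything, including the mark. */
void scrub_model(struct pkt_model *p)
{
    p->mark = 0;
    p->has_dst = 0;
    p->has_ct = 0;
}

/* Scrub only on a namespace crossing, but carry the mark across. */
void receive_model(struct pkt_model *p, int src_ns, int dst_ns)
{
    if (src_ns != dst_ns) {
        uint32_t mark = p->mark;

        scrub_model(p);
        p->mark = mark;
    }
}
```

A packet that stays inside one namespace is left untouched; one that crosses keeps only its mark.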
     

13 Oct, 2015

2 commits

  • The function nf_ct_frag6_gather is called on both the input and the
    output paths of the networking stack. In particular, ipv6_defrag, which
    calls nf_ct_frag6_gather, is called from both the PRE_ROUTING chain
    on input and the LOCAL_OUT chain on output.

    The addition of a net parameter makes it explicit which network
    namespace the packets are being reassembled in, and removes the need
    for nf_ct_frag6_gather to guess.

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • The function ip_defrag is called on both the input and the output
    paths of the networking stack. In particular, conntrack calls ip_defrag
    when it is tracking outbound packets from the local machine.

    So add a struct net parameter and stop making ip_defrag guess which
    network namespace it needs to defragment packets in.

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

07 Oct, 2015

7 commits

  • Previously, the CT_ATTR_FLAGS attribute, when nested under the
    OVS_ACTION_ATTR_CT, encoded a 32-bit bitmask of flags that modify the
    semantics of the ct action. It's more extensible to just represent each
    flag as a nested attribute, and this requires no additional error
    checking to reject flags that aren't currently supported.

    Suggested-by: Ben Pfaff
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
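
To illustrate why per-flag attributes need no extra validation: an unknown flag is simply an unknown attribute type, which the parser's default case already rejects. The sketch below models a nested attribute list as an array of type numbers; the enum and struct names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute types nested under a ct action. */
enum ct_attr_model {
    CT_ATTR_MODEL_UNSPEC,
    CT_ATTR_MODEL_COMMIT,   /* presence of the attribute is the flag */
    CT_ATTR_MODEL_ZONE
};

struct ct_info_model {
    bool commit;
    bool has_zone;
};

/* Walk the attribute types; anything unrecognised is rejected. */
int parse_ct_attrs(const uint16_t *types, size_t n, struct ct_info_model *info)
{
    size_t i;

    info->commit = false;
    info->has_zone = false;
    for (i = 0; i < n; i++) {
        switch (types[i]) {
        case CT_ATTR_MODEL_COMMIT:
            info->commit = true;
            break;
        case CT_ATTR_MODEL_ZONE:
            info->has_zone = true;
            break;
        default:
            return -EINVAL;   /* covers unsupported flags too */
        }
    }
    return 0;
}
```

With a bitmask, the parser would additionally need a check such as "flags & ~SUPPORTED"; here the existing default case does that work.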
     
  • The ct_state field was initially added as an 8-bit field; however, six of
    the bits are already being used, and use cases are already starting to
    appear that may push the limits of this field. This patch extends the
    field to 32 bits while retaining the internal representation of 8 bits.
    This should cover forward compatibility of the ABI for the foreseeable
    future.

    This patch also reorders the OVS_CS_F_* bits to be sequential.

    Suggested-by: Jarno Rajahalme
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
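
One way to picture a 32-bit ABI field backed by 8-bit internal storage is a checked narrowing at the boundary. This is only a sketch under that assumption; ct_state_from_user() is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Accept the 32-bit value used on the wire, but refuse anything that
 * would not survive the conversion to the 8-bit internal field.
 */
int ct_state_from_user(uint32_t user_state, uint8_t *state)
{
    if (user_state > UINT8_MAX)
        return -EINVAL;
    *state = (uint8_t)user_state;
    return 0;
}
```

The wire format can later gain bits above the low eight without changing the attribute size; only this conversion and the internal field would need to grow.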
     
  • Previously, if userspace specified ct_state bits in the flow key which
    are currently undefined (and therefore unsupported), then they would be
    ignored. This could cause unexpected behaviour in future if userspace is
    extended to support additional bits but attempts to communicate with the
    current version of the kernel. This patch rectifies the situation by
    rejecting such ct_state bits.

    Fixes: 7f8a436eaa2c "openvswitch: Add conntrack action"
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
     
  • The ct action uses parts of the flow key, so we need to ensure that it
    is valid before executing that action.

    Fixes: 7f8a436eaa2c "openvswitch: Add conntrack action"
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
     
  • If ovs_fragment() was unable to fragment the skb due to an L2 header
    that exceeds the supported length, skbs would be leaked. Fix the bug.

    Fixes: 7f8a436eaa2c "openvswitch: Add conntrack action"
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
     
  • Add netlink attributes for IPv6 tunnel addresses. This enables IPv6 support
    for tunnels.

    Signed-off-by: Jiri Benc
    Acked-by: Pravin B Shelar
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Jiri Benc
     
  • Store the tunnel protocol (AF_INET or AF_INET6) in sw_flow_key. This field
    now also acts as an indicator of whether the flow contains tunnel data (this
    was previously indicated by tun_key.u.ipv4.dst being set, but with IPv6
    addresses in a union with the IPv4 ones, this no longer works).

    The new field was added to a hole in sw_flow_key.

    Signed-off-by: Jiri Benc
    Acked-by: Pravin B Shelar
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Jiri Benc
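
The reason a separate protocol field is needed can be shown with a small model of the union. The struct names below are hypothetical and the layout is a simplification of the real key: once IPv6 addresses share storage with IPv4 ones, the bytes that used to hold the IPv4 destination may be zero for a perfectly valid IPv6 tunnel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Simplified model: IPv4 and IPv6 tunnel addresses share storage. */
struct tun_key_model {
    union {
        struct { struct in_addr src, dst; } ipv4;
        struct { struct in6_addr src, dst; } ipv6;
    } u;
};

struct flow_key_model {
    unsigned char tun_proto;   /* 0 = no tunnel, else AF_INET or AF_INET6 */
    struct tun_key_model tun_key;
};

/* Tunnel presence is decided by the protocol field, not by an address. */
bool flow_has_tunnel(const struct flow_key_model *key)
{
    return key->tun_proto != 0;
}
```

For example, with an IPv6 source of 2001:db8::1 the bytes overlapping u.ipv4.dst are all zero, so the old "ipv4.dst is set" test would wrongly report no tunnel, while flow_has_tunnel() reports one.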
     

05 Oct, 2015

3 commits

  • Not every device has dev->tstats set, so when OVS tries to calculate
    vport stats it causes a kernel panic. This patch fixes it by
    using the standard API to get net-device stats.

    ---8<---
    Call trace:
    [] ovs_vport_get_stats+0x150/0x1f8 [openvswitch]
    [] ovs_vport_cmd_fill_info+0x140/0x1e0 [openvswitch]
    [] ovs_vport_cmd_dump+0xbc/0x138 [openvswitch]
    [] netlink_dump+0xb8/0x258
    [] __netlink_dump_start+0x120/0x178
    [] genl_family_rcv_msg+0x2d4/0x308
    [] genl_rcv_msg+0x88/0xc4
    [] netlink_rcv_skb+0xd4/0x100
    [] genl_rcv+0x30/0x48
    [] netlink_unicast+0x154/0x200
    [] netlink_sendmsg+0x308/0x364
    [] sock_sendmsg+0x14/0x2c
    [] SyS_sendto+0xbc/0xf0
    Code: aa1603e1 f94037a4 aa1303e2 aa1703e0 (f9400465)

    Reported-by: Tomasz Sawicki
    Fixes: 8c876639c98 ("openvswitch: Remove vport stats.")
    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • When openvswitch tries to allocate memory from offline NUMA node 0:
    stats = kmem_cache_alloc_node(flow_stats_cache, GFP_KERNEL | __GFP_ZERO, 0)
    it hits VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
    [ recently replaced with VM_WARN_ON(!node_online(nid)) ] in linux/gfp.h.
    This patch disables NUMA affinity in this case.

    Signed-off-by: Konstantin Khlebnikov
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Konstantin Khlebnikov
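
Disabling NUMA affinity when node 0 is offline can be pictured as choosing the node id before allocating. In this sketch the online nodes are modelled as a bitmask and pick_stats_node() is a hypothetical helper; -1 is used as the "no preferred node" value, matching the kernel's NUMA_NO_NODE.

```c
#include <assert.h>

#define NO_NODE (-1)   /* "no preference", same value as NUMA_NO_NODE */

/*
 * Prefer node 0 for the default stats allocation, but only if it is
 * online. Bit n of online_mask set means node n is online.
 */
int pick_stats_node(unsigned long online_mask)
{
    return (online_mask & 1UL) ? 0 : NO_NODE;
}
```

On a machine where only node 1 is online (mask 0x2), the allocation would be made without a node preference instead of targeting the offline node 0.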
     
  • Conntrack LABELS (plural) are exposed by conntrack; rename the OVS name
    for these to be consistent with conntrack.

    Fixes: c2ac667 "openvswitch: Allow matching on conntrack label"
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
     

30 Sep, 2015

5 commits


29 Sep, 2015

1 commit


27 Sep, 2015

2 commits


25 Sep, 2015

1 commit

  • The genl_notify function has too many arguments for no real reason - all
    callers use genl_info to get them anyway. Just pass the genl_info down to
    genl_notify.

    Signed-off-by: Jiri Benc
    Signed-off-by: David S. Miller

    Jiri Benc
     

23 Sep, 2015

1 commit

  • When support for megaflows was introduced, OVS needed to start
    installing flows with a mask applied to them. Since masking is an
    expensive operation, OVS also had an optimization that would only
    take the parts of the flow keys that were covered by a non-zero
    mask. The values stored in the remaining pieces should not matter
    because they are masked out.

    While this works fine for the purposes of matching (which must always
    look at the mask), serialization to netlink can be problematic. Since
    the flow and the mask are serialized separately, the uninitialized
    portions of the flow can be encoded with whatever values happen to be
    present.

    In terms of functionality, this has little effect since these fields
    will be masked out by definition. However, it leaks kernel memory to
    userspace, which is a potential security vulnerability. It is also
    possible that other code paths could look at the masked key and get
    uninitialized data, although this does not currently appear to be an
    issue in practice.

    This removes the mask optimization for flows that are being installed.
    This was always intended to be the case as the mask optimizations were
    really targeting per-packet flow operations.

    Fixes: 03f0d916 ("openvswitch: Mega flow implementation")
    Signed-off-by: Jesse Gross
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Jesse Gross
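
The difference between the range-limited masking used per packet and the full masking needed at install time can be sketched with byte arrays. The function and struct names are hypothetical, and the key is modelled as a flat array of bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open byte range [start, end) covered by a non-zero mask. */
struct range_model {
    size_t start;
    size_t end;
};

/* Fast path: only bytes inside the range are written to dst. */
void mask_key_range(uint8_t *dst, const uint8_t *src, const uint8_t *mask,
                    struct range_model r)
{
    size_t i;

    for (i = r.start; i < r.end; i++)
        dst[i] = src[i] & mask[i];
}

/* Install path: mask the whole key so every byte of dst is defined. */
void mask_key_full(uint8_t *dst, const uint8_t *src, const uint8_t *mask,
                   size_t len)
{
    struct range_model whole = { 0, len };

    mask_key_range(dst, src, mask, whole);
}
```

After mask_key_range(), bytes outside the range keep whatever dst held before, which is what would later be serialized to userspace; mask_key_full() leaves no such bytes.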
     

19 Sep, 2015

1 commit


18 Sep, 2015

1 commit

  • Static code analysis reveals the following bug:

    net/openvswitch/conntrack.c:281 ovs_ct_helper()
    warn: unsigned 'protoff' is never less than zero.

    This signedness bug breaks error handling for IPv6 extension headers when
    using conntrack helpers. Fix the error by using a local signed variable.

    Fixes: cae3a2627520: "openvswitch: Allow attaching helpers to ct action"
    Reported-by: Dan Carpenter
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer
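
The signedness problem is easy to reproduce outside the kernel: a negative error code stored in an unsigned variable becomes a huge offset, so a "< 0" test on it can never fire. The sketch below shows the fixed pattern; skip_headers_model() and locate_payload() are hypothetical stand-ins for the header walker and its caller.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Stand-in for a header walker: returns an offset, or -1 on error. */
int skip_headers_model(unsigned int start, int malformed)
{
    return malformed ? -1 : (int)start + 8;
}

/*
 * Fixed pattern: keep the result in a signed local, test it, and only
 * then copy it into the unsigned offset.
 */
int locate_payload(unsigned int start, int malformed, unsigned int *protoff)
{
    int ofs = skip_headers_model(start, malformed);

    if (ofs < 0)
        return -EINVAL;
    *protoff = (unsigned int)ofs;
    return 0;
}
```

Assigning the walker's -1 directly to an unsigned int would yield UINT_MAX, which is why the original check was dead code.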
     

16 Sep, 2015

1 commit

  • Masks were added to OVS flows in a way that was backwards compatible
    with userspace programs that did not generate masks. As a result, it is
    possible that we may receive flows that do not have a mask and we need
    to synthesize one.

    Generating a mask requires iterating over attributes and descending into
    nested attributes. For each level we need to know the size to generate the
    correct mask. We do this with a linked table of attribute types.

    Although the logic to handle these nested attributes was there in concept,
    there are a number of bugs in practice. Examples include incomplete links
    between tables, variable-length attributes being treated as nested, and
    missing sanity checks.

    Signed-off-by: Jesse Gross
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Jesse Gross
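
A linked table of attribute lengths, as described above, can be sketched as an array indexed by attribute type in which nested entries point at the table for the next level. All names, attribute types and lengths here are illustrative assumptions, not the real OVS tables.

```c
#include <assert.h>
#include <stddef.h>

#define LEN_NESTED   (-1)   /* payload is a list of attributes */
#define LEN_VARIABLE (-2)   /* payload length is not fixed */

struct len_tbl {
    int len;
    const struct len_tbl *next;   /* table for the nested level, if any */
};

enum { TUN_ID = 1, TUN_OPTS = 2, TUN_MAX = 2 };
enum { KEY_PRIORITY = 1, KEY_TUNNEL = 2, KEY_MAX = 2 };

const struct len_tbl tunnel_lens[TUN_MAX + 1] = {
    [TUN_ID]   = { .len = 8 },
    [TUN_OPTS] = { .len = LEN_VARIABLE },
};

const struct len_tbl key_lens[KEY_MAX + 1] = {
    [KEY_PRIORITY] = { .len = 4 },
    [KEY_TUNNEL]   = { .len = LEN_NESTED, .next = tunnel_lens },
};

/* Returns 1 if payload_len is acceptable for this attribute type. */
int attr_len_ok(const struct len_tbl *tbl, int max_type, int type,
                int payload_len)
{
    if (type <= 0 || type > max_type)
        return 0;                        /* sanity check on the type */
    if (tbl[type].len == LEN_NESTED)
        return tbl[type].next != NULL;   /* the link must be complete */
    if (tbl[type].len == LEN_VARIABLE)
        return payload_len >= 0;
    return tbl[type].len == payload_len;
}

/* Returns the table to descend into, or NULL if the type is not nested. */
const struct len_tbl *nested_table(const struct len_tbl *tbl, int max_type,
                                   int type)
{
    if (type <= 0 || type > max_type || tbl[type].len != LEN_NESTED)
        return NULL;
    return tbl[type].next;
}
```

Keeping "nested" and "variable" as distinct markers is what prevents a variable-length attribute from being walked as if it contained further attributes.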
     

12 Sep, 2015

1 commit

  • When NF_CONNTRACK is built-in, NF_DEFRAG_IPV6 is a module, and
    OPENVSWITCH is built-in, the following build error would occur:

    net/built-in.o: In function `ovs_ct_execute':
    (.text+0x10f587): undefined reference to `nf_ct_frag6_gather'

    Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
    Reported-by: Jim Davis
    Signed-off-by: Joe Stringer
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Joe Stringer