17 Oct, 2013

1 commit

  • The combination of two commits:
    commit 8e4e1713e4
    ("openvswitch: Simplify datapath locking.")
    commit 2537b4dd0a
    ("openvswitch: link upper device for port devices")

    introduced a bug where upper_dev wasn't unlinked upon a
    netdev_unregister notification.

    The following steps:

    modprobe openvswitch
    ovs-dpctl add-dp test
    ip tuntap add dev tap1 mode tap
    ovs-dpctl add-if test tap1
    ip tuntap del dev tap1 mode tap

    are causing multiple warnings:

    [ 62.747557] gre: GRE over IPv4 demultiplexor driver
    [ 62.749579] openvswitch: Open vSwitch switching datapath
    [ 62.755087] device test entered promiscuous mode
    [ 62.765911] device tap1 entered promiscuous mode
    [ 62.766033] IPv6: ADDRCONF(NETDEV_UP): tap1: link is not ready
    [ 62.769017] ------------[ cut here ]------------
    [ 62.769022] WARNING: CPU: 1 PID: 3267 at net/core/dev.c:5501 rollback_registered_many+0x20f/0x240()
    [ 62.769023] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
    [ 62.769051] CPU: 1 PID: 3267 Comm: ip Not tainted 3.12.0-rc3+ #60
    [ 62.769052] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
    [ 62.769053] 0000000000000009 ffff8807f25cbd28 ffffffff8175e575 0000000000000006
    [ 62.769055] 0000000000000000 ffff8807f25cbd68 ffffffff8105314c ffff8807f25cbd58
    [ 62.769057] ffff8807f2634000 ffff8807f25cbdc8 ffff8807f25cbd88 ffff8807f25cbdc8
    [ 62.769059] Call Trace:
    [ 62.769062] [] dump_stack+0x55/0x76
    [ 62.769065] [] warn_slowpath_common+0x8c/0xc0
    [ 62.769067] [] warn_slowpath_null+0x1a/0x20
    [ 62.769069] [] rollback_registered_many+0x20f/0x240
    [ 62.769071] [] rollback_registered+0x31/0x40
    [ 62.769073] [] unregister_netdevice_queue+0x58/0x90
    [ 62.769075] [] __tun_detach+0x140/0x340
    [ 62.769077] [] tun_chr_close+0x36/0x60
    [ 62.769080] [] __fput+0xff/0x260
    [ 62.769082] [] ____fput+0xe/0x10
    [ 62.769084] [] task_work_run+0xb5/0xe0
    [ 62.769087] [] do_notify_resume+0x59/0x80
    [ 62.769089] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [ 62.769091] [] int_signal+0x12/0x17
    [ 62.769093] ---[ end trace 838756c62e156ffb ]---
    [ 62.769481] ------------[ cut here ]------------
    [ 62.769485] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
    [ 62.769486] sysfs: can not remove 'master', no directory
    [ 62.769486] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
    [ 62.769514] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60
    [ 62.769515] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
    [ 62.769518] Workqueue: events ovs_dp_notify_wq [openvswitch]
    [ 62.769519] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
    [ 62.769521] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b28
    [ 62.769523] 0000000000000000 ffffffff81a87a1f ffff8807f2634000 ffff880037038500
    [ 62.769525] Call Trace:
    [ 62.769528] [] dump_stack+0x55/0x76
    [ 62.769529] [] warn_slowpath_common+0x8c/0xc0
    [ 62.769531] [] warn_slowpath_fmt+0x46/0x50
    [ 62.769533] [] sysfs_hash_and_remove+0xa9/0xb0
    [ 62.769535] [] sysfs_remove_link+0x26/0x30
    [ 62.769538] [] __netdev_adjacent_dev_remove+0xf7/0x150
    [ 62.769540] [] __netdev_adjacent_dev_unlink_lists+0x27/0x50
    [ 62.769542] [] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
    [ 62.769544] [] netdev_upper_dev_unlink+0x3d/0x140
    [ 62.769548] [] netdev_destroy+0x4b/0x80 [openvswitch]
    [ 62.769550] [] ovs_vport_del+0x46/0x60 [openvswitch]
    [ 62.769552] [] ovs_dp_detach_port+0x44/0x60 [openvswitch]
    [ 62.769555] [] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
    [ 62.769557] [] process_one_work+0x1d8/0x6a0
    [ 62.769559] [] ? process_one_work+0x178/0x6a0
    [ 62.769562] [] worker_thread+0x11b/0x370
    [ 62.769564] [] ? rescuer_thread+0x350/0x350
    [ 62.769566] [] kthread+0xea/0xf0
    [ 62.769568] [] ? flush_kthread_worker+0x150/0x150
    [ 62.769570] [] ret_from_fork+0x7c/0xb0
    [ 62.769572] [] ? flush_kthread_worker+0x150/0x150
    [ 62.769573] ---[ end trace 838756c62e156ffc ]---
    [ 62.769574] ------------[ cut here ]------------
    [ 62.769576] WARNING: CPU: 1 PID: 92 at fs/sysfs/inode.c:325 sysfs_hash_and_remove+0xa9/0xb0()
    [ 62.769577] sysfs: can not remove 'upper_test', no directory
    [ 62.769577] Modules linked in: openvswitch gre vxlan ip_tunnel libcrc32c ip6table_filter ip6_tables ebtable_nat ebtables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_CHECKSUM iptable_mangle ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp llc vhost_net macvtap macvlan vhost kvm_intel kvm dm_crypt iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi hid_generic mxm_wmi eeepc_wmi asus_wmi sparse_keymap dm_multipath psmouse serio_raw usbhid hid parport_pc ppdev firewire_ohci lpc_ich firewire_core e1000e crc_itu_t binfmt_misc igb dca ptp pps_core mac_hid wmi lp parport i2o_config i2o_block video
    [ 62.769603] CPU: 1 PID: 92 Comm: kworker/1:2 Tainted: G W 3.12.0-rc3+ #60
    [ 62.769604] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
    [ 62.769606] Workqueue: events ovs_dp_notify_wq [openvswitch]
    [ 62.769607] 0000000000000009 ffff880807ad3ac8 ffffffff8175e575 0000000000000006
    [ 62.769609] ffff880807ad3b18 ffff880807ad3b08 ffffffff8105314c ffff880807ad3b58
    [ 62.769611] 0000000000000000 ffff880807ad3bd9 ffff8807f2634000 ffff880037038500
    [ 62.769613] Call Trace:
    [ 62.769615] [] dump_stack+0x55/0x76
    [ 62.769617] [] warn_slowpath_common+0x8c/0xc0
    [ 62.769619] [] warn_slowpath_fmt+0x46/0x50
    [ 62.769621] [] sysfs_hash_and_remove+0xa9/0xb0
    [ 62.769622] [] sysfs_remove_link+0x26/0x30
    [ 62.769624] [] __netdev_adjacent_dev_remove+0x122/0x150
    [ 62.769627] [] __netdev_adjacent_dev_unlink_lists+0x27/0x50
    [ 62.769629] [] __netdev_adjacent_dev_unlink_neighbour+0x3a/0x50
    [ 62.769631] [] netdev_upper_dev_unlink+0x3d/0x140
    [ 62.769633] [] netdev_destroy+0x4b/0x80 [openvswitch]
    [ 62.769636] [] ovs_vport_del+0x46/0x60 [openvswitch]
    [ 62.769638] [] ovs_dp_detach_port+0x44/0x60 [openvswitch]
    [ 62.769640] [] ovs_dp_notify_wq+0xb4/0x150 [openvswitch]
    [ 62.769642] [] process_one_work+0x1d8/0x6a0
    [ 62.769644] [] ? process_one_work+0x178/0x6a0
    [ 62.769646] [] worker_thread+0x11b/0x370
    [ 62.769648] [] ? rescuer_thread+0x350/0x350
    [ 62.769650] [] kthread+0xea/0xf0
    [ 62.769652] [] ? flush_kthread_worker+0x150/0x150
    [ 62.769654] [] ret_from_fork+0x7c/0xb0
    [ 62.769656] [] ? flush_kthread_worker+0x150/0x150
    [ 62.769657] ---[ end trace 838756c62e156ffd ]---
    [ 62.769724] device tap1 left promiscuous mode

    This patch also affects moving devices between net namespaces.

    OVS used to ignore netns move notifications, which caused problems.
    For example:
    ovs-dpctl add-if test tap1
    ip link set tap1 netns 3512
    and then removing tap1 inside the namespace causes a hang on a missing dev_put.

    With this patch, OVS detaches the dev upon receiving a netns move event.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Jesse Gross

    Alexei Starovoitov
     

12 Sep, 2013

1 commit

  • In function __parse_flow_nlattrs(), we check for condition
    (type > OVS_KEY_ATTR_MAX) and if true, print an error, but we do
    not return from this function as in other checks. It seems this
    has been forgotten, as otherwise, we could access beyond the
    memory of ovs_key_lens, which is of ovs_key_lens[OVS_KEY_ATTR_MAX + 1].
    Hence, a maliciously prepared nla_type from user space could access
    beyond this upper limit.

    Introduced by 03f0d916a ("openvswitch: Mega flow implementation").

    Signed-off-by: Daniel Borkmann
    Cc: Andy Zhou
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Daniel Borkmann
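
The missing early return described in this entry can be illustrated with a small user-space sketch. It is not the kernel code: `KEY_ATTR_MAX`, `key_lens` and `expected_attr_len()` are hypothetical stand-ins for `OVS_KEY_ATTR_MAX`, `ovs_key_lens` and the check inside `__parse_flow_nlattrs()`, and the table values are made up.

```c
#include <errno.h>

/* Hypothetical stand-in for OVS_KEY_ATTR_MAX. */
#define KEY_ATTR_MAX 3

/* Hypothetical stand-in for ovs_key_lens[OVS_KEY_ATTR_MAX + 1]. */
static const int key_lens[KEY_ATTR_MAX + 1] = { 0, 4, 2, 8 };

/*
 * Return the expected payload length for an attribute type, or -EINVAL
 * if the type is out of range. Without the early return, the lookup
 * below would read past the end of key_lens[] for a type supplied by
 * user space that exceeds KEY_ATTR_MAX.
 */
int expected_attr_len(unsigned int type)
{
	if (type > KEY_ATTR_MAX)
		return -EINVAL;	/* reject before touching the table */

	return key_lens[type];
}
```

The point of the sketch is only the ordering: the range check has to terminate the function before the table is indexed.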
     

06 Sep, 2013

1 commit

  • sw_flow_key alignment was declared as " __aligned(__alignof__(long))".
    However, this breaks on the m68k architecture where long is 32 bit in
    size but 16 bit aligned by default. This aligns to the size of a long to
    ensure that we can always do comparisons in full long-sized chunks. It
    also adds an additional build check to catch any reduction in alignment.

    CC: Andy Zhou
    Reported-by: Fengguang Wu
    Reported-by: Geert Uytterhoeven
    Signed-off-by: Jesse Gross
    Signed-off-by: David S. Miller

    Jesse Gross
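
A minimal sketch of the idea in this entry, assuming GCC or Clang (for `__attribute__((aligned))`) and C11 (for `_Static_assert`). The struct is a hypothetical, much smaller stand-in for `sw_flow_key`, and the assertions stand in for the build check the entry mentions; the exact form used in the kernel may differ.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for struct sw_flow_key. Aligning to sizeof(long)
 * rather than __alignof__(long) matters on ABIs such as m68k, where a
 * 32-bit long only needs 16-bit alignment.
 */
struct flow_key {
	uint16_t in_port;
	uint8_t  eth_src[6];
	uint8_t  eth_dst[6];
	uint16_t eth_type;
} __attribute__((aligned(sizeof(long))));

/* Build-time checks: fail compilation if the alignment is ever reduced. */
_Static_assert(_Alignof(struct flow_key) % sizeof(long) == 0,
	       "flow_key must be aligned to the size of long");
_Static_assert(sizeof(struct flow_key) % sizeof(long) == 0,
	       "flow_key must be a whole number of longs");

/* Number of long-sized chunks a comparison loop would walk. */
size_t flow_key_words(void)
{
	return sizeof(struct flow_key) / sizeof(long);
}
```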
     

04 Sep, 2013

3 commits


01 Sep, 2013

1 commit

  • This patch adds IPv6 support to the vxlan device, as the new version
    of the RFC draft already mentions it:

    http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-03

    Cc: David Stevens
    Cc: Stephen Hemminger
    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

28 Aug, 2013

1 commit

  • Make sure the sw_flow_key structure and valid mask boundaries are always
    machine word aligned. Optimize the flow compare and mask operations
    using machine word size operations. This patch improves throughput on
    average by 15% when CPU is the bottleneck of forwarding packets.

    This patch is inspired by ideas and code from a patch submitted by Peter
    Klausler titled "replace memcmp() with specialized comparator".
    However, the original patch only optimizes for architectures that
    support unaligned machine word access. This patch optimizes for all
    architectures.

    Signed-off-by: Andy Zhou
    Signed-off-by: Jesse Gross

    Andy Zhou
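
The word-sized compare and mask operations described in this entry can be sketched in user space as follows. This is an illustration under assumptions, not the OVS implementation: the key is modelled as a plain array of `unsigned long` (which guarantees word alignment), and `flow_key_mask()` and `flow_key_equal()` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define KEY_WORDS 4

/* Hypothetical key: an array of longs is always machine word aligned. */
struct flow_key {
	unsigned long w[KEY_WORDS];
};

/* dst = src & mask, one machine word at a time. */
void flow_key_mask(struct flow_key *dst, const struct flow_key *src,
		   const struct flow_key *mask)
{
	for (size_t i = 0; i < KEY_WORDS; i++)
		dst->w[i] = src->w[i] & mask->w[i];
}

/*
 * Compare words [start, end) of two keys. Differences are accumulated
 * with XOR/OR so the loop has no data-dependent branch.
 */
bool flow_key_equal(const struct flow_key *a, const struct flow_key *b,
		    size_t start, size_t end)
{
	unsigned long diff = 0;

	for (size_t i = start; i < end; i++)
		diff |= a->w[i] ^ b->w[i];

	return diff == 0;
}
```

Because both keys start on a word boundary, the loops never issue an unaligned load, which is why this approach works on architectures without unaligned access support.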
     

27 Aug, 2013

2 commits

  • Key_end is a better name describing the ending boundary than key_len.
    Rename those variables to make it less confusing.

    Signed-off-by: Andy Zhou
    Signed-off-by: Jesse Gross

    Andy Zhou
     
  • This patch adds support for rewriting SCTP src,dst ports similar to the
    functionality already available for TCP/UDP.

    Rewriting SCTP ports is expensive due to double-recalculation of the
    SCTP checksums; this is performed to ensure that packets traversing OVS
    with invalid checksums will continue to the destination with any
    checksum corruption intact.

    Reviewed-by: Simon Horman
    Signed-off-by: Joe Stringer
    Signed-off-by: Ben Pfaff
    Signed-off-by: Jesse Gross

    Joe Stringer
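
One way to carry an existing checksum error through a port rewrite is to compute the correct checksum before and after the change and XOR both into the received value. The sketch below shows that idea under simplifying assumptions: `struct sctp_pkt` is a hypothetical packet model (a real SCTP header also carries a verification tag, and the checksum field is zeroed in place), and the CRC32c routine is a plain bitwise implementation rather than the kernel's.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78. */
uint32_t crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return ~crc;
}

/* Hypothetical, simplified packet: ports, received checksum, payload. */
struct sctp_pkt {
	uint16_t src_port, dst_port;
	uint32_t checksum;	/* as received; may already be wrong */
	uint8_t  payload[8];
};

/* Checksum the packet should carry, computed over ports and payload. */
uint32_t correct_csum(const struct sctp_pkt *p)
{
	uint8_t buf[4 + sizeof(p->payload)];

	buf[0] = (uint8_t)(p->src_port >> 8);
	buf[1] = (uint8_t)(p->src_port & 0xff);
	buf[2] = (uint8_t)(p->dst_port >> 8);
	buf[3] = (uint8_t)(p->dst_port & 0xff);
	memcpy(buf + 4, p->payload, sizeof(p->payload));
	return crc32c(buf, sizeof(buf));
}

/* Rewrite the ports; any checksum error present before stays present. */
void set_ports(struct sctp_pkt *p, uint16_t src, uint16_t dst)
{
	uint32_t old_csum = p->checksum;
	uint32_t old_correct = correct_csum(p);	/* first recalculation */

	p->src_port = src;
	p->dst_port = dst;

	uint32_t new_correct = correct_csum(p);	/* second recalculation */

	p->checksum = old_csum ^ old_correct ^ new_correct;
}
```

If the packet arrived with a valid checksum, `old_csum ^ old_correct` is zero and the result is simply the new correct checksum; otherwise the same error bits are reapplied to the new value. The two full CRC passes are the cost the entry refers to.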
     

24 Aug, 2013

7 commits


20 Aug, 2013

1 commit


15 Aug, 2013

3 commits

  • It doesn't make sense to output a tunnel packet using the same
    parameters that it was received with since that will generally
    just result in the packet going back to us. As a result, userspace
    assumes that the tunnel key is cleared when transitioning through
    the switch. In the majority of cases this doesn't matter since a
    packet is either going to a tunnel port (in which the key is
    overwritten with new values) or to a non-tunnel port (in which
    case the key is ignored). However, it's theoretically possible that
    userspace could rely on the documented behavior, so this corrects
    it.

    Signed-off-by: Jesse Gross

    Jesse Gross
     
  • Flex array is used to allocate hash buckets, which are of type struct
    hlist_head, but we use `struct hlist_head *` to calculate the
    array size. Since hlist_head is the size of a pointer, it works fine.

    The following patch uses the correct type.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: Jesse Gross

    Pravin B Shelar
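
Why the wrong type happened to work can be seen with a user-space copy of the list head layout. The struct definitions below mirror the kernel's `hlist_head` and `hlist_node`; the two helper functions are hypothetical and exist only to name the two `sizeof` expressions.

```c
#include <stddef.h>

/* User-space copies of the kernel's hlist types. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Element size as computed before the fix: a pointer to a bucket. */
size_t bucket_size_before(void)
{
	return sizeof(struct hlist_head *);
}

/* Element size after the fix: the bucket itself. */
size_t bucket_size_after(void)
{
	return sizeof(struct hlist_head);
}
```

Both expressions evaluate to the size of one pointer because `struct hlist_head` has a single pointer member, so the old code allocated the right amount by coincidence. Using the element type keeps the calculation correct if the struct ever grows.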
     
  • git silently included an extra hunk in vport_cmd_set() during
    automatic merging. This code is unreachable so it does not actually
    introduce a problem but it is clearly incorrect.

    Signed-off-by: Jesse Gross

    Jesse Gross
     

02 Jul, 2013

1 commit

  • Openvswitch uses a function from the NET_IPGRE_DEMUX module.
    Add a Kconfig dependency to fix the following compilation errors:
    http://marc.info/?l=linux-netdev&m=137244035226634

    CC: Jesse Gross
    Reported-by: Randy Dunlap
    Signed-off-by: Pravin Shelar
    Acked-by: Randy Dunlap
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Pravin B Shelar
     

24 Jun, 2013

1 commit


20 Jun, 2013

5 commits


15 Jun, 2013

8 commits


29 May, 2013

1 commit

  • So far, only a net_device * could be passed along with a netdevice notifier
    event. This patch makes it possible to pass a custom structure
    that provides the info the event listener needs to know.

    Signed-off-by: Jiri Pirko

    v2->v3: fix typo on simeth
    shortened dev_getter
    shortened notifier_info struct name
    v1->v2: fix notifier_call parameter in call_netdevice_notifier()
    Signed-off-by: David S. Miller

    Jiri Pirko
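
The pattern this entry describes, a base info struct that every listener can read plus optional event-specific extensions, can be sketched in user space as below. All names here (`notifier_info`, `change_info`, `EV_CHANGE`, the helper functions) are hypothetical stand-ins chosen for the illustration and are not claimed to match the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical minimal stand-in for struct net_device. */
struct net_device {
	const char *name;
};

/* Base info passed with every event. */
struct notifier_info {
	struct net_device *dev;
};

/* Event-specific info embeds the base info as its first member. */
struct change_info {
	struct notifier_info info;	/* must stay first */
	unsigned int flags_changed;
};

#define EV_CHANGE 1UL	/* hypothetical event code */

/* Getter so listeners do not depend on the struct layout. */
struct net_device *notifier_info_to_dev(const struct notifier_info *info)
{
	return info->dev;
}

/* Every listener can recover the device from the opaque pointer. */
struct net_device *event_dev(void *ptr)
{
	return notifier_info_to_dev((const struct notifier_info *)ptr);
}

/* Only listeners that care about EV_CHANGE look at the extra field. */
unsigned int changed_flags(unsigned long event, void *ptr)
{
	if (event == EV_CHANGE) {
		/* Valid because the base info is the first member. */
		const struct change_info *ci = ptr;

		return ci->flags_changed;
	}
	return 0;
}
```

Existing listeners keep working through the getter, while new events can attach extra data without changing the callback signature.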
     

30 Apr, 2013

1 commit


25 Apr, 2013

1 commit

  • OVS locking was recently changed to have a private OVS lock, which
    simplified overall locking. Therefore there is no need to have
    another global genl lock to protect OVS data structures. The following
    patch uses the parallel_ops genl family for OVS. This also allows
    more granular OVS locking using ovs_mutex for protecting OVS data
    structures, which gives more concurrency. E.g. multiple genl
    operations OVS_PACKET_CMD_EXECUTE can run in parallel, etc.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     

23 Apr, 2013

1 commit

  • Conflicts:
    drivers/net/ethernet/emulex/benet/be_main.c
    drivers/net/ethernet/intel/igb/igb_main.c
    drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
    include/net/scm.h
    net/batman-adv/routing.c
    net/ipv4/tcp_input.c

    The e{uid,gid} --> {uid,gid} credentials fix conflicted with the
    cleanup in net-next to now pass cred structs around.

    The be2net driver had a bug fix in 'net' that overlapped with the VLAN
    interface changes by Patrick McHardy in net-next.

    An IGB conflict existed because in 'net' the build_skb() support was
    reverted, and in 'net-next' there was a comment style fix within that
    code.

    Several batman-adv conflicts were resolved by making sure that all
    calls to batadv_is_my_mac() are changed to have a new bat_priv first
    argument.

    Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO
    rewrite in 'net-next', mostly overlapping changes.

    Thanks to Stephen Rothwell and Antonio Quartulli for help with several
    of these merge resolutions.

    Signed-off-by: David S. Miller

    David S. Miller