27 Sep, 2020

1 commit

  • [ Upstream commit 99f62a746066fa436aa15d4606a538569540db08 ]

    When calling the RCU brother of br_vlan_get_pvid(), lockdep warns:

    =============================
    WARNING: suspicious RCU usage
    5.9.0-rc3-01631-g13c17acb8e38-dirty #814 Not tainted
    -----------------------------
    net/bridge/br_private.h:1054 suspicious rcu_dereference_protected() usage!

    Call trace:
    lockdep_rcu_suspicious+0xd4/0xf8
    __br_vlan_get_pvid+0xc0/0x100
    br_vlan_get_pvid_rcu+0x78/0x108

    The warning is because br_vlan_get_pvid_rcu() calls nbp_vlan_group()
    which calls rtnl_dereference() instead of rcu_dereference(). In turn,
    rtnl_dereference() calls rcu_dereference_protected() which assumes
    operation under an RCU write-side critical section, which obviously is
    not the case here. So, when the incorrect primitive is used to access
    the RCU-protected VLAN group pointer, READ_ONCE() is not used, which may
    cause various unexpected problems.

    I'm sad to say that br_vlan_get_pvid() and br_vlan_get_pvid_rcu() cannot
    share the same implementation. So fix the bug by splitting the 2
    functions, and making br_vlan_get_pvid_rcu() retrieve the VLAN groups
    under proper locking annotations.

    Fixes: 7582f5b70f9a ("bridge: add br_vlan_get_pvid_rcu()")
    Signed-off-by: Vladimir Oltean
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vladimir Oltean
     

03 Sep, 2020

1 commit

  • [ Upstream commit 2404b73c3f1a5f15726c6ecd226b56f6f992767f ]

    nf_ct_frag6_gather is part of nf_defrag_ipv6.ko, not ipv6 core.

    The current use of the netfilter ipv6 stub indirections causes a module
    dependency between ipv6 and nf_defrag_ipv6.

    This prevents nf_defrag_ipv6 module from being removed because ipv6 can't
    be unloaded.

    Remove the indirection and always use a direct call. This creates a
    dependency from nf_conntrack_bridge to nf_defrag_ipv6 instead:

    modinfo nf_conntrack
    depends: nf_conntrack,nf_defrag_ipv6,bridge

    .. and nf_conntrack already depends on nf_defrag_ipv6 anyway.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Sasha Levin

    Florian Westphal
     

22 Jul, 2020

1 commit

  • [ Upstream commit 5fc6266af7b427243da24f3443a50cd4584aac06 ]

    Commit e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in
    igmp3/mld2 report handling") introduced a bug in the IPv6 header payload
    length check which would potentially lead to rejecting a valid MLD2 Report:

    The check needs to take into account the 2 bytes for the "Number of
    Sources" field in the "Multicast Address Record" before reading it,
    not the size of a pointer to this field.
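
    The difference between the two sizes can be shown with a minimal
    userspace sketch. The struct and helper names (grec_sketch, nsrcs_fits,
    nsrcs_fits_buggy) are hypothetical and only mirror the shape of the
    check; this is not the bridge code itself:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the start of an MLDv2 Multicast Address Record. */
struct grec_sketch {
	uint8_t  type;
	uint8_t  aux_len;
	uint16_t nsrcs;   /* "Number of Sources": 2 bytes on the wire */
};

/* Intended check: require room for the 2-byte field itself. */
static bool nsrcs_fits(size_t payload_len, size_t rec_off)
{
	size_t end = rec_off + offsetof(struct grec_sketch, nsrcs) +
		     sizeof(uint16_t);

	return end <= payload_len;
}

/* Buggy check: sizeof() on a pointer yields the pointer size, not 2. */
static bool nsrcs_fits_buggy(size_t payload_len, size_t rec_off)
{
	uint16_t *nsrcs = NULL;
	size_t end = rec_off + offsetof(struct grec_sketch, nsrcs) +
		     sizeof(nsrcs);

	return end <= payload_len;
}
```

    With a payload that ends right after the field, the buggy variant
    demands more bytes than are needed and so rejects a valid record.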

    Fixes: e57f61858b7c ("net: bridge: mcast: fix stale nsrcs pointer in igmp3/mld2 report handling")
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Linus Lüssing
     

01 Jul, 2020

1 commit

  • [ Upstream commit db7202dec92e6caa2706c21d6fc359af318bde2e ]

    The eth_addr member is passed to ether_addr functions that require
    2-byte alignment, therefore the member must be properly aligned
    to avoid unaligned accesses.
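
    A minimal sketch of the alignment issue, assuming a GCC/Clang-style
    aligned attribute (the kernel spells this differently). The struct
    names and the helper are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Without an alignment request, the array follows the 1-byte member. */
struct entry_plain_sketch {
	unsigned char flags;
	unsigned char eth_addr[6];                              /* offset 1 */
};

/* Requesting 2-byte alignment makes the compiler insert one pad byte. */
struct entry_aligned_sketch {
	unsigned char flags;
	unsigned char eth_addr[6] __attribute__((aligned(2)));  /* offset 2 */
};

/* Returns nonzero if p is suitable for 16-bit loads. */
static int is_2byte_aligned(const void *p)
{
	return ((uintptr_t)p & 1u) == 0;
}
```

    Helpers that compare MAC addresses using 16-bit loads rely on the
    second layout.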

    The problem is in place since the initial merge of multicast to unicast:
    commit 6db6f0eae6052b70885562e1733896647ec1d807 bridge: multicast to unicast

    Fixes: 6db6f0eae605 ("bridge: multicast to unicast")
    Cc: Roopa Prabhu
    Cc: Nikolay Aleksandrov
    Cc: David S. Miller
    Cc: Jakub Kicinski
    Cc: Felix Fietkau
    Cc: stable@vger.kernel.org
    Signed-off-by: Thomas Martitz
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Thomas Martitz
     

17 Jun, 2020

1 commit

  • [ Upstream commit 53fc685243bd6fb90d90305cea54598b78d3cbfc ]

    When neighbor suppression is enabled the bridge device might reply to
    Neighbor Solicitation (NS) messages on behalf of remote hosts.

    In case the NS message includes the "Source link-layer address" option
    [1], the bridge device will use the specified address as the link-layer
    destination address in its reply.

    To avoid an infinite loop, break out of the options parsing loop when
    encountering an option with length zero and disregard the NS message.

    This is consistent with the IPv6 ndisc code and RFC 4861 which states
    that "Nodes MUST silently discard an ND packet that contains an option
    with length zero" [2].

    [1] https://tools.ietf.org/html/rfc4861#section-4.3
    [2] https://tools.ietf.org/html/rfc4861#section-4.6
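
    The guard can be sketched in userspace as follows. This is an
    illustration under assumptions, not the bridge implementation:
    find_slla() is a hypothetical helper, and the only protocol facts it
    relies on are that each ND option starts with a type byte and a length
    byte counted in units of 8 octets, and that type 1 is the Source
    link-layer address option.

```c
#include <stddef.h>
#include <stdint.h>

#define ND_OPT_SOURCE_LL_ADDR 1   /* Source link-layer address option type */

/*
 * Walk the ND options in opts[0..len). Return a pointer to the payload of
 * the first Source link-layer address option, or NULL if there is none or
 * the options are malformed.
 */
static const uint8_t *find_slla(const uint8_t *opts, size_t len)
{
	size_t i = 0;

	while (i + 2 <= len) {
		uint8_t type = opts[i];
		size_t optlen = (size_t)opts[i + 1] * 8;

		if (optlen == 0)        /* loop would never advance: discard */
			return NULL;
		if (optlen > len - i)   /* option runs past the buffer */
			return NULL;
		if (type == ND_OPT_SOURCE_LL_ADDR)
			return &opts[i + 2];
		i += optlen;
	}
	return NULL;
}
```

    Without the zero-length test, i would never change and the loop would
    spin on the same option forever.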

    Fixes: ed842faeb2bd ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
    Signed-off-by: Ido Schimmel
    Reported-by: Alla Segal
    Tested-by: Alla Segal
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Ido Schimmel
     

03 Jun, 2020

1 commit

  • commit e9c284ec4b41c827f4369973d2792992849e4fa5 upstream.

    Currently, using the bridge reject target with tagged packets
    results in untagged packets being sent back.

    Fix this by mirroring the vlan id as well.

    Fixes: 85f5b3086a04 ("netfilter: bridge: add reject support")
    Signed-off-by: Michael Braun
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Michael Braun
     

05 Jan, 2020

3 commits

  • [ Upstream commit bd085ef678b2cc8c38c105673dfe8ff8f5ec0c57 ]

    The MTU update code is supposed to be invoked in response to real
    networking events that update the PMTU. In IPv6 PMTU update function
    __ip6_rt_update_pmtu() we called dst_confirm_neigh() to update neighbor
    confirmed time.

    But for tunnel code, it will call pmtu before xmit, like:
    - tnl_update_pmtu()
    - skb_dst_update_pmtu()
    - ip6_rt_update_pmtu()
    - __ip6_rt_update_pmtu()
    - dst_confirm_neigh()

    If the tunnel remote dst mac address changed and we still do the neigh
    confirm, we will not be able to update the neigh cache and ping6 to the
    remote will fail.

    So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
    should not be invoking dst_confirm_neigh() as we have no evidence
    of successful two-way communication at this point.

    On the other hand it is also important to keep the neigh reachability fresh
    for TCP flows, so we cannot remove this dst_confirm_neigh() call.

    To fix the issue, we have to add a new bool parameter for dst_ops.update_pmtu
    to choose whether we should do neigh update or not. I will add the parameter
    in this patch and set all the callers to true to comply with the previous
    way, and fix the tunnel code one by one on later patches.
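
    The shape of the new parameter can be sketched like this. The types
    and the function name are hypothetical stand-ins; the real callback
    has a different signature:

```c
#include <stdbool.h>

struct route_sketch {
	unsigned int pmtu;
	unsigned long neigh_confirmed;   /* time of last neighbour confirmation */
};

/*
 * Lower the path MTU and, only if the caller asks for it, refresh the
 * neighbour confirmation time.
 */
static void update_pmtu_sketch(struct route_sketch *rt, unsigned int mtu,
			       bool confirm_neigh, unsigned long now)
{
	if (confirm_neigh)
		rt->neigh_confirmed = now;
	if (mtu < rt->pmtu)
		rt->pmtu = mtu;
}
```

    A tunnel transmit path would pass false, since it has no evidence of
    two-way reachability, while callers with such evidence keep passing
    true.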

    v5: No change.
    v4: No change.
    v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
    dst_ops.update_pmtu to control whether we should do neighbor confirm.
    Also split the big patch to small ones for each area.
    v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

    Suggested-by: David Miller
    Reviewed-by: Guillaume Nault
    Acked-by: David Ahern
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Hangbin Liu
     
  • commit 5604285839aaedfb23ebe297799c6e558939334d upstream.

    syzbot is kind enough to remind us we need to call pskb_may_pull()

    BUG: KMSAN: uninit-value in br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
    CPU: 1 PID: 11631 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:

    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1c9/0x220 lib/dump_stack.c:118
    kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108
    __msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245
    br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
    nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline]
    nf_hook_slow+0x18b/0x3f0 net/netfilter/core.c:512
    nf_hook include/linux/netfilter.h:260 [inline]
    NF_HOOK include/linux/netfilter.h:303 [inline]
    __br_forward+0x78f/0xe30 net/bridge/br_forward.c:109
    br_flood+0xef0/0xfe0 net/bridge/br_forward.c:234
    br_handle_frame_finish+0x1a77/0x1c20 net/bridge/br_input.c:162
    nf_hook_bridge_pre net/bridge/br_input.c:245 [inline]
    br_handle_frame+0xfb6/0x1eb0 net/bridge/br_input.c:348
    __netif_receive_skb_core+0x20b9/0x51a0 net/core/dev.c:4830
    __netif_receive_skb_one_core net/core/dev.c:4927 [inline]
    __netif_receive_skb net/core/dev.c:5043 [inline]
    process_backlog+0x610/0x13c0 net/core/dev.c:5874
    napi_poll net/core/dev.c:6311 [inline]
    net_rx_action+0x7a6/0x1aa0 net/core/dev.c:6379
    __do_softirq+0x4a1/0x83a kernel/softirq.c:293
    do_softirq_own_stack+0x49/0x80 arch/x86/entry/entry_64.S:1091

    do_softirq kernel/softirq.c:338 [inline]
    __local_bh_enable_ip+0x184/0x1d0 kernel/softirq.c:190
    local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32
    rcu_read_unlock_bh include/linux/rcupdate.h:688 [inline]
    __dev_queue_xmit+0x38e8/0x4200 net/core/dev.c:3819
    dev_queue_xmit+0x4b/0x60 net/core/dev.c:3825
    packet_snd net/packet/af_packet.c:2959 [inline]
    packet_sendmsg+0x8234/0x9100 net/packet/af_packet.c:2984
    sock_sendmsg_nosec net/socket.c:637 [inline]
    sock_sendmsg net/socket.c:657 [inline]
    __sys_sendto+0xc44/0xc70 net/socket.c:1952
    __do_sys_sendto net/socket.c:1964 [inline]
    __se_sys_sendto+0x107/0x130 net/socket.c:1960
    __x64_sys_sendto+0x6e/0x90 net/socket.c:1960
    do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x45a679
    Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f0a3c9e5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 000000000045a679
    RDX: 000000000000000e RSI: 0000000020000200 RDI: 0000000000000003
    RBP: 000000000075bf20 R08: 00000000200000c0 R09: 0000000000000014
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0a3c9e66d4
    R13: 00000000004c8ec1 R14: 00000000004dfe28 R15: 00000000ffffffff

    Uninit was created at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline]
    kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132
    kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86
    slab_alloc_node mm/slub.c:2773 [inline]
    __kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381
    __kmalloc_reserve net/core/skbuff.c:141 [inline]
    __alloc_skb+0x306/0xa10 net/core/skbuff.c:209
    alloc_skb include/linux/skbuff.h:1049 [inline]
    alloc_skb_with_frags+0x18c/0xa80 net/core/skbuff.c:5662
    sock_alloc_send_pskb+0xafd/0x10a0 net/core/sock.c:2244
    packet_alloc_skb net/packet/af_packet.c:2807 [inline]
    packet_snd net/packet/af_packet.c:2902 [inline]
    packet_sendmsg+0x63a6/0x9100 net/packet/af_packet.c:2984
    sock_sendmsg_nosec net/socket.c:637 [inline]
    sock_sendmsg net/socket.c:657 [inline]
    __sys_sendto+0xc44/0xc70 net/socket.c:1952
    __do_sys_sendto net/socket.c:1964 [inline]
    __se_sys_sendto+0x107/0x130 net/socket.c:1960
    __x64_sys_sendto+0x6e/0x90 net/socket.c:1960
    do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Fixes: c4e70a87d975 ("netfilter: bridge: rename br_netfilter.c to br_netfilter_hooks.c")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Reviewed-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • commit e608f631f0ba5f1fc5ee2e260a3a35d13107cbfe upstream.

    syzbot reported following splat:

    BUG: KASAN: vmalloc-out-of-bounds in size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
    BUG: KASAN: vmalloc-out-of-bounds in compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
    Read of size 4 at addr ffffc900004461f4 by task syz-executor267/7937

    CPU: 1 PID: 7937 Comm: syz-executor267 Not tainted 5.5.0-rc1-syzkaller #0
    size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
    compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
    compat_do_replace+0x344/0x720 net/bridge/netfilter/ebtables.c:2249
    compat_do_ebt_set_ctl+0x22f/0x27e net/bridge/netfilter/ebtables.c:2333
    [..]

    Because padding isn't considered during computation of ->buf_user_offset,
    "total" is decremented by fewer bytes than it should.

    Therefore, the first part of

    if (*total < sizeof(*entry) || entry->next_offset < sizeof(*entry))

    will pass, although it should not. This causes an out-of-bounds access:
    entry->next_offset is past the vmalloced size.

    Reject padding and check that computed user offset (sum of ebt_entry
    structure plus all individual matches/watchers/targets) is same
    value that userspace gave us as the offset of the next entry.
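
    The consistency check can be sketched as below. The helper and its
    parameters are hypothetical; the sketch only shows the idea of
    recomputing the offset from the individual sizes and comparing it with
    the user-supplied value:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sum the entry header size and the sizes of all matches/watchers/targets,
 * then require that the result equals the next_offset given by userspace.
 * Any padding between the parts makes the two values differ.
 */
static bool next_offset_matches(size_t entry_hdr_size, const size_t *part_sizes,
				size_t nparts, size_t next_offset)
{
	size_t computed = entry_hdr_size;
	size_t i;

	for (i = 0; i < nparts; i++) {
		if (part_sizes[i] > SIZE_MAX - computed)
			return false;   /* sum would overflow */
		computed += part_sizes[i];
	}
	return computed == next_offset;
}
```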

    Reported-by: syzbot+f68108fed972453a0ad4@syzkaller.appspotmail.com
    Fixes: 81e675c227ec ("netfilter: ebtables: add CONFIG_COMPAT support")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
     

18 Dec, 2019

1 commit

  • [ Upstream commit c4b4c421857dc7b1cf0dccbd738472360ff2cd70 ]

    We have an interesting memory leak in the bridge when it is being
    unregistered and is a slave to a master device which would change the
    mac of its slaves on unregister (e.g. bond, team). This is a very
    unusual setup, but we do end up leaking 1 fdb entry because
    dev_set_mac_address() would cause the bridge to insert the new mac address
    into its table after all fdbs are flushed. That is, after dellink() on the
    bridge has finished and we call NETDEV_UNREGISTER, the bond/team would
    release it and call dev_set_mac_address() to restore its original
    address, and that in turn will add an fdb in the bridge.
    One fix is to check for the bridge dev's reg_state in its
    ndo_set_mac_address callback and return an error if the bridge is not in
    NETREG_REGISTERED.
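
    A minimal sketch of such a guard, with hypothetical types and names,
    and with -EBUSY picked as the error code purely for illustration:

```c
#include <errno.h>
#include <string.h>

enum reg_state_sketch {
	SKETCH_UNINITIALIZED,
	SKETCH_REGISTERED,
	SKETCH_UNREGISTERING,
};

struct bridge_sketch {
	enum reg_state_sketch reg_state;
	unsigned char addr[6];
	int fdb_entries;   /* stands in for the forwarding table */
};

/* Refuse address changes unless the device is fully registered. */
static int set_mac_sketch(struct bridge_sketch *br, const unsigned char mac[6])
{
	if (br->reg_state != SKETCH_REGISTERED)
		return -EBUSY;

	memcpy(br->addr, mac, 6);
	br->fdb_entries++;   /* the new local address gets an fdb entry */
	return 0;
}
```

    Because the call fails early during unregistration, no fdb entry is
    created after the table has been flushed.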

    Easy steps to reproduce:
    1. add bond in mode != A/B
    2. add any slave to the bond
    3. add bridge dev as a slave to the bond
    4. destroy the bridge device

    Trace:
    unreferenced object 0xffff888035c4d080 (size 128):
    comm "ip", pid 4068, jiffies 4296209429 (age 1413.753s)
    hex dump (first 32 bytes):
    41 1d c9 36 80 88 ff ff 00 00 00 00 00 00 00 00 A..6............
    d2 19 c9 5e 3f d7 00 00 00 00 00 00 00 00 00 00 ...^?...........
    backtrace:
    [] kmem_cache_alloc+0x155/0x26f
    [] fdb_create+0x21/0x486 [bridge]
    [] fdb_insert+0x91/0xdc [bridge]
    [] br_fdb_change_mac_address+0xb3/0x175 [bridge]
    [] br_stp_change_bridge_id+0xf/0xff [bridge]
    [] br_set_mac_address+0x76/0x99 [bridge]
    [] dev_set_mac_address+0x63/0x9b
    [] __bond_release_one+0x3f6/0x455 [bonding]
    [] bond_netdev_event+0x2f2/0x400 [bonding]
    [] notifier_call_chain+0x38/0x56
    [] call_netdevice_notifiers+0x1e/0x23
    [] rollback_registered_many+0x353/0x6a4
    [] unregister_netdevice_many+0x17/0x6f
    [] rtnl_delete_link+0x3c/0x43
    [] rtnl_dellink+0x1dc/0x20a
    [] rtnetlink_rcv_msg+0x23d/0x268

    Fixes: 43598813386f ("bridge: add local MAC address to forwarding table (v2)")
    Reported-by: syzbot+2add91c08eb181fea1bf@syzkaller.appspotmail.com
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Aleksandrov
     

05 Nov, 2019

1 commit


25 Oct, 2019

1 commit

  • Some interface types can be nested
    (VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc.).
    These interface types should set a lockdep class because, without a
    lockdep class key, lockdep always warns about nonexistent circular locking.

    In the current code, these interfaces have their own lockdep class keys and
    manage them themselves, so there is a lot of duplicate code around
    drivers/net/ and net/.
    This patch adds new generic lockdep keys and some helper functions for them.

    This patch makes the following changes.
    a) Add lockdep class keys in struct net_device
    - qdisc_running, xmit, addr_list, qdisc_busylock
    - these keys are used as dynamic lockdep keys.
    b) When a net_device is being allocated, lockdep keys are registered.
    - alloc_netdev_mqs()
    c) When a net_device is being freed, lockdep keys are unregistered.
    - free_netdev()
    d) Add generic lockdep key helper functions
    - netdev_register_lockdep_key()
    - netdev_unregister_lockdep_key()
    - netdev_update_lockdep_key()
    e) Remove unnecessary generic lockdep macros and functions
    f) Remove unnecessary lockdep code from each interface.

    After this patch, interface modules don't need to maintain
    their own lockdep keys.

    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller

    Taehee Yoo
     

22 Oct, 2019

1 commit

  • This patch removes the iph field from the state structure, which is not
    properly initialized. Instead, add a new field to make the "do we want
    to set DF" be the state bit and move the code to set the DF flag from
    ip_frag_next().

    Joint work with Pablo and Linus.

    Fixes: 19c3401a917b ("net: ipv4: place control buffer handling away from fragmentation iterators")
    Reported-by: Patrick Schönthaler
    Signed-off-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Linus Torvalds
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Oct, 2019

1 commit

  • Thomas found that some forwarded packets would be stuck
    in FQ packet scheduler because their skb->tstamp contained
    timestamps far in the future.

    We thought we addressed this point in commit 8203e2d844d3
    ("net: clear skb->tstamp in forwarding paths") but there
    is still an issue when/if a packet needs to be fragmented.

    In order to meet EDT requirements, we have to make sure all
    fragments get the original skb->tstamp.

    Note that this original skb->tstamp should be zero in
    forwarding path, but might have a non zero value in
    output path if user decided so.
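
    The requirement can be sketched with a toy fragmenter. The types and
    the function are hypothetical and model only packet length and
    timestamp:

```c
#include <stddef.h>
#include <stdint.h>

struct pkt_sketch {
	uint64_t tstamp;   /* stands in for skb->tstamp */
	size_t len;
};

/*
 * Split orig into at most max_frags fragments of up to mtu bytes each.
 * Every fragment inherits the timestamp of the original packet.
 * Returns the number of fragments written.
 */
static size_t fragment_sketch(const struct pkt_sketch *orig, size_t mtu,
			      struct pkt_sketch *frags, size_t max_frags)
{
	size_t left = orig->len;
	size_t n = 0;

	if (mtu == 0)
		return 0;

	while (left > 0 && n < max_frags) {
		size_t chunk = left < mtu ? left : mtu;

		frags[n].len = chunk;
		frags[n].tstamp = orig->tstamp;
		left -= chunk;
		n++;
	}
	return n;
}
```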

    Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
    Signed-off-by: Eric Dumazet
    Reported-by: Thomas Bartschies
    Signed-off-by: David S. Miller

    Eric Dumazet
     

15 Sep, 2019

1 commit


13 Sep, 2019

3 commits


10 Sep, 2019

1 commit

  • NLM_F_MULTI must be used only when a NLMSG_DONE message is sent at the end.
    In fact, NLMSG_DONE is sent only at the end of a dump.

    Libraries like libnl will wait forever for NLMSG_DONE.

    Fixes: 949f1e39a617 ("bridge: mdb: notify on router port add and del")
    CC: Nikolay Aleksandrov
    Signed-off-by: Nicolas Dichtel
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

03 Sep, 2019

2 commits


01 Sep, 2019

1 commit

  • Currently this simplified code snippet fails:

    br_vlan_get_pvid(netdev, &pvid);
    br_vlan_get_info(netdev, pvid, &vinfo);
    ASSERT(vinfo.flags & BRIDGE_VLAN_INFO_PVID);

    It is intuitive that the pvid of a netdevice should have the
    BRIDGE_VLAN_INFO_PVID flag set.

    However I can't seem to pinpoint a commit where this behavior was
    introduced. It seems like it's been like that since forever.

    At a first glance it would make more sense to just handle the
    BRIDGE_VLAN_INFO_PVID flag in __vlan_add_flags. However, as Nikolay
    explains:

    There are a few reasons why we don't do it, most importantly because
    we need to have only one visible pvid at any single time, even if it's
    stale - it must be just one. Right now that rule will not be violated
    by this change, but people will try using this flag and could see two
    pvids simultaneously. You can see that the pvid code is even using
    memory barriers to propagate the new value faster and everywhere the
    pvid is read only once. That is the reason the flag is set
    dynamically when dumping entries, too. A second (weaker) argument
    against would be given the above we don't want another way to do the
    same thing, specifically if it can provide us with two pvids (e.g. if
    walking the vlan list) or if it can provide us with a pvid different
    from the one set in the vg. [Obviously, I'm talking about RCU
    pvid/vlan use cases similar to the dumps. The locked cases are fine.
    I would like to avoid explaining why this shouldn't be relied upon
    without locking]

    So instead of introducing the above change and making sure of the pvid
    uniqueness under RCU, simply dynamically populate the pvid flag in
    br_vlan_get_info().
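
    Populating the flag at query time can be sketched as follows. The
    names and the flag value are hypothetical; the point is that the flag
    is derived from a single read of the current pvid rather than stored
    per VLAN:

```c
#include <stdint.h>

#define SKETCH_VLAN_INFO_PVID 0x2   /* hypothetical flag bit */

struct vlan_info_sketch {
	uint16_t vid;
	uint16_t flags;
};

/*
 * Fill out with the stored flags of a VLAN and add the PVID flag only if
 * the VLAN id equals the pvid that was read once by the caller.
 */
static void get_info_sketch(uint16_t vid, uint16_t stored_flags,
			    uint16_t current_pvid,
			    struct vlan_info_sketch *out)
{
	out->vid = vid;
	out->flags = stored_flags;
	if (vid == current_pvid)
		out->flags |= SKETCH_VLAN_INFO_PVID;
}
```

    Since the flag is computed from one pvid value, at most one VLAN can
    report it for a given read.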

    Signed-off-by: Vladimir Oltean
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Vladimir Oltean
     

30 Aug, 2019

1 commit


28 Aug, 2019

1 commit


19 Aug, 2019

1 commit

  • The ordering of arguments to the x_tables ADD_COUNTER macro
    appears to be wrong in ebtables (cf. ip_tables.c, ip6_tables.c,
    and arp_tables.c).

    This causes data corruption in the ebtables userspace tools
    because they get incorrect packet & byte counts from the kernel.
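
    The effect of swapped arguments can be sketched with a stand-in
    macro. ADD_COUNTER_SKETCH and the struct are hypothetical; they assume
    a macro that takes the byte count before the packet count:

```c
struct counter_sketch {
	unsigned long long pcnt;   /* packets */
	unsigned long long bcnt;   /* bytes */
};

/* Hypothetical macro: bytes are the second argument, packets the third. */
#define ADD_COUNTER_SKETCH(c, b, p) \
	do { (c).bcnt += (b); (c).pcnt += (p); } while (0)

static void account_ok(struct counter_sketch *c,
		       unsigned long long bytes, unsigned long long packets)
{
	ADD_COUNTER_SKETCH(*c, bytes, packets);
}

static void account_swapped(struct counter_sketch *c,
			    unsigned long long bytes, unsigned long long packets)
{
	ADD_COUNTER_SKETCH(*c, packets, bytes);   /* arguments in the wrong order */
}
```

    Both calls compile cleanly because the arguments have the same type,
    which is why this kind of mistake is easy to miss.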

    Fixes: d72133e628803 ("netfilter: ebtables: use ADD_COUNTER macro")
    Signed-off-by: Todd Seidelmann
    Signed-off-by: Pablo Neira Ayuso

    Todd Seidelmann
     

18 Aug, 2019

4 commits

  • Currently this is needed only for user-space compatibility, so similar
    object adds/deletes as the dumped ones would succeed. Later it can be
    used for L2 mcast MAC add/delete.

    v3: fix compiler warning (DaveM)
    v2: don't send a notification when used from user-space, arm the group
    timer if no ports are left after host entry del

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Currently we dump only the port mdb entries but we can have host-joined
    entries on the bridge itself and they should be treated as normal temp
    mdbs, they're already notified:
    $ bridge monitor all
    [MDB]dev br0 port br0 grp ff02::8 temp

    The group will not be shown in the bridge mdb output, but it takes 1 slot
    and it's timing out. If it's only host-joined then the mdb show output
    can even be empty.

    After this patch we show the host-joined groups:
    $ bridge mdb show
    dev br0 port br0 grp ff02::8 temp

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • We have to factor out the mdb fill portion in order to re-use it later for
    the bridge mdb entries. No functional changes intended.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Trivial patch to move the vlan comments in their proper places above the
    vid 0 checks.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

07 Aug, 2019

1 commit


06 Aug, 2019

1 commit

  • Most of the bridge device's vlan init bugs come from the fact that its
    default pvid is created at the wrong time, way too early in ndo_init()
    before the device is even assigned an ifindex. It introduces a bug: when
    the bridge's dev_addr is added as an fdb entry during the initial default
    pvid creation, the notification has ifindex/NDA_MASTER both equal to 0
    (see example below), which makes no sense for user-space[0] and is wrong.
    Usually user-space software would ignore such entries, but they are
    actually valid and will eventually have all necessary attributes.
    It makes much more sense to send a notification *after* the device has
    registered and has a proper ifindex allocated rather than before when
    there's a chance that the registration might still fail or to receive
    it with ifindex/NDA_MASTER == 0. Note that we can remove the fdb flush
    from br_vlan_flush() since that case can no longer happen. At
    NETDEV_REGISTER br->default_pvid is always == 1 as it's initialized by
    br_vlan_init() before that and at NETDEV_UNREGISTER it can be anything
    depending why it was called (if called due to NETDEV_REGISTER error
    it'll still be == 1, otherwise it could be any value changed during the
    device life time).

    For the demonstration below a small change to iproute2 for printing all fdb
    notifications is added, because it contained a workaround not to show
    entries with ifindex == 0.
    Command executed while monitoring: $ ip l add br0 type bridge
    Before (both ifindex and master == 0):
    $ bridge monitor fdb
    36:7e:8a:b3:56:ba dev * vlan 1 master * permanent

    After (proper br0 ifindex):
    $ bridge monitor fdb
    e6:2a:ae:7a:b7:48 dev br0 vlan 1 master br0 permanent

    v4: move only the default pvid init/deinit to NETDEV_REGISTER/UNREGISTER
    v3: send the correct v2 patch with all changes (stub should return 0)
    v2: on error in br_vlan_init set br->vlgrp to NULL and return 0 in
    the br_vlan_bridge_event stub when bridge vlans are disabled

    [0] https://bugzilla.kernel.org/show_bug.cgi?id=204389

    Reported-by: michael-dev
    Fixes: 5be5a2df40f0 ("bridge: Add filtering support for default_pvid")
    Signed-off-by: Nikolay Aleksandrov
    Acked-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

01 Aug, 2019

2 commits

  • In user-space there's no way to distinguish why an mdb entry was deleted
    and that is a problem for daemons which would like to keep the mdb in
    sync with remote ends (e.g. mlag) but would also like to converge faster.
    In almost all cases we'd like to age-out the remote entry for performance
    and convergence reasons except when fast-leave is enabled. In that case we
    want explicit immediate remote delete, thus add mdb flag which is set only
    when the entry is being deleted due to fast-leave.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • When permanent entries were introduced by the commit below, they were
    exempt from timing out, and thus igmp leave wouldn't affect them unless
    fast leave was enabled on the port, a feature that was added before
    permanent entries existed. It shouldn't matter whether fast leave is
    enabled or not: if the user added a permanent entry, it shouldn't be
    deleted on igmp leave.

    Before:
    $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
    $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
    $ bridge mdb show
    dev br0 port eth4 grp 229.1.1.1 permanent

    < join and leave 229.1.1.1 on eth4 >

    $ bridge mdb show
    $

    After:
    $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
    $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
    $ bridge mdb show
    dev br0 port eth4 grp 229.1.1.1 permanent

    < join and leave 229.1.1.1 on eth4 >

    $ bridge mdb show
    dev br0 port eth4 grp 229.1.1.1 permanent

    Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

31 Jul, 2019

1 commit

  • Pablo Neira Ayuso says:

    ====================
    netfilter fixes for net

    The following patchset contains Netfilter fixes for your net tree:

    1) memleak in ebtables from the error path for the 32/64 compat layer,
    from Florian Westphal.

    2) Fix inverted meta ifname/ifidx matching when no interface is set
    on either from the input/output path, from Phil Sutter.

    3) Remove goto label in nft_meta_bridge, also from Phil.

    4) Missing include guard in xt_connlabel, from Masahiro Yamada.

    5) Two patches to fix ipset destination MAC matching coming from
    Stefano Brivio, via Jozsef Kadlecsik.

    6) Fix set rename and listing concurrency problem, from Shijie Luo.
    Patch also coming via Jozsef Kadlecsik.

    7) ebtables 32/64 compat missing base chain policy in rule count,
    from Florian Westphal.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

30 Jul, 2019

2 commits

  • ebtables doesn't include the base chain policies in the rule count,
    so we need to add them manually when we call into the x_tables core
    to allocate space for the compat offset table.

    This led syzbot to trigger:
    WARNING: CPU: 1 PID: 9012 at net/netfilter/x_tables.c:649
    xt_compat_add_offset.cold+0x11/0x36 net/netfilter/x_tables.c:649

    Reported-by: syzbot+276ddebab3382bbf72db@syzkaller.appspotmail.com
    Fixes: 2035f3ff8eaa ("netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • On initialization failure we have to delete the local fdb which was
    inserted due to the default pvid creation. This problem has been present
    since the inception of default_pvid. Note that currently there are 2 cases:
    1) in br_dev_init() when br_multicast_init() fails
    2) if register_netdevice() fails after calling ndo_init()

    This patch takes care of both since br_vlan_flush() is called on both
    occasions. Also the new fdb delete would be a no-op on normal bridge
    device destruction since the local fdb would've been already flushed by
    br_dev_delete(). This is not an issue for ports since nbp_vlan_init() is
    called last when adding a port thus nothing can fail after it.

    Reported-by: syzbot+88533dc8b582309bf3ee@syzkaller.appspotmail.com
    Fixes: 5be5a2df40f0 ("bridge: Add filtering support for default_pvid")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

25 Jul, 2019

2 commits

  • The label is used just once and the code it points at is not reused, no
    point in keeping it.

    Signed-off-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Phil Sutter
     
  • nft_meta_get_eval()'s tendency to bail out setting NFT_BREAK verdict in
    situations where required data is missing leads to unexpected behaviour
    with inverted checks like so:

    | meta iifname != eth0 accept

    This rule will never match if there is no input interface (or it is not
    known) which is not intuitive and, what's worse, breaks consistency of
    iptables-nft with iptables-legacy.

    Fix this by falling back to placing a value in dreg which never matches
    (avoiding accidental matches), i.e. zero for interface index and an
    empty string for interface name.
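
    The fallback can be sketched in userspace like this. The helper names
    and the register size are hypothetical; the sketch models only the
    interface name case of a rule such as "meta iifname != eth0 accept":

```c
#include <stdbool.h>
#include <string.h>

#define IFNAME_REG_SIZE 16   /* hypothetical register size for a name */

/*
 * Load the input interface name into reg. If no interface is known,
 * leave an empty string instead of aborting rule evaluation.
 */
static void load_iifname(const char *ifname, char reg[IFNAME_REG_SIZE])
{
	memset(reg, 0, IFNAME_REG_SIZE);
	if (ifname)
		strncpy(reg, ifname, IFNAME_REG_SIZE - 1);
}

/* Evaluates "iifname != wanted" against the loaded register. */
static bool iifname_differs(const char *ifname, const char *wanted)
{
	char reg[IFNAME_REG_SIZE];

	load_iifname(ifname, reg);
	return strcmp(reg, wanted) != 0;
}
```

    With the empty-string fallback, the inverted check is true for
    packets without an input interface, so the rule can match them.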

    Signed-off-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Phil Sutter
     

22 Jul, 2019

1 commit

  • In compat_do_replace(), a temporary buffer is allocated through vmalloc()
    to hold entries copied from user space. The buffer address is first
    saved to 'newinfo->entries', and later assigned to 'entries_tmp'. Then
    the entries in this temporary buffer are copied to the internal kernel
    structure through compat_copy_entries(). If this copy process fails,
    compat_do_replace() should be terminated. However, the allocated temporary
    buffer is not freed on this path, leading to a memory leak.

    To fix the bug, free the buffer before returning from compat_do_replace().
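
    The error-path pattern can be sketched in userspace as follows. All
    names are hypothetical, malloc()/free() stand in for the kernel
    allocator, and live_buffers exists only to make the leak observable:

```c
#include <stdlib.h>
#include <string.h>

/* Demonstration-only count of buffers allocated but not yet freed. */
static int live_buffers;

/* Hypothetical copy step that can be forced to fail. */
static int copy_entries_sketch(void *buf, size_t len, int fail)
{
	if (fail)
		return -1;
	memset(buf, 0, len);
	return 0;
}

static int replace_sketch(size_t len, int fail)
{
	void *tmp = malloc(len);
	int ret;

	if (!tmp)
		return -1;
	live_buffers++;

	ret = copy_entries_sketch(tmp, len, fail);
	if (ret < 0)
		goto out_free;   /* error path: must not skip the free */

	/* ... a real implementation would consume the copied entries here ... */

out_free:
	free(tmp);
	live_buffers--;
	return ret;
}
```

    Routing the failure through the same cleanup label as the success
    path keeps the temporary buffer from outliving the call.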

    Signed-off-by: Wenwen Wang
    Reviewed-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Wenwen Wang
     

20 Jul, 2019

1 commit

  • The new nft_meta_bridge code fails to link as built-in when NF_TABLES
    is a loadable module.

    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_eval':
    nft_meta_bridge.c:(.text+0x1e8): undefined reference to `nft_meta_get_eval'
    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_init':
    nft_meta_bridge.c:(.text+0x468): undefined reference to `nft_meta_get_init'
    nft_meta_bridge.c:(.text+0x49c): undefined reference to `nft_parse_register'
    nft_meta_bridge.c:(.text+0x4cc): undefined reference to `nft_validate_register_store'
    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_exit':
    nft_meta_bridge.c:(.exit.text+0x14): undefined reference to `nft_unregister_expr'
    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_init':
    nft_meta_bridge.c:(.init.text+0x14): undefined reference to `nft_register_expr'
    net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x60): undefined reference to `nft_meta_get_dump'
    net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x88): undefined reference to `nft_meta_set_eval'

    This can happen because the NF_TABLES_BRIDGE dependency itself is just a
    'bool'. Make the symbol a 'tristate' instead so Kconfig can propagate the
    dependencies correctly.

    Fixes: 30e103fe24de ("netfilter: nft_meta: move bridge meta keys into nft_meta_bridge")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Pablo Neira Ayuso

    Arnd Bergmann