05 Jan, 2020

3 commits

  • [ Upstream commit bd085ef678b2cc8c38c105673dfe8ff8f5ec0c57 ]

    The MTU update code is supposed to be invoked in response to real
    networking events that update the PMTU. In IPv6 PMTU update function
    __ip6_rt_update_pmtu() we called dst_confirm_neigh() to update neighbor
    confirmed time.

    But for tunnel code, it will call pmtu before xmit, like:
    - tnl_update_pmtu()
    - skb_dst_update_pmtu()
    - ip6_rt_update_pmtu()
    - __ip6_rt_update_pmtu()
    - dst_confirm_neigh()

    If the tunnel remote dst mac address has changed and we still do the neigh
    confirm, we will not be able to update the neigh cache and ping6 to the
    remote will fail.

    So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
    should not be invoking dst_confirm_neigh() as we have no evidence
    of successful two-way communication at this point.

    On the other hand it is also important to keep the neigh reachability fresh
    for TCP flows, so we cannot remove this dst_confirm_neigh() call.

    To fix the issue, we have to add a new bool parameter for dst_ops.update_pmtu
    to choose whether we should do neigh update or not. I will add the parameter
    in this patch and set all the callers to true to comply with the previous
    way, and fix the tunnel code one by one on later patches.

    v5: No change.
    v4: No change.
    v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
    dst_ops.update_pmtu to control whether we should do neighbor confirm.
    Also split the big patch to small ones for each area.
    v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

    Suggested-by: David Miller
    Reviewed-by: Guillaume Nault
    Acked-by: David Ahern
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Hangbin Liu
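
The effect of the new parameter described in the commit above can be illustrated with a small Python model. This is a hedged sketch, not kernel code: `Neigh`, `Dst`, and this `update_pmtu` are hypothetical stand-ins for a neighbour entry, a dst entry, and `dst_ops.update_pmtu`. Only the idea of a boolean that gates the neighbour confirmation comes from the commit message, and the `False` caller reflects the later tunnel patches it announces.

```python
from dataclasses import dataclass, field


@dataclass
class Neigh:
    """Hypothetical stand-in for a neighbour cache entry."""
    confirmed: int = 0  # time of the last reachability confirmation


@dataclass
class Dst:
    """Hypothetical stand-in for a dst entry with a cached path MTU."""
    mtu: int = 1500
    neigh: Neigh = field(default_factory=Neigh)


def update_pmtu(dst: Dst, mtu: int, now: int, confirm_neigh: bool) -> None:
    """Lower the cached PMTU; refresh the neighbour only when asked to."""
    if confirm_neigh:
        # Models dst_confirm_neigh(): only appropriate when there is
        # evidence of successful two-way communication.
        dst.neigh.confirmed = now
    if mtu < dst.mtu:
        dst.mtu = mtu


# A caller with evidence of two-way traffic (e.g. a TCP flow) passes True.
tcp_dst = Dst()
update_pmtu(tcp_dst, 1400, now=100, confirm_neigh=True)

# A tunnel transmit path has no such evidence, so it would pass False.
tunnel_dst = Dst()
update_pmtu(tunnel_dst, 1400, now=100, confirm_neigh=False)
```

In this model both callers still get the lower MTU; only the neighbour timestamp differs, which is what allows a stale neighbour entry behind a tunnel to be re-resolved.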
     
  • commit 5604285839aaedfb23ebe297799c6e558939334d upstream.

    syzbot is kind enough to remind us we need to call pskb_may_pull()

    BUG: KMSAN: uninit-value in br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
    CPU: 1 PID: 11631 Comm: syz-executor.1 Not tainted 5.4.0-rc8-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:

    __dump_stack lib/dump_stack.c:77 [inline]
    dump_stack+0x1c9/0x220 lib/dump_stack.c:118
    kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108
    __msan_warning+0x64/0xc0 mm/kmsan/kmsan_instr.c:245
    br_nf_forward_arp+0xe61/0x1230 net/bridge/br_netfilter_hooks.c:665
    nf_hook_entry_hookfn include/linux/netfilter.h:135 [inline]
    nf_hook_slow+0x18b/0x3f0 net/netfilter/core.c:512
    nf_hook include/linux/netfilter.h:260 [inline]
    NF_HOOK include/linux/netfilter.h:303 [inline]
    __br_forward+0x78f/0xe30 net/bridge/br_forward.c:109
    br_flood+0xef0/0xfe0 net/bridge/br_forward.c:234
    br_handle_frame_finish+0x1a77/0x1c20 net/bridge/br_input.c:162
    nf_hook_bridge_pre net/bridge/br_input.c:245 [inline]
    br_handle_frame+0xfb6/0x1eb0 net/bridge/br_input.c:348
    __netif_receive_skb_core+0x20b9/0x51a0 net/core/dev.c:4830
    __netif_receive_skb_one_core net/core/dev.c:4927 [inline]
    __netif_receive_skb net/core/dev.c:5043 [inline]
    process_backlog+0x610/0x13c0 net/core/dev.c:5874
    napi_poll net/core/dev.c:6311 [inline]
    net_rx_action+0x7a6/0x1aa0 net/core/dev.c:6379
    __do_softirq+0x4a1/0x83a kernel/softirq.c:293
    do_softirq_own_stack+0x49/0x80 arch/x86/entry/entry_64.S:1091

    do_softirq kernel/softirq.c:338 [inline]
    __local_bh_enable_ip+0x184/0x1d0 kernel/softirq.c:190
    local_bh_enable+0x36/0x40 include/linux/bottom_half.h:32
    rcu_read_unlock_bh include/linux/rcupdate.h:688 [inline]
    __dev_queue_xmit+0x38e8/0x4200 net/core/dev.c:3819
    dev_queue_xmit+0x4b/0x60 net/core/dev.c:3825
    packet_snd net/packet/af_packet.c:2959 [inline]
    packet_sendmsg+0x8234/0x9100 net/packet/af_packet.c:2984
    sock_sendmsg_nosec net/socket.c:637 [inline]
    sock_sendmsg net/socket.c:657 [inline]
    __sys_sendto+0xc44/0xc70 net/socket.c:1952
    __do_sys_sendto net/socket.c:1964 [inline]
    __se_sys_sendto+0x107/0x130 net/socket.c:1960
    __x64_sys_sendto+0x6e/0x90 net/socket.c:1960
    do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x45a679
    Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
    RSP: 002b:00007f0a3c9e5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 000000000045a679
    RDX: 000000000000000e RSI: 0000000020000200 RDI: 0000000000000003
    RBP: 000000000075bf20 R08: 00000000200000c0 R09: 0000000000000014
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007f0a3c9e66d4
    R13: 00000000004c8ec1 R14: 00000000004dfe28 R15: 00000000ffffffff

    Uninit was created at:
    kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline]
    kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132
    kmsan_slab_alloc+0x97/0x100 mm/kmsan/kmsan_hooks.c:86
    slab_alloc_node mm/slub.c:2773 [inline]
    __kmalloc_node_track_caller+0xe27/0x11a0 mm/slub.c:4381
    __kmalloc_reserve net/core/skbuff.c:141 [inline]
    __alloc_skb+0x306/0xa10 net/core/skbuff.c:209
    alloc_skb include/linux/skbuff.h:1049 [inline]
    alloc_skb_with_frags+0x18c/0xa80 net/core/skbuff.c:5662
    sock_alloc_send_pskb+0xafd/0x10a0 net/core/sock.c:2244
    packet_alloc_skb net/packet/af_packet.c:2807 [inline]
    packet_snd net/packet/af_packet.c:2902 [inline]
    packet_sendmsg+0x63a6/0x9100 net/packet/af_packet.c:2984
    sock_sendmsg_nosec net/socket.c:637 [inline]
    sock_sendmsg net/socket.c:657 [inline]
    __sys_sendto+0xc44/0xc70 net/socket.c:1952
    __do_sys_sendto net/socket.c:1964 [inline]
    __se_sys_sendto+0x107/0x130 net/socket.c:1960
    __x64_sys_sendto+0x6e/0x90 net/socket.c:1960
    do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:291
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Fixes: c4e70a87d975 ("netfilter: bridge: rename br_netfilter.c to br_netfilter_hooks.c")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Reviewed-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
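
The class of bug reported above is reading a header field before checking that the buffer actually contains the header. A minimal Python sketch of the check, under stated assumptions: `arp_proto_len` is a hypothetical helper, not a kernel function, and the 8-byte layout is the fixed part of an ARP header (hardware type, protocol type, hardware length, protocol length, opcode).

```python
import struct

# Fixed part of an ARP header: htype(2) + ptype(2) + hlen(1) + plen(1) + op(2)
ARP_HDR_LEN = 8


def arp_proto_len(frame: bytes):
    """Return the ARP protocol address length, or None if the frame is too short.

    The length check plays the role of the "may pull" test: the header
    fields are only read once enough bytes are known to be present.
    """
    if len(frame) < ARP_HDR_LEN:
        return None
    _htype, _ptype, _hlen, plen, _op = struct.unpack("!HHBBH", frame[:ARP_HDR_LEN])
    return plen


short_frame = b"\x00" * 4                              # truncated header
full_frame = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)  # Ethernet/IPv4 ARP request
```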
     
  • commit e608f631f0ba5f1fc5ee2e260a3a35d13107cbfe upstream.

    syzbot reported the following splat:

    BUG: KASAN: vmalloc-out-of-bounds in size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
    BUG: KASAN: vmalloc-out-of-bounds in compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
    Read of size 4 at addr ffffc900004461f4 by task syz-executor267/7937

    CPU: 1 PID: 7937 Comm: syz-executor267 Not tainted 5.5.0-rc1-syzkaller #0
    size_entry_mwt net/bridge/netfilter/ebtables.c:2063 [inline]
    compat_copy_entries+0x128b/0x1380 net/bridge/netfilter/ebtables.c:2155
    compat_do_replace+0x344/0x720 net/bridge/netfilter/ebtables.c:2249
    compat_do_ebt_set_ctl+0x22f/0x27e net/bridge/netfilter/ebtables.c:2333
    [..]

    Because padding isn't considered during computation of ->buf_user_offset,
    "total" is decremented by fewer bytes than it should be.

    Therefore, the first part of

    if (*total < sizeof(*entry) || entry->next_offset < sizeof(*entry))

    passes when it should not. This causes an out-of-bounds access:
    entry->next_offset is past the vmalloced size.

    Reject padding and check that the computed user offset (sum of the ebt_entry
    structure plus all individual matches/watchers/targets) is the same
    value that userspace gave us as the offset of the next entry.

    Reported-by: syzbot+f68108fed972453a0ad4@syzkaller.appspotmail.com
    Fixes: 81e675c227ec ("netfilter: ebtables: add CONFIG_COMPAT support")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
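
The consistency check described in the commit above can be sketched in a few lines of Python. This is an illustrative model only: `next_offset_is_valid` and the sizes used are hypothetical, and the real code works on compat structures in `net/bridge/netfilter/ebtables.c`.

```python
def next_offset_is_valid(entry_size: int, sub_sizes: list, user_next_offset: int) -> bool:
    """Accept an entry only if the offset computed from its parts equals
    the next-entry offset supplied by userspace.

    Any padding between the parts and the next entry makes the two
    values differ, so padded entries are rejected.
    """
    computed = entry_size + sum(sub_sizes)
    return computed == user_next_offset


ENTRY_SIZE = 112          # hypothetical size of the fixed entry structure
PARTS = [40, 32]          # hypothetical match/watcher/target sizes

exact = next_offset_is_valid(ENTRY_SIZE, PARTS, 184)       # no padding
padded = next_offset_is_valid(ENTRY_SIZE, PARTS, 188)      # 4 bytes of padding
```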
     

18 Dec, 2019

1 commit

  • [ Upstream commit c4b4c421857dc7b1cf0dccbd738472360ff2cd70 ]

    We have an interesting memory leak in the bridge when it is being
    unregistered and is a slave to a master device which would change the
    mac of its slaves on unregister (e.g. bond, team). This is a very
    unusual setup, but we do end up leaking 1 fdb entry because
    dev_set_mac_address() causes the bridge to insert the new mac address
    into its table after all fdbs are flushed. That is, after dellink() on
    the bridge has finished and NETDEV_UNREGISTER is sent, the bond/team
    releases the bridge and calls dev_set_mac_address() to restore its
    original address, which in turn adds an fdb entry in the bridge.
    One fix is to check for the bridge dev's reg_state in its
    ndo_set_mac_address callback and return an error if the bridge is not in
    NETREG_REGISTERED.

    Easy steps to reproduce:
    1. add bond in mode != A/B
    2. add any slave to the bond
    3. add bridge dev as a slave to the bond
    4. destroy the bridge device

    Trace:
    unreferenced object 0xffff888035c4d080 (size 128):
    comm "ip", pid 4068, jiffies 4296209429 (age 1413.753s)
    hex dump (first 32 bytes):
    41 1d c9 36 80 88 ff ff 00 00 00 00 00 00 00 00 A..6............
    d2 19 c9 5e 3f d7 00 00 00 00 00 00 00 00 00 00 ...^?...........
    backtrace:
    kmem_cache_alloc+0x155/0x26f
    fdb_create+0x21/0x486 [bridge]
    fdb_insert+0x91/0xdc [bridge]
    br_fdb_change_mac_address+0xb3/0x175 [bridge]
    br_stp_change_bridge_id+0xf/0xff [bridge]
    br_set_mac_address+0x76/0x99 [bridge]
    dev_set_mac_address+0x63/0x9b
    __bond_release_one+0x3f6/0x455 [bonding]
    bond_netdev_event+0x2f2/0x400 [bonding]
    notifier_call_chain+0x38/0x56
    call_netdevice_notifiers+0x1e/0x23
    rollback_registered_many+0x353/0x6a4
    unregister_netdevice_many+0x17/0x6f
    rtnl_delete_link+0x3c/0x43
    rtnl_dellink+0x1dc/0x20a
    rtnetlink_rcv_msg+0x23d/0x268

    Fixes: 43598813386f ("bridge: add local MAC address to forwarding table (v2)")
    Reported-by: syzbot+2add91c08eb181fea1bf@syzkaller.appspotmail.com
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Aleksandrov
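
The fix proposed in the commit above (refuse a MAC change once the bridge is no longer registered) can be modelled as follows. This is a simplified Python sketch, not the kernel implementation: the `Bridge` class is hypothetical, and the specific error code returned here (`EBUSY`) is an assumption, since the commit message only says "return an error".

```python
import errno
from enum import Enum, auto


class RegState(Enum):
    REGISTERED = auto()
    UNREGISTERING = auto()


class Bridge:
    """Hypothetical model of a bridge device with a local fdb table."""

    def __init__(self, mac: str):
        self.mac = mac
        self.reg_state = RegState.REGISTERED
        self.fdb = {mac}

    def dellink(self) -> None:
        # Tearing the bridge down flushes all fdb entries.
        self.fdb.clear()
        self.reg_state = RegState.UNREGISTERING

    def set_mac_address(self, mac: str) -> int:
        # The guard: do not insert a new local fdb entry after the flush.
        if self.reg_state is not RegState.REGISTERED:
            return -errno.EBUSY  # assumed error code
        self.fdb.discard(self.mac)
        self.fdb.add(mac)
        self.mac = mac
        return 0


br = Bridge("36:7e:8a:b3:56:ba")
br.dellink()
# A master (bond/team) restoring the original address during unregister:
rc = br.set_mac_address("e6:2a:ae:7a:b7:48")
```

Without the guard, the last call would add an entry to a table that nothing flushes again, which is the single leaked fdb entry in the trace.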
     

05 Nov, 2019

1 commit


25 Oct, 2019

1 commit

  • Some interface types could be nested.
    (VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc.)
    These interface types should set a lockdep class because, without a
    lockdep class key, lockdep always warns about nonexistent circular locking.

    In the current code, these interfaces have their own lockdep class keys
    and manage them themselves, so there is a lot of duplicate code around
    /drivers/net and /net/.
    This patch adds new generic lockdep keys and some helper functions for them.

    This patch makes the following changes.
    a) Add lockdep class keys in struct net_device
    - qdisc_running, xmit, addr_list, qdisc_busylock
    - these keys are used as dynamic lockdep key.
    b) When net_device is being allocated, lockdep keys are registered.
    - alloc_netdev_mqs()
    c) When net_device is being freed, lockdep keys are unregistered.
    - free_netdev()
    d) Add generic lockdep key helper function
    - netdev_register_lockdep_key()
    - netdev_unregister_lockdep_key()
    - netdev_update_lockdep_key()
    e) Remove unnecessary generic lockdep macros and functions
    f) Remove unnecessary lockdep code from each interface.

    After this patch, individual interface modules don't need to maintain
    their lockdep keys.

    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller

    Taehee Yoo
     

22 Oct, 2019

1 commit

  • This patch removes the iph field from the state structure, which is not
    properly initialized. Instead, add a new field to make the "do we want
    to set DF" be the state bit and move the code to set the DF flag from
    ip_frag_next().

    Joint work with Pablo and Linus.

    Fixes: 19c3401a917b ("net: ipv4: place control buffer handling away from fragmentation iterators")
    Reported-by: Patrick Schönthaler
    Signed-off-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Linus Torvalds
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Oct, 2019

1 commit

  • Thomas found that some forwarded packets would be stuck
    in FQ packet scheduler because their skb->tstamp contained
    timestamps far in the future.

    We thought we addressed this point in commit 8203e2d844d3
    ("net: clear skb->tstamp in forwarding paths") but there
    is still an issue when/if a packet needs to be fragmented.

    In order to meet EDT requirements, we have to make sure all
    fragments get the original skb->tstamp.

    Note that this original skb->tstamp should be zero in
    forwarding path, but might have a non zero value in
    output path if user decided so.

    Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
    Signed-off-by: Eric Dumazet
    Reported-by: Thomas Bartschies
    Signed-off-by: David S. Miller

    Eric Dumazet
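
The requirement stated in the commit above, that every fragment carries the original skb->tstamp, can be illustrated with a toy Python model. `Skb` and `fragment` are hypothetical names used only for this sketch; real fragmentation also copies headers and sets offsets, which is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    """Hypothetical packet buffer: a payload plus a departure timestamp."""
    payload: bytes
    tstamp: int = 0


def fragment(skb: Skb, max_payload: int) -> list:
    """Split a packet into fragments, each inheriting the original tstamp."""
    return [
        Skb(skb.payload[i:i + max_payload], tstamp=skb.tstamp)
        for i in range(0, len(skb.payload), max_payload)
    ]


# Output path: the user chose a non-zero departure time.
frags = fragment(Skb(b"x" * 2500, tstamp=42), 1000)

# Forwarding path: tstamp was already cleared, so fragments stay at zero.
fwd_frags = fragment(Skb(b"y" * 2500), 1000)
```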
     

15 Sep, 2019

1 commit


13 Sep, 2019

3 commits


10 Sep, 2019

1 commit

  • NLM_F_MULTI must be used only when a NLMSG_DONE message is sent at the end.
    In fact, NLMSG_DONE is sent only at the end of a dump.

    Libraries like libnl will wait forever for NLMSG_DONE.

    Fixes: 949f1e39a617 ("bridge: mdb: notify on router port add and del")
    CC: Nikolay Aleksandrov
    Signed-off-by: Nicolas Dichtel
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

03 Sep, 2019

2 commits


01 Sep, 2019

1 commit

  • Currently this simplified code snippet fails:

    br_vlan_get_pvid(netdev, &pvid);
    br_vlan_get_info(netdev, pvid, &vinfo);
    ASSERT(vinfo.flags & BRIDGE_VLAN_INFO_PVID);

    It is intuitive that the pvid of a netdevice should have the
    BRIDGE_VLAN_INFO_PVID flag set.

    However I can't seem to pinpoint a commit where this behavior was
    introduced. It seems like it's been like that since forever.

    At a first glance it would make more sense to just handle the
    BRIDGE_VLAN_INFO_PVID flag in __vlan_add_flags. However, as Nikolay
    explains:

    There are a few reasons why we don't do it, most importantly because
    we need to have only one visible pvid at any single time, even if it's
    stale - it must be just one. Right now that rule will not be violated
    by this change, but people will try using this flag and could see two
    pvids simultaneously. You can see that the pvid code is even using
    memory barriers to propagate the new value faster and everywhere the
    pvid is read only once. That is the reason the flag is set
    dynamically when dumping entries, too. A second (weaker) argument
    against would be given the above we don't want another way to do the
    same thing, specifically if it can provide us with two pvids (e.g. if
    walking the vlan list) or if it can provide us with a pvid different
    from the one set in the vg. [Obviously, I'm talking about RCU
    pvid/vlan use cases similar to the dumps. The locked cases are fine.
    I would like to avoid explaining why this shouldn't be relied upon
    without locking]

    So instead of introducing the above change and making sure of the pvid
    uniqueness under RCU, simply dynamically populate the pvid flag in
    br_vlan_get_info().

    Signed-off-by: Vladimir Oltean
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Vladimir Oltean
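
The approach chosen in the commit above, computing the pvid flag at read time instead of storing it per VLAN, can be sketched in Python. `VlanGroup` and `vlan_get_info` are hypothetical stand-ins for the kernel's vlan group and `br_vlan_get_info()`; the flag value matches the uapi definition of `BRIDGE_VLAN_INFO_PVID` (bit 1).

```python
BRIDGE_VLAN_INFO_PVID = 1 << 1
BRIDGE_VLAN_INFO_UNTAGGED = 1 << 2


class VlanGroup:
    """Hypothetical vlan group: one pvid value plus per-VLAN stored flags."""

    def __init__(self):
        self.pvid = 0
        self.vlans = {}  # vid -> stored flags; the PVID bit is never stored


def vlan_get_info(vg: VlanGroup, vid: int) -> int:
    """Return the flags for vid, adding the PVID bit dynamically."""
    flags = vg.vlans[vid]
    if vid == vg.pvid:
        flags |= BRIDGE_VLAN_INFO_PVID
    return flags


vg = VlanGroup()
vg.vlans = {1: BRIDGE_VLAN_INFO_UNTAGGED, 10: 0}
vg.pvid = 1
before = (vlan_get_info(vg, 1), vlan_get_info(vg, 10))

vg.pvid = 10  # a single assignment moves the pvid
after = (vlan_get_info(vg, 1), vlan_get_info(vg, 10))
```

Because the pvid lives in one place, a reader can never observe two VLANs carrying the flag at once, which is the uniqueness property Nikolay's explanation asks for.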
     

30 Aug, 2019

1 commit


28 Aug, 2019

1 commit


19 Aug, 2019

1 commit

  • The ordering of arguments to the x_tables ADD_COUNTER macro
    appears to be wrong in ebtables (cf. ip_tables.c, ip6_tables.c,
    and arp_tables.c).

    This causes data corruption in the ebtables userspace tools
    because they get incorrect packet & byte counts from the kernel.

    Fixes: d72133e628803 ("netfilter: ebtables: use ADD_COUNTER macro")
    Signed-off-by: Todd Seidelmann
    Signed-off-by: Pablo Neira Ayuso

    Todd Seidelmann
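
The effect of swapping the two counter arguments can be shown with a small Python model. `Counter` and `add_counter` are hypothetical stand-ins for the x_tables counter structure and the `ADD_COUNTER(c, b, p)` macro, which adds `b` to the byte count and `p` to the packet count.

```python
from dataclasses import dataclass


@dataclass
class Counter:
    pcnt: int = 0  # packets
    bcnt: int = 0  # bytes


def add_counter(counter: Counter, nbytes: int, npackets: int) -> None:
    """Mirror of the macro's argument order: bytes first, then packets."""
    counter.bcnt += nbytes
    counter.pcnt += npackets


skb_len = 1500

wrong = Counter()
add_counter(wrong, 1, skb_len)   # swapped: one byte, 1500 packets

right = Counter()
add_counter(right, skb_len, 1)   # intended: 1500 bytes, one packet
```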
     

18 Aug, 2019

4 commits

  • Currently this is needed only for user-space compatibility, so similar
    object adds/deletes as the dumped ones would succeed. Later it can be
    used for L2 mcast MAC add/delete.

    v3: fix compiler warning (DaveM)
    v2: don't send a notification when used from user-space, arm the group
    timer if no ports are left after host entry del

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Currently we dump only the port mdb entries but we can have host-joined
    entries on the bridge itself and they should be treated as normal temp
    mdbs. They're already notified:
    $ bridge monitor all
    [MDB]dev br0 port br0 grp ff02::8 temp

    The group will not be shown in the bridge mdb output, but it takes 1 slot
    and it's timing out. If it's only host-joined then the mdb show output
    can even be empty.

    After this patch we show the host-joined groups:
    $ bridge mdb show
    dev br0 port br0 grp ff02::8 temp

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • We have to factor out the mdb fill portion in order to re-use it later for
    the bridge mdb entries. No functional changes intended.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Trivial patch to move the vlan comments in their proper places above the
    vid 0 checks.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

07 Aug, 2019

1 commit


06 Aug, 2019

1 commit

  • Most of the bridge device's vlan init bugs come from the fact that its
    default pvid is created at the wrong time, way too early in ndo_init()
    before the device is even assigned an ifindex. It introduces a bug: when the
    bridge's dev_addr is added as fdb during the initial default pvid creation,
    the notification has ifindex/NDA_MASTER both equal to 0 (see example below),
    which really makes no sense for user-space[0] and is wrong.
    Usually user-space software would ignore such entries, but they are
    actually valid and will eventually have all necessary attributes.
    It makes much more sense to send a notification *after* the device has
    registered and has a proper ifindex allocated rather than before when
    there's a chance that the registration might still fail or to receive
    it with ifindex/NDA_MASTER == 0. Note that we can remove the fdb flush
    from br_vlan_flush() since that case can no longer happen. At
    NETDEV_REGISTER br->default_pvid is always == 1 as it's initialized by
    br_vlan_init() before that and at NETDEV_UNREGISTER it can be anything
    depending why it was called (if called due to NETDEV_REGISTER error
    it'll still be == 1, otherwise it could be any value changed during the
    device life time).

    For the demonstration below a small change to iproute2 for printing all fdb
    notifications is added, because it contained a workaround not to show
    entries with ifindex == 0.
    Command executed while monitoring: $ ip l add br0 type bridge
    Before (both ifindex and master == 0):
    $ bridge monitor fdb
    36:7e:8a:b3:56:ba dev * vlan 1 master * permanent

    After (proper br0 ifindex):
    $ bridge monitor fdb
    e6:2a:ae:7a:b7:48 dev br0 vlan 1 master br0 permanent

    v4: move only the default pvid init/deinit to NETDEV_REGISTER/UNREGISTER
    v3: send the correct v2 patch with all changes (stub should return 0)
    v2: on error in br_vlan_init set br->vlgrp to NULL and return 0 in
    the br_vlan_bridge_event stub when bridge vlans are disabled

    [0] https://bugzilla.kernel.org/show_bug.cgi?id=204389

    Reported-by: michael-dev
    Fixes: 5be5a2df40f0 ("bridge: Add filtering support for default_pvid")
    Signed-off-by: Nikolay Aleksandrov
    Acked-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

01 Aug, 2019

2 commits

  • In user-space there's no way to distinguish why an mdb entry was deleted
    and that is a problem for daemons which would like to keep the mdb in
    sync with remote ends (e.g. mlag) but would also like to converge faster.
    In almost all cases we'd like to age-out the remote entry for performance
    and convergence reasons except when fast-leave is enabled. In that case we
    want explicit immediate remote delete, thus add mdb flag which is set only
    when the entry is being deleted due to fast-leave.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • When permanent entries were introduced by the commit below, they were
    exempt from timing out and thus igmp leave wouldn't affect them unless
    fast leave was enabled on the port, a feature which was added before
    permanent entries existed. It shouldn't matter whether fast leave is
    enabled or not: if the user added a permanent entry, it shouldn't be
    deleted on igmp leave.

    Before:
    $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
    $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
    $ bridge mdb show
    dev br0 port eth4 grp 229.1.1.1 permanent

    < join and leave 229.1.1.1 on eth4 >

    $ bridge mdb show
    $

    After:
    $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
    $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
    $ bridge mdb show
    dev br0 port eth4 grp 229.1.1.1 permanent

    < join and leave 229.1.1.1 on eth4 >

    $ bridge mdb show
    dev br0 port eth4 grp 229.1.1.1 permanent

    Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
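
The behaviour change shown in the Before/After transcripts above can be modelled in Python. This is a hedged sketch: `PortGroup` and `handle_leave` are hypothetical names, and only the fast-leave path is modelled (without fast leave, entries age out via timers, which this sketch leaves untouched).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortGroup:
    """Hypothetical mdb port entry."""
    port: str
    group: str
    permanent: bool = False


def handle_leave(entries: list, port: str, group: str, fast_leave: bool) -> list:
    """Process an IGMP leave for (port, group) and return the remaining entries."""
    if not fast_leave:
        return list(entries)  # timer-based ageing is not modelled here
    return [
        e for e in entries
        # Permanent entries survive even with fast leave enabled.
        if e.permanent or (e.port, e.group) != (port, group)
    ]


mdb = [
    PortGroup("eth4", "229.1.1.1", permanent=True),
    PortGroup("eth4", "229.1.1.2"),
]
after_perm_leave = handle_leave(mdb, "eth4", "229.1.1.1", fast_leave=True)
after_temp_leave = handle_leave(mdb, "eth4", "229.1.1.2", fast_leave=True)
```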
     

31 Jul, 2019

1 commit

  • Pablo Neira Ayuso says:

    ====================
    netfilter fixes for net

    The following patchset contains Netfilter fixes for your net tree:

    1) memleak in ebtables from the error path for the 32/64 compat layer,
    from Florian Westphal.

    2) Fix inverted meta ifname/ifidx matching when no interface is set
    on either from the input/output path, from Phil Sutter.

    3) Remove goto label in nft_meta_bridge, also from Phil.

    4) Missing include guard in xt_connlabel, from Masahiro Yamada.

    5) Two patches to fix ipset destination MAC matching coming from
    Stefano Brivio, via Jozsef Kadlecsik.

    6) Fix set rename and listing concurrency problem, from Shijie Luo.
    Patch also coming via Jozsef Kadlecsik.

    7) ebtables 32/64 compat missing base chain policy in rule count,
    from Florian Westphal.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

30 Jul, 2019

2 commits

  • ebtables doesn't include the base chain policies in the rule count,
    so we need to add them manually when we call into the x_tables core
    to allocate space for the compat offset table.

    This led syzbot to trigger:
    WARNING: CPU: 1 PID: 9012 at net/netfilter/x_tables.c:649
    xt_compat_add_offset.cold+0x11/0x36 net/netfilter/x_tables.c:649

    Reported-by: syzbot+276ddebab3382bbf72db@syzkaller.appspotmail.com
    Fixes: 2035f3ff8eaa ("netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • On initialization failure we have to delete the local fdb which was
    inserted due to the default pvid creation. This problem has been present
    since the inception of default_pvid. Note that currently there are 2 cases:
    1) in br_dev_init() when br_multicast_init() fails
    2) if register_netdevice() fails after calling ndo_init()

    This patch takes care of both since br_vlan_flush() is called on both
    occasions. Also the new fdb delete would be a no-op on normal bridge
    device destruction since the local fdb would've been already flushed by
    br_dev_delete(). This is not an issue for ports since nbp_vlan_init() is
    called last when adding a port thus nothing can fail after it.

    Reported-by: syzbot+88533dc8b582309bf3ee@syzkaller.appspotmail.com
    Fixes: 5be5a2df40f0 ("bridge: Add filtering support for default_pvid")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

25 Jul, 2019

2 commits

  • The label is used just once and the code it points at is not reused, no
    point in keeping it.

    Signed-off-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Phil Sutter
     
  • nft_meta_get_eval()'s tendency to bail out setting NFT_BREAK verdict in
    situations where required data is missing leads to unexpected behaviour
    with inverted checks like so:

    | meta iifname != eth0 accept

    This rule will never match if there is no input interface (or it is not
    known) which is not intuitive and, what's worse, breaks consistency of
    iptables-nft with iptables-legacy.

    Fix this by falling back to placing a value in dreg which never matches
    (avoiding accidental matches), i.e. zero for interface index and an
    empty string for interface name.

    Signed-off-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Phil Sutter
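
Why breaking out of rule evaluation defeats an inverted match, and why a never-matching fallback value fixes it, can be shown with a small Python model. Everything here is a hypothetical simplification: `BREAK` stands in for the NFT_BREAK verdict, and the functions model loading the input interface name and evaluating one comparison.

```python
BREAK = object()  # hypothetical marker for "stop evaluating this rule"


def load_iifname_old(iface):
    """Old behaviour: bail out when there is no input interface."""
    return BREAK if iface is None else iface


def load_iifname_new(iface):
    """New behaviour: fall back to an empty name, which equals no real interface."""
    return "" if iface is None else iface


def rule_matches(value, name: str, negate: bool) -> bool:
    """Evaluate `meta iifname == name` (or `!= name` when negate is set)."""
    if value is BREAK:
        return False  # the rule is abandoned, inverted or not
    return (value != name) if negate else (value == name)


# `meta iifname != eth0 accept` on a packet with no input interface:
old_result = rule_matches(load_iifname_old(None), "eth0", negate=True)
new_result = rule_matches(load_iifname_new(None), "eth0", negate=True)

# The non-inverted form must still not match by accident:
plain_result = rule_matches(load_iifname_new(None), "eth0", negate=False)
```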
     

22 Jul, 2019

1 commit

  • In compat_do_replace(), a temporary buffer is allocated through vmalloc()
    to hold entries copied from user space. The buffer address is first
    saved to 'newinfo->entries', and later on assigned to 'entries_tmp'. Then
    the entries in this temporary buffer are copied to the internal kernel
    structure through compat_copy_entries(). If this copy process fails,
    compat_do_replace() should be terminated. However, the allocated temporary
    buffer is not freed on this path, leading to a memory leak.

    To fix the bug, free the buffer before returning from compat_do_replace().

    Signed-off-by: Wenwen Wang
    Reviewed-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Wenwen Wang
     

20 Jul, 2019

1 commit

  • The new nft_meta_bridge code fails to link as built-in when NF_TABLES
    is a loadable module.

    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_eval':
    nft_meta_bridge.c:(.text+0x1e8): undefined reference to `nft_meta_get_eval'
    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_init':
    nft_meta_bridge.c:(.text+0x468): undefined reference to `nft_meta_get_init'
    nft_meta_bridge.c:(.text+0x49c): undefined reference to `nft_parse_register'
    nft_meta_bridge.c:(.text+0x4cc): undefined reference to `nft_validate_register_store'
    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_exit':
    nft_meta_bridge.c:(.exit.text+0x14): undefined reference to `nft_unregister_expr'
    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_init':
    nft_meta_bridge.c:(.init.text+0x14): undefined reference to `nft_register_expr'
    net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x60): undefined reference to `nft_meta_get_dump'
    net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x88): undefined reference to `nft_meta_set_eval'

    This can happen because the NF_TABLES_BRIDGE dependency itself is just a
    'bool'. Make the symbol a 'tristate' instead so Kconfig can propagate the
    dependencies correctly.

    Fixes: 30e103fe24de ("netfilter: nft_meta: move bridge meta keys into nft_meta_bridge")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Pablo Neira Ayuso

    Arnd Bergmann
     

19 Jul, 2019

1 commit


12 Jul, 2019

1 commit

  • Pull networking updates from David Miller:
    "Some highlights from this development cycle:

    1) Big refactoring of ipv6 route and neigh handling to support
    nexthop objects configurable as units from userspace. From David
    Ahern.

    2) Convert explored_states in BPF verifier into a hash table,
    significantly decreased state held for programs with bpf2bpf
    calls, from Alexei Starovoitov.

    3) Implement bpf_send_signal() helper, from Yonghong Song.

    4) Various classifier enhancements to mvpp2 driver, from Maxime
    Chevallier.

    5) Add aRFS support to hns3 driver, from Jian Shen.

    6) Fix use after free in inet frags by allocating fqdirs dynamically
    and reworking how rhashtable dismantle occurs, from Eric Dumazet.

    7) Add act_ctinfo packet classifier action, from Kevin
    Darbyshire-Bryant.

    8) Add TFO key backup infrastructure, from Jason Baron.

    9) Remove several old and unused ISDN drivers, from Arnd Bergmann.

    10) Add devlink notifications for flash update status to mlxsw driver,
    from Jiri Pirko.

    11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski.

    12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes.

    13) Various enhancements to ipv6 flow label handling, from Eric
    Dumazet and Willem de Bruijn.

    14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van
    der Merwe, and others.

    15) Various improvements to axienet driver including converting it to
    phylink, from Robert Hancock.

    16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean.

    17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana
    Radulescu.

    18) Add devlink health reporting to mlx5, from Moshe Shemesh.

    19) Convert stmmac over to phylink, from Jose Abreu.

    20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from
    Shalom Toledo.

    21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera.

    22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel.

    23) Track spill/fill of constants in BPF verifier, from Alexei
    Starovoitov.

    24) Support bounded loops in BPF, from Alexei Starovoitov.

    25) Various page_pool API fixes and improvements, from Jesper Dangaard
    Brouer.

    26) Just like ipv4, support ref-countless ipv6 route handling. From
    Wei Wang.

    27) Support VLAN offloading in aquantia driver, from Igor Russkikh.

    28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy.

    29) Add flower GRE encap/decap support to nfp driver, from Pieter
    Jansen van Vuuren.

    30) Protect against stack overflow when using act_mirred, from John
    Hurley.

    31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen.

    32) Use page_pool API in netsec driver, Ilias Apalodimas.

    33) Add Google gve network driver, from Catherine Sullivan.

    34) More indirect call avoidance, from Paolo Abeni.

    35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan.

    36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek.

    37) Add MPLS manipulation actions to TC, from John Hurley.

    38) Add sending a packet to connection tracking from TC actions, and
    then allow flower classifier matching on conntrack state. From
    Paul Blakey.

    39) Netfilter hw offload support, from Pablo Neira Ayuso"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits)
    net/mlx5e: Return in default case statement in tx_post_resync_params
    mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync().
    net: dsa: add support for BRIDGE_MROUTER attribute
    pkt_sched: Include const.h
    net: netsec: remove static declaration for netsec_set_tx_de()
    net: netsec: remove superfluous if statement
    netfilter: nf_tables: add hardware offload support
    net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload
    net: flow_offload: add flow_block_cb_is_busy() and use it
    net: sched: remove tcf block API
    drivers: net: use flow block API
    net: sched: use flow block API
    net: flow_offload: add flow_block_cb_{priv, incref, decref}()
    net: flow_offload: add list handling functions
    net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free()
    net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_*
    net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND
    net: flow_offload: add flow_block_cb_setup_simple()
    net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC
    net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC
    ...

    Linus Torvalds
     

10 Jul, 2019

1 commit

  • Pull Documentation updates from Jonathan Corbet:
    "It's been a relatively busy cycle for docs:

    - A fair pile of RST conversions, many from Mauro. These create more
    than the usual number of simple but annoying merge conflicts with
    other trees, unfortunately. He has a lot more of these waiting on
    the wings that, I think, will go to you directly later on.

    - A new document on how to use merges and rebases in kernel repos,
    and one on Spectre vulnerabilities.

    - Various improvements to the build system, including automatic
    markup of function() references because some people, for reasons I
    will never understand, were of the opinion that
    :c:func:``function()`` is unattractive and not fun to type.

    - We now recommend using sphinx 1.7, but still support back to 1.4.

    - Lots of smaller improvements, warning fixes, typo fixes, etc"

    * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits)
    docs: automarkup.py: ignore exceptions when seeking for xrefs
    docs: Move binderfs to admin-guide
    Disable Sphinx SmartyPants in HTML output
    doc: RCU callback locks need only _bh, not necessarily _irq
    docs: format kernel-parameters -- as code
    Doc : doc-guide : Fix a typo
    platform: x86: get rid of a non-existent document
    Add the RCU docs to the core-api manual
    Documentation: RCU: Add TOC tree hooks
    Documentation: RCU: Rename txt files to rst
    Documentation: RCU: Convert RCU UP systems to reST
    Documentation: RCU: Convert RCU linked list to reST
    Documentation: RCU: Convert RCU basic concepts to reST
    docs: filesystems: Remove uneeded .rst extension on toctables
    scripts/sphinx-pre-install: fix out-of-tree build
    docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/
    Documentation: PGP: update for newer HW devices
    Documentation: Add section about CPU vulnerabilities for Spectre
    Documentation: platform: Delete x86-laptop-drivers.txt
    docs: Note that :c:func: should no longer be used
    ...

    Linus Torvalds
     

09 Jul, 2019

1 commit


06 Jul, 2019

2 commits