15 Sep, 2019

1 commit


13 Sep, 2019

3 commits


10 Sep, 2019

1 commit

  • NLM_F_MULTI must be used only when a NLMSG_DONE message is sent at the end.
    In fact, NLMSG_DONE is sent only at the end of a dump.

    Libraries like libnl will wait forever for NLMSG_DONE.

    Fixes: 949f1e39a617 ("bridge: mdb: notify on router port add and del")
    CC: Nikolay Aleksandrov
    Signed-off-by: Nicolas Dichtel
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nicolas Dichtel
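The reason a receiver can hang may be easier to see in a small model of a netlink read loop. The following is only a sketch: `sk_nlmsg`, `sk_reply_complete()` and `SK_TYPE_NOTIFY` are hypothetical names, and the two constants are local copies of the NLM_F_MULTI and NLMSG_DONE values. It is not libnl's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins with the same values as NLM_F_MULTI and NLMSG_DONE in
 * <linux/netlink.h>; redefined here so the sketch builds anywhere. */
#define SK_NLM_F_MULTI 0x2
#define SK_NLMSG_DONE  0x3
#define SK_TYPE_NOTIFY 100	/* arbitrary message type, illustration only */

/* Simplified header: only the two fields the loop looks at. */
struct sk_nlmsg {
	uint16_t type;
	uint16_t flags;
};

/*
 * Returns true once the messages seen so far form a complete reply,
 * false if a receiver would have to keep waiting for more data.
 */
bool sk_reply_complete(const struct sk_nlmsg *msgs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (msgs[i].type == SK_NLMSG_DONE)
			return true;	/* end of a multipart sequence */
		if (!(msgs[i].flags & SK_NLM_F_MULTI))
			return true;	/* single message, nothing follows */
		/* NLM_F_MULTI set: more parts are promised, keep reading */
	}
	return false;
}
```

A lone notification carrying NLM_F_MULTI never satisfies either exit condition, which is the "wait forever" case described above; the same notification without the flag completes immediately.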
     

03 Sep, 2019

2 commits


01 Sep, 2019

1 commit

  • Currently this simplified code snippet fails:

    br_vlan_get_pvid(netdev, &pvid);
    br_vlan_get_info(netdev, pvid, &vinfo);
    ASSERT(vinfo.flags & BRIDGE_VLAN_INFO_PVID);

    It is intuitive that the pvid of a netdevice should have the
    BRIDGE_VLAN_INFO_PVID flag set.

    However I can't seem to pinpoint a commit where this behavior was
    introduced. It seems like it's been like that since forever.

    At first glance it would make more sense to just handle the
    BRIDGE_VLAN_INFO_PVID flag in __vlan_add_flags. However, as Nikolay
    explains:

    There are a few reasons why we don't do it, most importantly because
    we need to have only one visible pvid at any single time, even if it's
    stale - it must be just one. Right now that rule will not be violated
    by this change, but people will try using this flag and could see two
    pvids simultaneously. You can see that the pvid code is even using
    memory barriers to propagate the new value faster and everywhere the
    pvid is read only once. That is the reason the flag is set
    dynamically when dumping entries, too. A second (weaker) argument
    against would be given the above we don't want another way to do the
    same thing, specifically if it can provide us with two pvids (e.g. if
    walking the vlan list) or if it can provide us with a pvid different
    from the one set in the vg. [Obviously, I'm talking about RCU
    pvid/vlan use cases similar to the dumps. The locked cases are fine.
    I would like to avoid explaining why this shouldn't be relied upon
    without locking]

    So instead of introducing the above change and making sure of the pvid
    uniqueness under RCU, simply dynamically populate the pvid flag in
    br_vlan_get_info().

    Signed-off-by: Vladimir Oltean
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Vladimir Oltean
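The idea of deriving the flag at read time, instead of storing it per vlan, can be sketched in user-space C. Everything here is an assumption made for illustration: `sk_vlan_group`, `sk_vlan_info` and `sk_vlan_get_info()` are hypothetical stand-ins, and a C11 atomic load stands in for the kernel's single read of the pvid. It is not the bridge code itself.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SK_VLAN_INFO_PVID 0x2	/* stand-in for BRIDGE_VLAN_INFO_PVID */

/* The group holds the only authoritative pvid; 0 means "no pvid". */
struct sk_vlan_group {
	_Atomic uint16_t pvid;
};

struct sk_vlan_info {
	uint16_t vid;
	uint16_t flags;
};

/*
 * Fill in the info for one vlan. The PVID bit is never taken from the
 * stored flags; it is computed from a single read of the group's pvid,
 * so at most one vid can ever be reported as the pvid.
 */
void sk_vlan_get_info(struct sk_vlan_group *vg, uint16_t vid,
		      uint16_t stored_flags, struct sk_vlan_info *out)
{
	uint16_t pvid = atomic_load(&vg->pvid);	/* read exactly once */

	out->vid = vid;
	out->flags = (uint16_t)(stored_flags & ~SK_VLAN_INFO_PVID);
	if (pvid != 0 && vid == pvid)
		out->flags |= SK_VLAN_INFO_PVID;
}
```

Because the flag is recomputed on every call, changing the group's pvid immediately moves the flag from the old vid to the new one without touching any per-vlan state.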
     

30 Aug, 2019

1 commit


28 Aug, 2019

1 commit


19 Aug, 2019

1 commit

  • The ordering of arguments to the x_tables ADD_COUNTER macro
    appears to be wrong in ebtables (cf. ip_tables.c, ip6_tables.c,
    and arp_tables.c).

    This causes data corruption in the ebtables userspace tools
    because they get incorrect packet & byte counts from the kernel.

    Fixes: d72133e628803 ("netfilter: ebtables: use ADD_COUNTER macro")
    Signed-off-by: Todd Seidelmann
    Signed-off-by: Pablo Neira Ayuso

    Todd Seidelmann
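A small model shows how swapped macro arguments corrupt the totals. This is a sketch under assumptions: `sk_counters` and `SK_ADD_COUNTER` are local stand-ins written to mirror the (counter, bytes, packets) argument order of the x_tables macro, and the two summing helpers are hypothetical.

```c
#include <stdint.h>

struct sk_counters {
	uint64_t pcnt;	/* packets */
	uint64_t bcnt;	/* bytes */
};

/* Argument order: counter, bytes, packets. */
#define SK_ADD_COUNTER(c, b, p) \
	do { (c).bcnt += (b); (c).pcnt += (p); } while (0)

/* Sum per-cpu counters with the arguments in the intended order. */
struct sk_counters sk_sum_correct(const struct sk_counters *percpu, int n)
{
	struct sk_counters total = { 0, 0 };

	for (int i = 0; i < n; i++)
		SK_ADD_COUNTER(total, percpu[i].bcnt, percpu[i].pcnt);
	return total;
}

/* Same loop with bytes and packets swapped, as in the reported bug. */
struct sk_counters sk_sum_swapped(const struct sk_counters *percpu, int n)
{
	struct sk_counters total = { 0, 0 };

	for (int i = 0; i < n; i++)
		SK_ADD_COUNTER(total, percpu[i].pcnt, percpu[i].bcnt);
	return total;
}
```

With two per-cpu entries of (2 packets, 300 bytes) and (1 packet, 100 bytes), the correct sum is 3 packets and 400 bytes, while the swapped version reports 400 packets and 3 bytes.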
     

18 Aug, 2019

4 commits

  • Currently this is needed only for user-space compatibility, so that
    adds/deletes of objects matching the dumped ones succeed. Later it can
    be used for L2 mcast MAC add/delete.

    v3: fix compiler warning (DaveM)
    v2: don't send a notification when used from user-space, arm the group
    timer if no ports are left after host entry del

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Currently we dump only the port mdb entries, but we can have host-joined
    entries on the bridge itself and they should be treated as normal temp
    mdbs. They're already notified:
    $ bridge monitor all
    [MDB]dev br0 port br0 grp ff02::8 temp

    The group will not be shown in the bridge mdb output, but it takes 1 slot
    and it's timing out. If it's only host-joined then the mdb show output
    can even be empty.

    After this patch we show the host-joined groups:
    $ bridge mdb show
    dev br0 port br0 grp ff02::8 temp

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • We have to factor out the mdb fill portion in order to re-use it later for
    the bridge mdb entries. No functional changes intended.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Trivial patch to move the vlan comments in their proper places above the
    vid 0 checks.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

07 Aug, 2019

1 commit


06 Aug, 2019

1 commit

  • Most of the bridge device's vlan init bugs come from the fact that its
    default pvid is created at the wrong time, way too early in ndo_init()
    before the device is even assigned an ifindex. This introduces a bug:
    when the bridge's dev_addr is added as fdb during the initial default
    pvid creation, the notification has ifindex/NDA_MASTER both equal to 0
    (see example below), which really makes no sense for user-space[0] and
    is wrong.
    Usually user-space software would ignore such entries, but they are
    actually valid and will eventually have all necessary attributes.
    It makes much more sense to send the notification *after* the device has
    registered and has a proper ifindex allocated, rather than before, when
    the registration might still fail and user-space would receive it with
    ifindex/NDA_MASTER == 0. Note that we can remove the fdb flush
    from br_vlan_flush() since that case can no longer happen. At
    NETDEV_REGISTER br->default_pvid is always == 1 as it's initialized by
    br_vlan_init() before that and at NETDEV_UNREGISTER it can be anything
    depending on why it was called (if called due to NETDEV_REGISTER error
    it'll still be == 1, otherwise it could be any value changed during the
    device life time).

    For the demonstration below a small change to iproute2 for printing all fdb
    notifications is added, because it contained a workaround not to show
    entries with ifindex == 0.
    Command executed while monitoring: $ ip l add br0 type bridge
    Before (both ifindex and master == 0):
    $ bridge monitor fdb
    36:7e:8a:b3:56:ba dev * vlan 1 master * permanent

    After (proper br0 ifindex):
    $ bridge monitor fdb
    e6:2a:ae:7a:b7:48 dev br0 vlan 1 master br0 permanent

    v4: move only the default pvid init/deinit to NETDEV_REGISTER/UNREGISTER
    v3: send the correct v2 patch with all changes (stub should return 0)
    v2: on error in br_vlan_init set br->vlgrp to NULL and return 0 in
    the br_vlan_bridge_event stub when bridge vlans are disabled

    [0] https://bugzilla.kernel.org/show_bug.cgi?id=204389

    Reported-by: michael-dev
    Fixes: 5be5a2df40f0 ("bridge: Add filtering support for default_pvid")
    Signed-off-by: Nikolay Aleksandrov
    Acked-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

01 Aug, 2019

2 commits

  • In user-space there's no way to distinguish why an mdb entry was deleted
    and that is a problem for daemons which would like to keep the mdb in
    sync with remote ends (e.g. mlag) but would also like to converge faster.
    In almost all cases we'd like to age-out the remote entry for performance
    and convergence reasons except when fast-leave is enabled. In that case we
    want an explicit immediate remote delete, so add an mdb flag which is set
    only when the entry is being deleted due to fast-leave.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • When permanent entries were introduced by the commit below, they were
    exempt from timing out and thus igmp leave wouldn't affect them unless
    fast leave was enabled on the port (fast leave was added before
    permanent entries existed). It shouldn't matter whether fast leave is
    enabled or not: if the user added a permanent entry, it shouldn't be
    deleted on igmp leave.

    Before:
    $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
    $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
    $ bridge mdb show
    dev br0 port eth4 grp 229.1.1.1 permanent

    < join and leave 229.1.1.1 on eth4 >

    $ bridge mdb show
    $

    After:
    $ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
    $ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
    $ bridge mdb show
    dev br0 port eth4 grp 229.1.1.1 permanent

    < join and leave 229.1.1.1 on eth4 >

    $ bridge mdb show
    dev br0 port eth4 grp 229.1.1.1 permanent

    Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
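The intended rule can be written down as a tiny decision function. This is a hedged sketch, not the bridge multicast code: `sk_port_group`, `SK_MDB_PERMANENT` and `sk_handle_leave()` are hypothetical names, and the normal timer-based ageing of non-permanent entries is not modelled.

```c
#include <stdbool.h>

#define SK_MDB_PERMANENT 0x1	/* hypothetical "added by the user" flag */

struct sk_port_group {
	unsigned int flags;
	bool present;	/* false once the entry has been deleted */
};

/* Model of what an igmp leave does to one port group entry. */
void sk_handle_leave(struct sk_port_group *pg, bool fast_leave)
{
	/* Permanent entries are never removed by a leave, whether or
	 * not fast leave is enabled on the port. */
	if (pg->flags & SK_MDB_PERMANENT)
		return;

	/* With fast leave, a temporary entry is deleted immediately. */
	if (fast_leave)
		pg->present = false;

	/* Otherwise the entry is left to age out (not modelled here). */
}
```

Checking the permanent flag before the fast-leave test is what makes the "After" output above keep the 229.1.1.1 entry.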
     

31 Jul, 2019

1 commit

  • Pablo Neira Ayuso says:

    ====================
    netfilter fixes for net

    The following patchset contains Netfilter fixes for your net tree:

    1) memleak in ebtables from the error path for the 32/64 compat layer,
    from Florian Westphal.

    2) Fix inverted meta ifname/ifidx matching when no interface is set
    on either the input or the output path, from Phil Sutter.

    3) Remove goto label in nft_meta_bridge, also from Phil.

    4) Missing include guard in xt_connlabel, from Masahiro Yamada.

    5) Two patches to fix ipset destination MAC matching coming from
    Stefano Brivio, via Jozsef Kadlecsik.

    6) Fix set rename and listing concurrency problem, from Shijie Luo.
    Patch also coming via Jozsef Kadlecsik.

    7) ebtables 32/64 compat missing base chain policy in rule count,
    from Florian Westphal.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

30 Jul, 2019

2 commits

  • ebtables doesn't include the base chain policies in the rule count,
    so we need to add them manually when we call into the x_tables core
    to allocate space for the compat offset table.

    This led syzbot to trigger:
    WARNING: CPU: 1 PID: 9012 at net/netfilter/x_tables.c:649
    xt_compat_add_offset.cold+0x11/0x36 net/netfilter/x_tables.c:649

    Reported-by: syzbot+276ddebab3382bbf72db@syzkaller.appspotmail.com
    Fixes: 2035f3ff8eaa ("netfilter: ebtables: compat: un-break 32bit setsockopt when no rules are present")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • On initialization failure we have to delete the local fdb which was
    inserted due to the default pvid creation. This problem has been present
    since the inception of default_pvid. Note that currently there are 2 cases:
    1) in br_dev_init() when br_multicast_init() fails
    2) if register_netdevice() fails after calling ndo_init()

    This patch takes care of both since br_vlan_flush() is called on both
    occasions. Also the new fdb delete would be a no-op on normal bridge
    device destruction since the local fdb would've been already flushed by
    br_dev_delete(). This is not an issue for ports since nbp_vlan_init() is
    called last when adding a port thus nothing can fail after it.

    Reported-by: syzbot+88533dc8b582309bf3ee@syzkaller.appspotmail.com
    Fixes: 5be5a2df40f0 ("bridge: Add filtering support for default_pvid")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

25 Jul, 2019

2 commits

  • The label is used just once and the code it points at is not reused, no
    point in keeping it.

    Signed-off-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Phil Sutter
     
  • nft_meta_get_eval()'s tendency to bail out setting NFT_BREAK verdict in
    situations where required data is missing leads to unexpected behaviour
    with inverted checks like so:

    | meta iifname != eth0 accept

    This rule will never match if there is no input interface (or it is not
    known) which is not intuitive and, what's worse, breaks consistency of
    iptables-nft with iptables-legacy.

    Fix this by falling back to placing a value in dreg which never matches
    (avoiding accidental matches), i.e. zero for interface index and an
    empty string for interface name.

    Signed-off-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Phil Sutter
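The difference between bailing out and loading a never-matching value can be modelled in a few lines. This is a simplified sketch, not the nft_meta code: `sk_load_old()`, `sk_load_new()` and `sk_rule_matches()` are hypothetical, and `dreg` is just a local buffer standing in for the destination register.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum sk_verdict { SK_CONTINUE, SK_BREAK };

typedef enum sk_verdict (*sk_load_fn)(const char *iif, char *dreg, size_t len);

/* Old behaviour: stop rule evaluation when there is no input interface. */
enum sk_verdict sk_load_old(const char *iif, char *dreg, size_t len)
{
	if (!iif)
		return SK_BREAK;
	snprintf(dreg, len, "%s", iif);
	return SK_CONTINUE;
}

/* New behaviour: fall back to an empty name, which no interface has. */
enum sk_verdict sk_load_new(const char *iif, char *dreg, size_t len)
{
	snprintf(dreg, len, "%s", iif ? iif : "");
	return SK_CONTINUE;
}

/* Evaluate a rule of the form "meta iifname != <want> accept". */
bool sk_rule_matches(sk_load_fn load, const char *iif, const char *want)
{
	char dreg[16];

	if (load(iif, dreg, sizeof(dreg)) == SK_BREAK)
		return false;	/* evaluation aborted, rule cannot match */
	return strcmp(dreg, want) != 0;
}
```

With no input interface, the old loader makes `!= eth0` fail to match, while the new loader lets it match as a reader would expect. An equality check against a real name still cannot match the empty string, so accidental matches are avoided.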
     

22 Jul, 2019

1 commit

  • In compat_do_replace(), a temporary buffer is allocated through vmalloc()
    to hold entries copied from the user space. The buffer address is first
    saved to 'newinfo->entries', and later on assigned to 'entries_tmp'. Then
    the entries in this temporary buffer are copied to the internal kernel
    structure through compat_copy_entries(). If this copy process fails,
    compat_do_replace() should be terminated. However, the allocated temporary
    buffer is not freed on this path, leading to a memory leak.

    To fix the bug, free the buffer before returning from compat_do_replace().

    Signed-off-by: Wenwen Wang
    Reviewed-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Wenwen Wang
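The shape of the fix, releasing the temporary buffer on the failure path as well as on success, can be sketched in user-space C. All names here are hypothetical (`sk_do_replace()`, `sk_copy_entries()`, `sk_live_buffers`), malloc()/free() stand in for vmalloc()/vfree(), and the failure is injected by a parameter purely so the path can be exercised.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Number of temporary buffers currently allocated; lets a caller
 * observe whether a path leaked one. */
int sk_live_buffers;

/* Hypothetical copy step that can be made to fail on demand. */
static int sk_copy_entries(char *dst, const char *src, size_t len, int fail)
{
	if (fail)
		return -EINVAL;
	memcpy(dst, src, len);
	return 0;
}

int sk_do_replace(const char *user, size_t len, char *kernel_buf, int fail)
{
	char *entries_tmp = malloc(len);
	int ret;

	if (!entries_tmp)
		return -ENOMEM;
	sk_live_buffers++;
	memcpy(entries_tmp, user, len);

	ret = sk_copy_entries(kernel_buf, entries_tmp, len, fail);

	/* Free the temporary buffer whether or not the copy succeeded;
	 * returning early on error here would be the leak. */
	free(entries_tmp);
	sk_live_buffers--;
	return ret;
}
```

After a failed call the buffer count is back at zero, which is the property the patch restores.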
     

20 Jul, 2019

1 commit

  • The new nft_meta_bridge code fails to link as built-in when NF_TABLES
    is a loadable module.

    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_eval':
    nft_meta_bridge.c:(.text+0x1e8): undefined reference to `nft_meta_get_eval'
    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_get_init':
    nft_meta_bridge.c:(.text+0x468): undefined reference to `nft_meta_get_init'
    nft_meta_bridge.c:(.text+0x49c): undefined reference to `nft_parse_register'
    nft_meta_bridge.c:(.text+0x4cc): undefined reference to `nft_validate_register_store'
    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_exit':
    nft_meta_bridge.c:(.exit.text+0x14): undefined reference to `nft_unregister_expr'
    net/bridge/netfilter/nft_meta_bridge.o: In function `nft_meta_bridge_module_init':
    nft_meta_bridge.c:(.init.text+0x14): undefined reference to `nft_register_expr'
    net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x60): undefined reference to `nft_meta_get_dump'
    net/bridge/netfilter/nft_meta_bridge.o:(.rodata+0x88): undefined reference to `nft_meta_set_eval'

    This can happen because the NF_TABLES_BRIDGE dependency itself is just a
    'bool'. Make the symbol a 'tristate' instead so Kconfig can propagate the
    dependencies correctly.

    Fixes: 30e103fe24de ("netfilter: nft_meta: move bridge meta keys into nft_meta_bridge")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Pablo Neira Ayuso

    Arnd Bergmann
     

19 Jul, 2019

1 commit


12 Jul, 2019

1 commit

  • Pull networking updates from David Miller:
    "Some highlights from this development cycle:

    1) Big refactoring of ipv6 route and neigh handling to support
    nexthop objects configurable as units from userspace. From David
    Ahern.

    2) Convert explored_states in BPF verifier into a hash table,
    significantly decreased state held for programs with bpf2bpf
    calls, from Alexei Starovoitov.

    3) Implement bpf_send_signal() helper, from Yonghong Song.

    4) Various classifier enhancements to mvpp2 driver, from Maxime
    Chevallier.

    5) Add aRFS support to hns3 driver, from Jian Shen.

    6) Fix use after free in inet frags by allocating fqdirs dynamically
    and reworking how rhashtable dismantle occurs, from Eric Dumazet.

    7) Add act_ctinfo packet classifier action, from Kevin
    Darbyshire-Bryant.

    8) Add TFO key backup infrastructure, from Jason Baron.

    9) Remove several old and unused ISDN drivers, from Arnd Bergmann.

    10) Add devlink notifications for flash update status to mlxsw driver,
    from Jiri Pirko.

    11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski.

    12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes.

    13) Various enhancements to ipv6 flow label handling, from Eric
    Dumazet and Willem de Bruijn.

    14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van
    der Merwe, and others.

    15) Various improvements to axienet driver including converting it to
    phylink, from Robert Hancock.

    16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean.

    17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana
    Radulescu.

    18) Add devlink health reporting to mlx5, from Moshe Shemesh.

    19) Convert stmmac over to phylink, from Jose Abreu.

    20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from
    Shalom Toledo.

    21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera.

    22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel.

    23) Track spill/fill of constants in BPF verifier, from Alexei
    Starovoitov.

    24) Support bounded loops in BPF, from Alexei Starovoitov.

    25) Various page_pool API fixes and improvements, from Jesper Dangaard
    Brouer.

    26) Just like ipv4, support ref-countless ipv6 route handling. From
    Wei Wang.

    27) Support VLAN offloading in aquantia driver, from Igor Russkikh.

    28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy.

    29) Add flower GRE encap/decap support to nfp driver, from Pieter
    Jansen van Vuuren.

    30) Protect against stack overflow when using act_mirred, from John
    Hurley.

    31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen.

    32) Use page_pool API in netsec driver, Ilias Apalodimas.

    33) Add Google gve network driver, from Catherine Sullivan.

    34) More indirect call avoidance, from Paolo Abeni.

    35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan.

    36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek.

    37) Add MPLS manipulation actions to TC, from John Hurley.

    38) Add sending a packet to connection tracking from TC actions, and
    then allow flower classifier matching on conntrack state. From
    Paul Blakey.

    39) Netfilter hw offload support, from Pablo Neira Ayuso"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits)
    net/mlx5e: Return in default case statement in tx_post_resync_params
    mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync().
    net: dsa: add support for BRIDGE_MROUTER attribute
    pkt_sched: Include const.h
    net: netsec: remove static declaration for netsec_set_tx_de()
    net: netsec: remove superfluous if statement
    netfilter: nf_tables: add hardware offload support
    net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload
    net: flow_offload: add flow_block_cb_is_busy() and use it
    net: sched: remove tcf block API
    drivers: net: use flow block API
    net: sched: use flow block API
    net: flow_offload: add flow_block_cb_{priv, incref, decref}()
    net: flow_offload: add list handling functions
    net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free()
    net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_*
    net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND
    net: flow_offload: add flow_block_cb_setup_simple()
    net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC
    net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC
    ...

    Linus Torvalds
     

10 Jul, 2019

1 commit

  • Pull Documentation updates from Jonathan Corbet:
    "It's been a relatively busy cycle for docs:

    - A fair pile of RST conversions, many from Mauro. These create more
    than the usual number of simple but annoying merge conflicts with
    other trees, unfortunately. He has a lot more of these waiting in
    the wings that, I think, will go to you directly later on.

    - A new document on how to use merges and rebases in kernel repos,
    and one on Spectre vulnerabilities.

    - Various improvements to the build system, including automatic
    markup of function() references because some people, for reasons I
    will never understand, were of the opinion that
    :c:func:``function()`` is unattractive and not fun to type.

    - We now recommend using sphinx 1.7, but still support back to 1.4.

    - Lots of smaller improvements, warning fixes, typo fixes, etc"

    * tag 'docs-5.3' of git://git.lwn.net/linux: (129 commits)
    docs: automarkup.py: ignore exceptions when seeking for xrefs
    docs: Move binderfs to admin-guide
    Disable Sphinx SmartyPants in HTML output
    doc: RCU callback locks need only _bh, not necessarily _irq
    docs: format kernel-parameters -- as code
    Doc : doc-guide : Fix a typo
    platform: x86: get rid of a non-existent document
    Add the RCU docs to the core-api manual
    Documentation: RCU: Add TOC tree hooks
    Documentation: RCU: Rename txt files to rst
    Documentation: RCU: Convert RCU UP systems to reST
    Documentation: RCU: Convert RCU linked list to reST
    Documentation: RCU: Convert RCU basic concepts to reST
    docs: filesystems: Remove uneeded .rst extension on toctables
    scripts/sphinx-pre-install: fix out-of-tree build
    docs: zh_CN: submitting-drivers.rst: Remove a duplicated Documentation/
    Documentation: PGP: update for newer HW devices
    Documentation: Add section about CPU vulnerabilities for Spectre
    Documentation: platform: Delete x86-laptop-drivers.txt
    docs: Note that :c:func: should no longer be used
    ...

    Linus Torvalds
     

09 Jul, 2019

1 commit


06 Jul, 2019

6 commits


04 Jul, 2019

1 commit


03 Jul, 2019

3 commits

  • Don't cache eth dest pointer before calling pskb_may_pull.

    Fixes: cf0f02d04a83 ("[BRIDGE]: use llc for receiving STP packets")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • We would cache the ether dst pointer on input in br_handle_frame_finish,
    but after the neigh suppress code that could lead to a stale pointer since
    both ipv4 and ipv6 suppress code do pskb_may_pull. This means we have to
    always reload it after the suppress code, so there's no point in having
    it cached. Just retrieve it directly.

    Fixes: 057658cb33fbf ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports")
    Fixes: ed842faeb2bd ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
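The hazard is the usual one of holding a pointer into a buffer that may be reallocated. The following user-space sketch illustrates the safe ordering only: `sk_buf`, `sk_may_pull()` and `sk_first_dest_byte()` are hypothetical, and `sk_may_pull()` is a simplified model that grows the buffer with zeroes, unlike the real pskb_may_pull(), which fails when the packet is too short.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sk_buf {
	uint8_t *data;
	size_t len;
};

/*
 * Make sure at least 'need' bytes are addressable at b->data.
 * When the buffer is too small the data moves to a new allocation,
 * so any pointer taken from b->data beforehand becomes stale.
 * Returns 1 on success, 0 on allocation failure.
 */
int sk_may_pull(struct sk_buf *b, size_t need)
{
	uint8_t *n;

	if (need <= b->len)
		return 1;
	n = calloc(1, need);
	if (!n)
		return 0;
	memcpy(n, b->data, b->len);
	free(b->data);
	b->data = n;
	b->len = need;
	return 1;
}

/* Safe pattern: take the header pointer only after the pull. */
uint8_t sk_first_dest_byte(struct sk_buf *b, size_t need)
{
	const uint8_t *dest;

	if (!sk_may_pull(b, need))
		return 0;
	dest = b->data;	/* loaded after the pull, never cached before it */
	return dest[0];
}
```

Had `dest` been read before the call to `sk_may_pull()`, it would point into the freed allocation whenever the buffer had to grow.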
     
  • We get a pointer to the ipv6 hdr in br_ip6_multicast_query but we may
    call pskb_may_pull afterwards and end up using a stale pointer.
    So use the header directly, it's just 1 place where it's needed.

    Fixes: 08b202b67264 ("bridge br_multicast: IPv6 MLD support.")
    Signed-off-by: Nikolay Aleksandrov
    Tested-by: Martin Weinelt
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov