29 Jan, 2019

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter/IPVS fixes for net

    The following patchset contains Netfilter/IPVS fixes for your net tree:

    1) The nftnl mutex is now per-netns, therefore use reference counter
    for matches and targets to deal with concurrent updates from netns.
    Moreover, place extensions in a pernet list. Patches from Florian Westphal.

    2) Bail out with EINVAL in case of negative timeouts via setsockopt()
    through ip_vs_set_timeout(), from ZhangXiaoxu.

    3) Spurious EINVAL on ebtables 32bit binary with 64bit kernel, also
    from Florian.

    4) Reset TCP option header parser in case of fingerprint mismatch,
    otherwise follow up overlapping fingerprint definitions including
    TCP options do not work, from Fernando Fernandez Mancera.

    5) Compilation warning in ipt_CLUSTERIP with CONFIG_PROC_FS unset.
    From Anders Roxell.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

28 Jan, 2019

1 commit

  • Unlike ip(6)tables, ebtables only counts user-defined chains.

    The effect is that a 32bit ebtables binary on a 64bit kernel can do
    'ebtables -N FOO' only after adding at least one rule, else the request
    fails with -EINVAL.

    This fix is similar to the one in
    3f1e53abff84 ("netfilter: ebtables: don't attempt to allocate 0-sized compat array").

    Fixes: 7d7d7e02111e9 ("netfilter: compat: reject huge allocation requests")
    Reported-by: Francesco Ruggeri
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

19 Jan, 2019

2 commits

  • Externally learned entries can be added by a user or by a switch driver
    that is notifying the bridge driver about entries that were learned in
    hardware.

    In the first case, the entries are not marked with the 'added_by_user'
    flag, which causes switch drivers to ignore them and not offload them.

    The 'added_by_user' flag can be set on externally learned FDB entries
    based on the 'swdev_notify' parameter in br_fdb_external_learn_add(),
    which effectively means if the created / updated FDB entry was added by
    a user or not.

    Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications")
    Signed-off-by: Ido Schimmel
    Reported-by: Alexander Petrovskiy
    Reviewed-by: Petr Machata
    Cc: Roopa Prabhu
    Cc: Nikolay Aleksandrov
    Cc: bridge@lists.linux-foundation.org
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • In certain cases, pskb_trim_rcsum() may change skb pointers.
    Reinitialize header pointers afterwards to avoid potential
    use-after-frees. Add a note in the documentation of
    pskb_trim_rcsum(). Found by KASAN.

    Signed-off-by: Ross Lagerwall
    Signed-off-by: David S. Miller

    Ross Lagerwall
     

18 Jan, 2019

1 commit

  • The skb header should be set to the Ethernet header before calling
    is_skb_forwardable(), because that function already accounts for the
    Ethernet header length (its limit includes dev->hard_header_len).

    To reproduce the issue:
    1. Add two ports to a Linux bridge br using the following commands:
    $ brctl addbr br
    $ brctl addif br eth0
    $ brctl addif br eth1
    2. The MTU of eth0 and eth1 is 1500.
    3. Send a packet (Data 1480, UDP 8, IP 20, Ethernet 14, VLAN 4)
    from eth0 to eth1.

    The expected result is that a packet larger than 1500 bytes cannot pass
    from eth0 to eth1. Currently the packet is forwarded successfully, which
    means eth1's MTU limit does not take effect.
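
The arithmetic behind the reproduction can be sketched as follows. This is a simplified model, not the kernel code: it assumes the length check compares the skb length against mtu + hard_header_len + VLAN_HLEN, and that the VLAN tag is counted in the skb length (the conclusion is the same if it is not). The Python function and variable names are illustrative only.

```python
ETH_HLEN = 14   # Ethernet header length in bytes
VLAN_HLEN = 4   # 802.1Q tag length in bytes


def is_forwardable(skb_len, mtu, hard_header_len=ETH_HLEN):
    """Model of the length check: the limit already includes the L2 header."""
    return skb_len <= mtu + hard_header_len + VLAN_HLEN


# Packet from the reproduction steps: 1480 data + 8 UDP + 20 IP.
l3_len = 1480 + 8 + 20            # 1508 bytes, larger than the 1500-byte MTU
tagged_len = l3_len + VLAN_HLEN   # 1512 bytes with the VLAN tag

# Before the fix: the Ethernet header is not part of the measured length,
# so the packet is compared against a limit that is 14 bytes too generous.
before_fix = is_forwardable(tagged_len, mtu=1500)

# After the fix: the Ethernet header is included before the check.
after_fix = is_forwardable(tagged_len + ETH_HLEN, mtu=1500)

print(before_fix, after_fix)
```

With these numbers the limit is 1518 bytes: the oversized packet slips through when the header is missing from the length (1512) and is rejected once the header is counted (1526).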

    Fixes: f6367b4660dd ("bridge: use is_skb_forwardable in forward path")
    Cc: bridge@lists.linux-foundation.org
    Cc: Nikolay Aleksandrov
    Cc: Roopa Prabhu
    Cc: Stephen Hemminger
    Signed-off-by: Yunjian Wang
    Signed-off-by: David S. Miller

    Yunjian Wang
     

16 Jan, 2019

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net

    This is the first batch of Netfilter fixes for your net tree:

    1) Fix endless loop in nf_tables rules netlink dump, from Phil Sutter.

    2) Reference counter leak in object from the error path, from Taehee Yoo.

    3) Selective rule dump requires table and chain.

    4) Fix DNAT with nft_flow_offload reverse route lookup, from wenxu.

    5) Use GFP_KERNEL_ACCOUNT in vmalloc allocation from ebtables, from
    Shakeel Butt.

    6) Set ifindex from route to fix interaction with VRF slave device,
    also from wenxu.

    7) Use nfct_help() to check for conntrack helper, IPS_HELPER status
    flag is only set from explicit helpers via -j CT, from Henry Yen.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

12 Jan, 2019

1 commit

  • Matteo reported forwarding issues inside the linux bridge,
    if the enslaved interfaces use the fq qdisc.

    Similar to commit 8203e2d844d3 ("net: clear skb->tstamp in
    forwarding paths"), we need to clear the tstamp field in
    the bridge forwarding path.

    Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.")
    Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
    Reported-and-tested-by: Matteo Croce
    Signed-off-by: Paolo Abeni
    Acked-by: Nikolay Aleksandrov
    Acked-by: Roopa Prabhu
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Paolo Abeni
     

11 Jan, 2019

1 commit

  • The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
    memory is already accounted to kmemcg. Do the same for ebtables. The
    syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the
    whole system from a restricted memcg, a potential DoS.

    By accounting the ebt_table_info, the memory used for ebt_table_info can
    be contained within the memcg of the allocating process. However the
    lifetime of ebt_table_info is independent of the allocating process and
    is tied to the network namespace. So, the oom-killer will not be able to
    relieve the memory pressure due to ebt_table_info memory. The memory for
    ebt_table_info is allocated through vmalloc. Currently vmalloc does not
    handle the oom-killed allocating process correctly and one large
    allocation can bypass memcg limit enforcement. So, with this patch,
    at least the small allocations will be contained. For large allocations,
    we need to fix vmalloc.

    Reported-by: syzbot+7713f3aa67be76b1552c@syzkaller.appspotmail.com
    Signed-off-by: Shakeel Butt
    Reviewed-by: Kirill Tkhai
    Signed-off-by: Pablo Neira Ayuso

    Shakeel Butt
     

09 Jan, 2019

1 commit

  • When adding / deleting VLANs to / from a bridge port, the bridge driver
    first tries to propagate the information via switchdev and falls back to
    the 8021q driver in case the underlying driver does not support
    switchdev. This can result in a memory leak [1] when VXLAN and mlxsw
    ports are enslaved to the bridge:

    $ ip link set dev vxlan0 master br0
    # No mlxsw ports are enslaved to 'br0', so mlxsw ignores the switchdev
    # notification and the bridge driver adds the VLAN on 'vxlan0' via the
    # 8021q driver
    $ bridge vlan add vid 10 dev vxlan0 pvid untagged
    # mlxsw port is enslaved to the bridge
    $ ip link set dev swp1 master br0
    # mlxsw processes the switchdev notification and the 8021q driver is
    # skipped
    $ bridge vlan del vid 10 dev vxlan0

    This results in 'struct vlan_info' and 'struct vlan_vid_info' being
    leaked, as they were allocated by the 8021q driver during VLAN addition,
    but never freed as the 8021q driver was skipped during deletion.

    Fix this by introducing a new VLAN private flag that indicates whether
    the VLAN was added on the port by switchdev or the 8021q driver. If the
    VLAN was added by the 8021q driver, then we make sure to delete it via
    the 8021q driver as well.
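
The idea of the fix can be illustrated with a toy model. This is a sketch under stated assumptions, not the bridge driver's code: the Port class, the function names and the boolean flag are hypothetical stand-ins for the private VLAN flag described above, and the set dot1q_vids stands in for the memory the 8021q driver allocates.

```python
class Port:
    """Toy stand-in for a bridge port; not the kernel's data structure."""

    def __init__(self, name):
        self.name = name
        self.vlans = {}          # vid -> True if the VLAN was added via switchdev
        self.dot1q_vids = set()  # models state allocated by the 8021q driver


def vlan_add(port, vid, switchdev_handles):
    """Add a VLAN and remember which path programmed it."""
    if not switchdev_handles:
        port.dot1q_vids.add(vid)         # fallback: 8021q driver allocates state
    port.vlans[vid] = switchdev_handles  # models the new private flag


def vlan_del_buggy(port, vid, switchdev_handles):
    """Old behaviour: decide from the current switchdev answer only."""
    del port.vlans[vid]
    if not switchdev_handles:
        port.dot1q_vids.discard(vid)


def vlan_del_fixed(port, vid):
    """New behaviour: undo via the same path that performed the addition."""
    added_by_switchdev = port.vlans.pop(vid)
    if not added_by_switchdev:
        port.dot1q_vids.discard(vid)


leaky = Port("vxlan0")
vlan_add(leaky, 10, switchdev_handles=False)       # no mlxsw port enslaved yet
vlan_del_buggy(leaky, 10, switchdev_handles=True)  # mlxsw port enslaved by now

fixed = Port("vxlan0")
vlan_add(fixed, 10, switchdev_handles=False)
vlan_del_fixed(fixed, 10)

print(leaky.dot1q_vids, fixed.dot1q_vids)
```

In the buggy variant VLAN 10 stays in dot1q_vids after deletion, which corresponds to the leaked objects; in the fixed variant the recorded flag routes the deletion back through the 8021q path.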

    [1]
    unreferenced object 0xffff88822d20b1e8 (size 256):
    comm "bridge", pid 2532, jiffies 4295216998 (age 1188.830s)
    hex dump (first 32 bytes):
    e0 42 97 ce 81 88 ff ff 00 00 00 00 00 00 00 00 .B..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmem_cache_alloc_trace+0x1be/0x330
    [] vlan_vid_add+0x661/0x920
    [] __vlan_add+0x1be9/0x3a00
    [] nbp_vlan_add+0x8b3/0xd90
    [] br_vlan_info+0x132/0x410
    [] br_afspec+0x75c/0x870
    [] br_setlink+0x3dc/0x6d0
    [] rtnl_bridge_setlink+0x615/0xb30
    [] rtnetlink_rcv_msg+0x3a3/0xa80
    [] netlink_rcv_skb+0x152/0x3c0
    [] rtnetlink_rcv+0x21/0x30
    [] netlink_unicast+0x52f/0x740
    [] netlink_sendmsg+0x9c7/0xf50
    [] sock_sendmsg+0xbe/0x120
    [] ___sys_sendmsg+0x778/0x8f0
    [] __sys_sendmsg+0x112/0x270
    unreferenced object 0xffff888227454308 (size 32):
    comm "bridge", pid 2532, jiffies 4295216998 (age 1188.882s)
    hex dump (first 32 bytes):
    88 b2 20 2d 82 88 ff ff 88 b2 20 2d 82 88 ff ff .. -...... -....
    81 00 0a 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] kmem_cache_alloc_trace+0x1be/0x330
    [] vlan_vid_add+0x3e6/0x920
    [] __vlan_add+0x1be9/0x3a00
    [] nbp_vlan_add+0x8b3/0xd90
    [] br_vlan_info+0x132/0x410
    [] br_afspec+0x75c/0x870
    [] br_setlink+0x3dc/0x6d0
    [] rtnl_bridge_setlink+0x615/0xb30
    [] rtnetlink_rcv_msg+0x3a3/0xa80
    [] netlink_rcv_skb+0x152/0x3c0
    [] rtnetlink_rcv+0x21/0x30
    [] netlink_unicast+0x52f/0x740
    [] netlink_sendmsg+0x9c7/0xf50
    [] sock_sendmsg+0xbe/0x120
    [] ___sys_sendmsg+0x778/0x8f0
    [] __sys_sendmsg+0x112/0x270

    Fixes: d70e42b22dd4 ("mlxsw: spectrum: Enable VxLAN enslavement to VLAN-aware bridges")
    Signed-off-by: Ido Schimmel
    Reviewed-by: Petr Machata
    Cc: Roopa Prabhu
    Cc: Nikolay Aleksandrov
    Cc: bridge@lists.linux-foundation.org
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Ido Schimmel
     

08 Jan, 2019

1 commit

  • When handling DNAT'ed packets on a bridge device, the neighbour cache entry
    from lookup was used without checking its state. It means that a cache entry
    in the NUD_STALE state will be used directly instead of entering the NUD_DELAY
    state to confirm the reachability of the neighbor.

    This problem becomes worse after commit 2724680bceee ("neigh: Keep neighbour
    cache entries if number of them is small enough."), since all neighbour cache
    entries in the NUD_STALE state will be kept in the neighbour table as long as
    the number of cache entries does not exceed the value specified in gc_thresh1.

    This commit validates the state of a neighbour cache entry before using
    the entry.

    Signed-off-by: JianJhen Chen
    Reviewed-by: JinLin Chen
    Signed-off-by: David S. Miller

    JianJhen Chen
     

20 Dec, 2018

2 commits


19 Dec, 2018

1 commit


17 Dec, 2018

1 commit


14 Dec, 2018

2 commits

  • When a port device seeks approval of a potential new MAC address, make
    sure that should the bridge device end up using this address, all
    interested parties would agree with it.

    Signed-off-by: Petr Machata
    Acked-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Petr Machata
     
  • When a port is attached to a bridge, the address of the bridge in
    question may change as well. Even if it would not change at this
    point (because the current bridge address is lower), it might end up
    changing later as a result of detach of another port, which can't be
    vetoed.

    Therefore issue NETDEV_PRE_CHANGEADDR regardless of whether the address
    will be used at this point or not, and make sure all involved parties
    would agree with the change.

    Signed-off-by: Petr Machata
    Acked-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Petr Machata
     

13 Dec, 2018

3 commits

  • After the previous patch, bridge driver has extack argument available to
    pass to switchdev. Therefore extend switchdev_port_obj_add() with this
    argument, updating all callers, and passing the argument through to
    switchdev_port_obj_notify().

    Signed-off-by: Petr Machata
    Acked-by: Jiri Pirko
    Acked-by: Ivan Vecera
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Petr Machata
     
  • ndo_bridge_setlink has been updated in the previous patch to have extack
    available, and changelink RTNL op has had this argument since the time
    extack was added. Propagate both through the bridge driver to eventually
    reach br_switchdev_port_vlan_add(), where it will be used by subsequent
    patches.

    Signed-off-by: Petr Machata
    Acked-by: Jiri Pirko
    Acked-by: Nikolay Aleksandrov
    Acked-by: Ivan Vecera
    Acked-by: Roopa Prabhu
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Petr Machata
     
  • Drivers may not be able to implement a VLAN addition or reconfiguration.
    In those cases it's desirable to explain to the user that it was
    rejected (and why).

    To that end, add extack argument to ndo_bridge_setlink. Adapt all users
    to that change.

    Following patches will use the new argument in the bridge driver.

    Signed-off-by: Petr Machata
    Acked-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Petr Machata
     

08 Dec, 2018

1 commit

  • When a driver unoffloads all FDB entries en bloc, it's inefficient to
    send the switchdev notification one by one. Add a helper that unsets the
    offload flag on FDB entries on a given bridge port and VLAN.

    Signed-off-by: Petr Machata
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Petr Machata
     

06 Dec, 2018

4 commits

  • The bridge's default hash_max was 512, which is rather conservative. Now
    that we're using the generic rhashtable API, which autoshrinks, let's
    increase it to 4096 and move it to a define in br_private.h.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Now that the bridge multicast uses the generic rhashtable interface we
    can drop the hash_elasticity option as that is already done for us and
    it's hardcoded to a maximum of RHT_ELASTICITY (16 currently). Add a
    warning about the obsolete option when the hash_elasticity is set.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • The bridge multicast code has been using a mix of RCU and RCU-bh flavors
    sometimes in questionable ways. Since we've moved to rhashtable, just use
    non-bh RCU everywhere. In addition this simplifies freeing of objects
    and allows us to remove some unnecessary callback functions.

    v3: new patch

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • The bridge multicast code currently uses a custom resizable hashtable
    which predates the generic rhashtable interface. It has many
    shortcomings by comparison and duplicates functionality that is now
    available via the generic rhashtable, so this patch removes the custom
    hashtable implementation in favor of the kernel's generic rhashtable.
    The hash maximum is kept, and the rhashtable's size is used to do a loose
    check of whether it has been reached, in which case we revert to the old
    behaviour and disable further bridge multicast processing. We can also
    now support any hash maximum; it no longer needs to be a power of 2.

    v3: add non-rcu br_mdb_get variant and use it where multicast_lock is
    held to avoid RCU splat, drop hash_max function and just set it
    directly

    v2: handle when IGMP snooping is undefined, add br_mdb_init/uninit
    placeholders

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

01 Dec, 2018

1 commit


28 Nov, 2018

3 commits

  • Now that we have at least one bool option, we can export all of the
    supported bool options via optmask when dumping them.

    v2: new patch

    Signed-off-by: Nikolay Aleksandrov
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • Use the new boolopt API to add an option which disables learning from
    link-local packets. The default is kept as before and learning is
    enabled. This is a simple map from a boolopt bit to a bridge private
    flag that is tested before learning.

    v2: pass NULL for extack via sysfs

    Signed-off-by: Nikolay Aleksandrov
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • We have been adding many new bridge options, a big number of which are
    boolean but still take up netlink attribute ids and waste space in the skb.
    Recently we discussed learning from link-local packets[1] and decided
    yet another new boolean option will be needed, thus introducing this API
    to save some bridge nl space.
    The API supports changing the value of multiple boolean options at once
    via the br_boolopt_multi struct which has an optmask (which options to
    set, bit per opt) and optval (options' new values). Future boolean
    options will only be added to the br_boolopt_id enum and then will have
    to be handled in br_boolopt_toggle/get. The API will automatically
    add the ability to change and export them via netlink, sysfs can use the
    single boolopt function versions to do the same. The behaviour with
    failing/succeeding is the same as with normal netlink option changing.

    If an option requires mapping to internal kernel flag or needs special
    configuration to be enabled then it should be handled in
    br_boolopt_toggle. It should also be able to retrieve an option's current
    state via br_boolopt_get.
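
The optmask/optval semantics can be sketched roughly as below. This is a Python model, not the kernel implementation: the option list, the second option (EXAMPLE_OPT) and the function names are hypothetical, and the model assumes that bits outside the known options are simply ignored.

```python
from enum import IntEnum


class BoolOpt(IntEnum):
    """Illustrative option ids; each id is a bit position in the masks."""
    NO_LL_LEARN = 0   # disable learning from link-local packets
    EXAMPLE_OPT = 1   # hypothetical second option


def boolopt_multi_toggle(state, optmask, optval):
    """Set every option selected in optmask to its bit in optval."""
    new_state = dict(state)
    for opt in BoolOpt:
        bit = 1 << opt
        if optmask & bit:                 # only touch selected options
            new_state[opt] = bool(optval & bit)
    return new_state


def boolopt_multi_get(state):
    """Export all supported options: optmask lists them, optval has values."""
    optmask = 0
    optval = 0
    for opt in BoolOpt:
        optmask |= 1 << opt
        if state[opt]:
            optval |= 1 << opt
    return optmask, optval


state = {opt: False for opt in BoolOpt}
# optval has both bits set, but optmask selects only NO_LL_LEARN.
state = boolopt_multi_toggle(state, optmask=0b01, optval=0b11)
exported = boolopt_multi_get(state)
print(state, exported)
```

Because optmask selects which bits of optval are meaningful, one message can change several options at once while leaving the rest untouched.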

    v2: WARN_ON() on unsupported option as that shouldn't be possible and
    also will help catch people who add new options without handling
    them for both set and get. Pass down extack so if an option desires
    it could set it on error and be more user-friendly.

    [1] https://www.spinics.net/lists/netdev/msg532698.html

    Signed-off-by: Nikolay Aleksandrov
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

26 Nov, 2018

1 commit

  • A recent change added a null check on p->dev after p->dev had already
    been dereferenced by the ns_capable check on p->dev. It turns out that
    neither the p->dev nor the p->br null check is necessary, so both can be
    removed, which cleans up a static analysis warning.

    As Nikolay Aleksandrov noted, these checks can be removed because:

    "My reasoning of why it shouldn't be possible:
    - On port add new_nbp() sets both p->dev and p->br before creating
    kobj/sysfs

    - On port del (trickier) del_nbp() calls kobject_del() before call_rcu()
    to destroy the port which in turn calls sysfs_remove_dir() which uses
    kernfs_remove() which deactivates (shouldn't be able to open new
    files) and calls kernfs_drain() to drain current open/mmaped files in
    the respective dir before continuing, thus making it impossible to
    open a bridge port sysfs file with p->dev and p->br equal to NULL.

    So I think it's safe to remove those checks altogether. It'd be nice to
    get a second look over my reasoning as I might be missing something in
    sysfs/kernfs call path."

    Thanks to Nikolay Aleksandrov's suggestion to remove the check and
    David Miller for sanity checking this.

    Detected by CoverityScan, CID#751490 ("Dereference before null check")

    Fixes: a5f3ea54f3cc ("net: bridge: add support for raw sysfs port options")
    Signed-off-by: Colin Ian King
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Colin Ian King
     

22 Nov, 2018

1 commit

  • Allow querying bridge port flags so that drivers capable of performing
    VxLAN learning will update the bridge driver only if learning is enabled
    on its bridge port corresponding to the VxLAN device.

    Signed-off-by: Ido Schimmel
    Reviewed-by: Petr Machata
    Signed-off-by: David S. Miller

    Ido Schimmel
     

20 Nov, 2018

1 commit


18 Nov, 2018

1 commit

  • Syzbot reported a use-after-free of the global vlan context on port vlan
    destruction. When I added per-port vlan stats I missed the fact that the
    global vlan context can be freed before the per-port vlan rcu callback.
    There're a few different ways to deal with this, I've chosen to add a
    new private flag that is set only when per-port stats are allocated so
    we can directly check it on destruction without dereferencing the global
    context at all. The new field in net_bridge_vlan uses a hole.

    v2: cosmetic change, move the check to br_process_vlan_info where the
    other checks are done
    v3: add change log in the patch, add private (in-kernel only) flags in a
    hole in net_bridge_vlan struct and use that instead of mixing
    user-space flags with private flags

    Fixes: 9163a0fc1f0c ("net: bridge: add support for per-port vlan stats")
    Reported-by: syzbot+04681da557a0e49a52e5@syzkaller.appspotmail.com
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

09 Nov, 2018

1 commit


29 Oct, 2018

1 commit

  • Recently a check was added which prevents marking of routers with zero
    source address, but for IPv6 that cannot happen as the relevant RFCs
    actually forbid such packets:
    RFC 2710 (MLDv1):
    "To be valid, the Query message MUST
    come from a link-local IPv6 Source Address, be at least 24 octets
    long, and have a correct MLD checksum."

    Same goes for RFC 3810.

    And also it can be seen as a requirement in ipv6_mc_check_mld_query()
    which is used by the bridge to validate the message before processing
    it. Thus any queries with :: source address won't be processed anyway.
    So just remove the check for zero IPv6 source address from the query
    processing function.
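
Why the zero-source check is redundant for IPv6 can be shown with a small sketch. It models only the two conditions quoted from RFC 2710 above (link-local source, minimum length) and leaves out the checksum; the function name is hypothetical and this is not the code of ipv6_mc_check_mld_query().

```python
import ipaddress

MLD_QUERY_MIN_LEN = 24  # octets, per the RFC 2710 text quoted above


def mld_query_plausible(src, msg_len):
    """True if the source is link-local and the message is long enough."""
    addr = ipaddress.IPv6Address(src)
    return addr.is_link_local and msg_len >= MLD_QUERY_MIN_LEN


unspecified_ok = mld_query_plausible("::", 24)        # :: is not link-local
link_local_ok = mld_query_plausible("fe80::1", 24)
too_short_ok = mld_query_plausible("fe80::1", 20)

print(unspecified_ok, link_local_ok, too_short_ok)
```

A query sourced from :: already fails the link-local requirement, so it never reaches the point where a separate zero-address test could matter.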

    Fixes: 5a2de63fd1a5 ("bridge: do not add port to router list when receives query with source 0.0.0.0")
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

27 Oct, 2018

1 commit

  • Based on RFC 4541, 2.1.1. IGMP Forwarding Rules

    The switch supporting IGMP snooping must maintain a list of
    multicast routers and the ports on which they are attached. This
    list can be constructed in any combination of the following ways:

    a) This list should be built by the snooping switch sending
    Multicast Router Solicitation messages as described in IGMP
    Multicast Router Discovery [MRDISC]. It may also snoop
    Multicast Router Advertisement messages sent by and to other
    nodes.

    b) The arrival port for IGMP Queries (sent by multicast routers)
    where the source address is not 0.0.0.0.

    We should not add the port to the router list when it receives a query
    with source 0.0.0.0.
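
Rule (b) can be sketched as a simple filter on the query's source address. This is an illustrative model only; the function name, the port names and the router_ports set are hypothetical and do not mirror the bridge driver's data structures.

```python
import ipaddress


def note_igmp_query(router_ports, port, query_src):
    """Record the arrival port as a router port unless the source is 0.0.0.0."""
    if not ipaddress.IPv4Address(query_src).is_unspecified:
        router_ports.add(port)
    return router_ports


router_ports = set()
note_igmp_query(router_ports, "eth0", "0.0.0.0")      # ignored for router list
note_igmp_query(router_ports, "eth1", "192.0.2.1")    # marks eth1

print(router_ports)
```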

    Reported-by: Ying Xu
    Signed-off-by: Hangbin Liu
    Acked-by: Nikolay Aleksandrov
    Acked-by: Roopa Prabhu
    Signed-off-by: David S. Miller

    Hangbin Liu
     

25 Oct, 2018

1 commit

  • Pull documentation updates from Jonathan Corbet:
    "This is a fairly typical cycle for documentation. There's some welcome
    readability improvements for the formatted output, some LICENSES
    updates including the addition of the ISC license, the removal of the
    unloved and unmaintained 00-INDEX files, the deprecated APIs document
    from Kees, more MM docs from Mike Rapoport, and the usual pile of typo
    fixes and corrections"

    * tag 'docs-4.20' of git://git.lwn.net/linux: (41 commits)
    docs: Fix typos in histogram.rst
    docs: Introduce deprecated APIs list
    kernel-doc: fix declaration type determination
    doc: fix a typo in adding-syscalls.rst
    docs/admin-guide: memory-hotplug: remove table of contents
    doc: printk-formats: Remove bogus kobject references for device nodes
    Documentation: preempt-locking: Use better example
    dm flakey: Document "error_writes" feature
    docs/completion.txt: Fix a couple of punctuation nits
    LICENSES: Add ISC license text
    LICENSES: Add note to CDDL-1.0 license that it should not be used
    docs/core-api: memory-hotplug: add some details about locking internals
    docs/core-api: rename memory-hotplug-notifier to memory-hotplug
    docs: improve readability for people with poorer eyesight
    yama: clarify ptrace_scope=2 in Yama documentation
    docs/vm: split memory hotplug notifier description to Documentation/core-api
    docs: move memory hotplug description into admin-guide/mm
    doc: Fix acronym "FEKEK" in ecryptfs
    docs: fix some broken documentation references
    iommu: Fix passthrough option documentation
    ...

    Linus Torvalds
     

20 Oct, 2018

1 commit

  • This fixes a problem introduced by:
    commit 2cde6acd49da ("netpoll: Fix __netpoll_rcu_free so that it can hold the rtnl lock")

    When using netconsole on a bond, __netpoll_cleanup can asynchronously
    recurse multiple times, each __netpoll_free_async call can result in
    more __netpoll_free_async's. This means there is now a race between
    cleanup_work queues on multiple netpoll_info's on multiple devices and
    the configuration of a new netpoll. For example if a netconsole is set
    to enable 0, reconfigured, and enable 1 immediately, this netconsole
    will likely not work.

    The reason __netpoll_free_async exists is that it can be called when
    rtnl is not locked; when rtnl is locked, we should be able to execute
    synchronously. It appears to be locked everywhere it's called from.

    Generalize the design pattern from the teaming driver for current
    callers of __netpoll_free_async.

    CC: Neil Horman
    CC: "David S. Miller"
    Signed-off-by: Debabrata Banerjee
    Signed-off-by: David S. Miller

    Debabrata Banerjee
     

18 Oct, 2018

1 commit

  • Currently, an FDB entry only ceases being offloaded when it is deleted.
    This changes with VxLAN encapsulation.

    Devices capable of performing VxLAN encapsulation usually have only one
    FDB table, unlike the software data path which has two - one in the
    bridge driver and another in the VxLAN driver.

    Therefore, bridge FDB entries pointing to a VxLAN device are only
    offloaded if there is a corresponding entry in the VxLAN FDB.

    Allow clearing the offload indication in case the corresponding entry
    was deleted from the VxLAN FDB.

    Signed-off-by: Ido Schimmel
    Reviewed-by: Petr Machata
    Signed-off-by: David S. Miller

    Ido Schimmel
     

16 Oct, 2018

1 commit

  • With per-port vlan stats, the vlan stats should be released
    when adding a vlan fails.

    Fixes: 9163a0fc1f0c0 ("net: bridge: add support for per-port vlan stats")
    CC: bridge@lists.linux-foundation.org
    cc: Nikolay Aleksandrov
    CC: Roopa Prabhu
    Signed-off-by: Zhang Yu
    Signed-off-by: Li RongQing
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Li RongQing
     

13 Oct, 2018

1 commit

  • This patch adds an option to have per-port vlan stats instead of the
    default global stats. The option can be set only when there are no port
    vlans in the bridge since we need to allocate the stats if it is set
    when vlans are being added to ports (and respectively free them
    when being deleted). Also bump RTNL_MAX_TYPE as the bridge is the
    largest user of options. The current stats design allows us to add
    these without any changes to the fast-path, it all comes down to
    the per-vlan stats pointer which, if this option is enabled, will
    be allocated for each port vlan instead of using the global bridge-wide
    one.
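
The pointer-based design can be modelled in a few lines. This is a sketch, not the bridge code: the class and method names are hypothetical, and the RuntimeError is only a stand-in for however the kernel rejects the option change while port VLANs exist.

```python
class Stats:
    def __init__(self):
        self.rx_packets = 0


class Bridge:
    """Toy model: every port VLAN holds a reference to a Stats object."""

    def __init__(self):
        self.per_port_stats = False
        self.global_stats = {}   # vid -> bridge-wide Stats
        self.port_vlans = {}     # (port, vid) -> Stats used by that port VLAN

    def set_per_port_stats(self, enabled):
        if self.port_vlans:
            # Stand-in for the kernel's rejection of the change.
            raise RuntimeError("option can change only without port VLANs")
        self.per_port_stats = enabled

    def add_port_vlan(self, port, vid):
        shared = self.global_stats.setdefault(vid, Stats())
        own = Stats() if self.per_port_stats else shared
        self.port_vlans[(port, vid)] = own

    def count_rx(self, port, vid):
        # Fast path is identical in both modes: follow the stats reference.
        self.port_vlans[(port, vid)].rx_packets += 1


shared = Bridge()
shared.add_port_vlan("swp1", 10)
shared.add_port_vlan("swp2", 10)
shared.count_rx("swp1", 10)

split = Bridge()
split.set_per_port_stats(True)
split.add_port_vlan("swp1", 10)
split.add_port_vlan("swp2", 10)
split.count_rx("swp1", 10)

print(shared.port_vlans[("swp2", 10)].rx_packets,
      split.port_vlans[("swp2", 10)].rx_packets)
```

In the default mode both ports update the same counter; with the option enabled each port VLAN has its own, while the counting path itself is unchanged.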

    CC: bridge@lists.linux-foundation.org
    CC: Roopa Prabhu
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov