19 Feb, 2020

3 commits

  • Add a new API for start/end binary array brackets [] to force array
    around binary data as required from JSON. With this restriction, re-open
    API to set binary fmsg data.

    Signed-off-by: Aya Levin
    Reviewed-by: Jiri Pirko
    Signed-off-by: Saeed Mahameed

    Aya Levin
     
  • Instead of assigning skb = segments before the loop, just pass
    segments directly as the first argument to skb_list_walk_safe().

    Signed-off-by: Edward Cree
    Signed-off-by: David S. Miller

    Edward Cree
     
  • After performing an unbind/bind operation the network is no longer
    functional on i.MX6 (which has a single FEC instance):

    # echo 2188000.ethernet > /sys/bus/platform/drivers/fec/unbind
    # echo 2188000.ethernet > /sys/bus/platform/drivers/fec/bind
    [ 10.756519] pps pps0: new PPS source ptp0
    [ 10.792626] libphy: fec_enet_mii_bus: probed
    [ 10.799330] fec 2188000.ethernet eth0: registered PHC device 1
    # udhcpc -i eth0
    udhcpc: started, v1.31.1
    [ 14.985211] fec 2188000.ethernet eth0: no PHY, assuming direct connection to switch
    [ 14.993140] libphy: PHY fixed-0:00 not found
    [ 14.997643] fec 2188000.ethernet eth0: could not attach to PHY

    On SoCs with two FEC instances there are some cases where one FEC instance
    depends on the other one being present. One such example is i.MX28, which
    has the following FEC dependency as noted in the comments:

    /*
    * The i.MX28 dual fec interfaces are not equal.
    * Here are the differences:
    *
    * - fec0 supports MII & RMII modes while fec1 only supports RMII
    * - fec0 acts as the 1588 time master while fec1 is slave
    * - external phys can only be configured by fec0
    *
    * That is to say fec1 can not work independently. It only works
    * when fec0 is working. The reason behind this design is that the
    * second interface is added primarily for Switch mode.
    *
    * Because of the last point above, both phys are attached on fec0
    * mdio interface in board design, and need to be configured by
    * fec0 mii_bus.
    */

    Prevent the unbind operation to avoid these issues.

    Signed-off-by: Fabio Estevam
    Signed-off-by: David S. Miller

    Fabio Estevam
     

18 Feb, 2020

36 commits

  • drivers/net/ethernet/amazon/ena/ena_com.c: In function ena_com_hash_key_allocate:
    drivers/net/ethernet/amazon/ena/ena_com.c:1070:50:
    warning: variable hash_key set but not used [-Wunused-but-set-variable]

    commit 6a4f7dc82d1e ("net: ena: rss: do not allocate key when not supported")
    introduced this, but not used, so remove it.

    Reported-by: Hulk Robot
    Signed-off-by: YueHaibing
    Signed-off-by: David S. Miller

    YueHaibing
     
  • The current codebase makes use of the zero-length array language
    extension to the C90 standard, but the preferred mechanism to declare
    variable-length types such as these ones is a flexible array member[1][2],
    introduced in C99:

    struct foo {
    int stuff;
    struct boo array[];
    };

    By making use of the mechanism above, we will get a compiler warning
    in case the flexible array does not occur last in the structure, which
    will help us prevent some kind of undefined behavior bugs from being
    inadvertently introduced[3] to the codebase from now on.

    Also, notice that, dynamic memory allocations won't be affected by
    this change:

    "Flexible array members have incomplete type, and so the sizeof operator
    may not be applied. As a quirk of the original implementation of
    zero-length arrays, sizeof evaluates to zero."[1]

    This issue was found with the help of Coccinelle.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
    [2] https://github.com/KSPP/linux/issues/21
    [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
     
  • The current codebase makes use of the zero-length array language
    extension to the C90 standard, but the preferred mechanism to declare
    variable-length types such as these ones is a flexible array member[1][2],
    introduced in C99:

    struct foo {
    int stuff;
    struct boo array[];
    };

    By making use of the mechanism above, we will get a compiler warning
    in case the flexible array does not occur last in the structure, which
    will help us prevent some kind of undefined behavior bugs from being
    inadvertently introduced[3] to the codebase from now on.

    Also, notice that, dynamic memory allocations won't be affected by
    this change:

    "Flexible array members have incomplete type, and so the sizeof operator
    may not be applied. As a quirk of the original implementation of
    zero-length arrays, sizeof evaluates to zero."[1]

    This issue was found with the help of Coccinelle.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
    [2] https://github.com/KSPP/linux/issues/21
    [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
     
  • The current codebase makes use of the zero-length array language
    extension to the C90 standard, but the preferred mechanism to declare
    variable-length types such as these ones is a flexible array member[1][2],
    introduced in C99:

    struct foo {
    int stuff;
    struct boo array[];
    };

    By making use of the mechanism above, we will get a compiler warning
    in case the flexible array does not occur last in the structure, which
    will help us prevent some kind of undefined behavior bugs from being
    inadvertently introduced[3] to the codebase from now on.

    Also, notice that, dynamic memory allocations won't be affected by
    this change:

    "Flexible array members have incomplete type, and so the sizeof operator
    may not be applied. As a quirk of the original implementation of
    zero-length arrays, sizeof evaluates to zero."[1]

    This issue was found with the help of Coccinelle.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
    [2] https://github.com/KSPP/linux/issues/21
    [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
     
  • The current codebase makes use of the zero-length array language
    extension to the C90 standard, but the preferred mechanism to declare
    variable-length types such as these ones is a flexible array member[1][2],
    introduced in C99:

    struct foo {
    int stuff;
    struct boo array[];
    };

    By making use of the mechanism above, we will get a compiler warning
    in case the flexible array does not occur last in the structure, which
    will help us prevent some kind of undefined behavior bugs from being
    inadvertently introduced[3] to the codebase from now on.

    Also, notice that, dynamic memory allocations won't be affected by
    this change:

    "Flexible array members have incomplete type, and so the sizeof operator
    may not be applied. As a quirk of the original implementation of
    zero-length arrays, sizeof evaluates to zero."[1]

    This issue was found with the help of Coccinelle.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
    [2] https://github.com/KSPP/linux/issues/21
    [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
     
  • The current codebase makes use of the zero-length array language
    extension to the C90 standard, but the preferred mechanism to declare
    variable-length types such as these ones is a flexible array member[1][2],
    introduced in C99:

    struct foo {
    int stuff;
    struct boo array[];
    };

    By making use of the mechanism above, we will get a compiler warning
    in case the flexible array does not occur last in the structure, which
    will help us prevent some kind of undefined behavior bugs from being
    inadvertently introduced[3] to the codebase from now on.

    Also, notice that, dynamic memory allocations won't be affected by
    this change:

    "Flexible array members have incomplete type, and so the sizeof operator
    may not be applied. As a quirk of the original implementation of
    zero-length arrays, sizeof evaluates to zero."[1]

    This issue was found with the help of Coccinelle.

    [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
    [2] https://github.com/KSPP/linux/issues/21
    [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
     
  • Now that the phylib module loading issue has been resolved, we can
    allow this PHY driver to be built as a module.

    Signed-off-by: Russell King
    Acked-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Russell King
     
  • Ursula Braun says:

    ====================
    net/smc: patches 2020-02-17

    here are patches for SMC making termination tasks more perfect.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • IB event handlers schedule the port event worker for further
    processing of port state changes. This patch reduces the number of
    schedules to avoid duplicate processing of the same port change.

    Reviewed-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Ursula Braun
     
  • smc_lgr_terminate() and smc_lgr_terminate_sched() both result in soft
    link termination, smc_lgr_terminate_sched() is scheduling a worker for
    this task. Take out complexity by always using the termination worker
    and getting rid of smc_lgr_terminate() completely.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • The soft parameter of smc_lgr_terminate() is not used and obsolete.
    Remove it.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • When 2 callers call smc_lgr_terminate() at the same time
    for the same lgr, one gets the lgr_lock and deletes the lgr from the
    list and releases the lock. Then the second caller gets the lock and
    tries to delete it again.
    In smc_lgr_terminate() add a check if the link group lgr is already
    deleted from the link group list and prevent to try to delete it a
    second time.
    And add a check if the lgr is marked as freeing, which means that a
    termination is already pending.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • smc_tx_rdma_write() is called under the send_lock and should not call
    smc_lgr_terminate() directly. Call smc_lgr_terminate_sched() instead
    which schedules a worker.

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • smc_lgr_cleanup() is called during termination processing, there is no
    need to send a DELETE_LINK at that time. A DELETE_LINK should have been
    sent before the termination is initiated, if needed.
    And remove the extra call to wake_up(&lnk->wr_reg_wait) because
    smc_llc_link_inactive() already calls the related helper function
    smc_wr_wakeup_reg_wait().

    Signed-off-by: Karsten Graul
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller

    Karsten Graul
     
  • Ido Schimmel says:

    ====================
    mlxsw: Reduce dependency between bridge and router code

    This patch set reduces the dependency between the bridge and the router
    code in preparation for RTNL removal from the route insertion path in
    mlxsw.

    The motivation and solution are explained in detail in patch #3. The
    main idea is that we need to stop special-casing the VXLAN devices with
    regards to the reference counting of the FIDs. Otherwise, we can bump
    into the situation described in patch #3, where the routing code calls
    into the bridge code which calls back into the routing code. After
    adding a mutex to protect router data structures to remove RTNL
    dependency, this can result in an AA deadlock.

    Patches #1 and #2 are preparations. They convert the FIDs to use
    'refcount_t' for reference counting in order to catch over/under flows
    and add extack to the bridge creation function.

    Patches #3-#5 reduce the dependency between the bridge and the router
    code. First, by having the VXLAN device take a reference on the FID in
    patch #3 and then by removing unnecessary code following the change in
    patch #3.

    Patches #6-#10 adjust existing selftests and add new ones to exercise
    the new code paths.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Test that when two VXLAN tunnels with conflicting configurations (i.e.,
    different TTL) are enslaved to the same VLAN-aware bridge, then the
    enslavement of a port to the bridge is denied.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • After recent changes, the VXLAN tunnel will be offloaded regardless if
    any local ports are member in the FID or not. Adjust the test to make
    sure the tunnel is offloaded in this case.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • The driver supports a single VLAN-aware bridge. Test that the
    enslavement of a port to the second VLAN-aware bridge fails with an
    extack.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • Test that creation of a bridge (both VLAN-aware and VLAN-unaware) fails
    with an extack when a VXLAN device with an unsupported configuration is
    already enslaved to it.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • The addition of a VLAN on a bridge slave prompts the driver to have the
    local port in question join the FID corresponding to this VLAN.

    Before recent changes, the operation of joining the FID would also mean
    that the driver would enable VXLAN tunneling if a VXLAN device was also
    member in the VLAN. In case the configuration of the VXLAN tunnel was
    not supported, an extack error would be returned.

    Since the operation of joining the FID no longer means that VXLAN
    tunneling is potentially enabled, the test is no longer relevant. Remove
    it.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • Commit f40be47a3e40 ("mlxsw: spectrum_router: Do not force specific
    configuration order") added a call from the routing code to the bridge
    code in order to handle the case where VNI should be set on a FID
    following the joining of the router port to the FID.

    This is no longer required, as previous patches made VXLAN devices
    explicitly take a reference on the FID and set VNI on it.

    Therefore, remove the unnecessary call and simply have the RIF take a
    reference on the FID without checking if VNI should also be set on it.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • As explained in previous patch, VXLAN devices now take a reference on
    the FID and not only local ports. Therefore, there is no need for local
    ports to check if they need to set a VNI on the FID when they join the
    FID.

    Remove these unnecessary checks.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • Up until now only local ports and the router port (which is also a local
    port) took a reference on the corresponding FID (Filtering Identifier)
    when joining a bridge. For example:

    192.0.2.1/24
    br0
    |
    +------+------+
    | |
    swp1 vxlan0

    In this case the reference count of the FID will be '2'. Since the VXLAN
    device does not take a reference on the FID, whenever a local port joins
    the bridge it needs to check if a VXLAN device is already enslaved. If
    the VXLAN device should be mapped to the FID in question, then the VXLAN
    device's VNI is set on the FID.

    Beside the fact that this scheme special-cases the VXLAN device, it also
    creates an unnecessary dependency between the routing and bridge code:

    1. [R] IP address is added on 'br0', which prompts the creation of a RIF
    and a backing FID
    2. [B] VNI is enabled on backing FID
    3. [R] Host route corresponding to VXLAN device's source address is
    promoted to perform NVE decapsulation

    [R] - Routing code
    [B] - Bridge code

    This back and forth dependency will become problematic when a lock is
    added in the routing code instead of relying on RTNL, as it will result
    in an AA deadlock.

    Instead, have the VXLAN device take a reference on the FID just like all
    the other netdev members of the bridge. In order to correctly handle the
    case where VXLAN devices are already enslaved to the bridge when it is
    offloaded, walk the bridge's slaves and replay the configuration.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Reviewed-by: Petr Machata
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • Propagate extack to bridge creation function so that error messages
    could be passed to user space via netlink instead of printing them to
    kernel log.

    A subsequent patch will pass the new extack argument to more functions.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • 'refcount_t' is very useful for catching over/under flows. Convert the
    FID (Filtering Identifier) objects to use it instead of 'unsigned int'
    for their reference count.

    A subsequent patch in the series will change the way VXLAN devices hold
    / release the FID reference, which is why the conversion is made now.

    Signed-off-by: Ido Schimmel
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • This enables ndo_dflt_bridge_getlink() to report a bridge port's
    offload settings for multicast and broadcast flooding.

    CC: Roopa Prabhu
    CC: Nikolay Aleksandrov
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Edward Cree says:

    ====================
    couple more ARFS tidy-ups

    Tie up some loose ends from the recent ARFS work.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • efx_filter_rfs_expire() is a work-function, so it being inline makes no
    sense. It's only ever used in efx_channels.c, so move it there.
    While we're at it, clean out some related unused cruft.

    Signed-off-by: Edward Cree
    Signed-off-by: David S. Miller

    Edward Cree
     
  • Prevent excessive CPU time spent running a workitem with nothing to do.

    We avoid any races by keeping the same check in efx_filter_rfs_expire().

    Suggested-by: Martin Habets
    Signed-off-by: Edward Cree
    Signed-off-by: David S. Miller

    Edward Cree
     
  • When a real dev unregisters, vlan_device_event() also unregisters all
    of its vlan interfaces. For each VID this ends up in __vlan_vid_del(),
    which attempts to remove the VID from the real dev's VLAN filter.

    But the unregistering real dev might no longer be able to issue the
    required IOs, and return an error. Subsequently we raise a noisy warning
    msg that is not appropriate for this situation: the real dev is being
    torn down anyway, there shouldn't be any worry about cleanly releasing
    all of its HW-internal resources.

    So to avoid scaring innocent users, suppress this warning when the
    failed deletion happens on an unregistering device.
    While at it also convert the raw pr_warn() to a more fitting
    netdev_warn().

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Since PCI core provides a generic PCI_DEVICE_DATA() macro,
    replace STMMAC_DEVICE() with former one.

    No functional change intended.

    Signed-off-by: Andy Shevchenko
    Signed-off-by: David S. Miller

    Andy Shevchenko
     
  • Vlad Buslov says:

    ====================
    Remove rtnl lock dependency from flow_action infra

    Currently, TC flow_action infrastructure code obtain rtnl lock before
    accessing action state in tc_setup_flow_action() function and releases
    it afterwards. This behavior is not supposed to impact TC filter
    insertion rate because filling flow_action representation is only a
    small part of creating new filter and expensive operations (hardware
    offload callbacks, classifiers, cls API code that creates chains and
    classifiers instances) already support unlocked execution. However,
    typical vswitch implementation might need to also dump TC filters
    concurrently, for example to age out unused flows or update flow
    counters. TC dump is fully serialized and holds rtnl lock during its
    whole execution in kernel space. As such, it can significantly impact
    concurrent tasks that try to intermittently obtain rtnl lock when
    filling intermediate representation for new filter offload (performance
    evaluation at the end of this mail).

    Refactor flow_action cls API infrastructure and its dependencies to not
    rely on rtnl lock for synchronization. Patch set overview:

    - Refactor tc_setup_flow_action() to obtain action tcf_lock when
    accessing action state. Fix its dependencies to not obtain tcf_lock
    themselves and assume that caller already holds it (needs to be done
    in same patch to prevent deadlock) and not to call sleeping functions
    (needs to be done in same patch to prevent "sleeping while atomic"
    dmesg warnings).

    - Refactor action helper functions to require tcf_lock instead of rtnl.
    Internally, all of the actions already use tcf_lock for
    synchronization to accommodate unlocked classifier API, so this change
    relies on already existing functionality.

    - Remove rtnl lock and "rtnl_held" argument from tc_setup_flow_action()
    function.

    To test the change, multiple concurrent TC instances are invoked with
    following command:

    time ls add* | xargs -n 1 -P 100 sudo tc -b

    Ten batch files with following typical rules (100k each) are used:

    filter add dev ens1f0_0 protocol ip ingress prio 1 handle 1 flower
    src_mac e4:11:0:0:0:0 dst_mac e4:12:0:0:0:0 src_ip 192.168.111.1
    dst_ip 192.168.111.2 ip_proto udp dst_port 1 src_port 1 action
    tunnel_key set id 1 src_ip 2.2.2.2 dst_ip 2.2.2.3 dst_port 4789
    no_percpu action mirred egress redirect dev vxlan1 no_percpu

    TC dump of same device is called in infinite loop from five concurrent
    instances:

    while true do tc -s filter show dev $NIC ingress >/dev/null done

    Results obtained on current net-next commit 9f68e3655aae ("Merge tag
    'drm-next-2020-01-30' of git://anongit.freedesktop.org/drm/drm"):

    | net-next | this change
    ---------------+----------+-------------
    TC add | 6.3s | 6.3s
    TC add + dump | 29.3s | 6.8s

    Test results confirm significant impact of concurrent TC dump. The
    impact is almost fully mitigated by proposed change (differences can be
    attributed to contention for chain and tp locks between add and dump TC
    instances).
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Refactor tc_setup_flow_action() function not to use rtnl lock and remove
    'rtnl_held' argument that is no longer needed.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • In order to remove rtnl lock dependency from flow_action representation
    translator, change rtnl_dereference() to rcu_dereference_protected() in ct
    action helpers that provide external access to zone and action values. This
    is safe to do because the functions are not called from anywhere else
    outside flow_action infrastructure which was modified to obtain tcf_lock
    when accessing action data in one of previous patches in the series.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • In order to remove rtnl lock dependency from flow_action representation
    translator, change rcu_dereference_bh_rtnl() to rcu_dereference_protected()
    in police action helpers that provide external access to rate and burst
    values. This is safe to do because the functions are not called from
    anywhere else outside flow_action infrastructure which was modified to
    obtain tcf_lock when accessing action data in one of previous patches in
    the series.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • In order to remove dependency on rtnl lock, take action's tcfa_lock when
    constructing its representation as flow_action_entry structure.

    Refactor tcf_sample_get_group() to assume that caller holds tcf_lock and
    don't take it manually. This callback is only called from flow_action infra
    representation translator which now calls it with tcf_lock held, so this
    refactoring is necessary to prevent deadlock.

    Allocate memory with GFP_ATOMIC flag for ip_tunnel_info copy because
    tcf_tunnel_info_copy() is only called from flow_action representation infra
    code with tcf_lock spinlock taken.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     

17 Feb, 2020

1 commit

  • Lorenzo Bianconi says:

    ====================
    add xdp ethtool stats to mvneta driver

    Rework mvneta stats accounting in order to introduce xdp ethtool
    statistics in the mvneta driver.
    Introduce xdp_redirect, xdp_pass, xdp_drop and xdp_tx counters to
    ethtool statistics.
    Fix skb_alloc_error and refill_error ethtool accounting
    ====================

    Signed-off-by: David S. Miller

    David S. Miller