20 Jan, 2021

1 commit

  • Fix incorrect user_ptr dereferencing when handling port param get/set:

    idx [0] stores the 'struct devlink' pointer;
    idx [1] stores the 'struct devlink_port' pointer;

    Fixes: 637989b5d77e ("devlink: Always use user_ptr[0] for devlink and simplify post_doit")
    CC: Parav Pandit
    Signed-off-by: Oleksandr Mazur
    Signed-off-by: Vadym Kochan
    Link: https://lore.kernel.org/r/20210119085333.16833-1-vadym.kochan@plvision.eu
    Signed-off-by: Jakub Kicinski

    Oleksandr Mazur
     

09 Dec, 2020

1 commit


28 Nov, 2020

1 commit


26 Nov, 2020

2 commits

  • When devlink reload operation is not used, the netdev of an Ethernet
    port may be present in a different net namespace than the net namespace
    of the devlink instance.

    Ensure that both the devlink instance and the devlink port netdev are
    located in the same net namespace.

    Fixes: 070c63f20f6c ("net: devlink: allow to change namespaces during reload")
    Signed-off-by: Parav Pandit
    Signed-off-by: Jakub Kicinski

    Parav Pandit
     
  • A netdevice of a devlink port can be moved to a different net namespace
    than its parent devlink instance.
    This scenario occurs when devlink reload is not used.

    When a netdevice is migrating to another net namespace, its ifindex
    and name may change.

    In such a case, a devlink port query may read stale netdev attributes.

    Fix it by reading them under the rtnl lock.

    Fixes: bfcd3a466172 ("Introduce devlink infrastructure")
    Signed-off-by: Parav Pandit
    Signed-off-by: Jakub Kicinski

    Parav Pandit
     

25 Nov, 2020

2 commits

  • Fix the reload stats structure exposed to the user. Change the stats
    structure hierarchy so that the reload action is the parent of the stat
    entry, and each stat entry holds a value per limit. This will also help
    to avoid string concatenation on iproute2 output.

    Reload stats structure before this fix:
    "stats": {
        "reload": {
            "driver_reinit": 2,
            "fw_activate": 1,
            "fw_activate_no_reset": 0
        }
    }

    After this fix:
    "stats": {
        "reload": {
            "driver_reinit": {
                "unspecified": 2
            },
            "fw_activate": {
                "unspecified": 1,
                "no_reset": 0
            }
        }
    }

    Fixes: a254c264267e ("devlink: Add reload stats")
    Signed-off-by: Moshe Shemesh
    Reviewed-by: Jiri Pirko
    Link: https://lore.kernel.org/r/1606109785-25197-1-git-send-email-moshe@mellanox.com
    Signed-off-by: Jakub Kicinski

    Moshe Shemesh
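
The change in layout described in the commit above can be illustrated with a small user-space sketch. The helper name `nest_reload_stats` and the suffix-based mapping are assumptions made purely for illustration; the kernel emits nested netlink attributes directly and does not convert a flat dictionary.

```python
# Hypothetical helper: convert the old flat reload stats layout into the
# nested action -> limit -> value layout shown in the commit message.
KNOWN_LIMITS = ("no_reset",)  # the only limit named in the commit message


def nest_reload_stats(flat):
    nested = {}
    for key, value in flat.items():
        action, limit = key, "unspecified"
        for known in KNOWN_LIMITS:
            suffix = "_" + known
            if key.endswith(suffix):
                # e.g. "fw_activate_no_reset" -> ("fw_activate", "no_reset")
                action, limit = key[: -len(suffix)], known
                break
        nested.setdefault(action, {})[limit] = value
    return nested
```

With the nested form, a consumer such as iproute2 can print the action and limit names as they are, without concatenating strings to rebuild keys like "fw_activate_no_reset".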
     
  • Add a packet trap to report packets that were dropped due to a
    blackhole nexthop.

    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Signed-off-by: Jakub Kicinski

    Ido Schimmel
     

20 Nov, 2020

2 commits

  • When performing a flash update via devlink, device drivers may inform
    user space of status updates via
    devlink_flash_update_(begin|end|timeout|status)_notify functions.

    It is expected that drivers do not send any status notifications unless
    they send a begin and end message. If a driver sends a status
    notification without sending the appropriate end notification upon
    finishing (regardless of success or failure), the current implementation
    of the devlink userspace program can get stuck endlessly waiting for the
    end notification that will never come.

    The current ice driver implementation may send such a status message
    without the appropriate end notification in rare cases.

    Fixing the ice driver is relatively simple: we just need to send the
    begin_notify at the start of the function and always send an end_notify
    no matter how the function exits.

    Rather than assuming driver authors will always get this right in the
    future, let's just fix the API so that it is not possible to get wrong.
    Make devlink_flash_update_begin_notify and
    devlink_flash_update_end_notify static, and call them in devlink.c core
    code. Always send the begin_notify just before calling the driver's
    flash_update routine. Always send the end_notify just after the routine
    returns regardless of success or failure.

    Doing this makes the status notification easier to use from the driver,
    as it no longer needs to worry about catching failures and cleaning up
    by calling devlink_flash_update_end_notify. It is now no longer possible
    to do the wrong thing in this regard. We also save a couple of lines of
    code in each driver.

    Signed-off-by: Jacob Keller
    Acked-by: Vasundhara Volam
    Reviewed-by: Jiri Pirko
    Signed-off-by: Jakub Kicinski

    Jacob Keller
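
The pairing guarantee described in the commit above can be modelled in a few lines. This is a user-space sketch, not the kernel code: `core_flash_update`, `driver_flash_update` and `notify` are hypothetical names, and the driver callback is assumed to return 0 on success or a negative error code.

```python
# Model of the core wrapping a driver's flash_update callback so that the
# begin and end notifications are always sent as a pair.
def core_flash_update(driver_flash_update, notify):
    notify("begin")
    try:
        # The driver may send any number of status notifications here.
        return driver_flash_update(notify)
    finally:
        # Runs whether the driver returned success, returned an error
        # code, or raised, so user space is never left waiting.
        notify("end")
```

Because the wrapper owns both notifications, a driver in this model cannot forget the end message on an error path.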
     
  • All drivers which implement the devlink flash update support, with the
    exception of netdevsim, use either request_firmware or
    request_firmware_direct to locate the firmware file. Rather than having
    each driver do this separately as part of its .flash_update
    implementation, perform the request_firmware within net/core/devlink.c.

    Replace the file_name parameter in the struct devlink_flash_update_params
    with a pointer to the fw object.

    Use request_firmware rather than request_firmware_direct. Although most
    Linux distributions today do not have the fallback mechanism
    implemented, only about half the drivers used the _direct request, as
    compared to the generic request_firmware. In the event that
    a distribution does support the fallback mechanism, the devlink flash
    update ought to be able to use it to provide the firmware contents. For
    distributions which do not support the fallback userspace mechanism,
    there should be essentially no difference between request_firmware and
    request_firmware_direct.

    Signed-off-by: Jacob Keller
    Acked-by: Shannon Nelson
    Acked-by: Vasundhara Volam
    Reviewed-by: Jiri Pirko
    Signed-off-by: Jakub Kicinski

    Jacob Keller
     

15 Nov, 2020

1 commit

  • If sb_occ_port_pool_get() failed in devlink_nl_sb_port_pool_fill(),
    msg should be canceled by genlmsg_cancel().

    Fixes: df38dafd2559 ("devlink: implement shared buffer occupancy monitoring interface")
    Reported-by: Hulk Robot
    Signed-off-by: Wang Hai
    Link: https://lore.kernel.org/r/20201113111622.11040-1-wanghai38@huawei.com
    Signed-off-by: Jakub Kicinski

    Wang Hai
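
The effect of cancelling a partially built message, as in the commit above, can be sketched with a list standing in for the socket buffer. Everything here is illustrative: `fill_port_pool` and `get_occupancy` are hypothetical names, and truncating the list only plays the role that genlmsg_cancel() has in the kernel.

```python
# Simplified model of building one netlink message inside a shared buffer.
def fill_port_pool(buf, get_occupancy):
    start = len(buf)              # remember where this message begins
    buf.append("header")          # analogous to starting the message
    err, cur, peak = get_occupancy()
    if err:
        del buf[start:]           # roll back the partially built message
        return err
    buf.extend([("cur", cur), ("max", peak)])
    return 0
```

Without the rollback, the buffer would keep a header with no payload, and later messages would be appended after a malformed entry.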
     

13 Nov, 2020

1 commit

  • The commit cited in the Fixes tag overwrites the port attributes of an
    already registered port.

    Avoid this error by checking the registered flag before setting the
    attributes.

    Fixes: 71ad8d55f8e5 ("devlink: Replace devlink_port_attrs_set parameters with a struct")
    Signed-off-by: Parav Pandit
    Reviewed-by: Jiri Pirko
    Link: https://lore.kernel.org/r/20201111034744.35554-1-parav@nvidia.com
    Signed-off-by: Jakub Kicinski

    Parav Pandit
     

28 Oct, 2020

2 commits

  • This needs to unlock before returning.

    Fixes: 544e7c33ec2f ("net: devlink: Add support for port regions")
    Signed-off-by: Dan Carpenter
    Link: https://lore.kernel.org/r/20201026080127.GB1628785@mwanda
    Signed-off-by: Jakub Kicinski

    Dan Carpenter
     
  • These paths don't set the error codes. It's especially important in
    devlink_nl_region_notify_build() where it leads to a NULL dereference in
    the caller.

    Fixes: 544e7c33ec2f ("net: devlink: Add support for port regions")
    Signed-off-by: Dan Carpenter
    Link: https://lore.kernel.org/r/20201026080059.GA1628785@mwanda
    Signed-off-by: Jakub Kicinski

    Dan Carpenter
     

10 Oct, 2020

6 commits

  • The enable_remote_dev_reset devlink param indicates whether the host
    admin allows device resets that are initiated by other hosts. This
    parameter is useful for setups where a device is shared by different
    hosts, such as a multi-host setup. Once the user sets this parameter to
    false, the driver should NACK any attempt to reset the device while the
    driver is loaded.

    Signed-off-by: Moshe Shemesh
    Reviewed-by: Jiri Pirko
    Signed-off-by: Jakub Kicinski

    Moshe Shemesh
     
  • Add remote reload stats to hold the history of actions performed due to
    devlink reload commands initiated by a remote host. For example, in case
    firmware activation with reset finished successfully but was initiated
    by a remote host.

    The function devlink_remote_reload_actions_performed() is exported so
    that drivers can report remote reload actions that were performed, as
    these were not initiated by their own devlink instance.

    Expose devlink remote reload stats to the user through the devlink dev
    get command.

    Examples:
    $ devlink dev show
    pci/0000:82:00.0:
      stats:
          reload:
            driver_reinit 2 fw_activate 1 fw_activate_no_reset 0
          remote_reload:
            driver_reinit 0 fw_activate 0 fw_activate_no_reset 0
    pci/0000:82:00.1:
      stats:
          reload:
            driver_reinit 1 fw_activate 0 fw_activate_no_reset 0
          remote_reload:
            driver_reinit 1 fw_activate 1 fw_activate_no_reset 0

    $ devlink dev show -jp
    {
        "dev": {
            "pci/0000:82:00.0": {
                "stats": {
                    "reload": {
                        "driver_reinit": 2,
                        "fw_activate": 1,
                        "fw_activate_no_reset": 0
                    },
                    "remote_reload": {
                        "driver_reinit": 0,
                        "fw_activate": 0,
                        "fw_activate_no_reset": 0
                    }
                }
            },
            "pci/0000:82:00.1": {
                "stats": {
                    "reload": {
                        "driver_reinit": 1,
                        "fw_activate": 0,
                        "fw_activate_no_reset": 0
                    },
                    "remote_reload": {
                        "driver_reinit": 1,
                        "fw_activate": 1,
                        "fw_activate_no_reset": 0
                    }
                }
            }
        }
    }

    Signed-off-by: Moshe Shemesh
    Reviewed-by: Jakub Kicinski
    Reviewed-by: Jiri Pirko
    Signed-off-by: Jakub Kicinski

    Moshe Shemesh
     
  • Add reload stats to hold the history per reload action type and limit.

    For example, the number of times fw_activate has been performed on this
    device since the driver module was added or if the firmware activation
    was performed with or without reset.

    Add devlink notification on stats update.

    Expose devlink reload stats to the user through devlink dev get command.

    Examples:
    $ devlink dev show
    pci/0000:82:00.0:
      stats:
          reload:
            driver_reinit 2 fw_activate 1 fw_activate_no_reset 0
    pci/0000:82:00.1:
      stats:
          reload:
            driver_reinit 1 fw_activate 0 fw_activate_no_reset 0

    $ devlink dev show -jp
    {
        "dev": {
            "pci/0000:82:00.0": {
                "stats": {
                    "reload": {
                        "driver_reinit": 2,
                        "fw_activate": 1,
                        "fw_activate_no_reset": 0
                    }
                }
            },
            "pci/0000:82:00.1": {
                "stats": {
                    "reload": {
                        "driver_reinit": 1,
                        "fw_activate": 0,
                        "fw_activate_no_reset": 0
                    }
                }
            }
        }
    }

    Signed-off-by: Moshe Shemesh
    Reviewed-by: Jiri Pirko
    Signed-off-by: Jakub Kicinski

    Moshe Shemesh
     
  • Add reload limit to let the user place restrictions on reload actions.
    Reload limits supported:
    no_reset: No reset allowed, no down time allowed, no link flap and no
    configuration is lost.

    By default the reload limit is unspecified, so no constraints on reload
    actions are required.

    Some combinations of action and limit are invalid. For example, a driver
    cannot reinitialize its entities without any downtime.

    The no_reset reload limit is used in this patchset to implement
    restricted fw_activate on mlx5.

    Have the uapi parameter of reload limit ready for future support of
    multiselection.

    Signed-off-by: Moshe Shemesh
    Reviewed-by: Jiri Pirko
    Signed-off-by: Jakub Kicinski

    Moshe Shemesh
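
The rule that some action and limit combinations are invalid can be expressed as a small lookup. This is only a sketch: the function name and the use of strings are assumptions, and the single invalid pair listed is the one the commit message itself gives as an example.

```python
# Invalid (action, limit) pairs. The commit message names one example:
# driver entities cannot be reinitialized without any downtime.
INVALID_COMBINATIONS = {("driver_reinit", "no_reset")}


def reload_combination_is_valid(action, limit="unspecified"):
    # An unspecified limit places no constraint on the action.
    return (action, limit) not in INVALID_COMBINATIONS
```

A request handler built on this could reject an invalid combination before the driver is ever called.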
     
  • Add devlink reload action to allow the user to request a specific reload
    action. The action parameter is optional; if it is not specified, the
    devlink driver re-init action is used (backward compatible).
    Note that some drivers may need to reload the driver in order to perform
    firmware activation. On the other hand, some drivers may need to reset
    the firmware to reinitialize the driver entities. Therefore, the devlink
    reload command returns the actions which were actually performed.
    Reload actions supported are:
    driver_reinit: driver entities re-initialization, applying devlink-param
    and devlink-resource values.
    fw_activate: firmware activate.

    Command examples:
    $ devlink dev reload pci/0000:82:00.0 action driver_reinit
    reload_actions_performed:
      driver_reinit

    $ devlink dev reload pci/0000:82:00.0 action fw_activate
    reload_actions_performed:
      driver_reinit fw_activate

    Signed-off-by: Moshe Shemesh
    Reviewed-by: Jakub Kicinski
    Reviewed-by: Jacob Keller
    Reviewed-by: Jiri Pirko
    Signed-off-by: Jakub Kicinski

    Moshe Shemesh
     
  • Change the devlink_reload_supported() function to take a devlink_ops
    pointer instead of a devlink pointer.
    This change will be used in the next patch to check if devlink reload is
    supported before the devlink instance is allocated.

    Signed-off-by: Moshe Shemesh
    Reviewed-by: Jakub Kicinski
    Reviewed-by: Jiri Pirko
    Reviewed-by: Jacob Keller
    Signed-off-by: Jakub Kicinski

    Moshe Shemesh
     

05 Oct, 2020

2 commits

  • Allow regions to be registered to a devlink port. The same netlink API
    is used, but the port index is provided to indicate when a region is a
    port region as opposed to a device region.

    Reviewed-by: Vladimir Oltean
    Tested-by: Vladimir Oltean
    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • Not all ports of a switch need to be used, particularly in embedded
    systems. Add a port flavour for ports which physically exist in the
    switch, but are not connected to the front panel etc, and so are
    unused. Having unused ports present in devlink gives a more accurate
    representation of the hardware. It also allows regions to be
    associated with such ports, making it possible, for example, to check
    that unused ports are correctly powered off, or to compare the probable
    reset defaults of unused ports with used ports that experience issues.

    Actually registering unused ports and setting the flavour to unused is
    optional. The DSA core will register all such switch ports, but such
    ports are expected to be limited in number. Bigger ASICs may decide
    not to list unused ports.

    v2:
    Expand the description about why it is useful

    Reviewed-by: Vladimir Oltean
    Tested-by: Vladimir Oltean
    Signed-off-by: Andrew Lunn
    Reviewed-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Andrew Lunn
     

03 Oct, 2020

3 commits

  • Bulk of the genetlink users can use smaller ops, move them.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Johannes Berg
    Signed-off-by: David S. Miller

    Jakub Kicinski
     
  • Add a new devlink callback, .trap_group_action_set(), which can be used
    by device drivers which do not support controlling the action (drop,
    trap) on each trap but rather on the entire trap group.
    If this new callback is populated, it will take precedence over the
    .trap_action_set() callback when the user requests a change of all the
    traps in a group.

    Signed-off-by: Ioana Ciornei
    Signed-off-by: David S. Miller

    Ioana Ciornei
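
The precedence between the two callbacks described in the commit above can be sketched as follows. This is a user-space model: the `ops` dictionary stands in for the driver's callback table, `set_group_action` is a hypothetical name, and callbacks are assumed to return 0 on success or a negative error code.

```python
def set_group_action(ops, group, traps, action):
    group_cb = ops.get("trap_group_action_set")
    if group_cb is not None:
        # The group-wide callback takes precedence when it is populated.
        return group_cb(group, action)
    # Otherwise fall back to changing each trap in the group individually.
    for trap in traps:
        err = ops["trap_action_set"](trap, action)
        if err:
            return err
    return 0
```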
     
  • Add parser error drop packet traps, so that capable device drivers can
    register them with devlink. The new packet trap group holds any drops of
    packets which were marked by the device as erroneous during header
    parsing. Add documentation for every added packet trap and packet trap
    group.

    Signed-off-by: Ioana Ciornei
    Signed-off-by: David S. Miller

    Ioana Ciornei
     

01 Oct, 2020

3 commits

  • Previously, devlink called into drop monitor in order to report hardware
    originated drops / exceptions. devlink intentionally filtered control
    packets and did not pass them to drop monitor as they were not dropped
    by the underlying hardware.

    Now drop monitor registers its probe on a generic 'devlink_trap_report'
    tracepoint and should therefore perform this filtering itself instead of
    having devlink do that.

    Add the trap type as metadata and have drop monitor ignore control
    packets.

    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
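
The filtering that moves from devlink into drop monitor in the commit above can be modelled as a probe that inspects the trap type carried in the metadata. The function name, the dictionary layout and the string values are assumptions made for this sketch, not the kernel data structures.

```python
CONTROL = "control"  # assumed label for control-type traps


def drop_monitor_probe(reports, metadata, packet):
    """Hypothetical probe; `metadata` models the trap metadata."""
    if metadata["trap_type"] == CONTROL:
        # Control packets were trapped to the CPU, not dropped by hardware.
        return False
    reports.append((metadata["trap_name"], packet))
    return True
```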
     
  • Convert drop monitor to use the recently introduced
    'devlink_trap_report' tracepoint instead of having devlink call into
    drop monitor.

    This is both consistent with software originated drops ('kfree_skb'
    tracepoint) and also allows drop monitor to be built as a module and
    still report hardware originated drops.

    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • Add a tracepoint for trap reports so that drop monitor could register
    its probe on it. Use trace_devlink_trap_report_enabled() to avoid
    wasting cycles setting the trap metadata if the tracepoint is not
    enabled.

    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     

26 Sep, 2020

3 commits

  • Sections of device flash may contain settings or device identifying
    information. When performing a flash update, it is generally expected
    that these settings and identifiers are not overwritten.

    However, it may sometimes be useful to allow overwriting these fields
    when performing a flash update. Some examples include, 1) customizing
    the initial device config on first programming, such as overwriting
    default device identifying information, or 2) reverting a device
    configuration to known good state provided in the new firmware image, or
    3) in case it is suspected that current firmware logic for managing the
    preservation of fields during an update is broken.

    Although some devices are able to completely separate these types of
    settings and fields into separate components, this is not true for all
    hardware.

    To support controlling this behavior, a new
    DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK is defined. This is an
    nla_bitfield32 which will define what subset of fields in a component
    should be overwritten during an update.

    If no bits are specified, or if the overwrite mask is not provided, then
    an update should not overwrite anything, and should maintain the
    settings and identifiers as they are in the previous image.

    If the overwrite mask has the DEVLINK_FLASH_OVERWRITE_SETTINGS bit set,
    then the device should be configured to overwrite any of the settings in
    the requested component with settings found in the provided image.

    Similarly, if the DEVLINK_FLASH_OVERWRITE_IDENTIFIERS bit is set, the
    device should be configured to overwrite any device identifiers in the
    requested component with the identifiers from the image.

    Multiple overwrite modes may be combined to indicate that a combination
    of fields should be overwritten.

    Drivers which support the new overwrite mask must set the
    DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK in the
    supported_flash_update_params field of their devlink_ops.

    Signed-off-by: Jacob Keller
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jacob Keller
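
How a driver might interpret the overwrite mask from the commit above can be shown with plain bit tests. The numeric bit values below are assumptions for illustration and should be checked against the uapi header; `fields_to_overwrite` is a hypothetical helper.

```python
# Assumed bit assignments, for illustration only.
DEVLINK_FLASH_OVERWRITE_SETTINGS = 1 << 0
DEVLINK_FLASH_OVERWRITE_IDENTIFIERS = 1 << 1


def fields_to_overwrite(mask=0):
    # A missing or empty mask means nothing is overwritten.
    fields = []
    if mask & DEVLINK_FLASH_OVERWRITE_SETTINGS:
        fields.append("settings")
    if mask & DEVLINK_FLASH_OVERWRITE_IDENTIFIERS:
        fields.append("identifiers")
    return fields
```

Combining both bits with a bitwise OR requests that settings and identifiers are both taken from the new image.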
     
  • The devlink core recently gained support for checking whether the driver
    supports a flash_update parameter, via `supported_flash_update_params`.
    However, parameters are specified as function arguments. Adding a new
    parameter still requires modifying the signature of the .flash_update
    callback in all drivers.

    Convert the .flash_update function to take a new `struct
    devlink_flash_update_params` instead. By using this structure, and the
    `supported_flash_update_params` bit field, a new parameter to
    flash_update can be added without requiring modification to existing
    drivers.

    As before, all parameters except file_name will require driver opt-in.
    Because file_name is a necessary field for the flash_update to make
    sense, no "SUPPORTED" bitflag is provided and it is always considered
    valid. All future additional parameters will require a new bit in the
    supported_flash_update_params bitfield.

    Signed-off-by: Jacob Keller
    Reviewed-by: Jakub Kicinski
    Cc: Jiri Pirko
    Cc: Jakub Kicinski
    Cc: Jonathan Corbet
    Cc: Michael Chan
    Cc: Bin Luo
    Cc: Saeed Mahameed
    Cc: Leon Romanovsky
    Cc: Ido Schimmel
    Cc: Danielle Ratson
    Signed-off-by: David S. Miller

    Jacob Keller
     
  • When implementing .flash_update, drivers which do not support
    per-component update are manually checking the component parameter to
    verify that it is NULL. Without this check, the driver might accept an
    update request with a component specified even though it will not honor
    such a request.

    Instead of having each driver check this, move the logic into
    net/core/devlink.c, and use a new `supported_flash_update_params` field
    in the devlink_ops. Drivers which will support per-component update must
    now specify this by setting DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT in
    the supported_flash_update_params in their devlink_ops.

    This helps ensure that drivers do not forget to check for a NULL
    component if they do not support per-component update. This also enables
    a slightly better error message by enabling the core stack to set the
    netlink bad attribute message to indicate precisely the unsupported
    attribute in the message.

    Going forward, any new additional parameter to flash update will require
    a bit in the supported_flash_update_params bitfield.

    Signed-off-by: Jacob Keller
    Reviewed-by: Jakub Kicinski
    Cc: Jiri Pirko
    Cc: Jonathan Corbet
    Cc: Michael Chan
    Cc: Bin Luo
    Cc: Saeed Mahameed
    Cc: Leon Romanovsky
    Cc: Ido Schimmel
    Cc: Danielle Ratson
    Cc: Shannon Nelson
    Signed-off-by: David S. Miller

    Jacob Keller
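
The opt-in check that the commit above moves into the core can be sketched like this. It is a simplified model: the bit value, the errno constant and the function names are assumptions, and the real code also attaches an extended netlink error message pointing at the offending attribute.

```python
DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT = 1 << 0  # assumed bit value
EOPNOTSUPP = 95


def check_and_flash(supported_params, driver_flash_update, file_name,
                    component=None):
    # Reject a component request up front if the driver did not opt in,
    # so individual drivers no longer need their own NULL check.
    if component is not None and not (
            supported_params & DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT):
        return -EOPNOTSUPP
    return driver_flash_update(file_name, component)
```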
     

23 Sep, 2020

2 commits


19 Sep, 2020

3 commits

  • Pass the region to be snapshotted to the function performing the
    snapshot. This allows one function to operate on numerous regions.

    v4:
    Add missing kerneldoc for ICE

    Signed-off-by: Andrew Lunn
    Reviewed-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • The dev flash status notify function parameter lists are getting
    rather long, so add a struct to be filled and passed rather than
    continuously changing the function signatures.

    Signed-off-by: Shannon Nelson
    Reviewed-by: Jacob Keller
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Shannon Nelson
     
  • Add a timeout element to the DEVLINK_CMD_FLASH_UPDATE_STATUS
    netlink message for use by a userland utility to show that
    a particular firmware flash activity may take a long but
    bounded time to finish. Also add a handy helper for drivers
    to make use of the new timeout value.

    UI usage hints:
    - if non-zero, add timeout display to the end of the status line
    [component] status_msg ( Xm Ys : Am Bs )
    using the timeout value for Am Bs and updating the Xm Ys
    every second
    - if the timeout expires while awaiting the next update,
    display something like
    [component] status_msg ( timeout reached : Am Bs )
    - if new status notify messages are received, remove
    the timeout and start over

    Signed-off-by: Shannon Nelson
    Reviewed-by: Jakub Kicinski
    Reviewed-by: Jacob Keller
    Signed-off-by: David S. Miller

    Shannon Nelson
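
The UI usage hints in the commit above translate into a small formatting routine. This sketch assumes elapsed time and timeout are whole seconds; `format_status` is a hypothetical name and is not taken from the devlink userspace program.

```python
def format_status(component, status_msg, elapsed, timeout):
    line = f"[{component}] {status_msg}"
    if timeout == 0:
        return line                      # no timeout: plain status line
    t_min, t_sec = divmod(timeout, 60)
    if elapsed >= timeout:
        return f"{line} ( timeout reached : {t_min}m {t_sec}s )"
    e_min, e_sec = divmod(elapsed, 60)
    return f"{line} ( {e_min}m {e_sec}s : {t_min}m {t_sec}s )"
```

A caller would reset `elapsed` to zero each time a new status notification arrives, matching the last hint in the list.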
     

16 Sep, 2020

1 commit


11 Sep, 2020

1 commit

  • The following change will add support for a corner case where
    we may not have a netdev to pass to devlink_port_type_eth_set()
    but we still want to set the port type.

    This is definitely a corner case, and drivers should not normally
    pass NULL netdev - print a warning message when this happens.

    Sadly for other port types (ib) switches don't have a device
    reference, the way we always do for Ethernet, so we can't put
    the warning in __devlink_port_type_set().

    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jakub Kicinski
     

10 Sep, 2020

3 commits

  • Now that the controller number attribute is available, use it when
    building phys_port_name for external controller ports.

    An example devlink port and representor netdev name that include the
    controller annotation, for an external controller with controller
    number = 1, for VF 1 of PF 0:

    $ devlink port show pci/0000:06:00.0/2
    pci/0000:06:00.0/2: type eth netdev ens2f0c1pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
      function:
        hw_addr 00:00:00:00:00:00

    $ devlink port show pci/0000:06:00.0/2 -jp
    {
        "port": {
            "pci/0000:06:00.0/2": {
                "type": "eth",
                "netdev": "ens2f0c1pf0vf1",
                "flavour": "pcivf",
                "controller": 1,
                "pfnum": 0,
                "vfnum": 1,
                "external": true,
                "splittable": false,
                "function": {
                    "hw_addr": "00:00:00:00:00:00"
                }
            }
        }
    }

    Controller number annotation is skipped for non external controllers to
    maintain backward compatibility.

    Signed-off-by: Parav Pandit
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Parav Pandit
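
The naming rule from the commit above, for the PCI VF case, can be sketched as a string builder. The function name and signature are hypothetical; the "c<N>" prefix is inferred from the example netdev name ens2f0c1pf0vf1 in the commit message.

```python
def pci_vf_port_name(pf, vf, controller=0, external=False):
    # The controller annotation is added only for external controllers,
    # which keeps existing names unchanged for local ones.
    prefix = f"c{controller}" if external else ""
    return f"{prefix}pf{pf}vf{vf}"
```

For VF 1 of PF 0 on external controller 1 this yields the "c1pf0vf1" suffix seen in the example netdev name, while a local controller keeps the plain "pf0vf1" form.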
     
  • A devlink port may belong to a controller that consists of a PCI device.
    A devlink instance holds ports of two types of controllers.
    (1) A controller discovered on the same system where the eswitch resides.
    This is the case where the PCI PF/VF of a controller and the devlink
    eswitch instance are both located on a single system.
    (2) A controller located on an external host system.
    This is the case where a controller is located in one system and its
    devlink eswitch ports are located in a different system.

    When a devlink eswitch instance serves the devlink ports of both
    controllers together, PCI PF/VF numbers may overlap.
    Due to this, a unique phys_port_name cannot be constructed.

    For example, in the system below, controller-0 and controller-1 each have
    a PCI PF pf0 whose eswitch ports can be present in controller-0.
    This results in phys_port_name "pf0" for both.
    A similar problem exists for VFs and upcoming sub functions.
    An example view of two controller systems:

                 ---------------------------------------------------------
                 |                                                       |
                 |           --------- ---------         ------- ------- |
    -----------  |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
    | server  |  | -------   ----/---- ---/----- ------- ---/--- ---/--- |
    | pci rc  |=== | pf0 |______/________/       | pf1 |___/_______/     |
    | connect |  | -------                       -------                 |
    -----------  |     | controller_num=1 (no eswitch)                   |
                 ------|--------------------------------------------------
                 (internal wire)
                       |
                 ---------------------------------------------------------
                 | devlink eswitch ports and reps                        |
                 | ----------------------------------------------------- |
                 | |ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 | ctrl-0 |ctrl-0 | |
                 | |pf0    | pf0vfN | pf0sfN | pf1    | pf1vfN |pf1sfN | |
                 | ----------------------------------------------------- |
                 | |ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 | ctrl-1 |ctrl-1 | |
                 | |pf1    | pf1vfN | pf1sfN | pf1    | pf1vfN |pf0sfN | |
                 | ----------------------------------------------------- |
                 |                                                       |
                 |                                                       |
                 |           --------- ---------         ------- ------- |
                 |           | vf(s) | | sf(s) |         |vf(s)| |sf(s)| |
                 | -------   ----/---- ---/----- ------- ---/--- ---/--- |
                 | | pf0 |______/________/       | pf1 |___/_______/     |
                 | -------                       -------                 |
                 |                                                       |
                 |  local controller_num=0 (eswitch)                     |
                 ---------------------------------------------------------

    An example devlink port for external controller with controller
    number = 1 for a VF 1 of PF 0:

    $ devlink port show pci/0000:06:00.0/2
    pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf controller 1 pfnum 0 vfnum 1 external true splittable false
      function:
        hw_addr 00:00:00:00:00:00

    $ devlink port show pci/0000:06:00.0/2 -jp
    {
        "port": {
            "pci/0000:06:00.0/2": {
                "type": "eth",
                "netdev": "ens2f0pf0vf1",
                "flavour": "pcivf",
                "controller": 1,
                "pfnum": 0,
                "vfnum": 1,
                "external": true,
                "splittable": false,
                "function": {
                    "hw_addr": "00:00:00:00:00:00"
                }
            }
        }
    }

    Signed-off-by: Parav Pandit
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Parav Pandit
     
  • A devlink eswitch port may represent PCI PF/VF ports of a controller.

    A controller is either located on the same system, or it is an external
    controller located in the host where such a NIC is plugged in.

    Add the ability for a driver to specify whether a port is for an
    external controller.

    Use this flag in the mlx5_core driver.

    An example of an external controller, with VF1 of PF0 belonging to
    controller 1:

    $ devlink port show pci/0000:06:00.0/2
    pci/0000:06:00.0/2: type eth netdev ens2f0pf0vf1 flavour pcivf pfnum 0 vfnum 1 external true splittable false
      function:
        hw_addr 00:00:00:00:00:00

    $ devlink port show pci/0000:06:00.0/2 -jp
    {
        "port": {
            "pci/0000:06:00.0/2": {
                "type": "eth",
                "netdev": "ens2f0pf0vf1",
                "flavour": "pcivf",
                "pfnum": 0,
                "vfnum": 1,
                "external": true,
                "splittable": false,
                "function": {
                    "hw_addr": "00:00:00:00:00:00"
                }
            }
        }
    }

    Signed-off-by: Parav Pandit
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Parav Pandit