20 Jul, 2016

40 commits

  • This manages NCSI packages and channels:

    * The available packages and channels are enumerated the first
    time ncsi_start_dev() is called. The channels' capabilities are
    probed at the same time. The NCSI network topology won't change
    until the NCSI device is destroyed.
    * There is a queue in every NCSI device. Each element in the queue,
    a channel, is waiting for configuration (bringup) or suspension
    (teardown). The channel's state (inactive/active) indicates which
    further action (configuration or suspension) will be applied to the
    channel. Another channel state (invisible) means the requested
    action is being applied.
    * Hardware arbitration will be enabled if all available packages
    and channels support it. All available channels try to provide
    service when hardware arbitration is enabled. Otherwise, only one
    channel is selected as the active one at a time.
    * When a channel is in the active state, meaning it's providing
    service, a timer is started to retrieve the channel's link status.
    If the channel's link status fails to be updated within the
    determined period, the channel is reconfigured. This is the error
    handling implementation as defined in the NCSI spec.

    Signed-off-by: Gavin Shan
    Acked-by: Joel Stanley
    Signed-off-by: David S. Miller

    Gavin Shan
     
  • The NCSI response packets are sent to the MC (Management Controller)
    from the remote end. They are responses to NCSI command packets
    and serve multiple purposes: reporting the completion status of NCSI
    command packets, returning an NCSI channel's capability or
    configuration, etc.

    This defines structs to represent NCSI response packets and introduces
    the function ncsi_rcv_rsp(), which will be used to receive NCSI
    response packets and parse them.

    Signed-off-by: Gavin Shan
    Acked-by: Joel Stanley
    Signed-off-by: David S. Miller

    Gavin Shan
     
  • The NCSI command packets are sent from the MC (Management Controller)
    to the remote end. They are used for multiple purposes: probing existing
    NCSI packages/channels, retrieving an NCSI channel's capability,
    configuring an NCSI channel, etc.

    This defines structs to represent NCSI command packets and introduces
    the function ncsi_xmit_cmd(), which will be used to transmit an NCSI
    command packet according to the request. The request is represented by
    struct ncsi_cmd_arg.

    Signed-off-by: Gavin Shan
    Acked-by: Joel Stanley
    Signed-off-by: David S. Miller

    Gavin Shan
     
  • The NCSI spec (DSP0222) defines several objects: package, channel, mode,
    filter, version, statistics, etc. This introduces the data structs
    to represent those objects and implements functions to manage them.
    Also, this introduces CONFIG_NET_NCSI for the newly implemented NCSI
    stack.

    * The user (e.g. a netdev driver) references an NCSI device through
    "struct ncsi_dev", which is embedded in "struct ncsi_dev_priv".
    The latter is used internally by the NCSI stack.
    * Every NCSI device can have multiple packages simultaneously, up
    to 8 packages. Each is represented by "struct ncsi_package" and
    identified by a 3-bit ID.
    * Every NCSI package can have multiple channels, up to 32. Each is
    represented by "struct ncsi_channel" and identified by a 5-bit ID.
    * Every NCSI channel has a version, statistics, various modes and
    filters. They are represented by "struct ncsi_channel_version",
    "struct ncsi_channel_stats", "struct ncsi_channel_mode" and
    "struct ncsi_channel_filter" respectively.
    * Apart from AEN (Asynchronous Event Notification), the NCSI stack
    works in terms of command and response. This introduces "struct
    ncsi_req" to represent a complete NCSI transaction made of NCSI
    request and response.
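
    As a rough illustration of the 3-bit package ID and 5-bit channel ID
    described above, the two can be packed into a single byte. This is only
    a sketch: the helper names are hypothetical and not taken from the patch,
    and placing the package ID in the upper three bits is an assumption here.

```c
#include <stdint.h>

/* Hypothetical helpers; names are illustrative, not from the patch. */
#define PKG_ID_MASK   0x07u  /* 3 bits: up to 8 packages per device   */
#define CHAN_ID_MASK  0x1fu  /* 5 bits: up to 32 channels per package */
#define CHAN_ID_BITS  5

/* Pack a package ID and a channel ID into one byte (assumed layout). */
static inline uint8_t ncsi_make_id(uint8_t package, uint8_t channel)
{
	return (uint8_t)(((package & PKG_ID_MASK) << CHAN_ID_BITS) |
			 (channel & CHAN_ID_MASK));
}

/* Extract the package ID from the upper three bits. */
static inline uint8_t ncsi_id_package(uint8_t id)
{
	return (id >> CHAN_ID_BITS) & PKG_ID_MASK;
}

/* Extract the channel ID from the lower five bits. */
static inline uint8_t ncsi_id_channel(uint8_t id)
{
	return id & CHAN_ID_MASK;
}
```

    With this layout, package 1 / channel 2 maps to the byte 0x22, and any
    byte value decodes to a valid (package, channel) pair.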

    link: https://www.dmtf.org/sites/default/files/standards/documents/DSP0222_1.1.0.pdf
    Signed-off-by: Gavin Shan
    Acked-by: Joel Stanley
    Signed-off-by: David S. Miller

    Gavin Shan
     
  • Vivien Didelot says:

    ====================
    net: dsa: mv88e6xxx: Global2 cleanup and STP

    The Marvell switch registers are organized into distinct internal SMI
    devices, such as the PHY, Port, Global 1 or Global 2 register sets.

    Since not all chips support every register set, and some have slight
    differences in them (such as the old 88E6060 or the new 88E6390, likely
    to be supported soon), make the setup code clearer now by removing a few
    family checks and adding flags to describe the Global 2 register map.

    This patchset enables basic STP support and bridging on most chips by
    getting rid of a few inconsistencies in chip descriptions (patch 1), and
    adds bridge Ageing Time support to DSA and the mv88e6xxx driver.

    Changes v2 -> v3:
    - rename mv88e6xxx_update_write to mv88e6xxx_update
    - set fastest ageing time in use in the chip for multiple bridges,
    tested with a few printk

    Changes v1 -> v2:
    - add a write helper for pointer-data Update registers
    - add ageing time support
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Implement the DSA driver function to configure the bridge ageing time.

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • All Marvell switch chips (from 88E6060 to 88E6390) have an ATU Control
    register containing bits 11:4 to configure an ATU Age Time quotient.

    However, the coefficient used to calculate the ATU Age Time varies with
    the model. E.g. 88E6060, 88E6352 and 88E6390 use 16, 15 and 3.75 seconds
    respectively.

    Add an age_time_coeff to the info structure to handle this and a Global 1
    helper to set the default age time of 5 minutes in the setup code.
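
    The arithmetic can be sketched as follows. This is a minimal
    illustration, not the driver's code: the function name is hypothetical,
    the coefficient is assumed to be expressed in milliseconds, and
    rounding to the nearest quotient is a choice made for this sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return the ATU Control value with bits 11:4 set to
 * the Age Time quotient for the requested age time.
 * coeff_ms is the per-model coefficient (e.g. 15000 for 15 seconds).
 */
static uint16_t atu_set_age_time(uint16_t atu_ctrl, uint32_t msecs,
				 uint32_t coeff_ms)
{
	/* Quotient = age time / coefficient, rounded to the nearest integer. */
	uint32_t quotient = (msecs + coeff_ms / 2) / coeff_ms;

	/* The field is 8 bits wide (bits 11:4), so clamp to 0xff. */
	if (quotient > 0xff)
		quotient = 0xff;

	atu_ctrl &= (uint16_t)~0x0ff0;         /* clear bits 11:4 */
	atu_ctrl |= (uint16_t)(quotient << 4); /* insert the quotient */
	return atu_ctrl;
}
```

    For the 5-minute default (300000 ms), a 15-second coefficient gives a
    quotient of 20, and a 3.75-second coefficient gives 80.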

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Add a new function for DSA drivers to handle the switchdev
    SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME attribute.

    The ageing time is passed as milliseconds.

    Also, because we can have multiple logical bridges on top of a physical
    switch and the ageing time is switch-wide, call the driver function with
    the fastest ageing time in use on the chip instead of the requested one.
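
    The selection of the fastest ageing time can be sketched like this. The
    function name and the array-of-bridges representation are hypothetical,
    chosen only to illustrate the idea; treating 0 as "slot unused" is an
    assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the ageing time to program into the chip.
 * bridge_ageing_ms[] holds the ageing time of each logical bridge already
 * using the switch; 0 marks an unused slot (an assumption of this sketch).
 */
static uint32_t fastest_ageing_time(const uint32_t *bridge_ageing_ms,
				    size_t n, uint32_t requested_ms)
{
	uint32_t fastest = requested_ms;
	size_t i;

	/* The smallest non-zero value in use wins over the requested one. */
	for (i = 0; i < n; i++)
		if (bridge_ageing_ms[i] && bridge_ageing_ms[i] < fastest)
			fastest = bridge_ageing_ms[i];

	return fastest;
}
```

    If another bridge on the same chip already uses 10 seconds, a request
    for 20 seconds still results in 10 seconds being passed to the driver.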

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Add capability flags to describe the presence of Ingress Rate Limit unit
    registers and a helper function to clear them.

    In the meantime, fix a few harmless issues:

    - 6185 and 6095 don't have such registers (reserved)
    - the previous code didn't wait for the IRL operation to complete

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Add flags and helpers to describe the presence of Priority Override
    Table (POT) related registers and simplify the setup of Global 2.

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Add flags to describe the presence of Cross-chip Port VLAN Table (PVT)
    related registers and simplify the setup of Global 2.

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Switches such as 88E6185 have 3 Switch MAC registers in Global 1. Newer
    chips such as 88E6352 have freed these registers in favor of an indirect
    access in a Switch MAC/WoL/WoF register in Global 2.

    Make this difference explicit with G1 and G2 helpers and flags.

    Also, note that this is a single-register indirect access (like Switch
    MAC, Trunk Mapping, etc.), which doesn't require waiting for the
    operation to complete, in contrast to multi-register indirect accesses
    with several operations and a busy bit.

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Some switches provide a Rsvd2CPU mechanism used to choose which of the
    16 reserved multicast destination addresses matching 01:80:c2:00:00:0x
    should be considered as MGMT and thus forwarded to the CPU port.

    Other switches extend this mechanism to also configure as MGMT the
    additional 16 reserved multicast addresses matching 01:80:c2:00:00:2x.

    This mechanism is exposed via two registers in Global 2, and an Rsvd2CPU
    enable bit in the management register.
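
    To make the address matching concrete, the sketch below classifies a
    destination MAC address into the 0x or 2x reserved range and returns its
    index within that range. The names are hypothetical, and the idea that
    each of the 16 addresses corresponds to one bit of a 16-bit register is
    an assumption of this sketch, not something stated above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum rsvd_range { RSVD_0X, RSVD_2X };

/*
 * Hypothetical helper: return true if mac matches 01:80:c2:00:00:0x or
 * 01:80:c2:00:00:2x, and report the range and the index x (0..15).
 */
static bool rsvd2cpu_lookup(const uint8_t mac[6], enum rsvd_range *range,
			    unsigned int *bit)
{
	static const uint8_t prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };

	if (memcmp(mac, prefix, sizeof(prefix)) != 0)
		return false;

	/* The high nibble of the last byte selects the range. */
	switch (mac[5] & 0xf0) {
	case 0x00:
		*range = RSVD_0X;
		break;
	case 0x20:
		*range = RSVD_2X;
		break;
	default:
		return false;
	}

	/* The low nibble is the address index within the range. */
	*bit = mac[5] & 0x0f;
	return true;
}
```

    For example, 01:80:c2:00:00:00 falls in the 0x range with index 0,
    while 01:80:c2:00:00:10 matches neither range.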

    Newer chips (such as 88E6390) have replaced these registers with a new
    indirect MGMT mechanism in Global 1.

    The patch adds two MV88E6XXX_FLAG_G2_MGMT_EN_{0,2}X flags to describe
    the presence of these Global 2 registers. If 88E6390 support is added, a
    MV88E6XXX_FLAG_G1_MGMT_CTRL flag will be needed to set up Rsvd2CPU.

    Note: all switches still support in parallel the ATU Load operation with
    an MGMT Entry State to forward such frames in a less convenient way.

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • The Trunk Mask and Trunk Mapping registers are two Global 2 indirect
    accesses to trunking configuration.

    Add helpers for these tables and simplify the Global 2 setup.

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • The Device Mapping register is an indirect table access.

    Provide helpers to access this table and make the check for the
    new DSA_RTABLE_NONE routing table value explicit.

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • Separate the setup of Global 1 and Global 2 internal SMI devices and add
    a flag to describe the presence of this second register set.

    Also rearrange the G1 setup to follow the register order.

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • All 88E6xxx Marvell switches (even the old, not yet supported 88E6060)
    have at least an ATU, per-port STP states and a VLAN map, to run basic
    switch functions such as Spanning Tree and port-based VLANs.

    Get rid of the related MV88E6XXX_FLAG_{ATU,PORTSTATE,VLANTABLE} flags,
    as these features are common to every chip.

    This enables STP on 6185 and removes many inconsistencies on others.

    Signed-off-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • kernel/trace/bpf_trace.c: In function 'bpf_event_output':
    kernel/trace/bpf_trace.c:312: error: unknown field 'next' specified in initializer
    kernel/trace/bpf_trace.c:312: warning: missing braces around initializer
    kernel/trace/bpf_trace.c:312: warning: (near initialization for 'raw.frag.')
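
    The commit body only shows the compiler output. Assuming the error comes
    from an older compiler that rejects a designated initializer naming a
    member of an anonymous union, one common way around it is to initialize
    the anonymous union through an extra pair of braces. The sketch below
    uses hypothetical struct definitions, not the kernel's, purely to show
    the shape of such an initializer.

```c
#include <stddef.h>

/* Hypothetical types; the layout is illustrative only. */
struct frag {
	union {                 /* anonymous union */
		struct frag *next;
		unsigned long pad;
	};
	void *data;
	unsigned int size;
};

struct record {
	struct frag frag;
};

static struct record make_record(struct frag *next, void *data,
				 unsigned int size)
{
	struct record raw = {
		.frag = {
			{
				/* braces enclose the anonymous union */
				.next = next,
			},
			.data = data,
			.size = size,
		},
	};

	return raw;
}
```

    The extra braces make the union the first positional member of the
    enclosing struct, so no field of the anonymous union has to be named at
    the outer level.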

    Fixes: 555c8a8623a3a87 ("bpf: avoid stack copy and use skb ctx for event output")
    Acked-by: Daniel Borkmann
    Cc: Alexei Starovoitov
    Cc: David S. Miller
    Signed-off-by: Andrew Morton
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Andrew Morton
     
  • VLAN and MQ control was doing DMA from the stack. Fix it.

    Cc: Michael S. Tsirkin
    Cc: "netdev@vger.kernel.org"
    Signed-off-by: Andy Lutomirski
    Acked-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Andy Lutomirski
     
  • txr->dev_state was not consistently manipulated with the per-queue lock
    held. After further inspection, the lock does not seem necessary: the
    value is read as either BNXT_DEV_STATE_CLOSING or 0.

    Reported-by: coverity (CID 1339583)
    Fixes: c0c050c58d840 ("bnxt_en: New Broadcom ethernet driver.")
    Signed-off-by: Florian Fainelli
    Acked-by: Michael Chan
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • Shmulik Ladkani says:

    ====================
    net: Consider fragmentation of udp tunneled skbs in 'ip_finish_output_gso'

    Currently IP fragmentation of GSO segments that exceed dst mtu is
    considered only in the ipv4 forwarding case.

    There are cases where GSO skbs that are bridged and then udp-tunneled
    may have gso_size exceeding the egress device mtu.
    It makes sense to fragment them, as in the non GSOed code path.

    The exact cases where this behavior is needed are described and addressed
    in the 2nd patch.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • …tion for local udp tunneled skbs

    Given:
    - tap0 and vxlan0 are bridged
    - vxlan0 stacked on eth0, eth0 having small mtu (e.g. 1400)

    Assume GSO skbs arriving from tap0 having a gso_size as determined by
    user-provided virtio_net_hdr (e.g. 1460 corresponding to VM mtu of 1500).

    After encapsulation, these skbs have an skb_gso_network_seglen that
    exceeds eth0's ip_skb_dst_mtu.

    These skbs are accidentally passed to ip_finish_output2 AS IS.
    Alas, each final segment (segmented either by validate_xmit_skb or by
    hardware UFO) would be larger than eth0 mtu.
    As a result, those above-mtu segments get dropped on certain networks.

    This behavior is not aligned with the NON-GSO case:
    Assume a non-gso 1500-sized IP packet arrives from tap0. After
    encapsulation, the vxlan datagram is fragmented normally at the
    ip_finish_output-->ip_fragment code path.

    The expected behavior for the GSO case would be segmenting the
    "gso-oversized" skb first, then fragmenting each segment according to
    dst mtu, and finally passing the resulting fragments to ip_finish_output2.

    'ip_finish_output_gso' already supports this "Slowpath" behavior,
    according to the IPSKB_FRAG_SEGS flag, which is only set during ipv4
    forwarding (not set in the bridged case).

    In order to support the bridged case, we'll mark skbs arriving from an
    ingress interface that get udp-encapsulated as "allowed to be fragmented",
    causing their network_seglen to be validated by 'ip_finish_output_gso'
    (and fragmented if needed).
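
    The decision described above can be sketched in plain C. This is an
    illustration under stated assumptions, not the kernel implementation:
    the function names are hypothetical, the flag is modelled as a plain
    int, and the fragment count assumes a 20-byte IPv4 header without
    options.

```c
enum gso_action {
	GSO_PASS_AS_IS,            /* hand the skb to the output path unchanged */
	GSO_SEGMENT_THEN_FRAGMENT, /* slowpath: segment, then fragment each one */
};

/*
 * Hypothetical helper: choose the path for a GSO skb given the network-level
 * length of each segment, the egress MTU, and whether fragmentation of
 * segments is allowed for this skb.
 */
static enum gso_action gso_output_action(unsigned int seglen,
					 unsigned int mtu,
					 int frag_segs_allowed)
{
	if (!frag_segs_allowed || seglen <= mtu)
		return GSO_PASS_AS_IS;

	return GSO_SEGMENT_THEN_FRAGMENT;
}

/*
 * Number of IPv4 fragments needed for one segment of seglen bytes
 * (IP header included), assuming a 20-byte header and mtu >= 28.
 */
static unsigned int ipv4_fragments_needed(unsigned int seglen,
					  unsigned int mtu)
{
	unsigned int hdr = 20;
	/* Fragment payloads must be a multiple of 8 bytes. */
	unsigned int per_frag = (mtu - hdr) & ~7u;
	unsigned int payload = seglen - hdr;

	return (payload + per_frag - 1) / per_frag;
}
```

    As an illustrative figure, a 1550-byte encapsulated segment leaving
    through a 1400-byte MTU takes the slowpath when fragmentation is allowed
    and is split into two fragments.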

    Note the TUNNEL_DONT_FRAGMENT tun_flag is still honoured (both in the
    gso and non-gso cases), which serves users wishing to forbid
    fragmentation at the udp tunnel endpoint.

    Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
    Cc: Florian Westphal <fw@strlen.de>
    Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
    Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

    Shmulik Ladkani
     
  • This flag indicates whether fragmentation of segments is allowed.

    Formerly this policy was hardcoded according to IPSKB_FORWARDED (set by
    either ip_forward or ipmr_forward).

    Cc: Hannes Frederic Sowa
    Cc: Florian Westphal
    Signed-off-by: Shmulik Ladkani
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Shmulik Ladkani
     
  • Michael Chan says:

    ====================
    bnxt_en: Add support for NS2 Nitro.

    This series adds support for the embedded version of the
    ethernet controller (Nitro) in the North Star 2 SoC. There are a number
    of features not supported and a software workaround for a hardware rx
    bug is required for Nitro A0. Please review.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • A bridge device in NS2 has the same device ID as the ethernet controller.
    Add check to avoid probing the bridge device.

    Signed-off-by: Prashant Sreedharan
    Signed-off-by: Vasundhara Volam
    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Prashant Sreedharan
     
  • Allocate special vnic for dropping packets not matching the RX filters.
    First vnic is for normal RX packets and the driver will drop all
    packets on the 2nd vnic.

    Signed-off-by: Prashant Sreedharan
    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Prashant Sreedharan
     
  • Allocate a napi for the special vnic. Packets arriving on this
    napi will simply be dropped and the buffers will be replenished back
    to the HW.

    Signed-off-by: Prashant Sreedharan
    Signed-off-by: Vasundhara Volam
    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Prashant Sreedharan
     
  • The hardware is unable to drop rx packets not matching the RX filters. To
    work around it, we create a special VNIC and configure the hardware to
    direct all packets not matching the filters to it. We then set up the
    driver to drop packets received on this VNIC.

    This patch creates the infrastructure for this VNIC, reserves a
    completion ring, and rx rings. Only shared completion ring mode is
    supported. The next 2 patches add a NAPI to handle packets from this
    VNIC and the setup of the VNIC.

    Signed-off-by: Prashant Sreedharan
    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Prashant Sreedharan
     
  • Nitro A0 has a hardware bug in the rx path. The workaround is to create
    a special COS context as a path for non-RSS (non-IP) packets. Without this
    workaround, the chip may stall when receiving RSS and non-RSS packets.

    Add infrastructure to allow 2 contexts (RSS and CoS) per VNIC. Allocate
    and configure the CoS context for Nitro A0.

    Signed-off-by: Prashant Sreedharan
    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Prashant Sreedharan
     
  • Nitro is the embedded version of the ethernet controller in the North
    Star 2 SoC. Add basic code to recognize the chip ID and disable
    the features (ntuple, TPA, ring and port statistics) not supported on
    Nitro A0.

    Signed-off-by: Prashant Sreedharan
    Signed-off-by: Vasundhara Volam
    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Prashant Sreedharan
     
  • Charles-Antoine Couret says:

    ====================
    Marvell phy: fiber interface configuration

    Another patchset to correctly manage the fiber link for the affected
    Marvell PHYs, such as the 88E1512.

    This version of the patchset fixes the commit log of the third and last
    commits and a comment in the first commit.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • These functions use standard registers located in a different page
    for each interface: copper and fiber.

    Reviewed-by: Andrew Lunn
    Signed-off-by: Charles-Antoine Couret
    Signed-off-by: David S. Miller

    Charles-Antoine Couret
     
  • To be correctly initialized, the fiber interface needs
    to be configured via autonegotiation registers which use
    some custom options or registers.

    Reviewed-by: Andrew Lunn
    Signed-off-by: Charles-Antoine Couret
    Signed-off-by: David S. Miller

    Charles-Antoine Couret
     
  • Add support for the fiber receiver error counter in the
    statistics. Rename the current counter which is for copper errors to
    phy_receive_errors_copper, so it is easy to distinguish copper from
    fiber.

    Reviewed-by: Andrew Lunn
    Signed-off-by: Charles-Antoine Couret
    Signed-off-by: David S. Miller

    Charles-Antoine Couret
     
  • For the affected PHYs, the fiber link is checked before the copper link.
    According to the datasheet, the link which is up is enabled.

    If both links are down, the copper link is used.
    To detect the fiber link status, we use the real-time status
    because of problems with the method used for copper.

    Tested with Marvell 88E1512.

    Reviewed-by: Andrew Lunn
    Signed-off-by: Charles-Antoine Couret
    Signed-off-by: David S. Miller

    Charles-Antoine Couret
     
  • Sergei Shtylyov says:

    ====================
    Fix DMA channel misreporting for the Renesas Ethernet drivers

    Here's a set of 2 patches against DaveM's 'net.git' repo fixing up the DMA
    channel reporting by 'ifconfig'...
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Currently 'ifconfig' for the Ethernet devices handled by this driver shows
    "DMA chan: ff" while the driver doesn't use any DMA channels. Not assigning
    a value to 'net_device::dma' causes 'ifconfig' to correctly not report a
    DMA channel.

    Signed-off-by: Sergei Shtylyov
    Signed-off-by: David S. Miller

    Sergei Shtylyov
     
  • Currently 'ifconfig' for the Ethernet devices handled by this driver shows
    "DMA chan: ff" while the driver doesn't use any DMA channels. Not assigning
    a value to 'net_device::dma' causes 'ifconfig' to correctly not report a
    DMA channel.

    Signed-off-by: Sergei Shtylyov
    Signed-off-by: David S. Miller

    Sergei Shtylyov
     
  • In 'get_scq', 'dma_alloc_coherent' has been used to allocate some
    resources, so we need to free them using 'dma_free_coherent' instead
    of 'kfree'.

    Signed-off-by: Christophe JAILLET
    Signed-off-by: David S. Miller

    Christophe Jaillet
     
  • In 'cpmac_open', 'dma_alloc_coherent' has been used to allocate some
    resources, so we need to free them using 'dma_free_coherent' instead
    of 'kfree'.

    Also, we don't need to free these resources if the allocation has failed,
    so I have slightly modified the goto label used in this case.
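
    The unwinding pattern can be sketched in user space. The real
    'dma_alloc_coherent'/'dma_free_coherent' calls only exist in the kernel,
    so this sketch substitutes hypothetical stand-ins built on malloc/free;
    the function names and the failure-injection parameter are assumptions
    made purely for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

static int live_coherent; /* number of outstanding "coherent" buffers */

/* Hypothetical stand-in for a coherent allocation; fail forces NULL. */
static void *fake_dma_alloc(size_t size, int fail)
{
	void *p = fail ? NULL : malloc(size);

	if (p)
		live_coherent++;
	return p;
}

/* Hypothetical stand-in for the matching coherent free. */
static void fake_dma_free(void *p)
{
	free(p);
	live_coherent--;
}

/*
 * fail_step: 0 = no failure, 1 = the allocation fails,
 * 2 = a later setup step fails. Returns 0 on success, -1 on error.
 */
static int fake_open(int fail_step, void **ring_out)
{
	void *ring = fake_dma_alloc(256, fail_step == 1);

	if (!ring)
		goto fail_alloc;  /* nothing was allocated, nothing to free */

	if (fail_step == 2)
		goto fail_later;  /* must release with the matching free */

	*ring_out = ring;
	return 0;

fail_later:
	fake_dma_free(ring);      /* pair the free with the allocator used */
fail_alloc:
	return -1;
}
```

    Jumping to the later label when the allocation itself fails would try to
    free a buffer that was never obtained, which is why the two failure
    cases use different labels.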

    Signed-off-by: Christophe JAILLET
    Signed-off-by: David S. Miller

    Christophe Jaillet