06 Jan, 2018

9 commits

  • Instead of adding padding to each of the master network device
    drivers potentially used with DSA/Broadcom tags, move the padding
    necessary for the switches to accept short packets to where it makes
    most sense: within tag_brcm.c. This avoids multiplying the number of
    similar commits to e.g. bgmac, bcmsysport, etc.

    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
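
The idea of padding in the tagging code rather than in each master driver can be sketched as a toy userspace model. This is not the tag_brcm.c code: the function name, the 4-byte tag length, the tag position after the MAC addresses and the chosen minimum length are all assumptions made for illustration.

```python
ETH_ZLEN = 60        # minimum Ethernet frame length without FCS
BRCM_TAG_LEN = 4     # assumed tag size for this sketch
MAC_ADDRS_LEN = 12   # destination MAC + source MAC

# Assumed minimum length of a tagged frame; the real driver may differ.
MIN_TAGGED_LEN = ETH_ZLEN + BRCM_TAG_LEN


def brcm_tag_xmit(frame: bytes, tag: bytes) -> bytes:
    """Insert the tag after the MAC addresses and zero-pad short frames.

    Hypothetical helper: padding happens once, here, so no master
    driver has to know about the switch's minimum frame size.
    """
    if len(tag) != BRCM_TAG_LEN:
        raise ValueError("unexpected tag length")
    tagged = frame[:MAC_ADDRS_LEN] + tag + frame[MAC_ADDRS_LEN:]
    if len(tagged) < MIN_TAGGED_LEN:
        tagged += bytes(MIN_TAGGED_LEN - len(tagged))
    return tagged
```

A frame that is already long enough only grows by the tag; a short frame is additionally zero-filled up to the assumed minimum.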
     
  • The definition of functions tcf_block_get() and tcf_block_get_ext()
    depends on CONFIG_NET_CLS being set. When those functions gained extack
    support, only one version of the declaration of those functions was
    updated. Function tcf_block_get() was later fixed with commit
    3c1490913f3b ("net: sch: api: fix tcf_block_get").

    Change arguments of tcf_block_get_ext() for the case when CONFIG_NET_CLS
    is not set.

    Fixes: 8d1a77f974ca ("net: sch: api: add extack support in tcf_block_get")
    Signed-off-by: Quentin Monnet
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    Quentin Monnet
     
  • On multi-threaded processes, one common architecture is to have
    one (or a small number of) threads polling sockets, and a
    considerably larger pool of threads reading from and writing to the
    sockets. When we set RPS core on tcp_poll() or udp_poll() we essentially
    steer all packets of all the polled FDs to one (or small number of)
    cores, creating a bottleneck and/or RPS misprediction.

    Another common architecture is to shard FDs among threads pinned
    to cores. In such a setting, setting RPS core in tcp_poll() and
    udp_poll() is redundant because the RFS core is correctly
    set in recvmsg and sendmsg.

    Thus, revert the following commit:
    c3f1dbaf6e28 ("net: Update RFS target at poll for tcp/udp").

    Signed-off-by: Soheil Hassas Yeganeh
    Signed-off-by: Willem de Bruijn
    Signed-off-by: Eric Dumazet
    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Soheil Hassas Yeganeh
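
The bottleneck described above can be illustrated with a toy model of a flow table. This is only a sketch: the class, the function and the event order (each worker read is followed by the poller polling the same socket again) are assumptions, and the real RFS table is hashed and lives in the kernel.

```python
class FlowTable:
    """Toy stand-in for the RFS table: flow id -> CPU that last recorded it."""

    def __init__(self):
        self.cpu_for_flow = {}

    def record(self, flow, cpu):
        self.cpu_for_flow[flow] = cpu


def simulate(record_on_poll, worker_cpu_for, poller_cpu):
    """Return the steering decision per flow after one read/poll round.

    worker_cpu_for maps each flow to the CPU of the thread that reads it;
    poller_cpu is the CPU of the single polling thread (both hypothetical).
    """
    table = FlowTable()
    for flow, worker_cpu in worker_cpu_for.items():
        table.record(flow, worker_cpu)      # recvmsg()/sendmsg() on the worker
        if record_on_poll:
            table.record(flow, poller_cpu)  # poll() overwrites the worker's CPU
    return table.cpu_for_flow
```

With recording at poll time every flow ends up steered to the poller's CPU; without it, each flow stays on the CPU of the thread that actually reads it.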
     
  • We should only record RPS on normal reads and writes.
    In single-threaded processes, all calls record the same state. In
    multi-threaded processes where a separate thread processes
    errors, the RFS table mispredicts.

    Note that, when CONFIG_RPS is disabled, sock_rps_record_flow
    is a noop and no branch is added as a result of this patch.

    Signed-off-by: Soheil Hassas Yeganeh
    Signed-off-by: Willem de Bruijn
    Signed-off-by: Eric Dumazet
    Signed-off-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Soheil Hassas Yeganeh
     
  • James Chapman says:

    ====================
    l2tp: remove configurable offset parameters

    This patch series removes all code to support a configurable offset in
    transmitted l2tp packets. Code to handle this is incomplete and buggy
    and has been this way for years. If anyone tried to configure an
    offset, it would be ignored for L2TPv2 tunnels, or for L2TPv3 tunnels,
    could result in L2TPv3 packets being transmitted which are not
    compliant with L2TPv3 RFC3931. This patch series removes the support
    for configurable offsets.

    No known userspace l2tp daemon configures an offset. However,
    iproute2's "ip l2tp" command has an offset parameter and if set, the
    value is passed to the kernel. This is the most likely use case where
    offsets might be configured, e.g.

    ip l2tp add tunnel local 1.1.1.1 remote 1.1.1.2 tunnel_id 1 \
    peer_tunnel_id 2 encap ip
    ip l2tp add session name l2tp0 tunnel_id 1 session_id 1 \
    peer_session_id 2 offset 8

    The above would result in packets being transmitted to 1.1.1.2 with 8
    bytes padding between the L2TPv3 header and the payload. The peer
    would need to be configured with the same offset value. However, the
    packets are not compliant with the L2TPv3 RFC, hence I think it's
    unlikely that offset is being used. With this patch series applied,
    the offset would not be configured. The peer would need to be modified to
    remove its offset setting too.

    iproute2 should be modified to remove or ignore the ip l2tp offset
    parameter.

    This issue was discovered when reviewing a patch series from
    lorenzo.bianconi@redhat.com which adds another netlink attribute to
    configure the expected offset in received L2TPv3 packets. This change
    is reverted by this series because offsets do not exist in L2TPv3
    packets. These commits are:

    commit f15bc54eeecd ("l2tp: add peer_offset parameter")
    commit 820da5357572 ("l2tp: fix missing print session offset info")

    In more detail:

    The L2TPv2 protocol supports a variable offset from the L2TPv2 header
    to the payload to give the sender implementation some flexibility for
    data alignment when adding L2TP headers on to payloads. The offset
    value is indicated by an optional field in the L2TP header. Our L2TP
    implementation already detects the presence of the optional offset in
    received packets and skips those bytes when parsing packets. All
    transmitted L2TPv2 packets are always transmitted with no offset.

    L2TPv3 has no optional offset field in the L2TPv3 packet
    header. Instead, L2TPv3 defines optional fields in a "Layer-2 Specific
    Sublayer". At the time when the original L2TP code was written, there
    was talk at IETF of offset being implemented in a new Layer-2 Specific
    Sublayer. A L2TP_ATTR_OFFSET netlink attribute was added so that this
    offset could be configured and the intention was to allow it to be
    also used to set the tx offset for L2TPv2. However, no L2TPv3 offset
    was ever specified and the L2TP_ATTR_OFFSET parameter was forgotten
    about.

    Setting L2TP_ATTR_OFFSET results in L2TPv3 packets being transmitted
    with the specified number of bytes padding between L2TPv3 header and
    payload. This is not compliant with L2TPv3 RFC3931. So this change
    removes the configurable offset altogether while retaining
    L2TP_ATTR_OFFSET in the API for backwards compatibility. If
    L2TP_ATTR_OFFSET is given, its value is now silently ignored.
    ====================

    Reviewed-by: Guillaume Nault
    Tested-by: Guillaume Nault
    Signed-off-by: David S. Miller

    David S. Miller
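
The receive-side handling of the optional L2TPv2 offset described above can be sketched as follows. This is a simplified userspace parser, not the kernel code: the function name is made up, the flag masks follow the header layout in RFC 2661, and length checks are omitted.

```python
import struct

# Flag bits in the first 16-bit word of an L2TPv2 header (RFC 2661).
FLAG_L = 0x4000    # Length field present
FLAG_S = 0x0800    # Ns and Nr fields present
FLAG_O = 0x0200    # Offset Size field present
VER_MASK = 0x000F


def l2tpv2_payload(packet: bytes) -> bytes:
    """Return the payload of an L2TPv2 data packet, skipping any offset pad."""
    (flags,) = struct.unpack_from("!H", packet, 0)
    if flags & VER_MASK != 2:
        raise ValueError("not an L2TPv2 packet")
    pos = 2
    if flags & FLAG_L:
        pos += 2                 # Length
    pos += 4                     # Tunnel ID + Session ID
    if flags & FLAG_S:
        pos += 4                 # Ns + Nr
    if flags & FLAG_O:
        (offset,) = struct.unpack_from("!H", packet, pos)
        pos += 2 + offset        # Offset Size field plus the pad bytes
    return packet[pos:]
```

A sender that never sets the O bit, as described for the transmit path, produces packets for which the last branch is simply not taken.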
     
  • Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    James Chapman
     
  • If L2TP_ATTR_OFFSET is set to a non-zero value in L2TPv3 tunnels, it
    results in L2TPv3 packets being transmitted which might not be
    compliant with the L2TPv3 RFC. This patch has l2tp ignore the offset
    setting and send all packets with no offset.

    In more detail:

    L2TPv2 supports a variable offset from the L2TPv2 header to the
    payload. The offset value is indicated by an optional field in the
    L2TP header. Our L2TP implementation already detects the presence of
    the optional offset and skips that many bytes when handling received
    data packets. All transmitted packets are always transmitted with
    no offset.

    L2TPv3 has no optional offset field in the L2TPv3 packet
    header. Instead, L2TPv3 defines optional fields in a "Layer-2 Specific
    Sublayer". At the time when the original L2TP code was written, there
    was talk at IETF of offset being implemented in a new Layer-2 Specific
    Sublayer. A L2TP_ATTR_OFFSET netlink attribute was added so that this
    offset could be configured and the intention was to allow it to be
    also used to set the tx offset for L2TPv2. However, no L2TPv3 offset
    was ever specified and the L2TP_ATTR_OFFSET parameter was forgotten
    about.

    Setting L2TP_ATTR_OFFSET results in L2TPv3 packets being transmitted
    with the specified number of bytes padding between L2TPv3 header and
    payload. This is not compliant with L2TPv3 RFC3931. This change
    removes the configurable offset altogether while retaining
    L2TP_ATTR_OFFSET for backwards compatibility. Any L2TP_ATTR_OFFSET
    value is ignored.

    Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    James Chapman
     
  • Revert commit 820da5357572 ("l2tp: fix missing print session offset
    info"). The peer_offset parameter is removed.

    Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    James Chapman
     
  • Revert commit f15bc54eeecd ("l2tp: add peer_offset parameter"). This
    is removed because it is adding another configurable offset and
    configurable offsets are being removed.

    Signed-off-by: James Chapman
    Signed-off-by: David S. Miller

    James Chapman
     

05 Jan, 2018

6 commits


04 Jan, 2018

14 commits

  • Instead of calling ieee80211_recalc_txpower on monitor interfaces
    directly, call it using the virtual monitor interface, if one exists.

    In case of a single monitor interface given, reject setting TX power,
    if no virtual monitor interface exists.

    That being checked, don't warn in ieee80211_bss_info_change_notify,
    after setting TX power on a monitor interface.

    Fixes warning:
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 2193 at net/mac80211/driver-ops.h:167
    ieee80211_bss_info_change_notify+0x111/0x190
    Modules linked in: uvcvideo videobuf2_vmalloc videobuf2_memops
    videobuf2_v4l2 videobuf2_core rndis_host cdc_ether usbnet mii
    tp_smapi(O) thinkpad_ec(O) ohci_hcd vboxpci(O) vboxnetadp(O)
    vboxnetflt(O) vboxdrv(O) x86_pkg_temp_thermal kvm_intel kvm irqbypass
    iwldvm iwlwifi ehci_pci ehci_hcd tpm_tis tpm_tis_core tpm
    CPU: 0 PID: 2193 Comm: iw Tainted: G O 4.12.12-gentoo #2
    task: ffff880186fd5cc0 task.stack: ffffc90001b54000
    RIP: 0010:ieee80211_bss_info_change_notify+0x111/0x190
    RSP: 0018:ffffc90001b57a10 EFLAGS: 00010246
    RAX: 0000000000000006 RBX: ffff8801052ce840 RCX: 0000000000000064
    RDX: 00000000fffffffc RSI: 0000000000040000 RDI: ffff8801052ce840
    RBP: ffffc90001b57a38 R08: 0000000000000062 R09: 0000000000000000
    R10: ffff8802144b5000 R11: ffff880049dc4614 R12: 0000000000040000
    R13: 0000000000000064 R14: ffff8802105f0760 R15: ffffc90001b57b48
    FS: 00007f92644b4580(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f9263c109f0 CR3: 00000001df850000 CR4: 00000000000406f0
    Call Trace:
    ieee80211_recalc_txpower+0x33/0x40
    ieee80211_set_tx_power+0x40/0x180
    nl80211_set_wiphy+0x32e/0x950

    Reported-by: Peter Große
    Signed-off-by: Peter Große

    Signed-off-by: Johannes Berg

    Peter Große
     
  • Jakub Kicinski says:

    ====================
    nfp: flower: repr link state

    Dirk says:

    This series provides two updates towards the link state of reprs in
    the flower nfp app.

    Patch #1 improves the way link state is reported for reprs. Instead of
    starting with an assumed 'UP' state, always assume the link state is
    'DOWN' and then modify this only on events received from firmware.

    Patch #2 adds a new nfp_app hook, repr_preclean. This callback is
    executed before reprs are removed from the app context and is executed
    per repr.

    Patch #3 implements the new REIFY control message, used to indicate
    when reprs are created and destroyed. Firmware uses these messages
    to prevent communication about any particular port when the driver
    doesn't know about the repr yet or anymore.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The PORT_REIFY message indicates whether reprs have been created or
    when they are about to be destroyed. This is necessary so firmware
    can know which state the driver is in, e.g. the firmware will not send
    any control messages related to ports when the reprs are destroyed.

    This prevents nuisance warning messages printed whenever the firmware
    sends updates for non-existent reprs.

    Signed-off-by: Dirk van der Merwe
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Dirk van der Merwe
     
  • Just before a repr is cleaned up, we give the app a chance to perform
    some preclean configuration while the reprs pointer is still configured
    for the app.

    Signed-off-by: Dirk van der Merwe
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Dirk van der Merwe
     
  • Instead of starting up reprs assuming that there is link, only respond
    to the link state reported by firmware.

    Furthermore, ensure link is down after repr netdevs are created.

    Signed-off-by: Dirk van der Merwe
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Dirk van der Merwe
     
  • Ethernet switches on the MDIO bus have historically performed their own
    handling of the GPIO reset line. The recent patch to have the MDIO
    core handle the reset has broken the switch drivers, in that they
    cannot claim the GPIO. Some switch drivers need more control over the
    GPIO line than what the MDIO core provides. So restore the historical
    behaviour by only performing a reset of PHYs, not switches.

    Fixes: bafbdd527d56 ("phylib: Add device reset GPIO support")
    Reported-by: Sean Wang
    Signed-off-by: Andrew Lunn
    Reviewed-by: Geert Uytterhoeven
    Tested-by: Geert Uytterhoeven
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • Russell King says:

    ====================
    Resolve races in phy accessors

    This series resolves races with various accesses to PHY registers.
    The first five patches are necessary before we add phylink support
    to mvneta, the remaining three are merely cleanups for unobserved
    races, and hence are less critical.

    There are two possible classes of races that can occur: where we
    write to a page register that changes the meaning of a group of
    other registers, and where we read-modify-write a register.

    Resolve these races by performing the accesses under the mdio bus
    lock, ensuring that no other user can access the bus while the
    series of atomic operations are being performed.

    These patches have been posted before, and have been modified
    along the lines of previous feedback:

    - The third patch was originally reviewed by Florian, but as I've
    added __phy_modify() to it, I've removed that attribution.
    - Included generic page-based accessors as suggested last time
    around.
    - Since we have the unlocked __phy_modify() in this patch series,
    it is sensible to include the changes for this to marvell.c -
    these accessors have to change anyway to avoid deadlocks on the
    mdio bus lock.

    I haven't been able to test the at803x.c changes yet beyond compile
    testing - although I do have systems with an ar8035 PHY. However,
    they should be straightforward to review.

    This is targeted for net-next because the races have not been
    found in existing drivers, but have been observed with phylink
    integrated into mvneta - that's not to say that the races do not
    exist today, they are just unobserved (probably through lack of
    rigorous enough testing.) The race provoking condition is detailed
    in patch 5.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Convert read-modify-write sequences in at803x, Marvell and core phylib
    to use phy_modify() to ensure safety.

    Signed-off-by: Russell King
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Russell King
     
  • Add phy_modify() convenience accessor to complement the mdiobus
    counterpart.

    Signed-off-by: Russell King
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Russell King
     
  • For paged accesses to be truly safe, we need to hold the bus lock to
    prevent anyone else gaining access to the registers while we modify
    them.

    The phydev->lock mutex does not do this: userspace via the MII ioctl
    can still sneak in and read or write any register while we are on a
    different page, and the suspend/resume methods can be called by a
    thread different to the thread polling the phy status.

    Races have been observed with mvneta on SolidRun Clearfog with phylink,
    particularly between the phylib worker reading the PHY's status, and
    the thread resuming mvneta, calling phy_start() which then calls
    through to m88e1121_config_aneg_rgmii_delays(), which tries to
    read-modify-write the MSCR register:

    CPU0                                    CPU1
    marvell_read_status_page()
    marvell_set_page(phydev, MII_MARVELL_FIBER_PAGE)
    ...
                                            m88e1121_config_aneg_rgmii_delays()
                                            set_page(MII_MARVELL_MSCR_PAGE)
                                            phy_read(phydev, MII_88E1121_PHY_MSCR_REG)
    marvell_set_page(phydev, MII_MARVELL_COPPER_PAGE);
    ...
                                            phy_write(phydev, MII_88E1121_PHY_MSCR_REG)

    The result of this is we end up writing the copper page register 21,
    which causes the copper PHY to be disabled, and the link partner sees
    the link immediately go down.

    Solve this by taking the bus lock instead of the PHY lock, thereby
    preventing other accesses to the PHY while we are accessing other PHY
    pages.

    Signed-off-by: Russell King
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Russell King
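
The interleaving above, and the fix of holding the bus lock across the whole paged access, can be replayed in a toy userspace model. Everything here is an assumption made for illustration: the class and function names, the page numbers, the page-select register number and the bits being set. Only register 21 for MSCR comes from the description above.

```python
import threading

PAGE_REG = 22        # assumed page-select register for this toy model
COPPER_PAGE = 0      # assumed page numbers
FIBER_PAGE = 1
MSCR_PAGE = 2
MSCR_REG = 21


class ToyPagedPhy:
    def __init__(self):
        self.bus_lock = threading.Lock()
        self.page = COPPER_PAGE
        self.regs = {}               # (page, reg) -> value

    def _read(self, reg):
        if reg == PAGE_REG:
            return self.page
        return self.regs.get((self.page, reg), 0)

    def _write(self, reg, val):
        if reg == PAGE_REG:
            self.page = val
        else:
            self.regs[(self.page, reg)] = val

    def modify_paged(self, page, reg, mask, set_bits):
        """Select page, read-modify-write, restore page, all under the bus lock."""
        with self.bus_lock:
            old_page = self._read(PAGE_REG)
            self._write(PAGE_REG, page)
            try:
                val = self._read(reg)
                self._write(reg, (val & ~mask) | set_bits)
            finally:
                self._write(PAGE_REG, old_page)


def replay_race():
    """Replay the unlocked CPU0/CPU1 interleaving step by step."""
    phy = ToyPagedPhy()
    phy._write(PAGE_REG, FIBER_PAGE)      # CPU0: read status on fiber page
    phy._write(PAGE_REG, MSCR_PAGE)       # CPU1: select MSCR page
    val = phy._read(MSCR_REG)             # CPU1: read MSCR
    phy._write(PAGE_REG, COPPER_PAGE)     # CPU0: restore copper page
    phy._write(MSCR_REG, val | 0x0030)    # CPU1: write lands on copper page
    return phy.regs
```

In the replay the final write ends up at register 21 of the copper page, while `modify_paged()` leaves no window in which another bus user could change the page between the select and the write.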
     
  • Add a set of paged phy register accessors which are inherently safe in
    their design against other accesses interfering with the paged access.

    Signed-off-by: Russell King
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Russell King
     
  • Add unlocked versions of the bus accessors, which allows access to the
    bus with all the tracing. These accessors validate that the bus mutex
    is held, which is a basic requirement for all mii bus accesses.

    Also added is a read-modify-write unlocked accessor with the same
    locking requirements.

    Signed-off-by: Russell King
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Russell King
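
The split between locked and unlocked accessors can be sketched in a toy userspace model. The class and method names are hypothetical; the check only verifies that the lock is held by someone, not by the calling thread.

```python
import threading


class ToyMdioBus:
    def __init__(self):
        self.lock = threading.Lock()
        self.regs = {}                   # (addr, reg) -> 16-bit value

    def _check_locked(self):
        if not self.lock.locked():
            raise RuntimeError("bus lock not held")

    # Unlocked accessors: the caller must already hold self.lock.
    def _read(self, addr, reg):
        self._check_locked()
        return self.regs.get((addr, reg), 0)

    def _write(self, addr, reg, val):
        self._check_locked()
        self.regs[(addr, reg)] = val & 0xFFFF

    def _modify(self, addr, reg, mask, set_bits):
        old = self._read(addr, reg)
        self._write(addr, reg, (old & ~mask) | set_bits)

    # Locked accessors: take the lock, then reuse the unlocked versions.
    def read(self, addr, reg):
        with self.lock:
            return self._read(addr, reg)

    def write(self, addr, reg, val):
        with self.lock:
            self._write(addr, reg, val)

    def modify(self, addr, reg, mask, set_bits):
        with self.lock:
            self._modify(addr, reg, mask, set_bits)
```

Because `modify()` performs its read and write inside one locked section, no other bus user can slip in between the two halves of the read-modify-write.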
     
  • Use unlocked accessors for indirect MMD accesses to clause 22 PHYs.
    This permits tracing of these accesses.

    Reviewed-by: Florian Fainelli
    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Add unlocked versions of the bus accessors, which allows access to the
    bus with all the tracing. These accessors validate that the bus mutex
    is held, which is a basic requirement for all mii bus accesses.

    Reviewed-by: Florian Fainelli
    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     

03 Jan, 2018

11 commits

  • Collect TX rate limiting related information in UP CIM logs.

    Signed-off-by: Rahul Lakkireddy
    Signed-off-by: Ganesh Goudar
    Signed-off-by: David S. Miller

    Rahul Lakkireddy
     
  • Russell King says:

    ====================
    Convert mvneta to phylink

    This series converts mvneta to use phylink, which is necessary to
    support the SFP cages on SolidRun's Clearfog platform. This series just
    converts mvneta without adding the DT parts - having discussed with
    Andrew, we believe we're too close to the merge window to submit that
    patch.

    I've split the "net: mvneta: convert to phylink" patch up to make it
    easier to review, and in doing so, spotted some minor corner cases that
    needed to be fixed along the way.

    This series depends on the previously merged phylink patches in netdev,
    along with the recently reviewed 7 patch series "Resolve races in phy
    accessors" without which, the race described in patch 5 of that series
    is very evident when triggering a dummy hibernate cycle.

    This series also illustrates how to convert mvpp2 to phylink.

    mvneta is the only user of the fixed_phy_update_state() API, and this
    becomes redundant with the conversion.

    It would be good to get this series not only reviewed, but also
    independently tested to ensure that I haven't missed anything - I only
    have the Clearfog platform to test on, and that doesn't support all the
    different interface modes that mvneta supports.

    A particularly interesting side effect of this series is that DSA
    switches no longer need the "CPU" port and DSA facing MAC ethernet
    instance to be marked as a fixed link anymore with mvneta - we can use
    1000BaseX mode, and the DSA to CPU link will use the 802.3z negotiation
    to determine the link properties without needing the link parameters to
    be explicitly stated in DT - that is a subject of a future patch.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • mvneta is the only user of fixed_phy_update_state(), which has been
    converted to use phylink instead. Remove fixed_phy_update_state().

    Reviewed-by: Florian Fainelli
    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Add support for reading the SFF module's EEPROM via the ethtool API.

    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • The PSC sync change interrupt can fire multiple times while the link is
    down, which is caused by noise on the serdes lines. As this isn't
    information we make use of, it's pointless having the interrupt enabled.

    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Add support for EEE to mvneta.

    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Add support for flow control to mvneta.

    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Add support for 1000BaseX link modes.

    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Move the port configuration and release of reset to mvneta_mac_config()
    alongside the rest of the port mode configuration.

    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Convert mvneta to use phylink, which models the MAC to PHY link in
    a generic, reusable form.

    Signed-off-by: Russell King

    - remove unused sync status

    Signed-off-by: David S. Miller

    Russell King
     
  • Prepare to convert mvneta to phylink by splitting the adjust_link
    function into its constituent parts.

    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King