25 Nov, 2016

14 commits

  • Some HWs need the VF driver to put part of the packet headers on the
    TX descriptor so the e-switch can do proper matching and steering.

    The supported modes: none, link, network, transport.

    Signed-off-by: Roi Dayan
    Reviewed-by: Or Gerlitz
    Signed-off-by: Saeed Mahameed
    Signed-off-by: David S. Miller

    Roi Dayan
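
The four modes can be read as "how deep into the packet headers the driver must copy". Below is a minimal sketch of that idea, assuming that link, network and transport mean headers up to L2, L3 and L4 respectively. The enum, struct and function names are hypothetical and are not taken from the mlx5 driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; the commit only lists the four modes. */
enum inline_mode {
    INLINE_MODE_NONE,      /* nothing copied to the TX descriptor */
    INLINE_MODE_LINK,      /* assumed: L2 header only */
    INLINE_MODE_NETWORK,   /* assumed: L2 + L3 headers */
    INLINE_MODE_TRANSPORT, /* assumed: L2 + L3 + L4 headers */
};

/* Header lengths of one packet, in bytes. */
struct hdr_lens {
    size_t l2;
    size_t l3;
    size_t l4;
};

/* Return how many header bytes would be inlined for the given mode. */
size_t inline_hdr_bytes(enum inline_mode mode, const struct hdr_lens *h)
{
    assert(h != NULL);

    switch (mode) {
    case INLINE_MODE_TRANSPORT:
        return h->l2 + h->l3 + h->l4;
    case INLINE_MODE_NETWORK:
        return h->l2 + h->l3;
    case INLINE_MODE_LINK:
        return h->l2;
    case INLINE_MODE_NONE:
    default:
        return 0;
    }
}
```

For an Ethernet/IPv4/TCP frame without options (14, 20 and 20 bytes), this sketch yields 0, 14, 34 and 54 bytes for the four modes.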
     
  • Reflect the administrative link changes done on the VF representor to the
    VF e-switch vport. This means that doing ip link set down/up commands on
    the VF rep will modify the e-switch vport state, which in turn will make
    proper VF drivers set their carrier accordingly.

    Signed-off-by: Or Gerlitz
    Signed-off-by: Saeed Mahameed
    Signed-off-by: David S. Miller

    Or Gerlitz
     
  • Switchdev driver net-device port statistics should follow the model introduced
    in commit a5ea31f57309 'Merge branch net-offloaded-stats'.

    For VF reps we return the SRIOV eswitch vport stats as the usual ones and SW stats
    if asked. For the PF, if we're in the switchdev mode, we return the uplink stats
    and SW stats if asked, otherwise as before. The uplink stats are implemented using
    the PPCNT 802_3 counters which are already being read/cached by the driver.

    Signed-off-by: Or Gerlitz
    Signed-off-by: Saeed Mahameed
    Signed-off-by: David S. Miller

    Or Gerlitz
     
  • Some drivers would need to check a few internal matters for
    that. To be used in a downstream mlx5 commit.

    Signed-off-by: Or Gerlitz
    Signed-off-by: Saeed Mahameed
    Signed-off-by: David S. Miller

    Or Gerlitz
     
  • Florian Fainelli says:

    ====================
    net: phy: broadcom: Wirespeed/downshift support

    This patch series adds support for the Broadcom Wirespeed, aka
    downshift feature utilizing the recently added ethtool PHY tunables.

    Tested with two Gigabit link partners with a 4-wire cable having only
    2 pairs connected.

    The last patch in the series is a fix that was required for testing and
    should make it to -stable. I can submit it separately against net if
    you prefer, David.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • In case the link changes and EEE is enabled or disabled, always try to
    re-negotiate this with the link partner.

    Fixes: 450b05c15f9c ("net: dsa: bcm_sf2: add support for controlling EEE")
    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • Add support for configuring the downshift/Wirespeed enable/disable
    toggles and specify a link retry value ranging from 1 to 9. Since the
    integrated BCM7xxx have issues when wirespeed is enabled and EEE is also
    enabled, we do disable EEE if wirespeed is enabled.

    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • In preparation for adding support for Wirespeed/downshift, we need to
    change bcm_phy_eee_enable() to allow enabling or disabling EEE, so make
    the function take an extra enable/disable boolean parameter and rename
    it to illustrate it sets EEE, not necessarily just enables it.

    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • Broadcom's Wirespeed feature allows us to configure how auto-negotiation
    should behave with fewer working pairs of wires on a cable. Add support
    code for retrieving and setting such downshift counters using the
    recently added ethtool downshift tunables.

    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • We are going to need these functions to implement support for Broadcom
    Wirespeed, aka downshift.

    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • In commit 2331ccc5b323 ("tcp: enhance tcp collapsing"),
    we made a first step allowing copying right skb to left skb head.

    Since all skbs in socket write queue are headless (but possibly the very
    first one), this strategy often does not work.

    This patch extends tcp_collapse_retrans() to perform frag shifting,
    thanks to skb_shift() helper.

    This helper needs to not BUG on non headless skbs, as callers are ok
    with that.

    Tested:

    The following packetdrill test now passes:

    0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
    +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
    +0 bind(3, ..., ...) = 0
    +0 listen(3, 1) = 0

    +0 < S 0:0(0) win 32792
    +0 > S. 0:0(0) ack 1
    +.100 < . 1:1(0) ack 1 win 257
    +0 accept(3, ..., ...) = 4

    +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
    +0 write(4, ..., 200) = 200
    +0 > P. 1:201(200) ack 1
    +.001 write(4, ..., 200) = 200
    +0 > P. 201:401(200) ack 1
    +.001 write(4, ..., 200) = 200
    +0 > P. 401:601(200) ack 1
    +.001 write(4, ..., 200) = 200
    +0 > P. 601:801(200) ack 1
    +.001 write(4, ..., 200) = 200
    +0 > P. 801:1001(200) ack 1
    +.001 write(4, ..., 100) = 100
    +0 > P. 1001:1101(100) ack 1
    +.001 write(4, ..., 100) = 100
    +0 > P. 1101:1201(100) ack 1
    +.001 write(4, ..., 100) = 100
    +0 > P. 1201:1301(100) ack 1
    +.001 write(4, ..., 100) = 100
    +0 > P. 1301:1401(100) ack 1

    +.099 < . 1:1(0) ack 201 win 257
    +.001 < . 1:1(0) ack 201 win 257
    +0 > P. 201:1001(800) ack 1

    Signed-off-by: Eric Dumazet
    Cc: Neal Cardwell
    Cc: Yuchung Cheng
    Acked-by: Yuchung Cheng
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Add support for the MV88E6097 switch. The change was tested on an Armada
    based platform with a MV88E6097 switch.

    Signed-off-by: Stefan Eichenberger
    Signed-off-by: David S. Miller

    Stefan Eichenberger
     
  • Make it possible to generate trace events for mdio read and write accesses.

    Signed-off-by: Uwe Kleine-König
    Acked-by: Steven Rostedt
    Signed-off-by: David S. Miller

    Uwe Kleine-König
     
  • The VMware VMCI transport supports loopback inside virtual machines.
    This patch implements loopback for virtio-vsock.

    Flow control is handled by the virtio-vsock protocol as usual. The
    sending process stops transmitting on a connection when the peer's
    receive buffer space is exhausted.

    Cathy Avery noticed this difference between VMCI and
    virtio-vsock when a test case using loopback failed. Although loopback
    isn't the main point of AF_VSOCK, it is useful for testing and
    virtio-vsock must match VMCI semantics so that userspace programs run
    regardless of the underlying transport.

    My understanding is that loopback is not supported on the host side with
    VMCI. Follow that by implementing it only in the guest driver, not the
    vhost host driver.

    Cc: Jorgen Hansen
    Reported-by: Cathy Avery
    Signed-off-by: Stefan Hajnoczi
    Signed-off-by: David S. Miller

    Stefan Hajnoczi
     

23 Nov, 2016

1 commit

  • All conflicts were simple overlapping changes except perhaps
    for the Thunder driver.

    That driver has a change_mtu method explicitly for sending
    a message to the hardware. If that fails it returns an
    error.

    Normally a driver doesn't need an ndo_change_mtu method because those
    are usually just range changes, which are now handled generically.
    But since this extra operation is needed in the Thunder driver, it has
    to stay.

    However, if the message send fails we have to restore the original
    MTU before the change because the entire call chain expects that if
    an error is thrown by ndo_change_mtu then the MTU did not change.
    Therefore code is added to nicvf_change_mtu to remember the original
    MTU, and to restore it upon nicvf_update_hw_max_frs() failure.

    Signed-off-by: David S. Miller

    David S. Miller
     

22 Nov, 2016

25 commits

  • Both of these drivers won't work on 64-bit architectures unless they
    are redesigned, since they store a virtual address pointer in a 32-bit
    field of the descriptors:

    drivers/net/ethernet/marvell/mvneta_bm.c: In function 'mvneta_bm_construct':
    drivers/net/ethernet/marvell/mvneta_bm.c:103:16: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
    drivers/net/ethernet/marvell/mvpp2.c: In function 'mvpp2_prs_vlan_init':
    drivers/net/ethernet/marvell/mvpp2.c:2563:32: error: large integer implicitly truncated to unsigned type [-Werror=overflow]

    This limits the COMPILE_TEST option for the two drivers again to
    only build them on 32-bit. This seems nicer than shutting up the
    warnings, in case we ever actually want to use them on 64-bit,
    as the warnings indicate which parts of the driver are currently
    broken there.

    Fixes: a0627f776a45 ("net: marvell: Allow drivers to be built with COMPILE_TEST")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
     
  • Jiri Pirko says:

    ====================
    mlxsw: core: Implement thermal zone

    Implement thermal zone for mlxsw based HW.
    The first patch is just a register dependency for the second patch.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Implement thermal zone for mlxsw based HW. It uses temperature sensor
    provided by ASIC (the same as mlxsw hwmon interface) to report current
    temp to thermal core. The ASIC's PWM is then used to control speed
    of system fans registered as cooling devices.

    Signed-off-by: Ivan Vecera
    Reviewed-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ivan Vecera
     
  • The MFSL register is used to configure the fan speed event / interrupt
    notification mechanism. Fan speed thresholds are defined for both
    under-speed and over-speed.

    Signed-off-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • Andrew Lunn says:

    ====================
    Start adding support for mv88e6390

    This is the first patchset implementing support for the mv88e6390
    family. This is a new generation of switch devices and has numerous
    incompatible changes to the registers. These patches allow the switch
    to be detected during probe, and make the statistics unit work.

    These patches are insufficient to make the mv88e6390 functional. More
    patches will follow.

    v2:
    Move stats code into global1
    Change DT compatible string to mv88e6190
    Fixed mv88e6351 stats which v1 had broken
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Move the stats functions which access global 1 registers into
    global1.c.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • The mv88e6390 uses a different bit to select between bank0 and bank1
    of the statistics. So implement an ops function for this, and pass the
    selector bit to the generic stats read function. Also, the histogram
    selection has moved for the mv88e6390, so abstract its selection as
    well.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • Different families have different sets of statistics. Abstract this
    using a stats_get_stats op. The mv88e6390 needs a different
    implementation, which will be added later.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • Different families have different sets of statistics. Abstract this
    using a stats_get_sset_count and stats_get_strings op. Each stat has a
    bitmap, and the ops implementer uses a bit map mask to count the
    statistics which apply for the family, or return the list of strings.

    Signed-off-by: Andrew Lunn
    v2:
    Rename functions to avoid _ prefix.
    Signed-off-by: David S. Miller

    Andrew Lunn
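
The bitmap approach described above can be sketched as a table in which each statistic carries a type bit, and a family-specific mask selects which entries are counted or listed. The table contents and type bits here are illustrative assumptions; only the op names stats_get_sset_count and stats_get_strings come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative type bits; a family passes the OR of the types it supports. */
#define STATS_TYPE_PORT  (1u << 0)
#define STATS_TYPE_BANK0 (1u << 1)
#define STATS_TYPE_BANK1 (1u << 2)

struct hw_stat {
    const char *name;
    unsigned int type;
};

/* Illustrative table; the real driver has many more entries. */
static const struct hw_stat hw_stats[] = {
    { "in_good_octets", STATS_TYPE_BANK0 },
    { "in_bad_octets",  STATS_TYPE_BANK0 },
    { "in_discards",    STATS_TYPE_BANK1 },
    { "sw_in_discards", STATS_TYPE_PORT  },
};

#define HW_STATS_COUNT (sizeof(hw_stats) / sizeof(hw_stats[0]))

/* Count the statistics whose type is selected by the mask. */
size_t stats_get_sset_count(unsigned int types)
{
    size_t i, n = 0;

    for (i = 0; i < HW_STATS_COUNT; i++)
        if (hw_stats[i].type & types)
            n++;
    return n;
}

/* Copy up to cap matching names into out; returns how many were written. */
size_t stats_get_strings(unsigned int types, const char **out, size_t cap)
{
    size_t i, n = 0;

    assert(out != NULL);

    for (i = 0; i < HW_STATS_COUNT && n < cap; i++)
        if (hw_stats[i].type & types)
            out[n++] = hw_stats[i].name;
    return n;
}
```

Using the same mask for both functions keeps the count and the string list in step, which is what ethtool-style consumers rely on.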
     
  • The statistics unit on the mv88e6390 needs the histogram mode to be
    configured in a different register compared to other devices. Add an
    ops to do this.

    Signed-off-by: Andrew Lunn
    v2:
    Rename to mv88e6390_g1_stats_set_histogram
    Move into global1.c
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • The MV88E6390 has a control register for what the histogram statistics
    actually contain. This means the stat_snapshot method should not set
    this information. So implement the 6390 stats_snapshot function without
    these bits.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • Knowing the family a device belongs to helps with picking the ops
    implementation which is appropriate to the device. So add a comment to
    each structure of ops.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • Taking a stats snapshot differs between some families. Abstract this
    into an ops member. At the same time, move the code into global1.[ch],
    since the registers are in the global1 range.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • With the devices added to the tables, the probe will recognize the
    switch. This however is not sufficient to make it work properly, other
    changes are needed because of incompatibilities.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • _mv88e6xxx_stats_wait() did not check the return value from
    mv88e6xxx_g1_read(), so the compiler complained about set but unused
    err.

    Signed-off-by: Andrew Lunn
    Reviewed-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • The switch needs to be taken out of reset before we can read its ID
    register on the MDIO bus.

    Signed-off-by: Andrew Lunn
    Reviewed-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • Pull apparmor bugfix from James Morris:
    "This has a fix for a policy replacement bug that is fairly serious for
    apache mod_apparmor users, as it results in the wrong policy being
    applied on a network-facing service"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    apparmor: fix change_hat not finding hat after policy replacement

    Linus Torvalds
     
  • Pull sparc fixes from David Miller:

    1) With modern networking cards we can run out of 32-bit DMA space, so
    support 64-bit DMA addressing when possible on sparc64. From Dave
    Tushar.

    2) Some signal frame validation checks are inverted on sparc32, fix
    from Andreas Larsson.

    3) Lockdep tables can get too large in some circumstances on sparc64,
    add a way to adjust the size a bit. From Babu Moger.

    4) Fix NUMA node probing on some sun4v systems, from Thomas Tai.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
    sparc: drop duplicate header scatterlist.h
    lockdep: Limit static allocations if PROVE_LOCKING_SMALL is defined
    config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc
    sunbmac: Fix compiler warning
    sunqe: Fix compiler warnings
    sparc64: Enable 64-bit DMA
    sparc64: Enable sun4v dma ops to use IOMMU v2 APIs
    sparc64: Bind PCIe devices to use IOMMU v2 service
    sparc64: Initialize iommu_map_table and iommu_pool
    sparc64: Add ATU (new IOMMU) support
    sparc64: Add FORCE_MAX_ZONEORDER and default to 13
    sparc64: fix compile warning section mismatch in find_node()
    sparc32: Fix inverted invalid_frame_pointer checks on sigreturns
    sparc64: Fix find_node warning if numa node cannot be found

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Clear congestion control state when changing algorithms on an
    existing socket, from Florian Westphal.

    2) Fix register bit values in altr_tse_pcs portion of stmmac driver,
    from Jia Jie Ho.

    3) Fix PTP handling in stmmac driver for GMAC4, from Giuseppe
    CAVALLARO.

    4) Fix udplite multicast delivery handling, it ignores the udp_table
    parameter passed into the lookups, from Pablo Neira Ayuso.

    5) Synchronize the space estimated by rtnl_vfinfo_size and the space
    actually used by rtnl_fill_vfinfo. From Sabrina Dubroca.

    6) Fix memory leak in fib_info when splitting nodes, from Alexander
    Duyck.

    7) If a driver does a napi_hash_del() explicitly and not via
    netif_napi_del(), it must perform RCU synchronization as needed. Fix
    this in virtio-net and bnxt drivers, from Eric Dumazet.

    8) Likewise, it is not necessary to invoke napi_hash_del() if we are
    also doing netif_napi_del() in the same code path. Remove such calls
    from be2net and cxgb4 drivers, also from Eric Dumazet.

    9) Don't allocate an ID in peernet2id_alloc() if the netns is dead,
    from WANG Cong.

    10) Fix OF node and device struct leaks in of_mdio, from Johan Hovold.

    11) We cannot cache routes in ip6_tunnel when using inherited traffic
    classes, from Paolo Abeni.

    12) Fix several crashes and leaks in cpsw driver, from Johan Hovold.

    13) Splice operations cannot use freezable blocking calls in AF_UNIX,
    from WANG Cong.

    14) Link dump filtering by master device and kind support added an error
    in loop index updates during the dump if we actually do filter, fix
    from Zhang Shengju.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits)
    tcp: zero ca_priv area when switching cc algorithms
    net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit
    ethernet: stmmac: make DWMAC_STM32 depend on it's associated SoC
    tipc: eliminate obsolete socket locking policy description
    rtnl: fix the loop index update error in rtnl_dump_ifinfo()
    l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
    net: macb: add check for dma mapping error in start_xmit()
    rtnetlink: fix FDB size computation
    netns: fix get_net_ns_by_fd(int pid) typo
    af_unix: conditionally use freezable blocking calls in read
    net: ethernet: ti: cpsw: fix fixed-link phy probe deferral
    net: ethernet: ti: cpsw: add missing sanity check
    net: ethernet: ti: cpsw: fix secondary-emac probe error path
    net: ethernet: ti: cpsw: fix of_node and phydev leaks
    net: ethernet: ti: cpsw: fix deferred probe
    net: ethernet: ti: cpsw: fix mdio device reference leak
    net: ethernet: ti: cpsw: fix bad register access in probe error path
    net: sky2: Fix shutdown crash
    cfg80211: limit scan results cache size
    net sched filters: pass netlink message flags in event notification
    ...

    Linus Torvalds
     
  • Declare the structure ieee802154_ops as const as it is only passed as an
    argument to the function ieee802154_alloc_hw. This argument is of type
    const struct ieee802154_ops *, so ieee802154_ops structures having this
    property can be declared as const.
    Done using Coccinelle:

    @r1 disable optional_qualifier @
    identifier i;
    position p;
    @@
    static struct ieee802154_ops i@p = {...};

    @ok1@
    identifier r1.i;
    position p;
    expression e1;
    @@
    ieee802154_alloc_hw(e1,&i@p)

    @bad@
    position p!={r1.p,ok1.p};
    identifier r1.i;
    @@
    i@p

    @depends on !bad disable optional_qualifier@
    identifier r1.i;
    @@
    static
    +const
    struct ieee802154_ops i={...};

    @depends on !bad disable optional_qualifier@
    identifier r1.i;
    @@
    +const
    struct ieee802154_ops i;

    The before and after size details of the affected files are:

       text    data     bss     dec     hex filename
       8669    1176      16    9861    2685 drivers/net/ieee802154/adf7242.o
       8805    1048      16    9869    268d drivers/net/ieee802154/adf7242.o

       text    data     bss     dec     hex filename
       7211    2296      32    9539    2543 drivers/net/ieee802154/atusb.o
       7339    2160      32    9531    253b drivers/net/ieee802154/atusb.o

    Signed-off-by: Bhumika Goyal
    Acked-by: Stefan Schmidt
    Signed-off-by: David S. Miller

    Bhumika Goyal
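
The constification pattern can be shown outside the kernel with a hypothetical ops table. Because the allocation function only takes a pointer to const, the table itself can be declared const, and the toolchain can then place it in read-only data. The radio_ops and radio_hw types below are invented for this sketch and are not the ieee802154 structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table, standing in for a driver callback structure. */
struct radio_ops {
    int (*start)(void);
    void (*stop)(void);
};

/* Hypothetical hardware handle that keeps a read-only view of the ops. */
struct radio_hw {
    const struct radio_ops *ops;
};

static int demo_start(void)
{
    return 0;
}

static void demo_stop(void)
{
}

/* const is safe here: the table is never written after initialisation. */
static const struct radio_ops demo_ops = {
    .start = demo_start,
    .stop = demo_stop,
};

/* The parameter is a pointer to const, so const tables can be passed in. */
struct radio_hw radio_alloc_hw(const struct radio_ops *ops)
{
    struct radio_hw hw;

    assert(ops != NULL);
    hw.ops = ops;
    return hw;
}
```

This is also consistent with the size figures quoted in the commit: the data column shrinks and the text column grows, since the size tool counts read-only data under text.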
     
  • Pravin B Shelar says:

    ====================
    geneve: Use LWT more effectively.

    The following patch series makes use of the geneve LWT code path for
    the geneve netdev type of device.
    This allows us to simplify the geneve module without changing any
    functionality.

    v2-v3:
    Rebase against latest net-next.

    v1-v2:
    Fix warning reported by kbuild test robot.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Rather than comparing the 64-bit tunnel-id, compare the tunnel vni,
    which is a 24-bit id. This also saves the conversion from vni
    to tunnel id on each tunnel packet receive.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    pravin shelar
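
A rough sketch of the difference: matching on the raw 3-byte VNI needs only a byte comparison, whereas matching on a 64-bit tunnel id requires widening the VNI first on every received packet. The helper names below are chosen for the example, and the widening is done in host byte order for simplicity, which is an assumption and not how the kernel represents tunnel ids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VNI_LEN 3 /* a Geneve VNI is 24 bits wide */

/* Direct comparison of two VNIs as they appear in the header. */
bool vni_equal(const uint8_t a[VNI_LEN], const uint8_t b[VNI_LEN])
{
    return memcmp(a, b, VNI_LEN) == 0;
}

/*
 * The per-packet conversion that the direct comparison avoids.
 * Simplified: host byte order, unlike the kernel's big-endian tunnel id.
 */
uint64_t vni_to_tunnel_id(const uint8_t vni[VNI_LEN])
{
    return ((uint64_t)vni[0] << 16) | ((uint64_t)vni[1] << 8) | vni[2];
}
```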
     
  • Geneve already has a check for the device socket in the route
    lookup function, so there is no need to check it in the xmit
    function.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    pravin shelar
     
  • There is minimal difference in building the Geneve header
    between ipv4 and ipv6 geneve tunnels. The following patch
    refactors the code to unify it.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    pravin shelar
     
  • Current geneve implementation has two separate cases to handle.
    1. netdev xmit
    2. LWT xmit.

    In the netdev case, geneve configuration is stored in various
    struct geneve_dev members, for example geneve_addr, ttl, tos,
    label, flags, dst_cache, etc. For LWT, the configuration is passed
    to the device in ip_tunnel_info.

    The following patch uses the ip_tunnel_info struct to store almost all
    of the configuration of a geneve netdevice. This allows us to unify
    most of the geneve driver code around the ip_tunnel_info struct.
    This dramatically simplifies the geneve code, since it does not
    need to handle two different configuration cases. It removes
    duplicate code, and a single code path can handle either type
    of geneve device.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    pravin shelar