13 Nov, 2017

14 commits

  • Add a new type, DSA_TAG_PROTO_PREPEND, which allows us to support the
    4-byte Broadcom tag that we already support, but in a format where it
    is prepended to the packet instead of located between the MAC SA and
    the EtherType (DSA_TAG_PROTO_BRCM).

    Signed-off-by: Florian Fainelli
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • In preparation for supporting the same Broadcom tag format, but
    prepended to the Ethernet frame instead of inserted between the MAC SA
    and EtherType, restructure the code a little bit to make that possible
    and take an offset parameter.

    Signed-off-by: Florian Fainelli
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Florian Fainelli
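
    A minimal user-space sketch of what an offset parameter buys here: the
    same 4-byte tag can be placed after the two MAC addresses or at the very
    start of the frame. The function name, buffer handling and tag bytes are
    assumptions for illustration and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6
#define TAG_LEN  4   /* 4-byte tag, as described in the commit message */

/*
 * Insert a TAG_LEN-byte tag into a frame at byte `offset`.
 * offset == 2 * ETH_ALEN places the tag between the MAC SA and the
 * EtherType; offset == 0 prepends it to the whole frame.
 * `out` must have room for len + TAG_LEN bytes. Returns the new length,
 * or 0 if the offset lies outside the frame.
 */
size_t insert_tag(const uint8_t *frame, size_t len, size_t offset,
                  const uint8_t tag[TAG_LEN], uint8_t *out)
{
    if (offset > len)
        return 0;
    memcpy(out, frame, offset);                  /* bytes before the tag */
    memcpy(out + offset, tag, TAG_LEN);          /* the tag itself */
    memcpy(out + offset + TAG_LEN, frame + offset, len - offset);
    return len + TAG_LEN;
}
```

    Calling insert_tag() with offset 12 models the existing placement, and
    offset 0 models the prepended one.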
     
  • A number of drivers want to check whether the configured CPU port is a
    possible configuration for enabling tagging. Pass down the CPU port
    number so they can verify that.

    Signed-off-by: Florian Fainelli
    Reviewed-by: Vivien Didelot
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • gcc-4.4.4 (at least) has issues with initializers and anonymous unions:

    net/sched/sch_red.c: In function 'red_dump_offload':
    net/sched/sch_red.c:282: error: unknown field 'stats' specified in initializer
    net/sched/sch_red.c:282: warning: initialization makes integer from pointer without a cast
    net/sched/sch_red.c:283: error: unknown field 'stats' specified in initializer
    net/sched/sch_red.c:283: warning: initialization makes integer from pointer without a cast
    net/sched/sch_red.c: In function 'red_dump_stats':
    net/sched/sch_red.c:352: error: unknown field 'xstats' specified in initializer
    net/sched/sch_red.c:352: warning: initialization makes integer from pointer without a cast

    Work around this.

    Fixes: 602f3baf2218 ("net_sch: red: Add offload ability to RED qdisc")
    Cc: Nogah Frankel
    Cc: Jiri Pirko
    Cc: Simon Horman
    Cc: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Andrew Morton
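
    For readers unfamiliar with this class of compiler problem, here is a
    hedged sketch of one common workaround: initialize only the named
    members, then assign the anonymous-union members afterwards. The
    structure and function below are hypothetical, and this is not
    necessarily the form the patch itself uses.

```c
#include <stdint.h>

/* Hypothetical structure with an anonymous union, for illustration only. */
struct offload_req {
    int command;
    union {
        uint32_t limit;
        struct {
            uint64_t *bytes;
            uint64_t *packets;
        } stats;
    };
};

/*
 * A designated initializer that names a member of the anonymous union,
 * e.g.
 *     struct offload_req req = { .command = 1, .stats.bytes = b };
 * is the form an old compiler may reject with "unknown field".
 * Initializing the ordinary members first and assigning the union
 * members afterwards avoids that construct entirely.
 */
struct offload_req make_stats_req(uint64_t *bytes, uint64_t *packets)
{
    struct offload_req req = { .command = 1 };

    req.stats.bytes = bytes;
    req.stats.packets = packets;
    return req;
}
```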
     
  • Since Mellanox's focus is on newer adapters, we would like to have the
    ability to disable the support for old gen2 adapters.

    This can be done by turning off the MLX4_CORE_GEN2 Kconfig flag.
    We keep it turned on by default.

    Signed-off-by: Slava Shwartsman
    Signed-off-by: Tariq Toukan
    Signed-off-by: David S. Miller

    Slava Shwartsman
     
  • The way people generally use netlink_dump is that they fill in the skb
    as much as possible, breaking when nla_put returns an error. Then, they
    get called again and start filling out the next skb, and again, and so
    forth. The mechanism at work here is the ability for the iterative
    dumping function to detect when the skb is filled up and not fill it
    past the brim, waiting for a fresh skb for the rest of the data.

    However, if the attributes are small and nicely packed, it is possible
    that a dump callback function successfully fills in attributes until the
    skb is of size 4080 (libmnl's default page-sized receive buffer size).
    The dump function completes, satisfied, and then, if it happens to be
    that this is actually the last skb, and no further ones are to be sent,
    then netlink_dump will add on the NLMSG_DONE part:

    nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);

    It is very important that netlink_dump does this, of course. However, in
    this example, that call to nlmsg_put_answer will fail, because the
    previous filling by the dump function did not leave it enough room. And
    how could it possibly have done so? All of the nla_put variety of
    functions simply check to see if the skb has enough tailroom,
    independent of the context it is in.

    In order to keep the important assumptions of all netlink dump users, it
    is therefore important to give them an skb that has this end part of the
    tail already reserved, so that the call to nlmsg_put_answer does not
    fail. Otherwise, library authors are forced to find some bizarre sized
    receive buffer that has a large modulo relative to the common sizes of
    messages received, which is ugly and buggy.

    This patch thus saves the NLMSG_DONE for an additional message, for the
    case that things are dangerously close to the brim. This requires
    keeping track of the errno from ->dump() across calls.

    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: David S. Miller

    Jason A. Donenfeld
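
    The problem can be illustrated with a toy user-space model (not the
    kernel code): a dump fills fixed-size buffers greedily, and the
    terminating marker only fits in the last buffer if the records happened
    to leave room for it. The function name and the sizes used in the note
    below are assumptions chosen for illustration.

```c
#include <stddef.h>

/*
 * Toy model of a multi-part dump. Each buffer holds `cap` bytes, each
 * record takes `rec` bytes and the terminating "done" marker takes `done`
 * bytes. Returns the number of buffers needed to deliver `n` records
 * followed by the marker, or 0 if the sizes make no sense.
 */
size_t buffers_needed(size_t n, size_t rec, size_t cap, size_t done)
{
    size_t buffers = 0;
    size_t used = 0;
    size_t sent = 0;

    if (rec == 0 || rec > cap || done > cap)
        return 0;

    do {
        buffers++;
        used = 0;
        /* Fill greedily until the next record no longer fits. */
        while (sent < n && used + rec <= cap) {
            used += rec;
            sent++;
        }
    } while (sent < n);

    /* If the last buffer has no room left for the marker, the marker
     * has to travel in one additional buffer. */
    if (cap - used < done)
        buffers++;
    return buffers;
}
```

    With 16-byte records, a 4080-byte buffer and a 20-byte marker, 253
    records fit together with the marker in one buffer, while 254 or 255
    records leave too little room and need a second buffer just for the
    marker.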
     
  • Dave Taht says:

    ====================
    netem: add nsec scheduling and slot feature

    This patch series converts netem away from the old "ticks" interface and
    userspace API, and adds support for a new "slot" feature intended to
    emulate bursty MACs such as WiFi and LTE better.

    Changes since v2:
    Use u64 for packet_len_sched_time()
    Use simpler max(time_to_send,q->slot.slot_next)

    Changes since v1:
    Always pass new nanosecond APIs to userspace
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Slotting is a crude approximation of the behaviors of shared media such
    as cable, wifi, and LTE, which gather up a bunch of packets within a
    varying delay window and deliver them, relative to that, nearly all at
    once.

    It works within the existing loss, duplication, jitter and delay
    parameters of netem. Some amount of inherent latency must be specified,
    regardless.

    The new "slot" parameter specifies a minimum and maximum delay between
    transmission attempts.

    The "bytes" and "packets" parameters can be used to limit the amount of
    information transferred per slot.

    Examples of use:

    tc qdisc add dev eth0 root netem delay 200us \
    slot 800us 10ms bytes 64k packets 42

    A more correct example, using stacked netem instances and a packet limit
    to emulate a tail drop wifi queue with slots and variable packet
    delivery, with a 200Mbit isochronous underlying rate, and 20ms path
    delay:

    tc qdisc add dev eth0 root handle 1: netem delay 20ms rate 200mbit \
    limit 10000
    tc qdisc add dev eth0 parent 1:1 handle 10:1 netem delay 200us \
    slot 800us 10ms bytes 64k packets 42 limit 512

    Signed-off-by: Dave Taht
    Signed-off-by: David S. Miller

    Dave Taht
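
    A simplified user-space model of slot gating, as a sketch only: a packet
    is released at the later of its own delay expiry and the start of the
    current slot, and a new slot is scheduled once the packet or byte budget
    is used up. The structure, field names and the fixed gap between slots
    are assumptions; the feature described above draws the gap from a
    minimum/maximum range.

```c
#include <stdint.h>

/* Hypothetical slot state; names are illustrative, not the kernel's. */
struct slot {
    uint64_t next_ns;       /* earliest time the current slot may deliver */
    int64_t  packets_left;  /* remaining packet budget in this slot */
    int64_t  bytes_left;    /* remaining byte budget in this slot */
    uint64_t gap_ns;        /* delay until the next slot (fixed here) */
    int64_t  max_packets;   /* packet budget granted to each new slot */
    int64_t  max_bytes;     /* byte budget granted to each new slot */
};

/* Open a new slot starting gap_ns after `now` with fresh budgets. */
static void slot_advance(struct slot *s, uint64_t now)
{
    s->next_ns = now + s->gap_ns;
    s->packets_left = s->max_packets;
    s->bytes_left = s->max_bytes;
}

/*
 * Return the release time of a packet of `len` bytes whose delay expires
 * at `time_to_send`: the later of that time and the slot start. The
 * packet is charged to the slot; when either budget is exhausted, the
 * next slot is scheduled.
 */
uint64_t slot_release(struct slot *s, uint64_t time_to_send, uint32_t len)
{
    uint64_t when = time_to_send > s->next_ns ? time_to_send : s->next_ns;

    s->packets_left--;
    s->bytes_left -= len;
    if (s->packets_left <= 0 || s->bytes_left <= 0)
        slot_advance(s, when);
    return when;
}
```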
     
  • netem userspace has long relied on a horrible /proc/net/psched hack
    to translate the current notion of "ticks" to nanoseconds.

    Expressing latency and jitter instead, in well defined nanoseconds,
    increases the dynamic range of emulated delays and jitter in netem.

    It will also ease a transition where reducing a tick to nsec
    equivalence would constrain the max delay in prior versions of
    netem to only 4.3 seconds.

    Signed-off-by: Dave Taht
    Suggested-by: Eric Dumazet
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Dave Taht
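
    The 4.3 second figure is what a 32-bit count of nanoseconds can hold.
    The sketch below works through that arithmetic, assuming the older
    interface carries the delay in an unsigned 32-bit field, and compares it
    with a signed 64-bit count. The function names are hypothetical.

```c
#include <stdint.h>

/* Largest delay, in seconds, that fits in an unsigned 32-bit count of
 * nanoseconds: 4294967295 ns, roughly 4.29 s. */
double max_delay_seconds_u32_ns(void)
{
    return (double)UINT32_MAX / 1e9;
}

/* Largest delay, in seconds, that fits in a signed 64-bit count of
 * nanoseconds: on the order of 9.2e9 s. */
double max_delay_seconds_s64_ns(void)
{
    return (double)INT64_MAX / 1e9;
}
```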
     
  • Upgrade the internal netem scheduler to use nanoseconds rather than
    ticks throughout.

    Convert to and from the std "ticks" userspace api automatically,
    while allowing for finer grained scheduling to take place.

    Signed-off-by: Dave Taht
    Signed-off-by: David S. Miller

    Dave Taht
     
  • Avoid traversing the list of mr6_tables (which requires the
    rtnl_lock) in ip6mr_sk_done(), when we know in advance that
    a match will not be found.
    This can happen when rawv6_close()/ip6mr_sk_done() is invoked
    on non-mroute6 sockets.
    This patch helps reduce rtnl_lock contention when destroying
    a large number of net namespaces, each having a non-mroute6
    raw socket.

    v2: same patch, only fixed subject line and expanded comment.

    Signed-off-by: Francesco Ruggeri
    Signed-off-by: David S. Miller

    Francesco Ruggeri
     
  • The variable giga_ctrl is assigned zero, but this value is never
    read, so the assignment is redundant. Remove it.
    Cleans up clang warning:

    drivers/net/ethernet/realtek/r8169.c:1978:3: warning: Value stored
    to 'giga_ctrl' is never read

    Signed-off-by: Colin Ian King
    Signed-off-by: David S. Miller

    Colin Ian King
     
  • Signed-off-by: Egil Hjelmeland
    Signed-off-by: David S. Miller

    Egil Hjelmeland
     
  • Fix embarrassing bug in lan9303_alr_del_port(): Instead of zeroing
    entr->mac_addr, I destroyed the next cache entry. Affected .port_fdb_del and
    .port_mdb_del.

    Fixes: 0620427ea0d6 ("net: dsa: lan9303: Add fdb/mdb manipulation")
    Signed-off-by: Egil Hjelmeland
    Reviewed-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Egil Hjelmeland
     

12 Nov, 2017

2 commits

  • David S. Miller
     
  • Pull networking fixes from David Miller:

    1) Use after free in vlan, from Cong Wang.

    2) Handle NAPI poll with a zero budget properly in mlx5 driver, from
    Saeed Mahameed.

    3) If DMA mapping fails in mlx5 driver, NULL out page, from Inbar
    Karmy.

    4) Handle overrun in RX FIFO of sun4i CAN driver, from Gerhard
    Bertelsmann.

    5) Missing return in mdb and vlan prepare phase of DSA layer, from
    Vivien Didelot.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    vlan: fix a use-after-free in vlan_device_event()
    net: dsa: return after vlan prepare phase
    net: dsa: return after mdb prepare phase
    can: ifi: Fix transmitter delay calculation
    tcp: fix tcp_fastretrans_alert warning
    tcp: gso: avoid refcount_t warning from tcp_gso_segment()
    can: peak: Add support for new PCIe/M2 CAN FD interfaces
    can: sun4i: handle overrun in RX FIFO
    can: c_can: don't indicate triple sampling support for D_CAN
    net/mlx5e: Increase Striding RQ minimum size limit to 4 multi-packet WQEs
    net/mlx5e: Set page to null in case dma mapping fails
    net/mlx5e: Fix napi poll with zero budget
    net/mlx5: Cancel health poll before sending panic teardown command
    net/mlx5: Loop over temp list to release delay events
    rds: ib: Fix NULL pointer dereference in debug code

    Linus Torvalds
     

11 Nov, 2017

24 commits

  • …ub/scm/linux/kernel/git/kvalo/wireless-drivers-next

    Kalle Valo says:

    ====================
    wireless-drivers-next patches for 4.15

    Last minute patches before the merge window. Not really anything
    special standing out, mostly fixes or cleanup and some minor new
    features.

    Major changes:

    iwlwifi

    * some new PCI IDs
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • ip6_frag_id was only used by UFO, which has been removed.
    ipv6_proxy_select_ident() only existed to set ip6_frag_id and has no
    in-tree callers.

    Signed-off-by: Mat Martineau
    Signed-off-by: David S. Miller

    Mat Martineau
     
  • Guillaume Nault says:

    ====================
    l2tp: avoid aliasing tunnels socket pointer

    We don't need to copy the tunnel's socket pointer in the pseudo-wire
    specific session structures. This uselessly complicates the code
    and hampers evolution.

    This series was part of an effort to protect tunnels socket pointer
    with RCU. But since it provides nice cleanup, I submit it separately.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The last user of .tunnel_sock is pppol2tp_connect() which defensively
    uses it to verify internal data consistency.

    This check isn't necessary: l2tp_session_get() guarantees that the
    returned session belongs to the tunnel passed as parameter. And
    .tunnel_sock is never updated, so checking that it still points to
    the parent tunnel socket is useless; that test can never fail.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • Sessions don't need to use l2tp_sock_to_tunnel(xxx->tunnel_sock) for
    accessing their parent tunnel. They have the .tunnel field in the
    l2tp_session structure for that. Furthermore, in all these cases, the
    session is registered, so we're guaranteed that .tunnel isn't NULL and
    that the session properly holds a reference on the tunnel.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • This field has never been used.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
     
  • Florian Fainelli says:

    ====================
    net: dsa: b53: Turn on Broadcom tags

    This was long overdue, with this patch series, the b53 driver now
    turns on Broadcom tags except for 5325 and 5365 which use an older
    format that we do not support yet (TBD).

    First patch is necessary in order for bgmac, used on BCM5301X and Northstar
    Plus, to work correctly and successfully send ARP packets back to the requester.

    Second patch is actually a bug fix, but because net/master and net-next/master
    diverge in that area, I am targeting net-next/master here.

    Finally, the last patch enables Broadcom tags after checking that the CPU port
    selected is either 5, 7 or 8, since those are the only valid combinations
    given currently supported HW.

    Changes in v3:

    - guarded padding with netdev_uses_dsa() to let the non-DSA use cases
    not have a performance hit for smaller packets

    - added missing select NET_DSA_TAG_BRCM to drivers/net/dsa/b53/Kconfig

    Changes in v2:

    - moved a hunk between patch 2 and patch 3 to avoid a bisectability issue
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Enable Broadcom tags for b53 devices, except 5325 and 5365 which use a
    different Broadcom tag format not yet supported by net/dsa/tag_brcm.c.

    We also make sure that we can turn on Broadcom tags on a CPU port number
    that is capable of that: 5, 7 or 8.

    Signed-off-by: Florian Fainelli
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Florian Fainelli
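
    The selection rule described above can be sketched as follows. This is
    an illustrative stand-alone version, not the driver's code: the enum,
    the function names and the use of plain integers for chip models are
    assumptions.

```c
#include <stdbool.h>

/* Hypothetical tag protocol choices for this sketch. */
enum tag_proto { TAG_NONE, TAG_BRCM };

/* CPU ports on which tagging can be enabled, per the commit message. */
bool cpu_port_supports_tags(int port)
{
    switch (port) {
    case 5:
    case 7:
    case 8:
        return true;
    default:
        return false;
    }
}

/*
 * Pick a tag protocol: chips 5325 and 5365 use an older tag format that
 * is not supported, and any other chip gets tags only on a capable port.
 */
enum tag_proto choose_tag_proto(int chip_model, int cpu_port)
{
    if (chip_model == 5325 || chip_model == 5365)
        return TAG_NONE;
    return cpu_port_supports_tags(cpu_port) ? TAG_BRCM : TAG_NONE;
}
```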
     
  • dev->cpu_port is the driver local information that should only be used
    to look up register offsets for a particular port, when they differ
    (e.g: IMP port override), but it should certainly not be used in place
    of the DSA configured CPU port.

    Since the DSA switch layer calls port_vlan_{add,del}() on the CPU port
    as well, we can remove the specific setting of the CPU port within
    port_vlan_{add,del}.

    Fixes: ff39c2d68679 ("net: dsa: b53: Add bridge support")
    Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch")
    Signed-off-by: Florian Fainelli
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • In preparation for enabling Broadcom tags with b53, pad packets to a
    minimum size of 64 bytes (sans FCS) in order for the Broadcom switch to
    accept ingressing frames. Without this, we would typically be able to
    DHCP, but not resolve with ARP because packets are too small and get
    rejected by the switch.

    Signed-off-by: Florian Fainelli
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Florian Fainelli
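
    A minimal user-space sketch of the padding step, assuming a plain byte
    buffer rather than the kernel's packet structures. The function name and
    return convention are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MIN_FRAME_LEN 64  /* minimum size without FCS, per the commit text */

/*
 * Zero-pad the frame in `buf` (current length `len`, capacity `cap`) up
 * to MIN_FRAME_LEN bytes. Returns the resulting length, or 0 if the
 * buffer is too small to hold the padded frame.
 */
size_t pad_frame(uint8_t *buf, size_t len, size_t cap)
{
    if (len >= MIN_FRAME_LEN)
        return len;            /* already long enough, leave untouched */
    if (cap < MIN_FRAME_LEN)
        return 0;              /* no room to pad */
    memset(buf + len, 0, MIN_FRAME_LEN - len);
    return MIN_FRAME_LEN;
}
```

    An ARP frame over Ethernet is 42 bytes (14-byte header plus 28-byte
    payload), so it is the kind of packet that falls below the minimum and
    needs the padding, whereas DHCP packets are already larger.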
     
  • …nux/kernel/git/mkl/linux-can

    Marc Kleine-Budde says:

    ====================
    pull-request: can 2017-11-10

    this is a pull request for net/master.

    The first patch by Richard Schütz for the c_can driver removes the false
    indication to support triple sampling for d_can. Gerhard Bertelsmann's
    patch for the sun4i driver improves the RX overrun handling. The patch
    by Stephane Grosjean for the peak_canfd driver adds the PCI ids for
    various new PCIe/M2 interfaces. Marek Vasut's patch for the ifi driver
    fixes the transmitter delay calculation.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • Egil Hjelmeland says:

    ====================
    net: dsa: lan9303: IGMP handling

    Set up the HW switch to trap IGMP packets to CPU port.
    And make sure skb->offload_fwd_mark is cleared for incoming IGMP packets.

    skb->offload_fwd_mark calculation is a candidate for consolidation into the
    DSA core. The calculation can probably be more polished when done at a point
    where DSA has updated skb.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Now that IGMP packets are no longer flooded in HW, we want the SW bridge to
    forward packets based on bridge configuration. To make that happen,
    IGMP packets must have skb->offload_fwd_mark = 0.

    Signed-off-by: Egil Hjelmeland
    Signed-off-by: David S. Miller

    Egil Hjelmeland
     
  • IGMP packets should be trapped to the CPU port. The SW bridge knows
    whether to forward to other ports.

    With "IGMP snooping for local traffic" merged, IGMP trapping is also
    required for stable IGMPv2 operation.

    LAN9303 does not trap IGMP packets by default.

    Enable IGMP trapping in lan9303_setup.

    Signed-off-by: Egil Hjelmeland
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Egil Hjelmeland
     
  • Collect vpd information directly from hardware instead of software
    adapter context. Move EEPROM physical address to virtual address
    translation logic to t4_hw.c and update relevant files.

    Fixes: 6f92a6544f1a ("cxgb4: collect hardware misc dumps")
    Signed-off-by: Rahul Lakkireddy
    Signed-off-by: Ganesh Goudar
    Signed-off-by: David S. Miller

    Rahul Lakkireddy
     
  • Saeed Mahameed says:

    ====================
    Mellanox, mlx5 fixes 2017-11-08

    The following series includes some fixes for mlx5 core and ethernet
    driver.

    Sorry for the late submission but as you can see I have some very
    critical fixes below that I would like merged into this RC.

    Please pull and let me know if there is any problem.

    For -stable:
    ('net/mlx5e: Set page to null in case dma mapping fails') kernels >= 4.13
    ('net/mlx5: FPGA, return -EINVAL if size is zero') kernels >= 4.13
    ('net/mlx5: Cancel health poll before sending panic teardown command') kernels >= 4.13

    V1->V2:
    - Fix Reviewed-by tag of the 2nd patch.
    - Drop the FPGA 0 size fix, it needs some more change log info.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • After refcnt reaches zero, vlan_vid_del() could free
    dev->vlan_info via RCU:

    RCU_INIT_POINTER(dev->vlan_info, NULL);
    call_rcu(&vlan_info->rcu, vlan_info_rcu_free);

    However, the pointer 'grp' still points to that memory
    since it is set before vlan_vid_del():

    vlan_info = rtnl_dereference(dev->vlan_info);
    if (!vlan_info)
            goto out;
    grp = &vlan_info->grp;

    Depending on when that RCU callback is scheduled, we could
    trigger a use-after-free in vlan_group_for_each_dev()
    right following this vlan_vid_del().

    Fix it by moving vlan_vid_del() before setting grp. This
    is also symmetric to the vlan_vid_add() we call in
    vlan_device_event().

    Reported-by: Fengguang Wu
    Fixes: efc73f4bbc23 ("net: Fix memory leak - vlan_info struct")
    Cc: Alexander Duyck
    Cc: Linus Torvalds
    Cc: Girish Moodalbail
    Signed-off-by: Cong Wang
    Reviewed-by: Girish Moodalbail
    Tested-by: Fengguang Wu
    Signed-off-by: David S. Miller

    Cong Wang
     
  • The statistics histogram mode was not being explicitly initialized on
    devices other than the 6390 family. Clearing the statistics then
    overwrote the default setting, setting the histogram to a reserved
    mode.

    Explicitly set the histogram mode for all devices. Change the
    statistics clear into a read/modify/write, and since it is now more
    complex, move it into global1.c.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
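
    The difference between a plain write and a read/modify/write can be
    sketched on a single 16-bit register value. The bit layout below is
    hypothetical and chosen only for illustration; it is not taken from the
    hardware documentation or the driver.

```c
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
#define STATS_OP_MASK       0x7000u  /* operation field */
#define STATS_OP_FLUSH_ALL  0x1000u  /* "clear all counters" operation */
#define STATS_HIST_MASK     0x0c00u  /* histogram mode field */

/* Plain write: the histogram mode previously in the register is lost. */
uint16_t stats_clear_overwrite(uint16_t reg)
{
    (void)reg;  /* the old value is ignored, which is the problem */
    return STATS_OP_FLUSH_ALL;
}

/* Read/modify/write: replace only the operation field, keep the rest. */
uint16_t stats_clear_rmw(uint16_t reg)
{
    return (uint16_t)((reg & ~STATS_OP_MASK) | STATS_OP_FLUSH_ALL);
}
```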
     
  • Andrew Lunn says:

    ====================
    mv88e6xxx broadcast flooding in hardware

    This patchset makes the mv88e6xxx driver perform flooding in hardware,
    rather than let the software bridge perform the flooding. This is a
    prerequisite for IGMP snooping on the bridge interface.

    In order to make hardware broadcasting work, a few other issues need
    fixing or improving. SWITCHDEV_ATTR_ID_PORT_PARENT_ID is broken, which
    is apparent when testing on the ZII devel board with multiple
    switches.

    Some of these patches are taken from a previous RFC patchset of IGMP
    support.

    Rebased onto net-next, with fixup for Vivien's refactoring.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • By default, the switch does not flood broadcast frames. Instead the
    broadcast address is unknown in the ATU, so the frame gets forwarded
    out the cpu port. The software bridge then floods it back to the
    individual switch ports which are members of the bridge.

    Add an ATU entry in the switch so that it floods broadcast frames out
    ports, rather than have the software bridge do it. Also, send a copy
    out the cpu port and any dsa ports. Rely on the port vectors to
    prevent broadcast frames leaking between bridges, and separated ports.

    Additionally, when a VLAN is added, a new FID is allocated. This
    represents a new table of ATU entries. A broadcast entry is added to
    the new FID.

    With offload_fwd_mark being set, the software bridge will not flood
    the frames it receives back to the switch.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • This function is going to be needed by a soon to be added new
    function. Move it earlier so we can avoid a forward declaration.
    No functional changes.

    Signed-off-by: Andrew Lunn
    Reviewed-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • When testing if a VLAN is on more than one bridge, we print an error
    message that the VLAN is already in use somewhere else. Print both the
    new port which would like the VLAN, and the port which already has it,
    to aid debugging.

    Signed-off-by: Andrew Lunn
    Reviewed-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • Having the same VLAN on multiple bridges is currently unsupported as
    an offload. mv88e6xxx_port_check_hw_vlan() is used to ensure that a
    VLAN is not on multiple bridges when adding a VLAN range to a port. It
    loops the ports and checks to see if there are ports in a different
    bridge with the same VLAN.

    While walking all switch ports, the code was checking if the new port
    has a netdev slave attached to it. If not, skip checking the port
    being walked. This seems like a typo. If the new port does not have a
    slave, how has a VLAN been added to it in the first place, requiring
    this check be performed at all? More likely, we should be checking if
    the port being walked has a slave. Without the port having a slave, it
    cannot have a VLAN on it, so there is no need to check further for
    that particular port.

    Signed-off-by: Andrew Lunn
    Reviewed-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • The software bridge needs to know if a packet has already been bridged
    by hardware offload to ports in the same hardware offload, in order
    that it does not re-flood them, causing duplicates. This is
    particularly true for broadcast and multicast traffic which the host
    has requested.

    By setting offload_fwd_mark in the skb the bridge will only flood to
    ports in other offloads and other netifs. Set this flag in the DSA and
    EDSA tag driver.

    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
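
    A simplified sketch of the flooding decision this enables, using
    hypothetical structures rather than the kernel's: when the mark is set,
    the software bridge skips egress ports that belong to the same hardware
    offload the packet arrived on.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of a received packet. */
struct pkt {
    bool offload_fwd_mark;  /* set when hardware already forwarded it */
    int  src_offload;       /* identifies the switch it arrived on */
};

/* Hypothetical, simplified view of a bridge port. */
struct port {
    int offload;            /* switch the port belongs to; 0 means none */
};

/* Should the software bridge flood `p` out of `egress`? */
bool bridge_should_flood(const struct pkt *p, const struct port *egress)
{
    if (!p->offload_fwd_mark)
        return true;  /* hardware did not forward, software must */
    /* Hardware already covered ports in the same offload. */
    return egress->offload != p->src_offload;
}
```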