11 Aug, 2016

23 commits

  • The commit 555c8a8623a3 ("bpf: avoid stack copy and use skb ctx for event output")
    started using 20 of the initially reserved upper 32 bits of the 'flags' argument
    in bpf_perf_event_output(). Adjust the corresponding prototype in samples/bpf/bpf_helpers.h.

    Signed-off-by: Adam Barth
    Signed-off-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Adam Barth
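
A minimal userspace sketch of why the prototype needs a 64-bit 'flags' argument once bits above bit 31 are in use. The macro names, helper names and exact field positions below are assumptions made for illustration (an index in the low 32 bits, a 20-bit length starting at bit 32); they are not taken from the commit itself.

```c
#include <stdint.h>

/* Hypothetical layout: low 32 bits carry an index, the next 20 bits a length. */
#define SKETCH_INDEX_MASK 0xffffffffULL
#define SKETCH_LEN_SHIFT  32
#define SKETCH_LEN_MASK   (0xfffffULL << SKETCH_LEN_SHIFT)

/* Pack an index and a length into one 64-bit flags word. */
static uint64_t sketch_pack_flags(uint32_t index, uint32_t len)
{
    return (((uint64_t)len << SKETCH_LEN_SHIFT) & SKETCH_LEN_MASK) |
           ((uint64_t)index & SKETCH_INDEX_MASK);
}

/* Extract the index from the low 32 bits. */
static uint32_t sketch_flags_index(uint64_t flags)
{
    return (uint32_t)(flags & SKETCH_INDEX_MASK);
}

/* Extract the 20-bit length stored above bit 31. */
static uint32_t sketch_flags_len(uint64_t flags)
{
    return (uint32_t)((flags & SKETCH_LEN_MASK) >> SKETCH_LEN_SHIFT);
}
```

If such a value is passed through a 32-bit parameter, the upper field is silently dropped, which is the kind of mismatch a stale prototype would cause.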
     
  • This patch adds support for 64 bit addressing and BDs.
    -> Enable 64 bit addressing in DMACFG register.
    -> Set DMA mask when design config register shows support for 64 bit addr.
    -> Add new BD words for higher address when 64 bit DMA support is present.
    -> Add and update TBQPH and RBQPH for MSB of BD pointers.
    -> Change extraction and update of buffer addresses to use
    64 bit address.
    -> In gem_rx extract address in one place instead of two and use a
    separate flag for RXUSED.

    Signed-off-by: Harini Katakam
    Signed-off-by: David S. Miller

    Harini Katakam
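
A minimal sketch of splitting a 64-bit buffer address across two 32-bit descriptor words, as the "new BD words for higher address" item suggests. The structure layout and all names are hypothetical, and the sketch ignores any status flags that a real descriptor may keep in the low bits of the address word.

```c
#include <stdint.h>

/* Hypothetical buffer descriptor with an extra word for the upper address bits. */
struct sketch_desc {
    uint32_t addr_lo; /* lower 32 bits of the buffer address */
    uint32_t ctrl;    /* control/status word, unused in this sketch */
    uint32_t addr_hi; /* upper 32 bits, only meaningful with 64-bit DMA */
    uint32_t unused;
};

/* Store a 64-bit address into the two address words. */
static void sketch_set_addr(struct sketch_desc *d, uint64_t addr)
{
    d->addr_lo = (uint32_t)(addr & 0xffffffffULL);
    d->addr_hi = (uint32_t)(addr >> 32);
}

/* Rebuild the 64-bit address from the two words in one place. */
static uint64_t sketch_get_addr(const struct sketch_desc *d)
{
    return ((uint64_t)d->addr_hi << 32) | d->addr_lo;
}
```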
     
  • This patch adds the driver implementation for ethtool link_ksettings
    callbacks. qed driver now defines/uses the qed specific masks for
    representing link capability values. qede driver maps these values to
    the new link modes defined by the kernel implementation of link_ksettings.

    Please consider applying this to 'net-next' branch.

    Signed-off-by: Sudarsana Reddy Kalluru
    Signed-off-by: Yuval Mintz
    Signed-off-by: David S. Miller

    Sudarsana Reddy Kalluru
     
  • Ivan Khoronzhuk says:

    ====================
    net: ethernet: ti: cpsw: split driver data and per ndev data

    In dual_emac mode the driver can handle 2 network devices. Each of them can use
    its own private data and common data/resources. This patchset splits common driver
    data/resources and private per net device data.
    It leads to:
    - reduced memory usage
    - increased code readability
    - a bunch of simplifications
    - prerequisites for adding multi-channel support,
    when channels are shared between net devices

    Doesn't have bad impact on performance.
    v2: https://lkml.org/lkml/2016/8/6/108

    Since v2:
    - removed patch:
    net: ethernet: ti: cpsw: fix int dbg message
    - replaced patch:
    "net: ethernet: ti: cpsw: remove redundant check in napi poll"
    on "net: ethernet: ti: cpsw: remove intr dbg msg from poll handlers"
    - removed macro "cpsw_get_slave_ndev"
    - corrected some commits

    Since v1:
    - added several patch improvements
    - avoided variable reordering in structures
    - removed static variable for common function
    - split the big patch into several patches:
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The ale, cpts, version, rx_packet_max, bus_freq, interrupt pacing
    parameters are common per net device that uses the same h/w. So,
    move them to common driver structure.

    Signed-off-by: Ivan Khoronzhuk
    Reviewed-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • The napi structs are common for both net devices in dual_emac
    mode. In order to avoid holding duplicate links to them, move them to
    cpsw_common.

    Signed-off-by: Ivan Khoronzhuk
    Reviewed-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • These data are common for net devs in dual_emac mode. No need to hold
    them for every priv instance, so move them under cpsw_common.

    Signed-off-by: Ivan Khoronzhuk
    Reviewed-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • The irq data are common for net devs in dual_emac mode. So no need to
    hold these data in every priv struct, move them under cpsw_common.
    Also delete irq_num var, as after optimization it's not needed.
    Correct the number of irqs to 2, as the driver is using only 2 anyway,
    at least for now.

    Signed-off-by: Ivan Khoronzhuk
    Reviewed-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • Every net device private struct holds links to shared cpdma resources.
    There is no need to save these resources per net dev and synchronize
    them every time. So, move them to the common driver struct.

    Signed-off-by: Ivan Khoronzhuk
    Reviewed-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • The pointers on h/w registers are common for every cpsw_private
    instance, so no need to hold them for every ndev.

    Signed-off-by: Ivan Khoronzhuk
    Reviewed-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • No need to hold pdev link when only dev is needed.
    This allows simplifying a bunch of cpsw->pdev->dev uses now and further on.

    Signed-off-by: Ivan Khoronzhuk
    Reviewed-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • This patch simply creates a holder for common data and, as a start, moves
    the pdev var to it.

    Signed-off-by: Ivan Khoronzhuk
    Reviewed-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • No need to check const slave num in runtime for every packet,
    and ndev for slaves w/o ndev is anyway NULL. So remove redundant
    check and macro.

    Reviewed-by: Mugunthan V N
    Signed-off-by: Ivan Khoronzhuk
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • There is no need to hold a link to clk, it's used only once
    during probe.

    Reviewed-by: Mugunthan V N
    Reviewed-by: Grygorii Strashko
    Signed-off-by: Ivan Khoronzhuk
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • There is no need for priv here.

    Reviewed-by: Mugunthan V N
    Reviewed-by: Grygorii Strashko
    Signed-off-by: Ivan Khoronzhuk
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • In the poll handler there is no way to figure out which network device is
    handling packets, as cpdma channels are common for both network
    devices in dual_emac mode. Currently, the messages are printed only
    for one device, while in fact there are two. This print msg is incorrect
    and does not seem very useful, so drop it from the poll handler.

    Reviewed-by: Mugunthan V N
    Signed-off-by: Ivan Khoronzhuk
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • As the second net dev is created only in dual_emac mode, the port
    number can be figured out in a simpler way. Also, there is no need to
    pass the redundant ndev struct.

    Reviewed-by: Mugunthan V N
    Reviewed-by: Grygorii Strashko
    Signed-off-by: Ivan Khoronzhuk
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • PPTP is encapsulated in a GRE header whose GRE_VERSION bits
    must contain one. But the current GRE RPS code requires GRE_VERSION to
    be zero. So RPS does not work for PPTP traffic.

    In my test environment, there are four MIPS cores, and all traffic
    is passed through PPTP. As a result, only one core is 100% busy
    while the other three cores are mostly idle. After this patch, the load
    is well balanced across the four cores.

    Signed-off-by: Gao Feng
    Reviewed-by: Philip Prindeville
    Signed-off-by: David S. Miller

    Gao Feng
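
A minimal sketch of the version check described above, assuming the GRE flags-and-version word has already been converted to host byte order and that the version occupies its low three bits. The function names are hypothetical, and the sketch does not reproduce the other checks the actual patch may perform on PPTP packets.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_GRE_VERSION_MASK 0x0007u /* low three bits, host byte order */

/* Extract the GRE version from the flags-and-version word. */
static unsigned sketch_gre_version(uint16_t flags)
{
    return flags & SKETCH_GRE_VERSION_MASK;
}

/* Old behaviour as described: only version 0 is dissected. */
static bool sketch_accept_v0_only(uint16_t flags)
{
    return sketch_gre_version(flags) == 0;
}

/* Sketch of the new behaviour: version 1 (used by PPTP) is accepted too. */
static bool sketch_accept_v0_or_v1(uint16_t flags)
{
    return sketch_gre_version(flags) <= 1;
}
```

With the first predicate every PPTP packet is rejected before any flow hash is computed, which matches the single busy core observed in the test setup.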
     
  • Jiri Kosina says:

    ====================
    Convert qdisc linked list into a hashtable

    This is a respin of the v6 of the original patch [1], split into two-patch
    series as requested by davem; first patch fixes all symbol conflicts
    that'd happen once netdevice.h starts to include hashtable.h, the second
    one performs the actual switch to hashtable.

    I've preserved Cong's Reviewed-by:, as code-wise this series is identical
    to the original v6 of the patch.

    [1] lkml.kernel.org/r/alpine.LNX.2.00.1608011220580.22028@cbobk.fhfr.pm
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Convert the per-device linked list into a hashtable. The primary
    motivation for this change is that currently, we're not tracking all the
    qdiscs in hierarchy (e.g. excluding default qdiscs), as the lookup
    performed over the linked list by qdisc_match_from_root() is rather
    expensive.

    The ultimate goal is to get rid of hidden qdiscs completely, which will
    bring much more determinism in user experience.

    Reviewed-by: Cong Wang
    Signed-off-by: Jiri Kosina
    Signed-off-by: David S. Miller

    Jiri Kosina
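
A minimal userspace sketch of the idea behind the conversion: looking up a qdisc by its 32-bit handle through a small bucket array instead of walking one long list. It does not use the kernel's hashtable.h; the structure names, the bucket count and the hash function are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HASH_BITS 4
#define SKETCH_BUCKETS   (1u << SKETCH_HASH_BITS)

/* Hypothetical qdisc: only the handle and the bucket chain link. */
struct sketch_qdisc {
    uint32_t handle;
    struct sketch_qdisc *next;
};

/* Hypothetical per-device table of bucket heads. */
struct sketch_dev {
    struct sketch_qdisc *buckets[SKETCH_BUCKETS];
};

/* Multiplicative hash: the top bits of the product select the bucket. */
static unsigned sketch_bucket(uint32_t handle)
{
    return (uint32_t)(handle * 2654435761u) >> (32 - SKETCH_HASH_BITS);
}

/* Insert at the head of the bucket chain. */
static void sketch_qdisc_add(struct sketch_dev *dev, struct sketch_qdisc *q)
{
    unsigned b = sketch_bucket(q->handle);

    q->next = dev->buckets[b];
    dev->buckets[b] = q;
}

/* Walk only the one bucket the handle hashes to. */
static struct sketch_qdisc *sketch_qdisc_lookup(struct sketch_dev *dev,
                                                uint32_t handle)
{
    struct sketch_qdisc *q;

    for (q = dev->buckets[sketch_bucket(handle)]; q; q = q->next)
        if (q->handle == handle)
            return q;
    return NULL;
}
```

Because a lookup touches a single bucket, tracking every qdisc in the hierarchy, including default ones, no longer makes each lookup proportionally slower.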
     
  • This is a preparatory patch for converting qdisc linked list into a
    hashtable. As we'll need to include hashtable.h in netdevice.h, we first
    have to make sure that this will not introduce symbol conflicts for any of
    the netdevice.h users.

    Reviewed-by: Cong Wang
    Signed-off-by: Jiri Kosina
    Signed-off-by: David S. Miller

    Jiri Kosina
     
  • The patch 'ravb: add sleep PM suspend/resume support' used incorrect
    function names containing 'runtime' for the suspend and resume
    functions.

    Reported-by: Sergei Shtylyov
    Signed-off-by: Niklas Söderlund
    Acked-by: Sergei Shtylyov
    Signed-off-by: David S. Miller

    Niklas Söderlund
     
  • ic_close_devs() calls kfree() for every device's ic_device. Since commit
    2647cffb2bc6 ("net: ipconfig: Support using "delayed" DHCP replies")
    the active device's ic_device is, however, still used afterwards to print
    the ipconfig summary, which results in an oops if the freed memory has
    already been overwritten. So delay freeing until after the autoconfig
    results are reported.

    Fixes: 2647cffb2bc6 ("net: ipconfig: Support using "delayed" DHCP replies")
    Reported-by: Geert Uytterhoeven
    Signed-off-by: Uwe Kleine-König
    Tested-by: Geert Uytterhoeven
    Signed-off-by: David S. Miller

    Uwe Kleine-König
     

10 Aug, 2016

4 commits

  • The interface would not function after the system had been woken up
    from a suspend (echo mem > /sys/power/state) cycle. The
    reason for this is that all device registers have been reset to their
    default values. This patch adds sleep suspend and resume functions that
    detach the interface at suspend, and restore the registers and reattach
    the interface at resume.

    Only the registers that are configured solely at probe time need to be
    explicitly restored by the resume handler. All other registers are
    reconfigured by either reopening the device in the resume handler (if
    the device was running when the system was suspended) or when the
    interface is opened by a user at a later time.

    Signed-off-by: Niklas Söderlund
    Signed-off-by: David S. Miller

    Niklas Söderlund
     
  • The b53_io_ops structures are never modified, so declare them as const.

    Done with the help of Coccinelle.

    Signed-off-by: Julia Lawall
    Acked-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Julia Lawall
     
  • After commit 0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse")
    fib_local is set but not used. Remove it.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Userspace programs generally need to know the name of the ppp devices
    they create. Both ioctl and rtnl interfaces use the ppp<suffix> scheme
    to name them. But although the suffix used by the ioctl interface can
    be known by userspace (it's the PPP unit identifier returned by the
    PPPIOCGUNIT ioctl), the one used by the rtnl is only known by the
    kernel.

    This patch brings more consistency between ioctl and rtnl based ppp
    devices by generating device names using the PPP unit identifier as
    suffix in both cases. This way, userspace can always infer the name of
    the devices they create.

    Signed-off-by: Guillaume Nault
    Signed-off-by: David S. Miller

    Guillaume Nault
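
A minimal sketch of the naming rule described above: the device name is the "ppp" prefix followed by the PPP unit identifier, so a program that knows the unit can derive the name. The helper name is hypothetical and the code is plain userspace C, not the driver's implementation.

```c
#include <stdio.h>

/* Build "ppp<unit>" into buf; returns the length snprintf() reports. */
static int sketch_ppp_name(char *buf, size_t len, int unit)
{
    return snprintf(buf, len, "ppp%d", unit);
}
```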
     

09 Aug, 2016

13 commits

  • This helps detect errors related to format strings at compile
    time.

    Signed-off-by: Nicolas Iooss
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Nicolas Iooss
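
The message above does not name the mechanism. One common way to get compile-time format-string checking with GCC and Clang is the `format(printf, ...)` function attribute, sketched here on a hypothetical logging helper. The helper name and its signature are assumptions for illustration only.

```c
#include <stdarg.h>
#include <stdio.h>

/* Argument 3 is the format string; the variadic arguments start at 4. */
static int sketch_log(char *buf, size_t len, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

/* Format into buf; with the attribute, mismatched arguments are diagnosed. */
static int sketch_log(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return n;
}
```

With the attribute in place, a call such as `sketch_log(buf, len, "%s", 42)` draws a warning when format checking (for example `-Wformat`) is enabled, instead of failing at run time.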
     
  • Hello,

    I addressed all review comments and am re-sending for review.

    From a5017f5878a92d2acec86a6a29b1498c457cb73a Mon Sep 17 00:00:00 2001
    From: Nagaraju Lakkaraju
    Date: Wed, 3 Aug 2016 18:28:24 +0530
    Subject: [PATCH v2] net: phy: Add drivers for Microsemi PHYs

    Signed-off-by: Nagaraju Lakkaraju
    Signed-off-by: David S. Miller

    Raju Lakkaraju
     
  • Use of_property_read_bool to check for the existence of a property.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @@
    expression e1,e2,x;
    @@
    - if (of_get_property(e1,e2,NULL))
    -   x = true;
    - else
    -   x = false;
    + x = of_property_read_bool(e1,e2);
    //

    Signed-off-by: Julia Lawall
    Signed-off-by: David S. Miller

    Julia Lawall
     
  • On Hyper-V host 2016 and later, VMs get an event message of the physical
    link speed when vSwitch is changed. This patch handles this message, so
    the updated link speed can be reported by ethtool.

    Signed-off-by: Haiyang Zhang
    Reviewed-by: K. Y. Srinivasan
    Signed-off-by: David S. Miller

    Haiyang Zhang
     
  • The physical link speed value will be reported by ethtool command.
    The real speed is available from Windows 2016 host or later.

    Signed-off-by: Haiyang Zhang
    Reviewed-by: K. Y. Srinivasan
    Signed-off-by: David S. Miller

    Haiyang Zhang
     
  • The struct cpdma_desc_pool->used_desc field can be safely removed from
    the CPDMA driver (and hot path) because the used_desc counter is used just
    for a pool consistency check at CPDMA deinitialization, and now this
    check can be re-implemented using gen_pool_size(pool->gen_pool) !=
    gen_pool_avail(pool->gen_pool).
    Moreover, this gets rid of warnings in
    cpdma_desc_pool_destroy()-> WARN_ON(pool->used_desc), which may happen
    because used_desc has been used unprotected since CPDMA was
    switched to genalloc, and may get wrong values on SMP.

    Hence, remove used_desc from struct cpdma_desc_pool.

    Signed-off-by: Grygorii Strashko
    Reviewed-by: Ivan Khoronzhuk
    Signed-off-by: David S. Miller

    Grygorii Strashko
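
A minimal userspace sketch of the consistency check described above: instead of keeping a separate "used" counter, outstanding allocations are detected by comparing the pool's total size with what is currently available. This is not the genalloc API; the structure and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pool that only tracks total and free bytes. */
struct sketch_pool {
    size_t size;  /* total bytes managed by the pool */
    size_t avail; /* bytes currently free */
};

/* Take n bytes from the pool if enough are free. */
static bool sketch_pool_alloc(struct sketch_pool *p, size_t n)
{
    if (n > p->avail)
        return false;
    p->avail -= n;
    return true;
}

/* Return n bytes to the pool. */
static void sketch_pool_free(struct sketch_pool *p, size_t n)
{
    p->avail += n;
}

/* Teardown check: true while any allocation is still outstanding. */
static bool sketch_pool_in_use(const struct sketch_pool *p)
{
    return p->size != p->avail;
}
```

The check derives its answer from state the allocator already maintains, so there is no second counter to keep in sync on the fast path.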
     
  • The code using this variable has been commented out in the past as it
    was causing issues in upperlimited link-sharing scenarios.

    Signed-off-by: Michal Soltys
    Signed-off-by: David S. Miller

    Michal Soltys
     
  • This patch simplifies how we update fsc and calculate vt from it - while
    keeping the expected functionality identical with how hfsc behaves
    currently. It also fixes a certain issue introduced with
    a very old patch.

    The idea is, that instead of correcting cl_vt before fsc curve update
    (rtsc_min) and correcting cl_vt after calculation (rtsc_y2x) to keep
    cl_vt local to the current period - we can simply rely on virtual times
    and curve values always being in sync - analogously to how rsc and usc
    function, except that we use virtual time here.

    Why hasn't it been done this way from the beginning? The likely reason
    (based on the code trying to correct curves whenever possible) was to
    keep the virtual times as small as possible - as they have a tendency to
    "gallop" forward whenever their siblings and other fair sharing
    subtrees are idling. On top of that, current code is subtly bugged, so
    cumulative time (without any corrections) is always kept and used in
    init_vf() when a new backlog period begins (using cl_cvtoff).

    Is the cumulative value safe? Generally yes, though corner cases are easy
    to create. For example consider:

    1gbit interface
    some 100kbit leaf, everything else idle

    With current tick (64ns) 1s is 15625000 ticks, but the leaf is alone and
    it's virtual time, so in reality it's 10000 times more. IOW 38 bits are
    needed to hold 1 second. 54 - 1 day, 59 - 1 month, 63 - 1 year (all
    logarithms rounded up). It's getting somewhat dangerous, but it also
    requires a setup that justifies such values, not to mention a class
    permanently backlogged for a year. In a nearly extreme case (10gbit,
    10kbit leaf), we have "enough" to hold ~13.6 days in 64 bits.

    Well, the issue remains mostly theoretical and cl_cvtoff has been
    working fine for all those years. Sensible configurations are de facto
    immune to this issue, and not so sensible ones can solve it with a cronjob
    with a period inversely proportional to the insanity of such a setup =)

    Now let's explain the subtle bug mentioned earlier.

    The issue is related to how offsets are kept and how we calculate
    virtual times and update fair service curve(s). The issue itself is
    subtle, but easy to observe with long m1 segments. It was introduced in
    a rather old patch:

    Commit 99296150c7: "[NET_SCHED]: O(1) children vtoff adjustment
    in HFSC scheduler"

    (available in git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git)

    Originally when a new backlog period was started, cl_vtoff of each
    sibling was updated with cl_cvtmax from past period - naturally moving
    all cl_vt to proper starting point. That patch adjusted it so cumulative
    offset is kept in the parent, and there is no need for traversing the
    list (as any subsequent child activation derives new vt from already
    active sibling(s)).

    But with this change, cl_vtoff (of each sibling) is no longer persistent
    across the inactivity periods, as it's calculated from parent's
    cl_cvtoff on a new backlog period, conflicting with the following curve
    correction from the previous period:

    if (cl->cl_virtual.x == vt) {
            cl->cl_virtual.x -= cl->cl_vtoff;
            cl->cl_vtoff = 0;
    }

    This essentially tries to keep curve as if it was local to the period
    and resets cl_vtoff (cumulative vt offset of the class) to 0 when
    possible (read: when we have an intersection or if a new curve is below
    the old one). But then it's recalculated from cl_cvtoff on next active
    period. Then rtsc_min() call preceding the above if() doesn't really
    do what we expect it to do in such scenario - as it calculates the
    minimum of corrected curve (from the previous backlog period) and the
    new uncorrected curve (with offset derived from cl_cvtoff).

    Example:

    tc class add dev $ife parent 1:0 classid 1:1 hfsc ls m2 100mbit ul m2 100mbit
    tc class add dev $ife parent 1:1 classid 1:10 hfsc ls m1 80mbit d 10s m2 20mbit
    tc class add dev $ife parent 1:1 classid 1:11 hfsc ls m2 20mbit

    start B, keep it backlogged, let it run 6s (30s worth of vt as A is idle)
    pause B briefly to force cl_cvtoff update in parent (whole 1:1 going idle)
    start A, let it run 10s
    pause A briefly to force rtsc_min()

    At this point we would expect A to continue at 20mbit after a brief
    moment of 80mbit. But instead A will use 80mbit for full 10s again. It's
    the effect of first correcting A (during 'start A'), and then - after
    unpausing - calculating rtsc_min() from old corrected and new uncorrected
    curve.

    The patch fixes this bug and keeps vt and fsc in sync (virtual times
    are cumulative, not local to the backlog period).

    Signed-off-by: Michal Soltys
    Signed-off-by: David S. Miller

    Michal Soltys
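
The bit-width figures quoted in the HFSC message above (38 bits for one second, 54 for a day, 59 for a month, 63 for a year, and roughly 13.6 days before a 64-bit counter wraps in the 10gbit/10kbit case) can be re-derived with a short calculation. This is a sketch under the message's own assumptions: a 64ns tick, and virtual time advancing faster than real time by the ratio of link rate to leaf rate. The function names, and the choice of 30 and 365 days for a month and a year, are assumptions for illustration.

```c
#include <stdint.h>

#define SKETCH_TICKS_PER_SEC 15625000ULL /* one second divided by 64ns */

/* Bits needed to hold v; for non powers of two this equals ceil(log2(v)). */
static unsigned sketch_bits_needed(uint64_t v)
{
    unsigned bits = 0;

    while (v) {
        bits++;
        v >>= 1;
    }
    return bits;
}

/* Virtual-time ticks accumulated by a lone leaf over the given seconds. */
static uint64_t sketch_vt_ticks(uint64_t seconds, uint64_t link_bps,
                                uint64_t leaf_bps)
{
    return SKETCH_TICKS_PER_SEC * (link_bps / leaf_bps) * seconds;
}

/* Whole seconds of backlog that fit before a 64-bit counter wraps. */
static uint64_t sketch_seconds_to_wrap(uint64_t link_bps, uint64_t leaf_bps)
{
    return UINT64_MAX / (SKETCH_TICKS_PER_SEC * (link_bps / leaf_bps));
}
```

For the 1gbit link with a 100kbit leaf the ratio is 10000, so one second corresponds to 156250000000 ticks; for 10gbit with a 10kbit leaf the ratio is 1000000.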
     
  • spinlock can be initialized automatically with DEFINE_SPINLOCK()
    rather than explicitly calling spin_lock_init().

    Signed-off-by: Wei Yongjun
    Acked-by: Yuval Mintz
    Signed-off-by: David S. Miller

    Wei Yongjun
     
  • Based on RFC3376 5.1 and RFC3810 6.1

    If the per-interface listening change that triggers the new report is
    a filter mode change, then the next [Robustness Variable] State
    Change Reports will include a Filter Mode Change Record. This
    applies even if any number of source list changes occur in that
    period.

    Old State         New State         State Change Record Sent
    ---------         ---------         ------------------------
    INCLUDE (A)       EXCLUDE (B)       TO_EX (B)
    EXCLUDE (A)       INCLUDE (B)       TO_IN (B)

    So we should not send source-list change if there is a filter-mode change.

    Here are two scenarios:
    1. Group deleted and filter mode is EXCLUDE, which means we need to send a
    TO_IN { }.
    2. Group not deleted, but has pcm->crcount, which means we need to send a
    normal filter-mode-change.

    At the same time, if the type is ALLOW or BLOCK, and there is a
    psf->sf_crcount, we stop adding records and decrease sf_crcount directly.

    Reference: https://www.ietf.org/mail-archive/web/magma/current/msg01274.html

    Signed-off-by: Hangbin Liu
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hangbin Liu
     
  • The ethtool api {get|set}_settings is deprecated.
    We move the mvneta driver to new api {get|set}_link_ksettings.

    We use the generic function phy_ethtool_get_link_ksettings,
    and update old mvneta_ethtool_set_settings to the new api.

    Signed-off-by: Philippe Reynes
    Signed-off-by: David S. Miller

    Philippe Reynes
     
  • The private structure contains a pointer to phydev, but the structure
    net_device already contains such a pointer. So we can remove the pointer
    phy_dev in the private structure, and update the driver to use the
    one contained in struct net_device.

    Signed-off-by: Philippe Reynes
    Signed-off-by: David S. Miller

    Philippe Reynes
     
  • There are two generic functions phy_ethtool_{get|set}_link_ksettings,
    so we can use them instead of defining the same code in the driver.

    Signed-off-by: Philippe Reynes
    Signed-off-by: David S. Miller

    Philippe Reynes