17 Feb, 2018

3 commits

  • It is unnecessary to keep two structures, struct tipc_conn and struct
    tipc_subscriber, with a one-to-one relationship but different life
    cycles. The fact that the two often run in different contexts, and
    still may access each other via direct pointers, constitutes an
    additional hazard, something we have experienced on several occasions,
    and still see happening.

    We have identified at least two remaining problems that are easier to
    fix if we simplify the topology server data structure somewhat.

    - When there is a race between a subscription up/down event and a
    timeout event, it is fully possible that the former might be delivered
    after the latter, leading to confusion for the receiver.

    - The function tipc_subscrp_timeout() is executing in interrupt context,
    while the following call chain is at least theoretically possible:
    tipc_subscrp_timeout()
    tipc_subscrp_send_event()
    tipc_conn_sendmsg()
    conn_put()
    tipc_conn_kref_release()
    sock_release(sock)

    I.e., we end up calling a function that might try to sleep in
    interrupt context. To eliminate this, we need to ensure that the
    tipc_conn structure and the socket, as well as the subscription
    instances, are only deleted in work queue context, i.e., after the
    timeout event really has been sent out.

    We now remove this unnecessary complexity, by merging data and
    functionality of the subscriber structure into struct tipc_conn
    and the associated file server.c. We thereafter add a spinlock and
    a new 'inactive' state to the subscription structure. Using those,
    both problems described above can be easily solved.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
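
A minimal sketch of the idea, written in Python rather than kernel C and not taken from the patch: a per-subscription lock plus an 'inactive' flag means that once the timeout event has been reported, a late up/down event is dropped instead of being delivered after it. The class and method names are hypothetical, and a `threading.Lock` only stands in for the spinlock.

```python
import threading


class Subscription:
    """Toy model of a subscription guarded by a lock and an 'inactive' flag."""

    def __init__(self):
        self._lock = threading.Lock()   # stands in for the per-subscription spinlock
        self.inactive = False           # set once the timeout has been reported
        self.delivered = []             # events handed to the receiver, in order

    def report_event(self, event):
        """Deliver an up/down event unless the subscription already timed out."""
        with self._lock:
            if self.inactive:
                return False            # too late: never deliver after the timeout
            self.delivered.append(event)
            return True

    def report_timeout(self):
        """Deliver the timeout event once and mark the subscription inactive."""
        with self._lock:
            if self.inactive:
                return False
            self.inactive = True
            self.delivered.append("timeout")
            return True
```

In this model the actual teardown of an inactive subscription would be left to a separate worker, mirroring the requirement that deletion only happens in work queue context.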
     
  • Interaction between the functionality in server.c and subscr.c is
    done via function pointers installed in struct server. This makes
    the code harder to follow, and doesn't serve any obvious purpose.

    Here, we replace the function pointers with direct function calls.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • The socket handling in the topology server is unnecessarily generic.
    It is prepared to handle SOCK_RDM, SOCK_DGRAM and SOCK_STREAM type
    sockets, as well as the only socket type which is really used,
    SOCK_SEQPACKET.

    We now remove this redundant code to make the code more readable.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

16 Feb, 2018

15 commits

  • …etooth/bluetooth-next

    Johan Hedberg says:

    ====================
    pull request: bluetooth-next 2018-02-15

    Here's the first bluetooth-next pull request targeting the 4.17 kernel
    release.

    - Fixes & cleanups to Atheros and Marvell drivers
    - Support for two new Realtek controllers
    - Support for new Intel Bluetooth controller
    - Fix for supporting multiple slave-role Bluetooth LE connections
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • The eBPF test fails due to a verifier failure because log_buf is too
    small. Fix this by increasing the log_buf size.

    Signed-off-by: Prashant Bhole
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Prashant Bhole
     
  • Remove rt_table_id from rtable. It was added for getroute to return the
    table id that was hit in the lookup. With the changes for fibmatch the
    table id can be extracted from the fib_info returned in the fib_result
    so it no longer needs to be in rtable directly.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Brenda J. Butler says:

    ====================
    tools: tc-testing: Plugin Architecture

    To make tdc.py more general, we are introducing a plugin architecture.

    This patch set first organizes the command line parameters, then
    introduces the plugin architecture and some example plugins.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: Brenda J. Butler
    Acked-by: Lucas Bates
    Signed-off-by: David S. Miller

    Brenda J. Butler
     
  • Run the command under test under valgrind. Produce an extra set of
    tap output for the memory check on each test.

    Signed-off-by: Brenda J. Butler
    Acked-by: Lucas Bates
    Signed-off-by: David S. Miller

    Brenda J. Butler
     
  • Move the functionality of creating a namespace before the test suite
    and destroying it afterwards to a plugin.

    Signed-off-by: Brenda J. Butler
    Acked-by: Lucas Bates
    Signed-off-by: David S. Miller

    Brenda J. Butler
     
  • Move the functionality that checks for root permissions into a plugin.

    Signed-off-by: Brenda J. Butler
    Acked-by: Lucas Bates
    Signed-off-by: David S. Miller

    Brenda J. Butler
     
  • This should be a general test architecture, and yet allow specific
    tests to be done. Introduce a plugin architecture.

    An individual test has 4 stages, setup/execute/verify/teardown. Each
    plugin gets a chance to run a function at each stage, plus one call
    before all the tests are called ("pre" suite) and one after all the
    tests are called ("post" suite). In addition, just before each
    command is executed, the plugin gets a chance to modify the command
    using the "adjust_command" hook. This makes the test suite quite
    flexible.

    Future patches will take some functionality out of the tdc.py script and
    place it in plugins.

    To use the plugins, place the implementation in the plugins directory
    and run tdc.py. It will notice the plugins and use them.

    Signed-off-by: Brenda J. Butler
    Acked-by: Lucas Bates
    Signed-off-by: David S. Miller

    Brenda J. Butler
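
The hook scheme described above can be sketched roughly as follows. This is an illustration, not the tdc.py code: apart from `adjust_command`, which the message names, the hook names (`pre_suite`, `post_suite`, `pre_stage`, `post_stage`), the test dictionary layout and `run_suite` itself are assumptions made for the example.

```python
STAGES = ("setup", "execute", "verify", "teardown")


class Plugin:
    """Base class: every hook is a no-op unless a plugin overrides it."""

    def pre_suite(self, tests):
        pass

    def post_suite(self):
        pass

    def pre_stage(self, stage, test):
        pass

    def post_stage(self, stage, test):
        pass

    def adjust_command(self, stage, command):
        return command


class PrefixPlugin(Plugin):
    """Example plugin: wrap every command that runs in the 'execute' stage."""

    def __init__(self, prefix):
        self.prefix = prefix

    def adjust_command(self, stage, command):
        if stage == "execute":
            return f"{self.prefix} {command}"
        return command


def run_suite(tests, plugins, run_command):
    """Run each test through the four stages, calling plugin hooks around them."""
    for plugin in plugins:
        plugin.pre_suite(tests)
    for test in tests:
        for stage in STAGES:
            for plugin in plugins:
                plugin.pre_stage(stage, test)
            for command in test.get(stage, []):
                # Each plugin may rewrite the command just before it is run.
                for plugin in plugins:
                    command = plugin.adjust_command(stage, command)
                run_command(command)
            for plugin in plugins:
                plugin.post_stage(stage, test)
    for plugin in plugins:
        plugin.post_suite()
```

A plugin that prefixes the command under test, for instance with a memory checker, only needs to override `adjust_command`; the runner itself stays unchanged.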
     
  • Split the test_runner function into the loop part (test_runner)
    and the contents (run_one_test) for maintainability.
    It makes it a little easier to catch exceptions
    in an individual test, and keep going (and flush a bunch
    of tap results for the skipped tests).

    Signed-off-by: Brenda J. Butler
    Acked-by: Lucas Bates
    Signed-off-by: David S. Miller

    Brenda J. Butler
     
  • Separate the functionality of the command line parameters into "selection"
    parameters, "action" parameters and other parameters.

    "Selection" parameters are for choosing which tests to act on.
    "Action" parameters are for choosing what to do with the selected tests.
    "Other" parameters are for global effect (like "help" or "verbose").

    With this commit, we add the ability to name a directory as another
    selection mechanism. We can accumulate a number of tests by directory,
    file, category, or even by test id, instead of being constrained to
    run all tests in one collection or just one test.

    Signed-off-by: Brenda J. Butler
    Acked-by: Lucas Bates
    Signed-off-by: David S. Miller

    Brenda J. Butler
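
As an illustration of the three parameter classes, here is a sketch using `argparse` argument groups. The option letters and names are assumptions chosen for the example and may not match the options tdc.py actually defines.

```python
import argparse


def build_parser():
    """Group options into selection, action and other parameters."""
    parser = argparse.ArgumentParser(description="selection/action/other sketch")

    selection = parser.add_argument_group("selection", "which tests to act on")
    selection.add_argument("-D", "--directory", nargs="+", default=[])
    selection.add_argument("-f", "--file", nargs="+", default=[])
    selection.add_argument("-c", "--category", nargs="+", default=[])
    selection.add_argument("-e", "--execute", nargs="+", default=[], metavar="ID")

    action = parser.add_argument_group("action", "what to do with the selected tests")
    action.add_argument("-l", "--list", action="store_true")
    action.add_argument("-s", "--show", action="store_true")

    # "Other" parameters have a global effect and stay ungrouped.
    parser.add_argument("-v", "--verbose", action="store_true")
    return parser
```

Because every selection option accepts several values and they can be combined, tests can be accumulated from directories, files, categories and ids in a single invocation.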
     
  • Kirill Tkhai says:

    ====================
    net: Add ioctl() SIOCGSKNS cmd to allow obtaining net ns of tun device

    Currently, it's not possible to get or check the net namespace
    that was used to create a tun socket. A user may have two tun
    devices with the same name in different net namespaces, and there
    is no way to tell them apart.

    The patchset adds support for ioctl() cmd SIOCGSKNS for tun
    devices. It will allow people to obtain a net namespace file
    descriptor, as we already allow for sockets in general.

    v2: Add new patch [2/3] to export open_related_ns().
    ====================

    Signed-off-by: Kirill Tkhai
    Signed-off-by: David S. Miller

    David S. Miller
     
  • This patch adds the possibility to get a tun device's net namespace fd
    in the same way as we allow for sockets.

    Socket ioctl numbers do not intersect with tun-specific ones, and
    SIOCSIFHWADDR is already used in the tun code. So the SIOCGSKNS number
    is chosen instead of a custom-made one for this functionality.

    Note that open_related_ns() uses plain get_net_ns() and it's safe
    (net can't be already dead at this moment):

    tun socket is allocated via sk_alloc() with zero last arg (kern = 0).
    So, each alive socket increments net::count, and the socket is definitely
    alive during ioctl syscall.

    Also, a common variable net is introduced, so a small cleanup in
    TUNSETIFF is made.

    Signed-off-by: Kirill Tkhai
    Signed-off-by: David S. Miller

    Kirill Tkhai
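
From user space, the new ioctl could be exercised roughly as in the sketch below. It assumes `tun_fd` is an open /dev/net/tun descriptor that has already been attached to a device, and that the kernel includes this patch; the helper names are hypothetical. `SIOCGSKNS` is the existing value from `<linux/sockios.h>`.

```python
import fcntl
import os

SIOCGSKNS = 0x894C  # from <linux/sockios.h>


def tun_netns_inode(tun_fd):
    """Return the inode number identifying the tun device's net namespace."""
    # On success the ioctl returns a new file descriptor for the namespace.
    ns_fd = fcntl.ioctl(tun_fd, SIOCGSKNS)
    try:
        return os.fstat(ns_fd).st_ino
    finally:
        os.close(ns_fd)


def same_netns_as_caller(tun_fd):
    """Compare the tun device's namespace with the calling process's own."""
    return tun_netns_inode(tun_fd) == os.stat("/proc/self/ns/net").st_ino
```

Comparing inode numbers is one way to distinguish two tun devices that share a name but were created in different namespaces.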
     
  • This function will be used to obtain net of tun device.

    Signed-off-by: Kirill Tkhai
    Signed-off-by: David S. Miller

    Kirill Tkhai
     
  • This function will be used to obtain net of tun device.

    Signed-off-by: Kirill Tkhai
    Signed-off-by: David S. Miller

    Kirill Tkhai
     

15 Feb, 2018

22 commits

  • Jeff Kirsher says:

    ====================
    40GbE Intel Wired LAN Driver Updates 2018-02-14

    This patch series enables the new mqprio hardware offload mechanism
    creating traffic classes on VFs for XL710 devices. The parameters
    needed to configure these traffic classes/queue channels are provided
    by the user via the tc tool. A maximum of four traffic classes can be
    created on each VF. This patch series also enables application of cloud
    filters to each of these traffic classes. The cloud filters are applied
    using the tc-flower classifier.

    Example:
    1. tc qdisc add dev vf0 root mqprio num_tc 4 map 0 0 0 0 1 2 2 3\
    queues 2@0 2@2 1@4 1@5 hw 1 mode channel
    2. tc qdisc add dev vf0 ingress
    3. ethtool -K vf0 hw-tc-offload on
    4. ip link set eth0 vf 0 spoofchk off
    5. tc filter add dev vf0 protocol ip parent ffff: prio 1 flower dst_ip\
    192.168.3.5/32 ip_proto udp dst_port 25 skip_sw hw_tc 2
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
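
To make the mqprio arguments in the example easier to read: `map` assigns each priority to a traffic class, and each `count@offset` pair in `queues` gives that class a contiguous range of queues. The small sketch below decodes the values used above; the function names are made up for illustration.

```python
def parse_queues(spec):
    """Turn 'count@offset' pairs into one range of queue indices per traffic class."""
    ranges = []
    for pair in spec.split():
        count, offset = (int(part) for part in pair.split("@"))
        ranges.append(range(offset, offset + count))
    return ranges


def queues_for_priority(prio, prio_map, queue_ranges):
    """Look up the traffic class for a priority, then that class's queues."""
    tc = prio_map[prio]
    return tc, list(queue_ranges[tc])


prio_map = [0, 0, 0, 0, 1, 2, 2, 3]             # "map 0 0 0 0 1 2 2 3"
queue_ranges = parse_queues("2@0 2@2 1@4 1@5")  # "queues 2@0 2@2 1@4 1@5"
```

With these values, priority 4 lands in traffic class 1 (queues 2 and 3), and the filter's `hw_tc 2` steers matching traffic to traffic class 2, which owns queue 4.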
     
  • In kcm_attach strp_done is called when sk_user_data is already
    set to fail the attach. strp_done needs the strp to be stopped and
    warns if it isn't. Call strp_stop in this case to eliminate the
    warning message.

    Reported-by: syzbot+88dfb55e4c8b770d86e3@syzkaller.appspotmail.com
    Fixes: e5571240236c5652f ("kcm: Check if sk_user_data already set in kcm_attach")
    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     
  • Add documentation of ti,clk-output-sel which can be used to select
    a specific clock for CLK_OUT.

    Signed-off-by: Wadim Egorov
    Signed-off-by: Daniel Schultz
    Reviewed-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Wadim Egorov
     
  • The DP83867 has a muxing option for the CLK_OUT pin. It is possible
    to set CLK_OUT for different channels.
    Create a binding to select a specific clock for CLK_OUT pin.

    Signed-off-by: Wadim Egorov
    Signed-off-by: Daniel Schultz
    Reviewed-by: Andrew Lunn
    Reviewed-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Wadim Egorov
     
  • Currently, the default link tolerance set in struct tipc_bearer only
    has effect on links going up after that moment. I.e., a user has to
    reset all the node's links across that bearer to have the new value
    applied. This is too limiting and disturbing on a running cluster to
    be useful.

    We now change this so that also already existing links are updated
    dynamically, without any need for a reset, when the bearer value is
    changed. We leverage the already existing per-link functionality
    for this to achieve the wanted effect.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     
  • Rahul Lakkireddy says:

    ====================
    cxgb4: speed up reading on-chip memory

    This series of patches speeds up reading on-chip memory (EDC and MC)
    by reading 64 bits at a time.

    Patch 1 reworks logic to read EDC and MC.

    Patch 2 adds logic to read EDC and MC 64-bits at a time.

    v2:
    - Dropped AVX CPU intrinsic instructions.
    - Use readq() to read 64-bits at a time.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Use readq() (via t4_read_reg64()) to read 64 bits at a time.
    Read the residual in 32-bit multiples.

    Signed-off-by: Rahul Lakkireddy
    Signed-off-by: Ganesh Goudar
    Signed-off-by: David S. Miller

    Rahul Lakkireddy
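
The access pattern can be modelled as follows. This is only a sketch over a Python bytes object, not the driver code: real hardware reads go through MMIO accessors, and the little-endian layout and the returned read counters are assumptions made for the example.

```python
import struct


def read_region(buf, offset, length):
    """Copy `length` bytes using 64-bit reads first, then 32-bit reads for the rest."""
    if length % 4:
        raise ValueError("length must be a multiple of 4 bytes")
    out = bytearray()
    end = offset + length
    reads64 = reads32 = 0
    while end - offset >= 8:
        (value,) = struct.unpack_from("<Q", buf, offset)  # one 64-bit read
        out += struct.pack("<Q", value)
        offset += 8
        reads64 += 1
    while offset < end:
        (value,) = struct.unpack_from("<I", buf, offset)  # residual 32-bit read
        out += struct.pack("<I", value)
        offset += 4
        reads32 += 1
    return bytes(out), reads64, reads32
```

For a 20-byte region this needs two 64-bit reads and one 32-bit read, instead of five 32-bit reads.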
     
  • Rework logic to read EDC and MC. Do 32-bit reads at a time.

    Signed-off-by: Rahul Lakkireddy
    Signed-off-by: Ganesh Goudar
    Signed-off-by: David S. Miller

    Rahul Lakkireddy
     
  • IPv4 uses set_lwt_redirect to set the lwtunnel redirect functions as
    needed. Move it to lwtunnel.h as lwtunnel_set_redirect and change
    IPv6 to also use it.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • Andrew Lunn says:

    ====================
    PTP support for DSA and mv88e6xxx driver.

    This patchset adds support for using the PTP hardware in switches
    supported by the mv88e6xxx driver. The code was produced in
    collaboration with Brandon Streiff doing the initial implementation,
    and then Richard Cochran and Andrew Lunn making further changes and
    cleanups.

    The code is sufficient to use ptp4l on a single DSA interface, either
    as a master or a slave. Due to the use of an MDIO bus to access the
    switch, reading hardware timestamps is slower than what ptp4l
    expects. Thus it is necessary to use the option
    --tx_timestamp_timeout=32. Heavy use of ethtool -S, or bridge fdb show
    can also upset ptp4l. Patches to address this will follow.

    Further work is required to support bridges using Boundary Clock or
    Transparent Clock mode.

    Since the RFC, an overflow bug has been fixed. Brandon Streiff
    has also Acked-by: the updates to his initial patchset.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • 88E6341 devices default to timestamping at the PHY, but due to a
    hardware issue, timestamps via this component are unreliable. For
    this family, configure the PTP hardware to force the timestamping
    to occur at the MAC.

    Signed-off-by: Brandon Streiff
    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Brandon Streiff
     
  • This patch implements RX/TX timestamping support.

    The Marvell PTP hardware supports RX timestamping individual message
    types, but for simplicity we only support the EVENT receive filter since
    few if any clients bother with the more specific filter types.

    checkpatch and reverse Christmas tree changes by Andrew Lunn.

    Re-factor duplicated code paths and avoid IfOk anti-pattern, use the
    common ptp worker thread from the class layer and time stamp UDP/IPv4
    frames as well as Layer-2 frames by Richard Cochran.

    Signed-off-by: Brandon Streiff
    Signed-off-by: Andrew Lunn
    Signed-off-by: Richard Cochran
    Signed-off-by: David S. Miller

    Brandon Streiff
     
  • Forward the rx/tx timestamp machinery from the dsa infrastructure to the
    switch driver.

    On the rx side, defer delivery of skbs until we have an rx timestamp.
    This mimics the behavior of skb_defer_rx_timestamp.

    On the tx side, identify PTP packets, clone them, and pass them to the
    underlying switch driver before we transmit. This mimics the behavior
    of skb_tx_timestamp.

    Adjusted txstamp API to keep the allocation and freeing of the clone
    in the same central function by Richard Cochran

    Signed-off-by: Brandon Streiff
    Signed-off-by: Richard Cochran
    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Brandon Streiff
     
  • This patch adds support to the dsa slave network device so that
    switch drivers can implement the SIOC[GS]HWTSTAMP ioctls and the
    ethtool timestamp-info interface.

    Signed-off-by: Brandon Streiff
    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Brandon Streiff
     
  • This patch adds support for configuring mv88e6xxx GPIO lines as PTP
    pins, so that they may be used for time stamping external events or for
    periodic output.

    Checkpatch and reverse Christmas tree fixes by Andrew Lunn

    Periodic output removed by Richard Cochran, until a better abstraction
    of a VCO is added to Linux in general.

    Signed-off-by: Brandon Streiff
    Signed-off-by: Andrew Lunn
    Signed-off-by: Richard Cochran
    Signed-off-by: David S. Miller

    Brandon Streiff
     
  • MV88E6352 and later switches support GPIO control through the "Scratch
    & Misc" global2 register. (Older switches do too, though with a slightly
    different register interface. Only the 6352-style is implemented here.)

    Add a new file, global2_scratch.c, for operations in the Scratch & Misc
    space. Additionally, add a GPIO operations structure to present an
    abstract view over GPIO manipulation.

    Reverse Christmas tree ordering applied and unsigned replaced with
    unsigned int by Andrew Lunn.

    Signed-off-by: Brandon Streiff
    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Brandon Streiff
     
  • This patch adds basic support for exposing the 32-bit timestamp counter
    inside the mv88e6xxx switch as a ptp_clock.

    Adjfine implemented by Richard Cochran.
    Andrew Lunn: fix return value of PTP stub function.

    Signed-off-by: Brandon Streiff
    Signed-off-by: Richard Cochran
    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Brandon Streiff
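
A 32-bit free-running counter wraps around, so software that exposes it as a clock has to extend it. The message does not say how the driver does this; the sketch below shows the general technique of accumulating wrapped deltas, under the assumption that the counter is sampled at least once per wrap period. The class name is hypothetical.

```python
class CounterExtender:
    """Extend a wrapping 32-bit counter into a monotonically growing total."""

    MASK = 0xFFFFFFFF

    def __init__(self, initial_raw=0):
        self.last_raw = initial_raw & self.MASK
        self.total = 0

    def update(self, raw):
        """Fold a new raw reading into the running total of elapsed ticks."""
        # Masking the difference makes the delta correct across one wrap.
        delta = (raw - self.last_raw) & self.MASK
        self.last_raw = raw & self.MASK
        self.total += delta
        return self.total
```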
     
  • This patch implements support for accessing the Precision Time Protocol
    and Time Application Interface registers via the AVB register interface
    in the Global 2 register.

    The register interface differs slightly between different models; older
    models use a 3-bit operations field, while newer models use a 2-bit
    field. The operations values and the special "global port" values are
    different between the two. This is a similar split to the differences
    in the "Ingress Rate" register between models, so, like in that case,
    we call the two variants "6352" and "6390" and create an ops structure
    to abstract between the two.

    checkpatch fixups by Andrew Lunn

    Signed-off-by: Brandon Streiff
    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Brandon Streiff
     
  • Let the mv88e6xxx_g2_* register accessor functions be accessible
    outside of global2.c.

    Signed-off-by: Brandon Streiff
    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Brandon Streiff
     
  • When NET_PTP_CLASSIFY is disabled, a stub function is required in
    order that the drivers compile.

    Signed-off-by: Andrew Lunn
    Acked-by: Richard Cochran
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • 배석진 reported that in some situations, packets for a given 5-tuple
    end up being processed by different CPUs.

    This involves RPS, and fragmentation.

    배석진 is seeing packet drops when a SYN_RECV request socket is
    moved into ESTABLISHED state. Other states are protected by socket lock.

    This is caused by a CPU losing the race, and simply not caring enough.

    Since this seems to occur frequently, we can do better and perform
    a second lookup.

    Note that all needed memory barriers are already in the existing code,
    thanks to the spin_lock()/spin_unlock() pair in inet_ehash_insert()
    and reqsk_put(). The second lookup must find the new socket,
    unless it has already been accepted and closed by another cpu.

    Note that the fragmentation could be avoided in the first place by
    use of a correct TCP MSS option in the SYN{ACK} packet, but this
    does not mean we can not be more robust.

    Many thanks to 배석진 for a very detailed analysis.

    Reported-by: 배석진
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
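
The retry logic can be pictured with a single-threaded toy model. It is not the kernel code: the dictionary stands in for the established hash table, `promote` is a hypothetical callback that returns False when another CPU won the race, and the returned strings are only labels.

```python
def handle_packet(table, key, promote):
    """Process a packet for `key`, repeating the lookup once if promotion lost a race."""
    sock = table.get(key)
    if sock is None:
        return "drop"
    if sock["state"] != "SYN_RECV":
        return f"deliver:{sock['state']}"
    if promote(table, key, sock):
        return "deliver:ESTABLISHED"
    # Lost the race: another CPU already replaced the request socket.
    sock = table.get(key)  # second lookup
    if sock is None or sock["state"] == "SYN_RECV":
        return "drop"  # e.g. already accepted and closed elsewhere
    return f"deliver:{sock['state']}"
```

Without the second lookup, the losing CPU would have dropped a packet that a full socket was ready to receive.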
     
  • This patch provides support to add or delete cloud filter for queue
    channels created for ADq on VF.
    We are using the HW's cloud filter feature and programming it to act
    as a TC filter applied to a group of queues.

    There are two possible modes for a VF when applying a cloud filter:
    1. Basic Mode: Intended for filters that don't need the VF to be
    Trusted. This includes the following:
    Dest MAC + L4 port
    Dest MAC + VLAN + L4 port
    2. Advanced Mode: This mode is only for filters whose combination of
    fields requires the VF to be Trusted:
    Dest IP + L4 port

    When cloud filters are applied on a trusted VF and the same VF is
    later made untrusted, all cloud filters are deleted and have to be
    re-applied.
    Cloud filters are also deleted when queue channel is deleted.

    Testing-Hints:
    =============
    1. Adding Basic Mode filter should be possible on a VF in
    Non-Trusted mode.
    2. In Advanced mode all filters should be able to be created.

    Steps:
    ======
    1. Enable ADq and create TCs using TC mqprio command
    2. Apply cloud filter.
    3. Turn off the spoof check.
    4. Pass traffic.

    Example:
    ========
    1. tc qdisc add dev enp4s2 root mqprio num_tc 4 map 0 0 0 0 1 2 2 3\
    queues 2@0 2@2 1@4 1@5 hw 1 mode channel
    2. tc qdisc add dev enp4s2 ingress
    3. ethtool -K enp4s2 hw-tc-offload on
    4. ip link set ens261f0 vf 0 spoofchk off
    5. tc filter add dev enp4s2 protocol ip parent ffff: prio 1 flower\
    dst_ip 192.168.3.5/32 ip_proto udp dst_port 25 skip_sw hw_tc 2

    Signed-off-by: Avinash Dayanand
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher

    Avinash Dayanand
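
The two modes amount to a small policy check, sketched below. The field names and function names are invented for illustration, and rejecting combinations that the message does not list is an assumption of this sketch, not something the patch description states.

```python
BASIC_COMBINATIONS = {
    frozenset({"dst_mac", "l4_port"}),
    frozenset({"dst_mac", "vlan", "l4_port"}),
}
ADVANCED_COMBINATIONS = {
    frozenset({"dst_ip", "l4_port"}),
}


def filter_allowed(fields, vf_trusted):
    """Decide whether a VF may add a cloud filter matching on `fields`."""
    combination = frozenset(fields)
    if combination in BASIC_COMBINATIONS:
        return True            # basic filters need no trust
    if combination in ADVANCED_COMBINATIONS:
        return vf_trusted      # advanced filters need a trusted VF
    return False               # assumption: unlisted combinations are rejected


def filters_after_trust_change(filters, vf_trusted):
    """Per the description, revoking trust deletes every cloud filter on the VF."""
    return list(filters) if vf_trusted else []
```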