29 Jan, 2016

15 commits

  • Now that we use refcnt to check whether a transport is alive, the dead
    field can be removed from sctp_transport.

    The traversal of transport_addr_list in procfs dump is using
    list_for_each_entry_rcu, no need to check if it has been freed.

    sctp_generate_t3_rtx_event and sctp_generate_heartbeat_event are
    protected by the sock lock, so it's not necessary to check dead there
    either. Also, the timers are cancelled when sctp_transport_free() is
    called; it doesn't wait for refcnt to reach 0 to cancel them.

    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     
  • Previously, before rhashtable, /proc assoc listing was done by
    read-locking the entire hash entry and dumping all assocs at once, so we
    were sure that the assoc wasn't freed because it wouldn't be possible to
    remove it from the hash meanwhile.

    Now we use rhashtable to list transports, and dump entries one by one.
    That is, now we have to check if the assoc is still a good one, as the
    transport we got may be being freed.

    Signed-off-by: Xin Long
    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
     
  • Now when __sctp_lookup_association is running in BH, it will try to
    check if t->dead is set, but meanwhile other CPUs may be freeing this
    transport and this assoc and if it happens that
    __sctp_lookup_association checked t->dead a bit too early, it may think
    that the association is still good while it was already freed.

    So we fix this race by using atomic_add_unless in sctp_transport_hold.
    After we get one transport from hashtable, we will hold it only when
    this transport's refcnt is not 0, so that we can make sure t->asoc
    cannot be freed before we hold the asoc again.

    Note that sctp association is not freed using RCU so we can't use
    atomic_add_unless() with it as it may just be too late for that either.

    Fixes: 4f0087812648 ("sctp: apply rhashtable api to send/recv path")
    Reported-by: Vlad Yasevich
    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    Xin Long
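
The "hold only when refcnt is not 0" idea in the commit above can be sketched in userspace with C11 atomics. This is a minimal illustration, not the kernel code: `struct obj`, `obj_hold()` and `obj_put()` are hypothetical names, and the compare-and-swap loop stands in for the kernel's atomic_add_unless().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as a transport. */
struct obj {
    atomic_int refcnt;
};

/* Take a reference only if the count is still non-zero.
 * Returns true when the reference was taken. */
static bool obj_hold(struct obj *o)
{
    int old = atomic_load(&o->refcnt);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
            return true;
    }
    return false; /* already on its way to being freed */
}

/* Drop a reference; returns true when this was the last one. */
static bool obj_put(struct obj *o)
{
    int old = atomic_fetch_sub(&o->refcnt, 1);

    assert(old > 0);
    return old == 1;
}
```

A lookup path would call `obj_hold()` right after finding the object and treat a `false` return as "not found", so an object whose count already dropped to zero is never revived.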
     
  • Jiri Pirko says:

    ====================
    mlxsw: driver fixes

    Couple of various mlxsw driver fixes.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The rx_lane, tx_lane and module fields in the PMLP register don't have
    an additional offset besides the base one (0x04), so set it to 0x00.

    Fixes: 4ec14b7634b2 ("mlxsw: Add interface to access registers and process events")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • When dumping the FDB we can't compare the actual pointers of the ports
    structs, as it's possible the struct represents a vPort instead of the
    underlying physical port.

    Solve this by comparing the local port number instead, as it's shared
    between the physical port and all the vPorts on top of it.

    Fixes: 54a732018d8e ("mlxsw: spectrum: Adjust switchdev ops for VLAN devices")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • LAG FDB records can only point to LAG devices or VLAN devices configured
    on top of them. Therefore, when dumping the FDB we shouldn't associate
    these records with the underlying physical ports.

    Fixes: 8a1ab5d76639 ("mlxsw: spectrum: Implement FDB add/remove/dump for LAG")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • LAG FDB entries pointing to VLAN devices should be reported to the
    bridge with the matching VLAN device and not the underlying LAG device.

    Fixes: aac78a440887 ("mlxsw: spectrum: Adjust FDB notifications for VLAN devices")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • When dumping the hardware FDB we should report entries pointing to VLAN
    devices with VLAN 0, as packets coming into the bridge are untagged.
    Likewise, pass FDB_{ADD,DEL} notifications with VLAN 0 for these
    devices.

    Fixes: 54a732018d8e ("mlxsw: spectrum: Adjust switchdev ops for VLAN devices")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • When we disable learning on a bridge port we should still update the
    software bridge's FDB when an entry pointing to this bridge port is
    aged out. We can otherwise have an inconsistency between software and
    hardware tables.

    Fixes: 8a1ab5d76639 ("mlxsw: spectrum: Implement FDB add/remove/dump for LAG")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • When a port is put into LISTENING state it shouldn't populate the FDB,
    so set the port's STP state in hardware to DISCARDING instead of LEARNING.
    It will therefore keep listening to BPDU packets, but discard other
    non-control packets and won't perform any learning.

    Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • When STP state is set to DISABLED the port is assumed to be inactive, but
    currently we forward packets ingressing through it.

    Instead, set the port's STP state in hardware to DISCARDING, which means
    it doesn't forward packets or perform any learning, but it does trap
    control packets. However, these packets will be dropped by bridge code,
    which results in the expected behavior.

    Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
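
Taken together, the two STP commits above amount to a mapping from bridge STP state to hardware state. The following is a hedged sketch of such a mapping; the enum names are hypothetical, and mapping BLOCKING to DISCARDING is an assumption that the commit messages do not state.

```c
#include <assert.h>

/* Bridge-side STP states, named after the states mentioned in the commits. */
enum br_stp_state {
    BR_DISABLED,
    BR_LISTENING,
    BR_LEARNING,
    BR_FORWARDING,
    BR_BLOCKING,
};

/* Hypothetical hardware states; the commits name DISCARDING and LEARNING. */
enum hw_stp_state {
    HW_DISCARDING,
    HW_LEARNING,
    HW_FORWARDING,
};

static enum hw_stp_state hw_state_for(enum br_stp_state s)
{
    switch (s) {
    case BR_FORWARDING:
        return HW_FORWARDING;
    case BR_LEARNING:
        return HW_LEARNING;
    case BR_LISTENING: /* must not populate the FDB */
    case BR_DISABLED:  /* must not forward ingress packets */
    case BR_BLOCKING:  /* assumption: not stated in the commits */
        return HW_DISCARDING;
    }
    assert(!"unknown STP state");
    return HW_DISCARDING;
}
```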
     
  • As explained in previous commit, we should always take care of flushing
    the FDB in the driver and not rely on bridge code.

    We need to distinguish between two cases with regards to LAG:

    1) Port is leaving LAG while LAG is bridged (or VLAN devices on top of
    it). In this case don't flush the FDB entries pointing to the LAG ID, as
    this will affect other ports that are still members of the LAG. Only
    flush the FDB when the last port in the LAG is leaving the bridge.

    2) LAG device is leaving the bridge. In this case the CHANGEUPPER event
    is simply propagated to each member port, so make each port flush the
    FDB in its turn.

    Note that emptying a bridged LAG from ports creates an inconsistency
    between hardware and software. A user who later (< ageing_time)
    re-populates the LAG won't have any FDB entries pointing to the LAG ID
    in hardware, but they will be present in the software bridge's FDB.
    Currently there is no good solution to this problem, but this will be
    addressed by us in the future.

    In order to optimize the flushing process, flush by port or LAG ID if
    there are no VLAN interfaces on top of the port. Otherwise, flush using
    {Port / LAG ID, FID=VID} for each of the lower 4K FIDs. In the case of a
    VLAN device simply flush using {Port / LAG ID, vFID} with the vFID to
    which the VLAN device is mapped.

    Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • When removing a net device from a bridge we should flush the FDB entries
    associated with this net device. Up until now, we relied upon bridge
    code to do that for us, but it is possible for the user to prevent
    hardware from syncing with the software bridge (learning_sync=0), so we
    need to flush ourselves.

    Add the Switch Filtering DB Flush (SFDF) register that is used to flush
    FDB entries according to different parameters (per-port, per-FID etc).

    Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • It is possible for a user to remove a port from a LAG device, while the
    LAG device or VLAN devices on top of it are bridged. In these cases,
    bridge's teardown sequence is never issued, so we need to take care of
    it ourselves.

    When LAG's unlinking event is received by port netdev:

    1) Traverse its vPorts list and make those that are members of a bridge
    leave it. They will be deleted later by LAG code.

    2) Make the port netdev itself leave its bridge if it is a member of one.

    Fixes: 0d65fc13042f ("mlxsw: spectrum: Implement LAG port join/leave")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     

26 Jan, 2016

13 commits

  • Jeff Kirsher says:

    ====================
    Intel Wired LAN Driver Updates 2016-01-25

    This series contains updates to i40e only and so I won't continue receiving
    patches to fix the same issue (again).

    Arnd stops the compiler from complaining about possibly uninitialized
    variables in the driver by initializing those variables.

    Eric fixes the build errors/warnings which were introduced by Anjali
    when she added geneve support to i40e.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • intel/i40e/i40e_txrx.c: In function 'i40e_xmit_frame_ring':
    intel/i40e/i40e_txrx.c:2367:20: error: 'oiph' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    intel/i40e/i40e_txrx.c:2317:16: note: 'oiph' was declared here
    intel/i40e/i40e_txrx.c:2367:17: error: 'oudph' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    intel/i40e/i40e_txrx.c:2316:17: note: 'oudph' was declared here

    Signed-off-by: Arnd Bergmann
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher

    Arnd Bergmann
     
  • Fixes the following build warnings:

    drivers/net/ethernet/intel/i40e/i40e_main.c:7057:13: warning:
    'i40e_sync_udp_filters_subtask' defined but not used [-Wunused-function]
    drivers/net/ethernet/intel/i40e/i40e_main.c:8524:13: warning:
    'i40e_add_vxlan_port' defined but not used [-Wunused-function]
    drivers/net/ethernet/intel/i40e/i40e_main.c:8569:13: warning:
    'i40e_del_vxlan_port' defined but not used [-Wunused-function]
    drivers/net/ethernet/intel/i40e/i40e_main.c:8604:13: warning:
    'i40e_add_geneve_port' defined but not used [-Wunused-function]
    drivers/net/ethernet/intel/i40e/i40e_main.c:8651:13: warning:
    'i40e_del_geneve_port' defined but not used [-Wunused-function]

    Fixes: 6a899024058d ("i40e: geneve tunnel offload support")
    Signed-off-by: Eric Dumazet
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher

    Eric Dumazet
     
  • Since eliminating send_completion_tid from struct hv_netvsc_packet, we
    haven't added proper bookkeeping for the skb of the batched packet. This
    patch fixes this issue and allows the previous skb to be properly freed.
    Otherwise, a panic may happen.
    Thanks to Simon Xiao for bisecting and analysis.

    Signed-off-by: Haiyang Zhang
    Reviewed-by: K. Y. Srinivasan
    Signed-off-by: David S. Miller

    Haiyang Zhang
     
  • Recent changes to 'struct flow_keys' (e.g. commit d34af823ff40 ("net: Add
    VLAN ID to flow_keys")) introduced a performance regression in the netvsc
    driver. The problem is, however, not the above mentioned commit but the
    fact that the netvsc_set_hash() function made some assumptions about the
    struct flow_keys data layout, and this is wrong.

    Get rid of netvsc_set_hash() by switching to skb_get_hash(). This change
    will also imply switching to Jenkins hash from the currently used Toeplitz
    but it seems there is no good excuse for Toeplitz to stay.

    Signed-off-by: Vitaly Kuznetsov
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Vitaly Kuznetsov
     
  • When creating a SIT tunnel with ip tunnel, rtnl_link_ops is not set before
    ipip6_tunnel_create is called. When register_netdevice is called, there is
    no linkinfo attribute in the NEWLINK message because of that.

    Setting rtnl_link_ops before calling register_netdevice fixes that.

    Signed-off-by: Thadeu Lima de Souza Cascardo
    Signed-off-by: David S. Miller

    Thadeu Lima de Souza Cascardo
     
  • As Arnd Bergmann points out, using CONFIG_ARCH_MXC and/or SOC_IMX28
    is wrong if some other ARM platform uses this device - the operation
    of the driver would depend on an unrelated ARM platform that might
    or might not be set for multi-platform kernels.

    Prior to my previous patch, any other platforms using it would have
    been broken already due to having the cbd_datlen/cbd_sc fields in
    the wrong order, but byte ordering correctly, so no such platforms
    can exist and work today.

    In any case, it seems likely that only Freescale SoCs use this part,
    and those are little-endian on ARM, so CONFIG_ARM is safe for them.

    Signed-off-by: Johannes Berg
    Reviewed-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • We are getting many build warnings about:
    'bar_start' may be used uninitialized
    and
    'bar_len' may be used uninitialized

    They are not actually uninitialized as dfx_get_bars() will initialize
    them properly. But let's still initialize them just to satisfy the
    compiler (gcc 4.8.2).

    Signed-off-by: Sudip Mukherjee
    Acked-by: Maciej W. Rozycki
    Signed-off-by: David S. Miller

    Sudip Mukherjee
     
  • We are getting build warnings about:
    macb.c:2889:13: warning: 'tx_clk' may be used uninitialized in this function
    macb.c:2888:11: warning: 'hclk' may be used uninitialized in this function

    In reality they are not used uninitialized as clk_init() will initialize
    them, this patch will just silence the warning.

    Signed-off-by: Sudip Mukherjee
    Acked-by: Nicolas Ferre
    Signed-off-by: David S. Miller

    Sudip Mukherjee
     
  • The driver treats the device descriptors as CPU-endian, which appears
    to be correct with the default endianness on both ARM (typically LE)
    and PowerPC (typically BE) SoCs, indicating that the hardware block
    is generated differently. Add endianness annotations and byteswaps as
    necessary.

    It's not clear that the ifdef there really is correct and shouldn't
    just be #ifdef CONFIG_ARM, but I also can't test on anything but the
    i.MX6 HummingBoard where this gets it working with a BE kernel.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
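
The "endianness annotations and byteswaps" idea above can be illustrated with a portable userspace sketch. It assumes a descriptor stored little-endian in memory; `struct le_desc` and the helper names are hypothetical, and the kernel itself would use its own typed accessors rather than these byte-wise helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer descriptor whose fields are little-endian in memory. */
struct le_desc {
    uint8_t datlen[2];
    uint8_t sc[2];
    uint8_t bufaddr[4];
};

/* Read a 16-bit little-endian value, independent of CPU endianness. */
static uint16_t get_le16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Store a 16-bit value as little-endian bytes. */
static void put_le16(uint8_t *p, uint16_t v)
{
    assert(p != NULL);
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Read a 32-bit little-endian value, independent of CPU endianness. */
static uint32_t get_le32(const uint8_t *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Because every access goes through a helper, the same code works on a big-endian and a little-endian CPU, which is the property a CPU-endian struct access does not give.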
     
  • Since commit 76e398a62712 ("net: dsa: use switchdev obj for VLAN add/del
    ops"), the Marvell 88E6xxx switch has been unable to pass traffic
    between ports - any received traffic is discarded by the switch.
    Taking a port out of bridge mode and configuring a vlan on it also
    allows the port to start passing traffic.

    With the debugfs files re-instated to allow debug of this issue by
    comparing the register settings between the working and non-working
    case, the reason becomes clear:

           GLOBAL GLOBAL2 SERDES   0   1   2   3   4   5   6
    -  7:    1111    707f   2001   2   2   2   2   2   0   2
    +  7:    1111    707f   2001   1   1   1   1   1   0   1

    Register 7 for the ports is the default vlan tag register, and in the
    non-working setup, it has been set to 2, despite vlan 2 not being
    configured. This causes the switch to drop all packets coming in to
    these ports. The working setup has the default vlan tag register set
    to 1, which is the default vlan when none is configured.

    Inspection of the code reveals why. The code prior to this commit
    was:

    - for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
    ...
    -         if (!err && vlan->flags & BRIDGE_VLAN_INFO_PVID)
    -                 err = ds->drv->port_pvid_set(ds, p->port, vid);

    but the new code is:

    + for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
    ...
    + }
    ...
    + if (pvid)
    +         err = _mv88e6xxx_port_pvid_set(ds, port, vid);

    This causes the new code to always set the default vlan to one higher
    than the old code.

    Fix this.

    Fixes: 76e398a62712 ("net: dsa: use switchdev obj for VLAN add/del ops")
    Cc:
    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
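
The off-by-one described above comes from reading the loop variable after the loop has finished. A small userspace sketch shows the effect; `struct vlan_range` and both functions are hypothetical, and using `vid_end` is only one possible way to fix it, not necessarily what the commit does.

```c
#include <assert.h>

/* Hypothetical VLAN range, loosely modelled on the fields in the message. */
struct vlan_range {
    unsigned int vid_begin;
    unsigned int vid_end;
    int pvid; /* non-zero if the range carries the PVID flag */
};

/* Buggy shape: 'vid' is read after the loop, when it equals vid_end + 1. */
static unsigned int pvid_after_loop(const struct vlan_range *v)
{
    unsigned int vid;

    for (vid = v->vid_begin; vid <= v->vid_end; ++vid) {
        /* per-VID work would go here */
    }
    return v->pvid ? vid : 0;
}

/* One possible fix: name the last VID of the range explicitly. */
static unsigned int pvid_fixed(const struct vlan_range *v)
{
    assert(v->vid_begin <= v->vid_end);
    return v->pvid ? v->vid_end : 0;
}
```

For a range covering only VLAN 1, the first function yields 2 and the second yields 1, which matches the register dump in the commit message.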
     
  • This is an additional patch to the one already submitted recently.
    The previous patch was not complete, and the FCC port lock-up scenario
    has been reproduced in the lab.
    I had an opportunity to check the current patch in the lab: with it the
    FCC port no longer locks up, while with the previous patch alone it
    still does.
    The current patch fixes a pointer arithmetic bug (the second bug in the
    same line), which leads to FCC port lock-up during underrun/collision
    handling.
    Within the tx_startup() function in mac-fcc.c, the address of the last BD
    is not calculated correctly. As a result, the next transmitted BD may be
    set to an area outside the transmit BD ring. This causes the port to lock
    up, and it is not recoverable.

    Signed-off-by: Martin Roth
    Signed-off-by: David S. Miller

    Martin Roth
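
The commit above does not quote the faulty expression, so the following is only a generic sketch of computing the last descriptor of a ring with typed pointer arithmetic. `struct bd`, `last_bd()` and `next_bd()` are hypothetical names. The point it illustrates is that adding an integer to a typed pointer already scales by the element size, so the offset must be counted in elements, not bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer descriptor. */
struct bd {
    uint16_t sc;
    uint16_t datlen;
    uint32_t bufaddr;
};

/* Address of the last descriptor in a ring of 'ring' entries.
 * The offset is in elements: base + (ring - 1), with no sizeof scaling. */
static struct bd *last_bd(struct bd *base, size_t ring)
{
    assert(base != NULL && ring > 0);
    return base + (ring - 1);
}

/* Advance to the next descriptor, wrapping at the end of the ring. */
static struct bd *next_bd(struct bd *base, size_t ring, struct bd *cur)
{
    return cur == last_bd(base, ring) ? base : cur + 1;
}
```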
     
  • The ESP algorithms using CBC mode require echainiv. Hence INET*_ESP have
    to select CRYPTO_ECHAINIV in order to work properly. This solves the
    issues caused by a misconfiguration as described in [1].
    The original approach, patching crypto/Kconfig was turned down by
    Herbert Xu [2].

    [1] https://lists.strongswan.org/pipermail/users/2015-December/009074.html
    [2] http://marc.info/?l=linux-crypto-vger&m=145224655809562&w=2

    Signed-off-by: Thomas Egerer
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Thomas Egerer
     

25 Jan, 2016

5 commits

  • This patch extends commit b93d6471748d ("sctp: implement the sender side
    for SACK-IMMEDIATELY extension") as it didn't white list
    SCTP_SACK_IMMEDIATELY on sctp_msghdr_parse(), causing it to be
    understood as an invalid flag and returning -EINVAL to the application.

    Note that the actual handling of the flag is already there in
    sctp_datamsg_from_user().

    https://tools.ietf.org/html/rfc7053#section-7

    Fixes: b93d6471748d ("sctp: implement the sender side for SACK-IMMEDIATELY extension")
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
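
Whitelisting a flag, as described above, usually means adding it to a mask of accepted bits. A minimal sketch follows, assuming made-up flag values: the `F_*` constants and `check_send_flags()` are hypothetical and do not match the real SCTP definitions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag bits; the real SCTP values differ. */
#define F_UNORDERED        (1u << 0)
#define F_ADDR_OVER        (1u << 1)
#define F_ABORT            (1u << 2)
#define F_SACK_IMMEDIATELY (1u << 3)
#define F_EOF              (1u << 4)

/* Every flag the parser accepts; anything else is rejected. */
#define F_ALLOWED \
    (F_UNORDERED | F_ADDR_OVER | F_ABORT | F_SACK_IMMEDIATELY | F_EOF)

/* Compile-time check that the new flag really is in the whitelist. */
static_assert((F_ALLOWED & F_SACK_IMMEDIATELY) != 0,
              "F_SACK_IMMEDIATELY must be whitelisted");

/* Return 0 if only known flags are set, -EINVAL otherwise. */
static int check_send_flags(unsigned int flags)
{
    if (flags & ~F_ALLOWED)
        return -EINVAL;
    return 0;
}
```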
     
  • The napi_synchronize() function is defined twice: The definition
    for SMP builds waits for other CPUs to be done, while the uniprocessor
    variant just contains a barrier and ignores its argument.

    In the mvneta driver, this leads to a warning about an unused variable
    when we lookup the NAPI struct of another CPU and then don't use it:

    ethernet/marvell/mvneta.c: In function 'mvneta_percpu_notifier':
    ethernet/marvell/mvneta.c:2910:30: error: unused variable 'other_port' [-Werror=unused-variable]

    There are no other CPUs on a UP build, so that code never runs, but
    gcc does not know this.

    The nicest solution seems to be to turn the napi_synchronize() helper
    into an inline function for the UP case as well, as that leads gcc to
    not complain about the argument being unused. Once we do that, we can
    also combine the two cases into a single function definition and use
    if(IS_ENABLED()) rather than #ifdef to make it look a bit nicer.

    The warning first came up in linux-4.4, but I failed to catch it
    earlier.

    Signed-off-by: Arnd Bergmann
    Fixes: f86428854480 ("net: mvneta: Statically assign queues to CPUs")
    Signed-off-by: David S. Miller

    Arnd Bergmann
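
The pattern described above, a single inline function with a compile-time constant condition instead of two #ifdef variants, can be sketched in userspace as follows. `SKETCH_SMP`, `struct sketch_napi` and `sketch_synchronize()` are hypothetical stand-ins; the kernel uses IS_ENABLED() with a config symbol and a different wait loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical build-time switch standing in for a kernel config option. */
#ifndef SKETCH_SMP
#define SKETCH_SMP 0
#endif

struct sketch_napi {
    atomic_int scheduled;
};

/* One definition for both builds. The argument is always referenced, so the
 * compiler does not warn about unused variables at call sites, and the
 * constant condition lets it drop the wait loop when SKETCH_SMP is 0. */
static inline int sketch_synchronize(struct sketch_napi *n)
{
    int spins = 0;

    assert(n != NULL);
    if (SKETCH_SMP) {
        /* Multiprocessor build: wait for the other CPU to finish. */
        while (atomic_load(&n->scheduled))
            spins++;
    }
    return spins;
}
```

Unlike the #ifdef form, both branches are parsed and type-checked in every configuration, which also catches build breakage in the variant that is not currently enabled.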
     
  • Several times already this has been reported as kasan reports caused by
    syzkaller and trinity and people always looked at RCU races, but it is
    much simpler. :)

    In case we bind a pptp socket multiple times, we simply add it to
    the callid_sock list but don't remove the old binding. Thus the old
    socket stays in the bucket with unused call_id indexes and doesn't get
    cleaned up. This causes various forms of kasan reports which were hard
    to pinpoint.

    Simply don't allow multiple binds and correct error handling in
    pptp_bind. Also keep sk_state bits in place in pptp_connect.

    Fixes: 00959ade36acad ("PPTP: PPP over IPv4 (Point-to-Point Tunneling Protocol)")
    Cc: Dmitry Kozlov
    Cc: Sasha Levin
    Cc: Dmitry Vyukov
    Reported-by: Dmitry Vyukov
    Cc: Dave Jones
    Reported-by: Dave Jones
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
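
Rejecting a second bind and preserving state bits on connect, as described above, can be sketched like this. All names are hypothetical, and the choice of -EALREADY as the error code is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum {
    ST_BOUND     = 1 << 0,
    ST_CONNECTED = 1 << 1,
};

/* Hypothetical socket state. */
struct sketch_sock {
    unsigned int state;
    int call_id;
};

/* Bind once; a second bind is refused instead of adding a second entry
 * that would never be cleaned up. */
static int sketch_bind(struct sketch_sock *s, int call_id)
{
    assert(s != NULL);
    if (s->state & ST_BOUND)
        return -EALREADY; /* assumption: illustrative error code */
    s->call_id = call_id;
    s->state |= ST_BOUND;
    return 0;
}

/* Connect ORs its bit in rather than overwriting the state. */
static int sketch_connect(struct sketch_sock *s)
{
    assert(s != NULL);
    s->state |= ST_CONNECTED;
    return 0;
}
```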
     
  • For an interrupt controller that doesn't support irq_disable and
    hardware with level interrupts, an extra interrupt may be pending. This
    patch fixes the issue by setting the IRQ_DISABLE_UNLAZY flag for the
    interrupt line, as suggested by,

    'commit e9849777d0e2 ("genirq: Add flag to force mask in
    disable_irq[_nosync]()")'

    Signed-off-by: Iyappan Subramanian
    Tested-by: Toan Le
    Signed-off-by: David S. Miller

    Iyappan Subramanian
     
  • Dmitry reported a struct pid leak detected by a syzkaller program.

    Bug happens in unix_stream_recvmsg() when we break the loop when a
    signal is pending, without properly releasing scm.

    Fixes: b3ca9b02b007 ("net: fix multithreaded signal handling in unix recv routines")
    Reported-by: Dmitry Vyukov
    Signed-off-by: Eric Dumazet
    Cc: Rainer Weikusat
    Signed-off-by: David S. Miller

    Eric Dumazet
     

22 Jan, 2016

7 commits

  • The cgroup methods are no longer used after baac50bbc3cd ("net:
    tcp_memcontrol: simplify linkage between socket and page counter").
    The hunk to delete them was included in the original patch but must
    have gotten lost during conflict resolution on the way upstream.

    Fixes: baac50bbc3cd ("net: tcp_memcontrol: simplify linkage between socket and page counter")
    Signed-off-by: Johannes Weiner
    Signed-off-by: David S. Miller

    Johannes Weiner
     
  • When the lan87xx_read_status function is called, the
    energy detect mode is enabled again even if it has been
    disabled by device tree.

    Add a private struct to track the energy detect status.

    Signed-off-by: Teresa Remmet
    Signed-off-by: David S. Miller

    Teresa Remmet
     
  • Jisheng Zhang says:

    ====================
    net: mvneta: support more than one clk

    Some platforms may provide more than one clk for the mvneta IP, for
    example Marvell BG4CT provides "core" clk for the mac core, and "axi"
    clk for the AXI bus logic.

    This series tries to address the "more than one clk" issue. Note: to
    support BG4CT, we have lots of refactoring work to do, e.g. BG4CT doesn't
    have the mbus concept etc.

    Since v2:
    - Name the optional clock as "bus", which is a bit more flexible.

    Since v1:
    - Add Thomas Acks to patch1 and patch2.
    - make sure the headers are really sorted (some headers are still
    unsorted in v1).
    - disable axi clk before disabling core clk, thanks Thomas.
    - update dt binding as Thomas suggested.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: Jisheng Zhang
    Acked-by: Rob Herring
    Signed-off-by: David S. Miller

    Jisheng Zhang
     
  • Some platforms may provide more than one clk for the mvneta IP, for
    example Marvell BG4CT provides one clk for the mac core, and one
    clk for the AXI bus logic. Obviously this bus clk also needs to
    be enabled. This patch adds support for this optional "bus" clk.

    Signed-off-by: Jisheng Zhang
    Signed-off-by: David S. Miller

    Jisheng Zhang
     
  • Some platforms may provide more than one clk for the mvneta IP, for
    example Marvell BG4CT provides one clk for the mac core, and one
    clk for the AXI bus logic.

    To support more than one clock, we'll need to distinguish between
    the clocks by name. Change clock probing to first try to get the "core"
    clock before falling back to the unnamed clock.

    Signed-off-by: Jisheng Zhang
    Acked-by: Thomas Petazzoni
    Signed-off-by: David S. Miller

    Jisheng Zhang
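
The "try the named clock first, then fall back to the unnamed one" probing order can be sketched with a plain lookup table. This is a userspace illustration only: `struct sketch_clk`, `find_clk()` and `get_core_clk()` are hypothetical and do not use the kernel clk API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical clock entry; a NULL name models the legacy unnamed clock. */
struct sketch_clk {
    const char *name;
    int id;
};

/* Find a clock by name; pass NULL to look for the unnamed clock. */
static const struct sketch_clk *find_clk(const struct sketch_clk *tbl,
                                         size_t n, const char *name)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (name == NULL) {
            if (tbl[i].name == NULL)
                return &tbl[i];
        } else if (tbl[i].name != NULL && strcmp(tbl[i].name, name) == 0) {
            return &tbl[i];
        }
    }
    return NULL;
}

/* Prefer the "core" clock; fall back to the unnamed one for old bindings. */
static const struct sketch_clk *get_core_clk(const struct sketch_clk *tbl,
                                             size_t n)
{
    const struct sketch_clk *c = find_clk(tbl, n, "core");

    return c ? c : find_clk(tbl, n, NULL);
}
```

The fallback keeps device trees that describe a single unnamed clock working, while newer ones can list "core" and "bus" separately.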
     
  • Sorting the headers in alphabetic order will help to reduce the conflict
    when adding new headers in the future.

    Signed-off-by: Jisheng Zhang
    Acked-by: Thomas Petazzoni
    Signed-off-by: David S. Miller

    Jisheng Zhang