09 May, 2016

4 commits


08 May, 2016

3 commits


07 May, 2016

7 commits

  • When we fail to set the flooding configuration for the broadcast and
    unregistered multicast traffic, we should revert the flooding
    configuration of the unknown unicast traffic.

    Fixes: 0293038e0c36 ("mlxsw: spectrum: Add support for flood control")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • Make the leave procedure in the error path symmetric to the join
    procedure and first remove the port from the collector before
    potentially destroying the LAG.

    Fixes: 0d65fc13042f ("mlxsw: spectrum: Implement LAG port join/leave")
    Signed-off-by: Ido Schimmel
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • UDP tunnel segmentation code relies on the inner offsets being set for
    a UDP tunnel GSO packet, but the inner *_complete() functions will
    set the inner offsets only if 'encapsulation' is set before calling
    them. Currently, udp_gro_complete() sets 'encapsulation' only after
    the inner *_complete() functions are done. This leaves the inner
    offsets with invalid values after udp_gro_complete() returns, which
    in turn makes it impossible to properly segment the packet if it
    needs to be forwarded. The user would see this either as invalid
    packets being sent or as packet loss.

    This patch fixes this by setting skb's 'encapsulation' in
    udp_gro_complete() before calling into the inner complete functions,
    and by making each possible UDP tunnel gro_complete() callback set the
    inner_mac_header to the beginning of the tunnel payload.

    Signed-off-by: Jarno Rajahalme
    Reviewed-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Jarno Rajahalme
     
  • The setting of the UDP tunnel GSO type is already performed by
    udp[46]_gro_complete().

    Signed-off-by: Jarno Rajahalme
    Signed-off-by: David S. Miller

    Jarno Rajahalme
     
  • qede requires qed to provide enough resources to accommodate 16 combined
    channels, but that upper bound isn't actually being enforced by it.
    Instead, qed informs qede how many channels can be opened based on
    available resources, but that calculation doesn't really take into account
    the resources requested by qede; instead it considers other FW/HW available
    resources.

    As a result, if a user increases the number of channels to more than
    16 [e.g., using ethtool], the chip hangs.

    This change increases the resources requested by qede to 64 combined
    channels instead of 16; this value is an upper bound on the possible
    available channels [due to other FW/HW resources].

    Signed-off-by: Sudarsana Reddy Kalluru
    Signed-off-by: Yuval Mintz
    Signed-off-by: David S. Miller

    Sudarsana Reddy Kalluru
     
  • Responses for packets to unused ports are getting lost with L3 domains.

    IPv4 has ip_send_unicast_reply for sending TCP responses which accounts
    for L3 domains; update the IPv6 counterpart tcp_v6_send_response.
    For icmp the L3 master check needs to be moved up in icmp6_send
    to properly respond to UDP packets to a port with no listener.

    Fixes: ca254490c8df ("net: Add VRF support to IPv6 stack")
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • With the newly introduced helper functions the skb pulling is hidden
    in the checksumming function - and undone before returning to the
    caller.

    The IGMP and MLD query parsing functions in the bridge still
    assumed that the skb is pointing to the beginning of the IGMP/MLD
    message while it is now kept at the beginning of the IPv4/6 header.

    If there is a querier somewhere else, then this causes multicast
    snooping to stay disabled even though it could be enabled. Or, if we
    have the querier enabled too, then this can create unnecessary
    IGMP / MLD query messages on the link.

    Fix this by taking the offset between the IP and IGMP/MLD headers into
    account, too.

    Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
    Reported-by: Simon Wunderlich
    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
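
The IGMP/MLD parsing fix in the last entry above comes down to which byte the parser treats as the start of the message. The following is a minimal userspace sketch in Python, not the bridge code itself: the helper names and the hand-built packet are hypothetical, and it only models the IPv4/IGMP case. It shows that reading at offset 0 of a buffer that starts at the IPv4 header yields the IP version/IHL byte, while reading at the IP header length yields the IGMP type.

```python
import struct


def ipv4_header_len(buf):
    """IPv4 header length in bytes, from the IHL nibble of the first byte."""
    return (buf[0] & 0x0F) * 4


def igmp_type_at(buf, offset):
    """Return the byte at `offset`, interpreted as an IGMP message type."""
    return buf[offset]


def build_packet():
    """Build a hypothetical IPv4 + IGMP membership query (checksums left at 0)."""
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 28,          # version 4, IHL 5 (20 bytes), TOS, total length
        0, 0,                 # identification, flags/fragment offset
        1, 2, 0,              # TTL, protocol 2 (IGMP), header checksum
        bytes([10, 0, 0, 1]), bytes([224, 0, 0, 1]),
    )
    igmp = struct.pack("!BBH4s", 0x11, 100, 0, bytes(4))  # type 0x11 = query
    return ip + igmp


pkt = build_packet()
wrong = igmp_type_at(pkt, 0)                     # assumes buffer starts at IGMP
right = igmp_type_at(pkt, ipv4_header_len(pkt))  # accounts for the IP header
```

With the buffer kept at the IP header, only the second read sees the query type, which is the offset the commit message says must be taken into account.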
     

06 May, 2016

3 commits

  • get_bridge_ifindices() is used from the old "deviceless" bridge ioctl
    calls which aren't called with rtnl held. The comment above says that it is
    called with rtnl but that is not really the case.
    Here's a sample output from a test ASSERT_RTNL() which I put in
    get_bridge_ifindices and executed "brctl show":
    [ 957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30)
    [ 957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G W O 4.6.0-rc4+ #157
    [ 957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
    [ 957.423009] 0000000000000000 ffff880058adfdf0 ffffffff8138dec5 0000000000000400
    [ 957.423009] ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32 0000000000000001
    [ 957.423009] 00007ffec1a444b0 0000000000000400 ffff880053c19130 0000000000008940
    [ 957.423009] Call Trace:
    [ 957.423009] [] dump_stack+0x85/0xc0
    [ 957.423009] [] br_ioctl_deviceless_stub+0x212/0x2e0 [bridge]
    [ 957.423009] [] sock_ioctl+0x22b/0x290
    [ 957.423009] [] do_vfs_ioctl+0x95/0x700
    [ 957.423009] [] SyS_ioctl+0x79/0x90
    [ 957.423009] [] entry_SYSCALL_64_fastpath+0x23/0xc1

    Since it only reads bridge ifindices, we can use rcu to safely walk the net
    device list. Also remove the wrong rtnl comment above.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • The peer may be expecting a reply having sent a request and then done a
    shutdown(SHUT_WR), so tearing down the whole socket at this point seems
    wrong and breaks for me with a client which does a SHUT_WR.

    Looking at other socket families' stream_recvmsg callbacks, doing a
    shutdown here does not seem to be the norm, and removing it does not seem
    to have had any adverse effects that I can see.

    I'm using Stefan's RFC virtio transport patches; I'm unsure of the impact
    on the vmci transport.

    Signed-off-by: Ian Campbell
    Cc: "David S. Miller"
    Cc: Stefan Hajnoczi
    Cc: Claudio Imbrenda
    Cc: Andy King
    Cc: Dmitry Torokhov
    Cc: Jorgen Hansen
    Cc: Adit Ranadive
    Cc: netdev@vger.kernel.org
    Signed-off-by: David S. Miller

    Ian Campbell
     
  • Use htons instead of unconditionally byte swapping nexthdr. On
    little-endian systems shifting the byte is correct behavior, but it
    results in incorrect csums on big-endian architectures.

    Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE')
    Signed-off-by: Daniel Jurgens
    Reviewed-by: Carol Soto
    Tested-by: Carol Soto
    Signed-off-by: Tariq Toukan
    Signed-off-by: David S. Miller

    Daniel Jurgens
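
The endianness point in the entry above can be illustrated from userspace. This is a rough Python sketch, not the mlx4 code: the function names are hypothetical, and the value 6 (TCP) is just a sample next-header number. It compares an unconditional shift by one byte with `socket.htons()`, which only swaps on little-endian hosts.

```python
import socket
import sys


def shifted(nexthdr):
    """Unconditional byte shift: matches network order only on little-endian hosts."""
    return (nexthdr << 8) & 0xFFFF


def portable(nexthdr):
    """Host-to-network conversion: a swap on little-endian, a no-op on big-endian."""
    return socket.htons(nexthdr)


nexthdr = 6  # sample value (TCP), chosen for illustration
agree = shifted(nexthdr) == portable(nexthdr)
# `agree` is True on little-endian hosts and False on big-endian ones.
host_order = sys.byteorder
```

On a big-endian machine the two results differ, which is the case where the shifted value would feed a wrong term into the checksum.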
     

05 May, 2016

8 commits

  • Michael Chan says:

    ====================
    bnxt_en: 2 bug fixes.

    Fix crash on ppc64 due to missing memory barrier and restore multicast
    after reset.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The multicast/all-multicast internal flags are not properly restored
    after device reset. This could lead to unreliable multicast operations
    after an ethtool configuration change for example.

    Call bnxt_mc_list_updated() and setup the vnic->mask in bnxt_init_chip()
    to fix the issue.

    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Michael Chan
     
  • The code determines if the next ring entry is valid before proceeding
    further to read the rest of the entry. The CPU can re-order and read
    the rest of the entry first, possibly reading a stale entry, if DMA
    of a new entry happens right after reading it. This issue can be
    readily seen on a ppc64 system, causing it to crash.

    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Michael Chan
     
  • Steffen Klassert says:

    ====================
    pull request (net): ipsec 2016-05-04

    1) The flowcache can hit an OOM condition if too
    many entries are in the gc_list. Fix this by
    counting the entries in the gc_list and refusing
    new allocations if the value is too high.

    2) The inner headers are invalid after a xfrm transformation,
    so reset the skb encapsulation field to ensure nobody tries
    to access the inner headers. Otherwise tunnel devices stacked
    on top of xfrm may build the outer headers based on wrong
    information.

    3) Add pmtu handling to vti, we need it to report
    pmtu information for locally generated packets.

    Please pull or let me know if there are problems.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The stack object “map” has a total size of 32 bytes. Its last 4
    bytes are padding generated by the compiler. These padding bytes are
    not initialized and are sent out via “nla_put”.

    Signed-off-by: Kangjie Lu
    Signed-off-by: David S. Miller

    Kangjie Lu
     
  • The stack object “info” has a total size of 12 bytes. Its last byte
    is padding which is not initialized and leaked via “put_cmsg”.

    Signed-off-by: Kangjie Lu
    Signed-off-by: David S. Miller

    Kangjie Lu
     
  • In the receive path a queue's work bit was cleared unconditionally even
    if fec_enet_rx_queue only read out a part of the available packets from
    the hardware. This resulted in not reading any packets in the next napi
    turn and so packets were delayed or lost.

    The obvious fix is to only clear a queue's bit when the queue was
    emptied.

    Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue")
    Signed-off-by: Uwe Kleine-König
    Reviewed-by: Lucas Stach
    Tested-by: Fugang Duan
    Acked-by: Fugang Duan
    Signed-off-by: David S. Miller

    Uwe Kleine-König
     
  • When probe bails out with an error, we try to unregister the
    netdev before we have even registered it. Fix the goto statements
    for that.

    Signed-off-by: Matthias Brugger
    Signed-off-by: David S. Miller

    Matthias Brugger
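
The fec receive-path fix in the list above (clear a queue's work bit only once the queue has been emptied) can be modelled with a small simulation. This is a simplified Python sketch under stated assumptions: `napi_poll`, the bitmask layout, and the shared budget are hypothetical stand-ins, not the driver's actual data structures.

```python
from collections import deque


def napi_poll(queues, work, budget, clear_only_when_empty):
    """Run one poll pass and return (new work bitmask, packets processed).

    queues: list of deques holding pending packets, one per RX queue.
    work:   bitmask with bit N set when queue N has pending work.
    budget: maximum number of packets to process in this pass.
    """
    processed = 0
    for qid, queue in enumerate(queues):
        bit = 1 << qid
        if not work & bit:
            continue  # queue not flagged, skip it
        while queue and processed < budget:
            queue.popleft()
            processed += 1
        # Old behaviour: always clear the bit. Fixed behaviour: clear it
        # only when nothing is left in the queue.
        if not clear_only_when_empty or not queue:
            work &= ~bit
    return work, processed
```

With five packets pending and a budget of three, the unconditional variant clears the bit after the first pass, so the next pass processes nothing and two packets stay stuck; the conditional variant keeps the bit set and drains the rest on the following pass.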
     

04 May, 2016

15 commits

  • Pull networking fixes from David Miller:
    "Some straggler bug fixes:

    1) Batman-adv DAT must consider VLAN IDs when choosing candidate
    nodes, from Antonio Quartulli.

    2) Fix botched reference counting of vlan objects and neigh nodes in
    batman-adv, from Sven Eckelmann.

    3) netem can crash when it sees GSO packets, the fix is to segment
    them upon ->enqueue. Fix from Neil Horman with help from Eric
    Dumazet.

    4) Fix VXLAN dependencies in mlx5 driver Kconfig, from Matthew
    Finlay.

    5) Handle VXLAN ops outside of rcu lock, via a workqueue, in mlx5,
    since it can sleep. Fix also from Matthew Finlay.

    6) Check mdiobus_scan() return values properly in pxa168_eth and macb
    drivers. From Sergei Shtylyov.

    7) If the netdevice doesn't support checksumming, disable
    segmentation. From Alexander Duyck.

    8) Fix races between RDS tcp accept and sending, from Sowmini
    Varadhan.

    9) In macb driver, probe MDIO bus before we register the netdev,
    otherwise we can try to open the device before it is really ready
    for that. Fix from Florian Fainelli.

    10) Netlink attribute size for ILA "tunnels" not calculated properly,
    fix from Nicolas Dichtel"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    ipv6/ila: fix nlsize calculation for lwtunnel
    net: macb: Probe MDIO bus before registering netdev
    RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.
    RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
    vxlan: Add checksum check to the features check function
    net: Disable segmentation if checksumming is not supported
    net: mvneta: Remove superfluous SMP function call
    macb: fix mdiobus_scan() error check
    pxa168_eth: fix mdiobus_scan() error check
    net/mlx5e: Use workqueue for vxlan ops
    net/mlx5e: Implement a mlx5e workqueue
    net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue
    net/mlx5: Unmap only the relevant IO memory mapping
    netem: Segment GSO packets on enqueue
    batman-adv: Fix reference counting of hardif_neigh_node object for neigh_node
    batman-adv: Fix reference counting of vlan object for tt_local_entry
    batman-adv: B.A.T.M.A.N V - make sure iface is reactivated upon NETDEV_UP event
    batman-adv: fix DAT candidate selection (must use vid)

    Linus Torvalds
     
  • Pull fuse fixes from Miklos Szeredi:
    "Fix a regression and update the MAINTAINERS entry for fuse"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
    fuse: update mailing list in MAINTAINERS
    fuse: Fix return value from fuse_get_user_pages()

    Linus Torvalds
     
  • The handler 'ila_fill_encap_info' adds one attribute: ILA_ATTR_LOCATOR.

    Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module")
    CC: Tom Herbert
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • The current sequence makes us register for a network device prior to
    registering and probing the MDIO bus which could lead to some unwanted
    consequences, like a thread of execution calling into ndo_open before
    register_netdev() returns, while the MDIO bus is not ready yet.

    Rework the sequence to register for the MDIO bus, and therefore attach
    to a PHY prior to calling register_netdev(), which implies reworking the
    error path a bit.

    Signed-off-by: Florian Fainelli
    Acked-by: Nicolas Ferre
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • Sowmini Varadhan says:

    ====================
    RDS: TCP: synchronization during connection startup

    This patch series ensures that the passive (accept) side of the
    TCP connection used for RDS-TCP is correctly synchronized with
    any concurrent active (connect) attempts for a given pair of peers.

    Patch 1 in the series makes sure that the t_sock in struct
    rds_tcp_connection is only reset after any threads in rds_tcp_xmit
    have completed (otherwise a null-ptr deref may be encountered).
    Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect()
    path.

    v2: review comments from Santosh Shilimkar, other spelling corrections
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • An arbitration scheme for duelling SYNs is implemented as part of
    commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
    outgoing socket in rds_tcp_accept_one()") which ensures that both nodes
    involved will arrive at the same arbitration decision. However, this
    needs to be synchronized with an outgoing SYN to be generated by
    rds_tcp_conn_connect(). This commit achieves the synchronization
    through the t_conn_lock mutex in struct rds_tcp_connection.

    The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring
    the t_conn_lock mutex. A SYN is sent out only if the RDS connection is
    not already UP (an UP would indicate that rds_tcp_accept_one() has
    completed 3WH, so no SYN needs to be generated).

    Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after
    acquiring the t_conn_lock mutex. The only acceptable states (to
    allow continuation of the arbitration logic) are UP (i.e., outgoing SYN
    was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent
    outgoing SYN before we saw incoming SYN).

    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • There is a race condition between rds_send_xmit -> rds_tcp_xmit
    and the code that deals with resolution of duelling syns added
    by commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
    outgoing socket in rds_tcp_accept_one()").

    Specifically, we may end up dereferencing a null pointer in rds_send_xmit
    if we have the interleaving sequence:

    rds_tcp_accept_one                  rds_send_xmit

                                        conn is RDS_CONN_UP, so
                                        invoke rds_tcp_xmit

                                        tc = conn->c_transport_data
    rds_tcp_restore_callbacks
      /* reset t_sock */
                                        null ptr deref from tc->t_sock

    The race condition can be avoided without adding the overhead of
    additional locking in the xmit path: have rds_tcp_accept_one wait
    for rds_tcp_xmit threads to complete before resetting callbacks.
    The synchronization can be done in the same manner as rds_conn_shutdown().
    First set the rds_conn_state to something other than RDS_CONN_UP
    (so that new threads cannot get into rds_tcp_xmit()), then wait for
    RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any
    threads in rds_tcp_xmit are done.

    Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
    outgoing socket in rds_tcp_accept_one()")
    Signed-off-by: Sowmini Varadhan
    Acked-by: Santosh Shilimkar
    Signed-off-by: David S. Miller

    Sowmini Varadhan
     
  • Alexander Duyck says:

    ====================
    Fixes for tunnel checksum and segmentation offloads

    This patch series is a subset of patches I had submitted for net-next. I
    plan to drop these two patches from the v3 of "Fix Tunnel features and
    enable GSO partial for several drivers" and I am instead submitting them
    for net since these are truly fixes and likely will need to be backported
    to stable branches.

    This series addresses 2 specific issues. The first is that we could
    request TSO on a v4 inner header while not supporting checksum offload of
    the outer IPv6 header. The second is that we could request an IPv6 inner
    checksum offload without validating that we could actually support an inner
    IPv6 checksum offload.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • We need to perform an additional check on the inner headers to determine if
    we can offload the checksum for them. Previously this check didn't occur
    so we would generate an invalid frame in the case of an IPv6 header
    encapsulated inside of an IPv4 tunnel. To fix this I added a secondary
    check to vxlan_features_check so that we can verify that we can offload the
    inner checksum.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     
  • The mlx4 and mlx5 drivers do not support IPv6 checksum offload for
    tunnels. With this being the case, we should disable GSO in addition to
    the checksum offload features when we find that a device cannot perform
    a checksum on a given packet type.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     
  • Since commit 3b9d6da67e11 ("cpu/hotplug: Fix rollback during error-out
    in __cpu_disable()") it is ensured that callbacks of CPU_ONLINE and
    CPU_DOWN_PREPARE are processed on the hotplugged CPU. Due to this SMP
    function calls are no longer required.

    Replace smp_call_function_single() with a direct call to
    mvneta_percpu_enable() or mvneta_percpu_disable(). The functions do
    not need to be called with interrupts disabled, therefore the
    smp_call_function_single() calling convention is not preserved.

    Cc: Thomas Petazzoni
    Cc: netdev@vger.kernel.org
    Signed-off-by: Anna-Maria Gleixner
    Signed-off-by: David S. Miller

    Anna-Maria Gleixner
     
  • Now mdiobus_scan() returns ERR_PTR(-ENODEV) instead of NULL if the PHY
    device ID was read as all ones. As this was not an error before, this
    value should be filtered out now in this driver.

    Fixes: b74766a0a0fe ("phylib: don't return NULL from get_phy_device()")
    Signed-off-by: Sergei Shtylyov
    Reviewed-by: Florian Fainelli
    Acked-by: Nicolas Ferre
    Signed-off-by: David S. Miller

    Sergei Shtylyov
     
  • Since mdiobus_scan() returns either an error code or NULL on error, the
    driver should check for both, not only for NULL, otherwise a crash is
    imminent...

    Reported-by: Arnd Bergmann
    Signed-off-by: Sergei Shtylyov
    Signed-off-by: David S. Miller

    Sergei Shtylyov
     
  • Pull HID fixes from Jiri Kosina:
    "Fixes for the HID subsystem:

    - regression fix for Wacom driver; commit introduced in 4.6-rc1
    mistakenly removed line that should be kept. Fix by Ping Cheng

    - two device-specific quirks, by Ping Cheng and Nazar Mokrynskyi"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
    HID: wacom: add missed stylus_in_proximity line back
    HID: Fix boot delay for Creative SB Omni Surround 5.1 with quirk
    HID: wacom: Add support for DTK-1651

    Linus Torvalds
     
  • Pull clk fix from Stephen Boyd:
    "One small bug fix for the imx6qp CAN clk definition that was causing
    failures and divisions by zero in the kernel on those devices"

    * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
    clk: imx6q: fix typo in CAN clock definition

    Linus Torvalds
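
For the "net: Disable segmentation if checksumming is not supported" change listed for 04 May above, the rule can be sketched as a feature-mask operation. This is a minimal Python illustration only: the flag names and bit values below are hypothetical and do not match the kernel's NETIF_F_* definitions, and `features_for_packet` is an invented helper.

```python
# Hypothetical flag bits for illustration; the kernel's feature bits differ.
F_IP_CSUM = 1 << 0
F_IPV6_CSUM = 1 << 1
F_TSO = 1 << 2
F_TSO6 = 1 << 3
F_OTHER = 1 << 4  # stands in for any unrelated feature

CSUM_MASK = F_IP_CSUM | F_IPV6_CSUM
GSO_MASK = F_TSO | F_TSO6


def features_for_packet(dev_features, can_checksum):
    """Return the features usable for a packet.

    If the device cannot checksum this packet type, drop the segmentation
    features together with the checksum features, since segmenting in
    hardware would require checksums the device cannot produce.
    """
    if can_checksum:
        return dev_features
    return dev_features & ~(CSUM_MASK | GSO_MASK)
```

Clearing only the checksum bits would leave segmentation offload requested for a packet the device cannot checksum, which is the situation the commit message describes avoiding.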