22 Oct, 2013

12 commits

  • In v3.9, commit 6fd6ce2056de2709 ("ipv6: Do not depend on rt->n in
    ip6_finish_output2().") changed the behaviour of ip6_finish_output2()
    such that the recently introduced rt6_nexthop() is used
    instead of an assigned neighbor.

    As rt6_nexthop() prefers rt6i_gateway only for gatewayed
    routes, this causes a problem for users like IPVS, xt_TEE and
    RAW(hdrincl) if they want to use a different address for routing
    than the destination address.

    Another case is when a redirect creates an RTF_DYNAMIC
    route without the RTF_GATEWAY flag: we ignore the rt6i_gateway
    in rt6_nexthop().

    Fix the above problems by considering the rt6i_gateway if
    present, so that traffic routed to an address on the local subnet is
    not wrongly diverted to the destination address.

    Thanks to Simon Horman and Phil Oester for spotting the
    problematic commit.

    Thanks to Hannes Frederic Sowa for his review and help in testing.

    Reported-by: Phil Oester
    Reported-by: Mark Brooks
    Signed-off-by: Julian Anastasov
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Julian Anastasov
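
The change described above can be modelled in a few lines of user-space Python. This is only a sketch of the decision, not the kernel code: the `Route` class, the function names and the "`::` means no gateway" convention are assumptions made for illustration. `RTF_GATEWAY` uses the value from the Linux route flag headers.

```python
import ipaddress
from dataclasses import dataclass

RTF_GATEWAY = 0x0002  # route flag value from the Linux UAPI headers
UNSPECIFIED = ipaddress.IPv6Address("::")  # assumption: "::" models "no gateway set"


@dataclass
class Route:
    """Toy stand-in for an IPv6 route entry (hypothetical, not the kernel struct)."""
    flags: int
    gateway: ipaddress.IPv6Address  # models rt6i_gateway


def nexthop_before(rt, daddr):
    # Behaviour the commit describes as problematic:
    # the gateway is honoured only for routes flagged RTF_GATEWAY.
    if rt.flags & RTF_GATEWAY:
        return rt.gateway
    return daddr


def nexthop_after(rt, daddr):
    # Behaviour after the fix: use the gateway whenever one is present.
    if rt.gateway != UNSPECIFIED:
        return rt.gateway
    return daddr
```

With a route that carries a gateway but no `RTF_GATEWAY` flag (the redirect and IPVS/xt_TEE cases), `nexthop_before()` falls back to the destination address, while `nexthop_after()` returns the gateway.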
     
  • Yuval Mintz says:

    ====================
    bnx2x: Bug fixes patch series

    This patch series contains fixes for various flows - several SR-IOV issues
    are fixed, ethtool callbacks (coalescing and register dump) are corrected,
    null pointer dereference on error flows is prevented, etc.

    Changes from V1
    ---------------
    - Patch 2 "bnx2x: Prevent an illegal pointer dereference during panic"
    is revised, with improved handling of edge cases.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Current driver implementation incorrectly sets the flag only if 64-bit
    DMA mask succeeded.

    Signed-off-by: Merav Sicron
    Signed-off-by: Yuval Mintz
    Signed-off-by: Ariel Elior
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Merav Sicron
     
  • As part of a register dump, the interface pretends to have the identity
    of other interfaces of the same physical device in order to perform
    HW configuration for them - specifically, it needs to prevent attentions
    from generating on those functions as the register dump accesses registers
    in common blocks whose reading might generate an attention.

    However, such pretension is unsafe - unlike other flows in which the driver
    uses pretend, during register dump there is no guarantee that no other HW
    access will take place (by other flows). If such an access takes place, the
    HW will be accessed by the wrong interface, leaving both functions in an
    incorrect state.

    This patch removes all pretensions from the register dump flow. Instead, it
    changes initial configuration of attentions such that no fatal attention will
    be generated for other functions as a result of the register dump
    (notice however, a debug print claiming an attention from other functions IS
    possible during the register dump)

    Signed-off-by: Dmitry Kravkov
    Signed-off-by: Yuval Mintz
    Signed-off-by: Ariel Elior
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Dmitry Kravkov
     
  • bnx2x has several clients to its DMAE machines - all of them with the exception
    of the statistics flow used the same locking mechanisms to synchronize the DMAE
    machines' usage.

    Since statistics (which are periodically entered) use DMAE without taking the
    locks, they may erase the commands which were previously set -
    e.g., it may cause a VF to timeout while waiting for a PF answer on the VF-PF
    channel as that command header would have been overwritten by the statistics'
    header.

    This patch makes certain that all flows utilizing DMAE will use the same
    API, assuring that the locking scheme will be kept by all said flows.

    Signed-off-by: Ariel Elior
    Signed-off-by: Yuval Mintz
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Ariel Elior
     
  • If debug message is open and bnx2x_vfop_qdtor_cmd() were to fail,
    the resulting print would have caused a null pointer dereference.

    Signed-off-by: Yuval Mintz
    Signed-off-by: Ariel Elior
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Yuval Mintz
     
  • Starting with commit b9871bc "bnx2x: VF RSS support - PF side", if a PF
    has SR-IOV support in its PCI configuration space, storage drivers will not
    work for that interface.

    This patch fixes the resource calculation to allow such a configuration to
    properly work.

    Signed-off-by: Ariel Elior
    Signed-off-by: Yuval Mintz
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Ariel Elior
     
  • bnx2x drivers configure coalescing incorrectly (e.g., as a result of a call
    to 'ethtool -c'). Although this is almost invisible to the user (due to NAPI)
    designated tests will show the configuration is incorrect.

    Signed-off-by: Dmitry Kravkov
    Signed-off-by: Yuval Mintz
    Signed-off-by: Ariel Elior
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Dmitry Kravkov
     
  • Current code returns upon failure, leaving the VF-PF channel in an unusable
    state. This patch adds the missing release so that further commands can pass
    between PF and VF.

    Signed-off-by: Ariel Elior
    Signed-off-by: Yuval Mintz
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Ariel Elior
     
  • During a panic, the driver tries to print the Management FW buffer of recent
    commands. To do so, the driver reads the address of that buffer from a known
    address. If the buffer is unavailable (e.g., PCI reads don't work, MCP is
    failing, etc.), the driver will try to access the address it has read, possibly
    causing a kernel panic.

    This check 'sanitizes' the access, validating the read value is indeed a valid
    address inside the management FW's buffers.
    The patch also removes a read outside the scope of the buffer, which resulted
    in some unrelated characters appearing in the log.

    Signed-off-by: Yuval Mintz
    Signed-off-by: Dmitry Kravkov
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Yuval Mintz
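
The kind of check described above (validate an address read from the device before dereferencing it, and never read past the end of the buffer) can be sketched as follows. The function names, the flat `memory` bytes object and the buffer bounds are hypothetical and only illustrate the idea. They do not mirror the driver's actual layout.

```python
def is_valid_trace_addr(addr, buf_start, buf_size):
    """Return True only if addr lies inside [buf_start, buf_start + buf_size)."""
    return buf_start <= addr < buf_start + buf_size


def read_trace(memory, addr, buf_start, buf_size):
    """Return the bytes from addr up to the end of the buffer, or None if addr is bogus."""
    if not is_valid_trace_addr(addr, buf_start, buf_size):
        return None  # refuse to follow a value that does not point into the buffer
    end = buf_start + buf_size
    return memory[addr:end]  # stop at the buffer end; never read beyond it
```

A garbage value such as `0xFFFFFFFF` (what a failed PCI read commonly returns) is rejected by the range check instead of being used as an address.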
     
  • bnx2x VFs do not support Multi-CoS; the current implementation
    erroneously sets the VFs' maximal number of CoS to be > 1.

    This will cause the driver to call alloc_etherdev_mqs() with
    a number of queues it cannot possibly support and reflects
    in 'odd' driver prints.

    Signed-off-by: Yuval Mintz
    Signed-off-by: Ariel Elior
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Yuval Mintz
     
  • When interrupt pacing is enabled, receive/transmit statistics are not
    updated properly by hardware, which leads to the ISR returning IRQ_NONE;
    in turn, the kernel disables the interrupt. This patch removes the checking
    of receive/transmit statistics from the ISR.

    This patch is verified with AM335x Beagle Bone Black; below is the
    kernel warning seen when interrupt pacing is enabled.

    [ 104.298254] irq 58: nobody cared (try booting with the "irqpoll" option)
    [ 104.305356] CPU: 0 PID: 1073 Comm: iperf Not tainted 3.12.0-rc3-00342-g77d4015 #3
    [ 104.313284] [] (unwind_backtrace+0x0/0xf0) from [] (show_stack+0x10/0x14)
    [ 104.322282] [] (show_stack+0x10/0x14) from [] (dump_stack+0x78/0x94)
    [ 104.330816] [] (dump_stack+0x78/0x94) from [] (__report_bad_irq+0x20/0xc0)
    [ 104.339889] [] (__report_bad_irq+0x20/0xc0) from [] (note_interrupt+0x1dc/0x23c)
    [ 104.349505] [] (note_interrupt+0x1dc/0x23c) from [] (handle_irq_event_percpu+0xc4/0x238)
    [ 104.359851] [] (handle_irq_event_percpu+0xc4/0x238) from [] (handle_irq_event+0x3c/0x5c)
    [ 104.370198] [] (handle_irq_event+0x3c/0x5c) from [] (handle_level_irq+0xac/0x10c)
    [ 104.379907] [] (handle_level_irq+0xac/0x10c) from [] (generic_handle_irq+0x20/0x30)
    [ 104.389812] [] (generic_handle_irq+0x20/0x30) from [] (handle_IRQ+0x4c/0xb0)
    [ 104.399066] [] (handle_IRQ+0x4c/0xb0) from [] (omap3_intc_handle_irq+0x60/0x74)
    [ 104.408598] [] (omap3_intc_handle_irq+0x60/0x74) from [] (__irq_svc+0x44/0x5c)
    [ 104.418021] Exception stack(0xde4f7c00 to 0xde4f7c48)
    [ 104.423345] 7c00: 00000001 00000000 00000000 dd002140 60000013 de006e54 00000002 00000000
    [ 104.431952] 7c20: de345748 00000040 c11c8588 00018ee0 00000000 de4f7c48 c009dfc8 c050d300
    [ 104.440553] 7c40: 60000013 ffffffff
    [ 104.444237] [] (__irq_svc+0x44/0x5c) from [] (_raw_spin_unlock_irqrestore+0x34/0x44)
    [ 104.454220] [] (_raw_spin_unlock_irqrestore+0x34/0x44) from [] (__irq_put_desc_unlock+0x14/0x38)
    [ 104.465295] [] (__irq_put_desc_unlock+0x14/0x38) from [] (enable_irq+0x4c/0x74)
    [ 104.474829] [] (enable_irq+0x4c/0x74) from [] (cpsw_poll+0xb8/0xdc)
    [ 104.483276] [] (cpsw_poll+0xb8/0xdc) from [] (net_rx_action+0xc0/0x1e8)
    [ 104.492085] [] (net_rx_action+0xc0/0x1e8) from [] (__do_softirq+0x100/0x27c)
    [ 104.501338] [] (__do_softirq+0x100/0x27c) from [] (do_softirq+0x68/0x70)
    [ 104.510224] [] (do_softirq+0x68/0x70) from [] (local_bh_enable+0xd0/0xe4)
    [ 104.519211] [] (local_bh_enable+0xd0/0xe4) from [] (tcp_rcv_established+0x450/0x648)
    [ 104.529201] [] (tcp_rcv_established+0x450/0x648) from [] (tcp_v4_do_rcv+0x154/0x474)
    [ 104.539195] [] (tcp_v4_do_rcv+0x154/0x474) from [] (release_sock+0xac/0x1ac)
    [ 104.548448] [] (release_sock+0xac/0x1ac) from [] (tcp_recvmsg+0x4d0/0xa8c)
    [ 104.557528] [] (tcp_recvmsg+0x4d0/0xa8c) from [] (inet_recvmsg+0xcc/0xf0)
    [ 104.566507] [] (inet_recvmsg+0xcc/0xf0) from [] (sock_recvmsg+0x90/0xb0)
    [ 104.575394] [] (sock_recvmsg+0x90/0xb0) from [] (SyS_recvfrom+0x88/0xd8)
    [ 104.584280] [] (SyS_recvfrom+0x88/0xd8) from [] (sys_recv+0x18/0x20)
    [ 104.592805] [] (sys_recv+0x18/0x20) from [] (ret_fast_syscall+0x0/0x48)
    [ 104.601587] handlers:
    [ 104.603992] [] cpsw_interrupt
    [ 104.608040] Disabling IRQ #58

    Cc: Sebastian Siewior
    Signed-off-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Mugunthan V N
     

20 Oct, 2013

6 commits

  • Jiri Pirko says:

    ====================
    UFO fixes

    Couple of patches fixing UFO functionality in different situations.

    v1->v2:
    - minor if{}else{} coding style adjustment suggested by Sergei Shtylyov
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Now, if user application does:
    sendto len<mtu flag MSG_MORE
    sendto len>mtu flag 0
    The skb is not treated as a fragmented one because it is not initialized
    that way. So move the initialization to fix this.

    introduced by:
    commit e89e9cf539a28df7d0eb1d0a545368e9920b34ac "[IPv4/IPv6]: UFO Scatter-gather approach"

    Signed-off-by: Jiri Pirko
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • Now, if user application does:
    sendto len<mtu flag MSG_MORE
    sendto len>mtu flag 0
    The skb is not treated as a fragmented one because it is not initialized
    that way. So move the initialization to fix this.

    introduced by:
    commit e89e9cf539a28df7d0eb1d0a545368e9920b34ac "[IPv4/IPv6]: UFO Scatter-gather approach"

    Signed-off-by: Jiri Pirko
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • If up->pending != 0, dontfrag is left at its default value of -1. As a
    result, an application that does:
    sendto len>mtu flag MSG_MORE
    sendto len>mtu flag 0
    will receive an EMSGSIZE errno as the result of the second sendto.

    This patch fixes it by respecting IPV6_DONTFRAG socket option.

    introduced by:
    commit 4b340ae20d0e2366792abe70f46629e576adaf5e "IPv6: Complete IPV6_DONTFRAG support"

    Signed-off-by: Jiri Pirko
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • When CONFIG_NETLABEL is disabled, the cipso_v4_validate() function could loop
    forever in the main loop if opt[opt_iter +1] == 0. This causes a kernel
    crash in an SMP system, since the CPU executing this function will
    stall and not respond to IPIs.

    This problem can be reproduced by running the IP Stack Integrity Checker
    (http://isic.sourceforge.net) using the following command on a Linux machine
    connected to DUT:

    "icmpsic -s rand -d -r 123456"
    wait (1-2 min)

    Signed-off-by: Seif Mazareeb
    Acked-by: Paul Moore
    Signed-off-by: David S. Miller

    Seif Mazareeb
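
The failure mode above is the classic one for type/length option walkers: a zero length byte means the iterator never advances. A minimal sketch of a walker that rejects such input is shown below. It is an illustration in Python with a hypothetical function name, not the kernel's cipso_v4_validate(), and it assumes each tag's length byte counts the whole tag including its type and length bytes.

```python
def validate_tags(opt: bytes, start: int = 0):
    """Walk type/length tags in opt.

    Returns None if every tag is well formed, otherwise the offset of the
    offending byte. Assumes the length byte at offset i + 1 covers the whole tag.
    """
    i = start
    n = len(opt)
    while i < n:
        if i + 1 >= n:
            return i  # truncated tag: the length byte is missing
        tag_len = opt[i + 1]
        # A zero length would leave i unchanged and spin forever;
        # a length running past the end of the option is equally invalid.
        if tag_len == 0 or tag_len > n - i:
            return i + 1
        i += tag_len
    return None
```

For example, `validate_tags(bytes([1, 0, 0, 0]))` reports offset 1 instead of looping.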
     
  • In the case of credentials passing in unix stream sockets (dgram
    sockets seem not affected), we get a rather sparse race after
    commit 16e5726 ("af_unix: dont send SCM_CREDENTIALS by default").

    We have a stream server on receiver side that requests credential
    passing from senders (e.g. nc -U). Since we need to set SO_PASSCRED
    on each spawned/accepted socket on server side to 1 first (as it's
    not inherited), it can happen that in the time between accept() and
    setsockopt() we get interrupted, the sender is being scheduled and
    continues with passing data to our receiver. At that time SO_PASSCRED
    is neither set on sender nor receiver side, hence in cmsg's
    SCM_CREDENTIALS we get eventually pid:0, uid:65534, gid:65534
    (== overflow{u,g}id) instead of what we actually would like to see.

    On the sender side, here nc -U, the tests in maybe_add_creds()
    invoked through unix_stream_sendmsg() would fail, as at that exact
    time, as mentioned, the sender has neither SO_PASSCRED on his side
    nor sees it on the server side, and we have a valid 'other' socket
    in place. Thus, sender believes it would just look like a normal
    connection, not needing/requesting SO_PASSCRED at that time.

    As reverting 16e5726 would not be an option due to the significant
    performance regression reported when having creds always passed,
    one way/trade-off to prevent that would be to set SO_PASSCRED on
    the listener socket and allow inheriting these flags to the spawned
    socket on server side in accept(). It seems also logical to do so
    if we'd tell the listener socket to pass those flags onwards, and
    would fix the race.

    Before, strace:

    recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
    msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
    cmsg_type=SCM_CREDENTIALS{pid=0, uid=65534, gid=65534}},
    msg_flags=0}, 0) = 5

    After, strace:

    recvmsg(4, {msg_name(0)=NULL, msg_iov(1)=[{"blub\n", 4096}],
    msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_SOCKET,
    cmsg_type=SCM_CREDENTIALS{pid=11580, uid=1000, gid=1000}},
    msg_flags=0}, 0) = 5

    Signed-off-by: Daniel Borkmann
    Cc: Eric Dumazet
    Cc: Eric W. Biederman
    Signed-off-by: David S. Miller

    Daniel Borkmann
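
From user space, the approach described above amounts to setting SO_PASSCRED once on the listening socket rather than on each accepted socket. The sketch below is Linux-only, and the helper name and socket path are made up for the example. It sets the option on the listener only, then reads SCM_CREDENTIALS on the accepted socket. On a kernel where the flag is not inherited, no credentials message would be delivered and the helper returns None for the credentials.

```python
import os
import socket
import struct
import tempfile

# struct ucred as delivered in SCM_CREDENTIALS: pid_t pid, uid_t uid, gid_t gid
UCRED = struct.Struct("iII")


def peer_creds_via_listener():
    """Set SO_PASSCRED on the listener only; return (data, (pid, uid, gid) or None)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv, \
                socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
            # The option is set before listen()/accept(), so there is no window
            # between accept() and setsockopt() for the sender to race with.
            srv.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)
            srv.bind(path)
            srv.listen(1)
            cli.connect(path)
            conn, _ = srv.accept()
            with conn:
                cli.sendall(b"blub\n")
                data, ancdata, _flags, _addr = conn.recvmsg(
                    4096, socket.CMSG_SPACE(UCRED.size))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
            return data, UCRED.unpack(cdata[:UCRED.size])
    return data, None
```

Because sender and receiver live in the same process here, the reported pid is expected to be the caller's own when credentials are delivered.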
     

19 Oct, 2013

8 commits

  • Validate the Tx queue only for adapters which support
    multiple Tx queues.

    This patch is to fix regression introduced in commit
    aa4a1f7df7cbb98797c9f4edfde3c726e2b3841f
    "qlcnic: Enable Tx queue changes using ethtool for 82xx Series adapter"

    Signed-off-by: Himanshu Madhani
    Signed-off-by: David S. Miller

    Himanshu Madhani
     
  • It is a required field for all TX_CREATE cmd versions > 0.
    This fixes a driver initialization failure, caused by recent SH-R Firmwares
    (versions > 10.0.639.0) failing the TX_CREATE cmd when if_id field is
    not passed.

    Signed-off-by: Sathya Perla
    Signed-off-by: David S. Miller

    Vasundhara Volam
     
  • The wanxl_ioctl() code fails to initialize the two padding bytes of
    struct sync_serial_settings after the ->loopback member. Add an explicit
    memset(0) before filling the structure to avoid the info leak.

    Signed-off-by: Salva Peiró
    Signed-off-by: David S. Miller

    Salva Peiró
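
The two padding bytes mentioned above can be made visible with ctypes. The layout below (two unsigned ints followed by an unsigned short) is written from memory of the kernel's struct sync_serial_settings and should be treated as an assumption, as should the sample field values. The `0xAA` fill stands in for stale stack contents.

```python
import ctypes


class SyncSerialSettings(ctypes.Structure):
    """Illustrative layout: 4 + 4 + 2 bytes of fields, padded to 12 bytes."""
    _fields_ = [
        ("clock_rate", ctypes.c_uint),
        ("clock_type", ctypes.c_uint),
        ("loopback", ctypes.c_ushort),
    ]


def fill(buf: bytearray, clear_first: bool) -> bytes:
    """Fill the struct in place over buf (simulated stale memory); return its raw bytes."""
    line = SyncSerialSettings.from_buffer(buf)
    if clear_first:
        # Equivalent of memset(&line, 0, sizeof(line)) before assigning fields.
        ctypes.memset(ctypes.addressof(line), 0, ctypes.sizeof(line))
    line.clock_rate = 64000  # arbitrary sample values
    line.clock_type = 1
    line.loopback = 0
    raw = bytes(line)
    del line  # drop the view onto buf
    return raw
```

Assigning every named member still leaves the trailing padding untouched, so without the memset the last two bytes copied out are whatever was in memory before.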
     
  • Toshiaki Makita says:

    ====================
    bridge: Fix problems around the PVID

    There seem to be some undesirable behaviors related to the PVID.
    1. It has no effect assigning PVID to a port. PVID cannot be applied
    to any frame regardless of whether we set it or not.
    2. FDB entries learned via frames applied PVID are registered with
    VID 0 rather than VID value of PVID.
    3. We can set 0 or 4095 as a PVID, neither of which is allowed in IEEE 802.1Q.
    This leads to interoperability problems such as sending frames with VID
    4095, which is not allowed in IEEE 802.1Q, and treating frames with VID
    0 as if they belong to VLAN 0, whereas IEEE 802.1Q expects them to be
    handled as having no VID.

    Note: 2nd and 3rd problems are potential and not exposed unless 1st problem
    is fixed, because we cannot activate PVID due to it.

    This is my analysis for each behavior.
    1. We are using VLAN_TAG_PRESENT bit when getting PVID, and not when
    adding/deleting PVID.
    It can be fixed in either way using or not using VLAN_TAG_PRESENT,
    but I think the latter is slightly more efficient.

    2. We are setting skb->vlan_tci with the value of PVID but the variable
    vid, which is used in FDB later, is set to 0 at br_allowed_ingress()
    when untagged frames arrive at a port with PVID valid. I'm afraid that
    vid should be updated to the value of PVID if PVID is valid.

    3. According to IEEE 802.1Q-2011 (6.9.1 and Table 9-2), we cannot use
    VID 0 or 4095 as a PVID.
    It looks like there is more to consider.

    - VID 0:
    VID 0 shall not be configured in any FDB entry and used in a tag header
    to indicate it is a 802.1p priority-tagged frame.
    Priority-tagged frames should be applied PVID (from IEEE 802.1Q 6.9.1).
    In my opinion, since we can filter incoming priority-tagged frames by
    deleting PVID, we don't need to filter them by vlan_bitmap.
    In other words, priority-tagged frames don't have VID 0 but have no VID,
    which is the same as untagged frames, and should be filtered by unsetting
    PVID.
    So, not only can we not set the PVID to 0, but we also don't need to add 0
    to vlan_bitmap, which lets us simply forbid adding VLAN 0.

    - VID 4095:
    VID 4095 shall not be transmitted in a tag header. This VID value may be
    used to indicate a wildcard match for the VID in management operations or
    FDB entries (from IEEE 802.1Q Table 9-2).
    In current implementation, we can create a static FDB entry with all
    existing VIDs by not specifying any VID when creating it.
    I don't think this way of adding wildcard-like entries needs to change,
    and VID 4095 appears to be of no use, so adding it can be rejected.

    Consequently, I believe what we should do for 3rd problem is below:
    - Not allowing VID 0 and 4095 to be added.
    - Applying PVID to priority-tagged (VID 0) frames.

    Note: It has been discovered that another problem related to priority-tags
    remains. If we use a vlan 0 interface such as eth0.0, we cannot communicate
    with another end station via a linux bridge.
    This problem exists regardless of whether this patch set is applied or not
    because we might receive untagged frames from another end station even if we
    are sending priority-tagged frames.
    This issue will be addressed by another patch set introducing an additional
    egress policy, on which Vlad Yasevich is working.
    See http://marc.info/?t=137880893800001&r=1&w=2 for detailed discussion.

    Patch set follows this mail.
    The order of patches is not the same as described above, because the way
    to fix 1st problem is based on the assumption that we don't use VID 0 as
    a PVID, which is realized by fixing 3rd problem.
    (1/4)(2/4): Fix 3rd problem.
    (3/4): Fix 1st problem.
    (4/4): Fix 2nd problem.

    v2:
    - Add descriptions about the problem related to priority-tags in cover letter.
    - Revise patch comments to reference the newest spec.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • We currently set the value that the variable vid points to, which will be
    used in the FDB later, to 0 at br_allowed_ingress() when we receive untagged
    or priority-tagged frames, even though the PVID is valid.
    This leads to wrong FDB updates: the entries are learned with
    VID 0.
    Update the value to that of the PVID if the PVID is applied.

    Signed-off-by: Toshiaki Makita
    Reviewed-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Toshiaki Makita
     
  • We are using the VLAN_TAG_PRESENT bit to detect whether the PVID is
    set or not at br_get_pvid(), while we don't care about the bit in
    adding/deleting the PVID, which makes it impossible to forward any
    incoming untagged frame with vlan_filtering enabled.

    Since vid 0 cannot be used for the PVID, we can use vid 0 to indicate
    that the PVID is not set, which is slightly more efficient than using
    the VLAN_TAG_PRESENT.

    Fix the problem by getting rid of using the VLAN_TAG_PRESENT.

    Signed-off-by: Toshiaki Makita
    Reviewed-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Toshiaki Makita
     
  • IEEE 802.1Q says that when we receive priority-tagged (VID 0) frames,
    we should use the PVID of the port as their VID.
    (See IEEE 802.1Q-2011 6.9.1 and Table 9-2)

    Apply the PVID to not only untagged frames but also priority-tagged frames.

    Signed-off-by: Toshiaki Makita
    Reviewed-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Toshiaki Makita
     
  • IEEE 802.1Q says that:
    - VID 0 shall not be configured as a PVID, or configured in any Filtering
    Database entry.
    - VID 4095 shall not be configured as a PVID, or transmitted in a tag
    header. This VID value may be used to indicate a wildcard match for the VID
    in management operations or Filtering Database entries.
    (See IEEE 802.1Q-2011 6.9.1 and Table 9-2)

    Don't accept adding these VIDs in the vlan_filtering implementation.

    Signed-off-by: Toshiaki Makita
    Reviewed-by: Vlad Yasevich
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Toshiaki Makita
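
Taken together, the four PVID patches above describe a small set of rules, which can be sketched as a user-space model. The function names, the use of `None` for "untagged" and for "drop", and the `allowed` set standing in for vlan_bitmap are all assumptions for illustration. This is not the bridge code itself.

```python
VLAN_N_VID = 4096  # size of the 12-bit VID space


def vid_is_configurable(vid: int) -> bool:
    """VIDs 0 and 4095 may not be added; 1..4094 may."""
    return 0 < vid < VLAN_N_VID - 1


def ingress_vid(frame_vid, pvid, allowed):
    """Return the VID used for filtering and FDB learning, or None to drop.

    frame_vid: None for an untagged frame, 0 for a priority-tagged frame.
    pvid:      0 means "no PVID set" (valid because 0 can never be a PVID).
    allowed:   set of VIDs configured on the port (stand-in for vlan_bitmap).
    """
    if frame_vid is None or frame_vid == 0:
        # Untagged and priority-tagged frames both take the PVID, and the
        # PVID (not 0) is what later FDB learning should see.
        return pvid if pvid != 0 else None
    return frame_vid if frame_vid in allowed else None
```

Using VID 0 as the "PVID not set" marker is what makes a separate presence bit unnecessary in this model.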
     

18 Oct, 2013

14 commits

  • Commit be4f154d5ef0ca147ab6bcd38857a774133f5450
    bridge: Clamp forward_delay when enabling STP
    had a typo when attempting to clamp maximum forward delay.

    It is possible to set bridge_forward_delay to be higher than the
    permitted maximum when STP is off. When turning STP on, the
    higher than allowed delay has to be clamped down to the max value.

    CC: Herbert Xu
    CC: Stephen Hemminger
    Signed-off-by: Vlad Yasevich
    Reviewed-by: Veaceslav Falico
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Vlad Yasevich
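
Clamping as described above is a two-sided bound. The sketch below uses limits of 2 and 30 seconds purely as assumed example values. The real limits live in the bridge code and are expressed in jiffies.

```python
MIN_FORWARD_DELAY = 2   # seconds; assumed example lower bound
MAX_FORWARD_DELAY = 30  # seconds; assumed example upper bound


def clamp_forward_delay(delay):
    """Force delay into [MIN_FORWARD_DELAY, MAX_FORWARD_DELAY]."""
    if delay < MIN_FORWARD_DELAY:
        return MIN_FORWARD_DELAY
    if delay > MAX_FORWARD_DELAY:  # the comparison direction matters here
        return MAX_FORWARD_DELAY
    return delay
```

A delay of 45 set while STP was off would come back as 30 once STP is enabled, and an in-range value is returned unchanged.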
     
  • sk_can_gso() should only be used as a hint in tcp_sendmsg() to build GSO
    packets in the first place. (As a performance hint)

    Once we have GSO packets in write queue, we can not decide they are no
    longer GSO only because flow now uses a route which doesn't handle
    TSO/GSO.

    Core networking stack handles the case very well for us, all we need
    is keeping track of packet counts in MSS terms, regardless of
    segmentation done later (in GSO or hardware)

    Right now, if tcp_fragment() splits a GSO packet in two parts,
    @left and @right, and route changed through a non GSO device,
    both @left and @right have pcount set to 1, which is wrong,
    and leads to incorrect packet_count tracking.

    This problem was added in commit d5ac99a648 ("[TCP]: skb pcount with MTU
    discovery")

    Signed-off-by: Eric Dumazet
    Signed-off-by: Neal Cardwell
    Signed-off-by: Yuchung Cheng
    Reported-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller

    Eric Dumazet
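
The accounting rule stated above (count packets in MSS terms, whatever the route's GSO capability) reduces to a ceiling division. The helpers below are a hypothetical model of that rule, not tcp_fragment() itself.

```python
def pcount(length: int, mss: int) -> int:
    """Number of MSS-sized segments needed to carry 'length' payload bytes."""
    if length <= mss:
        return 1
    return -(-length // mss)  # ceiling division


def split_pcounts(length: int, at: int, mss: int):
    """Packet counts of the @left and @right halves after splitting at 'at' bytes."""
    return pcount(at, mss), pcount(length - at, mss)
```

Splitting a 14480-byte GSO packet at 5792 bytes with an MSS of 1448 gives counts of 4 and 6. Setting both halves to 1 because the new route lacks GSO would under-count by 8 packets.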
     
  • TCP stack should make sure it owns skbs before mangling them.

    We had various crashes using bnx2x, and it turned out gso_size
    was cleared right before bnx2x driver was populating TC descriptor
    of the _previous_ packet send. TCP stack can sometimes retransmit
    packets that are still in Qdisc.

    Of course we could make bnx2x driver more robust (using
    ACCESS_ONCE(shinfo->gso_size) for example), but the bug is in the TCP stack.

    We have identified two points where skb_unclone() was needed.

    This patch adds a WARN_ON_ONCE() to warn us if we missed another
    fix of this kind.

    Kudos to Neal for finding the root cause of this bug. It's visible
    when using a small MSS.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Neal Cardwell
    Cc: Yuchung Cheng
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • John W. Linville says:

    ====================
    Please pull this batch of fixes intended for the 3.12 stream!

    For the mac80211 bits, Johannes says:

    "Jouni fixes a remain-on-channel vs. scan bug, and Felix fixes client TX
    probing on VLANs."

    And also:

    "This time I have two fixes from Emmanuel for RF-kill issues, and fixed
    two issues reported by Evan Huus and Thomas Lindroth respectively."

    On top of those...

    Avinash Patil adds a couple of mwifiex fixes to properly inform cfg80211
    about some different types of disconnects, avoiding WARNINGs.

    Mark Cave-Ayland corrects a pointer arithmetic problem in rtlwifi,
    avoiding incorrect automatic gain calculations.

    Solomon Peachy sends a cw1200 fix for locking around calls to
    cw1200_irq_handler, addressing "lost interrupt" problems.

    Please let me know if there are problems!
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This is a QMI device, manufactured by TCT Mobile Phones.
    A companion patch blacklisting this device's QMI interface in the option.c
    driver has been sent.

    Signed-off-by: Enrico Mioso
    Signed-off-by: Antonella Pellizzari
    Tested-by: Dan Williams
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Enrico Mioso
     
  • We used to schedule the refill work unconditionally after changing the
    number of queues. This may lead to an issue if the device is not
    up. Since we only try to cancel the work in ndo_stop(), the refill
    work may still run after the device has been removed. Fix this by
    scheduling the work only when the device is up.

    The bug was introduced by commit 9b9cd8024a2882e896c65222aa421d461354e3f2
    (virtio-net: fix the race between channels setting and refill).

    Cc: Rusty Russell
    Cc: Michael S. Tsirkin
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller

    Jason Wang
     
  • We're trying to re-configure the affinity unconditionally in the cpu hotplug
    callback. This may lead to an issue when resuming from s3/s4 since

    - virt queues haven't been allocated at that time.
    - it's unnecessary since the thaw method will re-configure the affinity.

    Fix this issue by checking config_enable and doing nothing if we're not ready.

    The bug was introduced by commit 8de4b2f3ae90c8fc0f17eeaab87d5a951b66ee17
    (virtio-net: reset virtqueue affinity when doing cpu hotplug).

    Cc: Rusty Russell
    Cc: Michael S. Tsirkin
    Cc: Wanlong Gao
    Acked-by: Michael S. Tsirkin
    Reviewed-by: Wanlong Gao
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller

    Jason Wang
     
  • We overwrite the ->bitrate with the user supplied information on the
    next line.

    Signed-off-by: Dan Carpenter
    Signed-off-by: David S. Miller

    Dan Carpenter
     
  • We cap bitrate at YAM_MAXBITRATE in yam_ioctl(), but it could also be
    negative. I don't know the impact of using a negative bitrate but let's
    prevent it.

    Signed-off-by: Dan Carpenter
    Signed-off-by: David S. Miller

    Dan Carpenter
     
  • If interrupts happen before napi_enable was called, the driver will not
    work as expected. Network transmissions are impossible in this state.
    This bug can be reproduced easily by restarting the network interface in
    a loop. After some time any network transmissions on the network
    interface will fail.

    This patch fixes the bug by enabling napi before enabling the network
    interface interrupts.

    Signed-off-by: Markus Pargmann
    Acked-by: Peter Korsgaard
    Acked-by: Mugunthan V N
    Signed-off-by: David S. Miller

    Markus Pargmann
     
  • RPS support is kind of broken on bnx2x, because only non LRO packets
    get proper rx queue information. This triggers reorders, as it seems
    bnx2x likes to generate a non LRO packet for a segment including the TCP
    PUSH flag (this might be pure coincidence, but all the reorders I've
    seen involve segments with a PUSH):

    11:13:34.335847 IP A > B: . 415808:447136(31328) ack 1 win 457
    11:13:34.335992 IP A > B: . 447136:448560(1424) ack 1 win 457
    11:13:34.336391 IP A > B: . 448560:479888(31328) ack 1 win 457
    11:13:34.336425 IP A > B: P 511216:512640(1424) ack 1 win 457
    11:13:34.336423 IP A > B: . 479888:511216(31328) ack 1 win 457
    11:13:34.336924 IP A > B: . 512640:543968(31328) ack 1 win 457
    11:13:34.336963 IP A > B: . 543968:575296(31328) ack 1 win 457

    We must call skb_record_rx_queue() to properly give to RPS (and more
    generally for TX queue selection on forward path) the receive queue
    information.

    Similar fix is needed for skb_mark_napi_id(), but will be handled
    in a separate patch to ease stable backports.

    Signed-off-by: Eric Dumazet
    Cc: Willem de Bruijn
    Cc: Eilon Greenstein
    Acked-by: Dmitry Kravkov
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • On receiving an ACK that covers the loss probe sequence, TLP
    immediately sets the congestion state to Open, even though some packets
    are not recovered and retransmissions are on the way. The later ACKs
    may trigger a WARN_ON check in step D of tcp_fastretrans_alert(), e.g.,
    https://bugzilla.redhat.com/show_bug.cgi?id=989251

    The fix is to follow the similar procedure in recovery by calling
    tcp_try_keep_open(). The sender switches to Open state if no packets
    are retransmitted. Otherwise it goes to Disorder and lets subsequent
    ACKs move the state to Recovery or Open.

    Reported-By: Michael Sterrett
    Tested-By: Dormando
    Signed-off-by: Yuchung Cheng
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Yuchung Cheng
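
The state choice described above can be written as a one-line decision. The sketch below is a simplification with made-up names: it looks only at an outstanding-retransmission counter, whereas the kernel's tcp_try_keep_open() may take further conditions into account.

```python
from enum import Enum


class CaState(Enum):
    OPEN = "Open"
    DISORDER = "Disorder"


def state_after_probe_ack(retrans_out: int) -> CaState:
    """Pick the congestion state once an ACK covers the loss probe.

    retrans_out: number of retransmitted packets still in flight (hypothetical input).
    """
    # Going straight to Open while retransmissions are outstanding is what
    # tripped the WARN_ON; staying in Disorder lets later ACKs sort it out.
    return CaState.DISORDER if retrans_out > 0 else CaState.OPEN
```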
     
  • Fix to return -ENOMEM in the padding pkt alloc fail error handling
    case instead of 0, as done elsewhere in this function.

    Signed-off-by: Wei Yongjun
    Acked-by: Oliver Neukum
    Signed-off-by: David S. Miller

    Wei Yongjun
     
  • Vlad Yasevich says:

    ====================
    sctp: Use software checksum under certain circumstances.

    There are some cards that support SCTP checksum offloading. When using
    these cards with IPSec or forcing IP fragmentation of SCTP traffic,
    the checksum is computed incorrectly due to the fact that xfrm and IP/IPv6
    fragmentation code do not know that this is SCTP traffic and do not
    know that checksum has to be computed differently.

    To fix this, we let SCTP detect these conditions and perform software
    checksum calculation.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller