23 Jul, 2012

3 commits

  • I've seen several attempts recently made to do quick failover of sctp transports
    by reducing various retransmit timers and counters. While it's possible to
    implement a faster failover on multihomed sctp associations this way, it's not
    particularly robust, in that it can lead to unneeded retransmits, as well as
    false connection failures due to intermittent latency on a network.

    Instead, let's implement the new IETF quick failover draft found here:
    http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05

    This will let the sctp stack identify transports that have had a small number of
    errors and quickly avoid using them until their reliability can be
    re-established. I've tested this out on two virt guests connected via multiple
    isolated virt networks and believe it's in compliance with the above draft and
    works well.

    Signed-off-by: Neil Horman
    CC: Vlad Yasevich
    CC: Sridhar Samudrala
    CC: "David S. Miller"
    CC: linux-sctp@vger.kernel.org
    CC: joe@perches.com
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Neil Horman
     
  • Fix race condition in several network drivers when reading stats on 32bit
    UP architectures. These drivers update their stats in a BH context and
    therefore should use u64_stats_fetch_begin_bh/u64_stats_fetch_retry_bh
    instead of u64_stats_fetch_begin/u64_stats_fetch_retry when reading the
    stats.

    Signed-off-by: Kevin Groeneveld
    Signed-off-by: David S. Miller

    Kevin Groeneveld
     
  • Set unicast_sock uc_ttl to -1 so that we select the right ttl,
    instead of sending packets with a 0 ttl.

    Bug added in commit be9f4a44e7d4 (ipv4: tcp: remove per net tcp_sock)

    Signed-off-by: Hiroaki SHIMODA
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Jul, 2012

26 commits

  • Replace the spin_lock_irq/spin_unlock_irq pair in the interrupt
    handler with a spin_lock_irqsave/spin_unlock_irqrestore pair.

    Found by Linux Driver Verification project (linuxtesting.org).

    Signed-off-by: Denis Efremov
    Signed-off-by: David S. Miller

    Denis Efremov
     
  • Jesse Gross says:

    ====================
    A few bug fixes and small enhancements for net-next/3.6.
    ...
    Ansis Atteka (1):
    openvswitch: Do not send notification if ovs_vport_set_options() failed

    Ben Pfaff (1):
    openvswitch: Check gso_type for correct sk_buff in queue_gso_packets().

    Jesse Gross (2):
    openvswitch: Enable retrieval of TCP flags from IPv6 traffic.
    openvswitch: Reset upper layer protocol info on internal devices.

    Leo Alterman (1):
    openvswitch: Fix typo in documentation.

    Pravin B Shelar (1):
    openvswitch: Check correct return value from skb_gso_segment()

    Raju Subramanian (1):
    openvswitch: Replace Nicira Networks.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • We were using a special key "0" for all loopback and point-to-point
    device neigh lookups under ipv4, but we wouldn't use that special
    key for the neigh creation.

    So basically we'd make a new neigh at each and every lookup :-)

    This special case to use only one neigh for these device types
    is of dubious value, so just remove it entirely.

    Reported-by: Eric Dumazet
    Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: Leo Alterman
    Signed-off-by: Jesse Gross

    Leo Alterman
     
  • At the point where it was used, skb_shinfo(skb)->gso_type referred to a
    post-GSO sk_buff. Thus, it would always be 0. We want to know the pre-GSO
    gso_type, so we need to obtain it before segmenting.

    Before this change, the kernel would pass inconsistent data to userspace:
    packets for UDP fragments with nonzero offset would be passed along with
    flow keys that indicate a zero offset (that is, the flow key for "later"
    fragments claimed to be "first" fragments). This inconsistency tended
    to confuse Open vSwitch userspace, causing it to log messages about
    "failed to flow_del" the flows with "later" fragments.

    Signed-off-by: Ben Pfaff
    Signed-off-by: Jesse Gross

    Ben Pfaff
     
  • Fix return check typo.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: Jesse Gross

    Pravin B Shelar
     
  • When I/O access mode is enabled by the BOOTROM or BIOS for AR8152 v2.1,
    the registers can't be read/written by memory access mode.
    Clearing bit 8 of register 0x21c fixes the issue.

    Signed-off-by: Cloud Ren
    Cc: stable
    Signed-off-by: xiong
    Signed-off-by: David S. Miller

    Cloud Ren
     
  • This patch fixes a crash in the path
    tun_chr_close -> netdev_run_todo -> tun_free_netdev -> sk_release_kernel ->
    sock_release -> iput(SOCK_INODE(sock))
    introduced by commit 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d.

    The problem is that this socket is embedded in struct tun_struct, it has
    no inode, iput is called on invalid inode, which modifies invalid memory
    and optionally causes a crash.

    sock_release also decrements sockets_in_use; this causes a bug where the
    "sockets: used" field in /proc/*/net/sockstat keeps decreasing when
    creating and closing tun devices.

    This patch introduces a flag SOCK_EXTERNALLY_ALLOCATED that instructs
    sock_release to not free the inode and not decrement sockets_in_use,
    fixing both memory corruption and sockets_in_use underflow.

    It should be backported to 3.3 and 3.4 stable.

    Signed-off-by: Mikulas Patocka
    Cc: stable@kernel.org
    Signed-off-by: David S. Miller

    Mikulas Patocka
     
  • Override the metrics with rt_pmtu

    Signed-off-by: Julian Anastasov
    Signed-off-by: David S. Miller

    Julian Anastasov
     
  • Jeff Kirsher says:

    ====================
    This series contains updates to ixgbe.
    ...
    Alexander Duyck (9):
    ixgbe: Use VMDq offset to indicate the default pool
    ixgbe: Fix memory leak when SR-IOV VFs are direct assigned
    ixgbe: Drop references to deprecated pci_ DMA api and instead use
    dma_ API
    ixgbe: Cleanup configuration of FCoE registers
    ixgbe: Merge all FCoE percpu values into a single structure
    ixgbe: Make FCoE allocation and configuration closer to how rings
    work
    ixgbe: Correctly set SAN MAC RAR pool to default pool of PF
    ixgbe: Only enable anti-spoof on VF pools
    ixgbe: Enable FCoE FSO and CRC offloads based on CAPABLE instead of
    ENABLED flag
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Jiri Pirko says:

    ====================
    This patchset represents the way I walked when I was adding multiqueue
    support for team driver.

    Jiri Pirko (6):
    net: honour netif_set_real_num_tx_queues() retval
    rtnl: allow to specify different num for rx and tx queue count
    rtnl: allow to specify number of rx and tx queues on device creation
    net: rename bond_queue_mapping to slave_dev_queue_mapping
    bond_sysfs: use real_num_tx_queues rather than params.tx_queue
    team: add multiqueue support
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Largely copied from bonding code.

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • Since the number of tx queues can now be specified during bond instance
    creation, and may therefore differ from params.tx_queues, use
    real_num_tx_queues for the boundary check instead.

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • As this is going to be used not only by bonding.

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • This patch introduces IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES, by
    which userspace can set the number of rx and/or tx queues to be allocated
    for a newly created netdevice.
    This overrides ops->get_num_[tr]x_queues().

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • Also cut out unused function parameters and possible err in return
    value.

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • In netif_copy_real_num_queues() the return value of
    netif_set_real_num_tx_queues() should be checked.

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • The modern TCP stack depends heavily on tcp_write_timer() having low
    latency, but the current implementation doesn't quite meet that
    expectation.

    When a timer fires but finds the socket is owned by the user, it rearms
    itself for an additional delay hoping next run will be more
    successful.

    tcp_write_timer() for example uses a 50ms delay for next try, and it
    defeats many attempts to get predictable TCP behavior in term of
    latencies.

    Use the recently introduced tcp_release_cb(), so that the user owning
    the socket will call various handlers right before socket release.

    This will permit us to post a followup patch to address the
    tcp_tso_should_defer() syndrome (some deferred packets have to wait
    RTO timer to be transmitted, while cwnd should allow us to send them
    sooner)

    Signed-off-by: Eric Dumazet
    Cc: Tom Herbert
    Cc: Yuchung Cheng
    Cc: Neal Cardwell
    Cc: Nandita Dukkipati
    Cc: H.K. Jerry Chu
    Cc: John Heffner
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • When/if sysctl_tcp_abc > 1, we expect to increase cwnd by 2 if the
    received ACK acknowledges more than 2*MSS bytes, in tcp_slow_start()

    Problem is this RFC 3465 behavior is not correctly coded, as
    the while () loop increases snd_cwnd one by one.

    Add a new variable to avoid this off-by-one error.

    Signed-off-by: Eric Dumazet
    Cc: Tom Herbert
    Cc: Yuchung Cheng
    Cc: Neal Cardwell
    Cc: Nandita Dukkipati
    Cc: John Heffner
    Cc: Stephen Hemminger
    Acked-by: Yuchung Cheng
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Fix a missing roundup_pow_of_two(), since tcpmhash_entries is not
    guaranteed to be a power of two.

    Uses hash_32() instead of custom hash.

    tcpmhash_entries should be an unsigned int.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Applied to a set of static inline functions in tcp_input.c

    Signed-off-by: Vijay Subramanian
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Vijay Subramanian
     
  • Use PCI_VENDOR_ID_INTEL from pci_ids.h instead of creating its own
    vendor ID #define.

    Signed-off-by: Jon Mason
    Cc: Jeff Kirsher
    Cc: Jesse Brandeburg
    Cc: Bruce Allan
    Cc: Carolyn Wyborny
    Cc: Don Skidmore
    Cc: Greg Rose
    Cc: Peter P Waskiewicz Jr
    Cc: Alex Duyck
    Cc: John Ronciak
    Acked-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jon Mason
     
  • Use PCI_VENDOR_ID_INTEL from pci_ids.h instead of creating its own
    vendor ID #define.

    Signed-off-by: Jon Mason
    Cc: Jeff Kirsher
    Cc: Jesse Brandeburg
    Cc: Bruce Allan
    Cc: Carolyn Wyborny
    Cc: Don Skidmore
    Cc: Greg Rose
    Cc: Peter P Waskiewicz Jr
    Cc: Alex Duyck
    Cc: John Ronciak
    Acked-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jon Mason
     
  • Remove myself from myri10ge MAINTAINERS list

    Signed-off-by: Jon Mason
    Signed-off-by: David S. Miller

    Jon Mason
     
  • Marc Kleine-Budde says:

    ====================
    the fifth pull request for upcoming v3.6 net-next cleans up and
    improves the janz-ican3 driver (6 patches by Ira W. Snyder, one by me).
    A patch by Steffen Trumtrar adds imx53 support to the flexcan driver.
    And another patch by me, which marks the bit timing constant in the CAN
    drivers as "const".
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • …wireless-next into for-davem

    John W. Linville
     

20 Jul, 2012

11 commits

  • The Janz VMOD-ICAN3 hardware has support for one-shot packet
    transmission. This means that transmission of a packet is attempted only
    once, with no automatic retries.

    The SocketCAN core has a controller-wide setting for this mode:
    CAN_CTRLMODE_ONE_SHOT. The Janz VMOD-ICAN3 hardware supports this flag
    on a per-packet level, but the SocketCAN core does not.

    Signed-off-by: Ira W. Snyder
    Signed-off-by: Marc Kleine-Budde

    Ira W. Snyder
     
  • If the bus error quota is set to infinite and the host CPU cannot keep
    up, the Janz VMOD-ICAN3 firmware will stop responding to control
    messages until the controller is reset.

    The firmware will automatically stop sending bus error messages when the
    quota is reached, and will only resume sending bus error messages when
    the quota is re-set to a positive value.

    This limitation is worked around by setting the bus error quota to one
    message, and then re-setting the quota to one message every time a bus
    error message is received. By doing this, the firmware never stops
    responding to control messages. The CAN bus can be reset without a
    hard-reset of the controller card.

    Signed-off-by: Ira W. Snyder
    Signed-off-by: Marc Kleine-Budde

    Ira W. Snyder
     
  • The Janz VMOD-ICAN3 firmware does not support any sort of TX-done
    notification or interrupt. The driver previously used the hardware
    loopback to attempt to work around this deficiency, but this caused all
    sockets to receive all messages, even if CAN_RAW_RECV_OWN_MSGS is off.

    Using the new function ican3_cmp_echo_skb(), we can drop the loopback
    messages and return the original skbs. This fixes the issues with
    CAN_RAW_RECV_OWN_MSGS.

    A private skb queue is used to store the echo skbs. This avoids the need
    for any index management.

    Due to a lack of TX-error interrupts, bus errors are permanently
    enabled, and are used as a TX-error notification. This is used to drop
    an echo skb when transmission fails. Bus error packets are not generated
    if the user has not enabled bus error reporting.

    Signed-off-by: Ira W. Snyder
    Signed-off-by: Marc Kleine-Budde

    Ira W. Snyder
     
  • The error and byte counter statistics were being incremented
    incorrectly. For example, a TX error would be counted both in tx_errors
    and rx_errors.

    This corrects the problem so that tx_errors and rx_errors are only
    incremented for errors caused by packets sent to the bus. Error packets
    generated by the driver are not counted.

    The byte counters are only increased for packets which are actually
    transmitted or received from the bus. Error packets generated by the
    driver are not counted.

    Signed-off-by: Ira W. Snyder
    Signed-off-by: Marc Kleine-Budde

    Ira W. Snyder
     
  • This patch cleans up the ICAN3 to Linux CAN frame and vice versa
    conversion functions:

    - RX: Use get_can_dlc() to limit the dlc value.
    - RX+TX: Don't copy the whole frame, only copy the amount of bytes
    specified in cf->can_dlc.

    Acked-by: Ira W. Snyder
    Tested-by: Ira W. Snyder
    Signed-off-by: Marc Kleine-Budde

    Marc Kleine-Budde
     
  • The commit which added the janz-ican3 driver and commit
    3ccd4c61 "can: Unify droping of invalid tx skbs and netdev stats" were
    committed into mainline Linux during the same merge window.

    Therefore, the addition of this code to the janz-ican3 driver was
    forgotten. This patch adds the expected code.

    Signed-off-by: Ira W. Snyder
    Signed-off-by: Marc Kleine-Budde

    Ira W. Snyder
     
  • The code which used this variable was removed during review, before the
    driver was added to mainline Linux. It is now dead code, and can be
    removed.

    Signed-off-by: Ira W. Snyder
    Signed-off-by: Marc Kleine-Budde

    Ira W. Snyder
     
  • This patch adds support for a second clock to the flexcan driver. On
    modern freescale ARM cores like the imx53 and imx6q two clocks ("ipg"
    and "per") must be enabled in order to access the CAN core.

    In the original driver, the clock was requested without specifying the
    connection id; furthermore, all mainline ARM archs with flexcan support
    (imx28, imx25, imx35) register their flexcan clock without a
    connection id, too.

    This patch first renames the existing clk variable to clk_ipg and
    converts it to devm for easier error handling. The connection id "ipg"
    is added to the devm_clk_get() call. Then a second clock "per" is
    requested. As none of these archs specify a connection id, both clk_get
    calls return the same clock. This ensures compatibility with existing
    flexcan support and adds support for imx53 at the same time.

    After this patch hits mainline, the archs may give their existing
    flexcan clock the "ipg" connection id and implement a dummy "per"
    clock.

    This patch has been tested on imx28 (unmodified clk tree) and on imx53
    with a separate "ipg" and "per" clock.

    Cc: Sascha Hauer
    Cc: Shawn Guo
    Signed-off-by: Steffen Trumtrar
    Acked-by: Hui Wang
    Signed-off-by: Marc Kleine-Budde

    Steffen Trumtrar
     
  • This patch marks the bittiming_const pointer in struct can_priv as
    "const". This allows us to mark the struct can_bittiming_const in the CAN
    drivers as "const", too.

    Signed-off-by: Marc Kleine-Budde

    Marc Kleine-Budde
     
  • Instead of only setting the FCoE segmentation offload and CRC offload flags
    if we enable FCoE, we could just always set them, since there are no
    modifications needed to the hardware or adapter FCoE structure in order to
    use these features.

    The advantage to this is that if FCoE enablement fails, for example because
    SR-IOV was enabled on 82599, we will still have use of the FCoE
    segmentation offload and Tx/Rx CRC offloads which should still help to
    improve the FCoE performance.

    Signed-off-by: Alexander Duyck
    Tested-by: Phil Schmitt
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • The current logic is enabling anti-spoof on all pools and then clearing
    anti-spoof on just the first PF pool. The correct approach is to only set
    anti-spoof on the VF pools and to leave all of the PF pools unchecked.

    This allows for items such as FCoE to use adjacent pools within the PF for
    transmit and receive queues without the traffic being blocked by this
    security feature.

    Signed-off-by: Alexander Duyck
    Tested-by: Phil Schmitt
    Tested-by: Sibai Li
    Signed-off-by: Jeff Kirsher

    Alexander Duyck