20 Oct, 2011

11 commits

  • sysfs is a core piece of infrastructure that many people use and
    few people have all of the rules in their head on how to use
    it correctly. Add warnings for people using tagged directories
    improperly so that any misuses can be caught and diagnosed quickly.

    A single inexpensive test in sysfs_find_dirent is almost sufficient
    to catch all possible misuses. An additional warning is needed
    in sysfs_add_dirent so that we actually fail when attempting to
    add an untagged dirent in a tagged directory.

    Signed-off-by: Eric W. Biederman
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Now that /sys/class/net/bonding_masters is implemented as a tagged sysfs
    file we can remove support for untagged files in tagged directories.

    This change removes any ambiguity of what a NULL namespace value
    means. A NULL namespace parameter after this patch means
    that we are talking about an untagged sysfs dirent.

    This makes the sysfs code much less prone to mistakes during
    maintenance.

    Signed-off-by: Eric W. Biederman
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This fixes a network namespace misfeature: bonding_masters looked at
    current, instead of remembering the context in which
    /sys/class/net/bonding_masters was opened, to see which network
    namespace to act upon.

    This removes the need for sysfs to handle tagged directories with
    untagged members allowing for a conceptually simpler sysfs
    implementation.

    Signed-off-by: Eric W. Biederman
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Signed-off-by: Eric W. Biederman
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Looking up files in sysfs is hard to understand and analyze because we
    currently allow placing untagged files in tagged directories. In the
    implementation of that we have two subtly different meanings of NULL:
    NULL meaning there is no tag on a directory entry, and NULL meaning
    we don't care which namespace the lookup is performed for. These
    multiple uses of NULL have resulted in subtle bugs (since fixed)
    in the code.

    Currently it is only the bonding driver that needs to have an untagged
    file in a tagged directory.

    To untangle this mess I am adding support for tagged files to sysfs,
    modifying the bonding driver to implement bonding_masters as a tagged
    file, and registering bonding_masters once for each network namespace.
    Then I am removing support for untagged entries in tagged sysfs
    directories.

    The result is code that is much easier to reason about.

    Signed-off-by: Eric W. Biederman
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • The port shouldn't be enabled unless its current MUX
    state is DISTRIBUTING, which is correctly handled by
    ad_mux_machine(). Otherwise the packet sent can be
    lost because the other end may not be ready.

    The issue happens on every port initialization, but
    as the ports are expected to move quickly to DISTRIBUTING,
    it doesn't cause much problem. However, it does cause
    constant packet loss if the other peer has the port
    configured to stay in STANDBY (i.e. SYNC set to OFF).

    Signed-off-by: Flavio Leitner
    Signed-off-by: David S. Miller

    Flavio Leitner
     
  • This patch adds a sanity check on the values provided by user space for
    the hardware time stamping configuration. If the values lie outside of
    the absolute limits, then the ioctl request will be denied.

    Signed-off-by: Richard Cochran
    Signed-off-by: David S. Miller

    Richard Cochran
     
  • This patch moves the rcu_barrier from rollback_registered_many
    (inside the rtnl_lock) into netdev_run_todo (just outside the rtnl_lock).
    This allows us to gain the full benefit of synchronize_net calling
    synchronize_rcu_expedited when the rtnl_lock is held.

    The rcu_barrier in rollback_registered_many was originally a synchronize_net
    but was promoted to be a rcu_barrier() when it was found that people were
    unnecessarily hitting the 250ms wait in netdev_wait_allrefs(). Changing
    the rcu_barrier back to a synchronize_net is therefore safe.

    Since we only care about waiting for the rcu callbacks before we get
    to netdev_wait_allrefs() it is also safe to move the wait into
    netdev_run_todo.

    This was tested by creating and destroying 1000 tap devices and observing
    /proc/lock_stat. /proc/lock_stat reports this change reduces the hold
    times of the rtnl_lock by a factor of 10. There was no observable
    difference in the amount of time it takes to destroy a network device.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • With the initial cwnd being 10 (TCP_INIT_CWND) instead of 3, change
    tcp_fixup_sndbuf() to get more than 16384 bytes (sysctl_tcp_wmem[1]) in
    the initial sk_sndbuf.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • The Vitesse driver was using the RGMII_ID interface type to determine if
    skew was necessary. However, we want to move away from using that
    interface type, as it's really a property of the board's PHY connection.
    However, some boards depend on it, so we want to support it, while
    allowing new boards to use the more flexible "fixups" approach. To do
    this, we extract the code which adds skew into its own function, and
    call that function when RGMII_ID has been selected.

    Another side-effect of this change is that if your PHY has skew set
    already, it doesn't clear it. This way, the fixup code can modify the
    register without config_init then clearing it.

    Signed-off-by: Andy Fleming
    Signed-off-by: David S. Miller

    Andy Fleming
     
  • skb_recycle_check resets the skb if it's eligible for recycling.
    However, there are times when a driver might want to optionally
    manipulate the skb data with the skb before resetting the skb,
    but after it has determined eligibility. We do this by splitting the
    eligibility check from the skb reset, creating two inline functions to
    accomplish that task.

    Signed-off-by: Andy Fleming
    Acked-by: David Daney
    Signed-off-by: David S. Miller

    Andy Fleming
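
The sysfs commits in the list above settle on a single meaning for a NULL namespace: an untagged dirent. The following is a toy Python model of that rule, not the kernel code. The names `Dirent`, `Directory`, `find_dirent` and `add_dirent` are hypothetical, and Python's `None` stands in for NULL.

```python
import warnings


class Dirent:
    """Toy directory entry; ns=None means the entry is untagged."""

    def __init__(self, name, ns=None):
        self.name = name
        self.ns = ns


class Directory:
    """Toy directory; tagged=True means every child must carry a tag."""

    def __init__(self, tagged=False):
        self.tagged = tagged
        self.children = []


def find_dirent(parent, name, ns=None):
    # Warn on misuse: a tagged directory needs a tag on the lookup,
    # and an untagged directory must not be given one.
    if parent.tagged != (ns is not None):
        warnings.warn("tagged/untagged lookup mismatch for %r" % name)
        return None
    for child in parent.children:
        if child.name == name and child.ns == ns:
            return child
    return None


def add_dirent(parent, dirent):
    # Fail outright rather than silently adding a mismatched entry.
    if parent.tagged != (dirent.ns is not None):
        warnings.warn("tag mismatch adding %r" % dirent.name)
        return False
    if find_dirent(parent, dirent.name, dirent.ns) is not None:
        return False  # duplicate name within the same namespace
    parent.children.append(dirent)
    return True
```

In this model, registering `bonding_masters` once per namespace succeeds because the two entries differ in their tag, while adding an untagged entry to a tagged directory is refused with a warning.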
     

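The hardware time stamping commit in the list above rejects ioctl requests whose values fall outside absolute limits. A minimal sketch of that kind of range check follows. The function name, the limit constants, and the choice of error codes are all assumptions made for illustration; the commit message does not state them.

```python
import errno

# Illustrative limits only (assumption); the real bounds come from the
# kernel's own enumerations.
TX_TYPE_MAX = 2
RX_FILTER_MAX = 14


def check_hwtstamp_config(flags, tx_type, rx_filter):
    """Return 0 if the request looks sane, else a negative errno value."""
    if flags != 0:
        # Assumption: no flag bits are defined, so any set bit is invalid.
        return -errno.EINVAL
    if not 0 <= tx_type <= TX_TYPE_MAX:
        return -errno.ERANGE
    if not 0 <= rx_filter <= RX_FILTER_MAX:
        return -errno.ERANGE
    return 0
```

Doing the check once, before the request reaches a driver, means individual drivers only ever see values that are at least within the defined range.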
19 Oct, 2011

24 commits

  • Driver version updated to 1.5.4.2

    Signed-off-by: Yevgeny Petrilin
    Signed-off-by: David S. Miller

    Yevgeny Petrilin
     
  • Moving to Toeplitz function in RSS calculation.
    Reporting rxhash in skb.

    Signed-off-by: Yevgeny Petrilin
    Signed-off-by: David S. Miller

    Yevgeny Petrilin
     
  • Signed-off-by: Yevgeny Petrilin
    Signed-off-by: David S. Miller

    Yevgeny Petrilin
     
  • Not updating common counters from data path.
    The checksum counters are per ring, summarizing them when collecting statistics.

    Signed-off-by: Yevgeny Petrilin
    Signed-off-by: David S. Miller

    Yevgeny Petrilin
     
  • Canceling FCS removal where FW allows for better alignment
    of incoming data.

    Signed-off-by: Yevgeny Petrilin
    Signed-off-by: David S. Miller

    Yevgeny Petrilin
     
  • Prevent overflow when trying to register more VLANs than the VLAN table in
    HW is configured for.
    Need to take into account that the first 2 entries are reserved.

    Signed-off-by: Yevgeny Petrilin
    Signed-off-by: David S. Miller

    Yevgeny Petrilin
     
  • Added recovery check of CA wake status in case of wake up timeout.
    Added check of CA wake status in case of wake down timeout.

    Signed-off-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    Daniel Martensson
     
  • Added sanity check for length of CAIF frames, and tear down of
    CAIF link-layer device upon protocol error.

    Signed-off-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    Daniel Martensson
     
  • CAIF HSI uses a timer for inactivity. Upon timeout HSI-wake signaling
    is initiated to allow power-down of the HSI block.

    Signed-off-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    Dmitry Tarnyagin
     
  • Platform device is no longer removed from caif_hsi at shutdown.
    The HSI-platform device must do its own registration and unregistration.

    Signed-off-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    Daniel Martensson
     
  • Some platforms do not allow putting the HSI block into low-power
    mode when the FIFO is not empty. The patch flushes (by reading)
    the FIFO during the wake down sequence. Asynchronous read and write
    are implemented for that. As a side effect this will also greatly
    improve performance.

    Signed-off-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    Daniel Martensson
     
  • Under stressed conditions a race could happen when del_timer_sync() was called
    from softirq context at the same time when mod_timer_pending() for the same
    timer was called from the workqueue. This led to a state mismatch in the
    CAIF HSI driver and a subsequent unexpected link wakeup procedure.

    The fix puts del_timer_sync() and mod_timer_pending() calls under a spin lock
    to protect against the race condition.

    Signed-off-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    Dmitry Tarnyagin
     
  • cfhsi->tx_state was not protected by a spin lock. TX soft-irq could interrupt
    cfhsi_tx_done_work work leading to inconsistent state of the driver.

    Signed-off-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    Dmitry Tarnyagin
     
  • CAIF HSI header may be uninitialized and cause last message to
    be repeated if transmit size is ~86 bytes long.

    Signed-off-by: Sjur Brændeland
    Signed-off-by: David S. Miller

    sjur.brandeland@stericsson.com
     
  • To ease skb->truesize sanitization, it's better to be able to localize
    all references to skb frags size.

    Define accessors : skb_frag_size() to fetch frag size, and
    skb_frag_size_{set|add|sub}() to manipulate it.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Calling icmpv6_send() on a local message size error leads to
    an incorrect update of the path mtu. So use xfrm6_local_rxpmtu()
    to notify about the pmtu if the IPV6_DONTFRAG socket option is
    set on a udp or raw socket, according to RFC 3542, and use
    ipv6_local_error() otherwise.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • ip6_append_data() builds packets based on the mtu from dst_mtu(rt->dst.path).
    On IPsec the effective mtu is lower because we need to add the protocol
    headers and trailers later when we do the IPsec transformations. So after
    the IPsec transformations the packet might be too big, which leads to a
    slowpath fragmentation then. This patch fixes this by building the packets
    based on the lower IPsec mtu from dst_mtu(&rt->dst) and adapts the exthdr
    handling to this.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • The pointer to mtu_info is taken from the common buffer
    of the skb, thus it can't be a NULL pointer. This patch
    removes this check on mtu_info.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • The replay check and replay advance functions had some code
    duplications. This patch removes the duplications.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • The extra delay of 2ns to adjust RX clock phase is actually needed
    in RGMII mode. Tested on the HDK7108 (STx7108c2).

    Signed-off-by: Giuseppe Cavallaro
    Signed-off-by: David S. Miller

    Giuseppe CAVALLARO
     
  • The following configuration used to work as I expected. At least
    we could use the fcoe interfaces to do MPIO and the bond0 iface
    to do load balancing or failover.

           ---eth2.228-fcoe
           |
    eth2 -----|
              |
              |---- bond0
              |
    eth3 -----|
           |
           ---eth3.228-fcoe

    This worked because of a change we added to allow inactive slaves
    to rx 'exact' matches. This functionality was kept intact with the
    rx_handler mechanism. However now the vlan interface attached to the
    active slave never receives traffic because the bonding rx_handler
    updates the skb->dev and goto's another_round. Previously, the
    vlan_do_receive() logic was called before the bonding rx_handler.

    Now by the time vlan_do_receive calls vlan_find_dev() the
    skb->dev is set to bond0 and it is clear no vlan is attached
    to this iface. The vlan lookup fails.

    This patch moves the VLAN check above the rx_handler. A VLAN
    tagged frame is now routed to the eth2.228-fcoe iface in the
    above schematic. Untagged frames continue to the bond0 as
    normal. This case also remains intact:

    eth2 --> bond0 --> vlan.228

    Here the skb is VLAN tagged but the vlan lookup fails on eth2
    causing the bonding rx_handler to be called. On the second
    pass the vlan lookup is on the bond0 iface and completes as
    expected.

    Putting a VLAN.228 on both the bond0 and eth2 device will
    result in eth2.228 receiving the skb. I don't think this is
    completely unexpected and was the result prior to the rx_handler
    result.

    Note, the same setup is also used for other storage traffic that
    MPIO is used with, e.g. iSCSI, and similar setups can be contrived
    without storage protocols.

    Signed-off-by: John Fastabend
    Acked-by: Jesse Gross
    Reviewed-by: Jiri Pirko
    Tested-by: Hans Schillstrom
    Signed-off-by: David S. Miller

    John Fastabend
     
  • The cs89x0 driver was initially placed in apple/ when it
    should have been placed in cirrus/. This resolves the
    issue by moving the driver and fixing up the respective
    Kconfig(s) and Makefile(s).

    Thanks to Sascha for reporting the issue.

    -v2 Fix a config error that was introduced with v1 by removing
    the dependency on MACE for NET_VENDOR_APPLE.

    CC: Russell Nelson
    CC: Andrew Morton
    Reported-by: Sascha Hauer
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     
  • pppol2tp_xmit() calls skb_cow_head(skb, 2) before calling
    l2tp_xmit_skb()

    Then l2tp_xmit_skb() calls again skb_cow_head(skb, large_headroom)

    This patch changes the first skb_cow_head() call to supply the needed
    headroom to make sure at most one (expensive) pskb_expand_head() is
    done.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Fragmented multicast frames are delivered to a single macvlan port,
    because the ip defrag logic considers the other samples redundant.

    Implement a defrag step before trying to send the multicast frame.

    Reported-by: Ben Greear
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
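
The bonding/VLAN commit in the list above moves the VLAN check ahead of the rx_handler. The receive ordering it describes can be modeled roughly as below. This is a toy Python sketch under stated assumptions, not the kernel receive path: `deliver`, `vlans` and `masters` are hypothetical names, and returning `None` for an unmatched tagged frame is a simplification.

```python
def deliver(dev, vid, vlans, masters):
    """Return the name of the interface that ends up receiving the frame.

    vlans maps (device, vlan_id) to a VLAN interface name; masters maps a
    slave device to its bond. vid is None for an untagged frame.
    """
    while True:
        # VLAN check first, against the device the frame is currently on.
        if vid is not None and (dev, vid) in vlans:
            return vlans[(dev, vid)]
        # Then the rx_handler step: the bond takes over and we go around again.
        if dev in masters:
            dev = masters[dev]
            continue
        # No handler left: untagged frames stay here, unmatched tags go nowhere.
        return dev if vid is None else None
```

With `eth2` and `eth3` enslaved to `bond0`, a frame tagged 228 arriving on `eth2` reaches `eth2.228-fcoe` when that interface exists, an untagged frame reaches `bond0`, and a VLAN configured only on `bond0` is still found on the second pass.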
     

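The skb_frag_size commit in the list above funnels every access to a fragment's size through accessors. The idea can be sketched in Python as follows. The accessor names come from the commit message; the `Frag` class is a hypothetical stand-in that models only the size field.

```python
class Frag:
    """Toy stand-in for an skb fragment; only the size field is modeled."""

    def __init__(self, size=0):
        self._size = size


def skb_frag_size(frag):
    # Single read path for the fragment size.
    return frag._size


def skb_frag_size_set(frag, size):
    frag._size = size


def skb_frag_size_add(frag, delta):
    frag._size += delta


def skb_frag_size_sub(frag, delta):
    frag._size -= delta
```

Because callers never touch the field directly, every reference to the size can be found by searching for these four names, which is the localization the commit message asks for.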
18 Oct, 2011

5 commits