28 Sep, 2010

2 commits


27 Sep, 2010

19 commits

  • Conflicts:
    drivers/net/qlcnic/qlcnic_init.c
    net/ipv4/ip_output.c

    David S. Miller
     
  • Signed-off-by: Otavio Salvador
    Signed-off-by: David S. Miller

    Otavio Salvador
     
  • Clean up a missing exit path in the ipv6 module init routines. In
    addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys
    for the ipv6_addr_label_ops structure. But if module loading fails, or if the
    ipv6 module is removed, there is no corresponding unregister_pernet_subsys call,
    which leaves a now-bogus address on the pernet_list, leading to oopses in
    subsequent registrations. This patch cleans up both the failed load path and
    the unload path. Tested by myself with good results.

    Signed-off-by: Neil Horman

    include/net/addrconf.h | 1 +
    net/ipv6/addrconf.c | 11 ++++++++---
    net/ipv6/addrlabel.c | 5 +++++
    3 files changed, 14 insertions(+), 3 deletions(-)
    Signed-off-by: David S. Miller

    Neil Horman
     
  • loopback driver uses dev->ml_priv to store its percpu stats pointer.
    It uses ugly casts "(void __percpu __force *)" to shut up sparse
    complaints.

    Define a union to better document that we use ml_priv in the loopback
    driver, and define an lstats field with appropriate types.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
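
A minimal userspace sketch of the union idea described above, assuming a toy device structure. The names toy_net_device, toy_lstats and toy_loopback_count are hypothetical, and the kernel's __percpu annotation is left out because it has no meaning outside the kernel tree. The point is only that a typed union member removes the need for casts at each use.

```c
/* Hypothetical stand-in for the loopback per-cpu stats structure. */
struct toy_lstats {
    unsigned long packets;
    unsigned long bytes;
};

/* Hypothetical stand-in for struct net_device: one slot, two views. */
struct toy_net_device {
    union {
        void *ml_priv;             /* generic, untyped view */
        struct toy_lstats *lstats; /* typed view, no cast needed at use sites */
    };
};

/* Account one packet of the given length through the typed member. */
void toy_loopback_count(struct toy_net_device *dev, unsigned long len)
{
    dev->lstats->packets++;
    dev->lstats->bytes += len;
}
```

Both members share the same storage, so code that still reads ml_priv sees the same pointer that was stored through lstats.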
     
  • __in_dev_get_rtnl(dev_out) is called while RTNL is not held, thus
    triggers a lockdep fault.

    At this point, we only perform a raw test of dev_out->ip_ptr being NULL;
    we don't need to make sure ip_ptr can't change right afterwards.

    We can use rcu_dereference_raw() for this.

    Reported-by: Andrew Morton
    Acked-by: Paul E. McKenney
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Instead of having two places where we allocate dev->_rx, introduce
    netif_alloc_rx_queues() helper and call it only from
    register_netdevice(), not from alloc_netdev_mq()

    Goal is to let drivers change dev->num_rx_queues after allocating netdev
    and before registering it.

    This also removes a lot of ifdefs in net/core/dev.c

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Freeing netdev without free_netdev() leads to net, tx leaks.
    It might lead to dereferencing a freed pointer.

    The semantic match that finds this problem is as follows:
    (http://coccinelle.lip6.fr/)

    @@
    struct net_device* dev;
    @@

    -kfree(dev)
    +free_netdev(dev)

    Signed-off-by: David S. Miller

    Vasiliy Kulikov
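
A userspace sketch of why freeing only the outer object leaks, assuming a toy device that owns one extra allocation. All names here (toy_netdev, toy_free_netdev, toy_free_outer_only, the allocation counter) are hypothetical; the real free_netdev() releases more than this sketch models.

```c
#include <stdlib.h>

/* Count of outstanding allocations, so a leak is observable. */
static int live_allocs;

static void *counted_alloc(size_t size)
{
    void *p = calloc(1, size);
    if (p)
        live_allocs++;
    return p;
}

static void counted_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Hypothetical stand-in for struct net_device: it owns a second allocation. */
struct toy_netdev {
    void *tx_queues;
};

struct toy_netdev *toy_alloc_netdev(void)
{
    struct toy_netdev *dev = counted_alloc(sizeof(*dev));
    if (!dev)
        return NULL;
    dev->tx_queues = counted_alloc(64);
    if (!dev->tx_queues) {
        counted_free(dev);
        return NULL;
    }
    return dev;
}

/* Analogue of kfree(dev): releases only the outer object, leaking tx_queues. */
void toy_free_outer_only(struct toy_netdev *dev)
{
    counted_free(dev);
}

/* Analogue of free_netdev(dev): releases what the device owns, then the device. */
void toy_free_netdev(struct toy_netdev *dev)
{
    if (!dev)
        return;
    counted_free(dev->tx_queues);
    counted_free(dev);
}

/* Number of allocations that have not been released yet. */
int toy_live_allocs(void)
{
    return live_allocs;
}
```

After toy_free_outer_only() the counter stays at one, which is the leak the semantic patch is meant to catch.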
     
  • Freeing netdev without free_netdev() leads to net, tx leaks.
    It might lead to dereferencing a freed pointer.

    The semantic match that finds this problem is as follows:
    (http://coccinelle.lip6.fr/)

    @@
    struct net_device* dev;
    @@

    -kfree(dev)
    +free_netdev(dev)

    Signed-off-by: David S. Miller

    Kulikov Vasiliy
     
  • Freeing netdev without free_netdev() leads to net, tx leaks.
    It might lead to dereferencing a freed pointer.

    The semantic match that finds this problem is as follows:
    (http://coccinelle.lip6.fr/)

    @@
    struct net_device* dev;
    @@

    -kfree(dev)
    +free_netdev(dev)

    Signed-off-by: David S. Miller

    Kulikov Vasiliy
     
  • Freeing netdev without free_netdev() leads to net, tx leaks.
    It might lead to dereferencing a freed pointer.

    The semantic match that finds this problem is as follows:
    (http://coccinelle.lip6.fr/)

    @@
    struct net_device* dev;
    @@

    -kfree(dev)
    +free_netdev(dev)

    Signed-off-by: David S. Miller

    Kulikov Vasiliy
     
  • SOCK_MIN_RCVBUF current value is 256 bytes

    It doesn't permit receiving the smallest possible frame, considering that
    socket sk_rmem_alloc/sk_rcvbuf account for skb truesizes. On 64bit arches,
    sizeof(struct sk_buff) is 240 bytes. Add the typical 64 bytes of
    headroom, and we go over the limit.

    With old kernels and 32bit arches, we were under the limit, if the network
    driver was doing copybreak.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
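
The arithmetic in the message above can be checked with a small sketch. The constants are the figures quoted in the message, not values read from a kernel, and the comparison in fits_rcvbuf() is a simplification of the real receive-buffer accounting; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Figures quoted in the commit message; not read from a live kernel. */
enum {
    OLD_SOCK_MIN_RCVBUF   = 256, /* bytes */
    SKB_STRUCT_SIZE_64BIT = 240, /* sizeof(struct sk_buff) on 64bit, per the message */
    TYPICAL_HEADROOM      = 64   /* bytes */
};

/* Smallest truesize charged for a frame: skb metadata plus its data buffer. */
size_t min_truesize(size_t skb_struct_size, size_t data_bytes)
{
    return skb_struct_size + data_bytes;
}

/* Simplified check: can a frame with this truesize be charged within the limit? */
bool fits_rcvbuf(size_t truesize, size_t rcvbuf)
{
    return truesize <= rcvbuf;
}
```

With the quoted figures, min_truesize(240, 64) is 304 bytes, which exceeds the 256-byte minimum, so even the smallest frame could not be queued.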
     
  • This enables auto loading for the smsc911x ethernet driver.

    Signed-off-by: Vincent Stehlé
    Signed-off-by: David S. Miller

    Vincent Stehlé
     
  • Reset queue mapping when an skb is reentering the stack via a tunnel.
    On second pass, the queue mapping from the original device is no
    longer valid.

    Signed-off-by: Tom Herbert
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Tom Herbert
     
  • Change "return (EXPR);" to "return EXPR;"

    return is not a function, parentheses are not required.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • On 32bit arches, if PAGE_SIZE is smaller than 65536, we can use 16bit
    offset and size fields. This patch saves 72 bytes per skb on i386, or
    128 bytes after rounding.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
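
A sketch of where the 72 bytes could come from, under two labeled assumptions: a 32-bit page pointer (modeled here with uint32_t so the sizes are the same on any host) and 18 fragment slots per skb. The struct and function names are hypothetical and only mirror the idea of narrowing the offset and size fields to 16 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Fragment descriptor with 32-bit offset and size fields (before the change). */
struct frag_wide {
    uint32_t page;        /* stands in for a 32-bit page pointer */
    uint32_t page_offset;
    uint32_t size;
};

/* Fragment descriptor with 16-bit offset and size fields (after the change). */
struct frag_narrow {
    uint32_t page;        /* stands in for a 32-bit page pointer */
    uint16_t page_offset;
    uint16_t size;
};

/* Assumption: 18 fragment slots per skb; with 4 bytes saved each this gives 72. */
enum { ASSUMED_FRAGS_PER_SKB = 18 };

/* Bytes saved per skb by narrowing every fragment descriptor. */
size_t bytes_saved_per_skb(void)
{
    return ASSUMED_FRAGS_PER_SKB *
           (sizeof(struct frag_wide) - sizeof(struct frag_narrow));
}
```

The 16-bit fields are only safe when PAGE_SIZE is smaller than 65536, which is the condition stated in the message.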
     
  • You can't call atomic_notifier_chain_unregister() while in atomic context.

    Fix: call un/register_atmdevice_notifier in module __init and __exit.

    Bug report:
    http://comments.gmane.org/gmane.linux.network/172603

    Reported-by: Mikko Vinni
    Tested-by: Mikko Vinni
    Signed-off-by: Karl Hiramoto
    Signed-off-by: David S. Miller

    Karl Hiramoto
     
  • Automatically allows vlans to get NETIF_F_HIGHDMA if underlying device
    supports it.

    On 32bit arches (and more precisely if CONFIG_HIGHMEM is enabled), it
    can help to reduce cost of illegal_highdma() and __skb_linearize()
    calls.

    Tested on tg3, bnx2, bonding, this worked very well.

    This is a generalization of a patch provided by Yi Zou & Jeff Kirsher.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Compex FreedomLine 32 PnP-PCI2 cards have only TP and BNC connectors but the
    SROM contains AUI port too. When TP loses link, the driver switches to
    non-existing AUI port (which reports that carrier is always present).

    Connecting TP back generates LinkPass interrupt but de_media_interrupt() is
    broken - it only updates the link state of currently connected media, ignoring
    the fact that LinkPass and LinkFail bits of MacStatus register belong to the
    TP port only (the chip documentation says that).

    This patch changes de_media_interrupt() to switch media to TP when link goes
    up (and media type is not locked) and also to update the link state only when
    the TP port is used.

    Also the NonselPortActive (and also SelPortActive) bits of SIAStatus register
    need to be cleared (by writing 1) after reading or they're useless.

    Signed-off-by: Ondrej Zary
    Acked-by: Jeff Garzik
    Signed-off-by: David S. Miller

    Ondrej Zary
     
  • At least my 21041 cards come out of suspend with bus mastering disabled so
    they did not work after resume (no data transferred).
    After adding pci_set_master(), the driver oopsed immediately on resume -
    because de_clean_rings() is called on suspend but de_init_rings() call
    was missing in resume.

    Also disable link (reset SIA) before sleep (de4x5 does this too).

    Signed-off-by: Ondrej Zary
    Acked-by: Jeff Garzik
    Signed-off-by: David S. Miller

    Ondrej Zary
     

25 Sep, 2010

6 commits

  • The function to resize the Tx/Rx rings had the potential to
    dereference a NULL pointer and the code would attempt to resize
    the Tx ring even if the Rx ring allocation had failed. This
    would cause some confusion in the return code semantics. Fixed
    up to just unwind the allocations if any of them fail and return
    an error.

    Signed-off-by: Greg Rose
    Tested-by: Emil Tantilov
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Greg Rose
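
The "unwind the allocations if any of them fail" approach can be sketched in userspace as follows. This is not the driver's code: toy_ring, toy_adapter, toy_resize_rings and the failure-injection hook toy_set_fail_after are hypothetical names, and the 16-byte descriptor size is arbitrary.

```c
#include <errno.h>
#include <stdlib.h>

struct toy_ring {
    void *desc;
    size_t count;
};

struct toy_adapter {
    struct toy_ring tx;
    struct toy_ring rx;
};

/* Test hook (hypothetical): when n > 0, the nth allocation from now fails. */
static int fail_after;

void toy_set_fail_after(int n)
{
    fail_after = n;
}

static void *ring_alloc(size_t count)
{
    if (fail_after > 0 && --fail_after == 0)
        return NULL;
    return calloc(count, 16); /* 16-byte descriptors, an arbitrary choice */
}

/* Resize both rings, or change nothing and return -ENOMEM. */
int toy_resize_rings(struct toy_adapter *ad, size_t new_tx, size_t new_rx)
{
    void *tx = ring_alloc(new_tx);
    void *rx = tx ? ring_alloc(new_rx) : NULL;

    if (!tx || !rx) {
        /* Unwind: free(NULL) is a no-op, so this covers either failure. */
        free(rx);
        free(tx);
        return -ENOMEM;
    }

    /* Both allocations succeeded: only now replace the live rings. */
    free(ad->tx.desc);
    free(ad->rx.desc);
    ad->tx.desc = tx;
    ad->tx.count = new_tx;
    ad->rx.desc = rx;
    ad->rx.count = new_rx;
    return 0;
}
```

Allocating both new rings before touching the old ones keeps the return code unambiguous: zero means both rings changed, an error means neither did.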
     
  • At least on older 21041-AA chips (mine is rev. 11), TP duplex autonegotiation
    causes the card not to work at all (link is up but no packets are transmitted).

    de4x5 disables autonegotiation completely. But it seems to work on newer
    (21041-PA rev. 21) so disable it only on rev
    Acked-by: Jeff Garzik
    Signed-off-by: David S. Miller

    Ondrej Zary
     
  • We have for each socket :

    One spinlock (sk_slock.slock)
    One rwlock (sk_callback_lock)

    Possible scenarios are :

    (A) (this is used in net/sunrpc/xprtsock.c)
    read_lock(&sk->sk_callback_lock) (without blocking BH)

    spin_lock(&sk->sk_slock.slock);
    ...
    read_lock(&sk->sk_callback_lock);
    ...

    (B)
    write_lock_bh(&sk->sk_callback_lock)
    stuff
    write_unlock_bh(&sk->sk_callback_lock)

    (C)
    spin_lock_bh(&sk->sk_slock)
    ...
    write_lock_bh(&sk->sk_callback_lock)
    stuff
    write_unlock_bh(&sk->sk_callback_lock)
    spin_unlock_bh(&sk->sk_slock)

    This (C) case conflicts with (A) :

    CPU1 [A]                    CPU2 [C]
    read_lock(callback_lock)
                                spin_lock_bh(slock)

    We have one problematic (C) use case in inet_csk_listen_stop() :

    local_bh_disable();
    bh_lock_sock(child); // spin_lock_bh(&sk->sk_slock)
    WARN_ON(sock_owned_by_user(child));
    ...
    sock_orphan(child); // write_lock_bh(&sk->sk_callback_lock)

    lockdep is not happy with this, as reported by Tetsuo Handa

    It seems the only way to deal with this is to use
    read_lock_bh(callbacklock) everywhere.

    Thanks to Jarek for pointing a bug in my first attempt and suggesting
    this solution.

    Reported-by: Tetsuo Handa
    Tested-by: Tetsuo Handa
    Signed-off-by: Eric Dumazet
    CC: Jarek Poplawski
    Tested-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • If the PM support is available this is passed
    through the platform instead of being hard-coded
    in the core files.
    WoL on Magic Frame can be enabled by using
    the ethtool support.

    Signed-off-by: Giuseppe Cavallaro
    Signed-off-by: David S. Miller

    Giuseppe Cavallaro
     
  • Signed-off-by: Masayuki Ohtake
    Signed-off-by: David S. Miller

    Masayuki Ohtake
     
  • While investigating a bit, I found ip_fragment() slow path was taken
    because ip_append_data() provides following layout for a send(MTU +
    N*(MTU - 20)) syscall :

    - one skb with 1500 (mtu) bytes
    - N fragments of 1480 (mtu-20) bytes (before adding IP header)
    The last fragment gets 17 bytes of trailer data because of the following bit:

    if (datalen == length + fraggap)
            alloclen += rt->dst.trailer_len;

    Then esp4 adds 16 bytes of data (while trailer_len is 17... hmm...
    another bug ?)

    In ip_fragment(), we notice last fragment is too big (1496 + 20) > mtu,
    so we take slow path, building another skb chain.

    In order to avoid taking slow path, we should correct ip_append_data()
    to make sure last fragment has real trail space, under mtu...

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
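
The size check that pushes the last fragment onto the slow path can be worked through with the numbers quoted above (MTU 1500, 20-byte IP header, 16 bytes added by esp4). The helper names are hypothetical and the check is a simplification of what ip_fragment() actually tests.

```c
#include <stdbool.h>

/* Figures quoted in the commit message. */
enum {
    MTU         = 1500,
    IP_HDR_LEN  = 20,
    ESP_TRAILER = 16
};

/* On-wire length of one fragment: payload, trailer data, then the IP header. */
int fragment_wire_len(int payload_len, int trailer_len)
{
    return payload_len + trailer_len + IP_HDR_LEN;
}

/* A fragment can be sent as-is only if it does not exceed the MTU. */
bool fits_mtu(int wire_len)
{
    return wire_len <= MTU;
}
```

A full 1480-byte last fragment plus the 16-byte trailer gives 1496 + 20 = 1516 bytes, over the MTU. If the last fragment left room for the trailer (1464 bytes of payload), the total would be exactly 1500 and the fast path could be kept.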
     

24 Sep, 2010

4 commits

  • Change "return (EXPR);" to "return EXPR;"

    return is not a function, parentheses are not required.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • E1000 can benefit from calling the GRO receive functions.

    Signed-off-by: Jesse Brandeburg
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jesse Brandeburg
     
  • Net drivers in general have an issue where timers fired
    by mod_timer or work threads with schedule_work are running
    outside of the rtnl_lock.

    With no other lock protection these routines are vulnerable
    to races with driver unload or reset paths.

    The longer term solution to this might be a redesign with
    safer locks being taken in the driver to guarantee no
    reentrance, but for now a safe and effective fix is
    to take the rtnl_lock in these routines.

    Signed-off-by: Jesse Brandeburg
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jesse Brandeburg
     
  • E1000 is using several timers that in a follow on patch
    will need to acquire the rtnl_lock in order to be safe.

    This patch moves the timer bodies into work queues which
    will allow the next patch to add rtnl_lock.

    Signed-off-by: Jesse Brandeburg
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jesse Brandeburg
     

23 Sep, 2010

9 commits

  • If the netdev->features is set with NETIF_F_HIGHDMA, we should set the
    corresponding netdev->vlan_features as well to allow VLAN netdev created
    on top of the real netdev to be able to also benefit from HIGHDMA on 32bit
    system, reducing the performance hit that is caused by __skb_linearize(),
    particularly for large send. This is fixed in this patch for all Intel e1000,
    e1000e, igb, ixgbe, and ixgbe drivers since this should be beneficial
    to all devices supported by these drivers.

    Signed-off-by: Yi Zou
    Tested-by: Emil Tantilov
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Yi Zou
     
  • This patch adds support for the Intel(r) DH89xxCC series. The new
    device will be using Intel(r) i347-AT4 and Marvell(r) M88E1322 and
    M88E1112 PHYs. Support for these devices has also been added here.

    Signed-off-by: Joseph Gasparakis
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Joseph Gasparakis
     
  • This change corrects an issue in which we were setting all flag bits except
    for promisc instead of clearing the promisc bits due to the incorrect use
    of an |= instead of an &=.

    Signed-off-by: Alexander Duyck
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Alexander Duyck
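
The difference between the two operators is easy to see in isolation. The flag names and bit values below are hypothetical and chosen for illustration; they are not the driver's register definitions.

```c
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define TOY_UCAST_PROMISC 0x1u
#define TOY_MCAST_PROMISC 0x2u
#define TOY_OTHER_FLAG    0x4u

#define TOY_PROMISC_MASK (TOY_UCAST_PROMISC | TOY_MCAST_PROMISC)

/* Buggy form: OR-ing with the inverted mask sets every bit except promisc. */
uint32_t clear_promisc_buggy(uint32_t flags)
{
    flags |= ~TOY_PROMISC_MASK;
    return flags;
}

/* Fixed form: AND-ing with the inverted mask clears only the promisc bits. */
uint32_t clear_promisc_fixed(uint32_t flags)
{
    flags &= ~TOY_PROMISC_MASK;
    return flags;
}
```

Starting from all three toy flags set, the fixed form leaves only TOY_OTHER_FLAG, while the buggy form leaves the promisc bits untouched and turns on every other bit.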
     
  • For non-managed versions of 82579, set the bit that prevents the hardware
    from automatically configuring the PHY after resets only when the driver
    performs a reset, and clear the bit after resets. This is so the hardware
    can configure the PHY automatically when the part is reset in a manner that
    is not controlled by the driver (e.g. in a virtual environment via PCI FLR);
    otherwise the PHY will be mis-configured, causing issues such as failing to
    link at 1000Mbps.
    For managed versions of 82579, keep the previous behavior since the
    manageability firmware will handle the PHY configuration.

    Signed-off-by: Bruce Allan
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Bruce Allan
     
  • The subject workaround was causing CRC errors due to writing the wrong
    register with updates of the RCTL register. It was also found that the
    workaround function which modifies the RCTL register was being called in
    the middle of a read-modify-write operation of the RCTL register, so the
    function call has been moved appropriately. Lastly, jumbo frames must not
    be allowed when CRC stripping is disabled by a module parameter because the
    workaround requires the CRC be stripped.

    Signed-off-by: Bruce Allan
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Bruce Allan
     
  • On 82579, there is a hardware bug that can cause received packets to not
    get transferred from the PHY to the MAC due to K1 (a power saving feature
    of the PHY-MAC interconnect similar to ASPM L1). Since the MAC controls
    the accounting of missed packets, these will go unnoticed. Work around the
    issue by setting the K1 beacon duration according to the link speed.

    Signed-off-by: Bruce Allan
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Bruce Allan
     
  • Two recent patches to cleanup the reset[1] and initial PHY configuration[2]
    code paths for ICH/PCH devices inadvertently left out a 10msec delay and
    device ID check respectively which are necessary for the 82566DC (device id
    0x104b) to be configured properly, otherwise it will not get link.

    [1] commit e98cac447cc1cc418dff1d610a5c79c4f2bdec7f
    [2] commit 3f0c16e84438d657d29446f85fe375794a93f159

    CC: stable@kernel.org
    Signed-off-by: Bruce Allan
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Bruce Allan
     
  • Since the hardware is prevented from performing automatic PHY configuration
    (the driver does it instead), the OEM_WRITE_ENABLE bit in the EXTCNF_CTRL
    register will not get cleared, preventing the SMBus address and the LED
    configuration from being written to the PHY registers. On 82579, do not
    check the OEM_WRITE_ENABLE bit.

    Signed-off-by: Bruce Allan
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Bruce Allan
     
  • When going to Sx, disable gigabit in PHY (e1000_oem_bits_config_ich8lan)
    in addition to the MAC before configuring PHY wakeup otherwise the PHY
    configuration writes might be missed. Also write the LED configuration
    and SMBus address to the PHY registers (e1000_oem_bits_config_ich8lan and
    e1000_write_smbus_addr, respectively). The reset is no longer needed
    since re-auto-negotiation is forced in e1000_oem_bits_config_ich8lan and
    leaving it in causes issues with auto-negotiating the link.

    Signed-off-by: Bruce Allan
    Tested-by: Jeff Pieper
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Bruce Allan