18 May, 2010

1 commit


16 May, 2010

1 commit


12 Apr, 2010

1 commit


07 Apr, 2010

1 commit


04 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

25 Mar, 2010

2 commits

  • Updates real_num_tx_queues in case underlying real device
    has changed real_num_tx_queues.

    -v2
    As per Eric Dumazet comment:-
    -- adds BUG_ON to catch case of real_num_tx_queues exceeding num_tx_queues.
    -- created this self contained patch to just update real_num_tx_queues.

    Signed-off-by: Vasu Dev
    Signed-off-by: Jeff Kirsher
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Vasu Dev
     
  • This is required to correctly select vlan tx queue for a driver
    supporting multi tx queue with ndo_select_queue implemented since
    currently selected vlan tx queue is unaligned to selected queue by
    real net_devce ndo_select_queue.

    Unaligned vlan tx queue selection causes thrash with higher vlan
    tx lock contention for least fcoe traffic and wrong socket tx
    queue_mapping for ixgbe having ndo_select_queue implemented.

    -v2

    As per Eric Dumazet comments, mirrored
    vlan net_device_ops to have them with and without vlan_dev_select_queue
    and then select according to real dev ndo_select_queue present or not
    for a vlan net_device. This is to completely skip vlan_dev_select_queue
    calling for real net_device not supporting ndo_select_queue.

    Signed-off-by: Vasu Dev
    Signed-off-by: Jeff Kirsher
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Vasu Dev
     

21 Mar, 2010

1 commit


19 Mar, 2010

2 commits

  • When doing "ifenslave -d bond0 eth0", there is chance to get NULL
    dereference in netif_receive_skb(), because dev->master suddenly becomes
    NULL after we tested it.

    We should use ACCESS_ONCE() to avoid this (or rcu_dereference())

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • It's not desired for underlaying devices to change type. At the time,
    there is for example possible to have bond with changed type from
    Ethernet to Infiniband as a port of a bridge. This patch fixes this.

    Signed-off-by: Jiri Pirko
    Signed-off-by: Jay Vosburgh
    Signed-off-by: David S. Miller

    Jiri Pirko
     

17 Feb, 2010

1 commit

  • Add __percpu sparse annotations to net.

    These annotations are to make sparse consider percpu variables to be
    in a different address space and warn if accessed without going
    through percpu accessors. This patch doesn't affect normal builds.

    The macro and type tricks around snmp stats make things a bit
    interesting. DEFINE/DECLARE_SNMP_STAT() macros mark the target field
    as __percpu and SNMP_UPD_PO_STATS() macro is updated accordingly. All
    snmp_mib_*() users which used to cast the argument to (void **) are
    updated to cast it to (void __percpu **).

    Signed-off-by: Tejun Heo
    Acked-by: David S. Miller
    Cc: Patrick McHardy
    Cc: Arnaldo Carvalho de Melo
    Cc: Vlad Yasevich
    Cc: netdev@vger.kernel.org
    Signed-off-by: David S. Miller

    Tejun Heo
     

04 Feb, 2010

1 commit

  • In the vlan and macvlan drivers, the start_xmit function forwards
    data to the dev_queue_xmit function for another device, which may
    potentially belong to a different namespace.

    To make sure that classification stays within a single namespace,
    this resets the potentially critical fields.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

28 Jan, 2010

1 commit


25 Jan, 2010

1 commit

  • Bruno Prémont found commit 9793241fe92f7d930
    (vlan: Precise RX stats accounting) added a regression for non
    hw accelerated vlans.

    [ 26.390576] BUG: unable to handle kernel NULL pointer dereference at (null)
    [ 26.396369] IP: [] vlan_skb_recv+0x89/0x280 [8021q]

    vlan_dev_info() was used with original device, instead of
    skb->dev. Also spotted by Américo Wang.

    Reported-By: Bruno Prémont
    Tested-By: Bruno Prémont
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

18 Jan, 2010

1 commit


04 Jan, 2010

1 commit

  • This allows a bond device to specify an arp_ip_target as a host that is
    not on the same vlan as the base bond device and still use arp
    validation. A configuration like this, now works:

    BONDING_OPTS="mode=active-backup arp_interval=1000 arp_ip_target=10.0.100.1 arp_validate=3"

    1: lo: mtu 16436 qdisc noqueue
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
    valid_lft forever preferred_lft forever
    2: eth1: mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    3: eth0: mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    8: bond0: mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::213:21ff:febe:33e9/64 scope link
    valid_lft forever preferred_lft forever
    9: bond0.100@bond0: mtu 1500 qdisc noqueue
    link/ether 00:13:21:be:33:e9 brd ff:ff:ff:ff:ff:ff
    inet 10.0.100.2/24 brd 10.0.100.255 scope global bond0.100
    inet6 fe80::213:21ff:febe:33e9/64 scope link
    valid_lft forever preferred_lft forever

    Ethernet Channel Bonding Driver: v3.6.0 (September 26, 2009)

    Bonding Mode: fault-tolerance (active-backup)
    Primary Slave: None
    Currently Active Slave: eth1
    MII Status: up
    MII Polling Interval (ms): 0
    Up Delay (ms): 0
    Down Delay (ms): 0
    ARP Polling Interval (ms): 1000
    ARP IP target/s (n.n.n.n form): 10.0.100.1

    Slave Interface: eth1
    MII Status: up
    Link Failure Count: 1
    Permanent HW addr: 00:40:05:30:ff:30

    Slave Interface: eth0
    MII Status: up
    Link Failure Count: 0
    Permanent HW addr: 00:13:21:be:33:e9

    Signed-off-by: Andy Gospodarek
    Signed-off-by: Jay Vosburgh
    Signed-off-by: David S. Miller

    Andy Gospodarek
     

27 Dec, 2009

1 commit

  • Using dev_hard_header allows us to use LLC with VLANs and potentially
    other Ethernet/TokernRing specific encapsulations. It also removes code
    duplication between LLC and Ethernet/TokenRing core code.

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
     

04 Dec, 2009

1 commit


03 Dec, 2009

1 commit

  • Take advantage of the fact that an explicit rtnl_kill_links is
    unnecessary (and skipping it improves batching), as network namespace
    exit calls dellink on all remaining virtual devices, and
    rtnl_link_unregister calls dellink on all outstanding devices in that
    network namespace. To do this we need to leave the vlan proc
    directories in place until after network device exit time, which is
    done by using register_pernet_subsys instead of
    register_pernet_device.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

02 Dec, 2009

1 commit


27 Nov, 2009

1 commit

  • Currently the UP/DOWN state of VLANs is synchronized to the state of the
    underlying device, meaning all VLANs are set down once the underlying
    device is set down. This causes all routes to the VLAN devices to vanish.

    Add a flag to specify a "loose binding" mode, in which only the operstate
    is transfered, but the VLAN device state is independant.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

19 Nov, 2009

1 commit


18 Nov, 2009

2 commits

  • Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • With multi queue devices, its possible that several cpus call
    vlan RX routines simultaneously for the same vlan device.

    We update RX stats counter without any locking, so we can
    get slightly wrong counters.

    One possible fix is to use percpu counters, to get precise
    accounting and also get guarantee of no cache line ping pongs
    between cpus.

    Note: this adds 16 bytes (32 bytes on 64bit arches) of percpu
    data per vlan device.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

17 Nov, 2009

1 commit

  • In case register_netdevice() returns an error, and a new vlan_group
    was allocated and inserted in vlan_group_hash[] we call
    vlan_group_free() without deleting group from hash table. Future
    lookups can give infinite loops or crashes.

    We must delete the vlan_group using RCU safe procedure.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

16 Nov, 2009

1 commit


14 Nov, 2009

1 commit


11 Nov, 2009

1 commit


08 Nov, 2009

1 commit

  • There is no good reason to not support userspace specifying the
    network namespace during device creation, and it makes it easier
    to create a network device and pass it to a child network namespace
    with a well known name.

    We have to be careful to ensure that the target network namespace
    for the new device exists through the life of the call. To keep
    that logic clear I have factored out the network namespace grabbing
    logic into rtnl_link_get_net.

    In addtion we need to continue to pass the source network namespace
    to the rtnl_link_ops.newlink method so that we can find the base
    device source network namespace.

    Signed-off-by: Eric W. Biederman
    Acked-by: Eric Dumazet

    Eric W. Biederman
     

30 Oct, 2009

3 commits

  • The temporary copy of the VLAN group is not neccessary since the lower device
    is already in the process of being unregistered, if it was neccessary the
    memset of the global group would introduce a race condition.

    With this removed, the changes to the original code are only a few lines, so
    remove the new function and move the code back into vlan_device_event().

    Signed-off-by: Patrick McHardy
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • This will allow drivers to adjust their receive path dynamically
    based on whether GRO is being applied successfully.

    Currently all in-tree callers ignore the return values of these
    functions and do not need to be changed.

    Signed-off-by: Ben Hutchings
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Ben Hutchings
     
  • This clarifies which return and parameter types are GRO result codes
    and not RX result codes.

    Signed-off-by: Ben Hutchings
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Ben Hutchings
     

29 Oct, 2009

1 commit


28 Oct, 2009

2 commits


27 Oct, 2009

1 commit

  • We currently use a 16 bit field (vlan_tci) to store VLAN ID/PRIO on a skb.

    Null value is used as a special value, meaning vlan tagging not enabled.
    This forbids use of null vlan ID.

    As pointed by David, some drivers use the 3 high order bits (PRIO)

    As VLAN ID is 12 bits, we can use the remaining bit (CFI) as a flag, and
    allow null VLAN ID.

    In case future code really wants to use VLAN_CFI_MASK, we'll have to use
    a bit outside of vlan_tci.

    #define VLAN_PRIO_MASK 0xe000 /* Priority Code Point */
    #define VLAN_PRIO_SHIFT 13
    #define VLAN_CFI_MASK 0x1000 /* Canonical Format Indicator */
    #define VLAN_TAG_PRESENT VLAN_CFI_MASK
    #define VLAN_VID_MASK 0x0fff /* VLAN Identifier */

    Reported-by: Gertjan Hofman
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

27 Sep, 2009

1 commit


04 Sep, 2009

1 commit

  • Its hard to tell if vlans are dropping frames, since
    every frame given to vlan_???_start_xmit() functions
    is accounted as fully transmitted by lower device.

    We can test dev_queue_xmit() return values to
    properly account for dropped frames.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

03 Sep, 2009

1 commit

  • vlan_dev_hard_start_xmit() & vlan_dev_hwaccel_hard_start_xmit()
    select txqueue number 0, instead of using index provided by
    skb_get_queue_mapping().

    This is not correct after commit 2e59af3dcbdf11635c03f
    [vlan: multiqueue vlan device] because
    txq->tx_packets & txq->tx_bytes changes are performed on
    a single location, and not the right locking.

    Fix is to take the appropriate struct netdev_queue pointer

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet