02 Jun, 2011

1 commit

  • commit 4af429d29b341bb1735f04c2fb960178ed5d52e7 (vlan: lockless
    transmit path) has a typo in vlan_dev_hard_start_xmit(): it uses
    u64_stats_update_begin() to end the stat update, where it should use
    u64_stats_update_end().

    Signed-off-by: Wei Yongjun
    Reviewed-by: WANG Cong
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Wei Yongjun
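
The begin/end pairing that this fix restores can be sketched in userspace C11. This is only an illustration under stated assumptions: the struct and function names below are hypothetical, a single writer per stats block is assumed, and the kernel's own u64_stats helpers are implemented differently. The writer makes a sequence counter odd before touching the counters and even again afterwards; a reader retries whenever it sees an odd or changed sequence.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for one per-CPU stats block. */
struct stats_sketch {
    atomic_uint seq;               /* odd while an update is in progress */
    _Atomic uint64_t tx_packets;
    _Atomic uint64_t tx_bytes;
};

/* Open a write section: the sequence becomes odd. */
static void stats_update_begin(struct stats_sketch *s)
{
    unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
}

/* Close the write section: the sequence becomes even again. */
static void stats_update_end(struct stats_sketch *s)
{
    unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

    atomic_store_explicit(&s->seq, seq + 1, memory_order_release);
}

/* Writer side; a single writer per stats block is assumed. */
static void stats_account_tx(struct stats_sketch *s, uint64_t len)
{
    stats_update_begin(s);
    atomic_fetch_add_explicit(&s->tx_packets, 1, memory_order_relaxed);
    atomic_fetch_add_explicit(&s->tx_bytes, len, memory_order_relaxed);
    stats_update_end(s);           /* must be "end", not a second "begin" */
}

/* Reader side: retry until the snapshot was taken outside a write section. */
static void stats_read(struct stats_sketch *s, uint64_t *packets,
                       uint64_t *bytes)
{
    unsigned int start, end;

    do {
        start = atomic_load_explicit(&s->seq, memory_order_acquire);
        *packets = atomic_load_explicit(&s->tx_packets, memory_order_relaxed);
        *bytes = atomic_load_explicit(&s->tx_bytes, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        end = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while ((start & 1) || start != end);
}
```

In this sketch the two halves differ only in where the ordering barrier sits relative to the sequence bump, which is why using "begin" in place of "end" is easy to overlook.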
     

13 May, 2011

1 commit

  • Fix VLAN features propagation for devices which change vlan_features.
    For this to work, the driver needs to make sure netdev_features_changed()
    gets called after the change (it is called, e.g., after ndo_set_features()).

    A side effect is that a user might request features that will never
    be enabled on a VLAN device.

    Signed-off-by: Michał Mirosław
    Signed-off-by: David S. Miller

    Michał Mirosław
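
The side effect mentioned above can be illustrated with a small sketch. It assumes a deliberately simplified rule: a VLAN device keeps a requested feature only if the real device currently has it enabled and also lists it in vlan_features. The feature bits and the function name are hypothetical, and the kernel's own fix-up may consider additional flags that are left out here.

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define F_SG_SKETCH   (1u << 0)
#define F_TSO_SKETCH  (1u << 1)
#define F_CSUM_SKETCH (1u << 2)

/*
 * Simplified rule: a requested feature survives only if the real device
 * has it enabled and advertises it for VLAN devices.
 */
static uint32_t vlan_fix_features_sketch(uint32_t requested,
                                         uint32_t real_features,
                                         uint32_t real_vlan_features)
{
    return requested & real_features & real_vlan_features;
}
```

A request for TSO on the VLAN device stays recorded but has no effect until the real device adds TSO to its vlan_features, at which point re-running the fix-up enables it.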
     

12 May, 2011

1 commit


11 May, 2011

1 commit

  • ip link add link eth2 eth2.103 type vlan id 103 gvrp on loose_binding on
    ip link set eth2.103 up
    rmmod tg3 # driver providing eth2

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] garp_request_leave+0x3e/0xc0 [garp]
    PGD 11d251067 PUD 11b9e0067 PMD 0
    Oops: 0000 [#1] SMP
    last sysfs file: /sys/devices/virtual/net/eth2.104/ifindex
    CPU 0
    Modules linked in: tg3(-) 8021q garp nfsd lockd auth_rpcgss sunrpc libphy sg [last unloaded: x_tables]

    Pid: 11494, comm: rmmod Tainted: G W 2.6.39-rc6-00261-gfd71257-dirty #580 HP ProLiant BL460c G6
    RIP: 0010:[] [] garp_request_leave+0x3e/0xc0 [garp]
    RSP: 0018:ffff88007a19bae8 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: ffff88011b5e2000 RCX: 0000000000000002
    RDX: 0000000000000000 RSI: 0000000000000175 RDI: ffffffffa0030d5b
    RBP: ffff88007a19bb18 R08: 0000000000000001 R09: ffff88011bd64a00
    R10: ffff88011d34ec00 R11: 0000000000000000 R12: 0000000000000002
    R13: ffff88007a19bc48 R14: ffff88007a19bb88 R15: 0000000000000001
    FS: 0000000000000000(0000) GS:ffff88011fc00000(0063) knlGS:00000000f77d76c0
    CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
    CR2: 0000000000000000 CR3: 000000011a675000 CR4: 00000000000006f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process rmmod (pid: 11494, threadinfo ffff88007a19a000, task ffff8800798595c0)
    Stack:
    ffff88007a19bb36 ffff88011c84b800 ffff88011b5e2000 ffff88007a19bc48
    ffff88007a19bb88 0000000000000006 ffff88007a19bb38 ffffffffa003a5f6
    ffff88007a19bb38 670088007a19bba8 ffff88007a19bb58 ffffffffa00397e7
    Call Trace:
    [] vlan_gvrp_request_leave+0x46/0x50 [8021q]
    [] vlan_dev_stop+0xb7/0xc0 [8021q]
    [] __dev_close_many+0x87/0xe0
    [] dev_close_many+0x87/0x110
    [] rollback_registered_many+0xa0/0x240
    [] unregister_netdevice_many+0x19/0x60
    [] vlan_device_event+0x53b/0x550 [8021q]
    [] ? ip6mr_device_event+0xa8/0xd0
    [] notifier_call_chain+0x53/0x80
    [] __raw_notifier_call_chain+0x9/0x10
    [] raw_notifier_call_chain+0x11/0x20
    [] call_netdevice_notifiers+0x32/0x60
    [] rollback_registered_many+0x10f/0x240
    [] rollback_registered+0x2f/0x40
    [] unregister_netdevice_queue+0x58/0x90
    [] unregister_netdev+0x1b/0x30
    [] tg3_remove_one+0x6f/0x10b [tg3]

    We should call vlan_gvrp_request_leave() from unregister_vlan_dev(),
    not from vlan_dev_stop(), because vlan_gvrp_uninit_applicant()
    is called right after unregister_netdevice_queue(). In batch mode,
    unregister_netdevice_queue() doesn’t immediately call vlan_dev_stop().

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

13 Apr, 2011

1 commit

  • Now there are 2 paths for rx vlan frames. When rx-vlan-hw-accel is
    enabled, skb is untagged by NIC, vlan_tci is set and the skb gets into
    vlan code in __netif_receive_skb - vlan_hwaccel_do_receive.

    For non-rx-vlan-hw-accel however, tagged skb goes thru whole
    __netif_receive_skb, it's untagged in ptype_base handler and reinjected

    This inconsistency is fixed by this patch. Vlan untagging happens early in
    __netif_receive_skb so the rest of code (ptype_all handlers, rx_handlers)
    see the skb like it was untagged by hw.

    Signed-off-by: Jiri Pirko

    v1->v2:
    remove "inline" from vlan_core.c functions
    Signed-off-by: David S. Miller

    Jiri Pirko
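
What "untagging early" means for the frame bytes can be sketched in userspace. The function below is a hypothetical illustration, not the kernel's code: it works on a flat byte buffer rather than an skb, extracts the TCI from an 802.1Q tag, and removes the 4 tag bytes so that later handlers see the frame as if hardware had already stripped the tag.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN_SKETCH    6
#define VLAN_HLEN_SKETCH   4
#define ETH_P_8021Q_SKETCH 0x8100

/*
 * If the frame carries an 802.1Q tag, store the TCI in *tci, strip the
 * 4-byte tag in place, shrink *len, and return 1. Otherwise return 0
 * and leave the frame untouched.
 */
static int vlan_untag_sketch(uint8_t *frame, size_t *len, uint16_t *tci)
{
    const size_t tag_off = 2 * ETH_ALEN_SKETCH;  /* tag follows dst + src MAC */
    uint16_t tpid;

    /* Need the tag plus the encapsulated EtherType behind it. */
    if (*len < tag_off + VLAN_HLEN_SKETCH + 2)
        return 0;

    tpid = (uint16_t)((frame[tag_off] << 8) | frame[tag_off + 1]);
    if (tpid != ETH_P_8021Q_SKETCH)
        return 0;

    *tci = (uint16_t)((frame[tag_off + 2] << 8) | frame[tag_off + 3]);

    /* Close the gap so the encapsulated EtherType follows the MACs. */
    memmove(frame + tag_off, frame + tag_off + VLAN_HLEN_SKETCH,
            *len - tag_off - VLAN_HLEN_SKETCH);
    *len -= VLAN_HLEN_SKETCH;
    return 1;
}
```

After a successful call the caller holds the tag out of band in tci, which mirrors how vlan_tci is set in the hardware-accelerated path.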
     

03 Apr, 2011

1 commit

  • Note: get_flags was actually broken, because it should return the
    flags capped with vlan_features. This is now done implicitly by
    limiting netdev->hw_features.

    RX checksumming offload control is (and was) broken, as there was no way
    before to say whether it's done for tagged packets.

    Signed-off-by: Michał Mirosław
    Signed-off-by: David S. Miller

    Michał Mirosław
     

19 Mar, 2011

1 commit

  • Commit c95b819ad7 (gre: Use needed_headroom)
    made gre use needed_headroom instead of hard_header_len

    This uncovers a bug in the vlan code.

    We should make sure vlan devices take into account their
    real_dev->needed_headroom or we risk a crash in ipgre_header(), because
    we don't have enough room to push the IP header in the skb.

    Reported-by: Diddi Oscarsson
    Signed-off-by: Eric Dumazet
    Cc: Patrick McHardy
    Cc: Herbert Xu
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Eric Dumazet
     

08 Mar, 2011

1 commit


17 Nov, 2010

2 commits

  • Now that vlans are lockless, we don't need special ndo_select_queue() logic.
    dev_pick_tx() will do the multiqueue stuff on the real device transmit.

    Suggested-by: Jesse Gross
    Signed-off-by: Eric Dumazet
    Acked-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • vlan is a stacked device, like tunnels. We should use the lockless
    mechanism we are using in tunnels and loopback.

    This patch completely removes locking in TX path.

    tx stat counters are added into existing percpu stat structure, renamed
    from vlan_rx_stats to vlan_pcpu_stats.

    Note : this partially reverts commit 2e59af3dcbdf (vlan: multiqueue vlan
    device)

    Signed-off-by: Eric Dumazet
    Cc: Patrick McHardy
    Signed-off-by: David S. Miller

    Eric Dumazet
     

16 Nov, 2010

2 commits

  • Now that VLAN packets are tagged in dev_hard_start_xmit()
    at the bottom of the stack we no longer need to tag them
    in the 8021Q module (Except in the !VLAN_FLAG_REORDER_HDR
    case).

    This allows the accel path and non accel paths to be consolidated.
    Here the vlan_tci in the skb is always set and we allow the
    stack to add the actual tag in dev_hard_start_xmit().

    Signed-off-by: John Fastabend
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    John Fastabend
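
The split of responsibilities described above can be sketched as follows. Everything here is a hypothetical userspace model: skb_sketch is a flat buffer, the VLAN layer only records the TCI, and the bottom of the stack inserts the 4-byte tag in-band unless the hardware is assumed to do it. The !VLAN_FLAG_REORDER_HDR exception is not modeled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VLAN_HLEN_SKETCH     4
#define MAC_ADDRS_LEN_SKETCH 12   /* destination + source MAC */

/* Hypothetical, heavily simplified packet buffer. */
struct skb_sketch {
    uint8_t data[64];
    size_t len;
    uint16_t vlan_tci;    /* tag carried out of band */
    int tag_pending;      /* non-zero while the tag is not yet in the frame */
};

/* VLAN device transmit: only record the tag, never touch the payload. */
static void vlan_dev_xmit_sketch(struct skb_sketch *skb, uint16_t tci)
{
    skb->vlan_tci = tci;
    skb->tag_pending = 1;
}

/* Bottom of the stack: add the tag in-band unless hardware inserts it. */
static int hard_start_xmit_sketch(struct skb_sketch *skb, int hw_inserts_tag)
{
    if (!skb->tag_pending || hw_inserts_tag)
        return 0;
    if (skb->len < MAC_ADDRS_LEN_SKETCH ||
        skb->len + VLAN_HLEN_SKETCH > sizeof(skb->data))
        return -1;                /* malformed frame or no room for the tag */

    /* Shift everything after the MAC addresses to make room for the tag. */
    memmove(skb->data + MAC_ADDRS_LEN_SKETCH + VLAN_HLEN_SKETCH,
            skb->data + MAC_ADDRS_LEN_SKETCH,
            skb->len - MAC_ADDRS_LEN_SKETCH);
    skb->data[12] = 0x81;         /* TPID 0x8100 */
    skb->data[13] = 0x00;
    skb->data[14] = (uint8_t)(skb->vlan_tci >> 8);
    skb->data[15] = (uint8_t)(skb->vlan_tci & 0xff);
    skb->len += VLAN_HLEN_SKETCH;
    skb->tag_pending = 0;
    return 0;
}
```

With this shape the accelerated and non-accelerated paths share the same VLAN-layer code; only the last step differs.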
     
  • It is possible for the headroom to be smaller than the
    hard_header_len for a short period of time after toggling
    the vlan offload setting.

    This is not a hard error and skb_cow_head is called in
    __vlan_put_tag() to resolve this.

    Signed-off-by: John Fastabend
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    John Fastabend
     

21 Oct, 2010

1 commit

  • A struct net_device always maps to zero or one vlan groups and we
    always know the device when we are looking up a group. We currently
    do a hash table lookup on the device to find the group but it is
    much simpler to just store a pointer.

    Signed-off-by: Jesse Gross
    Signed-off-by: David S. Miller

    Jesse Gross
     

06 Oct, 2010

1 commit

  • In various situations, a device provides a packet to our stack and we
    drop it before it enters protocol stack :
    - softnet backlog full (accounted in /proc/net/softnet_stat)
    - bad vlan tag (not accounted)
    - unknown/unregistered protocol (not accounted)

    We can handle a per-device counter of such dropped frames at core level,
    and automatically add it to the device-provided stats (rx_dropped), so
    that standard tools can be used (ifconfig, ip link, cat /proc/net/dev).

    This is a generalization of commit 8990f468a (net: rx_dropped
    accounting), thus reverting it.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
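
A minimal sketch of the idea, with hypothetical types and names: the core keeps its own per-device drop counter, and the stats getter folds it into the rx_dropped value reported by the driver, writing the result into caller-provided storage.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical subset of the link statistics. */
struct link_stats_sketch {
    uint64_t rx_packets;
    uint64_t rx_dropped;
};

struct netdev_sketch {
    struct link_stats_sketch driver_stats;  /* what the driver reports */
    atomic_ulong core_rx_dropped;           /* drops counted by the core */
};

/* Called by the core whenever it drops a frame received on this device. */
static void core_rx_drop_sketch(struct netdev_sketch *dev)
{
    atomic_fetch_add(&dev->core_rx_dropped, 1);
}

/* Fill caller-provided storage with driver stats plus core-level drops. */
static void dev_get_stats_sketch(struct netdev_sketch *dev,
                                 struct link_stats_sketch *storage)
{
    *storage = dev->driver_stats;
    storage->rx_dropped += atomic_load(&dev->core_rx_dropped);
}
```

Because the sum is formed in the getter, tools that read the standard rx_dropped field see both kinds of drops without any driver change.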
     

28 Sep, 2010

1 commit


21 Sep, 2010

1 commit

  • Under load, netif_rx() can drop incoming packets but administrators don't
    have a chance to spot which device needs some tuning (RPS activation for
    example).

    This patch adds rx_dropped accounting in vlans and tunnels.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Aug, 2010

1 commit

  • When adding a new vlan, if the underlying interface has no carrier,
    then the newly added vlan interface should also have no carrier.
    At present, this is not true - the newly added vlan is added with
    carrier up. Fix by checking state of real device.

    Signed-off-by: Phil Oester
    Signed-off-by: David S. Miller

    Phil Oester
     

19 Jul, 2010

1 commit

  • - Without the 8021q module loaded in the kernel, all 802.1p packets
    (VLAN 0 but QoS tagging) are silently discarded (as expected, as
    the protocol is not loaded).

    - Without this patch in 8021q module, these packets are forwarded to
    the module, but they are discarded also if VLAN 0 is not configured,
    which should not be the default behaviour, as VLAN 0 is not really
    a VLANed packet but a 802.1p packet. Defining VLAN 0 makes it almost
    impossible to communicate with mixed 802.1p and non 802.1p devices on
    the same network due to arp table issues.

    - Changed logic to skip vlan specific code in vlan_skb_recv if VLAN
    is 0 and we have not defined a VLAN with ID 0, but we accept the
    packet with the encapsulated proto and pass it later to netif_rx.

    - In the vlan device event handler, added some logic to add VLAN 0
    to HW filter in devices that support it (this prevented any traffic
    in VLAN 0 to reach the stack in e1000e with HW filter under 2.6.35,
    and probably also with other HW filtered cards, so we fix it here).

    - In the vlan unregister logic, prevent the elimination of VLAN 0
    in devices with HW filter.

    - The default behaviour is to ignore the VLAN 0 tagging and accept
    the packet as if it was not tagged, but we can still define a
    VLAN 0 if desired (so it is backwards compatible).

    Signed-off-by: Pedro Garcia
    Signed-off-by: David S. Miller

    Pedro Garcia
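
The receive decision described in the points above can be condensed into a small sketch. The enum, the function name and the lookup table are hypothetical; the sketch only captures the rule that a VLAN 0 tag without a configured VLAN 0 device is treated as a priority-only tag and the packet is accepted as untagged.

```c
#include <stdbool.h>
#include <stdint.h>

#define VID_MASK_SKETCH 0x0fff

enum rx_verdict_sketch {
    RX_TO_VLAN_DEV,    /* hand the packet to the matching VLAN device */
    RX_AS_UNTAGGED,    /* ignore the tag, pass on with the inner protocol */
    RX_DROP            /* unknown VLAN */
};

/* vlan_configured[vid] is true when a VLAN device exists for that ID. */
static enum rx_verdict_sketch
vlan_rx_verdict_sketch(uint16_t tci, const bool vlan_configured[4096])
{
    uint16_t vid = tci & VID_MASK_SKETCH;

    if (vlan_configured[vid])
        return RX_TO_VLAN_DEV;
    if (vid == 0)
        return RX_AS_UNTAGGED;     /* 802.1p priority tag only */
    return RX_DROP;
}
```

Defining a VLAN 0 device flips the first check, which is what keeps the old behaviour available.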
     

10 Jul, 2010

1 commit

  • In commit be1f3c2c027cc5ad735df6a45a542ed1db7ec48b "net: Enable 64-bit
    net device statistics on 32-bit architectures" I redefined struct
    net_device_stats so that it could be used in a union with struct
    rtnl_link_stats64, avoiding the need for explicit copying or
    conversion between the two. However, this is unsafe because there is
    no locking required and no lock consistently held around calls to
    dev_get_stats() and use of the statistics structure it returns.

    In commit 28172739f0a276eb8d6ca917b3974c2edb036da3 "net: fix 64 bit
    counters on 32 bit arches" Eric Dumazet dealt with that problem by
    requiring callers of dev_get_stats() to provide storage for the
    result. This means that the net_device::stats64 field and the padding
    in struct net_device_stats are now redundant, so remove them.

    Update the comment on net_device_ops::ndo_get_stats64 to reflect its
    new usage.

    Change dev_txq_stats_fold() to use struct rtnl_link_stats64, since
    that is what all its callers are really using and it is no longer
    going to be compatible with struct net_device_stats.

    Eric Dumazet suggested the separate function for the structure
    conversion.

    Signed-off-by: Ben Hutchings
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Ben Hutchings
     

09 Jul, 2010

1 commit

  • When we need to shape traffic using low speeds, we need to
    disable tso on network interface :

    ethtool -K eth0.2240 tso off

    It seems vlan interfaces miss the set_tso() ethtool method.

    Before enabling TSO, we must check real device supports
    TSO for VLAN-tagged packets and enables TSO.

    Note that a TSO change on real device propagates TSO setting
    on all vlans, even if admin selected a different TSO setting.

    Signed-off-by: Eric Dumazet
    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Eric Dumazet
     

08 Jul, 2010

1 commit

  • There is a small possibility that a reader gets incorrect values on 32
    bit arches. SNMP applications could catch incorrect counters when a
    32bit high part is changed by another stats consumer/provider.

    One way to solve this is to add a rtnl_link_stats64 param to all
    ndo_get_stats64() methods, and also add such a parameter to
    dev_get_stats().

    Rule is that we are not allowed to use dev->stats64 as a temporary
    storage for 64bit stats, but a caller provided area (usually on stack)

    Old drivers (only providing get_stats() method) need no changes.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

29 Jun, 2010

1 commit


02 Jun, 2010

1 commit


16 May, 2010

1 commit


12 Apr, 2010

1 commit


07 Apr, 2010

1 commit


04 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

25 Mar, 2010

1 commit

  • This is required to correctly select the vlan tx queue for a driver
    supporting multiple tx queues with ndo_select_queue implemented, since
    the currently selected vlan tx queue is unaligned with the queue selected
    by the real net_device's ndo_select_queue.

    Unaligned vlan tx queue selection causes thrash with higher vlan
    tx lock contention, at least for fcoe traffic, and a wrong socket tx
    queue_mapping for ixgbe, which has ndo_select_queue implemented.

    -v2

    As per Eric Dumazet's comments, mirrored the vlan net_device_ops so
    that they exist both with and without vlan_dev_select_queue, and then
    select between them for a vlan net_device according to whether the real
    dev has ndo_select_queue. This completely skips calling
    vlan_dev_select_queue for a real net_device not supporting
    ndo_select_queue.

    Signed-off-by: Vasu Dev
    Signed-off-by: Jeff Kirsher
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Vasu Dev
     

04 Feb, 2010

1 commit

  • In the vlan and macvlan drivers, the start_xmit function forwards
    data to the dev_queue_xmit function for another device, which may
    potentially belong to a different namespace.

    To make sure that classification stays within a single namespace,
    this resets the potentially critical fields.

    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

28 Jan, 2010

1 commit


25 Jan, 2010

1 commit

  • Bruno Prémont found commit 9793241fe92f7d930
    (vlan: Precise RX stats accounting) added a regression for non
    hw accelerated vlans.

    [ 26.390576] BUG: unable to handle kernel NULL pointer dereference at (null)
    [ 26.396369] IP: [] vlan_skb_recv+0x89/0x280 [8021q]

    vlan_dev_info() was used with original device, instead of
    skb->dev. Also spotted by Américo Wang.

    Reported-By: Bruno Prémont
    Tested-By: Bruno Prémont
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

27 Dec, 2009

1 commit

  • Using dev_hard_header allows us to use LLC with VLANs and potentially
    other Ethernet/TokenRing specific encapsulations. It also removes code
    duplication between LLC and Ethernet/TokenRing core code.

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
     

27 Nov, 2009

1 commit

  • Currently the UP/DOWN state of VLANs is synchronized to the state of the
    underlying device, meaning all VLANs are set down once the underlying
    device is set down. This causes all routes to the VLAN devices to vanish.

    Add a flag to specify a "loose binding" mode, in which only the operstate
    is transferred, but the VLAN device state is independent.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
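
The difference between the two modes can be sketched with a hypothetical model of a VLAN device that tracks its administrative state and its operstate separately. The flag value and all names below are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical flag bit standing in for the "loose binding" flag. */
#define VLAN_FLAG_LOOSE_BINDING_SKETCH 0x4

struct vlan_dev_sketch {
    unsigned int vlan_flags;
    bool admin_up;     /* UP/DOWN state; routes depend on this */
    bool oper_up;      /* operational state mirrored from the real device */
};

/* React to the underlying device being set down. */
static void real_dev_down_sketch(struct vlan_dev_sketch *vlan)
{
    /* Default binding: the VLAN device follows the real device down. */
    if (!(vlan->vlan_flags & VLAN_FLAG_LOOSE_BINDING_SKETCH))
        vlan->admin_up = false;

    /* In both modes the operstate is transferred. */
    vlan->oper_up = false;
}
```

With the flag set, the device stays administratively up, so its routes are kept while the operstate still reflects the loss of the underlying link.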
     

18 Nov, 2009

1 commit

  • With multi queue devices, it's possible that several cpus call
    vlan RX routines simultaneously for the same vlan device.

    We update RX stats counter without any locking, so we can
    get slightly wrong counters.

    One possible fix is to use percpu counters, to get precise
    accounting and also get guarantee of no cache line ping pongs
    between cpus.

    Note: this adds 16 bytes (32 bytes on 64bit arches) of percpu
    data per vlan device.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
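
A per-CPU counter scheme of this kind can be sketched in userspace as an array with one slot per CPU. The struct layout (four unsigned long counters, matching the 16/32 byte figure in the note), the field names, the CPU count and the function names are assumptions for illustration; a plain array also does not reproduce the cache-line separation of real per-CPU data.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 4

/* Four unsigned long counters: 16 bytes on 32-bit, 32 bytes on 64-bit. */
struct vlan_rx_stats_sketch {
    unsigned long rx_packets;
    unsigned long rx_bytes;
    unsigned long multicast;
    unsigned long rx_errors;
};

/* Each CPU only ever updates its own slot, so no lock is needed here. */
static void rx_account_sketch(struct vlan_rx_stats_sketch *percpu,
                              unsigned int cpu, unsigned long len)
{
    percpu[cpu].rx_packets++;
    percpu[cpu].rx_bytes += len;
}

/* Readers fold all slots into one total when stats are requested. */
static struct vlan_rx_stats_sketch
rx_stats_sum_sketch(const struct vlan_rx_stats_sketch *percpu)
{
    struct vlan_rx_stats_sketch total = { 0, 0, 0, 0 };
    size_t cpu;

    for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
        total.rx_packets += percpu[cpu].rx_packets;
        total.rx_bytes += percpu[cpu].rx_bytes;
        total.multicast += percpu[cpu].multicast;
        total.rx_errors += percpu[cpu].rx_errors;
    }
    return total;
}
```

The cost of the precise accounting is paid on the read side, which has to walk every slot.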
     

14 Nov, 2009

1 commit


29 Oct, 2009

1 commit


27 Oct, 2009

1 commit

  • We currently use a 16 bit field (vlan_tci) to store VLAN ID/PRIO on a skb.

    Null value is used as a special value, meaning vlan tagging not enabled.
    This forbids use of null vlan ID.

    As pointed out by David, some drivers use the 3 high order bits (PRIO).

    As VLAN ID is 12 bits, we can use the remaining bit (CFI) as a flag, and
    allow null VLAN ID.

    In case future code really wants to use VLAN_CFI_MASK, we'll have to use
    a bit outside of vlan_tci.

    #define VLAN_PRIO_MASK 0xe000 /* Priority Code Point */
    #define VLAN_PRIO_SHIFT 13
    #define VLAN_CFI_MASK 0x1000 /* Canonical Format Indicator */
    #define VLAN_TAG_PRESENT VLAN_CFI_MASK
    #define VLAN_VID_MASK 0x0fff /* VLAN Identifier */

    Reported-by: Gertjan Hofman
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
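
Using the masks listed above, the encoding can be sketched with three small helpers. The helper names are hypothetical; the point is that a tag with VLAN ID 0 and priority 0 is still distinguishable from "no tag" because the CFI bit doubles as the presence flag.

```c
#include <stdint.h>

#define VLAN_PRIO_MASK   0xe000 /* Priority Code Point */
#define VLAN_PRIO_SHIFT  13
#define VLAN_CFI_MASK    0x1000 /* Canonical Format Indicator */
#define VLAN_TAG_PRESENT VLAN_CFI_MASK
#define VLAN_VID_MASK    0x0fff /* VLAN Identifier */

/* Build the stored value: priority, VLAN ID and the "tag present" flag. */
static uint16_t tci_set_sketch(uint16_t prio, uint16_t vid)
{
    return (uint16_t)(((prio << VLAN_PRIO_SHIFT) & VLAN_PRIO_MASK) |
                      (vid & VLAN_VID_MASK) | VLAN_TAG_PRESENT);
}

/* Non-zero when the stored value describes a real tag. */
static int tci_present_sketch(uint16_t vlan_tci)
{
    return (vlan_tci & VLAN_TAG_PRESENT) != 0;
}

/* Recover priority and VLAN ID with the presence flag masked out. */
static uint16_t tci_get_sketch(uint16_t vlan_tci)
{
    return (uint16_t)(vlan_tci & ~VLAN_TAG_PRESENT);
}
```

Because the flag occupies the CFI position, this sketch cannot carry a real CFI value, which is the limitation the message points out for future users of VLAN_CFI_MASK.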
     

04 Sep, 2009

1 commit

  • It's hard to tell if vlans are dropping frames, since
    every frame given to vlan_???_start_xmit() functions
    is accounted as fully transmitted by lower device.

    We can test dev_queue_xmit() return values to
    properly account for dropped frames.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
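
The accounting change can be sketched as a small helper that looks at the value returned by the lower device's transmit call. The constants and names are hypothetical stand-ins; the sketch assumes that zero means the frame was accepted and that anything else is counted as a drop.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the transmit return codes. */
#define NET_XMIT_SUCCESS_SKETCH 0
#define NET_XMIT_DROP_SKETCH    1

struct txq_stats_sketch {
    unsigned long tx_packets;
    unsigned long tx_bytes;
    unsigned long tx_dropped;
};

/* Count the frame as sent only if the lower device accepted it. */
static void account_xmit_sketch(struct txq_stats_sketch *txq, int ret,
                                size_t len)
{
    if (ret == NET_XMIT_SUCCESS_SKETCH) {
        txq->tx_packets++;
        txq->tx_bytes += len;
    } else {
        txq->tx_dropped++;
    }
}
```

The frame length has to be captured before the transmit call, since the buffer may no longer be valid afterwards.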
     

03 Sep, 2009

1 commit

  • vlan_dev_hard_start_xmit() & vlan_dev_hwaccel_hard_start_xmit()
    select txqueue number 0, instead of using index provided by
    skb_get_queue_mapping().

    This is not correct after commit 2e59af3dcbdf11635c03f
    [vlan: multiqueue vlan device] because
    txq->tx_packets & txq->tx_bytes changes are performed on
    a single location, and without the right locking.

    The fix is to take the appropriate struct netdev_queue pointer.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet