02 Dec, 2011

1 commit

  • When in user-stp mode, bridge master do not follow state of its slaves, so
    after the following sequence of events it can stuck forever in no-carrier
    state:
    1) turn stp off
    2) put all slaves down - master device will follow their state and also go in
    no-carrier state
    3) turn stp on with bridge-stp script returning 0 (go to the user-stp mode)
    Now bridge master won't follow slaves' state and will never reach running
    state.

    This patch solves the problem by making user-stp and kernel-stp behavior
    similar regarding master following slaves' states.

    Signed-off-by: Vitalii Demianets
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Vitalii Demianets
     

19 Oct, 2011

1 commit

  • Need to cleanup bridge device timers and ports when being bridge
    device is being removed via netlink.

    This fixes the problem of observed when doing:
    ip link add br0 type bridge
    ip link set dev eth1 master br0
    ip link set br0 up
    ip link del br0

    which would cause br0 to hang in unregister_netdev because
    of leftover reference count.

    Reported-by: Sridhar Samudrala
    Signed-off-by: Stephen Hemminger
    Acked-by: Sridhar Samudrala
    Signed-off-by: David S. Miller

    stephen hemminger
     

23 Jul, 2011

1 commit


10 Jun, 2011

1 commit

  • The message size allocated for rtnl ifinfo dumps was limited to
    a single page. This is not enough for additional interface info
    available with devices that support SR-IOV and caused a bug in
    which VF info would not be displayed if more than approximately
    40 VFs were created per interface.

    Implement a new function pointer for the rtnl_register service that will
    calculate the amount of data required for the ifinfo dump and allocate
    enough data to satisfy the request.

    Signed-off-by: Greg Rose
    Signed-off-by: Jeff Kirsher

    Greg Rose
     

03 May, 2011

1 commit

  • Four years ago, Patrick made a change to hold rtnl mutex during netlink
    dump callbacks.

    I believe it was a wrong move. This slows down concurrent dumps, making
    good old /proc/net/ files faster than rtnetlink in some situations.

    This occurred to me because one "ip link show dev ..." was _very_ slow
    on a workload adding/removing network devices in background.

    All dump callbacks are able to use RCU locking now, so this patch does
    roughly a revert of commits :

    1c2d670f366 : [RTNETLINK]: Hold rtnl_mutex during netlink dump callbacks
    6313c1e0992 : [RTNETLINK]: Remove unnecessary locking in dump callbacks

    This let writers fight for rtnl mutex and readers going full speed.

    It also takes care of phonet : phonet_route_get() is now called from rcu
    read section. I renamed it to phonet_route_get_rcu()

    Signed-off-by: Eric Dumazet
    Cc: Patrick McHardy
    Cc: Remi Denis-Courmont
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
     

05 Apr, 2011

3 commits

  • Add netlink device ops to allow creating bridge device via netlink.
    This works in a manner similar to vlan, macvlan and bonding.

    Example:
    # ip link add link dev br0 type bridge
    # ip link del dev br0

    The change required rearranging initializtion code to deal with
    being called by create link. Most of the initialization happens
    in br_dev_setup, but allocation of stats is done in ndo_init callback
    to deal with allocation failure. Sysfs setup has to wait until
    after the network device kobject is registered.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     
  • Use RTM_NEWNEIGH and RTM_DELNEIGH to allow updating of entries
    in bridge forwarding table. This allows manipulating static entries
    which is not possible with existing tools.

    Example (using bridge extensions to iproute2)
    # br fdb add 00:02:03:04:05:06 dev eth0

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     
  • This allows applications to query and monitor bridge forwarding
    table in the same method used for neighbor table. The forward table
    entries are returned in same structure format as used by the ioctl.
    If more information is desired in future, the netlink method is
    extensible.

    Example (using bridge extensions to iproute2)
    # br monitor

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     

16 Nov, 2010

2 commits


16 Jun, 2010

1 commit


16 May, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

25 Feb, 2009

1 commit

  • This patch changes the return value of nlmsg_notify() as follows:

    If NETLINK_BROADCAST_ERROR is set by any of the listeners and
    an error in the delivery happened, return the broadcast error;
    else if there are no listeners apart from the socket that
    requested a change with the echo flag, return the result of the
    unicast notification. Thus, with this patch, the unicast
    notification is handled in the same way of a broadcast listener
    that has set the NETLINK_BROADCAST_ERROR socket flag.

    This patch is useful in case that the caller of nlmsg_notify()
    wants to know the result of the delivery of a netlink notification
    (including the broadcast delivery) and take any action in case
    that the delivery failed. For example, ctnetlink can drop packets
    if the event delivery failed to provide reliable logging and
    state-synchronization at the cost of dropping packets.

    This patch also modifies the rtnetlink code to ignore the return
    value of rtnl_notify() in all callers. The function rtnl_notify()
    (before this patch) returned the error of the unicast notification
    which makes rtnl_set_sk_err() reports errors to all listeners. This
    is not of any help since the origin of the change (the socket that
    requested the echoing) notices the ENOBUFS error if the notification
    fails and should resync itself.

    Signed-off-by: Pablo Neira Ayuso
    Acked-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Pablo Neira Ayuso
     

09 Sep, 2008

1 commit

  • Bridge as netdevice doesn't cross netns boundaries.

    Bridge ports and bridge itself live in same netns.

    Notifiers are fixed.

    netns propagated from userspace socket for setup and teardown.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

26 Mar, 2008

1 commit


29 Jan, 2008

2 commits

  • After this patch none of the netlink callback support anything
    except the initial network namespace but the rtnetlink infrastructure
    now handles multiple network namespaces.

    Changes from v2:
    - IPv6 addrlabel processing

    Changes from v1:
    - no need for special rtnl_unlock handling
    - fixed IPv6 ndisc

    Signed-off-by: Denis V. Lunev
    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • Before I can enable rtnetlink to work in all network namespaces I need
    to be certain that something won't break. So this patch deliberately
    disables all of the rtnletlink methods in everything except the
    initial network namespace. After the methods have been audited this
    extra check can be disabled.

    Changes from v1:
    - added IPv6 addrlabel protection

    Signed-off-by: Denis V. Lunev
    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller
    Signed-off-by: Herbert Xu

    Denis V. Lunev
     

11 Oct, 2007

1 commit

  • This patch makes most of the generic device layer network
    namespace safe. This patch makes dev_base_head a
    network namespace variable, and then it picks up
    a few associated variables. The functions:
    dev_getbyhwaddr
    dev_getfirsthwbytype
    dev_get_by_flags
    dev_get_by_name
    __dev_get_by_name
    dev_get_by_index
    __dev_get_by_index
    dev_ioctl
    dev_ethtool
    dev_load
    wireless_process_ioctl

    were modified to take a network namespace argument, and
    deal with it.

    vlan_ioctl_set and brioctl_set were modified so their
    hooks will receive a network namespace argument.

    So basically anthing in the core of the network stack that was
    affected to by the change of dev_base was modified to handle
    multiple network namespaces. The rest of the network stack was
    simply modified to explicitly use &init_net the initial network
    namespace. This can be fixed when those components of the network
    stack are modified to handle multiple network namespaces.

    For now the ifindex generator is left global.

    Fundametally ifindex numbers are per namespace, or else
    we will have corner case problems with migration when
    we get that far.

    At the same time there are assumptions in the network stack
    that the ifindex of a network device won't change. Making
    the ifindex number global seems a good compromise until
    the network stack can cope with ifindex changes when
    you change namespaces, and the like.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

04 May, 2007

1 commit

  • Cleanup of dev_base list use, with the aim to simplify making device
    list per-namespace. In almost every occasion, use of dev_base variable
    and dev->next pointer could be easily replaced by for_each_netdev
    loop. A few most complicated places were converted to using
    first_netdev()/next_netdev().

    Signed-off-by: Pavel Emelianov
    Acked-by: Kirill Korotaev
    Signed-off-by: David S. Miller

    Pavel Emelianov
     

26 Apr, 2007

3 commits


09 Feb, 2007

1 commit

  • Currently netlink users BUG when the allocated skb for an event
    notification is undersized. While this is certainly a kernel bug,
    its not critical and crashing the kernel is too drastic, especially
    when considering that these errors have appeared multiple times in
    the past and it BUGs even if no listeners are present.

    This patch replaces BUG by WARN_ON and changes the notification
    functions to inform potential listeners of undersized allocations
    using a unique error code (EMSGSIZE).

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

03 Dec, 2006

2 commits


23 Sep, 2006

1 commit


05 Aug, 2006

1 commit


04 Jul, 2006

1 commit


18 Jun, 2006

1 commit

  • Add basic netlink support to the Ethernet bridge. Including:
    * dump interfaces in bridges
    * monitor link status changes
    * change state of bridge port

    For some demo programs see:
    http://developer.osdl.org/shemminger/prototypes/brnl.tar.gz

    These are to allow building a daemon that does alternative
    implementations of Spanning Tree Protocol.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger