19 Oct, 2016

1 commit

  • Satish reported a problem with the perm multicast router ports not getting
    reenabled after some series of events, in particular if it happens that the
    multicast snooping has been disabled and the port goes to disabled state
    then it will be deleted from the router port list, but if it moves into
    non-disabled state it will not be re-added because the mcast snooping is
    still disabled, and enabling snooping later does nothing.

    Here are the steps to reproduce, setup br0 with snooping enabled and eth1
    added as a perm router (multicast_router = 2):
    1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
    2. $ ip l set eth1 down
    ^ This step deletes the interface from the router list
    3. $ ip l set eth1 up
    ^ This step does not add it again because mcast snooping is disabled
    4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
    5. $ bridge -d -s mdb show

    At this point we have mcast enabled and eth1 as a perm router (value = 2)
    but it is not in the router list which is incorrect.

    After this change:
    1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
    2. $ ip l set eth1 down
    ^ This step deletes the interface from the router list
    3. $ ip l set eth1 up
    ^ This step does not add it again because mcast snooping is disabled
    4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
    5. $ bridge -d -s mdb show
    router ports on br0: eth1

    Note: we can directly do br_multicast_enable_port for all because the
    querier timer already has checks for the port state and will simply
    expire if it's in blocking/disabled. See the comment added by
    commit 9aa66382163e7 ("bridge: multicast: add a comment to
    br_port_state_selection about blocking state")

    Fixes: 561f1103a2b7 ("bridge: Add multicast_snooping sysfs toggle")
    Reported-by: Satish Ashok
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

01 Sep, 2016

1 commit

  • commit bc8c20acaea1 ("bridge: multicast: treat igmpv3 report with
    INCLUDE and no sources as a leave") seems to have accidentally reverted
    commit 47cc84ce0c2f ("bridge: fix parsing of MLDv2 reports"). This
    commit brings back a change to br_ip6_multicast_mld2_report() where
    parsing of MLDv2 reports stops when the first group is successfully
    added to the MDB cache.

    Fixes: bc8c20acaea1 ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave")
    Signed-off-by: Davide Caratti
    Acked-by: Nikolay Aleksandrov
    Acked-by: Thadeu Lima de Souza Cascardo
    Signed-off-by: David S. Miller

    Davide Caratti
     

10 Jul, 2016

1 commit

  • As was suggested this patch adds support for the different versions of MLD
    and IGMP query types. Since the user visible structure is still in net-next
    we can augment it instead of adding netlink attributes.
    The distinction between the different IGMP/MLD query types is done as
    suggested in Section 7.1, RFC 3376 [1] and Section 8.1, RFC 3810 [2] based
    on query payload size and code for IGMP. Since all IGMP packets go through
    multicast_rcv() and it uses ip_mc_check_igmp/ipv6_mc_check_mld we can be
    sure that at least the ip/ipv6 header can be directly used.

    [1] https://tools.ietf.org/html/rfc3376#section-7
    [2] https://tools.ietf.org/html/rfc3810#section-8.1

    Suggested-by: Linus Lüssing
    Signed-off-by: Nikolay Aleksandrov
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

30 Jun, 2016

1 commit

  • This patch adds stats support for the currently used IGMP/MLD types by the
    bridge. The stats are per-port (plus one stat per-bridge) and per-direction
    (RX/TX). The stats are exported via netlink via the new linkxstats API
    (RTM_GETSTATS). In order to minimize the performance impact, a new option
    is used to enable/disable the stats - multicast_stats_enabled, similar to
    the recent vlan stats. Also in order to avoid multiple IGMP/MLD type
    lookups and checks, we make use of the current "igmp" member of the bridge
    private skb->cb region to record the type on Rx (both host-generated and
    external packets pass by multicast_rcv()). We can do that since the igmp
    member was used as a boolean and all the valid IGMP/MLD types are positive
    values. The normal bridge fast-path is not affected at all, the only
    affected paths are the flooding ones and since we make use of the IGMP/MLD
    type, we can quickly determine if the packet should be counted using
    cache-hot data (cb's igmp member). We add counters for:
    * IGMP Queries
    * IGMP Leaves
    * IGMP v1/v2/v3 reports

    * MLD Queries
    * MLD Leaves
    * MLD v1/v2 reports

    These are invaluable when monitoring or debugging complex multicast setups
    with bridges.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

28 Jun, 2016

1 commit

  • The bridge is falsly dropping ipv6 mulitcast packets if there is:
    1. No ipv6 address assigned on the brigde.
    2. No external mld querier present.
    3. The internal querier enabled.

    When the bridge fails to build mld queries, because it has no
    ipv6 address, it slilently returns, but keeps the local querier enabled.
    This specific case causes confusing packet loss.

    Ipv6 multicast snooping can only work if:
    a) An external querier is present
    OR
    b) The bridge has an ipv6 address an is capable of sending own queries

    Otherwise it has to forward/flood the ipv6 multicast traffic,
    because snooping cannot work.

    This patch fixes the issue by adding a flag to the bridge struct that
    indicates that there is currently no ipv6 address assinged to the bridge
    and returns a false state for the local querier in
    __br_multicast_querier_exists().

    Special thanks to Linus Lüssing.

    Fixes: d1d81d4c3dd8 ("bridge: check return value of ipv6_dev_get_saddr()")
    Signed-off-by: Daniel Danzberger
    Acked-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Daniel
     

07 May, 2016

1 commit

  • With the newly introduced helper functions the skb pulling is hidden
    in the checksumming function - and undone before returning to the
    caller.

    The IGMP and MLD query parsing functions in the bridge still
    assumed that the skb is pointing to the beginning of the IGMP/MLD
    message while it is now kept at the beginning of the IPv4/6 header.

    If there is a querier somewhere else, then this either causes
    the multicast snooping to stay disabled even though it could be
    enabled. Or, if we have the querier enabled too, then this can
    create unnecessary IGMP / MLD query messages on the link.

    Fixing this by taking the offset between IP and IGMP/MLD header into
    account, too.

    Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
    Reported-by: Simon Wunderlich
    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     

25 Apr, 2016

1 commit

  • There is a race-condition when updating the mdb offload flag without using
    the mulicast_lock. This reverts commit 9e8430f8d60d98 ("bridge: mdb:
    Passing the port-group pointer to br_mdb module").

    This patch marks offloaded MDB entry as "offload" by changing the port-
    group flags and marks it as MDB_PG_FLAGS_OFFLOAD.

    When switchdev PORT_MDB succeeded and adds a multicast group, a completion
    callback is been invoked "br_mdb_complete". The completion function
    locks the multicast_lock and finds the right net_bridge_port_group and
    marks it as offloaded.

    Fixes: 9e8430f8d60d98 ("bridge: mdb: Passing the port-group pointer to br_mdb module")
    Reported-by: Nikolay Aleksandrov
    Signed-off-by: Elad Raz
    Signed-off-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Acked-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Elad Raz
     

02 Mar, 2016

3 commits


09 Feb, 2016

2 commits


27 Sep, 2015

1 commit


18 Sep, 2015

1 commit

  • Pass a network namespace parameter into the netfilter hooks. At the
    call site of the netfilter hooks the path a packet is taking through
    the network stack is well known which allows the network namespace to
    be easily and reliabily.

    This allows the replacement of magic code like
    "dev_net(state->in?:state->out)" that appears at the start of most
    netfilter hooks with "state->net".

    In almost all cases the network namespace passed in is derived
    from the first network device passed in, guaranteeing those
    paths will not see any changes in practice.

    The exceptions are:
    xfrm/xfrm_output.c:xfrm_output_resume() xs_net(skb_dst(skb)->xfrm)
    ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont() ip_vs_conn_net(cp)
    ipvs/ip_vs_xmit.c:ip_vs_send_or_cont() ip_vs_conn_net(cp)
    ipv4/raw.c:raw_send_hdrinc() sock_net(sk)
    ipv6/ip6_output.c:ip6_xmit() sock_net(sk)
    ipv6/ndisc.c:ndisc_send_skb() dev_net(skb->dev) not dev_net(dst->dev)
    ipv6/raw.c:raw6_send_hdrinc() sock_net(sk)
    br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindev

    In all cases these exceptions seem to be a better expression for the
    network namespace the packet is being processed in then the historic
    "dev_net(in?in:out)". I am documenting them in case something odd
    pops up and someone starts trying to track down what happened.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

12 Sep, 2015

1 commit

  • With the newly introduced helper functions the skb pulling is hidden in
    the checksumming function - and undone before returning to the caller.

    The IGMPv3 and MLDv2 report parsing functions in the bridge still
    assumed that the skb is pointing to the beginning of the IGMP/MLD
    message while it is now kept at the beginning of the IPv4/6 header,
    breaking the message parsing and creating packet loss.

    Fixing this by taking the offset between IP and IGMP/MLD header into
    account, too.

    Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
    Reported-by: Tobias Powalowski
    Tested-by: Tobias Powalowski
    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     

22 Aug, 2015

1 commit


14 Aug, 2015

1 commit

  • The recent refactoring of the IGMP and MLD parsing code into
    ipv6_mc_check_mld() / ip_mc_check_igmp() introduced a potential crash /
    BUG() invocation for bridges:

    I wrongly assumed that skb_get() could be used as a simple reference
    counter for an skb which is not the case. skb_get() bears additional
    semantics, a user count. This leads to a BUG() invocation in
    pskb_expand_head() / kernel panic if pskb_may_pull() is called on an skb
    with a user count greater than one - unfortunately the refactoring did
    just that.

    Fixing this by removing the skb_get() call and changing the API: The
    caller of ipv6_mc_check_mld() / ip_mc_check_igmp() now needs to
    additionally check whether the returned skb_trimmed is a clone.

    Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
    Reported-by: Brenden Blanco
    Signed-off-by: Linus Lüssing
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Linus Lüssing
     

01 Aug, 2015

1 commit


30 Jul, 2015

1 commit


27 Jul, 2015

1 commit


23 Jul, 2015

1 commit


21 Jul, 2015

2 commits


16 Jul, 2015

1 commit

  • A report with INCLUDE/Change_to_include and empty source list should be
    treated as a leave, specified by RFC 3376, section 3.1:
    "If the requested filter mode is INCLUDE *and* the requested source
    list is empty, then the entry corresponding to the requested
    interface and multicast address is deleted if present. If no such
    entry is present, the request is ignored."

    Signed-off-by: Satish Ashok
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Satish Ashok
     

10 Jul, 2015

1 commit


24 Jun, 2015

1 commit


23 Jun, 2015

1 commit


14 Jun, 2015

1 commit


11 Jun, 2015

1 commit

  • Since the addition of sysfs multicast router support if one set
    multicast_router to "2" more than once, then the port would be added to
    the hlist every time and could end up linking to itself and thus causing an
    endless loop for rlist walkers.
    So to reproduce just do:
    echo 2 > multicast_router; echo 2 > multicast_router;
    in a bridge port and let some igmp traffic flow, for me it hangs up
    in br_multicast_flood().
    Fix this by adding a check in br_multicast_add_router() if the port is
    already linked.
    The reason this didn't happen before the addition of multicast_router
    sysfs entries is because there's a !hlist_unhashed check that prevents
    it.

    Signed-off-by: Nikolay Aleksandrov
    Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries")
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

02 Jun, 2015

1 commit

  • Conflicts:
    drivers/net/phy/amd-xgbe-phy.c
    drivers/net/wireless/iwlwifi/Kconfig
    include/net/mac80211.h

    iwlwifi/Kconfig and mac80211.h were both trivial overlapping
    changes.

    The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and
    the bug fix that happened on the 'net' side is already integrated
    into the rest of the amd-xgbe driver.

    Signed-off-by: David S. Miller

    David S. Miller
     

31 May, 2015

1 commit

  • br_multicast_query_expired() querier argument is a pointer to
    a struct bridge_mcast_querier :

    struct bridge_mcast_querier {
    struct br_ip addr;
    struct net_bridge_port __rcu *port;
    };

    Intent of the code was to clear port field, not the pointer to querier.

    Fixes: 2cd4143192e8 ("bridge: memorize and export selected IGMP/MLD querier port")
    Signed-off-by: Eric Dumazet
    Acked-by: Thadeu Lima de Souza Cascardo
    Acked-by: Linus Lüssing
    Cc: Linus Lüssing
    Cc: Steinar H. Gunderson
    Signed-off-by: David S. Miller

    Eric Dumazet
     

26 May, 2015

1 commit

  • Network managers like netifd (used in OpenWRT for instance) try to
    configure interface options after creation but before setting the
    interface up.

    Unfortunately the sysfs / bridge currently only allows to configure the
    hash_max and multicast_router options when the bridge interface is up.
    But since br_multicast_init() doesn't start any timers and only sets
    default values and initializes timers it should be save to reconfigure
    the default values after that, before things actually get active after
    the bridge is set up.

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     

23 May, 2015

2 commits

  • Conflicts:
    drivers/net/ethernet/cadence/macb.c
    drivers/net/phy/phy.c
    include/linux/skbuff.h
    net/ipv4/tcp.c
    net/switchdev/switchdev.c

    Switchdev was a case of RTNH_H_{EXTERNAL --> OFFLOAD}
    renaming overlapping with net-next changes of various
    sorts.

    phy.c was a case of two changes, one adding a local
    variable to a function whilst the second was removing
    one.

    tcp.c overlapped a deadlock fix with the addition of new tcp_info
    statistic values.

    macb.c involved the addition of two zyncq device entries.

    skbuff.h involved adding back ipv4_daddr to nf_bridge_info
    whilst net-next changes put two other existing members of
    that struct into a union.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • When more than a multicast address is present in a MLDv2 report, all but
    the first address is ignored, because the code breaks out of the loop if
    there has not been an error adding that address.

    This has caused failures when two guests connected through the bridge
    tried to communicate using IPv6. Neighbor discoveries would not be
    transmitted to the other guest when both used a link-local address and a
    static address.

    This only happens when there is a MLDv2 querier in the network.

    The fix will only break out of the loop when there is a failure adding a
    multicast address.

    The mdb before the patch:

    dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
    dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
    dev ovirtmgmt port bond0.86 grp ff02::2 temp

    After the patch:

    dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
    dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
    dev ovirtmgmt port bond0.86 grp ff02::fb temp
    dev ovirtmgmt port bond0.86 grp ff02::2 temp
    dev ovirtmgmt port bond0.86 grp ff02::d temp
    dev ovirtmgmt port vnet0 grp ff02::1:ff00:76 temp
    dev ovirtmgmt port bond0.86 grp ff02::16 temp
    dev ovirtmgmt port vnet1 grp ff02::1:ff00:77 temp
    dev ovirtmgmt port bond0.86 grp ff02::1:ff00:def temp
    dev ovirtmgmt port bond0.86 grp ff02::1:ffa1:40bf temp

    Fixes: 08b202b67264 ("bridge br_multicast: IPv6 MLD support.")
    Reported-by: Rik Theys
    Signed-off-by: Thadeu Lima de Souza Cascardo
    Tested-by: Rik Theys
    Signed-off-by: David S. Miller

    Thadeu Lima de Souza Cascardo
     

05 May, 2015

2 commits

  • With this patch, the IGMP and MLD message validation functions are moved
    from the bridge code to IPv4/IPv6 multicast files. Some small
    refactoring was done to enhance readibility and to iron out some
    differences in behaviour between the IGMP and MLD parsing code (e.g. the
    skb-cloning of MLD messages is now only done if necessary, just like the
    IGMP part always did).

    Finally, these IGMP and MLD message validation functions are exported so
    that not only the bridge can use it but batman-adv later, too.

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     
  • Let's use these new, neat helpers.

    Signed-off-by: Linus Lüssing
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Linus Lüssing
     

08 Apr, 2015

1 commit

  • On the output paths in particular, we have to sometimes deal with two
    socket contexts. First, and usually skb->sk, is the local socket that
    generated the frame.

    And second, is potentially the socket used to control a tunneling
    socket, such as one the encapsulates using UDP.

    We do not want to disassociate skb->sk when encapsulating in order
    to fix this, because that would break socket memory accounting.

    The most extreme case where this can cause huge problems is an
    AF_PACKET socket transmitting over a vxlan device. We hit code
    paths doing checks that assume they are dealing with an ipv4
    socket, but are actually operating upon the AF_PACKET one.

    Signed-off-by: David S. Miller

    David Miller
     

17 Nov, 2014

1 commit

  • Ebtables on the OUTPUT chain (NF_BR_LOCAL_OUT) would not work as expected
    for both locally generated IGMP and MLD queries. The IP header specific
    filter options are off by 14 Bytes for netfilter (actual output on
    interfaces is fine).

    NF_HOOK() expects the skb->data to point to the IP header, not the
    ethernet one (while dev_queue_xmit() does not). Luckily there is an
    br_dev_queue_push_xmit() helper function already - let's just use that.

    Introduced by eb1d16414339a6e113d89e2cca2556005d7ce919
    ("bridge: Add core IGMP snooping support")

    Ebtables example:

    $ ebtables -I OUTPUT -p IPv6 -o eth1 --logical-out br0 \
    --log --log-level 6 --log-ip6 --log-prefix="~EBT: " -j DROP

    before (broken):

    ~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
    MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
    SRC=64a4:39c2:86dd:6000:0000:0020:0001:fe80 IPv6 \
    DST=0000:0000:0000:0004:64ff:fea4:39c2:ff02, \
    IPv6 priority=0x3, Next Header=2

    after (working):

    ~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
    MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
    SRC=fe80:0000:0000:0000:0004:64ff:fea4:39c2 IPv6 \
    DST=ff02:0000:0000:0000:0000:0000:0000:0001, \
    IPv6 priority=0x0, Next Header=0

    Signed-off-by: Linus Lüssing
    Acked-by: Herbert Xu
    Signed-off-by: Pablo Neira Ayuso

    Linus Lüssing
     

23 Aug, 2014

1 commit

  • The use of "rcu_assign_pointer()" is NULLing out the pointer.
    According to RCU_INIT_POINTER()'s block comment:
    "1. This use of RCU_INIT_POINTER() is NULLing out the pointer"
    it is better to use it instead of rcu_assign_pointer() because it has a
    smaller overhead.

    The following Coccinelle semantic patch was used:
    @@
    @@

    - rcu_assign_pointer
    + RCU_INIT_POINTER
    (..., NULL)

    Signed-off-by: Andreea-Cristina Bernat
    Signed-off-by: David S. Miller

    Andreea-Cristina Bernat
     

07 Aug, 2014

1 commit

  • All other add functions for lists have the new item as first argument
    and the position where it is added as second argument. This was changed
    for no good reason in this function and makes using it unnecessary
    confusing.

    The name was changed to hlist_add_behind() to cause unconverted code to
    generate a compile error instead of using the wrong parameter order.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Ken Helias
    Cc: "Paul E. McKenney"
    Acked-by: Jeff Kirsher [intel driver bits]
    Cc: Hugh Dickins
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ken Helias