19 Oct, 2016
1 commit
-
Satish reported a problem with the perm multicast router ports not getting
reenabled after some series of events, in particular if it happens that the
multicast snooping has been disabled and the port goes to disabled state
then it will be deleted from the router port list, but if it moves into
non-disabled state it will not be re-added because the mcast snooping is
still disabled, and enabling snooping later does nothing.Here are the steps to reproduce, setup br0 with snooping enabled and eth1
added as a perm router (multicast_router = 2):
1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
2. $ ip l set eth1 down
^ This step deletes the interface from the router list
3. $ ip l set eth1 up
^ This step does not add it again because mcast snooping is disabled
4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
5. $ bridge -d -s mdb showAt this point we have mcast enabled and eth1 as a perm router (value = 2)
but it is not in the router list which is incorrect.After this change:
1. $ echo 0 > /sys/class/net/br0/bridge/multicast_snooping
2. $ ip l set eth1 down
^ This step deletes the interface from the router list
3. $ ip l set eth1 up
^ This step does not add it again because mcast snooping is disabled
4. $ echo 1 > /sys/class/net/br0/bridge/multicast_snooping
5. $ bridge -d -s mdb show
router ports on br0: eth1Note: we can directly do br_multicast_enable_port for all because the
querier timer already has checks for the port state and will simply
expire if it's in blocking/disabled. See the comment added by
commit 9aa66382163e7 ("bridge: multicast: add a comment to
br_port_state_selection about blocking state")Fixes: 561f1103a2b7 ("bridge: Add multicast_snooping sysfs toggle")
Reported-by: Satish Ashok
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
01 Sep, 2016
1 commit
-
commit bc8c20acaea1 ("bridge: multicast: treat igmpv3 report with
INCLUDE and no sources as a leave") seems to have accidentally reverted
commit 47cc84ce0c2f ("bridge: fix parsing of MLDv2 reports"). This
commit brings back a change to br_ip6_multicast_mld2_report() where
parsing of MLDv2 reports stops when the first group is successfully
added to the MDB cache.Fixes: bc8c20acaea1 ("bridge: multicast: treat igmpv3 report with INCLUDE and no sources as a leave")
Signed-off-by: Davide Caratti
Acked-by: Nikolay Aleksandrov
Acked-by: Thadeu Lima de Souza Cascardo
Signed-off-by: David S. Miller
10 Jul, 2016
1 commit
-
As was suggested this patch adds support for the different versions of MLD
and IGMP query types. Since the user visible structure is still in net-next
we can augment it instead of adding netlink attributes.
The distinction between the different IGMP/MLD query types is done as
suggested in Section 7.1, RFC 3376 [1] and Section 8.1, RFC 3810 [2] based
on query payload size and code for IGMP. Since all IGMP packets go through
multicast_rcv() and it uses ip_mc_check_igmp/ipv6_mc_check_mld we can be
sure that at least the ip/ipv6 header can be directly used.[1] https://tools.ietf.org/html/rfc3376#section-7
[2] https://tools.ietf.org/html/rfc3810#section-8.1Suggested-by: Linus Lüssing
Signed-off-by: Nikolay Aleksandrov
Acked-by: Stephen Hemminger
Signed-off-by: David S. Miller
30 Jun, 2016
1 commit
-
This patch adds stats support for the currently used IGMP/MLD types by the
bridge. The stats are per-port (plus one stat per-bridge) and per-direction
(RX/TX). The stats are exported via netlink via the new linkxstats API
(RTM_GETSTATS). In order to minimize the performance impact, a new option
is used to enable/disable the stats - multicast_stats_enabled, similar to
the recent vlan stats. Also in order to avoid multiple IGMP/MLD type
lookups and checks, we make use of the current "igmp" member of the bridge
private skb->cb region to record the type on Rx (both host-generated and
external packets pass by multicast_rcv()). We can do that since the igmp
member was used as a boolean and all the valid IGMP/MLD types are positive
values. The normal bridge fast-path is not affected at all, the only
affected paths are the flooding ones and since we make use of the IGMP/MLD
type, we can quickly determine if the packet should be counted using
cache-hot data (cb's igmp member). We add counters for:
* IGMP Queries
* IGMP Leaves
* IGMP v1/v2/v3 reports* MLD Queries
* MLD Leaves
* MLD v1/v2 reportsThese are invaluable when monitoring or debugging complex multicast setups
with bridges.Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
28 Jun, 2016
1 commit
-
The bridge is falsly dropping ipv6 mulitcast packets if there is:
1. No ipv6 address assigned on the brigde.
2. No external mld querier present.
3. The internal querier enabled.When the bridge fails to build mld queries, because it has no
ipv6 address, it slilently returns, but keeps the local querier enabled.
This specific case causes confusing packet loss.Ipv6 multicast snooping can only work if:
a) An external querier is present
OR
b) The bridge has an ipv6 address an is capable of sending own queriesOtherwise it has to forward/flood the ipv6 multicast traffic,
because snooping cannot work.This patch fixes the issue by adding a flag to the bridge struct that
indicates that there is currently no ipv6 address assinged to the bridge
and returns a false state for the local querier in
__br_multicast_querier_exists().Special thanks to Linus Lüssing.
Fixes: d1d81d4c3dd8 ("bridge: check return value of ipv6_dev_get_saddr()")
Signed-off-by: Daniel Danzberger
Acked-by: Linus Lüssing
Signed-off-by: David S. Miller
07 May, 2016
1 commit
-
With the newly introduced helper functions the skb pulling is hidden
in the checksumming function - and undone before returning to the
caller.The IGMP and MLD query parsing functions in the bridge still
assumed that the skb is pointing to the beginning of the IGMP/MLD
message while it is now kept at the beginning of the IPv4/6 header.If there is a querier somewhere else, then this either causes
the multicast snooping to stay disabled even though it could be
enabled. Or, if we have the querier enabled too, then this can
create unnecessary IGMP / MLD query messages on the link.Fixing this by taking the offset between IP and IGMP/MLD header into
account, too.Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
Reported-by: Simon Wunderlich
Signed-off-by: Linus Lüssing
Signed-off-by: David S. Miller
25 Apr, 2016
1 commit
-
There is a race-condition when updating the mdb offload flag without using
the mulicast_lock. This reverts commit 9e8430f8d60d98 ("bridge: mdb:
Passing the port-group pointer to br_mdb module").This patch marks offloaded MDB entry as "offload" by changing the port-
group flags and marks it as MDB_PG_FLAGS_OFFLOAD.When switchdev PORT_MDB succeeded and adds a multicast group, a completion
callback is been invoked "br_mdb_complete". The completion function
locks the multicast_lock and finds the right net_bridge_port_group and
marks it as offloaded.Fixes: 9e8430f8d60d98 ("bridge: mdb: Passing the port-group pointer to br_mdb module")
Reported-by: Nikolay Aleksandrov
Signed-off-by: Elad Raz
Signed-off-by: Jiri Pirko
Reviewed-by: Ido Schimmel
Acked-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
02 Mar, 2016
3 commits
-
Add support for a temporary router port which doesn't depend only on the
incoming query. It can be refreshed if set to the same value, which is
a no-op for the rest.Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller -
This is needed for the upcoming temporary port router. There's no point
to go through the logic if the value is the same.Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller -
Using raw values makes it difficult to extend and also understand the
code, give them names and do explicit per-option manipulation in
br_multicast_set_port_router.Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
09 Feb, 2016
2 commits
-
Passing the port-group to br_mdb in order to allow direct access to the
structure. br_mdb will later use the structure to reflect HW reflection
status via "state" variable.Signed-off-by: Elad Raz
Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller -
Change net_bridge_port_group 'state' member to 'flags' and define new set
of flags internal to the kernel.Signed-off-by: Elad Raz
Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller
27 Sep, 2015
1 commit
-
Conflicts:
net/ipv4/arp.cThe net/ipv4/arp.c conflict was one commit adding a new
local variable while another commit was deleting one.Signed-off-by: David S. Miller
18 Sep, 2015
1 commit
-
Pass a network namespace parameter into the netfilter hooks. At the
call site of the netfilter hooks the path a packet is taking through
the network stack is well known which allows the network namespace to
be easily and reliabily.This allows the replacement of magic code like
"dev_net(state->in?:state->out)" that appears at the start of most
netfilter hooks with "state->net".In almost all cases the network namespace passed in is derived
from the first network device passed in, guaranteeing those
paths will not see any changes in practice.The exceptions are:
xfrm/xfrm_output.c:xfrm_output_resume() xs_net(skb_dst(skb)->xfrm)
ipvs/ip_vs_xmit.c:ip_vs_nat_send_or_cont() ip_vs_conn_net(cp)
ipvs/ip_vs_xmit.c:ip_vs_send_or_cont() ip_vs_conn_net(cp)
ipv4/raw.c:raw_send_hdrinc() sock_net(sk)
ipv6/ip6_output.c:ip6_xmit() sock_net(sk)
ipv6/ndisc.c:ndisc_send_skb() dev_net(skb->dev) not dev_net(dst->dev)
ipv6/raw.c:raw6_send_hdrinc() sock_net(sk)
br_netfilter_hooks.c:br_nf_pre_routing_finish() dev_net(skb->dev) before skb->dev is set to nf_bridge->physindevIn all cases these exceptions seem to be a better expression for the
network namespace the packet is being processed in then the historic
"dev_net(in?in:out)". I am documenting them in case something odd
pops up and someone starts trying to track down what happened.Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
12 Sep, 2015
1 commit
-
With the newly introduced helper functions the skb pulling is hidden in
the checksumming function - and undone before returning to the caller.The IGMPv3 and MLDv2 report parsing functions in the bridge still
assumed that the skb is pointing to the beginning of the IGMP/MLD
message while it is now kept at the beginning of the IPv4/6 header,
breaking the message parsing and creating packet loss.Fixing this by taking the offset between IP and IGMP/MLD header into
account, too.Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
Reported-by: Tobias Powalowski
Tested-by: Tobias Powalowski
Signed-off-by: Linus Lüssing
Signed-off-by: David S. Miller
22 Aug, 2015
1 commit
-
Conflicts:
drivers/net/usb/qmi_wwan.cOverlapping additions of new device IDs to qmi_wwan.c
Signed-off-by: David S. Miller
14 Aug, 2015
1 commit
-
The recent refactoring of the IGMP and MLD parsing code into
ipv6_mc_check_mld() / ip_mc_check_igmp() introduced a potential crash /
BUG() invocation for bridges:I wrongly assumed that skb_get() could be used as a simple reference
counter for an skb which is not the case. skb_get() bears additional
semantics, a user count. This leads to a BUG() invocation in
pskb_expand_head() / kernel panic if pskb_may_pull() is called on an skb
with a user count greater than one - unfortunately the refactoring did
just that.Fixing this by removing the skb_get() call and changing the API: The
caller of ipv6_mc_check_mld() / ip_mc_check_igmp() now needs to
additionally check whether the returned skb_trimmed is a clone.Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
Reported-by: Brenden Blanco
Signed-off-by: Linus Lüssing
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller
01 Aug, 2015
1 commit
-
Conflicts:
arch/s390/net/bpf_jit_comp.c
drivers/net/ethernet/ti/netcp_ethss.c
net/bridge/br_multicast.c
net/ipv4/ip_fragment.cAll four conflicts were cases of simple overlapping
changes.Signed-off-by: David S. Miller
30 Jul, 2015
1 commit
-
When fast leave is configured on a bridge port and an IGMP leave is
received for a group, the group is not deleted immediately if there is
a router detected or if multicast querier is configured.
Ideally the group should be deleted immediately when fast leave is
configured.Signed-off-by: Satish Ashok
Signed-off-by: David S. Miller
27 Jul, 2015
1 commit
-
Send notifications on router port add and del/expire, re-use the already
existing MDBA_ROUTER and send NEWMDB/DELMDB netlink notifications
respectively.Signed-off-by: Satish Ashok
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
23 Jul, 2015
1 commit
-
Conflicts:
net/bridge/br_mdb.cbr_mdb.c conflict was a function call being removed to fix a bug in
'net' but whose signature was changed in 'net-next'.Signed-off-by: David S. Miller
21 Jul, 2015
2 commits
-
When the bridge (or port) is brought down/up flush only temp entries and
leave the perm ones. Flush perm entries only when deleting the bridge
device or the associated port.Signed-off-by: Satish Ashok
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller -
Group notifications were not sent when a group expired or was deleted
due to bridge/port device being deleted. So add br_mdb_notify() to
br_multicast_del_pg().Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
16 Jul, 2015
1 commit
-
A report with INCLUDE/Change_to_include and empty source list should be
treated as a leave, specified by RFC 3376, section 3.1:
"If the requested filter mode is INCLUDE *and* the requested source
list is empty, then the entry corresponding to the requested
interface and multicast address is deleted if present. If no such
entry is present, the request is ignored."Signed-off-by: Satish Ashok
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
10 Jul, 2015
1 commit
-
Fill also the port group state when sending notifications.
Signed-off-by: Satish Ashok
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller
24 Jun, 2015
1 commit
-
Conflicts:
drivers/net/ethernet/mellanox/mlx4/main.c
net/packet/af_packet.cBoth conflicts were cases of simple overlapping changes.
Signed-off-by: David S. Miller
23 Jun, 2015
1 commit
-
When a port goes through a link down/up the multicast router configuration
is not restored.Signed-off-by: Satish Ashok
Signed-off-by: Nikolay Aleksandrov
Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries")
Acked-by: Herbert Xu
Signed-off-by: David S. Miller
14 Jun, 2015
1 commit
11 Jun, 2015
1 commit
-
Since the addition of sysfs multicast router support if one set
multicast_router to "2" more than once, then the port would be added to
the hlist every time and could end up linking to itself and thus causing an
endless loop for rlist walkers.
So to reproduce just do:
echo 2 > multicast_router; echo 2 > multicast_router;
in a bridge port and let some igmp traffic flow, for me it hangs up
in br_multicast_flood().
Fix this by adding a check in br_multicast_add_router() if the port is
already linked.
The reason this didn't happen before the addition of multicast_router
sysfs entries is because there's a !hlist_unhashed check that prevents
it.Signed-off-by: Nikolay Aleksandrov
Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries")
Acked-by: Herbert Xu
Signed-off-by: David S. Miller
02 Jun, 2015
1 commit
-
Conflicts:
drivers/net/phy/amd-xgbe-phy.c
drivers/net/wireless/iwlwifi/Kconfig
include/net/mac80211.hiwlwifi/Kconfig and mac80211.h were both trivial overlapping
changes.The drivers/net/phy/amd-xgbe-phy.c file got removed in 'net-next' and
the bug fix that happened on the 'net' side is already integrated
into the rest of the amd-xgbe driver.Signed-off-by: David S. Miller
31 May, 2015
1 commit
-
br_multicast_query_expired() querier argument is a pointer to
a struct bridge_mcast_querier :struct bridge_mcast_querier {
struct br_ip addr;
struct net_bridge_port __rcu *port;
};Intent of the code was to clear port field, not the pointer to querier.
Fixes: 2cd4143192e8 ("bridge: memorize and export selected IGMP/MLD querier port")
Signed-off-by: Eric Dumazet
Acked-by: Thadeu Lima de Souza Cascardo
Acked-by: Linus Lüssing
Cc: Linus Lüssing
Cc: Steinar H. Gunderson
Signed-off-by: David S. Miller
26 May, 2015
1 commit
-
Network managers like netifd (used in OpenWRT for instance) try to
configure interface options after creation but before setting the
interface up.Unfortunately the sysfs / bridge currently only allows to configure the
hash_max and multicast_router options when the bridge interface is up.
But since br_multicast_init() doesn't start any timers and only sets
default values and initializes timers it should be save to reconfigure
the default values after that, before things actually get active after
the bridge is set up.Signed-off-by: Linus Lüssing
Signed-off-by: David S. Miller
23 May, 2015
2 commits
-
Conflicts:
drivers/net/ethernet/cadence/macb.c
drivers/net/phy/phy.c
include/linux/skbuff.h
net/ipv4/tcp.c
net/switchdev/switchdev.cSwitchdev was a case of RTNH_H_{EXTERNAL --> OFFLOAD}
renaming overlapping with net-next changes of various
sorts.phy.c was a case of two changes, one adding a local
variable to a function whilst the second was removing
one.tcp.c overlapped a deadlock fix with the addition of new tcp_info
statistic values.macb.c involved the addition of two zyncq device entries.
skbuff.h involved adding back ipv4_daddr to nf_bridge_info
whilst net-next changes put two other existing members of
that struct into a union.Signed-off-by: David S. Miller
-
When more than a multicast address is present in a MLDv2 report, all but
the first address is ignored, because the code breaks out of the loop if
there has not been an error adding that address.This has caused failures when two guests connected through the bridge
tried to communicate using IPv6. Neighbor discoveries would not be
transmitted to the other guest when both used a link-local address and a
static address.This only happens when there is a MLDv2 querier in the network.
The fix will only break out of the loop when there is a failure adding a
multicast address.The mdb before the patch:
dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
dev ovirtmgmt port bond0.86 grp ff02::2 tempAfter the patch:
dev ovirtmgmt port vnet0 grp ff02::1:ff7d:6603 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff7d:6604 temp
dev ovirtmgmt port bond0.86 grp ff02::fb temp
dev ovirtmgmt port bond0.86 grp ff02::2 temp
dev ovirtmgmt port bond0.86 grp ff02::d temp
dev ovirtmgmt port vnet0 grp ff02::1:ff00:76 temp
dev ovirtmgmt port bond0.86 grp ff02::16 temp
dev ovirtmgmt port vnet1 grp ff02::1:ff00:77 temp
dev ovirtmgmt port bond0.86 grp ff02::1:ff00:def temp
dev ovirtmgmt port bond0.86 grp ff02::1:ffa1:40bf tempFixes: 08b202b67264 ("bridge br_multicast: IPv6 MLD support.")
Reported-by: Rik Theys
Signed-off-by: Thadeu Lima de Souza Cascardo
Tested-by: Rik Theys
Signed-off-by: David S. Miller
05 May, 2015
2 commits
-
With this patch, the IGMP and MLD message validation functions are moved
from the bridge code to IPv4/IPv6 multicast files. Some small
refactoring was done to enhance readibility and to iron out some
differences in behaviour between the IGMP and MLD parsing code (e.g. the
skb-cloning of MLD messages is now only done if necessary, just like the
IGMP part always did).Finally, these IGMP and MLD message validation functions are exported so
that not only the bridge can use it but batman-adv later, too.Signed-off-by: Linus Lüssing
Signed-off-by: David S. Miller -
Let's use these new, neat helpers.
Signed-off-by: Linus Lüssing
Acked-by: Stephen Hemminger
Signed-off-by: David S. Miller
08 Apr, 2015
1 commit
-
On the output paths in particular, we have to sometimes deal with two
socket contexts. First, and usually skb->sk, is the local socket that
generated the frame.And second, is potentially the socket used to control a tunneling
socket, such as one the encapsulates using UDP.We do not want to disassociate skb->sk when encapsulating in order
to fix this, because that would break socket memory accounting.The most extreme case where this can cause huge problems is an
AF_PACKET socket transmitting over a vxlan device. We hit code
paths doing checks that assume they are dealing with an ipv4
socket, but are actually operating upon the AF_PACKET one.Signed-off-by: David S. Miller
17 Nov, 2014
1 commit
-
Ebtables on the OUTPUT chain (NF_BR_LOCAL_OUT) would not work as expected
for both locally generated IGMP and MLD queries. The IP header specific
filter options are off by 14 Bytes for netfilter (actual output on
interfaces is fine).NF_HOOK() expects the skb->data to point to the IP header, not the
ethernet one (while dev_queue_xmit() does not). Luckily there is an
br_dev_queue_push_xmit() helper function already - let's just use that.Introduced by eb1d16414339a6e113d89e2cca2556005d7ce919
("bridge: Add core IGMP snooping support")Ebtables example:
$ ebtables -I OUTPUT -p IPv6 -o eth1 --logical-out br0 \
--log --log-level 6 --log-ip6 --log-prefix="~EBT: " -j DROPbefore (broken):
~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
SRC=64a4:39c2:86dd:6000:0000:0020:0001:fe80 IPv6 \
DST=0000:0000:0000:0004:64ff:fea4:39c2:ff02, \
IPv6 priority=0x3, Next Header=2after (working):
~EBT: IN= OUT=eth1 MAC source = 02:04:64:a4:39:c2 \
MAC dest = 33:33:00:00:00:01 proto = 0x86dd IPv6 \
SRC=fe80:0000:0000:0000:0004:64ff:fea4:39c2 IPv6 \
DST=ff02:0000:0000:0000:0000:0000:0000:0001, \
IPv6 priority=0x0, Next Header=0Signed-off-by: Linus Lüssing
Acked-by: Herbert Xu
Signed-off-by: Pablo Neira Ayuso
23 Aug, 2014
1 commit
-
The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1. This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.The following Coccinelle semantic patch was used:
@@
@@- rcu_assign_pointer
+ RCU_INIT_POINTER
(..., NULL)Signed-off-by: Andreea-Cristina Bernat
Signed-off-by: David S. Miller
07 Aug, 2014
1 commit
-
All other add functions for lists have the new item as first argument
and the position where it is added as second argument. This was changed
for no good reason in this function and makes using it unnecessary
confusing.The name was changed to hlist_add_behind() to cause unconverted code to
generate a compile error instead of using the wrong parameter order.[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Ken Helias
Cc: "Paul E. McKenney"
Acked-by: Jeff Kirsher [intel driver bits]
Cc: Hugh Dickins
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds