28 Jul, 2018

1 commit

  • There are two scenarios in which we restore deleted records. The first is
    when a device goes down and comes back up (or unmap/remap). In this
    scenario the new filter mode is the same as the previous one, because we
    get it from in_dev->mc_list and do not touch it while the device goes
    down and up.

    The other scenario is when a new socket joins a group that was just deleted
    and has not finished sending status reports. In this scenario, we should
    use the current filter mode instead of restoring the old one. There are 4
    cases in total.

    old_socket    new_socket    before_fix    after_fix
    IN(A)         IN(A)         ALLOW(A)      ALLOW(A)
    IN(A)         EX( )         TO_IN( )      TO_EX( )
    EX( )         IN(A)         TO_EX( )      ALLOW(A)
    EX( )         EX( )         TO_EX( )      TO_EX( )

    Fixes: 24803f38a5c0b (igmp: do not remove igmp souce list info when set link down)
    Fixes: 1666d49e1d416 (mld: do not remove mld souce list info when set link down)
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Hangbin Liu
     

13 Feb, 2018

1 commit

  • [ Upstream commit e7aadb27a5415e8125834b84a74477bfbee4eff5 ]

    Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.

    Timer callbacks do not ensure this locking.

    =============================
    WARNING: suspicious RCU usage
    4.15.0+ #200 Not tainted
    -----------------------------
    ./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!

    other info that might help us debug this:

    rcu_scheduler_active = 2, debug_locks = 1
    3 locks held by syzkaller616973/4074:
    #0: (&mm->mmap_sem){++++}, at: [] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
    #1: ((&im->timer)){+.-.}, at: [] lockdep_copy_map include/linux/lockdep.h:178 [inline]
    #1: ((&im->timer)){+.-.}, at: [] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
    #2: (&(&im->lock)->rlock){+.-.}, at: [] spin_lock_bh include/linux/spinlock.h:315 [inline]
    #2: (&(&im->lock)->rlock){+.-.}, at: [] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600

    stack backtrace:
    CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:

    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:53
    lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
    __in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
    igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
    igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
    add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
    add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
    igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
    igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
    igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
    call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
    expire_timers kernel/time/timer.c:1363 [inline]
    __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
    run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
    __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
    invoke_softirq kernel/softirq.c:365 [inline]
    irq_exit+0x1cc/0x200 kernel/softirq.c:405
    exiting_irq arch/x86/include/asm/apic.h:541 [inline]
    smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
    apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938

    Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot

    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

31 Jan, 2018

1 commit

  • [ Upstream commit ad23b750933ea7bf962678972a286c78a8fa36aa ]

    Commit "net: igmp: Use correct source address on IGMPv3 reports"
    introduced a check to validate the source address of locally generated
    IGMPv3 packets.
    Instead of checking the local interface address directly, it uses
    inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the
    local subnet (or equal to the point-to-point address if used).

    This breaks for point-to-point interfaces, so check against
    ifa->ifa_local directly.

    Cc: Kevin Cernekee
    Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
    Reported-by: Sebastian Gottschall
    Signed-off-by: Felix Fietkau
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Felix Fietkau
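
    A minimal user-space sketch of the difference between the two checks
    described in the commit above. The struct and function names
    (ifaddr_sketch, subnet_match, local_match) are hypothetical stand-ins and
    not kernel API. Addresses are kept in host byte order for brevity, and the
    point-to-point layout (peer address in "address", /32 mask) is an
    assumption about a typical configuration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an interface address entry.
 * All addresses are in host byte order to keep the sketch short. */
struct ifaddr_sketch {
    uint32_t local;   /* address assigned to this interface */
    uint32_t address; /* peer address on point-to-point links, else == local */
    uint32_t mask;    /* netmask */
};

/* Subnet-style match: true when addr shares the masked prefix of
 * ifa->address. On a point-to-point link with a /32 mask this compares
 * against the peer, so the interface's own address does not match. */
static bool subnet_match(uint32_t addr, const struct ifaddr_sketch *ifa)
{
    return ((addr ^ ifa->address) & ifa->mask) == 0;
}

/* Exact match against the address actually configured on the interface. */
static bool local_match(uint32_t addr, const struct ifaddr_sketch *ifa)
{
    return addr == ifa->local;
}
```

    With a point-to-point entry such as { local 10.0.0.1, peer 10.0.0.2,
    mask /32 }, subnet_match() rejects the interface's own address while
    local_match() accepts it, which is the failure mode the commit describes.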
     

03 Jan, 2018

2 commits

  • [ Upstream commit a46182b00290839fa3fa159d54fd3237bd8669f0 ]

    Closing a multicast socket after the final IPv4 address is deleted
    from an interface can generate a membership report that uses the
    source IP from a different interface. The following test script, run
    from an isolated netns, reproduces the issue:

    #!/bin/bash

    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link set dummy0 up
    ip link set dummy1 up
    ip addr add 10.1.1.1/24 dev dummy0
    ip addr add 192.168.99.99/24 dev dummy1

    tcpdump -U -i dummy0 &
    socat EXEC:"sleep 2" \
    UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 &

    sleep 1
    ip addr del 10.1.1.1/24 dev dummy0
    sleep 5
    kill %tcpdump

    RFC 3376 specifies that the report must be sent with a valid IP source
    address from the destination subnet, or from address 0.0.0.0. Add an
    extra check to make sure this is the case.

    Signed-off-by: Kevin Cernekee
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kevin Cernekee
     
  • [ Upstream commit b5476022bbada3764609368f03329ca287528dc8 ]

    IPv4 stack reacts to changes to small MTU, by disabling itself under
    RTNL.

    But there is a window where threads not using RTNL can see a wrong
    device mtu. This can lead to surprises, in igmp code where it is
    assumed the mtu is suitable.

    Fix this by reading device mtu once and checking IPv4 minimal MTU.

    This patch adds missing IPV4_MIN_MTU define, to not abuse
    ETH_MIN_MTU anymore.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
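
    The "read once, then check" pattern from the commit above can be sketched
    in user space as follows. This is only an illustration: dev_sketch,
    usable_mtu and IPV4_MIN_MTU_SKETCH are hypothetical names, the value 68 is
    assumed from the IPv4 minimum in RFC 791, and a volatile field stands in
    for whatever single-read primitive the kernel code actually uses.

```c
/* Assumed minimum IPv4 MTU (RFC 791); hypothetical name for this sketch. */
#define IPV4_MIN_MTU_SKETCH 68u

/* Hypothetical stand-in for a device whose mtu may change concurrently. */
struct dev_sketch {
    volatile unsigned int mtu;
};

/* Take one snapshot of the MTU and validate that snapshot, so the value
 * that is checked is the same value that is later used.
 * Returns 0 when the snapshot is too small for IPv4. */
static unsigned int usable_mtu(const struct dev_sketch *dev)
{
    unsigned int mtu = dev->mtu; /* single read into a local copy */

    if (mtu < IPV4_MIN_MTU_SKETCH)
        return 0;
    return mtu;
}
```

    Callers would size packets from the returned value only and skip sending
    when it is 0, rather than reading dev->mtu a second time.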
     

22 Aug, 2017

1 commit


17 Aug, 2017

1 commit

  • Anuradha reported that statically added groups for interfaces enslaved
    to a VRF device were not persisting. The problem is that igmp queries
    and reports need to use the data in the in_dev for the real ingress
    device rather than the VRF device. Update igmp_rcv accordingly.

    Fixes: e58e41596811 ("net: Enable support for VRF with ipv4 multicast")
    Reported-by: Anuradha Karuppiah
    Signed-off-by: David Ahern
    Reviewed-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    David Ahern
     

11 Aug, 2017

1 commit


10 Aug, 2017

1 commit

  • Commit dcd87999d415 ("igmp: net: Move igmp namespace init to correct file")
    moved the igmp sysctl initialization from tcp_sk_init to igmp_net_init. This
    function is only called as part of per-namespace initialization, and only if
    CONFIG_IP_MULTICAST is defined; otherwise the igmp_mc_init() call in ip_init
    is compiled out, causing the igmp pernet ops not to be registered and those
    sysctls to be left initialized to 0. However, certain functions, such as
    ip_mc_join_group, are always compiled and make use of some of those
    sysctls. Do a partial revert of the aforementioned commit and move the
    sysctl initialization into inet_init_net, so that they always have
    sane values.

    Fixes: dcd87999d415 ("igmp: net: Move igmp namespace init to correct file")
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=196595
    Reported-by: Gerardo Exequiel Pozzi
    Signed-off-by: Nikolay Borisov
    Signed-off-by: David S. Miller

    Nikolay Borisov
     

08 Aug, 2017

1 commit


01 Jul, 2017

1 commit

  • The refcount_t type and its corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This avoids accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
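
    The overflow hazard mentioned in the commit above can be illustrated with
    a simplified, non-atomic sketch. This is not the kernel's refcount_t
    implementation (which is atomic and has its own saturation rules);
    plain_inc and saturating_inc are hypothetical helpers that only contrast
    a wrapping counter with a saturating one.

```c
#include <limits.h>

/* Plain counter: unsigned arithmetic wraps, so incrementing UINT_MAX
 * yields 0. A reference count that wraps to 0 looks "unreferenced" and
 * the object could be freed while it is still in use. */
static unsigned int plain_inc(unsigned int count)
{
    return count + 1u;
}

/* Saturating counter (hypothetical): once the maximum is reached the
 * value sticks there, so it can never wrap back to 0. The object leaks
 * instead of being freed too early. */
static unsigned int saturating_inc(unsigned int count)
{
    return count == UINT_MAX ? UINT_MAX : count + 1u;
}
```

    The trade-off in this sketch is a possible memory leak on saturation in
    exchange for never reaching a premature free.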
     

22 Jun, 2017

1 commit


21 Jun, 2017

1 commit

  • Andrey reported a lockdep warning on non-initialized
    spinlock:

    INFO: trying to register non-static key.
    the code is fine but needs lockdep annotation.
    turning off the locking correctness validator.
    CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:16
    dump_stack+0x292/0x395 lib/dump_stack.c:52
    register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755
    ? 0xffffffffa0000000
    __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255
    lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
    __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135
    _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175
    spin_lock_bh ./include/linux/spinlock.h:304
    ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076
    igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194
    ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736

    We miss a spin_lock_init() in igmpv3_add_delrec(), probably
    because we previously never used the lock on this code path. Since
    we have already unlinked it from the global mc_tomb list, it is
    probably safe not to acquire this spinlock here. Still, it does
    no harm to take it, and doing so avoids conditional locking.

    Fixes: c38b7d327aaf ("igmp: acquire pmc lock for ip_mc_clear_src()")
    Reported-by: Andrey Konovalov
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    WANG Cong
     

16 Jun, 2017

1 commit

  • It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions (skb_put, __skb_put and pskb_put) return void *
    and remove all the casts across the tree, adding a (u8 *) cast only
    where the unsigned char pointer was used directly, all done with the
    following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_put, __skb_put };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_put, __skb_put };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    which actually doesn't cover pskb_put since there are only three
    users overall.

    A handful of stragglers were converted manually, notably a macro in
    drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
    instances in net/bluetooth/hci_sock.c. In the former file, I also
    had to fix one whitespace problem spatch introduced.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

14 Jun, 2017

1 commit

  • Andrey reported a use-after-free in add_grec():

    for (psf = *psf_list; psf; psf = psf_next) {
            ...
            psf_next = psf->sf_next;

    where the struct ip_sf_list's were already freed by:

    kfree+0xe8/0x2b0 mm/slub.c:3882
    ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
    ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
    ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
    inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
    sock_release+0x8d/0x1e0 net/socket.c:597
    sock_close+0x16/0x20 net/socket.c:1072

    This happens because we don't hold pmc->lock in ip_mc_clear_src()
    and a parallel mr_ifc_timer timer could jump in and access them.

    The RCU lock is there but it is merely for pmc itself, this
    spinlock could actually ensure we don't access them in parallel.

    Thanks to Eric and Long for discussion on this bug.

    Reported-by: Andrey Konovalov
    Cc: Eric Dumazet
    Cc: Xin Long
    Signed-off-by: Cong Wang
    Reviewed-by: Xin Long
    Signed-off-by: David S. Miller

    WANG Cong
     

10 Feb, 2017

1 commit

  • In function igmpv3/mld_add_delrec() we allocate pmc and put it in
    idev->mc_tomb, so we should free it when we don't need it in del_delrec().
    But I removed kfree(pmc) incorrectly in latest two patches. Now fix it.

    Fixes: 24803f38a5c0 ("igmp: do not remove igmp souce list info when ...")
    Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when ...")
    Reported-by: Daniel Borkmann
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
     

03 Jan, 2017

1 commit

  • 5.2. Action on Reception of a Query

    When a system receives a Query, it does not respond immediately.
    Instead, it delays its response by a random amount of time, bounded
    by the Max Resp Time value derived from the Max Resp Code in the
    received Query message. A system may receive a variety of Queries on
    different interfaces and of different kinds (e.g., General Queries,
    Group-Specific Queries, and Group-and-Source-Specific Queries), each
    of which may require its own delayed response.

    Before scheduling a response to a Query, the system must first
    consider previously scheduled pending responses and in many cases
    schedule a combined response. Therefore, the system must be able to
    maintain the following state:

    o A timer per interface for scheduling responses to General Queries.

    o A per-group and interface timer for scheduling responses to Group-
      Specific and Group-and-Source-Specific Queries.

    o A per-group and interface list of sources to be reported in the
      response to a Group-and-Source-Specific Query.

    When a new Query with the Router-Alert option arrives on an
    interface, provided the system has state to report, a delay for a
    response is randomly selected in the range (0, [Max Resp Time]) where
    Max Resp Time is derived from Max Resp Code in the received Query
    message. The following rules are then used to determine if a Report
    needs to be scheduled and the type of Report to schedule. The rules
    are considered in order and only the first matching rule is applied.

    1. If there is a pending response to a previous General Query
       scheduled sooner than the selected delay, no additional response
       needs to be scheduled.

    2. If the received Query is a General Query, the interface timer is
       used to schedule a response to the General Query after the
       selected delay. Any previously pending response to a General
       Query is canceled.
    --8<--
    Signed-off-by: David S. Miller

    Michal Tesar
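
    Rules 1 and 2 quoted in the commit above can be sketched as a small
    decision function. This is an illustration only, not the kernel code:
    gq_timer and schedule_general_query_response are hypothetical names, time
    is a plain counter, and counter wraparound is deliberately ignored.

```c
#include <stdbool.h>

/* Hypothetical per-interface timer state for General Query responses. */
struct gq_timer {
    bool pending;          /* is a response already scheduled? */
    unsigned long expires; /* absolute time at which it will fire */
};

/* Apply rules 1 and 2 for a received General Query.
 * now   - current time
 * delay - randomly selected delay within (0, Max Resp Time)
 * Returns true if the timer was (re)armed, false if nothing changed. */
static bool schedule_general_query_response(struct gq_timer *t,
                                            unsigned long now,
                                            unsigned long delay)
{
    unsigned long when = now + delay;

    /* Rule 1: a pending response scheduled sooner than the selected
     * delay already covers this query, so nothing new is scheduled. */
    if (t->pending && t->expires < when)
        return false;

    /* Rule 2: schedule the response after the selected delay; any
     * previously pending response is replaced (canceled). */
    t->pending = true;
    t->expires = when;
    return true;
}
```

    In this sketch a later query can only pull the pending response earlier
    and never pushes it back, which is the behaviour the two rules describe.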
     

25 Dec, 2016

1 commit


16 Nov, 2016

1 commit

  • In commit 24cf3af3fed5 ("igmp: call ip_mc_clear_src..."), we forgot to remove
    igmpv3_clear_delrec() from ip_mc_down(), which also calls ip_mc_clear_src().
    This makes us clear all IGMPv3 source filter info after NETDEV_DOWN.
    Move igmpv3_clear_delrec() to ip_mc_destroy_dev(); ip_mc_clear_src() is then
    no longer needed in ip_mc_destroy_dev().

    On the other hand, we should restore the source filter info in
    igmpv3_del_delrec() instead of freeing all of it. Otherwise we will not be
    able to restore IGMPv3 source filter info after NETDEV_UP and
    NETDEV_POST_TYPE_CHANGE.

    Fixes: 24cf3af3fed5 ("igmp: call ip_mc_clear_src() only when ...")
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
     

09 Aug, 2016

1 commit

  • Based on RFC3376 5.1 and RFC3810 6.1

    If the per-interface listening change that triggers the new report is
    a filter mode change, then the next [Robustness Variable] State
    Change Reports will include a Filter Mode Change Record. This
    applies even if any number of source list changes occur in that
    period.

    Old State         New State         State Change Record Sent
    ---------         ---------         ------------------------
    INCLUDE (A)       EXCLUDE (B)       TO_EX (B)
    EXCLUDE (A)       INCLUDE (B)       TO_IN (B)

    So we should not send a source-list change if there is a filter-mode change.

    Here are two scenarios:
    1. The group is deleted and the filter mode is EXCLUDE, which means we need
       to send a TO_IN { }.
    2. The group is not deleted but has pmc->crcount, which means we need to
       send a normal filter-mode change.

    At the same time, if the type is ALLOW or BLOCK and psf->sf_crcount is set,
    we stop adding records and decrease sf_crcount directly.

    Reference: https://www.ietf.org/mail-archive/web/magma/current/msg01274.html

    Signed-off-by: Hangbin Liu
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hangbin Liu
     

09 Mar, 2016

1 commit


04 Mar, 2016

1 commit

  • The current reserved_tailroom calculation fails to take hlen and tlen into
    account.

    skb:
     [__hlen__|__data____________|__tlen___|__extra__]
     ^                                               ^
     head                                            skb_end_offset

    In this representation, hlen + data + tlen is the size passed to alloc_skb.
    "extra" is the extra space made available in __alloc_skb because of
    rounding up by kmalloc. We can reorder the representation like so:

     [__hlen__|__data____________|__extra__|__tlen___]
     ^                                               ^
     head                                            skb_end_offset

    The maximum space available for ip headers and payload without
    fragmentation is min(mtu, data + extra). Therefore,
    reserved_tailroom
      = data + extra + tlen - min(mtu, data + extra)
      = skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
      = skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)

    Compare the second line to the current expression:
      reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
    and we can see that hlen and tlen are not taken into account.

    The min() in the third line can be expanded into:
    if mtu < skb_tailroom - tlen:
        reserved_tailroom = skb_tailroom - mtu
    else:
        reserved_tailroom = tlen

    Depending on hlen, tlen, mtu and the number of multicast address records,
    the current code may output skbs that have less tailroom than
    dev->needed_tailroom or it may output more skbs than needed because not all
    space available is used.

    Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
    Signed-off-by: Benjamin Poirier
    Acked-by: Hannes Frederic Sowa
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Benjamin Poirier
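
    The derivation in the commit above can be checked numerically with a
    short sketch. The function names and the sample numbers are hypothetical
    and chosen only for illustration; the arithmetic follows the third line of
    the derivation and the "current expression" quoted in the message.

```c
/* Small helper: minimum of two ints. */
static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Corrected formula (third line of the derivation), evaluated after
 * skb_reserve(hlen) so that skb_tailroom already excludes hlen. */
static int reserved_tailroom_fixed(int skb_tailroom, int mtu, int tlen)
{
    return skb_tailroom - min_int(mtu, skb_tailroom - tlen);
}

/* The earlier expression quoted in the message: neither hlen nor tlen
 * appears in it. */
static int reserved_tailroom_old(int skb_end_offset, int mtu)
{
    return skb_end_offset - min_int(mtu, skb_end_offset);
}
```

    For example, with skb_tailroom = 1000, mtu = 1500 and tlen = 16 the
    corrected formula reserves 16 bytes (exactly tlen), while the old
    expression applied to a 1000-byte buffer reserves 0 and leaves no room
    for the device's tailroom.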
     

17 Feb, 2016

1 commit

  • When the igmp-related sysctls were namespacified, their initialization was
    erroneously put into the tcp socket namespace constructor. This
    patch moves the relevant code into the igmp namespace constructor to
    keep things consistent.

    Also sprinkle some #ifdefs to silence warnings.

    Signed-off-by: Nikolay Borisov
    Signed-off-by: David S. Miller

    Nikolay Borisov
     

11 Feb, 2016

4 commits


04 Dec, 2015

1 commit

  • When a multicast group is joined on a socket, a struct ip_mc_socklist
    is appended to the socket's mc_list containing information about the
    joined group.

    If the interface is hot unplugged, this entry becomes stale. Prior to
    commit 52ad353a5344f ("igmp: fix the problem when mc leave group") it
    was possible to remove the stale entry by performing an
    IP_DROP_MEMBERSHIP, passing either the old ifindex or ip address on
    the interface. However, this fix enforces that the interface must
    still exist. Thus with time, the number of stale entries grows, until
    sysctl_igmp_max_memberships is reached and then it is not possible to
    join any more groups.

    The previous patch fixes an issue where an IP_DROP_MEMBERSHIP is
    performed without specifying the interface, either by ifindex or ip
    address. However, here we do supply one of these. So loosen the
    restriction on device existence to only apply when the interface has
    not been specified. This then restores the ability to clean up the
    stale entries.

    Signed-off-by: Andrew Lunn
    Fixes: 52ad353a5344f ("igmp: fix the problem when mc leave group")
    Signed-off-by: David S. Miller

    Andrew Lunn
     

05 Nov, 2015

1 commit

  • Sasha reported the following lockdep warning:

    Possible unsafe locking scenario:

    CPU0                    CPU1
    ----                    ----
    lock(sk_lock-AF_INET);
                            lock(rtnl_mutex);
                            lock(sk_lock-AF_INET);
    lock(rtnl_mutex);

    This is because for IP_MSFILTER and MCAST_MSFILTER, we take the
    rtnl lock before the socket lock in the setsockopt() path, but take
    the socket lock before the rtnl lock in the getsockopt() path. All the
    remaining optnames are setsockopt()-only.

    Fix this by aligning the getsockopt() path with the setsockopt()
    path, so that all mcast socket paths take the locks in the same
    order.

    Note, IPv6 part is different where rtnl lock is not held.

    Fixes: 54ff9ef36bdf ("ipv4, ipv6: kill ip_mc_{join, leave}_group and ipv6_sock_mc_{join, drop}")
    Reported-by: Sasha Levin
    Cc: Marcelo Ricardo Leitner
    Signed-off-by: Cong Wang
    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller

    WANG Cong
     

08 Oct, 2015

2 commits


30 Sep, 2015

1 commit

  • This patch updates ip_check_mc_rcu so that protocol is passed as a u8
    instead of a u16.

    The motivation is just to avoid any unneeded type transitions since some
    systems will require an instruction to zero extend a u8 field to a u16.
    Also it makes it a bit more readable as to the fact that protocol is a u8
    so there are no byte ordering changes needed to pass it.

    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     

29 Aug, 2015

1 commit

  • The range of addresses between 224.0.0.0 and 224.0.0.255 inclusive, is
    reserved for the use of routing protocols and other low-level topology
    discovery or maintenance protocols, such as gateway discovery and
    group membership reporting. Multicast routers should not forward any
    multicast datagram with destination addresses in this range,
    regardless of its TTL.

    Currently, IGMP reports are generated for this reserved range of
    addresses even though a router will ignore this information since it
    has no purpose. However, the presence of reserved group addresses in
    an IGMP membership report uses up network bandwidth and can also
    obscure addresses of interest when inspecting membership reports using
    packet inspection or debug messages.

    Although the RFCs for the various versions of IGMP (e.g. RFC 3376 for
    v3) do not specify that the reserved addresses be excluded from
    membership reports, it should do no harm to do so. In particular
    there should be no adverse effect in any IGMP snooping functionality
    since 224.0.0.x is specifically excluded as per RFC 4541 (IGMP and MLD
    Snooping Switches Considerations) section 2.1.2. Data Forwarding
    Rules:

    2) Packets with a destination IP (DIP) address in the 224.0.0.X
    range which are not IGMP must be forwarded on all ports.

    IGMP reports for local multicast groups can now be optionally
    inhibited by means of a system control variable (by setting the value
    to zero) e.g.:
    echo 0 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports

    To retain backwards compatibility the previous behaviour is retained
    by default on system boot or reverted by setting the value back to
    non-zero e.g.:
    echo 1 > /proc/sys/net/ipv4/igmp_link_local_mcast_reports

    Signed-off-by: Philip Downey
    Signed-off-by: David S. Miller

    Philip Downey
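
    The decision described in the commit above can be sketched as a pair of
    small predicates. This is a hedged illustration, not the kernel code:
    is_link_local_mcast and should_report are hypothetical names, addresses
    are in host byte order, and the sysctl value is passed in as a plain
    integer parameter.

```c
#include <stdbool.h>
#include <stdint.h>

/* True for addresses in 224.0.0.0 through 224.0.0.255 inclusive
 * (host byte order): the top 24 bits must equal 224.0.0. */
static bool is_link_local_mcast(uint32_t addr)
{
    return (addr & 0xFFFFFF00u) == 0xE0000000u;
}

/* Decide whether a membership report should be generated for addr.
 * link_local_reports mirrors the sysctl described above: non-zero keeps
 * the previous behaviour, zero suppresses reports for the reserved range. */
static bool should_report(uint32_t addr, int link_local_reports)
{
    if (is_link_local_mcast(addr) && link_local_reports == 0)
        return false;
    return true;
}
```

    With the setting at 0, a group such as 224.0.0.251 is skipped in this
    sketch while 239.0.1.68 is still reported; with a non-zero setting both
    are reported.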
     

14 Aug, 2015

1 commit

  • The recent refactoring of the IGMP and MLD parsing code into
    ipv6_mc_check_mld() / ip_mc_check_igmp() introduced a potential crash /
    BUG() invocation for bridges:

    I wrongly assumed that skb_get() could be used as a simple reference
    counter for an skb which is not the case. skb_get() bears additional
    semantics, a user count. This leads to a BUG() invocation in
    pskb_expand_head() / kernel panic if pskb_may_pull() is called on an skb
    with a user count greater than one - unfortunately the refactoring did
    just that.

    Fixing this by removing the skb_get() call and changing the API: The
    caller of ipv6_mc_check_mld() / ip_mc_check_igmp() now needs to
    additionally check whether the returned skb_trimmed is a clone.

    Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code")
    Reported-by: Brenden Blanco
    Signed-off-by: Linus Lüssing
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Linus Lüssing
     

05 May, 2015

1 commit

  • With this patch, the IGMP and MLD message validation functions are moved
    from the bridge code to IPv4/IPv6 multicast files. Some small
    refactoring was done to enhance readability and to iron out some
    differences in behaviour between the IGMP and MLD parsing code (e.g. the
    skb-cloning of MLD messages is now only done if necessary, just like the
    IGMP part always did).

    Finally, these IGMP and MLD message validation functions are exported so
    that not only the bridge can use it but batman-adv later, too.

    Signed-off-by: Linus Lüssing
    Signed-off-by: David S. Miller

    Linus Lüssing
     

04 Apr, 2015

2 commits

  • The ipv4 code uses a mixture of coding styles. In some instances check
    for non-NULL pointer is done as x != NULL and sometimes as x. x is
    preferred according to checkpatch and this patch makes the code
    consistent by adopting the latter form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     
  • The ipv4 code uses a mixture of coding styles. In some instances check
    for NULL pointer is done as x == NULL and sometimes as !x. !x is
    preferred according to checkpatch and this patch makes the code
    consistent by adopting the latter form.

    No changes detected by objdiff.

    Signed-off-by: Ian Morris
    Signed-off-by: David S. Miller

    Ian Morris
     

26 Mar, 2015

1 commit


19 Mar, 2015

1 commit

  • in favor of their inner __ ones, which don't grab rtnl.

    As these functions need to operate on a locked socket, we can't be
    grabbing rtnl by then. It's too late and doing so causes reversed
    locking.

    So this patch:
    - moves rtnl handling to the callers, while also fixing some
      reversed locking situations, such as in the vxlan and ipvs code.
    - renames the __ ones to drop the __ prefix:
        __ip_mc_{join,leave}_group -> ip_mc_{join,leave}_group
        __ipv6_sock_mc_{join,drop} -> ipv6_sock_mc_{join,drop}

    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner
     

28 Feb, 2015

1 commit

  • Joining multicast group on ethernet level via "ip maddr" command would
    not work if we have an Ethernet switch that does igmp snooping since
    the switch would not replicate multicast packets on ports that did not
    have IGMP reports for the multicast addresses.

    Linux vxlan interfaces created via "ip link add vxlan" have the group option
    that enables them to do the required join.

    By extending ip address command with option "autojoin" we can get similar
    functionality for openvswitch vxlan interfaces as well as other tunneling
    mechanisms that need to receive multicast traffic. The kernel code is
    structured similar to how the vxlan driver does a group join / leave.

    example:
    ip address add 224.1.1.10/24 dev eth5 autojoin
    ip address del 224.1.1.10/24 dev eth5

    Signed-off-by: Madhu Challa
    Signed-off-by: David S. Miller

    Madhu Challa