12 Oct, 2016

1 commit

  • Pull uaccess.h prepwork from Al Viro:
    "Preparations to tree-wide switch to use of linux/uaccess.h (which,
    obviously, will allow to start unifying stuff for real). The last step
    there, ie

    PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
    sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
    `git grep -l "$PATT"|grep -v ^include/linux/uaccess.h`

    is not taken here - I would prefer to do it once just before or just
    after -rc1. However, everything should be ready for it"

    * 'work.uaccess2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    remove a stray reference to asm/uaccess.h in docs
    sparc64: separate extable_64.h, switch elf_64.h to it
    score: separate extable.h, switch module.h to it
    mips: separate extable.h, switch module.h to it
    x86: separate extable.h, switch sections.h to it
    remove stray include of asm/uaccess.h from cacheflush.h
    mn10300: remove a bogus processor.h->uaccess.h include
    xtensa: split uaccess.h into C and asm sides
    bonding: quit messing with IOCTL
    kill __kernel_ds_p off
    mn10300: finish verify_area() off
    frv: move HAVE_ARCH_UNMAPPED_AREA to pgtable.h
    exceptions: detritus removal

    Linus Torvalds
     

28 Sep, 2016

1 commit

  • The only remaining users are issuing SIOCGMIIPHY and SIOCGMIIREG,
    neither of which deals with userland pointers. Simply calling
    ->ndo_do_ioctl() is fine; no messing with set_fs() is needed.
    It used to mess with SIOCETHTOOL, which would've needed set_fs(),
    but that has been killed in "[NET] ethtool ops are the only way"
    9 years ago...

    Signed-off-by: Al Viro

    Al Viro
     

13 Sep, 2016

1 commit


05 Sep, 2016

1 commit

  • The following steps will crash the kernel:

    (a) Create bonding master
    > modprobe bonding miimon=50
    (b) Create macvlan bridge on eth2
    > ip link add link eth2 dev mvl0 address aa:0:0:0:0:01 \
    type macvlan
    (c) Now try adding eth2 into the bond
    > echo +eth2 > /sys/class/net/bond0/bonding/slaves

    Bonding does lots of things before checking if the device enslaved is
    busy or not.

    In this case, when the notifier call-chain sends notifications,
    bond_netdev_event() assumes that the rx_handler/rx_handler_data is
    registered, while bond_enslave() hasn't progressed far enough to
    register the rx_handler for the new slave.

    This patch adds an rx_handler check that can be performed right at the
    beginning of the enslave code to avoid getting into this situation.
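
    A minimal userspace sketch of such an early check, under stated
    assumptions: struct model_dev and model_enslave() are hypothetical
    stand-ins, not the kernel's types or the patch's actual code. The point
    is only the ordering, with the busy check first and everything else
    after it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a network device. */
struct model_dev {
    const char *name;
    void *rx_handler;   /* non-NULL once an upper device (e.g. macvlan) has claimed it */
    int enslaved;
};

/* Model of the early check: refuse before any other enslave work is done. */
static int model_enslave(struct model_dev *slave, void *bond_handler)
{
    assert(slave != NULL && bond_handler != NULL);

    if (slave->rx_handler != NULL)
        return -EBUSY;  /* device already has an rx_handler registered */

    /* ... the remaining enslave steps would run here ... */
    slave->rx_handler = bond_handler;
    slave->enslaved = 1;
    return 0;
}
```

    With the check first, a device that already carries a handler is
    rejected before any state that notifiers might observe has changed.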

    Signed-off-by: Mahesh Bandewar
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Mahesh Bandewar
     

02 Sep, 2016

1 commit

  • alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces the
    deprecated create_singlethread_workqueue(). This is the identity
    conversion.

    The workqueue "wq" queues multiple work items, viz.
    &bond->mcast_work, &nnw->work, &bond->mii_work, &bond->arp_work,
    &bond->alb_work, &bond->ad_work and &bond->slave_arr_work,
    which require strict execution ordering. Hence, an ordered dedicated
    workqueue has been used.

    Since it is a network driver, WQ_MEM_RECLAIM has been set to
    ensure forward progress under memory pressure.

    Signed-off-by: Bhaktipriya Shridhar
    Acked-by: Tejun Heo
    Signed-off-by: David S. Miller

    Bhaktipriya Shridhar
     

10 Aug, 2016

1 commit


26 Jul, 2016

1 commit

  • When using an IPoIB bond currently only active-backup mode is a valid
    use case and this commit strengthens it.

    Since commit 2ab82852a270 ("net/bonding: Enable bonding to enslave
    netdevices not supporting set_mac_address()") was introduced till
    4.7-rc1, IPoIB didn't support the set_mac_address ndo, and hence the
    fail over mac policy always applied to IPoIB bonds.

    With the introduction of commit 492a7e67ff83 ("IB/IPoIB: Allow setting
    the device address"), that no longer holds, and in practice IPoIB bonds
    have been broken since then. To fix it, let's go to fail over mac if the
    device doesn't support the ndo OR this is an IPoIB device.

    As a by-product, this commit also prevents a stack corruption which
    occurred when trying to copy 20 bytes (IPoIB) device address
    to a sockaddr struct that has only 16 bytes of storage.
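
    The size mismatch behind that corruption can be sketched in userspace.
    This is an illustration of the overflow only, not the fix the commit
    applies (which is to select the fail over mac policy). struct
    model_sockaddr and model_copy_hw_addr() are hypothetical names, and the
    14-byte data field is an assumption that mirrors a 16-byte sockaddr
    layout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_ETH_ALEN   6   /* Ethernet hardware address length */
#define MODEL_IPOIB_ALEN 20  /* IPoIB hardware address length, per the text above */

/* Hypothetical 16-byte sockaddr-like struct: 2-byte family + 14 data bytes. */
struct model_sockaddr {
    unsigned short family;
    unsigned char data[14];
};

/* Copy a hardware address only if it fits in the destination storage. */
static int model_copy_hw_addr(struct model_sockaddr *sa,
                              const unsigned char *addr, size_t len)
{
    assert(sa != NULL && addr != NULL);

    if (len > sizeof(sa->data))
        return -EINVAL;     /* a 20-byte IPoIB address does not fit */

    memcpy(sa->data, addr, len);
    return 0;
}
```

    An unchecked memcpy() of 20 bytes into this struct would write past its
    end, which on a stack variable is the corruption described above.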

    Signed-off-by: Mark Bloch
    Signed-off-by: Or Gerlitz
    Signed-off-by: Saeed Mahameed
    Acked-by: Andy Gospodarek
    Signed-off-by: Jay Vosburgh
    Signed-off-by: David S. Miller

    Mark Bloch
     

24 Jul, 2016

1 commit


15 Jul, 2016

1 commit

  • Commit e826eafa65c6 ("bonding: Call netif_carrier_off after
    register_netdevice") moved netif_carrier_off() from bond_init() to
    bond_create(), but the latter is called only for initial default
    devices and ones created through sysfs:

    $ modprobe bonding
    $ echo +bond1 > /sys/class/net/bonding_masters
    $ ip link add bond2 type bond
    $ grep "MII Status" /proc/net/bonding/*
    /proc/net/bonding/bond0:MII Status: down
    /proc/net/bonding/bond1:MII Status: down
    /proc/net/bonding/bond2:MII Status: up

    Ensure that carrier is initially off also for devices created through
    netlink.

    Signed-off-by: Beniamino Galvani
    Signed-off-by: David S. Miller

    Beniamino Galvani
     

07 Jul, 2016

1 commit


06 Jul, 2016

2 commits

  • Currently, link notifications are not sent by
    bond_set_slave_link_state() upon enslavement if
    the slave is enslaved when up.

    This happens because slave->link default init value
    is 0, which is the same as BOND_LINK_UP, resulting
    in bond_set_slave_link_state() ignoring this transition.

    This patch sets the default value of slave->link to
    BOND_LINK_NOCHANGE, assuring it will count as a state
    transition and thus trigger notification logic.
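
    A small userspace model of why the initial value matters. The names are
    hypothetical; only BOND_LINK_UP being 0 comes from the text above, and
    the other numeric values are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_link {
    MODEL_LINK_NOCHANGE = -1,  /* assumed sentinel value */
    MODEL_LINK_UP       = 0,   /* 0, as stated in the text */
    MODEL_LINK_DOWN     = 2    /* assumed; any other distinct value works */
};

struct model_slave {
    int link;
    int notifications;
};

/* Returns true when the state changed and a notification was recorded. */
static bool model_set_link_state(struct model_slave *s, int state)
{
    assert(s != NULL);

    if (s->link == state)
        return false;       /* same state: transition is ignored */

    s->link = state;
    s->notifications++;
    return true;
}
```

    A zero-initialized slave already looks "up", so setting it to up is a
    no-op; starting from the sentinel makes the first update a real
    transition.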

    Signed-off-by: Aviv Heller
    Reviewed-by: Jiri Pirko
    Signed-off-by: Saeed Mahameed
    Signed-off-by: David S. Miller

    Aviv Heller
     
  • L2 upper device needs to propagate neigh_construct/destroy calls down to
    lower devices. Do this by defining default ndo functions and use them in
    team, bond, bridge and vlan.

    Signed-off-by: Jiri Pirko
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Jiri Pirko
     

01 Jul, 2016

1 commit

  • ether_addr_equal_64bits() requires some care about its arguments,
    namely that 8 bytes might be read, even if last 2 byte values are not
    used.

    KASan detected a violation with null_mac_addr and lacpdu_mcast_addr
    in bond_3ad.c

    Same problem with mac_bcast[] and mac_v6_allmcast[] in bond_alb.c :
    Although the 8-byte alignment was there, KASan would detect out
    of bound accesses.
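
    A hedged userspace sketch of the constraint: a comparison that loads 8
    bytes per argument is only safe on objects that are at least 8 bytes
    long, so 6-byte address constants need two bytes of padding.
    model_addr_equal_64bits() is a hypothetical model and not the kernel
    helper; it builds its mask from bytes so that it behaves the same on
    either endianness.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ETH_ALEN 6

/* Padded to 8 bytes so that an 8-byte load stays inside the object. */
static const uint8_t model_null_mac[MODEL_ETH_ALEN + 2] = { 0 };
static const uint8_t model_bcast_mac[MODEL_ETH_ALEN + 2] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

/* Reads 8 bytes from each argument but compares only the first 6. */
static bool model_addr_equal_64bits(const uint8_t *a, const uint8_t *b)
{
    static const uint8_t mask_bytes[8] = {
        0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00, 0x00
    };
    uint64_t x, y, mask;

    assert(a != NULL && b != NULL);

    memcpy(&x, a, sizeof(x));       /* 8-byte read: needs 8 valid bytes */
    memcpy(&y, b, sizeof(y));
    memcpy(&mask, mask_bytes, sizeof(mask));

    return ((x ^ y) & mask) == 0;   /* trailing 2 bytes are ignored */
}
```

    Passing a plain 6-byte array to such a function reads two bytes beyond
    it, which is the kind of access KASan reports.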

    Fixes: 815117adaf5b ("bonding: use ether_addr_equal_unaligned for bond addr compare")
    Fixes: bb54e58929f3 ("bonding: Verify RX LACPDU has proper dest mac-addr")
    Fixes: 885a136c52a8 ("bonding: use compare_ether_addr_64bits() in ALB")
    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Acked-by: Dmitry Vyukov
    Acked-by: Nikolay Aleksandrov
    Acked-by: Ding Tianhong
    Signed-off-by: David S. Miller

    Eric Dumazet
     

30 Jun, 2016

1 commit


28 Jun, 2016

1 commit

  • Since commit 7bb11dc9f59d ("bonding: unify all places where
    actor-oper key needs to be updated."), the logic in bonding to handle
    selection between multiple aggregators has not functioned.

    This affects only configurations wherein the bonding slaves
    connect to two discrete aggregators (e.g., two independent switches, each
    with LACP enabled), thus creating two separate aggregation groups within a
    single bond.

    The cause is a change in 7bb11dc9f59d to no longer set
    AD_PORT_BEGIN on a port after a link state change, which would cause the
    port to be reselected for attachment to an aggregator as if it were newly
    added to the bond. We cannot restore the prior behavior, as it
    contradicts IEEE 802.1AX 5.4.12, which requires ports that "become
    inoperable" (lose carrier, setting port_enabled=false as per 802.1AX
    5.4.7) to remain selected (i.e., assigned to the aggregator). As the port
    now remains selected, the aggregator selection logic is not invoked.

    A side effect of this change is that aggregators in bonding will
    now contain ports that are link down. The aggregator selection logic
    does not currently handle this situation correctly, causing incorrect
    aggregator selection.

    This patch makes two changes to repair the aggregator selection
    logic in bonding to function as documented and within the confines of the
    standard:

    First, the aggregator selection and related logic now utilizes the
    number of active ports per aggregator, not the number of selected ports
    (as some selected ports may be down). The ad_select "bandwidth" and
    "count" options only consider ports that are link up.

    Second, on any carrier state change of any slave, the aggregator
    selection logic is explicitly called to ensure the correct aggregator is
    active.
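
    The first change can be pictured with a simplified model of a "count"
    style comparison that looks at link-up ports rather than selected
    ports. All names here are hypothetical, and the tie-breaking rule (keep
    the current aggregator) is an assumption of this sketch, not something
    the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PORTS 8

struct model_agg {
    int id;
    size_t num_ports;                 /* ports selected for this aggregator */
    bool link_up[MODEL_MAX_PORTS];    /* carrier state per selected port */
};

/* Count only the selected ports whose link is up. */
static size_t model_active_ports(const struct model_agg *agg)
{
    size_t i, active = 0;

    assert(agg != NULL && agg->num_ports <= MODEL_MAX_PORTS);

    for (i = 0; i < agg->num_ports; i++)
        if (agg->link_up[i])
            active++;
    return active;
}

/* Prefer the candidate only if it has strictly more active ports. */
static const struct model_agg *
model_select_by_count(const struct model_agg *curr, const struct model_agg *cand)
{
    return model_active_ports(cand) > model_active_ports(curr) ? cand : curr;
}
```

    An aggregator with three selected ports but only one carrier then loses
    to one with two selected ports that are both up.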

    Reported-by: Veli-Matti Lintu
    Fixes: 7bb11dc9f59d ("bonding: unify all places where actor-oper key needs to be updated.")
    Signed-off-by: Jay Vosburgh
    Signed-off-by: David S. Miller

    Jay Vosburgh
     

10 Jun, 2016

1 commit


08 Jun, 2016

1 commit

  • Instead of using a single bit (__QDISC___STATE_RUNNING)
    in sch->__state, use a seqcount.

    This adds lockdep support, but more importantly it will allow us
    to sample qdisc/class statistics without having to grab qdisc root lock.
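
    A rough userspace sketch of the idea, using C11 atomics instead of the
    kernel's seqcount API. The sequence counter is odd while the qdisc is
    running, and a reader retries until it sees the same even value before
    and after sampling. The names are hypothetical, and the sketch assumes
    that callers of model_run_begin() are serialized by some outer lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_qdisc {
    atomic_uint running;        /* sequence counter: odd while running */
    _Atomic uint64_t packets;
    _Atomic uint64_t bytes;
};

/* Assumes callers are serialized externally (check and increment are separate). */
static bool model_run_begin(struct model_qdisc *q)
{
    assert(q != NULL);
    if (atomic_load(&q->running) & 1u)
        return false;                       /* already running */
    atomic_fetch_add(&q->running, 1u);      /* counter becomes odd */
    return true;
}

static void model_run_end(struct model_qdisc *q)
{
    assert(q != NULL);
    atomic_fetch_add(&q->running, 1u);      /* counter becomes even again */
}

/* Called only between model_run_begin() and model_run_end(). */
static void model_account(struct model_qdisc *q, uint64_t len)
{
    atomic_fetch_add(&q->packets, 1);
    atomic_fetch_add(&q->bytes, len);
}

/* Lockless sample: wait for an even counter, read, retry if it moved. */
static void model_read_stats(struct model_qdisc *q,
                             uint64_t *packets, uint64_t *bytes)
{
    unsigned int seq;

    do {
        do {
            seq = atomic_load(&q->running);
        } while (seq & 1u);
        *packets = atomic_load(&q->packets);
        *bytes = atomic_load(&q->bytes);
    } while (atomic_load(&q->running) != seq);
}
```

    The reader never takes a lock; it only pays for a retry when a run
    overlaps with the sample.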

    Signed-off-by: Eric Dumazet
    Cc: Cong Wang
    Cc: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Mar, 2016

2 commits

  • bond_get_stats() can be called from rtnetlink (with RTNL held)
    or from /proc/net/dev seq handler (with RCU held)

    The logic added in commit 5f0c5f73e5ef ("bonding: make global bonding
    stats more reliable") kind of assumed only one cpu could run there.

    If multiple threads are reading /proc/net/dev, stats can be really
    messed up after a while.

    A second problem is that some fields are 32bit, so we need to properly
    handle the wrap around problem.

    Given that RTNL is not always held, we need to use
    bond_for_each_slave_rcu().
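
    The wrap-around handling can be sketched as a per-field delta that is
    computed in 32-bit arithmetic when the counter appears to be 32 bits
    wide. model_fold_stat() is a hypothetical helper, and deciding the width
    from "neither sample uses the upper 32 bits" is one possible heuristic
    assumed for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Fold the change of one slave counter into a 64-bit bond total.
 * If neither sample uses the upper 32 bits, treat the counter as 32-bit
 * so that a wrap from 0xFFFFFFFF to 0 yields a small positive delta.
 */
static void model_fold_stat(uint64_t *total, uint64_t new_val, uint64_t old_val)
{
    assert(total != NULL);

    if (((new_val | old_val) >> 32) == 0)
        *total += (uint32_t)((uint32_t)new_val - (uint32_t)old_val);
    else
        *total += new_val - old_val;
}
```

    Without the 32-bit case, a wrapped counter would produce a huge 64-bit
    difference and corrupt the accumulated total.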

    Fixes: 5f0c5f73e5ef ("bonding: make global bonding stats more reliable")
    Signed-off-by: Eric Dumazet
    Cc: Andy Gospodarek
    Cc: Jay Vosburgh
    Cc: Veaceslav Falico
    Reviewed-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Remove unnecessary set of flag IFF_MULTICAST, since ether_setup
    already does this.

    Signed-off-by: Zhang Shengju
    Reviewed-by: Nikolay Aleksandrov
    Signed-off-by: Andy Gospodarek
    Signed-off-by: David S. Miller

    Zhang Shengju
     

26 Feb, 2016

1 commit


23 Feb, 2016

1 commit


17 Feb, 2016

1 commit

  • There is presently a race condition between the bonding periodic
    link monitor and the updating of a slave's speed and duplex. The former
    occurs on a periodic basis, and the latter in response to a driver's
    calling of netif_carrier_on.

    It is possible for the periodic monitor to run between the
    driver call of netif_carrier_on and the receipt of the NETDEV_CHANGE
    event that causes bonding to update the slave's speed and duplex. This
    manifests most notably as a report that a slave is up and "0 Mbps full
    duplex" after enslavement, but in principle could report an incorrect
    speed and duplex after any link up event if the device comes up with a
    different speed or duplex. This affects the 802.3ad aggregator
    selection, as the speed and duplex are selection criteria.

    This is fixed by updating the speed and duplex in the periodic
    monitor, prior to using that information.

    This was done historically in bonding, but the call to
    bond_update_speed_duplex was removed in commit 876254ae2758 ("bonding:
    don't call update_speed_duplex() under spinlocks"), as it might sleep
    under lock. Later, the locking was changed to only hold RTNL, and so
    after commit 876254ae2758 ("bonding: don't call update_speed_duplex()
    under spinlocks") this call is again safe.

    Tested-by: "Tantilov, Emil S"
    Cc: Veaceslav Falico
    Cc: dingtianhong
    Fixes: 876254ae2758 ("bonding: don't call update_speed_duplex() under spinlocks")
    Signed-off-by: Jay Vosburgh
    Acked-by: Ding Tianhong
    Signed-off-by: David S. Miller

    Jay Vosburgh
     

13 Feb, 2016

1 commit

  • The current logic in bond_arp_rcv will accept an incoming ARP for
    validation if (a) the receiving slave is either "active" (which includes
    the currently active slave, or the current ARP slave) or, (b) there is a
    currently active slave, and it has received an ARP since it became active.
    For case (b), the receiving slave isn't the currently active slave, and is
    receiving the original broadcast ARP request, not an ARP reply from the
    target.

    This logic can fail if there is no currently active slave. In
    this situation, the ARP probe logic cycles through all slaves, assigning
    each in turn as the "current_arp_slave" for one arp_interval, then setting
    that one as "active," and sending an ARP probe from that slave. The
    current logic expects the ARP reply to arrive on the sending
    current_arp_slave, however, due to switch FDB updating delays, the reply
    may be directed to another slave.

    This can arise if the bonding slaves and switch are working, but
    the ARP target is not responding. When the ARP target recovers, a
    condition may result wherein the ARP target host replies faster than the
    switch can update its forwarding table, causing each ARP reply to be sent
    to the previous current_arp_slave. This will never pass the logic in
    bond_arp_rcv, as neither of the above conditions (a) or (b) is met.

    Some experimentation on a LAN shows ARP reply round trips in the
    200 usec range, but my available switches never update their FDB in less
    than 4000 usec.

    This patch changes the logic in bond_arp_rcv to additionally
    accept an ARP reply for validation on any slave if there is a current ARP
    slave and it sent an ARP probe during the previous arp_interval.

    Fixes: aeea64ac717a ("bonding: don't trust arp requests unless active slave really works")
    Cc: Veaceslav Falico
    Cc: Andy Gospodarek
    Signed-off-by: Jay Vosburgh
    Signed-off-by: David S. Miller

    Jay Vosburgh
     

11 Feb, 2016

3 commits

  • Replace 'goto' with 'return' to remove an unnecessary check at the
    err_undo_flags label.

    The reason is that 'err_undo_flags' does two things for the first slave device:
    1. revert the bond mac address if it was set by the slave device.
    2. revert the bond device type if it's not ARPHRD_ETHER.

    Neither is necessary for the following three places, which changed neither
    the bond mac address nor the type. It's straightforward to return directly.

    Signed-off-by: Zhang Shengju
    Signed-off-by: David S. Miller

    Zhang Shengju
     
  • When kzalloc fails to allocate memory, the function should return
    -ENOMEM and not -1.

    Found using Coccinelle. A simplified version of the semantic patch
    used is:

    // <smpl>
    @@
    expression *e;
    @@

    e = kzalloc(...);
    if (e == NULL) {
    ...
    return
    - -1
    + -ENOMEM
    ;
    }
    // </smpl>

    The single call site only checks that the return value is not 0,
    hence no change is required at the call site.
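
    The same convention in a userspace sketch, with calloc() standing in
    for kzalloc(). model_init_table() and the allocator hook are
    hypothetical names introduced so that the failure path can be
    exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook with the same shape as calloc(). */
typedef void *(*model_alloc_fn)(size_t nmemb, size_t size);

/* Allocator that always fails, to simulate memory exhaustion. */
static void *model_failing_alloc(size_t nmemb, size_t size)
{
    (void)nmemb;
    (void)size;
    return NULL;
}

/* Returns 0 on success and -ENOMEM (rather than -1) on allocation failure. */
static int model_init_table(int **table, size_t entries, model_alloc_fn alloc)
{
    assert(table != NULL && alloc != NULL);

    *table = alloc(entries, sizeof(**table));
    if (*table == NULL)
        return -ENOMEM;
    return 0;
}
```

    A caller that only tests for a nonzero result behaves the same with
    either value, but -ENOMEM tells anyone who does inspect the code what
    went wrong.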

    Signed-off-by: Amitoj Kaur Chawla
    Signed-off-by: David S. Miller

    Amitoj Kaur Chawla
     
  • There is no need to require the bond to be down while changing these
    settings; the change will be reflected immediately and the 3ad mode will
    sort itself out. For faster convergence, set port->ntt to true in order
    to generate new LACPDUs immediately.

    CC: Jay Vosburgh
    CC: Veaceslav Falico
    CC: Andy Gospodarek
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

09 Feb, 2016

1 commit

  • Currently bonding allows setting ad_actor_system and prio while the
    bond device is down, but these are actually applied only if there aren't
    any slaves yet (applied to the bond device when the first slave shows up,
    and to slaves at 3ad bind time). After this patch, changes are applied
    immediately and the new values can be used/seen after the bond is upped,
    so it's no longer necessary to release all slaves and enslave them again
    to see the changes.

    CC: Jay Vosburgh
    CC: Veaceslav Falico
    CC: Andy Gospodarek
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: Jay Vosburgh
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

08 Feb, 2016

1 commit


06 Feb, 2016

2 commits

  • netdev_dbg() already adds the bond device name; it is more helpful if
    we also print the slave device name.

    Signed-off-by: Zhang Shengju
    Signed-off-by: David S. Miller

    Zhang Shengju
     
  • Sample output with this set applied for an active-backup bond:

    $ cat /sys/devices/virtual/net/bond0/lower_p7p1/statistics/rx_nohandler
    16568
    $ cat /sys/devices/virtual/net/bond0/lower_p5p2/statistics/rx_nohandler
    16583
    $ cat /sys/devices/virtual/net/bond0/statistics/rx_nohandler
    33151

    CC: Jay Vosburgh
    CC: Veaceslav Falico
    CC: Andy Gospodarek
    CC: netdev@vger.kernel.org
    Signed-off-by: Jarod Wilson
    Signed-off-by: David S. Miller

    Jarod Wilson
     

12 Jan, 2016

3 commits

  • Conflicts:
    drivers/net/bonding/bond_main.c
    drivers/net/ethernet/mellanox/mlxsw/spectrum.h
    drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c

    The bond_main.c and mellanox switch conflicts were cases of
    overlapping changes.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Commit 1f718f0f4f97 ("bonding: populate neighbour's private on enslave")
    undoes the fix provided by commit c2edacf80e15 ("bonding / ipv6: no addrconf
    for slaves separately from master") by effectively setting the slave flag
    after the slave has been opened. If the slave comes up quickly enough, it
    will go through the IPv6 addrconf before the slave flag has been set and
    will get a link local IPv6 address.

    In order to ensure that addrconf knows to ignore the slave devices on state
    change, set IFF_SLAVE before dev_open() during bonding enslavement.

    Fixes: 1f718f0f4f97 ("bonding: populate neighbour's private on enslave")
    Signed-off-by: Karl Heiss
    Signed-off-by: Jay Vosburgh
    Reviewed-by: Jarod Wilson
    Signed-off-by: Andy Gospodarek
    Signed-off-by: David S. Miller

    Karl Heiss
     
  • The spew in /proc/net/bonding/bond0 uses netif_carrier_ok() to determine
    mii_status, while /sys/class/net/bond0/bonding/mii_status looks at
    curr_active_slave, which doesn't actually seem to be set sometimes when
    the bond actually is up. A mode 4 bond configured via ifcfg-foo files on a
    Red Hat Enterprise Linux system, after boot, comes up clean and
    functional, but the sysfs node shows mii_status of down, while proc shows
    up. A simple enough fix here seems to be to use the same method for
    determining up or down in both places, and I'd opt for the one that seems
    to match reality.

    CC: Jay Vosburgh
    CC: Veaceslav Falico
    CC: Andy Gospodarek
    CC: netdev@vger.kernel.org
    Signed-off-by: Jarod Wilson
    Signed-off-by: David S. Miller

    Jarod Wilson
     

24 Dec, 2015

1 commit


16 Dec, 2015

1 commit

  • The name NETIF_F_ALL_CSUM is a misnomer. This does not correspond to the
    set of features for offloading all checksums. This is a mask of the
    checksum offload related features bits. It is incorrect to set both
    NETIF_F_HW_CSUM and NETIF_F_IP_CSUM or NETIF_F_IPV6_CSUM at the same
    time for features of a device.

    This patch:
    - Changes instances of NETIF_F_ALL_CSUM to NETIF_F_CSUM_MASK (where
    NETIF_F_ALL_CSUM is being used as a mask).
    - Changes bonding, sfc/efx, ipvlan, macvlan, vlan, and team drivers to
    use NETIF_F_HW_CSUM in features list instead of NETIF_F_ALL_CSUM.
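
    The difference between a mask and a feature set can be sketched as
    follows. The MODEL_F_* bit positions are illustrative assumptions and
    not the kernel's values, and model_csum_features_valid() is a
    hypothetical helper that encodes the rule stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are illustrative assumptions, not the kernel's values. */
#define MODEL_F_IP_CSUM   (1u << 0)  /* checksum offload for IPv4 only */
#define MODEL_F_HW_CSUM   (1u << 1)  /* generic checksum offload */
#define MODEL_F_IPV6_CSUM (1u << 2)  /* checksum offload for IPv6 only */
#define MODEL_F_SG        (1u << 3)  /* an unrelated feature bit */

/* A mask for testing checksum-related bits, not a feature set to advertise. */
#define MODEL_F_CSUM_MASK \
    (MODEL_F_IP_CSUM | MODEL_F_HW_CSUM | MODEL_F_IPV6_CSUM)

static_assert((MODEL_F_CSUM_MASK & MODEL_F_SG) == 0,
              "mask must cover checksum bits only");

/* Generic offload must not be combined with the protocol-specific bits. */
static bool model_csum_features_valid(uint32_t features)
{
    uint32_t csum = features & MODEL_F_CSUM_MASK;

    if ((csum & MODEL_F_HW_CSUM) &&
        (csum & (MODEL_F_IP_CSUM | MODEL_F_IPV6_CSUM)))
        return false;
    return true;
}
```

    Advertising the whole mask as a feature list sets all three bits at
    once, which is exactly the combination the check rejects.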

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

04 Dec, 2015

5 commits