29 Mar, 2014

1 commit

  • addrconf_join_solict and addrconf_join_anycast may cause actions which
    need rtnl locked, especially on first address creation.

    A new DAD state is introduced which defers the initial DAD processing
    into a workqueue.

    To get rtnl lock we need to push the code paths which depend on those
    calls up to workqueues, specifically addrconf_verify and the DAD
    processing.

    (v2)
    addrconf_dad_failure needs to be queued up to the workqueue, too. This
    patch introduces a new DAD state and stops the DAD processing in the
    workqueue (this is because of the possible ipv6_del_addr processing
    which removes the solicited multicast address from the device).

    addrconf_verify_lock is removed, too. After the transition it is not
    needed any more.

    As we are no longer processing in bottom half context, we need to be a
    bit more careful about disabling bottom halves when we take spin_locks
    which are also used in bh.

    Relevant backtrace:
    [ 541.030090] RTNL: assertion failed at net/core/dev.c (4496)
    [ 541.031143] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 3.10.33-1-amd64-vyatta #1
    [ 541.031145] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
    [ 541.031146] ffffffff8148a9f0 000000000000002f ffffffff813c98c1 ffff88007c4451f8
    [ 541.031148] 0000000000000000 0000000000000000 ffffffff813d3540 ffff88007fc03d18
    [ 541.031150] 0000880000000006 ffff88007c445000 ffffffffa0194160 0000000000000000
    [ 541.031152] Call Trace:
    [ 541.031153] [] ? dump_stack+0xd/0x17
    [ 541.031180] [] ? __dev_set_promiscuity+0x101/0x180
    [ 541.031183] [] ? __hw_addr_create_ex+0x60/0xc0
    [ 541.031185] [] ? __dev_set_rx_mode+0xaa/0xc0
    [ 541.031189] [] ? __dev_mc_add+0x61/0x90
    [ 541.031198] [] ? igmp6_group_added+0xfc/0x1a0 [ipv6]
    [ 541.031208] [] ? kmem_cache_alloc+0xcb/0xd0
    [ 541.031212] [] ? ipv6_dev_mc_inc+0x267/0x300 [ipv6]
    [ 541.031216] [] ? addrconf_join_solict+0x2e/0x40 [ipv6]
    [ 541.031219] [] ? ipv6_dev_ac_inc+0x159/0x1f0 [ipv6]
    [ 541.031223] [] ? addrconf_join_anycast+0x92/0xa0 [ipv6]
    [ 541.031226] [] ? __ipv6_ifa_notify+0x11e/0x1e0 [ipv6]
    [ 541.031229] [] ? ipv6_ifa_notify+0x33/0x50 [ipv6]
    [ 541.031233] [] ? addrconf_dad_completed+0x28/0x100 [ipv6]
    [ 541.031241] [] ? task_cputime+0x2d/0x50
    [ 541.031244] [] ? addrconf_dad_timer+0x136/0x150 [ipv6]
    [ 541.031247] [] ? addrconf_dad_completed+0x100/0x100 [ipv6]
    [ 541.031255] [] ? call_timer_fn.isra.22+0x2a/0x90
    [ 541.031258] [] ? addrconf_dad_completed+0x100/0x100 [ipv6]

    Hunks and backtrace stolen from a patch by Stephen Hemminger.

    Reported-by: Stephen Hemminger
    Signed-off-by: Stephen Hemminger
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

14 Mar, 2014

1 commit

  • tmp_prefered_lft is an offset to ifp->tstamp, not now. Therefore
    age needs to be added to the condition.

    Age calculation in ipv6_create_tempaddr is different from the one
    in addrconf_verify and doesn't consider ADDRCONF_TIMER_FUZZ_MINUS.
    This can cause age in ipv6_create_tempaddr to be less than the one
    in addrconf_verify and therefore an unnecessary temporary address to
    be generated.
    Use the same age calculation as in addrconf_verify to avoid this.

    Signed-off-by: Heiner Kallweit
    Signed-off-by: David S. Miller

    Heiner Kallweit
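
The difference between the two age calculations can be sketched in user space as follows. This is only an illustration: the HZ value and the definition of ADDRCONF_TIMER_FUZZ_MINUS used here are assumptions, and the helper names age_plain and age_fuzzed are hypothetical, not kernel functions.

```c
#include <assert.h>

/* Assumed values for illustration only; the kernel derives these from its
 * own configuration. */
#define HZ 1000UL
#define ADDRCONF_TIMER_FUZZ_MINUS (HZ > 50 ? HZ / 50 : 1)

/* Age in seconds without the fuzz term (the calculation the commit message
 * attributes to ipv6_create_tempaddr before the fix). */
static unsigned long age_plain(unsigned long now, unsigned long tstamp)
{
    assert(now >= tstamp);
    return (now - tstamp) / HZ;
}

/* Age in seconds with the fuzz term added before the division (the
 * calculation the commit message attributes to addrconf_verify). */
static unsigned long age_fuzzed(unsigned long now, unsigned long tstamp)
{
    assert(now >= tstamp);
    return (now - tstamp + ADDRCONF_TIMER_FUZZ_MINUS) / HZ;
}
```

With these assumed values, a timestamp of 0 and now = 1799990 jiffies give an age of 1799 seconds without the fuzz term and 1800 seconds with it, so the two code paths can disagree about whether a lifetime boundary has been reached.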
     

18 Feb, 2014

1 commit

  • This bug was reported by Steinar H. Gunderson and was introduced by commit
    f7cb8886335d ("sit/gre6: don't try to add the same route two times").

    root@morgental:~# ip tunnel add foo mode gre remote 1.2.3.4 ttl 64
    root@morgental:~# ip link set foo up mtu 1468
    root@morgental:~# ip -6 route show dev foo
    fe80::/64 proto kernel metric 256

    but after the above commit, no such route shows up.

    There is no link local route because dev->dev_addr is 0 (because local ipv4
    address is 0), hence no link local address is configured.

    In this scenario, the link local address is added manually: 'ip -6 addr add
    fe80::1 dev foo' and because prefix is /128, no link local route is added by the
    kernel.

    Even if the right thing to do is to add the link local address with a /64
    prefix, we need to restore the previous behavior to avoid breaking userspace.

    Reported-by: Steinar H. Gunderson
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

25 Jan, 2014

1 commit

  • commit 25fb6ca4ed9cad72f14f61629b68dc03c0d9713f
    "net IPv6 : Fix broken IPv6 routing table after loopback down-up"
    allocates an addrconf router for the ipv6 address when the lo device
    comes up, but commit a881ae1f625c599b460cc8f8a7fcb1c438f699ad
    "ipv6:don't call addrconf_dst_alloc again when enable lo" breaks
    this behavior.

    Since the addrconf router is moved to the garbage list when the
    lo device goes down, we should release this router and reallocate
    a new one for the ipv6 address when the lo device comes up.

    This patch solves bug 67951 on bugzilla
    https://bugzilla.kernel.org/show_bug.cgi?id=67951

    change from v1:
    use ip6_rt_put to replace ip6_del_rt, thanks Hannes!
    change code style, suggested by Sergei.

    CC: Sabrina Dubroca
    CC: Hannes Frederic Sowa
    Reported-by: Weilong Chen
    Signed-off-by: Weilong Chen
    Signed-off-by: Gao feng
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Gao feng
     

20 Jan, 2014

1 commit

  • ipv6_link_dev_addr sorts newly added addresses by scope in
    ifp->addr_list. Smaller scope addresses are added to the tail of the
    list. Use this fact to iterate in reverse over addr_list and break out
    as soon as a higher scoped one shows up, so we can spare some cycles
    on machines with lots of addresses.

    The ordering of the addresses is not relevant and we are more likely to
    get the eui64 generated address with this change anyway.

    Suggested-by: Brian Haley
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
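
A minimal user-space sketch of the reverse walk with an early exit, assuming a plain array in place of the kernel list. The scope ranks, the struct layout and the function name find_link_local are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scope ranks; a larger value means a wider scope. */
enum scope { SCOPE_LINK = 2, SCOPE_SITE = 5, SCOPE_GLOBAL = 14 };

struct addr_entry {
    enum scope scope;
    int tentative;   /* non-zero: address is not usable yet */
};

/* list[] is kept sorted with wider scopes first, so link-local entries sit
 * at the tail. Walk from the tail and stop at the first wider-scoped entry.
 * Returns the index of a usable link-local entry, or -1 if there is none.
 * *visited reports how many entries were examined. */
static int find_link_local(const struct addr_entry *list, size_t n,
                           size_t *visited)
{
    assert(visited != NULL);
    assert(list != NULL || n == 0);

    *visited = 0;
    for (size_t i = n; i-- > 0; ) {
        (*visited)++;
        if (list[i].scope > SCOPE_LINK)
            break;          /* everything toward the head is wider still */
        if (!list[i].tentative)
            return (int)i;
    }
    return -1;
}
```

On an interface that only holds global addresses the loop gives up after a single entry instead of walking the whole list, which is where the saved cycles come from.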
     

18 Jan, 2014

2 commits

  • Conflicts:
    drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
    net/ipv4/tcp_metrics.c

    Overlapping changes between the "don't create two tcp metrics objects
    with the same key" race fix in net and the addition of the destination
    address in the lookup key in net-next.

    Minor overlapping changes in bnx2x driver.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • In commit 1ec047eb4751e3 ("ipv6: introduce per-interface counter for
    dad-completed ipv6 addresses") I built the detection of the first
    operational link-local address much too complex. Additionally this code
    now has a race condition.

    Replace it with a much simpler variant, which just scans the address
    list when duplicate address detection completes, to check if this is
    the first valid link local address and send RS and MLD reports then.

    Fixes: 1ec047eb4751e3 ("ipv6: introduce per-interface counter for dad-completed ipv6 addresses")
    Reported-by: Jiri Pirko
    Cc: Flavio Leitner
    Signed-off-by: Hannes Frederic Sowa
    Acked-by: Flavio Leitner
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
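
The simpler variant, a scan of the address list at DAD completion, might look roughly like this in user space. The struct fields and the function name first_operational_ll are hypothetical; the kernel code works on its own list and flag types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct addr_entry {
    bool link_local;
    bool dad_done;   /* hypothetical flag: DAD finished, address is usable */
};

/* Called for the entry at index `completed` once its DAD has finished.
 * Returns true when that entry is the only link-local address in list[]
 * that has completed DAD, i.e. the point at which RS and MLD reports
 * would be sent. */
static bool first_operational_ll(const struct addr_entry *list, size_t n,
                                 size_t completed)
{
    assert(list != NULL);
    assert(completed < n);

    if (!list[completed].link_local || !list[completed].dad_done)
        return false;

    for (size_t i = 0; i < n; i++) {
        if (i != completed && list[i].link_local && list[i].dad_done)
            return false;   /* another link-local address got there first */
    }
    return true;
}
```

Because the answer is recomputed from the list each time, there is no separate counter that can drift out of sync with the addresses themselves.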
     

16 Jan, 2014

2 commits

  • Refactor the deletion/update of prefix routes when removing an
    address. Now also consider IFA_F_NOPREFIXROUTE: if an address with
    this flag is present, do not clean up the route. Instead, assume
    that userspace is taking care of this route.

    Also perform the same cleanup when userspace changes an existing address
    to add NOPREFIXROUTE (to an address that didn't have this flag). This is
    done because when the address was added, a prefix route was created for it.
    Since the user now wants to handle this route themselves, we clean up
    this route.

    This cleanup of the route is not totally robust. There is no guarantee
    that the route we are about to delete was really the one added by the
    kernel. This behavior is not changed by the patch, and in practice it
    should work just fine.

    Signed-off-by: Thomas Haller
    Signed-off-by: David S. Miller

    Thomas Haller
     
  • When adding/modifying an IPv6 address, the userspace application needs
    a way to suppress adding a prefix route. This is for example relevant
    together with IFA_F_MANAGETEMPADDR, where userspace creates autoconf
    generated addresses, but depending on on-link, no route for the
    prefix should be added.

    Signed-off-by: Thomas Haller
    Signed-off-by: David S. Miller

    Thomas Haller
     

15 Jan, 2014

3 commits


10 Jan, 2014

2 commits

  • …wireless-next into for-davem

    Conflicts:
    net/ieee802154/6lowpan.c

    John W. Linville
     
  • In the past the IFA_PERMANENT flag indicated that the valid and preferred
    lifetime were ignored. Since change fad8da3e085ddf ("ipv6 addrconf: fix
    preferred lifetime state-changing behavior while valid_lft is infinity")
    we honour at least the preferred lifetime on those addresses. As such
    the valid lifetime gets recalculated and updated to 0.

    If loopback address is added manually this problem does not occur.
    Also if NetworkManager manages IPv6, those addresses will get added via
    inet6_rtm_newaddr and thus will have a correct lifetime, too.

    Reported-by: François-Xavier Le Bail
    Reported-by: Damien Wyart
    Fixes: fad8da3e085ddf ("ipv6 addrconf: fix preferred lifetime state-changing behavior while valid_lft is infinity")
    Cc: Yasushi Asano
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     

07 Jan, 2014

2 commits


03 Jan, 2014

1 commit

  • Fixed a problem with setting the lifetime of an IPv6
    address. When setting preferred_lft to a value other than zero or
    infinity while valid_lft is infinity (0xffffffff), the preferred
    lifetime is set to forever and does not update. Therefore the
    address never becomes deprecated. The valid lifetime and the
    preferred lifetime should be set independently: even if the
    valid lifetime is infinity, the preferred lifetime must expire
    correctly (meaning the address must eventually become deprecated).

    Signed-off-by: Yasushi Asano
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Yasushi Asano
     

02 Jan, 2014

1 commit


01 Jan, 2014

1 commit


23 Dec, 2013

1 commit


12 Dec, 2013

1 commit


11 Dec, 2013

1 commit

  • Turned out that applications like ifconfig do not handle the change.
    So revert ifa_flag format back to 2-letter hex value.

    Introduced by:
    commit 479840ffdbe4242e8a25349218c8e0859223aa35
    "ipv6 addrconf: extend ifa_flags to u32"

    Reported-by: Alexander Aring
    Signed-off-by: Jiri Pirko
    Tested-by: Florent Fourcot
    Signed-off-by: David S. Miller

    Jiri Pirko
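
A small sketch of what printing a 32-bit flags field as a 2-letter hex value amounts to: only the low 8 bits are shown, so parsers expecting exactly two hex digits keep working. The function name format_flags is hypothetical and the exact kernel format string is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write the low 8 bits of a 32-bit flags word as exactly two lowercase hex
 * digits. Returns the number of characters written (2 on success). */
static int format_flags(char *buf, size_t len, uint32_t flags)
{
    assert(buf != NULL && len >= 3);
    return snprintf(buf, len, "%02x", (unsigned int)(uint8_t)flags);
}
```

For example, a flags word of 0x180 is printed as "80"; the bits above the low byte are not visible through this interface and would have to be read some other way.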
     

10 Dec, 2013

3 commits


07 Dec, 2013

2 commits


03 Dec, 2013

1 commit


20 Nov, 2013

1 commit

  • Pull networking fixes from David Miller:
    "Mostly these are fixes for fallout due to merge window changes, as
    well as cures for problems that have been with us for a much longer
    period of time"

    1) Johannes Berg noticed two major deficiencies in our genetlink
    registration. Some genetlink protocols were passing in constant
    counts for their ops array rather than something like
    ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
    using fixed IDs for their multicast groups.

    We have to retain these fixed IDs to keep existing userland tools
    working, but reserve them so that other multicast groups used by
    other protocols can not possibly conflict.

    In dealing with these two problems, we actually now use less state
    management for genetlink operations and multicast groups.

    2) When configuring interface hardware timestamping, fix several
    drivers that simply do not validate that the hwtstamp_config value
    is one the driver actually supports. From Ben Hutchings.

    3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.

    4) In dev_forward_skb(), set the skb->protocol in the right order
    relative to skb_scrub_packet(). From Alexei Starovoitov.

    5) Bridge erroneously fails to use the proper wrapper functions to make
    calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
    Makita.

    6) When detaching a bridge port, make sure to flush all VLAN IDs to
    prevent them from leaking, also from Toshiaki Makita.

    7) Put in a compromise for TCP Small Queues so that deep queued devices
    that delay TX reclaim non-trivially don't have such a performance
    decrease. One particularly problematic area is 802.11 AMPDU in
    wireless. From Eric Dumazet.

    8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
    here. Fix from Eric Dumazet, reported by Dave Jones.

    9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.

    10) When computing mergeable buffer sizes, virtio-net fails to take the
    virtio-net header into account. From Michael Dalton.

    11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
    bumping, this one has been with us for a while. From Eric Dumazet.

    12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
    Hugne.

    13) 6lowpan bit used for traffic classification was wrong, from Jukka
    Rissanen.

    14) macvlan has the same issue as normal vlans did wrt. propagating LRO
    disabling down to the real device, fix it the same way. From Michal
    Kubecek.

    15) CPSW driver needs to soft reset all slaves during suspend, from
    Daniel Mack.

    16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.

    17) The xen-netfront RX buffer refill timer isn't properly scheduled on
    partial RX allocation success, from Ma JieYue.

    18) When ipv6 ping protocol support was added, the AF_INET6 protocol
    initialization cleanup path on failure was borked a little. Fix
    from Vlad Yasevich.

    19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
    blocks we can do the wrong thing with the msg_name we write back to
    userspace. From Hannes Frederic Sowa. There is another fix in the
    works from Hannes which will prevent future problems of this nature.

    20) Fix route leak in VTI tunnel transmit, from Fan Du.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
    genetlink: make multicast groups const, prevent abuse
    genetlink: pass family to functions using groups
    genetlink: add and use genl_set_err()
    genetlink: remove family pointer from genl_multicast_group
    genetlink: remove genl_unregister_mc_group()
    hsr: don't call genl_unregister_mc_group()
    quota/genetlink: use proper genetlink multicast APIs
    drop_monitor/genetlink: use proper genetlink multicast APIs
    genetlink: only pass array to genl_register_family_with_ops()
    tcp: don't update snd_nxt, when a socket is switched from repair mode
    atm: idt77252: fix dev refcnt leak
    xfrm: Release dst if this dst is improper for vti tunnel
    netlink: fix documentation typo in netlink_set_err()
    be2net: Delete secondary unicast MAC addresses during be_close
    be2net: Fix unconditional enabling of Rx interface options
    net, virtio_net: replace the magic value
    ping: prevent NULL pointer dereference on write to msg_name
    bnx2x: Prevent "timeout waiting for state X"
    bnx2x: prevent CFC attention
    bnx2x: Prevent panic during DMAE timeout
    ...

    Linus Torvalds
     

15 Nov, 2013

3 commits

  • addrconf_add_linklocal() already adds the link local route, so there is no
    reason to add it before calling this function.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • When a link local address was added to a sit interface, the corresponding route
    was not configured. This breaks routing protocols that use the link local
    address, like OSPFv3.

    To ease the code reading, I remove sit_route_add(), which only adds v4 mapped
    routes, and add this kind of route directly in sit_add_v4_addrs(). Thus link
    local and v4 mapped routes are configured in the same place.

    Reported-by: Li Hongjun
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • When the local IPv4 endpoint is wildcard (0.0.0.0), the prefix length is
    correctly set, i.e. 64 if the address is a link local one or 96 if the address is
    a v4 mapped one.
    But when the local endpoint is specified, the prefix length is set to 128 for
    both kinds of address. This patch fixes this wrong prefix length.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
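
The before/after behavior described in this commit can be summarized in a small user-space sketch. Both function names are hypothetical and the logic is a simplified model of the description above, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Prefix length as described before the fix: a specified local endpoint
 * forced /128 for both kinds of address. */
static int sit_plen_before(bool link_local, bool local_specified)
{
    if (local_specified)
        return 128;
    return link_local ? 64 : 96;
}

/* Prefix length as described after the fix: the kind of address alone
 * decides, whether or not a local endpoint was specified. */
static int sit_plen_after(bool link_local, bool local_specified)
{
    (void)local_specified;   /* no longer influences the result */
    return link_local ? 64 : 96;
}
```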
     

14 Nov, 2013

1 commit

  • Pull core locking changes from Ingo Molnar:
    "The biggest changes:

    - add lockdep support for seqcount/seqlocks structures, this
    unearthed both bugs and required extra annotation.

    - move the various kernel locking primitives to the new
    kernel/locking/ directory"

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
    block: Use u64_stats_init() to initialize seqcounts
    locking/lockdep: Mark __lockdep_count_forward_deps() as static
    lockdep/proc: Fix lock-time avg computation
    locking/doc: Update references to kernel/mutex.c
    ipv6: Fix possible ipv6 seqlock deadlock
    cpuset: Fix potential deadlock w/ set_mems_allowed
    seqcount: Add lockdep functionality to seqcount/seqlock structures
    net: Explicitly initialize u64_stats_sync structures for lockdep
    locking: Move the percpu-rwsem code to kernel/locking/
    locking: Move the lglocks code to kernel/locking/
    locking: Move the rwsem code to kernel/locking/
    locking: Move the rtmutex code to kernel/locking/
    locking: Move the semaphore core to kernel/locking/
    locking: Move the spinlock code to kernel/locking/
    locking: Move the lockdep code to kernel/locking/
    locking: Move the mutex code to kernel/locking/
    hung_task debugging: Add tracepoint to report the hang
    x86/locking/kconfig: Update paravirt spinlock Kconfig description
    lockstat: Report avg wait and hold times
    lockdep, x86/alternatives: Drop ancient lockdep fixup message
    ...

    Linus Torvalds
     

06 Nov, 2013

1 commit

  • In order to enable lockdep on seqcount/seqlock structures, we
    must explicitly initialize any locks.

    The u64_stats_sync structure uses a seqcount, and thus we need
    to introduce a u64_stats_init() function and use it to initialize
    the structure.

    This unfortunately adds a lot of fairly trivial initialization code
    to a number of drivers. But the benefit of ensuring correctness makes
    this worthwhile.

    Because these changes are required for lockdep to be enabled, and the
    changes are quite trivial, I've not yet split this patch out into 30-some
    separate patches, as I figured it would be better to get the various
    maintainers thoughts on how to best merge this change along with
    the seqcount lockdep enablement.

    Feedback would be appreciated!

    Signed-off-by: John Stultz
    Acked-by: Julian Anastasov
    Signed-off-by: Peter Zijlstra
    Cc: Alexey Kuznetsov
    Cc: "David S. Miller"
    Cc: Eric Dumazet
    Cc: Hideaki YOSHIFUJI
    Cc: James Morris
    Cc: Jesse Gross
    Cc: Mathieu Desnoyers
    Cc: "Michael S. Tsirkin"
    Cc: Mirko Lindner
    Cc: Patrick McHardy
    Cc: Roger Luethi
    Cc: Rusty Russell
    Cc: Simon Horman
    Cc: Stephen Hemminger
    Cc: Steven Rostedt
    Cc: Thomas Petazzoni
    Cc: Wensong Zhang
    Cc: netdev@vger.kernel.org
    Link: http://lkml.kernel.org/r/1381186321-4906-2-git-send-email-john.stultz@linaro.org
    Signed-off-by: Ingo Molnar

    John Stultz
     

29 Oct, 2013

1 commit


01 Oct, 2013

1 commit

  • Consider the scenario where an IPv6 router is advertising a fixed
    preferred_lft of 1800 seconds, while the valid_lft begins at 3600
    seconds and counts down in realtime.

    A client should reset its preferred_lft to 1800 every time the RA is
    received, but a bug is causing Linux to ignore the update.

    The core problem is here:
    if (prefered_lft != ifp->prefered_lft) {

    Note that ifp->prefered_lft is an offset, so it doesn't decrease over
    time. Thus, the comparison is always (1800 != 1800), which fails to
    trigger an update.

    The most direct solution would be to compute a "stored_prefered_lft",
    and use that value in the comparison. But I think that trying to filter
    out unnecessary updates here is a premature optimization. In order for
    the filter to apply, both of these would need to hold:

    - The advertised valid_lft and preferred_lft are both declining in
    real time.
    - No clock skew exists between the router & client.

    So in this patch, I've set "update_lft = 1" unconditionally, which
    allows the surrounding code to be greatly simplified.

    Signed-off-by: Paul Marks
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Paul Marks
     

29 Sep, 2013

1 commit

  • When a router is doing DNAT for 6to4/6rd packets the latest
    anti-spoofing commit 218774dc ("ipv6: add anti-spoofing checks for
    6to4 and 6rd") will drop them because the IPv6 address embedded does
    not match the IPv4 destination. This patch will allow them to pass by
    testing if we have an address that matches on 6to4/6rd interface. I
    have been hit by this problem using Fedora and IPV6TO4_IPV4ADDR.
    Also, log the dropped packets (with rate limit).

    Signed-off-by: Catalin(ux) M. BOIE
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Catalin(ux) M. BOIE
     

06 Sep, 2013

1 commit

  • Conflicts:
    drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c
    net/bridge/br_multicast.c
    net/ipv6/sit.c

    The conflicts were minor:

    1) sit.c changes overlap with change to ip_tunnel_xmit() signature.

    2) br_multicast.c had an overlap between computing max_delay using
    msecs_to_jiffies and turning MLDV2_MRC() into an inline function
    with a name using lowercase instead of uppercase letters.

    3) stmmac had two overlapping changes, one which conditionally allocated
    and hooked up a dma_cfg based upon the presence of the pbl OF property,
    and another one handling store-and-forward DMA mode. The latter
    should not go into the new of_find_property() basic block.

    Signed-off-by: David S. Miller

    David S. Miller
     

04 Sep, 2013

1 commit

  • This two-liner removes the max_addresses variable, which is now unnecessary
    after patch [ipv6: remove max_addresses check from ipv6_create_tempaddr].

    Signed-off-by: Petr Holasek
    Acked-by: Hannes Frederic Sowa
    Acked-by: Ding Tianhong
    Signed-off-by: David S. Miller

    Petr Holasek
     

01 Sep, 2013

1 commit