24 Dec, 2011

1 commit


23 Dec, 2011

2 commits

  • Chris Boot reported crashes occurring in ipv6_select_ident().

    [ 461.457562] RIP: 0010:[] []
    ipv6_select_ident+0x31/0xa7

    [ 461.578229] Call Trace:
    [ 461.580742]
    [ 461.582870] [] ? udp6_ufo_fragment+0x124/0x1a2
    [ 461.589054] [] ? ipv6_gso_segment+0xc0/0x155
    [ 461.595140] [] ? skb_gso_segment+0x208/0x28b
    [ 461.601198] [] ? ipv6_confirm+0x146/0x15e
    [nf_conntrack_ipv6]
    [ 461.608786] [] ? nf_iterate+0x41/0x77
    [ 461.614227] [] ? dev_hard_start_xmit+0x357/0x543
    [ 461.620659] [] ? nf_hook_slow+0x73/0x111
    [ 461.626440] [] ? br_parse_ip_options+0x19a/0x19a
    [bridge]
    [ 461.633581] [] ? dev_queue_xmit+0x3af/0x459
    [ 461.639577] [] ? br_dev_queue_push_xmit+0x72/0x76
    [bridge]
    [ 461.646887] [] ? br_nf_post_routing+0x17d/0x18f
    [bridge]
    [ 461.653997] [] ? nf_iterate+0x41/0x77
    [ 461.659473] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.665485] [] ? nf_hook_slow+0x73/0x111
    [ 461.671234] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.677299] [] ?
    nf_bridge_update_protocol+0x20/0x20 [bridge]
    [ 461.684891] [] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
    [ 461.691520] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.697572] [] ? NF_HOOK.constprop.8+0x3c/0x56
    [bridge]
    [ 461.704616] [] ?
    nf_bridge_push_encap_header+0x1c/0x26 [bridge]
    [ 461.712329] [] ? br_nf_forward_finish+0x8a/0x95
    [bridge]
    [ 461.719490] [] ?
    nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
    [ 461.727223] [] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
    [ 461.734292] [] ? nf_iterate+0x41/0x77
    [ 461.739758] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.746203] [] ? nf_hook_slow+0x73/0x111
    [ 461.751950] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.758378] [] ? NF_HOOK.constprop.4+0x56/0x56
    [bridge]

    This is caused by bridge netfilter special dst_entry (fake_rtable), a
    special shared entry, where attaching an inetpeer makes no sense.

    Problem is present since commit 87c48fa3b46 (ipv6: make fragment
    identifications less predictable)

    Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and
    __ip_select_ident() fallback to the 'no peer attached' handling.

    Reported-by: Chris Boot
    Tested-by: Chris Boot
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Signed-off-by: Stephen Rothwell
    Acked-by: Eric Dumazet
    Acked-by: David Miller
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     

22 Dec, 2011

1 commit

  • Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer)
    removed IP route cache garbage collector a bit too soon, as this gc was
    responsible for expired routes cleanup, releasing their neighbour
    reference.

    As pointed out by Robert Gladewitz, recent kernels can fill and exhaust
    their neighbour cache.

    Reintroduce the garbage collection, since we'll have to wait our
    neighbour lookups become refcount-less to not depend on this stuff.

    Reported-by: Robert Gladewitz
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Dec, 2011

1 commit


06 Dec, 2011

2 commits


03 Dec, 2011

2 commits


02 Dec, 2011

1 commit


01 Dec, 2011

2 commits


27 Nov, 2011

5 commits

  • Now inetpeer is the place where we cache redirect information for ipv4
    destinations, we must be able to invalidate informations when a route is
    added/removed on host.

    As inetpeer is not yet namespace aware, this patch adds a shared
    redirect_genid, and a per inetpeer redirect_genid. This might be changed
    later if inetpeer becomes ns aware.

    Cache information for one inerpeer is valid as long as its
    redirect_genid has the same value than global redirect_genid.

    Reported-by: Arkadiusz Miśkiewicz
    Tested-by: Arkadiusz Miśkiewicz
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • The pmtu informations on the inetpeer are visible for output and
    input routes. On packet forwarding, we might propagate a learned
    pmtu to the sender. As we update the pmtu informations of the
    inetpeer on demand, the original sender of the forwarded packets
    might never notice when the pmtu to that inetpeer increases.
    So use the mtu of the outgoing device on packet forwarding instead
    of the pmtu to the final destination.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • We move all mtu handling from dst_mtu() down to the protocol
    layer. So each protocol can implement the mtu handling in
    a different manner.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • We plan to invoke the dst_opt->default_mtu() method unconditioally
    from dst_mtu(). So rename the method to dst_opt->mtu() to match
    the name with the new meaning.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • As it is, we return null as the default mtu of blackhole routes.
    This may lead to a propagation of a bogus pmtu if the default_mtu
    method of a blackhole route is invoked. So return dst->dev->mtu
    as the default mtu instead.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     

19 Nov, 2011

1 commit

  • commit f39925dbde77 (ipv4: Cache learned redirect information in
    inetpeer.) introduced a regression in ICMP redirect handling.

    It assumed ipv4_dst_check() would be called because all possible routes
    were attached to the inetpeer we modify in ip_rt_redirect(), but thats
    not true.

    commit 7cc9150ebe (route: fix ICMP redirect validation) tried to fix
    this but solution was not complete. (It fixed only one route)

    So we must lookup existing routes (including different TOS values) and
    call check_peer_redir() on them.

    Reported-by: Ivan Zahariev
    Signed-off-by: Eric Dumazet
    CC: Flavio Leitner
    Signed-off-by: David S. Miller

    Eric Dumazet
     

09 Nov, 2011

1 commit

  • As we update the learned pmtu informations on demand, we might
    report a nagative expiration time value to userspace if the
    pmtu informations are already expired and we have not send a
    packet to that inetpeer after expiration. With this patch we
    send a expire time of null to userspace after expiration
    until the next packet is send to that inetpeer.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     

25 Oct, 2011

2 commits


24 Oct, 2011

1 commit

  • The commit f39925dbde7788cfb96419c0f092b086aa325c0f
    (ipv4: Cache learned redirect information in inetpeer.)
    removed some ICMP packet validations which are required by
    RFC 1122, section 3.2.2.2:
    ...
    A Redirect message SHOULD be silently discarded if the new
    gateway address it specifies is not on the same connected
    (sub-) net through which the Redirect arrived [INTRO:2,
    Appendix A], or if the source of the Redirect is not the
    current first-hop gateway for the specified destination (see
    Section 3.3.1).

    Signed-off-by: Flavio Leitner
    Signed-off-by: David S. Miller

    Flavio Leitner
     

04 Oct, 2011

1 commit


21 Aug, 2011

1 commit


12 Aug, 2011

1 commit


11 Aug, 2011

1 commit

  • As rt_iif represents input device even for packets
    coming from loopback with output route, it is not an unique
    key specific to input routes. Now rt_route_iif has such role,
    it was fl.iif in 2.6.38, so better to change the checks at
    some places to save CPU cycles and to restore 2.6.38 semantics.

    compare_keys:
    - input routes: only rt_route_iif matters, rt_iif is same
    - output routes: only rt_oif matters, rt_iif is not
    used for matching in __ip_route_output_key
    - now we are back to 2.6.38 state

    ip_route_input_common:
    - matching rt_route_iif implies input route
    - compared to 2.6.38 we eliminated one rth->fl.oif check
    because it was not needed even for 2.6.38

    compare_hash_inputs:
    Only the change here is not an optimization, it has
    effect only for output routes. I assume I'm restoring
    the original intention to ignore oif, it was using fl.iif
    - now we are back to 2.6.38 state

    Signed-off-by: Julian Anastasov
    Signed-off-by: David S. Miller

    Julian Anastasov
     

08 Aug, 2011

1 commit

  • compare_keys and ip_route_input_common rely on
    rt_oif for distinguishing of input and output routes
    with same keys values. But sometimes the input route has
    also same hash chain (keyed by iif != 0) with the output
    routes (keyed by orig_oif=0). Problem visible if running
    with small number of rhash_entries.

    Fix them to use rt_route_iif instead. By this way
    input route can not be returned to users that request
    output route.

    The patch fixes the ip_rt_bug errors that were
    reported in ip_local_out context, mostly for 255.255.255.255
    destinations.

    Signed-off-by: Julian Anastasov
    Signed-off-by: David S. Miller

    Julian Anastasov
     

07 Aug, 2011

1 commit

  • Computers have become a lot faster since we compromised on the
    partial MD4 hash which we use currently for performance reasons.

    MD5 is a much safer choice, and is inline with both RFC1948 and
    other ISS generators (OpenBSD, Solaris, etc.)

    Furthermore, only having 24-bits of the sequence number be truly
    unpredictable is a very serious limitation. So the periodic
    regeneration and 8-bit counter have been removed. We compute and
    use a full 32-bit sequence number.

    For ipv6, DCCP was found to use a 32-bit truncated initial sequence
    number (it needs 43-bits) and that is fixed here as well.

    Reported-by: Dan Kaminsky
    Tested-by: Willy Tarreau
    Signed-off-by: David S. Miller

    David S. Miller
     

03 Aug, 2011

1 commit

  • Gergely Kalman reported crashes in check_peer_redir().

    It appears commit f39925dbde778 (ipv4: Cache learned redirect
    information in inetpeer.) added a race, leading to possible NULL ptr
    dereference.

    Since we can now change dst neighbour, we should make sure a reader can
    safely use a neighbour.

    Add RCU protection to dst neighbour, and make sure check_peer_redir()
    can be called safely by different cpus in parallel.

    As neighbours are already freed after one RCU grace period, this patch
    should not add typical RCU penalty (cache cold effects)

    Many thanks to Gergely for providing a pretty report pointing to the
    bug.

    Reported-by: Gergely Kalman
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

24 Jul, 2011

1 commit


18 Jul, 2011

2 commits


17 Jul, 2011

1 commit


14 Jul, 2011

1 commit

  • Now that there is a one-to-one correspondance between neighbour
    and hh_cache entries, we no longer need:

    1) dynamic allocation
    2) attachment to dst->hh
    3) refcounting

    Initialization of the hh_cache entry is indicated by hh_len
    being non-zero, and such initialization is always done with
    the neighbour's lock held as a writer.

    Signed-off-by: David S. Miller

    David S. Miller
     

13 Jul, 2011

1 commit

  • Get rid of all of the useless and costly indirection
    by doing the neigh hash table lookup directly inside
    of the neighbour binding.

    Rename from arp_bind_neighbour to rt_bind_neighbour.

    Use new helpers {__,}ipv4_neigh_lookup()

    In rt_bind_neighbour() get rid of useless tests which
    are never true in the context this function is called,
    namely dev is never NULL and the dst->neighbour is
    always NULL.

    Signed-off-by: David S. Miller

    David Miller
     

02 Jul, 2011

1 commit


21 Jun, 2011

1 commit


19 Jun, 2011

1 commit

  • Knut Tidemann found that first packet of a multicast flow was not
    correctly received, and bisected the regression to commit b23dd4fe42b4
    (Make output route lookup return rtable directly.)

    Special thanks to Knut, who provided a very nice bug report, including
    sample programs to demonstrate the bug.

    Reported-and-bisectedby: Knut Tidemann
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

10 Jun, 2011

1 commit

  • The message size allocated for rtnl ifinfo dumps was limited to
    a single page. This is not enough for additional interface info
    available with devices that support SR-IOV and caused a bug in
    which VF info would not be displayed if more than approximately
    40 VFs were created per interface.

    Implement a new function pointer for the rtnl_register service that will
    calculate the amount of data required for the ifinfo dump and allocate
    enough data to satisfy the request.

    Signed-off-by: Greg Rose
    Signed-off-by: Jeff Kirsher

    Greg Rose
     

09 Jun, 2011

1 commit

  • commit 2c8cec5c10bc (ipv4: Cache learned PMTU information in inetpeer)
    added some racy peer->pmtu_expires accesses.

    As its value can be changed by another cpu/thread, we should be more
    careful, reading its value once.

    Add peer_pmtu_expired() and peer_pmtu_cleaned() helpers

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet