23 Dec, 2011

2 commits

  • Chris Boot reported crashes occurring in ipv6_select_ident().

    [ 461.457562] RIP: 0010:[] []
    ipv6_select_ident+0x31/0xa7

    [ 461.578229] Call Trace:
    [ 461.580742]
    [ 461.582870] [] ? udp6_ufo_fragment+0x124/0x1a2
    [ 461.589054] [] ? ipv6_gso_segment+0xc0/0x155
    [ 461.595140] [] ? skb_gso_segment+0x208/0x28b
    [ 461.601198] [] ? ipv6_confirm+0x146/0x15e
    [nf_conntrack_ipv6]
    [ 461.608786] [] ? nf_iterate+0x41/0x77
    [ 461.614227] [] ? dev_hard_start_xmit+0x357/0x543
    [ 461.620659] [] ? nf_hook_slow+0x73/0x111
    [ 461.626440] [] ? br_parse_ip_options+0x19a/0x19a
    [bridge]
    [ 461.633581] [] ? dev_queue_xmit+0x3af/0x459
    [ 461.639577] [] ? br_dev_queue_push_xmit+0x72/0x76
    [bridge]
    [ 461.646887] [] ? br_nf_post_routing+0x17d/0x18f
    [bridge]
    [ 461.653997] [] ? nf_iterate+0x41/0x77
    [ 461.659473] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.665485] [] ? nf_hook_slow+0x73/0x111
    [ 461.671234] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.677299] [] ?
    nf_bridge_update_protocol+0x20/0x20 [bridge]
    [ 461.684891] [] ? nf_ct_zone+0xa/0x17 [nf_conntrack]
    [ 461.691520] [] ? br_flood+0xfa/0xfa [bridge]
    [ 461.697572] [] ? NF_HOOK.constprop.8+0x3c/0x56
    [bridge]
    [ 461.704616] [] ?
    nf_bridge_push_encap_header+0x1c/0x26 [bridge]
    [ 461.712329] [] ? br_nf_forward_finish+0x8a/0x95
    [bridge]
    [ 461.719490] [] ?
    nf_bridge_pull_encap_header+0x1c/0x27 [bridge]
    [ 461.727223] [] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge]
    [ 461.734292] [] ? nf_iterate+0x41/0x77
    [ 461.739758] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.746203] [] ? nf_hook_slow+0x73/0x111
    [ 461.751950] [] ? __br_deliver+0xa0/0xa0 [bridge]
    [ 461.758378] [] ? NF_HOOK.constprop.4+0x56/0x56
    [bridge]

    This is caused by bridge netfilter special dst_entry (fake_rtable), a
    special shared entry, where attaching an inetpeer makes no sense.

    Problem is present since commit 87c48fa3b46 (ipv6: make fragment
    identifications less predictable)

    Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and
    __ip_select_ident() fallback to the 'no peer attached' handling.

    Reported-by: Chris Boot
    Tested-by: Chris Boot
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Signed-off-by: Stephen Rothwell
    Acked-by: Eric Dumazet
    Acked-by: David Miller
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     

22 Dec, 2011

1 commit

  • Commit 2c8cec5c10b (ipv4: Cache learned PMTU information in inetpeer)
    removed IP route cache garbage collector a bit too soon, as this gc was
    responsible for expired routes cleanup, releasing their neighbour
    reference.

    As pointed out by Robert Gladewitz, recent kernels can fill and exhaust
    their neighbour cache.

    Reintroduce the garbage collection, since we'll have to wait our
    neighbour lookups become refcount-less to not depend on this stuff.

    Reported-by: Robert Gladewitz
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Dec, 2011

1 commit

  • previous commit 3fb72f1e6e6165c5f495e8dc11c5bbd14c73385c
    makes IP-Config wait for carrier on at least one network device.

    Before waiting (predefined value 120s), check that at least one device
    was successfully brought up. Otherwise (e.g. buggy bootloader
    which does not set the MAC address) there is no point in waiting
    for carrier.

    Cc: Micha Nelissen
    Cc: Holger Brunck
    Signed-off-by: Gerlando Falauto
    Signed-off-by: David S. Miller

    Gerlando Falauto
     

13 Dec, 2011

1 commit

  • Same fix as 731abb9cb2 for ipip and sit tunnel.
    Commit 1c5cae815d removed an explicit call to dev_alloc_name in
    ipip_tunnel_locate and ipip6_tunnel_locate, because register_netdevice
    will now create a valid name, however the tunnel keeps a copy of the
    name in the private parms structure. Fix this by copying the name back
    after register_netdevice has successfully returned.

    This shows up if you do a simple tunnel add, followed by a tunnel show:

    $ sudo ip tunnel add mode ipip remote 10.2.20.211
    $ ip tunnel
    tunl0: ip/ip remote any local any ttl inherit nopmtudisc
    tunl%d: ip/ip remote 10.2.20.211 local any ttl inherit
    $ sudo ip tunnel add mode sit remote 10.2.20.212
    $ ip tunnel
    sit0: ipv6/ip remote any local any ttl 64 nopmtudisc 6rd-prefix 2002::/16
    sit%d: ioctl 89f8 failed: No such device
    sit%d: ipv6/ip remote 10.2.20.212 local any ttl inherit

    Cc: stable@vger.kernel.org
    Signed-off-by: Ted Feng
    Signed-off-by: David S. Miller

    Ted Feng
     

06 Dec, 2011

1 commit

  • If ipv4_valdiate_peer() fails during a cached entry lookup,
    we'll NULL derer since the loop iterator assumes rth is not
    NULL.

    Letting this be handled as a failure is just bogus, so just make it
    not fail. If we have trouble getting a non-NULL neighbour for the
    redirected gateway, just restore the original gateway and continue.

    The very next use of this cached route will try again.

    Reported-by: Dan Carpenter
    Signed-off-by: David S. Miller

    David S. Miller
     

03 Dec, 2011

1 commit


02 Dec, 2011

3 commits


01 Dec, 2011

1 commit


29 Nov, 2011

1 commit


27 Nov, 2011

5 commits

  • Now inetpeer is the place where we cache redirect information for ipv4
    destinations, we must be able to invalidate informations when a route is
    added/removed on host.

    As inetpeer is not yet namespace aware, this patch adds a shared
    redirect_genid, and a per inetpeer redirect_genid. This might be changed
    later if inetpeer becomes ns aware.

    Cache information for one inerpeer is valid as long as its
    redirect_genid has the same value than global redirect_genid.

    Reported-by: Arkadiusz Miśkiewicz
    Tested-by: Arkadiusz Miśkiewicz
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • The pmtu informations on the inetpeer are visible for output and
    input routes. On packet forwarding, we might propagate a learned
    pmtu to the sender. As we update the pmtu informations of the
    inetpeer on demand, the original sender of the forwarded packets
    might never notice when the pmtu to that inetpeer increases.
    So use the mtu of the outgoing device on packet forwarding instead
    of the pmtu to the final destination.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • We move all mtu handling from dst_mtu() down to the protocol
    layer. So each protocol can implement the mtu handling in
    a different manner.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • We plan to invoke the dst_opt->default_mtu() method unconditioally
    from dst_mtu(). So rename the method to dst_opt->mtu() to match
    the name with the new meaning.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • As it is, we return null as the default mtu of blackhole routes.
    This may lead to a propagation of a bogus pmtu if the default_mtu
    method of a blackhole route is invoked. So return dst->dev->mtu
    as the default mtu instead.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     

24 Nov, 2011

3 commits


23 Nov, 2011

1 commit


22 Nov, 2011

1 commit

  • This patch tries to fix the following issue in netfilter:
    In ip_route_me_harder(), we invoke pskb_expand_head() that
    rellocates new header with additional head room which can break
    the alignment of the original packet header.

    In one of my NAT test case, the NIC port for internal hosts is
    configured with vlan and the port for external hosts is with
    general configuration. If we ping an external "unknown" hosts from an
    internal host, an icmp packet will be sent. We find that in
    icmp_send()->...->ip_route_me_harder()->pskb_expand_head(), hh_len=18
    and current headroom (skb_headroom(skb)) of the packet is 16. After
    calling pskb_expand_head() the packet header becomes to be unaligned
    and then our system (arch/tile) panics immediately.

    Signed-off-by: Paul Guo
    Acked-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso

    Paul Guo
     

19 Nov, 2011

2 commits

  • commit f39925dbde77 (ipv4: Cache learned redirect information in
    inetpeer.) introduced a regression in ICMP redirect handling.

    It assumed ipv4_dst_check() would be called because all possible routes
    were attached to the inetpeer we modify in ip_rt_redirect(), but thats
    not true.

    commit 7cc9150ebe (route: fix ICMP redirect validation) tried to fix
    this but solution was not complete. (It fixed only one route)

    So we must lookup existing routes (including different TOS values) and
    call check_peer_redir() on them.

    Reported-by: Ivan Zahariev
    Signed-off-by: Eric Dumazet
    CC: Flavio Leitner
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • ping module incorrectly increments ICMP_MIB_INERRORS if feeded with a
    frame not belonging to its own sockets.

    RFC 2011 states that ICMP_MIB_INERRORS should count "the number of ICMP
    messages which the entiry received but determined as having
    ICMP-specific errors (bad ICMP checksums, bad length, etc.)."

    Signed-off-by: Eric Dumazet
    CC: Vasiliy Kulikov
    Acked-by: Flavio Leitner
    Acked-by: Vasiliy Kulikov
    Signed-off-by: David S. Miller

    Eric Dumazet
     

17 Nov, 2011

1 commit

  • Simon Kirby reported divides by zero errors in __tcp_select_window()

    This happens when inet_csk_route_child_sock() returns a NULL pointer :

    We free new socket while we eventually armed keepalive timer in
    tcp_create_openreq_child()

    Fix this by a call to tcp_clear_xmit_timers()

    [ This is a followup to commit 918eb39962dff (net: add missing
    bh_unlock_sock() calls) ]

    Reported-by: Simon Kirby
    Signed-off-by: Eric Dumazet
    Tested-by: Simon Kirby
    Signed-off-by: David S. Miller

    Eric Dumazet
     

14 Nov, 2011

1 commit

  • commit 3ceca749668a52bd795585e0f71c6f0b04814f7b added a TOS attribute.

    Unfortunately TOS and TCLASS are both present in a dual-stack v6 socket,
    furthermore they can have different values. As such one cannot in a
    sane way expose both through a single attribute.

    Signed-off-by: Maciej Żenczyowski
    CC: Murali Raja
    CC: Stephen Hemminger
    CC: Eric Dumazet
    CC: David S. Miller
    Signed-off-by: David S. Miller

    Maciej Żenczykowski
     

13 Nov, 2011

1 commit

  • When the ahash driver returns -EBUSY, AH4/6 input functions return
    NET_XMIT_DROP, presumably copied from the output code path. But
    returning transmit codes on input doesn't make a lot of sense.
    Since NET_XMIT_DROP is a positive int, this gets interpreted as
    the next header type (i.e., success). As that can only end badly,
    remove the check.

    Signed-off-by: Nick Bowler
    Signed-off-by: David S. Miller

    Nick Bowler
     

10 Nov, 2011

3 commits

  • When opt->srr_is_hit is set skb_rtable(skb) has been updated for
    'nexthop' and iph->daddr should always equals to skb_rtable->rt_dst
    holds, We need update iph->daddr either.

    Signed-off-by: Li Wei
    Signed-off-by: David S. Miller

    Li Wei
     
  • The AH4/6 ahash input callbacks read out the nexthdr field from the AH
    header *after* they overwrite that header. This is obviously not going
    to end well. Fix it up.

    Signed-off-by: Nick Bowler
    Signed-off-by: David S. Miller

    Nick Bowler
     
  • The AH4/6 ahash output callbacks pass nexthdr to xfrm_output_resume
    instead of the error code. This appears to be a copy+paste error from
    the input case, where nexthdr is expected. This causes the driver to
    continuously add AH headers to the datagram until either an allocation
    fails and the packet is dropped or the ahash driver hits a synchronous
    fallback and the resulting monstrosity is transmitted.

    Correct this issue by simply passing the error code unadulterated.

    Signed-off-by: Nick Bowler
    Signed-off-by: David S. Miller

    Nick Bowler
     

09 Nov, 2011

2 commits

  • As we update the learned pmtu informations on demand, we might
    report a nagative expiration time value to userspace if the
    pmtu informations are already expired and we have not send a
    packet to that inetpeer after expiration. With this patch we
    send a expire time of null to userspace after expiration
    until the next packet is send to that inetpeer.

    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Steffen Klassert
     
  • TCP_NODELAY is weaker than TCP_CORK, when TCP_CORK was set, small
    segments will always pass Nagle test regardless of TCP_NODELAY option.

    Signed-off-by: Feng King
    Signed-off-by: David S. Miller

    Feng King
     

07 Nov, 2011

1 commit

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     

04 Nov, 2011

1 commit

  • Simon Kirby reported lockdep warnings and following messages :

    [104661.897577] huh, entered softirq 3 NET_RX ffffffff81613740
    preempt_count 00000101, exited with 00000102?

    [104661.923653] huh, entered softirq 3 NET_RX ffffffff81613740
    preempt_count 00000101, exited with 00000102?

    Problem comes from commit 0e734419
    (ipv4: Use inet_csk_route_child_sock() in DCCP and TCP.)

    If inet_csk_route_child_sock() returns NULL, we should release socket
    lock before freeing it.

    Another lock imbalance exists if __inet_inherit_port() returns an error
    since commit 093d282321da ( tproxy: fix hash locking issue when using
    port redirection in __inet_inherit_port()) a backport is also needed for
    >= 2.6.37 kernels.

    Reported-by: Simon Kirby
    Signed-off-by: Eric Dumazet
    Tested-by: Eric Dumazet
    CC: Balazs Scheidler
    CC: KOVACS Krisztian
    Reviewed-by: Thomas Gleixner
    Tested-by: Simon Kirby
    Signed-off-by: David S. Miller

    Eric Dumazet
     

02 Nov, 2011

2 commits

  • udp_queue_rcv_skb() has a possible race in encap_rcv handling, since
    this pointer can be changed anytime.

    We should use ACCESS_ONCE() to close the race.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • the tcp and udp code creates a set of struct file_operations at runtime
    while it can also be done at compile time, with the added benefit of then
    having these file operations be const.

    the trickiest part was to get the "THIS_MODULE" reference right; the naive
    method of declaring a struct in the place of registration would not work
    for this reason.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: David S. Miller

    Arjan van de Ven
     

01 Nov, 2011

3 commits


27 Oct, 2011

1 commit

  • commit 66b13d99d96a (ipv4: tcp: fix TOS value in ACK messages sent from
    TIME_WAIT) fixed IPv4 only.

    This part is for the IPv6 side, adding a tclass param to ip6_xmit()

    We alias tw_tclass and tw_tos, if socket family is INET6.

    [ if sockets is ipv4-mapped, only IP_TOS socket option is used to fill
    TOS field, TCLASS is not taken into account ]

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet