17 Jul, 2013

1 commit


13 Jul, 2013

1 commit

  • Static routes in this case are non-expiring routes which did not get
    configured by autoconf or by icmpv6 redirects.

    To make sure we actually get an ecmp route while searching for the first
    one in this fib6_node's leafs, also make sure it matches the ecmp route
    assumptions.

    v2:
    a) Removed RTF_EXPIRES check in dst.from chain. The check of RTF_ADDRCONF
    already ensures that this route, even if added again without
    RTF_EXPIRES (in case of a RA announcement with infinite timeout),
    does not cause the rt6i_nsiblings logic to go wrong if a later RA
    updates the expiration time.

    v3:
    a) Allow RTF_EXPIRES routes to enter the ecmp route set. We have to do so,
    because a pmtu event could update the RTF_EXPIRES flag and we would
    not count this route, if another route joins this set. We now filter
    only for RTF_GATEWAY|RTF_ADDRCONF|RTF_DYNAMIC, which are flags that
    don't get changed after rt6_info construction.

    Cc: Nicolas Dichtel
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
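
A minimal sketch of the v3 filter described above, assuming a flags word per route. The DEMO_ names and their bit values are placeholders chosen for this illustration, not taken from the patch. The point is that only flags that stay fixed after construction take part in the test, so a later change of the expiry flag cannot move a route in or out of the ecmp set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for this sketch only (assumed, not from the patch). */
#define DEMO_RTF_GATEWAY  0x0002u
#define DEMO_RTF_DYNAMIC  0x0010u
#define DEMO_RTF_ADDRCONF 0x00040000u
#define DEMO_RTF_EXPIRES  0x00400000u

/* Hypothetical helper: a route may join an ecmp set only if it is a
 * gateway route that was neither configured by autoconf nor created by
 * a redirect. The expiry flag is deliberately not part of the mask. */
static bool demo_qualifies_for_ecmp(uint32_t flags)
{
	const uint32_t mask = DEMO_RTF_GATEWAY | DEMO_RTF_ADDRCONF |
			      DEMO_RTF_DYNAMIC;

	return (flags & mask) == DEMO_RTF_GATEWAY;
}
```

With this shape, a static gateway route qualifies whether or not it currently carries an expiry, while autoconf and redirect routes never do.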
     

12 Jul, 2013

1 commit

  • This is a follow-up patch to 3630d40067a21d4dfbadc6002bb469ce26ac5d52
    ("ipv6: rt6_check_neigh should successfully verify neigh if no NUD
    information are available").

    Since the removal of rt->n in rt6_info we can end up with a dst ==
    NULL in rt6_check_neigh. In case the kernel is not compiled with
    CONFIG_IPV6_ROUTER_PREF we should also select a route with unknown
    NUD state but we must not avoid doing round robin selection on routes
    with the same target. So introduce and pass down a boolean ``do_rr'' to
    indicate when we should update rt->rr_ptr. As soon as no route is valid
    we do backtracking and do a lookup on a higher level in the fib trie.

    v2:
    a) Improved rt6_check_neigh logic (no need to create neighbour there)
    and documented return values.

    v3:
    a) Introduce enum rt6_nud_state to get rid of the magic numbers
    (thanks to David Miller).
    b) Update and shorten commit message a bit to actually reflect
    the source.

    Reported-by: Pierre Emeriaud
    Cc: YOSHIFUJI Hideaki
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
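
The interplay of the neighbour check and the do_rr flag can be sketched as follows. This is a simplified illustration under stated assumptions: all demo_ names are hypothetical, the neighbour state is reduced to a single boolean, and the enum values are placeholders rather than the ones used in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes standing in for the enum the message mentions. */
enum demo_nud_state {
	DEMO_NUD_FAIL_HARD = -2,	/* neighbour known and not usable */
	DEMO_NUD_FAIL_SOFT = -1,	/* no neighbour information yet */
	DEMO_NUD_SUCCEED   = 1,		/* neighbour usable */
};

struct demo_neigh {
	bool valid;			/* stand-in for a valid NUD state */
};

/* router_pref models whether router preference support is compiled in. */
static enum demo_nud_state demo_check_neigh(const struct demo_neigh *neigh,
					    bool router_pref)
{
	if (neigh != NULL)
		return neigh->valid ? DEMO_NUD_SUCCEED : DEMO_NUD_FAIL_HARD;
	/* No information: assume reachable with router preference support,
	 * otherwise report a soft failure. */
	return router_pref ? DEMO_NUD_SUCCEED : DEMO_NUD_FAIL_SOFT;
}

/* Returns true if the route may be selected. Sets *do_rr when the caller
 * should advance its round-robin pointer to the next sibling. */
static bool demo_route_usable(const struct demo_neigh *neigh,
			      bool router_pref, bool *do_rr)
{
	enum demo_nud_state state = demo_check_neigh(neigh, router_pref);

	if (state == DEMO_NUD_FAIL_HARD)
		return false;
	if (state == DEMO_NUD_FAIL_SOFT)
		*do_rr = true;	/* selectable, but keep rotating */
	return true;
}
```

If every candidate returns false, the caller is expected to backtrack and repeat the lookup one level up, as the message describes.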
     

11 Jul, 2013

3 commits

  • We could end up expiring a route which is part of an ecmp route set. Doing
    so would invalidate the rt->rt6i_nsiblings calculations and could provoke
    the following panic:

    [ 80.144667] ------------[ cut here ]------------
    [ 80.145172] kernel BUG at net/ipv6/ip6_fib.c:733!
    [ 80.145172] invalid opcode: 0000 [#1] SMP
    [ 80.145172] Modules linked in: 8021q nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables
    +snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_page_alloc snd_timer virtio_balloon snd soundcore i2c_piix4 i2c_core virtio_net virtio_blk
    [ 80.145172] CPU: 1 PID: 786 Comm: ping6 Not tainted 3.10.0+ #118
    [ 80.145172] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    [ 80.145172] task: ffff880117fa0000 ti: ffff880118770000 task.ti: ffff880118770000
    [ 80.145172] RIP: 0010:[] [] fib6_add+0x75d/0x830
    [ 80.145172] RSP: 0018:ffff880118771798 EFLAGS: 00010202
    [ 80.145172] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011350e480
    [ 80.145172] RDX: ffff88011350e238 RSI: 0000000000000004 RDI: ffff88011350f738
    [ 80.145172] RBP: ffff880118771848 R08: ffff880117903280 R09: 0000000000000001
    [ 80.145172] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88011350f680
    [ 80.145172] R13: ffff880117903280 R14: ffff880118771890 R15: ffff88011350ef90
    [ 80.145172] FS: 00007f02b5127740(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000
    [ 80.145172] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    [ 80.145172] CR2: 00007f981322a000 CR3: 00000001181b1000 CR4: 00000000000006e0
    [ 80.145172] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 80.145172] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [ 80.145172] Stack:
    [ 80.145172] 0000000000000001 ffff880100000000 ffff880100000000 ffff880117903280
    [ 80.145172] 0000000000000000 ffff880119a4cf00 0000000000000400 00000000000007fa
    [ 80.145172] 0000000000000000 0000000000000000 0000000000000000 ffff88011350f680
    [ 80.145172] Call Trace:
    [ 80.145172] [] ? rt6_bind_peer+0x4b/0x90
    [ 80.145172] [] __ip6_ins_rt+0x45/0x70
    [ 80.145172] [] ip6_ins_rt+0x35/0x40
    [ 80.145172] [] ip6_pol_route.isra.44+0x3a4/0x4b0
    [ 80.145172] [] ip6_pol_route_output+0x2a/0x30
    [ 80.145172] [] fib6_rule_action+0xd7/0x210
    [ 80.145172] [] ? ip6_pol_route_input+0x30/0x30
    [ 80.145172] [] fib_rules_lookup+0xc6/0x140
    [ 80.145172] [] fib6_rule_lookup+0x44/0x80
    [ 80.145172] [] ? ip6_pol_route_input+0x30/0x30
    [ 80.145172] [] ip6_route_output+0x73/0xb0
    [ 80.145172] [] ip6_dst_lookup_tail+0x2c3/0x2e0
    [ 80.145172] [] ? list_del+0x11/0x40
    [ 80.145172] [] ? remove_wait_queue+0x3c/0x50
    [ 80.145172] [] ip6_dst_lookup_flow+0x3d/0xa0
    [ 80.145172] [] rawv6_sendmsg+0x267/0xc20
    [ 80.145172] [] inet_sendmsg+0x63/0xb0
    [ 80.145172] [] ? selinux_socket_sendmsg+0x23/0x30
    [ 80.145172] [] sock_sendmsg+0xa6/0xd0
    [ 80.145172] [] SYSC_sendto+0x128/0x180
    [ 80.145172] [] ? update_curr+0xec/0x170
    [ 80.145172] [] ? kvm_clock_get_cycles+0x9/0x10
    [ 80.145172] [] ? __getnstimeofday+0x3e/0xd0
    [ 80.145172] [] SyS_sendto+0xe/0x10
    [ 80.145172] [] system_call_fastpath+0x16/0x1b
    [ 80.145172] Code: fe ff ff 41 f6 45 2a 06 0f 85 ca fe ff ff 49 8b 7e 08 4c 89 ee e8 94 ef ff ff e9 b9 fe ff ff 48 8b 82 28 05 00 00 e9 01 ff ff ff 0b 49 8b 54 24 30 0d 00 00 40 00 89 83 14 01 00 00 48 89 53
    [ 80.145172] RIP [] fib6_add+0x75d/0x830
    [ 80.145172] RSP
    [ 80.387413] ---[ end trace 02f20b7a8b81ed95 ]---
    [ 80.390154] Kernel panic - not syncing: Fatal exception in interrupt

    Cc: Nicolas Dichtel
    Cc: YOSHIFUJI Hideaki
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • Rename ndo_ll_poll to ndo_busy_poll.
    Rename sk_mark_ll to sk_mark_napi_id.
    Rename skb_mark_ll to skb_mark_napi_id.
    Correct all users of these functions.
    Update comments and defines in include/net/busy_poll.h

    Signed-off-by: Eliezer Tamir
    Signed-off-by: David S. Miller

    Eliezer Tamir
     
  • Rename the file and correct all the places where it is included.

    Signed-off-by: Eliezer Tamir
    Signed-off-by: David S. Miller

    Eliezer Tamir
     

05 Jul, 2013

1 commit


04 Jul, 2013

4 commits

  • After the removal of rt->n we do not create a neighbour entry at route
    insertion time (rt6_bind_neighbour is gone). As long as no neighbour is
    created because of "useful traffic" we skip this routing entry because
    rt6_check_neigh cannot pick up a valid neighbour (neigh == NULL) and
    thus returns false.

    This change was introduced by commit
    887c95cc1da53f66a5890fdeab13414613010097 ("ipv6: Complete neighbour
    entry removal from dst_entry.")

    To quote RFC4191:
    "If the host has no information about the router's reachability, then
    the host assumes the router is reachable."

    and also:
    "A host MUST NOT probe a router's reachability in the absence of useful
    traffic that the host would have sent to the router if it were reachable."

    So, just assume the router is reachable and let rt6_probe do the
    rest. We don't need to create a neighbour at route insertion time.

    If we don't compile with CONFIG_IPV6_ROUTER_PREF (RFC4191 support)
    a neighbour is only valid if its nud_state is NUD_VALID. I did not find
    any references in the other RFCs saying that we should probe the router
    at route insertion time. So skip this route in that case.

    v2:
    a) use IS_ENABLED instead of #ifdefs (thanks to Sergei Shtylyov)

    Reported-by: Pierre Emeriaud
    Cc: YOSHIFUJI Hideaki
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • ping_v6_sendmsg currently returns 0 on success. It should return
    the number of bytes written instead.

    Signed-off-by: Lorenzo Colitti
    Signed-off-by: David S. Miller

    Lorenzo Colitti
     
  • Signed-off-by: Lorenzo Colitti
    Signed-off-by: David S. Miller

    Lorenzo Colitti
     
  • Conflicts:
    drivers/net/ethernet/freescale/fec_main.c
    drivers/net/ethernet/renesas/sh_eth.c
    net/ipv4/gre.c

    The GRE conflict is between a bug fix (kfree_skb --> kfree_skb_list)
    and the splitting of the gre.c code into separate files.

    The FEC conflict was two sets of changes adding ethtool support code
    in an "!CONFIG_M5272" CPP protected block.

    Finally the sh_eth.c conflict was between one commit adding bits set
    in the .eesr_err_check mask whilst another commit removed the
    .tx_error_check member and assignments.

    Signed-off-by: David S. Miller

    David S. Miller
     

03 Jul, 2013

2 commits

  • If the socket had an IPV6_MTU value set, ip6_append_data_mtu lost track
    of this when appending the second frame on a corked socket. This results
    in the following splat:

    [37598.993962] ------------[ cut here ]------------
    [37598.994008] kernel BUG at net/core/skbuff.c:2064!
    [37598.994008] invalid opcode: 0000 [#1] SMP
    [37598.994008] Modules linked in: tcp_lp uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev media vfat fat usb_storage fuse ebtable_nat xt_CHECKSUM bridge stp llc ipt_MASQUERADE nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat
    +nf_nat_ipv4 nf_nat iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i cxgb3 mdio libcxgbi ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi
    +scsi_transport_iscsi rfcomm bnep iTCO_wdt iTCO_vendor_support snd_hda_codec_conexant arc4 iwldvm mac80211 snd_hda_intel acpi_cpufreq mperf coretemp snd_hda_codec microcode cdc_wdm cdc_acm
    [37598.994008] snd_hwdep cdc_ether snd_seq snd_seq_device usbnet mii joydev btusb snd_pcm bluetooth i2c_i801 e1000e lpc_ich mfd_core ptp iwlwifi pps_core snd_page_alloc mei cfg80211 snd_timer thinkpad_acpi snd tpm_tis soundcore rfkill tpm tpm_bios vhost_net tun macvtap macvlan kvm_intel kvm uinput binfmt_misc
    +dm_crypt i915 i2c_algo_bit drm_kms_helper drm i2c_core wmi video
    [37598.994008] CPU 0
    [37598.994008] Pid: 27320, comm: t2 Not tainted 3.9.6-200.fc18.x86_64 #1 LENOVO 27744PG/27744PG
    [37598.994008] RIP: 0010:[] [] skb_copy_and_csum_bits+0x325/0x330
    [37598.994008] RSP: 0018:ffff88003670da18 EFLAGS: 00010202
    [37598.994008] RAX: ffff88018105c018 RBX: 0000000000000004 RCX: 00000000000006c0
    [37598.994008] RDX: ffff88018105a6c0 RSI: ffff88018105a000 RDI: ffff8801e1b0aa00
    [37598.994008] RBP: ffff88003670da78 R08: 0000000000000000 R09: ffff88018105c040
    [37598.994008] R10: ffff8801e1b0aa00 R11: 0000000000000000 R12: 000000000000fff8
    [37598.994008] R13: 00000000000004fc R14: 00000000ffff0504 R15: 0000000000000000
    [37598.994008] FS: 00007f28eea59740(0000) GS:ffff88023bc00000(0000) knlGS:0000000000000000
    [37598.994008] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    [37598.994008] CR2: 0000003d935789e0 CR3: 00000000365cb000 CR4: 00000000000407f0
    [37598.994008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [37598.994008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    [37598.994008] Process t2 (pid: 27320, threadinfo ffff88003670c000, task ffff88022c162ee0)
    [37598.994008] Stack:
    [37598.994008] ffff88022e098a00 ffff88020f973fc0 0000000000000008 00000000000004c8
    [37598.994008] ffff88020f973fc0 00000000000004c4 ffff88003670da78 ffff8801e1b0a200
    [37598.994008] 0000000000000018 00000000000004c8 ffff88020f973fc0 00000000000004c4
    [37598.994008] Call Trace:
    [37598.994008] [] ip6_append_data+0xccf/0xfe0
    [37598.994008] [] ? ip_copy_metadata+0x1a0/0x1a0
    [37598.994008] [] ? _raw_spin_lock_bh+0x16/0x40
    [37598.994008] [] udpv6_sendmsg+0x1ed/0xc10
    [37598.994008] [] ? sock_has_perm+0x75/0x90
    [37598.994008] [] inet_sendmsg+0x63/0xb0
    [37598.994008] [] ? selinux_socket_sendmsg+0x23/0x30
    [37598.994008] [] sock_sendmsg+0xb0/0xe0
    [37598.994008] [] ? __switch_to+0x181/0x4a0
    [37598.994008] [] sys_sendto+0x12d/0x180
    [37598.994008] [] ? __audit_syscall_entry+0x94/0xf0
    [37598.994008] [] ? syscall_trace_enter+0x231/0x240
    [37598.994008] [] tracesys+0xdd/0xe2
    [37598.994008] Code: fe 07 00 00 48 c7 c7 04 28 a6 81 89 45 a0 4c 89 4d b8 44 89 5d a8 e8 1b ac b1 ff 44 8b 5d a8 4c 8b 4d b8 8b 45 a0 e9 cf fe ff ff 0b 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 48
    [37598.994008] RIP [] skb_copy_and_csum_bits+0x325/0x330
    [37598.994008] RSP
    [37599.007323] ---[ end trace d69f6a17f8ac8eee ]---

    While there, also check if path mtu discovery is activated for this
    socket. The logic was adapted from ip6_append_data when first writing
    on the corked socket.

    This bug was introduced with commit
    0c1833797a5a6ec23ea9261d979aa18078720b74 ("ipv6: fix incorrect ipsec
    fragment").

    v2:
    a) Replace IPV6_PMTU_DISC_DO with IPV6_PMTUDISC_PROBE.
    b) Don't pass ipv6_pinfo to ip6_append_data_mtu (suggestion by Gao
    feng, thanks!).
    c) Change mtu to unsigned int, else we get a warning about
    non-matching types because of the min()-macro type-check.

    Acked-by: Gao feng
    Cc: YOSHIFUJI Hideaki
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • We accidentally call down to ip6_push_pending_frames when uncorking
    pending AF_INET data on an ipv6 socket. This results in the following
    splat (from Dave Jones):

    skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:
    ------------[ cut here ]------------
    kernel BUG at net/core/skbuff.c:126!
    invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
    Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth
    +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c
    CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37
    task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000
    RIP: 0010:[] [] skb_panic+0x63/0x65
    RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282
    RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006
    RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520
    RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800
    R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800
    FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
    Stack:
    ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4
    ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6
    ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0
    Call Trace:
    [] skb_push+0x3a/0x40
    [] ip6_push_pending_frames+0x1f6/0x4d0
    [] ? mark_held_locks+0xbb/0x140
    [] udp_v6_push_pending_frames+0x2b9/0x3d0
    [] ? udplite_getfrag+0x20/0x20
    [] udp_lib_setsockopt+0x1aa/0x1f0
    [] ? fget_light+0x387/0x4f0
    [] udpv6_setsockopt+0x34/0x40
    [] sock_common_setsockopt+0x14/0x20
    [] SyS_setsockopt+0x71/0xd0
    [] tracesys+0xdd/0xe2
    Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55
    RIP [] skb_panic+0x63/0x65
    RSP

    This patch adds a check if the pending data is of address family AF_INET
    and directly calls udp_push_pending_frames from udp_v6_push_pending_frames
    if that is the case.

    This bug was found by Dave Jones with trinity.

    (Also move the initialization of fl6 below the AF_INET check, even if
    not strictly necessary.)

    Cc: Dave Jones
    Cc: YOSHIFUJI Hideaki
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
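
The AF_INET check in the second commit above can be pictured with a small sketch. It assumes the socket records the address family of its corked data in a field called pending. The struct, the enum and the function name are hypothetical stand-ins, and the function only reports which path would be taken instead of sending anything.

```c
#include <sys/socket.h>	/* AF_INET, AF_INET6 */

/* Hypothetical stand-in for the per-socket cork state. */
struct demo_udp_sock {
	int pending;	/* family of the corked data, 0 if nothing is corked */
};

enum demo_path {
	DEMO_PATH_NONE,
	DEMO_PATH_V4,
	DEMO_PATH_V6,
};

/* Decide which push routine a v6 UDP socket should use when uncorking. */
static enum demo_path demo_push_pending_frames(const struct demo_udp_sock *up)
{
	/* Corked IPv4 data must be handed to the IPv4 routine. Taking the
	 * IPv6 path here is what led to the 40-byte push underrunning the
	 * buffer in the splat. */
	if (up->pending == AF_INET)
		return DEMO_PATH_V4;
	if (up->pending == AF_INET6)
		return DEMO_PATH_V6;
	return DEMO_PATH_NONE;
}
```

Doing the family test first also means no IPv6 flow state needs to be touched on the IPv4 path, which matches the note about moving the fl6 initialization.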
     

02 Jul, 2013

3 commits

  • dingtianhong reported the following deadlock detected by lockdep:

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    3.4.24.05-0.1-default #1 Not tainted
    -------------------------------------------------------
    ksoftirqd/0/3 is trying to acquire lock:
    (&ndev->lock){+.+...}, at: [] ipv6_get_lladdr+0x74/0x120

    but task is already holding lock:
    (&mc->mca_lock){+.+...}, at: [] mld_send_report+0x40/0x150

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&mc->mca_lock){+.+...}:
    [] validate_chain+0x637/0x730
    [] __lock_acquire+0x2f7/0x500
    [] lock_acquire+0x114/0x150
    [] rt_spin_lock+0x4a/0x60
    [] igmp6_group_added+0x3b/0x120
    [] ipv6_mc_up+0x38/0x60
    [] ipv6_find_idev+0x3d/0x80
    [] addrconf_notify+0x3d5/0x4b0
    [] notifier_call_chain+0x3f/0x80
    [] raw_notifier_call_chain+0x11/0x20
    [] call_netdevice_notifiers+0x32/0x60
    [] __dev_notify_flags+0x34/0x80
    [] dev_change_flags+0x40/0x70
    [] do_setlink+0x237/0x8a0
    [] rtnl_newlink+0x3ec/0x600
    [] rtnetlink_rcv_msg+0x160/0x310
    [] netlink_rcv_skb+0x89/0xb0
    [] rtnetlink_rcv+0x27/0x40
    [] netlink_unicast+0x140/0x180
    [] netlink_sendmsg+0x33e/0x380
    [] sock_sendmsg+0x112/0x130
    [] __sys_sendmsg+0x44e/0x460
    [] sys_sendmsg+0x44/0x70
    [] system_call_fastpath+0x16/0x1b

    -> #0 (&ndev->lock){+.+...}:
    [] check_prev_add+0x3de/0x440
    [] validate_chain+0x637/0x730
    [] __lock_acquire+0x2f7/0x500
    [] lock_acquire+0x114/0x150
    [] rt_read_lock+0x42/0x60
    [] ipv6_get_lladdr+0x74/0x120
    [] mld_newpack+0xb6/0x160
    [] add_grhead+0xab/0xc0
    [] add_grec+0x3ab/0x460
    [] mld_send_report+0x5a/0x150
    [] igmp6_timer_handler+0x4e/0xb0
    [] call_timer_fn+0xca/0x1d0
    [] run_timer_softirq+0x1df/0x2e0
    [] handle_pending_softirqs+0xf7/0x1f0
    [] __do_softirq_common+0x7b/0xf0
    [] __thread_do_softirq+0x1af/0x210
    [] run_ksoftirqd+0xe1/0x1f0
    [] kthread+0xae/0xc0
    [] kernel_thread_helper+0x4/0x10

    Actually we can just hold idev->lock before taking pmc->mca_lock,
    and avoid taking idev->lock again when iterating idev->addr_list,
    since the upper callers of mld_newpack() already take
    read_lock_bh(&idev->lock).

    Reported-by: dingtianhong
    Cc: dingtianhong
    Cc: Hideaki YOSHIFUJI
    Cc: David S. Miller
    Cc: Hannes Frederic Sowa
    Tested-by: Ding Tianhong
    Tested-by: Chen Weilong
    Signed-off-by: Cong Wang
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Amerigo Wang
     
  • There is no reason to skip the ECMP lookup when oif is specified, but doing
    so means we have to check the oif given by the user when selecting another
    route. When the new route does not match the oif requirement, we simply
    keep the initial one.

    Spotted-by: dingzhi
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • Because of commit 218774dc341f219bfcf940304a081b121a0e8099 ("ipv6: add
    anti-spoofing checks for 6to4 and 6rd") the sit driver dropped packets
    for 2002::/16 destinations and sources even when configured to work as a
    tunnel with fixed endpoint. We may only apply the 6rd/6to4 anti-spoofing
    checks if the device is not in pointopoint mode.

    This was an oversight from me in the above commit, sorry. Thanks to
    Roman Mamedov for reporting this!

    Reported-by: Roman Mamedov
    Cc: David Miller
    Cc: YOSHIFUJI Hideaki
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
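
A reduced sketch of the sit fix above, assuming the usual 6to4 layout in which a 2002::/16 address carries an IPv4 address in bytes 2 to 5. It covers only the source check for 6to4; the 6rd case and the destination check are left out, and every demo_ name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tunnel description for this sketch. */
struct demo_sit_tunnel {
	bool pointopoint;	/* true when a fixed remote endpoint is set */
};

/* True if the 16-byte IPv6 address lies in 2002::/16. */
static bool demo_is_6to4(const uint8_t addr6[16])
{
	return addr6[0] == 0x20 && addr6[1] == 0x02;
}

/* Returns true when the packet should be dropped as spoofed: the inner
 * source is a 6to4 address whose embedded IPv4 address differs from the
 * outer IPv4 source. Tunnels with a fixed endpoint skip the check. */
static bool demo_packet_is_spoofed(const struct demo_sit_tunnel *t,
				   const uint8_t outer_src4[4],
				   const uint8_t inner_src6[16])
{
	if (t->pointopoint)
		return false;
	if (!demo_is_6to4(inner_src6))
		return false;
	return memcmp(&inner_src6[2], outer_src4, 4) != 0;
}
```

The early return for pointopoint tunnels is the part that corresponds to the fix: a tunnel with a fixed endpoint may legitimately carry 2002::/16 traffic that does not embed the endpoint's address.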
     

29 Jun, 2013

2 commits

  • RFC3590/RFC3810 specify that we should resend MLD reports as soon as a
    valid link-local address is available.

    We now use the valid_ll_addr_cnt to check if it is necessary to resend
    a new report.

    Changes since Flavio Leitner's version:
    a) adapt for valid_ll_addr_cnt
    b) resend first reports directly in the path and just arm the timer for
    mc_qrv-1 resends.

    Reported-by: Flavio Leitner
    Cc: Hideaki YOSHIFUJI
    Cc: David Stevens
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: Flavio Leitner
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • To reduce the number of unnecessary router solicitations, MLDv2 and IGMPv3
    messages we need to track the number of valid (as in non-optimistic,
    no-dad-failed and non-tentative) link-local addresses. Therefore, this
    patch implements a valid_ll_addr_cnt in struct inet6_dev.

    We now only emit router solicitations if the first link-local address
    finishes duplicate address detection.

    The changes for MLDv2 and IGMPv3 are in a follow-up patch.

    While there, also simplify one if statement (one minor nit I made in one
    of my previous patches):

    if (!...)
            do();
    else
            return;

    becomes

    if (...)
            return;
    do();

    Cc: Flavio Leitner
    Cc: YOSHIFUJI Hideaki
    Cc: David Stevens
    Suggested-by: David Stevens
    Signed-off-by: Hannes Frederic Sowa
    Acked-by: Flavio Leitner
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
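
The counting idea behind valid_ll_addr_cnt can be sketched like this. The struct and both helpers are hypothetical and ignore locking. They only show that the transition from zero to one valid link-local address is the single event that should trigger router solicitations.

```c
#include <stdbool.h>

/* Hypothetical per-interface state for this sketch. */
struct demo_inet6_dev {
	int valid_ll_addr_cnt;	/* link-local addresses that completed DAD */
};

/* Call when a link-local address finishes duplicate address detection.
 * Returns true only for the first valid address, i.e. when router
 * solicitations should be started. */
static bool demo_ll_addr_became_valid(struct demo_inet6_dev *idev)
{
	idev->valid_ll_addr_cnt++;
	return idev->valid_ll_addr_cnt == 1;
}

/* Call when a previously valid link-local address goes away. */
static void demo_ll_addr_removed(struct demo_inet6_dev *idev)
{
	if (idev->valid_ll_addr_cnt > 0)
		idev->valid_ll_addr_cnt--;
}
```

A second or third link-local address then leaves the solicitation state alone, which is the reduction in traffic the message is after.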
     

28 Jun, 2013

1 commit

  • This patch allows switching the netns when a packet is encapsulated or
    decapsulated. In other words, the encapsulated packet is received in one
    netns, where the lookup is done to find the tunnel. Once the tunnel is
    found, the packet is decapsulated and injected into the corresponding
    interface, which resides in another netns.

    When one of the two netns is removed, the tunnel is destroyed.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

27 Jun, 2013

2 commits

  • When a new tokenized address gets installed we send out just one
    router solicitation. We should send out `rtr_solicits' in case one router
    advertisement got lost.

    So, rearm the timer as we do in addrconf_dad_complete.

    Cc: Daniel Borkmann
    Signed-off-by: Hannes Frederic Sowa
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • It's possible to use AF_INET6 sockets and to connect to an IPv4
    destination. After this, socket dst cache is a pointer to a rtable,
    not rt6_info.

    ip6_sk_dst_check() should check the socket dst cache is IPv6, or else
    various corruptions/crashes can happen.

    Dave Jones can reproduce immediate crash with
    trinity -q -l off -n -c sendmsg -c connect

    With help from Hannes Frederic Sowa

    Reported-by: Dave Jones
    Reported-by: Hannes Frederic Sowa
    Signed-off-by: Eric Dumazet
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Eric Dumazet
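
A minimal sketch of the family check described in the commit above, assuming the cached entry exposes its address family. The demo_ struct and function are hypothetical, and reference counting is left out entirely.

```c
#include <stddef.h>
#include <sys/socket.h>	/* AF_INET6 */

/* Hypothetical cached route entry; only the family matters here. */
struct demo_dst {
	int family;
};

/* Returns the cached entry if it can be used as an IPv6 route, or NULL
 * so that the caller performs a fresh lookup. */
static struct demo_dst *demo_sk_dst_check(struct demo_dst *cached)
{
	if (cached == NULL)
		return NULL;
	/* An AF_INET6 socket connected to an IPv4 destination caches an
	 * IPv4 entry, which must not be interpreted as IPv6 route data. */
	if (cached->family != AF_INET6)
		return NULL;
	return cached;
}
```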
     

26 Jun, 2013

5 commits

  • If the tokenized ip address is re-set on an interface we depend on the
    arrival of a new router advertisement to call addrconf_verify to clean
    up the old address (whose valid_lft is now set to 0). Old addresses can
    linger around for a longer time if e.g. the source of router advertisements
    vanishes.

    So, call addrconf_verify immediately after setting the new tokenized
    address to get rid of the old tokenized addresses.

    Cc: Daniel Borkmann
    Signed-off-by: Hannes Frederic Sowa
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • We should check the return value of ipv6_get_lladdr in inet6_set_iftoken.

    One situation that could leave ll_addr unassigned is when the user
    removed her link-local address but a global scoped address was already
    set. In this case the interface would still be IF_READY and not dead, and
    the RS source address would be some value from the stack.

    v2: Daniel Borkmann noted a small indent inconsistency; no semantic
    changes.

    Cc: Daniel Borkmann
    Acked-by: Daniel Borkmann
    Reviewed-by: Flavio Leitner
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • The reason behind this change is that as soon as we delete
    the last ipv6 address of an interface we also lose the
    /proc/sys/net/ipv6/conf/ directory. This seems to be a
    usability problem to me.

    I don't see any reason why we should shut down ipv6 on that interface in
    such cases.

    Cc: YOSHIFUJI Hideaki
    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • This patch splits the timers for duplicate address detection and router
    solicitations apart. The router solicitations timer goes into inet6_dev
    and the dad timer stays in inet6_ifaddr.

    The reason behind this patch is to reduce the number of unneeded router
    solicitations sent out by the host if additional link-local addresses
    are created. Currently we send out RS for every link-local address on
    an interface.

    If the RS timer fires we pick a source address with ipv6_get_lladdr. This
    change could hurt people adding additional link-local addresses and
    specifying these addresses in the radvd clients section because we
    no longer guarantee that we use every ll address as source address in
    router solicitations.

    Cc: Flavio Leitner
    Cc: Hideaki YOSHIFUJI
    Cc: David Stevens
    Signed-off-by: Hannes Frederic Sowa
    Reviewed-by: Flavio Leitner
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • Router Alert option is marked in skb.
    Previously, IP6CB(skb)->ra was set to positive value for such packets.
    Since commit dd3332bf ("ipv6: Store Router Alert option in IP6CB
    directly."), IP6SKB_ROUTERALERT is set in IP6CB(skb)->flags, and
    the value of Router Alert option (in network byte order) is set
    to IP6CB(skb)->ra for such packets.

    Multicast forwarding path uses that flag and value, but unicast
    forwarding path does not use the flag and misuses IP6CB(skb)->ra
    value.

    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki / 吉藤英明
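
One way to see why testing the option value instead of the flag is a misuse: a Router Alert value of 0 (used for MLD) is a legitimate value, so a nonzero test misses it. The sketch below assumes a control block with a flags word and the option value; the demo_ names and the flag bit are placeholders for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SKB_ROUTERALERT 0x8u	/* placeholder bit for this sketch */

/* Hypothetical per-packet control block. */
struct demo_cb {
	uint16_t flags;
	uint16_t ra;	/* Router Alert value as carried in the option */
};

/* Misuse: treats an option value of zero as "no Router Alert". */
static bool demo_has_ra_by_value(const struct demo_cb *cb)
{
	return cb->ra != 0;
}

/* Intended: the flag records whether the option was present at all. */
static bool demo_has_ra_by_flag(const struct demo_cb *cb)
{
	return (cb->flags & DEMO_SKB_ROUTERALERT) != 0;
}
```

For a packet that carries the option with value 0, only the flag-based test reports the option as present.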
     

25 Jun, 2013

2 commits

  • commit f88c91ddba95 ("ipv6: statically link
    register_inet6addr_notifier()") added the following sparse warnings:

    net/ipv6/addrconf_core.c:83:5: warning: symbol
    'register_inet6addr_notifier' was not declared. Should it be static?
    net/ipv6/addrconf_core.c:89:5: warning: symbol
    'unregister_inet6addr_notifier' was not declared. Should it be static?
    net/ipv6/addrconf_core.c:95:5: warning: symbol
    'inet6addr_notifier_call_chain' was not declared. Should it be static?

    Signed-off-by: Eric Dumazet
    Cc: Cong Wang
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Pablo Neira Ayuso says:

    ====================
    The following patchset contains five fixes for Netfilter/IPVS, they are:

    * A skb leak fix in fragmentation handling in case that helpers are in place,
    it occurs since the IPV6 NAT infrastructure, from Phil Oester.

    * Fix SCTP port mangling in ICMP packets for IPVS, from Julian Anastasov.

    * Fix event delivery in ctnetlink regarding the new connlabel infrastructure,
    from Florian Westphal.

    * Fix mangling in the SIP NAT helper, from Balazs Peter Odor.

    * Fix crash in ipt_ULOG introduced while adding netnamespace support,
    from Gao Feng.

    I'll take care of passing several of these patches to -stable once they hit
    Linus' tree.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

24 Jun, 2013

1 commit


20 Jun, 2013

6 commits

  • In commit 4cdd3408 ("netfilter: nf_conntrack_ipv6: improve fragmentation
    handling"), an sk_buff leak was introduced when dealing with reassembled
    packets by grabbing a reference to the original skb instead of the
    reassembled skb. At this point, the leak only impacted conntracks with an
    associated helper.

    In commit 58a317f1 ("netfilter: ipv6: add IPv6 NAT support"), the bug was
    expanded to include all reassembled packets with unconfirmed conntracks.

    Fix this by grabbing a reference to the proper reassembled skb. This
    closes netfilter bugzilla #823.

    Signed-off-by: Phil Oester
    Signed-off-by: Pablo Neira Ayuso

    Phil Oester
     
  • If we disable all of the net interfaces, and enable a
    non-lo interface before the lo interface, we have already allocated
    the addrconf dst in ipv6_add_addr. So we shouldn't allocate
    it again when we enable the lo interface.

    Otherwise the message below will be triggered.
    unregister_netdevice: waiting for sit1 to become free. Usage count = 1

    This problem is introduced by commit 25fb6ca4ed9cad72f14f61629b68dc03c0d9713f
    "net IPv6 : Fix broken IPv6 routing table after loopback down-up"

    Signed-off-by: Gao feng
    Signed-off-by: David S. Miller

    Gao feng
     
  • The use of this attribute has been added in 32b8a8e59c9c (sit: add IPv4 over
    IPv4 support). It is optional, by default proto is IPPROTO_IPV6.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • Process skb tunnel header before sending packet to protocol handler.
    This allows code sharing between gre and ovs gre modules.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Refactor various ip tunnels xmit functions and extend iptunnel_xmit()
    so that there is more code sharing.

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Conflicts:
    drivers/net/wireless/ath/ath9k/Kconfig
    drivers/net/xen-netback/netback.c
    net/batman-adv/bat_iv_ogm.c
    net/wireless/nl80211.c

    The ath9k Kconfig conflict was a change of a Kconfig option name right
    next to the deletion of another option.

    The xen-netback conflict was overlapping changes involving the
    handling of the notify list in xen_netbk_rx_action().

    Batman conflict resolution provided by Antonio Quartulli, basically
    keep everything in both conflict hunks.

    The nl80211 conflict is a little more involved. In 'net' we added a
    dynamic memory allocation to nl80211_dump_wiphy() to fix a race that
    Linus reported. Meanwhile in 'net-next' the handlers were converted
    to use pre and post doit handlers which use a flag to determine
    whether to hold the RTNL mutex around the operation.

    However, the dump handlers do not use this logic. Instead they have
    to explicitly do the locking. There were apparent bugs in the
    conversion of nl80211_dump_wiphy() in that we were not dropping the
    RTNL mutex in all the return paths, and it seems we very much should
    be doing so. So I fixed that whilst handling the overlapping changes.

    To simplify the initial returns, I take the RTNL mutex after we try
    to allocate 'tb'.

    Signed-off-by: David S. Miller

    David S. Miller
     

18 Jun, 2013

1 commit


13 Jun, 2013

2 commits

  • Reduce the uses of this unnecessary typedef.

    Done via perl script:

    $ git grep --name-only -w ctl_table net | \
    xargs perl -p -i -e '\
    sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
    s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'

    Reflow the modified lines that now exceed 80 columns.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • net/ipv4/ping.c:286:5: sparse: symbol 'ping_check_bind_addr' was not declared. Should it be static?
    net/ipv4/ping.c:355:6: sparse: symbol 'ping_set_saddr' was not declared. Should it be static?
    net/ipv4/ping.c:370:6: sparse: symbol 'ping_clear_saddr' was not declared. Should it be static?

    net/ipv6/ping.c:60:5: sparse: symbol 'dummy_ipv6_recv_error' was not declared. Should it be static?
    net/ipv6/ping.c:64:5: sparse: symbol 'dummy_ip6_datagram_recv_ctl' was not declared. Should it be static?
    net/ipv6/ping.c:69:5: sparse: symbol 'dummy_icmpv6_err_convert' was not declared. Should it be static?
    net/ipv6/ping.c:73:6: sparse: symbol 'dummy_ipv6_icmp_error' was not declared. Should it be static?
    net/ipv6/ping.c:75:5: sparse: symbol 'dummy_ipv6_chk_addr' was not declared. Should it be static?
    net/ipv6/ping.c:201:5: sparse: symbol 'ping_v6_seq_show' was not declared. Should it be static?

    Signed-off-by: Fengguang Wu
    Signed-off-by: David S. Miller

    Wu Fengguang
     

11 Jun, 2013

2 commits

  • Adds low latency socket poll support for TCP.
    In tcp_v[46]_rcv() add a call to sk_mark_ll() to copy the napi_id
    from the skb to the sk.
    In tcp_recvmsg(), when there is no data in the socket we busy-poll.
    This is a good example of how to add busy-poll support to more protocols.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Signed-off-by: Eliezer Tamir
    Acked-by: Eric Dumazet
    Tested-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Eliezer Tamir
     
  • Add support for busy-polling on UDP sockets.
    In __udp[46]_lib_rcv add a call to sk_mark_ll() to copy the napi_id
    from the skb into the sk.
    This is done at the earliest possible moment, right after we identify
    which socket this skb is for.
    In __skb_recv_datagram, when there is no data and the user
    tries to read, we busy poll.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jesse Brandeburg
    Signed-off-by: Eliezer Tamir
    Acked-by: Eric Dumazet
    Tested-by: Willem de Bruijn
    Signed-off-by: David S. Miller

    Eliezer Tamir
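
The marking step shared by the two busy-poll commits above can be sketched as follows. The structs and helpers are hypothetical, and treating an id of 0 as "not yet known" is an assumption of this sketch. The real receive path and the polling loop are not modelled.

```c
#include <stdbool.h>

/* Hypothetical reduced packet and socket types. */
struct demo_skb {
	unsigned int napi_id;	/* id of the context that received the packet */
};

struct demo_sock {
	unsigned int napi_id;	/* id of the context to poll for this socket */
};

/* Copy the id from the packet to the socket, right after the socket
 * for this packet has been identified. */
static void demo_sk_mark_napi_id(struct demo_sock *sk,
				 const struct demo_skb *skb)
{
	sk->napi_id = skb->napi_id;
}

/* A reader may busy-poll only when there is no queued data and the
 * socket already knows which context to poll (0 means unknown here). */
static bool demo_can_busy_poll(const struct demo_sock *sk, bool rx_queue_empty)
{
	return rx_queue_empty && sk->napi_id != 0;
}
```

A protocol that wants busy-poll support then needs two hooks: the marking call on receive and the polling check on an empty read.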