20 Apr, 2018

1 commit


21 Mar, 2018

1 commit

  • [Patch] Pulling the following commits and some general changes
    from custom v3.10 kernel for supporting qcacld2.0 on kernel v4.9.11.
    1. cfg80211: Using new wiphy flag WIPHY_FLAG_DFS_OFFLOAD
    When flag WIPHY_FLAG_DFS_OFFLOAD is defined, the driver would handle
    all the DFS related operations. Therefore the kernel needs to ignore
    the DFS state that it uses to block the userspace calls to the driver
    through cfg80211 APIs. Also it should treat the userspace calls to
    start radar detection as a no-op.

    Please note that the changes in util.c are not picked up explicitly.
    Kernel v4.9.11 uses wrapper cfg80211_get_chans_dfs_required which takes
    care of this change.

    Change-Id: I9dd2076945581ca67e54dfc96dd3dbc526c6f0a2
    IRs-Fixed: 202686

    2. New db.txt from git/sforshee/wireless-regdb.git
    CONFIG_CFG80211_INTERNAL_REGDB is enabled in build. This causes
    kernel warn messages as db.txt is empty. A new db.txt is added
    from:
    git://git.kernel.org/pub/scm/linux/kernel/git/sforshee/wireless-regdb.git

    IRs-Fixed: 202686

    3. Picked up the declaration and definition of the function
    cfg80211_is_gratuitous_arp_unsolicited_na

    Change-Id: I1e4083a2327c121073226aa6b75bb6b5b97cec00
    CRs-fixed: 1079453

    Signed-off-by: Nakul Kachhwaha
    Signed-off-by: Fugang Duan

    Nakul Kachhwaha
     

18 Mar, 2018

10 commits

  • commit ae0ac0ed6fcf5af3be0f63eb935f483f44a402d2 upstream.

    Instead of allocating each xt_counter individually, allocate 4k chunks
    and then use these for counter allocation requests.

    This should speed up rule evaluation by increasing data locality,
    also speeds up ruleset loading because we reduce calls to the percpu
    allocator.

    As Eric points out we can't use PAGE_SIZE, page_allocator would fail on
    arches with 64k page size.

    Suggested-by: Eric Dumazet
    Signed-off-by: Florian Westphal
    Acked-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
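
    The chunked allocation described in this commit can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct counter`, `counter_alloc_state` and `counter_alloc()` are hypothetical names, a 16-byte counter is assumed, and plain `aligned_alloc()` stands in for the kernel's percpu allocator. Freeing the chunks is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BLOCK_SIZE 4096u  /* fixed chunk size, deliberately not PAGE_SIZE */

/* Hypothetical stand-in for a packet/byte counter pair. */
struct counter {
    uint64_t pcnt;
    uint64_t bcnt;
};

/* Tracks the current chunk and how much of it has been handed out. */
struct counter_alloc_state {
    unsigned char *mem;  /* current chunk, NULL if none is open */
    size_t off;          /* bytes already used in the current chunk */
};

/* Return the next counter slot, allocating a new chunk only when needed. */
static struct counter *counter_alloc(struct counter_alloc_state *st)
{
    struct counter *c;

    assert(st != NULL);

    if (st->mem == NULL) {
        st->mem = aligned_alloc(BLOCK_SIZE, BLOCK_SIZE);
        if (st->mem == NULL)
            return NULL;
        st->off = 0;
    }

    c = (struct counter *)(st->mem + st->off);
    c->pcnt = 0;
    c->bcnt = 0;

    st->off += sizeof(*c);
    if (st->off > BLOCK_SIZE - sizeof(*c))
        st->mem = NULL;  /* chunk exhausted; the next call opens a new one */

    return c;
}
```

    Consecutive requests return adjacent slots of the same chunk, which is where the data-locality gain comes from, and the allocator is only called once per 256 counters in this sketch.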
     
  • commit f28e15bacedd444608e25421c72eb2cf4527c9ca upstream.

    Keeps some noise away from a followup patch.

    Signed-off-by: Florian Westphal
    Acked-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
     
  • commit 4d31eef5176df06f218201bc9c0ce40babb41660 upstream.

    On SMP we overload the packet counter (unsigned long) to contain
    percpu offset. Hide this from callers and pass xt_counters address
    instead.

    Preparation patch to allocate the percpu counters in page-sized batch
    chunks.

    Signed-off-by: Florian Westphal
    Acked-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
     
  • commit b078556aecd791b0e5cb3a59f4c3a14273b52121 upstream.

    l4proto->manip_pkt() can cause reallocation of skb head so pointer
    to the ipv6 header must be reloaded.

    Reported-and-tested-by:
    Fixes: 58a317f1061c89 ("netfilter: ipv6: add IPv6 NAT support")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
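
    The pointer-reload rule from this commit can be shown with an ordinary heap buffer. This is a userspace analogy, not the netfilter code: `struct buf`, `buf_grow()` and `rewrite_header()` are hypothetical, and `realloc()` plays the role of the skb head reallocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer; 'head' may move whenever it is grown. */
struct buf {
    unsigned char *head;
    size_t len;
};

/* Grow the buffer; like a head reallocation, this may move 'head'. */
static int buf_grow(struct buf *b, size_t extra)
{
    unsigned char *p = realloc(b->head, b->len + extra);

    if (p == NULL)
        return -1;
    memset(p + b->len, 0, extra);
    b->head = p;
    b->len += extra;
    return 0;
}

/*
 * Overwrite one header byte after a step that may move the buffer.
 * Returns the previous byte value, or -1 on allocation failure.
 */
static int rewrite_header(struct buf *b, size_t hdr_off, unsigned char val)
{
    unsigned char *hdr;
    unsigned char old;

    assert(b != NULL && hdr_off < b->len);

    hdr = b->head + hdr_off;   /* valid only until the buffer is grown */
    old = hdr[0];

    if (buf_grow(b, 64) != 0)
        return -1;

    hdr = b->head + hdr_off;   /* reload: the old pointer may now dangle */
    hdr[0] = val;
    return old;
}
```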
     
  • commit c4585a2823edf4d1326da44d1524ecbfda26bb37 upstream.

    ebt_among is special, it has a dynamic match size and is exempt
    from the central size checks.

    Therefore it must check that the size of the match structure
    provided from userspace is sane by making sure em->match_size
    is at least the minimum size of the expected structure.

    The module has such a check, but it's only done after accessing
    a structure that might be out of bounds.

    tested with: ebtables -A INPUT ... \
    --among-dst fe:fe:fe:fe:fe:fe
    --among-dst fe:fe:fe:fe:fe:fe --among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fb,fe:fe:fe:fe:fc:fd,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe
    --among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fa,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe,fe:fe:fe:fe:fe:fe

    Reported-by:
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
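
    The ordering problem in this commit (validate the size before reading the structure) can be sketched as below. The layout is invented for illustration: `struct among_hdr`, `struct among_entry` and `among_check()` are hypothetical and do not match the real ebt_among structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variable-length match blob: a header followed by entries. */
struct among_hdr {
    uint32_t n_entries;
};

struct among_entry {
    uint8_t mac[6];
    uint8_t pad[2];
};

/* Return 0 if the blob is well-formed, -1 otherwise. */
static int among_check(const void *blob, size_t blob_size)
{
    struct among_hdr hdr;
    size_t expected;

    assert(blob != NULL);

    /* Check the minimum size first, before touching any header field. */
    if (blob_size < sizeof(hdr))
        return -1;

    memcpy(&hdr, blob, sizeof(hdr));

    /* Only now is it safe to size the blob from its own contents. */
    if (hdr.n_entries > (SIZE_MAX - sizeof(hdr)) / sizeof(struct among_entry))
        return -1;
    expected = sizeof(hdr) + (size_t)hdr.n_entries * sizeof(struct among_entry);

    return blob_size == expected ? 0 : -1;
}
```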
     
  • commit b71812168571fa55e44cdd0254471331b9c4c4c6 upstream.

    We need to make sure the offsets are not out of range of the
    total size.
    Also check that they are in ascending order.

    The WARN_ON triggered by syzkaller (it sets panic_on_warn) is
    changed to also bail out, no point in continuing parsing.

    Briefly tested with simple ruleset of
    -A INPUT --limit 1/s --log
    plus jump to custom chains using 32bit ebtables binary.

    Reported-by:
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
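
    A minimal sketch of the two checks named in this commit, range and ascending order. `offsets_ok()` is a hypothetical helper; treating equal neighbouring offsets as acceptable and allowing an offset equal to the total are assumptions of this sketch, not statements about the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if every offset lies within 'total' and the offsets never
 * decrease, 0 otherwise. The caller is expected to stop parsing on 0.
 */
static int offsets_ok(const unsigned int *offsets, size_t n, unsigned int total)
{
    assert(offsets != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (offsets[i] > total)
            return 0;                       /* out of range */
        if (i > 0 && offsets[i - 1] > offsets[i])
            return 0;                       /* not ascending */
    }
    return 1;
}
```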
     
  • commit cfc2c740533368b96e2be5e0a4e8c3cace7d9814 upstream.

    We had one report from syzkaller [1]

    First issue is that INIT_WORK() should be done before mod_timer()
    or we risk timer being fired too soon, even with a 1 second timer.

    Second issue is that we need to reject too big info->timeout
    to avoid overflows in msecs_to_jiffies(info->timeout * 1000), or
    risk looping, if result after overflow is 0.

    [1]
    WARNING: CPU: 1 PID: 5129 at kernel/workqueue.c:1444 __queue_work+0xdf4/0x1230 kernel/workqueue.c:1444
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 1 PID: 5129 Comm: syzkaller159866 Not tainted 4.16.0-rc1+ #230
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:

    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:53
    panic+0x1e4/0x41c kernel/panic.c:183
    __warn+0x1dc/0x200 kernel/panic.c:547
    report_bug+0x211/0x2d0 lib/bug.c:184
    fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
    fixup_bug arch/x86/kernel/traps.c:247 [inline]
    do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
    do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
    invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:988
    RIP: 0010:__queue_work+0xdf4/0x1230 kernel/workqueue.c:1444
    RSP: 0018:ffff8801db507538 EFLAGS: 00010006
    RAX: ffff8801aeb46080 RBX: ffff8801db530200 RCX: ffffffff81481404
    RDX: 0000000000000100 RSI: ffffffff86b42640 RDI: 0000000000000082
    RBP: ffff8801db507758 R08: 1ffff1003b6a0de5 R09: 000000000000000c
    R10: ffff8801db5073f0 R11: 0000000000000020 R12: 1ffff1003b6a0eb6
    R13: ffff8801b1067ae0 R14: 00000000000001f8 R15: dffffc0000000000
    queue_work_on+0x16a/0x1c0 kernel/workqueue.c:1488
    queue_work include/linux/workqueue.h:488 [inline]
    schedule_work include/linux/workqueue.h:546 [inline]
    idletimer_tg_expired+0x44/0x60 net/netfilter/xt_IDLETIMER.c:116
    call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
    expire_timers kernel/time/timer.c:1363 [inline]
    __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
    run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
    __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
    invoke_softirq kernel/softirq.c:365 [inline]
    irq_exit+0x1cc/0x200 kernel/softirq.c:405
    exiting_irq arch/x86/include/asm/apic.h:541 [inline]
    smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
    apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:829

    RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:777 [inline]
    RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
    RIP: 0010:_raw_spin_unlock_irqrestore+0x5e/0xba kernel/locking/spinlock.c:184
    RSP: 0018:ffff8801c20173c8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff12
    RAX: dffffc0000000000 RBX: 0000000000000282 RCX: 0000000000000006
    RDX: 1ffffffff0d592cd RSI: 1ffff10035d68d23 RDI: 0000000000000282
    RBP: ffff8801c20173d8 R08: 1ffff10038402e47 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8820e5c8
    R13: ffff8801b1067ad8 R14: ffff8801aea7c268 R15: ffff8801aea7c278
    __debug_object_init+0x235/0x1040 lib/debugobjects.c:378
    debug_object_init+0x17/0x20 lib/debugobjects.c:391
    __init_work+0x2b/0x60 kernel/workqueue.c:506
    idletimer_tg_create net/netfilter/xt_IDLETIMER.c:152 [inline]
    idletimer_tg_checkentry+0x691/0xb00 net/netfilter/xt_IDLETIMER.c:213
    xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:850
    check_target net/ipv6/netfilter/ip6_tables.c:533 [inline]
    find_check_entry.isra.7+0x935/0xcf0 net/ipv6/netfilter/ip6_tables.c:575
    translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:744
    do_replace net/ipv6/netfilter/ip6_tables.c:1160 [inline]
    do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1686
    nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
    nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
    ipv6_setsockopt+0x10b/0x130 net/ipv6/ipv6_sockglue.c:927
    udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
    sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2976
    SYSC_setsockopt net/socket.c:1850 [inline]
    SyS_setsockopt+0x189/0x360 net/socket.c:1829
    do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287

    Fixes: 0902b469bd25 ("netfilter: xtables: idletimer target implementation")
    Signed-off-by: Eric Dumazet
    Reported-by: syzkaller
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
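
    The second issue in this commit, overflow of `info->timeout * 1000`, can be illustrated with a small range check. `timeout_to_msecs()` is a hypothetical helper, and the `INT_MAX / 1000` bound is an assumption chosen so the product stays within a signed 32-bit range; the actual patch may use a different bound.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a timeout in seconds to milliseconds.
 * Returns 0 on success, -1 if the value is zero or the product
 * would be too large to represent safely.
 */
static int timeout_to_msecs(uint32_t timeout_secs, uint32_t *msecs)
{
    assert(msecs != NULL);

    if (timeout_secs == 0 || timeout_secs >= (uint32_t)(INT_MAX / 1000))
        return -1;

    *msecs = timeout_secs * 1000u;   /* cannot wrap after the check above */
    return 0;
}
```

    Without the check, a 32-bit product silently wraps: 4294968 seconds times 1000 exceeds 2^32 and leaves a very small millisecond value, which is how a huge requested timeout can turn into a near-zero one.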
     
  • commit db57ccf0f2f4624b4c4758379f8165277504fbd7 upstream.

    syzbot reported a division by 0 bug in the netfilter nat code:

    divide error: 0000 [#1] SMP KASAN
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Modules linked in:
    CPU: 1 PID: 4168 Comm: syzkaller034710 Not tainted 4.16.0-rc1+ #309
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    RIP: 0010:nf_nat_l4proto_unique_tuple+0x291/0x530
    net/netfilter/nf_nat_proto_common.c:88
    RSP: 0018:ffff8801b2466778 EFLAGS: 00010246
    RAX: 000000000000f153 RBX: ffff8801b2466dd8 RCX: ffff8801b2466c7c
    RDX: 0000000000000000 RSI: ffff8801b2466c58 RDI: ffff8801db5293ac
    RBP: ffff8801b24667d8 R08: ffff8801b8ba6dc0 R09: ffffffff88af5900
    R10: ffff8801b24666f0 R11: 0000000000000000 R12: 000000002990f153
    R13: 0000000000000001 R14: 0000000000000000 R15: ffff8801b2466c7c
    FS: 00000000017e3880(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000208fdfe4 CR3: 00000001b5340002 CR4: 00000000001606e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    dccp_unique_tuple+0x40/0x50 net/netfilter/nf_nat_proto_dccp.c:30
    get_unique_tuple+0xc28/0x1c10 net/netfilter/nf_nat_core.c:362
    nf_nat_setup_info+0x1c2/0xe00 net/netfilter/nf_nat_core.c:406
    nf_nat_redirect_ipv6+0x306/0x730 net/netfilter/nf_nat_redirect.c:124
    redirect_tg6+0x7f/0xb0 net/netfilter/xt_REDIRECT.c:34
    ip6t_do_table+0xc2a/0x1a30 net/ipv6/netfilter/ip6_tables.c:365
    ip6table_nat_do_chain+0x65/0x80 net/ipv6/netfilter/ip6table_nat.c:41
    nf_nat_ipv6_fn+0x594/0xa80 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:302
    nf_nat_ipv6_local_fn+0x33/0x5d0
    net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:407
    ip6table_nat_local_fn+0x2c/0x40 net/ipv6/netfilter/ip6table_nat.c:69
    nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline]
    nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483
    nf_hook include/linux/netfilter.h:243 [inline]
    NF_HOOK include/linux/netfilter.h:286 [inline]
    ip6_xmit+0x10ec/0x2260 net/ipv6/ip6_output.c:277
    inet6_csk_xmit+0x2fc/0x580 net/ipv6/inet6_connection_sock.c:139
    dccp_transmit_skb+0x9ac/0x10f0 net/dccp/output.c:142
    dccp_connect+0x369/0x670 net/dccp/output.c:564
    dccp_v6_connect+0xe17/0x1bf0 net/dccp/ipv6.c:946
    __inet_stream_connect+0x2d4/0xf00 net/ipv4/af_inet.c:620
    inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:684
    SYSC_connect+0x213/0x4a0 net/socket.c:1639
    SyS_connect+0x24/0x30 net/socket.c:1620
    do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287
    entry_SYSCALL_64_after_hwframe+0x26/0x9b
    RIP: 0033:0x441c69
    RSP: 002b:00007ffe50cc0be8 EFLAGS: 00000217 ORIG_RAX: 000000000000002a
    RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 0000000000441c69
    RDX: 000000000000001c RSI: 00000000208fdfe4 RDI: 0000000000000003
    RBP: 00000000006cc018 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000538 R11: 0000000000000217 R12: 0000000000403590
    R13: 0000000000403620 R14: 0000000000000000 R15: 0000000000000000
    Code: 48 89 f0 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 46 02 00 00 48 8b
    45 c8 44 0f b7 20 e8 88 97 04 fd 31 d2 41 0f b7 c4 4c 89 f9 f7 f6 48
    c1 e9 03 48 b8 00 00 00 00 00 fc ff df 0f b6 0c 01
    RIP: nf_nat_l4proto_unique_tuple+0x291/0x530
    net/netfilter/nf_nat_proto_common.c:88 RSP: ffff8801b2466778

    The problem is that currently we don't have any check on the
    configured port range. A port range == -1 triggers the bug, while
    other negative values may require a very long time to complete the
    following loop.

    This commit addresses the issue swapping the two ends on negative
    ranges. The check is performed in nf_nat_l4proto_unique_tuple() since
    the nft nat loads the port values from nft registers at runtime.

    v1 -> v2: use the correct 'Fixes' tag
    v2 -> v3: update commit message, drop unneeded READ_ONCE()

    Fixes: 5b1158e909ec ("[NETFILTER]: Add NAT support for nf_conntrack")
    Reported-by: syzbot+8012e198bd037f4871e5@syzkaller.appspotmail.com
    Signed-off-by: Paolo Abeni
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
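
    The fix described in this commit, swapping the ends of a negative range before computing its size, can be sketched as follows. `pick_port()` is a hypothetical helper working on host-order ports; it is not the body of nf_nat_l4proto_unique_tuple().

```c
#include <assert.h>
#include <stdint.h>

/* Pick a port in [min, max] from an offset; reversed ends are swapped. */
static uint16_t pick_port(uint16_t min, uint16_t max, uint32_t off)
{
    uint32_t range_size;

    if (min > max) {            /* negative range: swap the two ends */
        uint16_t tmp = min;
        min = max;
        max = tmp;
    }

    range_size = (uint32_t)max - min + 1;
    assert(range_size != 0);    /* holds after the swap, so no divide error */

    return (uint16_t)(min + off % range_size);
}
```

    With the swap in place, a range of -1 (min one above max) yields a size of 2 instead of 0, so the modulo can no longer divide by zero.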
     
  • commit 10414014bc085aac9f787a5890b33b5605fbcfc4 upstream.

    syzbot reported that xt_LED may try to use the ledinternal->timer
    without previously initializing it:

    ------------[ cut here ]------------
    kernel BUG at kernel/time/timer.c:958!
    invalid opcode: 0000 [#1] SMP KASAN
    Dumping ftrace buffer:
    (ftrace buffer empty)
    Modules linked in:
    CPU: 1 PID: 1826 Comm: kworker/1:2 Not tainted 4.15.0+ #306
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    Workqueue: ipv6_addrconf addrconf_dad_work
    RIP: 0010:__mod_timer kernel/time/timer.c:958 [inline]
    RIP: 0010:mod_timer+0x7d6/0x13c0 kernel/time/timer.c:1102
    RSP: 0018:ffff8801d24fe9f8 EFLAGS: 00010293
    RAX: ffff8801d25246c0 RBX: ffff8801aec6cb50 RCX: ffffffff816052c6
    RDX: 0000000000000000 RSI: 00000000fffbd14b RDI: ffff8801aec6cb68
    RBP: ffff8801d24fec98 R08: 0000000000000000 R09: 1ffff1003a49fd6c
    R10: ffff8801d24feb28 R11: 0000000000000005 R12: dffffc0000000000
    R13: ffff8801d24fec70 R14: 00000000fffbd14b R15: ffff8801af608f90
    FS: 0000000000000000(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000206d6fd0 CR3: 0000000006a22001 CR4: 00000000001606e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    led_tg+0x1db/0x2e0 net/netfilter/xt_LED.c:75
    ip6t_do_table+0xc2a/0x1a30 net/ipv6/netfilter/ip6_tables.c:365
    ip6table_raw_hook+0x65/0x80 net/ipv6/netfilter/ip6table_raw.c:42
    nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline]
    nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483
    nf_hook.constprop.27+0x3f6/0x830 include/linux/netfilter.h:243
    NF_HOOK include/linux/netfilter.h:286 [inline]
    ndisc_send_skb+0xa51/0x1370 net/ipv6/ndisc.c:491
    ndisc_send_ns+0x38a/0x870 net/ipv6/ndisc.c:633
    addrconf_dad_work+0xb9e/0x1320 net/ipv6/addrconf.c:4008
    process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
    worker_thread+0x223/0x1990 kernel/workqueue.c:2247
    kthread+0x33c/0x400 kernel/kthread.c:238
    ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429
    Code: 85 2a 0b 00 00 4d 8b 3c 24 4d 85 ff 75 9f 4c 8b bd 60 fd ff ff e8 bb
    57 10 00 65 ff 0d 94 9a a1 7e e9 d9 fc ff ff e8 aa 57 10 00 0b e8 a3
    57 10 00 e9 14 fb ff ff e8 99 57 10 00 4c 89 bd 70
    RIP: __mod_timer kernel/time/timer.c:958 [inline] RSP: ffff8801d24fe9f8
    RIP: mod_timer+0x7d6/0x13c0 kernel/time/timer.c:1102 RSP: ffff8801d24fe9f8
    ---[ end trace f661ab06f5dd8b3d ]---

    The ledinternal struct can be shared between several different
    xt_LED targets, but the related timer is currently initialized only
    if the first target requires it. Fix it by unconditionally
    initializing the timer struct.

    v1 -> v2: call del_timer_sync() unconditionally, too.

    Fixes: 268cb38e1802 ("netfilter: x_tables: add LED trigger target")
    Reported-by: syzbot+10c98dc5725c6c8fc7fb@syzkaller.appspotmail.com
    Signed-off-by: Paolo Abeni
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     
  • commit 57ebd808a97d7c5b1e1afb937c2db22beba3c1f8 upstream.

    The rationale for removing the check is only correct for rulesets
    generated by ip(6)tables.

    In iptables, a jump can only occur to a user-defined chain, i.e.
    because we size the stack based on number of user-defined chains we
    cannot exceed stack size.

    However, the underlying binary format has no such restriction,
    and the validation step only ensures that the jump target is a
    valid rule start point.

    IOW, it's possible to build a rule blob that has no user-defined
    chains but does contain a jump.

    If this happens, no jump stack gets allocated and a crash occurs
    when the jump is taken.

    Fixes: 7814b6ec6d0d6 ("netfilter: xtables: don't save/restore jumpstack offset")
    Reported-by: syzbot+e783f671527912cd9403@syzkaller.appspotmail.com
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Florian Westphal
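
    The check this commit restores amounts to a bounds test before every push onto the jump stack. The sketch below uses hypothetical names (`struct jumpstack`, `jump_push()`) and models the "no user-defined chains" case as a stack of size 0 with no storage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical jump stack; 'slots' is NULL when 'size' is 0. */
struct jumpstack {
    const void **slots;
    unsigned int size;
    unsigned int idx;
};

/*
 * Push a return position. Returns 0 on success, -1 if the stack is
 * full or was never allocated; the caller should then drop the packet.
 */
static int jump_push(struct jumpstack *js, const void *ret)
{
    assert(js != NULL);

    if (js->idx >= js->size)
        return -1;

    js->slots[js->idx++] = ret;
    return 0;
}
```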
     

11 Mar, 2018

16 commits

  • commit 3968523f855050b8195134da951b87c20bd66130 upstream.

    mpls_label_ok() validates that the 'platform_label' array index from a
    userspace netlink message payload is valid. Under speculation the
    mpls_label_ok() result may not resolve in the CPU pipeline until after
    the index is used to access an array element. Sanitize the index to zero
    to prevent userspace-controlled arbitrary out-of-bounds speculation, a
    precursor for a speculative execution side channel vulnerability.

    Cc: "David S. Miller"
    Cc: Eric W. Biederman
    Signed-off-by: Dan Williams
    Signed-off-by: David S. Miller
    [bwh: Backported to 4.4:
    - mpls_label_ok() doesn't take an extack parameter
    - Drop change in mpls_getroute()]
    Signed-off-by: Ben Hutchings
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
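
    Sanitizing an index "to zero" under speculation is usually done with a branch-free mask. The sketch below is modelled on the generic mask idiom and is an assumption about the approach, not a copy of the kernel helper: `index_mask()` and `index_nospec()` are hypothetical names, and the code relies on arithmetic right shift of negative values, which gcc and clang provide.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * All ones when index < size, zero otherwise, computed without a
 * conditional branch that the CPU could mispredict.
 */
static unsigned long index_mask(unsigned long index, unsigned long size)
{
    assert(index <= (unsigned long)LONG_MAX && size <= (unsigned long)LONG_MAX);

    return (unsigned long)(~(long)(index | (size - 1UL - index))
                           >> (BITS_PER_LONG - 1));
}

/* Clamp an out-of-range index to 0; in-range indices pass through. */
static unsigned long index_nospec(unsigned long index, unsigned long size)
{
    return index & index_mask(index, size);
}
```

    The usual pattern is to keep the normal bounds check and apply the clamp to the index just before the array access, so that even a mispredicted check cannot read beyond the array.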
     
  • commit b7b386f42f079b25b942c756820e36c6bd09b2ca upstream.

    mpls_route_add and mpls_route_del have the same checks on the label.
    Move to a helper. Avoid duplicate extack messages in the next patch.

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Ben Hutchings
    Signed-off-by: Greg Kroah-Hartman

    David Ahern
     
  • [ Upstream commit 07f2c7ab6f8d0a7e7c5764c4e6cc9c52951b9d9c ]

    When SCTP makes INIT or INIT_ACK packet the total chunk length
    can exceed SCTP_MAX_CHUNK_LEN which leads to kernel panic when
    transmitting these packets, e.g. the crash on sending INIT_ACK:

    [ 597.804948] skbuff: skb_over_panic: text:00000000ffae06e4 len:120168
    put:120156 head:000000007aa47635 data:00000000d991c2de
    tail:0x1d640 end:0xfec0 dev:
    ...
    [ 597.976970] ------------[ cut here ]------------
    [ 598.033408] kernel BUG at net/core/skbuff.c:104!
    [ 600.314841] Call Trace:
    [ 600.345829]
    [ 600.371639] ? sctp_packet_transmit+0x2095/0x26d0 [sctp]
    [ 600.436934] skb_put+0x16c/0x200
    [ 600.477295] sctp_packet_transmit+0x2095/0x26d0 [sctp]
    [ 600.540630] ? sctp_packet_config+0x890/0x890 [sctp]
    [ 600.601781] ? __sctp_packet_append_chunk+0x3b4/0xd00 [sctp]
    [ 600.671356] ? sctp_cmp_addr_exact+0x3f/0x90 [sctp]
    [ 600.731482] sctp_outq_flush+0x663/0x30d0 [sctp]
    [ 600.788565] ? sctp_make_init+0xbf0/0xbf0 [sctp]
    [ 600.845555] ? sctp_check_transmitted+0x18f0/0x18f0 [sctp]
    [ 600.912945] ? sctp_outq_tail+0x631/0x9d0 [sctp]
    [ 600.969936] sctp_cmd_interpreter.isra.22+0x3be1/0x5cb0 [sctp]
    [ 601.041593] ? sctp_sf_do_5_1B_init+0x85f/0xc30 [sctp]
    [ 601.104837] ? sctp_generate_t1_cookie_event+0x20/0x20 [sctp]
    [ 601.175436] ? sctp_eat_data+0x1710/0x1710 [sctp]
    [ 601.233575] sctp_do_sm+0x182/0x560 [sctp]
    [ 601.284328] ? sctp_has_association+0x70/0x70 [sctp]
    [ 601.345586] ? sctp_rcv+0xef4/0x32f0 [sctp]
    [ 601.397478] ? sctp6_rcv+0xa/0x20 [sctp]
    ...

    Here the chunk size for INIT_ACK packet becomes too big, mostly
    because of the state cookie (INIT packet has large size with
    many address parameters), plus additional server parameters.

    Later this chunk causes the panic in skb_put_data():

    sctp_packet_transmit()
      sctp_packet_pack()
        skb_put_data(nskb, chunk->skb->data, chunk->skb->len);

    'nskb' (head skb) was previously allocated with packet->size
    from u16 'chunk->chunk_hdr->length'.

    As suggested by Marcelo we should check the chunk's length in
    _sctp_make_chunk() before trying to allocate skb for it and
    discard a chunk if its size is bigger than SCTP_MAX_CHUNK_LEN.

    Signed-off-by: Alexey Kodanev
    Acked-by: Marcelo Ricardo Leitner
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexey Kodanev
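
    The length check described in this commit can be sketched as below. `chunk_len_check()`, `MAX_CHUNK_LEN` and `PAD4()` are hypothetical stand-ins, and the limit value (64 KiB minus four bytes) is an assumption for illustration, not a quotation of SCTP_MAX_CHUNK_LEN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit: the largest padded length that fits a 16-bit field. */
#define MAX_CHUNK_LEN ((size_t)(1u << 16) - sizeof(uint32_t))
/* Round a length up to the next multiple of 4. */
#define PAD4(len) (((size_t)(len) + 3u) & ~(size_t)3u)

/*
 * Compute the padded chunk length. Returns 0 and stores it in *out,
 * or returns -1 if it would not fit the 16-bit chunk length field.
 */
static int chunk_len_check(size_t hdr_len, size_t payload_len, uint16_t *out)
{
    size_t chunklen;

    assert(out != NULL);

    chunklen = PAD4(hdr_len + payload_len);
    if (chunklen > MAX_CHUNK_LEN)
        return -1;               /* discard instead of truncating */

    *out = (uint16_t)chunklen;
    return 0;
}
```

    The length from the trace above, 120168 bytes, does not fit in 16 bits; stored unchecked it would be truncated to a much smaller value, so a buffer sized from the header length would be too small for the real chunk data.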
     
  • [ Upstream commit 957d761cf91cdbb175ad7d8f5472336a4d54dbf2 ]

    When going through the bind address list in sctp_v6_get_dst() and
    the previously found address is better ('matchlen > bmatchlen'),
    the code continues to the next iteration without releasing currently
    held destination.

    Fix it by releasing 'bdst' before continuing to the next iteration, and
    instead of introducing one more '!IS_ERR(bdst)' check for dst_release(),
    move the existing one right after ip6_dst_lookup_flow(), i.e. we
    shouldn't proceed further if we get an error for the route lookup.

    Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6")
    Signed-off-by: Alexey Kodanev
    Acked-by: Neil Horman
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexey Kodanev
     
  • [ Upstream commit 350c9f484bde93ef229682eedd98cd5f74350f7f ]

    BBR uses tcp_tso_autosize() in an attempt to probe what would be the
    burst sizes and to adjust cwnd in bbr_target_cwnd() with following
    gold formula :

    /* Allow enough full-sized skbs in flight to utilize end systems. */
    cwnd += 3 * bbr->tso_segs_goal;

    But GSO can be lacking or be constrained to very small
    units (ip link set dev ... gso_max_segs 2)

    What we really want is to have enough packets in flight so that both
    GSO and GRO are efficient.

    So in the case GSO is off or downgraded, we still want to have the same
    number of packets in flight as if GSO/TSO was fully operational, so
    that GRO can hopefully be working efficiently.

    To fix this issue, we make tcp_tso_autosize() unaware of
    sk->sk_gso_max_segs

    Only tcp_tso_segs() has to enforce the gso_max_segs limit.

    Tested:

    ethtool -K eth0 tso off gso off
    tc qd replace dev eth0 root pfifo_fast

    Before patch:
    for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
        691  (ss -temoi shows cwnd is stuck around 6 )
        667
        651
        631
        517

    After patch:
    for f in {1..5}; do ./super_netperf 1 -H lpaa24 -- -K bbr; done
       1733 (ss -temoi shows cwnd is around 386 )
       1778
       1746
       1781
       1718

    Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
    Signed-off-by: Eric Dumazet
    Reported-by: Oleksandr Natalenko
    Acked-by: Neal Cardwell
    Acked-by: Soheil Hassas Yeganeh
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
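
    The split of responsibilities described in this commit can be sketched as below. The function bodies are simplified assumptions (roughly 1 ms of data at the pacing rate, capped in bytes), and `tso_autosize()`, `tso_segs()` and `cwnd_headroom()` are illustrative userspace helpers rather than the kernel functions of similar name.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/*
 * Burst size goal in segments. It is capped in bytes only and is
 * deliberately unaware of any per-device segment limit.
 */
static uint32_t tso_autosize(uint32_t pacing_rate, uint32_t mss,
                             uint32_t min_segs, uint32_t gso_max_bytes)
{
    uint32_t bytes;

    assert(mss != 0);

    bytes = min_u32(pacing_rate >> 10, gso_max_bytes);
    return max_u32(bytes / mss, min_segs);
}

/* Segments actually sent per skb: only here is gso_max_segs enforced. */
static uint32_t tso_segs(uint32_t pacing_rate, uint32_t mss, uint32_t min_segs,
                         uint32_t gso_max_bytes, uint32_t gso_max_segs)
{
    return min_u32(tso_autosize(pacing_rate, mss, min_segs, gso_max_bytes),
                   gso_max_segs);
}

/* Extra cwnd derived from the goal, per the formula quoted above. */
static uint32_t cwnd_headroom(uint32_t tso_segs_goal)
{
    return 3 * tso_segs_goal;
}
```

    With a device limited to 2 segments, the per-skb size is still clamped to 2, but the goal feeding the cwnd formula stays at its unclamped value, so enough packets remain in flight for GRO on the receiver.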
     
  • [ Upstream commit 93c62c45ed5fad1b87e3a45835b251cd68de9c46 ]

    All the kernel_sendmsg() calls in rxrpc_send_data_packet() need to send
    both parts of the iov[] buffer, but one of them does not. Fix it so that
    it does.

    Without this, short IPv6 rxrpc DATA packets may be seen that have the rxrpc
    header included, but no payload.

    Fixes: 5a924b8951f8 ("rxrpc: Don't store the rxrpc header in the Tx queue sk_buffs")
    Reported-by: Marc Dionne
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     
  • [ Upstream commit 808cf9e38cd7923036a99f459ccc8cf2955e47af ]

    Avoid SKB coalescing if eor bit is set in one of the relevant
    SKBs.

    Fixes: c134ecb87817 ("tcp: Make use of MSG_EOR in tcp_sendmsg")
    Signed-off-by: Ilya Lesokhin
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Ilya Lesokhin
     
  • [ Upstream commit 4a31a6b19f9ddf498c81f5c9b089742b7472a6f8 ]

    Fix dst reference count leak in sctp_v4_get_dst() introduced in commit
    410f03831 ("sctp: add routing output fallback"):

    When walking the address_list, successive ip_route_output_key() calls
    may return the same rt->dst with the reference incremented on each call.

    The code would not decrement the dst refcount when the dst pointer was
    identical from the previous iteration, causing the dst refcnt leak.

    Testcase:
    ip netns add TEST
    ip netns exec TEST ip link set lo up
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link add dummy2 type dummy
    ip link set dev dummy0 netns TEST
    ip link set dev dummy1 netns TEST
    ip link set dev dummy2 netns TEST
    ip netns exec TEST ip addr add 192.168.1.1/24 dev dummy0
    ip netns exec TEST ip link set dummy0 up
    ip netns exec TEST ip addr add 192.168.1.2/24 dev dummy1
    ip netns exec TEST ip link set dummy1 up
    ip netns exec TEST ip addr add 192.168.1.3/24 dev dummy2
    ip netns exec TEST ip link set dummy2 up
    ip netns exec TEST sctp_test -H 192.168.1.2 -P 20002 -h 192.168.1.1 -p 20000 -s -B 192.168.1.3
    ip netns del TEST

    In 4.4 and 4.9 kernels this results in:
    [ 354.179591] unregister_netdevice: waiting for lo to become free. Usage count = 1
    [ 364.419674] unregister_netdevice: waiting for lo to become free. Usage count = 1
    [ 374.663664] unregister_netdevice: waiting for lo to become free. Usage count = 1
    [ 384.903717] unregister_netdevice: waiting for lo to become free. Usage count = 1
    [ 395.143724] unregister_netdevice: waiting for lo to become free. Usage count = 1
    [ 405.383645] unregister_netdevice: waiting for lo to become free. Usage count = 1
    ...

    Fixes: 410f03831 ("sctp: add routing output fallback")
    Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary addresses")
    Signed-off-by: Tommi Rantala
    Acked-by: Marcelo Ricardo Leitner
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Tommi Rantala
     
  • [ Upstream commit 15f35d49c93f4fa9875235e7bf3e3783d2dd7a1b ]

    Since UDP-Lite is always using checksum, the following path is
    triggered when calculating pseudo header for it:

    udp4_csum_init() or udp6_csum_init()
    skb_checksum_init_zero_check()
    __skb_checksum_validate_complete()

    The problem can appear if skb->len is less than CHECKSUM_BREAK. In
    this particular case __skb_checksum_validate_complete() also invokes
    __skb_checksum_complete(skb). If UDP-Lite is using partial checksum
    that covers only part of a packet, the function will return bad
    checksum and the packet will be dropped.

    It can be fixed if we skip skb_checksum_init_zero_check() and only
    set the required pseudo header checksum for UDP-Lite with partial
    checksum before udp4_csum_init()/udp6_csum_init() functions return.

    Fixes: ed70fcfcee95 ("net: Call skb_checksum_init in IPv4")
    Fixes: e4f45b7f40bd ("net: Call skb_checksum_init in IPv6")
    Signed-off-by: Alexey Kodanev
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexey Kodanev
     
  • [ Upstream commit cb9f7a9a5c96a773bbc9c70660dc600cfff82f82 ]

    Nowadays, nlmsg_multicast() returns only 0 or -ESRCH but this was not the
    case when commit 134e63756d5f was pushed.
    However, there was no reason to stop the loop if a netns does not have
    listeners.
    Return -ESRCH only if there were no listeners in any netns.

    To avoid having the same problem in the future, I didn't take the
    assumption that nlmsg_multicast() returns only 0 or -ESRCH.

    Fixes: 134e63756d5f ("genetlink: make netns aware")
    CC: Johannes Berg
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Dichtel
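
    The return-value policy described in this commit can be sketched as a fold over per-namespace results. `mcast_allns_result()` is a hypothetical helper that takes the results as an array; the real code calls nlmsg_multicast() inside the loop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Combine per-netns delivery results, where each entry is 0 on
 * delivery or a negative errno. -ESRCH (no listeners) in one netns
 * does not stop the loop; any other error does.
 */
static int mcast_allns_result(const int *results, size_t n)
{
    int delivered = 0;

    assert(results != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (results[i] == 0)
            delivered = 1;
        else if (results[i] != -ESRCH)
            return results[i];   /* unexpected error: give up */
        /* -ESRCH: no listeners here, keep going */
    }

    return delivered ? 0 : -ESRCH;
}
```

    Handling "any other error" explicitly reflects the commit's note about not assuming that only 0 or -ESRCH can be returned.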
     
  • [ Upstream commit c7272c2f1229125f74f22dcdd59de9bbd804f1c8 ]

    According to RFC 1191 sections 3 and 4, ICMP frag-needed messages
    indicating an MTU below 68 should be rejected:

    A host MUST never reduce its estimate of the Path MTU below 68
    octets.

    and (talking about ICMP frag-needed's Next-Hop MTU field):

    This field will never contain a value less than 68, since every
    router "must be able to forward a datagram of 68 octets without
    fragmentation".

    Furthermore, by letting net.ipv4.route.min_pmtu be set to negative
    values, we can end up with a very large PMTU when (-1) is cast into u32.

    Let's also make ip_rt_min_pmtu a u32, since it's only ever compared to
    unsigned ints.

    Reported-by: Jianlin Shi
    Signed-off-by: Sabrina Dubroca
    Reviewed-by: Stefano Brivio
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Sabrina Dubroca
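
    Both points of this commit, the lower bound of 68 and the unsigned type, can be illustrated in a few lines. `set_min_pmtu()` and `MIN_VALID_PMTU` are hypothetical names for this sketch; the kernel enforces the bound through its sysctl handling rather than a setter like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_VALID_PMTU 68   /* lower bound from RFC 1191 quoted above */

/*
 * Accept a new minimum PMTU only if it is at least 68.
 * Returns 0 on success, -1 if the request is rejected.
 */
static int set_min_pmtu(uint32_t *min_pmtu, long requested)
{
    assert(min_pmtu != NULL);

    if (requested < MIN_VALID_PMTU || requested > INT32_MAX)
        return -1;

    *min_pmtu = (uint32_t)requested;
    return 0;
}
```

    Rejecting negative input matters because a value of -1 converted to a 32-bit unsigned integer becomes the largest representable value, which is the "very large PMTU" mentioned above.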
     
  • [ Upstream commit ac5b70198adc25c73fba28de4f78adcee8f6be0b ]

    netif_set_real_num_tx_queues() can be called when netdev is up.
    That usually happens when user requests change of number of
    channels/rings with ethtool -L. The procedure for changing
    the number of queues involves resetting the qdiscs and setting
    dev->num_tx_queues to the new value. When the new value is
    lower than the old one, extra care has to be taken to ensure
    ordering of accesses to the number of queues vs qdisc reset.

    Currently the queues are reset before new dev->num_tx_queues
    is assigned, leaving a window of time where packets can be
    enqueued onto the queues going down, leading to a likely
    crash in the drivers, since most drivers don't check if TX
    skbs are assigned to an active queue.

    Fixes: e6484930d7c7 ("net: allocate tx queues in register_netdevice")
    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jakub Kicinski
     
  • [ Upstream commit ca79bec237f5809a7c3c59bd41cd0880aa889966 ]

    gcc-8 has a new warning that detects overlapping input and output arguments
    in memcpy(). It triggers for sit_init_net() calling ipip6_tunnel_clone_6rd(),
    which is actually correct:

    net/ipv6/sit.c: In function 'sit_init_net':
    net/ipv6/sit.c:192:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]

    The problem here is that the logic detecting the memcpy() arguments finds them
    to be the same, but the conditional that tests for the input and output of
    ipip6_tunnel_clone_6rd() to be identical is not a compile-time constant.

    We know that netdev_priv(t->dev) is the same as t for a tunnel device,
    and comparing "dev" directly here lets the compiler figure out as well
    that 'dev == sitn->fb_tunnel_dev' when called from sit_init_net(), so
    it no longer warns.

    This code is old, so Cc stable to make sure that we don't get the warning
    for older kernels built with new gcc.

    Cc: Martin Sebor
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83456
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • [ Upstream commit a8c6db1dfd1b1d18359241372bb204054f2c3174 ]

    In fib_nh_match(), if output interface or gateway are passed in
    the FIB configuration, we don't have to check next hops of
    multipath routes to conclude whether we have a match or not.

    However, we might still have routes with different realms
    matching the same output interface and gateway configuration,
    and this needs to cause the match to fail. Otherwise the first
    route inserted in the FIB will match, regardless of the realms:

    # ip route add 1.1.1.1 dev eth0 table 1234 realms 1/2
    # ip route append 1.1.1.1 dev eth0 table 1234 realms 3/4
    # ip route list table 1234
    1.1.1.1 dev eth0 scope link realms 1/2
    1.1.1.1 dev eth0 scope link realms 3/4
    # ip route del 1.1.1.1 dev eth0 table 1234 realms 3/4
    # ip route list table 1234
    1.1.1.1 dev eth0 scope link realms 3/4

    whereas the route with realms 3/4 should have been deleted instead.

    Explicitly check for fc_flow passed in the FIB configuration
    (this comes from RTA_FLOW extracted by rtm_to_fib_config()) and
    fail matching if it differs from nh_tclassid.

    The handling of RTA_FLOW for multipath routes later in
    fib_nh_match() is still needed, as we can have multiple RTA_FLOW
    attributes that need to be matched against the tclassid of each
    next hop.

    v2: Check that fc_flow is set before discarding the match, so
    that the user can still select the first matching rule by
    not specifying any realm, as suggested by David Ahern.
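
    A minimal sketch of the added check, reduced to a single next hop.
    The struct and function names are hypothetical stand-ins, not the
    kernel's definitions, and the multipath handling mentioned above is
    left out. As in the description, 0 means "match" and a realm of 0
    means "not specified".

```c
#include <stdint.h>

/* Hypothetical, reduced view of the user's request. */
struct fake_cfg {
    int      oif;   /* requested output interface, 0 if unset */
    uint32_t gw;    /* requested gateway, 0 if unset */
    uint32_t flow;  /* requested realm value, 0 if unset */
};

/* Hypothetical, reduced view of an installed next hop. */
struct fake_nh {
    int      oif;
    uint32_t gw;
    uint32_t tclassid;
};

/* Return 0 on match, 1 on mismatch. */
static int fake_nh_match(const struct fake_cfg *cfg, const struct fake_nh *nh)
{
    /* A realm given by the user must equal the realm of the next hop. */
    if (cfg->flow && cfg->flow != nh->tclassid)
        return 1;

    if (cfg->oif || cfg->gw) {
        if ((!cfg->oif || cfg->oif == nh->oif) &&
            (!cfg->gw || cfg->gw == nh->gw))
            return 0;
        return 1;
    }
    return 0;
}
```

    Because the realm test is skipped when no realm is given, a delete
    request without realms still selects the first matching route.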

    Reported-by: Jianlin Shi
    Signed-off-by: Stefano Brivio
    Acked-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Stefano Brivio
     
  • [ Upstream commit 1b12580af1d0677c3c3a19e35bfe5d59b03f737f ]

    The br_sysfs_if file flush currently has no attr show handler.
    Reading it causes a kernel panic once a user has run chmod u+r
    on this file.

    Xiong found this issue when running the commands:

    ip link add br0 type bridge
    ip link add type veth
    ip link set veth0 master br0
    chmod u+r /sys/devices/virtual/net/veth0/brport/flush
    timeout 3 cat /sys/devices/virtual/net/veth0/brport/flush

    The kernel crashed with a NULL pointer dereference call trace.

    This patch fixes it by returning -EINVAL when brport_attr->show
    is null, just the same as the check for brport_attr->store in
    brport_store().
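
    The pattern can be sketched in plain C as follows. The types and
    function names here (struct fake_attr, fake_attr_show(),
    fake_attr_store()) are hypothetical and only mirror the idea of
    checking an optional callback before calling it; they are not the
    bridge code itself.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical attribute with optional show/store callbacks. */
struct fake_attr {
    long (*show)(char *buf);
    long (*store)(const char *buf, size_t len);
};

/* Dispatch a read; a write-only attribute has no show callback. */
static long fake_attr_show(const struct fake_attr *attr, char *buf)
{
    if (!attr->show)
        return -EINVAL;   /* reject instead of calling a NULL pointer */
    return attr->show(buf);
}

/* Dispatch a write, with the mirror-image check on store. */
static long fake_attr_store(const struct fake_attr *attr, const char *buf, size_t len)
{
    if (!attr->store)
        return -EINVAL;
    return attr->store(buf, len);
}

/* Example store callback for a write-only, flush-style attribute. */
static long fake_flush_store(const char *buf, size_t len)
{
    (void)buf;
    return (long)len;     /* pretend the whole write was consumed */
}
```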

    Fixes: 9cf637473c85 ("bridge: add sysfs hook to flush forwarding table")
    Reported-by: Xiong Zhou
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Xin Long
     
  • commit b87b6194be631c94785fe93398651e804ed43e28 upstream.

    Before, if cb->start() failed, the module reference would never be put,
    because cb->cb_running is intentionally false at this point. Users are
    generally annoyed by this because they can no longer unload modules that
    leak references. Also, it may be possible to tediously wrap a reference
    counter back to zero, especially since module.c still uses atomic_inc
    instead of refcount_inc.

    This patch expands the error path to simply call module_put if
    cb->start() fails.
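
    A minimal userspace sketch of the reference balance described here.
    Everything in it (struct fake_module, struct fake_cb,
    fake_dump_start()) is a hypothetical model, not the netlink code; it
    only shows that the error path has to drop the reference itself
    because the running flag is never set.

```c
/* Hypothetical module with a plain reference counter. */
struct fake_module {
    int refcnt;
};

static int fake_module_get(struct fake_module *m)
{
    m->refcnt++;
    return 1;
}

static void fake_module_put(struct fake_module *m)
{
    m->refcnt--;
}

/* Hypothetical dump control block. */
struct fake_cb {
    struct fake_module *module;
    int (*start)(struct fake_cb *cb);
    int running;
};

static int fake_dump_start(struct fake_cb *cb)
{
    int ret;

    if (!fake_module_get(cb->module))
        return -1;

    if (cb->start) {
        ret = cb->start(cb);
        if (ret) {
            /* start() failed: running stays 0, so nothing else drops the reference */
            fake_module_put(cb->module);
            return ret;
        }
    }

    cb->running = 1;   /* from here on, the normal completion path owns the put */
    return 0;
}

/* Two example start callbacks: one failing, one succeeding. */
static int fake_start_fail(struct fake_cb *cb) { (void)cb; return -5; }
static int fake_start_ok(struct fake_cb *cb) { (void)cb; return 0; }
```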

    Fixes: 41c87425a1ac ("netlink: do not set cb_running if dump's start() errs")
    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jason A. Donenfeld
     

03 Mar, 2018

7 commits

  • [ Upstream commit c76f97c99ae6d26d14c7f0e50e074382bfbc9f98 ]

    Some sockopt handling functions were calculating the length of the
    buffer to be written to userspace and then calculating it again when
    actually writing the buffer, which could lead to some write not using
    an up-to-date length.

    This patch updates such places to just make use of the len variable.

    Also, replace some sizeof(type) with sizeof(var).

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Marcelo Ricardo Leitner
     
  • [ Upstream commit 736a80bbfda709fb3631f5f62056f250a38e5804 ]

    If there are multiple mesh stations with the same MAC address,
    they will both get confused and start throwing warnings.

    Obviously in this case nothing can actually work anyway, so just
    drop frames that look like they're from ourselves early on.
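
    As a rough illustration, the early drop amounts to an address
    comparison. The function name and the choice of which address field
    is compared are assumptions made for this sketch, not the mac80211
    implementation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FAKE_ETH_ALEN 6

/* Return true if the frame should be processed, false if it should be dropped. */
static bool fake_accept_mesh_frame(const uint8_t own_addr[FAKE_ETH_ALEN],
                                   const uint8_t src_addr[FAKE_ETH_ALEN])
{
    /* A frame that claims to come from our own address cannot be valid: drop it early. */
    if (memcmp(own_addr, src_addr, FAKE_ETH_ALEN) == 0)
        return false;
    return true;
}
```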

    Reported-by: Gui Iribarren
    Signed-off-by: Johannes Berg
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johannes Berg
     
  • [ Upstream commit 3ea15452ee85754f70f3b9fa1f23165ef2e77ba7 ]

    nl80211_nan_add_func() does not check if the required attribute
    NL80211_NAN_FUNC_FOLLOW_UP_DEST is present when processing
    NL80211_CMD_ADD_NAN_FUNCTION request. This request can be issued
    by users with CAP_NET_ADMIN privilege and may result in NULL dereference
    and a system crash. Add a check for the required attribute presence.

    Signed-off-by: Hao Chen
    Signed-off-by: Johannes Berg
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Hao Chen
     
  • [ Upstream commit 642a8439ddd8423b92f2e71960afe21ee1f66bb6 ]

    Calling tipc_mon_delete() before the monitor has been created will oops.
    This can happen in tipc_enable_bearer() error path if tipc_disc_create()
    fails.

    [ 48.589074] BUG: unable to handle kernel paging request at 0000000000001008
    [ 48.590266] IP: tipc_mon_delete+0xea/0x270 [tipc]
    [ 48.591223] PGD 1e60c5067 P4D 1e60c5067 PUD 1eb0cf067 PMD 0
    [ 48.592230] Oops: 0000 [#1] SMP KASAN
    [ 48.595610] CPU: 5 PID: 1199 Comm: tipc Tainted: G B 4.15.0-rc4-pc64-dirty #5
    [ 48.597176] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
    [ 48.598489] RIP: 0010:tipc_mon_delete+0xea/0x270 [tipc]
    [ 48.599347] RSP: 0018:ffff8801d827f668 EFLAGS: 00010282
    [ 48.600705] RAX: ffff8801ee813f00 RBX: 0000000000000204 RCX: 0000000000000000
    [ 48.602183] RDX: 1ffffffff1de6a75 RSI: 0000000000000297 RDI: 0000000000000297
    [ 48.604373] RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1dd1533
    [ 48.605607] R10: ffffffff8eafbb05 R11: fffffbfff1dd1534 R12: 0000000000000050
    [ 48.607082] R13: dead000000000200 R14: ffffffff8e73f310 R15: 0000000000001020
    [ 48.608228] FS: 00007fc686484800(0000) GS:ffff8801f5540000(0000) knlGS:0000000000000000
    [ 48.610189] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 48.611459] CR2: 0000000000001008 CR3: 00000001dda70002 CR4: 00000000003606e0
    [ 48.612759] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 48.613831] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 48.615038] Call Trace:
    [ 48.615635] tipc_enable_bearer+0x415/0x5e0 [tipc]
    [ 48.620623] tipc_nl_bearer_enable+0x1ab/0x200 [tipc]
    [ 48.625118] genl_family_rcv_msg+0x36b/0x570
    [ 48.631233] genl_rcv_msg+0x5a/0xa0
    [ 48.631867] netlink_rcv_skb+0x1cc/0x220
    [ 48.636373] genl_rcv+0x24/0x40
    [ 48.637306] netlink_unicast+0x29c/0x350
    [ 48.639664] netlink_sendmsg+0x439/0x590
    [ 48.642014] SYSC_sendto+0x199/0x250
    [ 48.649912] do_syscall_64+0xfd/0x2c0
    [ 48.650651] entry_SYSCALL64_slow_path+0x25/0x25
    [ 48.651843] RIP: 0033:0x7fc6859848e3
    [ 48.652539] RSP: 002b:00007ffd25dff938 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    [ 48.654003] RAX: ffffffffffffffda RBX: 00007ffd25dff990 RCX: 00007fc6859848e3
    [ 48.655303] RDX: 0000000000000054 RSI: 00007ffd25dff990 RDI: 0000000000000003
    [ 48.656512] RBP: 00007ffd25dff980 R08: 00007fc685c35fc0 R09: 000000000000000c
    [ 48.657697] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000d13010
    [ 48.658840] R13: 00007ffd25e009c0 R14: 0000000000000000 R15: 0000000000000000
    [ 48.662972] RIP: tipc_mon_delete+0xea/0x270 [tipc] RSP: ffff8801d827f668
    [ 48.664073] CR2: 0000000000001008
    [ 48.664576] ---[ end trace e811818d54d5ce88 ]---

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tommi Rantala
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Tommi Rantala
     
  • [ Upstream commit 19142551b2be4a9e13838099fde1351386e5e007 ]

    Fix memory leak in tipc_enable_bearer() if enable_media() fails, and
    cleanup with bearer_disable() if tipc_mon_create() fails.

    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: Tommi Rantala
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Tommi Rantala
     
  • [ Upstream commit c9fefa08190fc879fb2e681035d7774e0a8c5170 ]

    Now it's using IPV6_MIN_MTU as the min mtu in ip6_tnl_xmit, but
    IPV6_MIN_MTU actually only works when the inner packet is ipv6.

    With IPV6_MIN_MTU for ipv4 packets, the new pmtu for inner dst
    couldn't be set less than 1280. It would cause tx_err and the
    packet to be dropped when the outer dst pmtu is close to 1280.

    Jianlin found it by running ipv4 traffic with the topo:

    (client) gre6 eth1 (route) eth2 gre6 (server)

    After changing eth2 mtu to 1300, the performance became very
    low, or the connection was even broken. The issue also affects
    ip4ip6 and ip6ip6 tunnels.

    So if the inner packet is ipv4, 576 should be considered as the
    min mtu.

    Note that for ip4ip6 and ip6ip6 tunnels, the inner packet can
    only be ipv4 or ipv6, but for gre6 tunnel, it may also be ARP.
    Using 576 as the min mtu for non-ipv6 packets, as this patch does,
    works for all those cases.
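
    The selection of the floor can be sketched as below. The two values
    come from the text above; the macro and function names are
    hypothetical and chosen only for this example.

```c
#include <stdbool.h>

/* Values taken from the description; the names are hypothetical. */
#define FAKE_MIN_MTU_INNER_IPV6  1280u
#define FAKE_MIN_MTU_INNER_OTHER  576u

/* Clamp a computed path MTU to the floor that fits the inner protocol. */
static unsigned int fake_clamp_tunnel_mtu(unsigned int mtu, bool inner_is_ipv6)
{
    unsigned int min_mtu = inner_is_ipv6 ? FAKE_MIN_MTU_INNER_IPV6
                                         : FAKE_MIN_MTU_INNER_OTHER;

    return mtu < min_mtu ? min_mtu : mtu;
}
```

    For an inner ipv4 packet and a computed mtu of 1252, the result stays
    at 1252 instead of being raised to 1280.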

    Reported-by: Jianlin Shi
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Xin Long
     
  • [ Upstream commit 588753f1eb18978512b1c9b85fddb457d46f9033 ]

    One example of when an ICMPv6 packet is required to be looped back is
    when a host acts as both a Multicast Listener and a Multicast Router.

    A Multicast Router will listen on address ff02::16 for MLDv2 messages.

    Currently, MLDv2 messages originating from a Multicast Listener running
    on the same host as the Multicast Router are not being delivered to the
    Multicast Router. This is due to dst.input being assigned the default
    value of dst_discard.

    This results in the packet being looped back but discarded before being
    delivered to the Multicast Router.

    This patch sets dst.input to ip6_input to ensure a looped back packet
    is delivered to the Multicast Router.

    Signed-off-by: Brendan McGrath
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Brendan McGrath
     

28 Feb, 2018

2 commits

  • commit bee92d06157fc39d5d7836a061c7d41289a55797 upstream.

    gcc-8 warns about some obviously incorrect code:

    net/mac80211/cfg.c: In function 'cfg80211_beacon_dup':
    net/mac80211/cfg.c:2896:3: error: 'memcpy' source argument is the same as destination [-Werror=restrict]

    From the context, I conclude that we want to copy from beacon into
    new_beacon, as we do in the rest of the function.

    Cc: stable@vger.kernel.org
    Fixes: 73da7d5bab79 ("mac80211: add channel switch command and beacon callbacks")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit 01ea306f2ac2baff98d472da719193e738759d93 upstream.

    The Syzbot reported a possible deadlock in the netfilter area caused by
    rtnl lock, xt lock and socket lock being acquired with a different order
    on different code paths, leading to the following backtrace:
    Reviewed-by: Xin Long

    ======================================================
    WARNING: possible circular locking dependency detected
    4.15.0+ #301 Not tainted
    ------------------------------------------------------
    syzkaller233489/4179 is trying to acquire lock:
    (rtnl_mutex){+.+.}, at: [] rtnl_lock+0x17/0x20
    net/core/rtnetlink.c:74

    but task is already holding lock:
    (&xt[i].mutex){+.+.}, at: []
    xt_find_table_lock+0x3e/0x3e0 net/netfilter/x_tables.c:1041

    which lock already depends on the new lock.
    ===

    Since commit 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock
    only in the required scope"), we already acquire the socket lock in
    the innermost scope, where needed. In that commit I forgot to remove
    the outermost socket lock from the getsockopt() path; this commit
    addresses the issue by dropping it now.

    v1 -> v2: fix bad subj, added relevant 'fixes' tag

    Fixes: 22265a5c3c10 ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
    Fixes: 202f59afd441 ("netfilter: ipt_CLUSTERIP: do not hold dev")
    Fixes: 3f34cfae1230 ("netfilter: on sockopt() acquire sock lock only in the required scope")
    Reported-by: syzbot+ddde1c7b7ff7442d7f2d@syzkaller.appspotmail.com
    Suggested-by: Florian Westphal
    Signed-off-by: Paolo Abeni
    Signed-off-by: Pablo Neira Ayuso
    Tested-by: Krzysztof Piotr Oledzki
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     

25 Feb, 2018

3 commits

  • [ Upstream commit 732706afe1cc46ef48493b3d2b69c98f36314ae4 ]

    On policies with a transport mode template, we pass the addresses
    from the flowi to xfrm_state_find(), assuming that the IP addresses
    (and address family) don't change during transformation.

    Unfortunately our policy template validation is not strict enough.
    It is possible to configure policies with transport mode template
    where the address family of the template does not match the selector's
    address family. This leads to stack-out-of-bounds reads because
    we compare addresses of the wrong family. Fix this by refusing
    such a configuration; the address family cannot change in transport
    mode.

    We use the assumption that, in transport mode, the first template's
    address family must match the address family of the policy selector.
    Subsequent transport mode templates must match the address family of
    the previous template.
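
    The rule can be sketched as a single pass over the templates. The
    types and names below are hypothetical simplifications for
    illustration, not the xfrm structures.

```c
#include <errno.h>
#include <stddef.h>

enum fake_mode { FAKE_MODE_TRANSPORT, FAKE_MODE_TUNNEL };

/* Hypothetical, reduced template: just a mode and an address family. */
struct fake_tmpl {
    enum fake_mode mode;
    int family;     /* e.g. 4 or 6 in this sketch */
};

/* Return 0 if the template list is acceptable, -EINVAL otherwise. */
static int fake_validate_tmpls(const struct fake_tmpl *t, size_t n, int sel_family)
{
    int prev_family = sel_family;   /* the first template is checked against the selector */
    size_t i;

    for (i = 0; i < n; i++) {
        /* Transport mode must not change the address family. */
        if (t[i].mode == FAKE_MODE_TRANSPORT && t[i].family != prev_family)
            return -EINVAL;
        prev_family = t[i].family;
    }
    return 0;
}
```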

    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Steffen Klassert
     
  • [ Upstream commit 8afa10cbe281b10371fee5a87ab266e48d71a7f9 ]

    Check that the qmin & qmax values don't overflow for the given Wlog value.
    Check that qmin
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Nogah Frankel
     
  • [ Upstream commit d30fc5126efb0c33b7adf5966d3051db2c3d7721 ]

    Now outstanding_bytes is only increased when appending chunks into one
    packet and sending it for the first time, and decreased when a chunk is
    about to move into the retransmit queue. This means the outstanding_bytes
    value has already been decreased for all chunks in the retransmit queue.

    However, sctp_prsctp_prune_sent is a common function that checks the
    chunks in both the transmitted and retransmit queues, and it decreases
    outstanding_bytes when moving a chunk into the abandoned queue from
    either of them.

    This could cause an outstanding_bytes underflow, as it also decreases
    its value for the chunks in the retransmit queue.

    This patch fixes it by only updating outstanding_bytes for transmitted
    queue when pruning queues for prsctp prio policy, the same fix is also
    needed in sctp_check_transmitted.
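
    A minimal sketch of the accounting rule, using hypothetical names
    (struct fake_assoc, fake_abandon_chunk()) rather than the SCTP
    structures: the counter is only touched when the chunk comes from the
    transmitted queue.

```c
#include <stddef.h>

enum fake_queue { FAKE_Q_TRANSMITTED, FAKE_Q_RETRANSMIT };

/* Hypothetical association state: only the counter of interest. */
struct fake_assoc {
    size_t outstanding_bytes;
};

/* Account for moving one chunk of the given size to the abandoned list. */
static void fake_abandon_chunk(struct fake_assoc *a, enum fake_queue from,
                               size_t chunk_len)
{
    /* Chunks on the retransmit queue were already subtracted when they moved there. */
    if (from == FAKE_Q_TRANSMITTED)
        a->outstanding_bytes -= chunk_len;
}
```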

    Fixes: 8dbdf1f5b09c ("sctp: implement prsctp PRIO policy")
    Signed-off-by: Xin Long
    Acked-by: Marcelo Ricardo Leitner
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Xin Long