18 Oct, 2018

1 commit

  • [ Upstream commit 8b4c3cdd9dd8290343ce959a132d3b334062c5b9 ]

    A number of TC attributes are processed without proper validation
    (e.g., length checks). Add a tca policy for all input attributes and use
    when invoking nlmsg_parse.

    The 2 Fixes tags below cover the latest additions. The other attributes
    are a string (KIND), nested attribute (OPTIONS which does seem to have
    validation in most cases), for dumps only or a flag.

    Fixes: 5bc1701881e39 ("net: sched: introduce multichain support for filters")
    Fixes: d47a6b0e7c492 ("net: sched: introduce ingress/egress block index attributes for qdisc")
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David Ahern
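
The policy-based validation described above can be illustrated with a small userspace sketch. It is not the kernel's nla_policy machinery: the attribute types, the `demo_policy` table and `validate_attrs()` are hypothetical names chosen for this example, and the length rules are simplified assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute types; not the kernel's TCA_* enum. */
enum { ATTR_UNSPEC, ATTR_KIND, ATTR_CHAIN, ATTR_BLOCK, ATTR_MAX = ATTR_BLOCK };

struct attr {
    uint16_t type;   /* attribute type */
    uint16_t len;    /* payload length in bytes */
};

/* Per-type length bounds that a payload must satisfy. */
struct attr_policy {
    uint16_t min_len;
    uint16_t max_len;
};

/* Assumed rules for the sketch: a short string and two 32-bit values. */
static const struct attr_policy demo_policy[ATTR_MAX + 1] = {
    [ATTR_KIND]  = { .min_len = 1, .max_len = 16 },
    [ATTR_CHAIN] = { .min_len = 4, .max_len = 4 },
    [ATTR_BLOCK] = { .min_len = 4, .max_len = 4 },
};

/* Returns 0 if every attribute satisfies the policy, -1 otherwise. */
static int validate_attrs(const struct attr *attrs, size_t n,
                          const struct attr_policy *policy)
{
    assert(policy != NULL);
    for (size_t i = 0; i < n; i++) {
        const struct attr *a = &attrs[i];

        /* This sketch rejects unknown types outright. */
        if (a->type == ATTR_UNSPEC || a->type > ATTR_MAX)
            return -1;
        if (a->len < policy[a->type].min_len ||
            a->len > policy[a->type].max_len)
            return -1;
    }
    return 0;
}
```

Validating every attribute against one table before any handler reads a payload keeps the length checks in a single place, which is the idea behind passing a policy to the parser.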
     

29 Sep, 2018

1 commit

  • [ Upstream commit 34043d250f51368f214aed7f54c2dc29c819a8c7 ]

    Matteo reported the following splat, testing the datapath of TC 'sample':

    BUG: KASAN: null-ptr-deref in tcf_sample_act+0xc4/0x310
    Read of size 8 at addr 0000000000000000 by task nc/433

    CPU: 0 PID: 433 Comm: nc Not tainted 4.19.0-rc3-kvm #17
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
    Call Trace:
    kasan_report.cold.6+0x6c/0x2fa
    tcf_sample_act+0xc4/0x310
    ? dev_hard_start_xmit+0x117/0x180
    tcf_action_exec+0xa3/0x160
    tcf_classify+0xdd/0x1d0
    htb_enqueue+0x18e/0x6b0
    ? deref_stack_reg+0x7a/0xb0
    ? htb_delete+0x4b0/0x4b0
    ? unwind_next_frame+0x819/0x8f0
    ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
    __dev_queue_xmit+0x722/0xca0
    ? unwind_get_return_address_ptr+0x50/0x50
    ? netdev_pick_tx+0xe0/0xe0
    ? save_stack+0x8c/0xb0
    ? kasan_kmalloc+0xbe/0xd0
    ? __kmalloc_track_caller+0xe4/0x1c0
    ? __kmalloc_reserve.isra.45+0x24/0x70
    ? __alloc_skb+0xdd/0x2e0
    ? sk_stream_alloc_skb+0x91/0x3b0
    ? tcp_sendmsg_locked+0x71b/0x15a0
    ? tcp_sendmsg+0x22/0x40
    ? __sys_sendto+0x1b0/0x250
    ? __x64_sys_sendto+0x6f/0x80
    ? do_syscall_64+0x5d/0x150
    ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
    ? __sys_sendto+0x1b0/0x250
    ? __x64_sys_sendto+0x6f/0x80
    ? do_syscall_64+0x5d/0x150
    ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
    ip_finish_output2+0x495/0x590
    ? ip_copy_metadata+0x2e0/0x2e0
    ? skb_gso_validate_network_len+0x6f/0x110
    ? ip_finish_output+0x174/0x280
    __tcp_transmit_skb+0xb17/0x12b0
    ? __tcp_select_window+0x380/0x380
    tcp_write_xmit+0x913/0x1de0
    ? __sk_mem_schedule+0x50/0x80
    tcp_sendmsg_locked+0x49d/0x15a0
    ? tcp_rcv_established+0x8da/0xa30
    ? tcp_set_state+0x220/0x220
    ? clear_user+0x1f/0x50
    ? iov_iter_zero+0x1ae/0x590
    ? __fget_light+0xa0/0xe0
    tcp_sendmsg+0x22/0x40
    __sys_sendto+0x1b0/0x250
    ? __ia32_sys_getpeername+0x40/0x40
    ? _copy_to_user+0x58/0x70
    ? poll_select_copy_remaining+0x176/0x200
    ? __pollwait+0x1c0/0x1c0
    ? ktime_get_ts64+0x11f/0x140
    ? kern_select+0x108/0x150
    ? core_sys_select+0x360/0x360
    ? vfs_read+0x127/0x150
    ? kernel_write+0x90/0x90
    __x64_sys_sendto+0x6f/0x80
    do_syscall_64+0x5d/0x150
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7fefef2b129d
    Code: ff ff ff ff eb b6 0f 1f 80 00 00 00 00 48 8d 05 51 37 0c 00 41 89 ca 8b 00 85 c0 75 20 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 3d 00 f0 ff ff 77 6b f3 c3 66 0f 1f 84 00 00 00 00 00 41 56 41
    RSP: 002b:00007fff2f5350c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    RAX: ffffffffffffffda RBX: 000056118d60c120 RCX: 00007fefef2b129d
    RDX: 0000000000002000 RSI: 000056118d629320 RDI: 0000000000000003
    RBP: 000056118d530370 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000002000
    R13: 000056118d5c2a10 R14: 000056118d5c2a10 R15: 000056118d5303b8

    tcf_sample_act() tried to update its per-cpu stats, but tcf_sample_init()
    forgot to allocate them, because tcf_idr_create() was called with a wrong
    value of 'cpustats'. Setting it to true proved to fix the reported crash.

    Reported-by: Matteo Croce
    Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
    Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action")
    Tested-by: Matteo Croce
    Signed-off-by: Davide Caratti
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
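
The failure mode above (a datapath that updates per-CPU stats which were never allocated) can be sketched in userspace as follows. All names (`demo_action`, `demo_action_create()`, `NR_CPUS_DEMO`) are hypothetical and only mirror the shape of the problem; the sketch returns an error where the kernel dereferenced NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS_DEMO 4   /* hypothetical CPU count for the sketch */

struct demo_action {
    unsigned long *cpu_packets;  /* per-CPU counters, NULL if not allocated */
};

/* Mimics a create helper that only allocates per-CPU stats on request. */
static struct demo_action *demo_action_create(bool cpustats)
{
    struct demo_action *a = calloc(1, sizeof(*a));

    if (!a)
        return NULL;
    if (cpustats) {
        a->cpu_packets = calloc(NR_CPUS_DEMO, sizeof(*a->cpu_packets));
        if (!a->cpu_packets) {
            free(a);
            return NULL;
        }
    }
    return a;
}

/* Datapath update; reports -1 where the unchecked code would have crashed. */
static int demo_action_count(struct demo_action *a, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_DEMO);
    if (!a->cpu_packets)
        return -1;
    a->cpu_packets[cpu]++;
    return 0;
}

static void demo_action_destroy(struct demo_action *a)
{
    if (a) {
        free(a->cpu_packets);
        free(a);
    }
}
```

Creating the action with `cpustats` set to true is what makes the later counter update safe, matching the one-argument nature of the fix described in the commit message.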
     

20 Sep, 2018

1 commit

  • Generalize private netem_rb_to_skb()

    TCP rtx queue will soon be converted to rb-tree,
    so we will need skb_rbtree_walk() helpers.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    (cherry picked from commit 18a4c0eab2623cc95be98a1e6af1ad18e7695977)
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

15 Sep, 2018

6 commits

  • [ Upstream commit 84cb8eb26cb9ce3c79928094962a475a9d850a53 ]

    Recent refactoring of add_metainfo() caused use_all_metadata() to add
    metainfo to the ife action metalist without taking a reference to the
    module. This causes a warning in module_put(), called from the ife action
    cleanup function.

    Implement add_metainfo_and_get_ops() function that returns with reference
    to module taken if metainfo was added successfully, and call it from
    use_all_metadata(), instead of calling __add_metainfo() directly.

    Example warning:

    [ 646.344393] WARNING: CPU: 1 PID: 2278 at kernel/module.c:1139 module_put+0x1cb/0x230
    [ 646.352437] Modules linked in: act_meta_skbtcindex act_meta_mark act_meta_skbprio act_ife ife veth nfsv3 nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c tun ebtable_filter ebtables ip6table_filter ip6_tables bridge stp llc mlx5_ib ib_uverbs ib_core intel_rapl sb_edac x86_pkg_temp_thermal mlx5_core coretemp kvm_intel kvm nfsd igb irqbypass crct10dif_pclmul devlink crc32_pclmul mei_me joydev ses crc32c_intel enclosure auth_rpcgss i2c_algo_bit ioatdma ptp mei pps_core ghash_clmulni_intel iTCO_wdt iTCO_vendor_support pcspkr dca ipmi_ssif lpc_ich target_core_mod i2c_i801 ipmi_si ipmi_devintf pcc_cpufreq wmi ipmi_msghandler nfs_acl lockd acpi_pad acpi_power_meter grace sunrpc mpt3sas raid_class scsi_transport_sas
    [ 646.425631] CPU: 1 PID: 2278 Comm: tc Not tainted 4.19.0-rc1+ #799
    [ 646.432187] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
    [ 646.440595] RIP: 0010:module_put+0x1cb/0x230
    [ 646.445238] Code: f3 66 94 02 e8 26 ff fa ff 85 c0 74 11 0f b6 1d 51 30 94 02 80 fb 01 77 60 83 e3 01 74 13 65 ff 0d 3a 83 db 73 e9 2b ff ff ff 0b e9 00 ff ff ff e8 59 01 fb ff 85 c0 75 e4 48 c7 c2 20 62 6b
    [ 646.464997] RSP: 0018:ffff880354d37068 EFLAGS: 00010286
    [ 646.470599] RAX: 0000000000000000 RBX: ffffffffc0a52518 RCX: ffffffff8c2668db
    [ 646.478118] RDX: 0000000000000003 RSI: dffffc0000000000 RDI: ffffffffc0a52518
    [ 646.485641] RBP: ffffffffc0a52180 R08: fffffbfff814a4a4 R09: fffffbfff814a4a3
    [ 646.493164] R10: ffffffffc0a5251b R11: fffffbfff814a4a4 R12: 1ffff1006a9a6e0d
    [ 646.500687] R13: 00000000ffffffff R14: ffff880362bab890 R15: dead000000000100
    [ 646.508213] FS: 00007f4164c99800(0000) GS:ffff88036fe40000(0000) knlGS:0000000000000000
    [ 646.516961] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 646.523080] CR2: 00007f41638b8420 CR3: 0000000351df0004 CR4: 00000000001606e0
    [ 646.530595] Call Trace:
    [ 646.533408] ? find_symbol_in_section+0x260/0x260
    [ 646.538509] tcf_ife_cleanup+0x11b/0x200 [act_ife]
    [ 646.543695] tcf_action_cleanup+0x29/0xa0
    [ 646.548078] __tcf_action_put+0x5a/0xb0
    [ 646.552289] ? nla_put+0x65/0xe0
    [ 646.555889] __tcf_idr_release+0x48/0x60
    [ 646.560187] tcf_generic_walker+0x448/0x6b0
    [ 646.564764] ? tcf_action_dump_1+0x450/0x450
    [ 646.569411] ? __lock_is_held+0x84/0x110
    [ 646.573720] ? tcf_ife_walker+0x10c/0x20f [act_ife]
    [ 646.578982] tca_action_gd+0x972/0xc40
    [ 646.583129] ? tca_get_fill.constprop.17+0x250/0x250
    [ 646.588471] ? mark_lock+0xcf/0x980
    [ 646.592324] ? check_chain_key+0x140/0x1f0
    [ 646.596832] ? debug_show_all_locks+0x240/0x240
    [ 646.601839] ? memset+0x1f/0x40
    [ 646.605350] ? nla_parse+0xca/0x1a0
    [ 646.609217] tc_ctl_action+0x215/0x230
    [ 646.613339] ? tcf_action_add+0x220/0x220
    [ 646.617748] rtnetlink_rcv_msg+0x56a/0x6d0
    [ 646.622227] ? rtnl_fdb_del+0x3f0/0x3f0
    [ 646.626466] netlink_rcv_skb+0x18d/0x200
    [ 646.630752] ? rtnl_fdb_del+0x3f0/0x3f0
    [ 646.634959] ? netlink_ack+0x500/0x500
    [ 646.639106] netlink_unicast+0x2d0/0x370
    [ 646.643409] ? netlink_attachskb+0x340/0x340
    [ 646.648050] ? _copy_from_iter_full+0xe9/0x3e0
    [ 646.652870] ? import_iovec+0x11e/0x1c0
    [ 646.657083] netlink_sendmsg+0x3b9/0x6a0
    [ 646.661388] ? netlink_unicast+0x370/0x370
    [ 646.665877] ? netlink_unicast+0x370/0x370
    [ 646.670351] sock_sendmsg+0x6b/0x80
    [ 646.674212] ___sys_sendmsg+0x4a1/0x520
    [ 646.678443] ? copy_msghdr_from_user+0x210/0x210
    [ 646.683463] ? lock_downgrade+0x320/0x320
    [ 646.687849] ? debug_show_all_locks+0x240/0x240
    [ 646.692760] ? do_raw_spin_unlock+0xa2/0x130
    [ 646.697418] ? _raw_spin_unlock+0x24/0x30
    [ 646.701798] ? __handle_mm_fault+0x1819/0x1c10
    [ 646.706619] ? __pmd_alloc+0x320/0x320
    [ 646.710738] ? debug_show_all_locks+0x240/0x240
    [ 646.715649] ? restore_nameidata+0x7b/0xa0
    [ 646.720117] ? check_chain_key+0x140/0x1f0
    [ 646.724590] ? check_chain_key+0x140/0x1f0
    [ 646.729070] ? __fget_light+0xbc/0xd0
    [ 646.733121] ? __sys_sendmsg+0xd7/0x150
    [ 646.737329] __sys_sendmsg+0xd7/0x150
    [ 646.741359] ? __ia32_sys_shutdown+0x30/0x30
    [ 646.746003] ? up_read+0x53/0x90
    [ 646.749601] ? __do_page_fault+0x484/0x780
    [ 646.754105] ? do_syscall_64+0x1e/0x2c0
    [ 646.758320] do_syscall_64+0x72/0x2c0
    [ 646.762353] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 646.767776] RIP: 0033:0x7f4163872150
    [ 646.771713] Code: 8b 15 3c 7d 2b 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb cd 66 0f 1f 44 00 00 83 3d b9 d5 2b 00 00 75 10 b8 2e 00 00 00 0f 05 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 be cd 00 00 48 89 04 24
    [ 646.791474] RSP: 002b:00007ffdef7d6b58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    [ 646.799721] RAX: ffffffffffffffda RBX: 0000000000000024 RCX: 00007f4163872150
    [ 646.807240] RDX: 0000000000000000 RSI: 00007ffdef7d6bd0 RDI: 0000000000000003
    [ 646.814760] RBP: 000000005b8b9482 R08: 0000000000000001 R09: 0000000000000000
    [ 646.822286] R10: 00000000000005e7 R11: 0000000000000246 R12: 00007ffdef7dad20
    [ 646.829807] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000679bc0
    [ 646.837360] irq event stamp: 6083
    [ 646.841043] hardirqs last enabled at (6081): [] __call_rcu+0x17d/0x500
    [ 646.849882] hardirqs last disabled at (6083): [] trace_hardirqs_off_thunk+0x1a/0x1c
    [ 646.859775] softirqs last enabled at (5968): [] __do_softirq+0x4a1/0x6ee
    [ 646.868784] softirqs last disabled at (6082): [] tcf_ife_cleanup+0x39/0x200 [act_ife]
    [ 646.878845] ---[ end trace b1b8c12ffe51e657 ]---

    Fixes: 5ffe57da29b3 ("act_ife: fix a potential deadlock")
    Signed-off-by: Vlad Buslov
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Vlad Buslov
     
  • [ Upstream commit 5ffe57da29b3802baeddaa40909682bbb4cb4d48 ]

    use_all_metadata() acquires read_lock(&ife_mod_lock), then calls
    add_metainfo() which calls find_ife_oplist() which acquires the same
    lock again. Deadlock!

    Introduce __add_metainfo(), which accepts struct tcf_meta_ops *ops
    as an additional parameter, and let its callers decide how
    to find it. use_all_metadata() already has ops, so there is no
    need to find it again; it just calls __add_metainfo() directly.

    And, as ife_mod_lock is only needed for find_ife_oplist(),
    this means we can make non-atomic allocation for populate_metalist()
    now.

    Fixes: 817e9f2c5c26 ("act_ife: acquire ife_mod_lock before reading ifeoplist")
    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
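
The re-entrant locking problem and the "caller supplies ops" fix can be sketched with a simulated lock. This is a single-threaded illustration under stated assumptions: `demo_lock()` only counts nested acquisitions instead of blocking, and every `demo_*` name is hypothetical rather than taken from act_ife.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_ops { const char *name; };

static struct demo_ops demo_oplist[] = { { "mark" }, { "prio" } };
#define DEMO_NOPS (sizeof(demo_oplist) / sizeof(demo_oplist[0]))

static int demo_lock_depth;   /* how many times the simulated lock is held */
static int demo_deadlocks;    /* nested acquisitions that could have hung */
static int demo_added;        /* entries added to the (imaginary) metalist */

static void demo_lock(void)
{
    if (demo_lock_depth > 0)
        demo_deadlocks++;     /* a real non-recursive lock could block here */
    demo_lock_depth++;
}

static void demo_unlock(void)
{
    assert(demo_lock_depth > 0);
    demo_lock_depth--;
}

/* Looks up ops by name and takes the lock itself. */
static struct demo_ops *demo_find_ops(const char *name)
{
    struct demo_ops *found = NULL;

    demo_lock();
    for (size_t i = 0; i < DEMO_NOPS; i++)
        if (strcmp(demo_oplist[i].name, name) == 0)
            found = &demo_oplist[i];
    demo_unlock();
    return found;
}

/* Lock-free core: the caller decides how ops was found. */
static int demo_add_with_ops(const struct demo_ops *ops)
{
    if (!ops)
        return -1;
    demo_added++;
    return 0;
}

/* Wrapper for callers that only have a name. */
static int demo_add_by_name(const char *name)
{
    return demo_add_with_ops(demo_find_ops(name));
}

/* Walks the list under the lock; use_lookup != 0 selects the buggy path. */
static void demo_use_all(int use_lookup)
{
    demo_lock();
    for (size_t i = 0; i < DEMO_NOPS; i++) {
        if (use_lookup)
            demo_add_by_name(demo_oplist[i].name);  /* re-enters the lock */
        else
            demo_add_with_ops(&demo_oplist[i]);     /* ops already in hand */
    }
    demo_unlock();
}
```

Splitting the helper into a lookup wrapper and a lock-free core lets a caller that already holds the lock skip the second acquisition entirely.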
     
  • [ Upstream commit 4e407ff5cd67ec76eeeea1deec227b7982dc7f66 ]

    The only time we need to take tcfa_lock is when adding
    a new metainfo to an existing ife->metalist. We don't need
    to take tcfa_lock so early and so broadly in tcf_ife_init().

    This means we can always take ife_mod_lock first and avoid the
    reverse lock-ordering warning reported by Vlad.

    Reported-by: Vlad Buslov
    Tested-by: Vlad Buslov
    Cc: Vlad Buslov
    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit 85eb9af182243ce9a8b72410d5321c440ac5f8d7 ]

    In the (rare) case of failure in nla_nest_start(), missing NULL checks in
    tcf_pedit_key_ex_dump() can make the following command

    # tc action add action pedit ex munge ip ttl set 64

    dereference a NULL pointer:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
    PGD 800000007d1cd067 P4D 800000007d1cd067 PUD 7acd3067 PMD 0
    Oops: 0002 [#1] SMP PTI
    CPU: 0 PID: 3336 Comm: tc Tainted: G E 4.18.0.pedit+ #425
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:tcf_pedit_dump+0x19d/0x358 [act_pedit]
    Code: be 02 00 00 00 48 89 df 66 89 44 24 20 e8 9b b1 fd e0 85 c0 75 46 8b 83 c8 00 00 00 49 83 c5 08 48 03 83 d0 00 00 00 4d 39 f5 89 04 25 00 00 00 00 0f 84 81 01 00 00 41 8b 45 00 48 8d 4c 24
    RSP: 0018:ffffb5d4004478a8 EFLAGS: 00010246
    RAX: ffff8880fcda2070 RBX: ffff8880fadd2900 RCX: 0000000000000000
    RDX: 0000000000000002 RSI: ffffb5d4004478ca RDI: ffff8880fcda206e
    RBP: ffff8880fb9cb900 R08: 0000000000000008 R09: ffff8880fcda206e
    R10: ffff8880fadd2900 R11: 0000000000000000 R12: ffff8880fd26cf40
    R13: ffff8880fc957430 R14: ffff8880fc957430 R15: ffff8880fb9cb988
    FS: 00007f75a537a740(0000) GS:ffff8880fda00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000007a2fa005 CR4: 00000000001606f0
    Call Trace:
    ? __nla_reserve+0x38/0x50
    tcf_action_dump_1+0xd2/0x130
    tcf_action_dump+0x6a/0xf0
    tca_get_fill.constprop.31+0xa3/0x120
    tcf_action_add+0xd1/0x170
    tc_ctl_action+0x137/0x150
    rtnetlink_rcv_msg+0x263/0x2d0
    ? _cond_resched+0x15/0x40
    ? rtnl_calcit.isra.30+0x110/0x110
    netlink_rcv_skb+0x4d/0x130
    netlink_unicast+0x1a3/0x250
    netlink_sendmsg+0x2ae/0x3a0
    sock_sendmsg+0x36/0x40
    ___sys_sendmsg+0x26f/0x2d0
    ? do_wp_page+0x8e/0x5f0
    ? handle_pte_fault+0x6c3/0xf50
    ? __handle_mm_fault+0x38e/0x520
    ? __sys_sendmsg+0x5e/0xa0
    __sys_sendmsg+0x5e/0xa0
    do_syscall_64+0x5b/0x180
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7f75a4583ba0
    Code: c3 48 8b 05 f2 62 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d fd c3 2c 00 00 75 10 b8 2e 00 00 00 0f 05 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 ae cc 00 00 48 89 04 24
    RSP: 002b:00007fff60ee7418 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007fff60ee7540 RCX: 00007f75a4583ba0
    RDX: 0000000000000000 RSI: 00007fff60ee7490 RDI: 0000000000000003
    RBP: 000000005b842d3e R08: 0000000000000002 R09: 0000000000000000
    R10: 00007fff60ee6ea0 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007fff60ee7554 R14: 0000000000000001 R15: 000000000066c100
    Modules linked in: act_pedit(E) ip6table_filter ip6_tables iptable_filter binfmt_misc crct10dif_pclmul ext4 crc32_pclmul mbcache ghash_clmulni_intel jbd2 pcbc snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer cryptd glue_helper snd joydev pcspkr soundcore virtio_balloon i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi virtio_net net_failover virtio_blk virtio_console failover qxl crc32c_intel drm_kms_helper syscopyarea serio_raw sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix virtio_pci libata virtio_ring i2c_core virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_pedit]
    CR2: 0000000000000000

    Like it's done for other TC actions, give up dumping pedit rules and return
    an error if nla_nest_start() returns NULL.

    Fixes: 71d0ed7079df ("net/act_pedit: Support using offset relative to the conventional network headers")
    Signed-off-by: Davide Caratti
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
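
The "check the nest pointer before writing through it" pattern can be sketched in userspace. The buffer layout, `DEMO_HDR` and the `demo_*` functions are hypothetical; only the idea of giving up with an error when the nest cannot be started comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_HDR 4  /* hypothetical size of a nested-attribute header */

struct demo_msg {
    unsigned char buf[16];
    size_t used;
};

/* Reserves a nest header; returns NULL when the buffer has no room left. */
static unsigned char *demo_nest_start(struct demo_msg *m)
{
    unsigned char *start;

    if (sizeof(m->buf) - m->used < DEMO_HDR)
        return NULL;
    start = m->buf + m->used;
    m->used += DEMO_HDR;
    return start;
}

/* Dumps nkeys nested entries; gives up with -1 on the first failure. */
static int demo_dump_keys(struct demo_msg *m, int nkeys)
{
    assert(m != NULL);
    for (int i = 0; i < nkeys; i++) {
        unsigned char *nest = demo_nest_start(m);

        if (!nest)                    /* the check that was missing */
            return -1;
        nest[0] = (unsigned char)i;   /* safe: nest is known to be non-NULL */
    }
    return 0;
}
```

With a 16-byte buffer and 4-byte headers, four entries fit and the fifth attempt reports an error instead of writing through a NULL pointer.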
     
  • [ Upstream commit 98c8f125fd8a6240ea343c1aa50a1be9047791b8 ]

    Via u32_change(), TCA_U32_SEL has an unspecified type in the netlink
    policy, so max length isn't enforced, only minimum. This means nkeys
    (from userspace) was being trusted without checking the actual size of
    nla_len(), which could lead to a memory over-read, and ultimately an
    exposure via a call to u32_dump(). Reachability is CAP_NET_ADMIN within
    a namespace.

    Reported-by: Al Viro
    Cc: Jamal Hadi Salim
    Cc: Cong Wang
    Cc: Jiri Pirko
    Cc: "David S. Miller"
    Cc: netdev@vger.kernel.org
    Signed-off-by: Kees Cook
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kees Cook
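
The missing bound can be sketched as a comparison between the attribute length and the size implied by the user-supplied key count. The structures below are simplified, hypothetical stand-ins (`demo_sel`, `demo_key`), not the real u32 selector layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_key { uint32_t mask, val; };

struct demo_sel {
    uint8_t nkeys;            /* supplied by userspace, so untrusted */
    struct demo_key keys[];   /* flexible array of nkeys entries */
};

/* Returns 0 when attr_len covers the header plus all nkeys keys, else -1. */
static int demo_check_sel(const struct demo_sel *s, size_t attr_len)
{
    size_t need;

    assert(s != NULL);
    if (attr_len < sizeof(*s))     /* header itself must be present */
        return -1;
    need = sizeof(*s) + (size_t)s->nkeys * sizeof(struct demo_key);
    return attr_len < need ? -1 : 0;
}
```

The header check runs first so that `nkeys` is only read once the attribute is known to contain it; the second check then rejects any count that claims more keys than the attribute actually carries.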
     
  • [ Upstream commit 6d784f1625ea68783cc1fb17de8f6cd3e1660c3f ]

    Immediately after module_put(), user could delete this
    module, so e->ops could be already freed before we call
    e->ops->release().

    Fix this by moving module_put() after ops->release().

    Fixes: ef6980b6becb ("introduce IFE action")
    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
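
The ordering fix can be sketched with a toy reference count. This is an assumption-laden model: here the "module" disappears as soon as its count reaches zero, whereas in the kernel a user would have to remove it in that window. All `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct demo_module {
    int refcnt;
    int loaded;               /* cleared once the last reference is dropped */
};

struct demo_meta_ops {
    struct demo_module *owner;
    int released;             /* counts successful release calls */
};

/* Drops one reference; the module may go away when the count hits zero. */
static void demo_module_put(struct demo_module *m)
{
    assert(m->refcnt > 0);
    if (--m->refcnt == 0)
        m->loaded = 0;
}

/* Stand-in for ops->release(): only valid while the owner is loaded. */
static int demo_release(struct demo_meta_ops *ops)
{
    if (!ops->owner->loaded)
        return -1;            /* in the kernel this would be a use-after-free */
    ops->released++;
    return 0;
}

/* Fixed ordering: call into the module first, drop the reference last. */
static int demo_cleanup(struct demo_meta_ops *ops)
{
    int err = demo_release(ops);

    demo_module_put(ops->owner);
    return err;
}
```

Calling `demo_release()` after the put, as the final test line does, shows why the reversed order is unsafe: the callback runs against a module that is already gone.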
     

24 Aug, 2018

1 commit

  • [ Upstream commit 38230a3e0e0933bbcf5df6fa469ba0667f667568 ]

    The control action in the common member of struct tcf_tunnel_key must be a
    valid value, as it can contain the chain index when 'goto chain' is used.
    Ensure that the control action can be read as x->tcfa_action, when x is a
    pointer to struct tc_action and x->ops->type is TCA_ACT_TUNNEL_KEY, to
    prevent the following command:

    # tc filter add dev $h2 ingress protocol ip pref 1 handle 101 flower \
    > $tcflags dst_mac $h2mac action tunnel_key unset goto chain 1

    from causing a NULL dereference when a matching packet is received:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
    PGD 80000001097ac067 P4D 80000001097ac067 PUD 103b0a067 PMD 0
    Oops: 0000 [#1] SMP PTI
    CPU: 0 PID: 3491 Comm: mausezahn Tainted: G E 4.18.0-rc2.auguri+ #421
    Hardware name: Hewlett-Packard HP Z220 CMT Workstation/1790, BIOS K51 v01.58 02/07/2013
    RIP: 0010:tcf_action_exec+0xb8/0x100
    Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
    RSP: 0018:ffff95145ea03c40 EFLAGS: 00010246
    RAX: 0000000020000001 RBX: ffff9514499e5800 RCX: 0000000000000001
    RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
    RBP: ffff95145ea03e60 R08: 0000000000000000 R09: ffff95145ea03c9c
    R10: ffff95145ea03c78 R11: 0000000000000008 R12: ffff951456a69800
    R13: ffff951456a69808 R14: 0000000000000001 R15: ffff95144965ee40
    FS: 00007fd67ee11740(0000) GS:ffff95145ea00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 00000001038a2006 CR4: 00000000001606f0
    Call Trace:

    fl_classify+0x1ad/0x1c0 [cls_flower]
    ? __update_load_avg_se.isra.47+0x1ca/0x1d0
    ? __update_load_avg_se.isra.47+0x1ca/0x1d0
    ? update_load_avg+0x665/0x690
    ? update_load_avg+0x665/0x690
    ? kmem_cache_alloc+0x38/0x1c0
    tcf_classify+0x89/0x140
    __netif_receive_skb_core+0x5ea/0xb70
    ? enqueue_entity+0xd0/0x270
    ? process_backlog+0x97/0x150
    process_backlog+0x97/0x150
    net_rx_action+0x14b/0x3e0
    __do_softirq+0xde/0x2b4
    do_softirq_own_stack+0x2a/0x40

    do_softirq.part.18+0x49/0x50
    __local_bh_enable_ip+0x49/0x50
    __dev_queue_xmit+0x4ab/0x8a0
    ? wait_woken+0x80/0x80
    ? packet_sendmsg+0x38f/0x810
    ? __dev_queue_xmit+0x8a0/0x8a0
    packet_sendmsg+0x38f/0x810
    sock_sendmsg+0x36/0x40
    __sys_sendto+0x10e/0x140
    ? do_vfs_ioctl+0xa4/0x630
    ? syscall_trace_enter+0x1df/0x2e0
    ? __audit_syscall_exit+0x22a/0x290
    __x64_sys_sendto+0x24/0x30
    do_syscall_64+0x5b/0x180
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7fd67e18dc93
    Code: 48 8b 0d 18 83 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 59 c7 20 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 2b f7 ff ff 48 89 04 24
    RSP: 002b:00007ffe0189b748 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    RAX: ffffffffffffffda RBX: 00000000020ca010 RCX: 00007fd67e18dc93
    RDX: 0000000000000062 RSI: 00000000020ca322 RDI: 0000000000000003
    RBP: 00007ffe0189b780 R08: 00007ffe0189b760 R09: 0000000000000014
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000062
    R13: 00000000020ca322 R14: 00007ffe0189b760 R15: 0000000000000003
    Modules linked in: act_tunnel_key act_gact cls_flower sch_ingress vrf veth act_csum(E) xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_realtek coretemp snd_hda_codec_generic kvm_intel kvm irqbypass snd_hda_intel crct10dif_pclmul crc32_pclmul hp_wmi ghash_clmulni_intel pcbc snd_hda_codec aesni_intel sparse_keymap rfkill snd_hda_core snd_hwdep snd_seq crypto_simd iTCO_wdt gpio_ich iTCO_vendor_support wmi_bmof cryptd mei_wdt glue_helper snd_seq_device snd_pcm pcspkr snd_timer snd i2c_i801 lpc_ich sg soundcore wmi mei_me
    mei ie31200_edac nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod sr_mod cdrom i915 video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci crc32c_intel libahci serio_raw sfc libata mtd drm ixgbe mdio i2c_core e1000e dca
    CR2: 0000000000000000
    ---[ end trace 1ab8b5b5d4639dfc ]---
    RIP: 0010:tcf_action_exec+0xb8/0x100
    Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
    RSP: 0018:ffff95145ea03c40 EFLAGS: 00010246
    RAX: 0000000020000001 RBX: ffff9514499e5800 RCX: 0000000000000001
    RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
    RBP: ffff95145ea03e60 R08: 0000000000000000 R09: ffff95145ea03c9c
    R10: ffff95145ea03c78 R11: 0000000000000008 R12: ffff951456a69800
    R13: ffff951456a69808 R14: 0000000000000001 R15: ffff95144965ee40
    FS: 00007fd67ee11740(0000) GS:ffff95145ea00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 00000001038a2006 CR4: 00000000001606f0
    Kernel panic - not syncing: Fatal exception in interrupt
    Kernel Offset: 0x11400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
    ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

    Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key")
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     

22 Aug, 2018

3 commits

  • [ Upstream commit a51c76b4dfb30496dc65396a957ef0f06af7fb22 ]

    Add the missing tcf_unbind_filter() call in cls_matchall; without it,
    the WARN_ON() in cbq_destroy_class() is triggered.

    Fixes: fd62d9f5c575f ("net/sched: matchall: Fix configuration race")
    Reported-by: Li Shuang
    Signed-off-by: Hangbin Liu
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Hangbin Liu
     
  • [ Upstream commit 008369dcc5f7bfba526c98054f8525322acf0ea3 ]

    Li Shuang reported the following warn:

    [ 733.484610] WARNING: CPU: 6 PID: 21123 at net/sched/sch_cbq.c:1418 cbq_destroy_class+0x5d/0x70 [sch_cbq]
    [ 733.495190] Modules linked in: sch_cbq cls_tcindex sch_dsmark rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat l
    [ 733.574155] syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb ixgbe ahci libahci i2c_algo_bit libata i40e i2c_core dca mdio megaraid_sas dm_mirror dm_region_hash dm_log dm_mod
    [ 733.592500] CPU: 6 PID: 21123 Comm: tc Not tainted 4.18.0-rc8.latest+ #131
    [ 733.600169] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.1.5 04/11/2016
    [ 733.608518] RIP: 0010:cbq_destroy_class+0x5d/0x70 [sch_cbq]
    [ 733.614734] Code: e7 d9 d2 48 8b 7b 48 e8 61 05 da d2 48 8d bb f8 00 00 00 e8 75 ae d5 d2 48 39 eb 74 0a 48 89 df 5b 5d e9 16 6c 94 d2 5b 5d c3 0b eb b6 0f 1f 44 00 00 66 2e 0f 1f 84
    [ 733.635798] RSP: 0018:ffffbfbb066bb9d8 EFLAGS: 00010202
    [ 733.641627] RAX: 0000000000000001 RBX: ffff9cdd17392800 RCX: 000000008010000f
    [ 733.649588] RDX: ffff9cdd1df547e0 RSI: ffff9cdd17392800 RDI: ffff9cdd0f84c800
    [ 733.657547] RBP: ffff9cdd0f84c800 R08: 0000000000000001 R09: 0000000000000000
    [ 733.665508] R10: ffff9cdd0f84d000 R11: 0000000000000001 R12: 0000000000000001
    [ 733.673469] R13: 0000000000000000 R14: 0000000000000001 R15: ffff9cdd17392200
    [ 733.681430] FS: 00007f911890a740(0000) GS:ffff9cdd1f8c0000(0000) knlGS:0000000000000000
    [ 733.690456] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 733.696864] CR2: 0000000000b5544c CR3: 0000000859374002 CR4: 00000000001606e0
    [ 733.704826] Call Trace:
    [ 733.707554] cbq_destroy+0xa1/0xd0 [sch_cbq]
    [ 733.712318] qdisc_destroy+0x62/0x130
    [ 733.716401] dsmark_destroy+0x2a/0x70 [sch_dsmark]
    [ 733.721745] qdisc_destroy+0x62/0x130
    [ 733.725829] qdisc_graft+0x3ba/0x470
    [ 733.729817] tc_get_qdisc+0x2a6/0x2c0
    [ 733.733901] ? cred_has_capability+0x7d/0x130
    [ 733.738761] rtnetlink_rcv_msg+0x263/0x2d0
    [ 733.743330] ? rtnl_calcit.isra.30+0x110/0x110
    [ 733.748287] netlink_rcv_skb+0x4d/0x130
    [ 733.752576] netlink_unicast+0x1a3/0x250
    [ 733.756949] netlink_sendmsg+0x2ae/0x3a0
    [ 733.761324] sock_sendmsg+0x36/0x40
    [ 733.765213] ___sys_sendmsg+0x26f/0x2d0
    [ 733.769493] ? handle_pte_fault+0x586/0xdf0
    [ 733.774158] ? __handle_mm_fault+0x389/0x500
    [ 733.778919] ? __sys_sendmsg+0x5e/0xa0
    [ 733.783099] __sys_sendmsg+0x5e/0xa0
    [ 733.787087] do_syscall_64+0x5b/0x180
    [ 733.791171] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 733.796805] RIP: 0033:0x7f9117f23f10
    [ 733.800791] Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8
    [ 733.821873] RSP: 002b:00007ffe96818398 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    [ 733.830319] RAX: ffffffffffffffda RBX: 000000005b71244c RCX: 00007f9117f23f10
    [ 733.838280] RDX: 0000000000000000 RSI: 00007ffe968183e0 RDI: 0000000000000003
    [ 733.846241] RBP: 00007ffe968183e0 R08: 000000000000ffff R09: 0000000000000003
    [ 733.854202] R10: 00007ffe96817e20 R11: 0000000000000246 R12: 0000000000000000
    [ 733.862161] R13: 0000000000662ee0 R14: 0000000000000000 R15: 0000000000000000
    [ 733.870121] ---[ end trace 28edd4aad712ddca ]---

    This is because we didn't update f->result.res when creating a new filter. Then,
    in tcindex_delete() -> tcf_unbind_filter(), we fail to find the res and
    unbind the filter, which triggers the WARN_ON() in cbq_destroy_class().

    Fix it by updating f->result.res when creating a new filter.

    Fixes: 6e0565697a106 ("net_sched: fix another crash in cls_tcindex")
    Reported-by: Li Shuang
    Signed-off-by: Hangbin Liu
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Hangbin Liu
     
  • [ Upstream commit 2df8bee5654bb2b7312662ca6810d4dc16b0b67f ]

    Li Shuang reported the following crash:

    [ 71.267724] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
    [ 71.276456] PGD 800000085d9bd067 P4D 800000085d9bd067 PUD 859a0b067 PMD 0
    [ 71.284127] Oops: 0000 [#1] SMP PTI
    [ 71.288015] CPU: 12 PID: 2386 Comm: tc Not tainted 4.18.0-rc8.latest+ #131
    [ 71.295686] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.1.5 04/11/2016
    [ 71.304037] RIP: 0010:tcindex_delete+0x72/0x280 [cls_tcindex]
    [ 71.310446] Code: 00 31 f6 48 87 75 20 48 85 f6 74 11 48 8b 47 18 48 8b 40 08 48 8b 40 50 e8 fb a6 f8 fc 48 85 db 0f 84 dc 00 00 00 48 8b 73 18 56 04 48 8d 7e 04 85 d2 0f 84 7b 01 00
    [ 71.331517] RSP: 0018:ffffb45207b3f898 EFLAGS: 00010282
    [ 71.337345] RAX: ffff8ad3d72d6360 RBX: ffff8acc84393680 RCX: 000000000000002e
    [ 71.345306] RDX: ffff8ad3d72c8570 RSI: 0000000000000000 RDI: ffff8ad847a45800
    [ 71.353277] RBP: ffff8acc84393688 R08: ffff8ad3d72c8400 R09: 0000000000000000
    [ 71.361238] R10: ffff8ad3de786e00 R11: 0000000000000000 R12: ffffb45207b3f8c7
    [ 71.369199] R13: ffff8ad3d93bd2a0 R14: 000000000000002e R15: ffff8ad3d72c9600
    [ 71.377161] FS: 00007f9d3ec3e740(0000) GS:ffff8ad3df980000(0000) knlGS:0000000000000000
    [ 71.386188] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 71.392597] CR2: 0000000000000004 CR3: 0000000852f06003 CR4: 00000000001606e0
    [ 71.400558] Call Trace:
    [ 71.403299] tcindex_destroy_element+0x25/0x40 [cls_tcindex]
    [ 71.409611] tcindex_walk+0xbb/0x110 [cls_tcindex]
    [ 71.414953] tcindex_destroy+0x44/0x90 [cls_tcindex]
    [ 71.420492] ? tcindex_delete+0x280/0x280 [cls_tcindex]
    [ 71.426323] tcf_proto_destroy+0x16/0x40
    [ 71.430696] tcf_chain_flush+0x51/0x70
    [ 71.434876] tcf_block_put_ext.part.30+0x8f/0x1b0
    [ 71.440122] tcf_block_put+0x4d/0x70
    [ 71.444108] cbq_destroy+0x4d/0xd0 [sch_cbq]
    [ 71.448869] qdisc_destroy+0x62/0x130
    [ 71.452951] dsmark_destroy+0x2a/0x70 [sch_dsmark]
    [ 71.458300] qdisc_destroy+0x62/0x130
    [ 71.462373] qdisc_graft+0x3ba/0x470
    [ 71.466359] tc_get_qdisc+0x2a6/0x2c0
    [ 71.470443] ? cred_has_capability+0x7d/0x130
    [ 71.475307] rtnetlink_rcv_msg+0x263/0x2d0
    [ 71.479875] ? rtnl_calcit.isra.30+0x110/0x110
    [ 71.484832] netlink_rcv_skb+0x4d/0x130
    [ 71.489109] netlink_unicast+0x1a3/0x250
    [ 71.493482] netlink_sendmsg+0x2ae/0x3a0
    [ 71.497859] sock_sendmsg+0x36/0x40
    [ 71.501748] ___sys_sendmsg+0x26f/0x2d0
    [ 71.506029] ? handle_pte_fault+0x586/0xdf0
    [ 71.510694] ? __handle_mm_fault+0x389/0x500
    [ 71.515457] ? __sys_sendmsg+0x5e/0xa0
    [ 71.519636] __sys_sendmsg+0x5e/0xa0
    [ 71.523626] do_syscall_64+0x5b/0x180
    [ 71.527711] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 71.533345] RIP: 0033:0x7f9d3e257f10
    [ 71.537331] Code: c3 48 8b 05 82 6f 2c 00 f7 db 64 89 18 48 83 cb ff eb dd 0f 1f 80 00 00 00 00 83 3d 8d d0 2c 00 00 75 10 b8 2e 00 00 00 0f 05 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8
    [ 71.558401] RSP: 002b:00007fff6f893398 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    [ 71.566848] RAX: ffffffffffffffda RBX: 000000005b71274d RCX: 00007f9d3e257f10
    [ 71.574810] RDX: 0000000000000000 RSI: 00007fff6f8933e0 RDI: 0000000000000003
    [ 71.582770] RBP: 00007fff6f8933e0 R08: 000000000000ffff R09: 0000000000000003
    [ 71.590729] R10: 00007fff6f892e20 R11: 0000000000000246 R12: 0000000000000000
    [ 71.598689] R13: 0000000000662ee0 R14: 0000000000000000 R15: 0000000000000000
    [ 71.606651] Modules linked in: sch_cbq cls_tcindex sch_dsmark xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_coni
    [ 71.685425] libahci i2c_algo_bit i2c_core i40e libata dca mdio megaraid_sas dm_mirror dm_region_hash dm_log dm_mod
    [ 71.697075] CR2: 0000000000000004
    [ 71.700792] ---[ end trace f604eb1acacd978b ]---

    Reproducer:
    tc qdisc add dev lo handle 1:0 root dsmark indices 64 set_tc_index
    tc filter add dev lo parent 1:0 protocol ip prio 1 tcindex mask 0xfc shift 2
    tc qdisc add dev lo parent 1:0 handle 2:0 cbq bandwidth 10Mbit cell 8 avpkt 1000 mpu 64
    tc class add dev lo parent 2:0 classid 2:1 cbq bandwidth 10Mbit rate 1500Kbit avpkt 1000 prio 1 bounded isolated allot 1514 weight 1 maxburst 10
    tc filter add dev lo parent 2:0 protocol ip prio 1 handle 0x2e tcindex classid 2:1 pass_on
    tc qdisc add dev lo parent 2:1 pfifo limit 5
    tc qdisc del dev lo root

    This is because in tcindex_set_parms(), when there is no old_r, the new
    exts are stored in cr.exts but are never set on the filter when
    r == &new_filter_result.

    Then in tcindex_delete() -> tcf_exts_get_net(), we get a NULL pointer
    dereference because exts was never initialized.

    Fix it by moving tcf_exts_change() after "if (old_r && old_r != r)" check.
    Then we don't need "cr" as there is no errout after that.

    Fixes: bf63ac73b3e13 ("net_sched: fix an oops in tcindex filter")
    Reported-by: Li Shuang
    Signed-off-by: Hangbin Liu
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Hangbin Liu
     

22 Jul, 2018

1 commit

  • [ Upstream commit 7e85dc8cb35abf16455f1511f0670b57c1a84608 ]

    When blackhole is used on top of a classful qdisc like hfsc, it breaks the
    qlen and backlog counters because packets disappear without notice.

    In HFSC, a non-zero qlen while all classes are inactive triggers a warning:
    WARNING: ... at net/sched/sch_hfsc.c:1393 hfsc_dequeue+0xba4/0xe90 [sch_hfsc]
    and schedules watchdog work endlessly.

    This patch returns __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS;
    this flag tells the upper layer that the packet is gone and isn't queued.
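
    A minimal userspace sketch of why the extra flag matters, assuming a
    parent that counts a packet as queued only when the child returns exactly
    the success code. The TOY_* constants and toy_* functions are hypothetical
    stand-ins, and the flag value is illustrative rather than copied from the
    kernel headers:

```c
#include <assert.h>

/* Illustrative values; the real definitions live in the kernel headers. */
#define TOY_NET_XMIT_SUCCESS	0x00
#define TOY_NET_XMIT_BYPASS	0x00020000

struct toy_parent {
	unsigned int qlen;	/* packets the parent believes are queued below it */
};

/* Old blackhole behaviour: the packet is dropped but plain success is reported. */
static int toy_blackhole_old(void)
{
	return TOY_NET_XMIT_SUCCESS;
}

/* Fixed behaviour: success, plus a flag saying the packet is not queued. */
static int toy_blackhole_new(void)
{
	return TOY_NET_XMIT_SUCCESS | TOY_NET_XMIT_BYPASS;
}

/* The parent only accounts for the packet on an exact success code. */
static int toy_parent_enqueue(struct toy_parent *p, int (*child)(void))
{
	int ret = child();

	if (ret == TOY_NET_XMIT_SUCCESS)
		p->qlen++;
	return ret;
}
```

    With the old return value the parent's qlen grows although nothing is
    queued; with the flag set the parent leaves qlen untouched.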

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Konstantin Khlebnikov
     

26 Jun, 2018

1 commit

  • [ Upstream commit 8d499533e0bc02d44283dbdab03142b599b8ba16 ]

    use nla_strlcpy() to avoid copying data beyond the length of TCA_DEF_DATA
    netlink attribute, in case it is less than SIMP_MAX_DATA and it does not
    end with '\0' character.
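
    A userspace sketch of the bounded-copy idea, not the kernel helper itself:
    struct toy_attr and toy_attr_strlcpy() are hypothetical names that model
    an attribute payload with an explicit length and no guaranteed
    terminator:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a netlink attribute payload: raw bytes with an
 * explicit length and no guarantee of a trailing '\0'. */
struct toy_attr {
	const char *data;
	size_t len;
};

/* Copy at most dstsize - 1 payload bytes and always NUL-terminate dst.
 * Returns the number of bytes copied. Never reads past attr->len. */
static size_t toy_attr_strlcpy(char *dst, const struct toy_attr *attr,
			       size_t dstsize)
{
	size_t n = attr->len;

	if (dstsize == 0)
		return 0;
	if (n > dstsize - 1)
		n = dstsize - 1;
	memcpy(dst, attr->data, n);
	dst[n] = '\0';
	return n;
}
```

    The copy length is bounded by both the attribute length and the
    destination size, so a short, unterminated payload is never over-read.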

    v2: fix errors in the commit message, thanks Hangbin Liu

    Fixes: fa1b1cff3d06 ("net_cls_act: Make act_simple use of netlink policy.")
    Signed-off-by: Davide Caratti
    Reviewed-by: Simon Horman
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     

21 Jun, 2018

1 commit

  • [ Upstream commit af5d01842fe1fbfb9f5e1c1d957ba02ab6f4569a ]

    When an application fails to pass flags in the netlink TLV for a new
    skbedit action, the kernel oopses as follows:

    [ 8.307732] BUG: unable to handle kernel paging request at 0000000000021130
    [ 8.309167] PGD 80000000193d1067 P4D 80000000193d1067 PUD 180e0067 PMD 0
    [ 8.310595] Oops: 0000 [#1] SMP PTI
    [ 8.311334] Modules linked in: kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper serio_raw
    [ 8.314190] CPU: 1 PID: 397 Comm: tc Not tainted 4.17.0-rc3+ #357
    [ 8.315252] RIP: 0010:__tcf_idr_release+0x33/0x140
    [ 8.316203] RSP: 0018:ffffa0718038f840 EFLAGS: 00010246
    [ 8.317123] RAX: 0000000000000001 RBX: 0000000000021100 RCX: 0000000000000000
    [ 8.319831] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000021100
    [ 8.321181] RBP: 0000000000000000 R08: 000000000004adf8 R09: 0000000000000122
    [ 8.322645] R10: 0000000000000000 R11: ffffffff9e5b01ed R12: 0000000000000000
    [ 8.324157] R13: ffffffff9e0d3cc0 R14: 0000000000000000 R15: 0000000000000000
    [ 8.325590] FS: 00007f591292e700(0000) GS:ffff8fcf5bc40000(0000) knlGS:0000000000000000
    [ 8.327001] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 8.327987] CR2: 0000000000021130 CR3: 00000000180e6004 CR4: 00000000001606a0
    [ 8.329289] Call Trace:
    [ 8.329735] tcf_skbedit_init+0xa7/0xb0
    [ 8.330423] tcf_action_init_1+0x362/0x410
    [ 8.331139] ? try_to_wake_up+0x44/0x430
    [ 8.331817] tcf_action_init+0x103/0x190
    [ 8.332511] tc_ctl_action+0x11a/0x220
    [ 8.333174] rtnetlink_rcv_msg+0x23d/0x2e0
    [ 8.333902] ? _cond_resched+0x16/0x40
    [ 8.334569] ? __kmalloc_node_track_caller+0x5b/0x2c0
    [ 8.335440] ? rtnl_calcit.isra.31+0xf0/0xf0
    [ 8.336178] netlink_rcv_skb+0xdb/0x110
    [ 8.336855] netlink_unicast+0x167/0x220
    [ 8.337550] netlink_sendmsg+0x2a7/0x390
    [ 8.338258] sock_sendmsg+0x30/0x40
    [ 8.338865] ___sys_sendmsg+0x2c5/0x2e0
    [ 8.339531] ? pagecache_get_page+0x27/0x210
    [ 8.340271] ? filemap_fault+0xa2/0x630
    [ 8.340943] ? page_add_file_rmap+0x108/0x200
    [ 8.341732] ? alloc_set_pte+0x2aa/0x530
    [ 8.342573] ? finish_fault+0x4e/0x70
    [ 8.343332] ? __handle_mm_fault+0xbc1/0x10d0
    [ 8.344337] ? __sys_sendmsg+0x53/0x80
    [ 8.345040] __sys_sendmsg+0x53/0x80
    [ 8.345678] do_syscall_64+0x4f/0x100
    [ 8.346339] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 8.347206] RIP: 0033:0x7f591191da67
    [ 8.347831] RSP: 002b:00007fff745abd48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    [ 8.349179] RAX: ffffffffffffffda RBX: 00007fff745abe70 RCX: 00007f591191da67
    [ 8.350431] RDX: 0000000000000000 RSI: 00007fff745abdc0 RDI: 0000000000000003
    [ 8.351659] RBP: 000000005af35251 R08: 0000000000000001 R09: 0000000000000000
    [ 8.352922] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
    [ 8.354183] R13: 00007fff745afed0 R14: 0000000000000001 R15: 00000000006767c0
    [ 8.355400] Code: 41 89 d4 53 89 f5 48 89 fb e8 aa 20 fd ff 85 c0 0f 84 ed 00
    00 00 48 85 db 0f 84 cf 00 00 00 40 84 ed 0f 85 cd 00 00 00 45 84 e4 53 30
    74 0d 85 d2 b8 ff ff ff ff 0f 8f b3 00 00 00 8b 43 2c
    [ 8.358699] RIP: __tcf_idr_release+0x33/0x140 RSP: ffffa0718038f840
    [ 8.359770] CR2: 0000000000021130
    [ 8.360438] ---[ end trace 60c66be45dfc14f0 ]---

    The caller calls action's ->init() and passes pointer to "struct tc_action *a",
    which later may be initialized to point at the existing action, otherwise
    "struct tc_action *a" is still invalid, and therefore dereferencing it is an
    error as happens in tcf_idr_release, where refcnt is decremented.

    So in case of missing flags tcf_idr_release must be called only for
    existing actions.

    v2:
    - prepare patch for net tree

    Fixes: 5e1567aeb7fe ("net sched: skbedit action fix late binding")
    Signed-off-by: Roman Mashak
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Roman Mashak
     

12 Jun, 2018

1 commit

  • [ Upstream commit 8258d2da9f9f521dce7019e018360c28d116354e ]

    When we fail to modify a rule, we incorrectly release the idr handle
    of the unmodified old rule.

    Fix that by checking if we need to release it.

    Fixes: fe2502e49b58 ("net_sched: remove cls_flower idr on failure")
    Reported-by: Vlad Buslov
    Reviewed-by: Roi Dayan
    Acked-by: Jiri Pirko
    Signed-off-by: Paul Blakey
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paul Blakey
     

30 May, 2018

7 commits

  • [ Upstream commit f29cdfbe33d6915ba8056179b0041279a67e3647 ]

    tcf_skbmod_init() can fail after the idr has been successfully reserved.
    When this happens, every subsequent attempt to configure skbmod rules
    using the same idr value will systematically fail with -ENOSPC, unless
    the first attempt was done using the 'replace' keyword:

    # tc action add action skbmod swap mac index 100
    RTNETLINK answers: Cannot allocate memory
    We have an error talking to the kernel
    # tc action add action skbmod swap mac index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    # tc action add action skbmod swap mac index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    ...

    Fix this in tcf_skbmod_init(), ensuring that tcf_idr_release() is called
    on the error path when the idr has been reserved, but not yet inserted.
    Also, don't test 'ovr' in the error path, to avoid a failed 'replace'
    implicitly becoming a 'delete' that leaks a refcount in the act_skbmod module:

    # rmmod act_skbmod; modprobe act_skbmod
    # tc action add action skbmod swap mac index 100
    # tc action add action skbmod swap mac continue index 100
    RTNETLINK answers: File exists
    We have an error talking to the kernel
    # tc action replace action skbmod swap mac continue index 100
    RTNETLINK answers: Cannot allocate memory
    We have an error talking to the kernel
    # tc action list action skbmod
    #
    # rmmod act_skbmod
    rmmod: ERROR: Module act_skbmod is in use

    Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit 1e46ef1762bb2e52f0f996131a4d16ed4e9fd065 ]

    __tcf_ipt_init() can fail after the idr has been successfully reserved.
    When this happens, subsequent attempts to configure xt/ipt rules using
    the same idr value systematically fail with -ENOSPC:

    # tc action add action xt -j LOG --log-prefix test1 index 100
    tablename: mangle hook: NF_IP_POST_ROUTING
    target: LOG level warning prefix "test1" index 100
    RTNETLINK answers: Cannot allocate memory
    We have an error talking to the kernel
    Command "(null)" is unknown, try "tc actions help".
    # tc action add action xt -j LOG --log-prefix test1 index 100
    tablename: mangle hook: NF_IP_POST_ROUTING
    target: LOG level warning prefix "test1" index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    Command "(null)" is unknown, try "tc actions help".
    # tc action add action xt -j LOG --log-prefix test1 index 100
    tablename: mangle hook: NF_IP_POST_ROUTING
    target: LOG level warning prefix "test1" index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    ...

    Fix this in the error path of __tcf_ipt_init(), calling tcf_idr_release()
    in place of tcf_idr_cleanup(). Since tcf_ipt_release() can now be called
    when tcfi_t is NULL, we also need to protect calls to ipt_destroy_target()
    to avoid NULL pointer dereference.

    Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit 94fa3f929ec0c048b1f3658cc335b940df4f6d22 ]

    tcf_pedit_init() can fail to allocate 'keys' after the idr has been
    successfully reserved. When this happens, subsequent attempts to configure
    a pedit rule using the same idr value systematically fail with -ENOSPC:

    # tc action add action pedit munge ip ttl set 63 index 100
    RTNETLINK answers: Cannot allocate memory
    We have an error talking to the kernel
    # tc action add action pedit munge ip ttl set 63 index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    # tc action add action pedit munge ip ttl set 63 index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    ...

    Fix this in the error path of tcf_act_pedit_init(), calling
    tcf_idr_release() in place of tcf_idr_cleanup().
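
    The failure mode can be modelled in userspace with a plain reservation
    table. This is only a sketch: toy_reserve(), toy_release() and
    toy_action_init() are hypothetical names, and the table stands in for the
    per-action idr:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_MAX_INDEX 128

/* Hypothetical index table standing in for the per-action idr. */
static bool toy_reserved[TOY_MAX_INDEX];

static int toy_reserve(unsigned int index)
{
	if (index >= TOY_MAX_INDEX || toy_reserved[index])
		return -ENOSPC;
	toy_reserved[index] = true;
	return 0;
}

static void toy_release(unsigned int index)
{
	if (index < TOY_MAX_INDEX)
		toy_reserved[index] = false;
}

/* Init that reserves an index and then performs a step that may fail.
 * release_on_error selects between the leaky and the fixed error path. */
static int toy_action_init(unsigned int index, bool later_step_fails,
			   bool release_on_error)
{
	int err = toy_reserve(index);

	if (err)
		return err;
	if (later_step_fails) {
		if (release_on_error)
			toy_release(index);
		return -ENOMEM;
	}
	return 0;
}
```

    Without the release, the first failure returns -ENOMEM and every later
    attempt on the same index returns -ENOSPC, matching the transcript above.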

    Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit 5bf7f8185f7c7112decdfe3d3e5c5d5e67f099a1 ]

    tcf_act_police_init() can fail after the idr has been successfully
    reserved (e.g., qdisc_get_rtab() may return NULL). When this happens,
    subsequent attempts to configure a police rule using the same idr value
    systematically fail with -ENOSPC:

    # tc action add action police rate 1000 burst 1000 drop index 100
    RTNETLINK answers: Cannot allocate memory
    We have an error talking to the kernel
    # tc action add action police rate 1000 burst 1000 drop index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    # tc action add action police rate 1000 burst 1000 drop index 100
    RTNETLINK answers: No space left on device
    ...

    Fix this in the error path of tcf_act_police_init(), calling
    tcf_idr_release() in place of tcf_idr_cleanup().

    Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit 60e10b3adc3bac0f6a894c28e0eb1f2d13607362 ]

    if the kernel fails to duplicate 'sdata', creation of a new action fails
    with -ENOMEM. However, subsequent attempts to install the same action
    using the same value of 'index' systematically fail with -ENOSPC, and
    that value of 'index' will no longer be usable by act_simple until rmmod /
    insmod of act_simple.ko is done:

    # tc actions add action simple sdata hello index 100
    # tc actions list action simple

    action order 0: Simple
    index 100 ref 1 bind 0
    # tc actions flush action simple
    # tc actions add action simple sdata hello index 100
    RTNETLINK answers: Cannot allocate memory
    We have an error talking to the kernel
    # tc actions flush action simple
    # tc actions add action simple sdata hello index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    # tc actions add action simple sdata hello index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel
    ...

    Fix this in the error path of tcf_simp_init(), calling tcf_idr_release()
    in place of tcf_idr_cleanup().

    Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
    Suggested-by: Cong Wang
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit bbc09e7842a5023ba5bc0f8d559b9dd464e44006 ]

    when the following command sequence is entered

    # tc action add action bpf bytecode '4,40 0 0 12,31 0 1 2048,6 0 0 262144,6 0 0 0' index 100
    RTNETLINK answers: Invalid argument
    We have an error talking to the kernel
    # tc action add action bpf bytecode '4,40 0 0 12,21 0 1 2048,6 0 0 262144,6 0 0 0' index 100
    RTNETLINK answers: No space left on device
    We have an error talking to the kernel

    act_bpf correctly refuses to install the first TC rule, because 31 is not
    a valid instruction. However, it refuses to install the second TC rule,
    even if the BPF code is correct. Furthermore, it's no more possible to
    install any other rule having the same value of 'index' until act_bpf
    module is unloaded/inserted again. After the idr has been reserved, call
    tcf_idr_release() instead of tcf_idr_cleanup(), to fix this issue.

    Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR")
    Acked-by: Jamal Hadi Salim
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit 1f110e7cae09e6c6a144616480d1a9dd99c5208a ]

    when the following command

    # tc action add action sample rate 100 group 100 index 100

    is run for the first time, and psample_group_get(100) fails to create a
    new group, tcf_sample_cleanup() calls psample_group_put(NULL), thus
    causing the following error:

    BUG: unable to handle kernel NULL pointer dereference at 000000000000001c
    IP: psample_group_put+0x15/0x71 [psample]
    PGD 8000000075775067 P4D 8000000075775067 PUD 7453c067 PMD 0
    Oops: 0002 [#1] SMP PTI
    Modules linked in: act_sample(E) psample ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core mbcache jbd2 crct10dif_pclmul snd_hwdep crc32_pclmul snd_seq ghash_clmulni_intel pcbc snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev pcspkr i2c_piix4 soundcore virtio_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_net ata_piix virtio_console virtio_blk libata serio_raw crc32c_intel virtio_pci i2c_core virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_tunnel_key]
    CPU: 2 PID: 5740 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:psample_group_put+0x15/0x71 [psample]
    RSP: 0018:ffffb8a80032f7d0 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000024
    RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffc06d93c0
    RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
    R10: 00000000bd003000 R11: ffff979fba04aa59 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: ffff979fbba3f22c
    FS: 00007f7638112740(0000) GS:ffff979fbfd00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000000000001c CR3: 00000000734ea001 CR4: 00000000001606e0
    Call Trace:
    __tcf_idr_release+0x79/0xf0
    tcf_sample_init+0x125/0x1d0 [act_sample]
    tcf_action_init_1+0x2cc/0x430
    tcf_action_init+0xd3/0x1b0
    tc_ctl_action+0x18b/0x240
    rtnetlink_rcv_msg+0x29c/0x310
    ? _cond_resched+0x15/0x30
    ? __kmalloc_node_track_caller+0x1b9/0x270
    ? rtnl_calcit.isra.28+0x100/0x100
    netlink_rcv_skb+0xd2/0x110
    netlink_unicast+0x17c/0x230
    netlink_sendmsg+0x2cd/0x3c0
    sock_sendmsg+0x30/0x40
    ___sys_sendmsg+0x27a/0x290
    ? filemap_map_pages+0x34a/0x3a0
    ? __handle_mm_fault+0xbfd/0xe20
    __sys_sendmsg+0x51/0x90
    do_syscall_64+0x6e/0x1a0
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    RIP: 0033:0x7f7637523ba0
    RSP: 002b:00007fff0473ef58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007fff0473f080 RCX: 00007f7637523ba0
    RDX: 0000000000000000 RSI: 00007fff0473efd0 RDI: 0000000000000003
    RBP: 000000005aaaac80 R08: 0000000000000002 R09: 0000000000000000
    R10: 00007fff0473e9e0 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007fff0473f094 R14: 0000000000000001 R15: 0000000000669f60
    Code: be 02 00 00 00 48 89 df e8 a9 fe ff ff e9 7c ff ff ff 0f 1f 40 00 0f 1f 44 00 00 53 48 89 fb 48 c7 c7 c0 93 6d c0 e8 db 20 8c ef 6b 1c 01 74 10 48 c7 c7 c0 93 6d c0 ff 14 25 e8 83 83 b0 5b
    RIP: psample_group_put+0x15/0x71 [psample] RSP: ffffb8a80032f7d0
    CR2: 000000000000001c

    Fix it in tcf_sample_cleanup(), ensuring that calls to psample_group_put(p)
    are done only when p is not NULL.
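
    A sketch of the guarded cleanup pattern in userspace C. The toy_* types
    and functions are hypothetical and only mirror the shape of the fix (a
    cleanup callback that may run on a half-initialized object):

```c
#include <assert.h>
#include <stddef.h>

struct toy_group {
	int refcount;
};

struct toy_action {
	struct toy_group *group;	/* NULL if init failed before lookup */
};

/* Drops one reference; would fault if called with a NULL group. */
static void toy_group_put(struct toy_group *g)
{
	g->refcount--;
}

/* Cleanup that tolerates a half-initialized action. */
static void toy_action_cleanup(struct toy_action *a)
{
	if (a->group)
		toy_group_put(a->group);
	a->group = NULL;
}
```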

    Fixes: cadb9c9fdbc6 ("net/sched: act_sample: Fix error path in init")
    Signed-off-by: Davide Caratti
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     

25 May, 2018

2 commits

  • [ Upstream commit 44a63b137f7b6e4c7bd6c9cc21615941cb36509d ]

    Hangbin reported an Oops triggered by the syzkaller qdisc rules:

    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] SMP KASAN PTI
    Modules linked in: sch_red
    CPU: 0 PID: 28699 Comm: syz-executor5 Not tainted 4.17.0-rc4.kcov #1
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:qdisc_hash_add+0x26/0xa0
    RSP: 0018:ffff8800589cf470 EFLAGS: 00010203
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff824ad971
    RDX: 0000000000000007 RSI: ffffc9000ce9f000 RDI: 000000000000003c
    RBP: 0000000000000001 R08: ffffed000b139ea2 R09: ffff8800589cf4f0
    R10: ffff8800589cf50f R11: ffffed000b139ea2 R12: ffff880054019fc0
    R13: ffff880054019fb4 R14: ffff88005c0af600 R15: ffff880054019fb0
    FS: 00007fa6edcb1700(0000) GS:ffff88005ce00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020000740 CR3: 000000000fc16000 CR4: 00000000000006f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    red_change+0x2d2/0xed0 [sch_red]
    qdisc_create+0x57e/0xef0
    tc_modify_qdisc+0x47f/0x14e0
    rtnetlink_rcv_msg+0x6a8/0x920
    netlink_rcv_skb+0x2a2/0x3c0
    netlink_unicast+0x511/0x740
    netlink_sendmsg+0x825/0xc30
    sock_sendmsg+0xc5/0x100
    ___sys_sendmsg+0x778/0x8e0
    __sys_sendmsg+0xf5/0x1b0
    do_syscall_64+0xbd/0x3b0
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x450869
    RSP: 002b:00007fa6edcb0c48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007fa6edcb16b4 RCX: 0000000000450869
    RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000013
    RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
    R13: 0000000000008778 R14: 0000000000702838 R15: 00007fa6edcb1700
    Code: e9 0b fe ff ff 0f 1f 44 00 00 55 53 48 89 fb 89 f5 e8 3f 07 f3 fe 48 8d 7b 3c 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 b6 14 02 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 04 84 d2 75 51
    RIP: qdisc_hash_add+0x26/0xa0 RSP: ffff8800589cf470

    When a red qdisc is updated with a 0 limit, the child qdisc is left
    unmodified and no additional scheduler is created in red_change();
    the 'child' local variable is rightfully NULL and must not be added
    to the hash table.

    This change addresses the above issue by moving qdisc_hash_add() right
    after the child qdisc creation. It additionally removes unneeded checks
    for noop_qdisc.

    Reported-by: Hangbin Liu
    Fixes: 49b499718fa1 ("net: sched: make default fifo qdiscs appear in the dump")
    Signed-off-by: Paolo Abeni
    Acked-by: Jiri Kosina
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     
  • [ Upstream commit 5a4931ae0193f8a4a97e8260fd0df1d705d83299 ]

    Similarly to what was done with commit a52956dfc503 ("net sched actions:
    fix refcnt leak in skbmod"), fix the error path of tcf_vlan_init() to avoid
    refcnt leaks when a wrong value of TCA_VLAN_PUSH_VLAN_PROTOCOL is given.

    Fixes: 5026c9b1bafc ("net sched: vlan action fix late binding")
    CC: Roman Mashak
    Signed-off-by: Davide Caratti
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     

19 May, 2018

3 commits

  • [ Upstream commit d68d75fdc34b0253c2bded7ed18cd60eb5a9599b ]

    In case modules are not configured, error out when tp->ops is null
    and prevent later null pointer dereference.

    Fixes: 33a48927c193 ("sched: push TC filter protocol creation into a separate function")
    Signed-off-by: Jiri Pirko
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jiri Pirko
     
  • [ Upstream commit 7df40c2673a1307c3260aab6f9d4b9bf97ca8fd7 ]

    Normally, a socket can not be freed/reused unless all its TX packets
    left qdisc and were TX-completed. However connect(AF_UNSPEC) allows
    this to happen.

    With commit fc59d5bdf1e3 ("pkt_sched: fq: clear time_next_packet for
    reused flows") we cleared f->time_next_packet but took no special
    action if the flow was still in the throttled rb-tree.

    Since f->time_next_packet is the key used in the rb-tree searches,
    blindly clearing it might break rb-tree integrity. We need to make
    sure the flow is no longer in the rb-tree to avoid this problem.
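
    The invariant can be illustrated with a simpler ordered structure. The
    sketch below swaps the rb-tree for a sorted singly linked list and uses
    hypothetical toy_* names; it only shows why a node must be unlinked
    before its ordering key is reset:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_flow {
	unsigned long time_next_packet;	/* ordering key */
	struct toy_flow *next;
	bool throttled;			/* true while linked into the list */
};

/* Insert keeping ascending key order. */
static void toy_throttle(struct toy_flow **head, struct toy_flow *f)
{
	while (*head && (*head)->time_next_packet <= f->time_next_packet)
		head = &(*head)->next;
	f->next = *head;
	*head = f;
	f->throttled = true;
}

/* Unlink f from the list if it is present. */
static void toy_unthrottle(struct toy_flow **head, struct toy_flow *f)
{
	for (; *head; head = &(*head)->next) {
		if (*head == f) {
			*head = f->next;
			f->next = NULL;
			f->throttled = false;
			return;
		}
	}
}

/* Reset a flow for reuse: unlink it before touching the key. */
static void toy_reuse_flow(struct toy_flow **head, struct toy_flow *f)
{
	if (f->throttled)
		toy_unthrottle(head, f);
	f->time_next_packet = 0;
}

/* Check that keys are in non-decreasing order along the list. */
static bool toy_is_sorted(const struct toy_flow *head)
{
	for (; head && head->next; head = head->next)
		if (head->time_next_packet > head->next->time_next_packet)
			return false;
	return true;
}
```

    Clearing the key of a linked node in place leaves the list out of order,
    whereas toy_reuse_flow() keeps it sorted by unlinking first.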

    Fixes: fc59d5bdf1e3 ("pkt_sched: fq: clear time_next_packet for reused flows")
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit a52956dfc503f8cc5cfe6454959b7049fddb4413 ]

    When an application fails to pass flags in the netlink TLV when replacing
    an existing skbmod action, the kernel will leak refcnt:

    $ tc actions get action skbmod index 1
    total acts 0

    action order 0: skbmod pipe set smac 00:11:22:33:44:55
    index 1 ref 1 bind 0

    For example, at this point a buggy application replaces the action with
    index 1 with new smac 00:aa:22:33:44:55. The replace fails because of zero
    flags, but refcnt still gets bumped:

    $ tc actions get actions skbmod index 1
    total acts 0

    action order 0: skbmod pipe set smac 00:11:22:33:44:55
    index 1 ref 2 bind 0
    $

    The patch fixes this by calling tcf_idr_release() on existing actions.

    Fixes: 86da71b57383d ("net_sched: Introduce skbmod action")
    Signed-off-by: Roman Mashak
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Roman Mashak
     

29 Apr, 2018

2 commits

  • [ Upstream commit cc74eddd0ff325d57373cea99f642b787d7f76f5 ]

    There is currently no check for an invalid TLV length. This patch adds
    such a check to avoid killing the kernel with a malformed ife packet.
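
    A minimal sketch of this kind of length validation, in userspace C. The
    layout (2-byte type, 2-byte big-endian length covering header plus value,
    no padding) and the name toy_tlv_count() are assumptions made for
    illustration, not the ife wire format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TLV_HDR 4	/* 2-byte type + 2-byte length, an assumed layout */

/* Walk a buffer of TLVs whose length field covers header plus value.
 * Returns the number of well-formed TLVs, or -1 on a malformed length. */
static int toy_tlv_count(const uint8_t *buf, size_t len)
{
	int count = 0;

	while (len > 0) {
		size_t tlv_len;

		if (len < TOY_TLV_HDR)
			return -1;	/* truncated header */
		tlv_len = ((size_t)buf[2] << 8) | buf[3];	/* big-endian */
		if (tlv_len < TOY_TLV_HDR || tlv_len > len)
			return -1;	/* length too small or past the end */
		buf += tlv_len;
		len -= tlv_len;
		count++;
	}
	return count;
}
```

    Rejecting a length below the header size also prevents the walk from
    looping forever on a zero-length TLV.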

    Signed-off-by: Alexander Aring
    Reviewed-by: Yotam Gigi
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexander Aring
     
  • [ Upstream commit f6cd14537ff9919081be19b9c53b9b19c0d3ea97 ]

    We need to record stats for received metadata that we don't know how
    to process. Have find_decode_metaid() return -ENOENT to capture this.

    Signed-off-by: Alexander Aring
    Reviewed-by: Yotam Gigi
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexander Aring
     

12 Apr, 2018

4 commits

  • [ Upstream commit 2d433610176d6569e8b3a28f67bc72235bf69efc ]

    when the following command

    # tc action replace action skbmod swap mac index 100

    is run for the first time, and tcf_skbmod_init() fails to allocate struct
    tcf_skbmod_params, tcf_skbmod_cleanup() calls kfree_rcu(NULL), thus
    causing the following error:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    IP: __call_rcu+0x23/0x2b0
    PGD 8000000034057067 P4D 8000000034057067 PUD 74937067 PMD 0
    Oops: 0002 [#1] SMP PTI
    Modules linked in: act_skbmod(E) psample ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec crct10dif_pclmul mbcache jbd2 crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep pcbc snd_seq snd_seq_device snd_pcm aesni_intel snd_timer crypto_simd glue_helper snd cryptd virtio_balloon joydev soundcore pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_net virtio_blk ata_piix libata crc32c_intel virtio_pci serio_raw virtio_ring virtio i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_skbmod]
    CPU: 3 PID: 3144 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:__call_rcu+0x23/0x2b0
    RSP: 0018:ffffbd2e403e7798 EFLAGS: 00010246
    RAX: ffffffffc0872080 RBX: ffff981d34bff780 RCX: 00000000ffffffff
    RDX: ffffffff922a5f00 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000021f
    R10: 000000003d003000 R11: 0000000000aaaaaa R12: 0000000000000000
    R13: ffffffff922a5f00 R14: 0000000000000001 R15: ffff981d3b698c2c
    FS: 00007f3678292740(0000) GS:ffff981d3fd80000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000008 CR3: 000000007c57a006 CR4: 00000000001606e0
    Call Trace:
    __tcf_idr_release+0x79/0xf0
    tcf_skbmod_init+0x1d1/0x210 [act_skbmod]
    tcf_action_init_1+0x2cc/0x430
    tcf_action_init+0xd3/0x1b0
    tc_ctl_action+0x18b/0x240
    rtnetlink_rcv_msg+0x29c/0x310
    ? _cond_resched+0x15/0x30
    ? __kmalloc_node_track_caller+0x1b9/0x270
    ? rtnl_calcit.isra.28+0x100/0x100
    netlink_rcv_skb+0xd2/0x110
    netlink_unicast+0x17c/0x230
    netlink_sendmsg+0x2cd/0x3c0
    sock_sendmsg+0x30/0x40
    ___sys_sendmsg+0x27a/0x290
    ? filemap_map_pages+0x34a/0x3a0
    ? __handle_mm_fault+0xbfd/0xe20
    __sys_sendmsg+0x51/0x90
    do_syscall_64+0x6e/0x1a0
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    RIP: 0033:0x7f36776a3ba0
    RSP: 002b:00007fff4703b618 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007fff4703b740 RCX: 00007f36776a3ba0
    RDX: 0000000000000000 RSI: 00007fff4703b690 RDI: 0000000000000003
    RBP: 000000005aaaba36 R08: 0000000000000002 R09: 0000000000000000
    R10: 00007fff4703b0a0 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007fff4703b754 R14: 0000000000000001 R15: 0000000000669f60
    Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89
    RIP: __call_rcu+0x23/0x2b0 RSP: ffffbd2e403e7798
    CR2: 0000000000000008

    Fix it in tcf_skbmod_cleanup(), ensuring that kfree_rcu(p, ...) is called
    only when p is not NULL.

    Fixes: 86da71b57383 ("net_sched: Introduce skbmod action")
    Signed-off-by: Davide Caratti
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit abdadd3cfd3e7ea3da61ac774f84777d1f702058 ]

    when the following command

    # tc action add action tunnel_key unset index 100

    is run for the first time, and tunnel_key_init() fails to allocate struct
    tcf_tunnel_key_params, tunnel_key_release() dereferences NULL pointers.
    This causes the following error:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: tunnel_key_release+0xd/0x40 [act_tunnel_key]
    PGD 8000000033787067 P4D 8000000033787067 PUD 74646067 PMD 0
    Oops: 0000 [#1] SMP PTI
    Modules linked in: act_tunnel_key(E) act_csum ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel snd_hda_intel pcbc snd_hda_codec snd_hda_core snd_hwdep snd_seq aesni_intel snd_seq_device crypto_simd glue_helper snd_pcm cryptd joydev snd_timer pcspkr virtio_balloon snd i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk drm virtio_console crc32c_intel ata_piix serio_raw i2c_core virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod
    CPU: 2 PID: 3101 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:tunnel_key_release+0xd/0x40 [act_tunnel_key]
    RSP: 0018:ffffba46803b7768 EFLAGS: 00010286
    RAX: ffffffffc09010a0 RBX: 0000000000000000 RCX: 0000000000000024
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff99ee336d7480
    RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
    R10: 0000000000000220 R11: ffff99ee79d73131 R12: 0000000000000000
    R13: ffff99ee32d67610 R14: ffff99ee7671dc38 R15: 00000000fffffff4
    FS: 00007febcb2cd740(0000) GS:ffff99ee7fd00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000010 CR3: 000000007c8e4005 CR4: 00000000001606e0
    Call Trace:
    __tcf_idr_release+0x79/0xf0
    tunnel_key_init+0xd9/0x460 [act_tunnel_key]
    tcf_action_init_1+0x2cc/0x430
    tcf_action_init+0xd3/0x1b0
    tc_ctl_action+0x18b/0x240
    rtnetlink_rcv_msg+0x29c/0x310
    ? _cond_resched+0x15/0x30
    ? __kmalloc_node_track_caller+0x1b9/0x270
    ? rtnl_calcit.isra.28+0x100/0x100
    netlink_rcv_skb+0xd2/0x110
    netlink_unicast+0x17c/0x230
    netlink_sendmsg+0x2cd/0x3c0
    sock_sendmsg+0x30/0x40
    ___sys_sendmsg+0x27a/0x290
    __sys_sendmsg+0x51/0x90
    do_syscall_64+0x6e/0x1a0
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    RIP: 0033:0x7febca6deba0
    RSP: 002b:00007ffe7b0dd128 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007ffe7b0dd250 RCX: 00007febca6deba0
    RDX: 0000000000000000 RSI: 00007ffe7b0dd1a0 RDI: 0000000000000003
    RBP: 000000005aaa90cb R08: 0000000000000002 R09: 0000000000000000
    R10: 00007ffe7b0dcba0 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007ffe7b0dd264 R14: 0000000000000001 R15: 0000000000669f60
    Code: 44 00 00 8b 0d b5 23 00 00 48 8b 87 48 10 00 00 48 8b 3c c8 e9 a5 e5 d8 c3 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 9f b0 00 00 00 7b 10 01 74 0b 48 89 df 31 f6 5b e9 f2 fa 7f c3 48 8b 7b 18
    RIP: tunnel_key_release+0xd/0x40 [act_tunnel_key] RSP: ffffba46803b7768
    CR2: 0000000000000010

    Fix this in tunnel_key_release(), ensuring 'param' is not NULL before
    dereferencing it.
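
    A minimal user-space sketch of the guard described above. The names
    demo_params, demo_action and demo_release are hypothetical stand-ins
    chosen for illustration; they do not reproduce the kernel structures or
    the upstream patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the action and its parameters. */
struct demo_params {
	int action;      /* placeholder for the set/unset mode */
	void *metadata;  /* placeholder for a resource owned by the params */
};

struct demo_action {
	struct demo_params *params; /* stays NULL if init failed before allocating */
};

/*
 * Release callback that tolerates a half-initialised action.
 * Returns 1 if parameters were freed, 0 if there was nothing to free.
 */
int demo_release(struct demo_action *a)
{
	struct demo_params *p;

	assert(a != NULL);
	p = a->params;
	if (!p)          /* init failed early: nothing to dereference or free */
		return 0;

	free(p->metadata);
	free(p);
	a->params = NULL;
	return 1;
}
```

    Calling demo_release() on an action whose params pointer is still NULL
    returns 0 instead of dereferencing the pointer.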

    Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key")
    Signed-off-by: Davide Caratti
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit 3239534a79ee6f20cffd974173a1e62e0730e8ac ]

    When tcf_bpf_init_from_ops() fails (e.g. because the program has an
    invalid number of instructions), tcf_bpf_cfg_cleanup() calls
    bpf_prog_put(NULL) or bpf_prog_destroy(NULL). Unless CONFIG_BPF_SYSCALL
    is unset, this causes the following error:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
    PGD 800000007345a067 P4D 800000007345a067 PUD 340e1067 PMD 0
    Oops: 0000 [#1] SMP PTI
    Modules linked in: act_bpf(E) ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic pcbc snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd glue_helper cryptd joydev snd_timer snd virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_blk drm virtio_net virtio_console i2c_core crc32c_intel serio_raw virtio_pci ata_piix libata virtio_ring floppy virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_bpf]
    CPU: 3 PID: 5654 Comm: tc Tainted: G E 4.16.0.bpf_test+ #408
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:__bpf_prog_put+0xc/0xc0
    RSP: 0018:ffff9594003ef728 EFLAGS: 00010202
    RAX: 0000000000000000 RBX: ffff9594003ef758 RCX: 0000000000000024
    RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
    RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
    R10: 0000000000000220 R11: ffff8a7ab9f17131 R12: 0000000000000000
    R13: ffff8a7ab7c3c8e0 R14: 0000000000000001 R15: ffff8a7ab88f1054
    FS: 00007fcb2f17c740(0000) GS:ffff8a7abfd80000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000020 CR3: 000000007c888006 CR4: 00000000001606e0
    Call Trace:
    tcf_bpf_cfg_cleanup+0x2f/0x40 [act_bpf]
    tcf_bpf_cleanup+0x4c/0x70 [act_bpf]
    __tcf_idr_release+0x79/0x140
    tcf_bpf_init+0x125/0x330 [act_bpf]
    tcf_action_init_1+0x2cc/0x430
    ? get_page_from_freelist+0x3f0/0x11b0
    tcf_action_init+0xd3/0x1b0
    tc_ctl_action+0x18b/0x240
    rtnetlink_rcv_msg+0x29c/0x310
    ? _cond_resched+0x15/0x30
    ? __kmalloc_node_track_caller+0x1b9/0x270
    ? rtnl_calcit.isra.29+0x100/0x100
    netlink_rcv_skb+0xd2/0x110
    netlink_unicast+0x17c/0x230
    netlink_sendmsg+0x2cd/0x3c0
    sock_sendmsg+0x30/0x40
    ___sys_sendmsg+0x27a/0x290
    ? mem_cgroup_commit_charge+0x80/0x130
    ? page_add_new_anon_rmap+0x73/0xc0
    ? do_anonymous_page+0x2a2/0x560
    ? __handle_mm_fault+0xc75/0xe20
    __sys_sendmsg+0x58/0xa0
    do_syscall_64+0x6e/0x1a0
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    RIP: 0033:0x7fcb2e58eba0
    RSP: 002b:00007ffc93c496c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007ffc93c497f0 RCX: 00007fcb2e58eba0
    RDX: 0000000000000000 RSI: 00007ffc93c49740 RDI: 0000000000000003
    RBP: 000000005ac6a646 R08: 0000000000000002 R09: 0000000000000000
    R10: 00007ffc93c49120 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007ffc93c49804 R14: 0000000000000001 R15: 000000000066afa0
    Code: 5f 00 48 8b 43 20 48 c7 c7 70 2f 7c b8 c7 40 10 00 00 00 00 5b e9 a5 8b 61 00 0f 1f 44 00 00 0f 1f 44 00 00 41 54 55 48 89 fd 53 8b 47 20 f0 ff 08 74 05 5b 5d 41 5c c3 41 89 f4 0f 1f 44 00
    RIP: __bpf_prog_put+0xc/0xc0 RSP: ffff9594003ef728
    CR2: 0000000000000020

    Fix it in tcf_bpf_cfg_cleanup(), ensuring that bpf_prog_{put,destroy}(f)
    is called only when f is not NULL.
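
    The shape of such a cleanup helper can be sketched in user space as
    follows. Everything here (demo_prog, demo_cfg, demo_cfg_cleanup and the
    return codes) is a hypothetical model, assuming a config that holds
    either a reference-counted program or a privately owned one; it is not
    the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical program object with a reference count. */
struct demo_prog {
	int refcnt;
};

struct demo_cfg {
	struct demo_prog *filter; /* NULL if loading the program failed */
	bool is_ebpf;             /* selects which release path applies */
};

enum demo_cleanup_result { DEMO_NOTHING = 0, DEMO_PUT = 1, DEMO_DESTROYED = 2 };

/* Release the program held by cfg, but only if there is one. */
enum demo_cleanup_result demo_cfg_cleanup(const struct demo_cfg *cfg)
{
	struct demo_prog *f;

	assert(cfg != NULL);
	f = cfg->filter;
	if (!f)                   /* failed init: never touch a NULL program */
		return DEMO_NOTHING;

	if (cfg->is_ebpf) {
		f->refcnt--;      /* shared program: drop our reference */
		return DEMO_PUT;
	}
	f->refcnt = 0;            /* private program: tear it down */
	return DEMO_DESTROYED;
}
```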

    Fixes: bbc09e7842a5 ("net/sched: fix idr leak on the error path of tcf_bpf_init()")
    Reported-by: Lucas Bates
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davide Caratti
     
  • [ Upstream commit 734549eb550c0c720bc89e50501f1b1e98cdd841 ]

    Fix a bug in the tcf_dump_walker() function that can cause some actions
    to not be reported when dumping a large number of actions. This issue
    became more aggravated when the cookies feature was added. In particular,
    the issue manifests when large cookie values are assigned to the
    actions and enough actions are created that the resulting table
    must be dumped in multiple batches.

    The number of actions returned in each batch is limited by the total
    number of actions and the memory buffer size. With small cookies,
    the numeric limit is reached before the buffer size limit, which avoids
    the code path triggering this bug. When large cookies are used, the
    buffer fills before the numeric limit, and the erroneous code path is hit.

    For example after creating 32 csum actions with the cookie
    aaaabbbbccccdddd

    $ tc actions ls action csum
    total acts 26

    action order 0: csum (tcp) action continue
    index 1 ref 1 bind 0
    cookie aaaabbbbccccdddd

    .....

    action order 25: csum (tcp) action continue
    index 26 ref 1 bind 0
    cookie aaaabbbbccccdddd
    total acts 6

    action order 0: csum (tcp) action continue
    index 28 ref 1 bind 0
    cookie aaaabbbbccccdddd

    ......

    action order 5: csum (tcp) action continue
    index 32 ref 1 bind 0
    cookie aaaabbbbccccdddd

    Note that the action with index 27 is omitted from the report.
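
    The skipped entry at a batch boundary can be reproduced with a simplified
    user-space model. The function demo_dump_batch, its parameters and the
    undo_on_full switch are hypothetical and only illustrate the class of
    bug (a resume cursor advanced past an entry that did not fit); this is
    not the tcf_dump_walker() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Emit one batch of entries whose encoded sizes are given in size[].
 * *cursor is the resume position kept between batches.
 * Returns the number of entries written to out[].
 */
size_t demo_dump_batch(const size_t *size, size_t n, size_t *cursor,
		       size_t budget, size_t *out, bool undo_on_full)
{
	size_t emitted = 0, used = 0;
	size_t index;

	assert(size != NULL && cursor != NULL && out != NULL);
	index = *cursor;

	while (index < n) {
		size_t cur = index++;  /* entry is counted as visited first */

		if (used + size[cur] > budget) {
			if (undo_on_full)
				index--; /* not emitted: revisit it next batch */
			break;
		}
		used += size[cur];
		out[emitted++] = cur;
	}
	*cursor = index;
	return emitted;
}
```

    With undo_on_full set to false, the entry that overflowed the buffer is
    never emitted, because the next batch resumes one position too far.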

    Fixes: 4b3550ef530c ("[NET_SCHED]: Use nla_nest_start/nla_nest_end")
    Signed-off-by: Craig Dillabaugh
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Craig Dillabaugh
     

01 Apr, 2018

2 commits

  • [ Upstream commit 35d889d10b649fda66121891ec05eca88150059d ]

    When we exceed the current packet limit and have more than one
    segment in the list returned by skb_gso_segment(), netem drops
    only the first one, skipping the rest, hence kmemleak reports:

    unreferenced object 0xffff880b5d23b600 (size 1024):
    comm "softirq", pid 0, jiffies 4384527763 (age 2770.629s)
    hex dump (first 32 bytes):
    00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00 ..#]............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] __alloc_skb+0xc9/0x520
    [] skb_segment+0x8c8/0x3710
    [] tcp_gso_segment+0x331/0x1830
    [] inet_gso_segment+0x476/0x1370
    [] skb_mac_gso_segment+0x1f9/0x510
    [] __skb_gso_segment+0x1dd/0x620
    [] netem_enqueue+0x1536/0x2590 [sch_netem]
    [] __dev_queue_xmit+0x1167/0x2120
    [] ip_finish_output2+0x998/0xf00
    [] ip_output+0x1aa/0x2c0
    [] tcp_transmit_skb+0x18db/0x3670
    [] tcp_write_xmit+0x4d4/0x58c0
    [] tcp_tasklet_func+0x3d9/0x540
    [] tasklet_action+0x1ca/0x250
    [] __do_softirq+0x1b4/0x5a3
    [] irq_exit+0x1e2/0x210

    Fix it by adding the rest of the segments, if any, to skb 'to_free'
    list. Add new __qdisc_drop_all() and qdisc_drop_all() functions
    because they can be useful in the future if we need to drop segmented
    GSO packets in other places.
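
    The idea of dropping a whole segment list can be sketched in user space
    as below. demo_seg, demo_drop_all and demo_list_len are hypothetical
    names, and the sketch finds the tail by walking the chain, which is an
    assumption made for simplicity rather than a description of how the new
    kernel helpers are implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a packet segment in a singly linked list. */
struct demo_seg {
	struct demo_seg *next;
};

/* Queue every segment of the chain starting at head onto *to_free. */
void demo_drop_all(struct demo_seg *head, struct demo_seg **to_free)
{
	struct demo_seg *tail = head;

	assert(head != NULL && to_free != NULL);
	while (tail->next)         /* find the last segment of the chain */
		tail = tail->next;
	tail->next = *to_free;     /* splice the chain in front of the list */
	*to_free = head;
}

/* Count the segments in a list; useful to check nothing was left behind. */
size_t demo_list_len(const struct demo_seg *s)
{
	size_t n = 0;

	for (; s; s = s->next)
		n++;
	return n;
}
```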

    Fixes: 6071bd1aa13e ("netem: Segment GSO packets on enqueue")
    Signed-off-by: Alexey Kodanev
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexey Kodanev
     
  • [ Upstream commit 51d4740f88affd85d49c04e3c9cd129c0e33bcb9 ]

    If the set/unset mode of the tunnel_key action is not provided, ->init()
    still returns 0 and the caller proceeds with a bogus 'struct tc_action *'
    object, which results in a crash:

    % tc actions add action tunnel_key src_ip 1.1.1.1 dst_ip 2.2.2.1 id 7 index 1

    [ 35.805515] general protection fault: 0000 [#1] SMP PTI
    [ 35.806161] Modules linked in: act_tunnel_key kvm_intel kvm irqbypass
    crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64
    crypto_simd glue_helper cryptd serio_raw
    [ 35.808233] CPU: 1 PID: 428 Comm: tc Not tainted 4.16.0-rc4+ #286
    [ 35.808929] RIP: 0010:tcf_action_init+0x90/0x190
    [ 35.809457] RSP: 0018:ffffb8edc068b9a0 EFLAGS: 00010206
    [ 35.810053] RAX: 1320c000000a0003 RBX: 0000000000000001 RCX: 0000000000000000
    [ 35.810866] RDX: 0000000000000070 RSI: 0000000000007965 RDI: ffffb8edc068b910
    [ 35.811660] RBP: ffffb8edc068b9d0 R08: 0000000000000000 R09: ffffb8edc068b808
    [ 35.812463] R10: ffffffffc02bf040 R11: 0000000000000040 R12: ffffb8edc068bb38
    [ 35.813235] R13: 0000000000000000 R14: 0000000000000000 R15: ffffb8edc068b910
    [ 35.814006] FS: 00007f3d0d8556c0(0000) GS:ffff91d1dbc40000(0000)
    knlGS:0000000000000000
    [ 35.814881] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 35.815540] CR2: 000000000043f720 CR3: 0000000019248001 CR4: 00000000001606a0
    [ 35.816457] Call Trace:
    [ 35.817158] tc_ctl_action+0x11a/0x220
    [ 35.817795] rtnetlink_rcv_msg+0x23d/0x2e0
    [ 35.818457] ? __slab_alloc+0x1c/0x30
    [ 35.819079] ? __kmalloc_node_track_caller+0xb1/0x2b0
    [ 35.819544] ? rtnl_calcit.isra.30+0xe0/0xe0
    [ 35.820231] netlink_rcv_skb+0xce/0x100
    [ 35.820744] netlink_unicast+0x164/0x220
    [ 35.821500] netlink_sendmsg+0x293/0x370
    [ 35.822040] sock_sendmsg+0x30/0x40
    [ 35.822508] ___sys_sendmsg+0x2c5/0x2e0
    [ 35.823149] ? pagecache_get_page+0x27/0x220
    [ 35.823714] ? filemap_fault+0xa2/0x640
    [ 35.824423] ? page_add_file_rmap+0x108/0x200
    [ 35.825065] ? alloc_set_pte+0x2aa/0x530
    [ 35.825585] ? finish_fault+0x4e/0x70
    [ 35.826140] ? __handle_mm_fault+0xbc1/0x10d0
    [ 35.826723] ? __sys_sendmsg+0x41/0x70
    [ 35.827230] __sys_sendmsg+0x41/0x70
    [ 35.827710] do_syscall_64+0x68/0x120
    [ 35.828195] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    [ 35.828859] RIP: 0033:0x7f3d0ca4da67
    [ 35.829331] RSP: 002b:00007ffc9f284338 EFLAGS: 00000246 ORIG_RAX:
    000000000000002e
    [ 35.830304] RAX: ffffffffffffffda RBX: 00007ffc9f284460 RCX: 00007f3d0ca4da67
    [ 35.831247] RDX: 0000000000000000 RSI: 00007ffc9f2843b0 RDI: 0000000000000003
    [ 35.832167] RBP: 000000005aa6a7a9 R08: 0000000000000001 R09: 0000000000000000
    [ 35.833075] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000
    [ 35.833997] R13: 00007ffc9f2884c0 R14: 0000000000000001 R15: 0000000000674640
    [ 35.834923] Code: 24 30 bb 01 00 00 00 45 31 f6 eb 5e 8b 50 08 83 c2 07 83 e2
    fc 83 c2 70 49 8b 07 48 8b 40 70 48 85 c0 74 10 48 89 14 24 4c 89 ff d0 48
    8b 14 24 48 01 c2 49 01 d6 45 85 ed 74 05 41 83 47 2c
    [ 35.837442] RIP: tcf_action_init+0x90/0x190 RSP: ffffb8edc068b9a0
    [ 35.838291] ---[ end trace a095c06ee4b97a26 ]---
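
    The failure mode above (an init routine that returns 0 although no valid
    mode was given) can be illustrated with a user-space sketch. The enum
    values and demo_init are hypothetical, and rejecting the request with
    -EINVAL is shown as one plausible way to avoid the problem, not as a
    copy of the upstream change.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mode values; DEMO_MODE_NONE models "mode not provided". */
enum demo_mode { DEMO_MODE_NONE = 0, DEMO_MODE_SET = 1, DEMO_MODE_RELEASE = 2 };

/*
 * Initialise an action for the given mode.
 * *configured is set to 1 only when a usable object was set up.
 */
int demo_init(int mode, int *configured)
{
	assert(configured != NULL);
	*configured = 0;

	switch (mode) {
	case DEMO_MODE_SET:
	case DEMO_MODE_RELEASE:
		*configured = 1;
		return 0;
	default:
		/* No valid mode: fail rather than report success with nothing set up. */
		return -EINVAL;
	}
}
```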

    Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key")
    Signed-off-by: Roman Mashak
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Roman Mashak
     

19 Mar, 2018

1 commit

  • [ Upstream commit 7bbde83b1860c28a1cc35516352c4e7e5172c29a ]

    In qdisc_graft_qdisc a "new" qdisc is attached and the 'qdisc_destroy'
    operation is called on the old qdisc. The destroy operation waits for
    an RCU grace period and then calls qdisc_rcu_free(), at which point
    gso_cpu_skb is freed along with all stats, so there is no need to zero
    stats and gso_cpu_skb from the graft operation itself.

    Further, after dropping the qdisc locks we cannot continue to call
    qdisc_reset() before waiting an RCU grace period so that the qdisc is
    detached from all CPUs. By removing the qdisc_reset() here we get
    the correct property of waiting an RCU grace period and letting the
    qdisc_destroy operation clean up the qdisc correctly.

    Note that a refcnt greater than 1 would cause the destroy operation to
    be aborted; however, if this ever happened, the reference to the qdisc
    would be lost and we would have a memory leak.

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    John Fastabend
     

09 Mar, 2018

2 commits

  • [ Upstream commit d7cdee5ea8d28ae1b6922deb0c1badaa3aa0ef8c ]

    Li Shuang reported an Oops with cls_u32 due to a use-after-free
    in u32_destroy_key(). The use-after-free can be triggered with:

    dev=lo
    tc qdisc add dev $dev root handle 1: htb default 10
    tc filter add dev $dev parent 1: prio 5 handle 1: protocol ip u32 divisor 256
    tc filter add dev $dev protocol ip parent 1: prio 5 u32 ht 800:: match ip dst\
    10.0.0.0/8 hashkey mask 0x0000ff00 at 16 link 1:
    tc qdisc del dev $dev root

    This causes the following KASAN splat:

    ==================================================================
    BUG: KASAN: use-after-free in u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
    Read of size 4 at addr ffff881b83dae618 by task kworker/u48:5/571

    CPU: 17 PID: 571 Comm: kworker/u48:5 Not tainted 4.15.0+ #87
    Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
    Workqueue: tc_filter_workqueue u32_delete_key_freepf_work [cls_u32]
    Call Trace:
    dump_stack+0xd6/0x182
    ? dma_virt_map_sg+0x22e/0x22e
    print_address_description+0x73/0x290
    kasan_report+0x277/0x360
    ? u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
    u32_destroy_key.constprop.21+0x117/0x140 [cls_u32]
    u32_delete_key_freepf_work+0x1c/0x30 [cls_u32]
    process_one_work+0xae0/0x1c80
    ? sched_clock+0x5/0x10
    ? pwq_dec_nr_in_flight+0x3c0/0x3c0
    ? _raw_spin_unlock_irq+0x29/0x40
    ? trace_hardirqs_on_caller+0x381/0x570
    ? _raw_spin_unlock_irq+0x29/0x40
    ? finish_task_switch+0x1e5/0x760
    ? finish_task_switch+0x208/0x760
    ? preempt_notifier_dec+0x20/0x20
    ? __schedule+0x839/0x1ee0
    ? check_noncircular+0x20/0x20
    ? firmware_map_remove+0x73/0x73
    ? find_held_lock+0x39/0x1c0
    ? worker_thread+0x434/0x1820
    ? lock_contended+0xee0/0xee0
    ? lock_release+0x1100/0x1100
    ? init_rescuer.part.16+0x150/0x150
    ? retint_kernel+0x10/0x10
    worker_thread+0x216/0x1820
    ? process_one_work+0x1c80/0x1c80
    ? lock_acquire+0x1a5/0x540
    ? lock_downgrade+0x6b0/0x6b0
    ? sched_clock+0x5/0x10
    ? lock_release+0x1100/0x1100
    ? compat_start_thread+0x80/0x80
    ? do_raw_spin_trylock+0x190/0x190
    ? _raw_spin_unlock_irq+0x29/0x40
    ? trace_hardirqs_on_caller+0x381/0x570
    ? _raw_spin_unlock_irq+0x29/0x40
    ? finish_task_switch+0x1e5/0x760
    ? finish_task_switch+0x208/0x760
    ? preempt_notifier_dec+0x20/0x20
    ? __schedule+0x839/0x1ee0
    ? kmem_cache_alloc_trace+0x143/0x320
    ? firmware_map_remove+0x73/0x73
    ? sched_clock+0x5/0x10
    ? sched_clock_cpu+0x18/0x170
    ? find_held_lock+0x39/0x1c0
    ? schedule+0xf3/0x3b0
    ? lock_downgrade+0x6b0/0x6b0
    ? __schedule+0x1ee0/0x1ee0
    ? do_wait_intr_irq+0x340/0x340
    ? do_raw_spin_trylock+0x190/0x190
    ? _raw_spin_unlock_irqrestore+0x32/0x60
    ? process_one_work+0x1c80/0x1c80
    ? process_one_work+0x1c80/0x1c80
    kthread+0x312/0x3d0
    ? kthread_create_worker_on_cpu+0xc0/0xc0
    ret_from_fork+0x3a/0x50

    Allocated by task 1688:
    kasan_kmalloc+0xa0/0xd0
    __kmalloc+0x162/0x380
    u32_change+0x1220/0x3c9e [cls_u32]
    tc_ctl_tfilter+0x1ba6/0x2f80
    rtnetlink_rcv_msg+0x4f0/0x9d0
    netlink_rcv_skb+0x124/0x320
    netlink_unicast+0x430/0x600
    netlink_sendmsg+0x8fa/0xd60
    sock_sendmsg+0xb1/0xe0
    ___sys_sendmsg+0x678/0x980
    __sys_sendmsg+0xc4/0x210
    do_syscall_64+0x232/0x7f0
    return_from_SYSCALL_64+0x0/0x75

    Freed by task 112:
    kasan_slab_free+0x71/0xc0
    kfree+0x114/0x320
    rcu_process_callbacks+0xc3f/0x1600
    __do_softirq+0x2bf/0xc06

    The buggy address belongs to the object at ffff881b83dae600
    which belongs to the cache kmalloc-4096 of size 4096
    The buggy address is located 24 bytes inside of
    4096-byte region [ffff881b83dae600, ffff881b83daf600)
    The buggy address belongs to the page:
    page:ffffea006e0f6a00 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0
    flags: 0x17ffffc0008100(slab|head)
    raw: 0017ffffc0008100 0000000000000000 0000000000000000 0000000100070007
    raw: dead000000000100 dead000000000200 ffff880187c0e600 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff881b83dae500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff881b83dae580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    >ffff881b83dae600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ^
    ffff881b83dae680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ffff881b83dae700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ==================================================================

    The problem is that the htnode is freed before the linked knodes, and the
    latter will try to access the former at u32_destroy_key() time.
    This change addresses the issue using the htnode refcnt to guarantee
    the correct free order. While at it, also add an RCU annotation
    to keep sparse happy.
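
    How a reference count can enforce the free order is sketched below in
    user space. demo_table and demo_key are hypothetical stand-ins for the
    htnode and the knodes linked to it, and the freed_flag field exists only
    so the order can be observed; none of this mirrors the real cls_u32
    data structures.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hash table node that keys link to. */
struct demo_table {
	int refcnt;       /* one for the owner plus one per linked key */
	int *freed_flag;  /* set to 1 when the table is really freed */
};

struct demo_key {
	struct demo_table *link;
};

struct demo_table *demo_table_new(int *freed_flag)
{
	struct demo_table *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->refcnt = 1;
	t->freed_flag = freed_flag;
	return t;
}

/* Drop one reference; the table is freed only when the last one goes. */
void demo_table_put(struct demo_table *t)
{
	assert(t != NULL && t->refcnt > 0);
	if (--t->refcnt == 0) {
		if (t->freed_flag)
			*t->freed_flag = 1;
		free(t);
	}
}

/* Create a key linked to t; the key pins the table with its own reference. */
struct demo_key *demo_key_new(struct demo_table *t)
{
	struct demo_key *k = malloc(sizeof(*k));

	if (!k)
		return NULL;
	t->refcnt++;
	k->link = t;
	return k;
}

void demo_key_destroy(struct demo_key *k)
{
	/* The table is still valid here because this key holds a reference. */
	demo_table_put(k->link);
	free(k);
}
```

    Even if the owner drops its reference first, the table outlives every
    key that links to it and is freed by the last demo_key_destroy() call.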

    v1 -> v2: use rtnl_dereference() instead of RCU read locks
    v2 -> v3:
    - don't check refcnt in u32_destroy_hnode()
    - cleaned-up u32_destroy() implementation
    - cleaned-up code comment
    v3 -> v4:
    - dropped unneeded comment

    Reported-by: Li Shuang
    Fixes: c0d378ef1266 ("net_sched: use tcf_queue_work() in u32 filter")
    Signed-off-by: Paolo Abeni
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paolo Abeni
     
  • [ Upstream commit eb53f7af6f15285e2f6ada97285395343ce9f433 ]

    The following sequence is currently broken:

    # tc qdisc add dev foo ingress
    # tc filter replace dev foo protocol all ingress \
    u32 match u8 0 0 action mirred egress mirror dev bar1
    # tc filter replace dev foo protocol all ingress \
    handle 800::800 pref 49152 \
    u32 match u8 0 0 action mirred egress mirror dev bar2
    Error: cls_u32: Key node flags do not match passed flags.
    We have an error talking to the kernel, -1

    The error comes from u32_change() when comparing new and
    existing flags. The existing ones always contain one of the
    TCA_CLS_FLAGS_{,NOT}_IN_HW flags, depending on offloading state.
    These flags cannot be passed from userspace, so the condition
    (n->flags != flags) in u32_change() is always true and the request
    is rejected.

    Fix the condition so the flags TCA_CLS_FLAGS_NOT_IN_HW and
    TCA_CLS_FLAGS_IN_HW are not taken into account.
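
    A comparison that ignores the two status flags can be sketched as
    follows. The DEMO_FLAG_* names and bit positions are placeholders chosen
    for this example, and demo_flags_conflict is a hypothetical helper; the
    sketch shows the masking idea only and is not the upstream condition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit assignments for illustration. */
#define DEMO_FLAG_SKIP_HW   (1u << 0)  /* user-settable */
#define DEMO_FLAG_SKIP_SW   (1u << 1)  /* user-settable */
#define DEMO_FLAG_IN_HW     (1u << 2)  /* status, set by the kernel side */
#define DEMO_FLAG_NOT_IN_HW (1u << 3)  /* status, set by the kernel side */

/*
 * Return true if the stored flags and the requested flags differ in any
 * bit other than the two offload status bits.
 */
bool demo_flags_conflict(uint32_t existing, uint32_t requested)
{
	const uint32_t status = DEMO_FLAG_IN_HW | DEMO_FLAG_NOT_IN_HW;

	return ((existing ^ requested) & ~status) != 0;
}
```

    A stored value of DEMO_FLAG_NOT_IN_HW compared against a request with
    no flags is therefore not treated as a mismatch, while a difference in
    DEMO_FLAG_SKIP_SW still is.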

    Fixes: 24d3dc6d27ea ("net/sched: cls_u32: Reflect HW offload status")
    Signed-off-by: Ivan Vecera
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Ivan Vecera