25 Sep, 2020

1 commit

  • All TC actions call tcf_idr_insert() for new action at the end
    of their ->init(), so we can actually move it to a central place
    in tcf_action_init_1().

    Once the action is inserted into the global IDR, a parallel process
    could free it immediately because its refcnt is still 1, so
    tcf_action_init_1() must not fail after the insertion. Move the
    insertion after the goto action validation to avoid handling a failure
    case after insertion.

    This was found during code review; it was not directly triggered by
    syzbot. It also prepares for the next patch.

    Cc: Vlad Buslov
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
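
The ordering argument in this commit (validate first, publish last) can be sketched in user-space C. This is only an illustration under stated assumptions: the `registry` array stands in for the global IDR, and `action_init_sketch()`, `action_lookup_sketch()`, `MAX_ACTIONS` and `OPCODE_GOTO_CHAIN` are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTIONS 8          /* hypothetical table size */
#define OPCODE_GOTO_CHAIN 1    /* hypothetical control opcode */

struct action {
    int index;
    int opcode;
    const void *goto_chain;    /* must be non-NULL for OPCODE_GOTO_CHAIN */
};

/* Stand-in for the global IDR: once a pointer is stored here, another
 * thread could look it up and release it. */
static struct action *registry[MAX_ACTIONS];

/* Returns 0 on success, -1 if validation fails. */
int action_init_sketch(struct action *a)
{
    /* 1. Every check that can fail runs before publication. */
    if (a->index < 0 || a->index >= MAX_ACTIONS)
        return -1;
    if (a->opcode == OPCODE_GOTO_CHAIN && a->goto_chain == NULL)
        return -1;

    /* 2. Publishing is the last step; nothing after it can fail. */
    registry[a->index] = a;
    return 0;
}

/* Returns the published action, or NULL if the slot is empty or invalid. */
struct action *action_lookup_sketch(int index)
{
    if (index < 0 || index >= MAX_ACTIONS)
        return NULL;
    return registry[index];
}
```

Because a rejected action never reaches the registry, no error path has to undo an insertion that another thread may already have observed.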
     

11 Jul, 2020

1 commit


08 Jul, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
    fall-through markings where applicable.

    [1] https://www.kernel.org/doc/html/latest/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
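
A minimal user-space sketch of the conversion described here, assuming a GCC- or Clang-style compiler. The macro below mirrors the usual attribute-based definition with a no-op fallback; `levels_to_process()` is a hypothetical function used only to show the keyword in place of a comment.

```c
#include <assert.h>

/* Compilers without __has_attribute get the no-op fallback below. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if __has_attribute(__fallthrough__)
# define fallthrough __attribute__((__fallthrough__))
#else
# define fallthrough do {} while (0)  /* fallthrough */
#endif

/* Hypothetical example: each case adds one level and deliberately
 * continues into the next case. */
int levels_to_process(int kind)
{
    int n = 0;

    switch (kind) {
    case 2:
        n++;
        fallthrough;
    case 1:
        n++;
        fallthrough;
    case 0:
        n++;
        break;
    default:
        n = -1;
        break;
    }
    return n;
}
```

Unlike a comment, the statement form is visible to the compiler, so an unintended fall-through elsewhere can still be flagged by -Wimplicit-fallthrough.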
     

04 Jul, 2020

1 commit

  • There are a couple of places in net/sched/ that check skb->protocol and act
    on the value there. However, in the presence of VLAN tags, the value stored
    in skb->protocol can be inconsistent based on whether VLAN acceleration is
    enabled. The commit quoted in the Fixes tag below fixed the users of
    skb->protocol to use a helper that will always see the VLAN ethertype.

    However, most of the callers don't actually handle the VLAN ethertype, but
    expect to find the IP header type in the protocol field. This means that
    things like changing the ECN field, or parsing diffserv values, stop
    working if there's a VLAN tag, or if there are multiple nested VLAN
    tags (QinQ).

    To fix this, change the helper to take an argument that indicates whether
    the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
    make sure to skip all of them, so behaviour is consistent even in QinQ
    mode.

    To make the helper usable from the ECN code, move it to if_vlan.h instead
    of pkt_sched.h.

    v3:
    - Remove empty lines
    - Move vlan variable definitions inside loop in skb_protocol()
    - Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
    bpf_skb_ecn_set_ce()

    v2:
    - Use eth_type_vlan() helper in skb_protocol()
    - Also fix code that reads skb->protocol directly
    - Change a couple of 'if/else if' statements to switch constructs to avoid
    calling the helper twice

    Reported-by: Ilya Ponetayev
    Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: David S. Miller

    Toke Høiland-Jørgensen
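
The skip-all-VLAN-tags behaviour can be illustrated on a raw Ethernet frame in user-space C. This is a sketch under assumptions, not the kernel helper: it parses in-frame tags only and ignores the hardware-accelerated case where the tag is kept in skb metadata. `frame_protocol_sketch()` and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_P_IP     0x0800
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8
#define ETH_HLEN     14   /* two 6-byte MAC addresses + 2-byte EtherType */
#define VLAN_HLEN    4    /* 2-byte TCI + 2-byte encapsulated EtherType */

static bool is_vlan_ethertype(uint16_t proto)
{
    return proto == ETH_P_8021Q || proto == ETH_P_8021AD;
}

/* Read a big-endian 16-bit value. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns the EtherType in host byte order, or 0 if the frame is truncated.
 * With skip_vlan set, every stacked VLAN tag is skipped, so QinQ frames
 * report the same protocol as untagged ones. */
uint16_t frame_protocol_sketch(const uint8_t *frame, size_t len, bool skip_vlan)
{
    size_t off = 12;  /* EtherType follows the two MAC addresses */
    uint16_t proto;

    if (len < ETH_HLEN)
        return 0;
    proto = read_be16(frame + off);
    if (!skip_vlan)
        return proto;

    while (is_vlan_ethertype(proto)) {
        off += VLAN_HLEN;
        if (off + 2 > len)
            return 0;
        proto = read_be16(frame + off);
    }
    return proto;
}
```

A caller that wants to touch the ECN bits would pass `skip_vlan = true` and compare the result against `ETH_P_IP`; a caller that matches on the outer tag would pass `false`.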
     

27 Nov, 2019

1 commit

  • Pull RCU updates from Ingo Molnar:
    "The main changes in this cycle were:

    - Dynamic tick (nohz) updates, perhaps most notably changes to force
    the tick on when needed due to lengthy in-kernel execution on CPUs
    on which RCU is waiting.

    - Linux-kernel memory consistency model updates.

    - Replace rcu_swap_protected() with rcu_replace_pointer().

    - Torture-test updates.

    - Documentation updates.

    - Miscellaneous fixes"

    * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
    security/safesetid: Replace rcu_swap_protected() with rcu_replace_pointer()
    net/sched: Replace rcu_swap_protected() with rcu_replace_pointer()
    net/netfilter: Replace rcu_swap_protected() with rcu_replace_pointer()
    net/core: Replace rcu_swap_protected() with rcu_replace_pointer()
    bpf/cgroup: Replace rcu_swap_protected() with rcu_replace_pointer()
    fs/afs: Replace rcu_swap_protected() with rcu_replace_pointer()
    drivers/scsi: Replace rcu_swap_protected() with rcu_replace_pointer()
    drm/i915: Replace rcu_swap_protected() with rcu_replace_pointer()
    x86/kvm/pmu: Replace rcu_swap_protected() with rcu_replace_pointer()
    rcu: Upgrade rcu_swap_protected() to rcu_replace_pointer()
    rcu: Suppress levelspread uninitialized messages
    rcu: Fix uninitialized variable in nocb_gp_wait()
    rcu: Update descriptions for rcu_future_grace_period tracepoint
    rcu: Update descriptions for rcu_nocb_wake tracepoint
    rcu: Remove obsolete descriptions for rcu_barrier tracepoint
    rcu: Ensure that ->rcu_urgent_qs is set before resched IPI
    workqueue: Convert for_each_wq to use built-in list check
    rcu: Several rcu_segcblist functions can be static
    rcu: Remove unused function hlist_bl_del_init_rcu()
    Documentation: Rename rcu_node_context_switch() to rcu_note_context_switch()
    ...

    Linus Torvalds
     

31 Oct, 2019

4 commits

  • Extend struct tc_action with new "tcfa_flags" field. Set the field in
    tcf_idr_create() function and provide new helper
    tcf_idr_create_from_flags() that derives 'cpustats' boolean from flags
    value. Update individual hardware-offloaded actions init() to pass their
    "flags" argument to new helper in order to skip percpu stats allocation
    when user requested it through flags.

    Signed-off-by: Vlad Buslov
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Extend TCA_ACT space with nla_bitfield32 flags. Add
    TCA_ACT_FLAGS_NO_PERCPU_STATS as the only allowed flag. Parse the flags in
    tcf_action_init_1() and pass resulting value as additional argument to
    a_o->init().

    Signed-off-by: Vlad Buslov
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Extract common code that increments cpu_qstats counters into standalone act
    API functions. Change hardware offloaded actions that use percpu counter
    allocation to use the new functions instead of accessing cpu_qstats
    directly.

    This commit doesn't change functionality.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Extract common code that increments cpu_bstats counter into standalone act
    API function. Change hardware offloaded actions that use percpu counter
    allocation to use the new function instead of incrementing cpu_bstats
    directly.

    This commit doesn't change functionality.

    Signed-off-by: Vlad Buslov
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
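
For the tcfa_flags commit in this group, the relationship between the flags value and the 'cpustats' boolean can be sketched as follows. The struct and function names are hypothetical user-space stand-ins, and the flag is assumed to occupy bit 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the flag is bit 0 of the flags word. */
#define TCA_ACT_FLAGS_NO_PERCPU_STATS 1u

struct action_sketch {
    uint32_t flags;          /* models the new "tcfa_flags" field */
    bool has_percpu_stats;   /* whether percpu counters were allocated */
};

/* Models the create function: stores the flags and honours 'cpustats'. */
void action_create_sketch(struct action_sketch *a, bool cpustats,
                          uint32_t flags)
{
    a->flags = flags;
    a->has_percpu_stats = cpustats;
}

/* Models the new helper: derives 'cpustats' from the flags, so callers
 * only forward the "flags" argument they received in init(). */
void action_create_from_flags_sketch(struct action_sketch *a, uint32_t flags)
{
    action_create_sketch(a, !(flags & TCA_ACT_FLAGS_NO_PERCPU_STATS), flags);
}
```

Keeping the derivation in one helper means each hardware-offloaded action does not need its own copy of the flag test.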
     

30 Oct, 2019

1 commit

  • This commit replaces the use of rcu_swap_protected() with the more
    intuitively appealing rcu_replace_pointer() as a step towards removing
    rcu_swap_protected().

    Link: https://lore.kernel.org/lkml/CAHk-=wiAsJLw1egFEE=Z7-GGtM6wcvtyytXZA1+BHqta4gg6Hw@mail.gmail.com/
    Reported-by: Linus Torvalds
    [ paulmck: From rcu_replace() to rcu_replace_pointer() per Ingo Molnar. ]
    Signed-off-by: Paul E. McKenney
    Cc: Jamal Hadi Salim
    Cc: Cong Wang
    Cc: Jiri Pirko
    Cc: "David S. Miller"

    Paul E. McKenney
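
The difference in calling convention between the two primitives can be shown with plain pointers. This sketch deliberately leaves out what makes the real macros RCU-aware (publication ordering and the lockdep condition argument); `swap_protected_sketch()` and `replace_pointer_sketch()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct params {
    int update_flags;
};

/* Old style: the two pointers are exchanged in place, so the caller's
 * variable silently changes meaning from "new value" to "old value". */
void swap_protected_sketch(struct params **slot, struct params **ptr)
{
    struct params *tmp = *slot;

    *slot = *ptr;
    *ptr = tmp;
}

/* New style: the new value is stored and the old one is returned, so the
 * caller assigns it to a variable whose name says what it holds. */
struct params *replace_pointer_sketch(struct params **slot,
                                      struct params *new_ptr)
{
    struct params *old = *slot;

    *slot = new_ptr;
    return old;
}
```

With the second form, an update reads as `old = replace(&slot, new)`, which is the readability gain the commit message calls "more intuitively appealing".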
     

28 Aug, 2019

1 commit

  • The net pointer in struct xt_tgdtor_param is not explicitly
    initialized and is therefore still NULL when it is dereferenced.
    So we have to find a way to pass the correct net pointer to
    ipt_destroy_target().

    The best way I found is to save the net pointer inside the per-netns
    struct tcf_idrinfo, which makes this patch smaller.

    Fixes: 0c66dc1ea3f0 ("netfilter: conntrack: register hooks in netns when needed by ruleset")
    Reported-and-tested-by: itugrok@yahoo.com
    Cc: Jamal Hadi Salim
    Cc: Jiri Pirko
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

06 Aug, 2019

1 commit

  • Currently the init call of every action (except ipt) sets its 'parm'
    structure as a direct pointer to the nla data in the skb. This leads to a
    race condition when some of the filter actions were initialized
    successfully (and were assigned an idr action index that was written
    directly into the nla data), but then were deleted and retried (due to a
    following action module missing or a classifier-initiated retry). In that
    case the action init code tries to insert the action into the idr with the
    index that was assigned on the previous iteration. During the retry the
    index can be reused by another action that was inserted concurrently,
    which causes unintended action sharing between filters.

    To fix the described race condition, save the action idr index to a
    temporary stack-allocated variable instead of to the nla data.

    Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action")
    Signed-off-by: Dmytro Linkin
    Signed-off-by: Vlad Buslov
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    Dmytro Linkin
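
The fix described above amounts to treating the request payload as read-only and keeping the allocated index in a local variable. A single-threaded user-space sketch, with hypothetical names (`parm_sketch`, `alloc_index_sketch()`, `init_with_local_index()`) and a trivial counter standing in for idr allocation:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the netlink attribute payload. */
struct parm_sketch {
    uint32_t index;   /* 0 means "let the kernel choose an index" */
};

static uint32_t next_free_index = 1;

/* Allocates an index when *index is 0; otherwise keeps the requested one. */
static void alloc_index_sketch(uint32_t *index)
{
    if (*index == 0)
        *index = next_free_index++;
}

/* The payload is const: the allocated index lives only on the stack, so a
 * retry of the same request starts again from the original value. */
uint32_t init_with_local_index(const struct parm_sketch *parm)
{
    uint32_t index = parm->index;   /* private copy */

    alloc_index_sketch(&index);
    return index;
}
```

If the allocator wrote through to `parm->index` instead, a retried request would carry the stale index from the first attempt, which is the sharing bug the commit fixes.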
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

28 Apr, 2019

1 commit

  • We currently have two levels of strict validation:

    1) liberal (default)
    - undefined (type >= max) & NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted
    - garbage at end of message accepted
    2) strict (opt-in)
    - NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted

    Split out parsing strictness into four different options:
    * TRAILING - check that there's no trailing data after parsing
    attributes (in message or nested)
    * MAXTYPE - reject attrs > max known type
    * UNSPEC - reject attributes with NLA_UNSPEC policy entries
    * STRICT_ATTRS - strictly validate attribute size

    The default for future things should be *everything*.
    The current *_strict() is a combination of TRAILING and MAXTYPE,
    and is renamed to _deprecated_strict().
    The current regular parsing has none of this, and is renamed to
    *_parse_deprecated().

    Additionally it allows us to selectively set one of the new flags
    even on old policies. Notably, the UNSPEC flag could be useful in
    this case, since it can be arranged (by filling in the policy) to
    not be an incompatible userspace ABI change, but would then going
    forward prevent forgetting attribute entries. Similar can apply
    to the POLICY flag.

    We end up with the following renames:
    * nla_parse -> nla_parse_deprecated
    * nla_parse_strict -> nla_parse_deprecated_strict
    * nlmsg_parse -> nlmsg_parse_deprecated
    * nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
    * nla_parse_nested -> nla_parse_nested_deprecated
    * nla_validate_nested -> nla_validate_nested_deprecated

    Using spatch, of course:
    @@
    expression TB, MAX, HEAD, LEN, POL, EXT;
    @@
    -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
    +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression TB, MAX, NLA, POL, EXT;
    @@
    -nla_parse_nested(TB, MAX, NLA, POL, EXT)
    +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)

    @@
    expression START, MAX, POL, EXT;
    @@
    -nla_validate_nested(START, MAX, POL, EXT)
    +nla_validate_nested_deprecated(START, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, MAX, POL, EXT;
    @@
    -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
    +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)

    For this patch, don't actually add the strict, non-renamed versions
    yet so that it breaks compile if I get it wrong.

    Also, while at it, make nla_validate and nla_parse go down to a
    common __nla_validate_parse() function to avoid code duplication.

    Ultimately, this allows us to have very strict validation for every
    new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
    next patch, while existing things will continue to work as is.

    In effect then, this adds fully strict validation for any new command.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
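
The four strictness options listed in this commit compose as a bitmask. The sketch below models that composition in user-space C. The enum names, the `attr_sketch` struct and both functions are hypothetical, and the rule that a too-short attribute is always rejected is an assumption of this sketch.

```c
#include <assert.h>

enum validation_sketch {
    VALIDATE_LIBERAL      = 0,
    VALIDATE_TRAILING     = 1 << 0,  /* no trailing data after attributes */
    VALIDATE_MAXTYPE      = 1 << 1,  /* reject attrs > max known type */
    VALIDATE_UNSPEC       = 1 << 2,  /* reject attrs without a policy entry */
    VALIDATE_STRICT_ATTRS = 1 << 3,  /* attribute size must match exactly */
};

/* The old *_strict() behaviour is TRAILING plus MAXTYPE. */
#define VALIDATE_DEPRECATED_STRICT (VALIDATE_TRAILING | VALIDATE_MAXTYPE)
/* The default for new users is everything. */
#define VALIDATE_STRICT (VALIDATE_TRAILING | VALIDATE_MAXTYPE | \
                         VALIDATE_UNSPEC | VALIDATE_STRICT_ATTRS)

struct attr_sketch {
    int type;        /* attribute type */
    int len;         /* payload length */
    int expected;    /* length required by the policy */
    int has_policy;  /* 0 models an NLA_UNSPEC policy entry */
};

/* Returns 0 if the attribute is accepted, -1 if it is rejected. */
int validate_attr_sketch(unsigned int validate, const struct attr_sketch *a,
                         int maxtype)
{
    if (a->type > maxtype)
        return (validate & VALIDATE_MAXTYPE) ? -1 : 0;
    if (!a->has_policy)
        return (validate & VALIDATE_UNSPEC) ? -1 : 0;
    if (a->len < a->expected)
        return -1;  /* assumption: too short is never acceptable */
    if (a->len > a->expected && (validate & VALIDATE_STRICT_ATTRS))
        return -1;
    return 0;
}

/* Returns 0 if leftover bytes after the last attribute are tolerated. */
int validate_tail_sketch(unsigned int validate, int trailing_bytes)
{
    if (trailing_bytes > 0 && (validate & VALIDATE_TRAILING))
        return -1;
    return 0;
}
```

Expressing each check as its own bit is what lets an old policy opt in to a single new check, as the commit message notes for the UNSPEC flag.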
     

22 Mar, 2019

2 commits

  • The following script:

    # tc qdisc add dev crash0 clsact
    # tc filter add dev crash0 egress matchall action csum icmp pass index 90
    # tc actions replace action csum icmp goto chain 42 index 90 \
    > cookie c1a0c1a0
    # tc actions show action csum

    had the following output:

    Error: Failed to init TC action chain.
    We have an error talking to the kernel
    total acts 1

    action order 0: csum (icmp) action goto chain 42
    index 90 ref 2 bind 1
    cookie c1a0c1a0

    Then, the first packet transmitted by crash0 made the kernel crash:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
    #PF error: [normal kernel read fault]
    PGD 8000000074692067 P4D 8000000074692067 PUD 2e210067 PMD 0
    Oops: 0000 [#1] SMP PTI
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.0.0-rc4.gotochain_crash+ #533
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:tcf_action_exec+0xb8/0x100
    Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
    RSP: 0018:ffff93153da03be0 EFLAGS: 00010246
    RAX: 000000002000002a RBX: ffff9314ee40f700 RCX: 0000000000003a00
    RDX: 0000000000000000 RSI: ffff931537c87828 RDI: ffff931537c87818
    RBP: ffff93153da03c80 R08: 00000000527cffff R09: 0000000000000003
    R10: 000000000000003f R11: 0000000000000028 R12: ffff9314edf68400
    R13: ffff9314edf68408 R14: 0000000000000001 R15: ffff9314ed67b600
    FS: 0000000000000000(0000) GS:ffff93153da00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000000073e32003 CR4: 00000000001606f0
    Call Trace:

    tcf_classify+0x58/0x120
    __dev_queue_xmit+0x40a/0x890
    ? ip6_finish_output2+0x369/0x590
    ip6_finish_output2+0x369/0x590
    ? ip6_output+0x68/0x110
    ip6_output+0x68/0x110
    ? nf_hook.constprop.35+0x79/0xc0
    mld_sendpack+0x16f/0x220
    mld_ifc_timer_expire+0x195/0x2c0
    ? igmp6_timer_handler+0x70/0x70
    call_timer_fn+0x2b/0x130
    run_timer_softirq+0x3e8/0x440
    ? tick_sched_timer+0x37/0x70
    __do_softirq+0xe3/0x2f5
    irq_exit+0xf0/0x100
    smp_apic_timer_interrupt+0x6c/0x130
    apic_timer_interrupt+0xf/0x20

    RIP: 0010:native_safe_halt+0x2/0x10
    Code: 66 ff ff ff 7f f3 c3 65 48 8b 04 25 00 5c 01 00 f0 80 48 02 20 48 8b 00 a8 08 74 8b eb c1 90 90 90 90 90 90 90 90 90 90 fb f4 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90
    RSP: 0018:ffffffff9a803e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
    RAX: ffffffff99e184f0 RBX: 0000000000000000 RCX: 0000000000000001
    RDX: 0000000000000001 RSI: 0000000000000087 RDI: 0000000000000000
    RBP: 0000000000000000 R08: 000eb5c4572376b3 R09: 0000000000000000
    R10: ffffa53e806a3ca0 R11: 00000000000f4240 R12: 0000000000000000
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    ? __sched_text_end+0x1/0x1
    default_idle+0x1c/0x140
    do_idle+0x1c4/0x280
    cpu_startup_entry+0x19/0x20
    start_kernel+0x49e/0x4be
    secondary_startup_64+0xa4/0xb0
    Modules linked in: act_csum veth ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 crct10dif_pclmul crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel snd_hda_intel mbcache snd_hda_codec jbd2 snd_hwdep snd_hda_core snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd cryptd snd_timer glue_helper snd joydev virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect virtio_net sysimgblt net_failover fb_sys_fops virtio_console virtio_blk ttm failover drm ata_piix crc32c_intel floppy virtio_pci serio_raw libata virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod
    CR2: 0000000000000000

    Validating the control action within tcf_csum_init() proved to fix the
    above issue. A TDC selftest is added to verify the correct behavior.

    Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain")
    Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values")
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller

    Davide Caratti
     
  • - pass a pointer to struct tcf_proto in each action's init() handler,
    to allow validating the control action, checking whether the chain
    exists and (eventually) refcounting it.
    - remove code that validates the control action after a successful call
    to the action's init() handler, and replace it with a test that forbids
    addition of actions having 'goto_chain' and NULL goto_chain pointer at
    the same time.
    - add tcf_action_check_ctrlact(), that will validate the control action
    and eventually allocate the action 'goto_chain' within the init()
    handler.
    - add tcf_action_set_ctrlact(), that will assign the control action and
    swap the current 'goto_chain' pointer with the new given one.

    This disallows 'goto_chain' on actions that don't initialize it properly
    in their init() handler, i.e. calling tcf_action_check_ctrlact() after
    successful IDR reservation and then calling tcf_action_set_ctrlact()
    to assign 'goto_chain' and 'tcf_action' consistently.

    By doing this, the kernel no longer leaks refcounts when a valid
    'goto chain' handle is replaced in TC actions, which caused kmemleak
    splats like the following one:

    # tc chain add dev dd0 chain 42 ingress protocol ip flower \
    > ip_proto tcp action drop
    # tc chain add dev dd0 chain 43 ingress protocol ip flower \
    > ip_proto udp action drop
    # tc filter add dev dd0 ingress matchall \
    > action gact goto chain 42 index 66
    # tc filter replace dev dd0 ingress matchall \
    > action gact goto chain 43 index 66
    # echo scan >/sys/kernel/debug/kmemleak

    unreferenced object 0xffff93c0ee09f000 (size 1024):
    comm "tc", pid 2565, jiffies 4295339808 (age 65.426s)
    hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    00 00 00 00 08 00 06 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] tc_ctl_chain+0x3d2/0x4c0
    [] rtnetlink_rcv_msg+0x263/0x2d0
    [] netlink_rcv_skb+0x4a/0x110
    [] netlink_unicast+0x1a0/0x250
    [] netlink_sendmsg+0x2c1/0x3c0
    [] sock_sendmsg+0x36/0x40
    [] ___sys_sendmsg+0x280/0x2f0
    [] __sys_sendmsg+0x5e/0xa0
    [] do_syscall_64+0x5b/0x180
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [] 0xffffffffffffffff

    Fixes: db50514f9a9c ("net: sched: add termination action to allow goto chain")
    Fixes: 97763dc0f401 ("net_sched: reject unknown tcfa_action values")
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller

    Davide Caratti
     

28 Feb, 2019

1 commit

  • The csum calculation is different for IPv4/6. For VLAN packets,
    tc_skb_protocol returns the VLAN protocol rather than the packet's one
    (e.g. IPv4/6), so csum is not calculated. Furthermore, VLAN may not be
    stripped, in which case csum is not calculated either. Calculate the
    csum for those cases.

    Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
    Signed-off-by: Eli Britstein
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Eli Britstein
     

11 Feb, 2019

1 commit

  • Modify the kernel users of the TCA_ACT_* macros to use TCA_ID_*. For
    example, use TCA_ID_GACT instead of TCA_ACT_GACT. This will align with
    TCA_ID_POLICE and also differentiate these identifiers, used in the
    type field of struct tc_action_ops, from other macros starting with TCA_ACT_.

    To make things clearer, we name the enum defining the TCA_ID_*
    identifiers and also change the "type" field of struct tc_action to
    id.

    Signed-off-by: Eli Cohen
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Eli Cohen
     

01 Sep, 2018

1 commit


22 Aug, 2018

1 commit

  • All ops->delete() needs is tn->idrinfo, but we already
    have tc_action before calling ops->delete(), and tc_action has
    a pointer ->idrinfo.

    More importantly, each type of action does the same thing, that is,
    just calling tcf_idr_delete_index().

    So ops->delete() can simply be removed.

    Fixes: b409074e6693 ("net: sched: add 'delete' function to action ops")
    Cc: Jiri Pirko
    Cc: Vlad Buslov
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     

20 Aug, 2018

1 commit

  • Recently, ops->init() and ops->dump() of all actions were modified to
    always obtain tcf_lock when accessing private action state. Actions that
    don't depend on tcf_lock for synchronization with their data path use
    non-bh locking API. However, tcf_lock is also used to protect rate
    estimator stats in softirq context by timer callback.

    Change ops->init() and ops->dump() of all actions to disable bh when using
    tcf_lock to prevent the deadlock reported by the following lockdep warning:

    [ 105.470398] ================================
    [ 105.475014] WARNING: inconsistent lock state
    [ 105.479628] 4.18.0-rc8+ #664 Not tainted
    [ 105.483897] --------------------------------
    [ 105.488511] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
    [ 105.494871] swapper/16/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
    [ 105.500449] 00000000f86c012e (&(&p->tcfa_lock)->rlock){+.?.}, at: est_fetch_counters+0x3c/0xa0
    [ 105.509696] {SOFTIRQ-ON-W} state was registered at:
    [ 105.514925] _raw_spin_lock+0x2c/0x40
    [ 105.519022] tcf_bpf_init+0x579/0x820 [act_bpf]
    [ 105.523990] tcf_action_init_1+0x4e4/0x660
    [ 105.528518] tcf_action_init+0x1ce/0x2d0
    [ 105.532880] tcf_exts_validate+0x1d8/0x200
    [ 105.537416] fl_change+0x55a/0x268b [cls_flower]
    [ 105.542469] tc_new_tfilter+0x748/0xa20
    [ 105.546738] rtnetlink_rcv_msg+0x56a/0x6d0
    [ 105.551268] netlink_rcv_skb+0x18d/0x200
    [ 105.555628] netlink_unicast+0x2d0/0x370
    [ 105.559990] netlink_sendmsg+0x3b9/0x6a0
    [ 105.564349] sock_sendmsg+0x6b/0x80
    [ 105.568271] ___sys_sendmsg+0x4a1/0x520
    [ 105.572547] __sys_sendmsg+0xd7/0x150
    [ 105.576655] do_syscall_64+0x72/0x2c0
    [ 105.580757] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 105.586243] irq event stamp: 489296
    [ 105.590084] hardirqs last enabled at (489296): [] _raw_spin_unlock_irq+0x29/0x40
    [ 105.599765] hardirqs last disabled at (489295): [] _raw_spin_lock_irq+0x15/0x50
    [ 105.609277] softirqs last enabled at (489292): [] irq_enter+0x83/0xa0
    [ 105.618001] softirqs last disabled at (489293): [] irq_exit+0x140/0x190
    [ 105.626813]
    other info that might help us debug this:
    [ 105.633976] Possible unsafe locking scenario:

    [ 105.640526] CPU0
    [ 105.643325] ----
    [ 105.646125] lock(&(&p->tcfa_lock)->rlock);
    [ 105.650747]
    [ 105.653717] lock(&(&p->tcfa_lock)->rlock);
    [ 105.658514]
    *** DEADLOCK ***

    [ 105.665349] 1 lock held by swapper/16/0:
    [ 105.669629] #0: 00000000a640ad99 ((&est->timer)){+.-.}, at: call_timer_fn+0x10b/0x550
    [ 105.678200]
    stack backtrace:
    [ 105.683194] CPU: 16 PID: 0 Comm: swapper/16 Not tainted 4.18.0-rc8+ #664
    [ 105.690249] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
    [ 105.698626] Call Trace:
    [ 105.701421]
    [ 105.703791] dump_stack+0x92/0xeb
    [ 105.707461] print_usage_bug+0x336/0x34c
    [ 105.711744] mark_lock+0x7c9/0x980
    [ 105.715500] ? print_shortest_lock_dependencies+0x2e0/0x2e0
    [ 105.721424] ? check_usage_forwards+0x230/0x230
    [ 105.726315] __lock_acquire+0x923/0x26f0
    [ 105.730597] ? debug_show_all_locks+0x240/0x240
    [ 105.735478] ? mark_lock+0x493/0x980
    [ 105.739412] ? check_chain_key+0x140/0x1f0
    [ 105.743861] ? __lock_acquire+0x836/0x26f0
    [ 105.748323] ? lock_acquire+0x12e/0x290
    [ 105.752516] lock_acquire+0x12e/0x290
    [ 105.756539] ? est_fetch_counters+0x3c/0xa0
    [ 105.761084] _raw_spin_lock+0x2c/0x40
    [ 105.765099] ? est_fetch_counters+0x3c/0xa0
    [ 105.769633] est_fetch_counters+0x3c/0xa0
    [ 105.773995] est_timer+0x87/0x390
    [ 105.777670] ? est_fetch_counters+0xa0/0xa0
    [ 105.782210] ? lock_acquire+0x12e/0x290
    [ 105.786410] call_timer_fn+0x161/0x550
    [ 105.790512] ? est_fetch_counters+0xa0/0xa0
    [ 105.795055] ? del_timer_sync+0xd0/0xd0
    [ 105.799249] ? __lock_is_held+0x93/0x110
    [ 105.803531] ? mark_held_locks+0x20/0xe0
    [ 105.807813] ? _raw_spin_unlock_irq+0x29/0x40
    [ 105.812525] ? est_fetch_counters+0xa0/0xa0
    [ 105.817069] ? est_fetch_counters+0xa0/0xa0
    [ 105.821610] run_timer_softirq+0x3c4/0x9f0
    [ 105.826064] ? lock_acquire+0x12e/0x290
    [ 105.830257] ? __bpf_trace_timer_class+0x10/0x10
    [ 105.835237] ? __lock_is_held+0x25/0x110
    [ 105.839517] __do_softirq+0x11d/0x7bf
    [ 105.843542] irq_exit+0x140/0x190
    [ 105.847208] smp_apic_timer_interrupt+0xac/0x3b0
    [ 105.852182] apic_timer_interrupt+0xf/0x20
    [ 105.856628]
    [ 105.859081] RIP: 0010:cpuidle_enter_state+0xd8/0x4d0
    [ 105.864395] Code: 46 ff 48 89 44 24 08 0f 1f 44 00 00 31 ff e8 cf ec 46 ff 80 7c 24 07 00 0f 85 1d 02 00 00 e8 9f 90 4b ff fb 66 0f 1f 44 00 00 8b 6c 24 08 4d 29 fd 0f 80 36 03 00 00 4c 89 e8 48 ba cf f7 53
    [ 105.884288] RSP: 0018:ffff8803ad94fd20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
    [ 105.892494] RAX: 0000000000000000 RBX: ffffe8fb300829c0 RCX: ffffffffb41e19e1
    [ 105.899988] RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff8803ad9358ac
    [ 105.907503] RBP: ffffffffb6636300 R08: 0000000000000004 R09: 0000000000000000
    [ 105.914997] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000004
    [ 105.922487] R13: ffffffffb6636140 R14: ffffffffb66362d8 R15: 000000188d36091b
    [ 105.929988] ? trace_hardirqs_on_caller+0x141/0x2d0
    [ 105.935232] do_idle+0x28e/0x320
    [ 105.938817] ? arch_cpu_idle_exit+0x40/0x40
    [ 105.943361] ? mark_lock+0x8c1/0x980
    [ 105.947295] ? _raw_spin_unlock_irqrestore+0x32/0x60
    [ 105.952619] cpu_startup_entry+0xc2/0xd0
    [ 105.956900] ? cpu_in_idle+0x20/0x20
    [ 105.960830] ? _raw_spin_unlock_irqrestore+0x32/0x60
    [ 105.966146] ? trace_hardirqs_on_caller+0x141/0x2d0
    [ 105.971391] start_secondary+0x2b5/0x360
    [ 105.975669] ? set_cpu_sibling_map+0x1330/0x1330
    [ 105.980654] secondary_startup_64+0xa5/0xb0

    Taking tcf_lock in sample action with bh disabled causes lockdep to issue a
    warning regarding possible irq lock inversion dependency between tcf_lock,
    and psample_groups_lock that is taken when holding tcf_lock in sample init:

    [ 162.108959] Possible interrupt unsafe locking scenario:

    [ 162.116386] CPU0 CPU1
    [ 162.121277] ---- ----
    [ 162.126162] lock(psample_groups_lock);
    [ 162.130447] local_irq_disable();
    [ 162.136772] lock(&(&p->tcfa_lock)->rlock);
    [ 162.143957] lock(psample_groups_lock);
    [ 162.150813]
    [ 162.153808] lock(&(&p->tcfa_lock)->rlock);
    [ 162.158608]
    *** DEADLOCK ***

    In order to prevent potential lock inversion dependency between tcf_lock
    and psample_groups_lock, extract call to psample_group_get() from tcf_lock
    protected section in sample action init function.

    Fixes: 4e232818bd32 ("net: sched: act_mirred: remove dependency on rtnl lock")
    Fixes: 764e9a24480f ("net: sched: act_vlan: remove dependency on rtnl lock")
    Fixes: 729e01260989 ("net: sched: act_tunnel_key: remove dependency on rtnl lock")
    Fixes: d77284956656 ("net: sched: act_sample: remove dependency on rtnl lock")
    Fixes: e8917f437006 ("net: sched: act_gact: remove dependency on rtnl lock")
    Fixes: b6a2b971c0b0 ("net: sched: act_csum: remove dependency on rtnl lock")
    Fixes: 2142236b4584 ("net: sched: act_bpf: remove dependency on rtnl lock")
    Signed-off-by: Vlad Buslov
    Signed-off-by: David S. Miller

    Vlad Buslov
     

14 Aug, 2018

1 commit


12 Aug, 2018

1 commit

  • Use tcf lock to protect csum action struct private data from concurrent
    modification in init and dump. Use rcu swap operation to reassign params
    pointer under protection of tcf lock. (old params value is not used by
    init, so there is no need of standalone rcu dereference step)

    Remove rtnl assertion that is no longer necessary.

    Signed-off-by: Vlad Buslov
    Signed-off-by: David S. Miller

    Vlad Buslov
     

31 Jul, 2018

1 commit

  • Each lockless action currently does its own RCU locking in ->act().
    This allows using plain RCU accessor, even if the context
    is really RCU BH.

    This change drops the per-action RCU lock, replaces the accessors
    with the _bh variant, cleans up the surrounding code a bit and
    documents the RCU status in the relevant header.
    No functional or performance change is intended.

    The goal of this patch is clarifying that the RCU critical section
    used by the tc actions extends up to the classifier's caller.

    v1 -> v2:
    - preserve rcu lock in act_bpf: it's needed by eBPF helpers,
    as pointed out by Daniel

    v3 -> v4:
    - fixed some typos in the commit message (JiriP)

    Signed-off-by: Paolo Abeni
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Paolo Abeni
     

21 Jul, 2018

1 commit


08 Jul, 2018

5 commits

  • Implement function that atomically checks if action exists and either takes
    reference to it, or allocates idr slot for action index to prevent
    concurrent allocations of actions with same index. Use EBUSY error pointer
    to indicate that idr slot is reserved.

    Implement cleanup helper function that removes temporary error pointer from
    idr. (in case of error between idr allocation and insertion of newly
    created action to specified index)

    Refactor all action init functions to insert new action to idr using this
    API.

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Return from action init function with reference to action taken,
    even when overwriting existing action.

    The action init API initializes its fourth argument (pointer to pointer to tc
    action) to either the existing action with the same index or a newly created
    action. In the case of an existing index (and a zero bind argument), the init
    function returns without incrementing the action reference counter. The caller
    of action init then proceeds to work with the action without actually holding
    a reference to it. This means that the action could be deleted concurrently.

    Change action init behavior to always take reference to action before
    returning successfully, in order to protect from concurrent deletion.

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Extend action ops with 'delete' function. Each action type implements
    its own delete function that doesn't depend on rtnl lock.

    Implement delete function that is required to delete actions without
    holding rtnl lock. Use action API function that atomically deletes action
    only if it is still in action idr. This implementation prevents concurrent
    threads from deleting same action twice.

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
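
    The "delete only if still present" idea can be illustrated with a small
    user-space model. This is a sketch under stated assumptions, not the
    kernel implementation: delete_index(), the table array and the
    cleaned_up counter are hypothetical, and a C11 atomic exchange stands in
    for whatever synchronisation the real action API uses.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_INDEX 8

struct action {
    int index;
    int cleaned_up; /* counts how often the cleanup path ran */
};

/* Hypothetical stand-in for the action idr. */
static _Atomic(struct action *) table[MAX_INDEX];

/*
 * Detach the action at 'index' and run its cleanup exactly once.
 * Returns 0 on success, -ENOENT if it was already removed,
 * -EINVAL if the index is out of range.
 */
static int delete_index(unsigned int index)
{
    struct action *a;

    if (index >= MAX_INDEX)
        return -EINVAL;

    /* Only one caller can swap out a non-NULL pointer. */
    a = atomic_exchange(&table[index], (struct action *)NULL);
    if (!a)
        return -ENOENT;

    a->cleaned_up++; /* stands in for the action's cleanup/free path */
    return 0;
}
```

    Because the lookup and the removal are one atomic step here, a second
    caller racing on the same index gets -ENOENT instead of cleaning the
    action up a second time.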
     
  • Add additional 'rtnl_held' argument to act API init functions. It is
    required to implement actions that need to release rtnl lock before loading
    kernel module and reacquire it afterwards.

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     
  • Change type of action reference counter to refcount_t.

    Change type of action bind counter to atomic_t.
    This type is used to allow decrementing bind counter without testing
    for 0 result.

    Reviewed-by: Marcelo Ricardo Leitner
    Signed-off-by: Vlad Buslov
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Vlad Buslov
     

07 Jul, 2018

1 commit

  • The control action in the common member of struct tcf_csum must be a valid
    value, as it can contain the chain index when 'goto chain' is used. Ensure
    that the control action can be read as x->tcfa_action, when x is a pointer
    to struct tc_action and x->ops->type is TCA_ACT_CSUM, to prevent the
    following command:

    # tc filter add dev $h2 ingress protocol ip pref 1 handle 101 flower \
    > $tcflags dst_mac $h2mac action csum ip or tcp or udp or sctp goto chain 1

    from triggering a NULL pointer dereference when a matching packet is
    received.

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
    PGD 800000010416b067 P4D 800000010416b067 PUD 1041be067 PMD 0
    Oops: 0000 [#1] SMP PTI
    CPU: 0 PID: 3072 Comm: mausezahn Tainted: G E 4.18.0-rc2.auguri+ #421
    Hardware name: Hewlett-Packard HP Z220 CMT Workstation/1790, BIOS K51 v01.58 02/07/2013
    RIP: 0010:tcf_action_exec+0xb8/0x100
    Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
    RSP: 0018:ffffa020dea03c40 EFLAGS: 00010246
    RAX: 0000000020000001 RBX: ffffa020d7ccef00 RCX: 0000000000000054
    RDX: 0000000000000000 RSI: ffffa020ca5ae000 RDI: ffffa020d7ccef00
    RBP: ffffa020dea03e60 R08: 0000000000000000 R09: ffffa020dea03c9c
    R10: ffffa020dea03c78 R11: 0000000000000008 R12: ffffa020d3fe4f00
    R13: ffffa020d3fe4f08 R14: 0000000000000001 R15: ffffa020d53ca300
    FS: 00007f5a46942740(0000) GS:ffffa020dea00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000000104218002 CR4: 00000000001606f0
    Call Trace:

    fl_classify+0x1ad/0x1c0 [cls_flower]
    ? arp_rcv+0x121/0x1b0
    ? __x2apic_send_IPI_dest+0x40/0x40
    ? smp_reschedule_interrupt+0x1c/0xd0
    ? reschedule_interrupt+0xf/0x20
    ? reschedule_interrupt+0xa/0x20
    ? device_is_rmrr_locked+0xe/0x50
    ? iommu_should_identity_map+0x49/0xd0
    ? __intel_map_single+0x30/0x140
    ? e1000e_update_rdt_wa.isra.52+0x22/0xb0 [e1000e]
    ? e1000_alloc_rx_buffers+0x233/0x250 [e1000e]
    ? kmem_cache_alloc+0x38/0x1c0
    tcf_classify+0x89/0x140
    __netif_receive_skb_core+0x5ea/0xb70
    ? enqueue_task_fair+0xb6/0x7d0
    ? process_backlog+0x97/0x150
    process_backlog+0x97/0x150
    net_rx_action+0x14b/0x3e0
    __do_softirq+0xde/0x2b4
    do_softirq_own_stack+0x2a/0x40

    do_softirq.part.18+0x49/0x50
    __local_bh_enable_ip+0x49/0x50
    __dev_queue_xmit+0x4ab/0x8a0
    ? wait_woken+0x80/0x80
    ? packet_sendmsg+0x38f/0x810
    ? __dev_queue_xmit+0x8a0/0x8a0
    packet_sendmsg+0x38f/0x810
    sock_sendmsg+0x36/0x40
    __sys_sendto+0x10e/0x140
    ? do_vfs_ioctl+0xa4/0x630
    ? syscall_trace_enter+0x1df/0x2e0
    ? __audit_syscall_exit+0x22a/0x290
    __x64_sys_sendto+0x24/0x30
    do_syscall_64+0x5b/0x180
    entry_SYSCALL_64_after_hwframe+0x44/0xa9
    RIP: 0033:0x7f5a45cbec93
    Code: 48 8b 0d 18 83 20 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 59 c7 20 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 2b f7 ff ff 48 89 04 24
    RSP: 002b:00007ffd0ee6d748 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    RAX: ffffffffffffffda RBX: 0000000001161010 RCX: 00007f5a45cbec93
    RDX: 0000000000000062 RSI: 0000000001161322 RDI: 0000000000000003
    RBP: 00007ffd0ee6d780 R08: 00007ffd0ee6d760 R09: 0000000000000014
    R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000062
    R13: 0000000001161322 R14: 00007ffd0ee6d760 R15: 0000000000000003
    Modules linked in: act_csum act_gact cls_flower sch_ingress vrf veth act_tunnel_key(E) xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel snd_hda_codec_hdmi snd_hda_codec_realtek kvm snd_hda_codec_generic hp_wmi iTCO_wdt sparse_keymap rfkill mei_wdt iTCO_vendor_support wmi_bmof gpio_ich irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel snd_hda_intel crypto_simd cryptd snd_hda_codec glue_helper snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm pcspkr i2c_i801 snd_timer snd sg lpc_ich soundcore wmi mei_me
    mei ie31200_edac nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod cdrom sd_mod ahci libahci crc32c_intel i915 ixgbe serio_raw libata video dca i2c_algo_bit sfc drm_kms_helper syscopyarea mtd sysfillrect mdio sysimgblt fb_sys_fops drm e1000e i2c_core
    CR2: 0000000000000000
    ---[ end trace 3c9e9d1a77df4026 ]---
    RIP: 0010:tcf_action_exec+0xb8/0x100
    Code: 00 00 00 20 74 1d 83 f8 03 75 09 49 83 c4 08 4d 39 ec 75 bc 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 97 a8 00 00 00 8b 12 48 89 55 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3
    RSP: 0018:ffffa020dea03c40 EFLAGS: 00010246
    RAX: 0000000020000001 RBX: ffffa020d7ccef00 RCX: 0000000000000054
    RDX: 0000000000000000 RSI: ffffa020ca5ae000 RDI: ffffa020d7ccef00
    RBP: ffffa020dea03e60 R08: 0000000000000000 R09: ffffa020dea03c9c
    R10: ffffa020dea03c78 R11: 0000000000000008 R12: ffffa020d3fe4f00
    R13: ffffa020d3fe4f08 R14: 0000000000000001 R15: ffffa020d53ca300
    FS: 00007f5a46942740(0000) GS:ffffa020dea00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000000104218002 CR4: 00000000001606f0
    Kernel panic - not syncing: Fatal exception in interrupt
    Kernel Offset: 0x26400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
    ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

    Fixes: 9c5f69bbd75a ("net/sched: act_csum: don't use spinlock in the fast path")
    Signed-off-by: Davide Caratti
    Signed-off-by: David S. Miller

    Davide Caratti
     

03 May, 2018

1 commit


28 Mar, 2018

1 commit


23 Mar, 2018

1 commit

  • Fun set of conflict resolutions here...

    For the mac80211 stuff, these were fortunately just parallel
    adds. Trivially resolved.

    In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the
    function phy_disable_interrupts() earlier in the file, whilst in
    'net-next' the phy_error() call from this function was removed.

    In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the
    'rt_table_id' member of rtable collided with a bug fix in 'net' that
    added a new struct member "rt_mtu_locked" which needs to be copied
    over here.

    The mlxsw driver conflict consisted of net-next separating
    the span code and definitions into separate files, whilst
    a 'net' bug fix made some changes to that moved code.

    The mlx5 infiniband conflict resolution was quite non-trivial,
    the RDMA tree's merge commit was used as a guide here, and
    here are their notes:

    ====================

    Due to bug fixes found by the syzkaller bot and taken into the for-rc
    branch after development for the 4.17 merge window had already started
    being taken into the for-next branch, there were fairly non-trivial
    merge issues that would need to be resolved between the for-rc branch
    and the for-next branch. This merge resolves those conflicts and
    provides a unified base upon which ongoing development for 4.17 can
    be based.

    Conflicts:
    drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524
    (IB/mlx5: Fix cleanup order on unload) added to for-rc and
    commit b5ca15ad7e61 (IB/mlx5: Add proper representors support)
    added as part of the devel cycle both needed to modify the
    init/de-init functions used by mlx5. To support the new
    representors, the new functions added by the cleanup patch
    needed to be made non-static, and the init/de-init list
    added by the representors patch needed to be modified to
    match the init/de-init list changes made by the cleanup
    patch.
    Updates:
    drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
    prototypes added by representors patch to reflect new function
    names as changed by cleanup patch
    drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
    stage list to match new order from cleanup patch
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

18 Mar, 2018

1 commit

  • When the following command

    # tc action add action csum udp continue index 100

    is run for the first time, and tcf_csum_init() fails allocating struct
    tcf_csum, tcf_csum_cleanup() calls kfree_rcu(NULL,...). This causes the
    following error:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: __call_rcu+0x23/0x2b0
    PGD 80000000740b4067 P4D 80000000740b4067 PUD 32e7f067 PMD 0
    Oops: 0002 [#1] SMP PTI
    Modules linked in: act_csum(E) act_vlan ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic pcbc snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer aesni_intel crypto_simd glue_helper cryptd snd joydev pcspkr virtio_balloon i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_blk drm virtio_net virtio_console ata_piix crc32c_intel libata virtio_pci serio_raw i2c_core virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_vlan]
    CPU: 2 PID: 5763 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403
    Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    RIP: 0010:__call_rcu+0x23/0x2b0
    RSP: 0018:ffffb275803e77c0 EFLAGS: 00010246
    RAX: ffffffffc057b080 RBX: ffff9674bc6f5240 RCX: 00000000ffffffff
    RDX: ffffffff928a5f00 RSI: 0000000000000008 RDI: 0000000000000008
    RBP: 0000000000000008 R08: 0000000000000001 R09: 0000000000000044
    R10: 0000000000000220 R11: ffff9674b9ab4821 R12: 0000000000000000
    R13: ffffffff928a5f00 R14: 0000000000000000 R15: 0000000000000001
    FS: 00007fa6368d8740(0000) GS:ffff9674bfd00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000010 CR3: 0000000073dec001 CR4: 00000000001606e0
    Call Trace:
    __tcf_idr_release+0x79/0xf0
    tcf_csum_init+0xfb/0x180 [act_csum]
    tcf_action_init_1+0x2cc/0x430
    tcf_action_init+0xd3/0x1b0
    tc_ctl_action+0x18b/0x240
    rtnetlink_rcv_msg+0x29c/0x310
    ? _cond_resched+0x15/0x30
    ? __kmalloc_node_track_caller+0x1b9/0x270
    ? rtnl_calcit.isra.28+0x100/0x100
    netlink_rcv_skb+0xd2/0x110
    netlink_unicast+0x17c/0x230
    netlink_sendmsg+0x2cd/0x3c0
    sock_sendmsg+0x30/0x40
    ___sys_sendmsg+0x27a/0x290
    ? filemap_map_pages+0x34a/0x3a0
    ? __handle_mm_fault+0xbfd/0xe20
    __sys_sendmsg+0x51/0x90
    do_syscall_64+0x6e/0x1a0
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2
    RIP: 0033:0x7fa635ce9ba0
    RSP: 002b:00007ffc185b0fc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 00007ffc185b10f0 RCX: 00007fa635ce9ba0
    RDX: 0000000000000000 RSI: 00007ffc185b1040 RDI: 0000000000000003
    RBP: 000000005aaa85e0 R08: 0000000000000002 R09: 0000000000000000
    R10: 00007ffc185b0a20 R11: 0000000000000246 R12: 0000000000000000
    R13: 00007ffc185b1104 R14: 0000000000000001 R15: 0000000000669f60
    Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89
    RIP: __call_rcu+0x23/0x2b0 RSP: ffffb275803e77c0
    CR2: 0000000000000010

    Fix this in tcf_csum_cleanup(), ensuring that kfree_rcu(param, ...) is
    called only when param is not NULL.

    Fixes: 9c5f69bbd75a ("net/sched: act_csum: don't use spinlock in the fast path")
    Signed-off-by: Davide Caratti
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Davide Caratti
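
    The shape of the fix can be shown with a user-space model. It is only a
    sketch: struct csum_params, struct rcu_head_model, defer_free() and
    params_cleanup() are hypothetical names, and defer_free() merely imitates
    a deferred-free primitive that writes into the object it is given, which
    is why it must not receive NULL.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a callback head embedded in the object. */
struct rcu_head_model {
    void (*func)(void *ptr);
};

struct csum_params {
    struct rcu_head_model rcu;
    unsigned int update_flags;
};

static int deferred_frees;

/*
 * Imitates a deferred free. It stores a callback inside the object, so
 * a NULL argument would be dereferenced. A real primitive would release
 * the memory after a grace period; this model releases it at once.
 */
static void defer_free(struct csum_params *p)
{
    p->rcu.func = free;
    deferred_frees++;
    p->rcu.func(p);
}

/* Cleanup that tolerates a failed allocation (p == NULL). */
static void params_cleanup(struct csum_params *p)
{
    if (p)
        defer_free(p);
}
```

    With the guard in place, the cleanup path is safe to run even when the
    earlier allocation failed and no parameters were ever attached.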
     

10 Mar, 2018

1 commit

  • As well as the basic conversion, I noticed that a lot of the
    SCTP code checks gso_type without first checking skb_is_gso()
    so I have added that where appropriate.

    Also, document the helper.

    Cc: Daniel Borkmann
    Cc: Marcelo Ricardo Leitner
    Signed-off-by: Daniel Axtens
    Signed-off-by: David S. Miller

    Daniel Axtens
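
    The ordering of the two checks can be sketched on a toy packet
    descriptor. Everything here is hypothetical (struct pkt, GSO_TYPE_SCTP,
    pkt_is_gso(), pkt_is_gso_sctp()), and the sketch assumes that the type
    bits are only meaningful when the packet really is a GSO packet.

```c
#include <stdbool.h>

#define GSO_TYPE_SCTP (1u << 0) /* hypothetical type flag */

struct pkt {
    unsigned int gso_size; /* 0 means "not a GSO packet" in this model */
    unsigned int gso_type; /* may hold stale bits when gso_size is 0 */
};

static bool pkt_is_gso(const struct pkt *p)
{
    return p->gso_size != 0;
}

static bool pkt_is_gso_sctp(const struct pkt *p)
{
    return (p->gso_type & GSO_TYPE_SCTP) != 0;
}

/* Check "is GSO at all" first, then look at the type bits. */
static bool pkt_needs_sctp_gso_handling(const struct pkt *p)
{
    return pkt_is_gso(p) && pkt_is_gso_sctp(p);
}
```

    A descriptor with gso_size of 0 is rejected by the first test, so a
    leftover type flag on a non-GSO packet cannot select the SCTP path.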
     

28 Feb, 2018

1 commit

  • These pernet_operations are from net/sched directory, and they call only
    tc_action_net_init() and tc_action_net_exit():

    bpf_net_ops
    connmark_net_ops
    csum_net_ops
    gact_net_ops
    ife_net_ops
    ipt_net_ops
    xt_net_ops
    mirred_net_ops
    nat_net_ops
    pedit_net_ops
    police_net_ops
    sample_net_ops
    simp_net_ops
    skbedit_net_ops
    skbmod_net_ops
    tunnel_key_net_ops
    vlan_net_ops

    1) tc_action_net_init() just allocates and initializes per-net memory.
    2) There should not be in-flight packets at the time of the
    tc_action_net_exit() call, nor should another pernet_operations send
    packets to a dying net (except netlink). So, it seems they can be marked
    as async.

    Signed-off-by: Kirill Tkhai
    Signed-off-by: David S. Miller

    Kirill Tkhai
     

17 Feb, 2018

3 commits