17 Oct, 2019

1 commit

  • Instead of waiting for an RCU grace period, just free the extension area (ct->ext) directly.

    This is safe because conntrack lookup doesn't consider extensions.

    Other accesses happen while ct->ext can't be free'd, either because
    a ct refcount was taken or because the conntrack hash bucket lock or
    the dying list spinlock have been taken.

    This allows removing __krealloc in a follow-up patch; netfilter was the
    only user.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

17 Apr, 2018

1 commit

  • After merging the netfilter tree, today's linux-next build (powerpc
    ppc64_defconfig) failed like this:

    net/netfilter/nf_conntrack_extend.c: In function 'nf_ct_ext_add':
    net/netfilter/nf_conntrack_extend.c:74:2: error: implicit declaration of function 'kmemleak_not_leak' [-Werror=implicit-function-declaration]
    kmemleak_not_leak(old);
    ^~~~~~~~~~~~~~~~~
    cc1: some warnings being treated as errors

    Fixes: 114aa35d06d4 ("netfilter: conntrack: silent a memory leak warning")
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Pablo Neira Ayuso

    Stephen Rothwell
     

16 Apr, 2018

1 commit

  • The following memory leak report is a false positive:

    unreferenced object 0xffff8f37f156fb38 (size 128):
    comm "softirq", pid 0, jiffies 4294899665 (age 11.292s)
    hex dump (first 32 bytes):
    6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk
    00 00 00 00 30 00 20 00 48 6b 6b 6b 6b 6b 6b 6b ....0. .Hkkkkkkk
    backtrace:
    [] __kmalloc_track_caller+0x10d/0x141
    [] __krealloc+0x45/0x62
    [] nf_ct_ext_add+0xdc/0x133
    [] init_conntrack+0x1b1/0x392
    [] nf_conntrack_in+0x1ee/0x34b
    [] nf_hook_slow+0x36/0x95
    [] nf_hook.constprop.43+0x1c3/0x1dd
    [] __ip_local_out+0xae/0xb4
    [] ip_local_out+0x17/0x33
    [] igmp_ifc_timer_expire+0x23e/0x26f
    [] call_timer_fn+0x14c/0x2a5
    [] __run_timers.part.34+0x150/0x182
    [] run_timer_softirq+0x2a/0x4c
    [] __do_softirq+0x1d1/0x3c2
    [] irq_exit+0x53/0xa2
    [] smp_apic_timer_interrupt+0x22a/0x235

    because __krealloc() is not supposed to release the old
    memory and it is released later via kfree_rcu(). Since this is
    the only external user of __krealloc(), just mark it as not a leak
    here.

    Signed-off-by: Cong Wang
    Signed-off-by: Pablo Neira Ayuso

    Cong Wang
     

04 Sep, 2017

1 commit


01 May, 2017

1 commit

  • For NF_NAT_MANIP_SRC, we will insert the ct to the nat_bysource_table,
    then remove it from the nat_bysource_table via nat_extend->destroy.

    But now, the nat extension is attached on demand, so if the nat extension
    is not attached, we will not be notified when the ct is destroyed, i.e.
    we may fail to remove ct from the nat_bysource_table.

    So just keep it simple: even if the extension is not attached, we will
    still invoke the related ext->destroy. This also preserves flexibility
    for future extensions.

    Fixes: 9a08ecfe74d7 ("netfilter: don't attach a nat extension by default")
    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     

26 Apr, 2017

3 commits

  • krealloc(NULL, ..) is the same as kmalloc(), so we can avoid special-casing
    the initial allocation after the prealloc removal (we had to use
    ->alloc_len as the initial allocation size).

    This also means we do not zero the preallocated memory anymore; only
    offsets[]. Existing code makes sure the new (used) extension space gets
    zeroed out.
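
    The allocation pattern described above can be sketched in userspace C,
    where realloc(NULL, n) likewise behaves like malloc(n). This is only an
    illustration: struct ext_area and ext_grow() are hypothetical names, not
    the kernel's struct nf_ct_ext or nf_ct_ext_add().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for an extension area. */
struct ext_area {
    size_t len;            /* bytes in use, header included */
    unsigned char data[];  /* extension payloads follow the header */
};

/*
 * Grow the area by `add` bytes. Passing old == NULL performs the initial
 * allocation, because realloc(NULL, n) acts like malloc(n), so no special
 * case is needed. Returns NULL on failure and leaves `old` untouched.
 */
struct ext_area *ext_grow(struct ext_area *old, size_t add)
{
    size_t oldlen = old ? old->len : sizeof(struct ext_area);
    size_t newlen = oldlen + add;
    struct ext_area *new;

    assert(add > 0);

    new = realloc(old, newlen);
    if (!new)
        return NULL;

    /* Zero only the newly used space, not the whole allocation. */
    memset((unsigned char *)new + oldlen, 0, add);
    new->len = newlen;
    return new;
}
```

    A caller would start with ext_grow(NULL, size) and pass the returned
    pointer back in for each later extension; earlier payload bytes survive
    the move and only the added bytes are cleared.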

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • It was used by the nat extension, but since commit
    7c9664351980 ("netfilter: move nat hlist_head to nf_conn") it's only needed
    for connections that use the MASQUERADE target or a nat helper.

    Also it seems a lot easier to preallocate a fixed size instead.

    With default settings, conntrack first adds the ecache extension (sysctl
    defaults to 1), so we get 40 (ct extension header) + 24 (ecache) == 64 bytes
    on x86_64 for the initial allocation.

    Followup patches can constify the extension structs and avoid
    the initial zeroing of the entire extension area.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

19 Apr, 2017

1 commit


27 Mar, 2017

1 commit

  • If one cpu is doing nf_ct_extend_unregister while another cpu is doing
    __nf_ct_ext_add_length, then we may hit BUG_ON(t == NULL). Moreover,
    there's no synchronize_rcu invocation after setting nf_ct_ext_types[id] to
    NULL, so it's possible that we may access an invalid pointer.

    But actually, most of the ct extends are built-in, so the problem listed
    above will not happen. However, there are two exceptions: NF_CT_EXT_NAT
    and NF_CT_EXT_SYNPROXY.

    For _EXT_NAT, the panic will not happen, since adding the nat extend and
    unregistering the nat extend are located in the same file (nf_nat_core.c);
    this means that after the nat module is removed, we cannot add the nat
    extend either.

    For _EXT_SYNPROXY, the synproxy extend may be added by init_conntrack, while
    the synproxy extend unregister will be done by synproxy_core_exit. So after
    nf_synproxy_core.ko is removed, we may still try to add the synproxy
    extend, and a kernel panic may then happen.

    I know it's very hard to reproduce this issue, but I can play a tricky
    game to make it happen very easily :)

    Step 1. Enable SYNPROXY for tcp dport 1234 at FORWARD hook:
    # iptables -I FORWARD -p tcp --dport 1234 -j SYNPROXY
    Step 2. Queue the syn packet to the userspace at raw table OUTPUT hook.
    Also note, in the userspace we only add a 20s' delay, then
    reinject the syn packet to the kernel:
    # iptables -t raw -I OUTPUT -p tcp --syn -j NFQUEUE --queue-num 1
    Step 3. Using "nc 2.2.2.2 1234" to connect the server.
    Step 4. Now remove the nf_synproxy_core.ko quickly:
    # iptables -F FORWARD
    # rmmod ipt_SYNPROXY
    # rmmod nf_synproxy_core
    Step 5. After 20s' delay, the syn packet is reinjected to the kernel.

    Now you will see the panic like this:
    kernel BUG at net/netfilter/nf_conntrack_extend.c:91!
    Call Trace:
    ? __nf_ct_ext_add_length+0x53/0x3c0 [nf_conntrack]
    init_conntrack+0x12b/0x600 [nf_conntrack]
    nf_conntrack_in+0x4cc/0x580 [nf_conntrack]
    ipv4_conntrack_local+0x48/0x50 [nf_conntrack_ipv4]
    nf_reinject+0x104/0x270
    nfqnl_recv_verdict+0x3e1/0x5f9 [nfnetlink_queue]
    ? nfqnl_recv_verdict+0x5/0x5f9 [nfnetlink_queue]
    ? nla_parse+0xa0/0x100
    nfnetlink_rcv_msg+0x175/0x6a9 [nfnetlink]
    [...]

    One possible solution is to make NF_CT_EXT_SYNPROXY extend built-in, i.e.
    introduce nf_conntrack_synproxy.c and only do ct extend register and
    unregister in it, similar to nf_conntrack_timeout.c.

    But having such an obscure restriction on nf_ct_extend_unregister is not a
    good idea, so we should invoke synchronize_rcu after setting nf_ct_ext_types
    to NULL, and check for the NULL pointer in __nf_ct_ext_add_length. Then
    it will be easier to add a new ct extend in the future.

    Last, we use kfree_rcu to free nf_ct_ext, so rcu_barrier() is no longer
    necessary; remove it too.

    Signed-off-by: Liping Zhang
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     

11 Jul, 2016

1 commit

  • The nat extension structure is 32 bytes in size on x86_64:

    struct nf_conn_nat {
            struct hlist_node           bysource;    /*     0    16 */
            struct nf_conn *            ct;          /*    16     8 */
            union nf_conntrack_nat_help help;        /*    24     4 */
            int                         masq_index;  /*    28     4 */
            /* size: 32, cachelines: 1, members: 4 */
            /* last cacheline: 32 bytes */
    };

    The hlist is needed to quickly check for possible tuple collisions
    when installing a new nat binding. Storing this in the extension
    area has two drawbacks:

    1. We need a ct backpointer to get the conntrack struct from the extension.
    2. When reallocation of the extension area occurs we need to fix up the
    bysource hash head via hlist_replace_rcu.

    We can avoid both by placing the hlist_head in nf_conn and placing nf_conn in
    the bysource hash rather than the extension.

    We can also remove the ->move support; no other extension needs it.

    Moving the entire nat extension into nf_conn would be possible as well but
    then we have to add yet another callback for deletion from the bysource
    hash table rather than just using nat extension ->destroy hook for this.

    nf_conn size doesn't increase due to alignment; a follow-up patch replaces
    hlist_node with a single pointer.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

16 Jun, 2012

1 commit


13 Jan, 2012

1 commit

  • commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to
    RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
    complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
    y).

    We miss needed barriers, even on x86, when y is not NULL.
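
    The distinction can be sketched with C11 atomics in userspace. This is an
    analogy, not the kernel implementation: publish(), clear(), lookup() and
    struct item are hypothetical names, a release store plays the role of
    rcu_assign_pointer(), and a relaxed store plays the role of
    RCU_INIT_POINTER(P, NULL).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object that readers reach through a shared pointer. */
struct item {
    int value;
};

static _Atomic(struct item *) slot;

/*
 * Publishing a non-NULL pointer: the release store orders the earlier
 * initialisation of *it before the pointer becomes visible, for both the
 * compiler and the CPU. Without it, a reader could observe the pointer
 * but stale contents.
 */
void publish(struct item *it)
{
    assert(it != NULL);
    atomic_store_explicit(&slot, it, memory_order_release);
}

/*
 * Clearing the pointer: NULL has no contents to order against, so a
 * relaxed store is sufficient.
 */
void clear(void)
{
    atomic_store_explicit(&slot, NULL, memory_order_relaxed);
}

/*
 * Reader side. An acquire load is used here as a conservative substitute
 * for the dependency ordering that rcu_dereference() relies on.
 */
struct item *lookup(void)
{
    return atomic_load_explicit(&slot, memory_order_acquire);
}
```

    Replacing the release store in publish() with the relaxed one used in
    clear() is the userspace equivalent of the incorrect conversion described
    above.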

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    CC: Paul E. McKenney
    Signed-off-by: David S. Miller

    Eric Dumazet
     

02 Aug, 2011

1 commit

  • When assigning a NULL value to an RCU-protected pointer, no barrier
    is needed. rcu_assign_pointer() used to handle that case, but will soon
    change to no longer handle it specially.

    Convert all rcu_assign_pointer() calls that assign a NULL value.

    //smpl
    @@ expression P; @@

    - rcu_assign_pointer(P, NULL)
    + RCU_INIT_POINTER(P, NULL)

    //

    Signed-off-by: Stephen Hemminger
    Acked-by: Paul E. McKenney
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

08 May, 2011

1 commit


16 Nov, 2010

1 commit


15 Nov, 2010

1 commit


07 Oct, 2010

1 commit


23 Sep, 2010

1 commit

  • As soon as rcu_read_unlock() is called, there is no guarantee that the
    current thread can safely dereference the RCU-protected pointer t.

    The fix is to copy t->alloc_size into a temporary variable.
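
    A minimal sketch of that pattern follows. The rcu_read_lock() and
    rcu_read_unlock() macros here are no-op stand-ins so that the example
    compiles in userspace; they provide no real protection. The names
    ext_types, EXT_NUM and ext_alloc_size() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* No-op stand-ins for illustration only; these are NOT real RCU. */
#define rcu_read_lock()   ((void)0)
#define rcu_read_unlock() ((void)0)

/* Hypothetical extension type descriptor. */
struct ext_type {
    size_t alloc_size;
};

#define EXT_NUM 4
static struct ext_type *ext_types[EXT_NUM];

/*
 * Return the allocation size registered for `id`, or 0 if no type is
 * registered. The field is copied into a local variable while still
 * inside the read-side section.
 */
size_t ext_alloc_size(unsigned int id)
{
    struct ext_type *t;
    size_t alloc_size = 0;

    assert(id < EXT_NUM);

    rcu_read_lock();
    t = ext_types[id];
    if (t)
        alloc_size = t->alloc_size;  /* copy while t is still protected */
    rcu_read_unlock();

    /* Only the local copy is used from here on; t is not dereferenced. */
    return alloc_size;
}
```

    The point is the ordering of the two statements: the read of
    t->alloc_size happens before rcu_read_unlock(), and nothing after the
    unlock touches t.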

    Signed-off-by: Eric Dumazet
    Reviewed-by: Paul E. McKenney
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Eric Dumazet
     

20 Aug, 2010

1 commit


02 Aug, 2010

1 commit


12 Feb, 2010

1 commit


25 Jun, 2009

1 commit

  • An RCU barrier, rcu_barrier(), is inserted in two places.

    In nf_conntrack_expect.c nf_conntrack_expect_fini() before the
    kmem_cache_destroy(). Firstly to make sure the callback to the
    nf_ct_expect_free_rcu() code is still around. Secondly because I'm
    unsure about the consequence of having in flight
    nf_ct_expect_free_rcu/kmem_cache_free() calls while doing a
    kmem_cache_destroy() slab destroy.

    And in nf_conntrack_extend.c nf_ct_extend_unregister(), in order to
    wait for completion of callbacks to __nf_ct_ext_free_rcu(), which is
    invoked by __nf_ct_ext_add(). It might be more efficient to call
    rcu_barrier() in nf_conntrack_core.c nf_conntrack_cleanup_net(), but
    that makes it more difficult to read the code (as the callback code
    is located in nf_conntrack_extend.c).

    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: Patrick McHardy

    Jesper Dangaard Brouer
     

27 Jul, 2008

2 commits


20 Jun, 2008

1 commit


18 Jun, 2008

1 commit

  • Fix three ct_extend/NAT extension related races:

    - When cleaning up the extension area and removing it from the bysource hash,
    the nat->ct pointer must not be set to NULL since it may still be used in
    an RCU read side

    - When replacing a NAT extension area in the bysource hash, the nat->ct
    pointer must be assigned before performing the replacement

    - When reallocating extension storage in ct_extend, the old memory must
    not be freed immediately since it may still be used by an RCU read side

    Possibly fixes https://bugzilla.redhat.com/show_bug.cgi?id=449315
    and/or http://bugzilla.kernel.org/show_bug.cgi?id=10875

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

10 Jun, 2008

1 commit


14 Apr, 2008

1 commit


11 Mar, 2008

1 commit


08 Feb, 2008

1 commit

  • The ->move operation has two bugs:

    - It is called with the same extension as source and destination,
    so it doesn't update the new extension.

    - The address of the old extension is calculated incorrectly,
    instead of (void *)ct->ext + ct->ext->offset[i] it uses
    ct->ext + ct->ext->offset[i].
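
    The second bug is a pointer-arithmetic scaling error, which can be
    sketched as below. struct ext_hdr, ext_find() and ext_buggy_distance()
    are hypothetical and much simpler than the real struct nf_ct_ext. The
    kernel relies on the GNU extension that makes void * arithmetic
    byte-sized; the sketch casts to unsigned char * to stay within standard C.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified header; the real struct nf_ct_ext differs. */
struct ext_hdr {
    unsigned char offset[4];  /* byte offset of each extension from the header */
    unsigned char len;
};

/*
 * Correct form: convert to a byte pointer first, so the offset is
 * counted in bytes.
 */
void *ext_find(struct ext_hdr *ext, unsigned int id)
{
    assert(id < 4);
    return (unsigned char *)ext + ext->offset[id];
}

/*
 * Distance in bytes that the buggy form `ext + ext->offset[id]` would
 * advance: arithmetic on a struct pointer is scaled by sizeof(*ext).
 * Only the distance is computed, to avoid forming an out-of-bounds pointer.
 */
size_t ext_buggy_distance(const struct ext_hdr *ext, unsigned int id)
{
    assert(id < 4);
    return (size_t)ext->offset[id] * sizeof(*ext);
}
```

    With an offset of 8, ext_find() lands 8 bytes past the header, while the
    unscaled form would land 8 * sizeof(struct ext_hdr) bytes away, well
    outside the intended extension.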

    Fixes a crash on x86_64 reported by Chuck Ebbert
    and Thomas Woerner.

    Tested-by: Thomas Woerner

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

16 Nov, 2007

1 commit

  • Reported by Chuck Ebbert as:

    https://bugzilla.redhat.com/show_bug.cgi?id=259501#c14

    This routine is called each time the hash should be replaced. nf_conn has
    an extension list which contains pointers to connection tracking users
    (like nat, which is right now the only such user), so when a replace takes
    place it should copy its own extensions. The loop above checks for its own
    extension, but tries to move the higher-layer one, which can lead to the
    above oops.

    Signed-off-by: Evgeniy Polyakov
    Signed-off-by: David S. Miller

    Evgeniy Polyakov
     

11 Jul, 2007

1 commit

  • The old conntrack space allocator had extensibility problems:
    - It required a slab cache per combination of extensions.
    - It had to predict which extensions would be assigned, but that was
    impossible to predict completely, so we allocated a bigger memory object
    than was really required.
    - It needed to search for the helper twice due to a locking issue.

    Now the basic information about a connection is stored in 'struct nf_conn',
    and storage for extensions (helper, NAT) is allocated by kmalloc.

    Signed-off-by: Yasuyuki Kozakai
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Yasuyuki Kozakai