10 Jan, 2019

2 commits

  • commit 21ba8847f857028dc83a0f341e16ecc616e34740 upstream.

    Currently, we use check_hlist() for garbage collection. However, we
    use the ‘zone’ from the counted entry to query the existence of
    existing entries in the hlist. This could be wrong when they are in
    different zones, and this patch fixes this issue.
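
    The zone handling described above can be illustrated with a small
    userspace sketch. All names here (conn_entry, conn_alive, gc_list) are
    hypothetical stand-ins, not the kernel symbols; the toy "table" only
    models the idea that a connection exists under one specific zone, so the
    liveness lookup has to use the zone stored with each list entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a tracked connection. */
struct conn_entry {
    unsigned int tuple_id;   /* stands in for the conntrack tuple */
    unsigned short zone_id;  /* zone the entry was added under */
};

/* Toy "conntrack table": a connection is alive only under its own zone. */
static bool conn_alive(const struct conn_entry *table, size_t n,
                       unsigned int tuple_id, unsigned short zone_id)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].tuple_id == tuple_id && table[i].zone_id == zone_id)
            return true;
    return false;
}

/*
 * Compact the list in place, keeping only entries that are still alive.
 * The lookup uses each entry's own zone, not the zone of the connection
 * that is currently being counted.
 */
static size_t gc_list(struct conn_entry *list, size_t len,
                      const struct conn_entry *table, size_t table_len)
{
    size_t kept = 0;

    assert(list != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        if (conn_alive(table, table_len, list[i].tuple_id, list[i].zone_id))
            list[kept++] = list[i];
    return kept;
}
```

    Had the lookup used a single zone for every entry, live entries from
    other zones would be treated as dead and dropped from the list.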

    Fixes: e59ea3df3fc2 ("netfilter: xt_connlimit: honor conntrack zone if available")
    Signed-off-by: Yi-Hung Wei
    Signed-off-by: Pablo Neira Ayuso

    [mfo: backport: refresh context lines and use older symbol/file names, note hunk 5:
    - nf_conncount.c -> xt_connlimit.c
    - nf_conncount_rb -> xt_connlimit_rb
    - nf_conncount_tuple -> xt_connlimit_conn
    - hunk 5: remove check for non-NULL 'tuple', that isn't required as it's introduced
    by upstream commit 35d8deb80 ("netfilter: conncount: Support count only use case")
    which addresses nf_conncount_count() that does not exist yet -- it's introduced by
    upstream commit 625c556118f3 ("netfilter: connlimit: split xt_connlimit into front
    and backend"), a refactor change.
    - nft_connlimit.c -> removed, not used/doesn't exist yet.]
    Signed-off-by: Mauricio Faria de Oliveira

    Signed-off-by: Sasha Levin

    Yi-Hung Wei
     
  • commit 5e5cbc7b23eaf13e18652c03efbad5be6995de6a upstream.

    This patch provides an interface to maintain the list of connections and
    the lookup function to obtain the number of connections in the list.
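
    As a rough illustration of such an interface, here is a minimal
    userspace sketch with one function to add a connection to a list and one
    to count the entries. The names conn_list_add, conn_list_count and
    conn_list_free are assumptions made for this example and do not mirror
    the kernel's actual signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; tuple_id stands in for the conntrack tuple. */
struct conn_node {
    unsigned int tuple_id;
    struct conn_node *next;
};

struct conn_list {
    struct conn_node *head;
};

/* Add one connection to the list; returns 0 on success, -1 on OOM. */
static int conn_list_add(struct conn_list *l, unsigned int tuple_id)
{
    struct conn_node *n;

    assert(l != NULL);
    n = malloc(sizeof(*n));
    if (!n)
        return -1;
    n->tuple_id = tuple_id;
    n->next = l->head;
    l->head = n;
    return 0;
}

/* Lookup: walk the list and return the number of connections in it. */
static unsigned int conn_list_count(const struct conn_list *l)
{
    unsigned int count = 0;

    for (const struct conn_node *n = l->head; n; n = n->next)
        count++;
    return count;
}

/* Release every node and leave the list empty. */
static void conn_list_free(struct conn_list *l)
{
    while (l->head) {
        struct conn_node *n = l->head;

        l->head = n->next;
        free(n);
    }
}
```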

    Signed-off-by: Pablo Neira Ayuso

    [mfo: backport: refresh context lines and use older symbol/file names:
    - nf_conntrack_count.h: new file, add include guards.
    - nf_conncount.c -> xt_connlimit.c.
    - nf_conncount_rb -> xt_connlimit_rb
    - nf_conncount_tuple -> xt_connlimit_conn
    - conncount_rb_cachep -> connlimit_rb_cachep
    - conncount_conn_cachep -> connlimit_conn_cachep]
    Signed-off-by: Mauricio Faria de Oliveira

    Signed-off-by: Sasha Levin

    Pablo Neira Ayuso
     

08 Jul, 2018

1 commit

  • commit bb7b40aecbf778c0c83a5bd62b0f03ca9f49a618 upstream.

    When removing a rule that jumps to a chain together with that chain in
    the same batch, this bogusly hits EBUSY. Add activate and deactivate
    operations to expressions that can be called from the preparation and
    the commit/abort phases.
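
    The idea can be sketched in userspace as follows. The types and function
    names (chain, jump_expr, jump_deactivate, jump_activate,
    chain_del_prepare) are hypothetical; the sketch only assumes that a
    chain carries a use counter that jump expressions contribute to.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical chain with a count of expressions that jump to it. */
struct chain {
    unsigned int use;
};

/* Hypothetical expression that jumps to a chain. */
struct jump_expr {
    struct chain *target;
};

/* Preparation phase of a rule removal: drop the reference early. */
static void jump_deactivate(struct jump_expr *e)
{
    assert(e->target->use > 0);
    e->target->use--;
}

/* Abort phase: the removal is rolled back, so restore the reference. */
static void jump_activate(struct jump_expr *e)
{
    e->target->use++;
}

/* Preparation phase of a chain removal: refuse while still referenced. */
static int chain_del_prepare(const struct chain *c)
{
    return c->use ? -EBUSY : 0;
}
```

    Because the reference is dropped during preparation, a later chain
    removal in the same batch no longer sees a stale reference.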

    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Pablo Neira Ayuso
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.
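
    For illustration, the first line of the sketch below shows what the tag
    looks like at the top of a C source file. The helper underneath is a
    hypothetical example of how a tool might pull the license expression out
    of such a line; it is not taken from any real scanner.

```c
// SPDX-License-Identifier: GPL-2.0
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SPDX_TAG "SPDX-License-Identifier:"

/*
 * Hypothetical scanner helper: returns a pointer to the license expression
 * that follows the SPDX tag in a line, or NULL when the line has no tag.
 */
static const char *spdx_license_expr(const char *line)
{
    const char *tag;

    assert(line != NULL);
    tag = strstr(line, SPDX_TAG);
    if (!tag)
        return NULL;
    tag += strlen(SPDX_TAG);
    while (*tag == ' ' || *tag == '\t')
        tag++;
    return tag;
}
```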

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

09 Sep, 2017

1 commit

  • This reverts commit 870190a9ec9075205c0fa795a09fa931694a3ff1.

    It was not a good idea. The custom hash table was a much better
    fit for this purpose.

    A fast lookup is not essential. In fact, in most cases there is no
    lookup at all because the original tuple is not taken and can be used
    as-is. What needs to be fast is insertion and deletion.

    rhlist removal however requires a rhlist walk.
    We can have thousands of entries in such a list if source port/addresses
    are reused for multiple flows. If this happens, removal requests are so
    expensive that deletions of a few thousand flows can take several
    seconds(!).
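
    To show why a custom hash table with hlist-style buckets avoids that
    walk, here is a small userspace sketch of a node that can unlink itself
    in constant time. It is modelled on the well-known next/pprev layout;
    the names hnode, hhead, hnode_add_head and hnode_del are chosen for this
    example only.

```c
#include <assert.h>
#include <stddef.h>

/* Node with a back pointer to whatever points at it (head or previous). */
struct hnode {
    struct hnode *next;
    struct hnode **pprev;
};

struct hhead {
    struct hnode *first;
};

/* O(1) insertion at the head of a bucket. */
static void hnode_add_head(struct hnode *n, struct hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* O(1) removal: no walk over the bucket, only two pointer updates. */
static void hnode_del(struct hnode *n)
{
    assert(n->pprev != NULL);
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}
```

    Removal cost stays the same whether the bucket holds three entries or
    several thousand, which is the property the revert is after.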

    The advantages that we got from rhashtable are:
    1) table auto-sizing
    2) multiple locks

    1) would be nice to have, but it is not essential as we have at
    most one lookup per new flow, so even a million flows in the bysource
    table are not a problem compared to current deletion cost.
    2) is easy to add to custom hash table.

    I tried to add hlist_node to rhlist to speed up rhltable_remove but this
    isn't doable without changing semantics. rhltable_remove_fast will
    check that the to-be-deleted object is part of the table and that
    requires a list walk that we want to avoid.

    Furthermore, using hlist_node increases size of struct rhlist_head, which
    in turn increases nf_conn size.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=196821
    Reported-by: Ivan Babrou
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

04 Sep, 2017

4 commits


28 Aug, 2017

1 commit

  • This converts the storage and layout of netfilter hook entries from a
    linked list to an array. After this commit, hook entries will be
    stored adjacent in memory. The next pointer is no longer required.

    The ops pointers are stored at the end of the array as they are only
    used in the register/unregister path and in the legacy br_netfilter code.
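
    A minimal userspace sketch of such a layout is shown below. The types
    and helpers (hook_ops, hook_entry, hook_entries, entries_alloc,
    entries_ops, entries_set, entries_run, count_call) are simplified,
    hypothetical stand-ins: one allocation holds a count, the adjacent hook
    entries, and then the ops pointers at the end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int hook_fn(void *priv);   /* returns nonzero to accept */

/* Registration-time description of a hook (hypothetical, simplified). */
struct hook_ops {
    hook_fn *hook;
    void *priv;
    int priority;
};

/* What the packet path needs: the function and its private data. */
struct hook_entry {
    hook_fn *hook;
    void *priv;
};

/* Header, then num hook_entry slots, then num pointers to hook_ops. */
struct hook_entries {
    unsigned int num;
    struct hook_entry hooks[];
};

static struct hook_entries *entries_alloc(unsigned int num)
{
    size_t size = sizeof(struct hook_entries) +
                  num * sizeof(struct hook_entry) +
                  num * sizeof(const struct hook_ops *);
    struct hook_entries *e = calloc(1, size);

    if (e)
        e->num = num;
    return e;
}

/* The ops pointers start right after the last hook_entry. */
static const struct hook_ops **entries_ops(struct hook_entries *e)
{
    return (const struct hook_ops **)&e->hooks[e->num];
}

static void entries_set(struct hook_entries *e, unsigned int i,
                        const struct hook_ops *ops)
{
    assert(i < e->num);
    e->hooks[i].hook = ops->hook;
    e->hooks[i].priv = ops->priv;
    entries_ops(e)[i] = ops;
}

/* Hot path: an index loop over adjacent memory, no next pointers. */
static int entries_run(const struct hook_entries *e)
{
    for (unsigned int i = 0; i < e->num; i++)
        if (!e->hooks[i].hook(e->hooks[i].priv))
            return 0;
    return 1;
}

/* Example hook: counts how often it was called and accepts. */
static int count_call(void *priv)
{
    ++*(int *)priv;
    return 1;
}
```

    Keeping the ops pointers behind the entries means the traversal only
    touches the data it needs, while register/unregister code can still map
    slot i back to the ops that created it.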

    nf_unregister_net_hooks() is slower than needed as it just calls
    nf_unregister_net_hook in a loop (i.e. at least n synchronize_net()
    calls). This will be addressed in a followup patch.

    Test setup:
    - ixgbe 10gbit
    - netperf UDP_STREAM, 64 byte packets
    - 5 hooks (raw + mangle prerouting, mangle + filter input, inet filter)

    Results:
    - empty mangle and raw prerouting, mangle and filter input hooks: 353.9
    - this patch: 364.2

    Signed-off-by: Aaron Conole
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Aaron Conole
     

25 Aug, 2017

7 commits


02 Aug, 2017

1 commit

  • When a nf_conntrack_l3/4proto parameter is not on the left hand side
    of an assignment, its address is not taken, and it is not passed to a
    function that may modify its fields, then it can be declared as const.
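
    A small sketch of the pattern, using a hypothetical l4proto structure
    and helper rather than the real nf_conntrack types: the function only
    reads through its parameter, so the pointer can be declared const.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified protocol descriptor. */
struct l4proto {
    unsigned char protonum;  /* IP protocol number, e.g. 6 for TCP */
    const char *name;
};

/*
 * The parameter is never assigned to, its address is not taken, and it is
 * not passed to anything that could modify it, so it can be const.
 */
static int l4proto_is_tcp(const struct l4proto *proto)
{
    assert(proto != NULL);
    return proto->protonum == 6;
}
```

    With the const qualifier in place, the compiler rejects any later
    attempt to write through proto inside the function.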

    This change is useful from a documentation point of view, and can
    possibly facilitate making some nf_conntrack_l3/4proto structures const
    subsequently.

    Done with the help of Coccinelle.

    Signed-off-by: Julia Lawall
    Signed-off-by: Pablo Neira Ayuso

    Julia Lawall
     

01 Aug, 2017

7 commits


24 Jul, 2017

1 commit


01 Jul, 2017

1 commit

    The refcount_t type and its corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This helps avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.
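
    The difference from a plain atomic counter can be modelled in userspace
    with C11 atomics. This is a simplified sketch of a saturating reference
    counter, not the kernel implementation; the names ref, ref_inc_not_zero
    and ref_dec_and_test are chosen for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

struct ref {
    atomic_uint count;
};

/* Take a reference unless the object is already released. */
static bool ref_inc_not_zero(struct ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == 0)
            return false;   /* object is gone, do not resurrect it */
        if (old == UINT_MAX)
            return true;    /* saturated: stay pinned instead of wrapping */
    } while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));
    return true;
}

/* Drop a reference; returns true when the last reference went away. */
static bool ref_dec_and_test(struct ref *r)
{
    unsigned int old = atomic_load(&r->count);

    do {
        if (old == UINT_MAX)
            return false;   /* saturated counters are never released */
        assert(old != 0);   /* dropping below zero would be a bug */
    } while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));
    return old == 1;
}
```

    A counter that wraps around to zero would let the object be freed while
    references remain; a saturating counter leaks the object instead, which
    is the safer failure mode.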

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

20 Jun, 2017

1 commit


29 May, 2017

4 commits


15 May, 2017

3 commits

  • Andreas reports that the following incremental update using our commit
    protocol doesn't work.

    # nft -f incremental-update.nft
    delete element ip filter client_to_any { 10.180.86.22 : goto CIn_1 }
    delete chain ip filter CIn_1
    ... Error: Could not process rule: Device or resource busy

    The existing code is not well-integrated into the commit phase protocol,
    since element deletions do not result in refcount decrement from the
    preparation phase. This results in bogus EBUSY errors like the one
    above.

    Two new functions come with this patch:

    * nft_set_elem_activate() function is used from the abort path, to
    restore the set element refcounting on objects that occurred from
    the preparation phase.

    * nft_set_elem_deactivate() that is called from nft_del_setelem() to
    decrement set element refcounting on objects from the preparation
    phase in the commit protocol.

    The nft_data_uninit() has been renamed to nft_data_release() since this
    function does not uninitialize any data stored in the data register,
    instead just releases the references to objects. Moreover, a new
    function nft_data_hold() has been introduced to be used from
    nft_set_elem_activate().

    Reported-by: Andreas Schultz
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
    We can still delete the ct helper even if it is in use, which will cause
    a use-after-free error. In more detail:
    # nfct helper add ssdp inet udp
    # iptables -t raw -A OUTPUT -p udp -j CT --helper ssdp
    # nfct helper delete ssdp //--> oops, succeed!
    BUG: unable to handle kernel paging request at 000026ca
    IP: 0x26ca
    [...]
    Call Trace:
    ? ipv4_helper+0x62/0x80 [nf_conntrack_ipv4]
    nf_hook_slow+0x21/0xb0
    ip_output+0xe9/0x100
    ? ip_fragment.constprop.54+0xc0/0xc0
    ip_local_out+0x33/0x40
    ip_send_skb+0x16/0x80
    udp_send_skb+0x84/0x240
    udp_sendmsg+0x35d/0xa50

    So add a reference count to fix this issue: if the ct helper is used by
    others, reject the delete request.

    After applying this patch:
    # nfct helper delete ssdp
    nfct v1.4.3: netlink error: Device or resource busy
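
    The behaviour can be sketched in userspace as below. The structure and
    function names are hypothetical, and the sketch assumes that the
    registry itself holds one reference, so a count above one means some
    other user still depends on the helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper object; refcnt starts at 1 for the registry. */
struct helper {
    unsigned int refcnt;
    bool registered;
};

/* A user (e.g. a rule that attaches the helper) takes a reference. */
static void helper_get(struct helper *h)
{
    assert(h->registered);
    h->refcnt++;
}

/* The user drops its reference again. */
static void helper_put(struct helper *h)
{
    assert(h->refcnt > 1);
    h->refcnt--;
}

/* Delete only when the registry's reference is the last one left. */
static int helper_delete(struct helper *h)
{
    if (h->refcnt != 1)
        return -EBUSY;
    h->refcnt = 0;
    h->registered = false;
    return 0;
}
```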

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     
    Convert module_put invocations to nf_conntrack_helper_put. This prepares
    for the followup patch, which will add a refcnt for cthelper so that we
    can reject the delete request when cthelper is in use.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     

01 May, 2017

2 commits

    For NF_NAT_MANIP_SRC, we insert the ct into the nat_bysource_table, then
    remove it from the nat_bysource_table via nat_extend->destroy.

    But now the nat extension is attached on demand, so if the nat extension
    is not attached, we will not be notified when the ct is destroyed, i.e.
    we may fail to remove the ct from the nat_bysource_table.

    So keep it simple: even if the extension is not attached, we still
    invoke the related ext->destroy. This also preserves flexibility for
    future extensions.

    Fixes: 9a08ecfe74d7 ("netfilter: don't attach a nat extension by default")
    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     
  • nf_unregister_net_hook(s) can avoid a second call to synchronize_net,
    provided there is no nfqueue active in that net namespace (which is
    the common case).

    This also gets rid of the extra arg to nf_queue_nf_hook_drop(), normally
    this gets called during netns cleanup so no packets should be queued.

    For the rare case of base chain being unregistered or module removal
    while nfqueue is in use the extra hiccup due to the packet drops isn't
    a big deal.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

26 Apr, 2017

3 commits

  • Nowadays the NAT extension only stores the interface index
    (used to purge connections that got masqueraded when interface goes down)
    and pptp nat information.

    Previous patches moved nf_ct_nat_ext_add to those places that need it.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • It was used by the nat extension, but since commit
    7c9664351980 ("netfilter: move nat hlist_head to nf_conn") it is only
    needed for connections that use the MASQUERADE target or a nat helper.

    Also it seems a lot easier to preallocate a fixed size instead.

    With default settings, conntrack first adds ecache extension (sysctl
    defaults to 1), so we get 40 (ct extension header) + 24 (ecache) == 64
    bytes on x86_64 for the initial allocation.

    Followup patches can constify the extension structs and avoid
    the initial zeroing of the entire extension area.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal