05 Oct, 2020

2 commits

  • A typical use of a bitwise expression is to mask out parts of an IP
    address when matching on the network part only. Optimize for this common
    use with a fast variant for NFT_BITWISE_BOOL-type expressions operating
    on 32bit-sized values.

    Signed-off-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Phil Sutter
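
    A minimal userspace sketch of the idea described above, assuming the fast
    variant reduces to one AND and one XOR on a single 32-bit register. The
    struct and function names are hypothetical, not the kernel's, and the
    address is treated as a plain host-order integer for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the expression's private data. */
struct bitwise_fast {
    uint32_t mask;
    uint32_t xor_val;
};

/* Boolean bitwise operation on a single 32-bit register value. */
static uint32_t bitwise_fast_eval(const struct bitwise_fast *priv, uint32_t src)
{
    assert(priv != NULL);
    return (src & priv->mask) ^ priv->xor_val;
}

/* Keep only the /24 network part of 192.168.1.77 (0xc0a8014d). */
static uint32_t network_part_example(void)
{
    const struct bitwise_fast priv = { .mask = 0xffffff00u, .xor_val = 0 };

    return bitwise_fast_eval(&priv, 0xc0a8014du);
}
```

    With a zero XOR value the operation is a pure mask, which is the
    network-part case the commit message mentions.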
     
  • Add a boolean indicating NFT_CMP_NEQ. To include it into the match
    decision, it is sufficient to XOR it with the data comparison's result.

    While at it, store the mask that is calculated during expression
    init and free the eval routine from having to recalculate it each time.

    Signed-off-by: Phil Sutter
    Signed-off-by: Pablo Neira Ayuso

    Phil Sutter
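
    The XOR trick can be sketched as follows. This is an illustration under
    stated assumptions: the names are hypothetical, the mask simply keeps the
    low len_bits bits, and endianness handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical private data of a fast compare expression. */
struct cmp_fast {
    uint32_t data;  /* value to compare against, already masked */
    uint32_t mask;  /* computed once at init time */
    bool inv;       /* true for "not equal" */
};

/* Compute the mask once, so the eval path never has to. */
static void cmp_fast_init(struct cmp_fast *priv, uint32_t data,
                          unsigned int len_bits, bool neq)
{
    assert(len_bits >= 1 && len_bits <= 32);
    priv->mask = UINT32_MAX >> (32 - len_bits);
    priv->data = data & priv->mask;
    priv->inv = neq;
}

/* Returns true when rule evaluation should continue. */
static bool cmp_fast_match(const struct cmp_fast *priv, uint32_t reg)
{
    /* XOR with inv flips the equality result for the "not equal" case. */
    return ((reg & priv->mask) == priv->data) ^ priv->inv;
}
```

    A single branch-free expression covers both the "equal" and the
    "not equal" operators.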
     

22 Jul, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants with
    the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
    fall-through markings where applicable.

    [1] https://www.kernel.org/doc/html/latest/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Pablo Neira Ayuso

    Gustavo A. R. Silva
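
    For illustration, a userspace approximation of such a macro and its use.
    The macro definition here is an assumption modeled on the compiler
    attribute, and scale_by_suffix() is a made-up example function.

```c
#include <assert.h>

/* Use the compiler attribute when available, else an empty statement. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Multiply value by 1024 once per unit step: K = 1, M = 2, G = 3 steps. */
static unsigned long scale_by_suffix(unsigned long value, char suffix)
{
    switch (suffix) {
    case 'G':
        value *= 1024;
        fallthrough;
    case 'M':
        value *= 1024;
        fallthrough;
    case 'K':
        value *= 1024;
        break;
    default:
        break;
    }
    return value;
}

/* Small usage example. */
static void scale_by_suffix_demo(void)
{
    assert(scale_by_suffix(2, 'K') == 2048UL);
}
```

    Unlike a comment, the statement form lets the compiler check that every
    intentional fall-through is marked.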
     

06 Jul, 2019

1 commit


19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

01 Mar, 2019

1 commit

  • Check the result of dereferencing base_chain->stats, instead of comparing
    the result of this_cpu_ptr with NULL.

    base_chain->stats may be changed to NULL when a chain is updated and a
    new NULL counter can be attached.

    We do not need to check the return value of this_cpu_ptr: since
    base_chain->stats comes from the percpu allocator, if it is non-NULL,
    this_cpu_ptr returns a valid value.

    Also fix two sparse errors by replacing rcu_access_pointer and
    rcu_dereference with READ_ONCE under rcu_read_lock.

    Thanks to Eric for his help in finishing this patch.

    Fixes: 009240940e84c1 ("netfilter: nf_tables: don't assume chain stats are set when jumplabel is set")
    Signed-off-by: Eric Dumazet
    Signed-off-by: Zhang Yu
    Signed-off-by: Li RongQing
    Signed-off-by: Pablo Neira Ayuso

    Li RongQing
     

27 Feb, 2019

1 commit


18 Jan, 2019

1 commit

  • With CONFIG_RETPOLINE it's faster to add an if (ptr == &foo_func)
    check and use direct calls for all the built-in expressions.

    ~15% improvement in pathological cases.

    checkpatch doesn't like the X macro due to the embedded return statement,
    but the macro has a very limited scope so I don't think it's a problem.

    I would like to avoid bugs of the form
        if (e->ops->eval == (unsigned long)nft_foo_eval)
            nft_bar_eval();

    and an open-coded if ()/else if()/else cascade, thus the macro.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
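
    A simplified sketch of the macro approach. All types and function names
    below are hypothetical stand-ins rather than the kernel's definitions.
    The point is that the macro compares against and calls the same symbol,
    so the mismatch bug described above cannot be written.

```c
#include <assert.h>
#include <stddef.h>

struct expr;
typedef int (*eval_fn)(const struct expr *e, int reg);

/* Hypothetical expression with an eval callback and one operand. */
struct expr {
    eval_fn eval;
    int operand;
};

static int add_eval(const struct expr *e, int reg) { return reg + e->operand; }
static int mul_eval(const struct expr *e, int reg) { return reg * e->operand; }
/* Stands in for an expression that is not built in. */
static int neg_eval(const struct expr *e, int reg) { (void)e; return -reg; }

/* Compare against and call the same function name. */
#define DIRECT_CALL(e, fun, reg) \
    do { if ((e)->eval == (fun)) return fun((e), (reg)); } while (0)

static int expr_call(const struct expr *e, int reg)
{
    assert(e != NULL && e->eval != NULL);
    DIRECT_CALL(e, add_eval, reg);
    DIRECT_CALL(e, mul_eval, reg);
    /* Everything else still takes the indirect call. */
    return e->eval(e, reg);
}
```

    The embedded return is what checkpatch objects to, but it is also what
    keeps each call site down to a single line.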
     

04 Dec, 2018

1 commit

  • basechain->stats is rcu protected data which is updated from
    nft_chain_stats_replace(). This function is executed from the commit
    phase which holds the pernet nf_tables commit mutex - not the global
    nfnetlink subsystem mutex.

    Test commands to reproduce the problem are:
    %iptables-nft -I INPUT
    %iptables-nft -Z
    %iptables-nft -Z

    This patch uses RCU calls to handle basechain->stats updates to fix a
    splat that looks like:

    [89279.358755] =============================
    [89279.363656] WARNING: suspicious RCU usage
    [89279.368458] 4.20.0-rc2+ #44 Tainted: G W L
    [89279.374661] -----------------------------
    [89279.379542] net/netfilter/nf_tables_api.c:1404 suspicious rcu_dereference_protected() usage!
    [...]
    [89279.406556] 1 lock held by iptables-nft/5225:
    [89279.411728] #0: 00000000bf45a000 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1f/0x70 [nf_tables]
    [89279.424022] stack backtrace:
    [89279.429236] CPU: 0 PID: 5225 Comm: iptables-nft Tainted: G W L 4.20.0-rc2+ #44
    [89279.430135] Call Trace:
    [89279.430135] dump_stack+0xc9/0x16b
    [89279.430135] ? show_regs_print_info+0x5/0x5
    [89279.430135] ? lockdep_rcu_suspicious+0x117/0x160
    [89279.430135] nft_chain_commit_update+0x4ea/0x640 [nf_tables]
    [89279.430135] ? sched_clock_local+0xd4/0x140
    [89279.430135] ? check_flags.part.35+0x440/0x440
    [89279.430135] ? __rhashtable_remove_fast.constprop.67+0xec0/0xec0 [nf_tables]
    [89279.430135] ? sched_clock_cpu+0x126/0x170
    [89279.430135] ? find_held_lock+0x39/0x1c0
    [89279.430135] ? hlock_class+0x140/0x140
    [89279.430135] ? is_bpf_text_address+0x5/0xf0
    [89279.430135] ? check_flags.part.35+0x440/0x440
    [89279.430135] ? __lock_is_held+0xb4/0x140
    [89279.430135] nf_tables_commit+0x2555/0x39c0 [nf_tables]

    Fixes: f102d66b335a4 ("netfilter: nf_tables: use dedicated mutex to guard transactions")
    Signed-off-by: Taehee Yoo
    Signed-off-by: Pablo Neira Ayuso

    Taehee Yoo
     

28 Sep, 2018

1 commit

  • Add the ability to set the security context of packets within the nf_tables framework.
    Add a nft_object for holding security contexts in the kernel and manipulating packets on the wire.

    Convert the security context strings at rule addition time to security identifiers.
    This is the same behavior as in xt_SECMARK and offers better performance than computing it per packet.

    Set the maximum security context length to 256.

    Signed-off-by: Christian Göttsche
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Christian Göttsche
     

30 Jul, 2018

1 commit


13 Jun, 2018

1 commit


03 Jun, 2018

1 commit


29 May, 2018

3 commits

  • The comment and trace_loginfo are not used anymore.

    Signed-off-by: Taehee Yoo
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Taehee Yoo
     
  • synchronize_rcu() is expensive.

    The commit phase currently enforces an unconditional
    synchronize_rcu() after incrementing the generation counter.

    This is to make sure that a packet always sees a consistent chain, either
    nft_do_chain is still using old generation (it will skip the newly added
    rules), or the new one (it will skip old ones that might still be linked
    into the list).

    We could just remove the synchronize_rcu(). It would not cause a crash, but
    it could cause us to evaluate both a removed rule and a new rule for the
    same packet, instead of either-or.

    To resolve this, add a rule pointer array holding two generations, the
    current one and the future generation.

    In commit phase, allocate the rule blob and populate it with the rules that
    will be active in the new generation.

    Then, make this rule blob public, replacing the old generation pointer.

    Then the generation counter can be incremented.

    nft_do_chain() will either continue to use the current generation
    (in case loop was invoked right before increment), or the new one.

    Suggested-by: Pablo Neira Ayuso
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
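
    The two-generation scheme can be pictured with this single-threaded
    sketch. It is only an illustration: the names are hypothetical, and the
    RCU pointer publication and memory ordering that the real code depends
    on are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flat array of the rules active in one generation. */
struct rule_blob {
    size_t n;
    const int *rule_ids;
};

struct chain {
    const struct rule_blob *gen[2]; /* indexed by the generation bit */
};

struct net_state {
    bool genbit; /* current generation */
};

/* What a packet sees: the blob of the current generation. */
static const struct rule_blob *chain_active_rules(const struct chain *c,
                                                  const struct net_state *net)
{
    assert(c != NULL && net != NULL);
    return c->gen[net->genbit];
}

/* Commit: publish the next blob first, then flip the generation. */
static void chain_commit(struct chain *c, struct net_state *net,
                         const struct rule_blob *next)
{
    c->gen[!net->genbit] = next;
    net->genbit = !net->genbit;
}
```

    A reader that fetched its blob before the flip keeps walking the old
    rule set, and a reader arriving later sees only the new one, so no
    packet is evaluated against a mix of both.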
     
  • This patch fixes the following splat.

    [118709.054937] BUG: using smp_processor_id() in preemptible [00000000] code: test/1571
    [118709.054970] caller is nft_update_chain_stats.isra.4+0x53/0x97 [nf_tables]
    [118709.054980] CPU: 2 PID: 1571 Comm: test Not tainted 4.17.0-rc6+ #335
    [...]
    [118709.054992] Call Trace:
    [118709.055011] dump_stack+0x5f/0x86
    [118709.055026] check_preemption_disabled+0xd4/0xe4

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

24 May, 2018

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter updates for net-next

    The following patchset contains Netfilter updates for your net-next
    tree, they are:

    1) Remove obsolete nf_log tracing from nf_tables, from Florian Westphal.

    2) Add support for map lookups to numgen, random and hash expressions,
    from Laura Garcia.

    3) Allow registering nat hooks for iptables and nftables at the same
    time. Patchset from Florian Westphal.

    4) Timeout support for rbtree sets.

    5) ip6_rpfilter needs the interface for link-local addresses, from
    Vincent Bernat.

    6) Add nf_ct_hook and nf_nat_hook structures and use them.

    7) Do not drop packets when they race to insert conntrack entries
    into hashes, this is particularly a problem in nfqueue setups.

    8) Address fallout from xt_osf separation to nf_osf, patches
    from Florian Westphal and Fernando Mancera.

    9) Remove reference to struct nft_af_info, which doesn't exist anymore.
    From Taehee Yoo.

    This batch comes with a conflict between 25fd386e0bc0 ("netfilter:
    core: add missing __rcu annotation") in your tree and 2c205dd3981f
    ("netfilter: add struct nf_nat_hook and use it") coming in this batch.
    This conflict can be solved by leaving the __rcu tag on
    __netfilter_net_init() - added by 25fd386e0bc0 - and remove all code
    related to nf_nat_decode_session_hook - which is gone after
    2c205dd3981f, as described by:

    diff --cc net/netfilter/core.c
    index e0ae4aae96f5,206fb2c4c319..168af54db975
    --- a/net/netfilter/core.c
    +++ b/net/netfilter/core.c
    @@@ -611,7 -580,13 +611,8 @@@ const struct nf_conntrack_zone nf_ct_zo
    EXPORT_SYMBOL_GPL(nf_ct_zone_dflt);
    #endif /* CONFIG_NF_CONNTRACK */

    - static void __net_init __netfilter_net_init(struct nf_hook_entries **e, int max)
    -#ifdef CONFIG_NF_NAT_NEEDED
    -void (*nf_nat_decode_session_hook)(struct sk_buff *, struct flowi *);
    -EXPORT_SYMBOL(nf_nat_decode_session_hook);
    -#endif
    -
    + static void __net_init
    + __netfilter_net_init(struct nf_hook_entries __rcu **e, int max)
    {
    int h;

    I can also merge your net-next tree into nf-next, solve the conflict and
    resend the pull request if you prefer so.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

22 May, 2018

1 commit

  • S390 bpf_jit.S is removed in net-next and had changes in 'net',
    since that code isn't used any more take the removal.

    TLS data structures split the TX and RX components in 'net-next',
    put the new struct members from the bug fix in 'net' into the RX
    part.

    The 'net-next' tree had some reworking of how the ERSPAN code works in
    the GRE tunneling code, overlapping with a one-line headroom
    calculation fix in 'net'.

    Overlapping changes in __sock_map_ctx_update_elem(), keep the bits
    that read the prog members via READ_ONCE() into local variables
    before using them.

    Signed-off-by: David S. Miller

    David S. Miller
     

17 May, 2018

1 commit


08 May, 2018

1 commit


27 Apr, 2018

3 commits


02 Aug, 2017

1 commit

  • The nf_loginfo structures are only passed as the seventh argument to
    nf_log_trace, which is declared as const or stored in a local const
    variable. Thus the nf_loginfo structures themselves can be const.

    Done with the help of Coccinelle.

    //
    @r disable optional_qualifier@
    identifier i;
    position p;
    @@
    static struct nf_loginfo i@p = { ... };

    @ok1@
    identifier r.i;
    expression list[6] es;
    position p;
    @@
    nf_log_trace(es,&i@p,...)

    @ok2@
    identifier r.i;
    const struct nf_loginfo *e;
    position p;
    @@
    e = &i@p

    @bad@
    position p != {r.p,ok1.p,ok2.p};
    identifier r.i;
    struct nf_loginfo e;
    @@
    e@i@p

    @depends on !bad disable optional_qualifier@
    identifier r.i;
    @@
    static
    +const
    struct nf_loginfo i = { ... };
    //

    Signed-off-by: Julia Lawall
    Signed-off-by: Pablo Neira Ayuso

    Julia Lawall
     

24 Jul, 2017

1 commit


10 Nov, 2016

1 commit

  • Some basic expressions are built into nf_tables.ko, such as nft_cmp,
    nft_lookup, nft_range and so on. But the init routine for these basic
    expressions is a little ugly: it has too many goto errX labels, and we
    forget to call nft_range_module_exit in the exit routine, although that
    is harmless.

    Actually, the init and exit routines of these basic expressions
    are the same, i.e. do nft_register_expr in the init routine and do
    nft_unregister_expr in the exit routine.

    So it's better to arrange them into an array and deal with them
    together.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
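
    The array-based approach might look like the sketch below. The
    registration functions are simplified, hypothetical stand-ins that just
    toggle a flag. The point is the loop with a reverse unwind on failure,
    which replaces the chain of goto labels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression type. A NULL name makes registration fail. */
struct expr_type {
    const char *name;
    int registered;
};

static int register_expr(struct expr_type *t)
{
    if (t->name == NULL)
        return -1;
    t->registered = 1;
    return 0;
}

static void unregister_expr(struct expr_type *t)
{
    t->registered = 0;
}

/* Register every type. On failure, undo the ones already registered. */
static int core_init(struct expr_type *types[], size_t n)
{
    size_t i;

    assert(types != NULL);
    for (i = 0; i < n; i++) {
        int err = register_expr(types[i]);

        if (err) {
            while (i-- > 0)
                unregister_expr(types[i]);
            return err;
        }
    }
    return 0;
}

/* Unregister in reverse order. No entry can be forgotten. */
static void core_exit(struct expr_type *types[], size_t n)
{
    while (n-- > 0)
        unregister_expr(types[n]);
}
```

    Because init and exit walk the same array, adding a new built-in
    expression means adding one array entry and nothing else.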
     

03 Nov, 2016

1 commit


26 Oct, 2016

1 commit


26 Sep, 2016

2 commits

  • The NFTA_LOG_FLAGS attribute is already supported, but the related
    NF_LOG_XXX flags are not exposed to userspace. So we cannot
    explicitly enable log flags to log uid, tcp sequence, ip options
    and so on, i.e. a rule such as "nft add rule filter output log uid"
    is not supported yet.

    So move the NF_LOG_XXX macro definitions to uapi/../nf_log.h. In
    order to stay consistent with other modules, change NF_LOG_MASK to
    refer to all supported log flags. On the other hand, add a new
    NF_LOG_DEFAULT_MASK to refer to the original default log flags.

    Finally, if the user specifies unsupported log flags, or NFTA_LOG_GROUP
    and NFTA_LOG_FLAGS are set at the same time, report EINVAL to
    userspace.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     
  • Inverse ranges != [a,b] are not currently possible because rules are
    composites of && operations, and we need to express this:

    data < a || data > b

    This patch adds a new range expression. Positive ranges can already be
    expressed through two cmp expressions:

    cmp(sreg, data, >=)
    cmp(sreg, data,

    Pablo Neira Ayuso
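
    A reduced illustration of what a single range expression can decide
    where a pair of ANDed comparisons cannot. This sketch assumes plain
    32-bit values and uses hypothetical names. It is not the kernel's
    implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum range_op {
    RANGE_EQ,   /* data is inside [from, to] */
    RANGE_NEQ   /* data < from || data > to */
};

/* Returns true when the rule should keep evaluating. */
static bool range_match(uint32_t data, uint32_t from, uint32_t to,
                        enum range_op op)
{
    bool inside;

    assert(from <= to);
    inside = data >= from && data <= to;
    return op == RANGE_EQ ? inside : !inside;
}
```

    The negated case is a disjunction, so it has to be decided inside one
    expression rather than by chaining two of them.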
     

23 Sep, 2016

2 commits

  • pkt->xt.thoff is not always set properly, but we use it without any check.
    For the payload expression, this causes wrong results. For nftrace, we may
    report the wrong network or transport header to userspace. Furthermore,
    adding the following nft rule triggers a warning:
    # nft add rule arp filter output meta nftrace set 1

    WARNING: CPU: 0 PID: 13428 at net/netfilter/nf_tables_trace.c:263
    nft_trace_notify+0x4a3/0x5e0 [nf_tables]
    Call Trace:
    [] dump_stack+0x63/0x85
    [] __warn+0xcb/0xf0
    [] warn_slowpath_null+0x1d/0x20
    [] nft_trace_notify+0x4a3/0x5e0 [nf_tables]
    [ ... ]
    [] nft_do_chain_arp+0x78/0x90 [nf_tables_arp]
    [] nf_iterate+0x62/0x80
    [] nf_hook_slow+0x73/0xd0
    [] arp_xmit+0x8f/0xb0
    [ ... ]
    [] arp_solicit+0x106/0x2c0

    So before we use pkt->xt.thoff, check the tprot_set first.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     
  • There's an off-by-one issue in nft_payload_fast_eval: skb_tail_pointer
    and ptr + priv->len both point to the last valid address plus 1. So if
    they are equal, we can still fetch the valid data, and it's unnecessary to
    fall back to nft_payload_eval.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
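
    The boundary condition can be shown with offsets instead of pointers.
    Both helpers below are hypothetical. The first mirrors an overly strict
    check and the second accepts a read that ends exactly at the end of the
    buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Too strict: rejects a read whose end coincides with the buffer end. */
static bool payload_fits_strict(size_t buf_len, size_t offset, size_t len)
{
    return offset <= buf_len && len < buf_len - offset;
}

/*
 * Correct: offset + len may equal buf_len, because both then refer to
 * "one past the last valid byte" and every byte read is still in bounds.
 */
static bool payload_fits(size_t buf_len, size_t offset, size_t len)
{
    return offset <= buf_len && len <= buf_len - offset;
}

/* Reading the last 4 bytes of an 8-byte buffer. */
static void payload_fits_demo(void)
{
    assert(!payload_fits_strict(8, 4, 4));
    assert(payload_fits(8, 4, 4));
}
```

    Comparing len against buf_len - offset, rather than adding offset and
    len, also keeps the arithmetic from overflowing.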
     

15 Jun, 2016

1 commit


09 Dec, 2015

3 commits

  • nf_log_trace() outputs bogus 'TRACE:' strings because I forgot to update
    the comments array.

    Fixes: 33d5a7b14bfd0 ("netfilter: nf_tables: extend tracing infrastructure")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Only needed when meta nftrace rule(s) were added.
    The assumption is that no such rules are active, so the call to
    nft_trace_init is "never" needed.

    When nftrace rules are active, we always call the nft_trace_* functions,
    but will only send netlink messages when all of the following are true:

    - traceinfo structure was initialised
    - skb->nf_trace == 1
    - at least one subscriber to trace group.

    Adding an extra conditional
    (static_branch ... && skb->nf_trace)
    nft_trace_init( ..)

    is possible but results in a larger nft_do_chain footprint.

    Signed-off-by: Florian Westphal
    Acked-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • nft monitor mode can then decode and display this trace data.

    Parts of LL/Network/Transport headers are provided as separate
    attributes.

    Otherwise, printing IP address data becomes virtually impossible
    for userspace since in the case of the netdev family we really don't
    want userspace to have to know all the possible link layer types
    and/or sizes just to display/print an ip address.

    We also don't want userspace to have to follow ipv6 header chains
    to get the s/dport info, the kernel already did this work for us.

    To avoid bloating nft_do_chain all data required for tracing is
    encapsulated in nft_traceinfo.

    The structure is initialized unconditionally(!) for each nft_do_chain
    invocation.

    This unconditional call will be moved under a static key in a
    followup patch.

    With lots of help from Patrick McHardy and Pablo Neira.

    Signed-off-by: Florian Westphal
    Acked-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

19 Sep, 2015

3 commits


16 Jul, 2015

1 commit