08 Jul, 2018

1 commit

  • commit 97a0549b15a0b466c47f6a0143a490a082c64b4e upstream.

    In nft_meta_set_eval, the nftrace value is read from sreg as a u32, but
    the correct type is u8, so an incorrect value is sometimes read.

    Steps to reproduce:

    %nft add table ip filter
    %nft add chain ip filter input { type filter hook input priority 4\; }
    %nft add rule ip filter input nftrace set 0
    %nft monitor

    Sometimes trace messages appear even though nftrace is set to 0:

    trace id 16767227 ip filter input packet: iif "enp2s0"
    ether saddr xx:xx:xx:xx:xx:xx ether daddr xx:xx:xx:xx:xx:xx
    ip saddr 192.168.0.1 ip daddr 255.255.255.255 ip dscp cs0
    ip ecn not-ect ip
    trace id 16767227 ip filter input rule nftrace set 0 (verdict continue)
    trace id 16767227 ip filter input verdict continue
    trace id 16767227 ip filter input

    Signed-off-by: Taehee Yoo
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
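
The type mismatch in the commit above can be illustrated outside the kernel. This is a minimal sketch, assuming the register's remaining bytes hold stale data from an earlier use (one way a wider read could go wrong); all names here are hypothetical and not taken from nft_meta.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 32-bit data register. */
typedef uint32_t reg32_t;

/* Copy a one-byte value into the lowest-addressed byte of the register.
 * The other three bytes are left untouched and may hold stale data. */
static void store_first_byte(reg32_t *reg, uint8_t val)
{
	assert(reg != NULL);
	memcpy(reg, &val, sizeof(val));
}

/* Wide read: interprets all four bytes, stale ones included. */
static uint32_t read_as_u32(const reg32_t *reg)
{
	return *reg;
}

/* Narrow read: only looks at the byte that was actually stored. */
static uint8_t read_as_u8(const reg32_t *reg)
{
	uint8_t val;

	memcpy(&val, reg, sizeof(val));
	return val;
}

/* Usage helpers: store `val` over a register pre-filled with `stale`. */
static uint32_t u32_view_after_store(uint32_t stale, uint8_t val)
{
	reg32_t reg = stale;

	store_first_byte(&reg, val);
	return read_as_u32(&reg);
}

static uint8_t u8_view_after_store(uint32_t stale, uint8_t val)
{
	reg32_t reg = stale;

	store_first_byte(&reg, val);
	return read_as_u8(&reg);
}
```

With a stale register, storing 0 and reading it back as a u8 yields 0, while the u32 view is non-zero, which would match the unexpected trace messages described above.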
     

08 Apr, 2017

1 commit


24 Mar, 2017

1 commit


13 Mar, 2017

1 commit

  • Currently, there are two different methods to store a u16 integer in
    the u32 data register. For example:
    u32 *dest = &regs->data[priv->dreg];
    1. *dest = 0; *(u16 *) dest = val_u16;
    2. *dest = val_u16;

    For method 1, the u16 value is stored like this on both big-endian
    and little-endian systems:
    0              15              31
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |     Value     |       0       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    For method 2, on a little-endian system, the u16 value is stored the
    same way as shown above. But on a big-endian system, it is stored
    like this:
    0              15              31
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |       0       |     Value     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

    So when we later use "memcmp(&regs->data[priv->sreg], data, 2);" to
    compare in nft_cmp, nft_lookup expr ..., method 2 gets the wrong
    result on a big-endian system, as bits 0~15 will always be zero.

    For a similar reason, when loading a u16 value from the u32 data
    register, we should use "*(u16 *) sreg;" instead of "(u16)*sreg;":
    the second form gets the wrong value on a big-endian system.

    So introduce some wrapper functions to store/load a u8 or u16
    integer to/from the u32 data register, and use them in the right
    places.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
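
A userspace sketch of the kind of wrapper the commit above describes, following method 1 (zero the register, then write the u16 at its lowest address). The function names are hypothetical, and memcpy is used in place of the pointer cast so the sketch stays within standard C aliasing rules; it is not the kernel's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store: clear the whole register, then place the u16 in its first two
 * bytes, regardless of host endianness (method 1 in the text). */
static void reg_store16(uint32_t *dreg, uint16_t val)
{
	assert(dreg != NULL);
	*dreg = 0;
	memcpy(dreg, &val, sizeof(val));
}

/* Load: read the u16 back from the first two bytes of the register. */
static uint16_t reg_load16(const uint32_t *sreg)
{
	uint16_t val;

	assert(sreg != NULL);
	memcpy(&val, sreg, sizeof(val));
	return val;
}

/* Compare the first two register bytes against a u16, in the style of
 * the memcmp() call quoted in the commit message. Returns 1 on match. */
static int reg_matches16(uint16_t stored, uint16_t wanted)
{
	uint32_t reg;

	reg_store16(&reg, stored);
	return memcmp(&reg, &wanted, sizeof(wanted)) == 0;
}

/* Round trip through a register that starts out full of stale bits. */
static uint16_t roundtrip16(uint16_t val)
{
	uint32_t reg = 0xffffffffu;

	reg_store16(&reg, val);
	return reg_load16(&reg);
}
```

Because both the store and the load address the first two bytes, a two-byte memcmp against the register sees the value on little-endian and big-endian hosts alike.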
     

07 Mar, 2017

1 commit

  • When we want to validate an expr's dependencies or hooks, we must do two
    things. First, write an X_validate callback function and point
    ->validate to it. Second, call X_validate in the init routine.
    This is very common, e.g. in the fib, nat and reject exprs ...

    This is a little ugly, since X_validate must be called in every expr's
    init routine; it's better to do it once in nf_tables_newexpr, so we
    avoid repeating it. After doing this, the second step listed above
    is no longer needed, so remove it now.

    Patch was tested by nftables/tests/py/nft-test.py and
    nftables/tests/shell/run-tests.sh.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
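
The refactoring described above can be sketched as follows: the generic creation path runs ->init and then ->validate once, so individual init routines no longer call their own validate function. The struct layout and every name below are hypothetical simplifications, not the nf_tables definitions.

```c
#include <assert.h>
#include <stddef.h>

struct expr;

/* Hypothetical ops table; ->validate is optional. */
struct expr_ops {
	int (*init)(struct expr *e);
	int (*validate)(const struct expr *e);
};

struct expr {
	const struct expr_ops *ops;
	int hook; /* illustrative: hook the expression is attached to */
};

/* Generic creation path: init first, then validate in one place for
 * every expression type. Returns 0 on success, negative on error. */
static int new_expr(struct expr *e)
{
	int err;

	assert(e != NULL && e->ops != NULL && e->ops->init != NULL);
	err = e->ops->init(e);
	if (err < 0)
		return err;
	if (e->ops->validate != NULL)
		return e->ops->validate(e);
	return 0;
}

/* Example expression: its init does not call demo_validate itself. */
static int demo_init(struct expr *e)
{
	(void)e;
	return 0;
}

/* Illustrative rule: only hook 0 is acceptable. */
static int demo_validate(const struct expr *e)
{
	return e->hook == 0 ? 0 : -1;
}

static const struct expr_ops demo_ops = {
	.init = demo_init,
	.validate = demo_validate,
};

/* Usage helper: create a demo expression on the given hook. */
static int try_new_demo_expr(int hook)
{
	struct expr e = { .ops = &demo_ops, .hook = hook };

	return new_expr(&e);
}
```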
     

19 Jan, 2017

2 commits

  • Add the following nft rule, then ping 224.0.0.1:
    # nft add rule netdev t c pkttype host counter

    The following warning is printed again and again:
    WARNING: CPU: 0 PID: 10182 at net/netfilter/nft_meta.c:163 \
    nft_meta_get_eval+0x3fe/0x460 [nft_meta]
    [...]
    Call Trace:

    dump_stack+0x85/0xc2
    __warn+0xcb/0xf0
    warn_slowpath_null+0x1d/0x20
    nft_meta_get_eval+0x3fe/0x460 [nft_meta]
    nft_do_chain+0xff/0x5e0 [nf_tables]

    So we should deal with PACKET_LOOPBACK in the netdev family too. For ipv4,
    convert it to PACKET_BROADCAST/MULTICAST according to the destination
    address's type; for ipv6, convert it to PACKET_MULTICAST directly.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     
  • Since there's no broadcast address in IPv6, PACKET_LOOPBACK packets in
    the ipv6 family must be multicast packets; there's no need to check
    again.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
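
The PACKET_LOOPBACK handling described in the two commits above can be sketched like this. The enum names are hypothetical stand-ins for the kernel's PACKET_* and family constants, and treating every non-multicast IPv4 destination as broadcast is a simplifying assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants standing in for PACKET_* and address families. */
enum pkt_type { PKT_HOST, PKT_BROADCAST, PKT_MULTICAST, PKT_LOOPBACK };
enum l3_family { L3_IPV4, L3_IPV6 };

/* True when a host-byte-order IPv4 address lies in 224.0.0.0/4. */
static int ipv4_is_mcast(uint32_t daddr)
{
	return (daddr & 0xf0000000u) == 0xe0000000u;
}

/* Map a loopback-type packet to broadcast or multicast based on its
 * network-layer destination. Other packet types pass through unchanged. */
static enum pkt_type resolve_loopback(enum pkt_type type,
				      enum l3_family fam,
				      uint32_t ipv4_daddr)
{
	if (type != PKT_LOOPBACK)
		return type;
	if (fam == L3_IPV6)
		return PKT_MULTICAST; /* IPv6 has no broadcast address */
	assert(fam == L3_IPV4);
	return ipv4_is_mcast(ipv4_daddr) ? PKT_MULTICAST : PKT_BROADCAST;
}
```

For example, a looped-back packet to 224.0.0.1 (0xe0000001) resolves to multicast, and one to 255.255.255.255 resolves to broadcast.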
     

03 Nov, 2016

1 commit


26 Oct, 2016

1 commit


26 Sep, 2016

1 commit

  • Conflicts:
    net/netfilter/core.c
    net/netfilter/nf_tables_netdev.c

    Resolve two conflicts before pull request for David's net-next tree:

    1) Between c73c24849011 ("netfilter: nf_tables_netdev: remove redundant
    ip_hdr assignment") from the net tree and commit ddc8b6027ad0
    ("netfilter: introduce nft_set_pktinfo_{ipv4, ipv6}_validate()").

    2) Between e8bffe0cf964 ("net: Add _nf_(un)register_hooks symbols") and
    Aaron Conole's patches to replace list_head with single linked list.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

13 Sep, 2016

1 commit

  • This patch introduces nft_set_pktinfo_unspec(), which ensures proper
    initialization of all pktinfo fields for non-IP traffic. This is used
    by the bridge, netdev and arp families.

    This new function relies on nft_set_pktinfo_proto_unspec() to set a new
    tprot_set field that indicates whether transport protocol information
    is available. The remaining fields are zeroed.

    The meta expression has also been updated to check tprot_set first,
    given that zero is a valid tprot value. Even a handcrafted packet
    may come with the IPPROTO_RAW (255) protocol number, so we can't rely on
    this value to mean that tprot is unset.

    Reported-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
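
The reason for a separate tprot_set flag can be shown with a small sketch: since every value from 0 to 255 is a possible protocol number, "unset" needs its own field. The struct and helper names are hypothetical and much simpler than the kernel's pktinfo structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down packet info. */
struct pktinfo_sketch {
	bool     tprot_set; /* true only when tprot is meaningful */
	uint8_t  tprot;     /* transport protocol number; 0 is valid */
	uint16_t thoff;     /* illustrative extra field */
};

/* Non-IP traffic: zero everything, which also clears tprot_set. */
static void set_pktinfo_unspec(struct pktinfo_sketch *pkt)
{
	assert(pkt != NULL);
	memset(pkt, 0, sizeof(*pkt));
}

/* IP traffic: record the protocol and mark it as available. */
static void set_pktinfo_proto(struct pktinfo_sketch *pkt, uint8_t proto)
{
	set_pktinfo_unspec(pkt);
	pkt->tprot_set = true;
	pkt->tprot = proto;
}

/* Usage helper: returns the protocol number, or -1 when none is set.
 * The flag is checked first; tprot alone cannot signal "unset". */
static int l4proto_or_minus1(bool has_proto, uint8_t proto)
{
	struct pktinfo_sketch pkt;

	if (has_proto)
		set_pktinfo_proto(&pkt, proto);
	else
		set_pktinfo_unspec(&pkt);
	return pkt.tprot_set ? pkt.tprot : -1;
}
```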
     

25 Aug, 2016

1 commit

  • "meta pkttype set" is only supported on the prerouting chain of the
    bridge family and the ingress chain of the netdev family.

    But the validate check is incomplete, and the user can still add such
    rules on the input chain of the bridge family, for example:
    # nft add table bridge filter
    # nft add chain bridge filter input {type filter hook input \
    priority 0 \;}
    # nft add chain bridge filter test
    # nft add rule bridge filter test meta pkttype set unicast
    # nft add rule bridge filter input jump test

    This patch fixes the problem.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
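
A minimal sketch of the check the commit above calls for, assuming the validator is given a bitmask of the base-chain hooks from which the rule can be reached (including through jumps). The enum and function names are hypothetical and do not reproduce the kernel's validation API.

```c
#include <assert.h>

/* Hypothetical families and hooks. */
enum family { FAM_BRIDGE, FAM_NETDEV, FAM_OTHER };
enum hook { HOOK_PREROUTING, HOOK_INPUT, HOOK_INGRESS };

/* Bitmask of hooks on which "meta pkttype set" is allowed per family. */
static unsigned int allowed_hooks(enum family fam)
{
	switch (fam) {
	case FAM_BRIDGE:
		return 1u << HOOK_PREROUTING;
	case FAM_NETDEV:
		return 1u << HOOK_INGRESS;
	default:
		return 0;
	}
}

/* Returns 0 when every reaching hook is allowed, -1 otherwise. */
static int validate_pkttype_set(enum family fam, unsigned int chain_hooks)
{
	assert(chain_hooks != 0);
	return (chain_hooks & ~allowed_hooks(fam)) ? -1 : 0;
}
```

In the example from the commit message, the "test" chain is reached from a bridge input chain, so its hook mask contains the input hook and the rule would be rejected.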
     

24 Jul, 2016

1 commit


05 Jul, 2016

1 commit


23 Jun, 2016

1 commit


29 Feb, 2016

1 commit

  • Can be used to randomly match packets, e.g. for statistical traffic sampling.

    See commit 3ad0040573b0c00f8848
    ("bpf: split state from prandom_u32() and consolidate {c, e}BPF prngs")
    for more information on why this doesn't use prandom_u32 directly.

    Unlike bpf, nft_meta can be built as a module, so add an EXPORT_SYMBOL
    for prandom_seed_full_state too.

    Cc: Daniel Borkmann
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
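
The idea of sampling packets from a generator with its own private state can be sketched in userspace. Note that this uses a plain xorshift32 generator as a stand-in; it is not the kernel's prandom implementation, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Private generator state, kept separate from any shared PRNG. */
struct sampler {
	uint32_t state;
};

/* Seed the sampler; xorshift needs a non-zero state. */
static void sampler_seed(struct sampler *s, uint32_t seed)
{
	assert(s != NULL);
	s->state = seed ? seed : 1u;
}

/* One xorshift32 step (a stand-in generator for this sketch). */
static uint32_t sampler_next(struct sampler *s)
{
	uint32_t x = s->state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	s->state = x;
	return x;
}

/* Match roughly threshold / 2^32 of all packets. */
static int sampler_match(struct sampler *s, uint32_t threshold)
{
	return sampler_next(s) < threshold;
}

/* Usage helper: count how many of n packets would be sampled. */
static unsigned int count_matches(uint32_t seed, uint32_t threshold,
				  unsigned int n)
{
	struct sampler s;
	unsigned int i, hits = 0;

	sampler_seed(&s, seed);
	for (i = 0; i < n; i++)
		hits += sampler_match(&s, threshold);
	return hits;
}
```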
     

18 Dec, 2015

1 commit

  • This allows redirecting bridged packets to the local machine:

    ether type ip ether daddr set aa:53:08:12:34:56 meta pkttype set unicast

    Without 'set unicast', the ip stack discards PACKET_OTHERHOST skbs.

    It is also useful for adding support for a '-m cluster'-like nft rule
    (where the switch floods packets to several nodes, and each cluster
    node processes a subset of packets for load distribution).

    Mangling is restricted to HOST/OTHER/BROAD/MULTICAST, i.e. you cannot set
    skb->pkt_type to PACKET_KERNEL or change PACKET_LOOPBACK to PACKET_HOST.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
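
The mangling restriction above can be sketched as a guard applied to both the current and the requested packet type. The constants are assumed to mirror the Linux PACKET_* numbering, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror the Linux PACKET_* numbering. */
enum {
	PKT_HOST      = 0,
	PKT_BROADCAST = 1,
	PKT_MULTICAST = 2,
	PKT_OTHERHOST = 3,
	PKT_OUTGOING  = 4,
	PKT_LOOPBACK  = 5,
	PKT_KERNEL    = 7,
};

/* The range check below relies on the four mangleable types being 0..3. */
static_assert(PKT_OTHERHOST == 3, "mangleable types must be contiguous");

/* Only HOST/BROADCAST/MULTICAST/OTHERHOST may be read or written. */
static int pkt_type_mangle_ok(uint8_t type)
{
	return type <= PKT_OTHERHOST;
}

/* Return the packet type after an attempted "pkttype set". The change
 * is applied only when both the old and the new type are mangleable. */
static uint8_t set_pkt_type(uint8_t current, uint8_t wanted)
{
	if (pkt_type_mangle_ok(current) && pkt_type_mangle_ok(wanted))
		return wanted;
	return current;
}
```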
     

15 Dec, 2015

1 commit


09 Dec, 2015

2 commits

  • Only needed when meta nftrace rule(s) were added.
    The assumption is that no such rules are active, so the call to
    nft_trace_init is "never" needed.

    When nftrace rules are active, we always call the nft_trace_* functions,
    but will only send netlink messages when all of the following are true:

    - traceinfo structure was initialised
    - skb->nf_trace == 1
    - at least one subscriber to trace group.

    Adding an extra conditional

    (static_branch ... && skb->nf_trace)
    nft_trace_init( ..)

    is possible but results in a larger nft_do_chain footprint.

    Signed-off-by: Florian Westphal
    Acked-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Introduce sock->sk_cgrp_data which is a struct sock_cgroup_data.
    ->sk_cgroup_prioidx and ->sk_classid are moved into it. The struct
    and its accessors are defined in cgroup-defs.h. This is to prepare
    for overloading the fields with a cgroup pointer.

    This patch mostly performs equivalent conversions but the following
    are noteworthy.

    * Equality test before updating classid is removed from
    sock_update_classid(). This shouldn't make any noticeable
    difference and a similar test will be implemented on the helper side
    later.

    * sock_update_netprioidx() now takes struct sock_cgroup_data and can
    be moved to netprio_cgroup.h without causing an include dependency
    loop. Moved.

    * The dummy version of sock_update_netprioidx() is converted to a
    static inline function while at it.

    Signed-off-by: Tejun Heo
    Signed-off-by: David S. Miller

    Tejun Heo
     

09 Nov, 2015

1 commit


19 Sep, 2015

1 commit

  • - Add nft_pktinfo.pf to replace ops->pf
    - Add nft_pktinfo.hook to replace ops->hooknum

    This simplifies the code, makes it more readable, and likely reduces
    cache line misses. Maintainability is enhanced as the details of
    nft_hook_ops are of no concern to the recipients of nft_pktinfo.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Pablo Neira Ayuso

    Eric W. Biederman
     

22 Jul, 2015

1 commit


13 Apr, 2015

8 commits

  • Switch the nf_tables registers from 128 bit addressing to 32 bit
    addressing to support so-called concatenations, where multiple values
    can be concatenated over multiple registers for O(1) exact matches of
    multiple dimensions using sets.

    The old register values are mapped to areas of 128 bits for compatibility.
    When dumping register numbers, values are expressed using the old values
    if they refer to the beginning of a 128 bit area for compatibility.

    To support concatenations, register loads of less than a full 32 bit
    value need to be padded. This mainly affects the payload and exthdr
    expressions, which both unconditionally zero the last word before
    copying the data.

    Userspace fully passes the testsuite using both old and new register
    addressing.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Add helper functions to parse and dump register values in netlink attributes.
    These helpers will later be changed to take care of translation between the
    old 128 bit and the new 32 bit register numbers.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Simple conversion to use u32 pointers to the beginning of the registers
    to keep follow up patches smaller.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Replace the array of registers passed to expressions by a struct nft_regs,
    containing the verdict as a separate member, which aliases to the
    NFT_REG_VERDICT register.

    This is needed to separate the verdict from the data registers completely,
    so their size can be changed.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Change nft_validate_input_register() to not only validate the input
    register number, but also the length of the load, and rename it to
    nft_validate_register_load() to reflect that change.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • All users of nft_validate_register_store() first invoke
    nft_validate_output_register(). There is in fact no use for using it
    on its own, so simplify the code by folding the functionality into
    nft_validate_register_store() and kill it.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • The existing name is ambiguous: data is also loaded when we read from
    a register. Rename to nft_validate_register_store() for clarity and
    consistency with the upcoming patch to introduce its counterpart,
    nft_validate_register_load().

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • For values spanning multiple registers, we need to validate that enough
    space is available from the destination register onwards. Add a len
    argument to nft_validate_data_load() and consolidate the existing length
    validations in preparation of that.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
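
Two ideas from this series can be sketched together: translating between the old 128-bit register numbers and 32-bit word indices, and validating that a load or store of a given length fits from the chosen register onwards. The numeric layout (old numbers 0 to 4, 32-bit numbers starting at 8, 20 words in total) is an assumption of this sketch, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	WORD_SIZE     = 4,  /* bytes per 32-bit register */
	WORDS_PER_OLD = 4,  /* one old 128-bit register spans 4 words */
	OLD_REG_LAST  = 4,  /* old numbering: 0 (verdict) .. 4 */
	NEW_REG_FIRST = 8,  /* assumed first 32-bit register number */
	TOTAL_WORDS   = 20, /* verdict area plus 16 data words */
};

static_assert(TOTAL_WORDS == (OLD_REG_LAST + 1) * WORDS_PER_OLD,
	      "old and new layouts must cover the same area");

/* Netlink register number -> index into the 32-bit word array. */
static uint32_t reg_parse(uint32_t nl_reg)
{
	if (nl_reg <= OLD_REG_LAST)
		return nl_reg * WORDS_PER_OLD;
	assert(nl_reg >= NEW_REG_FIRST);
	return nl_reg - NEW_REG_FIRST + WORDS_PER_OLD;
}

/* Word index -> netlink number. The old number is used when the index
 * sits at the start of a 128-bit area, for compatibility. */
static uint32_t reg_dump(uint32_t idx)
{
	if (idx % WORDS_PER_OLD == 0)
		return idx / WORDS_PER_OLD;
	return idx - WORDS_PER_OLD + NEW_REG_FIRST;
}

/* Check that `len` bytes fit from data word `idx` onwards.
 * Returns 0 if valid, -1 for a bad register or zero length,
 * -2 when the access would run past the last register. */
static int validate_reg_range(uint32_t idx, size_t len)
{
	if (idx < WORDS_PER_OLD || idx >= TOTAL_WORDS)
		return -1; /* verdict area or out of bounds */
	if (len == 0)
		return -1;
	if (len > (size_t)(TOTAL_WORDS - idx) * WORD_SIZE)
		return -2;
	return 0;
}
```

Under these assumptions, old register 1 and 32-bit register 8 both resolve to word 4, and a 64-byte access starting there is the largest one that still fits.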
     

01 Apr, 2015

1 commit


25 Mar, 2015

1 commit


18 Mar, 2015

1 commit


09 Nov, 2014

1 commit


09 Sep, 2014

1 commit


24 Aug, 2014

2 commits

  • Add cpu support to the meta expression.

    This allows you to match packets by cpu number.

    Signed-off-by: Ana Rey
    Signed-off-by: Pablo Neira Ayuso

    Ana Rey
     
  • Add pkttype support for ip, ipv6 and inet families of tables.

    This allows you to fetch the meta packet type based on the link layer
    information. The loopback traffic is a special case, the packet type
    is guessed from the network layer header.

    No special handling for bridge and arp since we're not going to see
    such traffic in the loopback interface.

    Joint work with Alvaro Neira Ayuso

    Signed-off-by: Alvaro Neira Ayuso
    Signed-off-by: Ana Rey
    Signed-off-by: Pablo Neira Ayuso

    Ana Rey
     

23 Apr, 2014

1 commit


03 Apr, 2014

1 commit