16 Jun, 2018

1 commit

  • commit b71534583f22d08c3e3563bf5100aeb5f5c9fbe5 upstream.

    nft_ct_helper_obj_dump() always dereferences priv->helper4.
    But if the family is ipv6, priv->helper6 should be dereferenced instead.

    Steps to reproduce:

    #test.nft
    table ip6 filter {
            ct helper ftp {
                    type "ftp" protocol tcp
            }
            chain input {
                    type filter hook input priority 4;
                    ct helper set "ftp"
            }
    }

    %nft -f test.nft
    %nft list ruleset

    we can see the below messages:

    [ 916.286233] kasan: GPF could be caused by NULL-ptr deref or user memory access
    [ 916.294777] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
    [ 916.302613] Modules linked in: nft_objref nf_conntrack_sip nf_conntrack_snmp nf_conntrack_broadcast nf_conntrack_ftp nft_ct nf_conntrack nf_tables nfnetlink [last unloaded: nfnetlink]
    [ 916.318758] CPU: 1 PID: 2093 Comm: nft Not tainted 4.17.0-rc4+ #181
    [ 916.326772] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
    [ 916.338773] RIP: 0010:strlen+0x1a/0x90
    [ 916.342781] RSP: 0018:ffff88010ff0f2f8 EFLAGS: 00010292
    [ 916.346773] RAX: dffffc0000000000 RBX: ffff880119b26ee8 RCX: ffff88010c150038
    [ 916.354777] RDX: 0000000000000002 RSI: ffff880119b26ee8 RDI: 0000000000000010
    [ 916.362773] RBP: 0000000000000010 R08: 0000000000007e88 R09: ffff88010c15003c
    [ 916.370773] R10: ffff88010c150037 R11: ffffed002182a007 R12: ffff88010ff04040
    [ 916.378779] R13: 0000000000000010 R14: ffff880119b26f30 R15: ffff88010ff04110
    [ 916.387265] FS: 00007f57a1997700(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000
    [ 916.394785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 916.402778] CR2: 00007f57a0ac80f0 CR3: 000000010ff02000 CR4: 00000000001006e0
    [ 916.410772] Call Trace:
    [ 916.414787] nft_ct_helper_obj_dump+0x94/0x200 [nft_ct]
    [ 916.418779] ? nft_ct_set_eval+0x560/0x560 [nft_ct]
    [ 916.426771] ? memset+0x1f/0x40
    [ 916.426771] ? __nla_reserve+0x92/0xb0
    [ 916.434774] ? memcpy+0x34/0x50
    [ 916.434774] nf_tables_fill_obj_info+0x484/0x860 [nf_tables]
    [ 916.442773] ? __nft_release_basechain+0x600/0x600 [nf_tables]
    [ 916.450779] ? lock_acquire+0x193/0x380
    [ 916.454771] ? lock_acquire+0x193/0x380
    [ 916.458789] ? nf_tables_dump_obj+0x148/0xcb0 [nf_tables]
    [ 916.462777] nf_tables_dump_obj+0x5f0/0xcb0 [nf_tables]
    [ 916.470769] ? __alloc_skb+0x30b/0x500
    [ 916.474779] netlink_dump+0x752/0xb50
    [ 916.478775] __netlink_dump_start+0x4d3/0x750
    [ 916.482784] nf_tables_getobj+0x27a/0x930 [nf_tables]
    [ 916.490774] ? nft_obj_notify+0x100/0x100 [nf_tables]
    [ 916.494772] ? nf_tables_getobj+0x930/0x930 [nf_tables]
    [ 916.502579] ? nf_tables_dump_flowtable_done+0x70/0x70 [nf_tables]
    [ 916.506774] ? nft_obj_notify+0x100/0x100 [nf_tables]
    [ 916.514808] nfnetlink_rcv_msg+0x8ab/0xa86 [nfnetlink]
    [ 916.518771] ? nfnetlink_rcv_msg+0x550/0xa86 [nfnetlink]
    [ 916.526782] netlink_rcv_skb+0x23e/0x360
    [ 916.530773] ? nfnetlink_bind+0x200/0x200 [nfnetlink]
    [ 916.534778] ? debug_check_no_locks_freed+0x280/0x280
    [ 916.542770] ? netlink_ack+0x870/0x870
    [ 916.546786] ? ns_capable_common+0xf4/0x130
    [ 916.550765] nfnetlink_rcv+0x172/0x16c0 [nfnetlink]
    [ 916.554771] ? sched_clock_local+0xe2/0x150
    [ 916.558774] ? sched_clock_cpu+0x144/0x180
    [ 916.566575] ? lock_acquire+0x380/0x380
    [ 916.570775] ? sched_clock_local+0xe2/0x150
    [ 916.574765] ? nfnetlink_net_init+0x130/0x130 [nfnetlink]
    [ 916.578763] ? sched_clock_cpu+0x144/0x180
    [ 916.582770] ? lock_acquire+0x193/0x380
    [ 916.590771] ? lock_acquire+0x193/0x380
    [ 916.594766] ? lock_acquire+0x380/0x380
    [ 916.598760] ? netlink_deliver_tap+0x262/0xa60
    [ 916.602766] ? lock_acquire+0x193/0x380
    [ 916.606766] netlink_unicast+0x3ef/0x5a0
    [ 916.610771] ? netlink_attachskb+0x630/0x630
    [ 916.614763] netlink_sendmsg+0x72a/0xb00
    [ 916.618769] ? netlink_unicast+0x5a0/0x5a0
    [ 916.626766] ? _copy_from_user+0x92/0xc0
    [ 916.630773] __sys_sendto+0x202/0x300
    [ 916.634772] ? __ia32_sys_getpeername+0xb0/0xb0
    [ 916.638759] ? lock_acquire+0x380/0x380
    [ 916.642769] ? lock_acquire+0x193/0x380
    [ 916.646761] ? finish_task_switch+0xf4/0x560
    [ 916.650763] ? __schedule+0x582/0x19a0
    [ 916.655301] ? __sched_text_start+0x8/0x8
    [ 916.655301] ? up_read+0x1c/0x110
    [ 916.655301] ? __do_page_fault+0x48b/0xaa0
    [ 916.655301] ? entry_SYSCALL_64_after_hwframe+0x59/0xbe
    [ 916.655301] __x64_sys_sendto+0xdd/0x1b0
    [ 916.655301] do_syscall_64+0x96/0x3d0
    [ 916.655301] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 916.655301] RIP: 0033:0x7f57a0ff5e03
    [ 916.655301] RSP: 002b:00007fff6367e0a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    [ 916.655301] RAX: ffffffffffffffda RBX: 00007fff6367f1e0 RCX: 00007f57a0ff5e03
    [ 916.655301] RDX: 0000000000000020 RSI: 00007fff6367e110 RDI: 0000000000000003
    [ 916.655301] RBP: 00007fff6367e100 R08: 00007f57a0ce9160 R09: 000000000000000c
    [ 916.655301] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff6367e110
    [ 916.655301] R13: 0000000000000020 R14: 00007f57a153c610 R15: 0000562417258de0
    [ 916.655301] Code: ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 fa 53 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df 48 89 fd 48 83 ec 08 b6 04 02 48 89 fa 83 e2 07 38 d0 7f
    [ 916.655301] RIP: strlen+0x1a/0x90 RSP: ffff88010ff0f2f8
    [ 916.771929] ---[ end trace 1065e048e72479fe ]---
    [ 916.777204] Kernel panic - not syncing: Fatal exception
    [ 916.778158] Kernel Offset: 0x14000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

    Signed-off-by: Taehee Yoo
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
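
The selection this fix describes can be sketched in plain C. This is a minimal illustration, not the upstream patch: `struct helper`, `struct helper_obj`, `helper_for_dump()` and the `FAMILY_*` constants are hypothetical stand-ins for the kernel types.

```c
#include <stddef.h>

/* Hypothetical family constants, stand-ins for the kernel's NFPROTO_* values. */
enum { FAMILY_IPV4 = 2, FAMILY_IPV6 = 10 };

/* Hypothetical, reduced helper type: only the field a dump would read. */
struct helper {
    const char *name;
};

/* One pointer per address family; an ip6 table leaves helper4 NULL. */
struct helper_obj {
    struct helper *helper4;
    struct helper *helper6;
};

/*
 * Return the helper that is valid for the given family, so the caller
 * never dereferences the pointer belonging to the other family.
 */
static const struct helper *helper_for_dump(const struct helper_obj *priv,
                                            int family)
{
    if (family == FAMILY_IPV6)
        return priv->helper6;
    return priv->helper4;
}
```

With an ip6-only object, passing the name of the returned helper to strlen() is safe, whereas reading through `helper4` is the NULL dereference shown in the trace.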
     

04 Sep, 2017

1 commit


15 May, 2017

1 commit


19 Apr, 2017

1 commit

  • By default the kernel emits all ctnetlink events for a connection.
    This allows selecting the types of events to generate.

    This can be used to e.g. only send DESTROY events but no NEW/UPDATE ones
    and will work even if sysctl net.netfilter.nf_conntrack_events is set to 0.

    This was already possible via iptables' CT target, but the nft version has
    the advantage that it can also be used with already-established conntracks.

    The added nf_ct_is_template() check isn't a bug fix as we only support
    mark and labels (and unlike ecache the conntrack core doesn't copy those).

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
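
Selecting event types with a mask can be pictured as a per-event bit test. The sketch below is an assumption-laden illustration: the `ct_event` enum and `event_selected()` are hypothetical names, not the kernel's event cache API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event numbering; the kernel defines its own event enum. */
enum ct_event { EV_NEW, EV_UPDATE, EV_DESTROY };

/* An event is generated only if its bit is set in the per-connection mask. */
static bool event_selected(uint32_t eventmask, enum ct_event ev)
{
    return (eventmask & (UINT32_C(1) << ev)) != 0;
}
```

A mask containing only the `EV_DESTROY` bit corresponds to the "only send DESTROY events" case from the message.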
     

15 Apr, 2017

1 commit

  • resurrect an old patch from Pablo Neira to remove the untracked objects.

    Currently, there are four possible states of an skb wrt. conntrack.

    1. No conntrack attached, ct is NULL.
    2. Normal (kmem cache allocated) ct attached.
    3. a template (kmalloc'd), not in any hash tables at any point in time
    4. the 'untracked' conntrack, a percpu nf_conn object, tagged via
    IPS_UNTRACKED_BIT in ct->status.

    Untracked is supposed to be identical to case 1. It exists only
    so users can check

    -m conntrack --ctstate UNTRACKED vs.
    -m conntrack --ctstate INVALID

    e.g. attempts to set connmark on INVALID or UNTRACKED conntracks are
    supposed to be a no-op.

    Thus currently we need to check
    ct == NULL || nf_ct_is_untracked(ct)

    in a lot of places in order to avoid altering untracked objects.

    The other consequence of the percpu untracked object is that all
    -j NOTRACK (and, later, kfree_skb of such skbs) result in an atomic op
    (inc/dec the untracked conntracks refcount).

    This adds a new kernel-private ctinfo state, IP_CT_UNTRACKED, to
    make the distinction instead.

    The (few) places that care about packet invalid (ct is NULL) vs.
    packet untracked now need to test ct == NULL vs. ctinfo == IP_CT_UNTRACKED,
    but all other places can omit the nf_ct_is_untracked() check.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
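
The distinction this message draws can be sketched as follows, assuming a reduced model in which an untracked packet carries no conntrack pointer and is identified by its ctinfo value alone. All type and function names here are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, reduced ctinfo values; CTINFO_UNTRACKED plays IP_CT_UNTRACKED. */
enum ctinfo { CTINFO_ESTABLISHED, CTINFO_NEW, CTINFO_UNTRACKED };

enum pkt_state { PKT_INVALID, PKT_UNTRACKED, PKT_TRACKED };

struct conn {
    unsigned int mark;
};

/* The few callers that care about invalid vs. untracked look at ctinfo. */
static enum pkt_state classify(const struct conn *ct, enum ctinfo ctinfo)
{
    if (ct == NULL)
        return ctinfo == CTINFO_UNTRACKED ? PKT_UNTRACKED : PKT_INVALID;
    return PKT_TRACKED;
}

/* Everyone else only needs the NULL check; no separate untracked test. */
static void set_mark(struct conn *ct, unsigned int mark)
{
    if (ct == NULL)
        return;
    ct->mark = mark;
}
```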
     

08 Apr, 2017

1 commit


24 Mar, 2017

1 commit


16 Mar, 2017

1 commit


13 Mar, 2017

2 commits

  • this allows to assign connection tracking helpers to
    connections via nft objref infrastructure.

    The idea is to first specify a helper object:

    table ip filter {
    ct helper some-name {
    type "ftp"
    protocol tcp
    l3proto ip
    }
    }

    and then assign it via

    nft add ... ct helper set "some-name"

    helper assignment works for new conntracks only as we cannot expand the
    conntrack extension area once it has been committed to the main conntrack
    table.

    ipv4 and ipv6 protocols are stored separately so
    we can also handle families that observe both ipv4 and ipv6 traffic.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Currently, there are two different methods to store an u16 integer to
    the u32 data register. For example:
    u32 *dest = &regs->data[priv->dreg];
    1. *dest = 0; *(u16 *) dest = val_u16;
    2. *dest = val_u16;

    For method 1, the u16 value will be stored like this, either in
    big-endian or little-endian system:
    0          15          31
    +-+-+-+-+-+-+-+-+-+-+-+-+
    |   Value   |     0     |
    +-+-+-+-+-+-+-+-+-+-+-+-+

    For method 2, in little-endian system, the u16 value will be the same
    as listed above. But in big-endian system, the u16 value will be stored
    like this:
    0          15          31
    +-+-+-+-+-+-+-+-+-+-+-+-+
    |     0     |   Value   |
    +-+-+-+-+-+-+-+-+-+-+-+-+

    So when we later use "memcmp(&regs->data[priv->sreg], data, 2);" to
    compare in nft_cmp, nft_lookup expr ..., method 2 will get the wrong
    result in big-endian system, as bits 0~15 will always be zero.

    For the similar reason, when loading an u16 value from the u32 data
    register, we should use "*(u16 *) sreg;" instead of "(u16)*sreg;",
    the 2nd method will get the wrong value in the big-endian system.

    So introduce some wrapper functions to store/load an u8 or u16
    integer to/from the u32 data register, and use them in the right
    place.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
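
The store/load convention argued for in the u16 commit above can be sketched in portable C. This is an illustration under assumptions: `reg_store16()`, `reg_load16()` and `reg_matches16()` are hypothetical names, and memcpy() is used in place of the pointer cast from the message to stay within strict-aliasing rules.

```c
#include <stdint.h>
#include <string.h>

/* Method 1: clear the register, then place the u16 in its first two bytes. */
static void reg_store16(uint32_t *dreg, uint16_t val)
{
    *dreg = 0;
    memcpy(dreg, &val, sizeof(val));
}

/* Load from the first two bytes, mirroring the store above. */
static uint16_t reg_load16(const uint32_t *sreg)
{
    uint16_t val;

    memcpy(&val, sreg, sizeof(val));
    return val;
}

/* Byte-wise compare of the first two bytes, as a 2-byte memcmp would do. */
static int reg_matches16(const uint32_t *sreg, uint16_t expected)
{
    return memcmp(sreg, &expected, sizeof(expected)) == 0;
}
```

Because both the store and the compare address the same two leading bytes, the result does not depend on host byte order.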
     

24 Feb, 2017

1 commit


08 Feb, 2017

3 commits

  • zones allow tracking multiple connections sharing identical tuples,
    this is needed e.g. when tracking distinct vlans with overlapping ip
    addresses (conntrack is l2 agnostic).

    Thus the zone has to be set before the packet is picked up by the
    connection tracker. This is done by means of 'conntrack templates' which
    are conntrack structures used solely to pass this info from one netfilter
    hook to the next.

    The iptables CT target instantiates these connection tracking templates
    once per rule, i.e. the template is fixed/tied to particular zone, can
    be read-only and therefore be re-used by as many skbs simultaneously as
    needed.

    We can't follow this model because we want to take the zone id from
    an sreg at rule eval time so we could e.g. fill in the zone id from
    the packet's vlan id or e.g. an nftables key : value map.

    To avoid cost of per packet alloc/free of the template, use a percpu
    template 'scratch' object and use the refcount to detect the (unlikely)
    case where the template is still attached to another skb (i.e., previous
    skb was nfqueued ...).

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Next patch will add ZONE_ID set support which will need similar
    error unwind (put operation) as conntrack labels.

    Prepare for this: remove the 'label_got' boolean in favor
    of a switch statement that can be extended in next patch.

    As we already have that in the set_destroy function place that in
    a separate function and call it from the set init function.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Just like with counters the direction attribute is optional.
    We set priv->dir to MAX unconditionally to avoid duplicating the assignment
    for all keys with optional direction.

    For keys where direction is mandatory, existing code already returns
    an error.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
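
The scratch-template idea from the zone commit in this group (the first of the three) can be sketched as below. It is a single-threaded simplification with plain integers instead of atomics and one global object instead of a percpu one; `struct tmpl`, `tmpl_attach()` and `tmpl_put()` are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical, reduced template: a reference count and the zone id. */
struct tmpl {
    int refcnt;
    unsigned short zone_id;
};

/* Stand-in for the percpu scratch object; idle means refcnt == 1. */
static struct tmpl scratch = { 1, 0 };

/* Reuse the scratch template when idle, otherwise fall back to the heap. */
static struct tmpl *tmpl_attach(unsigned short zone_id)
{
    struct tmpl *t = &scratch;

    if (t->refcnt != 1) {
        /* Scratch object is still attached to an earlier packet. */
        t = calloc(1, sizeof(*t));
        if (t == NULL)
            return NULL;
    }
    t->zone_id = zone_id;
    t->refcnt++;            /* reference held by the packet */
    return t;
}

/* Drop the packet's reference; only heap templates can reach zero. */
static void tmpl_put(struct tmpl *t)
{
    if (--t->refcnt == 0)
        free(t);
}
```

The common path touches only the scratch object; the allocation happens only when the previous packet has not released its reference yet.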
     

02 Feb, 2017

1 commit


03 Jan, 2017

1 commit


05 Dec, 2016

1 commit

  • currently aliased to try_module_get/_put.
    Will be changed in next patch when we add functions to make use of ->net
    argument to store usercount per l3proto tracker.

    This is needed to avoid registering the conntrack hooks in all netns and
    later only enable connection tracking in those that need conntrack.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

26 Oct, 2016

1 commit

  • This patch adds notrack support.

    I decided to add a new expression, given that this doesn't fit into the
    existing set operation. Notrack doesn't need a source register, and a
    hypothetical NFT_CT_NOTRACK key makes no sense since matching the
    untracked state is done through NFT_CT_STATE.

    I'm placing this new notrack expression into nft_ct.c, I think a single
    module is too much.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

25 Sep, 2016

2 commits

  • NFT_CT_MARK is unrelated to direction, so if NFTA_CT_DIRECTION attr is
    specified, report EINVAL to the userspace. This validation check was
    already done at nft_ct_get_init, but we missed it in nft_ct_set_init.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     
  • Currently, if the user wants to match ct l3proto, we must specify the
    direction, for example:
      # nft add rule filter input ct original l3proto ipv4
                                     ^^^^^^^^
    Otherwise, an error message will be reported:
      # nft add rule filter input ct l3proto ipv4
      nft add rule filter input ct l3proto ipv4
      <cmdline>:1:1-38: Error: Could not process rule: Invalid argument
      add rule filter input ct l3proto ipv4
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    Actually, there's no need to require NFTA_CT_DIRECTION attr, because
    ct l3proto and protocol are unrelated to direction.

    And for compatibility, even if the user specifies the NFTA_CT_DIRECTION
    attr, do not report an error, just skip it.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
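
The direction rules from the two commits above can be summarised in one validation function. This is a sketch under assumptions: the `ct_key` enum and `check_direction()` are hypothetical, and `KEY_SRC` merely stands for any key where a direction is mandatory.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced key set. */
enum ct_key { KEY_MARK, KEY_L3PROTO, KEY_PROTOCOL, KEY_SRC };

/* Return 0 if the key/direction combination is acceptable, else -EINVAL. */
static int check_direction(enum ct_key key, bool has_direction)
{
    switch (key) {
    case KEY_MARK:
        /* Unrelated to direction: reject the attribute. */
        return has_direction ? -EINVAL : 0;
    case KEY_L3PROTO:
    case KEY_PROTOCOL:
        /* Unrelated to direction, but tolerated for compatibility. */
        return 0;
    case KEY_SRC:
        /* Direction is mandatory here. */
        return has_direction ? 0 : -EINVAL;
    }
    return -EINVAL;
}
```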
     

25 Jul, 2016

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter/IPVS updates for net-next

    The following patchset contains Netfilter/IPVS updates for net-next,
    they are:

    1) Count pre-established connections as active in "least connection"
    schedulers so that pre-established connections do not overload
    backend servers on peak demands, from Michal Kubecek via Simon Horman.

    2) Address a race condition when resizing the conntrack table by caching
    the bucket size when fully iterating over the hashtable in these
    three possible scenarios: 1) dump via /proc/net/nf_conntrack,
    2) unlinking userspace helper and 3) unlinking custom conntrack timeout.
    From Liping Zhang.

    3) Revisit early_drop() path to perform lockless traversal on conntrack
    eviction under stress, use del_timer() as synchronization point to
    avoid two CPUs evicting the same entry, from Florian Westphal.

    4) Move NAT hlist_head to nf_conn object, this simplifies the existing
    NAT extension and it doesn't increase size since recent patches to
    align nf_conn, from Florian.

    5) Use rhashtable for the by-source NAT hashtable, also from Florian.

    6) Don't allow --physdev-is-out from OUTPUT chain, just like
    --physdev-out is not either, from Hangbin Liu.

    7) Automagically set on nf_conntrack counters if the user tries to
    match ct bytes/packets from nftables, from Liping Zhang.

    8) Remove possible_net_t fields in nf_tables set objects since we just
    simply pass the net pointer to the backend set type implementations.

    9) Fix possible off-by-one in h323, from Toby DiPasquale.

    10) early_drop() may be called from the ctnetlink path, so we must hold
    the rcu read side lock there too, this amends Florian's patch #3
    coming in this batch, from Liping Zhang.

    11) Use binary search to validate jump offset in x_tables, this
    addresses the O(n!) validation that was introduced recently to
    resolve security issues with unprivileged namespaces, from Florian.

    12) Fix reference leak to connlabel in error path of nft_ct, from Zhang.

    13) Three updates for nft_log: Fix log prefix leak in error path. Bail
    out on loglevel larger than debug in nft_log and set on the new
    NF_LOG_F_COPY_LEN flag when snaplen is specified. Again from Zhang.

    14) Allow to filter rule dumps in nf_tables based on table and chain
    names.

    15) Simplify connlabel to always use 128 bits to store labels and
    get rid of unused function in xt_connlabel, from Florian.

    16) Replace set_expect_timeout() by mod_timer() from the h323 conntrack
    helper, by Gao Feng.

    17) Put back x_tables module reference in nft_compat on error, from
    Liping Zhang.

    18) Add a reference count to the x_tables extensions cache in
    nft_compat, so we can remove them when unused and avoid a crash
    if the extensions are rmmod, again from Zhang.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

22 Jul, 2016

1 commit

  • The conntrack label extension is currently variable-sized, e.g. if
    only 2 labels are used by iptables rules then the labels->bits[] array
    will only contain one element.

    We track size of each label storage area in the 'words' member.

    But in nftables and openvswitch we always have to ask for worst-case
    since we don't know what bit will be used at configuration time.

    As most arches are 64bit we need to allocate 24 bytes in this case:

    struct nf_conn_labels {
            u8 words;              /*     0     1 */
            /* XXX 7 bytes hole, try to pack */
            long unsigned bits[2]; /*     8    24 */

    Make bits a fixed size and drop the words member, it simplifies
    the code and only increases memory requirements on x86 when
    less than 64bit labels are required.

    We still only allocate the extension if it's needed.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
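
A fixed-size label area as described here could look like the sketch below. The names `struct conn_labels`, `LABEL_BITS` and `label_set_bit()` are hypothetical; only the 128-bit size comes from the surrounding commit messages.

```c
#include <limits.h>

#define LABEL_BITS 128
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Fixed-size storage: no 'words' member, so no padding hole either. */
struct conn_labels {
    unsigned long bits[LABEL_BITS / BITS_PER_LONG];
};

/* Set one label bit; returns -1 if the bit is outside the fixed area. */
static int label_set_bit(struct conn_labels *labels, unsigned int bit)
{
    if (bit >= LABEL_BITS)
        return -1;
    labels->bits[bit / BITS_PER_LONG] |= 1UL << (bit % BITS_PER_LONG);
    return 0;
}
```

The array holds two elements where long is 64 bits wide and four where it is 32 bits wide, so the structure covers 128 bits in both cases.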
     

20 Jul, 2016

1 commit

  • We only get nf_connlabels if the user adds a ct label set expr successfully,
    but we also put nf_connlabels if the user deletes a ct label get expr.
    This is mismatched, and will cause the ct label expr to not work properly.

    Also, if something fails during init, we should put nf_connlabels back.
    Otherwise, we may waste memory on an allocation that will never be used.

    Signed-off-by: Liping Zhang
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     

11 Jul, 2016

1 commit

  • If we want to use the ct packets expr and add a rule as follows:
    # nft add rule filter input ct packets gt 1 counter

    We will find that no packets hit it, because
    nf_conntrack_acct is disabled by default. So it will
    not work until we enable it manually via
    "echo 1 > /proc/sys/net/netfilter/nf_conntrack_acct".

    This is not friendly, so like xt_connbytes does, if the user
    wants to use the ct byte/packet expr, enable nf_conntrack_acct
    automatically.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     

08 Jul, 2016

1 commit

  • We need to compute timeout.expires - jiffies, not the other way around.
    Add a helper, another patch can then later change more places in
    conntrack code where we currently open-code this.

    Will allow us to only change one place later when we remove per-ct timer.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
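
The direction of the subtraction matters because the tick counter wraps. A minimal sketch of such a helper, assuming plain unsigned long tick values and the hypothetical name `expires_in()`:

```c
#include <limits.h>

/*
 * Remaining ticks until 'expires', given the current tick count 'now'.
 * The subtraction wraps modulo 2^N, so it stays correct when the counter
 * overflows between 'now' and 'expires'. A result in the upper half of
 * the range means the deadline has already passed; report 0 then.
 */
static unsigned long expires_in(unsigned long expires, unsigned long now)
{
    unsigned long delta = expires - now;

    return delta > (unsigned long)LONG_MAX ? 0 : delta;
}
```

Computing `now - expires` instead would yield the elapsed time since the deadline, which is the mistake the message describes.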
     

05 May, 2016

1 commit

  • Conntrack labels are currently sized depending on the iptables
    ruleset, i.e. if we're asked to test or set bits 1, 2, and 65 then we
    would allocate enough room to store at least bit 65.

    However, with nft, the input is just a register with arbitrary runtime
    content.

    We therefore ask for the upper ceiling we currently have, which is
    enough room to store 128 bits.

    Alternatively, we could alter nf_connlabel_replace to increase
    net->ct.label_words at run time, but since 128 bits is not that
    big we'd only save sizeof(long) so it doesn't seem worth it for now.

    This follows a similar approach that xtables 'connlabel'
    match uses, so when user inputs

    ct label set bar

    then we will set the bit used by the 'bar' label and leave the rest alone.

    This is done by passing the sreg content to nf_connlabels_replace
    as both value and mask argument.
    Labels (bits) already set thus cannot be re-set to zero, but
    this is not supported by xtables connlabel match either.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
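
Passing the same register content as both value and mask has the effect sketched below. This models a single 32-bit word with hypothetical helper names; it is a simplified illustration, not the kernel's nf_connlabels_replace().

```c
#include <stdint.h>

/* Replace only the bits selected by 'mask' with the matching bits of 'data'. */
static uint32_t label_word_replace(uint32_t old, uint32_t data, uint32_t mask)
{
    return (old & ~mask) | (data & mask);
}

/* "ct label set": the register content serves as both value and mask. */
static uint32_t label_word_set(uint32_t old, uint32_t sreg)
{
    return label_word_replace(old, sreg, sreg);
}
```

With value equal to mask the operation reduces to a bitwise OR, which is why bits that are already set cannot be cleared this way.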
     

19 Apr, 2016

1 commit

  • nf_connlabel_set() takes the bit number that we would like to set.
    nf_connlabels_get() however took the number of bits that we want to
    support.

    So e.g. nf_connlabels_get(32) supports bits 0 to 31, but not 32.
    This changes nf_connlabels_get() to take the highest bit that we want
    to set.

    Callers then don't have to cope with a potential integer wrap
    when using nf_connlabels_get(bit + 1) anymore.

    Current callers are fine, this change is only to make follow-up
    nft ct label set support simpler.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
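
The difference between the two calling conventions can be sketched as follows, assuming 64-bit storage words and hypothetical function names.

```c
#include <limits.h>

#define WORD_BITS 64u

/* Old convention: the argument is the number of bits to support. */
static unsigned int words_for_bit_count(unsigned int nbits)
{
    return (nbits + WORD_BITS - 1) / WORD_BITS;
}

/* New convention: the argument is the highest bit number to be set. */
static unsigned int words_for_highest_bit(unsigned int bit)
{
    return bit / WORD_BITS + 1;
}
```

With the old convention a caller holding a bit number has to pass `bit + 1`, which wraps to 0 for the largest value; the new convention has no such edge case.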
     

15 Jan, 2016

1 commit


09 Jan, 2016

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter updates for net-next

    The following patchset contains Netfilter updates for net-next, they are:

    1) Release nf_tables objects on netns destructions via
    nft_release_afinfo().

    2) Destroy basechain and rules on netdevice removal in the new netdev
    family.

    3) Get rid of defensive check against removal of inactive objects in
    nf_tables.

    4) Pass down netns pointer to our existing nfnetlink callbacks, as well
    as commit() and abort() nfnetlink callbacks.

    5) Allow to invert limit expression in nf_tables, so we can throttle
    overlimit traffic.

    6) Add packet duplication for the netdev family.

    7) Add forward expression for the netdev family.

    8) Define pr_fmt() in conntrack helpers.

    9) Don't leave nfqueue configuration on inconsistent state in case of
    errors, from Ken-ichirou MATSUZAWA, follow up patches are also from
    him.

    10) Skip queue option handling after unbind.

    11) Return error on unknown command both in nfqueue and nflog.

    12) Autoload ctnetlink when NFQA_CFG_F_CONNTRACK is set.

    13) Add new NFTA_SET_USERDATA attribute to store user data in sets,
    from Carlos Falgueras.

    14) Add support for 64 bit byteorder changes in nf_tables, from Florian
    Westphal.

    15) Add conntrack byte/packet counter matching support to nf_tables,
    also from Florian.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

08 Jan, 2016

1 commit


18 Dec, 2015

1 commit


13 Apr, 2015

8 commits

  • Switch the nf_tables registers from 128 bit addressing to 32 bit
    addressing to support so called concatenations, where multiple values
    can be concatenated over multiple registers for O(1) exact matches of
    multiple dimensions using sets.

    The old register values are mapped to areas of 128 bits for compatibility.
    When dumping register numbers, values are expressed using the old values
    if they refer to the beginning of a 128 bit area for compatibility.

    To support concatenations, register loads of less than a full 32 bit
    value need to be padded. This mainly affects the payload and exthdr
    expressions, which both unconditionally zero the last word before
    copying the data.

    Userspace fully passes the testsuite using both old and new register
    addressing.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Add helper functions to parse and dump register values in netlink attributes.
    These helpers will later be changed to take care of translation between the
    old 128 bit and the new 32 bit register numbers.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Simple conversion to use u32 pointers to the beginning of the registers
    to keep follow up patches smaller.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Replace the array of registers passed to expressions by a struct nft_regs,
    containing the verdict as a separate member, which aliases to the
    NFT_REG_VERDICT register.

    This is needed to separate the verdict from the data registers completely,
    so their size can be changed.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Change nft_validate_input_register() to not only validate the input
    register number, but also the length of the load, and rename it to
    nft_validate_register_load() to reflect that change.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • All users of nft_validate_register_store() first invoke
    nft_validate_output_register(). There is in fact no use for using it
    on its own, so simplify the code by folding the functionality into
    nft_validate_register_store() and kill it.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • The existing name is ambiguous, data is loaded as well when we read from
    a register. Rename to nft_validate_register_store() for clarity and
    consistency with the upcoming patch to introduce its counterpart,
    nft_validate_register_load().

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • For values spanning multiple registers, we need to validate that enough
    space is available from the destination register onwards. Add a len
    argument to nft_validate_data_load() and consolidate the existing length
    validations in preparation of that.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
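
The compatibility mapping described in the first commit of this batch (old 128 bit register numbers mapped onto 32 bit slots, and dumped back as old numbers when they start a 128 bit area) can be sketched like this. The constants and function names are assumptions chosen for illustration: old registers are numbered 0 to 4, and new 32 bit registers start at number 8.

```c
/* Assumed layout: sizes in bytes and the numbering of both schemes. */
enum {
    REG_SIZE     = 16,  /* one old register: 128 bits */
    REG32_SIZE   = 4,   /* one new register: 32 bits */
    REG_OLD_LAST = 4,   /* highest old register number */
    REG32_FIRST  = 8    /* first new register number */
};

/* Netlink register number to 32 bit slot index. */
static unsigned int reg_parse(unsigned int reg)
{
    if (reg <= REG_OLD_LAST)
        return reg * REG_SIZE / REG32_SIZE;
    return reg + REG_SIZE / REG32_SIZE - REG32_FIRST;
}

/* 32 bit slot index back to a netlink register number. */
static unsigned int reg_dump(unsigned int slot)
{
    if (slot % (REG_SIZE / REG32_SIZE) == 0)
        return slot / (REG_SIZE / REG32_SIZE);
    return slot - REG_SIZE / REG32_SIZE + REG32_FIRST;
}
```

Under these assumptions old register 1 and new register 8 name the same slot, and a slot that begins a 128 bit area is reported with its old number.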
     

09 Apr, 2015

1 commit

  • More recent GCC warns about two kinds of switch statement uses:

    1) Switching on an enumeration, but not having an explicit case
    statement for all members of the enumeration. To show the
    compiler this is intentional, we simply add a default case
    with nothing more than a break statement.

    2) Switching on a boolean value. I think this warning is dumb
    but nevertheless you get it wholesale with -Wswitch.

    This patch cures all such warnings in netfilter.

    Signed-off-by: David S. Miller
    Acked-by: Pablo Neira Ayuso

    David Miller