09 Dec, 2020

1 commit

  • Since commit 656c8e9cc1ba ("netfilter: conntrack: Use consistent ct id
    hash calculation") the ct id will not change from initialization to
    confirmation. Removing the confirmation check allows for things like
    adding an element to a 'typeof ct id' set in prerouting upon reception
    of the first packet of a new connection, and then being able to
    reference that set consistently both before and after the connection
    is confirmed.

    Fixes: 656c8e9cc1ba ("netfilter: conntrack: Use consistent ct id hash calculation")
    Signed-off-by: Brett Mastbergen
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Brett Mastbergen
     

22 Jul, 2020

1 commit

  • Replace the existing /* fall through */ comments and their variants
    with the new pseudo-keyword macro fallthrough[1]. Also, remove
    unnecessary fall-through markings where applicable.

    [1] https://www.kernel.org/doc/html/latest/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
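
    As a rough user-space illustration of the pseudo-keyword, the sketch
    below defines a fallthrough macro in the same spirit and uses it in a
    switch. The key_len() helper and its width classes are hypothetical,
    and the macro assumes a compiler that understands
    __attribute__((__fallthrough__)) (otherwise it degrades to a no-op).

```c
/* Assumption: GCC >= 7 or a recent Clang; other compilers get a no-op. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

/* Hypothetical helper: each width class also includes the smaller ones. */
static int key_len(int width_class)
{
	int len = 0;

	switch (width_class) {
	case 2:
		len += 2;
		fallthrough;	/* deliberate: class 2 continues into class 1 */
	case 1:
		len += 1;
		fallthrough;	/* deliberate: class 1 continues into class 0 */
	case 0:
		len += 1;
		break;
	default:
		len = -1;	/* unknown class */
		break;
	}
	return len;
}
```

    With the macro in place, compilers that warn about implicit
    fall-through can tell the deliberate cases apart from a missing break.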

    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Pablo Neira Ayuso

    Gustavo A. R. Silva
     

25 Jun, 2020

1 commit


10 Dec, 2019

1 commit

  • Replace all the occurrences of FIELD_SIZEOF() with sizeof_field() except
    at places where these are defined. Later patches will remove the unused
    definition of FIELD_SIZEOF().

    This patch is generated using the following script:

    EXCLUDE_FILES="include/linux/stddef.h|include/linux/kernel.h"

    git grep -l -e "\bFIELD_SIZEOF\b" | while read file;
    do
        if [[ "$file" =~ $EXCLUDE_FILES ]]; then
            continue
        fi
        sed -i -e 's/\bFIELD_SIZEOF\b/sizeof_field/g' "$file";
    done
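
    For readers unfamiliar with the macro, here is a minimal user-space
    sketch of what sizeof_field() computes. The macro body follows the
    usual null-pointer-member idiom; struct demo_tuple and the two helper
    functions are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of a struct member without needing an instance of the struct. */
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Hypothetical structure, used only for this example. */
struct demo_tuple {
	uint8_t  l4proto;
	uint16_t port;
	uint32_t addr[4];
};

static size_t demo_port_len(void)
{
	return sizeof_field(struct demo_tuple, port);
}

static size_t demo_addr_len(void)
{
	/* For an array member this is the size of the whole array. */
	return sizeof_field(struct demo_tuple, addr);
}
```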

    Signed-off-by: Pankaj Bharadiya
    Link: https://lore.kernel.org/r/20190924105839.110713-3-pankaj.laxminarayan.bharadiya@intel.com
    Co-developed-by: Kees Cook
    Signed-off-by: Kees Cook
    Acked-by: David Miller # for net

    Pankaj Bharadiya
     

16 Jul, 2019

1 commit

  • When conntracks change during a dialog, SDP messages may be sent from
    different conntracks to establish expects with identical tuples. In this
    case an expect conflict may be detected for the 2nd SDP message, and
    processing of that message fails.

    The fix is to reuse an existing expect that has the same tuple but
    belongs to a different conntrack, if there is one.

    Here are two scenarios for the case.

    1)
         SERVER                   CPE

           |      INVITE SDP       |
      5060 |<----------------------|5060
           |       183 SDP         |
      5060 |---------------------->|5060 ===> Conntrack 1
           |        PRACK          |
     50601 |<----------------------|5060
           |    200 OK (INVITE)    |
      5060 |---------------------->|5060
           |         ACK           |
     50601 |<----------------------|
           |                       |
           |   INVITE SDP (t38)    |
     50601 |---------------------->|5060 ===> Conntrack 2

    With a certain configuration in the CPE, SIP messages "183 with SDP" and
    "re-INVITE with SDP t38" will go through the sip helper to create
    expects for RTP and RTCP.

    It is okay to create RTP and RTCP expects for "183", whose master
    connection source port is 5060, and destination port is 5060.

    In the "183" message, the port in the Contact header changes to 50601
    (from the original 5060), so the following requests, e.g. PRACK and ACK,
    are sent to port 50601. Because of the port difference they belong to a
    different conntrack (call it Conntrack 2) from the original INVITE (call
    it Conntrack 1).

    In this example, after the call is established, there is an RTP stream
    but no RTCP stream for Conntrack 1, so the RTP expect created upon "183"
    is cleared, while the RTCP expect created for Conntrack 1 remains.

    When "re-INVITE with SDP t38" arrives to create RTP&RTCP expects, the
    current ALG implementation calls nf_ct_expect_related() for RTP and RTCP.
    The expect tuples are identical to those for Conntrack 1. The RTP expect
    for Conntrack 2 is created successfully because the one for Conntrack 1
    has been removed. The RTCP expect for Conntrack 2 cannot be created
    because its tuple is identical to, and therefore 'conflicts' with, the
    one retained for Conntrack 1. As a result, processing of the re-INVITE
    fails.

    2)

    SERVER A CPE

    | REGISTER |
    5060 | CT1
    | 200 |
    5060 |------------------>| 5060
    | |
    | INVITE SDP(1) |
    5060 || 5060 SERVER B
    | ACK |
    5060 || 5060 ==> CT2
    | 100 |
    5060 || 50601 ==> CT3
    | |
    ||
    | |
    | BYE |
    5060 || 50601
    | INVITE SDP(3) |
    5060 | CT1

    CPE sends an INVITE request(1) to Server A, and creates an RTP&RTCP
    expect pair for this Conntrack 1 (CT1). Server A responds 300 to redirect
    to Server B. The RTP&RTCP expect pair created on CT1 is removed upon the
    300 response.

    CPE sends the INVITE request(2) to Server B, and creates an expect pair
    for the new conntrack (due to the destination address difference), call
    it CT2. Server B changes the port to 50601 in the 200 OK response, and
    the following requests ACK and BYE from CPE are sent to 50601. The call
    is established. There is an RTP stream and no RTCP stream, so the RTP
    expect is removed and the RTCP expect for CT2 remains.

    As the BYE request is sent from port 50601, it belongs to another
    conntrack, call it CT3, different from CT2 due to the port difference.
    So the BYE request will not remove the RTCP expect for CT2.

    Then another outgoing call is made, possibly (though not necessarily)
    with the same RTP port. CPE first sends the INVITE request(3) to Server
    A, and tries to create an RTP&RTCP expect pair for this CT1. In the
    current ALG implementation, the RTCP expect for CT1 cannot be created
    because it 'conflicts' with the residual one for CT2. As a result the
    INVITE request fails to be sent.
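
    The clash and the reuse idea can be modelled in a few lines of
    user-space C. This is only a sketch under simplifying assumptions, not
    the kernel implementation: the table, struct demo_expect,
    demo_expect_related() and the reuse_other_master switch are all
    hypothetical names invented for the example.

```c
#include <stdint.h>

#define DEMO_MAX_EXPECT 8

/* Hypothetical, simplified expectation: a tuple plus the owning conntrack. */
struct demo_expect {
	int      in_use;
	uint32_t dst_addr;
	uint16_t dst_port;
	uint8_t  proto;
	int      master;	/* id of the conntrack that created the expect */
};

static struct demo_expect demo_table[DEMO_MAX_EXPECT];

static int demo_same_tuple(const struct demo_expect *e, uint32_t addr,
			   uint16_t port, uint8_t proto)
{
	return e->dst_addr == addr && e->dst_port == port && e->proto == proto;
}

/*
 * Returns the slot index of the expect that covers the tuple, or -1.
 * With reuse_other_master == 0, an identical tuple owned by another
 * conntrack is treated as a clash. With reuse_other_master != 0, the
 * existing expect is reused instead of failing.
 */
static int demo_expect_related(uint32_t addr, uint16_t port, uint8_t proto,
			       int master, int reuse_other_master)
{
	int i, free_slot = -1;

	for (i = 0; i < DEMO_MAX_EXPECT; i++) {
		struct demo_expect *e = &demo_table[i];

		if (!e->in_use) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (!demo_same_tuple(e, addr, port, proto))
			continue;
		if (e->master == master || reuse_other_master)
			return i;	/* keep using the existing expect */
		return -1;		/* clash with another conntrack's expect */
	}
	if (free_slot < 0)
		return -1;		/* table full */

	demo_table[free_slot] =
		(struct demo_expect){ 1, addr, port, proto, master };
	return free_slot;
}
```

    In both scenarios the residual RTCP expect plays the role of the entry
    already in the table, and the second conntrack is the caller with a
    different master.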

    Signed-off-by: xiao ruizhu
    Signed-off-by: Pablo Neira Ayuso

    xiao ruizhu
     

25 Jun, 2019

1 commit


19 Jun, 2019

2 commits

  • nf_ct_helper_ext_add may return null, which must then be checked.

    Fixes: 857b46027d6f ("netfilter: nft_ct: add ct expectations support")
    Reported-by: Colin Ian King
    Signed-off-by: Stéphane Veyret
    Signed-off-by: Pablo Neira Ayuso

    Stéphane Veyret
     
  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

17 Jun, 2019

1 commit

  • This patch allows adding, listing and deleting expectations via the nft
    objref infrastructure and assigning these expectations via nft rules.

    This allows manual port triggering when no helper is defined to manage a
    specific protocol. For example, if I have an online game whose protocol
    is based on an initial connection to TCP port 9753 of the server, and
    where the server opens a connection to port 9876, I can set rules as
    follows:

    table ip filter {
        ct expectation mygame {
            protocol udp;
            dport 9876;
            timeout 2m;
            size 1;
        }

        chain input {
            type filter hook input priority 0; policy drop;
            tcp dport 9753 ct expectation set "mygame";
        }

        chain output {
            type filter hook output priority 0; policy drop;
            udp dport 9876 ct status expected accept;
        }
    }

    Signed-off-by: Stéphane Veyret
    Signed-off-by: Pablo Neira Ayuso

    Stéphane Veyret
     

30 Apr, 2019

1 commit


28 Apr, 2019

2 commits

  • We currently have two levels of strict validation:

    1) liberal (default)
    - undefined (type >= max) & NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted
    - garbage at end of message accepted
    2) strict (opt-in)
    - NLA_UNSPEC attributes accepted
    - attribute length >= expected accepted

    Split out parsing strictness into four different options:
    * TRAILING - check that there's no trailing data after parsing
    attributes (in message or nested)
    * MAXTYPE - reject attrs > max known type
    * UNSPEC - reject attributes with NLA_UNSPEC policy entries
    * STRICT_ATTRS - strictly validate attribute size

    The default for future things should be *everything*.
    The current *_strict() is a combination of TRAILING and MAXTYPE,
    and is renamed to _deprecated_strict().
    The current regular parsing has none of this, and is renamed to
    *_parse_deprecated().
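
    The split into independent options can be pictured as a bitmask. The
    sketch below is illustrative only: the enum and macro names are local
    to the example (modelled on the four options listed above), and
    demo_rejected() is a hypothetical helper, not a kernel function.

```c
/* One bit per strictness option named in the list above. */
enum demo_validation {
	DEMO_VALIDATE_LIBERAL      = 0,
	DEMO_VALIDATE_TRAILING     = 1 << 0,
	DEMO_VALIDATE_MAXTYPE      = 1 << 1,
	DEMO_VALIDATE_UNSPEC       = 1 << 2,
	DEMO_VALIDATE_STRICT_ATTRS = 1 << 3,
};

/* The old *_strict() behaviour: TRAILING and MAXTYPE only. */
#define DEMO_VALIDATE_DEPRECATED_STRICT \
	(DEMO_VALIDATE_TRAILING | DEMO_VALIDATE_MAXTYPE)

/* "Everything", the intended default for new users. */
#define DEMO_VALIDATE_STRICT \
	(DEMO_VALIDATE_TRAILING | DEMO_VALIDATE_MAXTYPE | \
	 DEMO_VALIDATE_UNSPEC | DEMO_VALIDATE_STRICT_ATTRS)

/*
 * 'defects' uses the same bits to describe what is wrong with a message.
 * A message is rejected if any of its defects is checked by 'mask'.
 */
static int demo_rejected(unsigned int mask, unsigned int defects)
{
	return (mask & defects) != 0;
}
```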

    Additionally it allows us to selectively set one of the new flags
    even on old policies. Notably, the UNSPEC flag could be useful in
    this case, since it can be arranged (by filling in the policy) to
    not be an incompatible userspace ABI change, but would then going
    forward prevent forgetting attribute entries. Similar can apply
    to the POLICY flag.

    We end up with the following renames:
    * nla_parse -> nla_parse_deprecated
    * nla_parse_strict -> nla_parse_deprecated_strict
    * nlmsg_parse -> nlmsg_parse_deprecated
    * nlmsg_parse_strict -> nlmsg_parse_deprecated_strict
    * nla_parse_nested -> nla_parse_nested_deprecated
    * nla_validate_nested -> nla_validate_nested_deprecated

    Using spatch, of course:
    @@
    expression TB, MAX, HEAD, LEN, POL, EXT;
    @@
    -nla_parse(TB, MAX, HEAD, LEN, POL, EXT)
    +nla_parse_deprecated(TB, MAX, HEAD, LEN, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, TB, MAX, POL, EXT;
    @@
    -nlmsg_parse_strict(NLH, HDRLEN, TB, MAX, POL, EXT)
    +nlmsg_parse_deprecated_strict(NLH, HDRLEN, TB, MAX, POL, EXT)

    @@
    expression TB, MAX, NLA, POL, EXT;
    @@
    -nla_parse_nested(TB, MAX, NLA, POL, EXT)
    +nla_parse_nested_deprecated(TB, MAX, NLA, POL, EXT)

    @@
    expression START, MAX, POL, EXT;
    @@
    -nla_validate_nested(START, MAX, POL, EXT)
    +nla_validate_nested_deprecated(START, MAX, POL, EXT)

    @@
    expression NLH, HDRLEN, MAX, POL, EXT;
    @@
    -nlmsg_validate(NLH, HDRLEN, MAX, POL, EXT)
    +nlmsg_validate_deprecated(NLH, HDRLEN, MAX, POL, EXT)

    For this patch, don't actually add the strict, non-renamed versions
    yet so that it breaks compile if I get it wrong.

    Also, while at it, make nla_validate and nla_parse go down to a
    common __nla_validate_parse() function to avoid code duplication.

    Ultimately, this allows us to have very strict validation for every
    new caller of nla_parse()/nlmsg_parse() etc as re-introduced in the
    next patch, while existing things will continue to work as is.

    In effect then, this adds fully strict validation for any new command.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Even if the NLA_F_NESTED flag was introduced more than 11 years ago, most
    netlink based interfaces (including recently added ones) are still not
    setting it in kernel generated messages. Without the flag, message parsers
    not aware of attribute semantics (e.g. wireshark dissector or libmnl's
    mnl_nlmsg_fprintf()) cannot recognize nested attributes and won't display
    the structure of their contents.

    Unfortunately we cannot just add the flag everywhere as there may be
    userspace applications which check nlattr::nla_type directly rather than
    through a helper masking out the flags. Therefore the patch renames
    nla_nest_start() to nla_nest_start_noflag() and introduces nla_nest_start()
    as a wrapper adding NLA_F_NESTED. The calls which add NLA_F_NESTED manually
    are rewritten to use nla_nest_start().
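
    The userspace hazard mentioned above can be shown with a small
    standalone sketch. The flag values mirror those in the uapi header
    linux/netlink.h but are redefined under DEMO_ names so the example
    compiles on its own; struct demo_nlattr and the three helpers are
    hypothetical.

```c
#include <stdint.h>

/* Same bit positions as NLA_F_NESTED / NLA_F_NET_BYTEORDER in the uapi. */
#define DEMO_NLA_F_NESTED        (1 << 15)
#define DEMO_NLA_F_NET_BYTEORDER (1 << 14)
#define DEMO_NLA_TYPE_MASK       (~(DEMO_NLA_F_NESTED | DEMO_NLA_F_NET_BYTEORDER))

/* Simplified attribute header: length and type (type carries the flags). */
struct demo_nlattr {
	uint16_t nla_len;
	uint16_t nla_type;
};

/* What a helper does: strip the flag bits before looking at the type. */
static int demo_attr_type(const struct demo_nlattr *nla)
{
	return nla->nla_type & DEMO_NLA_TYPE_MASK;
}

/* Fragile: compares the raw field, so it breaks once a flag is set. */
static int demo_naive_is_type(const struct demo_nlattr *nla, int type)
{
	return nla->nla_type == type;
}

/* Robust: compares the masked type. */
static int demo_masked_is_type(const struct demo_nlattr *nla, int type)
{
	return demo_attr_type(nla) == type;
}
```

    An application written like demo_naive_is_type() is the kind that
    would stop recognising an attribute if the kernel started setting the
    flag on it, which is why the flag is only added for new callers.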

    Except for changes in include/net/netlink.h, the patch was generated using
    this semantic patch:

    @@ expression E1, E2; @@
    -nla_nest_start(E1, E2)
    +nla_nest_start_noflag(E1, E2)

    @@ expression E1, E2; @@
    -nla_nest_start_noflag(E1, E2 | NLA_F_NESTED)
    +nla_nest_start(E1, E2)

    Signed-off-by: Michal Kubecek
    Acked-by: Jiri Pirko
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Michal Kubecek
     

18 Jan, 2019

2 commits

  • It's now the same as __nf_ct_l4proto_find(), so rename that to
    nf_ct_l4proto_find and use it everywhere.

    It never returns NULL and doesn't need locks or reference counts.
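
    A lookup with these properties can be sketched as a plain switch over
    the protocol number with a generic fallback. Everything below
    (struct demo_l4proto, the demo_* objects and demo_l4proto_find()) is a
    hypothetical user-space model, not the kernel code.

```c
#include <netinet/in.h>	/* IPPROTO_TCP, IPPROTO_UDP, IPPROTO_ICMP */

/* Hypothetical tracker descriptor. */
struct demo_l4proto {
	unsigned char l4proto;
	const char *name;
};

/* Built-in, constant trackers: nothing to lock or reference-count. */
static const struct demo_l4proto demo_tcp     = { IPPROTO_TCP,  "tcp" };
static const struct demo_l4proto demo_udp     = { IPPROTO_UDP,  "udp" };
static const struct demo_l4proto demo_icmp    = { IPPROTO_ICMP, "icmp" };
static const struct demo_l4proto demo_generic = { 0,            "generic" };

/* Keyed by the protocol number alone; unknown protocols get the generic
 * tracker, so the result is never NULL. */
static const struct demo_l4proto *demo_l4proto_find(unsigned char l4proto)
{
	switch (l4proto) {
	case IPPROTO_TCP:
		return &demo_tcp;
	case IPPROTO_UDP:
		return &demo_udp;
	case IPPROTO_ICMP:
		return &demo_icmp;
	default:
		return &demo_generic;
	}
}
```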

    Before this series:
    302824 net/netfilter/nf_conntrack.ko
    21504 net/netfilter/nf_conntrack_proto_gre.ko

    text data bss dec hex filename
    6281 1732 4 8017 1f51 nf_conntrack_proto_gre.ko
    108356 20613 236 129205 1f8b5 nf_conntrack.ko

    After:
    294864 net/netfilter/nf_conntrack.ko
    text data bss dec hex filename
    106979 19557 240 126776 1ef38 nf_conntrack.ko

    so, even with builtin gre, total size got reduced.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • No need to get/put module owner reference, none of these can be removed
    anymore.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

28 Sep, 2018

1 commit


21 Sep, 2018

1 commit

  • l4 protocols are demuxed by l3num, l4num pair.

    However, almost all l4 trackers are l3 agnostic.

    Only exceptions are:
    - gre, icmp (ipv4 only)
    - icmpv6 (ipv6 only)

    This commit gets rid of the l3 mapping, l4 trackers can now be looked up
    by their IPPROTO_XXX value alone, which gets rid of the additional l3
    indirection.

    For icmp, icmpv6 and gre, add a check on state->pf and
    return -NF_ACCEPT in case we're asked to track e.g. icmpv6-in-ipv4;
    this seems more fitting than using the generic tracker.

    Additionally we can kill the 2nd l4proto definitions that were needed
    for v4/v6 split -- they are now the same so we can use single l4proto
    struct for each protocol, rather than two.

    The EXPORT_SYMBOLs can be removed as all these object files are
    part of nf_conntrack with no external references.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

17 Sep, 2018

1 commit


29 Aug, 2018

1 commit

  • Using a private template is problematic:

    1. We can't assign both a zone and a timeout policy
    (zone assigns a conntrack template, so we hit problem 1)
    2. Using a template needs to take care of ct refcount, else we'll
    eventually free the private template due to ->use underflow.

    This patch reworks template policy to instead work with existing conntrack.

    As long as such conntrack has not yet been placed into the hash table
    (unconfirmed) we can still add the timeout extension.

    The only caveat is that we now need to update/correct ct->timeout to
    reflect the initial/new state, otherwise the conntrack entry retains the
    default 'new' timeout.

    Side effect of this change is that setting the policy must
    now occur from chains that are evaluated *after* the conntrack lookup
    has taken place.

    No released kernel contains the timeout policy feature yet, so this change
    should be ok.

    Changes since v2:
    - don't handle 'ct is confirmed case'
    - after previous patch, no need to special-case tcp/dccp/sctp timeout
    anymore

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

17 Aug, 2018

1 commit


07 Aug, 2018

2 commits

  • Enable conntrack if the user defines a helper to be used from the
    ruleset policy.

    Fixes: 1a64edf54f55 ("netfilter: nft_ct: add helper set support")
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • This patch allows adding, listing and deleting connection tracking
    timeout policies via the nft objref infrastructure and assigning these
    timeouts via nft rules.

    %./libnftnl/examples/nft-ct-timeout-add ip raw cttime tcp

    Ruleset:

    table ip raw {
        ct timeout cttime {
            protocol tcp;
            policy = {established: 111, close: 13 }
        }

        chain output {
            type filter hook output priority -300; policy accept;
            ct timeout set "cttime"
        }
    }

    %./libnftnl/examples/nft-rule-ct-timeout-add ip raw output cttime

    %conntrack -E
    [NEW] tcp 6 111 ESTABLISHED src=172.16.19.128 dst=172.16.19.1
    sport=22 dport=41360 [UNREPLIED] src=172.16.19.1 dst=172.16.19.128
    sport=41360 dport=22

    %nft delete rule ip raw output handle
    %./libnftnl/examples/nft-ct-timeout-del ip raw cttime

    Joint work with Pablo Neira.

    Signed-off-by: Harsha Sharma
    Signed-off-by: Pablo Neira Ayuso

    Harsha Sharma
     

18 Jul, 2018

1 commit


03 Jun, 2018

2 commits


17 May, 2018

1 commit

  • nft_ct_helper_obj_dump() always dereferences priv->helper4, but if the
    family is ipv6, priv->helper6 should be dereferenced instead.
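
    The selection logic a dump routine needs can be sketched as follows.
    This is a user-space model under assumptions, not necessarily the exact
    code of the fix: the demo_* types and demo_pick_helper() are
    hypothetical, and the family constants simply mirror the usual
    NFPROTO_INET/IPV4/IPV6 values.

```c
#include <stddef.h>

/* Mirrors NFPROTO_INET, NFPROTO_IPV4 and NFPROTO_IPV6. */
enum {
	DEMO_NFPROTO_INET = 1,
	DEMO_NFPROTO_IPV4 = 2,
	DEMO_NFPROTO_IPV6 = 10,
};

struct demo_helper {
	const char *name;
};

/* Hypothetical object private data: either pointer may be NULL. */
struct demo_helper_obj {
	const struct demo_helper *helper4;
	const struct demo_helper *helper6;
};

/* Pick a helper pointer that is actually set, and report the family. */
static const struct demo_helper *
demo_pick_helper(const struct demo_helper_obj *priv, int *family)
{
	if (priv->helper4 && priv->helper6) {
		*family = DEMO_NFPROTO_INET;
		return priv->helper4;
	}
	if (priv->helper6) {
		*family = DEMO_NFPROTO_IPV6;
		return priv->helper6;
	}
	*family = DEMO_NFPROTO_IPV4;
	return priv->helper4;
}
```

    For an ip6-only table, helper4 is NULL, so reading the name through it
    is what leads to the strlen() fault shown in the trace.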

    Steps to reproduce:

    #test.nft
    table ip6 filter {
        ct helper ftp {
            type "ftp" protocol tcp
        }
        chain input {
            type filter hook input priority 4;
            ct helper set "ftp"
        }
    }

    %nft -f test.nft
    %nft list ruleset

    we can see the below messages:

    [ 916.286233] kasan: GPF could be caused by NULL-ptr deref or user memory access
    [ 916.294777] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
    [ 916.302613] Modules linked in: nft_objref nf_conntrack_sip nf_conntrack_snmp nf_conntrack_broadcast nf_conntrack_ftp nft_ct nf_conntrack nf_tables nfnetlink [last unloaded: nfnetlink]
    [ 916.318758] CPU: 1 PID: 2093 Comm: nft Not tainted 4.17.0-rc4+ #181
    [ 916.326772] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
    [ 916.338773] RIP: 0010:strlen+0x1a/0x90
    [ 916.342781] RSP: 0018:ffff88010ff0f2f8 EFLAGS: 00010292
    [ 916.346773] RAX: dffffc0000000000 RBX: ffff880119b26ee8 RCX: ffff88010c150038
    [ 916.354777] RDX: 0000000000000002 RSI: ffff880119b26ee8 RDI: 0000000000000010
    [ 916.362773] RBP: 0000000000000010 R08: 0000000000007e88 R09: ffff88010c15003c
    [ 916.370773] R10: ffff88010c150037 R11: ffffed002182a007 R12: ffff88010ff04040
    [ 916.378779] R13: 0000000000000010 R14: ffff880119b26f30 R15: ffff88010ff04110
    [ 916.387265] FS: 00007f57a1997700(0000) GS:ffff88011b800000(0000) knlGS:0000000000000000
    [ 916.394785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 916.402778] CR2: 00007f57a0ac80f0 CR3: 000000010ff02000 CR4: 00000000001006e0
    [ 916.410772] Call Trace:
    [ 916.414787] nft_ct_helper_obj_dump+0x94/0x200 [nft_ct]
    [ 916.418779] ? nft_ct_set_eval+0x560/0x560 [nft_ct]
    [ 916.426771] ? memset+0x1f/0x40
    [ 916.426771] ? __nla_reserve+0x92/0xb0
    [ 916.434774] ? memcpy+0x34/0x50
    [ 916.434774] nf_tables_fill_obj_info+0x484/0x860 [nf_tables]
    [ 916.442773] ? __nft_release_basechain+0x600/0x600 [nf_tables]
    [ 916.450779] ? lock_acquire+0x193/0x380
    [ 916.454771] ? lock_acquire+0x193/0x380
    [ 916.458789] ? nf_tables_dump_obj+0x148/0xcb0 [nf_tables]
    [ 916.462777] nf_tables_dump_obj+0x5f0/0xcb0 [nf_tables]
    [ 916.470769] ? __alloc_skb+0x30b/0x500
    [ 916.474779] netlink_dump+0x752/0xb50
    [ 916.478775] __netlink_dump_start+0x4d3/0x750
    [ 916.482784] nf_tables_getobj+0x27a/0x930 [nf_tables]
    [ 916.490774] ? nft_obj_notify+0x100/0x100 [nf_tables]
    [ 916.494772] ? nf_tables_getobj+0x930/0x930 [nf_tables]
    [ 916.502579] ? nf_tables_dump_flowtable_done+0x70/0x70 [nf_tables]
    [ 916.506774] ? nft_obj_notify+0x100/0x100 [nf_tables]
    [ 916.514808] nfnetlink_rcv_msg+0x8ab/0xa86 [nfnetlink]
    [ 916.518771] ? nfnetlink_rcv_msg+0x550/0xa86 [nfnetlink]
    [ 916.526782] netlink_rcv_skb+0x23e/0x360
    [ 916.530773] ? nfnetlink_bind+0x200/0x200 [nfnetlink]
    [ 916.534778] ? debug_check_no_locks_freed+0x280/0x280
    [ 916.542770] ? netlink_ack+0x870/0x870
    [ 916.546786] ? ns_capable_common+0xf4/0x130
    [ 916.550765] nfnetlink_rcv+0x172/0x16c0 [nfnetlink]
    [ 916.554771] ? sched_clock_local+0xe2/0x150
    [ 916.558774] ? sched_clock_cpu+0x144/0x180
    [ 916.566575] ? lock_acquire+0x380/0x380
    [ 916.570775] ? sched_clock_local+0xe2/0x150
    [ 916.574765] ? nfnetlink_net_init+0x130/0x130 [nfnetlink]
    [ 916.578763] ? sched_clock_cpu+0x144/0x180
    [ 916.582770] ? lock_acquire+0x193/0x380
    [ 916.590771] ? lock_acquire+0x193/0x380
    [ 916.594766] ? lock_acquire+0x380/0x380
    [ 916.598760] ? netlink_deliver_tap+0x262/0xa60
    [ 916.602766] ? lock_acquire+0x193/0x380
    [ 916.606766] netlink_unicast+0x3ef/0x5a0
    [ 916.610771] ? netlink_attachskb+0x630/0x630
    [ 916.614763] netlink_sendmsg+0x72a/0xb00
    [ 916.618769] ? netlink_unicast+0x5a0/0x5a0
    [ 916.626766] ? _copy_from_user+0x92/0xc0
    [ 916.630773] __sys_sendto+0x202/0x300
    [ 916.634772] ? __ia32_sys_getpeername+0xb0/0xb0
    [ 916.638759] ? lock_acquire+0x380/0x380
    [ 916.642769] ? lock_acquire+0x193/0x380
    [ 916.646761] ? finish_task_switch+0xf4/0x560
    [ 916.650763] ? __schedule+0x582/0x19a0
    [ 916.655301] ? __sched_text_start+0x8/0x8
    [ 916.655301] ? up_read+0x1c/0x110
    [ 916.655301] ? __do_page_fault+0x48b/0xaa0
    [ 916.655301] ? entry_SYSCALL_64_after_hwframe+0x59/0xbe
    [ 916.655301] __x64_sys_sendto+0xdd/0x1b0
    [ 916.655301] do_syscall_64+0x96/0x3d0
    [ 916.655301] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [ 916.655301] RIP: 0033:0x7f57a0ff5e03
    [ 916.655301] RSP: 002b:00007fff6367e0a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    [ 916.655301] RAX: ffffffffffffffda RBX: 00007fff6367f1e0 RCX: 00007f57a0ff5e03
    [ 916.655301] RDX: 0000000000000020 RSI: 00007fff6367e110 RDI: 0000000000000003
    [ 916.655301] RBP: 00007fff6367e100 R08: 00007f57a0ce9160 R09: 000000000000000c
    [ 916.655301] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff6367e110
    [ 916.655301] R13: 0000000000000020 R14: 00007f57a153c610 R15: 0000562417258de0
    [ 916.655301] Code: ff ff ff 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 fa 53 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df 48 89 fd 48 83 ec 08 b6 04 02 48 89 fa 83 e2 07 38 d0 7f
    [ 916.655301] RIP: strlen+0x1a/0x90 RSP: ffff88010ff0f2f8
    [ 916.771929] ---[ end trace 1065e048e72479fe ]---
    [ 916.777204] Kernel panic - not syncing: Fatal exception
    [ 916.778158] Kernel Offset: 0x14000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)

    Signed-off-by: Taehee Yoo
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Taehee Yoo
     

20 Mar, 2018

1 commit


10 Jan, 2018

1 commit

  • Place all existing user defined tables in struct net *, instead of
    having one list per family. This saves us from one level of indentation
    in netlink dump functions.

    Place pointer to struct nft_af_info in struct nft_table temporarily, as
    we still need this to put back the module reference counter on table
    removal.

    This patch comes in preparation for the removal of struct nft_af_info.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

06 Nov, 2017

1 commit


04 Sep, 2017

1 commit


15 May, 2017

1 commit


19 Apr, 2017

1 commit

  • By default the kernel emits all ctnetlink events for a connection.
    This patch allows selecting the types of events to generate.

    This can be used to e.g. only send DESTROY events but no NEW/UPDATE ones
    and will work even if sysctl net.netfilter.nf_conntrack_events is set to 0.
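
    Event selection of this kind is essentially a per-connection bitmask.
    The sketch below models it with hypothetical names (enum demo_event and
    demo_event_wanted() are not kernel identifiers) and only three event
    types.

```c
/* Hypothetical event ids; one bit per id in the per-connection mask. */
enum demo_event {
	DEMO_EV_NEW,
	DEMO_EV_UPDATE,
	DEMO_EV_DESTROY,
};

/* Returns 1 if the connection's mask asks for this event type. */
static int demo_event_wanted(unsigned int ctmask, enum demo_event ev)
{
	return (ctmask >> ev) & 1u;
}
```

    A mask of 1u << DEMO_EV_DESTROY then corresponds to the "only DESTROY,
    no NEW/UPDATE" case described above.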

    This was already possible via iptables' CT target, but the nft version has
    the advantage that it can also be used with already-established conntracks.

    The added nf_ct_is_template() check isn't a bug fix as we only support
    mark and labels (and unlike ecache the conntrack core doesn't copy those).

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

15 Apr, 2017

1 commit

  • resurrect an old patch from Pablo Neira to remove the untracked objects.

    Currently, there are four possible states of an skb wrt. conntrack.

    1. No conntrack attached, ct is NULL.
    2. Normal (kmem cache allocated) ct attached.
    3. a template (kmalloc'd), not in any hash tables at any point in time
    4. the 'untracked' conntrack, a percpu nf_conn object, tagged via
    IPS_UNTRACKED_BIT in ct->status.

    Untracked is supposed to be identical to case 1. It exists only
    so users can check

    -m conntrack --ctstate UNTRACKED vs.
    -m conntrack --ctstate INVALID

    e.g. attempts to set connmark on INVALID or UNTRACKED conntracks is
    supposed to be a no-op.

    Thus currently we need to check
    ct == NULL || nf_ct_is_untracked(ct)

    in a lot of places in order to avoid altering untracked objects.

    The other consequence of the percpu untracked object is that all
    -j NOTRACK (and, later, kfree_skb of such skbs) result in an atomic op
    (inc/dec the untracked conntracks refcount).

    This adds a new kernel-private ctinfo state, IP_CT_UNTRACKED, to
    make the distinction instead.

    The (few) places that care about packet invalid (ct is NULL) vs.
    packet untracked now need to test ct == NULL vs. ctinfo == IP_CT_UNTRACKED,
    but all other places can omit the nf_ct_is_untracked() check.
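
    A small user-space model may help show why most callers get simpler.
    All names below are hypothetical stand-ins (the demo_ctinfo values do
    not match the kernel's enum); the point is only that untracked packets
    carry no conntrack pointer, with the distinction kept in ctinfo.

```c
#include <stddef.h>

/* Hypothetical per-packet state, loosely modelled on ctinfo. */
enum demo_ctinfo {
	DEMO_CT_ESTABLISHED,
	DEMO_CT_NEW,
	DEMO_CT_UNTRACKED,
};

struct demo_conn {
	unsigned int mark;
};

struct demo_pkt {
	struct demo_conn *ct;		/* NULL for invalid and untracked */
	enum demo_ctinfo ctinfo;
};

/* Most callers: a single NULL check covers invalid and untracked alike. */
static int demo_set_mark(struct demo_pkt *pkt, unsigned int mark)
{
	if (!pkt->ct)
		return 0;		/* no-op, nothing to alter */
	pkt->ct->mark = mark;
	return 1;
}

enum demo_state {
	DEMO_STATE_TRACKED,
	DEMO_STATE_INVALID,
	DEMO_STATE_UNTRACKED,
};

/* The few callers that care: tell invalid and untracked apart by ctinfo. */
static enum demo_state demo_classify(const struct demo_pkt *pkt)
{
	if (pkt->ct)
		return DEMO_STATE_TRACKED;
	return pkt->ctinfo == DEMO_CT_UNTRACKED ? DEMO_STATE_UNTRACKED
						: DEMO_STATE_INVALID;
}
```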

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

08 Apr, 2017

1 commit


24 Mar, 2017

1 commit


16 Mar, 2017

1 commit


13 Mar, 2017

2 commits

  • This allows assigning connection tracking helpers to
    connections via the nft objref infrastructure.

    The idea is to first specify a helper object:

    table ip filter {
        ct helper some-name {
            type "ftp"
            protocol tcp
            l3proto ip
        }
    }

    and then assign it via

    nft add ... ct helper set "some-name"

    helper assignment works for new conntracks only as we cannot expand the
    conntrack extension area once it has been committed to the main conntrack
    table.

    ipv4 and ipv6 protocols are stored separately so
    we can also handle families that observe both ipv4 and ipv6 traffic.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Currently, there are two different methods to store a u16 integer to
    the u32 data register. For example:
    u32 *dest = &regs->data[priv->dreg];
    1. *dest = 0; *(u16 *) dest = val_u16;
    2. *dest = val_u16;

    For method 1, the u16 value will be stored like this, either in
    big-endian or little-endian system:
    0          15          31
    +-+-+-+-+-+-+-+-+-+-+-+-+
    |   Value   |     0     |
    +-+-+-+-+-+-+-+-+-+-+-+-+

    For method 2, in little-endian system, the u16 value will be the same
    as listed above. But in big-endian system, the u16 value will be stored
    like this:
    0          15          31
    +-+-+-+-+-+-+-+-+-+-+-+-+
    |     0     |   Value   |
    +-+-+-+-+-+-+-+-+-+-+-+-+

    So when we later use "memcmp(&regs->data[priv->sreg], data, 2);" to
    compare in nft_cmp, nft_lookup expr ..., method 2 will get the wrong
    result on big-endian systems, as bits 0~15 will always be zero.

    For a similar reason, when loading a u16 value from the u32 data
    register, we should use "*(u16 *) sreg;" instead of "(u16)*sreg;";
    the 2nd method will get the wrong value on big-endian systems.

    So introduce some wrapper functions to store/load an u8 or u16
    integer to/from the u32 data register, and use them in the right
    place.
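
    The difference between the two methods can be reproduced in user space.
    The sketch below is an illustration, not the kernel wrappers
    themselves: the demo_* names are hypothetical, and memcpy() is used in
    place of the pointer cast to stay within standard C aliasing rules.

```c
#include <stdint.h>
#include <string.h>

/* Method 1: clear the register, then place the u16 in its first 2 bytes. */
static void demo_reg_store16(uint32_t *dreg, uint16_t val)
{
	*dreg = 0;
	memcpy(dreg, &val, sizeof(val));
}

/* Matching load: read the u16 back from the first 2 bytes. */
static uint16_t demo_reg_load16(const uint32_t *sreg)
{
	uint16_t val;

	memcpy(&val, sreg, sizeof(val));
	return val;
}

/* Method 2, for comparison: byte placement depends on host endianness. */
static void demo_reg_store16_plain(uint32_t *dreg, uint16_t val)
{
	*dreg = val;
}

/* What a 2-byte register comparison sees: 0 means "equal". */
static int demo_reg_cmp16(const uint32_t *sreg, uint16_t val)
{
	return memcmp(sreg, &val, sizeof(val));
}
```

    With demo_reg_store16() the 2-byte comparison matches on any host. With
    demo_reg_store16_plain() it only matches on little-endian hosts, which
    is the bug described above.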

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso

    Liping Zhang
     

24 Feb, 2017

1 commit


08 Feb, 2017

2 commits

  • zones allow tracking multiple connections sharing identical tuples,
    this is needed e.g. when tracking distinct vlans with overlapping ip
    addresses (conntrack is l2 agnostic).

    Thus the zone has to be set before the packet is picked up by the
    connection tracker. This is done by means of 'conntrack templates' which
    are conntrack structures used solely to pass this info from one netfilter
    hook to the next.

    The iptables CT target instantiates these connection tracking templates
    once per rule, i.e. the template is fixed/tied to particular zone, can
    be read-only and therefore be re-used by as many skbs simultaneously as
    needed.

    We can't follow this model because we want to take the zone id from
    an sreg at rule eval time so we could e.g. fill in the zone id from
    the packet's vlan id or from an nftables key : value map.

    To avoid cost of per packet alloc/free of the template, use a percpu
    template 'scratch' object and use the refcount to detect the (unlikely)
    case where the template is still attached to another skb (i.e., previous
    skb was nfqueued ...).

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Next patch will add ZONE_ID set support which will need similar
    error unwind (put operation) as conntrack labels.

    Prepare for this: remove the 'label_got' boolean in favor
    of a switch statement that can be extended in next patch.

    As we already have that in the set_destroy function place that in
    a separate function and call it from the set init function.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal