28 May, 2020

1 commit

  • Conntrack dump does not support kernel-side filtering (only "get" exists,
    but it returns a single entry, and the user has to give a full valid tuple).

    It means that userspace has to implement filtering after receiving many
    irrelevant entries, consuming resources (the conntrack table is sometimes
    very large, much larger than a routing table for example).

    This patch adds filtering on the kernel side. To achieve this goal, we:

    * Add a new CTA_FILTER netlink attribute, actually a flag list to
    parametrize filtering
    * Convert some *nlattr_to_tuple() functions to allow partial parsing
    of CTA_TUPLE_ORIG and CTA_TUPLE_REPLY (so nf_conntrack_tuple is not
    fully set)

    Filtering is now possible on:
    * IP SRC/DST values
    * Ports for TCP and UDP flows
    * ICMP(v6) codes, types and IDs

    Filtering is done as an "AND" operator. For example, when flags
    PROTO_SRC_PORT, PROTO_NUM and IP_SRC are set, only entries matching all
    values are dumped.
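
    A minimal userspace sketch of the "AND" semantics described above. The
    flag names, the sk_tuple struct and sk_tuple_match() are hypothetical and
    IPv4-only; they are not the kernel's CTA_FILTER definitions or its
    nf_conntrack_tuple layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only; these are NOT the
 * kernel's CTA_FILTER flag definitions. */
enum {
	SK_F_IP_SRC         = 1u << 0,
	SK_F_IP_DST         = 1u << 1,
	SK_F_PROTO_NUM      = 1u << 2,
	SK_F_PROTO_SRC_PORT = 1u << 3,
	SK_F_PROTO_DST_PORT = 1u << 4,
};

/* Simplified IPv4-only tuple (assumption); the real tuple is richer. */
struct sk_tuple {
	uint32_t ip_src, ip_dst;
	uint8_t  proto;
	uint16_t sport, dport;
};

/* Return true only if every field selected in 'flags' matches ("AND").
 * Fields whose flag is clear are ignored, so 'filter' may be partial. */
static bool sk_tuple_match(const struct sk_tuple *entry,
			   const struct sk_tuple *filter, unsigned int flags)
{
	assert(entry && filter);

	if ((flags & SK_F_IP_SRC) && entry->ip_src != filter->ip_src)
		return false;
	if ((flags & SK_F_IP_DST) && entry->ip_dst != filter->ip_dst)
		return false;
	if ((flags & SK_F_PROTO_NUM) && entry->proto != filter->proto)
		return false;
	if ((flags & SK_F_PROTO_SRC_PORT) && entry->sport != filter->sport)
		return false;
	if ((flags & SK_F_PROTO_DST_PORT) && entry->dport != filter->dport)
		return false;
	return true;
}
```

    With SK_F_IP_SRC, SK_F_PROTO_NUM and SK_F_PROTO_SRC_PORT set, an entry
    is kept only if all three fields agree; with no flag set, every entry
    passes.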

    Changes since v1:
    Set NLM_F_DUMP_FILTERED in nlm flags if entries are filtered

    Changes since v2:
    Move several constants to nf_internals.h
    Move a fix on netlink values check in a separate patch
    Add a check on not-supported flags
    Return EOPNOTSUPP if CTA_FILTER is set in ctnetlink_flush_conntrack
    (not yet implemented)
    Code style issues

    Changes since v3:
    Fix compilation warning reported by kbuild test robot

    Changes since v4:
    Fix a regression introduced in v3 (returned EINVAL for valid netlink
    messages without CTA_MARK)

    Changes since v5:
    Change definition of CTA_FILTER_F_ALL
    Fix a regression when CTA_TUPLE_ZONE is not set

    Signed-off-by: Romain Bellan
    Signed-off-by: Florent Fourcot
    Signed-off-by: Pablo Neira Ayuso

    Romain Bellan
     

13 Sep, 2019

1 commit


19 Jun, 2019

1 commit

  • Based on 2 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license version 2 as
    published by the free software foundation #

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 4122 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Enrico Weigelt
    Reviewed-by: Kate Stewart
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

13 Apr, 2019

1 commit

  • Luca Moro says:
    ------
    The issue lies in the filtering of ICMP and ICMPv6 errors that include an
    inner IP datagram.
    For these packets, icmp_error_message() extracts the ICMP error and inner
    layer to search for a known state.
    If a state is found the packet is tagged as related (IP_CT_RELATED).

    The problem is that there is no correlation check between the inner and
    outer layer of the packet.
    So one can encapsulate an error with an inner layer matching a known state,
    while its outer layer is directed to a filtered host.
    In this case the whole packet will be tagged as related.
    This has various implications, from a rule bypass (if a rule allows
    related traffic) to a known-state oracle.

    Unfortunately, we could not find a real statement in a RFC on how this case
    should be filtered.
    The closest we found is RFC5927 (Section 4.3) but it is not very clear.

    A possible fix would be to check that the inner IP source is the same as
    the outer destination.

    We believed this kind of attack was not documented yet, so we started to
    write a blog post about it.
    You can find it attached to this mail (sorry for the extract quality).
    It contains more technical details, PoC and discussion about the identified
    behavior.
    We discovered later that
    https://www.gont.com.ar/papers/filtering-of-icmp-error-messages.pdf
    described a similar attack concept in 2004 but without the stateful
    filtering in mind.
    -----

    This implements above suggested fix:
    In icmp(v6) error handler, take outer destination address, then pass
    that into the common function that does the "related" association.

    After obtaining the nf_conn of the matching inner-headers connection,
    check that the destination address of the opposite direction tuple
    is the same as the outer address and only set RELATED if that's the case.
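
    The check described above can be sketched in plain C as follows. This is
    an illustration under simplifying assumptions (IPv4 addresses as
    uint32_t, hypothetical sk_* names), not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sk_dir { SK_DIR_ORIGINAL = 0, SK_DIR_REPLY = 1 };

struct sk_addr_pair {
	uint32_t src, dst;
};

/* Hypothetical stand-in for a tracked connection: one address pair
 * per direction. */
struct sk_conn {
	struct sk_addr_pair tuple[2];
};

/* 'inner_dir' is the direction in which the packet quoted inside the
 * ICMP error matched the connection. A genuine error travels back to
 * the sender of that packet, i.e. to the destination address of the
 * opposite-direction tuple. Anything else must not be tagged RELATED. */
static bool sk_icmp_error_is_related(const struct sk_conn *ct,
				     enum sk_dir inner_dir,
				     uint32_t outer_daddr)
{
	enum sk_dir opposite;

	assert(ct);
	opposite = (inner_dir == SK_DIR_ORIGINAL) ? SK_DIR_REPLY
						  : SK_DIR_ORIGINAL;
	return ct->tuple[opposite].dst == outer_daddr;
}
```

    For a connection A -> B, an error quoting an A -> B packet is accepted as
    related only when the outer packet is addressed to A; the same error
    addressed to a third host C is rejected.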

    Reported-by: Luca Moro
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

18 Jan, 2019

6 commits


03 Nov, 2018

1 commit


21 Sep, 2018

4 commits

  • l4 protocols are demuxed by l3num, l4num pair.

    However, almost all l4 trackers are l3 agnostic.

    Only exceptions are:
    - gre, icmp (ipv4 only)
    - icmpv6 (ipv6 only)

    This commit gets rid of the l3 mapping, l4 trackers can now be looked up
    by their IPPROTO_XXX value alone, which gets rid of the additional l3
    indirection.

    For icmp, icmpv6 and gre, add a check on state->pf and
    return -NF_ACCEPT in case we're asked to track e.g. icmpv6-in-ipv4;
    this seems more fitting than using the generic tracker.

    Additionally we can kill the 2nd l4proto definitions that were needed
    for v4/v6 split -- they are now the same so we can use single l4proto
    struct for each protocol, rather than two.

    The EXPORT_SYMBOLs can be removed as all these object files are
    part of nf_conntrack with no external references.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • It's unused; the next patch will remove the l4proto->l3proto number to
    simplify the l4 protocol demuxer lookup.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • icmp(v6) are the only two layer four protocols that need the error()
    callback (to handle icmp errors that are related to an established
    connection, e.g. packet too big, port unreachable and the like).

    Remove the error callback and handle these two special cases from the core.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Only two protocols need the ->error() function: icmp and icmpv6.
    This is because icmp error messages might be RELATED to an existing
    connection (e.g. PMTUD, port unreachable and the like), and their
    ->error() handlers do this.

    The error callback is already optional, so remove it for
    udp and call the error handling from ->packet() instead.

    As the error() callback can call checksum functions that write to
    skb->csum*, the const qualifier has to be removed as well.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

20 Sep, 2018

2 commits

  • ->new() gets invoked after ->error() and before ->packet() if
    a conntrack lookup has found no result for the tuple.

    We can fold it into ->packet() -- the packet() implementations
    can check if the conntrack is confirmed (new) or not
    (already in hash).

    If it's unconfirmed, the conntrack isn't in the hash yet, so the current
    skb created a new conntrack entry.

    Only relevant side effect -- if packet() doesn't return NF_ACCEPT
    but -NF_ACCEPT (or drop), while the conntrack was just created,
    then the newly allocated conntrack is freed right away, rather than not
    created in the first place.
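
    A small sketch of the folding described above, assuming a hypothetical
    sk_entry struct and sk_packet() handler: the per-packet handler does the
    former new() work itself whenever the entry is still unconfirmed.

```c
#include <assert.h>
#include <stdbool.h>

#define SK_NF_ACCEPT 1	/* same value as the kernel's NF_ACCEPT */

enum { SK_STATE_NONE = 0, SK_STATE_NEW = 1 };

/* Hypothetical stand-in for a conntrack entry. */
struct sk_entry {
	bool confirmed;		/* true once the entry is in the hash */
	int state;
	unsigned int packets;
};

/* Single handler replacing the former new() + packet() pair. */
static int sk_packet(struct sk_entry *ct, bool valid)
{
	assert(ct);

	if (!ct->confirmed) {
		/* Entry was just allocated by this packet: do what new()
		 * used to do. On rejection the caller frees the entry,
		 * which is the side effect noted in the text. */
		if (!valid)
			return -SK_NF_ACCEPT;
		ct->state = SK_STATE_NEW;
	}

	ct->packets++;
	return SK_NF_ACCEPT;
}
```

    An already confirmed entry skips the initialisation branch and only has
    its per-packet state updated.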

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • nf_hook_state contains all the hook meta-information: netns, protocol family,
    hook location, and so on.

    Instead of only passing selected information, pass a pointer to the
    entire structure.

    This will allow merging the error and the packet handlers and removing
    the ->new() function in followup patches.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

11 Sep, 2018

1 commit

  • Now that cttimeout support for nft_ct is in place, these should depend
    on CONFIG_NF_CONNTRACK_TIMEOUT otherwise we can crash when dumping the
    policy if this option is not enabled.

    [ 71.600121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
    [...]
    [ 71.600141] CPU: 3 PID: 7612 Comm: nft Not tainted 4.18.0+ #246
    [...]
    [ 71.600188] Call Trace:
    [ 71.600201] ? nft_ct_timeout_obj_dump+0xc6/0xf0 [nft_ct]

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

17 Jul, 2018

1 commit

  • This unifies ipv4 and ipv6 protocol trackers and removes the l3proto
    abstraction.

    This gets rid of all l3proto indirect calls and the need to do
    a lookup on the function to call for l3 demux.

    It increases the size of nf_conntrack.ko by only a small amount
    (12 kbyte), so the total size is reduced, because nf_conntrack.ko is
    useless without either the nf_conntrack_ipv4 or nf_conntrack_ipv6 module.

    before:
       text    data     bss     dec     hex filename
       7357    1088       0    8445    20fd nf_conntrack_ipv4.ko
       7405    1084       4    8493    212d nf_conntrack_ipv6.ko
      72614   13689     236   86539   1520b nf_conntrack.ko
     19K nf_conntrack_ipv4.ko
     19K nf_conntrack_ipv6.ko
    179K nf_conntrack.ko

    after:
       text    data     bss     dec     hex filename
      79277   13937     236   93450   16d0a nf_conntrack.ko
    191K nf_conntrack.ko

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal