25 Aug, 2017

1 commit


01 May, 2017

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter/IPVS updates for net-next

    The following patchset contains Netfilter updates for your net-next
    tree. A large bunch of code cleanups, simplify the conntrack extension
    codebase, get rid of the fake conntrack object, speed up netns by
    selective synchronize_net() calls. More specifically, they are:

    1) Check for ct->status bit instead of using nfct_nat() from IPVS and
    Netfilter codebase, patch from Florian Westphal.

    2) Use kcalloc() wherever possible in the IPVS code, from Varsha Rao.

    3) Simplify FTP IPVS helper module registration path, from Arushi Singhal.

    4) Introduce nft_is_base_chain() helper function.

    5) Enforce expectation limit from userspace conntrack helper,
    from Gao Feng.

    6) Add nf_ct_remove_expect() helper function, from Gao Feng.

    7) NAT mangle helper function return boolean, from Gao Feng.

    8) ctnetlink_alloc_expect() should only work for conntrack with
    helpers, from Gao Feng.

    9) Add nfnl_msg_type() helper function to nfnetlink to build the
    netlink message type.

    10) Get rid of unnecessary cast on void, from simran singhal.

    11) Use seq_puts()/seq_putc() instead of seq_printf() where possible,
    also from simran singhal.

    12) Use list_prev_entry() from nf_tables, from simran singhal.

    13) Remove unnecessary & on pointer function in the Netfilter and IPVS
    code.

    14) Remove obsolete comment on set of rules per CPU in ip6_tables,
    no longer true. From Arushi Singhal.

    15) Remove duplicated nf_conntrack_l4proto_udplite4, from Gao Feng.

    16) Remove unnecessary nested rcu_read_lock() in
    __nf_nat_decode_session(). Code running from hooks is already
    guaranteed to run under the RCU read side.

    17) Remove deadcode in nf_tables_getobj(), from Aaron Conole.

    18) Remove double assignment in nf_ct_l4proto_pernet_unregister_one(),
    also from Aaron.

    19) Get rid of unused __ip_set_get_netlink(), from Aaron Conole.

    20) Don't propagate NF_DROP error to userspace via ctnetlink in
    __nf_nat_alloc_null_binding() function, from Gao Feng.

    21) Revisit nf_ct_deliver_cached_events() to remove unnecessary checks,
    from Gao Feng.

    22) Kill the fake untracked conntrack objects, use ctinfo instead to
    annotate that a conntrack object is untracked, from Florian Westphal.

    23) Remove nf_ct_is_untracked(), now obsolete since we have no
    conntrack template anymore, from Florian.

    24) Add event mask support to nft_ct, also from Florian.

    25) Move nf_conn_help structure to
    include/net/netfilter/nf_conntrack_helper.h.

    26) Add a fixed 32 bytes scratchpad area for conntrack helpers.
    Thus, we don't deal with variable conntrack extensions anymore.
    Make sure userspace conntrack helper doesn't go over that size.
    Remove variable size ct extension infrastructure now that this code
    has no more clients. From Florian Westphal.

    27) Restore offset and length of nf_ct_ext structure to 8 bytes now
    that wraparound is not possible any longer, also from Florian.

    28) Allow to get rid of unassured flows under stress in conntrack,
    this applies to DCCP, SCTP and TCP protocols, from Florian.

    29) Shrink size of nf_conntrack_ecache structure, from Florian.

    30) Use TCP_MAX_WSCALE instead of hardcoded 14 in TCP tracker,
    from Gao Feng.

    31) Register SYNPROXY hooks on demand, from Florian Westphal.

    32) Use pernet hook whenever possible, instead of global hook
    registration, from Florian Westphal.

    33) Pass hook structure to ebt_register_table() to consolidate some
    infrastructure code, from Florian Westphal.

    34) Use consume_skb() and return NF_STOLEN, instead of NF_DROP in the
    SYNPROXY code, to make sure device stats are not fooled, patch
    from Gao Feng.

    35) Remove NF_CT_EXT_F_PREALLOC. This kills quite some code that we
    don't need anymore if we just select a fixed size instead of an
    expensive runtime calculation of it. From Florian.

    36) Constify nf_ct_extend_register() and nf_ct_extend_unregister(),
    from Florian.

    37) Simplify nf_ct_ext_add(), this kills nf_ct_ext_create(), from
    Florian.

    38) Attach NAT extension on-demand from masquerade and pptp helper
    path, from Florian.

    39) Get rid of useless ip_vs_set_state_timeout(), from Aaron Conole.

    40) Speed up netns by selective calls of synchronize_net(), from
    Florian Westphal.

    41) Silence gcc stack size warning on 32-bit arches in the snmp helper,
    from Florian.

    42) Unconditionally call nf_ct_ext_destroy(), even if we have no
    extensions, to deal with the NF_NAT_MANIP_SRC case. Patch from
    Liping Zhang.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Apr, 2017

2 commits

  • The window scale may be enlarged from 14 to 15 according to the IETF
    draft https://tools.ietf.org/html/draft-nishida-tcpm-maxwin-03.

    Use the macro TCP_MAX_WSCALE to support it easily with TCP stack in
    the future.
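
As a rough illustration of why a named limit helps, the sketch below clamps a parsed window scale against a single macro. The value 14 is taken from the commit message; the helper name clamp_wscale() is hypothetical and is not the tracker's actual code.

```c
#include <stdint.h>

/* Assumption: stands in for the TCP stack's macro; 14 is the value
 * quoted in the commit message. */
#define TCP_MAX_WSCALE 14U

/* Hypothetical helper: clamp a window scale shift count taken from a
 * TCP option to the supported maximum. */
static uint8_t clamp_wscale(uint8_t wscale)
{
    return wscale > TCP_MAX_WSCALE ? (uint8_t)TCP_MAX_WSCALE : wscale;
}
```

If the limit is ever raised to 15, only the macro would need to change in a layout like this.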

    Signed-off-by: Gao Feng
    Signed-off-by: Pablo Neira Ayuso

    Gao Feng
     
  • If insertion of a new conntrack fails because the table is full, the kernel
    searches the next buckets of the hash slot where the new connection
    was supposed to be inserted for an entry that hasn't seen traffic
    in the reply direction (non-assured). If it finds one, that entry
    is dropped and the new connection entry is allocated.

    Allow the conntrack gc worker to also remove *assured* conntracks if
    resources are low.

    Do this by querying the l4 tracker, e.g. tcp connections are now dropped
    if they are no longer established (e.g. in finwait).

    This could be refined further, e.g. by adding 'soft' established timeout
    (i.e., a timeout that is only used once we get close to resource
    exhaustion).
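
The decision described above can be sketched as follows. This is a simplified model, not the kernel code: the enum, the function names and the exact set of states treated as "no longer established" are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical, simplified subset of TCP conntrack states. */
enum tcp_ct_state {
    CT_SYN_SENT,
    CT_ESTABLISHED,
    CT_FIN_WAIT,
    CT_CLOSE_WAIT,
    CT_LAST_ACK,
    CT_TIME_WAIT,
    CT_CLOSE
};

/* l4 tracker query: may this TCP conntrack be dropped early?
 * Assumption: every closing state counts as "no longer established". */
static bool tcp_can_early_drop(enum tcp_ct_state state)
{
    switch (state) {
    case CT_FIN_WAIT:
    case CT_CLOSE_WAIT:
    case CT_LAST_ACK:
    case CT_TIME_WAIT:
    case CT_CLOSE:
        return true;
    default:
        return false;
    }
}

/* gc worker decision for a single entry. */
static bool gc_may_drop(bool expired, bool assured, bool resources_low,
                        enum tcp_ct_state state)
{
    if (expired)
        return true;            /* normal garbage collection */
    if (!resources_low)
        return false;           /* no pressure: keep live entries */
    if (!assured)
        return true;            /* non-assured entries go first */
    return tcp_can_early_drop(state); /* assured: ask the l4 tracker */
}
```

With this shape, an assured connection in FIN_WAIT is only reclaimed under pressure, while an assured ESTABLISHED connection is kept.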

    Cc: Jozsef Kadlecsik
    Signed-off-by: Florian Westphal
    Acked-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

14 Apr, 2017

1 commit


02 Feb, 2017

1 commit


13 Aug, 2016

1 commit

  • This backward compatibility has been around for more than ten years,
    since Yasuyuki Kozakai introduced IPv6 in conntrack. These days, we have
    alternate /proc/net/nf_conntrack* entries, the ctnetlink interface and
    the conntrack utility got adopted by many people in the user community
    according to what I observed on the netfilter user mailing list.

    So let's get rid of this.

    Note that nf_conntrack_htable_size and nf_conntrack_max do
    not need to be exported as symbols anymore.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     

12 Aug, 2016

1 commit


24 Apr, 2016

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter updates for net-next

    The following patchset contains Netfilter updates for your net-next
    tree, mostly from Florian Westphal to sort out the lack of sufficient
    validation in x_tables and connlabel preparation patches to add
    nf_tables support. They are:

    1) Ensure we don't go over the ruleset blob boundaries in
    mark_source_chains().

    2) Validate that target jumps land on an existing xt_entry. This extra
    sanitization comes with a performance penalty when loading the ruleset.

    3) Introduce xt_check_entry_offsets() and use it from {arp,ip,ip6}tables.

    4) Get rid of the smallish check_entry() functions in {arp,ip,ip6}tables.

    5) Enforce the minimal possible target size in x_tables.

    6) Similar to #3, add xt_compat_check_entry_offsets() for compat code.

    7) Check that standard target size is valid.

    8) More sanitization to ensure that the target_offset field is correct.

    9) Add xt_check_entry_match() to validate that matches are well-formed.

    10-12) Three patches to reduce the number of parameters in
    translate_compat_table() for {arp,ip,ip6}tables by using a container
    structure.

    13) No need to return value from xt_compat_match_from_user(), so make
    it void.

    14) Consolidate translate_table() so it can be used by compat code too.

    15) Remove obsolete check for compat code, so we keep consistent with
    what was already removed in the native layout code (back in 2007).

    16) Get rid of target jump validation from mark_source_chains(),
    obsoleted by #2.

    17) Introduce xt_copy_counters_from_user() to consolidate counter
    copying, and use it from {arp,ip,ip6}tables.

    18,22) Get rid of unnecessary explicit inlining in ctnetlink for dump
    functions.

    19) Move nf_connlabel_match() to xt_connlabel.

    20) Skip event notification if connlabel did not change.

    21) Update of nf_connlabels_get() to make the upcoming nft connlabel
    support easier.

    23) Remove spinlock to read protocol state field in conntrack.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

20 Apr, 2016

1 commit


08 Apr, 2016

1 commit

  • Baozeng Ding reported a KASAN stack out of bounds issue - it uncovered that
    the TCP option parsing routines in netfilter TCP connection tracking could
    read one byte out of the buffer of the TCP options. Therefore in the patch
    we check that the available data length is large enough to parse both TCP
    option code and size.
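
A minimal sketch of such a bounds-checked option walk is shown below. It is not the patched kernel routine; find_tcp_option() and its return convention are hypothetical. The point is the check that at least two bytes remain before the size byte is read.

```c
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_EOL 0
#define TCPOPT_NOP 1

/* Hypothetical helper: return the offset of the first option of the
 * given kind inside opts[0..len), or -1 if it is absent or the option
 * list is malformed. */
static int find_tcp_option(const uint8_t *opts, size_t len, uint8_t kind)
{
    size_t i = 0;

    while (i < len) {
        uint8_t opcode = opts[i];
        uint8_t opsize;

        if (opcode == TCPOPT_EOL)
            return -1;
        if (opcode == TCPOPT_NOP) {     /* single-byte padding */
            i++;
            continue;
        }
        /* Both the option code and the size byte must be in the buffer. */
        if (len - i < 2)
            return -1;
        opsize = opts[i + 1];
        /* The size covers code and size bytes and must fit in the rest. */
        if (opsize < 2 || opsize > len - i)
            return -1;
        if (opcode == kind)
            return (int)i;
        i += opsize;
    }
    return -1;
}
```

Without the `len - i < 2` test, an option code in the last byte of the buffer would make the loop read the size byte one position past the end, which is the kind of one-byte overread described above.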

    Reported-by: Baozeng Ding
    Tested-by: Baozeng Ding
    Signed-off-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Jozsef Kadlecsik
     

19 Sep, 2015

1 commit


16 May, 2015

1 commit

  • In compliance with RFC5961, the network stack sends a challenge ACK in
    response to spurious SYN packets, since commit 0c228e833c88 ("tcp:
    Restore RFC5961-compliant behavior for SYN packets").

    This poses a problem for netfilter conntrack in state LAST_ACK, because
    this challenge ACK is (falsely) seen as ACKing last FIN, causing a
    false state transition (into TIME_WAIT).

    The challenge ACK is hard to distinguish from the real last ACK. Thus,
    the solution introduces a flag that tracks the potential for seeing a
    challenge ACK, in case a SYN packet is let through and current state
    is LAST_ACK.

    When the conntrack transition from LAST_ACK to TIME_WAIT happens, this flag is
    used for determining if we are expecting a challenge ACK.
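
The flag logic can be modelled roughly as below. This is a deliberately reduced sketch: the type and field names are hypothetical, only the LAST_ACK state is modelled, and packet direction is ignored.

```c
#include <stdbool.h>

/* Hypothetical, reduced model: only the states and packet kinds that
 * matter for the challenge ACK case. */
enum ct_state { ST_LAST_ACK, ST_TIME_WAIT };
enum pkt_kind { PKT_SYN, PKT_ACK };

struct ct_tcp {
    enum ct_state state;
    bool expect_challenge_ack;  /* hypothetical name for the new flag */
};

static void ct_tcp_packet(struct ct_tcp *ct, enum pkt_kind pkt)
{
    if (ct->state != ST_LAST_ACK)
        return;                 /* other states are not modelled */

    if (pkt == PKT_SYN) {
        /* A SYN is let through in LAST_ACK: the peer may answer it
         * with a challenge ACK, so remember that. */
        ct->expect_challenge_ack = true;
        return;
    }

    /* pkt == PKT_ACK */
    if (ct->expect_challenge_ack) {
        /* Treat this ACK as the challenge ACK, not the last ACK. */
        ct->expect_challenge_ack = false;
        return;
    }
    ct->state = ST_TIME_WAIT;   /* genuine last ACK */
}
```

In this model the first ACK after a SYN is absorbed and the connection stays in LAST_ACK; a later ACK moves it to TIME_WAIT as before.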

    Scapy-based reproducer script available here:
    https://github.com/netoptimizer/network-testing/blob/master/scapy/tcp_hacks_3WHS_LAST_ACK.py

    Fixes: 0c228e833c88 ("tcp: Restore RFC5961-compliant behavior for SYN packets")
    Signed-off-by: Jesper Dangaard Brouer
    Acked-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Jesper Dangaard Brouer
     

09 Dec, 2014

1 commit


06 Nov, 2014

2 commits

  • Since adding a new function to seq_file (seq_has_overflowed())
    there isn't any value for functions called from seq_show to
    return anything. Remove the int returns of the various
    print_tuple/_print_tuple functions.

    Link: http://lkml.kernel.org/p/f2e8cf8df433a197daa62cbaf124c900c708edc7.1412031505.git.joe@perches.com

    Cc: Pablo Neira Ayuso
    Cc: Patrick McHardy
    Cc: Jozsef Kadlecsik
    Cc: netfilter-devel@vger.kernel.org
    Cc: coreteam@netfilter.org
    Signed-off-by: Joe Perches
    Signed-off-by: Steven Rostedt

    Joe Perches
     
  • The seq_printf() and friends are having their return values removed.
    The print_conntrack() returns the result of seq_printf(), which is
    meaningless when seq_printf() returns void. Might as well remove the
    return values of print_conntrack() as well.

    Link: http://lkml.kernel.org/r/20141029220107.465008329@goodmis.org
    Acked-by: Pablo Neira Ayuso
    Cc: Patrick McHardy
    Cc: Jozsef Kadlecsik
    Cc: netfilter-devel@vger.kernel.org
    Cc: coreteam@netfilter.org
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

22 Oct, 2014

1 commit

  • When a port that was used to listen for inbound connections gets closed
    and reused for outgoing connections (like rsh ends up doing for stderr
    flow), currently we may reject the SYN/ACK packet for the new connection
    because the tcp_conntracks state table forbids a port to become a client while
    there is still a TIME_WAIT entry in there for it.

    As TCP may expire the TIME_WAIT socket in 60s and conntrack's timeout
    for it is 120s, there is a ~60s window that the application can end up
    opening a port that conntrack will end up blocking.

    This patch fixes this by simply allowing such state transition: if we
    see a SYN, in TIME_WAIT state, on REPLY direction, move it to sSS. Note
    that the rest of the code already handles this situation, more
    specifically in tcp_packet(), first switch clause.
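
The changed table cell can be pictured with the small sketch below. It models only the one transition named above (SYN, reply direction, TIME_WAIT). The enum, the function and the assumption that the cell was "invalid" before the change are illustrative and do not reproduce the real tcp_conntracks table.

```c
#include <stdbool.h>

/* Hypothetical subset of states; S_NOT_MODELLED marks every table row
 * this sketch deliberately leaves out. */
enum tcp_state { S_SYN_SENT, S_TIME_WAIT, S_INVALID, S_NOT_MODELLED };

/* Next state for a SYN seen in the REPLY direction.
 * allow_reopen == false models the assumed old behaviour. */
static enum tcp_state reply_syn_next_state(enum tcp_state cur,
                                           bool allow_reopen)
{
    if (cur != S_TIME_WAIT)
        return S_NOT_MODELLED;
    return allow_reopen ? S_SYN_SENT : S_INVALID;
}
```

With allow_reopen set, a port that is still held by a TIME_WAIT conntrack entry during the roughly 60 second gap between the TCP and conntrack timeouts can start a new connection as a client.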

    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Marcelo Leitner
     

28 Aug, 2013

2 commits

  • Add a SYNPROXY for netfilter. The code is split into two parts, the synproxy
    core with common functions and an address family specific target.

    The SYNPROXY receives the connection request from the client, responds with
    a SYN/ACK containing a SYN cookie and announcing a zero window and checks
    whether the final ACK from the client contains a valid cookie.

    It then establishes a connection to the original destination and, if
    successful, sends a window update to the client with the window size
    announced by the server.

    Support for timestamps, SACK, window scaling and MSS options can be
    statically configured as target parameters if the features of the server
    are known. If timestamps are used, the timestamp value sent back to
    the client in the SYN/ACK will be different from the real timestamp of
    the server. In order to not break PAWS, the timestamps are translated in
    the direction server->client.

    Signed-off-by: Patrick McHardy
    Tested-by: Martin Topholm
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Split out sequence number adjustments from NAT and move them to the conntrack
    core to make them usable for SYN proxying. The sequence number adjustment
    information is moved to a separate extend. The extend is added to new
    conntracks when a NAT mapping is set up for a connection using a helper.

    As a side effect, this saves 24 bytes per connection with NAT in the common
    case that a connection does not have a helper assigned.

    Signed-off-by: Patrick McHardy
    Tested-by: Martin Topholm
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     

21 Aug, 2013

1 commit

  • Conflicts:
    net/netfilter/nf_conntrack_proto_tcp.c

    The conflict had to do with overlapping changes dealing with
    fixing the use of an "s32" to hold the value returned by
    NAT_OFFSET().

    Pablo Neira Ayuso says:

    ====================
    The following batch contains Netfilter/IPVS updates for your net-next tree.
    More specifically, they are:

    * Trivial typo fix in xt_addrtype, from Phil Oester.

    * Remove net_ratelimit in the conntrack logging for consistency with other
    logging subsystem, from Patrick McHardy.

    * Remove unneeded includes from the recently added xt_connlabel support, from
    Florian Westphal.

    * Allow to update conntracks via nfqueue, don't need NFQA_CFG_F_CONNTRACK for
    this, from Florian Westphal.

    * Remove tproxy core, now that we have socket early demux, from Florian
    Westphal.

    * A couple of patches to refactor conntrack event reporting to save a good
    bunch of lines, from Florian Westphal.

    * Fix missing locking in NAT sequence adjustment, it has not manifested in
    any known bug so far, from Patrick McHardy.

    * Change sequence number adjustment variable to 32 bits, to delay the
    possible early overflow in long standing connections, also from Patrick.

    * Cosmetic cleanups for IPVS, from Dragos Foianu.

    * Fix possible null dereference in IPVS in the SH scheduler, from Daniel
    Borkmann.

    * Allow to attach conntrack expectations via nfqueue. Before this patch, you
    had to use ctnetlink instead, thus, we save the conntrack lookup.

    * Export xt_rpfilter and xt_HMARK header files, from Nicolas Dichtel.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

11 Aug, 2013

1 commit

  • Currently the conntrack checks if the ending sequence of a packet
    falls within the observed receive window. However it does so even
    if it has not observed any packet from the remote yet and uses an
    uninitialized receive window (td_maxwin).

    If a connection uses Fast Open to send a SYN-data packet which is
    dropped afterward in the network, the subsequent SYN retransmits
    will all fail this check and be discarded, leading to a connection
    timeout. This is because the SYN retransmit does not contain a data
    payload, so

    end == initial sequence number (isn) + 1
    sender->td_end == isn + syn_data_len
    receiver->td_maxwin == 0

    The fix is to only apply this check after td_maxwin is initialized.
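
The effect of the fix can be sketched with the three values listed above. The exact form of the comparison is an assumption made for illustration, and the helper names are hypothetical. What comes from the commit message is that the check is skipped while td_maxwin is still zero.

```c
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is after b" for 32-bit sequence numbers. */
static bool seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/* Assumed form of the window test: the packet's ending sequence must
 * lie no further back than one receive window from sender->td_end. */
static bool end_in_window(uint32_t end, uint32_t sender_td_end,
                          uint32_t receiver_td_maxwin)
{
    return seq_after(end, sender_td_end - receiver_td_maxwin - 1);
}

/* Fixed variant: skip the test until td_maxwin has been initialized. */
static bool in_recv_win(uint32_t end, uint32_t sender_td_end,
                        uint32_t receiver_td_maxwin)
{
    if (receiver_td_maxwin == 0)
        return true;
    return end_in_window(end, sender_td_end, receiver_td_maxwin);
}
```

For example, with isn = 1000 and 100 bytes of SYN data, a bare SYN retransmit has end = 1001 while sender->td_end = 1100. The unguarded test rejects it because td_maxwin is still 0, whereas the guarded variant lets it through.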

    Reported-by: Michael Chan
    Signed-off-by: Yuchung Cheng
    Acked-by: Eric Dumazet
    Acked-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Yuchung Cheng
     

01 Aug, 2013

1 commit


20 Jun, 2013

1 commit

  • When loose tracking is enabled (default), non-syn packets cause
    creation of new conntracks in established state with default timeout for
    established state (5 days). This causes the table to fill up with UNREPLIED
    entries when the 'new ack' packet happened to be the last-ack of a previous,
    already timed-out connection.

    Consider:

    A 192.168.x.52792 > 10.184.y.80: F, 426:426(0) ack 9237 win 255
    B 10.184.y.80 > 192.168.x.52792: ., ack 427 win 123

    C 10.184.y.80 > 192.168.x.52792: F, 9237:9237(0) ack 427 win 123
    D 192.168.x.52792 > 10.184.y.80: ., ack 9238 win 255

    B moves conntrack to CLOSE_WAIT and will kill it after 60 second timeout,
    C is ignored (FIN set), but last packet (D) causes new ct with 5-days timeout.

    Use UNACK timeout (5 minutes) instead to get rid of these entries sooner
    when in ESTABLISHED state without having seen traffic in both directions.
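
A minimal sketch of the timeout choice, using the two durations quoted above. The macro and function names are hypothetical, and the real tracker reads its timeouts from per-netns settings rather than constants.

```c
#include <stdbool.h>

/* Durations from the commit message, in seconds. */
#define TIMEOUT_ESTABLISHED (5L * 24 * 60 * 60)  /* 5 days */
#define TIMEOUT_UNACK       (5L * 60)            /* 5 minutes */

/* Timeout for a conntrack in ESTABLISHED state: the long timeout is
 * only used once traffic was seen in both directions. */
static long established_timeout(bool seen_reply)
{
    return seen_reply ? TIMEOUT_ESTABLISHED : TIMEOUT_UNACK;
}
```

An entry created by a stray last ACK (packet D in the trace) never sees a reply, so in this model it would expire after 5 minutes instead of 5 days.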

    Signed-off-by: Florian Westphal
    Acked-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     

19 Apr, 2013

1 commit

  • Add copyright statements to all netfilter files which have had significant
    changes done by myself in the past.

    Some notes:

    - nf_conntrack_ecache.c was incorrectly attributed to Rusty and Netfilter
    Core Team when it got split out of nf_conntrack_core.c. The copyrights
    even state a date which lies six years before it was written. It was
    written in 2005 by Harald and myself.

    - net/ipv{4,6}/netfilter.c, net/netfilter/nf_queue.c were missing copyright
    statements. I've added the copyright statement from net/netfilter/core.c,
    where this code originated

    - for nf_conntrack_proto_tcp.c I've also added Jozsef, since I didn't want
    it to give the wrong impression

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     

06 Apr, 2013

1 commit

  • This patch adds netns support to nf_log and it prepares netns
    support for existing loggers. It is composed of four major
    changes.

    1) nf_log_register has been split to two functions: nf_log_register
    and nf_log_set. The new nf_log_register is used to globally
    register the nf_logger and nf_log_set is used for enabling
    pernet support from nf_loggers.

    Per netns is not yet complete after this patch, it comes in
    separate follow up patches.

    2) Add net as a parameter of nf_log_bind_pf. Per netns is not
    yet complete after this patch, it only allows to bind the
    nf_logger to the protocol family from init_net and it skips
    other cases.

    3) Adapt all nf_log_packet callers to pass netns as parameter.
    After this patch, this function only works for init_net.

    4) Make the sysctl net/netfilter/nf_log pernet.

    Signed-off-by: Gao feng
    Signed-off-by: Pablo Neira Ayuso

    Gao feng
     

03 Dec, 2012

1 commit


15 Sep, 2012

1 commit

  • Conflicts:
    net/netfilter/nfnetlink_log.c
    net/netfilter/xt_LOG.c

    Rather easy conflict resolution, the 'net' tree had bug fixes to make
    sure we checked if a socket is a time-wait one or not and elide the
    logging code if so.

    Whereas on the 'net-next' side we are calculating the UID and GID from
    the creds using different interfaces due to the user namespace changes
    from Eric Biederman.

    Signed-off-by: David S. Miller

    David S. Miller
     

10 Sep, 2012

2 commits


30 Aug, 2012

1 commit


05 Jul, 2012

2 commits


28 Jun, 2012

2 commits


12 Jun, 2012

1 commit

  • This patch fixes the compilation of the TCP and UDP trackers with sysctl
    compilation disabled:

    net/netfilter/nf_conntrack_proto_udp.c: In function ‘udp_init_net_data’:
    net/netfilter/nf_conntrack_proto_udp.c:279:13: error: ‘struct nf_proto_net’ has no member named ‘user’
    net/netfilter/nf_conntrack_proto_tcp.c:1606:9: error: ‘struct nf_proto_net’ has no member named ‘user’
    net/netfilter/nf_conntrack_proto_tcp.c:1643:9: error: ‘struct nf_proto_net’ has no member named ‘user’

    Reported-by: Fengguang Wu
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Pablo Neira Ayuso
     

07 Jun, 2012

3 commits


17 May, 2012

1 commit

  • Extend log message if packets are ignored to include the TCP state, i.e.
    replace:

    [ 3968.070196] nf_ct_tcp: invalid packet ignored IN= OUT= SRC=...

    by:

    [ 3968.070196] nf_ct_tcp: invalid packet ignored in state ESTABLISHED IN= OUT= SRC=...

    This information is useful to know in what state we were while ignoring the
    packet.
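
How such a prefix could be built is sketched below. The function, the enum and the reduced name table are hypothetical. Only the message text follows the example lines above, and the trailing IN=/OUT=/SRC= fields are assumed to be appended by the logging backend.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of conntrack TCP states. */
enum ct_tcp_state {
    CT_NONE,
    CT_SYN_SENT,
    CT_SYN_RECV,
    CT_ESTABLISHED,
    CT_STATE_MAX
};

static const char *const ct_tcp_state_names[CT_STATE_MAX] = {
    "NONE", "SYN_SENT", "SYN_RECV", "ESTABLISHED",
};

/* Write the log prefix, including the state name, into buf.
 * Returns what snprintf() returns. */
static int format_ignored_prefix(char *buf, size_t len,
                                 enum ct_tcp_state state)
{
    const char *name = (unsigned int)state < CT_STATE_MAX
                           ? ct_tcp_state_names[state]
                           : "UNKNOWN";

    return snprintf(buf, len,
                    "nf_ct_tcp: invalid packet ignored in state %s ", name);
}
```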

    Signed-off-by: Pablo Neira Ayuso
    Acked-by: Jozsef Kadlecsik

    Pablo Neira Ayuso
     

13 Apr, 2012

1 commit