24 Jul, 2017

1 commit

  • This patch removes duplicate rcu_read_lock().

    1. IPVS part:

    According to Julian Anastasov, the IPVS contexts are described at:
    http://marc.info/?l=netfilter-devel&m=149562884514072&w=2. In summary:

    - packet RX/TX: does not need locks because packets come from hooks.
    - sync msg RX: backup server uses RCU locks while registering new
    connections.
    - ip_vs_ctl.c: configuration get/set, RCU locks needed.
    - xt_ipvs.c: It is a netfilter match, running from hook context.

    As a result, rcu_read_lock() and rcu_read_unlock() can be removed from:

    - ip_vs_core.c: all
    - ip_vs_ctl.c: only from ip_vs_has_real_service()
    - ip_vs_ftp.c: all
    - ip_vs_proto_sctp.c: all
    - ip_vs_proto_tcp.c: all
    - ip_vs_proto_udp.c: all
    - ip_vs_xmit.c: all (contains only packet processing)

    2. Netfilter part:

    Three types of functions are guaranteed to run under rcu_read_lock().
    First, functions that are only called from nf_hook():

    - nf_conntrack_broadcast_help(), pptp_expectfn(), set_expected_rtp_rtcp().
    - tcpmss_reverse_mtu(), tproxy_laddr4(), tproxy_laddr6().
    - match_lookup_rt6(), check_hlist(), hashlimit_mt_common().
    - xt_osf_match_packet().

    Second, functions whose callers already hold the rcu_read_lock():
    - destroy_conntrack(), ctnetlink_conntrack_event().
    - ctnl_timeout_find_get(), nfqnl_nf_hook_drop().

    Third, functions that are a mix of the first two types.

    These functions are called from nf_hook() and also from ordinary
    functions that already hold the rcu_read_lock():

    - __ctnetlink_glue_build(), ctnetlink_expect_event().
    - ctnetlink_proto_size().

    The affected files are:

    - nf_conntrack_broadcast.c, nf_conntrack_core.c, nf_conntrack_netlink.c.
    - nf_conntrack_pptp.c, nf_conntrack_sip.c, nfnetlink_cttimeout.c.
    - nfnetlink_queue.c, xt_TCPMSS.c, xt_TPROXY.c, xt_addrtype.c.
    - xt_connlimit.c, xt_hashlimit.c, xt_osf.c

    Detailed calltrace can be found at:
    http://marc.info/?l=netfilter-devel&m=149667610710350&w=2

    Signed-off-by: Taehee Yoo
    Acked-by: Julian Anastasov
    Signed-off-by: Pablo Neira Ayuso

    Taehee Yoo
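
    The reasoning in this commit can be illustrated with a small userspace
    model. This is only a sketch: a plain counter stands in for the RCU
    read-side nesting depth, and model_hook(), model_match() and the
    model_rcu_* helpers are hypothetical names, not kernel functions. It
    shows why a callee that is only reached through a hook does not need
    its own lock/unlock pair.

```c
#include <stdbool.h>

/* Userspace model only: a counter stands in for the RCU read-side depth. */
static int rcu_depth;

static void model_rcu_read_lock(void)      { rcu_depth++; }
static void model_rcu_read_unlock(void)    { rcu_depth--; }
static bool model_rcu_read_lock_held(void) { return rcu_depth > 0; }

/* Hypothetical match function that is only ever reached from the hook. */
static bool model_match(int value)
{
    /* No nested lock/unlock pair: the hook caller already holds the lock. */
    return model_rcu_read_lock_held() && value > 0;
}

/* Hypothetical stand-in for the hook entry point: locks exactly once. */
static bool model_hook(int value)
{
    bool verdict;

    model_rcu_read_lock();
    verdict = model_match(value);
    model_rcu_read_unlock();
    return verdict;
}
```

    In this model, calling model_match() directly (outside the hook) reports
    that no read-side section is held, which is the situation the commit
    rules out for the listed functions.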
     

26 Apr, 2017

1 commit


19 Apr, 2017

2 commits

  • No need to track this for inkernel helpers anymore as
    NF_CT_HELPER_BUILD_BUG_ON checks do this now.

    All inkernel helpers know what kind of structure they
    stored in helper->data.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • add a 32 byte scratch area in the helper struct instead of relying
    on variable sized helpers plus compile-time asserts to let us know
    if 32 bytes aren't enough anymore.

    Not having variable sized helpers will later allow adding a BUILD_BUG_ON
    for the total size of conntrack extensions -- the helper extension is
    the only one that doesn't have a fixed size.

    The (useless!) NF_CT_HELPER_BUILD_BUG_ON(0); statements are added so
    that, if someone adds a new helper and copy-pastes from one that doesn't
    store private data, there is at least some indication that this macro
    should be used somehow...

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
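
    A fixed scratch area guarded by a compile-time assert could look roughly
    like the sketch below. The 32-byte size comes from the commit message;
    HELPER_BUILD_ASSERT, struct demo_help and struct demo_pptp_info are
    hypothetical stand-ins, not the kernel definitions.

```c
#include <string.h>

#define HELPER_DATA_LEN 32  /* scratch size named in the commit message */

/* Hypothetical macro in the spirit of NF_CT_HELPER_BUILD_BUG_ON. */
#define HELPER_BUILD_ASSERT(structsize) \
    _Static_assert((structsize) <= HELPER_DATA_LEN, \
                   "helper private data exceeds the scratch area")

/* Fixed-size helper extension: no per-helper variable length. */
struct demo_help {
    unsigned char data[HELPER_DATA_LEN];
};

/* Hypothetical private data of one helper. */
struct demo_pptp_info {
    unsigned short pns_call_id;
    unsigned short pac_call_id;
    int state;
};

/* The build fails here if the private data ever outgrows the scratch area. */
HELPER_BUILD_ASSERT(sizeof(struct demo_pptp_info));

/* Copy private data into the scratch area. */
static void demo_store(struct demo_help *help, const struct demo_pptp_info *info)
{
    memcpy(help->data, info, sizeof(*info));
}

/* Copy private data back out of the scratch area. */
static struct demo_pptp_info demo_load(const struct demo_help *help)
{
    struct demo_pptp_info info;

    memcpy(&info, help->data, sizeof(info));
    return info;
}
```

    A helper without private data would write HELPER_BUILD_ASSERT(0); which
    always passes and only serves as a reminder that the macro exists.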
     

30 Aug, 2016

1 commit

  • With stats enabled this eats 80 bytes on x86_64 per nf_conn entry, as
    Eric Dumazet pointed out during netfilter workshop 2016.

    Eric also says: "Another reason was the fact that Thomas was about to
    change max timer range [..]" (500462a9de657f8, 'timers: Switch to
    a non-cascading wheel').

    Remove the timer and use a 32-bit jiffies value holding the timestamp
    until which the entry is valid.

    During conntrack lookup, even before doing the tuple comparison, check
    the timeout value and evict the entry in case it is too old.

    The dying bit is used as a synchronization point to avoid races where
    multiple cpus try to evict the same entry.

    Because lookup is always lockless, we need to bump the refcnt once
    when we evict, else we could try to evict an already-dead entry that
    is being recycled.

    This is the standard/expected way when conntrack entries are destroyed.

    Followup patches will introduce garbage collection via work queue
    and further places where we can reap obsoleted entries (e.g. during
    netlink dumps). This is needed to keep expired conntracks from hanging
    around for too long when the lookup rate is low after a busy period.

    Signed-off-by: Florian Westphal
    Acked-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
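
    The expiry check and the dying bit described above can be sketched in
    userspace C11 as follows. This is a simplified model under stated
    assumptions: struct demo_conn and both functions are hypothetical, C11
    atomics stand in for the kernel primitives, and unlinking the entry from
    the hash table is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_conn {
    uint32_t timeout;    /* absolute expiry time in 32-bit ticks */
    atomic_bool dying;   /* set once by whichever CPU wins the eviction */
    atomic_int refcnt;   /* 0 means the entry is already being recycled */
};

/* Wrap-safe comparison: the signed difference handles tick overflow. */
static bool demo_is_expired(const struct demo_conn *ct, uint32_t now)
{
    return (int32_t)(ct->timeout - now) <= 0;
}

/* Returns true only for the single caller that gets to evict the entry. */
static bool demo_try_evict(struct demo_conn *ct)
{
    int old = atomic_load(&ct->refcnt);
    bool won;

    /* Take a reference only if the entry is still alive. */
    do {
        if (old == 0)
            return false;
    } while (!atomic_compare_exchange_weak(&ct->refcnt, &old, old + 1));

    /* The dying bit is the synchronization point between racing CPUs. */
    won = !atomic_exchange(&ct->dying, true);

    atomic_fetch_sub(&ct->refcnt, 1);   /* drop the temporary reference */
    return won;
}
```

    A lookup would call demo_is_expired() before comparing tuples and, if it
    returns true, demo_try_evict(); only the caller that sees true goes on
    to tear the entry down.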
     

11 Aug, 2015

1 commit

  • This patch replaces the zone id which is pushed down into functions
    with the actual zone object. It's a bigger one-time change, but
    needed for later on extending zones with a direction parameter, and
    thus decoupling this additional information from all call-sites.

    No functional changes in this patch.

    The default zone becomes a global const object, namely nf_ct_zone_dflt,
    and is returned directly in various cases, for example when there is
    no zoning support.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: Pablo Neira Ayuso

    Daniel Borkmann
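
    Passing a zone object instead of a bare id, with a global const default,
    might look like the sketch below. All names are hypothetical;
    demo_zone_dflt only mirrors the role that nf_ct_zone_dflt plays in the
    commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* The zone is an object, so fields can be added later without touching
 * every call site that passes it around. */
struct demo_zone {
    uint16_t id;
};

/* Global constant default zone, shared by all users. */
static const struct demo_zone demo_zone_dflt = { .id = 0 };

struct demo_conn_z {
    const struct demo_zone *zone;   /* NULL when no zone was assigned */
};

/* Return the connection's zone, or the shared default object. */
static const struct demo_zone *demo_ct_zone(const struct demo_conn_z *ct)
{
    return ct->zone ? ct->zone : &demo_zone_dflt;
}

/* Call sites compare zone objects rather than raw ids. */
static int demo_zone_equal(const struct demo_zone *a, const struct demo_zone *b)
{
    return a->id == b->id;
}
```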
     

08 Apr, 2014

1 commit

    nf_ct_gre_keymap_flush() removes a nf_ct_gre_keymap object from
    net_gre->keymap_list and frees the object. But it doesn't clear
    the reference to this object held in ct_pptp_info->keymap[dir].
    Then nf_ct_gre_keymap_destroy() may release the same object again.

    So nf_ct_gre_keymap_flush() can be called only when we are sure that
    nf_ct_gre_keymap_destroy() will not be called afterwards.

    nf_ct_gre_keymap is created by nf_ct_gre_keymap_add() and the right way
    to destroy it is to call nf_ct_gre_keymap_destroy().

    This patch marks nf_ct_gre_keymap_flush() as static, so this patch can
    break compilation of third party modules, which use
    nf_ct_gre_keymap_flush. I'm not sure this is the right way to deprecate
    this function.

    [ 226.540793] general protection fault: 0000 [#1] SMP
    [ 226.541750] Modules linked in: nf_nat_pptp nf_nat_proto_gre
    nf_conntrack_pptp nf_conntrack_proto_gre ip_gre ip_tunnel gre
    ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc xt_nat
    iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat
    nf_conntrack veth tun bridge stp llc ppdev microcode joydev pcspkr
    serio_raw virtio_console virtio_balloon floppy parport_pc parport
    pvpanic i2c_piix4 virtio_net drm_kms_helper ttm ata_generic virtio_pci
    virtio_ring virtio drm i2c_core pata_acpi [last unloaded: ip_tunnel]
    [ 226.541776] CPU: 0 PID: 49 Comm: kworker/u4:2 Not tainted 3.14.0-rc8+ #101
    [ 226.541776] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    [ 226.541776] Workqueue: netns cleanup_net
    [ 226.541776] task: ffff8800371e0000 ti: ffff88003730c000 task.ti: ffff88003730c000
    [ 226.541776] RIP: 0010:[] [] __list_del_entry+0x29/0xd0
    [ 226.541776] RSP: 0018:ffff88003730dbd0 EFLAGS: 00010a83
    [ 226.541776] RAX: 6b6b6b6b6b6b6b6b RBX: ffff8800374e6c40 RCX: dead000000200200
    [ 226.541776] RDX: 6b6b6b6b6b6b6b6b RSI: ffff8800371e07d0 RDI: ffff8800374e6c40
    [ 226.541776] RBP: ffff88003730dbd0 R08: 0000000000000000 R09: 0000000000000000
    [ 226.541776] R10: 0000000000000001 R11: ffff88003730d92e R12: 0000000000000002
    [ 226.541776] R13: ffff88007a4c42d0 R14: ffff88007aef0000 R15: ffff880036cf0018
    [ 226.541776] FS: 0000000000000000(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
    [ 226.541776] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    [ 226.541776] CR2: 00007f07f643f7d0 CR3: 0000000036fd2000 CR4: 00000000000006f0
    [ 226.541776] Stack:
    [ 226.541776] ffff88003730dbe8 ffffffff81389c5d ffff8800374ffbe4 ffff88003730dc28
    [ 226.541776] ffffffffa0162a43 ffffffffa01627c5 ffff88007a4c42d0 ffff88007aef0000
    [ 226.541776] ffffffffa01651c0 ffff88007a4c45e0 ffff88007aef0000 ffff88003730dc40
    [ 226.541776] Call Trace:
    [ 226.541776] [] list_del+0xd/0x30
    [ 226.541776] [] nf_ct_gre_keymap_destroy+0x283/0x2d0 [nf_conntrack_proto_gre]
    [ 226.541776] [] ? nf_ct_gre_keymap_destroy+0x5/0x2d0 [nf_conntrack_proto_gre]
    [ 226.541776] [] gre_destroy+0x27/0x70 [nf_conntrack_proto_gre]
    [ 226.541776] [] destroy_conntrack+0x83/0x200 [nf_conntrack]
    [ 226.541776] [] ? destroy_conntrack+0x27/0x200 [nf_conntrack]
    [ 226.541776] [] ? nf_conntrack_hash_check_insert+0x2e0/0x2e0 [nf_conntrack]
    [ 226.541776] [] nf_conntrack_destroy+0x72/0x180
    [ 226.541776] [] ? nf_conntrack_destroy+0x5/0x180
    [ 226.541776] [] ? kill_l3proto+0x20/0x20 [nf_conntrack]
    [ 226.541776] [] nf_ct_iterate_cleanup+0x14e/0x170 [nf_conntrack]
    [ 226.541776] [] nf_ct_l4proto_pernet_unregister+0x5b/0x90 [nf_conntrack]
    [ 226.541776] [] proto_gre_net_exit+0x19/0x30 [nf_conntrack_proto_gre]
    [ 226.541776] [] ops_exit_list.isra.1+0x39/0x60
    [ 226.541776] [] cleanup_net+0x100/0x1d0
    [ 226.541776] [] process_one_work+0x1ea/0x4f0
    [ 226.541776] [] ? process_one_work+0x188/0x4f0
    [ 226.541776] [] worker_thread+0x11b/0x3a0
    [ 226.541776] [] ? process_one_work+0x4f0/0x4f0
    [ 226.541776] [] kthread+0xed/0x110
    [ 226.541776] [] ? _raw_spin_unlock_irq+0x2c/0x40
    [ 226.541776] [] ? kthread_create_on_node+0x200/0x200
    [ 226.541776] [] ret_from_fork+0x7c/0xb0
    [ 226.541776] [] ? kthread_create_on_node+0x200/0x200
    [ 226.541776] Code: 00 00 55 48 8b 17 48 b9 00 01 10 00 00 00 ad de
    48 8b 47 08 48 89 e5 48 39 ca 74 29 48 b9 00 02 20 00 00 00 ad de 48
    39 c8 74 7a 8b 00 4c 39 c7 75 53 4c 8b 42 08 4c 39 c7 75 2b 48 89
    42 08
    [ 226.541776] RIP [] __list_del_entry+0x29/0xd0
    [ 226.541776] RSP
    [ 226.612193] ---[ end trace 985ae23ddfcc357c ]---

    Cc: Pablo Neira Ayuso
    Cc: Patrick McHardy
    Cc: Jozsef Kadlecsik
    Cc: "David S. Miller"
    Signed-off-by: Andrey Vagin
    Signed-off-by: Pablo Neira Ayuso

    Andrey Vagin
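
    The double release described above comes down to a stale pointer. The
    following userspace sketch, with hypothetical demo_* names and a
    hand-rolled list instead of the kernel list API, shows a destroy path
    that unlinks, frees and then clears the reference, so that a second
    call is harmless.

```c
#include <stdlib.h>

/* Minimal doubly linked list node standing in for a keymap entry. */
struct demo_keymap {
    struct demo_keymap *prev, *next;
};

/* Per-connection info holding one keymap reference per direction. */
struct demo_pptp_info {
    struct demo_keymap *keymap[2];
};

static void demo_list_init(struct demo_keymap *head)
{
    head->prev = head;
    head->next = head;
}

/* Create a keymap, link it into the list and remember it in info. */
static int demo_keymap_add(struct demo_keymap *head,
                           struct demo_pptp_info *info, int dir)
{
    struct demo_keymap *km = malloc(sizeof(*km));

    if (!km)
        return -1;
    km->next = head->next;
    km->prev = head;
    head->next->prev = km;
    head->next = km;
    info->keymap[dir] = km;
    return 0;
}

/* Unlink and free both keymaps, then clear the references to them. */
static void demo_keymap_destroy(struct demo_pptp_info *info)
{
    for (int dir = 0; dir < 2; dir++) {
        struct demo_keymap *km = info->keymap[dir];

        if (!km)
            continue;
        km->prev->next = km->next;
        km->next->prev = km->prev;
        free(km);
        info->keymap[dir] = NULL;   /* no stale pointer left behind */
    }
}
```

    A flush routine that freed list entries without resetting
    info->keymap[dir] would leave exactly the dangling pointer that the
    trace above shows being passed to list_del().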
     

19 Apr, 2013

1 commit

  • Add copyright statements to all netfilter files which have had significant
    changes done by myself in the past.

    Some notes:

    - nf_conntrack_ecache.c was incorrectly attributed to Rusty and Netfilter
    Core Team when it got split out of nf_conntrack_core.c. The copyrights
    even state a date which lies six years before it was written. It was
    written in 2005 by Harald and myself.

    - net/ipv{4,6}/netfilter.c, net/netfilter/nf_queue.c were missing copyright
    statements. I've added the copyright statement from net/netfilter/core.c,
    where this code originated.

    - for nf_conntrack_proto_tcp.c I've also added Jozsef, since I didn't want
    it to give the wrong impression.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     

05 Feb, 2013

1 commit


30 Aug, 2012

1 commit


16 Jun, 2012

1 commit

  • This patch uses the new variable length conntrack extensions.

    Instead of using union nf_conntrack_help, which contains all the
    helper private data, we allocate a variable length area to store
    the private helper data.

    This patch includes the modification of all existing helpers.
    It also adds a couple of header includes to avoid compilation
    warnings.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
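
    One way to picture a variable length private area, as opposed to a
    union that is as large as its biggest member, is a C flexible array
    member. This is a userspace sketch with hypothetical names, not the
    conntrack extension API.

```c
#include <stdlib.h>
#include <string.h>

/* Extension header followed by a helper-specific amount of storage. */
struct demo_help_ext {
    size_t data_len;        /* size requested by this particular helper */
    unsigned char data[];   /* private area, sized at allocation time */
};

/* Allocate an extension with exactly data_len bytes of private data. */
static struct demo_help_ext *demo_help_alloc(size_t data_len)
{
    struct demo_help_ext *h = calloc(1, sizeof(*h) + data_len);

    if (h)
        h->data_len = data_len;
    return h;
}

/* Copy private data in, refusing anything larger than the area. */
static int demo_help_store(struct demo_help_ext *h, const void *src, size_t len)
{
    if (len > h->data_len)
        return -1;
    memcpy(h->data, src, len);
    return 0;
}
```

    Each helper then pays only for the bytes it asked for; the caller frees
    the extension with free() when the connection goes away.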
     

30 Aug, 2011

1 commit

  • When both the server and the client are NATed, the set-link-info control
    packet containing the peer's call-id field is not properly translated.

    I have verified that this worked in the 2.6.16.13 kernel, but the
    scenario stopped working after a rewrite (I do not know the exact
    version in which it broke).

    Signed-off-by: Sanket Shah
    Signed-off-by: Patrick McHardy

    Sanket Shah
     

06 Jun, 2011

1 commit

    The following warning is raised (along with other similar ones):

    net/ipv4/netfilter/nf_nat_standalone.c: In function ‘nf_nat_fn’:
    net/ipv4/netfilter/nf_nat_standalone.c:119:2: warning: case value ‘4’
    not in enumerated type ‘enum ip_conntrack_info’

    gcc barfs on adding two enum values and getting a result that is not
    an enumerated value:

    case IP_CT_RELATED+IP_CT_IS_REPLY:

    Add the missing enum values.

    Signed-off-by: Eric Dumazet
    CC: David Miller
    Signed-off-by: Pablo Neira Ayuso

    Eric Dumazet
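
    The warning and its fix can be reproduced in a few lines. The sketch
    below uses hypothetical DEMO_* names; the numeric values follow from the
    warning text, where the sum of the "related" and "is reply" values is 4.
    Naming the combined values inside the enum lets a switch use them as
    case labels without the compiler complaining.

```c
enum demo_ct_info {
    DEMO_CT_ESTABLISHED,    /* 0 */
    DEMO_CT_RELATED,        /* 1 */
    DEMO_CT_NEW,            /* 2 */
    DEMO_CT_IS_REPLY,       /* 3: added to the above for reply direction */

    /* Named sums, so that every combined value is an enumerator. */
    DEMO_CT_ESTABLISHED_REPLY = DEMO_CT_IS_REPLY,
    DEMO_CT_RELATED_REPLY = DEMO_CT_RELATED + DEMO_CT_IS_REPLY,
    DEMO_CT_NEW_REPLY = DEMO_CT_NEW + DEMO_CT_IS_REPLY,
};

static const char *demo_ct_info_name(enum demo_ct_info info)
{
    switch (info) {
    case DEMO_CT_ESTABLISHED:       return "established";
    case DEMO_CT_RELATED:           return "related";
    case DEMO_CT_NEW:               return "new";
    case DEMO_CT_ESTABLISHED_REPLY: return "established-reply";
    case DEMO_CT_RELATED_REPLY:     return "related-reply";
    case DEMO_CT_NEW_REPLY:         return "new-reply";
    }
    return "unknown";
}
```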
     

16 Feb, 2010

1 commit


27 Mar, 2009

1 commit


25 Mar, 2009

1 commit

  • This patch combines Greg Banks's dprintk() work with the existing dynamic
    printk patchset; we are now calling it 'dynamic debug'.

    The new feature of this patchset is a richer /debugfs control file interface
    (an example output from my system is at the bottom), which allows fine-grained
    control over the debug output. The output can be controlled by function,
    file, module, format string, and line number.

    For example, to enable all debug messages in module 'nf_conntrack':

    echo -n 'module nf_conntrack +p' > /mnt/debugfs/dynamic_debug/control

    To disable them:

    echo -n 'module nf_conntrack -p' > /mnt/debugfs/dynamic_debug/control

    A further explanation can be found in the documentation patch.

    Signed-off-by: Greg Banks
    Signed-off-by: Jason Baron
    Signed-off-by: Greg Kroah-Hartman

    Jason Baron
     

01 Feb, 2009

1 commit


17 Nov, 2008

1 commit


17 Oct, 2008

1 commit

  • Base infrastructure to enable per-module debug messages.

    I've introduced CONFIG_DYNAMIC_PRINTK_DEBUG, which when enabled centralizes
    control of debugging statements on a per-module basis in one /proc file,
    currently, /dynamic_printk/modules. When CONFIG_DYNAMIC_PRINTK_DEBUG
    is not set, debugging statements can still be enabled as before, often by
    defining 'DEBUG' for the proper compilation unit. Thus, this patch set has no
    effect when CONFIG_DYNAMIC_PRINTK_DEBUG is not set.

    The infrastructure currently ties into all pr_debug() and dev_dbg() calls. That
    is, if CONFIG_DYNAMIC_PRINTK_DEBUG is set, all pr_debug() and dev_dbg() calls
    can be dynamically enabled/disabled on a per-module basis.

    Future plans include extending this functionality to subsystems that define
    their own debug levels and flags.

    Usage:

    Dynamic debugging is controlled by the debugfs file,
    /dynamic_printk/modules. This file contains a list of the modules that
    can be enabled. The format of the file is as follows:

    <module_name> <enabled=0/1>
    .
    .
    .

    <module_name> : Name of the module in which the debug call resides
    <enabled=0/1> : whether the messages are enabled or not

    For example:

    snd_hda_intel enabled=0
    fixup enabled=1
    driver enabled=0

    Enable a module:

    $echo "set enabled=1 <module_name>" > dynamic_printk/modules

    Disable a module:

    $echo "set enabled=0 <module_name>" > dynamic_printk/modules

    Enable all modules:

    $echo "set enabled=1 all" > dynamic_printk/modules

    Disable all modules:

    $echo "set enabled=0 all" > dynamic_printk/modules

    Finally, passing "dynamic_printk" at the command line enables
    debugging for all modules. This mode can be turned off via the above
    disable command.

    [gkh: minor cleanups and tweaks to make the build work quietly]

    Signed-off-by: Jason Baron
    Signed-off-by: Greg Kroah-Hartman

    Jason Baron
     

08 Oct, 2008

4 commits


14 Apr, 2008

2 commits


26 Mar, 2008

1 commit

  • Introduce expectation classes and policies. An expectation class
    is used to distinguish different types of expectations by the
    same helper (for example audio/video/t.120). The expectation
    policy is used to hold the maximum number of expectations and
    the initial timeout for each class.

    The individual classes are isolated from each other, which means
    that for example an audio expectation will only evict other audio
    expectations.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
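
    The per-class limits and isolation described above can be sketched as
    follows. Everything here is hypothetical (names, fixed-size slot table,
    sequence numbers as an age proxy); it only illustrates that a full class
    evicts its own oldest expectation and never touches another class.

```c
#include <stddef.h>

#define DEMO_MAX_CLASSES 4
#define DEMO_MAX_EXPECT  16

/* Policy for one expectation class. */
struct demo_expect_policy {
    unsigned int max_expected;   /* 0 means no limit in this sketch */
    unsigned int timeout;        /* initial timeout for the class */
};

struct demo_expect {
    unsigned int cls;
    unsigned int timeout;
    unsigned long seq;           /* insertion order, lower is older */
    int in_use;
};

struct demo_helper_state {
    const struct demo_expect_policy *policy;   /* indexed by class */
    unsigned int num_classes;
    unsigned int expecting[DEMO_MAX_CLASSES];  /* live count per class */
    struct demo_expect slots[DEMO_MAX_EXPECT];
    unsigned long next_seq;
};

/* Insert an expectation of class cls; returns 0 on success, -1 on failure. */
static int demo_expect_insert(struct demo_helper_state *st, unsigned int cls)
{
    struct demo_expect *free_slot = NULL, *oldest = NULL;
    const struct demo_expect_policy *p;
    size_t i;

    if (cls >= st->num_classes || cls >= DEMO_MAX_CLASSES)
        return -1;
    p = &st->policy[cls];

    for (i = 0; i < DEMO_MAX_EXPECT; i++) {
        struct demo_expect *e = &st->slots[i];

        if (!e->in_use) {
            if (!free_slot)
                free_slot = e;
        } else if (e->cls == cls && (!oldest || e->seq < oldest->seq)) {
            oldest = e;          /* eviction candidate from the same class */
        }
    }

    /* Class is full: evict the oldest expectation of this class only. */
    if (p->max_expected && st->expecting[cls] >= p->max_expected) {
        if (!oldest)
            return -1;
        oldest->in_use = 0;
        st->expecting[cls]--;
        if (!free_slot)
            free_slot = oldest;
    }
    if (!free_slot)
        return -1;

    free_slot->cls = cls;
    free_slot->timeout = p->timeout;
    free_slot->seq = st->next_seq++;
    free_slot->in_use = 1;
    st->expecting[cls]++;
    return 0;
}
```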
     

01 Feb, 2008

1 commit


16 Oct, 2007

1 commit


11 Jul, 2007

4 commits


13 Feb, 2007

1 commit


26 Jan, 2007

1 commit

  • When an expected connection arrives, the NAT helper should be called to
    set up NAT similar to the master connection. The PPTP conntrack helper
    incorrectly checks whether the _expected_ connection has NAT set up before
    calling the NAT helper (which is never the case), instead of checking
    whether the _master_ connection is NATed.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
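
    The corrected condition can be sketched as below. The structure, the
    DEMO_STATUS_NAT flag and the function name are hypothetical; the point
    is only which connection's status gets tested.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_STATUS_NAT 0x1u   /* hypothetical flag: NAT is set up */

struct demo_ct {
    unsigned int status;
    struct demo_ct *master;    /* connection that created the expectation */
};

/* Decide whether the NAT helper should run for an expected connection. */
static bool demo_should_call_nat_helper(const struct demo_ct *expected)
{
    /*
     * A freshly expected connection has no NAT set up yet, so testing
     * expected->status would never be true. Test the master instead.
     */
    return expected->master != NULL &&
           (expected->master->status & DEMO_STATUS_NAT) != 0;
}
```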
     

03 Dec, 2006

1 commit