31 May, 2014

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter/IPVS updates for net-next

    This small patchset contains three accumulated Netfilter/IPVS updates,
    they are:

    1) Refactor common NAT code by encapsulating it into a helper
    function, similarly to what we do in other conntrack extensions,
    from Florian Westphal.

    2) A minor format string mismatch fix for IPVS, from Masanari Iida.

    3) Add quota support to the netfilter accounting infrastructure, now
    you can add quotas to accounting objects via the nfnetlink interface
    and use them from iptables. You can also listen to quota
    notifications from userspace. This enhancement is from Mathieu Poirier.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

13 May, 2014

1 commit

  • As suggested by several people, rename local_df to ignore_df,
    since it means "ignore df bit if it is set".

    Cc: Maciej Żenczykowski
    Cc: Florian Westphal
    Cc: David S. Miller
    Cc: Eric Dumazet
    Signed-off-by: Cong Wang
    Acked-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller

    WANG Cong
     

04 May, 2014

1 commit

  • else we may fail to forward skb even if original fragments do fit
    outgoing link mtu:

    1. remote sends 2k packets in two 1000 byte frags, DF set
    2. we want to forward but only see '2k > mtu and DF set'
    3. we then send icmp error saying that outgoing link is 1500

    But original sender never sent a packet that would not fit
    the outgoing link.

    Setting local_df makes outgoing path test size vs.
    IPCB(skb)->frag_max_size, so we will still send the correct
    error in case the largest original size did not fit
    outgoing link mtu.

    Reported-by: Maxime Bizon
    Suggested-by: Maxime Bizon
    Fixes: 5f2d04f1f9 (ipv4: fix path MTU discovery with connection tracking)
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
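
The decision described in this commit message can be modelled in plain C. This is a userspace sketch only, not the kernel code: `struct pkt` and `needs_frag_needed_error()` are hypothetical names, and the fields stand in for the skb length, the DF bit, the `local_df` flag, and `IPCB(skb)->frag_max_size`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the skb fields that matter here. */
struct pkt {
    size_t len;           /* total length after reassembly */
    size_t frag_max_size; /* largest original fragment, 0 if never fragmented */
    bool df;              /* DF bit set in the IP header */
    bool local_df;        /* "ignore DF" flag set on the defragmented skb */
};

/* Returns true when an ICMP "fragmentation needed" error should be sent. */
static bool needs_frag_needed_error(const struct pkt *p, size_t mtu)
{
    if (p->len <= mtu)
        return false;   /* fits as is */
    if (p->frag_max_size > mtu)
        return true;    /* an original fragment was already too big */
    if (p->df && !p->local_df)
        return true;    /* DF set and not overridden */
    return false;       /* may be re-fragmented on output */
}
```

With the 2k packet from the example (two 1000 byte fragments, DF set) and an MTU of 1500, the function reports an error only while `local_df` is unset. Once defrag sets the flag, the packet is forwarded, and an error is still raised if the largest original fragment exceeds the MTU.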
     

30 Apr, 2014

1 commit


17 Apr, 2014

1 commit

  • As suggested by Julian:

    Simply, flowi4_iif must not contain 0, it does not
    look logical to ignore all ip rules with specified iif.

    because in fib_rule_match() we do:

    if (rule->iifindex && (rule->iifindex != fl->flowi_iif))
    goto out;

    flowi4_iif should be LOOPBACK_IFINDEX by default.

    We need to move LOOPBACK_IFINDEX to include/net/flow.h:

    1) It is mostly used by flowi_iif

    2) Fix the following compile error if we use it in flow.h
    by the later patches:

    In file included from include/linux/netfilter.h:277:0,
    from include/net/netns/netfilter.h:5,
    from include/net/net_namespace.h:21,
    from include/linux/netdevice.h:43,
    from include/linux/icmpv6.h:12,
    from include/linux/ipv6.h:61,
    from include/net/ipv6.h:16,
    from include/linux/sunrpc/clnt.h:27,
    from include/linux/nfs_fs.h:30,
    from init/do_mounts.c:32:
    include/net/flow.h: In function ‘flowi4_init_output’:
    include/net/flow.h:84:32: error: ‘LOOPBACK_IFINDEX’ undeclared (first use in this function)

    Cc: Eric Biederman
    Cc: Julian Anastasov
    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
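
The effect of the `fib_rule_match()` test quoted above can be reproduced in a small userspace sketch. `struct rule`, `struct flow`, and `rule_iif_matches()` are hypothetical simplifications; only the comparison itself and the value of `LOOPBACK_IFINDEX` (1) are taken from the kernel.

```c
#include <stdbool.h>

#define LOOPBACK_IFINDEX 1 /* ifindex of the loopback device */

/* Hypothetical, reduced versions of the kernel structures. */
struct rule { int iifindex; };  /* 0 means the rule names no iif */
struct flow { int flowi_iif; };

/* Same comparison as the snippet quoted from fib_rule_match(). */
static bool rule_iif_matches(const struct rule *r, const struct flow *fl)
{
    if (r->iifindex && r->iifindex != fl->flowi_iif)
        return false;
    return true;
}
```

A rule such as "iif lo" carries `iifindex == 1`. A flow whose `flowi_iif` is left at 0 can never match it, whereas a flow initialised to `LOOPBACK_IFINDEX` does, which is why 0 is a poor default.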
     

05 Apr, 2014

1 commit

  • All xtables variants suffer from the defect that the copy_to_user()
    to copy the counters to user memory may fail after the table has
    already been exchanged and thus exposed. Returning an error at this
    point will result in freeing the already exposed table. Any
    subsequent packet processing will result in a kernel panic.

    We can't copy the counters before exposing the new tables as we
    want to provide the counter state after the old table has been
    unhooked. Therefore convert this into a silent error.

    Cc: Florian Westphal
    Signed-off-by: Thomas Graf
    Signed-off-by: Pablo Neira Ayuso

    Thomas Graf
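
A rough userspace model of the "silent error" approach is sketched below. All names (`live_table`, `replace_table()`, `copy_ok()`, `copy_fails()`) are hypothetical, and the function pointer merely stands in for `copy_to_user()`. The point it illustrates is that, once the new table has been published, a failed counter copy is logged instead of being turned into an error path that frees the live table.

```c
#include <stdio.h>
#include <stdlib.h>

struct table_info { int generation; };

/* What packet processing would dereference. */
static struct table_info *live_table;

/* Hypothetical stand-ins for copy_to_user(); nonzero means failure. */
static int copy_ok(void)    { return 0; }
static int copy_fails(void) { return 1; }

static int replace_table(struct table_info *newinfo, int (*copy_counters)(void))
{
    struct table_info *oldinfo = live_table;

    live_table = newinfo; /* the new table is exposed from here on */

    if (copy_counters() != 0) {
        /* Too late to fail: freeing newinfo would leave live_table dangling. */
        fprintf(stderr, "counters copy to user failed while replacing table\n");
    }

    free(oldinfo); /* only the unhooked table is released */
    return 0;
}
```

In this model the caller sees success in both cases, and `live_table` always points at memory that is still allocated.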
     

14 Feb, 2014

1 commit


06 Feb, 2014

3 commits

  • Add a reject module for NFPROTO_INET. It does nothing but dispatch
    to the AF-specific modules based on the hook family.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Currently the nft_reject module depends on symbols from ipv6. This is
    wrong since no generic module should force IPv6 support to be loaded.
    Split up the module into AF-specific and a generic part.

    Signed-off-by: Patrick McHardy
    Signed-off-by: Pablo Neira Ayuso

    Patrick McHardy
     
  • Similar bug fixed in SIP module in 3f509c6 ("netfilter: nf_nat_sip: fix
    incorrect handling of EBUSY for RTCP expectation").

    BUG: unable to handle kernel paging request at 00100104
    IP: [] nf_ct_unlink_expect_report+0x57/0xf0 [nf_conntrack]
    ...
    Call Trace:
    [] ? del_timer+0x48/0x70
    [] nf_ct_remove_expectations+0x47/0x60 [nf_conntrack]
    [] nf_ct_delete_from_lists+0x59/0x90 [nf_conntrack]
    [] death_by_timeout+0x14e/0x1c0 [nf_conntrack]
    [] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
    [] call_timer_fn+0x1d/0x80
    [] run_timer_softirq+0x18e/0x1a0
    [] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack]
    [] __do_softirq+0xa3/0x170
    [] ? __local_bh_enable+0x70/0x70

    [] ? irq_exit+0x67/0xa0
    [] ? do_IRQ+0x46/0xb0
    [] ? clockevents_notify+0x35/0x110
    [] ? common_interrupt+0x2c/0x40
    [] ? cpuidle_enter_state+0x41/0xf0
    [] ? cpuidle_idle_call+0x8b/0x100
    [] ? arch_cpu_idle+0x8/0x30
    [] ? cpu_idle_loop+0x4b/0x140
    [] ? cpu_startup_entry+0x18/0x20
    [] ? rest_init+0x5d/0x70
    [] ? start_kernel+0x2ec/0x2f2
    [] ? repair_env_string+0x5b/0x5b
    [] ? i386_start_kernel+0x33/0x35

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Pablo Neira Ayuso

    Alexey Dobriyan
     

10 Jan, 2014

5 commits


08 Jan, 2014

3 commits


07 Jan, 2014

1 commit

  • Pablo Neira Ayuso says:

    ====================
    nftables updates for net-next

    The following patchset contains nftables updates for your net-next tree,
    they are:

    * Add set operation to the meta expression by means of the select_ops()
    infrastructure, this allows us to set the packet mark among other things.
    From Arturo Borrero Gonzalez.

    * Fix wrong format in sscanf in nf_tables_set_alloc_name(), from Daniel
    Borkmann.

    * Add new queue expression to nf_tables. This comes with two previous patches
    to prepare this new feature, one to add mask in nf_tables_core to
    evaluate the queue verdict appropriately and another to refactor common
    code with xt_NFQUEUE, from Eric Leblond.

    * Do not hide nftables from Kconfig if nfnetlink is not enabled, also from
    Eric Leblond.

    * Add the reject expression to nf_tables, this adds the missing TCP RST
    support. It comes with an initial patch to refactor common code with
    xt_NFQUEUE, again from Eric Leblond.

    * Remove an unused variable assignment in nf_tables_dump_set(), from Michal
    Nazarewicz.

    * Remove the nft_meta_target code, now that Arturo added the set operation
    to the meta expression, from me.

    * Add help information for nf_tables to Kconfig, also from me.

    * Allow to dump all sets by specifying NFPROTO_UNSPEC, similar feature is
    available to other nf_tables objects, requested by Arturo, from me.

    * Expose the table usage counter, so we can know how many chains are using
    this table without dumping the list of chains, from Tomasz Bursztyka.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

06 Jan, 2014

1 commit

  • Pablo Neira Ayuso says:

    ====================
    netfilter/IPVS updates for net-next

    The following patchset contains Netfilter updates for your net-next tree,
    they are:

    * Add full port randomization support. Some crazy researchers found a way
    to reconstruct the secure ephemeral ports that are allocated in random mode
    by sending off-path bursts of UDP packets to overrun the socket buffer of
    the DNS resolver to trigger retransmissions, then if the timing for the
    DNS resolution done by a client is larger than usual, then they conclude
    that the port that received the burst of UDP packets is the one that was
    opened. It seems a rather aggressive method to me but it seems to work for
    them. As a result, Daniel Borkmann and Hannes Frederic Sowa came up with a
    new NAT mode to fully randomize ports using prandom.

    * Add a new classifier to x_tables based on the socket net_cls set via
    cgroups. This includes two patches to prepare the field as requested by
    Zefan Li. Also from Daniel Borkmann.

    * Use prandom instead of get_random_bytes in several locations of the
    netfilter code, from Florian Westphal.

    * Allow to use the CTA_MARK_MASK in ctnetlink when mangling the conntrack
    mark, also from Florian Westphal.

    * Fix compilation warning due to unused variable in IPVS, from Geert
    Uytterhoeven.

    * Add support for UID/GID via nfnetlink_queue, from Valentina Giusti.

    * Add IPComp extension to x_tables, from Fan Du.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

04 Jan, 2014

1 commit

  • The following code is not used in current upstream code.
    Some of this seems to be old hooks, others might be used by some
    out of tree module (which I don't care about breaking), and
    the need_ipv4_conntrack was used by old NAT code but no longer
    called.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: Pablo Neira Ayuso

    stephen hemminger
     

02 Jan, 2014

1 commit


31 Dec, 2013

1 commit

  • This patch moves nft_reject_ipv4 to nft_reject and adds support
    for IPv6 protocol. This patch uses functions included in nf_reject.h
    to implement reject by TCP reset.

    The code has to be built as a module if NF_TABLES_IPV6 is also a
    module to avoid compilation error due to usage of IPv6 functions.
    This has been done in Kconfig by using the construct:

    depends on NF_TABLES_IPV6 || !NF_TABLES_IPV6

    This seems a bit weird in terms of syntax but works perfectly.

    Signed-off-by: Eric Leblond
    Signed-off-by: Pablo Neira Ayuso

    Eric Leblond
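
Why the `X || !X` construct is not a no-op becomes clearer once Kconfig's tristate arithmetic is written out. In Kconfig, n, m and y evaluate to 0, 1 and 2, `!a` is `2 - a`, and `a || b` is the maximum of the two. The sketch below encodes just that; the helper names (`tri_not`, `tri_or`, `dep_limit`) are made up for illustration.

```c
/* Kconfig tristate values: n = 0, m = 1, y = 2. */
enum tristate { N = 0, M = 1, Y = 2 };

/* Kconfig "!a" evaluates to 2 - a. */
static enum tristate tri_not(enum tristate a)
{
    return (enum tristate)(2 - a);
}

/* Kconfig "a || b" evaluates to max(a, b). */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
    return a > b ? a : b;
}

/* Upper limit that "depends on X || !X" places on the dependent symbol. */
static enum tristate dep_limit(enum tristate x)
{
    return tri_or(x, tri_not(x));
}
```

The limit is y when X is n or y, but only m when X is m, so the dependent symbol cannot be built in while NF_TABLES_IPV6 is a module.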
     

30 Dec, 2013

1 commit


27 Dec, 2013

1 commit


19 Dec, 2013

1 commit


12 Dec, 2013

1 commit


11 Dec, 2013

1 commit


07 Dec, 2013

1 commit

  • Several files refer to an old address for the Free Software Foundation
    in the file header comment. Resolve by replacing the address with
    the URL so that we do not have to keep
    updating the header comments anytime the address changes.

    CC: Alexey Kuznetsov
    CC: James Morris
    CC: Hideaki YOSHIFUJI
    CC: Patrick McHardy
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     

18 Nov, 2013

1 commit

  • When the synproxy_parse_options is called on the client ack the mss
    option will not be present. Consequently mss won't be included in the
    backend syn packet, which falls back to 536 bytes mss.

    Therefore XT_SYNPROXY_OPT_MSS is explicitly flagged when recovering mss
    value from cookie.

    Signed-off-by: Martin Topholm
    Reviewed-by: Jesper Dangaard Brouer
    Signed-off-by: Pablo Neira Ayuso

    Martin Topholm
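
The bug and its fix can be pictured with a reduced model of the option handling. Everything here is a hypothetical simplification: `OPT_MSS` stands in for `XT_SYNPROXY_OPT_MSS`, `struct opts` for the parsed option state, and `backend_effective_mss()` for what the backend ends up using when the SYN carries no MSS option.

```c
#include <stdint.h>

#define OPT_MSS     0x01u /* hypothetical stand-in for XT_SYNPROXY_OPT_MSS */
#define DEFAULT_MSS 536u  /* fallback when a SYN carries no MSS option */

struct opts {
    uint8_t options; /* bitmask of options to emit in the backend SYN */
    uint16_t mss;
};

/* The fix: storing the value is not enough, the option must also be flagged. */
static void recover_mss_from_cookie(struct opts *o, uint16_t cookie_mss)
{
    o->mss = cookie_mss;
    o->options |= OPT_MSS;
}

/* MSS the backend will assume for the connection. */
static uint16_t backend_effective_mss(const struct opts *o)
{
    return (o->options & OPT_MSS) ? o->mss : DEFAULT_MSS;
}
```

Setting only `mss` without the flag yields 536 in this model, which mirrors the behaviour the commit message describes.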
     

05 Nov, 2013

3 commits

  • Pablo Neira Ayuso says:

    ====================
    This batch contains five nf_tables patches for your net-next tree,
    they are:

    * Fix possible use after free in the module removal path of the
    x_tables compatibility layer, from Dan Carpenter.

    * Add filter chain type for the bridge family, from myself.

    * Fix Kconfig dependencies of the nf_tables bridge family with
    the core, from myself.

    * Fix sparse warnings in nft_nat, from Tomasz Bursztyka.

    * Remove duplicated include in the IPv4 family support for nf_tables,
    from Wei Yongjun.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pablo Neira Ayuso says:

    ====================
    This is another batch containing Netfilter/IPVS updates for your net-next
    tree, they are:

    * Six patches to make the ipt_CLUSTERIP target support netnamespace,
    from Gao feng.

    * Two cleanups for the nf_conntrack_acct infrastructure, introducing
    a new structure to encapsulate conntrack counters, from Holger
    Eitzenberger.

    * Fix missing verdict in SCTP support for IPVS, from Daniel Borkmann.

    * Skip checksum recalculation in SCTP support for IPVS, also from
    Daniel Borkmann.

    * Fix behavioural change in xt_socket after IP early demux, from
    Florian Westphal.

    * Fix bogus large memory allocation in the bitmap port set type in ipset,
    from Jozsef Kadlecsik.

    * Fix possible compilation issues in the hash netnet set type in ipset,
    also from Jozsef Kadlecsik.

    * Define constants to identify netlink callback data in ipset dumps,
    again from Jozsef Kadlecsik.

    * Use sock_gen_put() in xt_socket to replace xt_socket_put_sk,
    from Eric Dumazet.

    * Improvements for the SH scheduler in IPVS, from Alexander Frolkin.

    * Remove extra delay due to unneeded rcu barrier in IPVS net namespace
    cleanup path, from Julian Anastasov.

    * Save some cycles in ip6t_REJECT by skipping checksum validation in
    packets leaving from our stack, from Stanislav Fomichev.

    * Fix IPVS_CMD_ATTR_MAX definition in IPVS, larger than required, from
    Julian Anastasov.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Conflicts:
    drivers/net/ethernet/emulex/benet/be.h
    drivers/net/netconsole.c
    net/bridge/br_private.h

    Three mostly trivial conflicts.

    The net/bridge/br_private.h conflict was a function signature (argument
    addition) change overlapping with the extern removals from Joe Perches.

    In drivers/net/netconsole.c we had one change adjusting a printk message
    whilst another changed "printk(KERN_INFO" into "pr_info(".

    Lastly, the emulex change was a new inline function addition overlapping
    with Joe Perches's extern removals.

    Signed-off-by: David S. Miller

    David S. Miller
     

04 Nov, 2013

1 commit


22 Oct, 2013

1 commit

  • During kernel stability testing on an SMP ARMv7 system, Yalin Wang
    reported the following panic from the netfilter code:

    1fe0: 0000001c 5e2d3b10 4007e779 4009e110 60000010 00000032 ff565656 ff545454
    [] (ipt_do_table+0x448/0x584) from [] (nf_iterate+0x48/0x7c)
    [] (nf_iterate+0x48/0x7c) from [] (nf_hook_slow+0x58/0x104)
    [] (nf_hook_slow+0x58/0x104) from [] (ip_local_deliver+0x88/0xa8)
    [] (ip_local_deliver+0x88/0xa8) from [] (ip_rcv_finish+0x418/0x43c)
    [] (ip_rcv_finish+0x418/0x43c) from [] (__netif_receive_skb+0x4cc/0x598)
    [] (__netif_receive_skb+0x4cc/0x598) from [] (process_backlog+0x84/0x158)
    [] (process_backlog+0x84/0x158) from [] (net_rx_action+0x70/0x1dc)
    [] (net_rx_action+0x70/0x1dc) from [] (__do_softirq+0x11c/0x27c)
    [] (__do_softirq+0x11c/0x27c) from [] (do_softirq+0x44/0x50)
    [] (do_softirq+0x44/0x50) from [] (local_bh_enable_ip+0x8c/0xd0)
    [] (local_bh_enable_ip+0x8c/0xd0) from [] (inet_stream_connect+0x164/0x298)
    [] (inet_stream_connect+0x164/0x298) from [] (sys_connect+0x88/0xc8)
    [] (sys_connect+0x88/0xc8) from [] (ret_fast_syscall+0x0/0x30)
    Code: 2a000021 e59d2028 e59de01c e59f011c (e7824103)
    ---[ end trace da227214a82491bd ]---
    Kernel panic - not syncing: Fatal exception in interrupt

    This comes about because CPU1 is executing xt_replace_table in response
    to a setsockopt syscall, resulting in:

    ret = xt_jumpstack_alloc(newinfo);
    --> newinfo->jumpstack = kzalloc(size, GFP_KERNEL);

    [...]

    table->private = newinfo;
    newinfo->initial_entries = private->initial_entries;

    Meanwhile, CPU0 is handling the network receive path and ends up in
    ipt_do_table, resulting in:

    private = table->private;

    [...]

    jumpstack = (struct ipt_entry **)private->jumpstack[cpu];

    On weakly ordered memory architectures, the writes to table->private
    and newinfo->jumpstack from CPU1 can be observed out of order by CPU0.
    Furthermore, on architectures which don't respect ordering of address
    dependencies (i.e. Alpha), the reads from CPU0 can also be re-ordered.

    This patch adds an smp_wmb() before the assignment to table->private
    (which is essentially publishing newinfo) to ensure that all writes to
    newinfo will be observed before plugging it into the table structure.
    A dependent-read barrier is also added on the consumer sides, to ensure
    the same ordering requirements are also respected there.

    Cc: Paul E. McKenney
    Reported-by: Wang, Yalin
    Tested-by: Wang, Yalin
    Signed-off-by: Will Deacon
    Acked-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso

    Will Deacon
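
The publish/consume ordering described in this commit message has a close userspace analogue in C11 atomics. The sketch below is not the kernel patch, which uses `smp_wmb()` and a dependent-read barrier. It uses a release store and an acquire load instead, and `table_private`, `publish()`, and `consume()` are hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

struct table_info {
    int *jumpstack;
    unsigned int initial_entries;
};

/* Plays the role of table->private. */
static _Atomic(struct table_info *) table_private;

/*
 * Writer side: every field of newinfo must be filled in before this call.
 * The release store keeps those writes from being observed after the pointer.
 */
static void publish(struct table_info *newinfo)
{
    atomic_store_explicit(&table_private, newinfo, memory_order_release);
}

/*
 * Reader side: the acquire load orders the later reads of ->jumpstack after
 * the pointer load, covering the address-dependency case mentioned for Alpha.
 */
static struct table_info *consume(void)
{
    return atomic_load_explicit(&table_private, memory_order_acquire);
}
```

A reader that obtains a non-NULL pointer from `consume()` is then guaranteed to see the `jumpstack` value the writer stored before calling `publish()`.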
     

17 Oct, 2013

5 commits