20 Mar, 2013

1 commit

  • Pablo Neira Ayuso says:

    ====================
    The following patchset contains 7 Netfilter/IPVS fixes for 3.9-rc, they are:

    * Restrict IPv6 stateless NPT targets to the mangle table. Many users are
    complaining that this target does not work in the nat table, which is the
    wrong table for it, from Florian Westphal.

    * Fix possible use before initialization in the netns init path of several
    conntrack protocol trackers (introduced recently while improving conntrack
    netns support), from Gao Feng.

    * Fix incorrect initialization of copy_range in nfnetlink_queue, spotted
    by Eric Dumazet during the NFWS2013, patch from myself.

    * Fix wrong calculation of next SCTP chunk in IPVS, from Julian Anastasov.

    * Remove the rcu_read_lock section in IPVS around the call to
    ipv4_update_pmtu; it is no longer required after a change introduced
    in 3.7, again from Julian.

    * Fix SYN looping in IPVS state sync if the backup is used as a real
    server in DR/TUN modes. This required a new /proc entry to disable the
    director function when acting as backup, also from Julian.

    * Remove leftover IP_NF_QUEUE Kconfig after ip_queue removal, noted by
    Paul Bolle.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Mar, 2013

1 commit

  • This patch introduces a constant limit of the fragment queue hash
    table bucket list lengths. Currently the limit 128 is chosen somewhat
    arbitrarily and just ensures that we can fill up the fragment cache with
    empty packets up to the default ip_frag_high_thresh limits. It should
    just protect from list iteration eating considerable amounts of cpu.

    If we reach the maximum length in one hash bucket a warning is printed.
    This is implemented on the caller side of inet_frag_find to distinguish
    between the different users of inet_fragment.c.

    I dropped the out of memory warning in the ipv4 fragment lookup path,
    because we already get a warning by the slab allocator.

    Cc: Eric Dumazet
    Cc: Jesper Dangaard Brouer
    Signed-off-by: Hannes Frederic Sowa
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
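
The bucket length limit described in this commit can be pictured with a small userspace sketch. It is only an illustration under stated assumptions: the names demo_frag_queue, demo_bucket_find and DEMO_MAX_DEPTH are hypothetical and are not taken from inet_fragment.c; only the value 128 comes from the commit message. The lookup walks one bucket, and if the key is missing and the bucket already holds the maximum number of entries it reports that to the caller, which can then print its own warning instead of inserting a new queue.

```c
#include <stddef.h>

/* Hypothetical cap; 128 is the value named in the commit message. */
#define DEMO_MAX_DEPTH 128

struct demo_frag_queue {
    unsigned int id;               /* stand-in for the real lookup key */
    struct demo_frag_queue *next;  /* next entry in the same hash bucket */
};

enum demo_lookup_status { DEMO_FOUND, DEMO_NOT_FOUND, DEMO_TOO_DEEP };

/*
 * Search one bucket for id. On a miss, report DEMO_TOO_DEEP when the
 * bucket already contains DEMO_MAX_DEPTH or more entries, so that the
 * caller can warn and refuse to grow the list any further.
 */
static enum demo_lookup_status
demo_bucket_find(struct demo_frag_queue *head, unsigned int id,
                 struct demo_frag_queue **out)
{
    struct demo_frag_queue *q;
    int depth = 0;

    *out = NULL;
    for (q = head; q != NULL; q = q->next) {
        if (q->id == id) {
            *out = q;
            return DEMO_FOUND;
        }
        depth++;
    }
    return depth >= DEMO_MAX_DEPTH ? DEMO_TOO_DEEP : DEMO_NOT_FOUND;
}
```

Returning a distinct status rather than logging inside the lookup mirrors the commit's point that the warning belongs on the caller side, where the different users can be told apart.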
     

15 Mar, 2013

1 commit


19 Feb, 2013

3 commits

  • Pablo Neira Ayuso says:

    ====================
    The following patchset contains updates for your net-next tree, they are:

    * Fix (for just added) connlabel dependencies, from Florian Westphal.

    * Add aliasing support for conntrack, thus users can either use -m state
    or -m conntrack from iptables while using the same kernel module, from
    Jozsef Kadlecsik.

    * Some code refactoring for the CT target to merge common code in
    revision 0 and 1, from myself.

    * Add aliasing support for CT, based on patch from Jozsef Kadlecsik.

    * Add one mutex per nfnetlink subsystem, from myself.

    * Improved logging for packets that are dropped by helpers, from myself.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pull in 'net' to take in the bug fixes that didn't make it into
    3.8-final.

    Also, deal with the semantic conflict of the change made to
    net/ipv6/xfrm6_policy.c A missing rt6->n neighbour release
    was added to 'net', but in 'net-next' we no longer cache the
    neighbour entries in the ipv6 routes so that change is not
    appropriate there.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Connection tracking helpers have to drop packets under exceptional
    situations. Currently, the user gets the following logging message
    in case that happens:

    nf_ct_%s: dropping packet ...

    However, depending on the helper, there are different reasons why a
    packet can be dropped.

    This patch modifies the existing code to provide a more specific
    error message in the scope of each helper, to help users debug
    the reason why the packet has been dropped, i.e.:

    nf_ct_%s: dropping packet: reason ...

    Thanks to Joe Perches for many formatting suggestions.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
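
As a rough userspace illustration of the message layout quoted in this commit, the sketch below builds a line of the form "nf_ct_<helper>: dropping packet: <reason>". The function demo_helper_log and its signature are assumptions made for this example; they are not the kernel interface the patch adds.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Format "nf_ct_<helper>: dropping packet: <reason>" into buf.
 * Returns the length the full message needs (as snprintf does),
 * or a negative value on a formatting error.
 */
static int demo_helper_log(char *buf, size_t len, const char *helper,
                           const char *fmt, ...)
{
    va_list ap;
    int prefix, reason;

    prefix = snprintf(buf, len, "nf_ct_%s: dropping packet: ", helper);
    if (prefix < 0 || (size_t)prefix >= len)
        return prefix;  /* error, or no room left for the reason */

    va_start(ap, fmt);
    reason = vsnprintf(buf + prefix, len - (size_t)prefix, fmt, ap);
    va_end(ap);

    return reason < 0 ? reason : prefix + reason;
}
```

A caller would pass the helper name once and a printf-style reason, for example demo_helper_log(buf, sizeof(buf), "ftp", "cannot parse port %d", 21).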
     

16 Feb, 2013

1 commit


14 Feb, 2013

1 commit


13 Feb, 2013

1 commit


08 Feb, 2013

3 commits


30 Jan, 2013

3 commits


27 Jan, 2013

1 commit

  • Pablo Neira Ayuso says:

    ====================
    This batch contains netfilter updates for your net-next tree, they are:

    * The new connlabel extension for x_tables, that allows us to attach
    labels to each conntrack flow. The kernel implementation uses a
    bitmask and there's a file in user-space that maps the bits with the
    corresponding string for each existing label. By now, you can attach
    up to 128 overlapping labels. From Florian Westphal.

    * A new round of improvements for the netns support for conntrack.
    Gao feng has moved much of the initialization code of each module
    out of the netns init path. He also did several code refactorings;
    the code looks cleaner to me now.

    * Added documentation for all possible tweaks for nf_conntrack via
    sysctl, from Jiri Pirko.

    * Cisco 7941/7945 IP phone support for our SIP conntrack helper,
    from Kevin Cernekee.

    * Missing header file in the snmp helper, from Stephen Hemminger.

    * Finally, a couple of fixes to resolve minor issues with these
    changes, from myself.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
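
The connlabel item above describes labels stored as bits in a mask, with up to 128 labels per flow. A minimal userspace sketch of such a bitmask follows; the type and function names (demo_labels, demo_label_set, demo_label_test) are hypothetical and say nothing about the layout the kernel extension actually uses.

```c
#include <limits.h>
#include <stdbool.h>

#define DEMO_MAX_LABELS 128
#define DEMO_WORD_BITS  (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS      ((DEMO_MAX_LABELS + DEMO_WORD_BITS - 1) / DEMO_WORD_BITS)

/* One bit per label; several labels may be set at the same time. */
struct demo_labels {
    unsigned long bits[DEMO_WORDS];
};

/* Attach label number bit; returns false if the number is out of range. */
static bool demo_label_set(struct demo_labels *l, unsigned int bit)
{
    if (bit >= DEMO_MAX_LABELS)
        return false;
    l->bits[bit / DEMO_WORD_BITS] |= 1UL << (bit % DEMO_WORD_BITS);
    return true;
}

/* Check whether label number bit is attached. */
static bool demo_label_test(const struct demo_labels *l, unsigned int bit)
{
    if (bit >= DEMO_MAX_LABELS)
        return false;
    return (l->bits[bit / DEMO_WORD_BITS] >> (bit % DEMO_WORD_BITS)) & 1UL;
}
```

The mapping from bit numbers to label strings would live outside this structure, as the text notes for the user-space file.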
     

23 Jan, 2013

3 commits

  • Move the code that register/unregister l4proto to the
    module_init/exit context.

    Given that we have to modify some interfaces to accommodate
    these changes, it is a good time to use shorter function names
    for this using the nf_ct_* prefix instead of nf_conntrack_*,
    that is:

    nf_ct_l4proto_register
    nf_ct_l4proto_pernet_register
    nf_ct_l4proto_unregister
    nf_ct_l4proto_pernet_unregister

    We save many line breaks with it.

    Signed-off-by: Gao feng
    Signed-off-by: Pablo Neira Ayuso

    Gao feng
     
  • Move the code that register/unregister l3proto to the
    module_init/exit context.

    Given that we have to modify some interfaces to accommodate
    these changes, it is a good time to use shorter function names
    for this using the nf_ct_* prefix instead of nf_conntrack_*,
    that is:

    nf_ct_l3proto_register
    nf_ct_l3proto_pernet_register
    nf_ct_l3proto_unregister
    nf_ct_l3proto_pernet_unregister

    We save many line breaks with it.

    Signed-off-by: Gao feng
    Signed-off-by: Pablo Neira Ayuso

    Gao feng
     
  • Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki / 吉藤英明
     

16 Jan, 2013

1 commit


14 Jan, 2013

1 commit


05 Jan, 2013

1 commit


17 Dec, 2012

4 commits

  • Commit b836c99fd6c9 (ipv6: unify conntrack reassembly expire
    code with standard one) uses the standard IPv6 reassembly
    code (ip6_expire_frag_queue) to handle conntrack reassembly expiry.

    ip6_expire_frag_queue invokes dev_get_by_index_rcu to find out
    which device received the expired packet, so we must save the ifindex
    when nf_conntrack gets the packet.

    With this patch applied, I can see ICMP Time Exceeded sent
    from the receiver when the sender sent out 1/2 fragmented
    IPv6 packet.

    Signed-off-by: Haibo Xi
    Signed-off-by: Pablo Neira Ayuso

    Haibo Xi
     
  • Remove ambiguity of double negation.

    Signed-off-by: Florent Fourcot
    Acked-by: Rick Jones
    Signed-off-by: Pablo Neira Ayuso

    Florent Fourcot
     
  • Since (a0ecb85 netfilter: nf_nat: Handle routing changes in MASQUERADE
    target), the MASQUERADE target handles routing changes which affect
    the output interface of a connection, but only for ESTABLISHED
    connections. It is also possible for NEW connections which
    already have a conntrack entry to be affected by routing changes.

    This adds a check to drop entries in the NEW+conntrack state
    when the oif has changed.

    Signed-off-by: Andrew Collins
    Acked-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Andrew Collins
     
  • The problem occurs when iptables constructs the tcp reset packet.
    It doesn't initialize the pointer to the tcp header within the skb.
    When the skb is passed to the ixgbe driver for transmit, the ixgbe
    driver attempts to access the tcp header and crashes.
    Currently, other drivers (such as our 1G e1000e or igb drivers) don't
    access the tcp header on transmit unless the TSO option is turned on.

    BUG: unable to handle kernel NULL pointer dereference at 0000000d
    IP: [] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe]
    *pdpt = 0000000085e5d001 *pde = 0000000000000000
    Oops: 0000 [#1] SMP
    [...]
    Pid: 0, comm: swapper Tainted: P 2.6.35.12 #1 Greencity/Thurley
    EIP: 0060:[] EFLAGS: 00010246 CPU: 16
    EIP is at ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe]
    EAX: c7628820 EBX: 00000007 ECX: 00000000 EDX: 00000000
    ESI: 00000008 EDI: c6882180 EBP: dfc6b000 ESP: ced95c48
    DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
    Process swapper (pid: 0, ti=ced94000 task=ced73bd0 task.ti=ced94000)
    Stack:
    cbec7418 c779e0d8 c77cc888 c77cc8a8 0903010a 00000000 c77c0008 00000002
    cd4997c0 00000010 dfc6b000 00000000 d0d176c9 c77cc8d8 c6882180 cbec7318
    00000004 00000004 cbec7230 cbec7110 00000000 cbec70c0 c779e000 00000002
    Call Trace:
    [] ? 0xd0d176c9
    [] ? 0xd0d18a4d
    [] ? dev_hard_start_xmit+0x218/0x2d7
    [] ? sch_direct_xmit+0x4b/0x114
    [] ? __qdisc_run+0xca/0xe0
    [] ? dev_queue_xmit+0x2d1/0x3d0
    [] ? neigh_resolve_output+0x1c5/0x20f
    [] ? neigh_update+0x29c/0x330
    [] ? arp_process+0x49c/0x4cd
    [] ? nf_hook_slow+0x3f/0xac
    [] ? arp_process+0x0/0x4cd
    [] ? arp_process+0x0/0x4cd
    [] ? T.901+0x38/0x3b
    [] ? arp_rcv+0xa3/0xb4
    [] ? arp_process+0x0/0x4cd
    [] ? __netif_receive_skb+0x32b/0x346
    [] ? netif_receive_skb+0x5a/0x5f
    [] ? napi_skb_finish+0x1b/0x30
    [] ? ixgbe_xmit_frame_ring+0x1564/0x2260 [ixgbe]
    [] ? lapic_next_event+0x13/0x16
    [] ? clockevents_program_event+0xd2/0xe4
    [] ? net_rx_action+0x55/0x127
    [] ? __do_softirq+0x77/0xeb
    [] ? do_softirq+0x23/0x27
    [] ? do_IRQ+0x7d/0x8e
    [] ? common_interrupt+0x29/0x30
    [] ? mwait_idle+0x48/0x4d
    [] ? cpu_idle+0x37/0x4c
    Code: df 09 d7 0f 94 c2 0f b6 d2 e9 e7 fb ff ff 31 db 31 c0 e9 38
    ff ff ff 80 78 06 06 0f 85 3e fb ff ff 8b 7c 24 38 8b 8f b8 00 00 00
    b6 51 0d f6 c2 01 0f 85 27 fb ff ff 80 e2 02 75 0d 8b 6c 24
    EIP: [] ixgbe_xmit_frame_ring+0x8cc/0x2260 [ixgbe] SS:ESP

    Signed-off-by: Mukund Jampala
    Signed-off-by: Pablo Neira Ayuso

    Mukund Jampala
     

03 Dec, 2012

1 commit

  • When the route changes (backup default route, VPNs) which affect a
    masqueraded target, the packets were sent out with the outdated source
    address. The patch addresses the issue by comparing the outgoing interface
    directly with the masqueraded interface in the nat table.

    Events are inefficient in this case, because it'd require adding route
    events to the network core and then scanning the whole conntrack table
    and re-checking the route for every entry.

    Signed-off-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Jozsef Kadlecsik
     

01 Dec, 2012

1 commit

  • Conflicts:
    net/ipv6/exthdrs_core.c

    Jesse Gross says:

    ====================
    This series of improvements for 3.8/net-next contains four components:
    * Support for modifying IPv6 headers
    * Support for matching and setting skb->mark for better integration with
    things like iptables
    * Ability to recognize the EtherType for RARP packets
    * Two small performance enhancements

    The movement of ipv6_find_hdr() into exthdrs_core.c causes two small merge
    conflicts. I left it as is but can do the merge if you want. The conflicts
    are:
    * ipv6_find_hdr() and ipv6_find_tlv() were both moved to the bottom of
    exthdrs_core.c. Both should stay.
    * A new use of ipv6_find_hdr() was added to net/netfilter/ipvs/ip_vs_core.c
    after this patch. The IPVS user has two instances of the old constant
    name IP6T_FH_F_FRAG which has been renamed to IP6_FH_F_FRAG.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

19 Nov, 2012

1 commit

  • Allow an unprivileged user who has created a user namespace, and then
    created a network namespace to effectively use the new network
    namespace, by reducing capable(CAP_NET_ADMIN) and
    capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
    CAP_NET_ADMIN), or ns_capable(net->user_ns, CAP_NET_RAW) calls.

    Settings that merely control a single network device are allowed.
    Either the network device is a logical network device where
    restrictions make no difference or the network device is a hardware NIC
    that has been explicitly moved from the initial network namespace.

    In general policy and network stack state changes are allowed while
    resource control is left unchanged.

    Allow the SIOCSIFADDR ioctl to add ipv6 addresses.
    Allow the SIOCDIFADDR ioctl to delete ipv6 addresses.
    Allow the SIOCADDRT ioctl to add ipv6 routes.
    Allow the SIOCDELRT ioctl to delete ipv6 routes.

    Allow creation of ipv6 raw sockets.

    Allow setting the IPV6_JOIN_ANYCAST socket option.
    Allow setting the IPV6_FL_A_RENEW parameter of the IPV6_FLOWLABEL_MGR
    socket option.

    Allow setting the IPV6_TRANSPARENT socket option.
    Allow setting the IPV6_HOPOPTS socket option.
    Allow setting the IPV6_RTHDRDSTOPTS socket option.
    Allow setting the IPV6_DSTOPTS socket option.
    Allow setting the IPV6_IPSEC_POLICY socket option.
    Allow setting the IPV6_XFRM_POLICY socket option.

    Allow sending packets with the IPV6_2292HOPOPTS control message.
    Allow sending packets with the IPV6_2292DSTOPTS control message.
    Allow sending packets with the IPV6_RTHDRDSTOPTS control message.

    Allow setting the multicast routing socket options on non multicast
    routing sockets.

    Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, and SIOCDELTUNNEL ioctls for
    setting up, changing and deleting tunnels over ipv6.

    Allow the SIOCADDTUNNEL, SIOCCHGTUNNEL, SIOCDELTUNNEL ioctls for
    setting up, changing and deleting ipv6 over ipv4 tunnels.

    Allow the SIOCADDPRL, SIOCDELPRL, SIOCCHGPRL ioctls for adding,
    deleting, and changing the potential router list for ISATAP tunnels.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

17 Nov, 2012

1 commit


13 Nov, 2012

1 commit


11 Nov, 2012

1 commit


10 Nov, 2012

1 commit


04 Nov, 2012

1 commit

  • As suggested by Eric, we could introduce a helper function
    for ipv6 too, to avoid checking if rt is NULL before
    dst_release().

    Cc: Eric Dumazet
    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Amerigo Wang
     

02 Nov, 2012

2 commits

  • userspace can query the original ipv4 destination address of a REDIRECTed
    connection via
    getsockopt(m_sock, SOL_IP, SO_ORIGINAL_DST, &m_server_addr, &addrsize)

    but for ipv6 no such option existed.

    This adds getsockopt(..., IPPROTO_IPV6, IP6T_SO_ORIGINAL_DST, ...).

    Without this, userspace needs to parse /proc or use ctnetlink, which
    appears to be overkill.

    This uses option number 80 for IP6T_SO_ORIGINAL_DST, which is spare,
    to use the same number we use in the IPv4 socket option SO_ORIGINAL_DST.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
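
A minimal userspace sketch of the IPv6 query described in this commit is shown below. The option number 80 comes from the commit message; the fallback #define, the wrapper name demo_original_dst6, and the choice of struct sockaddr_in6 as the result buffer are assumptions for this example. On a real system the constant would normally come from the netfilter IPv6 header rather than a local definition.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Fallback for illustration only; 80 is the number given in the commit. */
#ifndef IP6T_SO_ORIGINAL_DST
#define IP6T_SO_ORIGINAL_DST 80
#endif

/*
 * Ask for the original (pre-REDIRECT) destination of an accepted IPv6
 * connection. Returns 0 and fills *dst on success, or -1 with errno
 * set, exactly as getsockopt() reports it.
 */
static int demo_original_dst6(int fd, struct sockaddr_in6 *dst)
{
    socklen_t len = sizeof(*dst);

    return getsockopt(fd, IPPROTO_IPV6, IP6T_SO_ORIGINAL_DST, dst, &len);
}
```

A transparent proxy would call this on the socket returned by accept(); if it fails, the connection was most likely not redirected or the kernel lacks the option.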
     
  • #if defined(CONFIG_FOO) || defined(CONFIG_FOO_MODULE)

    can be replaced by

    #if IS_ENABLED(CONFIG_FOO)

    Cc: David S. Miller
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Amerigo Wang
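
To make the replacement above concrete, here is a userspace re-creation of the preprocessor trick behind such a macro. It is a sketch and not the kernel's header: every DEMO_* name and the CONFIG_DEMO_* options are hypothetical. It assumes, as Kconfig does, that a built-in option is defined as 1 and that a modular option defines the name with a _MODULE suffix as 1.

```c
/* Expands to "0," only when pasted onto the token 1. */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_SECOND(ignored, val, ...) val
/* Defined case: DEMO_SECOND(0, 1, 0, 0) -> 1; otherwise (junk 1, 0, 0) -> 0. */
#define DEMO_DEFINED_3(arg_or_junk) DEMO_SECOND(arg_or_junk 1, 0, 0)
#define DEMO_DEFINED_2(val) DEMO_DEFINED_3(DEMO_PLACEHOLDER_##val)
/* Extra level so that the option is macro-expanded before token pasting. */
#define DEMO_DEFINED_1(x) DEMO_DEFINED_2(x)

/* True if the option is built in or built as a module. */
#define DEMO_IS_ENABLED(option) \
    (DEMO_DEFINED_1(option) || DEMO_DEFINED_1(option##_MODULE))

/* Hypothetical configuration: one built-in option, one modular option. */
#define CONFIG_DEMO_BUILTIN 1
#define CONFIG_DEMO_LOADABLE_MODULE 1
/* CONFIG_DEMO_ABSENT is deliberately left undefined. */

/* The macro works in #if ... */
#if DEMO_IS_ENABLED(CONFIG_DEMO_LOADABLE)
static const int demo_loadable_compiled_in = 1;
#else
static const int demo_loadable_compiled_in = 0;
#endif

/* ... and in ordinary constant expressions. */
enum {
    DEMO_BUILTIN_ON = DEMO_IS_ENABLED(CONFIG_DEMO_BUILTIN),
    DEMO_ABSENT_ON  = DEMO_IS_ENABLED(CONFIG_DEMO_ABSENT)
};
```

Because the result is a plain 0 or 1, the same test can also be written as an ordinary if (...) condition, which lets the compiler check the disabled branch.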
     

29 Oct, 2012

3 commits

  • Use PTR_RET rather than if(IS_ERR(...)) + PTR_ERR

    Generated by: coccinelle/api/ptr_ret.cocci

    Reported-by: Fengguang Wu
    Signed-off-by: Fengguang Wu
    Signed-off-by: Pablo Neira Ayuso

    Wu Fengguang
     
  • WARNING: net/ipv6/netfilter/nf_defrag_ipv6.o(.text+0xe0): Section mismatch in
    reference from the function nf_ct_net_init() to the function
    .init.text:nf_ct_frag6_sysctl_register()
    The function nf_ct_net_init() references the function
    __init nf_ct_frag6_sysctl_register().

    In case nf_conntrack_ipv6 is compiled as a module, nf_ct_net_init could be
    called after the init code and data are unloaded. Therefore remove the
    "__net_init" annotation from nf_ct_frag6_sysctl_register().

    Signed-off-by: Hein Tibosch
    Acked-by: Cong Wang
    Signed-off-by: Pablo Neira Ayuso

    Hein Tibosch
     
  • ICMP tuples have id in src and type/code in dst.
    So comparing src.u.all with dst.u.all will always fail here
    and ip_xfrm_me_harder() is called for every ICMP packet,
    even if there was no NAT.

    Signed-off-by: Ulrich Weber
    Signed-off-by: Pablo Neira Ayuso

    Ulrich Weber
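
The commit message does not show the changed code, so the following is only one plausible shape of the check, written as a userspace sketch. The struct demo_tuple and the function demo_nat_changed are hypothetical simplifications: src_all stands for the source port or ICMP id, dst_all for the destination port or ICMP type/code. The idea is to skip the src/dst comparison of that field for ICMP, where the two sides never hold comparable values.

```c
#include <netinet/in.h>  /* IPPROTO_ICMP, IPPROTO_TCP */
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for one direction of a conntrack tuple. */
struct demo_tuple {
    uint32_t src_addr;
    uint32_t dst_addr;
    uint16_t src_all;   /* source port, or ICMP id */
    uint16_t dst_all;   /* destination port, or ICMP type/code */
    uint8_t  protonum;
};

/*
 * Without NAT, the source of one direction mirrors the destination of
 * the opposite direction. For ICMP the per-protocol fields hold
 * different kinds of data (id vs. type/code), so only the addresses
 * are compared for that protocol.
 */
static bool demo_nat_changed(const struct demo_tuple *dir,
                             const struct demo_tuple *other)
{
    if (dir->src_addr != other->dst_addr)
        return true;
    if (dir->protonum == IPPROTO_ICMP)
        return false;
    return dir->src_all != other->dst_all;
}
```

With a check of this kind, an un-NATed ICMP echo no longer looks "changed" merely because its id differs from the reply's type/code.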
     

28 Sep, 2012

1 commit