16 May, 2012

1 commit


09 May, 2012

2 commits

  • This patch adds the flags parameter to ipv6_find_hdr. These flags
    allow us to:

    * know if this is a fragment.
    * stop at the AH header, so the information contained in that header
    can be used for some specific packet handling.

    This patch also adds the offset parameter for inspection of one
    inner IPv6 header that is contained in error messages.

    Signed-off-by: Hans Schillstrom
    Signed-off-by: Pablo Neira Ayuso

    Hans Schillstrom
     
  • This patch removes ip_queue support, which was marked as obsolete
    years ago. The nfnetlink_queue module provides a more advanced
    user-space packet queueing mechanism.

    This patch also removes capability code included in SELinux that
    refers to ip_queue. Otherwise, we break compilation.

    Several warnings about this have been sent to the mailing list
    in the past month, and nobody has raised a hand to stop it
    with a strong argument.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
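
For the ipv6_find_hdr change in the first commit of this group, a minimal userspace sketch of a flag-driven extension header walk may help. It is not the kernel implementation: the function name, the flag names FH_F_FRAG and FH_F_AUTH, and the simplified handling of non-first fragments are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names. FH_F_AUTH is an input request ("stop at AH");
 * FH_F_FRAG is an output ("a fragment header was seen"). */
enum { FH_F_FRAG = 1 << 0, FH_F_AUTH = 1 << 1 };

/* IANA protocol numbers for the headers this sketch understands. */
enum { NH_HOPOPTS = 0, NH_ROUTING = 43, NH_FRAGMENT = 44,
       NH_AH = 51, NH_DSTOPTS = 60 };

/* Walk the header chain in buf, starting at byte `start`, where the first
 * header has type `nexthdr`. `start` plays the role of the offset parameter:
 * it lets a caller begin at an inner header, such as one carried inside an
 * error message. Returns the type of the header where the walk stopped and
 * stores its position in *offset, or returns -1 if the buffer is truncated. */
int find_hdr_sketch(const uint8_t *buf, size_t len, size_t start,
                    uint8_t nexthdr, size_t *offset, int *flags)
{
    assert(buf != NULL && offset != NULL);
    int want_auth = flags && (*flags & FH_F_AUTH);
    size_t pos = start;

    if (flags)
        *flags &= ~FH_F_FRAG;

    for (;;) {
        size_t hdrlen;

        switch (nexthdr) {
        case NH_HOPOPTS:
        case NH_ROUTING:
        case NH_DSTOPTS:
            if (pos + 2 > len)
                return -1;
            hdrlen = ((size_t)buf[pos + 1] + 1) * 8; /* 8-octet units */
            break;
        case NH_FRAGMENT:
            if (pos + 8 > len)
                return -1;
            if (flags)
                *flags |= FH_F_FRAG;
            /* Simplification: stop at a non-first fragment, because the
             * headers that would follow are not in this packet. */
            if ((((unsigned)buf[pos + 2] << 8) | buf[pos + 3]) & 0xFFF8u) {
                *offset = pos;
                return NH_FRAGMENT;
            }
            hdrlen = 8;
            break;
        case NH_AH:
            if (want_auth) {
                *offset = pos;
                return NH_AH;
            }
            if (pos + 2 > len)
                return -1;
            hdrlen = ((size_t)buf[pos + 1] + 2) * 4; /* 4-octet units */
            break;
        default:
            /* Anything else ends the walk (upper-layer protocol, ESP, ...). */
            *offset = pos;
            return nexthdr;
        }
        if (pos + hdrlen > len)
            return -1;
        nexthdr = buf[pos];
        pos += hdrlen;
    }
}
```

A caller that wants to inspect the AH header sets FH_F_AUTH in *flags before the call, and any caller can test FH_F_FRAG afterwards to learn whether the packet is a fragment.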
     

21 Apr, 2012

2 commits

  • This results in code with less boilerplate that is a bit easier
    to read.

    It also stops us from using compatibility code in the sysctl
    core, hastening the day when the compatibility code can be removed.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This makes it clearer which sysctls are relative to your current network
    namespace.

    This makes it a little less error prone by not exposing sysctls for the
    initial network namespace in other namespaces.

    This is the same way we handle all of our other network interfaces to
    userspace and I can't honestly remember why we didn't do this for
    sysctls right from the start.

    Signed-off-by: Eric W. Biederman
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

16 Apr, 2012

1 commit


13 Apr, 2012

1 commit


09 Apr, 2012

1 commit

  • We may hit this in xt_LOG:

    net/built-in.o:xt_LOG.c:function dump_ipv6_packet:
    error: undefined reference to 'ip6t_ext_hdr'

    happens with these config options:

    CONFIG_NETFILTER_XT_TARGET_LOG=y
    CONFIG_IP6_NF_IPTABLES=m

    ip6t_ext_hdr is fairly small and it is called in the packet path.
    Make it static inline.

    Reported-by: Simon Kirby
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
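
The link error above goes away once the helper lives in a header as a static inline function, because built-in code no longer needs a symbol exported by a module. The following is a minimal userspace sketch of such a predicate. The function name and the exact set of header types it accepts are assumptions for illustration, not a copy of the kernel helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IANA protocol numbers for IPv6 extension headers (illustrative set). */
enum {
    NH_HOPOPTS  = 0,
    NH_ROUTING  = 43,
    NH_FRAGMENT = 44,
    NH_ESP      = 50,
    NH_AH       = 51,
    NH_NONE     = 59,
    NH_DSTOPTS  = 60
};

/* Small enough to inline at every call site, so no out-of-line symbol
 * has to be resolved at link time. */
static inline bool is_ipv6_ext_hdr_sketch(uint8_t nexthdr)
{
    return nexthdr == NH_HOPOPTS  ||
           nexthdr == NH_ROUTING  ||
           nexthdr == NH_FRAGMENT ||
           nexthdr == NH_ESP      ||
           nexthdr == NH_AH       ||
           nexthdr == NH_NONE     ||
           nexthdr == NH_DSTOPTS;
}
```

For example, `assert(is_ipv6_ext_hdr_sketch(44))` holds for a fragment header, while the TCP protocol number 6 is rejected.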
     

02 Apr, 2012

1 commit


23 Mar, 2012

1 commit

  • It used to be an int, and it got changed to a bool parameter at least
    7 years ago. It happens that NF_DROP and NF_ACCEPT are 0 and 1, so
    this works, but it's unclear, and the check that it's in range is not
    required.

    Reported-by: Dan Carpenter
    Signed-off-by: Rusty Russell
    Signed-off-by: David S. Miller

    Rusty Russell
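
A minimal sketch of the clearer alternative: map the bool to a verdict explicitly instead of relying on the numeric coincidence. The helper name is hypothetical, and the two constants are redefined here only to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>

/* Verdict values, redefined locally for this sketch. */
#define NF_DROP   0u
#define NF_ACCEPT 1u

/* Hypothetical helper: turn a bool module parameter into a policy verdict.
 * A bool can only be true or false, so no range check is needed. */
static inline unsigned int forward_policy_sketch(bool forward)
{
    return forward ? NF_ACCEPT : NF_DROP;
}
```

With this shape the code keeps working even if the verdict constants were ever renumbered.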
     

08 Mar, 2012

3 commits

  • This patch adds the infrastructure to add fine timeout tuning
    over nfnetlink. Now you can use the NFNL_SUBSYS_CTNETLINK_TIMEOUT
    subsystem to create/delete/dump timeout objects that contain some
    specific timeout policy for one flow.

    Follow-up patches will allow you to attach a timeout policy object
    to conntrack via the CT target and the conntrack extension
    infrastructure.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • This patch defines a new interface for l4 protocol trackers:

    unsigned int *(*get_timeouts)(struct net *net);

    that is used to return the array of unsigned int that contains
    the timeouts that will be applied for this flow. This is passed
    to the l4proto->new(...) and l4proto->packet(...) functions to
    specify the timeout policy.

    This interface allows per-net global timeout configuration
    (although only DCCP supports this for now) and it will allow
    custom timeout configuration by means of follow-up
    patches.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • ipt_LOG and ip6t_LOG have a lot of common code; merge them
    to reduce duplication.

    Signed-off-by: Richard Weinberger
    Signed-off-by: Pablo Neira Ayuso

    Richard Weinberger
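
For the get_timeouts interface described in the second commit of this group, the data flow can be sketched in userspace C as follows. Only the get_timeouts signature comes from the commit message. The struct layouts, the state names, and the simplified packet callback are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connection states and per-namespace storage. */
enum { ST_NEW, ST_ESTABLISHED, ST_MAX };

struct net {
    unsigned int demo_timeouts[ST_MAX]; /* per-net timeout table, in seconds */
};

/* Simplified protocol tracker: the real callbacks take more arguments. */
struct l4proto_sketch {
    unsigned int *(*get_timeouts)(struct net *net);
    unsigned int (*packet)(int state, unsigned int *timeouts);
};

static unsigned int *demo_get_timeouts(struct net *net)
{
    return net->demo_timeouts;
}

/* The tracker no longer reads a global table; it uses what it is handed. */
static unsigned int demo_packet(int state, unsigned int *timeouts)
{
    return timeouts[state];
}

static const struct l4proto_sketch demo_proto = {
    .get_timeouts = demo_get_timeouts,
    .packet       = demo_packet,
};

/* Core side: fetch the timeout array once, then pass it to the tracker.
 * A later patch could substitute a custom array here without touching
 * the tracker itself. */
unsigned int track_packet_sketch(const struct l4proto_sketch *proto,
                                 struct net *net, int state)
{
    assert(proto != NULL && proto->get_timeouts != NULL);
    unsigned int *timeouts = proto->get_timeouts(net);
    return proto->packet(state, timeouts);
}
```

Passing the array as an argument is what makes both per-net and per-flow policies possible with the same tracker code.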
     

15 Jan, 2012

1 commit

  • * 'for-linus' of git://selinuxproject.org/~jmorris/linux-security:
    capabilities: remove __cap_full_set definition
    security: remove the security_netlink_recv hook as it is equivalent to capable()
    ptrace: do not audit capability check when outputing /proc/pid/stat
    capabilities: remove task_ns_* functions
    capabitlies: ns_capable can use the cap helpers rather than lsm call
    capabilities: style only - move capable below ns_capable
    capabilites: introduce new has_ns_capabilities_noaudit
    capabilities: call has_ns_capability from has_capability
    capabilities: remove all _real_ interfaces
    capabilities: introduce security_capable_noaudit
    capabilities: reverse arguments to security_capable
    capabilities: remove the task from capable LSM hook entirely
    selinux: sparse fix: fix several warnings in the security server cod
    selinux: sparse fix: fix warnings in netlink code
    selinux: sparse fix: eliminate warnings for selinuxfs
    selinux: sparse fix: declare selinux_disable() in security.h
    selinux: sparse fix: move selinux_complete_init
    selinux: sparse fix: make selinux_secmark_refcount static
    SELinux: Fix RCU deref check warning in sel_netport_insert()

    Manually fix up a semantic mis-merge wrt security_netlink_recv():

    - the interface was removed in commit fd7784615248 ("security: remove
    the security_netlink_recv hook as it is equivalent to capable()")

    - a new user of it appeared in commit a38f7907b926 ("crypto: Add
    userspace configuration API")

    causing no automatic merge conflict, but Eric Paris pointed out the
    issue.

    Linus Torvalds
     

06 Jan, 2012

1 commit


25 Dec, 2011

1 commit


20 Dec, 2011

1 commit

  • module_param(bool) used to counter-intuitively take an int. In
    fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy
    trick.

    It's time to remove the int/unsigned int option. For this version
    it'll simply give a warning, but it'll break next kernel version.

    (Thanks to Joe Perches for suggesting coccinelle for 0/1 -> true/false).

    Cc: "David S. Miller"
    Cc: netdev@vger.kernel.org
    Signed-off-by: Rusty Russell
    Signed-off-by: David S. Miller

    Rusty Russell
     

13 Dec, 2011

1 commit


04 Dec, 2011

1 commit

  • While parsing through IPv6 extension headers, fragment headers are
    skipped, making them invisible to the caller. This reports the
    fragment offset of the last header in order to make it possible to
    determine whether the packet is fragmented and, if so, whether it is
    a first or last fragment.

    Signed-off-by: Jesse Gross

    Jesse Gross
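
As an illustration of what a caller can do with this information, the sketch below classifies a packet from the 16-bit offset field of its fragment header, taken in host byte order. The enum, the function name, and the extra "fragment header seen" argument are assumptions. The masks follow the IPv6 fragment header layout: a 13-bit offset in the upper bits and the "more fragments" bit in bit 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAG_OFFSET_MASK 0xFFF8u /* offset, in bytes once masked */
#define FRAG_MORE        0x0001u /* "more fragments" bit */

enum frag_kind {
    FRAG_NONE,   /* no fragment header at all */
    FRAG_ATOMIC, /* fragment header present, but the packet is complete */
    FRAG_FIRST,
    FRAG_MIDDLE,
    FRAG_LAST
};

/* frag_off is the offset field of the fragment header in host byte order. */
enum frag_kind classify_fragment_sketch(bool has_frag_hdr, uint16_t frag_off)
{
    if (!has_frag_hdr)
        return FRAG_NONE;

    bool more   = (frag_off & FRAG_MORE) != 0;
    bool offset = (frag_off & FRAG_OFFSET_MASK) != 0;

    if (!offset)
        return more ? FRAG_FIRST : FRAG_ATOMIC;
    return more ? FRAG_MIDDLE : FRAG_LAST;
}
```

A filter that needs upper-layer ports, for instance, can only find them when the result is FRAG_NONE, FRAG_ATOMIC, or FRAG_FIRST.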
     

29 Nov, 2011

1 commit

  • Igor Maravic reported an error caused by jump_label_dec() being called
    from IRQ context:

    BUG: sleeping function called from invalid context at kernel/mutex.c:271
    in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper
    1 lock held by swapper/0:
    #0: (&n->timer){+.-...}, at: [] call_timer_fn+0x0/0x340
    Pid: 0, comm: swapper Not tainted 3.2.0-rc2-net-next-mpls+ #1
    Call Trace:
    [] __might_sleep+0x137/0x1f0
    [] mutex_lock_nested+0x2f/0x370
    [] ? trace_hardirqs_off+0xd/0x10
    [] ? local_clock+0x6f/0x80
    [] ? lock_release_holdtime.part.22+0x15/0x1a0
    [] ? sock_def_write_space+0x59/0x160
    [] ? arp_error_report+0x3e/0x90
    [] atomic_dec_and_mutex_lock+0x5d/0x80
    [] jump_label_dec+0x1d/0x50
    [] net_disable_timestamp+0x15/0x20
    [] sock_disable_timestamp+0x45/0x50
    [] __sk_free+0x80/0x200
    [] ? sk_send_sigurg+0x70/0x70
    [] ? arp_error_report+0x3e/0x90
    [] sock_wfree+0x3a/0x70
    [] skb_release_head_state+0x70/0x120
    [] __kfree_skb+0x16/0x30
    [] kfree_skb+0x49/0x170
    [] arp_error_report+0x3e/0x90
    [] neigh_invalidate+0x89/0xc0
    [] neigh_timer_handler+0x9e/0x2a0
    [] ? neigh_update+0x640/0x640
    [] __do_softirq+0xc8/0x3a0

    Since jump_label_{inc|dec} must be called from process context only,
    we must defer jump_label_dec() if net_disable_timestamp() is called
    from interrupt context.

    Reported-by: Igor Maravic
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
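
The deferral idea can be sketched in userspace with C11 atomics: a decrement requested from interrupt context is only counted, and the next call made from process context applies it. All names below are hypothetical, a plain integer stands in for the jump label key, and the caller states its context explicitly because userspace has no equivalent of an interrupt-context check.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int deferred_decs_sketch; /* decrements waiting to be applied */
static int key_users_sketch;            /* stands in for the jump label key;
                                           only touched in process context */

/* May be called from any context. */
void disable_timestamp_sketch(bool in_interrupt)
{
    if (in_interrupt) {
        /* Cannot take a sleeping lock here: just record the request. */
        atomic_fetch_add(&deferred_decs_sketch, 1);
        return;
    }
    assert(key_users_sketch > 0);
    key_users_sketch--;
}

/* Process context only: first apply any deferred decrements. */
void enable_timestamp_sketch(void)
{
    int pending = atomic_exchange(&deferred_decs_sketch, 0);

    while (pending-- > 0)
        key_users_sketch--;
    key_users_sketch++;
}

int timestamp_users_sketch(void)
{
    return key_users_sketch;
}
```

The cost of this scheme is that the key can stay enabled slightly longer than strictly necessary, until the next process-context call drains the deferred count.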
     

27 Nov, 2011

1 commit


24 Nov, 2011

1 commit


23 Nov, 2011

1 commit


01 Nov, 2011

1 commit


19 Oct, 2011

1 commit

  • To ease skb->truesize sanitization, it's better to be able to localize
    all references to skb frags size.

    Define accessors: skb_frag_size() to fetch frag size, and
    skb_frag_size_{set|add|sub}() to manipulate it.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
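
The accessor pattern can be sketched as below. The struct is a hypothetical stand-in for a fragment descriptor, and the function names only mirror the ones listed above. The point is that every read or write of the size goes through one of four small functions, so all users can be found and audited with a single search.

```c
#include <assert.h>

/* Hypothetical stand-in for a fragment descriptor. */
struct frag_sketch {
    unsigned int page_offset;
    unsigned int size;
};

static inline unsigned int frag_size_sketch(const struct frag_sketch *frag)
{
    return frag->size;
}

static inline void frag_size_set_sketch(struct frag_sketch *frag,
                                        unsigned int size)
{
    frag->size = size;
}

static inline void frag_size_add_sketch(struct frag_sketch *frag,
                                        unsigned int delta)
{
    frag->size += delta;
}

static inline void frag_size_sub_sketch(struct frag_sketch *frag,
                                        unsigned int delta)
{
    assert(delta <= frag->size); /* guard against unsigned wrap-around */
    frag->size -= delta;
}
```

If the representation of the size later changes, only these four functions need to be updated.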
     

30 Aug, 2011

1 commit

  • A userspace listener may send a (bogus) NF_STOLEN verdict, which causes an skb leak.

    This problem was previously fixed via
    64507fdbc29c3a622180378210ecea8659b14e40 (netfilter:
    nf_queue: fix NF_STOLEN skb leak) but this had to be reverted because
    NF_STOLEN can also be returned by a netfilter hook when iterating the
    rules in nf_reinject.

    Reject userspace NF_STOLEN verdict, as suggested by Michal Miroslaw.

    This is complementary to commit fad54440438a7c231a6ae347738423cbabc936d9
    (netfilter: avoid double free in nf_reinject).

    Cc: Julian Anastasov
    Cc: Eric Dumazet
    Signed-off-by: Florian Westphal
    Signed-off-by: Patrick McHardy

    Florian Westphal
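
The validation step can be sketched as a small predicate applied to the verdict word received from userspace. The function name is hypothetical, and the constants are redefined locally to keep the example self-contained; treat their values as assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Verdict values, redefined locally for this sketch. */
#define NF_DROP         0u
#define NF_ACCEPT       1u
#define NF_STOLEN       2u
#define NF_QUEUE        3u
#define NF_REPEAT       4u
#define NF_STOP         5u
#define NF_MAX_VERDICT  NF_STOP
#define NF_VERDICT_MASK 0x000000ffu

/* Returns true if a verdict sent by a userspace listener may be applied.
 * NF_STOLEN is refused: it would tell the kernel that someone else now
 * owns the skb, which is never true for a userspace verdict, so the
 * skb would never be freed. */
bool userspace_verdict_ok_sketch(uint32_t raw_verdict)
{
    uint32_t verdict = raw_verdict & NF_VERDICT_MASK;

    return verdict <= NF_MAX_VERDICT && verdict != NF_STOLEN;
}
```

Hooks inside the kernel can still return NF_STOLEN during reinjection. Only the userspace path is restricted.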
     

29 Jul, 2011

1 commit

  • ipq_build_packet_message() in net/ipv4/netfilter/ip_queue.c and
    net/ipv6/netfilter/ip6_queue.c contain a small potential mem leak as
    far as I can tell.

    We allocate memory for 'skb' with alloc_skb() and then call
    nlh = NLMSG_PUT(skb, 0, 0, IPQM_PACKET, size - sizeof(*nlh));

    NLMSG_PUT is a macro
    NLMSG_PUT(skb, pid, seq, type, len) \
    NLMSG_NEW(skb, pid, seq, type, len, 0)

    that expands to NLMSG_NEW, which is also a macro which expands to:
    NLMSG_NEW(skb, pid, seq, type, len, flags) \
    ({ if (unlikely(skb_tailroom(skb) < (int)NLMSG_SPACE(len))) \
    goto nlmsg_failure; \
    __nlmsg_put(skb, pid, seq, type, len, flags); })

    If we take the true branch of the 'if' statement and 'goto
    nlmsg_failure', then we'll, at that point, return from
    ipq_build_packet_message() without having assigned 'skb' to anything
    and we'll leak the memory we allocated for it when it goes out of
    scope.

    Fix this by placing a 'kfree_skb(skb)' at 'nlmsg_failure'.

    I admit that I do not know how likely this is to actually happen or even
    if there's something that guarantees that it will never happen. I'm
    not that familiar with this code, but if that is so, I've not been
    able to spot it.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Patrick McHardy

    Jesper Juhl
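
The same class of bug, and its fix, can be reproduced in plain userspace C. The sketch below uses malloc and free instead of the skb API, and all names are hypothetical. It shows a failure label that releases the buffer before returning, which is the step that was missing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical message buffer standing in for an skb. */
struct msg_sketch {
    size_t cap;          /* bytes allocated */
    size_t len;          /* bytes used */
    unsigned char *data;
};

void free_message_sketch(struct msg_sketch *m)
{
    if (m) {
        free(m->data);
        free(m);
    }
}

/* Allocate a buffer of `cap` bytes and reserve `need` bytes in it.
 * Returns NULL on failure, without leaking the allocation. */
struct msg_sketch *build_message_sketch(size_t cap, size_t need)
{
    struct msg_sketch *m = malloc(sizeof *m);

    if (!m)
        return NULL;
    m->cap = cap;
    m->len = 0;
    m->data = malloc(cap ? cap : 1);
    if (!m->data)
        goto failure;

    /* Same shape as the tailroom check hidden inside the macro. */
    if (m->cap - m->len < need)
        goto failure;

    memset(m->data, 0, need);
    m->len = need;
    return m;

failure:
    /* Without this call the buffer would be lost on every failed build. */
    free_message_sketch(m);
    return NULL;
}
```

Hiding a goto inside a macro is what made the original leak easy to miss, since the jump is not visible at the call site.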
     

16 Jun, 2011

1 commit

  • By default, when broadcast or multicast packets are sent from a local
    application, they are sent to the interface and then looped by the kernel
    to other local applications, going through netfilter hooks in the
    process.

    These looped packets have their MAC header removed from the skb by the
    kernel looping code. This confuses netfilter's netlink queue,
    netlink log and the legacy ip_queue, because they try to extract a
    hardware address from these packets, but extract a part of the IP
    header instead.

    This patch prevents NFQUEUE, NFLOG and ip_QUEUE from including a MAC
    header if there is none in the packet.

    Signed-off-by: Nicolas Cavallari
    Signed-off-by: Patrick McHardy

    Nicolas Cavallari
     

06 Jun, 2011

3 commits


12 May, 2011

1 commit


10 May, 2011

1 commit

  • The IPv6 header is not zeroed out in alloc_skb so we must initialize
    it properly unless we want to see IPv6 packets with random TOS fields
    floating around. The current implementation resets the flow label
    but this could be changed if deemed necessary.

    We stumbled upon this issue when trying to apply a mangle rule to
    the RST packet generated by the REJECT target module.

    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Pablo Neira Ayuso

    Fernando Luis Vazquez Cao
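
To make the initialization concrete: the first 32 bits of an IPv6 header hold a 4-bit version, an 8-bit traffic class, and a 20-bit flow label. The sketch below builds and writes that word explicitly, so no stale bytes from the allocation survive. The function names are hypothetical, and this is a userspace illustration rather than the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Compose version (always 6), traffic class and flow label. */
static inline uint32_t ip6_first_word_sketch(uint8_t tclass, uint32_t flowlabel)
{
    return (6u << 28) | ((uint32_t)tclass << 20) | (flowlabel & 0x000FFFFFu);
}

/* Store the word at the start of the header in network byte order. */
void ip6_write_first_word_sketch(uint8_t *hdr, uint8_t tclass,
                                 uint32_t flowlabel)
{
    uint32_t word = ip6_first_word_sketch(tclass, flowlabel);

    hdr[0] = (uint8_t)(word >> 24);
    hdr[1] = (uint8_t)(word >> 16);
    hdr[2] = (uint8_t)(word >> 8);
    hdr[3] = (uint8_t)word;
}
```

Passing a flow label of 0 corresponds to the behaviour described above, where the flow label is reset.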
     

20 Apr, 2011

1 commit


18 Apr, 2011

2 commits


04 Apr, 2011

1 commit

  • We currently use a percpu spinlock to 'protect' rule bytes/packets
    counters, after various attempts to use RCU instead.

    Lately we added a seqlock so that get_counters() can run without
    blocking BH or 'writers'. But we really only need the seqcount in it.

    Spinlock itself is only locked by the current/owner cpu, so we can
    remove it completely.

    This cleans up the API, using correct 'writer' vs 'reader' semantics.

    At replace time, the get_counters() call makes sure all cpus are done
    using the old table.

    Signed-off-by: Eric Dumazet
    Cc: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Eric Dumazet
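
The sequence counter idea can be sketched in userspace with C11 atomics. This is not the kernel's seqcount API: the names are hypothetical, every access uses the default sequentially consistent ordering for simplicity, and the sketch assumes a single writer per counter, matching the observation above that only the owning CPU updates its counters.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct counter_sketch {
    atomic_uint seq;          /* odd while an update is in progress */
    _Atomic uint64_t bytes;
    _Atomic uint64_t packets;
};

/* Writer side: one writer per counter, so no lock is required. */
void counter_add_sketch(struct counter_sketch *c, uint64_t bytes)
{
    atomic_fetch_add(&c->seq, 1); /* becomes odd: update begins */
    atomic_store(&c->bytes, atomic_load(&c->bytes) + bytes);
    atomic_store(&c->packets, atomic_load(&c->packets) + 1);
    atomic_fetch_add(&c->seq, 1); /* becomes even: update complete */
}

/* Reader side: never blocks the writer, retries if it raced with one. */
void counter_read_sketch(struct counter_sketch *c,
                         uint64_t *bytes, uint64_t *packets)
{
    unsigned int start;

    do {
        start = atomic_load(&c->seq);
        *bytes = atomic_load(&c->bytes);
        *packets = atomic_load(&c->packets);
    } while ((start & 1u) || start != atomic_load(&c->seq));
}
```

The reader pays with a retry when it overlaps an update, while the packet path only performs two increments around its counter update.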
     

31 Mar, 2011

1 commit


20 Mar, 2011

1 commit

  • Commit f3c5c1bfd4308 (make ip_tables reentrant) introduced a race in
    handling the stackptr restore at the end of ipt_do_table().

    We should do it before the call to xt_info_rdunlock_bh(); otherwise we
    allow preemption, and another CPU can overwrite the stackptr of the
    original one.

    A second fix is to change the underflow test to check the origptr value
    instead of 0 to detect underflow, or else we allow a jump from different
    hooks.

    Signed-off-by: Eric Dumazet
    Cc: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Eric Dumazet
     

16 Mar, 2011

1 commit