28 Jun, 2006

1 commit

  • locking init cleanups:

    - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
    - convert rwlocks in a similar manner

    This patch was generated automatically.

    Motivation:

    - cleanliness
    - lockdep needs control of lock initialization, which the open-coded
    variants do not give
    - it's also useful for -rt and for lock debugging in general

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
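
    A minimal user-space sketch of the conversion pattern described above.
    The spinlock_t type and both macros here are stand-ins written for this
    example (the real kernel definitions differ); they only illustrate why
    an init call gives debugging code a hook that a plain assignment of
    SPIN_LOCK_UNLOCKED does not. struct device_state and table_lock are
    hypothetical names.

```c
/* User-space stand-ins for the kernel lock API (assumption: the real
 * definitions live in the kernel's spinlock headers and look different). */
typedef struct {
    int locked;
    const char *name;   /* what a lock debugger might want to record */
} spinlock_t;

#define DEFINE_SPINLOCK(x)  spinlock_t x = { 0, #x }
#define spin_lock_init(l)   do { (l)->locked = 0; (l)->name = #l; } while (0)

/* Before: spinlock_t table_lock = SPIN_LOCK_UNLOCKED; */
DEFINE_SPINLOCK(table_lock);

/* Hypothetical structure with an embedded lock. */
struct device_state {
    spinlock_t lock;
    int users;
};

void device_state_setup(struct device_state *st)
{
    /* Before: st->lock = SPIN_LOCK_UNLOCKED; */
    spin_lock_init(&st->lock);
    st->users = 0;
}
```

    With the open-coded assignment there is no single place where extra
    per-lock state could be set up; with the macro forms there is.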
     

23 Jun, 2006

4 commits

  • This patch segments GSO packets received by the IPsec stack. This can
    happen when a NIC driver injects GSO packets into the stack which are
    then forwarded to another host.

    The primary application of this is going to be Xen where its backend
    driver may inject GSO packets into dom0.

    Of course this also can be used by other virtualisation schemes such as
    VMWare or UML since the tap device could be modified to inject GSO packets
    received through splice.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Having separate fields in sk_buff for TSO/UFO (tso_size/ufo_size) is not
    going to scale if we add any more segmentation methods (e.g., DCCP). So
    let's merge them.

    They were used to tell the protocol of a packet. This function has been
    subsumed by the new gso_type field. This is essentially a set of netdev
    feature bits (shifted by 16 bits) that are required to process a specific
    skb. As such it's easy to tell whether a given device can process a GSO
    skb: you just have to AND the gso_type field with the netdev's features
    field.

    I've made gso_type a conjunction. The idea is that you have a base type
    (e.g., SKB_GSO_TCPV4) that can be modified further to support new features.
    For example, if we add a hardware TSO type that supports ECN, they would
    declare NETIF_F_TSO | NETIF_F_TSO_ECN. All TSO packets with CWR set would
    have a gso_type of SKB_GSO_TCPV4 | SKB_GSO_TCPV4_ECN while all other TSO
    packets would be SKB_GSO_TCPV4. This means that only the CWR packets need
    to be emulated in software.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
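
    The feature test described above could be sketched as follows. The
    numeric bit values and the helper name dev_can_segment() are assumptions
    made for this example, not the kernel's definitions; only the 16-bit
    shift and the flag names come from the commit message.

```c
#include <stdbool.h>

/* Assumed bit assignments, for illustration only. */
#define GSO_SHIFT           16
#define SKB_GSO_TCPV4       (1u << 0)
#define SKB_GSO_TCPV4_ECN   (1u << 1)
#define NETIF_F_TSO         (SKB_GSO_TCPV4 << GSO_SHIFT)
#define NETIF_F_TSO_ECN     (SKB_GSO_TCPV4_ECN << GSO_SHIFT)

/* Hypothetical helper: gso_type is a conjunction, so every bit it sets
 * must also be present (shifted) in the device's feature mask. */
bool dev_can_segment(unsigned int features, unsigned int gso_type)
{
    unsigned int needed = gso_type << GSO_SHIFT;

    return (features & needed) == needed;
}
```

    Under these assumptions a device that declares only NETIF_F_TSO accepts
    plain SKB_GSO_TCPV4 packets, while packets that also carry
    SKB_GSO_TCPV4_ECN fall back to software segmentation.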
     
  • We need to update hiscore.rule even if we don't enable CONFIG_IPV6_PRIVACY,
    because one more, less significant rule follows it: longest match.

    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki
     
  • Two additional labels (RFC 3484, sec. 10.3) for IPv6 addresses
    are defined to make a distinction between global unicast
    addresses and Unique Local Addresses (fc00::/7, RFC 4193) and
    Teredo (2001::/32, RFC 4380). This is necessary to avoid connection
    attempts that would either fail (e.g. fec0:: to 2001:feed::)
    or be sub-optimal (2001:0:: to 2001:feed::).

    Signed-off-by: Łukasz Stelmach
    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    Łukasz Stelmach
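
    A rough sketch of classifying an address by the two prefixes named
    above. The enum and function names are hypothetical and the label
    values are symbolic; the numeric labels used by the actual policy table
    are not reproduced here.

```c
#include <stdint.h>

/* Symbolic labels for this example only. */
enum toy_addr_label {
    TOY_LABEL_GLOBAL,
    TOY_LABEL_ULA,      /* fc00::/7,  RFC 4193 */
    TOY_LABEL_TEREDO    /* 2001::/32, RFC 4380 */
};

/* addr is a 16-byte IPv6 address in network byte order. */
enum toy_addr_label toy_addr_label(const uint8_t addr[16])
{
    /* fc00::/7: the top seven bits are 1111 110 */
    if ((addr[0] & 0xfe) == 0xfc)
        return TOY_LABEL_ULA;

    /* 2001::/32: the first 32 bits are 2001:0000 */
    if (addr[0] == 0x20 && addr[1] == 0x01 &&
        addr[2] == 0x00 && addr[3] == 0x00)
        return TOY_LABEL_TEREDO;

    return TOY_LABEL_GLOBAL;
}
```

    Source address selection can then prefer a source whose label matches
    the destination's, so 2001:0:: and 2001:feed:: are no longer treated as
    the same kind of address.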
     

18 Jun, 2006

10 commits

  • This patch fixes RTNLGRP_IPV6_IFINFO netlink notifications. Issue
    pointed out by Patrick McHardy.

    Signed-off-by: YOSHIFUJI Hideaki
    Acked-by: Patrick McHardy
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki
     
  • I found a few more spots where pskb_trim_rcsum could be used but were not.
    This patch changes them to use it.

    Also, sk_filter can get paged skb data. Therefore we must use pskb_trim
    instead of skb_trim.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • The linearisation operation doesn't need to be super-optimised. So we can
    replace __skb_linearize with __pskb_pull_tail which does the same thing but
    is more general.

    Also, most users of skb_linearize end up testing whether the skb is linear
    or not so it helps to make skb_linearize do just that.

    Some callers of skb_linearize also use it to copy cloned data, so it's
    useful to have a new function skb_linearize_cow to copy the data if it's
    either non-linear or cloned.

    Last but not least, I've removed the gfp argument since nobody uses it
    anymore. If it's ever needed we can easily add it back.

    Misc bugs fixed by this patch:

    * via-velocity error handling (also, no SG => no frags)

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Add a secmark field to the skbuff structure, to allow security subsystems to
    place security markings on network packets. This is similar to the nfmark
    field, except that it is intended for implementing security policy, rather
    than networking policy.

    This patch was already acked in principle by Dave Miller.

    Signed-off-by: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    James Morris
     
  • Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • None of the existing helpers expects to get called for related ICMP
    packets and some even drop them if they can't parse them.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Now that we have xfrm_mode objects we can move the transport mode specific
    input decapsulation code into xfrm_mode_transport. This removes duplicate
    code as well as unnecessary header movement in case of tunnel mode SAs
    since we will discard the original IP header immediately.

    This also fixes a minor bug for transport-mode ESP where the IP payload
    length is set to the correct value minus the header length (with extension
    headers for IPv6).

    Of course the other neat thing is that we no longer have to allocate
    temporary buffers to hold the IP headers for ESP and IPComp.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch adds the structure xfrm_mode. It is meant to represent
    the operations carried out by transport/tunnel modes.

    By doing this we allow additional encapsulation modes to be added
    without clogging up the xfrm_input/xfrm_output paths.

    Candidate modes include 4-to-6 tunnel mode, 6-to-4 tunnel mode, and
    BEET modes.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
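
    The idea of a per-mode operations structure can be sketched in user
    space like this. Every name below (toy_xfrm_mode, toy_register_mode and
    so on) is hypothetical, and the 20-byte outer header is just a
    placeholder; the real xfrm_mode structure and its registration
    interface are not shown in this log.

```c
/* Toy packet: only the length matters for this sketch. */
struct toy_packet {
    int len;
};

/* One set of callbacks per encapsulation mode. */
struct toy_xfrm_mode {
    int (*input)(struct toy_packet *pkt);
    int (*output)(struct toy_packet *pkt);
};

enum { TOY_MODE_TRANSPORT, TOY_MODE_TUNNEL, TOY_MODE_MAX };

static const struct toy_xfrm_mode *toy_modes[TOY_MODE_MAX];

int toy_register_mode(int id, const struct toy_xfrm_mode *mode)
{
    if (id < 0 || id >= TOY_MODE_MAX || toy_modes[id])
        return -1;
    toy_modes[id] = mode;
    return 0;
}

/* The generic output path only dispatches; it knows nothing about modes. */
int toy_mode_output(int id, struct toy_packet *pkt)
{
    if (id < 0 || id >= TOY_MODE_MAX || !toy_modes[id])
        return -1;
    return toy_modes[id]->output(pkt);
}

/* Example mode: add an outer header on output, strip it on input. */
static int toy_tunnel_input(struct toy_packet *pkt)
{
    if (pkt->len < 20)
        return -1;
    pkt->len -= 20;
    return 0;
}

static int toy_tunnel_output(struct toy_packet *pkt)
{
    pkt->len += 20;
    return 0;
}

const struct toy_xfrm_mode toy_tunnel_mode = {
    toy_tunnel_input,
    toy_tunnel_output
};
```

    Adding another mode then means registering one more structure rather
    than adding another branch to the shared input and output paths.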
     
  • The number of locks used to manage afinfo structures can easily be reduced
    down to one each for policy and state respectively. This is based on the
    observation that the write locks are only held by module insertion/removal
    which are very rare events so there is no need to further differentiate
    between the insertion of modules like ipv6 versus esp6.

    The removal of the read locks in xfrm4_policy.c/xfrm6_policy.c might look
    suspicious at first. However, after you realise that nobody ever takes
    the corresponding write lock you'll feel better :)

    As far as I can gather it's an attempt to guard against the removal of
    the corresponding modules. Since neither module can be unloaded at all
    we can leave it to whoever fixes up IPv6 unloading :)

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Locks down user pages and sets up for DMA in tcp_recvmsg, then calls
    dma_async_try_early_copy in tcp_v4_do_rcv.

    Signed-off-by: Chris Leech
    Signed-off-by: David S. Miller

    Chris Leech
     

27 May, 2006

1 commit


23 May, 2006

1 commit


19 May, 2006

2 commits

  • Solar Designer found a race condition in do_add_counters(). The beginning
    of paddc is supposed to be the same as tmp which was sanity-checked
    above, but it might not be the same in reality. In case the integer
    overflow and/or the race condition are triggered, paddc->num_counters
    might not match the allocation size for paddc. If the check below
    (t->private->number != paddc->num_counters) nevertheless passes (perhaps
    this requires the race condition to be triggered), IPT_ENTRY_ITERATE()
    would read kernel memory beyond the allocation size, potentially causing
    an oops or leaking sensitive data (e.g., passwords from host system or
    from another VPS) via counter increments. This requires CAP_NET_ADMIN.

    Signed-off-by: Solar Designer
    Signed-off-by: Kirill Korotaev
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Solar Designer
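
    The general hazard here is a double fetch: a size is validated from one
    read of user memory and then trusted after a second read. The sketch
    below is a user-space model of one common defence (read the header
    once, check for overflow, and use only the validated copy). It is not
    the change made by this patch, and all names in it are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct toy_counter { uint64_t pcnt, bcnt; };
struct toy_hdr     { uint32_t num_counters; uint32_t reserved; };

/* Returns a private copy of the counters, or NULL on any inconsistency.
 * 'user' stands in for memory another thread may modify at any time. */
struct toy_counter *toy_fetch_counters(const void *user, size_t user_len,
                                       uint32_t *num_out)
{
    struct toy_hdr hdr;
    struct toy_counter *copy;
    size_t bytes;

    if (user_len < sizeof(hdr))
        return NULL;
    memcpy(&hdr, user, sizeof(hdr));        /* the only read of the header */

    /* Reject counts whose byte size would overflow size_t. */
    if (hdr.num_counters == 0 ||
        hdr.num_counters > (SIZE_MAX - sizeof(hdr)) / sizeof(*copy))
        return NULL;

    bytes = (size_t)hdr.num_counters * sizeof(*copy);
    if (user_len != sizeof(hdr) + bytes)
        return NULL;

    copy = malloc(bytes);
    if (!copy)
        return NULL;
    memcpy(copy, (const unsigned char *)user + sizeof(hdr), bytes);

    /* The count comes from the validated copy, never from 'user' again. */
    *num_out = hdr.num_counters;
    return copy;
}
```

    Because the count handed back to the caller is the one that sized the
    allocation, a later iteration over the counters cannot run past it.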
     
  • The prefix argument for nf_log_packet is a format specifier,
    so don't pass the user defined string directly to it.

    Signed-off-by: Philip Craig
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Philip Craig
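
    The same class of bug can be shown with the standard printf family.
    toy_log_packet() below is a hypothetical stand-in for a logging
    function that takes a format string; it is not nf_log_packet's real
    signature.

```c
#include <stdarg.h>
#include <stdio.h>

/* Stand-in for a logger whose 'fmt' argument is a printf-style format. */
int toy_log_packet(char *out, size_t outlen, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out, outlen, fmt, ap);
    va_end(ap);
    return n;
}

int toy_log_with_prefix(char *out, size_t outlen, const char *user_prefix)
{
    /* Unsafe: toy_log_packet(out, outlen, user_prefix);
     * any '%' in the user string would be interpreted as a conversion. */

    /* Safe: the user string is passed as data, not as the format. */
    return toy_log_packet(out, outlen, "%s", user_prefix);
}
```

    With the "%s" form a prefix such as "100%d%s" is logged literally
    instead of making the formatter read arguments that were never passed.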
     

17 May, 2006

1 commit


11 May, 2006

1 commit


30 Apr, 2006

1 commit

  • We eliminated rt6_dflt_lock (which protected the default router pointer)
    in 2.6.17-rc1, and introduced rt6_select() for general router selection.
    The function is called with the rt6_lock read lock held,
    which means we have some race conditions when we do round-robin.

    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki
     

25 Apr, 2006

1 commit


19 Apr, 2006

4 commits


12 Apr, 2006

1 commit

  • This closes a race where an ipq6hashfn() caller could get a hash value
    and race with the cycling of the random seed. By the time they got to
    the read_lock they'd have a stale hash value and might not find
    previous fragments of their datagram.

    This matches the previous patch to IPv4.

    Signed-off-by: Zach Brown
    Signed-off-by: David S. Miller

    Zach Brown
     

11 Apr, 2006

1 commit

  • for_each_cpu() actually iterates across all possible CPUs. We've had mistakes
    in the past where people were using for_each_cpu() where they should have been
    iterating across only online or present CPUs. This is inefficient and
    possibly buggy.

    We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
    future.

    This patch replaces for_each_cpu with for_each_possible_cpu under /net

    Signed-off-by: KAMEZAWA Hiroyuki
    Acked-by: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
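
    A toy model of the distinction, using bitmasks in user space. The masks,
    the iteration macro and the per-CPU counter array are all invented for
    this example and are not the kernel's cpumask API.

```c
#define TOY_NR_CPUS 8

unsigned int toy_possible_mask = 0x0f;  /* CPUs 0-3 could ever be present */
unsigned int toy_online_mask   = 0x03;  /* only CPUs 0-1 are running now  */
unsigned long toy_counter[TOY_NR_CPUS]; /* one statistics slot per CPU    */

/* Visit each CPU whose bit is set in 'mask'. The body must be a single
 * statement or a braced block, since the macro ends in an 'if'. */
#define toy_for_each_cpu_mask(cpu, mask)              \
    for ((cpu) = 0; (cpu) < TOY_NR_CPUS; (cpu)++)     \
        if ((mask) & (1u << (cpu)))

/* Totals must include CPUs that counted something and later went offline. */
unsigned long toy_sum_possible(void)
{
    unsigned long sum = 0;
    int cpu;

    toy_for_each_cpu_mask(cpu, toy_possible_mask)
        sum += toy_counter[cpu];
    return sum;
}

/* Work that only makes sense on running CPUs should use the online set. */
unsigned long toy_sum_online(void)
{
    unsigned long sum = 0;
    int cpu;

    toy_for_each_cpu_mask(cpu, toy_online_mask)
        sum += toy_counter[cpu];
    return sum;
}
```

    Which set is right depends on the caller; the point of the rename is
    that the choice is now visible in the name of the iterator.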
     

10 Apr, 2006

9 commits

  • Deinline a few functions which produce 200+ bytes of code.

    Size  Uses  Wasted  Name and definition
    ====  ====  ======  ==========================================================
     429     3     818  __inet6_lookup              include/net/inet6_hashtables.h
     404     2     384  __inet6_lookup_established  include/net/inet6_hashtables.h
     206     3     372  __inet6_hash                include/net/inet6_hashtables.h

    Signed-off-by: Denis Vlasenko
    Signed-off-by: David S. Miller

    Denis Vlasenko
     
  • Can't build with CONFIG_NETFILTER=y/m on IA64 because of a missing
    #include in net/ipv6/netfilter.c

    net/ipv6/netfilter.c: In function `nf_ip6_checksum':
    net/ipv6/netfilter.c:92: warning: implicit declaration of function
    `csum_ipv6_magic'

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
     
  • Besides removing lots of duplicate code, all converted users benefit
    from improved HW checksum error handling. Tested with and without HW
    checksums in almost all combinations.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Add checksum operation which takes care of verifying the checksum and
    dealing with HW checksum errors and avoids multiple checksum
    operations by setting ip_summed to CHECKSUM_UNNECESSARY after
    successful verification.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
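
    A user-space sketch of the "verify once, then remember" part of this
    idea, using the standard Internet checksum (one's-complement sum of
    16-bit words). The packet structure, the function names and the enum
    values are assumptions for illustration; handling of hardware checksum
    errors is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only. */
enum { CHECKSUM_NONE = 0, CHECKSUM_UNNECESSARY = 1 };

struct toy_packet {
    const uint8_t *data;    /* covers the checksum field itself */
    size_t len;             /* assumed small enough not to overflow 'sum' */
    int ip_summed;
};

/* One's-complement sum over big-endian 16-bit words, folded to 16 bits. */
uint16_t toy_fold_sum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;    /* pad the odd byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Returns 1 if the checksum is good, 0 otherwise. */
int toy_checksum_ok(struct toy_packet *pkt)
{
    if (pkt->ip_summed == CHECKSUM_UNNECESSARY)
        return 1;                       /* already verified earlier */

    /* Data that includes a correct checksum sums to 0xffff. */
    if (toy_fold_sum(pkt->data, pkt->len) != 0xffff)
        return 0;

    pkt->ip_summed = CHECKSUM_UNNECESSARY;  /* later callers can skip this */
    return 1;
}
```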
     
  • Change the queue rerouter infrastructure into a generically usable
    infrastructure for address family specific operations as a base for
    some cleanups.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Fix section mismatch warnings caused by netfilter's init_or_cleanup
    functions used in many places by splitting the init from the cleanup
    parts.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Clean up hook registration by making use of the new mass registration and
    unregistration helpers.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • This patch changes GRE and SIT to generate port unreachable instead of
    protocol unreachable errors when we can't find a matching tunnel for a
    packet.

    This removes the ambiguity as to whether the error is caused by no
    tunnel being found or by the lack of support for the given tunnel
    type.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch moves the sending of ICMP messages when there are no IPv4/IPv6
    tunnels present to tunnel4/tunnel6 respectively. Please note that for now
    if xfrm4_tunnel/xfrm6_tunnel is loaded then no ICMP messages will ever be
    sent. This is similar to how we handle AH/ESP/IPCOMP.

    This move fixes the bug where we always send an ICMP message when there is
    no ip6_tunnel device present for a given packet even if it is later handled
    by IPsec. It also causes ICMP messages to be sent when no IPIP tunnel is
    present.

    I've decided to use the "port unreachable" ICMP message over the current
    value of "address unreachable" (and "protocol unreachable" by GRE) because
    it is not ambiguous unlike the other ones which can be triggered by other
    conditions. There seems to be no standard specifying what value must be
    used so this change should be OK. In fact we should change GRE to use
    this value as well.

    Incidentally, this patch also fixes a fairly serious bug in xfrm6_tunnel
    where we don't check whether the embedded IPv6 header is present before
    dereferencing it for the inside source address.

    This patch is inspired by a previous patch by Hugo Santos.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
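
    The xfrm6_tunnel fix mentioned above amounts to checking that the inner
    header is really there before reading from it. A user-space sketch of
    that check on a flat buffer follows; the function name and the way the
    inner offset is passed are assumptions. The fixed IPv6 header is 40
    bytes long with the source address at byte offset 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_IPV6_HDR_LEN   40
#define TOY_IPV6_SADDR_OFF 8

/* Copies the inner source address into saddr. Returns 0 on success, or
 * -1 if the packet is too short to hold a full inner IPv6 header. */
int toy_inner_saddr(const uint8_t *pkt, size_t len, size_t inner_off,
                    uint8_t saddr[16])
{
    /* Check presence first; written to avoid overflow in the arithmetic. */
    if (inner_off > len || len - inner_off < TOY_IPV6_HDR_LEN)
        return -1;

    memcpy(saddr, pkt + inner_off + TOY_IPV6_SADDR_OFF, 16);
    return 0;
}
```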
     

01 Apr, 2006

2 commits