03 Dec, 2011

1 commit


29 Nov, 2011

1 commit

  • We need to set np->mcast_hops to its default value at this moment;
    otherwise, when we use it and find that its value is -1, the logic to
    get the default hop limit doesn't take multicast into account and will
    return the wrong hop limit (IPV6_DEFAULT_HOPLIMIT), which is for unicast.

    Signed-off-by: Li Wei
    Signed-off-by: David S. Miller

    Li Wei
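
The fallback problem described in the commit above can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: the struct, the function names and the two constants (64 for unicast, 1 for multicast) are assumptions chosen for the example.

```c
#include <stdbool.h>

/* Illustrative defaults; the names and values are assumptions for this sketch. */
#define SKETCH_DEFAULT_HOPLIMIT   64  /* unicast default */
#define SKETCH_DEFAULT_MCASTHOPS   1  /* multicast default */

struct sketch_np6 {
    int hop_limit;   /* unicast hop limit, -1 means "use default" */
    int mcast_hops;  /* multicast hop limit */
};

/* Lookup path: a negative value falls back to the unicast default only. */
static int sketch_effective_hops(const struct sketch_np6 *np, bool multicast)
{
    int hops = multicast ? np->mcast_hops : np->hop_limit;

    if (hops < 0)
        hops = SKETCH_DEFAULT_HOPLIMIT; /* does not consider multicast */
    return hops;
}

/* Set path with the fix: store the multicast default instead of -1. */
static void sketch_set_mcast_hops(struct sketch_np6 *np, int val)
{
    np->mcast_hops = (val == -1) ? SKETCH_DEFAULT_MCASTHOPS : val;
}
```

Resolving -1 at set time keeps the lookup path unchanged while still giving multicast traffic its own default.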
     

23 Nov, 2011

1 commit


21 Oct, 2011

1 commit

  • Up till now the IP{,V6}_TRANSPARENT socket options (which actually set
    the same bit in the socket struct) have required CAP_NET_ADMIN
    privileges to set or clear the option.

    - we make clearing the bit not require any privileges.
    - we allow CAP_NET_ADMIN to set the bit (as before this change)
    - we allow CAP_NET_RAW to set this bit, because raw
      sockets already pretty much effectively allow you
      to emulate socket transparency.

    Signed-off-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller

    Maciej Żenczykowski
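
The three rules in the commit above amount to a small permission check. The following is a hedged sketch: the function name and the boolean capability flags are hypothetical stand-ins for the kernel's capability tests.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Returns 0 if the caller may change the transparent bit, -EPERM otherwise.
 * has_net_admin / has_net_raw stand in for CAP_NET_ADMIN / CAP_NET_RAW.
 */
static int sketch_transparent_check(bool enable, bool has_net_admin,
                                    bool has_net_raw)
{
    /* Clearing the bit never needs a privilege. */
    if (!enable)
        return 0;

    /* Setting it needs either capability. */
    if (has_net_admin || has_net_raw)
        return 0;

    return -EPERM;
}
```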
     

31 Aug, 2011

1 commit


19 Aug, 2011

1 commit

  • IPV6_2292PKTOPTIONS is broken for 32-bit applications running
    in COMPAT mode on 64-bit kernels.

    The same problem was fixed for IPv4 with the patch:
    ipv4: Fix ip_getsockopt for IP_PKTOPTIONS,
    commit dd23198e58cd35259dd09e8892bbdb90f1d57748

    Signed-off-by: Sorin Dumitru
    Signed-off-by: Daniel Baluta
    Signed-off-by: David S. Miller

    Daniel Baluta
     

13 Mar, 2011

2 commits


25 Oct, 2010

1 commit


21 Oct, 2010

1 commit


26 Jun, 2010

1 commit

  • commit 9261e5370112 (ipv6: making ip and icmp statistics per/namespace)
    forgot to remove ipv6_statistics variable.

    commit bc417d99bf27 (ipv6: remove stale MIB definitions) took care of
    icmpv6_statistics & icmpv6msg_statistics

    Signed-off-by: Eric Dumazet
    CC: Denis V. Lunev
    CC: Alexey Dobriyan
    CC: Hideaki YOSHIFUJI
    Signed-off-by: David S. Miller

    Eric Dumazet
     

24 Apr, 2010

2 commits


23 Apr, 2010

1 commit

  • This patch adds IPv6 support for RFC5082 Generalized TTL Security Mechanism.

    Note to users of mapped addresses: the IPv6 and IPv4 socket options are separate.
    The server has to deal with both the IPv4 and IPv6 socket options,
    and the client has to handle the different options for each family.

    On client:
        int ttl = 255;
        getaddrinfo(argv[1], argv[2], &hint, &result);

        for (rp = result; rp != NULL; rp = rp->ai_next) {
            s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
            if (s < 0) continue;

            /* send with the maximum TTL / hop limit */
            if (rp->ai_family == AF_INET) {
                setsockopt(s, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl));
            } else if (rp->ai_family == AF_INET6) {
                setsockopt(s, IPPROTO_IPV6, IPV6_UNICAST_HOPS,
                           &ttl, sizeof(ttl));
            }

            if (connect(s, rp->ai_addr, rp->ai_addrlen) == 0) {
            ...

    On server:
        int minttl = 255 - maxhops;

        getaddrinfo(NULL, port, &hints, &result);
        for (rp = result; rp != NULL; rp = rp->ai_next) {
            s = socket(rp->ai_family, rp->ai_socktype, rp->ai_protocol);
            if (s < 0) continue;

            /* the IPv6 option is set in addition to the IPv4 one */
            if (rp->ai_family == AF_INET6)
                setsockopt(s, IPPROTO_IPV6, IPV6_MINHOPCOUNT,
                           &minttl, sizeof(minttl));
            setsockopt(s, IPPROTO_IP, IP_MINTTL, &minttl, sizeof(minttl));

            if (bind(s, rp->ai_addr, rp->ai_addrlen) == 0)
                break;
            ...

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
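
The receive-side test implied by the server example above can be sketched as a pure function. This is an illustration of the GTSM idea from RFC 5082 under the assumption that a packet is accepted when its hop limit is at least the configured minimum; the helper names are hypothetical.

```c
#include <stdbool.h>

/* Minimum acceptable hop limit when peers are at most maxhops routers away. */
static int sketch_min_hopcount(int maxhops)
{
    return 255 - maxhops;
}

/* A packet sent with hop limit 255 loses one per router it crosses. */
static bool sketch_gtsm_accept(int received_hop_limit, int min_hopcount)
{
    return received_hop_limit >= min_hopcount;
}
```

With maxhops set to 1, a packet arriving with hop limit 254 or 255 passes, while one arriving with 253 is rejected because it must have crossed at least two routers.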
     

13 Apr, 2010

1 commit

  • With latest CONFIG_PROVE_RCU stuff, I felt more comfortable to make this
    work.

    sk->sk_dst_cache is currently protected by a rwlock (sk_dst_lock)

    This rwlock is readlocked for a very small amount of time, and dst
    entries are already freed after RCU grace period. This calls for RCU
    again :)

    This patch converts sk_dst_lock to a spinlock, and uses RCU for readers.

    __sk_dst_get() is supposed to be called with rcu_read_lock() held or with
    the socket locked by the user, so use the appropriate rcu_dereference_check()
    condition (rcu_read_lock_held() || sock_owned_by_user(sk))

    This patch avoids two atomic ops per tx packet on UDP connected sockets,
    for example, and permits sk_dst_lock to be much less dirtied.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

27 Oct, 2009

1 commit


20 Oct, 2009

2 commits

  • Use symbols instead of magic constants while checking PMTU discovery
    setsockopt.

    Remove redundant test in ip_rt_frag_needed() (done by caller).

    Signed-off-by: John Dykstra
    Signed-off-by: David S. Miller

    John Dykstra
     
  • ipv4/ipv6 setsockopt(IP_MULTICAST_IF) have dubious __dev_get_by_index() calls.

    This function should be called only with RTNL or dev_base_lock held, or reader
    could see a corrupt hash chain and eventually enter an endless loop.

    Fix is to call dev_get_by_index()/dev_put().

    If this happens to be performance critical, we could define a new dev_exist_by_index()
    function to avoid touching dev refcount.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Oct, 2009

1 commit

  • In order to have better cache layouts of struct sock (separate zones
    for rx/tx paths), we need this preliminary patch.

    The goal is to transfer the fields used at lookup time into the first
    read-mostly cache line (inside struct sock_common) and move sk_refcnt
    to a separate cache line (only written by the rx path).

    This patch adds inet_ prefix to daddr, rcv_saddr, dport, num, saddr,
    sport and id fields. This allows a future patch to define these
    fields as macros, like sk_refcnt, without name clashes.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
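
Why the inet_ prefix matters for a later macro-based layout can be shown with a small userspace sketch. The struct names and the macro below are illustrative assumptions, not the kernel definitions; the point is only that a prefixed field name can become a macro without rewriting unrelated identifiers such as a local variable called daddr.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the shared "common" part of a socket. */
struct sketch_sock_common {
    uint32_t skc_daddr;
};

struct sketch_sock {
    struct sketch_sock_common sk_common;
};

struct sketch_inet_sock {
    struct sketch_sock sk;
};

/* The prefixed name is a macro that reaches into the common part. */
#define inet_daddr sk.sk_common.skc_daddr

static uint32_t sketch_peer_addr(const struct sketch_inet_sock *inet)
{
    /* An unprefixed macro named "daddr" would have mangled this local. */
    uint32_t daddr = inet->inet_daddr;

    return daddr;
}
```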
     

07 Oct, 2009

1 commit

  • Atis Elsts wrote:
    > Not sure if there is need to fill the mark from skb in tunnel xmit functions. In any case, it's not done for GRE or IPIP tunnels at the moment.

    Ok, I'll just drop that part, I'm not sure what should be done in this case.

    > Also, in this patch you are doing that for SIT (v6-in-v4) tunnels only, and not doing it for v4-in-v6 or v6-in-v6 tunnels. Any reason for that?

    I just sent that patch out too quickly, here's a better one with the updates.

    Add support for IPv6 route lookups using sk_mark.

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
     

01 Oct, 2009

1 commit

  • This provides safety against negative optlen at the type
    level instead of depending upon (sometimes non-trivial)
    checks against this sprinkled all over the place, in
    each and every implementation.

    Based upon work done by Arjan van de Ven and feedback
    from Linus Torvalds.

    Signed-off-by: David S. Miller

    David S. Miller
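
The difference between the two approaches can be sketched with a pair of hypothetical copy helpers. With a signed length, every implementation has to remember the explicit negative check; with an unsigned length, a "negative" value simply becomes very large and is caught by the ordinary upper-bound check.

```c
#include <errno.h>
#include <string.h>

/* Signed length: forgetting the first check would let -1 through. */
static int sketch_copy_opt_signed(void *dst, size_t dst_len,
                                  const void *src, int optlen)
{
    if (optlen < 0)
        return -EINVAL;
    if ((size_t)optlen > dst_len)
        return -EINVAL;
    memcpy(dst, src, (size_t)optlen);
    return 0;
}

/* Unsigned length: one bound check covers both cases. */
static int sketch_copy_opt_unsigned(void *dst, size_t dst_len,
                                    const void *src, unsigned int optlen)
{
    if (optlen > dst_len)
        return -EINVAL;
    memcpy(dst, src, optlen);
    return 0;
}
```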
     

14 Aug, 2009

2 commits

  • This patch addresses:
    * assigning -1 to np->tclass as it is currently done is not very meaningful,
    since it turns into 0xff;
    * RFC 3542, 6.5 allows -1 for clearing the sticky IPV6_TCLASS option
    and specifies -1 to mean "use kernel default":
    - RFC 2460, 7. requires that the default traffic class must be zero for
    all 8 bits,
    - this is consistent with RFC 2474, 4.1 which recommends a default PHB of 0,
    in combination with a value of the ECN field of "non-ECT" (RFC 3168, 5.).

    This patch changes the meaning of -1 from assigning 255 to meaning the
    RFC 2460 default, which at the same time satisfies clearing the sticky
    TCLASS option as per RFC 3542, 6.5.

    (When passing -1 as ancillary data, the fallback remains np->tclass, which
    has either been set via socket options, or contains the default value.)

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
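
The truncation described in the commit above, and the changed meaning of -1, can be reproduced in a short sketch. The struct and function names are hypothetical; the 8-bit field mirrors the message's statement that -1 "turns into 0xff".

```c
#include <errno.h>
#include <stdint.h>

struct sketch_tc {
    uint8_t tclass; /* 8-bit traffic class */
};

/* Old behaviour: -1 is silently truncated to 0xff. */
static void sketch_set_tclass_old(struct sketch_tc *np, int val)
{
    np->tclass = (uint8_t)val;
}

/* New behaviour: -1 means "use the default", which RFC 2460 defines as 0. */
static int sketch_set_tclass(struct sketch_tc *np, int val)
{
    if (val < -1 || val > 0xff)
        return -EINVAL;
    if (val == -1)
        val = 0;
    np->tclass = (uint8_t)val;
    return 0;
}
```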
     
  • This replaces assignments of the type "int on LHS" = "u8 on RHS" with
    simpler code. The LHS can express all of the unsigned right hand side
    values, hence the assigned value can not be negative.

    Signed-off-by: Gerrit Renker
    Signed-off-by: David S. Miller

    Gerrit Renker
     

14 Apr, 2009

1 commit

  • After switch (rthdr->type) {...}, the check below is completely useless:
    if the type is 2, then hdrlen must be 2 and segments_left must be 1, so the
    check is clearly redundant; if the type is not 2, we goto sticky_done, so the
    check is useless too.

    Signed-off-by: Yang Hongyang
    Reviewed-by: Shan Wei
    Signed-off-by: David S. Miller

    Yang Hongyang
     

25 Feb, 2009

1 commit


05 Jan, 2009

1 commit


16 Dec, 2008

2 commits

  • When getting the receiving interface index while no message has been
    received, the value set with setsockopt() should be returned.

    RFC 3542:
    Issuing getsockopt() for the above options will return the sticky
    option value i.e., the value set with setsockopt(). If no sticky
    option value has been set getsockopt() will return the following
    values:

    - For the IPV6_PKTINFO option, it will return an in6_pktinfo
    structure with ipi6_addr being in6addr_any and ipi6_ifindex being
    zero.

    Signed-off-by: Yang Hongyang
    Signed-off-by: David S. Miller

    Yang Hongyang
     
  • There are three reasons for me to add this support:
    1. When no interface is specified in an IPV6_PKTINFO ancillary data
    item, the interface specified in an IPV6_PKTINFO sticky option
    is used.

    RFC3542:
    6.7. Summary of Outgoing Interface Selection

    This document and [RFC-3493] specify various methods that affect the
    selection of the packet's outgoing interface. This subsection
    summarizes the ordering among those in order to ensure deterministic
    behavior.

    For a given outgoing packet on a given socket, the outgoing interface
    is determined in the following order:

    1. if an interface is specified in an IPV6_PKTINFO ancillary data
    item, the interface is used.

    2. otherwise, if an interface is specified in an IPV6_PKTINFO sticky
    option, the interface is used.

    2. When no IPV6_PKTINFO ancillary data is received, getsockopt() should
    return the sticky option value that was set with setsockopt().

    RFC 3542:
    Issuing getsockopt() for the above options will return the sticky
    option value i.e., the value set with setsockopt(). If no sticky
    option value has been set getsockopt() will return the following
    values:

    3. Make the setsockopt implementation POSIX compliant.

    Signed-off-by: Yang Hongyang
    Signed-off-by: David S. Miller

    Yang Hongyang
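
The first two steps of the RFC 3542 ordering quoted above can be written as a small selection function. This is a sketch only: the function name is hypothetical, an index of 0 is assumed to mean "no interface specified", and later steps of the RFC's ordering are not modelled.

```c
/*
 * Pick the outgoing interface index.
 * cmsg_ifindex:   from an IPV6_PKTINFO ancillary data item, 0 if absent
 * sticky_ifindex: from an IPV6_PKTINFO sticky option, 0 if absent
 * Returns 0 when neither source specifies an interface.
 */
static int sketch_select_oif(int cmsg_ifindex, int sticky_ifindex)
{
    if (cmsg_ifindex != 0)
        return cmsg_ifindex;   /* step 1: ancillary data wins */
    return sticky_ifindex;     /* step 2: fall back to the sticky option */
}
```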
     

13 Nov, 2008

1 commit

  • This patch fixes two bugs:

    1. setsockopt() of anything but a Type 2 routing header should return
    EINVAL instead of EPERM. Noticed by Shan Wei
    (shanwei@cn.fujitsu.com).

    2. setsockopt()/sendmsg() of a Type 2 routing header with invalid
    length or segments should return EINVAL. These values are statically
    fixed in RFC 3775, unlike the variable Type 0 was.

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
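
The validation rules from the two fixes above can be sketched as one check. The function name is hypothetical; the fixed values (hdrlen 2, segments_left 1) are the ones an earlier commit message on this page gives for a Type 2 routing header.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_RTHDR_TYPE_2 2

/* Returns 0 for an acceptable routing header, -EINVAL otherwise. */
static int sketch_check_rthdr(uint8_t type, uint8_t hdrlen,
                              uint8_t segments_left)
{
    /* Anything but Type 2 is invalid input, not a permission problem. */
    if (type != SKETCH_RTHDR_TYPE_2)
        return -EINVAL;

    /* Type 2 has a fixed length and exactly one segment left. */
    if (hdrlen != 2 || segments_left != 1)
        return -EINVAL;

    return 0;
}
```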
     

18 Aug, 2008

1 commit

  • When getting the receiving interface index while no message has been
    received, the index of the device the socket is bound to should be returned.

    RFC 3542:
    Issuing getsockopt() for the above options will return the sticky
    option value i.e., the value set with setsockopt(). If no sticky
    option value has been set getsockopt() will return the following
    values:

    - For the IPV6_PKTINFO option, it will return an in6_pktinfo
    structure with ipi6_addr being in6addr_any and ipi6_ifindex being
    zero.

    Signed-off-by: Yang Hongyang
    Signed-off-by: David S. Miller

    Yang Hongyang
     

04 Aug, 2008

1 commit


20 Jul, 2008

1 commit


19 Jul, 2008

1 commit


28 Jun, 2008

1 commit


20 Jun, 2008

1 commit

  • Remove the sticky Hop-by-Hop options header by calling setsockopt()
    for IPV6_HOPOPTS with a zero option length, per RFC3542.

    The Routing header and Destination options header are removed in the
    same way as the Hop-by-Hop options header.

    Signed-off-by: Shan Wei
    Acked-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    Shan Wei
     

14 Jun, 2008

1 commit


12 Jun, 2008

3 commits