19 Jul, 2007

1 commit


11 Jul, 2007

1 commit


31 May, 2007

2 commits


26 Apr, 2007

8 commits

  • Add a packet socket option to allow the orig_dev index to be returned
    to userspace when passing traffic through a decapsulated device, such
    as the bonding driver.

    This is very useful for layer 2 traffic being able to report which
    physical device actually received the traffic, instead of having the
    encapsulating device hide that information.

    The new option is called PACKET_ORIGDEV.

    Signed-off-by: Peter P. Waskiewicz Jr.
    Signed-off-by: David S. Miller

    Peter P. Waskiewicz Jr
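
From userspace, the option described above would be switched on with setsockopt(). The following is a minimal sketch, assuming a packet socket that is already open; the helper name enable_origdev() is hypothetical, while SOL_PACKET and PACKET_ORIGDEV come from the system headers.

```c
#include <sys/socket.h>
#include <linux/if_packet.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263  /* fallback for C libraries that do not define it */
#endif

/*
 * Hypothetical helper: ask the kernel to report the original receiving
 * device for traffic delivered on this packet socket.
 * Returns 0 on success, -1 with errno set on failure.
 */
int enable_origdev(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_PACKET, PACKET_ORIGDEV, &on, sizeof(on));
}
```

With the option enabled, the sll_ifindex field of the sockaddr_ll filled in by recvfrom() should then name the physical device rather than the encapsulating one.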
     
  • So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
    on 64bit architectures, allowing us to combine the 4 bytes hole left by the
    layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
    64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
    :-)

    Many calculations that previously required that skb->{transport,network,
    mac}_header be first converted to a pointer now can be done directly, being
    meaningful as offsets or pointers.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
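
The idea of storing layer headers as offsets from skb->head can be illustrated outside the kernel. This is only a sketch with a toy structure: struct toy_skb and every toy_* helper are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for struct sk_buff: headers are 32-bit offsets from head. */
struct toy_skb {
    unsigned char *head;        /* start of the buffer */
    unsigned char *data;        /* start of the current payload */
    uint32_t transport_header;  /* offset from head, not a pointer */
    uint32_t network_header;    /* offset from head, not a pointer */
};

/* Convert the stored offset back to a pointer when one is really needed. */
static inline unsigned char *toy_network_header(const struct toy_skb *skb)
{
    return skb->head + skb->network_header;
}

/* Equivalent of the open coded "network header = data" assignment. */
static inline void toy_reset_network_header(struct toy_skb *skb)
{
    skb->network_header = (uint32_t)(skb->data - skb->head);
}

/* Equivalent of the "network header - data" sequence. */
static inline int toy_network_offset(const struct toy_skb *skb)
{
    return (int)(toy_network_header(skb) - skb->data);
}

/* Header length computed directly from offsets, with no pointer conversion. */
static inline uint32_t toy_network_header_len(const struct toy_skb *skb)
{
    return skb->transport_header - skb->network_header;
}
```

Each offset field takes 4 bytes regardless of pointer width, which is where the saving on 64bit architectures comes from.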
     
  • Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
    skb->mac to skb->mac_header, to match the names of the associated helpers
    (skb[_[re]set]_{transport,network,mac}_header).

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the quite common 'skb->nh.raw - skb->data' sequence.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
    later turn skb->nh.raw into an offset, reducing the size of struct sk_buff in
    64bit land while possibly keeping it as a pointer on 32bit.

    This one touches just the most simple case, next will handle the slightly more
    "complex" cases.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the places where we need a pointer to the mac header, it is still legal to
    touch skb->mac.raw directly if just adding to, subtracting from or setting it
    to another layer header.

    This one also converts some more cases to skb_reset_mac_header() that my
    regex missed as it had no spaces before nor after '=', ugh.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Now that network timestamps use the ktime_t infrastructure, we can add a
    new ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'.
    User programs can thus access nanosecond resolution.

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
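
A user program might query the new command roughly as follows. This is a sketch, assuming a socket that has already received at least one packet; the wrapper name last_packet_time_ns() is hypothetical.

```c
#include <sys/ioctl.h>
#include <time.h>
#include <linux/sockios.h>

/*
 * Hypothetical wrapper: fetch the timestamp of the last packet passed
 * to the user on this socket, with nanosecond resolution.
 * Returns 0 on success, -1 with errno set on failure.
 */
int last_packet_time_ns(int fd, struct timespec *ts)
{
    return ioctl(fd, SIOCGSTAMPNS, ts);
}
```

On success, ts->tv_nsec carries the sub-second part in nanoseconds, where the older SIOCGSTAMP command only offers microseconds through 'struct timeval'.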
     
  • We currently use a special structure (struct skb_timeval) and plain
    'struct timeval' to store packet timestamps in sk_buffs and struct
    sock.

    This has some drawbacks:
    - Fixed resolution of one microsecond.
    - Wasted space on 64bit platforms, where sizeof(struct timeval)=16.

    I suggest using ktime_t that is a nice abstraction of high resolution
    time services, currently capable of nanosecond resolution.

    As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits
    an 8 byte shrink of this structure on 64bit architectures. Some other
    structures also benefit from this size reduction (struct ipq in
    ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)

    Once this ktime infrastructure is adopted, we can more easily provide
    nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or
    SO_TIMESTAMPNS/SCM_TIMESTAMPNS)

    Note : this patch includes a bug correction in
    compat_sock_get_timestamp() where a "err = 0;" was missing (so this
    syscall returned -ENOENT instead of 0)

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    CC: John find
    Signed-off-by: David S. Miller

    Eric Dumazet
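
The size and resolution argument can be sketched in userspace. Here ktime_t is modelled as a plain signed 64-bit nanosecond count, which is an assumption made for illustration; toy_ktime_t and the conversion helpers are hypothetical names, and the helpers only handle non-negative values.

```c
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Stand-in for ktime_t: nanoseconds in one 8-byte scalar (assumption). */
typedef int64_t toy_ktime_t;

#define TOY_NSEC_PER_SEC 1000000000LL

/* Microsecond view: the nanosecond remainder is truncated. */
static inline struct timeval toy_ktime_to_timeval(toy_ktime_t kt)
{
    struct timeval tv;

    tv.tv_sec = (time_t)(kt / TOY_NSEC_PER_SEC);
    tv.tv_usec = (suseconds_t)((kt % TOY_NSEC_PER_SEC) / 1000);
    return tv;
}

/* Nanosecond view: nothing is lost. */
static inline struct timespec toy_ktime_to_timespec(toy_ktime_t kt)
{
    struct timespec ts;

    ts.tv_sec = (time_t)(kt / TOY_NSEC_PER_SEC);
    ts.tv_nsec = (long)(kt % TOY_NSEC_PER_SEC);
    return ts;
}
```

A single 8-byte value can therefore serve both the existing microsecond interfaces and a future nanosecond one.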
     

27 Feb, 2007

1 commit


15 Feb, 2007

1 commit

  • After Al Viro (finally) succeeded in removing the sched.h #include in module.h
    recently, it makes sense again to remove other superfluous sched.h includes.
    There are quite a lot of files which include it but don't actually need
    anything defined in there. Presumably these includes were once needed for
    macros that used to live in sched.h, but moved to other header files in the
    course of cleaning it up.

    To ease the pain, this time I did not fiddle with any header files and only
    removed #includes from .c-files, which tend to cause less trouble.

    Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
    arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
    allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
    configs in arch/arm/configs on arm. I also checked that no new warnings were
    introduced by the patch (actually, some warnings are removed that were emitted
    by unnecessarily included header files).

    Signed-off-by: Tim Schmielau
    Acked-by: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tim Schmielau
     

13 Feb, 2007

1 commit

  • Many struct file_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
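
The pattern can be shown with a toy operations table. This is a sketch only: struct toy_file_ops and its callbacks are hypothetical and much smaller than the kernel's struct file_operations.

```c
#include <stddef.h>

/* Toy stand-in for an operations table made of function pointers. */
struct toy_file_ops {
    long (*read)(char *buf, size_t len);
    int (*release)(void);
};

/* Fill the buffer with a marker byte and report how much was "read". */
static long toy_read(char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = 'x';
    return (long)len;
}

static int toy_release(void)
{
    return 0;
}

/*
 * Declared const, the table can be placed in read-only data, and a
 * statement such as "toy_fops.read = NULL;" is rejected at compile time.
 */
static const struct toy_file_ops toy_fops = {
    .read = toy_read,
    .release = toy_release,
};
```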
     

11 Feb, 2007

1 commit


09 Feb, 2007

2 commits

  • Both aux data and sockaddr try to use the same buffer, which
    obviously doesn't work. We just happen to have 4 bytes free in
    the skb->cb if you take away the maximum length of sockaddr_ll.
    That's just enough to store the one piece of info from aux data
    that we can't generate at recvmsg(2) time.

    This is what the following patch does.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch is needed to enable ISC's DHCP server (and probably other
    DHCP servers/clients using AF_PACKET) to serve another client on the
    same Xen host.

    The problem is that packets between different domains on the same
    Xen host only have partial checksums. Unfortunately this piece of
    information is not passed along in AF_PACKET unless you're using
    the mmap interface. Since dhcpd doesn't support packet-mmap, UDP
    packets from the same host come out with apparently bogus checksums.

    This patch adds a mechanism for AF_PACKET recvmsg(2) to return the
    status along with the packet. It does so by adding a new cmsg that
    contains this information along with some other relevant data such
    as the original packet length.

    I didn't include the time stamp information since there is already
    a cmsg for that.

    This patch also changes the mmap code to set the CSUMNOTREADY flag
    on all packets instead of just outgoing packets on cooked sockets.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
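
A receiver could pick the status out of the control messages returned by recvmsg(2) along these lines. This is a sketch, assuming the cmsg is the one exposed by linux/if_packet.h as PACKET_AUXDATA carrying a struct tpacket_auxdata (names the message above does not spell out), and that the socket enabled it beforehand with setsockopt(); the helpers get_auxdata() and csum_not_ready() are hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263  /* fallback for C libraries that do not define it */
#endif

/*
 * Walk the control messages of a received msghdr and copy out the
 * packet auxiliary data. Returns 1 if found, 0 otherwise.
 */
int get_auxdata(struct msghdr *msg, struct tpacket_auxdata *out)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        if (c->cmsg_level == SOL_PACKET &&
            c->cmsg_type == PACKET_AUXDATA &&
            c->cmsg_len >= CMSG_LEN(sizeof(*out))) {
            memcpy(out, CMSG_DATA(c), sizeof(*out));
            return 1;
        }
    }
    return 0;
}

/* Nonzero when the packet's checksum has not been filled in yet. */
int csum_not_ready(const struct tpacket_auxdata *aux)
{
    return (aux->tp_status & TP_STATUS_CSUMNOTREADY) != 0;
}
```

A DHCP server in the situation described above could skip UDP checksum validation whenever csum_not_ready() reports a partial checksum.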
     

26 Jan, 2007

1 commit


25 Jan, 2007

1 commit

  • This fixes a bug introduced by:

    commit fda9ef5d679b07c9d9097aaf6ef7f069d794a8f9
    Author: Dmitry Mishin
    Date: Thu Aug 31 15:28:39 2006 -0700

    [NET]: Fix sk->sk_filter field access

    sk_run_filter() returns either 0 or an unsigned 32-bit
    length which says how much of the packet to retain.
    If that 32-bit unsigned integer is larger than the packet,
    this is fine; we just leave the packet unchanged.

    The above commit caused all filter return values which
    were negative when interpreted as a signed integer to
    indicate a packet drop, which is wrong.

    Based upon a report and initial patch by Raivis Bucis.

    Signed-off-by: David S. Miller

    David S. Miller
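
The difference between the two interpretations of the filter result can be sketched as follows. Both functions are hypothetical illustrations of the behaviour described above, not the kernel code; the cast to int assumes the usual two's complement conversion.

```c
#include <stdint.h>

/*
 * Intended behaviour: 0 means drop, anything else is the number of
 * bytes to keep, capped at the packet length.
 */
static inline unsigned int keep_len(uint32_t res, unsigned int pkt_len)
{
    if (res == 0)
        return 0;
    return res < pkt_len ? res : pkt_len;
}

/*
 * Mistaken variant: the result is viewed as a signed int, so large
 * unsigned values look negative and the packet is dropped.
 */
static inline unsigned int keep_len_signed_bug(uint32_t res,
                                               unsigned int pkt_len)
{
    int r = (int)res;

    if (r <= 0)
        return 0;
    return (unsigned int)r < pkt_len ? (unsigned int)r : pkt_len;
}
```

A filter that returns 0xFFFFFFFF to mean "keep everything" passes through keep_len() untouched but is dropped by the signed variant.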
     

07 Dec, 2006

1 commit

  • I believe all the below memory barriers only matter on SMP, so
    the smp_* variant of the barrier should be used.

    I'm wondering if the barrier in net/ipv4/inet_timewait_sock.c should be
    dropped entirely. schedule_work's implementation currently implies a
    memory barrier and I think sane semantics of schedule_work() should imply
    a memory barrier, as needed so the caller shouldn't have to worry.
    It's not quite obvious why the barrier in net/packet/af_packet.c is
    needed; maybe it should be implied through flush_dcache_page?

    Signed-off-by: Ralf Baechle
    Signed-off-by: David S. Miller

    Ralf Baechle
     

04 Dec, 2006

1 commit


03 Dec, 2006

1 commit

  • Weirdness: the third argument of socket() is net-endian
    here. Oh, well - it's documented in packet(7).

    Signed-off-by: Al Viro
    Signed-off-by: David S. Miller

    Al Viro
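
In userspace terms, this means the protocol argument has to go through htons() before it reaches socket(), as packet(7) documents. A minimal sketch, with the hypothetical helper names packet_protocol() and open_packet_socket():

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>
#include <linux/if_ether.h>

/* Convert an ETH_P_* value to the network byte order socket() expects. */
static inline int packet_protocol(uint16_t eth_p)
{
    return htons(eth_p);
}

/*
 * Open a raw packet socket for the given ethertype. This needs
 * CAP_NET_RAW; returns the descriptor, or -1 with errno set.
 */
int open_packet_socket(uint16_t eth_p)
{
    return socket(AF_PACKET, SOCK_RAW, packet_protocol(eth_p));
}
```

A typical call would be open_packet_socket(ETH_P_ALL) to receive every protocol.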
     

23 Sep, 2006

2 commits

  • Function sk_filter() is called from tcp_v{4,6}_rcv() functions with arg
    needlock = 0, while socket is not locked at that moment. In order to avoid
    this and similar issues in the future, use rcu for sk->sk_filter field read
    protection.

    Signed-off-by: Dmitry Mishin
    Signed-off-by: Alexey Kuznetsov
    Signed-off-by: Kirill Korotaev

    Dmitry Mishin
     
  • Replace CHECKSUM_HW by CHECKSUM_PARTIAL (for outgoing packets, whose
    checksum still needs to be completed) and CHECKSUM_COMPLETE (for
    incoming packets, device supplied full checksum).

    Patch originally from Herbert Xu, updated by myself for 2.6.18-rc3.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

18 Sep, 2006

1 commit


01 Jul, 2006

1 commit


24 Jan, 2006

1 commit


12 Jan, 2006

2 commits


04 Jan, 2006

2 commits

  • Currently all network protocols need to call dev_ioctl as the default
    fallback in their ioctl implementations. This patch adds a fallback
    to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
    This way all the protocol ioctl handlers can be simplified and we don't
    need to export dev_ioctl.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
     
  • I noticed that some of 'struct proto_ops' used in the kernel may share
    a cache line used by locks or other heavily modified data. (default
    linker alignment is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
    least)

    This patch makes sure a 'struct proto_ops' can be declared as const,
    so that all cpus can share all parts of it without false sharing.

    This is not mandatory : a driver can still use a read/write structure
    if it needs to (and eventually a __read_mostly)

    I made a global substitute to change all existing occurrences to make
    them const.

    This should reduce the possibility of false sharing on SMP, and
    speedup some socket system calls.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Dec, 2005

1 commit


04 Oct, 2005

1 commit

  • I've found the problem in general. It affects any 64-bit
    architecture. The problem occurs when you change the system time.

    Suppose that when you boot your system clock is forward by a day.
    This gets recorded down in skb_tv_base. You then wind the clock back
    by a day. From that point onwards the offset will be negative which
    essentially overflows the 32-bit variables they're stored in.

    In fact, why don't we just store the real time stamp in those 32-bit
    variables? After all, we're not going to overflow for quite a while
    yet.

    When we do overflow, we'll need a better solution of course.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
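
The arithmetic behind the problem can be reproduced with plain integers. This sketch assumes the offset is kept in an unsigned 32-bit field and added back to a 64-bit base, which is one reading of the description above; all toy_* names are hypothetical.

```c
#include <stdint.h>

/* Old scheme: keep only a 32-bit offset from a base recorded at boot. */
static inline uint32_t toy_store_offset(int64_t now_sec, int64_t base_sec)
{
    return (uint32_t)(now_sec - base_sec);  /* negative values wrap */
}

/* Reconstruct the time with 64-bit arithmetic, as on a 64-bit system. */
static inline int64_t toy_load_offset(uint32_t off, int64_t base_sec)
{
    return base_sec + (int64_t)off;
}

/* Proposed scheme: store the real seconds value in the 32-bit field. */
static inline uint32_t toy_store_abs(int64_t now_sec)
{
    return (uint32_t)now_sec;
}

static inline int64_t toy_load_abs(uint32_t stored)
{
    return (int64_t)stored;
}
```

If the clock is wound back one day after boot, the stored offset becomes 2^32 - 86400, so the reconstructed time lands 2^32 seconds in the future, while the absolute scheme round-trips correctly for as long as the seconds value fits in 32 bits.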
     

27 Sep, 2005

1 commit


21 Sep, 2005

1 commit

  • The convention is that longer addresses will simply extend
    the hardware address byte arrays at the end of sockaddr_ll and
    packet_mreq.

    In making this change a small information leak was also closed.
    The code only initializes the hardware address bytes that are
    used, but all of struct sockaddr_ll was copied to userspace.
    Now we just copy sockaddr_ll to the last byte of the hardware
    address used.

    For error checking larger structures than our internal
    maximums continue to be allowed but an error is signaled if we can
    not fit the hardware address into our internal structure.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
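
The "copy only up to the last used hardware address byte" rule amounts to a small length calculation. A sketch, with the hypothetical helper name sll_used_len(); it only illustrates the arithmetic and is not the kernel's code.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/*
 * Number of bytes of a sockaddr_ll that carry meaningful data for a
 * hardware address of the given length: the fixed fields in front of
 * sll_addr plus the address bytes actually in use.
 */
static inline size_t sll_used_len(unsigned char halen)
{
    return offsetof(struct sockaddr_ll, sll_addr) + halen;
}
```

For a 6-byte Ethernet address the result is smaller than sizeof(struct sockaddr_ll), so the uninitialized tail is never copied; for an address longer than the 8 bytes of sll_addr, the result exceeds the structure size, which matches the convention that longer addresses extend the array at the end.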
     

07 Sep, 2005

1 commit


30 Aug, 2005

2 commits


13 Jul, 2005

1 commit

  • Revert the nf_reset change that caused so much trouble, drop conntrack
    references manually before packets are queued to packet sockets.

    Signed-off-by: Phil Oester
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Phil Oester