26 Mar, 2008

2 commits


24 Mar, 2008

1 commit


29 Jan, 2008

5 commits


13 Nov, 2007

1 commit


11 Nov, 2007

2 commits


01 Nov, 2007

1 commit

  • Finally, the zero_it argument can be completely removed from
    the callers and from the function prototype.

    Besides, fix the checkpatch.pl warnings about using the
    assignments inside if-s.

    This patch is rather big, and it is a part of the previous one.
    I split it in the hope of making the patches more readable. Hope
    this particular split helped.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

19 Oct, 2007

1 commit


11 Oct, 2007

9 commits

  • Since hardware header operations are part of the protocol class
    not the device instance, make them into a separate object and
    save memory.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Wrap the hard_header_parse function to simplify next step of
    header_ops conversion.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Add inline for common usage of hardware header creation, and
    fix a bug in IPv6 mcast where a negative return was assumed to be
    an errno. A negative return from hard_header means that not enough
    space was available (i.e. -N bytes).

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This patch makes most of the generic device layer network
    namespace safe. This patch makes dev_base_head a
    network namespace variable, and then it picks up
    a few associated variables. The functions:
    dev_getbyhwaddr
    dev_getfirsthwbytype
    dev_get_by_flags
    dev_get_by_name
    __dev_get_by_name
    dev_get_by_index
    __dev_get_by_index
    dev_ioctl
    dev_ethtool
    dev_load
    wireless_process_ioctl

    were modified to take a network namespace argument, and
    deal with it.

    vlan_ioctl_set and brioctl_set were modified so their
    hooks will receive a network namespace argument.

    So basically anything in the core of the network stack that was
    affected by the change of dev_base was modified to handle
    multiple network namespaces. The rest of the network stack was
    simply modified to explicitly use &init_net, the initial network
    namespace. This can be fixed when those components of the network
    stack are modified to handle multiple network namespaces.

    For now the ifindex generator is left global.

    Fundamentally, ifindex numbers are per namespace, or else
    we will have corner case problems with migration when
    we get that far.

    At the same time there are assumptions in the network stack
    that the ifindex of a network device won't change. Making
    the ifindex number global seems a good compromise until
    the network stack can cope with ifindex changes when
    you change namespaces, and the like.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Every user of the network device notifiers is either a protocol
    stack or a pseudo device. If a protocol stack that does not have
    support for multiple network namespaces receives an event for a
    device that is not in the initial network namespace it quite possibly
    can get confused and do the wrong thing.

    To avoid problems until all of the protocol stacks are converted
    this patch modifies all netdev event handlers to ignore events on
    devices that are not in the initial network namespace.

    As the rest of the code is made network namespace aware these
    checks can be removed.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
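
The filtering described in this commit can be illustrated with a toy userspace model. This is only a sketch: every type, constant, and function name below (toy_net, toy_net_device, toy_init_net, toy_device_event) is hypothetical and stands in for the kernel's own definitions, which are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not kernel definitions. */
struct toy_net { int id; };
struct toy_net_device { const char *name; struct toy_net *net; };

/* Stand-in for the initial network namespace. */
static struct toy_net toy_init_net = { 0 };

enum { TOY_NOTIFY_DONE = 0, TOY_NOTIFY_OK = 1 };

/* Event handler of a protocol that is not yet namespace aware. */
static int toy_device_event(struct toy_net_device *dev, unsigned long event)
{
    assert(dev != NULL);
    (void)event;

    /* Ignore events for devices outside the initial namespace. */
    if (dev->net != &toy_init_net)
        return TOY_NOTIFY_DONE;

    /* ... protocol-specific handling would go here ... */
    return TOY_NOTIFY_OK;
}
```

Once a protocol has been converted, the early return is the only part that would need to be dropped.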
     
  • This patch modifies every packet receive function
    registered with dev_add_pack() to drop packets if they
    are not from the initial network namespace.

    This should ensure that the various network stacks do
    not receive packets in anything but the initial network
    namespace until the code has been converted and is ready
    for them.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch passes in the namespace a new socket should be created in
    and has the socket code do the appropriate reference counting. By
    virtue of this all socket create methods are touched. In addition
    the socket create methods are modified so that they will fail if
    you attempt to create a socket in a non-default network namespace.

    Failing if we attempt to create a socket outside of the default
    network namespace ensures that as we incrementally make the network stack
    network namespace aware we will not export functionality that someone
    has not audited and made certain is network namespace safe.
    This allows us to partially enable network namespaces before all of the
    exotic protocols are supported.

    Any protocol layers I have missed will fail to compile because I now
    pass an extra parameter into the socket creation code.

    [ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch makes /proc/net per network namespace. It modifies the global
    variables proc_net and proc_net_stat to be per network namespace.
    The proc_net file helpers are modified to take a network namespace argument,
    and all of their callers are fixed to pass &init_net for that argument.
    This ensures that all of the /proc/net files are only visible and
    usable in the initial network namespace until the code behind them
    has been updated to handle multiple network namespaces.

    Making /proc/net per namespace is necessary as at least some files
    in /proc/net depend upon the set of network devices which is per
    network namespace, and even more files in /proc/net have contents
    that are relevant to a single network namespace.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Andi mentioned he did something like this already, but never submitted
    it.

    The dhcp client application uses AF_PACKET with a packet filter to
    receive data. The application doesn't even use timestamps, but because
    the AF_PACKET API has timestamps, they get turned on globally which
    causes an expensive time of day lookup for every packet received on
    any system that uses the standard DHCP client.

    The fix is to not enable the timestamp (but use it if available).
    This causes the time lookup to only occur on those packets that are
    destined for the AF_PACKET socket. The timestamping occurs after
    packet filtering so packets dropped by filtering do not cause a
    clock call.

    The one downside of this is a few microseconds of additional delay
    between the normal timestamping location (netif_rx) and the receive
    callback in AF_PACKET. But since the offset is fairly consistent it
    should not upset applications that really do want to use timestamps,
    like wireshark.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

19 Jul, 2007

1 commit


11 Jul, 2007

1 commit


31 May, 2007

2 commits


26 Apr, 2007

8 commits

  • Add a packet socket option to allow the orig_dev index to be returned
    to userspace when passing traffic through a decapsulated device, such
    as the bonding driver.

    This is very useful for layer 2 traffic being able to report which
    physical device actually received the traffic, instead of having the
    encapsulating device hide that information.

    The new option is called PACKET_ORIGDEV.

    Signed-off-by: Peter P. Waskiewicz Jr.
    Signed-off-by: David S. Miller

    Peter P. Waskiewicz Jr
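
A minimal sketch of how userspace might turn the option on, assuming a packet socket has already been opened elsewhere. The helper name enable_packet_origdev is hypothetical; setsockopt(), SOL_PACKET, and PACKET_ORIGDEV come from the standard headers.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263  /* fallback for C libraries that do not define it */
#endif

/* Ask the kernel to report the original receiving device on this
 * packet socket. Returns 0 on success, -1 with errno set otherwise. */
static int enable_packet_origdev(int fd)
{
    int one = 1;

    return setsockopt(fd, SOL_PACKET, PACKET_ORIGDEV, &one, sizeof(one));
}
```

With the option set, the sll_ifindex field of the struct sockaddr_ll returned by recvfrom() is expected to name the physical device rather than the encapsulating one. Opening the packet socket itself normally requires CAP_NET_RAW.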
     
  • So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
    on 64bit architectures, allowing us to combine the 4 bytes hole left by the
    layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
    64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
    :-)

    Many calculations that previously required that skb->{transport,network,
    mac}_header be first converted to a pointer now can be done directly, being
    meaningful as offsets or pointers.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
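
The offset idea can be sketched in userspace as follows. This is a toy structure, not the kernel's struct sk_buff; toy_skb and the toy_* helpers are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a packet buffer descriptor. */
struct toy_skb {
    unsigned char *head;        /* start of the allocated buffer */
    uint32_t transport_header;  /* offsets from head, 4 bytes each */
    uint32_t network_header;
    uint32_t mac_header;
};

/* Record the network header position as an offset from head. */
static void toy_set_network_header(struct toy_skb *skb, unsigned char *p)
{
    assert(p >= skb->head);
    skb->network_header = (uint32_t)(p - skb->head);
}

/* Convert the stored offset back into a pointer when one is needed. */
static unsigned char *toy_network_header(const struct toy_skb *skb)
{
    return skb->head + skb->network_header;
}

/* Network header length computed directly on the offsets,
 * without converting either of them to a pointer first. */
static uint32_t toy_network_header_len(const struct toy_skb *skb)
{
    return skb->transport_header - skb->network_header;
}
```

Each offset field occupies 4 bytes regardless of pointer width, which is where the saving on 64-bit architectures comes from.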
     
  • Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
    skb->mac to skb->mac_header, to match the names of the associated helpers
    (skb[_[re]set]_{transport,network,mac}_header).

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the quite common 'skb->nh.raw - skb->data' sequence.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
    later turn skb->nh.raw into an offset, reducing the size of struct sk_buff in
    64bit land while possibly keeping it as a pointer on 32bit.

    This one touches just the most simple case, next will handle the slightly more
    "complex" cases.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the places where we need a pointer to the mac header, it is still legal to
    touch skb->mac.raw directly if just adding to, subtracting from or setting it
    to another layer header.

    This one also converts some more cases to skb_reset_mac_header() that my
    regex missed as it had no spaces before nor after '=', ugh.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Now that network timestamps use the ktime_t infrastructure, we can add
    a new ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'.
    User programs can thus access nanosecond resolution.

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
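
A minimal sketch of calling the new ioctl from userspace. The wrapper name last_packet_stamp_ns is hypothetical, and the socket is assumed to have been created and to have received traffic elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#include <sys/ioctl.h>
#include <linux/sockios.h>

/* Fetch the timestamp of the last packet delivered on fd with
 * nanosecond resolution. Returns 0 on success, -1 with errno set
 * otherwise (typically ENOENT when no packet has been seen yet). */
static int last_packet_stamp_ns(int fd, struct timespec *ts)
{
    assert(ts != NULL);
    return ioctl(fd, SIOCGSTAMPNS, ts);
}
```

On success, ts->tv_nsec carries the sub-second part that SIOCGSTAMP could only report in microseconds.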
     
  • We currently use a special structure (struct skb_timeval) and plain
    'struct timeval' to store packet timestamps in sk_buffs and struct
    sock.

    This has some drawbacks:
    - Fixed resolution of one microsecond.
    - Waste of space on 64bit platforms where sizeof(struct timeval)=16

    I suggest using ktime_t that is a nice abstraction of high resolution
    time services, currently capable of nanosecond resolution.

    As sizeof(ktime_t) is 8 bytes, using ktime_t in 'struct sock' permits
    an 8 byte shrink of this structure on 64bit architectures. Some other
    structures also benefit from this size reduction (struct ipq in
    ipv4/ip_fragment.c, struct frag_queue in ipv6/reassembly.c, ...)

    Once this ktime infrastructure is adopted, we can more easily provide
    nanosecond resolution on top of it. (ioctl SIOCGSTAMPNS and/or
    SO_TIMESTAMPNS/SCM_TIMESTAMPNS)

    Note : this patch includes a bug correction in
    compat_sock_get_timestamp() where a "err = 0;" was missing (so this
    syscall returned -ENOENT instead of 0)

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    CC: John find
    Signed-off-by: David S. Miller

    Eric Dumazet
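
The space and resolution argument can be sketched in userspace. The type toy_ktime_t below is a hypothetical stand-in (a signed 64-bit nanosecond count), not the kernel's ktime_t, and the conversions assume non-negative times.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical stand-in for ktime_t: nanoseconds in a signed 64-bit value. */
typedef int64_t toy_ktime_t;

#define TOY_NSEC_PER_SEC  1000000000LL
#define TOY_NSEC_PER_USEC 1000LL

/* Pack seconds and microseconds into a single 8-byte value. */
static toy_ktime_t toy_timeval_to_ktime(struct timeval tv)
{
    return (toy_ktime_t)tv.tv_sec * TOY_NSEC_PER_SEC +
           (toy_ktime_t)tv.tv_usec * TOY_NSEC_PER_USEC;
}

/* Unpack again; anything finer than a microsecond is truncated. */
static struct timeval toy_ktime_to_timeval(toy_ktime_t kt)
{
    struct timeval tv;

    assert(kt >= 0);
    tv.tv_sec = kt / TOY_NSEC_PER_SEC;
    tv.tv_usec = (kt % TOY_NSEC_PER_SEC) / TOY_NSEC_PER_USEC;
    return tv;
}
```

The 8-byte value holds nanoseconds, while struct timeval needs two fields (16 bytes on typical 64-bit platforms) to hold microseconds.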
     

27 Feb, 2007

1 commit


15 Feb, 2007

1 commit

  • After Al Viro (finally) succeeded in removing the sched.h #include in module.h
    recently, it makes sense again to remove other superfluous sched.h includes.
    There are quite a lot of files which include it but don't actually need
    anything defined in there. Presumably these includes were once needed for
    macros that used to live in sched.h, but moved to other header files in the
    course of cleaning it up.

    To ease the pain, this time I did not fiddle with any header files and only
    removed #includes from .c-files, which tend to cause less trouble.

    Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
    arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
    allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
    configs in arch/arm/configs on arm. I also checked that no new warnings were
    introduced by the patch (actually, some warnings are removed that were emitted
    by unnecessarily included header files).

    Signed-off-by: Tim Schmielau
    Acked-by: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tim Schmielau
     

13 Feb, 2007

1 commit

  • Many struct file_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
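
A small userspace sketch of the pattern, assuming a miniature operations table. The names toy_file_ops, toy_fops, and toy_call_open are hypothetical and are not the kernel's struct file_operations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of an operations table. */
struct toy_file_ops {
    int (*open)(void);
    int (*release)(void);
};

static int toy_open(void) { return 0; }
static int toy_release(void) { return 0; }

/* const allows the toolchain to place the table in a read-only
 * section and turns any assignment to its members into a compile error. */
static const struct toy_file_ops toy_fops = {
    .open = toy_open,
    .release = toy_release,
};

/* Callers only ever need a pointer to const. */
static int toy_call_open(const struct toy_file_ops *ops)
{
    return ops->open ? ops->open() : -1;
}
```

A statement such as `toy_fops.open = NULL;` would be rejected by the compiler, which is the compile-time check the commit message refers to.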
     

11 Feb, 2007

1 commit


09 Feb, 2007

2 commits

  • Both aux data and sockaddr try to use the same buffer, which
    obviously doesn't work. We just happen to have 4 bytes free in
    the skb->cb if you take away the maximum length of sockaddr_ll.
    That's just enough to store the one piece of info from aux data
    that we can't generate at recvmsg(2) time.

    This is what the following patch does.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
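
The "4 bytes free" arithmetic can be checked with a short userspace sketch. The two TOY_ constants are assumptions restating kernel-internal values of that era (a 48-byte skb->cb and a 32-byte maximum hardware address); they are not taken from the commit itself.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/if_packet.h>

/* Assumed kernel-internal values, restated here for illustration. */
#define TOY_SKB_CB_SIZE  48  /* assumed sizeof(skb->cb) */
#define TOY_MAX_ADDR_LEN 32  /* assumed longest hardware address */

/* Longest sockaddr_ll: the fixed fields plus a maximum-length address. */
static size_t toy_max_sockaddr_ll_len(void)
{
    return offsetof(struct sockaddr_ll, sll_addr) + TOY_MAX_ADDR_LEN;
}

/* Bytes of the control buffer left over for aux data. */
static size_t toy_cb_bytes_left(void)
{
    assert(toy_max_sockaddr_ll_len() <= TOY_SKB_CB_SIZE);
    return TOY_SKB_CB_SIZE - toy_max_sockaddr_ll_len();
}
```

Under these assumptions the fixed part of sockaddr_ll takes 12 bytes, the address up to 32, and 48 - 44 leaves the 4 bytes mentioned above.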
     
  • This patch is needed for ISC's DHCP server (and probably other
    DHCP servers/clients using AF_PACKET) to be able to serve another
    client on the same Xen host.

    The problem is that packets between different domains on the same
    Xen host only have partial checksums. Unfortunately this piece of
    information is not passed along in AF_PACKET unless you're using
    the mmap interface. Since dhcpd doesn't support packet-mmap, UDP
    packets from the same host come out with apparently bogus checksums.

    This patch adds a mechanism for AF_PACKET recvmsg(2) to return the
    status along with the packet. It does so by adding a new cmsg that
    contains this information along with some other relevant data such
    as the original packet length.

    I didn't include the time stamp information since there is already
    a cmsg for that.

    This patch also changes the mmap code to set the CSUMNOTREADY flag
    on all packets instead of just outgoing packets on cooked sockets.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
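
A minimal sketch of how a receiver might pick the new cmsg out of a recvmsg(2) result. The helper names find_packet_auxdata and checksum_not_ready are hypothetical; struct tpacket_auxdata, PACKET_AUXDATA, and TP_STATUS_CSUMNOTREADY come from linux/if_packet.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263  /* fallback for C libraries that do not define it */
#endif

/* Walk the control messages of a received msghdr and copy out the
 * PACKET_AUXDATA payload. Returns 1 if found, 0 otherwise. */
static int find_packet_auxdata(struct msghdr *msg, struct tpacket_auxdata *out)
{
    struct cmsghdr *cmsg;

    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL; cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_PACKET &&
            cmsg->cmsg_type == PACKET_AUXDATA &&
            cmsg->cmsg_len >= CMSG_LEN(sizeof(*out))) {
            memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
            return 1;
        }
    }
    return 0;
}

/* True when the checksum in the packet data has not been filled in yet. */
static int checksum_not_ready(const struct tpacket_auxdata *aux)
{
    return (aux->tp_status & TP_STATUS_CSUMNOTREADY) != 0;
}
```

The cmsg is only delivered after the socket has opted in with setsockopt(fd, SOL_PACKET, PACKET_AUXDATA, &one, sizeof(one)). A DHCP server could then skip UDP checksum validation for packets where checksum_not_ready() is true.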