20 Jul, 2008

1 commit


19 Jul, 2008

1 commit

  • Add new sockopt to reserve some headroom in the mmaped ring frames in
    front of the packet payload. This can be used f.i. when the VLAN header
    needs to be (re)constructed to avoid moving the entire payload.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

15 Jul, 2008

3 commits

  • Store the VLAN tag in the auxillary data/tpacket2_hdr so userspace can
    properly deal with hardware VLAN tagging/stripping.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • The tpacket_hdr is not 64 bit clean due to use of an unsigned long
    and can't be extended because the following struct sockaddr_ll needs
    to be at a fixed offset.

    Add support for a version 2 tpacket protocol that removes these
    limitations.

    Userspace can query the header size through a new getsockopt option
    and change the protocol version through a setsockopt option. The
    changes needed to switch to the new protocol version are:

    1. replace struct tpacket_hdr by struct tpacket2_hdr
    2. query header len and save
    3. set protocol version to 2
    - set up ring as usual
    4. for getting the sockaddr_ll, use (void *)hdr + TPACKET_ALIGN(hdrlen)
    instead of (void *)hdr + TPACKET_ALIGN(sizeof(struct tpacket_hdr))

    Steps 2 and 4 can be omitted if the struct sockaddr_ll isn't needed.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • dev_set_promiscuity/allmulti might overflow. Commit: "netdevice: Fix
    promiscuity and allmulti overflow" in net-next makes
    dev_set_promiscuity/allmulti return error number if overflow happened.

    In af_packet, we check all positive increment for promiscuity and
    allmulti to get error return.

    Signed-off-by: Wang Chen
    Acked-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Wang Chen
     

12 Jun, 2008

1 commit


13 May, 2008

1 commit

  • This patch adds needed_headroom/needed_tailroom members to struct
    net_device and updates many places that allocate sbks to use them. Not
    all of them can be converted though, and I'm sure I missed some (I
    mostly grepped for LL_RESERVED_SPACE)

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

26 Mar, 2008

2 commits


24 Mar, 2008

1 commit


29 Jan, 2008

5 commits


13 Nov, 2007

1 commit


11 Nov, 2007

2 commits


01 Nov, 2007

1 commit

  • Finally, the zero_it argument can be completely removed from
    the callers and from the function prototype.

    Besides, fix the checkpatch.pl warnings about using the
    assignments inside if-s.

    This patch is rather big, and it is a part of the previous one.
    I splitted it wishing to make the patches more readable. Hope
    this particular split helped.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

19 Oct, 2007

1 commit


11 Oct, 2007

9 commits

  • Since hardware header operations are part of the protocol class
    not the device instance, make them into a separate object and
    save memory.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Wrap the hard_header_parse function to simplify next step of
    header_ops conversion.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Add inline for common usage of hardware header creation, and
    fix bug in IPV6 mcast where the assumption about negative return is
    an errno. Negative return from hard_header means not enough space
    was available,(ie -N bytes).

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This patch makes most of the generic device layer network
    namespace safe. This patch makes dev_base_head a
    network namespace variable, and then it picks up
    a few associated variables. The functions:
    dev_getbyhwaddr
    dev_getfirsthwbytype
    dev_get_by_flags
    dev_get_by_name
    __dev_get_by_name
    dev_get_by_index
    __dev_get_by_index
    dev_ioctl
    dev_ethtool
    dev_load
    wireless_process_ioctl

    were modified to take a network namespace argument, and
    deal with it.

    vlan_ioctl_set and brioctl_set were modified so their
    hooks will receive a network namespace argument.

    So basically anthing in the core of the network stack that was
    affected to by the change of dev_base was modified to handle
    multiple network namespaces. The rest of the network stack was
    simply modified to explicitly use &init_net the initial network
    namespace. This can be fixed when those components of the network
    stack are modified to handle multiple network namespaces.

    For now the ifindex generator is left global.

    Fundametally ifindex numbers are per namespace, or else
    we will have corner case problems with migration when
    we get that far.

    At the same time there are assumptions in the network stack
    that the ifindex of a network device won't change. Making
    the ifindex number global seems a good compromise until
    the network stack can cope with ifindex changes when
    you change namespaces, and the like.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Every user of the network device notifiers is either a protocol
    stack or a pseudo device. If a protocol stack that does not have
    support for multiple network namespaces receives an event for a
    device that is not in the initial network namespace it quite possibly
    can get confused and do the wrong thing.

    To avoid problems until all of the protocol stacks are converted
    this patch modifies all netdev event handlers to ignore events on
    devices that are not in the initial network namespace.

    As the rest of the code is made network namespace aware these
    checks can be removed.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch modifies every packet receive function
    registered with dev_add_pack() to drop packets if they
    are not from the initial network namespace.

    This should ensure that the various network stacks do
    not receive packets in a anything but the initial network
    namespace until the code has been converted and is ready
    for them.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch passes in the namespace a new socket should be created in
    and has the socket code do the appropriate reference counting. By
    virtue of this all socket create methods are touched. In addition
    the socket create methods are modified so that they will fail if
    you attempt to create a socket in a non-default network namespace.

    Failing if we attempt to create a socket outside of the default
    network namespace ensures that as we incrementally make the network stack
    network namespace aware we will not export functionality that someone
    has not audited and made certain is network namespace safe.
    Allowing us to partially enable network namespaces before all of the
    exotic protocols are supported.

    Any protocol layers I have missed will fail to compile because I now
    pass an extra parameter into the socket creation code.

    [ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch makes /proc/net per network namespace. It modifies the global
    variables proc_net and proc_net_stat to be per network namespace.
    The proc_net file helpers are modified to take a network namespace argument,
    and all of their callers are fixed to pass &init_net for that argument.
    This ensures that all of the /proc/net files are only visible and
    usable in the initial network namespace until the code behind them
    has been updated to be handle multiple network namespaces.

    Making /proc/net per namespace is necessary as at least some files
    in /proc/net depend upon the set of network devices which is per
    network namespace, and even more files in /proc/net have contents
    that are relevant to a single network namespace.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Andi mentioned he did something like this already, but never submitted
    it.

    The dhcp client application uses AF_PACKET with a packet filter to
    receive data. The application doesn't even use timestamps, but because
    the AF_PACKET API has timestamps, they get turned on globally which
    causes an expensive time of day lookup for every packet received on
    any system that uses the standard DHCP client.

    The fix is to not enable the timestamp (but use if if available).
    This causes the time lookup to only occur on those packets that are
    destined for the AF_PACKET socket. The timestamping occurs after
    packet filtering so all packets dropped by filtering to not cause a
    clock call.

    The one downside of this a a few microseconds additional delay added
    from the normal timestamping location (netif_rx) until the receive
    callback in AF_PACKET. But since the offset is fairly consistent it
    should not upset applications that do want really use timestamps, like
    wireshark.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

19 Jul, 2007

1 commit


11 Jul, 2007

1 commit


31 May, 2007

2 commits


26 Apr, 2007

7 commits

  • Add a packet socket option to allow the orig_dev index to be returned
    to userspace when passing traffic through a decapsulated device, such
    as the bonding driver.

    This is very useful for layer 2 traffic being able to report which
    physical device actually received the traffic, instead of having the
    encapsulating device hide that information.

    The new option is called PACKET_ORIGDEV.

    Signed-off-by: Peter P. Waskiewicz Jr.
    Signed-off-by: David S. Miller

    Peter P. Waskiewicz Jr
     
  • So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
    on 64bit architectures, allowing us to combine the 4 bytes hole left by the
    layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
    64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
    :-)

    Many calculations that previously required that skb->{transport,network,
    mac}_header be first converted to a pointer now can be done directly, being
    meaningful as offsets or pointers.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
    skb->mac to skb->mac_header, to match the names of the associated helpers
    (skb[_[re]set]_{transport,network,mac}_header).

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the quite common 'skb->nh.raw - skb->data' sequence.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
    later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in
    64bit land while possibly keeping it as a pointer on 32bit.

    This one touches just the most simple case, next will handle the slightly more
    "complex" cases.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the places where we need a pointer to the mac header, it is still legal to
    touch skb->mac.raw directly if just adding to, subtracting from or setting it
    to another layer header.

    This one also converts some more cases to skb_reset_mac_header() that my
    regex missed as it had no spaces before nor after '=', ugh.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Now network timestamps use ktime_t infrastructure, we can add a new
    ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'.
    User programs can thus access to nanosecond resolution.

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet