04 Dec, 2011

1 commit

  • Open vSwitch is a multilayer Ethernet switch targeted at virtualized
    environments. In addition to supporting a variety of features
    expected in a traditional hardware switch, it enables fine-grained
    programmatic extension and flow-based control of the network.
    This control is useful in a wide variety of applications but is
    particularly important in multi-server virtualization deployments,
    which are often characterized by highly dynamic endpoints and the need
    to maintain logical abstractions for multiple tenants.

    The Open vSwitch datapath provides an in-kernel fast path for packet
    forwarding. It is complemented by a userspace daemon, ovs-vswitchd,
    which is able to accept configuration from a variety of sources and
    translate it into packet processing rules.

    See http://openvswitch.org for more information and userspace
    utilities.

    Signed-off-by: Jesse Gross

    Jesse Gross
     

30 Nov, 2011

1 commit

  • Networking stack support for byte queue limits, uses dynamic queue
    limits library. Byte queue limits are maintained per transmit queue,
    and a dql structure has been added to netdev_queue structure for this
    purpose.

    Configuration of bql is in the tx- sysfs directory for the queue
    under the byte_queue_limits directory. Configuration includes:
    limit_min, bql minimum limit
    limit_max, bql maximum limit
    hold_time, bql slack hold time

    Also under the directory are:
    limit, current byte limit
    inflight, current number of bytes on the queue

    Signed-off-by: Tom Herbert
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Tom Herbert
     

23 Nov, 2011

1 commit

  • This patch adds in the infrastructure code to create the network priority
    cgroup. The cgroup, in addition to the standard processes file creates two
    control files:

    1) prioidx - This is a read-only file that exports the index of this cgroup.
    This is a value that is both arbitrary and unique to a cgroup in this subsystem,
    and is used to index the per-device priority map

    2) priomap - This is a writeable file. On read it reports a table of 2-tuples
    where name is the name of a network interface and priority is
    indicates the priority assigned to frames egresessing on the named interface and
    originating from a pid in this cgroup

    This cgroup allows for skb priority to be set prior to a root qdisc getting
    selected. This is benenficial for DCB enabled systems, in that it allows for any
    application to use dcb configured priorities so without application modification

    Signed-off-by: Neil Horman
    Signed-off-by: John Fastabend
    CC: Robert Love
    CC: "David S. Miller"
    Signed-off-by: David S. Miller

    Neil Horman
     

06 Jul, 2011

1 commit

  • The NFC subsystem core is responsible for providing the device driver
    interface. It is also responsible for providing an interface to the control
    operations and data exchange.

    Signed-off-by: Lauro Ramos Venancio
    Signed-off-by: Aloisio Almeida Jr
    Signed-off-by: Samuel Ortiz
    Signed-off-by: John W. Linville

    Lauro Ramos Venancio
     

30 Apr, 2011

1 commit


28 Apr, 2011

1 commit

  • In order to speedup packet filtering, here is an implementation of a
    JIT compiler for x86_64

    It is disabled by default, and must be enabled by the admin.

    echo 1 >/proc/sys/net/core/bpf_jit_enable

    It uses module_alloc() and module_free() to get memory in the 2GB text
    kernel range since we call helpers functions from the generated code.

    EAX : BPF A accumulator
    EBX : BPF X accumulator
    RDI : pointer to skb (first argument given to JIT function)
    RBP : frame pointer (even if CONFIG_FRAME_POINTER=n)
    r9d : skb->len - skb->data_len (headlen)
    r8 : skb->data

    To get a trace of generated code, use :

    echo 2 >/proc/sys/net/core/bpf_jit_enable

    Example of generated code :

    # tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24

    flen=18 proglen=147 pass=3 image=ffffffffa00b5000
    JIT code: ffffffffa00b5000: 55 48 89 e5 48 83 ec 60 48 89 5d f8 44 8b 4f 60
    JIT code: ffffffffa00b5010: 44 2b 4f 64 4c 8b 87 b8 00 00 00 be 0c 00 00 00
    JIT code: ffffffffa00b5020: e8 24 7b f7 e0 3d 00 08 00 00 75 28 be 1a 00 00
    JIT code: ffffffffa00b5030: 00 e8 fe 7a f7 e0 24 00 3d 00 14 a8 c0 74 49 be
    JIT code: ffffffffa00b5040: 1e 00 00 00 e8 eb 7a f7 e0 24 00 3d 00 14 a8 c0
    JIT code: ffffffffa00b5050: 74 36 eb 3b 3d 06 08 00 00 74 07 3d 35 80 00 00
    JIT code: ffffffffa00b5060: 75 2d be 1c 00 00 00 e8 c8 7a f7 e0 24 00 3d 00
    JIT code: ffffffffa00b5070: 14 a8 c0 74 13 be 26 00 00 00 e8 b5 7a f7 e0 24
    JIT code: ffffffffa00b5080: 00 3d 00 14 a8 c0 75 07 b8 ff ff 00 00 eb 02 31
    JIT code: ffffffffa00b5090: c0 c9 c3

    BPF program is 144 bytes long, so native program is almost same size ;)

    (000) ldh [12]
    (001) jeq #0x800 jt 2 jf 8
    (002) ld [26]
    (003) and #0xffffff00
    (004) jeq #0xc0a81400 jt 16 jf 5
    (005) ld [30]
    (006) and #0xffffff00
    (007) jeq #0xc0a81400 jt 16 jf 17
    (008) jeq #0x806 jt 10 jf 9
    (009) jeq #0x8035 jt 10 jf 17
    (010) ld [28]
    (011) and #0xffffff00
    (012) jeq #0xc0a81400 jt 16 jf 13
    (013) ld [38]
    (014) and #0xffffff00
    (015) jeq #0xc0a81400 jt 16 jf 17
    (016) ret #65535
    (017) ret #0

    Signed-off-by: Eric Dumazet
    Cc: Arnaldo Carvalho de Melo
    Cc: Ben Hutchings
    Cc: Hagen Paul Pfeifer
    Signed-off-by: David S. Miller

    Eric Dumazet
     

25 Jan, 2011

1 commit

  • Allow drivers for multiqueue hardware with flow filter tables to
    accelerate RFS. The driver must:

    1. Set net_device::rx_cpu_rmap to a cpu_rmap of the RX completion
    IRQs (in queue order). This will provide a mapping from CPUs to the
    queues for which completions are handled nearest to them.

    2. Implement net_device_ops::ndo_rx_flow_steer. This operation adds
    or replaces a filter steering the given flow to the given RX queue, if
    possible.

    3. Periodically remove filters for which rps_may_expire_flow() returns
    true.

    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     

14 Jan, 2011

1 commit

  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
    Documentation/trace/events.txt: Remove obsolete sched_signal_send.
    writeback: fix global_dirty_limits comment runtime -> real-time
    ppc: fix comment typo singal -> signal
    drivers: fix comment typo diable -> disable.
    m68k: fix comment typo diable -> disable.
    wireless: comment typo fix diable -> disable.
    media: comment typo fix diable -> disable.
    remove doc for obsolete dynamic-printk kernel-parameter
    remove extraneous 'is' from Documentation/iostats.txt
    Fix spelling milisec -> ms in snd_ps3 module parameter description
    Fix spelling mistakes in comments
    Revert conflicting V4L changes
    i7core_edac: fix typos in comments
    mm/rmap.c: fix comment
    sound, ca0106: Fix assignment to 'channel'.
    hrtimer: fix a typo in comment
    init/Kconfig: fix typo
    anon_inodes: fix wrong function name in comment
    fix comment typos concerning "consistent"
    poll: fix a typo in comment
    ...

    Fix up trivial conflicts in:
    - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
    - fs/ext4/ext4.h

    Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.

    Linus Torvalds
     

17 Dec, 2010

1 commit

  • B.A.T.M.A.N. (better approach to mobile ad-hoc networking) is a routing
    protocol for multi-hop ad-hoc mesh networks. The networks may be wired or
    wireless. See http://www.open-mesh.org/ for more information and user space
    tools.

    Signed-off-by: Sven Eckelmann
    Signed-off-by: David S. Miller

    Sven Eckelmann
     

29 Nov, 2010

1 commit

  • This patch adds XPS_CONFIG option to enable and disable XPS. This is
    done in the same manner as RPS_CONFIG. This is also fixes build
    failure in XPS code when SMP is not enabled.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

16 Nov, 2010

1 commit

  • Some of the documentation refers to web pages under
    the domain `osdl.org'. However, `osdl.org' now
    redirects to `linuxfoundation.org'.

    Rather than rely on redirections, this patch updates
    the addresses appropriately; for the most part, only
    documentation that is meant to be current has been
    updated.

    The patch should be pretty quick to scan and check;
    each new web-page url was gotten by trying out the
    original URL in a browser and then simply copying the
    the redirected URL (formatting as necessary).

    There is some conflict as to which one of these domain
    names is preferred:

    linuxfoundation.org
    linux-foundation.org

    So, I wrote:

    info@linuxfoundation.org

    and got this reply:

    Message-ID:
    Date: Mon, 15 Nov 2010 10:41:42 -0800
    From: David Ames

    ...

    linuxfoundation.org is preferred. The canonical name for our web site is
    www.linuxfoundation.org. Our list site is actually
    lists.linux-foundation.org.

    Regarding email linuxfoundation.org is preferred there are a few people
    who choose to use linux-foundation.org for their own reasons.

    Consequently, I used `linuxfoundation.org' for web pages and
    `lists.linux-foundation.org' for mailing-list web pages and email addresses;
    the only personal email address I updated from `@osdl.org' was that of
    Andrew Morton, who prefers `linux-foundation.org' according `git log'.

    Signed-off-by: Michael Witten
    Signed-off-by: Jiri Kosina

    Michael Witten
     

21 Oct, 2010

1 commit

  • This factors out protocol and low-level storage parts of ceph into a
    separate libceph module living in net/ceph and include/linux/ceph. This
    is mostly a matter of moving files around. However, a few key pieces
    of the interface change as well:

    - ceph_client becomes ceph_fs_client and ceph_client, where the latter
    captures the mon and osd clients, and the fs_client gets the mds client
    and file system specific pieces.
    - Mount option parsing and debugfs setup is correspondingly broken into
    two pieces.
    - The mon client gets a generic handler callback for otherwise unknown
    messages (mds map, in this case).
    - The basic supported/required feature bits can be expanded (and are by
    ceph_fs_client).

    No functional change, aside from some subtle error handling cases that got
    cleaned up in the refactoring process.

    Signed-off-by: Sage Weil

    Yehuda Sadeh
     

15 Sep, 2010

1 commit


06 Aug, 2010

1 commit

  • Separate out the DNS resolver key type from the CIFS filesystem into its own
    module so that it can be made available for general use, including the AFS
    filesystem module.

    This facility makes it possible for the kernel to upcall to userspace to have
    it issue DNS requests, package up the replies and present them to the kernel
    in a useful form. The kernel is then able to cache the DNS replies as keys
    can be retained in keyrings.

    Resolver keys are of type "dns_resolver" and have a case-insensitive
    description that is of the form "[:]". The optional
    indicates the particular DNS lookup and packaging that's required. The
    is the query to be made.

    If isn't given, a basic hostname to IP address lookup is made, and the
    result is stored in the key in the form of a printable string consisting of a
    comma-separated list of IPv4 and IPv6 addresses.

    This key type is supported by userspace helpers driven from /sbin/request-key
    and configured through /etc/request-key.conf. The cifs.upcall utility is
    invoked for UNC path server name to IP address resolution.

    The CIFS functionality is encapsulated by the dns_resolve_unc_to_ip() function,
    which is used to resolve a UNC path to an IP address for CIFS filesystem. This
    part remains in the CIFS module for now.

    See the added Documentation/networking/dns_resolver.txt for more information.

    Signed-off-by: Wang Lei
    Signed-off-by: David Howells
    Acked-by: Jeff Layton
    Signed-off-by: Steve French

    Wang Lei
     

27 Jul, 2010

1 commit


19 Jul, 2010

1 commit

  • This patch adds a new networking option to allow hardware time stamps
    from PHY devices. When enabled, likely candidates among incoming and
    outgoing network packets are offered to the PHY driver for possible
    time stamping. When accepted by the PHY driver, incoming packets are
    deferred for later delivery by the driver.

    The patch also adds phylib driver methods for the SIOCSHWTSTAMP ioctl
    and callbacks for transmit and receive time stamping. Drivers may
    optionally implement these functions.

    Signed-off-by: Richard Cochran
    Signed-off-by: David S. Miller

    Richard Cochran
     

04 Apr, 2010

1 commit

  • This patch splits the pppol2tp driver into separate L2TP and PPP parts
    to prepare for L2TPv3 support. In L2TPv3, protocols other than PPP can
    be carried, so this split creates a common L2TP core that will handle
    the common L2TP bits which protocol support modules such as PPP will
    use.

    Note that the existing pppol2tp module is split into l2tp_core and
    l2tp_ppp by this change.

    There are no feature changes here. Internally, however, there are
    significant changes, mostly to handle the separation of PPP-specific
    data from the L2TP session and to provide hooks in the core for
    modules like PPP to access.

    Signed-off-by: James Chapman
    Reviewed-by: Randy Dunlap
    Signed-off-by: David S. Miller

    James Chapman
     

31 Mar, 2010

1 commit


26 Mar, 2010

1 commit

  • RPS currently depends on SMP and SYSFS

    Adding a CONFIG_RPS makes sense in case this requirement changes in the
    future. This patch saves about 1500 bytes of kernel text in case SMP is
    on but SYSFS is off.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

15 Jul, 2009

1 commit

  • Wireless extensions have the unfortunate problem that events
    are multicast netlink messages, and are not independent of
    pointer size. Thus, currently 32-bit tasks on 64-bit platforms
    cannot properly receive events and fail with all kinds of
    strange problems, for instance wpa_supplicant never notices
    disassociations, due to the way the 64-bit event looks (to a
    32-bit process), the fact that the address is all zeroes is
    lost, it thinks instead it is 00:00:00:00:01:00.

    The same problem existed with the ioctls, until David Miller
    fixed those some time ago in an heroic effort.

    A different problem caused by this is that we cannot send the
    ASSOCREQIE/ASSOCRESPIE events because sending them causes a
    32-bit wpa_supplicant on a 64-bit system to overwrite its
    internal information, which is worse than it not getting the
    information at all -- so we currently resort to sending a
    custom string event that it then parses. This, however, has a
    severe size limitation we are frequently hitting with modern
    access points; this limitation would can be lifted after this
    patch by sending the correct binary, not custom, event.

    A similar problem apparently happens for some other netlink
    users on x86_64 with 32-bit tasks due to the alignment for
    64-bit quantities.

    In order to fix these problems, I have implemented a way to
    send compat messages to tasks. When sending an event, we send
    the non-compat event data together with a compat event data in
    skb_shinfo(main_skb)->frag_list. Then, when the event is read
    from the socket, the netlink code makes sure to pass out only
    the skb that is compatible with the task. This approach was
    suggested by David Miller, my original approach required
    always sending two skbs but that had various small problems.

    To determine whether compat is needed or not, I have used the
    MSG_CMSG_COMPAT flag, and adjusted the call path for recv and
    recvfrom to include it, even if those calls do not have a cmsg
    parameter.

    I have not solved one small part of the problem, and I don't
    think it is necessary to: if a 32-bit application uses read()
    rather than any form of recvmsg() it will still get the wrong
    (64-bit) event. However, neither do applications actually do
    this, nor would it be a regression.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

09 Jun, 2009

1 commit

  • Add support for communication over IEEE 802.15.4 networks. This implementation
    is neither certified nor complete, but aims to that goal. This commit contains
    only the socket interface for communication over IEEE 802.15.4 networks.
    One can either send RAW datagrams or use SOCK_DGRAM to encapsulate data
    inside normal IEEE 802.15.4 packets.

    Configuration interface, drivers and software MAC 802.15.4 implementation will
    follow.

    Initial implementation was done by Maxim Gorbachyov, Maxim Osipov and Pavel
    Smolensky as a research project at Siemens AG. Later the stack was heavily
    reworked to better suit the linux networking model, and is now maitained
    as an open project partially sponsored by Siemens.

    Signed-off-by: Dmitry Eremin-Solenikov
    Signed-off-by: Sergey Lapin
    Signed-off-by: David S. Miller

    Sergey Lapin
     

08 May, 2009

1 commit


30 Mar, 2009

1 commit


27 Mar, 2009

1 commit


22 Mar, 2009

1 commit

  • Now that most network device drivers in (all but one in x86_64 allmodconfig)
    support net_device_ops. Expose it as a configuration parameter. Still
    need to address even older 32 bit drivers, and other arch before
    compatiablity can be scheduled for removal in some future release.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

14 Mar, 2009

1 commit


04 Mar, 2009

1 commit


27 Feb, 2009

1 commit


27 Jan, 2009

2 commits


08 Jan, 2009

1 commit


22 Nov, 2008

1 commit


21 Nov, 2008

1 commit

  • This adds support for Data Center Bridging (DCB) features in the ixgbe
    driver and adds an rtnetlink interface for configuring DCB to the
    kernel. The DCB feature support included are Priority Grouping (PG) -
    which allows bandwidth guarantees to be allocated to groups to traffic
    based on the 802.1q priority, and Priority Based Flow Control (PFC) -
    which introduces a new MAC control PAUSE frame which works at
    granularity of the 802.1p priority instead of the link (IEEE 802.3x).

    Signed-off-by: Alexander Duyck
    Signed-off-by: Jeff Kirsher
    Signed-off-by: Peter P Waskiewicz Jr
    Signed-off-by: David S. Miller

    Alexander Duyck
     

20 Nov, 2008

1 commit

  • This patch changes the network device internal API to move adminstrative
    operations out of the network device structure and into a separate structure.

    This patch involves some hackery to maintain compatablity between the
    new and old model, so all 300+ drivers don't have to be changed at once.
    For drivers that aren't converted yet, the netdevice_ops virt function list
    still resides in the net_device structure. For old protocols, the new
    net_device_ops are copied out to the old net_device pointers.

    After the transistion is completed the nag message can be changed to
    an WARN_ON, and the compatiablity code can be made configurable.

    Some function pointers aren't moved:
    * destructor can't be in net_device_ops because
    it may need to be referenced after the module is unloaded.
    * neighbor setup is manipulated in a couple of places that need special
    consideration
    * hard_start_xmit is in the fast path for transmit.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

28 Oct, 2008

1 commit

  • To make testing of the network namespace simpler allow
    the network namespace code and the sysfs code to be
    compiled and run at the same time. To do this only
    virtual devices are allowed in the additional network
    namespaces and those virtual devices are not placed
    in the kobject tree.

    Since virtual devices don't actually do anything interesting
    hardware wise that needs device management there should
    be no loss in keeping them out of the kobject tree and
    by implication sysfs. The gain in ease of testing
    and code coverage should be significant.

    Changelog:

    v2: As pointed out by Benjamin Thery it only makes sense to call
    device_rename in the initial network namespace for now.

    Signed-off-by: Eric W. Biederman
    Acked-by: Benjamin Thery
    Tested-by: Serge Hallyn
    Acked-by: Serge Hallyn
    Acked-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

09 Oct, 2008

1 commit

  • Distributed Switch Architecture is a protocol for managing hardware
    switch chips. It consists of a set of MII management registers and
    commands to configure the switch, and an ethernet header format to
    signal which of the ports of the switch a packet was received from
    or is intended to be sent to.

    The switches that this driver supports are typically embedded in
    access points and routers, and a typical setup with a DSA switch
    looks something like this:

    +-----------+ +-----------+
    | | RGMII | |
    | +-------+ +------ 1000baseT MDI ("WAN")
    | | | 6-port +------ 1000baseT MDI ("LAN1")
    | CPU | | ethernet +------ 1000baseT MDI ("LAN2")
    | |MIImgmt| switch +------ 1000baseT MDI ("LAN3")
    | +-------+ w/5 PHYs +------ 1000baseT MDI ("LAN4")
    | | | |
    +-----------+ +-----------+

    The switch driver presents each port on the switch as a separate
    network interface to Linux, polls the switch to maintain software
    link state of those ports, forwards MII management interface
    accesses to those network interfaces (e.g. as done by ethtool) to
    the switch, and exposes the switch's hardware statistics counters
    via the appropriate Linux kernel interfaces.

    This initial patch supports the MII management interface register
    layout of the Marvell 88E6123, 88E6161 and 88E6165 switch chips, and
    supports the "Ethertype DSA" packet tagging format.

    (There is no officially registered ethertype for the Ethertype DSA
    packet format, so we just grab a random one. The ethertype to use
    is programmed into the switch, and the switch driver uses the value
    of ETH_P_EDSA for this, so this define can be changed at any time in
    the future if the one we chose is allocated to another protocol or
    if Ethertype DSA gets its own officially registered ethertype, and
    everything will continue to work.)

    Signed-off-by: Lennert Buytenhek
    Tested-by: Nicolas Pitre
    Tested-by: Byron Bradley
    Tested-by: Tim Ellis
    Tested-by: Peter van Valderen
    Tested-by: Dirk Teurlings
    Signed-off-by: David S. Miller

    Lennert Buytenhek
     

23 Sep, 2008

1 commit


23 Aug, 2008

1 commit


30 Jul, 2008

1 commit


06 Jul, 2008

1 commit

  • Add small STP demux layer for demuxing STP PDUs based on MAC address.
    This is needed to run both GARP and STP in parallel (or even load the
    modules) since both use LLC_SAP_BSPAN.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy