06 Sep, 2019

1 commit

  • commit 5f3e2bf008c2221478101ee72f5cb4654b9fc363 upstream.

    Some TCP peers announce a very small MSS option in their SYN and/or
    SYN/ACK messages.

    This forces the stack to send packets with a very high network/cpu
    overhead.

    Linux has enforced a minimal value of 48. Since this value includes
    the size of TCP options, and the options can consume up to 40 bytes,
    each segment can carry as little as 8 bytes of payload.

    In some cases, it can be useful to increase the minimal value
    to a saner value.

    We still keep the default at 48 (TCP_MIN_SND_MSS), for compatibility
    reasons.

    Note that TCP_MAXSEG socket option enforces a minimal value
    of (TCP_MIN_MSS). David Miller increased this minimal value
    in commit c39508d6f118 ("tcp: Make TCP_MAXSEG minimum more correct.")
    from 64 to 88.

    We might in the future merge TCP_MIN_SND_MSS and TCP_MIN_MSS.
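
    The arithmetic above can be sketched as follows. This is only an
    illustration, not kernel code: the 48-byte minimum and the 40-byte
    option limit come from the commit message, while the 20-byte IPv4 and
    TCP base header sizes and both function names are assumptions made
    for the example.

```python
TCP_MIN_SND_MSS = 48        # default minimum named in the commit message
MAX_TCP_OPTION_BYTES = 40   # upper bound on TCP options stated in the message
IPV4_HEADER_BYTES = 20      # assumption: IPv4 header without IP options
TCP_HEADER_BYTES = 20       # assumption: base TCP header without options


def payload_per_segment(mss, option_bytes=MAX_TCP_OPTION_BYTES):
    """Payload bytes left once TCP options are taken out of the MSS."""
    return max(mss - option_bytes, 0)


def header_overhead_ratio(mss, option_bytes=MAX_TCP_OPTION_BYTES):
    """Header bytes sent for every payload byte (hypothetical helper)."""
    payload = payload_per_segment(mss, option_bytes)
    if payload == 0:
        raise ValueError("no room for payload at this MSS")
    headers = IPV4_HEADER_BYTES + TCP_HEADER_BYTES + option_bytes
    return headers / payload


if __name__ == "__main__":
    for mss in (TCP_MIN_SND_MSS, 88, 536):
        print(mss, payload_per_segment(mss), header_overhead_ratio(mss))
```

    With the default minimum and full-size options, every payload byte
    costs roughly ten header bytes under these assumptions, which is the
    overhead the commit message refers to.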

    CVE-2019-11479 -- tcp mss hardcoded to 48

    Signed-off-by: Eric Dumazet
    Suggested-by: Jonathan Looney
    Acked-by: Neal Cardwell
    Cc: Yuchung Cheng
    Cc: Tyler Hicks
    Cc: Bruce Curtis
    Cc: Jonathan Lemon
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman
    (cherry picked from commit 7f9f8a37e563c67b24ccd57da1d541a95538e8d9)

    Eric Dumazet
     

27 Sep, 2018

1 commit


24 Aug, 2018

1 commit

  • Pull ARM SoC driver updates from Olof Johansson:
    "Some of the larger changes this merge window:

    - Removal of drivers for Exynos5440, a Samsung SoC that never saw
    widespread use.

    - Uniphier support for USB3 and SPI reset handling

    - System control and SRAM drivers and bindings for Allwinner platforms

    - Qualcomm AOSS (Always-on subsystem) reset controller drivers

    - Raspberry Pi hwmon driver for voltage

    - Mediatek pwrap (pmic) support for MT6797 SoC"

    * tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (52 commits)
    drivers/firmware: psci_checker: stash and use topology_core_cpumask for hotplug tests
    soc: fsl: cleanup Kconfig menu
    soc: fsl: dpio: Convert DPIO documentation to .rst
    staging: fsl-mc: Remove remaining files
    staging: fsl-mc: Move DPIO from staging to drivers/soc/fsl
    staging: fsl-dpaa2: eth: move generic FD defines to DPIO
    soc: fsl: qe: gpio: Add qe_gpio_set_multiple
    usb: host: exynos: Remove support for Exynos5440
    clk: samsung: Remove support for Exynos5440
    soc: sunxi: Add the A13, A23 and H3 system control compatibles
    reset: uniphier: add reset control support for SPI
    cpufreq: exynos: Remove support for Exynos5440
    ata: ahci-platform: Remove support for Exynos5440
    soc: imx6qp: Use GENPD_FLAG_ALWAYS_ON for PU errata
    soc: mediatek: pwrap: add mt6351 driver for mt6797 SoCs
    soc: mediatek: pwrap: add pwrap driver for mt6797 SoCs
    soc: mediatek: pwrap: fix cipher init setting error
    dt-bindings: pwrap: mediatek: add pwrap support for MT6797
    reset: uniphier: add USB3 core reset control
    dt-bindings: reset: uniphier: add USB3 core reset support
    ...

    Linus Torvalds
     

19 Aug, 2018

1 commit

  • Pablo Neira Ayuso says:

    ====================
    Netfilter/IPVS fixes for net

    The following patchset contains Netfilter/IPVS fixes for your net tree:

    1) Infinite loop in IPVS when net namespace is released, from
    Tan Hu.

    2) Do not show negative timeouts in ip_vs_conn by using the new
    jiffies_delta_to_msecs(), patches from Matteo Croce.

    3) Set F_IFACE flag for linklocal addresses in ip6t_rpfilter,
    from Florian Westphal.

    4) Fix overflow in set size allocation, from Taehee Yoo.

    5) Use netlink_dump_start() from ctnetlink to fix memleak from
    the error path, again from Florian.

    6) Register nfnetlink_subsys in last place, otherwise netns
    init path may lose race and see net->nft uninitialized data.
    This also reverts previous attempt to fix this by increase
    netns refcount, patches from Florian.

    7) Remove conntrack entries on layer 4 protocol tracker module
    removal, from Florian.

    8) Use GFP_KERNEL_ACCOUNT for xtables blob allocation, from
    Michal Hocko.

    9) Get tproxy documentation in sync with existing codebase,
    from Mate Eckl.

    10) Honor preset layer 3 protocol via ctx->family in the new nft_ct
    timeout infrastructure, from Harsha Sharma.

    11) Let uapi nfnetlink_osf.h compile standalone with no errors,
    from Dmitry V. Levin.

    12) Missing braces compilation warning in nft_tproxy, patch from
    Mate Eckl.

    13) Disregard bogus check to bail out on non-anonymous sets from
    the dynamic set update extension.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

17 Aug, 2018

2 commits

  • If CBS parameters calculated for a 1000Mb link are used on a 100Mb port
    without h/w offload (for cpsw offload it doesn't matter), the shaper
    works incorrectly. According to the example and the test board, the
    second port is a 100Mb interface. Replace the values with ones
    recalculated for a 100Mb interface. This allows the same command to be
    used for the CBS software implementation on the board in the example.
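
    Why the link speed matters can be sketched with the standard
    credit-based shaper relation sendslope = idleslope - port rate. The
    function name and the 20 Mbit/s reservation below are illustrative
    assumptions and are not taken from the patched document.

```python
def cbs_sendslope(idleslope_kbps, port_rate_kbps):
    """Return the CBS sendslope in kbit/s for a given port rate.

    Uses the credit-based shaper relation
    sendslope = idleslope - port_transmit_rate.
    """
    if not 0 < idleslope_kbps <= port_rate_kbps:
        raise ValueError("idleslope must be positive and not exceed the port rate")
    return idleslope_kbps - port_rate_kbps


if __name__ == "__main__":
    reserved = 20_000  # hypothetical 20 Mbit/s reservation
    print("1000Mb port:", cbs_sendslope(reserved, 1_000_000))
    print(" 100Mb port:", cbs_sendslope(reserved, 100_000))
```

    The same reservation yields a very different sendslope on the two
    link speeds, so values computed for 1000Mb are wrong for a software
    shaper running on a 100Mb port.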

    Signed-off-by: Ivan Khoronzhuk
    Signed-off-by: David S. Miller

    Ivan Khoronzhuk
     
  • Transparent proxy support has recently been added to nf_tables, so
    this document should be updated with the new information.

    - Nft commands are added as alternatives to iptables ones.
    - The link for a patched iptables is removed as it is already part of
    the mainline iptables implementation (and the link is dead).
    - tcprdr is added as an example implementation of a transparent proxy

    Cc: "David S. Miller"
    Cc: Jonathan Corbet
    Cc: Florian Westphal
    Cc: KOVACS Krisztian
    Cc: Pablo Neira Ayuso
    Cc: linux-doc@vger.kernel.org
    Signed-off-by: Máté Eckl
    Signed-off-by: Pablo Neira Ayuso

    Máté Eckl
     

13 Aug, 2018

1 commit

  • Preventing the kernel from responding to ICMP Echo Requests messages
    can be useful in several ways. The sysctl parameter
    'icmp_echo_ignore_all' can be used to prevent the kernel from
    responding to IPv4 ICMP echo requests. For IPv6 pings, such
    a sysctl kernel parameter did not exist.

    Add the ability to prevent the kernel from responding to IPv6
    ICMP echo requests through the use of the following sysctl
    parameter : /proc/sys/net/ipv6/icmp/echo_ignore_all.
    Update the documentation to reflect this change.
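
    A minimal sketch of reading and setting the new parameter from user
    space follows. The sysctl path is the one given above; the helper
    names and the proc_root argument (added so the code can be tried
    against a scratch directory) are assumptions for the example, and
    writing to the real file requires root privileges.

```python
from pathlib import Path

# Path relative to the proc mount point, as named in the commit message.
SYSCTL_RELATIVE_PATH = "sys/net/ipv6/icmp/echo_ignore_all"


def read_echo_ignore_all(proc_root="/proc"):
    """Return True if the kernel is set to ignore IPv6 echo requests."""
    path = Path(proc_root) / SYSCTL_RELATIVE_PATH
    return path.read_text().strip() != "0"


def set_echo_ignore_all(enabled, proc_root="/proc"):
    """Write 1 or 0 to the sysctl file (needs root on a real system)."""
    path = Path(proc_root) / SYSCTL_RELATIVE_PATH
    path.write_text("1\n" if enabled else "0\n")
```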

    Signed-off-by: Virgile Jarry
    Signed-off-by: David S. Miller

    Virgile Jarry
     

03 Aug, 2018

1 commit


02 Aug, 2018

2 commits

  • After IPv4 packets are forwarded, the priority of the corresponding SKB
    is updated according to the TOS field of IPv4 header. This overrides any
    prioritization done earlier by e.g. an skbedit action or ingress-qos-map
    defined at a vlan device.

    Such overriding may not always be desirable. Even if the packet ends up
    being routed, which implies this is an L3 network node, an administrator
    may wish to preserve whatever prioritization was done earlier on in the
    pipeline.

    Therefore introduce a sysctl that controls this behavior. Keep the
    default value at 1 to maintain backward-compatible behavior.

    Signed-off-by: Petr Machata
    Reviewed-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Petr Machata
     
  • Add overline heading adornment to document title in order to comply
    with kernel doc requirements.

    Fixes: 60b9131 staging: fsl-mc: Convert documentation to rst format

    Signed-off-by: Ioana Ciornei
    Signed-off-by: David S. Miller

    Ioana Ciornei
     

27 Jul, 2018

2 commits

  • The UCAN driver supports the microcontroller-based USB/CAN
    adapters from Theobroma Systems. There are two form-factors
    that run essentially the same firmware:

    * Seal: standalone USB stick ( https://www.theobroma-systems.com/seal )

    * Mule: integrated on the PCB of various System-on-Modules from
    Theobroma Systems like the A31-µQ7 and the RK3399-Q7
    ( https://www.theobroma-systems.com/rk3399-q7 )

    The USB wire protocol has been designed to be as generic and
    hardware-independent as possible in the hope of being useful for
    implementation on other microcontrollers.

    Signed-off-by: Martin Elshuber
    Signed-off-by: Jakob Unterwurzacher
    Signed-off-by: Philipp Tomsich
    Acked-by: Wolfgang Grandegger
    Signed-off-by: Marc Kleine-Budde

    Jakob Unterwurzacher
     
  • Preferred kernel docs format is now restructured text. Convert
    netdev-FAQ.txt to restructured text.

    - Add SPDX license identifier.

    - Change file heading 'Information you need to know about netdev' to
    'netdev FAQ' to better suit displayed index (in HTML).

    - Change question/answer layout to suit rst. Copy format in
    Documentation/bpf/bpf_devel_QA.rst

    - Fix indentation of code snippets

    - If multiple consecutive URLs appear put them in a list (to maintain
    whitespace).

    - Use uniform spelling of 'bug fix' throughout document (not bugfix or
    bug-fix).

    - Add double back ticks to 'net' and 'net-next' when referring to the
    trees.

    - Use rst references for Documentation/ links.

    - Add rst label 'netdev-FAQ' for referencing by other docs files.

    - Remove stale entry from Documentation/networking/00-INDEX

    Signed-off-by: Tobin C. Harding
    Signed-off-by: David S. Miller

    Tobin C. Harding
     

26 Jul, 2018

1 commit

  • …o/linux into next/drivers

    Various updates to soc/fsl for 4.19

    Moves DPAA2 DPIO driver from staging to fsl/soc
    Adds multiple-pin support to QE gpio driver

    * tag 'soc-fsl-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/leo/linux:
    soc: fsl: cleanup Kconfig menu
    soc: fsl: dpio: Convert DPIO documentation to .rst
    staging: fsl-mc: Remove remaining files
    staging: fsl-mc: Move DPIO from staging to drivers/soc/fsl
    staging: fsl-dpaa2: eth: move generic FD defines to DPIO
    soc: fsl: qe: gpio: Add qe_gpio_set_multiple

    Signed-off-by: Olof Johansson <olof@lixom.net>

    Olof Johansson
     

25 Jul, 2018

1 commit


24 Jul, 2018

1 commit


21 Jul, 2018

1 commit


19 Jul, 2018

2 commits

  • The kernel documentation is now restructured text. Convert the Ethernet
    Bridge documentation and include it in the toplevel kernel
    documentation.

    - Fix heading adornments.
    - Add license identifier.

    Signed-off-by: Tobin C. Harding
    Signed-off-by: David S. Miller

    Tobin C. Harding
     
  • The kernel documentation is now restructured text. Convert the IP
    aliasing documentation and include it in the toplevel kernel
    documentation.

    - Fix heading adornments.
    - Correctly indent code snippets.
    - Limit line length to 72 characters in line with kernel documentation
    standards.
    - Add license identifier.

    Signed-off-by: Tobin C. Harding
    Signed-off-by: David S. Miller

    Tobin C. Harding
     

17 Jul, 2018

3 commits

  • This patch fixes a spelling typo in bonding.txt

    Signed-off-by: Masanari Iida
    Signed-off-by: David S. Miller

    Masanari Iida
     
  • Currently building the net_failover docs causes a bunch of warnings to
    be emitted. These warnings are all related to indentation and correctly
    highlight missing '::' (for code sections). It looks, from other rst
    files in Documentation, that the first column should be indented 2
    spaces.

    Add '::' before code snippets and indent all snippets uniformly starting
    with 2 spaces.

    Cc: Jonathan Corbet
    Signed-off-by: Tobin C. Harding
    Signed-off-by: David S. Miller

    Tobin C. Harding
     
  • Currently we have rst format docs for the failover and net_failover
    modules however these docs are not linked to within the index.

    Add `failover` and `net_failover` to the networking documentation index.

    Signed-off-by: Tobin C. Harding
    Signed-off-by: David S. Miller

    Tobin C. Harding
     

12 Jul, 2018

3 commits

  • Documentation/networking/e1000.rst:83: ERROR: Unexpected indentation.
    Documentation/networking/e1000.rst:84: WARNING: Block quote ends without a blank line; unexpected unindent.
    Documentation/networking/e1000.rst:173: WARNING: Definition list ends without a blank line; unexpected unindent.
    Documentation/networking/e1000.rst:236: WARNING: Definition list ends without a blank line; unexpected unindent.

    While here, fix highlights and mark a table as such.

    Signed-off-by: Mauro Carvalho Chehab
    Signed-off-by: Jeff Kirsher

    Mauro Carvalho Chehab
     
  • Documentation/networking/e100.rst:57: WARNING: Literal block expected; none found.
    Documentation/networking/e100.rst:68: WARNING: Literal block expected; none found.
    Documentation/networking/e100.rst:75: WARNING: Literal block expected; none found.
    Documentation/networking/e100.rst:84: WARNING: Literal block expected; none found.
    Documentation/networking/e100.rst:93: WARNING: Inline emphasis start-string without end-string.

    While here, fix some highlights.

    Signed-off-by: Mauro Carvalho Chehab
    Signed-off-by: Jeff Kirsher

    Mauro Carvalho Chehab
     
  • addr_gen_mode was introduced without documentation, add it now.

    Fixes: d35a00b8e33d ("net/ipv6: allow sysctl to change link-local address generation mode")
    Signed-off-by: Sabrina Dubroca
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     

02 Jul, 2018

1 commit


28 Jun, 2018

1 commit

  • The sock reference is lost when scrubbing the packet and that breaks
    TSQ (TCP Small Queues) and XPS (Transmit Packet Steering) causing
    performance impacts of about 50% in a single TCP stream when crossing
    network namespaces.

    XPS breaks because the queue mapping stored in the socket is not
    available, so another random queue might be selected when the stack
    needs to transmit something like a TCP ACK, or TCP Retransmissions.
    That causes packet re-ordering and/or performance issues.

    TSQ breaks because it orphans the packet while it is still in the
    host, so packets are queued contributing to the buffer bloat problem.

    Preserving the sock reference fixes both issues. The socket is
    orphaned anyways in the receiving path before any relevant action
    and on the TX side netfilter checks whether the reference is local
    before using it.

    Signed-off-by: Flavio Leitner
    Signed-off-by: David S. Miller

    Flavio Leitner
     

24 Jun, 2018

1 commit


23 Jun, 2018

4 commits

  • Recent patch updated e1000 docs to rst format. Docs build (`make
    htmldocs`) is currently failing due to this file with error:

    (SEVERE/4) Unexpected section title.

    This is because a section of the file is indented 2 spaces. Build error
    can be cleared by aligning the text with column 0. While we are changing
    these lines we can make sure line length does not exceed 72, that
    newlines following headings are uniform, and that full stops are
    followed by two spaces.

    Align text with column 0, limit line length to 72, ensure two spaces
    follow all full stops, ensure uniform use of newlines after heading.

    Fixes commit (228046e76189 Documentation: e1000: Update kernel documentation)

    CC: Jeff Kirsher
    Signed-off-by: Tobin C. Harding
    Acked-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Tobin C. Harding
     
  • Recent patch updated e100 docs to rst format. Docs build (`make
    htmldocs`) is currently failing due to this file with error:

    (SEVERE/4) Unexpected section title.

    This is because a section of the file is indented 2 spaces. Build error
    can be cleared by aligning the text with column 0. While we are changing
    these lines we can make sure line length does not exceed 72, that
    newlines following headings are uniform, and that full stops are
    followed by two spaces.

    Align text with column 0, limit line length to 72, ensure two spaces
    follow all full stops, ensure uniform use of newlines after heading.

    Fixes commit (85d63445f411 Documentation: e100: Update the Intel 10/100 driver doc)

    CC: Jeff Kirsher
    Signed-off-by: Tobin C. Harding
    Acked-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Tobin C. Harding
     
  • Recently documentation file was converted to rst. The document title
    has the incorrect heading adornment. From kernel docs:

    * Please stick to this order of heading adornments:

    1. ``=`` with overline for document title::

    ==============
    Document title
    ==============

    Add overline heading adornment to document title.
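
    The adornment rule quoted above can be sketched as a small helper
    that produces a title with matching overline and underline. The
    function names are hypothetical and the check is deliberately
    simple; it is not how the kernel docs build validates headings.

```python
def adorn_title(title, char="="):
    """Return the title with an overline and underline of equal length."""
    line = char * len(title)
    return f"{line}\n{title}\n{line}"


def has_overline(lines, char="="):
    """Check that the first three lines form an overlined document title."""
    if len(lines) < 3:
        return False
    over, title, under = lines[0], lines[1], lines[2]
    return (
        over == under
        and len(over) >= len(title) > 0
        and set(over) == {char}
    )


if __name__ == "__main__":
    print(adorn_title("Document title"))
```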

    Fixes commit (228046e76189 Documentation: e1000: Update kernel documentation)

    CC: Jeff Kirsher
    Signed-off-by: Tobin C. Harding
    Acked-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Tobin C. Harding
     
  • Recently documentation file was converted to rst. The document title
    has the incorrect heading adornment. From kernel docs:

    * Please stick to this order of heading adornments:

    1. ``=`` with overline for document title::

    ==============
    Document title
    ==============

    Add overline heading adornment to document title.

    Fixes commit (85d63445f411 Documentation: e100: Update the Intel 10/100 driver doc)

    CC: Jeff Kirsher
    Signed-off-by: Tobin C. Harding
    Acked-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Tobin C. Harding
     

15 Jun, 2018

1 commit

  • As stated at:
    http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#footnotes

    A footnote should contain either a number, a reference or
    an auto number, e. g.:
    [1], [#f1] or [#].

    While using [*] accidentally works for html, it fails for other
    document outputs. In particular, it causes an error with LaTeX
    output, causing all books after networking to not be built.

    So, replace it by a valid syntax.
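
    A rough check for the three label forms listed above might look
    like the sketch below. The regular expression is an assumption
    about what a named label may contain and is not the parser used by
    Sphinx or docutils.

```python
import re

# Accepts "1", "#f1" and "#", the three forms named in the commit message.
# The character set allowed after "#" is an assumption for this sketch.
_PORTABLE_LABEL = re.compile(r"^(?:\d+|#[A-Za-z0-9][A-Za-z0-9_.:+-]*|#)$")


def is_portable_footnote_label(label):
    """Return True for a numbered, named auto-numbered, or auto-numbered label."""
    return bool(_PORTABLE_LABEL.match(label))


if __name__ == "__main__":
    for label in ("1", "#f1", "#", "*"):
        print(repr(label), is_portable_footnote_label(label))
```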

    Acked-by: Oliver Hartkopp
    Signed-off-by: Mauro Carvalho Chehab
    Acked-by: Jonathan Corbet

    Mauro Carvalho Chehab
     

06 Jun, 2018

2 commits

  • Per discussion with David at netconf 2018, let's clarify
    DaveM's position of handling stable backports in netdev-FAQ.

    This is important for people relying on upstream -stable
    releases.

    Cc: Greg Kroah-Hartman
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Daniel Borkmann says:

    ====================
    pull-request: bpf-next 2018-06-05

    The following pull-request contains BPF updates for your *net-next* tree.

    The main changes are:

    1) Add a new BPF hook for sendmsg similar to existing hooks for bind and
    connect: "This allows to override source IP (including the case when it's
    set via cmsg(3)) and destination IP:port for unconnected UDP (slow path).
    TCP and connected UDP (fast path) are not affected. This makes UDP support
    complete, that is, connected UDP is handled by connect hooks, unconnected
    by sendmsg ones.", from Andrey.

    2) Rework of the AF_XDP API to allow extending it in future for type writer
    model if necessary. In this mode a memory window is passed to hardware
    and multiple frames might be filled into that window instead of just one,
    as is the case in the current fixed frame-size model. With the new
    changes made this can be supported without having to add a new descriptor
    format. Also, core bits for the zero-copy support for AF_XDP have been
    merged as agreed upon, where i40e bits will be routed via Jeff later on.
    Various improvements to documentation and sample programs included as
    well, all from Björn and Magnus.

    3) Given BPF's flexibility, a new program type has been added to implement
    infrared decoders. Quote: "The kernel IR decoders support the most
    widely used IR protocols, but there are many protocols which are not
    supported. [...] There is a 'long tail' of unsupported IR protocols,
    for which lircd is needed to decode the IR. IR encoding is done in such
    a way that some simple circuit can decode it; therefore, BPF is ideal.
    [...] user-space can define a decoder in BPF, attach it to the rc
    device through the lirc chardev.", from Sean.

    4) Several improvements and fixes to BPF core, among others, dumping map
    and prog IDs into fdinfo which is a straightforward way to correlate
    BPF objects used by applications, removing an indirect call and therefore
    retpoline in all map lookup/update/delete calls by invoking the callback
    directly for 64 bit archs, adding a new bpf_skb_cgroup_id() BPF helper
    for tc BPF programs to have an efficient way of looking up cgroup v2 id
    for policy or other use cases. Fixes to make sure we zero tunnel/xfrm
    state that hasn't been filled, to allow context access wrt pt_regs in
    32 bit archs for tracing, and last but not least various test cases
    for fixes that landed in bpf earlier, from Daniel.

    5) Get rid of the ndo_xdp_flush API and extend the ndo_xdp_xmit with
    a XDP_XMIT_FLUSH flag instead which allows to avoid one indirect
    call as flushing is now merged directly into ndo_xdp_xmit(), from Jesper.

    6) Add a new bpf_get_current_cgroup_id() helper that can be used in
    tracing to retrieve the cgroup id from the current process in order
    to allow for e.g. aggregation of container-level events, from Yonghong.

    7) Two follow-up fixes for BTF to reject invalid input values and
    related to that also two test cases for BPF kselftests, from Martin.

    8) Various API improvements to the bpf_fib_lookup() helper, that is,
    dropping MPLS bits which are not fully hashed out yet, rejecting
    invalid helper flags, returning error for unsupported address
    families as well as renaming flowlabel to flowinfo, from David.

    9) Various fixes and improvements to sockmap BPF kselftests in particular
    in proper error detection and data verification, from Prashant.

    10) Two arm32 BPF JIT improvements. One is to fix imm range check with
    regards to whether immediate fits into 24 bits, and a naming cleanup
    to get functions related to rsh handling consistent to those handling
    lsh, from Wang.

    11) Two compile warning fixes in BPF, one for BTF and a false positive
    to silence gcc in stack_map_get_build_id_offset(), from Arnd.

    12) Add missing seg6.h header into tools include infrastructure in order
    to fix compilation of BPF kselftests, from Mathieu.

    13) Several formatting cleanups in the BPF UAPI helper description that
    also fix an error during rst2man compilation, from Quentin.

    14) Hide an unused variable in sk_msg_convert_ctx_access() when IPv6 is
    not built into the kernel, from Yue.

    15) Remove a useless double assignment in dev_map_enqueue(), from Colin.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

05 Jun, 2018

5 commits

  • Jeff Kirsher says:

    ====================
    Intel Wired LAN Driver Updates 2018-06-04

    This series contains a smorgasbord of updates to documentation, e1000e,
    igb, ixgbe, ixgbevf and i40e.

    Benjamin Poirier fixes a potential kernel crash due to NULL pointer
    dereference in e1000e.

    Jeff updates the kernel documentation for e100 and e1000 to correct
    default values and URLs which were incorrect in the documentation. Also
    took the time to update these to the new reStructured text format for
    kernel documentation.

    Joanna Yurdal fixes a missing PTP transmit timestamp by ensuring that
    TSICR gets cleared when ICR is cleared.

    Sergey updates igb to reset all the transmit queues at one time so that
    we only have to wait once for all the queues to be reset.

    Alex fixes ixgbevf so that malicious driver detection (MDD) can co-exist
    with XDP.

    Emil and Tony extend the RTNL lock to ensure we get the most up-to-date
    values for the bits and avoid a possible race condition when going down.

    YueHaibing from Huawei introduces a helper function in ixgbe for
    operation reads to simplify the code a bit more.

    Daniel Borkmann adds support for XDP meta data when using build SKB
    for i40e.

    Shannon Nelson provides two fixes for the IPSec code in ixgbe, first is
    to make sure we do not try to offload the decryption of any incoming
    packet that is destined for the management engine. The other fix is to
    resolve a cast problem introduced by a sparse cleanup patch.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This patch fixes some typos/misspelling errors in the
    Documentation/networking files.

    Signed-off-by: Olivier Gayot
    Signed-off-by: David S. Miller

    Olivier Gayot
     
  • This changes the /proc/sys/net/ipv4/tcp_tw_reuse from a boolean
    to an integer.

    It now takes the values 0, 1 and 2, where 0 and 1 behave as before,
    while 2 enables timewait socket reuse only for sockets that we can
    prove are loopback connections:
    ie. bound to 'lo' interface or where one of source or destination
    IPs is 127.0.0.0/8, ::ffff:127.0.0.0/104 or ::1.

    This enables quicker reuse of ephemeral ports for loopback connections
    - where tcp_tw_reuse is 100% safe from a protocol perspective
    (this assumes no artificially induced packet loss on 'lo').

    This also makes establishing many loopback connections *much* faster
    (allocating ports out of the first half of the ephemeral port range
    is significantly faster than allocating from the second half).

    Without this change in a 32K ephemeral port space my sample program
    (it just establishes and closes [::1]:ephemeral -> [::1]:server_port
    connections in a tight loop) fails after 32765 connections in 24 seconds.
    With it enabled 50000 connections only take 4.7 seconds.

    This is particularly problematic for IPv6 where we only have one local
    address and cannot play tricks with varying source IP from 127.0.0.0/8
    pool.
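
    The loopback condition for value 2 can be sketched in user space as
    follows. The three address ranges are the ones listed above; the
    function names and the bound_to_lo flag are assumptions for the
    example and this is not the kernel's implementation.

```python
import ipaddress

# Address ranges named in the commit message for tcp_tw_reuse = 2.
_LOOPBACK_NETS = (
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("::ffff:127.0.0.0/104"),
    ipaddress.ip_network("::1/128"),
)


def is_loopback_address(addr):
    """Return True if addr falls in one of the loopback ranges above."""
    ip = ipaddress.ip_address(addr)
    return any(ip.version == net.version and ip in net for net in _LOOPBACK_NETS)


def qualifies_for_mode_2(src, dst, bound_to_lo=False):
    """Loopback test: bound to 'lo', or either endpoint is loopback."""
    return bound_to_lo or is_loopback_address(src) or is_loopback_address(dst)


if __name__ == "__main__":
    print(qualifies_for_mode_2("::1", "::1"))
    print(qualifies_for_mode_2("192.0.2.1", "198.51.100.1"))
```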

    Signed-off-by: Maciej Żenczykowski
    Cc: Neal Cardwell
    Cc: Yuchung Cheng
    Cc: Wei Wang
    Change-Id: I0377961749979d0301b7b62871a32a4b34b654e1
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Maciej Żenczykowski
     
  • Updated the e1000.txt kernel documentation with the latest information.

    Also convert the text file to reStructuredText (RST) format, since the
    Linux kernel documentation now uses this format for documentation.

    Signed-off-by: Jeff Kirsher
    Tested-by: Aaron Brown

    Jeff Kirsher
     
  • Over the years, several of the links have changed or are no longer valid
    so update them. In addition, the default values were incorrect for a
    couple of parameters.

    Converted the text file to the reStructuredText (RST) format, since the
    Linux kernel documentation now uses this format for documentation.

    Signed-off-by: Jeff Kirsher
    Tested-by: Aaron Brown

    Jeff Kirsher
     

04 Jun, 2018

1 commit

  • Currently, AF_XDP only supports a fixed frame-size memory scheme where
    each frame is referenced via an index (idx). A user passes the frame
    index to the kernel, and the kernel acts upon the data. Some NICs,
    however, do not have a fixed frame-size model, instead they have a
    model where a memory window is passed to the hardware and multiple
    frames are filled into that window (referred to as the "type-writer"
    model).

    By changing the descriptor format from the current frame index
    addressing scheme, AF_XDP can in the future be extended to support
    these kinds of NICs.

    In the index-based model, an idx refers to a frame of size
    frame_size. Addressing a frame in the UMEM is done by offsetting the
    UMEM starting address by a global offset, idx * frame_size + offset.
    Communicating via the fill- and completion-rings is done by means of
    idx.

    In this commit, the idx is removed in favor of an address (addr),
    which is a relative address ranging over the UMEM. To convert an
    idx-based address to the new addr is simply: addr = idx * frame_size +
    offset.

    We also stop referring to the UMEM "frame" as a frame. Instead it is
    simply called a chunk.

    To transfer ownership of a chunk to the kernel, the addr of the chunk
    is passed in the fill-ring. Note that the kernel will mask addr to
    make it chunk aligned, so there is no need for userspace to do
    that. E.g., for a chunk size of 2k, passing an addr of 2048, 2050 or
    3000 to the fill-ring will refer to the same chunk.
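
    The conversion and masking described above can be sketched as shown
    below. The formula and the 2k example come from the text; the
    function names and the power-of-two requirement on the chunk size
    are assumptions made so that a simple bit mask can be used.

```python
def idx_to_addr(idx, frame_size, offset=0):
    """Convert an old-style frame index to the new relative address."""
    return idx * frame_size + offset


def chunk_base(addr, chunk_size):
    """Mask addr down to the start of its chunk.

    Assumes chunk_size is a power of two so that a bit mask suffices.
    """
    if chunk_size <= 0 or chunk_size & (chunk_size - 1):
        raise ValueError("chunk_size must be a power of two")
    return addr & ~(chunk_size - 1)


if __name__ == "__main__":
    for addr in (2048, 2050, 3000):
        print(addr, "->", chunk_base(addr, 2048))
```

    All three addresses in the loop resolve to the chunk starting at
    2048, matching the example in the commit message.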

    On the completion-ring, the addr will match that of the Tx descriptor,
    passed to the kernel.

    Changing the descriptor format to use chunks/addr will allow for
    future changes to move to a type-writer based model, where multiple
    frames can reside in one chunk. In this model passing one single chunk
    into the fill-ring would potentially result in multiple Rx
    descriptors.

    This commit changes the uapi of AF_XDP sockets, and updates the
    documentation.

    Signed-off-by: Björn Töpel
    Signed-off-by: Daniel Borkmann

    Björn Töpel