04 Apr, 2020

1 commit

  • Pull SPDX updates from Greg KH:
    "Here are three SPDX patches for 5.7-rc1.

    One fixes up the SPDX tag for a single driver, while the other two go
    through the tree and add SPDX tags for all of the .gitignore files as
    needed.

    Nothing too complex, but you will get a merge conflict with your
    current tree, that should be trivial to handle (one file modified by
    two things, one file deleted.)

    All three of these have been in linux-next for a while, with no
    reported issues other than the merge conflict"

    * tag 'spdx-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
    ASoC: MT6660: make spdxcheck.py happy
    .gitignore: add SPDX License Identifier
    .gitignore: remove too obvious comments

    Linus Torvalds
     

01 Apr, 2020

4 commits

  • Pull networking updates from David Miller:
    "Highlights:

    1) Fix the iwlwifi regression, from Johannes Berg.

    2) Support BSS coloring and 802.11 encapsulation offloading in
    hardware, from John Crispin.

    3) Fix some potential Spectre issues in qtnfmac, from Sergey
    Matyukevich.

    4) Add TTL decrement action to openvswitch, from Matteo Croce.

    5) Allow parallelization through flow_action setup by not taking the
    RTNL mutex, from Vlad Buslov.

    6) A lot of zero-length array to flexible-array conversions, from
    Gustavo A. R. Silva.

    7) Align XDP statistics names across several drivers for consistency,
    from Lorenzo Bianconi.

    8) Add various pieces of infrastructure for offloading conntrack, and
    make use of it in mlx5 driver, from Paul Blakey.

    9) Allow using listening sockets in BPF sockmap, from Jakub Sitnicki.

    10) Lots of parallelization improvements during configuration changes
    in mlxsw driver, from Ido Schimmel.

    11) Add support to devlink for generic packet traps, which report
    packets dropped during ACL processing. And use them in mlxsw
    driver. From Jiri Pirko.

    12) Support bcmgenet on ACPI, from Jeremy Linton.

    13) Make BPF compatible with RT, from Thomas Gleixner, Alexei
    Starovoitov, and yours truly.

    14) Support XDP meta-data in virtio_net, from Yuya Kusakabe.

    15) Fix sysfs permissions when network devices change namespaces, from
    Christian Brauner.

    16) Add a flags element to ethtool_ops so that drivers can more simply
    indicate which coalescing parameters they actually support, and
    therefore the generic layer can validate the user's ethtool
    request. Use this in all drivers, from Jakub Kicinski.

    17) Offload FIFO qdisc in mlxsw, from Petr Machata.

    18) Support UDP sockets in sockmap, from Lorenz Bauer.

    19) Fix stretch ACK bugs in several TCP congestion control modules,
    from Pengcheng Yang.

    20) Support virtual functions in octeontx2 driver, from Tomasz
    Duszynski.

    21) Add region operations for devlink and use it in ice driver to dump
    NVM contents, from Jacob Keller.

    22) Add support for hw offload of MACSEC, from Antoine Tenart.

    23) Add support for BPF programs that can be attached to LSM hooks,
    from KP Singh.

    24) Support for multiple paths, path managers, and counters in MPTCP.
    From Peter Krystad, Paolo Abeni, Florian Westphal, Davide Caratti,
    and others.

    25) More progress on adding the netlink interface to ethtool, from
    Michal Kubecek"

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2121 commits)
    net: ipv6: rpl_iptunnel: Fix potential memory leak in rpl_do_srh_inline
    cxgb4/chcr: nic-tls stats in ethtool
    net: dsa: fix oops while probing Marvell DSA switches
    net/bpfilter: remove superfluous testing message
    net: macb: Fix handling of fixed-link node
    net: dsa: ksz: Select KSZ protocol tag
    netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write
    net: stmmac: add EHL 2.5Gbps PCI info and PCI ID
    net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID
    net: stmmac: create dwmac-intel.c to contain all Intel platform
    net: dsa: bcm_sf2: Support specifying VLAN tag egress rule
    net: dsa: bcm_sf2: Add support for matching VLAN TCI
    net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions
    net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT
    net: dsa: bcm_sf2: Disable learning for ASP port
    net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge
    net: dsa: b53: Prevent tagged VLAN on port 7 for 7278
    net: dsa: b53: Restore VLAN entries upon (re)configuration
    net: dsa: bcm_sf2: Fix overflow checks
    hv_netvsc: Remove unnecessary round_up for recv_completion_cnt
    ...

    Linus Torvalds
     
  • In case memory resources for buf were allocated, release them before
    return.

    Addresses-Coverity-ID: 1492011 ("Resource leak")
    Fixes: a7a29f9c361f ("net: ipv6: add rpl sr tunnel")
    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: David S. Miller

    Gustavo A. R. Silva
     
  • Fix an oops in dsa_port_phylink_mac_change() caused by a combination
    of a20f997010c4 ("net: dsa: Don't instantiate phylink for CPU/DSA
    ports unless needed") and the net-dsa-improve-serdes-integration
    series of patches 65b7a2c8e369 ("Merge branch
    'net-dsa-improve-serdes-integration'").

    Unable to handle kernel NULL pointer dereference at virtual address 00000124
    pgd = c0004000
    [00000124] *pgd=00000000
    Internal error: Oops: 805 [#1] SMP ARM
    Modules linked in: tag_edsa spi_nor mtd xhci_plat_hcd mv88e6xxx(+) xhci_hcd armada_thermal marvell_cesa dsa_core ehci_orion libdes phy_armada38x_comphy at24 mcp3021 sfp evbug spi_orion sff mdio_i2c
    CPU: 1 PID: 214 Comm: irq/55-mv88e6xx Not tainted 5.6.0+ #470
    Hardware name: Marvell Armada 380/385 (Device Tree)
    PC is at phylink_mac_change+0x10/0x88
    LR is at mv88e6352_serdes_irq_status+0x74/0x94 [mv88e6xxx]

    Signed-off-by: Russell King
    Reviewed-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Russell King
     
  • A testing message was brought by 13d0f7b814d9 ("net/bpfilter: fix dprintf
    usage for /dev/kmsg") but should've been deleted before patch submission.
    Although it doesn't cause any harm to the code or functionality itself, it's
    totally unpleasant to have it displayed on every loop iteration with no real
    use case. Thus remove it unconditionally.

    Fixes: 13d0f7b814d9 ("net/bpfilter: fix dprintf usage for /dev/kmsg")
    Signed-off-by: Bruno Meneguele
    Signed-off-by: David S. Miller

    Bruno Meneguele
     

31 Mar, 2020

25 commits

  • David S. Miller
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Pablo Neira Ayuso says:

    ====================
    Netfilter/IPVS updates for net-next

    The following patchset contains Netfilter/IPVS updates for net-next:

    1) Add support to specify a stateful expression in set definitions,
    this allows users to specify e.g. counters per set elements.

    2) Flowtable software counter support.

    3) Flowtable hardware offload counter support, from wenxu.

    4) Parallelize flowtable hardware offload requests, from Paul Blakey.
    This includes a patch to add one work entry per offload command.

    5) Several patches to rework nf_queue refcount handling, from Florian
    Westphal.

    6) A few fixes for the flowtable tunnel offload: Fix crash if tunneling
    information is missing and set up indirect flow block as TC_SETUP_FT,
    patch from wenxu.

    7) Stricter netlink attribute sanity check on filters, from Romain Bellan
    and Florent Fourcot.

    8) Annotations to make sparse happy, from Jules Irenge.

    9) Improve icmp errors in debugging information, from Haishuang Yan.

    10) Fix warning in IPVS icmp error debugging, from Haishuang Yan.

    11) Fix endianness issue in tcp extension header, from Sergey Marinkevich.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The previous patch allowed device drivers to publish their default
    binding between packet trap policers and packet trap groups. However,
    some users might not be content with this binding and would like to
    change it.

    In case user space passed a packet trap policer identifier when setting
    a packet trap group, invoke the appropriate device driver callback and
    pass the new policer identifier.

    v2:
    * Check for presence of 'DEVLINK_ATTR_TRAP_POLICER_ID' in
    devlink_trap_group_set() and bail if not present
    * Add extack error message in case trap group was partially modified

    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Acked-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Ido Schimmel
     
  • Packet trap groups are used to aggregate logically related packet traps.
    Currently, these groups allow user space to batch operations such as
    setting the trap action of all member traps.

    In order to prevent the CPU from being overwhelmed by too many trapped
    packets, it is desirable to bind a packet trap policer to these groups.
    For example, to limit all the packets that encountered an exception
    during routing to 10Kpps.
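
As an illustration of what "policing to 10Kpps" means, here is a minimal token-bucket sketch in plain C. It is a software model written for this note, not the devlink or driver implementation: the `trap_policer` struct and the `policer_init()` / `policer_admit()` functions are hypothetical names, and real devices police in hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical software model of a packet trap policer. */
struct trap_policer {
	uint64_t rate_pps;	/* sustained rate, packets per second */
	uint64_t burst;		/* bucket depth, in packets */
	uint64_t tokens;	/* packets that may currently pass */
	uint64_t last_ns;	/* timestamp of the last refill */
	uint64_t dropped;	/* packets dropped by the policer */
};

static void policer_init(struct trap_policer *p, uint64_t rate_pps,
			 uint64_t burst, uint64_t now_ns)
{
	p->rate_pps = rate_pps;
	p->burst = burst;
	p->tokens = burst;	/* start with a full bucket */
	p->last_ns = now_ns;
	p->dropped = 0;
}

/* Returns true if the packet may reach the CPU, false if it is dropped. */
static bool policer_admit(struct trap_policer *p, uint64_t now_ns)
{
	uint64_t elapsed_ns = now_ns - p->last_ns;
	uint64_t refill = elapsed_ns * p->rate_pps / NSEC_PER_SEC;

	if (refill > 0) {
		p->tokens += refill;
		if (p->tokens > p->burst)
			p->tokens = p->burst;
		/* Advance only by the time that produced whole tokens. */
		p->last_ns += refill * NSEC_PER_SEC / p->rate_pps;
	}

	if (p->tokens == 0) {
		p->dropped++;
		return false;
	}
	p->tokens--;
	return true;
}
```

With a rate of 10000 and a burst of 2, two back-to-back packets pass, a third is dropped, and one more packet is admitted after 100 microseconds have elapsed.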

    Allow device drivers to bind default packet trap policers to packet trap
    groups when the latter are registered with devlink.

    The next patch will enable user space to change this default binding.

    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Ido Schimmel
     
    Devices capable of offloading the kernel's datapath and performing
    functions such as bridging and routing must also be able to send (trap)
    specific packets to the kernel (i.e., the CPU) for processing.

    For example, a device acting as a multicast-aware bridge must be able to
    trap IGMP membership reports to the kernel for processing by the bridge
    module.

    In most cases, the underlying device is capable of handling packet rates
    that are several orders of magnitude higher compared to those that can
    be handled by the CPU.

    Therefore, in order to prevent the underlying device from overwhelming
    the CPU, devices usually include packet trap policers that are able to
    police the trapped packets to rates that can be handled by the CPU.

    This patch allows capable device drivers to register their supported
    packet trap policers with devlink. User space can then tune the
    parameters of these policers (currently, rate and burst size) and read
    from the device the number of packets that were dropped by the policer,
    if supported.

    Subsequent patches in the series will allow device drivers to create
    default binding between these policers and packet trap groups and allow
    user space to change the binding.

    v2:
    * Add 'strict_start_type' in devlink policy
    * Have device drivers provide max/min rate/burst size for each policer.
    Use them to check validity of user provided parameters
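
The range check described in the v2 notes could look roughly like the following. This is a hedged userspace sketch: `struct policer_limits`, `policer_params_check()` and the message strings are invented for illustration and are not the devlink API.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits a driver might publish for one policer. */
struct policer_limits {
	uint64_t min_rate, max_rate;	/* packets per second */
	uint64_t min_burst, max_burst;	/* packets */
};

/*
 * Returns 0 when both parameters are within the published limits,
 * -EINVAL otherwise. On failure *msg points at a short explanation,
 * playing the role of an extack message in this sketch.
 */
static int policer_params_check(const struct policer_limits *lim,
				uint64_t rate, uint64_t burst,
				const char **msg)
{
	*msg = NULL;

	if (rate < lim->min_rate || rate > lim->max_rate) {
		*msg = "Policer rate is outside the supported range";
		return -EINVAL;
	}
	if (burst < lim->min_burst || burst > lim->max_burst) {
		*msg = "Policer burst size is outside the supported range";
		return -EINVAL;
	}
	return 0;
}
```

Rejecting out-of-range values before calling into the driver keeps every driver from having to repeat the same checks.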

    Signed-off-by: Ido Schimmel
    Reviewed-by: Jiri Pirko
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Ido Schimmel
     
    Avoid taking a reference on listen sockets by checking the socket type
    in the sk_assign and in the corresponding skb_steal_sock() code in the
    transport layer, and by ensuring that the prefetch free (sock_pfree)
    function uses the same logic to check whether the socket is refcounted.

    Suggested-by: Martin KaFai Lau
    Signed-off-by: Joe Stringer
    Signed-off-by: Alexei Starovoitov
    Acked-by: Martin KaFai Lau
    Link: https://lore.kernel.org/bpf/20200329225342.16317-4-joe@wand.net.nz

    Joe Stringer
     
  • Refactor the UDP/TCP handlers slightly to allow skb_steal_sock() to make
    the determination of whether the socket is reference counted in the case
    where it is prefetched by earlier logic such as early_demux.

    Signed-off-by: Joe Stringer
    Signed-off-by: Alexei Starovoitov
    Acked-by: Martin KaFai Lau
    Link: https://lore.kernel.org/bpf/20200329225342.16317-3-joe@wand.net.nz

    Joe Stringer
     
  • Add support for TPROXY via a new bpf helper, bpf_sk_assign().

    This helper requires the BPF program to discover the socket via a call
    to bpf_sk*_lookup_*(), then pass this socket to the new helper. The
    helper takes its own reference to the socket in addition to any existing
    reference that may or may not currently be obtained for the duration of
    BPF processing. For the destination socket to receive the traffic, the
    traffic must be routed towards that socket via local route. The
    simplest example route is below, but in practice you may want to route
    traffic more narrowly (eg by CIDR):

    $ ip route add local default dev lo

    This patch avoids trying to introduce an extra bit into the skb->sk, as
    that would require more invasive changes to all code interacting with
    the socket to ensure that the bit is handled correctly, such as all
    error-handling cases along the path from the helper in BPF through to
    the orphan path in the input. Instead, we opt to use the destructor
    variable to switch on the prefetch of the socket.

    Signed-off-by: Joe Stringer
    Signed-off-by: Alexei Starovoitov
    Acked-by: Martin KaFai Lau
    Link: https://lore.kernel.org/bpf/20200329225342.16317-2-joe@wand.net.nz

    Joe Stringer
     
  • Pull documentation updates from Jonathan Corbet:
    "This has been a busy cycle for documentation work.

    Highlights include:

    - Lots of RST conversion work by Mauro, Daniel Almeida, and others.
    Maybe someday we'll get to the end of this stuff...maybe...

    - Some organizational work to bring some order to the core-api
    manual.

    - Various new docs and additions to the existing documentation.

    - Typo fixes, warning fixes, ..."

    * tag 'docs-5.7' of git://git.lwn.net/linux: (123 commits)
    Documentation: x86: exception-tables: document CONFIG_BUILDTIME_TABLE_SORT
    MAINTAINERS: adjust to filesystem doc ReST conversion
    docs: deprecated.rst: Add BUG()-family
    doc: zh_CN: add translation for virtiofs
    doc: zh_CN: index files in filesystems subdirectory
    docs: locking: Drop :c:func: throughout
    docs: locking: Add 'need' to hardirq section
    docs: conf.py: avoid thousands of duplicate label warning on Sphinx
    docs: prevent warnings due to autosectionlabel
    docs: fix reference to core-api/namespaces.rst
    docs: fix pointers to io-mapping.rst and io_ordering.rst files
    Documentation: Better document the softlockup_panic sysctl
    docs: hw-vuln: tsx_async_abort.rst: get rid of an unused ref
    docs: perf: imx-ddr.rst: get rid of a warning
    docs: filesystems: fuse.rst: supress a Sphinx warning
    docs: translations: it: avoid duplicate refs at programming-language.rst
    docs: driver.rst: supress two ReSt warnings
    docs: trace: events.rst: convert some new stuff to ReST format
    Documentation: Add io_ordering.rst to driver-api manual
    Documentation: Add io-mapping.rst to driver-api manual
    ...

    Linus Torvalds
     
  • Pull io_uring updates from Jens Axboe:
    "Here are the io_uring changes for this merge window. Light on new
    features this time around (just splice + buffer selection), lots of
    cleanups, fixes, and improvements to existing support. In particular,
    this contains:

    - Cleanup fixed file update handling for stack fallback (Hillf)

    - Re-work of how pollable async IO is handled, we no longer require
    thread offload to handle that. Instead we rely on using poll to drive
    this, with task_work execution.

    - In conjunction with the above, allow expendable buffer selection,
    so that poll+recv (for example) no longer has to be a split
    operation.

    - Make sure we honor RLIMIT_FSIZE for buffered writes

    - Add support for splice (Pavel)

    - Linked work inheritance fixes and optimizations (Pavel)

    - Async work fixes and cleanups (Pavel)

    - Improve io-wq locking (Pavel)

    - Hashed link write improvements (Pavel)

    - SETUP_IOPOLL|SETUP_SQPOLL improvements (Xiaoguang)"

    * tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-block: (54 commits)
    io_uring: cleanup io_alloc_async_ctx()
    io_uring: fix missing 'return' in comment
    io-wq: handle hashed writes in chains
    io-uring: drop 'free_pfile' in struct io_file_put
    io-uring: drop completion when removing file
    io_uring: Fix ->data corruption on re-enqueue
    io-wq: close cancel gap for hashed linked work
    io_uring: make spdxcheck.py happy
    io_uring: honor original task RLIMIT_FSIZE
    io-wq: hash dependent work
    io-wq: split hashing and enqueueing
    io-wq: don't resched if there is no work
    io-wq: remove duplicated cancel code
    io_uring: fix truncated async read/readv and write/writev retry
    io_uring: dual license io_uring.h uapi header
    io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled
    io_uring: Fix unused function warnings
    io_uring: add end-of-bits marker and build time verify it
    io_uring: provide means of removing buffers
    io_uring: add IOSQE_BUFFER_SELECT support for IORING_OP_RECVMSG
    ...

    Linus Torvalds
     
  • If outer_proto is not set, GCC warns as follows:

    In file included from net/netfilter/ipvs/ip_vs_core.c:52:
    net/netfilter/ipvs/ip_vs_core.c: In function 'ip_vs_in_icmp':
    include/net/ip_vs.h:233:4: warning: 'outer_proto' may be used uninitialized in this function [-Wmaybe-uninitialized]
    233 | printk(KERN_DEBUG pr_fmt(msg), ##__VA_ARGS__); \
    | ^~~~~~
    net/netfilter/ipvs/ip_vs_core.c:1666:8: note: 'outer_proto' was declared here
    1666 | char *outer_proto;
    | ^~~~~~~~~~~

    Fixes: 73348fed35d0 ("ipvs: optimize tunnel dumps for icmp errors")
    Signed-off-by: Haishuang Yan
    Acked-by: Julian Anastasov
    Signed-off-by: Pablo Neira Ayuso

    Haishuang Yan
     
  • I ran into a problem on MIPS with Big-Endian turned on: every time
    NF tried to change the TCP MSS it returned early because new.v16 was
    greater than old.v16. But the real MSS was 1460 and my rule was like this:

    add rule table chain tcp option maxseg size set 1400

    And 1400 is less than 1460, not greater.

    Later I found that the main cause is the cast from u32 to __be16.

    Debugging:

    In this example MSS = 1400 (hex: 0x578). Below is the representation of
    each byte as it is laid out in memory, by address from left to right
    (e.g. [0x0 0x1 0x2 0x3]). LE is a Little-Endian system, BE is a
    Big-Endian system, and the left column is the type.

                           LE              BE
        u32:               [78 05 00 00]   [00 00 05 78]

    As you can see, the u32 representation would be cast to u16 from a
    different half of the 4-byte address range. But nf_tables actually uses
    registers to store data of various sizes. The TCP MSS is stored in
    2 bytes, while the registers are still defined as u32:

    struct nft_regs {
            union {
                    u32                     data[20];
                    struct nft_verdict      verdict;
            };
    };

    So an access like regs->data[priv->sreg] is exactly u32. According to the
    table presented above, the per-byte representation of the TCP MSS stored
    in the register will be:

                           LE              BE
        (u32)regs->data[]: [78 05 00 00]   [05 78 00 00]
                                            ^^ ^^

    We see that the register uses just half of the u32, and the other 2 bytes
    may be used for some other data. But in nft_exthdr_tcp_set_eval() it is
    cast directly from u32 to __be16:

    new.v16 = src

    But a u32 overflows a __be16, so the cast keeps the 2 low-order bytes.
    For clarity, here is one more table (<xx> marks the bytes that will be
    used for the cast).

                           LE                BE
        u32:               [<78 05> 00 00]   [00 00 <05 78>]
        (u32)regs->data[]: [<78 05> 00 00]   [05 78 <00 00>]

    As you can see, for Little-Endian nothing changes, but for Big-Endian we
    take the wrong half. In my case there was some other data instead of
    zeros, so the new MSS was wrongly greater.

    To fix this bug I used the same solution as for port ranges. Applying
    this patch does not affect Little-Endian systems.
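
The difference between the two ways of reading the register can be reproduced in userspace. The sketch below is illustrative only: `reg_with_u16()`, `load16_truncate()`, `load16_first_bytes()` and `host_is_little_endian()` are hypothetical helpers written for this note, not nf_tables functions.

```c
#include <stdint.h>
#include <string.h>

/*
 * Build a 32-bit "register" whose first two bytes in memory hold val
 * and whose last two bytes hold unrelated data, as in the tables above.
 */
static uint32_t reg_with_u16(uint16_t val, uint16_t other)
{
	uint8_t raw[4];
	uint32_t reg;

	memcpy(raw, &val, sizeof(val));
	memcpy(raw + 2, &other, sizeof(other));
	memcpy(&reg, raw, sizeof(reg));
	return reg;
}

/* The buggy pattern: a value cast keeps the low-order 16 bits. */
static uint16_t load16_truncate(uint32_t reg)
{
	return (uint16_t)reg;
}

/* The robust pattern: read the first two bytes of the register in memory. */
static uint16_t load16_first_bytes(const uint32_t *reg)
{
	uint16_t val;

	memcpy(&val, reg, sizeof(val));
	return val;
}

/* Runtime endianness probe, used only to explain the result. */
static int host_is_little_endian(void)
{
	uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}
```

For `reg_with_u16(1400, 0xffff)`, `load16_first_bytes()` returns 1400 on any host. `load16_truncate()` returns 1400 only on a little-endian host; on a big-endian host it returns the unrelated 0xffff, which matches the "wrongly greater" MSS described above.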

    Signed-off-by: Sergey Marinkevich
    Acked-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Sergey Marinkevich
     
  • …etooth/bluetooth-next

    Johan Hedberg says:

    ====================
    pull request: bluetooth-next 2020-03-29

    Here are a few more Bluetooth patches for the 5.7 kernel:

    - Fix assumption of encryption key size when reading fails
    - Add support for DEFER_SETUP with L2CAP Enhanced Credit Based Mode
    - Fix issue with auto-connected devices
    - Fix suspend handling when entering the state fails
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • The approach taken to pass the port policer methods on to drivers is
    pragmatic. It is similar to the port mirroring implementation (in that
    the DSA core does all of the filter block interaction and only passes
    simple operations for the driver to implement) and dissimilar to how
    flow-based policers are going to be implemented (where the driver has
    full control over the flow_cls_offload data structure).

    Signed-off-by: Vladimir Oltean
    Signed-off-by: David S. Miller

    Vladimir Oltean
     
  • Make room for other actions for the matchall filter by keeping the
    mirred argument parsing self-contained in its own function.

    Signed-off-by: Vladimir Oltean
    Signed-off-by: David S. Miller

    Vladimir Oltean
     
  • On low-memory systems, run-time dumps can consume too much memory. Add
    the ability for the administrator to disable auto dumps per reporter as
    part of the error flow handling routine.

    This attribute is not relevant while executing
    DEVLINK_CMD_HEALTH_REPORTER_DUMP_GET.

    By default, auto dump is activated for any reporter that has a dump method,
    as part of the reporter registration to devlink.

    Signed-off-by: Eran Ben Elisha
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Eran Ben Elisha
     
  • When a health reporter is registered to devlink, devlink will implicitly
    set auto recover if and only if the reporter has a recover method. There
    is no reason to explicitly get the auto recover flag from the driver.

    Remove this flag from all drivers that called
    devlink_health_reporter_create.

    All existing health reporters set auto recovery to true if they have a
    recover method.

    The administrator can still unset auto recover via a netlink command, as
    before this patch.
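
The rule "auto recover is on if and only if a recover method exists" can be sketched as below. The types and functions here (`struct reporter_ops`, `struct reporter`, `reporter_create()`, `reporter_set_auto_recover()`, `example_recover()`) are hypothetical stand-ins written for this note; they do not reproduce the devlink structures or signatures.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reporter callbacks; recover is NULL when unsupported. */
struct reporter_ops {
	int (*recover)(void *priv);
};

struct reporter {
	const struct reporter_ops *ops;
	bool auto_recover;
};

/* The default is derived from the ops, so no flag argument is needed. */
static void reporter_create(struct reporter *r, const struct reporter_ops *ops)
{
	r->ops = ops;
	r->auto_recover = ops->recover != NULL;
}

/* The administrator may still change the setting afterwards. */
static int reporter_set_auto_recover(struct reporter *r, bool enable)
{
	if (enable && !r->ops->recover)
		return -EOPNOTSUPP;
	r->auto_recover = enable;
	return 0;
}

/* Trivial recover callback used only to exercise the sketch. */
static int example_recover(void *priv)
{
	(void)priv;
	return 0;
}
```

Deriving the default from the callback table removes one parameter that every caller previously had to pass and keep consistent.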

    Signed-off-by: Eran Ben Elisha
    Reviewed-by: Jiri Pirko
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Eran Ben Elisha
     
  • The rest of the devlink code sets the extack message using
    NL_SET_ERR_MSG_MOD. Change the existing appearances of NL_SET_ERR_MSG
    to NL_SET_ERR_MSG_MOD.

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • It may be up to the driver (in case ANY HW stats is passed) to select
    which type of HW stats it is going to use. Add infrastructure to
    expose this information to the user.

    $ tc filter add dev enp3s0np1 ingress proto ip handle 1 pref 1 flower dst_ip 192.168.1.1 action drop
    $ tc -s filter show dev enp3s0np1 ingress
    filter protocol ip pref 1 flower chain 0
    filter protocol ip pref 1 flower chain 0 handle 0x1
    eth_type ipv4
    dst_ip 192.168.1.1
    in_hw in_hw_count 2
    action order 1: gact action drop
    random type none pass val 0
    index 1 ref 1 bind 1 installed 10 sec used 10 sec
    Action statistics:
    Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
    backlog 0b 0p requeues 0
    used_hw_stats immediate <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • Introduce a helper that takes a value and a selector, packs them into
    a struct, and puts the struct into a netlink message.
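
A rough userspace picture of "pack value and selector into a struct and append it to a message" follows. It is a sketch under stated assumptions: `struct bitfield32` and `put_bitfield32()` are hypothetical names, and the netlink attribute header and alignment handling that a real helper needs are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical payload: which bits are valid (selector) and their values. */
struct bitfield32 {
	uint32_t value;
	uint32_t selector;
};

/*
 * Copy the packed pair into buf. Returns the number of bytes written,
 * or 0 if there is not enough room left in the buffer.
 */
static size_t put_bitfield32(uint8_t *buf, size_t room,
			     uint32_t value, uint32_t selector)
{
	struct bitfield32 tmp = { .value = value, .selector = selector };

	if (room < sizeof(tmp))
		return 0;
	memcpy(buf, &tmp, sizeof(tmp));
	return sizeof(tmp);
}
```

Callers pass two plain integers and never build the struct themselves, which is the convenience the commit message describes.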

    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jiri Pirko
     
  • Steffen Klassert says:

    ====================
    pull request (net-next): ipsec-next 2020-03-28

    1) Use kmem_cache_zalloc() instead of kmem_cache_alloc()
    in xfrm_state_alloc(). From Huang Zijiang.

    2) esp_output_fill_trailer() is the same in IPv4 and IPv6,
    so share this function to avoid code duplication.
    From Raed Salem.

    3) Add offload support for esp beet mode.
    From Xin Long.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • There is no point in preparing the module name in a buffer. The format
    string can be passed directly to 'request_module()'.

    This axes a few lines of code and cleans a few things:
    - max len for a driver name is MODULE_NAME_LEN which is ~ 60 chars,
    not 128. It would be down-sized in 'request_module()'
    - we should pass the total size of the buffer to 'snprintf()', not the
    size minus 1
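
The second point is standard C behaviour: `snprintf()` counts the terminating NUL inside the size it is given, so passing the size minus one only wastes a byte. A small userspace sketch, where `format_name()` and the "tag-" prefix are made up for illustration:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format "<prefix><id>" into dst. Returns 0 on success and -1 if the
 * result did not fit; dst is NUL-terminated either way when dst_size > 0.
 */
static int format_name(char *dst, size_t dst_size, const char *prefix, int id)
{
	int n = snprintf(dst, dst_size, "%s%d", prefix, id);

	if (n < 0 || (size_t)n >= dst_size)
		return -1;
	return 0;
}
```

With an 8-byte buffer, `format_name(buf, sizeof(buf), "tag-", 123)` stores the full 7-character string, while passing `sizeof(buf) - 1` truncates it to "tag-12" even though the buffer had room.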

    Signed-off-by: Christophe JAILLET
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Christophe JAILLET
     
  • Xin Long says:
    On the udp rx path udp_rcv_segment() may do segmentation, where the frag
    skbs will get the header copied from the head skb in skb_segment_list()
    by calling __copy_skb_header(), which could overwrite the frag skbs'
    extensions by __skb_ext_copy() and cause a leak.

    This issue was found after loading esp_offload where a sec path ext
    is set in the skb.

    Fix this by discarding head state of the fraglist skb before replacing
    its contents.

    Fixes: 3a1296a38d0cf62 ("net: Support GRO/GSO fraglist chaining.")
    Cc: Steffen Klassert
    Reported-by: Xiumei Mu
    Tested-by: Xin Long
    Signed-off-by: Florian Westphal
    Acked-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Florian Westphal
     
  • Without NAPI_GRO_CB(skb)->is_flist initialized, when the dev doesn't
    support NETIF_F_GRO_FRAGLIST, is_flist can still be set and fraglist
    will be used in udp_gro_receive().

    So fix it by initializing is_flist with 0 in udp_gro_receive.

    Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
    Signed-off-by: Xin Long
    Acked-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Xin Long
     

30 Mar, 2020

10 commits

  • Implement TSINFO_GET request to get timestamping information for a network
    device. This is traditionally available via ETHTOOL_GET_TS_INFO ioctl
    request.

    Move part of ethtool_get_ts_info() into common.c so that ioctl and netlink
    code use the same logic to get timestamping information from the device.

    v3: use "TSINFO" rather than "TIMESTAMP", suggested by Richard Cochran

    Signed-off-by: Michal Kubecek
    Acked-by: Richard Cochran
    Signed-off-by: David S. Miller

    Michal Kubecek
     
  • Add three string sets related to timestamping information:

    ETH_SS_SOF_TIMESTAMPING: SOF_TIMESTAMPING_* flags
    ETH_SS_TS_TX_TYPES: timestamping Tx types
    ETH_SS_TS_RX_FILTERS: timestamping Rx filters

    These will be used for the TSINFO_GET request.

    v2: avoid compiler warning ("enumeration value not handled in switch")
    in net_hwtstamp_validate()

    v3: omit dash in Tx type names ("one-step-*" -> "onestep-*"), suggested by
    Richard Cochran

    Signed-off-by: Michal Kubecek
    Acked-by: Richard Cochran
    Signed-off-by: David S. Miller

    Michal Kubecek
     
  • Send ETHTOOL_MSG_EEE_NTF notification whenever EEE settings of a network
    device are modified using ETHTOOL_MSG_EEE_SET netlink message or
    ETHTOOL_SEEE ioctl request.

    Signed-off-by: Michal Kubecek
    Signed-off-by: David S. Miller

    Michal Kubecek
     
  • Implement EEE_SET netlink request to set EEE settings of a network device.
    These are traditionally set with ETHTOOL_SEEE ioctl request.

    The netlink interface allows setting the EEE status for all link modes
    supported by the kernel, but only the first 32 link modes can be set at
    the moment, as only those are supported by the ethtool_ops callback.
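
The 32-mode limitation comes from squeezing a wider bitmap into a single 32-bit word. The sketch below models that with hypothetical names (`LINK_MODE_WORDS`, `link_modes_to_legacy()`); the kernel's own bitmap types and conversion helpers are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bitmap width: 3 words, i.e. up to 96 link modes. */
#define LINK_MODE_WORDS 3

/*
 * Copy the first 32 link modes into a legacy 32-bit mask.
 * Returns false when modes beyond bit 31 were requested, because a
 * 32-bit callback cannot represent them.
 */
static bool link_modes_to_legacy(const uint32_t modes[LINK_MODE_WORDS],
				 uint32_t *legacy)
{
	bool fits = true;
	size_t i;

	for (i = 1; i < LINK_MODE_WORDS; i++) {
		if (modes[i])
			fits = false;
	}
	*legacy = modes[0];
	return fits;
}
```

A caller can use the return value to decide whether to reject the request or to report that higher modes were ignored.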

    Signed-off-by: Michal Kubecek
    Signed-off-by: David S. Miller

    Michal Kubecek
     
  • Implement EEE_GET request to get EEE settings of a network device. These
    are traditionally available via ETHTOOL_GEEE ioctl request.

    The netlink interface allows reporting EEE status for all link modes
    supported by the kernel, but only the first 32 link modes are provided at
    the moment, as only those are reported by the ethtool_ops callback and
    drivers.

    v2: fix alignment (whitespace only)

    Signed-off-by: Michal Kubecek
    Signed-off-by: David S. Miller

    Michal Kubecek
     
  • Send ETHTOOL_MSG_PAUSE_NTF notification whenever pause parameters of
    a network device are modified using ETHTOOL_MSG_PAUSE_SET netlink message
    or ETHTOOL_SPAUSEPARAM ioctl request.

    Signed-off-by: Michal Kubecek
    Signed-off-by: David S. Miller

    Michal Kubecek
     
  • Implement PAUSE_SET netlink request to set pause parameters of a network
    device. These are traditionally set with ETHTOOL_SPAUSEPARAM ioctl
    request.

    Signed-off-by: Michal Kubecek
    Signed-off-by: David S. Miller

    Michal Kubecek
     
  • Implement PAUSE_GET request to get pause parameters of a network device.
    These are traditionally available via ETHTOOL_GPAUSEPARAM ioctl request.

    Signed-off-by: Michal Kubecek
    Signed-off-by: David S. Miller

    Michal Kubecek
     
  • Send ETHTOOL_MSG_COALESCE_NTF notification whenever coalescing parameters
    of a network device are modified using ETHTOOL_MSG_COALESCE_SET netlink
    message or ETHTOOL_SCOALESCE ioctl request.

    Signed-off-by: Michal Kubecek
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Michal Kubecek
     
  • Implement COALESCE_SET netlink request to set coalescing parameters of
    a network device. These are traditionally set with ETHTOOL_SCOALESCE ioctl
    request. This commit adds only support for device coalescing parameters,
    not per queue coalescing parameters.

    Like the ioctl implementation, the generic ethtool code checks that only
    supported parameters are modified; if not, the first offending attribute
    is reported using extack.
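
The "first offending attribute" check can be pictured as a bitmask comparison. In this hedged sketch the `COAL_*` flags and `first_unsupported()` are hypothetical; they only mirror the idea of a driver-declared mask of supported coalescing parameters.

```c
#include <stdint.h>

/* Hypothetical parameter flags, one bit per coalescing parameter. */
#define COAL_RX_USECS		(1u << 0)
#define COAL_RX_MAX_FRAMES	(1u << 1)
#define COAL_TX_USECS		(1u << 2)
#define COAL_TX_MAX_FRAMES	(1u << 3)

/*
 * Returns -1 when every requested parameter is supported, otherwise the
 * bit index of the lowest-numbered unsupported parameter, which the
 * caller could map back to the offending attribute.
 */
static int first_unsupported(uint32_t supported, uint32_t requested)
{
	uint32_t bad = requested & ~supported;
	int bit = 0;

	if (!bad)
		return -1;
	while (!(bad & 1u)) {
		bad >>= 1;
		bit++;
	}
	return bit;
}
```

For a device that supports only the two `*_USECS` parameters, a request touching `COAL_RX_USECS | COAL_RX_MAX_FRAMES` yields index 1, the position of `COAL_RX_MAX_FRAMES`.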

    v2: fix alignment (whitespace only)

    Signed-off-by: Michal Kubecek
    Reviewed-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Michal Kubecek