01 Nov, 2017

1 commit

  • MIPS will soon not be a part of Imagination Technologies, and as such
    many @imgtec.com email addresses will no longer be valid. This patch
    updates the addresses for those who:

    - Have 10 or more patches in mainline authored using an @imgtec.com
    email address, or any patches dated within the past year.

    - Are still with Imagination but leaving as part of the MIPS business
    unit, as determined from an internal email address list.

    - Haven't already updated their email address (ie. JamesH) or expressed
    a desire to be excluded (ie. Maciej).

    - Acked v2 or earlier of this patch, which leaves Deng-Cheng, Matt &
    myself.

    New addresses are of the form firstname.lastname@mips.com, and all
    verified against an internal email address list. An entry is added to
    .mailmap for each person such that get_maintainer.pl will report the new
    addresses rather than @imgtec.com addresses which will soon be dead.

    Instances of the affected addresses throughout the tree are then
    mechanically replaced with the new @mips.com address.

    Signed-off-by: Paul Burton
    Cc: Deng-Cheng Zhu
    Acked-by: Dengcheng Zhu
    Cc: Matt Redfearn
    Acked-by: Matt Redfearn
    Cc: Andrew Morton
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-mips@linux-mips.org
    Cc: trivial@kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/17540/
    Signed-off-by: James Hogan

    Paul Burton
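
The mechanical replacement and the .mailmap entries described in the commit above can be sketched roughly as follows. This is an illustration only, not the script used for the patch: the names and addresses in `ADDRESS_MAP` and both helper functions are hypothetical.

```python
import re

# Hypothetical old -> new mapping; placeholder people, not real contributors.
ADDRESS_MAP = {
    "Jane.Doe@imgtec.com": "jane.doe@mips.com",
    "John.Smith@imgtec.com": "john.smith@mips.com",
}


def replace_addresses(text, address_map):
    """Replace every listed old address in text; unlisted addresses are kept."""
    # One alternation built from the escaped old addresses.
    pattern = re.compile("|".join(re.escape(old) for old in address_map))
    return pattern.sub(lambda m: address_map[m.group(0)], text)


def mailmap_entry(name, new_address, old_address):
    """Format a .mailmap line mapping an old commit address to the new one."""
    return f"{name} <{new_address}> <{old_address}>"
```

Addresses that are not in the map, such as those of people who asked to be excluded, pass through unchanged.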
     

29 Oct, 2017

4 commits

  • Pull networking fixes from David Miller:

    1) Fix route leak in xfrm_bundle_create().

    2) In mac80211, validate user rate mask before configuring it. From
    Johannes Berg.

    3) Properly enforce memory limits in fair queueing code, from Toke
    Hoiland-Jorgensen.

    4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet.

    5) Fix TSO header allocation and management in mvpp2 driver, from Yan
    Markman.

    6) Don't take socket lock in BH handler in strparser code, from Tom
    Herbert.

    7) Don't show sockets from other namespaces in AF_UNIX code, from
    Andrei Vagin.

    8) Fix double free in error path of tap_open(), from Girish Moodalbail.

    9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker
    and Alexander Duyck.

    10) Fix DCB mode programming in stmmac driver, from Jose Abreu.

    11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin
    Long.

    12) Properly align SKB head before building SKB in tuntap, from Jason
    Wang.

    13) Avoid matching qdiscs with a zero handle during lookups, from Cong
    Wang.

    14) Fix various endianness bugs in sctp, from Xin Long.

    15) Fix tc filter callback races and add selftests which trigger the
    problem, from Cong Wang.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
    selftests: Introduce a new test case to tc testsuite
    selftests: Introduce a new script to generate tc batch file
    net_sched: fix call_rcu() race on act_sample module removal
    net_sched: add rtnl assertion to tcf_exts_destroy()
    net_sched: use tcf_queue_work() in tcindex filter
    net_sched: use tcf_queue_work() in rsvp filter
    net_sched: use tcf_queue_work() in route filter
    net_sched: use tcf_queue_work() in u32 filter
    net_sched: use tcf_queue_work() in matchall filter
    net_sched: use tcf_queue_work() in fw filter
    net_sched: use tcf_queue_work() in flower filter
    net_sched: use tcf_queue_work() in flow filter
    net_sched: use tcf_queue_work() in cgroup filter
    net_sched: use tcf_queue_work() in bpf filter
    net_sched: use tcf_queue_work() in basic filter
    net_sched: introduce a workqueue for RCU callbacks of tc filter
    sctp: fix some type cast warnings introduced since very beginning
    sctp: fix a type cast warnings that causes a_rwnd gets the wrong value
    sctp: fix some type cast warnings introduced by transport rhashtable
    sctp: fix some type cast warnings introduced by stream reconf
    ...

    Linus Torvalds
     
  • Pull input fixes from Dmitry Torokhov:

    - fix gtco tablet driver, tightening parsing of HID descriptors

    - add an ACPI ID to the Elan driver so that it can handle touchpads
    found in Lenovo Ideapad 320/520

    - fix the Synaptics RMI4 driver to adjust handling of buttons

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: synaptics-rmi4 - limit the range of what GPIOs are buttons
    Input: gtco - fix potential out-of-bound access
    Input: elan_i2c - add ELAN0611 to the ACPI table

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "Two amd fixes, one i915 core and a few i915 GVT fixes, things seem
    fairly quiet"

    * tag 'drm-fixes-for-v4.14-rc7' of git://people.freedesktop.org/~airlied/linux:
    drm/i915/gvt: Adding ACTHD mmio read handler
    drm/i915/gvt: Extract mmio_read_from_hw() common function
    drm/i915/gvt: Refine MMIO_RING_F()
    drm/i915/gvt: properly check per_ctx bb valid state
    drm/i915/perf: fix perf enable/disable ioctls with 32bits userspace
    drm/amd/amdgpu: Remove workaround check for UVD6 on APUs
    drm/amd/powerplay: fix uninitialized variable

    Linus Torvalds
     
  • Pull SCSI fixes from James Bottomley:
    "Six fixes for mostly minor issues, most of which have small race
    windows for occurring"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    scsi: Suppress a kernel warning in case the prep function returns BLKPREP_DEFER
    scsi: sg: Re-fix off by one in sg_fill_request_table()
    scsi: aacraid: Fix controller initialization failure
    scsi: hpsa: Fix configured_logical_drive_count check
    scsi: qla2xxx: Initialize Work element before requesting IRQs
    scsi: zfcp: fix erp_action use-before-initialize in REC action trace

    Linus Torvalds
     

28 Oct, 2017

6 commits

  • Commit 9a393b5d5988 ("tap: tap as an independent module") created a
    separate tap module that implements tap functionality and exports
    interfaces that will be used by the macvtap and ipvtap modules to
    create their respective tap devices.

    However, that patch introduced a regression wherein the macvtap and
    ipvtap modules can be removed (through modprobe -r) while there are
    applications using the respective /dev/tapX devices. These applications
    cause the kernel to hold a reference to /dev/tapX through 'struct cdev
    macvtap_cdev' and 'struct cdev ipvtap_dev', defined in the macvtap and
    ipvtap modules respectively. So when the application is later closed,
    the kernel panics because it references a kernel virtual address (KVA)
    that belonged to the unloaded modules.

    ----------8
    BUG: unable to handle kernel paging request at ffffffffa038c500
    IP: cdev_put+0xf/0x30
    ----------8
    Signed-off-by: David S. Miller

    Girish Moodalbail
     
  • An unaligned alloc_frag->offset caused by a previous allocation will
    result in an unaligned skb->head. This in turn leads to an unaligned
    skb_shared_info and an unaligned dataref, which must be aligned for
    access on some architectures. Fix this by aligning
    alloc_frag->offset before refilling the frag.

    Fixes: 0bbd7dad34f8 ("tun: make tun_build_skb() thread safe")
    Cc: Eric Dumazet
    Cc: Willem de Bruijn
    Cc: Wei Wei
    Cc: Dmitry Vyukov
    Cc: Mark Rutland
    Reported-by: Wei Wei
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller

    Jason Wang
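
The alignment step in the commit above amounts to rounding an offset up to the next aligned boundary. A minimal sketch, assuming a power-of-two alignment; the value 64 and both function names are illustrative assumptions, since the commit message does not state which alignment is used:

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


# Assumed value for illustration only; not taken from the commit message.
ASSUMED_ALIGNMENT = 64


def next_frag_offset(current_offset):
    """Return the aligned offset at which the next buffer would start."""
    return align_up(current_offset, ASSUMED_ALIGNMENT)
```

An offset that is already aligned is returned unchanged, so applying the step before every refill is harmless.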
     
  • Pull xen fixes from Juergen Gross:

    - a fix for the Xen gntdev device repairing an issue in case of partial
    failure of mapping multiple pages of another domain

    - a fix of a regression in the Xen balloon driver introduced in 4.13

    - a build fix for Xen on ARM which will trigger e.g. for Linux RT

    - a maintainers update for pvops (not really Xen, but carrying through
    this tree just for convenience)

    * tag 'for-linus-4.14c-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    maintainers: drop Chris Wright from pvops
    arm/xen: don't inclide rwlock.h directly.
    xen: fix booting ballooned down hvm guest
    xen/gntdev: avoid out of bounds access in case of partial gntdev_mmap()

    Linus Torvalds
     
  • Pull EFI fixes from Ingo Molnar:
    "Two fixes: an ARM fix for KASLR interaction with hibernation, plus an
    efi_test crash fix"

    * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    efi/libstub/arm: Don't randomize runtime regions when CONFIG_HIBERNATION=y
    efi/efi_test: Prevent an Oops in efi_runtime_query_capsulecaps()

    Linus Torvalds
     
  • By convention the first 6 bits of F30 Ctrl 2 and 3 are used to signify
    GPIOs which are connected to buttons. Additional GPIOs may be used as
    input GPIOs to signal the touch controller of some event
    (e.g. disable touchpad). These additional GPIOs may meet the criteria of
    a button in rmi_f30_is_valid_button() but should not be considered
    buttons. This patch limits the GPIOs which are mapped to buttons to just
    the first 6.

    Signed-off-by: Andrew Duggan
    Reported-by: Daniel Martin
    Tested-by: Daniel Martin
    Acked-By: Benjamin Tissoires
    Signed-off-by: Dmitry Torokhov

    Andrew Duggan
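
The rule in the commit above can be sketched as a simple filter. This is not the driver code: `button_gpios()` is a hypothetical helper, and the boolean list stands in for the result of rmi_f30_is_valid_button(), whose real logic is not modeled here.

```python
# Per the commit message, only the first 6 GPIOs are treated as buttons.
MAX_BUTTON_GPIOS = 6


def button_gpios(looks_like_button):
    """Return the indices of GPIOs that get mapped to buttons.

    looks_like_button[i] is True when GPIO i meets the button criteria.
    GPIOs past the first six are ignored even if they meet the criteria.
    """
    return [
        i
        for i, valid in enumerate(looks_like_button[:MAX_BUTTON_GPIOS])
        if valid
    ]
```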
     
  • parse_hid_report_descriptor() has a while (i < length) loop, which
    only guarantees that there's at least 1 byte in the buffer, but the
    loop body can read multiple bytes which causes out-of-bounds access.

    Reported-by: Andrey Konovalov
    Reviewed-by: Andrey Konovalov
    Cc: stable@vger.kernel.org
    Signed-off-by: Dmitry Torokhov

    Dmitry Torokhov
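
The class of bug described above can be illustrated with a small parser for HID-style short items (a prefix byte whose low two bits select 0, 1, 2 or 4 data bytes). This is a simplified sketch, not the gtco driver's parser; `parse_items()` is a hypothetical name. Python slices cannot overrun a buffer, so the explicit length check is what models the fix.

```python
def parse_items(report):
    """Parse HID-style short items, stopping cleanly on a truncated item.

    Returns a list of (tag_and_type_bits, data_value) tuples.
    """
    sizes = (0, 1, 2, 4)
    items = []
    i = 0
    length = len(report)
    while i < length:
        prefix = report[i]
        size = sizes[prefix & 0x03]
        # `i < length` only guarantees the prefix byte; check the data too.
        if i + 1 + size > length:
            break
        data = int.from_bytes(report[i + 1:i + 1 + size], "little")
        items.append((prefix >> 2, data))
        i += 1 + size
    return items
```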
     

27 Oct, 2017

11 commits

  • Jeff Kirsher says:

    ====================
    Intel Wired LAN Driver Updates 2017-10-26

    This series contains fixes to e1000, igb, ixgbe and i40e.

    Vincenzo Maffione fixes a potential race condition which would result in
    the interface being up but with transmits disabled in the hardware.

    Colin Ian King fixes a possible NULL pointer dereference in e1000, which
    was found by Coverity.

    Jean-Philippe Brucker fixes a possible kernel panic when a driver cannot
    map a transmit buffer, which is caused by an erroneous test.

    Alex provides a fix for ixgbe, which is a partial revert of the commit
    ffed21bcee7a ("ixgbe: Don't bother clearing buffer memory for descriptor rings")
    because the previous commit messed up the exception handling path by
    adding the count back in when we did not need to. Also fixed a typo,
    where the transmit ITR setting was being used to determine if we were
    using adaptive receive interrupt moderation or not. Lastly, fixed a
    memory leak by including programming descriptors in the cleaned count.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • According to DWMAC databook the first queue operating mode
    must always be in DCB.

    As MTL_QUEUE_DCB = 1, we need to always set the first queue
    operating mode to DCB otherwise driver will think that queue
    is in AVB mode (because MTL_QUEUE_AVB = 0).

    Signed-off-by: Jose Abreu
    Cc: Joao Pinto
    Cc: David S. Miller
    Cc: Giuseppe Cavallaro
    Cc: Alexandre Torgue
    Signed-off-by: David S. Miller

    Jose Abreu
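
Why the default matters can be seen in a short sketch. The two constants come from the commit message; `queue_modes()` is a hypothetical helper, not a driver function.

```python
# Values taken from the commit message.
MTL_QUEUE_AVB = 0
MTL_QUEUE_DCB = 1


def queue_modes(requested_modes):
    """Return per-queue operating modes with the first queue forced to DCB."""
    modes = list(requested_modes)
    if modes:
        modes[0] = MTL_QUEUE_DCB
    return modes
```

A zero-initialized configuration such as `[0, 0, 0]` would otherwise read as AVB on every queue, including the first.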
     
  • According to DT bindings documentation we are expecting a
    property called "snps,read-requests" but we are parsing
    instead a property called "read,read-requests".

    This is clearly a typo. Fix it.

    Signed-off-by: Jose Abreu
    Cc: Joao Pinto
    Cc: David S. Miller
    Cc: Giuseppe Cavallaro
    Cc: Alexandre Torgue
    Signed-off-by: David S. Miller

    Jose Abreu
     
  • Saeed Mahameed says:

    ====================
    Mellanox, mlx5 fixes 2017-10-26

    The series includes some misc fixes for mlx5 core and ethernet driver.
    Please pull and let me know if there's any problem.

    For -Stable:
    net/mlx5e: Properly deal with encap flows add/del under neigh update (kernels >= 4.12)
    net/mlx5: Fix health work queue spin lock to IRQ safe (kernels >= 4.13)
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • One fix for stable:

    - fix perf enable/disable ioctls for 32bits (Lionel)

    Plus GVT fixes:

    - Fix per_ctx_bb check (Zhenyu)
    - Fix GPU hang of Linux guest (Xion)
    - Refine MMIO_RING_F to check for presence of VCS2 ring (Zhi)

    * tag 'drm-intel-fixes-2017-10-26' of git://anongit.freedesktop.org/drm/drm-intel:
    drm/i915/gvt: Adding ACTHD mmio read handler
    drm/i915/gvt: Extract mmio_read_from_hw() common function
    drm/i915/gvt: Refine MMIO_RING_F()
    drm/i915/gvt: properly check per_ctx bb valid state

    Dave Airlie
     
  • Pull rdma fix from Doug Ledford:
    "Fix an oops issue in the new RDMA netlink code"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
    RDMA/netlink: OOPs in rdma_nl_rcv_msg() from misinterpreted flag

    Linus Torvalds
     
  • When a workload is too heavy to finish within the GPU hang check timer
    interval (1.5), the GPU hang check function checks the ACTHD register
    value to decide whether the GPU is really dead. On real hardware,
    ACTHD is updated by the hardware while the workload is running, so the
    host kernel does not consider the GPU hung. The guest kernel, however,
    always reads a constant ACTHD value because GVT does not supply an ACTHD
    emulation handler, so the guest kernel detects a fake GPU hang.

    To remove such fake guest GPU hangs, this patch supplies an ACTHD
    mmio read handler which reads the real hardware ACTHD register directly.

    Signed-off-by: Xiong Zhang
    Signed-off-by: Zhi Wang
    Signed-off-by: Rodrigo Vivi
    Link: https://patchwork.freedesktop.org/patch/msgid/b4c9a097-3e62-124e-6856-b0c37764df7b@intel.com

    Xiong Zhang
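
The false positive described above can be reproduced with a toy model. All names here are hypothetical, and the real hang check is more involved than comparing two samples; the sketch only shows why a constant ACTHD read looks like a hang while a pass-through read does not.

```python
class Hardware:
    """Toy GPU whose ACTHD value advances while a workload runs."""

    def __init__(self):
        self.acthd = 0x1000

    def run(self, steps):
        self.acthd += 8 * steps


def read_acthd_constant(hw):
    """Guest view without an emulation handler: always the same value."""
    return 0


def read_acthd_from_hw(hw):
    """Guest view with a handler that reads the real register."""
    return hw.acthd


def hang_detected(read_acthd, hw, checks=3):
    """Sample ACTHD at each check interval; an unchanged value means 'hung'."""
    previous = read_acthd(hw)
    for _ in range(checks):
        hw.run(steps=100)  # the workload keeps making progress
        current = read_acthd(hw)
        if current == previous:
            return True
        previous = current
    return False
```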
     
  • The mmio read handlers for the ring timestamp and instdone registers both
    read the hardware value directly.

    Extract this into a common function to reduce code duplication.

    Signed-off-by: Xiong Zhang
    Signed-off-by: Zhi Wang

    Xiong Zhang
     
  • Use the host i915 macro in MMIO_RING_F() to check whether the host has
    a VCS2 ring. This also helps reduce some lines of code.

    Signed-off-by: Zhi Wang

    Zhi Wang
     
  • Need to check valid state for per_ctx bb and bypass batch buffer
    combine for scan if necessary. Otherwise adding invalid MI batch
    buffer start cmd for per_ctx bb will cause scan failure, which is
    taken as -EFAULT now so vGPU would be put in failsafe. This tries
    to fix that by checking per_ctx bb valid state. Also remove old
    invalid WARNING that indirect ctx bb shouldn't depend on valid
    per_ctx bb.

    Signed-off-by: Zhenyu Wang
    Signed-off-by: Zhi Wang

    Zhenyu Wang
     
  • Pull power management fix from Rafael Wysocki:
    "This fixes a device power management quality of service (PM QoS)
    framework implementation issue causing 'no restriction' requests for
    device resume latency, including 'no restriction' set by user space,
    to effectively override requests with specific device resume latency
    requirements.

    It is late in the cycle, but the bug in question is in the 'user space
    can trigger unexpected behavior' category and the fix is
    stable-candidate, so here it goes"

    * tag 'pm-4.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PM / QoS: Fix device resume latency PM QoS

    Linus Torvalds
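
The override effect described in the pull message can be sketched as an aggregation problem. This assumes, for illustration only, that requests are combined by taking the minimum and that "no restriction" is encoded as 0; neither detail is stated in the message above, and the second function shows the expected outcome rather than the actual fix.

```python
# Assumed encoding for illustration: 0 stands for "no restriction".
NO_RESTRICTION = 0


def effective_latency_naive(requests):
    """Plain minimum: a single 'no restriction' request (0) beats all others."""
    return min(requests, default=NO_RESTRICTION)


def effective_latency_expected(requests):
    """Ignore 'no restriction' requests unless they are the only ones present."""
    specific = [r for r in requests if r != NO_RESTRICTION]
    return min(specific, default=NO_RESTRICTION)
```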
     

26 Oct, 2017

18 commits

  • Pull block fixes from Jens Axboe:
    "A few select fixes that should go into this series. Mainly for NVMe,
    but also a single stable fix for nbd from Josef"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    nbd: handle interrupted sendmsg with a sndtimeo set
    nvme-rdma: Fix error status return in tagset allocation failure
    nvme-rdma: Fix possible double free in reconnect flow
    nvmet: synchronize sqhd update
    nvme-fc: retry initial controller connections 3 times
    nvme-fc: fix iowait hang

    Linus Torvalds
     
  • Pull spi fixes from Mark Brown:
    "There are a bunch of device specific fixes (more than I'd like, I've
    been lax sending these) plus one important core fix for the conversion
    to use an IDR for bus number allocation which avoids issues with
    collisions when some but not all of the buses in the system have a
    fixed bus number specified.

    The Armada changes are rather large, specifically "spi: armada-3700:
    Fix padding when sending not 4-byte aligned data", but it's a storage
    corruption issue and there's things like indentation changes which
    make it look bigger than it really is. It's been cooking in -next for
    quite a while now and is part of the reason for the delay"

    * tag 'spi-fix-v4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
    spi: fix IDR collision on systems with both fixed and dynamic SPI bus numbers
    spi: bcm-qspi: Fix use after free in bcm_qspi_probe() in error path
    spi: a3700: Return correct value on timeout detection
    spi: uapi: spidev: add missing ioctl header
    spi: stm32: Fix logical error in stm32_spi_prepare_mbr()
    spi: armada-3700: Fix padding when sending not 4-byte aligned data
    spi: armada-3700: Fix failing commands with quad-SPI

    Linus Torvalds
     
  • This patch updates the i40e driver to include programming descriptors in
    the cleaned_count. Without this change it becomes possible for us to leak
    memory as we don't trigger a large enough allocation when the time comes to
    allocate new buffers and we end up overwriting a number of rx_buffers equal
    to the number of programming descriptors we encountered.

    Fixes: 0e626ff7ccbf ("i40e: Fix support for flow director programming status")
    Signed-off-by: Alexander Duyck
    Tested-by: Anders K. Pedersen
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • It looks like there was either a copy/paste error or just a typo that
    resulted in the Tx ITR setting being used to determine if we were using
    adaptive Rx interrupt moderation or not.

    This patch fixes the typo.

    Fixes: 65e87c0398f5 ("i40evf: support queue-specific settings for interrupt moderation")
    Signed-off-by: Alexander Duyck
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • This patch is a partial revert of "ixgbe: Don't bother clearing buffer
    memory for descriptor rings". Specifically I messed up the exception
    handling path a bit and this resulted in us incorrectly adding the count
    back in when we didn't need to.

    In order to make this simpler I am reverting most of the exception handling
    path change and instead just replacing the bit that was handled by the
    unmap_and_free call.

    Fixes: ffed21bcee7a ("ixgbe: Don't bother clearing buffer memory for descriptor rings")
    Signed-off-by: Alexander Duyck
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher

    Alexander Duyck
     
  • When the driver cannot map a TX buffer, instead of rolling back
    gracefully and retrying later, we currently get a panic:

    [ 159.885994] igb 0000:00:00.0: TX DMA map failed
    [ 159.886588] Unable to handle kernel paging request at virtual address ffff00000a08c7a8
    ...
    [ 159.897031] PC is at igb_xmit_frame_ring+0x9c8/0xcb8

    Fix the erroneous test that leads to this situation.

    Signed-off-by: Jean-Philippe Brucker
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher

    Jean-Philippe Brucker
     
  • Currently if the stat type is invalid then data[i] is being set
    either by dereferencing a null pointer p, or it is reading from
    an incorrect previous location if we had a valid stat type
    previously. Fix this by skipping over the read of p on an invalid
    stat type.

    Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")

    Signed-off-by: Colin Ian King
    Reviewed-by: Alexander Duyck
    Tested-by: Aaron Brown
    Signed-off-by: Jeff Kirsher

    Colin Ian King
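
The pattern behind this fix can be sketched as follows. The stat types, names and the choice to leave an invalid slot at 0 are assumptions made for illustration; they are not taken from the driver.

```python
def collect_stats(stat_defs, netdev_stats, adapter_stats):
    """Fill one value per stat definition, skipping invalid stat types."""
    data = [0] * len(stat_defs)
    for i, (stat_type, name) in enumerate(stat_defs):
        if stat_type == "netdev":
            source = netdev_stats
        elif stat_type == "adapter":
            source = adapter_stats
        else:
            # Invalid type: skip the read entirely instead of using a null
            # or stale source. Leaving the slot at 0 is an assumption.
            continue
        data[i] = source[name]
    return data
```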
     
  • This patch fixes a race condition that can result in the interface being
    up and carrier on, but with transmits disabled in the hardware.
    The bug may show up when repeatedly bringing the interface down and up
    (IFF_DOWN + IFF_UP), which allows e1000_watchdog() to interleave with
    e1000_down().

    CPU x                           CPU y
    --------------------------------------------------------------------
    e1000_down():
        netif_carrier_off()
                                    e1000_watchdog():
                                        if (carrier == off) {
                                            netif_carrier_on();
                                            enable_hw_transmit();
                                        }
        disable_hw_transmit();
                                    e1000_watchdog():
                                        /* carrier on, do nothing */

    Signed-off-by: Vincenzo Maffione
    Tested-by: Aaron Brown
    Signed-off-by: Jeff Kirsher

    Vincenzo Maffione
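
The interleaving in the commit above can be replayed deterministically in a toy model. The `Nic` class and both functions are hypothetical stand-ins; the sketch reproduces only the bad ordering, not the fix.

```python
class Nic:
    """Toy interface state: carrier flag and hardware transmit enable."""

    def __init__(self):
        self.carrier_on = True
        self.tx_enabled = True


def watchdog(nic, link_up=True):
    """Toy e1000_watchdog(): acts only when the carrier is currently off."""
    if link_up and not nic.carrier_on:
        nic.carrier_on = True
        nic.tx_enabled = True


def run_interleaving():
    """Replay the CPU x / CPU y ordering and return the final state."""
    nic = Nic()
    nic.carrier_on = False  # CPU x: e1000_down() calls netif_carrier_off()
    watchdog(nic)           # CPU y: sees carrier off, turns it on, enables Tx
    nic.tx_enabled = False  # CPU x: e1000_down() disables hardware transmit
    watchdog(nic)           # CPU y: carrier already on, does nothing
    return nic.carrier_on, nic.tx_enabled
```

The final state is carrier on with transmits disabled, and later watchdog runs never repair it because they see the carrier as already on.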
     
  • Commit 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't
    online new memory initially") introduced a regression when booting a
    HVM domain with memory less than mem-max: instead of ballooning down
    immediately the system would try to use the memory up to mem-max
    resulting in Xen crashing the domain.

    For HVM domains the current size will be reflected in Xenstore node
    memory/static-max instead of memory/target.

    Additionally we have to trigger the ballooning process at once.

    Cc: # 4.13
    Fixes: 96edd61dcf44362d3ef0bed1a5361e0ac7886a63 ("xen/balloon: don't
    online new memory initially")

    Reported-by: Simon Gaiser
    Suggested-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Juergen Gross
     
  • A double free of skb_array in the tap module causes a kernel panic. When
    tap_set_queue() fails we free skb_array right away by calling
    skb_array_cleanup(). However, skb_array_cleanup() is later called
    again by tap_sock_destruct() through sock_put(). This patch fixes that
    issue.

    Fixes: 362899b8725b35e3 (macvtap: switch to use skb array)
    Signed-off-by: Girish Moodalbail
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller

    Girish Moodalbail
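
The double cleanup can be modeled with a resource that counts how often it is freed. The class and function names are hypothetical, and dropping the early cleanup is only one way to avoid the problem; the commit message does not say how the patch restructures the error path.

```python
class SkbArray:
    """Toy resource that raises if it is cleaned up more than once."""

    def __init__(self):
        self.cleanups = 0

    def cleanup(self):
        self.cleanups += 1
        if self.cleanups > 1:
            raise RuntimeError("double free")


def open_with_failure(cleanup_on_error):
    """Model an open path where queue setup fails and the socket is released."""
    array = SkbArray()
    if cleanup_on_error:
        array.cleanup()  # error path frees the array right away
    array.cleanup()      # socket destructor frees it (again) via sock_put()
    return array.cleanups
```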
     
  • …nux/kernel/git/mkl/linux-can

    Marc Kleine-Budde says:

    ====================
    pull-request: can 2017-10-24

    here's another pull request for net/master.

    The patch by Gerhard Bertelsmann fixes the CAN_CTRLMODE_LOOPBACK in the
    sun4i driver. Two patches by Jimmy Assarsson for the kvaser_usb driver
    fix a print in the error path of the kvaser_usb_close() and remove a
    wrong warning message with the Leaf v2 firmware version v4.1.844.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • This patch replaces GFP_KERNEL by GFP_ATOMIC to avoid sleeping in the
    ndo_set_rx_mode() call which is called with BH disabled.

    Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
    Signed-off-by: Antoine Tenart
    Signed-off-by: David S. Miller

    Antoine Tenart
     
  • When calling mvpp2_prs_mac_multi_set() from mvpp2_prs_mac_init(), two
    parameters (the port index and the table index) are inverted. Fix
    this.

    Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
    Signed-off-by: Antoine Tenart
    Signed-off-by: David S. Miller

    Antoine Tenart
     
  • This patch fixes a typo in the mvpp2_prs_tcam_data_cmp() function, as
    the shift value is inverted with the data.

    Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
    Signed-off-by: Antoine Tenart
    Signed-off-by: David S. Miller

    Antoine Tenart
     
  • Previously, a tc with ets type and zero bandwidth was not accepted
    by the driver. This behavior does not follow the IEEE802.1qaz spec.

    If there are tcs with ets type and zero bandwidth, these tcs are
    assigned to the lowest priority tc_group #0. We equally distribute
    100% bw of the tc_group #0 to these zero bandwidth ets tcs.
    Also, the non zero bandwidth ets tcs are assigned to tc_group #1.

    If there is no zero bandwidth ets tc, the non zero bandwidth ets tcs
    are assigned to tc_group #0.

    Fixes: cdcf11212b22 ("net/mlx5e: Validate BW weight values of ETS")
    Signed-off-by: Huy Nguyen
    Reviewed-by: Parav Pandit
    Signed-off-by: Saeed Mahameed

    Huy Nguyen
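
The grouping rules in the commit above can be written out as a small function. It handles ETS-type tcs only, and the way the rounding remainder is assigned (to the last zero-bandwidth tc) is an assumption, since the commit message only says the 100% is distributed equally. `assign_ets_groups()` is a hypothetical name.

```python
def assign_ets_groups(ets_bw):
    """Map ETS tcs to (tc_group, bandwidth %) tuples.

    ets_bw: requested bandwidth percentages for ETS-type tcs only.
    """
    zero = [i for i, bw in enumerate(ets_bw) if bw == 0]
    if not zero:
        # No zero-bandwidth tc: everything stays in tc_group 0.
        return [(0, bw) for bw in ets_bw]
    result = [None] * len(ets_bw)
    share, remainder = divmod(100, len(zero))
    for n, i in enumerate(zero):
        # Assumption: the last zero-bandwidth tc absorbs the remainder.
        extra = remainder if n == len(zero) - 1 else 0
        result[i] = (0, share + extra)
    for i, bw in enumerate(ets_bw):
        if bw != 0:
            result[i] = (1, bw)
    return result
```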
     
  • Currently, the encap action offload is handled in the actions parse
    function and not in mlx5e_tc_add_fdb_flow() where we deal with all
    the other aspects of offloading actions (vlan, modify header) and
    the rule itself.

    When the neigh update code (mlx5e_tc_encap_flows_add()) recreates the
    encap entry and offloads the related flows, we wrongly call again into
    mlx5e_tc_add_fdb_flow(). This by itself causes us to handle the
    offloading of vlans and header re-write a second time, which puts
    things in an inconsistent state and steps on freed memory (e.g. the
    modify header parse buffer, which is already freed).

    Since on error, mlx5e_tc_add_fdb_flow() detaches and may release the
    encap entry, it causes a corruption at the neigh update code which goes
    over the list of flows associated with this encap entry, or double free
    when the tc flow is later deleted by user-space.

    When neigh update (mlx5e_tc_encap_flows_del()) unoffloads the flows related
    to an encap entry which is now invalid, we do a partial repeat of the eswitch
    flow removal code which is wrong too.

    To fix things up we do the following:

    (1) handle the encap action offload in the eswitch flow add function
    mlx5e_tc_add_fdb_flow() as done for the other actions and the rule itself.

    (2) modify the neigh update code (mlx5e_tc_encap_flows_add/del) to only
    deal with the encap entry and rules delete/add and not with any of
    the other offloaded actions.

    Fixes: 232c001398ae ('net/mlx5e: Add support to neighbour update flow')
    Signed-off-by: Or Gerlitz
    Reviewed-by: Paul Blakey
    Signed-off-by: Saeed Mahameed

    Or Gerlitz
     
  • mlx5_ib_add is called during mlx5_pci_resume after a pci error.
    Before mlx5_ib_add completes, there are multiple events which trigger
    function mlx5_ib_event. This causes a kernel panic because mlx5_ib_event
    accesses uninitialized resources.

    The fix is to extend Erez Shitrit's patch
    ("net/mlx5: Delay events till ib registration ends") to cover
    the pci resume code path.

    Trace:
    mlx5_core 0001:01:00.6: mlx5_pci_resume was called
    mlx5_core 0001:01:00.6: firmware version: 16.20.1011
    mlx5_core 0001:01:00.6: mlx5_attach_interface:164:(pid 779):
    mlx5_ib_event:2996:(pid 34777): warning: event on port 1
    mlx5_ib_event:2996:(pid 34782): warning: event on port 1
    Unable to handle kernel paging request for data at address 0x0001c104
    Faulting instruction address: 0xd000000008f411fc
    Oops: Kernel access of bad area, sig: 11 [#1]
    ...
    ...
    Call Trace:
    [c000000fff77bb70] [d000000008f4119c] mlx5_ib_event+0x64/0x470 [mlx5_ib] (unreliable)
    [c000000fff77bc60] [d000000008e67130] mlx5_core_event+0xb8/0x210 [mlx5_core]
    [c000000fff77bd10] [d000000008e4bd00] mlx5_eq_int+0x528/0x860[mlx5_core]

    Fixes: 97834eba7c19 ("net/mlx5: Delay events till ib registration ends")
    Signed-off-by: Huy Nguyen
    Reviewed-by: Saeed Mahameed
    Signed-off-by: Saeed Mahameed

    Huy Nguyen
     
  • spin_lock/unlock of health->wq_lock should be IRQ safe.
    It was changed to spin_lock_irqsave when commit 0179720d6be2
    ("net/mlx5: Introduce trigger_health_work function") was added, which
    uses the spin_lock from asynchronous event (IRQ) context.
    Thus, all spin_lock/unlock of health->wq_lock should have been moved
    to IRQ safe mode.
    However, one occurrence in new code using this lock missed that
    change, resulting in a possible deadlock:
    kernel: Possible unsafe locking scenario:
    kernel: CPU0
    kernel: ----
    kernel: lock(&(&health->wq_lock)->rlock);
    kernel:
    kernel: lock(&(&health->wq_lock)->rlock);
    kernel: #012 *** DEADLOCK ***

    Fixes: 2a0165a034ac ("net/mlx5: Cancel delayed recovery work when unloading the driver")
    Signed-off-by: Moshe Shemesh
    Signed-off-by: Saeed Mahameed

    Moshe Shemesh