13 Mar, 2020
20 commits
-
The PCI shutdown handler is invoked in response
to system reboot or shutdown. A data transfer
might still be in flight when this happens. So
the very first action we take here is to send
a link down notification, so that any pending
data transfer is terminated. Rest of the actions
are same as that of PCI remove handler.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
When the driver on the local side is loaded, it sets
SIDE_READY bit in SIDE_INFO register. Likewise, when
it is un-loaded, it clears the bit.Also just after being loaded, the driver polls for
peer SIDE_READY bit to be set. Since that bit is set
when the peer side driver has loaded, the polling on
local side breaks as soon as this condition is met.But the situation is different when the driver is
un-loaded. Since the polling has already been stopped
as mentioned before, if the peer side driver gets
un-loaded, the driver on the local side is not notified
implicitly.So, we improvise using existing doorbell mechanism.
We reserve the highest order bit of the DB register to
send a notification to peer when the driver on local
side is un-loaded. This also means that now we are one
short of 16 DB events and that is taken care of in the
valid DB mask.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
db_valid_mask is set at two places, once within
amd_init_ntb(), and again within amd_init_dev().
Since amd_init_ntb() is actually called from
amd_init_dev(), setting db_valid_mask from
former does not really make sense. So remove it.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
Since NTB connects two physically separate systems,
there can be scenarios where one system goes down
while the other one remains active. In case of NTB
primary, if the NTB secondary goes down, a Link-Down
event is received. For the NTB secondary, if the
NTB primary goes down, the PCIe hotplug mechanism
ensures that the driver on the secondary side is also
unloaded.But there are other scenarios to consider as well,
when suppose the physical link remains active, but
the driver on primary or secondary side is loaded
or un-loaded.When the driver is loaded, on either side, it sets
SIDE_READY bit(bit-1) of SIDE_INFO register. Similarly,
when the driver is un-loaded, it resets the same bit.We consider the NTB link to be up and operational
only when the driver on both sides of link are loaded
and ready. But we also need to take account of
Link Up and Down events which signify the physical
link status. So amd_link_is_up() is modified to take
care of the above scenarios.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
We define two new helper functions to set and clear
sideinfo registers respectively. These functions
take an additional boolean parameter which signifies
whether we want to set/clear the sideinfo register
of the peer(true) or local host(false).Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
It does not really make sense to enable or disable
the bits of NTB_CTRL register only during enable
and disable link callbacks. They should be done
independent of these callbacks. The correct placement
for that is during the amd_init_side_info() and
amd_deinit_side_info() functions, which are invoked
during probe and remove respectively.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
Just like for Link-Down event, Link-Up and D3 events
are also mutually exclusive to Link-Down and D0 events
respectively. So we clear the bitmasks in peer_sta
depending on event type.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
Link-Up and Link-Down are mutually exclusive events.
So when we receive a Link-Down event, we should also
clear the bitmask for Link-Up event in peer_sta.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
amd_link_is_up() is a callback to inquire whether
the NTB link is up or not. So it should not indulge
itself into clearing the bitmasks of peer_sta.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
amd_ack_smu() should only set the corresponding
bits into SMUACK register. Setting the bitmask
of peer_sta should be done within the event handler.
They are two different things, and so should be
handled differently and at different places.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
Bit 1 of SIDE_INFO register is an indication that
the driver on the other side of link is ready. We
set this bit during driver initialization sequence.
So rather than having separate macros to return the
status, we can simply return the status of this bit
from amd_poll_link(). So a return of 1 or 0 from
this function will indicate to the caller whether
the driver on the other side of link is ready or not,
respectively.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
Since getting the status of link is a logically separate
operation, we simply create a new function which will
store the link status to be used later.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
Link-Up and Link-Down events can occur irrespective
of whether a data transfer is in progress or not.
So we need to enable the interrupt delivery for
these events early during driver load.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
The interrupt status register should be cleared
by driver once the particular event is handled.
The patch fixes this.Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
The design of AMD NTB implementation is such that
NTB primary acts as an endpoint device and NTB
secondary is an endpoint device behind a combination
of Switch Upstream and Switch Downstream. Considering
that, the link status and control register needs to
be accessed differently based on the NTB topology.So in the case of NTB secondary, we first get the
pointer to the Switch Downstream device for the NTB
device. Then we get the pointer to the Switch Upstream
device. Once we have that, we read the Link Status
and Control register to get the correct status of
link at the secondary.In the case of NTB primary, simply reading the Link
Status and Control register of the NTB device itself
will suffice.Suggested-by: Jiasen Lin
Signed-off-by: Arindam Nath
Signed-off-by: Jon Mason -
Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit. Fix it by replacing with scnprintf().Fixes: fce8a7bb5b4b (PCI-Express Non-Transparent Bridge Support)
Fixes: 282a2feeb9bf (NTB: Use DMA Engine to Transmit and Receive)
Fixes: a754a8fcaf38 (NTB: allocate number transport entries depending on size of ring size)
Fixes: d98ef99e378b (NTB: Clean up QP stats info)
Fixes: e74bfeedad08 (NTB: Add flow control to the ntb_netdev)
Fixes: 569410ca756c (NTB: Use unique DMA channels for TX and RX)
Signed-off-by: Takashi Iwai
Reviewed-by: Logan Gunthorpe
Signed-off-by: Jon Mason -
ntb_mw_set_trans() should work as ntb_mw_clear_trans() when size == 0 and/or
addr == 0. But error in xlate_pos checking condition prevents this.
Fix the condition to make ntb_mw_clear_trans() working.Fixes: 87d11e645e31 (NTB: switchtec_ntb: Add memory window support)
Signed-off-by: Alexander Fomichev
Reviewed-by: Logan Gunthorpe
Signed-off-by: Jon Mason -
The correct printk format is %pa or %pap, but not %pa[p].
Fixes: 7f46c8b3a5523 ("NTB: ntb_tool: Add full multi-port NTB API support")
Signed-off-by: Helge Deller
Signed-off-by: Jon Mason -
peer->outbuf is a virtual address which is get by ioremap, it can not
be converted to a physical address by virt_to_page and page_to_phys.
This conversion will result in DMA error, because the destination address
which is converted by page_to_phys is invalid.This patch save the MMIO address of NTB BARx in perf_setup_peer_mw,
and map the BAR space to DMA address after we assign the DMA channel.
Then fill the destination address of DMA descriptor with this DMA address
to guarantee that the address of memory write requests fall into
memory window of NBT BARx with IOMMU enabled and disabled.Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Jiasen Lin
Reviewed-by: Logan Gunthorpe
Signed-off-by: Jon Mason -
The offset of PCIe Capability Header for AMD and HYGON NTB is 0x64,
but the macro which named "AMD_LINK_STATUS_OFFSET" is defined as 0x68.
It is offset of Device Capabilities Reg rather than Link Control Reg.This code trigger an error in get link statsus:
cat /sys/kernel/debug/ntb_hw_amd/0000:43:00.1/info
LNK STA - 0x8fa1
Link Status - Up
Link Speed - PCI-E Gen 0
Link Width - x0This patch use pcie_capability_read_dword to get link status.
After fix this issue, we can get link status accurately:cat /sys/kernel/debug/ntb_hw_amd/0000:43:00.1/info
LNK STA - 0x11030042
Link Status - Up
Link Speed - PCI-E Gen 3
Link Width - x16Fixes: a1b3695820aa4 ("NTB: Add support for AMD PCI-Express Non-Transparent Bridge")
Signed-off-by: Jiasen Lin
Signed-off-by: Jon Mason
08 Dec, 2019
2 commits
-
Pull NTB update from Jon Mason:
"Just a simple patch to add a new Hygon Device ID to the AMD NTB device
driver"* tag 'ntb-5.5' of git://github.com/jonmason/ntb:
NTB: Add Hygon Device ID -
Signed-off-by: Jiasen Lin
Signed-off-by: Jon Mason
16 Oct, 2019
1 commit
-
There is no need to check the return value of debugfs_create_atomic_t as
nothing happens with the error. Also, the code will never return NULL,
so this check has never caught anything :)Fix this by removing the check entirely.
Cc: Jon Mason
Cc: Dave Jiang
Cc: Allen Hubbe
Cc: linux-ntb@googlegroups.com
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/20191011131919.GA1174815@kroah.com
Signed-off-by: Greg Kroah-Hartman
24 Sep, 2019
6 commits
-
Fix typos in drivers/ntb/hw/idt/Kconfig.
Use consistent spelling and capitalization.Fixes: bf2a952d31d2 ("NTB: Add IDT 89HPESxNTx PCIe-switches support")
Signed-off-by: Randy Dunlap
Cc: Dave Jiang
Cc: Allen Hubbe
Cc: Serge Semin
Signed-off-by: Jon Mason -
The AMD new hardware uses BAR23 and BAR45 as memory windows
as compared to previos where BAR1, BAR23 and BAR45 is used
for memory windows.This patch add support for both AMD hardwares.
Signed-off-by: Sanjay R Mehta
Signed-off-by: Jon Mason -
Signed-off-by: Sanjay R Mehta
Signed-off-by: Jon Mason -
Variable rc is initialized to a value that is never read and it
is re-assigned later. The initialization is redundant and can be
removed.Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King
Signed-off-by: Jon Mason -
On switchtec_ntb_mw_set_trans() call, when (only) address == 0, it acts as
ntb_mw_clear_trans(). Fix this, since address == 0 and size != 0 is valid
combination for setting translation.Signed-off-by: Alexander Fomichev
Reviewed-by: Logan Gunthorpe
Signed-off-by: Jon Mason -
second parameter of ntb_peer_mw_get_addr is pointing to wrong memory
window index by passing "peer gidx" instead of "local gidx".For ex, "local gidx" value is '0' and "peer gidx" value is '1', then
on peer side ntb_mw_set_trans() api is used as below with gidx pointing to
local side gidx which is '0', so memroy window '0' is chosen and XLAT '0'
will be programmed by peer side.ntb_mw_set_trans(perf->ntb, peer->pidx, peer->gidx, peer->inbuf_xlat,
peer->inbuf_size);Now, on local side ntb_peer_mw_get_addr() is been used as below with gidx
pointing to "peer gidx" which is '1', so pointing to memory window '1'
instead of memory window '0'.ntb_peer_mw_get_addr(perf->ntb, peer->gidx, &phys_addr,
&peer->outbuf_size);So this patch pass "local gidx" as parameter to ntb_peer_mw_get_addr().
Signed-off-by: Sanjay R Mehta
Signed-off-by: Jon Mason
06 Aug, 2019
1 commit
-
msi.c is not a module on its own right and should not have the
MODULE_[LICENSE|VERSION|AUTHOR|DESCRIPTION] definitions.This caused a regression noticed by lkp with the following back
trace:WARNING: CPU: 0 PID: 1 at kernel/params.c:861 param_sysfs_init+0xb1/0x20a
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc1-00018-g26b3a37b928457 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
RIP: 0010:param_sysfs_init+0xb1/0x20a
Code: 24 38 e8 ec 17 2e fd 49 8b 7c 24 38 e8 76 fe ff ff 48 85 c0 48 89 c5 74 25 31 d2 4c 89 e6 48 89 c7 e8 6d 6f 3c fd 85 c0 74 02 0b 48 89 ef 31 f6 e8 5d 70 a7 fe 48 89 ef e8 95 52 a7 fe 48 83
RSP: 0000:ffff88806b0ffe30 EFLAGS: 00010282
RAX: 00000000ffffffef RBX: ffffffff83774220 RCX: ffff88806a85e880
RDX: 00000000ffffffef RSI: ffff88806b000400 RDI: ffff88806a8608c0
RBP: ffff88806b392000 R08: ffffed100d61ff59 R09: ffffed100d61ff59
R10: 0000000000000001 R11: ffffed100d61ff58 R12: ffffffff83974bc0
R13: 0000000000000004 R14: 0000000000000028 R15: 00000000000003b9
FS: 0000000000000000(0000) GS:ffff88806b800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000380e000 CR4: 00000000000406b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? file_caps_disable+0x10/0x10
? locate_module_kobject+0xf2/0xf2
do_one_initcall+0x47/0x1f0
kernel_init_freeable+0x1b1/0x243
? rest_init+0xd0/0xd0
kernel_init+0xa/0x130
? calculate_sigpending+0x63/0x80
? rest_init+0xd0/0xd0
ret_from_fork+0x1f/0x30
---[ end trace 78201497ae74cc91 ]---Reported-by: kernel test robot
Fixes: 26b3a37b9284 ("NTB: Introduce MSI library")
Signed-off-by: Logan Gunthorpe
Signed-off-by: Jon Mason
22 Jul, 2019
1 commit
-
Pull NTB updates from Jon Mason:
"New feature to add support for NTB virtual MSI interrupts, the ability
to test and use this feature in the NTB transport layer.Also, bug fixes for the AMD and Switchtec drivers, as well as some
general patches"* tag 'ntb-5.3' of git://github.com/jonmason/ntb: (22 commits)
NTB: Describe the ntb_msi_test client in the documentation.
NTB: Add MSI interrupt support to ntb_transport
NTB: Add ntb_msi_test support to ntb_test
NTB: Introduce NTB MSI Test Client
NTB: Introduce MSI library
NTB: Rename ntb.c to support multiple source files in the module
NTB: Introduce functions to calculate multi-port resource index
NTB: Introduce helper functions to calculate logical port number
PCI/switchtec: Add module parameter to request more interrupts
PCI/MSI: Support allocating virtual MSI interrupts
ntb_hw_switchtec: Fix setup MW with failure bug
ntb_hw_switchtec: Skip unnecessary re-setup of shared memory window for crosslink case
ntb_hw_switchtec: Remove redundant steps of switchtec_ntb_reinit_peer() function
NTB: correct ntb_dev_ops and ntb_dev comment typos
NTB: amd: Silence shift wrapping warning in amd_ntb_db_vector_mask()
ntb_hw_switchtec: potential shift wrapping bug in switchtec_ntb_init_sndev()
NTB: ntb_transport: Ensure qp->tx_mw_dma_addr is initaliazed
NTB: ntb_hw_amd: set peer limit register
NTB: ntb_perf: Clear stale values in doorbell and command SPAD register
NTB: ntb_perf: Disable NTB link after clearing peer XLAT registers
...
13 Jun, 2019
9 commits
-
Introduce the module parameter 'use_msi' which, when set, uses
MSI interrupts instead of doorbells for each queue pair (QP). The
parameter is only available if NTB MSI support is configured into
the kernel. We also require there to be more than one memory window
(MW) so that an extra one is available to forward the APIC region.To use MSIs, we request one interrupt per QP and forward the MSI address
and data to the peer using scratch pad registers (SPADS) above the MW
SPADS. (If there are not enough SPADS the MSI interrupt will not be used.)Once registered, we simply use ntb_msi_peer_trigger and the receiving
ISR simply queues up the rxc_db_work for the queue.This addition can significantly improve performance of ntb_transport.
In a simple, untuned, apples-to-apples comparision using ntb_netdev
and iperf with switchtec hardware, I see 3.88Gb/s without MSI
interrupts and 14.1Gb/s wit MSI, which is a more than 3x improvement.Signed-off-by: Logan Gunthorpe
Cc: Dave Jiang
Cc: Allen Hubbe
Signed-off-by: Jon Mason -
Introduce a tool to test NTB MSI interrupts similar to the other
NTB test tools. This tool creates a debugfs directory for each
NTB device with the following files:port
irqX_occurrences
peerX/port
peerX/count
peerX/triggerThe 'port' file tells the user the local port number and the
'occurrences' files tell the number of local interrupts that
have been received for each interrupt.For each peer, the 'port' file and the 'count' file tell you the
peer's port number and number of interrupts respectively. Writing
the interrupt number to the 'trigger' file triggers the interrupt
handler for the peer which should increment their corresponding
'occurrences' file. The 'ready' file indicates if a peer is ready,
writing to this file blocks until it is ready.The module parameter num_irqs can be used to set the number of
local interrupts. By default this is 4. This is only limited by
the number of unused MSI interrupts registered by the hardware
(this will require support of the hardware driver) and there must
be at least 2*num_irqs + 1 spads registers available.Signed-off-by: Logan Gunthorpe
Cc: Dave Jiang
Cc: Allen Hubbe
Signed-off-by: Jon Mason -
The NTB MSI library allows passing MSI interrupts across a memory
window. This offers similar functionality to doorbells or messages
except will often have much better latency and the client can
potentially use significantly more remote interrupts than typical hardware
provides for doorbells. (Which can be important in high-multiport
setups.)The library utilizes one memory window per peer and uses the highest
index memory windows. Before any ntb_msi function may be used, the user
must call ntb_msi_init(). It may then setup and tear down the memory
windows when the link state changes using ntb_msi_setup_mws() and
ntb_msi_clear_mws().The peer which receives the interrupt must call ntb_msim_request_irq()
to assign the interrupt handler (this function is functionally
similar to devm_request_irq()) and the returned descriptor must be
transferred to the peer which can use it to trigger the interrupt.
The triggering peer, once having received the descriptor, can
trigger the interrupt by calling ntb_msi_peer_trigger().Signed-off-by: Logan Gunthorpe
Cc: Dave Jiang
Cc: Allen Hubbe
Signed-off-by: Jon Mason -
The kbuild system does not support having multiple source files in
a module if one of those source files has the same name as the module.Therefore, we must rename ntb.c to core.c, while the module remains
ntb.ko.This is similar to the way the nvme modules are structured.
Signed-off-by: Logan Gunthorpe
Cc: Dave Jiang
Cc: Allen Hubbe
Signed-off-by: Jon Mason -
Switchtec does not support setting multiple MWs simultaneously. The
driver takes a hardware lock to ensure that two peers are not doing this
simultaneously and it fails if someone else takes the lock. In most
cases, this is fine as clients only setup the MWs once on one side of
the link.However, there's a race condition when a re-initialization is caused by
a link event. The driver will re-setup the shared memory window
asynchronously and this races with the client setting up it's memory
windows on the link up event.To fix this we ensure do the entire initialization in a work queue and
signal the client once it's done.Signed-off-by: Joey Zhang
Signed-off-by: Wesley Sheng
Signed-off-by: Jon Mason -
In case of NTB crosslink topology, the setting of shared memory window in
the virtual partition doesn't reset on peer's reboot. So skip the
unnecessary re-setup of shared memory window for that case.Signed-off-by: Wesley Sheng
Signed-off-by: Jon Mason -
When a re-initialization is caused by a link event, the driver will
re-setup the shared memory window. But at that time, the shared memory
is still valid, and it's unnecessary to free, reallocate and then
initialize it again. We only need to reconfigure the hardware
registers. Remove the redundant steps from
switchtec_ntb_reinit_peer() function.Signed-off-by: Joey Zhang
Signed-off-by: Wesley Sheng
Reviewed-by: Logan Gunthorpe
Signed-off-by: Jon Mason -
This code triggers a Smatch warning:
drivers/ntb/hw/amd/ntb_hw_amd.c:336 amd_ntb_db_vector_mask()
warn: should '(1 << db_vector)' be a 64 bit type?I don't think "db_vector" can be higher than 16 so this doesn't affect
runtime, but it's nice to silence the static checker warning and we
might increase "ndev->db_count" in the future.Signed-off-by: Dan Carpenter
Acked-by: Shyam Sundar S K
Signed-off-by: Jon Mason -
This code triggers a Smatch warning:
drivers/ntb/hw/mscc/ntb_hw_switchtec.c:884 switchtec_ntb_init_sndev()
warn: should '(1 << sndev->peer_partition)' be a 64 bit type?The "part_map" and "tpart_vec" variables are u64 type so this seems like
a valid warning.Fixes: 3df54c870f52 ("ntb_hw_switchtec: Allow using Switchtec NTB in multi-partition setups")
Signed-off-by: Dan Carpenter
Reviewed-by: Logan Gunthorpe
Signed-off-by: Jon Mason