Documentation/MSI-HOWTO.txt 22.6 KB
  		The MSI Driver Guide HOWTO
  	Tom L Nguyen tom.l.nguyen@intel.com
  			10/03/2003
  	Revised Feb 12, 2004 by Martine Silbermann
  		email: Martine.Silbermann@hp.com
  	Revised Jun 25, 2004 by Tom L Nguyen
  
  1. About this guide
  
  This guide describes the basics of Message Signaled Interrupts (MSI),
  the advantages of using MSI over traditional interrupt mechanisms,
  and how to enable your driver to use MSI or MSI-X. Also included is
  a Frequently Asked Questions (FAQ) section.
  
  1.1 Terminology
  
  PCI devices can be single-function or multi-function.  In either case,
  when this text talks about enabling or disabling MSI on a "device
  function," it is referring to one specific PCI device and function and
  not to all functions on a PCI device (unless the PCI device has only
  one function).
  
  2. Copyright 2003 Intel Corporation
  
  3. What is MSI/MSI-X?
  
  Message Signaled Interrupt (MSI), as described in the PCI Local Bus
  Specification Revision 2.3 or later, is an optional feature, and a
  required feature for PCI Express devices. MSI enables a device function
  to request service by sending an Inbound Memory Write on its PCI bus to
  the FSB as a Message Signaled Interrupt transaction. Because MSI is
  generated in the form of a Memory Write, all transaction conditions,
  such as a Retry, Master-Abort, Target-Abort or normal completion, are
  supported.
  
  A PCI device that supports MSI must also support the pin IRQ assertion
  interrupt mechanism to provide backward compatibility for systems that
  do not support MSI. In systems which support MSI, the bus driver is
  responsible for initializing the message address and message data of
  the device function's MSI/MSI-X capability structure during device
  initial configuration.
  
  An MSI capable device function indicates MSI support by implementing
  the MSI/MSI-X capability structure in its PCI capability list. The
  device function may implement both the MSI capability structure and
  the MSI-X capability structure; however, the bus driver should not
  enable both.
  
  The MSI capability structure contains the Message Control register,
  Message Address register and Message Data register. These registers
  provide the bus driver control over MSI. The Message Control register
  indicates the MSI capability supported by the device. The Message
  Address register specifies the target address and the Message Data
  register specifies the characteristics of the message. To request
  service, the device function writes the content of the Message Data
  register to the target address. The device and its software driver
  are prohibited from writing to these registers.
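  As an illustration of what the Message Control register conveys, the
  following is a minimal sketch that decodes its fields from a 16-bit
  value. The bit positions follow the PCI specification; the macro,
  struct and function names are made up for this example and are not
  the kernel's own identifiers. The sketch only reads the value, in
  keeping with the rule that a driver must not write these registers.

```c
#include <stdint.h>

/* Bit layout of the 16-bit MSI Message Control register. */
#define MSI_CTRL_ENABLE		0x0001u	/* bit 0: MSI enabled */
#define MSI_CTRL_MMC_MASK	0x000eu	/* bits 3:1: log2(vectors requested) */
#define MSI_CTRL_MME_MASK	0x0070u	/* bits 6:4: log2(vectors granted) */
#define MSI_CTRL_64BIT		0x0080u	/* bit 7: 64-bit address capable */
#define MSI_CTRL_MASKBIT	0x0100u	/* bit 8: per-vector masking capable */

/* Hypothetical container for the decoded fields. */
struct msi_ctrl_info {
	int enabled;
	unsigned int vectors_capable;
	unsigned int vectors_enabled;
	int addr64;
	int per_vector_mask;
};

/* Decode a raw Message Control value; nothing is written back. */
static struct msi_ctrl_info msi_ctrl_decode(uint16_t ctrl)
{
	struct msi_ctrl_info info;

	info.enabled = (ctrl & MSI_CTRL_ENABLE) != 0;
	info.vectors_capable = 1u << ((ctrl & MSI_CTRL_MMC_MASK) >> 1);
	info.vectors_enabled = 1u << ((ctrl & MSI_CTRL_MME_MASK) >> 4);
	info.addr64 = (ctrl & MSI_CTRL_64BIT) != 0;
	info.per_vector_mask = (ctrl & MSI_CTRL_MASKBIT) != 0;
	return info;
}
```

  For example, a value of 0x0185 describes a function with MSI enabled,
  capable of 4 vectors but granted 1, with 64-bit addressing and
  per-vector masking available.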
  
  The MSI-X capability structure is an optional extension to MSI. It
  uses an independent and separate capability structure. There are
  some key advantages to implementing the MSI-X capability structure
  over the MSI capability structure as described below.
  
  	- Support a larger maximum number of vectors per function.
  
  	- Provide the ability for system software to configure
  	each vector with an independent message address and message
  	data, specified by a table that resides in Memory Space.
  
  	- MSI and MSI-X both support per-vector masking. Per-vector
  	masking is an optional extension of MSI but a required
  	feature for MSI-X. Per-vector masking provides the kernel the
  	ability to mask/unmask a single MSI while running its
  	interrupt service routine. If per-vector masking is
  	not supported, then the device driver should provide the
  	hardware/software synchronization to ensure that the device
  	generates MSI when the driver wants it to do so.
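  To make the per-vector table concrete, here is a small sketch of one
  MSI-X table entry as it is laid out in Memory Space (four 32-bit
  words, with the mask in bit 0 of the last word). The struct and
  helper names are hypothetical. The table itself is owned by the PCI
  subsystem, so this is only a model of the layout, not something a
  driver should access directly.

```c
#include <stddef.h>
#include <stdint.h>

/* One MSI-X table entry: 16 bytes per vector. */
struct msix_table_entry {
	uint32_t msg_addr_lo;	/* message address, lower 32 bits */
	uint32_t msg_addr_hi;	/* message address, upper 32 bits */
	uint32_t msg_data;	/* message data */
	uint32_t vector_ctrl;	/* bit 0: per-vector mask */
};

#define MSIX_ENTRY_CTRL_MASKBIT	0x1u

/* Byte offset of entry 'index' from the start of the MSI-X table. */
static size_t msix_entry_offset(unsigned int index)
{
	return (size_t)index * sizeof(struct msix_table_entry);
}

/* Return the vector control word with the mask bit set or cleared. */
static uint32_t msix_set_mask(uint32_t vector_ctrl, int masked)
{
	if (masked)
		return vector_ctrl | MSIX_ENTRY_CTRL_MASKBIT;
	return vector_ctrl & ~MSIX_ENTRY_CTRL_MASKBIT;
}
```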
  
  4. Why use MSI?
  As a benefit to the simplification of board design, MSI allows board
  designers to remove out-of-band interrupt routing. MSI is another
  step towards a legacy-free environment.
  
  Due to increasing pressure on chipset and processor packages to
  reduce pin count, the need for interrupt pins is expected to
  diminish over time. Devices, due to pin constraints, may implement
  messages to increase performance.
  
  PCI Express endpoints use INTx emulation (in-band messages) instead
  of IRQ pin assertion. Using INTx emulation requires interrupt
  sharing among devices connected to the same node (PCI bridge) while
  MSI is unique (non-shared) and does not require BIOS configuration
  support. As a result, the PCI Express technology requires MSI
  support for better interrupt performance.
  
  Using MSI enables the device functions to support two or more
  vectors, which can be configured to target different CPUs to
  increase scalability.
  
  5. Configuring a driver to use MSI/MSI-X
  
  By default, the kernel will not enable MSI/MSI-X on all devices that
  support this capability. The CONFIG_PCI_MSI kernel option
  must be selected to enable MSI/MSI-X support.
  
  5.1 Including MSI/MSI-X support into the kernel
  
  To allow MSI/MSI-X capable device drivers to selectively enable
  MSI/MSI-X (using pci_enable_msi()/pci_enable_msix() as described
  below), the VECTOR based scheme needs to be enabled by setting
  CONFIG_PCI_MSI during kernel config.
  
  Since the target of the inbound message is the local APIC,
  CONFIG_X86_LOCAL_APIC must be enabled as well as CONFIG_PCI_MSI.
  
  5.2 Configuring for MSI support
  
  Due to the non-contiguous fashion in vector assignment of the
  existing Linux kernel, this version does not support multiple
  messages regardless of whether a device function is capable of
  supporting more than one vector. Enabling MSI on a device function's
  MSI capability structure requires a device driver to call the
  function pci_enable_msi() explicitly.
  
  5.2.1 API pci_enable_msi
  
  int pci_enable_msi(struct pci_dev *dev)
  With this new API, a device driver that wants to have MSI
  enabled on its device function must call this API to enable MSI.
  A successful call will initialize the MSI capability structure
  with ONE vector, regardless of whether a device function is
  capable of supporting multiple messages. This vector replaces the
  pre-assigned dev->irq with a new MSI vector. To avoid a conflict
  between the newly assigned vector and the existing pre-assigned vector,
  a device driver must call this API before calling request_irq().
  
  5.2.2 API pci_disable_msi
  
  void pci_disable_msi(struct pci_dev *dev)
  
  This API should always be used to undo the effect of pci_enable_msi()
  when a device driver is unloading. This API restores dev->irq with
  the pre-assigned IOAPIC vector and switches a device's interrupt
  mode to PCI pin-irq assertion/INTx emulation mode.
  Note that a device driver should always call free_irq() on the MSI vector
  that it has done request_irq() on before calling this API. Failure to do
  so results in a BUG_ON(), and the device will be left with MSI enabled
  and will leak its vector.
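  The call ordering required by pci_enable_msi() and pci_disable_msi()
  can be sketched as follows. This is a model that builds outside the
  kernel: the struct pci_dev, request_irq(), free_irq() and the two MSI
  calls below are simplified stand-ins written for this example (the
  vector number 193 and the 'legacy_irq' field are arbitrary), and
  my_driver, my_isr, my_setup_irq and my_teardown_irq are hypothetical
  names. Only the ordering is the point.

```c
/* ---- Stand-ins so the sketch builds outside the kernel (assumptions) ---- */
typedef int irqreturn_t;
#define IRQ_HANDLED 1
typedef irqreturn_t (*irq_handler_t)(int irq, void *dev_id);

struct pci_dev {
	unsigned int irq;		/* vector passed to request_irq() */
	unsigned int legacy_irq;	/* model only: saved pre-assigned vector */
};

static int pci_enable_msi(struct pci_dev *dev)
{
	dev->legacy_irq = dev->irq;	/* PCI subsystem saves the old vector */
	dev->irq = 193;			/* arbitrary example MSI vector */
	return 0;
}

static void pci_disable_msi(struct pci_dev *dev)
{
	dev->irq = dev->legacy_irq;	/* restore the pre-assigned vector */
}

static int request_irq(unsigned int irq, irq_handler_t handler,
		       unsigned long flags, const char *name, void *dev_id)
{
	(void)irq; (void)handler; (void)flags; (void)name; (void)dev_id;
	return 0;
}

static void free_irq(unsigned int irq, void *dev_id)
{
	(void)irq; (void)dev_id;
}

/* ---- Driver-side ordering ---- */
struct my_driver {
	struct pci_dev *pdev;
	int using_msi;
};

static irqreturn_t my_isr(int irq, void *dev_id)
{
	(void)irq; (void)dev_id;
	return IRQ_HANDLED;
}

static int my_setup_irq(struct my_driver *drv)
{
	int err;

	/* Must come before request_irq(): success rewrites pdev->irq. */
	drv->using_msi = (pci_enable_msi(drv->pdev) == 0);

	err = request_irq(drv->pdev->irq, my_isr, 0, "my_driver", drv);
	if (err && drv->using_msi) {
		pci_disable_msi(drv->pdev);
		drv->using_msi = 0;
	}
	return err;
}

static void my_teardown_irq(struct my_driver *drv)
{
	/* free_irq() first; only then hand the MSI vector back. */
	free_irq(drv->pdev->irq, drv);
	if (drv->using_msi) {
		pci_disable_msi(drv->pdev);
		drv->using_msi = 0;
	}
}
```

  If pci_enable_msi() fails, dev->irq still holds the legacy vector, so
  the same request_irq() call falls back to pin-IRQ mode. A real driver
  sharing a legacy line would also need to pass the shared-interrupt
  flag, which this sketch omits.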
  
  5.2.3 MSI mode vs. legacy mode diagram
  The below diagram shows the events which switch the interrupt
  mode on the MSI-capable device function between MSI mode and
  PIN-IRQ assertion mode.
  
  	 ------------   pci_enable_msi 	 ------------------------
  	|	     | <===============	| 			 |
  	| MSI MODE   |	  	     	| PIN-IRQ ASSERTION MODE |
  	| 	     | ===============>	|			 |
   	 ------------	pci_disable_msi  ------------------------
  Figure 1. MSI Mode vs. Legacy Mode

  In Figure 1, a device operates by default in legacy mode. Legacy
  in this context means PCI pin-irq assertion or PCI-Express INTx
  emulation. A successful MSI request (using pci_enable_msi()) switches
  a device's interrupt mode to MSI mode. A pre-assigned IOAPIC vector
  stored in dev->irq will be saved by the PCI subsystem and a new
  assigned MSI vector will replace dev->irq.
  
  To return to its default mode, a device driver should always call
  pci_disable_msi() to undo the effect of pci_enable_msi(). Note that a
  device driver should always call free_irq() on the MSI vector it has
  done request_irq() on before calling pci_disable_msi(). Failure to do
  so results in a BUG_ON(), and the device will be left with MSI enabled
  and will leak its vector. Otherwise, the PCI subsystem restores a device's
  dev->irq with a pre-assigned IOAPIC vector and marks the released
  MSI vector as unused.
  
  Once marked as unused, there is no guarantee that the PCI
  subsystem will reserve this MSI vector for a device. Depending on
  the availability of current PCI vector resources and the number of
  MSI/MSI-X requests from other drivers, this MSI vector may be re-assigned.
  For the case where the PCI subsystem re-assigns this MSI vector to
  another driver, a request to switch back to MSI mode may result
  in being assigned a different MSI vector or a failure if no more
  vectors are available.
  
  5.3 Configuring for MSI-X support
  
  Due to the ability of the system software to configure each vector of
  the MSI-X capability structure with an independent message address
  and message data, the non-contiguous fashion in vector assignment of
  the existing Linux kernel has no impact on supporting multiple
  messages on an MSI-X capable device function. Enabling MSI-X on
  a device function's MSI-X capability structure requires its device
  driver to call the function pci_enable_msix() explicitly.
  
  The function pci_enable_msix(), once invoked, enables either
  all or nothing, depending on the current availability of PCI vector
  resources. If the PCI vector resources are available for the number
  of vectors requested by a device driver, this function will configure
  the MSI-X table of the MSI-X capability structure of a device with
  requested messages. For example, a device may be capable of
  supporting a maximum of 32 vectors while its software driver
  requests only 4 vectors. It is recommended that the device driver
  call this function once during the initialization phase of the
  device driver.
  
  Unlike the function pci_enable_msi(), the function pci_enable_msix()
  does not replace the pre-assigned IOAPIC dev->irq with a new MSI
  vector because the PCI subsystem writes the 1:1 vector-to-entry mapping
  into the field 'vector' of each element contained in the second argument.
  Note that the pre-assigned IOAPIC dev->irq is valid only if the device
  operates in PIN-IRQ assertion mode. In MSI-X mode, any attempt at
  using dev->irq by the device driver to request interrupt service
  may result in unpredictable behavior.

  For each MSI-X vector granted, a device driver is responsible for calling
  other functions like request_irq(), enable_irq(), etc. to enable
  this vector with its corresponding interrupt service handler. It is
  a device driver's choice to assign all vectors with the same
  interrupt service handler or each vector with a unique interrupt
  service handler.
  
  5.3.1 Handling MMIO address space of MSI-X Table
  
  The PCI 3.0 specification has implementation notes that MMIO address
  space for a device's MSI-X structure should be isolated so that the
  software system can set different pages for controlling accesses to the
  MSI-X structure. The implementation of MSI support requires the PCI
  subsystem, not a device driver, to maintain full control of the MSI-X
  table/MSI-X PBA (Pending Bit Array) and MMIO address space of the MSI-X
  table/MSI-X PBA.  A device driver is prohibited from requesting the MMIO
  address space of the MSI-X table/MSI-X PBA. Otherwise, the PCI subsystem
  will fail to enable MSI-X on its hardware device when it calls the function
  pci_enable_msix().
  5.3.2 API pci_enable_msix

  int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries, int nvec)
  
  This API enables a device driver to request the PCI subsystem
  to enable MSI-X messages on its hardware device. Depending on
  the availability of PCI vector resources, the PCI subsystem enables
  either all or none of the requested vectors.

  Argument 'dev' points to the device (pci_dev) structure.

  Argument 'entries' is a pointer to an array of msix_entry structs.
  The number of entries is indicated in argument 'nvec'.
  struct msix_entry is defined in drivers/pci/msi.h:
  
  struct msix_entry {
  	u16 	vector; /* kernel uses to write alloc vector */
  	u16	entry; /* driver uses to specify entry */
  };
  A device driver is responsible for initializing the field 'entry' of
  each element with a unique entry supported by MSI-X table. Otherwise,
  -EINVAL will be returned as a result. A successful return of zero
  indicates the PCI subsystem completed initializing each of the requested
  entries of the MSI-X table with message address and message data.
  Last but not least, the PCI subsystem will write the 1:1
  vector-to-entry mapping into the field 'vector' of each element. A
  device driver is responsible for keeping track of allocated MSI-X
  vectors in its internal data structure.
  A return of zero indicates that the number of MSI-X vectors was
  successfully allocated. A return of greater than zero indicates
  MSI-X vector shortage. A return of less than zero indicates
  a failure. This failure may be a result of duplicate entries
  specified in the second argument, or a result of no available vector,
  or a result of failing to initialize MSI-X table entries.
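  The three kinds of return value can be handled as in the sketch
  below. It builds outside the kernel: struct pci_dev and the body of
  pci_enable_msix() are stand-ins written for this example (the
  'avail_vectors' field and vector numbers are arbitrary), and
  my_enable_msix is a hypothetical helper. The sketch also assumes that
  a positive return value is the number of vectors that could be
  allocated and retries with that count; the text above only states
  that a positive value signals a shortage.

```c
#include <errno.h>
#include <stdint.h>

struct msix_entry {
	uint16_t vector;	/* kernel writes the allocated vector here */
	uint16_t entry;		/* driver selects the MSI-X table entry */
};

/* ---- Stand-ins so the sketch builds outside the kernel (assumptions) ---- */
struct pci_dev {
	int avail_vectors;	/* model only: free vectors in the system */
};

static int pci_enable_msix(struct pci_dev *dev, struct msix_entry *entries,
			   int nvec)
{
	int i, j;

	for (i = 0; i < nvec; i++)
		for (j = i + 1; j < nvec; j++)
			if (entries[i].entry == entries[j].entry)
				return -EINVAL;	/* duplicate table entry */
	if (dev->avail_vectors == 0)
		return -EBUSY;			/* no vector available */
	if (nvec > dev->avail_vectors)
		return dev->avail_vectors;	/* shortage (assumed meaning) */
	for (i = 0; i < nvec; i++)
		entries[i].vector = (uint16_t)(200 + i);
	dev->avail_vectors -= nvec;
	return 0;				/* all requested vectors set */
}

/* ---- Driver side: returns the vector count obtained, or a negative errno ---- */
static int my_enable_msix(struct pci_dev *dev, struct msix_entry *entries,
			  int want)
{
	int i, rc;

	for (;;) {
		/* Each element must name a unique MSI-X table entry. */
		for (i = 0; i < want; i++)
			entries[i].entry = (uint16_t)i;

		rc = pci_enable_msix(dev, entries, want);
		if (rc == 0)
			return want;	/* entries[i].vector is now valid */
		if (rc < 0)
			return rc;	/* hard failure */
		if (rc >= want)
			return -ENOSPC;	/* guard against looping forever */
		want = rc;		/* shortage: retry with fewer vectors */
	}
}
```

  After a successful call the driver would pass each entries[i].vector,
  not dev->irq, to request_irq().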
  5.3.3 API pci_disable_msix
  
  void pci_disable_msix(struct pci_dev *dev)
  
  This API should always be used to undo the effect of pci_enable_msix()
  when a device driver is unloading. Note that a device driver should
  always call free_irq() on all MSI-X vectors it has done request_irq()
  on before calling this API. Failure to do so results in a BUG_ON() and
  the device will be left with MSI-X enabled and will leak its vectors.
  5.3.4 MSI-X mode vs. legacy mode diagram

  The below diagram shows the events which switch the interrupt
  mode on the MSI-X capable device function between MSI-X mode and
  PIN-IRQ assertion mode (legacy).
  
  	 ------------   pci_enable_msix(,,n) ------------------------
  	|	     | <===============	    | 			     |
  	| MSI-X MODE |	  	     	    | PIN-IRQ ASSERTION MODE |
  	| 	     | ===============>	    |			     |
   	 ------------	pci_disable_msix     ------------------------
  Figure 2. MSI-X Mode vs. Legacy Mode

  In Figure 2, a device operates by default in legacy mode. A
  successful MSI-X request (using pci_enable_msix()) switches a
  device's interrupt mode to MSI-X mode. A pre-assigned IOAPIC vector
  stored in dev->irq will be saved by the PCI subsystem; however,
  unlike MSI mode, the PCI subsystem will not replace dev->irq with
  assigned MSI-X vector because the PCI subsystem already writes the 1:1
  vector-to-entry mapping into the field 'vector' of each element
  specified in the second argument.
  
  To return to its default mode, a device driver should always call
  pci_disable_msix() to undo the effect of pci_enable_msix(). Note that
  a device driver should always call free_irq() on all MSI-X vectors it
  has done request_irq() on before calling pci_disable_msix(). Failure
  to do so results in a BUG_ON(), and the device will be left with MSI-X
  enabled and will leak its vectors. Otherwise, the PCI subsystem switches a
  device function's interrupt mode from MSI-X mode to legacy mode and
  marks all allocated MSI-X vectors as unused.
  
  Once marked as unused, there is no guarantee that the PCI
  subsystem will reserve these MSI-X vectors for a device. Depending on
  the availability of current PCI vector resources and the number of
  MSI/MSI-X requests from other drivers, these MSI-X vectors may be
  re-assigned.
  
  For the case where the PCI subsystem re-assigns these MSI-X vectors
  to other drivers, a request to switch back to MSI-X mode may result
  in being assigned another set of MSI-X vectors or a failure if no
  more vectors are available.
  5.4 Handling function implementing both MSI and MSI-X capabilities
  
  For the case where a function implements both MSI and MSI-X
  capabilities, the PCI subsystem enables a device to run either in MSI
  mode or MSI-X mode but not both. A device driver determines whether it
  wants MSI or MSI-X enabled on its hardware device. Once a device
  driver requests MSI, for example, it is prohibited from requesting
  MSI-X; in other words, a device driver is not permitted to ping-pong
  between MSI mode and MSI-X mode at run time.
  
  5.5 Hardware requirements for MSI/MSI-X support

  MSI/MSI-X support requires support from both system hardware and
  individual hardware device functions.
  5.5.1 Required x86 hardware support

  Since the target of the MSI address is the CPU's local APIC, enabling
  MSI/MSI-X support in the Linux kernel is dependent on whether existing
  system hardware supports local APIC. Users should verify that their
  system supports local APIC operation by testing that it runs when
  CONFIG_X86_LOCAL_APIC=y.
  
  In an SMP environment, CONFIG_X86_LOCAL_APIC is automatically set;
  however, in a UP environment, users must manually set
  CONFIG_X86_LOCAL_APIC. Once CONFIG_X86_LOCAL_APIC=y, setting
  CONFIG_PCI_MSI enables the VECTOR based scheme and the option for
  MSI-capable device drivers to selectively enable MSI/MSI-X.
  
  Note that the CONFIG_X86_IO_APIC setting is irrelevant because MSI/MSI-X
  vectors are newly allocated at run time and MSI/MSI-X support does not
  depend on BIOS support. This key independence enables MSI/MSI-X
  support on future IOxAPIC-free platforms.
  
  5.5.2 Device hardware support

  The hardware device function supports MSI by indicating the
  MSI/MSI-X capability structure on its PCI capability list. By
  default, this capability structure will not be initialized by
  the kernel to enable MSI during the system boot. In other words,
  the device function is running in its default pin assertion mode.
  Note that in many cases the hardware supporting MSI has bugs,
  which may result in system hangs. The software driver of specific
  MSI-capable hardware is responsible for deciding whether to call
  pci_enable_msi() or not. A return of zero indicates the kernel
  successfully initialized the MSI/MSI-X capability structure of the
  device function. The device function is now running in MSI/MSI-X mode.
  
  5.6 How to tell whether MSI/MSI-X is enabled on device function
  
  At the driver level, a return of zero from the function call of
  pci_enable_msi()/pci_enable_msix() indicates to a device driver that
  its device function is initialized successfully and ready to run in
  MSI/MSI-X mode.
  At the user level, users can use the command 'cat /proc/interrupts'
  to display the vectors allocated for devices and their interrupt
  MSI/MSI-X modes ("PCI-MSI"/"PCI-MSI-X"). The output below shows MSI
  mode enabled on a SCSI Adaptec 39320D Ultra320 controller.
  
             CPU0       CPU1
    0:     324639          0    IO-APIC-edge  timer
    1:       1186          0    IO-APIC-edge  i8042
    2:          0          0          XT-PIC  cascade
   12:       2797          0    IO-APIC-edge  i8042
   14:       6543          0    IO-APIC-edge  ide0
   15:          1          0    IO-APIC-edge  ide1
  169:          0          0   IO-APIC-level  uhci-hcd
  185:          0          0   IO-APIC-level  uhci-hcd
  193:        138         10         PCI-MSI  aic79xx
  201:         30          0         PCI-MSI  aic79xx
  225:         30          0   IO-APIC-level  aic7xxx
  233:         30          0   IO-APIC-level  aic7xxx
  NMI:          0          0
  LOC:     324553     325068
  ERR:          0
  MIS:          0
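  The same check can be scripted. The sketch below is a small user-space
  helper, with a hypothetical name, that tests whether one line of
  /proc/interrupts describes an MSI or MSI-X vector. It assumes the
  line format shown above, where the interrupt type column contains the
  string "PCI-MSI" or "PCI-MSI-X".

```c
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if 'line' is a numbered /proc/interrupts row whose type
 * column mentions PCI-MSI (this also matches PCI-MSI-X), else 0.
 * On a match, *irq receives the number before the colon.
 */
static int line_is_msi(const char *line, unsigned int *irq)
{
	unsigned int n;

	if (sscanf(line, " %u:", &n) != 1)
		return 0;	/* header row or NMI/LOC/ERR/MIS rows */
	if (strstr(line, "PCI-MSI") == NULL)
		return 0;	/* IO-APIC, XT-PIC, ... */
	*irq = n;
	return 1;
}
```

  A caller would read /proc/interrupts line by line, for instance with
  fgets(), and pass each line to this helper.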
  6. MSI quirks
  
  Several PCI chipsets or devices are known to not support MSI.
  The PCI stack provides 3 possible levels of MSI disabling:
  * on a single device
  * on all devices behind a specific bridge
  * globally
  
  6.1. Disabling MSI on a single device
  Under some circumstances it might be required to disable MSI on a
  single device.  This may be achieved by either not calling pci_enable_msi()
  at all, or setting the pci_dev->no_msi flag beforehand (most of the time
  in a quirk).
  
  6.2. Disabling MSI below a bridge
  
  The vast majority of MSI quirks are required by PCI bridges not
  being able to route MSI between busses. In this case, MSI has to be
  disabled on all devices behind this bridge. This is achieved by setting
  the PCI_BUS_FLAGS_NO_MSI flag in the pci_bus->bus_flags of the bridge's
  subordinate bus. There is no need to set the same flag on bridges that
  are below the broken bridge. When pci_enable_msi() is called to enable
  MSI on a device, pci_msi_supported() takes care of checking the NO_MSI
  flag in all parent busses of the device.
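  The parent-bus walk described here can be modelled as below. This is
  not the kernel's pci_msi_supported() code: struct bus_node,
  BUS_FLAG_NO_MSI and msi_allowed_on_bus are names invented for this
  sketch, standing in for pci_bus, PCI_BUS_FLAGS_NO_MSI and the real
  check. It shows why flagging only the broken bridge's subordinate bus
  is enough.

```c
#include <stddef.h>

#define BUS_FLAG_NO_MSI	0x1u	/* stand-in for PCI_BUS_FLAGS_NO_MSI */

/* Hypothetical model of a bus with a link to its parent bus. */
struct bus_node {
	unsigned int flags;
	struct bus_node *parent;	/* NULL at the root bus */
};

/* Return 1 if no bus from 'bus' up to the root has NO_MSI set. */
static int msi_allowed_on_bus(const struct bus_node *bus)
{
	for (; bus != NULL; bus = bus->parent)
		if (bus->flags & BUS_FLAG_NO_MSI)
			return 0;
	return 1;
}
```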
  
  Some bridges allow MSI support to be enabled or disabled dynamically
  by changing some bits in their PCI configuration space (especially
  the Hypertransport chipsets such as the nVidia nForce and Serverworks
  HT2000). It may then be required to update the NO_MSI flag on the
  corresponding devices in the sysfs hierarchy. To enable MSI support
  on device "0000:00:0e", do:
  
  	echo 1 > /sys/bus/pci/devices/0000:00:0e/msi_bus
  
  To disable MSI support, echo 0 instead of 1. Note that it should be
  used with caution since changing this value might break interrupts.
  
  6.3. Disabling MSI globally
  
  Some extreme cases may require disabling MSI globally on the system.
  For now, the only known case is the Serverworks PCI-X chipset (MSI is
  not supported on several busses that are not all connected to the
  chipset in the Linux PCI hierarchy). In the vast majority of other
  cases, disabling only behind a specific bridge is enough.
  
  For debugging purposes, the user may also pass pci=nomsi on the kernel
  command line to explicitly disable MSI globally. But, once the
  appropriate quirks are added to the kernel, this option should not be
  required anymore.
  
  6.4. Finding why MSI cannot be enabled on a device
  
  If MSI is not enabled on a device, you should look at
  dmesg to find messages that quirks may output when disabling MSI
  on some devices, some bridges or even globally.
  Then, lspci -t gives the list of bridges above a device. Reading
  /sys/bus/pci/devices/0000:00:0e/msi_bus will tell you whether MSI
  is enabled (1) or disabled (0). If 0 is found in the msi_bus file
  of any bridge above the device, MSI cannot be enabled.
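  Reading an msi_bus file from a program is straightforward. The
  user-space sketch below, with a hypothetical function name, assumes
  the file holds a single decimal number as described above and reports
  1, 0, or -1 if the file cannot be read.

```c
#include <stdio.h>

/* Read a sysfs msi_bus file: 1 = enabled, 0 = disabled, -1 = error. */
static int read_msi_bus(const char *path)
{
	FILE *f = fopen(path, "r");
	int value;

	if (f == NULL)
		return -1;
	if (fscanf(f, "%d", &value) != 1)
		value = -1;	/* empty or non-numeric content */
	fclose(f);
	return value < 0 ? -1 : (value != 0);
}
```

  Calling it once for each bridge above the device, for example with
  the path "/sys/bus/pci/devices/0000:00:0e/msi_bus", and stopping at
  the first 0 mirrors the manual procedure.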
  
  7. FAQ
  
  Q1. Are there any limitations on using the MSI?
  
  A1. If the PCI device supports MSI and conforms to the
  specification and the platform supports the APIC local bus,
  then using MSI should work.
  
  Q2. Will it work on all the Pentium processors (P3, P4, Xeon,
  AMD processors)? In P3 IPI's are transmitted on the APIC local
  bus and in P4 and Xeon they are transmitted on the system
  bus. Are there any implications with this?
  
  A2. MSI support enables a PCI device to send an inbound
  memory write (0xfeexxxxx as target address) on its PCI bus
  directly to the FSB. Since the message address has a
  redirection hint bit cleared, it should work.
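  To show what a 0xfeexxxxx message looks like, the sketch below builds
  an x86 message address and message data pair for a given destination
  APIC ID and vector. The field positions follow the x86 MSI message
  format; the macro, struct and function names are invented for this
  example, and the choice of physical destination mode with fixed,
  edge-triggered delivery is an assumption made to keep it short.

```c
#include <stdint.h>

#define MSI_ADDR_BASE		0xfee00000u	/* the 0xfeexxxxx window */
#define MSI_ADDR_DEST_ID_SHIFT	12		/* bits 19:12: destination APIC ID */
#define MSI_ADDR_REDIR_HINT	(1u << 3)	/* redirection hint bit */
#define MSI_ADDR_DEST_LOGICAL	(1u << 2)	/* destination mode: 1 = logical */

#define MSI_DATA_VECTOR_MASK	0x00ffu		/* bits 7:0: interrupt vector */
#define MSI_DATA_DELIVERY_FIXED	(0u << 8)	/* bits 10:8: delivery mode */

/* Hypothetical holder for one composed message. */
struct msi_msg_x86 {
	uint32_t address;
	uint16_t data;
};

static struct msi_msg_x86 msi_compose(uint8_t dest_apic_id, uint8_t vector)
{
	struct msi_msg_x86 msg;

	/* Physical destination mode, redirection hint left cleared. */
	msg.address = MSI_ADDR_BASE |
		      ((uint32_t)dest_apic_id << MSI_ADDR_DEST_ID_SHIFT);
	/* Fixed delivery; trigger mode bit 15 left clear (edge). */
	msg.data = (uint16_t)(MSI_DATA_DELIVERY_FIXED |
			      (vector & MSI_DATA_VECTOR_MASK));
	return msg;
}
```

  For instance, APIC ID 1 and vector 0x41 yield address 0xfee01000 and
  data 0x0041.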
  
  Q3. The target address 0xfeexxxxx will be translated by the
  Host Bridge into an interrupt message. Are there any
  limitations on the chipsets such as Intel 8xx, Intel e7xxx,
  or VIA?
  
  A3. If these chipsets support an inbound memory write with
  target address set as 0xfeexxxxx, in conformance with the PCI
  specification 2.3 or later, then it should work.
  
  Q4. From the driver point of view, if the MSI is lost because
  of errors occurring during inbound memory write, then it may
  wait forever. Is there a mechanism for it to recover?
  
  A4. Since the target of the transaction is an inbound memory
  write, all transaction termination conditions (Retry,
  Master-Abort, Target-Abort, or normal completion) are
  supported. A device sending an MSI must abide by all the PCI
  rules and conditions regarding that inbound memory write. So,
  if a retry is signaled it must retry, etc... We believe that
  the recommendation for Abort is also a retry (refer to PCI
  specification 2.3 or later).