13 Sep, 2013

1 commit


13 Aug, 2013

1 commit

  • Some platforms (e.g. S390) don't use the generic hardirqs code and
    therefore do not define HAVE_GENERIC_HARDIRQS. This prevents using
    the irq_set_chip_data() and irq_get_chip_data() functions that are
    used for the default implementations of the MSI operations.

    So, when CONFIG_GENERIC_HARDIRQS is not enabled, provide another
    default implementation of the MSI operations, that simply errors
    out. The architecture is responsible for implementing those operations
    (which is the case on S390), and cannot use the msi_chip infrastructure.

    Signed-off-by: Thomas Petazzoni
    Signed-off-by: Jason Cooper

    Thomas Petazzoni
     

12 Aug, 2013

2 commits

  • The new struct msi_chip is used to associate an MSI controller with a
    PCI bus. It is automatically handed down from the root to its children
    during bus enumeration.

    This patch provides default (weak) implementations for the architecture-
    specific MSI functions (arch_setup_msi_irq(), arch_teardown_msi_irq()
    and arch_msi_check_device()) which check if a PCI device's bus has an
    attached MSI chip and forward the call appropriately.

    Signed-off-by: Thierry Reding
    Signed-off-by: Thomas Petazzoni
    Acked-by: Bjorn Helgaas
    Tested-by: Daniel Price
    Tested-by: Thierry Reding
    Signed-off-by: Jason Cooper

    Thierry Reding
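
The forwarding described in the commit above can be sketched in plain C. This is a minimal userspace illustration, not the kernel code: the structures are simplified stand-ins, the real arch_setup_msi_irq() also takes an MSI descriptor argument, and demo_setup_irq() is a hypothetical controller callback invented for the example.

```c
#include <errno.h>
#include <stddef.h>

struct pci_dev;

/* Simplified stand-in for the MSI controller abstraction. */
struct msi_chip {
    int (*setup_irq)(struct msi_chip *chip, struct pci_dev *dev);
};

/* Simplified stand-in: each bus carries the chip inherited from its parent. */
struct pci_bus {
    struct msi_chip *msi;
};

struct pci_dev {
    struct pci_bus *bus;
};

/*
 * Weak default: an architecture can provide a strong definition with the
 * same name, and the linker will pick that one instead.
 */
__attribute__((weak)) int arch_setup_msi_irq(struct pci_dev *dev)
{
    struct msi_chip *chip = dev->bus->msi;

    /* No MSI chip attached to this bus: nothing to forward to. */
    if (!chip || !chip->setup_irq)
        return -EINVAL;

    return chip->setup_irq(chip, dev);
}

/* Hypothetical controller callback, used only to exercise the default. */
int demo_setup_irq(struct msi_chip *chip, struct pci_dev *dev)
{
    (void)chip;
    (void)dev;
    return 0;
}
```

A device on a bus with an attached chip reaches demo_setup_irq(); a device on a bus without one gets -EINVAL from the default.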
     
  • Until now, the MSI architecture-specific functions could be overloaded
    using a fairly complex set of #define and compile-time
    conditionals. In order to prepare for the introduction of the msi_chip
    infrastructure, it is desirable to switch all those functions to use
    the 'weak' mechanism. This commit converts all the architectures that
    were overriding those MSI functions to use the new strategy.

    Note that we keep two separate, non-weak, functions
    default_teardown_msi_irqs() and default_restore_msi_irqs() for the
    default behavior of the arch_teardown_msi_irqs() and
    arch_restore_msi_irqs(), as the default behavior is needed by x86 PCI
    code.

    Signed-off-by: Thomas Petazzoni
    Acked-by: Bjorn Helgaas
    Acked-by: Benjamin Herrenschmidt
    Tested-by: Daniel Price
    Tested-by: Thierry Reding
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: linuxppc-dev@lists.ozlabs.org
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: linux390@de.ibm.com
    Cc: linux-s390@vger.kernel.org
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: x86@kernel.org
    Cc: Russell King
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: linux-ia64@vger.kernel.org
    Cc: Ralf Baechle
    Cc: linux-mips@linux-mips.org
    Cc: David S. Miller
    Cc: sparclinux@vger.kernel.org
    Cc: Chris Metcalf
    Signed-off-by: Jason Cooper

    Thomas Petazzoni
     

29 May, 2013

1 commit

  • Because of the encoding of the "Multiple Message Capable" and "Multiple
    Message Enable" fields, a device can only advertise that it's capable of a
    power-of-two number of vectors, and the OS can only enable a power-of-two
    number.

    For example, a device that's limited internally to using 18 vectors would
    have to advertise that it's capable of 32. The 14 extra vectors consume
    vector numbers and IRQ descriptors even though the device can't actually
    use them.

    This fix introduces a 'msi_desc::nvec_used' field to address this issue.
    When non-zero, it is the actual number of MSIs the device will send, as
    requested by the device driver. This value should be used by architectures
    to set up and tear down only as many interrupt resources as the device will
    actually use.

    Note, although the existing 'msi_desc::multiple' field might seem
    redundant, in fact it is not. The number of MSIs advertised need not be
    the smallest power-of-two larger than the number of MSIs the device will
    send. Thus, it is not always possible to derive the former from the
    latter, so we need to keep them both to handle this case.

    [bhelgaas: changelog, rename to "nvec_used"]
    Signed-off-by: Alexander Gordeev
    Signed-off-by: Bjorn Helgaas

    Alexander Gordeev
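
The arithmetic behind the 18-versus-32 example above can be checked with a small sketch. The helper names are hypothetical. The code only illustrates the power-of-two encoding of the "Multiple Message Capable" field, which stores the base-2 logarithm of the vector count. It computes the smallest count a device could advertise; as the commit notes, a device may advertise more than that.

```c
/* Smallest power of two that is >= n (for n >= 1). */
unsigned int demo_roundup_pow2(unsigned int n)
{
    unsigned int p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/* log2 of a power-of-two vector count, i.e. the encoded field value. */
unsigned int demo_encode_log2(unsigned int pow2)
{
    unsigned int e = 0;

    while ((1u << e) < pow2)
        e++;
    return e;
}

/* Vectors that are advertised but never used by the device. */
unsigned int demo_wasted_vectors(unsigned int nvec_used)
{
    return demo_roundup_pow2(nvec_used) - nvec_used;
}
```

For a device that needs 18 vectors, the helpers give an advertised count of 32, an encoded value of 5, and 14 unused vectors, matching the numbers in the commit message.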
     

30 Apr, 2013

1 commit

  • The "+" operation has higher precedence than "?:" and ->msi_cap is
    always non-zero here so the original statement is equivalent to:

    entry->mask_pos = PCI_MSI_MASK_64;

    Which wasn't the intent.

    [bhelgaas: my fault from 78b5a310ce]
    Signed-off-by: Dan Carpenter
    Signed-off-by: Bjorn Helgaas

    Dan Carpenter
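
The precedence problem described above can be reproduced outside the kernel. The sketch below is an assumed reconstruction, not the original statement: the function names are hypothetical, and the constants follow the values in the kernel's pci_regs.h as I understand them. The first function spells out how the compiler groups an unparenthesized `cap + cond ? a : b`.

```c
#define PCI_MSI_FLAGS_64BIT 0x0080 /* 64-bit addresses allowed */
#define PCI_MSI_MASK_32     12     /* mask register offset, 32-bit devices */
#define PCI_MSI_MASK_64     16     /* mask register offset, 64-bit devices */

/*
 * "+" binds tighter than "?:", so the sum becomes the condition.
 * With a non-zero capability offset the result is always PCI_MSI_MASK_64.
 */
unsigned int demo_mask_pos_as_parsed(unsigned int cap, unsigned int control)
{
    return (cap + (control & PCI_MSI_FLAGS_64BIT)) ? PCI_MSI_MASK_64
                                                   : PCI_MSI_MASK_32;
}

/* Intended meaning: pick the offset first, then add it to the capability. */
unsigned int demo_mask_pos_intended(unsigned int cap, unsigned int control)
{
    return cap + ((control & PCI_MSI_FLAGS_64BIT) ? PCI_MSI_MASK_64
                                                  : PCI_MSI_MASK_32);
}
```

With a capability at offset 0x50 and the 64-bit flag clear, the first form yields 16 regardless of the flag, while the second yields 0x50 + 12.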
     

25 Apr, 2013

1 commit

  • * pci/gavin-msi-cleanup:
    vfio-pci: Use cached MSI/MSI-X capabilities
    vfio-pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
    PCI: Remove "extern" from function declarations
    PCI: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
    PCI: Drop msi_mask_reg() and remove drivers/pci/msi.h
    PCI: Use msix_table_size() directly, drop multi_msix_capable()
    PCI: Drop msix_table_offset_reg() and msix_pba_offset_reg() macros
    PCI: Drop is_64bit_address() and is_mask_bit_support() macros
    PCI: Drop msi_data_reg() macro
    PCI: Drop msi_lower_address_reg() and msi_upper_address_reg() macros
    PCI: Drop msi_control_reg() macro and use PCI_MSI_FLAGS directly
    PCI: Use cached MSI/MSI-X offsets from dev, not from msi_desc
    PCI: Clean up MSI/MSI-X capability #defines
    PCI: Use cached MSI-X cap while enabling MSI-X
    PCI: Use cached MSI cap while enabling MSI interrupts
    PCI: Remove MSI/MSI-X cap check in pci_msi_check_device()
    PCI: Cache MSI/MSI-X capability offsets in struct pci_dev
    PCI: Use u8, not int, for PM capability offset
    [SCSI] megaraid_sas: Use correct #define for MSI-X capability

    Bjorn Helgaas
     

23 Apr, 2013

13 commits


13 Apr, 2013

1 commit


25 Jan, 2013

1 commit

  • The new function pci_enable_msi_block_auto() tries to allocate the
    maximum possible number of MSIs, up to the number the device
    supports. It generalizes the pattern in which pci_enable_msi_block()
    is called repeatedly until it succeeds or fails.

    Unlike pci_enable_msi_block(), which takes the number of
    MSIs to allocate as an input parameter,
    pci_enable_msi_block_auto() can be used by device drivers to
    obtain the number of assigned MSIs and the number of MSIs the
    device supports.

    Signed-off-by: Alexander Gordeev
    Acked-by: Bjorn Helgaas
    Cc: Suresh Siddha
    Cc: Yinghai Lu
    Cc: Matthew Wilcox
    Cc: Jeff Garzik
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/c3de2419df94a0f95ca1a6f755afc421486455e6.1353324359.git.agordeev@redhat.com
    Signed-off-by: Ingo Molnar

    Alexander Gordeev
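
The retry pattern that the commit above generalizes can be sketched as follows. Everything here is a userspace stand-in: demo_enable_block() is a hypothetical allocator, and its return convention (0 on success, negative on error, positive meaning "this many could be allocated instead") is an assumption modelled on my understanding of pci_enable_msi_block().

```c
/* Hypothetical: number of vectors the platform can currently provide. */
int demo_available = 4;

/*
 * Hypothetical allocator following the assumed convention:
 *   0   -> nvec vectors were allocated
 *   < 0 -> hard failure
 *   > 0 -> nvec was too many; this many could be allocated instead
 */
int demo_enable_block(int nvec)
{
    if (demo_available <= 0)
        return -1;
    if (nvec > demo_available)
        return demo_available;
    return 0;
}

/*
 * Start from the device maximum and retry with the suggested count
 * until the allocator succeeds or fails for good. Returns the number
 * of vectors obtained, or a negative error.
 */
int demo_enable_block_auto(int max_supported)
{
    int nvec = max_supported;
    int ret;

    do {
        ret = demo_enable_block(nvec);
        if (ret > 0)
            nvec = ret; /* retry with the smaller, suggested count */
    } while (ret > 0);

    return ret < 0 ? ret : nvec;
}
```

A driver-style caller would pass the device's maximum and use the positive return value as the number of vectors it actually received.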
     

01 Dec, 2012

1 commit

  • Support PCI adapter interrupts using the Single-IRQ-mode. Single-IRQ-mode
    disables an adapter IRQ automatically after delivering it until the SIC
    instruction enables it again. This is used to reduce the number of IRQs
    for streaming workloads.

    Up to 64 MSI handlers can be registered per PCI function.
    A hash table is used to map interrupt numbers to MSI descriptors.
    The interrupt vector is scanned using the flogr instruction.
    Only MSI/MSI-X interrupts are supported, no legacy INTs.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky

    Jan Glauber
     

07 Jan, 2012

4 commits

  • The MSI restore function will become a function pointer in an
    x86_msi_ops struct. It defaults to the implementation in the
    io_apic.c and msi.c. We piggyback on the indirection mechanism
    introduced by "x86: Introduce x86_msi_ops".

    Cc: x86@kernel.org
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: linux-pci@vger.kernel.org
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Jesse Barnes

    Konrad Rzeszutek Wilk
     
  • This warning was recently reported to me:

    ------------[ cut here ]------------
    WARNING: at lib/kobject.c:595 kobject_put+0x50/0x60()
    Hardware name: VMware Virtual Platform
    kobject: '(null)' (ffff880027b0df40): is not initialized, yet kobject_put() is
    being called.
    Modules linked in: vmxnet3(+) vmw_balloon i2c_piix4 i2c_core shpchp raid10
    vmw_pvscsi
    Pid: 630, comm: modprobe Tainted: G W 3.1.6-1.fc16.x86_64 #1
    Call Trace:
    [] warn_slowpath_common+0x7f/0xc0
    [] warn_slowpath_fmt+0x46/0x50
    [] ? free_desc+0x63/0x70
    [] kobject_put+0x50/0x60
    [] free_msi_irqs+0xd5/0x120
    [] pci_enable_msi_block+0x24c/0x2c0
    [] vmxnet3_alloc_intr_resources+0x173/0x240 [vmxnet3]
    [] vmxnet3_probe_device+0x615/0x834 [vmxnet3]
    [] local_pci_probe+0x5c/0xd0
    [] pci_device_probe+0x109/0x130
    [] driver_probe_device+0x9c/0x2b0
    [] __driver_attach+0xab/0xb0
    [] ? driver_probe_device+0x2b0/0x2b0
    [] ? driver_probe_device+0x2b0/0x2b0
    [] bus_for_each_dev+0x5c/0x90
    [] driver_attach+0x1e/0x20
    [] bus_add_driver+0x1b0/0x2a0
    [] ? 0xffffffffa0187fff
    [] driver_register+0x76/0x140
    [] ? printk+0x51/0x53
    [] ? 0xffffffffa0187fff
    [] __pci_register_driver+0x56/0xd0
    [] vmxnet3_init_module+0x3a/0x3c [vmxnet3]
    [] do_one_initcall+0x42/0x180
    [] sys_init_module+0x91/0x200
    [] system_call_fastpath+0x16/0x1b
    ---[ end trace 44593438a59a9558 ]---
    Using INTx interrupt, #Rx queues: 1.

    It occurs when populate_msi_sysfs fails, which in turn causes free_msi_irqs to
    be called. Because populate_msi_sysfs fails, we never registered any of the
    msi irq sysfs objects, but free_msi_irqs still calls kobject_del and kobject_put
    on each of them, which gets flagged in the above stack trace.

    The fix is pretty straightforward. We can key off the parent pointer in the
    kobject. It is only set if kobject_init_and_add succeeds in
    populate_msi_sysfs. If anything fails there, each kobject has its parent reset
    to NULL.

    Signed-off-by: Neil Horman
    CC: Bjorn Helgaas
    CC: Greg Kroah-Hartman
    CC: linux-pci@vger.kernel.org
    Signed-off-by: Jesse Barnes

    Neil Horman
     
  • I traced a nasty kexec on panic boot failure to the fact that we had
    screaming msi interrupts and we were not disabling the msi messages at
    kernel startup. The booting kernel had not enabled those interrupts so
    was not prepared to handle them.

    I can see no reason why we would ever want to leave the msi interrupts
    enabled at boot if something else has enabled those interrupts. The pci
    spec specifies that msi interrupts should be off by default. Drivers
    are expected to enable the msi interrupts if they want to use them. Our
    interrupt handling code reprograms the interrupt handlers at boot and
    will not be able to do anything useful with an unexpected interrupt.

    This patch applies cleanly all of the way back to 2.6.32 where I noticed
    the problem.

    Cc: stable@kernel.org
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Jesse Barnes

    Eric W. Biederman
     
  • This patch adds a per-pci-device subdirectory in sysfs called:
    /sys/bus/pci/devices//msi_irqs

    This sub-directory exports the set of msi vectors allocated by a given
    pci device, by creating a numbered sub-directory for each vector beneath
    msi_irqs. For each vector various attributes can be exported.
    Currently the only attribute is called mode, which tracks the
    operational mode of that vector (msi vs. msix)

    Acked-by: Greg Kroah-Hartman
    Signed-off-by: Jesse Barnes

    Neil Horman
     

01 Nov, 2011

1 commit


29 Mar, 2011

1 commit


24 Dec, 2010

1 commit


18 Oct, 2010

1 commit

  • Introduce an override for the arch_[teardown|setup]_msi_irqs
    that can be used to fall back to the default arch_* code.

    If a platform wants to use the code paths defined
    in drivers/pci/msi.c it has to define HAVE_DEFAULT_MSI_TEARDOWN_IRQS
    or HAVE_DEFAULT_MSI_SETUP_IRQS. Otherwise the old mechanism
    of overriding the arch_* functions works fine.

    Signed-off-by: Konrad Rzeszutek Wilk
    Cc: x86@kernel.org

    Thomas Gleixner
     

12 Oct, 2010

2 commits


31 Jul, 2010

3 commits

  • commit 2ca1af9aa3285c6a5f103ed31ad09f7399fc65d7 "PCI: MSI: Remove
    unsafe and unnecessary hardware access" changed read_msi_msg_desc() to
    return the last MSI message written instead of reading it from the
    device, since it may be called while the device is in a reduced
    power state.

    However, the pSeries platform code really does need to read messages
    from the device, since they are initially written by firmware.
    Therefore:
    - Restore the previous behaviour of read_msi_msg_desc()
    - Add new functions get_cached_msi_msg{,_desc}() which return the
    last MSI message written
    - Use the new functions where appropriate

    Acked-by: Michael Ellerman
    Signed-off-by: Ben Hutchings
    Signed-off-by: Jesse Barnes

    Ben Hutchings
     
  • During suspend on an SMP system, {read,write}_msi_msg_desc() may be
    called to mask and unmask interrupts on a device that is already in a
    reduced power state. At this point memory-mapped registers including
    MSI-X tables are not accessible, and config space may not be fully
    functional either.

    While a device is in a reduced power state its interrupts are
    effectively masked and its MSI(-X) state will be restored when it is
    brought back to D0. Therefore these functions can simply read and
    write msi_desc::msg for devices not in D0.

    Further, read_msi_msg_desc() should only ever be used to update a
    previously written message, so it can always read msi_desc::msg
    and never needs to touch the hardware.

    Tested-by: "Michael Chan"
    Signed-off-by: Ben Hutchings
    Signed-off-by: Jesse Barnes

    Ben Hutchings
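
The caching behaviour described above can be illustrated with a simplified model. The structure and function names are hypothetical stand-ins, with a plain struct member in place of real device registers. The point is only that writes always update the cached message and touch the "hardware" only while the device is in D0, while reads are served from the cache.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_msi_msg {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

struct demo_msi_desc {
    struct demo_msi_msg msg; /* last message written (the cache) */
    struct demo_msi_msg hw;  /* stand-in for the device registers */
    bool in_d0;              /* true while the device is fully powered */
    unsigned int hw_writes;  /* counts accesses to the stand-in registers */
};

/* Always update the cache; touch the device only when it is in D0. */
void demo_write_msi_msg(struct demo_msi_desc *desc,
                        const struct demo_msi_msg *msg)
{
    if (desc->in_d0) {
        desc->hw = *msg;
        desc->hw_writes++;
    }
    desc->msg = *msg;
}

/* Reads never need the hardware: return the last message written. */
void demo_read_msi_msg(const struct demo_msi_desc *desc,
                       struct demo_msi_msg *out)
{
    *out = desc->msg;
}
```

In this model, a write made while the device is powered down is visible to later reads without any register access; restoring it to the device when it returns to D0 is left out of the sketch.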
     
  • Use resource_size_t for the MMIO address instead of unsigned long. Otherwise,
    the upper 32 bits of the MMIO address are unexpectedly cleared on x86-32 PAE.

    Acked-by: Matthew Wilcox
    Signed-off-by: Kenji Kaneshige
    Signed-off-by: Jesse Barnes

    Kenji Kaneshige
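
The truncation fixed by the commit above can be shown with fixed-width types. This is only an illustration under the assumption that unsigned long is 32 bits wide (as on x86-32) while physical addresses can be wider with PAE. The helper name and the address used in the usage note are made up for the example.

```c
#include <stdint.h>

/*
 * Models storing a physical address in a 32-bit unsigned long:
 * everything above bit 31 is silently dropped.
 */
uint32_t demo_store_in_32bit_ulong(uint64_t phys_addr)
{
    return (uint32_t)phys_addr;
}

/* Models storing it in a 64-bit resource_size_t: nothing is lost. */
uint64_t demo_store_in_resource_size(uint64_t phys_addr)
{
    return phys_addr;
}
```

For an MMIO address of 0x1F0001000, the 32-bit variant keeps only 0xF0001000, while the 64-bit variant preserves the full value.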
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

10 Sep, 2009

3 commits