21 Nov, 2006

2 commits


17 Nov, 2006

2 commits

  • IPoIB assumes that high (reserved) octet in the hardware address is 0,
    and copies it into the QPN. This violates RFC 4391 (which requires
    that the high 8 bits are ignored on receive), and will result in an
    invalid QPN being used when interoperating with IPoIB connected mode.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Roland Dreier

    Michael S. Tsirkin
     
  • The PCI Express and Hypertransport chip-specific source files should only
    be built when the kernel has the capability of actually compiling them.

    This fixes the driver build on, for example, ia64.

    Signed-off-by: Bryan O'Sullivan
    Cc: "Eric W. Biederman"
    Cc: Roland Dreier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bryan O'Sullivan
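
The IPoIB fix in this group comes down to masking off the reserved octet before the QPN is used. A minimal userspace sketch, assuming the first four octets of the hardware address hold one reserved octet followed by a 24-bit QPN in network byte order (the RFC 4391 layout); the helper name `ipoib_qpn_from_hwaddr` is hypothetical and this is not the driver's actual code:

```c
#include <stdint.h>

/*
 * Hypothetical helper: read the first four octets of an IPoIB hardware
 * address as a big-endian word and keep only the low 24 bits, so that
 * whatever the sender put in the reserved high octet is ignored.
 */
static uint32_t ipoib_qpn_from_hwaddr(const uint8_t *ha)
{
    uint32_t word = ((uint32_t)ha[0] << 24) | ((uint32_t)ha[1] << 16) |
                    ((uint32_t)ha[2] << 8)  |  (uint32_t)ha[3];

    return word & 0x00ffffffu;  /* drop the reserved octet */
}
```

With this mask in place, an address whose reserved octet is non-zero yields the same QPN as one whose reserved octet is zero.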
     

14 Nov, 2006

6 commits

  • * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband:
    IB/mad: Fix race between cancel and receive completion
    RDMA/amso1100: Fix && typo
    RDMA/amso1100: Fix unitialized pseudo_netdev accessed in c2_register_device
    IB/ehca: Activate scaling code by default
    IB/ehca: Use named constant for max mtu
    IB/ehca: Assure 4K alignment for firmware control blocks

    Linus Torvalds
     
  • When ib_cancel_mad() is called, it puts the canceled send on a list
    and schedules a "flushed" callback from process context. However,
    this leaves a window where a receive completion could be processed
    before the send is fully flushed.

    This is fine, except that ib_find_send_mad() will find the MAD and
    return it to the receive processing, which results in the sender
    getting both a successful receive and a "flushed" send completion for
    the same request. Understandably, this confuses the sender, which is
    expecting only one of these two callbacks, and leads to grief such as
    a use-after-free in IPoIB.

    Fix this by changing ib_find_send_mad() to return a send struct only
    if the status is still successful (and not "flushed"). The search of
    the send_list already had this check, so this patch just adds the same
    check to the search of the wait_list.

    Signed-off-by: Roland Dreier

    Roland Dreier
     
  • Fix the AMSO1100 firmware version computation, which was broken
    due to "&&" being used where "&" should have been.

    Signed-off-by: Jean Delvare
    Signed-off-by: Roland Dreier

    Jean Delvare
     
  • Rework some load-time error handling: c2_register_device() leaked when
    it failed, and the function that called it didn't check the return code.

    Signed-off-by: Tom Tucker
    Signed-off-by: Roland Dreier

    Tom Tucker
     
  • Change ehca's Kconfig to activate the scaling code by default. After
    several measurements we saw that this feature prevents dropped packets
    (UD) in stress situations. Thus, enabling it helps to improve ehca's
    bandwidth through IPoIB.

    Signed-off-by: Hoang-Nam Nguyen
    Signed-off-by: Roland Dreier

    Hoang-Nam Nguyen
     
  • Define and use a constant EHCA_MAX_MTU instead of a hardcoded value.

    Signed-off-by: Hoang-Nam Nguyen
    Signed-off-by: Roland Dreier

    Hoang-Nam Nguyen
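
The ib_cancel_mad() race fix in this group applies one rule to both lists: a send that has already been marked as flushed must not be matched to an incoming response. A minimal sketch of that rule, assuming simplified singly linked lists; the types `send_wr` and `wc_status` and the functions `find_live` and `find_send` are hypothetical stand-ins, not the MAD layer's real structures or ib_find_send_mad() itself:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the MAD layer's types; not the kernel structs. */
enum wc_status { WC_SUCCESS, WC_FLUSHED };

struct send_wr {
    uint64_t tid;             /* transaction ID used to match a response */
    enum wc_status status;    /* becomes WC_FLUSHED once the send is canceled */
    struct send_wr *next;
};

/* Return the entry matching tid only if it has not been canceled. */
static struct send_wr *find_live(struct send_wr *head, uint64_t tid)
{
    for (struct send_wr *wr = head; wr != NULL; wr = wr->next)
        if (wr->tid == tid && wr->status == WC_SUCCESS)
            return wr;
    return NULL;
}

/* Apply the same status check to the wait list and the send list. */
static struct send_wr *find_send(struct send_wr *wait_list,
                                 struct send_wr *send_list, uint64_t tid)
{
    struct send_wr *wr = find_live(wait_list, tid);

    return wr != NULL ? wr : find_live(send_list, tid);
}
```

Because a canceled entry is never returned, the receive path treats the response as unmatched and the sender sees only the "flushed" completion.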
     

10 Nov, 2006

1 commit

  • Assure 4K alignment for firmware control blocks in 64K page mode,
    because kzalloc()'s result address might not be 4K aligned if 64K
    pages are enabled. Thus, we introduce wrappers called
    ehca_{alloc,free}_fw_ctrlblock(), which use a slab cache for objects
    with 4K length and 4K alignment in order to alloc/free firmware
    control blocks in 64K page mode. In 4K page mode those wrappers are
    simply defines for get_zeroed_page() and free_page().

    Signed-off-by: Hoang-Nam Nguyen
    Signed-off-by: Roland Dreier

    Hoang-Nam Nguyen
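
The idea behind the control-block wrappers can be shown with a userspace analogy. This is only a sketch: the driver uses a slab cache, whereas the code below assumes C11 `aligned_alloc()` is available, and the names `alloc_fw_ctrlblock`, `free_fw_ctrlblock` and `FW_CTRLBLOCK_SIZE` are hypothetical simplifications of the ehca_{alloc,free}_fw_ctrlblock() wrappers described above:

```c
#include <stdlib.h>
#include <string.h>

#define FW_CTRLBLOCK_SIZE 4096  /* 4K length and 4K alignment, per the text */

/*
 * Userspace stand-in for the allocation wrapper: return a zeroed block
 * whose address is 4K aligned regardless of the system page size.
 */
static void *alloc_fw_ctrlblock(void)
{
    void *block = aligned_alloc(FW_CTRLBLOCK_SIZE, FW_CTRLBLOCK_SIZE);

    if (block != NULL)
        memset(block, 0, FW_CTRLBLOCK_SIZE);
    return block;
}

/* Matching release wrapper, so callers never free the block directly. */
static void free_fw_ctrlblock(void *block)
{
    free(block);
}
```

Keeping allocation and release behind a pair of wrappers is what lets the two page-size configurations use different backing allocators without touching the callers.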
     

09 Nov, 2006

1 commit


03 Nov, 2006

1 commit


01 Nov, 2006

1 commit


31 Oct, 2006

6 commits


17 Oct, 2006

4 commits

  • We discovered a problem when running IPoIB applications on multiple
    CPUs on an Altix system. Many messages such as:

    ib_mthca 0002:01:00.0: SQ 000014 full (19941644 head, 19941707 tail, 64 max, 0 nreq)

    appear in syslog, and the driver wedges up.

    Apparently this is because writes to the doorbells from different CPUs
    reach the device out of order. The following patch adds mmiowb() calls
    after doorbell rings to ensure the doorbell writes are ordered.

    Signed-off-by: Arthur Kepner
    Signed-off-by: Roland Dreier

    Arthur Kepner
     
  • Don't attempt to set up the diagpkt device in the module init code.
    Instead, wait until a piece of hardware is initialized. Fixes a
    problem when loading the ib_ipath module when no InfiniPath hardware
    is present: modprobe would go into the D state and stay there.

    Signed-off-by: Robert Walsh
    Signed-off-by: Roland Dreier

    Robert Walsh
     
  • This patch fixes a NULL dereference spotted by the Coverity checker.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Acked-by: Steve Wise
    Acked-by: Tom Tucker
    Signed-off-by: Roland Dreier

    Adrian Bunk
     
  • pci_module_init() conversion in amso1100 driver.

    Signed-off-by: Henrik Kretzschmar
    Signed-off-by: Andrew Morton
    Signed-off-by: Roland Dreier

    Henrik Kretzschmar
     

11 Oct, 2006

9 commits

  • All HCAs (not just mem-free) need a spare SRQ entry, so bump srq->max
    by 1 in all cases.

    Noted by Jack Morgenstein

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Roland Dreier

    Michael S. Tsirkin
     
  • Signed-off-by: Roland Dreier

    Roland Dreier
     
  • Since pr_debug() has changed from a macro to an inline function when
    DEBUG is not defined, its arguments now need to be defined even when
    debugging is off. Therefore to_event_str() and to_qp_state_str() need
    to be moved out of #ifdef DEBUG. The compiler will throw the
    definitions away if DEBUG is not defined, but it needs to be able to
    see that the functions exist.

    Signed-off-by: Roland Dreier

    Roland Dreier
     
  • Currently a DREP is only sent in response to a DREQ if a connection
    has been found matching the DREQ, and it is in the proper state. Once
    a DREP is sent, the local connection moves into timewait. Duplicate
    DREQs received while in this state result in re-sending the DREP.

    However, it's likely that the local connection will enter and exit
    timewait before the remote side times out a lost DREP and resends a DREQ.
    To handle this, we send a DREP in response to a DREQ, even if a local
    connection is not found. This avoids maintaining disconnected
    id's in timewait states for excessively long times, just to handle a
    lost DREP.

    Signed-off-by: Sean Hefty
    Signed-off-by: Roland Dreier

    Sean Hefty
     
  • If the ib_cm module is unloaded while id's are still in timewait, the
    CM will destroy the work queue used to process timewait. Once the
    id's exit timewait, their timers will fire, leading to a crash trying
    to access the destroyed work queue.

    We need to track id's that are in timewait, and cancel their deferred
    work on module unload.

    Signed-off-by: Sean Hefty
    Signed-off-by: Roland Dreier

    Sean Hefty
     
  • Fill in "max_vl_num" (encoded according to VLCap field in the PortInfo MAD)
    and "init_type_reply" values in the ib_query_port() verb.

    Signed-off-by: Jack Morgenstein
    Signed-off-by: Roland Dreier

    Jack Morgenstein
     
  • Enable multiple concurrent connections to the same SRP target:

    1) Use port GUID instead of node GUID in the initiator port
    identifier. This allows connections to be made from multiple HCA
    ports at the same time.
    2) Let the user specify the identifier extension when adding the
    device. This allows userspace to make multiple connections even
    from the same port, if it wants to.

    Without this, only one connection can be made from any given HCA, even
    if it has multiple ports, because we don't use multi-channel mode, so
    targets will only allow one connection from a given initiator port ID.

    Signed-off-by: Ishai Rabinovitz
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Roland Dreier

    Ishai Rabinovitz
     
  • scsi_host_alloc() already allocates with kzalloc(), so the struct Scsi_Host
    is zeroed out, including the private data portion. Remove the redundant
    memset that zeros this out again in the SRP initiator.

    Signed-off-by: Ishai Rabinovitz
    Signed-off-by: Roland Dreier

    Ishai Rabinovitz
     
  • The AMSO driver was not thread-safe in the post WR code and had
    code that would sleep if the WR post FIFO was full. Since these
    functions can be called at interrupt level, I changed the sleep to a
    udelay.

    Signed-off-by: Tom Tucker
    Signed-off-by: Roland Dreier

    Tom Tucker
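
The pr_debug() entry in this group turns on a C detail: arguments to a macro that expands to nothing are never compiled, but arguments to an inline function always are. A minimal userspace sketch of that situation follows. The bodies of `pr_debug` and `to_event_str` here are simplified stand-ins written for illustration (the event names and `report_event` are hypothetical), not the kernel's definitions:

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Defined unconditionally: the non-DEBUG pr_debug() below is a real
 * function, so its arguments are still compiled and must refer to
 * something the compiler can see.
 */
static const char *to_event_str(int event)
{
    switch (event) {
    case 0:  return "CONNECT";     /* illustrative event names */
    case 1:  return "DISCONNECT";
    default: return "UNKNOWN";
    }
}

#ifdef DEBUG
/* Debug build: actually print the message. */
static inline int pr_debug(const char *fmt, ...)
{
    va_list args;
    int written;

    va_start(args, fmt);
    written = vfprintf(stderr, fmt, args);
    va_end(args);
    return written;
}
#else
/* Non-debug build: arguments are type-checked, then discarded. */
static inline int pr_debug(const char *fmt, ...)
{
    (void)fmt;
    return 0;
}
#endif

/* Compiles in both configurations only because to_event_str() is visible. */
static int report_event(int event)
{
    return pr_debug("event %s\n", to_event_str(event));
}
```

If `to_event_str()` were wrapped in `#ifdef DEBUG`, the non-debug build of `report_event()` would fail to compile, which is the problem the commit describes.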
     

05 Oct, 2006

1 commit

  • Maintain a per-CPU global "struct pt_regs *" variable which can be used instead
    of passing regs around manually through all ~1800 interrupt handlers in the
    Linux kernel.

    The regs pointer is used in few places, but it potentially costs both stack
    space and code to pass it around. On the FRV arch, removing the regs parameter
    from all the genirq functions results in a 20% speed up of the IRQ exit path
    (ie: from leaving timer_interrupt() to leaving do_IRQ()).

    Where appropriate, an arch may override the generic storage facility and do
    something different with the variable. On FRV, for instance, the address is
    maintained in GR28 at all times inside the kernel as part of general exception
    handling.

    Having looked over the code, it appears that the parameter may be handed down
    through up to twenty or so layers of functions. Consider a USB character
    device attached to a USB hub, attached to a USB controller that posts its
    interrupts through a cascaded auxiliary interrupt controller. A character
    device driver may want to pass regs to the sysrq handler through the input
    layer which adds another few layers of parameter passing.

    I've built this code with allyesconfig for x86_64 and i386. I've runtested the
    main part of the code on FRV and i386, though I can't test most of the drivers.
    I've also done partial conversion for powerpc and MIPS - these at least compile
    with minimal configurations.

    This will affect all archs. Mostly the changes should be relatively easy.
    Take do_IRQ(), store the regs pointer at the beginning, saving the old one:

    struct pt_regs *old_regs = set_irq_regs(regs);

    And put the old one back at the end:

    set_irq_regs(old_regs);

    Don't pass regs through to generic_handle_irq() or __do_IRQ().

    In timer_interrupt(), this sort of change will be necessary:

    - update_process_times(user_mode(regs));
    - profile_tick(CPU_PROFILING, regs);
    + update_process_times(user_mode(get_irq_regs()));
    + profile_tick(CPU_PROFILING);

    I'd like to move update_process_times()'s use of get_irq_regs() into itself,
    except that i386, alone of the archs, uses something other than user_mode().

    Some notes on the interrupt handling in the drivers:

    (*) input_dev() is now gone entirely. The regs pointer is no longer stored in
    the input_dev struct.

    (*) finish_unlinks() in drivers/usb/host/ohci-q.c needs checking. It does
    something different depending on whether it's been supplied with a regs
    pointer or not.

    (*) Various IRQ handler function pointers have been moved to type
    irq_handler_t.

    Signed-Off-By: David Howells
    (cherry picked from 1b16e7ac850969f38b375e511e3fa2f474a33867 commit)

    David Howells
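
The save/restore pattern described in the commit above can be put together in a small userspace sketch. Assumptions: a single global pointer stands in for the per-CPU variable, `struct pt_regs` is reduced to two illustrative fields, and `timer_handler` and `handler_saw_user` are hypothetical; only the names `set_irq_regs`, `get_irq_regs` and `do_IRQ` come from the commit message, and their bodies here are simplified guesses rather than the kernel's code:

```c
#include <stddef.h>

/* Stand-in for the architecture's register frame. */
struct pt_regs {
    unsigned long ip;
    int user;            /* non-zero if the interrupt arrived in user mode */
};

/* Per-CPU in the kernel; a single global is enough for this sketch. */
static struct pt_regs *irq_regs_ptr;

static struct pt_regs *get_irq_regs(void)
{
    return irq_regs_ptr;
}

/* Install new_regs and hand back the previous pointer for later restore. */
static struct pt_regs *set_irq_regs(struct pt_regs *new_regs)
{
    struct pt_regs *old_regs = irq_regs_ptr;

    irq_regs_ptr = new_regs;
    return old_regs;
}

static int handler_saw_user;

/* A handler no longer takes regs; it asks for them when it needs them. */
static void timer_handler(void)
{
    handler_saw_user = get_irq_regs()->user;
}

/* Entry point: save the old pointer, run the handler, restore on exit. */
static void do_IRQ(struct pt_regs *regs, void (*handler)(void))
{
    struct pt_regs *old_regs = set_irq_regs(regs);

    handler();
    set_irq_regs(old_regs);
}
```

Restoring the old pointer on exit is what keeps nested interrupts consistent: the interrupted handler sees its own regs again once the inner one returns.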
     

04 Oct, 2006

2 commits


03 Oct, 2006

4 commits

  • Add an extra space to make things more readable.

    Signed-off-by: Hoang-Nam Nguyen
    Signed-off-by: Roland Dreier

    Hoang-Nam Nguyen
     
  • Move the call to ib_register_device() later, since a device should not
    be registered until it is completely ready to be used. This fixes
    crashes that occur if an upper-layer driver such as IPoIB is loaded
    before the ehca module.

    Signed-off-by: Hoang-Nam Nguyen
    Signed-off-by: Roland Dreier

    Hoang-Nam Nguyen
     
  • The PSN used to generate the request following an RDMA read was
    incorrect and some state bookkeeping wasn't maintained correctly. This
    patch fixes that.

    Signed-off-by: Ralph Campbell
    Signed-off-by: Bryan O'Sullivan

    Ralph Campbell
     
  • Reorganize code relating to cma_get_net_info() and rdma_create_id() to
    optimize error case handling (no need to alloc memory/etc. as part of
    rdma_create_id() if input parameters are wrong).

    Signed-off-by: Krishna Kumar
    Signed-off-by: Sean Hefty
    Signed-off-by: Roland Dreier

    Krishna Kumar
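
The reordering in the last commit above follows a general pattern: validate inputs first, allocate second, so the error path has nothing to undo. A minimal sketch of that pattern, with the caveat that the commit message does not say which parameters are checked; the port-space check, the `conn_id` type, the `allocations` counter and `create_id` are all hypothetical and do not reflect the real rdma_create_id() signature:

```c
#include <stdlib.h>

enum port_space { PS_TCP = 1, PS_UDP = 2 };  /* hypothetical values */

struct conn_id {
    enum port_space ps;
    void *context;
};

static int allocations;  /* counts allocations so the ordering is observable */

/* Reject bad input before touching the allocator. */
static struct conn_id *create_id(int ps, void *context)
{
    struct conn_id *id;

    if (ps != PS_TCP && ps != PS_UDP)
        return NULL;               /* nothing allocated, nothing to free */

    id = calloc(1, sizeof(*id));
    if (id == NULL)
        return NULL;
    allocations++;

    id->ps = (enum port_space)ps;
    id->context = context;
    return id;
}
```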