29 Mar, 2009

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (119 commits)
    [SCSI] scsi_dh_rdac: Retry for NOT_READY check condition
    [SCSI] mpt2sas: make global symbols unique
    [SCSI] sd: Make revalidate less chatty
    [SCSI] sd: Try READ CAPACITY 16 first for SBC-2 devices
    [SCSI] sd: Refactor sd_read_capacity()
    [SCSI] mpt2sas v00.100.11.15
    [SCSI] mpt2sas: add MPT2SAS_MINOR(221) to miscdevice.h
    [SCSI] ch: Add scsi type modalias
    [SCSI] 3w-9xxx: add power management support
    [SCSI] bsg: add linux/types.h include to bsg.h
    [SCSI] cxgb3i: fix function descriptions
    [SCSI] libiscsi: fix possbile null ptr session command cleanup
    [SCSI] iscsi class: remove host no argument from session creation callout
    [SCSI] libiscsi: pass session failure a session struct
    [SCSI] iscsi lib: remove qdepth param from iscsi host allocation
    [SCSI] iscsi lib: have lib create work queue for transmitting IO
    [SCSI] iscsi class: fix lock dep warning on logout
    [SCSI] libiscsi: don't cap queue depth in iscsi modules
    [SCSI] iscsi_tcp: replace scsi_debug/tcp_debug logging with iscsi conn logging
    [SCSI] libiscsi_tcp: replace tcp_debug/scsi_debug logging with session/conn logging
    ...

    Linus Torvalds
     

27 Mar, 2009

5 commits

  • When net-next and infiniband were merged upstream, each branch deleted
    one of a pair of adjacent lines from nes_nic.c, but when Linus fixed the
    conflict up, he brought back both of the lines. Fix up to the intended
    final tree state.

    Signed-off-by: Roland Dreier
    Acked-by: David S. Miller
    Signed-off-by: Linus Torvalds

    Roland Dreier
     
  • * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (26 commits)
    posix timers: fix RLIMIT_CPU && fork()
    time: ntp: fix bug in ntp_update_offset() & do_adjtimex(), fix
    time: ntp: clean up second_overflow()
    time: ntp: simplify ntp_tick_adj calculations
    time: ntp: make 64-bit constants more robust
    time: ntp: refactor do_adjtimex() some more
    time: ntp: refactor do_adjtimex()
    time: ntp: fix bug in ntp_update_offset() & do_adjtimex()
    time: ntp: micro-optimize ntp_update_offset()
    time: ntp: simplify ntp_update_offset_fll()
    time: ntp: refactor and clean up ntp_update_offset()
    time: ntp: refactor up ntp_update_frequency()
    time: ntp: clean up ntp_update_frequency()
    time: ntp: simplify the MAX_TICKADJ_SCALED definition
    time: ntp: simplify the second_overflow() code flow
    time: ntp: clean up kernel/time/ntp.c
    x86: hpet: stop HPET_COUNTER when programming periodic mode
    x86: hpet: provide separate functions to stop and start the counter
    x86: hpet: print HPET registers during setup (if hpet=verbose is used)
    time: apply NTP frequency/tick changes immediately
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1750 commits)
    ixgbe: Allow Priority Flow Control settings to survive a device reset
    net: core: remove unneeded include in net/core/utils.c.
    e1000e: update version number
    e1000e: fix close interrupt race
    e1000e: fix loss of multicast packets
    e1000e: commonize tx cleanup routine to match e1000 & igb
    netfilter: fix nf_logger name in ebt_ulog.
    netfilter: fix warning in ebt_ulog init function.
    netfilter: fix warning about invalid const usage
    e1000: fix close race with interrupt
    e1000: cleanup clean_tx_irq routine so that it completely cleans ring
    e1000: fix tx hang detect logic and address dma mapping issues
    bridge: bad error handling when adding invalid ether address
    bonding: select current active slave when enslaving device for mode tlb and alb
    gianfar: reallocate skb when headroom is not enough for fcb
    Bump release date to 25Mar2009 and version to 0.22
    r6040: Fix second PHY address
    qeth: fix wait_event_timeout handling
    qeth: check for completion of a running recovery
    qeth: unregister MAC addresses during recovery.
    ...

    Manually fixed up conflicts in:
    drivers/infiniband/hw/cxgb3/cxio_hal.h
    drivers/infiniband/hw/nes/nes_nic.c

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (30 commits)
    RDMA/cxgb3: Enforce required firmware
    IB/mlx4: Unregister IB device prior to CLOSE PORT command
    mlx4_core: Add link type autosensing
    mlx4_core: Don't perform SET_PORT command for Ethernet ports
    RDMA/nes: Handle MPA Reject message properly
    RDMA/nes: Improve use of PBLs
    RDMA/nes: Remove LLTX
    RDMA/nes: Inform hardware that asynchronous event has been handled
    RDMA/nes: Fix tmp_addr compilation warning
    RDMA/nes: Report correct vendor_id and vendor_part_id
    RDMA/nes: Update copyright to new legal entity and year
    RDMA/nes: Account for freed PBL after HW operation
    IB: Remove useless ibdev_is_alive() tests from sysfs code
    IB/sa_query: Fix AH leak due to update_sm_ah() race
    IB/mad: Fix ib_post_send_mad() returning 0 with no generate send comp
    IB/mad: initialize mad_agent_priv before putting on lists
    IB/mad: Fix null pointer dereference in local_completions()
    IB/mad: Fix RMPP header RRespTime manipulation
    IB/iser: Remove hard setting of path MTU
    mlx4_core: Add device IDs for MT25458 10GigE devices
    ...

    Linus Torvalds
     
  • Conflicts:
    drivers/net/wimax/i2400m/usb-notif.c

    David S. Miller
     

26 Mar, 2009

1 commit


25 Mar, 2009

2 commits


22 Mar, 2009

3 commits


19 Mar, 2009

1 commit

  • According to the ConnectX programmer's reference manual, all
    operations should be stopped, all QPs should be torn down and all WQEs
    flushed before the CLOSE_PORT command is invoked. In some cases
    reversing the order of operations (as implemented now) could cause
    a loss of completions.

    Signed-off-by: Yevgeny Petrilin
    Signed-off-by: Roland Dreier

    Yevgeny Petrilin
     

14 Mar, 2009

5 commits


13 Mar, 2009

1 commit

  • STag zero is a special STag that allows consumers to access any bus
    address without registering memory. The nes driver unfortunately
    allows STag zero to be used even with QPs created by unprivileged
    userspace consumers, which means that any process with direct verbs
    access to the nes device can read and write any memory accessible to
    the underlying PCI device (usually any memory in the system). Such
    access is usually given for cluster software such as MPI to use, so
    this is a local privilege escalation bug on most systems running this
    driver.

    The driver was using STag zero to receive the last streaming mode
    data; to allow STag zero to be disabled for unprivileged QPs, the
    driver now registers a special MR for this data.

    Cc:
    Signed-off-by: Faisal Latif
    Signed-off-by: Roland Dreier
    Signed-off-by: Linus Torvalds

    Faisal Latif
     

07 Mar, 2009

8 commits

  • Testing exposed failures because the MPA Reject message was not
    handled. To handle MPA Reject, the following changes were made:

    * Handle inbound/outbound MPA Reject response messages.
    When nes_reject() is called for a pending MPA request reply,
    send the MPA Reject message to the peer (active side) cm_node.
    The peer cm_node (active side) then indicates a Reject message
    event for the pending Connect Request.

    * Handle MPA Reject response messages for loopback connections and
    listeners. When an MPA Request is rejected, check whether it is a
    loopback connection; if so, send a Reject message event to the peer
    loopback node. Also, when destroying a listener, check whether the
    cm_nodes for that listener are loopback.

    * Add graceful connection close with the MPA Reject response message.
    Send a graceful close (FIN, FIN ACK, ...) to terminate the cm_nodes.

    * Some code reorganization while making the above changes.
    Removed recv_list and recv_list_lock from the cm_node
    structure, as there can be only one receive close entry on the
    timer. Also implemented handle_recv_entry(), as the receive close
    entry is processed from both nes_rem_ref_cm_node() and
    nes_cm_timer_tick().

    Signed-off-by: Faisal Latif
    Signed-off-by: Roland Dreier

    Faisal Latif
     
  • Two-level 256-byte PBLs were not implemented, so the driver could
    report out of memory when in fact PBLs were still available.

    This solution prefers to use 4KB PBLs over two level 256B PBLs until
    the number of 4KB PBLs falls below a threshold. At this point the 4KB
    PBL structure is converted to use 256B PBLs which prevents the driver
    from running out of 4KB PBLs too quickly.

    Signed-off-by: Don Wood
    Signed-off-by: Roland Dreier

    Don Wood
     
  • NETIF_F_LLTX is deprecated. Remove private TX locking from the driver
    and remove the NETIF_F_LLTX feature flag. This also fixes a warning
    in some configs that comes from doing skb_linearize() call in the
    hard_start_xmit method with IRQs disabled (if HIGHMEM is enabled,
    skb_linearize() may end up enabling BHs, which is a no-no if hard IRQs
    are disabled in that context). By getting rid of LLTX, we do not
    disable IRQs when skb_linearize() is called.

    Remove the sq_lock as it is not needed for non-LLTX. Fix ethtool not
    to show the counter for sq_lock.

    Reported-by: aluno3@poczta.onet.pl
    Signed-off-by: Faisal Latif
    Signed-off-by: Roland Dreier

    Faisal Latif
     
  • When asynchronous events are processed by software, it is necessary
    to let the hardware know that software has handled the event. This
    frees up the entry in the asynchronous event queue.

    Signed-off-by: Don Wood
    Signed-off-by: Chien Tung
    Signed-off-by: Roland Dreier

    Don Wood
     
  • In find_node(), tmp_addr causes an "unused variable" warning when
    INFINIBAND_NES_DEBUG is not defined. It is only used in a nes_debug()
    call, and the message printed there is not meaningful, so remove both
    the variable and the print.

    Reported-by: Manish Katiyar
    Signed-off-by: Chien Tung
    Signed-off-by: Roland Dreier

    Chien Tung
     
  • ibv_devinfo displays 0 for vendor_id and vendor_part_id. Fill in OUI
    and device_id for those two fields.

    Signed-off-by: Chien Tung
    Signed-off-by: Roland Dreier

    Chien Tung
     
  • Update copyright to the new legal entity, Intel-NE, Inc., an Intel
    company. Update copyright for the new year.

    Signed-off-by: Chien Tung
    Signed-off-by: Roland Dreier

    Chien Tung
     
  • Fix occurrences where the software PBL counts were changed before the
    hardware was updated. This bug allowed another thread to overallocate
    the hardware resources.

    Add proper PBL accounting in case nes_reg_mr() fails.

    Signed-off-by: Don Wood
    Signed-off-by: Roland Dreier

    Don Wood
     

05 Mar, 2009

1 commit

  • Some attribute show functions test ibdev_is_alive() to make sure that
    it's OK to access device state. However, the sysfs attributes will
    not be registered until the device is fully initialized, and they'll
    be unregistered before anything is torn down, so ibdev_is_alive()
    doesn't do anything useful. Remove it.

    Signed-off-by: Roland Dreier

    Roland Dreier
     

04 Mar, 2009

2 commits

  • Our testing uncovered a race condition in ib_sa_event():

    spin_lock_irqsave(&port->ah_lock, flags);
    if (port->sm_ah)
            kref_put(&port->sm_ah->ref, free_sm_ah);
    port->sm_ah = NULL;
    spin_unlock_irqrestore(&port->ah_lock, flags);

    schedule_work(&sa_dev->port[event->element.port_num -
                                sa_dev->start_port].update_task);

    If two events occur back-to-back (e.g., client-reregister and LID
    change), both may pass the spinlock-protected code above before the
    scheduled work updates the port->sm_ah handle. If the scheduled
    work then ends up running twice, the second run will find a
    non-NULL port->sm_ah and will simply overwrite it in update_sm_ah(),
    resulting in an AH leak.

    Signed-off-by: Jack Morgenstein
    Signed-off-by: Roland Dreier

    Jack Morgenstein
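
The leak described in this commit message can be modeled in user space. The following is a minimal sketch, not the kernel code: `struct handle`, `handle_put()`, `update_overwrite()` and `update_release_old()` are hypothetical stand-ins for the AH, `kref_put()` and `update_sm_ah()`, and a global counter replaces real allocation tracking. It only illustrates why overwriting a non-NULL pointer loses one reference when the update runs twice, and shows one possible remedy (dropping the old reference first).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted address handle. */
struct handle {
    int refs;
};

static int live_handles; /* handles allocated and not yet freed */

static struct handle *handle_new(void)
{
    struct handle *h = malloc(sizeof(*h));

    assert(h != NULL);
    h->refs = 1;
    live_handles++;
    return h;
}

/* Drops one reference; frees the handle when the count reaches zero. */
static void handle_put(struct handle *h)
{
    if (h && --h->refs == 0) {
        free(h);
        live_handles--;
    }
}

/* Leaky variant: blindly overwrites whatever the slot currently holds. */
static void update_overwrite(struct handle **slot)
{
    *slot = handle_new();
}

/* Safer variant: drops the old reference before installing the new one. */
static void update_release_old(struct handle **slot)
{
    struct handle *new_h = handle_new();

    handle_put(*slot);
    *slot = new_h;
}

/*
 * Runs the update twice, as when two events are handled back-to-back,
 * then drops the slot's own reference. Returns how many handles leaked.
 */
static int leaked_after_two_updates(void (*update)(struct handle **))
{
    struct handle *slot = NULL;
    int before = live_handles;

    update(&slot);
    update(&slot);
    handle_put(slot);
    return live_handles - before;
}
```

Under these assumptions, `leaked_after_two_updates(update_overwrite)` reports one leaked handle, while the variant that releases the old handle reports none.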
     
  • If ib_post_send_mad() returns 0, the API guarantees that there will be
    a callback to send_buf->mad_agent->send_handler() so that the sender
    can call ib_free_send_mad(). Otherwise, the ib_mad_send_buf will be
    leaked and the mad_agent reference count will never go to zero and the
    IB device module cannot be unloaded. The above can happen without
    this patch if process_mad() returns (IB_MAD_RESULT_SUCCESS |
    IB_MAD_RESULT_CONSUMED).

    If process_mad() returns IB_MAD_RESULT_SUCCESS and there is no agent
    registered to receive the MAD being sent, handle_outgoing_dr_smp()
    returns zero. This causes a MAD packet that is at the end of the
    directed route to be incorrectly sent on the wire, but it does not
    cause a hang, since the HCA generates a send completion.

    Signed-off-by: Ralph Campbell
    Signed-off-by: Roland Dreier

    Ralph Campbell
     

28 Feb, 2009

3 commits

  • There is a potential race in ib_register_mad_agent() where the struct
    ib_mad_agent_private is not fully initialized before it is added to
    the list of agents per IB port. This means the ib_mad_agent_private
    could be seen before the refcount, spin locks, and linked lists are
    initialized. The fix is to initialize the structure earlier.

    Signed-off-by: Ralph Campbell
    Signed-off-by: Roland Dreier

    Ralph Campbell
     
  • handle_outgoing_dr_smp() can queue a struct ib_mad_local_private
    *local on the mad_agent_priv->local_work work queue with
    local->mad_priv == NULL if device->process_mad() returns
    IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY and
    (!ib_response_mad(&mad_priv->mad.mad) ||
    !mad_agent_priv->agent.recv_handler).

    In this case, local_completions() will be called with local->mad_priv
    == NULL. The code does check for this case and skips calling
    recv_mad_agent->agent.recv_handler() but recv == 0 so
    kmem_cache_free() is called with a NULL pointer.

    Also, since recv isn't reinitialized each time through the loop, it
    can cause a memory leak if recv should have been zero.

    Signed-off-by: Ralph Campbell

    Ralph Campbell
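
The two problems named in this message (releasing a buffer that does not exist, and a per-item flag that carries over between loop iterations) can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the code in local_completions(): `struct toy_work`, `complete_all()` and `released_in_demo()` are hypothetical names, and "released" is just a counter rather than a call to kmem_cache_free().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical work item. */
struct toy_work {
    bool has_buf;      /* stands in for local->mad_priv != NULL */
    bool has_receiver; /* a receive handler exists and takes the buffer */
};

/*
 * Returns how many buffers the loop itself released.
 * With reset_flag == false, `recv` keeps its value from the previous
 * item, which models the stale-flag problem.
 */
static int complete_all(const struct toy_work *w, size_t n, bool reset_flag)
{
    int released = 0;
    bool recv = false;

    assert(w != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (reset_flag)
            recv = false;          /* per-item state starts clean */
        if (w[i].has_buf && w[i].has_receiver)
            recv = true;           /* the receiver now owns the buffer */
        if (!recv && w[i].has_buf) /* never release a missing buffer */
            released++;
    }
    return released;
}

/* First item is handed to a receiver, second has a buffer but no receiver. */
static int released_in_demo(bool reset_flag)
{
    static const struct toy_work items[] = {
        { .has_buf = true, .has_receiver = true },
        { .has_buf = true, .has_receiver = false },
    };

    return complete_all(items, sizeof(items) / sizeof(items[0]), reset_flag);
}
```

In this model the second item's buffer is released only when the flag is reset for every item; with the stale flag it is silently skipped, which corresponds to the leak mentioned above.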
     
  • Remove hard setting of the IB MTU used by iSER's RC queue-pair to 1K,
    as this was done due to inter-op issues with an old iser target which
    is not used any more.

    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Or Gerlitz
     

26 Feb, 2009

1 commit

  • Move the ib_device_unregister_sysfs() call from ib_dealloc_device() to
    ib_unregister_device(). The old code allows device unregister to
    proceed even if some sysfs files are open, which leaves a window where
    userspace can open a file before a device is removed but then end up
    reading the file after the device is removed, which leads to various
    kernel crashes either because the device data structure is freed or
    because the low-level driver code is gone after module removal.

    By not returning from ib_unregister_device() until after all sysfs
    entries are removed, we make sure that data structures and/or module
    code is not freed until after all sysfs access is done.

    Reported-by: Jack Morgenstein
    Signed-off-by: Roland Dreier

    Roland Dreier
     

23 Feb, 2009

2 commits


19 Feb, 2009

1 commit

  • Impact: new timer API

    Based on an idea from Martin Josefsson with the help of
    Patrick McHardy and Stephen Hemminger:

    Introduce the mod_timer_pending() API, a mod_timer() variant
    that leaves already-removed timers untouched.

    (regular mod_timer() re-activates non-pending timers.)

    This is useful for the networking code in that it can
    allow unserialized mod_timer_pending() timer-forwarding
    calls, but a single del_timer*() will stop the timer
    from being reactivated again.

    Also while at it:

    - optimize the regular mod_timer() path some more: the
    timer-stat and a debug check were needlessly duplicated
    in __mod_timer().

    - make the exports come straight after the function, as
    most other exports in timer.c already did.

    - eliminate __mod_timer() as an external API, change the
    users to mod_timer().

    The regular mod_timer() code path is not impacted
    significantly, due to inlining optimizations and due to
    the simplifications.

    Based-on-patch-from: Stephen Hemminger
    Acked-by: Stephen Hemminger
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Cc: netdev@vger.kernel.org
    Cc: Oleg Nesterov
    Cc: Andrew Morton
    Signed-off-by: Ingo Molnar

    Ingo Molnar
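
The difference between the two calls can be sketched with a toy model. This is only an illustration of the semantics described in the message, assuming a timer reduced to a pending flag and an expiry value; `struct toy_timer` and the `toy_*` functions are hypothetical and do not reproduce kernel/timer.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified timer: only the fields the sketch needs. */
struct toy_timer {
    bool pending;
    unsigned long expires;
};

/* Like mod_timer(): always (re)arms; returns 1 if the timer was pending. */
static int toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
    int was_pending = t->pending;

    t->expires = expires;
    t->pending = true;
    return was_pending;
}

/* Like mod_timer_pending(): only forwards a timer that is still pending. */
static int toy_mod_timer_pending(struct toy_timer *t, unsigned long expires)
{
    if (!t->pending)
        return 0;
    t->expires = expires;
    return 1;
}

/* Like del_timer(): deactivates; returns 1 if the timer was pending. */
static int toy_del_timer(struct toy_timer *t)
{
    int was_pending = t->pending;

    t->pending = false;
    return was_pending;
}

/*
 * Arms a timer, deletes it, then calls `mod` on it. Reports whether the
 * deleted timer ended up pending again.
 */
static bool rearmed_after_delete(int (*mod)(struct toy_timer *, unsigned long))
{
    struct toy_timer t = { .pending = false, .expires = 0 };
    int was_pending;

    toy_mod_timer(&t, 100);
    was_pending = toy_del_timer(&t);
    assert(was_pending == 1);
    mod(&t, 200);
    return t.pending;
}
```

In this model, a late `toy_mod_timer()` call brings a deleted timer back to life, whereas a late `toy_mod_timer_pending()` call leaves it deleted, which is the property that lets a single delete stop unserialized forwarding calls.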
     

18 Feb, 2009

1 commit

  • If path_rec_start() returns an error, call path_free() only if the
    path was newly created. If we free an existing path whose valid flag
    was zero (but do not detach it from the list), we corrupt the
    path list (of which it is a member) and get a kernel crash.

    The simplest solution is to not free an existing path -- just leave it
    in the list as-is (i.e., with its valid flag cleared).

    Thanks to Yossi Etigin of Voltaire for identifying the problem flow
    which caused the kernel crash.

    Signed-off-by: Jack Morgenstein
    Signed-off-by: Moni Shua
    Signed-off-by: Roland Dreier

    Jack Morgenstein
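
The rule "free on error only if the path was newly created" can be sketched in user space as follows. This is a simplified model, not the driver code: `struct toy_path`, the `toy_path_*` helpers and the `start_fails` parameter (which simulates an error return from path_rec_start()) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified path record kept on a singly linked list. */
struct toy_path {
    int key;
    bool valid;
    struct toy_path *next;
};

static struct toy_path *path_list;

static struct toy_path *toy_path_find(int key)
{
    for (struct toy_path *p = path_list; p; p = p->next)
        if (p->key == key)
            return p;
    return NULL;
}

/* Clears the valid flag but leaves the path on the list. */
static void toy_path_invalidate(int key)
{
    struct toy_path *p = toy_path_find(key);

    if (p)
        p->valid = false;
}

/* Returns 0 on success, -1 on failure. */
static int toy_path_resolve(int key, bool start_fails)
{
    struct toy_path *path = toy_path_find(key);
    bool new_path = false;

    if (!path) {
        path = calloc(1, sizeof(*path));
        if (!path)
            return -1;
        path->key = key;
        new_path = true;
    }

    if (start_fails) {
        /* Only a path that was never linked into the list may be freed. */
        if (new_path)
            free(path);
        return -1;
    }

    if (new_path) {
        path->next = path_list;
        path_list = path;
    }
    path->valid = true;
    return 0;
}

/* Walks the failure scenario; true if the existing path stays usable. */
static bool toy_existing_path_survives_failure(void)
{
    struct toy_path *p;
    int rc;

    rc = toy_path_resolve(7, false); /* create and link the path */
    assert(rc == 0);
    toy_path_invalidate(7);          /* valid flag cleared, still linked */
    rc = toy_path_resolve(7, true);  /* start step fails on existing path */
    assert(rc == -1);

    p = toy_path_find(7);
    return p != NULL && !p->valid &&
           toy_path_resolve(7, false) == 0 && p->valid;
}
```

The existing entry stays on the list with its valid flag cleared after the failed attempt, so a later attempt can reuse it, while a failed attempt on a brand-new path leaves nothing behind.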
     

17 Feb, 2009

1 commit


11 Feb, 2009

1 commit

  • The poll and flush code needs to handle all send opcodes: SEND,
    SEND_WITH_SE, SEND_WITH_INV, and SEND_WITH_SE_INV.

    Ignore TERM indications if the connection is already gone.

    Ignore HW receive completions if the RQ is empty.

    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

    Steve Wise