09 Jan, 2012

2 commits

  • infiniband changes for 3.3 merge window

    * tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
    rdma/core: Fix sparse warnings
    RDMA/cma: Fix endianness bugs
    RDMA/nes: Fix terminate during AE
    RDMA/nes: Make unnecessarily global nes_set_pau() static
    RDMA/nes: Change MDIO bus clock to 2.5MHz
    IB/cm: Fix layout of APR message
    IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE
    IB/qib: Default some module parameters optimally
    IB/qib: Optimize locking for get_txreq()
    IB/qib: Fix a possible data corruption when receiving packets
    IB/qib: Eliminate 64-bit jiffies use
    IB/qib: Fix style issues
    IB/uverbs: Protect QP multicast list

    Linus Torvalds
     
  • * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
    reiserfs: Properly display mount options in /proc/mounts
    vfs: prevent remount read-only if pending removes
    vfs: count unlinked inodes
    vfs: protect remounting superblock read-only
    vfs: keep list of mounts for each superblock
    vfs: switch ->show_options() to struct dentry *
    vfs: switch ->show_path() to struct dentry *
    vfs: switch ->show_devname() to struct dentry *
    vfs: switch ->show_stats to struct dentry *
    switch security_path_chmod() to struct path *
    vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
    vfs: trim includes a bit
    switch mnt_namespace ->root to struct mount
    vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
    vfs: opencode mntget() mnt_set_mountpoint()
    vfs: spread struct mount - remaining argument of next_mnt()
    vfs: move fsnotify junk to struct mount
    vfs: move mnt_devname
    vfs: move mnt_list to struct mount
    vfs: switch pnode.h macros to struct mount *
    ...

    Linus Torvalds
     

05 Jan, 2012

4 commits


04 Jan, 2012

7 commits

  • For IBoE, SLs 0-7 are mapped to Ethernet 802.1Q user priority bits
    (pbits) which are part of the VLAN tag, SLs 8-15 are reserved.

    Under Ethernet, the ConnectX firmware treats (decode/encode) the four
    bit SL field in various constructs such as QPC / UD WQE / CQE as PPP0
    and not as 0PPP. This correlates well to the fact that within the
    vlan tag the pbits are located in bits 15-13 and not 12-14.

    The current code wasn't consistent around that area - the
    encoding was correct for the IBoE QPC.path.schedule_queue field,
    but was wrong for IBoE CQEs and when MLX header was built.

    These inconsistencies resulted in wrong SL wire 802.1Q pbits
    mapping, which is fixed by using SL PPP0 all around the place.

    Signed-off-by: Oren Duer
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Or Gerlitz
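
The PPP0 versus 0PPP distinction described above can be sketched in a few lines. This is a minimal userspace illustration, not the mlx4 driver's code: the helper names `sl_to_ppp0`, `ppp0_to_sl` and `vlan_tci` are hypothetical, and the layout assumed is the one the message states (priority in the upper three bits of the 4-bit SL field, and in bits 15-13 of the VLAN tag).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not the mlx4 driver API. */

/* Encode a 3-bit priority into the 4-bit SL field as PPP0. */
static uint8_t sl_to_ppp0(uint8_t sl)
{
    assert(sl < 8);                 /* SLs 8-15 are reserved for IBoE */
    return (uint8_t)(sl << 1);
}

/* Decode a PPP0 field back into the 3-bit priority. */
static uint8_t ppp0_to_sl(uint8_t field)
{
    return (uint8_t)((field >> 1) & 0x7);
}

/* Build a 16-bit VLAN tag control field: pbits in bits 15-13,
 * bit 12 left as zero, VLAN id in bits 11-0. */
static uint16_t vlan_tci(uint8_t sl, uint16_t vid)
{
    assert(sl < 8);
    return (uint16_t)(((unsigned)sl << 13) | (vid & 0x0fffu));
}
```

With SL 5 the PPP0 field is 0xA, which is also the top nibble of the resulting VLAN tag (0xA064 for VLAN id 100). A 0PPP encoding would have produced 0x5 instead, which is the kind of mismatch the commit describes.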
     
  • Minimize the need for users to have to set module parameters to get
    good performance.

    The following two parameters are changed:
    - rcvhdrcnt to twice the rcvegrcnt
    - pcie_caps=0x51

    Setting rcvhdrcnt to twice the rcvegrcnt allows the preemptive NAK code
    during reception to function in 100% of the cases, rather than relying
    on a sender jiffies-based timeout.

    The pcie_caps default of 0x51 will set the proposed MaxPayload and
    MaxReadRequest to 256 and 4096 respectively. The capabilities on
    the root complex will be used to limit those values.

    Reviewed-by: Ram Vepa
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Roland Dreier

    Mike Marciniszyn
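
As a worked example of how 0x51 yields 256 and 4096, the sketch below decodes the value under two assumptions that are not spelled out in the message: the low nibble carries the payload code and the high nibble the request-size code, and each code n stands for 128 << n bytes (the usual PCIe size encoding). The names `decode_pcie_caps` and `clamp_to_root` are hypothetical.

```c
#include <assert.h>

/* Hypothetical decode of a pcie_caps-style value. The nibble layout is an
 * assumption chosen to match the "0x51 -> 256 and 4096" description. */
struct pcie_tuning {
    unsigned max_payload;   /* proposed payload size in bytes */
    unsigned max_read_req;  /* proposed request size in bytes */
};

static struct pcie_tuning decode_pcie_caps(unsigned caps)
{
    struct pcie_tuning t;
    unsigned payload_code = caps & 0xf;         /* low nibble */
    unsigned readreq_code = (caps >> 4) & 0xf;  /* high nibble */

    /* PCIe defines size codes 0..5 (128 bytes up to 4096 bytes). */
    assert(payload_code <= 5 && readreq_code <= 5);
    t.max_payload = 128u << payload_code;
    t.max_read_req = 128u << readreq_code;
    return t;
}

/* The proposed value is only a request; the root complex limits it. */
static unsigned clamp_to_root(unsigned proposed, unsigned root_limit)
{
    return proposed < root_limit ? proposed : root_limit;
}
```

For 0x51 the low nibble is 1 (128 << 1 = 256) and the high nibble is 5 (128 << 5 = 4096). If the root complex only supported a 128-byte payload, `clamp_to_root(256, 128)` would keep the effective value at 128.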
     
  • The current code locks the QP s_lock, followed by the pending_lock, I
    guess to protect against the allocation failing.

    This patch only locks the pending_lock, assuming that the empty case
    is an exception, in which case the pending_lock is dropped, and the
    original code is executed. This will save a lock of s_lock in the
    normal case.

    The observation is that the sdma descriptors will deplete at twice the
    rate of txreq's, so this should be rare.

    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Roland Dreier

    Mike Marciniszyn
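
The fast-path/slow-path split described above might look roughly like the following. This is a single-threaded model, not the qib code: the locks are counting stand-ins so the saving can be observed, and `model_dev`, `model_qp` and the `waiting` flag are hypothetical names introduced for illustration.

```c
#include <assert.h>

/* Counting stand-in for a spinlock; single-threaded model only. */
struct model_lock {
    unsigned acquisitions;
    int held;
};

static void model_acquire(struct model_lock *l)
{
    assert(!l->held);
    l->held = 1;
    l->acquisitions++;
}

static void model_release(struct model_lock *l)
{
    assert(l->held);
    l->held = 0;
}

struct model_dev {
    struct model_lock pending_lock;
    int free_txreqs;            /* entries left on the free list */
};

struct model_qp {
    struct model_lock s_lock;
    int waiting;                /* assumed QP state guarded by s_lock */
};

/* Slow path: the original ordering, s_lock first, then pending_lock. */
static int get_txreq_slow(struct model_dev *dev, struct model_qp *qp)
{
    int got = 0;

    model_acquire(&qp->s_lock);
    model_acquire(&dev->pending_lock);
    if (dev->free_txreqs > 0) {
        dev->free_txreqs--;
        got = 1;
    } else {
        qp->waiting = 1;        /* record that the QP must wait */
    }
    model_release(&dev->pending_lock);
    model_release(&qp->s_lock);
    return got;
}

/* Fast path: pending_lock only; fall back if the free list is empty. */
static int get_txreq(struct model_dev *dev, struct model_qp *qp)
{
    model_acquire(&dev->pending_lock);
    if (dev->free_txreqs > 0) {
        dev->free_txreqs--;
        model_release(&dev->pending_lock);
        return 1;
    }
    /* Drop pending_lock first so the slow path keeps the lock order. */
    model_release(&dev->pending_lock);
    return get_txreq_slow(dev, qp);
}
```

In the common case only `pending_lock` is taken. The empty case costs one extra `pending_lock` round trip before falling back, which is acceptable if, as the message argues, it is rare.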
     
  • Prevent a receive data corruption by ensuring that the write to update
    the rcvhdrheadn register to generate an interrupt is at the very end
    of the receive processing.

    Signed-off-by: Ramkrishna Vepa
    Signed-off-by: Mike Marciniszyn
    Cc:
    Signed-off-by: Roland Dreier

    Ram Vepa
     
  • The qib driver makes use of the 64-bit jiffies API.

    Code inspection reveals that that version of the API is not really
    required. This patch converts to use the "normal" jiffies.

    Reviewed-by: Ram Vepa
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Roland Dreier

    Mike Marciniszyn
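
The message does not say why the 64-bit API is unnecessary. One common reason, shown here purely as an assumption, is that short timeouts can be compared safely on a wrapping native-width counter by using a signed difference, which is the idea behind the kernel's `time_after()` macro. The sketch below is a userspace rendition with a hypothetical name, `tick_after`.

```c
#include <assert.h>
#include <limits.h>

/* Userspace rendition of the time_after() idiom. Returns nonzero if tick
 * value a is later than b, provided the two are less than half the counter
 * range apart. The unsigned subtraction wraps by definition; the conversion
 * to long is implementation-defined in C but behaves as two's complement on
 * the platforms Linux supports. */
static int tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}
```

A plain `a > b` comparison gives the wrong answer once the counter wraps, while the signed-difference form still orders the two values correctly.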
     
  • More style issues revealed with checkpatch.pl -f.

    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Roland Dreier

    Mike Marciniszyn
     
  • Signed-off-by: Al Viro

    Al Viro
     

24 Dec, 2011

1 commit


20 Dec, 2011

2 commits


14 Dec, 2011

3 commits


07 Dec, 2011

1 commit

  • Commit cfcde11c3d7a ("IB/mlx4: Use flow counters on IBoE ports") added
    code that sets elements of counters[] to -1 if no counter is allocated,
    but then goes ahead and passes every entry to mlx4_counter_free() on
    shutdown. This is a bad idea, especially if MLX4_DEV_CAP_FLAG_COUNTERS
    isn't set so there isn't even an underlying bitmap to free from.

    Tested-by: Sean Hefty
    Cc:
    Signed-off-by: Roland Dreier

    Roland Dreier
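
A minimal sketch of the guard this message implies: skip any entry still holding the -1 sentinel when releasing counters. The names below (`model_counter_free`, `release_counters`, `NO_COUNTER`) are hypothetical stand-ins rather than the mlx4 API, and the stub allocator simply asserts on an invalid index to model the damage an unguarded free could do.

```c
#include <assert.h>

#define NO_COUNTER (-1)     /* sentinel: no counter was allocated */

/* Hypothetical stand-in for the allocator's free routine. */
static int free_calls;

static void model_counter_free(int idx)
{
    assert(idx >= 0);       /* -1 would index outside the bitmap */
    free_calls++;
}

/* Release only the entries that actually hold a counter. */
static void release_counters(const int counters[], int n)
{
    for (int i = 0; i < n; i++)
        if (counters[i] != NO_COUNTER)
            model_counter_free(counters[i]);
}
```

With `counters[] = { 3, -1 }` only one free is issued. Without the check, the second entry would reach the allocator with an index of -1.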
     

06 Dec, 2011

4 commits


03 Dec, 2011

1 commit


30 Nov, 2011

2 commits


29 Nov, 2011

3 commits

  • Don't over-schedule QSFP work on driver initialization. It could end
    up being run simultaneously on two different CPUs resulting in bad
    EEPROM reads. In combination with setting the physical IB link state
    prior to the IBC being brought out of reset, this can cause the link
    state machine to start training early with wrong settings.

    Signed-off-by: Mitko Haralanov
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Roland Dreier

    Mike Marciniszyn
     
  • Fix logic so that we don't retry with MPAv1 once we have done that
    already. Otherwise, we end up retrying with MPAv1 even when it's not
    needed on getting peer aborts - and this could lead to a kernel panic.

    Signed-off-by: Kumar Sanghvi
    Signed-off-by: Roland Dreier

    Kumar Sanghvi
     
  • Fix another place in the code where logic dealing with the t4_cqe was
    using the wrong QID. This fixes the counting logic so that it tests
    against the SQ QID instead of the RQ QID when counting RCQES.

    Signed-off-by: Jonathan Lallinger
    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

    Jonathan Lallinger
     

17 Nov, 2011

1 commit


09 Nov, 2011

1 commit


07 Nov, 2011

1 commit

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     

05 Nov, 2011

2 commits

  • Roland Dreier
     
  • The following panic can occur when flushing a QP:

    RIP: 0010:[] [] qib_send_complete+0x3b/0x190 [ib_qib]
    RSP: 0018:ffff8803cdc6fc90 EFLAGS: 00010046
    RAX: 0000000000000000 RBX: ffff8803d84ba000 RCX: 0000000000000000
    RDX: 0000000000000005 RSI: ffffc90015a53430 RDI: ffff8803d84ba000
    RBP: ffff8803cdc6fce0 R08: ffff8803cdc6fc90 R09: 0000000000000001
    R10: 00000000ffffffff R11: 0000000000000000 R12: ffff8803d84ba0c0
    R13: ffff8803d84ba5cc R14: 0000000000000800 R15: 0000000000000246
    FS: 0000000000000000(0000) GS:ffff880036600000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    CR2: 0000000000000034 CR3: 00000003e44f9000 CR4: 00000000000406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process qib/0 (pid: 1350, threadinfo ffff8803cdc6e000, task ffff88042728a100)
    Stack:
    53544c5553455201 0000000100000005 0000000000000000 ffff8803d84ba000
    0000000000000000 0000000000000000 0000000000000000 0000000000000000
    0000000000000000 0000000000000001 ffff8803cdc6fd30 ffffffffa0165d7a
    Call Trace:
    [] qib_make_rc_req+0x36a/0xe80 [ib_qib]
    [] ? qib_make_rc_req+0x0/0xe80 [ib_qib]
    [] qib_do_send+0xf3/0xb60 [ib_qib]
    [] ? thread_return+0x4e/0x777
    [] ? qib_do_send+0x0/0xb60 [ib_qib]
    [] worker_thread+0x170/0x2a0
    [] ? autoremove_wake_function+0x0/0x40
    [] ? worker_thread+0x0/0x2a0
    [] kthread+0x96/0xa0
    [] child_rip+0xa/0x20
    [] ? kthread+0x0/0xa0
    [] ? child_rip+0x0/0x20
    RIP [] qib_send_complete+0x3b/0x190 [ib_qib]

    The RC error state flush logic in qib_make_rc_req() could return all
    of the acked wqes and potentially have emptied the queue. It would
    then unconditionally try to return a flush completion via
    qib_send_complete() for an invalid wqe, or worse a valid one that is
    not queued. The panic results when the completion code tries to
    maintain an MR reference count for a NULL MR.

    This fix modifies the logic to send only one completion per
    qib_make_rc_req() call, changing the completion status from
    IB_WC_SUCCESS to IB_WC_WR_FLUSH_ERR as the completions progress.

    The outer loop will call as many times as necessary to flush the queue.

    Reviewed-by: Ram Vepa
    Signed-off-by: Mike Marciniszyn
    Signed-off-by: Roland Dreier

    Mike Marciniszyn
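
The "one completion per call" behaviour can be modelled as below. This is a simplified sketch, not the qib implementation: indices do not wrap, the queue state lives in a hypothetical `model_sq` struct, and completions are recorded in a small log instead of being delivered to a CQ.

```c
#include <assert.h>

enum wc_status { WC_SUCCESS, WC_WR_FLUSH_ERR };

#define LOG_MAX 8

/* Hypothetical send-queue model; indices grow and never wrap. */
struct model_sq {
    unsigned last;              /* next wqe to complete */
    unsigned acked;             /* wqes below this index were acked */
    unsigned head;              /* one past the last posted wqe */
    enum wc_status log[LOG_MAX];
    unsigned completions;
};

/* One call completes at most one wqe. Returns 0 once the queue is empty,
 * so no completion is ever generated for a wqe that is not queued. */
static int flush_one(struct model_sq *sq)
{
    if (sq->last == sq->head)
        return 0;
    assert(sq->completions < LOG_MAX);
    sq->log[sq->completions++] =
        sq->last < sq->acked ? WC_SUCCESS : WC_WR_FLUSH_ERR;
    sq->last++;
    return 1;
}

/* The outer loop calls as many times as necessary to drain the queue. */
static void flush_all(struct model_sq *sq)
{
    while (flush_one(sq))
        ;
}
```

With two acked and two unacked entries queued, the log ends up as two successes followed by two flush errors, and a further call on the empty queue does nothing.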
     

04 Nov, 2011

1 commit

  • The num_free field of mthca_buddy is an array of unsigned int, but it
    was allocated as an array of pointers. On 64-bit platforms this
    allocates twice as much memory as required. Fix this by allocating
    the correct size for the type.

    This is the same bug just fixed in mlx4 by Eli Cohen.

    Signed-off-by: Roland Dreier

    Roland Dreier
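
The size mismatch is easy to reproduce in userspace. The sketch below uses a hypothetical `model_buddy` struct with the same two kinds of array (one of pointers, one of unsigned int) and sizes each allocation from the pointed-to element rather than from a hand-written type, which is one way to avoid this class of bug. It uses `calloc()` in place of the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model of a buddy allocator's bookkeeping arrays. */
struct model_buddy {
    unsigned long **bits;   /* array of pointers: one bitmap per order */
    unsigned int *num_free; /* array of counters: one per order */
    int max_order;
};

/* The buggy sizing: element size taken from a pointer type. */
static size_t num_free_bytes_buggy(int max_order)
{
    return (size_t)(max_order + 1) * sizeof(long *);
}

/* The fixed sizing: element size taken from the field itself. */
static size_t num_free_bytes_fixed(const struct model_buddy *b)
{
    return (size_t)(b->max_order + 1) * sizeof *b->num_free;
}

static int model_buddy_init(struct model_buddy *b, int max_order)
{
    size_t n = (size_t)max_order + 1;

    b->max_order = max_order;
    b->bits = calloc(n, sizeof *b->bits);
    b->num_free = calloc(n, sizeof *b->num_free);
    if (!b->bits || !b->num_free) {
        free(b->bits);
        free(b->num_free);
        return -1;
    }
    return 0;
}

static void model_buddy_cleanup(struct model_buddy *b)
{
    free(b->bits);
    free(b->num_free);
}
```

On a typical 64-bit platform, where pointers are 8 bytes and unsigned int is 4, the buggy size is exactly twice the fixed one. On 32-bit platforms the two usually coincide, which is why the bug goes unnoticed there.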
     

02 Nov, 2011

2 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (62 commits)
    mlx4_core: Deprecate log_num_vlan module param
    IB/mlx4: Don't set VLAN in IBoE WQEs' control segment
    IB/mlx4: Enable 4K mtu for IBoE
    RDMA/cxgb4: Mark QP in error before disabling the queue in firmware
    RDMA/cxgb4: Serialize calls to CQ's comp_handler
    RDMA/cxgb3: Serialize calls to CQ's comp_handler
    IB/qib: Fix issue with link states and QSFP cables
    IB/mlx4: Configure extended active speeds
    mlx4_core: Add extended port capabilities support
    IB/qib: Hold links until tuning data is available
    IB/qib: Clean up checkpatch issue
    IB/qib: Remove s_lock around header validation
    IB/qib: Precompute timeout jiffies to optimize latency
    IB/qib: Use RCU for qpn lookup
    IB/qib: Eliminate divide/mod in converting idx to egr buf pointer
    IB/qib: Decode path MTU optimization
    IB/qib: Optimize RC/UC code by IB operation
    IPoIB: Use the right function to do DMA unmap pages
    RDMA/cxgb4: Use correct QID in insert_recv_cqe()
    RDMA/cxgb4: Make sure flush CQ entries are collected on connection close
    ...

    Linus Torvalds
     
  • …sc', 'mlx4', 'misc', 'nes', 'qib' and 'xrc' into for-next

    Roland Dreier
     

01 Nov, 2011

2 commits

  • Some kernel components (infiniband and perf) pin user space memory by
    increasing the page count, and account that memory as "mlocked".

    The difference between mlocking and pinning is:

    A. mlocked pages are marked with PG_mlocked and are exempt from
    swapping. Page migration may move them around though.
    They are kept on a special LRU list.

    B. Pinned pages cannot be moved because something needs to
    directly access physical memory. They may not be on any
    LRU list.

    I recently saw an mlockalled process where mm->locked_vm became
    bigger than the virtual size of the process (!) because some
    memory was accounted for twice:

    Once when the page was mlocked and once when the Infiniband
    layer increased the refcount because it needed to pin the RDMA
    memory.

    This patch introduces a separate counter for pinned pages and
    accounts them separately.

    Signed-off-by: Christoph Lameter
    Cc: Mike Marciniszyn
    Cc: Roland Dreier
    Cc: Sean Hefty
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
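
The double accounting can be illustrated with a toy model. The struct and function names below are hypothetical, and the counters are plain page counts rather than real mm state. The point is only that charging pins to the same counter as mlocked pages can push it past the process's virtual size, while a separate counter cannot.

```c
#include <assert.h>

/* Toy model of per-process accounting, in pages. Names are illustrative. */
struct model_mm {
    unsigned long total_vm;   /* virtual size of the process */
    unsigned long locked_vm;  /* pages accounted as mlocked */
    unsigned long pinned_vm;  /* pages accounted as pinned */
};

/* mlock-style accounting: charge the locked counter. */
static void model_mlock(struct model_mm *mm, unsigned long pages)
{
    mm->locked_vm += pages;
}

/* Old behaviour: a pin is also charged to the locked counter. */
static void model_pin_old(struct model_mm *mm, unsigned long pages)
{
    mm->locked_vm += pages;
}

/* New behaviour: a pin is charged to its own counter. */
static void model_pin_new(struct model_mm *mm, unsigned long pages)
{
    mm->pinned_vm += pages;
}
```

For a 100-page process that locks all of its memory and then registers 40 of those pages for RDMA, the old scheme reports 140 locked pages, more than the process owns. The new scheme reports 100 locked and 40 pinned.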
     
  • These files were getting the moduleparam infrastructure from the
    implicit presence of module.h being everywhere, but that is going
    away soon.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker