Eric Lee / smarc-fsl-linux-kernel

09 Jan, 2012

2 commits

48fa57ac2 Merge tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

infiniband changes for 3.3 merge window

* tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
rdma/core: Fix sparse warnings
RDMA/cma: Fix endianness bugs
RDMA/nes: Fix terminate during AE
RDMA/nes: Make unnecessarily global nes_set_pau() static
RDMA/nes: Change MDIO bus clock to 2.5MHz
IB/cm: Fix layout of APR message
IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE
IB/qib: Default some module parameters optimally
IB/qib: Optimize locking for get_txreq()
IB/qib: Fix a possible data corruption when receiving packets
IB/qib: Eliminate 64-bit jiffies use
IB/qib: Fix style issues
IB/uverbs: Protect QP multicast list

Linus Torvalds
2012-01-09 06:05:48 +0800
972b2c719 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
reiserfs: Properly display mount options in /proc/mounts
vfs: prevent remount read-only if pending removes
vfs: count unlinked inodes
vfs: protect remounting superblock read-only
vfs: keep list of mounts for each superblock
vfs: switch ->show_options() to struct dentry *
vfs: switch ->show_path() to struct dentry *
vfs: switch ->show_devname() to struct dentry *
vfs: switch ->show_stats to struct dentry *
switch security_path_chmod() to struct path *
vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
vfs: trim includes a bit
switch mnt_namespace ->root to struct mount
vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
vfs: opencode mntget() mnt_set_mountpoint()
vfs: spread struct mount - remaining argument of next_mnt()
vfs: move fsnotify junk to struct mount
vfs: move mnt_devname
vfs: move mnt_list to struct mount
vfs: switch pnode.h macros to struct mount *
...

Linus Torvalds
2012-01-09 04:19:57 +0800

05 Jan, 2012

6 commits

1583676d9 Merge branches 'cma', 'misc', 'mlx4', 'nes', 'qib' and 'uverbs' into for-next

Roland Dreier
2012-01-05 01:18:20 +0800
c89d1bedf rdma/core: Fix sparse warnings

Clean up sparse warnings in the rdma core layer.

Signed-off-by: Sean Hefty
Signed-off-by: Roland Dreier

Sean Hefty
2012-01-05 01:17:45 +0800
46ea5061c RDMA/cma: Fix endianness bugs

Fix endianness bugs reported by sparse in the RDMA core stack. Note
that these are real bugs, but don't affect any existing code to the
best of my knowledge. The mlid issue would only affect kernel users
of rdma_join_multicast which have the rdma_cm attach/detach its QP.
There are no current in-tree users that do this. (rdma_join_multicast
may be called by user space applications, which do not have
this issue.) And the pkey setting is simply returned as
informational.

Signed-off-by: Sean Hefty
Signed-off-by: Roland Dreier

Sean Hefty
2012-01-05 01:13:52 +0800
196f40c84 RDMA/nes: Fix terminate during AE

Fix for reset which happens right after sending a terminate message.
Terminate timer is not deleted when the connection is closed.

Signed-off-by: Tatyana Nikolova
Signed-off-by: Faisal Latif
Signed-off-by: Roland Dreier

Tatyana Nikolova
2012-01-05 01:12:39 +0800
b0fda90f2 RDMA/nes: Make unnecessarily global nes_set_pau() static

Warned about by sparse.

Signed-off-by: Tatyana Nikolova
Signed-off-by: Faisal Latif
Signed-off-by: Roland Dreier

Tatyana Nikolova
2012-01-05 01:07:24 +0800
30b7e117a RDMA/nes: Change MDIO bus clock to 2.5MHz

Change the PHY clock divisor to make the MDIO clock 2.5MHz, instead of
3.5MHz (which is out of spec).

Signed-off-by: Tatyana Nikolova
Signed-off-by: Faisal Latif
Signed-off-by: Roland Dreier

Tatyana Nikolova
2012-01-05 01:02:15 +0800

04 Jan, 2012

11 commits

6f233d300 IB/cm: Fix layout of APR message

Add a missing 16-bit reserved field between ap_status and info fields.
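
The effect of a missing reserved field on a wire layout can be illustrated with a small sketch. Only ap_status and info are named in the commit message; the struct names, the info_length field, and the 72-byte array size below are assumptions made for illustration, not the kernel's actual definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tail of the message WITHOUT the reserved field. */
struct apr_tail_old {
	uint8_t info_length;	/* assumed neighbouring field */
	uint8_t ap_status;
	uint8_t info[72];	/* size chosen for illustration only */
};

/* Same tail WITH a 16-bit reserved field between ap_status and info. */
struct apr_tail_new {
	uint8_t  info_length;
	uint8_t  ap_status;
	uint16_t reserved;	/* the previously missing field */
	uint8_t  info[72];
};

/* Byte offset of info in each layout. */
static size_t info_offset_old(void)
{
	return offsetof(struct apr_tail_old, info);
}

static size_t info_offset_new(void)
{
	return offsetof(struct apr_tail_new, info);
}

/* The reserved field moves info, and everything after it, by two bytes. */
static void check_apr_layout(void)
{
	assert(info_offset_new() - info_offset_old() == sizeof(uint16_t));
}
```

With the field missing, a receiver that follows the correct layout would read info two bytes too early relative to what the sender wrote.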

Signed-off-by: Eli Cohen
Acked-by: Sean Hefty
Signed-off-by: Roland Dreier

Eli Cohen
2012-01-04 13:04:18 +0800
9106c4106 IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE

For IBoE, SLs 0-7 are mapped to Ethernet 802.1Q user priority bits
(pbits) which are part of the VLAN tag, SLs 8-15 are reserved.

Under Ethernet, the ConnectX firmware treats (decode/encode) the four
bit SL field in various constructs such as QPC / UD WQE / CQE as PPP0
and not as 0PPP. This correlates well to the fact that within the
vlan tag the pbits are located in bits 15-13 and not 12-14.

The current code wasn't consistent around that area - the
encoding was correct for the IBoE QPC.path.schedule_queue field,
but was wrong for IBoE CQEs and when MLX header was built.

These inconsistencies resulted in wrong SL wire 802.1Q pbits
mapping, which is fixed by using SL PPP0 all around the place.
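
The PPP0 versus 0PPP distinction described above can be sketched as plain bit arithmetic. The function names are hypothetical and this is not the driver code; it only shows how a 3-bit priority lands in a 4-bit field under each convention, and where the pbits sit in a 16-bit VLAN tag.

```c
#include <assert.h>
#include <stdint.h>

/* Encode SL 0-7 into the 4-bit field as PPP0 (priority in the upper 3 bits). */
static uint8_t sl_to_field_ppp0(uint8_t sl)
{
	assert(sl <= 7);	/* SLs 8-15 are reserved for IBoE */
	return (uint8_t)((sl & 0x7) << 1);
}

/* The inconsistent 0PPP encoding (priority in the lower 3 bits). */
static uint8_t sl_to_field_0ppp(uint8_t sl)
{
	return (uint8_t)(sl & 0x7);
}

/* Decode a 4-bit field that is interpreted as PPP0. */
static uint8_t field_ppp0_to_sl(uint8_t field)
{
	return (uint8_t)((field >> 1) & 0x7);
}

/* Build a 16-bit VLAN tag: pbits in bits 15-13, VLAN id in bits 11-0. */
static uint16_t vlan_tag(uint8_t pbits, uint16_t vid)
{
	return (uint16_t)(((uint16_t)(pbits & 0x7) << 13) | (vid & 0x0fff));
}
```

For SL 5, the PPP0 field is 0xA and decodes back to 5. If the field is written as 0PPP (0x5) but read as PPP0, it decodes to 2, which is the kind of wrong wire priority the commit describes.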

Signed-off-by: Oren Duer
Signed-off-by: Or Gerlitz
Signed-off-by: Roland Dreier

Or Gerlitz
2012-01-04 13:00:02 +0800
8d4548f2b IB/qib: Default some module parameters optimally

Minimize the need for users to have to set module parameters to get
good performance.

The following two parameters are changed:
- rcvhdrcnt to twice the rcvegrcnt
- pcie_caps=0x51

Setting rcvhdrcnt to twice the egrcount allows the preemptive NAK code
during reception to function in 100% of the cases rather than relying
on a sender jiffies-based timeout.

The pcie_caps default of 0x51 will set the proposed MaxPayload and
MaxReceiveRequest to 256 and 4096 respectively. The capabilities on
the root complex will be used to limit those values.
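
How 0x51 yields 256 and 4096 can be sketched as follows. This assumes that pcie_caps packs two PCIe size codes, one per nibble (low nibble for the payload size, high nibble for the request size), and that each code decodes as 128 << code; the commit message does not spell out this layout, and the function names are hypothetical.

```c
#include <assert.h>

/* Assumed decoding: size in bytes = 128 << code, for codes 0..5. */
static unsigned int pcie_size_from_code(unsigned int code)
{
	assert(code <= 5);	/* 128 bytes up to 4096 bytes */
	return 128u << code;
}

/* Assumed layout: low nibble carries the payload size code. */
static unsigned int caps_max_payload(unsigned int caps)
{
	return pcie_size_from_code(caps & 0xf);
}

/* Assumed layout: high nibble carries the request size code. */
static unsigned int caps_max_request(unsigned int caps)
{
	return pcie_size_from_code((caps >> 4) & 0xf);
}

/* The other default from the commit: rcvhdrcnt is twice rcvegrcnt. */
static unsigned int default_rcvhdrcnt(unsigned int rcvegrcnt)
{
	return 2u * rcvegrcnt;
}
```

Under these assumptions, 0x51 has a low nibble of 1 (128 << 1 = 256) and a high nibble of 5 (128 << 5 = 4096), matching the proposed values stated above.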

Reviewed-by: Ram Vepa
Signed-off-by: Mike Marciniszyn
Signed-off-by: Roland Dreier

Mike Marciniszyn
2012-01-04 12:54:01 +0800
489471095 IB/qib: Optimize locking for get_txreq()

The current code locks the QP s_lock, followed by the pending_lock, I
guess to protect against the allocation failing.

This patch only locks the pending_lock, assuming that the empty case
is an exception, in which case the pending_lock is dropped, and the
original code is executed. This will save a lock of s_lock in the
normal case.

The observation is that the sdma descriptors will deplete at twice the
rate of txreq's, so this should be rare.

Signed-off-by: Mike Marciniszyn
Signed-off-by: Roland Dreier

Mike Marciniszyn
2012-01-04 12:53:31 +0800
eddfb6752 IB/qib: Fix a possible data corruption when receiving packets

Prevent a receive data corruption by ensuring that the write to update
the rcvhdrheadn register to generate an interrupt is at the very end
of the receive processing.

Signed-off-by: Ramkrishna Vepa
Signed-off-by: Mike Marciniszyn
Cc:
Signed-off-by: Roland Dreier

Ram Vepa
2012-01-04 12:53:02 +0800
8482d5d1b IB/qib: Eliminate 64-bit jiffies use

The qib driver makes use of the 64-bit jiffies API.

Code inspection reveals that that version of the API is not really
required. This patch converts to use the "normal" jiffies.

Reviewed-by: Ram Vepa
Signed-off-by: Mike Marciniszyn
Signed-off-by: Roland Dreier

Mike Marciniszyn
2012-01-04 12:52:12 +0800
865b64be8 IB/qib: Fix style issues

More style issues revealed with checkpatch.pl -f.

Signed-off-by: Mike Marciniszyn
Signed-off-by: Roland Dreier

Mike Marciniszyn
2012-01-04 12:51:42 +0800
e214a0fe2 IB/uverbs: Protect QP multicast list

Userspace verbs multicast attach/detach operations on a QP are done
while holding the rwsem of the QP for reading. That's not sufficient
since a reader lock allows more than one reader to acquire the
lock. However, multicast attach/detach does list manipulation that
can corrupt the list if multiple threads run in parallel.

Fix this by acquiring the rwsem as a writer to serialize attach/detach
operations. Add idr_write_qp() and put_qp_write() to encapsulate
this.

This fixes oops seen when running applications that perform multicast
joins/leaves.

Reported-by: Mike Dubman
Signed-off-by: Eli Cohen
Cc:
Signed-off-by: Roland Dreier

Eli Cohen
2012-01-04 12:36:48 +0800
f9ec80061 infiniband: umode_t noise, including open-coded S_ISDIR()

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:55:03 +0800
587a1f165 switch ->is_visible() to returning umode_t

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:54:55 +0800
2c9ede55e switch device_get_devnode() and ->devnode() to umode_t *

both callers of device_get_devnode() are only interested in lower 16bits
and nobody tries to return anything wider than 16bit anyway.

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:54:55 +0800

24 Dec, 2011

1 commit

abb434cb0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
net/bluetooth/l2cap_core.c

Just two overlapping changes, one added an initialization of
a local variable, and another change added a new local variable.

Signed-off-by: David S. Miller

David S. Miller
2011-12-24 06:13:56 +0800

20 Dec, 2011

3 commits

480390c8f Merge branches 'cma', 'mlx4' and 'qib' into for-next

Roland Dreier
2011-12-20 01:19:49 +0800
29d1b1614 IB/qib: Correct sense on freectxts increment and decrement

Commit 53ab1c64983 ("IB/qib: Correct nfreectxts for multiple HCAs")
reversed the increments and decrements of dd->nfreectxts. Fix it.

Reviewed-by: Ram Vepa
Signed-off-by: Mike Marciniszyn
Signed-off-by: Roland Dreier

Mike Marciniszyn
2011-12-20 01:19:34 +0800
04ded1672 RDMA/cma: Verify private data length

private_data_len is defined as a u8. If the user specifies a large
private_data size (> 220 bytes), we will calculate a total length that
exceeds 255, resulting in private_data_len wrapping back to 0. This
can lead to overwriting random kernel memory. Avoid this by verifying
that the resulting size fits into a u8.
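
The wrap-around is easy to reproduce in isolation. The sketch below is not the kernel fix; it assumes a hypothetical 36-byte header added in front of the user's private data, and the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unchecked: the sum is silently truncated to 8 bits, as with a u8 field. */
static uint8_t total_len_unchecked(size_t hdr_len, size_t user_len)
{
	return (uint8_t)(hdr_len + user_len);
}

/*
 * Checked: returns the total length (0..255) if it fits into a u8,
 * or -1 if the sum overflows or exceeds 255.
 */
static int total_len_checked(size_t hdr_len, size_t user_len)
{
	size_t total = hdr_len + user_len;

	assert(hdr_len <= UINT8_MAX);	/* header itself must fit */
	if (total < user_len || total > UINT8_MAX)
		return -1;
	return (int)total;
}
```

With the assumed 36-byte header, 219 bytes of private data give a total of 255, which still fits. 220 bytes give 256, which truncates to 0 in the unchecked version and is rejected by the checked one.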

Reported-by: B. Thery
Addresses:
Signed-off-by: Sean Hefty
Signed-off-by: Roland Dreier

Sean Hefty
2011-12-20 01:15:33 +0800

14 Dec, 2011

3 commits

8e59d254f mlx4_ib: disable SRIOV mode for IB ports (not yet supported)

Signed-off-by: Jack Morgenstein
Signed-off-by: David S. Miller

Jack Morgenstein
2011-12-14 02:56:07 +0800
f9baff509 mlx4_core: Add "native" argument to mlx4_cmd and its callers (where needed)

For SRIOV, some Hypervisor commands can be executed directly (native = 1).
Others should go through the command wrapper flow (for tracking resource
usage, for example, or for changing some HCA configurations that slaves
need to be notified of).

This patch sets the groundwork for this capability -- adding the correct
value of "native" in each case.

Note that if SRIOV is not activated, this parameter has no effect.

Signed-off-by: Jack Morgenstein
Signed-off-by: David S. Miller

Jack Morgenstein
2011-12-14 02:56:05 +0800
65dab25de mlx4: Extending port_mask functionality

Port mask now has an additional state.
A port can be set as "none". In this case neither the mlx4_en nor the
mlx4_ib driver takes ownership of the port.
In multifunction mode there is an option to set the VFs as single-ported devices.
(In single function mode, both physical ports belong to the same function.)

Signed-off-by: Jack Morgenstein
Signed-off-by: Yevgeny Petrilin
Signed-off-by: David S. Miller

Jack Morgenstein
2011-12-14 02:56:05 +0800

07 Dec, 2011

1 commit

4af3ce0de IB/mlx4: Fix shutdown crash accessing a non-existent bitmap

Commit cfcde11c3d7a ("IB/mlx4: Use flow counters on IBoE ports") added
code that sets elements of counters[] to -1 if no counter is allocated,
but then goes ahead and passes every entry to mlx4_counter_free() on
shutdown. This is a bad idea, especially if MLX4_DEV_CAP_FLAG_COUNTERS
isn't set so there isn't even an underlying bitmap to free from.
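
The general shape of the problem, a sentinel value that must be skipped on release, can be sketched as below. This is not the driver code: counter_free() is a hypothetical stand-in for mlx4_counter_free(), and the array contents are made up for the demonstration.

```c
#include <assert.h>

#define NO_COUNTER (-1)	/* sentinel stored when no counter was allocated */

static int freed_calls;	/* counts stand-in free calls, for demonstration */

/* Hypothetical stand-in for the real free routine; rejects the sentinel. */
static void counter_free(int idx)
{
	assert(idx >= 0);
	freed_calls++;
}

/* Release only entries that hold a real counter; returns how many. */
static int release_counters(const int *counters, int n)
{
	int i, freed = 0;

	for (i = 0; i < n; i++) {
		if (counters[i] == NO_COUNTER)
			continue;	/* nothing was allocated for this slot */
		counter_free(counters[i]);
		freed++;
	}
	return freed;
}

/* Usage: one allocated counter and one empty slot. */
static int release_demo(void)
{
	const int counters[2] = { 3, NO_COUNTER };

	return release_counters(counters, 2);
}
```

Passing every entry to the free routine unconditionally would hand it -1 for the empty slot, which is the pattern the commit message calls out.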

Tested-by: Sean Hefty
Cc:
Signed-off-by: Roland Dreier

Roland Dreier
2011-12-07 02:47:37 +0800

06 Dec, 2011

6 commits

17e6abeec infiniband: ipoib: Sanitize neighbour handling in ipoib_main.c

Reduce the number of dst_get_neighbour_noref() calls within a single
call chain, primarily by passing the neighbour pointer down to the
helper functions.

Handle dst_get_neighbour_noref() returning NULL in ipoib_start_xmit()
by incrementing the dropped counter and freeing the packet. We don't
want it to fall through into the ARP/RARP/multicast handling, since
that should only happen when skb_dst() is NULL.

Signed-off-by: David S. Miller
Acked-by: Roland Dreier

David Miller
2011-12-06 04:20:20 +0800
3786cf189 infiniband: cxgb4: Consolidate 3 copies of the same operation into 1 helper function.

Three pieces of code do the same thing: create an l2t entry and then
import this information into the c4iw_ep object.

Create a helper function and call it from these 3 locations instead.

Signed-off-by: David S. Miller
Acked-by: Roland Dreier

David Miller
2011-12-06 04:20:20 +0800
40e2bb588 infiniband: nes: Use dst's neighbour entry.

Do this instead of performing a by-hand lookup.

Signed-off-by: David S. Miller
Acked-by: Roland Dreier

David Miller
2011-12-06 04:20:19 +0800
a4757123a cxgb3: Rework t3_l2t_get to take a dst_entry instead of a neighbour.

This way we consolidate the RCU locking down into the place where it
actually matters, and also we can make the code handle
dst_get_neighbour_noref() returning NULL properly.

Signed-off-by: David S. Miller

David Miller
2011-12-06 04:20:19 +0800
51d459745 infiniband: addr: Consolidate code to fetch neighbour hardware address from dst.

IPV4 should do exactly what the IPV6 code does here, which is
use the neighbour obtained via the dst entry.

And now that the two code paths do the same thing, use a common
helper function to perform the operation.

Signed-off-by: David S. Miller
Acked-by: Eric Dumazet
Acked-by: Roland Dreier

David Miller
2011-12-06 04:20:19 +0800
272174550 net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.

To reflect the fact that a reference is not obtained to the
resulting neighbour entry.

Signed-off-by: David S. Miller
Acked-by: Roland Dreier

David Miller
2011-12-06 04:20:19 +0800

03 Dec, 2011

1 commit

b3613118e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

David S. Miller
2011-12-03 02:49:21 +0800

01 Dec, 2011

1 commit

596b9b68e neigh: Add infrastructure for allocating device neigh privates.

netdev->neigh_priv_len records the private area length.

This will trigger for neigh_table objects which set tbl->entry_size
to zero, and the first instances of this will be forthcoming.

Signed-off-by: David S. Miller

David Miller
2011-12-01 07:46:43 +0800

30 Nov, 2011

3 commits

a493f1a24 Merge branches 'cxgb4', 'ipoib', 'misc' and 'qib' into for-next

Roland Dreier
2011-11-30 10:01:53 +0800
580da35a3 IB: Fix RCU lockdep splats

Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()")
forgot to take care of infiniband uses of dst neighbours.

Many thanks to Marc Aurele who provided a nice bug report and feedback.

Reported-by: Marc Aurele La France
Signed-off-by: Eric Dumazet
Cc: David Miller
Cc:
Signed-off-by: Roland Dreier

Eric Dumazet
2011-11-30 05:37:11 +0800
3874397c0 IB/ipoib: Prevent hung task or softlockup processing multicast response

The following can occur with ipoib when processing a multicast response:

BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982]
Modules linked in: ...
CPU 0:
Modules linked in: ...
Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5
RIP: 0010:[] [] _spin_unlock_irqrestore+0x17/0x20
RSP: 0018:ffff8802119ed860 EFLAGS: 00000246
RAX: 0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299
RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246
RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000
R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860
R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
[] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib]
[] ? apic_timer_interrupt+0xe/0x20
[] ? apic_timer_interrupt+0xe/0x20
[] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib]
[] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib]
[] ? dev_hard_start_xmit+0x2c8/0x3f0
[] ? sch_direct_xmit+0x15a/0x1c0
[] ? dev_queue_xmit+0x388/0x4d0
[] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib]
[] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib]
[] ? mcast_work_handler+0x1a6/0x710 [ib_sa]
[] ? ib_send_mad+0xfe/0x3c0 [ib_mad]
[] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core]
[] ? join_handler+0xeb/0x200 [ib_sa]
[] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa]
[] ? recv_handler+0x3c/0x70 [ib_sa]
[] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad]
[] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad]
[] ? worker_thread+0x170/0x2a0
[] ? autoremove_wake_function+0x0/0x40
[] ? worker_thread+0x0/0x2a0
[] ? kthread+0x96/0xa0
[] ? child_rip+0xa/0x20

Coinciding with the stack trace is the following message:

ib0: ib_address_create failed

The code below in ipoib_mcast_join_finish() will note the above
failure in the address handle but otherwise continue:

	ah = ipoib_create_ah(dev, priv->pd, &av);
	if (!ah) {
		ipoib_warn(priv, "ib_address_create failed\n");
	} else {

The while loop at the bottom of ipoib_mcast_join_finish() will attempt
to send queued multicast packets in mcast->pkt_queue and eventually
end up in ipoib_mcast_send():

	if (!mcast->ah) {
		if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE)
			skb_queue_tail(&mcast->pkt_queue, skb);
		else {
			++dev->stats.tx_dropped;
			dev_kfree_skb_any(skb);
		}

My read is that the code will requeue the packet and return to the
ipoib_mcast_join_finish() while loop and the stage is set for the
"hung" task diagnostic as the while loop never sees a non-NULL ah, and
will do nothing to resolve.

There are GFP_ATOMIC allocates in the provider routines, so this is
possible and should be dealt with.

The test that induced the failure is associated with a host SM on the
same server during a shutdown.

This patch causes ipoib_mcast_join_finish() to exit with an error
which will flush the queued mcast packets. Nothing is done to unwind
the QP attached state so that subsequent sends from above will retry
the join.
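
The requeue loop described above can be modelled with a toy simulation. Everything here is hypothetical (struct mcast_sim, MAX_QUEUE, the function names); it is a sketch of the control flow as the commit message reads it, not the ipoib code. The old drain loop is capped by max_iters only so that the simulation terminates.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_QUEUE 3	/* hypothetical stand-in for IPOIB_MAX_MCAST_QUEUE */

struct mcast_sim {
	bool have_ah;	/* was the address handle created? */
	int queued;	/* packets waiting in the queue */
};

/* Send path: without an address handle the packet is requeued. */
static bool sim_send(struct mcast_sim *m)
{
	if (!m->have_ah) {
		if (m->queued < MAX_QUEUE)
			m->queued++;	/* back on the queue */
		return false;		/* (dropped if the queue is full) */
	}
	return true;
}

/* Old behaviour: drain regardless of have_ah. Returns iterations used. */
static int drain_old(struct mcast_sim *m, int max_iters)
{
	int iters = 0;

	while (m->queued > 0 && iters < max_iters) {
		m->queued--;	/* dequeue one packet */
		sim_send(m);	/* requeues it when have_ah is false */
		iters++;
	}
	return iters;
}

/* Fixed behaviour: bail out with an error and flush the queue. */
static int drain_fixed(struct mcast_sim *m)
{
	if (!m->have_ah) {
		m->queued = 0;	/* stand-in for flushing queued packets */
		return -1;
	}
	while (m->queued > 0) {
		m->queued--;
		sim_send(m);
	}
	return 0;
}

/* Helpers that run each variant with no address handle. */
static int old_iters_without_ah(int queued, int cap)
{
	struct mcast_sim m = { false, queued };

	return drain_old(&m, cap);
}

static int fixed_result_without_ah(int queued)
{
	struct mcast_sim m = { false, queued };
	int ret = drain_fixed(&m);

	assert(m.queued == 0);
	return ret;
}
```

Without an address handle the old loop uses every iteration it is allowed, because each dequeued packet goes straight back on the queue. The fixed variant returns an error immediately and leaves the queue empty.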

Reviewed-by: Ram Vepa
Reviewed-by: Gary Leshner
Signed-off-by: Mike Marciniszyn
Signed-off-by: Roland Dreier

Mike Marciniszyn
2011-11-30 05:20:02 +0800

29 Nov, 2011

2 commits

8ee887d74 IB/qib: Fix over-scheduling of QSFP work

Don't over-schedule QSFP work on driver initialization. It could end
up being run simultaneously on two different CPUs resulting in bad
EEPROM reads. In combination with setting the physical IB link state
prior to the IBC being brought out of reset, this can cause the link
state machine to start training early with wrong settings.

Signed-off-by: Mitko Haralanov
Signed-off-by: Mike Marciniszyn
Signed-off-by: Roland Dreier

Mike Marciniszyn
2011-11-29 04:17:33 +0800
01b225e18 RDMA/cxgb4: Fix retry with MPAv1 logic for MPAv2

Fix logic so that we don't retry with MPAv1 once we have done that
already. Otherwise, we end up retrying with MPAv1 even when it's not
needed on getting peer aborts, and this could lead to a kernel panic.

Signed-off-by: Kumar Sanghvi
Signed-off-by: Roland Dreier

Kumar Sanghvi
2011-11-29 03:58:07 +0800