Commit 428aac8a81058e2303677a8fbf26670229e51d3a

Authored by Ming Lei
Committed by Greg Kroah-Hartman
1 parent 9118f9eb4f

USB: EHCI: support running URB giveback in tasklet context

All 4 transfer types work well on the EHCI HCD after switching to run
URB giveback in tasklet context, so mark all EHCI HCD drivers as
supporting it.

Also we don't need to release ehci->lock during URB giveback any more.
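
Why the lock no longer has to be dropped can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: fake_urb,
fake_hcd, fake_giveback() and fake_giveback_bh() are hypothetical stand-ins,
not kernel types or functions, and an int flag models ehci->lock. The point
it shows is that the interrupt side only queues the URB, while the completion
handler runs later from the deferred (tasklet-like) side, outside the lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not kernel types. */
struct fake_urb {
	int status;
	void (*complete)(struct fake_urb *urb);
	struct fake_urb *next;
};

struct fake_hcd {
	int lock_held;			/* models ehci->lock */
	struct fake_urb *bh_head;	/* URBs waiting for giveback */
	struct fake_urb *bh_tail;
};

static int completed;	/* number of completion handlers that have run */

/* Example completion handler: only counts how often it was called. */
static void count_complete(struct fake_urb *urb)
{
	(void)urb;
	completed++;
}

/* "Hard IRQ" side: queue the URB; the lock stays held the whole time. */
static void fake_giveback(struct fake_hcd *hcd, struct fake_urb *urb,
			  int status)
{
	assert(hcd->lock_held);
	urb->status = status;
	urb->next = NULL;
	if (hcd->bh_tail)
		hcd->bh_tail->next = urb;
	else
		hcd->bh_head = urb;
	hcd->bh_tail = urb;
}

/* "Tasklet" side: run the queued completion handlers without the lock. */
static int fake_giveback_bh(struct fake_hcd *hcd)
{
	struct fake_urb *urb;
	int n = 0;

	assert(!hcd->lock_held);
	while ((urb = hcd->bh_head) != NULL) {
		hcd->bh_head = urb->next;
		if (!hcd->bh_head)
			hcd->bh_tail = NULL;
		urb->complete(urb);	/* may resubmit; lock is not held */
		n++;
	}
	return n;
}
```

In this model a completion handler that re-enters the driver cannot deadlock
on the lock, because it never runs while lock_held is set.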

From the test results below on 3 machines (2 ARM and one x86), the time
consumed by the EHCI interrupt handler dropped significantly without
performance loss.

1 test description
1.1 mass storage performance test:
- run the command below 10 times and compute the average performance

    dd if=/dev/sdN iflag=direct of=/dev/null bs=200M count=1

- two USB mass storage devices:
A: SanDisk Extreme USB 3.0 16G (used in test case 1 & case 2)
B: Kingston DataTraveler G2 4GB (only used in test case 2)

1.2 uvc function test:
- run the simple capture program at the link below

   http://kernel.ubuntu.com/~ming/up/capture.c

- the capture format is 640*480, which results in High Bandwidth mode
on the UVC device: Z-Star 0x0ac8/0x3450

- on the T410 (x86) laptop, also use guvcview to watch video capture/playback

1.3 about test case 2 and test case 4
- the two devices involved are tested concurrently using the test items above

1.4 how to compute irq time (the time consumed by ehci_irq)
- use the irq:irq_handler_entry and irq:irq_handler_exit trace points
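
The avg/max figures in the tables can be derived from paired entry/exit
timestamps. A minimal sketch, assuming the timestamps of each
irq:irq_handler_entry and irq:irq_handler_exit pair for ehci_irq have already
been extracted from the trace and converted to microseconds; irq_stats and
irq_time_stats() are hypothetical names, not part of any tracing tool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: average and maximum handler time in us. */
struct irq_stats {
	double avg_us;
	double max_us;
};

/*
 * entry_us[i] and exit_us[i] are the timestamps (in microseconds) of the
 * i-th irq_handler_entry / irq_handler_exit pair for the handler.
 */
static struct irq_stats irq_time_stats(const double *entry_us,
				       const double *exit_us, size_t n)
{
	struct irq_stats s = { 0.0, 0.0 };
	double total = 0.0;
	size_t i;

	for (i = 0; i < n; i++) {
		double d = exit_us[i] - entry_us[i];

		assert(d >= 0.0);	/* exit must not precede entry */
		total += d;
		if (d > s.max_us)
			s.max_us = d;
	}
	if (n > 0)
		s.avg_us = total / (double)n;
	return s;
}
```

For example, entry timestamps {100, 200, 300} and exit timestamps
{110, 230, 320} give handler times of 10, 30 and 20 us, so the function
reports avg 20 and max 30.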

1.5 kernel
3.10.0-rc3-next-20130528

1.6 test machines
Pandaboard A1: ARM CortexA9 dual core
Arndale board: ARM CortexA15 dual core
T410: i5 CPU 2.67GHz quad core

2 test result
2.1 test case1: single mass storage device performance test
--------------------------------------------------------------------
		upstream 		| patched
		perf(MB/s)+irq time(us)	| perf(MB/s)+irq time(us)
--------------------------------------------------------------------
Pandaboard A1:  25.280(avg:145,max:772)	| 25.540(avg:14, max:75)
Arndale board:  29.700(avg:33, max:129)	| 29.700(avg:10,  max:50)
T410: 		34.430(avg:17, max:154*)| 34.660(avg:12, max:155)
---------------------------------------------------------------------

2.2 test case2: two mass storage devices' performance test
--------------------------------------------------------------------
		upstream 			| patched
		perf(MB/s)+irq time(us)		| perf(MB/s)+irq time(us)
--------------------------------------------------------------------
Pandaboard A1:  15.840/15.580(avg:158,max:1216)	| 16.500/16.160(avg:15,max:139)
Arndale board:  17.370/16.220(avg:33, max:234)	| 17.480/16.200(avg:11, max:91)
T410: 		21.180/19.820(avg:18, max:160)	| 21.220/19.880(avg:11, max:149)
---------------------------------------------------------------------

2.3 test case3: one uvc streaming test
- the UVC device works well (on x86, luvcview can be used too and gives
the same result as the uvc capture program)
--------------------------------------------------------------------
		upstream 		| patched
		irq time(us)		| irq time(us)
--------------------------------------------------------------------
Pandaboard A1:  (avg:445, max:873)	| (avg:33, max:44)
Arndale board:  (avg:316, max:630)	| (avg:20, max:27)
T410: 		(avg:39,  max:107)	| (avg:10, max:65)
---------------------------------------------------------------------

2.4 test case4: one uvc streaming plus one mass storage device test
--------------------------------------------------------------------
		upstream 		| patched
		perf(MB/s)+irq time(us)	| perf(MB/s)+irq time(us)
--------------------------------------------------------------------
Pandaboard A1:  20.340(avg:259,max:1704)| 20.390(avg:24, max:101)
Arndale board:  23.460(avg:124,max:726)	| 23.370(avg:15, max:52)
T410: 		28.520(avg:27, max:169)	| 28.630(avg:13, max:160)
---------------------------------------------------------------------

2.5 test case5: read single mass storage device with small transfer
- run the command below 10 times and compute the average speed

 dd if=/dev/sdN iflag=direct of=/dev/null bs=4K count=4000

1), test device A:
--------------------------------------------------------------------
		upstream 		| patched
		perf(MB/s)+irq time(us)	| perf(MB/s)+irq time(us)
--------------------------------------------------------------------
Pandaboard A1:  6.5(avg:21, max:64)	| 6.5(avg:10, max:24)
Arndale board:  8.13(avg:12, max:23)	| 8.06(avg:7,  max:17)
T410: 		6.66(avg:13, max:131)   | 6.84(avg:11, max:149)
---------------------------------------------------------------------

2), test device B:
--------------------------------------------------------------------
		upstream 		| patched
		perf(MB/s)+irq time(us)	| perf(MB/s)+irq time(us)
--------------------------------------------------------------------
Pandaboard A1:  5.5(avg:21,max:43)	| 5.49(avg:10, max:24)
Arndale board:  5.9(avg:12, max:22)	| 5.9(avg:7, max:17)
T410: 		5.48(avg:13, max:155)	| 5.48(avg:7, max:140)
---------------------------------------------------------------------

* On the T410, reading the EHCI status register in ehci_irq sometimes
takes more than 100us; the problem has been reported at the link:

	http://marc.info/?t=137065867300001&r=1&w=2

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Showing 14 changed files with 13 additions and 18 deletions

drivers/usb/host/ehci-fsl.c
@@ -669,7 +669,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_USB2 | HCD_MEMORY,
+	.flags = HCD_USB2 | HCD_MEMORY | HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-grlib.c
@@ -43,7 +43,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-hcd.c
@@ -1166,7 +1166,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-mv.c
@@ -96,7 +96,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-octeon.c
@@ -51,7 +51,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-pmcmsp.c
@@ -286,7 +286,7 @@
 #else
 	.irq = ehci_irq,
 #endif
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-ppc-of.c
@@ -28,7 +28,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-ps3.c
@@ -71,7 +71,7 @@
 	.product_desc = "PS3 EHCI Host Controller",
 	.hcd_priv_size = sizeof(struct ehci_hcd),
 	.irq = ehci_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 	.reset = ps3_ehci_hc_reset,
 	.start = ehci_run,
 	.stop = ehci_stop,
drivers/usb/host/ehci-q.c
@@ -254,8 +254,6 @@
 
 static void
 ehci_urb_done(struct ehci_hcd *ehci, struct urb *urb, int status)
-__releases(ehci->lock)
-__acquires(ehci->lock)
 {
 	if (usb_pipetype(urb->pipe) == PIPE_INTERRUPT) {
 		/* ... update hc-wide periodic stats */
 
 
@@ -281,11 +279,8 @@
 		urb->actual_length, urb->transfer_buffer_length);
 #endif
 
-	/* complete() can reenter this HCD */
 	usb_hcd_unlink_urb_from_ep(ehci_to_hcd(ehci), urb);
-	spin_unlock (&ehci->lock);
 	usb_hcd_giveback_urb(ehci_to_hcd(ehci), urb, status);
-	spin_lock (&ehci->lock);
 }
 
 static int qh_schedule (struct ehci_hcd *ehci, struct ehci_qh *qh);
drivers/usb/host/ehci-sead3.c
@@ -55,7 +55,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-sh.c
@@ -36,7 +36,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_USB2 | HCD_MEMORY,
+	.flags = HCD_USB2 | HCD_MEMORY | HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-tilegx.c
@@ -61,7 +61,7 @@
 	 * Generic hardware linkage.
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	/*
 	 * Basic lifecycle operations.
drivers/usb/host/ehci-w90x900.c
@@ -108,7 +108,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_USB2|HCD_MEMORY,
+	.flags = HCD_USB2|HCD_MEMORY|HCD_BH,
 
 	/*
 	 * basic lifecycle operations
drivers/usb/host/ehci-xilinx-of.c
@@ -79,7 +79,7 @@
 	 * generic hardware linkage
 	 */
 	.irq = ehci_irq,
-	.flags = HCD_MEMORY | HCD_USB2,
+	.flags = HCD_MEMORY | HCD_USB2 | HCD_BH,
 
 	/*
 	 * basic lifecycle operations