Eric Lee / smarc-fsl-linux-kernel

06 Dec, 2008

1 commit

f2f1fa78a Enforce a minimum SG_IO timeout

There's no point in having too short SG_IO timeouts, since if the
command does end up timing out, we'll end up going through the reset
sequence that is several seconds long in order to abort the command that
timed out.

As a result, shorter timeouts than a few seconds simply do not make
sense, as the recovery would be longer than the timeout itself.

Add a BLK_MIN_SG_TIMEOUT to match the existing BLK_DEFAULT_SG_TIMEOUT.
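A minimal sketch of the clamping idea described above, not the actual patch. The HZ value, the two timeout values, and the helper name sg_effective_timeout() are assumptions made for illustration; the commit message does not state them.

```c
/* Illustrative values only; the commit message does not give them. */
#define HZ                     1000u        /* hypothetical ticks per second */
#define BLK_DEFAULT_SG_TIMEOUT (60u * HZ)   /* assumed default timeout */
#define BLK_MIN_SG_TIMEOUT     (7u * HZ)    /* assumed minimum timeout */

/*
 * Hypothetical helper: choose the timeout (in ticks) to use for an
 * SG_IO request. Zero means "not specified" and falls back to the
 * default; anything below the minimum is raised to the minimum.
 */
static unsigned int sg_effective_timeout(unsigned int requested)
{
    if (requested == 0)
        return BLK_DEFAULT_SG_TIMEOUT;
    if (requested < BLK_MIN_SG_TIMEOUT)
        return BLK_MIN_SG_TIMEOUT;
    return requested;
}
```

With these assumed values, a caller asking for 100 ticks would get the minimum instead, because recovery from a timeout would take longer than the requested timeout anyway.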

Suggested-by: Alan Cox
Acked-by: Tejun Heo
Acked-by: Jens Axboe
Cc: Jeff Garzik
Signed-off-by: Linus Torvalds

Linus Torvalds
2008-12-06 06:49:18 +0800

04 Dec, 2008

2 commits

fd4ce1acd [PATCH 1/2] kill FMODE_NDELAY_NOW

Update FMODE_NDELAY before each ioctl call so that we can kill the
magic FMODE_NDELAY_NOW. It would be even better to do this directly
in setfl(), but for that we'd need to have FMODE_NDELAY for all files,
not just block special files.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2008-12-04 17:22:57 +0800
1c925604e [PATCH] Fix block dev compat ioctl handling

Commit 33c2dca4957bd0da3e1af7b96d0758d97e708ef6 (trim file propagation
in block/compat_ioctl.c) removed the handling of some ioctls from
compat_blkdev_driver_ioctl. That caused them to be rejected as unknown
by the compat layer.

Signed-off-by: Andreas Schwab
Cc: Al Viro
Signed-off-by: Al Viro

Andreas Schwab
2008-12-04 17:22:55 +0800

03 Dec, 2008

4 commits

0e435ac26 block: fix setting of max_segment_size and seg_boundary mask

Fix setting of max_segment_size and seg_boundary mask for stacked md/dm
devices.

When stacking devices (LVM over MD over SCSI) some of the request queue
parameters are not set up correctly in some cases by default, namely
max_segment_size and seg_boundary mask.

If you create MD device over SCSI, these attributes are zeroed.

The problem appears when another device-mapper mapping is stacked on top
of this one; the queue attributes are then set in DM this way:

request_queue   max_segment_size   seg_boundary_mask
SCSI            65536              0xffffffff
MD RAID1        0                  0
LVM             65536              -1 (64bit)

Unfortunately bio_add_page (resp. bio_phys_segments) calculates the number
of physical segments according to these parameters.

During generic_make_request() the segment count is recalculated and can
increase the bio->bi_phys_segments count over the allowed limit. (After
bio_clone() in stack operation.)

This is especially a problem in the CCISS driver, where it produces an OOPS here

BUG_ON(creq->nr_phys_segments > MAXSGENTRIES);

(MAXSGENTRIES is 31 by default.)

Sometimes even this command is enough to cause an oops:

dd iflag=direct if=/dev// of=/dev/null bs=128000 count=10

This command generates bios with 250 sectors, allocated in 32 4k-pages
(last page uses only 1024 bytes).
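The arithmetic behind these numbers can be checked with a small sketch. It assumes 512-byte sectors and a page-aligned buffer; the struct and function names are hypothetical and exist only for this illustration.

```c
#define SECTOR_SIZE  512u    /* assumed sector size */
#define PAGE_SIZE_4K 4096u   /* 4k pages, as in the message */

/* Hypothetical helper type: how a request of a given size is laid out. */
struct io_layout {
    unsigned int sectors;          /* number of 512-byte sectors */
    unsigned int pages;            /* pages touched, buffer page-aligned */
    unsigned int last_page_bytes;  /* bytes used in the final page */
};

/* bytes must be greater than zero. */
static struct io_layout layout_for(unsigned int bytes)
{
    struct io_layout l;

    l.sectors = bytes / SECTOR_SIZE;
    l.pages = (bytes + PAGE_SIZE_4K - 1) / PAGE_SIZE_4K;   /* round up */
    l.last_page_bytes = bytes - (l.pages - 1) * PAGE_SIZE_4K;
    return l;
}
```

For bs=128000 this gives 250 sectors, 32 pages, and 1024 bytes in the last page, matching the figures in the message.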

For the LVM layer, it allocates a bio with 31 segments (still OK for CCISS);
unfortunately, on the lower layer it is recalculated to 32 segments, which
violates the CCISS restriction and triggers the BUG_ON().

The patch tries to fix it by:

* initializing the attributes above in the request queue constructor
blk_queue_make_request()

* making sure that blk_queue_stack_limits() inherits the settings

(DM uses its own function to set the limits because
blk_queue_stack_limits() was introduced later. It should probably switch
to the generic stack limit function too.)

* setting the default seg_boundary value in one place (blkdev.h)

* using this mask as the default in DM (instead of -1, which differs on 64bit)
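The steps above can be sketched roughly as follows. This is a simplified userspace model, not the kernel code: the struct, the SKETCH_* defaults, and both functions are hypothetical, and the choice to treat 0 as "not set" when stacking is an assumption made for the illustration.

```c
/* Hypothetical stand-in for the two request queue attributes discussed. */
struct queue_limits_sketch {
    unsigned int  max_segment_size;
    unsigned long seg_boundary_mask;
};

/* Assumed defaults, taken from the SCSI row of the table in the message. */
#define SKETCH_MAX_SEGMENT_SIZE  65536u
#define SKETCH_SEG_BOUNDARY_MASK 0xffffffffUL

/* Initialize the attributes in the queue constructor instead of leaving 0. */
static void sketch_init_limits(struct queue_limits_sketch *q)
{
    q->max_segment_size = SKETCH_MAX_SEGMENT_SIZE;
    q->seg_boundary_mask = SKETCH_SEG_BOUNDARY_MASK;
}

/* Smaller of two values, treating 0 as "not set". */
static unsigned long min_not_zero_ul(unsigned long a, unsigned long b)
{
    if (a == 0)
        return b;
    if (b == 0)
        return a;
    return a < b ? a : b;
}

/* The top (stacked) queue inherits the stricter limit of the bottom one. */
static void sketch_stack_limits(struct queue_limits_sketch *top,
                                const struct queue_limits_sketch *bottom)
{
    top->max_segment_size = (unsigned int)min_not_zero_ul(
        top->max_segment_size, bottom->max_segment_size);
    top->seg_boundary_mask = min_not_zero_ul(
        top->seg_boundary_mask, bottom->seg_boundary_mask);
}
```

In this model a lower device that never set its limits (both fields 0, like the MD RAID1 row) no longer zeroes the stacked queue's values, while a lower device with stricter limits still tightens them.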

Bugs related to this:
https://bugzilla.redhat.com/show_bug.cgi?id=471639
http://bugzilla.kernel.org/show_bug.cgi?id=8672

Signed-off-by: Milan Broz
Reviewed-by: Alasdair G Kergon
Cc: Neil Brown
Cc: FUJITA Tomonori
Cc: Tejun Heo
Cc: Mike Miller
Signed-off-by: Jens Axboe

Milan Broz
2008-12-03 19:55:55 +0800
53a08807c block: internal dequeue shouldn't start timer

blkdev_dequeue_request() and elv_dequeue_request() are equivalent and
both start the timeout timer. Barrier code dequeues the original
barrier request but doesn't pass the request itself to the lower level
driver, only broken down proxy requests; however, the original
barrier request goes through the same dequeue path and the timeout timer
is started on it. If the barrier sequence takes long enough, this timer
expires but the low level driver has no idea about this request and
an oops follows.

The timeout timer shouldn't have been started on the original barrier
request as it never goes through actual IO. This patch unexports
elv_dequeue_request(), which has no external user anyway, makes it
operate on the elevator proper w/o adding the timer, and makes
blkdev_dequeue_request() call elv_dequeue_request() and add the timer.
Internal users which don't pass the request to the driver - barrier code
and end_that_request_last() - are converted to use
elv_dequeue_request().
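The split described above can be pictured with a toy model. Everything here (the toy_request type and both toy_* functions) is hypothetical and only mirrors the layering in the message: the internal dequeue never arms a timer, and the driver-facing dequeue wraps it and arms one.

```c
#include <stdbool.h>

/* Toy stand-in for a block request; not the kernel's struct request. */
struct toy_request {
    bool on_queue;     /* still sitting on the elevator queue */
    bool timer_armed;  /* timeout timer has been started */
};

/* Internal dequeue: operates on the elevator only, no timer. */
static void toy_elv_dequeue(struct toy_request *rq)
{
    rq->on_queue = false;
}

/* Driver-facing dequeue: the request is about to be issued, so arm the timer. */
static void toy_blkdev_dequeue(struct toy_request *rq)
{
    toy_elv_dequeue(rq);
    rq->timer_armed = true;
}
```

In this model, barrier code would call toy_elv_dequeue() on the original barrier request, so no timer can expire for a request the low level driver never saw.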

Signed-off-by: Tejun Heo
Cc: Mike Anderson
Signed-off-by: Jens Axboe

Tejun Heo
2008-12-03 19:41:26 +0800
bf91db18a block: set disk->node_id before it's being used

disk->node_id will be referred to when allocating in disk_expand_part_tbl,
so we should set it before disk->node_id is referred to.

Signed-off-by: Cheng Renquan
Signed-off-by: Jens Axboe

Cheng Renquan
2008-12-03 19:41:20 +0800
53cc0b294 When block layer fails to map iov, it calls bio_unmap_user to undo

mapping. Which is good if pages were mapped - but if they were provided
by someone else and just copied then bad things happen - pages are
released once here, and once by caller, leading to user triggerable BUG
at include/linux/mm.h:246.

Signed-off-by: Petr Vandrovec
Signed-off-by: Jens Axboe

Petr Vandrovec
2008-12-03 19:41:20 +0800

18 Nov, 2008

3 commits

c26156b25 block: hold extra reference to bio in blk_rq_map_user_iov()

If the size passed in is OK but we end up mapping too many segments,
we call the unmap path directly like from IO completion. But from IO
completion we have an extra reference to the bio, so this error case
goes OOPS when it attempts to free an already freed bio.

Fix it by getting an extra reference to the bio before calling the
unmap failure case.

Reported-by: Petr Vandrovec

Signed-off-by: Jens Axboe

Jens Axboe
2008-11-18 22:08:56 +0800
561ec68e4 block: fix boot failure with CONFIG_DEBUG_BLOCK_EXT_DEVT=y and nash

We ran into a system boot failure with kernel 2.6.28-rc. We found it on a
couple of machines, including a T61 notebook, a nehalem machine, and another
HPC NX6325 notebook. All the machines use FedoraCore 8 or FedoraCore 9.
With kernels prior to 2.6.28-rc, system boot doesn't fail.

I debugged it and located the root cause. Please see
http://bugzilla.kernel.org/show_bug.cgi?id=11899
https://bugzilla.redhat.com/show_bug.cgi?id=471517

As a matter of fact, there are 2 bugs.

1) root=/dev/sda1: system boot fails randomly, roughly once in every 5
boots. nash has a bug: some of its functions misuse the return
value 0. Sometimes 0 means timeout and no uevent available. Sometimes
0 means nash got a uevent, but the uevent isn't block-related (for
example, usb). If by coincidence the kernel tells nash that uevents are
available but also sets the timeout, nash might stop collecting the
other uevents in the queue if the current uevent isn't block-related. I
worked out a patch for nash to fix it.
http://bugzilla.kernel.org/attachment.cgi?id=18858

2) root=LABEL=/: the system never boots. initrd init reports that
switchroot fails. Here is an execution branch of nash when booting:
(1) nash reads /sys/block/sda/dev; assume the major is 8 (on my desktop);
(2) nash queries /proc/devices with the major number and finds the line
"8 sd";
(3) nash uses 'sd' to search its own probe table to find the device (DISK)
type for the device and adds it to its own list;
(4) later on, it probes all devices in its list to get filesystem
labels; scsi always registers "8 sd".

When the major is 259, nash fails to find the device (DISK) type. I enabled
CONFIG_DEBUG_BLOCK_EXT_DEVT=y when compiling the kernel, so 259 is picked
for device /dev/sda1, which causes nash to fail to find the device (DISK)
type.

To fix issue 2), I created a patch for nash and another patch for the
kernel.

http://bugzilla.kernel.org/attachment.cgi?id=18859
http://bugzilla.kernel.org/attachment.cgi?id=18837

Below is the patch for kernel 2.6.28-rc4. It registers blkext, a new
block device, in /proc/devices.
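The lookup that fails for major 259 can be sketched as follows. This is not nash's code; lookup_major() is a hypothetical helper that scans text shaped like the "major name" lines of /proc/devices, which is enough to show why a registered "259 blkext" line makes the lookup succeed.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan lines of the form "<major> <name>" and copy
 * the name registered for the given major into name[]. Returns 1 when a
 * matching line exists, 0 otherwise.
 */
static int lookup_major(const char *table, int major, char *name, size_t len)
{
    const char *line = table;

    while (*line) {
        int m;
        char n[32];

        if (sscanf(line, "%d %31s", &m, n) == 2 && m == major) {
            snprintf(name, len, "%s", n);
            return 1;
        }
        line = strchr(line, '\n');   /* advance to the next line */
        if (!line)
            break;
        line++;
    }
    return 0;
}
```

Given only the line "8 sd", a lookup for major 259 finds nothing; once the table also contains "259 blkext", the same lookup returns the name blkext, which a tool could then map to a disk type.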

With the 2 patches on nash and the 1 patch on the kernel, I booted my
machines dozens of times without failure.

Signed-off-by: Zhang Yanmin
Acked-by: Tejun Heo
Signed-off-by: Jens Axboe

Zhang, Yanmin
2008-11-18 22:08:56 +0800
ba32929a9 block: make add_partition() return pointer to hd_struct

Make add_partition() return pointer to the new hd_struct on success
and ERR_PTR() value on failure. This change will be used to fix md
autodetection bug.
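For readers unfamiliar with the convention, here is a userspace sketch of the "pointer on success, encoded errno on failure" pattern. The sketch_* helpers re-implement the idea behind ERR_PTR(), IS_ERR() and PTR_ERR() for illustration; toy_part and toy_add_partition() are hypothetical and are not the kernel's hd_struct or add_partition().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Decode the errno value from an error pointer. */
static long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Error pointers occupy the top SKETCH_MAX_ERRNO addresses. */
static int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Toy partition object standing in for the real structure. */
struct toy_part {
    int partno;
};

/* Returns the new object on success, an encoded errno on failure. */
static struct toy_part *toy_add_partition(int partno)
{
    struct toy_part *p;

    if (partno <= 0)
        return sketch_err_ptr(-EINVAL);
    p = malloc(sizeof(*p));
    if (!p)
        return sketch_err_ptr(-ENOMEM);
    p->partno = partno;
    return p;
}
```

A caller checks the result with sketch_is_err() before using it, and gets both the object and a precise error code from a single return value.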

Signed-off-by: Tejun Heo
Cc: Neil Brown
Signed-off-by: Jens Axboe

Tejun Heo
2008-11-18 22:08:56 +0800

06 Nov, 2008

4 commits

7838c15b8 Block: use round_jiffies_up()

This patch (as1159b) changes the timeout routines in the block core to
use round_jiffies_up(). There's no point in rounding the timer
deadline down, since if it expires too early we will have to restart
it.
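The rounding direction matters more than the exact helper. The sketch below only illustrates rounding a deadline up to the next whole second; it is not round_jiffies_up() itself, which does more than this. SKETCH_HZ and the function name are assumptions for the example.

```c
#define SKETCH_HZ 250UL   /* hypothetical ticks per second */

/*
 * Round a deadline (in ticks) up to the next multiple of SKETCH_HZ.
 * A deadline already on a whole second is left unchanged, and the
 * result is never earlier than the input.
 */
static unsigned long sketch_round_up_to_second(unsigned long j)
{
    unsigned long rem = j % SKETCH_HZ;

    return rem ? j + (SKETCH_HZ - rem) : j;
}
```

Because the result is never earlier than the requested deadline, the timer cannot fire before the timeout has really elapsed, so it does not need to be restarted for that reason.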

The patch also removes some unnecessary tests when a request is
removed from the queue's timer list.

Signed-off-by: Alan Stern
Signed-off-by: Jens Axboe

Alan Stern
2008-11-06 15:42:49 +0800
e78042e5b blk: move blk_delete_timer call in end_that_request_last

Move the call to blk_delete_timer later in end_that_request_last to
address an issue where blkdev_dequeue_request may have added a timer for
the request.

Signed-off-by: Mike Anderson
Acked-by: Tejun Heo
Signed-off-by: Jens Axboe

Mike Anderson
2008-11-06 15:41:56 +0800
2920ebbd6 block: add timer on blkdev_dequeue_request() not elv_next_request()

Block queue supports two usage models - one where block driver peeks
at the front of queue using elv_next_request(), processes it and
finishes it and the other where block driver peeks at the front of
queue, dequeue the request using blkdev_dequeue_request() and finishes
it. The latter is more flexible as it allows the driver to process
multiple commands concurrently.

These two inconsistent usage models make the block layer
implementation confusing. For some, elv_next_request() is considered
the issue point while others consider blkdev_dequeue_request() the
issue point.

Till now the inconsistency mostly affected only accounting, so it didn't
really break anything seriously; however, with block layer timeout,
this inconsistency hits hard. Block layer considers
elv_next_request() the issue point and adds the timer, but SCSI layer
thinks it was just peeking and, when it can't process the
command right away, the request is just left there without further
processing. This leaves the request dangling on the timer list and, when
the timer goes off, the request which the SCSI layer and below think is
still on the block queue ends up in the EH queue, causing various
problems - EH hang (failed count goes over busy count and EH never wakes
up), WARN_ON() and oopses as the low level driver tries to handle the
unknown command, etc. depending on the timing.

As the SCSI midlayer is the only user of the block layer timer at the
moment, moving blk_add_timer() to elv_dequeue_request() fixes the problem;
however, these two usage models definitely need to be cleaned up in the
future.

Signed-off-by: Tejun Heo
Signed-off-by: Jens Axboe

Tejun Heo
2008-11-06 15:41:55 +0800
43381785a block: remove unused ll_new_mergeable()

Signed-off-by: FUJITA Tomonori
Signed-off-by: Jens Axboe

FUJITA Tomonori
2008-11-06 15:41:55 +0800

24 Oct, 2008

2 commits

88ed86fee Merge branch 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc

* 'proc' of git://git.kernel.org/pub/scm/linux/kernel/git/adobriyan/proc: (35 commits)
proc: remove fs/proc/proc_misc.c
proc: move /proc/vmcore creation to fs/proc/vmcore.c
proc: move pagecount stuff to fs/proc/page.c
proc: move all /proc/kcore stuff to fs/proc/kcore.c
proc: move /proc/schedstat boilerplate to kernel/sched_stats.h
proc: move /proc/modules boilerplate to kernel/module.c
proc: move /proc/diskstats boilerplate to block/genhd.c
proc: move /proc/zoneinfo boilerplate to mm/vmstat.c
proc: move /proc/vmstat boilerplate to mm/vmstat.c
proc: move /proc/pagetypeinfo boilerplate to mm/vmstat.c
proc: move /proc/buddyinfo boilerplate to mm/vmstat.c
proc: move /proc/vmallocinfo to mm/vmalloc.c
proc: move /proc/slabinfo boilerplate to mm/slub.c, mm/slab.c
proc: move /proc/slab_allocators boilerplate to mm/slab.c
proc: move /proc/interrupts boilerplate code to fs/proc/interrupts.c
proc: move /proc/stat to fs/proc/stat.c
proc: move rest of /proc/partitions code to block/genhd.c
proc: move /proc/cpuinfo code to fs/proc/cpuinfo.c
proc: move /proc/devices code to fs/proc/devices.c
proc: move rest of /proc/locks to fs/locks.c
...

Linus Torvalds
2008-10-24 03:04:37 +0800
5f4f0c4d3 compat_blkdev_driver_ioctl: Remove unused variable warning

Variable 'ret' is no longer used. Don't declare it.

Signed-off-by: Linus Torvalds

Linus Torvalds
2008-10-24 01:28:25 +0800

23 Oct, 2008

2 commits

31d85ab28 proc: move /proc/diskstats boilerplate to block/genhd.c

Signed-off-by: Alexey Dobriyan
Acked-by: Jens Axboe

Alexey Dobriyan
2008-10-23 21:57:37 +0800
f500975a3 proc: move rest of /proc/partitions code to block/genhd.c

Signed-off-by: Alexey Dobriyan
Acked-by: Jens Axboe

Alexey Dobriyan
2008-10-23 19:07:31 +0800

21 Oct, 2008

12 commits

56b26add0 [PATCH] kill the rest of struct file propagation in block ioctls

Now we can switch blkdev_ioctl() to block_device/mode

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:49:14 +0800
6af3a56e1 [PATCH] get rid of struct file use in blkdev_ioctl() BLKBSZSET

We need to do bd_claim() only if file hadn't been opened with O_EXCL
and then we have no need to use file itself as owner.

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:49:12 +0800
45048d096 [PATCH] get rid of blkdev_locked_ioctl()

Most of that stuff doesn't need BKL at all; expand in the (only) caller,
merge the switch into one there and leave BKL only around the stuff that
might actually need it.

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:49:10 +0800
e436fdae7 [PATCH] get rid of blkdev_driver_ioctl()

convert remaining callers to __blkdev_driver_ioctl()

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:49:08 +0800
33c2dca49 [PATCH] trim file propagation in block/compat_ioctl.c

... and remove the handling of cases when it falls back to native
without changing arguments.

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:48:54 +0800
90b8f2824 [PATCH] end of methods switch: remove the old ones

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:48:52 +0800
d4430d62f [PATCH] beginning of methods conversion

To keep the size of changesets sane we split the switch by drivers;
to keep the damn thing bisectable we do the following:
1) rename the affected methods, add ones with correct
prototypes, make (few) callers handle both. That's this changeset.
2) for each driver convert to new methods. *ALL* drivers
are converted in this series.
3) kill the old (renamed) methods.

Note that it _is_ a flagday; all in-tree drivers are converted and by the
end of this series no trace of the old methods remains. The only reason why
we do it this way is to keep the damn thing bisectable and allow per-driver
debugging if anything goes wrong.

New methods:
open(bdev, mode)
release(disk, mode)
ioctl(bdev, mode, cmd, arg) /* Called without BKL */
compat_ioctl(bdev, mode, cmd, arg)
locked_ioctl(bdev, mode, cmd, arg) /* Called with BKL, legacy */

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:47:32 +0800
633a08b81 [PATCH] introduce __blkdev_driver_ioctl()

Analog of blkdev_driver_ioctl() with sane arguments. For
now uses fake struct file, by the end of the series it won't
and blkdev_driver_ioctl() will become a wrapper around it.

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:47:26 +0800
74f3c8aff [PATCH] switch scsi_cmd_ioctl() to passing fmode_t

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:47:14 +0800
e915e872e [PATCH] switch sg_scsi_ioctl() to passing fmode_t

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:47:12 +0800
5842e51ff [PATCH] pass mode instead of file to sg_io()

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:47:10 +0800
aeb5d7270 [PATCH] introduce fmode_t, do annotations

Signed-off-by: Al Viro

Al Viro
2008-10-21 19:47:06 +0800

18 Oct, 2008

2 commits

c53dbf548 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block

* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: remove __generic_unplug_device() from exports
block: move q->unplug_work initialization
blktrace: pass zfcp driver data
blktrace: add support for driver data
block: fix current kernel-doc warnings
block: only call ->request_fn when the queue is not stopped
block: simplify string handling in elv_iosched_store()
block: fix kernel-doc for blk_alloc_devt()
block: fix nr_phys_segments miscalculation bug
block: add partition attribute for partition number
block: add BIG FAT WARNING to CONFIG_DEBUG_BLOCK_EXT_DEVT
softirq: Add support for triggering softirq work on softirqs.

Linus Torvalds
2008-10-18 00:29:55 +0800
ed09441da Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (39 commits)
[SCSI] sd: fix compile failure with CONFIG_BLK_DEV_INTEGRITY=n
libiscsi: fix locking in iscsi_eh_device_reset
libiscsi: check reason why we are stopping iscsi session to determine error value
[SCSI] iscsi_tcp: return a descriptive error value during connection errors
[SCSI] libiscsi: rename host reset to target reset
[SCSI] iscsi class: fix endpoint id handling
[SCSI] libiscsi: Support drivers initiating session removal
[SCSI] libiscsi: fix data corruption when target has to resend data-in packets
[SCSI] sd: Switch kernel printing level for DIF messages
[SCSI] sd: Correctly handle all combinations of DIF and DIX
[SCSI] sd: Always print actual protection_type
[SCSI] sd: Issue correct protection operation
[SCSI] scsi_error: fix target reset handling
[SCSI] lpfc 8.2.8 v2 : Add statistical reporting control and additional fc vendor events
[SCSI] lpfc 8.2.8 v2 : Add sysfs control of target queue depth handling
[SCSI] lpfc 8.2.8 v2 : Revert target busy in favor of transport disrupted
[SCSI] scsi_dh_alua: remove REQ_NOMERGE
[SCSI] lpfc 8.2.8 : update driver version to 8.2.8
[SCSI] lpfc 8.2.8 : Add MSI-X support
[SCSI] lpfc 8.2.8 : Update driver to use new Host byte error code DID_TRANSPORT_DISRUPTED
...

Linus Torvalds
2008-10-18 00:00:23 +0800

17 Oct, 2008

8 commits

f73e2d13a block: remove __generic_unplug_device() from exports

The only out-of-core user is IDE, and that should be using
blk_start_queueing() instead.

Signed-off-by: Jens Axboe

Jens Axboe
2008-10-17 20:03:08 +0800
713ada9ba block: move q->unplug_work initialization

modprobe loop; rmmod loop effectively creates a blk_queue and destroys it
which results in q->unplug_work being canceled without it ever being
initialized.

Therefore, move the initialization of q->unplug_work from
blk_queue_make_request() to blk_alloc_queue*().

Reported-by: Alexey Dobriyan
Signed-off-by: Peter Zijlstra
Signed-off-by: Jens Axboe

Peter Zijlstra
2008-10-17 14:46:57 +0800
496aa8a98 block: fix current kernel-doc warnings

Fix block kernel-doc warnings:

Warning(linux-2.6.27-git4//fs/block_dev.c:1272): No description found for parameter 'path'
Warning(linux-2.6.27-git4//block/blk-core.c:1021): No description found for parameter 'cpu'
Warning(linux-2.6.27-git4//block/blk-core.c:1021): No description found for parameter 'part'
Warning(/var/linsrc/linux-2.6.27-git4//block/genhd.c:544): No description found for parameter 'partno'

Signed-off-by: Randy Dunlap
Signed-off-by: Jens Axboe

Randy Dunlap
2008-10-17 14:46:57 +0800
80a4b58e3 block: only call ->request_fn when the queue is not stopped

Callers should use either blk_run_queue/__blk_run_queue, or
blk_start_queueing() to invoke request handling instead of calling
->request_fn() directly as that does not take the queue stopped
flag into account.
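The rule can be illustrated with a toy queue. The toy_queue type and both functions are hypothetical and only model the idea that the run helper checks the stopped flag, while a direct call to the request function would not.

```c
#include <stdbool.h>

/* Toy stand-in for a request queue; not the kernel's struct request_queue. */
struct toy_queue {
    bool stopped;                              /* queue stopped flag */
    int handled;                               /* times request_fn ran */
    void (*request_fn)(struct toy_queue *q);   /* driver's request handler */
};

/* Example handler: just count invocations. */
static void toy_request_fn(struct toy_queue *q)
{
    q->handled++;
}

/* Run helper: honors the stopped flag before invoking the handler. */
static void toy_run_queue(struct toy_queue *q)
{
    if (q->stopped)
        return;
    q->request_fn(q);
}
```

Callers that go through toy_run_queue() cannot accidentally drive a stopped queue, which is the property the commit asks real callers to rely on.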

Also add appropriate comments on the above functions to detail
their usage.

Signed-off-by: Jens Axboe

Jens Axboe
2008-10-17 14:46:57 +0800
ee2e992cc block: simplify string handling in elv_iosched_store()

strlcpy() guarantees the dest buffer is NUL-terminated.
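A small userspace sketch of the guarantee being relied on. sketch_strlcpy() is a hypothetical re-implementation written for this example, following the usual strlcpy() contract: copy at most size - 1 bytes, always terminate when size is non-zero, and return the length of the source.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, which has room for size bytes. The result is always
 * NUL-terminated when size > 0, even if src had to be truncated.
 * Returns strlen(src) so the caller can detect truncation.
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size) {
        size_t n = len >= size ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

Since the terminator is always written, a caller does not need to terminate the buffer by hand after the copy, which is the kind of simplification the commit title refers to.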

Signed-off-by: Li Zefan
Signed-off-by: Jens Axboe

Li Zefan
2008-10-17 14:46:57 +0800
e6d63840b block: fix kernel-doc for blk_alloc_devt()

No argument 'gfp_mask' for blk_alloc_devt().

Signed-off-by: Li Zefan
Signed-off-by: Jens Axboe

Li Zefan
2008-10-17 14:46:56 +0800
867714271 block: fix nr_phys_segments miscalculation bug

This fixes the bug reported by Nikanth Karthikesan:

http://lkml.org/lkml/2008/10/2/203

The root cause of the bug is that blk_phys_contig_segment
miscalculates q->max_segment_size.

blk_phys_contig_segment checks:

req->biotail->bi_size + next_req->bio->bi_size > q->max_segment_size

But blk_recalc_rq_segments might expect that req->biotail and the
previous bio in the req are supposed to be merged into one
segment. blk_recalc_rq_segments might also expect that next_req->bio
and the next bio in the next_req are supposed to be merged into one
segment. In such a case, we merge two requests that can't be merged
here. Later, blk_rq_map_sg gives more segments than it should.

We need to keep track of segment size in blk_recalc_rq_segments and
use it to see if two requests can be merged. This patch implements it
in a way similar to what we used to do for hw merging (virtual
merging).

Signed-off-by: FUJITA Tomonori
Signed-off-by: Jens Axboe

FUJITA Tomonori
2008-10-17 14:46:56 +0800
1ff9f542e device create: block: convert device_create_drvdata to device_create

Now that device_create() has been audited, rename things back to the
original call to be sane.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2008-10-17 00:24:41 +0800