Eric Lee / smarc-fsl-linux-kernel

15 Jun, 2006

1 commit

553698f94 [PATCH] cfq-iosched: fix crash in do_div()

We don't clear the seek stat values in cfq_alloc_io_context(), and if
->seek_mean is unlucky enough to be set to -36 by chance, the first
invocation of cfq_update_io_seektime() will oops with a divide by zero
in do_div().

Just memset the entire cic instead of filling individual values
independently.
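
As an illustration of why zeroing the whole structure helps, here is a minimal user-space sketch. The struct `seek_stats` and all function names are hypothetical stand-ins, not the kernel's code, and the decayed-average update is an assumed model of this kind of seek statistic. In the sketch, a stale counter value of -36 decays to exactly 0 and would then be used as a divisor.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for per-context seek statistics. */
struct seek_stats {
	int      samples;   /* decayed sample count, later used as a divisor */
	uint64_t total;     /* decayed sum of seek distances */
	uint64_t mean;      /* total / samples */
};

/* Decay step for the sample counter (assumed formula for this sketch). */
static int next_samples(int samples)
{
	return (7 * samples + 256) / 8;
}

/* Zero the whole structure instead of assigning fields one by one,
 * so no field can be left holding whatever the allocator returned. */
static void seek_stats_init(struct seek_stats *s)
{
	memset(s, 0, sizeof(*s));
}

/* One update step. The zero check exists only in this sketch; it returns
 * -1 at the point where an unguarded division would fault. */
static int seek_stats_update(struct seek_stats *s, uint64_t dist)
{
	s->samples = next_samples(s->samples);
	if (s->samples == 0)
		return -1;
	s->total = (7 * s->total + 256 * dist) / 8;
	s->mean = (s->total + s->samples / 2) / s->samples;
	return 0;
}
```

Starting from a zeroed structure, the first update yields a non-zero sample count, so the division is safe.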

Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds

Jens Axboe
2006-06-15 01:22:16 +0800

09 Jun, 2006

1 commit

bc1c11697 [PATCH] elevator switching race

There's a race between shutting down one io scheduler and firing up the
next, in which a new io could enter and cause the io scheduler to be
invoked with bad or NULL data.

To fix this, we need to hold the queue lock for a bit longer.
Unfortunately we cannot do that, since the elevator init must be
run without the lock held. This isn't easily fixable without also
changing the mempool API. So split the initialization into two parts,
an alloc-init operation and an attach operation. Then we can
preallocate the io scheduler and related structures, and run the attach
inside the lock after we detach the old one.

This patch has survived 30 minutes of 1 second io scheduler switching
with a very busy io load.
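
The split described above can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct sched`, `struct queue`, and every function name are hypothetical, and a C11 `atomic_flag` spin lock stands in for the queue lock. The point is that the allocating phase runs unlocked, while the pointer swap runs under the lock.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct sched { int id; };            /* hypothetical scheduler state */

struct queue {
	atomic_flag lock;            /* stand-in for the queue lock */
	struct sched *sched;         /* current scheduler */
};

static void queue_lock(struct queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;                    /* spin until the flag was clear */
}

static void queue_unlock(struct queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Phase 1: allocate and initialise. May block, so it runs without the lock. */
static struct sched *sched_alloc_init(int id)
{
	struct sched *s = malloc(sizeof(*s));

	if (s)
		s->id = id;
	return s;
}

/* Phase 2: swap pointers only. Cheap and non-blocking, so it runs locked. */
static struct sched *sched_attach(struct queue *q, struct sched *new_sched)
{
	struct sched *old = q->sched;

	q->sched = new_sched;
	return old;
}

/* Switch schedulers; returns 0 on success, -1 if allocation failed. */
static int queue_switch_sched(struct queue *q, int id)
{
	struct sched *new_sched = sched_alloc_init(id);
	struct sched *old;

	if (!new_sched)
		return -1;           /* the old scheduler stays in place */

	queue_lock(q);
	old = sched_attach(q, new_sched);  /* detach and attach in one step */
	queue_unlock(q);

	free(old);                   /* tear down the old one outside the lock */
	return 0;
}
```

Because the new scheduler is fully built before the lock is taken, no request can observe a half-initialised or missing scheduler in this sketch.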

Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds

Jens Axboe
2006-06-09 06:14:23 +0800

02 Jun, 2006

1 commit

b52a83489 [PATCH] cfq-iosched: busy_rr fairness fix

Now that we select busy_rr for possible service, insert entries at the
back of that list instead of at the front.
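
The difference between front and back insertion can be sketched with a small circular doubly linked list. The names below are hypothetical and only modeled on that style of list; this is not the scheduler's code. Inserting at the back serves entries in arrival order, which is the fairness property the message refers to.

```c
struct node {
	struct node *prev, *next;
	int id;
};

/* An empty list is a head that points at itself. */
static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void insert_between(struct node *n, struct node *prev, struct node *next)
{
	n->prev = prev;
	n->next = next;
	prev->next = n;
	next->prev = n;
}

/* Insert right after the head: the newest entry is served first. */
static void add_front(struct node *n, struct node *head)
{
	insert_between(n, head, head->next);
}

/* Insert right before the head: entries are served in arrival order. */
static void add_back(struct node *n, struct node *head)
{
	insert_between(n, head->prev, head);
}
```

With `add_front`, a steady stream of new entries could keep older ones from ever reaching the front; with `add_back`, the oldest entry is always next.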

Signed-off-by: Jens Axboe

Jens Axboe
2006-06-02 00:53:43 +0800

01 Jun, 2006

4 commits

ae818a38d [PATCH] cfq-iosched: fix bug in timer handling for the idle class

There's a small window from when the timer is entered and we grab
the queue lock, where cfq_set_active_queue() could be rearming the
timer for us. Seen in the wild on a 12-way ppc box. Fix this by
just using mod_timer(), which will do the right thing for us.

Signed-off-by: Jens Axboe

Jens Axboe
2006-06-01 16:13:43 +0800

25776e359 [PATCH] cfq-iosched: Detect hardware queueing

If the hardware is doing real queueing, decide that it's worthless to
idle the hardware. It does reasonable simultaneous io in that case
anyways, and the idling hurts some work loads.

Signed-off-by: Jens Axboe

Jens Axboe
2006-06-01 16:12:26 +0800

12e9fddd6 [PATCH] cfq-iosched: Detect idle process issuing async request

If we are anticipating a sync request from this process and we are
waiting for that and see an async request come in, expire that slice
and move on.

Signed-off-by: Jens Axboe

Jens Axboe
2006-06-01 16:09:56 +0800

e0de0206a [PATCH] cfq-iosched: check busy queues before deciding we are idle

For just one busy queue (like async write out), we often overlooked
that we could queue more io and decided we were idle instead. This causes
us quite a bit of performance loss.

Signed-off-by: Jens Axboe

Jens Axboe
2006-06-01 16:07:26 +0800

31 May, 2006

1 commit

3793c65c1 [PATCH] cfq-iosched: fixup locking and ->queue_list list management

- Drop cic from the list when seen as dead.
- Fixup the locking, just use a simple spinlock.

Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds

Jens Axboe
2006-05-31 11:31:05 +0800

24 May, 2006

1 commit

fd0ff8aa1 [PATCH] blk: fix gendisk->in_flight accounting during barrier sequence

While executing a barrier sequence, the bar_rq which carries the actual
write was accounted as normal IO on completion, while it wasn't on
queueing. This caused gendisk->in_flight to be decremented by 1 after
each barrier, which messed up the statistics.

This patch makes bar_rq not accounted as normal IO. As the containing
barrier request as a whole is accounted, part of it shouldn't be.
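
The accounting imbalance can be sketched with a single counter. `struct disk`, `struct req`, and the function names are hypothetical and exist only for this illustration: a request that was not counted when queued must not be counted when it completes.

```c
struct disk { int in_flight; };   /* hypothetical: one counter only */
struct req  { int accounted; };   /* set if queueing counted this request */

/* Queueing: only normal IO is counted; parts of a barrier sequence are not. */
static void req_queue(struct disk *d, struct req *r, int barrier_part)
{
	r->accounted = !barrier_part;
	if (r->accounted)
		d->in_flight++;
}

/* Unbalanced completion: decrements even if queueing never incremented. */
static void req_complete_unbalanced(struct disk *d, struct req *r)
{
	(void)r;
	d->in_flight--;
}

/* Balanced completion: undo only what queueing did. */
static void req_complete_balanced(struct disk *d, struct req *r)
{
	if (r->accounted)
		d->in_flight--;
}
```

With the unbalanced variant, every barrier part leaves the counter one lower than before, so it drifts negative over time.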

Signed-off-by: Tejun Heo
Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds

Jens Axboe
2006-05-24 01:39:43 +0800

13 May, 2006

1 commit

1a2acc9e9 Revert "[BLOCK] Fix oops on removal of SD/MMC card"

This reverts commit 56cf6504fc1c0c221b82cebc16a444b684140fb7.

Both Erik Mouw and Andrew Vasquez independently pinpointed this commit
as causing problems, where the slab cache for a driver is never released
(most obviously causing problems when immediately re-loading that
driver, resulting in a "kmem_cache_create: duplicate cache "
message, but it can also cause other trouble).

James Bottomley dug into it, and reports:

"OK, here's the scoop. The problem patch adds a get of driverfs_dev in
add_disk(), but doesn't put it again until disk_release() (which occurs
on final put_disk() of the gendisk).

However, in SCSI, the driverfs_dev is the sdev_gendev. That means
there's a reference held on sdev_gendev until final disk put.
Unfortunately, we use the driver model driver_remove to trigger
del_gendisk (which removes the gendisk from visibility and decrements
the refcount), so we've introduced an unbreakable deadlock in the
reference counting with this.

I suggest simply reversing this patch at the moment. If Russell and
Jens can tell me what they're trying to do I'll see if there's another
way to do it."

so hereby the patch gets reverted, waiting for a better fix.

Cc: Jens Axboe
Cc: Russell King
Cc: James Bottomley
Cc: Erik Mouw
Cc: Andrew Vasquez
Signed-off-by: Linus Torvalds

Linus Torvalds
2006-05-13 03:08:46 +0800

12 May, 2006

1 commit

dac07ec12 [BLOCK] limit request_fn recursion

Don't recurse back into the driver even if the unplug threshold is met,
when the driver asks for a requeue. This is both silly from a logical
point of view (requeues typically happen due to driver/hardware
shortage), and also dangerous since we could hit an endless request_fn
-> requeue -> unplug -> request_fn loop and crash on stack overrun.

Also limit blk_run_queue() to one level of recursion, similar to how
blk_start_queue() works.
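
A one-level recursion limit of this kind can be sketched with a re-entry flag. Everything below is a hypothetical user-space model (the `reentered` flag, `run_queue()`, and the `request_fn()` stand-in are assumptions for illustration), not the block layer's implementation.

```c
#include <stdbool.h>

struct queue {
	bool reentered;   /* hypothetical "already running" flag */
	int  pending;     /* requests waiting to be dispatched */
	int  depth;       /* current nesting of request_fn() */
	int  max_depth;   /* deepest nesting observed, for illustration */
};

static void run_queue(struct queue *q);

/* Stand-in driver callback: every request it handles triggers a requeue,
 * which calls back into run_queue(). */
static void request_fn(struct queue *q)
{
	q->depth++;
	if (q->depth > q->max_depth)
		q->max_depth = q->depth;
	while (q->pending > 0) {
		q->pending--;
		run_queue(q);     /* the requeue path re-enters the runner */
	}
	q->depth--;
}

/* Run the queue, but never nest: a nested call returns immediately and
 * leaves the remaining work to the outer invocation. */
static void run_queue(struct queue *q)
{
	if (q->reentered)
		return;
	q->reentered = true;
	request_fn(q);
	q->reentered = false;
}
```

Without the flag, each requeue would add a stack frame; with it, the nesting depth stays at one regardless of how many requeues occur.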

This patch fixed a real problem with SLES10 and lpfc, and it could hit
any SCSI lld that returns non-zero from its ->queuecommand() handler.

Signed-off-by: Jens Axboe
Signed-off-by: Linus Torvalds

Jens Axboe
2006-05-12 03:38:59 +0800

06 May, 2006

1 commit

56cf6504f [BLOCK] Fix oops on removal of SD/MMC card

The block layer keeps a reference (driverfs_dev) to the struct
device associated with the block device, and uses it internally
for generating uevents in block_uevent.

Block device uevents include umounting the partition, which can
occur after the backing device has been removed.

Unfortunately, this reference is not counted. This means that
if the struct device is removed from the device tree, the block
layer's reference will become stale.

Guard against this by holding a reference to the struct device
in add_disk(), and only drop the reference when we're releasing
the gendisk kobject - in other words when we can be sure that no
further uevents will be generated for this block device.

Signed-off-by: Russell King
Acked-by: Jens Axboe

Russell King
2006-05-06 00:57:52 +0800

26 Apr, 2006

1 commit

649bbaa48 [PATCH] Remove __devinitdata from notifier block definitions

A few of the notifier_chain_register() callers use __devinitdata in the
definition of the notifier_block data structure. This is incorrect, as the
data structure must remain available after initialization (the callers do
not unregister it during initialization).

This was leading to an oops when notifier_chain_register() call is
invoked for those callback chains after initialization.

This patch fixes all such usages to _not_ have the notifier_block data
structure in the init data section.

Signed-off-by: Chandra Seetharaman
Signed-off-by: Linus Torvalds

Chandra Seetharaman
2006-04-26 23:27:50 +0800

20 Apr, 2006

2 commits

4f73247f0 [PATCH] block/elevator.c: remove unused exports

This patch removes the following unused EXPORT_SYMBOL's:
- elv_requeue_request
- elv_completed_request

They are only used by the block core, hence they need not be exported.

Signed-off-by: Adrian Bunk
Signed-off-by: Jens Axboe

Adrian Bunk
2006-04-20 21:45:22 +0800

7daac4902 [patch] cleanup: use blk_queue_stopped

This cleans up the source to use blk_queue_stopped.

Signed-off-by: Coywolf Qi Hunt
Signed-off-by: Jens Axboe

Coywolf Qi Hunt
2006-04-20 19:04:36 +0800

19 Apr, 2006

1 commit

be3b07535 [PATCH] cfq: Further rbtree traversal and cfq_exit_queue() race fix

In the current code, we re-read cic->key after the dead cic->key check.
So, in theory, it may be re-read *after* cfq_exit_queue() has set it to NULL.

To avoid the race, we copy it to the stack, then use it. With this change, I
guess gcc will assign cic->key to a register or stack slot, and it won't
be re-read.
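
The snapshot idea can be sketched as follows. `struct ctx` and the function names are hypothetical. Note one deliberate difference from the message: the sketch forces a single load with a volatile-qualified access, which is an addition made here for illustration, whereas the message only relies on the compiler keeping the local copy.

```c
#include <stddef.h>

struct ctx { void *key; };   /* hypothetical: key == NULL marks a dead context */

/* Read the shared field exactly once through a volatile-qualified pointer,
 * so the compiler may not reload it between the check and the use. */
static void *read_key_once(struct ctx *c)
{
	return *(void * volatile *)&c->key;
}

/* Returns the key if the context is alive, NULL if it is dead. */
static void *ctx_lookup(struct ctx *c)
{
	void *k = read_key_once(c);   /* snapshot held in a local */

	if (k == NULL)
		return NULL;          /* dead: the caller drops the context */
	return k;                     /* use the snapshot, never c->key again */
}
```

Checking and using the same local value means a concurrent writer clearing `key` after the check cannot turn a value that passed the check into NULL.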

Signed-off-by: OGAWA Hirofumi
Signed-off-by: Jens Axboe

OGAWA Hirofumi
2006-04-19 01:18:31 +0800

18 Apr, 2006

2 commits

dbecf3ab4 [PATCH 2/2] cfq: fix cic's rbtree traversal

When a queue dies, we set cic->key=NULL as a dead mark. So, when we
traverse the rbtree, we must check whether the key is still valid. If it
was invalidated, drop it, then restart the traversal from the top.

Signed-off-by: OGAWA Hirofumi
Signed-off-by: Jens Axboe

OGAWA Hirofumi
2006-04-18 15:45:18 +0800

fba822722 [PATCH 1/2] iosched: fix typo and barrier()

On the rmmod path, cfq/as waits to make sure all io-contexts were
freed. However, it's using complete(), not wait_for_completion().

I think barrier() is not enough here. To avoid the following case,
this patch replaces barrier() with smp_wmb().

cpu0                       visibility                     cpu1
                           [ioc_gone=NULL,ioc_count=1]

ioc_gone = &all_gone       NULL,ioc_count=1
atomic_read(&ioc_count)    NULL,ioc_count=1
wait_for_completion()      NULL,ioc_count=0               atomic_sub_and_test()
                           NULL,ioc_count=0               if ( && ioc_gone)
                                                          [ioc_gone==NULL,
                                                          so doesn't call complete()]
                           &all_gone,ioc_count=0

Signed-off-by: OGAWA Hirofumi
Signed-off-by: Jens Axboe

OGAWA Hirofumi
2006-04-18 15:44:06 +0800

13 Apr, 2006

1 commit

21b2f0c80 [SCSI] unify SCSI_IOCTL_SEND_COMMAND implementations

We currently have two implementations of this obsolete ioctl, one in
the block layer and one in the scsi code. Both of them have drawbacks.

This patch kills the scsi layer version after updating the block version
with the missing bits:

- argument checking
- use scatterlist I/O
- set number of retries based on the submitted command

This is the last user of non-S/G I/O except for the gdth driver, so
getting this in ASAP and through the scsi tree would be nice to kill
the non-S/G I/O path. Jens, what do you think about adding a check
for non-S/G I/O in the midlayer?

Thanks to Or Gerlitz for testing this patch.

Signed-off-by: Christoph Hellwig
Signed-off-by: James Bottomley

Christoph Hellwig
2006-04-13 23:13:15 +0800

02 Apr, 2006

1 commit

a580290c3 Documentation: fix minor kernel-doc warnings

This patch updates the comments to match the actual code.

Signed-off-by: Martin Waitz
Signed-off-by: Adrian Bunk

Martin Waitz
2006-04-02 19:59:55 +0800

01 Apr, 2006

3 commits

88b9adb07 [PATCH] config: fix CONFIG_LFS option

The help text says that if you select CONFIG_LBD, then it will automatically
select CONFIG_LFS. That isn't currently the case, so update the text.

- Get rid of the cruft in the help text mentioning CONFIG_LBD

- Tell unsure users to select CONFIG_LFS.

- Remove the `default n'.

Signed-off-by: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Trond Myklebust
2006-04-01 04:18:55 +0800

9b41046cd [PATCH] Don't pass boot parameters to argv_init[]

The boot cmdline is parsed in parse_early_param() and
parse_args(,unknown_bootoption).

And __setup() is used in obsolete_checksetup().

start_kernel()
  -> parse_args()
    -> unknown_bootoption()
      -> obsolete_checksetup()

If __setup()'s callback (->setup_func()) returns 1 in
obsolete_checksetup(), obsolete_checksetup() thinks a parameter was
handled.

If ->setup_func() returns 0, obsolete_checksetup() tries other
->setup_func(). If every ->setup_func() that matched a parameter returns 0,
the parameter is added to argv_init[].

Then, when running /sbin/init or init=app, argv_init[] is passed to the app.
If the app doesn't ignore those arguments, it will warn and exit.

This patch fixes wrong usages of it, but only the obvious ones.

Signed-off-by: OGAWA Hirofumi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

OGAWA Hirofumi
2006-04-01 04:18:53 +0800

68eef3b47 [PATCH] Simplify proc/devices and fix early termination regression

Make baby-simple the code for /proc/devices. Based on the proven design
for /proc/interrupts.

This also fixes the early-termination regression 2.6.16 introduced, as
demonstrated by:

# dd if=/proc/devices bs=1
Character devices:
1 mem
27+0 records in
27+0 records out

This should also work (but is untested) when /proc/devices >4096 bytes,
which I believe is what the original 2.6.16 rewrite fixed.

[akpm@osdl.org: cleanups, simplifications]
Signed-off-by: Joe Korty
Cc: Neil Horman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Korty
2006-04-01 04:18:53 +0800

29 Mar, 2006

2 commits

7baf398f1 Merge branch 'cfq-merge' of git://brick.kernel.dk/data/git/linux-2.6-block

* 'cfq-merge' of git://brick.kernel.dk/data/git/linux-2.6-block:
[BLOCK] cfq-iosched: seek and async performance fixes
[PATCH] ll_rw_blk: fix 80-col offender in put_io_context()
[PATCH] cfq-iosched: small cfq_choose_req() optimization
[PATCH] [BLOCK] cfq-iosched: change cfq io context linking from list to tree

Linus Torvalds
2006-03-29 01:25:44 +0800

0a9450227 [PATCH] for_each_possible_cpu: fixes for generic part

Replaces for_each_cpu() with for_each_possible_cpu().

Signed-off-by: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KAMEZAWA Hiroyuki
2006-03-29 01:16:05 +0800

28 Mar, 2006

6 commits

206dc69b3 [BLOCK] cfq-iosched: seek and async performance fixes

Detect whether a given process is seeky and if so disable (mostly) the
idle window if it is. We still allow just a little idle time, just enough
to allow that process to submit a new request. That is needed to maintain
fairness across priority groups.

In some cases, we could setup several async queues. This is not optimal
from a performance POV, since we want all async io in one queue to perform
good sorting on it. It also impacted sync queues, as async io got too much
slice time.

Signed-off-by: Jens Axboe

Jens Axboe
2006-03-28 19:03:44 +0800

7143dd4b0 [PATCH] ll_rw_blk: fix 80-col offender in put_io_context()

This makes akpm more happy.

Signed-off-by: Jens Axboe

Jens Axboe
2006-03-28 15:00:28 +0800

e8a99053e [PATCH] cfq-iosched: small cfq_choose_req() optimization

This is a small optimization to cfq_choose_req() in the CFQ I/O scheduler
(this function shows up fairly often in an oprofile log):
by using a bit mask variable, we can use a simple switch() to check
the various cases instead of having to query two variables for each check.
Benefit: 251 vs. 285 bytes footprint of cfq_choose_req().
Also, common case 0 (no request wrapping) is now checked first in code.
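
The bit mask plus switch() technique can be sketched as below. This is a simplified illustration, not the scheduler's selection logic: the flag names, the `choose()` function, and the tie-breaking rules are assumptions made for the example.

```c
enum { WRAP_A = 0x01, WRAP_B = 0x02 };   /* hypothetical flag names */

/* Pick between two candidates, each with a distance and a "wrapped" flag.
 * In this simplified model a candidate that did not wrap beats one that
 * did; otherwise the nearer one wins. Returns 1 or 2. */
static int choose(unsigned long d1, int wrap1, unsigned long d2, int wrap2)
{
	unsigned int wrap = 0;

	/* Fold both flags into one value so each case is a single compare. */
	if (wrap1)
		wrap |= WRAP_A;
	if (wrap2)
		wrap |= WRAP_B;

	switch (wrap) {
	case 0:                      /* common case first: neither wrapped */
		return d1 <= d2 ? 1 : 2;
	case WRAP_B:                 /* only the second wrapped */
		return 1;
	case WRAP_A:                 /* only the first wrapped */
		return 2;
	case WRAP_A | WRAP_B:        /* both wrapped: compare distance again */
	default:
		return d1 <= d2 ? 1 : 2;
	}
}
```

Compared with a chain of `if (wrap1 && !wrap2)` style tests, each branch here checks one combined value rather than two variables.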

Signed-off-by: Andreas Mohr
Signed-off-by: Jens Axboe

Andreas Mohr
2006-03-28 14:59:49 +0800

e2d74ac06 [PATCH] [BLOCK] cfq-iosched: change cfq io context linking from list to tree

On setups with many disks, we spend a considerable amount of time
looking up the process-disk mapping on each queue of io. Testing with
a NULL based block driver, this costs a 40-50% reduction in throughput
for 1000 disks.

Signed-off-by: Jens Axboe

Jens Axboe
2006-03-28 14:59:01 +0800

4fa639123 Merge branch 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block

* 'for-linus' of git://brick.kernel.dk/data/git/linux-2.6-block:
[PATCH] Don't make debugfs depend on DEBUG_KERNEL
[PATCH] Fix blktrace compile with sysfs not defined
[PATCH] unused label in drivers/block/cciss.
[BLOCK] increase size of disk stat counters
[PATCH] blk_execute_rq_nowait-speedup
[PATCH] ide-cd: quiet down GPCMD_READ_CDVD_CAPACITY failure
[BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion
[PATCH] kzalloc() conversion in drivers/block
[PATCH] update max_sectors documentation

Linus Torvalds
2006-03-28 00:46:49 +0800

89e5c8b5b [PATCH] md: Make sure QUEUE_FLAG_CLUSTER is set properly for md.

This flag should be set for a virtual device iff it is set for all underlying
devices.

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-03-28 00:45:00 +0800

27 Mar, 2006

6 commits

09540e691 [PATCH] Fix blktrace compile with sysfs not defined

debugfs depends on sysfs, so make blktrace kconfig option depend
on that.

Reported by Adrian Bunk.

Signed-off-by: Jens Axboe

Jens Axboe
2006-03-27 15:29:03 +0800

837c78787 [BLOCK] increase size of disk stat counters

The kernel's representation of the disk statistics uses the type unsigned
which is 32b on both 32b and 64b platforms. Unfortunately, most system
tools that work with these numbers that are exported in /proc/diskstats
including iostat read these numbers into unsigned longs. This works fine
on 32b platforms and when the number of IO transactions is small on 64b
platforms. However, when the numbers wrap on 64b platforms & you read the
numbers into unsigned longs, and compare the numbers to previous readings,
then you get an unsigned representation of a negative number. This looks
like a very large 64b number & gives you bizarre readouts in iostat:

ilc4: Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
ilc4: sda 5.50 0.00 143.96 0.00 307496983987862656.00 0.00 153748491993931328.00 0.00 2136028725038430.00 7.94 55.12 5.59 80.42

Though fixing iostat in user space is possible, a quick survey
indicates that several other similar tools also use unsigned longs when
processing /proc/diskstats. Therefore, it seems like a better approach
would be to extend the length of the disk_stats structure on 64b
architectures to 64b. The following patch does that. It should not affect
the operation on 32b platforms.
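
The wrap-around effect described above can be reproduced in a few lines. The function names are hypothetical; the sketch only shows the arithmetic: a 32-bit counter that wraps gives a small delta at its native width but an enormous one once both readings are widened to 64 bits before subtracting.

```c
#include <stdint.h>

/* Delta as a tool computes it after reading 32-bit counters into
 * 64-bit variables: the subtraction wraps modulo 2^64. */
static uint64_t delta_widened(uint32_t prev, uint32_t cur)
{
	uint64_t p = prev, c = cur;

	return c - p;
}

/* Delta computed at the counter's native width: wraps modulo 2^32,
 * which cancels the counter's own wrap-around. */
static uint32_t delta_native(uint32_t prev, uint32_t cur)
{
	return cur - prev;
}
```

For a previous reading of 0xFFFFFFF0 and a current reading of 0x10, the native delta is 32, while the widened delta is 0xFFFFFFFF00000020, which is the kind of bizarre value that ends up in the report.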

Signed-off-by: Ben Woodard
Cc: Rick Lindsley
Signed-off-by: Andrew Morton
Signed-off-by: Jens Axboe

Ben Woodard
2006-03-27 15:29:02 +0800

4c5d0bbde [PATCH] blk_execute_rq_nowait-speedup

Both elv_add_request() and generic_unplug_device() grab the queue lock
and disable interrupts, do that locally and use the __ variants.

Signed-off-by: Andrew Morton
Signed-off-by: Jens Axboe

Andrew Morton
2006-03-27 15:29:02 +0800

f68110fc2 [BLOCK] ll_rw_blk: kmalloc -> kzalloc conversion

Signed-off-by: Jens Axboe

Jens Axboe
2006-03-27 15:29:02 +0800

a0f62ac63 [PATCH] 2TB files: add blkcnt_t

Add blkcnt_t as the type of inode.i_blocks. This enables you to make the size
of blkcnt_t either 4 bytes or 8 bytes on 32-bit architectures with CONFIG_LSF.

- CONFIG_LSF
Add new configuration parameter.
- blkcnt_t
On h8300, i386, mips, powerpc, s390 and sh that define sector_t,
blkcnt_t is defined as u64 if CONFIG_LSF is enabled; otherwise it is
defined as unsigned long.
On other architectures, it is defined as unsigned long.
- inode.i_blocks
Change the type from sector_t to blkcnt_t.
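
The list above can be sketched as a conditional typedef, together with the arithmetic behind the "2TB" in the title. The type is named `sketch_blkcnt_t` here to avoid clashing with the C library's own `blkcnt_t`, `max_bytes()` is a hypothetical helper, and the 512-byte block unit is an assumption of this example.

```c
#include <stdint.h>

/* Build with -DCONFIG_LSF to select the wide type. */
#ifdef CONFIG_LSF
typedef uint64_t sketch_blkcnt_t;        /* 8 bytes even on 32-bit builds */
#else
typedef unsigned long sketch_blkcnt_t;   /* 4 bytes on 32-bit builds */
#endif

/* Hypothetical inode with a block count of the configurable type. */
struct inode_sketch {
	sketch_blkcnt_t i_blocks;
};

/* Largest size addressable by a block counter of the given width.
 * counter_bits plus log2(block_size) must stay below 64. */
static uint64_t max_bytes(unsigned int counter_bits, uint64_t block_size)
{
	return ((uint64_t)1 << counter_bits) * block_size;
}
```

With a 32-bit counter and 512-byte blocks, `max_bytes(32, 512)` is 2^41 bytes, i.e. 2 TiB, which is the limit a 4-byte block count imposes under these assumptions.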

Signed-off-by: Takashi Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Takashi Sato
2006-03-27 00:57:00 +0800

93d2341c7 [PATCH] mempool: use mempool_create_slab_pool()

Modify well over a dozen mempool users to call mempool_create_slab_pool()
rather than calling mempool_create() with extra arguments, saving about 30
lines of code and increasing readability.

Signed-off-by: Matthew Dobson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Dobson
2006-03-27 00:57:00 +0800

25 Mar, 2006

1 commit

ce5244974 BUG_ON() Conversion in block/elevator.c

This changes if() BUG(); constructs to BUG_ON(), which is
cleaner, contains unlikely() and can be better optimized away.
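
The shape of the conversion can be shown with user-space stand-ins. The macros below are assumptions written for this sketch (using `abort()` and the GCC/Clang `__builtin_expect` builtin); they imitate the pattern and are not the kernel's definitions.

```c
#include <stdlib.h>

/* User-space stand-ins for illustration only. */
#define unlikely(x) __builtin_expect(!!(x), 0)
#define BUG()       abort()
#define BUG_ON(c)   do { if (unlikely(c)) BUG(); } while (0)

/* Before: open-coded check followed by BUG(). */
static int depth_before(int depth)
{
	if (depth < 0)
		BUG();
	return depth;
}

/* After: the same check as a single BUG_ON() line, which also carries
 * the branch hint that the condition is not expected to be true. */
static int depth_after(int depth)
{
	BUG_ON(depth < 0);
	return depth;
}
```

Both functions behave identically for valid input; the second form is shorter and tells the compiler which branch is the cold one.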

Signed-off-by: Eric Sesterhenn
Signed-off-by: Adrian Bunk

Eric Sesterhenn
2006-03-25 01:43:26 +0800

24 Mar, 2006

1 commit

2056a782f [PATCH] Block queue IO tracing support (blktrace) as of 2006-03-23

Signed-off-by: Jens Axboe

Jens Axboe
2006-03-24 03:00:26 +0800

23 Mar, 2006

1 commit

c039e3134 [PATCH] sem2mutex: blockdev #2

Semaphore to mutex conversion.

The conversion was generated via scripts, and the result was validated
automatically via a script as well.

Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar
Acked-by: Jens Axboe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arjan van de Ven
2006-03-23 23:38:11 +0800