14 Aug, 2009
1 commit
-
Conflicts:
arch/sparc/kernel/smp_64.c
arch/x86/kernel/cpu/perf_counter.c
arch/x86/kernel/setup_percpu.c
drivers/cpufreq/cpufreq_ondemand.c
mm/percpu.cConflicts in core and arch percpu codes are mostly from commit
ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many
num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all
the first chunk allocators into mm/percpu.c, the changes are moved
from arch code to mm/percpu.c.Signed-off-by: Tejun Heo
05 Aug, 2009
1 commit
-
…nce udev depends on BSG
Make Block Layer SG support v4 the default, since recent udev versions
depend on this to access serial numbers and other low level info properly.This should be backported to older kernels as well, since most distros have
enabled this for a long time.Signed-off-by: John Stoffel <john@stoffel.org>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
01 Aug, 2009
4 commits
-
Update topology comments and sysfs documentation based upon discussions
with Neil Brown.Signed-off-by: Martin K. Petersen
Signed-off-by: Jens Axboe -
When stacking block devices ensure that optimal I/O size is scaled
accordingly.Signed-off-by: Martin K. Petersen
Reviewed-by: Mike Snitzer
Signed-off-by: Jens Axboe -
Introduce blk_limits_io_min() and make blk_queue_io_min() call it.
Signed-off-by: Mike Snitzer
Signed-off-by: Martin K. Petersen
Signed-off-by: Jens Axboe -
blk_queue_stack_limits() has been superceded by blk_stack_limits() and
disk_stack_limits(). Wrap the function call for now, we'll deprecate it
later.Signed-off-by: Martin K. Petersen
Signed-off-by: Jens Axboe
29 Jul, 2009
1 commit
-
Prior to the change for more sane end_io functions, we exported
the helpers with the normal EXPORT_SYMBOL(). That got changed
to _GPL() for the new interface. Revert that particular change,
on the basis that this is basic functionality and doesn't dip
into internal structures. If these exports can't be non-GPL,
then we may as well make EXPORT_SYMBOL() imply GPL for
everything.Signed-off-by: Jens Axboe
28 Jul, 2009
2 commits
-
blk_integrity_unregister should use kobject_put to release the kobject,
otherwise after bi is freed, memory of bi->kobj->name is leaked.Signed-off-by: Xiaotian Feng
Signed-off-by: Jens Axboe -
Move the assignment of a default lock below blk_init_queue() to
blk_queue_make_request(), so we also get to set the default lock
for ->make_request_fn() based drivers. This is important since the
queue flag locking requires a lock to be in place.Signed-off-by: Jens Axboe
17 Jul, 2009
2 commits
-
In blk-sysfs.c, queue_var_store uses unsigned long to store data,
but queue_var_show uses unsigned int to show data. This causes,# echo 70000000000 > /sys/block//queue/read_ahead_kb
# cat /sys/block//queue/read_ahead_kb => get wrong valueFix it by using unsigned long.
While at it, convert queue_rq_affinity_show() such that it uses bool
variable instead of explicit != 0 testing.Signed-off-by: Xiaotian Feng
Signed-off-by: Tejun Heo -
Commit ab0fd1debe730ec9998678a0c53caefbd121ed10 tries to prevent merge
of requests with different failfast settings. In elv_rq_merge_ok(),
it compares new bio's failfast flags against the merge target
request's. However, the flag testing accessors for bio and blk don't
return boolean but the tested bit value directly and FAILFAST on bio
and blk don't match, so directly comparing them with == results in
false negative unnecessary preventing merge of readahead requests.This patch convert the results to boolean by negating them before
comparison.Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: Boaz Harrosh
Cc: FUJITA Tomonori
Cc: James Bottomley
Cc: Jeff Garzik
11 Jul, 2009
2 commits
-
In case memory is scarce, we now default to oom_cfqq. Once memory is
available again, we should allocate a new cfqq and stop using oom_cfqq for
a particular io context.Once a new request comes in, check if we are using oom_cfqq, and if yes,
try to allocate a new cfqq.Tested the patch by forcing the use of oom_cfqq and upon next request thread
realized that it was using oom_cfqq and it allocated a new cfqq.Signed-off-by: Vivek Goyal
Signed-off-by: Jens Axboe -
Currently, blk_scsi_ioctl_init() is not called since it lacks
an initcall marking. This causes the command table to be
unitialized, hence somce commands are block when they should
not have been.This fixes a regression introduced by commit
018e0446890661504783f92388ecce7138c1566dSigned-off-by: FUJITA Tomonori
Signed-off-by: Jens Axboe
04 Jul, 2009
2 commits
-
Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix
changes. As alpha in percpu tree uses 'weak' attribute instead of
inline assembly, there's no need for __used attribute.Conflicts:
arch/alpha/include/asm/percpu.h
arch/mn10300/kernel/vmlinux.lds.S
include/linux/percpu-defs.h -
Block layer used to merge requests and bios with different failfast
settings. This caused regular IOs to fail prematurely when they were
merged into failfast requests for readahead.Niel Lambrechts could trigger the problem semi-reliably on ext4 when
resuming from STR. ext4 uses readahead when reading inodes and
combined with the deterministic extra SATA PHY exception cycle during
resume on the specific configuration, non-readahead inode read would
fail causing ext4 errors. Please read the following thread for
details.http://lkml.org/lkml/2009/5/23/21
This patch makes block layer reject merging if the failfast settings
don't match. This is correct but likely to lower IO performance by
preventing regular IOs from mingling into surrounding readahead
requests. Changes to allow such mixed merges and handle errors
correctly will be added later.Signed-off-by: Tejun Heo
Reported-by: Niel Lambrechts
Cc: Theodore Tso
Signed-off-by: Jens Axboe
01 Jul, 2009
6 commits
-
With the changes for falling back to an oom_cfqq, we never fail
to find/allocate a queue in cfq_get_queue(). So remove the check.Signed-off-by: Shan Wei
Signed-off-by: Jens Axboe -
The next_ordered flag is only meaningful for devices that use __make_request.
So move the test against next_ordered out of generic code and in to
__make_requestSince this test was added, barriers have not worked on md or any
devices that don't use __make_request and so don't bother to set
next_ordered. (dm explicitly sets something other than
QUEUE_ORDERED_NONE since
commit 99360b4c18f7675b50d283301d46d755affe75fd
but notes in the comments that it is otherwise meaningless).Cc: Ken Milmore
Cc: stable@kernel.org
Signed-off-by: NeilBrown
Signed-off-by: Jens Axboe -
The initial patches to support this through sysfs export were broken
and have been if 0'ed out in any release. So lets just kill the code
and reclaim some space in struct request_queue, if anyone would later
like to fixup the sysfs bits, the git history can easily restore
the removed bits.Signed-off-by: Jens Axboe
-
This patch restores stacking ability to the block layer integrity
infrastructure by creating a set of dedicated bip slabs. Each bip slab
has an embedded bio_vec array at the end. This cuts down on memory
allocations and also simplifies the code compared to the original bvec
version. Only the largest bip slab is backed by a mempool. The pool is
contained in the bio_set so stacking drivers can ensure forward
progress.Signed-off-by: Martin K. Petersen
Signed-off-by: Jens Axboe -
Setup an emergency fallback cfqq that we allocate at IO scheduler init
time. If the slab allocation fails in cfq_find_alloc_queue(), we'll just
punt IO to that cfqq instead. This ensures that cfq_find_alloc_queue()
never fails without having to ensure free memory.On cfqq lookup, always try to allocate a new cfqq if the given cfq io
context has the oom_cfqq assigned. This ensures that we only temporarily
punt to this shared queue.Reviewed-by: Jeff Moyer
Signed-off-by: Jens Axboe -
We're going to be needing that init code outside of that function
to get rid of the __GFP_NOFAIL in cfqq allocation.Reviewed-by: Jeff Moyer
Signed-off-by: Jens Axboe
24 Jun, 2009
1 commit
-
Percpu variable definition is about to be updated such that all percpu
symbols including the static ones must be unique. Update percpu
variable definitions accordingly.* as,cfq: rename ioc_count uniquely
* cpufreq: rename cpu_dbs_info uniquely
* xen: move nesting_count out of xen_evtchn_do_upcall() and rename it
* mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and
rename it* ipv4,6: rename cookie_scratch uniquely
* x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to
pmc_irq_entry and nmi_entry to pmc_nmi_entry* perf_counter: rename disable_count to perf_disable_count
* ftrace: rename test_event_disable to ftrace_test_event_disable
* kmemleak: rename test_pointer to kmemleak_test_pointer
* mce: rename next_interval to mce_next_interval
[ Impact: percpu usage cleanups, no duplicate static percpu var names ]
Signed-off-by: Tejun Heo
Reviewed-by: Christoph Lameter
Cc: Ivan Kokshaysky
Cc: Jens Axboe
Cc: Dave Jones
Cc: Jeremy Fitzhardinge
Cc: linux-mm
Cc: David S. Miller
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: Li Zefan
Cc: Catalin Marinas
Cc: Andi Kleen
21 Jun, 2009
1 commit
-
The SMP handler (sas_smp_request) was fixed to use the block API
properly, so we don't need this workaround to avoid blk_put_request()
warning.Signed-off-by: FUJITA Tomonori
Signed-off-by: James Bottomley
19 Jun, 2009
2 commits
-
Warning(block/blk-settings.c:108): No description found for parameter 'lim'
Warning(block/blk-settings.c:108): Excess function parameter 'limits' description in 'blk_set_default_limits'Signed-off-by: Randy Dunlap
Signed-off-by: Jens Axboe -
Follow-up to "block: enable by default support for large devices
and files on 32-bit archs".Rename CONFIG_LBD to CONFIG_LBDAF to:
- allow update of existing [def]configs for "default y" change
- reflect that it is used also for large files support nowadaysSigned-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Jens Axboe
18 Jun, 2009
1 commit
-
Correct stacking bounce_pfn limit setting and prevent warnings on
32-bit.Signed-off-by: Martin K. Petersen
Signed-off-by: Jens Axboe
17 Jun, 2009
1 commit
-
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (64 commits)
debugfs: use specified mode to possibly mark files read/write only
debugfs: Fix terminology inconsistency of dir name to mount debugfs filesystem.
xen: remove driver_data direct access of struct device from more drivers
usb: gadget: at91_udc: remove driver_data direct access of struct device
uml: remove driver_data direct access of struct device
block/ps3: remove driver_data direct access of struct device
s390: remove driver_data direct access of struct device
parport: remove driver_data direct access of struct device
parisc: remove driver_data direct access of struct device
of_serial: remove driver_data direct access of struct device
mips: remove driver_data direct access of struct device
ipmi: remove driver_data direct access of struct device
infiniband: ehca: remove driver_data direct access of struct device
ibmvscsi: gadget: at91_udc: remove driver_data direct access of struct device
hvcs: remove driver_data direct access of struct device
xen block: remove driver_data direct access of struct device
thermal: remove driver_data direct access of struct device
scsi: remove driver_data direct access of struct device
pcmcia: remove driver_data direct access of struct device
PCIE: remove driver_data direct access of struct device
...Manually fix up trivial conflicts due to different direct driver_data
direct access fixups in drivers/block/{ps3disk.c,ps3vram.c}
16 Jun, 2009
7 commits
-
When porting blktrace to tracepoints, we changed to trace/block.h
for trace prober declarations.Signed-off-by: Li Zefan
Signed-off-by: Jens Axboe -
DM reuses the request queue when swapping in a new device table
Introduce blk_set_default_limits() which can be used to reset the the
queue_limits prior to stacking devices.Signed-off-by: Martin K. Petersen
Acked-by: Alasdair G Kergon
Acked-by: Mike Snitzer
Signed-off-by: Jens Axboe -
I noticed a blank line in blktrace output. This patch fixes that.
Signed-off-by: Jeff Moyer
Signed-off-by: Jens Axboe -
Move the defaults to where we do the init of the backing_dev_info.
Signed-off-by: Jens Axboe
-
Actually, last_end_request in cfq_data isn't used now. So lets
just remove it.Signed-off-by: Gui Jianfeng
Signed-off-by: Jens Axboe -
This adds support to the BSG driver to report the proper device name to
userspace for the bsg devices.Signed-off-by: Kay Sievers
Signed-off-by: Jan Blunck
Signed-off-by: Greg Kroah-Hartman -
This adds support for block drivers to report their requested nodename
to userspace. It also updates a number of block drivers to provide the
needed subdirectory and device name to be used for them.Signed-off-by: Kay Sievers
Signed-off-by: Jan Blunck
Signed-off-by: Greg Kroah-Hartman
12 Jun, 2009
3 commits
-
Fix kernel-doc warnings in recently changed block/ source code.
Signed-off-by: Randy Dunlap
Signed-off-by: Linus Torvalds -
* 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits)
block: add request clone interface (v2)
floppy: fix hibernation
ramdisk: remove long-deprecated "ramdisk=" boot-time parameter
fs/bio.c: add missing __user annotation
block: prevent possible io_context->refcount overflow
Add serial number support for virtio_blk, V4a
block: Add missing bounce_pfn stacking and fix comments
Revert "block: Fix bounce limit setting in DM"
cciss: decode unit attention in SCSI error handling code
cciss: Remove no longer needed sendcmd reject processing code
cciss: change SCSI error handling routines to work with interrupts enabled.
cciss: separate error processing and command retrying code in sendcmd_withirq_core()
cciss: factor out fix target status processing code from sendcmd functions
cciss: simplify interface of sendcmd() and sendcmd_withirq()
cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code
cciss: Use schedule_timeout_uninterruptible in SCSI error handling code
block: needs to set the residual length of a bidi request
Revert "block: implement blkdev_readpages"
block: Fix bounce limit setting in DM
Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt
...Manually fix conflicts with tracing updates in:
block/blk-sysfs.c
drivers/ide/ide-atapi.c
drivers/ide/ide-cd.c
drivers/ide/ide-floppy.c
drivers/ide/ide-tape.c
include/trace/events/block.h
kernel/trace/blktrace.c -
* 'for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (28 commits)
ide-tape: fix debug call
alim15x3: Remove historical hacks, re-enable init_hwif for PowerPC
ide-dma: don't reset request fields on dma_timeout_retry()
ide: drop rq->data handling from ide_map_sg()
ide-atapi: kill unused fields and callbacks
ide-tape: simplify read/write functions
ide-tape: use byte size instead of sectors on rw issue functions
ide-tape: unify r/w init paths
ide-tape: kill idetape_bh
ide-tape: use standard data transfer mechanism
ide-tape: use single continuous buffer
ide-atapi,tape,floppy: allow ->pc_callback() to change rq->data_len
ide-tape,floppy: fix failed command completion after request sense
ide-pm: don't abuse rq->data
ide-cd,atapi: use bio for internal commands
ide-atapi: convert ide-{floppy,tape} to using preallocated sense buffer
ide-cd: convert to using generic sense request
ide: add helpers for preparing sense requests
ide-cd: don't abuse rq->buffer
ide-atapi: don't abuse rq->buffer
...
11 Jun, 2009
3 commits
-
This patch adds the following 2 interfaces for request-stacking drivers:
- blk_rq_prep_clone(struct request *clone, struct request *orig,
struct bio_set *bs, gfp_t gfp_mask,
int (*bio_ctr)(struct bio *, struct bio*, void *),
void *data)
* Clones bios in the original request to the clone request
(bio_ctr is called for each cloned bios.)
* Copies attributes of the original request to the clone request.
The actual data parts (e.g. ->cmd, ->buffer, ->sense) are not
copied.- blk_rq_unprep_clone(struct request *clone)
* Frees cloned bios from the clone request.Request stacking drivers (e.g. request-based dm) need to make a clone
request for a submitted request and dispatch it to other devices.To allocate request for the clone, request stacking drivers may not
be able to use blk_get_request() because the allocation may be done
in an irq-disabled context.
So blk_rq_prep_clone() takes a request allocated by the caller
as an argument.For each clone bio in the clone request, request stacking drivers
should be able to set up their own completion handler.
So blk_rq_prep_clone() takes a callback function which is called
for each clone bio, and a pointer for private data which is passed
to the callback.NOTE:
blk_rq_prep_clone() doesn't copy any actual data of the original
request. Pages are shared between original bios and cloned bios.
So caller must not complete the original request before the clone
request.Signed-off-by: Kiyoshi Ueda
Signed-off-by: Jun'ichi Nomura
Cc: Boaz Harrosh
Signed-off-by: Jens Axboe -
* 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits)
Revert "x86, bts: reenable ptrace branch trace support"
tracing: do not translate event helper macros in print format
ftrace/documentation: fix typo in function grapher name
tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK
tracing: add protection around module events unload
tracing: add trace_seq_vprint interface
tracing: fix the block trace points print size
tracing/events: convert block trace points to TRACE_EVENT()
ring-buffer: fix ret in rb_add_time_stamp
ring-buffer: pass in lockdep class key for reader_lock
tracing: add annotation to what type of stack trace is recorded
tracing: fix multiple use of __print_flags and __print_symbolic
tracing/events: fix output format of user stack
tracing/events: fix output format of kernel stack
tracing/trace_stack: fix the number of entries in the header
ring-buffer: discard timestamps that are at the start of the buffer
ring-buffer: try to discard unneeded timestamps
ring-buffer: fix bug in ring_buffer_discard_commit
ftrace: do not profile functions when disabled
tracing: make trace pipe recognize latency format flag
... -
Currently io_context has an atomic_t(32-bit) as refcount. In the case of
cfq, for each device against whcih a task does I/O, a reference to the
io_context would be taken. And when there are multiple process sharing
io_contexts(CLONE_IO) would also have a reference to the same io_context.Theoretically the possible maximum number of processes sharing the same
io_context + the number of disks/cfq_data referring to the same io_context
can overflow the 32-bit counter on a very high-end machine.Even though it is an improbable case, let us make it atomic_long_t.
Signed-off-by: Nikanth Karthikesan
Signed-off-by: Andrew Morton
Signed-off-by: Jens Axboe