Eric Lee / smarc-fsl-linux-kernel

10 Jan, 2012

1 commit

db0c2bf69 Merge branch 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

* 'for-3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits)
cgroup: fix to allow mounting a hierarchy by name
cgroup: move assignement out of condition in cgroup_attach_proc()
cgroup: Remove task_lock() from cgroup_post_fork()
cgroup: add sparse annotation to cgroup_iter_start() and cgroup_iter_end()
cgroup: mark cgroup_rmdir_waitq and cgroup_attach_proc() as static
cgroup: only need to check oldcgrp==newgrp once
cgroup: remove redundant get/put of task struct
cgroup: remove redundant get/put of old css_set from migrate
cgroup: Remove unnecessary task_lock before fetching css_set on migration
cgroup: Drop task_lock(parent) on cgroup_fork()
cgroups: remove redundant get/put of css_set from css_set_check_fetched()
resource cgroups: remove bogus cast
cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task()
cgroup, cpuset: don't use ss->pre_attach()
cgroup: don't use subsys->can_attach_task() or ->attach_task()
cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach()
cgroup: improve old cgroup handling in cgroup_attach_proc()
cgroup: always lock threadgroup during migration
threadgroup: extend threadgroup_lock() to cover exit and exec
threadgroup: rename signal->threadgroup_fork_lock to ->group_rwsem
...

Fix up conflict in kernel/cgroup.c due to commit e0197aae59e5: "cgroups:
fix a css_set not found bug in cgroup_attach_proc" that already
mentioned that the bug is fixed (differently) in Tejun's cgroup
patchset. This one, in other words.

Linus Torvalds
2012-01-10 04:59:24 +0800

09 Jan, 2012

1 commit

972b2c719 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits)
reiserfs: Properly display mount options in /proc/mounts
vfs: prevent remount read-only if pending removes
vfs: count unlinked inodes
vfs: protect remounting superblock read-only
vfs: keep list of mounts for each superblock
vfs: switch ->show_options() to struct dentry *
vfs: switch ->show_path() to struct dentry *
vfs: switch ->show_devname() to struct dentry *
vfs: switch ->show_stats to struct dentry *
switch security_path_chmod() to struct path *
vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb
vfs: trim includes a bit
switch mnt_namespace ->root to struct mount
vfs: take /proc/*/mounts and friends to fs/proc_namespace.c
vfs: opencode mntget() mnt_set_mountpoint()
vfs: spread struct mount - remaining argument of next_mnt()
vfs: move fsnotify junk to struct mount
vfs: move mnt_devname
vfs: move mnt_list to struct mount
vfs: switch pnode.h macros to struct mount *
...

Linus Torvalds
2012-01-09 04:19:57 +0800

07 Jan, 2012

1 commit

ece2ccb66 Merge branches 'vfsmount-guts', 'umode_t' and 'partitions' into Z

Al Viro
2012-01-07 12:15:54 +0800

06 Jan, 2012

1 commit

07d106d0a vfs: fix up ENOIOCTLCMD error handling

We're doing some odd things there, which already messes up various users
(see the net/socket.c code that this removes), and it was going to add
yet more crud to the block layer because of the incorrect error code
translation.

ENOIOCTLCMD is not an error return that should be returned to user mode
from the "ioctl()" system call, but it should *not* be translated as
EINVAL ("Invalid argument"). It should be translated as ENOTTY
("Inappropriate ioctl for device").

That EINVAL confusion has apparently so permeated some code that the
block layer actually checks for it, which is sad. We continue to do so
for now, but add a big comment about how wrong that is, and we should
remove it entirely eventually. In the meantime, this tries to keep the
changes localized to just the EINVAL -> ENOTTY fix, and removing code
that makes it harder to do the right thing.
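
As a rough illustration of the translation described above, the following
userspace C sketch maps the kernel-internal ENOIOCTLCMD to ENOTTY before a
result reaches user mode. The helper name ioctl_result_to_user() is
hypothetical and is not the function this commit touches. The value 515 is
the kernel's definition of ENOIOCTLCMD, repeated here because userspace
<errno.h> does not provide it.

```c
#include <errno.h>

/* Kernel-internal code; not defined by userspace <errno.h>.
 * 515 matches the kernel's include/linux/errno.h. */
#ifndef ENOIOCTLCMD
#define ENOIOCTLCMD 515
#endif

/* Hypothetical helper: convert a handler's result into what the
 * ioctl() system call should report to user mode. */
static int ioctl_result_to_user(int err)
{
    if (err == -ENOIOCTLCMD)
        return -ENOTTY;   /* "Inappropriate ioctl for device" */
    return err;           /* everything else passes through, EINVAL included */
}
```

The point of the sketch is that EINVAL stays reserved for a recognized
command with bad arguments, while an unrecognized command yields ENOTTY.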

Signed-off-by: Linus Torvalds

Linus Torvalds
2012-01-06 07:40:12 +0800

04 Jan, 2012

5 commits

2c9ede55e switch device_get_devnode() and ->devnode() to umode_t *

both callers of device_get_devnode() are only interested in lower 16bits
and nobody tries to return anything wider than 16bit anyway.

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:54:55 +0800
ff01bb483 fs: move code out of buffer.c

Move invalidate_bdev, block_sync_page into fs/block_dev.c. Export
kill_bdev as well, so brd doesn't have to open code it. Reduce
buffer_head.h requirement accordingly.

Removed a rather large comment from invalidate_bdev, as it looked a bit
obsolete to bother moving. The small comment replacing it says enough.

Signed-off-by: Nick Piggin
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro

Al Viro
2012-01-04 11:54:07 +0800
94ea4158f separate partition format handling from generic code

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:54:06 +0800
9be96f3fd move fs/partitions to block/

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:54:06 +0800
4752bc309 make register_disk() static

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:54:05 +0800

29 Dec, 2011

1 commit

f2b20d436 block: fix blk_queue_end_tag()

Commit 5e081591 "block: warn if tag is greater than real_max_depth"
cleaned up blk_queue_end_tag() to warn when the tag is truly invalid
(greater than real_max_depth). However, it changed behavior in the tag <
max_depth case to not end the request, leading to the triggering of
BUG_ON(blk_queued_rq(rq)) in the request completion path:

http://marc.info/?l=linux-kernel&m=132204370518629&w=2

In order to allow blk_queue_resize_tags() to shrink the tag space
blk_queue_end_tag() must always complete tags with a value less than
real_max_depth regardless of the current max_depth. The comment about
"handling the shrink case" seems to be what prompted changes in this
space, so remove it and BUG on all invalid tags (made even simpler by
Matthew's suggestion to use an unsigned compare).
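
The unsigned compare mentioned above can be sketched as follows. This is a
standalone illustration under assumed names (tag_is_valid() is
hypothetical), not the code of blk_queue_end_tag(): casting a signed tag to
unsigned turns every negative value into a very large one, so a single
comparison rejects both negative tags and tags at or beyond
real_max_depth.

```c
#include <stdbool.h>

/* Hypothetical helper: true iff 0 <= tag < real_max_depth.
 * A negative tag becomes a huge unsigned value and fails the test,
 * so no separate "tag < 0" check is needed. */
static bool tag_is_valid(int tag, int real_max_depth)
{
    return (unsigned int)tag < (unsigned int)real_max_depth;
}
```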

Signed-off-by: Dan Williams
Cc: Tao Ma
Cc: Matthew Wilcox
Reported-by: Meelis Roos
Reported-by: Ed Nadolski
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Jens Axboe

Dan Williams
2011-12-29 16:16:28 +0800

21 Dec, 2011

1 commit

609f6ea1c block: re-use existing 'reading' variable instead of checking direction again

Signed-off-by: majianpeng
Signed-off-by: Jens Axboe

majianpeng
2011-12-21 22:27:24 +0800

16 Dec, 2011

2 commits

6ae0516b8 block, cfq: fix empty queue crash caused by request merge

All requests of a queue could be merged into requests of another queue.
Such a queue will not have any requests in it, but it is still in the
service tree. This will cause a kernel oops.
I encountered a BUG_ON() in cfq_dispatch_request() with the next patch, but
the issue should exist without the patch.

Signed-off-by: Shaohua Li
Signed-off-by: Jens Axboe

Shaohua Li
2011-12-16 21:04:23 +0800
4eabc9412 block: don't kick empty queue in blk_drain_queue()

While probing, fd sets up queue, probes hardware and tears down the
queue if probing fails. In the process, blk_drain_queue() kicks the
queue which failed to finish initialization and fd is unhappy about
that.

floppy0: no floppy controllers found
------------[ cut here ]------------
WARNING: at drivers/block/floppy.c:2929 do_fd_request+0xbf/0xd0()
Hardware name: To Be Filled By O.E.M.
VFS: do_fd_request called on non-open device
Modules linked in:
Pid: 1, comm: swapper Not tainted 3.2.0-rc4-00077-g5983fe2 #2
Call Trace:
[] warn_slowpath_common+0x7a/0xb0
[] warn_slowpath_fmt+0x41/0x50
[] do_fd_request+0xbf/0xd0
[] blk_drain_queue+0x65/0x80
[] blk_cleanup_queue+0xe3/0x1a0
[] floppy_init+0xdeb/0xe28
[] ? daring+0x6b/0x6b
[] do_one_initcall+0x3f/0x170
[] kernel_init+0x9d/0x11e
[] ? schedule_tail+0x22/0xa0
[] kernel_thread_helper+0x4/0x10
[] ? start_kernel+0x2be/0x2be
[] ? gs_change+0xb/0xb

Avoid it by making blk_drain_queue() kick queue iff dispatch queue has
something on it.

Signed-off-by: Tejun Heo
Reported-by: Ralf Hildebrandt
Reported-by: Wu Fengguang
Tested-by: Sergei Trofimovich
Signed-off-by: Jens Axboe

Tejun Heo
2011-12-16 03:03:04 +0800

13 Dec, 2011

1 commit

bb9d97b6d cgroup: don't use subsys->can_attach_task() or ->attach_task()

Now that subsys->can_attach() and attach() take @tset instead of
@task, they can handle per-task operations. Convert
->can_attach_task() and ->attach_task() users to use ->can_attach()
and attach() instead. Most conversions are straightforward.
Noteworthy changes are,

* In cgroup_freezer, remove unnecessary NULL assignments to unused
methods. It's useless and very prone to get out of sync, which
already happened.

* In cpuset, PF_THREAD_BOUND test is checked for each task. This
doesn't make any practical difference but is conceptually cleaner.

Signed-off-by: Tejun Heo
Reviewed-by: KAMEZAWA Hiroyuki
Reviewed-by: Frederic Weisbecker
Acked-by: Li Zefan
Cc: Paul Menage
Cc: Balbir Singh
Cc: Daisuke Nishimura
Cc: James Morris
Cc: Ingo Molnar
Cc: Peter Zijlstra

Tejun Heo
2011-12-13 10:12:21 +0800

02 Dec, 2011

1 commit

5eb46851d cfq-iosched: fix cfq_cic_link() race confition

cfq_cic_link() has a race condition. When processes that share an ioc
issue I/O to the same block device simultaneously, cfq_cic_link() sometimes
returns -EEXIST. The race condition can stall I/O through the following steps:

step 1: Process A: Issue an I/O to /dev/sda
step 2: Process A: Get an ioc (iocA here) in get_io_context() which is not
yet linked with a cic for the device
step 3: Process A: Get a new cic for the device (cicA here) in
cfq_alloc_io_context()

step 4: Process B: Issue an I/O to /dev/sda
step 5: Process B: Get iocA in get_io_context() since process A and B share the
same ioc
step 6: Process B: Get a new cic for the device (cicB here) in
cfq_alloc_io_context() since iocA has not been linked with a
cic for the device yet

step 7: Process A: Link cicA to iocA in cfq_cic_link()
step 8: Process A: Dispatch I/O to driver and finish it

step 9: Process B: Try to link cicB to iocA in cfq_cic_link()
But it fails, showing the "cfq: cic link failed!" kernel
message, since iocA was already linked with cicA at step 7.
step 10: Process B: Wait for I/O to finish in get_request_wait()
The function does not wake up when there is no I/O to the
device.

When cfq_cic_link() returns -EEXIST, it means the ioc is already linked with a cic.
So when cfq_cic_link() returns -EEXIST, retry cfq_cic_lookup().
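
The retry described above can be modelled in a few lines of plain C. This is
a single-threaded sketch with hypothetical types and helpers (struct
ioc_model, cic_link_model() and so on); it assumes one device per ioc and
does not reproduce the locking or the radix tree used by the real
cfq-iosched code. It only shows the control flow: a loser of the race treats
-EEXIST as "someone else linked first" and falls back to a lookup.

```c
#include <errno.h>
#include <stddef.h>

struct cic_model { int id; };
struct ioc_model { struct cic_model *linked; };  /* one device, for brevity */

static struct cic_model *cic_lookup_model(struct ioc_model *ioc)
{
    return ioc->linked;
}

/* Fails with -EEXIST if another cic was linked first (step 9 above). */
static int cic_link_model(struct ioc_model *ioc, struct cic_model *cic)
{
    if (ioc->linked)
        return -EEXIST;
    ioc->linked = cic;
    return 0;
}

/* Link new_cic, or on -EEXIST return the cic that won the race
 * instead of treating the failure as fatal. */
static struct cic_model *cic_link_or_lookup(struct ioc_model *ioc,
                                            struct cic_model *new_cic)
{
    int ret = cic_link_model(ioc, new_cic);

    if (ret == 0)
        return new_cic;
    if (ret == -EEXIST)
        return cic_lookup_model(ioc);  /* retry the lookup */
    return NULL;
}
```

In this model the caller remains responsible for discarding a cic that lost
the race.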

Signed-off-by: Yasuaki Ishimatsu
Cc: stable@kernel.org
Signed-off-by: Jens Axboe

Yasuaki Ishimatsu
2011-12-02 17:07:07 +0800

30 Nov, 2011

1 commit

2984ff38c cfq-iosched: free cic_index if blkio_alloc_blkg_stats fails

If we fail to allocate the blkg stats, we free cfqd and cfqg.
But we need to free the IDA cfqd->cic_index as well.

Signed-off-by: majianpeng
Cc: stable@kernel.org
Signed-off-by: Jens Axboe

majianpeng
2011-11-30 22:47:48 +0800

23 Nov, 2011

1 commit

5151412dd block: initialize request_queue's numa node during ...

struct request_queue is allocated with __GFP_ZERO so its "node" field is
zero before initialization. This causes an oops if node 0 is offline in
the page allocator because its zonelists are not initialized. From Dave
Young's dmesg:

SRAT: Node 1 PXM 2 0-d0000000
SRAT: Node 1 PXM 2 100000000-330000000
SRAT: Node 0 PXM 1 330000000-630000000
Initmem setup node 1 0000000000000000-000000000affb000
...
Built 1 zonelists in Node order, mobility grouping on.
...
BUG: unable to handle kernel paging request at 0000000000001c08
IP: [] __alloc_pages_nodemask+0xb5/0x870

and __alloc_pages_nodemask+0xb5 translates to a NULL pointer on
zonelist->_zonerefs.

The fix is to initialize q->node at the time of allocation so the correct
node is passed to the slab allocator later.

Since blk_init_allocated_queue_node() is no longer needed, merge it with
blk_init_allocated_queue().

[rientjes@google.com: changelog, initializing q->node]
Cc: stable@vger.kernel.org [2.6.37+]
Reported-by: Dave Young
Signed-off-by: Mike Snitzer
Signed-off-by: David Rientjes
Tested-by: Dave Young
Signed-off-by: Jens Axboe

Mike Snitzer
2011-11-23 17:59:13 +0800

16 Nov, 2011

2 commits

019ceb7d5 block: add missed trace_block_plug

After flush plug list, the list has no request, so we need to add a
trace_block_plug().

Signed-off-by: Shaohua Li
Reviewed-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Jens Axboe

Shaohua Li
2011-11-16 16:21:50 +0800
3540d5e89 block: avoid unnecessary plug list flush

get_request_wait() could sleep and flush the plug list. If the list is
already flushed, don't flush again.

Signed-off-by: Shaohua Li
Reviewed-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Jens Axboe

Shaohua Li
2011-11-16 16:21:50 +0800

14 Nov, 2011

1 commit

6b76106d8 block: Always check length of all iov entries in blk_rq_map_user_iov()

Even after commit 5478755616ae2ef1ce144dded589b62b2a50d575
("block: check for proper length of iov entries earlier ...")
we still won't check for zero-length entries after an unaligned
entry. Remove the break-statement, so all entries are checked.
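
The effect of removing the break can be sketched with a small standalone
loop. The struct and function names below (struct seg_model,
check_segments()) are hypothetical stand-ins, not the kernel's iovec
handling: the point is only that an unaligned entry sets a flag and the
loop keeps going, so a zero-length entry later in the array is still
rejected.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one iov entry. */
struct seg_model {
    uintptr_t addr;
    size_t len;
};

/* Returns -EINVAL if any entry has zero length; otherwise 0, with
 * *unaligned set when at least one address violates align_mask. */
static int check_segments(const struct seg_model *segs, int count,
                          uintptr_t align_mask, int *unaligned)
{
    *unaligned = 0;
    for (int i = 0; i < count; i++) {
        if (segs[i].len == 0)
            return -EINVAL;
        if (segs[i].addr & align_mask)
            *unaligned = 1;   /* no break: keep checking later entries */
    }
    return 0;
}
```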

Signed-off-by: Ben Hutchings
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe

Ben Hutchings
2011-11-14 02:58:09 +0800

10 Nov, 2011

1 commit

d0985394e block: Revert "[SCSI] genhd: add a new attribute "alias" in gendisk"

This reverts commit a72c5e5eb738033938ab30d6a634b74d1d060f10.

The commit introduced alias for block devices which is intended to be
used during logging although actual usage hasn't been committed yet.
This approach adds very limited benefit (raw log might be easier to
follow) which can be trivially implemented in userland but has a lot
of problems.

It is much worse than netif renames because it doesn't rename the
actual device but just adds convenience name which isn't used
universally or enforced. Everything internal including device lookup
and sysfs still uses the internal name and nothing prevents two
devices from using conflicting alias - ie. sda can have sdb as its
alias.

This has been nacked by people working on device driver core, block
layer and kernel-userland interface and shouldn't have been
upstreamed. Revert it.

http://thread.gmane.org/gmane.linux.kernel/1155104
http://thread.gmane.org/gmane.linux.scsi/68632
http://thread.gmane.org/gmane.linux.scsi/69776

Signed-off-by: Tejun Heo
Acked-by: Greg Kroah-Hartman
Acked-by: Kay Sievers
Cc: "James E.J. Bottomley"
Cc: Nao Nishijima
Cc: Alan Cox
Cc: Al Viro
Signed-off-by: Jens Axboe

Tejun Heo
2011-11-10 16:03:55 +0800

07 Nov, 2011

1 commit

32aaeffbd Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux

* 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
Revert "tracing: Include module.h in define_trace.h"
irq: don't put module.h into irq.h for tracking irqgen modules.
bluetooth: macroize two small inlines to avoid module.h
ip_vs.h: fix implicit use of module_get/module_put from module.h
nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
include: replace linux/module.h with "struct module" wherever possible
include: convert various register fcns to macros to avoid include chaining
crypto.h: remove unused crypto_tfm_alg_modname() inline
uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
pm_runtime.h: explicitly requires notifier.h
linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
miscdevice.h: fix up implicit use of lists and types
stop_machine.h: fix implicit use of smp.h for smp_processor_id
of: fix implicit use of errno.h in include/linux/of.h
of_platform.h: delete needless include
acpi: remove module.h include from platform/aclinux.h
miscdevice.h: delete unnecessary inclusion of module.h
device_cgroup.h: delete needless include
net: sch_generic remove redundant use of <linux/module.h>
net: inet_timewait_sock doesnt need <linux/module.h>
...

Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
- drivers/media/dvb/frontends/dibx000_common.c
- drivers/media/video/{mt9m111.c,ov6650.c}
- drivers/mfd/ab3550-core.c
- include/linux/dmaengine.h

Linus Torvalds
2011-11-07 11:44:47 +0800

05 Nov, 2011

2 commits

3d0a8d10c Merge branch 'for-3.2/drivers' of git://git.kernel.dk/linux-block

* 'for-3.2/drivers' of git://git.kernel.dk/linux-block: (30 commits)
virtio-blk: use ida to allocate disk index
hpsa: add small delay when using PCI Power Management to reset for kump
cciss: add small delay when using PCI Power Management to reset for kump
xen/blkback: Fix two races in the handling of barrier requests.
xen/blkback: Check for proper operation.
xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
xen/blkback: Report VBD_WSECT (wr_sect) properly.
xen/blkback: Support 'feature-barrier' aka old-style BARRIER requests.
xen-blkfront: plug device number leak in xlblk_init() error path
xen-blkfront: If no barrier or flush is supported, use invalid operation.
xen-blkback: use kzalloc() in favor of kmalloc()+memset()
xen-blkback: fixed indentation and comments
xen-blkfront: fix a deadlock while handling discard response
xen-blkfront: Handle discard requests.
xen-blkback: Implement discard requests ('feature-discard')
xen-blkfront: add BLKIF_OP_DISCARD and discard request struct
drivers/block/loop.c: remove unnecessary bdev argument from loop_clr_fd()
drivers/block/loop.c: emit uevent on auto release
drivers/block/cpqarray.c: use pci_dev->revision
loop: always allow userspace partitions and optionally support automatic scanning
...

Fix up trivial header file inclusion conflict in drivers/block/loop.c

Linus Torvalds
2011-11-05 08:22:14 +0800
b4fdcb02f Merge branch 'for-3.2/core' of git://git.kernel.dk/linux-block

* 'for-3.2/core' of git://git.kernel.dk/linux-block: (29 commits)
block: don't call blk_drain_queue() if elevator is not up
blk-throttle: use queue_is_locked() instead of lockdep_is_held()
blk-throttle: Take blkcg->lock while traversing blkcg->policy_list
blk-throttle: Free up policy node associated with deleted rule
block: warn if tag is greater than real_max_depth.
block: make gendisk hold a reference to its queue
blk-flush: move the queue kick into
blk-flush: fix invalid BUG_ON in blk_insert_flush
block: Remove the control of complete cpu from bio.
block: fix a typo in the blk-cgroup.h file
block: initialize the bounce pool if high memory may be added later
block: fix request_queue lifetime handling by making blk_queue_cleanup() properly shutdown
block: drop @tsk from attempt_plug_merge() and explain sync rules
block: make get_request[_wait]() fail if queue is dead
block: reorganize throtl_get_tg() and blk_throtl_bio()
block: reorganize queue draining
block: drop unnecessary blk_get/put_queue() in scsi_cmd_ioctl() and blk_get_tg()
block: pass around REQ_* flags instead of broken down booleans during request alloc/free
block: move blk_throtl prototypes to block/blk.h
block: fix genhd refcounting in blkio_policy_parse_and_set()
...

Fix up trivial conflicts due to "mddev_t" -> "struct mddev" conversion
and making the request functions be of type "void" instead of "int" in
- drivers/md/{faulty.c,linear.c,md.c,md.h,multipath.c,raid0.c,raid1.c,raid10.c,raid5.c}
- drivers/staging/zram/zram_drv.c

Linus Torvalds
2011-11-05 08:06:58 +0800

04 Nov, 2011

1 commit

6dd9ad7df block: don't call blk_drain_queue() if elevator is not up

blk_cleanup_queue() may be called before elevator is set up on a
queue which triggers the following oops.

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] elv_drain_elevator+0x1c/0x70
...
Pid: 830, comm: kworker/0:2 Not tainted 3.1.0-next-20111025_64+ #1590
Bochs Bochs
RIP: 0010:[] [] elv_drain_elevator+0x1c/0x70
...
Call Trace:
[] blk_drain_queue+0x42/0x70
[] blk_cleanup_queue+0xd0/0x1c0
[] md_free+0x50/0x70
[] kobject_release+0x8b/0x1d0
[] kref_put+0x36/0xa0
[] kobject_put+0x27/0x60
[] mddev_delayed_delete+0x2f/0x40
[] process_one_work+0x100/0x3b0
[] worker_thread+0x15f/0x3a0
[] kthread+0x87/0x90
[] kernel_thread_helper+0x4/0x10

Fix it by making blk_cleanup_queue() check whether q->elevator is set
up before invoking blk_drain_queue.

Signed-off-by: Tejun Heo
Reported-and-tested-by: Jiri Slaby
Signed-off-by: Jens Axboe

Tejun Heo
2011-11-04 01:52:11 +0800

01 Nov, 2011

2 commits

6adb1236b block: Change module.h -> export.h in bsg-lib.c

This file isn't using full modular functionality, and hence
can be "downgraded" to just using the export.h header.

Reported-by: Stephen Rothwell
Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-11-01 07:31:13 +0800
d5decd3b9 block: add export.h to files using EXPORT_SYMBOL/THIS_MODULE macros

These files were getting <linux/module.h> via an implicit include
path, but we want to crush those out of existence since they cost
time during compiles of processing thousands of lines of headers
for no reason. Give them the lightweight header that just contains
the EXPORT_SYMBOL infrastructure.

Signed-off-by: Paul Gortmaker

Paul Gortmaker
2011-11-01 07:31:12 +0800

29 Oct, 2011

1 commit

ec7ae5175 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (204 commits)
[SCSI] qla4xxx: export address/port of connection (fix udev disk names)
[SCSI] ipr: Fix BUG on adapter dump timeout
[SCSI] megaraid_sas: Fix instance access in megasas_reset_timer
[SCSI] hpsa: change confusing message to be more clear
[SCSI] iscsi class: fix vlan configuration
[SCSI] qla4xxx: fix data alignment and use nl helpers
[SCSI] iscsi class: fix link local mispelling
[SCSI] iscsi class: Replace iscsi_get_next_target_id with IDA
[SCSI] aacraid: use lower snprintf() limit
[SCSI] lpfc 8.3.27: Change driver version to 8.3.27
[SCSI] lpfc 8.3.27: T10 additions for SLI4
[SCSI] lpfc 8.3.27: Fix queue allocation failure recovery
[SCSI] lpfc 8.3.27: Change algorithm for getting physical port name
[SCSI] lpfc 8.3.27: Changed worst case mailbox timeout
[SCSI] lpfc 8.3.27: Miscellanous logic and interface fixes
[SCSI] megaraid_sas: Changelog and version update
[SCSI] megaraid_sas: Add driver workaround for PERC5/1068 kdump kernel panic
[SCSI] megaraid_sas: Add multiple MSI-X vector/multiple reply queue support
[SCSI] megaraid_sas: Add support for MegaRAID 9360/9380 12GB/s controllers
[SCSI] megaraid_sas: Clear FUSION_IN_RESET before enabling interrupts
...

Linus Torvalds
2011-10-29 07:44:18 +0800

25 Oct, 2011

4 commits

334c2b0b8 blk-throttle: use queue_is_locked() instead of lockdep_is_held()

We can't use the latter if !CONFIG_LOCKDEP.

Reported-by: Sedat Dilek
Signed-off-by: Jens Axboe

Jens Axboe
2011-10-25 21:51:48 +0800
a38eb630f blk-throttle: Take blkcg->lock while traversing blkcg->policy_list

blkcg->policy_list is protected by blkcg->lock. It is not an RCU-protected
list, so even readers need to take blkcg->lock. A few functions were
reading the list without taking the lock. Fix it.

Signed-off-by: Vivek Goyal
Acked-by: Tejun Heo
Signed-off-by: Jens Axboe

Vivek Goyal
2011-10-25 21:48:12 +0800
e060f00be blk-throttle: Free up policy node associated with deleted rule

If a rule is being deleted, free up associated policy node. Otherwise
that memory is leaked.

Signed-off-by: Vivek Goyal
Acked-by: Tejun Heo
Signed-off-by: Jens Axboe

Vivek Goyal
2011-10-25 21:48:12 +0800
5e0815919 block: warn if tag is greater than real_max_depth.

When the tag depth is reduced, it is max_depth that shrinks, not
real_max_depth. So we should allow a request with tag >= max_depth, but a
tag >= real_max_depth really does indicate a problem.

Signed-off-by: Tao Ma
Signed-off-by: Jens Axboe

Tao Ma
2011-10-25 16:20:05 +0800

24 Oct, 2011

6 commits

83157223d Merge branch 'for-linus' into for-3.2/core

Jens Axboe
2011-10-24 22:24:38 +0800
f992ae801 block: make gendisk hold a reference to its queue

The following command sequence triggers an oops.

# mount /dev/sdb1 /mnt
# echo 1 > /sys/class/scsi_device/0\:0\:1\:0/device/delete
# umount /mnt

general protection fault: 0000 [#1] PREEMPT SMP
CPU 2
Modules linked in:

Pid: 791, comm: umount Not tainted 3.1.0-rc3-work+ #8 Bochs Bochs
RIP: 0010:[] [] __lock_acquire+0x389/0x1d60
...
Call Trace:
[] lock_acquire+0x95/0x140
[] _raw_spin_lock+0x3b/0x50
[] bdi_lock_two+0x5c/0x70
[] bdev_inode_switch_bdi+0x4c/0xf0
[] __blkdev_put+0x11b/0x1d0
[] __blkdev_put+0x160/0x1d0
[] blkdev_put+0x5f/0x190
[] kill_block_super+0x4d/0x80
[] deactivate_locked_super+0x45/0x70
[] deactivate_super+0x4a/0x70
[] mntput_no_expire+0xed/0x130
[] sys_umount+0x7e/0x3a0
[] system_call_fastpath+0x16/0x1b

This is because bdev holds on to disk but disk doesn't pin the
associated queue. If a SCSI device is removed while the device is
still open, the sdev puts the base reference to the queue on release.
When the bdev is finally released, the associated queue is already
gone along with the bdi and bdev_inode_switch_bdi() ends up
dereferencing already freed bdi.

Even if it were not for this bug, disk not holding onto the associated
queue is very unusual and error-prone.

Fix it by making add_disk() take an extra reference to its queue and
put it on disk_release() and ensuring that disk and its fops owner are
put in that order after all accesses to the disk and queue are
complete.
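
The reference rule described above can be illustrated with a toy refcount
model. All names here (struct queue_ref_model, add_disk_model(),
disk_release_model()) are hypothetical and the counter is a plain int, not
the kernel's kobject machinery. The sketch only shows why the extra
reference keeps the queue alive after the SCSI device has dropped its base
reference.

```c
#include <stdbool.h>
#include <stddef.h>

struct queue_ref_model {
    int refs;
    bool freed;
};

struct disk_model {
    struct queue_ref_model *queue;
};

static void queue_ref_get(struct queue_ref_model *q)
{
    q->refs++;
}

static void queue_ref_put(struct queue_ref_model *q)
{
    if (--q->refs == 0)
        q->freed = true;   /* stand-in for freeing the queue and its bdi */
}

/* Model of add_disk(): the disk pins its queue. */
static void add_disk_model(struct disk_model *d, struct queue_ref_model *q)
{
    d->queue = q;
    queue_ref_get(q);
}

/* Model of disk_release(): the pin is dropped only when the disk goes away. */
static void disk_release_model(struct disk_model *d)
{
    queue_ref_put(d->queue);
    d->queue = NULL;
}
```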

Signed-off-by: Tejun Heo
Cc: Jens Axboe
Cc: stable@kernel.org
Signed-off-by: Jens Axboe

Tejun Heo
2011-10-24 22:24:31 +0800
e67b77c79 blk-flush: move the queue kick into ...

A dm-multipath user reported[1] a problem when trying to boot
a kernel with commit 4853abaae7e4a2af938115ce9071ef8684fb7af4
(block: fix flush machinery for stacking drivers with differring
flush flags) applied. It turns out that an empty flush request
can be sent into blk_insert_flush. When the BUG_ON was fixed
to allow for this, I/O on the underlying device would stall. The
reason is that blk_insert_cloned_request does not kick the queue.
In the aforementioned commit, I had added a special case to
kick the queue if data was sent down but the queue flags did
not require a flush. A better solution is to push the queue
kick up into blk_insert_cloned_request.

This patch, along with a follow-on which fixes the BUG_ON, fixes
the issue reported.

[1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html

Reported-by: Christophe Saout
Signed-off-by: Jeff Moyer
Acked-by: Tejun Heo

Stable note: 3.1
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe

Jeff Moyer
2011-10-24 22:24:31 +0800
834f9f61a blk-flush: fix invalid BUG_ON in blk_insert_flush

A user reported a regression due to commit
4853abaae7e4a2af938115ce9071ef8684fb7af4 (block: fix flush
machinery for stacking drivers with differring flush flags).
Part of the problem is that blk_insert_flush required a
single bio be attached to the request. In reality, having
no attached bio is also a valid case, as can be observed with
an empty flush.

[1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html

Reported-by: Christophe Saout
Signed-off-by: Jeff Moyer

Stable note: 3.1
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe

Jeff Moyer
2011-10-24 22:24:30 +0800
9562ad9ab block: Remove the control of complete cpu from bio.

bio originally has the functionality to set the complete cpu, but
it is broken.

Christoph said that "This code is unused, and from the all the
discussions lately pretty obviously broken. The only thing keeping
it serves is creating more confusion and possibly more bugs."

And Jens replied with "We can kill bio_set_completion_cpu(). I'm fine
with leaving cpu control to the request based drivers, they are the
only ones that can toggle the setting anyway".

So this patch tries to remove all the work of controlling complete cpu
from a bio.

Cc: Shaohua Li
Cc: Christoph Hellwig
Signed-off-by: Tao Ma
Signed-off-by: Jens Axboe

Tao Ma
2011-10-24 22:11:30 +0800
e890413af block: fix a typo in the blk-cgroup.h file

byptes -> bytes.

Signed-off-by: Jie Liu
Signed-off-by: Jens Axboe

Jie Liu
2011-10-24 22:08:38 +0800

19 Oct, 2011

2 commits

c9a929dde block: fix request_queue lifetime handling by making blk_queue_cleanup() properly shutdown

request_queue is refcounted but actually depends on lifetime
management from the queue owner - on blk_cleanup_queue(), block layer
expects that there's no request passing through request_queue and no
new one will.

This is fundamentally broken. The queue owner (e.g. SCSI layer)
doesn't have a way to know whether there are other active users before
calling blk_cleanup_queue() and other users (e.g. bsg) don't have any
guarantee that the queue is and would stay valid while it's holding a
reference.

With delay added in blk_queue_bio() before queue_lock is grabbed, the
following oops can be easily triggered when a device is removed with
in-flight IOs.

sd 0:0:1:0: [sdb] Stopping disk
ata1.01: disabled
general protection fault: 0000 [#1] PREEMPT SMP
CPU 2
Modules linked in:

Pid: 648, comm: test_rawio Not tainted 3.1.0-rc3-work+ #56 Bochs Bochs
RIP: 0010:[] [] elv_rqhash_find+0x61/0x100
...
Process test_rawio (pid: 648, threadinfo ffff880019efa000, task ffff880019ef8a80)
...
Call Trace:
[] elv_merge+0x84/0xe0
[] blk_queue_bio+0xf4/0x400
[] generic_make_request+0xca/0x100
[] submit_bio+0x74/0x100
[] dio_bio_submit+0xbc/0xc0
[] __blockdev_direct_IO+0x92e/0xb40
[] blkdev_direct_IO+0x57/0x60
[] generic_file_aio_read+0x6d5/0x760
[] do_sync_read+0xda/0x120
[] vfs_read+0xc5/0x180
[] sys_pread64+0x9a/0xb0
[] system_call_fastpath+0x16/0x1b

This happens because blk_cleanup_queue() destroys the queue and
elevator whether IOs are in progress or not and DEAD tests are
sprinkled in the request processing path without proper
synchronization.

Similar problem exists for blk-throtl. On queue cleanup, blk-throtl
is shut down whether it has requests in it or not. Depending on
timing, it either oopses or throttled bios are lost putting tasks
which are waiting for bio completion into eternal D state.

The way it should work is having the usual clear distinction between
shutdown and release. Shutdown drains all currently pending requests,
marks the queue dead, and performs partial teardown of the now
unnecessary part of the queue. Even after shutdown is complete,
reference holders are still allowed to issue requests to the queue
although they will be immediately failed. The rest of teardown
happens on release.
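
The shutdown/release split described above can be sketched as a small state
model. Everything here is a hypothetical illustration (struct
queue_life_model and its helpers are not kernel APIs, and -ENODEV is an
assumed error code): shutdown marks the queue dead and drains it, later
submissions from remaining reference holders fail immediately, and the
final teardown happens only when the last reference is dropped.

```c
#include <errno.h>
#include <stdbool.h>

struct queue_life_model {
    bool dead;      /* set by shutdown */
    int pending;    /* requests currently in the queue */
    int refs;       /* outstanding references */
    bool freed;     /* set by the final release */
};

/* Reference holders may still call this after shutdown; it just fails. */
static int queue_submit(struct queue_life_model *q)
{
    if (q->dead)
        return -ENODEV;
    q->pending++;
    return 0;
}

/* Shutdown: stop admitting requests, then drain what is pending. */
static void queue_shutdown(struct queue_life_model *q)
{
    q->dead = true;
    q->pending = 0;   /* stand-in for draining */
}

/* Release: the rest of the teardown runs on the last put. */
static void queue_put_ref(struct queue_life_model *q)
{
    if (--q->refs == 0)
        q->freed = true;
}
```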

This patch makes the following changes to make blk_cleanup_queue()
behave as a proper shutdown.

* QUEUE_FLAG_DEAD is now set while holding both q->exit_mutex and
queue_lock.

* Unsynchronized DEAD check in generic_make_request_checks() removed.
This couldn't make any meaningful difference as the queue could die
after the check.

* blk_drain_queue() updated such that it can drain all requests and is
now called during cleanup.

* blk_throtl updated such that it checks DEAD on grabbing queue_lock,
drains all throttled bios during cleanup and free td when queue is
released.

Signed-off-by: Tejun Heo
Cc: Vivek Goyal
Signed-off-by: Jens Axboe

Tejun Heo
2011-10-19 20:42:16 +0800
bd87b5898 block: drop @tsk from attempt_plug_merge() and explain sync rules

attempt_plug_merge() accesses elevator without holding queue_lock and
may call into ->elevator_bio_merge_fn(). The elevator is guaranteed to
be valid because it's accessed iff the plugged list has requests and
elevator is never exited with live requests, so as long as the
elevator method can deal with unlocked access, this is safe.

Explain the sync rules around attempt_plug_merge() and drop the
unnecessary @tsk parameter.

This patch doesn't introduce any functional change.

Signed-off-by: Tejun Heo
Signed-off-by: Jens Axboe

Tejun Heo
2011-10-19 20:33:08 +0800