Eric Lee / smarc-fsl-linux-kernel

04 Jul, 2012

16 commits

65622e647 ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if ocfs2_get_clusters_nocache() or ocfs2_inode_lock() call failed.

Hello,

Since ENXIO only means "offset beyond EOF" for SEEK_DATA/SEEK_HOLE,
we should return the internal error unchanged, rather than ENXIO, if the
ocfs2_inode_lock() or ocfs2_get_clusters_nocache() call fails.
Otherwise, it will confuse user applications when they try to understand the root cause.

Thanks Dave for pointing this out.

Thanks,
-Jeff

Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Joel Becker <jlbec@evilplan.org>

Jeff Liu
2012-07-04 14:27:16 +0800
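
The error-handling rule in the ocfs2 SEEK_DATA/SEEK_HOLE commit above can be sketched in userspace C. This is only an illustration, not the ocfs2 code: `seek_data_sketch` and its `internal_status` parameter are hypothetical stand-ins for the handler and for the result of a call such as ocfs2_inode_lock().

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a SEEK_DATA handler.
 * internal_status models an internal call: 0 on success,
 * a negative errno on failure.
 */
long seek_data_sketch(long offset, long file_size, int internal_status)
{
    /* A failed lock or extent lookup is returned unchanged. */
    if (internal_status < 0)
        return internal_status;

    /* ENXIO is reserved for "offset beyond EOF". */
    if (offset >= file_size)
        return -ENXIO;

    /* Simplification: treat every byte before EOF as data. */
    return offset;
}
```

With this shape a caller can tell an I/O or locking failure (for example -EIO) apart from a seek past the end of the file (-ENXIO).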
a75e9ccab ocfs2: use spinlock irqsave for downconvert lock.patch

When the ocfs2dc thread holds the dc_task_lock spinlock and receives a soft IRQ, it
deadlocks itself trying to take the same spinlock in ocfs2_wake_downconvert_thread.
Below is the stack snippet.

The patch disables interrupts when acquiring dc_task_lock spinlock.

ocfs2_wake_downconvert_thread
ocfs2_rw_unlock
ocfs2_dio_end_io
dio_complete
.....
bio_endio
req_bio_endio
....
scsi_io_completion
blk_done_softirq
__do_softirq
do_softirq
irq_exit
do_IRQ
ocfs2_downconvert_thread
[kthread]

Signed-off-by: Srinivas Eeda
Signed-off-by: Joel Becker

Srinivas Eeda
2012-07-04 14:27:15 +0800
16865b7c4 ocfs2: Misplaced parens in unlikley

Fix misplaced parentheses

Signed-off-by: Roel Kluin
Signed-off-by: Joel Becker

roel
2012-07-04 14:27:13 +0800
3e5d3c35a ocfs2: clear unaligned io flag when dio fails

The unaligned io flag is set in the kiocb when an unaligned
dio is issued. It should be cleared even when the dio fails,
or it may affect subsequent io that uses the same
kiocb.

Signed-off-by: Junxiao Bi
Cc: stable@vger.kernel.org
Signed-off-by: Joel Becker

Junxiao Bi
2012-07-04 14:26:50 +0800
9e85a6f9d Merge tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux

Pull fix to common clk framework from Michael Turquette:
"The previous set of common clk fixes for -rc5 left an uninitialized
int which could lead to bad array indexing when switching clock
parents. The issue is fixed with a trivial change to the code flow in
__clk_set_parent."

* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mturquette/linux:
clk: fix parent validation in __clk_set_parent()

Linus Torvalds
2012-07-04 09:06:49 +0800
6c8addcb7 Merge tag 'md-3.5-fixes' of git://neil.brown.name/md

Pull raid10 build failure fix from NeilBrown:
"I really shouldn't do important things late in the day. It seems that
I get careless."

* tag 'md-3.5-fixes' of git://neil.brown.name/md:
md/raid10: fix careless build error

Linus Torvalds
2012-07-04 09:05:35 +0800
567287488 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking update from David Miller:

1) Fix RX sequence number handling in mwifiex, from Stone Piao.

2) Netfilter ipset mis-compares device names, fix from Florian
Westphal.

3) Fix route leak in ipv6 IPVS, from Eric Dumazet.

4) NFC fixes. Several buffer overflows in NCI layer from Dan
Rosenberg, and release sock OOPS'er fix from Eric Dumazet.

5) Fix WEP handling in ath9k. We started using a bit the chip provides to
indicate undecrypted packets but that bit turns out to be unreliable
in certain configurations. Fix from Felix Fietkau.

6) Fix Kconfig dependency bug in wlcore, from Randy Dunlap.

7) New USB IDs for rtlwifi driver from Larry Finger.

8) Fix crashes in qmi_wwan usbnet driver when disconnecting, from Bjørn
Mork.

9) Gianfar driver programs coalescing settings properly in single queue
mode, but does not do so in multi-queue mode. Fix from Claudiu
Manoil.

10) Missing module.h include in davinci_cpdma.c, from Daniel Mack.

11) Need dummy handler for IPSET_CMD_NONE otherwise we crash in ipset if
we get this via nfnetlink, fix from Tomasz Bursztyka.

12) Missing RCU unlock in nfnetlink error path, also from Tomasz.

13) Fix divide by zero in igbvf when the user tries to set an RX
coalescing value of 0 usecs, from Mitch A Williams.

14) We can process SCTP sacks for the wrong transport, oops. Fix from
Neil Horman.

15) Remove hw IP payload checksumming from e1000e driver. This has zero
value in our stack, and turning it on creates a very unintuitive
restriction for users when using jumbo MTUs.

Specifically, when IP payload checksums are on you cannot use both
receive hashing offload and jumbo MTU. Fix from Bruce Allan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
e1000e: remove use of IP payload checksum
sctp: be more restrictive in transport selection on bundled sacks
igbvf: fix divide by zero
netfilter: nfnetlink: fix missing rcu_read_unlock in nfnetlink_rcv_msg
netfilter: ipset: fix crash if IPSET_CMD_NONE command is sent
davinci_cpdma: include linux/module.h
gianfar: Fix RXICr/TXICr programming for multi-queue mode
net: Downgrade CAP_SYS_MODULE deprecated message from error to warning.
net: qmi_wwan: fix Oops while disconnecting
mwifiex: fix memory leak associated with IE manamgement
ath9k: fix panic caused by returning a descriptor we have queued for reuse
mac80211: correct behaviour on unrecognised action frames
ath9k: enable serialize_regmode for non-PCIE AR9287
rtlwifi: rtl8192cu: New USB IDs
NFC: Return from rawsock_release when sk is NULL
iwlwifi: fix activating inactive stations
wlcore: drop INET dependency
ath9k: fix dynamic WEP related regression
NFC: Prevent multiple buffer overflows in NCI
netfilter: update location of my trees
...

Linus Torvalds
2012-07-04 09:01:54 +0800
10684112c md/raid10: fix careless build error

build error introduced by commit b357f04a67c2aeee8

That function doesn't get extra args until a later patch. Bother.

Reported-by: Fengguang Wu
Reported-by: Simon Kirby
Reported-by: Tobias Klausmann
Signed-off-by: NeilBrown

NeilBrown
2012-07-04 07:35:35 +0800
dab058fd5 floppy: cancel any pending fd_timeouts before adding a new one

In commit 070ad7e793dc ("floppy: convert to delayed work and
single-thread wq") the 'fd_timeout' timer was converted to a delayed
work. However, the "del_timer(&fd_timeout)" was lost in the process,
and any previous pending timeouts would stay active when we then
re-queued the timeout.

This resulted in the floppy probe sequence having a (stale) 20s timeout
rather than the intended 3s timeout, and thus made booting with the
floppy driver (but no actual floppy controller) take much longer than it
should.

Of course, there's little reason for most people to compile the floppy
driver into the kernel at all, which is why most people never noticed.

Canceling the delayed work where we used to do the del_timer() fixes the
issue, and makes the floppy probing use the proper new timeout instead.
The three second timeout is still very wasteful, but better than the 20s
one.

Reported-and-tested-by: Andi Kleen
Reported-and-tested-by: Calvin Walton
Cc: Jiri Kosina
Signed-off-by: Linus Torvalds

Linus Torvalds
2012-07-04 06:51:22 +0800
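
Why the stale 20s timeout survived in the floppy commit above can be sketched with a toy model. It assumes, as the kernel's queue_delayed_work() does, that queuing a work item that is already pending is a no-op. All names here (`delayed_work_sketch`, `queue_sketch`, `cancel_sketch`, `rearm_sketch`) are hypothetical and this is not the floppy driver code.

```c
#include <stdbool.h>

/* Toy model of one delayed work item. */
struct delayed_work_sketch {
    bool pending;
    long expires;
};

/* Queuing is a no-op (returns false) if the work is already pending. */
bool queue_sketch(struct delayed_work_sketch *w, long now, long delay)
{
    if (w->pending)
        return false;
    w->pending = true;
    w->expires = now + delay;
    return true;
}

void cancel_sketch(struct delayed_work_sketch *w)
{
    w->pending = false;
}

/* Cancel first, so the new delay actually takes effect. */
void rearm_sketch(struct delayed_work_sketch *w, long now, long delay)
{
    cancel_sketch(w);
    queue_sketch(w, now, delay);
}
```

Calling `queue_sketch` with a 3s delay while a 20s timeout is pending leaves the 20s expiry in place; `rearm_sketch` replaces it with the 3s one.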
a3da2c691 Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block bits from Jens Axboe:
"As vacation is coming up, thought I'd better get rid of my pending
changes in my for-linus branch for this iteration. It contains:

- Two patches for mtip32xx. Killing a non-compliant sysfs interface
and moving it to debugfs, where it belongs.

- A few patches from Asias. Two legit bug fixes, and one killing an
interface that is no longer in use.

- A patch from Jan, making the annoying partition ioctl warning a bit
less annoying, by restricting it to !CAP_SYS_RAWIO only.

- Three bug fixes for drbd from Lars Ellenberg.

- A fix for an old regression for umem, it hasn't really worked since
the plugging scheme was changed in 3.0.

- A few fixes from Tejun.

- A splice fix from Eric Dumazet, fixing an issue with pipe
resizing."

* 'for-linus' of git://git.kernel.dk/linux-block:
scsi: Silence unnecessary warnings about ioctl to partition
block: Drop dead function blk_abort_queue()
block: Mitigate lock unbalance caused by lock switching
block: Avoid missed wakeup in request waitqueue
umem: fix up unplugging
splice: fix racy pipe->buffers uses
drbd: fix null pointer dereference with on-congestion policy when diskless
drbd: fix list corruption by failing but already aborted reads
drbd: fix access of unallocated pages and kernel panic
xen/blkfront: Add WARN to deal with misbehaving backends.
blkcg: drop local variable @q from blkg_destroy()
mtip32xx: Create debugfs entries for troubleshooting
mtip32xx: Remove 'registers' and 'flags' from sysfs
blkcg: fix blkg_alloc() failure path
block: blkcg_policy_cfq shouldn't be used if !CONFIG_CFQ_GROUP_IOSCHED
block: fix return value on cfq_init() failure
mtip32xx: Remove version.h header file inclusion
xen/blkback: Copy id field when doing BLKIF_DISCARD.

Linus Torvalds
2012-07-04 06:45:10 +0800
863b13271 clk: fix parent validation in __clk_set_parent()

The below commit introduced a bug in __clk_set_parent()
which could cause it to *skip* the parent validation
which makes sure the parent passed to the api is a valid
one.

commit 7975059db572eb47f0fb272a62afeae272a4b209
Author: Rajendra Nayak
Date: Wed Jun 6 14:41:31 2012 +0530

clk: Allow late cache allocation for clk->parents

This was identified by the following compiler warning..

drivers/clk/clk.c: In function '__clk_set_parent':
drivers/clk/clk.c:1083:5: warning: 'i' may be used uninitialized in this function [-Wuninitialized]

.. as reported by Marc Kleine-Budde.

There were various options discussed on how to fix this, one
being initing 'i' to clk->num_parents, but the below approach
was found to be more appropriate as it also makes the 'parent
validation' code simpler to read.

Reported-by: Marc Kleine-Budde
Signed-off-by: Rajendra Nayak
Signed-off-by: Mike Turquette
Cc: stable@kernel.org

Rajendra Nayak
2012-07-04 03:05:14 +0800
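
One way to keep a parent lookup free of the "may be used uninitialized" hazard described in the clk commit above is to return the index only from inside the loop and report a missing parent explicitly. The sketch below is an assumption about the general pattern, not the code in drivers/clk/clk.c; `find_parent_index_sketch` and its parameters are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the index of parent_name in names[], or -1 if it is not a
 * valid parent. The loop index is never read after the loop, so no
 * path can use it uninitialized.
 */
int find_parent_index_sketch(const char *const *names, size_t num_parents,
                             const char *parent_name)
{
    size_t i;

    for (i = 0; i < num_parents; i++) {
        if (strcmp(names[i], parent_name) == 0)
            return (int)i;
    }
    return -1; /* validation failed: not one of the known parents */
}
```

A caller would treat -1 as an invalid parent and refuse the switch instead of indexing an array with an undefined value.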
ff826b2b5 Merge tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Just a few driver-specific fixes for ASoC and HD-audio."

* tag 'sound-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix no sound from ALC662 after Windows reboot
ASoC: tlv320aic3x: Fix codec pll configure bug
ASoC: wm2200: Add missing BCLK rate

Linus Torvalds
2012-07-04 02:10:18 +0800
3492ee727 Merge tag 'dm-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm

Pull device-mapper fixes from Alasdair G Kergon:
"Four minor thin provisioning fixes and correct and update dm-verity
documentation."

* tag 'dm-3.5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm:
dm: verity fix documentation
dm persistent data: fix allocation failure in space map checker init
dm persistent data: handle space map checker creation failure
dm persistent data: fix shadow_info_leak on dm_tm_destroy
dm thin: commit metadata before creating metadata snapshot

Linus Torvalds
2012-07-04 02:08:16 +0800
73e608054 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
"One regression fix, two radeon fixes (one for an oops), and an i915
fix to unload framebuffers earlier.

We originally were going to leave the i915 fix until -next, but grub2
in some situations causes vesafb/efifb to be loaded now, and this
causes big slowdowns, and I have reports in rawhide I'd like to have
fixed."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/i915: kick any firmware framebuffers before claiming the gtt
drm: edid: Don't add inferred modes with higher resolution
drm/radeon: fix rare segfault
drm/radeon: fix VM page table setup on SI

Linus Torvalds
2012-07-04 01:59:37 +0800
2fb748d26 Merge tag 'md-3.5-fixes' of git://neil.brown.name/md

Pull md fixes from NeilBrown:
"md: collection of bug fixes for 3.5

You go away for 2 weeks vacation and what do you get when you come
back? Piles of bugs :-)

Some found by inspection, some by testing, some during use in the
field, and some while developing for the next window..."

* tag 'md-3.5-fixes' of git://neil.brown.name/md:
md: fix up plugging (again).
md: support re-add of recovering devices.
md/raid1: fix bug in read_balance introduced by hot-replace
raid5: delayed stripe fix
md/raid456: When read error cannot be recovered, record bad block
md: make 'name' arg to md_register_thread non-optional.
md/raid10: fix failure when trying to repair a read error.
md/raid5: fix refcount problem when blocked_rdev is set.
md:Add blk_plug in sync_thread.
md/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev
md/raid5: Do not add data_offset before call to is_badblock
md/raid5: prefer replacing failed devices over want-replacement devices.
md/raid10: Don't try to recovery unmatched (and unused) chunks.

Linus Torvalds
2012-07-04 01:40:43 +0800
3bfd24547 Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security

Pull security layer fixes from James Morris.

A documentation update, and a nommu build fix.

* 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security: Fix nommu build.
security: document no_new_privs

Linus Torvalds
2012-07-04 01:39:40 +0800

03 Jul, 2012

24 commits

18068bdd5 dm: verity fix documentation

Veritysetup is now part of cryptsetup package.
Remove on-disk header description (which is not parsed in kernel)
and point users to cryptsetup where the format is documented.
Mention units for block size parameters.
Fix target line specification and dmsetup parameters.

Signed-off-by: Milan Broz
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon

Milan Broz
2012-07-03 19:55:41 +0800
b0239faaf dm persistent data: fix allocation failure in space map checker init

If CONFIG_DM_DEBUG_SPACE_MAPS is enabled and memory is fragmented and a
sufficiently-large metadata device is used in a thin pool then the space
map checker will fail to allocate the memory it requires.

Switch from kmalloc to vmalloc to allow larger virtually contiguous
allocations for the space map checker's internal count arrays.

Reported-by: Vivek Goyal
Cc: stable@kernel.org
Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2012-07-03 19:55:37 +0800
62662303e dm persistent data: handle space map checker creation failure

If CONFIG_DM_DEBUG_SPACE_MAPS is enabled and dm_sm_checker_create()
fails, dm_tm_create_internal() would still return success even though it
cleaned up all resources it was supposed to have created. This will
lead to a kernel crash:

general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
...
RIP: 0010:[] [] dm_bufio_get_block_size+0x9/0x20
Call Trace:
[] dm_bm_block_size+0xe/0x10
[] sm_ll_init+0x78/0xd0
[] sm_ll_new_disk+0x16/0xa0
[] dm_sm_disk_create+0xfe/0x160
[] dm_pool_metadata_open+0x16e/0x6a0
[] pool_ctr+0x3f0/0x900
[] dm_table_add_target+0x195/0x450
[] table_load+0xe4/0x330
[] ctl_ioctl+0x15a/0x2c0
[] dm_ctl_ioctl+0x13/0x20
[] do_vfs_ioctl+0x98/0x560
[] sys_ioctl+0x91/0xa0
[] system_call_fastpath+0x16/0x1b

Fix the space map checker code to return an appropriate ERR_PTR and have
dm_sm_disk_create() and dm_tm_create_internal() check for it with
IS_ERR.

Reported-by: Vivek Goyal
Signed-off-by: Mike Snitzer
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2012-07-03 19:55:35 +0800
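
The ERR_PTR / IS_ERR convention that the space map checker commit above relies on can be imitated in userspace. The sketch below re-implements the idea under its own hypothetical names (`err_ptr_sketch`, `is_err_sketch`, `checker_create_sketch`, `tm_create_sketch`); it is not the kernel's err.h or the dm code, and it assumes a platform where the top 4095 address values are never valid pointers.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno in a pointer value. */
void *err_ptr_sketch(long error)
{
    return (void *)(intptr_t)error;
}

/* Decode the errno from an error pointer. */
long ptr_err_sketch(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if ptr lies in the range reserved for encoded errors. */
int is_err_sketch(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

struct checker_sketch {
    int dummy;
};

/* The creator reports failure as an encoded error, never as success. */
struct checker_sketch *checker_create_sketch(int simulate_failure)
{
    struct checker_sketch *c;

    if (simulate_failure)
        return err_ptr_sketch(-ENOMEM);
    c = malloc(sizeof(*c));
    if (!c)
        return err_ptr_sketch(-ENOMEM);
    c->dummy = 0;
    return c;
}

/* The caller checks for the error and propagates it. */
int tm_create_sketch(int simulate_failure, struct checker_sketch **out)
{
    struct checker_sketch *c = checker_create_sketch(simulate_failure);

    if (is_err_sketch(c))
        return (int)ptr_err_sketch(c);
    *out = c;
    return 0;
}
```

The point of the pattern is that a failed creation can no longer be mistaken for success: the caller either receives a usable object or a negative errno.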
25d7cd6fa dm persistent data: fix shadow_info_leak on dm_tm_destroy

Cleanup the shadow table before destroying the transaction manager.

Reference: leak was identified with kmemleak when running
test_discard_random_sectors in the thinp-test-suite.

Signed-off-by: Mike Snitzer
Cc: stable@vger.kernel.org
Signed-off-by: Alasdair G Kergon

Mike Snitzer
2012-07-03 19:55:33 +0800
0d200aefd dm thin: commit metadata before creating metadata snapshot

Userland sometimes sees a corrupt metadata block if metadata is changing
rapidly when a metadata snapshot is reserved for userland. To make the
problem go away, commit before we take the metadata snapshot (which is a
sensible thing to do anyway).

The checksums mean userland spots this corruption immediately so there's
no risk of acting on incorrect data. No corruption exists from the
kernel's point of view, and thin_check passes after pool shutdown.

I believe this is to do with shared blocks at the first level of the
{device, mapping} btree. Prior to the metadata-snap support no sharing
at this level was possible, so this patch is only required after commit
cc8394d86f045b86ff303d3c9e4ce47d97148951 ("dm thin: provide userspace
access to pool metadata").

Signed-off-by: Joe Thornber
Signed-off-by: Mike Snitzer
Signed-off-by: Alasdair G Kergon

Joe Thornber
2012-07-03 19:55:31 +0800
75331a597 security: Fix nommu build.

The security + nommu configuration presently blows up with an undefined
reference to BDI_CAP_EXEC_MAP:

security/security.c: In function 'mmap_prot':
security/security.c:687:36: error: dereferencing pointer to incomplete type
security/security.c:688:16: error: 'BDI_CAP_EXEC_MAP' undeclared (first use in this function)
security/security.c:688:16: note: each undeclared identifier is reported only once for each function it appears in

include backing-dev.h directly to fix it up.

Signed-off-by: Paul Mundt
Signed-off-by: James Morris

Paul Mundt
2012-07-03 19:41:03 +0800
9f846a16d drm/i915: kick any firmware framebuffers before claiming the gtt

Especially vesafb likes to map everything as uc- (yikes), and if that
mapping hangs around still while we try to map the gtt as wc the
kernel will downgrade our request to uc-, resulting in abysmal
performance.

Unfortunately we can't do this as early as radeon does (i.e. as the
first thing we do when initializing the hw) because our fb/mmio space
region moves around on a per-gen basis. So I've had to move it below
the gtt initialization, but that seems to work, too. The important
thing is that we do this before we set up the gtt wc mapping.

Now an altogether different question is why people compile their
kernels with vesafb enabled, but I guess making things just work isn't
bad per se ...

v2:
- s/radeondrmfb/inteldrmfb/
- fix up error handling

v3: Kill #ifdef X86, this is Intel after all. Noticed by Ben Widawsky.

v4: Jani Nikula complained about the pointless bool primary
initialization.

v5: Don't oops if we can't allocate, noticed by Chris Wilson.

v6: Resolve conflicts with agp rework and fixup whitespace.

This is commit e188719a2891f01b3100d in drm-next.

Backport to 3.5 -fixes queue requested by Dave Airlie - due to grub
using vesa on fedora their initrd seems to load vesafb before loading
the real kms driver. So tons more people actually experience a
dead-slow gpu. Hence also the Cc: stable.

Cc: stable@vger.kernel.org
Reported-and-tested-by: "Kilarski, Bernard R"
Reviewed-by: Chris Wilson
Signed-off-by: Daniel Vetter
Signed-off-by: Dave Airlie

Daniel Vetter
2012-07-03 18:18:48 +0800
7b668ebe2 drm: edid: Don't add inferred modes with higher resolution

When a monitor's EDID doesn't set the preferred bit, the driver assumes
that the mode with the highest resolution and rate is the preferred
mode. Meanwhile, the recent changes allowing more modes in the
GTF/CVT ranges actually yield more modes, and some of them may exceed
the native size. Such a mode would then be picked as the preferred
mode although it is not the native resolution.

To avoid this problem, this patch limits the addition of
inferred modes by checking that they are not greater than other modes.
It also checks for duplicate mode entries at the same time.

Reviewed-by: Adam Jackson
Signed-off-by: Takashi Iwai
Signed-off-by: Dave Airlie

Takashi Iwai
2012-07-03 18:18:10 +0800
1ef5325b2 drm/radeon: fix rare segfault

In the gem idle/busy ioctl the radeon object was dereferenced after
drm_gem_object_unreference_unlocked, which, if the object
had been destroyed, led to use of a possibly freed pointer with
possibly wrong data.

Signed-off-by: Jerome Glisse
Reviewed-by: Alex Deucher
Reviewed-by: Christian König
Signed-off-by: Dave Airlie

Jerome Glisse
2012-07-03 18:17:09 +0800
b357f04a6 md: fix up plugging (again).

The value returned by "mddev_check_plug" is only valid until the
next 'schedule' as that will unplug things. This could happen at any
call to mempool_alloc.
So just calling mddev_check_plug at the start doesn't really make
sense.

So call it just before, or just after, queuing things for the thread.
As the action that happens at unplug is to wake the thread, this makes
lots of sense.
If we cannot add a plug (which requires a small GFP_ATOMIC alloc) we
wake thread immediately.

RAID5 is a bit different. Requests are queued for the thread and the
thread is woken by release_stripe. So we don't need to wake the
thread on failure.
However the thread doesn't perform certain actions when there is any
active plug, so it is important to install a plug before waking the
thread. So for RAID5 we install the plug *before* queuing the request
and waking the thread.

Without this patch it is possible for raid1 or raid10 to queue a
request without then waking the thread, resulting in the array locking
up.

Also change raid10 to only flush_pending_write when there are no
active plugs, just like raid1.

This patch is suitable for 3.0 or later. I plan to submit it to
-stable, but I'd like to let it spend a few weeks in mainline
first to be sure it is completely safe.

Signed-off-by: NeilBrown

NeilBrown
2012-07-03 15:45:31 +0800
f45630910 md: support re-add of recovering devices.

We currently only allow a device to be re-added if it appears to be
in-sync. This is overly restrictive as it may be desirable to re-add
a device that is in the middle of recovery.

So remove the test for "InSync" - the test on rdev->raid_disk is
sufficient to ensure that the re-add will succeed.

Reported-by: Alexander Lyakas
Tested-by: Alexander Lyakas
Signed-off-by: NeilBrown

NeilBrown
2012-07-03 13:59:06 +0800
32644afd8 md/raid1: fix bug in read_balance introduced by hot-replace

When we added hot_replace we doubled the number of devices
that could be in a RAID1 array. So we doubled how far read_balance
would search. Unfortunately we didn't double the point at which
it looped back to the beginning - so it effectively loops over
all non-replacement disks twice.
This doesn't cause bad behaviour, but it is pointless and means we
never read from replacement devices.

Signed-off-by: NeilBrown

NeilBrown
2012-07-03 13:58:42 +0800
fab363b5f raid5: delayed stripe fix

There is no locking around setting the STRIPE_DELAYED and STRIPE_PREREAD_ACTIVE bits, but
the two bits are related. A delayed stripe can be moved to the hold list only
when the preread active stripe count is below IO_THRESHOLD. If a stripe has both
bits set, it will be on the delayed list while the preread count is not 0,
so the stripe will never leave the delayed list.

Signed-off-by: Shaohua Li
Signed-off-by: NeilBrown

Shaohua Li
2012-07-03 13:57:19 +0800
2e8ac3031 md/raid456: When read error cannot be recovered, record bad block

We may not be able to fix a bad block if:
- the array is degraded
- the over-write fails.

In these cases we currently eject the device, but we should
record a bad block if possible.

Signed-off-by: majianpeng
Signed-off-by: NeilBrown

majianpeng
2012-07-03 13:57:02 +0800
0232605d9 md: make 'name' arg to md_register_thread non-optional.

Having the 'name' arg optional and defaulting to the current
personality name is not necessary and leads to errors, as when
changing the level of an array we can end up using the
name of the old level instead of the new one.

So make it non-optional and always explicitly pass the name
of the level that the array will be.

Reported-by: majianpeng
Signed-off-by: NeilBrown

NeilBrown
2012-07-03 13:56:52 +0800
055d3747d md/raid10: fix failure when trying to repair a read error.

commit 58c54fcca3bac5bf9290cfed31c76e4c4bfbabaf
md/raid10: handle further errors during fix_read_error better.

in 3.1 added "r10_sync_page_io" which takes an IO size in sectors.
But we were passing the IO size in bytes!!!
This resulted in bio_add_page failing, an empty request being sent
down, and a consequent BUG_ON in scsi_lib.

[fix missing space in error message at same time]

This fix is suitable for 3.1.y and later.

Cc: stable@vger.kernel.org
Reported-by: Christian Balzer
Signed-off-by: NeilBrown

NeilBrown
2012-07-03 13:55:33 +0800
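
The size of the unit mix-up in the raid10 commit above is easy to underestimate. The helpers below are a hypothetical sketch (the names are not from raid10.c) that assumes the kernel's 512-byte sector, i.e. a shift of 9.

```c
#include <stdint.h>

#define SECTOR_SHIFT_SKETCH 9 /* 512-byte sectors */

/* Convert a byte count to whole sectors. */
uint64_t bytes_to_sectors_sketch(uint64_t bytes)
{
    return bytes >> SECTOR_SHIFT_SKETCH;
}

/* Convert a sector count to bytes. */
uint64_t sectors_to_bytes_sketch(uint64_t sectors)
{
    return sectors << SECTOR_SHIFT_SKETCH;
}
```

A 4096-byte IO is 8 sectors. Passing the value 4096 where a sector count is expected instead asks for 4096 sectors, which is 2097152 bytes (2 MiB), 512 times more than intended.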
9d4056aa9 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc

Pull a couple more powerpc fixes from Benjamin Herrenschmidt:
"Here are two more fixes that I "missed" when scrubbing patchwork last
week which are worth still having in 3.5."

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/kvm: sldi should be sld
powerpc/xmon: Use cpumask iterator to avoid warning

Linus Torvalds
2012-07-03 10:52:25 +0800
09b243577 security: document no_new_privs

Document no_new_privs.

Signed-off-by: Andy Lutomirski
Acked-by: Kees Cook
Signed-off-by: James Morris

Andy Lutomirski
2012-07-03 10:35:36 +0800
5f066c632 md/raid5: fix refcount problem when blocked_rdev is set.

commit 43220aa0f22cd3ce5b30246d50ccd696d119edea
md/raid5: fix a hang on device failure.

fixed a hang, but introduced a refcounting imbalance so
that if the presence of bad-blocks ever caused an rdev to
be 'blocked' we would increment the refcount on the rdev and
never decrement it.

So add the needed rdev_dec_pending when md_wait_for_blocked_rdev
is not called.

Reported-by: majianpeng
Signed-off-by: NeilBrown

NeilBrown
2012-07-03 10:13:29 +0800
7c2c57c9a md:Add blk_plug in sync_thread.

Adding blk_plug in sync_thread will increase the performance of sync.
Because sync_thread did not use blk_plug, bios were not merged well
during raid sync.

Testing environment:
SATA controller: Intel Corporation 82801JI (ICH10 Family) SATA AHCI
Controller.
OS:Linux xxx 3.5.0-rc2+ #340 SMP Tue Jun 12 09:00:25 CST 2012
x86_64 x86_64 x86_64 GNU/Linux.
RAID5: four ST31000524NS disk.

Without blk_plug:recovery speed about 63M/Sec;
Add blk_plug:recovery speed about 120M/Sec.

Using blktrace:
blktrace -d /dev/sdb -w 60 -o -|blkparse -i -

without blk_plug:
Total (8,16):
 Reads Queued:      309811, 1239MiB     Writes Queued:      0, 0KiB
 Read Dispatches:   283583, 1189MiB     Write Dispatches:   0, 0KiB
 Reads Requeued:    0                   Writes Requeued:    0
 Reads Completed:   273351, 1149MiB     Writes Completed:   0, 0KiB
 Read Merges:       23533, 94132KiB     Write Merges:       0, 0KiB
 IO unplugs:        0                   Timer unplugs:      0

add blk_plug:
Total (8,16):
 Reads Queued:      428697, 1714MiB     Writes Queued:      0, 0KiB
 Read Dispatches:   3954, 1714MiB       Write Dispatches:   0, 0KiB
 Reads Requeued:    0                   Writes Requeued:    0
 Reads Completed:   3956, 1715MiB       Writes Completed:   0, 0KiB
 Read Merges:       424743, 1698MiB     Write Merges:       0, 0KiB
 IO unplugs:        0                   Timer unplugs:      3384

The ratio of merge will be markedly increased.

Signed-off-by: majianpeng
Signed-off-by: NeilBrown

majianpeng
2012-07-03 10:12:26 +0800
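
The merge ratio claimed in the blk_plug commit above can be checked from the blkparse totals it quotes. The helper below is a hypothetical sketch that defines the ratio as read merges divided by reads queued; that definition is an assumption, since the commit message does not spell one out.

```c
/* Fraction of queued reads that were merged; 0 if nothing was queued. */
double merge_ratio_sketch(unsigned long merges, unsigned long queued)
{
    if (queued == 0)
        return 0.0;
    return (double)merges / (double)queued;
}
```

With the quoted numbers, 23533 / 309811 is roughly 7.6% without blk_plug and 424743 / 428697 is roughly 99.1% with it, which fits the drop in read dispatches from 283583 to 3954.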
1850753d2 md/raid5: In ops_run_io, inc nr_pending before calling md_wait_for_blocked_rdev

In ops_run_io(), the call to md_wait_for_blocked_rdev will decrement
nr_pending so we lose the reference we hold on the rdev.
So atomic_inc it first to maintain the reference.

This bug was introduced by commit 73e92e51b7969ef5477d
md/raid5. Don't write to known bad block on doubtful devices.

which appeared in 3.0, so patch is suitable for stable kernels since
then.

Cc: stable@vger.kernel.org
Signed-off-by: majianpeng
Signed-off-by: NeilBrown

majianpeng
2012-07-03 10:11:54 +0800
6c0544e25 md/raid5: Do not add data_offset before call to is_badblock

In chunk_aligned_read() we are adding data_offset before calling
is_badblock. But is_badblock also adds data_offset, so that is bad.

So move the addition of data_offset to after the call to
is_badblock.

This bug was introduced by commit 31c176ecdf3563140e639
md/raid5: avoid reading from known bad blocks.
which first appeared in 3.0. So that patch is suitable for any
-stable kernel from 3.0.y onwards. However it will need minor
revision for most of those (as the comment didn't appear until
recently).

Cc: stable@vger.kernel.org
Signed-off-by: majianpeng
Signed-off-by: NeilBrown

majianpeng
2012-07-03 10:09:57 +0800
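
The double-offset problem in the data_offset commit above can be shown with a toy model. It assumes, as the commit message states, that the bad-block check adds data_offset itself. The types and functions (`rdev_sketch`, `is_badblock_sketch`, `aligned_read_target_sketch`) are hypothetical and much simpler than the md code; a single bad range stands in for the bad-block list.

```c
typedef unsigned long long sector_sketch_t;

struct rdev_sketch {
    sector_sketch_t data_offset; /* start of array data on the device */
    sector_sketch_t bad_start;   /* device-absolute start of a bad range */
    sector_sketch_t bad_len;     /* length of the bad range in sectors */
};

/* Takes an array-relative sector and adds data_offset itself. */
int is_badblock_sketch(const struct rdev_sketch *rdev,
                       sector_sketch_t s, sector_sketch_t n)
{
    sector_sketch_t dev_s = s + rdev->data_offset;

    return dev_s < rdev->bad_start + rdev->bad_len &&
           dev_s + n > rdev->bad_start;
}

/* Return the device sector to read, or -1 if the range is bad. */
long long aligned_read_target_sketch(const struct rdev_sketch *rdev,
                                     sector_sketch_t s, sector_sketch_t n)
{
    /* Check first, with the array-relative sector. */
    if (is_badblock_sketch(rdev, s, n))
        return -1;
    /* Only now translate to a device-absolute sector. */
    return (long long)(s + rdev->data_offset);
}
```

If the caller added data_offset before the check, the offset would be applied twice and a read that overlaps the bad range would be tested against the wrong location.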
5cfb22a1f md/raid5: prefer replacing failed devices over want-replacement devices.

If a RAID5 has both a failed device and a device marked as
'WantReplacement', then we should preferentially replace the failed
device.
However the current code replaces whichever is found first.
So split into 2 loops, check for failed/missing first, and only check
for WantReplacement if nothing is failed or missing.

Reported-by: majianpeng
Signed-off-by: NeilBrown

NeilBrown
2012-07-03 09:46:53 +0800
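
The two-loop selection described in the commit above can be sketched as follows. The enum and function names are hypothetical and the slot states are reduced to three values; this illustrates the priority order only, not the raid5 code.

```c
enum slot_state_sketch {
    SLOT_OK,
    SLOT_FAILED,           /* failed or missing device */
    SLOT_WANT_REPLACEMENT  /* working, but marked for replacement */
};

/* Return the slot a spare should go to, or -1 if none needs one. */
int pick_slot_sketch(const enum slot_state_sketch *slots, int n)
{
    int i;

    /* First pass: failed or missing devices take priority. */
    for (i = 0; i < n; i++) {
        if (slots[i] == SLOT_FAILED)
            return i;
    }
    /* Second pass: only reached when nothing is failed or missing. */
    for (i = 0; i < n; i++) {
        if (slots[i] == SLOT_WANT_REPLACEMENT)
            return i;
    }
    return -1;
}
```

A single loop that returned on the first match of either state would pick slot 0 for the states {want-replacement, ok, failed}; the two passes pick slot 2.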
fc448a18a md/raid10: Don't try to recovery unmatched (and unused) chunks.

If a RAID10 has an odd number of chunks - as might happen when there
are an odd number of devices - the last chunk has no pair and so is
not mirrored. We don't store data there, but when recovering the last
device in an array we try to recover that last chunk from a
non-existent location. This results in an error, and the recovery
aborts.

When we get to that last chunk we should just stop - there is nothing
more to do anyway.

This bug has been present since the introduction of RAID10, so the
patch is appropriate for any -stable kernel.

Cc: stable@vger.kernel.org
Reported-by: Christian Balzer
Tested-by: Christian Balzer
Signed-off-by: NeilBrown

NeilBrown
2012-07-03 08:37:30 +0800