01 Jul, 2016

1 commit


01 Jun, 2016

1 commit


30 May, 2016

1 commit


05 Apr, 2016

1 commit

  • The PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long*
    time ago with the promise that one day it would be possible to implement
    the page cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE, and it's a constant source of confusion whether the
    PAGE_CACHE_* or the PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> dropped entirely (the shift amount is zero);

    - >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> dropped entirely;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();
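
    For illustration only (this is not part of the patch), the mechanical
    effect on a hypothetical helper looks like this; the function names are
    invented:

    #include <linux/mm.h>
    #include <linux/pagemap.h>

    /* Hypothetical helper, before the conversion. */
    static pgoff_t old_pos_to_index(loff_t pos, struct page *page)
    {
            /* the compound shift is a no-op since the two constants are equal */
            pgoff_t index = (pos >> PAGE_CACHE_SHIFT) << (PAGE_CACHE_SHIFT - PAGE_SHIFT);

            page_cache_get(page);
            return index;
    }

    /* The same helper after the conversion. */
    static pgoff_t new_pos_to_index(loff_t pos, struct page *page)
    {
            pgoff_t index = pos >> PAGE_SHIFT;

            get_page(page);
            return index;
    }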

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

11 Nov, 2015

3 commits

  • Handle a write being requested to the page immediately beyond the EOF
    marker on a cache object. Currently this gets an assertion failure in
    CacheFiles because the EOF marker is used there to encode information about
    a partial page at the EOF - which could lead to an unknown blank spot in
    the file if we extend the file over it.

    The problem is actually in fscache where we check the index of the page
    being written against store_limit. store_limit is set to the number of
    pages that we're allowed to store by fscache_set_store_limit() - which
    means it's one more than the index of the last page we're allowed to store.
    The problem is that we permit writing to a page with an index _equal_ to
    the store limit - when we should reject that case.

    Whilst we're at it, change the triggered assertion in CacheFiles to just
    return -ENOBUFS instead.
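
    As a minimal sketch of the boundary condition (the function and parameter
    names here are illustrative, not the actual fscache fields):

    #include <linux/types.h>
    #include <linux/errno.h>

    /* store_limit is a count of storable pages, so the last valid index is
     * store_limit - 1; accepting index == store_limit was the bug. */
    static int may_store_page(pgoff_t index, pgoff_t store_limit)
    {
            if (index >= store_limit)
                    return -ENOBUFS;
            return 0;
    }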

    The assertion failure looks something like this:

    CacheFiles: Assertion failed
    1000 < 7b1 is false
    ------------[ cut here ]------------
    kernel BUG at fs/cachefiles/rdwr.c:962!
    ...
    RIP: 0010:[] [] cachefiles_write_page+0x273/0x2d0 [cachefiles]

    Cc: stable@vger.kernel.org # v2.6.31+; earlier - that + backport of a17754f (at least)
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • Only override netfs->primary_index when registration succeeds.

    Cc: stable@vger.kernel.org # v2.6.30+
    Signed-off-by: Kinglong Mee
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    Kinglong Mee
     
  • If the netfs already exists, fscache should not increase the parent's
    usage and n_children reference counts; otherwise they will never be
    decreased.

    v2: following David's suggestion,
    only increase the parent's references on success;
    use kmem_cache_free() to free primary_index directly

    v3: don't move "netfs->primary_index->parent = &fscache_fsdef_index;"

    Cc: stable@vger.kernel.org # v2.6.30+
    Signed-off-by: Kinglong Mee
    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    Kinglong Mee
     

07 Nov, 2015

1 commit

  • …d avoiding waking kswapd

    __GFP_WAIT has been used to identify atomic context in callers that hold
    spinlocks or are in interrupts. They are expected to be high priority and
    have access to one of two watermarks lower than "min" which can be referred
    to as the "atomic reserve". __GFP_HIGH users get access to the first
    lower watermark and can be called the "high priority reserve".

    Over time, callers had a requirement to not block when fallback options
    were available. Some have abused __GFP_WAIT leading to a situation where
    an optimistic allocation with a fallback option can access atomic
    reserves.

    This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
    cannot sleep and have no alternative. High priority users continue to use
    __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
    are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM identifies
    callers that want to wake kswapd for background reclaim. __GFP_WAIT is
    redefined as a caller that is willing to enter direct reclaim and wake
    kswapd for background reclaim.

    This patch then converts a number of sites:

    o __GFP_ATOMIC is used by callers that are high priority and have memory
    pools for those requests. GFP_ATOMIC uses this flag.

    o Callers that have a limited mempool to guarantee forward progress clear
    __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
    into this category where kswapd will still be woken but atomic reserves
    are not used as there is a one-entry mempool to guarantee progress.

    o Callers that are checking if they are non-blocking should use the
    helper gfpflags_allow_blocking() where possible. This is because
    checking for __GFP_WAIT, as was done historically, can now trigger false
    positives. Some exceptions like dm-crypt.c exist where the code intent
    is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
    flag manipulations.

    o Callers that built their own GFP flags instead of starting with GFP_KERNEL
    and friends now also need to specify __GFP_KSWAPD_RECLAIM.

    The first key hazard to watch out for is callers that removed __GFP_WAIT
    and were depending on access to atomic reserves for inconspicuous reasons.
    In some cases it may be appropriate for them to use __GFP_HIGH.

    The second key hazard is callers that assembled their own combination of
    GFP flags instead of starting with something like GFP_KERNEL. They may
    now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
    if it's missed in most cases as other activity will wake kswapd.
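
    As a hedged sketch of the new convention (the helper below is invented;
    gfpflags_allow_blocking() and the flags are the ones described above):

    #include <linux/gfp.h>
    #include <linux/slab.h>

    /* Invented example: a caller building its own GFP mask must now ask for
     * kswapd explicitly, and should test blocking ability via the helper
     * rather than looking for __GFP_WAIT. */
    static void *grab_buffer(size_t len, gfp_t base)
    {
            gfp_t gfp = base | __GFP_KSWAPD_RECLAIM;

            if (!gfpflags_allow_blocking(gfp))
                    gfp |= __GFP_NOWARN;    /* cannot sleep; failure is tolerable */

            return kmalloc(len, gfp);
    }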

    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Vitaly Wool <vitalywool@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Mel Gorman
     

21 Oct, 2015

1 commit

  • Merge the type-specific data with the payload data into one four-word chunk
    as it seems pointless to keep them separate.

    Use user_key_payload() for accessing the payloads of overloaded
    user-defined keys.
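
    A sketch of how a caller might read such a key; the helper below is
    invented for illustration and assumes the usual key->sem locking:

    #include <linux/key.h>
    #include <linux/string.h>
    #include <keys/user-type.h>

    /* Copy out the blob of an overloaded user-defined key. */
    static int read_key_blob(struct key *key, void *buf, size_t buflen)
    {
            const struct user_key_payload *ukp;
            int ret = -EINVAL;

            down_read(&key->sem);
            ukp = user_key_payload(key);
            if (ukp && ukp->datalen <= buflen) {
                    memcpy(buf, ukp->data, ukp->datalen);
                    ret = ukp->datalen;
            }
            up_read(&key->sem);
            return ret;
    }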

    Signed-off-by: David Howells
    cc: linux-cifs@vger.kernel.org
    cc: ecryptfs@vger.kernel.org
    cc: linux-ext4@vger.kernel.org
    cc: linux-f2fs-devel@lists.sourceforge.net
    cc: linux-nfs@vger.kernel.org
    cc: ceph-devel@vger.kernel.org
    cc: linux-ima-devel@lists.sourceforge.net

    David Howells
     

02 Apr, 2015

12 commits

  • Now that the retrieval operation may be disposed of by fscache_put_operation()
    before we actually set the context, the retrieval-specific cleanup operation
    can produce a NULL-pointer dereference when it tries to unconditionally clean
    up the netfs context.

    Given that it is expected that we'll get at least as far as the place where we
    currently set the context pointer and it is unlikely we'll go through the
    error handling paths prior to that point, retain the context right from the
    point that the retrieval op is allocated.

    Concomitant to this, we need to retain the cookie pointer in the retrieval op
    also so that we can call the netfs to release its context in the release
    method.

    In addition, we might now get into fscache_release_retrieval_op() with the op
    only initialised. To this end, set the operation to DEAD only after the
    release method has been called and skip the n_pages test upon cleanup if the
    op is still in the INITIALISED state.

    Without these changes, the following oops might be seen:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
    ...
    RIP: 0010:[] fscache_release_retrieval_op+0xae/0x100
    ...
    Call Trace:
    [] fscache_put_operation+0x117/0x2e0
    [] __fscache_read_or_alloc_pages+0x351/0x3ac
    [] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
    [] nfs_readpages+0x10c/0x185 [nfs]
    [] ? alloc_pages_current+0x119/0x13e
    [] ? __page_cache_alloc+0xfb/0x10a
    [] __do_page_cache_readahead+0x188/0x22c
    [] ondemand_readahead+0x29e/0x2af
    [] page_cache_sync_readahead+0x38/0x3a
    [] generic_file_read_iter+0x1a2/0x55a
    [] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
    [] nfs_file_read+0x49/0x70 [nfs]
    [] new_sync_read+0x78/0x9c
    [] __vfs_read+0x13/0x38
    [] vfs_read+0x95/0x121
    [] SyS_read+0x4c/0x8a
    [] system_call_fastpath+0x12/0x17

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • Any time an incomplete operation is cancelled, the operation cancellation
    function needs to be called to clean up. This is currently being passed
    directly to some of the functions that might want to call it, but not all.

    Instead, pass the cancellation method pointer to the fscache_operation_init()
    and have that cache it in the operation struct. Further, plug in a dummy
    cancellation handler if the caller declines to set one as this allows us to
    call the function unconditionally (the extra overhead isn't worth bothering
    about as we don't expect to be calling this typically).
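
    The "dummy handler" wiring amounts to something like the following
    simplified sketch (the typedef and the ->cancel field name here are
    assumptions, not a quote of the fscache source):

    #include <linux/fscache-cache.h>

    /* Stand-in typedef for the cancellation method pointer. */
    typedef void (*cancel_fn_t)(struct fscache_operation *);

    /* Installed when the caller supplies no cancellation method, so the
     * method can always be invoked unconditionally. */
    static void dummy_cancel(struct fscache_operation *op)
    {
    }

    /* Simplified init-time wiring. */
    static void sketch_operation_init(struct fscache_operation *op,
                                      cancel_fn_t cancel)
    {
            op->cancel = cancel ?: dummy_cancel;
    }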

    The cancellation method must thence be called everywhere the CANCELLED state
    is set. Note that we call it *before* setting the CANCELLED state such that
    the method can use the old state value to guide its operation.

    fscache_do_cancel_retrieval() needs moving higher up in the sources so that
    the init function can use it now.

    Without this, the following oops may be seen:

    FS-Cache: Assertion failed
    FS-Cache: 3 == 0 is false
    ------------[ cut here ]------------
    kernel BUG at ../fs/fscache/page.c:261!
    ...
    RIP: 0010:[] fscache_release_retrieval_op+0x77/0x100
    [] fscache_put_operation+0x114/0x2da
    [] __fscache_read_or_alloc_pages+0x358/0x3b3
    [] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
    [] nfs_readpages+0x10c/0x185 [nfs]
    [] ? alloc_pages_current+0x119/0x13e
    [] ? __page_cache_alloc+0xfb/0x10a
    [] __do_page_cache_readahead+0x188/0x22c
    [] ondemand_readahead+0x29e/0x2af
    [] page_cache_sync_readahead+0x38/0x3a
    [] generic_file_read_iter+0x1a2/0x55a
    [] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
    [] nfs_file_read+0x49/0x70 [nfs]
    [] new_sync_read+0x78/0x9c
    [] __vfs_read+0x13/0x38
    [] vfs_read+0x95/0x121
    [] SyS_read+0x4c/0x8a
    [] system_call_fastpath+0x12/0x17

    The assertion is showing that the remaining number of pages (n_pages) is not 0
    when the operation is being released.

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • Call fscache_put_operation() or a wrapper on any op that has gone through
    fscache_operation_init() so that the accounting shown in /proc is done
    correctly, specifically fscache_n_op_release.

    fscache_put_operation() therefore now allows an op in the INITIALISED state as
    well as in the CANCELLED and COMPLETE states.

    Note that this means that an operation can get put that doesn't have its
    ->object pointer filled in, so anything that depends on the object needs to be
    conditional in fscache_put_operation().

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • Cancellation of an in-progress operation needs to update the relevant counters
    and start any operations that are pending waiting on this one.

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • Count and display through /proc/fs/fscache/stats the number of initialised
    operations.

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • Out of line fscache_operation_init() so that it can access internal FS-Cache
    features, such as stats, in a later commit.

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • Currently, fscache_cancel_op() only cancels pending operations - attempts to
    cancel in-progress operations are ignored. This leads to a problem in
    fscache_wait_for_operation_activation() whereby the wait is terminated, but
    the object has been killed.

    The check at the end of the function now triggers because it's no longer
    contingent on the cache having produced an I/O error since the commit that
    fixed the logic error in fscache_object_is_dead().

    The result of the check is that it tries to cancel the operation - but since
    the operation may no longer be pending by this point, the cancellation
    request may be ignored - with the result that the object is just put by the
    caller and fscache_put_operation() has an assertion failure because the
    operation isn't in either the COMPLETE or the CANCELLED state.

    To fix this, we permit in-progress ops to be cancelled under some
    circumstances.

    The bug results in an oops that looks something like this:

    FS-Cache: fscache_wait_for_operation_activation() = -ENOBUFS [obj dead 3]
    FS-Cache:
    FS-Cache: Assertion failed
    FS-Cache: 3 == 5 is false
    ------------[ cut here ]------------
    kernel BUG at ../fs/fscache/operation.c:432!
    ...
    RIP: 0010:[] fscache_put_operation+0xf2/0x2cd
    Call Trace:
    [] __fscache_read_or_alloc_pages+0x2ec/0x3b3
    [] __nfs_readpages_from_fscache+0x59/0xbf [nfs]
    [] nfs_readpages+0x10c/0x185 [nfs]
    [] ? alloc_pages_current+0x119/0x13e
    [] ? __page_cache_alloc+0xfb/0x10a
    [] __do_page_cache_readahead+0x188/0x22c
    [] ondemand_readahead+0x29e/0x2af
    [] page_cache_sync_readahead+0x38/0x3a
    [] generic_file_read_iter+0x1a2/0x55a
    [] ? nfs_revalidate_mapping+0xd6/0x288 [nfs]
    [] nfs_file_read+0x49/0x70 [nfs]
    [] new_sync_read+0x78/0x9c
    [] __vfs_read+0x13/0x38
    [] vfs_read+0x95/0x121
    [] SyS_read+0x4c/0x8a
    [] system_call_fastpath+0x12/0x17

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • fscache_object_is_dead() returns true only if the object is marked dead and
    the cache got an I/O error. This should be a logical OR instead. Since two
    of the callers got split up into handling for separate subcases, expand the
    other callers and kill the function. This is probably the right thing to do
    anyway since one of the subcases isn't about the object at all, but rather
    about the cache.
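
    Boiled down, the helper effectively computed an AND where an OR was
    intended (simplified sketch; the flag and field names are best-effort,
    not a quote of the source):

    #include <linux/fscache-cache.h>

    /* What the helper effectively tested ... */
    static bool object_is_dead_buggy(struct fscache_object *object)
    {
            return test_bit(FSCACHE_IOERROR, &object->cache->flags) &&
                   !test_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
    }

    /* ... where the intent is that either condition alone suffices. */
    static bool object_is_dead_intended(struct fscache_object *object)
    {
            return test_bit(FSCACHE_IOERROR, &object->cache->flags) ||
                   !test_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
    }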

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • When an object is being marked as no longer live, do this under the object
    spinlock to prevent a race with operation submission targeted on that object.

    The problem occurs due to the following pair of intertwined sequences when the
    cache tries to create an object that would take it over the hard available
    space limit:

    NETFS INTERFACE
    ===============
    (A) The netfs calls fscache_acquire_cookie(). Object creation is deferred to
    the object state machine and the netfs is allowed to continue.

    OBJECT STATE MACHINE KTHREAD
    ============================
    (1) The object is looked up on disk by fscache_look_up_object()
    calling cachefiles_walk_to_object(). The latter finds that the
    object is not yet represented on disk and calls
    fscache_object_lookup_negative().

    (2) fscache_object_lookup_negative() sets FSCACHE_COOKIE_NO_DATA_YET
    and clears FSCACHE_COOKIE_LOOKING_UP, thus allowing the netfs to
    start queuing read operations.

    (B) The netfs calls fscache_read_or_alloc_pages(). This calls
    fscache_wait_for_deferred_lookup() which sees FSCACHE_COOKIE_LOOKING_UP
    become clear, allowing the read to begin.

    (C) A read operation is set up and passed to fscache_submit_op() to deal
    with.

    (3) cachefiles_walk_to_object() calls cachefiles_has_space(), which
    fails (or one of the file operations to create stuff fails).
    cachefiles returns an error to fscache.

    (4) fscache_look_up_object() transits to the LOOKUP_FAILURE state,

    (5) fscache_lookup_failure() sets FSCACHE_OBJECT_LOOKED_UP and
    FSCACHE_COOKIE_UNAVAILABLE and clears FSCACHE_COOKIE_LOOKING_UP
    then transits to the KILL_OBJECT state.

    (6) fscache_kill_object() clears FSCACHE_OBJECT_IS_LIVE in an attempt
    to reject any further requests from the netfs.

    (7) object->n_ops is examined and found to be 0.
    fscache_kill_object() transits to the DROP_OBJECT state.

    (D) fscache_submit_op() locks the object spinlock, sees if it can dispatch
    the op immediately by calling fscache_object_is_active() - which fails
    since FSCACHE_OBJECT_IS_AVAILABLE has not yet been set.

    (E) fscache_submit_op() then tests FSCACHE_OBJECT_LOOKED_UP - which is set.
    It then queues the object and increments object->n_ops.

    (8) fscache_drop_object() releases the object and eventually
    fscache_put_object() calls cachefiles_put_object() which suffers
    an assertion failure here:

    ASSERTCMP(object->fscache.n_ops, ==, 0);

    Locking the object spinlock in step (6) around the clearance of
    FSCACHE_OBJECT_IS_LIVE ensures that the decision trees in
    fscache_submit_op() and fscache_submit_exclusive_op() don't see the IS_LIVE
    flag being cleared mid-decision: either the op is queued before step (7) - in
    which case fscache_kill_object() will see n_ops>0 and will deal with the op -
    or the op will be rejected.
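
    Concretely, step (6) now does the flag clearance under the lock, along
    these lines (a simplified sketch rather than the exact source):

    #include <linux/fscache-cache.h>

    static void mark_object_dead(struct fscache_object *object)
    {
            spin_lock(&object->lock);
            clear_bit(FSCACHE_OBJECT_IS_LIVE, &object->flags);
            spin_unlock(&object->lock);
    }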

    This, combined with rejecting op submission if the target object is dying,
    fixes the problem.

    The problem shows up as the following oops:

    CacheFiles: Assertion failed
    CacheFiles: 1 == 0 is false
    ------------[ cut here ]------------
    kernel BUG at ../fs/cachefiles/interface.c:339!
    ...
    RIP: 0010:[] [] cachefiles_put_object+0x2a4/0x301 [cachefiles]
    ...
    Call Trace:
    [] fscache_put_object+0x18/0x21 [fscache]
    [] fscache_object_work_func+0x3ba/0x3c9 [fscache]
    [] process_one_work+0x226/0x441
    [] worker_thread+0x273/0x36b
    [] ? rescuer_thread+0x2e1/0x2e1
    [] kthread+0x10e/0x116
    [] ? kthread_create_on_node+0x1bb/0x1bb
    [] ret_from_fork+0x7c/0xb0
    [] ? kthread_create_on_node+0x1bb/0x1bb

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • Reject new operations that are being submitted against an object if that
    object has failed its lookup or creation states or has been killed by the
    cache backend for some other reason, such as having been culled.

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • When submitting an operation, prefer to cancel the operation immediately
    rather than queuing it for later processing if the object is marked as dying
    (ie. the object state machine has reached the KILL_OBJECT state).

    Whilst we're at it, change the series of related test_bit() calls into a
    READ_ONCE() and bitwise-AND operators to reduce the number of load
    instructions (test_bit() has a volatile address).
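
    The flag-test conversion is roughly of this shape (illustrative only; the
    real code checks more bits than this):

    #include <linux/fscache-cache.h>

    static bool object_is_dying_sketch(struct fscache_object *object)
    {
            /* one plain load instead of several volatile test_bit() reads */
            unsigned long flags = READ_ONCE(object->flags);

            return !(flags & (1UL << FSCACHE_OBJECT_IS_LIVE));
    }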

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     
  • Move fscache_report_unexpected_submission() up within operation.c so that it
    can be called from fscache_submit_exclusive_op() too.

    Signed-off-by: David Howells
    Reviewed-by: Steve Dickson
    Acked-by: Jeff Layton

    David Howells
     

24 Feb, 2015

1 commit


14 Oct, 2014

1 commit


18 Sep, 2014

1 commit

  • In rare cases under heavy VMA pressure, the ref count for an fscache cookie
    becomes corrupt: we decrement the ref count even if we failed before
    incrementing it.

    FS-Cache: Assertion failed bnode-eca5f9c6/syslog
    0 > 0 is false
    ------------[ cut here ]------------
    kernel BUG at fs/fscache/cookie.c:519!
    invalid opcode: 0000 [#1] SMP
    Call Trace:
    [] __fscache_relinquish_cookie+0x50/0x220 [fscache]
    [] ceph_fscache_unregister_inode_cookie+0x3e/0x50 [ceph]
    [] ceph_destroy_inode+0x33/0x200 [ceph]
    [] ? __fsnotify_inode_delete+0xe/0x10
    [] destroy_inode+0x3c/0x70
    [] evict+0x111/0x180
    [] iput+0x103/0x190
    [] __dentry_kill+0x1c8/0x220
    [] shrink_dentry_list+0xf1/0x250
    [] prune_dcache_sb+0x4c/0x60
    [] super_cache_scan+0xff/0x170
    [] shrink_slab_node+0x140/0x2c0
    [] shrink_slab+0x8a/0x130
    [] balance_pgdat+0x3e2/0x5d0
    [] kswapd+0x16a/0x4a0
    [] ? __wake_up_sync+0x20/0x20
    [] ? balance_pgdat+0x5d0/0x5d0
    [] kthread+0xc9/0xe0
    [] ? ftrace_raw_event_xen_mmu_release_ptpage+0x70/0x90
    [] ? flush_kthread_worker+0xb0/0xb0
    [] ret_from_fork+0x7c/0xb0
    [] ? flush_kthread_worker+0xb0/0xb0
    RIP [] __fscache_disable_cookie+0x1db/0x210 [fscache]
    RSP
    ---[ end trace 254d0d7c74a01f25 ]---

    Signed-off-by: Milosz Tanski
    Signed-off-by: David Howells

    Milosz Tanski
     

27 Aug, 2014

2 commits

  • I've been seeing issues with disposing cookies under vma pressure. The symptom
    is that the refcount gets out of sync. In this case we fail to decrement the
    refcount if submit fails. I found this while auditing the error handling in
    and around cookie operations.

    Signed-off-by: Milosz Tanski
    Signed-off-by: David Howells

    Milosz Tanski
     
  • This is meant to avoid a recursive hang caused by the underlying filesystem
    trying to grab a free page and causing a write-out.

    INFO: task kworker/u30:7:28375 blocked for more than 120 seconds.
    Not tainted 3.15.0-virtual #74
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    kworker/u30:7 D 0000000000000000 0 28375 2 0x00000000
    Workqueue: fscache_operation fscache_op_work_func [fscache]
    ffff88000b147148 0000000000000046 0000000000000000 ffff88000b1471c8
    ffff8807aa031820 0000000000014040 ffff88000b147fd8 0000000000014040
    ffff880f0c50c860 ffff8807aa031820 ffff88000b147158 ffff88007be59cd0
    Call Trace:
    [] schedule+0x29/0x70
    [] __fscache_wait_on_page_write+0x55/0x90 [fscache]
    [] ? __wake_up_sync+0x20/0x20
    [] __fscache_maybe_release_page+0x65/0x1e0 [fscache]
    [] ceph_releasepage+0x83/0x100 [ceph]
    [] ? anon_vma_fork+0x130/0x130
    [] try_to_release_page+0x32/0x50
    [] shrink_page_list+0x7e6/0x9d0
    [] ? isolate_lru_pages.isra.73+0x78/0x1e0
    [] shrink_inactive_list+0x252/0x4c0
    [] shrink_lruvec+0x3e1/0x670
    [] shrink_zone+0x3f/0x110
    [] do_try_to_free_pages+0x1d6/0x450
    [] ? zone_statistics+0x99/0xc0
    [] try_to_free_pages+0xc4/0x180
    [] __alloc_pages_nodemask+0x6b2/0xa60
    [] ? __find_get_block+0xbe/0x250
    [] ? wake_up_bit+0x2e/0x40
    [] alloc_pages_current+0xb3/0x180
    [] __page_cache_alloc+0xb7/0xd0
    [] grab_cache_page_write_begin+0x7c/0xe0
    [] ? ext4_mark_inode_dirty+0x82/0x220
    [] ext4_da_write_begin+0x89/0x2d0
    [] generic_perform_write+0xbe/0x1d0
    [] ? update_time+0x81/0xc0
    [] ? mnt_clone_write+0x12/0x30
    [] __generic_file_aio_write+0x1ce/0x3f0
    [] generic_file_aio_write+0x5e/0xe0
    [] ext4_file_write+0x9f/0x410
    [] ? ext4_file_open+0x66/0x180
    [] do_sync_write+0x5a/0x90
    [] cachefiles_write_page+0x149/0x430 [cachefiles]
    [] ? radix_tree_gang_lookup_tag+0x89/0xd0
    [] fscache_write_op+0x222/0x3b0 [fscache]
    [] fscache_op_work_func+0x3a/0x100 [fscache]
    [] process_one_work+0x179/0x4a0
    [] worker_thread+0x11b/0x370
    [] ? manage_workers.isra.21+0x2e0/0x2e0
    [] kthread+0xc9/0xe0
    [] ? ftrace_raw_event_xen_mmu_release_ptpage+0x70/0x90
    [] ? flush_kthread_worker+0xb0/0xb0
    [] ret_from_fork+0x7c/0xb0
    [] ? flush_kthread_worker+0xb0/0xb0

    Signed-off-by: Milosz Tanski
    Signed-off-by: David Howells

    Milosz Tanski
     

07 Aug, 2014

1 commit


16 Jul, 2014

1 commit

  • The current "wait_on_bit" interface requires an 'action'
    function to be provided which does the actual waiting.
    There are over 20 such functions, many of them identical.
    Most cases can be satisfied by one of just two functions, one
    which uses io_schedule() and one which just uses schedule().

    So:
    Rename wait_on_bit and wait_on_bit_lock to
    wait_on_bit_action and wait_on_bit_lock_action
    to make it explicit that they need an action function.

    Introduce new wait_on_bit{,_lock} and wait_on_bit{,_lock}_io
    which are *not* given an action function but implicitly use
    a standard one.
    The decision to error-out if a signal is pending is now made
    based on the 'mode' argument rather than being encoded in the action
    function.

    All instances of the old wait_on_bit and wait_on_bit_lock which
    can use the new version have been changed accordingly and their
    action functions have been discarded.
    wait_on_bit{_lock} does not return any specific error code in the
    event of a signal so the caller must check for non-zero and
    interpolate their own error code as appropriate.
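
    For example, a caller that previously passed an action function might now
    be written like this (sketch; the wrapper and the error-code choice are
    illustrative):

    #include <linux/errno.h>
    #include <linux/sched.h>
    #include <linux/wait.h>

    /* No action function any more; the caller picks its own error code if
     * the interruptible wait is broken by a signal. */
    static int wait_for_flag(unsigned long *flags, int bit)
    {
            if (wait_on_bit(flags, bit, TASK_INTERRUPTIBLE))
                    return -ERESTARTSYS;
            return 0;
    }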

    The wait_on_bit() call in __fscache_wait_on_invalidate() was
    ambiguous as it specified TASK_UNINTERRUPTIBLE but used
    fscache_wait_bit_interruptible as an action function.
    David Howells confirms this should be uniformly
    "uninterruptible"

    The main remaining user of wait_on_bit{,_lock}_action is NFS
    which needs to use a freezer-aware schedule() call.

    A comment in fs/gfs2/glock.c notes that having multiple 'action'
    functions is useful as they display differently in the 'wchan'
    field of 'ps'. (and /proc/$PID/wchan).
    As the new bit_wait{,_io} functions are tagged "__sched", they
    will not show up at all, but something higher in the stack. So
    the distinction will still be visible, only with different
    function names (gfs2_glock_wait versus gfs2_glock_dq_wait in the
    gfs2/glock.c case).

    Since the first version of this patch (against 3.15) two new action
    functions have appeared, one in NFS and one in CIFS. CIFS also now
    uses an action function that makes the same freezer-aware
    schedule call as NFS.

    Signed-off-by: NeilBrown
    Acked-by: David Howells (fscache, keys)
    Acked-by: Steven Whitehouse (gfs2)
    Acked-by: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Steve French
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20140707051603.28027.72349.stgit@notabene.brown
    Signed-off-by: Ingo Molnar

    NeilBrown
     

07 Jun, 2014

1 commit


05 Jun, 2014

2 commits


18 Feb, 2014

1 commit

  • When FS-Cache allocates an object, the following sequence of events can
    occur:

    -->fscache_alloc_object()
    -->cachefiles_alloc_object() [via cache->ops->alloc_object]
    fscache_attach_object()
    cachefiles_put_object() [via cache->ops->put_object]
    -->fscache_object_destroy()
    -->fscache_objlist_remove()
    -->rb_erase() to remove the object from fscache_object_list.

    resulting in a crash in the rbtree code.

    The problem is that the object is only added to fscache_object_list on
    the success path of fscache_attach_object() where it calls
    fscache_objlist_add().

    So if fscache_attach_object() fails, the object won't have been added to
    the objlist rbtree. We do, however, unconditionally try to remove the
    object from the tree.

    Thanks to NeilBrown for finding this and suggesting this solution.

    Reported-by: NeilBrown
    Signed-off-by: David Howells
    Tested-by: (a customer of) NeilBrown
    Signed-off-by: Linus Torvalds

    David Howells
     

14 Nov, 2013

1 commit

  • Pull block IO core updates from Jens Axboe:
    "This is the pull request for the core changes in the block layer for
    3.13. It contains:

    - The new blk-mq request interface.

    This is a new and more scalable queueing model that marries the
    best part of the request based interface we currently have (which
    is fully featured, but scales poorly) and the bio based "interface"
    which the new drivers for high IOPS devices end up using because
    it's much faster than the request based one.

    The bio interface has no block layer support, since it taps into
    the stack much earlier. This means that drivers end up having to
    implement a lot of functionality on their own, like tagging,
    timeout handling, requeue, etc. The blk-mq interface provides all
    these. Some drivers even provide a switch to select bio or rq and
    has code to handle both, since things like merging only works in
    the rq model and hence is faster for some workloads. This is a
    huge mess. Conversion of these drivers nets us a substantial code
    reduction. Initial results on converting SCSI to this model even
    shows an 8x improvement on single queue devices. So while the
    model was intended to work on the newer multiqueue devices, it has
    substantial improvements for "classic" hardware as well. This code
    has gone through extensive testing and development; it's now ready
    to go. A pull request to convert virtio-blk to this
    model will be coming as well, with more drivers scheduled
    for 3.14 conversion.

    - Two blktrace fixes from Jan and Chen Gang.

    - A plug merge fix from Alireza Haghdoost.

    - Conversion of __get_cpu_var() from Christoph Lameter.

    - Fix for sector_div() with 64-bit divider from Geert Uytterhoeven.

    - A fix for a race between request completion and the timeout
    handling from Jeff Moyer. This is what caused the merge conflict
    with blk-mq/core, in case you are looking at that.

    - A dm stacking fix from Mike Snitzer.

    - A code consolidation fix and duplicated code removal from Kent
    Overstreet.

    - A handful of block bug fixes from Mikulas Patocka, fixing a loop
    crash and memory corruption on blk cg.

    - Elevator switch bug fix from Tomoki Sekiyama.

    A heads-up that I had to rebase this branch. Initially the immutable
    bio_vecs had been queued up for inclusion, but a week later, it became
    clear that it wasn't fully cooked yet. So the decision was made to
    pull this out and postpone it until 3.14. It was a straightforward
    rebase, just pruning out the immutable series and the later fixes of
    problems with it. The rest of the patches applied directly and no
    further changes were made"

    * 'for-3.13/core' of git://git.kernel.dk/linux-block: (31 commits)
    block: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
    block: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
    block: Do not call sector_div() with a 64-bit divisor
    kernel: trace: blktrace: remove redundent memcpy() in compat_blk_trace_setup()
    block: Consolidate duplicated bio_trim() implementations
    block: Use rw_copy_check_uvector()
    block: Enable sysfs nomerge control for I/O requests in the plug list
    block: properly stack underlying max_segment_size to DM device
    elevator: acquire q->sysfs_lock in elevator_change()
    elevator: Fix a race in elevator switching and md device initialization
    block: Replace __get_cpu_var uses
    bdi: test bdi_init failure
    block: fix a probe argument to blk_register_region
    loop: fix crash if blk_alloc_queue fails
    blk-core: Fix memory corruption if blkcg_init_queue fails
    block: fix race between request completion and timeout handling
    blktrace: Send BLK_TN_PROCESS events to all running traces
    blk-mq: don't disallow request merges for req->special being set
    blk-mq: mq plug list breakage
    blk-mq: fix for flush deadlock
    ...

    Linus Torvalds
     

08 Nov, 2013

1 commit

  • __get_cpu_var() is used for multiple purposes in the kernel source. One of
    them is address calculation via the form &__get_cpu_var(x). This calculates
    the address for the instance of the percpu variable of the current processor
    based on an offset.

    Other use cases are for storing and retrieving data from the current
    processor's percpu area. __get_cpu_var() can be used as an lvalue when
    writing data or on the right side of an assignment.

    __get_cpu_var() is defined as :

    #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))

    __get_cpu_var() always only does an address determination. However, store
    and retrieve operations could use a segment prefix (or global register on
    other platforms) to avoid the address calculation.

    this_cpu_write() and this_cpu_read() can directly take an offset into a
    percpu area and use optimized assembly code to read and write per cpu
    variables.

    This patch converts __get_cpu_var into either an explicit address
    calculation using this_cpu_ptr() or into a use of this_cpu operations that
    use the offset. Thereby address calculations are avoided and fewer registers
    are used when code is generated.

    At the end of the patch set all uses of __get_cpu_var have been removed so
    the macro is removed too.

    The patch set includes passes over all arches as well. Once these operations
    are used throughout, specialized macros can be defined in non-x86
    arches as well in order to optimize per cpu access, e.g. by using a global
    register that may be set to the per cpu base.

    Transformations done to __get_cpu_var()

    1. Determine the address of the percpu instance of the current processor.

    DEFINE_PER_CPU(int, y);
    int *x = &__get_cpu_var(y);

    Converts to

    int *x = this_cpu_ptr(&y);

    2. Same as #1 but this time an array structure is involved.

    DEFINE_PER_CPU(int, y[20]);
    int *x = __get_cpu_var(y);

    Converts to

    int *x = this_cpu_ptr(y);

    3. Retrieve the content of the current processors instance of a per cpu
    variable.

    DEFINE_PER_CPU(int, y);
    int x = __get_cpu_var(y);

    Converts to

    int x = __this_cpu_read(y);

    4. Retrieve the content of a percpu struct

    DEFINE_PER_CPU(struct mystruct, y);
    struct mystruct x = __get_cpu_var(y);

    Converts to

    memcpy(&x, this_cpu_ptr(&y), sizeof(x));

    5. Assignment to a per cpu variable

    DEFINE_PER_CPU(int, y)
    __get_cpu_var(y) = x;

    Converts to

    this_cpu_write(y, x);

    6. Increment/Decrement etc of a per cpu variable

    DEFINE_PER_CPU(int, y);
    __get_cpu_var(y)++

    Converts to

    this_cpu_inc(y)

    Signed-off-by: Christoph Lameter
    Signed-off-by: Jens Axboe

    Christoph Lameter
     

28 Sep, 2013

2 commits

  • Provide the ability to enable and disable fscache cookies. A disabled cookie
    will reject or ignore further requests to:

    Acquire a child cookie
    Invalidate and update backing objects
    Check the consistency of a backing object
    Allocate storage for backing page
    Read backing pages
    Write to backing pages

    but still allows:

    Checks/waits on the completion of already in-progress objects
    Uncaching of pages
    Relinquishment of cookies

    Two new operations are provided:

    (1) Disable a cookie:

    void fscache_disable_cookie(struct fscache_cookie *cookie,
    bool invalidate);

    If the cookie is not already disabled, this locks the cookie against other
    dis/enablement ops, marks the cookie as being disabled, discards or
    invalidates any backing objects and waits for cessation of activity on any
    associated object.

    This is a wrapper around a chunk split out of fscache_relinquish_cookie(),
    but it reinitialises the cookie such that it can be reenabled.

    All possible failures are handled internally. The caller should consider
    calling fscache_uncache_all_inode_pages() afterwards to make sure all page
    markings are cleared up.

    (2) Enable a cookie:

    void fscache_enable_cookie(struct fscache_cookie *cookie,
    bool (*can_enable)(void *data),
    void *data)

    If the cookie is not already enabled, this locks the cookie against other
    dis/enablement ops, invokes can_enable() and, if the cookie is not an
    index cookie, will begin the procedure of acquiring backing objects.

    The optional can_enable() function is passed the data argument and returns
    a ruling as to whether or not enablement should actually be permitted to
    begin.

    All possible failures are handled internally. The cookie will only be
    marked as enabled if provisional backing objects are allocated.
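
    A sketch of how a netfs might wire these up (the helpers and the
    i_writecount rule below are illustrative, not taken from any filesystem):

    #include <linux/fs.h>
    #include <linux/fscache.h>

    /* Only permit enablement while nobody has the inode open for writing. */
    static bool my_can_enable(void *data)
    {
            struct inode *inode = data;

            return atomic_read(&inode->i_writecount) == 0;
    }

    static void my_set_caching(struct fscache_cookie *cookie,
                               struct inode *inode, bool on)
    {
            if (on)
                    fscache_enable_cookie(cookie, my_can_enable, inode);
            else
                    fscache_disable_cookie(cookie, true /* invalidate */);
    }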

    A later patch will introduce these to NFS. Cookie enablement during nfs_open()
    is then made contingent on i_writecount.

    Signed-off-by: David Howells

    David Howells
     
  • Add wrapper functions for dealing with cookie->n_active:

    (*) __fscache_use_cookie() to increment it.

    (*) __fscache_unuse_cookie() to decrement and test against zero.

    (*) __fscache_wake_unused_cookie() to wake up anyone waiting for it to reach
    zero.

    The second and third are split so that the third can be done after cookie->lock
    has been released in case the waiter wakes up whilst we're still holding it and
    tries to get it.
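
    The split allows a pattern along these lines (simplified sketch; the
    header placement and the boolean return convention of
    __fscache_unuse_cookie() are assumed from the description above):

    #include <linux/fscache-cache.h>

    static void sketch_unuse_cookie(struct fscache_cookie *cookie)
    {
            bool awaken;

            spin_lock(&cookie->lock);
            /* ... whatever else needs doing under the lock ... */
            awaken = __fscache_unuse_cookie(cookie);
            spin_unlock(&cookie->lock);

            /* wake only after the lock has been released */
            if (awaken)
                    __fscache_wake_unused_cookie(cookie);
    }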

    We will need to wake-on-zero once the cookie disablement patch is applied
    because it will then be possible to see n_active become zero without the cookie
    being relinquished.

    Also move the cookie usage out of fscache_attr_changed_op() and into
    fscache_attr_changed() and the operation struct so that cookie disablement
    will be able to track it.

    Whilst we're at it, only increment n_active if we're about to do
    fscache_submit_op() so that we don't have to deal with undoing it if anything
    earlier fails. Possibly this should be moved into fscache_submit_op() which
    could look at FSCACHE_OP_UNUSE_COOKIE.

    Signed-off-by: David Howells

    David Howells
     

20 Sep, 2013

1 commit

  • Pull ceph fixes from Sage Weil:
    "These fix several bugs with RBD from 3.11 that didn't get tested in
    time for the merge window: some error handling, a use-after-free, and
    a sequencing issue when unmapping an image races with a notify
    operation.

    There is also a patch fixing a problem with the new ceph + fscache
    code that just went in"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    fscache: check consistency does not decrement refcount
    rbd: fix error handling from rbd_snap_name()
    rbd: ignore unmapped snapshots that no longer exist
    rbd: fix use-after free of rbd_dev->disk
    rbd: make rbd_obj_notify_ack() synchronous
    rbd: complete notifies before cleaning up osd_client and rbd_dev
    libceph: add function to ensure notifies are complete

    Linus Torvalds
     

12 Sep, 2013

1 commit

  • With users of radix_tree_preload() run from interrupt (block/blk-ioc.c is
    one such possible user), the following race can happen:

    radix_tree_preload()
    ...
    radix_tree_insert()
    radix_tree_node_alloc()
    if (rtp->nr) {
    ret = rtp->nodes[rtp->nr - 1];

    ...
    radix_tree_preload()
    ...
    radix_tree_insert()
    radix_tree_node_alloc()
    if (rtp->nr) {
    ret = rtp->nodes[rtp->nr - 1];

    And we give out one radix tree node twice. That clearly results in radix
    tree corruption with different results (usually OOPS) depending on which
    two users of radix tree race.

    We fix the problem by making radix_tree_node_alloc() always allocate fresh
    radix tree nodes when in interrupt. Using preloading when in interrupt
    doesn't make sense since all the allocations have to be atomic anyway and
    we cannot steal nodes from process-context users because some users rely
    on radix_tree_insert() succeeding after radix_tree_preload().
    The in_interrupt() check is somewhat ugly but we cannot simply key off the
    passed gfp_mask as that is acquired from root_gfp_mask() and is thus the
    same for all preload users.

    Another part of the fix is to avoid node preallocation in
    radix_tree_preload() when passed gfp_mask doesn't allow waiting. Again,
    preallocation in such a case doesn't make sense and when preallocation would
    happen in interrupt we could possibly leak some allocated nodes. However,
    some users of radix_tree_preload() require following radix_tree_insert()
    to succeed. To avoid unexpected effects for these users,
    radix_tree_preload() only warns if passed gfp mask doesn't allow waiting
    and we provide a new function radix_tree_maybe_preload() for those users
    which get different gfp mask from different call sites and which are
    prepared to handle radix_tree_insert() failure.
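
    A sketch of the intended usage of the new helper (the lock and names here
    are invented):

    #include <linux/radix-tree.h>
    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(my_tree_lock);

    /* A caller that is prepared for radix_tree_insert() to fail. */
    static int my_tree_insert(struct radix_tree_root *root, unsigned long index,
                              void *item, gfp_t gfp)
    {
            int err;

            /* preloads only if gfp allows waiting; otherwise it just disables
             * preemption so radix_tree_preload_end() still pairs correctly */
            err = radix_tree_maybe_preload(gfp);
            if (err)
                    return err;

            spin_lock(&my_tree_lock);
            err = radix_tree_insert(root, index, item);     /* may be -ENOMEM */
            spin_unlock(&my_tree_lock);

            radix_tree_preload_end();
            return err;
    }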

    Signed-off-by: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

11 Sep, 2013

1 commit

  • __fscache_check_consistency() does not decrement the count of operations
    active after it finishes in the success case. This leads to hung tasks on
    cookie de-registration (commonly in inode eviction).

    INFO: task kworker/1:2:4214 blocked for more than 120 seconds.
    kworker/1:2 D ffff880443513fc0 0 4214 2 0x00000000
    Workqueue: ceph-msgr con_work [libceph]
    ...
    Call Trace:
    [] ? _raw_spin_unlock_irqrestore+0x16/0x20
    [] ? fscache_wait_bit_interruptible+0x30/0x30 [fscache]
    [] schedule+0x29/0x70
    [] fscache_wait_atomic_t+0xe/0x20 [fscache]
    [] out_of_line_wait_on_atomic_t+0x9f/0xe0
    [] ? autoremove_wake_function+0x40/0x40
    [] __fscache_relinquish_cookie+0x15c/0x310 [fscache]
    [] ceph_fscache_unregister_inode_cookie+0x3e/0x50 [ceph]
    [] ceph_destroy_inode+0x33/0x200 [ceph]
    [] ? __fsnotify_inode_delete+0xe/0x10
    [] destroy_inode+0x3c/0x70
    [] evict+0x119/0x1b0

    Signed-off-by: Milosz Tanski
    Acked-by: David Howells
    Signed-off-by: Sage Weil

    Milosz Tanski
     

06 Sep, 2013

1 commit

  • Currently the fscache code expects the netfs to call fscache_read_or_alloc_pages
    inside the aops readpages callback. It marks all the pages in the list
    provided by readahead with PG_private_2. In cases where the netfs fails to
    read all the pages (which is legal), it ends up returning to the readahead
    code and triggering a BUG. This happens because the page list still
    contains marked pages.

    This patch implements a simple fscache_readpages_cancel function that the netfs
    should call before returning from readpages. It will revoke the pages from the
    underlying cache backend and unmark them.
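
    A sketch of the netfs-side cleanup; my_issue_reads() and my_cookie() are
    stand-ins for the filesystem's own code:

    #include <linux/fs.h>
    #include <linux/fscache.h>

    static int my_issue_reads(struct address_space *mapping,
                              struct list_head *pages, unsigned nr_pages);
    static struct fscache_cookie *my_cookie(struct inode *inode);

    static int my_readpages(struct file *file, struct address_space *mapping,
                            struct list_head *pages, unsigned nr_pages)
    {
            int ret = my_issue_reads(mapping, pages, nr_pages);

            /* strip PG_private_2 from any pages we didn't consume before
             * handing the list back to the readahead code */
            if (ret < 0 && !list_empty(pages))
                    fscache_readpages_cancel(my_cookie(mapping->host), pages);

            return ret;
    }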

    The problem was originally worked out in the Ceph devel tree, but it also
    occurs in CIFS. It appears that NFS, AFS and 9P are okay as read_cache_pages()
    will clean up the unprocessed pages in the case of an error.

    This can be used to address the following oops:

    [12410647.597278] BUG: Bad page state in process petabucket pfn:3d504e
    [12410647.597292] page:ffffea000f541380 count:0 mapcount:0 mapping:
    (null) index:0x0
    [12410647.597298] page flags: 0x200000000001000(private_2)

    ...

    [12410647.597334] Call Trace:
    [12410647.597345] [] dump_stack+0x19/0x1b
    [12410647.597356] [] bad_page+0xc7/0x120
    [12410647.597359] [] free_pages_prepare+0x10e/0x120
    [12410647.597361] [] free_hot_cold_page+0x40/0x170
    [12410647.597363] [] __put_single_page+0x27/0x30
    [12410647.597365] [] put_page+0x25/0x40
    [12410647.597376] [] ceph_readpages+0x2e9/0x6e0 [ceph]
    [12410647.597379] [] __do_page_cache_readahead+0x1af/0x260
    [12410647.597382] [] ra_submit+0x21/0x30
    [12410647.597384] [] filemap_fault+0x254/0x490
    [12410647.597387] [] __do_fault+0x6f/0x4e0
    [12410647.597391] [] ? __switch_to+0x16d/0x4a0
    [12410647.597395] [] ? finish_task_switch+0x5a/0xc0
    [12410647.597398] [] handle_pte_fault+0xf6/0x930
    [12410647.597401] [] ? pte_mfn_to_pfn+0x93/0x110
    [12410647.597403] [] ? xen_pmd_val+0xe/0x10
    [12410647.597405] [] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
    [12410647.597407] [] handle_mm_fault+0x251/0x370
    [12410647.597411] [] ? call_rwsem_down_read_failed+0x14/0x30
    [12410647.597414] [] __do_page_fault+0x1aa/0x550
    [12410647.597418] [] ? up_write+0x1d/0x20
    [12410647.597422] [] ? vm_mmap_pgoff+0xbc/0xe0
    [12410647.597425] [] ? SyS_mmap_pgoff+0xd8/0x240
    [12410647.597427] [] do_page_fault+0xe/0x10
    [12410647.597431] [] page_fault+0x28/0x30

    Signed-off-by: Milosz Tanski
    Signed-off-by: David Howells

    Milosz Tanski