Eric Lee / smarc-fsl-linux-kernel

07 Oct, 2010

6 commits

d91f2438d ceph: update issue_seq on cap grant

We need to update the issue_seq on any grant operation, be it via an MDS
reply or a separate grant message. The update in the grant path was
missing. This broke cap release for inodes for which the MDS sent an
explicit grant message that was not soon followed by a successful
MDS reply on the same inode.

Also fix the signedness on seq locals.

Signed-off-by: Sage Weil

Sage Weil
2010-10-07 23:01:50 +0800
21b559de5 ceph: send cap release message early on failed revoke.

If an MDS tries to revoke caps that we don't have, we want to send
releases early since they probably contain the caps message the MDS
is looking for.

Previously, we only sent the messages if we didn't have the inode either. But
in a multi-mds system we can retain the inode after dropping all caps for
a single MDS.

Signed-off-by: Greg Farnum
Signed-off-by: Sage Weil

Greg Farnum
2010-10-07 23:00:24 +0800
bba0cd0e3 ceph: Update max_len with minimum required size

On error, encode_fh should update max_len with the minimum required
size, so that the caller can redo the call with a reallocated buffer.
This is required by the open-by-handle patch series.

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Sage Weil

Aneesh Kumar K.V
2010-10-07 23:00:24 +0800
92923dcbf ceph: Fix return value of encode_fh function

The encode_fh function should return 255 on error, as other file
systems do, to indicate EOVERFLOW. Also, max_len is in sizeof(u32) units
and not in bytes.

Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Sage Weil

Aneesh Kumar K.V
2010-10-07 23:00:23 +0800
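
A rough userspace sketch of the encode_fh convention described in 92923dcbf and bba0cd0e3 above. This is not the ceph code: `sketch_encode_fh`, `struct sketch_fh`, and the returned handle type `1` are made up for illustration. It only shows the two points the messages make: max_len is counted in u32 words, and on a too-small buffer the function returns 255 and writes the required size back.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical handle layout; the real ceph file handle differs. */
struct sketch_fh {
    uint64_t ino;
    uint64_t parent_ino;
};

/*
 * Encode a handle into buf. *max_len is the buffer capacity in
 * sizeof(uint32_t) units, not bytes. On success the words used are
 * stored in *max_len and a (hypothetical) handle type of 1 is returned.
 * If the buffer is too small, *max_len is set to the minimum required
 * size and 255 is returned so the caller can reallocate and retry.
 */
int sketch_encode_fh(uint32_t *buf, int *max_len,
                     uint64_t ino, uint64_t parent_ino)
{
    const int needed = (int)(sizeof(struct sketch_fh) / sizeof(uint32_t));
    struct sketch_fh fh = { ino, parent_ino };

    if (*max_len < needed) {
        *max_len = needed;      /* tell the caller how much to allocate */
        return 255;
    }

    memcpy(buf, &fh, sizeof(fh));
    *max_len = needed;
    return 1;
}
```

A caller would typically try once, and on 255 reallocate to the reported `*max_len` words and call again.
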
6bc18876b ceph: avoid null deref in osd request error path

If we interrupt an osd request, we call __cancel_request, but it wasn't
verifying that req->r_osd was non-NULL before dereferencing it. This could
cause a crash if osds were flapping and we aborted a request on said osd.

Reported-by: Henry C Chang
Signed-off-by: Sage Weil

Sage Weil
2010-10-07 23:00:23 +0800
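
A minimal sketch of the guard described in 6bc18876b above, assuming simplified stand-in types. `struct sketch_osd`, `struct sketch_request`, and `sketch_cancel_request` are hypothetical; only the field names r_osd and r_sent echo the commit message and kernel naming.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the osd and osd request structures. */
struct sketch_osd {
    int in_flight;              /* requests currently sent to this osd */
};

struct sketch_request {
    struct sketch_osd *r_osd;   /* may be NULL if the osd went away */
    int r_sent;                 /* nonzero once the request was sent */
};

/*
 * Cancel a request. The osd pointer is only dereferenced after
 * checking it is non-NULL. Returns 1 if something was revoked,
 * 0 if there was nothing to do.
 */
int sketch_cancel_request(struct sketch_request *req)
{
    if (req->r_sent && req->r_osd != NULL) {
        req->r_osd->in_flight--;
        req->r_sent = 0;
        return 1;
    }
    return 0;
}
```
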
936aeb5c4 ceph: fix list_add usage on unsafe_writes list

Fix argument order.

Signed-off-by: Henry C Chang
Signed-off-by: Sage Weil

Henry C Chang
2010-10-07 23:00:23 +0800

18 Sep, 2010

2 commits

be4f104df ceph: select CRYPTO

We select CRYPTO_AES, but not CRYPTO.

Signed-off-by: Sage Weil

Sage Weil
2010-09-18 03:30:31 +0800
a43fb7310 ceph: check mapping to determine if FILE_CACHE cap is used

See if the i_data mapping has any pages to determine if the FILE_CACHE
capability is currently in use, instead of assuming it is any time the
rdcache_gen value is set (i.e., issued -> used).

This allows the MDS RECALL_STATE process to work for inodes that have
cached pages.

Signed-off-by: Sage Weil

Sage Weil
2010-09-18 00:54:31 +0800

17 Sep, 2010

2 commits

e835124c2 ceph: only send one flushsnap per cap_snap per mds session

Sending multiple flushsnap messages is problematic because we ignore
the response if the tid doesn't match, and the server may only respond to
each one once. It's also a waste.

So, skip cap_snaps that are already on the flushing list, unless the caller
tells us to resend (because we are reconnecting).

Signed-off-by: Sage Weil

Sage Weil
2010-09-17 23:03:08 +0800
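
The skip rule from e835124c2 above, sketched with made-up types. `struct sketch_cap_snap` and `sketch_flush_snaps` are hypothetical; the real code walks the inode's cap_snap list and builds FLUSHSNAP messages, which is not modeled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cap_snap: only the state needed for the skip decision. */
struct sketch_cap_snap {
    bool on_flushing_list;  /* a flushsnap was already sent this session */
    unsigned sends;         /* how many times we "sent" a flushsnap */
};

/*
 * Send at most one flushsnap per cap_snap. Entries already on the
 * flushing list are skipped unless the caller asks for a resend
 * (for example, because we are reconnecting).
 * Returns the number of flushsnaps sent by this call.
 */
size_t sketch_flush_snaps(struct sketch_cap_snap *snaps, size_t n, bool resend)
{
    size_t sent = 0;

    for (size_t i = 0; i < n; i++) {
        if (snaps[i].on_flushing_list && !resend)
            continue;
        snaps[i].on_flushing_list = true;
        snaps[i].sends++;
        sent++;
    }
    return sent;
}
```
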
ae00d4f37 ceph: fix cap_snap and realm split

The cap_snap creation/queueing relies on both the current i_head_snapc
_and_ the i_snap_realm pointers being correct, so that the new cap_snap
can properly reference the old context and the new i_head_snapc can be
updated to reference the new snaprealm's context. To fix this, we:

- move inodes completely to the new (split) realm so that i_snap_realm
is correct, and
- generate the new snapc's _before_ queueing the cap_snaps in
ceph_update_snap_trace().

Signed-off-by: Sage Weil

Sage Weil
2010-09-17 07:26:51 +0800

15 Sep, 2010

2 commits

cfc0bf664 ceph: stop sending FLUSHSNAPs when we hit a dirty capsnap

Stop sending FLUSHSNAP messages when we hit a capsnap that has dirty_pages
or is still writing. We'll send the newer capsnaps only after the older
ones complete.

Signed-off-by: Sage Weil

Sage Weil
2010-09-15 06:50:59 +0800
8bef9239e ceph: correctly set 'follows' in flushsnap messages

The 'follows' should match the seq for the snap context for the given snap
cap, which is the context under which we have been dirtying and writing
data and metadata. The snapshot that _contains_ those updates thus
_follows_ that context's seq #.

Signed-off-by: Sage Weil

Sage Weil
2010-09-15 06:45:44 +0800

14 Sep, 2010

1 commit

467c52510 ceph: fix dn offset during readdir_prepopulate

When adding the readdir results to the cache, ceph_set_dentry_offset was
clobbering our just-set offset. This can cause the readdir result offsets
to get out of sync with the server. Add an argument to the helper so
that it does not.

This bug was introduced by 1cd3935bedccf592d44343890251452a6dd74fc4.

Signed-off-by: Sage Weil

Sage Weil
2010-09-14 02:40:36 +0800

12 Sep, 2010

4 commits

a77d9f7dc ceph: fix file offset wrapping at 4GB on 32-bit archs

Cast the value before shifting so that we don't run out of bits with a
32-bit unsigned long. This fixes wrapping of high file offsets into the
low 4GB of a file on disk, and the subsequent data corruption for large
files.

Signed-off-by: Sage Weil

Sage Weil
2010-09-12 01:55:25 +0800
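
To illustrate the cast-before-shift point in a77d9f7dc above, here is a small sketch. It assumes 4 KiB pages and uses uint32_t to stand in for unsigned long on a 32-bit architecture; the function names and `SKETCH_PAGE_SHIFT` are made up and are not the kernel's.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Stand-in for a 32-bit unsigned long page index. */
typedef uint32_t sketch_ulong32;

/*
 * Buggy form: the shift is done in 32 bits, so the high bits are
 * dropped before the result is widened to 64 bits.
 */
uint64_t sketch_offset_wrapping(sketch_ulong32 index)
{
    return (uint64_t)(sketch_ulong32)(index << SKETCH_PAGE_SHIFT);
}

/* Fixed form: widen to 64 bits first, then shift. */
uint64_t sketch_offset_fixed(sketch_ulong32 index)
{
    return (uint64_t)index << SKETCH_PAGE_SHIFT;
}
```

With page index 0x100001 (just past the 4GB mark), the wrapping form yields offset 0x1000, which lands in the low 4GB of the file, while the fixed form yields 0x100001000.
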
3612abbd5 ceph: fix reconnect encoding for old servers

Fix the reconnect encoding to encode the cap record when the MDS does not
have the FLOCK capability (i.e., pre v0.22).

Signed-off-by: Sage Weil

Sage Weil
2010-09-12 01:52:47 +0800
3d4401d9d ceph: fix pagelist kunmap tail

A wrong parameter was passed to the kunmap.

Signed-off-by: Yehuda Sadeh
Signed-off-by: Sage Weil

Yehuda Sadeh
2010-09-12 01:52:47 +0800
ca04d9c3e ceph: fix null pointer deref on anon root dentry release

When we release a root dentry, particularly after a splice, the parent
(actually our) inode was evaluating to NULL and was getting dereferenced
by ceph_snap(). This is reproduced by something as simple as

mount -t ceph monhost:/a/b mnt
mount -t ceph monhost:/a mnt2
ls mnt2

A splice_dentry() would kill the old 'b' inode's root dentry, and we'd
crash while releasing it.

Fix by checking for both the ROOT and NULL cases explicitly. We only need
to invalidate the parent dir when we have a correct parent to invalidate.

Signed-off-by: Sage Weil

Sage Weil
2010-09-12 01:52:47 +0800

27 Aug, 2010

3 commits

b545787db ceph: fix get_ticket_handler() error handling

get_ticket_handler() returns a valid pointer or it returns
ERR_PTR(-ENOMEM) if kzalloc() fails.

Signed-off-by: Dan Carpenter
Signed-off-by: Sage Weil

Dan Carpenter
2010-08-27 00:26:50 +0800
e072f8aa3 ceph: don't BUG on ENOMEM during mds reconnect

We are in a position to return an error; do that instead.

Signed-off-by: Sage Weil

Sage Weil
2010-08-27 00:26:37 +0800
f44c3890d ceph: ceph_mdsc_build_path() returns an ERR_PTR

ceph_mdsc_build_path() returns an ERR_PTR but this code is set up to
handle NULL returns.

Signed-off-by: Dan Carpenter
Signed-off-by: Sage Weil

Dan Carpenter
2010-08-27 00:24:28 +0800
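
The ERR_PTR fixes above (b545787db, f44c3890d) share one pattern: a function that reports failure through an encoded error pointer must be checked with IS_ERR(), because a NULL check never fires. Below is a userspace sketch of that pattern. The `sketch_*` helpers imitate the kernel's ERR_PTR/IS_ERR/PTR_ERR but are re-implemented here, and `sketch_build_path` is a hypothetical stand-in, not ceph_mdsc_build_path().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Errors are encoded in the top 4095 values of the address space. */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

static inline long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

static inline int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical path builder: never returns NULL, only a path or an error. */
char *sketch_build_path(const char *name, size_t max_len)
{
    size_t len = strlen(name);
    char *path;

    if (len >= max_len)
        return sketch_err_ptr(-ENAMETOOLONG);
    path = malloc(len + 1);
    if (path == NULL)
        return sketch_err_ptr(-ENOMEM);
    memcpy(path, name, len + 1);
    return path;
}

/* Caller: test with sketch_is_err(), not against NULL. */
int sketch_use_path(const char *name, size_t max_len)
{
    char *path = sketch_build_path(name, max_len);

    if (sketch_is_err(path))
        return (int)sketch_ptr_err(path);
    free(path);
    return 0;
}
```
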

26 Aug, 2010

2 commits

ad8453ab0 ceph: Fix warnings

Just scrubbing some warnings so I can see real problem ones in the build
noise. For 32bit we need to coax gcc politely into believing we really
honestly intend to do the casts. Using (u64)(unsigned long) means we cast
from a pointer to a type of the right size and then extend it. This stops
the warning spew.

Signed-off-by: Alan Cox
Signed-off-by: Sage Weil

Alan Cox
2010-08-26 03:02:14 +0800
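
The two-step cast from ad8453ab0 above, sketched for userspace. The kernel uses (u64)(unsigned long); this sketch substitutes uintptr_t, which is the portable pointer-sized integer outside the kernel. The function names are made up.

```c
#include <stdint.h>

/*
 * Convert a pointer to a 64-bit value in two steps: first to an
 * integer of pointer size, then widen. A direct pointer-to-u64 cast
 * draws a size-mismatch warning from gcc on 32-bit targets.
 */
uint64_t sketch_ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Reverse direction: narrow to pointer size, then cast to a pointer. */
void *sketch_u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```
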
ac1f12ef5 ceph: ceph_get_inode() returns an ERR_PTR

ceph_get_inode() returns an ERR_PTR and it doesn't return a NULL.

Signed-off-by: Dan Carpenter
Signed-off-by: Sage Weil

Dan Carpenter
2010-08-26 03:01:54 +0800

25 Aug, 2010

2 commits

36e21687e ceph: initialize fields on new dentry_infos

Signed-off-by: Sage Weil

Sage Weil
2010-08-25 07:24:19 +0800
7d8cb26d7 ceph: maintain i_head_snapc when any caps are dirty, not just for data

We used to use i_head_snapc to keep track of which snapc the current epoch
of dirty data was dirtied under. It is used by queue_cap_snap to set up
the cap_snap. However, since we queue cap snaps for any dirty caps, not
just for dirty file data, we need to keep a valid i_head_snapc anytime
we have dirty|flushing caps. This fixes a NULL pointer deref in
queue_cap_snap when writing back dirty caps without data (e.g.,
snaptest-authwb.sh).

Signed-off-by: Sage Weil

Sage Weil
2010-08-25 07:24:18 +0800

23 Aug, 2010

8 commits

07a27e226 ceph: fix osd request lru adjustment when sending request

Fix argument order. We want to move the item to the end of the list, not
change the position of the head.

Signed-off-by: Henry C Chang
Signed-off-by: Sage Weil

Henry C Chang
2010-08-23 12:34:27 +0800
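
Why the argument order matters in 07a27e226 above (and in 936aeb5c4), shown with a tiny re-implementation of a circular doubly linked list. The `sketch_list_*` names are made up; they mirror the (item, head) argument order of the kernel's list_add_tail() and list_move_tail() but are not the kernel code.

```c
/* Minimal circular doubly linked list node, in the style of list_head. */
struct sketch_list {
    struct sketch_list *next, *prev;
};

void sketch_list_init(struct sketch_list *head)
{
    head->next = head->prev = head;
}

/* Unlink an entry from whatever list it is on. */
static void sketch_list_del(struct sketch_list *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* Insert item just before head, i.e. at the tail of the list. */
void sketch_list_add_tail(struct sketch_list *item, struct sketch_list *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/*
 * Move item to the tail of the list anchored at head. The first
 * argument is the entry being moved. Swapping the two arguments
 * would instead unlink the head and reinsert it before the item,
 * changing the position of the head.
 */
void sketch_list_move_tail(struct sketch_list *item, struct sketch_list *head)
{
    sketch_list_del(item);
    sketch_list_add_tail(item, head);
}
```
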
124514918 ceph: don't improperly set dir complete when holding EXCL cap

If we hold the EXCL cap, we cannot trust the dir stats from the MDS (num
files, subdirs) and must not incorrectly conclude that the directory is
empty. If we do, we can get bad results from lookup (bad ENOENT) and
bad readdir results.

Signed-off-by: Sage Weil

Sage Weil
2010-08-23 12:33:32 +0800
679ceace8 mm: exporting account_page_dirty

This allows code outside of the mm core to safely manipulate page state
and not worry about the other accounting. Not using these routines means
that some code will lose track of the accounting and we get bugs. This
has happened once already.

Signed-off-by: Michael Rubin
Signed-off-by: Sage Weil

Michael Rubin
2010-08-23 06:16:51 +0800
eb6bb1c5b ceph: direct requests in snapped namespace based on nonsnap parent

When making a request in the virtual snapdir or a snapped portion of the
namespace, we should choose the MDS based on the first nonsnap parent (and
its caps). If that is not the best place, we will get forward hints to
find the right MDS in the cluster. This fixes ESTALE errors when using
the .snap directory and namespace with multiple MDSs.

Signed-off-by: Sage Weil

Sage Weil
2010-08-23 06:16:48 +0800
ed3260444 ceph: queue cap snap writeback for realm children on snap update

When a realm is updated, we need to queue writeback on inodes in that
realm _and_ its children. Otherwise, if the inode gets cowed on the
server, we can get a hang later due to out-of-sync cap/snap state.

Signed-off-by: Sage Weil

Sage Weil
2010-08-23 06:16:47 +0800
4a625be47 ceph: include dirty xattrs state in snapped caps

When we snapshot dirty metadata that needs to be written back to the MDS,
include dirty xattr metadata. Make the capsnap reference the encoded
xattr blob so that it will be written back in the FLUSHSNAP op.

Also fix the capsnap creation guard to include dirty auth or file bits,
not just tests specific to dirty file data or file writes in progress
(this fixes auth metadata writeback).

Signed-off-by: Sage Weil

Sage Weil
2010-08-23 06:16:46 +0800
082afec92 ceph: fix xattr cap writeback

We should include the xattr metadata blob in the cap update message any
time we are flushing dirty state, NOT just when we are also dropping the
cap. This fixes async xattr writeback.

Also, clean up the code slightly to avoid duplicating the bit test.

Signed-off-by: Sage Weil

Sage Weil
2010-08-23 06:16:41 +0800
f3c60c591 ceph: fix multiple mds session shutdown

The use of a completion when waiting for session shutdown during umount is
inappropriate, given the complexity of the condition. For multiple MDS's,
this resulted in the umount thread spinning, often preventing the session
close message from being processed.

Switch to a waitqueue and define a condition helper. This cleans things
up nicely.

Signed-off-by: Sage Weil

Sage Weil
2010-08-23 06:04:43 +0800

11 Aug, 2010

1 commit

e56fa10e9 ceph: generalize mon requests, add pool op support

Generalize the current statfs synchronous requests, and support pool_ops.

Signed-off-by: Yehuda Sadeh
Signed-off-by: Sage Weil

Yehuda Sadeh
2010-08-11 05:41:25 +0800

06 Aug, 2010

1 commit

0eb6cd49f ceph: only queue async writeback on cap revocation if there is dirty data

Normally, if the Fb cap bit is being revoked, we queue an async writeback.
If there is no dirty data but we still hold the cap, this leaves the
client sitting around doing nothing until the cap timeouts expire and the
cap is released on its own (as it would have been without the revocation).

Instead, only queue writeback if the bit is actually used (i.e., we have
dirty data). If not, we can reply to the revocation immediately.

Signed-off-by: Sage Weil

Sage Weil
2010-08-06 04:53:40 +0800
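
The decision described in 0eb6cd49f above, reduced to a sketch. `SKETCH_CAP_FB`, the enum, and `sketch_handle_revoke` are hypothetical names, and the bit value is arbitrary; it does not match the kernel's definition of the Fb cap bit.

```c
#include <stdbool.h>

/* Hypothetical bit standing in for the Fb (file buffer) cap. */
#define SKETCH_CAP_FB (1u << 0)

enum sketch_revoke_action {
    SKETCH_REPLY_NOW,        /* nothing to flush, ack the revocation */
    SKETCH_QUEUE_WRITEBACK   /* flush dirty data first, reply later */
};

/*
 * Decide what to do when the MDS revokes caps. Writeback is only
 * queued when the Fb bit is being revoked and is actually in use,
 * meaning we have dirty data. Otherwise we can reply immediately
 * instead of waiting for cap timeouts to expire.
 */
enum sketch_revoke_action sketch_handle_revoke(unsigned revoking,
                                               bool have_dirty_data)
{
    if ((revoking & SKETCH_CAP_FB) && have_dirty_data)
        return SKETCH_QUEUE_WRITEBACK;
    return SKETCH_REPLY_NOW;
}
```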

04 Aug, 2010

3 commits

e9d177443 ceph: do not ignore osd_idle_ttl mount option

Actually apply the mount option to the mount_args struct.

Signed-off-by: Sage Weil

Sage Weil
2010-08-04 03:56:57 +0800
52dfb8ac0 ceph: constify dentry_operations

This makes checkpatch happy.

Signed-off-by: Sage Weil

Sage Weil
2010-08-04 01:25:30 +0800
213c99ee0 ceph: whitespace cleanup

Signed-off-by: Sage Weil

Sage Weil
2010-08-04 01:25:11 +0800

03 Aug, 2010

3 commits

40819f6fb ceph: add flock/fcntl lock support

Implement flock inode operation to support advisory file locking. All
lock/unlock operations are synchronous with the MDS. Lock state is
sent when reconnecting to a recovering MDS to restore the shared lock
state.

Signed-off-by: Greg Farnum
Signed-off-by: Sage Weil

Greg Farnum
2010-08-03 07:10:53 +0800
fbaad9797 ceph: define on-wire types, constants for file locking support

Define the MDS operations and data types for doing file advisory locking
with the MDS.

Signed-off-by: Greg Farnum
Signed-off-by: Sage Weil

Greg Farnum
2010-08-03 06:48:54 +0800
c6f3fdc59 ceph: add CEPH_FEATURE_FLOCK to the supported feature bits

This informs the server that we will accept v2 client_caps format and v2
client_reconnect format messages.

Signed-off-by: Greg Farnum
Signed-off-by: Sage Weil

Greg Farnum
2010-08-03 06:48:51 +0800