Eric Lee / smarc-fsl-linux-kernel

04 Jan, 2012

1 commit

5706b27de ceph: propagate umode_t

Signed-off-by: Al Viro

Al Viro
2012-01-04 11:55:16 +0800

08 Dec, 2011

1 commit

be655596b ceph: use i_ceph_lock instead of i_lock

We have been using i_lock to protect all kinds of data structures in the
ceph_inode_info struct, including lists of inodes that we need to iterate
over while avoiding races with inode destruction. That requires grabbing
a reference to the inode while holding the list lock, but igrab() now
takes i_lock to check the inode flags.

Changing the list lock ordering would be a painful process.

However, using a ceph-specific i_ceph_lock in the ceph inode instead of
i_lock is a simple mechanical change and avoids the ordering constraints
imposed by igrab().

Reported-by: Amon Ott
Signed-off-by: Sage Weil

Sage Weil
2011-12-08 02:46:44 +0800
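
The lock ordering described in be655596b can be illustrated with a small user-space model. This is only a sketch under assumptions, not the kernel code: model_inode, model_igrab(), model_mark_dirty(), model_grab_first_dirty(), list_lock and dirty_list are hypothetical names, and pthread mutexes stand in for kernel spinlocks. Filesystem-private state is updated with the order i_ceph_lock -> list_lock, while the list walk uses list_lock -> i_lock inside the igrab-like helper, so the two orderings never form a cycle.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode with a generic and a private lock. */
struct model_inode {
    pthread_mutex_t i_lock;      /* generic lock, taken by model_igrab() */
    pthread_mutex_t i_ceph_lock; /* filesystem-private lock */
    int refcount;                /* protected by i_lock */
    int freeing;                 /* protected by i_lock */
    int dirty_caps;              /* protected by i_ceph_lock */
    struct model_inode *next;    /* protected by list_lock */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct model_inode *dirty_list;

/* Model of igrab(): takes i_lock to check the inode state. */
static struct model_inode *model_igrab(struct model_inode *inode)
{
    struct model_inode *ret = NULL;

    assert(inode != NULL);
    pthread_mutex_lock(&inode->i_lock);
    if (!inode->freeing) {
        inode->refcount++;
        ret = inode;
    }
    pthread_mutex_unlock(&inode->i_lock);
    return ret;
}

/* Lock order: i_ceph_lock -> list_lock. */
static void model_mark_dirty(struct model_inode *inode)
{
    pthread_mutex_lock(&inode->i_ceph_lock);
    if (!inode->dirty_caps) {
        inode->dirty_caps = 1;
        pthread_mutex_lock(&list_lock);
        inode->next = dirty_list;
        dirty_list = inode;
        pthread_mutex_unlock(&list_lock);
    }
    pthread_mutex_unlock(&inode->i_ceph_lock);
}

/* Lock order: list_lock -> i_lock (inside model_igrab()). */
static struct model_inode *model_grab_first_dirty(void)
{
    struct model_inode *inode, *ret = NULL;

    pthread_mutex_lock(&list_lock);
    for (inode = dirty_list; inode && !ret; inode = inode->next)
        ret = model_igrab(inode);
    pthread_mutex_unlock(&list_lock);
    return ret;
}
```

Had model_mark_dirty() used i_lock instead of the private lock, the two functions would take i_lock and list_lock in opposite orders, which is the constraint the commit message says it wants to avoid.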

06 Nov, 2011

1 commit

c6ffe1001 ceph: use new D_COMPLETE dentry flag

We used to use a flag on the directory inode to track whether the dcache
contents for a directory were a complete cached copy. Switch to a dentry
flag CEPH_D_COMPLETE that is safely updated by ->d_prune().

Signed-off-by: Sage Weil

Sage Weil
2011-11-06 12:10:10 +0800

02 Nov, 2011

1 commit

bfe868486 filesystems: add set_nlink()

Replace remaining direct i_nlink updates with a new set_nlink()
updater function.

Signed-off-by: Miklos Szeredi
Tested-by: Toshiyuki Okajima
Signed-off-by: Christoph Hellwig

Miklos Szeredi
2011-11-02 19:53:43 +0800

26 Oct, 2011

1 commit

b61c27636 libceph: don't complain on msgpool alloc failures

The pool allocation failures are masked by the pool; there is no need to
spam the console about them. (That's the whole point of having the pool
in the first place.)

Mark msg allocations whose failure is safely handled as such.

Signed-off-by: Sage Weil

Sage Weil
2011-10-26 07:10:15 +0800

21 Jul, 2011

1 commit

02c24a821 fs: push i_mutex and filemap_write_and_wait down into ->fsync() handlers

Btrfs needs to be able to control how filemap_write_and_wait_range() is called
in fsync to make it less of a painful operation, so push down taking i_mutex and
the calling of filemap_write_and_wait() down into the ->fsync() handlers. Some
file systems can drop taking the i_mutex altogether it seems, like ext3 and
ocfs2. For correctness' sake I just pushed everything down in all cases to make
sure that we keep the current behavior the same for everybody, and then each
individual fs maintainer can make up their mind about what to do from there.
Thanks,

Acked-by: Jan Kara
Signed-off-by: Josef Bacik
Signed-off-by: Al Viro

Josef Bacik
2011-07-21 08:47:59 +0800

08 Jun, 2011

1 commit

70b666c3b ceph: use ihold when we already have an inode ref

We should use ihold whenever we already have a stable inode ref, even
when we aren't holding i_lock. This avoids adding new and unnecessary
locking dependencies.

Signed-off-by: Sage Weil

Sage Weil
2011-06-08 12:34:11 +0800

25 May, 2011

1 commit

db3540522 ceph: fix cap flush race reentrancy

In e9964c10 we changed cap flushing to do a delicate dance because some
inodes on the cap_dirty list could be in a migrating state (got EXPORT but
not IMPORT) in which we couldn't actually flush and move from
dirty->flushing, breaking the while (!empty) { process first } loop
structure. It worked for a single sync thread, but was not reentrant and
triggered infinite loops when multiple syncers came along.

Instead, move inodes with dirty caps to a separate cap_dirty_migrating list
when in the limbo export-but-no-import state, allowing us to go back to
the simple loop structure (which was reentrant). This is cleaner and more
robust.

Audited the cap_dirty users and this looks fine:
list_empty(&ci->i_dirty_item) is still a reliable indicator of whether we
have dirty caps (which list we're on is irrelevant) and list_del_init()
calls still do the right thing.

Signed-off-by: Sage Weil

Sage Weil
2011-05-25 02:52:12 +0800
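
The two-list scheme from db3540522 can be sketched in user-space C. This is a simplified model under assumptions, not the kernel implementation: the list helpers imitate the kernel's list_head API, and model_inode, model_mark_dirty(), model_has_dirty_caps() and model_sync() are hypothetical names. Inodes in the export-but-no-import state go on cap_dirty_migrating, so the plain "while not empty, process first" loop over cap_dirty always makes progress.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, in the style of list_head. */
struct list_node { struct list_node *prev, *next; };

static void list_init(struct list_node *n) { n->prev = n->next = n; }
static int list_is_empty(const struct list_node *h) { return h->next == h; }

static void list_del_init_node(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

static void list_add_tail_node(struct list_node *n, struct list_node *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

struct model_inode {
    struct list_node dirty_item; /* first member, so a cast recovers the inode */
    int migrating;               /* got EXPORT but not yet IMPORT */
    int flushed;
};

/* Both heads must be set up with model_lists_init() before use. */
static struct list_node cap_dirty, cap_dirty_migrating;

static void model_lists_init(void)
{
    list_init(&cap_dirty);
    list_init(&cap_dirty_migrating);
}

static void model_inode_init(struct model_inode *ci, int migrating)
{
    list_init(&ci->dirty_item);
    ci->migrating = migrating;
    ci->flushed = 0;
}

/* Queue the inode on whichever list matches its state. */
static void model_mark_dirty(struct model_inode *ci)
{
    assert(ci != NULL);
    list_del_init_node(&ci->dirty_item);
    list_add_tail_node(&ci->dirty_item,
                       ci->migrating ? &cap_dirty_migrating : &cap_dirty);
}

/* Being on either list means "has dirty caps"; which list is irrelevant. */
static int model_has_dirty_caps(const struct model_inode *ci)
{
    return !list_is_empty(&ci->dirty_item);
}

/* Simple loop: every pass removes the first entry, so it terminates. */
static int model_sync(void)
{
    int flushed = 0;

    while (!list_is_empty(&cap_dirty)) {
        struct model_inode *ci = (struct model_inode *)cap_dirty.next;

        list_del_init_node(&ci->dirty_item);
        ci->flushed = 1;
        flushed++;
    }
    return flushed;
}
```

In this model a second caller of model_sync() simply finds cap_dirty empty and returns, which mirrors the reentrancy property the commit message describes.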

20 May, 2011

1 commit

3540303f8 ceph: fix rare potential cap leak

If we grab new_cap, retake the lock, and find we already have a cap now
for the given mds, release new_cap.

Signed-off-by: Sage Weil

Sage Weil
2011-05-20 02:25:03 +0800
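
The fix in 3540303f8 follows a common pattern: allocate with the lock dropped, retake the lock, recheck, and release the spare object if someone else got there first. A minimal single-threaded sketch, assuming hypothetical names (model_cap, model_inode, model_cap_alloc(), model_cap_free(), model_install_cap()) and with the lock indicated only by comments:

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_MAX_MDS 4

struct model_cap { int mds; };

struct model_inode {
    struct model_cap *caps[MODEL_MAX_MDS]; /* one cap slot per MDS */
};

static int model_caps_allocated; /* live allocations, to make leaks visible */

/* Allocation happens with the (modelled) lock dropped. */
static struct model_cap *model_cap_alloc(int mds)
{
    struct model_cap *cap = malloc(sizeof(*cap));

    if (!cap)
        return NULL;
    cap->mds = mds;
    model_caps_allocated++;
    return cap;
}

static void model_cap_free(struct model_cap *cap)
{
    if (cap) {
        model_caps_allocated--;
        free(cap);
    }
}

/*
 * Runs with the (modelled) lock retaken. If a cap for this MDS appeared
 * while the lock was dropped, release new_cap instead of leaking it.
 */
static struct model_cap *model_install_cap(struct model_inode *ci, int mds,
                                           struct model_cap *new_cap)
{
    assert(mds >= 0 && mds < MODEL_MAX_MDS);
    if (ci->caps[mds]) {
        model_cap_free(new_cap);
        return ci->caps[mds];
    }
    ci->caps[mds] = new_cap;
    return new_cap;
}
```

Calling model_install_cap() twice for the same MDS with two separately allocated caps leaves exactly one live allocation in this model.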

12 May, 2011

1 commit

d3d0720d4 ceph: do not use i_wrbuffer_ref as refcount for Fb cap

We increment i_wrbuffer_ref when taking the Fb cap. This breaks
the dirty page accounting and causes looping in
__ceph_do_pending_vmtruncate, and the ceph client hangs.

This bug can be reproduced occasionally by running blogbench.

Add a new field i_wb_ref to inode and dedicate it to Fb reference
counting.

Signed-off-by: Henry C Chang
Signed-off-by: Sage Weil

Henry C Chang
2011-05-12 01:44:48 +0800

05 May, 2011

1 commit

fca65b4ad ceph: do not call __mark_dirty_inode under i_lock

The __mark_dirty_inode helper now takes i_lock as of 250df6ed. Fix the
one ceph caller that held i_lock (__ceph_mark_dirty_caps) to return the
flags value so that the callers can do it outside of i_lock.

Signed-off-by: Sage Weil

Sage Weil
2011-05-05 03:56:45 +0800

04 May, 2011

1 commit

3772d26d8 ceph: use ihold() when i_lock is held

See 0444d76ae64fffc7851797fc1b6ebdbb44ac504a.

Signed-off-by: Sage Weil

Sage Weil
2011-05-04 00:28:08 +0800

31 Mar, 2011

1 commit

25985edce Fix common misspellings

Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi

Lucas De Marchi
2011-03-31 22:26:23 +0800

20 Jan, 2011

3 commits

7e57b81c7 ceph: avoid immediate cap check after import

The NODELAY flag avoids the heuristics that delay cap (issued/wanted)
release. There's no reason for that after we import a cap, and it kills
whatever benefit we get from those delays.

Signed-off-by: Sage Weil

Sage Weil
2011-01-20 01:23:26 +0800
088b3f5e9 ceph: fix flushing of caps vs cap import

If we are mid-flush and a cap is migrated to another node, we need to
resend the cap flush message to the new MDS, and do so with the original
flush_seq to avoid leaking across a sync boundary. Previously we didn't
redo the flush (we only flushed newly dirty data), which would cause a
later sync to hang forever.

Signed-off-by: Sage Weil

Sage Weil
2011-01-20 01:23:25 +0800
24be0c481 ceph: fix erroneous cap flush to non-auth mds

The int flushing is global and not cleared on each iteration of the loop,
which can cause a second flush of caps to any MDSs with ids greater than
the auth.

Signed-off-by: Sage Weil

Sage Weil
2011-01-20 01:23:24 +0800
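
The bug fixed by 24be0c481 is a loop-carried variable that is never reset. The sketch below is a deliberately reduced model, not the kernel loop: model_check_caps() and MODEL_NUM_MDS are hypothetical, and the clear_each_iteration parameter exists only to show the behaviour before and after the fix side by side. The return value is a bitmask of the MDS ids that would be sent a flush.

```c
#include <assert.h>

#define MODEL_NUM_MDS 4

/*
 * Walk the MDS ids in order. Only the auth MDS should be sent the dirty
 * bits; "flushing" must therefore be cleared at the top of each iteration.
 */
static unsigned model_check_caps(int auth_mds, int dirty,
                                 int clear_each_iteration)
{
    unsigned sent = 0;
    int flushing = 0; /* declared outside the loop, as in the bug report */
    int mds;

    assert(auth_mds >= 0 && auth_mds < MODEL_NUM_MDS);
    for (mds = 0; mds < MODEL_NUM_MDS; mds++) {
        if (clear_each_iteration)
            flushing = 0; /* the fix */
        if (mds == auth_mds && dirty)
            flushing = dirty;
        if (flushing)
            sent |= 1u << mds; /* a flush would go to this MDS */
    }
    return sent;
}
```

With auth_mds set to 1, the uncleared variant also marks MDS 2 and 3, that is, every id greater than the auth, while the cleared variant marks only MDS 1.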

08 Nov, 2010

2 commits

cd045cb42 ceph: fix rdcache_gen usage and invalidate

We used to use rdcache_gen to indicate whether we "might" have cached
pages. Now we just look at the mapping to determine that. However, some
old behavior remains from that transition.

First, rdcache_gen == 0 no longer means we have no pages. That can happen
at any time (presumably when we carry FILE_CACHE). We should not reset it
to zero, and we should not check that it is zero.

That means that the only purpose for rdcache_revoking is to resolve races
between new issues of FILE_CACHE and an async invalidate. If they are
equal, we should invalidate. On success, we decrement rdcache_revoking,
so that it is no longer equal to rdcache_gen. Similarly, if we succeed
in doing a sync invalidate, set revoking = gen - 1. (This is a small
optimization to avoid doing unnecessary invalidate work and does not
affect correctness.)

Signed-off-by: Sage Weil

Sage Weil
2010-11-08 23:29:05 +0800
feb4cc9bb ceph: re-request max_size if cap auth changes

If the auth cap migrates to another MDS, clear requested_max_size so that
we resend any pending max_size increase requests. This fixes potential
hangs on writes that extend a file and race with a cap migration between
MDSs.

Signed-off-by: Sage Weil

Sage Weil
2010-11-08 01:39:23 +0800

28 Oct, 2010

1 commit

2f56f56ad Revert "ceph: update issue_seq on cap grant"

This reverts commit d91f2438d881514e4a923fd786dbd94b764a9440.

The intent of issue_seq is to distinguish between mds->client messages that
(re)create the cap and those that do not, which means we should _only_ be
updating that value in the create paths. By updating it in handle_cap_grant,
we reset it to zero, which then breaks release.

The larger question is what workload/problem made me think it should be
updated here...

Signed-off-by: Sage Weil

Sage Weil
2010-10-28 12:05:54 +0800

21 Oct, 2010

3 commits

18a38193e ceph: use mapping->nrpages to determine if mapping is empty

This is simpler and faster.

Signed-off-by: Sage Weil

Sage Weil
2010-10-21 06:38:15 +0800
93afd449a ceph: only invalidate on check_caps if we actually have pages

The i_rdcache_gen value only implies we MAY have cached pages; actually
check the mapping to see if it's worth bothering with an invalidate.

Signed-off-by: Sage Weil

Sage Weil
2010-10-21 06:38:15 +0800
3d14c5d2b ceph: factor out libceph from Ceph file system

This factors out protocol and low-level storage parts of ceph into a
separate libceph module living in net/ceph and include/linux/ceph. This
is mostly a matter of moving files around. However, a few key pieces
of the interface change as well:

- ceph_client becomes ceph_fs_client and ceph_client, where the latter
captures the mon and osd clients, and the fs_client gets the mds client
and file system specific pieces.
- Mount option parsing and debugfs setup is correspondingly broken into
two pieces.
- The mon client gets a generic handler callback for otherwise unknown
messages (mds map, in this case).
- The basic supported/required feature bits can be expanded (and are by
ceph_fs_client).

No functional change, aside from some subtle error handling cases that got
cleaned up in the refactoring process.

Signed-off-by: Sage Weil

Yehuda Sadeh
2010-10-21 06:37:28 +0800

07 Oct, 2010

2 commits

d91f2438d ceph: update issue_seq on cap grant

We need to update the issue_seq on any grant operation, be it via an MDS
reply or a separate grant message. The update in the grant path was
missing. This broke cap release for inodes in which the MDS sent an
explicit grant message that was not soon followed by a successful
MDS reply on the same inode.

Also fix the signedness on seq locals.

Signed-off-by: Sage Weil

Sage Weil
2010-10-07 23:01:50 +0800
21b559de5 ceph: send cap release message early on failed revoke.

If an MDS tries to revoke caps that we don't have, we want to send
releases early since they probably contain the caps message the MDS
is looking for.

Previously, we only sent the messages if we didn't have the inode either. But
in a multi-mds system we can retain the inode after dropping all caps for
a single MDS.

Signed-off-by: Greg Farnum
Signed-off-by: Sage Weil

Greg Farnum
2010-10-07 23:00:24 +0800

18 Sep, 2010

1 commit

a43fb7310 ceph: check mapping to determine if FILE_CACHE cap is used

See if the i_data mapping has any pages to determine if the FILE_CACHE
capability is currently in use, instead of assuming it is any time the
rdcache_gen value is set (i.e., issued -> used).

This allows the MDS RECALL_STATE process to work for inodes that have cached
pages.

Signed-off-by: Sage Weil

Sage Weil
2010-09-18 00:54:31 +0800

17 Sep, 2010

1 commit

e835124c2 ceph: only send one flushsnap per cap_snap per mds session

Sending multiple flushsnap messages is problematic because we ignore
the response if the tid doesn't match, and the server may only respond to
each one once. It's also a waste.

So, skip cap_snaps that are already on the flushing list, unless the caller
tells us to resend (because we are reconnecting).

Signed-off-by: Sage Weil

Sage Weil
2010-09-17 23:03:08 +0800
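
The rule introduced by e835124c2 (skip cap_snaps already on the flushing list unless the caller asks for a resend) can be sketched as follows. The names model_cap_snap, model_flush_snaps() and the again parameter are hypothetical, and the flushing list is reduced to a per-snap flag for brevity.

```c
#include <assert.h>
#include <stddef.h>

struct model_cap_snap {
    int on_flushing_list; /* a flushsnap was sent and awaits its ack */
    int sends;            /* how many flushsnap messages were sent */
};

/*
 * Send at most one flushsnap per cap_snap. When "again" is set (for
 * example after a reconnect), snaps already being flushed are resent.
 * Returns the number of messages sent by this call.
 */
static int model_flush_snaps(struct model_cap_snap *snaps, size_t n, int again)
{
    int sent = 0;
    size_t i;

    assert(snaps != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (snaps[i].on_flushing_list && !again)
            continue; /* already in flight: do not send a duplicate */
        snaps[i].on_flushing_list = 1;
        snaps[i].sends++;
        sent++;
    }
    return sent;
}
```

A second call with again set to 0 sends nothing in this model; only a call with again set to 1 repeats the messages.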

15 Sep, 2010

1 commit

cfc0bf664 ceph: stop sending FLUSHSNAPs when we hit a dirty capsnap

Stop sending FLUSHSNAP messages when we hit a capsnap that has dirty_pages
or is still writing. We'll send the newer capsnaps only after the older
ones complete.

Signed-off-by: Sage Weil

Sage Weil
2010-09-15 06:50:59 +0800

25 Aug, 2010

1 commit

7d8cb26d7 ceph: maintain i_head_snapc when any caps are dirty, not just for data

We used to use i_head_snapc to keep track of which snapc the current epoch
of dirty data was dirtied under. It is used by queue_cap_snap to set up
the cap_snap. However, since we queue cap snaps for any dirty caps, not
just for dirty file data, we need to keep a valid i_head_snapc anytime
we have dirty|flushing caps. This fixes a NULL pointer deref in
queue_cap_snap when writing back dirty caps without data (e.g.,
snaptest-authwb.sh).

Signed-off-by: Sage Weil

Sage Weil
2010-08-25 07:24:18 +0800

23 Aug, 2010

2 commits

4a625be47 ceph: include dirty xattrs state in snapped caps

When we snapshot dirty metadata that needs to be written back to the MDS,
include dirty xattr metadata. Make the capsnap reference the encoded
xattr blob so that it will be written back in the FLUSHSNAP op.

Also fix the capsnap creation guard to include dirty auth or file bits,
not just tests specific to dirty file data or file writes in progress
(this fixes auth metadata writeback).

Signed-off-by: Sage Weil

Sage Weil
2010-08-23 06:16:46 +0800
082afec92 ceph: fix xattr cap writeback

We should include the xattr metadata blob in the cap update message any
time we are flushing dirty state, NOT just when we are also dropping the
cap. This fixes async xattr writeback.

Also, clean up the code slightly to avoid duplicating the bit test.

Signed-off-by: Sage Weil

Sage Weil
2010-08-23 06:16:41 +0800

06 Aug, 2010

1 commit

0eb6cd49f ceph: only queue async writeback on cap revocation if there is dirty data

Normally, if the Fb cap bit is being revoked, we queue an async writeback.
If there is no dirty data but we still hold the cap, this leaves the
client sitting around doing nothing until the cap timeouts expire and the
cap is released on its own (as it would have been without the revocation).

Instead, only queue writeback if the bit is actually used (i.e., we have
dirty data). If not, we can reply to the revocation immediately.

Signed-off-by: Sage Weil

Sage Weil
2010-08-06 04:53:40 +0800

03 Aug, 2010

1 commit

ce1fbc8dd ceph: support v2 client_caps encoding

Add support for v2 encoding of MClientCaps, which includes a flock blob.

Signed-off-by: Sage Weil

Sage Weil
2010-08-03 06:48:49 +0800

02 Aug, 2010

8 commits

b8cd07e78 ceph: warn on missing snap realm

Well, this Shouldn't Happen, so it would be helpful to know the caller when
it does.

Signed-off-by: Sage Weil

Sage Weil
2010-08-02 11:11:42 +0800
2bc50259f ceph: add ceph_get_cap_for_mds function.

Signed-off-by: Greg Farnum
Signed-off-by: Sage Weil

Greg Farnum
2010-08-02 11:11:41 +0800
154f42c2c ceph: connect to export targets on cap export

When we get a cap EXPORT message, make sure we are connected to all export
targets to ensure we can handle the matching IMPORT.

Signed-off-by: Sage Weil

Sage Weil
2010-08-02 11:11:41 +0800
37151668b ceph: do caps accounting per mds_client

Caps related accounting is now being done per mds client instead
of just being global. This prepares the groundwork for a later revision
of the caps preallocated reservation list.

Signed-off-by: Yehuda Sadeh
Signed-off-by: Sage Weil

Yehuda Sadeh
2010-08-02 11:11:40 +0800
cd84db6e4 ceph: code cleanup

Mainly fixing minor issues reported by sparse.

Signed-off-by: Yehuda Sadeh
Signed-off-by: Sage Weil

Yehuda Sadeh
2010-08-02 11:11:40 +0800
ca81f3f6b ceph: skip if no auth cap in flush_snaps

If we have a capsnap but no auth cap (e.g. because it is migrating to
another mds), bail out and do nothing for now. Do NOT remove the capsnap
from the flush list.

Signed-off-by: Sage Weil

Sage Weil
2010-08-02 11:11:39 +0800
3b454c494 ceph: simplify caps revocation, fix for multimds

The caps revocation should either initiate writeback, invalidation, or
call check_caps to ack or do the dirty work. The primary question is
whether we can get away with only checking the auth cap or whether all
caps need to be checked.

The old code was doing...something else. At the very least, revocations
from non-auth MDSs could break by triggering the "check auth cap only"
case.

Signed-off-by: Sage Weil

Sage Weil
2010-08-02 11:11:39 +0800
ee6b272b9 ceph: drop unused argument

Signed-off-by: Sage Weil

Sage Weil
2010-08-02 11:11:39 +0800