Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

15 Oct, 2008

3 commits

0fcaa56a2 ocfs2: Simplify ocfs2_read_block()

More than 30 callers of ocfs2_read_block() pass exactly OCFS2_BH_CACHED.
Only six pass a different flag set. Rather than have every caller care,
let's make ocfs2_read_block() take no flags and always do a cached read.
The remaining six places can call ocfs2_read_blocks() directly.
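
A minimal sketch of the shape this describes, using stand-in types rather than the kernel's buffer_head API. The struct, the flag value, and the sketch_* names are assumptions made for illustration; they are not the real ocfs2 definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real ocfs2 types and flag values differ. */
#define SKETCH_BH_CACHED 0x1

struct sketch_block {
    unsigned long long blkno;
    int flags_used;   /* records which flags the read was issued with */
};

/* General entry point: the few callers needing other flags call this. */
static int sketch_read_blocks(unsigned long long blkno, int nr,
                              struct sketch_block *out, int flags)
{
    if (out == NULL || nr < 1)
        return -1;
    for (int i = 0; i < nr; i++) {
        out[i].blkno = blkno + (unsigned long long)i;
        out[i].flags_used = flags;
    }
    return 0;
}

/* Common case: no flags argument, always a cached single-block read. */
static int sketch_read_block(unsigned long long blkno,
                             struct sketch_block *out)
{
    return sketch_read_blocks(blkno, 1, out, SKETCH_BH_CACHED);
}
```

The point of the design is that the common path cannot pass a wrong flag, while the multi-block function stays available for the exceptions.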

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-15 02:51:57 +0800
31d33073c ocfs2: Require an inode for ocfs2_read_block(s)().

Now that synchronous readers are using ocfs2_read_blocks_sync(), all
callers of ocfs2_read_blocks() are passing an inode. Use it
unconditionally. Since it's there, we don't need to pass the
ocfs2_super either.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-15 02:43:29 +0800
da1e90985 ocfs2: Separate out sync reads from ocfs2_read_blocks()

The ocfs2_read_blocks() function currently handles sync reads, cached
reads, and sometimes-cached reads. We're going to add some
functionality to it, so first we should simplify it. The uncached,
synchronous reads are much easier to handle as a separate function, so we
introduce ocfs2_read_blocks_sync().

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-15 02:29:10 +0800

14 Oct, 2008

37 commits

936b88343 ocfs2: Refactor xattr list and remove ocfs2_xattr_handler().

According to Christoph Hellwig's advice, we really don't need
a ->list to handle one xattr's list. Just a map from index to
xattr prefix is enough. And I also refactor the old list method
with the reference from fs/xfs/linux-2.6/xfs_xattr.c and the
xattr list method in btrfs.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 08:02:45 +0800
2057e5c67 ocfs2: Calculate EA hash only by its suffix.

According to Christoph Hellwig's advice, the hash value of EA
is only calculated by its suffix.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 08:02:44 +0800
99219aea6 ocfs2: Move trusted and user attribute support into xattr.c

Per Christoph Hellwig's suggestion - don't split these up. It's not like we
gained much by having the two tiny files around.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 08:02:44 +0800
40daa16a3 ocfs2: Uninline ocfs2_xattr_name_hash()

This is too big to be inlined.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 08:02:44 +0800
a81cb88b6 ocfs2: Don't check for NULL before brelse()

This is pointless as brelse() already does the check.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 08:02:44 +0800
fd8351f83 ocfs2: use smaller counters in ocfs2_remove_xattr_clusters_from_cache

i and b_len don't really need to be u64's. Xattr extent lengths should be
limited by the VFS, and then the size of our on-disk length field.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 08:02:44 +0800
696b55d76 ocfs2: Documentation update for user_xattr / nouser_xattr mount options

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 08:02:44 +0800
4cc812458 ocfs2: make la_debug_mutex static

It can also be moved into ocfs2_la_debug_read().

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 08:02:44 +0800
009d37502 ocfs2: Remove pointless !!

ocfs2_stack_supports_plocks() doesn't need this to properly return a zero or
one value.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 08:02:44 +0800
5a0956119 ocfs2: Add empty bucket support in xattr.

As Mark mentioned, removing an empty xattr bucket may be time-consuming,
so this patch lets empty buckets exist during xattr operations. The
modifications include:
1. Remove the bucket and extent record deletion during xattr delete.
2. In xattr set:
   1) Don't clean the last entry, so that if the bucket is empty,
      the hash value of the bucket is the hash value of the entry
      which was deleted last.
   2) During insert, if we meet an empty bucket, just use the
      1st entry.
3. In the binary search of xattr buckets, use the bucket hash value (which
   is stored in the 1st xattr entry) to find the right place.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 08:02:43 +0800
06b240d8a ocfs2/xattr.c: Fix a bug when inserting xattr.

During the process of xattr insertion, we use binary search
to find the right place and "low" is set to it. But when
there is one xattr which has the same name hash as the inserted
one, low is the wrong value. So set it to the right position.
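
The general problem can be sketched as a binary search that still returns a valid insertion index when an equal hash is already present. This is an illustration over a plain sorted array, not the code from the patch; the function name is hypothetical, and placing the new entry after any equal hashes is a choice made for this sketch only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index at which a new entry with `hash` should be inserted
 * into `hashes`, which is sorted in ascending order. Entries with an
 * equal hash are kept before the new one.
 */
static size_t sketch_insert_pos(const uint32_t *hashes, size_t count,
                                uint32_t hash)
{
    size_t low = 0, high = count;   /* search the half-open range [low, high) */

    while (low < high) {
        size_t mid = low + (high - low) / 2;

        if (hashes[mid] <= hash)
            low = mid + 1;   /* equal hashes stay before the new entry */
        else
            high = mid;
    }
    return low;
}
```

Because the loop never exits early on an exact match, `low` ends up at a well-defined position in both the "found" and "not found" cases.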

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 08:02:43 +0800
b0f73cfc3 ocfs2: Add xattr mount option in ocfs2_show_options()

Patch adds check for [no]user_xattr in ocfs2_show_options() that completes
the list of all mount options.

Signed-off-by: Sunil Mushran
Signed-off-by: Mark Fasheh

Sunil Mushran
2008-10-14 08:02:43 +0800
2b4e30fbd ocfs2: Switch over to JBD2.

ocfs2 wants JBD2 for many reasons, not the least of which is that JBD is
limiting our maximum filesystem size.

It's a pretty trivial change. Most functions are just renamed. The
only functional change is moving to Jan's inode-based ordered data mode.
It's better, too.

Because JBD2 reads and writes JBD journals, this is compatible with any
existing filesystem. It can even interact with JBD-based ocfs2 as long
as the journal is formatted for JBD.

We provide a compatibility option so that paranoid people can still use
JBD for the time being. This will go away shortly.

[ Moved call of ocfs2_begin_ordered_truncate() from ocfs2_delete_inode() to
ocfs2_truncate_for_delete(). --Mark ]

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 08:02:43 +0800
12462f1d9 ocfs2: Add the 'inode64' mount option.

Now that ocfs2 limits inode numbers to 32bits, add a mount option to
disable the limit. This parallels XFS. 64bit systems can handle the
larger inode numbers.

[ Added description of inode64 mount option in ocfs2.txt. --Mark ]

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:08 +0800
1187c9688 ocfs2: Limit inode allocation to 32bits.

ocfs2 inode numbers are block numbers. For any filesystem with less
than 2^32 blocks, this is not a problem. However, when ocfs2 starts
using JBD2, it will be able to support filesystems with more than 2^32
blocks. This would result in inode numbers higher than 2^32.

The problem is that stat(2) can't handle those numbers on 32bit
machines. The simple solution is to have ocfs2 allocate all inodes
below that boundary.

The suballoc code is changed to honor an optional block limit. Only the
inode suballocator sets that limit - all other allocations stay unlimited.

The biggest trick is to grow the inode suballocator beneath that limit.
There's no point in allocating block groups that are above the limit,
then rejecting their elements later on. We want to prevent the inode
allocator from ever having block groups above the limit. This involves
a little gyration with the local alloc code. If the local alloc window
is above the limit, it signals the caller to try the global bitmap but
does not disable the local alloc file (which can be used for other
allocations).
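
The "optional block limit" idea can be sketched as follows. This is a simplified illustration, not the suballocator code: the sketch_* names are hypothetical, 0 is assumed to mean "no limit", and the candidate list stands in for whatever free blocks an allocator would consider.

```c
#include <stdint.h>

#define SKETCH_NO_LIMIT 0ULL   /* hypothetical: 0 means "no upper bound" */

/* Returns 1 if a block number may be handed out under the given limit. */
static int sketch_block_allowed(uint64_t blkno, uint64_t max_block)
{
    return max_block == SKETCH_NO_LIMIT || blkno <= max_block;
}

/*
 * Pick the first candidate block that honours the limit.
 * Returns 0 and stores the block in *out, or -1 if every candidate is
 * above the limit (a caller would then try a different allocator).
 */
static int sketch_pick_block(const uint64_t *candidates, int n,
                             uint64_t max_block, uint64_t *out)
{
    for (int i = 0; i < n; i++) {
        if (sketch_block_allowed(candidates[i], max_block)) {
            *out = candidates[i];
            return 0;
        }
    }
    return -1;
}
```

An inode allocation would pass UINT32_MAX as the limit, while every other allocation would pass SKETCH_NO_LIMIT and behave as before.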

[ Minor cleanup - removed an ML_NOTICE comment. --Mark ]

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:07 +0800
08413899d ocfs2: Resolve deadlock in ocfs2_xattr_free_block.

In ocfs2_xattr_free_block, we take a cluster lock on xb_alloc_inode while we
have a transaction open. This will deadlock the downconvert thread, so fix
it.

We can clean up how xattr blocks are removed while here - this patch also
moves the mechanism of releasing xattr block (including both value, xattr
tree and xattr block) into this function.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 07:57:06 +0800
28b8ca0b7 ocfs2: bug-fix for journal extend in xattr.

In ocfs2_extend_trans, when we can't extend the current
transaction, it will commit the current transaction and restart
a new one. So if the credits we previously allocated haven't been
used (the block isn't dirtied before our extend), we will not
have enough credits for any future operation (jbd will complain
and bug out). So check for this and re-extend.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 07:57:06 +0800
8d6220d6a ocfs2: Change ocfs2_get_*_extent_tree() to ocfs2_init_*_extent_tree()

The original get/put_extent_tree() functions held a reference on
et_root_bh. However, every single caller already has a safe reference,
making the get/put cycle irrelevant.

We change ocfs2_get_*_extent_tree() to ocfs2_init_*_extent_tree(). It
no longer gets a reference on et_root_bh. ocfs2_put_extent_tree() is
removed. Callers now have a simpler init+use pattern.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:05 +0800
1625f8ac1 ocfs2: Comment struct ocfs2_extent_tree_operations.

struct ocfs2_extent_tree_operations provides methods for the different
on-disk btrees in ocfs2. Describing what those methods do is probably a
good idea.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:05 +0800
f99b9b7cc ocfs2: Make ocfs2_extent_tree the first-class representation of a tree.

We now have three different kinds of extent trees in ocfs2: inode data
(dinode), extended attributes (xattr_tree), and extended attribute
values (xattr_value). There is a nice abstraction for them,
ocfs2_extent_tree, but it is hidden in alloc.c. All the calling
functions have to pick amongst a varied API and pass in type bits and
often extraneous pointers.

A better way is to make ocfs2_extent_tree a first-class object.
Everyone converts their object to an ocfs2_extent_tree via the
ocfs2_get_*_extent_tree() calls, then uses the ocfs2_extent_tree for all
tree calls to alloc.c.

This simplifies a lot of callers, improving readability. It also
provides an easy way to add additional extent tree types, as they only
need to be defined in alloc.c with an ocfs2_get_*_extent_tree()
function.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:05 +0800
1e61ee79e ocfs2: Add an insertion check to ocfs2_extent_tree_operations.

A couple places check an extent_tree for a valid inode. We move that
out to add an eo_insert_check() operation. It can be called from
ocfs2_insert_extent() and elsewhere.

We also have the wrapper calls ocfs2_et_insert_check() and
ocfs2_et_sanity_check() ignore NULL ops. That way we don't have to
provide useless operations for xattr types.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:05 +0800
1a09f556e ocfs2: Create specific get_extent_tree functions.

A caller knows what kind of extent tree they have. There's no reason
they have to call ocfs2_get_extent_tree() with a NULL when they could
just as easily call a specific function to their type of extent tree.

Introduce ocfs2_dinode_get_extent_tree(),
ocfs2_xattr_tree_get_extent_tree(), and
ocfs2_xattr_value_get_extent_tree(). They only take the necessary
arguments, calling into the underlying __ocfs2_get_extent_tree() to do
the real work.

__ocfs2_get_extent_tree() is the old ocfs2_get_extent_tree(), but
without needing any switch-by-type logic.

ocfs2_get_extent_tree() is now a wrapper around the specific calls. It
exists because a couple alloc.c functions can take et_type. This will
go later.

Another benefit is that ocfs2_xattr_value_get_extent_tree() can take a
struct ocfs2_xattr_value_root* instead of void*. This gives us
typechecking where we didn't have it before.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:05 +0800
943cced39 ocfs2: Determine an extent tree's max_leaf_clusters in an et_op.

Provide an optional extent_tree_operation to specify the
max_leaf_clusters of an ocfs2_extent_tree. If not provided, the value
is 0 (unlimited).
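
An optional operation with a default can be sketched like this. The structs are heavily reduced, and the sketch_* names, the field names, and the limit of 8 are assumptions for illustration, not the definitions in alloc.c.

```c
#include <stddef.h>

struct sketch_tree;

/* Hypothetical, reduced operations table; the real one has more methods. */
struct sketch_tree_ops {
    unsigned int (*eo_get_max_leaf_clusters)(const struct sketch_tree *et);
};

struct sketch_tree {
    const struct sketch_tree_ops *et_ops;
    unsigned int et_max_leaf_clusters;
};

/* Fill in the limit at init time; a missing op means 0, i.e. unlimited. */
static void sketch_init_tree(struct sketch_tree *et,
                             const struct sketch_tree_ops *ops)
{
    et->et_ops = ops;
    if (ops->eo_get_max_leaf_clusters)
        et->et_max_leaf_clusters = ops->eo_get_max_leaf_clusters(et);
    else
        et->et_max_leaf_clusters = 0;
}

static unsigned int sketch_xattr_max_leaf(const struct sketch_tree *et)
{
    (void)et;
    return 8;   /* arbitrary illustrative limit */
}

/* A tree type that does not care simply leaves the pointer NULL. */
static const struct sketch_tree_ops sketch_dinode_ops = {
    .eo_get_max_leaf_clusters = NULL,
};

static const struct sketch_tree_ops sketch_xattr_ops = {
    .eo_get_max_leaf_clusters = sketch_xattr_max_leaf,
};
```

Tree types that have no limit need no stub function, which keeps their operations tables short.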

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:04 +0800
1c25d93a4 ocfs2: Use struct ocfs2_extent_tree in ocfs2_num_free_extents().

ocfs2_num_free_extents() re-implements the logic of
ocfs2_get_extent_tree(). Now that ocfs2_get_extent_tree() does not
allocate, let's use it in ocfs2_num_free_extents() to simplify the code.

The inode validation code in ocfs2_num_free_extents() is not needed.
All callers are passing in pre-validated inodes.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:04 +0800
0ce1010f1 ocfs2: Provide the get_root_el() method to ocfs2_extent_tree_operations.

The root_el of an ocfs2_extent_tree needs to be calculated from
et->et_object. Make it an operation on et->et_ops.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:04 +0800
ea5efa151 ocfs2: Make 'private' into 'object' on ocfs2_extent_tree.

The 'private' pointer was a way to store off xattr values, which don't
live at a set place in the bh. But the concept of "the object
containing the extent tree" is much more generic. For an inode it's the
struct ocfs2_dinode, for an xattr value it's the value. Let's save off
the 'object' at all times. If NULL is passed to
ocfs2_get_extent_tree(), 'object' is set to bh->b_data.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:04 +0800
dc0ce61af ocfs2: Make ocfs2_extent_tree get/put instead of alloc.

Rather than allocating a struct ocfs2_extent_tree, just put it on the
stack. Fill it with ocfs2_get_extent_tree() and drop it with
ocfs2_put_extent_tree(). Now the callers don't have to ENOMEM, yet
still safely ref the root_bh.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:04 +0800
ce1d9ea62 ocfs2: Prefix the ocfs2_extent_tree structure.

The members of the ocfs2_extent_tree structure gain a prefix of 'et_'.
All users are updated.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:04 +0800
35dc0aa3c ocfs2: Prefix the extent tree operations structure.

The ocfs2_extent_tree_operations structure gains a field prefix on its
members. The ->eo_sanity_check() operation gains a wrapper function for
completeness. All of the extent tree operation wrappers gain a
consistent name (ocfs2_et_*()).

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 07:57:04 +0800
ff1ec20ef ocfs2: fix printk format warnings

This patch fixes the following build warnings:

fs/ocfs2/xattr.c: In function 'ocfs2_half_xattr_bucket':
fs/ocfs2/xattr.c:3282: warning: format '%d' expects type 'int', but argument 7 has type 'long int'
fs/ocfs2/xattr.c:3282: warning: format '%d' expects type 'int', but argument 8 has type 'long int'
fs/ocfs2/xattr.c:3282: warning: format '%d' expects type 'int', but argument 7 has type 'long int'
fs/ocfs2/xattr.c:3282: warning: format '%d' expects type 'int', but argument 8 has type 'long int'
fs/ocfs2/xattr.c:3282: warning: format '%d' expects type 'int', but argument 7 has type 'long int'
fs/ocfs2/xattr.c:3282: warning: format '%d' expects type 'int', but argument 8 has type 'long int'
fs/ocfs2/xattr.c: In function 'ocfs2_xattr_set_entry_in_bucket':
fs/ocfs2/xattr.c:4092: warning: format '%d' expects type 'int', but argument 6 has type 'size_t'
fs/ocfs2/xattr.c:4092: warning: format '%d' expects type 'int', but argument 6 has type 'size_t'
fs/ocfs2/xattr.c:4092: warning: format '%d' expects type 'int', but argument 6 has type 'size_t'
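
For reference, warnings of this kind are normally resolved by matching the conversion specifier to the argument type. The commit message does not say which specifiers the patch chose, so the snippet below is only a userspace illustration with snprintf(); the function name and message text are made up.

```c
#include <stdio.h>
#include <stddef.h>

/* %ld matches a long argument and %zu matches a size_t argument. */
static int sketch_format(char *buf, size_t len, long delta, size_t name_len)
{
    return snprintf(buf, len, "delta %ld, name_len %zu", delta, name_len);
}
```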

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 07:57:03 +0800
8154da3d2 ocfs2: Add incompatible flag for extended attribute

This patch adds the s_incompat flag for extended attribute support. This
helps us ensure that older versions of Ocfs2 or ocfs2-tools will not be able
to mount a volume with xattr support.
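
How an incompat flag keeps older code away from a volume can be sketched as a mask check at mount time. The bit values and names below are hypothetical and chosen only for illustration; the real on-disk flag values are not given in this log.

```c
#include <stdint.h>

/* Hypothetical bit values, not the real on-disk constants. */
#define SKETCH_INCOMPAT_XATTR 0x0200u
#define SKETCH_INCOMPAT_KNOWN 0x00ffu   /* what an "old" driver understands */

/*
 * Returns 0 if the volume may be mounted, or -1 if it carries any
 * incompat feature bit that this driver does not support.
 */
static int sketch_check_incompat(uint32_t s_feature_incompat,
                                 uint32_t supported)
{
    uint32_t unknown = s_feature_incompat & ~supported;

    return unknown ? -1 : 0;
}
```

An old driver never has to know what the new bit means; it only has to notice a bit outside its supported mask and refuse the mount.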

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2008-10-14 07:57:03 +0800
a39442564 ocfs2: Delete all xattr buckets during inode removal

In inode removal, we need to iterate all the buckets, remove any
externally-stored EA values and delete the xattr buckets.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 07:57:03 +0800
012255961 ocfs2: Enable xattr set in index btree

Where the previous patches added the ability of list/get xattr in buckets
for ocfs2, this patch enables ocfs2 to store large numbers of EAs.

The original design doc is written by Mark Fasheh, and it can be found in
http://oss.oracle.com/osswiki/OCFS2/DesignDocs/IndexedEATrees. I only had to
make small modifications to it.

First, because the bucket size is 4K, a new field named xh_free_start is added
in ocfs2_xattr_header to indicate the next valid name/value offset in a bucket.
It is used when we store new EA name/value. With this field, we can find the
place more quickly and what's more, we don't need to sort the name/value every
time to let the last entry indicate the next unused space. This makes the
insert operation more efficient for blocksizes smaller than 4k.

Because of the new xh_free_start, another field named as xh_name_value_len is
also added in ocfs2_xattr_header. It records the total length of all the
name/values in the bucket. We need this so that we can check it and defragment
the bucket if there is not enough contiguous free space.

An xattr insertion looks like this:
1. xattr_index_block_find: find the right bucket by the name_hash, say bucketA.
2. Check whether there is enough space in bucketA. If yes, insert it directly
   and modify xh_free_start and xh_name_value_len accordingly. If not, check
   xh_name_value_len to see whether we can store this by defragmenting the
   bucket. If yes, defragment it and go on with the insertion.
3. If defragmenting doesn't work, check whether there is a new empty bucket in
   the clusters within this extent record. If yes, init the new bucket and move
   all the buckets after bucketA one by one to the next bucket. Move half of the
   entries in bucketA to the next bucket and go on with the insertion.
4. If there is no new bucket, grow the extent tree.
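
The space check in step 2 can be sketched as below. This is a model, not the xattr.c code: it assumes the entry index grows from the front of the bucket while name/value data grows downward from the end, so that free_start marks the lowest used name/value offset. The struct layout and all sketch_* names are hypothetical.

```c
#include <stddef.h>

enum sketch_action {
    SKETCH_INSERT,          /* enough contiguous free space: insert directly */
    SKETCH_DEFRAG_INSERT,   /* enough total free space once defragmented */
    SKETCH_NEED_NEW_BUCKET  /* steps 3 and 4: new bucket or grow the tree */
};

struct sketch_bucket {
    size_t size;            /* bucket size in bytes, e.g. 4096 */
    size_t index_end;       /* end of header plus entry index, in bytes */
    size_t free_start;      /* models xh_free_start */
    size_t name_value_len;  /* models xh_name_value_len */
};

static enum sketch_action sketch_plan_insert(const struct sketch_bucket *b,
                                             size_t entry_size,
                                             size_t nv_size)
{
    size_t need = entry_size + nv_size;
    /* Free gap between the entry index and the name/value area. */
    size_t contiguous = b->free_start - b->index_end;
    /* Free space available if all holes were squeezed out. */
    size_t total_free = b->size - b->index_end - b->name_value_len;

    if (need <= contiguous)
        return SKETCH_INSERT;
    if (need <= total_free)
        return SKETCH_DEFRAG_INSERT;
    return SKETCH_NEED_NEW_BUCKET;
}
```

Keeping both counters in the header means the decision needs no scan of the entries: one subtraction answers "does it fit now", another answers "would it fit after defragmenting".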

As for xattr deletion, we will delete an xattr bucket when all its xattrs
are removed and move all the buckets after it to the previous one. When all
the xattr buckets in an extent record are freed, free this extent record
from ocfs2_xattr_tree.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 07:57:03 +0800
ca12b7c48 ocfs2: Optionally limit extent size in ocfs2_insert_extent()

In xattr bucket, we want to limit the maximum size of a btree leaf,
otherwise we'll lose the benefits of hashing because we'll have to search
large leaves.

So add a new field in ocfs2_extent_tree which indicates the maximum leaf cluster
size we want so that we can prevent ocfs2_insert_extent() from merging the leaf
record even if it is contiguous with an adjacent record.

Other btree types are not affected by this change.
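
The merge restriction can be sketched as an extra condition on a contiguity check. This is a simplified model that looks only at logical cluster positions and a right-hand append; the struct and function names are hypothetical, and 0 is assumed to mean "no limit".

```c
#include <stdint.h>

struct sketch_rec {
    uint32_t cpos;      /* first logical cluster covered by the record */
    uint32_t clusters;  /* number of clusters in the record */
};

/*
 * Returns 1 if `ins` may be merged onto the end of `left`.
 * max_leaf_clusters == 0 means the tree type has no limit.
 */
static int sketch_can_merge(const struct sketch_rec *left,
                            const struct sketch_rec *ins,
                            uint32_t max_leaf_clusters)
{
    uint64_t left_end = (uint64_t)left->cpos + left->clusters;
    uint64_t merged = (uint64_t)left->clusters + ins->clusters;

    if (left_end != ins->cpos)
        return 0;   /* not contiguous */
    if (max_leaf_clusters && merged > max_leaf_clusters)
        return 0;   /* merging would exceed the leaf limit */
    return 1;
}
```

With the limit at 0 the second test never fires, which matches the statement that other btree types are unaffected.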

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 07:57:03 +0800
589dc2602 ocfs2: Add xattr lookup code xattr btrees

Add code to lookup a given extended attribute in the xattr btree. Lookup
follows this general scheme:

1. Use ocfs2_xattr_get_rec to find the xattr extent record

2. Find the xattr bucket within the extent which may contain this xattr

3. Iterate the bucket to find the xattr. In ocfs2_xattr_block_get(), we need
to recalculate the block offset and name offset for the right position of
name/value.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 07:57:03 +0800
0c044f0b2 ocfs2: Add xattr bucket iteration for large numbers of EAs

Ocfs2 breaks up xattr index tree leaves into 4k regions, called buckets.
Attributes are stored within a given bucket, depending on hash value.

After a discussion with Mark, we decided that the per-bucket index
(xe_entry[]) would only exist in the 1st block of a bucket. Likewise,
name/value pairs will not straddle more than one block. This allows the
majority of operations to work directly on the buffer heads in a leaf block.
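
The arithmetic behind these two rules can be sketched as follows, assuming a block size that divides 4096 evenly. The helper names are hypothetical and the offsets are byte offsets from the start of the bucket.

```c
#include <stddef.h>

#define SKETCH_BUCKET_SIZE 4096u

/* Number of filesystem blocks that make up one 4k bucket. */
static unsigned int sketch_blocks_per_bucket(unsigned int blocksize)
{
    return SKETCH_BUCKET_SIZE / blocksize;
}

/* Returns 1 if the byte range [offset, offset + len) stays in one block. */
static int sketch_fits_in_one_block(size_t offset, size_t len,
                                    unsigned int blocksize)
{
    if (len == 0)
        return 1;
    return offset / blocksize == (offset + len - 1) / blocksize;
}
```

With a 1k block size a bucket spans four blocks, so a name/value pair that would cross a 1024-byte boundary has to be placed elsewhere.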

This patch adds code to iterate the buckets in an EA. A new abstraction,
ocfs2_xattr_bucket, is added. It records the bhs in this bucket and the
ocfs2_xattr_header. This keeps the code neat, improving readability.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 07:57:03 +0800
ba492615f ocfs2: Add xattr index tree operations

When necessary, an ocfs2_xattr_block will embed an ocfs2_extent_list to
store large numbers of EAs. This patch adds a new type in
ocfs2_extent_tree_type and adds the implementation so that we can re-use the
b-tree code to handle the storage of many EAs.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-10-14 07:57:02 +0800