Eric Lee / smarc-fsl-linux-kernel

06 Jan, 2009

7 commits

13723d00e ocfs2: Use metadata-specific ocfs2_journal_access_*() functions.

The per-metadata-type ocfs2_journal_access_*() functions hook up jbd2
commit triggers and allow us to compute metadata ecc right before the
buffers are written out. This commit provides ecc for inodes, extent
blocks, group descriptors, and quota blocks. It is not safe to use
extended attributes and metaecc at the same time yet.

The ocfs2_extent_tree and ocfs2_path abstractions in alloc.c both hide
the type of block at their root. Before, it didn't matter, but now the
root block must use the appropriate ocfs2_journal_access_*() function.
To keep this abstract, the structures now have a pointer to the matching
journal_access function and a wrapper call to call it.

A few places use naked ocfs2_write_block() calls instead of adding the
blocks to the journal. We make sure to calculate their checksum and ecc
before the write.

Since we pass around the journal_access functions, let's typedef them
in ocfs2.h.
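
As a rough illustration of the function-pointer arrangement described above, the sketch below uses simplified stand-in types. Every name in it (struct buffer, journal_access_func, access_inode, access_extent_block, struct extent_tree, tree_root_journal_access) is hypothetical and chosen for the example; none of it is the actual ocfs2 code.

```c
/* Hypothetical stand-in for a buffer head; not the kernel structure. */
struct buffer {
    int kind;         /* which access routine touched it last */
    int last_access;  /* access type passed by the caller */
};

/* One function type shared by every per-metadata-type access routine. */
typedef int (*journal_access_func)(struct buffer *bh, int type);

/* Pretend inode-specific access routine (would hook up the inode trigger). */
static int access_inode(struct buffer *bh, int type)
{
    bh->kind = 1;
    bh->last_access = type;
    return 0;
}

/* Pretend extent-block-specific access routine. */
static int access_extent_block(struct buffer *bh, int type)
{
    bh->kind = 2;
    bh->last_access = type;
    return 0;
}

/* The tree hides what kind of block sits at its root. */
struct extent_tree {
    struct buffer *root_bh;
    journal_access_func root_journal_access;
};

/* Wrapper: callers never need to know the root block's type. */
static int tree_root_journal_access(struct extent_tree *et, int type)
{
    return et->root_journal_access(et->root_bh, type);
}
```

Code that walks the tree calls only the wrapper; whoever builds the tree picks the routine that matches the root block.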

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:32 +0800
d6b32bbb3 ocfs2: block read meta ecc.

Add block check calls to the read_block validate functions. This is
almost all of the read-side checking of metaecc. xattr buckets are not checked
yet. Writes are also unchecked, and so a read-write mount will quickly fail.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:31 +0800
a90714c15 ocfs2: Add quota calls for allocation and freeing of inodes and space

Add quota calls for allocation and freeing of inodes and space, also update
estimates on number of needed credits for a transaction. Move out inode
allocation from ocfs2_mknod_locked() because vfs_dq_init() must be called
outside of a transaction.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800
bbbd0eb34 ocfs2: Mark system files as not subject to quota accounting

Mark system files as not subject to quota accounting. This prevents
possible recursions into quota code and thus deadlocks.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800
1a224ad11 ocfs2: Assign feature bits and system inodes to quota feature and quota files

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800
970e4936d ocfs2: Validate metadata only when it's read from disk.

Add an optional validation hook to ocfs2_read_blocks(). Now the
validation function is only called when a block was actually read off of
disk. It is not called when the buffer was in cache.

We add a buffer state bit BH_NeedsValidate to flag these buffers. It
must always be one higher than the last JBD2 buffer state bit.
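
The idea of validating only on a real disk read can be sketched as follows. This is a simplified model, not the ocfs2 implementation: it uses a plain boolean instead of the BH_NeedsValidate buffer state bit, and struct block, validate_func, check_magic, and read_block are names invented for the example.

```c
#include <stdbool.h>

/* Hypothetical cached block; "uptodate" means it is already in cache. */
struct block {
    bool uptodate;
    int magic;
    int validations;  /* counts how often the validator ran */
};

typedef int (*validate_func)(struct block *b);

/* Example validator: accept only one made-up magic value. */
static int check_magic(struct block *b)
{
    b->validations++;
    return b->magic == 0x4f43 ? 0 : -1;
}

/* Read a block; run the optional validator only if we went to "disk". */
static int read_block(struct block *b, int disk_magic, validate_func validate)
{
    if (b->uptodate)
        return 0;           /* cache hit: validated when first read */

    b->magic = disk_magic;  /* simulated disk read */
    if (validate) {
        int err = validate(b);
        if (err)
            return err;     /* stays !uptodate, so a retry re-reads */
    }
    b->uptodate = true;
    return 0;
}
```

A second read of the same block returns from the cache branch and never calls the validator again.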

The dinode, dirblock, extent_block, and xattr_block validators are
lifted to this scheme directly. The group_descriptor validator needs to
be split into two pieces. The first part only needs the gd buffer and
is passed to ocfs2_read_block(). The second part requires the dinode as
well, and is called every time. It's only 3 compares, so it's tiny.
This also allows us to clean up the non-fatal gd check used by resize.c.
It now has no magic argument.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:53 +0800
b657c95c1 ocfs2: Wrap inode block reads in a dedicated function.

The ocfs2 code currently reads inodes off disk with a simple
ocfs2_read_block() call. Each place that does this has a different set
of sanity checks it performs. Some check only the signature. A couple
validate the block number (the block read vs di->i_blkno). A couple
others check for VALID_FL. Only one place validates i_fs_generation. A
couple check nothing. Even when an error is found, they don't all do
the same thing.

We wrap inode reading into ocfs2_read_inode_block(). This will validate
all the above fields, going readonly if they are invalid (they never
should be). ocfs2_read_inode_block_full() is provided for the places
that want to pass read_block flags. Every caller is passing a struct
inode with a valid ip_blkno, so we don't need a separate blkno argument
either.

We will remove the validation checks from the rest of the code in a
later commit, as they are no longer necessary.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:52 +0800

11 Nov, 2008

1 commit

ae0dff683 ocfs2: Set journal descriptor to NULL after journal shutdown

Patch sets journal descriptor to NULL after the journal is shutdown.
This ensures that jbd2_journal_release_jbd_inode(), which removes the
jbd2 inode from txn lists, can be called safely from ocfs2_clear_inode()
even after the journal has been shutdown.

Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Sunil Mushran
2008-11-11 01:51:47 +0800

15 Oct, 2008

5 commits

d4a8c93c8 ocfs2: Make cached block reads the common case.

ocfs2_read_blocks() currently requires the CACHED flag for cached I/O.
However, that's the common case. Let's flip it around and provide an
IGNORE_CACHE flag for the special users. This has the added benefit of
cleaning up the code some (ignore_cache takes on its special meaning
earlier in the loop).

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-15 02:58:22 +0800
07446dc72 ocfs2: Move ocfs2_bread() into dir.c

dir.c is the only place using ocfs2_bread(), so let's make it static to
that file.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-15 02:58:03 +0800
0fcaa56a2 ocfs2: Simplify ocfs2_read_block()

More than 30 callers of ocfs2_read_block() pass exactly OCFS2_BH_CACHED.
Only six pass a different flag set. Rather than have every caller care,
let's make ocfs2_read_block() take no flags and always do a cached read.
The remaining six places can call ocfs2_read_blocks() directly.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-15 02:51:57 +0800
31d33073c ocfs2: Require an inode for ocfs2_read_block(s)().

Now that synchronous readers are using ocfs2_read_blocks_sync(), all
callers of ocfs2_read_blocks() are passing an inode. Use it
unconditionally. Since it's there, we don't need to pass the
ocfs2_super either.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-15 02:43:29 +0800
da1e90985 ocfs2: Separate out sync reads from ocfs2_read_blocks()

The ocfs2_read_blocks() function currently handles sync reads, cached
reads, and sometimes-cached reads. We're going to add some
functionality to it, so first we should simplify it. The uncached,
synchronous reads are much easier to handle as a separate function, so we
introduce ocfs2_read_blocks_sync().

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-15 02:29:10 +0800

14 Oct, 2008

4 commits

a81cb88b6 ocfs2: Don't check for NULL before brelse()

This is pointless as brelse() already does the check.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 08:02:44 +0800
2b4e30fbd ocfs2: Switch over to JBD2.

ocfs2 wants JBD2 for many reasons, not the least of which is that JBD is
limiting our maximum filesystem size.

It's a pretty trivial change. Most functions are just renamed. The
only functional change is moving to Jan's inode-based ordered data mode.
It's better, too.

Because JBD2 reads and writes JBD journals, this is compatible with any
existing filesystem. It can even interact with JBD-based ocfs2 as long
as the journal is formatted for JBD.

We provide a compatibility option so that paranoid people can still use
JBD for the time being. This will go away shortly.

[ Moved call of ocfs2_begin_ordered_truncate() from ocfs2_delete_inode() to
ocfs2_truncate_for_delete(). --Mark ]

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2008-10-14 08:02:43 +0800
cf1d6c763 ocfs2: Add extended attribute support

This patch implements storing extended attributes either in the inode or in a single
external block. We only store EA's in-inode when blocksize > 512 or that
inode block has free space for it. When an EA's value is larger than 80
bytes, we will store the value via b-tree outside inode or block.

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2008-10-14 07:57:02 +0800
53da4939f ocfs2: POSIX file locks support

This is actually pretty easy since fs/dlm already handles the bulk of the
work. The Ocfs2 userspace cluster stack module already uses fs/dlm as the
underlying lock manager, so I only had to add the right calls.

Cluster-aware POSIX locks ("plocks") can be turned off by the same means as
UNIX locks - mount with 'noflocks', or create a local-only Ocfs2 volume.
Internally, the file system uses two sets of file_operations, depending on
whether cluster-aware plocks are required. This turns out to be easier than
implementing local-only versions of ->lock.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-10-14 04:57:57 +0800

26 Jan, 2008

5 commits

4092d49f7 ocfs2: convert byte order of constant instead of variable

Convert the byte order of the constant instead of the variable, so that the
conversion is done at compile time rather than run time. Remove unused le32_and_cpu.
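
A minimal sketch of the pattern, assuming a little-endian on-disk flags word. The helpers cpu_to_le32_sketch() and le32_to_cpu_sketch() and the flag FLAG_SKETCH are portable stand-ins written for this example; in the kernel, cpu_to_le32() applied to a constant can be folded by the compiler, which these plain functions only approximate.

```c
#include <stdint.h>
#include <string.h>

/* Portable stand-in for the kernel's cpu_to_le32(). */
static uint32_t cpu_to_le32_sketch(uint32_t v)
{
    uint8_t b[4] = {
        (uint8_t)(v & 0xff), (uint8_t)((v >> 8) & 0xff),
        (uint8_t)((v >> 16) & 0xff), (uint8_t)((v >> 24) & 0xff)
    };
    uint32_t out;
    memcpy(&out, b, sizeof(out));
    return out;
}

/* Portable stand-in for the kernel's le32_to_cpu(). */
static uint32_t le32_to_cpu_sketch(uint32_t v)
{
    uint8_t b[4];
    memcpy(b, &v, sizeof(b));
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

#define FLAG_SKETCH 0x00000100u  /* hypothetical flag bit */

/* Before: convert the on-disk variable on every test. */
static int flag_set_convert_variable(uint32_t disk_flags)
{
    return (le32_to_cpu_sketch(disk_flags) & FLAG_SKETCH) != 0;
}

/* After: convert the constant once and test the raw on-disk value. */
static int flag_set_convert_constant(uint32_t disk_flags)
{
    return (disk_flags & cpu_to_le32_sketch(FLAG_SKETCH)) != 0;
}
```

Both forms give the same answer for a single-bit test because a byte swap only moves bits around; the second form leaves the variable untouched.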

Signed-off-by: Marcin Slusarz
Signed-off-by: Mark Fasheh

Marcin Slusarz
2008-01-26 07:05:46 +0800
5fa0613ea ocfs2: Silence false lockdep warnings

Create separate lockdep lock classes for system files' i_mutexes. They are
used to guard allocations and similar things and thus rank differently
than i_mutex of a regular file or directory.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2008-01-26 07:05:44 +0800
e63aecb65 ocfs2: Rename ocfs2_meta_[un]lock

Call this the "inode_lock" now, since it covers both data and meta data.
This patch makes no functional changes.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-01-26 06:46:01 +0800
c934a92d0 ocfs2: Remove data locks

The meta lock now covers both meta data and data, so this just removes the
now-redundant data lock.

Combining locks saves us a round of lock mastery per inode and one less lock
to ping between nodes during read/write.

We don't lose much - since meta locks were always held before a data lock
(and at the same level) ordered writeout mode (the default) ensured that
flushing for the meta data lock also pushed out data anyways.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-01-26 06:45:57 +0800
34d024f84 ocfs2: Remove mount/unmount votes

The node maps that are set/unset by these votes are no longer relevant, thus
we can remove the mount and umount votes. Since those are the last two
remaining votes, we can also remove the entire vote infrastructure.

The vote thread has been renamed to the downconvert thread, and the small
amount of functionality related to managing it has been moved into
fs/ocfs2/dlmglue.c. All references to votes have been removed or updated.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-01-26 06:45:34 +0800

28 Nov, 2007

2 commits

a46043e08 ocfs2: log valid inode # on bad inode

If the inode block isn't valid then we don't want to print the value from
that, instead print the block number which was passed in (which should
always be correct). Also, turn this into a debug print for now - folks who
hit an actual problem always have other logs indicating what the source is.

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-11-28 08:47:02 +0800
2759236f8 [PATCH] fs/ocfs2: Add missing "space"

Signed-off-by: Joe Perches
Signed-off-by: Mark Fasheh

Joe Perches
2007-11-28 08:47:01 +0800

13 Oct, 2007

2 commits

1afc32b95 ocfs2: Write support for inline data

This fixes up write, truncate, mmap, and RESVSP/UNRESVP to understand inline
inode data.

For the most part, the changes to the core write code can be relied on to do
the heavy lifting. Any code calling ocfs2_write_begin (including shared
writeable mmap) can count on it doing the right thing with respect to
growing inline data to an extent tree.

Size reducing truncates, including UNRESVP can simply zero that portion of
the inode block being removed. Size increasing truncates, including RESVP
have to be a little bit smarter and grow the inode to an extent tree if
necessary.

Signed-off-by: Mark Fasheh
Reviewed-by: Joel Becker

Mark Fasheh
2007-10-13 02:54:40 +0800
15b1e36bd ocfs2: Structure updates for inline data

Add the disk, network and memory structures needed to support data in inode.

Struct ocfs2_inline_data is defined and embedded in ocfs2_dinode for storing
inline data.

A new inode field, i_dyn_features, is added to facilitate tracking of
dynamic inode state. Since it will be used often, we want to mirror it on
ocfs2_inode_info, and transfer it via the meta data lvb.

Signed-off-by: Mark Fasheh
Reviewed-by: Joel Becker

Mark Fasheh
2007-10-13 02:54:39 +0800

09 May, 2007

1 commit

e63340ae6 header cleaning: don't include smp_lock.h when not used

Remove includes of <linux/smp_lock.h> where it is not used/needed.
Suggested by Al Viro.

Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
sparc64, and arm (all 59 defconfigs).

Signed-off-by: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Randy Dunlap
2007-05-09 02:15:07 +0800

03 May, 2007

3 commits

1ca1a111b ocfs2: fix sparse warnings in fs/ocfs2

None of these are actually harmful, but the noise makes looking for real
problems difficult.

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-05-03 06:08:08 +0800
6e4b0d569 [PATCH] Copy i_flags to ocfs2 inode flags on write

Propagate flags such as S_APPEND, S_IMMUTABLE, etc. from i_flags into
ocfs2-specific ip_attr. Hence, when someone sets these flags via a different
interface than ioctl, they are stored correctly.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2007-05-03 06:07:58 +0800
ee19a7795 ocfs2: Wrap access of directory allocations with ip_alloc_sem.

OCFS2_I(inode)->ip_alloc_sem is a read-write semaphore protecting
local concurrent access of ocfs2 inodes. However, ocfs2 directories were
not taking the semaphore while they accessed or modified the allocation
tree.

ocfs2_extend_dir() needs to take the semaphore in a write mode when it
adds to the allocation. All other directory users get there via
ocfs2_bread(), which takes the semaphore in read mode.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2007-05-03 06:07:42 +0800

27 Apr, 2007

9 commits

834189788 ocfs2: Cache extent records

The extent map code was ripped out earlier because of an inability to deal
with holes. This patch adds back a simpler caching scheme requiring far less
code.

Our old extent map caching was designed back when meta data block caching in
Ocfs2 didn't work very well, resulting in many disk reads. These days our
metadata caching is much better, resulting in no unnecessary disk reads. As
a result, extent caching doesn't have to be as fancy, nor does it have to
cache as many extents. Keeping the last 3 extents seen should be sufficient
to give us a small performance boost on some streaming workloads.
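
A small most-recently-used cache of three extent records might look like the sketch below. It illustrates the general idea only: the structure and function names (extent_rec, extent_cache, cache_insert, cache_lookup) are invented here, and the real code may also merge or trim records, which this does not attempt.

```c
#include <stdint.h>

#define CACHE_SLOTS 3  /* "the last 3 extents seen" */

/* Hypothetical extent record: logical start, length, physical start. */
struct extent_rec {
    uint32_t cpos;      /* first logical cluster */
    uint32_t clusters;  /* length in clusters */
    uint64_t phys;      /* first physical cluster */
};

struct extent_cache {
    struct extent_rec recs[CACHE_SLOTS];  /* recs[0] is the most recent */
    int count;
};

/* Insert as most recent, dropping the oldest record when full. */
static void cache_insert(struct extent_cache *c, struct extent_rec rec)
{
    int n = c->count < CACHE_SLOTS ? c->count : CACHE_SLOTS - 1;

    for (int i = n; i > 0; i--)
        c->recs[i] = c->recs[i - 1];
    c->recs[0] = rec;
    if (c->count < CACHE_SLOTS)
        c->count++;
}

/* Return 1 and fill *phys if logical cluster cpos is covered by the cache. */
static int cache_lookup(const struct extent_cache *c, uint32_t cpos,
                        uint64_t *phys)
{
    for (int i = 0; i < c->count; i++) {
        const struct extent_rec *r = &c->recs[i];

        if (cpos >= r->cpos && cpos - r->cpos < r->clusters) {
            *phys = r->phys + (cpos - r->cpos);
            return 1;
        }
    }
    return 0;  /* miss: the caller falls back to walking the extent tree */
}
```

With streaming access, consecutive lookups tend to land in the most recent record, so a handful of slots is enough to skip most tree walks.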

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-04-27 06:10:40 +0800
8110b073a ocfs2: Fix up i_blocks calculation to know about holes

Older file systems which didn't support holes did a dumb calculation of
i_blocks based on i_size. This is no longer accurate, so fix things up to
take actual allocation into account.
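
To see why a size-based count goes wrong once holes exist, compare the two calculations below. This is only an illustration: the function names are made up, the size-based formula is an assumed simplification of the old behaviour, and i_blocks is taken to be in 512-byte units.

```c
#include <stdint.h>

#define SECTOR_BITS 9  /* i_blocks counts 512-byte units */

/* Assumed old approach: derive the block count from the file size. */
static uint64_t blocks_from_size(uint64_t i_size)
{
    return (i_size + ((uint64_t)1 << SECTOR_BITS) - 1) >> SECTOR_BITS;
}

/* Hole-aware approach: count only the clusters actually allocated.
 * Assumes cluster_bits >= SECTOR_BITS. */
static uint64_t blocks_from_clusters(uint32_t clusters,
                                     unsigned int cluster_bits)
{
    return (uint64_t)clusters << (cluster_bits - SECTOR_BITS);
}
```

For a sparse file with an i_size of 1 GiB and a single allocated 4 KiB cluster, the size-based figure is 2097152 blocks while the allocation-based figure is 8.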

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-04-27 06:07:40 +0800
49cb8d2d4 ocfs2: Read from an unwritten extent returns zeros

Return an optional extent flags field from our lookup functions and wire up
callers to treat unwritten regions as holes for the purpose of returning
zeros to the user.

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-04-27 06:02:41 +0800
60b11392f ocfs2: zero tail of sparse files on truncate

Since we don't zero on extend anymore, truncate needs to be fixed up to zero
the part of a file between i_size and the end of its cluster. Otherwise a
subsequent extend could expose bad data.

This introduced a new helper, which can be used in ocfs2_write().
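
The range that needs zeroing can be computed as in the sketch below. The function name zero_tail_range and its parameters are hypothetical, and this is not the helper the commit introduces; it only shows the arithmetic for "from i_size to the end of its cluster".

```c
#include <stdint.h>

/* Compute the byte range [*start, *end) to zero after truncating to i_size.
 * cluster_bits is log2 of the cluster size in bytes. */
static void zero_tail_range(uint64_t i_size, unsigned int cluster_bits,
                            uint64_t *start, uint64_t *end)
{
    uint64_t csize = (uint64_t)1 << cluster_bits;

    *start = i_size;
    /* Round i_size up to the next cluster boundary. */
    *end = (i_size + csize - 1) & ~(csize - 1);
}
```

When i_size already sits on a cluster boundary, start equals end and there is nothing to zero.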

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-04-27 06:02:20 +0800
3a0782d09 ocfs2: teach extend/truncate about sparse files

For ocfs2_truncate_file(), we eliminate the "simple" truncate case which no
longer exists since i_size is not tied to i_clusters. In
ocfs2_extend_file(), we skip the allocation / page zeroing code for file
systems which understand sparse files.

The core truncate code is changed to do a bottom up tree traversal. This
gets abstracted out into its own function. To make things more readable,
most of the special case handling for in-inode extents from
ocfs2_do_truncate() is also removed.

Though write support for sparse files comes in a later patch, we at least
update ocfs2_prepare_inode_for_write() to skip allocation for sparse files.

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-04-27 06:01:56 +0800
363041a5f ocfs2: temporarily remove extent map caching

The code in extent_map.c is not prepared to deal with a subtree being
rotated between lookups. This can happen when filling holes in sparse files.
Instead of a lengthy patch to update the code (which would likely lose the
benefit of caching subtree roots), we remove most of the algorithms and
implement a simple path based lookup. A less ambitious extent caching scheme
will be added in a later patch.

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-04-27 06:01:31 +0800
6f16bf655 ocfs2: small cleanup of ocfs2_request_delete()

There are two checks in there (one for inode newness, one for other mounted
nodes) which are unnecessary, so remove them. The DLM will allow the trylock
in either case without any messaging overhead.

Removing these makes ocfs2_request_delete() a one liner function, so just
move the trylock out one level into ocfs2_query_inode_wipe().

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-04-27 05:40:55 +0800
68e2b740c ocfs2: remove unused code

Remove node messaging code that becomes unused with the delete inode vote
removal.

[Removed even more cruft which I spotted during review --Mark]

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2007-04-27 05:40:16 +0800
500086300 ocfs2: Remove delete inode vote

Ocfs2 currently does cluster-wide node messaging to check the open state of
an inode during delete. This patch removes that mechanism in favor of an
inode cluster lock which is taken at shared read when an inode is first read
and dropped in clear_inode(). This allows a deleting node to test the
liveness of an inode by attempting to take an exclusive lock.

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2007-04-27 05:39:48 +0800

22 Jan, 2007

1 commit

6a1bd4a57 ocfs2: cleanup ocfs2_iget() errors

Get rid of some error prints in the ocfs2_iget() path from
ocfs2_get_dentry(). NFSD can easily cause us to read stale inodes.

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-01-22 08:19:12 +0800