Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

06 Jan, 2009

40 commits

a90714c15 ocfs2: Add quota calls for allocation and freeing of inodes and space

Add quota calls for allocation and freeing of inodes and space, also update
estimates on number of needed credits for a transaction. Move out inode
allocation from ocfs2_mknod_locked() because vfs_dq_init() must be called
outside of a transaction.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800
9e33d69f5 ocfs2: Implementation of local and global quota file handling

For each quota type, each node has a local quota file. In this file it stores
changes users have made to disk usage via this node. Once in a while this
information is synced to the global file (and thus with other nodes) so that
limits enforcement works at least approximately.

Global quota files contain all the information about usage and limits. It's
mostly handled by the generic VFS code (which implements a trie of structures
inside a quota file). We only have to provide functions to convert structures
from on-disk format to in-memory one. We also have to provide wrappers for
various quota functions starting transactions and acquiring necessary cluster
locks before the actual IO is really started.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800
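
The local/global split described in 9e33d69f5 can be pictured with a toy model. The sketch below is only an illustration under assumed names (toy_local, toy_global, toy_charge and toy_sync are hypothetical and are not taken from the ocfs2 sources): each node accumulates a pending delta and occasionally folds it into the shared total.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy types, not the ocfs2 on-disk structures. */
struct toy_global { long long usage; };   /* shared by all nodes */
struct toy_local  { long long delta; };   /* per-node pending change */

/* Record a usage change made through this node (negative when freeing). */
void toy_charge(struct toy_local *l, long long bytes)
{
    assert(l != NULL);
    l->delta += bytes;
}

/* Fold the pending local change into the global total and reset it. */
void toy_sync(struct toy_local *l, struct toy_global *g)
{
    assert(l != NULL && g != NULL);
    g->usage += l->delta;
    l->delta = 0;
}
```

Between two syncs the global total lags behind the real usage, which is why the message says enforcement only works approximately.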
bbbd0eb34 ocfs2: Mark system files as not subject to quota accounting

Mark system files as not subject to quota accounting. This prevents
possible recursions into quota code and thus deadlocks.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800
1a224ad11 ocfs2: Assign feature bits and system inodes to quota feature and quota files

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800
90e86a63e ocfs2: Support nested transactions

OCFS2 can easily support nested transactions. We just have to
take care not to spoil statistics or acquire the semaphore unnecessarily.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800
12c77527e quota: Implement function for scanning active dquots

OCFS2 needs to scan all active dquots once in a while and sync quota
information among cluster nodes. Provide a helper function for it so
that it does not have to reimplement internally a list which VFS
already has. Moreover this function is probably going to be useful
for other clustered filesystems if they decide to use VFS quotas.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800
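
A helper of the kind 12c77527e describes is essentially a callback-driven walk over a list the core already maintains. The following is a minimal sketch under assumed names (scan_dquot, toy_scan_active and toy_sum_usage are hypothetical; the kernel's real list, locking and signature are not reproduced here).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy entry; the real struct dquot carries much more state. */
struct scan_dquot {
    unsigned id;
    int active;                 /* nonzero if the entry is in use */
    long long usage;
    struct scan_dquot *next;
};

/* Call fn on every active entry; stop at the first nonzero return value. */
int toy_scan_active(struct scan_dquot *head,
                    int (*fn)(struct scan_dquot *dq, void *priv),
                    void *priv)
{
    struct scan_dquot *dq;
    int ret;

    assert(fn != NULL);
    for (dq = head; dq != NULL; dq = dq->next) {
        if (!dq->active)
            continue;
        ret = fn(dq, priv);
        if (ret)
            return ret;
    }
    return 0;
}

/* Example callback: add up the usage of all active entries. */
int toy_sum_usage(struct scan_dquot *dq, void *priv)
{
    *(long long *)priv += dq->usage;
    return 0;
}
```

The filesystem supplies only the callback, so it does not need a private copy of the list.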
3d9ea253a quota: Add helpers to allow ocfs2 specific quota initialization, freeing and recovery

OCFS2 needs to peek whether quota structure is already in memory so
that it can avoid expensive cluster locking in that case. Similarly
when freeing dquots, it checks whether it is the last quota structure
user or not. Finally, it needs to get reference to dquot structure for
specified id and quota type when recovering quota file after crash.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:22 +0800
571b46e40 quota: Update version number

Increase reported version number of quota support since quota core has changed
significantly. Also remove __DQUOT_NUM_VERSION__ since nobody uses it.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:22 +0800
4d59bce4f quota: Keep which entries were set by SETQUOTA quotactl

Quota in a clustered environment needs to synchronize quota information
among cluster nodes. This means we have to occasionally update some
information in dquot from disk / network. On the other hand we have to
be careful not to overwrite changes the administrator made via SETQUOTA.
So indicate in dquot->dq_flags which entries have been set by SETQUOTA;
the quota format can clear these flags once it has properly propagated
the changes.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:22 +0800
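
The flag scheme from 4d59bce4f can be sketched as follows. All names and bit values here are assumptions made for illustration (toy_limits, TOY_SET_BLK_LIMIT and so on are hypothetical, and the kernel's dq_flags bits may be named and laid out differently): a refresh from the shared copy skips fields the administrator set locally, and the flags are cleared once the change has been pushed out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits marking fields set locally via SETQUOTA. */
#define TOY_SET_BLK_LIMIT 0x1u
#define TOY_SET_INO_LIMIT 0x2u

struct toy_limits {
    unsigned flags;
    long long blk_limit;
    long long ino_limit;
};

/* Administrator changes the block limit on this node. */
void toy_setquota_blk(struct toy_limits *dq, long long limit)
{
    assert(dq != NULL);
    dq->blk_limit = limit;
    dq->flags |= TOY_SET_BLK_LIMIT;
}

/* Refresh from the shared copy without clobbering unpropagated changes. */
void toy_refresh(struct toy_limits *dq, const struct toy_limits *global)
{
    assert(dq != NULL && global != NULL);
    if (!(dq->flags & TOY_SET_BLK_LIMIT))
        dq->blk_limit = global->blk_limit;
    if (!(dq->flags & TOY_SET_INO_LIMIT))
        dq->ino_limit = global->ino_limit;
}

/* Push local changes to the shared copy, then clear the markers. */
void toy_propagate(struct toy_limits *dq, struct toy_limits *global)
{
    assert(dq != NULL && global != NULL);
    if (dq->flags & TOY_SET_BLK_LIMIT)
        global->blk_limit = dq->blk_limit;
    if (dq->flags & TOY_SET_INO_LIMIT)
        global->ino_limit = dq->ino_limit;
    dq->flags &= ~(TOY_SET_BLK_LIMIT | TOY_SET_INO_LIMIT);
}
```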
db49d2df4 quota: Allow negative usage of space and inodes

For clustered filesystems, it can happen that space / inode usage goes
negative temporarily (because one node is allocating while another node
is freeing and they are not completely in sync). So let quota code
allow this and change qsize_t to a signed type so that we don't
underflow the variables.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:21 +0800
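
The underflow that db49d2df4 guards against is easy to reproduce in isolation. The sketch below uses hypothetical stand-in types (usage_u_t and usage_s_t are not the kernel's definitions) to contrast what happens when a free is applied before the matching allocation has been synced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for an unsigned and a signed usage counter. */
typedef uint64_t usage_u_t;
typedef int64_t  usage_s_t;

/* Unsigned counter: subtracting more than is recorded wraps modulo 2^64. */
usage_u_t toy_free_unsigned(usage_u_t cur, uint64_t freed)
{
    return cur - freed;
}

/* Signed counter: the value simply goes negative until the nodes sync. */
usage_s_t toy_free_signed(usage_s_t cur, int64_t freed)
{
    assert(freed >= 0);
    return cur - freed;
}
```

With an unsigned counter a temporarily "negative" usage looks like an enormous positive one, which would trip limit checks; the signed variant recovers as soon as the missing allocation is added back.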
e3d4d56b9 quota: Convert union in mem_dqinfo to a pointer

Coming quota support for OCFS2 is going to need quite a bit
of additional per-sb quota information. Moreover having fs.h
include all the types needed for this structure would be a
pain in the a**. So remove the union from mem_dqinfo and add
a private pointer for filesystem's use.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:21 +0800
1ccd14b9c quota: Split off quota tree handling into a separate file

There is going to be a new version of quota format having 64-bit
quota limits and a new quota format for OCFS2. They are both
going to use the same tree structure as VFSv0 quota format. So
split out tree handling into a separate file and make size of
leaf blocks, amount of space usable in each block (needed for
checksumming) and structures contained in them configurable
so that the code can be shared.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:21 +0800
cf770c137 quota: Move quotaio_v[12].h from include/linux/ to fs/

Since these include files are used only by implementation of quota formats,
there's no need to have them in include/linux/.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:58 +0800
ca785ec66 quota: Introduce DQUOT_QUOTA_SYS_FILE flag

If a filesystem can handle quota files as system files hidden from users, we can
skip a lot of cache invalidation, syncing, inode flag setting, etc. when
turning quotas on or off and during quota_sync. Allow a filesystem to indicate
that it is hiding quota files from users via the DQUOT_QUOTA_SYS_FILE flag.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:57 +0800
dcb30695f quota: Remove compatibility function sb_any_quota_enabled()

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:56 +0800
6929f8912 reiserfs: Use sb_any_quota_loaded() instead of sb_any_quota_enabled().

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:56 +0800
17bd13b31 ext4: Use sb_any_quota_loaded() instead of sb_any_quota_enabled()

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:56 +0800
ee0d5ffe0 ext3: Use sb_any_quota_loaded() instead of sb_any_quota_enabled()

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:56 +0800
f55abc0fb quota: Allow to separately enable quota accounting and enforcing limits

Split DQUOT_USR_ENABLED (and DQUOT_GRP_ENABLED) into DQUOT_USR_USAGE_ENABLED
and DQUOT_USR_LIMITS_ENABLED. This way we are able to separately enable /
disable whether we should:
1) ignore quotas completely
2) just keep uptodate information about usage
3) actually enforce quota limits

This is going to be useful when quota is treated as filesystem metadata - we
then want to keep quota information uptodate all the time and just enable /
disable limits enforcement.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:56 +0800
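
The three modes listed in f55abc0fb fall out of two independent bits. The sketch below is an illustration only: the bit values and the names TOY_USAGE_ENABLED, TOY_LIMITS_ENABLED and toy_classify are hypothetical, and the rule that limits cannot be enforced without usage tracking is an assumption of this sketch rather than something the commit message states.

```c
#include <assert.h>

/* Hypothetical bit values; the kernel's actual encoding may differ. */
#define TOY_USAGE_ENABLED  0x1u
#define TOY_LIMITS_ENABLED 0x2u

enum toy_quota_mode {
    TOY_QUOTA_IGNORED,          /* 1) ignore quotas completely */
    TOY_QUOTA_ACCOUNTING_ONLY,  /* 2) just keep usage information current */
    TOY_QUOTA_ENFORCED          /* 3) actually enforce quota limits */
};

enum toy_quota_mode toy_classify(unsigned flags)
{
    /* Assumption of this sketch: enforcing limits requires usage tracking. */
    assert(!(flags & TOY_LIMITS_ENABLED) || (flags & TOY_USAGE_ENABLED));

    if (!(flags & TOY_USAGE_ENABLED))
        return TOY_QUOTA_IGNORED;
    if (!(flags & TOY_LIMITS_ENABLED))
        return TOY_QUOTA_ACCOUNTING_ONLY;
    return TOY_QUOTA_ENFORCED;
}
```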
e4bc7b4b7 quota: Make _SUSPENDED just a flag

Up to now, DQUOT_USR_SUSPENDED behaved like a state - i.e., either quota
was enabled or suspended or none. Now allowed states are 0, ENABLED,
ENABLED | SUSPENDED. This will be useful later when we implement separate
enabling of quota usage tracking and limits enforcement because we need to
keep track of a state which has been suspended.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:56 +0800
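
The allowed states named in e4bc7b4b7 (0, ENABLED, ENABLED | SUSPENDED) can be written down as a small check. The bit values and helper names below are hypothetical; the point is only that suspension is layered on top of the enabled bit, so resuming restores exactly what was suspended.

```c
#include <assert.h>

/* Hypothetical bit values for illustration. */
#define TOY_Q_ENABLED   0x1u
#define TOY_Q_SUSPENDED 0x2u

/* Valid states: 0, ENABLED, ENABLED | SUSPENDED. */
int toy_state_valid(unsigned s)
{
    return s == 0 || s == TOY_Q_ENABLED ||
           s == (TOY_Q_ENABLED | TOY_Q_SUSPENDED);
}

/* Suspending only makes sense for an enabled quota. */
unsigned toy_suspend(unsigned s)
{
    assert(toy_state_valid(s));
    return (s & TOY_Q_ENABLED) ? (s | TOY_Q_SUSPENDED) : s;
}

/* Resuming drops the flag and leaves the enabled bit untouched. */
unsigned toy_resume(unsigned s)
{
    assert(toy_state_valid(s));
    return s & ~TOY_Q_SUSPENDED;
}
```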
1497d3ad4 quota: Remove bogus 'optimization' in check_idq() and check_bdq()

Checks like
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:56 +0800
12095460f quota: Increase size of variables for limits and inode usage

So far quota was fine with quota block limits and inode limits/numbers in
a 32-bit type. Now, with the rapid increase in storage sizes, requests are
coming in to handle quota limits above 4TB / more than 2^32 inodes.
So bump up sizes of types in mem_dqblk structure to 64 bits to be able to
handle this. Also update inode allocation / checking functions to use qsize_t
and make global structure keep quota limits in bytes so that things are
consistent.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:55 +0800
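
The 4TB figure in 12095460f follows from simple arithmetic if limits are counted in 1 KiB quota blocks. That block size is an assumption of this sketch (as are the names TOY_QBLOCK_BITS, toy_max_limit_bytes and toy_store_limit32); it is not stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_QBLOCK_BITS 10   /* assumed: limits counted in 1 KiB blocks */

/* Largest limit in bytes expressible with a block counter of this width. */
uint64_t toy_max_limit_bytes(unsigned counter_bits)
{
    uint64_t max_blocks;

    assert(counter_bits >= 1 && counter_bits <= 64 - TOY_QBLOCK_BITS);
    max_blocks = ((uint64_t)1 << counter_bits) - 1;
    return max_blocks << TOY_QBLOCK_BITS;
}

/* Storing a larger block count in 32 bits silently drops the high bits. */
uint32_t toy_store_limit32(uint64_t blocks)
{
    return (uint32_t)blocks;
}
```

Under that assumption a 32-bit counter tops out one block short of 2^42 bytes (4 TiB), and any larger limit is truncated when stored.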
74f783af9 quota: Add callbacks for allocating and destroying dquot structures

Some filesystems would like to keep private information together with each
dquot. Add callbacks alloc_dquot and destroy_dquot allowing filesystem to
allocate larger dquots from their private slab in a fashion similar to how we
currently allocate inodes.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:36:55 +0800
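
The pattern behind 74f783af9 is the same embedding trick used for inodes: the filesystem allocates a larger object that contains the generic one. The sketch below uses user-space malloc and hypothetical names (gen_dquot, gen_dquot_ops, fs_dquot and the helpers are all invented for illustration); only the callback names alloc_dquot and destroy_dquot come from the commit message, and their real signatures are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Generic structure known to the quota core (toy version). */
struct gen_dquot { unsigned id; };

/* Optional per-filesystem hooks; NULL means "use the default". */
struct gen_dquot_ops {
    struct gen_dquot *(*alloc_dquot)(void);
    void (*destroy_dquot)(struct gen_dquot *dq);
};

/* Filesystem-private dquot embedding the generic one. */
struct fs_dquot {
    struct gen_dquot generic;
    int fs_private;
};

/* Recover the containing fs_dquot from the embedded generic part. */
struct fs_dquot *to_fs_dquot(struct gen_dquot *dq)
{
    return (struct fs_dquot *)((char *)dq - offsetof(struct fs_dquot, generic));
}

static struct gen_dquot *fs_alloc_dquot(void)
{
    struct fs_dquot *f = calloc(1, sizeof(*f));
    return f ? &f->generic : NULL;
}

static void fs_destroy_dquot(struct gen_dquot *dq)
{
    free(to_fs_dquot(dq));
}

const struct gen_dquot_ops fs_ops = { fs_alloc_dquot, fs_destroy_dquot };

/* Core side: allocate through the hook if the filesystem provides one. */
struct gen_dquot *core_get_dquot(const struct gen_dquot_ops *ops, unsigned id)
{
    struct gen_dquot *dq;

    assert(ops != NULL);
    if (ops->alloc_dquot)
        dq = ops->alloc_dquot();
    else
        dq = calloc(1, sizeof(*dq));
    if (dq)
        dq->id = id;
    return dq;
}

void core_put_dquot(const struct gen_dquot_ops *ops, struct gen_dquot *dq)
{
    assert(ops != NULL);
    if (ops->destroy_dquot)
        ops->destroy_dquot(dq);
    else
        free(dq);
}
```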
9f868f16e ocfs2/xattr: Restore not_found in xis

During an xattr set, when we move an xattr which was stored in inode to the
outside bucket, we have to delete it and it will use the old value of
xis->not_found. xis->not_found is removed by ocfs2_calc_xattr_set_need
though, so we must restore it.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2009-01-06 00:36:55 +0800
97aff52ae ocfs2/xattr: Fix a bug in xattr allocation estimation

When we extend one xattr's value to a large size, the old value size might
be smaller than the size of a value root. In those cases, we still need to
guess the metadata allocation.

Reported-by: Tiger Yang
Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2009-01-06 00:36:55 +0800
53ef99cad ocfs2: Remove JBD compatibility layer

JBD2 is fully backwards compatible with JBD and it's been tested enough with
Ocfs2 that we can clean this code up now.

Signed-off-by: Mark Fasheh

Mark Fasheh
2009-01-06 00:36:55 +0800
511308d90 ocfs2: Convert ocfs2_read_dir_block() to ocfs2_read_virt_blocks()

Now that we've centralized the ocfs2_read_virt_blocks() code, let's use
it in ocfs2_read_dir_block().

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:55 +0800
a8549fb5a ocfs2: Wrap virtual block reads in ocfs2_read_virt_blocks()

The ocfs2_read_dir_block() function really maps an inode's virtual
blocks to physical ones before calling ocfs2_read_blocks(). Let's
extract that to common code, because other places might want to do that.

Other than the block number being virtual, ocfs2_read_virt_blocks()
takes the same arguments as ocfs2_read_blocks(). It converts those
virtual block numbers to physical before calling ocfs2_read_blocks()
directly. If the blocks asked for are discontiguous, this can mean
multiple calls to ocfs2_read_blocks(), but this is mostly hidden from
the caller.

Like ocfs2_read_blocks(), the caller can pass in an existing
buffer_head. This is usually done to pick up some readahead I/O.
ocfs2_read_virt_blocks() checks the buffer_head's block number
against the extent map - it must match.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:54 +0800
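
The virtual-to-physical translation described in a8549fb5a, including the case where a discontiguous range needs several physical reads, can be sketched with a toy extent map. Everything below is hypothetical (toy_extent, the two-extent layout, toy_read_blocks and toy_read_virt_blocks); it does not use the ocfs2 extent map API or buffer_heads.

```c
#include <assert.h>
#include <stddef.h>

/* One contiguous run: virtual blocks [v_start, v_start + len) map to p_start. */
struct toy_extent { unsigned v_start, p_start, len; };

/* Hypothetical layout: virtual blocks 0-3 and 4-7 live in different places. */
static const struct toy_extent toy_extents[] = {
    { 0, 100, 4 },
    { 4, 500, 4 },
};

int toy_phys_read_calls;        /* how often a physical read was issued */
unsigned toy_last_phys_start;   /* first physical block of the latest read */

/* Stand-in for the physical read; only records what was requested. */
static void toy_read_blocks(unsigned p_block, unsigned nr)
{
    assert(nr > 0);
    toy_last_phys_start = p_block;
    toy_phys_read_calls++;
}

/* Read nr virtual blocks; returns 0 on success, -1 if a block is unmapped. */
int toy_read_virt_blocks(unsigned v_block, unsigned nr)
{
    while (nr > 0) {
        const struct toy_extent *e = NULL;
        unsigned off, run;
        size_t i;

        for (i = 0; i < sizeof(toy_extents) / sizeof(toy_extents[0]); i++) {
            if (v_block >= toy_extents[i].v_start &&
                v_block < toy_extents[i].v_start + toy_extents[i].len) {
                e = &toy_extents[i];
                break;
            }
        }
        if (e == NULL)
            return -1;

        off = v_block - e->v_start;     /* offset inside the extent */
        run = e->len - off;             /* blocks left in this extent */
        if (run > nr)
            run = nr;
        toy_read_blocks(e->p_start + off, run);
        v_block += run;
        nr -= run;
    }
    return 0;
}
```

A request that stays inside one extent costs one physical read; a request that crosses the boundary is split, and the caller never sees the split.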
970e4936d ocfs2: Validate metadata only when it's read from disk.

Add an optional validation hook to ocfs2_read_blocks(). Now the
validation function is only called when a block was actually read off of
disk. It is not called when the buffer was in cache.

We add a buffer state bit BH_NeedsValidate to flag these buffers. It
must always be one higher than the last JBD2 buffer state bit.

The dinode, dirblock, extent_block, and xattr_block validators are
lifted to this scheme directly. The group_descriptor validator needs to
be split into two pieces. The first part only needs the gd buffer and
is passed to ocfs2_read_block(). The second part requires the dinode as
well, and is called every time. It's only 3 compares, so it's tiny.
This also allows us to clean up the non-fatal gd check used by resize.c.
It now has no magic argument.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:53 +0800
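
The behaviour described in 970e4936d, where a validator runs only for blocks that actually came off disk, can be modelled with a tiny cache. This is a sketch under stated assumptions: toy_buffer, toy_read_block and toy_check are hypothetical, the needs_validate field merely plays the role of BH_NeedsValidate, and the "disk" is an in-memory array.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NBLOCKS 4

struct toy_buffer {
    int uptodate;        /* contents are already in the cache */
    int needs_validate;  /* plays the role of BH_NeedsValidate */
    unsigned data;
};

typedef int (*toy_validate_fn)(const struct toy_buffer *bh);

static unsigned toy_disk[TOY_NBLOCKS] = { 11, 22, 0, 44 };  /* 0 = corrupt */
static struct toy_buffer toy_cache[TOY_NBLOCKS];

int toy_validate_calls;  /* counts validator invocations for the demo */

/* Example validator: a zero payload is treated as corruption. */
int toy_check(const struct toy_buffer *bh)
{
    toy_validate_calls++;
    return bh->data == 0 ? -1 : 0;
}

/* Read one block; run validate only if the data came from "disk". */
int toy_read_block(unsigned blkno, struct toy_buffer **out,
                   toy_validate_fn validate)
{
    struct toy_buffer *bh;
    int err;

    assert(blkno < TOY_NBLOCKS && out != NULL);
    bh = &toy_cache[blkno];

    if (!bh->uptodate) {
        bh->data = toy_disk[blkno];            /* simulated I/O */
        bh->uptodate = 1;
        bh->needs_validate = (validate != NULL);
    }

    if (bh->needs_validate) {
        err = validate(bh);
        bh->needs_validate = 0;
        if (err) {
            bh->uptodate = 0;                  /* do not keep bad data cached */
            return err;
        }
    }

    *out = bh;
    return 0;
}
```

A second read of the same block is served from the cache and skips the validator, which is the saving the commit is after.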
4ae1d69be ocfs2: Wrap xattr block reads in a dedicated function

We weren't consistently checking xattr blocks after we read them.
Most places checked the signature, but none checked xb_blkno or
xb_fs_signature. Create a toplevel ocfs2_read_xattr_block() that does
the read and the validation.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:53 +0800
a22305cc6 ocfs2: Wrap dirblock reads in a dedicated function.

We have ocfs2_bread() as a vestige of the original ext-based dir code.
It's only used by directories, though. Turn it into
ocfs2_read_dir_block(), with a prototype matching the other metadata
read functions. It's set up to validate dirblocks when the time comes.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:53 +0800
5e96581a3 ocfs2: Wrap extent block reads in a dedicated function.

We weren't consistently checking extent blocks after we read them.
Most places checked the signature, but none checked h_blkno or
h_fs_signature. Create a toplevel ocfs2_read_extent_block() that does
the read and the validation.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:53 +0800
420353061 ocfs2: Morph the haphazard OCFS2_IS_VALID_GROUP_DESC() checks.

Random places in the code would check a group descriptor bh to see if it
was valid. The previous commit unified descriptor block reads,
validating all block reads in the same place. Thus, these checks are no
longer necessary. Rather than eliminate them, however, we change them
to BUG_ON() checks. This ensures the assumptions remain true. All of
the code paths to these checks have been audited to ensure they come
from a validated descriptor read.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:53 +0800
68f64d471 ocfs2: Wrap group descriptor reads in a dedicated function.

We have a clean call for validating group descriptors, but every place
that wants them always does a read_block()+validate() call pair. Create
a toplevel ocfs2_read_group_descriptor() that does the right
thing. This allows us to leverage the single call point later for
fancier handling. We also add validation of gd->bg_generation against
the superblock and gd->bg_blkno against the block we thought we read.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:53 +0800
57e3e7971 ocfs2: Consolidate validation of group descriptors.

Currently the validation of group descriptors is directly duplicated so
that one version can error the filesystem and the other (resize) can
just report the problem. Consolidate to one function that takes a
boolean. Wrap that function with the old call for the old users.

This is in preparation for lifting the read+validate step into a
single function.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:53 +0800
10995aa24 ocfs2: Morph the haphazard OCFS2_IS_VALID_DINODE() checks.

Random places in the code would check a dinode bh to see if it was
valid. Not only did they do different levels of validation, they
handled errors in different ways.

The previous commit unified inode block reads, validating all block
reads in the same place. Thus, these haphazard checks are no longer
necessary. Rather than eliminate them, however, we change them to
BUG_ON() checks. This ensures the assumptions remain true. All of the
code paths to these checks have been audited to ensure they come from a
validated inode read.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:52 +0800
b657c95c1 ocfs2: Wrap inode block reads in a dedicated function.

The ocfs2 code currently reads inodes off disk with a simple
ocfs2_read_block() call. Each place that does this has a different set
of sanity checks it performs. Some check only the signature. A couple
validate the block number (the block read vs di->i_blkno). A couple
others check for VALID_FL. Only one place validates i_fs_generation. A
couple check nothing. Even when an error is found, they don't all do
the same thing.

We wrap inode reading into ocfs2_read_inode_block(). This will validate
all the above fields, going readonly if they are invalid (they never
should be). ocfs2_read_inode_block_full() is provided for the places
that want to pass read_block flags. Every caller is passing a struct
inode with a valid ip_blkno, so we don't need a separate blkno argument
either.

We will remove the validation checks from the rest of the code in a
later commit, as they are no longer necessary.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:36:52 +0800
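
The checks that b657c95c1 gathers in one place (signature, block number, VALID_FL, fs generation) can be sketched as a single validator. The structure layout, the signature string and the names toy_dinode and toy_validate_inode are assumptions made for this illustration; the real ocfs2 dinode and its error handling (going readonly) are not reproduced.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_INODE_SIGNATURE "INODE01"   /* hypothetical 7-character signature */
#define TOY_VALID_FL        0x1u        /* hypothetical "inode is valid" flag */

struct toy_dinode {
    char     signature[8];
    uint64_t blkno;          /* block the inode believes it lives in */
    uint32_t flags;
    uint32_t fs_generation;
};

/* Returns 0 if all checks pass, a distinct negative value otherwise. */
int toy_validate_inode(const struct toy_dinode *di, uint64_t blkno_read,
                       uint32_t fs_generation)
{
    assert(di != NULL);

    if (memcmp(di->signature, TOY_INODE_SIGNATURE, sizeof(di->signature)) != 0)
        return -1;                       /* wrong signature */
    if (di->blkno != blkno_read)
        return -2;                       /* not the block we asked for */
    if (!(di->flags & TOY_VALID_FL))
        return -3;                       /* inode not marked valid */
    if (di->fs_generation != fs_generation)
        return -4;                       /* belongs to another filesystem */
    return 0;
}
```

Running every check in one function means all callers see the same level of validation and the same failure behaviour.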
a68979b85 ocfs2: add mount option and Kconfig option for acl

This patch adds the Kconfig option "CONFIG_OCFS2_FS_POSIX_ACL"
and the mount option "acl" to enable ACLs in Ocfs2.

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2009-01-06 00:36:52 +0800
89c38bd0a ocfs2: add ocfs2_init_acl in mknod

We need to get the parent directory's ACLs and let the new child inherit them.
To do this, we add additional calculations for data/metadata allocation.

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2009-01-06 00:34:20 +0800
060bc66dd ocfs2: add ocfs2_acl_chmod

This function is used to update acl xattrs during file mode changes.

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2009-01-06 00:34:20 +0800