Eric Lee / smarc-fsl-linux-kernel

25 May, 2010

1 commit

4be929be3 kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN ... Browse Code »

- C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not
USHORT_MAX/SHORT_MAX/SHORT_MIN.

- Make SHRT_MIN of type s16, not int, for consistency.

[akpm@linux-foundation.org: fix drivers/dma/timb_dma.c]
[akpm@linux-foundation.org: fix security/keys/keyring.c]
Signed-off-by: Alexey Dobriyan
Acked-by: WANG Cong
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2010-05-25 23:07:02 +0800

22 May, 2010

10 commits

75fe0a247 ocfs2: replace inode uid,gid,mode initialization with helper function ... Browse Code »

Acked-by: Joel Becker
Signed-off-by: Dmitry Monakhov
Signed-off-by: Al Viro

Dmitry Monakhov
2010-05-22 06:31:25 +0800
537d81ca7 ocfs: constify xattr_handler ... Browse Code »

Signed-off-by: Stephen Hemminger
Signed-off-by: Al Viro

Stephen Hemminger
2010-05-22 06:31:20 +0800
c06bcbfa1 ocfs2: Fix lock inversion in quotas during umount ... Browse Code »

We cannot cancel delayed work from ocfs2_local_free_info because that is called
with dqonoff_mutex held and the work it cancels requires dqonoff_mutex to
finish. Cancel the work before acquiring dqonoff_mutex.

Acked-by: Joel Becker
Signed-off-by: Jan Kara

Jan Kara
2010-05-22 01:30:48 +0800
52a9ee281 ocfs2: Use __dquot_transfer to avoid lock inversion ... Browse Code »

dquot_transfer() acquires own references to dquots via dqget(). Thus it waits
for dq_lock which creates a lock inversion because dq_lock ranks above
transaction start but transaction is already started in ocfs2_setattr(). Fix
the problem by passing own references directly to __dquot_transfer.

Acked-by: Joel Becker
Signed-off-by: Jan Kara

Jan Kara
2010-05-22 01:30:48 +0800
741e12893 ocfs2: Fix NULL pointer deref when writing local dquot ... Browse Code »

commit_dqblk() can write quota info to global file. That is actually a bad
thing to do because if we are just modifying local quota file, we are not
prepared (do not hold proper locks, do not have transaction credits) to do
a modification of the global quota file. So do not use commit_dqblk() and
instead call our writing function directly.

Acked-by: Joel Becker
Signed-off-by: Jan Kara

Jan Kara
2010-05-22 01:30:48 +0800
832d09cf1 ocfs2: Fix estimate of credits needed for quota allocation ... Browse Code »

We were missing reservation of a journal credit for modification of quota
file inode when creating new dquot structure in the global quota file.

Acked-by: Joel Becker
Signed-off-by: Jan Kara

Jan Kara
2010-05-22 01:30:47 +0800
fb8dd8d78 ocfs2: Fix quota locking ... Browse Code »

OCFS2 had three issues with quota locking:
a) When reading dquot from global quota file, we started a transaction while
holding dqio_mutex which is prone to deadlocks because other paths do it
the other way around
b) During ocfs2_sync_dquot we were not protected against concurrent writers
on the same node. Because we first copy data to local buffer, a race
could happen resulting in old data being written to global quota file and
thus causing quota inconsistency after a crash.
c) ip_alloc_sem of quota files was acquired while a transaction is started
in ocfs2_quota_write which can deadlock because we first get ip_alloc_sem
and then start a transaction when extending quota files.

We fix the problem a) by pulling all necessary code to ocfs2_acquire_dquot
and ocfs2_release_dquot. Thus we no longer depend on generic dquot_acquire
to do the locking and can force proper lock ordering.

Problems b) and c) are fixed by locking i_mutex and ip_alloc_sem of
global quota file in ocfs2_lock_global_qf and removing ip_alloc_sem from
ocfs2_quota_read and ocfs2_quota_write.

Acked-by: Joel Becker
Signed-off-by: Jan Kara

Jan Kara
2010-05-22 01:30:47 +0800
ae4f6ef13 ocfs2: Avoid unnecessary block mapping when refreshing quota info ... Browse Code »

The position of global quota file info does not change. So we do not have
to do logical -> physical block translation every time we reread it from
disk. Thus we can also avoid taking ip_alloc_sem.

Acked-by: Joel Becker
Signed-off-by: Jan Kara

Jan Kara
2010-05-22 01:30:46 +0800
f64dd44eb ocfs2: Do not map blocks from local quota file on each write ... Browse Code »

There is no need to map offset of local dquot structure to on disk block
in each quota write. It is enough to map it just once and store the physical
block number in quota structure in memory. Moreover this simplifies locking
as we do not have to take ip_alloc_sem from quota write path.

Acked-by: Joel Becker
Signed-off-by: Jan Kara

Jan Kara
2010-05-22 01:30:46 +0800
12755627b quota: unify quota init condition in setattr ... Browse Code »

Quota must being initialized if size or uid/git changes requested.
But initialization performed in two different places:
in case of i_size file system is responsible for dquot init
, but in case of uid/gid init will be called internally in
dquot_transfer().
This ambiguity makes code harder to understand.
Let's move this logic to one common helper function.

Signed-off-by: Dmitry Monakhov
Signed-off-by: Jan Kara

Dmitry Monakhov
2010-05-22 01:30:45 +0800

21 May, 2010

1 commit

03e62303c Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 ... Browse Code »

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (47 commits)
ocfs2: Silence a gcc warning.
ocfs2: Don't retry xattr set in case value extension fails.
ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break
ocfs2: Reset xattr value size after xa_cleanup_value_truncate().
fs/ocfs2/dlm: Use kstrdup
fs/ocfs2/dlm: Drop memory allocation cast
Ocfs2: Optimize punching-hole code.
Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public.
Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing.
Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead.
ocfs2: Block signals for mkdir/link/symlink/O_CREAT.
ocfs2: Wrap signal blocking in void functions.
ocfs2/dlm: Increase o2dlm lockres hash size
ocfs2: Make ocfs2_extend_trans() really extend.
ocfs2/trivial: Code cleanup for allocation reservation.
ocfs2: make ocfs2_adjust_resv_from_alloc simple.
ocfs2: Make nointr a default mount option
ocfs2/dlm: Make o2dlm domain join/leave messages KERN_NOTICE
o2net: log socket state changes
ocfs2: print node # when tcp fails
...

Linus Torvalds
2010-05-21 22:20:17 +0800

19 May, 2010

11 commits

18d3a98f3 ocfs2: Silence a gcc warning. ... Browse Code »

ocfs2_block_group_claim_bits() is never called with min_bits=0, but we
shouldn't leave status undefined if it ever is.

Signed-off-by: Joel Becker

Joel Becker
2010-05-19 07:48:41 +0800
5f5261acb ocfs2: Don't retry xattr set in case value extension fails. ... Browse Code »

In normal xattr set, the set sequence is inode, xattr block
and finally xattr bucket if we meet with a ENOSPC. But there
is a corner case.
So consider we will set a xattr whose value will be stored in
a cluster, and there is no xattr block by now. So we will
reserve 1 xattr block and 1 cluster for setting it. Now if we
fail in value extension(in case the volume is almost full and
we can't allocate the cluster because the check in
ocfs2_test_bg_bit_allocatable), ENOSPC will be returned. So
we will try to create a bucket(this time there is a chance that
the reserved cluster will be used), and when we try value extension
again, kernel bug happens. We did meet with it. Check the bug below.
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1251

This patch just try to avoid this by adding a set_abort in
ocfs2_xattr_set_ctxt, so in case ENOSPC happens in value extension,
we will check whether it is caused by the real ENOSPC or just the
full of inode or xattr block. If it is the first case, we set set_abort
so that we don't try any further. we are safe to exit directly here
ince it is really ENOSPC.

Signed-off-by: Tao Ma
Signed-off-by: Joel Becker

Tao Ma
2010-05-19 07:41:39 +0800
d9ef75221 ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break ... Browse Code »

Currently we process a dirty lockres with the lockres->spinlock taken. While
during the process, we may need to lock on dlm->ast_lock. This breaks the
dependency of dlm->ast_lock(lock first) and lockres->spinlock(lock second).

This patch fixes the problem.
Since we can't release lockres->spinlock, we have to take dlm->ast_lock
just before taking the lockres->spinlock and release it after lockres->spinlock
is released. And use __dlm_queue_bast()/__dlm_queue_ast(), the nolock version,
in dlm_shuffle_lists(). There are no too many locks on a lockres, so there is no
performance harm.

Signed-off-by: Wengang Wang
Signed-off-by: Joel Becker

Wengang Wang
2010-05-19 07:41:34 +0800
d5a7df064 ocfs2: Reset xattr value size after xa_cleanup_value_truncate(). ... Browse Code »

In ocfs2_prepare_xattr_entry, if we fail to grow an existing value,
xa_cleanup_value_truncate() will leave the old entry in place. Thus, we
reset its value size. However, if we were allocating a new value, we
must not reset the value size or we will BUG(). This resolves
oss.oracle.com bug 1247.

Signed-off-by: Tao Ma
Signed-off-by: Joel Becker

Tao Ma
2010-05-19 07:41:21 +0800
41841b0bc Merge branch 'discontig-bg' of git://oss.oracle.com/git/tma/linux-2.6 into ocfs2-merge-window Browse Code »

Joel Becker
2010-05-19 07:40:42 +0800
316ce2ba8 fs/ocfs2/dlm: Use kstrdup ... Browse Code »

Use kstrdup when the goal of an allocation is copy a string into the
allocated region.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

//
@@
expression from,to;
expression flag,E1,E2;
statement S;
@@

- to = kmalloc(strlen(from) + 1,flag);
+ to = kstrdup(from, flag);
... when != \(from = E1 \| to = E1 \)
if (to==NULL || ...) S
... when != \(from = E2 \| to = E2 \)
- strcpy(to, from);
//

Signed-off-by: Julia Lawall
Signed-off-by: Joel Becker

Julia Lawall
2010-05-19 03:31:11 +0800
3914ed0ce fs/ocfs2/dlm: Drop memory allocation cast ... Browse Code »

Drop cast on the result of kmalloc and similar functions.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

//
@@
type T;
@@

- (T *)
(\(kmalloc\|kzalloc\|kcalloc\|kmem_cache_alloc\|kmem_cache_zalloc\|
kmem_cache_alloc_node\|kmalloc_node\|kzalloc_node\)(...))
//

Signed-off-by: Julia Lawall
Signed-off-by: Joel Becker

Julia Lawall
2010-05-19 03:31:10 +0800
c1631d4a4 Ocfs2: Optimize punching-hole code. ... Browse Code »

This patch simplifies the logic of handling existing holes and
skipping extent blocks and removes some confusing comments.

The patch survived the fill_verify_holes testcase in ocfs2-test.
It also passed my manual sanity check and stress tests with enormous
extent records.

Currently punching a hole on a file with 3+ extent tree depth was
really a performance disaster. It can even take several hours,
though we may not hit this in real life with such a huge extent
number.

One simple way to improve the performance is quite straightforward.
From the logic of truncate, we can punch the hole from hole_end to
hole_start, which reduces the overhead of btree operations in a
significant way, such as tree rotation and moving.

Following is the testing result when punching hole from 0 to file end
in bytes, on a 1G file, 1G file consists of 256k extent records, each record
cover 4k data(just one cluster, clustersize is 4k):

===========================================================================
* Original punching-hole mechanism:
===========================================================================

I waited 1 hour for its completion, unfortunately it's still ongoing.

===========================================================================
* Patched punching-hode mechanism:
===========================================================================

real 0m2.518s
user 0m0.000s
sys 0m2.445s

That means we've gained up to 1000 times improvement on performance in this
case, whee! It's fairly cool. and it looks like that performance gain will
be raising when extent records grow.

The patch was based on my former 2 patches, which were about truncating
codes optimization and fixup to handle CoW on punching hole.

Signed-off-by: Tristan Ye
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker

Tristan Ye
2010-05-19 03:31:05 +0800
ee149a7c6 Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public. ... Browse Code »

The original idea to pull ocfs2_find_cpos_for_left_leaf() out of
alloc.c is to benefit punching-holes optimization patch, it however,
can also be referred by other funcs in the future who want to do the
same job.

Signed-off-by: Tristan Ye
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker

Tristan Ye
2010-05-19 03:28:13 +0800
e8aec068e Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing. ... Browse Code »

Based on the previous patch of optimizing truncate, the bugfix for
refcount trees when punching holes can be fairly easy
and straightforward since most of work we should take into account for
refcounting have been completed already in ocfs2_remove_btree_range().

This patch performs CoW for refcounted extents when a hole being punched
whose start or end offset were in the middle of a cluster, which means
partial zeroing of the cluster will be performed soon.

The patch has been tested fixing the following bug:

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1216

Signed-off-by: Tristan Ye
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker

Tristan Ye
2010-05-19 03:27:46 +0800
78f94673d Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead. ... Browse Code »

Truncate is just a special case of punching holes(from new i_size to
end), we therefore could take advantage of the existing
ocfs2_remove_btree_range() to reduce the comlexity and redundancy in
alloc.c. The goal here is to make truncate more generic and
straightforward.

Several functions only used by ocfs2_commit_truncate() will smiply be
removed.

ocfs2_remove_btree_range() was originally used by the hole punching
code, which didn't take refcount trees into account (definitely a bug).
We therefore need to change that func a bit to handle refcount trees.
It must take the refcount lock, calculate and reserve blocks for
refcount tree changes, and decrease refcounts at the end. We replace
ocfs2_lock_allocators() here by adding a new func
ocfs2_reserve_blocks_for_rec_trunc() which accepts some extra blocks to
reserve. This will not hurt any other code using
ocfs2_remove_btree_range() (such as dir truncate and hole punching).

I merged the following steps into one patch since they may be
logically doing one thing, though I know it looks a little bit fat
to review.

1). Remove redundant code used by ocfs2_commit_truncate(), since we're
moving to ocfs2_remove_btree_range anyway.

2). Add a new func ocfs2_reserve_blocks_for_rec_trunc() for purpose of
accepting some extra blocks to reserve.

3). Change ocfs2_prepare_refcount_change_for_del() a bit to fit our
needs. It's safe to do this since it's only being called by
truncate.

4). Change ocfs2_remove_btree_range() a bit to take refcount case into
account.

5). Finally, we change ocfs2_commit_truncate() to call
ocfs2_remove_btree_range() in a proper way.

The patch has been tested normally for sanity check, stress tests
with heavier workload will be expected.

Based on this patch, fixing the punching holes bug will be fairly easy.

Signed-off-by: Tristan Ye
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker

Tristan Ye
2010-05-19 03:25:10 +0800

11 May, 2010

2 commits

547ba7c8e ocfs2: Block signals for mkdir/link/symlink/O_CREAT. ... Browse Code »

Once file or link creation gets going, it can't be interrupted by a
signal. They're not idempotent.

This blocks signals in ocfs2_mknod(), ocfs2_link(), and ocfs2_symlink()
once we start actually changing things. ocfs2_mknod() covers mknod(),
creat(), mkdir(), and open(O_CREAT).

Signed-off-by: Joel Becker

Joel Becker
2010-05-11 02:56:52 +0800
e4b963f10 ocfs2: Wrap signal blocking in void functions. ... Browse Code »

ocfs2 sometimes needs to block signals around dlm operations, but it
currently does it with sigprocmask(). Even worse, it's checking the
error code of sigprocmask(). The in-kernel sigprocmask() can only error
if you get the SIG_* argument wrong. We don't.

Wrap the sigprocmask() calls with ocfs2_[un]block_signals(). These
functions are void, but they will BUG() if somehow sigprocmask() returns
an error.

Signed-off-by: Joel Becker

Joel Becker
2010-05-11 02:50:10 +0800

06 May, 2010

15 commits

0467ae954 ocfs2/dlm: Increase o2dlm lockres hash size ... Browse Code »

Lockres hash size of 16KB is far too small for large filesystems (where we
have hundreds of thousands of lock resources stored in the table).
This patch increases it to 128KB.

Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2010-05-06 09:20:01 +0800
c901fb007 ocfs2: Make ocfs2_extend_trans() really extend. ... Browse Code »

In ocfs2, we use ocfs2_extend_trans() to extend a journal handle's
blocks. But if jbd2_journal_extend() fails, it will only restart
with the the new number of blocks. This tends to be awkward since
in most cases we want additional reserved blocks. It makes our code
harder to mantain since the caller can't be sure all the original
blocks will not be accessed and dirtied again. There are 15 callers
of ocfs2_extend_trans() in fs/ocfs2, and 12 of them have to add
h_buffer_credits before they call ocfs2_extend_trans(). This makes
ocfs2_extend_trans() really extend atop the original block count.

Signed-off-by: Tao Ma
Signed-off-by: Joel Becker

Tao Ma
2010-05-06 09:18:09 +0800
3e4218df3 ocfs2/trivial: Code cleanup for allocation reservation. ... Browse Code »

Two tiny cleanup for allocation reservation.
1. Remove some extra codes in ocfs2_local_alloc_find_clear_bits.
2. Remove an unuseful variables in ocfs2_find_resv_lhs.

Signed-off-by: Tao Ma
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker

Tao Ma
2010-05-06 09:18:09 +0800
b065556a7 ocfs2: make ocfs2_adjust_resv_from_alloc simple. ... Browse Code »

When we allocate some bits from the reservation, we always
allocate from the r_start(see ocfs2_resmap_resv_bits).
So there should be no reason to check between r_start
and start. And I don't think we will change this behaviour
later by allocating from some bits after r_start. Why not make
ocfs2_adjust_resv_from_alloc simple for now?

The only chance we have to adjust the reservation is when we haven't
reached the end. With this patch, the function is more readable.

Note:
btw, this patch also fixes an original bug in the function
which I haven't found before.
if (end < ocfs2_resv_end(resv))
rhs = end - ocfs2_resv_end(resv);
This code is of course buggy. ;)

Signed-off-by: Tao Ma
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker

Tao Ma
2010-05-06 09:18:09 +0800
4b37fcb7d ocfs2: Make nointr a default mount option ... Browse Code »

OCFS2 has never really supported intr. This patch acknowledges this reality
and makes nointr the default mount option. In a later patch, we intend to
support intr.

Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2010-05-06 09:18:08 +0800
5c80d4c9e ocfs2/dlm: Make o2dlm domain join/leave messages KERN_NOTICE ... Browse Code »

o2dlm join and leave messages are more than informational as they are
required for debugging locking issues. This patch changes them from
KERN_INFO to KERN_NOTICE.

Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2010-05-06 09:18:08 +0800
23fd9abdc o2net: log socket state changes ... Browse Code »

This patch logs socket state changes that lead to socket shutdown.

Signed-off-by: Srinivas Eeda
Signed-off-by: Joel Becker

Srinivas Eeda
2010-05-06 09:18:08 +0800
a5196ec5e ocfs2: print node # when tcp fails ... Browse Code »

Print the node number of a peer node if sending it a message failed.

Signed-off-by: Wengang Wang
Signed-off-by: Joel Becker

Wengang Wang
2010-05-06 09:18:08 +0800
83f92318f ocfs2: Add dir_resv_level mount option ... Browse Code »

The default behavior for directory reservations stays the same, but we add a
mount option so people can tweak the size of directory reservations
according to their workloads.

Signed-off-by: Mark Fasheh
Signed-off-by: Joel Becker

Mark Fasheh
2010-05-06 09:18:07 +0800
b07f8f24d ocfs2: change default reservation window sizes ... Browse Code »

The default reservation size of 4 (32-bit windows) is a bit too ambitious.
Scale it back to 16 bits (resv_level=2). I have been testing various sizes
on a 4-node cluster which runs a mixed workload that is heavily threaded.
With a 256MB local alloc, I get *roughly* the following levels of average file
fragmentation:

resv_level=0 70%
resv_level=1 21%
resv_level=2 23%
resv_level=3 24%
resv_level=4 60%
resv_level=5 did not test
resv_level=6 60%

resv_level=2 seemed like a good compromise between not letting windows be
too small, but not so big that heavier workloads will immediately suffer
without tuning.

This patch also change the behavior of directory reservations - they now
track file reservations. The previous compromise of giving directory
windows only 8 bits wound up fragmenting more at some window sizes because
file allocations had smaller unused windows to poach from.

Signed-off-by: Mark Fasheh
Signed-off-by: Joel Becker

Mark Fasheh
2010-05-06 09:18:07 +0800
6b82021b9 ocfs2: increase the default size of local alloc windows ... Browse Code »

I have observed that the current size of 8M gives us pretty poor
fragmentation on multi-threaded workloads which do lots of writes.

Generally, I can increase the size of local alloc windows and observe a
marked decrease in fragmentation, even up and beyond window sizes of 512
megabytes. This makes sense for a couple reasons - larger local alloc means
more room for reservation windows. On multi-node workloads the larger local
alloc helps as well because we don't have to do window slides as often.

Also, I removed the OCFS2_DEFAULT_LOCAL_ALLOC_SIZE constant as it is no
longer used and the comment above it was out of date.

To test fragmentation, I used a workload which launched 4 threads that did
4k writes into a series of about 140 alternating files.

With resv_level=2, and a 4k/4k file system I observed the following average
fragmentation for various localalloc= parameters:

localalloc= avg. fragmentation
8 48
32 16
64 10
120 7

On larger cluster sizes, the difference is more dramatic.

The new default size top out at 256M, which we'll only get for cluster
sizes of 32K and above.

Signed-off-by: Mark Fasheh
Signed-off-by: Joel Becker

Mark Fasheh
2010-05-06 09:18:07 +0800
73c8a8000 ocfs2: clean up localalloc mount option size parsing ... Browse Code »

This patch pulls the local alloc sizing code into localalloc.c and provides
a callout to it from ocfs2_fill_super(). Behavior is essentially unchanged
except that I correctly calculate the maximum local alloc size. The old code
in ocfs2_parse_options() calculated the max size as:

ocfs2_local_alloc_size(sb) * 8

which is correct, in bits. Unfortunately though the option passed in is in
megabytes. Ultimately, this bug made no real difference - the shrink code
would catch a too-large size and bring it down to something reasonable.
Still, it's less than efficient as-is.

Signed-off-by: Mark Fasheh
Signed-off-by: Joel Becker

Mark Fasheh
2010-05-06 09:18:06 +0800
a57c8fd2a ocfs2: remove ocfs2_local_alloc_in_range() ... Browse Code »

Inodes are always allocated from the global bitmap now so we don't need this
any more. Also, the existing implementation bounces reservations around
needlessly.

Signed-off-by: Mark Fasheh

Mark Fasheh
2010-05-06 09:17:31 +0800
33d5d380d ocfs2: allocate btree internal block groups from the global bitmap ... Browse Code »

Otherwise, the need for a very large contiguous allocation tends to
wreak havoc on many inode allocation reservations on the local alloc, thus
ruining any chances for contiguousness.

Signed-off-by: Mark Fasheh

Mark Fasheh
2010-05-06 09:17:31 +0800
e3b4a97db ocfs2: use allocation reservations for directory data ... Browse Code »

Use the reservations system for unindexed dir tree allocations. We don't
bother with the indexed tree as reads from it are mostly random anyway.
Directory reservations are marked seperately, to allow the reservations code
a chance to optimize their window sizes. This patch allocates only 8 bits
for directory windows as they generally are not expected to grow as quickly
as file data. Future improvements to dir window sizing can trivially be
made.

Signed-off-by: Mark Fasheh

Mark Fasheh
2010-05-06 09:17:30 +0800