23 Oct, 2010

10 commits


10 Aug, 2010

4 commits

  • [folded build fix from sfr]

    Signed-off-by: Al Viro

    Al Viro
     
  • Replace inode_setattr with opencoded variants of it in all callers. This
    moves the remaining call to vmtruncate into the filesystem methods where it
    can be replaced with the proper truncate sequence.

    In a few cases it was obvious that we would never end up calling vmtruncate
    so it was left out in the opencoded variant:

    spufs: explicitly checks for ATTR_SIZE earlier
    btrfs,hugetlbfs,logfs,dlmfs: explicitly clears ATTR_SIZE earlier
    ufs: contains an opencoded simple_setattr + truncate that sets the filesize just above

    In addition, ncpfs called inode_setattr with handcrafted iattrs,
    which allowed the opencoded variant to be trimmed down.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Move the call to vmtruncate, which gets rid of excess blocks, into the
    callers in preparation for the new truncate sequence, and rename the
    non-truncating version to block_write_begin.

    While we're at it also remove several unused arguments to block_write_begin.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Move the call to vmtruncate, which gets rid of excess blocks, into the
    callers in preparation for the new truncate calling sequence. This was
    only done for DIO_LOCKING filesystems, so the __blockdev_direct_IO_newtrunc
    variant was not needed anyway. Get rid of blockdev_direct_IO_no_locking and
    its _newtrunc variant while at it, as just opencoding the two additional
    parameters is shorter than the name suffix.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

22 May, 2010

1 commit


10 May, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming their availability. As this
    conversion needs to touch a large number of source files, the following
    script was used as the basis of the conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. gfp.h if only gfp is
    used, slab.h if slab is used.

    * When the script inserts a new include, it looks at the include
    blocks and tries to place the new include so that its order conforms
    to its surroundings. It is put in the include block which contains
    core kernel includes, in the same order that the rest are ordered:
    alphabetical, Christmas tree, reverse Christmas tree, or at the end
    if there doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition, while for others adding it to an
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as the contents of gfp.h were
    usually widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as a bisection point.

    Given that I had only a couple of failures from the tests in step 7,
    I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the
    arch headers, which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

27 Nov, 2009

2 commits


20 Nov, 2009

2 commits

  • Previously, nilfs_bmap_add_blocks() and nilfs_bmap_sub_blocks() called
    mark_inode_dirty() after they changed the number of data blocks.

    This moves these calls outside the outermost bmap functions such as
    nilfs_bmap_insert() or nilfs_bmap_truncate().

    This mitigates overhead for truncate or delete operations, since
    they repeatedly remove sets of blocks. A nearly 10 percent improvement
    was observed for removal of a large file:

    # dd if=/dev/zero of=/test/aaa bs=1M count=512
    # time rm /test/aaa

    real 2.968s -> 2.705s

    Further optimization may be possible by eliminating these
    mark_inode_dirty() uses though I avoid mixing separate changes here.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • This lock can be eliminated because inodes on the buffer can be updated
    independently. Although the log writer also fills in bmap data on the
    on-disk inodes, that update is serialized by the log writer lock.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     

15 Nov, 2009

1 commit


29 Sep, 2009

1 commit


22 Sep, 2009

1 commit


14 Sep, 2009

1 commit


24 Jun, 2009

1 commit


10 Jun, 2009

3 commits

  • Although the get_block() callback function can return an extent of
    contiguous blocks via bh->b_size, the nilfs_get_block() function did
    not support this feature.

    This adds a contiguous lookup feature to the block mapping code of
    nilfs, and allows the nilfs_get_blocks() function to return extent
    information by applying the feature.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • This adds a missing sync_page method, which unplugs bio requests when
    waiting for page locks. This improves the read performance of nilfs.

    Here is a measurement result using dd command.

    Without this patch:

    # mount -t nilfs2 /dev/sde1 /test
    # dd if=/test/aaa of=/dev/null bs=512k
    1024+0 records in
    1024+0 records out
    536870912 bytes (537 MB) copied, 6.00688 seconds, 89.4 MB/s

    With this patch:

    # mount -t nilfs2 /dev/sde1 /test
    # dd if=/test/aaa of=/dev/null bs=512k
    1024+0 records in
    1024+0 records out
    536870912 bytes (537 MB) copied, 3.54998 seconds, 151 MB/s

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • Hi,

    I introduced "is_partially_uptodate" aops for NILFS2.

    A page can have multiple buffers, and even if the page is not uptodate,
    some of those buffers can be uptodate in environments where
    pagesize != blocksize.
    This aops checks that all buffers which correspond to the part of the
    file that we want to read are uptodate. If so, we do not have to issue
    an actual read I/O to the disk even if the page is not uptodate, because
    the portion we want to read is uptodate.
    The "block_is_partially_uptodate" function is already used by ext2/3/4.
    With the following patch, random read/write mixed workloads or random
    reads after random writes can be optimized, giving a performance
    improvement.

    I did a performance test using the sysbench.

    1 --file-block-size=8K --file-total-size=2G --file-test-mode=rndrw --file-fsync-freq=0 --file-rw-ratio=1 run

    -2.6.30-rc5

    Test execution summary:
    total time: 151.2907s
    total number of events: 200000
    total time taken by event execution: 2409.8387
    per-request statistics:
    min: 0.0000s
    avg: 0.0120s
    max: 0.9306s
    approx. 95 percentile: 0.0439s

    Threads fairness:
    events (avg/stddev): 12500.0000/238.52
    execution time (avg/stddev): 150.6149/0.01

    -2.6.30-rc5-patched

    Test execution summary:
    total time: 140.8828s
    total number of events: 200000
    total time taken by event execution: 2240.8577
    per-request statistics:
    min: 0.0000s
    avg: 0.0112s
    max: 0.8750s
    approx. 95 percentile: 0.0418s

    Threads fairness:
    events (avg/stddev): 12500.0000/218.43
    execution time (avg/stddev): 140.0536/0.01

    arch: ia64
    pagesize: 16k

    Thanks.

    Signed-off-by: Hisashi Hifumi
    Signed-off-by: Ryusuke Konishi

    Hisashi Hifumi
     

07 Apr, 2009

6 commits

  • After reviewing user feedback while looking for other compatibility
    issues, I found that nilfs improperly initializes timestamps in the
    inode; CURRENT_TIME was used there instead of CURRENT_TIME_SEC even
    though nilfs didn't have nanosecond timestamps on disk. A few users
    reported that the tar program sometimes failed to expand symbolic links
    on nilfs, and this turned out to be the cause.

    Instead of applying the above replacement, I've decided to support
    nanosecond timestamps on this occasion. Fortunately, an unused 64-bit
    field was present in the nilfs_inode struct, and I found it usable for
    this purpose without impact on users.

    So, this makes the enhancement and resolves the tar problem.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • The sketch file is a file to mark checkpoints with user data. It was
    experimentally introduced in the original implementation and is now
    obsolete. The file was handled differently from regular files; its
    size got truncated when a checkpoint was created.

    This stops the special treatment and treats it as a regular file.
    Most users are not affected because mkfs.nilfs2 no longer creates this
    file.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • Pekka Enberg advised me:
    > It would be nice if BUG(), BUG_ON(), and panic() calls would be
    > converted to proper error handling using WARN_ON() calls. The BUG()
    > call in nilfs_cpfile_delete_checkpoints(), for example, looks to be
    > triggerable from user-space via the ioctl() system call.

    This will follow the comment and keep them to a minimum.

    Acked-by: Pekka Enberg
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • Pekka Enberg pointed out that the double error handling found after
    nilfs_transaction_end() can be avoided by separating the abort
    operation:

    > OK, I don't understand this. The only way nilfs_transaction_end() can
    > fail is if we have NILFS_TI_SYNC set and we fail to construct the
    > segment. But why do we want to construct a segment if we don't commit?
    >
    > I guess what I'm asking is why don't we have a separate
    > nilfs_transaction_abort() function that can't fail for the erroneous
    > case to avoid this double error value tracking thing?

    This does the separation and renames nilfs_transaction_end() to
    nilfs_transaction_commit() for clarification.

    Since some calls of these functions were used just for exclusion
    control against the segment constructor, they are replaced with
    semaphore operations.

    Acked-by: Pekka Enberg
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • Chris Mason pointed out that there is a missed sync issue in
    nilfs_writepages():

    On Wed, 17 Dec 2008 21:52:55 -0500, Chris Mason wrote:
    > It looks like nilfs_writepage ignores WB_SYNC_NONE, which is used by
    > do_sync_mapping_range().

    where WB_SYNC_NONE in do_sync_mapping_range() was replaced with
    WB_SYNC_ALL by Nick's patch (commit:
    ee53a891f47444c53318b98dac947ede963db400).

    This fixes the problem by letting nilfs_writepages() write out the log of
    file data within the range if sync_mode is WB_SYNC_ALL.

    This involves removal of nilfs_file_aio_write() which was previously
    needed to ensure O_SYNC sync writes.

    Cc: Chris Mason
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds inode level operations of the nilfs2 file system.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi