Doug / smarc-fsl-linux-kernel | Embedian Git Server

17 Oct, 2007

4 commits

03158cd7e fs: restore nobh ... Browse Code »

Implement nobh in new aops. This is a bit tricky. FWIW, nobh_truncate is
now implemented in a way that does not create blocks in sparse regions,
which is a silly thing for it to have been doing (isn't it?)

ext2 survives fsx and fsstress. jfs is converted as well... ext3
should be easy to do (but not done yet).

[akpm@linux-foundation.org: coding-style fixes]
Cc: Badari Pulavarty
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-10-17 00:42:58 +0800
a20fa20c5 With reiserfs no longer using the weird generic_cont_expand, remove it completely. ... Browse Code »

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-10-17 00:42:56 +0800
89e107877 fs: new cont helpers ... Browse Code »

Rework the generic block "cont" routines to handle the new aops. Supporting
cont_prepare_write would take quite a lot of code to support, so remove it
instead (and we later convert all filesystems to use it).

write_begin gets passed AOP_FLAG_CONT_EXPAND when called from
generic_cont_expand, so filesystems can avoid the old hacks they used.

Signed-off-by: Nick Piggin
Cc: OGAWA Hirofumi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-10-17 00:42:55 +0800
afddba49d fs: introduce write_begin, write_end, and perform_write aops ... Browse Code »

These are intended to replace prepare_write and commit_write with more
flexible alternatives that are also able to avoid the buffered write
deadlock problems efficiently (which prepare_write is unable to do).

[mark.fasheh@oracle.com: API design contributions, code review and fixes]
[akpm@linux-foundation.org: various fixes]
[dmonakhov@sw.ru: new aop block_write_begin fix]
Signed-off-by: Nick Piggin
Signed-off-by: Mark Fasheh
Signed-off-by: Dmitriy Monakhov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-10-17 00:42:55 +0800

19 Jul, 2007

1 commit

541716902 [FS] Implement block_page_mkwrite. ... Browse Code »

Many filesystems need a ->page-mkwrite callout to correctly
set up pages that have been written to by mmap. This is especially
important when mmap is writing into holes as it allows filesystems
to correctly account for and allocate space before the mmap
write is allowed to proceed.

Protection against truncate races is provided by locking the page
and checking to see whether the page mapping is correct and whether
it is beyond EOF so we don't end up allowing allocations beyond
the current EOF or changing EOF as a result of a mmap write.

SGI-PV: 940392
SGI-Modid: 2.6.x-xfs-melb:linux:29146a

Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin

David Chinner
2007-07-19 17:50:50 +0800

08 May, 2007

2 commits

f9a14399a mm: optimize kill_bdev() ... Browse Code »

Remove duplicate work in kill_bdev().

It currently invalidates and then truncates the bdev's mapping.
invalidate_mapping_pages() will opportunistically remove pages from the
mapping. And truncate_inode_pages() will forcefully remove all pages.

The only thing truncate doesn't do is flush the bh lrus. So do that
explicitly. This avoids (very unlikely) but possible invalid lookup
results if the same bdev is quickly re-issued.

It also will prevent extreme kernel latencies which are observed when
blockdevs which have a large amount of pagecache are unmounted, by avoiding
invalidate_mapping_pages() on that path. invalidate_mapping_pages() has no
cond_resched (it can be called under spinlock), whereas truncate_inode_pages()
has one.

[akpm@linux-foundation.org: restore nrpages==0 optimisation]
Signed-off-by: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2007-05-08 03:12:55 +0800
f98393a64 mm: remove destroy_dirty_buffers from invalidate_bdev() ... Browse Code »

Remove the destroy_dirty_buffers argument from invalidate_bdev(), it hasn't
been used in 6 years (so akpm says).

find * -name \*.[ch] | xargs grep -l invalidate_bdev |
while read file; do
quilt add $file;
sed -ie 's/invalidate_bdev($[^,]*$,[^)]*)/invalidate_bdev(\1)/g' $file;
done

Signed-off-by: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Peter Zijlstra
2007-05-08 03:12:55 +0800

13 Feb, 2007

2 commits

3991d3bd1 [PATCH] warning fix: unsigned->signed ... Browse Code »

While compiling my code with -Wconversion using gcc-trunk, I always get a
bunch of warrning from headers, here is fix for them:

__getblk is alawys called with unsigned argument,
but it takes signed, the same story with __bread,__breadahead and so on.

Signed-off-by: Tomasz Kvarsin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tomasz Kvarsin
2007-02-13 01:48:29 +0800
33a266dda [PATCH] Make BH_Unwritten a first class bufferhead flag V2 ... Browse Code »

Currently, XFS uses BH_PrivateStart for flagging unwritten extent state in a
bufferhead. Recently, I found the long standing mmap/unwritten extent
conversion bug, and it was to do with partial page invalidation not clearing
the unwritten flag from bufferheads attached to the page but beyond EOF. See
here for a full explaination:

http://oss.sgi.com/archives/xfs/2006-12/msg00196.html

The solution I have checked into the XFS dev tree involves duplicating code
from block_invalidatepage to clear the unwritten flag from the bufferhead(s),
and then calling block_invalidatepage() to do the rest.

Christoph suggested that this would be better solved by pushing the unwritten
flag into the common buffer head flags and just adding the call to
discard_buffer():

http://oss.sgi.com/archives/xfs/2006-12/msg00239.html

The following patch makes BH_Unwritten a first class citizen.

Signed-off-by: Dave Chinner
Acked-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Chinner
2007-02-13 01:48:27 +0800

17 Oct, 2006

1 commit

58ff407be [PATCH] Fix IO error reporting on fsync() ... Browse Code »

When IO error happens on metadata buffer, buffer is freed from memory and
later fsync() is called, filesystems like ext2 fail to report EIO. We

solve the problem by introducing a pointer to associated address space into
the buffer_head. When a buffer is removed from a list of metadata buffers
associated with an address space, IO error is transferred from the buffer to
the address space, so that fsync can later report it.

Signed-off-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2006-10-17 23:18:46 +0800

01 Oct, 2006

2 commits

9361401eb [PATCH] BLOCK: Make it possible to disable the block layer [try #6] ... Browse Code »

Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.

This patch does the following:

(*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
support.

(*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
an item that uses the block layer. This includes:

(*) Block I/O tracing.

(*) Disk partition code.

(*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.

(*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
block layer to do scheduling. Some drivers that use SCSI facilities -
such as USB storage - end up disabled indirectly from this.

(*) Various block-based device drivers, such as IDE and the old CDROM
drivers.

(*) MTD blockdev handling and FTL.

(*) JFFS - which uses set_bdev_super(), something it could avoid doing by
taking a leaf out of JFFS2's book.

(*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
however, still used in places, and so is still available.

(*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
parts of linux/fs.h.

(*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.

(*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.

(*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
is not enabled.

(*) fs/no-block.c is created to hold out-of-line stubs and things that are
required when CONFIG_BLOCK is not set:

(*) Default blockdev file operations (to give error ENODEV on opening).

(*) Makes some /proc changes:

(*) /proc/devices does not list any blockdevs.

(*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.

(*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.

(*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
given command other than Q_SYNC or if a special device is specified.

(*) In init/do_mounts.c, no reference is made to the blockdev routines if
CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.

(*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
error ENOSYS by way of cond_syscall if so).

(*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
CONFIG_BLOCK is not set, since they can't then happen.

Signed-Off-By: David Howells
Signed-off-by: Jens Axboe

David Howells
2006-10-01 02:52:31 +0800
cf9a2ae8d [PATCH] BLOCK: Move functions out of buffer code [try #6] ... Browse Code »

Move some functions out of the buffering code that aren't strictly buffering
specific. This is a precursor to being able to disable the block layer.

(*) Moved some stuff out of fs/buffer.c:

(*) The file sync and general sync stuff moved to fs/sync.c.

(*) The superblock sync stuff moved to fs/super.c.

(*) do_invalidatepage() moved to mm/truncate.c.

(*) try_to_release_page() moved to mm/filemap.c.

(*) Moved some related declarations between header files:

(*) declarations for do_invalidatepage() and try_to_release_page() moved
to linux/mm.h.

(*) __set_page_dirty_buffers() moved to linux/buffer_head.h.

Signed-Off-By: David Howells
Signed-off-by: Jens Axboe

David Howells
2006-10-01 02:31:19 +0800

28 Jun, 2006

1 commit

b6cd0b772 [PATCH] fs/buffer.c: cleanups ... Browse Code »

- add a proper prototype for the following global function:
- buffer_init()

- make the following needlessly global function static:
- end_buffer_async_write()

Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2006-06-28 08:32:38 +0800

27 Mar, 2006

4 commits

b0cf2321c [PATCH] pass b_size to ->get_block() ... Browse Code »

Pass amount of disk needs to be mapped to get_block(). This way one can
modify the fs ->get_block() functions to map multiple blocks at the same time.

[akpm@osdl.org: performance tweak]
[akpm@osdl.org: remove unneeded assignments]
Signed-off-by: Badari Pulavarty
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Badari Pulavarty
2006-03-27 00:57:01 +0800
205f87f6b [PATCH] change buffer_head.b_size to size_t ... Browse Code »

Increase the size of the buffer_head b_size field (only) for 64 bit platforms.
Update some old and moldy comments in and around the structure as well.

The b_size increase allows us to perform larger mappings and allocations for
large I/O requests from userspace, which tie in with other changes allowing
the get_block_t() interface to map multiple blocks at once.

Signed-off-by: Nathan Scott
Signed-off-by: Badari Pulavarty
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Badari Pulavarty
2006-03-27 00:57:01 +0800
2ff28e22b [PATCH] Make address_space_operations->invalidatepage return void ... Browse Code »

The return value of this function is never used, so let's be honest and
declare it as void.

Some places where invalidatepage returned 0, I have inserted comments
suggesting a BUG_ON.

[akpm@osdl.org: JBD BUG fix]
[akpm@osdl.org: rework for git-nfs]
[akpm@osdl.org: don't go BUG in block_invalidate_page()]
Signed-off-by: Neil Brown
Acked-by: Dave Kleikamp
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-03-27 00:56:55 +0800
3978d7179 [PATCH] Make address_space_operations->sync_page return void ... Browse Code »

The only user ignores the return value, and the only instanace
(block_sync_page) always returns 0...

Signed-off-by: Neil Brown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

NeilBrown
2006-03-27 00:56:55 +0800

09 Jan, 2006

1 commit

05eb0b51f [PATCH] fat: support a truncate() for expanding size (generic_cont_expand) ... Browse Code »

This patch changes generic_cont_expand(), in order to share the code
with fatfs.

- Use vmtruncate() if ->prepare_write() returns a error.

Even if ->prepare_write() returns an error, it may already have added some
blocks. So, this truncates blocks outside of ->i_size by vmtruncate().

- Add generic_cont_expand_simple().

The generic_cont_expand_simple() assumes that ->prepare_write() can handle
the block boundary. With this, we don't need to care the extra byte.

And for expanding a file size by truncate(), fatfs uses the
added generic_cont_expand_simple().

Signed-off-by: OGAWA Hirofumi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

OGAWA Hirofumi
2006-01-09 12:13:47 +0800

31 Oct, 2005

1 commit

aaa4059bc [PATCH] ext3: Fix unmapped buffers in transaction's lists ... Browse Code »

Fix the problem (BUG 4964) with unmapped buffers in transaction's
t_sync_data list. The problem is we need to call filesystem's own
invalidatepage() from block_write_full_page().

block_write_full_page() must call filesystem's invalidatepage(). Otherwise
following nasty race can happen:

proc 1 proc 2
------ ------
- write some new data to 'offset'
=> bh gets to the transactions data list
- starts truncate
=> i_size set to new size
- mpage_writepages()
- ext3_ordered_writepage() to 'offset'
- block_write_full_page()
- page->index > end_index+1
- block_invalidatepage()
- discard_buffer()
- clear_buffer_mapped()

- commit triggers and finds unmapped buffer - BOOM!

Signed-off-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2005-10-31 09:37:17 +0800

30 Oct, 2005

1 commit

4c21e2f24 [PATCH] mm: split page table lock ... Browse Code »

Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
a many-threaded application which concurrently initializes different parts of
a large anonymous area.

This patch corrects that, by using a separate spinlock per page table page, to
guard the page table entries in that page, instead of using the mm's single
page_table_lock. (But even then, page_table_lock is still used to guard page
table allocation, and anon_vma allocation.)

In this implementation, the spinlock is tucked inside the struct page of the
page table page: with a BUILD_BUG_ON in case it overflows - which it would in
the case of 32-bit PA-RISC with spinlock debugging enabled.

Splitting the lock is not quite for free: another cacheline access. Ideally,
I suppose we would use split ptlock only for multi-threaded processes on
multi-cpu machines; but deciding that dynamically would have its own costs.
So for now enable it by config, at some number of cpus - since the Kconfig
language doesn't support inequalities, let preprocessor compare that with
NR_CPUS. But I don't think it's worth being user-configurable: for good
testing of both split and unsplit configs, split now at 4 cpus, and perhaps
change that to 8 later.

There is a benefit even for singly threaded processes: kswapd can be attacking
one part of the mm while another part is busy faulting.

Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2005-10-30 12:40:42 +0800

28 Oct, 2005

1 commit

27496a8c6 [PATCH] gfp_t: fs/* ... Browse Code »

- ->releasepage() annotated (s/int/gfp_t), instances updated
- missing gfp_t in fs/* added
- fixed misannotation from the original sweep caught by bitwise checks:
XFS used __nocast both for gfp_t and for flags used by XFS allocator.
The latter left with unsigned int __nocast; we might want to add a
different type for those but for now let's leave them alone. That,
BTW, is a case when __nocast use had been actively confusing - it had
been used in the same code for two different and similar types, with
no way to catch misuses. Switch of gfp_t to bitwise had caught that
immediately...

One tricky bit is left alone to be dealt with later - mapping->flags is
a mix of gfp_t and error indications. Left alone for now.

Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2005-10-28 23:16:47 +0800

09 Oct, 2005

1 commit

dd0fc66fb [PATCH] gfp flags annotations - part 1 ... Browse Code »

- added typedef unsigned int __nocast gfp_t;

- replaced __nocast uses for gfp flags with gfp_t - it gives exactly
the same warnings as far as sparse is concerned, doesn't change
generated code (from gcc point of view we replaced unsigned int with
typedef) and documents what's going on far better.

Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds

Al Viro
2005-10-09 06:00:57 +0800

08 Jul, 2005

1 commit

a39722034 [PATCH] page_uptodate locking scalability ... Browse Code »

Use a bit spin lock in the first buffer of the page to synchronise asynch
IO buffer completions, instead of the global page_uptodate_lock, which is
showing some scalabilty problems.

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2005-07-08 09:23:45 +0800

17 Apr, 2005

1 commit

1da177e4c Linux-2.6.12-rc2 ... Browse Code »

Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!

Linus Torvalds
2005-04-17 06:20:36 +0800