Eric Lee / smarc-fsl-linux-kernel

17 May, 2007

1 commit

a35afb830 Remove SLAB_CTOR_CONSTRUCTOR ... Browse Code »

SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.

Signed-off-by: Christoph Lameter
Cc: David Howells
Cc: Jens Axboe
Cc: Steven French
Cc: Michael Halcrow
Cc: OGAWA Hirofumi
Cc: Miklos Szeredi
Cc: Steven Whitehouse
Cc: Roman Zippel
Cc: David Woodhouse
Cc: Dave Kleikamp
Cc: Trond Myklebust
Cc: "J. Bruce Fields"
Cc: Anton Altaparmakov
Cc: Mark Fasheh
Cc: Paul Mackerras
Cc: Christoph Hellwig
Cc: Jan Kara
Cc: David Chinner
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-05-17 20:23:04 +0800

09 May, 2007

2 commits

60c9b2746 Merge git://oss.sgi.com:8090/xfs/xfs-2.6 ... Browse Code »

* git://oss.sgi.com:8090/xfs/xfs-2.6:
[XFS] Add lockdep support for XFS
[XFS] Fix race in xfs_write() b/w dmapi callout and direct I/O checks.
[XFS] Get rid of redundant "required" in msg.
[XFS] Export via a function xfs_buftarg_list for use by kdb/xfsidbg.
[XFS] Remove unused ilen variable and references.
[XFS] Fix to prevent the notorious 'NULL files' problem after a crash.
[XFS] Fix race condition in xfs_write().
[XFS] Fix uquota and oquota enforcement problems.
[XFS] propogate return codes from flush routines
[XFS] Fix quotaon syscall failures for group enforcement requests.
[XFS] Invalidate quotacheck when mounting without a quota type.
[XFS] reducing the number of random number functions.
[XFS] remove more misc. unused args
[XFS] the "aendp" arg to xfs_dir2_data_freescan is always NULL, remove it.
[XFS] The last argument "lsn" of xfs_trans_commit() is always called with

Linus Torvalds
2007-05-09 02:59:33 +0800
0ceb33143 mm: move common segment checks to separate helper function ... Browse Code »

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Monakhov Dmitriy
Cc: Christoph Hellwig
Acked-by: Anton Altaparmakov
Acked-by: David Chinner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dmitriy Monakhov
2007-05-09 02:14:57 +0800

08 May, 2007

7 commits

f7c66ce3f [XFS] Add lockdep support for XFS ... Browse Code »

SGI-PV: 963965
SGI-Modid: xfs-linux-melb:xfs-kern:28485a

Signed-off-by: Lachlan McIlroy
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Lachlan McIlroy
2007-05-08 11:50:19 +0800
71dfd5a39 [XFS] Fix race in xfs_write() b/w dmapi callout and direct I/O checks. ... Browse Code »

In xfs_write() the iolock is dropped and reacquired in XFS_SEND_DATA()
which means that the file could change from not-cached to cached and we
need to redo the direct I/O checks. We should also redo the direct I/O
checks when the file size changes regardless if O_APPEND is set or not.

SGI-PV: 963483
SGI-Modid: xfs-linux-melb:xfs-kern:28440a

Signed-off-by: Lachlan McIlroy
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Lachlan McIlroy
2007-05-08 11:50:12 +0800
e6a0e9cdf [XFS] Export via a function xfs_buftarg_list for use by kdb/xfsidbg. ... Browse Code »

SGI-PV: 963465
SGI-Modid: xfs-linux-melb:xfs-kern:28414a

Signed-off-by: Tim Shimmin
Signed-off-by: Lachlan McIlroy

Tim Shimmin
2007-05-08 11:49:59 +0800
ba87ea699 [XFS] Fix to prevent the notorious 'NULL files' problem after a crash. ... Browse Code »

The problem that has been addressed is that of synchronising updates of
the file size with writes that extend a file. Without the fix the update
of a file's size, as a result of a write beyond eof, is independent of
when the cached data is flushed to disk. Often the file size update would
be written to the filesystem log before the data is flushed to disk. When
a system crashes between these two events and the filesystem log is
replayed on mount the file's size will be set but since the contents never
made it to disk the file is full of holes. If some of the cached data was
flushed to disk then it may just be a section of the file at the end that
has holes.

There are existing fixes to help alleviate this problem, particularly in
the case where a file has been truncated, that force cached data to be
flushed to disk when the file is closed. If the system crashes while the
file(s) are still open then this flushing will never occur.

The fix that we have implemented is to introduce a second file size,
called the in-memory file size, that represents the current file size as
viewed by the user. The existing file size, called the on-disk file size,
is the one that get's written to the filesystem log and we only update it
when it is safe to do so. When we write to a file beyond eof we only
update the in- memory file size in the write operation. Later when the I/O
operation, that flushes the cached data to disk completes, an I/O
completion routine will update the on-disk file size. The on-disk file
size will be updated to the maximum offset of the I/O or to the value of
the in-memory file size if the I/O includes eof.

SGI-PV: 958522
SGI-Modid: xfs-linux-melb:xfs-kern:28322a

Signed-off-by: Lachlan McIlroy
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Lachlan McIlroy
2007-05-08 11:49:46 +0800
2a3296313 [XFS] Fix race condition in xfs_write(). ... Browse Code »

This change addresses a race in xfs_write() where, for direct I/O, the
flags need_i_mutex and need_flush are setup before the iolock is acquired.
The logic used to setup the flags may change between setting the flags and
acquiring the iolock resulting in these flags having incorrect values. For
example, if a file is not currently cached then need_i_mutex is set to
zero and then if the file is cached before the iolock is acquired we will
fail to do the flushinval before the direct write.

The flush (and also the call to xfs_zero_eof()) need to be done with the
iolock held exclusive so we need to acquire the iolock before checking for
cached data (or if the write begins after eof) to prevent this state from
changing. For direct I/O I've chosen to always acquire the iolock in
shared mode initially and if there is a need to promote it then drop it
and reacquire it.

There's also some other tidy-ups including removing the O_APPEND offset
adjustment since that work is done in generic_write_checks() (and we don't
use offset as an input parameter anywhere).

SGI-PV: 962170
SGI-Modid: xfs-linux-melb:xfs-kern:28319a

Signed-off-by: Lachlan McIlroy
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Lachlan McIlroy
2007-05-08 11:49:39 +0800
d3cf20947 [XFS] propogate return codes from flush routines ... Browse Code »

This patch handles error return values in fs_flush_pages and
fs_flushinval_pages. It changes the prototype of fs_flushinval_pages so we
can propogate the errors and handle them at higher layers. I also modified
xfs_itruncate_start so that it could propogate the error further.

SGI-PV: 961990
SGI-Modid: xfs-linux-melb:xfs-kern:28231a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Stewart Smith
Signed-off-by: Tim Shimmin

Lachlan McIlroy
2007-05-08 11:49:27 +0800
50953fe9e slab allocators: Remove SLAB_DEBUG_INITIAL flag ... Browse Code »

I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by
SLAB.

I think its purpose was to have a callback after an object has been freed
to verify that the state is the constructor state again? The callback is
performed before each freeing of an object.

I would think that it is much easier to check the object state manually
before the free. That also places the check near the code object
manipulation of the object.

Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
compiled with SLAB debugging on. If there would be code in a constructor
handling SLAB_DEBUG_INITIAL then it would have to be conditional on
SLAB_DEBUG otherwise it would just be dead code. But there is no such code
in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real
use of, difficult to understand and there are easier ways to accomplish the
same effect (i.e. add debug code before kfree).

There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
clear in fs inode caches. Remove the pointless checks (they would even be
pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors.

This is the last slab flag that SLUB did not support. Remove the check for
unimplemented flags from SLUB.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-05-08 03:12:57 +0800

23 Mar, 2007

1 commit

b43376927 [PATCH] Make XFS workqueues nonfreezable ... Browse Code »

Since freezable workqueues are broken in 2.6.21-rc
(cf. http://marc.theaimsgroup.com/?l=linux-kernel&m=116855740612755,
http://marc.theaimsgroup.com/?l=linux-kernel&m=117261312523921&w=2)
it's better to change the only user of them, which is XFS, to use "normal"
nonfreezable workqueues.

Signed-off-by: Rafael J. Wysocki
Cc: Pavel Machek
Cc: David Chinner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rafael J. Wysocki
2007-03-23 10:39:06 +0800

21 Feb, 2007

1 commit

5085b607f [PATCH] xfs warning fix ... Browse Code »

fs/xfs/linux-2.6/xfs_super.c:903: warning: 'noinline' attribute ignored

Cc: David Chinner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2007-02-21 09:10:13 +0800

15 Feb, 2007

2 commits

0b4d41471 [PATCH] sysctl: remove insert_at_head from register_sysctl ... Browse Code »

The semantic effect of insert_at_head is that it would allow new registered
sysctl entries to override existing sysctl entries of the same name. Which is
pain for caching and the proc interface never implemented.

I have done an audit and discovered that none of the current users of
register_sysctl care as (excpet for directories) they do not register
duplicate sysctl entries.

So this patch simply removes the support for overriding existing entries in
the sys_sysctl interface since no one uses it or cares and it makes future
enhancments harder.

Signed-off-by: Eric W. Biederman
Acked-by: Ralf Baechle
Acked-by: Martin Schwidefsky
Cc: Russell King
Cc: David Howells
Cc: "Luck, Tony"
Cc: Ralf Baechle
Cc: Paul Mackerras
Cc: Martin Schwidefsky
Cc: Andi Kleen
Cc: Jens Axboe
Cc: Corey Minyard
Cc: Neil Brown
Cc: "John W. Linville"
Cc: James Bottomley
Cc: Jan Kara
Cc: Trond Myklebust
Cc: Mark Fasheh
Cc: David Chinner
Cc: "David S. Miller"
Cc: Patrick McHardy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric W. Biederman
2007-02-15 00:09:59 +0800
cd354f1ae [PATCH] remove many unneeded #includes of sched.h ... Browse Code »

After Al Viro (finally) succeeded in removing the sched.h #include in module.h
recently, it makes sense again to remove other superfluous sched.h includes.
There are quite a lot of files which include it but don't actually need
anything defined in there. Presumably these includes were once needed for
macros that used to live in sched.h, but moved to other header files in the
course of cleaning it up.

To ease the pain, this time I did not fiddle with any header files and only
removed #includes from .c-files, which tend to cause less trouble.

Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha,
arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig,
allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all
configs in arch/arm/configs on arm. I also checked that no new warnings were
introduced by the patch (actually, some warnings are removed that were emitted
by unnecessarily included header files).

Signed-off-by: Tim Schmielau
Acked-by: Russell King
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tim Schmielau
2007-02-15 00:09:54 +0800

13 Feb, 2007

3 commits

c5ef1c42c [PATCH] mark struct inode_operations const 3 ... Browse Code »

Many struct inode_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arjan van de Ven
2007-02-13 01:48:46 +0800
6ab8eb1cf [PATCH] Make XFS use BH_Unwritten and BH_Delay correctly ... Browse Code »

Don't hide buffer_unwritten behind buffer_delay() and remove the hack that
clears unexpected buffer_unwritten() states now that it can't happen.

Signed-off-by: Dave Chinner
Acked-by: Christoph Hellwig
Cc: Timothy Shimmin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Chinner
2007-02-13 01:48:27 +0800
33a266dda [PATCH] Make BH_Unwritten a first class bufferhead flag V2 ... Browse Code »

Currently, XFS uses BH_PrivateStart for flagging unwritten extent state in a
bufferhead. Recently, I found the long standing mmap/unwritten extent
conversion bug, and it was to do with partial page invalidation not clearing
the unwritten flag from bufferheads attached to the page but beyond EOF. See
here for a full explaination:

http://oss.sgi.com/archives/xfs/2006-12/msg00196.html

The solution I have checked into the XFS dev tree involves duplicating code
from block_invalidatepage to clear the unwritten flag from the bufferhead(s),
and then calling block_invalidatepage() to do the rest.

Christoph suggested that this would be better solved by pushing the unwritten
flag into the common buffer head flags and just adding the call to
discard_buffer():

http://oss.sgi.com/archives/xfs/2006-12/msg00239.html

The following patch makes BH_Unwritten a first class citizen.

Signed-off-by: Dave Chinner
Acked-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Chinner
2007-02-13 01:48:27 +0800

10 Feb, 2007

15 commits

e7ff6aed8 [XFS] Don't use kmap in xfs_iozero. ... Browse Code »

kmap() is inefficient and does not scale well. kmap_atomic() is a better
choice. Use the generic wrapper function instead of open coding the
kmap-memset-dcache flush-kunmap stuff.

SGI-PV: 960904
SGI-Modid: xfs-linux-melb:xfs-kern:28041a

Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin

David Chinner
2007-02-10 15:37:46 +0800
7bc5306d7 [XFS] Remove unused header files for MAC and CAP checking functionality. ... Browse Code »

xfs_mac.h and xfs_cap.h provide definitions and macros that aren't used
anywhere in XFS at all. They are left-overs from "to be implement at some
point in the future" functionality that Irix XFS has. If this
functionality ever goes into Linux, it will be provided at a different
layer, most likely through the security hooks in the kernel so we will
never need this functionality in XFS.

Patch provided by Eric Sandeen (sandeen@sandeen.net).

SGI-PV: 960895
SGI-Modid: xfs-linux-melb:xfs-kern:28036a

Signed-off-by: Eric Sandeen
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Eric Sandeen
2007-02-10 15:37:28 +0800
3c0dc77b4 [XFS] Make freeze code a little cleaner. ... Browse Code »

Fixes a few small issues (mostly cosmetic) that were picked up during the
review cycle for the last set of freeze path changes.

SGI-PV: 959267
SGI-Modid: xfs-linux-melb:xfs-kern:28035a

Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin

David Chinner
2007-02-10 15:37:22 +0800
39058a0e1 [XFS] Clean up use of VFS attr flags ... Browse Code »

Use the the generic VFS attr flags where appropriate instead of open
coding them to the same values.

Patch provided by Eric Sandeen.

SGI-PV: 960868
SGI-Modid: xfs-linux-melb:xfs-kern:28033a

Signed-off-by: Eric Sandeen
Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin

Eric Sandeen
2007-02-10 15:37:10 +0800
4cf3b5208 [XFS] Remove useless memory barrier ... Browse Code »

wake_up's implementation does an implicit memory barrier so the explicit
memory barrier is not needed in vfs_sync_worker.

Patch provided by Ralf Baechle.

SGI-PV: 960867
SGI-Modid: xfs-linux-melb:xfs-kern:28032a

Signed-off-by: Ralf Baechle
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Ralf Baechle
2007-02-10 15:37:04 +0800
3a68cbfe0 [XFS] XFS sysctl cleanups ... Browse Code »

Removes unneeded sysctl insert at head behaviour. Cleans up sysctl
definitions to use C99 initialisers. Patch provided by Eric W. Biederman.

SGI-PV: 960192
SGI-Modid: xfs-linux-melb:xfs-kern:28031a

Signed-off-by: Eric W. Biederman
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Eric W. Biederman
2007-02-10 15:36:59 +0800
681601613 [XFS] Fix callers of xfs_iozero() to zero the correct range. ... Browse Code »

The problem is the two callers of xfs_iozero() are rounding out the range
to be zeroed to the end of a fsb and in some cases this extends past the
new eof. The call to commit_write() in xfs_iozero() will cause the Linux
inode's file size to be set too high.

SGI-PV: 960788
SGI-Modid: xfs-linux-melb:xfs-kern:28013a

Signed-off-by: Lachlan McIlroy
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Lachlan McIlroy
2007-02-10 15:36:47 +0800
2823945fd [XFS] Ensure a frozen filesystem has a clean log before writing the dummy ... Browse Code »

record.

The current Linux XFS freeze code is a mess. We flush the metadata buffers
out while we are still allowing new transactions to start and then fail to
flush the dirty buffers back out before writing the unmount and dummy
records to the log.

This leads to problems when the frozen filesystem is used for snapshots -
we do log recovery on a readonly image and often it appears that the log
image in the snapshot is not correct. Hence we end up with hangs, oops and
mount failures when trying to mount a snapshot image that has been created
when the filesystem has not been correctly frozen.

To fix this, we need to move th metadata flush to after we wait for all
current transactions to complete in teh second stage of the freeze. This
means that when we write the final log records, the log should be clean
and recovery should never occur on a snapshot image created from a frozen
filesystem.

SGI-PV: 959267
SGI-Modid: xfs-linux-melb:xfs-kern:28010a

Signed-off-by: David Chinner
Signed-off-by: Donald Douwsma
Signed-off-by: Tim Shimmin

David Chinner
2007-02-10 15:36:40 +0800
549054afa [XFS] Fix sub-block zeroing for buffered writes into unwritten extents. ... Browse Code »

When writing less than a filesystem block of data into an unwritten extent
via buffered I/O, __xfs_get_blocks fails to set the buffer new flag. As a
result, the generic code will not zero either edge of the block resulting
in garbage being written to disk either side of the real data. Set the
buffer new state on bufferd writes to unwritten extents to ensure that
zeroing occurs.

SGI-PV: 960328
SGI-Modid: xfs-linux-melb:xfs-kern:28000a

Signed-off-by: David Chinner
Signed-off-by: Lachlan McIlroy
Signed-off-by: Tim Shimmin

David Chinner
2007-02-10 15:36:35 +0800
5180602e6 [XFS] remove unused filp from ioctl functions ... Browse Code »

SGI-PV: 959140
SGI-Modid: xfs-linux-melb:xfs-kern:27712a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Eric Sandeen
Signed-off-by: Tim Shimmin

Lachlan McIlroy
2007-02-10 15:35:46 +0800
a3227fb99 [XFS] mraccessf & mrupdatef are supposed to be the "flags" versions of the ... Browse Code »

functions, but they

a) ignore the flags parameter completely, and b) are never called
directly, only via the flag-less defines anyway

So, drop the #define indirection, and rename mraccessf to mraccess, etc.

SGI-PV: 959138
SGI-Modid: xfs-linux-melb:xfs-kern:27711a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Eric Sandeen
Signed-off-by: Tim Shimmin

Lachlan McIlroy
2007-02-10 15:35:40 +0800
e5eb7f202 [XFS] use struct kvec in struct uio ... Browse Code »

SGI-PV: 954580
SGI-Modid: xfs-linux-melb:xfs-kern:27701a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin

Lachlan McIlroy
2007-02-10 15:35:21 +0800
7989cb8ef [XFS] Keep stack usage down for 4k stacks by using noinline. ... Browse Code »

gcc-4.1 and more recent aggressively inline static functions which
increases XFS stack usage by ~15% in critical paths. Prevent this from
occurring by adding noinline to the STATIC definition.

Also uninline some functions that are too large to be inlined and were
causing problems with CONFIG_FORCED_INLINING=y.

Finally, clean up all the different users of inline, __inline and
__inline__ and put them under one STATIC_INLINE macro. For debug kernels
the STATIC_INLINE macro uninlines those functions.

SGI-PV: 957159
SGI-Modid: xfs-linux-melb:xfs-kern:27585a

Signed-off-by: David Chinner
Signed-off-by: David Chatterton
Signed-off-by: Tim Shimmin

David Chinner
2007-02-10 15:34:56 +0800
5e6a07dfe [XFS] Current usage of buftarg flags is incorrect. ... Browse Code »

The {test,set,clear}_bit() operations take a bit index for the bit to
operate on. The XBT_* flags are defined as bit fields which is incorrect,
not to mention the way the bit fields are enumerated is broken too. This
was only working by chance.

Fix the definitions of the flags and make the code using them use the
{test,set,clear}_bit() operations correctly.

SGI-PV: 958639
SGI-Modid: xfs-linux-melb:xfs-kern:27565a

Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

David Chinner
2007-02-10 15:34:49 +0800
585e6d885 [XFS] Fix a synchronous buftarg flush deadlock when freezing. ... Browse Code »

At the last stage of a freeze, we flush the buftarg synchronously over and
over again until it succeeds twice without skipping any buffers.

The delwri list flush skips pinned buffers, but tries to flush all others.
It removes the buffers from the delwri list, then tries to lock them one
at a time as it traverses the list to issue the I/O. It holds them locked
until we issue all of the I/O and then unlocks them once we've waited for
it to complete.

The problem is that during a freeze, the filesystem may still be doing
stuff - like flushing delalloc data buffers - in the background and hence
we can be trying to lock buffers that were on the delwri list at the same
time. Hence we can get ABBA deadlocks between threads doing allocation and
the buftarg flush (freeze) thread.

Fix it by skipping locked (and pinned) buffers as we traverse the delwri
buffer list.

SGI-PV: 957195
SGI-Modid: xfs-linux-melb:xfs-kern:27535a

Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

David Chinner
2007-02-10 15:32:29 +0800

22 Dec, 2006

1 commit

921320210 [PATCH] Fix XFS after clear_page_dirty() removal ... Browse Code »

XFS appears to call clear_page_dirty to get the mapping tree dirty tag
set correctly at the same time the page dirty flag is cleared. I note
that this can be done by set_page_writeback() if we clear the dirty flag
on the page first when we are writing back the entire page.

Hence it seems to me that the XFS call to clear_page_dirty() could
easily be substituted by clear_page_dirty_for_io() followed by a call to
set_page_writeback() to get the mapping tree tags set correctly after
the page has been marked clean.

Signed-off-by: Linus Torvalds

David Chinner
2006-12-22 02:01:08 +0800

11 Dec, 2006

1 commit

8459d86af [PATCH] dio: only call aio_complete() after returning -EIOCBQUEUED ... Browse Code »

The only time it is safe to call aio_complete() is when the ->ki_retry
function returns -EIOCBQUEUED to the AIO core. direct_io_worker() has
historically done this by relying on its caller to translate positive return
codes into -EIOCBQUEUED for the aio case. It did this by trying to keep
conditionals in sync. direct_io_worker() knew when finished_one_bio() was
going to call aio_complete(). It would reverse the test and wait and free the
dio in the cases it thought that finished_one_bio() wasn't going to.

Not surprisingly, it ended up getting it wrong. 'ret' could be a negative
errno from the submission path but it failed to communicate this to
finished_one_bio(). direct_io_worker() would return < 0, it's callers
wouldn't raise -EIOCBQUEUED, and aio_complete() would be called. In the
future finished_one_bio()'s tests wouldn't reflect this and aio_complete()
would be called for a second time which can manifest as an oops.

The previous cleanups have whittled the sync and async completion paths down
to the point where we can collapse them and clearly reassert the invariant
that we must only call aio_complete() after returning -EIOCBQUEUED.
direct_io_worker() will only return -EIOCBQUEUED when it is not the last to
drop the dio refcount and the aio bio completion path will only call
aio_complete() when it is the last to drop the dio refcount.
direct_io_worker() can ensure that it is the last to drop the reference count
by waiting for bios to drain. It does this for sync ops, of course, and for
partial dio writes that must fall back to buffered and for aio ops that saw
errors during submission.

This means that operations that end up waiting, even if they were issued as
aio ops, will not call aio_complete() from dio. Instead we return the return
code of the operation and let the aio core call aio_complete(). This is
purposely done to fix a bug where AIO DIO file extensions would call
aio_complete() before their callers have a chance to update i_size.

Now that direct_io_worker() is explicitly returning -EIOCBQUEUED its callers
no longer have to translate for it. XFS needs to be careful not to free
resources that will be used during AIO completion if -EIOCBQUEUED is returned.
We maintain the previous behaviour of trying to write fs metadata for O_SYNC
aio+dio writes.

Signed-off-by: Zach Brown
Cc: Badari Pulavarty
Cc: Suparna Bhattacharya
Acked-by: Jeff Moyer
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Zach Brown
2006-12-11 01:57:21 +0800

09 Dec, 2006

1 commit

e678fb0d5 [PATCH] xfs: change uses of f_{dentry,vfsmnt} to use f_path ... Browse Code »

Change all the uses of f_{dentry,vfsmnt} to f_path.{dentry,mnt} in the xfs
filesystem.

Signed-off-by: Josef "Jeff" Sipek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josef "Jeff" Sipek
2006-12-09 00:28:43 +0800

08 Dec, 2006

2 commits

58e14b148 [PATCH] Use freezeable workqueues in XFS ... Browse Code »

Make the workqueues used by XFS freezeable, so their worker threads don't
submit any I/O after the suspend image has been created.

Signed-off-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Cc: Nigel Cunningham
Cc: David Chinner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rafael J. Wysocki
2006-12-08 00:39:29 +0800
7dfb71030 [PATCH] Add include/linux/freezer.h and move definitions from sched.h ... Browse Code »

Move process freezing functions from include/linux/sched.h to freezer.h, so
that modifications to the freezer or the kernel configuration don't require
recompiling just about everything.

[akpm@osdl.org: fix ueagle driver]
Signed-off-by: Nigel Cunningham
Cc: "Rafael J. Wysocki"
Cc: Pavel Machek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nigel Cunningham
2006-12-08 00:39:27 +0800

22 Nov, 2006

1 commit

c4028958b WorkStruct: make allyesconfig ... Browse Code »

Fix up for make allyesconfig.

Signed-Off-By: David Howells

David Howells
2006-11-22 22:57:56 +0800

11 Nov, 2006

2 commits

050e714eb [XFS] Remove KERNEL_VERSION macros from xfs_dmapi.h ... Browse Code »

SGI-PV: 957005
SGI-Modid: xfs-linux-melb:xfs-kern:27398a

Signed-off-by: David Chinner
Signed-off-by: Michal Piotrowski
Signed-off-by: Tim Shimmin

David Chinner
2006-11-11 15:05:06 +0800
7a18c3860 [XFS] Clean up i_flags and i_flags_lock handling. ... Browse Code »

SGI-PV: 956832
SGI-Modid: xfs-linux-melb:xfs-kern:27358a

Signed-off-by: David Chinner
Signed-off-by: Nathan Scott
Signed-off-by: Tim Shimmin

David Chinner
2006-11-11 15:04:54 +0800