Eric Lee / smarc-fsl-linux-kernel

20 Jul, 2007

5 commits

20c2df83d mm: Remove slab destructors from kmem_cache_create().

Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt

Paul Mundt
2007-07-20 09:11:58 +0800
83c54070e mm: fault feedback #2

This patch completes Linus's wish that the fault return codes be made into
bit flags, which I agree makes everything nicer. This requires
all handle_mm_fault callers to be modified (possibly the modifications
should go further and do things like fault accounting in handle_mm_fault --
however that would be for another patch).
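
To illustrate why bit-flag return codes are convenient for callers, here is a minimal user-space sketch. The DEMO_ flag names and values, the stats struct, and the helper are all hypothetical and are not the kernel's definitions; the point is only that a caller can test an error mask and a "major fault" bit independently in one return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define DEMO_FAULT_OOM    0x01u
#define DEMO_FAULT_SIGBUS 0x02u
#define DEMO_FAULT_MAJOR  0x04u
#define DEMO_FAULT_ERROR  (DEMO_FAULT_OOM | DEMO_FAULT_SIGBUS)

struct demo_stats {
    unsigned long maj_flt;  /* faults that needed I/O */
    unsigned long min_flt;  /* faults satisfied without I/O */
};

/*
 * Interpret a bit-flag fault result the way an arch-level caller might:
 * return -1 if any error bit is set, otherwise account the fault as
 * major or minor and return 0.
 */
static int demo_account_fault(unsigned int ret, struct demo_stats *st)
{
    assert(st != NULL);
    if (ret & DEMO_FAULT_ERROR)
        return -1;
    if (ret & DEMO_FAULT_MAJOR)
        st->maj_flt++;
    else
        st->min_flt++;
    return 0;
}
```

With an enumerated return code, "major fault that also hit an error" could not be expressed in one value; with flags, the error check simply takes precedence.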

[akpm@linux-foundation.org: fix alpha build]
[akpm@linux-foundation.org: fix s390 build]
[akpm@linux-foundation.org: fix sparc build]
[akpm@linux-foundation.org: fix sparc64 build]
[akpm@linux-foundation.org: fix ia64 build]
Signed-off-by: Nick Piggin
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Russell King
Cc: Ian Molton
Cc: Bryan Wu
Cc: Mikael Starvik
Cc: David Howells
Cc: Yoshinori Sato
Cc: "Luck, Tony"
Cc: Hirokazu Takata
Cc: Geert Uytterhoeven
Cc: Roman Zippel
Cc: Greg Ungerer
Cc: Matthew Wilcox
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Heiko Carstens
Cc: Martin Schwidefsky
Cc: Paul Mundt
Cc: Kazumoto Kojima
Cc: Richard Curnow
Cc: William Lee Irwin III
Cc: "David S. Miller"
Cc: Jeff Dike
Cc: Paolo 'Blaisorblade' Giarrusso
Cc: Miles Bader
Cc: Chris Zankel
Acked-by: Kyle McMartin
Acked-by: Haavard Skinnemoen
Acked-by: Ralf Baechle
Acked-by: Andi Kleen
Signed-off-by: Andrew Morton
[ Still apparently needs some ARM and PPC loving - Linus ]
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 01:04:41 +0800
d0217ac04 mm: fault feedback #1

Change ->fault prototype. We now return an int, which contains
VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
FAULT_RET_ code tells the VM whether a page was found, whether it has been
locked, and potentially other things. This is not quite the way he wanted
it yet, but that's changed in the next patch (which requires changes to
arch code).
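
The two-byte layout described above (a VM_FAULT_xxx code in the low byte, a FAULT_RET_xxx code in the next byte) can be sketched as follows. The helper names are hypothetical and the code values used in any caller are placeholders; only the byte positions come from the commit message.

```c
#include <assert.h>

/* Pack a fault code (low byte) and return flags (next byte) into one int. */
static unsigned int demo_pack_fault_ret(unsigned int fault_code,
                                        unsigned int ret_flags)
{
    assert(fault_code <= 0xffu && ret_flags <= 0xffu);
    return fault_code | (ret_flags << 8);
}

/* Extract the low byte: the VM_FAULT_xxx-style code. */
static unsigned int demo_fault_code(unsigned int ret)
{
    return ret & 0xffu;
}

/* Extract the second byte: the FAULT_RET_xxx-style flags. */
static unsigned int demo_ret_flags(unsigned int ret)
{
    return (ret >> 8) & 0xffu;
}
```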

This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
that a page is locked which requires filemap_nopage to go away (because we
can no longer remain backward compatible without that flag), but we were
going to do that anyway.

struct fault_data is renamed to struct vm_fault as Linus asked. address
is now a void __user * that we should firmly encourage drivers not to use
without really good reason.

The page is now returned via a page pointer in the vm_fault struct.

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 01:04:41 +0800
54cb8821d mm: merge populate and nopage into fault (fixes nonlinear)

Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
the virtual address -> file offset differently from linear mappings.

->populate is a layering violation because the filesystem/pagecache code
should not need to know anything about the virtual memory mapping. The hitch here
is that the ->nopage handler didn't pass down enough information (ie. pgoff).
But it is more logical to pass pgoff rather than have the ->nopage function
calculate it itself anyway (because that's a similar layering violation).

Having the populate handler install the pte itself is likewise a nasty thing
to be doing.

This patch introduces a new fault handler that replaces ->nopage and
->populate and (later) ->nopfn. Most of the old mechanism is still in place
so there is a lot of duplication and nice cleanups that can be removed if
everyone switches over.

The rationale for doing this in the first place is that nonlinear mappings are
subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
to duplicate the synchronisation logic rather than just consolidate the two.

After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
pagecache. Seems like a fringe functionality anyway.

NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no
users have hit mainline yet.

[akpm@linux-foundation.org: cleanup]
[randy.dunlap@oracle.com: doc. fixes for readahead]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin
Signed-off-by: Randy Dunlap
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 01:04:41 +0800
d00806b18 mm: fix fault vs invalidate race for linear mappings

Fix the race between invalidate_inode_pages and do_no_page.

Andrea Arcangeli identified a subtle race between invalidation of pages from
pagecache with userspace mappings, and do_no_page.

The issue is that invalidation has to shoot down all mappings to the page,
before it can be discarded from the pagecache. Between shooting down ptes to
a particular page, and actually dropping the struct page from the pagecache,
do_no_page from any process might fault on that page and establish a new
mapping to the page just before it gets discarded from the pagecache.

The most common case where such invalidation is used is in file truncation.
This case was catered for by doing a sort of open-coded seqlock between the
file's i_size, and its truncate_count.

Truncation will decrease i_size, then increment truncate_count before
unmapping userspace pages; do_no_page will read truncate_count, then find the
page if it is within i_size, and then check truncate_count under the page
table lock and back out and retry if it had subsequently been changed (ptl
will serialise against unmapping, and ensure a potentially updated
truncate_count is actually visible).
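
The open-coded seqlock described above can be modelled in a single-threaded sketch. All names here are hypothetical, sizes are counted in pages for simplicity, and the "race" callback stands in for a concurrent truncation landing between the page lookup and the re-check (which the real code performs under the page table lock).

```c
#include <assert.h>
#include <stddef.h>

struct demo_file {
    unsigned long i_size;          /* file size, in pages for simplicity */
    unsigned long truncate_count;  /* bumped by every truncation */
};

/* Truncation side: shrink i_size first, then bump the counter. */
static void demo_truncate(struct demo_file *f, unsigned long new_size)
{
    f->i_size = new_size;
    f->truncate_count++;
    /* unmapping of userspace pages would follow here */
}

/* Example race hook: truncate the whole file. */
static void demo_truncate_to_zero(struct demo_file *f)
{
    demo_truncate(f, 0);
}

/*
 * Fault side: sample the counter, look the page up if it is inside
 * i_size, then re-check the counter and retry if it changed.
 * Returns 1 if the page would be mapped, 0 if it lies beyond i_size.
 * The race hook (may be NULL) fires once, between lookup and re-check.
 */
static int demo_fault(struct demo_file *f, unsigned long pgoff,
                      void (*race)(struct demo_file *), int *retries)
{
    assert(f != NULL && retries != NULL);
    for (;;) {
        unsigned long seq = f->truncate_count;

        if (pgoff >= f->i_size)
            return 0;
        if (race != NULL) {
            race(f);
            race = NULL;
        }
        if (seq == f->truncate_count)
            return 1;      /* no truncation intervened */
        (*retries)++;      /* back out and retry */
    }
}
```

Note that the retry is only triggered by a counter bump, which is exactly why invalidation inside i_size (which does not touch truncate_count) slips through, as the next paragraph explains.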

Complexity and documentation issues aside, the locking protocol fails in the
case where we would like to invalidate pagecache inside i_size. do_no_page
can come in anytime and filemap_nopage is not aware of the invalidation in
progress (as it is when it is outside i_size). The end result is that
dangling (->mapping == NULL) pages that appear to be from a particular file
may be mapped into userspace with nonsense data. Valid mappings to the same
place will see a different page.

Andrea implemented two working fixes, one using a real seqlock, another using
a page->flags bit. He also proposed using the page lock in do_no_page, but
that was initially considered too heavyweight. However, it is not a global or
per-file lock, and the page cacheline is modified in do_no_page to increment
_count and _mapcount anyway, so a further modification should not be a large
performance hit. Scalability is not an issue.

This patch implements this latter approach. ->nopage implementations return
with the page locked if it is possible for their underlying file to be
invalidated (in that case, they must set a special vm_flags bit to indicate
so). do_no_page only unlocks the page after setting up the mapping
completely. Invalidation is excluded because it holds the page lock during
invalidation of each page (and ensures that the page is not mapped while
holding the lock).

This also allows significant simplifications in do_no_page, because we have
the page locked in the right place in the pagecache from the start.

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 01:04:41 +0800

19 Jul, 2007

1 commit

60446067b gfs2: stop giving out non-cluster-coherent leases

Since gfs2 can't prevent conflicting opens or leases on other nodes, we
probably shouldn't allow it to give out leases at all.

Put the newly defined lease operation into use in gfs2 by turning off
lease, unless we're using the "nolock" locking module (in which case all
locking is local anyway).

Signed-off-by: Marc Eshel
Signed-off-by: J. Bruce Fields
Cc: Steven Whitehouse

Marc Eshel
2007-07-19 07:17:19 +0800

18 Jul, 2007

2 commits

3bd858ab1 Introduce is_owner_or_cap() to wrap CAP_FOWNER use with fsuid check

Introduce is_owner_or_cap() macro in fs.h, and convert over relevant
users to it. This is done because we want to avoid bugs in the future
where we check for only effective fsuid of the current task against a
file's owning uid, without simultaneously checking for CAP_FOWNER as
well, thus violating its semantics.
[ XFS uses special macros and structures, and in general looked ...
untouchable, so we leave it alone -- but it has been looked over. ]
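
The combined check that the macro is meant to enforce can be sketched in user space as below. The structs and the function are hypothetical stand-ins (the real macro operates on the current task and a kernel inode); the sketch only shows the shape of the test: owner match OR capability, never the owner match alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the current task and an inode. */
struct demo_task {
    unsigned int fsuid;   /* filesystem uid of the task */
    bool cap_fowner;      /* whether the task holds CAP_FOWNER */
};

struct demo_inode {
    unsigned int i_uid;   /* owning uid of the file */
};

/* True if the task owns the file or holds the owner-override capability. */
static bool demo_is_owner_or_cap(const struct demo_task *t,
                                 const struct demo_inode *inode)
{
    assert(t != NULL && inode != NULL);
    return t->fsuid == inode->i_uid || t->cap_fowner;
}
```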

The (current->fsuid != inode->i_uid) check in generic_permission() and
exec_permission_lite() is left alone, because those operations are
covered by CAP_DAC_OVERRIDE and CAP_DAC_READ_SEARCH. Similarly operations
falling under the purview of CAP_CHOWN and CAP_LEASE are also left alone.

Signed-off-by: Satyam Sharma
Cc: Al Viro
Acked-by: Serge E. Hallyn
Signed-off-by: Linus Torvalds

Satyam Sharma
2007-07-18 03:00:03 +0800
a56942551 knfsd: exportfs: add exportfs.h header

Currently the export_operation structure and helpers related to it are in
fs.h. fs.h is already far too large and there are very few places needing the
export bits, so split them off into a separate header.

[akpm@linux-foundation.org: fix cifs build]
Signed-off-by: Christoph Hellwig
Signed-off-by: Neil Brown
Cc: Steven French
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2007-07-18 01:23:06 +0800

17 Jul, 2007

1 commit

aa0ac3651 Remove capability.h from mm.h

I forgot to remove capability.h from mm.h while removing sched.h! This
patch remedies that, because the only inline function which was using
CAP_something was made out of line.

Cross-compile tested without regressions on:

all powerpc defconfigs
all mips defconfigs
all m68k defconfigs
all arm defconfigs
all ia64 defconfigs

alpha alpha-allnoconfig alpha-defconfig alpha-up
arm
i386 i386-allnoconfig i386-defconfig i386-up
ia64 ia64-allnoconfig ia64-defconfig ia64-up
m68k
mips
parisc parisc-allnoconfig parisc-defconfig parisc-up
powerpc powerpc-up
s390 s390-allnoconfig s390-defconfig s390-up
sparc sparc-allnoconfig sparc-defconfig sparc-up
sparc64 sparc64-allnoconfig sparc64-defconfig sparc64-up
um-x86_64
x86_64 x86_64-allnoconfig x86_64-defconfig x86_64-up

as well as my two usual configs.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2007-07-17 00:05:45 +0800

11 Jul, 2007

1 commit

1b21f458d Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw

* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (57 commits)
[GFS2] Accept old format NFS filehandles
[GFS2] Small fixes to logging code
[DLM] dump more lock values
[GFS2] Remove i_mode passing from NFS File Handle
[GFS2] Obtaining no_formal_ino from directory entry
[GFS2] git-gfs2-nmw-build-fix
[GFS2] System won't suspend with GFS2 file system mounted
[GFS2] remounting w/o acl option leaves acls enabled
[GFS2] inode size inconsistency
[DLM] Telnet to port 21064 can stop all lockspaces
[GFS2] Fix gfs2_block_truncate_page err return
[GFS2] Addendum to the journaled file/unmount patch
[GFS2] Simplify multiple glock aquisition
[GFS2] assertion failure after writing to journaled file, umount
[GFS2] Use zero_user_page() in stuffed_readpage()
[GFS2] Remove bogus '\0' in rgrp.c
[GFS2] Journaled file write/unstuff bug
[DLM] don't require FS flag on all nodes
[GFS2] Fix deallocation issues
[GFS2] return conflicts for GETLK
...

Linus Torvalds
2007-07-11 04:56:13 +0800

10 Jul, 2007

2 commits

3ebf44902 [GFS2] Accept old format NFS filehandles

On Tue, 2007-07-10 at 10:06 +0100, Christoph Hellwig wrote:
> > -#define GFS2_LARGE_FH_SIZE 10
> > -
> > -struct gfs2_fh_obj {
> > - struct gfs2_inum_host this;
> > - u32 imode;
> > -};
> > +#define GFS2_LARGE_FH_SIZE 8
>
> Because gfs2_decode_fh only accepts file handles with GFS2_SMALL_FH_SIZE
> or GFS2_LARGE_FH_SIZE you don't accept filehandles sent out by an older
> gfs version anymore. Stale filehandles because of a new kernel version
> are a big no-no, so please add back code to handle the old filehandles
> on the decode side.
>

This should fix that problem, I think. Since it only relates to the end of
the fh, we can just ignore that field in order to accept the older
format.
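
A minimal sketch of the idea, assuming handle lengths are compared as simple integer sizes: the old 10-unit handle is accepted, but only the first 8 units are decoded, so the trailing imode field is ignored. The DEMO_ names and the helper are hypothetical; the values 8 and 10 are taken from the quoted diff.

```c
#include <assert.h>

#define DEMO_LARGE_FH_SIZE 8   /* current large handle size (quoted diff) */
#define DEMO_OLD_FH_SIZE   10  /* old size, which carried a trailing imode */

/*
 * Return how many units of the handle should be decoded, or -1 if the
 * length is not a recognised large-handle size. The old format decodes
 * like the new one; its extra trailing field is simply skipped.
 */
static int demo_fh_units_to_decode(int fh_len)
{
    assert(fh_len >= 0);
    switch (fh_len) {
    case DEMO_OLD_FH_SIZE:    /* old format: ignore the tail */
    case DEMO_LARGE_FH_SIZE:  /* current format */
        return DEMO_LARGE_FH_SIZE;
    default:
        return -1;
    }
}
```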

Signed-off-by: Steven Whitehouse
Cc: Christoph Hellwig
Cc: Wendy Cheng

Steven Whitehouse
2007-07-10 19:28:27 +0800
5ffc4ef45 sendfile: remove .sendfile from filesystems that use generic_file_sendfile()

They can use generic_file_splice_read() instead. Since sys_sendfile() now
prefers that, there should be no change in behaviour.

Signed-off-by: Jens Axboe

Jens Axboe
2007-07-10 14:04:13 +0800

09 Jul, 2007

28 commits

a0a24741c [GFS2] Small fixes to logging code

This reverts part of an earlier patch which tried to reclaim
gfs2_bufdata structures too early and resulted in a "use after free"
case (this bit from me). Also a change to not write out log headers
unless we really need to (in the case of flushing nothing we don't need
a header) from Bob.

Signed-off-by: Steven Whitehouse
Signed-off-by: Bob Peterson

Steven Whitehouse
2007-07-09 22:43:07 +0800
35dcc52e3 [GFS2] Remove i_mode passing from NFS File Handle

GFS2 has been passing i_mode within NFS File Handle. Other than the
wrong assumption that there is always room for this extra 16 bit value,
the current gfs2_get_dentry doesn't really need the i_mode to work
correctly. Note that GFS2 NFS code does go thru the same lookup code
path as direct file access route (where the mode is obtained from name
lookup) but gfs2_get_dentry() is coded for different purpose. It is not
used during lookup time. It is part of the file access procedure call.
When the call is invoked, if on-disk inode is not in-memory, it has to
be read-in. This makes i_mode passing a useless overhead.

Signed-off-by: S. Wendy Cheng
Signed-off-by: Steven Whitehouse

Wendy Cheng
2007-07-09 15:24:11 +0800
bb9bcf061 [GFS2] Obtaining no_formal_ino from directory entry

GFS2 lookup code doesn't ask for inode shared glock. This implies during
in-memory inode creation for existing file, GFS2 will not disk-read in
the inode contents. This leaves no_formal_ino un-initialized during
lookup time. The un-initialized no_formal_ino is subsequently encoded
into the file handle. Clients will get an ESTALE error whenever they try to
access these files.

Signed-off-by: S. Wendy Cheng
Signed-off-by: Steven Whitehouse

Wendy Cheng
2007-07-09 15:24:08 +0800
b36576292 [GFS2] System won't suspend with GFS2 file system mounted

The kernel threads in gfs2, namely gfs2_scand, gfs2_logd, gfs2_quotad,
gfs2_glockd, gfs2_recoverd weren't doing anything when the suspend
mechanism was trying to freeze them.

I put in calls to refrigerator() in the loops for all the daemons and
suspend works as expected.

Signed-off-by: Abhijith Das
Signed-off-by: Steven Whitehouse

Abhijith Das
2007-07-09 15:24:04 +0800
569a7b6c2 [GFS2] remounting w/o acl option leaves acls enabled

This patch is for bugzilla bug #245663. This crosswrites a fix from
gfs1 (bz #210369) so that the mount options are reset properly upon
remount. This was tested on system trin-10.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Bob Peterson
2007-07-09 15:24:01 +0800
090ffaa55 [GFS2] inode size inconsistency

This should have been part of the NFS patch #1 but somehow I missed it
when packaging the patches. It is not a critical issue as the others (I
hope). RHEL 5.1 31.el5 kernel runs fine without this change.

Our truncate code is chopped into two parts, one for vfs inode changes
(in vmtruncate()) and one of gfs inode (in gfs2_truncatei()). These two
operations are, unfortunately, not atomic. So it could happen that
vmtruncate() succeeds (inode->i_size is changed) but gfs2_truncatei
fails (say kernel temporarily out of memory). This would leave gfs inode
i_di.di_size out of sync with vfs inode i_size. It will later confuse
gfs2_commit_write() if a write is issued. Last time I checked, it will
cause file corruption.

Signed-off-by: S. Wendy Cheng
Signed-off-by: Steven Whitehouse

Wendy Cheng
2007-07-09 15:23:59 +0800
1875f2f31 [GFS2] Fix gfs2_block_truncate_page err return

Code segment inside gfs2_block_truncate_page() doesn't set the return
code correctly. This causes NFSD to erroneously return EIO back to the client
for the setattr procedure call (truncate error).

Signed-off-by: S. Wendy Cheng
Signed-off-by: Steven Whitehouse

S. Wendy Cheng
2007-07-09 15:23:54 +0800
773ed1a04 [GFS2] Addendum to the journaled file/unmount patch

This patch is an addendum to the previous journaled file/unmount patch.
It fixes a problem discovered during testing.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Robert Peterson
2007-07-09 15:23:52 +0800
eaf5bd3ca [GFS2] Simplify multiple glock aquisition

There is a bug in the code which acquires multiple glocks where if the
initial out-of-order attempt fails part way through we can end up trying
to acquire the wrong number of glocks. This is part of the fix for Red
Hat bz #239737. The other part of the bz doesn't apply to upstream
kernels since it was fixed by:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=d3717bdf8f08a0e1039158c8bab2c24d20f492b6

Since the out-of-order code doesn't appear to add anything to the
performance of GFS2, this patch just removed it rather than trying to
fix it. It should be much easier to see what's going on here now. In
addition, we don't allocate any memory unless we are using a lot of
glocks (which is a relatively uncommon case).

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2007-07-09 15:23:50 +0800
2332c4435 [GFS2] assertion failure after writing to journaled file, umount

This patch passes all my nasty tests that were causing the code to
fail under one circumstance or another. Here is a complete summary
of all changes from today's git tree, in order of appearance:

1. There are now separate variables for metadata buffer accounting.
2. Variable sd_log_num_hdrs is no longer needed, since the header
accounting is taken care of by the reserve/refund sequence.
3. Fixed a tiny grammatical problem in a comment.
4. Added a new function "calc_reserved" to calculate the reserved
log space. This isn't entirely necessary, but it has two benefits:
First, it simplifies the gfs2_log_refund function greatly.
Second, it allows for easier debugging because I could sprinkle the
code with calls to this function to make sure the accounting is
proper (by adding asserts and printks) at strategic points of the code.
5. In log_pull_tail there apparently was a kludge to fix up the
accounting based on a "pull" parameter. The buffer accounting is
now done properly, so the kludge was removed.
6. File sync operations were making a call to gfs2_log_flush that
writes another journal header. Since that header was unplanned
for (reserved) by the reserve/refund sequence, the free space had
to be decremented so that when log_pull_tail gets called, the free
space is adjusted properly. (Did I hear you call that a kludge?
Well, maybe, but a lot more justifiable than the one I removed).
7. In the gfs2_log_shutdown code, it optionally syncs the log by
specifying the PULL parameter to log_write_header. I'm not sure
this is necessary anymore. It just seems to me there could be
cases where shutdown is called while there are outstanding log
buffers.
8. In the (data)buf_lo_before_commit functions, I changed some offset
values from being calculated on the fly to being constants. That
simplified some code and we might as well let the compiler do the
calculation once rather than redoing those cycles at run time.
9. This version has my rewritten databuf_lo_add function.
This version is much more like its predecessor, buf_lo_add, which
makes it easier to understand. Again, this might not be necessary,
but it seems as if this one works as well as the previous one,
maybe even better, so I decided to leave it in.
10. In databuf_lo_before_commit, a previous data corruption problem
was caused by going off the end of the buffer. The proper solution
is to have the proper limit in place, rather than stopping earlier.
(Thus my previous attempt to fix it is wrong).
If you don't wrap the buffer, you're stopping too early and that
causes more log buffer accounting problems.
11. In lops.h there are two new (previously mentioned) constants for
figuring out the data offset for the journal buffers.
12. There are also two new functions, buf_limit and databuf_limit to
calculate how many entries will fit in the buffer.
13. In function gfs2_meta_wipe, it needs to distinguish between pinned
metadata buffers and journaled data buffers for proper journal buffer
accounting. It can't use the JDATA gfs2_inode flag because it's
sometimes passed the "real" inode and sometimes the "metadata
inode" and the inode flags will be random bits in a metadata
gfs2_inode. It needs to base its decision on which was passed in.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Robert Peterson
2007-07-09 15:23:47 +0800
2840501ac [GFS2] Use zero_user_page() in stuffed_readpage()

As suggested by Robert P. J. Day

Signed-off-by: Steven Whitehouse
Cc: Robert P. J. Day

Steven Whitehouse
2007-07-09 15:23:45 +0800
c4201214c [GFS2] Remove bogus '\0' in rgrp.c

Not sure how it slipped in, but we don't want it anyway.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2007-07-09 15:23:43 +0800
8fb68595d [GFS2] Journaled file write/unstuff bug

This patch is for bugzilla bug 283162, which uncovered a number of
bugs pertaining to writing to files that have the journaled bit on.
These bugs happen most often when writing to the meta_fs because
the files are always journaled. So operations like gfs2_grow were
particularly vulnerable, although many of the problems could be
recreated with normal files after setting the journaled bit on.
The problems fixed are:

-GFS2 wasn't ever writing unstuffed journaled data blocks to their
in-place location on disk. Now it does.

-If you unmounted too quickly after doing IO to a journaled file,
GFS2 was crashing because you would discard a buffer whose bufdata
was still on the active items list. GFS2 now deals with this
gracefully.

-GFS2 was losing track of the bufdata for journaled data blocks,
and it wasn't getting freed, causing an error when you tried to
unmount the module. GFS2 now frees all the bufdata structures.

-There was a memory corruption occurring because GFS2 wrote
twice as many log entries for journaled buffers.

-It was occasionally trying to write journal headers in buffers
that weren't currently mapped.

Signed-off-by: Bob Peterson
Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Robert Peterson
2007-07-09 15:23:40 +0800
d93cfa988 [GFS2] Fix deallocation issues

There were two issues during deallocation of unlinked inodes. The
first was relating to the use of a "try" lock which in the case of
the inode lock wasn't trying hard enough to deallocate in all
circumstances (now changed to a normal glock) and in the case of
the iopen lock didn't wait for the demotion of the shared lock before
attempting to get the exclusive lock, and thereby sometimes (timing dependent)
not completing the deallocation when it should have done.

The second issue related to the lack of a way to invalidate dcache entries
on remote nodes (now fixed by this patch) which meant that unlinks were
taking a long time to return disk space to the fs. By adding some code to
invalidate the dcache entries across the cluster for unlinked inodes, that
is now fixed.

This patch was written jointly by Abhijith Das and Steven Whitehouse.

Signed-off-by: Abhijith Das
Signed-off-by: Steven Whitehouse

Abhijith Das
2007-07-09 15:23:36 +0800
a7a2ff8a9 [GFS2] return conflicts for GETLK

We weren't returning the correct result when GETLK found a conflict,
which is indicated by userspace passing back a 1.

Signed-off-by: Abhijith Das
Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-07-09 15:23:33 +0800
d88101d4d [GFS2] set plock owner in GETLK info

Set the owner field in the plock info sent to userspace for GETLK.
Without this, gfs_controld won't correctly see when the GETLK from a
process matches one of the process's existing locks.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-07-09 15:23:31 +0800
037bcbb75 [GFS2] gfs2_lookupi() uninitialised var fix

fs/gfs2/inode.c: In function 'gfs2_lookupi':
fs/gfs2/inode.c:392: warning: 'error' may be used uninitialized in this function

Looks like a real bug to me.

Cc: Steven Whitehouse
Signed-off-by: Andrew Morton
Signed-off-by: Steven Whitehouse

akpm@linux-foundation.org
2007-07-09 15:23:29 +0800
c8cdf4793 [GFS2] Recovery for lost unlinked inodes

Under certain circumstances it's possible (though rather unlikely) that
inodes which were unlinked by one node while still open on another might
get "lost" in the sense that they don't get deallocated if the node
which held the inode open crashed before it was unlinked.

This patch adds the recovery code which allows automatic deallocation of
the inode if it's found during block allocation (the sensible time to
look for such inodes since we are scanning the rgrp's bitmaps anyway at
this time, so it adds no overhead to do this).

Since the inode will have had its i_nlink set to zero, all we need to
trigger recovery is a lookup and an iput(), and the normal deallocation
code takes care of the rest.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2007-07-09 15:23:26 +0800
b35997d44 [GFS2] Can't mount GFS2 file system on AoE device

This patch fixes bug 243131: Can't mount GFS2 file system on AoE device.
When using AoE devices with lock_nolock, there is no locking table, so
gfs2 (and gfs1) uses the superblock s_id. This turns out to be the device
name in some cases. In the case of AoE, the device contains a slash,
(e.g. "etherd/e1.1p2") which is an invalid character when we try to
register the table in sysfs. This patch replaces the "/" with underscore.
Rather than add a new variable to the stack, I'm just reusing a (char *)
variable that's no longer used: table.
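
The substitution itself is simple; a stand-alone sketch is shown here. The function name is hypothetical and this is not the patch's code, only an in-place version of the "replace every slash with an underscore" step applied to the example device name from the message.

```c
#include <assert.h>
#include <string.h>

/* Replace every '/' in a table name with '_' so it is valid as a sysfs name. */
static void demo_sanitize_table_name(char *name)
{
    char *p;

    assert(name != NULL);
    /* strchr resumes from the character just rewritten, so all slashes are hit. */
    for (p = name; (p = strchr(p, '/')) != NULL; )
        *p = '_';
}
```

Applied to "etherd/e1.1p2" this yields "etherd_e1.1p2"; names without a slash are left unchanged.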

This code has been tested on the failing system using a RHEL5 patch.
The upstream code was tested by using gfs2_tool sb to interject a "/"
into the table name of a clustered gfs2 file system.

Signed-off-by: Bob Peterson
Signed-off-by: Steven Whitehouse

Robert Peterson
2007-07-09 15:23:24 +0800
e1cc86037 [GFS2] Fix bug in error path of inode

This fixes a bug in the ordering of operations in the error path of
createi. It's not valid to do an iput() when holding the inode's glock
since the iput() will (in this case) result in delete_inode() being
called which needs to grab the lock itself. This was causing the
recursive lock checking code to trigger.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2007-07-09 15:23:22 +0800
ffed8ab34 [GFS2] Fix typo in rename of directories

A typo caused us to pass a NULL pointer when renaming directories. It
was accidentally introduced in: [GFS2] Clean up inode number handling

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2007-07-09 15:23:19 +0800
44f487a55 [DLM] variable allocation

Add a new flag, DLM_LSFL_FS, to be used when a file system creates a lockspace.
This flag causes the dlm to use GFP_NOFS for allocations instead of GFP_KERNEL.
(This updated version of the patch uses gfp_t for ls_allocation.)
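
The flag-to-allocation-mode mapping can be sketched as follows. The DEMO_ flag value, the enum, and the struct are hypothetical stand-ins for the kernel's gfp_t machinery; only the rule (file-system lockspaces allocate with the NOFS mode, everything else with the default mode) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_LSFL_FS 0x1u  /* hypothetical value for the "created by a fs" flag */

/* Stand-in for gfp_t: which allocation mode the lockspace will use. */
enum demo_gfp { DEMO_GFP_KERNEL, DEMO_GFP_NOFS };

struct demo_lockspace {
    enum demo_gfp ls_allocation;
};

/* Choose the allocation mode once, when the lockspace is created. */
static void demo_new_lockspace(struct demo_lockspace *ls, unsigned int flags)
{
    assert(ls != NULL);
    ls->ls_allocation = (flags & DEMO_LSFL_FS) ? DEMO_GFP_NOFS
                                               : DEMO_GFP_KERNEL;
}
```

Storing the mode in the lockspace means later allocations can pass ls_allocation through without re-testing the flag.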

Signed-Off-By: Patrick Caulfield
Signed-Off-By: David Teigland
Signed-off-by: Steven Whitehouse

Patrick Caulfield
2007-07-09 15:23:17 +0800
4bd91ba18 [GFS2] Add nanosecond timestamp feature

This adds a nanosecond timestamp feature to the GFS2 filesystem. Due
to the way that the on-disk format works, older filesystems will just
appear to have this field set to zero. When mounted by an older version
of GFS2, the filesystem will simply ignore the extra fields so that
it will again appear to have whole second resolution, so that it's
trivially backward compatible.

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2007-07-09 15:23:12 +0800
bb8d8a6f5 [GFS2] Fix sign problem in quota/statfs and cleanup _host structures

This patch fixes some sign issues which were accidentally introduced
into the quota & statfs code during the endianness annotation process.
Also included is a general clean up which moves all of the _host
structures out of gfs2_ondisk.h (where they should not have been to
start with) and into the places where they are actually used (often only
one place). Also those _host structures which are not required any more
are removed entirely (which is the eventual plan for all of them).

The conversion routines from ondisk.c are also moved into the places
where they are actually used, which for almost every one, was just one
single place, so all those are now static functions. This also cleans up
the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__.

The net result is a reduction of about 100 lines of code, many functions
now marked static plus the bug fixes as mentioned above. For good
measure I ran the code through sparse after making these changes to
check that there are no warnings generated.

This fixes Red Hat bz #239686

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2007-07-09 15:23:10 +0800
ddf4b426a [GFS2] fix jdata issues

This is a patch for the first three issues of RHBZ #238162

The first issue is that when you allocate a new page for a file, it will not
start off uptodate. This makes sense, since you haven't written anything to that
part of the file yet. Unfortunately, gfs2_pin() checks to make sure that the
buffers are uptodate. The solution to this is to mark the buffers uptodate in
gfs2_commit_write(), after they have been zeroed out and have the data written
into them. I'm pretty confident with this fix, although it's not completely
obvious that there is no problem with marking the buffers uptodate here.

The second issue is simply that you can try to pin a data buffer that is already
on the incore log, and thus, already pinned. This patch checks to see if this
buffer is already on the log, and exits databuf_lo_add() if it is, just like
buf_lo_add() does.

The third issue is that gfs2_log_flush() doesn't do its block accounting
correctly. Both metadata and journaled data are logged, but gfs2_log_flush()
only compares the number of metadata blocks with the number of blocks to commit
to the ondisk journal. This patch also counts the journaled data blocks.
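
The accounting fix for the third issue amounts to comparing the sum of both kinds of logged blocks against the number of blocks to commit. A minimal sketch, with hypothetical names and no claim about the actual fields used in gfs2_log_flush():

```c
#include <assert.h>

/*
 * Return 1 if the blocks logged so far (metadata plus journaled data)
 * match the number of blocks about to be committed, 0 otherwise.
 * Counting only metadata_blocks would reproduce the original bug.
 */
static int demo_log_accounting_ok(unsigned int metadata_blocks,
                                  unsigned int jdata_blocks,
                                  unsigned int blocks_to_commit)
{
    return metadata_blocks + jdata_blocks == blocks_to_commit;
}
```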

Signed-off-by: Benjamin Marzinski
Signed-off-by: Steven Whitehouse

Benjamin Marzinski
2007-07-09 15:23:08 +0800
89918647a [GFS2] Make the log reserved blocks depend on block size

The number of blocks which we reserve in the log at the start of each
transaction needs to depend upon the block size since the overhead is
related to the number of "pointers" which can be fitted into a single
block.

This relates to Red Hat bz #240435

Signed-off-by: Steven Whitehouse

Steven Whitehouse
2007-07-09 15:23:03 +0800
1990e9176 [GFS2] Quotas non-functional - fix another bug

This patch fixes a bug where gfs2 was writing updated quota usage
information to the wrong location in the quota file.

Signed-off-by: Abhijith Das
Signed-off-by: Steven Whitehouse

Abhijith Das
2007-07-09 15:23:01 +0800
2a87ab080 [GFS2] Quotas non-functional - fix bug

This patch fixes an error in the quota code where a 'struct
gfs2_quota_lvb*' was being passed to gfs2_adjust_quota() instead of a
'struct gfs2_quota_data*'. Also moved 'struct gfs2_quota_lvb' from
fs/gfs2/incore.h to include/linux/gfs2_ondisk.h as per Steve's suggestion.

Signed-off-by: Abhijith Das
Signed-off-by: Steven Whitehouse

Abhijith Das
2007-07-09 15:22:26 +0800