Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

10 Jan, 2009

1 commit

c4be0c1dc filesystem freeze: add error handling of write_super_lockfs/unlockfs

Currently, ext3 in mainline Linux doesn't have the freeze feature which
suspends write requests. So, we cannot take a backup which keeps the
filesystem's consistency with the storage device's features (snapshot and
replication) while it is mounted.

In many cases, a commercial filesystem (e.g. VxFS) has the freeze feature,
and it is used to take a consistent backup.

If Linux's standard filesystem ext3 has the freeze feature, we can do it
without a commercial filesystem.

So I have implemented the ioctls of the freeze feature.
I think we can take the consistent backup with the following steps.
1. Freeze the filesystem with the freeze ioctl.
2. Separate the replication volume or create the snapshot
with the storage device's feature.
3. Unfreeze the filesystem with the unfreeze ioctl.
4. Take the backup from the separated replication volume
or the snapshot.
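
The steps above can be sketched from userspace as follows. This is a minimal
sketch, not the code from this patch: it assumes the FIFREEZE and FITHAW
ioctls as defined in mainline <linux/fs.h>, and the snapshot callback type and
function names are hypothetical placeholders for the storage device's own
snapshot or split command. Freezing needs CAP_SYS_ADMIN on a real mount.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>    /* FIFREEZE, FITHAW */
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical callback: the storage-side snapshot or split (step 2). */
typedef int (*demo_snapshot_fn)(void *ctx);

/* Hypothetical no-op snapshot, useful for dry runs. */
int demo_noop_snapshot(void *ctx)
{
    (void)ctx;
    return 0;
}

/*
 * Freeze the filesystem mounted at `mountpoint`, run the snapshot
 * callback, then unfreeze. Returns 0 on success and -1 if opening,
 * freezing, snapshotting or unfreezing failed.
 */
int demo_snapshot_while_frozen(const char *mountpoint,
                               demo_snapshot_fn take_snapshot, void *ctx)
{
    assert(mountpoint != NULL && take_snapshot != NULL);

    int fd = open(mountpoint, O_RDONLY);
    if (fd < 0)
        return -1;

    int rc = -1;
    if (ioctl(fd, FIFREEZE, 0) == 0) {           /* step 1: freeze */
        int snap_rc = take_snapshot(ctx);        /* step 2: snapshot */
        int thaw_rc = ioctl(fd, FITHAW, 0);      /* step 3: always unfreeze */
        rc = (snap_rc == 0 && thaw_rc == 0) ? 0 : -1;
    }
    close(fd);
    return rc;
}
```

Step 4, the backup itself, then runs against the snapshot or the separated
volume. The unfreeze is attempted even when the snapshot callback fails, so
that a failed snapshot does not leave writers blocked.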

This patch:

VFS:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that they can return an error.
Renamed the super block operations write_super_lockfs and unlockfs to
freeze_fs and unfreeze_fs to avoid confusion.

ext3, ext4, xfs, gfs2, jfs:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that write_super_lockfs returns an error if needed,
and unlockfs always returns 0.

reiserfs:
Changed the type of write_super_lockfs and unlockfs from "void"
to "int" so that they always return 0 (success) to keep the current behavior.
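
The effect of the "void" to "int" change on a caller can be illustrated with a
small userspace model. This is only a sketch under stated assumptions: the
struct and function names below (demo_sb, demo_sb_ops, demo_freeze_super) are
hypothetical stand-ins, not the kernel's struct super_block or its real freeze
path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_sb;

/* Stand-in for the super block operations touched by this patch. */
struct demo_sb_ops {
    int (*freeze_fs)(struct demo_sb *sb);    /* was: void write_super_lockfs */
    int (*unfreeze_fs)(struct demo_sb *sb);  /* was: void unlockfs */
};

struct demo_sb {
    const struct demo_sb_ops *ops;
    int frozen;
    int journal_broken;   /* hypothetical flag used to force a failure */
};

/* A filesystem that can fail to freeze (like the ext3/ext4/xfs case). */
static int demo_fs_freeze(struct demo_sb *sb)
{
    return sb->journal_broken ? -EIO : 0;
}

/* Unfreeze always succeeds, matching the description above. */
static int demo_fs_unfreeze(struct demo_sb *sb)
{
    (void)sb;
    return 0;
}

const struct demo_sb_ops demo_ops = {
    .freeze_fs = demo_fs_freeze,
    .unfreeze_fs = demo_fs_unfreeze,
};

/* Caller side: with an int return, the error can be propagated. */
int demo_freeze_super(struct demo_sb *sb)
{
    assert(sb != NULL);
    if (sb->frozen)
        return -EBUSY;
    if (sb->ops && sb->ops->freeze_fs) {
        int err = sb->ops->freeze_fs(sb);
        if (err)
            return err;   /* not possible while the hook returned void */
    }
    sb->frozen = 1;
    return 0;
}
```

With the old "void" signature the caller had no choice but to mark the
filesystem frozen unconditionally; here a failing freeze_fs leaves it unfrozen.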

Signed-off-by: Takashi Sato
Signed-off-by: Masayuki Hamaguchi
Cc: Christoph Hellwig
Cc: Dave Kleikamp
Cc: Dave Chinner
Cc: Alasdair G Kergon
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Takashi Sato
2009-01-10 08:54:42 +0800

02 Dec, 2008

1 commit

743bb4650 [XFS] Move copy_from_user calls out of ioctl helpers into ioctl switch.

Moving the copy_from_user out of some of the ioctl helpers will
make it easier for the compat ioctl switch to copy in the right
struct, then just pass to the underlying helper.

Also, move common access checks into the helpers themselves,
and out of the native ioctl switch code, to reduce code
duplication between native & compat ioctl callers.

Signed-off-by: Eric Sandeen
Reviewed-by: Christoph Hellwig
Signed-off-by: Lachlan McIlroy

sandeen@sandeen.net
2008-12-02 14:08:01 +0800

30 Oct, 2008

2 commits

7cc95a821 [XFS] Always use struct xfs_btree_block instead of short / longform structures.

Always use the generic xfs_btree_block type instead of the short / long
structures. Add XFS_BTREE_SBLOCK_LEN / XFS_BTREE_LBLOCK_LEN defines for
the length of a short / long form block. The rationale for this is that we
will grow more btree block header variants to support CRCs and other RAS
information, and always accessing them through the same datatype with
unions for the short / long form pointers makes implementing this much
easier.

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32300a

Signed-off-by: Christoph Hellwig
Signed-off-by: Donald Douwsma
Signed-off-by: David Chinner
Signed-off-by: Lachlan McIlroy

Christoph Hellwig
2008-10-30 14:14:34 +0800
136341b41 [XFS] cleanup btree record / key / ptr addressing macros.

Replace the generic record / key / ptr addressing macros that use cpp
token pasting with simpler macros that do the job for just one given btree
type. The new macros lose the cur argument and thus can be used outside
the core btree code, but also gain an xfs_mount * argument to allow for
checking the CRC flag in the near future. Note that many of these macros
aren't actually used in the kernel code, but only in userspace (mostly in
xfs_repair).

SGI-PV: 988146

SGI-Modid: xfs-linux-melb:xfs-kern:32295a

Signed-off-by: Christoph Hellwig
Signed-off-by: Donald Douwsma
Signed-off-by: David Chinner
Signed-off-by: Lachlan McIlroy

Christoph Hellwig
2008-10-30 14:11:40 +0800

28 Jul, 2008

1 commit

189f4bf22 [XFS] XFS: ASCII case-insensitive support

Implement ASCII case-insensitive support. Its primary purpose is to
support existing filesystems that already use this case-insensitive
mode and were migrated from IRIX. But if you only need ASCII-only
case-insensitive support (i.e. English only) and will never use another
language, then this mode is perfectly adequate.

ASCII-CI is implemented by generating hashes based on lower-case letters
and doing lower-case compares. It implements a new xfs_nameops vector for
doing the hashes and comparisons for all filename operations.
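
The idea of hashing and comparing on lower-cased bytes can be sketched as
below. This is an illustrative userspace sketch: the function names are
hypothetical, and the rotate-and-xor hash is only an example and is not
claimed to match the XFS on-disk directory hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold only ASCII 'A'..'Z'; every other byte is left untouched. */
static unsigned char demo_ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

static uint32_t demo_rol32(uint32_t v, unsigned int shift)
{
    return (v << shift) | (v >> (32 - shift));
}

/* Hash the lower-cased name so "README" and "readme" collide on purpose. */
uint32_t demo_ci_hashname(const unsigned char *name, size_t len)
{
    assert(name != NULL || len == 0);
    uint32_t hash = 0;
    for (size_t i = 0; i < len; i++)
        hash = demo_ascii_lower(name[i]) ^ demo_rol32(hash, 7);
    return hash;
}

/* Return 1 if both names are equal after ASCII lower-casing, else 0. */
int demo_ci_names_equal(const unsigned char *a, size_t alen,
                        const unsigned char *b, size_t blen)
{
    if (alen != blen)
        return 0;
    for (size_t i = 0; i < alen; i++)
        if (demo_ascii_lower(a[i]) != demo_ascii_lower(b[i]))
            return 0;
    return 1;
}
```

Because bytes outside 'A'..'Z' are never folded, names in other encodings
compare byte-for-byte, which is why the mode is described as ASCII-only.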

To create a filesystem with this CI mode, use: # mkfs.xfs -n version=ci

SGI-PV: 981516
SGI-Modid: xfs-linux-melb:xfs-kern:31209a

Signed-off-by: Barry Naujok
Signed-off-by: Christoph Hellwig

Barry Naujok
2008-07-28 14:58:42 +0800

29 Apr, 2008

2 commits

d349404ff [XFS] Don't double count reserved block changes on UP.

On uniprocessor machines, the incore superblock is used for all in memory
accounting of free blocks. In this situation, changes to the reserved
block count are accounted twice; once directly and once via
xfs_mod_incore_sb(). Seeing as the modification on SMP is done via
xfs_mod_incore_sb(), make this the only update mechanism that UP uses as
well.

SGI-PV: 980654
SGI-Modid: xfs-linux-melb:xfs-kern:30997a

Signed-off-by: David Chinner
Signed-off-by: Lachlan McIlroy

David Chinner
2008-04-29 13:58:27 +0800
d4d90b577 [XFS] Add xfs_icsb_sync_counters_locked for when m_sb_lock already held

Add a new xfs_icsb_sync_counters_locked for the case where m_sb_lock
is already taken and add a flags argument to xfs_icsb_sync_counters so
that xfs_icsb_sync_counters_flags is not needed.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30917a

Signed-off-by: Christoph Hellwig
Signed-off-by: David Chinner
Signed-off-by: Lachlan McIlroy

Christoph Hellwig
2008-04-29 13:57:11 +0800

10 Apr, 2008

1 commit

621187099 [XFS] remove shouting-indirection macros from xfs_sb.h

Remove macro-to-small-function indirection from xfs_sb.h, and remove some
which are completely unused.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30528a

Signed-off-by: Eric Sandeen
Signed-off-by: Donald Douwsma
Signed-off-by: Lachlan McIlroy

Eric Sandeen
2008-04-10 14:24:45 +0800

14 Feb, 2008

1 commit

413d57c99 xfs: convert beX_add to beX_add_cpu (new common API)

remove beX_add functions and replace all uses with beX_add_cpu

Signed-off-by: Marcin Slusarz
Cc: Mark Fasheh
Reviewed-by: Dave Chinner
Cc: Timothy Shimmin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Marcin Slusarz
2008-02-14 08:21:19 +0800

07 Feb, 2008

1 commit

3685c2a1d [XFS] Unwrap XFS_SB_LOCK.

Un-obfuscate XFS_SB_LOCK, remove XFS_SB_LOCK->mutex_lock->spin_lock
macros, call spin_lock directly, remove extraneous cookie holdover from
old xfs code, and change lock type to spinlock_t.

SGI-PV: 970382
SGI-Modid: xfs-linux-melb:xfs-kern:29746a

Signed-off-by: Eric Sandeen
Signed-off-by: Donald Douwsma
Signed-off-by: Tim Shimmin

Eric Sandeen
2008-02-07 13:47:15 +0800

16 Oct, 2007

2 commits

cc92e7ac8 [XFS] growlock should be a mutex

m_growlock only needs plain binary mutex semantics, so use a struct mutex
instead of a semaphore for it.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29512a

Signed-off-by: Christoph Hellwig
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Christoph Hellwig
2007-10-16 10:18:09 +0800
b267ce995 [XFS] kill struct bhv_vfs

Now that struct bhv_vfs doesn't have any members left we can kill it and
go directly from the super_block to the xfs_mount everywhere.

SGI-PV: 969608
SGI-Modid: xfs-linux-melb:xfs-kern:29509a

Signed-off-by: Christoph Hellwig
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Christoph Hellwig
2007-10-16 10:17:27 +0800

15 Oct, 2007

1 commit

2bdf7cd0b [XFS] superblock endianess annotations

Creates a new xfs_dsb_t that is __be annotated and keeps xfs_sb_t for the
incore one. xfs_xlatesb is renamed to xfs_sb_to_disk and only handles the
incore -> disk conversion. A new helper xfs_sb_from_disk handles the other
direction and doesn't need the slightly hacky table-driven approach
because we only ever read the full sb from disk.
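
The split between an incore structure and a big-endian on-disk structure, with
one conversion helper per direction, can be sketched in userspace as follows.
The struct layouts and names here are hypothetical and far smaller than the
real xfs_sb_t and xfs_dsb_t; byte arrays stand in for the __be annotated
fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Incore form: native CPU byte order. */
struct demo_incore_sb {
    uint32_t magic;
    uint32_t blocksize;
    uint64_t dblocks;
};

/* On-disk form: every field is stored big-endian. */
struct demo_disk_sb {
    uint8_t magic[4];
    uint8_t blocksize[4];
    uint8_t dblocks[8];
};

static void demo_put_be(uint8_t *p, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        p[i] = (uint8_t)(v >> (8 * (nbytes - 1 - i)));
}

static uint64_t demo_get_be(const uint8_t *p, size_t nbytes)
{
    uint64_t v = 0;
    for (size_t i = 0; i < nbytes; i++)
        v = (v << 8) | p[i];
    return v;
}

/* incore -> disk only, mirroring the role described for xfs_sb_to_disk. */
void demo_sb_to_disk(struct demo_disk_sb *to, const struct demo_incore_sb *from)
{
    assert(to != NULL && from != NULL);
    demo_put_be(to->magic, from->magic, 4);
    demo_put_be(to->blocksize, from->blocksize, 4);
    demo_put_be(to->dblocks, from->dblocks, 8);
}

/* disk -> incore: always converts the full structure, no field table. */
void demo_sb_from_disk(struct demo_incore_sb *to, const struct demo_disk_sb *from)
{
    assert(to != NULL && from != NULL);
    to->magic = (uint32_t)demo_get_be(from->magic, 4);
    to->blocksize = (uint32_t)demo_get_be(from->blocksize, 4);
    to->dblocks = demo_get_be(from->dblocks, 8);
}
```

Keeping two distinct types means the compiler (or a checker such as sparse, in
the kernel's case) can flag code that mixes disk-order and CPU-order values.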

The handling of shared r/o filesystems has been buggy on little-endian
systems, and fixing this required shuffling some code around in that
area.

SGI-PV: 968563
SGI-Modid: xfs-linux-melb:xfs-kern:29477a

Signed-off-by: Christoph Hellwig
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Christoph Hellwig
2007-10-15 14:49:09 +0800

14 Jul, 2007

5 commits

2a82b8be8 [XFS] Concurrent Multi-File Data Streams

In media spaces, video is often stored in a frame-per-file format. When
dealing with uncompressed realtime HD video streams in this format, it is
crucial that files do not get fragmented and that multiple files are placed
contiguously on disk.

When multiple streams are being ingested and played out at the same time,
it is critical that the filesystem does not cross the streams and
interleave them together as this creates seek and readahead cache miss
latency and prevents both ingest and playout from meeting frame rate
targets.

This patch set introduces a "stream of files" concept into the allocator to
place all the data from a single stream contiguously on disk so that RAID
array readahead can be used effectively. Each additional stream gets
placed in different allocation groups within the filesystem, thereby
ensuring that we don't cross any streams. When an AG fills up, we select a
new AG for the stream that is not in use.

The core of the functionality is the stream tracking - each inode that we
create in a directory needs to be associated with the directory's stream.
Hence every time we create a file, we look up the directory's stream
object and associate the new file with that object.

Once we have a stream object for a file, we use the AG that the stream
object points to for allocations. If we can't allocate in that AG (e.g. it
is full) we move the entire stream to another AG. Other inodes in the same
stream are moved to the new AG on their next allocation (i.e. lazy
update).

Stream objects are kept in a cache and hold a reference on the inode.
Hence the inode cannot be reclaimed while there is an outstanding stream
reference. This means that on unlink we need to remove the stream
association and we also need to flush all the associations on certain
events that want to reclaim all unreferenced inodes (e.g. filesystem
freeze).

SGI-PV: 964469
SGI-Modid: xfs-linux-melb:xfs-kern:29096a

Signed-off-by: David Chinner
Signed-off-by: Barry Naujok
Signed-off-by: Donald Douwsma
Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin
Signed-off-by: Vlad Apostolov

David Chinner
2007-07-14 13:40:53 +0800
84e1e99f1 [XFS] Prevent ENOSPC from aborting transactions that need to succeed

During delayed allocation extent conversion or unwritten extent
conversion, we need to reserve some blocks for transaction reservations.
We need to reserve these blocks in case a btree split occurs and we need
to allocate some blocks.

Unfortunately, we've only ever reserved the number of data blocks we are
allocating, so in both the unwritten and delalloc case we can get ENOSPC
to the transaction reservation. This is bad because in both cases we
cannot report the failure to the writing application.

The fix is two-fold:

1 - leverage the reserved block infrastructure XFS already
has to reserve a small pool of blocks by default to allow
specially marked transactions to dip into when we are at
ENOSPC.
Default setting is min(5%, 1024 blocks).

2 - convert critical transaction reservations to be allowed
to dip into this pool. Spots changed are delalloc
conversion, unwritten extent conversion and growing a
filesystem at ENOSPC.
This also allows growing the filesystem to succeed at ENOSPC.
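
A rough userspace sketch of the two parts of the fix follows. It assumes the
default pool is the smaller of 5% of the data blocks and 1024 blocks, as
stated above; the struct and function names are hypothetical and the real
accounting in XFS is considerably more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Part 1: default reserved pool size, min(5% of data blocks, 1024). */
uint64_t demo_default_reserved_blocks(uint64_t data_blocks)
{
    uint64_t five_percent = data_blocks / 20;
    return five_percent < 1024 ? five_percent : 1024;
}

struct demo_space {
    uint64_t free_blocks;     /* blocks any transaction may take */
    uint64_t reserved_pool;   /* blocks only marked transactions may take */
};

/*
 * Part 2: reserve `want` blocks. Ordinary transactions fail with -1
 * (think ENOSPC) once free_blocks runs out; transactions flagged with
 * may_use_reserve may dip into the reserved pool to make up the rest.
 */
int demo_reserve_blocks(struct demo_space *sp, uint64_t want,
                        int may_use_reserve)
{
    assert(sp != NULL);
    if (sp->free_blocks >= want) {
        sp->free_blocks -= want;
        return 0;
    }
    /* Here want > free_blocks, so the subtraction cannot underflow. */
    if (may_use_reserve && want - sp->free_blocks <= sp->reserved_pool) {
        sp->reserved_pool -= want - sp->free_blocks;
        sp->free_blocks = 0;
        return 0;
    }
    return -1;
}
```

In this model, delalloc conversion, unwritten extent conversion and growfs
would pass may_use_reserve = 1, while everything else passes 0.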

SGI-PV: 964468
SGI-Modid: xfs-linux-melb:xfs-kern:28865a

Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

David Chinner
2007-07-14 13:35:19 +0800
0164af51c [XFS] Log the agf_length change in xfs_growfs_data_private().

SGI-PV: 963528
SGI-Modid: xfs-linux-melb:xfs-kern:28856a

Signed-off-by: Tim Shimmin
Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig

Tim Shimmin
2007-07-14 13:32:59 +0800
92821e2ba [XFS] Lazy Superblock Counters

When we have a couple of hundred transactions on the fly at once, they all
typically modify the on disk superblock in some way.
create/unlink/mkdir/rmdir modify inode counts, allocation/freeing modify
free block counts.

When these counts are modified in a transaction, they must eventually lock
the superblock buffer and apply the mods. The buffer then remains locked
until the transaction is committed into the incore log buffer. The result
of this is that with enough transactions on the fly the incore superblock
buffer becomes a bottleneck.

The result of contention on the incore superblock buffer is that
transaction rates fall - the more pressure that is put on the superblock
buffer, the slower things go.

The key to removing the contention is to not require the superblock fields
in question to be locked. We do that by not marking the superblock dirty
in the transaction. IOWs, we modify the incore superblock but do not
modify the cached superblock buffer. In short, we do not log superblock
modifications to critical fields in the superblock on every transaction.
In fact we only do it just before we write the superblock to disk every
sync period or just before unmount.

This creates an interesting problem - if we don't log or write out the
fields in every transaction, then how do the values get recovered after a
crash? The answer is simple - we keep enough duplicate, logged information
in other structures that we can reconstruct the correct count after log
recovery has been performed.

It is the AGF and AGI structures that contain the duplicate information;
after recovery, we walk every AGI and AGF and sum their individual
counters to get the correct value, and we do a transaction into the log to
correct them. An optimisation of this is that if we have a clean unmount
record, we know the value in the superblock is correct, so we can avoid
the summation walk under normal conditions and so mount/recovery times do
not change under normal operation.
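
The recovery-time summation can be sketched as below. This is a simplified
userspace model under stated assumptions: the per-AG struct merges the
counters that live in the AGF and AGI, all names are hypothetical, and the
transaction that logs the corrected values is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-AG counters; in XFS these are split between the AGF and the AGI. */
struct demo_ag_counts {
    uint64_t free_blocks;    /* AGF side */
    uint64_t inode_count;    /* AGI side */
    uint64_t free_inodes;    /* AGI side */
};

/* The lazily maintained superblock counters. */
struct demo_lazy_sb {
    uint64_t fdblocks;
    uint64_t icount;
    uint64_t ifree;
};

/*
 * After log recovery, rebuild the superblock counters by summing every
 * AG. With a clean unmount record the superblock is already correct,
 * so the walk is skipped. Returns 1 if the walk ran, 0 if skipped.
 */
int demo_rebuild_sb_counters(struct demo_lazy_sb *sb,
                             const struct demo_ag_counts *ags,
                             size_t agcount, int clean_unmount)
{
    assert(sb != NULL && (ags != NULL || agcount == 0));
    if (clean_unmount)
        return 0;

    struct demo_lazy_sb sum = { 0, 0, 0 };
    for (size_t i = 0; i < agcount; i++) {
        sum.fdblocks += ags[i].free_blocks;
        sum.icount += ags[i].inode_count;
        sum.ifree += ags[i].free_inodes;
    }
    *sb = sum;
    return 1;
}
```

The early return is the optimisation described above: mount time only pays
for the per-AG walk after an unclean shutdown.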

One wrinkle that was discovered during development was that the blocks
used in the freespace btrees are never accounted for in the AGF counters.
This was once a valid optimisation to make; when the filesystem is full,
the free space btrees are empty and consume no space. Hence when it
matters, the "accounting" is correct. But that means that when we do the
AGF summations, we would not have a correct count and xfs_check would
complain. Hence a new counter was added to track the number of blocks used
by the free space btrees. This is an *on-disk format change*.

As a result of this, lazy superblock counters are a mkfs option and at the
moment on linux there is no way to convert an old filesystem. This is
possible - xfs_db can be used to twiddle the right bits and then
xfs_repair will do the format conversion for you. Similarly, you can
convert backwards as well. At some point we'll add functionality to
xfs_admin to do the bit twiddling easily....

SGI-PV: 964999
SGI-Modid: xfs-linux-melb:xfs-kern:28652a

Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin

David Chinner
2007-07-14 13:28:50 +0800
4cc929ee3 [XFS] Don't grow filesystems past the size they can index.

When growing a filesystem we don't check to see if the new size overflows
the page cache index range, so we can do silly things like grow a
filesystem past 16TB on a 32-bit system. Check new filesystem sizes against the
limits the kernel can support.
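
The kind of check described here can be sketched as follows. This is a
userspace illustration with a hypothetical function name; the page cache index
width and page size are passed in explicitly (32 bits and 4 KiB pages give the
16TB limit mentioned above), whereas kernel code would use its own constants.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if a filesystem of `nblocks` blocks of size 2^blocklog bytes
 * can be indexed by a page cache index of `index_bits` bits with pages
 * of 2^page_shift bytes, else 0.
 */
int demo_fs_size_fits(uint64_t nblocks, unsigned int blocklog,
                      unsigned int page_shift, unsigned int index_bits)
{
    assert(blocklog <= page_shift && page_shift < 64);
    assert(index_bits >= 1 && index_bits <= 64);

    /* Number of whole pages covered by nblocks blocks. */
    uint64_t pages = nblocks >> (page_shift - blocklog);

    /* Largest value the page cache index type can hold. */
    uint64_t max_index = (index_bits == 64)
        ? UINT64_MAX
        : ((UINT64_C(1) << index_bits) - 1);

    return pages <= max_index;
}
```

With 4 KiB blocks and pages, 2^32 blocks is exactly 16TB, which no longer fits
in a 32-bit index, while the same size is fine with a 64-bit index.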

SGI-PV: 957886
SGI-Modid: xfs-linux-melb:xfs-kern:28563a

Signed-off-by: Nathan Scott
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Nathan Scott
2007-07-14 13:21:29 +0800

08 May, 2007

1 commit

1c72bf900 [XFS] The last argument "lsn" of xfs_trans_commit() is always called with NULL.

Patch provided by Eric Sandeen.

SGI-PV: 961693
SGI-Modid: xfs-linux-melb:xfs-kern:28199a

Signed-off-by: Eric Sandeen
Signed-off-by: Lachlan McIlroy
Signed-off-by: Tim Shimmin

Eric Sandeen
2007-05-08 11:48:42 +0800

10 Feb, 2007

2 commits

2c36ddeda [XFS] Remove unused arguments from the XFS_BTREE_*_ADDR macros.

It makes it incrementally clearer to read the code when the top of a macro
spaghetti-pile only receives the 3 arguments it uses, rather than 2 extra
ones which are not used. Also when you start pulling this thread out of
the sweater (i.e. remove unused args from XFS_BTREE_*_ADDR), a couple
other third arms etc fall off too. If they're not used in the macro, then
they sometimes don't need to be passed to the function calling the macro
either, etc....

Patch provided by Eric Sandeen (sandeen@sandeen.net).

SGI-PV: 960197
SGI-Modid: xfs-linux-melb:xfs-kern:28037a

Signed-off-by: Eric Sandeen
Signed-off-by: David Chinner
Signed-off-by: Tim Shimmin

Eric Sandeen
2007-02-10 15:37:33 +0800
dbcabad19 [XFS] Fix block reservation mechanism.

The block reservation mechanism has been broken since the per-cpu
superblock counters were introduced. Make the block reservation code work
with the per-cpu counters by syncing the counters, snapshotting the amount
of available space and then doing a modification of the counter state
according to the result. Continue in a loop until we either have no space
available or we reserve some space.

SGI-PV: 956323
SGI-Modid: xfs-linux-melb:xfs-kern:27895a

Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin

David Chinner
2007-02-10 15:36:17 +0800

07 Sep, 2006

1 commit

4be536deb [XFS] Prevent free space oversubscription and xfssyncd looping.

The fix for recent ENOSPC deadlocks introduced certain limitations on
allocations. The fix could cause xfssyncd to loop endlessly if we did not
leave some space free for the allocator to work correctly. Basically, we
needed to ensure that we had at least 4 blocks free for an AG free list
and a block for the inode bmap btree at all times.

However, this did not take into account the fact that each AG has a free
list that needs 4 blocks. Hence any filesystem with more than one AG could
cause oversubscription of free space and make xfssyncd spin forever trying
to allocate space needed for AG freelists that was not available in the
AG.

The following patch reserves space for the free lists in all AGs plus the
inode bmap btree which prevents oversubscription. It also prevents those
blocks from being reported as free space (as they can never be used) and
makes the SMP in-core superblock accounting code and the reserved block
ioctl respect this requirement.

SGI-PV: 955674
SGI-Modid: xfs-linux-melb:xfs-kern:26894a

Signed-off-by: David Chinner
Signed-off-by: David Chatterton

David Chinner
2006-09-07 12:26:50 +0800

20 Jun, 2006

1 commit

f6c2d1fa6 [XFS] Remove version 1 directory code. Never functioned on Linux, just pure bloat.

SGI-PV: 952969
SGI-Modid: xfs-linux-melb:xfs-kern:26251a

Signed-off-by: Nathan Scott

Nathan Scott
2006-06-20 11:04:51 +0800

09 Jun, 2006

3 commits

421ad1345 [XFS] Fix mismerge of the fs_writable cleanup patch causing a freeze/thaw test hang.

SGI-PV: 953563
SGI-Modid: xfs-linux-melb:xfs-kern:26182a

Signed-off-by: Nathan Scott

Nathan Scott
2006-06-09 15:12:46 +0800
b83bd1388 [XFS] Resolve a namespace collision on vfs/vfsops for FreeBSD porters.

SGI-PV: 9533338
SGI-Modid: xfs-linux-melb:xfs-kern:26106a

Signed-off-by: Nathan Scott

Nathan Scott
2006-06-09 14:48:30 +0800
7d04a335b [XFS] Shutdown the filesystem if all device paths have gone. Made shutdown vop flags consistent with sync vop flags declarations too.

SGI-PV: 939911
SGI-Modid: xfs-linux-melb:xfs-kern:26096a

Signed-off-by: Nathan Scott

Nathan Scott
2006-06-09 12:58:38 +0800

29 Mar, 2006

1 commit

c41564b5a [XFS] We really suck at spulling. Thanks to Chris Pascoe for fixing all these typos.

SGI-PV: 904196
SGI-Modid: xfs-linux-melb:xfs-kern:25539a

Signed-off-by: Nathan Scott

Nathan Scott
2006-03-29 06:55:14 +0800

14 Mar, 2006

1 commit

8d280b98c [XFS] On machines with more than 8 cpus, when running parallel I/O
threads, the incore superblock lock becomes the limiting factor for
buffered write throughput. Make the contended fields in the incore
superblock use per-cpu counters so that there is no global lock to limit
scalability.
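
The per-cpu counter idea can be modelled in userspace as below. This is a
single-threaded sketch with hypothetical names and a fixed slot count; it
shows only the data layout (one slot per CPU, summed on demand) and none of
the kernel's per-cpu, preemption or rebalancing machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 8

/* One slot per CPU; each CPU only ever updates its own slot. */
struct demo_percpu_counter {
    int64_t slot[DEMO_NR_CPUS];
};

/* Fast path: touch only the local slot, so no global lock is needed. */
void demo_counter_add(struct demo_percpu_counter *c, unsigned int cpu,
                      int64_t delta)
{
    assert(c != NULL && cpu < DEMO_NR_CPUS);
    c->slot[cpu] += delta;
}

/* Slow path: an accurate total requires visiting every slot. */
int64_t demo_counter_sum(const struct demo_percpu_counter *c)
{
    assert(c != NULL);
    int64_t total = 0;
    for (size_t i = 0; i < DEMO_NR_CPUS; i++)
        total += c->slot[i];
    return total;
}
```

The trade-off is that updates become cheap and uncontended while reading an
exact global value becomes more expensive, which suits counters that are
modified far more often than they are read.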

SGI-PV: 946630
SGI-Modid: xfs-linux-melb:xfs-kern:25106a

Signed-off-by: David Chinner
Signed-off-by: Nathan Scott

David Chinner
2006-03-14 10:13:09 +0800

15 Jan, 2006

1 commit

014c2544e return statement cleanup - kill pointless parentheses

This patch removes pointless parentheses from return statements.

Signed-off-by: Jesper Juhl
Signed-off-by: Adrian Bunk

Jesper Juhl
2006-01-15 09:37:08 +0800

11 Jan, 2006

1 commit

e13a73f02 [XFS] Write log dummy record when freezing filesystem

SGI-PV: 945483
SGI-Modid: xfs-linux-melb:xfs-kern:202638a

Signed-off-by: Christoph Hellwig
Signed-off-by: Nathan Scott

Christoph Hellwig
2006-01-11 12:30:08 +0800

25 Nov, 2005

1 commit

f33c6797b [XFS] handle error returns from freeze_bdev

SGI-PV: 945483
SGI-Modid: xfs-linux-melb:xfs-kern:201884a

Signed-off-by: Christoph Hellwig
Signed-off-by: Nathan Scott

Christoph Hellwig
2005-11-25 13:41:47 +0800

02 Nov, 2005

5 commits

c11e2c369 [XFS] Rework fid encode/decode wrt 64 bit inums interacting with NFS.

SGI-PV: 937127
SGI-Modid: xfs-linux:xfs-kern:24201a

Signed-off-by: Nathan Scott

Nathan Scott
2005-11-02 12:11:45 +0800
16259e7d9 [XFS] Endianess annotations for various allocator data structures

SGI-PV: 943272
SGI-Modid: xfs-linux:xfs-kern:201006a

Signed-off-by: Christoph Hellwig
Signed-off-by: Nathan Scott

Christoph Hellwig
2005-11-02 12:11:25 +0800
7b7187698 [XFS] Update license/copyright notices to match the prefered SGI boilerplate.

SGI-PV: 913862
SGI-Modid: xfs-linux:xfs-kern:23903a

Signed-off-by: Nathan Scott

Nathan Scott
2005-11-02 11:58:39 +0800
a844f4510 [XFS] Remove xfs_macros.c, xfs_macros.h, rework headers a whole lot.

SGI-PV: 943122
SGI-Modid: xfs-linux:xfs-kern:23901a

Signed-off-by: Nathan Scott

Nathan Scott
2005-11-02 11:38:42 +0800
d8cc890d4 [XFS] Ondisk format extension for extended attributes (attr2). Basically,
the data/attr forks now grow up/down from either end of the literal area,
rather than dividing the literal area into two chunks and growing both
upward. Means we can now make much more efficient use of the attribute
space, incl. fitting DMF attributes inline in 256 byte inodes, and large
jumps in dbench3 performance numbers. It is self enabling, but can be
forced on/off via the attr2/noattr2 mount options.

SGI-PV: 941645
SGI-Modid: xfs-linux:xfs-kern:23835a

Signed-off-by: Nathan Scott

Nathan Scott
2005-11-02 07:34:53 +0800