Eric Lee / smarc-fsl-linux-kernel

08 May, 2007

1 commit

50953fe9e slab allocators: Remove SLAB_DEBUG_INITIAL flag ... Browse Code »

I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by
SLAB.

I think its purpose was to have a callback after an object has been freed
to verify that the state is the constructor state again? The callback is
performed before each freeing of an object.

I would think that it is much easier to check the object state manually
before the free. That also places the check near the code object
manipulation of the object.

Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
compiled with SLAB debugging on. If there would be code in a constructor
handling SLAB_DEBUG_INITIAL then it would have to be conditional on
SLAB_DEBUG otherwise it would just be dead code. But there is no such code
in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real
use of, difficult to understand and there are easier ways to accomplish the
same effect (i.e. add debug code before kfree).

There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
clear in fs inode caches. Remove the pointless checks (they would even be
pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors.

This is the last slab flag that SLUB did not support. Remove the check for
unimplemented flags from SLUB.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2007-05-08 03:12:57 +0800

29 Mar, 2007

3 commits

16a100190 [PATCH] holepunch: fix disconnected pages after second truncate ... Browse Code »

shmem_truncate_range has its own truncate_inode_pages_range, to free any pages
racily instantiated while it was in progress: a SHMEM_PAGEIN flag is set when
this might have happened. But holepunching gets no chance to clear that flag
at the start of vmtruncate_range, so it's always set (unless a truncate came
just before), so holepunch almost always does this second
truncate_inode_pages_range.

shmem holepunch has unlikely swapfile races hereabouts whatever we do
(without a fuller rework than is fit for this release): I was going to skip
the second truncate in the punch_hole case, but Miklos points out that would
make holepunch correctness more vulnerable to swapoff. So keep the second
truncate, but follow it by an unmap_mapping_range to eliminate the
disconnected pages (freed from pagecache while still mapped in userspace) that
it might have left behind.

Signed-off-by: Hugh Dickins
Cc: Miklos Szeredi
Cc: Badari Pulavarty
Cc: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2007-03-29 23:22:25 +0800
1ae700063 [PATCH] holepunch: fix shmem_truncate_range punch locking ... Browse Code »

Miklos Szeredi observes that during truncation of shmem page directories,
info->lock is released to improve latency (after lowering i_size and
next_index to exclude races); but this is quite wrong for holepunching, which
receives no such protection from i_size or next_index, and is left vulnerable
to races with shmem_unuse, shmem_getpage and shmem_writepage.

Hold info->lock throughout when holepunching? No, any user could prevent
rescheduling for far too long. Instead take info->lock just when needed: in
shmem_free_swp when removing the swap entries, and whenever removing a
directory page from the level above. But so long as we remove before
scanning, we can safely skip taking the lock at the lower levels, except at
misaligned start and end of the hole.

Signed-off-by: Hugh Dickins
Cc: Miklos Szeredi
Cc: Badari Pulavarty
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2007-03-29 23:22:25 +0800
a2646d1e6 [PATCH] holepunch: fix shmem_truncate_range punching too far ... Browse Code »

Miklos Szeredi observes BUG_ON(!entry) in shmem_writepage() triggered in rare
circumstances, because shmem_truncate_range() erroneously removes partially
truncated directory pages at the end of the range: later reclaim on pages
pointing to these removed directories triggers the BUG. Indeed, and it can
also cause data loss beyond the hole.

Fix this as in the patch proposed by Miklos, but distinguish between "limit"
(how far we need to search: ignore truncation's next_index optimization in the
holepunch case - if there are races it's more consistent to act on the whole
range specified) and "upper_limit" (how far we can free directory pages:
generally we must be careful to keep partially punched pages, but can relax at
end of file - i_size being held stable by i_mutex).

Signed-off-by: Hugh Dickins
Cc: Miklos Szeredi
Cc: Badari Pulavarty
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2007-03-29 23:22:25 +0800

05 Mar, 2007

1 commit

759b9775c [PATCH] shmem and simple const super_operations ... Browse Code »

shmem's super_operations were missed from the recent const-ification;
and simple_fill_super()'s, which can share with get_sb_pseudo()'s.

Signed-off-by: Hugh Dickins
Acked-by: Josef 'Jeff' Sipek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2007-03-05 23:57:51 +0800

02 Mar, 2007

1 commit

9b83a6a85 [PATCH] mm/{,tiny-}shmem.c cleanups ... Browse Code »

shmem_{nopage,mmap} are no longer used in ipc/shm.c

Signed-off-by: Adrian Bunk
Cc: "Eric W. Biederman"
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2007-03-02 06:53:35 +0800

13 Feb, 2007

1 commit

92e1d5be9 [PATCH] mark struct inode_operations const 2 ... Browse Code »

Many struct inode_operations in the kernel can be "const". Marking them const
moves these to the .rodata section, which avoids false sharing with potential
dirty data. In addition it'll catch accidental writes at compile time to
these shared resources.

Signed-off-by: Arjan van de Ven
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Arjan van de Ven
2007-02-13 01:48:46 +0800

12 Feb, 2007

1 commit

767193253 [PATCH] simplify shmem_aops.set_page_dirty() method ... Browse Code »

shmem backed file does not have page writeback, nor it participates in
backing device's dirty or writeback accounting. So using generic
__set_page_dirty_nobuffers() for its .set_page_dirty aops method is a bit
overkill. It unnecessarily prolongs shm unmap latency.

For example, on a densely populated large shm segment (sevearl GBs), the
unmapping operation becomes painfully long. Because at unmap, kernel
transfers dirty bit in PTE into page struct and to the radix tree tag. The
operation of tagging the radix tree is particularly expensive because it
has to traverse the tree from the root to the leaf node on every dirty
page. What's bothering is that radix tree tag is used for page write back.
However, shmem is memory backed and there is no page write back for such
file system. And in the end, we spend all that time tagging radix tree and
none of that fancy tagging will be used. So let's simplify it by introduce
a new aops __set_page_dirty_no_writeback and this will speed up shm unmap.

Signed-off-by: Ken Chen
Cc: Peter Zijlstra
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ken Chen
2007-02-12 02:51:19 +0800

23 Dec, 2006

1 commit

92a3d03aa [PATCH] Fix for shmem_truncate_range() BUG_ON() ... Browse Code »

Ran into BUG() while doing madvise(REMOVE) testing. If we are punching a
hole into shared memory segment using madvise(REMOVE) and the entire hole
is below the indirect blocks, we hit following assert.

BUG_ON(limit
Cc: Hugh Dickins
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Badari Pulavarty
2006-12-23 00:55:47 +0800

09 Dec, 2006

1 commit

d3ac7f892 [PATCH] mm: change uses of f_{dentry,vfsmnt} to use f_path ... Browse Code »

Change all the uses of f_{dentry,vfsmnt} to f_path.{dentry,mnt} in linux/mm/.

Signed-off-by: Josef "Jeff" Sipek
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Josef "Jeff" Sipek
2006-12-09 00:28:43 +0800

08 Dec, 2006

3 commits

15ad7cdcf [PATCH] struct seq_operations and struct file_operations constification ... Browse Code »

- move some file_operations structs into the .rodata section

- move static strings from policy_types[] array into the .rodata section

- fix generic seq_operations usages, so that those structs may be defined
as "const" as well

[akpm@osdl.org: couple of fixes]
Signed-off-by: Helge Deller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Helge Deller
2006-12-08 00:39:46 +0800
1f370a23f [PATCH] make mm/shmem.c:shmem_xattr_security_handler static ... Browse Code »

Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2006-12-08 00:39:39 +0800
e94b17660 [PATCH] slab: remove SLAB_KERNEL ... Browse Code »

SLAB_KERNEL is an alias of GFP_KERNEL.

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2006-12-08 00:39:24 +0800

21 Oct, 2006

1 commit

3fcfab16c [PATCH] separate bdi congestion functions from queue congestion functions ... Browse Code »

Separate out the concept of "queue congestion" from "backing-dev congestion".
Congestion is a backing-dev concept, not a queue concept.

The blk_* congestion functions are retained, as wrappers around the core
backing-dev congestion functions.

This proper layering is needed so that NFS can cleanly use the congestion
functions, and so that CONFIG_BLOCK=n actually links.

Cc: "Thomas Maier"
Cc: "Jens Axboe"
Cc: Trond Myklebust
Cc: David Howells
Cc: Peter Osterlund
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2006-10-21 01:26:35 +0800

17 Oct, 2006

1 commit

91828a405 [PATCH] knfsd: add nfs-export support to tmpfs ... Browse Code »

We need to encode a decode the 'file' part of a handle. We simply use the
inode number and generation number to construct the filehandle.

The generation number is the time when the file was created. As inode numbers
cycle through the full 32 bits before being reused, there is no real chance of
the same inum being allocated to different files in the same second so this is
suitably unique. Using time-of-day rather than e.g. jiffies makes it less
likely that the same filehandle can be created after a reboot.

In order to be able to decode a filehandle we need to be able to lookup by
inum, which means that the inode needs to be added to the inode hash table
(tmpfs doesn't currently hash inodes as there is never a need to lookup by
inum). To avoid overhead when not exporting, we only hash an inode when it is
first exported. This requires a lock to ensure it isn't hashed twice.

This code is separate from the patch posted in June06 from Atal Shargorodsky
which provided the same functionality, but does borrow slightly from it.

Locking comment: Most filesystems that hash their inodes do so at the point
where the 'struct inode' is initialised, and that has suitable locking
(I_NEW). Here in shmem, we are hashing the inode later, the first time we
need an NFS file handle for it. We no longer have I_NEW to ensure only one
thread tries to add it to the hash table.

Cc: Atal Shargorodsky
Cc: Gilad Ben-Yossef
Signed-off-by: David M. Grimes
Signed-off-by: Neil Brown
Acked-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David M. Grimes
2006-10-17 23:18:43 +0800

01 Oct, 2006

2 commits

d8c76e6f4 [PATCH] r/o bind mount prepwork: inc_nlink() helper ... Browse Code »

This is mostly included for parity with dec_nlink(), where we will have some
more hooks. This one should stay pretty darn straightforward for now.

Signed-off-by: Dave Hansen
Acked-by: Christoph Hellwig
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dave Hansen
2006-10-01 15:39:30 +0800
9a53c3a78 [PATCH] r/o bind mounts: unlink: monitor i_nlink ... Browse Code »

When a filesystem decrements i_nlink to zero, it means that a write must be
performed in order to drop the inode from the filesystem.

We're shortly going to have keep filesystems from being remounted r/o between
the time that this i_nlink decrement and that write occurs.

So, add a little helper function to do the decrements. We'll tie into it in a
bit to note when i_nlink hits zero.

Signed-off-by: Dave Hansen
Acked-by: Christoph Hellwig
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dave Hansen
2006-10-01 15:39:30 +0800

30 Sep, 2006

1 commit

39f0247d3 [PATCH] Access Control Lists for tmpfs ... Browse Code »

Add access control lists for tmpfs.

Signed-off-by: Andreas Gruenbacher
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andreas Gruenbacher
2006-09-30 00:18:24 +0800

27 Sep, 2006

2 commits

ba52de123 [PATCH] inode-diet: Eliminate i_blksize from the inode structure ... Browse Code »

This eliminates the i_blksize field from struct inode. Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function.

Note that some filesystems were providing pretty much random (and incorrect)
values for i_blksize.

[bunk@stusta.de: cleanup]
[akpm@osdl.org: generic_fillattr() fix]
Signed-off-by: "Theodore Ts'o"
Signed-off-by: Adrian Bunk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Theodore Ts'o
2006-09-27 23:26:18 +0800
1a1d92c10 [PATCH] Really ignore kmem_cache_destroy return value ... Browse Code »

* Rougly half of callers already do it by not checking return value
* Code in drivers/acpi/osl.c does the following to be sure:

(void)kmem_cache_destroy(cache);

* Those who check it printk something, however, slab_error already printed
the name of failed cache.
* XFS BUGs on failed kmem_cache_destroy which is not the decision
low-level filesystem driver should make. Converted to ignore.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2006-09-27 23:26:10 +0800

26 Sep, 2006

1 commit

c1f60a5a4 [PATCH] reduce MAX_NR_ZONES: move HIGHMEM counters into highmem.c/.h ... Browse Code »

Move totalhigh_pages and nr_free_highpages() into highmem.c/.h

Move the totalhigh_pages definition into highmem.c/.h. Move the
nr_free_highpages function into highmem.c

[yoichi_yuasa@tripeaks.co.jp: build fix]
Signed-off-by: Christoph Lameter
Signed-off-by: Yoichi Yuasa
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2006-09-26 23:48:46 +0800

01 Jul, 2006

3 commits

22a3e233c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial:
Remove obsolete #include
remove obsolete swsusp_encrypt
arch/arm26/Kconfig typos
Documentation/IPMI typos
Kconfig: Typos in net/sched/Kconfig
v9fs: do not include linux/version.h
Documentation/DocBook/mtdnand.tmpl: typo fixes
typo fixes: specfic -> specific
typo fixes in Documentation/networking/pktgen.txt
typo fixes: occuring -> occurring
typo fixes: infomation -> information
typo fixes: disadvantadge -> disadvantage
typo fixes: aquire -> acquire
typo fixes: mecanism -> mechanism
typo fixes: bandwith -> bandwidth
fix a typo in the RTC_CLASS help text
smb is no longer maintained

Manually merged trivial conflict in arch/um/kernel/vmlinux.lds.S

Linus Torvalds
2006-07-01 06:39:30 +0800
f8891e5e1 [PATCH] Light weight event counters ... Browse Code »

The remaining counters in page_state after the zoned VM counter patches
have been applied are all just for show in /proc/vmstat. They have no
essential function for the VM.

We use a simple increment of per cpu variables. In order to avoid the most
severe races we disable preempt. Preempt does not prevent the race between
an increment and an interrupt handler incrementing the same statistics
counter. However, that race is exceedingly rare, we may only loose one
increment or so and there is no requirement (at least not in kernel) that
the vm event counters have to be accurate.

In the non preempt case this results in a simple increment for each
counter. For many architectures this will be reduced by the compiler to a
single instruction. This single instruction is atomic for i386 and x86_64.
And therefore even the rare race condition in an interrupt is avoided for
both architectures in most cases.

The patchset also adds an off switch for embedded systems that allows a
building of linux kernels without these counters.

The implementation of these counters is through inline code that hopefully
results in only a single instruction increment instruction being emitted
(i386, x86_64) or in the increment being hidden though instruction
concurrency (EPIC architectures such as ia64 can get that done).

Benefits:
- VM event counter operations usually reduce to a single inline instruction
on i386 and x86_64.
- No interrupt disable, only preempt disable for the preempt case.
Preempt disable can also be avoided by moving the counter into a spinlock.
- Handling is similar to zoned VM counters.
- Simple and easily extendable.
- Can be omitted to reduce memory use for embedded use.

References:

RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=113512330605497&w=2
RFC http://marc.theaimsgroup.com/?l=linux-kernel&m=114988082814934&w=2
local_t http://marc.theaimsgroup.com/?l=linux-kernel&m=114991748606690&w=2
V2 http://marc.theaimsgroup.com/?t=115014808400007&r=1&w=2
V3 http://marc.theaimsgroup.com/?l=linux-kernel&m=115024767022346&w=2
V4 http://marc.theaimsgroup.com/?l=linux-kernel&m=115047968808926&w=2

Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2006-07-01 02:25:36 +0800
6ab3d5624 Remove obsolete #include <linux/config.h> ... Browse Code »

Signed-off-by: Jörn Engel
Signed-off-by: Adrian Bunk

Jörn Engel
2006-07-01 01:25:36 +0800

30 Jun, 2006

1 commit

602cada85 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6 ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6: (22 commits)
[PATCH] devfs: Remove it from the feature_removal.txt file
[PATCH] devfs: Last little devfs cleanups throughout the kernel tree.
[PATCH] devfs: Rename TTY_DRIVER_NO_DEVFS to TTY_DRIVER_DYNAMIC_DEV
[PATCH] devfs: Remove the tty_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the line_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the videodevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed
[PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree
[PATCH] devfs: Remove devfs_remove() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_cdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_symlink() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree
[PATCH] devfs: Remove devfs_*_tape() functions from the kernel tree
[PATCH] devfs: Remove devfs support from the sound subsystem
[PATCH] devfs: Remove devfs support from the ide subsystem.
[PATCH] devfs: Remove devfs support from the serial subsystem
[PATCH] devfs: Remove devfs from the init code
[PATCH] devfs: Remove devfs from the partition code
...

Linus Torvalds
2006-06-30 05:19:21 +0800

29 Jun, 2006

1 commit

f5e54d6e5 [PATCH] mark address_space_operations const ... Browse Code »

Same as with already do with the file operations: keep them in .rodata and
prevents people from doing runtime patching.

Signed-off-by: Christoph Hellwig
Cc: Steven French
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2006-06-29 05:59:04 +0800

27 Jun, 2006

2 commits

ff23eca3e [PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree ... Browse Code »

Also fixes up all files that #include it.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2006-06-27 03:25:08 +0800
95dc112a5 [PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree ... Browse Code »

Removes the devfs_mk_dir() function and all callers of it.

Signed-off-by: Greg Kroah-Hartman

Greg Kroah-Hartman
2006-06-27 03:25:06 +0800

25 Jun, 2006

1 commit

816724e65 Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ ... Browse Code »

Conflicts:

fs/nfs/inode.c
fs/super.c

Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch
'VFS: Permit filesystem to override root dentry on mount'

Trond Myklebust
2006-06-25 01:07:53 +0800

23 Jun, 2006

3 commits

3c5a87f47 [PATCH] migration: remove unnecessary PageSwapCache checks ... Browse Code »

Remove two unnecessary PageSwapCache checks. The page refcount is raised
and therefore page migration cannot occur in both functions.

Signed-off-by: Christoph Lameter
Cc: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Lameter
2006-06-23 22:42:46 +0800
726c33422 [PATCH] VFS: Permit filesystem to perform statfs with a known root dentry ... Browse Code »

Give the statfs superblock operation a dentry pointer rather than a superblock
pointer.

This complements the get_sb() patch. That reduced the significance of
sb->s_root, allowing NFS to place a fake root there. However, NFS does
require a dentry to use as a target for the statfs operation. This permits
the root in the vfsmount to be used instead.

linux/mount.h has been added where necessary to make allyesconfig build
successfully.

Interest has also been expressed for use with the FUSE and XFS filesystems.

Signed-off-by: David Howells
Acked-by: Al Viro
Cc: Nathan Scott
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2006-06-23 22:42:45 +0800
454e2398b [PATCH] VFS: Permit filesystem to override root dentry on mount ... Browse Code »

Extend the get_sb() filesystem operation to take an extra argument that
permits the VFS to pass in the target vfsmount that defines the mountpoint.

The filesystem is then required to manually set the superblock and root dentry
pointers. For most filesystems, this should be done with simple_set_mnt()
which will set the superblock pointer and then set the root dentry to the
superblock's s_root (as per the old default behaviour).

The get_sb() op now returns an integer as there's now no need to return the
superblock pointer.

This patch permits a superblock to be implicitly shared amongst several mount
points, such as can be done with NFS to avoid potential inode aliasing. In
such a case, simple_set_mnt() would not be called, and instead the mnt_root
and mnt_sb would be set directly.

The patch also makes the following changes:

(*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
pointer argument and return an integer, so most filesystems have to change
very little.

(*) If one of the convenience function is not used, then get_sb() should
normally call simple_set_mnt() to instantiate the vfsmount. This will
always return 0, and so can be tail-called from get_sb().

(*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
dcache upon superblock destruction rather than shrink_dcache_anon().

This is required because the superblock may now have multiple trees that
aren't actually bound to s_root, but that still need to be cleaned up. The
currently called functions assume that the whole tree is rooted at s_root,
and that anonymous dentries are not the roots of trees which results in
dentries being left unculled.

However, with the way NFS superblock sharing are currently set to be
implemented, these assumptions are violated: the root of the filesystem is
simply a dummy dentry and inode (the real inode for '/' may well be
inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
with child trees.

[*] Anonymous until discovered from another tree.

(*) The documentation has been adjusted, including the additional bit of
changing ext2_* into foo_* in the documentation.

[akpm@osdl.org: convert ipath_fs, do other stuff]
Signed-off-by: David Howells
Acked-by: Al Viro
Cc: Nathan Scott
Cc: Roland Dreier
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Howells
2006-06-23 22:42:45 +0800

20 Jun, 2006

1 commit

d59bf96cd Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ Browse Code »

Trond Myklebust
2006-06-20 20:59:45 +0800

13 Jun, 2006

2 commits

86bc843a2 [PATCH] tmpfs: Decrement i_nlink correctly in shmem_rmdir() ... Browse Code »

shmem_rmdir() must undo the increment of i_nlink done in
shmem_get_inode() for directories, otherwise at least
IN_DELETE_SELF inotify event generation is broken.

Signed-off-by: Sergey Vlasov
Signed-off-by: Hugh Dickins
Signed-off-by: Linus Torvalds

Sergey Vlasov
2006-06-13 05:29:04 +0800
cfd95a9cf [PATCH] tmpfs: time granularity fix for [acm]time going backwards ... Browse Code »

I noticed a strange behavior in a tmpfs file system the other day, while
building packages - occasionally, and seemingly at random, make decided to
rebuild a target. However, only on tmpfs.

A file would be created, and if checked, it had a sub-second timestamp.
However, after an utimes related call where sub-seconds should be set, they
were zeroed instead. In the case that a file was created, and utimes(...,NULL)
was used on it in the same second, the timestamp on the file moved backwards.

After some digging, I found that this was being caused by tmpfs not having a
time granularity set, thus inheriting the default 1 second granularity.

Hugh adds: yes, we missed tmpfs when the s_time_gran mods went into 2.6.11.
Unfortunately, the granularity of CURRENT_TIME, often used in filesystems,
does not match the default granularity set by alloc_super. A few more such
discrepancies have been found, but this is the most important to fix now.

Signed-off-by: Robin H. Johnson
Acked-by: Andi Kleen
Signed-off-by: Hugh Dickins
Signed-off-by: Linus Torvalds

Robin H. Johnson
2006-06-13 04:55:52 +0800

09 Jun, 2006

1 commit

1f5ce9e93 VFS: Unexport do_kern_mount() and clean up simple_pin_fs() ... Browse Code »

Replace all module uses with the new vfs_kern_mount() interface, and fix up
simple_pin_fs().

Signed-off-by: Trond Myklebust

Trond Myklebust
2006-06-09 21:34:16 +0800

23 Apr, 2006

1 commit

304dbdb7a [PATCH] add migratepage address space op to shmem ... Browse Code »

Basic problem: pages of a shared memory segment can only be migrated once.

In 2.6.16 through 2.6.17-rc1, shared memory mappings do not have a
migratepage address space op. Therefore, migrate_pages() falls back to
default processing. In this path, it will try to pageout() dirty pages.
Once a shared memory page has been migrated it becomes dirty, so
migrate_pages() will try to page it out. However, because the page count
is 3 [cache + current + pte], pageout() will return PAGE_KEEP because
is_page_cache_freeable() returns false. This will abort all subsequent
migrations.

This patch adds a migratepage address space op to shared memory segments to
avoid taking the default path. We use the "migrate_page()" function
because it knows how to migrate dirty pages. This allows shared memory
segment pages to migrate, subject to other conditions such as # pte's
referencing the page [page_mapcount(page)], when requested.

I think this is safe. If we're migrating a shared memory page, then we
found the page via a page table, so it must be in memory.

Can be verified with memtoy and the shmem-mbind-test script, both
available at: http://free.linux.hp.com/~lts/Tools/

Signed-off-by: Lee Schermerhorn
Acked-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lee Schermerhorn
2006-04-23 00:19:52 +0800

22 Mar, 2006

2 commits

d15c023b4 [PATCH] shmem: inline to avoid warning ... Browse Code »

shmem.c was named and shamed in Jesper's "Building 100 kernels" warnings:
shmem_parse_mpol is only used when CONFIG_TMPFS parses mount options; and
only called from that one site, so mark it inline like its non-NUMA stub.

Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2006-03-22 23:54:02 +0800
fcc234f88 [PATCH] mm: kill kmem_cache_t usage ... Browse Code »

We have struct kmem_cache now so use it instead of the old typedef.

Signed-off-by: Pekka Enberg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pekka Enberg
2006-03-22 23:53:58 +0800

22 Feb, 2006

1 commit

b00dc3ad7 [PATCH] tmpfs: fix mount mpol nodelist parsing ... Browse Code »

I've been dissatisfied with the mpol_nodelist mount option which was
added to tmpfs earlier in -rc. Replace it by mpol=policy:nodelist.

And it was broken: a nodelist is a comma-separated list of numbers and
ranges; the mount options are a comma-separated list of token=values.
Whoops, blindly strsep'ing on commas doesn't work so well: since we've
no numeric tokens, and unlikely to add them, use that to distinguish.

Move the mpol= parsing to shmem_parse_mpol under CONFIG_NUMA, reject
all its options as invalid if not NUMA. /proc shows MPOL_PREFERRED
as "prefer", so use that name for the policy instead of "preferred".

Enforce that mpol=default has no nodelist; that mpol=prefer has one
node only; that mpol=bind has a nodelist; but let mpol=interleave use
node_online_map if no nodelist given. Describe this in tmpfs.txt.

Signed-off-by: Hugh Dickins
Acked-by: Robin Holt
Acked-by: Andi Kleen
Signed-off-by: Linus Torvalds

Hugh Dickins
2006-02-22 09:10:15 +0800