Eric Lee / smarc-fsl-linux-kernel

29 Sep, 2008

2 commits

31a78f23b mm owner: fix race between swapoff and exit ... Browse Code »

There's a race between mm->owner assignment and swapoff, more easily
seen when task slab poisoning is turned on. The condition occurs when
try_to_unuse() runs in parallel with an exiting task. A similar race
can occur with callers of get_task_mm(), such as /proc//
or ptrace or page migration.

CPU0 CPU1
try_to_unuse
looks at mm = task0->mm
increments mm->mm_users
task 0 exits
mm->owner needs to be updated, but no
new owner is found (mm_users > 1, but
no other task has task->mm = task0->mm)
mm_update_next_owner() leaves
mmput(mm) decrements mm->mm_users
task0 freed
dereferencing mm->owner fails

The fix is to notify the subsystem via mm_owner_changed callback(),
if no new owner is found, by specifying the new task as NULL.

Jiri Slaby:
mm->owner was set to NULL prior to calling cgroup_mm_owner_callbacks(), but
must be set after that, so as not to pass NULL as old owner causing oops.

Daisuke Nishimura:
mm_update_next_owner() may set mm->owner to NULL, but mem_cgroup_from_task()
and its callers need to take account of this situation to avoid oops.

Hugh Dickins:
Lockdep warning and hang below exec_mmap() when testing these patches.
exit_mm() up_reads mmap_sem before calling mm_update_next_owner(),
so exec_mmap() now needs to do the same. And with that repositioning,
there's now no point in mm_need_new_owner() allowing for NULL mm.

Reported-by: Hugh Dickins
Signed-off-by: Balbir Singh
Signed-off-by: Jiri Slaby
Signed-off-by: Daisuke Nishimura
Signed-off-by: Hugh Dickins
Cc: KAMEZAWA Hiroyuki
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Balbir Singh
2008-09-29 23:41:47 +0800
d0185c088 Fix NULL pointer dereference in proc_sys_compare ... Browse Code »

The VFS interface for the 'd_compare()' is a bit special (read: 'odd'),
because it really just essentially replaces a memcmp(). The filesystem
is supposed to just compare the two names with whatever case-independent
or other function.

And when I say 'is supposed to', I obviously mean that 'procfs does odd
things, and actually looks at the dentry that we don't even pass down,
rather than just the name'. Which results in problems, because we
actually call d_compare before we have even verified that the dentry is
still hashed at all.

And that causes a problm since the inode that procfs looks at may have
been free'd and the d_inode pointer is NULL. procfs just assumes that
all dentries are positive, since procfs itself never generates a
negative one. But memory pressure will still result in the dentry
getting torn down, and as it is removed by RCU, it still remains visible
on some lists - and to d_compare.

If the filesystem just did a name comparison, we wouldn't care. And we
could just fix procfs to know about negative dentries too. But rather
than have the low-level filesystems know about internal VFS details,
just move the check for a unhashed dentry up a bit, so that we will only
call d_compare on dentries that are still active.

The actual oops this caused didn't look like a NULL pointer dereference
because procfs did a 'container_of(inode, struct proc_inode, vfs_inode)'
to get at its internal proc_inode information from the inode pointer,
and accessed a field below the inode. So the oops would look something
like

BUG: unable to handle kernel paging request at fffffffffffffff0
IP: [] proc_sys_compare+0x36/0x50

and was seen on both x86-64 (Alexey Dobriyan and Hugh Dickins) and
ppc64 (Hugh Dickins).

Reported-by: Alexey Dobriyan
Acked-by: Hugh Dickins
Cc: Al Viro
Reviewed-by: "Eric W. Biederman"
Signed-of-by: Linus Torvalds

Linus Torvalds
2008-09-29 22:42:57 +0800

26 Sep, 2008

4 commits

ec4d90287 Merge git://oss.sgi.com:8090/xfs/linux-2.6 ... Browse Code »

* git://oss.sgi.com:8090/xfs/linux-2.6:
[XFS] Remove xfs_iext_irec_compact_full()
[XFS] Fix extent list corruption in xfs_iext_irec_compact_full().

Linus Torvalds
2008-09-26 23:49:34 +0800
bde40fe07 Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6 ... Browse Code »

* 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6:
UBIFS: fix printk format warnings
UBIFS: remove incorrect assert
UBIFS: TNC / GC race fixes
UBIFS: create the name of the background thread in every case

Linus Torvalds
2008-09-26 23:20:26 +0800
71a8c87fb [XFS] Remove xfs_iext_irec_compact_full() ... Browse Code »

Yet another bug was found in xfs_iext_irec_compact_full() and while the
source of the bug was found it wasn't an easy task to track it down
because the conditions are very difficult to reproduce.

A HUGE thank-you goes to Russell Cattelan and Eric Sandeen for their
significant effort in tracking down the source of this corruption.

xfs_iext_irec_compact_full() and xfs_iext_irec_compact_pages() are almost
identical - they both compact indirect extent lists by moving extents from
subsequent buffers into earlier ones. xfs_iext_irec_compact_pages() only
moves extents if all of the extents in the next buffer will fit into the
empty space in the buffer before it. xfs_iext_irec_compact_full() will go
a step further and move part of the next buffer if all the extents wont
fit. It will then shift the remaining extents in the next buffer up to the
start of the buffer. The bug here was that we did not update er_extoff and
this caused extent list corruption.

It does not appear that this extra functionality gains us much. Calling
xfs_iext_irec_compact_pages() instead will do a good enough job at
compacting the indirect list and will be quicker too.

For the case in xfs_iext_indirect_to_direct() the total number of extents
in the indirect list will fit into one buffer so we will never need the
extra functionality of xfs_iext_irec_compact_full() there.

Also xfs_iext_irec_compact_pages() doesn't need to do a memmove() (the
buffers will never overlap) so we don't want the performance hit that can
incur.

SGI-PV: 987159

SGI-Modid: xfs-linux-melb:xfs-kern:32166a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Eric Sandeen

Lachlan McIlroy
2008-09-26 10:17:57 +0800
f1ccd2955 [XFS] Fix extent list corruption in xfs_iext_irec_compact_full(). ... Browse Code »

If we don't move all the records from the next buffer into the current
buffer then we need to update the er_extoff field of the next buffer as we
shift the remaining records to the start of the buffer.

SGI-PV: 987159

SGI-Modid: xfs-linux-melb:xfs-kern:32165a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Eric Sandeen
Signed-off-by: Russell Cattelan

Lachlan McIlroy
2008-09-26 10:16:46 +0800

25 Sep, 2008

1 commit

62aa528e0 9p: use an IS_ERR test rather than a NULL test ... Browse Code »

In case of error, the function p9_client_walk returns an ERR pointer, but
never returns a NULL pointer. So a NULL test that comes after an IS_ERR
test should be deleted.

The semantic match that finds this problem is as follows:
(http://www.emn.fr/x-info/coccinelle/)

//
@match_bad_null_test@
expression x, E;
statement S1,S2;
@@
x = p9_client_walk(...)
... when != x = E
* if (x != NULL)
S1 else S2
//

Signed-off-by: Julien Brunel
Signed-off-by: Julia Lawall
Signed-off-by: Eric Van Hensbergen
Signed-off-by: Andrew Morton

Julien Brunel
2008-09-25 05:22:22 +0800

18 Sep, 2008

1 commit

7424bac82 UBIFS: fix printk format warnings ... Browse Code »

fs/ubifs/dir.c:428: warning: format '%llu' expects type 'long long
unsigned int', but argument 5 has type 'long unsigned int'

fs/ubifs/debug.c:541: warning: format '%llu' expects type 'long long
unsigned int', but argument 2 has type 'long unsigned int'

Signed-off-by: Alexander Beregalov
Signed-off-by: Artem Bityutskiy

Alexander Beregalov
2008-09-18 14:57:57 +0800

17 Sep, 2008

10 commits

6e14968c8 UBIFS: remove incorrect assert ... Browse Code »

The assert was not valid because one of the variables
'taken_empty_lebs' has transient values out of sync
with the other variables.

Signed-off-by: Adrian Hunter
Signed-off-by: Artem Bityutskiy

Adrian Hunter
2008-09-17 19:47:09 +0800
6dcfac4f1 UBIFS: TNC / GC race fixes ... Browse Code »

- update GC sequence number if any nodes may have been moved
even if GC did not finish the LEB
- don't ignore error return when reading

Signed-off-by: Adrian Hunter
Signed-off-by: Artem Bityutskiy

Adrian Hunter
2008-09-17 19:23:26 +0800
0855f310d UBIFS: create the name of the background thread in every case ... Browse Code »

If the ubifs partition is mounted RO and then remounted RW we end
up with no thread name in ubifs_remount_rw() and the thread appears
nameless.

Signed-off-by: Sebastian Siewior
Signed-off-by: Artem Bityutskiy

Sebastian Siewior
2008-09-17 15:07:56 +0800
2fd6f6ec6 [XFS] Don't do I/O beyond eof when unreserving space ... Browse Code »

When unreserving space with boundaries that are not block aligned we round
up the start and round down the end boundaries and then use this function,
xfs_zero_remaining_bytes(), to zero the parts of the blocks that got
dropped during the rounding. The problem is we don't consider if these
blocks are beyond eof. Worse still is if we encounter delayed allocations
beyond eof we will try to use the magic delayed allocation block number as
a real block number. If the file size is ever extended to expose these
blocks then we'll go through xfs_zero_eof() to zero them anyway.

SGI-PV: 983683

SGI-Modid: xfs-linux-melb:xfs-kern:32055a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Christoph Hellwig

Lachlan McIlroy
2008-09-17 14:52:50 +0800
e1f5dbd70 [XFS] Fix use-after-free with buffers ... Browse Code »

We have a use-after-free issue where log completions access buffers via
the buffer log item and the buffer has already been freed. Fix this by
taking a reference on the buffer when attaching the buffer log item and
release the hold when the buffer log item is detached and we no longer
need the buffer. Also create a new function xfs_buf_item_free() to combine
some common code.

SGI-PV: 985757

SGI-Modid: xfs-linux-melb:xfs-kern:32025a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Christoph Hellwig

Lachlan McIlroy
2008-09-17 14:52:13 +0800
f9114eba1 [XFS] Prevent lockdep false positives when locking two inodes. ... Browse Code »

If we call xfs_lock_two_inodes() to grab both the iolock and the ilock,
then drop the ilocks on both inodes, then grab them again (as
xfs_swap_extents() does) then lockdep will report a locking order problem.
This is a false positive.

To avoid this, disallow xfs_lock_two_inodes() fom locking both inode locks
at once - force calers to make two separate calls. This means that nested
dropping and regaining of the ilocks will retain the same lockdep subclass
and so lockdep will not see anything wrong with this code.

SGI-PV: 986238

SGI-Modid: xfs-linux-melb:xfs-kern:31999a

Signed-off-by: David Chinner
Signed-off-by: Christoph Hellwig
Signed-off-by: Peter Leckie
Signed-off-by: Lachlan McIlroy

David Chinner
2008-09-17 14:51:21 +0800
b5b8c9acd [XFS] Fix barrier status change detection. ... Browse Code »

The current code in xlog_iodone() uses the wrong macro to check if the
barrier has been cleared due to an EOPNOTSUPP error form the lower layer.

SGI-PV: 986143

SGI-Modid: xfs-linux-melb:xfs-kern:31984a

Signed-off-by: David Chinner
Signed-off-by: Nathaniel W. Turner
Signed-off-by: Peter Leckie
Signed-off-by: Lachlan McIlroy

David Chinner
2008-09-17 14:50:50 +0800
364f358a7 [XFS] Prevent direct I/O from mapping extents beyond eof ... Browse Code »

With the help from some tracing I found that we try to map extents beyond
eof when doing a direct I/O read. It appears that the way to inform the
generic direct I/O path (ie do_direct_IO()) that we have breached eof is
to return an unmapped buffer from xfs_get_blocks_direct(). This will cause
do_direct_IO() to jump to the hole handling code where is will check for
eof and then abort.

This problem was found because a direct I/O read was trying to map beyond
eof and was encountering delayed allocations. The delayed allocations
beyond eof are speculative allocations and they didn't get converted when
the direct I/O flushed the file because there was only enough space in the
current AG to convert and write out the dirty pages within eof. Note that
xfs_iomap_write_allocate() wont necessarily convert all the delayed
allocation passed to it - it will return after allocating the first extent
- so if the delayed allocation extends beyond eof then it will stay that
way.

SGI-PV: 983683

SGI-Modid: xfs-linux-melb:xfs-kern:31929a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Christoph Hellwig

Lachlan McIlroy
2008-09-17 14:50:14 +0800
6efdf2817 [XFS] Fix regression introduced by remount fixup ... Browse Code »

Logically we would return an error in xfs_fs_remount code to prevent users
from believing they might have changed mount options using remount which
can't be changed.

But unfortunately mount(8) adds all options from mtab and fstab to the
mount arguments in some cases so we can't blindly reject options, but have
to check for each specified option if it actually differs from the
currently set option and only reject it if that's the case.

Until that is implemented we return success for every remount request, and
silently ignore all options that we can't actually change.

SGI-PV: 985710

SGI-Modid: xfs-linux-melb:xfs-kern:31908a

Signed-off-by: Christoph Hellwig
Signed-off-by: Tim Shimmin
Signed-off-by: Lachlan McIlroy

Christoph Hellwig
2008-09-17 14:49:33 +0800
31bd61f2b [XFS] Move memory allocations for log tracing out of the critical path ... Browse Code »

Memory allocations for log->l_grant_trace and iclog->ic_trace are done on
demand when the first event is logged. In xlog_state_get_iclog_space() we
call xlog_trace_iclog() under a spinlock and allocating memory here can
cause us to sleep with a spinlock held and deadlock the system.

For the log grant tracing we use KM_NOSLEEP but that means we can lose
trace entries. Since there is no locking to serialize the log grant
tracing we could race and have multiple allocations and leak memory.

So move the allocations to where we initialize the log/iclog structures.
Use KM_NOFS to avoid recursing into the filesystem and drop log->l_trace
since it's not even used.

SGI-PV: 983738

SGI-Modid: xfs-linux-melb:xfs-kern:31896a

Signed-off-by: Lachlan McIlroy
Signed-off-by: Christoph Hellwig

Lachlan McIlroy
2008-09-17 14:45:37 +0800

14 Sep, 2008

4 commits

8d99f83b9 rescan_partitions(): make device capacity errors non-fatal ... Browse Code »

Herton Krzesinski reports that the error-checking changes in
04ebd4aee52b06a2c38127d9208546e5b96f3a19 ("block/ioctl.c and
fs/partition/check.c: check value returned by add_partition") cause his
buggy USB camera to no longer mount. "The camera is an Olympus X-840.
The original issue comes from the camera itself: its format program
creates a partition with an off by one error".

Buggy devices happen. It is better for the kernel to warn and to proceed
with the mount.

Reported-by: Herton Ronaldo Krzesinski
Cc: Abdel Benamrouche
Cc: Jens Axboe
Cc: Alan Stern
Cc: David Brownell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2008-09-14 05:41:52 +0800
d7a3e4959 mm: ifdef Quicklists in /proc/meminfo ... Browse Code »

A "Quicklists: 0 kB" line has just started appearing in
/proc/meminfo, but most architectures (including x86) don't have
them configured, so #ifdef it, like the highmem lines.

And those architectures which do have quicklists configured are
using them for page tables: so let's place it next to PageTables.

Signed-off-by: Hugh Dickins
Acked-by: Christoph Lameter
Acked-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hugh Dickins
2008-09-14 05:41:51 +0800
1558182f6 bfs: fix Lockdep warning ... Browse Code »

This fixes:

=============================================
[ INFO: possible recursive locking detected ]
2.6.27-rc5-00283-g70bb089 #68
---------------------------------------------
touch/6855 is trying to acquire lock:
(&info->bfs_lock){--..}, at: [] bfs_delete_inode+0x9e/0x18c

but task is already holding lock:
(&info->bfs_lock){--..}, at: [] bfs_create+0x45/0x187

other info that might help us debug this:
2 locks held by touch/6855:
#0: (&type->i_mutex_dir_key#5){--..}, at: [] do_filp_open+0x10b/0x62f
#1: (&info->bfs_lock){--..}, at: [] bfs_create+0x45/0x187

stack backtrace:
Pid: 6855, comm: touch Not tainted 2.6.27-rc5-00283-g70bb089 #68
[] validate_chain+0x458/0x9f4
[] ? trace_hardirqs_off+0xb/0xd
[] __lock_acquire+0x666/0x6e0
[] lock_acquire+0x5b/0x77
[] ? bfs_delete_inode+0x9e/0x18c
[] mutex_lock_nested+0xbc/0x234
[] ? bfs_delete_inode+0x9e/0x18c
[] ? bfs_delete_inode+0x9e/0x18c
[] bfs_delete_inode+0x9e/0x18c
[] ? bfs_delete_inode+0x0/0x18c
[] generic_delete_inode+0x94/0xfe
[] generic_drop_inode+0x12/0x12f
[] iput+0x4b/0x4e
[] bfs_create+0x163/0x187
[] vfs_create+0xa6/0x114
[] do_filp_open+0x1ad/0x62f
[] ? native_sched_clock+0x82/0x96
[] ? _spin_unlock+0x27/0x3c
[] ? alloc_fd+0xbf/0xc9
[] ? sub_preempt_count+0x9d/0xab
[] ? alloc_fd+0xbf/0xc9
[] do_sys_open+0x42/0xb8
[] ? trace_hardirqs_on_thunk+0xc/0x10
[] sys_open+0x1e/0x26
[] sysenter_do_call+0x12/0x31
=======================

The problem is that we don't unlock the bfs->lock mutex before calling
iput (we do in the other cases).

Signed-off-by: Eric Sesterhenn
Cc: Tigran Aivazian
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Sesterhenn
2008-09-14 05:41:51 +0800
665020c35 proc: more debugging for "already registered" case ... Browse Code »

Print parent directory name as well.

The aim is to catch non-creation of parent directory when proc_mkdir will
return NULL and all subsequent registrations go directly in /proc instead
of intended directory.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
[ Fixed insane printk string while at it. - Linus ]
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2008-09-14 05:41:50 +0800

11 Sep, 2008

1 commit

e2858ce3e Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 ... Browse Code »

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
udf: add llseek method
udf: Fix error paths in udf_new_inode()
udf: Fix lock inversion between iprune_mutex and alloc_mutex (v2)

Linus Torvalds
2008-09-11 23:40:11 +0800

10 Sep, 2008

2 commits

0e116227a ocfs2: Fix a bug in direct IO read. ... Browse Code »

ocfs2 will become read-only if we try to read the bytes which pass
the end of i_size. This can be easily reproduced by following steps:
1. mkfs a ocfs2 volume with bs=4k cs=4k and nosparse.
2. create a small file(say less than 100 bytes) and we will create the file
which is allocated 1 cluster.
3. read 8196 bytes from the kernel using O_DIRECT which exceeds the limit.
4. The ocfs2 volume becomes read-only and dmesg shows:
OCFS2: ERROR (device sda13): ocfs2_direct_IO_get_blocks:
Inode 66010 has a hole at block 1
File system is now read-only due to the potential of on-disk corruption.
Please run fsck.ocfs2 once the file system is unmounted.

So suppress the ERROR message.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-09-10 16:44:08 +0800
b975dee38 Merge branch 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6 ... Browse Code »

* 'linux-next' of git://git.infradead.org/~dedekind/ubifs-2.6:
UBIFS: make minimum fanout 3
UBIFS: fix division by zero
UBIFS: amend f_fsid
UBIFS: fill f_fsid
UBIFS: improve statfs reporting even more
UBIFS: introduce LEB overhead
UBIFS: add forgotten gc_idx_lebs component
UBIFS: fix assertion
UBIFS: improve statfs reporting
UBIFS: remove incorrect index space check
UBIFS: push empty flash hack down
UBIFS: do not update min_idx_lebs in stafs
UBIFS: allow for racing between GC and TNC
UBIFS: always read hashed-key nodes under TNC mutex
UBIFS: fix zero-length truncations

Linus Torvalds
2008-09-10 02:52:12 +0800

09 Sep, 2008

2 commits

af904deaf NFS: Restore missing hunk in NFS mount option parser ... Browse Code »

Automounter maps can contain mount options valid for other NFS
implementations but not for Linux. The Linux automounter uses the
mount command's "-s" command line option ("s" for "sloppy") so that
mount requests containing such options are not rejected.

Commit f45663ce5fb30f76a3414ab3ac69f4dd320e760a attempted to address a
known regression with text-based NFS mount option parsing. Unrecognized
mount options would cause mount requests to fail, even if the "-s"
option was used on the mount command line.

Unfortunately, this commit was not complete as submitted. It adds a
new mount option, "sloppy". But it is missing a hunk, so it now allows
NFS mounts with unrecognized mount options, even if the "sloppy" option
is not present. This could be a problem if a required critical mount
option such as "sync" is misspelled, for example, and is considered a
regression from 2.6.26.

This patch restores the missing hunk. Now, the default behavior of
text-based NFS mount options is as before: any unrecognized mount option
will cause the mount to fail.

Please include this in 2.6.27-rc.

Thanks to Neil Brown for reporting this.

Signed-off-by: Chuck Lever
Acked-by: J. Bruce Fields
Signed-off-by: Linus Torvalds

Chuck Lever
2008-09-09 06:35:19 +0800
5c89468c1 udf: add llseek method ... Browse Code »

UDF currently doesn't set a llseek method for regular files, which
means it will fall back to default_llseek. This means no one can seek
beyond 2 Gigabytes on udf, and that there's not protection vs
the i_size updates from writers.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jan Kara

Christoph Hellwig
2008-09-09 02:31:04 +0800

06 Sep, 2008

3 commits

a5cb562d6 UBIFS: make minimum fanout 3 ... Browse Code »

UBIFS does not really work correctly when fanout is 2,
because of the way we manage the indexing tree. It may
just become a list and UBIFS screws up.

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-09-06 01:02:35 +0800
f171d4d76 UBIFS: fix division by zero ... Browse Code »

If fanout is 3, we have division by zero in
'ubifs_read_superblock()':

divide error: 0000 [#1] PREEMPT SMP

Pid: 28744, comm: mount Not tainted (2.6.27-rc4-ubifs-2.6 #23)
EIP: 0060:[] EFLAGS: 00010202 CPU: 0
EIP is at ubifs_reported_space+0x2d/0x69 [ubifs]
EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
ESI: 00000000 EDI: f0ae64b0 EBP: f1f9fcf4 ESP: f1f9fce0
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-09-06 01:01:59 +0800
49048622e sched: fix process time monotonicity ... Browse Code »

Spencer reported a problem where utime and stime were going negative despite
the fixes in commit b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa. The suspected
reason for the problem is that signal_struct maintains it's own utime and
stime (of exited tasks), these are not updated using the new task_utime()
routine, hence sig->utime can go backwards and cause the same problem
to occur (sig->utime, adds tsk->utime and not task_utime()). This patch
fixes the problem

TODO: using max(task->prev_utime, derived utime) works for now, but a more
generic solution is to implement cputime_max() and use the cputime_gt()
function for comparison.

Reported-by: spencer@bluehost.com
Signed-off-by: Balbir Singh
Signed-off-by: Peter Zijlstra
Signed-off-by: Ingo Molnar

Balbir Singh
2008-09-06 00:14:35 +0800

03 Sep, 2008

4 commits

7c7cbadf7 UBIFS: amend f_fsid ... Browse Code »

David Woodhouse suggested to be consistent with other FSes
and xor the beginning and the end of the UUID.

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-09-03 19:56:22 +0800
4b8561521 mm: show quicklist usage in /proc/meminfo ... Browse Code »

Quicklists can consume several GB of memory. We should provide a means of
monitoring this.

After this patch is applied, /proc/meminfo will output the following:

% cat /proc/meminfo

MemTotal: 7715392 kB
MemFree: 5401600 kB
Buffers: 80384 kB
Cached: 300800 kB
SwapCached: 0 kB
Active: 235584 kB
Inactive: 262656 kB
SwapTotal: 2031488 kB
SwapFree: 2031488 kB
Dirty: 3520 kB
Writeback: 0 kB
AnonPages: 117696 kB
Mapped: 38528 kB
Slab: 1589952 kB
SReclaimable: 23104 kB
SUnreclaim: 1566848 kB
PageTables: 14656 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 5889152 kB
Committed_AS: 393152 kB
VmallocTotal: 17592177655808 kB
VmallocUsed: 29056 kB
VmallocChunk: 17592177626432 kB
Quicklists: 130944 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 262144 kB

Signed-off-by: KOSAKI Motohiro
Cc: Christoph Lameter
Cc: Keiichiro Tokunaga
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

KOSAKI Motohiro
2008-09-03 10:21:38 +0800
169ccbd44 NTFS: update homepage ... Browse Code »

Update the location of the NTFS homepage in several files.

Signed-off-by: Adrian Bunk
Cc: Jeff Garzik
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Adrian Bunk
2008-09-03 10:21:37 +0800
e77295dc9 Merge branch 'for-2.6.27' of git://linux-nfs.org/~bfields/linux ... Browse Code »

* 'for-2.6.27' of git://linux-nfs.org/~bfields/linux:
nfsd: fix buffer overrun decoding NFSv4 acl
sunrpc: fix possible overrun on read of /proc/sys/sunrpc/transports
nfsd: fix compound state allocation error handling
svcrdma: Fix race between svc_rdma_recvfrom thread and the dto_tasklet

Linus Torvalds
2008-09-03 01:58:11 +0800

02 Sep, 2008

2 commits

91b80969b nfsd: fix buffer overrun decoding NFSv4 acl ... Browse Code »

The array we kmalloc() here is not large enough.

Thanks to Johann Dahm and David Richter for bug report and testing.

Signed-off-by: J. Bruce Fields
Cc: David Richter
Tested-by: Johann Dahm

J. Bruce Fields
2008-09-02 02:24:24 +0800
c228c24bf nfsd: fix compound state allocation error handling ... Browse Code »

Move the cstate_alloc call so that if it fails, the response is setup to
encode the NFS error. The out label now means that the
nfsd4_compound_state has not been allocated.

Signed-off-by: Andy Adamson
Signed-off-by: J. Bruce Fields

Andy Adamson
2008-09-02 02:17:48 +0800

31 Aug, 2008

4 commits

b3385c278 UBIFS: fill f_fsid ... Browse Code »

UBIFS stores 16-bit UUID in the superblock, and it is a good
idea to return part of it in 'f_fsid' filed of kstatfs structure.

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-08-31 22:22:11 +0800
7dad181bb UBIFS: improve statfs reporting even more ... Browse Code »

Since free space we report in statfs is file size which should
fit to the FS - change the way we calculate free space and use
leb_overhead instead of dark_wm in calculations.

Results of "freespace" test (120MiB volume, 16KiB LEB size,
512 bytes page size). Before the change:

freespace: Test 1: fill the space we have 3 times
freespace: was free: 85204992 bytes 81.3 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 11284480 bytes 10.8 MiB, wrote 13.2% more than predicted
freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 12935168 bytes 12.3 MiB, wrote 15.5% more than predicted
freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 12939264 bytes 12.3 MiB, wrote 15.5% more than predicted
freespace: Test 1 finished

freespace: Test 2: gradually lessen amount of free space and fill the FS
freespace: do 10 steps, lessen free space by 7596218 bytes 7.2 MiB each time
freespace: was free: 78675968 bytes 75.0 MiB, wrote: 88903680 bytes 84.8 MiB, delta: 10227712 bytes 9.8 MiB, wrote 13.0% more than predicted
freespace: was free: 72015872 bytes 68.7 MiB, wrote: 81514496 bytes 77.7 MiB, delta: 9498624 bytes 9.1 MiB, wrote 13.2% more than predicted
freespace: was free: 63938560 bytes 61.0 MiB, wrote: 72589312 bytes 69.2 MiB, delta: 8650752 bytes 8.2 MiB, wrote 13.5% more than predicted
freespace: was free: 56127488 bytes 53.5 MiB, wrote: 63762432 bytes 60.8 MiB, delta: 7634944 bytes 7.3 MiB, wrote 13.6% more than predicted
freespace: was free: 48336896 bytes 46.1 MiB, wrote: 54935552 bytes 52.4 MiB, delta: 6598656 bytes 6.3 MiB, wrote 13.7% more than predicted
freespace: was free: 40587264 bytes 38.7 MiB, wrote: 46157824 bytes 44.0 MiB, delta: 5570560 bytes 5.3 MiB, wrote 13.7% more than predicted
freespace: was free: 32841728 bytes 31.3 MiB, wrote: 37384192 bytes 35.7 MiB, delta: 4542464 bytes 4.3 MiB, wrote 13.8% more than predicted
freespace: was free: 25100288 bytes 23.9 MiB, wrote: 28618752 bytes 27.3 MiB, delta: 3518464 bytes 3.4 MiB, wrote 14.0% more than predicted
freespace: was free: 17342464 bytes 16.5 MiB, wrote: 19841024 bytes 18.9 MiB, delta: 2498560 bytes 2.4 MiB, wrote 14.4% more than predicted
freespace: was free: 9605120 bytes 9.2 MiB, wrote: 11063296 bytes 10.6 MiB, delta: 1458176 bytes 1.4 MiB, wrote 15.2% more than predicted
freespace: Test 2 finished

freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS
freespace: do 10 steps, lessen free space by 7606272 bytes 7.3 MiB each time
freespace: trashing: was free: 83668992 bytes 79.8 MiB, need free: 7606272 bytes 7.3 MiB, files created: 248297, delete 225724 (90.9% of them)
freespace: was free: 70803456 bytes 67.5 MiB, wrote: 82485248 bytes 78.7 MiB, delta: 11681792 bytes 11.1 MiB, wrote 16.5% more than predicted
freespace: trashing: was free: 81080320 bytes 77.3 MiB, need free: 15212544 bytes 14.5 MiB, files created: 248711, delete 202047 (81.2% of them)
freespace: was free: 59867136 bytes 57.1 MiB, wrote: 71897088 bytes 68.6 MiB, delta: 12029952 bytes 11.5 MiB, wrote 20.1% more than predicted
freespace: trashing: was free: 82243584 bytes 78.4 MiB, need free: 22818816 bytes 21.8 MiB, files created: 248866, delete 179817 (72.3% of them)
freespace: was free: 50905088 bytes 48.5 MiB, wrote: 63168512 bytes 60.2 MiB, delta: 12263424 bytes 11.7 MiB, wrote 24.1% more than predicted
freespace: trashing: was free: 83402752 bytes 79.5 MiB, need free: 30425088 bytes 29.0 MiB, files created: 248920, delete 158114 (63.5% of them)
freespace: was free: 42651648 bytes 40.7 MiB, wrote: 55406592 bytes 52.8 MiB, delta: 12754944 bytes 12.2 MiB, wrote 29.9% more than predicted
freespace: trashing: was free: 84402176 bytes 80.5 MiB, need free: 38031360 bytes 36.3 MiB, files created: 248709, delete 136641 (54.9% of them)
freespace: was free: 35233792 bytes 33.6 MiB, wrote: 48250880 bytes 46.0 MiB, delta: 13017088 bytes 12.4 MiB, wrote 36.9% more than predicted
freespace: trashing: was free: 82530304 bytes 78.7 MiB, need free: 45637632 bytes 43.5 MiB, files created: 248778, delete 111208 (44.7% of them)
freespace: was free: 27287552 bytes 26.0 MiB, wrote: 40267776 bytes 38.4 MiB, delta: 12980224 bytes 12.4 MiB, wrote 47.6% more than predicted
freespace: trashing: was free: 85114880 bytes 81.2 MiB, need free: 53243904 bytes 50.8 MiB, files created: 248508, delete 93052 (37.4% of them)
freespace: was free: 22437888 bytes 21.4 MiB, wrote: 35328000 bytes 33.7 MiB, delta: 12890112 bytes 12.3 MiB, wrote 57.4% more than predicted
freespace: trashing: was free: 84103168 bytes 80.2 MiB, need free: 60850176 bytes 58.0 MiB, files created: 248637, delete 68743 (27.6% of them)
freespace: was free: 15536128 bytes 14.8 MiB, wrote: 28319744 bytes 27.0 MiB, delta: 12783616 bytes 12.2 MiB, wrote 82.3% more than predicted
freespace: trashing: was free: 84357120 bytes 80.4 MiB, need free: 68456448 bytes 65.3 MiB, files created: 248567, delete 46852 (18.8% of them)
freespace: was free: 9015296 bytes 8.6 MiB, wrote: 22044672 bytes 21.0 MiB, delta: 13029376 bytes 12.4 MiB, wrote 144.5% more than predicted
freespace: trashing: was free: 84942848 bytes 81.0 MiB, need free: 76062720 bytes 72.5 MiB, files created: 248636, delete 25993 (10.5% of them)
freespace: was free: 6086656 bytes 5.8 MiB, wrote: 8331264 bytes 7.9 MiB, delta: 2244608 bytes 2.1 MiB, wrote 36.9% more than predicted
freespace: Test 3 finished

freespace: finished successfully

After the change:

freespace: Test 1: fill the space we have 3 times
freespace: was free: 94048256 bytes 89.7 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 2441216 bytes 2.3 MiB, wrote 2.6% more than predicted
freespace: was free: 92246016 bytes 88.0 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 4247552 bytes 4.1 MiB, wrote 4.6% more than predicted
freespace: was free: 92254208 bytes 88.0 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 4235264 bytes 4.0 MiB, wrote 4.6% more than predicted
freespace: Test 1 finished

freespace: Test 2: gradually lessen amount of free space and fill the FS
freespace: do 10 steps, lessen free space by 8386001 bytes 8.0 MiB each time
freespace: was free: 86605824 bytes 82.6 MiB, wrote: 88252416 bytes 84.2 MiB, delta: 1646592 bytes 1.6 MiB, wrote 1.9% more than predicted
freespace: was free: 78667776 bytes 75.0 MiB, wrote: 80715776 bytes 77.0 MiB, delta: 2048000 bytes 2.0 MiB, wrote 2.6% more than predicted
freespace: was free: 69615616 bytes 66.4 MiB, wrote: 71630848 bytes 68.3 MiB, delta: 2015232 bytes 1.9 MiB, wrote 2.9% more than predicted
freespace: was free: 61018112 bytes 58.2 MiB, wrote: 62783488 bytes 59.9 MiB, delta: 1765376 bytes 1.7 MiB, wrote 2.9% more than predicted
freespace: was free: 52424704 bytes 50.0 MiB, wrote: 53968896 bytes 51.5 MiB, delta: 1544192 bytes 1.5 MiB, wrote 2.9% more than predicted
freespace: was free: 43880448 bytes 41.8 MiB, wrote: 45199360 bytes 43.1 MiB, delta: 1318912 bytes 1.3 MiB, wrote 3.0% more than predicted
freespace: was free: 35332096 bytes 33.7 MiB, wrote: 36425728 bytes 34.7 MiB, delta: 1093632 bytes 1.0 MiB, wrote 3.1% more than predicted
freespace: was free: 26771456 bytes 25.5 MiB, wrote: 27643904 bytes 26.4 MiB, delta: 872448 bytes 852.0 KiB, wrote 3.3% more than predicted
freespace: was free: 18231296 bytes 17.4 MiB, wrote: 18878464 bytes 18.0 MiB, delta: 647168 bytes 632.0 KiB, wrote 3.5% more than predicted
freespace: was free: 9674752 bytes 9.2 MiB, wrote: 10088448 bytes 9.6 MiB, delta: 413696 bytes 404.0 KiB, wrote 4.3% more than predicted
freespace: Test 2 finished

freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS
freespace: do 10 steps, lessen free space by 8397544 bytes 8.0 MiB each time
freespace: trashing: was free: 92372992 bytes 88.1 MiB, need free: 8397552 bytes 8.0 MiB, files created: 248296, delete 225723 (90.9% of them)
freespace: was free: 71909376 bytes 68.6 MiB, wrote: 82472960 bytes 78.7 MiB, delta: 10563584 bytes 10.1 MiB, wrote 14.7% more than predicted
freespace: trashing: was free: 88989696 bytes 84.9 MiB, need free: 16795096 bytes 16.0 MiB, files created: 248794, delete 201838 (81.1% of them)
freespace: was free: 60354560 bytes 57.6 MiB, wrote: 71782400 bytes 68.5 MiB, delta: 11427840 bytes 10.9 MiB, wrote 18.9% more than predicted
freespace: trashing: was free: 90304512 bytes 86.1 MiB, need free: 25192640 bytes 24.0 MiB, files created: 248733, delete 179342 (72.1% of them)
freespace: was free: 51187712 bytes 48.8 MiB, wrote: 62943232 bytes 60.0 MiB, delta: 11755520 bytes 11.2 MiB, wrote 23.0% more than predicted
freespace: trashing: was free: 91209728 bytes 87.0 MiB, need free: 33590184 bytes 32.0 MiB, files created: 248779, delete 157160 (63.2% of them)
freespace: was free: 42704896 bytes 40.7 MiB, wrote: 55050240 bytes 52.5 MiB, delta: 12345344 bytes 11.8 MiB, wrote 28.9% more than predicted
freespace: trashing: was free: 92700672 bytes 88.4 MiB, need free: 41987728 bytes 40.0 MiB, files created: 248848, delete 136135 (54.7% of them)
freespace: was free: 35250176 bytes 33.6 MiB, wrote: 48115712 bytes 45.9 MiB, delta: 12865536 bytes 12.3 MiB, wrote 36.5% more than predicted
freespace: trashing: was free: 93986816 bytes 89.6 MiB, need free: 50385272 bytes 48.1 MiB, files created: 248723, delete 115385 (46.4% of them)
freespace: was free: 29995008 bytes 28.6 MiB, wrote: 41582592 bytes 39.7 MiB, delta: 11587584 bytes 11.1 MiB, wrote 38.6% more than predicted
freespace: trashing: was free: 91881472 bytes 87.6 MiB, need free: 58782816 bytes 56.1 MiB, files created: 248645, delete 89569 (36.0% of them)
freespace: was free: 22511616 bytes 21.5 MiB, wrote: 34705408 bytes 33.1 MiB, delta: 12193792 bytes 11.6 MiB, wrote 54.2% more than predicted
freespace: trashing: was free: 91774976 bytes 87.5 MiB, need free: 67180360 bytes 64.1 MiB, files created: 248580, delete 66616 (26.8% of them)
freespace: was free: 16908288 bytes 16.1 MiB, wrote: 26898432 bytes 25.7 MiB, delta: 9990144 bytes 9.5 MiB, wrote 59.1% more than predicted
freespace: trashing: was free: 92450816 bytes 88.2 MiB, need free: 75577904 bytes 72.1 MiB, files created: 248654, delete 45381 (18.3% of them)
freespace: was free: 10170368 bytes 9.7 MiB, wrote: 19111936 bytes 18.2 MiB, delta: 8941568 bytes 8.5 MiB, wrote 87.9% more than predicted
freespace: trashing: was free: 93282304 bytes 89.0 MiB, need free: 83975448 bytes 80.1 MiB, files created: 248513, delete 24794 (10.0% of them)
freespace: was free: 3911680 bytes 3.7 MiB, wrote: 7872512 bytes 7.5 MiB, delta: 3960832 bytes 3.8 MiB, wrote 101.3% more than predicted
freespace: Test 3 finished

freespace: finished successfully

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-08-31 22:21:51 +0800
9bbb5726e UBIFS: introduce LEB overhead ... Browse Code »

This is a preparational patch for the following statfs()
report fix.

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-08-31 22:21:40 +0800
131130b9a UBIFS: add forgotten gc_idx_lebs component ... Browse Code »

We add this component at other similar places, but not in this
one.

Signed-off-by: Artem Bityutskiy

Artem Bityutskiy
2008-08-31 22:21:31 +0800