Eric Lee / smarc-fsl-linux-kernel

29 Sep, 2007

1 commit

54af3bb54 NFS: Fix an Oops in encode_lookup()

It doesn't look as if the NFS file name limit is being initialised correctly
in the struct nfs_server. Make sure that we limit whatever is being set in
nfs_probe_fsinfo() and nfs_init_server().

Also ensure that readdirplus and nfs4_path_walk respect our file name
limits.
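
As a rough user-space illustration of the clamping described above (not the kernel's actual code), the sketch below keeps a name-length limit from being zero or oversized. The struct, the helper names and the limit of 255 are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on a file name length; the real limit
 * depends on the NFS protocol version in use. */
#define SKETCH_MAXNAMLEN 255u

/* Hypothetical stand-in for the relevant field of struct nfs_server. */
struct sketch_server {
    unsigned int namelen;   /* 0 means "never initialised" */
};

/* Clamp the limit so that it is never zero and never larger than
 * the maximum the encoder is prepared to handle. */
static void sketch_clamp_namelen(struct sketch_server *srv)
{
    assert(srv != NULL);
    if (srv->namelen == 0 || srv->namelen > SKETCH_MAXNAMLEN)
        srv->namelen = SKETCH_MAXNAMLEN;
}

/* Callers can then reject over-long names before encoding them. */
static int sketch_name_fits(const struct sketch_server *srv, size_t len)
{
    return len <= srv->namelen;
}
```

The point of the design is that every path that sets the limit goes through the same clamp, so later checks can trust the stored value.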

Signed-off-by: Trond Myklebust
Signed-off-by: Linus Torvalds

Trond Myklebust
2007-09-29 06:36:42 +0800

20 Sep, 2007

1 commit

49af7ee18 nfs: fix oops re sysctls and V4 support

NFS unregisters sysctls only if V4 support is compiled in. However, the sysctl
table is not V4-specific, so always unregister it.

Steps to reproduce:

[build nfs.ko with CONFIG_NFS_V4=n]
modprobe nfs
rmmod nfs
ls /proc/sys

Unable to handle kernel paging request at ffffffff880661c0 RIP:
[] proc_sys_readdir+0xd3/0x350
PGD 203067 PUD 207063 PMD 7e216067 PTE 0
Oops: 0000 [1] SMP
CPU 1
Modules linked in: lockd nfs_acl sunrpc
Pid: 3335, comm: ls Not tainted 2.6.23-rc3-bloat #2
RIP: 0010:[] [] proc_sys_readdir+0xd3/0x350
RSP: 0018:ffff81007fd93e78 EFLAGS: 00010286
RAX: ffffffff880661c0 RBX: ffffffff80466370 RCX: ffffffff880661c0
RDX: 00000000000014c0 RSI: ffff81007f3ad020 RDI: ffff81007efd8b40
RBP: 0000000000000018 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: ffffffff802a8570 R12: ffffffff880661c0
R13: ffff81007e219640 R14: ffff81007efd8b40 R15: ffff81007ded7280
FS: 00002ba25ef03060(0000) GS:ffff81007ff81258(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffff880661c0 CR3: 000000007dfaf000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ls (pid: 3335, threadinfo ffff81007fd92000, task ffff81007d8a0000)
Stack: ffff81007f3ad150 ffffffff80283f30 ffff81007fd93f48 ffff81007efd8b40
ffff81007ee00440 0000000422222222 0000000200035593 ffffffff88037e9a
2222222222222222 ffffffff80466500 ffff81007e416400 ffff81007e219640
Call Trace:
[] filldir+0x0/0xf0
[] filldir+0x0/0xf0
[] vfs_readdir+0xa7/0xc0
[] sys_getdents+0x96/0xe0
[] system_call+0x7e/0x83

Code: 41 8b 14 24 85 d2 74 dc 49 8b 44 24 08 48 85 c0 74 e7 49 3b
RIP [] proc_sys_readdir+0xd3/0x350
RSP
CR2: ffffffff880661c0
Kernel panic - not syncing: Fatal exception

Signed-off-by: Alexey Dobriyan
Acked-by: Trond Myklebust
Cc: "J. Bruce Fields"
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2007-09-20 02:24:18 +0800

01 Sep, 2007

9 commits

1b3b4a1a2 NFS: Fix a write request leak in nfs_invalidate_page()

Ryusuke Konishi says:

The recent truncate_complete_page() clears the dirty flag from a page
before calling a_ops->invalidatepage(),
^^^^^^
static void
truncate_complete_page(struct address_space *mapping, struct page *page)
{
        ...
        cancel_dirty_page(page, PAGE_CACHE_SIZE); will call
        a_ops->invalidatepage()
        ...
}

and this prevents nfs_wb_page_priority() from calling
nfs_writepage_locked(), which is expected to handle the pending
request (=nfs_page) associated with the page.

int nfs_wb_page_priority(struct inode *inode, struct page *page, int how)
{
        ...
        if (clear_page_dirty_for_io(page)) {
                ret = nfs_writepage_locked(page, &wbc);
                if (ret < 0)
                        goto out;
        }
        ...
}

Since truncate_complete_page() will get rid of the page after
a_ops->invalidatepage() returns, the request (=nfs_page) associated
with the page becomes garbage in nfs_inode->nfs_page_tree.
------------------------

Fix this by ensuring that nfs_wb_page_priority() recognises that it may
also need to clear out non-dirty pages that have an nfs_page associated
with them.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-09-01 22:14:54 +0800
7d1cca729 NFS: change NFS mount error return when hostname/pathname too long

According to the mount(2) man page, the proper error return code for the
mount(2) system call when the special device name or the mounted-on
directory name is too long is ENAMETOOLONG.
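
A minimal sketch of such a length check in plain C follows. The helper name and the 16-byte limit are hypothetical; only the choice of ENAMETOOLONG comes from the message above.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical buffer size for the sketch, including the NUL byte. */
#define SKETCH_MAXLEN 16u

/* Return 0 if the name (plus its terminating NUL) fits in
 * SKETCH_MAXLEN bytes, or -ENAMETOOLONG if it does not. */
static int sketch_check_name(const char *name)
{
    assert(name != NULL);
    if (strlen(name) >= SKETCH_MAXLEN)
        return -ENAMETOOLONG;
    return 0;
}
```

Returning a specific negative errno value, rather than a generic failure, lets the caller tell an over-long name apart from other mount errors.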

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-09-01 22:14:40 +0800
350c73af6 NFS: Off-by-one length error in string handling

The hostname was getting truncated in the new text-based NFS mount API.
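
The message does not show the faulty code, so the following is only a generic sketch of how this class of bug is usually avoided: a copy of len bytes needs len + 1 bytes of room for the terminating NUL. The helper name and signature are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the first len bytes of src into dst and NUL-terminate.
 * dst must have room for len + 1 bytes; sizing it for only len
 * bytes is the classic off-by-one that drops the last character. */
static int sketch_copy_hostname(char *dst, size_t dst_size,
                                const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len + 1 > dst_size)
        return -1;          /* no room for the terminating NUL */
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

For a device string such as server:/export, the caller would pass the number of bytes before the colon as len.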

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-09-01 22:14:40 +0800
fdc6e2c8c NFS: Return a real error code from mount(2)

Don't filter the return code from the in-kernel rpcbind or NFS mount
clients. Return the real error code so that callers of the new NFS
text-based mount API can apply a useful retry strategy.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-09-01 22:14:39 +0800
fdb66ff4a NFS: mount option parser chokes on proto=

The new text-based NFS mount option parsing logic doesn't recognize any
valid transport protocols due to a silly mistake in the protocol token
matching logic. This prevents basic mount requests such as:

mount.nfs server:/export /mnt -o proto=tcp

from working with the new text-based NFS mount API.

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-09-01 22:14:38 +0800
deee9369b NFSv4: Ensure that we pass the correct dentry to nfs4_intent_set_file

This patch fixes an Oops that was reported by Gabriel Barazer.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-09-01 22:14:38 +0800
65bbf6bdb NFSv4: Fix a typo in _nfs4_do_open_reclaim

This should fix the following Oops reported by Jeff Garzik:

kernel BUG at fs/nfs/nfs4xdr.c:1040!
invalid opcode: 0000 [1] SMP
CPU 0
Modules linked in: nfs lockd sunrpc af_packet
ipv6 cpufreq_ondemand acpi_cpufreq battery floppy nvram sg snd_hda_intel
ata_generic snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 snd_page_alloc e1000
firewire_ohci ata_piix i2c_core sr_mod cdrom sata_sil ahci libata sd_mod
scsi_mod ext3 jbd ehci_hcd uhci_hcd
Pid: 16353, comm: 10.10.10.1-recl Not tainted 2.6.23-rc3 #1
RIP: 0010:[] [] :nfs:encode_open+0x1c0/0x330
RSP: 0018:ffff8100467c5c60 EFLAGS: 00010202
RAX: ffff81000f89b8b8 RBX: 00000000697a6f6d RCX: ffff81000f89b8b8
RDX: 0000000000000004 RSI: 0000000000000004 RDI: ffff8100467c5c80
RBP: ffff8100467c5c80 R08: ffff81000f89bc30 R09: ffff81000f89b83f
R10: 0000000000000001 R11: ffffffff881e79e0 R12: ffff81003cbd1808
R13: ffff81000f89b860 R14: ffff81005fc984e0 R15: ffffffff88240af0
FS: 0000000000000000(0000) GS:ffffffff8052a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002adb9e51a030 CR3: 000000007ea7e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process 10.10.10.1-recl (pid: 16353, threadinfo ffff8100467c4000, task ffff8100038ce780)
Stack: ffff81004aeb6a40 ffff81003cbd1808 ffff81003cbd1808 ffffffff88240b5d
ffff81000f89b8bc ffff81005fc984e8 ffff81000f89bc30 ffff81005fc984e8
0000000300000000 0000000000000000 0000000000000000 ffff81003cbd1800
Call Trace:
[] :nfs:nfs4_xdr_enc_open_noattr+0x6d/0x90
[] :sunrpc:rpcauth_wrap_req+0x97/0xf0
[] :nfs:nfs4_xdr_enc_open_noattr+0x0/0x90
[] :sunrpc:call_transmit+0x18a/0x290
[] :sunrpc:__rpc_execute+0x6b/0x290
[] :sunrpc:rpc_do_run_task+0x76/0xd0
[] :nfs:_nfs4_proc_open+0x76/0x230
[] :nfs:nfs4_open_recover_helper+0x5e/0xc0
[] :nfs:nfs4_open_recover+0xe4/0x120
[] :nfs:nfs4_open_reclaim+0xa4/0xf0
[] :nfs:nfs4_reclaim_open_state+0x55/0x1b0
[] :nfs:reclaimer+0x2ca/0x390
[] :nfs:reclaimer+0x0/0x390
[] kthread+0x4b/0x80
[] child_rip+0xa/0x12
[] kthread+0x0/0x80
[] child_rip+0x0/0x12

Code: 0f 0b eb fe 48 89 ef c7 00 00 00 00 02 be 08 00 00 00 e8 79
RIP [] :nfs:encode_open+0x1c0/0x330
RSP

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-09-01 22:14:37 +0800
560aef745 NFS: Fix use of cancel_delayed_work_sync in nfs_release_automount_timer

Doh! We can't use cancel_delayed_work_sync because we may have been called
from an unmount that was being performed by nfs_automount_task.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-09-01 22:14:36 +0800
e89a5a43b NFS: Fix the mount regression

This avoids the recent NFS mount regression (returning EBUSY when
mounting the same filesystem twice with different parameters).

The best I can do given the constraints appears to be to have the kernel
first look for a superblock that matches both the fsid and the
user-specified mount options, and then spawn off a new superblock if
that search fails.

Note that this is not the same as specifying nosharecache everywhere
since nosharecache will never attempt to match an existing superblock.
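
The lookup order described above can be sketched in user-space C as follows. The table, the struct and the function name are hypothetical simplifications; a real superblock carries far more state than one fsid and one flags word.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SB 4

/* Hypothetical, heavily simplified superblock record. */
struct sketch_sb {
    unsigned long fsid;
    unsigned int  mount_flags;
    int           in_use;
};

/* First look for a superblock that matches BOTH the fsid and the
 * mount options; only if that search fails, set up a new one.
 * Returns NULL when the (hypothetical) table is full. */
static struct sketch_sb *sketch_get_sb(struct sketch_sb *table,
                                       unsigned long fsid,
                                       unsigned int mount_flags)
{
    struct sketch_sb *free_slot = NULL;

    assert(table != NULL);
    for (size_t i = 0; i < SKETCH_MAX_SB; i++) {
        struct sketch_sb *sb = &table[i];

        if (!sb->in_use) {
            if (free_slot == NULL)
                free_slot = sb;
            continue;
        }
        if (sb->fsid == fsid && sb->mount_flags == mount_flags)
            return sb;      /* share the existing superblock */
    }
    if (free_slot != NULL) {
        free_slot->fsid = fsid;
        free_slot->mount_flags = mount_flags;
        free_slot->in_use = 1;
    }
    return free_slot;
}
```

Mounting the same fsid twice with identical options returns the same record, while different options yield a second record instead of an EBUSY-style failure.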

Signed-off-by: Trond Myklebust
Tested-by: Hua Zhong
Signed-off-by: Linus Torvalds

Trond Myklebust
2007-09-01 11:26:45 +0800

08 Aug, 2007

5 commits

3d39c691f NFS: Replace flush_scheduled_work with cancel_work_sync() and friends

This will avoid deadlocks of the form:

stack backtrace:
[] show_trace_log_lvl+0x1a/0x30
[] show_trace+0x12/0x20
[] dump_stack+0x15/0x20
[] __lock_acquire+0xc22/0x1030
[] lock_acquire+0x61/0x80
[] flush_workqueue+0x49/0x70
[] flush_scheduled_work+0xd/0x10
[] nfs_release_automount_timer+0x2c/0x30 [nfs]
[] nfs_free_server+0x9e/0xd0 [nfs]
[] nfs_kill_super+0x16/0x20 [nfs]
[] deactivate_super+0x7d/0xa0
[] mntput_no_expire+0x4b/0x80
[] expire_mount_list+0xe4/0x140
[] mark_mounts_for_expiry+0x99/0xb0
[] nfs_expire_automounts+0xd/0x40 [nfs]
[] run_workqueue+0x12b/0x1e0
[] worker_thread+0x9b/0x100
[] kthread+0x42/0x70
[] kernel_thread_helper+0x7/0x18
=======================

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-08-08 04:12:50 +0800
905f8d16e NFSv4: Don't call put_rpccred() from an rcu callback

Doing so would require us to introduce bh-safe locks into put_rpccred().
This patch fixes the lockdep complaint reported by Marc Dietrich:

inconsistent {softirq-on-W} -> {in-softirq-W} usage.
swapper/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
(rpc_credcache_lock){-+..}, at: []
_atomic_dec_and_lock+0x17/0x60
{softirq-on-W} state was registered at:
[] __lock_acquire+0x650/0x1030
[] lock_acquire+0x61/0x80
[] _spin_lock+0x2c/0x40
[] _atomic_dec_and_lock+0x17/0x60
[] put_rpccred+0x5d/0x100 [sunrpc]
[] rpcauth_unbindcred+0x21/0x60 [sunrpc]
[] a0 [sunrpc]
[] rpc_call_sync+0x30/0x40 [sunrpc]
[] rpcb_register+0xdb/0x180 [sunrpc]
[] svc_register+0x93/0x160 [sunrpc]
[] __svc_create+0x1ee/0x220 [sunrpc]
[] svc_create+0x13/0x20 [sunrpc]
[] nfs_callback_up+0x82/0x120 [nfs]
[] nfs_get_client+0x176/0x390 [nfs]
[] nfs4_set_client+0x31/0x190 [nfs]
[] nfs4_create_server+0x63/0x3b0 [nfs]
[] nfs4_get_sb+0x346/0x5b0 [nfs]
[] vfs_kern_mount+0x94/0x110
[] do_mount+0x1f2/0x7d0
[] sys_mount+0x66/0xa0
[] syscall_call+0x7/0xb
[] 0xffffffff
irq event stamp: 5277830
hardirqs last enabled at (5277830): [] kmem_cache_free+0x8a/0xc0
hardirqs last disabled at (5277829): [] kmem_cache_free+0x52/0xc0
softirqs last enabled at (5277798): [] __do_softirq+0xa3/0xc0
softirqs last disabled at (5277817): [] do_softirq+0x47/0x50

other info that might help us debug this:
no locks held by swapper/0.

stack backtrace:
[] show_trace_log_lvl+0x1a/0x30
[] show_trace+0x12/0x20
[] dump_stack+0x15/0x20
[] print_usage_bug+0x153/0x160
[] mark_lock+0x449/0x620
[] __lock_acquire+0x604/0x1030
[] lock_acquire+0x61/0x80
[] _spin_lock+0x2c/0x40
[] _atomic_dec_and_lock+0x17/0x60
[] put_rpccred+0x5d/0x100 [sunrpc]
[] nfs_free_delegation_callback+0x13/0x20 [nfs]
[] __rcu_process_callbacks+0x6a/0x1c0
[] rcu_process_callbacks+0x12/0x30
[] tasklet_action+0x38/0x80
[] __do_softirq+0x55/0xc0
[] do_softirq+0x47/0x50
[] irq_exit+0x35/0x40
[] smp_apic_timer_interrupt+0x43/0x80
[] apic_timer_interrupt+0x33/0x38
[] cpuidle_idle_call+0x6f/0x90
[] cpu_idle+0x43/0x70
[] rest_init+0x47/0x50
[] start_kernel+0x22a/0x2b0
[] 0x0
=======================

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-08-08 03:15:57 +0800
45328c354 NFS: Fix NFSv4 open stateid regressions

Do not allow cached open for O_RDONLY or O_WRONLY unless the file has been
previously opened in these modes.

Also fix the calculation of the mode in nfs4_close_prepare. We should only
issue an OPEN_DOWNGRADE if we're sure that we will still be holding the
correct open modes. This may not be the case if we've been doing delegated
opens.

Finally, there is no need to adjust the open mode bit flags in
nfs4_close_done(): that has already been done in nfs4_close_prepare().

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-08-08 03:13:19 +0800
ba683031f NFSv4: Fix a locking regression in nfs4_set_mode_locked()

We don't really need to clear &state->inode_states inside
nfs4_set_mode_locked, and doing so without holding the inode->i_lock would
in any case be a bug...

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-08-08 03:13:18 +0800
5e11934d1 NFS: Fix put_nfs_open_context

We need to grab the inode->i_lock atomically with the last reference put in
order to remove the open context that is being freed from the
nfsi->open_files list.

Fix by converting the kref to a standard atomic counter and then using
atomic_dec_and_lock()...
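
The "decrement, and take the lock only if this was the last reference" pattern can be approximated in user space with C11 atomics. This is a sketch, not the kernel's atomic_dec_and_lock(): the toy spinlock and every sketch_ name are assumptions for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock standing in for inode->i_lock (illustration only). */
struct sketch_spinlock { atomic_bool held; };

static void sketch_lock_acquire(struct sketch_spinlock *l)
{
    while (atomic_exchange(&l->held, true))
        ;   /* spin until the previous holder releases */
}

static void sketch_lock_release(struct sketch_spinlock *l)
{
    atomic_store(&l->held, false);
}

/* Drop one reference. Returns true with the lock HELD only when the
 * count reached zero; otherwise returns false without the lock. */
static bool sketch_dec_and_lock(atomic_int *count, struct sketch_spinlock *l)
{
    int old = atomic_load(count);

    assert(old > 0);
    /* Fast path: not the last reference, so no lock is needed. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(count, &old, old - 1))
            return false;
    }
    /* Slow path: possibly the last reference, so lock before the drop. */
    sketch_lock_acquire(l);
    if (atomic_fetch_sub(count, 1) == 1)
        return true;
    sketch_lock_release(l);
    return false;
}
```

When the function returns true, the caller can unlink the object from its list and then release the lock, so no other thread can find the object on the list after its count has already reached zero.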

Thanks to Arnd Bergmann for pointing out the problem.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-08-08 03:13:17 +0800

23 Jul, 2007

1 commit

41089644c fix broken handling of port=... in NFS option parsing

Obviously broken on little-endian; fortunately, the option is not
frequently used...
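
The message does not show the faulty line, so the sketch below only illustrates the correct conversion for a 16-bit port: htons() yields the same wire bytes on any host, whereas storing the number unconverted would differ on little-endian machines. The helper names are hypothetical.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Convert a parsed port number to the 16-bit network-order value
 * that the sin_port field of struct sockaddr_in expects. */
static uint16_t sketch_port_to_net(unsigned int port)
{
    assert(port <= 65535u);
    return htons((uint16_t)port);
}

/* Copy the in-memory bytes of a network-order port into out[2],
 * so the wire layout can be inspected on any host. */
static void sketch_port_wire_bytes(uint16_t net_port, unsigned char out[2])
{
    memcpy(out, &net_port, sizeof net_port);
}
```

For port 2049 (0x0801) the wire bytes are 0x08 followed by 0x01, regardless of the byte order of the machine that ran the parser.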

Signed-off-by: Al Viro
[ Hey, sparse is wonderful, but even better than sparse is having people
like Al that actually _run_ it and fix bugs using it. - Linus ]
Signed-off-by: Linus Torvalds

Al Viro
2007-07-23 02:15:18 +0800

20 Jul, 2007

13 commits

20c2df83d mm: Remove slab destructors from kmem_cache_create().

Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.

This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).

Signed-off-by: Paul Mundt

Paul Mundt
2007-07-20 09:11:58 +0800
0a87cf128 NFSv4: handle lack of clientaddr in option string

If an NFSv4 mount is attempted with string-based options, and the
option string doesn't contain a clientaddr= option, the kernel will
currently oops. Check for this situation and return a proper error.

Signed-off-by: Jeff Layton
Signed-off-by: Trond Myklebust

Jeff Layton
2007-07-20 03:21:40 +0800
f9d888fcd NFSv4: debug print ntohl(status) in nfs client callback xdr code

The status in the NFS client callback XDR code is passed in network byte
order. Print it in host order for better readability.
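
A small user-space sketch of the same idea: convert the network-order word with ntohl() before formatting it. The function name and the output format are assumptions; the kernel uses its own debug-print macros.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format a status word that arrives in network byte order.
 * Converting with ntohl() first makes the printed number match
 * the protocol's numeric codes on little-endian hosts too. */
static int sketch_format_status(char *buf, size_t size, uint32_t net_status)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "status=%u",
                    (unsigned int)ntohl(net_status));
}
```

Without the conversion, a little-endian host would print the byte-swapped value, which is hard to match against the protocol's error numbers.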

Signed-off-by: Benny Halevy
Signed-off-by: Trond Myklebust

Benny Halevy
2007-07-20 03:21:40 +0800
e4eff1a62 SUNRPC: Clean up the sillyrename code

Fix a couple of bugs:
- Don't rely on the parent dentry still being valid when the call completes.
  Fixes a race with shrink_dcache_for_umount_subtree()

- Don't remove the file if the filehandle has been labelled as stale.

Fix a couple of inefficiencies:
- Remove the global list of sillyrenamed files. Instead we can cache the
  sillyrename information in the dentry->d_fsdata
- Move common code from unlink_setup/unlink_done into fs/nfs/unlink.c

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:21:39 +0800
4fdc17b2a NFS: Introduce struct nfs_removeargs+nfs_removeres

We need a common structure for setting up an unlink() rpc call in order to
fix the asynchronous unlink code.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:21:39 +0800
3062c532a NFS: Use dentry->d_time to store the parent directory verifier.

This will free up the d_fsdata field for other use.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:21:39 +0800
e3a535e17 NFSv4: Fix the nfsv4 readlink reply buffer alignment

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:09:04 +0800
d6ac02dfa NFSv4: Fix the readdir reply buffer alignment

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:09:04 +0800
9104a55dc NFSv4: More NFSv4 xdr cleanups

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:09:04 +0800
9936781d0 NFSv4: Try to recover from getfh failures in nfs4_xdr_dec_open

Try harder to recover the open state if the server failed to return a
filehandle.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:09:03 +0800
56659e992 NFSv4: 'constify' lookup arguments.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:09:03 +0800
365c8f589 NFSv4: Don't fail nfs4_xdr_dec_open if decode_restorefh() failed

We can already easily recover from that inside _nfs4_proc_open().

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:09:03 +0800
6f220ed5a NFSv4: Fix open state recovery

Ensure that opendata->state is always initialised when we do state
recovery.

Ensure that we set the filehandle in the case where we're doing an
"OPEN_CLAIM_PREVIOUS" call due to a server reboot.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-20 03:09:03 +0800

19 Jul, 2007

2 commits

6d34ac199 locks: make posix_test_lock() interface more consistent

Since posix_test_lock(), like fcntl() and ->lock(), indicates absence or
presence of a conflicting lock by setting fl_type to, respectively, F_UNLCK
or something other than F_UNLCK, the return value is no longer needed.
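
The calling convention can be illustrated with a simplified user-space model. The struct, the whole-file conflict rule and the function name are assumptions; only the "result is reported through the type field, not a return value" convention is taken from the message.

```c
#include <assert.h>
#include <fcntl.h>      /* F_RDLCK, F_WRLCK, F_UNLCK */
#include <stddef.h>

/* Hypothetical, simplified lock record (whole-file locks only). */
struct sketch_flock {
    short type;         /* F_RDLCK, F_WRLCK or F_UNLCK */
    int   owner;
};

/* Test req against the held locks. There is no return value:
 * on exit req->type is F_UNLCK if nothing conflicts, otherwise
 * *req is overwritten with the conflicting lock. */
static void sketch_test_lock(const struct sketch_flock *held, size_t n,
                             struct sketch_flock *req)
{
    assert(req != NULL);
    for (size_t i = 0; i < n; i++) {
        if (held[i].owner == req->owner)
            continue;   /* an owner never conflicts with itself */
        if (held[i].type == F_WRLCK || req->type == F_WRLCK) {
            *req = held[i];
            return;
        }
    }
    req->type = F_UNLCK;
}
```

A caller checks req.type after the call, in the same way that a program inspects l_type after fcntl(F_GETLK).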

Signed-off-by: "J. Bruce Fields"

J. Bruce Fields
2007-07-19 07:17:19 +0800
370f6599e nfs: disable leases over NFS

As Peter Staubach says elsewhere
(http://marc.info/?l=linux-kernel&m=118113649526444&w=2):

> The problem is that some file system such as NFSv2 and NFSv3 do
> not have sufficient support to be able to support leases correctly.
> In particular for these two file systems, there is no over the wire
> protocol support.
>
> Currently, these two file systems fail the fcntl(F_SETLEASE) call
> accidentally, due to a reference counting difference. These file
> systems should fail more consciously, with a proper error to
> indicate that the call is invalid for them.

Define an nfs setlease method that just returns -EINVAL.
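
A minimal sketch of a per-filesystem method that refuses the operation outright is shown below. The ops table, the fallback to a generic handler and all sketch_ names are assumptions made for the example, not the kernel's interfaces.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operations table with an optional setlease hook. */
struct sketch_file_ops {
    int (*setlease)(long arg);
};

/* Hypothetical default used when a filesystem supplies no hook. */
static int sketch_generic_setlease(long arg)
{
    (void)arg;
    return 0;
}

/* A method that fails on purpose, with a definite error code. */
static int sketch_nfs_setlease(long arg)
{
    (void)arg;
    return -EINVAL;
}

/* Dispatch: prefer the filesystem's own method when it has one. */
static int sketch_vfs_setlease(const struct sketch_file_ops *ops, long arg)
{
    assert(ops != NULL);
    if (ops->setlease != NULL)
        return ops->setlease(arg);
    return sketch_generic_setlease(arg);
}
```

Failing through an explicit method makes the refusal deliberate, instead of relying on a side effect elsewhere in the code.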

If someone can demonstrate a real need, perhaps we could reenable
them in the presence of the "nolock" mount option.

Signed-off-by: "J. Bruce Fields"
Cc: Peter Staubach
Cc: Trond Myklebust

J. Bruce Fields
2007-07-19 07:17:19 +0800

18 Jul, 2007

2 commits

831441862 Freezer: make kernel threads nonfreezable by default

Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves. This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.

It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.

The patch causes all kernel threads to be nonfreezable by default (i.e. to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE. It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear. Additionally, it updates documentation to
describe the freezing of tasks more accurately.

[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki
Acked-by: Nigel Cunningham
Cc: Pavel Machek
Cc: Oleg Nesterov
Cc: Gautham R Shenoy
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rafael J. Wysocki
2007-07-18 01:23:02 +0800
8e1f936b7 mm: clean up and kernelify shrinker registration

I can never remember what the function to register to receive VM pressure
is called. I have to trace down from __alloc_pages() to find it.

It's called "set_shrinker()", and it needs Your Help.

1) Don't hide struct shrinker. It contains no magic.
2) Don't allocate "struct shrinker". It's not helpful.
3) Call them "register_shrinker" and "unregister_shrinker".
4) Call the function "shrink" not "shrinker".
5) Reduce the 17 lines of waffly comments to 13, but document it properly.

Signed-off-by: Rusty Russell
Cc: David Chinner
Cc: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Rusty Russell
2007-07-18 01:23:00 +0800

17 Jul, 2007

1 commit

259902ea9 Make NFS client use seq_list_xxx helpers

This includes /proc/fs/nfsfs/servers and /proc/fs/nfsfs/volumes entries.

Both need to show the header and use the list_head.

Signed-off-by: Pavel Emelianov
Acked-by: Trond Myklebust
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pavel Emelianov
2007-07-17 00:05:42 +0800

11 Jul, 2007

5 commits

137d6acaa NFSv4: Make sure unlock is really an unlock when cancelling a lock

I ran into a curious issue when a lock is being canceled. The
cancellation results in a lock request to the vfs layer instead of an
unlock request. This is particularly insidious when the process that
owns the lock is exiting. In that case, sometimes the erroneous lock is
applied AFTER the process has entered zombie state, preventing the lock
from ever being released. Eventually other processes block on the lock
causing a slow degradation of the system. In the 2.6.16 kernel on which this
was investigated, the problem is compounded by the fact that the cl_sem
is held while blocking on the vfs lock, which results in most processes
accessing the nfs file system in question hanging.

In more detail, here is how the situation occurs:

first _nfs4_do_setlk():

static int _nfs4_do_setlk(struct nfs4_state *state, int cmd, struct file_lock *fl, int reclaim)
...
        ret = nfs4_wait_for_completion_rpc_task(task);
        if (ret == 0) {
                ...
        } else
                data->cancelled = 1;

then nfs4_lock_release():

static void nfs4_lock_release(void *calldata)
...
        if (data->cancelled != 0) {
                struct rpc_task *task;
                task = nfs4_do_unlck(&data->fl, data->ctx, data->lsp,
                                data->arg.lock_seqid);

The problem is the same file_lock that was passed in to _nfs4_do_setlk()
gets passed to nfs4_do_unlck() from nfs4_lock_release(). So the type is
still F_RDLCK or F_WRLCK, not F_UNLCK. At some point, when cancelling the
lock, the type needs to be changed to F_UNLCK. It seemed easiest to do
that in nfs4_do_unlck(), but it could be done in nfs4_lock_release().
The concern I had with doing it there was if something still needed the
original file_lock, though it turns out the original file_lock still
needs to be modified by nfs4_do_unlck() because nfs4_do_unlck() uses the
original file_lock to pass to the vfs layer, and a copy of the original
file_lock for the RPC request.
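
As a rough illustration only (the patch itself is not reproduced on this page), the idea of forcing the lock type inside the unlock helper looks like this in a user-space model. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>      /* F_RDLCK, F_WRLCK, F_UNLCK */
#include <stddef.h>

/* Hypothetical, minimal stand-in for struct file_lock. */
struct sketch_file_lock {
    short fl_type;
};

/* Hypothetical unlock helper: whatever type the caller's record
 * still carries, force it to F_UNLCK before it is used, so a
 * cancelled lock request can never be re-applied as a lock. */
static short sketch_do_unlck(struct sketch_file_lock *fl)
{
    assert(fl != NULL);
    fl->fl_type = F_UNLCK;
    return fl->fl_type; /* the type a lower layer would see */
}
```

Doing this in the helper itself covers every caller, including the cancellation path that reuses the original lock record.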

It seems like the simplest solution is to force all situations where
nfs4_do_unlck() is being used to result in an unlock, so with that in
mind, I made the following change:

Signed-off-by: Frank Filz
Signed-off-by: Trond Myklebust

Frank Filz
2007-07-11 11:40:49 +0800
6f2e64d3e NFSv4: Make the NFS state model work with the nosharedcache mount option

Consider the case where the user has mounted the remote filesystem
server:/foo on the two local directories /bar and /baz using the
nosharecache mount option. The files /bar/file and /baz/file are
represented by different inodes in the local namespace, but refer to the
same file /foo/file on the server.
Consider the case where a process opens both /bar/file and /baz/file, then
closes /bar/file: because the nfs4_state is not shared between /bar/file
and /baz/file, the kernel will see that the nfs4_state for /bar/file is no
longer referenced, so it will send off a CLOSE rpc call. Unless the
open_owners differ, that CLOSE call will invalidate the open state on
/baz/file too.

Conclusion: we cannot share open state owners between two different
non-shared mount instances of the same filesystem.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-11 11:40:48 +0800
275a5d24b NFS: Error when mounting the same filesystem with different options

Unless the user sets the NFS_MOUNT_NOSHAREDCACHE mount flag, we should
return EBUSY if the filesystem is already mounted on a superblock that
has set conflicting mount options.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-11 11:40:48 +0800
75180df2e NFS: Add the mount option "nosharecache"

Prior to David Howells' mount changes in 2.6.18, users who mounted
different directories which happened to be from the same filesystem on the
server would get different super blocks, and hence could choose different
mount options. As long as there were no hard linked files that crossed from
one subtree to another, this was quite safe.
After the changes, if the two directories are on the same filesystem (have
the same 'fsid'), they will share the same super block, and hence the same
mount options.

Add a flag to allow users to elect not to share the NFS super block with
another mount point, even if the fsids are the same. This will allow
users to set different mount options for the two different super blocks, as
was previously possible. It is still up to the user to ensure that there
are no cache coherency issues when doing this; however, the default
behaviour will be to share super blocks whenever two paths result in
the same fsid.

Signed-off-by: Trond Myklebust

Trond Myklebust
2007-07-11 11:40:48 +0800
800712252 NFS: Add support for mounting NFSv4 file systems with string options

Signed-off-by: Chuck Lever
Signed-off-by: Trond Myklebust

Chuck Lever
2007-07-11 11:40:48 +0800