23 Mar, 2011
17 commits
-
printk()s without a priority level default to KERN_WARNING. To reduce
noise at KERN_WARNING, this patch sets the priority level appropriately
for unleveled printk()s. This should be useful to folks that look at
dmesg warnings closely.
Signed-off-by: Mandeep Singh Baines
Cc: Jens Axboe
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
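The defaulting rule in the printk() commit above can be illustrated in userspace. This is only a rough sketch, not kernel code: it parses an optional "<N>" level prefix and falls back to 4, the numeric value of KERN_WARNING, when the prefix is missing. The names effective_level() and DEFAULT_LEVEL are made up for this example.

```c
/* Numeric value of KERN_WARNING ("<4>"); used when a message has no prefix. */
#define DEFAULT_LEVEL 4

/*
 * Return the log level encoded as a leading "<N>" (N = 0..7) in msg,
 * or DEFAULT_LEVEL if msg carries no such prefix.
 */
static int effective_level(const char *msg)
{
	if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
		return msg[1] - '0';
	return DEFAULT_LEVEL;
}
```

A message such as "<6>link up" is reported at level 6, while a bare "link up" ends up at the warning level, which is the noise the patch is trying to reduce.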
Now that the mere act of _looking_ at /proc/$pid/smaps will not destroy
transparent huge pages, tell how much of the VMA is actually mapped with
them.

This way, we can make sure that we're getting THPs where we
expect to see them.
Signed-off-by: Dave Hansen
Acked-by: Mel Gorman
Acked-by: David Rientjes
Reviewed-by: Eric B Munson
Tested-by: Eric B Munson
Cc: Michael J Wolf
Cc: Andrea Arcangeli
Cc: Johannes Weiner
Cc: Matt Mackall
Cc: Jeremy Fitzhardinge
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
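One way to consume the new smaps information from userspace is to add up the per-VMA huge page figures. The sketch below assumes the field is reported as "AnonHugePages:" followed by a size in kB, and that the caller has already read the smaps text into a buffer. The helper name sum_anon_huge_kb() is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Sum every "AnonHugePages:" value (in kB) found in smaps-formatted text.
 * Lines that do not start with that field name are ignored.
 */
static unsigned long sum_anon_huge_kb(const char *smaps_text)
{
	static const char field[] = "AnonHugePages:";
	unsigned long total = 0;
	const char *line = smaps_text;

	while (line && *line) {
		unsigned long kb;

		if (strncmp(line, field, sizeof(field) - 1) == 0 &&
		    sscanf(line + sizeof(field) - 1, "%lu", &kb) == 1)
			total += kb;

		line = strchr(line, '\n');	/* advance to the next line */
		if (line)
			line++;
	}
	return total;
}
```

To check a live process, read /proc/$pid/smaps into a NUL-terminated buffer and pass it to the function. A result of 0 for a large anonymous mapping suggests THPs are not being used where expected.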
This adds code to explicitly detect and handle pmd_trans_huge() pmds. It
then passes HPAGE_SIZE units into the smaps_pte_entry() function instead
of PAGE_SIZE.

This means that using /proc/$pid/smaps now will no longer cause THPs to be
broken down into small pages.
Signed-off-by: Dave Hansen
Reviewed-by: Eric B Munson
Tested-by: Eric B Munson
Acked-by: Andrea Arcangeli
Acked-by: David Rientjes
Cc: Mel Gorman
Cc: Michael J Wolf
Cc: Andrea Arcangeli
Cc: Johannes Weiner
Cc: Matt Mackall
Cc: Jeremy Fitzhardinge
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add an argument to the new smaps_pte_entry() function to let it account in
things other than PAGE_SIZE units. I changed all of the PAGE_SIZE sites,
even though not all of them can be reached for transparent huge pages,
just so this will continue to work without changes as THPs are improved.
Signed-off-by: Dave Hansen
Acked-by: Mel Gorman
Acked-by: Johannes Weiner
Acked-by: David Rientjes
Reviewed-by: Eric B Munson
Tested-by: Eric B Munson
Cc: Michael J Wolf
Cc: Andrea Arcangeli
Cc: Matt Mackall
Cc: Jeremy Fitzhardinge
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
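The idea of the size argument added to smaps_pte_entry() can be shown with a toy accounting helper. This is a simplified sketch under stated assumptions: the struct, the function and the two size constants are invented for illustration and are not the kernel's definitions.

```c
#define TOY_PAGE_SIZE  4096UL			/* assumed small-page size */
#define TOY_HPAGE_SIZE (2UL * 1024 * 1024)	/* assumed huge-page size */

/* Simplified stand-in for the per-VMA statistics gathered for smaps. */
struct toy_mem_stats {
	unsigned long resident;	/* bytes accounted so far */
};

/*
 * Account one mapped entry. The caller says how large the entry is,
 * so the same code serves small pages and huge pages alike.
 */
static void toy_account_entry(struct toy_mem_stats *stats,
			      unsigned long entry_size)
{
	stats->resident += entry_size;
}
```

Because the unit is a parameter rather than a hard-coded PAGE_SIZE, a caller that finds a huge mapping can account it in one call instead of splitting it first.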
We will use smaps_pte_entry() in a moment to handle both small and
transparent large pages. But, we must break it out of smaps_pte_range()
first.
Signed-off-by: Dave Hansen
Acked-by: Mel Gorman
Acked-by: Johannes Weiner
Acked-by: David Rientjes
Reviewed-by: Eric B Munson
Tested-by: Eric B Munson
Cc: Michael J Wolf
Cc: Andrea Arcangeli
Cc: Matt Mackall
Cc: Jeremy Fitzhardinge
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Right now, if a mm_walk has either ->pte_entry or ->pmd_entry set, it will
unconditionally split any transparent huge pages it runs into. In
practice, that means that anyone doing a

    cat /proc/$pid/smaps

will unconditionally break down every huge page in the process and depend
on khugepaged to re-collapse it later. This is fairly suboptimal.

This patch changes that behavior. It teaches each ->pmd_entry handler
(there are five) that they must break down the THPs themselves. Also, the
_generic_ code will never break down a THP unless a ->pte_entry handler is
actually set.

This means that the ->pmd_entry handlers can now choose to deal with THPs
without breaking them down.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Dave Hansen
Acked-by: Mel Gorman
Acked-by: David Rientjes
Reviewed-by: Eric B Munson
Tested-by: Eric B Munson
Cc: Michael J Wolf
Cc: Andrea Arcangeli
Cc: Johannes Weiner
Cc: Matt Mackall
Cc: Jeremy Fitzhardinge
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
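The walker policy described in the mm_walk commit above can be modelled in a few lines. This is a toy model only: every type and function name here is hypothetical, and a "PMD" is reduced to a single flag. It shows the generic loop splitting a huge entry only when a ->pte_entry handler is set, while a ->pmd_entry handler is handed the entry untouched.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy PMD slot: either one huge mapping or a table of small pages. */
struct toy_pmd {
	bool huge;
};

struct toy_walk {
	/* Called once per PMD; must cope with huge PMDs itself. */
	void (*pmd_entry)(struct toy_pmd *pmd, struct toy_walk *walk);
	/* Called for small pages; the PMD is split before it runs. */
	void (*pte_entry)(struct toy_pmd *pmd, struct toy_walk *walk);
	unsigned int splits;	/* huge PMDs broken down by the walker */
	unsigned int huge_seen;	/* huge PMDs observed intact by a handler */
};

static void toy_split_huge_pmd(struct toy_pmd *pmd, struct toy_walk *walk)
{
	if (pmd->huge) {
		pmd->huge = false;
		walk->splits++;
	}
}

/* Generic loop: never splits unless a pte_entry handler is set. */
static void toy_walk_pmd_range(struct toy_pmd *pmds, size_t n,
			       struct toy_walk *walk)
{
	for (size_t i = 0; i < n; i++) {
		if (walk->pmd_entry)
			walk->pmd_entry(&pmds[i], walk);
		if (!walk->pte_entry)
			continue;
		toy_split_huge_pmd(&pmds[i], walk);
		walk->pte_entry(&pmds[i], walk);
	}
}

/* Example handler that deals with huge PMDs without splitting them. */
static void toy_thp_aware_pmd_entry(struct toy_pmd *pmd, struct toy_walk *walk)
{
	if (pmd->huge)
		walk->huge_seen++;
}

/* Example per-page handler that does nothing. */
static void toy_noop_pte_entry(struct toy_pmd *pmd, struct toy_walk *walk)
{
	(void)pmd;
	(void)walk;
}
```

With only toy_thp_aware_pmd_entry() installed, a walk leaves huge entries intact. Installing toy_noop_pte_entry() makes the same walk split them, which mirrors the old unconditional behavior.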
This patch series changes remove_from_page_cache()'s page ref counting
rule. Page cache ref count is decreased in delete_from_page_cache(). So
we don't need to decrease the page reference in callers.
Signed-off-by: Minchan Kim
Cc: William Irwin
Acked-by: Hugh Dickins
Acked-by: Mel Gorman
Reviewed-by: KAMEZAWA Hiroyuki
Reviewed-by: Johannes Weiner
Reviewed-by: KOSAKI Motohiro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This function basically does:

    remove_from_page_cache(old);
    page_cache_release(old);
    add_to_page_cache_locked(new);

Except it does this atomically, so there's no possibility for the "add" to
fail because of a race.

If memory cgroups are enabled, then the memory cgroup charge is also moved
from the old page to the new.

This function is currently used by fuse to move pages into the page cache
on read, instead of copying the page contents.

[minchan.kim@gmail.com: add freepage() hook to replace_page_cache_page()]
Signed-off-by: Miklos Szeredi
Acked-by: Rik van Riel
Acked-by: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Signed-off-by: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
[net/9p]: Introduce basic flow-control for VirtIO transport.
9p: use the updated offset given by generic_write_checks
[net/9p] Don't re-pin pages on retrying virtqueue_add_buf().
[net/9p] Set the condition just before waking up.
[net/9p] unconditional wake_up to proc waiting for space on VirtIO ring
fs/9p: Add v9fs_dentry2v9ses
fs/9p: Attach writeback_fid on first open with WR flag
fs/9p: Open writeback fid in O_SYNC mode
fs/9p: Use truncate_setsize instead of vmtruncate
net/9p: Fix compile warning
net/9p: Convert the in the 9p rpc call path to GFP_NOFS
fs/9p: Fix race in initializing writeback fid -
* git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: use watch/notify for changes in rbd header
libceph: add lingering request and watch/notify event framework
rbd: update email address in Documentation
ceph: rename dentry_release -> d_release, fix comment
ceph: add request to the tail of unsafe write list
ceph: remove request from unsafe list if it is canceled/timed out
ceph: move readahead default to fs/ceph from libceph
ceph: add ino32 mount option
ceph: update common header files
ceph: remove debugfs debug cruft
libceph: fix osd request queuing on osdmap updates
ceph: preserve I_COMPLETE across rename
libceph: Fix base64-decoding when input ends in newline. -
Without this fix, even if a file is opened in O_APPEND mode, data will be
written at current file position instead of end of file.
Signed-off-by: M. Mohan Kumar
Reviewed-by: Aneesh Kumar K.V
Signed-off-by: Eric Van Hensbergen -
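The O_APPEND behavior that the 9p fix above restores can be checked from userspace on any file system. The sketch below is an illustration, not a 9p test: it writes "abc" to a temporary file, reopens it with O_APPEND, seeks back to offset 0 and writes "de". With correct O_APPEND semantics the second write lands at the end of the file. The function name append_demo() and the /tmp location are assumptions of this example.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Returns the final file size after an O_APPEND write issued from
 * offset 0, or -1 on error. A size of 5 means the data was appended;
 * a size of 3 would mean it overwrote the start of the file.
 */
static long append_demo(void)
{
	char path[] = "/tmp/append_demo_XXXXXX";
	long size = -1;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	if (write(fd, "abc", 3) != 3)
		goto out;
	close(fd);

	fd = open(path, O_WRONLY | O_APPEND);
	if (fd < 0)
		goto out_unlink;
	/* Move the file position away from the end before writing. */
	if (lseek(fd, 0, SEEK_SET) == 0 && write(fd, "de", 2) == 2)
		size = (long)lseek(fd, 0, SEEK_END);
out:
	close(fd);
out_unlink:
	unlink(path);
	return size;
}
```

Pointing the temporary path at a 9p mount would exercise the code path touched by the fix.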
Add the new static inline and use the same
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Venkateswararao Jujjuri
Signed-off-by: Eric Van Hensbergen -
We don't need writeback fid if we are only doing O_RDONLY open
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Venkateswararao Jujjuri
Signed-off-by: Eric Van Hensbergen -
Older versions of the protocol don't support the tsyncfs operation.
So for them, force an O_SYNC flag on the server.
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Venkateswararao Jujjuri
Signed-off-by: Eric Van Hensbergen -
Convert vmtruncate usage to truncate_setsize. We also write back
all dirty pages before doing 9p operations and on success call truncate_setsize.
This ensures that we continue sanely on a failed truncate on the server. The
disadvantage is that we are now going to write back content that gets
thrown away later as part of the truncate.
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Venkateswararao Jujjuri
Signed-off-by: Eric Van Hensbergen -
When two processes open the same file we can end up with both of them
allocating the writeback_fid. Add a new mutex which can be used
for synchronizing v9fs_inode member values.
Signed-off-by: Aneesh Kumar K.V
Signed-off-by: Venkateswararao Jujjuri
Signed-off-by: Eric Van Hensbergen -
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: make fuse_dentry_revalidate() RCU aware
fuse: make fuse_permission() RCU aware
fuse: wakeup pollers on connection release/abort
fuse: reduce size of struct fuse_request
22 Mar, 2011
9 commits
-
* 'for-linus' of git://oss.sgi.com/xfs/xfs: (23 commits)
xfs: don't name variables "panic"
xfs: factor agf counter updates into a helper
xfs: clean up the xfs_alloc_compute_aligned calling convention
xfs: kill support/debug.[ch]
xfs: Convert remaining cmn_err() callers to new API
xfs: convert the quota debug prints to new API
xfs: rename xfs_cmn_err_fsblock_zero()
xfs: convert xfs_fs_cmn_err to new error logging API
xfs: kill xfs_fs_mount_cmn_err() macro
xfs: kill xfs_fs_repair_cmn_err() macro
xfs: convert xfs_cmn_err to xfs_alert_tag
xfs: Convert xlog_warn to new logging interface
xfs: Convert linux-2.6/ files to new logging interface
xfs: introduce new logging API.
xfs: zero proper structure size for geometry calls
xfs: enable delaylog by default
xfs: more sensible inode refcounting for ialloc
xfs: stop using xfs_trans_iget in the RT allocator
xfs: check if device support discard in xfs_ioc_trim()
xfs: prevent leaking uninitialized stack memory in FSGEOMETRY_V1
... -
/sys/fs is a somewhat strange way to tweak what could more
obviously be tuned with a mount option.
Suggested-by: Christoph Hellwig
Signed-off-by: Tony Luck
Signed-off-by: Linus Torvalds -
Just for consistency's sake. Fix obsolete comment too.
Signed-off-by: Sage Weil
-
In sync_write_wait(), we assume that the newest request is at the
tail of unsafe write list. We should maintain the semantics here.
Signed-off-by: Henry C Chang
Signed-off-by: Sage Weil -
This fixes the list corruption warning like this:
------------[ cut here ]------------
WARNING: at lib/list_debug.c:30 __list_add+0x68/0x81()
Hardware name: X8DTU
list_add corruption. prev->next should be next (ffff880618931250), but was (null). (prev=ffff880c188b9130).
Modules linked in: nfsd lockd nfs_acl auth_rpcgss exportfs ceph libceph libcrc32c sunrpc ipv6 fuse igb i2c_i801 ioatdma i2c_core iTCO_wdt iTCO_vendor_support joydev dca serio_raw usb_storage [last unloaded: scsi_wait_scan]
Pid: 10977, comm: smbd Tainted: G W 2.6.32.23-170.Elaster.xendom0.fc12.x86_64 #1
Call Trace:
[] warn_slowpath_common+0x7c/0x94
[] warn_slowpath_fmt+0x41/0x43
[] __list_add+0x68/0x81
[] ceph_aio_write+0x614/0x8a2 [ceph]
[] do_sync_write+0xe8/0x125
[] ? autoremove_wake_function+0x0/0x39
[] ? selinux_file_permission+0x5c/0xb3
[] ? security_file_permission+0x16/0x18
[] vfs_write+0xae/0x10b
[] sys_pwrite64+0x5a/0x76
[] system_call_fastpath+0x16/0x1b
---[ end trace 08573eb9f07ff6f4 ]---
Signed-off-by: Henry C Chang
Signed-off-by: Sage Weil -
Signed-off-by: Sage Weil
-
The ino32 mount option forces the ceph fs to report 32 bit
ino values. This is useful for 64 bit kernels with 32 bit userspace.
Signed-off-by: Yehuda Sadeh
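Reporting 32 bit ino values means mapping a 64 bit inode number into a smaller space. One common way to do that is to XOR the two halves together. The sketch below shows that folding technique only as an illustration and makes no claim about the exact function the ceph client uses. The name toy_ino_to_ino32() and the choice to avoid returning 0 are assumptions.

```c
#include <stdint.h>

/* Fold a 64-bit inode number into 32 bits by XOR-ing its two halves. */
static uint32_t toy_ino_to_ino32(uint64_t ino)
{
	uint32_t folded = (uint32_t)(ino & 0xffffffffu) ^ (uint32_t)(ino >> 32);

	/* Assumed policy: never hand out 0, which often means "no inode". */
	return folded ? folded : 1;
}
```

Folding is lossy, so two distinct 64 bit inode numbers can map to the same 32 bit value. That is the price of keeping 32 bit userspace working.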
-
Whoops!
Signed-off-by: Sage Weil
-
lookup_mnt() is only used in the core fs routines now, so it doesn't need to
be globally declared anymore. It isn't exported to modules at the moment, so
nothing that can be modularised seems to be using it.
Signed-off-by: David Howells
Signed-off-by: Al Viro
21 Mar, 2011
14 commits
-
Only bail out of fuse_dentry_revalidate() on LOOKUP_RCU when blocking
is actually necessary.
CC: Nick Piggin
Signed-off-by: Miklos Szeredi -
Only bail out of fuse_permission() on IPERM_FLAG_RCU when blocking is
actually necessary.
CC: Nick Piggin
Signed-off-by: Miklos Szeredi -
If a fuse dev connection is broken, wake up any
processes that are blocking, in a poll system call,
on one of the files in the now defunct filesystem.
Signed-off-by: Miklos Szeredi
-
Reduce the size of struct fuse_request by removing cuse_init_out from
the request structure and allocating it dynamically instead.
CC: Tejun Heo
Signed-off-by: Miklos Szeredi -
The usage of find_first_zero_bit() in bfs_create() is wrong for two
reasons.

The bitmap size argument to find_first_zero_bit() is info->si_lasti but
the correct bitmap size is info->si_lasti + 1 as info->si_lasti is the
last valid index in info->si_imap bitmap.

Another problem is that it is impossible to detect that info->si_imap
bitmap is full because there is an off-by-one bug in the return value
check for find_first_zero_bit(). If no zero bits exist in info->si_imap,
find_first_zero_bit() returns info->si_lasti. But the check can't catch
it due to the off-by-one.
Signed-off-by: Akinobu Mita
Acked-by: "Tigran A. Aivazian"
Signed-off-by: Andrew Morton
Signed-off-by: Al Viro -
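Both problems from the bfs_create() commit above come down to how the bitmap size is passed and checked. The userspace sketch below shows the corrected pattern under simplifying assumptions: toy_find_first_zero_bit() is a plain loop standing in for the kernel helper, and toy_alloc_ino() is a hypothetical caller, not the bfs code.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Stand-in for find_first_zero_bit(): return the index of the first
 * clear bit among the first `size` bits, or `size` if all are set.
 */
static unsigned long toy_find_first_zero_bit(const unsigned long *map,
					     unsigned long size)
{
	unsigned long i;

	for (i = 0; i < size; i++)
		if (!(map[i / BITS_PER_ULONG] & (1UL << (i % BITS_PER_ULONG))))
			return i;
	return size;
}

/*
 * Pick a free inode number in [0, lasti].
 * Returns -1 when every bit up to and including lasti is set.
 */
static long toy_alloc_ino(const unsigned long *imap, unsigned long lasti)
{
	unsigned long size = lasti + 1;	/* lasti is the last valid index */
	unsigned long ino = toy_find_first_zero_bit(imap, size);

	return ino >= size ? -1 : (long)ino;
}
```

Passing lasti instead of lasti + 1 would hide the last slot from the search. Comparing the result against the wrong bound would let a full bitmap go undetected.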
dentry_open() requires callers to pass a valid vfsmount.
Signed-off-by: Tetsuo Handa
Signed-off-by: Al Viro -
In this case nobody can open the slave point, so it is better to return
from devpts_pty_new().

Now we do not need to check the error code from d_find_alias() in
devpts_pty_kill(), because the dentry always exists.
Signed-off-by: Andrey Vagin
Signed-off-by: Al Viro -
These should be spin_unlock() instead of spin_lock(). It's a typo.
Signed-off-by: Dan Carpenter
Signed-off-by: Al Viro -
Move kfree() of i_private out of ->unlink() and into ->evict_inode()
Signed-off-by: Tony Luck
Signed-off-by: Al Viro -
It is frequently useful to sync a single file system, instead of all
mounted file systems via sync(2):

- On machines with many mounts, it is not at all uncommon for some of
  them to hang (e.g. unresponsive NFS server). sync(2) will get stuck on
  those and may never get to the one you do care about (e.g., /).
- Some applications write lots of data to the file system and then
  want to make sure it is flushed to disk. Calling fsync(2) on each
  file introduces unnecessary ordering constraints that result in a large
  amount of sub-optimal writeback/flush/commit behavior by the file
  system.

There are currently two ways (that I know of) to sync a single super_block:

- BLKFLSBUF ioctl on the block device: That also invalidates the bdev
  mapping, which isn't usually desirable, and doesn't work for non-block
  file systems.
- 'mount -o remount,rw' will call sync_filesystem as an artifact of the
  current implementation. Relying on this little-known side effect for
  something like data safety sounds foolish.

Both of these approaches require root privileges, which some applications
do not have (nor should they need?) given that sync(2) is an unprivileged
operation.

This patch introduces a new system call syncfs(2) that takes an fd and
syncs only the file system it references. Maybe someday we can

    $ sync /some/path

and not get

    sync: ignoring all arguments

The syscall is motivated by comments by Al and Christoph at the last LSF.
syncfs(2) seems like an appropriate name given statfs(2).

A similar ioctl was also proposed a while back, see
http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2
Signed-off-by: Sage Weil
Signed-off-by: Al Viro -
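From userspace, the syncfs(2) call introduced by the commit above takes an open file descriptor and flushes only the file system that descriptor lives on. The sketch below is a minimal illustration and assumes kernel headers that define SYS_syncfs. It invokes the system call directly so that it does not rely on a libc wrapper being available. The helper name sync_fs_containing() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Flush only the file system that contains `path`.
 * Returns 0 on success and -1 on error.
 */
static int sync_fs_containing(const char *path)
{
	int fd = open(path, O_RDONLY);
	int ret;

	if (fd < 0)
		return -1;
	/* Raw system call: any fd on the target file system will do. */
	ret = (int)syscall(SYS_syncfs, fd);
	close(fd);
	return ret;
}
```

Calling sync_fs_containing("/") flushes the root file system without waiting on, say, a hung NFS mount elsewhere. No root privileges are needed, in line with sync(2).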
Hi,
I was backporting the coredump over pipe feature and noticed this small typo,
I wish I would have something bigger to contribute...

>From 15d6080e0ed4267da103c706917a33b1015e8804 Mon Sep 17 00:00:00 2001
From: Holger Hans Peter Freyther
Date: Thu, 24 Feb 2011 17:42:50 +0100
Subject: [PATCH] fs: Fix a small typo in the comment

The function is called umh_pipe_setup not uhm_pipe_setup.
Signed-off-by: Holger Hans Peter Freyther
Signed-off-by: Al Viro -
Fixed coding style issue.
Signed-off-by: David Jenni
Signed-off-by: Al Viro -
Signed-off-by: Ben Hutchings
Signed-off-by: Al Viro -
Remove the leftover from the commit 8ff3e8e85fa6 ("select:
switch select() and poll() over to hrtimers").
Signed-off-by: Namhyung Kim
Acked-by: Arjan van de Ven
Signed-off-by: Al Viro