11 Oct, 2016
2 commits
-
Pull more vfs updates from Al Viro:
">rename2() work from Miklos + current_time() from Deepa"* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Replace current_fs_time() with current_time()
fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
fs: Replace CURRENT_TIME with current_time() for inode timestamps
fs: proc: Delete inode time initializations in proc_alloc_inode()
vfs: Add current_time() api
vfs: add note about i_op->rename changes to porting
fs: rename "rename2" i_op to "rename"
vfs: remove unused i_op->rename
fs: make remaining filesystems use .rename2
libfs: support RENAME_NOREPLACE in simple_rename()
fs: support RENAME_NOREPLACE for local filesystems
ncpfs: fix unused variable warning
08 Oct, 2016
1 commit
28 Sep, 2016
1 commit
-
CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.CURRENT_TIME is also not y2038 safe.
This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_time() will be
extended to do range checks. Hence, it is necessary for all
file system timestamps to use current_time(). Also,
current_time() will be transitioned along with vfs to be
y2038 safe.Note that whenever a single call to current_time() is used
to change timestamps in different inodes, it is because they
share the same time granularity.Signed-off-by: Deepa Dinamani
Reviewed-by: Arnd Bergmann
Acked-by: Felipe Balbi
Acked-by: Steven Whitehouse
Acked-by: Ryusuke Konishi
Acked-by: David Sterba
Signed-off-by: Al Viro
27 Sep, 2016
2 commits
-
Generated patch:
sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2`
sed -i "s/\brename2\b/rename/g" `git grep -wl rename2`Signed-off-by: Miklos Szeredi
-
This is trivial to do:
- add flags argument to simple_rename()
- check if flags doesn't have any other than RENAME_NOREPLACE
- assign simple_rename() to .rename2 instead of .renameFilesystems converted:
hugetlbfs, ramfs, bpf.
Debugfs uses simple_rename() to implement debugfs_rename(), which is for
debugfs instances to rename files internally, not for userspace filesystem
access. For this case pass zero flags to simple_rename().Signed-off-by: Miklos Szeredi
Acked-by: Greg Kroah-Hartman
Cc: Alexei Starovoitov
22 Sep, 2016
1 commit
-
inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.Reviewed-by: Christoph Hellwig
Signed-off-by: Jan Kara
20 Sep, 2016
1 commit
-
Commit c01d5b300774 ("shmem: get_unmapped_area align huge page") makes
use of shm_get_unmapped_area() in shm_file_operations() unconditional to
CONFIG_MMU.As Tony Battersby pointed this can lead NULL-pointer dereference on
machine with CONFIG_MMU=y and CONFIG_SHMEM=n. In this case ipc/shm is
backed by ramfs which doesn't provide f_op->get_unmapped_area for
configurations with MMU.The solution is to provide dummy f_op->get_unmapped_area for ramfs when
CONFIG_MMU=y, which just call current->mm->get_unmapped_area().Fixes: c01d5b300774 ("shmem: get_unmapped_area align huge page")
Link: http://lkml.kernel.org/r/20160912102704.140442-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov
Reported-by: Tony Battersby
Tested-by: Tony Battersby
Cc: Hugh Dickins
Cc: [4.7.x]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 May, 2016
1 commit
-
The nommu do_mmap expects f_op->get_unmapped_area to either succeed or
return -ENOSYS for VM_MAYSHARE (e.g. private read-only) mappings.
Returning addr in the non-MAP_SHARED case was completely wrong, and only
happened to work because addr was 0. However, it prevented VM_MAYSHARE
mappings from sharing backing with the fs cache, and forced such
mappings (including shareable program text) to be copied whenever the
number of mappings transitioned from 0 to 1, impacting performance and
memory usage. Subsequent mappings beyond the first still correctly
shared memory with the first.Instead, treat VM_MAYSHARE identically to VM_SHARED at the file ops level;
do_mmap already handles the semantic differences between them.Signed-off-by: Rich Felker
Cc: Michal Hocko
Cc: Greg Ungerer
Cc: Geert Uytterhoeven
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
05 Apr, 2016
1 commit
-
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.This promise never materialized. And unlikely will.
We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.Let's stop pretending that pages in page cache are special. They are
not.The changes are pretty straight-forward:
- << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ;
- >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ;
- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
- page_cache_get() -> get_page();
- page_cache_release() -> put_page();
This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.virtual patch
@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)Signed-off-by: Kirill A. Shutemov
Acked-by: Michal Hocko
Signed-off-by: Linus Torvalds
09 Dec, 2015
1 commit
-
kmap() in page_follow_link_light() needed to go - allowing to hold
an arbitrary number of kmaps for long is a great way to deadlocking
the system.new helper (inode_nohighmem(inode)) needs to be used for pagecache
symlinks inodes; done for all in-tree cases. page_follow_link_light()
instrumented to yell about anything missed.Signed-off-by: Al Viro
17 Oct, 2015
1 commit
-
Commit 6afdb859b710 ("mm: do not ignore mapping_gfp_mask in page cache
allocation paths") has caught some users of hardcoded GFP_KERNEL used in
the page cache allocation paths. This, however, wasn't complete and
there were others which went unnoticed.Dave Chinner has reported the following deadlock for xfs on loop device:
: With the recent merge of the loop device changes, I'm now seeing
: XFS deadlock on my single CPU, 1GB RAM VM running xfs/073.
:
: The deadlocked is as follows:
:
: kloopd1: loop_queue_read_work
: xfs_file_iter_read
: lock XFS inode XFS_IOLOCK_SHARED (on image file)
: page cache read (GFP_KERNEL)
: radix tree alloc
: memory reclaim
: reclaim XFS inodes
: log force to unpin inodes
:
:
: xfs-cil/loop1:
: xlog_cil_push
: xlog_write
:
: xlog_state_get_iclog_space()
:
:
:
: kloopd1: loop_queue_write_work
: xfs_file_write_iter
: lock XFS inode XFS_IOLOCK_EXCL (on image file)
:
:
: i.e. the kloopd, with it's split read and write work queues, has
: introduced a dependency through memory reclaim. i.e. that writes
: need to be able to progress for reads make progress.
:
: The problem, fundamentally, is that mpage_readpages() does a
: GFP_KERNEL allocation, rather than paying attention to the inode's
: mapping gfp mask, which is set to GFP_NOFS.
:
: The didn't used to happen, because the loop device used to issue
: reads through the splice path and that does:
:
: error = add_to_page_cache_lru(page, mapping, index,
: GFP_KERNEL & mapping_gfp_mask(mapping));This has changed by commit aa4d86163e4 ("block: loop: switch to VFS
ITER_BVEC").This patch changes mpage_readpage{s} to follow gfp mask set for the
mapping. There are, however, other places which are doing basically the
same.lustre:ll_dir_filler is doing GFP_KERNEL from the function which
apparently uses GFP_NOFS for other allocations so let's make this
consistent.cifs:readpages_get_pages is called from cifs_readpages and
__cifs_readpages_from_fscache called from the same path obeys mapping
gfp.ramfs_nommu_expand_for_mapping is hardcoding GFP_KERNEL as well
regardless it uses mapping_gfp_mask for the page allocation.ext4_mpage_readpages is the called from the page cache allocation path
same as read_pages and read_cache_pagesAs I've noticed in my previous post I cannot say I would be happy about
sprinkling mapping_gfp_mask all over the place and it sounds like we
should drop gfp_mask argument altogether and use it internally in
__add_to_page_cache_locked that would require all the filesystems to use
mapping gfp consistently which I am not sure is the case here. From a
quick glance it seems that some file system use it all the time while
others are selective.Signed-off-by: Michal Hocko
Reported-by: Dave Chinner
Cc: "Theodore Ts'o"
Cc: Ming Lei
Cc: Andreas Dilger
Cc: Oleg Drokin
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
16 Apr, 2015
1 commit
-
that's the bulk of filesystem drivers dealing with inodes of their own
Signed-off-by: David Howells
Signed-off-by: Al Viro
12 Apr, 2015
1 commit
-
All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.Signed-off-by: Al Viro
21 Jan, 2015
2 commits
-
Now that we never use the backing_dev_info pointer in struct address_space
we can simply remove it and save 4 to 8 bytes in every inode.Signed-off-by: Christoph Hellwig
Acked-by: Ryusuke Konishi
Reviewed-by: Tejun Heo
Reviewed-by: Jan Kara
Signed-off-by: Jens Axboe -
Since "BDI: Provide backing device capability information [try #3]" the
backing_dev_info structure also provides flags for the kind of mmap
operation available in a nommu environment, which is entirely unrelated
to it's original purpose.Introduce a new nommu-only file operation to provide this information to
the nommu mmap code instead. Splitting this from the backing_dev_info
structure allows to remove lots of backing_dev_info instance that aren't
otherwise needed, and entirely gets rid of the concept of providing a
backing_dev_info for a character device. It also removes the need for
the mtd_inodefs filesystem.Signed-off-by: Christoph Hellwig
Reviewed-by: Tejun Heo
Acked-by: Brian Norris
Signed-off-by: Jens Axboe
09 Aug, 2014
1 commit
-
Signed-off-by: Fabian Frederick
Cc: Axel Lin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Jun, 2014
1 commit
-
iter_file_splice_write() - a ->splice_write() instance that gathers the
pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
it to ->write_iter(). A bunch of simple cases coverted to that...[AV: fixed the braino spotted by Cyrill]
Signed-off-by: Al Viro
07 May, 2014
2 commits
-
Signed-off-by: Al Viro
-
Signed-off-by: Al Viro
24 Jan, 2014
2 commits
-
ramfs_aops is identical in file-mmu.c and file-nommu.c. Thus move it to
fs/ramfs/inode.c and make it static.Signed-off-by: Axel Lin
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Since commit 853ac43ab194 ("shmem: unify regular and tiny shmem"),
ramfs_nommu_get_unmapped_area() and ramfs_nommu_mmap() are not directly
referenced outside of file-nommu.c. Thus make them static.Signed-off-by: Axel Lin
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
22 Jan, 2014
1 commit
-
The ramfs is always built in. It will never be modular, so using
module_init as an alias for __initcall is rather misleading.Fix this up now, so that we can relocate module_init from init.h into
module.h in the future. If we don't do this, we'd have to add module.h
to obviously non-modular code, and that would be a worse thing.Note that direct use of __initcall is discouraged, vs. one of the
priority categorized subgroups. As __initcall gets mapped onto
device_initcall, our use of fs_initcall (which makes sense for fs code)
will thus change this registration from level 6-device to level 5-fs
(i.e. slightly earlier). However no observable impact of that small
difference has been observed during testing, or is expected.Also note that this change uncovers a missing semicolon bug in the
registration of the initcall.Signed-off-by: Paul Gortmaker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
12 Sep, 2013
3 commits
-
When the rootfs code was a wrapper around ramfs, having them in the same
file made sense. Now that it can wrap another filesystem type, move it in
with the init code instead.This also allows a subsequent patch to access rootfstype= command line
arg.Signed-off-by: Rob Landley
Cc: Jeff Layton
Cc: Jens Axboe
Cc: Stephen Warren
Cc: Rusty Russell
Cc: Jim Cromie
Cc: Sam Ravnborg
Cc: Greg Kroah-Hartman
Cc: "Eric W. Biederman"
Cc: Alexander Viro
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Even though ramfs hasn't got a backing device, commit e0bf68ddec4f ("mm:
bdi init hooks") added one anyway, and put the initialization in
init_rootfs() since that's the first user, leaving it out of init_ramfs()
to avoid duplication.But initmpfs uses init_tmpfs() instead, so move the init into the
filesystem's init function, add a "once" guard to prevent duplicate
initialization, and call the filesystem init from rootfs init.This goes part of the way to allowing ramfs to be built as a module.
[akpm@linux-foundation.org; using bit 1 was odd]
Signed-off-by: Rob Landley
Cc: Jeff Layton
Cc: Jens Axboe
Cc: Stephen Warren
Cc: Rusty Russell
Cc: Jim Cromie
Cc: Sam Ravnborg
Cc: Greg Kroah-Hartman
Cc: "Eric W. Biederman"
Cc: Alexander Viro
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Mounting MS_NOUSER prevents --bind mounts from rootfs. Prevent new rootfs
mounts with a different mechanism that doesn't affect bind mounts.Signed-off-by: Rob Landley
Cc: Jeff Layton
Cc: Jens Axboe
Cc: Stephen Warren
Cc: Rusty Russell
Cc: Jim Cromie
Cc: Sam Ravnborg
Cc: Greg Kroah-Hartman
Cc: "Eric W. Biederman"
Cc: Alexander Viro
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
27 Feb, 2013
1 commit
-
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
23 Feb, 2013
1 commit
-
Signed-off-by: Al Viro
27 Jan, 2013
1 commit
-
There is no backing store to ramfs and file creation
rules are the same as for any other filesystem so
it is semantically safe to allow unprivileged users
to mount it.The memory control group successfully limits how much
memory ramfs can consume on any system that cares about
a user namespace root using ramfs to exhaust memory
the memory control group can be deployed.Signed-off-by: "Eric W. Biederman"
14 Jul, 2012
1 commit
-
boolean "does it have to be exclusive?" flag is passed instead;
Local filesystem should just ignore it - the object is guaranteed
not to be there yet.Signed-off-by: Al Viro
12 Jul, 2012
1 commit
-
There is a bug in the below scenario for !CONFIG_MMU:
1. create a new file
2. mmap the file and write to it
3. read the file can't get the correct valueBecause
sys_read() -> generic_file_aio_read() -> simple_readpage() -> clear_page()
which causes the page to be zeroed.
Add SetPageUptodate() to ramfs_nommu_expand_for_mapping() so that
generic_file_aio_read() do not call simple_readpage().Signed-off-by: Bob Liu
Cc: Hugh Dickins
Cc: David Howells
Cc: Geert Uytterhoeven
Cc: Greg Ungerer
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 Mar, 2012
2 commits
-
Signed-off-by: Al Viro
-
Signed-off-by: Al Viro
04 Jan, 2012
4 commits
-
Signed-off-by: Al Viro
-
Signed-off-by: Al Viro
-
vfs_create() ignores everything outside of 16bit subset of its
mode argument; switching it to umode_t is obviously equivalent
and it's the only caller of the methodSigned-off-by: Al Viro
-
vfs_mkdir() gets int, but immediately drops everything that might not
fit into umode_t and that's the only caller of ->mkdir()...Signed-off-by: Al Viro
03 Nov, 2011
1 commit
-
Since ramfs is hard-selected to "y", the module leftovers make no sense.
Signed-off-by: Richard Weinberger
Reviewed-by: WANG Cong
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
15 Apr, 2011
1 commit
-
On no-mmu arch, there is a memleak during shmem test. The cause of this
memleak is ramfs_nommu_expand_for_mapping() added page refcount to 2
which makes iput() can't free that pages.The simple test file is like this:
int main(void)
{
int i;
key_t k = ftok("/etc", 42);for ( i=0; i free
total used free shared buffers
Mem: 60320 17912 42408 0 0
-/+ buffers: 17912 42408
root:/> shmem
run ok...
root:/> free
total used free shared buffers
Mem: 60320 19096 41224 0 0
-/+ buffers: 19096 41224
root:/> shmem
run ok...
root:/> free
total used free shared buffers
Mem: 60320 20296 40024 0 0
-/+ buffers: 20296 40024
...After this patch the test result is:(no memleak anymore)
root:/> free
total used free shared buffers
Mem: 60320 16668 43652 0 0
-/+ buffers: 16668 43652
root:/> shmem
run ok...
root:/> free
total used free shared buffers
Mem: 60320 16668 43652 0 0
-/+ buffers: 16668 43652Signed-off-by: Bob Liu
Acked-by: Hugh Dickins
Signed-off-by: David Howells
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
29 Oct, 2010
1 commit
-
Signed-off-by: Al Viro