09 Aug, 2014

1 commit


12 Jun, 2014

1 commit

  • iter_file_splice_write() - a ->splice_write() instance that gathers the
    pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
    it to ->write_iter(). A bunch of simple cases coverted to that...

    [AV: fixed the braino spotted by Cyrill]

    Signed-off-by: Al Viro

    Al Viro
     

07 May, 2014

2 commits


24 Jan, 2014

2 commits


22 Jan, 2014

1 commit

  • The ramfs is always built in. It will never be modular, so using
    module_init as an alias for __initcall is rather misleading.

    Fix this up now, so that we can relocate module_init from init.h into
    module.h in the future. If we don't do this, we'd have to add module.h
    to obviously non-modular code, and that would be a worse thing.

    Note that direct use of __initcall is discouraged, vs. one of the
    priority categorized subgroups. As __initcall gets mapped onto
    device_initcall, our use of fs_initcall (which makes sense for fs code)
    will thus change this registration from level 6-device to level 5-fs
    (i.e. slightly earlier). However no observable impact of that small
    difference has been observed during testing, or is expected.

    Also note that this change uncovers a missing semicolon bug in the
    registration of the initcall.

    Signed-off-by: Paul Gortmaker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Gortmaker
     

12 Sep, 2013

3 commits

  • When the rootfs code was a wrapper around ramfs, having them in the same
    file made sense. Now that it can wrap another filesystem type, move it in
    with the init code instead.

    This also allows a subsequent patch to access rootfstype= command line
    arg.

    Signed-off-by: Rob Landley
    Cc: Jeff Layton
    Cc: Jens Axboe
    Cc: Stephen Warren
    Cc: Rusty Russell
    Cc: Jim Cromie
    Cc: Sam Ravnborg
    Cc: Greg Kroah-Hartman
    Cc: "Eric W. Biederman"
    Cc: Alexander Viro
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rob Landley
     
  • Even though ramfs hasn't got a backing device, commit e0bf68ddec4f ("mm:
    bdi init hooks") added one anyway, and put the initialization in
    init_rootfs() since that's the first user, leaving it out of init_ramfs()
    to avoid duplication.

    But initmpfs uses init_tmpfs() instead, so move the init into the
    filesystem's init function, add a "once" guard to prevent duplicate
    initialization, and call the filesystem init from rootfs init.

    This goes part of the way to allowing ramfs to be built as a module.

    [akpm@linux-foundation.org; using bit 1 was odd]
    Signed-off-by: Rob Landley
    Cc: Jeff Layton
    Cc: Jens Axboe
    Cc: Stephen Warren
    Cc: Rusty Russell
    Cc: Jim Cromie
    Cc: Sam Ravnborg
    Cc: Greg Kroah-Hartman
    Cc: "Eric W. Biederman"
    Cc: Alexander Viro
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rob Landley
     
  • Mounting MS_NOUSER prevents --bind mounts from rootfs. Prevent new rootfs
    mounts with a different mechanism that doesn't affect bind mounts.

    Signed-off-by: Rob Landley
    Cc: Jeff Layton
    Cc: Jens Axboe
    Cc: Stephen Warren
    Cc: Rusty Russell
    Cc: Jim Cromie
    Cc: Sam Ravnborg
    Cc: Greg Kroah-Hartman
    Cc: "Eric W. Biederman"
    Cc: Alexander Viro
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rob Landley
     

27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

23 Feb, 2013

1 commit


27 Jan, 2013

1 commit

  • There is no backing store to ramfs and file creation
    rules are the same as for any other filesystem so
    it is semantically safe to allow unprivileged users
    to mount it.

    The memory control group successfully limits how much
    memory ramfs can consume on any system that cares about
    a user namespace root using ramfs to exhaust memory
    the memory control group can be deployed.

    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

14 Jul, 2012

1 commit


12 Jul, 2012

1 commit

  • There is a bug in the below scenario for !CONFIG_MMU:

    1. create a new file
    2. mmap the file and write to it
    3. read the file can't get the correct value

    Because

    sys_read() -> generic_file_aio_read() -> simple_readpage() -> clear_page()

    which causes the page to be zeroed.

    Add SetPageUptodate() to ramfs_nommu_expand_for_mapping() so that
    generic_file_aio_read() do not call simple_readpage().

    Signed-off-by: Bob Liu
    Cc: Hugh Dickins
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Greg Ungerer
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Liu
     

21 Mar, 2012

2 commits


04 Jan, 2012

4 commits


03 Nov, 2011

1 commit


15 Apr, 2011

1 commit

  • On no-mmu arch, there is a memleak during shmem test. The cause of this
    memleak is ramfs_nommu_expand_for_mapping() added page refcount to 2
    which makes iput() can't free that pages.

    The simple test file is like this:

    int main(void)
    {
    int i;
    key_t k = ftok("/etc", 42);

    for ( i=0; i free
    total used free shared buffers
    Mem: 60320 17912 42408 0 0
    -/+ buffers: 17912 42408
    root:/> shmem
    run ok...
    root:/> free
    total used free shared buffers
    Mem: 60320 19096 41224 0 0
    -/+ buffers: 19096 41224
    root:/> shmem
    run ok...
    root:/> free
    total used free shared buffers
    Mem: 60320 20296 40024 0 0
    -/+ buffers: 20296 40024
    ...

    After this patch the test result is:(no memleak anymore)

    root:/> free
    total used free shared buffers
    Mem: 60320 16668 43652 0 0
    -/+ buffers: 16668 43652
    root:/> shmem
    run ok...
    root:/> free
    total used free shared buffers
    Mem: 60320 16668 43652 0 0
    -/+ buffers: 16668 43652

    Signed-off-by: Bob Liu
    Acked-by: Hugh Dickins
    Signed-off-by: David Howells
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Liu
     

29 Oct, 2010

1 commit


26 Oct, 2010

1 commit

  • Instead of always assigning an increasing inode number in new_inode
    move the call to assign it into those callers that actually need it.
    For now callers that need it is estimated conservatively, that is
    the call is added to all filesystems that do not assign an i_ino
    by themselves. For a few more filesystems we can avoid assigning
    any inode number given that they aren't user visible, and for others
    it could be done lazily when an inode number is actually needed,
    but that's left for later patches.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Dave Chinner
    Signed-off-by: Al Viro

    Christoph Hellwig
     

10 Aug, 2010

2 commits

  • Make sure we check the truncate constraints early on in ->setattr by adding
    those checks to inode_change_ok. Also clean up and document inode_change_ok
    to make this obvious.

    As a fallout we don't have to call inode_newsize_ok from simple_setsize and
    simplify it down to a truncate_setsize which doesn't return an error. This
    simplifies a lot of setattr implementations and means we use truncate_setsize
    almost everywhere. Get rid of fat_setsize now that it's trivial and mark
    ext2_setsize static to make the calling convention obvious.

    Keep the inode_newsize_ok in vmtruncate for now as all callers need an
    audit for its removal anyway.

    Note: setattr code in ecryptfs doesn't call inode_change_ok at all and
    needs a deeper audit, but that is left for later.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Despite its name it's now a generic implementation of ->setattr, but
    rather a helper to copy attributes from a struct iattr to the inode.
    Rename it to setattr_copy to reflect this fact.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

28 May, 2010

2 commits

  • Convert simple filesystems: ramfs, configfs, sysfs, block_dev to new truncate
    sequence.

    Cc: Christoph Hellwig
    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    Nick Piggin
     
  • We don't name our generic fsync implementations very well currently.
    The no-op implementation for in-memory filesystems currently is called
    simple_sync_file which doesn't make too much sense to start with,
    the the generic one for simple filesystems is called simple_fsync
    which can lead to some confusion.

    This patch renames the generic file fsync method to generic_file_fsync
    to match the other generic_file_* routines it is supposed to be used
    with, and the no-op implementation to noop_fsync to make it obvious
    what to expect. In addition add some documentation for both methods.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

22 May, 2010

2 commits


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

17 Jan, 2010

2 commits

  • Fix a problem in NOMMU mmap with ramfs whereby a shared mmap can happen
    over the end of a truncation. The problem is that
    ramfs_nommu_check_mappings() checks that the reduced file size against the
    VMA tree, but not the vm_region tree.

    The following sequence of events can cause the problem:

    fd = open("/tmp/x", O_RDWR|O_TRUNC|O_CREAT, 0600);
    ftruncate(fd, 32 * 1024);
    a = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    b = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    munmap(a, 32 * 1024);
    ftruncate(fd, 16 * 1024);
    c = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

    Mapping 'a' creates a vm_region covering 32KB of the file. Mapping 'b'
    sees that the vm_region from 'a' is covering the region it wants and so
    shares it, pinning it in memory.

    Mapping 'a' then goes away and the file is truncated to the end of VMA
    'b'. However, the region allocated by 'a' is still in effect, and has
    _not_ been reduced.

    Mapping 'c' is then created, and because there's a vm_region covering the
    desired region, get_unmapped_area() is _not_ called to repeat the check,
    and the mapping is granted, even though the pages from the latter half of
    the mapping have been discarded.

    However:

    d = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

    Mapping 'd' should work, and should end up sharing the region allocated by
    'a'.

    To deal with this, we shrink the vm_region struct during the truncation,
    lest do_mmap_pgoff() take it as licence to share the full region
    automatically without calling the get_unmapped_area() file op again.

    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Greg Ungerer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Fix the race between the truncation of a ramfs file and an attempt to make
    a shared mmap of region of that file.

    The problem is that do_mmap_pgoff() calls f_op->get_unmapped_area() to
    verify that the file region is made of contiguous pages and to find its
    base address - but there isn't any locking to guarantee this region until
    vma_prio_tree_insert() is called by add_vma_to_mm().

    Note that moving the functionality into f_op->mmap() doesn't help as that
    is also called before vma_prio_tree_insert().

    Instead make ramfs_nommu_check_mappings() grab nommu_region_sem whilst it
    does its checks. This means that this function will wait whilst mmaps
    take place.

    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Greg Ungerer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

18 Dec, 2009

1 commit


24 Sep, 2009

1 commit

  • Update some fs code to make use of new helper functions introduced
    in the previous patch. Should be no significant change in behaviour
    (except CIFS now calls send_sig under i_lock, via inode_newsize_ok).

    Reviewed-by: Christoph Hellwig
    Acked-by: Miklos Szeredi
    Cc: linux-nfs@vger.kernel.org
    Cc: Trond.Myklebust@netapp.com
    Cc: linux-cifs-client@lists.samba.org
    Cc: sfrench@samba.org
    Signed-off-by: Nick Piggin
    Signed-off-by: Al Viro

    npiggin@suse.de
     

23 Sep, 2009

1 commit


11 Sep, 2009

1 commit


30 Jul, 2009

1 commit

  • This file makes use of various macros defined in files like asm/current.h
    or asm-generic/resource.h. All these files can be included via sched.h.
    The building of the !MMU ARM kernel (with additional patches) fails
    without this change.

    Signed-off-by: Catalin Marinas
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Catalin Marinas
     

15 Jun, 2009

1 commit

  • On systems where CONFIG_SHMEM is disabled, mounting tmpfs filesystems can
    fail when tmpfs options are used. This is because tmpfs creates a small
    wrapper around ramfs which rejects unknown options, and ramfs itself only
    supports a tiny subset of what tmpfs supports. This makes it pretty hard
    to use the same userspace systems across different configuration systems.
    As such, ramfs should ignore the tmpfs options when tmpfs is merely a
    wrapper around ramfs.

    This used to work before commit c3b1b1cbf0 as previously, ramfs would
    ignore all options. But now, we get:
    ramfs: bad mount option: size=10M
    mount: mounting mdev on /dev failed: Invalid argument

    Another option might be to restore the previous behavior, where ramfs
    simply ignored all unknown mount options ... which is what Hugh prefers.

    Signed-off-by: Mike Frysinger
    Signed-off-by: Hugh Dickins
    Acked-by: Matt Mackall
    Acked-by: Wu Fengguang
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Frysinger