05 Sep, 2018

1 commit

  • Remove the verbose license text from NILFS2 files and replace it with
    SPDX tags. This does not change the license of any of the code.

    Link: http://lkml.kernel.org/r/1535624528-5982-1-git-send-email-konishi.ryusuke@lab.ntt.co.jp
    Signed-off-by: Ryusuke Konishi
    Reviewed-by: Andrew Morton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     

23 Aug, 2018

1 commit

  • Use new return type vm_fault_t for page_mkwrite handler.

    Link: http://lkml.kernel.org/r/1529555928-2411-1-git-send-email-konishi.ryusuke@lab.ntt.co.jp
    Signed-off-by: Souptick Joarder
    Signed-off-by: Ryusuke Konishi
    Reviewed-by: Matthew Wilcox
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Souptick Joarder
     

25 Feb, 2017

1 commit

  • ->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to
    take a vma and vmf parameter when the vma already resides in vmf.

    Remove the vma parameter to simplify things.

    [arnd@arndb.de: fix ARM build]
    Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
    Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
    Signed-off-by: Dave Jiang
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Ross Zwisler
    Cc: Theodore Ts'o
    Cc: Darrick J. Wong
    Cc: Matthew Wilcox
    Cc: Dave Hansen
    Cc: Christoph Hellwig
    Cc: Jan Kara
    Cc: Dan Williams
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jiang
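
The signature change in the commit above can be pictured with a small user-space model. This is only a sketch: the two structures are cut-down stand-ins for the kernel's `struct vm_area_struct` and `struct vm_fault`, and the `demo_fault_old()` / `demo_fault_new()` handlers are hypothetical names used for illustration, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (assumption: only the
 * fields needed for this illustration are modeled). */
struct vm_area_struct {
	unsigned long vm_start;
	unsigned long vm_end;
};

struct vm_fault {
	struct vm_area_struct *vma;	/* the faulting VMA travels with the fault */
	unsigned long address;
};

/* Old style: the VMA is passed separately although vmf->vma names it too. */
static int demo_fault_old(struct vm_area_struct *vma, struct vm_fault *vmf)
{
	return vmf->address >= vma->vm_start && vmf->address < vma->vm_end;
}

/* New style: a single parameter; the handler reads the VMA from vmf->vma. */
static int demo_fault_new(struct vm_fault *vmf)
{
	struct vm_area_struct *vma = vmf->vma;

	assert(vma != NULL);
	return vmf->address >= vma->vm_start && vmf->address < vma->vm_end;
}
```

Both handlers compute the same answer; the second one simply has no redundant argument that could disagree with `vmf->vma`.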
     

24 May, 2016

2 commits

  • E-mail addresses in the osrg.net domain are no longer available. This
    removes them from authorship notices so that reporters are not
    confused.

    Link: http://lkml.kernel.org/r/1461935747-10380-5-git-send-email-konishi.ryusuke@lab.ntt.co.jp
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This removes the extra paragraph that mentions the FSF address in the GPL
    notices in the nilfs2 source code and avoids the checkpatch.pl error
    related to it.

    Link: http://lkml.kernel.org/r/1461935747-10380-4-git-send-email-konishi.ryusuke@lab.ntt.co.jp
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     

11 Nov, 2015

1 commit

  • The function currently called "__block_page_mkwrite()" used to be called
    "block_page_mkwrite()" until a wrapper for this function was added by:

    commit 24da4fab5a61 ("vfs: Create __block_page_mkwrite() helper passing
    error values back")

    This wrapper, the current "block_page_mkwrite()", is currently unused.
    __block_page_mkwrite() is used directly by ext4, nilfs2 and xfs.

    Remove the unused wrapper, rename __block_page_mkwrite() back to
    block_page_mkwrite() and update the comment above block_page_mkwrite().

    Signed-off-by: Ross Zwisler
    Reviewed-by: Jan Kara
    Cc: Jan Kara
    Cc: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Al Viro

    Ross Zwisler
     

12 Apr, 2015

1 commit

  • All places outside of core VFS that checked ->read and ->write for being NULL or
    called the methods directly are gone now, so NULL {read,write} with non-NULL
    {read,write}_iter will do the right thing in all cases.

    Signed-off-by: Al Viro

    Al Viro
     

11 Feb, 2015

1 commit


11 Dec, 2014

1 commit

  • This patch removes filemap_write_and_wait_range() from nilfs_sync_file(),
    because it triggers a data segment construction by calling
    nilfs_writepages() with WB_SYNC_ALL. A data segment construction does not
    remove the inode from the i_dirty list and it does not clear the
    NILFS_I_DIRTY flag. Therefore nilfs_inode_dirty() still returns true,
    which leads to an unnecessary duplicate segment construction in
    nilfs_sync_file().

    A call to filemap_write_and_wait_range() is not needed, because NILFS2
    does not rely on the generic writeback mechanisms. Instead it implements
    its own mechanism to collect all dirty pages and write them into segments.
    It is more efficient to initiate the segment construction directly in
    nilfs_sync_file() without the detour over filemap_write_and_wait_range().

    Additionally the lock of i_mutex is not needed, because all code blocks
    that are protected by i_mutex are also protected by a NILFS transaction:

    Function               i_mutex  nilfs_transaction
    ------------------------------------------------------
    nilfs_ioctl_setflags:  yes      yes
    nilfs_fiemap:          yes      no
    nilfs_write_begin:     yes      yes
    nilfs_write_end:       yes      yes
    nilfs_lookup:          yes      no
    nilfs_create:          yes      yes
    nilfs_link:            yes      yes
    nilfs_mknod:           yes      yes
    nilfs_symlink:         yes      yes
    nilfs_mkdir:           yes      yes
    nilfs_unlink:          yes      yes
    nilfs_rmdir:           yes      yes
    nilfs_rename:          yes      yes
    nilfs_setattr:         yes      yes

    For nilfs_lookup() i_mutex is held for the parent directory, to protect it
    from modification. The segment construction does not modify directory
    inodes, so no lock is needed.

    nilfs_fiemap() reads the block layout on the disk, by using
    nilfs_bmap_lookup_contig(). This is already protected by bmap->b_sem.

    Signed-off-by: Andreas Rohner
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Rohner
     

14 Oct, 2014

1 commit

  • Under normal circumstances nilfs_sync_fs() writes out the super block,
    which causes a flush of the underlying block device. But this depends
    on the THE_NILFS_SB_DIRTY flag, which is only set if the pointer to the
    last segment crosses a segment boundary. So if only a small amount of
    data is written before the call to nilfs_sync_fs(), no flush of the
    block device occurs.

    In the above case an additional call to blkdev_issue_flush() is needed.
    To prevent unnecessary overhead, the new flag nilfs->ns_flushed_device
    is introduced, which is cleared whenever new logs are written and set
    whenever the block device is flushed. For convenience the function
    nilfs_flush_device() is added, which contains the above logic.

    Signed-off-by: Andreas Rohner
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Rohner
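
The flag logic described in the commit above can be sketched as a small user-space model. It is not the kernel code: `struct demo_fs`, `demo_write_log()` and `demo_flush_device()` are hypothetical names, the counter merely stands in for a call to blkdev_issue_flush(), and mount options, memory ordering and error handling are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-filesystem state. */
struct demo_fs {
	bool flushed_device;		/* models nilfs->ns_flushed_device */
	unsigned int flush_count;	/* how many flushes were actually issued */
};

/* Writing new logs means the device cache may hold unflushed data. */
static void demo_write_log(struct demo_fs *fs)
{
	assert(fs != NULL);
	fs->flushed_device = false;
}

/* Issue a flush only if something was written since the last one. */
static void demo_flush_device(struct demo_fs *fs)
{
	assert(fs != NULL);
	if (fs->flushed_device)
		return;			/* nothing new since the last flush */
	fs->flushed_device = true;
	fs->flush_count++;		/* stands in for blkdev_issue_flush() */
}
```

In this model, repeated sync calls without intervening writes cost a single flush, which is the overhead the flag is meant to avoid.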
     

07 May, 2014

2 commits


08 Apr, 2014

1 commit

  • filemap_map_pages() is a generic implementation of ->map_pages() for
    filesystems that use the page cache.

    It should be safe to use filemap_map_pages() for ->map_pages() if the
    filesystem uses filemap_fault() for ->fault().

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Linus Torvalds
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Andi Kleen
    Cc: Matthew Wilcox
    Cc: Dave Hansen
    Cc: Alexander Viro
    Cc: Dave Chinner
    Cc: Ning Qu
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

23 Feb, 2013

1 commit


22 Feb, 2013

1 commit

  • Create a helper function to check if a backing device requires stable
    page writes and, if so, perform the necessary wait. Then, make it so
    that all points in the memory manager that handle making pages writable
    use the helper function. This should provide stable page write support
    to most filesystems, while eliminating unnecessary waiting for devices
    that don't require the feature.

    Before this patchset, all filesystems would block, regardless of whether
    or not it was necessary. ext3 would wait, but still generate occasional
    checksum errors. The network filesystems were left to do their own
    thing, so they'd wait too.

    After this patchset, all the disk filesystems except ext3 and btrfs will
    wait only if the hardware requires it. ext3 (if necessary) snapshots
    pages instead of blocking, and btrfs provides its own bdi so the mm will
    never wait. Network filesystems haven't been touched, so either they
    provide their own stable page guarantees or they don't block at all.
    The blocking behavior is back to what it was before 3.0 if you don't
    have a disk requiring stable page writes.

    Here's the result of using dbench to test latency on ext2:

    3.8.0-rc3:
     Operation      Count    AvgLat    MaxLat
     ----------------------------------------
     WriteX        109347     0.028    59.817
     ReadX         347180     0.004     3.391
     Flush          15514    29.828   287.283

    Throughput 57.429 MB/sec 4 clients 4 procs max_latency=287.290 ms

    3.8.0-rc3 + patches:
     WriteX        105556     0.029     4.273
     ReadX         335004     0.005     4.112
     Flush          14982    30.540   298.634

    Throughput 55.4496 MB/sec 4 clients 4 procs max_latency=298.650 ms

    As you can see, the maximum write latency drops considerably with this
    patch enabled. The other filesystems (ext3/ext4/xfs/btrfs) behave
    similarly, but see the cover letter for those results.

    Signed-off-by: Darrick J. Wong
    Acked-by: Steven Whitehouse
    Reviewed-by: Jan Kara
    Cc: Adrian Hunter
    Cc: Andy Lutomirski
    Cc: Artem Bityutskiy
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Jens Axboe
    Cc: Eric Van Hensbergen
    Cc: Ron Minnich
    Cc: Latchesar Ionkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Darrick J. Wong
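
The helper described in the commit above boils down to "wait for writeback only if the backing device asks for stable pages". A minimal user-space sketch of that decision follows, assuming hypothetical `demo_bdi` and `demo_page` structures; clearing `under_writeback` models the wait completing, and the return value reports whether the caller had to wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the backing device information. */
struct demo_bdi {
	bool stable_pages_required;
};

/* Hypothetical stand-in for a page cache page. */
struct demo_page {
	struct demo_bdi *bdi;
	bool under_writeback;
};

/* Returns true if the caller had to wait for writeback to finish. */
static bool demo_wait_for_stable_page(struct demo_page *page)
{
	assert(page != NULL && page->bdi != NULL);

	if (!page->bdi->stable_pages_required)
		return false;		/* device does not need stable pages */
	if (!page->under_writeback)
		return false;		/* nothing in flight, nothing to wait for */

	page->under_writeback = false;	/* models the writeback completing */
	return true;
}
```

On a device that does not require stable pages the function returns at once even while the page is under writeback, which corresponds to the reduced write latency reported above.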
     

21 Dec, 2012

1 commit


09 Oct, 2012

1 commit

  • Move actual pte filling for non-linear file mappings into the new special
    vma operation: ->remap_pages().

    A filesystem must implement this method to get non-linear mapping support;
    if it uses filemap_fault(), then generic_file_remap_pages() can be used.

    Now device drivers can implement this method and obtain nonlinear vma support.

    Signed-off-by: Konstantin Khlebnikov
    Cc: Alexander Viro
    Cc: Carsten Otte
    Cc: Chris Metcalf #arch/tile
    Cc: Cyrill Gorcunov
    Cc: Eric Paris
    Cc: H. Peter Anvin
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: James Morris
    Cc: Jason Baron
    Cc: Kentaro Takeda
    Cc: Matt Helsley
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Suresh Siddha
    Cc: Tetsuo Handa
    Cc: Venkatesh Pallipadi
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

01 Oct, 2012

1 commit

  • Commits 5e8830dc85d0 and 41c4d25f78c0 introduced a regression into
    v3.6-rc1 for ext4 in nodelalloc mode, such that mtime updates would not
    take place for files modified via mmap if the page was already in the
    page cache. This would also affect ext3 file systems mounted using
    the ext4 file system driver.

    The problem was that ext4_page_mkwrite() had a shortcut which would
    avoid calling __block_page_mkwrite() under some circumstances, and the
    above two commits transferred the responsibility of calling
    file_update_time() to __block_page_mkwrite(), which wouldn't get
    called in some circumstances.

    Since __block_page_mkwrite() only has three callers,
    block_page_mkwrite(), ext4_page_mkwrite(), and nilfs_page_mkwrite(), the
    best way to solve this is to move the responsibility for calling
    file_update_time() to its callers.

    This problem was found via xfstests #215 with a file system mounted
    with -o nodelalloc.

    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: Jan Kara
    Cc: KONISHI Ryusuke
    Cc: stable@vger.kernel.org

    Theodore Ts'o
     

31 Jul, 2012

1 commit

  • We change nilfs_page_mkwrite() to provide proper freeze protection for
    writeable page faults (we must wait for frozen filesystem even if the
    page is fully mapped).

    We remove all vfs_check_frozen() checks since they are now handled by
    the generic code.

    CC: linux-nilfs@vger.kernel.org
    CC: KONISHI Ryusuke
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     

01 Jun, 2012

1 commit

  • There are two cases in which a cache flush is needed to avoid data loss
    on an unexpected hang or power failure. One is the sync file function
    (i.e. nilfs_sync_file) and the other is the checkpointing ioctl.

    This issues a cache flush request to the device in these cases if the
    barrier mount option is enabled, and makes sure that the data really is
    on persistent storage when they complete.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     

21 Jul, 2011

1 commit

  • Btrfs needs to be able to control how filemap_write_and_wait_range() is called
    in fsync to make it less of a painful operation, so push the taking of i_mutex
    and the call to filemap_write_and_wait() down into the ->fsync() handlers. Some
    file systems, such as ext3 and ocfs2, can apparently drop taking i_mutex
    altogether. For correctness' sake I just pushed everything down in all cases to
    make sure that the current behavior stays the same for everybody; each
    individual fs maintainer can then decide what to do from there.

    Acked-by: Jan Kara
    Signed-off-by: Josef Bacik
    Signed-off-by: Al Viro

    Josef Bacik
     

10 May, 2011

1 commit

  • Previously, nilfs cloned pages of mmapped regions to freeze their
    data and ensure checksum consistency during writeback cycles. A
    private page allocator was used for this page cloning. But we no
    longer need to do that, since the clear_page_dirty_for_io function sets up
    the pte so that the vm_ops->page_mkwrite function is called right before
    the mmapped pages are modified, and the nilfs_page_mkwrite function can
    safely wait for the pages to be written back to disk.

    So, this stops making a copy of mmapped pages during writeback, and
    eliminates the private page allocation and deallocation functions from
    nilfs.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     

30 Mar, 2011

1 commit

  • A functional test of mmap showed that mmap writes to shared pages
    are broken for hole blocks. The filled blocks are not written out
    and the data is lost after umount. This is due to a bug in which
    the target file is not queued for the log writer when hole blocks
    are filled.

    Also, the nilfs_page_mkwrite function leaves the normal code path even
    after successfully filling hole blocks, due to a change in the
    block_page_mkwrite function: just after nilfs was merged into the mainline,
    block_page_mkwrite() started to return VM_FAULT_LOCKED instead of zero
    with the patch "mm: close page_mkwrite races" (commit:
    b827e496c893de0c). The current nilfs_page_mkwrite() does not handle
    this value properly.

    This corrects nilfs_page_mkwrite() and will resolve the data loss
    problem in mmap write.

    [This should be applied to every kernel since 2.6.30 but a fix is
    needed for 2.6.37 and prior kernels]

    Signed-off-by: Ryusuke Konishi
    Tested-by: Ryusuke Konishi
    Cc: stable [2.6.38]

    Ryusuke Konishi
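
The return-value problem described in the commit above can be illustrated with a user-space model. Everything here is an assumption made for illustration: the `DEMO_VM_FAULT_*` constants are placeholder values, `demo_block_page_mkwrite()` is a stand-in for the helper's new convention, and the two checks only contrast "nonzero means failure" with "test the error bits". The actual fix in nilfs_page_mkwrite() may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder values for illustration only. */
#define DEMO_VM_FAULT_SIGBUS 0x0002u
#define DEMO_VM_FAULT_LOCKED 0x0200u
#define DEMO_VM_FAULT_ERROR  DEMO_VM_FAULT_SIGBUS

/* Stand-in for the helper after the convention change: success is
 * reported as DEMO_VM_FAULT_LOCKED rather than 0. */
static unsigned int demo_block_page_mkwrite(bool fail)
{
	return fail ? DEMO_VM_FAULT_SIGBUS : DEMO_VM_FAULT_LOCKED;
}

/* Caller written for the old convention: any nonzero value is a failure,
 * so a successful DEMO_VM_FAULT_LOCKED is mistaken for an error. */
static bool demo_is_success_old(unsigned int ret)
{
	return ret == 0;
}

/* Caller that tests the error bits instead of comparing against zero. */
static bool demo_is_success_fixed(unsigned int ret)
{
	return (ret & DEMO_VM_FAULT_ERROR) == 0;
}
```

With the old check, a successful hole fill would be treated as a failure and the follow-up work skipped, which matches the symptom described in the commit message.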
     

09 Mar, 2011

1 commit

  • This directly uses sb->s_fs_info to keep a nilfs filesystem object and
    fully removes the intermediate nilfs_sb_info structure. With this
    change, the hierarchy of on-memory structures of nilfs will be
    simplified as follows:

    Before:
      super_block
        -> nilfs_sb_info
             -> the_nilfs
                  -> cptree --+-> nilfs_root (current file system)
                              +-> nilfs_root (snapshot A)
                              +-> nilfs_root (snapshot B)
                              :
             -> nilfs_sc_info (log writer structure)
    After:
      super_block
        -> the_nilfs
             -> cptree --+-> nilfs_root (current file system)
                         +-> nilfs_root (snapshot A)
                         +-> nilfs_root (snapshot B)
                         :
             -> nilfs_sc_info (log writer structure)

    The reason we didn't design it this way from the beginning is that the
    initial shape also differed from the above. The early hierarchy was
    composed of "per-mount-point" super_block -> nilfs_sb_info pairs and a
    shared nilfs object. In kernel 2.6.37, it was changed to the
    current shape in order to unify super block instances into one per
    device, and this cleanup became applicable as a result.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     

08 Mar, 2011

1 commit

  • The current FS_IOC_GETFLAGS/SETFLAGS/GETVERSION will fail if the
    application is 32-bit and the kernel is 64-bit.

    This issue is avoidable by adding a compat_ioctl method.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
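
Why a 32-bit application fails here, and what a compat handler has to do, can be sketched in user space. The macros below are assumptions modeled on the common Linux ioctl encoding (direction, argument size, type, number); they are local to this example and are not the real kernel headers. The argument size is part of the command number, so a command built with a 4-byte argument differs from the one a 64-bit kernel expects, and a compat handler can map one onto the other. `demo_compat_translate()` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative encoding (assumption): dir | size | type | nr. */
#define DEMO_IOC_WRITE 1u
#define DEMO_IOC_READ  2u
#define DEMO_IOC(dir, type, nr, size) \
	(((uint32_t)(dir) << 30) | ((uint32_t)(size) << 16) | \
	 ((uint32_t)(type) << 8) | (uint32_t)(nr))

/* Commands as a 64-bit kernel would define them (8-byte argument). */
#define DEMO_GETFLAGS64   DEMO_IOC(DEMO_IOC_READ,  'f', 1, sizeof(int64_t))
#define DEMO_SETFLAGS64   DEMO_IOC(DEMO_IOC_WRITE, 'f', 2, sizeof(int64_t))
#define DEMO_GETVERSION64 DEMO_IOC(DEMO_IOC_READ,  'v', 1, sizeof(int64_t))

/* The same commands as a 32-bit application would issue them. */
#define DEMO_GETFLAGS32   DEMO_IOC(DEMO_IOC_READ,  'f', 1, sizeof(int32_t))
#define DEMO_SETFLAGS32   DEMO_IOC(DEMO_IOC_WRITE, 'f', 2, sizeof(int32_t))
#define DEMO_GETVERSION32 DEMO_IOC(DEMO_IOC_READ,  'v', 1, sizeof(int32_t))

/* Map a 32-bit command number onto its native counterpart; unknown
 * commands are passed through unchanged. */
static uint32_t demo_compat_translate(uint32_t cmd)
{
	switch (cmd) {
	case DEMO_GETFLAGS32:
		return DEMO_GETFLAGS64;
	case DEMO_SETFLAGS32:
		return DEMO_SETFLAGS64;
	case DEMO_GETVERSION32:
		return DEMO_GETVERSION64;
	default:
		return cmd;
	}
}
```

After translation, the request can be handed to the same handler that serves native callers.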
     

10 Jan, 2011

1 commit

  • This adds fiemap to nilfs. Two new functions, nilfs_fiemap and
    nilfs_find_uncommitted_extent are added.

    nilfs_fiemap() implements the fiemap inode operation, and
    nilfs_find_uncommitted_extent() helps to get a range of data blocks
    whose physical location has not been determined.

    nilfs_fiemap() collects extent information by looping through
    nilfs_bmap_lookup_contig and nilfs_find_uncommitted_extent routines.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     

28 May, 2010

1 commit


02 Oct, 2009

1 commit


28 Sep, 2009

1 commit


22 Sep, 2009

1 commit


07 Apr, 2009

5 commits

  • Pekka Enberg suggested converting ->ioctl operations to use
    ->unlocked_ioctl to avoid BKL.

    The conversion was verified to be safe, so I will take it on this
    occasion.

    Cc: Pekka Enberg
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This removes compat code from the nilfs ioctls and applies the same
    function for both .ioctl and .compat_ioctl file operations.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • Chris Mason pointed out that there is a missed sync issue in
    nilfs_writepages():

    On Wed, 17 Dec 2008 21:52:55 -0500, Chris Mason wrote:
    > It looks like nilfs_writepage ignores WB_SYNC_NONE, which is used by
    > do_sync_mapping_range().

    where WB_SYNC_NONE in do_sync_mapping_range() was replaced with
    WB_SYNC_ALL by Nick's patch (commit:
    ee53a891f47444c53318b98dac947ede963db400).

    This fixes the problem by letting nilfs_writepages() write out the log of
    file data within the range if sync_mode is WB_SYNC_ALL.

    This involves removal of nilfs_file_aio_write() which was previously
    needed to ensure O_SYNC sync writes.

    Cc: Chris Mason
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds the segment constructor (also called log writer).

    The segment constructor collects dirty buffers for every dirty inode,
    makes summaries of the buffers, assigns disk block addresses to the
    buffers, and then submits BIOs for the buffers.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds primitives for regular file handling.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi