10 May, 2011

1 commit

  • Previously, nilfs cloned pages of mmapped regions to freeze their
    data and ensure checksum consistency during writeback cycles. A
    private page allocator was used for this page cloning. This is no
    longer necessary, because clear_page_dirty_for_io() sets up the pte
    so that vm_ops->page_mkwrite is called right before the mmapped
    pages are modified, and nilfs_page_mkwrite() can safely wait for
    the pages to be written back to disk.

    So, this stops making a copy of mmapped pages during writeback, and
    eliminates the private page allocation and deallocation functions from
    nilfs.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     

09 Mar, 2011

3 commits

  • This directly uses sb->s_fs_info to keep a nilfs filesystem object and
    fully removes the intermediate nilfs_sb_info structure. With this
    change, the hierarchy of on-memory structures of nilfs will be
    simplified as follows:

    Before:
      super_block
        -> nilfs_sb_info
             -> the_nilfs
                  -> cptree --+-> nilfs_root (current file system)
                              +-> nilfs_root (snapshot A)
                              +-> nilfs_root (snapshot B)
                              :
             -> nilfs_sc_info (log writer structure)

    After:
      super_block
        -> the_nilfs
             -> cptree --+-> nilfs_root (current file system)
                         +-> nilfs_root (snapshot A)
                         +-> nilfs_root (snapshot B)
                         :
             -> nilfs_sc_info (log writer structure)

    We did not design it this way from the beginning because the
    initial shape also differed from the above. The early hierarchy was
    composed of "per-mount-point" super_block -> nilfs_sb_info pairs and a
    shared nilfs object. In kernel 2.6.37, it was changed to the
    current shape in order to unify super block instances into one per
    device, and this cleanup became applicable as a result.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • This replaces sbi uses with direct reference to sb instance.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • Removes sci->sc_sbi which is a back pointer to nilfs_sb_info struct
    from log writer object (nilfs_sc_info).

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
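
    The effect of dropping nilfs_sb_info, as described in the three
    commits above, can be pictured with a small user-space sketch. The
    *_sketch types and both helper functions are hypothetical stand-ins
    invented for illustration, and the s_nilfs member name is an
    assumption; only the idea of pointing sb->s_fs_info straight at the
    nilfs object comes from the commit messages.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct the_nilfs_sketch { int ns_placeholder; };
struct nilfs_sb_info_sketch { struct the_nilfs_sketch *s_nilfs; };
struct super_block_sketch { void *s_fs_info; };

/* Before: s_fs_info points at the intermediate nilfs_sb_info. */
struct the_nilfs_sketch *nilfs_from_sb_before(struct super_block_sketch *sb)
{
	struct nilfs_sb_info_sketch *sbi = sb->s_fs_info;

	return sbi->s_nilfs;	/* two pointer hops */
}

/* After: s_fs_info points directly at the nilfs object. */
struct the_nilfs_sketch *nilfs_from_sb_after(struct super_block_sketch *sb)
{
	return sb->s_fs_info;	/* one pointer hop */
}
```

    Both helpers return the same object; the second simply has one
    structure fewer to allocate, initialize, and keep in sync.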
     

23 Oct, 2010

2 commits

  • This rewrites functions using ifile so that they get ifile from the
    nilfs_root object, and will remove sbi->s_ifile. Some functions that
    don't know the root object are extended to receive it from the caller.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • On-memory inode structures of nilfs have a member "i_cno" which stores
    a checkpoint number related to the inode. For gc-inodes, this field
    indicates the version of the data each gc-inode caches for GC. The log
    writer also uses "i_cno" temporarily to transfer the latest checkpoint
    number.

    This stops the latter use and lets only gc-inodes use it.

    The purpose of this patch is to allow a subsequent change to use
    "i_cno" for inode lookup.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     

23 Jul, 2010

2 commits


31 May, 2010

1 commit


10 May, 2010

3 commits


14 Mar, 2010

2 commits


13 Feb, 2010

1 commit


30 Nov, 2009

1 commit

  • This separates the wait function for submitted logs from the write
    function nilfs_segctor_write(). A new list of segment buffers,
    "sc_write_logs", is added to hold logs being written, and double
    buffering is partially applied to hide I/O latency.

    At this point, double buffering is disabled for blocksize <
    pagesize because the page dirty flag is turned off during write and
    dirty buffers are not properly collected for pages crossing over
    segments.

    To receive the full benefit of double buffering, further refinement
    is needed to move the I/O wait outside the lock section of the log
    writer.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
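
    The idea of separating the wait from the write can be sketched in
    user space as follows. This is only an illustration of double
    buffering in general, not the nilfs implementation: submit_log(),
    wait_log(), and the event array are hypothetical, and real I/O is
    replaced by recording the order of events.

```c
#include <stdio.h>
#include <string.h>

#define MAX_EVENTS 64

/* Records the order in which logs are submitted and waited for. */
char events[MAX_EVENTS][16];
int nevents;

static void record(const char *what, int log)
{
	if (nevents < MAX_EVENTS)
		snprintf(events[nevents++], sizeof(events[0]), "%s%d", what, log);
}

/* Hypothetical stand-ins for starting I/O and waiting for completion. */
static void submit_log(int log) { record("submit", log); }
static void wait_log(int log)   { record("wait", log); }

/*
 * Submit log i before waiting for log i-1, so that at most two logs
 * are in flight and building the next log overlaps with the I/O of
 * the previous one.
 */
void write_logs_double_buffered(int nlogs)
{
	int in_flight = -1;	/* log submitted but not yet waited for */
	int i;

	for (i = 0; i < nlogs; i++) {
		submit_log(i);
		if (in_flight >= 0)
			wait_log(in_flight);
		in_flight = i;
	}
	if (in_flight >= 0)
		wait_log(in_flight);	/* drain the last log */
}
```

    With three logs, the recorded order is submit0, submit1, wait0,
    submit2, wait1, wait2; a version without double buffering would
    instead wait for each log right after submitting it.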
     

10 Jun, 2009

2 commits

  • This will eliminate obsolete list operations of nilfs_segment_entry
    structure which has been used to handle multiple segment numbers.

    The patch ("nilfs2: remove list of freeing segments") removed use of
    the structure from the segment constructor code, and this patch
    simplifies the remaining code by integrating it into recovery.c.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     
  • This will clean up the removal list of segments and the related
    functions from segment.c and ioctl.c, which have hurt code
    readability.

    This elimination is applied by using nilfs_sufile_updatev() previously
    introduced in the patch ("nilfs2: add sufile function that can modify
    multiple segment usages").

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     

11 May, 2009

1 commit

  • This is a companion patch to ("nilfs2: fix possible circular locking
    for get information ioctls").

    This corrects lock order reversal between mm->mmap_sem and
    nilfs->ns_segctor_sem in nilfs_clean_segments() which was detected by
    lockdep check:

    =======================================================
    [ INFO: possible circular locking dependency detected ]
    2.6.30-rc3-nilfs-00003-g360bdc1 #7
    -------------------------------------------------------
    mmap/5294 is trying to acquire lock:
    (&nilfs->ns_segctor_sem){++++.+}, at: [] nilfs_transaction_begin+0xb6/0x10c [nilfs2]

    but task is already holding lock:
    (&mm->mmap_sem){++++++}, at: [] do_page_fault+0x1d8/0x30a

    which lock already depends on the new lock.

    the existing dependency chain (in reverse order) is:

    -> #1 (&mm->mmap_sem){++++++}:
    [] __lock_acquire+0x1066/0x13b0
    [] lock_acquire+0xba/0xdd
    [] might_fault+0x68/0x88
    [] copy_from_user+0x2a/0x111
    [] nilfs_ioctl_prepare_clean_segments+0x1d/0xf1 [nilfs2]
    [] nilfs_clean_segments+0x6d/0x1b9 [nilfs2]
    [] nilfs_ioctl+0x2ad/0x318 [nilfs2]
    [] vfs_ioctl+0x22/0x69
    [] do_vfs_ioctl+0x460/0x499
    [] sys_ioctl+0x40/0x5a
    [] sysenter_do_call+0x12/0x38
    [] 0xffffffff

    -> #0 (&nilfs->ns_segctor_sem){++++.+}:
    [] __lock_acquire+0xdcc/0x13b0
    [] lock_acquire+0xba/0xdd
    [] down_read+0x2a/0x3e
    [] nilfs_transaction_begin+0xb6/0x10c [nilfs2]
    [] nilfs_page_mkwrite+0xe7/0x154 [nilfs2]
    [] __do_fault+0x165/0x376
    [] handle_mm_fault+0x287/0x5d1
    [] do_page_fault+0x2fb/0x30a
    [] error_code+0x72/0x78
    [] 0xffffffff

    where nilfs_clean_segments() holds:

    nilfs->ns_segctor_sem -> copy_from_user()
    --> page fault -> mm->mmap_sem

    And the page fault path may hold:

    page fault -> mm->mmap_sem
    --> nilfs_page_mkwrite() -> nilfs->ns_segctor_sem

    Even though nilfs_clean_segments() does not perform write access on
    the given user pages, it may cause a deadlock because
    nilfs->ns_segctor_sem is shared per device and mm->mmap_sem can be
    shared with other tasks.

    To avoid this problem, this patch moves all calls of copy_from_user()
    outside the nilfs->ns_segctor_sem lock in the ioctl.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
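
    The reordering described above (copy the user arguments first, take
    the lock afterwards) can be illustrated with a user-space sketch.
    Everything here is a hypothetical stand-in: fake_sem is a plain flag
    rather than a real semaphore, fake_copy_from_user() is a memcpy()
    wrapper, and clean_segments_sketch() only sums its arguments. The
    point is solely that no "user copy" happens while the lock is held.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for nilfs->ns_segctor_sem (a flag, not a lock). */
struct fake_sem { int held; };

static struct fake_sem segctor_sem;
int copied_under_lock;	/* set if a "user copy" ran with the lock held */

static void fake_down_write(struct fake_sem *s) { s->held = 1; }
static void fake_up_write(struct fake_sem *s)   { s->held = 0; }

/*
 * Stand-in for copy_from_user(). In the kernel, this call may fault and
 * take mm->mmap_sem, which is why it must not run under the lock.
 */
static void fake_copy_from_user(void *dst, const void *src, size_t n)
{
	if (segctor_sem.held)
		copied_under_lock = 1;
	memcpy(dst, src, n);
}

#define MAX_ARGS 8

/* Fixed ordering: all user copies happen before the lock is taken. */
unsigned long clean_segments_sketch(const unsigned long *uargs, size_t nargs)
{
	unsigned long kargs[MAX_ARGS];
	unsigned long sum = 0;
	size_t i;

	if (nargs > MAX_ARGS)
		nargs = MAX_ARGS;

	fake_copy_from_user(kargs, uargs, nargs * sizeof(kargs[0]));

	fake_down_write(&segctor_sem);
	for (i = 0; i < nargs; i++)	/* work only on the kernel-side copy */
		sum += kargs[i];
	fake_up_write(&segctor_sem);

	return sum;
}
```

    Because the locked section touches only the already copied buffer,
    it can no longer trigger a page fault, so the ns_segctor_sem ->
    mmap_sem dependency reported by lockdep disappears.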
     

07 Apr, 2009

6 commits

  • The former versions didn't have extra super blocks. This improves the
    weak point by introducing another super block in an unused region at
    the tail of the partition.

    This doesn't break disk format compatibility; older versions just ignore
    the secondary super block, and new versions just recover it if it doesn't
    exist. A partition created by an old mkfs may not have an unused region,
    but in that case, the secondary super block will not be added.

    This doesn't make more redundant copies of the super block; that is
    future work.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • will reduce some lines of segment constructor. Previously, the state was
    complexly controlled through a list of segments in order to keep
    consistency in meta data of usage state of segments. Instead, this
    presents "calculated" active flags to the userland cleaner program and
    stops maintaining the real flag on disk.

    With this fake flag alone, the cleaner cannot know exactly whether each
    segment is reclaimable or not. However, the recent extension of the
    nilfs_sustat ioctl struct (nilfs2-extend-nilfs_sustat-ioctl-struct.patch)
    can prevent the cleaner from wrongly reclaiming an in-use segment.

    So, now I can apply this for simplification.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • Nilfs creates checkpoints even for garbage collection or metadata updates
    such as checkpoint mode change. So users often see checkpoints created
    only by such internal operations.

    This is inconvenient in some situations. For example, an application that
    monitors checkpoints and changes them to snapshots will fall into an
    infinite loop because it cannot distinguish internally created
    checkpoints.

    This patch solves this sort of problem by adding a flag to checkpoint for
    identification.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • The sketch file is a file to mark checkpoints with user data. It was
    experimentally introduced in the original implementation, and is now
    obsolete. The file was handled differently from regular files; the file
    size got truncated when a checkpoint was created.

    This stops the special treatment and will treat it as a regular file.
    Most users are not affected because mkfs.nilfs2 no longer makes this file.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • Chris Mason pointed out that there is a missed sync issue in
    nilfs_writepages():

    On Wed, 17 Dec 2008 21:52:55 -0500, Chris Mason wrote:
    > It looks like nilfs_writepage ignores WB_SYNC_NONE, which is used by
    > do_sync_mapping_range().

    where WB_SYNC_NONE in do_sync_mapping_range() was replaced with
    WB_SYNC_ALL by Nick's patch (commit:
    ee53a891f47444c53318b98dac947ede963db400).

    This fixes the problem by letting nilfs_writepages() write out the log of
    file data within the range if sync_mode is WB_SYNC_ALL.

    This involves removal of nilfs_file_aio_write() which was previously
    needed to ensure O_SYNC sync writes.

    Cc: Chris Mason
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds the segment constructor (also called log writer).

    The segment constructor collects dirty buffers for every dirty inode,
    makes summaries of the buffers, assigns disk block addresses to the
    buffers, and then submits BIOs for the buffers.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
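
    The four steps named in the last commit above (collect dirty buffers
    per inode, summarize, assign block addresses, submit) can be pictured
    with a toy user-space model. This is a sketch under loud assumptions:
    the *_sketch types and construct_segment_sketch() are hypothetical,
    sequential address assignment is assumed for simplicity, the
    "summary" is reduced to a count, and submitting a BIO is reduced to
    setting a flag.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for buffers and inodes. */
struct buf_sketch {
	int dirty;
	unsigned long blocknr;	/* assigned disk block address */
	int submitted;
};

struct inode_sketch {
	struct buf_sketch *bufs;
	size_t nbufs;
};

/*
 * Walk every inode, pick up its dirty buffers, give each one the next
 * block address, and mark it as submitted. Returns the number of
 * buffers written, which is the only "summary" this toy model keeps.
 */
size_t construct_segment_sketch(struct inode_sketch *inodes, size_t ninodes,
				unsigned long start_blocknr)
{
	unsigned long next = start_blocknr;
	size_t nwritten = 0;
	size_t i, j;

	for (i = 0; i < ninodes; i++) {
		for (j = 0; j < inodes[i].nbufs; j++) {
			struct buf_sketch *b = &inodes[i].bufs[j];

			if (!b->dirty)
				continue;	/* collect dirty buffers only */
			b->blocknr = next++;	/* assign a block address */
			b->submitted = 1;	/* stand-in for submitting a BIO */
			b->dirty = 0;
			nwritten++;
		}
	}
	return nwritten;
}
```

    Clean buffers are skipped entirely, so only modified data receives a
    new address in the log.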