07 Apr, 2009

40 commits

  • This adds a Makefile for the nilfs2 file system, and updates the
    Makefile and Kconfig file in the file system directory.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds a userland interface implemented with ioctl.

    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Koji Sato
     
  • This adds the cache of on-disk blocks to be moved in garbage
    collection. The disk blocks are held with dummy inodes (called
    gcinodes), and this file provides a lookup function for the dummy
    inodes and their buffer read function.

    Signed-off-by: Seiji Kihara
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Yoshiji Amagai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • NILFS2 uses another DAT inode during garbage collection to ensure
    atomicity and consistency of the DAT in the transient state. This
    twin inode is called GCDAT.

    This adds functions to initialize the GCDAT and to switch page caches
    and B-tree node caches between these two inodes.

    Signed-off-by: Seiji Kihara
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Yoshiji Amagai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds a recovery function that runs on mount.

    Usually the recovery is achieved by just finding the latest super
    root. When logs without checkpoints were appended for data sync
    operations after the latest super root, the recovery function will
    perform roll forwarding and reconstruct new log(s) with a super root.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • Chris Mason pointed out that there is a missed sync issue in
    nilfs_writepages():

    On Wed, 17 Dec 2008 21:52:55 -0500, Chris Mason wrote:
    > It looks like nilfs_writepage ignores WB_SYNC_NONE, which is used by
    > do_sync_mapping_range().

    where WB_SYNC_NONE in do_sync_mapping_range() was replaced with
    WB_SYNC_ALL by Nick's patch (commit:
    ee53a891f47444c53318b98dac947ede963db400).

    This fixes the problem by letting nilfs_writepages() write out the log of
    file data within the range if sync_mode is WB_SYNC_ALL.

    This involves removal of nilfs_file_aio_write() which was previously
    needed to ensure O_SYNC sync writes.

    Cc: Chris Mason
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds the segment constructor (also called log writer).

    The segment constructor collects dirty buffers for every dirty inode,
    makes summaries of the buffers, assigns disk block addresses to the
    buffers, and then submits BIOs for the buffers.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds the segment buffer which is used to construct logs.

    [akpm@linux-foundation.org: BIO_RW_SYNC got removed]
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds super block operations for the nilfs2 file system.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds functions that operate on the_nilfs object, which keeps
    resources and state shared among a read/write mount and individually
    mounted snapshots.

    the_nilfs is allocated per block device; it is created when a user first
    mounts a snapshot or a read/write instance on the device, and is then
    reused for successive mounts. It is freed when all mount instances on
    the device are detached.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds pathname operations, most of which come from the ext2 file
    system.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds directory handling functions, most of which come from the ext2
    file system.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Yoshiji Amagai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yoshiji Amagai
     
  • This adds primitives for regular file handling.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds inode-level operations of the nilfs2 file system.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds a meta data file which stores the allocation state of segments.

    [konishi.ryusuke@lab.ntt.co.jp: fix wrong counting of checkpoints and dirty segments]
    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Koji Sato
     
  • This adds a meta data file which holds checkpoint entries in its data
    blocks.

    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Koji Sato
     
  • This adds a meta data file which stores on-disk inodes in its data blocks.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Yoshiji Amagai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds the disk address translation file (DAT) whose primary function
    is to convert virtual disk block numbers to actual disk block numbers.

    The virtual block numbers of NILFS are associated with checkpoint
    generation numbers, and this file also provides functions to manage the
    lifetime information of each virtual block number.

    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Koji Sato
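
The translation and lifetime tracking described in the DAT commit above can be illustrated with a minimal user-space sketch. Everything here is an assumption made for illustration: the `toy_dat_*` names, the fixed-size array, and the field names are hypothetical and are not the nilfs2 on-disk format or kernel API. The sketch only shows the idea that a virtual block number stays stable while its actual block number can change, and that each mapping is live for a range of checkpoint numbers.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_DAT_ENTRIES 8           /* hypothetical table size */
#define TOY_CNO_MAX     UINT64_MAX  /* "still live" marker (assumption) */

/* Hypothetical entry: one per virtual block number. */
struct toy_dat_entry {
	uint64_t blocknr; /* actual disk block number */
	uint64_t start;   /* first checkpoint in which the mapping is live */
	uint64_t end;     /* first checkpoint in which it is dead */
};

static struct toy_dat_entry toy_dat[TOY_DAT_ENTRIES];

/* Bind a virtual block number to an actual block, live from checkpoint cno. */
static void toy_dat_assign(uint64_t vblocknr, uint64_t blocknr, uint64_t cno)
{
	assert(vblocknr < TOY_DAT_ENTRIES);
	toy_dat[vblocknr].blocknr = blocknr;
	toy_dat[vblocknr].start = cno;
	toy_dat[vblocknr].end = TOY_CNO_MAX;
}

/* Primary function: virtual block number -> actual block number. */
static uint64_t toy_dat_translate(uint64_t vblocknr)
{
	assert(vblocknr < TOY_DAT_ENTRIES);
	return toy_dat[vblocknr].blocknr;
}

/* Relocation (e.g. by GC): only the actual address changes. */
static void toy_dat_move(uint64_t vblocknr, uint64_t new_blocknr)
{
	assert(vblocknr < TOY_DAT_ENTRIES);
	toy_dat[vblocknr].blocknr = new_blocknr;
}

/* Mark the mapping dead as of checkpoint cno. */
static void toy_dat_end(uint64_t vblocknr, uint64_t cno)
{
	assert(vblocknr < TOY_DAT_ENTRIES);
	toy_dat[vblocknr].end = cno;
}

/* Is the mapping visible from checkpoint cno? */
static int toy_dat_is_live_at(uint64_t vblocknr, uint64_t cno)
{
	assert(vblocknr < TOY_DAT_ENTRIES);
	return toy_dat[vblocknr].start <= cno && cno < toy_dat[vblocknr].end;
}
```

In this sketch, a block pointer that stores virtual block number 3 never needs rewriting when the block is moved; only the table entry changes.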
     
  • This adds common functions to allocate or deallocate entries with bitmaps
    on a meta data file. This feature is used by the DAT and ifile.

    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Yoshiji Amagai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
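
As a rough illustration of bitmap-based entry allocation as described in the commit above, the following user-space sketch allocates the lowest free entry and frees entries by clearing their bit. The names (`toy_alloc_entry`, `toy_free_entry`) and the single in-memory bitmap are hypothetical; the actual nilfs2 code keeps its bitmaps in meta data file blocks and is not reproduced here.

```c
#include <assert.h>
#include <limits.h>

#define TOY_NENTRIES 32 /* hypothetical number of entries */

/* One bit per entry: 1 = in use, 0 = free. */
static unsigned char toy_bitmap[(TOY_NENTRIES + CHAR_BIT - 1) / CHAR_BIT];

static int toy_test_bit(int nr)
{
	return (toy_bitmap[nr / CHAR_BIT] >> (nr % CHAR_BIT)) & 1;
}

/* Allocate the lowest-numbered free entry; return its index, or -1 if full. */
static int toy_alloc_entry(void)
{
	int nr;

	for (nr = 0; nr < TOY_NENTRIES; nr++) {
		if (!toy_test_bit(nr)) {
			toy_bitmap[nr / CHAR_BIT] |=
				(unsigned char)(1u << (nr % CHAR_BIT));
			return nr;
		}
	}
	return -1;
}

/* Release an entry that is currently allocated. */
static void toy_free_entry(int nr)
{
	assert(nr >= 0 && nr < TOY_NENTRIES && toy_test_bit(nr));
	toy_bitmap[nr / CHAR_BIT] &= (unsigned char)~(1u << (nr % CHAR_BIT));
}
```

A freed index is handed out again by the next allocation, which is the behaviour callers such as an inode or DAT entry allocator would typically rely on.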
     
  • This adds the meta data file, which serves common buffer functions to the
    DAT, sufile, cpfile, ifile, and so forth.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds common routines for buffer/page operations used in B-tree
    node caches, meta data files, or segment constructor (log writer).

    NILFS uses copy functions for buffers and pages for the following
    reasons:

    1) Relocation required for COW
    Since NILFS changes the addresses of on-disk blocks, buffers that are
    not addressed by a file offset must be moved within the page cache.
    If the buffer size is smaller than the page size, this involves
    partial copies of pages.

    2) Freezing mmapped pages
    NILFS calculates checksums for each log to ensure its validity.
    If page data changes after the checksum calculation, this validity
    check will not work correctly. To avoid this failure for mmapped
    pages, NILFS freezes their data by copying.

    3) Copy-on-write for DAT pages
    NILFS makes clones of DAT page caches in a copy-on-write manner
    during GC processes, and this ensures atomicity and consistency
    of the DAT in the transient state.

    In addition, NILFS uses two obsolete functions,
    nilfs_mark_buffer_dirty() and nilfs_clear_page_dirty().

    * nilfs_mark_buffer_dirty() was required to avoid NULL pointer
    dereference faults:

    Since the page cache of B-tree node pages or data page cache of pseudo
    inodes does not have a valid mapping->host, calling mark_buffer_dirty()
    for their buffers causes the fault; it calls __mark_inode_dirty(NULL)
    through __set_page_dirty().

    * nilfs_clear_page_dirty() was needed in two cases:

    1) For B-tree node pages and data pages of the dat/gcdat, NILFS2 clears
    page dirty flags when it copies back pages from the cloned cache
    (gcdat->{i_mapping,i_btnode_cache}) to its original cache
    (dat->{i_mapping,i_btnode_cache}).

    2) Some B-tree operations like insertion or deletion may dispose of
    buffers in a dirty state, which requires cancelling the dirty state of
    their pages. clear_page_dirty_for_io() caused faults because it does
    not clear the dirty tag on the page cache.

    Signed-off-by: Seiji Kihara
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
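
The "freezing by copying" reason given in the commit above (reason 2) can be sketched in user space as follows. This is only an illustration under stated assumptions: `toy_crc32` is a plain bitwise CRC-32 used as a stand-in checksum, and `toy_log_block`, `toy_freeze_page`, and the 64-byte page size are hypothetical, not nilfs2 code. The point is the ordering: copy first, then checksum the copy, so that later stores to the live (for example mmapped) page cannot invalidate the checksum.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 64 /* tiny stand-in for the real page size */

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); a stand-in checksum. */
static uint32_t toy_crc32(const unsigned char *p, size_t len)
{
	uint32_t crc = 0xffffffffu;
	size_t i;
	int k;

	for (i = 0; i < len; i++) {
		crc ^= p[i];
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xedb88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical log block: a private copy plus its checksum. */
struct toy_log_block {
	unsigned char frozen[TOY_PAGE_SIZE]; /* copy that would be written */
	uint32_t checksum;                   /* checksum of the frozen copy */
};

/* Copy the live page, then checksum the copy rather than the live page. */
static void toy_freeze_page(struct toy_log_block *blk,
			    const unsigned char *live_page)
{
	memcpy(blk->frozen, live_page, TOY_PAGE_SIZE);
	blk->checksum = toy_crc32(blk->frozen, TOY_PAGE_SIZE);
}

/* Validity check as a reader of the log would perform it. */
static int toy_log_block_valid(const struct toy_log_block *blk)
{
	return toy_crc32(blk->frozen, TOY_PAGE_SIZE) == blk->checksum;
}
```

If the checksum were instead computed over the live page and the page were modified before being written out, the stored checksum would no longer match the written data; the copy removes that window.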
     
  • This adds routines for B-tree node buffers.

    Signed-off-by: Seiji Kihara
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds block mappings using direct pointers which are stored in the
    i_bmap array of inode.

    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Koji Sato
     
  • This adds declarations and functions of NILFS2 B-tree.

    Two variants are integrated in the NILFS2 B-tree. The B-tree for most
    files points to child nodes or data blocks with virtual block
    addresses, whereas the B-tree of the DAT uses actual block addresses.

    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Koji Sato
     
  • This adds structures and operations for the block mapping (bmap for
    short). NILFS2 uses direct mappings for short files or B-tree based
    mappings for longer files.

    Every on-disk data block is held with inodes and managed through this
    block mapping. The nilfs_bmap structure and a set of functions here
    provide this capability to the NILFS2 inode.

    [penberg@cs.helsinki.fi: remove a bunch of bmap wrapper macros]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Koji Sato
     
  • This adds the following common structures of the NILFS2 file system.

    * nilfs_inode_info structure:
    represents the in-memory inode.

    * nilfs_sb_info structure:
    keeps per-mount state and a special inode for the ifile.
    This structure is attached to the super_block structure.

    * the_nilfs structure:
    keeps shared state and locks among a read/write mount and snapshot
    mounts. This keeps special inodes for the sufile, cpfile, dat, and
    another dat inode used during GC (gcdat). This also has a hash table
    of dummy inodes to cache disk blocks during GC (gcinodes).

    * nilfs_transaction_info structure:
    keeps per-task state while nilfs is writing logs or performing
    indivisible inode or namespace operations. This structure is used to
    identify the context during log writing and to store the nest level of
    the lock that ensures atomicity of file system operations.

    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • This adds a header file which specifies the on-disk format and ioctl
    interface of the nilfs2 file system.

    Signed-off-by: Koji Sato
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Koji Sato
     
  • This adds a document describing the features, mount options, userland
    tools, usage, disk format, and related URLs for the nilfs2 file system.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • Update the documentation related to the old DMA_nBIT_MASK macros

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
     
  • Replace all uses of the DMA_24BIT_MASK macro with DMA_BIT_MASK(24)

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
     
  • Replace all uses of the DMA_28BIT_MASK macro with DMA_BIT_MASK(28)

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
     
  • Replace all uses of the DMA_30BIT_MASK macro with DMA_BIT_MASK(30)

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
     
  • Replace all uses of the DMA_31BIT_MASK macro with DMA_BIT_MASK(31)

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
     
  • Replace all uses of the DMA_32BIT_MASK macro with DMA_BIT_MASK(32)

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
     
  • Replace all uses of the DMA_39BIT_MASK macro with DMA_BIT_MASK(39)

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
     
  • Replace all uses of the DMA_40BIT_MASK macro with DMA_BIT_MASK(40)

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
     
  • Replace all uses of the DMA_48BIT_MASK macro with DMA_BIT_MASK(48)

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
     
  • Replace all uses of the DMA_64BIT_MASK macro with DMA_BIT_MASK(64)

    Signed-off-by: Yang Hongyang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yang Hongyang
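
For readers unfamiliar with the macro that the series above converts to, the following user-space sketch shows what DMA_BIT_MASK(n) evaluates to. The macro body matches, to the best of my knowledge, the kernel definition in include/linux/dma-mapping.h; `dma_addr_fits()` is a hypothetical helper added here for illustration and is not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * A mask with the low n bits set. n == 64 is special-cased because
 * shifting a 64-bit value by 64 is undefined behaviour in C.
 */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper: can a device limited to `mask` address `addr`?
 * True when addr has no bits set above the mask.
 */
static int dma_addr_fits(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) == 0;
}
```

With this definition, DMA_BIT_MASK(24) equals 0x00ffffff, the value the old DMA_24BIT_MASK constant carried, so one parameterized macro covers every width that previously needed its own constant.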
     
  • Make romfs return f_fsid info for statfs(2).

    Signed-off-by: Coly Li
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Coly Li
     
  • Largely inspired by ipc/ipc_sysctl.c, this patch isolates the mqueue
    sysctl code in its own file.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Cedric Le Goater
    Signed-off-by: Nadia Derbey
    Signed-off-by: Serge E. Hallyn
    Cc: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn