13 Oct, 2007

1 commit

  • Big thanks go to Mathias Kolehmainen for reporting the bug, providing
    debug output and testing the patches I sent him to get it working.

    The fix was to stop calling ntfs_attr_set() at mount time. That call
    causes balance_dirty_pages_ratelimited() to be invoked, which on
    systems with little memory actually tries to go and balance the
    dirty pages, and that in turn tries to take the s_umount semaphore.
    But because we are still in fill_super(), across which the VFS holds
    s_umount for writing, this results in a deadlock.

    We now do the dirty work by hand by submitting individual buffers
    (see the sketch below). This has the annoying "feature" that
    mounting can take a few seconds if the journal is large, as we have
    to clear it all. One day someone should improve on this by deferring
    the journal clearing to a helper kernel thread so it can be done in
    the background, but I don't have time for this at the moment and the
    current solution works fine, so I am leaving it like this for now.

    Signed-off-by: Anton Altaparmakov
    Signed-off-by: Linus Torvalds

    Anton Altaparmakov
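
    A minimal sketch of the by-hand buffer submission described above,
    assuming the era's submit_bh(WRITE, bh) API; the helper name and the
    1:1 block mapping are illustrative assumptions, not the actual
    fs/ntfs/logfile.c code. The point is that the buffers are written
    and waited on directly, so no page is ever dirtied and
    balance_dirty_pages_ratelimited() is never called while fill_super()
    still holds s_umount:

        #include <linux/fs.h>
        #include <linux/buffer_head.h>

        /*
         * Hypothetical helper: fill one locked, lowmem page of the
         * journal with 0xff bytes and write its buffers out
         * synchronously, bypassing the dirty page accounting entirely.
         */
        static int ntfs_clear_logfile_page(struct super_block *sb,
                struct page *page, sector_t block, unsigned blocksize)
        {
            struct buffer_head *bh, *head;
            int err = 0;

            if (!page_has_buffers(page))
                create_empty_buffers(page, blocksize, 0);
            bh = head = page_buffers(page);
            do {
                /* Illustrative 1:1 mapping; the real code has to walk
                 * the $LogFile runlist to find the on-disk blocks. */
                bh->b_bdev = sb->s_bdev;
                bh->b_blocknr = block++;
                set_buffer_mapped(bh);
                lock_buffer(bh);
                memset(bh->b_data, 0xff, blocksize);
                set_buffer_uptodate(bh);
                get_bh(bh);  /* end_buffer_write_sync() drops this ref */
                bh->b_end_io = end_buffer_write_sync;
                submit_bh(WRITE, bh);
            } while ((bh = bh->b_this_page) != head);

            /* Wait for the I/O by hand instead of marking the page
             * dirty and letting writeback find it later. */
            bh = head;
            do {
                wait_on_buffer(bh);
                if (!buffer_uptodate(bh))
                    err = -EIO;
            } while ((bh = bh->b_this_page) != head);
            return err;
        }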
     

09 May, 2007

1 commit


18 Jan, 2007

1 commit


30 Nov, 2006

2 commits


04 Oct, 2006

2 commits


24 Mar, 2006

1 commit


24 Feb, 2006

1 commit


11 Oct, 2005

1 commit

  • Implement file operations ->write(), ->aio_write(), and ->writev()
    for regular files (see the sketch below). This replaces the old use
    of generic_file_write(), et al. and the address space operations
    ->prepare_write() and ->commit_write(). This means that both sparse
    and non-sparse (unencrypted and uncompressed) files can now be
    extended using the normal write(2) code path. There are two
    limitations at present: we never create sparse files, and we only
    have limited support for highly fragmented files, i.e. ones whose
    data attribute is split across multiple extents. When such a case is
    encountered, EOPNOTSUPP is returned.

    Signed-off-by: Anton Altaparmakov

    Anton Altaparmakov
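
    Roughly, the change amounts to pointing the file operations table at
    NTFS-specific methods instead of the generic helpers. A sketch under
    the era's struct file_operations layout follows; the exact set of
    members shown is an assumption, not the literal fs/ntfs/file.c
    table:

        #include <linux/fs.h>
        #include <linux/uio.h>
        #include <linux/aio.h>

        /* Declarations matching the era's method signatures; the
         * bodies live in fs/ntfs/file.c. */
        extern ssize_t ntfs_file_write(struct file *,
                const char __user *, size_t, loff_t *);
        extern ssize_t ntfs_file_aio_write(struct kiocb *,
                const char __user *, size_t, loff_t);
        extern ssize_t ntfs_file_writev(struct file *,
                const struct iovec *, unsigned long, loff_t *);

        /* Sketch only: per the commit text, ->write, ->aio_write and
         * ->writev now point at NTFS implementations; the remaining
         * members are plausible-era fillers. */
        struct file_operations ntfs_file_ops = {
            .llseek    = generic_file_llseek,
            .read      = generic_file_read,
            .write     = ntfs_file_write,     /* was generic_file_write() */
            .aio_write = ntfs_file_aio_write,
            .writev    = ntfs_file_writev,
            .mmap      = generic_file_mmap,
        };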
     

09 Sep, 2005

1 commit


27 Jun, 2005

1 commit

  • The situation: VFS inode X on a mounted ntfs volume is dirty. For
    the same inode X, the ntfs_inode is dirty, and thus so is the
    corresponding on-disk inode, i.e. the mft record, which sits in a
    dirty page cache page belonging to the table of inodes, i.e. $MFT,
    inode 0.

    What happens:

    Process 1: sys_sync()/umount()/whatever... calls
    __sync_single_inode() for $MFT -> do_writepages() -> ->writepage()
    for the dirty page containing the on-disk inode X; the page is now
    locked -> ntfs_write_mst_block(), which clears PageUptodate() on the
    page to prevent anyone else getting hold of it whilst it does the
    write out. This is necessary as the on-disk inode needs "fixups"
    applied before the write to disk, which are removed again after the
    write, and PageUptodate is then set again. It then analyses the page
    looking for dirty on-disk inodes and when it finds one it calls
    ntfs_may_write_mft_record() to see if it is safe to write this
    on-disk inode. This then calls ilookup5() to check if the
    corresponding VFS inode is in the inode cache. This in turn calls
    ifind(), which waits on the inode lock via wait_on_inode() whilst
    holding the global inode_lock.

    Process 2: pdflush results in a call to __sync_single_inode() for
    the same VFS inode X on the ntfs volume. This locks the inode
    (I_LOCK), then calls ->write_inode() -> ntfs_write_inode() ->
    map_mft_record() -> read_cache_page() for the page (in the page
    cache of the table of inodes, $MFT, inode 0) containing the on-disk
    inode. This page has PageUptodate() clear because of Process 1 (see
    above), so read_cache_page() blocks when it tries to take the page
    lock for the page so it can call ntfs_readpage().

    Thus Process 1 is holding the page lock on the page containing the
    on-disk inode X and is waiting in ifind() for inode X to be
    unlocked, so it can write the page out and then unlock the page.
    And Process 2 is holding the inode lock on inode X and is waiting
    for the page to be unlocked so it can call ntfs_readpage() or
    discover that Process 1 set PageUptodate() again and use the page.
    Thus we have a deadlock due to ifind() waiting on the inode lock.

    The solution: The fix is to use the newly introduced
    ilookup5_nowait(), which does not wait on the inode's lock and hence
    avoids the deadlock (see the sketch below). This is safe as we do
    not care about the VFS inode; we only use the fact that it is in the
    VFS inode cache, and the fact that the vfs and ntfs inodes are one
    struct in memory, to find the ntfs inode in memory if present.
    Also, the ntfs inode has its own locking, so it does not matter if
    the vfs inode is locked.

    Signed-off-by: Anton Altaparmakov

    Anton Altaparmakov
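
    A minimal sketch of the lookup pattern the fix describes, using the
    real ilookup5_nowait() signature from fs/inode.c; the test callback
    and wrapper are illustrative assumptions, not the literal
    fs/ntfs/mft.c code. The simple match below leans on the fact that
    for a base NTFS inode the VFS inode number is the mft record
    number:

        #include <linux/fs.h>

        /* Illustrative test callback: match a cached VFS inode against
         * an mft record number (the real driver uses its own
         * ntfs_test_inode() with richer match data). */
        static int ntfs_test_inode_sketch(struct inode *vi, void *data)
        {
            return vi->i_ino == *(unsigned long *)data;
        }

        static struct inode *ntfs_ilookup_nowait(struct super_block *sb,
                unsigned long mft_no)
        {
            /*
             * Unlike ilookup5(), ilookup5_nowait() does not wait for a
             * locked inode to be unlocked, so it cannot block on I_LOCK
             * whilst we hold the page lock. That is safe here because
             * we never touch the VFS side of the returned inode; we
             * only use its presence in the cache to reach the in-memory
             * ntfs_inode, which has its own locking.
             */
            return ilookup5_nowait(sb, mft_no, ntfs_test_inode_sketch,
                        &mft_no);
        }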
     

26 Jun, 2005

1 commit


05 May, 2005

1 commit


17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds