31 Jul, 2014

1 commit

  • We use a circular area to record the log nodes in ubifs. This log area
    should never overlap itself. But after researching the code, I found
    that some conditions may cause the log head to wrap around past the
    log tail. Although we've fixed the problems discovered so far, other
    issues may still remain.

    This patch adds assertions where lhead moves to the next LEB to make
    sure it does not wrap past ltail.

    Signed-off-by: hujianyang
    Signed-off-by: Artem Bityutskiy

    hujianyang
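The head/tail invariant described in this commit can be sketched in plain C. This is only an illustration under stated assumptions: `struct log_state`, `next_log_lnum()`, `head_would_wrap_tail()` and `advance_head()` are hypothetical names, not the real UBIFS structures or helpers, and the actual patch may place its check differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified log state; not the real struct ubifs_info. */
struct log_state {
    int first_lnum;  /* first LEB of the log area */
    int leb_cnt;     /* number of LEBs in the log area */
    int lhead_lnum;  /* LEB currently being written (head) */
    int ltail_lnum;  /* oldest LEB still needed (tail) */
};

/* Return the LEB that follows lnum, wrapping at the end of the log area. */
static int next_log_lnum(const struct log_state *log, int lnum)
{
    lnum += 1;
    if (lnum >= log->first_lnum + log->leb_cnt)
        lnum = log->first_lnum;
    return lnum;
}

/* True if moving the head to the next LEB would land on the tail. */
static bool head_would_wrap_tail(const struct log_state *log)
{
    return next_log_lnum(log, log->lhead_lnum) == log->ltail_lnum;
}

/* Advance the head, asserting that the tail is not overwritten. */
static void advance_head(struct log_state *log)
{
    assert(!head_would_wrap_tail(log));
    log->lhead_lnum = next_log_lnum(log, log->lhead_lnum);
}
```

With a four-LEB log starting at LEB 3, a head at LEB 5 and a tail at LEB 4, the head can advance to 6 and then wrap to 3, but a third advance would trip the assertion.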
     

29 Jul, 2014

1 commit


19 Jul, 2014

14 commits


13 Jun, 2014

1 commit

  • Pull vfs updates from Al Viro:
    "This is the bunch that sat in -next + lock_parent() fix. This is the
    minimal set; there's more pending stuff.

    In particular, I really hope to get acct.c fixes merged this cycle -
    we need that to deal sanely with delayed-mntput stuff. In the next
    pile, hopefully - that series is fairly short and localized
    (kernel/acct.c, fs/super.c and fs/namespace.c). In this pile: more
    iov_iter work. Most of prereqs for ->splice_write with sane locking
    order are there and Kent's dio rewrite would also fit nicely on top of
    this pile"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
    lock_parent: don't step on stale ->d_parent of all-but-freed one
    kill generic_file_splice_write()
    ceph: switch to iter_file_splice_write()
    shmem: switch to iter_file_splice_write()
    nfs: switch to iter_splice_write_file()
    fs/splice.c: remove unneeded exports
    ocfs2: switch to iter_file_splice_write()
    ->splice_write() via ->write_iter()
    bio_vec-backed iov_iter
    optimize copy_page_{to,from}_iter()
    bury generic_file_aio_{read,write}
    lustre: get rid of messing with iovecs
    ceph: switch to ->write_iter()
    ceph_sync_direct_write: stop poking into iov_iter guts
    ceph_sync_read: stop poking into iov_iter guts
    new helper: copy_page_from_iter()
    fuse: switch to ->write_iter()
    btrfs: switch to ->write_iter()
    ocfs2: switch to ->write_iter()
    xfs: switch to ->write_iter()
    ...

    Linus Torvalds
     

12 Jun, 2014

1 commit

  • iter_file_splice_write() - a ->splice_write() instance that gathers the
    pipe buffers, builds a bio_vec-based iov_iter covering those and feeds
    it to ->write_iter(). A bunch of simple cases converted to that...

    [AV: fixed the braino spotted by Cyrill]

    Signed-off-by: Al Viro

    Al Viro
     

10 Jun, 2014

1 commit

  • Pull UBIFS updates from Artem Bityutskiy:
    "This contains several UBIFS fixes. One of them fixes a race condition
    between the mmap page fault path and fsync. Another just removes a
    bogus assertion from the UBIFS memory shrinker.

    UBIFS also started honoring the MS_SILENT mount flag, so now it won't
    print many I/O errors when user-space just tries to probe for the FS.

    Rest of the changes are rather minor UBI/UBIFS fixes, improvements,
    and clean-ups"

    * tag 'upstream-3.16-rc1-v2' of git://git.infradead.org/linux-ubifs:
    UBIFS: Add an assertion for clean_zn_cnt
    UBIFS: respect MS_SILENT mount flag
    UBIFS: Remove incorrect assertion in shrink_tnc()
    UBIFS: fix debugging check
    UBIFS: add missing ui pointer in debugging code
    UBI: block: Fix error path on alloc_workqueue failure
    UBIFS: Fix dump messages in ubifs_dump_lprops
    UBI: fix rb_tree node comparison in add_map
    UBIFS: Remove unused variables in ubifs_budget_space
    UBI: weaken the 'exclusive' constraint when opening volumes to rename
    UBIFS: fix an mmap and fsync race condition

    Linus Torvalds
     

04 Jun, 2014

1 commit

  • …el/git/tip/tip into next

    Pull core locking updates from Ingo Molnar:
    "The main changes in this cycle were:

    - reduced/streamlined smp_mb__*() interface that allows more usecases
    and makes the existing ones less buggy, especially in rarer
    architectures

    - add rwsem implementation comments

    - bump up lockdep limits"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
    rwsem: Add comments to explain the meaning of the rwsem's count field
    lockdep: Increase static allocations
    arch: Mass conversion of smp_mb__*()
    arch,doc: Convert smp_mb__*()
    arch,xtensa: Convert smp_mb__*()
    arch,x86: Convert smp_mb__*()
    arch,tile: Convert smp_mb__*()
    arch,sparc: Convert smp_mb__*()
    arch,sh: Convert smp_mb__*()
    arch,score: Convert smp_mb__*()
    arch,s390: Convert smp_mb__*()
    arch,powerpc: Convert smp_mb__*()
    arch,parisc: Convert smp_mb__*()
    arch,openrisc: Convert smp_mb__*()
    arch,mn10300: Convert smp_mb__*()
    arch,mips: Convert smp_mb__*()
    arch,metag: Convert smp_mb__*()
    arch,m68k: Convert smp_mb__*()
    arch,m32r: Convert smp_mb__*()
    arch,ia64: Convert smp_mb__*()
    ...

    Linus Torvalds
     

03 Jun, 2014

1 commit

  • This patch adds a new ubifs_assert() in ubifs_tnc_close() to check
    for leaks of the per-filesystem @clean_zn_cnt. The new assertion
    verifies, at unmount time, that the return value of
    ubifs_destroy_tnc_subtree() equals @clean_zn_cnt.

    Artem: a minor amendment

    Signed-off-by: hujianyang
    Signed-off-by: Artem Bityutskiy

    hujianyang
     

02 Jun, 2014

2 commits

  • When attempting to mount a non-ubifs formatted volume, lots of error
    messages (including a stack dump) are thrown to the kernel log even if
    the MS_SILENT mount flag is set.
    Fix this by adding an additional state variable to
    struct ubifs_info and suppressing error messages in ubifs_read_node if
    MS_SILENT is set.

    Signed-off-by: Daniel Golle
    Signed-off-by: Artem Bityutskiy

    Daniel Golle
     
  • I hit the same assertion failure that Dolev Raviv reported on kernel
    v3.10. It looks like this:

    [ 9641.164028] UBIFS assert failed in shrink_tnc at 131 (pid 13297)
    [ 9641.234078] CPU: 1 PID: 13297 Comm: mmap.test Tainted: G O 3.10.40 #1
    [ 9641.234116] [] (unwind_backtrace+0x0/0x12c) from [] (show_stack+0x20/0x24)
    [ 9641.234137] [] (show_stack+0x20/0x24) from [] (dump_stack+0x20/0x28)
    [ 9641.234188] [] (dump_stack+0x20/0x28) from [] (shrink_tnc_trees+0x25c/0x350 [ubifs])
    [ 9641.234265] [] (shrink_tnc_trees+0x25c/0x350 [ubifs]) from [] (ubifs_shrinker+0x25c/0x310 [ubifs])
    [ 9641.234307] [] (ubifs_shrinker+0x25c/0x310 [ubifs]) from [] (shrink_slab+0x1d4/0x2f8)
    [ 9641.234327] [] (shrink_slab+0x1d4/0x2f8) from [] (do_try_to_free_pages+0x300/0x544)
    [ 9641.234344] [] (do_try_to_free_pages+0x300/0x544) from [] (try_to_free_pages+0x2d0/0x398)
    [ 9641.234363] [] (try_to_free_pages+0x2d0/0x398) from [] (__alloc_pages_nodemask+0x494/0x7e8)
    [ 9641.234382] [] (__alloc_pages_nodemask+0x494/0x7e8) from [] (new_slab+0x78/0x238)
    [ 9641.234400] [] (new_slab+0x78/0x238) from [] (__slab_alloc.constprop.42+0x1a4/0x50c)
    [ 9641.234419] [] (__slab_alloc.constprop.42+0x1a4/0x50c) from [] (kmem_cache_alloc_trace+0x54/0x188)
    [ 9641.234459] [] (kmem_cache_alloc_trace+0x54/0x188) from [] (do_readpage+0x168/0x468 [ubifs])
    [ 9641.234553] [] (do_readpage+0x168/0x468 [ubifs]) from [] (ubifs_readpage+0x424/0x464 [ubifs])
    [ 9641.234606] [] (ubifs_readpage+0x424/0x464 [ubifs]) from [] (filemap_fault+0x304/0x418)
    [ 9641.234638] [] (filemap_fault+0x304/0x418) from [] (__do_fault+0xd4/0x530)
    [ 9641.234665] [] (__do_fault+0xd4/0x530) from [] (handle_pte_fault+0x480/0xf54)
    [ 9641.234690] [] (handle_pte_fault+0x480/0xf54) from [] (handle_mm_fault+0x140/0x184)
    [ 9641.234716] [] (handle_mm_fault+0x140/0x184) from [] (do_page_fault+0x150/0x3ac)
    [ 9641.234737] [] (do_page_fault+0x150/0x3ac) from [] (do_DataAbort+0x3c/0xa0)
    [ 9641.234759] [] (do_DataAbort+0x3c/0xa0) from [] (__dabt_usr+0x38/0x40)

    After analyzing the code, I found a condition that can trigger this
    failure during correct operation. Thus, I think this assertion is wrong
    and should be removed.

    Suppose there are two clean znodes and one dirty znode in the TNC, so the
    per-filesystem atomic_t @clean_zn_cnt is 2. When a commit starts, the dirty
    znode is marked COW_ZNODE in get_znodes_to_commit() to guard against
    concurrent operations on it. We clear the COW bit and the DIRTY bit in
    write_index() without holding @tnc_mutex, and we do not increase
    @clean_zn_cnt there. As the comments in write_index() explain, if another
    process takes @tnc_mutex and dirties this znode after we clean it,
    @clean_zn_cnt is decreased to 1. Later, free_obsolete_znodes() increases
    @clean_zn_cnt back to 2 with @tnc_mutex held to make it correct again.

    If shrink_tnc() runs between the decrease and the increase, it releases
    the other 2 clean znodes and finds that @clean_zn_cnt is less than zero
    (1 - 2 = -1), which trips the assertion. Because free_obsolete_znodes()
    soon corrects @clean_zn_cnt and the file system is not harmed in this
    case, I think this assertion can be removed.

    2 clean znodes and 1 dirty znode, @clean_zn_cnt == 2

    Thread A (commit)          Thread B (write or others)        Thread C (shrinker)
    ->write_index
      ->clear_bit(DIRTY_NODE)
      ->clear_bit(COW_ZNODE)

                 @clean_zn_cnt == 2

                               ->mutex_locked(&tnc_mutex)
                               ->dirty_cow_znode
                                 ->!ubifs_zn_cow(znode)
                                 ->!test_and_set_bit(DIRTY_NODE)
                                 ->atomic_dec(&clean_zn_cnt)
                               ->mutex_unlocked(&tnc_mutex)

                 @clean_zn_cnt == 1

                                                                 ->mutex_locked(&tnc_mutex)
                                                                 ->shrink_tnc
                                                                   ->destroy_tnc_subtree
                                                                   ->atomic_sub(&clean_zn_cnt, 2)
                                                                   ->ubifs_assert (hit)
                                                                 ->mutex_unlocked(&tnc_mutex)

                 @clean_zn_cnt == -1

    ->mutex_lock(&tnc_mutex)
    ->free_obsolete_znodes
      ->atomic_inc(&clean_zn_cnt)
    ->mutex_unlock(&tnc_mutex)

                 @clean_zn_cnt == 0 (correct after shrink)

    Signed-off-by: hujianyang
    Cc: stable@vger.kernel.org
    Signed-off-by: Artem Bityutskiy

    hujianyang
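The counter arithmetic in this commit message can be replayed in a few lines of C11. This is a single-threaded replay of the described interleaving, not a reproduction of the race itself; `clean_zn_cnt` here is a local stand-in for the UBIFS counter and `replay_interleaving()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the per-filesystem @clean_zn_cnt counter. */
static atomic_long clean_zn_cnt;

/* Replay the interleaving step by step and record the counter after each. */
static void replay_interleaving(long seen[4])
{
    atomic_store(&clean_zn_cnt, 2);        /* two clean znodes, one dirty */
    seen[0] = atomic_load(&clean_zn_cnt);  /* commit cleared DIRTY, no inc */

    atomic_fetch_sub(&clean_zn_cnt, 1);    /* thread B re-dirties the znode */
    seen[1] = atomic_load(&clean_zn_cnt);

    atomic_fetch_sub(&clean_zn_cnt, 2);    /* shrinker frees 2 clean znodes */
    seen[2] = atomic_load(&clean_zn_cnt);

    atomic_fetch_add(&clean_zn_cnt, 1);    /* free_obsolete_znodes fixes up */
    seen[3] = atomic_load(&clean_zn_cnt);
}
```

The third recorded value is the transiently negative count that the removed assertion rejected; the fourth shows the counter settling back to a consistent value.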
     

28 May, 2014

1 commit

  • The debugging check which verifies that we never write outside of the file
    length was incorrect, since it was multiplying file length by the page size,
    instead of dividing. Fix this.

    Spotted-by: hujianyang
    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
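To illustrate why multiplying instead of dividing makes such a check useless, here is a minimal sketch. The function names and `SKETCH_PAGE_SHIFT` (4 KiB pages) are assumptions made for this example, not the identifiers used in UBIFS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumed 4 KiB pages */

/* Correct: convert the length to a page index by dividing (shift right). */
static bool page_within_file(uint64_t page_index, uint64_t file_len)
{
    return page_index <= (file_len >> SKETCH_PAGE_SHIFT);
}

/* Buggy variant: multiplies the length, so the bound is far too loose. */
static bool page_within_file_buggy(uint64_t page_index, uint64_t file_len)
{
    return page_index <= (file_len << SKETCH_PAGE_SHIFT);
}
```

For an 8192-byte file, the correct check rejects page index 3, while the buggy variant accepts it because the bound has grown by a factor of the page size squared.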
     

27 May, 2014

2 commits

  • If UBIFS_DEBUG is defined, an additional assertion on the ui_lock
    spinlock in do_writepage fails to compile because the ui pointer has not
    been declared.

    Fix this by declaring and initializing the ui pointer when
    UBIFS_DEBUG is defined.

    Signed-off-by: Daniel Golle
    Signed-off-by: Artem Bityutskiy

    Daniel Golle
     
  • When ubifs_read_one_lp fails, it returns an error and does not
    set @lp. We should not call ubifs_dump_lprop in this case
    because @lp is not initialized.

    Signed-off-by: hujianyang
    Signed-off-by: Artem Bityutskiy

    hujianyang
     

13 May, 2014

2 commits

  • I found two variables in ubifs_budget_space that are declared but never
    used. They have been there since the first commit, 1e5176. So just
    remove them.

    Signed-off-by: hujianyang
    Signed-off-by: Artem Bityutskiy

    hujianyang
     
  • There is a race condition in UBIFS:

    Thread A (mmap)                         Thread B (fsync)

    ->__do_fault                            ->write_cache_pages
      -> ubifs_vm_page_mkwrite
        -> budget_space
        -> lock_page
        -> release/convert_page_budget
        -> SetPagePrivate
        -> TestSetPageDirty
        -> unlock_page
                                              -> lock_page
                                                -> TestClearPageDirty
                                                -> ubifs_writepage
                                                  -> do_writepage
                                                    -> release_budget
                                                    -> ClearPagePrivate
                                                    -> unlock_page
      -> !(ret & VM_FAULT_LOCKED)
      -> lock_page
      -> set_page_dirty
        -> ubifs_set_page_dirty
          -> TestSetPageDirty (set page dirty without budgeting)
      -> unlock_page

    This leads to a situation where we have a dirty page but no budget allocated for
    this page, so further write-back may fail with -ENOSPC.

    In this fix we return from page_mkwrite without performing unlock_page. We
    return VM_FAULT_LOCKED instead. After doing this, the race above will not
    happen.

    Signed-off-by: hujianyang
    Tested-by: Laurence Withers
    Cc: stable@vger.kernel.org
    Signed-off-by: Artem Bityutskiy

    hujianyang
     

07 May, 2014

2 commits


05 May, 2014

1 commit

  • Dan's "smatch" checker found a bug in the error path of the
    'ubifs_remount_rw()' function. Instead of jumping to the "out" label, which
    cleans things up, we just returned.

    This patch fixes the problem.

    Reported-by: Dan Carpenter
    Signed-off-by: Artem Bityutskiy

    Artem Bityutskiy
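The class of bug described here, a bare return that skips a shared cleanup label, can be shown with a generic sketch. `remount_sketch()`, its `fail_step` parameter and `cleanup_runs` are hypothetical and only stand in for the real function's resources and failure points.

```c
#include <assert.h>
#include <stdlib.h>

static int cleanup_runs;  /* counts how often the cleanup path ran */

/* Hypothetical remount-like routine: every failure jumps to "out". */
static int remount_sketch(int fail_step)
{
    int err = 0;
    char *buf = malloc(64);

    if (!buf)
        return -1;   /* nothing acquired yet, so a plain return is fine */

    if (fail_step == 1) {
        err = -2;
        goto out;    /* a bare "return err" here would leak buf */
    }
    if (fail_step == 2) {
        err = -3;
        goto out;
    }

out:
    free(buf);
    cleanup_runs++;
    return err;
}
```

Every path that runs after the allocation passes through `out`, so the buffer is released whether the routine succeeds or fails.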
     

18 Apr, 2014

1 commit

  • Mostly scripted conversion of the smp_mb__* barriers.

    Signed-off-by: Peter Zijlstra
    Acked-by: Paul E. McKenney
    Link: http://lkml.kernel.org/n/tip-55dhyhocezdw1dg7u19hmh1u@git.kernel.org
    Cc: Linus Torvalds
    Cc: linux-arch@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

08 Apr, 2014

1 commit

  • filemap_map_pages() is a generic implementation of ->map_pages() for
    filesystems that use the page cache.

    It should be safe to use filemap_map_pages() for ->map_pages() if
    the filesystem uses filemap_fault() for ->fault().

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Linus Torvalds
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Andi Kleen
    Cc: Matthew Wilcox
    Cc: Dave Hansen
    Cc: Alexander Viro
    Cc: Dave Chinner
    Cc: Ning Qu
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

05 Apr, 2014

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "Major changes for 3.14 include support for the newly added ZERO_RANGE
    and COLLAPSE_RANGE fallocate operations, and scalability improvements
    in the jbd2 layer and in xattr handling when the extended attributes
    spill over into an external block.

    Other than that, the usual clean ups and minor bug fixes"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
    ext4: fix premature freeing of partial clusters split across leaf blocks
    ext4: remove unneeded test of ret variable
    ext4: fix comment typo
    ext4: make ext4_block_zero_page_range static
    ext4: atomically set inode->i_flags in ext4_set_inode_flags()
    ext4: optimize Hurd tests when reading/writing inodes
    ext4: kill i_version support for Hurd-castrated file systems
    ext4: each filesystem creates and uses its own mb_cache
    fs/mbcache.c: doucple the locking of local from global data
    fs/mbcache.c: change block and index hash chain to hlist_bl_node
    ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
    ext4: refactor ext4_fallocate code
    ext4: Update inode i_size after the preallocation
    ext4: fix partial cluster handling for bigalloc file systems
    ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
    ext4: only call sync_filesystm() when remounting read-only
    fs: push sync_filesystem() down to the file system's remount_fs()
    jbd2: improve error messages for inconsistent journal heads
    jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
    jbd2: minimize region locked by j_list_lock in journal_get_create_access()
    ...

    Linus Torvalds
     

04 Apr, 2014

1 commit

  • Reclaim will be leaving shadow entries in the page cache radix tree upon
    evicting the real page. As those pages are found from the LRU, an
    iput() can lead to the inode being freed concurrently. At this point,
    reclaim must no longer install shadow pages because the inode freeing
    code needs to ensure the page tree is really empty.

    Add an address_space flag, AS_EXITING, that the inode freeing code sets
    under the tree lock before doing the final truncate. Reclaim will check
    for this flag before installing shadow pages.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Rik van Riel
    Reviewed-by: Minchan Kim
    Cc: Andrea Arcangeli
    Cc: Bob Liu
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Greg Thelen
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: KOSAKI Motohiro
    Cc: Luigi Semenzato
    Cc: Mel Gorman
    Cc: Metin Doslu
    Cc: Michel Lespinasse
    Cc: Ozgun Erdogan
    Cc: Peter Zijlstra
    Cc: Roman Gushchin
    Cc: Ryan Mallon
    Cc: Tejun Heo
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

13 Mar, 2014

1 commit

  • Previously, the no-op "mount -o remount /dev/xxx" operation when the
    file system is already mounted read-write caused an implied,
    unconditional syncfs(). This seems pretty stupid, and it's certainly
    not documented or guaranteed to do this, nor is it particularly useful,
    except in the case where the file system was mounted rw and is getting
    remounted read-only.

    However, it's possible that there might be some file systems that are
    actually depending on this behavior. In most file systems, it's
    probably fine to only call sync_filesystem() when transitioning from
    read-write to read-only, and there are some file systems where this is
    not needed at all (for example, for a pseudo-filesystem or something
    like romfs).

    Signed-off-by: "Theodore Ts'o"
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Christoph Hellwig
    Cc: Artem Bityutskiy
    Cc: Adrian Hunter
    Cc: Evgeniy Dushistov
    Cc: Jan Kara
    Cc: OGAWA Hirofumi
    Cc: Anders Larsen
    Cc: Phillip Lougher
    Cc: Kees Cook
    Cc: Mikulas Patocka
    Cc: Petr Vandrovec
    Cc: xfs@oss.sgi.com
    Cc: linux-btrfs@vger.kernel.org
    Cc: linux-cifs@vger.kernel.org
    Cc: samba-technical@lists.samba.org
    Cc: codalist@coda.cs.cmu.edu
    Cc: linux-ext4@vger.kernel.org
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: fuse-devel@lists.sourceforge.net
    Cc: cluster-devel@redhat.com
    Cc: linux-mtd@lists.infradead.org
    Cc: jfs-discussion@lists.sourceforge.net
    Cc: linux-nfs@vger.kernel.org
    Cc: linux-nilfs@vger.kernel.org
    Cc: linux-ntfs-dev@lists.sourceforge.net
    Cc: ocfs2-devel@oss.oracle.com
    Cc: reiserfs-devel@vger.kernel.org

    Theodore Ts'o
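The policy suggested in this message, syncing only on a read-write to read-only transition, reduces to a small predicate. This is a sketch of that suggestion, not of what the commit itself changes; `SKETCH_RDONLY` and `needs_sync_on_remount()` are hypothetical names, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_RDONLY 0x1u  /* hypothetical stand-in for a read-only flag */

/* Sync only when a read-write mount is being remounted read-only. */
static bool needs_sync_on_remount(unsigned old_flags, unsigned new_flags)
{
    bool was_rw = !(old_flags & SKETCH_RDONLY);
    bool now_ro = (new_flags & SKETCH_RDONLY) != 0;

    return was_rw && now_ro;
}
```

A no-op remount of a read-write file system returns false here, which is the case the message calls out as not needing a sync.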
     

24 Jan, 2014

1 commit


13 Nov, 2013

2 commits

  • Pull vfs updates from Al Viro:
    "All kinds of stuff this time around; some more notable parts:

    - RCU'd vfsmounts handling
    - new primitives for coredump handling
    - files_lock is gone
    - Bruce's delegations handling series
    - exportfs fixes

    plus misc stuff all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits)
    ecryptfs: ->f_op is never NULL
    locks: break delegations on any attribute modification
    locks: break delegations on link
    locks: break delegations on rename
    locks: helper functions for delegation breaking
    locks: break delegations on unlink
    namei: minor vfs_unlink cleanup
    locks: implement delegations
    locks: introduce new FL_DELEG lock flag
    vfs: take i_mutex on renamed file
    vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
    vfs: don't use PARENT/CHILD lock classes for non-directories
    vfs: pull ext4's double-i_mutex-locking into common code
    exportfs: fix quadratic behavior in filehandle lookup
    exportfs: better variable name
    exportfs: move most of reconnect_path to helper function
    exportfs: eliminate unused "noprogress" counter
    exportfs: stop retrying once we race with rename/remove
    exportfs: clear DISCONNECTED on all parents sooner
    exportfs: more detailed comment for path_reconnect
    ...

    Linus Torvalds
     
  • Pull ubifs changes from Artem Bityutskiy:
    "Mostly fixes for the power cut emulation UBIFS mode, and only one
    functional change which fixes a return error code"

    * tag 'upstream-3.13-rc1' of git://git.infradead.org/linux-ubifs:
    UBIFS: correct data corruption range
    UBIFS: fix return code
    UBIFS: remove unnecessary code in ubifs_garbage_collect

    Linus Torvalds
     

26 Oct, 2013

1 commit

  • With power-cut emulation, it is possible that sometimes no data at all is
    corrupted and that confusing messages are printed due to errors in the
    computation of data corruption range.

    [1] The start of the range should be [0..len-1], not [0..len].
    [2] The end of the range should always be at least 1 greater than the start.

    Signed-off-by: Mats Karrman
    Signed-off-by: Artem Bityutskiy

    Mats Kärrman
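The two constraints listed in this commit can be expressed as a small range picker. This is a sketch under assumptions: `struct range` and `pick_corruption_range()` are hypothetical, the range is treated as half-open, and the standard `rand()` stands in for whatever random source the power-cut emulation actually uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Half-open byte range [from, to) to corrupt inside a buffer. */
struct range {
    unsigned from;
    unsigned to;
};

/* Pick a non-empty range inside a buffer of len bytes (len must be > 0). */
static struct range pick_corruption_range(unsigned len)
{
    struct range r;

    assert(len > 0);
    /* [1] start lies in 0..len-1, never at len itself */
    r.from = (unsigned)rand() % len;
    /* [2] end is at least from + 1 and at most len */
    r.to = r.from + 1 + (unsigned)rand() % (len - r.from);
    return r;
}
```

Because the end is always strictly greater than the start, at least one byte is corrupted on every call.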