04 Apr, 2014

2 commits

  • Ensure that ocfs2_update_inode_fsync_trans() is called any time we touch
    an inode in a given transaction. This is a follow-on to the previous
    patch to reduce lock contention and deadlocking during an fsync
    operation.

    Signed-off-by: Darrick J. Wong
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Wengang
    Cc: Greg Marsden
    Cc: Srinivas Eeda
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Darrick J. Wong
     
  • Currently, ocfs2_sync_file grabs i_mutex and forces the current journal
    transaction to complete. This isn't terribly efficient, since sync_file
    really only needs to wait for the last transaction involving that inode
    to complete, and this doesn't require i_mutex.

    Therefore, implement the necessary bits to track the newest tid
    associated with an inode, and teach sync_file to wait for that instead
    of waiting for everything in the journal to commit. Furthermore, only
    issue the flush request to the drive if jbd2 hasn't already done so.

    This also eliminates the deadlock between ocfs2_file_aio_write() and
    ocfs2_sync_file(). aio_write takes i_mutex then calls
    ocfs2_aiodio_wait() to wait for unaligned dio writes to finish.
    However, if that dio completion involves calling fsync, then we can get
    into trouble when some ocfs2_sync_file tries to take i_mutex.

    Signed-off-by: Darrick J. Wong
    Reviewed-by: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Darrick J. Wong
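
A minimal userspace sketch of the idea in the commit above: remember the newest transaction id (tid) that touched an inode, and have fsync wait only until that tid is committed. Every name here (`toy_journal`, `toy_inode`, `toy_sync_file`, and so on) is hypothetical and only models the behaviour; none of it is the ocfs2 or jbd2 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for journal and inode state; not the kernel's types. */
typedef uint32_t tid_t;

struct toy_journal {
    tid_t running_tid;    /* transaction currently accepting updates */
    tid_t committed_tid;  /* newest transaction that is fully committed */
};

struct toy_inode {
    tid_t sync_tid;       /* newest transaction that touched this inode */
};

/* Wrap-safe "a is newer than b" comparison on transaction ids. */
static bool tid_after(tid_t a, tid_t b)
{
    return (int32_t)(a - b) > 0;
}

/* Call whenever the inode is modified inside the running transaction. */
static void toy_update_inode_fsync_trans(const struct toy_journal *j,
                                         struct toy_inode *inode)
{
    inode->sync_tid = j->running_tid;
}

/* Commit the running transaction and open the next one. */
static void toy_commit(struct toy_journal *j)
{
    j->committed_tid = j->running_tid;
    j->running_tid++;
}

/*
 * fsync: force commits only until the inode's newest transaction is
 * committed. Returns the number of commits that had to be forced.
 */
static int toy_sync_file(struct toy_journal *j, const struct toy_inode *inode)
{
    int forced = 0;

    while (tid_after(inode->sync_tid, j->committed_tid)) {
        toy_commit(j);
        forced++;
    }
    return forced;
}
```

In this model an inode whose last change is already committed costs fsync nothing, and no per-inode mutex is needed to make that decision.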
     

13 Nov, 2013

2 commits


14 Aug, 2013

1 commit

  • Fix a NULL pointer dereference while removing an empty directory, which
    was introduced by commit 3704412bdbf3 ("[readdir] convert ocfs2").

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] (null)
    PGD 6da85067 PUD 6da89067 PMD 0
    Oops: 0010 [#1] SMP
    CPU: 0 PID: 6564 Comm: rmdir Tainted: G O 3.11.0-rc1 #4
    RIP: 0010:[] [< (null)>] (null)
    Call Trace:
    ocfs2_dir_foreach+0x49/0x50 [ocfs2]
    ocfs2_empty_dir+0x12c/0x3e0 [ocfs2]
    ocfs2_unlink+0x56e/0xc10 [ocfs2]
    vfs_rmdir+0xd5/0x140
    do_rmdir+0x1cb/0x1e0
    SyS_rmdir+0x16/0x20
    system_call_fastpath+0x16/0x1b
    Code: Bad RIP value.
    RIP [< (null)>] (null)
    RSP
    CR2: 0000000000000000

    [dan.carpenter@oracle.com: fix pointer math]
    Signed-off-by: Jie Liu
    Reported-by: David Weber
    Cc: Al Viro
    Cc: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Liu
     

29 Jun, 2013

1 commit


27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

26 Feb, 2013

1 commit


23 Feb, 2013

1 commit


21 Jan, 2013

1 commit

  • This macro, initially introduced by ext2 in v0.99.15, has never had
    any users. It was removed from later ext2 versions but still
    remains in the code of ext3, ext4 and ocfs2.
    Remove this macro there.

    Cc: Jan Kara
    Cc: linux-ext4@vger.kernel.org
    Cc: ocfs2-devel@oss.oracle.com
    Acked-by: Mark Fasheh
    Acked-by: "Theodore Ts'o"
    Signed-off-by: Guo Chao
    Signed-off-by: Jan Kara

    Guo Chao
     

02 Dec, 2011

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (31 commits)
    ocfs2: avoid unaligned access to dqc_bitmap
    ocfs2: Use filemap_write_and_wait() instead of write_inode_now()
    ocfs2: honor O_(D)SYNC flag in fallocate
    ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2
    ocfs2: send correct UUID to cleancache initialization
    ocfs2: Commit transactions in error cases -v2
    ocfs2: make direntry invalid when deleting it
    fs/ocfs2/dlm/dlmlock.c: free kmem_cache_zalloc'd data using kmem_cache_free
    ocfs2: Avoid livelock in ocfs2_readpage()
    ocfs2: serialize unaligned aio
    ocfs2: Implement llseek()
    ocfs2: Fix ocfs2_page_mkwrite()
    ocfs2: Add comment about orphan scanning
    ocfs2: Clean up messages in the fs
    ocfs2/cluster: Cluster up now includes network connections too
    ocfs2/cluster: Add new function o2net_fill_node_map()
    ocfs2/cluster: Fix output in file elapsed_time_in_ms
    ocfs2/dlm: dlmlock_remote() needs to account for remastery
    ocfs2/dlm: Take inflight reference count for remotely mastered resources too
    ocfs2/dlm: Cleanup dlm_wait_for_node_death() and dlm_wait_for_node_recovery()
    ...

    Linus Torvalds
     

17 Nov, 2011

1 commit

  • When deleting a direntry from a directory, if it is the first entry in a
    block we invalidate it by setting inode to 0; otherwise, we merge the deleted
    entry into the prior, contiguous direntry. We don't truncate directories.

    The latter case is a problem because inode is not set to 0.
    The problem occurs when the caller passes a file position as a parameter to
    ocfs2_dir_foreach_blk(). If the position happens to point to a stale direntry
    (not the first in its block, deleted between calls to ocfs2_dir_foreach_blk()),
    we cannot recognize its staleness and wrongly treat it as a live entry.

    The fix is to set inode to 0 in both cases, indicating that the direntry is
    stale. This doesn't introduce additional I/O.

    Signed-off-by: Wengang Wang
    Signed-off-by: Joel Becker

    Wengang Wang
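
The deletion rule described above can be sketched in a few lines of C. This is a simplified model under stated assumptions: `toy_dirent` is hypothetical, `rec_len` counts array slots rather than bytes, and the helpers are not the ocfs2 functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simplified direntry: rec_len counts slots, not bytes. */
struct toy_dirent {
    uint64_t inode;    /* 0 means "not a live entry" */
    uint16_t rec_len;  /* distance to the next entry, in slots */
};

/*
 * Delete the entry at slot idx. prev is the slot of the preceding entry
 * in the same block, or -1 if idx is the first entry in the block.
 */
static void toy_delete_entry(struct toy_dirent *blk, int idx, int prev)
{
    if (prev >= 0)
        blk[prev].rec_len += blk[idx].rec_len;  /* merge into predecessor */

    /* The fix: mark the entry stale in both cases, not only the first. */
    blk[idx].inode = 0;
}

/* A reader resuming at a saved position can now detect a stale entry. */
static int toy_entry_is_live(const struct toy_dirent *blk, int pos)
{
    return blk[pos].inode != 0;
}
```

Because the stale entry lives in a block that is already being modified for the merge, zeroing its inode field in this model touches no extra block, which matches the "no additional I/O" remark in the commit message.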
     

02 Nov, 2011

1 commit


14 May, 2011

1 commit

  • Clang found a path on which data_ac is left uninitialized, at
    this place:

    2917 /* This gets us the dx_root */
    2918 ret = ocfs2_reserve_new_metadata_blocks(osb, 1, &meta_ac);
    2919 if (ret) {
         [3] Taking true branch
    2920 mlog_errno(ret);
    2921 goto out;
         [4] Control jumps to line 3168
    2922 }

    Control reaches the out: label without data_ac being initialized.

    Ciao, Marcus

    Signed-Off-By: Marcus Meissner
    Signed-off-by: Mark Fasheh
    Signed-off-by: Joel Becker

    Marcus Meissner
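
The class of bug reported above, and the usual remedy, can be shown with a small userspace sketch. It assumes the remedy is to initialise both context pointers before any early `goto out`; the commit message does not spell out the actual change, and all `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an allocation context. */
struct toy_alloc_context {
    int reserved;
};

static int toy_frees;  /* counts contexts released on the exit path */

static void toy_free_alloc_context(struct toy_alloc_context *ac)
{
    if (ac) {
        toy_frees++;
        free(ac);
    }
}

/* Reserve a context; fail != 0 simulates a reservation error. */
static int toy_reserve(struct toy_alloc_context **ac, int fail)
{
    if (fail)
        return -1;
    *ac = calloc(1, sizeof(**ac));
    return *ac ? 0 : -1;
}

static int toy_expand(int fail_meta)
{
    /* Initialise both pointers before any path can reach "out". */
    struct toy_alloc_context *data_ac = NULL;
    struct toy_alloc_context *meta_ac = NULL;
    int ret;

    ret = toy_reserve(&meta_ac, fail_meta);
    if (ret)
        goto out;  /* data_ac has not been reserved yet */

    ret = toy_reserve(&data_ac, 0);
out:
    /* Safe on every path because both pointers start out as NULL. */
    toy_free_alloc_context(data_ac);
    toy_free_alloc_context(meta_ac);
    return ret;
}
```

Without the `= NULL` initialisers, the early `goto out` would hand an indeterminate pointer to the cleanup code, which is the path the analyzer flagged.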
     

29 Mar, 2011

2 commits

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (39 commits)
    Treat writes as new when holes span across page boundaries
    fs,ocfs2: Move o2net_get_func_run_time under CONFIG_OCFS2_FS_STATS.
    ocfs2/dlm: Move kmalloc() outside the spinlock
    ocfs2: Make the left masklogs compat.
    ocfs2: Remove masklog ML_AIO.
    ocfs2: Remove masklog ML_UPTODATE.
    ocfs2: Remove masklog ML_BH_IO.
    ocfs2: Remove masklog ML_JOURNAL.
    ocfs2: Remove masklog ML_EXPORT.
    ocfs2: Remove masklog ML_DCACHE.
    ocfs2: Remove masklog ML_NAMEI.
    ocfs2: Remove mlog(0) from fs/ocfs2/dir.c
    ocfs2: remove NAMEI from symlink.c
    ocfs2: Remove masklog ML_QUOTA.
    ocfs2: Remove mlog(0) from quota_local.c.
    ocfs2: Remove masklog ML_RESERVATIONS.
    ocfs2: Remove masklog ML_XATTR.
    ocfs2: Remove masklog ML_SUPER.
    ocfs2: Remove mlog(0) from fs/ocfs2/heartbeat.c
    ocfs2: Remove mlog(0) from fs/ocfs2/slot_map.c
    ...

    Fix up trivial conflict in fs/ocfs2/super.c

    Linus Torvalds
     
  • Joel Becker
     

07 Mar, 2011

1 commit

  • mlog_exit is used to record the exit status of a function.
    But because it is added to so many functions, enabling it fills
    the system logs quickly and causes too much I/O.
    So in practice no one can enable it on a production system, or even
    on a test system.

    This patch removes or replaces it:
    1. If all the error paths already use mlog_errno, it is simply removed.
    Otherwise, it is replaced by mlog_errno.
    2. If it is used to print some return value, it is replaced with
    mlog(0,...).
    mlog_exit_ptr is changed to mlog(0,...).
    All those mlog(0,...) calls will be replaced with trace events later.

    Signed-off-by: Tao Ma

    Tao Ma
     

23 Feb, 2011

1 commit


21 Feb, 2011

1 commit

  • ENTRY is used to record the entry of a function.
    But because it is added to so many functions, enabling it fills
    the system logs quickly and causes too much I/O.
    So in practice no one can enable it on a production system, or even
    on a test system.

    So mlog_entry_void is simply removed, and mlog_entry(...) is
    replaced with mlog(0,...); those calls will be replaced by trace
    events later.

    Signed-off-by: Tao Ma

    Tao Ma
     

20 Feb, 2011

1 commit

  • In cad3f00, ext4_check_dir_entry was modified by adding some unlikely()
    annotations. Ted described it as "This function gets called a lot for large
    directories, and the answer is almost always 'no, no, there's no problem'.
    This means using unlikely() is a good thing."
    ext3 made a similar change in commit a4ae309.

    So make the same change in ocfs2.

    Cc: Joel Becker
    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
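
For readers unfamiliar with the annotation, here is a hedged userspace sketch of a sanity check written this way. `toy_unlikely` mirrors the usual GCC/Clang `__builtin_expect` idiom; the struct, the constant and the specific checks are simplified assumptions, not the ocfs2 directory-entry check.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace analogue of the kernel's unlikely(); a GCC/Clang builtin. */
#define toy_unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical, much-simplified directory entry. */
struct toy_dirent {
    unsigned short rec_len;
    unsigned char name_len;
};

#define TOY_MIN_REC_LEN 12  /* illustrative value only */

/* Returns 1 if the entry looks sane, 0 otherwise. */
static int toy_check_dir_entry(const struct toy_dirent *de, size_t space_left)
{
    const char *error = NULL;

    /* Each failure branch is hinted as rare, so the hot path falls through. */
    if (toy_unlikely(de->rec_len < TOY_MIN_REC_LEN))
        error = "rec_len is smaller than minimal";
    else if (toy_unlikely(de->rec_len % 4 != 0))
        error = "rec_len % 4 != 0";
    else if (toy_unlikely((size_t)de->rec_len > space_left))
        error = "directory entry across blocks";

    return error == NULL;
}
```

The hint changes only how the compiler lays out the branches; the result of the check is the same with or without it.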
     

19 Jan, 2011

1 commit

  • Fix a bunch of
    warning: ‘inline’ is not at beginning of declaration
    messages when building a 'make allyesconfig' kernel with -Wextra.

    These warnings are trivial to kill, yet rather annoying when building with
    -Wextra.
    The more we can cut down on pointless crap like this the better (IMHO).

    A previous patch to do this for a 'allnoconfig' build has already been
    merged. This just takes the cleanup a little further.

    Signed-off-by: Jesper Juhl
    Signed-off-by: Jiri Kosina

    Jesper Juhl
     

16 Dec, 2010

1 commit


11 Sep, 2010

1 commit


19 May, 2010

2 commits

  • Joel Becker
     
  • Truncate is just a special case of punching holes (from the new i_size
    to the end), so we can take advantage of the existing
    ocfs2_remove_btree_range() to reduce the complexity and redundancy in
    alloc.c. The goal here is to make truncate more generic and
    straightforward.

    Several functions only used by ocfs2_commit_truncate() will simply be
    removed.

    ocfs2_remove_btree_range() was originally used by the hole punching
    code, which didn't take refcount trees into account (definitely a bug).
    We therefore need to change that func a bit to handle refcount trees.
    It must take the refcount lock, calculate and reserve blocks for
    refcount tree changes, and decrease refcounts at the end. We replace
    ocfs2_lock_allocators() here by adding a new func
    ocfs2_reserve_blocks_for_rec_trunc() which accepts some extra blocks to
    reserve. This will not hurt any other code using
    ocfs2_remove_btree_range() (such as dir truncate and hole punching).

    I merged the following steps into one patch since they may be
    logically doing one thing, though I know it looks a little bit fat
    to review.

    1). Remove redundant code used by ocfs2_commit_truncate(), since we're
    moving to ocfs2_remove_btree_range anyway.

    2). Add a new func ocfs2_reserve_blocks_for_rec_trunc() for purpose of
    accepting some extra blocks to reserve.

    3). Change ocfs2_prepare_refcount_change_for_del() a bit to fit our
    needs. It's safe to do this since it's only being called by
    truncate.

    4). Change ocfs2_remove_btree_range() a bit to take refcount case into
    account.

    5). Finally, we change ocfs2_commit_truncate() to call
    ocfs2_remove_btree_range() in a proper way.

    The patch has passed basic sanity tests; stress tests with heavier
    workloads are still to be run.

    Based on this patch, fixing the punching holes bug will be fairly easy.

    Signed-off-by: Tristan Ye
    Acked-by: Mark Fasheh
    Signed-off-by: Joel Becker

    Tristan Ye
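
The relationship the commit relies on, truncate being range removal from the new size to the old end, can be sketched as follows. This is a toy model that tracks clusters in a flat array; it ignores extent trees, refcount trees and journaling, and every `toy_*` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CLUSTERS 16

struct toy_file {
    unsigned char allocated[TOY_MAX_CLUSTERS];  /* 1 = cluster is mapped */
    size_t clusters;                            /* logical size in clusters */
};

/* Generic range removal, playing the role of a remove-range helper. */
static size_t toy_remove_range(struct toy_file *f, size_t start, size_t len)
{
    size_t freed = 0;

    for (size_t i = start; i < start + len && i < TOY_MAX_CLUSTERS; i++) {
        freed += f->allocated[i];
        f->allocated[i] = 0;
    }
    return freed;
}

/* Truncate = remove [new_clusters, old end), then shrink the size. */
static size_t toy_truncate(struct toy_file *f, size_t new_clusters)
{
    size_t freed;

    if (new_clusters >= f->clusters)
        return 0;

    freed = toy_remove_range(f, new_clusters, f->clusters - new_clusters);
    f->clusters = new_clusters;
    return freed;
}
```

Hole punching in this model is a direct call to `toy_remove_range()` that leaves `clusters` unchanged, so both operations share one removal path.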
     

06 May, 2010

4 commits

  • The default behavior for directory reservations stays the same, but we add a
    mount option so people can tweak the size of directory reservations
    according to their workloads.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Joel Becker

    Mark Fasheh
     
  • Use the reservations system for unindexed dir tree allocations. We don't
    bother with the indexed tree as reads from it are mostly random anyway.
    Directory reservations are marked separately, to allow the reservations code
    a chance to optimize their window sizes. This patch allocates only 8 bits
    for directory windows as they generally are not expected to grow as quickly
    as file data. Future improvements to dir window sizing can trivially be
    made.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     
  • jbd[2]_journal_dirty_metadata() only returns 0. It's been returning 0
    since before the kernel moved to git. There is no point in checking
    this error.

    ocfs2_journal_dirty() has been faithfully returning the status since the
    beginning. All over ocfs2, we have blocks of code checking this
    can't-fail status. In the past few years, we've tried to avoid adding these
    checks, because they are pointless. But anyone who looks at our code
    assumes they are needed.

    Finally, ocfs2_journal_dirty() is made a void function. All error
    checking is removed from other files. We'll BUG_ON() the status of
    jbd2_journal_dirty_metadata() just in case they change it someday. They
    won't.

    Signed-off-by: Joel Becker

    Joel Becker
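
A small sketch of the pattern described above: wrap a call that can only return 0 in a void function, and trap loudly if that ever stops being true. Here `assert()` stands in for `BUG_ON()`, and `toy_journal_dirty_metadata()` is a hypothetical stand-in for the jbd2 call, not its real signature.

```c
#include <assert.h>

static int toy_dirty_calls;  /* how many buffers have been marked dirty */

/* Hypothetical stand-in for the journal call; it always returns 0. */
static int toy_journal_dirty_metadata(int *bh_dirty)
{
    *bh_dirty = 1;
    toy_dirty_calls++;
    return 0;
}

/* void wrapper: callers have no status to check or propagate. */
static void toy_journal_dirty(int *bh_dirty)
{
    int status = toy_journal_dirty_metadata(bh_dirty);

    /* Plays the role of BUG_ON(status): trap if the contract ever changes. */
    assert(status == 0);
    (void)status;  /* keep builds with NDEBUG warning-free */
}
```

Call sites shrink from a call plus an error branch to a single statement, which is the cleanup the commit describes.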
     
  • They all take an ocfs2_alloc_context, which has the allocation inode.

    Signed-off-by: Joel Becker
    Signed-off-by: Tao Ma

    Joel Becker
     

26 Mar, 2010

1 commit


22 Mar, 2010

1 commit


06 Mar, 2010

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
    quota: stop using QUOTA_OK / NO_QUOTA
    dquot: cleanup dquot initialize routine
    dquot: move dquot initialization responsibility into the filesystem
    dquot: cleanup dquot drop routine
    dquot: move dquot drop responsibility into the filesystem
    dquot: cleanup dquot transfer routine
    dquot: move dquot transfer responsibility into the filesystem
    dquot: cleanup inode allocation / freeing routines
    dquot: cleanup space allocation / freeing routines
    ext3: add writepage sanity checks
    ext3: Truncate allocated blocks if direct IO write fails to update i_size
    quota: Properly invalidate caches even for filesystems with blocksize < pagesize
    quota: generalize quota transfer interface
    quota: sb_quota state flags cleanup
    jbd: Delay discarding buffers in journal_unmap_buffer
    ext3: quota_write cross block boundary behaviour
    quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
    quota: split out compat_sys_quotactl support from quota.c
    quota: split out netlink notification support from quota.c
    quota: remove invalid optimization from quota_sync_all
    ...

    Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c

    Linus Torvalds
     

05 Mar, 2010

1 commit

  • Get rid of the alloc_space, free_space, reserve_space, claim_space and
    release_rsv dquot operations - they are always called from the filesystem
    and if a filesystem really needs its own (which none currently does)
    it can just call into its own routine directly.

    Move shared logic into the common __dquot_alloc_space,
    dquot_claim_space_nodirty and __dquot_free_space low-level methods,
    and rationalize the wrappers around it to move as much code as
    possible into the common block for CONFIG_QUOTA vs not. Also rename
    all these helpers to be named dquot_* instead of vfs_dq_*.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     

27 Feb, 2010

1 commit

  • This patch adds an extent block (metadata) stealing mechanism for
    extent allocation. The mechanism is the same as inode stealing:
    if there is no room in the slot-specific extent_alloc, we try to
    allocate an extent block from the next slot.

    Signed-off-by: Tiger Yang
    Acked-by: Tao Ma
    Signed-off-by: Joel Becker

    Tiger Yang
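
The fallback order described above can be sketched like this. It is a toy model that assumes each slot simply has a count of free extent blocks and that "next slot" means the following slot number, wrapping around; the real allocator is considerably more involved, and all names are hypothetical.

```c
#include <assert.h>

#define TOY_SLOTS 3

/* Free extent blocks per slot; a stand-in for per-slot extent_alloc files. */
struct toy_cluster {
    int free_blocks[TOY_SLOTS];
};

/*
 * Allocate one extent block for the node using my_slot.
 * Returns the slot the block came from, or -1 if every slot is empty.
 */
static int toy_alloc_extent_block(struct toy_cluster *c, int my_slot)
{
    for (int i = 0; i < TOY_SLOTS; i++) {
        /* Own slot first (i == 0), then steal from the following slots. */
        int slot = (my_slot + i) % TOY_SLOTS;

        if (c->free_blocks[slot] > 0) {
            c->free_blocks[slot]--;
            return slot;
        }
    }
    return -1;
}
```

A caller can compare the returned slot with its own to tell a local allocation from a stolen one.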
     

05 Sep, 2009

6 commits