04 Apr, 2014
2 commits
-
Ensure that ocfs2_update_inode_fsync_trans() is called any time we touch
an inode in a given transaction. This is a follow-on to the previous
patch to reduce lock contention and deadlocking during an fsync
operation.Signed-off-by: Darrick J. Wong
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Wengang
Cc: Greg Marsden
Cc: Srinivas Eeda
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Currently, ocfs2_sync_file grabs i_mutex and forces the current journal
transaction to complete. This isn't terribly efficient, since sync_file
really only needs to wait for the last transaction involving that inode
to complete, and this doesn't require i_mutex.Therefore, implement the necessary bits to track the newest tid
associated with an inode, and teach sync_file to wait for that instead
of waiting for everything in the journal to commit. Furthermore, only
issue the flush request to the drive if jbd2 hasn't already done so.This also eliminates the deadlock between ocfs2_file_aio_write() and
ocfs2_sync_file(). aio_write takes i_mutex then calls
ocfs2_aiodio_wait() to wait for unaligned dio writes to finish.
However, if that dio completion involves calling fsync, then we can get
into trouble when some ocfs2_sync_file tries to take i_mutex.Signed-off-by: Darrick J. Wong
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
13 Nov, 2013
2 commits
-
The only reason for sb_getblk() failing is if it can't allocate the
buffer_head. So return ENOMEM instead when it fails.[joseph.qi@huawei.com: ocfs2_symlink_get_block() and ocfs2_read_blocks_sync() and ocfs2_read_blocks() need the same change]
Signed-off-by: Rui Xiang
Reviewed-by: Jie Liu
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Cc: Joseph Qi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Code cleanup to remove unnecessary variable passed but never used
to ocfs2_calc_extend_credits.Signed-off-by: Goldwyn Rodrigues
Cc: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Aug, 2013
1 commit
-
Fix a NULL pointer deference while removing an empty directory, which
was introduced by commit 3704412bdbf3 ("[readdir] convert ocfs2").BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [] (null)
PGD 6da85067 PUD 6da89067 PMD 0
Oops: 0010 [#1] SMP
CPU: 0 PID: 6564 Comm: rmdir Tainted: G O 3.11.0-rc1 #4
RIP: 0010:[] [< (null)>] (null)
Call Trace:
ocfs2_dir_foreach+0x49/0x50 [ocfs2]
ocfs2_empty_dir+0x12c/0x3e0 [ocfs2]
ocfs2_unlink+0x56e/0xc10 [ocfs2]
vfs_rmdir+0xd5/0x140
do_rmdir+0x1cb/0x1e0
SyS_rmdir+0x16/0x20
system_call_fastpath+0x16/0x1b
Code: Bad RIP value.
RIP [< (null)>] (null)
RSP
CR2: 0000000000000000[dan.carpenter@oracle.com: fix pointer math]
Signed-off-by: Jie Liu
Reported-by: David Weber
Cc: Al Viro
Cc: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
29 Jun, 2013
1 commit
-
Signed-off-by: Al Viro
27 Feb, 2013
1 commit
-
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
26 Feb, 2013
1 commit
-
very few users left...
Signed-off-by: Al Viro
23 Feb, 2013
1 commit
-
Signed-off-by: Al Viro
21 Jan, 2013
1 commit
-
This macro, initially introduced by ext2 in v0.99.15, does not
have any users from the beginning. It has been removed in later
ext2 version but still remains in the code of ext3, ext4, ocfs2.
Remove this macro there.Cc: Jan Kara
Cc: linux-ext4@vger.kernel.org
Cc: ocfs2-devel@oss.oracle.com
Acked-by: Mark Fasheh
Acked-by: "Theodore Ts'o"
Signed-off-by: Guo Chao
Signed-off-by: Jan Kara
02 Dec, 2011
1 commit
-
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (31 commits)
ocfs2: avoid unaligned access to dqc_bitmap
ocfs2: Use filemap_write_and_wait() instead of write_inode_now()
ocfs2: honor O_(D)SYNC flag in fallocate
ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2
ocfs2: send correct UUID to cleancache initialization
ocfs2: Commit transactions in error cases -v2
ocfs2: make direntry invalid when deleting it
fs/ocfs2/dlm/dlmlock.c: free kmem_cache_zalloc'd data using kmem_cache_free
ocfs2: Avoid livelock in ocfs2_readpage()
ocfs2: serialize unaligned aio
ocfs2: Implement llseek()
ocfs2: Fix ocfs2_page_mkwrite()
ocfs2: Add comment about orphan scanning
ocfs2: Clean up messages in the fs
ocfs2/cluster: Cluster up now includes network connections too
ocfs2/cluster: Add new function o2net_fill_node_map()
ocfs2/cluster: Fix output in file elapsed_time_in_ms
ocfs2/dlm: dlmlock_remote() needs to account for remastery
ocfs2/dlm: Take inflight reference count for remotely mastered resources too
ocfs2/dlm: Cleanup dlm_wait_for_node_death() and dlm_wait_for_node_recovery()
...
17 Nov, 2011
1 commit
-
When we deleting a direntry from a directory, if it's the first in a block we
invalid it by setting inode to 0; otherwise, we merge the deleted one to the
prior and contiguous direntry. And we don't truncate directories.There is a problem for the later case since inode is not set to 0.
This problem happens when the caller passes a file position as parameter to
ocfs2_dir_foreach_blk(). If the position happens to point to a stale(not
the first, deleted in betweens of ocfs2_dir_foreach_blk()s) direntry, we are
not able to recognize its staleness. So that we treat it as a live one wrongly.The fix is to set inode to 0 in both cases indicating the direntry is stale.
This won't introduce additional IOs.Signed-off-by: Wengang Wang
Signed-off-by: Joel Becker
02 Nov, 2011
1 commit
-
Replace remaining direct i_nlink updates with a new set_nlink()
updater function.Signed-off-by: Miklos Szeredi
Tested-by: Toshiyuki Okajima
Signed-off-by: Christoph Hellwig
14 May, 2011
1 commit
-
CLANG found that there is a path that has data_ac uninitialized,
this place
2917 /* This gets us the dx_root */
2918 ret = ocfs2_reserve_new_metadata_blocks(osb, 1, &meta_ac);
2919 if (ret) {3
Taking true branch
2920 mlog_errno(ret);
2921 goto out;4
Control jumps to line 3168
2922 }Goes to the out: label without data_ac being initialized.
Ciao, Marcus
Signed-Off-By: Marcus Meissner
Signed-off-by: Mark Fasheh
Signed-off-by: Joel Becker
29 Mar, 2011
2 commits
-
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (39 commits)
Treat writes as new when holes span across page boundaries
fs,ocfs2: Move o2net_get_func_run_time under CONFIG_OCFS2_FS_STATS.
ocfs2/dlm: Move kmalloc() outside the spinlock
ocfs2: Make the left masklogs compat.
ocfs2: Remove masklog ML_AIO.
ocfs2: Remove masklog ML_UPTODATE.
ocfs2: Remove masklog ML_BH_IO.
ocfs2: Remove masklog ML_JOURNAL.
ocfs2: Remove masklog ML_EXPORT.
ocfs2: Remove masklog ML_DCACHE.
ocfs2: Remove masklog ML_NAMEI.
ocfs2: Remove mlog(0) from fs/ocfs2/dir.c
ocfs2: remove NAMEI from symlink.c
ocfs2: Remove masklog ML_QUOTA.
ocfs2: Remove mlog(0) from quota_local.c.
ocfs2: Remove masklog ML_RESERVATIONS.
ocfs2: Remove masklog ML_XATTR.
ocfs2: Remove masklog ML_SUPER.
ocfs2: Remove mlog(0) from fs/ocfs2/heartbeat.c
ocfs2: Remove mlog(0) from fs/ocfs2/slot_map.c
...Fix up trivial conflict in fs/ocfs2/super.c
07 Mar, 2011
1 commit
-
mlog_exit is used to record the exit status of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So actually no one can open it for a production system or even
for a test.This patch just try to remove it or change it. So:
1. if all the error paths already use mlog_errno, it is just removed.
Otherwise, it will be replaced by mlog_errno.
2. if it is used to print some return value, it is replaced with
mlog(0,...).
mlog_exit_ptr is changed to mlog(0.
All those mlog(0,...) will be replaced with trace events later.Signed-off-by: Tao Ma
23 Feb, 2011
1 commit
-
This is the 2nd step to remove the debug info of NAMEI.
Signed-off-by: Tao Ma
21 Feb, 2011
1 commit
-
ENTRY is used to record the entry of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So actually no one can open it for a production system or even
for a test.So for mlog_entry_void, we just remove it.
for mlog_entry(...), we replace it with mlog(0,...), and they
will be replace by trace event later.Signed-off-by: Tao Ma
20 Feb, 2011
1 commit
-
In cad3f00, ext4_check_dir_entry was modified by adding some unlikely.
Ted described it as "This function gets called a lot for large
directories, and the answer is almost always 'no, no, there's no problem'.
This means using unlikely() is a good thing."
ext3 added the similar change in commit a4ae309.So change it accordingly in ocfs2.
Cc: Joel Becker
Signed-off-by: Tao Ma
Signed-off-by: Joel Becker
19 Jan, 2011
1 commit
-
Fix a bunch of
warning: ‘inline’ is not at beginning of declaration
messages when building a 'make allyesconfig' kernel with -Wextra.These warnings are trivial to kill, yet rather annoying when building with
-Wextra.
The more we can cut down on pointless crap like this the better (IMHO).A previous patch to do this for a 'allnoconfig' build has already been
merged. This just takes the cleanup a little further.Signed-off-by: Jesper Juhl
Signed-off-by: Jiri Kosina
16 Dec, 2010
1 commit
-
When we set/clear the dyn_features for an inode we hold the ip_lock.
So do it when we set/clear OCFS2_INDEXED_DIR_FL also.Signed-off-by: Tao Ma
Signed-off-by: Joel Becker
11 Sep, 2010
1 commit
-
In ocfs2_dx_dir_rebalance(), we need to rejournal_acess the blocks after
calling ocfs2_insert_extent() since growing an extent tree may trigger
ocfs2_extend_trans(), which makes previous journal_access meaningless.Signed-off-by: Tristan Ye
Signed-off-by: Joel Becker
19 May, 2010
2 commits
-
Truncate is just a special case of punching holes(from new i_size to
end), we therefore could take advantage of the existing
ocfs2_remove_btree_range() to reduce the comlexity and redundancy in
alloc.c. The goal here is to make truncate more generic and
straightforward.Several functions only used by ocfs2_commit_truncate() will smiply be
removed.ocfs2_remove_btree_range() was originally used by the hole punching
code, which didn't take refcount trees into account (definitely a bug).
We therefore need to change that func a bit to handle refcount trees.
It must take the refcount lock, calculate and reserve blocks for
refcount tree changes, and decrease refcounts at the end. We replace
ocfs2_lock_allocators() here by adding a new func
ocfs2_reserve_blocks_for_rec_trunc() which accepts some extra blocks to
reserve. This will not hurt any other code using
ocfs2_remove_btree_range() (such as dir truncate and hole punching).I merged the following steps into one patch since they may be
logically doing one thing, though I know it looks a little bit fat
to review.1). Remove redundant code used by ocfs2_commit_truncate(), since we're
moving to ocfs2_remove_btree_range anyway.2). Add a new func ocfs2_reserve_blocks_for_rec_trunc() for purpose of
accepting some extra blocks to reserve.3). Change ocfs2_prepare_refcount_change_for_del() a bit to fit our
needs. It's safe to do this since it's only being called by
truncate.4). Change ocfs2_remove_btree_range() a bit to take refcount case into
account.5). Finally, we change ocfs2_commit_truncate() to call
ocfs2_remove_btree_range() in a proper way.The patch has been tested normally for sanity check, stress tests
with heavier workload will be expected.Based on this patch, fixing the punching holes bug will be fairly easy.
Signed-off-by: Tristan Ye
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker
06 May, 2010
4 commits
-
The default behavior for directory reservations stays the same, but we add a
mount option so people can tweak the size of directory reservations
according to their workloads.Signed-off-by: Mark Fasheh
Signed-off-by: Joel Becker -
Use the reservations system for unindexed dir tree allocations. We don't
bother with the indexed tree as reads from it are mostly random anyway.
Directory reservations are marked seperately, to allow the reservations code
a chance to optimize their window sizes. This patch allocates only 8 bits
for directory windows as they generally are not expected to grow as quickly
as file data. Future improvements to dir window sizing can trivially be
made.Signed-off-by: Mark Fasheh
-
jbd[2]_journal_dirty_metadata() only returns 0. It's been returning 0
since before the kernel moved to git. There is no point in checking
this error.ocfs2_journal_dirty() has been faithfully returning the status since the
beginning. All over ocfs2, we have blocks of code checking this can't
fail status. In the past few years, we've tried to avoid adding these
checks, because they are pointless. But anyone who looks at our code
assumes they are needed.Finally, ocfs2_journal_dirty() is made a void function. All error
checking is removed from other files. We'll BUG_ON() the status of
jbd2_journal_dirty_metadata() just in case they change it someday. They
won't.Signed-off-by: Joel Becker
-
They all take an ocfs2_alloc_context, which has the allocation inode.
Signed-off-by: Joel Becker
Signed-off-by: Tao Ma
26 Mar, 2010
1 commit
-
Get the suballoc_loc from ocfs2_claim_new_inode() or
ocfs2_claim_metadata(). Store it on the appropriate field of the block
we just allocated.Signed-off-by: Joel Becker
22 Mar, 2010
1 commit
-
In case the block we are going to free is allocated from
a discontiguous block group, we have to use suballoc_loc
to be the right group.Signed-off-by: Tao Ma
06 Mar, 2010
1 commit
-
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (33 commits)
quota: stop using QUOTA_OK / NO_QUOTA
dquot: cleanup dquot initialize routine
dquot: move dquot initialization responsibility into the filesystem
dquot: cleanup dquot drop routine
dquot: move dquot drop responsibility into the filesystem
dquot: cleanup dquot transfer routine
dquot: move dquot transfer responsibility into the filesystem
dquot: cleanup inode allocation / freeing routines
dquot: cleanup space allocation / freeing routines
ext3: add writepage sanity checks
ext3: Truncate allocated blocks if direct IO write fails to update i_size
quota: Properly invalidate caches even for filesystems with blocksize < pagesize
quota: generalize quota transfer interface
quota: sb_quota state flags cleanup
jbd: Delay discarding buffers in journal_unmap_buffer
ext3: quota_write cross block boundary behaviour
quota: drop permission checks from xfs_fs_set_xstate/xfs_fs_set_xquota
quota: split out compat_sys_quotactl support from quota.c
quota: split out netlink notification support from quota.c
quota: remove invalid optimization from quota_sync_all
...Fixed trivial conflicts in fs/namei.c and fs/ufs/inode.c
05 Mar, 2010
1 commit
-
Get rid of the alloc_space, free_space, reserve_space, claim_space and
release_rsv dquot operations - they are always called from the filesystem
and if a filesystem really needs their own (which none currently does)
it can just call into it's own routine directly.Move shared logic into the common __dquot_alloc_space,
dquot_claim_space_nodirty and __dquot_free_space low-level methods,
and rationalize the wrappers around it to move as much as possible
code into the common block for CONFIG_QUOTA vs not. Also rename
all these helpers to be named dquot_* instead of vfs_dq_*.Signed-off-by: Christoph Hellwig
Signed-off-by: Jan Kara
27 Feb, 2010
1 commit
-
This patch add extent block (metadata) stealing mechanism for
extent allocation. This mechanism is same as the inode stealing.
if no room in slot specific extent_alloc, we will try to
allocate extent block from the next slot.Signed-off-by: Tiger Yang
Acked-by: Tao Ma
Signed-off-by: Joel Becker
05 Sep, 2009
6 commits
-
With this commit, extent tree operations are divorced from inodes and
rely on ocfs2_caching_info. Phew!Signed-off-by: Joel Becker
-
One more function down, no inode in the entire insert-extent chain.
Signed-off-by: Joel Becker
-
ocfs2_find_path and ocfs2_find_leaf() walk our btrees, reading extent
blocks. They need struct ocfs2_caching_info for that, but not struct
inode.Signed-off-by: Joel Becker
-
extent blocks belong to btrees on more than just inodes, so we want to
pass the ocfs2_caching_info structure directly to
ocfs2_read_extent_block(). A number of places in alloc.c can now drop
struct inode from their argument list.Signed-off-by: Joel Becker
-
The next step in divorcing metadata I/O management from struct inode is
to pass struct ocfs2_caching_info to the journal functions. Thus the
journal locks a metadata cache with the cache io_lock function. It also
can compare ci_last_trans and ci_created_trans directly.This is a large patch because of all the places we change
ocfs2_journal_access..(handle, inode, ...) to
ocfs2_journal_access..(handle, INODE_CACHE(inode), ...).Signed-off-by: Joel Becker
-
We are really passing the inode into the ocfs2_read/write_blocks()
functions to get at the metadata cache. This commit passes the cache
directly into the metadata block functions, divorcing them from the
inode.Signed-off-by: Joel Becker