Eric Lee / smarc-fsl-linux-kernel

25 Jul, 2011

1 commit

5cffff9e2 ocfs2: Fix ocfs2_page_mkwrite() ... Browse Code »

This patch address two shortcomings in ocfs2_page_mkwrite():
1. Makes the function return better VM_FAULT_* errors.
2. It handles a error that is triggered when a page is dropped from the mapping
due to memory pressure. This patch locks the page to prevent that.

[Patch was cleaned up by Sunil Mushran.]

Signed-off-by: Wengang Wang
Signed-off-by: Sunil Mushran

Wengang Wang
2011-07-25 01:36:54 +0800

07 Mar, 2011

1 commit

c1e8d35ef ocfs2: Remove EXIT from masklog. ... Browse Code »
43

mlog_exit is used to record the exit status of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So actually no one can open it for a production system or even
for a test.

This patch just try to remove it or change it. So:
1. if all the error paths already use mlog_errno, it is just removed.
Otherwise, it will be replaced by mlog_errno.
2. if it is used to print some return value, it is replaced with
mlog(0,...).
mlog_exit_ptr is changed to mlog(0.
All those mlog(0,...) will be replaced with trace events later.

Signed-off-by: Tao Ma

Tao Ma
2011-03-07 16:43:21 +0800

22 Feb, 2011

1 commit

614a9e849 ocfs2: Remove FILE_IO from masklog. ... Browse Code »

Change all the "mlog(0," in fs/ocfs2/mmap.c to trace events.
And finally remove masklog FILE_IO.

Signed-off-by: Tao Ma

Tao Ma
2011-02-22 21:59:46 +0800

21 Feb, 2011

1 commit

ef6b689b6 ocfs2: Remove ENTRY from masklog. ... Browse Code »

ENTRY is used to record the entry of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So actually no one can open it for a production system or even
for a test.

So for mlog_entry_void, we just remove it.
for mlog_entry(...), we replace it with mlog(0,...), and they
will be replace by trace event later.

Signed-off-by: Tao Ma

Tao Ma
2011-02-21 11:10:44 +0800

10 Sep, 2010

1 commit

729963a1f Merge branch 'cow_readahead' of git://oss.oracle.com/git/tma/linux-2.6 into merge-2 Browse Code »

Joel Becker
2010-09-10 23:41:04 +0800

08 Sep, 2010

1 commit

f63afdb2c ocfs2: make __ocfs2_page_mkwrite handle file end properly. ... Browse Code »

__ocfs2_page_mkwrite now is broken in handling file end.
1. the last page should be the page contains i_size - 1.
2. the len in the last page is also calculated wrong.
So change them accordingly.

Acked-by: Mark Fasheh
Signed-off-by: Tao Ma

Tao Ma
2010-09-08 14:25:55 +0800

12 Aug, 2010

1 commit

0378da0fd ocfs2: pass struct file* to ocfs2_write_begin_nolock. ... Browse Code »

struct file * has file_ra_state to store the readahead state
and data. So pass this to ocfs2_write_begin_nolock so that
it can be used in ocfs2_refcount_cow.

Signed-off-by: Tao Ma

Tao Ma
2010-08-12 10:39:48 +0800

21 May, 2010

1 commit

03e62303c Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 ... Browse Code »

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (47 commits)
ocfs2: Silence a gcc warning.
ocfs2: Don't retry xattr set in case value extension fails.
ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break
ocfs2: Reset xattr value size after xa_cleanup_value_truncate().
fs/ocfs2/dlm: Use kstrdup
fs/ocfs2/dlm: Drop memory allocation cast
Ocfs2: Optimize punching-hole code.
Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public.
Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing.
Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead.
ocfs2: Block signals for mkdir/link/symlink/O_CREAT.
ocfs2: Wrap signal blocking in void functions.
ocfs2/dlm: Increase o2dlm lockres hash size
ocfs2: Make ocfs2_extend_trans() really extend.
ocfs2/trivial: Code cleanup for allocation reservation.
ocfs2: make ocfs2_adjust_resv_from_alloc simple.
ocfs2: Make nointr a default mount option
ocfs2/dlm: Make o2dlm domain join/leave messages KERN_NOTICE
o2net: log socket state changes
ocfs2: print node # when tcp fails
...

Linus Torvalds
2010-05-21 22:20:17 +0800

11 May, 2010

1 commit

e4b963f10 ocfs2: Wrap signal blocking in void functions. ... Browse Code »

ocfs2 sometimes needs to block signals around dlm operations, but it
currently does it with sigprocmask(). Even worse, it's checking the
error code of sigprocmask(). The in-kernel sigprocmask() can only error
if you get the SIG_* argument wrong. We don't.

Wrap the sigprocmask() calls with ocfs2_[un]block_signals(). These
functions are void, but they will BUG() if somehow sigprocmask() returns
an error.

Signed-off-by: Joel Becker

Joel Becker
2010-05-11 02:50:10 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implic… ... Browse Code »

…it slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

28 Sep, 2009

1 commit

f0f37e2f7 const: mark struct vm_struct_operations ... Browse Code »

* mark struct vm_area_struct::vm_ops as const
* mark vm_ops in AGP code

But leave TTM code alone, something is fishy there with global vm_ops
being used.

Signed-off-by: Alexey Dobriyan
Signed-off-by: Linus Torvalds

Alexey Dobriyan
2009-09-28 02:39:25 +0800

01 Apr, 2009

1 commit

c2ec175c3 mm: page_mkwrite change prototype to match fault ... Browse Code »

Change the page_mkwrite prototype to take a struct vm_fault, and return
VM_FAULT_xxx flags. There should be no functional change.

This makes it possible to return much more detailed error information to
the VM (and also can provide more information eg. virtual_address to the
driver, which might be important in some special cases).

This is required for a subsequent fix. And will also make it easier to
merge page_mkwrite() with fault() in future.

Signed-off-by: Nick Piggin
Cc: Chris Mason
Cc: Trond Myklebust
Cc: Miklos Szeredi
Cc: Steven Whitehouse
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Artem Bityutskiy
Cc: Felix Blyakher
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2009-04-01 23:59:14 +0800

11 Nov, 2008

1 commit

4c1bbf1ba ocfs2: return 0 in page_mkwrite to let VFS retry. ... Browse Code »

In ocfs2_page_mkwrite, we return -EINVAL when we found the page mapping
isn't updated, and it will cause the user space program get SIGBUS and
exit. The reason is that during race writeable mmap, we will do
unmap_mapping_range in ocfs2_data_downconvert_worker. The good thing is
that if we reuturn 0 in page_mkwrite, VFS will retry fault and then
call page_mkwrite again, so it is safe to return 0 here.

Signed-off-by: Tao Ma
Signed-off-by: Mark Fasheh

Tao Ma
2008-11-11 01:51:47 +0800

26 Jan, 2008

2 commits

e63aecb65 ocfs2: Rename ocfs2_meta_[un]lock ... Browse Code »

Call this the "inode_lock" now, since it covers both data and meta data.
This patch makes no functional changes.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-01-26 06:46:01 +0800
c934a92d0 ocfs2: Remove data locks ... Browse Code »

The meta lock now covers both meta data and data, so this just removes the
now-redundant data lock.

Combining locks saves us a round of lock mastery per inode and one less lock
to ping between nodes during read/write.

We don't lose much - since meta locks were always held before a data lock
(and at the same level) ordered writeout mode (the default) ensured that
flushing for the meta data lock also pushed out data anyways.

Signed-off-by: Mark Fasheh

Mark Fasheh
2008-01-26 06:45:57 +0800

20 Jul, 2007

4 commits

183363380 fix some conversion overflows ... Browse Code »

Fix page index to offset conversion overflows in buffer layer, ecryptfs,
and ocfs2.

It would be nice to convert the whole tree to page_offset, but for now
just fix the bugs.

Signed-off-by: Nick Piggin
Cc: Michael Halcrow
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 23:44:19 +0800
d0217ac04 mm: fault feedback #1 ... Browse Code »

Change ->fault prototype. We now return an int, which contains
VM_FAULT_xxx code in the low byte, and FAULT_RET_xxx code in the next byte.
FAULT_RET_ code tells the VM whether a page was found, whether it has been
locked, and potentially other things. This is not quite the way he wanted
it yet, but that's changed in the next patch (which requires changes to
arch code).

This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
that a page is locked which requires filemap_nopage to go away (because we
can no longer remain backward compatible without that flag), but we were
going to do that anyway.

struct fault_data is renamed to struct vm_fault as Linus asked. address
is now a void __user * that we should firmly encourage drivers not to use
without really good reason.

The page is now returned via a page pointer in the vm_fault struct.

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 01:04:41 +0800
54cb8821d mm: merge populate and nopage into fault (fixes nonlinear) ... Browse Code »

Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
the virtual address -> file offset differently from linear mappings.

->populate is a layering violation because the filesystem/pagecache code
should need to know anything about the virtual memory mapping. The hitch here
is that the ->nopage handler didn't pass down enough information (ie. pgoff).
But it is more logical to pass pgoff rather than have the ->nopage function
calculate it itself anyway (because that's a similar layering violation).

Having the populate handler install the pte itself is likewise a nasty thing
to be doing.

This patch introduces a new fault handler that replaces ->nopage and
->populate and (later) ->nopfn. Most of the old mechanism is still in place
so there is a lot of duplication and nice cleanups that can be removed if
everyone switches over.

The rationale for doing this in the first place is that nonlinear mappings are
subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
to duplicate the synchronisation logic rather than just consolidate the two.

After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
pagecache. Seems like a fringe functionality anyway.

NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no
users have hit mainline yet.

[akpm@linux-foundation.org: cleanup]
[randy.dunlap@oracle.com: doc. fixes for readahead]
[akpm@linux-foundation.org: build fix]
Signed-off-by: Nick Piggin
Signed-off-by: Randy Dunlap
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 01:04:41 +0800
d00806b18 mm: fix fault vs invalidate race for linear mappings ... Browse Code »

Fix the race between invalidate_inode_pages and do_no_page.

Andrea Arcangeli identified a subtle race between invalidation of pages from
pagecache with userspace mappings, and do_no_page.

The issue is that invalidation has to shoot down all mappings to the page,
before it can be discarded from the pagecache. Between shooting down ptes to
a particular page, and actually dropping the struct page from the pagecache,
do_no_page from any process might fault on that page and establish a new
mapping to the page just before it gets discarded from the pagecache.

The most common case where such invalidation is used is in file truncation.
This case was catered for by doing a sort of open-coded seqlock between the
file's i_size, and its truncate_count.

Truncation will decrease i_size, then increment truncate_count before
unmapping userspace pages; do_no_page will read truncate_count, then find the
page if it is within i_size, and then check truncate_count under the page
table lock and back out and retry if it had subsequently been changed (ptl
will serialise against unmapping, and ensure a potentially updated
truncate_count is actually visible).

Complexity and documentation issues aside, the locking protocol fails in the
case where we would like to invalidate pagecache inside i_size. do_no_page
can come in anytime and filemap_nopage is not aware of the invalidation in
progress (as it is when it is outside i_size). The end result is that
dangling (->mapping == NULL) pages that appear to be from a particular file
may be mapped into userspace with nonsense data. Valid mappings to the same
place will see a different page.

Andrea implemented two working fixes, one using a real seqlock, another using
a page->flags bit. He also proposed using the page lock in do_no_page, but
that was initially considered too heavyweight. However, it is not a global or
per-file lock, and the page cacheline is modified in do_no_page to increment
_count and _mapcount anyway, so a further modification should not be a large
performance hit. Scalability is not an issue.

This patch implements this latter approach. ->nopage implementations return
with the page locked if it is possible for their underlying file to be
invalidated (in that case, they must set a special vm_flags bit to indicate
so). do_no_page only unlocks the page after setting up the mapping
completely. invalidation is excluded because it holds the page lock during
invalidation of each page (and ensures that the page is not mapped while
holding the lock).

This also allows significant simplifications in do_no_page, because we have
the page locked in the right place in the pagecache from the start.

Signed-off-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nick Piggin
2007-07-20 01:04:41 +0800

11 Jul, 2007

1 commit

7307de805 ocfs2: shared writeable mmap ... Browse Code »

Implement cluster consistent shared writeable mappings using the
->page_mkwrite() callback.

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-07-11 08:31:51 +0800

27 Apr, 2007

1 commit

89488984a ocfs2: Turn off shared writeable mmap for local files systems with holes. ... Browse Code »

This will be turned back on once we can do allocation in ->page_mkwrite().

Signed-off-by: Mark Fasheh

Mark Fasheh
2007-04-27 06:02:01 +0800

08 Dec, 2006

1 commit

c271c5c22 ocfs2: local mounts ... Browse Code »

This allows users to format an ocfs2 file system with a special flag,
OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT. When the file system sees this flag, it
will not use any cluster services, nor will it require a cluster
configuration, thus acting like a 'local' file system.

Signed-off-by: Sunil Mushran
Signed-off-by: Mark Fasheh

Sunil Mushran
2006-12-08 09:37:53 +0800

02 Dec, 2006

1 commit

25899deef ocfs2: update file system paths to set atime ... Browse Code »

Conditionally update atime in ocfs2_file_aio_read(), ocfs2_readdir() and
ocfs2_mmap().

Signed-off-by: Tiger Yang
Signed-off-by: Mark Fasheh

Tiger Yang
2006-12-02 10:28:58 +0800

30 Jun, 2006

1 commit

2b388c679 ocfs2: Compile-time disabling of ocfs2 debugging output. ... Browse Code »

Give gcc the chance to compile out the debug logging code in ocfs2.
This saves some size at the expense of being able to debug the code.

Signed-off-by: Joel Becker
Signed-off-by: Mark Fasheh

Joel Becker
2006-06-30 05:48:30 +0800

11 Jan, 2006

1 commit

870f48179 [PATCH] replace inode_update_time with file_update_time ... Browse Code »

To allow various options to work per-mount instead of per-sb we need a
struct vfsmount when updating ctime and mtime. This preparation patch
replaces the inode_update_time routine with a file_update_atime routine so
we can easily get at the vfsmount. (and the file makes more sense in this
context anyway). Also get rid of the unused second argument - we always
want to update the ctime when calling this routine.

Signed-off-by: Christoph Hellwig
Cc: Al Viro
Cc: Anton Altaparmakov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2006-01-11 00:01:30 +0800

04 Jan, 2006

1 commit

ccd979bdb [PATCH] OCFS2: The Second Oracle Cluster Filesystem ... Browse Code »

The OCFS2 file system module.

Signed-off-by: Mark Fasheh
Signed-off-by: Kurt Hackel

Mark Fasheh
2006-01-04 03:45:47 +0800