Eric Lee / smarc-fsl-linux-kernel

02 Dec, 2011

1 commit

0a4ebed78 Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (31 commits)
ocfs2: avoid unaligned access to dqc_bitmap
ocfs2: Use filemap_write_and_wait() instead of write_inode_now()
ocfs2: honor O_(D)SYNC flag in fallocate
ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2
ocfs2: send correct UUID to cleancache initialization
ocfs2: Commit transactions in error cases -v2
ocfs2: make direntry invalid when deleting it
fs/ocfs2/dlm/dlmlock.c: free kmem_cache_zalloc'd data using kmem_cache_free
ocfs2: Avoid livelock in ocfs2_readpage()
ocfs2: serialize unaligned aio
ocfs2: Implement llseek()
ocfs2: Fix ocfs2_page_mkwrite()
ocfs2: Add comment about orphan scanning
ocfs2: Clean up messages in the fs
ocfs2/cluster: Cluster up now includes network connections too
ocfs2/cluster: Add new function o2net_fill_node_map()
ocfs2/cluster: Fix output in file elapsed_time_in_ms
ocfs2/dlm: dlmlock_remote() needs to account for remastery
ocfs2/dlm: Take inflight reference count for remotely mastered resources too
ocfs2/dlm: Cleanup dlm_wait_for_node_death() and dlm_wait_for_node_recovery()
...

Linus Torvalds
2011-12-02 06:55:34 +0800

02 Nov, 2011

1 commit

bfe868486 filesystems: add set_nlink()

Replace remaining direct i_nlink updates with a new set_nlink()
updater function.

Signed-off-by: Miklos Szeredi
Tested-by: Toshiyuki Okajima
Signed-off-by: Christoph Hellwig

Miklos Szeredi
2011-11-02 19:53:43 +0800

01 Jun, 2011

1 commit

03efed8a2 ocfs2: Bugfix for hard readonly mount

ocfs2 cannot currently mount a device that is readonly at the media
("hard readonly"). Fix the broken places.
See details: http://oss.oracle.com/bugzilla/show_bug.cgi?id=1322

[ Description edited -- Joel ]

Signed-off-by: Tiger Yang
Reviewed-by: Sunil Mushran
Signed-off-by: Joel Becker

Tiger Yang
2011-06-01 10:03:44 +0800

29 Mar, 2011

1 commit

99bdc3880 Merge branch 'mlog_replace_for_39' of git://repo.or.cz/taoma-kernel into ocfs2-merge-window-fix

Joel Becker
2011-03-29 00:44:26 +0800

07 Mar, 2011

1 commit

c1e8d35ef ocfs2: Remove EXIT from masklog.

mlog_exit is used to record the exit status of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So in practice no one can enable it on a production system or even
for a test.

This patch removes it or changes it:
1. If all the error paths already use mlog_errno, it is just removed.
Otherwise, it is replaced by mlog_errno.
2. If it is used to print some return value, it is replaced with
mlog(0,...).
mlog_exit_ptr is changed to mlog(0,...).
All those mlog(0,...) will be replaced with trace events later.

Signed-off-by: Tao Ma

Tao Ma
2011-03-07 16:43:21 +0800

21 Feb, 2011

1 commit

ef6b689b6 ocfs2: Remove ENTRY from masklog.

ENTRY is used to record the entry of a function.
But because it is added in so many functions, if we enable it,
the system logs get filled up quickly and cause too much I/O.
So in practice no one can enable it on a production system or even
for a test.

For mlog_entry_void, we just remove it.
For mlog_entry(...), we replace it with mlog(0,...), which
will be replaced by trace events later.

Signed-off-by: Tao Ma

Tao Ma
2011-02-21 11:10:44 +0800

20 Feb, 2011

1 commit

5bc970e80 ocfs2: Use hrtimer to track ocfs2 fs lock stats

Patch makes use of the hrtimer to track times in ocfs2 lock stats.

The patch is a bit involved to ensure no additional impact on the memory
footprint. The size of ocfs2_inode_cache remains 1280 bytes on 32-bit systems.

A related change was to modify the unit of the max wait time from nanosec to
microsec allowing us to track max time larger than 4 secs. This change
necessitated the bumping of the output version in the debugfs file,
locking_state, from 2 to 3.
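
The 4-second ceiling mentioned above can be checked with a little arithmetic. The sketch below assumes the max wait time is held in an unsigned 32-bit field; the message implies this with its limit but does not state it, and the helper name is purely illustrative.

```python
# Assumption: the max wait time is kept in an unsigned 32-bit field.
U32_MAX = 2**32 - 1


def max_trackable_seconds(units_per_second: int) -> float:
    """Largest wait, in seconds, that fits in a u32 counting the given unit."""
    return U32_MAX / units_per_second


ns_limit = max_trackable_seconds(1_000_000_000)  # field counts nanoseconds
us_limit = max_trackable_seconds(1_000_000)      # field counts microseconds

print(f"nanoseconds:  about {ns_limit:.2f} s")
print(f"microseconds: about {us_limit:.2f} s")
```

Under that assumption a nanosecond counter wraps a little past 4 seconds, while the same field counting microseconds covers more than an hour.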

Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2011-02-20 19:56:07 +0800

11 Sep, 2010

1 commit

5e98d4924 Track negative entries v3

Track negative dentries by recording the generation number of the parent
directory in d_fsdata. The generation number for the parent directory is
recorded in the inode_info, which increments every time the lock on the
directory is dropped.

If the generation number of the parent directory and the negative dentry
matches, there is no need to perform the revalidate, else a revalidate
is forced. This improves performance in situations where nodes look for
the same non-existent file multiple times.
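
The generation check described above can be modelled in a few lines. This is a toy sketch, not the kernel code: `DirInode`, `NegativeDentry`, `lookup_miss` and `needs_revalidate` are hypothetical names, and `recorded_gen` stands in for the value the patch stores in d_fsdata.

```python
from dataclasses import dataclass


@dataclass
class DirInode:
    """Stand-in for the parent directory's inode info (hypothetical)."""
    lock_gen: int = 0

    def drop_lock(self) -> None:
        # The generation advances every time the directory lock is dropped.
        self.lock_gen += 1


@dataclass
class NegativeDentry:
    """Stand-in for a negative dentry; recorded_gen plays the role of d_fsdata."""
    name: str
    recorded_gen: int


def lookup_miss(parent: DirInode, name: str) -> NegativeDentry:
    """Record the parent's current generation when a lookup finds nothing."""
    return NegativeDentry(name, parent.lock_gen)


def needs_revalidate(parent: DirInode, dentry: NegativeDentry) -> bool:
    """A matching generation means the cached miss can still be trusted."""
    return dentry.recorded_gen != parent.lock_gen


parent = DirInode()
missing = lookup_miss(parent, "no-such-file")
print(needs_revalidate(parent, missing))  # lock still held since the miss
parent.drop_lock()
print(needs_revalidate(parent, missing))  # lock was dropped in between
```

Repeated lookups of the same missing name skip the revalidate for as long as the directory lock has not been dropped since the miss was recorded.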

Thanks Mark for explaining the DLM sequence.

Signed-off-by: Goldwyn Rodrigues
Signed-off-by: Joel Becker

Goldwyn Rodrigues
2010-09-11 00:18:15 +0800

20 Jul, 2010

1 commit

33fa1d909 fs/ocfs2: Remove unnecessary casts of private_data

Signed-off-by: Joe Perches
Acked-by: Joel Becker
Signed-off-by: Jiri Kosina

Joe Perches
2010-07-20 23:20:08 +0800

22 May, 2010

1 commit

ae4f6ef13 ocfs2: Avoid unnecessary block mapping when refreshing quota info

The position of global quota file info does not change. So we do not have
to do logical -> physical block translation every time we reread it from
disk. Thus we can also avoid taking ip_alloc_sem.

Acked-by: Joel Becker
Signed-off-by: Jan Kara

Jan Kara
2010-05-22 01:30:46 +0800

08 Mar, 2010

1 commit

318ae2edc Merge branch 'for-next' into for-linus

Conflicts:
Documentation/filesystems/proc.txt
arch/arm/mach-u300/include/mach/debug-macro.S
drivers/net/qlge/qlge_ethtool.c
drivers/net/qlge/qlge_main.c
drivers/net/typhoon.c

Jiri Kosina
2010-03-08 23:55:37 +0800

28 Feb, 2010

1 commit

9b915181a ocfs2: Use a separate masklog for AST and BASTs

This patch adds a new masklog and uses it to allow tracing ASTs and BASTs
in the dlmglue layer. This has been found to be very useful in debugging
cluster locking issues.

Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2010-02-28 11:57:06 +0800

27 Feb, 2010

3 commits

553b5eb91 ocfs2: Pass the locking protocol into ocfs2_cluster_connect().

Inside the stackglue, the locking protocol structure is hanging off of
the ocfs2_cluster_connection. This takes it one further; the locking
protocol is passed into ocfs2_cluster_connect(). Now different cluster
connections can have different locking protocols with distinct asts.
Note that all locking protocols have to keep their maximum protocol
version in lock-step.

With the protocol structure set in ocfs2_cluster_connect(), there is no
need for the stackglue to have a static pointer to a specific protocol
structure. We can change initialization to only pass in the maximum
protocol version.

Signed-off-by: Joel Becker

Joel Becker
2010-02-27 07:41:17 +0800
c0e413385 ocfs2: Attach the connection to the lksb

We're going to want it in the ast functions, so we convert union
ocfs2_dlm_lksb to struct ocfs2_dlm_lksb and let it carry the connection.

Signed-off-by: Joel Becker

Joel Becker
2010-02-27 07:41:14 +0800
a796d2862 ocfs2: Pass lksbs back from stackglue ast/bast functions.

The stackglue ast and bast functions tried to maintain the fiction that
their arguments were void pointers. In reality, stack_user.c had to
know that the argument was an ocfs2_lock_res in order to get the status
off of the lksb. That's ugly.

This changes stackglue to always pass the lksb as the argument to ast
and bast functions. The caller can always use container_of() to get the
ocfs2_lock_res or user_dlm_lock_res. The net effect to the caller is
zero. They still get back the lockres in their ast. stackglue gets
cleaner, and now can use the lksb itself.

Signed-off-by: Joel Becker

Joel Becker
2010-02-27 07:41:14 +0800

09 Feb, 2010

1 commit

3ad2f3fbb tree-wide: Assorted spelling fixes

In particular, several occurrences of funny versions of 'success',
'unknown', 'therefore', 'acknowledge', 'argument', 'achieve', 'address',
'beginning', 'desirable', 'separate' and 'necessary' are fixed.

Signed-off-by: Daniel Mack
Cc: Joe Perches
Cc: Junio C Hamano
Signed-off-by: Jiri Kosina

Daniel Mack
2010-02-09 18:13:56 +0800

04 Feb, 2010

1 commit

079b80578 ocfs2: Plugs race between the dc thread and an unlock ast message

This patch plugs a race between the downconvert thread and an unlock ast message.
Specifically, after the downconvert worker has done its task, the dc thread needs
to check whether an unlock ast made the downconvert moot.

Reported-by: David Teigland
Signed-off-by: Sunil Mushran
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker

Sunil Mushran
2010-02-04 09:26:03 +0800

03 Feb, 2010

4 commits

db0f6ce69 ocfs2: Remove overzealous BUG_ON during blocked lock processing

During blocked lock processing, we should consider the possibility that the
lock is no longer blocking.

Joel Becker assisted in fixing this issue.

Reported-by: David Teigland
Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2010-02-03 15:51:16 +0800
0d74125a6 ocfs2: Do not downconvert if the lock level is already compatible

During upconvert, if the master were to send a BAST, dlmglue will detect the
upconversion in process and send a cancel convert to the master. Upon receiving
the AST for the cancel convert, it will re-process the lock resource to determine
whether it needs downconverting. Say, the up was from PR to EX and the BAST was
for EX. After the cancel convert, it will need to downconvert to NL.

However, if the node was originally upconverting from NL to EX, then there would
be no reason to downconvert (assuming the same message sequence).

This patch makes dlmglue consider the possibility that the current lock level
is already compatible and that downconverting is not required.
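
The two scenarios above can be worked through with a toy model of the three lock levels. This is only a sketch under assumptions: the numeric values of `Level` are illustrative rather than the DLM's own constants, and `highest_compat_level` and `needs_downconvert` are hypothetical helper names.

```python
from enum import IntEnum


class Level(IntEnum):
    # Illustrative ordering only; not the DLM's numeric constants.
    NL = 0  # null lock
    PR = 1  # protected read (shared)
    EX = 2  # exclusive


def highest_compat_level(blocking: Level) -> Level:
    """Highest level this node may keep while another node wants `blocking`."""
    if blocking == Level.EX:
        return Level.NL
    if blocking == Level.PR:
        return Level.PR
    return Level.EX


def needs_downconvert(current: Level, blocking: Level) -> bool:
    """Downconvert only if the current level conflicts with the request."""
    return current > highest_compat_level(blocking)


# Upconvert PR -> EX cancelled: the node still holds PR, BAST was for EX.
print(needs_downconvert(Level.PR, Level.EX))
# Upconvert NL -> EX cancelled: the node holds NL, which is already compatible.
print(needs_downconvert(Level.NL, Level.EX))
```

In this model the first case still requires a downconvert to NL, while the second needs nothing, which is the situation the patch teaches dlmglue to recognise.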

Joel Becker assisted in fixing this issue.

Fixes ossbz#1178
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1178

Reported-by: Coly Li
Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2010-02-03 15:51:14 +0800
a19128260 ocfs2: Prevent a livelock in dlmglue

There is a possibility of a livelock in __ocfs2_cluster_lock(). If a node were
to get an ast for an upconvert request, followed immediately by a bast,
there is a small window where the fs may downconvert the lock before the
process requesting the upconvert is able to take the lock.

This patch adds a new flag to indicate that the upconvert is still in
progress and that the dc thread should not downconvert it right now.

Wengang Wang and Joel Becker contributed heavily to this patch.

Reported-by: David Teigland
Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2010-02-03 15:51:13 +0800
0b94a909e ocfs2: Fix setting of OCFS2_LOCK_BLOCKED during bast

During bast, set the OCFS2_LOCK_BLOCKED flag only if the lock needs to be
downconverted.

Signed-off-by: Wengang Wang
Acked-by: Sunil Mushran
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker

Wengang Wang
2010-02-03 15:50:55 +0800

26 Jan, 2010

1 commit

2bd632165 ocfs2/trivial: Remove trailing whitespaces

Patch removes trailing whitespaces.

Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2010-01-26 11:20:51 +0800

04 Dec, 2009

1 commit

af901ca18 tree-wide: fix assorted typos all over the place

That is "success", "unknown", "through", "performance", "[re|un]mapping",
"access", "default", "reasonable", "[con]currently", "temperature",
"channel", "[un]used", "application", "example", "hierarchy", "therefore",
"[over|under]flow", "contiguous", "threshold", "enough" and others.

Signed-off-by: André Goddard Rosa
Signed-off-by: Jiri Kosina

André Goddard Rosa
2009-12-04 22:39:55 +0800

23 Sep, 2009

3 commits

d92bc5127 dlmglue.c: add missed mlog lines

This patch adds the missing mlog_exit() and mlog_exit_void() lines where routines
return.

Signed-off-by: Coly Li
Acked-by: Mark Fasheh
Signed-off-by: Joel Becker

Coly Li
2009-09-23 16:54:47 +0800
8dec98edf ocfs2: Add new refcount tree lock resource in dlmglue.

refcount tree lock resource is used to protect refcount
tree read/write among multiple nodes.

Signed-off-by: Tao Ma

Tao Ma
2009-09-23 11:09:28 +0800
a43384813 ocfs2: Abstract caching info checkpoint.

In meta downconvert, we need to checkpoint the metadata in an inode.
For refcount tree, we also need it. So abstract the process out.

Signed-off-by: Tao Ma

Tao Ma
2009-09-23 11:09:27 +0800

05 Sep, 2009

2 commits

0cf2f7632 ocfs2: Pass struct ocfs2_caching_info to the journal functions.

The next step in divorcing metadata I/O management from struct inode is
to pass struct ocfs2_caching_info to the journal functions. Thus the
journal locks a metadata cache with the cache io_lock function. It also
can compare ci_last_trans and ci_created_trans directly.

This is a large patch because of all the places we change
ocfs2_journal_access..(handle, inode, ...) to
ocfs2_journal_access..(handle, INODE_CACHE(inode), ...).

Signed-off-by: Joel Becker

Joel Becker
2009-09-05 07:07:50 +0800
8cb471e8f ocfs2: Take the inode out of the metadata read/write paths.

We are really passing the inode into the ocfs2_read/write_blocks()
functions to get at the metadata cache. This commit passes the cache
directly into the metadata block functions, divorcing them from the
inode.

Signed-off-by: Joel Becker

Joel Becker
2009-09-05 07:07:48 +0800

23 Jun, 2009

4 commits

cb25797d4 ocfs2: Add lockdep annotations

Add lockdep support to OCFS2. The support also covers all of the cluster
locks except for open locks, journal locks, and local quotafile locks. These
are special because they are acquired for a node, not for a particular process,
and lockdep cannot deal with this type of locking.

Signed-off-by: Jan Kara
Signed-off-by: Joel Becker

Jan Kara
2009-06-23 05:34:26 +0800
df152c241 ocfs2: Disable orphan scanning for local and hard-ro mounts

Local and Hard-RO mounts do not need orphan scanning.

Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2009-06-23 05:24:55 +0800
3211949f8 ocfs2: Do not initialize lvb in ocfs2_orphan_scan_lock_res_init()

We don't access the LVB in our ocfs2_*_lock_res_init() functions.

Since the LVB can become invalid during some cluster recovery
operations, the dlmglue must be able to handle an uninitialized
LVB.

For the orphan scan lock, we initialized an uninitialized LVB with our
scan sequence number plus one. This starts a normal orphan scan
cycle.

Signed-off-by: Sunil Mushran
Signed-off-by: Joel Becker

Sunil Mushran
2009-06-23 05:24:53 +0800
1c520dfbf ocfs2: Provide the ocfs2_dlm_lvb_valid() stack API.

The Lock Value Block (LVB) of a DLM lock can be lost when nodes die and
the DLM cannot reconstruct its state. Clients of the DLM need to know
this.

ocfs2's internal DLM, o2dlm, explicitly zeroes out the LVB when it loses
track of the state. This is not a standard behavior, but ocfs2 has
always relied on it. Thus, an o2dlm LVB is always "valid".

ocfs2 now supports both o2dlm and fs/dlm via the stack glue. When
fs/dlm loses track of an LVB's state, it sets a flag
(DLM_SBF_VALNOTVALID) on the Lock Status Block (LKSB). The contents of
the LVB may be garbage or merely stale.

ocfs2 doesn't want to try to guess at the validity of the stale LVB.
Instead, it should be checking the VALNOTVALID flag. As this is the
'standard' way of treating LVBs, we will promote this behavior.

We add a stack glue API ocfs2_dlm_lvb_valid(). It returns non-zero when
the LVB is valid. o2dlm will always return valid, while fs/dlm will
check VALNOTVALID.
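
The split in behaviour between the two stacks can be sketched as follows. This is a toy model rather than the stack glue itself: the class names are hypothetical, and the numeric value given to `DLM_SBF_VALNOTVALID` is an assumption about fs/dlm's flag definitions, not something the message states.

```python
# Assumption: numeric value taken to match fs/dlm's status-block flags.
DLM_SBF_VALNOTVALID = 0x02


class O2dlmStack:
    """Hypothetical stand-in for the o2dlm stack plugin."""

    def lvb_valid(self, lksb_flags: int) -> bool:
        # o2dlm zeroes the LVB itself when it loses state, so it always
        # reports the LVB as valid.
        return True


class FsDlmStack:
    """Hypothetical stand-in for the fs/dlm (userspace cluster) stack plugin."""

    def lvb_valid(self, lksb_flags: int) -> bool:
        # fs/dlm marks a lost LVB by setting VALNOTVALID on the LKSB.
        return not (lksb_flags & DLM_SBF_VALNOTVALID)


def lvb_usable(stack, lksb_flags: int) -> bool:
    """What a caller of an ocfs2_dlm_lvb_valid()-style API would check."""
    return stack.lvb_valid(lksb_flags)


print(lvb_usable(O2dlmStack(), DLM_SBF_VALNOTVALID))
print(lvb_usable(FsDlmStack(), DLM_SBF_VALNOTVALID))
print(lvb_usable(FsDlmStack(), 0))
```

A caller therefore no longer needs to guess from the LVB contents; it asks the active stack and treats a negative answer as "reinitialize from disk".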

Signed-off-by: Joel Becker
Acked-by: Mark Fasheh

Joel Becker
2009-06-23 05:24:30 +0800

04 Jun, 2009

1 commit

83273932f ocfs2: timer to queue scan of all orphan slots

When a dentry is unlinked, the unlinking node takes an EX on the dentry lock
before moving the dentry to the orphan directory. Other nodes that have
this dentry in cache have a PR on the same dentry lock. When the EX is
requested, the other nodes flag the corresponding inode as MAYBE_ORPHANED
during downconvert. The inode is finally deleted when the last node to iput
the inode sees that i_nlink==0 and the MAYBE_ORPHANED flag is set.

A problem arises if a node is forced to free dentry locks because of memory
pressure. If this happens, the node will no longer get downconvert
notifications for the dentries that have been unlinked on another node.
If it also happens that node is actively using the corresponding inode and
happens to be the one performing the last iput on that inode, it will fail
to delete the inode as it will not have the MAYBE_ORPHANED flag set.

This patch fixes this shortcoming by introducing a periodic scan of the
orphan directories to delete such inodes. Care has been taken to distribute
the workload across the cluster so that no one node has to perform the task
all the time.

Signed-off-by: Srinivas Eeda
Signed-off-by: Joel Becker

Srinivas Eeda
2009-06-04 10:14:31 +0800

04 Apr, 2009

1 commit

6ca497a83 ocfs2: fix rare stale inode errors when exporting via nfs

For nfs exporting, ocfs2_get_dentry() returns the dentry for fh.
ocfs2_get_dentry() may read from disk when the inode is not in memory,
without any cross cluster lock. This leads to the file system loading a
stale inode.

This patch fixes the above problem.

The solution is that when the inode is not in memory, we take the cluster
lock (PR) of the alloc inode from which the inode in question was allocated
(this makes the node on which the deletion was done sync the alloc inode)
before reading out the inode itself. Then we check the bitmap in the group
(the one the inode in question was allocated from) to see if the bit is
clear. If it's clear then the inode is stale. If the bit is set, we then
check the generation as the existing code does.

We have to read out the inode in question from disk first to know its alloc
slot and alloc bit. And if it's not stale we read it out using ocfs2_iget().
The second read should then be from cache.
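
The order of the two staleness checks can be sketched as below. This is a simplified model only: `is_stale` and its parameters are hypothetical names, the bitmap is a plain sequence of booleans, and cluster locking is left out entirely.

```python
from typing import Sequence


def is_stale(group_bitmap: Sequence[bool], alloc_bit: int,
             disk_generation: int, fh_generation: int) -> bool:
    """Decide whether a file handle refers to a stale inode.

    group_bitmap:    allocation bits of the group the inode came from
    alloc_bit:       the inode's bit within that group
    disk_generation: generation read from the on-disk inode
    fh_generation:   generation carried in the NFS file handle
    """
    # Step 1: a clear bit means the inode has been freed.
    if not group_bitmap[alloc_bit]:
        return True
    # Step 2: the bit is set, but the slot may have been reused, so fall
    # back to the generation comparison.
    return disk_generation != fh_generation


bitmap = [True, False, True]
print(is_stale(bitmap, 1, 7, 7))  # bit clear
print(is_stale(bitmap, 2, 8, 7))  # bit set, generation differs
print(is_stale(bitmap, 0, 7, 7))  # bit set, generation matches
```

Only the last case would go on to load the inode through the normal ocfs2_iget() path.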

We also have to add a per-superblock nfs_sync_lock to cover the lock for the
alloc inode and the lock for the inode in question. This is because
ocfs2_get_dentry() and ocfs2_delete_inode() lock them in reverse order.
nfs_sync_lock is taken in EX mode in ocfs2_get_dentry() and in PR mode in
ocfs2_delete_inode(), so that multiple ocfs2_delete_inode() calls can run
concurrently in the normal case.

[mfasheh@suse.com: build warning fixes and comment cleanups]
Signed-off-by: Wengang Wang
Acked-by: Joel Becker
Signed-off-by: Mark Fasheh

wengang wang
2009-04-04 02:39:25 +0800

27 Feb, 2009

1 commit

c74ff8bb2 ocfs2: Cleanup the lockname print in dlmglue.c

The dentry lock has a different format than other locks. This patch fixes
the ocfs2_log_dlm_error() macro to make it print the dentry lock correctly.

Signed-off-by: Sunil Mushran
Acked-by: Joel Becker
Signed-off-by: Mark Fasheh

Sunil Mushran
2009-02-27 03:51:09 +0800

03 Feb, 2009

1 commit

a4b91965d ocfs2: Wakeup the downconvert thread after a successful cancel convert

When two nodes holding PR locks on a resource concurrently attempt to
upconvert the locks to EX, the master sends a BAST to one of the nodes. This
message tells that node to first cancel convert the upconvert request,
followed by downconvert to a NL. Only when this lock is downconverted to NL,
can the master upconvert the first node's lock to EX.

While the fs was doing the cancel convert, it was forgetting to wake up the
dc thread after a successful cancel, leading to a deadlock.

Reported-and-Tested-by: David Teigland
Signed-off-by: Sunil Mushran
Signed-off-by: Mark Fasheh

Sunil Mushran
2009-02-03 06:20:19 +0800

09 Jan, 2009

1 commit

73ac36ea1 fix similar typos to successfull

While reviewing ocfs2 code, I found 2 typos of "successfull". After
running grep "successfull " on the kernel tree, 22 such typos were found in
total -- great minds always think alike :)

This patch fixes all the similar typos. Thanks to Randy for his ack and comments.

Signed-off-by: Coly Li
Acked-by: Randy Dunlap
Acked-by: Roland Dreier
Cc: Jeremy Kerr
Cc: Jeff Garzik
Cc: Heiko Carstens
Cc: Martin Schwidefsky
Cc: Theodore Ts'o
Cc: Mark Fasheh
Cc: Vlad Yasevich
Cc: Sridhar Samudrala
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Coly Li
2009-01-09 00:31:15 +0800

06 Jan, 2009

3 commits

a641dc2a5 ocfs2: remove unneeded lvb casts

dlmglue.c has lots of code which casts the return value of ocfs2_dlm_lvb().
This is pointless however, as ocfs2_dlm_lvb() returns void *.

Signed-off-by: Mark Fasheh

Mark Fasheh
2009-01-06 00:40:36 +0800
85eb8b73d ocfs2: Fix ocfs2_read_quota_block() error handling.

ocfs2_bread() has become ocfs2_read_virt_blocks(), with a prototype to
match ocfs2_read_blocks(). The quota code, converting from
ocfs2_bread(), wraps the call to ocfs2_read_virt_blocks() in
ocfs2_read_quota_block(). Unfortunately, the prototype of
ocfs2_read_quota_block() matches the old prototype of ocfs2_bread().

The problem is that ocfs2_bread() returned the buffer head, and callers
assumed that a NULL pointer was indicative of error. It wasn't. This
is why ocfs2_bread() took an int *err argument as well.

The new prototype of ocfs2_read_virt_blocks() avoids this error handling
confusion. Let's change ocfs2_read_quota_block() to match.

Signed-off-by: Joel Becker
Acked-by: Jan Kara
Signed-off-by: Mark Fasheh

Joel Becker
2009-01-06 00:40:24 +0800
9e33d69f5 ocfs2: Implementation of local and global quota file handling

For each quota type, each node has a local quota file. In this file it stores
changes users have made to disk usage via this node. Once in a while this
information is synced to the global file (and thus with other nodes) so that
limits enforcement at least approximately works.
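
Why enforcement is only approximate can be seen in a toy model of the local/global split. Everything here is hypothetical (class names, the single per-user limit, in-memory dictionaries instead of quota files); it only illustrates the delta-then-sync idea described above.

```python
from collections import defaultdict


class GlobalQuota:
    """Toy stand-in for the cluster-wide quota file."""

    def __init__(self, limits):
        self.limits = dict(limits)
        self.usage = defaultdict(int)

    def over_limit(self, user: str) -> bool:
        return self.usage[user] > self.limits.get(user, float("inf"))


class NodeLocalQuota:
    """Toy stand-in for one node's local quota file."""

    def __init__(self, global_quota: GlobalQuota):
        self.global_quota = global_quota
        self.pending = defaultdict(int)

    def charge(self, user: str, delta: int) -> None:
        # Usage changes made through this node are recorded locally first.
        self.pending[user] += delta

    def sync(self) -> None:
        # Once in a while the accumulated deltas are folded into the global file.
        for user, delta in self.pending.items():
            self.global_quota.usage[user] += delta
        self.pending.clear()


shared = GlobalQuota({"alice": 100})
node_a, node_b = NodeLocalQuota(shared), NodeLocalQuota(shared)
node_a.charge("alice", 60)
node_b.charge("alice", 60)
print(shared.over_limit("alice"))  # deltas not yet visible globally
node_a.sync()
node_b.sync()
print(shared.over_limit("alice"))  # both deltas now counted
```

Between syncs each node sees only its own changes, so a user can briefly exceed a limit across the cluster before the global view catches up.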

Global quota files contain all the information about usage and limits. It's
mostly handled by the generic VFS code (which implements a trie of structures
inside a quota file). We only have to provide functions to convert structures
from on-disk format to in-memory one. We also have to provide wrappers for
various quota functions starting transactions and acquiring necessary cluster
locks before the actual IO is really started.

Signed-off-by: Jan Kara
Signed-off-by: Mark Fasheh

Jan Kara
2009-01-06 00:40:23 +0800