28 Feb, 2013

3 commits

  • Convert to the much saner new idr interface. Error return values from
    recover_idr_add() mix -1 and -errno; the conversion doesn't change
    that, but it looks iffy.
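
    A minimal sketch of the new allocation pattern (lock and field
    names are illustrative, not the exact converted code):

        int id;

        idr_preload(GFP_NOFS);
        spin_lock(&ls->ls_recover_idr_lock);
        /* allocate the lowest free id >= 1; end == 0 means no limit */
        id = idr_alloc(&ls->ls_recover_idr, r, 1, 0, GFP_NOWAIT);
        spin_unlock(&ls->ls_recover_idr_lock);
        idr_preload_end();

        if (id < 0)
                return id;      /* -ENOMEM or -ENOSPC */
        r->res_id = id;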

    Signed-off-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • idr_destroy() can destroy idr by itself and idr_remove_all() is being
    deprecated.

    The conversion isn't completely trivial for recover_idr_clear(), as
    it's the only place in the kernel that makes legitimate use of
    idr_remove_all() without idr_destroy(). Replace it with an
    idr_remove() call inside an idr_for_each_entry() loop, as sketched
    below. The removal goes at the top of the loop so that it matches
    the operation order in recover_idr_del().
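
    A minimal sketch of the replacement (field names illustrative):

        struct dlm_rsb *r;
        int id;

        idr_for_each_entry(&ls->ls_recover_idr, r, id) {
                /* remove first, matching the order in recover_idr_del() */
                idr_remove(&ls->ls_recover_idr, id);
                r->res_id = 0;
        }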

    Signed-off-by: Tejun Heo
    Cc: Christine Caulfield
    Cc: David Teigland
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     
  • Convert recover_idr_clear() to use idr_for_each_entry() instead of
    idr_for_each(). It's somewhat less efficient this way, but it
    shouldn't matter in an error path. This is to help with the
    deprecation of idr_remove_all().
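
    For contrast, a sketch of the two iteration styles (callback and
    field names illustrative):

        /* before: callback-based iteration */
        static int clear_fn(int id, void *ptr, void *data)
        {
                struct dlm_rsb *r = ptr;

                r->res_id = 0;
                return 0;
        }

        idr_for_each(&ls->ls_recover_idr, clear_fn, NULL);

        /* after: open-coded loop; one idr_get_next() per entry is
           somewhat less efficient, but fine in an error path */
        idr_for_each_entry(&ls->ls_recover_idr, r, id)
                r->res_id = 0;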

    Signed-off-by: Tejun Heo
    Cc: Christine Caulfield
    Cc: David Teigland
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

17 Nov, 2012

1 commit

  • When a node is removed that held a PW/EX lock, the
    existing master node should invalidate the lvb on the
    resource due to the purged lock.

    Previously, the existing master node was invalidating
    the lvb if it found only NL/CR locks on the resource
    during recovery for the removed node. This could lead
    to cases where it invalidated the lvb and shouldn't
    have, or cases where it should have invalidated and
    didn't.

    When recovery selects a *new* master node for a
    resource, and that new master finds only NL/CR locks
    on the resource after lock recovery, it should
    invalidate the lvb. This case was handled correctly
    (but was incorrectly applied to the existing master
    case also).

    When a process exits while holding a PW/EX lock,
    the lvb on the resource should be invalidated.
    This was not happening.

    The lvb contents and VALNOTVALID flag should be
    recovered before granting locks in recovery so that
    the recovered lvb state is provided in the callback.
    The lvb was being recovered after the lock was granted.
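
    A minimal sketch of the mode check this implies (the helper name
    is hypothetical; the real recovery code is more involved):

        static void purged_lock_check_lvb(struct dlm_rsb *r,
                                          struct dlm_lkb *lkb)
        {
                /* only PW and EX grants can have written the lvb, so
                   only they force invalidation when purged */
                if (lkb->lkb_grmode == DLM_LOCK_PW ||
                    lkb->lkb_grmode == DLM_LOCK_EX)
                        rsb_set_flag(r, RSB_VALNOTVALID);
        }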

    Signed-off-by: David Teigland

    David Teigland
     

17 Jul, 2012

4 commits

  • The process of rebuilding locks on a new master during
    recovery could re-order the locks on the convert queue,
    creating an "in place" conversion deadlock that would
    not be resolved. Fix this by not considering queue
    order when granting conversions after recovery.
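
    Roughly, the grant pass becomes (a sketch only; helper names
    approximate the dlm internals, and the real code is more involved):

        struct dlm_lkb *lkb, *s;

        /* after recovery, attempt every conversion on the queue
           instead of stopping at the first lock that cannot be
           granted, so a queue reordered by rebuilding cannot
           deadlock "in place" */
        list_for_each_entry_safe(lkb, s, &r->res_convertqueue,
                                 lkb_statequeue) {
                if (can_be_granted(r, lkb, 0, NULL))
                        grant_lock(r, lkb);
                else if (!recovery)
                        break;  /* the normal path still honors order */
        }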

    Signed-off-by: David Teigland

    David Teigland
     
  • Use wait_event_timeout to avoid using a timer
    directly.
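
    A minimal sketch of the pattern (the condition helper here is
    hypothetical):

        long rv;

        /* sleep until the condition is true or the timeout elapses;
           returns 0 on timeout, otherwise the remaining jiffies */
        rv = wait_event_timeout(ls->ls_wait_general,
                                all_replies_received(ls),
                                dlm_config.ci_recover_timer * HZ);
        if (!rv)
                log_debug(ls, "wait timed out");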

    Signed-off-by: David Teigland

    David Teigland
     
  • When a large number of resources are being recovered,
    a linear search of the recover_list takes a long time.
    Use an idr in place of a list.
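
    A sketch of the resulting lookup, simplified (lock and field
    names illustrative):

        static struct dlm_rsb *recover_idr_find(struct dlm_ls *ls,
                                                uint64_t id)
        {
                struct dlm_rsb *r;

                spin_lock(&ls->ls_recover_idr_lock);
                /* O(log n) lookup instead of a linear list walk */
                r = idr_find(&ls->ls_recover_idr, (int)id);
                spin_unlock(&ls->ls_recover_idr_lock);
                return r;
        }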

    Signed-off-by: David Teigland

    David Teigland
     
  • Remove the dir hash table (dirtbl), and use
    the rsb hash table (rsbtbl) as the resource
    directory. It has always been an unnecessary
    duplication of information.

    This improves efficiency by using a single rsbtbl
    lookup in many cases where both rsbtbl and dirtbl
    lookups were needed previously.

    This eliminates the need to handle cases of rsbtbl
    and dirtbl being out of sync.

    In many cases there will be memory savings because
    the dir hash table no longer exists.

    Signed-off-by: David Teigland

    David Teigland
     

03 May, 2012

1 commit

  • The "nodir" mode (statically assign master nodes instead
    of using the resource directory) has always been highly
    experimental, and never seriously used. This commit
    fixes a number of problems, making nodir much more usable.

    - Major change to recovery: recover all locks and restart
    all in-progress operations after recovery. In some
    cases it's not possible to know which in-progress locks
    to recover, so recover all. (Most locks require recovery
    in nodir mode anyway, since rehashing changes most
    master nodes.)

    - Change the way nodir mode is enabled, from a command
    line mount arg passed through gfs2, into a sysfs
    file managed by dlm_controld, consistent with the
    other config settings.

    - Allow recovering MSTCPY locks on an rsb that has not
    yet been turned into a master copy.

    - Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
    from a previous, aborted recovery cycle. Base this
    on the local recovery status not being in the state
    where any nodes should be sending LOCK messages for the
    current recovery cycle.

    - Hold rsb lock around dlm_purge_mstcpy_locks() because it
    may run concurrently with dlm_recover_master_copy().

    - Maintain highbast on process-copy lkb's (in addition to
    the master as is usual), because the lkb can switch
    back and forth between being a master and being a
    process copy as the master node changes in recovery.

    - When recovering MSTCPY locks, flag rsb's that have
    non-empty convert or waiting queues for granting
    at the end of recovery. (Rename the flag from LOCKS_PURGED
    to RECOVER_GRANT, and similarly for the recovery function,
    because it's not only resources with purged locks
    that need a grant attempt.)

    - Replace a couple of unnecessary assertion panics with
    error messages.

    Signed-off-by: David Teigland

    David Teigland
     

04 Jan, 2012

2 commits

  • Slot numbers are assigned to nodes when they join the lockspace.
    The slot number chosen is the minimum unused value starting at 1.
    Once a node is assigned a slot, that slot number will not change
    while the node remains a lockspace member. If the node leaves
    and rejoins it can be assigned a new slot number.

    A new generation number is also added to a lockspace. It is
    set and incremented during each recovery along with the slot
    collection/assignment.

    The slot numbers will be passed to gfs2 which will use them as
    journal id's.
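
    A sketch of the assignment rule (function and field names
    illustrative, not the exact dlm code):

        /* return the smallest slot number >= 1 not used by any
           current member */
        static int lowest_unused_slot(struct dlm_member *memb, int num)
        {
                int i, slot = 1, used;

                for (;;) {
                        used = 0;
                        for (i = 0; i < num; i++) {
                                if (memb[i].slot == slot)
                                        used = 1;
                        }
                        if (!used)
                                return slot;
                        slot++;
                }
        }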

    Signed-off-by: David Teigland

    David Teigland
     
  • Put all the calls to recovery barriers in the same function
    to clarify where they each happen. Should not change any behavior.
    Also modify some recovery debug lines to make them consistent.

    Signed-off-by: David Teigland

    David Teigland
     

19 Nov, 2011

1 commit

  • Change the linked lists to rb_tree's in the rsb
    hash table to speed up searches. Slow rsb searches
    were having a large impact on gfs2 performance due
    to the large number of dlm locks gfs2 uses.
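
    A sketch of the search this enables (simplified; the real
    comparison also accounts for differing name lengths):

        static struct dlm_rsb *search_rsb_tree(struct rb_root *tree,
                                               const char *name, int len)
        {
                struct rb_node *node = tree->rb_node;
                struct dlm_rsb *r;
                int rc;

                while (node) {
                        r = rb_entry(node, struct dlm_rsb, res_hashnode);
                        rc = memcmp(name, r->res_name, len);
                        if (rc < 0)
                                node = node->rb_left;
                        else if (rc > 0)
                                node = node->rb_right;
                        else
                                return r;  /* O(log n), not O(n) */
                }
                return NULL;
        }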

    Signed-off-by: Bob Peterson
    Signed-off-by: David Teigland

    Bob Peterson
     

31 Jan, 2008

1 commit

  • To prevent the master of an rsb from changing rapidly, an unused rsb is kept
    on the "toss list" for a period of time to be reused. The toss list was
    being cleared completely for each recovery, which is unnecessary. Much of
    the benefit of the toss list can be maintained if nodes keep rsb's in their
    toss list that they are the master of. These rsb's need to be included
    when the resource directory is rebuilt during recovery.
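
    Roughly, the per-recovery clearing becomes (a sketch; names
    approximate the dlm internals):

        struct dlm_rsb *r, *safe;

        /* within the loop over hash buckets b: keep tossed rsb's
           that this node masters; they are added back to the
           resource directory when it is rebuilt */
        list_for_each_entry_safe(r, safe, &ls->ls_rsbtbl[b].toss,
                                 res_hashchain) {
                if (!is_master(r)) {
                        list_del(&r->res_hashchain);
                        dlm_free_rsb(r);
                }
        }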

    Signed-off-by: David Teigland

    David Teigland
     

06 Feb, 2007

2 commits

  • If master recovery happens on an rsb in one recovery sequence, and
    that sequence is aborted before lock recovery happens, then in the
    next sequence we rely on the previous master recovery (which may now
    be invalid due to another node ignoring a lookup result) and go on
    to do the lock recovery, where we get stuck due to an invalid master
    value.

    recovery cycle begins: master of rsb X has left
    nodes A and B send node C an rcom lookup for X to find the new master
    C gets lookup from B first, sets B as new master, and sends reply back to B
    C gets lookup from A next, and sends reply back to A saying B is master
    A gets lookup reply from C and sets B as the new master in the rsb
    recovery cycle on A, B and C is aborted to start a new recovery
    B gets lookup reply from C and ignores it since there's a new recovery
    recovery cycle begins: some other node has joined
    B doesn't think it's the master of X so it doesn't rebuild it in the directory
    C looks up the master of X, no one is master, so it becomes new master
    B looks up the master of X, finds it's C
    A believes that B is the master of X, so it sends its lock to B
    B sends an error back to A
    A resends
    this repeats forever, the incorrect master value on A is never corrected

    The fix is to do master recovery on an rsb that still has the NEW_MASTER
    flag set from an earlier recovery sequence, and therefore didn't complete
    lock recovery.
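
    A sketch of the resulting check (simplified):

        /* recover the master if it left in this cycle, or if a
           previous aborted cycle left NEW_MASTER set, meaning lock
           recovery never completed for this rsb */
        if (dlm_is_removed(ls, r->res_nodeid) ||
            rsb_flag(r, RSB_NEW_MASTER))
                recover_master(r);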

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
    can use macros to add configfs functions to access them (in a later
    patch). No functional changes in this patch, just naming changes.
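
    For illustration, the kind of macro this naming enables
    (simplified, not the exact later patch):

        #define CLUSTER_SHOW(name)                                    \
        static ssize_t name##_show(struct config_item *i, char *buf)  \
        {                                                             \
                return sprintf(buf, "%d\n", dlm_config.ci_##name);    \
        }

        CLUSTER_SHOW(buffer_size)       /* expands to buffer_size_show() */
        CLUSTER_SHOW(recover_timer)     /* expands to recover_timer_show() */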

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

30 Nov, 2006

1 commit

  • Red Hat BZ 213684

    If a node sends an lkb to the new master (RCOM_LOCK message) during
    recovery, and recovery is then aborted on both nodes before it gets
    a reply, the res_recover_locks_count needs to be reset to 0 so that,
    when the subsequent recovery sends the lkb to the new master again,
    the assertion that checks the counter is zero doesn't trigger.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

21 Aug, 2006

1 commit

  • Introduce new function dlm_dump_rsb() to call within assertions instead of
    dlm_print_rsb(). The new function dumps info about all locks on the rsb
    in addition to rsb details.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

26 Jul, 2006

1 commit

  • If a node becomes the new master of an rsb during recovery, the
    LOCKS_PURGED flag needs to be set on it so that any waiting or
    converting locks are retried for granting.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

18 Jan, 2006

1 commit

  • This is the core of the distributed lock manager which is required
    to use GFS2 as a cluster filesystem. It is also used by CLVM and
    can be used as a standalone lock manager independently of either
    of these two projects.

    It implements VAX-style locking modes.
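
    For reference, the lock modes as defined in the dlm headers:

        #define DLM_LOCK_IV     (-1)    /* invalid */
        #define DLM_LOCK_NL     0       /* null */
        #define DLM_LOCK_CR     1       /* concurrent read */
        #define DLM_LOCK_CW     2       /* concurrent write */
        #define DLM_LOCK_PR     3       /* protected read */
        #define DLM_LOCK_PW     4       /* protected write */
        #define DLM_LOCK_EX     5       /* exclusive */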

    Signed-off-by: David Teigland
    Signed-off-by: Steve Whitehouse

    David Teigland