Eric Lee / smarc-fsl-linux-kernel

04 Jan, 2012

3 commits

60f98d183 dlm: add recovery callbacks ... Browse Code »

These new callbacks notify the dlm user about lock recovery.
GFS2, and possibly others, need to be aware of when the dlm
will be doing lock recovery for a failed lockspace member.

In the past, this coordination has been done between dlm and
file system daemons in userspace, which then direct their
kernel counterparts. These callbacks allow the same
coordination directly, and more simply.

Signed-off-by: David Teigland

David Teigland
2012-01-04 22:56:31 +0800
757a42719 dlm: add node slots and generation ... Browse Code »

Slot numbers are assigned to nodes when they join the lockspace.
The slot number chosen is the minimum unused value starting at 1.
Once a node is assigned a slot, that slot number will not change
while the node remains a lockspace member. If the node leaves
and rejoins it can be assigned a new slot number.

A new generation number is also added to a lockspace. It is
set and incremented during each recovery along with the slot
collection/assignment.

The slot numbers will be passed to gfs2 which will use them as
journal id's.

Signed-off-by: David Teigland

David Teigland
2012-01-04 22:55:57 +0800
f95a34c66 dlm: move recovery barrier calls ... Browse Code »

Put all the calls to recovery barriers in the same function
to clarify where they each happen. Should not change any behavior.
Also modify some recovery debug lines to make them consistent.

Signed-off-by: David Teigland

David Teigland
2012-01-04 22:53:27 +0800

04 Feb, 2010

1 commit

c41b20e72 Fix misspellings of "truly" in comments. ... Browse Code »

Some comments misspell "truly"; this fixes them. No code changes.

Signed-off-by: Adam Buchbinder
Signed-off-by: Jiri Kosina

Adam Buchbinder
2010-02-04 18:55:45 +0800

01 Dec, 2009

1 commit

573c24c4a dlm: always use GFP_NOFS ... Browse Code »

Replace all GFP_KERNEL and ls_allocation with GFP_NOFS.
ls_allocation would be GFP_KERNEL for userland lockspaces
and GFP_NOFS for file system lockspaces.

It was discovered that any lockspaces on the system can
affect all others by triggering memory reclaim in the
file system which could in turn call back into the dlm
to acquire locks, deadlocking dlm threads that were
shared by all lockspaces, like dlm_recv.

Signed-off-by: David Teigland

David Teigland
2009-12-01 06:34:43 +0800

16 May, 2009

1 commit

748285ccf dlm: use more NOFS allocation ... Browse Code »

Change some GFP_KERNEL allocations to use either GFP_NOFS or
ls_allocation (when available) which the fs sets to GFP_NOFS.
The point is to prevent allocations from going back into the
cluster fs in places where that might lead to deadlock.

Signed-off-by: David Teigland

David Teigland
2009-05-16 00:24:59 +0800

15 May, 2009

1 commit

391fbdc5d dlm: connect to nodes earlier ... Browse Code »

Make network connections to other nodes earlier, in the context of
dlm_recoverd. This avoids connecting to nodes from dlm_send where we
try to avoid allocations which could possibly deadlock if memory reclaim
goes into the cluster fs which may try to do a dlm operation.

Signed-off-by: Christine Caulfield
Signed-off-by: David Teigland

Christine Caulfield
2009-05-15 22:34:12 +0800

22 Apr, 2008

1 commit

d44e0fc70 dlm: recover nodes that are removed and re-added ... Browse Code »

If a node is removed from a lockspace, and then added back before the
dlm is notified of the removal, the dlm will not detect the removal
and won't clear the old state from the node. This is fixed by using a
list of added nodes so the membership recovery can detect when a newly
added node is already in the member list.

Signed-off-by: David Teigland

David Teigland
2008-04-22 00:18:01 +0800

31 Jan, 2008

1 commit

46b43eed7 dlm: reject messages from non-members ... Browse Code »

Messages from nodes that are no longer members of the lockspace should be
ignored. When nodes are removed from the lockspace, recovery can
sometimes complete quickly enough that messages arrive from a removed node
after recovery has completed. When processed, these messages would often
cause an error message, and could in some cases change some state, causing
problems.

Signed-off-by: David Teigland

David Teigland
2008-01-31 01:04:42 +0800

10 Oct, 2007

1 commit

c36258b59 [DLM] block dlm_recv in recovery transition ... Browse Code »

Introduce a per-lockspace rwsem that's held in read mode by dlm_recv
threads while working in the dlm. This allows dlm_recv activity to be
suspended when the lockspace transitions to, from and between recovery
cycles.

The specific bug prompting this change is one where an in-progress
recovery cycle is aborted by a new recovery cycle. While dlm_recv was
processing a recovery message, the recovery cycle was aborted and
dlm_recoverd began cleaning up. dlm_recv decremented recover_locks_count
on an rsb after dlm_recoverd had reset it to zero. This is fixed by
suspending dlm_recv (taking write lock on the rwsem) before aborting the
current recovery.

The transitions to/from normal and recovery modes are simplified by using
this new ability to block dlm_recv. The switch from normal to recovery
mode means dlm_recv goes from processing locking messages, to saving them
for later, and vice versa. Races are avoided by blocking dlm_recv when
setting the flag that switches between modes.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-10-10 15:56:38 +0800

14 Aug, 2007

1 commit

1a2bf2eef [DLM] Fix memory leak in dlm_add_member() when dlm_node_weight() returns less than zero ... Browse Code »

There's a memory leak in fs/dlm/member.c::dlm_add_member().

If "dlm_node_weight(ls->ls_name, nodeid)" returns < 0, then
we'll return without freeing the memory allocated to the (at
that point yet unused) 'memb'.
This patch frees the allocated memory in that case and thus
avoids the leak.

Signed-off-by: Jesper Juhl
Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

Jesper Juhl
2007-08-14 17:30:04 +0800

09 Jul, 2007

2 commits

8b0e7b2cf [DLM] wait for config check during join [6/6] ... Browse Code »

Joining the lockspace should wait for the initial round of inter-node
config checks to complete before returning. This way, if there's a
configuration mismatch between the joining node and the existing nodes,
the join can fail and return an error to the application.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-07-09 15:22:42 +0800
3ae1acf93 [DLM] add lock timeouts and warnings [2/6] ... Browse Code »

New features: lock timeouts and time warnings. If the DLM_LKF_TIMEOUT
flag is set, then the request/conversion will be canceled after waiting
the specified number of centiseconds (specified per lock). This feature
is only available for locks requested through libdlm (can be enabled for
kernel dlm users if there's a use for it.)

If the new DLM_LSFL_TIMEWARN flag is set when creating the lockspace, then
a warning message will be sent to userspace (using genetlink) after a
request/conversion has been waiting for a given number of centiseconds
(configurable per node). The time warnings will be used in the future
to do deadlock detection in userspace.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-07-09 15:22:33 +0800

30 Nov, 2006

1 commit

91c0dc93a [DLM] fix aborted recovery during node removal ... Browse Code »

Red Hat BZ 211914

With the new cluster infrastructure, dlm recovery for a node removal can
be aborted and restarted for a node addition. When this happens, the
restarted recovery isn't aware that it's doing recovery for the earlier
removal as well as the addition. So, it then skips the recovery steps
only required when nodes are removed. This can result in locks not being
purged for failed/removed nodes. The fix is to check for removed nodes
for which recovery has not been completed at the start of a new recovery
sequence.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-11-30 23:35:13 +0800

09 Aug, 2006

2 commits

faa0f2677 [DLM] show nodeid for recovery message ... Browse Code »

To aid debugging, it's useful to be able to see what nodeid the dlm is
waiting on for a message reply.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-08-09 21:46:38 +0800
f6db1b8e7 [DLM] abort recovery more quickly ... Browse Code »

When we abort one recovery to do another, break out of the ping_members()
routine more quickly, and wake up the dlm_recoverd thread more quickly
instead of waiting for it to time out.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-08-09 21:45:31 +0800

28 Apr, 2006

1 commit

1c032c031 [DLM] PATCH 2/3 dlm: lowcomms close ... Browse Code »

When a node is removed from a lockspace configuration, close our
connection to it, clearing any remaining messages for it.

Signed-off-by: David Teigland
Signed-off-by: Patrick Caulfield
Signed-off-by: Steven Whitehouse

David Teigland
2006-04-28 22:50:41 +0800

20 Jan, 2006

1 commit

901359256 [DLM] Update DLM to the latest patch level ... Browse Code »

Signed-off-by: David Teigland
Signed-off-by: Steve Whitehouse

David Teigland
2006-01-20 16:47:07 +0800

18 Jan, 2006

1 commit

e7fd41792 [DLM] The core of the DLM for GFS2/CLVM ... Browse Code »

This is the core of the distributed lock manager which is required
to use GFS2 as a cluster filesystem. It is also used by CLVM and
can be used as a standalone lock manager independantly of either
of these two projects.

It implements VAX-style locking modes.

Signed-off-by: David Teigland
Signed-off-by: Steve Whitehouse

David Teigland
2006-01-18 17:30:29 +0800