Eric Lee / smarc-fsl-linux-kernel

04 Jan, 2012

1 commit

757a42719 dlm: add node slots and generation

Slot numbers are assigned to nodes when they join the lockspace.
The slot number chosen is the minimum unused value starting at 1.
Once a node is assigned a slot, that slot number will not change
while the node remains a lockspace member. If the node leaves
and rejoins it can be assigned a new slot number.

A new generation number is also added to a lockspace. It is
set and incremented during each recovery along with the slot
collection/assignment.

The slot numbers will be passed to gfs2 which will use them as
journal IDs.
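
The slot rule described above ("minimum unused value starting at 1") can be sketched in a few lines of C. This is an illustration only, not the code from the commit: the function name `lowest_free_slot` and the array-of-slots representation are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the kernel's): return the lowest slot number
 * >= 1 that no current member holds.  used[] lists the slot of each
 * current member; 0 means "no slot assigned yet".
 */
int lowest_free_slot(const int *used, size_t n)
{
    /* With n members at most n slots are taken, so this ends by n + 1. */
    for (int slot = 1; ; slot++) {
        bool taken = false;

        for (size_t i = 0; i < n; i++) {
            if (used[i] == slot) {
                taken = true;
                break;
            }
        }
        if (!taken)
            return slot;
    }
}
```

With members holding slots 1, 2 and 4, a joining node would get slot 3 under this rule, which is why a node that leaves and rejoins may come back with a different number.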

Signed-off-by: David Teigland

David Teigland
2012-01-04 22:55:57 +0800

11 Mar, 2011

1 commit

8304d6f24 dlm: record full callback state

Change how callbacks are recorded for locks. Previously, information
about multiple callbacks was combined into a couple of variables that
indicated what the end result should be. In some situations, we
could not tell from this combined state what the exact sequence of
callbacks was, and would end up either delivering the callbacks in
the wrong order or suppressing redundant callbacks incorrectly. This
new approach records all the data for each callback, leaving no
uncertainty about what needs to be delivered.
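
To illustrate the idea of recording every callback rather than a merged end state, here is a minimal sketch. All names (`cb_record`, `lock_cbs`, `cb_add`, `cb_remove`) and the queue capacity are assumptions for the example; the structures used by the actual patch are not shown in this log.

```c
#include <stddef.h>

#define CB_QUEUE_SIZE 4   /* assumed capacity, chosen for the example */

enum cb_kind { CB_NONE = 0, CB_CAST, CB_BAST };

/* One entry per callback, so the delivery order is never ambiguous. */
struct cb_record {
    enum cb_kind kind;  /* completion (cast) or blocking (bast) callback */
    int mode;           /* lock mode the callback refers to */
    unsigned seq;       /* arrival order */
};

struct lock_cbs {
    struct cb_record q[CB_QUEUE_SIZE];
    size_t count;
    unsigned next_seq;
};

/* Append one callback; returns 0 on success, -1 if the queue is full. */
int cb_add(struct lock_cbs *l, enum cb_kind kind, int mode)
{
    if (l->count == CB_QUEUE_SIZE)
        return -1;
    l->q[l->count].kind = kind;
    l->q[l->count].mode = mode;
    l->q[l->count].seq = l->next_seq++;
    l->count++;
    return 0;
}

/* Remove the oldest callback into *out; returns 0, or -1 if empty. */
int cb_remove(struct lock_cbs *l, struct cb_record *out)
{
    if (l->count == 0)
        return -1;
    *out = l->q[0];
    for (size_t i = 1; i < l->count; i++)
        l->q[i - 1] = l->q[i];
    l->count--;
    return 0;
}
```

Because each record keeps its own kind, mode and sequence number, a consumer can deliver callbacks strictly in arrival order and decide about redundancy per entry instead of guessing from a combined state.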

Signed-off-by: David Teigland

David Teigland
2011-03-11 00:40:00 +0800

01 Dec, 2009

1 commit

573c24c4a dlm: always use GFP_NOFS

Replace all GFP_KERNEL and ls_allocation with GFP_NOFS.
ls_allocation would be GFP_KERNEL for userland lockspaces
and GFP_NOFS for file system lockspaces.

It was discovered that any lockspace on the system can
affect all others by triggering memory reclaim in the
file system which could in turn call back into the dlm
to acquire locks, deadlocking dlm threads that were
shared by all lockspaces, like dlm_recv.

Signed-off-by: David Teigland

David Teigland
2009-12-01 06:34:43 +0800

22 Feb, 2008

1 commit

599e0f584 dlm: fix rcom_names message to self

The recent patch to validate data lengths in rcom_names messages
failed to account for fake messages a node directs to itself before
ever sending them. In this case we need to fill in the message length
in the header for the validation code to use.

Signed-off-by: David Teigland

David Teigland
2008-02-22 05:19:54 +0800

06 Feb, 2008

1 commit

e5dae548b dlm: proper types for asts and basts

Use proper types for ast and bast functions, and use a
consistent type for the ast param.

Signed-off-by: David Teigland

David Teigland
2008-02-06 14:35:45 +0800

04 Feb, 2008

5 commits

ae773d0b7 dlm: verify that places expecting rcom_lock have packet long enough

Signed-off-by: Al Viro
Signed-off-by: David Teigland

Al Viro
2008-02-04 15:25:09 +0800
02ed16b64 dlm: missing length check in check_config()

Signed-off-by: Al Viro
Signed-off-by: David Teigland

Al Viro
2008-02-04 15:24:20 +0800
4007685c6 dlm: use proper type for ->ls_recover_buf

Signed-off-by: Al Viro
Signed-off-by: David Teigland

Al Viro
2008-02-04 15:24:07 +0800
93ff2971e dlm: do not byteswap rcom_config

Signed-off-by: Al Viro
Signed-off-by: David Teigland

Al Viro
2008-02-04 15:23:43 +0800
163a1859e dlm: do not byteswap rcom_lock

Signed-off-by: Al Viro
Signed-off-by: David Teigland

Al Viro
2008-02-04 15:23:14 +0800

31 Jan, 2008

1 commit

dbcfc3473 dlm: clean ups

A couple small clean-ups. Remove unnecessary wrapper-functions in
rcom.c, and remove unnecessary casting and an unnecessary ASSERT in
util.c.

Signed-off-by: David Teigland

David Teigland
2008-01-31 01:04:43 +0800

10 Oct, 2007

1 commit

c36258b59 [DLM] block dlm_recv in recovery transition

Introduce a per-lockspace rwsem that's held in read mode by dlm_recv
threads while working in the dlm. This allows dlm_recv activity to be
suspended when the lockspace transitions to, from and between recovery
cycles.

The specific bug prompting this change is one where an in-progress
recovery cycle is aborted by a new recovery cycle. While dlm_recv was
processing a recovery message, the recovery cycle was aborted and
dlm_recoverd began cleaning up. dlm_recv decremented recover_locks_count
on an rsb after dlm_recoverd had reset it to zero. This is fixed by
suspending dlm_recv (taking write lock on the rwsem) before aborting the
current recovery.

The transitions to/from normal and recovery modes are simplified by using
this new ability to block dlm_recv. The switch from normal to recovery
mode means dlm_recv goes from processing locking messages, to saving them
for later, and vice versa. Races are avoided by blocking dlm_recv when
setting the flag that switches between modes.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-10-10 15:56:38 +0800

14 Aug, 2007

1 commit

41684f954 [DLM] fix NULL ls usage

Fix regression in recent patch "[DLM] variable allocation" which
attempts to dereference an "ls" struct when it's NULL.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-08-14 17:28:44 +0800

09 Jul, 2007

2 commits

44f487a55 [DLM] variable allocation

Add a new flag, DLM_LSFL_FS, to be used when a file system creates a lockspace.
This flag causes the dlm to use GFP_NOFS for allocations instead of GFP_KERNEL.
(This updated version of the patch uses gfp_t for ls_allocation.)

Signed-off-by: Patrick Caulfield
Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

Patrick Caulfield
2007-07-09 15:23:17 +0800
8b0e7b2cf [DLM] wait for config check during join [6/6]

Joining the lockspace should wait for the initial round of inter-node
config checks to complete before returning. This way, if there's a
configuration mismatch between the joining node and the existing nodes,
the join can fail and return an error to the application.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-07-09 15:22:42 +0800

06 Feb, 2007

4 commits

68c817a1c [DLM] rename dlm_config_info fields

Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
can use macros to add configfs functions to access them (in a later
patch). No functional changes in this patch, just naming changes.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-02-06 02:36:37 +0800
8ec688674 [DLM] change some log_error to log_debug

Some common, non-error messages should use log_debug instead of log_error
so they can be turned off.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-02-06 02:36:34 +0800
9e971b715 [DLM] add version check

Check if we receive a message from another lockspace member running a
version of the dlm with an incompatible inter-node message protocol.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-02-06 02:35:53 +0800
38aa8b0c5 [DLM] fix old rcom messages

A reply to a recovery message will often be received after the relevant
recovery sequence has aborted and the next recovery sequence has begun.
We need to ignore replies to these old messages from the previous
recovery. There's already a way to do this for synchronous recovery
requests using the rc_id number, but not for async.

Each recovery sequence already has a locally unique sequence number
associated with it. This patch adds a field to the rcom (recovery
message) structure where this recovery sequence number can be placed,
rc_seq. When a node sends a reply to a recovery request, it copies the
rc_seq number it received into rc_seq_reply. When the first node receives
the reply to its recovery message, it will check whether rc_seq_reply
matches the current recovery sequence number, ls_recover_seq, and if not
then it ignores the old reply.

An old, inadequate approach to filtering out old replies (checking if the
current stage of recovery has moved back to the start) has been removed
from two spots.

The protocol version number is changed to reflect the different rcom
structures.
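
The rc_seq / rc_seq_reply check described above might look roughly like the following. Only the field names `rc_seq`, `rc_seq_reply` and `ls_recover_seq` come from the commit message; the struct layouts, field widths and the helper names `fill_request`, `fill_reply` and `is_old_reply` are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Only the two fields named in the commit message are modelled; the real
 * rcom structure has more, and the 64-bit widths are assumed.
 */
struct rcom {
    uint64_t rc_seq;        /* sender's recovery sequence number */
    uint64_t rc_seq_reply;  /* in a reply: the rc_seq being answered */
};

struct lockspace {
    uint64_t ls_recover_seq; /* current local recovery sequence number */
};

/* Requesting node: stamp the request with the current recovery sequence. */
void fill_request(const struct lockspace *ls, struct rcom *req)
{
    req->rc_seq = ls->ls_recover_seq;
    req->rc_seq_reply = 0;
}

/* Replying node: echo the request's rc_seq back in rc_seq_reply. */
void fill_reply(const struct rcom *req, struct rcom *reply)
{
    reply->rc_seq = 0;                /* not needed for this check */
    reply->rc_seq_reply = req->rc_seq;
}

/* Requesting node: a reply from an earlier recovery sequence is stale. */
bool is_old_reply(const struct lockspace *ls, const struct rcom *reply)
{
    return reply->rc_seq_reply != ls->ls_recover_seq;
}
```

If the local sequence number advances between sending a request and receiving its reply, `is_old_reply` returns true and the reply can be dropped without inspecting the recovery stage.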

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2007-02-06 02:35:50 +0800

30 Nov, 2006

4 commits

57adf7eed [DLM] fix format warnings in rcom.c and recoverd.c

This fixes the following gcc warnings generated on
the architectures where uint64_t != unsigned long long (e.g. ppc64).

fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'uint64_t'
fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'uint64_t'
fs/dlm/recoverd.c:48: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
fs/dlm/recoverd.c:202: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
fs/dlm/recoverd.c:210: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
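
The diff itself is not shown in this log, so the following is only the usual remedy for this class of warning, not necessarily what the patch does: cast the value to `unsigned long long` so the argument matches `%llx` regardless of how `uint64_t` is defined. The function name `format_seq` is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 64-bit value as hex.  The explicit cast makes the argument
 * type match %llx on every architecture, whatever uint64_t is a
 * typedef for.  Returns the number of characters written (as snprintf).
 */
int format_seq(char *buf, size_t len, uint64_t seq)
{
    return snprintf(buf, len, "%llx", (unsigned long long)seq);
}
```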

Signed-off-by: Ryusuke Konishi
Signed-off-by: Patrick Caulfield
Signed-off-by: Steven Whitehouse

Ryusuke Konishi
2006-11-30 23:37:22 +0800
98f176fb3 [DLM] don't accept replies to old recovery messages

We often abort a recovery after sending a status request to a remote node.
We want to ignore any potential status reply we get from the remote node.
If we get one of these unwanted replies, we've often moved on to the next
recovery message and incremented the message sequence counter, so the
reply will be ignored due to the seq number. In some cases, we've not
moved on to the next message so the seq number of the reply we want to
ignore is still correct, causing the reply to be accepted. The next
recovery message will then mistake this old reply for a new one.

To fix this, we add the flag RCOM_WAIT to indicate when we can accept a
new reply. We clear this flag if we abort recovery while waiting for a
reply. Before the flag is set again (to allow new replies) we know that
any old replies will be rejected due to their sequence number. We also
initialize the recovery-message sequence number to a random value when a
lockspace is first created. This makes it clear when messages are being
rejected from an old instance of a lockspace that has since been
recreated.
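
A minimal model of the flag-plus-sequence scheme described above is sketched here. `rcom_wait` stands in for the RCOM_WAIT flag; the struct, the function names, and the choice to bump the sequence number before setting the flag are assumptions for the example. The random initial sequence value mentioned in the message is left out.

```c
#include <stdbool.h>
#include <stdint.h>

struct lockspace {
    bool rcom_wait;     /* stands in for the RCOM_WAIT flag */
    uint64_t rcom_seq;  /* recovery-message sequence number */
};

/* Start a request: advance the sequence first, then open the window. */
uint64_t send_request(struct lockspace *ls)
{
    ls->rcom_seq++;
    ls->rcom_wait = true;
    return ls->rcom_seq;
}

/* Aborting recovery closes the window for the outstanding request. */
void abort_recovery(struct lockspace *ls)
{
    ls->rcom_wait = false;
}

/* Accept a reply only if we are waiting and its sequence matches. */
bool accept_reply(struct lockspace *ls, uint64_t reply_seq)
{
    if (!ls->rcom_wait)
        return false;   /* nothing outstanding: stale reply */
    if (reply_seq != ls->rcom_seq)
        return false;   /* answers an older request */
    ls->rcom_wait = false;
    return true;
}
```

The flag covers the case the sequence number alone misses: a reply to an aborted request arriving before the next request has been sent, when the sequence number has not yet moved on.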

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-11-30 23:37:14 +0800
1babdb453 [DLM] fix size of STATUS_REPLY message

When the not_ready routine sends a "fake" status reply with blank status
flags, it needs to use the correct size for a normal STATUS_REPLY by
including the size of the would-be config parameters. We also fill in the
nonexistent config parameters with an invalid lvblen value so it's easier
to notice if these invalid parameters are ever being used.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-11-30 23:37:08 +0800
435618b75 [DLM] status messages ping-pong between unmounted nodes

Red Hat BZ 213682

If two nodes leave the lockspace (while unmounting the fs in the case of
gfs) after one has sent a STATUS message to the other, STATUS/STATUS_REPLY
messages will then ping-pong between the nodes when neither of them can
find the lockspace in question any longer. We kill this by not sending
another STATUS message when we get a STATUS_REPLY for an unknown
lockspace.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-11-30 23:35:06 +0800

24 Aug, 2006

1 commit

f5888750a [DLM] sequence number missing in not_ready reply

When a status reply is sent for a lockspace that doesn't yet exist, the
message sequence number from the sender was not being copied into the
reply causing the sender to ignore the reply.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-08-24 21:37:43 +0800

10 Aug, 2006

1 commit

4a99c3d9d [DLM] reject replies to old requests

When recoveries are aborted by other recoveries we can get replies to
status or names requests that we've given up on. This can cause problems
if we're making another request and receive an old reply. Add a sequence
number to status/names requests and reject replies that don't match. A
field already exists for the seq number that's used in other message
types.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-08-10 05:32:07 +0800

09 Aug, 2006

1 commit

faa0f2677 [DLM] show nodeid for recovery message

To aid debugging, it's useful to be able to see what nodeid the dlm is
waiting on for a message reply.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-08-09 21:46:38 +0800

23 Feb, 2006

1 commit

3bcd3687f [DLM] Remove range locks from the DLM

This patch removes support for range locking from the DLM.

Signed-off-by: David Teigland
Signed-off-by: Steven Whitehouse

David Teigland
2006-02-23 17:56:38 +0800

18 Jan, 2006

1 commit

e7fd41792 [DLM] The core of the DLM for GFS2/CLVM

This is the core of the distributed lock manager which is required
to use GFS2 as a cluster filesystem. It is also used by CLVM and
can be used as a standalone lock manager independently of either
of these two projects.

It implements VAX-style locking modes.

Signed-off-by: David Teigland
Signed-off-by: Steve Whitehouse

David Teigland
2006-01-18 17:30:29 +0800