04 Feb, 2008

1 commit


10 Oct, 2007

1 commit

  • Introduce a per-lockspace rwsem that's held in read mode by dlm_recv
    threads while working in the dlm. This allows dlm_recv activity to be
    suspended when the lockspace transitions to, from and between recovery
    cycles.

    The specific bug prompting this change is one where an in-progress
    recovery cycle is aborted by a new recovery cycle. While dlm_recv was
    processing a recovery message, the recovery cycle was aborted and
    dlm_recoverd began cleaning up. dlm_recv decremented recover_locks_count
    on an rsb after dlm_recoverd had reset it to zero. This is fixed by
    suspending dlm_recv (taking write lock on the rwsem) before aborting the
    current recovery.

    The transitions to/from normal and recovery modes are simplified by using
    this new ability to block dlm_recv. The switch from normal to recovery
    mode means dlm_recv goes from processing locking messages, to saving them
    for later, and vice versa. Races are avoided by blocking dlm_recv when
    setting the flag that switches between modes.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

30 Nov, 2006

2 commits

  • Requests that arrive after recovery has started are saved in the
    requestqueue and processed after recovery is done. Some of these requests
    are purged during recovery if they are from nodes that have been removed.
    We move the purging of the requests (dlm_purge_requestqueue) to later in
    the recovery sequence which allows the routine saving requests
    (dlm_add_requestqueue) to avoid filtering out requests by nodeid since the
    same will be done by the purge. The current code has add_requestqueue
    filtering by nodeid but doesn't hold any locks when accessing the list of
    current nodes. This also means that we need to call the purge routine
    when the lockspace is being shut down since the add routine will not be
    rejecting requests itself any more.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Red Hat BZ 211914

    There's a race between dlm_recoverd (1) enabling locking and (2) clearing
    out the requestqueue, and dlm_recvd (1) checking if locking is enabled and
    (2) adding a message to the requestqueue. An order of recoverd(1),
    recvd(1), recvd(2), recoverd(2) will result in a message being left on the
    requestqueue. The fix is to have dlm_recvd check if dlm_recoverd has
    enabled locking after taking the mutex for the requestqueue and if it has
    processing the message instead of queueing it.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

20 Jan, 2006

1 commit


18 Jan, 2006

1 commit

  • This is the core of the distributed lock manager which is required
    to use GFS2 as a cluster filesystem. It is also used by CLVM and
    can be used as a standalone lock manager independantly of either
    of these two projects.

    It implements VAX-style locking modes.

    Signed-off-by: David Teigland
    Signed-off-by: Steve Whitehouse

    David Teigland