04 Jan, 2012

1 commit

  • Slot numbers are assigned to nodes when they join the lockspace.
    The slot number chosen is the minimum unused value starting at 1.
    Once a node is assigned a slot, that slot number will not change
    while the node remains a lockspace member. If the node leaves
    and rejoins it can be assigned a new slot number.

    A new generation number is also added to a lockspace. It is
    set and incremented during each recovery along with the slot
    collection/assignment.

    The slot numbers will be passed to gfs2 which will use them as
    journal id's.

    Signed-off-by: David Teigland

    David Teigland
     

11 Mar, 2011

1 commit

  • Change how callbacks are recorded for locks. Previously, information
    about multiple callbacks was combined into a couple of variables that
    indicated what the end result should be. In some situations, we
    could not tell from this combined state what the exact sequence of
    callbacks were, and would end up either delivering the callbacks in
    the wrong order, or suppress redundant callbacks incorrectly. This
    new approach records all the data for each callback, leaving no
    uncertainty about what needs to be delivered.

    Signed-off-by: David Teigland

    David Teigland
     

01 Dec, 2009

1 commit

  • Replace all GFP_KERNEL and ls_allocation with GFP_NOFS.
    ls_allocation would be GFP_KERNEL for userland lockspaces
    and GFP_NOFS for file system lockspaces.

    It was discovered that any lockspaces on the system can
    affect all others by triggering memory reclaim in the
    file system which could in turn call back into the dlm
    to acquire locks, deadlocking dlm threads that were
    shared by all lockspaces, like dlm_recv.

    Signed-off-by: David Teigland

    David Teigland
     

22 Feb, 2008

1 commit

  • The recent patch to validate data lengths in rcom_names messages
    failed to account for fake messages a node directs to itself before
    ever sending it. In this case we need to fill in the message length
    in the header for the validation code to use.

    Signed-off-by: David Teigland

    David Teigland
     

06 Feb, 2008

1 commit


04 Feb, 2008

5 commits


31 Jan, 2008

1 commit

  • A couple small clean-ups. Remove unnecessary wrapper-functions in
    rcom.c, and remove unnecessary casting and an unnecessary ASSERT in
    util.c.

    Signed-off-by: David Teigland

    David Teigland
     

10 Oct, 2007

1 commit

  • Introduce a per-lockspace rwsem that's held in read mode by dlm_recv
    threads while working in the dlm. This allows dlm_recv activity to be
    suspended when the lockspace transitions to, from and between recovery
    cycles.

    The specific bug prompting this change is one where an in-progress
    recovery cycle is aborted by a new recovery cycle. While dlm_recv was
    processing a recovery message, the recovery cycle was aborted and
    dlm_recoverd began cleaning up. dlm_recv decremented recover_locks_count
    on an rsb after dlm_recoverd had reset it to zero. This is fixed by
    suspending dlm_recv (taking write lock on the rwsem) before aborting the
    current recovery.

    The transitions to/from normal and recovery modes are simplified by using
    this new ability to block dlm_recv. The switch from normal to recovery
    mode means dlm_recv goes from processing locking messages, to saving them
    for later, and vice versa. Races are avoided by blocking dlm_recv when
    setting the flag that switches between modes.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

14 Aug, 2007

1 commit


09 Jul, 2007

2 commits

  • Add a new flag, DLM_LSFL_FS, to be used when a file system creates a lockspace.
    This flag causes the dlm to use GFP_NOFS for allocations instead of GFP_KERNEL.
    (This updated version of the patch uses gfp_t for ls_allocation.)

    Signed-Off-By: Patrick Caulfield
    Signed-Off-By: David Teigland
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • Joining the lockspace should wait for the initial round of inter-node
    config checks to complete before returning. This way, if there's a
    configuration mismatch between the joining node and the existing nodes,
    the join can fail and return an error to the application.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

06 Feb, 2007

4 commits

  • Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
    can use macros to add configfs functions to access them (in a later
    patch). No functional changes in this patch, just naming changes.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Some common, non-error messages should use log_debug instead of log_error
    so they can be turned off.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Check if we receive a message from another lockspace member running a
    version of the dlm with an incompatible inter-node message protocol.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • A reply to a recovery message will often be received after the relevant
    recovery sequence has aborted and the next recovery sequence has begun.
    We need to ignore replies to these old messages from the previous
    recovery. There's already a way to do this for synchronous recovery
    requests using the rc_id number, but not for async.

    Each recovery sequence already has a locally unique sequence number
    associated with it. This patch adds a field to the rcom (recovery
    message) structure where this recovery sequence number can be placed,
    rc_seq. When a node sends a reply to a recovery request, it copies the
    rc_seq number it received into rc_seq_reply. When the first node receives
    the reply to its recovery message, it will check whether rc_seq_reply
    matches the current recovery sequence number, ls_recover_seq, and if not
    then it ignores the old reply.

    An old, inadequate approach to filtering out old replies (checking if the
    current stage of recovery has moved back to the start) has been removed
    from two spots.

    The protocol version number is changed to reflect the different rcom
    structures.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

30 Nov, 2006

4 commits

  • This fixes the following gcc warnings generated on
    the architectures where uint64_t != unsigned long long (e.g. ppc64).

    fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 4 has type 'uint64_t'
    fs/dlm/rcom.c:154: warning: format '%llx' expects type 'long long unsigned int', but argument 5 has type 'uint64_t'
    fs/dlm/recoverd.c:48: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
    fs/dlm/recoverd.c:202: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'
    fs/dlm/recoverd.c:210: warning: format '%llx' expects type 'long long unsigned int', but argument 3 has type 'uint64_t'

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Ryusuke Konishi
     
  • We often abort a recovery after sending a status request to a remote node.
    We want to ignore any potential status reply we get from the remote node.
    If we get one of these unwanted replies, we've often moved on to the next
    recovery message and incremented the message sequence counter, so the
    reply will be ignored due to the seq number. In some cases, we've not
    moved on to the next message so the seq number of the reply we want to
    ignore is still correct, causing the reply to be accepted. The next
    recovery message will then mistake this old reply as a new one.

    To fix this, we add the flag RCOM_WAIT to indicate when we can accept a
    new reply. We clear this flag if we abort recovery while waiting for a
    reply. Before the flag is set again (to allow new replies) we know that
    any old replies will be rejected due to their sequence number. We also
    initialize the recovery-message sequence number to a random value when a
    lockspace is first created. This makes it clear when messages are being
    rejected from an old instance of a lockspace that has since been
    recreated.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • When the not_ready routine sends a "fake" status reply with blank status
    flags, it needs to use the correct size for a normal STATUS_REPLY by
    including the size of the would-be config parameters. We also fill in the
    non-existant config parameters with an invalid lvblen value so it's easier
    to notice if these invalid paratmers are ever being used.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Red Hat BZ 213682

    If two nodes leave the lockspace (while unmounting the fs in the case of
    gfs) after one has sent a STATUS message to the other, STATUS/STATUS_REPLY
    messages will then ping-pong between the nodes when neither of them can
    find the lockspace in question any longer. We kill this by not sending
    another STATUS message when we get a STATUS_REPLY for an unknown
    lockspace.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

24 Aug, 2006

1 commit


10 Aug, 2006

1 commit

  • When recoveries are aborted by other recoveries we can get replies to
    status or names requests that we've given up on. This can cause problems
    if we're making another request and receive an old reply. Add a sequence
    number to status/names requests and reject replies that don't match. A
    field already exists for the seq number that's used in other message
    types.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

09 Aug, 2006

1 commit


23 Feb, 2006

1 commit


18 Jan, 2006

1 commit

  • This is the core of the distributed lock manager which is required
    to use GFS2 as a cluster filesystem. It is also used by CLVM and
    can be used as a standalone lock manager independantly of either
    of these two projects.

    It implements VAX-style locking modes.

    Signed-off-by: David Teigland
    Signed-off-by: Steve Whitehouse

    David Teigland