16 May, 2009

1 commit

  • Change some GFP_KERNEL allocations to use either GFP_NOFS or
    ls_allocation (when available) which the fs sets to GFP_NOFS.
    The point is to prevent allocations from going back into the
    cluster fs in places where that might lead to deadlock.

    Signed-off-by: David Teigland

    David Teigland
     

15 May, 2009

1 commit

  • Make network connections to other nodes earlier, in the context of
    dlm_recoverd. This avoids connecting to nodes from dlm_send, where we
    try to avoid allocations that could deadlock if memory reclaim goes
    into the cluster fs, which may in turn try to do a dlm operation.

    Signed-off-by: Christine Caulfield
    Signed-off-by: David Teigland

    Christine Caulfield
     

12 Mar, 2009

1 commit


29 Jan, 2009

2 commits


24 Dec, 2008

2 commits

  • The pages used in lowcomms are not highmem, so kmap is not necessary.

    Cc: Christine Caulfield
    Signed-off-by: Steven Whitehouse
    Signed-off-by: David Teigland

    Steven Whitehouse
     
  • Use ls_allocation for memory allocations, which a cluster fs sets to
    GFP_NOFS. Use GFP_NOFS for allocations when no lockspace struct is
    available. Taking dlm locks needs to avoid calling back into the
    cluster fs because write-out can require taking dlm locks.

    Cc: Christine Caulfield
    Signed-off-by: Steven Whitehouse
    Signed-off-by: David Teigland

    Steven Whitehouse
     

15 Jul, 2008

1 commit

  • It seems that `sock' allocated by sock_create_kern in
    tcp_connect_to_sock() of fs/dlm/lowcomms.c is not released if
    dlm_nodeid_to_addr returns an error.

    Acked-by: Christine Caulfield
    Signed-off-by: Masatake YAMATO
    Signed-off-by: David Teigland

    Masatake YAMATO
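
    The leak described above is an error-path cleanup problem. A minimal
    userspace sketch of the usual fix follows; it is not the kernel patch.
    The names open_for_node, addr_lookup_fn, lookup_fails and
    lookup_loopback are hypothetical, and a plain socket() fd stands in
    for the kernel socket.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for an address lookup that can fail. */
typedef int (*addr_lookup_fn)(int nodeid, struct sockaddr_in *out);

/* Lookup that always fails, to exercise the error path. */
int lookup_fails(int nodeid, struct sockaddr_in *out)
{
    (void)nodeid;
    (void)out;
    return -ENOENT;
}

/* Lookup that always succeeds with the IPv4 loopback address. */
int lookup_loopback(int nodeid, struct sockaddr_in *out)
{
    (void)nodeid;
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return 0;
}

/*
 * Create a socket, then look up the peer address. Any failure after the
 * socket exists goes through out_close, so the descriptor is not leaked.
 * Returns the fd on success or a negative errno value on failure.
 */
int open_for_node(int nodeid, addr_lookup_fn lookup, struct sockaddr_in *peer)
{
    int fd, err;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -errno;

    err = lookup(nodeid, peer);
    if (err < 0)
        goto out_close;     /* lookup failed after the socket was created */

    return fd;

out_close:
    close(fd);
    return err;
}
```

    Calling open_for_node(1, lookup_fails, &peer) returns the lookup's
    error code and leaves no descriptor open; with lookup_loopback the
    caller owns the returned fd and must close it.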
     

20 May, 2008

2 commits


30 Jan, 2008

2 commits

  • This patch addresses a problem introduced with the last round of
    lowcomms patches where the 'othercon' connections do not get freed when
    the DLM shuts down.

    This results in the error message
    "slab error in kmem_cache_destroy(): cache `dlm_conn': Can't free all
    objects"

    and the DLM cannot be restarted without a system reboot.

    See bz#428119

    Signed-off-by: Patrick Caulfield
    Signed-off-by: Fabio M. Di Nitto
    Signed-off-by: David Teigland

    Patrick Caulfield
     
  • A common problem occurs when multiple IP addresses within the same
    subnet are assigned to the same NIC. If we make a connection attempt to
    another address on the same subnet as one of those addresses, the
    connection attempt will not necessarily be routed from the address we
    want.

    In the case of the DLM, the other nodes will quickly drop the connection
    attempt, causing problems.

    This patch makes the DLM bind to the local address it acquired from the
    cluster manager when using TCP prior to making a connection, obviating
    the need for administrators to "fix" their systems or use clever routing
    tricks.

    Signed-off-by: Lon Hohberger
    Signed-off-by: Patrick Caulfield
    Signed-off-by: David Teigland

    Lon Hohberger
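
    The idea of binding to a chosen local address before connecting can be
    sketched in userspace C as below. This is an illustration under
    assumptions, not the DLM code: connect_from is a hypothetical helper,
    it handles IPv4 only, and the addresses are passed in as strings
    rather than obtained from a cluster manager.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Connect to remote_ip:port, but first bind the socket to local_ip so the
 * connection uses that source address instead of whichever one routing
 * would pick. Returns a connected fd, or -1 on any error.
 */
int connect_from(const char *local_ip, const char *remote_ip,
                 unsigned short port)
{
    struct sockaddr_in src, dst;
    int fd;

    memset(&src, 0, sizeof(src));
    src.sin_family = AF_INET;
    src.sin_port = 0;               /* let the kernel choose the source port */
    if (inet_pton(AF_INET, local_ip, &src.sin_addr) != 1)
        return -1;

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(port);
    if (inet_pton(AF_INET, remote_ip, &dst.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* bind() fixes the source address; connect() then uses it. */
    if (bind(fd, (struct sockaddr *)&src, sizeof(src)) < 0 ||
        connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

    With two addresses on one NIC, passing the address the peers expect as
    local_ip makes the peer see the connection coming from that address.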
     

07 Nov, 2007

1 commit


10 Oct, 2007

2 commits


14 Aug, 2007

3 commits


20 Jul, 2007

1 commit

  • Slab destructors were no longer supported after Christoph's
    c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
    BUGs for both slab and slub, and slob never supported them
    either.

    This rips out support for the dtor pointer from kmem_cache_create()
    completely and fixes up every single callsite in the kernel (there were
    about 224, not including the slab allocator definitions themselves,
    or the documentation references).

    Signed-off-by: Paul Mundt

    Paul Mundt
     

09 Jul, 2007

3 commits


01 May, 2007

4 commits

  • Replace some printk with log_print, and fix some simple cases of lines
    over 80 columns. Also, return -ENOTCONN if lowcomms_start fails due to no local
    IP address being available.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Fix a few range & initialization bugs in lowcomms.
    - max_nodeid is really the highest nodeid encountered, so all loops
      must include it in their iterations.
    - Clean dlm_local_count & connection_idr so we can do a clean restart.
    - Remove a spurious BUG_ON.

    Signed-off-by: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
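
    The range bug in the first item is the classic off-by-one that appears
    when a variable holds the highest valid index rather than a count. A
    small sketch, assuming a hypothetical table of connection pointers
    indexed by nodeid (count_connections is not a DLM function):

```c
#include <stddef.h>

/*
 * Count occupied slots in a table indexed by node ID. max_nodeid is the
 * highest node ID in use, not the number of entries, so the loop condition
 * has to be "<=" to visit that last slot.
 */
int count_connections(void *table[], int max_nodeid)
{
    int nodeid, n = 0;

    for (nodeid = 0; nodeid <= max_nodeid; nodeid++)
        if (table[nodeid] != NULL)
            n++;
    return n;
}
```

    Writing the condition as "nodeid < max_nodeid" would silently skip the
    connection belonging to the highest-numbered node.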
     
  • When you attempt to release a lockspace in DLM, it will hang trying to down a
    semaphore that has already been downed. The attached patch fixes the problem.

    Signed-off-by: Josef Bacik
    Signed-off-by: Steven Whitehouse
    Cc: Patrick Caulfield

    Josef Bacik
     
  • This patch consolidates the TCP & SCTP protocols for the DLM into a single file
    and makes it switchable at run-time (well, at least before the DLM actually
    starts up!)

    For RHEL5 this patch requires Neil Horman's patch that expands the in-kernel
    socket API but that has already been twice ACKed so it should be OK.

    The patch adds a new lowcomms.c file that replaces the existing lowcomms-sctp.c
    & lowcomms-tcp.c files.

    Signed-off-by: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     

30 Nov, 2006

1 commit

  • The following patch adds a TCP based communications layer
    to the DLM which is compile time selectable. The existing SCTP
    layer gives the advantage of allowing multihoming, whereas
    the TCP layer has been heavily tested in previous versions of
    the DLM and is known to be robust and therefore can be used as
    a baseline for performance testing.

    Signed-off-by: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     

20 Oct, 2006

1 commit

  • I didn't spot that the msg_iovlen was set to 2 if there
    were two elements in the iovec but left at zero if not :(

    I think this might be why Bob was still seeing trouble.

    Signed-off-by: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
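
    One way to avoid this kind of bug is to derive msg_iovlen from the
    number of iovec entries actually filled in. The sketch below uses
    sendmsg() in userspace; the commit message does not say whether the
    affected call was a send or a receive, so the direction is an
    assumption, and send_parts is a hypothetical helper.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Send one or two buffers with a single sendmsg() call. The second buffer
 * is optional (pass NULL/0). msg_iovlen is taken from the running count of
 * filled entries, so the one-buffer case gets 1 instead of staying at 0.
 */
ssize_t send_parts(int fd, const void *a, size_t alen,
                   const void *b, size_t blen)
{
    struct iovec iov[2];
    struct msghdr msg;
    int n = 0;

    iov[n].iov_base = (void *)a;
    iov[n].iov_len = alen;
    n++;

    if (b != NULL && blen > 0) {
        iov[n].iov_base = (void *)b;
        iov[n].iov_len = blen;
        n++;
    }

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = iov;
    msg.msg_iovlen = n;     /* always matches the entries filled above */

    return sendmsg(fd, &msg, 0);
}
```

    Because the count and the array are filled in together, there is no
    branch in which the length can be left at its zero-initialised value.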
     

13 Oct, 2006

1 commit


10 Oct, 2006

1 commit


11 Aug, 2006

1 commit

  • Doing the kmap() while holding the spinlock was causing recursive spinlock
    problems. It seems the kmap was scheduling, although there was no warning
    as I'd expect. Patrick, do we need locking around the kmap?

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     

19 Jun, 2006

1 commit


26 May, 2006

1 commit


28 Apr, 2006

1 commit


18 Jan, 2006

1 commit

  • This is the core of the distributed lock manager which is required
    to use GFS2 as a cluster filesystem. It is also used by CLVM and
    can be used as a standalone lock manager independently of either
    of these two projects.

    It implements VAX-style locking modes.

    Signed-off-by: David Teigland
    Signed-off-by: Steve Whitehouse

    David Teigland
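
    For readers unfamiliar with VAX-style locking, the six modes and their
    compatibility rules can be written as a small lookup table. The mode
    names and the table below are the conventional ones and are not taken
    from the commit text; the identifiers (lock_mode, compat,
    modes_compatible) are hypothetical and do not mirror the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * The six conventional modes: null, concurrent read, concurrent write,
 * protected read, protected write, exclusive.
 */
enum lock_mode { NL, CR, CW, PR, PW, EX, MODE_COUNT };

/* compat[granted][requested] is true when both locks may be held at once. */
static const bool compat[MODE_COUNT][MODE_COUNT] = {
    /*          NL CR CW PR PW EX */
    /* NL */ {  1, 1, 1, 1, 1, 1 },
    /* CR */ {  1, 1, 1, 1, 1, 0 },
    /* CW */ {  1, 1, 1, 0, 0, 0 },
    /* PR */ {  1, 1, 0, 1, 0, 0 },
    /* PW */ {  1, 1, 0, 0, 0, 0 },
    /* EX */ {  1, 0, 0, 0, 0, 0 },
};

/* Return true if a new request can be granted alongside an existing lock. */
bool modes_compatible(enum lock_mode granted, enum lock_mode requested)
{
    assert(granted < MODE_COUNT && requested < MODE_COUNT);
    return compat[granted][requested];
}
```

    For example, two PR (shared read) locks on the same resource are
    compatible, while an EX lock is compatible only with NL.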