15 May, 2009

1 commit

  • Make network connections to other nodes earlier, in the context of
    dlm_recoverd. This avoids connecting to nodes from dlm_send, where we
    try to avoid allocations: they could deadlock if memory reclaim enters
    the cluster fs, which may in turn try to do a dlm operation.

    Signed-off-by: Christine Caulfield
    Signed-off-by: David Teigland

    Christine Caulfield
     

07 Dec, 2006

1 commit

  • This fixes up most of the things pointed out by akpm and Pavel Machek
    with comments below indicating why some things have been left:

    Andrew Morton wrote:
    >
    >> +static struct nodeinfo *nodeid2nodeinfo(int nodeid, gfp_t alloc)
    >> +{
    >> + struct nodeinfo *ni;
    >> + int r;
    >> + int n;
    >> +
    >> + down_read(&nodeinfo_lock);
    >
    > Given that this function can sleep, I wonder if `alloc' is useful.
    >
    > I see lots of callers passing in a literal "0" for `alloc'. That's in fact
    > a secret (GFP_ATOMIC & ~__GFP_HIGH). I doubt if that's what you really
    > meant. Particularly as the code could at least have used __GFP_WAIT (aka
    > GFP_NOIO) which is much, much more reliable than "0". In fact "0" is the
    > least reliable mode possible.
    >
    > IOW, this is all bollixed up.

    When 0 is passed into nodeid2nodeinfo the function does not try to allocate a
    new structure at all. It's an indication that the caller only wants the nodeinfo
    struct for that nodeid if there actually is one in existence.
    I've tidied the function itself so that this is more obvious (and tidier!).
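
    The lookup-only behaviour described in the reply above can be illustrated
    with a small user-space sketch. This is not the kernel code: the array
    node_table stands in for the idr, the bound MAX_NODES and the name
    sketch_nodeid2nodeinfo are hypothetical, and calloc replaces the kernel
    allocator. It only shows the shape of "a zero flag means never allocate".

```c
#include <stdlib.h>

#define MAX_NODES 16   /* hypothetical table size for this sketch */

struct nodeinfo {
    int nodeid;
};

/* Assumed stand-in for the kernel idr: a fixed array indexed by nodeid. */
static struct nodeinfo *node_table[MAX_NODES];

/*
 * Return the nodeinfo for nodeid.
 * If none exists and may_alloc is zero, return NULL without allocating.
 * If none exists and may_alloc is non-zero, create and register one.
 */
static struct nodeinfo *sketch_nodeid2nodeinfo(int nodeid, int may_alloc)
{
    struct nodeinfo *ni;

    if (nodeid <= 0 || nodeid >= MAX_NODES)
        return NULL;

    ni = node_table[nodeid];
    if (ni || !may_alloc)
        return ni;              /* existing entry, or lookup-only miss */

    ni = calloc(1, sizeof(*ni));
    if (!ni)
        return NULL;            /* allocation failed */
    ni->nodeid = nodeid;
    node_table[nodeid] = ni;
    return ni;
}
```

    With this shape, a caller that passes 0 gets NULL for an unknown nodeid,
    and gets the existing struct once some other caller has allocated it.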

    >> +/* Data received from remote end */
    >> +static int receive_from_sock(void)
    >> +{
    >> + int ret = 0;
    >> + struct msghdr msg;
    >> + struct kvec iov[2];
    >> + unsigned len;
    >> + int r;
    >> + struct sctp_sndrcvinfo *sinfo;
    >> + struct cmsghdr *cmsg;
    >> + struct nodeinfo *ni;
    >> +
    >> + /* These two are marginally too big for stack allocation, but this
    >> + * function is (currently) only called by dlm_recvd so static should be
    >> + * OK.
    >> + */
    >> + static struct sockaddr_storage msgname;
    >> + static char incmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))];
    >
    > whoa. This is globally singly-threaded code??

    Yes. It is only ever run in the context of dlm_recvd.
    >>
    >> +static void initiate_association(int nodeid)
    >> +{
    >> + struct sockaddr_storage rem_addr;
    >> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))];
    >
    > Another static buffer to worry about. Globally singly-threaded code?

    Yes. Only ever called by dlm_sendd.

    >> +
    >> +/* Send a message */
    >> +static int send_to_sock(struct nodeinfo *ni)
    >> +{
    >> + int ret = 0;
    >> + struct writequeue_entry *e;
    >> + int len, offset;
    >> + struct msghdr outmsg;
    >> + static char outcmsg[CMSG_SPACE(sizeof(struct sctp_sndrcvinfo))];
    >
    > Singly-threaded?

    Yep.

    >>
    >> +static void dealloc_nodeinfo(void)
    >> +{
    >> + int i;
    >> +
    >> + for (i=1; i<[...]
    >> + struct nodeinfo *ni = nodeid2nodeinfo(i, 0);
    >> + if (ni) {
    >> + idr_remove(&nodeinfo_idr, i);
    >
    > Didn't that need locking?

    No. It's only ever called at DLM shutdown, after all the other threads
    have been stopped.

    >>
    >> +static int write_list_empty(void)
    >> +{
    >> + int status;
    >> +
    >> + spin_lock_bh(&write_nodes_lock);
    >> + status = list_empty(&write_nodes);
    >> + spin_unlock_bh(&write_nodes_lock);
    >> +
    >> + return status;
    >> +}
    >
    > This function's return value is meaningless. As soon as the lock gets
    > dropped, the return value can get out of sync with reality.
    >
    > Looking at the caller, this _might_ happen to be OK, but it's a nasty and
    > dangerous thing. Really the locking should be moved into the caller.

    It's just an optimisation to allow the caller to schedule if there is no work
    to do. If something arrives immediately afterwards then it will get picked up
    when the process re-awakes (and it will be woken by that arrival).

    The 'accepting' atomic has gone completely. As Andrew pointed out, it didn't
    really achieve much anyway. I suspect it was a plaster over some other
    startup or shutdown bug, to be honest.

    Signed-off-by: Patrick Caulfield
    Signed-off-by: Steven Whitehouse
    Cc: Andrew Morton
    Cc: Pavel Machek

    Patrick Caulfield
     

10 Oct, 2006

1 commit


28 Apr, 2006

1 commit


18 Jan, 2006

1 commit

  • This is the core of the distributed lock manager which is required
    to use GFS2 as a cluster filesystem. It is also used by CLVM and
    can be used as a standalone lock manager independently of either
    of these two projects.

    It implements VAX-style locking modes.

    Signed-off-by: David Teigland
    Signed-off-by: Steve Whitehouse

    David Teigland