08 May, 2007

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (34 commits)
    [GFS2] Uncomment sprintf_symbol calling code
    [DLM] lowcomms style
    [GFS2] printk warning fixes
    [GFS2] Patch to fix mmap of stuffed files
    [GFS2] use lib/parser for parsing mount options
    [DLM] Lowcomms nodeid range & initialisation fixes
    [DLM] Fix dlm_lowcomms_stop hang
    [DLM] fix mode munging
    [GFS2] lockdump improvements
    [GFS2] Patch to detect corrupt number of dir entries in leaf and/or inode blocks
    [GFS2] bz 236008: Kernel gpf doing cat /debugfs/gfs2/xxx (lock dump)
    [DLM] fs/dlm/ast.c should #include "ast.h"
    [DLM] Consolidate transport protocols
    [DLM] Remove redundant assignment
    [GFS2] Fix bz 234168 (ignoring rgrp flags)
    [DLM] change lkid format
    [DLM] interface for purge (2/2)
    [DLM] add orphan purging code (1/2)
    [DLM] split create_message function
    [GFS2] Set drop_count to 0 (off) by default
    ...

    Linus Torvalds
     

03 May, 2007

1 commit


01 May, 2007

15 commits

  • Replace some printk with log_print, and fix some simple cases of lines
    over 80. Also, return -ENOTCONN if lowcomms_start fails due to no local
    IP address being available.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Fix a few range & initialization bugs in lowcomms.
    - max_nodeid is really the highest nodeid encountered, so all loops must include
    it in their iterations.
    - clean dlm_local_count & connection_idr so we can do a clean restart.
    - Remove a spurious BUG_ON

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
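
    The inclusive-bound point in the first item can be sketched as follows. This is an illustration only: `count_nodes`, `present[]` and `MAX_NODES` are hypothetical names, and the assumption that nodeids start at 1 is ours, not taken from the patch.

```c
#include <assert.h>

#define MAX_NODES 8   /* hypothetical table size for the sketch */

/* Count configured nodes. present[] is indexed by nodeid, and
 * max_nodeid is the highest id seen so far (not a count), so the
 * loop bound has to be inclusive. Requires max_nodeid < MAX_NODES. */
int count_nodes(const int present[MAX_NODES], int max_nodeid)
{
	int n = 0;

	for (int id = 1; id <= max_nodeid; id++)   /* <=, not < */
		if (present[id])
			n++;
	return n;
}
```

    With `<` instead of `<=`, the node holding the highest id would silently be skipped.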
     
  • When you attempt to release a lockspace in DLM, it will hang trying to down a
    semaphore that has already been downed. The attached patch fixes the problem.

    Signed-off-by: Josef Bacik
    Signed-off-by: Steven Whitehouse
    Cc: Patrick Caulfield

    Josef Bacik
     
  • There are flags to enable two specialized features in the dlm:
    1. CONVDEADLK causes the dlm to resolve conversion deadlocks internally by
    changing the granted mode of locks to NL.
    2. ALTPR/ALTCW cause the dlm to change the requested mode of locks to PR
    or CW to grant them if the normal requested mode can't be granted.

    GFS direct i/o exercises both of these features, especially when mixed
    with buffered i/o. The dlm has problems with them.

    The first problem is on the master node. If it demotes a lock as a part of
    converting it, the actual step of converting the lock isn't being done
    after the demotion; the lock is just left sitting on the granted queue
    with a granted mode of NL. I think the mistaken assumption was that the
    call to grant_pending_locks() would grant it, but that function naturally
    doesn't look at locks on the granted queue.

    The second problem is on the process node. If the master either demotes
    or gives an altmode, the munging of the gr/rq modes is never done in the
    process copy of the lock, leaving the master/process copies out of sync.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Every file should include the headers containing the prototypes for
    its global functions.

    Signed-off-by: Adrian Bunk
    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    Adrian Bunk
     
  • This patch consolidates the TCP & SCTP protocols for the DLM into a single file
    and makes it switchable at run-time (well, at least before the DLM actually
    starts up!)

    For RHEL5 this patch requires Neil Horman's patch that expands the in-kernel
    socket API but that has already been twice ACKed so it should be OK.

    The patch adds a new lowcomms.c file that replaces the existing lowcomms-sctp.c
    & lowcomms-tcp.c files.

    Signed-off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • This patch removes a redundant (and incorrect) assignment from compat_output.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • A lock id is a uint32 and is used as an opaque reference to the lock. For
    userland apps, the lkid is passed up, through libdlm, as the return value
    from a write() on the dlm device. This created a problem when the high
    bit was 1, making the lkid look like an error. This is fixed by changing
    how the lkid is composed. The low 16 bits identified the hash bucket for
    the lock and the high 16 bits were a per-bucket counter (which eventually
    hit 0x8000 causing the problem). These are simply swapped around; the
    number of hash table buckets is far below 0x8000, making all lkids
    positive when viewed as signed.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
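
    A minimal sketch of the sign problem described above, assuming the 16/16 split stated in the message. The helper names (`lkid_old`, `lkid_new`, `looks_like_error`) are hypothetical and are not taken from fs/dlm.

```c
#include <assert.h>
#include <stdint.h>

/* Old layout, per the commit message: per-bucket counter in the
 * high 16 bits, hash bucket in the low 16 bits. */
uint32_t lkid_old(uint16_t bucket, uint16_t counter)
{
	return ((uint32_t)counter << 16) | bucket;
}

/* New layout: the two halves are swapped. */
uint32_t lkid_new(uint16_t bucket, uint16_t counter)
{
	return ((uint32_t)bucket << 16) | counter;
}

/* The return value of write() is signed, so an id with the top bit
 * set cannot be told apart from a negative error code. */
int looks_like_error(uint32_t lkid)
{
	return (lkid & 0x80000000u) != 0;
}
```

    Under the old layout, bucket 3 with counter 0x8000 gives 0x80000003, which has the top bit set. Under the new layout the same inputs give 0x00038000, which stays positive as long as the bucket number is below 0x8000.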
     
  • Add code to accept purge commands from userland.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Add code for purging orphan locks. A process can also purge all of its
    own non-orphan locks by passing a pid of zero. Code already exists for
    processes to create persistent locks that become orphans when the process
    exits, but the complementary capability for another process to then purge
    these orphans has been missing.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • This splits the current create_message() function into two parts so that
    later patches can call the new lower-level _create_message() function when
    they don't have an rsb struct. No functional change in this patch.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Full cancel and force-unlock support. In the past, cancel and force-unlock
    wouldn't work if there was another operation in progress on the lock. Now,
    both cancel and unlock-force can overlap an operation on a lock, meaning there
    may be 2 or 3 operations in progress on a lock in parallel. This support is
    important not only because cancel and force-unlock are explicit operations
    that an app can use, but both are used implicitly when a process exits while
    holding locks.

    Summary of changes:

    - add-to and remove-from waiters functions were rewritten to handle situations
    with more than one remote operation outstanding on a lock

    - validate_unlock_args detects when an overlapping cancel/unlock-force
    can be sent and when it needs to be delayed until a request/lookup
    reply is received

    - processing request/lookup replies detects when cancel/unlock-force
    occurred during the op, and carries out the delayed cancel/unlock-force

    - manipulation of the "waiters" (remote operation) state of a lock moved under
    the standard rsb mutex that protects all the other lock state

    - the two recovery routines related to locks on the waiters list changed
    according to the way lkb's are now locked before accessing waiters state

    - waiters recovery detects when lkb's being recovered have overlapping
    cancel/unlock-force, and may not recover such locks

    - revert_lock (cancel) returns a value to distinguish cases where it did
    nothing vs cases where it actually did a cancel; the cancel completion ast
    should only be done when cancel did something

    - orphaned locks put on new list so they can be found later for purging

    - cancel must be called on a lock when making it an orphan

    - flag user locks (ENDOFLIFE) at the end of their useful life (to the
    application) so we can return an error for any further cancel/unlock-force

    - we weren't setting COMP/BAST ast flags if one was already set, so we'd lose
    either a completion or blocking ast

    - clear an unread bast on a lock that's become unlocked

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
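
    The lost-ast item in the summary can be illustrated with a small sketch. The "lossy" form below is only a guess at the kind of pattern the summary describes, and the flag names and values are hypothetical, not the kernel's.

```c
#include <assert.h>

#define AST_COMP_SKETCH 0x1   /* hypothetical: completion ast pending */
#define AST_BAST_SKETCH 0x2   /* hypothetical: blocking ast pending */

/* Lossy pattern: a type is recorded only when nothing is pending,
 * so a second ast type arriving later is dropped. */
int queue_ast_lossy(int pending, int type)
{
	if (!pending)
		pending = type;
	return pending;
}

/* Accumulating pattern: OR the new type in, so both a completion
 * and a blocking ast can be pending at the same time. */
int queue_ast(int pending, int type)
{
	return pending | type;
}
```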
     
  • Replacement patch to remove redundant code rather than moving it around.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • Currently if the lockspace removal fails the misc device associated with a
    lockspace is left deleted. After that there is no way to access the orphaned
    lockspace from userland.

    This patch recreates the misc device if the dlm_release_lockspace fails. I
    believe this is better than attempting to remove the lockspace first because
    that leaves an unattached device lying around. The potential gap in which there
    is no access to the lockspace between removing the misc device and recreating it
    is acceptable ... after all the application is trying to remove it, and only new
    users of the lockspace will be affected.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • The length of the second element of the kvec array was not initialised before
    being added to the first one. This could cause invalid lengths to be passed to
    kernel_recvmsg.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
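
    A sketch of why both lengths need a defined value, assuming the receive buffer is circular and its free space is described by up to two segments. `struct kvec_sketch` and `free_space_vec` are hypothetical stand-ins, not the lowcomms code.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct kvec (base pointer plus length). */
struct kvec_sketch {
	void *iov_base;
	size_t iov_len;
};

/* Describe the free space of a circular buffer as one or two segments.
 * start is the offset of the oldest unread byte, used the unread count.
 * Returns the number of segments that carry free space (at least 1). */
int free_space_vec(char *base, size_t size, size_t start, size_t used,
		   struct kvec_sketch iov[2])
{
	size_t end = (start + used) % size;   /* first free byte */

	iov[0].iov_base = base + end;
	iov[0].iov_len = 0;
	iov[1].iov_base = base;
	iov[1].iov_len = 0;   /* always defined, even when unused */

	if (used == size)
		return 1;                     /* buffer full, nothing free */
	if (end >= start) {
		/* free space runs to the end of the buffer, then wraps */
		iov[0].iov_len = size - end;
		iov[1].iov_len = start;
		return start ? 2 : 1;
	}
	iov[0].iov_len = start - end;         /* single contiguous gap */
	return 1;
}
```

    Because `iov[1].iov_len` is set on every path, a caller that sums both lengths gets the true amount of free space even when only one segment is in use.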
     

08 Mar, 2007

1 commit


13 Feb, 2007

1 commit

  • Many struct file_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
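
    The same idea can be shown in plain C with a table of function pointers. This is a userspace sketch: `struct dev_ops` and the `demo_*` names are hypothetical and only stand in for `struct file_operations`.

```c
#include <assert.h>

/* Hypothetical operations table, standing in for file_operations. */
struct dev_ops {
	int (*open)(int minor);
	int (*release)(int minor);
};

int demo_open(int minor) { return minor; }
int demo_release(int minor) { (void)minor; return 0; }

/* const allows the toolchain to place the table in read-only data.
 * A later assignment such as demo_ops.open = 0 is rejected at
 * compile time. */
const struct dev_ops demo_ops = {
	.open = demo_open,
	.release = demo_release,
};

/* Callers only ever need a pointer to const. */
int call_open(const struct dev_ops *ops, int minor)
{
	return ops->open(minor);
}
```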
     

12 Feb, 2007

1 commit

  • Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the
    corresponding "kmem_cache_zalloc()" call.

    Signed-off-by: Robert P. J. Day
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Cc: Roland McGrath
    Cc: James Bottomley
    Cc: Greg KH
    Acked-by: Joel Becker
    Cc: Steven Whitehouse
    Cc: Jan Kara
    Cc: Michael Halcrow
    Cc: "David S. Miller"
    Cc: Stephen Smalley
    Cc: James Morris
    Cc: Chris Wright
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
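
    A userspace analogy of the replacement, using `malloc()` + `memset()` versus `calloc()` because the slab allocator is not available outside the kernel. `struct item` and both helpers are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct item {
	int id;
	char name[16];
};

/* Two-step form: allocate, then clear by hand. */
struct item *item_alloc_memset(void)
{
	struct item *it = malloc(sizeof(*it));

	if (it)
		memset(it, 0, sizeof(*it));
	return it;
}

/* One-step form: the allocator hands back zeroed memory, which is
 * the role kmem_cache_zalloc() plays for slab caches. */
struct item *item_alloc_zeroed(void)
{
	return calloc(1, sizeof(struct item));
}
```

    Both helpers return identical zero-filled objects; the one-step form simply removes the chance of forgetting the clear or clearing the wrong size.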
     

10 Feb, 2007

1 commit


06 Feb, 2007

19 commits

  • This patch stops the dlm_recv workqueue from busy-waiting when a node
    disconnects. This can cause soft lockup errors on debug systems and bad
    performance generally.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • A new lvb for a userland lock wasn't being initialized to zero.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Indent help text as expected.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Steven Whitehouse

    Randy Dunlap
     
  • On Sun, Jan 28, 2007 at 11:08:18AM +0100, Jiri Slaby wrote:
    > Andrew Morton napsal(a):
    > >Temporarily at
    > >
    > > http://userweb.kernel.org/~akpm/2.6.20-rc6-mm1/
    >
    > Unable to select IPV6. Menuconfig doesn't offer it when INET is selected.
    > When it's not it appears in the menu, but after state change it gets away.
    > The same behaviour in xconfig, gconfig.
    >
    > $ mkdir ../a/tst
    > $ make O=../a/tst menuconfig
    > HOSTCC scripts/basic/fixdep
    > [...]
    > HOSTLD scripts/kconfig/mconf
    > scripts/kconfig/mconf arch/i386/Kconfig
    > Warning! Found recursive dependency: INET GFS2_FS_LOCKING_DLM SYSFS
    > OCFS2_FS INET
    >
    > Maybe this is the problem?

    Yes, patch below.

    > regards,

    cu
    Adrian

    This patch fixes a circular dependency by letting GFS2_FS_LOCKING_DLM
    and DLM depend on instead of select SYSFS.

    Since SYSFS depends on EMBEDDED this change shouldn't cause any problems
    for users.

    Signed-off-by: Adrian Bunk
    Acked-by: Randy Dunlap
    Signed-off-by: Steven Whitehouse

    Adrian Bunk
     
  • With CONFIG_DLM=m, CONFIG_PROC_FS=n, and CONFIG_SYSFS=n, kernel build
    fails with:

    WARNING: "kernel_subsys" [fs/gfs2/locking/dlm/lock_dlm.ko] undefined!
    WARNING: "kernel_subsys" [fs/dlm/dlm.ko] undefined!
    WARNING: "kernel_subsys" [fs/configfs/configfs.ko] undefined!
    make[1]: *** [__modpost] Error 1
    make: *** [modules] Error 2

    Since fs/dlm/lockspace.c and fs/gfs2/locking/dlm/sysfs.c use
    kernel_subsys, they should either DEPEND on it or SELECT it.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Steven Whitehouse

    Randy Dunlap
     
  • A long, complicated sequence of events, beginning with the RESEND flag not
    being cleared on an lkb, can result in an unlock never completing.

    - lkb on waiters list for remote lookup
    - the remote node is both the dir node and the master node, so
    it optimizes the lookup into a request and sends a request
    reply back
    - the request reply is saved on the requestqueue to be processed
    after recovery
    - recovery runs dlm_recover_waiters_pre() which sets RESEND flag
    so the lookup will be resent after recovery
    - end of recovery: process_requestqueue takes saved request reply
    which removes the lkb off the waiters list, _without_ clearing
    the RESEND flag
    - end of recovery: dlm_recover_waiters_post() doesn't do anything
    with the now completed lookup lkb (would usually clear RESEND)
    - later, the node unmounts, unlocks this lkb that still has RESEND
    flag set
    - the lkb is on the waiters list again, now for unlock, when recovery
    occurs, dlm_recover_waiters_pre() shows the lkb for unlock with RESEND
    set, doesn't do anything since the master still exists
    - end of recovery: dlm_recover_waiters_post() takes this lkb off
    the waiters list because it has the RESEND flag set, then reports
    an error because unlocks are never supposed to be handled in
    recover_waiters_post().
    - later, the unlock reply is received, doesn't find the lkb on
    the waiters list because recover_waiters_post() has wrongly
    removed it.
    - the unlock operation has been lost, and we're left with a
    stray granted lock
    - unmount spins waiting for the unlock to complete

    The visible evidence of this problem will be a node where gfs umount is
    spinning, the dlm waiters list will be empty, and the dlm locks list will
    show a granted lock.

    The fix is simply to clear the RESEND flag when taking an lkb off the
    waiters list.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
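
    The fix described above amounts to tying the flag's lifetime to list membership. A minimal sketch, with a hypothetical `struct lkb_sketch` and an invented flag value rather than the real lkb definition:

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_RESEND_SKETCH 0x1u   /* hypothetical bit value */

struct lkb_sketch {
	uint32_t flags;
	int on_waiters;   /* 1 while an operation awaits a remote reply */
};

void add_to_waiters(struct lkb_sketch *lkb)
{
	lkb->on_waiters = 1;
}

/* Leaving the waiters list also drops RESEND, so a stale flag
 * cannot leak into the next operation on the same lkb. */
void remove_from_waiters(struct lkb_sketch *lkb)
{
	lkb->on_waiters = 0;
	lkb->flags &= ~FLAG_RESEND_SKETCH;
}
```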
     
  • dlm_receive_message() returns 0 instead of returning 'error'. What would
    happen is that process_requestqueue would take a saved message off the
    requestqueue and call receive_message on it. receive_message would then
    see that recovery had been aborted, set error to EINTR, and 'goto out',
    expecting that the error would be returned. Instead, 0 was always
    returned, so process_requestqueue would think that the message had been
    processed and delete it instead of saving it to process next time. This
    means the message (usually an unlock in my tests) would be lost.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
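
    The failure mode above can be sketched in a few lines of userspace C. All names here (`saved_msg`, `handle_msg`, `drain_queue`) are hypothetical; the sketch only shows why the handler's error has to reach the loop that decides whether to delete a saved message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct saved_msg {
	int type;
	struct saved_msg *next;
};

/* Hypothetical handler: reports -EINTR when recovery was aborted.
 * Returning 0 here unconditionally is the bug being described. */
int handle_msg(const struct saved_msg *m, int recovery_aborted)
{
	(void)m;
	if (recovery_aborted)
		return -EINTR;
	return 0;
}

/* Drain the queue, stopping at the first error so that the failing
 * message stays queued. Returns the number of messages consumed. */
int drain_queue(struct saved_msg **head, int recovery_aborted)
{
	int consumed = 0;

	while (*head) {
		if (handle_msg(*head, recovery_aborted) < 0)
			break;                /* keep it for the next pass */
		*head = (*head)->next;        /* unlink only after success */
		consumed++;
	}
	return consumed;
}
```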
     
  • Now that there can be multiple dlm_recv threads running we need to prevent two
    recvs running for the same connection - it's unlikely but it can happen and it
    causes message corruption.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • This patch fixes a bug whereby data on a newly accepted connection would be
    ignored if it arrived soon after the accept.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • This patch removes some redundant fields from the connection structure and adds
    some lockdep annotation to remove spurious warnings.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • If master recovery happens on an rsb in one recovery sequence, then that
    sequence is aborted before lock recovery happens, then in the next
    sequence, we rely on the previous master recovery (which may now be
    invalid due to another node ignoring a lookup result) and go on to do the
    lock recovery, where we get stuck due to an invalid master value.

    recovery cycle begins: master of rsb X has left
    nodes A and B send node C an rcom lookup for X to find the new master
    C gets lookup from B first, sets B as new master, and sends reply back to B
    C gets lookup from A next, and sends reply back to A saying B is master
    A gets lookup reply from C and sets B as the new master in the rsb
    recovery cycle on A, B and C is aborted to start a new recovery
    B gets lookup reply from C and ignores it since there's a new recovery
    recovery cycle begins: some other node has joined
    B doesn't think it's the master of X so it doesn't rebuild it in the directory
    C looks up the master of X, no one is master, so it becomes new master
    B looks up the master of X, finds it's C
    A believes that B is the master of X, so it sends its lock to B
    B sends an error back to A
    A resends
    this repeats forever, the incorrect master value on A is never corrected

    The fix is to do master recovery on an rsb that still has the NEW_MASTER
    flag set from an earlier recovery sequence, and therefore didn't complete
    lock recovery.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • When a user process exits, we clear all the locks it holds. There is a
    problem, though, with locks that the process had begun unlocking before it
    exited. We couldn't find the lkb's that were in the process of being
    unlocked remotely, to flag that they are DEAD. To solve this, we move
    lkb's being unlocked onto a new list in the per-process structure that
    tracks what locks the process is holding. We can then go through this
    list to flag the necessary lkb's when clearing locks for a process when it
    exits.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • This patch converts the DLM TCP lowcomms to use workqueues rather than using its
    own daemon functions. This simultaneously removes a lot of code and makes it
    more scalable on multi-processor machines.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • Make the dlm_config_info values readable and writeable via configfs
    entries.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Add a new dlm_config_info field to enable log_debug output and change
    log_debug() to use it.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Add a "ci_" prefix to the fields in the dlm_config_info struct so that we
    can use macros to add configfs functions to access them (in a later
    patch). No functional changes in this patch, just naming changes.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • Some common, non-error messages should use log_debug instead of log_error
    so they can be turned off.

    Signed-off-by: David Teigland
    Signed-off-by: Steven Whitehouse

    David Teigland
     
  • I just noticed this message when testing some other changes I'd made to
    lowcomms (to use workqueues) but the problem seems to be in the current
    git trees too. I'm amazed no-one has seen it.

    BUG: spinlock already unlocked on CPU#1, dlm_recoverd/16868

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield
     
  • I was a little over-enthusiastic turning schedule() calls into cond_resched() when fixing the DLM for Andrew Morton.

    These four should really be calls to schedule() or the dlm can busy-wait.

    Signed-Off-By: Patrick Caulfield
    Signed-off-by: Steven Whitehouse

    Patrick Caulfield