27 Feb, 2009

3 commits

  • In dlm_assert_master_handler(), if we get an incorrect assert master from a
    node, we reply with EINVAL, asking the asserter to die. The problem is that
    an assert is sent only after so many hoops that it is invariably the node
    that thinks the asserter is wrong that is actually wrong. So instead of
    killing the asserter, this patch kills the assertee.

    This patch papers over a race that is still being addressed.

    Signed-off-by: Sunil Mushran
    Acked-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • The code was using dlm->spinlock instead of dlm->ast_lock to protect the
    ast_list. This patch fixes the issue.

    Signed-off-by: Sunil Mushran
    Acked-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • Mainline commit d4f7e650e55af6b235871126f747da88600e8040 attempts to delay
    the dlm_thread from sending the drop ref message if the lockres is being
    migrated. The problem is that we make the dlm_thread wait for the migration
    to complete. This causes a deadlock as dlm_thread also participates in the
    lockres migration process.

    A better fix for the original oss bugzilla#1012 is in testing.

    Signed-off-by: Sunil Mushran
    Acked-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     

06 Jan, 2009

7 commits

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (138 commits)
    ocfs2: Access the right buffer_head in ocfs2_merge_rec_left.
    ocfs2: use min_t in ocfs2_quota_read()
    ocfs2: remove unneeded lvb casts
    ocfs2: Add xattr support checking in init_security
    ocfs2: alloc xattr bucket in ocfs2_xattr_set_handle
    ocfs2: calculate and reserve credits for xattr value in mknod
    ocfs2/xattr: fix credits calculation during index create
    ocfs2/xattr: Always updating ctime during xattr set.
    ocfs2/xattr: Remove extend_trans call and add its credits from the beginning
    ocfs2/dlm: Fix race during lockres mastery
    ocfs2/dlm: Fix race in adding/removing lockres' to/from the tracking list
    ocfs2/dlm: Hold off sending lockres drop ref message while lockres is migrating
    ocfs2/dlm: Clean up errors in dlm_proxy_ast_handler()
    ocfs2/dlm: Fix a race between migrate request and exit domain
    ocfs2: One more hamming code optimization.
    ocfs2: Another hamming code optimization.
    ocfs2: Don't hand-code xor in ocfs2_hamming_encode().
    ocfs2: Enable metadata checksums.
    ocfs2: Validate superblock with checksum and ecc.
    ocfs2: Checksum and ECC for directory blocks.
    ...

    Linus Torvalds
     
  • ... and don't bother in callers. Don't bother with zeroing i_blocks,
    while we are at it - it's already been zeroed.

    i_mode is not worth the effort; it has no common default value.

    Signed-off-by: Al Viro

    Al Viro
     
  • dlm_get_lock_resource() is supposed to return a lock resource with a proper
    master. If multiple concurrent threads attempt to look up the lockres for the
    same lockid while the lock mastery is underway, one or more threads are likely
    to return a lockres without a proper master.

    This patch makes the threads wait in dlm_get_lock_resource() while the mastery
    is underway, ensuring all threads return the lockres with a proper master.

    This issue is known to be limited to users using the flock() syscall. For all
    other fs operations, the ocfs2 dlmglue layer serializes the dlm op for each
    lockid.

    Users encountering this bug will see flock() return EINVAL and dmesg show the
    following error:
    ERROR: Dlm error "DLM_BADARGS" while calling dlmlock on resource : bad api args

    Reported-by: Coly Li
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • This patch adds a new lock, dlm->tracking_lock, to protect adding/removing
    lockres' to/from the dlm->tracking_list. We were previously using dlm->spinlock
    for the same, but that proved inadequate as we could be freeing a lockres from
    a context that did not hold that lock. As the new lock only protects this list,
    we can explicitly take it when removing the lockres from the tracking list.

    This bug was exposed when testing multiple processes concurrently calling
    flock() on the same file.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • During lockres purge, o2dlm sends a drop reference message to the lockres
    master. This patch delays the message if the lockres is being migrated.

    Fixes oss bugzilla#1012
    http://oss.oracle.com/bugzilla/show_bug.cgi?id=1012

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • This patch cleans up the errors printed in dlm_proxy_ast_handler(). The errors
    now include the node number that sent the (b)ast. It also reduces the number
    of endian swaps of the cookie.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
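
    The idea of swapping the cookie once and reusing the result can be sketched
    in userspace C. This is a minimal illustration, not the o2dlm code: the
    message layout, demo_be64_to_cpu() and demo_report_bad_cookie() are
    hypothetical names, and the error text is invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wire message: the cookie travels in big-endian byte order. */
struct demo_proxy_ast {
    unsigned char cookie_be[8];
    uint8_t node_idx;           /* node that sent the (b)ast */
};

/* Portable stand-in for a big-endian to host conversion of 64 bits. */
static uint64_t demo_be64_to_cpu(const unsigned char b[8])
{
    uint64_t v = 0;
    int i;

    for (i = 0; i < 8; i++)
        v = (v << 8) | b[i];
    return v;
}

/* Convert the cookie once, then reuse the host-order copy everywhere. */
static int demo_report_bad_cookie(const struct demo_proxy_ast *msg,
                                  char *buf, size_t len)
{
    uint64_t cookie;

    assert(msg != NULL && buf != NULL);
    cookie = demo_be64_to_cpu(msg->cookie_be);  /* the only swap */

    /* The message names the sending node, as the patch description says. */
    return snprintf(buf, len, "unknown cookie %llu from node %u",
                    (unsigned long long)cookie, (unsigned)msg->node_idx);
}
```

    Any later comparison or log line would use the local cookie variable
    rather than converting the wire field again.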
     
  • This patch addresses a race between a migrate request message and an exit
    domain message.
    Instead of blocking exit domains for the duration of the migrate, we ignore
    failure to deliver that message. This is because an exiting domain should
    not have any active locks and thus has no role to play in the migration.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     

04 Dec, 2008

1 commit


02 Dec, 2008

2 commits


14 Nov, 2008

1 commit

  • Wrap access to task credentials so that they can be separated more easily from
    the task_struct during the introduction of COW creds.

    Change most current->(|e|s|fs)[ug]id to current_(|e|s|fs)[ug]id().

    Change some task->e?[ug]id to task_e?[ug]id(). In some places it makes more
    sense to use RCU directly rather than a convenient wrapper; these will be
    addressed by later patches.

    Signed-off-by: David Howells
    Reviewed-by: James Morris
    Acked-by: Serge Hallyn
    Acked-by: Mark Fasheh
    Cc: Joel Becker
    Cc: ocfs2-devel@oss.oracle.com
    Signed-off-by: James Morris

    David Howells
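
    The benefit of going through accessors can be sketched in userspace C.
    This is only an illustration under assumed names: demo_cred, demo_task,
    demo_current and the demo_current_*() macros are hypothetical stand-ins,
    not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int demo_uid_t;

/* Credentials kept in their own structure, separate from the task. */
struct demo_cred {
    demo_uid_t uid, euid, fsuid;
};

struct demo_task {
    const struct demo_cred *cred;
};

static const struct demo_cred demo_init_cred = { 1000, 1000, 1000 };
static struct demo_task demo_current_task = { &demo_init_cred };

/* Hypothetical stand-in for the "current" task pointer. */
#define demo_current (&demo_current_task)

/* Callers use the wrappers and never depend on where the ids live. */
#define demo_current_uid()   (demo_current->cred->uid)
#define demo_current_euid()  (demo_current->cred->euid)
#define demo_current_fsuid() (demo_current->cred->fsuid)

/* Example caller: an ownership check written against the wrapper only. */
static int demo_is_owner(demo_uid_t file_uid)
{
    assert(demo_current->cred != NULL);
    return demo_current_fsuid() == file_uid;
}
```

    If the credentials move again, only the macro bodies change;
    demo_is_owner() and every other caller stay as they are.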
     

27 Jul, 2008

1 commit

  • Kmem cache passed to constructor is only needed for constructors that are
    themselves multiplexers. Nobody uses this "feature", nor does anybody use
    the passed kmem cache in a non-trivial way, so pass only a pointer to the
    object.

    Non-trivial places are:
    arch/powerpc/mm/init_64.c
    arch/powerpc/mm/hugetlbpage.c

    This is a flag day, yes.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Pekka Enberg
    Acked-by: Christoph Lameter
    Cc: Jon Tollefson
    Cc: Nick Piggin
    Cc: Matt Mackall
    [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
    [akpm@linux-foundation.org: fix mm/slab.c]
    [akpm@linux-foundation.org: fix ubifs]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
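
    A constructor that takes only the object pointer can be sketched with a
    toy userspace cache. Everything here (demo_cache, demo_cache_alloc(),
    demo_lock) is hypothetical, and unlike a real slab allocator the sketch
    runs the constructor on every allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy cache: the constructor receives the object and nothing else. */
struct demo_cache {
    size_t obj_size;
    void (*ctor)(void *obj);
};

struct demo_lock {
    int refs;
    char name[32];
};

/* Constructor in the single-argument form. */
static void demo_lock_ctor(void *obj)
{
    struct demo_lock *lock = obj;

    memset(lock, 0, sizeof(*lock));
    lock->refs = 1;
}

/* Simplification: construct on each allocation instead of per slab. */
static void *demo_cache_alloc(const struct demo_cache *cache)
{
    void *obj;

    assert(cache != NULL);
    obj = malloc(cache->obj_size);
    if (obj != NULL && cache->ctor != NULL)
        cache->ctor(obj);
    return obj;
}
```

    A constructor shared by several caches would be the one case that needs
    the cache argument, which is the "multiplexer" case the message mentions.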
     

08 Jul, 2008

1 commit


31 May, 2008

1 commit


01 May, 2008

1 commit


30 Apr, 2008

1 commit

  • Add a new BDI capability flag: BDI_CAP_NO_ACCT_WB. If this flag is
    set, then don't update the per-bdi writeback stats from
    test_set_page_writeback() and test_clear_page_writeback().

    Misc cleanups:

    - convert bdi_cap_writeback_dirty() and friends to static inline functions
    - create a flag that includes all three dirty/writeback related flags,
    since almost all users will want to have them together

    Signed-off-by: Miklos Szeredi
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
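
    The flag test described above can be sketched as follows. The DEMO_*
    names and bit values are assumptions made for this illustration and are
    not the kernel's definitions; only the shape (a capability bit checked by
    a static inline helper, plus a combined flag) follows the description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bits; the values are chosen for the example. */
#define DEMO_CAP_NO_ACCT_DIRTY 0x01u
#define DEMO_CAP_NO_WRITEBACK  0x02u
#define DEMO_CAP_NO_ACCT_WB    0x04u

/* One flag that combines all three dirty/writeback related bits. */
#define DEMO_CAP_NO_ACCT_AND_WRITEBACK \
    (DEMO_CAP_NO_ACCT_DIRTY | DEMO_CAP_NO_WRITEBACK | DEMO_CAP_NO_ACCT_WB)

struct demo_bdi {
    unsigned int capabilities;
    unsigned long nr_writeback;   /* per-device writeback counter */
};

/* Static inline helper in place of a function-like macro. */
static inline int demo_cap_account_writeback(const struct demo_bdi *bdi)
{
    return !(bdi->capabilities & DEMO_CAP_NO_ACCT_WB);
}

/* Update the counter only when accounting is not disabled. */
static void demo_set_page_writeback(struct demo_bdi *bdi)
{
    assert(bdi != NULL);
    if (demo_cap_account_writeback(bdi))
        bdi->nr_writeback++;
}
```

    A device that sets DEMO_CAP_NO_ACCT_AND_WRITEBACK leaves its counter
    untouched, while a device with no capability bits is counted as usual.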
     

18 Apr, 2008

13 commits


11 Mar, 2008

8 commits

  • This patch addresses the bug in which the dlm_thread could go to sleep
    while holding the dlm_spinlock.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • Knowing the dlm recovery master helps in debugging recovery
    issues. This patch prints a message on the recovery master node.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • dlm_master_request_handler() forgot to put a lockres when
    dlm_assert_master_worker() failed or was skipped.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • During migration, the recovery master node may be asked to master a lockres
    it may not know about. In that case, it would not only have to create a
    lockres and add it to the hash, but also remember to do the _put_
    corresponding to the kref_init in dlm_init_lockres(), as soon as the migration
    is completed. Yes, we don't wait for the dlm_purge_lockres() to do that
    matching put. Note the ref added for it being in the hash protects the lockres
    from being freed prematurely.

    This patch adds that missing put, as described above, to plug a memleak.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Sunil Mushran
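
    The reference counting described above can be traced with a toy counter.
    This is a userspace sketch under assumed names (demo_lockres,
    demo_finish_migration() and friends are hypothetical); it uses a plain
    int where the kernel uses a kref.

```c
#include <assert.h>

struct demo_lockres {
    int refs;
    int in_hash;
    int freed;
};

static void demo_get(struct demo_lockres *res)
{
    assert(res->refs > 0);
    res->refs++;
}

static void demo_put(struct demo_lockres *res)
{
    assert(res->refs > 0);
    if (--res->refs == 0)
        res->freed = 1;     /* stands in for the release callback */
}

/* Initialisation leaves one reference, like a kref_init would. */
static void demo_init_lockres(struct demo_lockres *res)
{
    res->refs = 1;
    res->in_hash = 0;
    res->freed = 0;
}

/* Being in the hash holds its own reference. */
static void demo_hash_insert(struct demo_lockres *res)
{
    demo_get(res);
    res->in_hash = 1;
}

static void demo_hash_remove(struct demo_lockres *res)
{
    res->in_hash = 0;
    demo_put(res);
}

/* Drop the initial reference once migration is done, if we created it. */
static void demo_finish_migration(struct demo_lockres *res, int created_here)
{
    if (created_here)
        demo_put(res);
}
```

    Without the put in demo_finish_migration() the count would never reach
    zero after the hash removal, which is the leak the patch describes.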
     
  • Normally locks for remote nodes are freed when that node sends an UNLOCK
    message to the master. The master node tags a DLM_UNLOCK_FREE_LOCK action
    to do an extra put on the lock at the end.

    However, there are times when the master node has to free the locks for the
    remote nodes forcibly.

    Two cases when this happens are:
    1. When the master has migrated the lockres plus all locks to another node.
    2. When the master is clearing all the locks of a dead node.

    It was in the above two conditions that the dlm was missing the extra put.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • struct dlm_query_join_packet is made up of four one-byte fields. They
    are effectively in big-endian order already. However, little-endian
    machines swap them before putting the packet on the wire (because
    query_join's response is a status, and that status is treated as a u32
    on the wire). Thus, big-endian and little-endian machines will
    treat this structure differently.

    The solution is to have little-endian machines swap the structure when
    converting from the structure to the u32 representation.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
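
    One endian-independent way to turn four one-byte fields into a u32 is to
    build the value with shifts, so the first field always lands in the most
    significant byte. The sketch below is an illustration only, not the
    patch's implementation; the struct and its field names are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet of four one-byte fields. */
struct demo_join_packet {
    uint8_t code;
    uint8_t dlm_minor;
    uint8_t fs_minor;
    uint8_t reserved;
};

/*
 * Build the u32 with shifts instead of reinterpreting memory. The numeric
 * result is the same on every host, so a later host-to-big-endian swap
 * puts the fields on the wire in struct order.
 */
static uint32_t demo_packet_to_u32(const struct demo_join_packet *p)
{
    assert(p != NULL);
    return ((uint32_t)p->code << 24) |
           ((uint32_t)p->dlm_minor << 16) |
           ((uint32_t)p->fs_minor << 8) |
           (uint32_t)p->reserved;
}

/* Inverse conversion, again independent of host byte order. */
static void demo_u32_to_packet(uint32_t v, struct demo_join_packet *p)
{
    assert(p != NULL);
    p->code = (uint8_t)(v >> 24);
    p->dlm_minor = (uint8_t)(v >> 16);
    p->fs_minor = (uint8_t)(v >> 8);
    p->reserved = (uint8_t)v;
}
```

    A union of the struct and a u32 would give different numeric values on
    big-endian and little-endian hosts, which is the mismatch described in
    the commit message.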
     
  • __dlm_print_one_lock_resource must be called with res->spinlock held.
    In some cases, however, we call it without holding the lock, which makes
    assert_spin_locked fail. So call dlm_print_one_lock_resource instead.

    Signed-off-by: Tao Ma
    Signed-off-by: Mark Fasheh

    Tao Ma
     
  • fs/ocfs2/dlm/dlmdomain.c: In function 'dlm_send_join_cancels':
    fs/ocfs2/dlm/dlmdomain.c:983: warning: format '%u' expects type 'unsigned int', but argument 7 has type 'long unsigned int'

    Signed-off-by: Andrew Morton
    Signed-off-by: Mark Fasheh

    Andrew Morton
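
    A warning of this kind goes away when the conversion specifier matches
    the argument type. The sketch below shows one common way to do that in
    userspace C; demo_format_map_size() and its message are hypothetical and
    do not reproduce the line the patch changed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format an unsigned long with "%lu"; "%u" here would draw the warning. */
static int demo_format_map_size(char *buf, size_t len, unsigned long nbits)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "node map is %lu bits", nbits);
}
```

    The alternative is to keep "%u" and cast the argument to unsigned int,
    which is only safe when the value is known to fit.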