13 Oct, 2007

1 commit


12 Oct, 2007

1 commit


10 Oct, 2007

1 commit

  • As bi_end_io is only called once when the request is complete,
    the 'size' argument is now redundant. Remove it.

    Now there is no need for bio_endio to subtract the size completed
    from bi_size. So don't do that either.

    While we are at it, change bi_end_io to return void.

    Signed-off-by: Neil Brown
    Signed-off-by: Jens Axboe

    NeilBrown
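
The callback shape described in this commit can be sketched in userspace as follows. This is a minimal illustration, not kernel code: mock_bio, mock_bio_endio, mock_completion and mock_record_completion are hypothetical names invented here. The completion callback returns void, takes no 'size' argument, and the completion path leaves bi_size alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct bio; not the kernel type. */
struct mock_bio;
typedef void (mock_end_io_fn)(struct mock_bio *bio, int error);

struct mock_bio {
    unsigned int bi_size;      /* bytes in the request; untouched on completion */
    mock_end_io_fn *bi_end_io; /* called once, when the whole request is done */
    void *bi_private;          /* caller context */
};

/* Completion path: no 'size' argument and no subtraction from bi_size. */
static void mock_bio_endio(struct mock_bio *bio, int error)
{
    assert(bio != NULL);
    if (bio->bi_end_io)
        bio->bi_end_io(bio, error);
}

/* Example callback context: counts invocations and keeps the error code. */
struct mock_completion {
    int calls;
    int error;
};

/* Example callback: returns void, as the commit message proposes. */
static void mock_record_completion(struct mock_bio *bio, int error)
{
    struct mock_completion *c = bio->bi_private;

    c->calls++;
    c->error = error;
}
```

A caller would point bi_private at a mock_completion, set bi_end_io to mock_record_completion, and call mock_bio_endio() once per request.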
     

10 Aug, 2007

1 commit


17 Jul, 2007

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (32 commits)
    [PATCH] ocfs2: zero_user_page conversion
    ocfs2: Support xfs style space reservation ioctls
    ocfs2: support for removing file regions
    ocfs2: update truncate handling of partial clusters
    ocfs2: btree support for removal of arbitrary extents
    ocfs2: Support creation of unwritten extents
    ocfs2: support writing of unwritten extents
    ocfs2: small cleanup of ocfs2_write_begin_nolock()
    ocfs2: btree changes for unwritten extents
    ocfs2: abstract btree growing calls
    ocfs2: use all extent block suballocators
    ocfs2: plug truncate into cached dealloc routines
    ocfs2: simplify deallocation locking
    ocfs2: harden buffer check during mapping of page blocks
    ocfs2: shared writeable mmap
    ocfs2: factor out write aops into nolock variants
    ocfs2: rework ocfs2_buffered_write_cluster()
    ocfs2: take ip_alloc_sem during entire truncate
    ocfs2: Add "preferred slot" mount option
    [KJ PATCH] Replacing memset(,0,PAGE_SIZE) with clear_page() in fs/ocfs2/dlm/dlmrecovery.c
    ...

    Linus Torvalds
     

12 Jul, 2007

1 commit

  • sysfs is now completely out of driver/module lifetime game. After
    deletion, a sysfs node doesn't access anything outside sysfs proper,
    so there's no reason to hold onto the attribute owners. Note that
    often the wrong modules were accounted for as owners leading to
    accessing removed modules.

    This patch kills now unnecessary attribute->owner. Note that with
    this change, userland holding a sysfs node does not prevent the
    backing module from being unloaded.

    For more info regarding lifetime rule cleanup, please read the
    following message.

    http://article.gmane.org/gmane.linux.kernel/510293

    (tweaked by Greg to not delete the field just yet, to make it easier to
    merge things properly.)

    Signed-off-by: Tejun Heo
    Cc: Cornelia Huck
    Cc: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     

11 Jul, 2007

5 commits


07 Jun, 2007

1 commit

  • Some of the sysfs changes inadvertently broke the simple runtime debug log
    filtering employed in ocfs2. Fix this by properly exporting the masklog
    category filter names.

    Signed-off-by: Tiger Yang
    Signed-off-by: Mark Fasheh

    Tiger Yang
     

11 May, 2007

1 commit

  • Fix gcc warning and Oops that it causes:

    fs/ocfs2/cluster/masklog.c:161: warning: assignment from incompatible pointer type
    [ 2776.204120] OCFS2 Node Manager 1.3.3
    [ 2776.211729] BUG: spinlock bad magic on CPU#0, modprobe/4424
    [ 2776.214269] lock: ffff810021c8fe18, .magic: ffffffff, .owner: /6394416, .owner_cpu: 0
    [ 2776.217864]
    [ 2776.217865] Call Trace:
    [ 2776.219662] [] spin_bug+0x9e/0xe9
    [ 2776.221921] [] _raw_spin_lock+0x23/0xf9
    [ 2776.224417] [] _spin_lock+0x9/0xb
    [ 2776.226676] [] kobject_shadow_add+0x98/0x1ac
    [ 2776.229367] [] kobject_add+0xb/0xd
    [ 2776.231665] [] kset_add+0xd/0xf
    [ 2776.233845] [] kset_register+0x23/0x28
    [ 2776.236309] [] :ocfs2_nodemanager:mlog_sys_init+0x68/0x6d
    [ 2776.239518] [] :ocfs2_nodemanager:o2cb_sys_init+0x32/0x4a
    [ 2776.242726] [] :ocfs2_nodemanager:init_o2nm+0xa6/0xd5
    [ 2776.245772] [] sys_init_module+0x1471/0x15d2
    [ 2776.248465] [] simple_strtoull+0x0/0xdc
    [ 2776.250959] [] system_call+0x7e/0x83

    Signed-off-by: Randy Dunlap
    Acked-by: Mark Fasheh
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

05 May, 2007

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2:
    ocfs2: Force use of GFP_NOFS in ocfs2_write()
    ocfs2: fix sparse warnings in fs/ocfs2/cluster
    ocfs2: fix sparse warnings in fs/ocfs2/dlm
    ocfs2: fix sparse warnings in fs/ocfs2
    [PATCH] Copy i_flags to ocfs2 inode flags on write
    [PATCH] ocfs2: use __set_current_state()
    ocfs2: Wrap access of directory allocations with ip_alloc_sem.
    [PATCH] fs/ocfs2/: make 3 functions static
    ocfs2: Implement compat_ioctl()

    Linus Torvalds
     

03 May, 2007

2 commits


27 Apr, 2007

2 commits

  • Ocfs2 currently does cluster-wide node messaging to check the open state of
    an inode during delete. This patch removes that mechanism in favor of an
    inode cluster lock which is taken at shared read when an inode is first read
    and dropped in clear_inode(). This allows a deleting node to test the
    liveness of an inode by attempting to take an exclusive lock.

    Signed-off-by: Tiger Yang
    Signed-off-by: Mark Fasheh

    Tiger Yang
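
The liveness test described above can be modelled in a single process as below. This is only a sketch under stated assumptions: mock_open_lock and its helpers are hypothetical and stand in for a cluster lock, they are not the ocfs2 DLM API, and only holders on other nodes are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-process model of a cluster lock; not the ocfs2 DLM API. */
struct mock_open_lock {
    int shared_holders;   /* nodes that currently have the inode in memory */
    bool exclusive_held;  /* set once a deleting node owns the lock */
};

/* Taken in shared mode when an inode is first read. */
static bool mock_lock_shared(struct mock_open_lock *l)
{
    assert(l != NULL);
    if (l->exclusive_held)
        return false;
    l->shared_holders++;
    return true;
}

/* Dropped from the clear_inode() equivalent. */
static void mock_unlock_shared(struct mock_open_lock *l)
{
    assert(l != NULL);
    if (l->shared_holders > 0)
        l->shared_holders--;
}

/* Non-blocking exclusive attempt: fails while any holder remains. */
static bool mock_trylock_exclusive(struct mock_open_lock *l)
{
    assert(l != NULL);
    if (l->exclusive_held || l->shared_holders > 0)
        return false;
    l->exclusive_held = true;
    return true;
}

/* A deleting node treats the inode as unused only if the trylock succeeds. */
static bool mock_inode_is_unused(struct mock_open_lock *l)
{
    return mock_trylock_exclusive(l);
}
```

The point of the design is that no messages are exchanged: the lock state itself answers the question "does anyone still have this inode open?".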
     
  • We have noticed panic() hanging leading us to a situation in which
    the node, while otherwise dead, is still disk heartbeating. This
    leads to a hung cluster as the other nodes are waiting for this
    node to stop disk heartbeating. This situation is only resolved
    by power resetting the box.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     

15 Mar, 2007

2 commits


15 Feb, 2007

2 commits

  • The semantic effect of insert_at_head is that it would allow newly registered
    sysctl entries to override existing sysctl entries of the same name, which is
    a pain for caching and which the proc interface never implemented.

    I have done an audit and discovered that none of the current users of
    register_sysctl care, as (except for directories) they do not register
    duplicate sysctl entries.

    So this patch simply removes the support for overriding existing entries in
    the sys_sysctl interface, since no one uses it or cares and it makes future
    enhancements harder.

    Signed-off-by: Eric W. Biederman
    Acked-by: Ralf Baechle
    Acked-by: Martin Schwidefsky
    Cc: Russell King
    Cc: David Howells
    Cc: "Luck, Tony"
    Cc: Ralf Baechle
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Andi Kleen
    Cc: Jens Axboe
    Cc: Corey Minyard
    Cc: Neil Brown
    Cc: "John W. Linville"
    Cc: James Bottomley
    Cc: Jan Kara
    Cc: Trond Myklebust
    Cc: Mark Fasheh
    Cc: David Chinner
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • ocfs2 did not have the binary number it uses under CTL_FS registered in
    sysctl.h. Register it to avoid future conflicts, and change the name of the
    definition to be in line with the rest of the sysctl numbers.

    Signed-off-by: Eric W. Biederman
    Acked-by: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

08 Feb, 2007

6 commits

  • As was already pointed out by Mathieu Avila on Thu, 07 Sep 2006 03:15:25 -0700,
    OCFS2 expects bio_add_page() to add pages to BIOs in an easily
    predictable manner.

    That is not true, especially for devices with their own merge_bvec_fn().

    Therefore OCFS2's heartbeat code is very likely to fail on such devices.

    Move the bio_put() call into the bio's bi_end_io() function. This makes the
    whole idea of trying to predict the behaviour of bio_add_page() unnecessary.
    Removed compute_max_sectors() and o2hb_compute_request_limits().

    Signed-off-by: Philipp Reisner
    Signed-off-by: Mark Fasheh

    Philipp Reisner
     
  • When there is a lot of multithreaded I/O usage, two threads can collide
    while sending out a message to the other nodes. This is due to the lack of
    locking between threads while sending out the messages.

    When a connected TCP send(), sendto(), or sendmsg() arrives in the Linux
    kernel, it eventually comes through tcp_sendmsg(). tcp_sendmsg() protects
    itself by acquiring a lock at invocation by calling lock_sock().
    tcp_sendmsg() then loops over the buffers in the iovec, allocating
    associated sk_buff's and cache pages for use in the actual send. As it does
    so, it pushes the data out to tcp for actual transmission. However, if one
    of those allocations fails (because a large number of large sends is being
    processed, for example), it must wait for memory to become available. It
    does so by jumping to wait_for_sndbuf or wait_for_memory, both of which
    eventually cause a call to sk_stream_wait_memory(). sk_stream_wait_memory()
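    contains a code path that calls sk_wait_event(). Finally, sk_wait_event()
    contains the call to release_sock(), which drops the socket lock while the
    sender sleeps and so lets a second sender interleave its own data.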

    The following patch adds a lock to the socket container in order to
    properly serialize outbound requests.

    From: Zhen Wei
    Acked-by: Jeff Mahoney
    Signed-off-by: Mark Fasheh

    Zhen Wei
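
The serialization described above can be sketched in userspace as follows. The names (mock_sock_container, mock_write, mock_send_message) are hypothetical, a pthread mutex is assumed as the lock type, and an in-memory buffer stands in for the socket. The point is that the lock covers the whole message, so a sender that sleeps between low-level writes cannot be interleaved with another sender.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MOCK_WIRE_MAX 256

/* Hypothetical stand-in for the socket container mentioned in the commit. */
struct mock_sock_container {
    pthread_mutex_t send_lock; /* serializes whole messages, not single writes */
    char wire[MOCK_WIRE_MAX];  /* bytes "on the wire", in arrival order */
    size_t wire_len;
};

static void mock_sc_init(struct mock_sock_container *sc)
{
    assert(sc != NULL);
    pthread_mutex_init(&sc->send_lock, NULL);
    sc->wire_len = 0;
}

/* One low-level write; on a real socket this is where a sender may sleep. */
static int mock_write(struct mock_sock_container *sc, const void *buf, size_t len)
{
    if (len > MOCK_WIRE_MAX - sc->wire_len)
        return -1;
    memcpy(sc->wire + sc->wire_len, buf, len);
    sc->wire_len += len;
    return 0;
}

/* Send header and payload as one unit while holding the container lock. */
static int mock_send_message(struct mock_sock_container *sc,
                             const char *hdr, const char *payload)
{
    int ret;

    assert(sc != NULL && hdr != NULL && payload != NULL);
    pthread_mutex_lock(&sc->send_lock);
    ret = mock_write(sc, hdr, strlen(hdr));
    if (ret == 0)
        ret = mock_write(sc, payload, strlen(payload));
    pthread_mutex_unlock(&sc->send_lock);
    return ret;
}
```

Without send_lock, two threads calling mock_send_message() on the same container could produce header, header, payload, payload on the wire, which is the collision the commit message describes.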
     
  • There is a small window where a joining node may not see the node(s) that
    just died but are still part of the domain. To fix this, we must disallow
    join requests if the joining node has a different node map.

    A new field, node_map, is added to dlm_query_join_request to send the current
    node's node map along with the join request. On the receiving end, the nodes that
    are part of the cluster verify that the new node sees all the nodes that
    are still part of the cluster. They disallow the join if the maps mismatch.

    Signed-off-by: Srinivas Eeda
    Signed-off-by: Mark Fasheh

    Srinivas Eeda
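
A minimal sketch of the check on the receiving side, assuming a plain byte-array bitmap: mock_join_request, MOCK_MAX_NODES and the helpers are hypothetical and do not reflect the real wire format. A join is refused when a node that the receiver still considers part of the cluster is missing from the joiner's map.

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_MAX_NODES 255
#define MOCK_MAP_BYTES ((MOCK_MAX_NODES + 7) / 8)

/* Hypothetical join request carrying the sender's view of the cluster. */
struct mock_join_request {
    unsigned char node_idx;                 /* node asking to join */
    unsigned char node_map[MOCK_MAP_BYTES]; /* one bit per node it can see */
};

static void mock_set_node(unsigned char *map, unsigned int node)
{
    assert(node < MOCK_MAX_NODES);
    map[node / 8] |= (unsigned char)(1u << (node % 8));
}

static bool mock_test_node(const unsigned char *map, unsigned int node)
{
    assert(node < MOCK_MAX_NODES);
    return ((map[node / 8] >> (node % 8)) & 1u) != 0;
}

/* Allow the join only if the joiner sees every node in the local map. */
static bool mock_join_allowed(const unsigned char *local_map,
                              const struct mock_join_request *req)
{
    unsigned int i;

    for (i = 0; i < MOCK_MAX_NODES; i++) {
        if (mock_test_node(local_map, i) && !mock_test_node(req->node_map, i))
            return false;
    }
    return true;
}
```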
     
  • This patch binds the o2net listener to the configured ip address
    instead of INADDR_ANY for security. Fixes oss.oracle.com bugzilla#814.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
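
In userspace terms, the difference amounts to how the listening address is filled in. The sketch below uses the standard sockets API rather than the in-kernel one; mock_fill_listen_addr and the address and port passed to it are hypothetical.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Fill in a listening address for one configured IPv4 address instead of
 * INADDR_ANY, so the listener is reachable on that address only.
 * Returns 0 on success, -1 if 'ip' is not a valid dotted-quad string.
 */
static int mock_fill_listen_addr(struct sockaddr_in *sin,
                                 const char *ip, uint16_t port)
{
    assert(sin != NULL && ip != NULL);
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_port = htons(port);
    /* inet_pton() returns 1 only when the string was parsed successfully. */
    if (inet_pton(AF_INET, ip, &sin->sin_addr) != 1)
        return -1;
    return 0;
}
```

The resulting structure would then be passed to bind(); with INADDR_ANY the listener would instead accept connections on every interface.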
     
  • Currently o2net allows one handler function per message type. This
    patch adds the ability to register a second function that is called after
    the handler's reply has been sent to the other node.

    Handlers are now given the option of returning a context (in the form of a
    void **) which will be passed back into the post message handler function.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Kurt Hackel
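
The handler and post-handler pairing can be sketched as below. All names and signatures here (mock_handler_fn, mock_post_fn, mock_dispatch, mock_trace) are hypothetical and chosen for illustration; storing the status in *sent_status stands in for sending the reply to the other node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signatures: the handler may hand back a context via ret_ctx. */
typedef int (mock_handler_fn)(const void *msg, size_t len, void *data,
                              void **ret_ctx);
typedef void (mock_post_fn)(int status, void *data, void *ctx);

struct mock_msg_handler {
    mock_handler_fn *handler;
    mock_post_fn *post; /* optional; runs after the reply has gone out */
    void *data;         /* per-registration data passed to both functions */
};

/* Run the handler, "send" its status, then run the post handler if any. */
static int mock_dispatch(struct mock_msg_handler *h, const void *msg,
                         size_t len, int *sent_status)
{
    void *ctx = NULL;
    int status;

    assert(h != NULL && h->handler != NULL && sent_status != NULL);
    status = h->handler(msg, len, h->data, &ctx);
    *sent_status = status; /* stands in for replying to the other node */
    if (h->post)
        h->post(status, h->data, ctx);
    return status;
}

/* Example pair: the handler returns a context, the post handler records it. */
struct mock_trace {
    int handler_calls;
    int post_status;
    void *post_ctx;
};

static int mock_example_handler(const void *msg, size_t len, void *data,
                                void **ret_ctx)
{
    struct mock_trace *t = data;

    (void)msg;
    (void)len;
    t->handler_calls++;
    *ret_ctx = &t->handler_calls;
    return 0;
}

static void mock_example_post(int status, void *data, void *ctx)
{
    struct mock_trace *t = data;

    t->post_status = status;
    t->post_ctx = ctx;
}
```

The post handler is a natural place for work that must not delay the reply, such as dropping a reference that the handler took.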
     
  • This was previously broken and migration of some locks had to be temporarily
    disabled. We use a new (and backward-incompatible) set of network messages
    to account for all references to a lock resource held across the cluster.
    Once these are all freed, the master node may then free the lock resource
    memory once its local references are dropped.

    Signed-off-by: Kurt Hackel
    Signed-off-by: Mark Fasheh

    Kurt Hackel
     

29 Dec, 2006

1 commit

  • The patch allows the ocfs2 heartbeat thread to prioritize I/O, which may
    help cut down on spurious fencing. Most of this will be in the tools:
    we can have a pid configfs attribute and let userspace (ocfs2_hb_ctl)
    call the ioprio_set syscall after starting heartbeat. Note that only the
    cfq scheduler supports I/O priorities now.

    Signed-off-by: Zhen Wei
    Signed-off-by: Mark Fasheh

    Zhen Wei
     

14 Dec, 2006

1 commit

  • All kcalloc() calls of the form "kcalloc(1,...)" are converted to the
    equivalent kzalloc() calls, and a few kcalloc() calls with the incorrect
    ordering of the first two arguments are fixed.

    Signed-off-by: Robert P. J. Day
    Cc: Jeff Garzik
    Cc: Alan Cox
    Cc: Dominik Brodowski
    Cc: Adam Belay
    Cc: James Bottomley
    Cc: Greg KH
    Cc: Mark Fasheh
    Cc: Trond Myklebust
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert P. J. Day
     

12 Dec, 2006

1 commit

  • Modify the OCFS2 handshake to ensure essential timeouts are configured
    identically on all nodes.

    Only allow changes when there are no connected peers.

    Improves the logic in o2net_advance_rx(), which broke now that
    sizeof(struct o2net_handshake) is greater than sizeof(struct o2net_msg).

    Included is the field for userspace-heartbeat timeout to avoid the need for
    further protocol changes.

    Uses a global spinlock to ensure the decisions to update configfs entries
    are made on the correct value. The region covered by the spinlock when
    incrementing the counter is much larger as this is the more critical case.

    Small cleanup contributed by Adrian Bunk

    Signed-off-by: Andrew Beekhof
    Signed-off-by: Mark Fasheh

    Andrew Beekhof
     

08 Dec, 2006

2 commits


22 Nov, 2006

1 commit


21 Oct, 2006

1 commit


25 Sep, 2006

2 commits

  • OCFS2 puts inode meta data in the "lock value block" provided by the DLM.
    Typically, i_generation is encoded in the lock name so that a deleted inode
    and a new one in the same block don't share the same lvb.

    Unfortunately, that scheme means that the read in ocfs2_read_locked_inode()
    is potentially thrown away as soon as the meta data lock is taken - we
    cannot encode the lock name without first knowing i_generation, which
    requires a disk read.

    This patch encodes i_generation in the inode meta data lvb, and removes the
    value from the inode meta data lock name. This way, the read can be covered
    by a lock, and at the same time we can distinguish between an up to date and
    a stale LVB.

    This will help cold-cache stat(2) performance in particular.

    Since this patch changes the protocol version, we take the opportunity to do
    a minor re-organization of two of the LVB fields.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
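
One way to picture the stale-LVB test is the sketch below. The structures, the MOCK_LVB_VERSION value and the function names are hypothetical and greatly simplified (no endianness handling, only one cached field). An LVB is used to refresh the in-memory inode only if it carries the expected version and the same i_generation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_LVB_VERSION 1u /* arbitrary value for this sketch */

/* Hypothetical, simplified lock value block for inode meta data. */
struct mock_inode_lvb {
    unsigned int lvb_version;
    unsigned int lvb_igeneration; /* generation of the inode it describes */
    unsigned long long lvb_isize;
};

struct mock_inode {
    unsigned int i_generation;
    unsigned long long i_size;
};

/* The LVB is usable only if it was written for this generation of the inode. */
static bool mock_lvb_is_trustable(const struct mock_inode *inode,
                                  const struct mock_inode_lvb *lvb)
{
    assert(inode != NULL && lvb != NULL);
    return lvb->lvb_version == MOCK_LVB_VERSION &&
           lvb->lvb_igeneration == inode->i_generation;
}

/* Refresh from the LVB when possible; otherwise the caller must read disk. */
static bool mock_refresh_from_lvb(struct mock_inode *inode,
                                  const struct mock_inode_lvb *lvb)
{
    if (!mock_lvb_is_trustable(inode, lvb))
        return false;
    inode->i_size = lvb->lvb_isize;
    return true;
}
```

Because the generation lives in the LVB rather than in the lock name, the lock can be taken before the first disk read, and a mismatch simply means the cached block is ignored.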
     
  • Actually replace the vote calls with the new dentry operations. Make any
    necessary adjustments to get the scheme to work.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     

21 Sep, 2006

1 commit


30 Jun, 2006

2 commits