25 Dec, 2009

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
    ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.c
    ocfs2/trivial: Use proper mask for 2 places in heartbeat.c
    Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink.
    Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode?
    ocfs2: Use FIEMAP_EXTENT_SHARED
    fiemap: Add new extent flag FIEMAP_EXTENT_SHARED
    ocfs2: replace u8 by __u8 in ocfs2_fs.h
    ocfs2: explicit declare uninitialized var in user_cluster_connect()
    ocfs2-devel: remove redundant OCFS2_MOUNT_POSIX_ACL check in ocfs2_get_acl_nolock()
    ocfs2: return -EAGAIN instead of EAGAIN in dlm
    ocfs2/cluster: Make fence method configurable - v2
    ocfs2: Set MS_POSIXACL on remount
    ocfs2: Make acl use the default
    ocfs2: Always include ACL support

    Linus Torvalds
     

30 Oct, 2009

1 commit

  • Currently the f_fsid of struct kstatfs returned from ocfs2_statfs() is
    undefined (vfs layer fills in 0 as default). Since in some conditions,
    f_fsid value might be used in a (f_fsid, ino) pair to uniquely identify
    a file, ocfs2 should return a unique defined f_fsid value from
    ocfs2_statfs().

    Because uuid_str is the same on big- and little-endian machines, it's
    endian consistent to use osb->uuid_str to generate the f_fsid value.

    Signed-off-by: Coly Li
    Cc: Sunil Mushran
    Cc: Mark Fasheh
    Signed-off-by: Joel Becker

    Coly Li
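
    A minimal userspace sketch of the idea above, not the ocfs2 code. It
    assumes each half of the UUID string is folded into one 32-bit word with
    a CRC-32; the names sketch_crc32, sketch_fsid and sketch_fsid_from_uuid
    are hypothetical. Because the input is a character string processed byte
    by byte, the result does not depend on host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). Works on bytes only,
 * so the result is identical on big- and little-endian hosts. */
uint32_t sketch_crc32(const char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint8_t)data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical stand-in for the two-word f_fsid value. */
struct sketch_fsid {
    uint32_t val[2];
};

/* Derive both fsid words from the two halves of the UUID string.
 * Returns 0 on success, -1 if the string is missing or too short. */
int sketch_fsid_from_uuid(const char *uuid_str, struct sketch_fsid *out)
{
    size_t len, half;

    if (uuid_str == NULL || out == NULL)
        return -1;
    len = strlen(uuid_str);
    if (len < 2)
        return -1;

    half = len / 2;
    out->val[0] = sketch_crc32(uuid_str, half);
    out->val[1] = sketch_crc32(uuid_str + half, len - half);
    return 0;
}
```

    Two mounts of the same volume pass the same uuid_str and therefore get
    the same pair of words, which is what a (f_fsid, ino) identifier needs.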
     

29 Oct, 2009

4 commits

  • We have to set MS_POSIXACL on remount as well. Otherwise VFS
    would not know we started supporting ACLs after remount and
    thus ACLs would not work.

    Signed-off-by: Jan Kara
    Signed-off-by: Joel Becker

    Jan Kara
     
  • Change acl mount options handling to match the one of XFS and BTRFS and
    hopefully it is also easier to use now. When admin does not specify any
    acl mount option, acls are enabled if and only if the filesystem has
    xattr feature enabled. If admin specifies 'acl' mount option, we fail
    the mount if the filesystem does not have xattr feature and thus acls
    cannot be enabled.

    Signed-off-by: Jan Kara
    Signed-off-by: Joel Becker

    Jan Kara
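
    The rule described above can be sketched as a small decision function.
    This is an illustration only: the enum, the function name, the explicit
    "off" case and the choice of -EINVAL as the failure code are assumptions,
    not taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical encoding of what the admin passed at mount time. */
enum sketch_acl_opt {
    SKETCH_ACL_UNSPECIFIED, /* no acl-related option given */
    SKETCH_ACL_ON,          /* 'acl' given explicitly */
    SKETCH_ACL_OFF          /* assumed explicit opt-out */
};

/* Decide whether ACLs are enabled for this mount.
 * Returns 0 and sets *acl_enabled, or a negative errno if the mount
 * has to fail. */
int sketch_resolve_acl(enum sketch_acl_opt opt, bool has_xattr,
                       bool *acl_enabled)
{
    switch (opt) {
    case SKETCH_ACL_ON:
        /* ACLs are stored as xattrs, so 'acl' without xattr cannot work. */
        if (!has_xattr)
            return -EINVAL;
        *acl_enabled = true;
        return 0;
    case SKETCH_ACL_OFF:
        *acl_enabled = false;
        return 0;
    default:
        /* No option: enabled if and only if the xattr feature is present. */
        *acl_enabled = has_xattr;
        return 0;
    }
}
```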
     
  • To become consistent with filesystems such as XFS or BTRFS, make posix
    ACLs always available. This also reduces possibility of
    misconfiguration on admin's side.

    Signed-off-by: Jan Kara
    Signed-off-by: Joel Becker

    Jan Kara
     
  • In case of non-modular kernels the root filesystem is mounted by trying
    several filesystems. If ocfs2 was tried before the actual filesystem
    type, the mount would fail because ocfs2_sb_probe() returns -EAGAIN
    instead of -EINVAL. ocfs2 will now return -EINVAL properly.

    Signed-off-by: Joel Becker
    Reported-by: Laszlo Attila Toth

    Joel Becker
     

02 Oct, 2009

1 commit


24 Sep, 2009

2 commits

  • * remove asm/atomic.h inclusion from linux/utsname.h --
    not needed after kref conversion
    * remove linux/utsname.h inclusion from files which do not need it

    NOTE: it looks like fs/binfmt_elf.c does not need utsname.h, however
    due to some personality stuff it _is_ needed -- cowardly leave ELF-related
    headers and files alone.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (85 commits)
    ocfs2: Use buffer IO if we are appending a file.
    ocfs2: add spinlock protection when dealing with lockres->purge.
    dlmglue.c: add missed mlog lines
    ocfs2: __ocfs2_abort() should not enable panic for local mounts
    ocfs2: Add ioctl for reflink.
    ocfs2: Enable refcount tree support.
    ocfs2: Implement ocfs2_reflink.
    ocfs2: Add preserve to reflink.
    ocfs2: Create reflinked file in orphan dir.
    ocfs2: Use proper parameter for some inode operation.
    ocfs2: Make transaction extend more efficient.
    ocfs2: Don't merge in 1st refcount ops of reflink.
    ocfs2: Modify removing xattr process for refcount.
    ocfs2: Add reflink support for xattr.
    ocfs2: Create an xattr indexed block if needed.
    ocfs2: Call refcount tree remove process properly.
    ocfs2: Attach xattr clusters to refcount tree.
    ocfs2: Abstract ocfs2 xattr tree extend rec iteration process.
    ocfs2: Abstract the creation of xattr block.
    ocfs2: Remove inode from ocfs2_xattr_bucket_get_name_value.
    ...

    Linus Torvalds
     

23 Sep, 2009

2 commits

  • In a clustered setup, we have to panic the box on journal abort. This is
    because we don't have the facility to go hard readonly. With hard ro, another
    node would detect node failure and initiate recovery.

    Having said that, we shouldn't force panic if the volume is mounted locally.
    This patch defers the handling to the mount option, errors.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Joel Becker

    Sunil Mushran
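
    A hedged sketch of the policy described above. The enum values and the
    function name are hypothetical, and the two settings assumed for the
    errors mount option (remount read-only or panic) are an assumption of
    this example rather than something stated in the commit message.

```c
#include <stdbool.h>

/* Assumed values of the 'errors' mount option. */
enum sketch_errors_opt {
    SKETCH_ERRORS_REMOUNT_RO,
    SKETCH_ERRORS_PANIC
};

/* What to do when the journal aborts. */
enum sketch_abort_action {
    SKETCH_ACTION_PANIC,
    SKETCH_ACTION_READONLY
};

enum sketch_abort_action sketch_on_journal_abort(bool local_mount,
                                                 enum sketch_errors_opt errors)
{
    /* Clustered mount: there is no hard read-only mode, so panic to let
     * another node detect the failure and start recovery. */
    if (!local_mount)
        return SKETCH_ACTION_PANIC;

    /* Local mount: no other node is waiting, so honor the mount option. */
    return errors == SKETCH_ERRORS_PANIC ? SKETCH_ACTION_PANIC
                                         : SKETCH_ACTION_READONLY;
}
```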
     
  • Implement locking around struct ocfs2_refcount_tree. This protects
    all read/write operations on refcount trees. ocfs2_refcount_tree
    has its own lock and its own caching_info, protecting buffers among
    multiple nodes.

    User must call ocfs2_lock_refcount_tree before his operation on
    the tree and unlock it after that.

    ocfs2_refcount_trees are referenced by the block number of the
    refcount tree root block, so we create an rb-tree on the ocfs2_super
    to look them up.

    Signed-off-by: Tao Ma

    Tao Ma
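
    The lookup by root block number can be illustrated with a small
    userspace sketch. A plain unbalanced binary search tree stands in for
    the kernel rb-tree here, locking is omitted, and all names are
    hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical in-memory handle for one refcount tree, keyed by the
 * block number of its root block. */
struct sketch_refcount_tree {
    uint64_t root_blkno;
    struct sketch_refcount_tree *left;
    struct sketch_refcount_tree *right;
};

/* Find the tree whose root block is blkno, or NULL if none is cached. */
struct sketch_refcount_tree *
sketch_tree_find(struct sketch_refcount_tree *root, uint64_t blkno)
{
    while (root != NULL && root->root_blkno != blkno)
        root = blkno < root->root_blkno ? root->left : root->right;
    return root;
}

/* Insert a handle for blkno. Returns the existing handle if one is
 * already present, the new handle otherwise, or NULL on allocation
 * failure. */
struct sketch_refcount_tree *
sketch_tree_insert(struct sketch_refcount_tree **rootp, uint64_t blkno)
{
    struct sketch_refcount_tree **link = rootp;
    struct sketch_refcount_tree *node;

    while (*link != NULL) {
        if ((*link)->root_blkno == blkno)
            return *link;
        link = blkno < (*link)->root_blkno ? &(*link)->left
                                           : &(*link)->right;
    }

    node = calloc(1, sizeof(*node));
    if (node == NULL)
        return NULL;
    node->root_blkno = blkno;
    *link = node;
    return node;
}
```

    In the real filesystem the caller would then take the tree's lock before
    touching it and release it afterwards, as the message above requires.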
     

22 Sep, 2009

1 commit


05 Sep, 2009

5 commits

  • Similar to ip_last_trans, ip_created_trans tracks the creation of a journal
    managed inode. This specifically tracks what transaction created the
    inode. This is so the code can know if the inode has ever been written
    to disk.

    This behavior is desirable for any journal managed object. We move it
    to struct ocfs2_caching_info as ci_created_trans so that any object
    using ocfs2_caching_info can rely on this behavior.

    Signed-off-by: Joel Becker

    Joel Becker
     
  • We have the read side of metadata caching isolated to struct
    ocfs2_caching_info, now we need the write side. This means the journal
    functions. The journal only does a couple of things with struct inode.

    This change moves the ip_last_trans field onto struct
    ocfs2_caching_info as ci_last_trans. This field tells the journal
    whether a pending journal flush is required.

    Signed-off-by: Joel Becker

    Joel Becker
     
  • We are really passing the inode into the ocfs2_read/write_blocks()
    functions to get at the metadata cache. This commit passes the cache
    directly into the metadata block functions, divorcing them from the
    inode.

    Signed-off-by: Joel Becker

    Joel Becker
     
  • We don't really want to cart around too many new fields on the
    ocfs2_caching_info structure. So let's wrap all our access of the
    parent object in a set of operations. One pointer on caching_info, and
    more flexibility to boot.

    Signed-off-by: Joel Becker

    Joel Becker
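
    The "one pointer plus a set of operations" pattern can be sketched in
    userspace as follows. All struct, field and function names here are
    hypothetical and the single operation shown is only an example of the
    kind of access that would be wrapped.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_caching_info;

/* Operations the cache uses to reach its parent object. */
struct sketch_caching_ops {
    uint64_t (*co_owner)(struct sketch_caching_info *ci);
};

/* The cache carries one ops pointer instead of parent-specific fields. */
struct sketch_caching_info {
    const struct sketch_caching_ops *ci_ops;
    unsigned int ci_flags;
};

/* One possible parent: an inode-like object embedding the cache. */
struct sketch_inode_info {
    uint64_t ip_blkno;
    struct sketch_caching_info ip_metadata_cache;
};

#define sketch_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Inode-specific implementation: recover the parent, return its block. */
uint64_t sketch_inode_owner(struct sketch_caching_info *ci)
{
    struct sketch_inode_info *oi =
        sketch_container_of(ci, struct sketch_inode_info, ip_metadata_cache);

    return oi->ip_blkno;
}

const struct sketch_caching_ops sketch_inode_caching_ops = {
    .co_owner = sketch_inode_owner,
};

/* Generic cache code only ever goes through the ops table. */
uint64_t sketch_metadata_cache_owner(struct sketch_caching_info *ci)
{
    return ci->ci_ops->co_owner(ci);
}
```

    A non-inode parent would embed the same sketch_caching_info and supply
    its own ops table, leaving the generic cache code unchanged.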
     
  • We want to use the ocfs2_caching_info structure in places that are not
    inodes. To do that, it can no longer rely on referencing the inode
    directly.

    This patch moves the flags to ocfs2_caching_info->ci_flags, stores
    pointers to the parent's locks on the ocfs2_caching_info, and renames
    the constants and flags to reflect its independent state.

    Signed-off-by: Joel Becker

    Joel Becker
     

18 Aug, 2009

1 commit


24 Jul, 2009

1 commit


22 Jul, 2009

1 commit

  • In commit ea455f8ab68338ba69f5d3362b342c115bea8e13, we moved the dentry lock
    put process into ocfs2_wq. This causes problems during umount because ocfs2_wq
    can drop references to inodes while they are being invalidated by
    invalidate_inodes() causing all sorts of nasty things (invalidate_inodes()
    ending in an infinite loop, "Busy inodes after umount" messages etc.).

    We fix the problem by stopping ocfs2_wq from doing any further releasing of
    inode references on the superblock being unmounted, wait until it finishes
    the current round of releasing and finally cleaning up all the references in
    dentry_lock_list from ocfs2_put_super().

    The issue was tracked down by Tao Ma.

    Signed-off-by: Jan Kara
    Signed-off-by: Joel Becker

    Jan Kara
     

09 Jul, 2009

1 commit

  • If the mount fails for any reason, ocfs2_dismount_volume calls
    ocfs2_orphan_scan_stop. It requires that ocfs2_orphan_scan_init
    be called to setup the mutex and work queues, but that doesn't
    happen if the mount has failed and we oops accessing an uninitialized
    work queue.

    This patch splits the init and startup of the orphan scan, eliminating
    the oops.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Joel Becker

    Jeff Mahoney
     

24 Jun, 2009

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
    ocfs2/trivial: Wrap ocfs2_sysfile_cluster_lock_key within define.
    ocfs2: Add lockdep annotations
    vfs: Set special lockdep map for dirs only if not set by fs
    ocfs2: Disable orphan scanning for local and hard-ro mounts
    ocfs2: Do not initialize lvb in ocfs2_orphan_scan_lock_res_init()
    ocfs2: Stop orphan scan as early as possible during umount
    ocfs2: Fix ocfs2_osb_dump()
    ocfs2: Pin journal head before accessing jh->b_committed_data
    ocfs2: Update atime in splice read if necessary.
    ocfs2: Provide the ocfs2_dlm_lvb_valid() stack API.

    Linus Torvalds
     

23 Jun, 2009

3 commits


19 Jun, 2009

1 commit

  • Follow-up to "block: enable by default support for large devices
    and files on 32-bit archs".

    Rename CONFIG_LBD to CONFIG_LBDAF to:
    - allow update of existing [def]configs for "default y" change
    - reflect that it is used also for large files support nowadays

    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Jens Axboe

    Bartlomiej Zolnierkiewicz
     

17 Jun, 2009

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
    ocfs2/net: Use wait_event() in o2net_send_message_vec()
    ocfs2: Adjust rightmost path in ocfs2_add_branch.
    ocfs2: fdatasync should skip unimportant metadata writeout
    ocfs2: Remove redundant gotos in ocfs2_mount_volume()
    ocfs2: Add statistics for the checksum and ecc operations.
    ocfs2 patch to track delayed orphan scan timer statistics
    ocfs2: timer to queue scan of all orphan slots
    ocfs2: Correct ordering of ip_alloc_sem and localloc locks for directories
    ocfs2: Fix possible deadlock in quota recovery
    ocfs2: Fix possible deadlock with quotas in ocfs2_setattr()
    ocfs2: Fix lock inversion in ocfs2_local_read_info()
    ocfs2: Fix possible deadlock in ocfs2_global_read_dquot()
    ocfs2: update comments in masklog.h
    ocfs2: Don't printk the error when listing too many xattrs.

    Linus Torvalds
     

12 Jun, 2009

3 commits

  • [xfs, btrfs, capifs, shmem don't need BKL, exempt]

    Signed-off-by: Alessio Igor Bogani
    Signed-off-by: Al Viro

    Alessio Igor Bogani
     
  • Move BKL into ->put_super from the only caller. A couple of
    filesystems had trivial enough ->put_super (only kfree and NULLing of
    s_fs_info + stuff in there) to not get any locking: coda, cramfs, efs,
    hugetlbfs, omfs, qnx4, shmem, all others got the full treatment. Most
    of them probably don't need it, but I'd rather sort that out individually.
    Preferably after all the other BKL pushdowns in that area.

    [AV: original used to move lock_super() down as well; these changes are
    removed since we don't do lock_super() at all in generic_shutdown_super()
    now]
    [AV: fuse, btrfs and xfs are known to need no damn BKL, exempt]

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Signed-off-by: Christoph Hellwig
    Acked-by: Joel Becker
    Signed-off-by: Al Viro

    Christoph Hellwig
     

04 Jun, 2009

4 commits

  • Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
     
  • It would be nice to know how often we get checksum failures. Even
    better, how many of them we can fix with the single bit ecc. So, we add
    a statistics structure. The structure can be installed into debugfs
    wherever the user wants.

    For ocfs2, we'll put it in the superblock-specific debugfs directory and
    pass it down from our higher-level functions. The stats are only
    registered with debugfs when the filesystem supports metadata ecc.

    Signed-off-by: Joel Becker

    Joel Becker
     
  • Patch to track delayed orphan scan timer statistics.

    Modifies ocfs2_osb_dump to print the following:
    Orphan Scan=> Local: 10 Global: 21 Last Scan: 67 seconds ago

    Signed-off-by: Srinivas Eeda
    Signed-off-by: Joel Becker

    Srinivas Eeda
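
    As an illustration of the output line shown above, a userspace sketch
    that formats the same three counters. The function name and parameter
    names are hypothetical; the real code prints into the debugfs buffer.

```c
#include <stddef.h>
#include <stdio.h>

/* Format the orphan scan statistics line into buf.
 * Returns the value of snprintf(): the number of characters that the
 * full line needs, excluding the terminating NUL. */
int sketch_format_orphan_scan(char *buf, size_t len,
                              unsigned int local_scans,
                              unsigned int global_scans,
                              unsigned long secs_since_last_scan)
{
    return snprintf(buf, len,
                    "Orphan Scan=> Local: %u Global: %u "
                    "Last Scan: %lu seconds ago",
                    local_scans, global_scans, secs_since_last_scan);
}
```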
     
  • When a dentry is unlinked, the unlinking node takes an EX on the dentry lock
    before moving the dentry to the orphan directory. Other nodes that have
    this dentry in cache have a PR on the same dentry lock. When the EX is
    requested, the other nodes flag the corresponding inode as MAYBE_ORPHANED
    during downconvert. The inode is finally deleted when the last node to iput
    the inode sees that i_nlink==0 and the MAYBE_ORPHANED flag is set.

    A problem arises if a node is forced to free dentry locks because of memory
    pressure. If this happens, the node will no longer get downconvert
    notifications for the dentries that have been unlinked on another node.
    If it also happens that node is actively using the corresponding inode and
    happens to be the one performing the last iput on that inode, it will fail
    to delete the inode as it will not have the MAYBE_ORPHANED flag set.

    This patch fixes this shortcoming by introducing a periodic scan of the
    orphan directories to delete such inodes. Care has been taken to distribute
    the workload across the cluster so that no one node has to perform the task
    all the time.

    Signed-off-by: Srinivas Eeda
    Signed-off-by: Joel Becker

    Srinivas Eeda
     

23 May, 2009

1 commit

  • Until now we have had a 1:1 mapping between storage device physical
    block size and the logical block size used when addressing the device.
    With SATA 4KB drives coming out that will no longer be the case. The
    sector size will be 4KB but the logical block size will remain
    512-bytes. Hence we need to distinguish between the physical block size
    and the logical ditto.

    This patch renames hardsect_size to logical_block_size.

    Signed-off-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     

04 Apr, 2009

3 commits

  • During recovery, a node recovers orphans in its slot and the dead node(s). But
    if the dead nodes were holding orphans in offline slots, they will be left
    unrecovered.

    If the dead node is the last one to die and is holding orphans in other slots
    and is the first one to mount, then it only recovers its own slot, which
    leaves orphans in offline slots.

    This patch queues complete_recovery to clean orphans for all offline slots
    during mount and node recovery.

    Signed-off-by: Srinivas Eeda
    Acked-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Srinivas Eeda
     
  • This patch makes use of Ocfs2's flexible btree code to add an additional
    tree to directory inodes. The new tree stores an array of small,
    fixed-length records in each leaf block. Each record stores a hash value,
    and pointer to a block in the traditional (unindexed) directory tree where a
    dirent with the given name hash resides. Lookup exclusively uses this tree
    to find dirents, thus providing us with constant time name lookups.

    Some of the hashing code was copied from ext3. Unfortunately, it has lots of
    unfixed checkpatch errors. I left that as-is so that tracking changes would
    be easier.

    Signed-off-by: Mark Fasheh
    Acked-by: Joel Becker

    Mark Fasheh
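
    A simplified sketch of the record lookup described above. It models one
    leaf as an array of (hash, block) records sorted by hash and finds the
    first match with a binary search. The names are hypothetical, and the
    hash is FNV-1a chosen only for brevity; it is not the hash copied from
    ext3.

```c
#include <stddef.h>
#include <stdint.h>

/* One fixed-length index record: a name hash and the block in the
 * unindexed directory tree that holds the matching dirent. */
struct sketch_dx_entry {
    uint32_t hash;
    uint64_t dirent_blk;
};

/* Stand-in name hash (32-bit FNV-1a). */
uint32_t sketch_name_hash(const char *name, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= (uint8_t)name[i];
        h *= 16777619u;
    }
    return h;
}

/* Return the index of the first record with the given hash, or -1.
 * The records must be sorted by ascending hash. Several names can share
 * a hash, so the caller still compares names in the dirent block. */
long sketch_dx_find(const struct sketch_dx_entry *entries, size_t count,
                    uint32_t hash)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (entries[mid].hash < hash)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo < count && entries[lo].hash == hash)
        return (long)lo;
    return -1;
}
```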
     
  • This patch creates a per-mount debugfs file, fs_state, which exposes
    information such as the cluster stack in use, the states of the
    downconvert, recovery and commit threads, the number of journal txns,
    some allocation stats, and the list of all slots.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     

27 Feb, 2009

2 commits