02 Nov, 2011

3 commits


26 Jul, 2011

1 commit

  • Replace the ->check_acl method with a ->get_acl method that simply reads an
    ACL from disk after having a cache miss. This means we can replace the ACL
    checking boilerplate code with a single implementation in namei.c.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
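
The split described in this commit can be illustrated with a small userspace sketch: the filesystem only supplies a "read the ACL from disk" callback, and one generic routine handles the cache and the check. All names here (`toy_inode`, `toy_read_acl`, `toy_check_acl`) are hypothetical stand-ins, not the kernel's types or the code in namei.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
struct toy_acl {
    unsigned int mode;
};

struct toy_inode {
    struct toy_acl *cached_acl;                          /* NULL: not cached yet */
    struct toy_acl *(*get_acl)(struct toy_inode *inode); /* filesystem callback */
    int disk_reads;                                      /* counts "disk" reads */
    struct toy_acl on_disk;                              /* pretend on-disk ACL */
};

/* Filesystem side: only knows how to fetch the ACL from "disk". */
struct toy_acl *toy_read_acl(struct toy_inode *inode)
{
    inode->disk_reads++;
    return &inode->on_disk;
}

/*
 * Generic side: a single shared implementation that consults the cache,
 * calls ->get_acl only on a miss, and then performs the check.
 * Returns 0 if allowed, -1 if no ACL is available, -2 if denied.
 */
int toy_check_acl(struct toy_inode *inode, unsigned int want)
{
    struct toy_acl *acl;

    assert(inode != NULL);
    acl = inode->cached_acl;
    if (!acl && inode->get_acl) {
        acl = inode->get_acl(inode);   /* cache miss: read from disk */
        inode->cached_acl = acl;
    }
    if (!acl)
        return -1;
    return (acl->mode & want) == want ? 0 : -2;
}
```

Calling `toy_check_acl()` twice on the same inode reads from "disk" only once; the second call is served from `cached_acl`.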
     

20 Jul, 2011

1 commit


31 Mar, 2011

1 commit


29 Mar, 2011

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (39 commits)
    Treat writes as new when holes span across page boundaries
    fs,ocfs2: Move o2net_get_func_run_time under CONFIG_OCFS2_FS_STATS.
    ocfs2/dlm: Move kmalloc() outside the spinlock
    ocfs2: Make the left masklogs compat.
    ocfs2: Remove masklog ML_AIO.
    ocfs2: Remove masklog ML_UPTODATE.
    ocfs2: Remove masklog ML_BH_IO.
    ocfs2: Remove masklog ML_JOURNAL.
    ocfs2: Remove masklog ML_EXPORT.
    ocfs2: Remove masklog ML_DCACHE.
    ocfs2: Remove masklog ML_NAMEI.
    ocfs2: Remove mlog(0) from fs/ocfs2/dir.c
    ocfs2: remove NAMEI from symlink.c
    ocfs2: Remove masklog ML_QUOTA.
    ocfs2: Remove mlog(0) from quota_local.c.
    ocfs2: Remove masklog ML_RESERVATIONS.
    ocfs2: Remove masklog ML_XATTR.
    ocfs2: Remove masklog ML_SUPER.
    ocfs2: Remove mlog(0) from fs/ocfs2/heartbeat.c
    ocfs2: Remove mlog(0) from fs/ocfs2/slot_map.c
    ...

    Fix up trivial conflict in fs/ocfs2/super.c

    Linus Torvalds
     

08 Mar, 2011

1 commit


07 Mar, 2011

1 commit

  • mlog_exit is used to record the exit status of a function.
    But because it is added in so many functions, if we enable it,
    the system logs get filled up quickly and cause too much I/O.
    So in practice no one can enable it on a production system or even
    for a test.

    This patch removes or replaces it:
    1. If all the error paths already use mlog_errno, it is simply removed.
       Otherwise, it is replaced by mlog_errno.
    2. If it is used to print a return value, it is replaced with
       mlog(0,...).
    mlog_exit_ptr is changed to mlog(0,...).
    All those mlog(0,...) calls will be replaced with trace events later.

    Signed-off-by: Tao Ma

    Tao Ma
     

23 Feb, 2011

1 commit


21 Feb, 2011

1 commit

  • ENTRY is used to record the entry of a function.
    But because it is added in so many functions, if we enable it,
    the system logs get filled up quickly and cause too much I/O.
    So in practice no one can enable it on a production system or even
    for a test.

    So mlog_entry_void is simply removed, and mlog_entry(...) is replaced
    with mlog(0,...), which will be replaced by trace events later.

    Signed-off-by: Tao Ma

    Tao Ma
     

02 Feb, 2011

1 commit

  • SELinux would like to implement a new labeling behavior of newly created
    inodes. We currently label new inodes based on the parent and the creating
    process. This new behavior would also take into account the name of the
    new object when deciding the new label. This is not the (supposed) full path,
    just the last component of the path.

    This is very useful because creating /etc/shadow is different from creating
    /etc/passwd, but the kernel hooks are unable to differentiate these
    operations. We currently require that userspace realize it is doing some
    difficult operation like that, and then userspace jumps through SELinux hoops
    to get things set up correctly. This patch does not implement the new
    behavior, which is contained in a separate SELinux patch, but it
    does pass the needed name down to the correct LSM hook. If no such name
    exists it is fine to pass NULL.

    Signed-off-by: Eric Paris

    Eric Paris
     

13 Jan, 2011

1 commit


12 Jan, 2011

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (22 commits)
    MAINTAINERS: Update Joel Becker's email address
    ocfs2: Remove unused truncate function from alloc.c
    ocfs2/cluster: dereferencing before checking in nst_seq_show()
    ocfs2: fix build for OCFS2_FS_STATS not enabled
    ocfs2/cluster: Show o2net timing statistics
    ocfs2/cluster: Track process message timing stats for each socket
    ocfs2/cluster: Track send message timing stats for each socket
    ocfs2/cluster: Use ktime instead of timeval in struct o2net_sock_container
    ocfs2/cluster: Replace timeval with ktime in struct o2net_send_tracking
    ocfs2: Add DEBUG_FS dependency
    ocfs2/dlm: Hard code the values for enums
    ocfs2/dlm: Minor cleanup
    ocfs2/dlm: Cleanup dlmdebug.c
    ocfs2: Release buffer_head in case of error in ocfs2_double_lock.
    ocfs2/cluster: Pin the local node when o2hb thread starts
    ocfs2/cluster: Show pin state for each o2hb region
    ocfs2/cluster: Pin/unpin o2hb regions
    ocfs2/cluster: Remove dropped region from o2hb quorum region bitmap
    ocfs2/cluster: Pin the remote node item in configfs
    ocfs2/dlm: make existing convertion precedent over new lock
    ...

    Linus Torvalds
     

07 Jan, 2011

1 commit

  • Reduce some branches and memory accesses in dcache lookup by adding dentry
    flags to indicate common d_ops are set, rather than having to check them.
    This saves a pointer memory access (dentry->d_op) in common path lookup
    situations, and saves another pointer load and branch in cases where we
    have d_op but not the particular operation.

    Patched with:

    git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i

    Signed-off-by: Nick Piggin

    Nick Piggin
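
A minimal userspace sketch of the idea in this commit: when the operations table is attached, record in the dentry's flags which callbacks exist, so the hot path tests a flag instead of loading `d_op` and then the callback pointer. The flag names and values and the `toy_*` types are assumptions for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits: one per operation that may be present. */
#define TOY_OP_HASH       0x1u
#define TOY_OP_COMPARE    0x2u
#define TOY_OP_REVALIDATE 0x4u

struct toy_dentry;

struct toy_dentry_ops {
    int (*d_hash)(const struct toy_dentry *dentry);
    int (*d_compare)(const struct toy_dentry *dentry, const char *name);
    int (*d_revalidate)(struct toy_dentry *dentry);
};

struct toy_dentry {
    unsigned int flags;
    const struct toy_dentry_ops *d_op;
};

/* Attach the ops table once and cache which callbacks it provides. */
void toy_set_d_op(struct toy_dentry *dentry, const struct toy_dentry_ops *op)
{
    assert(dentry->d_op == NULL);   /* ops are set exactly once */
    dentry->d_op = op;
    if (!op)
        return;
    if (op->d_hash)
        dentry->flags |= TOY_OP_HASH;
    if (op->d_compare)
        dentry->flags |= TOY_OP_COMPARE;
    if (op->d_revalidate)
        dentry->flags |= TOY_OP_REVALIDATE;
}

/* Hot path: a single flag test decides whether d_op is touched at all. */
int toy_revalidate(struct toy_dentry *dentry)
{
    if (!(dentry->flags & TOY_OP_REVALIDATE))
        return 1;                   /* no callback: treated as valid */
    return dentry->d_op->d_revalidate(dentry);
}

/* Example callback that always reports the dentry as stale. */
int toy_always_stale(struct toy_dentry *dentry)
{
    (void)dentry;
    return 0;
}
```

Routing every assignment through one setter is what makes the cached flags trustworthy, which is why the commit rewrites direct `d_op` assignments tree-wide.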
     

23 Dec, 2010

1 commit

  • In ocfs2_double_lock, when ocfs2_inode_lock for inode1 fails, we
    just unlock inode2 and return without releasing the buffer we got from
    inode_lock(inode2). Fortunately, it is freed by the only
    caller, ocfs2_rename, when it exits.

    But I don't think this is the right way to handle the error. We should
    free the buffer_head we got in ocfs2_double_lock before exiting so that
    the caller doesn't need to take care of it.

    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
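
The error-handling rule argued for here (a function that fails should release what it acquired) can be sketched in userspace C. The `toy_*` names, the lock order, and the use of `malloc` as a stand-in for a buffer_head are assumptions for illustration, not the ocfs2 implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for an inode and a buffer_head. */
struct toy_inode {
    int locked;
    int fail_lock;      /* set to make toy_inode_lock() fail */
};

struct toy_bh {
    int dummy;
};

int toy_live_buffers;   /* number of buffers currently held */

/* Lock an inode and hand back a buffer that the holder must release. */
int toy_inode_lock(struct toy_inode *inode, struct toy_bh **bh)
{
    if (inode->fail_lock)
        return -1;
    *bh = malloc(sizeof(**bh));
    if (!*bh)
        return -1;
    toy_live_buffers++;
    inode->locked = 1;
    return 0;
}

void toy_inode_unlock(struct toy_inode *inode)
{
    inode->locked = 0;
}

/* Release a buffer and clear the caller's pointer. */
void toy_brelse(struct toy_bh **bh)
{
    if (*bh) {
        assert(toy_live_buffers > 0);
        free(*bh);
        *bh = NULL;
        toy_live_buffers--;
    }
}

/* On failure, nothing stays locked and no buffer is left for the caller. */
int toy_double_lock(struct toy_inode *inode1, struct toy_bh **bh1,
                    struct toy_inode *inode2, struct toy_bh **bh2)
{
    int status;

    *bh1 = NULL;
    *bh2 = NULL;
    status = toy_inode_lock(inode2, bh2);
    if (status)
        return status;
    status = toy_inode_lock(inode1, bh1);
    if (status) {
        toy_inode_unlock(inode2);
        toy_brelse(bh2);    /* the release that the old code left to the caller */
    }
    return status;
}
```

With this shape, a caller can treat a non-zero return as "nothing to clean up", which is the contract the commit message asks for.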
     

26 Oct, 2010

1 commit


11 Sep, 2010

1 commit

  • Track negative dentries by recording the generation number of the parent
    directory in d_fsdata. The generation number for the parent directory is
    recorded in the inode_info, which increments every time the lock on the
    directory is dropped.

    If the generation numbers of the parent directory and the negative dentry
    match, there is no need to perform the revalidate; otherwise a revalidate
    is forced. This improves performance in situations where nodes look for
    the same non-existent file multiple times.

    Thanks Mark for explaining the DLM sequence.

    Signed-off-by: Goldwyn Rodrigues
    Signed-off-by: Joel Becker

    Goldwyn Rodrigues
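
The generation check described above can be sketched as follows. This is a userspace illustration under assumed names (`toy_dir`, `toy_neg_dentry`, `lock_gen`); the real code stores the value in d_fsdata and the counter in the ocfs2 inode info.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical directory: the counter is bumped whenever its lock is dropped. */
struct toy_dir {
    unsigned long lock_gen;
};

/* Hypothetical negative dentry: remembers the parent's generation. */
struct toy_neg_dentry {
    struct toy_dir *parent;
    unsigned long cached_gen;
};

/* Called when the cluster lock on the directory is dropped. */
void toy_dir_lock_dropped(struct toy_dir *dir)
{
    dir->lock_gen++;
}

/* Called when a lookup finds nothing and a negative dentry is created. */
void toy_record_negative(struct toy_neg_dentry *dentry, struct toy_dir *parent)
{
    dentry->parent = parent;
    dentry->cached_gen = parent->lock_gen;
}

/* True if the negative result can be trusted without a full revalidate. */
bool toy_negative_still_valid(const struct toy_neg_dentry *dentry)
{
    assert(dentry->parent != NULL);
    return dentry->cached_gen == dentry->parent->lock_gen;
}
```

While the lock is held continuously, repeated lookups of the same missing name hit the fast path; once the lock has been dropped, another node may have created the file, so the stored generation no longer matches and a revalidate is forced.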
     

08 Sep, 2010

3 commits

  • ocfs2_create_inode_in_orphan() is used by reflink to create the newly
    reflinked inode simultaneously in the orphan dir. This allows us to easily
    handle partially-reflinked files during recovery cleanup.

    We have a problem though - the orphan dir stringifies inode # to determine
    a unique name under which the orphan entry dirent can be created. Since
    ocfs2_create_inode_in_orphan() needs the space allocated in the orphan dir
    before it can allocate the inode, we currently call into the orphan code:

        /*
         * We give the orphan dir the root blkno to fake an orphan name,
         * and allocate enough space for our insertion.
         */
        status = ocfs2_prepare_orphan_dir(osb, &orphan_dir,
                                          osb->root_blkno,
                                          orphan_name, &orphan_insert);

    Using osb->root_blkno might work fine on unindexed directories, but the
    orphan dir can have an index. When it has that index, the above code fails
    to allocate the proper index entry. Later, when we try to remove the file
    from the orphan dir (using the actual inode #), the reflink operation will
    fail.

    To fix this, I created a function ocfs2_alloc_orphaned_file() which uses the
    newly split out orphan and inode alloc code to figure out what the inode
    block number will be (once allocated) and then prepare the orphan dir from
    that data.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • We do this because ocfs2_create_inode_in_orphan() wants to order locking of
    the orphan dir with respect to locking of the inode allocator *before*
    making any changes to the directory.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • Do this by splitting the bulk of the function away from the inode allocation
    code at the very top of ocfs2_mknod_locked(). Existing callers don't need to
    change and won't see any difference. The new function created,
    __ocfs2_mknod_locked() will be used shortly.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
     

22 May, 2010

1 commit


21 May, 2010

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (47 commits)
    ocfs2: Silence a gcc warning.
    ocfs2: Don't retry xattr set in case value extension fails.
    ocfs2:dlm: avoid dlm->ast_lock lockres->spinlock dependency break
    ocfs2: Reset xattr value size after xa_cleanup_value_truncate().
    fs/ocfs2/dlm: Use kstrdup
    fs/ocfs2/dlm: Drop memory allocation cast
    Ocfs2: Optimize punching-hole code.
    Ocfs2: Make ocfs2_find_cpos_for_left_leaf() public.
    Ocfs2: Fix hole punching to correctly do CoW during cluster zeroing.
    Ocfs2: Optimize ocfs2 truncate to use ocfs2_remove_btree_range() instead.
    ocfs2: Block signals for mkdir/link/symlink/O_CREAT.
    ocfs2: Wrap signal blocking in void functions.
    ocfs2/dlm: Increase o2dlm lockres hash size
    ocfs2: Make ocfs2_extend_trans() really extend.
    ocfs2/trivial: Code cleanup for allocation reservation.
    ocfs2: make ocfs2_adjust_resv_from_alloc simple.
    ocfs2: Make nointr a default mount option
    ocfs2/dlm: Make o2dlm domain join/leave messages KERN_NOTICE
    o2net: log socket state changes
    ocfs2: print node # when tcp fails
    ...

    Linus Torvalds
     

19 May, 2010

1 commit


11 May, 2010

1 commit

  • Once file or link creation gets going, it can't be interrupted by a
    signal. They're not idempotent.

    This blocks signals in ocfs2_mknod(), ocfs2_link(), and ocfs2_symlink()
    once we start actually changing things. ocfs2_mknod() covers mknod(),
    creat(), mkdir(), and open(O_CREAT).

    Signed-off-by: Joel Becker

    Joel Becker
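
A userspace analogue of the approach in this commit, using POSIX `sigprocmask()`: block all signals once the non-idempotent part begins, and restore the previous mask afterwards. The `toy_*` function names are hypothetical; the in-kernel helpers operate on the task's signal mask rather than calling this libc API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Block every signal and save the previous mask in *oldset. */
void toy_block_signals(sigset_t *oldset)
{
    sigset_t blocked;
    int rc;

    sigfillset(&blocked);
    rc = sigprocmask(SIG_BLOCK, &blocked, oldset);
    assert(rc == 0);
    (void)rc;
}

/* Restore the mask saved by toy_block_signals(). */
void toy_unblock_signals(const sigset_t *oldset)
{
    int rc = sigprocmask(SIG_SETMASK, oldset, NULL);

    assert(rc == 0);
    (void)rc;
}

/*
 * A multi-step update that must not be abandoned halfway: signals stay
 * blocked from the first modification until the last one is done.
 */
int toy_create_entry(int *state)
{
    sigset_t oldset;

    toy_block_signals(&oldset);
    *state = 1;     /* step 1: e.g. allocate the inode */
    *state = 2;     /* step 2: e.g. add the directory entry */
    toy_unblock_signals(&oldset);
    return 0;
}
```

Signals that arrive while blocked stay pending and are delivered after the mask is restored, so the caller still sees them, just not in the middle of the update. SIGKILL and SIGSTOP cannot be blocked from userspace, so this sketch is weaker than the in-kernel version.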
     

06 May, 2010

2 commits

  • jbd[2]_journal_dirty_metadata() only returns 0. It's been returning 0
    since before the kernel moved to git. There is no point in checking
    this error.

    ocfs2_journal_dirty() has been faithfully returning the status since the
    beginning. All over ocfs2, we have blocks of code checking this can't-fail
    status. In the past few years, we've tried to avoid adding these
    checks, because they are pointless. But anyone who looks at our code
    assumes they are needed.

    Finally, ocfs2_journal_dirty() is made a void function. All error
    checking is removed from other files. We'll BUG_ON() the status of
    jbd2_journal_dirty_metadata() just in case they change it someday. They
    won't.

    Signed-off-by: Joel Becker

    Joel Becker
     
  • They all take an ocfs2_alloc_context, which has the allocation inode.

    Signed-off-by: Joel Becker
    Signed-off-by: Tao Ma

    Joel Becker
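
The ocfs2_journal_dirty() change in the first commit of this group follows a common pattern: when a lower-layer call can only return 0, the wrapper becomes `void` and asserts the invariant instead of propagating a status nobody can act on. A minimal sketch with hypothetical names, using `assert()` where the kernel code uses BUG_ON():

```c
#include <assert.h>

/* Hypothetical journal handle that counts how many buffers were dirtied. */
struct toy_handle {
    int dirtied;
};

/* Stand-in for a lower-layer call that always returns 0. */
int toy_lower_dirty_metadata(struct toy_handle *handle)
{
    handle->dirtied++;
    return 0;
}

/*
 * The wrapper returns nothing, so callers carry no dead error paths.
 * The assertion only fires if the lower layer ever changes its contract.
 */
void toy_journal_dirty(struct toy_handle *handle)
{
    int status = toy_lower_dirty_metadata(handle);

    assert(status == 0);
    (void)status;
}
```

A call site shrinks from a call plus an `if (status < 0)` block to a single statement, which is the cleanup the commit applies across ocfs2.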
     

24 Apr, 2010

4 commits


26 Mar, 2010

1 commit


24 Mar, 2010

1 commit


05 Mar, 2010

4 commits

  • Get rid of the initialize dquot operation - it is now always called from
    the filesystem and if a filesystem really needs its own (which none
    currently does) it can just call into its own routine directly.

    Rename the now static low-level dquot_initialize helper to __dquot_initialize
    and vfs_dq_init to dquot_initialize to have a consistent namespace.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Currently various places in the VFS call vfs_dq_init directly. This means
    we tie the quota code into the VFS. Get rid of that and make the
    filesystem responsible for the initialization. For most metadata operations
    this is a straightforward move into the methods, but for truncate and
    open it's a bit more complicated.

    For truncate we currently only call vfs_dq_init for the sys_truncate case
    because open already takes care of it for ftruncate and open(O_TRUNC) - the
    new code causes an additional vfs_dq_init for those which is harmless.

    For open the initialization is moved from do_filp_open into the open method,
    which means it happens slightly earlier now, and only for regular files.
    The latter is fine because we don't need to initialize it for operations
    on special files, and we already do it as part of the namespace operations
    for directories.

    Add a dquot_file_open helper that filesystems that support generic quotas
    can use to fill in ->open.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the alloc_inode and free_inode dquot operations - they are
    always called from the filesystem and if a filesystem really needs
    its own (which none currently does) it can just call into its
    own routine directly.

    Also get rid of the vfs_dq_alloc/vfs_dq_free wrappers and always
    call the low-level dquot_alloc_inode / dquot_free_inode routines
    directly, which now lose the number argument, which was always 1.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Get rid of the alloc_space, free_space, reserve_space, claim_space and
    release_rsv dquot operations - they are always called from the filesystem
    and if a filesystem really needs its own (which none currently does)
    it can just call into its own routine directly.

    Move shared logic into the common __dquot_alloc_space,
    dquot_claim_space_nodirty and __dquot_free_space low-level methods,
    and rationalize the wrappers around them to move as much code as
    possible into the common block for CONFIG_QUOTA vs not. Also rename
    all these helpers to be named dquot_* instead of vfs_dq_*.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     

25 Dec, 2009

1 commit

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
    ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.c
    ocfs2/trivial: Use proper mask for 2 places in hearbeat.c
    Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink.
    Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode?
    ocfs2: Use FIEMAP_EXTENT_SHARED
    fiemap: Add new extent flag FIEMAP_EXTENT_SHARED
    ocfs2: replace u8 by __u8 in ocfs2_fs.h
    ocfs2: explicit declare uninitialized var in user_cluster_connect()
    ocfs2-devel: remove redundant OCFS2_MOUNT_POSIX_ACL check in ocfs2_get_acl_nolock()
    ocfs2: return -EAGAIN instead of EAGAIN in dlm
    ocfs2/cluster: Make fence method configurable - v2
    ocfs2: Set MS_POSIXACL on remount
    ocfs2: Make acl use the default
    ocfs2: Always include ACL support

    Linus Torvalds
     

19 Dec, 2009

2 commits

  • We create a file in the orphan dir for reflink so that if there
    is any error, we don't create a wrong dentry in the dir.
    But the file in the orphan dir should actually have i_nlink = 0
    so that it can be replayed and freed successfully.

    This patch first sets i_nlink to 0 when creating the file in
    the orphan dir and then sets it to 1 (reflink now only works for
    regular files) when we move it to the dest dir.

    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
     
  • We used to add the reflinked file's inode to the inode hash when
    we added it to the dest dir. But there is actually a race.
    Consider the following sequence:
    1. reflink happens and creates the inode in the orphan dir.
    2. The reflink thread is scheduled out because of some I/O.
    3. Recovery begins to work and calls ocfs2_recover_orphans.
       It calls ocfs2_iget, gets a new inode, and i_count = 1.
       It then calls iput and deletes the inode. The buffer's
       uptodate state is cleared.

    This patch moves insert_inode_hash to the create function so
    that the inode can be found by step 3 and is prevented from being
    deleted because i_count > 1.

    This resolves the bug
    http://oss.oracle.com/bugzilla/show_bug.cgi?id=1183.

    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma