04 Jul, 2013

21 commits

  • A NULL pointer dereference can occur in config_item_name() when one
    volume (say Volume A) unmounts while another (say Volume B) is mounting.

    Volume A                                  Volume B

    already mounted.
    Unmounting, call
    o2hb_heartbeat_group_drop_item()
    -> config_item_put(item)
    set reg(A)->item.ci_name to NULL
    in function config_item_cleanup().

                                              begin mounting, call
                                              o2hb_region_pin() and traverse all
                                              regions. When reading
                                              reg(A)->item.ci_name, it causes
                                              NULL pointer dereference.

    call o2hb_region_release() and
    del reg(A) from list.

    So we should skip regions that are about to be released when
    traversing o2hb_all_regions.

    Signed-off-by: Yiwen Jiang
    Signed-off-by: joyce
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Cc: Sunil Mushran
    Cc: Jie Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xue jiufei
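
The idea of the entry above can be sketched in userspace C. This is only an illustration under stated assumptions: struct region, its dropped flag and count_live_regions() are hypothetical names, not the kernel's o2hb_region or the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a heartbeat region; not the kernel's o2hb_region. */
struct region {
	const char *name;      /* becomes NULL once cleanup has started */
	int dropped;           /* nonzero once the region is being released */
	struct region *next;
};

/* Count the regions that are still safe to inspect, skipping dropped ones. */
static int count_live_regions(const struct region *head)
{
	int live = 0;
	const struct region *reg;

	for (reg = head; reg != NULL; reg = reg->next) {
		if (reg->dropped)
			continue;               /* never read reg->name here */
		assert(reg->name != NULL);      /* holds for regions not yet dropped */
		if (reg->name[0] != '\0')
			live++;
	}
	return live;
}
```

With a dropped region whose name is already NULL linked in front of a live one, the walk returns 1 and never reads the NULL name.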
     
  • Adjust switch..case syntax at o2net_state_change to meet the kernel coding
    standard.

    s/printk/pr_info/.

    [akpm@linux-foundation.org: revert pr_foo() change]
    Signed-off-by: Jie Liu
    Acked-by: Joel Becker
    Cc: Gurudas Pai
    Cc: Mark Fasheh
    Cc: Noboru Iwamatsu
    Cc: Srinivas Eeeda
    Cc: Sunil Mushran
    Cc: Tao Ma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jie Liu
     
  • Fix a comment typo in o2quo_hb_still_up()

    Signed-off-by: Jie Liu
    Cc: Gurudas Pai
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Noboru Iwamatsu
    Cc: Srinivas Eeeda
    Cc: Sunil Mushran
    Cc: Tao Ma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jie Liu
     
  • s/o2hb_global_hearbeat_mode_set/o2hb_global_heartbeat_mode_set/ to keep
    the name of this routine consistent with the other heartbeat
    routines.

    Signed-off-by: Jie Liu
    Acked-by: Sunil Mushran
    Cc: Gurudas Pai
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Noboru Iwamatsu
    Cc: Srinivas Eeeda
    Cc: Tao Ma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jie Liu
     
  • Under heavy I/O load, writing the disk heartbeat can be forced to wait for
    minutes, and this causes the node to be fenced.

    This patch tries to use WRITE_SYNC in submitting the heartbeat bio, so
    that writing the heartbeat will have a priority over other requests.

    Signed-off-by: Noboru Iwamatsu
    Acked-by: Tao Ma
    Acked-by: Sunil Mushran
    Cc: Srinivas Eeeda
    Reviewed-by: Jie Liu
    Tested-by: Gurudas Pai
    Cc: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Noboru Iwamatsu
     
  • Inlined xattrs share the free space of the inode block with inlined data
    or the data extent record, so the size of the latter two should be
    adjusted when inlined xattrs are enabled. See ocfs2_xattr_ibody_init().
    But this isn't done properly for reflink. For an inode with inlined
    data, its max inlined data size is adjusted in
    ocfs2_duplicate_inline_data(), so there is no problem. But for an inode
    with a data extent record, its record count isn't adjusted. Fix it;
    otherwise the data extent record and inlined xattr may overwrite each
    other, causing data corruption or xattr failure.

    One panic caused by this bug in our test environment is the following:

    kernel BUG at fs/ocfs2/xattr.c:1435!
    invalid opcode: 0000 [#1] SMP
    Pid: 10871, comm: multi_reflink_t Not tainted 2.6.39-300.17.1.el5uek #1
    RIP: ocfs2_xa_offset_pointer+0x17/0x20 [ocfs2]
    RSP: e02b:ffff88007a587948 EFLAGS: 00010283
    RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00000000000051e4
    RDX: ffff880057092060 RSI: 0000000000000f80 RDI: ffff88007a587a68
    RBP: ffff88007a587948 R08: 00000000000062f4 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000010
    R13: ffff88007a587a68 R14: 0000000000000001 R15: ffff88007a587c68
    FS: 00007fccff7f06e0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
    CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00000000015cf000 CR3: 000000007aa76000 CR4: 0000000000000660
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process multi_reflink_t
    Call Trace:
    ocfs2_xa_reuse_entry+0x60/0x280 [ocfs2]
    ocfs2_xa_prepare_entry+0x17e/0x2a0 [ocfs2]
    ocfs2_xa_set+0xcc/0x250 [ocfs2]
    ocfs2_xattr_ibody_set+0x98/0x230 [ocfs2]
    __ocfs2_xattr_set_handle+0x4f/0x700 [ocfs2]
    ocfs2_xattr_set+0x6c6/0x890 [ocfs2]
    ocfs2_xattr_user_set+0x46/0x50 [ocfs2]
    generic_setxattr+0x70/0x90
    __vfs_setxattr_noperm+0x80/0x1a0
    vfs_setxattr+0xa9/0xb0
    setxattr+0xc3/0x120
    sys_fsetxattr+0xa8/0xd0
    system_call_fastpath+0x16/0x1b

    Signed-off-by: Junxiao Bi
    Reviewed-by: Jie Liu
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Cc: Sunil Mushran
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Junxiao Bi
     
  • There is a bug in ocfs2_unlink() that can leave the filesystem
    read-only when deleting a file.

    After calling ocfs2_orphan_add(), the file to be deleted is added to
    the orphan dir. If ocfs2_delete_entry() then fails, the file still
    exists in the parent dir, which leaves the metadata in conflict.

    Once a file has been added to the orphan dir, putting its inode with
    iput() clears OCFS2_VALID_FL in the inode's i_flags
    (i_flags &= ~OCFS2_VALID_FL) in ocfs2_remove_inode(), and the inode is
    then written back to disk.

    But as previously mentioned, the file still exists in the parent dir,
    so it can still be accessed from other nodes. When another node first
    reads the file from disk with ocfs2_read_blocks(), the inode is
    validated by ocfs2_validate_inode_block(). Because OCFS2_VALID_FL has
    been cleared, the inode is invalid and the file system goes read-only.

    [akpm@linux-foundation.org: cleanups]
    [jeff.liu@oracle.com: s/inode_is_unlinkable/ocfs2_inode_is_unlinkable/]
    Signed-off-by: Younger Liu
    Signed-off-by: Jensen
    Cc: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Sunil Mushran
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Younger Liu
     
  • Cc: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Sunil Mushran
    Cc: Younger Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • In ocfs2_relink_block_group(), we roll back all changes whenever
    declaring the intent to modify a metadata buffer fails, even if the
    relevant buffer has not yet been modified or dirtied at that point.
    That is not quite right, because:

    - No buffer has been modified or dirtied if ocfs2_journal_access_gd()
    fails against the previous block group buffer

    - Only the previous block group buffer has been dirtied if
    ocfs2_journal_access_gd() fails against the block group buffer

    - There is no need to roll back the change for the file entry buffer at all

    These problems do no harm, but the extra rollback is unnecessary. This
    patch fixes them and removes the useless bg_ptr variable as well.

    Signed-off-by: Jie Liu
    Cc: Younger Liu
    Cc: Sunil Mushran
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jie Liu
     
  • While adding a file into the orphan dir in ocfs2_orphan_add(), it calls
    __ocfs2_add_entry() before ocfs2_journal_access_di(). If
    ocfs2_journal_access_di() fails, the file has been added to the orphan
    dir and the orphan dir dinode updated, but the file dinode has not been
    updated. The data is then inconsistent between the file dinode and the
    orphan dir.

    So we need to call ocfs2_journal_access_di() before
    __ocfs2_add_entry(), and if ocfs2_journal_access_di() fails, orphan_fe
    and orphan_dir_inode->i_nlink need to be rolled back.

    This bug was added by 3939fda4 ("Ocfs2: Journaling i_flags and
    i_orphaned_slot when adding inode to orphan dir.").

    Signed-off-by: Younger Liu
    Acked-by: Jeff Liu
    Cc: Sunil Mushran
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Younger Liu
     
  • dlmlock_master() returns DLM_RECOVERING/DLM_MIGRATING/DLM_FORWARD after
    adding the lock to the blocked list if the lockres has the state
    DLM_LOCK_RES_RECOVERING/DLM_LOCK_RES_MIGRATING/DLM_LOCK_RES_IN_PROGRESS,
    so dlmlock() will retry. This may cause dlm_thread to fall into an
    infinite loop:

    Thread1                                   dlm_thread

    calls dlm_lock->dlmlock_master,
    if lockresA is in state
    DLM_LOCK_RES_RECOVERING, calls
    __dlm_wait_on_lockres() and waits
    until other threads clear this
    state;

    If cannot grant this lock,
    adding lock to blocked list,
    and return DLM_RECOVERING;

                                              Grant this lock and move it to
                                              grant list;

    After a while, retry and
    calls list_add_tail(), adding lock
    to blocked list again.

    The granted and blocked lists of this lockres then end up in the
    following state:

    lock_res->granted.next = dlm_lock->list_head;
    lock_res->blocked.next = dlm_lock->list_head;
    dlm_lock->list_head.next = dlm_lock_resource->blocked;

    When dlm_thread traverses the granted list, it falls into an endless
    loop, checking dlm_lock.list_head, dlm_lock->list_head.next
    (i.e. lock_res->blocked), lock_res->blocked.next (i.e.
    dlm_lock.list_head again), and so on.

    Signed-off-by: joyce
    Reviewed-by: jensen
    Cc: Jeff Liu
    Acked-by: Sunil Mushran
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xue jiufei
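
The list corruption described in the entry above can be reproduced with a tiny circular list. This is a userspace sketch loosely modelled on the kernel's list_head; every helper here is a local re-implementation for illustration, and double_add_corrupts_granted() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del(struct list_node *node)
{
	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->next = node->prev = node;
}

/* Return 1 if a walk from head gets back to head within max_steps, else 0. */
static int list_walk_terminates(const struct list_node *head, int max_steps)
{
	const struct list_node *pos = head->next;
	int steps = 0;

	while (pos != head) {
		if (++steps > max_steps)
			return 0;
		pos = pos->next;
	}
	return 1;
}

/* Replay the sequence from the commit message; return 1 if granted is corrupted. */
static int double_add_corrupts_granted(void)
{
	struct list_node granted, blocked, lock;

	list_init(&granted);
	list_init(&blocked);
	list_init(&lock);

	list_add_tail(&lock, &blocked);   /* Thread1: cannot grant, block it */
	list_del(&lock);                  /* dlm_thread: grant the lock ... */
	list_add_tail(&lock, &granted);   /* ... and move it to granted */
	assert(list_walk_terminates(&granted, 16));

	list_add_tail(&lock, &blocked);   /* Thread1 retries without list_del */
	return !list_walk_terminates(&granted, 16);
}
```

After the second list_add_tail(), granted.next and blocked.next both point at the lock while lock.next points at blocked, so a walk of the granted list never returns to its head.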
     
  • Free space checking is already done in ocfs2_xattr_ibody_init(), so
    remove it here.

    [akpm@linux-foundation.org: remove unused local]
    Signed-off-by: Junxiao Bi
    Reviewed-by: Jie Liu
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Junxiao Bi
     
  • There is a memory leak in sc_kref_release(). When freeing struct
    o2net_sock_container (sc), we should also release sc->sc_page.

    Signed-off-by: Younger Liu
    Reviewed-by: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Younger Liu
     
  • While adding extents to a file, the credits are calculated incorrectly.
    If more than one cluster is requested (or somewhat more, because we
    use a conservative limit), we run out of journal credits and hit an
    assert in the journalling code.

    The function's bits_wanted parameter was not used at all.

    Signed-off-by: Goldwyn Rodrigues
    Reviewed-by: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Goldwyn Rodrigues
     
  • In ocfs2_remove_btree_range(), when ocfs2_lock_refcount_tree() or
    ocfs2_prepare_refcount_change_for_del() fails, the code jumps to out
    and calls mutex_unlock() without a preceding mutex_lock(). And when
    ocfs2_reserve_blocks_for_rec_trunc() fails, ref_tree should be freed
    before returning.

    Signed-off-by: Joseph Qi
    Reviewed-by: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
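
The error-path rule in the entry above (each exit label undoes only what has actually been acquired) can be sketched as follows. This is a userspace illustration under assumptions: struct trace, fake_mutex_unlock() and remove_range_sketch() are hypothetical, and the step order only approximates the real function.

```c
#include <assert.h>

/* Hypothetical bookkeeping so the sketch can be checked; not kernel state. */
struct trace {
	int mutex_held;     /* 1 while the pretend mutex is held */
	int tree_locked;    /* 1 while the pretend refcount tree is locked */
	int bad_unlocks;    /* unlocks attempted without a matching lock */
};

static void fake_mutex_unlock(struct trace *t)
{
	if (!t->mutex_held)
		t->bad_unlocks++;
	t->mutex_held = 0;
}

/*
 * fail_at selects the failing step: 1 = lock tree, 2 = prepare,
 * 3 = reserve blocks, 0 = none. Returns 0 on success, -1 on failure.
 */
static int remove_range_sketch(struct trace *t, int fail_at)
{
	int ret = 0;

	if (fail_at == 1) {
		ret = -1;
		goto bail;              /* nothing acquired yet */
	}
	t->tree_locked = 1;

	if (fail_at == 2 || fail_at == 3) {
		ret = -1;
		goto out_unlock_tree;   /* tree locked, mutex never taken */
	}

	t->mutex_held = 1;
	assert(t->tree_locked);         /* work runs only with the tree locked */
	/* ... the actual work would happen here ... */
	fake_mutex_unlock(t);

out_unlock_tree:
	t->tree_locked = 0;             /* release the tree on every later path */
bail:
	return ret;
}
```

Whatever step fails, bad_unlocks stays 0 and both tree_locked and mutex_held are back to 0 on return.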
     
  • Code cleanup: needs_checkpoint is assigned to but never used. Delete
    the variable.

    Signed-off-by: Goldwyn Rodrigues
    Cc: Jeff Liu
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Goldwyn Rodrigues
     
  • dlm_begin_reco_handler() returns without putting dlm when dlm recovery
    state is DLM_RECO_STATE_FINALIZE.

    Signed-off-by: joyce
    Reviewed-by: Jie Liu
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xue jiufei
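
A balanced get/put on an early-return path, as the entry above calls for, might look like the sketch below. struct ctx, ctx_grab(), ctx_put() and begin_reco_sketch() are hypothetical stand-ins rather than the dlm API, and the -1 return value is arbitrary.

```c
#include <assert.h>

/* Hypothetical reference-counted context standing in for the dlm context. */
struct ctx {
	int refs;
	int finalizing;   /* stand-in for the DLM_RECO_STATE_FINALIZE check */
};

static void ctx_grab(struct ctx *c)
{
	c->refs++;
}

static void ctx_put(struct ctx *c)
{
	assert(c->refs > 0);
	c->refs--;
}

/* Returns 0 when handled, -1 when refused; refs are balanced on both paths. */
static int begin_reco_sketch(struct ctx *c)
{
	ctx_grab(c);

	if (c->finalizing) {
		ctx_put(c);      /* the early return must drop the reference too */
		return -1;
	}

	/* ... normal handling would happen here ... */
	ctx_put(c);
	return 0;
}
```

Either way the reference count on return equals the count on entry.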
     
  • If we use le32_add_cpu() to set ocfs2_dinode i_flags, the corresponding
    flag may be corrupted. So we should change it to bitwise and/or
    operations.

    Signed-off-by: Joseph Qi
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: shencanquan
    Reviewed-by: Jie Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
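
Why addition is unsafe for flag words, as the entry above notes, can be shown with plain integers. The flag values below are hypothetical and endianness conversion is left out; the sketch only contrasts adding a flag with OR-ing it.

```c
#include <assert.h>
#include <stdint.h>

#define FLAG_VALID   0x1u   /* hypothetical flag values for illustration */
#define FLAG_ORPHAN  0x2u
#define FLAG_OTHER   0x4u

/* Arithmetic "set": wrong when the flag is already set, because it carries. */
static uint32_t set_flag_by_add(uint32_t flags, uint32_t flag)
{
	return flags + flag;
}

/* Bitwise set: idempotent, touches only the requested bit. */
static uint32_t set_flag_by_or(uint32_t flags, uint32_t flag)
{
	return flags | flag;
}

/* Bitwise clear: the counterpart for removing a flag. */
static uint32_t clear_flag_by_and(uint32_t flags, uint32_t flag)
{
	return flags & ~flag;
}
```

Starting from FLAG_VALID | FLAG_ORPHAN (0x3), adding FLAG_ORPHAN again gives 0x5, which clears FLAG_ORPHAN and sets the unrelated FLAG_OTHER, whereas the OR leaves the word at 0x3.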
     
  • In dlm_request_all_locks(), ret is an enum type, but
    o2net_send_message() returns an int. As a result, the error branch
    that follows is never taken. So we should change the type of ret from
    enum to int.

    Signed-off-by: Joseph Qi
    Cc: Joel Becker
    Cc: Mark Fasheh
    Acked-by: Sunil Mushran
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
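
The effect described in the entry above depends on the compiler choosing an unsigned type for the enum, which C permits. The sketch below makes that assumption explicit by using unsigned int directly; send_message_sketch() and the value -5 are hypothetical.

```c
#include <assert.h>

/* Hypothetical sender that fails with a negative errno-style value. */
static int send_message_sketch(void)
{
	return -5;
}

/* What a caller observes when the result lands in an unsigned variable. */
static long long observed_via_unsigned(void)
{
	unsigned int ret = (unsigned int)send_message_sketch();

	return (long long)ret;   /* a large positive number, never below zero */
}

/* What a caller observes when the result is kept in an int. */
static long long observed_via_int(void)
{
	int ret = send_message_sketch();

	return (long long)ret;   /* still -5, so a "ret < 0" check can fire */
}
```

With the unsigned stand-in the stored value is positive, so a check such as ret < 0 can never be true; with int the negative error code survives.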
     
  • The following three functions are already declared in dlmcommon.h, so
    there is no need to declare them again in dlmrecovery.c:

    dlm_complete_recovery_thread
    dlm_launch_recovery_thread
    dlm_kick_recovery_thread

    Signed-off-by: Joseph Qi
    Cc: Joel Becker
    Cc: Mark Fasheh
    Acked-by: Sunil Mushran
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • Pull second set of VFS changes from Al Viro:
    "Assorted f_pos race fixes, making do_splice_direct() safe to call with
    i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
    ->d_hash/->d_compare calling conventions changes from Linus, misc
    stuff all over the place."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    Document ->tmpfile()
    ext4: ->tmpfile() support
    vfs: export lseek_execute() to modules
    lseek_execute() doesn't need an inode passed to it
    block_dev: switch to fixed_size_llseek()
    cpqphp_sysfs: switch to fixed_size_llseek()
    tile-srom: switch to fixed_size_llseek()
    proc_powerpc: switch to fixed_size_llseek()
    ubi/cdev: switch to fixed_size_llseek()
    pci/proc: switch to fixed_size_llseek()
    isapnp: switch to fixed_size_llseek()
    lpfc: switch to fixed_size_llseek()
    locks: give the blocked_hash its own spinlock
    locks: add a new "lm_owner_key" lock operation
    locks: turn the blocked_list into a hashtable
    locks: convert fl_link to a hlist_node
    locks: avoid taking global lock if possible when waking up blocked waiters
    locks: protect most of the file_lock handling with i_lock
    locks: encapsulate the fl_link list handling
    locks: make "added" in __posix_lock_file a bool
    ...

    Linus Torvalds
     

03 Jul, 2013

2 commits

  • For those file systems (btrfs/ext4/ocfs2/tmpfs) that support the
    SEEK_DATA/SEEK_HOLE functions, we end up duplicating what
    lseek_execute() does: update the current file offset to the
    desired offset if it is valid. ceph also does a similar
    thing in ceph_llseek().

    To reduce the duplication, this patch makes lseek_execute()
    publicly accessible so that we can call it directly from the
    underlying file systems.

    Thanks to Dave Chinner for this suggestion.

    [AV: call it vfs_setpos(), don't bring the removed 'inode' argument back]

    v2->v1:
    - Add kernel-doc comments for lseek_execute()
    - Call lseek_execute() in ceph->llseek()

    Signed-off-by: Jie Liu
    Cc: Dave Chinner
    Cc: Al Viro
    Cc: Andi Kleen
    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Chris Mason
    Cc: Josef Bacik
    Cc: Ben Myers
    Cc: Ted Tso
    Cc: Hugh Dickins
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Sage Weil
    Signed-off-by: Al Viro

    Jie Liu
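
The shared helper discussed in the entry above validates an offset and then updates the file position. A userspace approximation is sketched below; struct file_sketch and setpos_sketch() are hypothetical names, and the real helper handles details (such as files that allow unsigned offsets) that are omitted here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the part of struct file the helper touches. */
struct file_sketch {
	long long pos;
};

/*
 * Validate offset against [0, maxsize] and, if valid, make it the current
 * position. Returns the new position, or -EINVAL without changing pos.
 */
static long long setpos_sketch(struct file_sketch *file, long long offset,
			       long long maxsize)
{
	if (offset < 0 || offset > maxsize)
		return -EINVAL;
	if (offset != file->pos)
		file->pos = offset;
	return offset;
}
```

A file system's llseek implementation would compute the target offset for SEEK_DATA or SEEK_HOLE itself and then hand it to one such helper instead of repeating the checks.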
     
  • Pull ext4 update from Ted Ts'o:
    "Lots of bug fixes, cleanups and optimizations. In the bug fixes
    category, of note is a fix for on-line resizing file systems where the
    block size is smaller than the page size (i.e., file systems with 1k blocks
    on x86, or more interestingly file systems with 4k blocks on Power or
    ia64 systems.)

    In the cleanup category, the ext4's punch hole implementation was
    significantly improved by Lukas Czerner, and now supports bigalloc
    file systems. In addition, Jan Kara significantly cleaned up the
    write submission code path. We also improved error checking and added
    a few sanity checks.

    In the optimizations category, two major optimizations deserve
    mention. The first is that ext4_writepages() is now used for
    nodelalloc and ext3 compatibility mode. This allows writes to be
    submitted much more efficiently as a single bio request, instead of
    being sent as individual 4k writes into the block layer (which then
    relied on the elevator code to coalesce the requests in the block
    queue). Secondly, the extent cache shrink mechanism, which was
    introduced in 3.9, no longer has a scalability bottleneck caused by the
    i_es_lru spinlock. Other optimizations include some changes to reduce
    CPU usage and to avoid issuing empty commits unnecessarily."

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits)
    ext4: optimize starting extent in ext4_ext_rm_leaf()
    jbd2: invalidate handle if jbd2_journal_restart() fails
    ext4: translate flag bits to strings in tracepoints
    ext4: fix up error handling for mpage_map_and_submit_extent()
    jbd2: fix theoretical race in jbd2__journal_restart
    ext4: only zero partial blocks in ext4_zero_partial_blocks()
    ext4: check error return from ext4_write_inline_data_end()
    ext4: delete unnecessary C statements
    ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()
    jbd2: move superblock checksum calculation to jbd2_write_superblock()
    ext4: pass inode pointer instead of file pointer to punch hole
    ext4: improve free space calculation for inline_data
    ext4: reduce object size when !CONFIG_PRINTK
    ext4: improve extent cache shrink mechanism to avoid to burn CPU time
    ext4: implement error handling of ext4_mb_new_preallocation()
    ext4: fix corruption when online resizing a fs with 1K block size
    ext4: delete unused variables
    ext4: return FIEMAP_EXTENT_UNKNOWN for delalloc extents
    jbd2: remove debug dependency on debug_fs and update Kconfig help text
    jbd2: use a single printk for jbd_debug()
    ...

    Linus Torvalds
     

29 Jun, 2013

1 commit


13 Jun, 2013

3 commits

  • dlm_mig_lockres_handler() is missing a dlm_lockres_put() on an error path.

    Signed-off-by: joyce
    Reviewed-by: shencanquan
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xue jiufei
     
  • While removing a non-empty directory, the kernel dumps a message:

    (rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39

    Suppress the error message from being printed in the dmesg so users
    don't panic.

    Signed-off-by: Goldwyn Rodrigues
    Cc: Mark Fasheh
    Cc: Joel Becker
    Acked-by: Sunil Mushran
    Reviewed-by: Jie Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Goldwyn Rodrigues
     
  • If an error occurs, for example an EIO in __ocfs2_prepare_orphan_dir,
    ocfs2_prep_new_orphaned_file will release the inode_ac, then when the
    caller of ocfs2_prep_new_orphaned_file gets a 0 return, it will refer to
    a NULL ocfs2_alloc_context struct in the following functions. A kernel
    panic happens.

    Signed-off-by: "Xiaowei.Hu"
    Reviewed-by: shencanquan
    Acked-by: Sunil Mushran
    Cc: Joe Jin
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiaowei.Hu
     

25 May, 2013

2 commits

  • Last time we found a lock/unlock bug in ocfs2_file_aio_write(). We
    then did a thorough search of all lock resources in ocfs2_inode_info,
    including the rw, inode and open lockres, and found this bug. My
    kernel version is 3.0.13, and the bug is also present in the latest
    version, 3.9. In ocfs2_fiemap(), once ocfs2_get_clusters_nocache()
    fails, it should goto out_unlock instead of out, because we need to
    release the buffer head, up the read alloc sem and unlock the inode.

    Signed-off-by: Joseph Qi
    Reviewed-by: Jie Liu
    Cc: Mark Fasheh
    Cc: Joel Becker
    Acked-by: Sunil Mushran
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • In ocfs2_file_aio_write(), it does ocfs2_rw_lock() first and then
    ocfs2_inode_lock().

    But if ocfs2_inode_lock() failed, it goes to out_sems without unlocking
    rw lock. This will cause a bug in ocfs2_lock_res_free() when testing
    res->l_ex_holders, which is increased in __ocfs2_cluster_lock() and
    decreased in __ocfs2_cluster_unlock().

    Signed-off-by: Joseph Qi
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Li Zefan
    Cc: "Duyongfeng (B)"
    Acked-by: Sunil Mushran
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     

22 May, 2013

3 commits

  • ->invalidatepage() aop now accepts range to invalidate so we can make
    use of it in ocfs2_invalidatepage().

    Signed-off-by: Lukas Czerner
    Reviewed-by: Jan Kara
    Acked-by: Joel Becker

    Lukas Czerner
     
  • invalidatepage now accepts a range to invalidate, and two file
    systems using jbd2 also implement the punch hole feature, which can
    benefit from this. We need to implement the same thing in the jbd2
    layer to allow those file systems to benefit from this functionality.

    This commit adds length argument to the jbd2_journal_invalidatepage()
    and updates all instances in ext4 and ocfs2.

    Signed-off-by: Lukas Czerner
    Reviewed-by: Jan Kara

    Lukas Czerner
     
  • Currently there is no way to truncate partial page where the end
    truncate point is not at the end of the page. This is because it was not
    needed and the functionality was enough for file system truncate
    operation to work properly. However more file systems now support punch
    hole feature and it can benefit from mm supporting truncating page just
    up to the certain point.

    Specifically, with this functionality truncate_inode_pages_range() can
    be changed so it supports truncating partial page at the end of the
    range (currently it will BUG_ON() if 'end' is not at the end of the
    page).

    This commit changes the invalidatepage() address space operation
    prototype to accept range to be invalidated and update all the instances
    for it.

    We also change the block_invalidatepage() in the same way and actually
    make a use of the new length argument implementing range invalidation.

    Actual file system implementations will follow, except for the file
    systems where the changes are really simple and should not change the
    behaviour in any way. An implementation of truncate_page_range(), which
    will be able to accept page-unaligned ranges, will follow as well.

    Signed-off-by: Lukas Czerner
    Cc: Andrew Morton
    Cc: Hugh Dickins

    Lukas Czerner
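
Range invalidation within a page, as introduced in the entry above, comes down to deciding which blocks of the page fall entirely inside the (offset, length) range. The sketch below shows that arithmetic only; PAGE_BYTES and blocks_fully_in_range() are assumptions for illustration, not the kernel implementation.

```c
#include <assert.h>

#define PAGE_BYTES 4096u   /* assumed page size for the sketch */

/*
 * Count the blocks of a page that lie entirely inside the byte range
 * [offset, offset + length). Partially covered blocks are left alone.
 */
static unsigned int blocks_fully_in_range(unsigned int block_size,
					  unsigned int offset,
					  unsigned int length)
{
	unsigned int stop = offset + length;
	unsigned int start, count = 0;

	assert(block_size > 0 && stop <= PAGE_BYTES);

	for (start = 0; start + block_size <= PAGE_BYTES; start += block_size) {
		unsigned int end = start + block_size;

		if (end > stop)
			break;          /* this block extends past the range */
		if (start >= offset)
			count++;        /* fully covered: safe to discard */
	}
	return count;
}
```

With 1024-byte blocks, the range (1024, 2048) covers two whole blocks, while (512, 1024) covers none because it only touches parts of the first two blocks.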
     

08 May, 2013

2 commits

  • Faster kernel compiles by way of fewer unnecessary includes.

    [akpm@linux-foundation.org: fix fallout]
    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     
  • This removes the retry-based AIO infrastructure now that nothing in tree
    is using it.

    We want to remove retry-based AIO because it is fundamentally unsafe.
    It retries IO submission from a kernel thread that has only assumed the
    mm of the submitting task. All other task_struct references in the IO
    submission path will see the kernel thread, not the submitting task.
    This design flaw means that nothing of any meaningful complexity can use
    retry-based AIO.

    This removes all the code and data associated with the retry machinery.
    The most significant benefit of this is the removal of the locking
    around the unused run list in the submission path.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Kent Overstreet
    Signed-off-by: Zach Brown
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Acked-by: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zach Brown
     

02 May, 2013

1 commit

  • Pull VFS updates from Al Viro,

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     

30 Apr, 2013

5 commits