04 Aug, 2012

1 commit


02 Aug, 2012

1 commit

  • Pull second vfs pile from Al Viro:
    "The stuff in there: fsfreeze deadlock fixes by Jan (essentially, the
    deadlock reproduced by xfstests 068), symlink and hardlink restriction
    patches, plus assorted cleanups and fixes.

    Note that another fsfreeze deadlock (emergency thaw one) is *not*
    dealt with - the series by Fernando conflicts a lot with Jan's, breaks
    userland ABI (FIFREEZE semantics gets changed) and trades the deadlock
    for massive vfsmount leak; this is going to be handled next cycle.
    There probably will be another pull request, but that stuff won't be
    in it."

    Fix up trivial conflicts due to unrelated changes next to each other in
    drivers/{staging/gdm72xx/usb_boot.c, usb/gadget/storage_common.c}

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (54 commits)
    delousing target_core_file a bit
    Documentation: Correct s_umount state for freeze_fs/unfreeze_fs
    fs: Remove old freezing mechanism
    ext2: Implement freezing
    btrfs: Convert to new freezing mechanism
    nilfs2: Convert to new freezing mechanism
    ntfs: Convert to new freezing mechanism
    fuse: Convert to new freezing mechanism
    gfs2: Convert to new freezing mechanism
    ocfs2: Convert to new freezing mechanism
    xfs: Convert to new freezing code
    ext4: Convert to new freezing mechanism
    fs: Protect write paths by sb_start_write - sb_end_write
    fs: Skip atime update on frozen filesystem
    fs: Add freezing handling to mnt_want_write() / mnt_drop_write()
    fs: Improve filesystem freezing handling
    switch the protection of percpu_counter list to spinlock
    nfsd: Push mnt_want_write() outside of i_mutex
    btrfs: Push mnt_want_write() outside of i_mutex
    fat: Push mnt_want_write() outside of i_mutex
    ...

    Linus Torvalds
     

31 Jul, 2012

6 commits

  • We change nilfs_page_mkwrite() to provide proper freeze protection for
    writeable page faults (we must wait for a frozen filesystem even if the
    page is fully mapped).

    We remove all vfs_check_frozen() checks since they are now handled by
    the generic code.

    CC: linux-nilfs@vger.kernel.org
    CC: KONISHI Ryusuke
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     
  • Add omitted comments for different structures in driver implementation.

    Signed-off-by: Vyacheslav Dubeyko
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vyacheslav Dubeyko
     
  • An fs-thaw ioctl causes a deadlock with a chcp or mkcp -s command:

    chcp D ffff88013870f3d0 0 1325 1324 0x00000004
    ...
    Call Trace:
    nilfs_transaction_begin+0x11c/0x1a0 [nilfs2]
    wake_up_bit+0x20/0x20
    copy_from_user+0x18/0x30 [nilfs2]
    nilfs_ioctl_change_cpmode+0x7d/0xcf [nilfs2]
    nilfs_ioctl+0x252/0x61a [nilfs2]
    do_page_fault+0x311/0x34c
    get_unmapped_area+0x132/0x14e
    do_vfs_ioctl+0x44b/0x490
    __set_task_blocked+0x5a/0x61
    vm_mmap_pgoff+0x76/0x87
    __set_current_blocked+0x30/0x4a
    sys_ioctl+0x4b/0x6f
    system_call_fastpath+0x16/0x1b
    thaw D ffff88013870d890 0 1352 1351 0x00000004
    ...
    Call Trace:
    rwsem_down_failed_common+0xdb/0x10f
    call_rwsem_down_write_failed+0x13/0x20
    down_write+0x25/0x27
    thaw_super+0x13/0x9e
    do_vfs_ioctl+0x1f5/0x490
    vm_mmap_pgoff+0x76/0x87
    sys_ioctl+0x4b/0x6f
    filp_close+0x64/0x6c
    system_call_fastpath+0x16/0x1b

    where the thaw ioctl deadlocked at thaw_super() when called while chcp was
    waiting at nilfs_transaction_begin() called from
    nilfs_ioctl_change_cpmode(). This deadlock is 100% reproducible.

    This is because nilfs_ioctl_change_cpmode() first locks sb->s_umount in
    read mode and then waits for unfreezing in nilfs_transaction_begin(),
    whereas thaw_super() locks sb->s_umount in write mode. The locking of
    sb->s_umount here was intended to make snapshot mounts and the downgrade
    of snapshots to checkpoints exclusive.

    This fixes the deadlock issue by replacing the sb->s_umount usage in
    nilfs_ioctl_change_cpmode() with a dedicated mutex which protects snapshot
    mounts.

    Signed-off-by: Ryusuke Konishi
    Cc: Fernando Luis Vazquez Cao
    Tested-by: Ryusuke Konishi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
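
The lock ordering described in this entry can be illustrated with a small, single-threaded toy model. This is only a sketch: `toy_rwsem`, `toy_sb`, `toy_thaw` and the `snapshot_mutex_held` flag are hypothetical names invented for illustration, not kernel code. "Trylock" functions stand in for the blocking calls, so a failed trylock marks the point where the real code would sleep forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a reader/writer semaphore (NOT the kernel rwsem). */
struct toy_rwsem {
	int readers;
	bool writer;
};

static bool toy_down_read_trylock(struct toy_rwsem *s)
{
	if (s->writer)
		return false;
	s->readers++;
	return true;
}

static void toy_up_read(struct toy_rwsem *s)
{
	s->readers--;
}

static bool toy_down_write_trylock(struct toy_rwsem *s)
{
	if (s->writer || s->readers > 0)
		return false;
	s->writer = true;
	return true;
}

static void toy_up_write(struct toy_rwsem *s)
{
	s->writer = false;
}

/* Hypothetical superblock model. */
struct toy_sb {
	struct toy_rwsem s_umount;
	bool snapshot_mutex_held; /* stands in for the dedicated mutex */
	bool frozen;
};

/* Thaw needs s_umount in write mode; false means "would block". */
static bool toy_thaw(struct toy_sb *sb)
{
	if (!toy_down_write_trylock(&sb->s_umount))
		return false;
	sb->frozen = false;
	toy_up_write(&sb->s_umount);
	return true;
}

/* Old scheme: chcp holds s_umount (read) while waiting for the thaw. */
static bool old_scheme_can_thaw(void)
{
	struct toy_sb sb = { .frozen = true };
	bool thawed;

	toy_down_read_trylock(&sb.s_umount); /* chcp takes s_umount */
	thawed = toy_thaw(&sb);              /* thaw arrives while chcp waits */
	toy_up_read(&sb.s_umount);
	return thawed;
}

/* New scheme: chcp holds only the dedicated mutex while waiting. */
static bool new_scheme_can_thaw(void)
{
	struct toy_sb sb = { .frozen = true };
	bool thawed;

	sb.snapshot_mutex_held = true;
	thawed = toy_thaw(&sb);
	sb.snapshot_mutex_held = false;
	return thawed;
}
```

In this model `old_scheme_can_thaw()` returns false, because the thaw can never get the write lock while chcp holds the read side, and `new_scheme_can_thaw()` returns true.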
     
  • The checkpoint deletion ioctl (rmcp ioctl) can break snapshots because
    it is not fully exclusive with the checkpoint mode change ioctl (chcp
    ioctl).

    The rmcp ioctl first tests whether the specified checkpoint is a
    snapshot in the nilfs_cpfile_delete_checkpoint() function, and then
    calls nilfs_cpfile_delete_checkpoints() to actually invalidate the
    checkpoint only if it is not a snapshot. However, the checkpoint can be
    changed into a snapshot by the chcp ioctl between these two operations.
    In that case, calling nilfs_cpfile_delete_checkpoints() wrongly
    invalidates the snapshot, which leads to snapshot list corruption and
    a snapshot count mismatch.

    This fixes the issue by changing nilfs_cpfile_delete_checkpoints() so
    that it rechecks whether each target checkpoint is a snapshot.

    This second check is exclusive with the chcp operation since it is
    protected by an existing semaphore.

    Signed-off-by: Ryusuke Konishi
    Cc: Fernando Luis Vazquez Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
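
The check-then-act race in this entry can be sketched with a toy checkpoint record. The names below (`toy_checkpoint`, `toy_precheck`, `toy_delete_checked`, `toy_delete_unchecked`) and the choice of `-EBUSY` as the error code are assumptions made for illustration. The real functions work on the on-disk checkpoint file under a semaphore, which this single-threaded model leaves out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical in-memory stand-in for one checkpoint entry. */
struct toy_checkpoint {
	bool valid;
	bool snapshot;
};

/* First test, done before the deletion routine is called. */
static int toy_precheck(const struct toy_checkpoint *cp)
{
	return cp->snapshot ? -EBUSY : 0;
}

/* Models chcp turning a checkpoint into a snapshot. */
static void toy_make_snapshot(struct toy_checkpoint *cp)
{
	cp->snapshot = true;
}

/* Deletion without a second test: trusts the earlier precheck. */
static int toy_delete_unchecked(struct toy_checkpoint *cp)
{
	cp->valid = false;
	return 0;
}

/*
 * Deletion with a second test. In the scheme described above this
 * test would run under the lock that also serialises chcp, so the
 * answer cannot change between the test and the invalidation.
 */
static int toy_delete_checked(struct toy_checkpoint *cp)
{
	if (cp->snapshot)
		return -EBUSY;
	cp->valid = false;
	return 0;
}
```

If `toy_make_snapshot()` runs between `toy_precheck()` and the deletion, the unchecked variant invalidates a snapshot, while the checked variant refuses and leaves the entry intact.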
     
  • ->delete_inode(), ->write_super_lockfs(), ->unlockfs() are gone so remove
    references to them in the NTFS code. Noticed while cleaning up the
    fsfreeze mess.

    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fernando Luis Vazquez Cao
     
  • Add omitted comment for ns_mount_state field of the_nilfs structure.

    Signed-off-by: Vyacheslav Dubeyko
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vyacheslav Dubeyko
     

14 Jul, 2012

3 commits

  • Pass mount flags to sget() so that it can use them in initialising a new
    superblock before the set function is called. They could also be passed to the
    compare function.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     
  • A boolean "does it have to be exclusive?" flag is passed instead.
    Local filesystems should just ignore it - the object is guaranteed
    not to be there yet.

    Signed-off-by: Al Viro

    Al Viro
     
  • Just the flags; only NFS cares even about that, but there are
    legitimate uses for such argument. And getting rid of that
    completely would require splitting ->lookup() into a couple
    of methods (at least), so let's leave that alone for now...

    Signed-off-by: Al Viro

    Al Viro
     

21 Jun, 2012

1 commit

  • A gc-inode is a pseudo inode used to buffer the blocks to be moved by
    garbage collection.

    Block caches of gc-inodes must be cleared every time a garbage collection
    function (nilfs_clean_segments) completes. Otherwise, stale blocks
    buffered in the caches may be wrongly reused in successive calls of the GC
    function.

    For user files, this is not a problem because their gc-inodes are
    distinguished by a checkpoint number as well as an inode number. They
    never buffer different blocks if either an inode number, a checkpoint
    number, or a block offset differs.

    However, gc-inodes of sufile, cpfile and DAT file can store different data
    for the same block offset. Thus, the nilfs_clean_segments function can
    move an incorrect block for these meta-data files if an old block is
    cached. I found that this really does cause meta-data corruption in nilfs.

    This fixes the issue by ensuring that the caches of gc-inodes are cleared,
    and resolves reported GC problems including checkpoint file corruption,
    b-tree corruption, and the following warning during GC.

    nilfs_palloc_freev: entry number 307234 already freed.
    ...

    Signed-off-by: Ryusuke Konishi
    Tested-by: Ryusuke Konishi
    Cc: [2.6.37+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
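
The stale-cache problem in this entry can be sketched with a toy block cache keyed by (inode number, checkpoint number, block offset). Everything here is hypothetical (`toy_gc_cache`, `toy_gc_read`, `toy_gc_clear`, the fixed slot count), and the use of checkpoint number 0 for a meta-data file is an assumption of the model only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_CACHE_SLOTS 8

/* One cached block, identified by (ino, cno, blkoff). */
struct toy_block {
	bool used;
	uint64_t ino, cno, blkoff;
	int data;
};

struct toy_gc_cache {
	struct toy_block slot[TOY_CACHE_SLOTS];
};

/*
 * Return the cached data for the key if present; otherwise "read"
 * on_disk, cache it if a slot is free, and return it.
 */
static int toy_gc_read(struct toy_gc_cache *c, uint64_t ino, uint64_t cno,
		       uint64_t blkoff, int on_disk)
{
	struct toy_block *free_slot = NULL;

	for (int i = 0; i < TOY_CACHE_SLOTS; i++) {
		struct toy_block *b = &c->slot[i];

		if (b->used && b->ino == ino && b->cno == cno &&
		    b->blkoff == blkoff)
			return b->data; /* cache hit: possibly stale */
		if (!b->used && !free_slot)
			free_slot = b;
	}
	if (free_slot)
		*free_slot = (struct toy_block){ true, ino, cno, blkoff, on_disk };
	return on_disk;
}

/* Drop every cached block; meant to run at the end of each GC pass. */
static void toy_gc_clear(struct toy_gc_cache *c)
{
	memset(c, 0, sizeof(*c));
}
```

Reading the same key twice with different on-disk contents returns the first value until `toy_gc_clear()` is called, which is the behaviour the fix avoids for meta-data files.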
     

02 Jun, 2012

1 commit

  • Pull vfs changes from Al Viro.
    "A lot of misc stuff. The obvious groups:
    * Miklos' atomic_open series; kills the damn abuse of
    ->d_revalidate() by NFS, which was the major stumbling block for
    all work in that area.
    * ripping security_file_mmap() and dealing with deadlocks in the
    area; sanitizing the neighborhood of vm_mmap()/vm_munmap() in
    general.
    * ->encode_fh() switched to saner API; insane fake dentry in
    mm/cleancache.c gone.
    * assorted annotations in fs (endianness, __user)
    * parts of Artem's ->s_dirty work (jffs2 and reiserfs parts)
    * ->update_time() work from Josef.
    * other bits and pieces all over the place.

    Normally it would've been in two or three pull requests, but
    signal.git stuff had eaten a lot of time during this cycle ;-/"

    Fix up trivial conflicts in Documentation/filesystems/vfs.txt (the
    'truncate_range' inode method was removed by the VM changes, the VFS
    update adds an 'update_time()' method), and in fs/btrfs/ulist.[ch] (due
    to sparse fix added twice, with other changes nearby).

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (95 commits)
    nfs: don't open in ->d_revalidate
    vfs: retry last component if opening stale dentry
    vfs: nameidata_to_filp(): don't throw away file on error
    vfs: nameidata_to_filp(): inline __dentry_open()
    vfs: do_dentry_open(): don't put filp
    vfs: split __dentry_open()
    vfs: do_last() common post lookup
    vfs: do_last(): add audit_inode before open
    vfs: do_last(): only return EISDIR for O_CREAT
    vfs: do_last(): check LOOKUP_DIRECTORY
    vfs: do_last(): make ENOENT exit RCU safe
    vfs: make follow_link check RCU safe
    vfs: do_last(): use inode variable
    vfs: do_last(): inline walk_component()
    vfs: do_last(): make exit RCU safe
    vfs: split do_lookup()
    Btrfs: move over to use ->update_time
    fs: introduce inode operation ->update_time
    reiserfs: get rid of resierfs_sync_super
    reiserfs: mark the superblock as dirty a bit later
    ...

    Linus Torvalds
     

01 Jun, 2012

1 commit

  • There are two cases in which a cache flush is needed to avoid data loss
    on an unexpected hang or power failure. One is the sync file function
    (i.e. nilfs_sync_file) and the other is the checkpointing ioctl.

    This issues a cache flush request to the device in both cases if the
    barrier mount option is enabled, and makes sure the data really is on
    persistent storage when they complete.

    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     

30 May, 2012

1 commit

  • pass inode + parent's inode or NULL instead of dentry + bool saying
    whether we want the parent or not.

    NOTE: that needs ceph fix folded in.

    Signed-off-by: Al Viro

    Al Viro
     

29 May, 2012

1 commit

  • Pull writeback tree from Wu Fengguang:
    "Mainly from Jan Kara to avoid iput() in the flusher threads."

    * tag 'writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
    writeback: Avoid iput() from flusher thread
    vfs: Rename end_writeback() to clear_inode()
    vfs: Move waiting for inode writeback from end_writeback() to evict_inode()
    writeback: Refactor writeback_single_inode()
    writeback: Remove wb->list_lock from writeback_single_inode()
    writeback: Separate inode requeueing after writeback
    writeback: Move I_DIRTY_PAGES handling
    writeback: Move requeueing when I_SYNC set to writeback_sb_inodes()
    writeback: Move clearing of I_SYNC into inode_sync_complete()
    writeback: initialize global_dirty_limit
    fs: remove 8 bytes of padding from struct writeback_control on 64 bit builds
    mm: page-writeback.c: local functions should not be exposed globally

    Linus Torvalds
     

11 May, 2012

1 commit

  • This allows comparing hash and len in one operation on 64-bit
    architectures. Right now only __d_lookup_rcu() takes advantage of this,
    since that is the case we care most about.

    The use of anonymous struct/unions hides the alternate 64-bit approach
    from most users, the exception being a few cases where we initialize a
    'struct qstr' with a static initializer. This makes the problematic
    cases use a new QSTR_INIT() helper macro for that (but initializing
    just the name pointer with a "{ .name = xyzzy }" initializer remains
    valid, as does just copying another qstr structure).

    Signed-off-by: Linus Torvalds

    Linus Torvalds
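
The overlay described in this entry can be sketched in userspace C11. This is a simplified model, not the kernel's definition: `toy_qstr`, `TOY_QSTR_INIT` and `toy_same_hash_and_len` are hypothetical names, and the model ignores the endianness-dependent field order a real implementation has to consider.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Name descriptor whose hash and length can also be read as one u64. */
struct toy_qstr {
	union {
		struct {
			uint32_t hash;
			uint32_t len;
		};
		uint64_t hash_len;
	};
	const unsigned char *name;
};

/*
 * Hypothetical static initializer in the spirit of QSTR_INIT():
 * the nested braces walk into the anonymous union and struct.
 */
#define TOY_QSTR_INIT(n, l) \
	{ { { .len = (l) } }, .name = (const unsigned char *)(n) }

/* Compare hash and length with a single 64-bit comparison. */
static bool toy_same_hash_and_len(const struct toy_qstr *a,
				  const struct toy_qstr *b)
{
	return a->hash_len == b->hash_len;
}
```

Callers that only read `hash`, `len` or `name` are unaffected by the union. Only static initializers that set `len` positionally need the macro, while `{ .name = ... }` still works as before.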
     

06 May, 2012

1 commit

  • After we moved inode_sync_wait() out of end_writeback(), it no longer
    makes sense to call the function end_writeback(). Rename it to
    clear_inode(), which better describes what the function really does:
    set the I_CLEAR flag.

    Signed-off-by: Jan Kara
    Signed-off-by: Fengguang Wu

    Jan Kara
     

22 Mar, 2012

1 commit

  • Pull vfs pile 1 from Al Viro:
    "This is _not_ all; in particular, Miklos' and Jan's stuff is not there
    yet."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
    ext4: initialization of ext4_li_mtx needs to be done earlier
    debugfs-related mode_t whack-a-mole
    hfsplus: add an ioctl to bless files
    hfsplus: change finder_info to u32
    hfsplus: initialise userflags
    qnx4: new helper - try_extent()
    qnx4: get rid of qnx4_bread/qnx4_getblk
    take removal of PF_FORKNOEXEC to flush_old_exec()
    trim includes in inode.c
    um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
    um: embed ->stub_pages[] into mmu_context
    gadgetfs: list_for_each_safe() misuse
    ocfs2: fix leaks on failure exits in module_init
    ecryptfs: make register_filesystem() the last potential failure exit
    ntfs: forgets to unregister sysctls on register_filesystem() failure
    logfs: missing cleanup on register_filesystem() failure
    jfs: mising cleanup on register_filesystem() failure
    make configfs_pin_fs() return root dentry on success
    configfs: configfs_create_dir() has parent dentry in dentry->d_parent
    configfs: sanitize configfs_create()
    ...

    Linus Torvalds
     

21 Mar, 2012

2 commits


20 Mar, 2012

1 commit


17 Mar, 2012

2 commits

  • According to the report from Slicky Devil, nilfs caused a kernel oops in
    the nilfs_load_super_block function during mount after he shrank the
    partition without resizing the filesystem:

    BUG: unable to handle kernel NULL pointer dereference at 00000048
    IP: [] nilfs_load_super_block+0x17e/0x280 [nilfs2]
    *pde = 00000000
    Oops: 0000 [#1] PREEMPT SMP
    ...
    Call Trace:
    [] init_nilfs+0x4b/0x2e0 [nilfs2]
    [] nilfs_mount+0x447/0x5b0 [nilfs2]
    [] mount_fs+0x36/0x180
    [] vfs_kern_mount+0x51/0xa0
    [] do_kern_mount+0x3e/0xe0
    [] do_mount+0x169/0x700
    [] sys_mount+0x6b/0xa0
    [] sysenter_do_call+0x12/0x28
    Code: 53 18 8b 43 20 89 4b 18 8b 4b 24 89 53 1c 89 43 24 89 4b 20 8b 43
    20 c7 43 2c 00 00 00 00 23 75 e8 8b 50 68 89 53 28 8b 54 b3 20 72
    48 8b 7a 4c 8b 55 08 89 b3 84 00 00 00 89 bb 88 00 00 00
    EIP: [] nilfs_load_super_block+0x17e/0x280 [nilfs2] SS:ESP 0068:ca9bbdcc
    CR2: 0000000000000048

    This turned out to be due to a defect in an error path which runs if the
    calculated location of the secondary super block is invalid.

    This patch fixes it and eliminates the reported oops.

    Reported-by: Slicky Devil
    Signed-off-by: Ryusuke Konishi
    Tested-by: Slicky Devil
    Cc: [2.6.30+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ryusuke Konishi
     
  • ns_r_segments_percentage is read from the disk. A bogus or malicious
    value could cause an integer overflow and malfunction due to a
    meaningless disk usage calculation. This patch reports an error when
    mounting such bogus volumes.

    Signed-off-by: Haogang Chen
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Haogang Chen
     

09 Feb, 2012

1 commit

  • nsegs is read from userspace. Limit its value to avoid overflowing
    nsegs * sizeof(__u64) in the subsequent call to memdup_user().

    This patch complements 481fe17e973fb9 ("nilfs2: potential integer overflow
    in nilfs_ioctl_clean_segments()").

    Signed-off-by: Xi Wang
    Cc: Haogang Chen
    Acked-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xi Wang
     

09 Jan, 2012

1 commit

  • * 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (76 commits)
    PM / Hibernate: Implement compat_ioctl for /dev/snapshot
    PM / Freezer: fix return value of freezable_schedule_timeout_killable()
    PM / shmobile: Allow the A4R domain to be turned off at run time
    PM / input / touchscreen: Make st1232 use device PM QoS constraints
    PM / QoS: Introduce dev_pm_qos_add_ancestor_request()
    PM / shmobile: Remove the stay_on flag from SH7372's PM domains
    PM / shmobile: Don't include SH7372's INTCS in syscore suspend/resume
    PM / shmobile: Add support for the sh7372 A4S power domain / sleep mode
    PM: Drop generic_subsys_pm_ops
    PM / Sleep: Remove forward-only callbacks from AMBA bus type
    PM / Sleep: Remove forward-only callbacks from platform bus type
    PM: Run the driver callback directly if the subsystem one is not there
    PM / Sleep: Make pm_op() and pm_noirq_op() return callback pointers
    PM/Devfreq: Add Exynos4-bus device DVFS driver for Exynos4210/4212/4412.
    PM / Sleep: Merge internal functions in generic_ops.c
    PM / Sleep: Simplify generic system suspend callbacks
    PM / Hibernate: Remove deprecated hibernation snapshot ioctls
    PM / Sleep: Fix freezer failures due to racy usermodehelper_is_disabled()
    ARM: S3C64XX: Implement basic power domain support
    PM / shmobile: Use common always on power domain governor
    ...

    Fix up trivial conflict in fs/xfs/xfs_buf.c due to removal of unused
    XBT_FORCE_SLEEP bit

    Linus Torvalds
     

07 Jan, 2012

1 commit


04 Jan, 2012

7 commits


22 Dec, 2011

1 commit

  • * master: (848 commits)
    SELinux: Fix RCU deref check warning in sel_netport_insert()
    binary_sysctl(): fix memory leak
    mm/vmalloc.c: remove static declaration of va from __get_vm_area_node
    ipmi_watchdog: restore settings when BMC reset
    oom: fix integer overflow of points in oom_badness
    memcg: keep root group unchanged if creation fails
    nilfs2: potential integer overflow in nilfs_ioctl_clean_segments()
    nilfs2: unbreak compat ioctl
    cpusets: stall when updating mems_allowed for mempolicy or disjoint nodemask
    evm: prevent racing during tfm allocation
    evm: key must be set once during initialization
    mmc: vub300: fix type of firmware_rom_wait_states module parameter
    Revert "mmc: enable runtime PM by default"
    mmc: sdhci: remove "state" argument from sdhci_suspend_host
    x86, dumpstack: Fix code bytes breakage due to missing KERN_CONT
    IB/qib: Correct sense on freectxts increment and decrement
    RDMA/cma: Verify private data length
    cgroups: fix a css_set not found bug in cgroup_attach_proc
    oprofile: Fix uninitialized memory access when writing to writing to oprofilefs
    Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
    ...

    Conflicts:
    kernel/cgroup_freezer.c

    Rafael J. Wysocki
     

21 Dec, 2011

2 commits

  • There is a potential integer overflow in nilfs_ioctl_clean_segments().
    When a large argv[n].v_nmembs is passed from userspace, the subsequent
    call to vmalloc() will allocate a buffer smaller than expected, which
    leads to out-of-bounds access in nilfs_ioctl_move_blocks() and
    nilfs_clean_segments().

    The following check does not prevent the overflow because nsegs is also
    controlled by userspace and could be very large.

    if (argv[n].v_nmembs > nsegs * nilfs->ns_blocks_per_segment)
            goto out_free;

    This patch checks argv[n].v_nmembs against UINT_MAX / argv[n].v_size and
    returns -EINVAL if the multiplication would overflow.

    Signed-off-by: Haogang Chen
    Signed-off-by: Ryusuke Konishi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Haogang Chen
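
The bound described in this entry can be sketched as a standalone helper. This is a userspace model under assumptions: `argv_model` and `checked_buffer_len` are hypothetical names, the field widths are chosen for the sketch, and the extra zero-size test is added here only to keep the division well defined.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for one userspace-supplied argument vector. */
struct argv_model {
	uint32_t v_nmembs; /* number of members, caller-controlled */
	uint16_t v_size;   /* size of each member in bytes */
};

/*
 * Store v_nmembs * v_size in *out. Returns 0 on success, or -EINVAL
 * if v_size is zero or the product would not fit in an unsigned int.
 */
static int checked_buffer_len(const struct argv_model *a, unsigned int *out)
{
	if (a->v_size == 0)
		return -EINVAL; /* assumption of this sketch */
	if (a->v_nmembs > UINT_MAX / a->v_size)
		return -EINVAL; /* multiplication would wrap */
	*out = a->v_nmembs * a->v_size;
	return 0;
}
```

Dividing the limit by one factor before multiplying is what makes the test safe: the comparison itself cannot overflow, whereas testing the product after the fact would already be working with a wrapped value.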
     
  • Commit 828b1c50ae ("nilfs2: add compat ioctl") inadvertently broke all
    other NILFS compat ioctls. Make them work again.

    Signed-off-by: Thomas Meyer
    Signed-off-by: Ryusuke Konishi
    Tested-by: Ryusuke Konishi
    Cc: [3.0+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Meyer
     

22 Nov, 2011

1 commit

  • There is no reason to export two functions for entering the
    refrigerator. Calling refrigerator() instead of try_to_freeze()
    doesn't save anything noticeable or remove any race condition.

    * Rename refrigerator() to __refrigerator() and make it return bool
    indicating whether it scheduled out for freezing.

    * Update try_to_freeze() to return bool and relay the return value of
    __refrigerator() if freezing().

    * Convert all refrigerator() users to try_to_freeze().

    * Update documentation accordingly.

    * While at it, add might_sleep() to try_to_freeze().

    Signed-off-by: Tejun Heo
    Cc: Samuel Ortiz
    Cc: Chris Mason
    Cc: "Theodore Ts'o"
    Cc: Steven Whitehouse
    Cc: Andrew Morton
    Cc: Jan Kara
    Cc: KONISHI Ryusuke
    Cc: Christoph Hellwig

    Tejun Heo
     

02 Nov, 2011

2 commits