11 Mar, 2013

1 commit

  • With the commit 3be2be0a32c18b0fd6d623cda63174a332ca0de1 we removed vmtruncate,
    but actually there is no need to call inode_newsize_ok() because the checks are
    already done in inode_change_ok() at the beginning of the function.

    Signed-off-by: Marco Stornelli
    Signed-off-by: Richard Weinberger

    Marco Stornelli
     

10 Mar, 2013

1 commit

  • Pull namespace bugfixes from Eric Biederman:
    "This is three simple fixes against 3.9-rc1. I have tested each of
    these fixes and verified they work correctly.

    The userns oops in key_change_session_keyring and the BUG_ON triggered
    by proc_ns_follow_link were found by Dave Jones.

    I am including the enhancement for mount to only trigger requests of
    filesystem modules here instead of delaying this for the 3.10 merge
    window because it is both trivial and the kind of change that tends to
    bit-rot if left untouched for two months."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    proc: Use nd_jump_link in proc_ns_follow_link
    fs: Limit sys_mount to only request filesystem modules (Part 2).
    fs: Limit sys_mount to only request filesystem modules.
    userns: Stop oopsing in key_change_session_keyring

    Linus Torvalds
     

09 Mar, 2013

4 commits

  • Update proc_ns_follow_link to use nd_jump_link instead of just
    manually updating nd.path.dentry.

    This fixes the BUG_ON(nd->inode != parent->d_inode) reported by Dave
    Jones and reproduced trivially with mkdir /proc/self/ns/uts/a.

    Sigh, it looks like the VFS change to require use of nd_jump_link
    happened while proc_ns_follow_link was baking, and since the common case
    of proc_ns_follow_link continued to work without problems, the need for
    making this change was overlooked.

    Cc: stable@vger.kernel.org
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Pull btrfs fixes from Chris Mason:
    "These are scattered fixes and one performance improvement. The
    biggest functional change is in how we throttle metadata changes. The
    new code bumps our average file creation rate up by ~13% in fs_mark,
    and lowers CPU usage.

    Stefan bisected out a regression in our allocation code that made
    balance loop on extents larger than 256MB."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: improve the delayed inode throttling
    Btrfs: fix a mismerge in btrfs_balance()
    Btrfs: enforce min_bytes parameter during extent allocation
    Btrfs: allow running defrag in parallel to administrative tasks
    Btrfs: avoid deadlock on transaction waiting list
    Btrfs: do not BUG_ON on aborted situation
    Btrfs: do not BUG_ON in prepare_to_reloc
    Btrfs: free all recorded tree blocks on error
    Btrfs: build up error handling for merge_reloc_roots
    Btrfs: check for NULL pointer in updating reloc roots
    Btrfs: fix unclosed transaction handler when the async transaction commitment fails
    Btrfs: fix wrong handle at error path of create_snapshot() when the commit fails
    Btrfs: use set_nlink if our i_nlink is 0

    Linus Torvalds
     
  • Pull CIFS fixes from Steve French:
    "A small set of cifs fixes which includes one for a recent regression
    in the write path (pointed out by Anton), some fixes for rename
    problems and as promised for 3.9 removing the obsolete sockopt mount
    option (and the accompanying deprecation warning)."

    * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
    CIFS: Fix missing of oplock_read value in smb30_values structure
    cifs: don't try to unlock pagecache page after releasing it
    cifs: remove the sockopt= mount option
    cifs: Check server capability before attempting silly rename
    cifs: Fix bug when checking error condition in cifs_rename_pending_delete()

    Linus Torvalds
     
  • It's "normal" - it can happen if the file descriptor you followed was
    opened with O_NOFOLLOW.

    Reported-by: Dave Jones
    Cc: Al Viro
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

08 Mar, 2013

1 commit

  • Pull ecryptfs fixes from Tyler Hicks:
    "Minor code cleanups and new Kconfig option to disable /dev/ecryptfs

    The code cleanups fix up W=1 compiler warnings and some unnecessary
    checks. The new Kconfig option, defaulting to N, allows the rarely
    used eCryptfs kernel to userspace communication channel to be compiled
    out. This may be the first step in it being eventually removed."

    Hmm. I'm not sure whether these should be called "fixes", and it
    probably should have gone in the merge window. But I'll let it slide.

    * tag 'ecryptfs-3.9-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
    eCryptfs: allow userspace messaging to be disabled
    eCryptfs: Fix redundant error check on ecryptfs_find_daemon_by_euid()
    ecryptfs: ecryptfs_msg_ctx_alloc_to_free(): remove kfree() redundant null check
    eCryptfs: decrypt_pki_encrypted_session_key(): remove kfree() redundant null check
    eCryptfs: remove unneeded checks in virt_to_scatterlist()
    eCryptfs: Fix -Wmissing-prototypes warnings
    eCryptfs: Fix -Wunused-but-set-variable warnings
    eCryptfs: initialize payload_len in keystore.c

    Linus Torvalds
     

07 Mar, 2013

9 commits


06 Mar, 2013

1 commit

  • Commit 24542bf7ea5e4fdfdb5157ff544c093fa4dcb536 changed preallocation of
    extents to cap the max size we try to allocate. It's a valid change,
    but the extent reservation code is also used by balance, and that
    can't tolerate a smaller extent being allocated.

    __btrfs_prealloc_file_range already has a min_size parameter, which is
    used by relocation to request a specific extent size. This commit
    adds an extra check to enforce that minimum extent size.

    Signed-off-by: Chris Mason
    Reported-by: Stefan Behrens

    Chris Mason
     

05 Mar, 2013

10 commits

  • Commit 5ac00add added a test-and-set mutex and code that disallows
    running administrative tasks in parallel. It prevents the device
    add/delete/balance/replace/resize operations from being started
    in parallel. By mistake, the defragmentation operation was
    included in the mutual exclusion check as well.
    This commit fixes that.

    Signed-off-by: Stefan Behrens
    Signed-off-by: Josef Bacik

    Stefan Behrens
     
  • Only let one trans handle wait for other handles; otherwise we
    will get ABBA issues.

    Signed-off-by: Liu Bo
    Signed-off-by: Josef Bacik

    Liu Bo
     
  • Btrfs balance can easily hit BUG_ON in these places, but we want
    it to bail out gracefully after we force the whole filesystem to
    readonly. So we use the btrfs_std_error hook in place of BUG_ON.

    Signed-off-by: Liu Bo
    Signed-off-by: Josef Bacik

    Liu Bo
     
  • We can bail out from here gracefully instead of a cold BUG_ON.

    Signed-off-by: Liu Bo
    Signed-off-by: Josef Bacik

    Liu Bo
     
  • We've missed the 'free blocks' part on ENOMEM error.

    Signed-off-by: Liu Bo
    Signed-off-by: Josef Bacik

    Liu Bo
     
  • We first use the btrfs_std_error hook to replace BUG_ON, and we
    also need to clean up what is left, including the reloc roots rbtree
    and the reloc roots list.
    Here we use a helper function to clean up both the rbtree and the list,
    and since this function can also be used in the balance recovery path,
    we make the change there as well to keep the code simple.

    Signed-off-by: Liu Bo
    Signed-off-by: Josef Bacik

    Liu Bo
     
  • Add a check for NULL pointer to avoid invalid reference.

    Signed-off-by: Liu Bo
    Signed-off-by: Josef Bacik

    Liu Bo
     
  • If the async transaction commitment failed, we need to close the
    current transaction handler, or the current transaction will be
    blocked from committing because of this orphan handler.

    We fix the problem by doing a sync transaction commitment, that is,
    by invoking btrfs_commit_transaction().

    Signed-off-by: Miao Xie
    Signed-off-by: Josef Bacik

    Miao Xie
     
  • There are several bugs in the error path of create_snapshot() when the
    transaction commitment fails:
    - Access to the freed transaction handler. At the end of the
      transaction commitment, the transaction handler was freed, so we
      should not access it after the transaction commitment.
    - We were not aware of an error that happened during the snapshot
      creation if we submitted an async transaction commitment.
    - Pending snapshot access vs. pending snapshot free. When something
      went wrong after we submitted an async transaction commitment,
      the transaction committer would clean up the pending snapshots and
      free them. But the snapshot creators were not aware of it, so they
      would access the freed pending snapshots.

    This patch fixes the above problems by:
    - removing the dangerous code that accessed the freed handler
    - assigning ->error if an error happens during the snapshot creation
    - making the transaction committer not free the pending snapshots;
      it just assigns the error number and evicts them before we unblock
      the transaction.

    Reported-by: Dan Carpenter
    Signed-off-by: Miao Xie
    Signed-off-by: Josef Bacik

    Miao Xie
     
  • We need to inc the nlink of deleted entries when running replay so we can do the
    unlink on the fs_root and get everything cleaned up and then have the orphan
    cleanup do the right thing. The problem is inc_nlink complains about this, even
    though it still does the right thing. So use set_nlink() if our i_nlink is 0
    to keep users from seeing the warnings during log replay. Thanks,

    Signed-off-by: Josef Bacik

    Josef Bacik
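
    The choice described above can be modelled in user space. This is a
    minimal sketch, not the btrfs code: struct toy_inode, toy_set_nlink(),
    toy_inc_nlink() and replay_inc_nlink() are hypothetical names, and the
    "warned" flag merely stands in for the warning that inc_nlink emits
    when the link count is zero.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few inode fields this sketch needs. */
struct toy_inode {
    unsigned int nlink;  /* link count */
    bool warned;         /* set when the modelled warning would fire */
};

/* Assign the link count directly; never warns. */
static void toy_set_nlink(struct toy_inode *inode, unsigned int n)
{
    inode->nlink = n;
}

/* Increment the link count; going up from zero triggers the modelled warning. */
static void toy_inc_nlink(struct toy_inode *inode)
{
    if (inode->nlink == 0)
        inode->warned = true;
    inode->nlink++;
}

/* Replay-path helper: same result as an increment, without the warning at zero. */
static void replay_inc_nlink(struct toy_inode *inode)
{
    if (inode->nlink == 0)
        toy_set_nlink(inode, 1);
    else
        toy_inc_nlink(inode);
}
```

    Both paths leave a zero-link inode with a count of 1; only the plain
    increment sets the warning flag.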
     

04 Mar, 2013

5 commits

  • When the userspace messaging (for the less common case of userspace key
    wrap/unwrap via ecryptfsd) is not needed, allow eCryptfs to build with
    it removed. This saves on kernel code size and reduces potential attack
    surface by removing the /dev/ecryptfs node.

    Signed-off-by: Kees Cook
    Signed-off-by: Tyler Hicks

    Kees Cook
     
  • Modify the request_module to prefix the file system type with "fs-"
    and add aliases to all of the filesystems that can be built as modules
    to match.

    A common practice is to build all of the kernel code and leave code
    that is not commonly needed as modules, with the result that many
    users are exposed to any bug anywhere in the kernel.

    Looking for filesystems with a fs- prefix limits the pool of possible
    modules that can be loaded by mount to just filesystems, trivially
    making things safer with no real cost.

    Using aliases means user space can control the policy of which
    filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
    with blacklist and alias directives, allowing simple, safe,
    well-understood workarounds for known problematic software.

    This also addresses a rare but unfortunate problem where the filesystem
    name is not the same as its module name and module auto-loading
    would not work. While writing this patch I saw a handful of such
    cases, the most significant being autofs, which lives in the module
    autofs4.

    This is relevant to user namespaces because we can reach the request
    module in get_fs_type() without having any special permissions, and
    people get uncomfortable when a user specified string (in this case
    the filesystem type) goes all of the way to request_module.

    After having looked at this issue I don't think there is any
    particular reason to perform any filtering or permission checks beyond
    making it clear in the module request that we want a filesystem
    module. The common pattern in the kernel is to call request_module()
    without regard to the user's permissions. In general, all a filesystem
    module does once loaded is call register_filesystem() and go to sleep,
    which means there is not much attack surface exposed by loading a
    filesystem module unless the filesystem is mounted. In a user
    namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
    which most filesystems do not set today.

    Acked-by: Serge Hallyn
    Acked-by: Kees Cook
    Reported-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
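
    The naming scheme described above can be illustrated with a small
    user-space sketch. It is not the kernel change itself:
    fs_module_alias() is a hypothetical helper that only shows how a
    user-supplied filesystem type is turned into an "fs-" prefixed name
    before any module lookup would happen.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "fs-<fstype>" into buf.
 * Returns 0 on success, -1 if the result did not fit in len bytes.
 */
static int fs_module_alias(char *buf, size_t len, const char *fstype)
{
    int n = snprintf(buf, len, "fs-%s", fstype);

    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

    With this scheme a request for "autofs" becomes a lookup of
    "fs-autofs", which a module with a different name (autofs4 in the
    example from the commit message) can answer through an alias.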
     
  • Pull more VFS bits from Al Viro:
    "Unfortunately, it looks like xattr series will have to wait until the
    next cycle ;-/

    This pile contains 9p cleanups and fixes (races in v9fs_fid_add()
    etc), fixup for nommu breakage in shmem.c, several cleanups and a bit
    more file_inode() work"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    constify path_get/path_put and fs_struct.c stuff
    fix nommu breakage in shmem.c
    cache the value of file_inode() in struct file
    9p: if v9fs_fid_lookup() gets to asking server, it'd better have hashed dentry
    9p: make sure ->lookup() adds fid to the right dentry
    9p: untangle ->lookup() a bit
    9p: double iput() in ->lookup() if d_materialise_unique() fails
    9p: v9fs_fid_add() can't fail now
    v9fs: get rid of v9fs_dentry
    9p: turn fid->dlist into hlist
    9p: don't bother with private lock in ->d_fsdata; dentry->d_lock will do just fine
    more file_inode() open-coded instances
    selinux: opened file can't have NULL or negative ->f_path.dentry

    (In the meantime, the hlist traversal macros have changed, so this
    required a semantic conflict fixup for the newly hlistified fid->dlist)

    Linus Torvalds
     
  • Pull btrfs fixup from Chris Mason:
    "Geert and James both sent this one in, sorry guys"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    btrfs/raid56: Add missing #include

    Linus Torvalds
     
  • Pull new ImgTec Meta architecture from James Hogan:
    "This adds core architecture support for Imagination's Meta processor
    cores, followed by some later miscellaneous arch/metag cleanups and
    fixes which I kept separate to ease review:

    - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
    - A few fixes all over, particularly for symbol prefixes
    - A few privilege protection fixes
    - Several cleanups (setup.c includes, split out a lot of
    metag_ksyms.c)
    - Fix some missing exports
    - Convert hugetlb to use vm_unmapped_area()
    - Copy device tree to non-init memory
    - Provide dma_get_sgtable()"

    * tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
    metag: Provide dma_get_sgtable()
    metag: prom.h: remove declaration of metag_dt_memblock_reserve()
    metag: copy devicetree to non-init memory
    metag: cleanup metag_ksyms.c includes
    metag: move mm/init.c exports out of metag_ksyms.c
    metag: move usercopy.c exports out of metag_ksyms.c
    metag: move setup.c exports out of metag_ksyms.c
    metag: move kick.c exports out of metag_ksyms.c
    metag: move traps.c exports out of metag_ksyms.c
    metag: move irq enable out of irqflags.h on SMP
    genksyms: fix metag symbol prefix on crc symbols
    metag: hugetlb: convert to vm_unmapped_area()
    metag: export clear_page and copy_page
    metag: export metag_code_cache_flush_all
    metag: protect more non-MMU memory regions
    metag: make TXPRIVEXT bits explicit
    metag: kernel/setup.c: sort includes
    perf: Enable building perf tools for Meta
    metag: add boot time LNKGET/LNKSET check
    metag: add __init to metag_cache_probe()
    ...

    Linus Torvalds
     

03 Mar, 2013

8 commits

  • tilegx_defconfig:

    fs/btrfs/raid56.c: In function 'btrfs_alloc_stripe_hash_table':
    fs/btrfs/raid56.c:206:3: error: implicit declaration of function 'vzalloc' [-Werror=implicit-function-declaration]
    fs/btrfs/raid56.c:206:9: warning: assignment makes pointer from integer without a cast [enabled by default]
    fs/btrfs/raid56.c:226:4: error: implicit declaration of function 'vfree' [-Werror=implicit-function-declaration]

    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Chris Mason

    Geert Uytterhoeven
     
  • Pull ext4 bug fixes from Ted Ts'o:
    "Various bug fixes for ext4. The most important is a fix for the new
    extent cache's slab shrinker which can cause significant, user-visible
    pauses when the system is under memory pressure."

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: enable quotas before orphan cleanup
    ext4: don't allow quota mount options when quota feature enabled
    ext4: fix a warning from sparse check for ext4_dir_llseek
    ext4: convert number of blocks to clusters properly
    ext4: fix possible memory leak in ext4_remount()
    jbd2: fix ERR_PTR dereference in jbd2__journal_start
    ext4: use percpu counter for extent cache count
    ext4: optimize ext4_es_shrink()

    Linus Torvalds
     
  • Pull NFS client bugfixes from Trond Myklebust:
    "We've just concluded another Connectathon interoperability testing
    week, and so here are the fixes for the bugs that were discovered:

    - Don't allow NFS silly-renamed files to be deleted
    - Don't start the retransmission timer when out of socket space
    - Fix a couple of pnfs-related Oopses.
    - Fix one more NFSv4 state recovery deadlock
    - Don't loop forever when LAYOUTGET returns NFS4ERR_LAYOUTTRYLATER"

    * tag 'nfs-for-3.9-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    SUNRPC: One line comment fix
    NFSv4.1: LAYOUTGET EDELAY loops timeout to the MDS
    SUNRPC: add call to get configured timeout
    PNFS: set the default DS timeout to 60 seconds
    NFSv4: Fix another open/open_recovery deadlock
    nfs: don't allow nfs_find_actor to match inodes of the wrong type
    NFSv4.1: Hold reference to layout hdr in layoutget
    pnfs: fix resend_to_mds for directio
    SUNRPC: Don't start the retransmission timer when out of socket space
    NFS: Don't allow NFS silly-renamed files to be deleted, no signal

    Linus Torvalds
     
  • Pull btrfs update from Chris Mason:
    "The biggest feature in the pull is the new (and still experimental)
    raid56 code that David Woodhouse started long ago. I'm still working
    on the parity logging setup that will avoid inconsistent parity after
    a crash, so this is only for testing right now. But, I'd really like
    to get it out to a broader audience to hammer out any performance
    issues or other problems.

    scrub does not yet correct errors on raid5/6 either.

    Josef has another pass at fsync performance. The big change here is
    to combine waiting for metadata with waiting for data, which is a big
    latency win. It is also step one toward using atomics from the
    hardware during a commit.

    Mark Fasheh has a new way to use btrfs send/receive to send only the
    metadata changes. SUSE is using this to make snapper more efficient
    at finding changes between snapshots.

    Snapshot-aware defrag is also included.

    Otherwise we have a large number of fixes and cleanups. Eric Sandeen
    wins the award for removing the most lines, and I'm hoping we steal
    this idea from XFS over and over again."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (118 commits)
    btrfs: fixup/remove module.h usage as required
    Btrfs: delete inline extents when we find them during logging
    btrfs: try harder to allocate raid56 stripe cache
    Btrfs: cleanup to make the function btrfs_delalloc_reserve_metadata more logic
    Btrfs: don't call btrfs_qgroup_free if just btrfs_qgroup_reserve fails
    Btrfs: remove reduplicate check about root in the function btrfs_clean_quota_tree
    Btrfs: return ENOMEM rather than use BUG_ON when btrfs_alloc_path fails
    Btrfs: fix missing deleted items in btrfs_clean_quota_tree
    btrfs: use only inline_pages from extent buffer
    Btrfs: fix wrong reserved space when deleting a snapshot/subvolume
    Btrfs: fix wrong reserved space in qgroup during snap/subv creation
    Btrfs: remove unnecessary dget_parent/dput when creating the pending snapshot
    btrfs: remove a printk from scan_one_device
    Btrfs: fix NULL pointer after aborting a transaction
    Btrfs: fix memory leak of log roots
    Btrfs: copy everything if we've created an inline extent
    btrfs: cleanup for open-coded alignment
    Btrfs: do not change inode flags in rename
    Btrfs: use reserved space for creating a snapshot
    clear chunk_alloc flag on retryable failure
    ...

    Linus Torvalds
     
  • When using the quota feature we need to enable quotas before orphan cleanup
    so that changes happening during it are properly reflected in quota
    accounting.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • So far we silently ignored quota mount options that were set while the
    quota feature was enabled. But this can create confusion in userspace when
    mount options are set but silently ignored, and it also creates opportunities
    for bugs when we don't properly test all quota types. Actually,
    ext4_mark_dquot_dirty() forgets to test for the quota feature, so it was
    dependent on journaled quota options being set. OTOH ext4_orphan_cleanup()
    tries to enable journaled quota when quota options are specified, which is
    wrong when the quota feature is enabled.

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • ext4_dir_llseek is only used as a callback function, and no one calls
    it directly. So make it a static function in order to remove a
    warning message from the sparse check.

    Signed-off-by: Zheng Liu
    Signed-off-by: "Theodore Ts'o"

    Zheng Liu
     
  • We're using macro EXT4_B2C() to convert number of blocks to number of
    clusters for bigalloc file systems. However, we should be using
    EXT4_NUM_B2C().

    Signed-off-by: Lukas Czerner
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@vger.kernel.org

    Lukas Czerner
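
    The difference between the two conversions comes down to rounding.
    The sketch below is a user-space illustration, not the ext4 macros:
    struct cluster_geom and both function names are hypothetical, and it
    assumes the cluster size is a power-of-two number of blocks. One
    helper answers "which cluster holds this block number" (round down);
    the other answers "how many clusters does this many blocks need"
    (round up).

```c
#include <stdint.h>

/* Hypothetical stand-in for the cluster geometry of a bigalloc file system. */
struct cluster_geom {
    unsigned int cluster_bits;   /* log2(blocks per cluster) */
    uint64_t cluster_ratio;      /* blocks per cluster, i.e. 1 << cluster_bits */
};

/* Round down: index of the cluster that contains block number blk. */
static uint64_t block_to_cluster(const struct cluster_geom *g, uint64_t blk)
{
    return blk >> g->cluster_bits;
}

/* Round up: number of clusters needed to hold nblks blocks. */
static uint64_t num_blocks_to_clusters(const struct cluster_geom *g, uint64_t nblks)
{
    return (nblks + g->cluster_ratio - 1) >> g->cluster_bits;
}
```

    With 16 blocks per cluster, a count of 17 blocks needs 2 clusters,
    while the round-down conversion yields 1, which is why using the
    block-number conversion for a block count undercounts.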