10 Oct, 2012

1 commit


03 Oct, 2012

1 commit

  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values as far down into
    subsystems and filesystems as reasonable, leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network. I let
    the type incompatibility compile errors (present when user namespaces
    are enabled) guide me to the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privilege checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe, allowing
    root in a user namespace to do those things that today we only forbid to
    non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/gid logging code.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds
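
The strategy described in the pull message above (wrap ids in distinct types so that mixing them fails to compile, and convert only at the edges) can be illustrated with a small userspace sketch. This is not the kernel implementation: the single-extent `struct uid_map`, its fields, and the `_sketch` function names are assumptions made for illustration, and the real make_kuid and from_kuid take a user namespace argument rather than a map.

```c
#include <assert.h>
#include <stdint.h>

/* Distinct wrapper types: passing a kgid_t where a kuid_t is expected
 * is a compile error, which is what lets the compiler find the spots
 * that still need conversion. */
typedef struct { uint32_t val; } kuid_t;
typedef struct { uint32_t val; } kgid_t;

#define INVALID_ID ((uint32_t)-1)

/* Hypothetical single-extent map: namespace ids [first, first + count)
 * correspond to kernel ids [lower_first, lower_first + count). */
struct uid_map {
    uint32_t first;
    uint32_t lower_first;
    uint32_t count;
};

/* Edge conversion, inbound: namespace-relative uid -> kernel-internal kuid. */
kuid_t make_kuid_sketch(const struct uid_map *map, uint32_t uid)
{
    kuid_t k = { INVALID_ID };

    if (uid >= map->first && uid - map->first < map->count)
        k.val = map->lower_first + (uid - map->first);
    return k;
}

/* Edge conversion, outbound: kernel-internal kuid -> namespace-relative uid. */
uint32_t from_kuid_sketch(const struct uid_map *map, kuid_t kuid)
{
    if (kuid.val >= map->lower_first &&
        kuid.val - map->lower_first < map->count)
        return map->first + (kuid.val - map->lower_first);
    return INVALID_ID;
}

int uid_valid_sketch(kuid_t k)
{
    return k.val != INVALID_ID;
}
```

An id outside the mapped range converts to an invalid value in either direction, so callers at the edge can reject it before it reaches the rest of the code.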
     

18 Sep, 2012

6 commits

  • Now that the type changes are done, here is the final set of
    changes to make the quota code work when user namespaces are enabled.

    Small cleanups and fixes to make the code build when user namespaces
    are enabled.

    Cc: Jan Kara
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Convert w_dq_id to be a struct kqid and remove the now unnecessary
    w_dq_type.

    This is a simple conversion and enough other places have already
    been converted that this actually reduces the code complexity
    by a little bit, when removing now unnecessary type conversions.

    Cc: Jan Kara
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Change struct dquot dq_id to a struct kqid and remove the now
    unnecessary dq_type.

    Make minimal changes to dquot, quota_tree, quota_v1, quota_v2, ext3,
    ext4, and ocfs2 to deal with the change in quota structures and
    signatures. The ocfs2 changes are larger than most because of the
    extensive tracing throughout the ocfs2 quota code that prints out
    dq_id.

    quota_tree.c:get_index is modified to take a struct kqid instead of a
    qid_t because all of its callers pass in dquot->dq_id and it allows
    me to introduce only a single conversion.

    The rest of the changes are either just replacing dq_type with dq_id.type,
    adding conversions to deal with the change in type, or occasionally
    adding qid_eq to allow quota id comparisons in a user namespace safe way.

    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Jan Kara
    Cc: Andrew Morton
    Cc: Andreas Dilger
    Cc: Theodore Tso
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
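
The shape of the change above, where the quota type moves inside the id and comparisons go through a helper, might look roughly like the following userspace sketch. The struct layouts are simplified assumptions: the kernel's struct kqid holds typed kernel ids rather than a bare integer, and struct dquot has many more members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum quota_type { USRQUOTA = 0, GRPQUOTA = 1, PRJQUOTA = 2 };

/* Simplified stand-in for struct kqid: the id and its type travel
 * together, so a caller cannot pass one without the other. */
struct kqid {
    uint32_t id;            /* assumption: plain integer instead of typed ids */
    enum quota_type type;
};

/* Two quota ids are equal only if both the type and the value match. */
bool qid_eq(struct kqid a, struct kqid b)
{
    return a.type == b.type && a.id == b.id;
}

/* Simplified dquot: the separate dq_type member is gone and callers
 * read dq_id.type instead. */
struct dquot {
    struct kqid dq_id;
};
```

With this layout, user id 1000 and group id 1000 no longer compare equal by accident, which a bare integer comparison would allow.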
     
  • Modify dqget to take struct kqid instead of a type and an identifier
    pair.

    Modify the callers of dqget in ocfs2 and dquot to generate
    a struct kqid so they can continue to call dqget. The conversion
    to create struct kqid should all be the final conversions that
    are needed in those code paths.

    Cc: Jan Kara
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Modify quota_send_warning to take struct kqid instead of a type and
    identifier pair.

    When sending netlink broadcasts always convert uids and quota
    identifiers into the initial user namespace. There is as yet no way to
    send a netlink broadcast message with different contents to receivers
    in different namespaces, so for the time being just map all of the
    identifiers into the initial user namespace which preserves the
    current behavior.

    Change the callers of quota_send_warning in gfs2, xfs and dquot
    to generate a struct kqid to pass to quota_send_warning. When
    all of the user namespace conversions are complete, struct kqid
    values will be available without need for conversion, but a conversion
    is needed now to avoid needing to convert everything at once.

    Cc: Ben Myers
    Cc: Alex Elder
    Cc: Dave Chinner
    Cc: Jan Kara
    Cc: Steven Whitehouse
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Update the quotactl user space interface to successfully compile with
    user namespaces support enabled and to hand off quota identifiers to
    lower layers of the kernel in struct kqid instead of type and qid
    pairs.

    The quota on function is not converted because, while it takes a quota
    type and an id, the id is the on disk quota format to use, which
    is something completely different.

    The signatures of two struct quotactl_ops methods, get_dqblk and
    set_dqblk, were changed to take struct kqid arguments.

    The dquot, xfs, and ocfs2 implementations of get_dqblk and set_dqblk
    are minimally changed so that the code continues to work with
    the change in parameter type.

    This is the first in a series of changes to always store quota
    identifiers in the kernel in struct kqid and only use raw type and qid
    values when interacting with on disk structures or userspace. Always
    using struct kqid internally makes it hard to miss places that need
    conversion to or from the kernel internal values.

    Cc: Jan Kara
    Cc: Dave Chinner
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Ben Myers
    Cc: Alex Elder
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

15 Aug, 2012

1 commit


25 Jul, 2012

1 commit

  • Pull misc udf, ext2, ext3, and isofs fixes from Jan Kara:
    "Assorted, mostly trivial, fixes for udf, ext2, ext3, and isofs. I'm
    on vacation and scarcely checking email since we are expecting baby
    any day now but these fixes should be safe to go in and I don't want
    to delay them unnecessarily."

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    udf: avoid info leak on export
    isofs: avoid info leak on export
    udf: Improve table length check to avoid possible overflow
    ext3: Check return value of blkdev_issue_flush()
    jbd: Check return value of blkdev_issue_flush()
    udf: Do not decrement i_blocks when freeing indirect extent block
    udf: Fix memory leak when mounting
    ext2: cleanup the confused goto label
    UDF: Remove unnecessary variable "offset" from udf_fill_inode
    udf: stop using s_dirt
    ext3: force ro mount if ext3_setup_super() fails
    quota: fix checkpatch.pl warning by replacing with

    Linus Torvalds
     

23 Jul, 2012

1 commit

  • Split off part of dquot_quota_sync() which writes dquots into a quota file
    to a separate function. In the next patch we will use the function from
    filesystems and we do not want to abuse ->quota_sync quotactl callback more
    than necessary.

    Acked-by: Steven Whitehouse
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro

    Jan Kara
     

09 Jul, 2012

1 commit


16 May, 2012

3 commits

  • So far i_mutex was ranking above dqonoff_mutex and i_mutex on quota files
    was special and ranking below dqonoff_mutex (and several other locks).
    However there's no real need for i_mutex on quota files to be special.
    IO on quota files is serialized by dqio_mutex anyway so we don't need to
    take i_mutex when writing to quota files. Other places where we take i_mutex
    on quota file can accommodate standard i_mutex lock ranking, we only need
    to change the lock ranking to be dqonoff_mutex > i_mutex which is a matter
    of changing documentation because there's no place which would enforce
    ordering in the other direction.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • Signed-off-by: Jan Kara

    Jan Kara
     
  • When CONFIG_QUOTA_DEBUG is enabled we call inode_get_rsv_space() from
    add_dquot_ref() while holding i_lock. But inode_get_rsv_space() is trying
    to get i_lock as well resulting in double lock.

    Fix the problem by moving inode_get_rsv_space() call out of i_lock.

    Reported-and-analyzed-by: Jie Liu
    Signed-off-by: Jan Kara

    Jan Kara
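
The double lock described above, and the fix of moving the call outside the locked region, can be modelled in userspace. This is a toy illustration, not the kernel code: `struct toy_lock`, `struct toy_inode` and the function names are hypothetical, and the lock is an assert-backed flag that trips where a real non-recursive lock would deadlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: acquiring it twice trips the assert, standing
 * in for the self-deadlock a real lock would cause. */
struct toy_lock { bool held; };

void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = true; }
void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = false; }

struct toy_inode {
    struct toy_lock i_lock;
    long rsv_space;
    int writecount;
};

/* Takes i_lock itself, so it must not be called with i_lock held. */
long get_rsv_space(struct toy_inode *inode)
{
    long ret;

    toy_lock_acquire(&inode->i_lock);
    ret = inode->rsv_space;
    toy_lock_release(&inode->i_lock);
    return ret;
}

/* Fixed ordering: finish the i_lock section first, then query the
 * reserved space, which takes the lock again on its own. */
bool inode_has_quota_work(struct toy_inode *inode)
{
    bool writable;

    toy_lock_acquire(&inode->i_lock);
    writable = inode->writecount > 0;
    toy_lock_release(&inode->i_lock);

    return writable || get_rsv_space(inode) > 0;
}
```

Calling get_rsv_space() between the acquire and release in inode_has_quota_work() would hit the assert, which mirrors the reported bug.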
     

29 Mar, 2012

1 commit

  • Pull ext3, UDF, and quota fixes from Jan Kara:
    "A couple of ext3 & UDF fixes and also one improvement in quota
    locking."

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext3: fix start and len arguments handling in ext3_trim_fs()
    udf: Fix deadlock in udf_release_file()
    udf: Fix file entry logicalBlocksRecorded
    udf: Fix handling of i_blocks
    quota: Make quota code not call tty layer with dqptr_sem held
    udf: Init/maintain file entry checkpoint field
    ext3: Update ctime in ext3_splice_branch() only when needed
    ext3: Don't call dquot_free_block() if we don't update anything
    udf: Remove unnecessary OOM messages

    Linus Torvalds
     

01 Mar, 2012

1 commit

  • dqptr_sem can be called from slab reclaim. tty layer uses GFP_KERNEL mask for
    allocation so it can end up calling slab reclaim. Given quota code can call
    into tty layer to print warning this creates possibility for lock inversion
    between tty->atomic_write_lock and dqptr_sem.

    Using direct printing of warnings from quota layer is obsolete but since it's
    easy enough to change quota code to not hold any locks when printing warnings,
    let's just do it. It seems like a good thing to do even when we use netlink
    layer to transmit warnings to userspace.

    Reported-by: Markus
    Signed-off-by: Jan Kara

    Jan Kara
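
The approach in the commit above, not holding any locks when printing warnings, is commonly done by recording what needs reporting while the lock is held and emitting it afterwards. A minimal userspace sketch under that assumption follows; the counter `lock_depth` stands in for dqptr_sem, and `struct warn_buf` and all function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_WARN 4

/* Warnings collected while the lock is held, printed later. */
struct warn_buf {
    int n;
    int ids[MAX_WARN];
};

static int lock_depth;  /* stand-in for dqptr_sem: nonzero means "held" */
static int printed;     /* counts emitted warnings, for demonstration */

/* Under the lock we only record which ids went over their limit. */
void check_limits(struct warn_buf *w, const int *ids, const long *usage,
                  const long *limit, int count)
{
    int i;

    lock_depth++;                       /* "take" the lock */
    w->n = 0;
    for (i = 0; i < count; i++)
        if (usage[i] > limit[i] && w->n < MAX_WARN)
            w->ids[w->n++] = ids[i];
    lock_depth--;                       /* "drop" the lock */
}

/* Printing may allocate or sleep in the kernel, so it runs lock-free. */
void flush_warnings_sketch(const struct warn_buf *w)
{
    int i;

    assert(lock_depth == 0);
    for (i = 0; i < w->n; i++) {
        printf("quota exceeded for id %d\n", w->ids[i]);
        printed++;
    }
}
```

Because the printing step never runs under the lock, whatever it calls into cannot create a lock ordering against it.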
     

14 Feb, 2012

1 commit


12 Jan, 2012

1 commit


07 Jan, 2012

1 commit


04 Jan, 2012

1 commit

  • Move invalidate_bdev, block_sync_page into fs/block_dev.c. Export
    kill_bdev as well, so brd doesn't have to open code it. Reduce
    buffer_head.h requirement accordingly.

    Removed a rather large comment from invalidate_bdev, as it looked a bit
    obsolete to bother moving. The small comment replacing it says enough.

    Signed-off-by: Nick Piggin
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Al Viro
     

25 May, 2011

1 commit

  • Change each shrinker's API by consolidating the existing parameters into
    shrink_control struct. This will simplify any further features added w/o
    touching each file of shrinker.

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: fix warning]
    [kosaki.motohiro@jp.fujitsu.com: fix up new shrinker API]
    [akpm@linux-foundation.org: fix xfs warning]
    [akpm@linux-foundation.org: update gfs2]
    Signed-off-by: Ying Han
    Cc: KOSAKI Motohiro
    Cc: Minchan Kim
    Acked-by: Pavel Emelyanov
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Acked-by: Rik van Riel
    Cc: Johannes Weiner
    Cc: Hugh Dickins
    Cc: Dave Hansen
    Cc: Steven Whitehouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ying Han
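
Consolidating callback parameters into one struct, as the commit above does, means a later field can be added without changing every callback's signature. The sketch below is a userspace model: the `gfp_mask` and `nr_to_scan` fields follow the parameters being consolidated, while `nr_cached` and `toy_cache_shrink` are hypothetical additions for demonstration.

```c
#include <assert.h>

/* Parameters bundled into one struct instead of separate arguments. */
struct shrink_control {
    unsigned int gfp_mask;
    unsigned long nr_to_scan;
};

struct shrinker {
    int (*shrink)(struct shrinker *self, struct shrink_control *sc);
    long nr_cached;     /* hypothetical: object count of a toy cache */
};

/* Toy callback: frees up to nr_to_scan objects and returns how many
 * remain. With nr_to_scan == 0 it acts as a pure size query. */
int toy_cache_shrink(struct shrinker *self, struct shrink_control *sc)
{
    long n = (long)sc->nr_to_scan;

    if (n > self->nr_cached)
        n = self->nr_cached;
    self->nr_cached -= n;
    return (int)self->nr_cached;
}
```

A caller fills in a struct shrink_control once and passes it to each registered callback; the callbacks read only the fields they care about.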
     

08 Apr, 2011

1 commit


01 Apr, 2011

1 commit

  • There's no reason to write quota info in dquot_commit(). The writing is a
    relic from the old days when we didn't have dquot_acquire() and
    dquot_release() and thus dquot_commit() could have created / removed quota
    structures from the file. These days dquot_commit() only updates usage counters
    / limits in quota structure and thus there's no need to write quota info.

    This also fixes an issue with journaling filesystem which didn't reserve
    enough space in the transaction for write of quota info (it could have been
    dirty at the time of dquot_commit() because of a race with other operation
    changing it).

    CC: stable@kernel.org
    Reported-and-tested-by: Lukas Czerner
    Signed-off-by: Jan Kara

    Jan Kara
     

31 Mar, 2011

1 commit


25 Mar, 2011

2 commits

  • Protect the per-sb inode list with a new global lock
    inode_sb_list_lock and use it to protect the list manipulations and
    traversals. This lock replaces the inode_lock as the inodes on the
    list can be validity checked while holding the inode->i_lock and
    hence the inode_lock is no longer needed to protect the list.

    Signed-off-by: Dave Chinner
    Signed-off-by: Al Viro

    Dave Chinner
     
  • Protect inode state transitions and validity checks with the
    inode->i_lock. This enables us to make inode state transitions
    independently of the inode_lock and is the first step to peeling
    away the inode_lock from the code.

    This requires that __iget() is done atomically with i_state checks
    during list traversals so that we don't race with another thread
    marking the inode I_FREEING between the state check and grabbing the
    reference.

    Also remove the unlock_new_inode() memory barrier optimisation
    required to avoid taking the inode_lock when clearing I_NEW.
    Simplify the code by simply taking the inode->i_lock around the
    state change and wakeup. Because the wakeup is no longer tricky,
    remove the wake_up_inode() function and open code the wakeup where
    necessary.

    Signed-off-by: Dave Chinner
    Signed-off-by: Al Viro

    Dave Chinner
     

13 Jan, 2011

1 commit

  • As Al Viro pointed out, path resolution during Q_QUOTAON calls to quotactl
    is prone to deadlocks. We hold s_umount semaphore for reading during the
    path resolution and resolution itself may need to acquire the semaphore
    for writing when e.g. an autofs mountpoint is passed.

    Solve the problem by performing the resolution before we get hold of the
    superblock (and thus s_umount semaphore). The whole thing is complicated
    by the fact that some filesystems (OCFS2) ignore the path argument. So to
    distinguish between filesystems which want the path and those which do not,
    we introduce a new .quota_on_meta callback which does not get the path.
    OCFS2 then uses this callback instead of the old .quota_on.

    CC: Al Viro
    CC: Christoph Hellwig
    CC: Ted Ts'o
    CC: Joel Becker
    Signed-off-by: Jan Kara

    Jan Kara
     

11 Jan, 2011

1 commit


28 Oct, 2010

3 commits

  • When quotaon(8) races with __dquot_initialize() or dqget() fails because
    of EIO, ENOSPC, or similar error, we could possibly dereference NULL pointer
    in inode->i_dquot[cnt]. Add proper checking.

    Reported-by: Dmitry Monakhov
    Signed-off-by: Jan Kara

    Jan Kara
     
  • __dquot_transfer accidentally called flush_warnings for a wrong set of
    dquots which could result in quota warnings being issued with a wrong
    identification. Also when the operation fails because of EDQUOT, there's no
    need to check for issuing an information message about the user getting
    below limits (no transfer has actually happened).

    Signed-off-by: Jan Kara

    Jan Kara
     
  • I've got the following lockup:

    dquot_disable                                  dquot_transfer
                                                   ->dqget()
                                                   sb_has_quota_active
    dqopt->flags &= ~dquot_state_flag(f, cnt)      atomic_inc(dq->dq_count)
    ->drop_dquot_ref(sb, cnt);
    down_write(dqptr_sem)
    inode->i_dquot[cnt] = NULL                     ->__dquot_transfer
    invalidate_dquots(sb, cnt);                    down_write(&dqptr_sem)
    ->wait for dq_wait_unused                      inode->i_dquot = new_dquot
    /* wait forever */                             ^^^^New quota user^^^^^^

    We cannot allow new references to dquots from inodes after drop_dquot_ref()
    has removed them. We have to recheck quota state under dqptr_sem and before
    assignment, as we do it in dquot_initialize().

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Jan Kara

    Dmitry
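
The fix described above, rechecking quota state under the lock before assigning a new dquot to the inode, follows a general recheck-under-lock pattern. The following is a userspace sketch of that pattern only; `struct toy_sb`, `struct toy_inode`, and `transfer_dquot_sketch` are hypothetical names, and a counter stands in for dqptr_sem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_sb {
    bool quota_active;  /* cleared by the "disable" path */
    int lock_held;      /* stand-in for dqptr_sem */
};

struct toy_inode {
    struct toy_sb *sb;
    const void *i_dquot;
};

/* Installs new_dquot only if quota is still active once the lock is
 * held. Returns true if the reference was installed. */
bool transfer_dquot_sketch(struct toy_inode *inode, const void *new_dquot)
{
    bool installed = false;

    inode->sb->lock_held++;             /* "down_write" */
    if (inode->sb->quota_active) {      /* recheck: may have been disabled */
        inode->i_dquot = new_dquot;
        installed = true;
    }
    inode->sb->lock_held--;             /* "up_write" */
    return installed;
}
```

If the disable path has already cleared the active flag and dropped the inode references, the transfer path backs out instead of creating the new reference that the disabling side would wait on forever.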
     

11 Aug, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (96 commits)
    no need for list_for_each_entry_safe()/resetting with superblock list
    Fix sget() race with failing mount
    vfs: don't hold s_umount over close_bdev_exclusive() call
    sysv: do not mark superblock dirty on remount
    sysv: do not mark superblock dirty on mount
    btrfs: remove junk sb_dirt change
    BFS: clean up the superblock usage
    AFFS: wait for sb synchronization when needed
    AFFS: clean up dirty flag usage
    cifs: truncate fallout
    mbcache: fix shrinker function return value
    mbcache: Remove unused features
    add f_flags to struct statfs(64)
    pass a struct path to vfs_statfs
    update VFS documentation for method changes.
    All filesystems that need invalidate_inode_buffers() are doing that explicitly
    convert remaining ->clear_inode() to ->evict_inode()
    Make ->drop_inode() just return whether inode needs to be dropped
    fs/inode.c:clear_inode() is gone
    fs/inode.c:evict() doesn't care about delete vs. non-delete paths now
    ...

    Fix up trivial conflicts in fs/nilfs2/super.c

    Linus Torvalds
     

10 Aug, 2010

1 commit

  • add I_CLEAR instead of replacing I_FREEING with it. I_CLEAR is
    equivalent to I_FREEING for almost all code looking at either;
    it's there to keep track of having called clear_inode() exactly
    once per inode lifetime, at some point after having set I_FREEING.
    I_CLEAR and I_FREEING never get set at the same time with the
    current code, so we can switch to setting i_flags to I_FREEING | I_CLEAR
    instead of I_CLEAR without loss of information. As the result of
    such change, checks become simpler and the amount of code that needs
    to know about I_CLEAR shrinks a lot.

    Signed-off-by: Al Viro

    Al Viro
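
The simplification described above can be seen in a small sketch of the flag checks. The bit values here are illustrative assumptions, not necessarily the kernel's assignments, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the kernel's actual assignments may differ. */
#define I_FREEING (1u << 5)
#define I_CLEAR   (1u << 6)

/* Before: I_CLEAR replaced I_FREEING, so callers had to test both bits
 * to learn whether the inode was going away. */
bool going_away_old(unsigned int state)
{
    return (state & (I_FREEING | I_CLEAR)) != 0;
}

/* After: I_CLEAR is only ever added on top of I_FREEING, so testing
 * the one bit is enough. */
bool going_away_new(unsigned int state)
{
    return (state & I_FREEING) != 0;
}

/* Marks clear_inode() as done without dropping I_FREEING. */
unsigned int mark_cleared(unsigned int state)
{
    assert(state & I_FREEING);  /* happens after I_FREEING was set */
    return state | I_CLEAR;
}
```

Only code that cares whether clear_inode() has already run still needs to look at I_CLEAR.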
     

08 Aug, 2010

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
    ext3: Fix dirtying of journalled buffers in data=journal mode
    ext3: default to ordered mode
    quota: Use mark_inode_dirty_sync instead of mark_inode_dirty
    quota: Change quota error message to print out disk and function name
    MAINTAINERS: Update entries of ext2 and ext3
    MAINTAINERS: Update address of Andreas Dilger
    ext3: Avoid filesystem corruption after a crash under heavy delete load
    ext3: remove vestiges of nobh support
    ext3: Fix set but unused variables
    quota: clean up quota active checks
    quota: Clean up the namespace in dqblk_xfs.h
    quota: check quota reservation on remove_dquot_ref

    Linus Torvalds
     

23 Jul, 2010

1 commit


21 Jul, 2010

4 commits

  • The current quota error message doesn't always print the disk name, so
    it is hard to identify the "bad" disk when quota error happens.

    This patch changes the standardized quota error message to print out disk name
    and function name. It also uses a combination of cpp macro and inline function
    to provide better type checking and to lower the text size of the message.

    [Jan Kara: Export __quota_error]

    Signed-off-by: Jiaying Zhang
    Signed-off-by: Jan Kara

    Jiaying Zhang
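
The combination of a cpp macro and a real function mentioned above could look roughly like this in userspace. The function carries a printf format attribute (a GCC/Clang extension) so arguments are type-checked, and the macro only supplies the calling function's name. The names `toy_super_block`, `quota_error_fn`, `last_msg`, and the exact message layout are assumptions for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

struct toy_super_block {
    const char *s_id;   /* device name */
};

static char last_msg[256];  /* hypothetical capture buffer for inspection */

/* A real function: the format attribute makes the compiler check fmt
 * against the variadic arguments at every call site. */
__attribute__((format(printf, 3, 4)))
void quota_error_fn(const struct toy_super_block *sb, const char *func,
                    const char *fmt, ...)
{
    char body[128];
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(body, sizeof(body), fmt, ap);
    va_end(ap);

    snprintf(last_msg, sizeof(last_msg), "Quota error (device %s): %s: %s",
             sb->s_id, func, body);
    fprintf(stderr, "%s\n", last_msg);
}

/* The macro adds only __func__, so the shared prefix text lives in one
 * place instead of being repeated in every format string. */
#define quota_error(sb, ...) quota_error_fn((sb), __func__, __VA_ARGS__)
```

A call such as `quota_error(&sb, "bad block %d", 7)` then reports both the device and the function it was called from.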
     
  • The various quota operations check for any quota being active on
    a superblock, and the inode not having the noquota flag.

    Merge these two checks into a dquot_active check and move that
    into dquot.c as that's the only place where it's needed.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
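
Merging the two checks into one helper, as described above, might be sketched like this. The structures are simplified stand-ins: `active_types` and `noquota` are hypothetical fields representing "any quota active on the superblock" and the inode's noquota flag.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_sb {
    unsigned int active_types;  /* bitmask of enabled quota types */
};

struct toy_inode {
    struct toy_sb *sb;
    bool noquota;               /* inode is excluded from quota accounting */
};

/* One predicate replaces the pair of checks repeated at each call site. */
bool dquot_active_sketch(const struct toy_inode *inode)
{
    if (inode->noquota)
        return false;
    return inode->sb->active_types != 0;
}
```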
     
  • Almost all identifiers use the FS_* namespace, so rename the missing few
    XFS_* ones to FS_* as well. Without this some people might get upset
    about having too many XFS names in generic code.

    Acked-by: Steven Whitehouse
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Christoph Hellwig
     
  • Reserved space must be claimed before remove_dquot_ref() for a
    given inode. The filesystem is responsible for forcing block
    allocation in case of dealloc in ->quota_off. Let's add a sanity check
    for that case. Do it similarly to add_dquot_ref().

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Jan Kara

    Dmitry Monakhov