08 Apr, 2014

1 commit

  • filemap_map_pages() is generic implementation of ->map_pages() for
    filesystems who uses page cache.

    It should be safe to use filemap_map_pages() for ->map_pages() if
    filesystem use filemap_fault() for ->fault().

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Linus Torvalds
    Cc: Mel Gorman
    Cc: Rik van Riel
    Cc: Andi Kleen
    Cc: Matthew Wilcox
    Cc: Dave Hansen
    Cc: Alexander Viro
    Cc: Dave Chinner
    Cc: Ning Qu
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

05 Apr, 2014

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "Major changes for 3.14 include support for the newly added ZERO_RANGE
    and COLLAPSE_RANGE fallocate operations, and scalability improvements
    in the jbd2 layer and in xattr handling when the extended attributes
    spill over into an external block.

    Other than that, the usual clean ups and minor bug fixes"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (42 commits)
    ext4: fix premature freeing of partial clusters split across leaf blocks
    ext4: remove unneeded test of ret variable
    ext4: fix comment typo
    ext4: make ext4_block_zero_page_range static
    ext4: atomically set inode->i_flags in ext4_set_inode_flags()
    ext4: optimize Hurd tests when reading/writing inodes
    ext4: kill i_version support for Hurd-castrated file systems
    ext4: each filesystem creates and uses its own mb_cache
    fs/mbcache.c: doucple the locking of local from global data
    fs/mbcache.c: change block and index hash chain to hlist_bl_node
    ext4: Introduce FALLOC_FL_ZERO_RANGE flag for fallocate
    ext4: refactor ext4_fallocate code
    ext4: Update inode i_size after the preallocation
    ext4: fix partial cluster handling for bigalloc file systems
    ext4: delete path dealloc code in ext4_ext_handle_uninitialized_extents
    ext4: only call sync_filesystm() when remounting read-only
    fs: push sync_filesystem() down to the file system's remount_fs()
    jbd2: improve error messages for inconsistent journal heads
    jbd2: minimize region locked by j_list_lock in jbd2_journal_forget()
    jbd2: minimize region locked by j_list_lock in journal_get_create_access()
    ...

    Linus Torvalds
     

04 Apr, 2014

1 commit

  • Reclaim will be leaving shadow entries in the page cache radix tree upon
    evicting the real page. As those pages are found from the LRU, an
    iput() can lead to the inode being freed concurrently. At this point,
    reclaim must no longer install shadow pages because the inode freeing
    code needs to ensure the page tree is really empty.

    Add an address_space flag, AS_EXITING, that the inode freeing code sets
    under the tree lock before doing the final truncate. Reclaim will check
    for this flag before installing shadow pages.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Rik van Riel
    Reviewed-by: Minchan Kim
    Cc: Andrea Arcangeli
    Cc: Bob Liu
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Greg Thelen
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: KOSAKI Motohiro
    Cc: Luigi Semenzato
    Cc: Mel Gorman
    Cc: Metin Doslu
    Cc: Michel Lespinasse
    Cc: Ozgun Erdogan
    Cc: Peter Zijlstra
    Cc: Roman Gushchin
    Cc: Ryan Mallon
    Cc: Tejun Heo
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

13 Mar, 2014

1 commit

  • Previously, the no-op "mount -o mount /dev/xxx" operation when the
    file system is already mounted read-write causes an implied,
    unconditional syncfs(). This seems pretty stupid, and it's certainly
    documented or guaraunteed to do this, nor is it particularly useful,
    except in the case where the file system was mounted rw and is getting
    remounted read-only.

    However, it's possible that there might be some file systems that are
    actually depending on this behavior. In most file systems, it's
    probably fine to only call sync_filesystem() when transitioning from
    read-write to read-only, and there are some file systems where this is
    not needed at all (for example, for a pseudo-filesystem or something
    like romfs).

    Signed-off-by: "Theodore Ts'o"
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Christoph Hellwig
    Cc: Artem Bityutskiy
    Cc: Adrian Hunter
    Cc: Evgeniy Dushistov
    Cc: Jan Kara
    Cc: OGAWA Hirofumi
    Cc: Anders Larsen
    Cc: Phillip Lougher
    Cc: Kees Cook
    Cc: Mikulas Patocka
    Cc: Petr Vandrovec
    Cc: xfs@oss.sgi.com
    Cc: linux-btrfs@vger.kernel.org
    Cc: linux-cifs@vger.kernel.org
    Cc: samba-technical@lists.samba.org
    Cc: codalist@coda.cs.cmu.edu
    Cc: linux-ext4@vger.kernel.org
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: fuse-devel@lists.sourceforge.net
    Cc: cluster-devel@redhat.com
    Cc: linux-mtd@lists.infradead.org
    Cc: jfs-discussion@lists.sourceforge.net
    Cc: linux-nfs@vger.kernel.org
    Cc: linux-nilfs@vger.kernel.org
    Cc: linux-ntfs-dev@lists.sourceforge.net
    Cc: ocfs2-devel@oss.oracle.com
    Cc: reiserfs-devel@vger.kernel.org

    Theodore Ts'o
     

24 Jan, 2014

1 commit


13 Nov, 2013

2 commits

  • Pull vfs updates from Al Viro:
    "All kinds of stuff this time around; some more notable parts:

    - RCU'd vfsmounts handling
    - new primitives for coredump handling
    - files_lock is gone
    - Bruce's delegations handling series
    - exportfs fixes

    plus misc stuff all over the place"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (101 commits)
    ecryptfs: ->f_op is never NULL
    locks: break delegations on any attribute modification
    locks: break delegations on link
    locks: break delegations on rename
    locks: helper functions for delegation breaking
    locks: break delegations on unlink
    namei: minor vfs_unlink cleanup
    locks: implement delegations
    locks: introduce new FL_DELEG lock flag
    vfs: take i_mutex on renamed file
    vfs: rename I_MUTEX_QUOTA now that it's not used for quotas
    vfs: don't use PARENT/CHILD lock classes for non-directories
    vfs: pull ext4's double-i_mutex-locking into common code
    exportfs: fix quadratic behavior in filehandle lookup
    exportfs: better variable name
    exportfs: move most of reconnect_path to helper function
    exportfs: eliminate unused "noprogress" counter
    exportfs: stop retrying once we race with rename/remove
    exportfs: clear DISCONNECTED on all parents sooner
    exportfs: more detailed comment for path_reconnect
    ...

    Linus Torvalds
     
  • Pull ubifs changes from Artem Bityutskiy:
    "Mostly fixes for the power cut emulation UBIFS mode, and only one
    functional change which fixes a return error code"

    * tag 'upstream-3.13-rc1' of git://git.infradead.org/linux-ubifs:
    UBIFS: correct data corruption range
    UBIFS: fix return code
    UBIFS: remove unnecessary code in ubifs_garbage_collect

    Linus Torvalds
     

26 Oct, 2013

2 commits

  • With power-cut emulation, it is possible that sometimes no data at all is
    corrupted and that confusing messages are printed due to errors in the
    computation of data corruption range.

    [1] The start of the range should be [0..len-1], not [0..len].
    [2] The end of the range should always be at least 1 greater than the start.

    Signed-off-by: Mats Karrman
    Signed-off-by: Artem Bityutskiy

    Mats Kärrman
     
  • Fix to return -ENOMEM in the kmalloc() and d_make_root() error handling
    case instead of 0, as done elsewhere in those functions.

    Signed-off-by: Wei Yongjun
    Signed-off-by: Artem Bityutskiy

    Wei Yongjun
     

25 Oct, 2013

1 commit


22 Oct, 2013

1 commit

  • In ubifs_garbage_collect,local variable "space_before" calculate twice. In
    fact, at the beginning of the loop, there is no need to calculate this
    variable. Calculate it before call "ubifs_garbage_collect_leb" is enough. This
    patch just remove the unnecessary calculate code.

    Signed-off-by: wang bo
    Acked-by: Brian Norris
    Signed-off-by: Artem Bityutskiy

    wang.bo116@zte.com.cn
     

17 Sep, 2013

1 commit

  • Pull ubifs fix from Artem Bityutskiy:
    "Just one patch which fixes the power-cut recovery testing mode.

    I'll start using a single UBI/UBIFS tree instead of 2 trees from now
    on. So in the future you'll get 1 small pull request instead of 2
    tiny ones"

    * tag 'upstream-3.12-rc1' of git://git.infradead.org/linux-ubifs:
    UBIFS: remove invalid warn msg with tst_recovery enabled

    Linus Torvalds
     

11 Sep, 2013

1 commit

  • Convert the filesystem shrinkers to use the new API, and standardise some
    of the behaviours of the shrinkers at the same time. For example,
    nr_to_scan means the number of objects to scan, not the number of objects
    to free.

    I refactored the CIFS idmap shrinker a little - it really needs to be
    broken up into a shrinker per tree and keep an item count with the tree
    root so that we don't need to walk the tree every time the shrinker needs
    to count the number of objects in the tree (i.e. all the time under
    memory pressure).

    [glommer@openvz.org: fixes for ext4, ubifs, nfs, cifs and glock. Fixes are needed mainly due to new code merged in the tree]
    [assorted fixes folded in]
    Signed-off-by: Dave Chinner
    Signed-off-by: Glauber Costa
    Acked-by: Mel Gorman
    Acked-by: Artem Bityutskiy
    Acked-by: Jan Kara
    Acked-by: Steven Whitehouse
    Cc: Adrian Hunter
    Cc: "Theodore Ts'o"
    Cc: Adrian Hunter
    Cc: Al Viro
    Cc: Artem Bityutskiy
    Cc: Arve Hjønnevåg
    Cc: Carlos Maiolino
    Cc: Christoph Hellwig
    Cc: Chuck Lever
    Cc: Daniel Vetter
    Cc: David Rientjes
    Cc: Gleb Natapov
    Cc: Greg Thelen
    Cc: J. Bruce Fields
    Cc: Jan Kara
    Cc: Jerome Glisse
    Cc: John Stultz
    Cc: KAMEZAWA Hiroyuki
    Cc: Kent Overstreet
    Cc: Kirill A. Shutemov
    Cc: Marcelo Tosatti
    Cc: Mel Gorman
    Cc: Steven Whitehouse
    Cc: Thomas Hellstrom
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton

    Signed-off-by: Al Viro

    Dave Chinner
     

16 Aug, 2013

1 commit


06 Jul, 2013

1 commit


03 Jul, 2013

1 commit

  • Pull ext4 update from Ted Ts'o:
    "Lots of bug fixes, cleanups and optimizations. In the bug fixes
    category, of note is a fix for on-line resizing file systems where the
    block size is smaller than the page size (i.e., file systems 1k blocks
    on x86, or more interestingly file systems with 4k blocks on Power or
    ia64 systems.)

    In the cleanup category, the ext4's punch hole implementation was
    significantly improved by Lukas Czerner, and now supports bigalloc
    file systems. In addition, Jan Kara significantly cleaned up the
    write submission code path. We also improved error checking and added
    a few sanity checks.

    In the optimizations category, two major optimizations deserve
    mention. The first is that ext4_writepages() is now used for
    nodelalloc and ext3 compatibility mode. This allows writes to be
    submitted much more efficiently as a single bio request, instead of
    being sent as individual 4k writes into the block layer (which then
    relied on the elevator code to coalesce the requests in the block
    queue). Secondly, the extent cache shrink mechanism, which was
    introduce in 3.9, no longer has a scalability bottleneck caused by the
    i_es_lru spinlock. Other optimizations include some changes to reduce
    CPU usage and to avoid issuing empty commits unnecessarily."

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits)
    ext4: optimize starting extent in ext4_ext_rm_leaf()
    jbd2: invalidate handle if jbd2_journal_restart() fails
    ext4: translate flag bits to strings in tracepoints
    ext4: fix up error handling for mpage_map_and_submit_extent()
    jbd2: fix theoretical race in jbd2__journal_restart
    ext4: only zero partial blocks in ext4_zero_partial_blocks()
    ext4: check error return from ext4_write_inline_data_end()
    ext4: delete unnecessary C statements
    ext3,ext4: don't mess with dir_file->f_pos in htree_dirblock_to_tree()
    jbd2: move superblock checksum calculation to jbd2_write_superblock()
    ext4: pass inode pointer instead of file pointer to punch hole
    ext4: improve free space calculation for inline_data
    ext4: reduce object size when !CONFIG_PRINTK
    ext4: improve extent cache shrink mechanism to avoid to burn CPU time
    ext4: implement error handling of ext4_mb_new_preallocation()
    ext4: fix corruption when online resizing a fs with 1K block size
    ext4: delete unused variables
    ext4: return FIEMAP_EXTENT_UNKNOWN for delalloc extents
    jbd2: remove debug dependency on debug_fs and update Kconfig help text
    jbd2: use a single printk for jbd_debug()
    ...

    Linus Torvalds
     

29 Jun, 2013

3 commits

  • Signed-off-by: Al Viro

    Al Viro
     
  • Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no
    mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are
    in the middle of 'ubifs_readdir()'.

    This means that 'file->private_data' can be freed while 'ubifs_readdir()' uses
    it, and this is a very bad bug: not only 'ubifs_readdir()' can return garbage,
    but this may corrupt memory and lead to all kinds of problems like crashes an
    security holes.

    This patch fixes the problem by using the 'file->f_version' field, which
    '->llseek()' always unconditionally sets to zero. We set it to 1 in
    'ubifs_readdir()' and whenever we detect that it became 0, we know there was a
    seek and it is time to clear the state saved in 'file->private_data'.

    I tested this patch by writing a user-space program which runds readdir and
    seek in parallell. I could easily crash the kernel without these patches, but
    could not crash it with these patches.

    Cc: stable@vger.kernel.org
    Reported-by: Al Viro
    Tested-by: Artem Bityutskiy
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: Al Viro

    Artem Bityutskiy
     
  • Al Viro pointed me to the fact that '->readdir()' and '->llseek()' have no
    mutual exclusion, which means the 'ubifs_dir_llseek()' can be run while we are
    in the middle of 'ubifs_readdir()'.

    First of all, this means that 'file->private_data' can be freed while
    'ubifs_readdir()' uses it. But this particular patch does not fix the problem.
    This patch is only a preparation, and the fix will follow next.

    In this patch we make 'ubifs_readdir()' stop using 'file->f_pos' directly,
    because 'file->f_pos' can be changed by '->llseek()' at any point. This may
    lead 'ubifs_readdir()' to returning inconsistent data: directory entry names
    may correspond to incorrect file positions.

    So here we introduce a local variable 'pos', read 'file->f_pose' once at very
    the beginning, and then stick to 'pos'. The result of this is that when
    'ubifs_dir_llseek()' changes 'file->f_pos' while we are in the middle of
    'ubifs_readdir()', the latter "wins".

    Cc: stable@vger.kernel.org
    Reported-by: Al Viro
    Tested-by: Artem Bityutskiy
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: Al Viro

    Artem Bityutskiy
     

22 May, 2013

1 commit

  • Currently there is no way to truncate partial page where the end
    truncate point is not at the end of the page. This is because it was not
    needed and the functionality was enough for file system truncate
    operation to work properly. However more file systems now support punch
    hole feature and it can benefit from mm supporting truncating page just
    up to the certain point.

    Specifically, with this functionality truncate_inode_pages_range() can
    be changed so it supports truncating partial page at the end of the
    range (currently it will BUG_ON() if 'end' is not at the end of the
    page).

    This commit changes the invalidatepage() address space operation
    prototype to accept range to be invalidated and update all the instances
    for it.

    We also change the block_invalidatepage() in the same way and actually
    make a use of the new length argument implementing range invalidation.

    Actual file system implementations will follow except the file systems
    where the changes are really simple and should not change the behaviour
    in any way .Implementation for truncate_page_range() which will be able
    to accept page unaligned ranges will follow as well.

    Signed-off-by: Lukas Czerner
    Cc: Andrew Morton
    Cc: Hugh Dickins

    Lukas Czerner
     

10 May, 2013

1 commit

  • When mounting an UBIFS R/W volume, we have the message:
    UBIFS: mounted UBI device 0, volume 1, name "rootfs"(null)
    With this patch, we'll have:
    UBIFS: mounted UBI device 0, volume 1, name "rootfs"
    Which is, I think, what was intended.

    Signed-off-by: Richard Genoud
    Cc: stable@vger.kernel.org [v3.7+]
    Signed-off-by: Artem Bityutskiy

    Richard Genoud
     

08 May, 2013

1 commit

  • Faster kernel compiles by way of fewer unnecessary includes.

    [akpm@linux-foundation.org: fix fallout]
    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     

14 Mar, 2013

1 commit

  • The UBIFS space fixup is a useful feature which allows to fixup the "broken"
    flash space at the time of the first mount. The "broken" space is usually the
    result of using a "dumb" industrial flasher which is not able to skip empty
    NAND pages and just writes all 0xFFs to the empty space, which has grave
    side-effects for UBIFS when UBIFS trise to write useful data to those empty
    pages.

    The fix-up feature works roughly like this:
    1. mkfs.ubifs sets the fixup flag in UBIFS superblock when creating the image
    (see -F option)
    2. when the file-system is mounted for the first time, UBIFS notices the fixup
    flag and re-writes the entire media atomically, which may take really a lot
    of time.
    3. UBIFS clears the fixup flag in the superblock.

    This works fine when the file system is mounted R/W for the very first time.
    But it did not really work in the case when we first mount the file-system R/O,
    and then re-mount R/W. The reason was that we started the fixup procedure too
    late, which we cannot really do because we have to fixup the space before it
    starts being used.

    Signed-off-by: Artem Bityutskiy
    Reported-by: Mark Jackson
    Cc: stable@vger.kernel.org # 3.0+

    Artem Bityutskiy
     

04 Mar, 2013

1 commit

  • Modify the request_module to prefix the file system type with "fs-"
    and add aliases to all of the filesystems that can be built as modules
    to match.

    A common practice is to build all of the kernel code and leave code
    that is not commonly needed as modules, with the result that many
    users are exposed to any bug anywhere in the kernel.

    Looking for filesystems with a fs- prefix limits the pool of possible
    modules that can be loaded by mount to just filesystems trivially
    making things safer with no real cost.

    Using aliases means user space can control the policy of which
    filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
    with blacklist and alias directives. Allowing simple, safe,
    well understood work-arounds to known problematic software.

    This also addresses a rare but unfortunate problem where the filesystem
    name is not the same as it's module name and module auto-loading
    would not work. While writing this patch I saw a handful of such
    cases. The most significant being autofs that lives in the module
    autofs4.

    This is relevant to user namespaces because we can reach the request
    module in get_fs_type() without having any special permissions, and
    people get uncomfortable when a user specified string (in this case
    the filesystem type) goes all of the way to request_module.

    After having looked at this issue I don't think there is any
    particular reason to perform any filtering or permission checks beyond
    making it clear in the module request that we want a filesystem
    module. The common pattern in the kernel is to call request_module()
    without regards to the users permissions. In general all a filesystem
    module does once loaded is call register_filesystem() and go to sleep.
    Which means there is not much attack surface exposed by loading a
    filesytem module unless the filesystem is mounted. In a user
    namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
    which most filesystems do not set today.

    Acked-by: Serge Hallyn
    Acked-by: Kees Cook
    Reported-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

27 Feb, 2013

2 commits

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     
  • Pull ubifs updates from Artem Bityutskiy:
    "It's been quite silent and we have only a couple of bug-fixes for the
    orphans handling code plus one cosmetic change."

    * tag 'upstream-3.9-rc1' of git://git.infradead.org/linux-ubifs:
    UBIFS: fix double free of ubifs_orphan objects
    UBIFS: fix use of freed ubifs_orphan objects
    UBIFS: rename random32() to prandom_u32()

    Linus Torvalds
     

23 Feb, 2013

1 commit


22 Feb, 2013

1 commit

  • When stable pages are required, we have to wait if the page is just
    going to disk and we want to modify it. Add proper callback to
    ubifs_vm_page_mkwrite().

    Signed-off-by: Jan Kara
    Signed-off-by: Darrick J. Wong
    Cc: Artem Bityutskiy
    Cc: Adrian Hunter
    Cc: Andy Lutomirski
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Steven Whitehouse
    Cc: Jens Axboe
    Cc: Eric Van Hensbergen
    Cc: Ron Minnich
    Cc: Latchesar Ionkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

04 Feb, 2013

2 commits

  • The last orphan in the dnext list has its dnext set to NULL. Because
    of that, ubifs_delete_orphan assumes that it is not on the dnext list
    and frees it immediately instead ignoring it as a second delete. The
    orphan is later freed again by erase_deleted.

    This change adds an explicit flag to ubifs_orphan indicating whether
    it is pending delete.

    Signed-off-by: Adam Thomas
    Signed-off-by: Artem Bityutskiy
    Cc: stable@vger.kernel.org

    Adam Thomas
     
  • The last orphan in the cnext list has its cnext set to NULL. Because
    of that, ubifs_delete_orphan assumes that it is not on the cnext list
    and frees it immediately instead of adding it to the dnext list. The
    freed orphan is later modified by write_orph_node.

    This can cause various inconsistencies including directory entries
    that cannot be removed and this error:

    UBIFS error (pid 20685): layout_cnodes: LPT out of space at LEB 14:129009 needing 17, done_ltab 1, done_lsave 1

    This is a regression introduced by
    "7074e5eb UBIFS: remove invalid reference to list iterator variable".

    This change adds an explicit flag to ubifs_orphan indicating whether
    it is pending commit.

    Signed-off-by: Adam Thomas
    Reviewed-by: Adrian Hunter
    Cc: stable@vger.kernel.org # v3.6+
    Signed-off-by: Artem Bityutskiy

    Adam Thomas
     

15 Jan, 2013

1 commit


18 Dec, 2012

2 commits

  • This also converts filling memory loop to use memset.

    Signed-off-by: Akinobu Mita
    Cc: Artem Bityutskiy
    Cc: Adrian Hunter
    Cc: "Theodore Ts'o"
    Cc: David Laight
    Cc: David Woodhouse
    Cc: Eilon Greenstein
    Cc: Michel Lespinasse
    Cc: Robert Love
    Cc: Valdis Kletnieks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • But the kernel decided to call it "origin" instead. Fix most of the
    sites.

    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

26 Oct, 2012

2 commits

  • This is a bugfix for a problem with the following symptoms:

    1. A power cut happens
    2. After reboot, we try to mount UBIFS
    3. Mount fails with "No space left on device" error message

    UBIFS complains like this:

    UBIFS error (pid 28225): grab_empty_leb: could not find an empty LEB

    The root cause of this problem is that when we mount, not all LEBs are
    categorized. Only those which were read are. However, the
    'ubifs_find_free_leb_for_idx()' function assumes that all LEBs were
    categorized and 'c->freeable_cnt' is valid, which is a false assumption.

    This patch fixes the problem by teaching 'ubifs_find_free_leb_for_idx()'
    to always fall back to LPT scanning if no freeable LEBs were found.

    This problem was reported by few people in the past, but Brent Taylor
    was able to reproduce it and send me a flash image which cannot be mounted,
    which made it easy to hunt the bug. Kudos to Brent.

    Reported-by: Brent Taylor
    Signed-off-by: Artem Bityutskiy
    Cc: stable@vger.kernel.org

    Artem Bityutskiy
     
  • This commit is a preparation for a subsequent bugfix. We introduce a
    counter for categorized lprops.

    Signed-off-by: Artem Bityutskiy
    Cc: stable@vger.kernel.org

    Artem Bityutskiy
     

09 Oct, 2012

1 commit

  • Move actual pte filling for non-linear file mappings into the new special
    vma operation: ->remap_pages().

    Filesystems must implement this method to get non-linear mapping support,
    if it uses filemap_fault() then generic_file_remap_pages() can be used.

    Now device drivers can implement this method and obtain nonlinear vma support.

    Signed-off-by: Konstantin Khlebnikov
    Cc: Alexander Viro
    Cc: Carsten Otte
    Cc: Chris Metcalf #arch/tile
    Cc: Cyrill Gorcunov
    Cc: Eric Paris
    Cc: H. Peter Anvin
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: James Morris
    Cc: Jason Baron
    Cc: Kentaro Takeda
    Cc: Matt Helsley
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Suresh Siddha
    Cc: Tetsuo Handa
    Cc: Venkatesh Pallipadi
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

03 Oct, 2012

4 commits

  • Pull ubifs changes from Artem Bityutskiy:
    "No big changes for 3.7 in UBIFS:
    - Error reporting and debug printing improvements
    - Power cut emulation fixes
    - Minor cleanups"

    Fix trivial conflict in fs/ubifs/debug.c due to the user namespace
    changes.

    * tag 'upstream-3.7-rc1' of git://git.infradead.org/linux-ubifs:
    UBIFS: print less
    UBIFS: use pr_ helper instead of printk
    UBIFS: comply with coding style
    UBIFS: use __aligned() attribute
    UBIFS: remove __DATE__ and __TIME__
    UBIFS: fix power cut emulation for mtdram
    UBIFS: improve scanning debug output
    UBIFS: always print full error reports
    UBIFS: print PID in debug messages

    Linus Torvalds
     
  • Pull vfs update from Al Viro:

    - big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c spiltoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     
  • There's no reason to call rcu_barrier() on every
    deactivate_locked_super(). We only need to make sure that all delayed rcu
    free inodes are flushed before we destroy related cache.

    Removing rcu_barrier() from deactivate_locked_super() affects some fast
    paths. E.g. on my machine exit_group() of a last process in IPC
    namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.

    Signed-off-by: Kirill A. Shutemov
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Kirill A. Shutemov
     
  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values are far down into
    subsystems and filesystems as reasonable. Leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network.
    Letting compile type incompatible compile errors (present when user
    namespaces are enabled) guide me to find the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privlige checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
    root in a user names to do those things that today we only forbid to
    non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/git logging code.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds