08 Apr, 2014

1 commit

  • Pull ext3 improvements, cleanups, reiserfs fix from Jan Kara:
    "various cleanups for ext2, ext3, udf, isofs, a documentation update
    for quota, and a fix of a race in reiserfs readdir implementation"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    reiserfs: fix race in readdir
    ext2: acl: remove unneeded include of linux/capability.h
    ext3: explicitly remove inode from orphan list after failed direct io
    fs/isofs/inode.c add __init to init_inodecache()
    ext3: Speedup WB_SYNC_ALL pass
    fs/quota/Kconfig: Update filesystems
    ext3: Update outdated comment before ext3_ordered_writepage()
    ext3: Update PF_MEMALLOC handling in ext3_write_inode()
    ext2/3: use prandom_u32() instead of get_random_bytes()
    ext3: remove an unneeded check in ext3_new_blocks()
    ext3: remove unneeded check in ext3_ordered_writepage()
    fs: Mark function as static in ext3/xattr_security.c
    fs: Mark function as static in ext3/dir.c
    fs: Mark function as static in ext2/xattr_security.c
    ext3: Add __init macro to init_inodecache
    ext2: Add __init macro to init_inodecache
    udf: Add __init macro to init_inodecache
    fs: udf: parse_options: blocksize check

    Linus Torvalds
     

13 Mar, 2014

1 commit

  • Previously, the no-op "mount -o remount /dev/xxx" operation when the
    file system is already mounted read-write causes an implied,
    unconditional syncfs(). This seems pretty stupid, and it's certainly
    not documented or guaranteed to do this, nor is it particularly useful,
    except in the case where the file system was mounted rw and is getting
    remounted read-only.

    However, it's possible that there might be some file systems that are
    actually depending on this behavior. In most file systems, it's
    probably fine to only call sync_filesystem() when transitioning from
    read-write to read-only, and there are some file systems where this is
    not needed at all (for example, for a pseudo-filesystem or something
    like romfs).
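
A minimal sketch of the "sync only on a read-write to read-only transition" policy mentioned above. This is user-space C for illustration, not the change made by the commit; SKETCH_RDONLY and remount_needs_sync() are hypothetical names standing in for the kernel's mount flag and for a file system's remount logic.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's read-only mount flag. */
#define SKETCH_RDONLY 0x1UL

/*
 * Decide whether a remount should sync the file system first.
 * Under the policy described above, that is only the case when a
 * file system that is currently read-write is being made read-only.
 */
static bool remount_needs_sync(unsigned long cur_flags, unsigned long new_flags)
{
    bool was_rw = (cur_flags & SKETCH_RDONLY) == 0;
    bool becomes_ro = (new_flags & SKETCH_RDONLY) != 0;

    return was_rw && becomes_ro;
}
```

A no-op remount (rw to rw, or ro to ro) returns false here, so the implied sync would be skipped in those cases.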

    Signed-off-by: "Theodore Ts'o"
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Christoph Hellwig
    Cc: Artem Bityutskiy
    Cc: Adrian Hunter
    Cc: Evgeniy Dushistov
    Cc: Jan Kara
    Cc: OGAWA Hirofumi
    Cc: Anders Larsen
    Cc: Phillip Lougher
    Cc: Kees Cook
    Cc: Mikulas Patocka
    Cc: Petr Vandrovec
    Cc: xfs@oss.sgi.com
    Cc: linux-btrfs@vger.kernel.org
    Cc: linux-cifs@vger.kernel.org
    Cc: samba-technical@lists.samba.org
    Cc: codalist@coda.cs.cmu.edu
    Cc: linux-ext4@vger.kernel.org
    Cc: linux-f2fs-devel@lists.sourceforge.net
    Cc: fuse-devel@lists.sourceforge.net
    Cc: cluster-devel@redhat.com
    Cc: linux-mtd@lists.infradead.org
    Cc: jfs-discussion@lists.sourceforge.net
    Cc: linux-nfs@vger.kernel.org
    Cc: linux-nilfs@vger.kernel.org
    Cc: linux-ntfs-dev@lists.sourceforge.net
    Cc: ocfs2-devel@oss.oracle.com
    Cc: reiserfs-devel@vger.kernel.org

    Theodore Ts'o
     

03 Mar, 2014

2 commits


19 Oct, 2013

1 commit

  • The UDF driver was not strict enough about checking the IDs in the
    VSDs when mounting, which resulted in reading through all the sectors
    of the block device in some unfortunate cases. E.g., trying to mount my
    uninitialized 200G SSD partition (all 0xFF bytes) took ~350 minutes to
    fail, because the code expected some of the valid IDs or a zero byte.
    During this, the mount couldn't be killed, sync from the cmdline
    blocked, and the machine hung during shutdown. Valid filesystems
    (extX, btrfs, ntfs) were rejected by the mere accident of having a
    zero byte at just the right place in some of their sectors, close
    enough to the beginning not to generate excess I/O. The fix adds a
    hard limit on the VSD sector offset, adds the two missing VSD IDs, and
    stops scanning when encountering an invalid ID. Also replaced the
    magic number 32768 with a more meaningful #define, and suppressed the
    bogus message about failing to read the first sector if no UDF fs was
    detected.
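
The bounded scan described above might look roughly like the following user-space sketch over an in-memory image. The descriptor size, the position of the ID inside a descriptor, the limit of 64 descriptors, and the list of identifiers are assumptions made for illustration; the commit message does not name the two added IDs or the actual limit.

```c
#include <stddef.h>
#include <string.h>

#define VSD_FIRST_SECTOR_OFFSET 32768u  /* replaces the bare magic number */
#define VSD_SIZE                2048u   /* assumed size of one descriptor */
#define VSD_MAX_DESCRIPTORS     64u     /* illustrative hard limit, not the kernel's */
#define VSD_ID_LEN              5u

/* Assumed set of valid identifiers; treat the exact list as hypothetical. */
static const char *const vsd_known_ids[] = {
    "BEA01", "BOOT2", "CD001", "CDW02", "NSR02", "NSR03", "TEA01",
};

static int vsd_id_known(const unsigned char *id)
{
    size_t i;

    for (i = 0; i < sizeof(vsd_known_ids) / sizeof(vsd_known_ids[0]); i++)
        if (memcmp(id, vsd_known_ids[i], VSD_ID_LEN) == 0)
            return 1;
    return 0;
}

/*
 * Returns 1 if an NSR02/NSR03 descriptor is found, 0 otherwise.
 * *examined reports how many descriptors were actually inspected.
 */
static int vsd_scan(const unsigned char *dev, size_t dev_size, size_t *examined)
{
    size_t n;

    *examined = 0;
    for (n = 0; n < VSD_MAX_DESCRIPTORS; n++) {
        size_t off = VSD_FIRST_SECTOR_OFFSET + n * VSD_SIZE;
        const unsigned char *id;

        if (off + VSD_SIZE > dev_size)
            break;              /* ran off the end of the image */
        (*examined)++;
        id = dev + off + 1;     /* ID assumed to follow a one-byte type field */
        if (!vsd_id_known(id))
            break;              /* invalid ID: stop scanning immediately */
        if (memcmp(id, "NSR02", VSD_ID_LEN) == 0 ||
            memcmp(id, "NSR03", VSD_ID_LEN) == 0)
            return 1;
    }
    return 0;
}
```

With an image filled with 0xFF bytes, this sketch gives up after inspecting a single descriptor instead of walking the whole device.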

    Signed-off-by: Peter A. Felvegi
    Signed-off-by: Jan Kara

    Peter A. Felvegi
     

24 Sep, 2013

1 commit

  • A user has reported an oops in udf_statfs() that was caused by
    numOfPartitions entry in LVID structure being corrupted. Fix the problem
    by verifying whether numOfPartitions makes sense at least to the extent
    that LVID fits into a single block as it should.
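
One way to express the "LVID must fit into a single block" check is sketched below. It assumes the variable part of the LVID consists of two 32-bit table entries per partition plus an implementation-use area; the function name and parameters are hypothetical, and the fixed header size is passed in rather than taken from the on-disk structure definition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Check that an LVID with the given partition count fits in one block.
 * fixed_size is the size of the fixed part of the descriptor and
 * imp_use_len the length of the implementation-use area.
 */
static bool lvid_fits_in_block(uint32_t num_partitions, uint32_t imp_use_len,
                               size_t fixed_size, size_t block_size)
{
    size_t room;

    if (fixed_size > block_size)
        return false;
    room = block_size - fixed_size;
    if (imp_use_len > room)
        return false;
    room -= imp_use_len;

    /* Divide instead of multiplying so a corrupted count cannot overflow. */
    return num_partitions <= room / (2 * sizeof(uint32_t));
}
```

A corrupted count such as 0xFFFFFFFF is rejected without the multiplication ever being performed.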

    Reported-by: Juergen Weigert
    Signed-off-by: Jan Kara

    Jan Kara
     

01 Aug, 2013

2 commits

  • Refuse RW mount of udf filesystem. So far we just silently changed it
    to RO mount but when the media is writeable, block layer won't notice
    this change and thus will think device is used RW and will block eject
    button of the drive. That is unexpected by users because for
    non-writeable media eject button works just fine.

    Userspace mount(8) command handles this just fine and retries mounting
    with MS_RDONLY set so userspace shouldn't see any regression. Plus any
    tool mounting udf is likely confronted with the case of read-only
    media where block layer already refuses to mount the filesystem without
    MS_RDONLY set so our behavior shouldn't be anything new for it.

    Reported-by: Hui Wang
    Signed-off-by: Jan Kara

    Jan Kara
     
  • Change all functions used in filesystem discovery during mount to use
    standard kernel return values: -errno on error, 0 on success, instead
    of 1 on failure and 0 on success. This allows us to pass an error number
    (not just failure / success) so we can abort device scanning earlier
    in case of errors like EIO or ENOMEM. Also we will be able to return
    EROFS in case writeable mount is requested but writing isn't supported.
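
The benefit of the -errno convention can be sketched as follows. The example assumes that -EAGAIN is used to mean "nothing found at this location, keep looking", while any other negative value is a hard error; that choice and the function name are illustrative, not taken from the commit message.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Walk the per-location probe results in order.
 * 0 means success, -EAGAIN means "not here, try the next location",
 * and any other negative errno aborts the scan at once.
 * *tried reports how many locations were probed.
 */
static int scan_candidates(const int *results, size_t n, size_t *tried)
{
    size_t i;

    *tried = 0;
    for (i = 0; i < n; i++) {
        int ret = results[i];

        (*tried)++;
        if (ret != -EAGAIN)
            return ret;     /* success, or a hard error such as -EIO */
    }
    return -EINVAL;         /* no location held a usable descriptor */
}
```

With a plain 1/0 result the caller could not tell -EIO from "not found" and would keep probing a failing device.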

    Signed-off-by: Jan Kara

    Jan Kara
     

11 Mar, 2013

1 commit


06 Feb, 2013

2 commits


22 Jan, 2013

1 commit

  • This patch implements extent caching in case of file reading.
    While reading a file, currently, UDF reads metadata serially
    which takes a lot of time depending on the number of extents present
    in the file. Caching the last accessed extent improves metadata read time.
    Instead of reading file metadata from start, now we read from
    the cached extent.

    This patch considerably improves the time spent by CPU in kernel mode.
    For example, while reading a 10.9 GB file using dd:
    Time before applying patch:
    11677022208 bytes (10.9GB) copied, 1529.748921 seconds, 7.3MB/s
    real 25m 29.85s
    user 0m 12.41s
    sys 15m 34.75s

    Time after applying patch:
    11677022208 bytes (10.9GB) copied, 1469.338231 seconds, 7.6MB/s
    real 24m 29.44s
    user 0m 15.73s
    sys 3m 27.61s
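
The idea of caching the last accessed extent can be illustrated with a small user-space sketch. The structures and names below are hypothetical and much simpler than the driver's; they only show why a sequential read no longer has to walk the extent list from the beginning each time.

```c
#include <stddef.h>
#include <stdint.h>

struct extent {
    uint64_t len;           /* length of the extent in bytes */
};

struct extent_cache {
    size_t   index;         /* index of the last extent found */
    uint64_t start;         /* file offset at which that extent begins */
    int      valid;         /* non-zero once the cache holds a position */
};

/*
 * Return the index of the extent containing offset, or -1 if the offset
 * lies beyond the last extent. *steps counts the extents visited.
 */
static long find_extent(const struct extent *ext, size_t n, uint64_t offset,
                        struct extent_cache *cache, size_t *steps)
{
    size_t i = 0;
    uint64_t start = 0;

    *steps = 0;
    /* Resume from the cached extent when the offset is not behind it. */
    if (cache->valid && offset >= cache->start) {
        i = cache->index;
        start = cache->start;
    }
    for (; i < n; i++) {
        (*steps)++;
        if (offset < start + ext[i].len) {
            cache->index = i;
            cache->start = start;
            cache->valid = 1;
            return (long)i;
        }
        start += ext[i].len;
    }
    return -1;
}
```

For a forward sequential read each lookup visits only one or two extents, whereas a walk from the start grows with the file position. A backward seek falls back to the full walk in this sketch.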

    [JK: Fix bh refcounting issues, simplify initialization]

    Signed-off-by: Namjae Jeon
    Signed-off-by: Ashish Sangwan
    Signed-off-by: Bonggil Bak
    Signed-off-by: Jan Kara

    Namjae Jeon
     

21 Jan, 2013

1 commit

  • So far we just marked the buffer as dirty and left the writing to the flusher
    thread, but especially on opening that leaves a possible race window where we
    could write other modified fs structures to disk before we mark the filesystem
    as open. So sync the LVID buffer to disk after opening and closing the fs.

    Reported-by: Steve Nickel
    Signed-off-by: Jan Kara

    Jan Kara
     

15 Jan, 2013

1 commit

  • This patch fixes a regression caused by commit bff943af6fe "udf: Fix memory
    leak when mounting", which triggered a kernel NULL pointer dereference
    when a mount was interrupted or when allocating memory for
    sbi->s_partmaps failed in function udf_sb_alloc_partition_maps.

    Reported-and-tested-by: James Hogan
    Signed-off-by: Namjae Jeon
    Signed-off-by: Ashish Sangwan
    Signed-off-by: Jan Kara

    Namjae Jeon
     

03 Oct, 2012

3 commits

  • Pull vfs update from Al Viro:

    "- big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c splitoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     
  • There's no reason to call rcu_barrier() on every
    deactivate_locked_super(). We only need to make sure that all delayed rcu
    free inodes are flushed before we destroy related cache.

    Removing rcu_barrier() from deactivate_locked_super() affects some fast
    paths. E.g. on my machine exit_group() of a last process in IPC
    namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.

    Signed-off-by: Kirill A. Shutemov
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Kirill A. Shutemov
     
  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values as far down into
    subsystems and filesystems as reasonable. Leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network.
    Letting type-incompatibility compile errors (present when user
    namespaces are enabled) guide me to find the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privilege checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
    root in a user namespace to do those things that today we only forbid to
    non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/gid logging code.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds
     

21 Sep, 2012

1 commit


15 Aug, 2012

2 commits


11 Jul, 2012

1 commit

  • When a partition table length is corrupted to be close to 1 << 32, the
    check for its length may overflow on 32-bit systems and we will think
    the length is valid. Later on the kernel can crash trying to read beyond
    end of buffer. Fix the check to avoid possible overflow.
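
The overflow described above can be reproduced in user space with fixed-width arithmetic. Both functions below are illustrative sketches with hypothetical names and parameters; the first shows the overflow-prone form of the check, the second a rearranged form that cannot wrap.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Overflow-prone form: header_size + table_len is computed in 32 bits,
 * so a table_len close to 1 << 32 wraps around to a small value.
 */
static bool table_fits_unsafe(uint32_t header_size, uint32_t table_len,
                              uint32_t block_size)
{
    return header_size + table_len <= block_size;
}

/* Safe form: subtract from the block size instead of adding to the length. */
static bool table_fits_safe(uint32_t header_size, uint32_t table_len,
                            uint32_t block_size)
{
    if (header_size > block_size)
        return false;
    return table_len <= block_size - header_size;
}
```

With a header of 440 bytes, a 2048-byte block and a corrupted length of 0xFFFFFFF0, the unsafe check accepts the table while the safe one rejects it.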

    CC: stable@vger.kernel.org
    Reported-by: Ben Hutchings
    Signed-off-by: Jan Kara

    Jan Kara
     

09 Jul, 2012

2 commits

  • When we are mounting filesystem, we can load one partition table before
    finding out that we cannot complete processing of logical volume descriptor
    and trying the reserve descriptor. Free the table properly before trying
    the reserve descriptor.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • The UDF file-system does not need the 's_dirt' superblock flag because it does
    not define the 'write_super()' method. This flag was set to 1 in a few places and
    set to 0 in '->sync_fs()' and was basically useless. Stop using it because it
    is on its way out.

    Signed-off-by: Artem Bityutskiy
    Signed-off-by: Jan Kara

    Artem Bityutskiy
     

29 Jun, 2012

3 commits


29 Mar, 2012

1 commit

  • Pull ext3, UDF, and quota fixes from Jan Kara:
    "A couple of ext3 & UDF fixes and also one improvement in quota
    locking."

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext3: fix start and len arguments handling in ext3_trim_fs()
    udf: Fix deadlock in udf_release_file()
    udf: Fix file entry logicalBlocksRecorded
    udf: Fix handling of i_blocks
    quota: Make quota code not call tty layer with dqptr_sem held
    udf: Init/maintain file entry checkpoint field
    ext3: Update ctime in ext3_splice_branch() only when needed
    ext3: Don't call dquot_free_block() if we don't update anything
    udf: Remove unnecessary OOM messages

    Linus Torvalds
     

21 Mar, 2012

2 commits


01 Mar, 2012

1 commit


10 Jan, 2012

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext2/3/4: delete unneeded includes of module.h
    ext{3,4}: Fix potential race when setversion ioctl updates inode
    udf: Mark LVID buffer as uptodate before marking it dirty
    ext3: Don't warn from writepage when readonly inode is spotted after error
    jbd: Remove j_barrier mutex
    reiserfs: Force inode evictions before umount to avoid crash
    reiserfs: Fix quota mount option parsing
    udf: Treat symlink component of type 2 as /
    udf: Fix deadlock when converting file from in-ICB one to normal one
    udf: Cleanup calling convention of inode_getblk()
    ext2: Fix error handling on inode bitmap corruption
    ext3: Fix error handling on inode bitmap corruption
    ext3: replace ll_rw_block with other functions
    ext3: NULL dereference in ext3_evict_inode()
    jbd: clear revoked flag on buffers before a new transaction started
    ext3: call ext3_mark_recovery_complete() when recovery is really needed

    Linus Torvalds
     

09 Jan, 2012

1 commit


07 Jan, 2012

1 commit


04 Jan, 2012

2 commits

  • note re mount options: fmask and dmask are explicitly truncated to 12bit,
    UDF_INVALID_MODE just needs to be guaranteed to differ from any such value.
    And umask is used only in &= with umode_t, so we ignore other bits anyway.
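
The reasoning about the sentinel can be checked with a short sketch. It assumes a 16-bit mode type and an all-ones sentinel; SKETCH_INVALID_MODE and the helper names are hypothetical stand-ins for UDF_INVALID_MODE and the option parser.

```c
#include <stdint.h>

typedef uint16_t sketch_mode_t;             /* assumed 16-bit mode type */

#define SKETCH_MODE_BITS    07777u          /* the low 12 permission bits */
#define SKETCH_INVALID_MODE ((sketch_mode_t)-1)  /* hypothetical sentinel: all bits set */

/* Truncate a user-supplied fmask/dmask option to 12 bits. */
static sketch_mode_t parse_mask(unsigned int option)
{
    return (sketch_mode_t)(option & SKETCH_MODE_BITS);
}

/* A mask counts as "set" when it differs from the sentinel. */
static int mask_is_set(sketch_mode_t mask)
{
    return mask != SKETCH_INVALID_MODE;
}
```

Because a truncated value never has a bit set above the low 12, it can never compare equal to a sentinel that does.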

    Signed-off-by: Al Viro

    Al Viro
     
  • Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
    it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
    the cost of taking it into inode_init_always() will be negligible for pipes
    and sockets and negative for everything else. Not to mention the removal of
    boilerplate code from ->destroy_inode() instances...

    Signed-off-by: Al Viro

    Al Viro
     

01 Nov, 2011

5 commits