06 Feb, 2013

2 commits


22 Jan, 2013

1 commit

  • This patch implements extent caching in case of file reading.
    While reading a file, currently, UDF reads metadata serially
    which takes a lot of time depending on the number of extents present
    in the file. Caching the last accessed extent improves metadata read time.
    Instead of reading file metadata from start, now we read from
    the cached extent.

    This patch considerably reduces the time the CPU spends in kernel mode.
    For example, while reading a 10.9 GB file using dd:
    Time before applying patch:
    11677022208 bytes (10.9GB) copied, 1529.748921 seconds, 7.3MB/s
    real 25m 29.85s
    user 0m 12.41s
    sys 15m 34.75s

    Time after applying patch:
    11677022208 bytes (10.9GB) copied, 1469.338231 seconds, 7.6MB/s
    real 24m 29.44s
    user 0m 15.73s
    sys 3m 27.61s

    [JK: Fix bh refcounting issues, simplify initialization]

    Signed-off-by: Namjae Jeon
    Signed-off-by: Ashish Sangwan
    Signed-off-by: Bonggil Bak
    Signed-off-by: Jan Kara

    Namjae Jeon
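
The caching idea described above can be sketched in user-space C. This is a minimal illustration under stated assumptions, not the UDF implementation: `struct extent`, `struct extent_cache`, and `find_extent()` are hypothetical names, and a file is modelled as a plain array of extent lengths rather than on-disk allocation descriptors.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model: a file is an array of extents. */
struct extent {
    uint64_t len;            /* length of this extent in bytes */
};

/* Remembers where the previous lookup ended. */
struct extent_cache {
    int      valid;          /* nonzero once a lookup has succeeded */
    size_t   index;          /* index of the last extent found */
    uint64_t start;          /* file offset at which that extent begins */
};

/*
 * Find the extent covering file offset `pos`. Returns its index, or -1 if
 * `pos` lies past the end of the file. `*steps` is incremented once for
 * every extent visited, so callers can observe the cost of the walk.
 */
long find_extent(const struct extent *ext, size_t n, uint64_t pos,
                 struct extent_cache *cache, unsigned long *steps)
{
    size_t i = 0;
    uint64_t start = 0;

    /* Resume from the cached extent when the target is at or beyond it. */
    if (cache && cache->valid && pos >= cache->start) {
        i = cache->index;
        start = cache->start;
    }

    for (; i < n; i++) {
        (*steps)++;
        if (pos - start < ext[i].len) {
            if (cache) {
                cache->valid = 1;
                cache->index = i;
                cache->start = start;
            }
            return (long)i;
        }
        start += ext[i].len;
    }
    return -1;
}
```

With four 10-byte extents, a first lookup of offset 35 visits four extents; a following lookup of offset 37 starts at the cached extent and visits only one. Sequential reads follow this pattern, which is where the saving comes from in this sketch.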
     

21 Jan, 2013

1 commit

  • So far we just marked the buffer as dirty and left the writing to the flusher
    thread. Especially on opening, that leaves a race window in which we could
    write other modified fs structures to disk before we mark the filesystem as
    open. So sync the LVID buffer to disk after opening and closing the fs.

    Reported-by: Steve Nickel
    Signed-off-by: Jan Kara

    Jan Kara
     

15 Jan, 2013

1 commit

  • This patch fixes a regression caused by commit bff943af6fe "udf: Fix memory
    leak when mounting", which triggered a kernel NULL pointer dereference
    when a mount was interrupted or when allocating memory for
    sbi->s_partmaps failed in udf_sb_alloc_partition_maps().

    Reported-and-tested-by: James Hogan
    Signed-off-by: Namjae Jeon
    Signed-off-by: Ashish Sangwan
    Signed-off-by: Jan Kara

    Namjae Jeon
     

03 Oct, 2012

3 commits

  • Pull vfs update from Al Viro:

    "- big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c split-off - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     
  • There's no reason to call rcu_barrier() on every
    deactivate_locked_super(). We only need to make sure that all delayed rcu
    free inodes are flushed before we destroy related cache.

    Removing rcu_barrier() from deactivate_locked_super() affects some fast
    paths. E.g. on my machine exit_group() of a last process in IPC
    namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.

    Signed-off-by: Kirill A. Shutemov
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Kirill A. Shutemov
     
  • Pull user namespace changes from Eric Biederman:
    "This is a mostly modest set of changes to enable basic user namespace
    support. This allows the code to compile with user namespaces
    enabled and removes the assumption there is only the initial user
    namespace. Everything is converted except for the most complex of the
    filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
    nfs, ocfs2 and xfs as those patches need a bit more review.

    The strategy is to push kuid_t and kgid_t values as far down into
    subsystems and filesystems as reasonable, leaving the make_kuid and
    from_kuid operations to happen at the edge of userspace, as the values
    come off the disk, and as the values come in from the network.
    I let the type-incompatibility compile errors (present when user
    namespaces are enabled) guide me to the issues.

    The most tricky areas have been the places where we had an implicit
    union of uid and gid values and were storing them in an unsigned int.
    Those places were converted into explicit unions. I made certain to
    handle those places with simple trivial patches.

    Out of that work I discovered we have generic interfaces for storing
    quota by projid. I had never heard of the project identifiers before.
    Adding full user namespace support for project identifiers accounts
    for most of the code size growth in my git tree.

    Ultimately there will be work to relax privilege checks from
    "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
    root in a user namespace to do those things that today we only forbid to
    non-root users because it will confuse suid root applications.

    While I was pushing kuid_t and kgid_t changes deep into the audit code
    I made a few other cleanups. I capitalized on the fact we process
    netlink messages in the context of the message sender. I removed
    usage of NETLINK_CRED, and started directly using current->tty.

    Some of these patches have also made it into maintainer trees, with no
    problems from identical code from different trees showing up in
    linux-next.

    After reading through all of this code I feel like I might be able to
    win a game of kernel trivial pursuit."

    Fix up some fairly trivial conflicts in netfilter uid/gid logging code.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
    userns: Convert the ufs filesystem to use kuid/kgid where appropriate
    userns: Convert the udf filesystem to use kuid/kgid where appropriate
    userns: Convert ubifs to use kuid/kgid
    userns: Convert squashfs to use kuid/kgid where appropriate
    userns: Convert reiserfs to use kuid and kgid where appropriate
    userns: Convert jfs to use kuid/kgid where appropriate
    userns: Convert jffs2 to use kuid and kgid where appropriate
    userns: Convert hpfs to use kuid and kgid where appropriate
    userns: Convert btrfs to use kuid/kgid where appropriate
    userns: Convert bfs to use kuid/kgid where appropriate
    userns: Convert affs to use kuid/kgid wherwe appropriate
    userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
    userns: On ia64 deal with current_uid and current_gid being kuid and kgid
    userns: On ppc convert current_uid from a kuid before printing.
    userns: Convert s390 getting uid and gid system calls to use kuid and kgid
    userns: Convert s390 hypfs to use kuid and kgid where appropriate
    userns: Convert binder ipc to use kuids
    userns: Teach security_path_chown to take kuids and kgids
    userns: Add user namespace support to IMA
    userns: Convert EVM to deal with kuids and kgids in it's hmac computation
    ...

    Linus Torvalds
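
The kuid_t strategy from the user namespace pull message can be illustrated with a small stand-alone C sketch. All names carrying `demo` are hypothetical: the mapping is reduced to a single contiguous range, which is an assumption made for brevity and not a description of how the kernel stores its mappings. The point is only that wrapping the id in a struct makes accidental mixing with a plain integer a compile error, while conversion happens explicitly at the boundary.

```c
/* Wrapping the raw id in a struct stops it from being mixed with integers. */
typedef struct {
    unsigned int val;
} kuid_demo_t;

#define DEMO_INVALID_ID ((unsigned int)-1)

/* Hypothetical namespace with one contiguous id range. */
struct user_ns_demo {
    unsigned int first;        /* first id as seen inside the namespace */
    unsigned int lower_first;  /* id it maps to in the global id space */
    unsigned int count;        /* number of ids in the range */
};

/* Namespace-local id to global id (the "edge of userspace" direction). */
kuid_demo_t make_kuid_demo(const struct user_ns_demo *ns, unsigned int uid)
{
    kuid_demo_t k = { DEMO_INVALID_ID };

    /* Unsigned subtraction wraps for uid < first, so one test covers both ends. */
    if (uid - ns->first < ns->count)
        k.val = ns->lower_first + (uid - ns->first);
    return k;
}

/* Global id back to the namespace-local id, or DEMO_INVALID_ID if unmapped. */
unsigned int from_kuid_demo(const struct user_ns_demo *ns, kuid_demo_t k)
{
    if (k.val - ns->lower_first < ns->count)
        return ns->first + (k.val - ns->lower_first);
    return DEMO_INVALID_ID;
}
```

With a namespace of `{0, 100000, 65536}`, local uid 1000 converts to global value 101000 and back to 1000, while local uid 70000 has no mapping and yields the invalid value. Assigning a `kuid_demo_t` to an `unsigned int` without going through `from_kuid_demo()` fails to compile, which is the kind of error the message describes using as a guide.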
     

21 Sep, 2012

1 commit


15 Aug, 2012

2 commits


11 Jul, 2012

1 commit

  • When a partition table length is corrupted to be close to 1 << 32, the
    check for its length may overflow on 32-bit systems and we will think
    the length is valid. Later on the kernel can crash trying to read beyond
    end of buffer. Fix the check to avoid possible overflow.

    CC: stable@vger.kernel.org
    Reported-by: Ben Hutchings
    Signed-off-by: Jan Kara

    Jan Kara
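
The class of bug described above can be shown with a minimal C sketch. The function names are hypothetical and the code is not taken from the patch; it only contrasts a length check whose addition can wrap in 32-bit arithmetic with one rearranged so that no intermediate value can overflow.

```c
#include <stdint.h>

/*
 * Fragile form: when len is close to 1 << 32, offset + len wraps around
 * in 32-bit arithmetic and the sum looks small, so the check passes.
 */
int range_ok_wrapping(uint32_t offset, uint32_t len, uint32_t bufsize)
{
    return (uint32_t)(offset + len) <= bufsize;
}

/*
 * Safer form: compare len against the space remaining after offset.
 * bufsize - offset cannot wrap because offset <= bufsize is tested first.
 */
int range_ok(uint32_t offset, uint32_t len, uint32_t bufsize)
{
    return offset <= bufsize && len <= bufsize - offset;
}
```

For a 2048-byte buffer, offset 16 and length 0xFFFFFFF8, the wrapping check computes a sum of 8 and accepts the range, while the rearranged check rejects it.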
     

09 Jul, 2012

2 commits

  • When we are mounting filesystem, we can load one partition table before
    finding out that we cannot complete processing of logical volume descriptor
    and trying the reserve descriptor. Free the table properly before trying
    the reserve descriptor.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • The UDF file-system does not need the 's_dirt' superblock flag because it does
    not define the 'write_super()' method. This flag was set to 1 in a few places and
    set to 0 in '->sync_fs()' and was basically useless. Stop using it because it
    is on its way out.

    Signed-off-by: Artem Bityutskiy
    Signed-off-by: Jan Kara

    Artem Bityutskiy
     

29 Jun, 2012

3 commits


29 Mar, 2012

1 commit

  • Pull ext3, UDF, and quota fixes from Jan Kara:
    "A couple of ext3 & UDF fixes and also one improvement in quota
    locking."

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext3: fix start and len arguments handling in ext3_trim_fs()
    udf: Fix deadlock in udf_release_file()
    udf: Fix file entry logicalBlocksRecorded
    udf: Fix handling of i_blocks
    quota: Make quota code not call tty layer with dqptr_sem held
    udf: Init/maintain file entry checkpoint field
    ext3: Update ctime in ext3_splice_branch() only when needed
    ext3: Don't call dquot_free_block() if we don't update anything
    udf: Remove unnecessary OOM messages

    Linus Torvalds
     

21 Mar, 2012

2 commits


01 Mar, 2012

1 commit


10 Jan, 2012

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    ext2/3/4: delete unneeded includes of module.h
    ext{3,4}: Fix potential race when setversion ioctl updates inode
    udf: Mark LVID buffer as uptodate before marking it dirty
    ext3: Don't warn from writepage when readonly inode is spotted after error
    jbd: Remove j_barrier mutex
    reiserfs: Force inode evictions before umount to avoid crash
    reiserfs: Fix quota mount option parsing
    udf: Treat symlink component of type 2 as /
    udf: Fix deadlock when converting file from in-ICB one to normal one
    udf: Cleanup calling convention of inode_getblk()
    ext2: Fix error handling on inode bitmap corruption
    ext3: Fix error handling on inode bitmap corruption
    ext3: replace ll_rw_block with other functions
    ext3: NULL dereference in ext3_evict_inode()
    jbd: clear revoked flag on buffers before a new transaction started
    ext3: call ext3_mark_recovery_complete() when recovery is really needed

    Linus Torvalds
     

09 Jan, 2012

1 commit


07 Jan, 2012

1 commit


04 Jan, 2012

2 commits

  • note re mount options: fmask and dmask are explicitly truncated to 12bit,
    UDF_INVALID_MODE just needs to be guaranteed to differ from any such value.
    And umask is used only in &= with umode_t, so we ignore other bits anyway.

    Signed-off-by: Al Viro

    Al Viro
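
The reasoning in the mount-option note can be checked with a short C sketch. Everything named `demo` or `DEMO_` is a hypothetical stand-in (a 16-bit mode type and an all-ones sentinel are assumptions for illustration); the sketch shows why a value truncated to 12 bits can never collide with such a sentinel, and why extra high bits in a umask are harmless when it is only used to clear bits of a narrower mode.

```c
/* Hypothetical 16-bit mode type standing in for a narrow mode field. */
typedef unsigned short demo_mode_t;

#define DEMO_PERM_MASK    07777u             /* the low 12 permission bits */
#define DEMO_INVALID_MODE ((demo_mode_t)-1)  /* sentinel: "option not given" */

/* Truncate a user-supplied mask to 12 bits, as the note describes. */
demo_mode_t parse_mask(unsigned int option)
{
    return (demo_mode_t)(option & DEMO_PERM_MASK);
}

/* A truncated value is at most 07777, so it never equals the sentinel. */
int mask_is_set(demo_mode_t mask)
{
    return mask != DEMO_INVALID_MODE;
}

/* Bits of umask above the width of the mode have no effect on the result. */
demo_mode_t apply_umask(demo_mode_t mode, unsigned int umask)
{
    return (demo_mode_t)(mode & ~umask);
}
```

Here `parse_mask(0xFFFFFFFF)` yields 07777, which differs from the sentinel, and `apply_umask(0100666, 0xFFFF0000 | 022)` yields 0100644 regardless of the stray upper bits.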
     
  • Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
    it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
    the cost of taking it into inode_init_always() will be negligible for pipes
    and sockets and negative for everything else. Not to mention the removal of
    boilerplate code from ->destroy_inode() instances...

    Signed-off-by: Al Viro

    Al Viro
     

01 Nov, 2011

5 commits


11 Oct, 2011

3 commits

  • Rename udf_warning to udf_warn for consistency with normal logging
    uses of pr_warn.

    Rename function udf_warning to _udf_warn.
    Remove __func__ from uses and move __func__ to a new udf_warn
    macro that calls _udf_warn.
    Add \n's to uses of udf_warn, remove \n from _udf_warn.
    Coalesce formats.

    Reviewed-by: NamJae Jeon
    Signed-off-by: Joe Perches
    Signed-off-by: Jan Kara

    Joe Perches
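
The macro-plus-function pattern described in this commit can be sketched in stand-alone C. The names `demo_warn`, `demo_warn_impl`, `last_msg`, and `check_block` are hypothetical, and the message prefix is an assumption rather than the format UDF prints; the sketch relies on the GNU `##__VA_ARGS__` extension supported by GCC and Clang.

```c
#include <stdarg.h>
#include <stdio.h>

/* Last formatted message, kept so the result can be inspected. */
char last_msg[256];

/* Plays the role of the underscore-prefixed function: takes the caller's name. */
void demo_warn_impl(const char *function, const char *fmt, ...)
{
    char body[200];
    va_list args;

    va_start(args, fmt);
    vsnprintf(body, sizeof(body), fmt, args);
    va_end(args);

    /* No "\n" is appended here; each call site supplies its own. */
    snprintf(last_msg, sizeof(last_msg), "warning (%s): %s", function, body);
    fputs(last_msg, stderr);
}

/* The macro injects __func__ so call sites no longer pass it by hand. */
#define demo_warn(fmt, ...) demo_warn_impl(__func__, fmt, ##__VA_ARGS__)

/* Example call site: only the message and its arguments are written out. */
void check_block(int block)
{
    demo_warn("bad block %d\n", block);
}
```

Calling `check_block(7)` leaves `"warning (check_block): bad block 7\n"` in `last_msg`, with the function name supplied by the macro rather than the caller.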
     
  • Rename udf_error to udf_err for consistency with normal logging
    uses of pr_err.

    Rename function udf_error to _udf_err.
    Remove __func__ from uses and move __func__ to a new udf_err
    macro that calls _udf_err.
    Some of the udf_error uses had \n terminations, some did not so
    standardize \n's to udf_err uses, remove \n from _udf_err function.
    Coalesce udf_err formats.
    One message prefixed with udf_read_super is now prefixed with
    udf_fill_super.

    Reviewed-by: NamJae Jeon
    Signed-off-by: Joe Perches
    Signed-off-by: Jan Kara

    Joe Perches
     
  • If there is a problem with a scratched disc or loader, it's valuable to know
    which error occurred.

    Convert some debug messages to udf_error, neaten those messages too.
    Add the calculated tag checksum and the read checksum to error message.
    Make udf_error a public function and move the logging prototypes together.

    Original-patch-by: NamJae Jeon
    Reviewed-by: NamJae Jeon
    Signed-off-by: Joe Perches
    Signed-off-by: Jan Kara

    Joe Perches
     

12 Jan, 2011

1 commit

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
    UDF: Close small mem leak in udf_find_entry()
    udf: Fix directory corruption after extent merging
    udf: Protect udf_file_aio_write from possible races
    udf: Remove unnecessary bkl usages
    udf: Use of s_alloc_mutex to serialize udf_relocate_blocks() execution
    udf: Replace bkl with the UDF_I(inode)->i_data_sem for protect udf_inode_info struct
    udf: Remove BKL from free space counting functions
    udf: Call udf_add_free_space() for more blocks at once in udf_free_blocks()
    udf: Remove BKL from udf_put_super() and udf_remount_fs()
    udf: Protect default inode credentials by rwlock
    udf: Protect all modifications of LVID with s_alloc_mutex
    udf: Move handling of uniqueID into a helper function and protect it by a s_alloc_mutex
    udf: Remove BKL from udf_update_inode
    udf: Convert UDF_SB(sb)->s_flags to use bitops
    fs/udf: Add printf format/argument verification
    fs/udf: Use vzalloc

    (Evil merge: this also removes the BKL dependency from the Kconfig file)

    Linus Torvalds
     

07 Jan, 2011

5 commits

  • RCU free the struct inode. This will allow:

    - Subsequent store-free path walking patch. The inode must be consulted for
    permissions when walking, so an RCU inode reference is a must.
    - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
    to take i_lock no longer need to take sb_inode_list_lock to walk the list in
    the first place. This will simplify and optimize locking.
    - Could remove some nested trylock loops in dcache code
    - Could potentially simplify things a bit in VM land. Do not need to take the
    page lock to follow page->mapping.

    The downside of this is the performance cost of using RCU. In a simple
    creat/unlink microbenchmark, performance drops by about 10% due to inability to
    reuse cache-hot slab objects. As iterations increase and RCU freeing starts
    kicking over, this increases to about 20%.

    In cases where inode lifetimes are longer (ie. many inodes may be allocated
    during the average life span of a single inode), a lot of this cache reuse is
    not applicable, so the regression caused by this patch is smaller.

    The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
    however this adds some complexity to list walking and store-free path walking,
    so I prefer to implement this at a later date, if it is shown to be a win in
    real situations. I haven't found a regression in any non-micro benchmark so I
    doubt it will be a problem.

    Signed-off-by: Nick Piggin

    Nick Piggin
     
  • udf_readdir(), udf_lookup(), udf_create(), udf_mknod(), udf_mkdir(),
    udf_rmdir(), udf_link(), udf_get_parent() and udf_unlink() already seem
    adequately protected by the i_mutex held by the calling VFS code.
    udf_rename() should likewise already be protected by lock_rename(), again
    taken by the VFS. udf_ioctl(), udf_fill_super() and udf_evict_inode() don't
    require any further protection.

    This work was supported by a hardware donation from the CE Linux Forum.

    Signed-off-by: Alessio Igor Bogani
    Signed-off-by: Jan Kara

    Alessio Igor Bogani
     
  • Replace bkl with the UDF_I(inode)->i_data_sem rw semaphore in
    udf_release_file(), udf_symlink(), udf_symlink_filler(), udf_get_block(),
    udf_block_map(), and udf_setattr(). The rule now is that any operation
    on regular file's or symlink's extents (or generally allocation information
    including goal block) needs to hold i_data_sem.

    This work was supported by a hardware donation from the CE Linux Forum.

    Signed-off-by: Alessio Igor Bogani
    Signed-off-by: Jan Kara

    Alessio Igor Bogani
     
  • udf_count_free_bitmap() does not need BKL because bitmaps are in a fixed
    place on disk and so we can count set bits without serialization.
    udf_count_free_table() is now protected by s_alloc_mutex instead of BKL
    to get a consistent view of free space extents.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • udf_put_super() does not need BKL because the filesystem is shut down so
    there's nothing to race with. The credential changes in udf_remount_fs()
    and LVID changes are now protected by dedicated locks so we can remove BKL
    from this function as well.

    Signed-off-by: Jan Kara

    Jan Kara