17 May, 2007

1 commit

  • SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it.

    Signed-off-by: Christoph Lameter
    Cc: David Howells
    Cc: Jens Axboe
    Cc: Steven French
    Cc: Michael Halcrow
    Cc: OGAWA Hirofumi
    Cc: Miklos Szeredi
    Cc: Steven Whitehouse
    Cc: Roman Zippel
    Cc: David Woodhouse
    Cc: Dave Kleikamp
    Cc: Trond Myklebust
    Cc: "J. Bruce Fields"
    Cc: Anton Altaparmakov
    Cc: Mark Fasheh
    Cc: Paul Mackerras
    Cc: Christoph Hellwig
    Cc: Jan Kara
    Cc: David Chinner
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

09 May, 2007

3 commits

  • Propagate flags such as S_APPEND, S_IMMUTABLE, etc. from i_flags into
    ext2-specific i_flags. Hence, when someone sets these flags via a different
    interface than ioctl, they are stored correctly.

    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Remove includes of where it is not used/needed.
    Suggested by Al Viro.

    Builds cleanly on x86_64, i386, alpha, ia64, powerpc, sparc,
    sparc64, and arm (all 59 defconfigs).

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Taken from http://bugzilla.kernel.org/show_bug.cgi?id=5079

    signed long ranges from -2.147.483.648 to 2.147.483.647 on x86 32bit

    10000011110110100100111110111101 .. -2,082,844,739
    10000011110110100100111110111101 .. 2,212,122,557
    Cc:

    Andreas says:

    This patch is now treating timestamps with the high bit set as negative
    times (before Jan 1, 1970). This means we lose 1/2 of the possible range
    of timestamps (lopping off 68 years before unix timestamp overflow -
    now only 30 years away :-) to handle the extremely rare case of setting
    timestamps into the distant past.

    If we are only interested in fixing the underflow case, we could just
    limit the values to 0 instead of storing negative values. At worst this
    will skew the timestamp by a few hours for timezones in the far east
    (files would still show Jan 1, 1970 in "ls -l" output).

    That said, it seems 32-bit systems (mine at least) allow files to be set
    into the past (01/01/1907 works fine) so it seems this patch is bringing
    the x86_64 behaviour into sync with other kernels.

    On the plus side, we have a patch that is ready to add nanosecond timestamps
    to ext3 and as an added bonus adds 2 high bits to the on-disk timestamp so
    this extends the maximum date to 2242.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Markus Rechberger
     

08 May, 2007

2 commits

  • I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by
    SLAB.

    I think its purpose was to have a callback after an object has been freed
    to verify that the state is the constructor state again? The callback is
    performed before each freeing of an object.

    I would think that it is much easier to check the object state manually
    before the free. That also places the check near the code object
    manipulation of the object.

    Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was
    compiled with SLAB debugging on. If there would be code in a constructor
    handling SLAB_DEBUG_INITIAL then it would have to be conditional on
    SLAB_DEBUG otherwise it would just be dead code. But there is no such code
    in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real
    use of, difficult to understand and there are easier ways to accomplish the
    same effect (i.e. add debug code before kfree).

    There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be
    clear in fs inode caches. Remove the pointless checks (they would even be
    pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors.

    This is the last slab flag that SLUB did not support. Remove the check for
    unimplemented flags from SLUB.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Ensure pages are uptodate after returning from read_cache_page, which allows
    us to cut out most of the filesystem-internal PageUptodate calls.

    I didn't have a great look down the call chains, but this appears to fixes 7
    possible use-before uptodate in hfs, 2 in hfsplus, 1 in jfs, a few in
    ecryptfs, 1 in jffs2, and a possible cleared data overwritten with readpage in
    block2mtd. All depending on whether the filler is async and/or can return
    with a !uptodate page.

    Signed-off-by: Nick Piggin
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

21 Feb, 2007

1 commit


13 Feb, 2007

2 commits

  • This patch is inspired by Arjan's "Patch series to mark struct
    file_operations and struct inode_operations const".

    Compile tested with gcc & sparse.

    Signed-off-by: Josef 'Jeff' Sipek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef 'Jeff' Sipek
     
  • Many struct inode_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

12 Feb, 2007

2 commits

  • Fix insecure default behaviour reported by Tigran Aivazian: if an ext2 or
    ext3 or ext4 filesystem is tuned to mount with "acl", but mounted by a
    kernel built without ACL support, then umask was ignored when creating
    inodes - though root or user has umask 022, touch creates files as 0666,
    and mkdir creates directories as 0777.

    This appears to have worked right until 2.6.11, when a fix to the default
    mode on symlinks (always 0777) assumed VFS applies umask: which it does,
    unless the mount is marked for ACLs; but ext[234] set MS_POSIXACL in
    s_flags according to s_mount_opt set according to def_mount_opts.

    We could revert to the 2.6.10 ext[234]_init_acl (adding an S_ISLNK test);
    but other filesystems only set MS_POSIXACL when ACLs are configured. We
    could fix this at another level; but it seems most robust to avoid setting
    the s_mount_opt flag in the first place (at the expense of more ifdefs).

    Likewise don't set the XATTR_USER flag when built without XATTR support.

    Signed-off-by: Hugh Dickins
    Cc: Tigran Aivazian
    Cc:
    Cc: Andreas Gruenbacher
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • This one was pointed out on the MOKB site:
    http://kernelfun.blogspot.com/2006/11/mokb-09-11-2006-linux-26x-ext2checkpage.html

    If a directory's i_size is corrupted, ext2_find_entry() will keep
    processing pages until the i_size is reached, even if there are no more
    blocks associated with the directory inode. This patch puts in some
    minimal sanity-checking so that we don't keep checking pages (and issuing
    errors) if we know there can be no more data to read, based on the block
    count of the directory inode.

    This is somewhat similar in approach to the ext3 patch I sent earlier this
    year.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     

09 Dec, 2006

2 commits

  • This facility provides three entry points:

    ilog2() Log base 2 of unsigned long
    ilog2_u32() Log base 2 of u32
    ilog2_u64() Log base 2 of u64

    These facilities can either be used inside functions on dynamic data:

    int do_something(long q)
    {
    ...;
    y = ilog2(x)
    ...;
    }

    Or can be used to statically initialise global variables with constant values:

    unsigned n = ilog2(27);

    When performing static initialisation, the compiler will report "error:
    initializer element is not constant" if asked to take a log of zero or of
    something not reducible to a constant. They treat negative numbers as
    unsigned.

    When not dealing with a constant, they fall back to using fls() which permits
    them to use arch-specific log calculation instructions - such as BSR on
    x86/x86_64 or SCAN on FRV - if available.

    [akpm@osdl.org: MMC fix]
    Signed-off-by: David Howells
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Herbert Xu
    Cc: David Howells
    Cc: Wojtek Kaniewski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Change all the uses of f_{dentry,vfsmnt} to f_path.{dentry,mnt} in the ext2
    filesystem.

    Signed-off-by: Josef "Jeff" Sipek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef "Jeff" Sipek
     

08 Dec, 2006

5 commits

  • Port commit a090d9132c1e53e3517111123680c15afb25c0a4 into ext2:

    All modifications of ->i_flags in inodes that might be visible to somebody
    else must be under ->i_mutex. That patch fixes ext2 ioctl() setting S_APPEND.

    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • lock_super() is unnecessary for setting super-block feature flags. Use the
    provided *_SET_COMPAT_FEATURE() macros as well.

    Signed-off-by: Andreas Gruenbacher
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Gruenbacher
     
  • Update ext2_statfs to return an FSID that is a 64 bit XOR of the 128 bit
    filesystem UUID as suggested by Andreas Dilger. See the following Bugzilla
    entry for details:

    http://bugzilla.kernel.org/show_bug.cgi?id=136

    Cc: Andreas Dilger
    Cc: Stephen Tweedie
    Signed-off-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pekka Enberg
     
  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
    quilt add $file
    sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
    mv /tmp/$$ $file
    quilt refresh
    done

    The script was run like this

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • SLAB_KERNEL is an alias of GFP_KERNEL.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

12 Oct, 2006

1 commit

  • Current error behaviour for ext2 and ext3 filesystems does not fully
    correspond to the documentation and should be fixed.

    According to man 8 mount, ext2 and ext3 file systems allow to set one of 3
    different on-errors behaviours:

    ---- start of quote man 8 mount ----

    errors=continue / errors=remount-ro / errors=panic

    Define the behaviour when an error is encountered. (Either ignore
    errors and just mark the file system erroneous and continue, or remount
    the file system read-only, or panic and halt the system.) The default is
    set in the filesystem superblock, and can be changed using tune2fs(8).

    ---- end of quote ----

    However EXT3_ERRORS_CONTINUE is not read from the superblock, and thus
    ERRORS_CONT is not saved on the sbi->s_mount_opt. It leads to the incorrect
    handle of errors on ext3.

    Then we've checked corresponding code in ext2 and discovered that it is buggy
    as well:

    - EXT2_ERRORS_CONTINUE is not read from the superblock (the same);

    - parse_option() does not clean the alternative values and thus something
    like (ERRORS_CONT|ERRORS_RO) can be set;

    - if options are omitted, parse_option() does not set any of these options.

    Therefore it is possible to set any combination of these options on the ext2:

    - none of them may be set: EXT2_ERRORS_CONTINUE on superblock / empty mount
    options;

    - any of them may be set using mount options;

    - 2 any options may be set: by using EXT2_ERRORS_RO/EXT2_ERRORS_PANIC on the
    superblock and other value in mount options;

    - and finally all three options may be set by adding third option in remount.

    Currently ext2 uses these values only in ext2_error() and it is not leading to
    any noticeable troubles. However somebody may be discouraged when he will try
    to workaround EXT2_ERRORS_PANIC on the superblock by using errors=continue in
    mount options.

    This patch:

    EXT2_ERRORS_CONTINUE should be read from the superblock as default value for
    error behaviour. parse_option() should clean the alternative options and
    should not change default value taken from the superblock.

    Signed-off-by: Vasily Averin
    Acked-by: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vasily Averin
     

01 Oct, 2006

4 commits

  • When a filesystem decrements i_nlink to zero, it means that a write must be
    performed in order to drop the inode from the filesystem.

    We're shortly going to have keep filesystems from being remounted r/o between
    the time that this i_nlink decrement and that write occurs.

    So, add a little helper function to do the decrements. We'll tie into it in a
    bit to note when i_nlink hits zero.

    Signed-off-by: Dave Hansen
    Acked-by: Christoph Hellwig
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • This patch cleans up generic_file_*_read/write() interfaces. Christoph
    Hellwig gave me the idea for this clean ups.

    In a nutshell, all filesystems should set .aio_read/.aio_write methods and use
    do_sync_read/ do_sync_write() as their .read/.write methods. This allows us
    to cleanup all variants of generic_file_* routines.

    Final available interfaces:

    generic_file_aio_read() - read handler
    generic_file_aio_write() - write handler
    generic_file_aio_write_nolock() - no lock write handler

    __generic_file_aio_write_nolock() - internal worker routine

    Signed-off-by: Badari Pulavarty
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     
  • This patch removes readv() and writev() methods and replaces them with
    aio_read()/aio_write() methods.

    Signed-off-by: Badari Pulavarty
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     
  • Move the Ext2 device ioctl compat stuff from fs/compat_ioctl.c to the Ext2
    driver so that the Ext2 header file doesn't need to be included.

    Signed-Off-By: David Howells
    Signed-off-by: Jens Axboe

    David Howells
     

27 Sep, 2006

5 commits


19 Sep, 2006

1 commit

  • Fix a performance degradation introduced in 2.6.17. (30% degradation
    running dbench with 16 threads)

    Commit 21730eed11de42f22afcbd43f450a1872a0b5ea1, which claims to make
    EXT2_DEBUG work again, moves the taking of the kernel lock out of
    debug-only code in ext2_count_free_inodes and ext2_count_free_blocks and
    into ext2_statfs.

    The same problem was fixed in ext3 by removing the lock completely (commit
    5b11687924e40790deb0d5f959247ade82196665)

    Signed-off-by: Dave Kleikamp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Kleikamp
     

17 Sep, 2006

1 commit


28 Aug, 2006

1 commit


04 Jul, 2006

1 commit

  • The quota code plays interesting games with the lock ordering; to quote Jan:

    | i_mutex of inode containing quota file is acquired after all other
    | quota locks. i_mutex of all other inodes is acquired before quota
    | locks. Quota code makes sure (by resetting inode operations and
    | setting special flag on inode) that noone tries to enter quota code
    | while holding i_mutex on a quota file...

    The good news is that all of this special case i_mutex grabbing happens in the
    (per filesystem) low level quota write function. For this special case we
    need a new I_MUTEX_* nesting level, since this just entirely outside any of
    the regular VFS locking rules for i_mutex. I trust Jan on his blue eyes that
    this is not ever going to deadlock; and based on that the patch below is what
    it takes to inform lockdep of these very interesting new locking rules.

    The new locking rule for the I_MUTEX_QUOTA nesting level is that this is the
    deepest possible level of nesting for i_mutex, and that this only should be
    used in quota write (and possibly read) function of filesystems. This makes
    the lock ordering of the I_MUTEX_* levels:

    I_MUTEX_PARENT -> I_MUTEX_CHILD -> I_MUTEX_NORMAL -> I_MUTEX_QUOTA

    Has no effect on non-lockdep kernels.

    Signed-off-by: Arjan van de Ven
    Acked-by: Ingo Molnar
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

01 Jul, 2006

1 commit


29 Jun, 2006

1 commit


26 Jun, 2006

3 commits


23 Jun, 2006

3 commits

  • The percpu counter data type are changed in this set of patches to support
    more users like ext3 who need more than 32 bit to store the free blocks
    total in the filesystem.

    - Generic perpcu counters data type changes. The size of the global counter
    and local counter were explictly specified using s64 and s32. The global
    counter is changed from long to s64, while the local counter is changed from
    long to s32, so we could avoid doing 64 bit update in most cases.

    - Users of the percpu counters are updated to make use of the new
    percpu_counter_init() routine now taking an additional parameter to allow
    users to pass the initial value of the global counter.

    Signed-off-by: Mingming Cao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mingming Cao
     
  • Add read_mapping_page() which is used for callers that pass
    mapping->a_ops->readpage as the filler for read_cache_page. This removes
    some duplication from filesystem code.

    Signed-off-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pekka Enberg
     
  • Give the statfs superblock operation a dentry pointer rather than a superblock
    pointer.

    This complements the get_sb() patch. That reduced the significance of
    sb->s_root, allowing NFS to place a fake root there. However, NFS does
    require a dentry to use as a target for the statfs operation. This permits
    the root in the vfsmount to be used instead.

    linux/mount.h has been added where necessary to make allyesconfig build
    successfully.

    Interest has also been expressed for use with the FUSE and XFS filesystems.

    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Nathan Scott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells