03 Jun, 2009

1 commit


02 Jun, 2009

3 commits

  • It's possible to recurse into the filesystem from a memory
    allocation, which deadlocks in xfs_qm_shake(). Add a check
    for __GFP_FS, and bail out if it is not set.

    Signed-off-by: Felix Blyakher
    Signed-off-by: Hedi Berriche
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Andi Kleen
    Signed-off-by: Felix Blyakher

    Felix Blyakher
     
  • In the case where growing a filesystem would leave the last AG
    too small, the fixup code has an overflow in the calculation
    of the new size with one fewer ag, because "nagcount" is a 32
    bit number. If the new filesystem has > 2^32 blocks in it
    this causes a problem resulting in an EINVAL return from growfs:

    # xfs_io -f -c "truncate 19998630180864" fsfile
    # mkfs.xfs -f -bsize=4096 -dagsize=76288719b,size=3905982455b fsfile
    # mount -o loop fsfile /mnt
    # xfs_growfs /mnt

    meta-data=/dev/loop0 isize=256 agcount=52,
    agsize=76288719 blks
    = sectsz=512 attr=2
    data = bsize=4096 blocks=3905982455, imaxpct=5
    = sunit=0 swidth=0 blks
    naming =version 2 bsize=4096 ascii-ci=0
    log =internal bsize=4096 blocks=32768, version=2
    = sectsz=512 sunit=0 blks, lazy-count=0
    realtime =none extsz=4096 blocks=0, rtextents=0
    xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Invalid argument

    Reported-by: richard.ems@cape-horn-eng.com
    Signed-off-by: Eric Sandeen
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Felix Blyakher
    Signed-off-by: Felix Blyakher

    Eric Sandeen
     
  • Regression from commit ef8f7fc, which rearranged the code in
    xfs_swap_extents(), leading to a double unlock of the xfs inode ilock.
    That resulted in xfs_fsr deadlocking itself on platforms that
    don't handle a double unlock of an rw_semaphore nicely. It caused the
    count to go negative, which represents a write holder, without
    really having one. ia64 is one of the platforms where the deadlock
    was easily reproduced and the fix was tested.

    Signed-off-by: Eric Sandeen
    Reviewed-by: Eric Sandeen
    Signed-off-by: Felix Blyakher

    Felix Blyakher
     

30 May, 2009

2 commits


29 May, 2009

7 commits

  • * git://git.infradead.org/~dwmw2/mtd-2.6.30:
    jffs2: Fix corruption when flash erase/write failure
    mtd: MXC NAND driver fixes (v5)

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
    Driver Core: do not oops when driver_unregister() is called for unregistered drivers
    sysfs: file.c: use create_singlethread_workqueue()

    Linus Torvalds
     
  • * 'for-2.6.30' of git://linux-nfs.org/~bfields/linux:
    svcrdma: dma unmap the correct length for the RPCRDMA header page.
    nfsd: Revert "svcrpc: take advantage of tcp autotuning"
    nfsd: fix hung up of nfs client while sync write data to nfs server

    Linus Torvalds
     
  • The flat loader uses an architecture's flat_stack_align() to align the
    stack but assumes word-alignment is enough for the data sections.

    However, on the Xtensa S6000 we have registers up to 128 bits wide
    which can be used from userspace and therefore need userspace stack and
    data-section alignment of at least this size.

    This patch drops flat_stack_align() and uses the same alignment that
    is required for slab caches, ARCH_SLAB_MINALIGN, or wordsize if it's
    not defined by the architecture.

    It also fixes m32r which was obviously kaput, aligning an
    uninitialized stack entry instead of the stack pointer.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Oskar Schirmer
    Cc: David Howells
    Cc: Russell King
    Cc: Bryan Wu
    Cc: Geert Uytterhoeven
    Acked-by: Paul Mundt
    Cc: Greg Ungerer
    Signed-off-by: Johannes Weiner
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oskar Schirmer
     
  • proc_pident_instantiate() has the following call flow.

    proc_pident_lookup()
      proc_pident_instantiate()
        proc_pid_make_inode()

    And proc_pident_lookup() has the following error handling.

    const struct pid_entry *p, *last;
    error = ERR_PTR(-ENOENT);
    if (!task)
            goto out_no_task;

    So proc_pident_instantiate() should return ENOENT too when a race
    against exit(2) occurs.

    EINVAL is a bad choice for two reasons.
    - it implies the caller is wrong, but the race isn't the caller's mistake.
    - man 2 open doesn't explain EINVAL, so users often don't handle it.

    Note: Other proc_pid_make_inode() callers already use ENOENT properly.

    Acked-by: Eric W. Biederman
    Cc: Alexey Dobriyan
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Erase errors such as:
    "Newly-erased block contained word 0xa4ef223e at offset 0x0296a014"
    and failure to write the clean marker
    move the offending erase block to the erasing list before calling
    jffs2_erase_failed(). This is bad as jffs2_erase_failed() will
    also move the block to the bad_list, but is now moving the
    wrong block, causing FS corruption.

    Signed-off-by: Joakim Tjernlund
    Signed-off-by: David Woodhouse

    Joakim Tjernlund
     
  • We don't need a kernel thread per CPU for this application.

    Acked-by: Alex Chiang
    Cc: Lai Jiangshan
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Andrew Morton
     

28 May, 2009

3 commits

  • Commit 'Short write in nfsd becomes a full write to the client'
    (31dec2538e45e9fff2007ea1f4c6bae9f78db724) broke sync writes.
    The problem can be reproduced with the following commands:

    $ mount -t nfs -o sync 192.168.0.21:/nfsroot /mnt
    $ cd /mnt
    $ echo aaaa > temp.txt

    Then the nfs client hangs.

    In SYNC mode the server always returns a write count of 0 to the
    client. This is because the value of host_err in nfsd_vfs_write()
    is overwritten in SYNC mode by 'host_err=nfsd_sync(file);',
    and then we return host_err (which is now 0) as the write count.

    This patch fixes the problem.

    Signed-off-by: Wei Yongjun
    Signed-off-by: J. Bruce Fields

    Wei Yongjun
     
  • Fix up renamed filenames in comments in fs/cachefiles/internal.h.

    Originally, the files were all called cf-xxx.c, but they got renamed to
    just xxx.c.

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Fix up renamed filenames in comments in fs/fscache/internal.h.

    Originally, the files were all called fsc-xxx.c, but they got renamed to
    just xxx.c.

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     

27 May, 2009

2 commits

  • If the asynchronous lease renewal fails (usually due to a soft timeout),
    then we _must_ schedule state recovery in order to ensure that we don't
    lose the lease unnecessarily or, if the lease is already lost, that we
    recover the locking state promptly...

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • fix build error with latest kbuild adjustments to initconst.

    The commit a447c0932445f92ce6f4c1bd020f62c5097a7842 ("vfs: Use
    const for kernel parser table") changed:

    static match_table_t __initdata tokens = {
    to
    static match_table_t __initconst tokens = {

    But the missing const causes powerpc to fail with the latest
    updates to __initconst like this:

    fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict
    fs/nfs/nfsroot.c:400: error: __setup_str_nfs_root_setup causes a section type conflict

    The bug is only present with kbuild-next.
    The following patch has been build-tested.

    Signed-off-by: Sam Ravnborg
    Cc: Steven Whitehouse
    Cc: Stephen Rothwell
    Acked-by: Jan Beulich
    Signed-off-by: Trond Myklebust

    Sam Ravnborg
     

24 May, 2009

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
    [CIFS] Avoid open on possible directories since Samba now rejects them

    Linus Torvalds
     
  • Small change (mostly formatting) to limit lookup based open calls to
    file create only.

    After discussion yesterday on samba-technical about the posix lookup
    regression, and looking at a problem with cifs posix open to one
    particular Samba version, Jeff and JRA realized that Samba server's
    behavior changed in this area (posix open behavior on files vs.
    directories). To make this behavior consistent, JRA just made a
    fix to Samba server to alter how it handles open of directories (now
    returning the equivalent of EISDIR instead of success). Since we don't
    know at lookup time whether the inode is a directory or file (and
    thus whether posix open will succeed with most current Samba server),
    this change avoids the posix open code on lookup open (just issues
    posix open on creates). This gets the semantic benefits we want
    (atomicity, posix byte range locks, improved write semantics on newly
    created files) and file create still is fast, and we avoid the problem
    that Jeff noticed yesterday with "openat" (and some open directory
    calls) of non-cached directories to one version of Samba server, and
    will work with future Samba versions (which include the fix jra just
    pushed into Samba server). I confirmed this approach with jra
    yesterday and with Shirish today.

    Posix open is only called (at lookup time) for file create now.
    For opens (rather than creates), because we do not know if it
    is a file or directory yet, and current Samba no longer allows
    us to do posix open on dirs, we could end up wasting an open call
    on what turns out to be a dir. For file opens, we wait to call posix
    open till cifs_open. It could be added here (lookup) in the future
    but the performance tradeoff of the extra network request when EISDIR
    or EACCES is returned would have to be weighed against the 50%
    reduction in network traffic in the other paths.

    Reviewed-by: Shirish Pargaonkar
    Tested-by: Jeff Layton
    CC: Jeremy Allison
    Signed-off-by: Steve French

    Steve French
     

22 May, 2009

3 commits


20 May, 2009

1 commit


19 May, 2009

2 commits

  • This is the third respin of the patch posted yesterday to fix the error
    handling in cifs_follow_symlink. It also includes a fix for a bogus NULL
    pointer check in CIFSSMBQueryUnixSymLink that Jeff Moyer spotted.

    It's possible for CIFSSMBQueryUnixSymLink to return without setting
    target_path to a valid pointer. If that happens then the current value
    to which we're initializing this pointer could cause an oops when it's
    kfree'd.

    This patch is a little more comprehensive than the last patches. It
    reorganizes cifs_follow_link a bit for (hopefully) better readability.
    It should also eliminate the unneeded allocation of full_path on servers
    without unix extensions (assuming they can get to this point anyway, of
    which I'm not convinced).

    On a side note, I'm not sure I agree with the logic of enabling this
    query even when unix extensions are disabled on the client. It seems
    like that should disable this as well. But, changing that is outside the
    scope of this fix, so I've left it alone for now.

    Reported-by: Jeff Moyer
    Signed-off-by: Jeff Layton
    Reviewed-by: Jeff Moyer
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Steve French

    Jeff Layton
     
  • The problem is that permission checking is skipped if atomic open is
    possible, but when exec opens a file, it just opens it O_READONLY which
    means EXEC permission will not be checked at that time.

    This problem is observed by the following sequence (executed as root):

    mount -t nfs4 server:/ /mnt4
    echo "ls" >/mnt4/foo
    chmod 744 /mnt4/foo
    su guest -c "mnt4/foo"

    Signed-off-by: Frank Filz
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org
    Tested-by: Eugene Teo
    Signed-off-by: Linus Torvalds

    Frank Filz
     

18 May, 2009

3 commits


15 May, 2009

11 commits

  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: Fix race in ext4_inode_info.i_cached_extent
    ext4: Clear the unwritten buffer_head flag after the extent is initialized
    ext4: Use a fake block number for delayed new buffer_head
    ext4: Fix sub-block zeroing for writes into preallocated extents

    Linus Torvalds
     
  • devpts_get_sb() calls memset(0) to clear mount options and calls
    parse_mount_options() if user specified any mount options.

    The memset(0) is bogus since the 'mode' and 'ptmxmode' options are
    non-zero by default. parse_mount_options() restores options to default
    anyway and can properly deal with NULL mount options.

    So in devpts_get_sb() remove memset(0) and call parse_mount_options() even
    for NULL mount options.

    Bug reported by Eric Paris: http://lkml.org/lkml/2009/5/7/448.

    Signed-off-by: Sukadev Bhattiprolu
    Tested-by: Marc Dionne
    Reported-by: Eric Paris
    Cc: Christoph Hellwig
    Cc: Alan Cox
    Acked-by: Serge Hallyn
    Cc: Al Viro
    Cc: "Rafael J. Wysocki"
    Reviewed-by: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukadev Bhattiprolu
     
  • If two CPUs call ext4_ext_get_blocks() at the same
    time, there is nothing protecting the i_cached_extent structure from
    being used and updated at the same time. This could potentially cause
    the wrong location on disk to be read or written to, including
    potentially causing the corruption of the block group descriptors
    and/or inode table.

    This bug has been in the ext4 code since almost the very beginning of
    ext4's development. Fortunately once the data is stored in the page
    cache, ext4_get_blocks() doesn't need to be called, so trying to
    replicate this problem to the point where we could identify its root
    cause was *extremely* difficult. Many thanks to Kevin Shanahan for
    working over several months to be able to reproduce this easily so we
    could finally nail down the cause of the corruption.

    Signed-off-by: "Theodore Ts'o"
    Reviewed-by: "Aneesh Kumar K.V"

    Theodore Ts'o
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
    cifs: fix error handling in parse_DFS_referrals

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
    Btrfs: Spelling fix in btrfs_lookup_first_block_group comments
    Btrfs: make show_options result match actual option names
    Btrfs: remove outdated comment in btrfs_ioctl_resize()
    Btrfs: remove some WARN_ONs in the IO failure path
    Btrfs: Don't loop forever on metadata IO failures
    Btrfs: init inode ordered_data_close flag properly

    Linus Torvalds
     
  • The BH_Unwritten flag indicates that the buffer is allocated on disk
    but has not been written; that is, the disk was part of a persistent
    preallocation area. That flag should only be set when a get_blocks()
    function is looking up an inode's logical to physical block mapping.

    When ext4_get_blocks_wrap() is called with create=1, the uninitialized
    extent is converted into an initialized one, so the BH_Unwritten flag
    is no longer appropriate. Hence, we need to make sure the
    BH_Unwritten is not left set, since the combination of BH_Mapped and
    BH_Unwritten is not allowed; among other things, it will cause ext4's
    get_block() to be called over and over again during the write_begin
    phase of write(2).

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: "Theodore Ts'o"

    Aneesh Kumar K.V
     
  • Signed-off-by: Sankar P
    Signed-off-by: Chris Mason

    Sankar P
     
  • The notreelog and flushoncommit mount options were being printed slightly
    differently.

    Signed-off-by: Sage Weil
    Signed-off-by: Chris Mason

    Sage Weil
     
  • In Li Zefan's commit dae7b665cf6d6e6e733f1c9c16cf55547dd37e33,
    a combination call of kmalloc() and copy_from_user() is replaced by
    memdup_user(). So btrfs_ioctl_resize() doesn't use GFP_NOFS any more.

    Signed-off-by: Li Hong
    Signed-off-by: Chris Mason

    Li Hong
     
  • These debugging WARN_ONs make too much console noise during regular
    IO failures. An IO failure will still generate a number of messages
    as we verify checksums etc, but these two are not needed.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • When a btrfs metadata read fails, the first thing we try to do is find
    a good copy on another mirror of the block. If this fails, read_tree_block()
    ends up returning a buffer that isn't up to date.

    The btrfs btree reading code was reworked to drop locks and repeat
    the search when IO was done, but the changes didn't add a check for failed
    reads. The end result was looping forever on buffers that were never
    going to become up to date.

    Signed-off-by: Chris Mason

    Chris Mason