30 Mar, 2010

7 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs:
    [LogFS] Erase new journal segments
    [LogFS] Move reserved segments with journal
    [LogFS] Clear PagePrivate when moving journal
    Simplify and fix pad_wbuf
    Prevent data corruption in logfs_rewrite_block()
    Use deactivate_locked_super
    Fix logfs_get_sb_final error path
    Write out both superblocks on mismatch
    Prevent schedule while atomic in __logfs_readdir
    Plug memory leak in writeseg_end_io
    Limit max_pages for insane devices
    Open segment file before using it

    Linus Torvalds
     
  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2:
    ocfs2: Fix a race in o2dlm lockres mastery
    Ocfs2: Handle deletion of reflinked orphan inodes correctly.
    Ocfs2: Journaling i_flags and i_orphaned_slot when adding inode to orphan dir.
    ocfs2: Clear undo bits when local alloc is freed
    ocfs2: Init meta_ac properly in ocfs2_create_empty_xattr_block.
    ocfs2: Fix the update of name_offset when removing xattrs
    ocfs2: Always try for maximum bits with new local alloc windows
    ocfs2: set i_mode on disk during acl operations
    ocfs2: Update i_blocks in reflink operations.
    ocfs2: Change bg_chain check for ocfs2_validate_gd_parent.
    [PATCH] Skip check for mandatory locks when unlocking

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (28 commits)
    ceph: update discussion list address in MAINTAINERS
    ceph: some documentation fixes
    ceph: fix use after free on mds __unregister_request
    ceph: avoid loaded term 'OSD' in documentation
    ceph: fix possible double-free of mds request reference
    ceph: fix session check on mds reply
    ceph: handle kmalloc() failure
    ceph: propagate mds session allocation failures to caller
    ceph: make write_begin wait propagate ERESTARTSYS
    ceph: fix snap rebuild condition
    ceph: avoid reopening osd connections when address hasn't changed
    ceph: rename r_sent_stamp r_stamp
    ceph: fix connection fault con_work reentrancy problem
    ceph: prevent dup stale messages to console for restarting mds
    ceph: fix pg pool decoding from incremental osdmap update
    ceph: fix mds sync() race with completing requests
    ceph: only release unused caps with mds requests
    ceph: clean up handle_cap_grant, handle_caps wrt session mutex
    ceph: fix session locking in handle_caps, ceph_check_caps
    ceph: drop unnecessary WARN_ON in caps migration
    ...

    Linus Torvalds
     
  • In commit 9df93939b735 ("ext3: Use bitops to read/modify
    EXT3_I(inode)->i_state") ext3 changed its internal 'i_state' variable to
    use bitops for its state handling. However, unlike the same ext4
    change, it didn't actually change the name of the field when it changed
    the semantics of it.

    As a result, an old use of 'i_state' remained in fs/ext3/ialloc.c that
    initialized the field to EXT3_STATE_NEW. And that does not work
    _at_all_ when we're now working with individually named bits rather than
    values that get masked. So the code tried to mark the state to be new,
    but in actual fact set the field to EXT3_STATE_JDATA. Which makes no
    sense at all, and screws up all the code that checks whether the inode
    was newly allocated.

    In particular, it made the xattr code unhappy, and caused various random
    behavior, like apparently

    https://bugzilla.redhat.com/show_bug.cgi?id=577911

    So fix the initialization, and rename the field to match ext4 so that we
    don't have this happen again.

    Cc: James Morris
    Cc: Stephen Smalley
    Cc: Daniel J Walsh
    Cc: Eric Paris
    Cc: Jan Kara
    Signed-off-by: Linus Torvalds

    Linus Torvalds
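
    The failure mode described above can be sketched in userspace C. This is
    only an illustration: the enum values and the demo_* helpers are
    hypothetical stand-ins (assuming JDATA is bit 0 and NEW is bit 1, which
    matches the behavior the message describes), not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the ext3 state flags. After a bitops
 * conversion these are bit numbers, not masks. */
enum demo_state_bit {
    DEMO_STATE_JDATA = 0,   /* bit 0 */
    DEMO_STATE_NEW   = 1,   /* bit 1 */
};

/* Minimal, non-atomic userspace analogues of set_bit()/test_bit(). */
static void demo_set_bit(int nr, unsigned long *word)
{
    assert(nr >= 0 && nr < (int)(8 * sizeof(*word)));
    *word |= 1UL << nr;
}

static int demo_test_bit(int nr, const unsigned long *word)
{
    return (int)((*word >> nr) & 1UL);
}

/* Buggy pattern: plain assignment of a bit number. The value 1 is
 * stored, which is bit 0, so the JDATA bit ends up set instead of NEW. */
static unsigned long init_state_buggy(void)
{
    unsigned long state = DEMO_STATE_NEW;
    return state;
}

/* Intended pattern: start from a clean word, then set the named bit. */
static unsigned long init_state_fixed(void)
{
    unsigned long state = 0;
    demo_set_bit(DEMO_STATE_NEW, &state);
    return state;
}
```

    Renaming the field when its semantics change, as the fix does, turns any
    leftover plain assignment of this kind into a compile error.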
     
  • If the device contains an old logfs image and the journal is moved to
    segments that have never been used by the current logfs, and not all
    journal segments are erased before the next mount, the old content can
    confuse the mount code. To prevent this, always erase the new journal
    segments.

    Signed-off-by: Joern Engel

    Joern Engel
     
  • Fixes a GC livelock.

    Signed-off-by: Joern Engel

    Joern Engel
     
  • CONFIG_SLOW_WORK_PROC was changed to CONFIG_SLOW_WORK_DEBUG, but not in all
    instances. Change the remaining instances. This makes the debugfs file
    display the time mark and the owner's description again.

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     

29 Mar, 2010

2 commits


28 Mar, 2010

2 commits

  • A comment in the old code read:
    /* The math in this function can surely use some love */

    And indeed it did. In the case that area->a_used_bytes is exactly
    4096 bytes below segment size it fell apart. pad_wbuf is now split
    into two helpers that are significantly less complicated.

    Signed-off-by: Joern Engel

    Joern Engel
     
  • The comment was correct, so make the code match the comment. As the
    new comment indicates, we might be able to do a little less work. But
    for the current -rc series let's keep it simple and just fix the bug.

    Signed-off-by: Joern Engel

    Joern Engel
     

27 Mar, 2010

10 commits


26 Mar, 2010

1 commit


25 Mar, 2010

12 commits

  • * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
    NFS: don't try to decode GETATTR if DELEGRETURN returned error
    sunrpc: handle allocation errors from __rpc_lookup_create()
    SUNRPC: Fix the return value of rpc_run_bc_task()
    SUNRPC: Fix a use after free bug with the NFSv4.1 backchannel
    SUNRPC: Fix a potential memory leak in auth_gss
    NFS: Prevent another deadlock in nfs_release_page()

    Linus Torvalds
     
  • Sparse complained about this missing spin_unlock()

    Signed-off-by: Dan Carpenter
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    Dan Carpenter
     
  • do_sync_read/write() should set kiocb.ki_nbytes to be consistent with
    do_sync_readv_writev().

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Fix an incorrect for-loop in elf_core_vma_data_size(). The advance-pointer
    statement lacks an assignment:

    CC fs/binfmt_elf_fdpic.o
    fs/binfmt_elf_fdpic.c: In function 'elf_core_vma_data_size':
    fs/binfmt_elf_fdpic.c:1593: warning: statement with no effect

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
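
    A minimal sketch of the kind of loop bug described above, using a
    hypothetical demo_vma list rather than the kernel's structures. The buggy
    form is shown only in a comment, since it would never advance.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a singly linked list of memory areas. */
struct demo_vma {
    size_t size;
    struct demo_vma *next;
};

/* Sum the sizes of all areas in the list. */
static size_t total_size(const struct demo_vma *head)
{
    size_t total = 0;
    const struct demo_vma *vma;

    /* Buggy form: for (vma = head; vma; vma->next)
     * The third expression has no effect, so vma never advances and
     * the compiler warns "statement with no effect". */
    for (vma = head; vma != NULL; vma = vma->next) {
        assert(total <= SIZE_MAX - vma->size);  /* guard against overflow */
        total += vma->size;
    }
    return total;
}
```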
     
  • A size smaller than the minimum block size can't be used; it is
    handled as if the size were 0.

    For the extended partition itself, this makes sure the size used is at
    least one logical sector.

    Signed-off-by: OGAWA Hirofumi
    Cc: Daniel Taylor
    Cc: "H. Peter Anvin"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    OGAWA Hirofumi
     
  • In order to use disks larger than 2TiB on Windows XP, it is necessary to
    use 4096-byte logical sectors in an MBR.

    Although the kernel storage and functions called from msdos.c used
    "sector_t" internally, msdos.c still used u32 variables, which results in
    the inability to handle XP-compatible large disks.

    This patch changes the internal variables to "sector_t".

    Daniel said: "In the near future, WD will be releasing products that need
    this patch".

    [hirofumi@mail.parknet.co.jp: tweaks and fix]
    Signed-off-by: Daniel Taylor
    Signed-off-by: OGAWA Hirofumi
    Cc: "H. Peter Anvin"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Taylor
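
    The arithmetic behind this can be sketched as follows. The helper names
    are hypothetical, and the conversion shown (scaling a count of logical
    sectors to 512-byte units) is an assumption about where the narrow type
    hurts, not a copy of the msdos.c code.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a length in logical sectors to 512-byte units using 32-bit
 * arithmetic. The product wraps modulo 2^32 for large disks. */
static uint32_t to_512_units_u32(uint32_t nr_logical, uint32_t logical_size)
{
    assert(logical_size >= 512 && logical_size % 512 == 0);
    return nr_logical * (logical_size / 512);
}

/* Same conversion with a 64-bit result, analogous to using a 64-bit
 * sector_t. Widening before the multiply keeps the full value. */
static uint64_t to_512_units_u64(uint32_t nr_logical, uint32_t logical_size)
{
    assert(logical_size >= 512 && logical_size % 512 == 0);
    return (uint64_t)nr_logical * (logical_size / 512);
}
```

    For a 3 TiB disk with 4096-byte logical sectors, the sector count
    0x30000000 still fits in 32 bits, but the count in 512-byte units
    (0x180000000) does not.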
     
  • "m" is never NULL here. We need a different test for the end of list
    condition.

    Signed-off-by: Dan Carpenter
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: WANG Cong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Carpenter
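
    Why a NULL test cannot detect the end of such a list can be sketched with
    a small circular list. All demo_* names are hypothetical; this assumes a
    kernel-style list where the last element links back to the head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical circular list link, in the style of the kernel's
 * struct list_head (singly linked here for brevity). */
struct demo_list {
    struct demo_list *next;
};

struct demo_item {
    int key;
    struct demo_list link;
};

/* Recover the containing item from a pointer to its link member. */
#define demo_entry(ptr) \
    ((struct demo_item *)((char *)(ptr) - offsetof(struct demo_item, link)))

static void demo_list_init(struct demo_list *head)
{
    head->next = head;  /* an empty list points at itself */
}

static void demo_list_add(struct demo_list *head, struct demo_item *item)
{
    item->link.next = head->next;
    head->next = &item->link;
}

/* Return the item with the given key, or NULL if the walk wraps
 * around to the head without finding it. */
static struct demo_item *demo_find(struct demo_list *head, int key)
{
    struct demo_list *pos;

    assert(head->next != NULL);  /* list must have been initialized */
    for (pos = head->next; pos != head; pos = pos->next) {
        struct demo_item *m = demo_entry(pos);
        if (m->key == key)
            return m;
    }
    /* Here pos == head. An entry pointer computed from pos would be
     * non-NULL garbage, so the end must be detected by comparing
     * against the head, not by testing the entry for NULL. */
    return NULL;
}
```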
     
  • The reiserfs journal behaves inconsistently when determining whether to
    allow a mount of a read-only device.

    This is due to the use of the continue_replay variable to short circuit
    the journal scanning. If it's set, it's assumed that there are
    transactions to replay, but there may not be. If it's unset, it's assumed
    that there aren't any, and that may not be the case either.

    I've observed two failure cases:
    1) Where a clean file system on a read-only device refuses to mount
    2) Where a clean file system on a read-only device passes the
    optimization and then tries writing the journal header to update
    the latest mount id.

    The former is easily observable by using a freshly created file system on
    a read-only loopback device.

    This patch moves the check into journal_read_transaction, where it can
    bail out before it's about to replay a transaction. That way it can go
    through and skip transactions where appropriate, yet still refuse to mount
    a file system with outstanding transactions.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • Commit 57fe60df ("reiserfs: add atomic addition of selinux attributes
    during inode creation") contains a bug that will cause it to oops when
    mounting a file system that didn't previously contain extended attributes
    on a system using security.* xattrs.

    The issue is that while creating the privroot during mount
    reiserfs_security_init calls reiserfs_xattr_jcreate_nblocks which
    dereferences the xattr root. The xattr root doesn't exist, so we get an
    oops.

    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=15309

    Signed-off-by: Jeff Mahoney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Mahoney
     
  • fs/binfmt_aout.c: In function `aout_core_dump':
    fs/binfmt_aout.c:125: warning: passing argument 2 of `dump_write' makes pointer from integer without a cast
    include/linux/coredump.h:12: note: expected `const void *' but argument is of type `long unsigned int'
    fs/binfmt_aout.c:132: warning: passing argument 2 of `dump_write' makes pointer from integer without a cast
    include/linux/coredump.h:12: note: expected `const void *' but argument is of type `long unsigned int'

    due to dump_write() expecting a user void *. Fold casts into the
    START_DATA/START_STACK macros and shut up the warnings.

    Signed-off-by: Borislav Petkov
    Cc: Daisuke HATAYAMA
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Borislav Petkov
     
  • When the ext4 driver is used to mount a filesystem instead of the ext3 file
    system driver (through CONFIG_EXT4_USE_FOR_EXT23), do not enable delayed
    allocation by default since some ext3 users and application writers have
    developed unfortunate expectations about the safety of writing files on
    systems subject to sudden and violent death without using fsync().

    Signed-off-by: Jan Kara
    Signed-off-by: "Theodore Ts'o"

    Jan Kara
     
  • Oops. (Blush.)

    Thanks to Sedat Dilek for pointing this out.

    Signed-off-by: "Theodore Ts'o"

    Theodore Ts'o
     

24 Mar, 2010

6 commits

  • When used_dirs was introduced for the flex_groups struct, it looks
    like the accounting was not put into place properly, in some places
    manipulating free_inodes rather than used_dirs.

    Signed-off-by: Eric Sandeen
    Signed-off-by: "Theodore Ts'o"

    Eric Sandeen
     
  • In o2dlm, the master of a lock resource keeps a map of all interested
    nodes. This prevents the master from purging the resource before an
    interested node can create a lock.

    A race between the mastery thread and the mastery handler allowed an
    interested node to discover who the master is without informing the
    master directly. This is easily fixed by holding the dlm spinlock a
    little longer in the mastery handler.

    Signed-off-by: Srinivas Eeda
    Signed-off-by: Joel Becker

    Srinivas Eeda
     
  • The rule is that all inodes in the orphan dir have ORPHANED_FL;
    otherwise we treat it as an ERROR. This rule works well except
    for some rare cases of reflink operations:

    http://oss.oracle.com/bugzilla/show_bug.cgi?id=1215

    The problem is caused by how reflink and our orphan_scan thread
    interact.

    * The orphan scan pulls the orphans into a queue first, then runs the
    queue at a later time. We only hold the orphan_dir's lock
    during scanning.

    * Reflink creates an orphaned target in orphan_dir as its first step.
    It removes the target and clears the flag as the final step.
    These two steps take the orphan_dir's lock, but it is not held for
    the duration.

    Based on the above semantics, a reflink inode can be moved out of the
    orphan dir and have its ORPHANED_FL cleared before the queue of orphans
    is run. This leads to an ERROR in ocfs2_query_wipe_inode().

    This patch teaches ocfs2_query_wipe_inode() to detect previously
    orphaned reflink targets. If a reflink fails or a crash occurs during
    the reflink operation, the inode will retain ORPHANED_FL and will be
    properly wiped.

    Signed-off-by: Tristan Ye
    Signed-off-by: Joel Becker

    Tristan Ye
     
  • Currently, some callers fail to journal the dirty inode after
    adding it to the orphan dir.

    Now we journal such modifications within ocfs2_orphan_add() itself.
    It's safe to do so, though some existing callers may duplicate this,
    and it makes the logic look more straightforward anyway.

    Signed-off-by: Tristan Ye
    Signed-off-by: Joel Becker

    Tristan Ye
     
  • When the local alloc file changes windows, unused bits are freed back to the
    global bitmap. By definition, those bits cannot be in use by any file. Also,
    the local alloc will never have been able to allocate those bits if they
    were part of a previous truncate. Therefore it makes sense that we should
    clear unused local alloc bits in the undo buffer so that they can be used
    immediately.

    [ Modified to call it ocfs2_release_clusters() -- Joel ]

    Signed-off-by: Mark Fasheh
    Signed-off-by: Joel Becker

    Mark Fasheh
     
  • nilfs_wait_on_logs can return before all bio requests have completed
    if it encounters an error. This synchronization fault may cause
    unexpected results, for instance, invalid access to freed segment
    buffers from an end-bio callback routine.

    This fixes the issue by ensuring that nilfs_wait_on_logs waits for all
    given logs.

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
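
    The pattern described above (wait for every log, report the first error)
    can be sketched as follows. The demo_log type and demo_wait_one() helper
    are hypothetical stand-ins; this is not the nilfs code itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a log with outstanding I/O. */
struct demo_log {
    int result;  /* 0 on success, negative errno-style value on failure */
    int waited;  /* set once the wait for this log has completed */
};

/* Pretend to block until the log's I/O is done, then return its status. */
static int demo_wait_one(struct demo_log *log)
{
    log->waited = 1;
    return log->result;
}

/* Wait on every log. Returning at the first error would leave later
 * logs with I/O still in flight, so the loop only records the first
 * error and keeps going. */
static int wait_on_all_logs(struct demo_log *logs, size_t n)
{
    int ret = 0;
    size_t i;

    assert(logs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        int err = demo_wait_one(&logs[i]);
        if (err && !ret)
            ret = err;  /* remember the first error, keep waiting */
    }
    return ret;
}
```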