10 Sep, 2010

1 commit


09 Sep, 2010

13 commits


08 Sep, 2010

26 commits

  • The full cleanup of init_MUTEX[_LOCKED] and DECLARE_MUTEX has not been
    done. Some of the users are real semaphores and we should name them as
    such instead of confusing everyone with "MUTEX".

    Provide the infrastructure to finally get rid of init_MUTEX[_LOCKED]
    and DECLARE_MUTEX.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Christoph Hellwig
    LKML-Reference:

    Thomas Gleixner
     
  • ocfs2_create_inode_in_orphan() is used by reflink to create the newly
    reflinked inode simultaneously in the orphan dir. This allows us to easily
    handle partially-reflinked files during recovery cleanup.

    We have a problem though - the orphan dir stringifies inode # to determine
    a unique name under which the orphan entry dirent can be created. Since
    ocfs2_create_inode_in_orphan() needs the space allocated in the orphan dir
    before it can allocate the inode, we currently call into the orphan code:

    /*
     * We give the orphan dir the root blkno to fake an orphan name,
     * and allocate enough space for our insertion.
     */
    status = ocfs2_prepare_orphan_dir(osb, &orphan_dir,
                                      osb->root_blkno,
                                      orphan_name, &orphan_insert);

    Using osb->root_blkno might work fine on unindexed directories, but the
    orphan dir can have an index. When it has that index, the above code fails
    to allocate the proper index entry. Later, when we try to remove the file
    from the orphan dir (using the actual inode #), the reflink operation will
    fail.

    To fix this, I created a function ocfs2_alloc_orphaned_file() which uses the
    newly split out orphan and inode alloc code to figure out what the inode
    block number will be (once allocated) and then prepare the orphan dir from
    that data.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
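
The naming mismatch described in the ocfs2_create_inode_in_orphan() entry above can be sketched in user space. This is a minimal illustration, not the ocfs2 code: it assumes the orphan name is the block number printed as 16 zero-padded hex digits, and the names ORPHAN_NAMELEN, orphan_name_for() and orphan_entry_matches() are hypothetical. It only shows why an entry prepared under a placeholder block number cannot be found by a later lookup keyed on the real one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption for this sketch: orphan names are 16 hex digits. */
#define ORPHAN_NAMELEN 16

/* Stringify a block number into the name used for its orphan dir entry. */
static char *orphan_name_for(uint64_t blkno, char name[ORPHAN_NAMELEN + 1])
{
    int n = snprintf(name, ORPHAN_NAMELEN + 1, "%016llx",
                     (unsigned long long)blkno);

    assert(n == ORPHAN_NAMELEN);
    (void)n; /* keeps the build quiet if asserts are compiled out */
    return name;
}

/*
 * Returns 1 if an entry prepared under prepared_blkno would be found by a
 * later lookup keyed on actual_blkno, 0 otherwise.
 */
static int orphan_entry_matches(uint64_t prepared_blkno, uint64_t actual_blkno)
{
    char prepared[ORPHAN_NAMELEN + 1];
    char actual[ORPHAN_NAMELEN + 1];

    orphan_name_for(prepared_blkno, prepared);
    orphan_name_for(actual_blkno, actual);
    return strcmp(prepared, actual) == 0;
}
```

Preparing the entry from the block number the inode will actually receive, as the commit describes, makes the two names agree by construction.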
     
  • We do this because ocfs2_create_inode_in_orphan() wants to order locking of
    the orphan dir with respect to locking of the inode allocator *before*
    making any changes to the directory.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • This helps code which needs to know the eventual block number of an inode
    but can't allocate it yet due to transaction or lock ordering. For example,
    ocfs2_create_inode_in_orphan() currently gives a junk blkno for preparation
    of the orphan dir because it can't yet know where the actual inode is placed
    - that code is actually in ocfs2_mknod_locked. This is a problem when the
    orphan dirs are indexed as the junk inode number will create an index entry
    which goes unused (and fails the later removal from the orphan dir). Now
    with these interfaces, ocfs2_create_inode_in_orphan() can run the block
    group search (and get back the inode block number) *before* any actual
    allocation occurs.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • ocfs2_search_chain() makes the same updates as
    ocfs2_alloc_dinode_update_counts to the alloc inode. Instead of open coding
    the bitmap update, use our helper function.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • Do this by splitting the bulk of the function away from the inode allocation
    code at the very top of ocfs2_mknod_locked(). Existing callers don't need to
    change and won't see any difference. The new function created,
    __ocfs2_mknod_locked(), will be used shortly.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • This patch fixes the regression introduced by commit 6b933c8...('ocfs2:
    Avoid direct write if we fall back to buffered I/O'):

    http://oss.oracle.com/bugzilla/show_bug.cgi?id=1285

    Commit 6b933c8e6f1a2f3118082c455eef25f9b1ac7b45 changed __generic_file_aio_write
    to generic_file_buffered_write, which doesn't call filemap_{write,wait}_range to flush
    the page cache when an O_DIRECT write falls back to a buffered one. That hurt
    the O_DIRECT semantics for extended O_DIRECT writes.

    This patch tries to guarantee that O_DIRECT writes which fall back to buffered I/O
    are correctly flushed.

    Signed-off-by: Tristan Ye
    Signed-off-by: Tao Ma

    Tristan Ye
     
  • We cannot call grab_cache_page() when holding filesystem locks or with
    a transaction started as grab_cache_page() calls page allocation with
    GFP_KERNEL flag and thus page reclaim can recurse back into the filesystem
    causing deadlocks or various assertion failures. We have to use
    find_or_create_page() instead and pass it GFP_NOFS as we do with other
    allocations.

    Acked-by: Mark Fasheh
    Signed-off-by: Jan Kara
    Signed-off-by: Tao Ma

    Jan Kara
     
  • We were setting ac->ac_last_group in ocfs2_claim_suballoc_bits from
    res->sr_bg_blkno. Unfortunately, res->sr_bg_blkno is going to be zero under
    normal (non-fragmented) circumstances. The discontig block group patches
    effectively turned off that feature. Fix this by correctly calculating what
    the next group hint should be.

    Acked-by: Tao Ma
    Signed-off-by: Mark Fasheh
    Tested-by: Goldwyn Rodrigues
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • We have added discontig block groups, so an inode can now
    be allocated in a discontig block group. So get
    it in ocfs2_get_suballoc_slot_bit.

    The old ocfs2_test_suballoc_bit gets the group block number
    from the allocation inode, which is wrong. Fix it by
    passing the right group.

    Acked-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Tao Ma
     
  • When the 'barrier' mount option is specified, we have to issue a cache flush
    during fdatasync(2). We have to do this even if the inode doesn't have
    I_DIRTY_DATASYNC set because we still have to get written *data* to disk so
    that it is not lost in case of a crash.

    Acked-by: Tao Ma
    Signed-off-by: Jan Kara
    Signed-off-by: Tao Ma

    Jan Kara
     
  • __ocfs2_page_mkwrite is currently broken in its handling of the end of the file.
    1. The last page should be the page that contains i_size - 1.
    2. The len in the last page is also calculated incorrectly.
    So change them accordingly.

    Acked-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Tao Ma
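
The two points in the __ocfs2_page_mkwrite entry above come down to boundary arithmetic, which can be sketched in user space. This is an illustration under assumptions, not the kernel code: a 4096-byte page is assumed, and SKETCH_PAGE_SHIFT, last_page_index() and valid_len_in_page() are hypothetical names. The last page is taken to be the one holding byte i_size - 1, and its valid length is derived from that same byte offset.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4096-byte pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/* Index of the page that holds the last byte of a non-empty file. */
static uint64_t last_page_index(uint64_t i_size)
{
    assert(i_size > 0);
    return (i_size - 1) >> SKETCH_PAGE_SHIFT;
}

/* Number of bytes of page `index` that lie inside a file of i_size bytes. */
static uint64_t valid_len_in_page(uint64_t index, uint64_t i_size)
{
    uint64_t last = last_page_index(i_size);

    if (index > last)
        return 0;                 /* page is entirely past the end of file */
    if (index < last)
        return SKETCH_PAGE_SIZE;  /* interior page: fully valid */

    /* Last page: offset of the final byte within the page, plus one. */
    return ((i_size - 1) & (SKETCH_PAGE_SIZE - 1)) + 1;
}
```

The case that separates the two formulations is a file whose size is an exact multiple of the page size: working from i_size - 1 keeps the last page at index 0 with a length of 4096 for a 4096-byte file, whereas arithmetic on i_size itself would give index 1 and a length of 0.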
     
  • For local mounts, ocfs2_read_locked_inode() calls ocfs2_read_blocks_sync() to
    read the inode off the disk. The latter first checks to see if that block is
    cached in the journal, and, if so, returns that block. That is ok.

    But ocfs2_read_locked_inode() goes wrong when it tries to validate the checksum
    of such blocks. Blocks that are cached in the journal may not have had their
    checksum computed as yet. We should not validate the checksums of such blocks.

    Fixes ossbz#1282
    http://oss.oracle.com/bugzilla/show_bug.cgi?id=1282

    Signed-off-by: Sunil Mushran
    Cc: stable@kernel.org
    Signed-off-by: Tao Ma

    Sunil Mushran
     
  • Like the tools, the checksum validate function now prints the values in hex.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Tao Ma

    Sunil Mushran
     
  • * 'for-2.6.36' of git://linux-nfs.org/~bfields/linux:
    nfsd4: mask out non-access bits in nfs4_access_to_omode

    Linus Torvalds
     
  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
    ima: always maintain counters
    AppArmor: Fix locking from removal of profile namespace
    AppArmor: Fix splitting an fqname into separate namespace and profile names
    AppArmor: Fix security_task_setrlimit logic for 2.6.36 changes
    AppArmor: Drop hack to remove appended " (deleted)" string

    Linus Torvalds
     
  • commit 8262bb85da allocated the inode integrity struct (iint) before any
    inodes were created. Only after IMA was initialized in late_initcall were
    the counters updated. This patch updates the counters, whether or not IMA
    has been initialized, to resolve 'imbalance' messages.

    This patch fixes the bug as reported in bugzilla: 15673. When the i915
    is builtin, the ring_buffer is initialized before IMA, causing the
    imbalance message on suspend.

    Reported-by: Thomas Meyer
    Signed-off-by: Mimi Zohar
    Tested-by: Thomas Meyer
    Tested-by: David Safford
    Cc: Stable Kernel
    Signed-off-by: James Morris

    Mimi Zohar
     
  • The locking for profile namespace removal is wrong: when removing a
    profile namespace, it needs to be removed from its parent's list.
    Lock the parent's namespace list instead of the namespace being removed.

    Signed-off-by: John Johansen
    Signed-off-by: James Morris

    John Johansen
     
  • As per Dan Carpenter:
    If we have a ns name without a following profile then in the original
    code it did "*ns_name = &name[1];". "name" is NULL so "*ns_name" is
    0x1. That isn't useful and could cause an oops when this function is
    called from aa_remove_profiles().

    Beyond this the assignment of the namespace name was wrong in the case
    where the profile name was provided as it was being set to &name[1]
    after name = skip_spaces(split + 1);

    Move the ns_name assignment before updating name for the split and
    also add skip_spaces, making the interface more robust.

    Signed-off-by: John Johansen
    Signed-off-by: James Morris

    John Johansen
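
The ordering fix described in the fqname-splitting entry above can be sketched as a user-space function. This is a hedged approximation rather than the AppArmor source: it assumes a ":namespace:profile" input format, and split_fqname_sketch() and skip_spaces_sketch() are hypothetical stand-ins. The point it illustrates is that the namespace pointer is recorded before `name` is advanced past the separator, and that a namespace with no following profile yields a NULL profile name instead of a bogus pointer.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel helper: advance past leading whitespace. */
static char *skip_spaces_sketch(char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    return s;
}

/*
 * Split ":ns:profile" in place (the buffer is modified).
 * Returns the profile name, or NULL if there is none.
 * *ns_name is set to the namespace name, or NULL if there is none.
 */
static char *split_fqname_sketch(char *fqname, char **ns_name)
{
    char *name;

    assert(fqname != NULL && ns_name != NULL);
    name = skip_spaces_sketch(fqname);
    *ns_name = NULL;

    if (name[0] == ':') {
        char *split = strchr(&name[1], ':');

        /* Record the namespace before name is moved past the split. */
        *ns_name = skip_spaces_sketch(&name[1]);
        if (split) {
            *split = '\0';
            name = skip_spaces_sketch(split + 1);
        } else {
            /* Namespace only: no profile name follows. */
            name = NULL;
        }
    }
    if (name && *name == '\0')
        name = NULL;

    return name;
}
```

Because the input is split in place, callers need to pass a writable buffer rather than a string literal.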
     
  • 2.6.36 introduced the ability to specify the task that is having its
    rlimits set. Update mediation to ensure that confined tasks can only
    set their own group_leader as expected by current policy.

    Add TODO note about extending policy to support setting other tasks'
    rlimits.

    Signed-off-by: John Johansen
    Signed-off-by: James Morris

    John Johansen
     
  • The 2.6.36 kernel has refactored __d_path() so that it no longer appends
    " (deleted)" to unlinked paths. So drop the hack that was used to detect
    and remove the appended string.

    Signed-off-by: John Johansen
    Signed-off-by: James Morris

    John Johansen
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
    PCI: bus speed strings should be const
    PCI hotplug: Fix build with CONFIG_ACPI unset
    PCI: PCIe: Remove the port driver module exit routine
    PCI: PCIe: Move PCIe PME code to the pcie directory
    PCI: PCIe: Disable PCIe port services during port initialization
    PCI: PCIe: Ask BIOS for control of all native services at once
    ACPI/PCI: Negotiate _OSC control bits before requesting them
    ACPI/PCI: Do not preserve _OSC control bits returned by a query
    ACPI/PCI: Make acpi_pci_query_osc() return control bits
    ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()
    PCI: PCIe: Introduce command line switch for disabling port services
    PCI: PCIe AER: Introduce pci_aer_available()
    x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set
    PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs

    Linus Torvalds
     
  • * 'for-linus' of git://oss.sgi.com/xfs/xfs:
    xfs: Make fiemap work with sparse files
    xfs: prevent 32bit overflow in space reservation
    xfs: Disallow 32bit project quota id
    xfs: improve buffer cache hash scalability

    Linus Torvalds
     
  • * 'for-linus' of git://android.kernel.org/kernel/tegra:
    [ARM] tegra: Add ZRELADDR default for ARCH_TEGRA

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
    alpha: Fix printk format errors
    alpha: convert perf_event to use local_t
    Fix call to replaced SuperIO functions
    alpha: remove homegrown L1_CACHE_ALIGN macro

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
    9p: potential ERR_PTR() dereference

    Linus Torvalds