04 Jul, 2013

33 commits

  • The template lookup interface does not provide a way to use format
    strings, so make sure that the interface cannot be abused accidentally.

    Signed-off-by: Kees Cook
    Cc: Herbert Xu
    Cc: "David S. Miller"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • Disk names may contain arbitrary strings, so they must not be
    interpreted as format strings. It seems that only md allows arbitrary
    strings to be used for disk names, but this could allow for a local
    memory corruption from uid 0 into ring 0.
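
    The general pattern can be sketched in userspace C. This is a minimal
    illustration, not the kernel change itself: set_name() is a
    hypothetical helper, and snprintf() stands in for whatever
    printf-style function ends up consuming the name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel API): store an externally supplied
 * name. The name is passed as an argument to a constant "%s" format, so
 * any '%' characters in it are copied literally instead of being parsed.
 */
static int set_name(char *dst, size_t dstlen, const char *untrusted_name)
{
    /* The unsafe variant would be: snprintf(dst, dstlen, untrusted_name); */
    return snprintf(dst, dstlen, "%s", untrusted_name);
}
```

    With this shape, a name such as "md%x%n" is stored byte for byte
    rather than being treated as a set of conversion directives.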

    CVE-2013-2851

    Signed-off-by: Kees Cook
    Cc: Jens Axboe
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
     
  • In drivers/cdrom/cdrom.c mmc_ioctl_cdrom_read_data() allocates a memory
    area with kmalloc in line 2885.

    2885 cgc->buffer = kmalloc(blocksize, GFP_KERNEL);
    2886 if (cgc->buffer == NULL)
    2887 return -ENOMEM;

    In line 2908 we can find the copy_to_user function:

    2908 if (!ret && copy_to_user(arg, cgc->buffer, blocksize))

    The cgc->buffer is never cleared or initialized before this call.
    If ret is 0 after the preceding basic block, uninitialized kernel
    memory can be exposed to userspace.

    When we read a block from the disk it normally fills the ->buffer, but
    if the drive is malfunctioning there is a chance that it would only be
    partially filled. The result is an information leak to userspace.
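
    One common remedy for this class of leak is to hand out zeroed memory,
    so that a short read leaves zeros rather than stale data. A userspace
    sketch follows, with calloc() standing in for a zeroing kernel
    allocator; partial_read() and read_block() are hypothetical names, and
    the half-filled buffer only simulates a malfunctioning drive.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device read that fills only the first half of the buffer. */
static size_t partial_read(unsigned char *buf, size_t len)
{
    size_t filled = len / 2;

    memset(buf, 0xAB, filled);
    return filled;
}

/* Allocate a zeroed block, then let the (possibly short) read fill it. */
static unsigned char *read_block(size_t blocksize)
{
    unsigned char *buf = calloc(1, blocksize);

    if (buf == NULL)
        return NULL;
    partial_read(buf, blocksize);
    return buf; /* the unfilled tail is zero, not leftover heap contents */
}
```

    The caller owns the returned buffer and releases it with free().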

    Signed-off-by: Dan Carpenter
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jonathan Salwan
     
  • There is a hole in struct hd_geometry, so we have to zero the struct on
    stack before copying it to user-space.
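
    A minimal sketch of the zero-then-fill pattern, assuming a struct
    whose layout leaves padding between members on common 64-bit ABIs.
    The struct name, the fill_geo() helper and the field values are
    illustrative only.

```c
#include <assert.h>
#include <string.h>

/* Illustrative struct: on common 64-bit ABIs there is padding before start. */
struct geo_sketch {
    unsigned char heads;
    unsigned char sectors;
    unsigned short cylinders;
    unsigned long start;
};

/* Zero the whole object first so padding bytes never carry stack garbage. */
static void fill_geo(struct geo_sketch *g)
{
    memset(g, 0, sizeof(*g));
    g->heads = 16;
    g->sectors = 63;
    g->cylinders = 1024;
    g->start = 2048;
}
```

    After fill_geo(), every byte of the object, padding included, is
    determined by the function rather than by whatever was on the stack.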

    Signed-off-by: Cong Wang
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cong Wang
     
  • Without this patch, gdrom_major will leak when gd.cd_info alloc fails.

    Signed-off-by: Libo Chen
    Cc: Jens Axboe
    Acked-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Libo Chen
     
  • A NULL pointer dereference may occur in config_item_name() when one
    volume (say Volume A) unmounts while another (say Volume B) is mounting.

    Volume A                              Volume B

    already mounted.
    Unmounting, call
    o2hb_heartbeat_group_drop_item()
    -> config_item_put(item)
    set reg(A)->item.ci_name to NULL
    in function config_item_cleanup().

                                          begin mounting, call
                                          o2hb_region_pin() and traverse all
                                          regions. When reading
                                          reg(A)->item.ci_name, it causes
                                          NULL pointer dereference.

    call o2hb_region_release() and
    del reg(A) from list.

    So we should skip regions that are about to be released when
    traversing o2hb_all_regions.
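
    The idea of skipping entries that are being torn down can be sketched
    as follows. This is a userspace illustration under stated
    assumptions: the struct, the being_released flag and the function name
    are hypothetical and do not reflect the actual o2hb data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct region_sketch {
    const char *name;            /* may already be NULL once teardown started */
    bool being_released;         /* hypothetical flag set before name is cleared */
    struct region_sketch *next;
};

/* Sum the name lengths of live regions, never touching a dying region's name. */
static size_t total_live_name_len(const struct region_sketch *head)
{
    size_t total = 0;

    for (const struct region_sketch *r = head; r != NULL; r = r->next) {
        if (r->being_released)
            continue;            /* skip before dereferencing r->name */
        total += strlen(r->name);
    }
    return total;
}
```

    The check has to come before any access to the name; testing the flag
    after strlen() would reintroduce the dereference.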

    Signed-off-by: Yiwen Jiang
    Signed-off-by: joyce
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Cc: Sunil Mushran
    Cc: Jie Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xue jiufei
     
  • Adjust switch..case syntax at o2net_state_change to meet the kernel coding
    standard.

    s/printk/pr_info/.

    [akpm@linux-foundation.org: revert pr_foo() change]
    Signed-off-by: Jie Liu
    Acked-by: Joel Becker
    Cc: Gurudas Pai
    Cc: Mark Fasheh
    Cc: Noboru Iwamatsu
    Cc: Srinivas Eeeda
    Cc: Sunil Mushran
    Cc: Tao Ma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jie Liu
     
  • Fix a comment typo in o2quo_hb_still_up()

    Signed-off-by: Jie Liu
    Cc: Gurudas Pai
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Noboru Iwamatsu
    Cc: Srinivas Eeeda
    Cc: Sunil Mushran
    Cc: Tao Ma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jie Liu
     
  • s/o2hb_global_hearbeat_mode_set/o2hb_global_heartbeat_mode_set/ to make
    the signature of those routines in a consistent manner with others for
    heartbeating.

    Signed-off-by: Jie Liu
    Acked-by: Sunil Mushran
    Cc: Gurudas Pai
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Noboru Iwamatsu
    Cc: Srinivas Eeeda
    Cc: Tao Ma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jie Liu
     
  • Under heavy I/O load, writing the disk heartbeat can be forced to wait for
    minutes, and this causes the node to be fenced.

    This patch tries to use WRITE_SYNC in submitting the heartbeat bio, so
    that writing the heartbeat will have a priority over other requests.

    Signed-off-by: Noboru Iwamatsu
    Acked-by: Tao Ma
    Acked-by: Sunil Mushran
    Cc: Srinivas Eeeda
    Reviewed-by: Jie Liu
    Tested-by: Gurudas Pai
    Cc: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Noboru Iwamatsu
     
  • Inlined xattrs share the free space of the inode block with inlined
    data or the data extent record, so the size of the latter two should be
    adjusted when inlined xattrs are enabled. See ocfs2_xattr_ibody_init().
    But this isn't done properly during reflink. For an inode with inlined
    data, its max inlined data size is adjusted in
    ocfs2_duplicate_inline_data(), so there is no problem. But for an inode
    with a data extent record, its record count isn't adjusted. Fix it;
    otherwise the data extent record and the inlined xattrs may overwrite
    each other, causing data corruption or xattr failure.

    One panic caused by this bug in our test environment is the following:

    kernel BUG at fs/ocfs2/xattr.c:1435!
    invalid opcode: 0000 [#1] SMP
    Pid: 10871, comm: multi_reflink_t Not tainted 2.6.39-300.17.1.el5uek #1
    RIP: ocfs2_xa_offset_pointer+0x17/0x20 [ocfs2]
    RSP: e02b:ffff88007a587948 EFLAGS: 00010283
    RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00000000000051e4
    RDX: ffff880057092060 RSI: 0000000000000f80 RDI: ffff88007a587a68
    RBP: ffff88007a587948 R08: 00000000000062f4 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000010
    R13: ffff88007a587a68 R14: 0000000000000001 R15: ffff88007a587c68
    FS: 00007fccff7f06e0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
    CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00000000015cf000 CR3: 000000007aa76000 CR4: 0000000000000660
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process multi_reflink_t
    Call Trace:
    ocfs2_xa_reuse_entry+0x60/0x280 [ocfs2]
    ocfs2_xa_prepare_entry+0x17e/0x2a0 [ocfs2]
    ocfs2_xa_set+0xcc/0x250 [ocfs2]
    ocfs2_xattr_ibody_set+0x98/0x230 [ocfs2]
    __ocfs2_xattr_set_handle+0x4f/0x700 [ocfs2]
    ocfs2_xattr_set+0x6c6/0x890 [ocfs2]
    ocfs2_xattr_user_set+0x46/0x50 [ocfs2]
    generic_setxattr+0x70/0x90
    __vfs_setxattr_noperm+0x80/0x1a0
    vfs_setxattr+0xa9/0xb0
    setxattr+0xc3/0x120
    sys_fsetxattr+0xa8/0xd0
    system_call_fastpath+0x16/0x1b

    Signed-off-by: Junxiao Bi
    Reviewed-by: Jie Liu
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Cc: Sunil Mushran
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Junxiao Bi
     
  • There is a bug in ocfs2_unlink() that can leave the filesystem
    read-only when deleting a file.

    After calling ocfs2_orphan_add(), the file to be deleted has been
    added to the orphan dir. If ocfs2_delete_entry() then fails, the file
    still exists in the parent dir, and this scenario introduces a metadata
    conflict.

    If a file has been added to the orphan dir, then when we put the inode
    of the file with iput(), OCFS2_VALID_FL is cleared from the inode's
    i_flags in ocfs2_remove_inode(), and the inode is then written back to
    disk.

    But as previously mentioned, the file still exists in the parent dir.
    On other nodes, the file can still be accessed. When the file is first
    read from disk with ocfs2_read_blocks(), the inode is checked and
    validated using ocfs2_validate_inode_block(). The filesystem then
    becomes read-only because the inode is invalid; in other words,
    OCFS2_VALID_FL has been cleared from the inode's i_flags.

    [akpm@linux-foundation.org: cleanups]
    [jeff.liu@oracle.com: s/inode_is_unlinkable/ocfs2_inode_is_unlinkable/]
    Signed-off-by: Younger Liu
    Signed-off-by: Jensen
    Cc: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Sunil Mushran
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Younger Liu
     
  • Cc: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Sunil Mushran
    Cc: Younger Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • In ocfs2_relink_block_group(), if notifying the intent to modify
    buffers for a metadata update fails, we roll back all the changes even
    if the relevant buffer has not yet been modified/dirtied at that point.
    That is not quite right because:

    - No buffer has been modified/dirtied if the call to
    ocfs2_journal_access_gd() against the previous block group buffer fails

    - Only the previous block group buffer has been dirtied if the call to
    ocfs2_journal_access_gd() against the block group buffer fails

    - There is no need to roll back the change for the file entry buffer at all

    Those rollbacks do no harm but are unnecessary. This patch fixes them
    and kills the useless bg_ptr variable as well.

    Signed-off-by: Jie Liu
    Cc: Younger Liu
    Cc: Sunil Mushran
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jie Liu
     
  • While adding a file into orphan dir in ocfs2_orphan_add(), it calls
    __ocfs2_add_entry() before ocfs2_journal_access_di(). If
    ocfs2_journal_access_di() failed, the file is added into orphan dir, and
    orphan dir dinode updated, but file dinode has not been updated.
    Accordingly, the data is not consistent between file dinode and orphan
    dir.

    So we need to call ocfs2_journal_access_di() before __ocfs2_add_entry(),
    and if ocfs2_journal_access_di() fails, orphan_fe and
    orphan_dir_inode->i_nlink need to be rolled back.

    This bug was added by 3939fda4 ("Ocfs2: Journaling i_flags and
    i_orphaned_slot when adding inode to orphan dir.").

    Signed-off-by: Younger Liu
    Acked-by: Jeff Liu
    Cc: Sunil Mushran
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Younger Liu
     
  • dlmlock_master() returns DLM_RECOVERING/DLM_MIGRATING/DLM_FORWARD after
    adding the lock to the blocked list if the lockres has the state
    DLM_LOCK_RES_RECOVERING/DLM_LOCK_RES_MIGRATING/DLM_LOCK_RES_IN_PROGRESS,
    so it will retry in dlmlock(). This may cause dlm_thread to fall into
    an infinite loop:

    Thread1                                  dlm_thread

    calls dlm_lock->dlmlock_master,
    if lockresA is in state
    DLM_LOCK_RES_RECOVERING, calls
    __dlm_wait_on_lockres() and waits
    until other threads clear this
    state;

    If cannot grant this lock,
    adding lock to blocked list,
    and return DLM_RECOVERING;

                                             Grant this lock and move it to
                                             grant list;

    After a while, retry and
    calls list_add_tail(), adding lock
    to blocked list again.

    The granted and blocked lists of this lockres will end up in the
    following state:

    lock_res->granted.next = dlm_lock->list_head;
    lock_res->blocked.next = dlm_lock->list_head;
    dlm_lock->list_head.next = dlm_lock_resource->blocked;

    When dlm_thread traverses the granted list, it will fall into an endless
    loop, checking dlm_lock.list_head, dlm_lock->list_head.next
    (i.e. lock_res->blocked), lock_res->blocked.next (i.e. dlm_lock.list_head
    again), and so on.
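
    The corruption can be reproduced with a small userspace model of a
    circular doubly linked list. The helpers below are simplified
    re-implementations written for this sketch, not the kernel's list.h,
    and walk_terminates() is a hypothetical bounded walk added so that the
    demonstration itself cannot hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *head)
{
    struct list_head *prev = head->prev;

    n->next = head;
    n->prev = prev;
    prev->next = n;
    head->prev = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Walk from head; report whether we get back to head within max_steps. */
static bool walk_terminates(const struct list_head *head, size_t max_steps)
{
    const struct list_head *p = head->next;

    for (size_t i = 0; i < max_steps; i++) {
        if (p == head)
            return true;
        p = p->next;
    }
    return false;
}

/* Replay the sequence from the timeline above on a single lock. */
static bool double_add_breaks_granted(void)
{
    struct list_head granted, blocked, lock;

    list_init(&granted);
    list_init(&blocked);
    list_add_tail(&lock, &blocked);   /* Thread1: cannot grant, block it */
    list_del(&lock);                  /* dlm_thread: grant the lock ...   */
    list_add_tail(&lock, &granted);   /* ... and move it to granted       */
    list_add_tail(&lock, &blocked);   /* Thread1 retry: added again       */
    return !walk_terminates(&granted, 1000);
}
```

    After the second list_add_tail(), the lock's next pointer leads into
    the blocked list, so a walk that starts at granted cycles between the
    lock and the blocked head and never returns to granted.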

    Signed-off-by: joyce
    Reviewed-by: jensen
    Cc: Jeff Liu
    Acked-by: Sunil Mushran
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xue jiufei
     
  • Free space checking will be done in ocfs2_xattr_ibody_init(), so remove
    it here.

    [akpm@linux-foundation.org: remove unused local]
    Signed-off-by: Junxiao Bi
    Reviewed-by: Jie Liu
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Junxiao Bi
     
  • There is a memory leak in sc_kref_release(). When freeing struct
    o2net_sock_container (sc), we should release sc->sc_page.

    Signed-off-by: Younger Liu
    Reviewed-by: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Younger Liu
     
  • While adding extents to a file, the credits are calculated incorrectly,
    and if the number of requested clusters is more than one (or more,
    because we used a conservative limit) then we run out of journal
    credits and hit an assert in the journalling code.

    The function parameter bits_wanted variable was not used at all.

    Signed-off-by: Goldwyn Rodrigues
    Reviewed-by: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Goldwyn Rodrigues
     
  • In ocfs2_remove_btree_range, when the call to ocfs2_lock_refcount_tree
    or ocfs2_prepare_refcount_change_for_del fails, it goes to out and then
    tries to call mutex_unlock without a preceding mutex_lock. And when the
    call to ocfs2_reserve_blocks_for_rec_trunc fails, it should free
    ref_tree before returning.

    Signed-off-by: Joseph Qi
    Reviewed-by: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • Code cleanup: needs_checkpoint is assigned to but never used. Delete
    the variable.

    Signed-off-by: Goldwyn Rodrigues
    Cc: Jeff Liu
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Goldwyn Rodrigues
     
  • dlm_begin_reco_handler() returns without putting dlm when dlm recovery
    state is DLM_RECO_STATE_FINALIZE.

    Signed-off-by: joyce
    Reviewed-by: Jie Liu
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xue jiufei
     
  • If we use le32_add_cpu to set ocfs2_dinode i_flags, the corresponding
    flag may be corrupted. So we should change it to a bitwise and/or
    operation.
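
    A small sketch of why addition is the wrong tool for flag words. The
    flag values and helper names are made up for illustration, and the
    little-endian conversion that the real field needs is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, chosen only for this sketch. */
#define SKETCH_VALID_FL    0x01u
#define SKETCH_ORPHANED_FL 0x20u

/* Adding a flag that is already set carries into the neighbouring bit. */
static uint32_t set_by_add(uint32_t flags, uint32_t flag) { return flags + flag; }

/* OR is idempotent: setting an already-set flag changes nothing. */
static uint32_t set_by_or(uint32_t flags, uint32_t flag) { return flags | flag; }

/* AND with the complement clears the flag whether or not it was set. */
static uint32_t clear_by_and(uint32_t flags, uint32_t flag) { return flags & ~flag; }
```

    Starting from 0x21 (both flags set), setting the orphaned flag again
    by addition yields 0x41, which drops the flag and sets an unrelated
    bit, while the OR version leaves the word at 0x21.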

    Signed-off-by: Joseph Qi
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: shencanquan
    Reviewed-by: Jie Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • In dlm_request_all_locks, ret is of an enum type, but
    o2net_send_message returns an int. As a result, the error branch that
    follows is never taken. So we should change the type of ret from enum
    to int.
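
    The effect can be sketched in userspace. Whether an enum is signed is
    implementation-defined; GCC picks unsigned int when no enumerator is
    negative, so the sketch models the enum-typed variable with an
    explicit unsigned int. send_message_sketch() is a hypothetical
    stand-in that always fails with a negative code.

```c
#include <assert.h>

/* Hypothetical call that fails with a negative error code. */
static int send_message_sketch(void) { return -5; }

/* ret modelled as unsigned: the negative code becomes a large positive value. */
static int error_seen_unsigned(void)
{
    unsigned int ret = (unsigned int)send_message_sketch();

    return ret < 0; /* never true for an unsigned value */
}

/* ret as int: the failure is detected as intended. */
static int error_seen_int(void)
{
    int ret = send_message_sketch();

    return ret < 0;
}
```

    Compilers may warn that the comparison in error_seen_unsigned() is
    always false, which is the same observation the patch is based on.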

    Signed-off-by: Joseph Qi
    Cc: Joel Becker
    Cc: Mark Fasheh
    Acked-by: Sunil Mushran
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • The 3 functions below have already been declared in dlmcommon.h, so
    there is no need to declare them again in dlmrecovery.c:

    dlm_complete_recovery_thread
    dlm_launch_recovery_thread
    dlm_kick_recovery_thread

    Signed-off-by: Joseph Qi
    Cc: Joel Becker
    Cc: Mark Fasheh
    Acked-by: Sunil Mushran
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • The difference between "count" and "len" is that "len" is capped at
    4095. Changing it like this makes it match how sysfs_write_file() is
    implemented.
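
    The relation between the two values can be sketched as below,
    assuming a 4096-byte page, which is where the 4095 figure comes from.
    SKETCH_PAGE_SIZE and capped_len() are names invented for this
    example.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Length actually buffered: the requested count, capped at one page minus one. */
static size_t capped_len(size_t count)
{
    size_t limit = SKETCH_PAGE_SIZE - 1;

    return count > limit ? limit : count;
}
```

    For any write of at most 4095 bytes the two values are identical,
    which fits the note that no existing store_attribute() function is
    known to be affected.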

    This is a static analysis patch. I haven't found any store_attribute()
    functions where this change makes a difference.

    Signed-off-by: Dan Carpenter
    Acked-by: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Carpenter
     
  • sparc64 gcc-3.4.5 (at least) spits this back.

    Cc: Andrey Smirnov
    Cc: Mark Brown
    Cc: Takashi Iwai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • tasklet_kill() may sleep so call it before taking pch->lock.

    Fixes the following lockup:

    BUG: scheduling while atomic: cat/2383/0x00000002
    Modules linked in:
    unwind_backtrace+0x0/0xfc
    __schedule_bug+0x4c/0x58
    __schedule+0x690/0x6e0
    sys_sched_yield+0x70/0x78
    tasklet_kill+0x34/0x8c
    pl330_free_chan_resources+0x24/0x88
    dma_chan_put+0x4c/0x50
    [...]
    BUG: spinlock lockup suspected on CPU#0, swapper/0/0
    lock: 0xe52aa04c, .magic: dead4ead, .owner: cat/2383, .owner_cpu: 1
    unwind_backtrace+0x0/0xfc
    do_raw_spin_lock+0x194/0x204
    _raw_spin_lock_irqsave+0x20/0x28
    pl330_tasklet+0x2c/0x5a8
    tasklet_action+0xfc/0x114
    __do_softirq+0xe4/0x19c
    irq_exit+0x98/0x9c
    handle_IPI+0x124/0x16c
    gic_handle_irq+0x64/0x68
    __irq_svc+0x40/0x70
    cpuidle_wrap_enter+0x4c/0xa0
    cpuidle_enter_state+0x18/0x68
    cpuidle_idle_call+0xac/0xe0
    cpu_idle+0xac/0xf0

    Signed-off-by: Bartlomiej Zolnierkiewicz
    Signed-off-by: Kyungmin Park
    Acked-by: Jassi Brar
    Cc: Vinod Koul
    Cc: Tomasz Figa
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bartlomiej Zolnierkiewicz
     
  • Need to include "asm/uaccess.h" to pass compilation.

    The related error (with allmodconfig):

    arch/c6x/mm/init.c: In function `paging_init':
    arch/c6x/mm/init.c:46:2: error: implicit declaration of function `set_fs' [-Werror=implicit-function-declaration]
    arch/c6x/mm/init.c:46:9: error: `KERNEL_DS' undeclared (first use in this function)
    arch/c6x/mm/init.c:46:9: note: each undeclared identifier is reported only once for each function it appears in

    Signed-off-by: Chen Gang
    Cc: [3.10.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chen Gang
     
  • Commit f21afc25f9ed ("smp.h: Use local_irq_{save,restore}() in !SMP
    version of on_each_cpu()") converted on_each_cpu() to a C function.

    This required inclusion of irqflags.h, which broke ia64 and mn10300 (at
    least) due to header ordering hell.

    Switch on_each_cpu() back to a macro to fix this.
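
    Why a macro sidesteps the header-ordering problem can be shown in a
    small userspace sketch: a macro body is only compiled where it is
    expanded, so the helpers it names may be declared later than the macro
    itself. All names below are hypothetical stand-ins, and this is not
    the kernel's actual definition.

```c
#include <assert.h>

/*
 * Defined before the helpers exist: nothing is checked until expansion,
 * whereas an inline function here would need the helpers declared first.
 */
#define on_each_cpu_sketch(func, info) \
    (fake_irq_save(), (func)(info), fake_irq_restore(), 0)

static int irq_depth;            /* > 0 while "interrupts" are disabled */
static int calls_with_irqs_off;  /* how often func ran under protection */

static void fake_irq_save(void)    { irq_depth++; }
static void fake_irq_restore(void) { irq_depth--; }

/* Example callback: records whether it ran with interrupts "off". */
static void bump(void *info)
{
    if (irq_depth > 0)
        calls_with_irqs_off++;
    *(int *)info += 1;
}

/* Expansion site: by now every name used by the macro is declared. */
static int run_once(int *counter)
{
    return on_each_cpu_sketch(bump, counter);
}
```

    run_once() evaluates to 0 after calling bump() once between the save
    and restore helpers.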

    Reported-by: Geert Uytterhoeven
    Acked-by: Geert Uytterhoeven
    Cc: David Daney
    Cc: Ralf Baechle
    Cc: Stephen Rothwell
    Cc: [3.10.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Pull ARM64 updates from Catalin Marinas:
    "Main features:
    - KVM and Xen ports to AArch64
    - Hugetlbfs and transparent huge pages support for arm64
    - Applied Micro X-Gene Kconfig entry and dts file
    - Cache flushing improvements

    For arm64 huge pages support, there are x86 changes moving part of
    arch/x86/mm/hugetlbpage.c into mm/hugetlb.c to be re-used by arm64"

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (66 commits)
    arm64: Add initial DTS for APM X-Gene Storm SOC and APM Mustang board
    arm64: Add defines for APM ARMv8 implementation
    arm64: Enable APM X-Gene SOC family in the defconfig
    arm64: Add Kconfig option for APM X-Gene SOC family
    arm64/Makefile: provide vdso_install target
    ARM64: mm: THP support.
    ARM64: mm: Raise MAX_ORDER for 64KB pages and THP.
    ARM64: mm: HugeTLB support.
    ARM64: mm: Move PTE_PROT_NONE bit.
    ARM64: mm: Make PAGE_NONE pages read only and no-execute.
    ARM64: mm: Restore memblock limit when map_mem finished.
    mm: thp: Correct the HPAGE_PMD_ORDER check.
    x86: mm: Remove general hugetlb code from x86.
    mm: hugetlb: Copy general hugetlb code from x86 to mm.
    x86: mm: Remove x86 version of huge_pmd_share.
    mm: hugetlb: Copy huge_pmd_share from x86 to mm.
    arm64: KVM: document kernel object mappings in HYP
    arm64: KVM: MAINTAINERS update
    arm64: KVM: userspace API documentation
    arm64: KVM: enable initialization of a 32bit vcpu
    ...

    Linus Torvalds
     
  • Pull ARM updates from Russell King:
    "This contains the usual updates from other people (listed below) and
    the usual random muddle of miscellaneous ARM updates which cover some
    low priority bug fixes and performance improvements.

    I've started to put the pull request wording into the merge commits,
    which are:

    - NoMMU stuff:

    This includes the following series sent earlier to the list:
    - nommu-fixes
    - R7 Support
    - MPU support

    I've left out the ARCH_MULTIPLATFORM/!MMU stuff that Arnd and I
    were discussing today until we've reached a conclusion/that's had
    some more review.

    This is rebased (and re-tested) on your devel-stable branch because
    otherwise there were going to be conflicts with Uwe's V7M work now
    that you've merged that. I've included the fix for limiting MPU to
    CPU_V7.

    - Huge page support

    These changes bring both HugeTLB support and Transparent HugePage
    (THP) support to ARM. Only long descriptors (LPAE) are supported
    in this series.

    The code has been tested on an Arndale board (Exynos 5250).

    - LPAE updates

    Please pull these miscellaneous LPAE fixes I've been collecting for
    a while now for 3.11. They've been tested and reviewed by quite a
    few people, and most of the patches are pretty trivial. -- Will Deacon.

    - arch_timer cleanups

    Please pull these arch_timer cleanups I've been holding onto for a
    while. They're the same as my last posting, but have been rebased
    to v3.10-rc3.

    - mpidr linearisation (multiprocessor id register - identifies which
    CPU number we are in the system)

    This patch series that implements MPIDR linearization through a
    simple hashing algorithm and updates current cpu_{suspend}/{resume}
    code to use the newly created hash structures to retrieve context
    pointers. It represents a stepping stone for the implementation of
    power management code on forthcoming multi-cluster ARM systems.

    It has been tested on TC2 (dual cluster A15xA7 system), iMX6q,
    OMAP4 and Tegra, with processors hitting low-power states requiring
    warm-boot resume through the cpu_resume code path"

    * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (77 commits)
    ARM: 7775/1: mm: Remove do_sect_fault from LPAE code
    ARM: 7777/1: Avoid extra calls to the C compiler
    ARM: 7774/1: Fix dtb dependency to use order-only prerequisites
    ARM: 7770/1: remove residual ARMv2 support from decompressor
    ARM: 7769/1: Cortex-A15: fix erratum 798181 implementation
    ARM: 7768/1: prevent risks of out-of-bound access in ASID allocator
    ARM: 7767/1: let the ASID allocator handle suspended animation
    ARM: 7766/1: versatile: don't mark pen as __INIT
    ARM: 7765/1: perf: Record the user-mode PC in the call chain.
    ARM: 7735/2: Preserve the user r/w register TPIDRURW on context switch and fork
    ARM: kernel: implement stack pointer save array through MPIDR hashing
    ARM: kernel: build MPIDR hash function data structure
    ARM: mpu: Ensure that MPU depends on CPU_V7
    ARM: mpu: protect the vectors page with an MPU region
    ARM: mpu: Allow enabling of the MPU via kconfig
    ARM: 7758/1: introduce config HAS_BANDGAP
    ARM: 7757/1: mm: don't flush icache in switch_mm with hardware broadcasting
    ARM: 7751/1: zImage: don't overwrite ourself with a page table
    ARM: 7749/1: spinlock: retry trylock operation if strex fails on free lock
    ARM: 7748/1: oabi: handle faults when loading swi instruction from userspace
    ...

    Linus Torvalds
     
  • Pull second set of VFS changes from Al Viro:
    "Assorted f_pos race fixes, making do_splice_direct() safe to call with
    i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
    ->d_hash/->d_compare calling conventions changes from Linus, misc
    stuff all over the place."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    Document ->tmpfile()
    ext4: ->tmpfile() support
    vfs: export lseek_execute() to modules
    lseek_execute() doesn't need an inode passed to it
    block_dev: switch to fixed_size_llseek()
    cpqphp_sysfs: switch to fixed_size_llseek()
    tile-srom: switch to fixed_size_llseek()
    proc_powerpc: switch to fixed_size_llseek()
    ubi/cdev: switch to fixed_size_llseek()
    pci/proc: switch to fixed_size_llseek()
    isapnp: switch to fixed_size_llseek()
    lpfc: switch to fixed_size_llseek()
    locks: give the blocked_hash its own spinlock
    locks: add a new "lm_owner_key" lock operation
    locks: turn the blocked_list into a hashtable
    locks: convert fl_link to a hlist_node
    locks: avoid taking global lock if possible when waking up blocked waiters
    locks: protect most of the file_lock handling with i_lock
    locks: encapsulate the fl_link list handling
    locks: make "added" in __posix_lock_file a bool
    ...

    Linus Torvalds
     

03 Jul, 2013

7 commits

  • Signed-off-by: Al Viro

    Al Viro
     
  • very similar to ext3 counterpart...

    Signed-off-by: Al Viro

    Al Viro
     
  • For those file systems (btrfs/ext4/ocfs2/tmpfs) that support the
    SEEK_DATA/SEEK_HOLE functions, we end up handling the same
    matter as lseek_execute(): updating the current file offset
    to the desired offset if it is valid. ceph also does
    similar things in ceph_llseek().

    To reduce the duplication, this patch makes lseek_execute()
    publicly accessible so that we can call it directly from the
    underlying file systems.

    Thanks to Dave Chinner for this suggestion.

    [AV: call it vfs_setpos(), don't bring the removed 'inode' argument back]
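
    The shared logic described above (validate the offset, then update
    the file position) can be sketched in userspace as follows. The
    struct and function are hypothetical simplifications; the real helper
    operates on a struct file and may perform additional checks.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the part of an open file we care about. */
struct file_sketch {
    long long pos;
};

/* Set the position if offset is within [0, maxsize]; otherwise fail. */
static long long setpos_sketch(struct file_sketch *f, long long offset,
                               long long maxsize)
{
    if (offset < 0 || offset > maxsize)
        return -EINVAL;
    if (offset != f->pos)
        f->pos = offset;
    return offset;
}
```

    A file system's llseek implementation would compute the target offset
    for SEEK_DATA or SEEK_HOLE itself and leave the final validation and
    update to this one shared helper.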

    v2->v1:
    - Add kernel-doc comments for lseek_execute()
    - Call lseek_execute() in ceph->llseek()

    Signed-off-by: Jie Liu
    Cc: Dave Chinner
    Cc: Al Viro
    Cc: Andi Kleen
    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Chris Mason
    Cc: Josef Bacik
    Cc: Ben Myers
    Cc: Ted Tso
    Cc: Hugh Dickins
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Sage Weil
    Signed-off-by: Al Viro

    Jie Liu
     
  • Pull cpuset changes from Tejun Heo:
    "cpuset has always been rather odd about its configurations - a cgroup
    right after creation didn't allow any task executions before
    configuration, changing configuration in the parent modifies the
    descendants irreversibly and so on. These behaviors are inherently
    nasty and almost hostile against sharing the hierarchy with other
    controllers making it very difficult to use in unified hierarchy.

    Li is currently in the process of updating the behaviors for
    __DEVEL__sane_behavior which is the bulk of changes in this pull
    request. It isn't complete yet and the behaviors will change further
    but all changes are gated behind sane_behavior. In the process, the
    rather hairy work-item punting which was used to work around the
    limitations of cgroup descendant iterator was simplified."

    * 'for-3.11-cpuset' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    cpuset: rename @cont to @cgrp
    cpuset: fix to migrate mm correctly in a corner case
    cpuset: allow to move tasks to empty cpusets
    cpuset: allow to keep tasks in empty cpusets
    cpuset: introduce effective_{cpumask|nodemask}_cpuset()
    cpuset: record old_mems_allowed in struct cpuset
    cpuset: remove async hotplug propagation work
    cpuset: let hotplug propagation work wait for task attaching
    cpuset: re-structure update_cpumask() a bit
    cpuset: remove cpuset_test_cpumask()
    cpuset: remove unnecessary variable in cpuset_attach()
    cpuset: cleanup guarantee_online_{cpus|mems}()
    cpuset: remove redundant check in cpuset_cpus_allowed_fallback()

    Linus Torvalds
     
  • Pull cgroup changes from Tejun Heo:
    "This pull request contains the following changes.

    - cgroup_subsys_state (css) reference counting has been converted to
    percpu-ref. css is what each resource controller embeds into its
    own control structure and performs reference counting against. It may
    be used in hot paths of various subsystems and is similar to module
    refcnt in that aspect. For example, block-cgroup's css refcnting
    was showing up a lot in Mikulas's device-mapper scalability work
    and this should alleviate it.

    - cgroup subtree iterator has been updated so that RCU read lock can
    be released after grabbing reference. This allows simplifying
    users that need to block, which used to build an iteration list
    under the RCU read lock and then traverse it outside. This pull
    request contains simplification of cgroup core and device-cgroup.
    A separate pull request will update cpuset.

    - Fixes for various bugs including corner race conditions and RCU
    usage bugs.

    - A lot of cleanups and some preparatory work for the planned unified
    hierarchy support."

    * 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (48 commits)
    cgroup: CGRP_ROOT_SUBSYS_BOUND should also be ignored when mounting an existing hierarchy
    cgroup: CGRP_ROOT_SUBSYS_BOUND should be ignored when comparing mount options
    cgroup: fix deadlock on cgroup_mutex via drop_parsed_module_refcounts()
    cgroup: always use RCU accessors for protected accesses
    cgroup: fix RCU accesses around task->cgroups
    cgroup: fix RCU accesses to task->cgroups
    cgroup: grab cgroup_mutex in drop_parsed_module_refcounts()
    cgroup: fix cgroupfs_root early destruction path
    cgroup: reserve ID 0 for dummy_root and 1 for unified hierarchy
    cgroup: implement for_each_[builtin_]subsys()
    cgroup: move init_css_set initialization inside cgroup_mutex
    cgroup: s/for_each_subsys()/for_each_root_subsys()/
    cgroup: clean up find_css_set() and friends
    cgroup: remove cgroup->actual_subsys_mask
    cgroup: prefix global variables with "cgroup_"
    cgroup: convert CFTYPE_* flags to enums
    cgroup: rename cont to cgrp
    cgroup: clean up cgroup_serial_nr_cursor
    cgroup: convert cgroup_cft_commit() to use cgroup_for_each_descendant_pre()
    cgroup: make serial_nr_cursor available throughout cgroup.c
    ...

    Linus Torvalds
     
  • Pull workqueue changes from Tejun Heo:
    "Surprisingly, Lai and I didn't break too many things implementing
    custom pools and stuff last time around and there aren't any follow-up
    changes necessary at this point.

    The only change in this pull request is Viresh's patches to make some
    per-cpu workqueues behave as unbound workqueues depending on a boot
    param whose default can be configured via a config option. This leads
    to higher processing overhead / lower bandwidth as more work items are
    bounced across CPUs; however, it can lead to noticeable powersave in
    certain configurations - ~10% w/ idlish constant workload on a
    big.LITTLE configuration according to Viresh.

    This is because per-cpu workqueues interfere with how the scheduler
    perceives whether or not each CPU is idle by forcing pinned tasks on
    them, which makes the scheduler's power-aware scheduling decisions
    less effective.

    Its effectiveness is likely less pronounced on homogeneous
    configurations and this type of optimization can probably be made
    automatic; however, the changes are pretty minimal and the affected
    workqueues are clearly marked, so it's an easy gain for some
    configurations for the time being with pretty unintrusive changes."

    * 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    fbcon: queue work on power efficient wq
    block: queue work on power efficient wq
    PHYLIB: queue work on system_power_efficient_wq
    workqueue: Add system wide power_efficient workqueues
    workqueues: Introduce new flag WQ_POWER_EFFICIENT for power oriented workqueues

    Linus Torvalds
     
  • Pull per-cpu changes from Tejun Heo:
    "This pull request contains Kent's per-cpu reference counter. It has
    gone through several iterations since the last time and the dynamic
    allocation is gone.

    The usual usage is relatively straightforward although the async kill
    confirm interface, which is not used in most cases, is somewhat icky.
    There also are some interface concerns - e.g. I'm not sure about
    passing in the @release callback during init as that becomes funny when
    we later implement synchronous kill_and_drain - but nothing too serious
    and it's quite usable now.

    cgroup_subsys_state refcnting has already been converted and we should
    convert module refcnt (Kent?)"

    * 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu-refcount: use RCU-sched insted of normal RCU
    percpu-refcount: implement percpu_tryget() along with percpu_ref_kill_and_confirm()
    percpu-refcount: implement percpu_ref_cancel_init()
    percpu-refcount: add __must_check to percpu_ref_init() and don't use ACCESS_ONCE() in percpu_ref_kill_rcu()
    percpu-refcount: cosmetic updates
    percpu-refcount: consistently use plain (non-sched) RCU
    percpu-refcount: Don't use silly cmpxchg()
    percpu: implement generic percpu refcounting

    Linus Torvalds