Eric Lee / smarc-fsl-linux-kernel

04 Jul, 2013

33 commits

1c8fca1d9 crypto: sanitize argument for format string ... Browse Code »

The template lookup interface does not provide a way to use format
strings, so make sure that the interface cannot be abused accidentally.

Signed-off-by: Kees Cook
Cc: Herbert Xu
Cc: "David S. Miller"
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2013-07-04 07:07:25 +0800
ffc8b3086 block: do not pass disk names as format strings ... Browse Code »

Disk names may contain arbitrary strings, so they must not be
interpreted as format strings. It seems that only md allows arbitrary
strings to be used for disk names, but this could allow for a local
memory corruption from uid 0 into ring 0.

CVE-2013-2851

Signed-off-by: Kees Cook
Cc: Jens Axboe
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2013-07-04 07:07:25 +0800
542db0157 drivers/cdrom/cdrom.c: use kzalloc() for failing hardware ... Browse Code »

In drivers/cdrom/cdrom.c mmc_ioctl_cdrom_read_data() allocates a memory
area with kmalloc in line 2885.

2885 cgc->buffer = kmalloc(blocksize, GFP_KERNEL);
2886 if (cgc->buffer == NULL)
2887 return -ENOMEM;

In line 2908 we can find the copy_to_user function:

2908 if (!ret && copy_to_user(arg, cgc->buffer, blocksize))

The cgc->buffer is never cleaned and initialized before this function.
If ret = 0 with the previous basic block, it's possible to display some
memory bytes in kernel space from userspace.

When we read a block from the disk it normally fills the ->buffer but if
the drive is malfunctioning there is a chance that it would only be
partially filled. The result is an leak information to userspace.

Signed-off-by: Dan Carpenter
Cc: Jens Axboe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jonathan Salwan
2013-07-04 07:07:25 +0800
8b0d77f13 block/compat_ioctl.c: do not leak info to user-space ... Browse Code »

There is a hole in struct hd_geometry, so we have to zero the struct on
stack before copying it to user-space.

Signed-off-by: Cong Wang
Cc: Jens Axboe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Cong Wang
2013-07-04 07:07:25 +0800
31bd8fbb4 drivers/cdrom/gdrom.c: fix device number leak ... Browse Code »

Without this patch, gdrom_major will leak when gd.cd_info alloc fails.

Signed-off-by: Libo Chen
Cc: Jens Axboe
Acked-by: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Libo Chen
2013-07-04 07:07:25 +0800
4a184b4ff ocfs2: fix NULL pointer dereference when traversing o2hb_all_regions ... Browse Code »

There may exist NULL pointer dereference in config_item_name() when one
volume (say Volume A) unmounts while another (say Volume B) mounting.

Volume A Volume B

already Mounted.
Unmounting, call
o2hb_heartbeat_group_drop_item()
-> config_item_put(item)
set reg(A)->item.ci_name to NULL
in function config_item_cleanup().

begin mounting, call
o2hb_region_pin() and tranverse all
regions. When reading
reg(A)->item.ci_name, it causes
NULL pointer dereference.

call o2hb_region_release() and
del reg(A) from list.

So we should skip accessing regions that is going to release when
tranverse o2hb_all_regions.

Signed-off-by: Yiwen Jiang
Signed-off-by: joyce
Acked-by: Joel Becker
Cc: Mark Fasheh
Cc: Sunil Mushran
Cc: Jie Liu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xue jiufei
2013-07-04 07:07:25 +0800
44e89cb8e ocfs2: adjust switch_case syntax at o2net_state_change() ... Browse Code »

Adjust switch..case syntax at o2net_state_change to meet the kernel coding
standard.

s/printk/pr_info/.

[akpm@linux-foundation.org: revert pr_foo() change]
Signed-off-by: Jie Liu
Acked-by: Joel Becker
Cc: Gurudas Pai
Cc: Mark Fasheh
Cc: Noboru Iwamatsu
Cc: Srinivas Eeeda
Cc: Sunil Mushran
Cc: Tao Ma
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jie Liu
2013-07-04 07:07:25 +0800
b4d8ed4f8 ocfs2: fix a comments typo at o2quo_hb_still_up() ... Browse Code »

Fix a comment typo in o2quo_hb_still_up()

Signed-off-by: Jie Liu
Cc: Gurudas Pai
Cc: Joel Becker
Cc: Mark Fasheh
Cc: Noboru Iwamatsu
Cc: Srinivas Eeeda
Cc: Sunil Mushran
Cc: Tao Ma
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jie Liu
2013-07-04 07:07:24 +0800
70f651edb ocfs2: consolidate o2hb_global_hearbeat_mode_set() naming convention ... Browse Code »

s/o2hb_global_hearbeat_mode_set/o2hb_global_heartbeat_mode_set/ to make
the signature of those routines in a consistent manner with others for
heartbeating.

Signed-off-by: Jie Liu
Acked-by: Sunil Mushran
Cc: Gurudas Pai
Cc: Joel Becker
Cc: Mark Fasheh
Cc: Noboru Iwamatsu
Cc: Srinivas Eeeda
Cc: Tao Ma
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jie Liu
2013-07-04 07:07:24 +0800
e873fdb52 ocfs2: submit disk heartbeat bio using WRITE_SYNC ... Browse Code »

Under heavy I/O load, writing the disk heartbeat can be forced to wait for
minutes, and this causes the node to be fenced.

This patch tries to use WRITE_SYNC in submitting the heartbeat bio, so
that writing the heartbeat will have a priority over other requests.

Signed-off-by: Noboru Iwamatsu
Acked-by: Tao Ma
Acked-by: Sunil Mushran
Cc: Srinivas Eeeda
Reviewed-by: Jie Liu
Tested-by: Gurudas Pai
Cc: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Noboru Iwamatsu
2013-07-04 07:07:24 +0800
ef962df05 ocfs2: xattr: fix inlined xattr reflink ... Browse Code »

Inlined xattr shared free space of inode block with inlined data or data
extent record, so the size of the later two should be adjusted when
inlined xattr is enabled. See ocfs2_xattr_ibody_init(). But this isn't
done well when reflink. For inode with inlined data, its max inlined
data size is adjusted in ocfs2_duplicate_inline_data(), no problem. But
for inode with data extent record, its record count isn't adjusted. Fix
it, or data extent record and inlined xattr may overwrite each other,
then cause data corruption or xattr failure.

One panic caused by this bug in our test environment is the following:

kernel BUG at fs/ocfs2/xattr.c:1435!
invalid opcode: 0000 [#1] SMP
Pid: 10871, comm: multi_reflink_t Not tainted 2.6.39-300.17.1.el5uek #1
RIP: ocfs2_xa_offset_pointer+0x17/0x20 [ocfs2]
RSP: e02b:ffff88007a587948 EFLAGS: 00010283
RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00000000000051e4
RDX: ffff880057092060 RSI: 0000000000000f80 RDI: ffff88007a587a68
RBP: ffff88007a587948 R08: 00000000000062f4 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000010
R13: ffff88007a587a68 R14: 0000000000000001 R15: ffff88007a587c68
FS: 00007fccff7f06e0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000015cf000 CR3: 000000007aa76000 CR4: 0000000000000660
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process multi_reflink_t
Call Trace:
ocfs2_xa_reuse_entry+0x60/0x280 [ocfs2]
ocfs2_xa_prepare_entry+0x17e/0x2a0 [ocfs2]
ocfs2_xa_set+0xcc/0x250 [ocfs2]
ocfs2_xattr_ibody_set+0x98/0x230 [ocfs2]
__ocfs2_xattr_set_handle+0x4f/0x700 [ocfs2]
ocfs2_xattr_set+0x6c6/0x890 [ocfs2]
ocfs2_xattr_user_set+0x46/0x50 [ocfs2]
generic_setxattr+0x70/0x90
__vfs_setxattr_noperm+0x80/0x1a0
vfs_setxattr+0xa9/0xb0
setxattr+0xc3/0x120
sys_fsetxattr+0xa8/0xd0
system_call_fastpath+0x16/0x1b

Signed-off-by: Junxiao Bi
Reviewed-by: Jie Liu
Acked-by: Joel Becker
Cc: Mark Fasheh
Cc: Sunil Mushran
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2013-07-04 07:07:24 +0800
b5a8bb717 ocfs2: fix readonly issue in ocfs2_unlink() ... Browse Code »

While deleting a file with ocfs2_unlink(), there is a bug in this
function. This bug will result in filesystem read-only.

After calling ocfs2_orphan_add(), the file which will be deleted is
added into orphan dir. If ocfs2_delete_entry() fails, the file still
exists in the parent dir. And this scenario introduces a conflict of
metadata.

If a file is added into orphan dir, when we put inode of the file with
iput(), the inode i_flags is setted (~OCFS2_VALID_FL) in
ocfs2_remove_inode(), and then write back to disk.

But as previously mentioned, the file still exists in the parent dir.
On other nodes, the file can be still accessed. When first read the
file with ocfs2_read_blocks() from disk, It will check and avalidate
inode using ocfs2_validate_inode_block(). So File system will be
readonly because the inode is invalid. In other words, the inode
i_flags has been set (~OCFS2_VALID_FL).

[akpm@linux-foundation.org: cleanups]
[jeff.liu@oracle.com: s/inode_is_unlinkable/ocfs2_inode_is_unlinkable/]
Signed-off-by: Younger Liu
Signed-off-by: Jensen
Cc: Jie Liu
Cc: Joel Becker
Cc: Mark Fasheh
Cc: Sunil Mushran
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Younger Liu
2013-07-04 07:07:24 +0800
25e289210 ocfs2: remove duplicated mlog_errno() in ocfs2_relink_block_group ... Browse Code »

Cc: Jie Liu
Cc: Joel Becker
Cc: Mark Fasheh
Cc: Sunil Mushran
Cc: Younger Liu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2013-07-04 07:07:24 +0800
493098413 ocfs2: rework transaction rollback in ocfs2_relink_block_group() ... Browse Code »

In ocfs2_relink_block_group(), we roll back all those changes if notify
intent to modify buffers for metadata update failed even if the relevant
buffer has not yet been modified/got dirty at that point, that are not
quite right because of:

- None buffer has been modified/dirty if failed to call
ocfs2_journal_access_gd() against the previous block group buffer

- Only the previous block group buffer has got dirty if failed to call
ocfs2_journal_access_gd() against the block group buffer

- There is no need to roll back the change for file entry buffer at all

Those problems will not cause anything wrong but unnecessary. This
patch fix them and kill the useless bg_ptr variable as well.

Signed-off-by: Jie Liu
Cc: Younger Liu
Cc: Sunil Mushran
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jie Liu
2013-07-04 07:07:24 +0800
ea45466ae ocfs2: need rollback when journal_access failed in ocfs2_orphan_add() ... Browse Code »

While adding a file into orphan dir in ocfs2_orphan_add(), it calls
__ocfs2_add_entry() before ocfs2_journal_access_di(). If
ocfs2_journal_access_di() failed, the file is added into orphan dir, and
orphan dir dinode updated, but file dinode has not been updated.
Accordingly, the data is not consistent between file dinode and orphan
dir.

So, need to call ocfs2_journal_access_di() before __ocfs2_add_entry(),
and if ocfs2_journal_access_di() failed, orphan_fe and
orphan_dir_inode->i_nlink need rollback.

This bug was added by 3939fda4 ("Ocfs2: Journaling i_flags and
i_orphaned_slot when adding inode to orphan dir.").

Signed-off-by: Younger Liu
Acked-by: Jeff Liu
Cc: Sunil Mushran
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Younger Liu
2013-07-04 07:07:24 +0800
096b2ef83 ocfs2: dlmlock_master() should return DLM_NORMAL after adding lock to blocked list ... Browse Code »

dlmlock_master() returns DLM_RECOVERING/DLM_MIGRATING/ DLM_FORWAR after
adding lock to blocked list if lockres has the state
DLM_LOCK_RES_RECOVERING/DLM_LOCK_RES_MIGRATING/ DLM_LOCK_RES_IN_PROGRESS.
so it will retry in dlmlock(). And this may cause dlm_thread fall into an
infinite loop

Thread1 dlm_thread

calls dlm_lock->dlmlock_master,
if lockresA is in state
DLM_LOCK_RES_RECOVERING, calls
__dlm_wait_on_lockres() and waits
until others threads clear this
state;

If cannot grant this lock,
adding lock to blocked list,
and return DLM_RECOVERING;

Grant this lock and move it to
grant list;

After a while, retry and
calls list_add_tail(), adding lock
to blocked list again.

Granted and blocked list of this lockres will become the following
conditions:

lock_res->granted.next = dlm_lock->list_head;
lock_res->blocked.next = dlm_lock->list_head;
dlm_lock->list_head.next = dlm_lock_resource->blocked;

When dlm_thread traverses the granted list, it will fall into an endless
loop, checking dlm_lock.list_head, dlm_lock->list_head.next
(i.e.lock_res->blocked), lock_res->blocked.next(i.e.dlm_lock.list_head
again) .....

Signed-off-by: joyce
Reviewed-by: jensen
Cc: Jeff Liu
Acked-by: Sunil Mushran
Cc: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xue jiufei
2013-07-04 07:07:24 +0800
b30f14c49 ocfs2: xattr: remove useless free space checking ... Browse Code »

Free space checking will be done in ocfs2_xattr_ibody_init(). So remove
here.

[akpm@linux-foundation.org: remove unused local]
Signed-off-by: Junxiao Bi
Reviewed-by: Jie Liu
Acked-by: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2013-07-04 07:07:24 +0800
d3e3b41b3 fs/ocfs2/cluster/tcp.c: free sc->sc_page in sc_kref_release() ... Browse Code »

There is a memory leak in sc_kref_release(). When free struct
o2net_sock_container (sc), we should release sc->sc_page.

Signed-off-by: Younger Liu
Reviewed-by: Jie Liu
Cc: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Younger Liu
2013-07-04 07:07:23 +0800
40bd62eb7 fs/ocfs2/journal.h: add bits_wanted while calculating credits in ocfs2_calc_extend_credits ... Browse Code »

While adding extends to a file, the credits are calculated incorrectly
and if the requested clusters is more than one (or more because we used
a conservative limit) then we run out of journal credits and we hit an
assert in journalling code.

The function parameter bits_wanted variable was not used at all.

Signed-off-by: Goldwyn Rodrigues
Reviewed-by: Jie Liu
Cc: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Goldwyn Rodrigues
2013-07-04 07:07:23 +0800
33add0e3a ocfs2: fix mutex_unlock and possible memory leak in ocfs2_remove_btree_range ... Browse Code »

In ocfs2_remove_btree_range, when calling ocfs2_lock_refcount_tree and
ocfs2_prepare_refcount_change_for_del failed, it goes to out and then
tries to call mutex_unlock without mutex_lock before. And when calling
ocfs2_reserve_blocks_for_rec_trunc failed, it should free ref_tree
before return.

Signed-off-by: Joseph Qi
Reviewed-by: Jie Liu
Cc: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2013-07-04 07:07:23 +0800
8fa9d17f9 ocfs2: remove unecessary variable needs_checkpoint ... Browse Code »

Code cleanup: needs_checkpoint is assigned to but never used. Delete
the variable.

Signed-off-by: Goldwyn Rodrigues
Cc: Jeff Liu
Acked-by: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Goldwyn Rodrigues
2013-07-04 07:07:23 +0800
40c7f2eaf ocfs2: add missing dlm_put() in dlm_begin_reco_handler() ... Browse Code »

dlm_begin_reco_handler() returns without putting dlm when dlm recovery
state is DLM_RECO_STATE_FINALIZE.

Signed-off-by: joyce
Reviewed-by: Jie Liu
Acked-by: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xue jiufei
2013-07-04 07:07:23 +0800
13eb98874 ocfs2: should not use le32_add_cpu to set ocfs2_dinode i_flags ... Browse Code »

If we use le32_add_cpu to set ocfs2_dinode i_flags, it may lead to the
corresponding flag corrupted. So we should change it to bitwise and/or
operation.

Signed-off-by: Joseph Qi
Cc: Joel Becker
Cc: Mark Fasheh
Cc: shencanquan
Reviewed-by: Jie Liu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2013-07-04 07:07:23 +0800
22ab9014b fs/ocfs2/dlm/dlmrecovery.c:dlm_request_all_locks(): ret should be int instead of enum ... Browse Code »

In dlm_request_all_locks, ret is type enum. But o2net_send_message
returns a type int value. Then it will never run into the following
error branch. So we should change the ret type from enum to int.

Signed-off-by: Joseph Qi
Cc: Joel Becker
Cc: Mark Fasheh
Acked-by: Sunil Mushran
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2013-07-04 07:07:23 +0800
82d627cf1 fs/ocfs2/dlm/dlmrecovery.c: remove duplicate declarations ... Browse Code »

Below 3 functions have already been declared in dlmcommon.h, so we have
no need to declare them again in dlmrecovery.c:

dlm_complete_recovery_thread
dlm_launch_recovery_thread
dlm_kick_recovery_thread

Signed-off-by: Joseph Qi
Cc: Joel Becker
Cc: Mark Fasheh
Acked-by: Sunil Mushran
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2013-07-04 07:07:23 +0800
7121064b2 configfs: use capped length for ->store_attribute() ... Browse Code »
40

The difference between "count" and "len" is that "len" is capped at
4095. Changing it like this makes it match how sysfs_write_file() is
implemented.

This is a static analysis patch. I haven't found any store_attribute()
functions where this change makes a difference.

Signed-off-by: Dan Carpenter
Acked-by: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dan Carpenter
2013-07-04 07:07:23 +0800
3b1f9f53b sound/soc/codecs/si476x.c: don't use 0bNNN ... Browse Code »

spacr64 gcc-3.4.5 (at least) spits this back.

Cc: Andrey Smirnov
Cc: Mark Brown
Cc: Takashi Iwai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2013-07-04 07:07:23 +0800
da331ba8e drivers/dma/pl330.c: fix locking in pl330_free_chan_resources() ... Browse Code »

tasklet_kill() may sleep so call it before taking pch->lock.

Fixes following lockup:

BUG: scheduling while atomic: cat/2383/0x00000002
Modules linked in:
unwind_backtrace+0x0/0xfc
__schedule_bug+0x4c/0x58
__schedule+0x690/0x6e0
sys_sched_yield+0x70/0x78
tasklet_kill+0x34/0x8c
pl330_free_chan_resources+0x24/0x88
dma_chan_put+0x4c/0x50
[...]
BUG: spinlock lockup suspected on CPU#0, swapper/0/0
lock: 0xe52aa04c, .magic: dead4ead, .owner: cat/2383, .owner_cpu: 1
unwind_backtrace+0x0/0xfc
do_raw_spin_lock+0x194/0x204
_raw_spin_lock_irqsave+0x20/0x28
pl330_tasklet+0x2c/0x5a8
tasklet_action+0xfc/0x114
__do_softirq+0xe4/0x19c
irq_exit+0x98/0x9c
handle_IPI+0x124/0x16c
gic_handle_irq+0x64/0x68
__irq_svc+0x40/0x70
cpuidle_wrap_enter+0x4c/0xa0
cpuidle_enter_state+0x18/0x68
cpuidle_idle_call+0xac/0xe0
cpu_idle+0xac/0xf0

Signed-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Kyungmin Park
Acked-by: Jassi Brar
Cc: Vinod Koul
Cc: Tomasz Figa
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Bartlomiej Zolnierkiewicz
2013-07-04 07:07:22 +0800
fe7465016 arch: c6x: mm: include "asm/uaccess.h" to pass compiling ... Browse Code »

Need include "asm/uaccess.h" to pass compiling.

The related error (with allmodconfig):

arch/c6x/mm/init.c: In function `paging_init':
arch/c6x/mm/init.c:46:2: error: implicit declaration of function `set_fs' [-Werror=implicit-function-declaration]
arch/c6x/mm/init.c:46:9: error: `KERNEL_DS' undeclared (first use in this function)
arch/c6x/mm/init.c:46:9: note: each undeclared identifier is reported only once for each function it appears in

Signed-off-by: Chen Gang
Cc: [3.10.x]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Chen Gang
2013-07-04 07:07:22 +0800
c846ef7de include/linux/smp.h:on_each_cpu(): switch back to a macro ... Browse Code »

Commit f21afc25f9ed ("smp.h: Use local_irq_{save,restore}() in !SMP
version of on_each_cpu()") converted on_each_cpu() to a C function.

This required inclusion of irqflags.h, which broke ia64 and mn10300 (at
least) due to header ordering hell.

Switch on_each_cpu() back to a macro to fix this.

Reported-by: Geert Uytterhoeven
Acked-by: Geert Uytterhoeven
Cc: David Daney
Cc: Ralf Baechle
Cc: Stephen Rothwell
Cc: [3.10.x]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrew Morton
2013-07-04 07:07:22 +0800
1873e5002 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 ... Browse Code »

Pull ARM64 updates from Catalin Marinas:
"Main features:
- KVM and Xen ports to AArch64
- Hugetlbfs and transparent huge pages support for arm64
- Applied Micro X-Gene Kconfig entry and dts file
- Cache flushing improvements

For arm64 huge pages support, there are x86 changes moving part of
arch/x86/mm/hugetlbpage.c into mm/hugetlb.c to be re-used by arm64"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (66 commits)
arm64: Add initial DTS for APM X-Gene Storm SOC and APM Mustang board
arm64: Add defines for APM ARMv8 implementation
arm64: Enable APM X-Gene SOC family in the defconfig
arm64: Add Kconfig option for APM X-Gene SOC family
arm64/Makefile: provide vdso_install target
ARM64: mm: THP support.
ARM64: mm: Raise MAX_ORDER for 64KB pages and THP.
ARM64: mm: HugeTLB support.
ARM64: mm: Move PTE_PROT_NONE bit.
ARM64: mm: Make PAGE_NONE pages read only and no-execute.
ARM64: mm: Restore memblock limit when map_mem finished.
mm: thp: Correct the HPAGE_PMD_ORDER check.
x86: mm: Remove general hugetlb code from x86.
mm: hugetlb: Copy general hugetlb code from x86 to mm.
x86: mm: Remove x86 version of huge_pmd_share.
mm: hugetlb: Copy huge_pmd_share from x86 to mm.
arm64: KVM: document kernel object mappings in HYP
arm64: KVM: MAINTAINERS update
arm64: KVM: userspace API documentation
arm64: KVM: enable initialization of a 32bit vcpu
...

Linus Torvalds
2013-07-04 01:31:38 +0800
fb2af0020 Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm ... Browse Code »

Pull ARM updates from Russell King:
"This contains the usual updates from other people (listed below) and
the usual random muddle of miscellaneous ARM updates which cover some
low priority bug fixes and performance improvements.

I've started to put the pull request wording into the merge commits,
which are:

- NoMMU stuff:

This includes the following series sent earlier to the list:
- nommu-fixes
- R7 Support
- MPU support

I've left out the ARCH_MULTIPLATFORM/!MMU stuff that Arnd and I
were discussing today until we've reached a conclusion/that's had
some more review.

This is rebased (and re-tested) on your devel-stable branch because
otherwise there were going to be conflicts with Uwe's V7M work now
that you've merged that. I've included the fix for limiting MPU to
CPU_V7.

- Huge page support

These changes bring both HugeTLB support and Transparent HugePage
(THP) support to ARM. Only long descriptors (LPAE) are supported
in this series.

The code has been tested on an Arndale board (Exynos 5250).

- LPAE updates

Please pull these miscellaneous LPAE fixes I've been collecting for
a while now for 3.11. They've been tested and reviewed by quite a
few people, and most of the patches are pretty trivial. -- Will Deacon.

- arch_timer cleanups

Please pull these arch_timer cleanups I've been holding onto for a
while. They're the same as my last posting, but have been rebased
to v3.10-rc3.

- mpidr linearisation (multiprocessor id register - identifies which
CPU number we are in the system)

This patch series that implements MPIDR linearization through a
simple hashing algorithm and updates current cpu_{suspend}/{resume}
code to use the newly created hash structures to retrieve context
pointers. It represents a stepping stone for the implementation of
power management code on forthcoming multi-cluster ARM systems.

It has been tested on TC2 (dual cluster A15xA7 system), iMX6q,
OMAP4 and Tegra, with processors hitting low-power states requiring
warm-boot resume through the cpu_resume code path"

* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (77 commits)
ARM: 7775/1: mm: Remove do_sect_fault from LPAE code
ARM: 7777/1: Avoid extra calls to the C compiler
ARM: 7774/1: Fix dtb dependency to use order-only prerequisites
ARM: 7770/1: remove residual ARMv2 support from decompressor
ARM: 7769/1: Cortex-A15: fix erratum 798181 implementation
ARM: 7768/1: prevent risks of out-of-bound access in ASID allocator
ARM: 7767/1: let the ASID allocator handle suspended animation
ARM: 7766/1: versatile: don't mark pen as __INIT
ARM: 7765/1: perf: Record the user-mode PC in the call chain.
ARM: 7735/2: Preserve the user r/w register TPIDRURW on context switch and fork
ARM: kernel: implement stack pointer save array through MPIDR hashing
ARM: kernel: build MPIDR hash function data structure
ARM: mpu: Ensure that MPU depends on CPU_V7
ARM: mpu: protect the vectors page with an MPU region
ARM: mpu: Allow enabling of the MPU via kconfig
ARM: 7758/1: introduce config HAS_BANDGAP
ARM: 7757/1: mm: don't flush icache in switch_mm with hardware broadcasting
ARM: 7751/1: zImage: don't overwrite ourself with a page table
ARM: 7749/1: spinlock: retry trylock operation if strex fails on free lock
ARM: 7748/1: oabi: handle faults when loading swi instruction from userspace
...

Linus Torvalds
2013-07-04 00:46:29 +0800
790eac564 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull second set of VFS changes from Al Viro:
"Assorted f_pos race fixes, making do_splice_direct() safe to call with
i_mutex on parent, O_TMPFILE support, Jeff's locks.c series,
->d_hash/->d_compare calling conventions changes from Linus, misc
stuff all over the place."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
Document ->tmpfile()
ext4: ->tmpfile() support
vfs: export lseek_execute() to modules
lseek_execute() doesn't need an inode passed to it
block_dev: switch to fixed_size_llseek()
cpqphp_sysfs: switch to fixed_size_llseek()
tile-srom: switch to fixed_size_llseek()
proc_powerpc: switch to fixed_size_llseek()
ubi/cdev: switch to fixed_size_llseek()
pci/proc: switch to fixed_size_llseek()
isapnp: switch to fixed_size_llseek()
lpfc: switch to fixed_size_llseek()
locks: give the blocked_hash its own spinlock
locks: add a new "lm_owner_key" lock operation
locks: turn the blocked_list into a hashtable
locks: convert fl_link to a hlist_node
locks: avoid taking global lock if possible when waking up blocked waiters
locks: protect most of the file_lock handling with i_lock
locks: encapsulate the fl_link list handling
locks: make "added" in __posix_lock_file a bool
...

Linus Torvalds
2013-07-04 00:10:19 +0800

03 Jul, 2013

7 commits

48bde8d36 Document ->tmpfile() ... Browse Code »

Signed-off-by: Al Viro

Al Viro
2013-07-03 20:23:29 +0800
af51a2ac3 ext4: ->tmpfile() support ... Browse Code »

very similar to ext3 counterpart...

Signed-off-by: Al Viro

Al Viro
2013-07-03 20:23:28 +0800
46a1c2c7a vfs: export lseek_execute() to modules ... Browse Code »

For those file systems(btrfs/ext4/ocfs2/tmpfs) that support
SEEK_DATA/SEEK_HOLE functions, we end up handling the similar
matter in lseek_execute() to update the current file offset
to the desired offset if it is valid, ceph also does the
simliar things at ceph_llseek().

To reduce the duplications, this patch make lseek_execute()
public accessible so that we can call it directly from the
underlying file systems.

Thanks Dave Chinner for this suggestion.

[AV: call it vfs_setpos(), don't bring the removed 'inode' argument back]

v2->v1:
- Add kernel-doc comments for lseek_execute()
- Call lseek_execute() in ceph->llseek()

Signed-off-by: Jie Liu
Cc: Dave Chinner
Cc: Al Viro
Cc: Andi Kleen
Cc: Andrew Morton
Cc: Christoph Hellwig
Cc: Chris Mason
Cc: Josef Bacik
Cc: Ben Myers
Cc: Ted Tso
Cc: Hugh Dickins
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Sage Weil
Signed-off-by: Al Viro

Jie Liu
2013-07-03 20:23:27 +0800
0b0585c3e Merge branch 'for-3.11-cpuset' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup ... Browse Code »

Pull cpuset changes from Tejun Heo:
"cpuset has always been rather odd about its configurations - a cgroup
right after creation didn't allow any task executions before
configuration, changing configuration in the parent modifies the
descendants irreversibly and so on. These behaviors are inherently
nasty and almost hostile against sharing the hierarchy with other
controllers making it very difficult to use in unified hierarchy.

Li is currently in the process of updating the behaviors for
__DEVEL__sane_behavior which is the bulk of changes in this pull
request. It isn't complete yet and the behaviors will change further
but all changes are gated behind sane_behavior. In the process, the
rather hairy work-item punting which was used to work around the
limitations of cgroup descendant iterator was simplified."

* 'for-3.11-cpuset' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: rename @cont to @cgrp
cpuset: fix to migrate mm correctly in a corner case
cpuset: allow to move tasks to empty cpusets
cpuset: allow to keep tasks in empty cpusets
cpuset: introduce effective_{cpumask|nodemask}_cpuset()
cpuset: record old_mems_allowed in struct cpuset
cpuset: remove async hotplug propagation work
cpuset: let hotplug propagation work wait for task attaching
cpuset: re-structure update_cpumask() a bit
cpuset: remove cpuset_test_cpumask()
cpuset: remove unnecessary variable in cpuset_attach()
cpuset: cleanup guarantee_online_{cpus|mems}()
cpuset: remove redundant check in cpuset_cpus_allowed_fallback()

Linus Torvalds
2013-07-03 11:04:25 +0800
b028161fb Merge branch 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup ... Browse Code »

Pull cgroup changes from Tejun Heo:
"This pull request contains the following changes.

- cgroup_subsys_state (css) reference counting has been converted to
percpu-ref. css is what each resource controller embeds into its
own control structure and perform reference count against. It may
be used in hot paths of various subsystems and is similar to module
refcnt in that aspect. For example, block-cgroup's css refcnting
was showing up a lot in Mikulaus's device-mapper scalability work
and this should alleviate it.

- cgroup subtree iterator has been updated so that RCU read lock can
be released after grabbing reference. This allows simplifying its
users which requires blocking which used to build iteration list
under RCU read lock and then traverse it outside. This pull
request contains simplification of cgroup core and device-cgroup.
A separate pull request will update cpuset.

- Fixes for various bugs including corner race conditions and RCU
usage bugs.

- A lot of cleanups and some prepartory work for the planned unified
hierarchy support."

* 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (48 commits)
cgroup: CGRP_ROOT_SUBSYS_BOUND should also be ignored when mounting an existing hierarchy
cgroup: CGRP_ROOT_SUBSYS_BOUND should be ignored when comparing mount options
cgroup: fix deadlock on cgroup_mutex via drop_parsed_module_refcounts()
cgroup: always use RCU accessors for protected accesses
cgroup: fix RCU accesses around task->cgroups
cgroup: fix RCU accesses to task->cgroups
cgroup: grab cgroup_mutex in drop_parsed_module_refcounts()
cgroup: fix cgroupfs_root early destruction path
cgroup: reserve ID 0 for dummy_root and 1 for unified hierarchy
cgroup: implement for_each_[builtin_]subsys()
cgroup: move init_css_set initialization inside cgroup_mutex
cgroup: s/for_each_subsys()/for_each_root_subsys()/
cgroup: clean up find_css_set() and friends
cgroup: remove cgroup->actual_subsys_mask
cgroup: prefix global variables with "cgroup_"
cgroup: convert CFTYPE_* flags to enums
cgroup: rename cont to cgrp
cgroup: clean up cgroup_serial_nr_cursor
cgroup: convert cgroup_cft_commit() to use cgroup_for_each_descendant_pre()
cgroup: make serial_nr_cursor available throughout cgroup.c
...

Linus Torvalds
2013-07-03 10:54:47 +0800
f317ff9ee Merge branch 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq ... Browse Code »

Pull workqueue changes from Tejun Heo:
"Surprisingly, Lai and I didn't break too many things implementing
custom pools and stuff last time around and there aren't any follow-up
changes necessary at this point.

The only change in this pull request is Viresh's patches to make some
per-cpu workqueues to behave as unbound workqueues dependent on a boot
param whose default can be configured via a config option. This leads
to higher processing overhead / lower bandwidth as more work items are
bounced across CPUs; however, it can lead to noticeable powersave in
certain configurations - ~10% w/ idlish constant workload on a
big.LITTLE configuration according to Viresh.

This is because per-cpu workqueues interfere with how the scheduler
perceives whether or not each CPU is idle by forcing pinned tasks on
them, which makes the scheduler's power-aware scheduling decisions
less effective.

Its effectiveness is likely less pronounced on homogenous
configurations and this type of optimization can probably be made
automatic; however, the changes are pretty minimal and the affected
workqueues are clearly marked, so it's an easy gain for some
configurations for the time being with pretty unintrusive changes."

* 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
fbcon: queue work on power efficient wq
block: queue work on power efficient wq
PHYLIB: queue work on system_power_efficient_wq
workqueue: Add system wide power_efficient workqueues
workqueues: Introduce new flag WQ_POWER_EFFICIENT for power oriented workqueues

Linus Torvalds
2013-07-03 10:53:30 +0800
13cc56013 Merge branch 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu ... Browse Code »

Pull per-cpu changes from Tejun Heo:
"This pull request contains Kent's per-cpu reference counter. It has
gone through several iterations since the last time and the dynamic
allocation is gone.

The usual usage is relatively straight-forward although async kill
confirm interface, which is not used int most cases, is somewhat icky.
There also are some interface concerns - e.g. I'm not sure about
passing in @relesae callback during init as that becomes funny when we
later implement synchronous kill_and_drain - but nothing too serious
and it's quite useable now.

cgroup_subsys_state refcnting has already been converted and we should
convert module refcnt (Kent?)"

* 'for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu-refcount: use RCU-sched insted of normal RCU
percpu-refcount: implement percpu_tryget() along with percpu_ref_kill_and_confirm()
percpu-refcount: implement percpu_ref_cancel_init()
percpu-refcount: add __must_check to percpu_ref_init() and don't use ACCESS_ONCE() in percpu_ref_kill_rcu()
percpu-refcount: cosmetic updates
percpu-refcount: consistently use plain (non-sched) RCU
percpu-refcount: Don't use silly cmpxchg()
percpu: implement generic percpu refcounting

Linus Torvalds
2013-07-03 10:52:14 +0800