08 Apr, 2015

1 commit

  • New code will require TRACE_SYSTEM to be a valid C variable name,
    but some tracepoints have TRACE_SYSTEM with '-' and not '_', so
    it cannot be used directly. Instead, add a TRACE_SYSTEM_VAR that can
    give the tracing infrastructure a unique name for the trace system.
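
    A sketch of the intended usage, for a trace system whose name contains a
    '-' (the kvm-s390 tracepoints are the case at hand; exact file not shown
    here):

    #undef TRACE_SYSTEM
    #define TRACE_SYSTEM kvm-s390

    /* TRACE_SYSTEM contains '-', so give the infrastructure a valid C name */
    #undef TRACE_SYSTEM_VAR
    #define TRACE_SYSTEM_VAR kvm_s390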

    Link: http://lkml.kernel.org/r/20150402111500.5e52c1ed.cornelia.huck@de.ibm.com

    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: David Hildenbrand
    Cc: Christian Borntraeger
    Acked-by: Cornelia Huck
    Reviewed-by: Masami Hiramatsu
    Tested-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

10 Mar, 2015

2 commits

  • Pull kvm/s390 bugfixes from Marcelo Tosatti.

    * git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: s390: non-LPAR case obsolete during facilities mask init
    KVM: s390: include guest facilities in kvm facility test
    KVM: s390: fix in memory copy of facility lists
    KVM: s390/cpacf: Fix kernel bug under z/VM
    KVM: s390/cpacf: Enable key wrapping by default

    Linus Torvalds
     
  • Pull s390 fixes from Martin Schwidefsky:
    "One performance optimization for page_clear and a couple of bug fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390/mm: fix incorrect ASCE after crst_table_downgrade
    s390/ftrace: fix crashes when switching tracers / add notrace to cpu_relax()
    s390/pci: unify pci_iomap symbol exports
    s390/pci: fix [un]map_resources sequence
    s390: let the compiler do page clearing
    s390/pci: fix possible information leak in mmio syscall
    s390/dcss: array index 'i' is used before limits check.
    s390/scm_block: fix off by one during cluster reservation
    s390/jump label: improve and fix sanity check
    s390/jump label: add missing jump_label_apply_nops() call

    Linus Torvalds
     

04 Mar, 2015

4 commits

  • With patch "include guest facilities in kvm facility test" it is no
    longer necessary to have special handling for the non-LPAR case.

    Signed-off-by: Michael Mueller
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     
  • Most facility related decisions in KVM have to take into account:

    - the facilities offered by the underlying run container (LPAR/VM)
    - the facilities supported by the KVM code itself
    - the facilities requested by a guest VM

    This patch adds the KVM driver requested facilities to the test routine.

    It additionally renames struct s390_model_fac to kvm_s390_fac and its field
    names to be more meaningful.

    The semantics of the facilities stored in the KVM architecture structure
    are changed: arch.model.fac->list now points to the guest
    facility list and arch.model.fac->mask points to the KVM facility mask.

    This patch fixes the behaviour of KVM for some facilities for guests
    that ignore the guest visible facility bits, e.g. guests could use
    transactional memory instructions on hosts supporting them even if the
    chosen cpu model would not offer them.

    The userspace interface is not affected by this change.
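
    The resulting test is roughly of this shape (a sketch following the
    mask/list description above; exact helper names are not quoted from the
    patch):

    /* a facility is usable by the guest only if KVM supports it (mask)
     * and the chosen cpu model exposes it to the guest (list) */
    static inline int test_kvm_facility(struct kvm *kvm, unsigned long nr)
    {
            return __test_facility(nr, kvm->arch.model.fac->mask) &&
                   __test_facility(nr, kvm->arch.model.fac->list);
    }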

    Signed-off-by: Michael Mueller
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     
  • The facility lists were not fully copied.

    Signed-off-by: Michael Mueller
    Signed-off-by: Christian Borntraeger

    Michael Mueller
     
  • Under z/VM PQAP might trigger an operation exception if no crypto cards
    are defined via APVIRTUAL or APDEDICATED.

    [ 386.098666] Kernel BUG at 0000000000135c56 [verbose debug info unavailable]
    [ 386.098693] illegal operation: 0001 ilc:2 [#1] SMP
    [...]
    [ 386.098751] Krnl PSW : 0704c00180000000 0000000000135c56 (kvm_s390_apxa_installed+0x46/0x98)
    [...]
    [ 386.098804] [] kvm_arch_init_vm+0x29c/0x358
    [ 386.098806] [] kvm_dev_ioctl+0xc0/0x460
    [ 386.098809] [] do_vfs_ioctl+0x332/0x508
    [ 386.098811] [] SyS_ioctl+0x9e/0xb0
    [ 386.098814] [] system_call+0xd6/0x258
    [ 386.098815] [] 0x3fffc7400a2

    Let's add an extable entry and provide a zeroed config in that case.

    Reported-by: Stefan Zimmermann
    Signed-off-by: Christian Borntraeger
    Reviewed-by: Thomas Huth
    Tested-by: Stefan Zimmermann

    Christian Borntraeger
     

03 Mar, 2015

3 commits

    z/VM and LPAR enable key wrapping by default, let's do the same on KVM.

    Signed-off-by: Tony Krowiak
    Signed-off-by: Christian Borntraeger

    Tony Krowiak
     
  • The switch_mm function does nothing in case the prev and next mm
    are the same. It can happen that a crst_table_downgrade has changed
    the top-level pgd in the meantime on a different CPU. Always store
    the new ASCE to be picked up in entry.S.

    [heiko.carstens@de.ibm.com]: Bug was introduced with git commit
    53e857f30867 ("s390/mm,tlb: race of lazy TLB flush vs. recreation
    of TLB entries") and causes random crashes due to broken page tables
    being used.
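
    A sketch of the shape of the fix (simplified, not the exact s390 code):

    static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
                                 struct task_struct *tsk)
    {
            /* publish the new ASCE even if prev == next, so a concurrent
             * crst_table_downgrade() cannot leave a stale top-level pgd
             * behind; entry.S picks this value up on return to user space */
            S390_lowcore.user_asce = next->context.asce_bits | __pa(next->pgd);
            if (prev == next)
                    return;
            /* ... rest of the mm switch ... */
    }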

    Reported-by: Dominik Vogt
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Heiko Carstens

    Martin Schwidefsky
     
  • With git commit 4d92f50249eb ("s390: reintroduce diag 44 calls for
    cpu_relax()") I reintroduced a non-trivial cpu_relax() variant on s390.

    The difference to the previous variant however is that the new version is
    an out-of-line function, which will be traced if function tracing is enabled.

    Switching to a different tracer includes instruction patching. Therefore this
    is done within stop_machine() "context" to prevent any function tracing
    from going on while instructions are being patched.
    With the new out-of-line variant of cpu_relax() this is not true anymore,
    since cpu_relax() gets called in a busy loop by all waiting cpus within
    stop_machine() until function patching is finished.
    Therefore cpu_relax() must be marked notrace.

    This fixes kernel crashes when frequently switching between "function" and
    "function_graph" tracers.

    Moving cpu_relax() back to a header file doesn't work because of header
    include order dependencies.
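
    The shape of the fix, as a sketch (the real s390 cpu_relax() body is
    simplified here):

    void notrace cpu_relax(void)    /* notrace: never patched by ftrace */
    {
            if (MACHINE_HAS_DIAG44)
                    asm volatile("diag 0,0,0x44");
            barrier();
    }
    EXPORT_SYMBOL(cpu_relax);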

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

01 Mar, 2015

1 commit

    Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these page
    table levels are folded. Usually, these defines are provided by
    <asm-generic/pgtable-nopmd.h> and <asm-generic/pgtable-nopud.h>.

    But some architectures fold page table levels in a custom way. They
    need to define these macros themselves. This patch adds the missing defines.

    The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
    and __pud_alloc() on architectures without these page table levels.
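
    For an architecture that folds these levels by hand, the fix boils down
    to adding the announcements to its pgtable header, e.g.:

    #define __PAGETABLE_PUD_FOLDED
    #define __PAGETABLE_PMD_FOLDED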

    Signed-off-by: Kirill A. Shutemov
    Cc: Aaro Koskinen
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: "James E.J. Bottomley"
    Cc: Koichi Yasutake
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

28 Feb, 2015

2 commits

  • Since commit 8cfc99b58366 ("s390: add pci_iomap_range") we use
    EXPORT_SYMBOL for pci_iomap but EXPORT_SYMBOL_GPL for pci_iounmap.
    Change the related functions to use EXPORT_SYMBOL like the asm-generic
    variants do.
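
    A sketch of the resulting export pairing (function bodies simplified):

    void __iomem *pci_iomap(struct pci_dev *dev, int bar, unsigned long maxlen)
    {
            return pci_iomap_range(dev, bar, 0, maxlen);
    }
    EXPORT_SYMBOL(pci_iomap);               /* matches the asm-generic variant */

    void pci_iounmap(struct pci_dev *dev, void __iomem *addr)
    {
            /* ... drop the mapping ... */
    }
    EXPORT_SYMBOL(pci_iounmap);             /* was EXPORT_SYMBOL_GPL */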

    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Commit 8cfc99b58366 ("s390: add pci_iomap_range") introduced counters
    to keep track of the number of mappings created. This revealed that
    we don't have our internal mappings in order when using hotunplug or
    resume from hibernate. This patch addresses both issues.

    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     

26 Feb, 2015

4 commits

    The hardware folks told me that for page clearing "when you exactly
    know what to do, hand written xc+pfd is usually faster than mvcl for
    page clearing, as it saves millicode overhead and parameter parsing
    and checking" as long as you don't need the cache bypassing.
    Turns out that gcc already does a proper xc,pfd loop.

    A small test on z196 that does

    buff = mmap(NULL, bufsize, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
    for (i = 0; i < bufsize; i += 256)
            buff[i] = 0x5;

    gets 20% faster (touches every cache line of a page)

    and

    buff = mmap(NULL, bufsize, PROT_EXEC|PROT_WRITE|PROT_READ, MAP_PRIVATE|MAP_ANONYMOUS, 0, 0);
    for (i = 0; i < bufsize; i += 4096)
            buff[i] = 0x5;

    is within noise ratio (touches one cache line of a page).

    As the clear_page is usually called for first memory accesses
    we can assume that at least one cache line is used afterwards,
    so this change should be always better.
    Another benchmark, a make -j 40 of my testsuite in tmpfs with
    hot caches on a 32cpu system:

            -- unpatched --    -- patched --
    real    0m1.017s           0m0.994s     (~2% faster, but in noise)
    user    0m5.339s           0m5.016s     (~6% faster)
    sys     0m0.691s           0m0.632s     (~8% faster)

    Let's use the same memset-based define as the asm-generic variant does.
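
    That is, something along these lines (a sketch of the memset-based
    define; the asm-generic header uses the same pattern):

    #define clear_page(page)        memset((page), 0, PAGE_SIZE)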

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Martin Schwidefsky

    Christian Borntraeger
     
  • Make sure that even in error situations we do not use copy_to_user
    on uninitialized kernel memory.

    Cc: stable@vger.kernel.org # 3.19+
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • Fix the output of the jump label sanity check and also print the
    code pattern that is supposed to be written to the jump label.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • When modules are loaded we want to transform the compile time generated
    nops into runtime generated nops. Otherwise the jump label sanity check
    will detect invalid code when trying to patch code.

    Fixes this crash:

    Jump label code mismatch at __rds_conn_create+0x3c/0x720
    Found: c0 04 00 00 00 01
    Expected: c0 04 00 00 00 00
    Kernel panic - not syncing: Corrupted kernel text
    CPU: 0 PID: 10 Comm: migration/0 Not tainted 3.19.0-01935-g006610f #14
    Call Trace:
    show_trace+0xf8/0x158)
    show_stack+0x6a/0xe8
    dump_stack+0x7c/0xd8
    panic+0xe4/0x288
    jump_label_bug.isra.2+0xbe/0xc001
    __jump_label_transform+0x94/0xc8
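
    A sketch of where the missing call goes (simplified; the real s390
    module_finalize() does more than this):

    int module_finalize(const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs,
                        struct module *me)
    {
            /* turn compile-time generated nops into runtime generated nops */
            jump_label_apply_nops(me);
            return 0;
    }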

    Reported-by: Sebastian Ott
    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     

23 Feb, 2015

2 commits

  • Pull more vfs updates from Al Viro:
    "Assorted stuff from this cycle. The big ones here are multilayer
    overlayfs from Miklos and beginning of sorting ->d_inode accesses out
    from David"

    * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (51 commits)
    autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation
    procfs: fix race between symlink removals and traversals
    debugfs: leave freeing a symlink body until inode eviction
    Documentation/filesystems/Locking: ->get_sb() is long gone
    trylock_super(): replacement for grab_super_passive()
    fanotify: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
    Cachefiles: Fix up scripted S_ISDIR/S_ISREG/S_ISLNK conversions
    VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry)
    SELinux: Use d_is_positive() rather than testing dentry->d_inode
    Smack: Use d_is_positive() rather than testing dentry->d_inode
    TOMOYO: Use d_is_dir() rather than d_inode and S_ISDIR()
    Apparmor: Use d_is_positive/negative() rather than testing dentry->d_inode
    Apparmor: mediated_filesystem() should use dentry->d_sb not inode->i_sb
    VFS: Split DCACHE_FILE_TYPE into regular and special types
    VFS: Add a fallthrough flag for marking virtual dentries
    VFS: Add a whiteout dentry type
    VFS: Introduce inode-getting helpers for layered/unioned fs environments
    Infiniband: Fix potential NULL d_inode dereference
    posix_acl: fix reference leaks in posix_acl_create
    autofs4: Wrong format for printing dentry
    ...

    Linus Torvalds
     
  • Convert the following where appropriate:

    (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

    (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

    (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
    complicated than it appears as some calls should be converted to
    d_can_lookup() instead. The difference is whether the directory in
    question is a real dir with a ->lookup op or whether it's a fake dir with
    a ->d_automount op.

    In some circumstances, we can subsume checks for dentry->d_inode not being
    NULL into this, provided the code isn't in a filesystem that expects
    d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
    use d_inode() rather than d_backing_inode() to get the inode pointer).

    Note that the dentry type field may be set to something other than
    DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
    manages the fall-through from a negative dentry to a lower layer. In such a
    case, the dentry type of the negative union dentry is set to the same as the
    type of the lower dentry.

    However, if you know d_inode is not NULL at the call site, then you can use
    the d_is_xxx() functions even in a filesystem.

    There is one further complication: a 0,0 chardev dentry may be labelled
    DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
    intended for special directory entry types that don't have attached inodes.

    The following perl+coccinelle script was used:

    use strict;

    my $fd;
    my @callers;
    open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
    die "Can't grep for S_ISDIR and co. callers";
    @callers = <$fd>;
    close($fd);
    unless (@callers) {
    print "No matches\n";
    exit(0);
    }

    my @cocci = (
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISLNK(E->d_inode->i_mode)',
    '+ d_is_symlink(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISDIR(E->d_inode->i_mode)',
    '+ d_is_dir(E)',
    '',
    '@@',
    'expression E;',
    '@@',
    '',
    '- S_ISREG(E->d_inode->i_mode)',
    '+ d_is_reg(E)' );

    my $coccifile = "tmp.sp.cocci";
    open($fd, ">$coccifile") || die $coccifile;
    print($fd "$_\n") || die $coccifile foreach (@cocci);
    close($fd);

    foreach my $file (@callers) {
    chomp $file;
    print "Processing ", $file, "\n";
    system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
    die "spatch failed";
    }

    [AV: overlayfs parts skipped]

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

22 Feb, 2015

1 commit

  • Pull s390 fixes from Martin Schwidefsky:
    "Two patches to save some memory if CONFIG_NR_CPUS is large, a changed
    default for the use of compare-and-delay, and a couple of bug fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390/spinlock: disabled compare-and-delay by default
    s390/mm: align 64-bit PIE binaries to 4GB
    s390/cacheinfo: coding style changes
    s390/cacheinfo: fix shared cpu masks
    s390/smp: reduce size of struct pcpu
    s390/topology: convert cpu_topology array to per cpu variable
    s390/topology: delay initialization of topology cpu masks
    s390/vdso: fix clock_gettime for CLOCK_THREAD_CPUTIME_ID, -2 and -3

    Linus Torvalds
     

19 Feb, 2015

2 commits

  • The base address (STACK_TOP / 3 * 2) for a 64-bit program is two thirds
    into the 4GB segment at 0x2aa00000000. The randomization added on z13
    can eat another 1GB of the remaining 1.33GB to the next 4GB boundary.
    In the worst case 300MB are left for the executable + bss which may
    cross into the next 4GB segment. This is bad for branch prediction,
    therefore align the base address to 4GB to give the program more room
    before it crosses the 4GB boundary.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Pull virtio updates from Rusty Russell:
    "OK, this has the big virtio 1.0 implementation, as specified by OASIS.

    On top of that is the major rework of lguest, to use PCI and virtio
    1.0, to double-check the implementation.

    Then comes the inevitable fixes and cleanups from that work"

    * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (80 commits)
    virtio: don't set VIRTIO_CONFIG_S_DRIVER_OK twice.
    virtio_net: unconditionally define struct virtio_net_hdr_v1.
    tools/lguest: don't use legacy definitions for net device in example launcher.
    virtio: Don't expose legacy net features when VIRTIO_NET_NO_LEGACY defined.
    tools/lguest: use common error macros in the example launcher.
    tools/lguest: give virtqueues names for better error messages
    tools/lguest: more documentation and checking of virtio 1.0 compliance.
    lguest: don't look in console features to find emerg_wr.
    tools/lguest: don't start devices until DRIVER_OK status set.
    tools/lguest: handle indirect partway through chain.
    tools/lguest: insert driver references from the 1.0 spec (4.1 Virtio Over PCI)
    tools/lguest: insert device references from the 1.0 spec (4.1 Virtio Over PCI)
    tools/lguest: rename virtio_pci_cfg_cap field to match spec.
    tools/lguest: fix features_accepted logic in example launcher.
    tools/lguest: handle device reset correctly in example launcher.
    virtual: Documentation: simplify and generalize paravirt_ops.txt
    lguest: remove NOTIFY call and eventfd facility.
    lguest: remove NOTIFY facility from demonstration launcher.
    lguest: use the PCI console device's emerg_wr for early boot messages.
    lguest: always put console in PCI slot #1.
    ...

    Linus Torvalds
     

14 Feb, 2015

2 commits

    For instrumenting global variables KASan needs shadow memory backing the
    memory for modules. So on module load we will need to allocate memory for
    the shadow and map it at the address in shadow that corresponds to the
    address allocated by module_alloc().

    __vmalloc_node_range() could be used for this purpose, except it puts a
    guard hole after the allocated area. A guard hole in shadow memory would be a
    problem because at some future point we might need to have shadow memory
    at the address occupied by the guard hole, so we could fail to allocate
    shadow for module_alloc().

    Now we have the VM_NO_GUARD flag disabling the guard page, so we need to
    pass it into __vmalloc_node_range(). Add a new parameter 'vm_flags' to the
    __vmalloc_node_range() function.
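
    The extended prototype then looks roughly like this (sketch; a caller
    such as module_alloc() can pass VM_NO_GUARD through the new argument):

    void *__vmalloc_node_range(unsigned long size, unsigned long align,
                               unsigned long start, unsigned long end,
                               gfp_t gfp_mask, pgprot_t prot,
                               unsigned long vm_flags, int node,
                               const void *caller);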

    Signed-off-by: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrey Konovalov
    Cc: Yuri Gribov
    Cc: Konstantin Khlebnikov
    Cc: Sasha Levin
    Cc: Christoph Lameter
    Cc: Joonsoo Kim
    Cc: Dave Hansen
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     
  • Pull KVM update from Paolo Bonzini:
    "Fairly small update, but there are some interesting new features.

    Common:
    Optional support for adding a small amount of polling on each HLT
    instruction executed in the guest (or equivalent for other
    architectures). This can improve latency up to 50% on some
    scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This
    also has to be enabled manually for now, but the plan is to
    auto-tune this in the future.

    ARM/ARM64:
    The highlights are support for GICv3 emulation and dirty page
    tracking

    s390:
    Several optimizations and bugfixes. Also a first: a feature
    exposed by KVM (UUID and long guest name in /proc/sysinfo) before
    it is available in IBM's hypervisor! :)

    MIPS:
    Bugfixes.

    x86:
    Support for PML (page modification logging, a new feature in
    Broadwell Xeons that speeds up dirty page tracking), nested
    virtualization improvements (nested APICv---a nice optimization),
    usual round of emulation fixes.

    There is also a new option to reduce latency of the TSC deadline
    timer in the guest; this needs to be tuned manually.

    Some commits are common between this pull and Catalin's; I see you
    have already included his tree.

    Powerpc:
    Nothing yet.

    The KVM/PPC changes will come in through the PPC maintainers,
    because I haven't received them yet and I might end up being
    offline for some part of next week"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
    KVM: ia64: drop kvm.h from installed user headers
    KVM: x86: fix build with !CONFIG_SMP
    KVM: x86: emulate: correct page fault error code for NoWrite instructions
    KVM: Disable compat ioctl for s390
    KVM: s390: add cpu model support
    KVM: s390: use facilities and cpu_id per KVM
    KVM: s390/CPACF: Choose crypto control block format
    s390/kernel: Update /proc/sysinfo file with Extended Name and UUID
    KVM: s390: reenable LPP facility
    KVM: s390: floating irqs: fix user triggerable endless loop
    kvm: add halt_poll_ns module parameter
    kvm: remove KVM_MMIO_SIZE
    KVM: MIPS: Don't leak FPU/DSP to guest
    KVM: MIPS: Disable HTW while in guest
    KVM: nVMX: Enable nested posted interrupt processing
    KVM: nVMX: Enable nested virtual interrupt delivery
    KVM: nVMX: Enable nested apic register virtualization
    KVM: nVMX: Make nested control MSRs per-cpu
    KVM: nVMX: Enable nested virtualize x2apic mode
    KVM: nVMX: Prepare for using hardware MSR bitmap
    ...

    Linus Torvalds
     

13 Feb, 2015

2 commits

  • Now that all in-tree users of strnicmp have been converted to
    strncasecmp, the wrapper can be removed.

    Signed-off-by: Rasmus Villemoes
    Cc: David Howells
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     
  • If an attacker can cause a controlled kernel stack overflow, overwriting
    the restart block is a very juicy exploit target. This is because the
    restart_block is held in the same memory allocation as the kernel stack.

    Moving the restart block to struct task_struct prevents this exploit by
    making the restart_block harder to locate.

    Note that there are other fields in thread_info that are also easy
    targets, at least on some architectures.

    It's also a decent simplification, since the restart code is more or less
    identical on all architectures.
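
    Schematically (a sketch of the move, not the full patch):

    struct task_struct {
            /* ... */
            struct restart_block restart_block;     /* moved out of thread_info */
            /* ... */
    };

    /* callers switch from current_thread_info()->restart_block
     * to current->restart_block */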

    [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
    Signed-off-by: Andy Lutomirski
    Cc: Thomas Gleixner
    Cc: Al Viro
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: David Miller
    Acked-by: Richard Weinberger
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Vineet Gupta
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Steven Miao
    Cc: Mark Salter
    Cc: Aurelien Jacquiot
    Cc: Mikael Starvik
    Cc: Jesper Nilsson
    Cc: David Howells
    Cc: Richard Kuo
    Cc: "Luck, Tony"
    Cc: Geert Uytterhoeven
    Cc: Michal Simek
    Cc: Ralf Baechle
    Cc: Jonas Bonn
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Acked-by: Michael Ellerman (powerpc)
    Tested-by: Michael Ellerman (powerpc)
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Chen Liqin
    Cc: Lennox Wu
    Cc: Chris Metcalf
    Cc: Guan Xuetao
    Cc: Chris Zankel
    Cc: Max Filippov
    Cc: Oleg Nesterov
    Cc: Guenter Roeck
    Signed-off-by: James Hogan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
     

12 Feb, 2015

11 commits

  • Just some minor coding style changes, while I had to look at the code.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • When testing Sudeep Holla's cache info rework I didn't realize that the
    shared cpu masks are broken (all have the same cpu set).
    Let's fix this.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Reduce the size of struct pcpu, since the pcpu_devices array consists
    of NR_CPUS elements of type struct pcpu. For most machines this is just
    a waste of memory.
    So let's try to make it a bit smaller.
    This saves 16k with performance_defconfig.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Convert the per cpu topology cpu masks to a per cpu variable.
    At least for machines which have fewer possible cpus than NR_CPUS this can
    save a bit of memory (z/VM: max 64 vs 512 for performance_defconfig).

    This reduces the kernel image size by 100k.
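
    Schematically, the conversion looks like this (a sketch; the struct name
    is assumed from the s390 topology code):

    /* before: NR_CPUS entries, regardless of how many cpus exist */
    struct cpu_topology_s390 cpu_topology[NR_CPUS];

    /* after: storage only for the possible cpus */
    DEFINE_PER_CPU(struct cpu_topology_s390, cpu_topology);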

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • There is no reason to initialize the topology cpu masks already while
    setup_arch() is being called. It is sufficient to initialize the masks
    before the scheduler becomes SMP aware.
    Therefore a pre-SMP initcall, i.e. an early_initcall, is sufficient.

    This also allows converting the cpu_topology array into a per cpu
    variable with a later patch. Without this patch that wouldn't be
    possible, since the per cpu memory areas are not yet allocated while
    setup_arch() is executed.
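
    A sketch of the initcall placement (the function name here is
    hypothetical):

    static int __init s390_topology_init(void)      /* hypothetical name */
    {
            /* build the topology cpu masks; per cpu areas exist by now */
            return 0;
    }
    early_initcall(s390_topology_init);     /* pre-SMP, after setup_arch() */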

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky

    Heiko Carstens
     
  • Git commit 8d8f2e18a6dbd3d09dd918788422e6ac8c878e96
    "s390/vdso: ectg gettime support for CLOCK_THREAD_CPUTIME_ID"
    broke clock_gettime for CLOCK_THREAD_CPUTIME_ID.

    Git commit c742b31c03f37c5c499178f09f57381aa6c70131
    "fast vdso implementation for CLOCK_THREAD_CPUTIME_ID"
    introduced the ECTG for clock id -2. Correct would have been
    clock id -3.

    Fix the whole mess: CLOCK_THREAD_CPUTIME_ID is based on
    CPUCLOCK_SCHED and cannot be sped up by the vdso. A speedup
    is only available for clock id -3, which is CPUCLOCK_VIRT for
    the task currently running on the CPU.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Merge second set of updates from Andrew Morton:
    "More of MM"

    * emailed patches from Andrew Morton : (83 commits)
    mm/nommu.c: fix arithmetic overflow in __vm_enough_memory()
    mm/mmap.c: fix arithmetic overflow in __vm_enough_memory()
    vmstat: Reduce time interval to stat update on idle cpu
    mm/page_owner.c: remove unnecessary stack_trace field
    Documentation/filesystems/proc.txt: describe /proc//map_files
    mm: incorporate read-only pages into transparent huge pages
    vmstat: do not use deferrable delayed work for vmstat_update
    mm: more aggressive page stealing for UNMOVABLE allocations
    mm: always steal split buddies in fallback allocations
    mm: when stealing freepages, also take pages created by splitting buddy page
    mincore: apply page table walker on do_mincore()
    mm: /proc/pid/clear_refs: avoid split_huge_page()
    mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP)
    mempolicy: apply page table walker on queue_pages_range()
    arch/powerpc/mm/subpage-prot.c: use walk->vma and walk_page_vma()
    memcg: cleanup preparation for page table walk
    numa_maps: remove numa_maps->vma
    numa_maps: fix typo in gather_hugetbl_stats
    pagemap: use walk->vma instead of calling find_vma()
    clear_refs: remove clear_refs_private->vma and introduce clear_refs_test_walk()
    ...

    Linus Torvalds
     
  • Pull s390 updates from Martin Schwidefsky:

    - The remaining patches for the z13 machine support: kernel build
    option for z13, the cache synonym avoidance, SMT support,
    compare-and-delay for spinloops and the CEX5S crypto adapter.

    - The ftrace support for function tracing with the gcc hotpatch option.
    This touches common code Makefiles; Steven is OK with the changes.

    - The hypfs file system gets an extension to access diagnose 0x0c data
    in user space for performance analysis for Linux running under z/VM.

    - The iucv hvc console gets wildcard support for the user id filtering.

    - The cacheinfo code is converted to use the generic infrastructure.

    - Cleanup and bug fixes.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (42 commits)
    s390/process: free vx save area when releasing tasks
    s390/hypfs: Eliminate hypfs interval
    s390/hypfs: Add diagnose 0c support
    s390/cacheinfo: don't use smp_processor_id() in preemptible context
    s390/zcrypt: fixed domain scanning problem (again)
    s390/smp: increase maximum value of NR_CPUS to 512
    s390/jump label: use different nop instruction
    s390/jump label: add sanity checks
    s390/mm: correct missing space when reporting user process faults
    s390/dasd: cleanup profiling
    s390/dasd: add locking for global_profile access
    s390/ftrace: hotpatch support for function tracing
    ftrace: let notrace function attribute disable hotpatching if necessary
    ftrace: allow architectures to specify ftrace compile options
    s390: reintroduce diag 44 calls for cpu_relax()
    s390/zcrypt: Add support for new crypto express (CEX5S) adapter.
    s390/zcrypt: Number of supported ap domains is not retrievable.
    s390/spinlock: add compare-and-delay to lock wait loops
    s390/tape: remove redundant if statement
    s390/hvc_iucv: add simple wildcard matches to the iucv allow filter
    ...

    Linus Torvalds
     
  • This allows the get_user_pages_fast slow path to release the mmap_sem
    before blocking.

    Signed-off-by: Andrea Arcangeli
    Reviewed-by: Kirill A. Shutemov
    Cc: Andres Lagar-Cavilla
    Cc: Peter Feiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • LKP has triggered a compiler warning after my recent patch "mm: account
    pmd page tables to the process":

    mm/mmap.c: In function 'exit_mmap':
    >> mm/mmap.c:2857:2: warning: right shift count >= width of type [enabled by default]

    The code:

    > 2857 WARN_ON(mm_nr_pmds(mm) >
    2858 round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT);

    In this, on tile, FIRST_USER_ADDRESS is defined as 0, so round_up() yields
    the same type -- int -- and right-shifting that int by PUD_SHIFT is what
    triggers the "right shift count >= width of type" warning.

    I think the best way to fix it is to define FIRST_USER_ADDRESS as unsigned
    long on every arch, for consistency.
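
    Per arch the change is essentially (sketch):

    /* before */
    #define FIRST_USER_ADDRESS      0
    /* after: unsigned long everywhere, so the shift above is done in 64 bits */
    #define FIRST_USER_ADDRESS      0UL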

    Signed-off-by: Kirill A. Shutemov
    Reported-by: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
    Currently we have many duplicates in definitions around
    follow_huge_addr(), follow_huge_pmd(), and follow_huge_pud(), so this
    patch tries to remove them. The basic idea is to put the default
    implementation for these functions in mm/hugetlb.c as weak symbols
    (regardless of CONFIG_ARCH_WANT_GENERAL_HUGETLB), and to implement
    arch-specific code only when the arch needs it.

    For follow_huge_addr(), only powerpc and ia64 have their own
    implementation, and in all other architectures this function just returns
    ERR_PTR(-EINVAL). So this patch sets returning ERR_PTR(-EINVAL) as
    default.
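
    The default then looks roughly like this (a sketch of the weak symbol in
    mm/hugetlb.c):

    struct page * __weak
    follow_huge_addr(struct mm_struct *mm, unsigned long address, int write)
    {
            return ERR_PTR(-EINVAL);        /* powerpc and ia64 override this */
    }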

    As for follow_huge_(pmd|pud)(), if (pmd|pud)_huge() is implemented to
    always return 0 in your architecture (like in ia64 or sparc), it's never
    called (the callsite is optimized away) no matter how it is implemented.
    So in such architectures, we don't need an arch-specific implementation.

    In some architectures (like mips, s390 and tile), the current
    arch-specific follow_huge_(pmd|pud)() are effectively identical to the
    common code, so this patch lets these architectures use the common code.

    One exception is metag, where pmd_huge() could return non-zero but it
    expects follow_huge_pmd() to always return NULL. This means that we need
    arch-specific implementation which returns NULL. This behavior looks
    strange to me (because non-zero pmd_huge() implies that the architecture
    supports PMD-based hugepage, so follow_huge_pmd() can/should return some
    relevant value,) but that's beyond this cleanup patch, so let's keep it.

    Justification of non-trivial changes:
    - in s390, follow_huge_pmd() checks !MACHINE_HAS_HPAGE at first, and this
    patch removes the check. This is OK because we can assume MACHINE_HAS_HPAGE
    is true when follow_huge_pmd() can be called (note that pmd_huge() has
    the same check and always returns 0 for !MACHINE_HAS_HPAGE.)
    - in s390 and mips, we use HPAGE_MASK instead of PMD_MASK as done in common
    code. This patch forces these archs to use PMD_MASK, but it's OK because
    they are identical in both archs.
    In s390, both of HPAGE_SHIFT and PMD_SHIFT are 20.
    In mips, HPAGE_SHIFT is defined as (PAGE_SHIFT + PAGE_SHIFT - 3) and
    PMD_SHIFT is defined as (PAGE_SHIFT + PAGE_SHIFT + PTE_ORDER - 3), but
    PTE_ORDER is always 0, so these are identical.

    Signed-off-by: Naoya Horiguchi
    Acked-by: Hugh Dickins
    Cc: James Hogan
    Cc: David Rientjes
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Rik van Riel
    Cc: Andrea Arcangeli
    Cc: Luiz Capitulino
    Cc: Nishanth Aravamudan
    Cc: Lee Schermerhorn
    Cc: Steve Capper
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi