28 Jan, 2015

8 commits

  • commit 5138d5c562e3bfe30964e20ab46eec9f8b89225d upstream.

    The gpio4 and gpio5 are in 0xf7fc0000 apb which is located in the SM domain.
    This patch moves gpio4 and gpio5 to the correct location. This patch also
    renames them as the following to match the names we internally used in
    marvell:
    gpio4 -> sm_gpio1
    gpio5 -> sm_gpio0
    porte -> portf
    portf -> porte

    This also matches what we did for BG2 and BG2CD's SM GPIO.

    Fixes: cedf57fc4f2f ("ARM: dts: berlin: add the BG2Q GPIO nodes")
    Signed-off-by: Jisheng Zhang
    Signed-off-by: Sebastian Hesselbarth
    Signed-off-by: Greg Kroah-Hartman

    Jisheng Zhang
     
  • commit 237d28db036e411f22c03cfd5b0f6dc2aa9bf3bc upstream.

    If the function graph tracer traces a jprobe callback, the system will
    crash. This can easily be demonstrated by compiling the jprobe
    sample module that is in the kernel tree, loading it and running the
    function graph tracer.

    # modprobe jprobe_example.ko
    # echo function_graph > /sys/kernel/debug/tracing/current_tracer
    # ls

    The first two commands end up in a nice crash after the first fork.
    (do_fork has a jprobe attached to it, so "ls" just triggers that fork)

    The problem is caused by the jprobe_return() that all jprobe callbacks
    must end with. The way jprobes works is that the function a jprobe
    is attached to has a breakpoint placed at the start of it (or it uses
    ftrace if fentry is supported). The breakpoint handler (or ftrace callback)
    will copy the stack frame and change the ip address to return to the
    jprobe handler instead of the function. The jprobe handler must end
    with jprobe_return() which swaps the stack and does an int3 (breakpoint).
    This breakpoint handler will then put back the saved stack frame,
    simulate the instruction at the beginning of the function it added
    a breakpoint to, and then continue on.

    For function tracing to work, it hijacks the return address from the
    stack frame, and replaces it with a hook function that will trace
    the end of the call. This hook function will restore the return
    address of the function call.

    If the function tracer traces the jprobe handler, the hook function
    for that handler will not be called, and its saved return address
    will be used for the next function. This will result in a kernel crash.

    To solve this, pause function tracing before the jprobe handler is called
    and unpause it before it returns back to the function it probed.

    Some other updates:

    Used a variable "saved_sp" to hold kcb->jprobe_saved_sp. This makes the
    code look a bit cleaner and easier to understand (various tries to fix
    this bug required this change).

    Note, if fentry is being used, jprobes will change the ip address before
    the function graph tracer runs and it will not be able to trace the
    function that the jprobe is probing.

    Link: http://lkml.kernel.org/r/20150114154329.552437962@goodmis.org

    Acked-by: Masami Hiramatsu
    Signed-off-by: Steven Rostedt
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (Red Hat)
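
    The failure mode described in the jprobe commit above can be modelled
    with a toy stack of saved return addresses. This is only a sketch: the
    names sk_shadow, sk_enter and sk_exit are hypothetical and are not
    kernel APIs. It shows that an entry pushed for a handler that never
    reaches its return hook is handed to the next function that returns,
    and that pausing the tracer around the handler avoids this.

```c
#include <assert.h>

#define SK_DEPTH 8

/* Toy model of the tracer's stack of hijacked return addresses. */
struct sk_shadow {
    unsigned long ret[SK_DEPTH];
    int top;
    int paused;
};

/* Function entry: remember the real return address unless tracing is paused. */
void sk_enter(struct sk_shadow *s, unsigned long real_ret)
{
    if (!s->paused && s->top < SK_DEPTH)
        s->ret[s->top++] = real_ret;
}

/* Return hook: hand back the most recently saved return address. */
unsigned long sk_exit(struct sk_shadow *s)
{
    return s->top > 0 ? s->ret[--s->top] : 0;
}
```

    If an outer function is entered with return address 0x100 and the
    handler is then entered with 0x200 but leaves without calling sk_exit,
    the outer function's sk_exit yields 0x200. Setting paused around the
    handler keeps 0x200 off the stack, so the outer function gets 0x100.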
     
  • commit 0145058c3d30b4319d747f64caa16a9cb15f0581 upstream.

    This patch partially reverts commit 421520ba98290a73b35b7644e877a48f18e06004
    (only the arm64 part). There is no guarantee that the boot-loader places other
    images like dtb in a different page than initrd start/end, especially when the
    kernel is built with 64KB pages. When this happens, such pages must not be
    freed. The free_reserved_area() already takes care of rounding up "start" and
    rounding down "end" to avoid freeing partially used pages.

    Reported-by: Peter Maydell
    Signed-off-by: Catalin Marinas
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Catalin Marinas
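
    The rounding that the arm64 initrd commit above relies on can be
    sketched in user space as follows, assuming 64KB pages. The names
    sk_whole_pages and SK_PAGE_SIZE are hypothetical; this is not the
    kernel's free_reserved_area() but a model of rounding "start" up and
    "end" down so that partially used pages are left alone.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page size for this sketch: 64KB, as in the commit text. */
#define SK_PAGE_SIZE 0x10000ULL
#define SK_PAGE_MASK (~(SK_PAGE_SIZE - 1))

struct sk_range {
    uint64_t start;
    uint64_t end;
};

/* Keep only the pages that lie entirely inside [start, end). */
struct sk_range sk_whole_pages(uint64_t start, uint64_t end)
{
    struct sk_range r;

    r.start = (start + SK_PAGE_SIZE - 1) & SK_PAGE_MASK; /* round up */
    r.end = end & SK_PAGE_MASK;                          /* round down */
    if (r.end < r.start)
        r.end = r.start; /* no whole page fits: empty range */
    return r;
}
```

    For example, an initrd spanning 0x48001000 to 0x48035000 shrinks to
    0x48010000 to 0x48030000, so a dtb sharing the first or last 64KB page
    with the initrd would not be freed.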
     
  • commit bfe5fda8e7ced129716f5741cf7ed2592a338824 upstream.

    Patch c49f63530bb6 ("powernv: Add OPAL tracepoints") has a spurious
    store to the stack:

    ld r12,opal_tracepoint_refcount@toc(r2); \
    std r12,32(r1); \

    The store was originally used to save the current tracepoint status
    so the entry and the exit tracepoints were always balanced. In the
    end I just created a separate path when tracepoints are enabled.

    The offset on the stack used for this store is not valid for ABIv2
    and it causes strange issues. I noticed it because OPAL console input
    was broken.

    Fixes: c49f63530bb6 ("powernv: Add OPAL tracepoints")
    Signed-off-by: Anton Blanchard
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Anton Blanchard
     
  • commit 280dbc572357eb50184663fc9e4aaf09c8141e9b upstream.

    Commit 9def39be4e96 ("x86: Support compiling out human-friendly
    processor feature names") made two source file targets
    conditional. Such conditional targets will not be cleaned
    automatically by make mrproper.

    Fix by adding explicit clean-files targets for the two files.

    Fixes: 9def39be4e96 ("x86: Support compiling out human-friendly processor feature names")
    Signed-off-by: Bjørn Mork
    Cc: Josh Triplett
    Link: http://lkml.kernel.org/r/1419335863-10608-1-git-send-email-bjorn@mork.no
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Bjørn Mork
     
  • commit 45db07382a5c78b0c43b3b0002b63757fb60e873 upstream.

    The __ldcw macro has a problem when its argument needs to be reloaded from
    memory. The output memory operand and the input register operand both need to
    be reloaded using a register in class R1_REGS when generating 64-bit code.
    This fails because there's only a single register in the class. Instead, use a
    memory clobber. This also makes the __ldcw macro a compiler memory barrier.

    Signed-off-by: John David Anglin
    Signed-off-by: Helge Deller
    Signed-off-by: Greg Kroah-Hartman

    John David Anglin
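
    The "memory clobber" mentioned in the __ldcw commit above is a GCC
    extended-asm feature. The sketch below does not reproduce the parisc
    ldcw instruction; it only shows, with hypothetical names, how an asm
    statement with a "memory" clobber acts as a compiler memory barrier.
    It assumes a GCC-compatible compiler.

```c
#include <assert.h>

/* Compiler-only barrier: an empty asm statement with a "memory" clobber. */
#define sk_compiler_barrier() __asm__ __volatile__("" ::: "memory")

int sk_flag;

int sk_read_flag_twice(void)
{
    int first = sk_flag;

    /* The compiler must assume memory changed here, so it cannot reuse
     * the value already loaded into 'first' for the second read. */
    sk_compiler_barrier();

    int second = sk_flag;

    return first + second;
}
```

    The barrier constrains only the compiler; it emits no instruction and
    gives no ordering guarantee between CPUs.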
     
  • commit 96ed6046d3bf1113de3bdbd6dbb7f40e6d0ae0ef upstream.

    On BG2Q, the sdhci2 host uses nfcecc for "io" clk and nfc for "core" clk.
    The sdhci2 can't work without this patch because the "core" clk is gated.

    Fixes: 0d859a6a9d14 ("ARM: dts: berlin: add the SDHCI nodes for the BG2Q")
    Signed-off-by: Jisheng Zhang
    Signed-off-by: Sebastian Hesselbarth
    Signed-off-by: Greg Kroah-Hartman

    Jisheng Zhang
     
  • commit e8ef060b37c2d3cc5fd0c0edbe4e42ec1cb9768b upstream.

    This allows the sdplite/Zebu images to run on OSCI simulation platform

    Signed-off-by: Vineet Gupta
    Signed-off-by: Greg Kroah-Hartman

    Vineet Gupta
     

16 Jan, 2015

31 commits

  • commit 5306c31c5733cb4a79cc002e0c3ad256fd439614 upstream.

    There was another report of a boot failure with a #GP fault in the
    uncore SBOX initialization. The earlier work around was not enough
    for this system.

    The boot was failing while trying to initialize the third SBOX.

    This patch detects parts with only two SBOXes and limits the number
    of SBOX units to two there.

    Stable material, as it affects boot problems on 3.18.

    Tested-by: Andreas Oehler
    Signed-off-by: Andi Kleen
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Stephane Eranian
    Cc: Yan, Zheng
    Link: http://lkml.kernel.org/r/1420583675-9163-1-git-send-email-andi@firstfloor.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andi Kleen
     
  • commit af91568e762d04931dcbdd6bef4655433d8b9418 upstream.

    The uncore_collect_events function assumes that an event group
    contains only uncore events, which is wrong, because it
    might contain any type of event.

    This bug leads to the uncore framework touching non-uncore events,
    which could cause all sorts of bugs.

    One was triggered by Vince's perf fuzzer, when the uncore code
    touched breakpoint event private event space as if it was uncore
    event and caused BUG:

    BUG: unable to handle kernel paging request at ffffffff82822068
    IP: [] uncore_assign_events+0x188/0x250
    ...

    The code in uncore_assign_events() function was looking for
    event->hw.idx data while the event was initialized as a
    breakpoint with different members in event->hw union.

    This patch forces uncore_collect_events() to collect only uncore
    events.

    Reported-by: Vince Weaver
    Signed-off-by: Jiri Olsa
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Yan, Zheng
    Link: http://lkml.kernel.org/r/1418243031-20367-2-git-send-email-jolsa@kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Jiri Olsa
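
    The idea behind the uncore fix above, collecting only events whose
    private union really holds uncore data, can be sketched as follows.
    All type and function names here are hypothetical stand-ins, not the
    perf structures; the real code identifies uncore events differently.

```c
#include <assert.h>
#include <stddef.h>

enum sk_type { SK_UNCORE, SK_BREAKPOINT };

struct sk_event {
    enum sk_type type;
    union {
        struct { int idx; } uncore;        /* valid only for uncore events */
        struct { unsigned long addr; } bp; /* valid only for breakpoints */
    } hw;
};

/* Copy pointers to the uncore events of a group into out[]; return count. */
size_t sk_collect_uncore(const struct sk_event *group, size_t n,
                         const struct sk_event **out, size_t max)
{
    size_t count = 0;

    for (size_t i = 0; i < n && count < max; i++) {
        if (group[i].type != SK_UNCORE)
            continue; /* the hw union of this event holds other data */
        out[count++] = &group[i];
    }
    return count;
}
```

    Later code that reads hw.uncore.idx then only ever sees events for
    which that union member is the active one.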
     
  • commit 0b1e95b2fa0934c3a08db483979c70d3b287f50e upstream.

    The "by8" counter mode optimization is broken for 128 bit keys with
    input data longer than 128 bytes. It uses the wrong key material for
    en- and decryption.

    The key registers xkey0, xkey4, xkey8 and xkey12 need to be preserved
    in case we're handling more than 128 bytes of input data -- they won't
    get reloaded after the initial load. They must therefore be (a) loaded
    on the first iteration and (b) be preserved for the latter ones. The
    implementation for 128 bit keys does not comply with (a) nor (b).

    Fix this by bringing the implementation back to its original source
    and correctly load the key registers and preserve their values by
    *not* re-using the registers for other purposes.

    Kudos to James for reporting the issue and providing a test case
    showing the discrepancies.

    Reported-by: James Yonan
    Cc: Chandramouli Narayanan
    Signed-off-by: Mathias Krause
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Mathias Krause
     
  • commit 0b8c960cf6defc56b3aa1a71b5af95872b6dff2b upstream.

    This patch fixes this allyesconfig target build error with older
    binutils.

    LD arch/x86/crypto/built-in.o
    ld: arch/x86/crypto/sha-mb/built-in.o: No such file: No such file or directory

    Signed-off-by: Vinson Lee
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Vinson Lee
     
  • commit 0e63ea48b4d8035dd0e91a3fa6fb79458b47adfb upstream.

    The early ioremap support introduced by patch bf4b558eba92
    ("arm64: add early_ioremap support") failed to add a call to
    early_ioremap_reset() at an appropriate time. Without this call,
    invocations of early_ioremap etc. that are done too late will go
    unnoticed and may cause corruption.

    This is exactly what happened when the first user of this feature
    was added in patch f84d02755f5a ("arm64: add EFI runtime services").
    The early mapping of the EFI memory map is unmapped during an early
    initcall, at which time the early ioremap support is long gone.

    Fix by adding the missing call to early_ioremap_reset() to
    setup_arch(), and move the offending early_memunmap() to right after
    the point where the early mapping of the EFI memory map is last used.

    Fixes: f84d02755f5a ("arm64: add EFI runtime services")
    Signed-off-by: Leif Lindholm
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Ard Biesheuvel
     
  • commit f43c27188a49111b58e9611afa2f0365b0b55625 upstream.

    On arm64 the TTBR0_EL1 register is set to either the reserved TTBR0
    page tables on boot or to the active_mm mappings belonging to user space
    processes, it must never be set to swapper_pg_dir page tables mappings.

    When a CPU is booted its active_mm is set to init_mm even though its
    TTBR0_EL1 points at the reserved TTBR0 page mappings. This implies
    that when __cpu_suspend is triggered the active_mm can point at
    init_mm even if the current TTBR0_EL1 register contains the reserved
    TTBR0_EL1 mappings.

    Therefore, the mm save and restore executed in __cpu_suspend might
    turn out to be erroneous in that, if the current->active_mm corresponds
    to init_mm, on resume from low power it ends up restoring in the
    TTBR0_EL1 the init_mm mappings that are global and can cause speculation
    of TLB entries which end up being propagated to user space.

    This patch fixes the issue by checking the active_mm pointer before
    restoring the TTBR0 mappings. If the current active_mm == &init_mm,
    the code sets the TTBR0_EL1 to the reserved TTBR0 mapping instead of
    switching back to the active_mm, which is the expected behaviour
    corresponding to the TTBR0_EL1 settings when __cpu_suspend was entered.

    Fixes: 95322526ef62 ("arm64: kernel: cpu_{suspend/resume} implementation")
    Cc: Will Deacon
    Signed-off-by: Lorenzo Pieralisi
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Lorenzo Pieralisi
     
  • commit c3684fbb446501b48dec6677a6a9f61c215053de upstream.

    The function cpu_resume currently lives in the .data section.
    There's no reason for it to be there since we can use relative
    instructions without a problem. Move a few cpu_resume data
    structures out of the assembly file so the .data annotation
    can be dropped completely and cpu_resume ends up in the read
    only text section.

    Reviewed-by: Kees Cook
    Reviewed-by: Mark Rutland
    Reviewed-by: Lorenzo Pieralisi
    Tested-by: Mark Rutland
    Tested-by: Lorenzo Pieralisi
    Tested-by: Kees Cook
    Acked-by: Ard Biesheuvel
    Signed-off-by: Laura Abbott
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Laura Abbott
     
  • commit 4bf9636c39ac70da091d5a2e28d3448eaa7f115c upstream.

    Commit 9fc2105aeaaf ("ARM: 7830/1: delay: don't bother reporting
    bogomips in /proc/cpuinfo") breaks audio in python, and probably
    elsewhere, with message

    FATAL: cannot locate cpu MHz in /proc/cpuinfo

    I'm not the first one to hit it, see for example

    https://theredblacktree.wordpress.com/2014/08/10/fatal-cannot-locate-cpu-mhz-in-proccpuinfo/
    https://devtalk.nvidia.com/default/topic/765800/workaround-for-fatal-cannot-locate-cpu-mhz-in-proc-cpuinf/?offset=1

    Reading original changelog, I have to say "Stop breaking working setups.
    You know who you are!".

    Signed-off-by: Pavel Machek
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Pavel Machek
     
  • commit 9008d83fe9dc2e0f19b8ba17a423b3759d8e0fd7 upstream.

    Commit 705814b5ea6f ("ARM: OMAP4+: PM: Consolidate OMAP4 PM code to
    re-use it for OMAP5")

    Moved logic generic for OMAP5+ as part of the init routine by
    introducing omap4_pm_init. However, the patch left the powerdomain
    initial setup, an unused omap4430 es1.0 check and a spurious log
    "Power Management for TI OMAP4." in the original code.

    Remove the duplicate code which is already present in omap4_pm_init from
    omap4_init_static_deps.

    As part of this change, also move the u-boot version print out of the
    static dependency function to the omap4_pm_init function.

    Fixes: 705814b5ea6f ("ARM: OMAP4+: PM: Consolidate OMAP4 PM code to re-use it for OMAP5")
    Signed-off-by: Nishanth Menon
    Signed-off-by: Tony Lindgren
    Signed-off-by: Greg Kroah-Hartman

    Nishanth Menon
     
  • commit 5e794de514f56de1e78e979ca09c56a91aa2e9f1 upstream.

    The PWM block is required for system clock source so it must be always
    enabled. This patch fixes boot issues on SMDK6410 which did not have
    the node enabled explicitly for other purposes.

    Fixes: eeb93d02 ("clocksource: of: Respect device tree node status")

    Signed-off-by: Tomasz Figa
    Signed-off-by: Kukjin Kim
    Signed-off-by: Greg Kroah-Hartman

    Tomasz Figa
     
  • commit be6688350a4470e417aaeca54d162652aab40ac5 upstream.

    OMAP wdt driver supports only ti,omap3-wdt compatible. In DRA7 dt
    wdt compatible property is defined as ti,omap4-wdt by mistake instead of
    ti,omap3-wdt. Correcting the typo.

    Fixes: 6e58b8f1daaf1a ("ARM: dts: DRA7: Add the dts files for dra7 SoC and dra7-evm board")
    Signed-off-by: Lokesh Vutla
    Signed-off-by: Tony Lindgren
    Signed-off-by: Greg Kroah-Hartman

    Lokesh Vutla
     
  • commit 9d312cd12e89ce08add99fe66e8f6baeaca16d7d upstream.

    CONFIG_GENERIC_CPUFREQ_CPU0 disappeared with commit bbcf071969b20f
    ("cpufreq: cpu0: rename driver and internals to 'cpufreq_dt'") and some
    defconfigs are still using it instead of the new one.

    Use the renamed CONFIG_CPUFREQ_DT generic driver.

    Reported-by: Nishanth Menon
    Signed-off-by: Viresh Kumar
    Signed-off-by: Kevin Hilman
    Signed-off-by: Greg Kroah-Hartman

    Viresh Kumar
     
  • commit d73f825e6efa723e81d9ffcc4949fe9f03f1df29 upstream.

    The lcd0 node for am437x-sk-evm.dts contains bad LCD timings, and while
    they seem to work with a quick test, doing for example blank/unblank
    will give you a black display.

    This patch updates the timings to the 'typical' values from the LCD spec
    sheet.

    Also, the compatible string is completely bogus, as
    "osddisplays,osd057T0559-34ts" is _not_ a 480x272 panel. The panel on
    the board is a newhaven one. Update the compatible string to reflect
    this. Note that this hasn't caused any issues, as the "panel-dpi"
    matches the driver.

    Tested-by: Felipe Balbi
    Signed-off-by: Tomi Valkeinen
    Signed-off-by: Tony Lindgren
    Signed-off-by: Greg Kroah-Hartman

    Tomi Valkeinen
     
  • commit 58230c2c443bc9801293f6535144d04ceaf731e0 upstream.

    Caused by a copy & paste error. Note that even with
    this bug AM437x SK display still works because GPIO
    mux mode is always enabled. It's still wrong to mux
    somebody else's pin.

    Luckily ball D25 (offset 0x238 - gpio5_8) on AM437x
    isn't used for anything.

    While at that, also replace a pullup with a pulldown
    as that gpio should be normally low, not high.

    Acked-by: Tomi Valkeinen
    Signed-off-by: Felipe Balbi
    Signed-off-by: Tony Lindgren
    Signed-off-by: Greg Kroah-Hartman

    Felipe Balbi
     
  • commit ff009ab6d4d4581b62fa055ab6233133aca25ab8 upstream.

    Replace PAGE_KERNEL with PAGE_KERNEL_EXEC to allow copy_to_user_page
    to invalidate the icache for pages mapped with kmap.

    Signed-off-by: Max Filippov
    Signed-off-by: Greg Kroah-Hartman

    Max Filippov
     
  • commit 007487f1fd43d84f26cda926081ca219a24ecbc4 upstream.

    Currently we enable Exynos devices in the multi v7 defconfig, however, when
    testing on my ODROID-U3, I noticed that USB was not working. Enabling this
    option causes USB to work, which enables networking support as well since the
    ODROID-U3 has networking on the USB bus.

    [arnd] Support for odroid-u3 was added in 3.10, so it would be nice to
    backport this fix at least that far.

    Signed-off-by: Steev Klimaszewski
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Greg Kroah-Hartman

    Steev Klimaszewski
     
  • commit 1ddf0b1b11aa8a90cef6706e935fc31c75c406ba upstream.

    In Linux 3.18 and below, GCC hoists the lsl instructions in the
    pvclock code all the way to the beginning of __vdso_clock_gettime,
    slowing the non-paravirt case significantly. For unknown reasons,
    presumably related to the removal of a branch, the performance issue
    is gone as of

    e76b027e6408 x86,vdso: Use LSL unconditionally for vgetcpu

    but I don't trust GCC enough to expect the problem to stay fixed.

    There should be no correctness issue, because the __getcpu calls in
    __vdso_clock_gettime were never necessary in the first place.

    Note to stable maintainers: In 3.18 and below, depending on
    configuration, gcc 4.9.2 generates code like this:

    9c3: 44 0f 03 e8 lsl %ax,%r13d
    9c7: 45 89 eb mov %r13d,%r11d
    9ca: 0f 03 d8 lsl %ax,%ebx

    This patch won't apply as is to any released kernel, but I'll send a
    trivial backported version if needed.

    [
    Backported by Andy Lutomirski. Should apply to all affected
    versions. This fixes a functionality bug as well as a performance
    bug: buggy kernels can infinite loop in __vdso_clock_gettime on
    affected compilers. See, for example:

    https://bugzilla.redhat.com/show_bug.cgi?id=1178975
    ]

    Fixes: 51c19b4f5927 x86: vdso: pvclock gettime support
    Cc: Marcelo Tosatti
    Acked-by: Paolo Bonzini
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit 394f56fe480140877304d342dec46d50dc823d46 upstream.

    The theory behind vdso randomization is that it's mapped at a random
    offset above the top of the stack. To avoid wasting a page of
    memory for an extra page table, the vdso isn't supposed to extend
    past the lowest PMD into which it can fit. Other than that, the
    address should be a uniformly distributed address that meets all of
    the alignment requirements.

    The current algorithm is buggy: the vdso has about a 50% probability
    of being at the very end of a PMD. The current algorithm also has a
    decent chance of failing outright due to incorrect handling of the
    case where the top of the stack is near the top of its PMD.

    This fixes the implementation. The paxtest estimate of vdso
    "randomisation" improves from 11 bits to 18 bits. (Disclaimer: I
    don't know what the paxtest code is actually calculating.)

    It's worth noting that this algorithm is inherently biased: the vdso
    is more likely to end up near the end of its PMD than near the
    beginning. Ideally we would either nix the PMD sharing requirement
    or jointly randomize the vdso and the stack to reduce the bias.

    In the meantime, this is a considerable improvement with basically
    no risk of compatibility issues, since the allowed outputs of the
    algorithm are unchanged.

    As an easy test, doing this:

    for i in `seq 10000`
    do grep -P vdso /proc/self/maps |cut -d- -f1
    done |sort |uniq -d

    used to produce lots of output (1445 lines on my most recent run).
    A tiny subset looks like this:

    7fffdfffe000
    7fffe01fe000
    7fffe05fe000
    7fffe07fe000
    7fffe09fe000
    7fffe0bfe000
    7fffe0dfe000

    Note the suspicious fe000 endings. With the fix, I get a much more
    palatable 76 repeated addresses.

    Reviewed-by: Kees Cook
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
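
    The placement rule described in the vdso randomization commit above
    (a page-aligned address at or above the top of the stack that does
    not extend past the lowest PMD into which the vdso fits) can be
    sketched as below. This is a model of the description, not the
    kernel's code: the names are hypothetical, the page and PMD sizes are
    assumed to be 4KB and 2MB, len is assumed to be a multiple of the
    page size, and the clamp against the top of the address space is
    omitted.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12
#define SK_PAGE_SIZE  (1ULL << SK_PAGE_SHIFT)
#define SK_PMD_SIZE   (1ULL << 21)

uint64_t sk_pick_vdso_addr(uint64_t stack_top, uint64_t len, uint64_t rnd)
{
    /* First page boundary at or above the top of the stack. */
    uint64_t start = (stack_top + SK_PAGE_SIZE - 1) & ~(SK_PAGE_SIZE - 1);
    /* End of the lowest PMD into which the vdso can fit. */
    uint64_t end = (start + len + SK_PMD_SIZE - 1) & ~(SK_PMD_SIZE - 1);
    /* Highest start address that keeps the vdso inside that PMD. */
    uint64_t last = end - len;
    /* Number of page-aligned candidates in [start, last]. */
    uint64_t slots = ((last - start) >> SK_PAGE_SHIFT) + 1;

    return start + ((rnd % slots) << SK_PAGE_SHIFT);
}
```

    Every candidate page between start and last is equally likely for a
    uniform rnd, which is the property the commit says the old algorithm
    lacked.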
     
  • commit a629df7eadffb03e6ce4a8616e62ea29fdf69b6b upstream.

    Since most virtual machines raise this message once, it is a bit annoying.
    Make it KERN_DEBUG severity.

    Fixes: 7a2e8aaf0f6873b47bc2347f216ea5b0e4c258ab
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
     
  • commit 1365039d0cb32c0cf96eb9f750f4277c9a90f87d upstream.

    ipte_unlock_siif uses cmpxchg to replace the in-memory data of the ipte
    lock together with ACCESS_ONCE for the initial read.

    union ipte_control {
    unsigned long val;
    struct {
    unsigned long k : 1;
    unsigned long kh : 31;
    unsigned long kg : 32;
    };
    };
    [...]
    static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
    {
    union ipte_control old, new, *ic;

    ic = &vcpu->kvm->arch.sca->ipte_control;
    do {
    new = old = ACCESS_ONCE(*ic);
    new.kh--;
    if (!new.kh)
    new.k = 0;
    } while (cmpxchg(&ic->val, old.val, new.val) != old.val);
    if (!new.kh)
    wake_up(&vcpu->kvm->arch.ipte_wq);
    }

    The new value is loaded twice from memory with gcc 4.7.2 of
    fedora 18, despite the ACCESS_ONCE:

    --->

    l %r4,0(%r3)
    nihh %r1,32767
    lgr %r4,%r2
    csg %r4,%r1,0(%r3)
    cgr %r2,%r4
    jne a70

    If the memory value changes between the first load (l) and the second
    load (lg) we are broken. If that happens VCPU threads will hang
    (unkillable) in handle_ipte_interlock.

    Andreas Krebbel analyzed this and tracked it down to a compiler bug in
    that version:
    "while it is not that obvious the C99 standard basically forbids
    duplicating the memory access also in that case. For an argumentation of
    a similar case please see:
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278#c43

    For the implementation-defined cases regarding volatile there are some
    GCC-specific clarifications which can be found here:
    https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html#Volatiles

    I've tracked down the problem with a reduced testcase. The problem was
    that during a tree level optimization (SRA - scalar replacement of
    aggregates) the volatile marker is lost. And an RTL level optimizer (CSE
    - common subexpression elimination) then propagated the memory read into
    its second use introducing another access to the memory location. So
    indeed Christian's suspicion that the union access has something to do
    with it is correct (since it triggered the SRA optimization).

    This issue has been reported and fixed in the GCC 4.8 development cycle:
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145"

    This patch replaces the ACCESS_ONCE scheme with a barrier() based scheme
    that should work for all supported compilers.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Greg Kroah-Hartman

    Christian Borntraeger
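
    A barrier()-based read followed by a compare-and-swap loop, as the
    ipte commit above describes, might look roughly like this in user
    space. It is a sketch under assumptions: the names are hypothetical,
    the bit layout (k in bit 63, kh in bits 62..32, kg in the low 32
    bits) is chosen for the example, kh must be non-zero on entry, and
    the GCC builtin __sync_val_compare_and_swap stands in for cmpxchg.

```c
#include <assert.h>
#include <stdint.h>

#define sk_barrier() __asm__ __volatile__("" ::: "memory")

#define SK_K        (1ULL << 63)
#define SK_KH_SHIFT 32
#define SK_KH_MASK  (0x7fffffffULL << SK_KH_SHIFT)

/* Drop one kh reference; clear k when kh reaches zero. Returns the new kh. */
uint32_t sk_unlock(uint64_t *ctl)
{
    uint64_t old, next;

    do {
        old = *ctl;
        sk_barrier(); /* keep the compiler from re-reading *ctl below */
        next = old - (1ULL << SK_KH_SHIFT);
        if (!(next & SK_KH_MASK))
            next &= ~SK_K;
    } while (__sync_val_compare_and_swap(ctl, old, next) != old);

    return (uint32_t)((next & SK_KH_MASK) >> SK_KH_SHIFT);
}
```

    Both old and next are derived from a single load of the shared word,
    so the value compared and the value stored cannot come from two
    different reads.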
     
  • commit 2dca485f8740208604543c3960be31a5dd3ea603 upstream.

    Some control register changes will flush some aspects of the CPU, e.g.
    POP explicitly mentions that for CR9-CR11 "TLBs may be cleared".
    Instead of trying to be clever and only flush on specific CRs, let's
    play safe and flush on all lctl(g) as future machines might define
    new bits in CRs. Load control intercept should not happen that often.

    Signed-off-by: Christian Borntraeger
    Acked-by: Cornelia Huck
    Reviewed-by: David Hildenbrand
    Signed-off-by: Greg Kroah-Hartman

    Christian Borntraeger
     
  • commit a36c5393266222129ce6f622e3bc3fb5463f290c upstream.

    The monitor-class number field is only 16 bits, so we have to use
    a u16 pointer to access it.

    Signed-off-by: Thomas Huth
    Reviewed-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Greg Kroah-Hartman

    Thomas Huth
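
    Why the access width matters for a 16-bit field, as in the
    monitor-class commit above, can be shown with a small sketch. The
    memory image and function names are hypothetical; memcpy is used so
    the example does not depend on alignment or host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical memory image: a 16-bit field followed by unrelated bytes. */
struct sk_image {
    unsigned char bytes[4];
};

void sk_store_field(struct sk_image *img, uint16_t val)
{
    memcpy(img->bytes, &val, sizeof(val)); /* the field itself */
    img->bytes[2] = 0xAA;                  /* neighbouring data */
    img->bytes[3] = 0xBB;
}

/* Correct: read exactly the 16 bits that make up the field. */
uint16_t sk_read16(const struct sk_image *img)
{
    uint16_t v;

    memcpy(&v, img->bytes, sizeof(v));
    return v;
}

/* Too wide: the result also contains the two neighbouring bytes. */
uint32_t sk_read32(const struct sk_image *img)
{
    uint32_t v;

    memcpy(&v, img->bytes, sizeof(v));
    return v;
}
```

    After sk_store_field(&img, 7), sk_read16 returns 7 while sk_read32
    returns a value polluted by the neighbouring bytes.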
     
  • commit b65d6e17fe2239c9b2051727903955d922083fbf upstream.

    This feature is not supported inside KVM guests yet, because we do not emulate
    MSR_IA32_XSS. Mask it out.

    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
     
  • commit ab646f54f4fd1a8b9671b8707f0739fdd28ce2b1 upstream.

    commit d50eaa18039b ("KVM: x86: Perform limit checks when assigning EIP")
    mistakenly used zero as cpl on em_ret_far. Use the actual one.

    Fixes: d50eaa18039b8b848c2285478d0775335ad5e930
    Signed-off-by: Nadav Amit
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • commit df1daba7d1cb8ed7957f873cde5c9e953cbaa483 upstream.

    Userspace is expecting non-compacted format for KVM_GET_XSAVE, but
    struct xsave_struct might be using the compacted format. Convert
    in order to preserve userspace ABI.

    Likewise, userspace is passing non-compacted format for KVM_SET_XSAVE
    but the kernel will pass it to XRSTORS, and we need to convert back.

    Fixes: f31a9f7c71691569359fa7fb8b0acaa44bce0324
    Cc: Fenghua Yu
    Cc: H. Peter Anvin
    Tested-by: Nadav Amit
    Reviewed-by: Radim Krčmář
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
     
  • commit ba7b39203a3a18018173b87e73f27169bd8e5147 upstream.

    get_xsave_addr is the API to access XSAVE states, and KVM would
    like to use it. Export it.

    Cc: x86@kernel.org
    Cc: H. Peter Anvin
    Acked-by: Thomas Gleixner
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
     
  • commit 63f13448d81c910a284b096149411a719cbed501 upstream.

    Since both ppc and ppc64 have LE variants which are now reported by uname, add
    that flag (__AUDIT_ARCH_LE) to syscall_get_arch() and add AUDIT_ARCH_PPC64LE
    variant.

    Without this, perf trace and auditctl fail.

    Mainline kernel reports ppc64le (per a058801) but there is no matching
    AUDIT_ARCH_PPC64LE.

    Since 32-bit PPC LE is not supported by audit, don't advertise it in
    AUDIT_ARCH_PPC* variants.

    See:
    https://www.redhat.com/archives/linux-audit/2014-August/msg00082.html
    https://www.redhat.com/archives/linux-audit/2014-December/msg00004.html

    Signed-off-by: Richard Guy Briggs
    Acked-by: Paul Moore
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Richard Guy Briggs
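
    How the little-endian flag combines with the machine type in the
    audit commit above can be sketched as follows. The SK_ constants
    mirror the values of EM_PPC64, __AUDIT_ARCH_64BIT and __AUDIT_ARCH_LE
    as I understand them from the kernel's UAPI headers; treat them as
    assumptions and check the headers before relying on them. The
    function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_EM_PPC64         21u         /* assumed value of EM_PPC64 */
#define SK_AUDIT_ARCH_64BIT 0x80000000u /* assumed __AUDIT_ARCH_64BIT */
#define SK_AUDIT_ARCH_LE    0x40000000u /* assumed __AUDIT_ARCH_LE */

/* Build the audit arch token for 64-bit PowerPC, big or little endian. */
uint32_t sk_ppc64_audit_arch(int little_endian)
{
    uint32_t arch = SK_EM_PPC64 | SK_AUDIT_ARCH_64BIT;

    if (little_endian)
        arch |= SK_AUDIT_ARCH_LE;
    return arch;
}
```

    With these values the big-endian token is 0x80000015 and the
    little-endian one is 0xc0000015, so tools can tell the two apart.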
     
  • commit f34b6c72c3ebaa286d3311a825ef79eccbcca82f upstream.

    The 24x7 counters are continuously running and not updated on an
    interrupt. So we record the event counts when stopping the event or
    deleting it.

    But to "read" a single counter in 24x7, we allocate a page and pass it
    into the hypervisor (The HV returns the page full of counters from which
    we extract the specific counter for this event).

    We allocate a page using GFP_USER and when deleting the event, we end up
    with the following warning because we are blocking in interrupt context.

    [ 698.641709] BUG: scheduling while atomic: swapper/0/0/0x10010000

    We could use GFP_ATOMIC but that could result in failures. Pre-allocate
    a buffer so we don't have to allocate in interrupt context. Further as
    Michael Ellerman suggested, use Per-CPU buffer so we only need to
    allocate once per CPU.

    Signed-off-by: Sukadev Bhattiprolu
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    sukadev@linux.vnet.ibm.com
     
  • commit 8117ac6a6c2fa0f847ff6a21a1f32c8d2c8501d0 upstream.

    Currently, when going idle, we set the flag indicating that we are in
    nap mode (paca->kvm_hstate.hwthread_state) and then execute the nap
    (or sleep or rvwinkle) instruction, all with the MMU on. This is bad
    for two reasons: (a) the architecture specifies that those instructions
    must be executed with the MMU off, and in fact with only the SF, HV, ME
    and possibly RI bits set, and (b) this introduces a race, because as
    soon as we set the flag, another thread can switch the MMU to a guest
    context. If the race is lost, this thread will typically start looping
    on relocation-on ISIs at 0xc...4400.

    This fixes it by setting the MSR as required by the architecture before
    setting the flag or executing the nap/sleep/rvwinkle instruction.

    [ shreyas@linux.vnet.ibm.com: Edited to handle LE ]
    Signed-off-by: Paul Mackerras
    Signed-off-by: Shreyas B. Prabhu
    Cc: Benjamin Herrenschmidt
    Cc: Michael Ellerman
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Paul Mackerras
     
  • commit 682e77c861c4c60f79ffbeae5e1938ffed24a575 upstream.

    The existing MCE code calls flush_tlb hook with IS=0 (single page) resulting
    in partial invalidation of TLBs which is not right. This patch fixes
    that by passing IS=0xc00 to invalidate whole TLB for successful recovery
    from TLB and ERAT errors.

    Signed-off-by: Mahesh Salgaonkar
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Mahesh Salgaonkar
     
  • commit cd32e2dcc9de6c27ecbbfc0e2079fb64b42bad5f upstream.

    We have some code in udbg_uart_getc_poll() that tries to protect
    against a NULL udbg_uart_in, but gets it all wrong.

    Found with the LLVM static analyzer (scan-build).

    Fixes: 309257484cc1 ("powerpc: Cleanup udbg_16550 and add support for LPC PIO-only UARTs")
    Signed-off-by: Anton Blanchard
    [mpe: Add some newlines for readability while we're here]
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Anton Blanchard
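
    A correct guard against a NULL accessor, which is what the udbg
    commit above is about, could look like the sketch below. The names
    are hypothetical stand-ins for the udbg code; the register offsets
    follow the usual 16550 layout (RBR at 0, LSR at 5, data-ready in
    bit 0), and sk_fake_uart exists only to exercise the function.

```c
#include <assert.h>
#include <stddef.h>

#define SK_UART_RBR 0    /* receive buffer register */
#define SK_UART_LSR 5    /* line status register */
#define SK_LSR_DR   0x01 /* data ready */

/* Hypothetical register accessor; NULL when no UART has been set up. */
unsigned char (*sk_uart_in)(unsigned int reg);

int sk_getc_poll(void)
{
    if (!sk_uart_in)
        return -1; /* never call through a NULL pointer */
    if (!(sk_uart_in(SK_UART_LSR) & SK_LSR_DR))
        return -1; /* no character waiting */
    return sk_uart_in(SK_UART_RBR);
}

/* Fake UART for the example: always has the character 'x' ready. */
unsigned char sk_fake_uart(unsigned int reg)
{
    return reg == SK_UART_LSR ? SK_LSR_DR : 'x';
}
```

    Checking the pointer first and returning early keeps the NULL case
    and the "no data" case from sharing a branch that then calls through
    the pointer.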
     

09 Jan, 2015

1 commit

  • commit 7ff4d90b4c24a03666f296c3d4878cd39001e81e upstream.

    Today there are 3 instances of setgroups and due to an oversight their
    permission checking has diverged. Add a common function so that
    they may all share the same permission checking code.

    This corrects the current oversight in the current permission checks
    and adds a helper to avoid this in the future.

    A user namespace security fix will update this new helper, shortly.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman