28 Apr, 2014

1 commit


12 Apr, 2014

1 commit

  • Pull more ACPI and power management fixes and updates from Rafael Wysocki:
    "This is PM and ACPI material that has emerged over the last two weeks
    and one fix for a CPU hotplug regression introduced by the recent CPU
    hotplug notifiers registration series.

    Included are intel_idle and turbostat updates from Len Brown (these
    have been in linux-next for quite some time), a new cpufreq driver for
    powernv (that might spend some more time in linux-next, but BenH was
    asking me so nicely to push it for 3.15 that I couldn't resist), some
    cpufreq fixes and cleanups (including fixes for some silly breakage in
    a couple of cpufreq drivers introduced during the 3.14 cycle),
    assorted ACPI cleanups, wakeup framework documentation fixes, a new
    sysfs attribute for cpuidle and a new command line argument for power
    domains diagnostics.

    Specifics:

    - Fix for a recently introduced CPU hotplug regression in ARM KVM
    from Ming Lei.

    - Fixes for breakage in the at32ap, loongson2_cpufreq, and unicore32
    cpufreq drivers introduced during the 3.14 cycle (-stable material)
    from Chen Gang and Viresh Kumar.

    - New powernv cpufreq driver from Vaidyanathan Srinivasan, with bits
    from Gautham R Shenoy and Srivatsa S Bhat.

    - Exynos cpufreq driver fix preventing it from being included into
    multiplatform builds that aren't supported by it from Sachin Kamat.

    - cpufreq cleanups related to the usage of the driver_data field in
    struct cpufreq_frequency_table from Viresh Kumar.

    - cpufreq ppc driver cleanup from Sachin Kamat.

    - Intel BayTrail support for intel_idle and ACPI idle from Len Brown.

    - Intel CPU model 54 (Atom N2000 series) support for intel_idle from
    Jan Kiszka.

    - intel_idle fix for Intel Ivy Town residency targets from Len Brown.

    - turbostat updates (Intel Broadwell support and output cleanups)
    from Len Brown.

    - New cpuidle sysfs attribute for exporting C-states' target
    residency information to user space from Daniel Lezcano.

    - New kernel command line argument to prevent power domains enabled
    by the bootloader from being turned off even if they are not in use
    (for diagnostics purposes) from Tushar Behera.

    - Fixes for wakeup sysfs attributes documentation from Geert
    Uytterhoeven.

    - New ACPI video blacklist entry for ThinkPad Helix from Stephen
    Chandler Paul.

    - Assorted ACPI cleanups and a Kconfig help update from Jonghwan
    Choi, Zhihui Zhang, Hanjun Guo"

    * tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits)
    ACPI: Update the ACPI spec information in Kconfig
    arm, kvm: fix double lock on cpu_add_remove_lock
    cpuidle: sysfs: Export target residency information
    cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h
    cpufreq: create another field .flags in cpufreq_frequency_table
    cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table
    cpufreq: don't print value of .driver_data from core
    cpufreq: ia64: don't set .driver_data to index
    cpufreq: powernv: Select CPUFreq related Kconfig options for powernv
    cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids
    cpufreq: powernv: cpufreq driver for powernv platform
    cpufreq: at32ap: don't declare local variable as static
    cpufreq: loongson2_cpufreq: don't declare local variable as static
    cpufreq: unicore32: fix typo issue for 'clk'
    cpufreq: exynos: Disable on multiplatform build
    PM / wakeup: Correct presence vs. emptiness of wakeup_* attributes
    PM / domains: Add pd_ignore_unused to keep power domains enabled
    ACPI / dock: Drop dock_device_ids[] table
    ACPI / video: Favor native backlight interface for ThinkPad Helix
    ACPI / thermal: Fix wrong variable usage in debug statement
    ...

    Linus Torvalds
     

09 Apr, 2014

6 commits

  • Pull more powerpc updates from Ben Herrenschmidt:
    "Here are a few more powerpc things for you.

    So you'll find here the conversion of the two new firmware sysfs
    interfaces to the new API for self-removing files that Greg and Tejun
    introduced, so they can finally remove the old one.

    I'm also reverting the hwmon driver for powernv. I shouldn't have
    merged it, I got a bit carried away here. I hadn't realized it was
    never CCed to the relevant maintainer(s) and list(s), and happens to
    have some issues so I'm taking it out and it will come back via the
    proper channels.

    The rest is a bunch of LE fixes (argh, some of the new stuff was
    broken on LE, I really need to start testing LE myself!) and various
    random fixes here and there.

    Finally one bit that's not strictly a fix, which is the HVC OPAL
    change to "kick" the HVC thread when the firmware tells us there is
    new incoming data. I don't feel like waiting for this one, it's
    simple enough, and it makes a big difference in console responsiveness
    which is good for my nerves"

    * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (26 commits)
    powerpc/powernv Adapt opal-elog and opal-dump to new sysfs_remove_file_self
    Revert "powerpc/powernv: hwmon driver for power values, fan rpm and temperature"
    power, sched: stop updating inside arch_update_cpu_topology() when nothing to be update
    powerpc/le: Avoid creatng R_PPC64_TOCSAVE relocations for modules.
    arch/powerpc: Use RCU_INIT_POINTER(x, NULL) in platforms/cell/spu_syscalls.c
    powerpc/opal: Add missing include
    powerpc: Convert last uses of __FUNCTION__ to __func__
    powerpc: Add lq/stq emulation
    powerpc/powernv: Add invalid OPAL call
    powerpc/powernv: Add OPAL message log interface
    powerpc/book3s: Fix mc_recoverable_range buffer overrun issue.
    powerpc: Remove dead code in sycall entry
    powerpc: Use of_node_init() for the fakenode in msi_bitmap.c
    powerpc/mm: NUMA pte should be handled via slow path in get_user_pages_fast()
    powerpc/powernv: Fix endian issues with sensor code
    powerpc/powernv: Fix endian issues with OPAL async code
    tty/hvc_opal: Kick the HVC thread on OPAL console events
    powerpc/powernv: Add opal_notifier_unregister() and export to modules
    powerpc/ppc64: Do not turn AIL (reloc-on interrupts) too early
    powerpc/ppc64: Gracefully handle early interrupts
    ...

    Linus Torvalds
     
  • next-20140324 currently fails compiling celleb_defconfig with:

    arch/powerpc/include/asm/opal.h:894:42: error: 'struct notifier_block' declared inside parameter list [-Werror]
    arch/powerpc/include/asm/opal.h:894:42: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
    arch/powerpc/include/asm/opal.h:896:14: error: 'struct notifier_block' declared inside parameter list [-Werror]

    This is due to a missing include which is added here.

    Signed-off-by: Michael Neuling
    Signed-off-by: Benjamin Herrenschmidt

    Michael Neuling
     
  • Recent CPUs support quad word load and store instructions. Add
    support to the alignment handler for them.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
     
  • This call will not be understood by OPAL, and cause it to add an error
    to its log. Among other things, this is useful for testing the
    behaviour of the log as it fills up.

    Signed-off-by: Joel Stanley
    Signed-off-by: Benjamin Herrenschmidt

    Joel Stanley
     
  • OPAL provides an in-memory circular buffer containing a message log
    populated with various runtime messages produced by the firmware.

    Provide a sysfs interface /sys/firmware/opal/msglog for userspace to
    view the messages.

    Signed-off-by: Joel Stanley
    Signed-off-by: Benjamin Herrenschmidt

    Joel Stanley
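
A minimal sketch of how a reader might linearize a circular message log so that the oldest byte comes first. The struct layout and the names `msglog`, `wrapped`, and `msglog_read` are hypothetical; they do not reflect OPAL's actual in-memory format or the kernel's sysfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical circular log: 'pos' is the next write offset; 'wrapped'
 * is nonzero once the writer has overwritten old data at least once. */
struct msglog {
    const char *buf;
    size_t size;
    size_t pos;
    int wrapped;
};

/* Copy the log contents into 'out', oldest byte first.
 * Returns the number of bytes copied (at most 'out_len'). */
size_t msglog_read(const struct msglog *log, char *out, size_t out_len)
{
    size_t n = 0;

    assert(log->pos < log->size);

    if (log->wrapped) {
        /* Oldest data runs from pos to the end of the buffer. */
        size_t tail = log->size - log->pos;
        size_t take = tail < out_len ? tail : out_len;
        memcpy(out, log->buf + log->pos, take);
        n = take;
    }

    /* Newest data runs from the start of the buffer up to pos. */
    size_t head = log->pos;
    size_t room = out_len - n;
    size_t take = head < room ? head : room;
    memcpy(out + n, log->buf, take);
    return n + take;
}
```

Before the first wrap only the bytes below `pos` are valid, which is why the tail copy is skipped in that case.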
     
  • One OPAL call and one device tree property needed byte swapping.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
     

08 Apr, 2014

2 commits

  • * pm-cpufreq:
    cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h
    cpufreq: create another field .flags in cpufreq_frequency_table
    cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table
    cpufreq: don't print value of .driver_data from core
    cpufreq: ia64: don't set .driver_data to index
    cpufreq: powernv: Select CPUFreq related Kconfig options for powernv
    cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids
    cpufreq: powernv: cpufreq driver for powernv platform
    cpufreq: at32ap: don't declare local variable as static
    cpufreq: loongson2_cpufreq: don't declare local variable as static
    cpufreq: unicore32: fix typo issue for 'clk'
    cpufreq: exynos: Disable on multiplatform build

    Rafael J. Wysocki
     
  • Eliminate the following warning in proc/vmcore.c:

    fs/proc/vmcore.c:1088:6: warning: no previous prototype for `vmcore_cleanup' [-Wmissing-prototypes]

    [akpm@linux-foundation.org: clean up powerpc, remove unneeded EXPORT_SYMBOL]
    Signed-off-by: Rashika Kheria
    Reviewed-by: Josh Triplett
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rashika Kheria
     

07 Apr, 2014

4 commits

  • Backend driver to dynamically set voltage and frequency on
    IBM POWER non-virtualized platforms. Power management SPRs
    are used to set the required PState.

    This driver works in conjunction with cpufreq governors
    like 'ondemand' to provide a demand based frequency and
    voltage setting on IBM POWER non-virtualized platforms.

    PState table is obtained from OPAL v3 firmware through device
    tree.

    powernv_cpufreq back-end driver would parse the relevant device-tree
    nodes and initialise the cpufreq subsystem on powernv platform.

    The code was originally written by svaidy@linux.vnet.ibm.com. Over
    time it was modified to accommodate bug-fixes as well as updates to
    the cpu-freq core. Relevant portions of the change logs corresponding
    to those modifications are noted below:

    * The policy->cpus needs to be populated in a hotplug-invariant
    manner instead of using cpu_sibling_mask() which varies with
    cpu-hotplug. This is because the cpufreq core code copies this
    content into policy->related_cpus mask which should not vary on
    cpu-hotplug. [Authored by srivatsa.bhat@linux.vnet.ibm.com]

    * Create a helper routine that can return the cpu-frequency for the
    corresponding pstate_id. Also, cache the values of the pstate_max,
    pstate_min and pstate_nominal and nr_pstates in a static structure
    so that they can be reused in the future to perform any
    validations. [Authored by ego@linux.vnet.ibm.com]

    * Create a driver attribute named cpuinfo_nominal_freq which creates
    a sysfs read-only file named cpuinfo_nominal_freq. Export the
    frequency corresponding to the nominal_pstate through this
    interface.

    Nominal frequency is the highest non-turbo frequency for the
    platform. This is generally used for setting governor policies
    from user space for optimal energy efficiency. [Authored by
    ego@linux.vnet.ibm.com]

    * Implement a powernv_cpufreq_get(unsigned int cpu) method which will
    return the current operating frequency. Export this via the sysfs
    interface cpuinfo_cur_freq by setting powernv_cpufreq_driver.get to
    powernv_cpufreq_get(). [Authored by ego@linux.vnet.ibm.com]

    [Change log updated by ego@linux.vnet.ibm.com]

    Reviewed-by: Preeti U Murthy
    Signed-off-by: Vaidyanathan Srinivasan
    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Anton Blanchard
    Signed-off-by: Gautham R. Shenoy
    Signed-off-by: Rafael J. Wysocki

    Vaidyanathan Srinivasan
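
The helper that maps a PState id to a frequency can be pictured as a simple table lookup. This is a sketch under stated assumptions: `struct pstate_entry`, `pstate_id_to_freq`, and the convention of returning 0 for an unknown id are all hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table entry pairing a PState id with its frequency. */
struct pstate_entry {
    int id;
    unsigned int freq_khz;
};

/* Return the frequency for 'id', or 0 if the id is not in the table. */
unsigned int pstate_id_to_freq(const struct pstate_entry *tbl, size_t n, int id)
{
    assert(tbl != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (tbl[i].id == id)
            return tbl[i].freq_khz;
    }
    return 0;
}
```

A caller would build the table once from firmware-provided data and reuse it for every lookup, in the same spirit as caching the minimum, maximum, and nominal ids described above.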
     
  • OPAL defines opal_msg as a big endian struct so we have to
    byte swap it on little endian builds.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
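
In the kernel this kind of conversion is normally done with the be32_to_cpu() family of helpers. The portable user-space sketch below shows the same idea by assembling each value from its bytes. `struct msg_hdr` and its two fields are hypothetical and are not the real opal_msg layout.

```c
#include <assert.h>
#include <stdint.h>

/* Read a 32-bit big-endian value from raw bytes. The result is the same
 * on little- and big-endian hosts because only the byte order in memory
 * is assumed, not the host's integer layout. */
uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical message header, held in host byte order after decoding. */
struct msg_hdr {
    uint32_t type;
    uint32_t length;
};

/* Decode a header laid out as two consecutive big-endian words. */
struct msg_hdr decode_msg_hdr(const uint8_t raw[8])
{
    struct msg_hdr h;

    assert(raw != 0);
    h.type = read_be32(raw);
    h.length = read_be32(raw + 4);
    return h;
}
```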
     
    opal_notifier_register() has no matching "unregister" variant. Add
    one, and export both to modules.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     
  • The current kernel code assumes big endian and parses RTAS events all
    wrong. The most visible effect is that we cannot honor EPOW events,
    meaning, for example, we cannot shut down a guest properly from the
    hypervisor.

    This new patch is largely inspired by Nathan's work: we get rid of all
    the bit fields in the RTAS event structures (even the unused ones, for
    consistency). We also introduce endian safe accessors for the fields used
    by the kernel (trivial rtas_error_type() accessor added for consistency).

    Cc: Nathan Fontenot
    Signed-off-by: Greg Kurz
    Signed-off-by: Benjamin Herrenschmidt

    Greg Kurz
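
C leaves the ordering of bit fields within a storage unit to the implementation, and it typically differs between big- and little-endian ABIs, so shift-and-mask accessors are the portable alternative. The sketch below illustrates the technique on a single byte. The field layout and the `ev_*` names are assumptions made for illustration and should not be read as the actual RTAS event format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packed byte, most significant bit first:
 *   bits 7..5  severity
 *   bits 4..3  disposition
 *   bit  2     extended-log flag
 *   bits 1..0  reserved
 */
uint8_t ev_severity(uint8_t b)    { return (b & 0xE0) >> 5; }
uint8_t ev_disposition(uint8_t b) { return (b & 0x18) >> 3; }
uint8_t ev_extended(uint8_t b)    { return (b & 0x04) >> 2; }
```

Because each accessor works on the byte's value and not on a compiler-chosen bit-field layout, it returns the same result on either endianness.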
     

05 Apr, 2014

1 commit

  • Pull /dev/random changes from Ted Ts'o:
    "A number of cleanups plus support for the RDSEED instruction, which
    will be showing up in Intel Broadwell CPUs"

    * tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
    random: Add arch_has_random[_seed]()
    random: If we have arch_get_random_seed*(), try it before blocking
    random: Use arch_get_random_seed*() at init time and once a second
    x86, random: Enable the RDSEED instruction
    random: use the architectural HWRNG for the SHA's IV in extract_buf()
    random: clarify bits/bytes in wakeup thresholds
    random: entropy_bytes is actually bits
    random: simplify accounting code
    random: tighten bound on random_read_wakeup_thresh
    random: forget lock in lockless accounting
    random: simplify accounting logic
    random: fix comment on "account"
    random: simplify loop in random_read
    random: fix description of get_random_bytes
    random: fix comment on proc_do_uuid
    random: fix typos / spelling errors in comments

    Linus Torvalds
     

03 Apr, 2014

3 commits

  • Pull kvm updates from Paolo Bonzini:
    "PPC and ARM do not have much going on this time. Most of the cool
    stuff, instead, is in s390 and (after a few releases) x86.

    ARM has some caching fixes and PPC has transactional memory support in
    guests. MIPS has some fixes, with more probably coming in 3.16 as
    QEMU will soon get support for MIPS KVM.

    For x86 there are optimizations for debug registers, which trigger on
    some Windows games, and other important fixes for Windows guests. We
    now expose to the guest Broadwell instruction set extensions and also
    Intel MPX. There's also a fix/workaround for OS X guests, nested
    virtualization features (preemption timer), and a couple kvmclock
    refinements.

    For s390, the main news is asynchronous page faults, together with
    improvements to IRQs (floating irqs and adapter irqs) that speed up
    virtio devices"

    * tag 'kvm-3.15-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (96 commits)
    KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8
    KVM: PPC: Book3S HV: Fix decrementer timeouts with non-zero TB offset
    KVM: PPC: Book3S HV: Don't use kvm_memslots() in real mode
    KVM: PPC: Book3S HV: Return ENODEV error rather than EIO
    KVM: PPC: Book3S: Trim top 4 bits of physical address in RTAS code
    KVM: PPC: Book3S HV: Add get/set_one_reg for new TM state
    KVM: PPC: Book3S HV: Add transactional memory support
    KVM: Specify byte order for KVM_EXIT_MMIO
    KVM: vmx: fix MPX detection
    KVM: PPC: Book3S HV: Fix KVM hang with CONFIG_KVM_XICS=n
    KVM: PPC: Book3S: Introduce hypervisor call H_GET_TCE
    KVM: PPC: Book3S HV: Fix incorrect userspace exit on ioeventfd write
    KVM: s390: clear local interrupts at cpu initial reset
    KVM: s390: Fix possible memory leak in SIGP functions
    KVM: s390: fix calculation of idle_mask array size
    KVM: s390: randomize sca address
    KVM: ioapic: reinject pending interrupts on KVM_SET_IRQCHIP
    KVM: Bump KVM_MAX_IRQ_ROUTES for s390
    KVM: s390: irq routing for adapter interrupts.
    KVM: s390: adapter interrupt sources
    ...

    Linus Torvalds
     
  • Pull powerpc non-virtualized cpuidle from Ben Herrenschmidt:
    "This is the branch I mentioned in my other pull request which contains
    our improved cpuidle support for the "powernv" platform
    (non-virtualized).

    It adds support for the "fast sleep" feature of the processor which
    provides higher power savings than our usual "nap" mode but at the
    cost of losing the timers while asleep, and thus exploits the new
    timer broadcast framework to work around that limitation.

    It's based on a tip timer tree that you seem to have already merged"

    * 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
    cpuidle/powernv: Parse device tree to setup idle states
    cpuidle/powernv: Add "Fast-Sleep" CPU idle state
    powerpc/powernv: Add OPAL call to resync timebase on wakeup
    powerpc/powernv: Add context management for Fast Sleep
    powerpc: Split timer_interrupt() into timer handling and interrupt handling routines
    powerpc: Implement tick broadcast IPI as a fixed IPI message
    powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message

    Linus Torvalds
     
  • Pull main powerpc updates from Ben Herrenschmidt:
    "This time around, the powerpc merges are going to be a little bit more
    complicated than usual.

    This is the main pull request with most of the work for this merge
    window. I will describe it a bit more further down.

    There is some additional cpuidle driver work, however I haven't
    included it in this tree as it depends on some work in tip/timer-core
    which Thomas accidentally forgot to put in a topic branch. Since I
    didn't want to carry all of that tip timer stuff in powerpc -next, I
    set up a separate branch on top of Thomas' tree with just that cpuidle
    driver in it, and Stephen has been carrying that in next separately
    for a while now. I'll send a separate pull request for it.

    Additionally, two new pieces in this tree add users for a sysfs API
    that Tejun and Greg have been deprecating in drivers-core-next.
    Thankfully Greg reverted the patch that removes the old API so this
    merge can happen cleanly, but once merged, I will send a patch
    adjusting our new code to the new API so that Greg can send you the
    removal patch.

    Now as for the content of this branch, we have a lot of perf work for
    power8 new counters including support for our new "nest" counters
    (also called 24x7) under pHyp (not natively yet).

    We have new functionality when running under the OPAL firmware
    (non-virtualized or KVM host), such as access to the firmware error
    logs and service processor dumps, system parameters and sensors, along
    with a hwmon driver for the latter.

    There's also a bunch of bug fixes across the board, some LE fixes,
    and a nice set of selftests for validating our various types of copy
    loops.

    On the Freescale side, we see mostly new chip/board revisions, some
    clock updates, better support for machine checks and debug exceptions,
    etc..."

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (70 commits)
    powerpc/book3s: Fix CFAR clobbering issue in machine check handler.
    powerpc/compat: 32-bit little endian machine name is ppcle, not ppc
    powerpc/le: Big endian arguments for ppc_rtas()
    powerpc: Use default set of netfilter modules (CONFIG_NETFILTER_ADVANCED=n)
    powerpc/defconfigs: Enable THP in pseries defconfig
    powerpc/mm: Make sure a local_irq_disable prevent a parallel THP split
    powerpc: Rate-limit users spamming kernel log buffer
    powerpc/perf: Fix handling of L3 events with bank == 1
    powerpc/perf/hv_{gpci, 24x7}: Add documentation of device attributes
    powerpc/perf: Add kconfig option for hypervisor provided counters
    powerpc/perf: Add support for the hv 24x7 interface
    powerpc/perf: Add support for the hv gpci (get performance counter info) interface
    powerpc/perf: Add macros for defining event fields & formats
    powerpc/perf: Add a shared interface to get gpci version and capabilities
    powerpc/perf: Add 24x7 interface headers
    powerpc/perf: Add hv_gpci interface header
    powerpc: Add hvcalls for 24x7 and gpci (Get Performance Counter Info)
    sysfs: create bin_attributes under the requested group
    powerpc/perf: Enable BHRB access for EBB events
    powerpc/perf: Add BHRB constraint and IFM MMCRA handling for EBB
    ...

    Linus Torvalds
     

02 Apr, 2014

1 commit

  • Pull char/misc driver patches from Greg KH:
    "Here's the big char/misc driver updates for 3.15-rc1.

    Lots of various things here, including the new mcb driver subsystem.

    All of these have been in linux-next for a while"

    * tag 'char-misc-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (118 commits)
    extcon: Move OF helper function to extcon core and change function name
    extcon: of: Remove unnecessary function call by using the name of device_node
    extcon: gpio: Use SIMPLE_DEV_PM_OPS macro
    extcon: palmas: Use SIMPLE_DEV_PM_OPS macro
    mei: don't use deprecated DEFINE_PCI_DEVICE_TABLE macro
    mei: amthif: fix checkpatch error
    mei: client.h fix checkpatch errors
    mei: use cl_dbg where appropriate
    mei: fix Unnecessary space after function pointer name
    mei: report consistently copy_from/to_user failures
    mei: drop pr_fmt macros
    mei: make me hw headers private to me hw.
    mei: fix memory leak of pending write cb objects
    mei: me: do not reset when less than expected data is received
    drivers: mcb: Fix build error discovered by 0-day bot
    cs5535-mfgpt: Simplify dependencies
    spmi: pm: drop bus-level PM suspend/resume routines
    spmi: pmic_arb: make selectable on ARCH_QCOM
    Drivers: hv: vmbus: Increase the limit on the number of pfns we can handle
    pch_phub: Report error writing MAC back to user
    ...

    Linus Torvalds
     

01 Apr, 2014

2 commits

  • Pull scheduler changes from Ingo Molnar:
    "Bigger changes:

    - sched/idle restructuring: they are WIP preparation for deeper
    integration between the scheduler and idle state selection, by
    Nicolas Pitre.

    - add NUMA scheduling pseudo-interleaving, by Rik van Riel.

    - optimize cgroup context switches, by Peter Zijlstra.

    - RT scheduling enhancements, by Thomas Gleixner.

    The rest is smaller changes, non-urgent fixes and cleanups"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits)
    sched: Clean up the task_hot() function
    sched: Remove double calculation in fix_small_imbalance()
    sched: Fix broken setscheduler()
    sparc64, sched: Remove unused sparc64_multi_core
    sched: Remove unused mc_capable() and smt_capable()
    sched/numa: Move task_numa_free() to __put_task_struct()
    sched/fair: Fix endless loop in idle_balance()
    sched/core: Fix endless loop in pick_next_task()
    sched/fair: Push down check for high priority class task into idle_balance()
    sched/rt: Fix picking RT and DL tasks from empty queue
    trace: Replace hardcoding of 19 with MAX_NICE
    sched: Guarantee task priority in pick_next_task()
    sched/idle: Remove stale old file
    sched: Put rq's sched_avg under CONFIG_FAIR_GROUP_SCHED
    cpuidle/arm64: Remove redundant cpuidle_idle_call()
    cpuidle/powernv: Remove redundant cpuidle_idle_call()
    sched, nohz: Exclude isolated cores from load balancing
    sched: Fix select_task_rq_fair() description comments
    workqueue: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE
    sys: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE
    ...

    Linus Torvalds
     
  • Pull core locking updates from Ingo Molnar:
    "The biggest change is the MCS spinlock generalization changes from Tim
    Chen, Peter Zijlstra, Jason Low et al. There's also lockdep
    fixes/enhancements from Oleg Nesterov, in particular a false negative
    fix related to lockdep_set_novalidate_class() usage"

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
    locking/mutex: Fix debug checks
    locking/mutexes: Add extra reschedule point
    locking/mutexes: Introduce cancelable MCS lock for adaptive spinning
    locking/mutexes: Unlock the mutex without the wait_lock
    locking/mutexes: Modify the way optimistic spinners are queued
    locking/mutexes: Return false if task need_resched() in mutex_can_spin_on_owner()
    locking: Move mcs_spinlock.h into kernel/locking/
    m68k: Skip futex_atomic_cmpxchg_inatomic() test
    futex: Allow architectures to skip futex_atomic_cmpxchg_inatomic() test
    Revert "sched/wait: Suppress Sparse 'variable shadowing' warning"
    lockdep: Change lockdep_set_novalidate_class() to use _and_name
    lockdep: Change mark_held_locks() to check hlock->check instead of lockdep_no_validate
    lockdep: Don't create the wrong dependency on hlock->check == 0
    lockdep: Make held_lock->check and "int check" argument bool
    locking/mcs: Allow architecture specific asm files to be used for contended case
    locking/mcs: Order the header files in Kbuild of each architecture in alphabetical order
    sched/wait: Suppress Sparse 'variable shadowing' warning
    hung_task/Documentation: Fix hung_task_warnings description
    locking/mcs: Allow architectures to hook in to contended paths
    locking/mcs: Micro-optimize the MCS code, add extra comments
    ...

    Linus Torvalds
     

29 Mar, 2014

4 commits

  • …aulus/powerpc into kvm-next

    Paolo Bonzini
     
  • Currently we save the host PMU configuration, counter values, etc.,
    when entering a guest, and restore it on return from the guest.
    (We have to do this because the guest has control of the PMU while
    it is executing.) However, we missed saving/restoring the SIAR and
    SDAR registers, as well as the registers which are new on POWER8,
    namely SIER and MMCR2.

    This adds code to save the values of these registers when entering
    the guest and restore them on exit. This also works around the bug
    in POWER8 where setting PMAE with a counter already negative doesn't
    generate an interrupt.

    Signed-off-by: Paul Mackerras
    Acked-by: Scott Wood

    Paul Mackerras
     
  • With HV KVM, some high-frequency hypercalls such as H_ENTER are handled
    in real mode, and need to access the memslots array for the guest.
    Accessing the memslots array is safe, because we hold the SRCU read
    lock for the whole time that a guest vcpu is running. However, the
    checks that kvm_memslots() does when lockdep is enabled are potentially
    unsafe in real mode, when only the linear mapping is available.
    Furthermore, kvm_memslots() can be called from a secondary CPU thread,
    which is an offline CPU from the point of view of the host kernel,
    and is not running the task which holds the SRCU read lock.

    To avoid false positives in the checks in kvm_memslots(), and to avoid
    possible side effects from doing the checks in real mode, this replaces
    kvm_memslots() with kvm_memslots_raw() in all the places that execute
    in real mode. kvm_memslots_raw() is a new function that is like
    kvm_memslots() but uses rcu_dereference_raw_notrace() instead of
    kvm_dereference_check().

    Signed-off-by: Paul Mackerras
    Acked-by: Scott Wood

    Paul Mackerras
     
  • This adds saving of the transactional memory (TM) checkpointed state
    on guest entry and exit. We only do this if we see that the guest has
    an active transaction.

    It also adds emulation of the TM state changes when delivering IRQs
    into the guest. According to the architecture, if we are
    transactional when an IRQ occurs, the TM state is changed to
    suspended, otherwise it's left unchanged.

    Signed-off-by: Michael Neuling
    Signed-off-by: Paul Mackerras
    Acked-by: Scott Wood

    Michael Neuling
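
The interrupt-delivery rule described above amounts to a small state transition. The sketch below models it with an illustrative enum; the enum values and the function name are assumptions and are not the kernel's MSR bit encoding.

```c
#include <assert.h>

/* Illustrative transaction states; not the kernel's MSR bit encoding. */
enum tm_state { TM_NONE, TM_SUSPENDED, TM_TRANSACTIONAL };

/* State after an interrupt is delivered: an active transaction becomes
 * suspended; any other state is left unchanged. */
enum tm_state tm_state_on_irq(enum tm_state s)
{
    return s == TM_TRANSACTIONAL ? TM_SUSPENDED : s;
}
```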
     

26 Mar, 2014

2 commits

  • This introduces the H_GET_TCE hypervisor call, which is basically the
    reverse of H_PUT_TCE, as defined in the Power Architecture Platform
    Requirements (PAPR).

    The hcall H_GET_TCE is required by the kdump kernel, which uses it to
    retrieve TCEs set up by the previous (panicked) kernel.

    Signed-off-by: Laurent Dufour
    Signed-off-by: Alexander Graf
    Signed-off-by: Paul Mackerras

    Laurent Dufour
     
  • When the guest does an MMIO write which is handled successfully by an
    ioeventfd, ioeventfd_write() returns 0 (success) and
    kvmppc_handle_store() returns EMULATE_DONE. Then
    kvmppc_emulate_mmio() converts EMULATE_DONE to RESUME_GUEST_NV and
    this causes an exit from the loop in kvmppc_vcpu_run_hv(), causing an
    exit back to userspace with a bogus exit reason code, typically
    causing userspace (e.g. qemu) to crash with a message about an unknown
    exit code.

    This adds handling of RESUME_GUEST_NV in kvmppc_vcpu_run_hv() in order
    to fix that. For generality, we define a helper to check for either
    of the return-to-guest codes we use, RESUME_GUEST and RESUME_GUEST_NV,
    to make it easy to check for either and provide one place to update if
    any other return-to-guest code gets defined in future.

    Since it only affects Book3S HV for now, the helper is added to
    the kvm_book3s.h header file.

    We use the helper in two places in kvmppc_run_core() as well for
    future-proofing, though we don't see RESUME_GUEST_NV in either place
    at present.

    [paulus@samba.org - combined 4 patches into one, rewrote description]

    Suggested-by: Paul Mackerras
    Signed-off-by: Alexey Kardashevskiy
    Signed-off-by: Greg Kurz
    Signed-off-by: Paul Mackerras

    Greg Kurz
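
A sketch of the kind of helper the message describes: one predicate that accepts either return-to-guest code, used by a loop that keeps re-entering the guest. The numeric values of the codes, the name `is_resume_guest`, and the toy loop `run_until_host_exit` are assumptions made for illustration; they are not copied from kvm_book3s.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative return codes; the numeric values are assumptions. */
enum {
    RESUME_GUEST    = 0,
    RESUME_GUEST_NV = 1,
    RESUME_HOST     = 2,
    RESUME_HOST_NV  = 3,
};

/* True for either of the two return-to-guest codes. */
bool is_resume_guest(int r)
{
    return r == RESUME_GUEST || r == RESUME_GUEST_NV;
}

/* Toy run loop: keep re-entering the guest while the handler says so.
 * 'codes' stands in for the results of successive guest exits.
 * Returns the index of the first exit that must go to the host, or n. */
int run_until_host_exit(const int *codes, int n)
{
    int i = 0;

    while (i < n && is_resume_guest(codes[i]))
        i++;
    return i;
}
```

Keeping the comparison in one predicate means a future return-to-guest code only has to be added in a single place.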
     

24 Mar, 2014

11 commits


20 Mar, 2014

2 commits

  • Add predicate functions for having arch_get_random[_seed]*(). The
    only current use is to avoid the loop in arch_random_refill() when
    arch_get_random_seed_long() is unavailable.

    Signed-off-by: H. Peter Anvin
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Theodore Ts'o

    H. Peter Anvin
     
  • Upcoming Intel silicon adds a new RDSEED instruction, which is similar
    to RDRAND but provides a stronger guarantee: unlike RDRAND, RDSEED
    will always reseed the PRNG from the true random number source between
    each read. Thus, the output of RDSEED is guaranteed to be 100%
    entropic, unlike RDRAND which is only architecturally guaranteed to be
    1/512 entropic (although in practice it is much more).

    The RDSEED instruction takes the same time to execute as RDRAND, but
    RDSEED unlike RDRAND can legitimately return failure (CF=0) due to
    entropy exhaustion if too many threads on too many cores are hammering
    the RDSEED instruction at the same time. Therefore, we have to be
    more conservative and only use it in places where we can tolerate
    failures.

    This patch introduces the primitives arch_get_random_seed_{int,long}()
    but does not use them yet.

    Signed-off-by: H. Peter Anvin
    Reviewed-by: Ingo Molnar
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Michael Ellerman
    Signed-off-by: Theodore Ts'o

    H. Peter Anvin
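
Because a source like this can legitimately report failure, callers generally bound their retries and fall back to something else. The sketch below shows that pattern with a simulated source instead of the real instruction. `seed_fn`, `get_seed_bounded`, and `flaky_seed` are hypothetical names, and the retry limit is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical fallible source, modelling an instruction that reports
 * failure when no entropy is currently available. */
typedef bool (*seed_fn)(void *ctx, unsigned long *out);

/* Try the source at most 'max_tries' times; report failure so the
 * caller can fall back to another source instead of spinning forever. */
bool get_seed_bounded(seed_fn get, void *ctx, unsigned long *out, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (get(ctx, out))
            return true;
    }
    return false;
}

/* Demo source: fails '*remaining_failures' times, then succeeds. */
bool flaky_seed(void *ctx, unsigned long *out)
{
    int *remaining_failures = ctx;

    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return false;
    }
    *out = 0x5eedUL;
    return true;
}
```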