13 Sep, 2013

4 commits

  • After the last architecture switched to generic hard irqs the config
    options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
    for !CONFIG_GENERIC_HARDIRQS can be removed.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Merge more patches from Andrew Morton:
    "The rest of MM. Plus one misc cleanup"

    * emailed patches from Andrew Morton : (35 commits)
    mm/Kconfig: add MMU dependency for MIGRATION.
    kernel: replace strict_strto*() with kstrto*()
    mm, thp: count thp_fault_fallback anytime thp fault fails
    thp: consolidate code between handle_mm_fault() and do_huge_pmd_anonymous_page()
    thp: do_huge_pmd_anonymous_page() cleanup
    thp: move maybe_pmd_mkwrite() out of mk_huge_pmd()
    mm: cleanup add_to_page_cache_locked()
    thp: account anon transparent huge pages into NR_ANON_PAGES
    truncate: drop 'oldsize' truncate_pagecache() parameter
    mm: make lru_add_drain_all() selective
    memcg: document cgroup dirty/writeback memory statistics
    memcg: add per cgroup writeback pages accounting
    memcg: check for proper lock held in mem_cgroup_update_page_stat
    memcg: remove MEMCG_NR_FILE_MAPPED
    memcg: reduce function dereference
    memcg: avoid overflow caused by PAGE_ALIGN
    memcg: rename RESOURCE_MAX to RES_COUNTER_MAX
    memcg: correct RESOURCE_MAX to ULLONG_MAX
    mm: memcg: do not trap chargers with full callstack on OOM
    mm: memcg: rework and document OOM waiting and wakeup
    ...

    Linus Torvalds
     
  • Unlike global OOM handling, memory cgroup code will invoke the OOM killer
    in any OOM situation because it has no way of telling faults occurring in
    kernel context - which could be handled more gracefully - from
    user-triggered faults.

    Pass a flag that identifies faults originating in user space from the
    architecture-specific fault handlers to generic code so that memcg OOM
    handling can be improved.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Michal Hocko
    Cc: David Rientjes
    Cc: KAMEZAWA Hiroyuki
    Cc: azurIt
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Pull IOMMU Updates from Joerg Roedel:
    "This round the updates contain:

    - A new driver for the Freescale PAMU IOMMU from Varun Sethi.

    This driver has cooked for a while and required changes to the
    IOMMU-API and infrastructure that were already merged before.

    - Updates for the ARM-SMMU driver from Will Deacon

    - Various fixes, the most important one is probably a fix from Alex
    Williamson for a memory leak in the VT-d page-table freeing code

    In summary not all that much. The biggest part in the diffstat is the
    new PAMU driver"

    * tag 'iommu-updates-v3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
    intel-iommu: Fix leaks in pagetable freeing
    iommu/amd: Fix resource leak in iommu_init_device()
    iommu/amd: Clean up unnecessary MSI/MSI-X capability find
    iommu/arm-smmu: Simplify VMID and ASID allocation
    iommu/arm-smmu: Don't use VMIDs for stage-1 translations
    iommu/arm-smmu: Tighten up global fault reporting
    iommu/arm-smmu: Remove broken big-endian check
    iommu/fsl: Remove unnecessary 'fsl-pamu' prefixes
    iommu/fsl: Fix whitespace problems noticed by git-am
    iommu/fsl: Freescale PAMU driver and iommu implementation.
    iommu/fsl: Add additional iommu attributes required by the PAMU driver.
    powerpc: Add iommu domain pointer to device archdata
    iommu/exynos: Remove dead code (set_prefbuf)

    Linus Torvalds
     

12 Sep, 2013

2 commits

  • Joerg Roedel
     
  • Currently hugepage migration works well only for pmd-based hugepages
    (mainly due to lack of testing), so we had better not enable migration of
    other levels of hugepages until we are ready for it.

    Some users of hugepage migration (mbind, move_pages, and migrate_pages) do
    a page table walk and check pud/pmd_huge() there, so they are safe. But the
    other users (softoffline and memory hotremove) don't do this, so without
    this patch they can try to migrate unexpected types of hugepages.

    To prevent this, we introduce hugepage_migration_support() as an
    architecture-dependent check of whether hugepages are implemented on a pmd
    basis or not. On some architectures multiple sizes of hugepages are
    available, so hugepage_migration_support() also checks the hugepage size.
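
The check described above can be modelled in userspace C. This is only a sketch under stated assumptions: `struct hstate`, `PAGE_SHIFT`, `PMD_SHIFT` and `arch_has_pmd_hugepages()` are simplified stand-ins with example values, not the kernel's definitions.

```c
#include <stdbool.h>

/* Assumed example values: 4 KiB base pages, 2 MiB pmd-level huge pages. */
#define PAGE_SHIFT 12
#define PMD_SHIFT  21

/* Hypothetical stand-in for the kernel's per-size hugepage state. */
struct hstate {
    unsigned int order;   /* huge page size is (1 << PAGE_SHIFT) << order */
};

static unsigned int huge_page_shift(const struct hstate *h)
{
    return h->order + PAGE_SHIFT;
}

/* Assumption: this architecture implements hugepages at the pmd level. */
static bool arch_has_pmd_hugepages(void)
{
    return true;
}

/*
 * Migration is allowed only when the architecture uses pmd-based
 * hugepages AND this particular hugepage size is the pmd size.
 */
static bool hugepage_migration_support(const struct hstate *h)
{
    return arch_has_pmd_hugepages() && huge_page_shift(h) == PMD_SHIFT;
}
```

With these example values, a 2 MiB hstate (order 9) passes the check and a 1 GiB hstate (order 18) does not.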

    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Hillf Danton
    Cc: Wanpeng Li
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Cc: KOSAKI Motohiro
    Cc: Michal Hocko
    Cc: Rik van Riel
    Cc: "Aneesh Kumar K.V"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

11 Sep, 2013

5 commits

  • When adding cpuidle support to pSeries, we introduced two
    regressions:

    - The new cpuidle backend driver only works under hypervisors
    supporting the "SLPLAR" option, which isn't the case of the
    old POWER4 hypervisor and the HV "light" used on js2x blades

    - The cpuidle driver registers fairly late, meaning that for
    a significant portion of the boot process, we end up having
    all threads spinning. This slows down the boot process and
    increases the overall resource usage if the hypervisor has
    shared processors.

    This fixes both by implementing a "default" idle that will cede
    to the hypervisor when possible, in a very simple way without
    all the bells and whistles of cpuidle.

    Reported-by: Paul Mackerras
    Signed-off-by: Vaidyanathan Srinivasan
    Acked-by: Deepthi Dharwar
    Signed-off-by: Benjamin Herrenschmidt
    CC:

    Vaidyanathan Srinivasan
     
  • While cross-building for PPC64 I've got

    WARNING: vmlinux.o(.text.unlikely+0x1ba): Section mismatch in
    reference from the function .prom_rtas_call() to the variable
    .init.data:dt_string_start The function .prom_rtas_call() references
    the variable __initdata dt_string_start. This is often because
    .prom_rtas_call lacks a __initdata annotation or the annotation of
    dt_string_start is wrong.

    WARNING: vmlinux.o(.meminit.text+0xeb0): Section mismatch in reference
    from the function .free_area_init_core.isra.47() to the function
    .init.text:.set_pageblock_order() The function __meminit
    .free_area_init_core.isra.47() references a function __init
    .set_pageblock_order(). If .set_pageblock_order is only used by
    .free_area_init_core.isra.47 then annotate .set_pageblock_order with a
    matching annotation.

    Fix it by proper annotation of prom_rtas_call.

    Signed-off-by: Vladimir Murzin
    Signed-off-by: Benjamin Herrenschmidt

    Vladimir Murzin
     
  • stack_grow_into/14082 is trying to acquire lock:
    (&mm->mmap_sem){++++++}, at: [] .might_fault+0x78/0xe0

    but task is already holding lock:
    (&mm->mmap_sem){++++++}, at: [] .do_page_fault+0x24c/0x910

    other info that might help us debug this:
    Possible unsafe locking scenario:

    CPU0
    ----
    lock(&mm->mmap_sem);
    lock(&mm->mmap_sem);

    *** DEADLOCK ***

    May be due to missing lock nesting notation

    1 lock held by stack_grow_into/14082:
    #0: (&mm->mmap_sem){++++++}, at: [] .do_page_fault+0x24c/0x910

    stack backtrace:
    CPU: 21 PID: 14082 Comm: stack_grow_into Not tainted 3.10.0-10.el7.ppc64.debug #1
    Call Trace:
    [c0000003d396b850] [c000000000016e7c] .show_stack+0x7c/0x1f0 (unreliable)
    [c0000003d396b920] [c000000000813fc8] .dump_stack+0x28/0x3c
    [c0000003d396b990] [c000000000124b90] .__lock_acquire+0x1640/0x1800
    [c0000003d396bab0] [c00000000012570c] .lock_acquire+0xac/0x250
    [c0000003d396bb80] [c000000000206d54] .might_fault+0xa4/0xe0
    [c0000003d396bbf0] [c0000000007ffe2c] .do_page_fault+0x2ec/0x910
    [c0000003d396be30] [c0000000000092e8] handle_page_fault+0x10/0x30

    Signed-off-by: Aneesh Kumar K.V
    Signed-off-by: Benjamin Herrenschmidt

    Aneesh Kumar K.V
     
  • powerpc allmodconfig build fails with:

    ERROR: ".cpu_to_chip_id" [drivers/block/mtip32xx/mtip32xx.ko] undefined!

    The problem was introduced with commit 15863ff3b (powerpc: Make chip-id
    information available to userspace).

    Export the missing symbol.

    Cc: Vasant Hegde
    Cc: Shivaprasad G Bhat
    Signed-off-by: Guenter Roeck
    Signed-off-by: Benjamin Herrenschmidt

    Guenter Roeck
     
  • Pull device tree core updates from Grant Likely:
    "Generally minor changes. A bunch of bug fixes, particularly for
    initialization and some refactoring. Most notable change is feeding
    the entire flattened tree into the random pool at boot. May not be
    significant, but shouldn't hurt either"

    Tim Bird questions whether the boot time cost of the random feeding may
    be noticeable. And "add_device_randomness()" is definitely not some
    speed demon of a function.

    * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
    of/platform: add error reporting to of_amba_device_create()
    irq/of: Fix comment typo for irq_of_parse_and_map
    of: Feed entire flattened device tree into the random pool
    of/fdt: Clean up casting in unflattening path
    of/fdt: Remove duplicate memory clearing on FDT unflattening
    gpio: implement gpio-ranges binding document fix
    of: call __of_parse_phandle_with_args from of_parse_phandle
    of: introduce of_parse_phandle_with_fixed_args
    of: move of_parse_phandle()
    of: move documentation of of_parse_phandle_with_args
    of: Fix missing memory initialization on FDT unflattening
    of: consolidate definition of early_init_dt_alloc_memory_arch()
    of: Make of_get_phy_mode() return int i.s.o. const int
    include: dt-binding: input: create a DT header defining key codes.
    of/platform: Staticize of_platform_device_create_pdata()
    of: Specify initrd location using 64-bit
    dt: Typo fix
    OF: make of_property_for_each_{u32|string}() use parameters if OF is not enabled

    Linus Torvalds
     

10 Sep, 2013

1 commit

  • Pull xfs updates from Ben Myers:
    "For 3.12-rc1 there are a number of bugfixes in addition to work to
    ease usage of shared code between libxfs and the kernel, the rest of
    the work to enable project and group quotas to be used simultaneously,
    performance optimisations in the log and the CIL, directory entry file
    type support, fixes for log space reservations, some spelling/grammar
    cleanups, and the addition of user namespace support.

    - introduce readahead to log recovery
    - add directory entry file type support
    - fix a number of spelling errors in comments
    - introduce new Q_XGETQSTATV quotactl for project quotas
    - add USER_NS support
    - log space reservation rework
    - CIL optimisations
    - kernel/userspace libxfs rework"

    * tag 'xfs-for-linus-v3.12-rc1' of git://oss.sgi.com/xfs/xfs: (112 commits)
    xfs: XFS_MOUNT_QUOTA_ALL needed by userspace
    xfs: dtype changed xfs_dir2_sfe_put_ino to xfs_dir3_sfe_put_ino
    Fix wrong flag ASSERT in xfs_attr_shortform_getvalue
    xfs: finish removing IOP_* macros.
    xfs: inode log reservations are too small
    xfs: check correct status variable for xfs_inobt_get_rec() call
    xfs: inode buffers may not be valid during recovery readahead
    xfs: check LSN ordering for v5 superblocks during recovery
    xfs: btree block LSN escaping to disk uninitialised
    XFS: Assertion failed: first < BBTOB(bp->b_length), file: fs/xfs/xfs_trans_buf.c, line: 568
    xfs: fix bad dquot buffer size in log recovery readahead
    xfs: don't account buffer cancellation during log recovery readahead
    xfs: check for underflow in xfs_iformat_fork()
    xfs: xfs_dir3_sfe_put_ino can be static
    xfs: introduce object readahead to log recovery
    xfs: Simplify xfs_ail_min() with list_first_entry_or_null()
    xfs: Register hotcpu notifier after initialization
    xfs: add xfs sb v4 support for dirent filetype field
    xfs: Add write support for dirent filetype field
    xfs: Add read-only support for dirent filetype field
    ...

    Linus Torvalds
     

07 Sep, 2013

3 commits

  • Pull ARM SoC platform changes from Olof Johansson:
    "This branch contains mostly additions and changes to platform
    enablement and SoC-level drivers. Since there's sometimes a
    dependency on device-tree changes, there's also a fair amount of
    those in this branch.

    Pieces worth mentioning are:

    - Mbus driver for Marvell platforms, allowing kernel configuration
    and resource allocation of on-chip peripherals.
    - Enablement of the mbus infrastructure from Marvell PCI-e drivers.
    - Preparation of MSI support for Marvell platforms.
    - Addition of new PCI-e host controller driver for Tegra platforms
    - Some churn caused by sharing of macro names between i.MX 6Q and 6DL
    platforms in the device tree sources and header files.
    - Various suspend/PM updates for Tegra, including LP1 support.
    - Versatile Express support for MCPM, part of big little support.
    - Allwinner platform support for A20 and A31 SoCs (dual and quad
    Cortex-A7)
    - OMAP2+ support for DRA7, a new Cortex-A15-based SoC.

    The code that touches other architectures are patches moving MSI
    arch-specific functions over to weak symbols and removal of
    ARCH_SUPPORTS_MSI, acked by PCI maintainers"

    * tag 'soc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (266 commits)
    tegra-cpuidle: provide stub when !CONFIG_CPU_IDLE
    PCI: tegra: replace devm_request_and_ioremap by devm_ioremap_resource
    ARM: tegra: Drop ARCH_SUPPORTS_MSI and sort list
    ARM: dts: vf610-twr: enable i2c0 device
    ARM: dts: i.MX51: Add one more I2C2 pinmux entry
    ARM: dts: i.MX51: Move pins configuration under "iomuxc" label
    ARM: dtsi: imx6qdl-sabresd: Add USB OTG vbus pin to pinctrl_hog
    ARM: dtsi: imx6qdl-sabresd: Add USB host 1 VBUS regulator
    ARM: dts: imx27-phytec-phycore-som: Enable AUDMUX
    ARM: dts: i.MX27: Disable AUDMUX in the template
    ARM: dts: wandboard: Add support for SDIO bcm4329
    ARM: i.MX5 clocks: Remove optional clock setup (CKIH1) from i.MX51 template
    ARM: dts: imx53-qsb: Make USBH1 functional
    ARM i.MX6Q: dts: Enable I2C1 with EEPROM and PMIC on Phytec phyFLEX-i.MX6 Ouad module
    ARM i.MX6Q: dts: Enable SPI NOR flash on Phytec phyFLEX-i.MX6 Ouad module
    ARM: dts: imx6qdl-sabresd: Add touchscreen support
    ARM: imx: add ocram clock for imx53
    ARM: dts: imx: ocram size is different between imx6q and imx6dl
    ARM: dts: imx27-phytec-phycore-som: Fix regulator settings
    ARM: dts: i.MX27: Remove clock name from CPU node
    ...

    Linus Torvalds
     
  • Pull powerpc updates from Ben Herrenschmidt:
    "Here's the powerpc batch for this merge window. Some of the
    highlights are:

    - A bunch of endian fixes! We don't have full LE support yet in that
    release but this contains a lot of fixes all over arch/powerpc to
    use the proper accessors, call the firmware with the right endian
    mode, etc...

    - A few updates to our "powernv" platform (non-virtualized, the one
    to run KVM on), among other, support for bridging the P8 LPC bus
    for UARTs, support and some EEH fixes.

    - Some mpc51xx clock API cleanups in preparation for a clock API
    overhaul

    - A pile of cleanups of our old math emulation code, including better
    support for using it to emulate optional FP instructions on
    embedded chips that otherwise have a HW FPU.

    - Some infrastructure in selftest, for powerpc now, but could be
    generalized, initially used by some tests for our perf instruction
    counting code.

    - A pile of fixes for hotplug on pseries (that was seriously
    bitrotting)

    - The usual slew of freescale embedded updates, new boards, 64-bit
    hibernation support, e6500 core PMU support, etc..."

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (146 commits)
    powerpc: Correct FSCR bit definitions
    powerpc/xmon: Fix printing of set of CPUs in xmon
    powerpc/pseries: Move lparcfg.c to platforms/pseries
    powerpc/powernv: Return secondary CPUs to firmware on kexec
    powerpc/btext: Fix CONFIG_PPC_EARLY_DEBUG_BOOTX on ppc32
    powerpc: Cleanup handling of the DSCR bit in the FSCR register
    powerpc/pseries: Child nodes are not detached by dlpar_detach_node
    powerpc/pseries: Add mising of_node_put in delete_dt_node
    powerpc/pseries: Make dlpar_configure_connector parent node aware
    powerpc/pseries: Do all node initialization in dlpar_parse_cc_node
    powerpc/pseries: Fix parsing of initial node path in update_dt_node
    powerpc/pseries: Pack update_props_workarea to map correctly to rtas buffer header
    powerpc/pseries: Fix over writing of rtas return code in update_dt_node
    powerpc/pseries: Fix creation of loop in device node property list
    powerpc: Skip emulating & leave interrupts off for kernel program checks
    powerpc: Add more exception trampolines for hypervisor exceptions
    powerpc: Fix location and rename exception trampolines
    powerpc: Add more trap names to xmon
    powerpc/pseries: Add a warning in the case of cross-cpu VPA registration
    powerpc: Update the 00-Index in Documentation/powerpc
    ...

    Linus Torvalds
     
  • Pull trivial tree from Jiri Kosina:
    "The usual trivial updates all over the tree -- mostly typo fixes and
    documentation updates"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (52 commits)
    doc: Documentation/cputopology.txt fix typo
    treewide: Convert retrun typos to return
    Fix comment typo for init_cma_reserved_pageblock
    Documentation/trace: Correcting and extending tracepoint documentation
    mm/hotplug: fix a typo in Documentation/memory-hotplug.txt
    power: Documentation: Update s2ram link
    doc: fix a typo in Documentation/00-INDEX
    Documentation/printk-formats.txt: No casts needed for u64/s64
    doc: Fix typo "is is" in Documentations
    treewide: Fix printks with 0x%#
    zram: doc fixes
    Documentation/kmemcheck: update kmemcheck documentation
    doc: documentation/hwspinlock.txt fix typo
    PM / Hibernate: add section for resume options
    doc: filesystems : Fix typo in Documentations/filesystems
    scsi/megaraid fixed several typos in comments
    ppc: init_32: Fix error typo "CONFIG_START_KERNEL"
    treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks
    page_isolation: Fix a comment typo in test_pages_isolated()
    doc: fix a typo about irq affinity
    ...

    Linus Torvalds
     

06 Sep, 2013

1 commit

  • Pull i2c updates from Wolfram Sang:
    "Highlights:

    - OF and ACPI helpers are now included in the core, and not in
    external files anymore. This removes dependency problems for
    modules and is cleaner, in general.
    - mv64xxx-driver gains fifo usage to support mv78230
    - imx-driver overhaul to support VF610
    - various cleanups, most notably related to devm_* and CONFIG_PM
    usage
    - driver bugfixes and smaller feature additions"

    * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (51 commits)
    i2c: rcar: add rcar-H2 support
    i2c: sirf: retry 3 times as sometimes we get random noack and timeout
    i2c: sirf: support reverse direction of address
    i2c: sirf: fix the typo for setting bitrate to less than 100k
    i2c: sirf: we need to wait I2C_RESET status in resume
    i2c: sirf: reset i2c controller early after we get a noack
    i2c: designware: get SDA hold time, HCNT and LCNT configuration from ACPI
    i2c: designware: make HCNT/LCNT values configurable
    i2c: mpc: cleanup clock API use
    i2c: pnx: fix error return code in i2c_pnx_probe()
    i2c: ismt: add error return code in probe()
    i2c: mv64xxx: fix typo in binding documentation
    i2c: imx: use exact SoC revision to document binding
    i2c: move ACPI helpers into the core
    i2c: move OF helpers into the core
    i2c: mv64xxx: Fix timing issue on Armada XP (errata FE-8471889)
    i2c: mv64xxx: Add I2C Transaction Generator support
    i2c: powermac: fix return path on error
    Documentation: i2c: Fix example in instantiating-devices
    i2c: tiny-usb: do not use stack as URB transfer_buffer
    ...

    Linus Torvalds
     

05 Sep, 2013

7 commits

  • Pull vfs pile 1 from Al Viro:
    "Unfortunately, this merge window it'll have to be a lot of small piles -
    my fault, actually, for not keeping #for-next in anything that would
    resemble a sane shape ;-/

    This pile: assorted fixes (the first 3 are -stable fodder, IMO) and
    cleanups + %pd/%pD formats (dentry/file pathname, up to 4 last
    components) + several long-standing patches from various folks.

    There definitely will be a lot more (starting with Miklos'
    check_submount_and_drop() series)"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
    direct-io: Handle O_(D)SYNC AIO
    direct-io: Implement generic deferred AIO completions
    add formats for dentry/file pathnames
    kvm eventfd: switch to fdget
    powerpc kvm: use fdget
    switch fchmod() to fdget
    switch epoll_ctl() to fdget
    switch copy_module_from_fd() to fdget
    git simplify nilfs check for busy subtree
    ibmasmfs: don't bother passing superblock when not needed
    don't pass superblock to hypfs_{mkdir,create*}
    don't pass superblock to hypfs_diag_create_files
    don't pass superblock to hypfs_vm_create_files()
    oprofile: get rid of pointless forward declarations of struct super_block
    oprofilefs_create_...() do not need superblock argument
    oprofilefs_mkdir() doesn't need superblock argument
    don't bother with passing superblock to oprofile_create_stats_files()
    oprofile: don't bother with passing superblock to ->create_files()
    don't bother passing sb to oprofile_create_files()
    coh901318: don't open-code simple_read_from_buffer()
    ...

    Linus Torvalds
     
  • Commit 74e400cee6 ("powerpc: Rework setting up H/FSCR bit definitions")
    ended up with incorrect bit numbers for FSCR_PM_LG and FSCR_BHRB_LG.
    This fixes them.

    Signed-off-by: Paul Mackerras
    Acked-by: Michael Neuling
    Signed-off-by: Benjamin Herrenschmidt

    Paul Mackerras
     
  • Commit 24ec2125f3 ("powerpc/xmon: Use cpumask iterator to avoid warning")
    replaced a loop from 0 to NR_CPUS-1 with a for_each_possible_cpu() loop,
    which means that if the last possible cpu is in xmon, we print the
    wrong value for the end of the range. For example, if 4 cpus are
    possible, NR_CPUS is 128, and all cpus are in xmon, we print "0-7f"
    rather than "0-3". The code also assumes that the set of possible
    cpus is contiguous, which may not necessarily be true.

    This fixes the code to check explicitly for contiguity, and to print
    the ending value correctly.
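
The idea of checking explicitly for contiguity can be sketched in standalone C. This is an illustration, not the xmon code: `format_cpu_ranges()` and its boolean-array input are hypothetical, and the output format (space-separated hex ranges) is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the members of in_set[0..ncpus-1] as hex ranges, e.g. "0-3 8 a-b".
 * A range ends at the last CPU that is actually in the set, never at the
 * array bound. Returns the number of characters written (excluding the
 * NUL); output that does not fit is dropped at a range boundary.
 */
static size_t format_cpu_ranges(const bool *in_set, unsigned int ncpus,
                                char *buf, size_t len)
{
    size_t used = 0;
    unsigned int cpu = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    while (cpu < ncpus) {
        unsigned int first, last;
        int n;

        if (!in_set[cpu]) {
            cpu++;
            continue;
        }

        /* Extend the range only while the next CPU is also in the set. */
        first = cpu;
        while (cpu + 1 < ncpus && in_set[cpu + 1])
            cpu++;
        last = cpu++;

        if (first == last)
            n = snprintf(buf + used, len - used, "%s%x",
                         used ? " " : "", first);
        else
            n = snprintf(buf + used, len - used, "%s%x-%x",
                         used ? " " : "", first, last);

        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* discard the partially written range */
            break;
        }
        used += (size_t)n;
    }
    return used;
}
```

For the case in the commit message (4 possible CPUs, array size 128, all four in the set) this yields "0-3", and a non-contiguous set is printed as separate ranges.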

    Signed-off-by: Paul Mackerras
    Signed-off-by: Benjamin Herrenschmidt

    Paul Mackerras
     
  • From Anatolij:
    <<
    There are cleanups for some mpc5121 specific drivers and DTS files
    in preparation to switch mpc5121 clock support to a clock driver
    based on common clock framework. Additionally Sebastian fixed the
    mpc52xx PIC driver so that it builds when using older gcc versions.
    >>

    Benjamin Herrenschmidt
     
  • Pull KVM updates from Gleb Natapov:
    "The highlights of the release are nested EPT and pv-ticketlocks
    support (hypervisor part, guest part, which is most of the code, goes
    through tip tree). Apart from that there are many fixes for all arches"

    Fix up semantic conflicts as discussed in the pull request thread.

    * 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (88 commits)
    ARM: KVM: Add newlines to panic strings
    ARM: KVM: Work around older compiler bug
    ARM: KVM: Simplify tracepoint text
    ARM: KVM: Fix kvm_set_pte assignment
    ARM: KVM: vgic: Bump VGIC_NR_IRQS to 256
    ARM: KVM: Bugfix: vgic_bytemap_get_reg per cpu regs
    ARM: KVM: vgic: fix GICD_ICFGRn access
    ARM: KVM: vgic: simplify vgic_get_target_reg
    KVM: MMU: remove unused parameter
    KVM: PPC: Book3S PR: Rework kvmppc_mmu_book3s_64_xlate()
    KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls
    KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX
    KVM: x86: update masterclock when kvmclock_offset is calculated (v2)
    KVM: PPC: Book3S: Fix compile error in XICS emulation
    KVM: PPC: Book3S PR: return appropriate error when allocation fails
    arch: powerpc: kvm: add signed type cast for comparation
    KVM: x86: add comments where MMIO does not return to the emulator
    KVM: vmx: count exits to userspace during invalid guest emulation
    KVM: rename __kvm_io_bus_sort_cmp to kvm_io_bus_cmp
    kvm: optimize away THP checks in kvm_is_mmio_pfn()
    ...

    Linus Torvalds
     
  • Pull PTR_RET() removal patches from Rusty Russell:
    "PTR_RET() is a weird name, and led to some confusing usage. We ended
    up with PTR_ERR_OR_ZERO(), and replacing or fixing all the usages.

    This has been sitting in linux-next for a whole cycle"

    [ There are still some PTR_RET users scattered about, with some of them
    possibly being new, but most of them existing in Rusty's tree too. We
    have that

    #define PTR_RET(p) PTR_ERR_OR_ZERO(p)

    thing in , so they continue to work for now - Linus ]
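
For readers unfamiliar with the idiom, here is a userspace model of what PTR_ERR_OR_ZERO() computes. It is a sketch and not the kernel header: the helpers are re-implemented for illustration, and `lookup_slot()` / `probe_slot()` are hypothetical callers.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace model of the kernel's error-pointer helpers (illustration only). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
    return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Negative errno if ptr encodes an error, 0 for any valid pointer. */
static inline int PTR_ERR_OR_ZERO(const void *ptr)
{
    return IS_ERR(ptr) ? (int)PTR_ERR(ptr) : 0;
}

/* Compatibility alias, as quoted in the merge message above. */
#define PTR_RET(p) PTR_ERR_OR_ZERO(p)

/* Hypothetical resource lookup that returns a pointer or an ERR_PTR. */
static int table[4];

static void *lookup_slot(int idx)
{
    if (idx < 0 || idx >= 4)
        return ERR_PTR(-EINVAL);
    return &table[idx];
}

/* Typical caller: only the success/failure status is needed. */
static int probe_slot(int idx)
{
    return PTR_ERR_OR_ZERO(lookup_slot(idx));
}
```

The new name states both outcomes (an error code, or zero), which is the readability problem with PTR_RET() that the pull request describes.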

    * tag 'PTR_RET-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    GFS2: Replace PTR_RET with PTR_ERR_OR_ZERO
    Btrfs: volume: Replace PTR_RET with PTR_ERR_OR_ZERO
    drm/cma: Replace PTR_RET with PTR_ERR_OR_ZERO
    sh_veu: Replace PTR_RET with PTR_ERR_OR_ZERO
    dma-buf: Replace PTR_RET with PTR_ERR_OR_ZERO
    drivers/rtc: Replace PTR_RET with PTR_ERR_OR_ZERO
    mm/oom_kill: remove weird use of ERR_PTR()/PTR_ERR().
    staging/zcache: don't use PTR_RET().
    remoteproc: don't use PTR_RET().
    pinctrl: don't use PTR_RET().
    acpi: Replace weird use of PTR_RET.
    s390: Replace weird use of PTR_RET.
    PTR_RET is now PTR_ERR_OR_ZERO(): Replace most.
    PTR_RET is now PTR_ERR_OR_ZERO

    Linus Torvalds
     
  • Pull timers/nohz changes from Ingo Molnar:
    "It mostly contains fixes and full dynticks off-case optimizations, by
    Frederic Weisbecker"

    * 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
    nohz: Include local CPU in full dynticks global kick
    nohz: Optimize full dynticks's sched hooks with static keys
    nohz: Optimize full dynticks state checks with static keys
    nohz: Rename a few state variables
    vtime: Always debug check snapshot source _before_ updating it
    vtime: Always scale generic vtime accounting results
    vtime: Optimize full dynticks accounting off case with static keys
    vtime: Describe overriden functions in dedicated arch headers
    m68k: hardirq_count() only need preempt_mask.h
    hardirq: Split preempt count mask definitions
    context_tracking: Split low level state headers
    vtime: Fix racy cputime delta update
    vtime: Remove a few unneeded generic vtime state checks
    context_tracking: User/kernel broundary cross trace events
    context_tracking: Optimize context switch off case with static keys
    context_tracking: Optimize guest APIs off case with static key
    context_tracking: Optimize main APIs off case with static key
    context_tracking: Ground setup for static key use
    context_tracking: Remove full dynticks' hacky dependency on wide context tracking
    nohz: Only enable context tracking on full dynticks CPUs
    ...

    Linus Torvalds
     

04 Sep, 2013

9 commits

  • …rnel.org/pub/scm/linux/kernel/git/tip/tip

    Pull perf changes from Ingo Molnar:
    "As a first remark I'd like to point out that the obsolete '-f'
    (--force) option, which has not done anything for several releases,
    has been removed from 'perf record' and related utilities. Everyone
    please update muscle memory accordingly! :-)

    Main changes on the perf kernel side:

    - Performance optimizations:
    . for trace events, by Steve Rostedt.
    . for time values, by Peter Zijlstra

    - New hardware support:
    . for Intel Silvermont (22nm Atom) CPUs, by Zheng Yan
    . for Intel SNB-EP uncore PMUs, by Zheng Yan

    - Enhanced hardware support:
    . for Intel uncore PMUs: add filter support for QPI boxes, by Zheng Yan

    - Core perf events code enhancements and fixes:
    . for full-nohz feature handling, by Frederic Weisbecker
    . for group events, by Jiri Olsa
    . for call chains, by Frederic Weisbecker
    . for event stream parsing, by Adrian Hunter

    - New ABI details:
    . Add attr->mmap2 attribute, by Stephane Eranian
    . Add PERF_EVENT_IOC_ID ioctl to return event ID, by Jiri Olsa
    . Export u64 time_zero on the mmap header page to allow TSC
    calculation, by Adrian Hunter
    . Add dummy software event, by Adrian Hunter.
    . Add a new PERF_SAMPLE_IDENTIFIER to make samples always
    parseable, by Adrian Hunter.
    . Make Power7 events available via sysfs, by Runzhen Wang.

    - Code cleanups and refactorings:
    . for nohz-full, by Frederic Weisbecker
    . for group events, by Jiri Olsa

    - Documentation updates:
    . for perf_event_type, by Peter Zijlstra

    Main changes on the perf tooling side (some of these tooling changes
    utilize the above kernel side changes):

    - Lots of 'perf trace' enhancements:

    . Make 'perf trace' command line arguments consistent with
    'perf record', by David Ahern.

    . Allow specifying syscalls a la strace, by Arnaldo Carvalho de Melo.

    . Add --verbose and -o/--output options, by Arnaldo Carvalho de Melo.

    . Support ! in -e expressions, to filter a list of syscalls,
    by Arnaldo Carvalho de Melo.

    . Arg formatting improvements to allow masking arguments in
    syscalls such as futex and open, where some arguments are
    ignored and thus should not be printed depending on other args,
    by Arnaldo Carvalho de Melo.

    . Beautify futex open, openat, open_by_handle_at, lseek and futex
    syscalls, by Arnaldo Carvalho de Melo.

    . Add option to analyze events in a file versus live, so that
    one can do:

    [root@zoo ~]# perf record -a -e raw_syscalls:* sleep 1
    [ perf record: Woken up 0 times to write data ]
    [ perf record: Captured and wrote 25.150 MB perf.data (~1098836 samples) ]
    [root@zoo ~]# perf trace -i perf.data -e futex --duration 1
    17.799 ( 1.020 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, ua
    113.344 (95.429 ms): 7127 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 4294967
    133.778 ( 1.042 ms): 18004 futex(uaddr: 0x7fff3f6c6674, op: 393, val: 1, utime: 0x7fff3f6c6470, uaddr2: 0x7fff3f6c6648, val3: 429496
    [root@zoo ~]#

    By David Ahern.

    . Honor target pid / tid options when analyzing a file, by David Ahern.

    . Introduce better formatting of syscall arguments, including so
    far beautifiers for mmap, madvise, syscall return values,
    by Arnaldo Carvalho de Melo.

    . Handle HUGEPAGE defines in the mmap beautifier, by David Ahern.

    - 'perf report/top' enhancements:

    . Do annotation using /proc/kcore and /proc/kallsyms when
    available, removing the forced need for a vmlinux file for kernel
    assembly annotation. This also improves this use case because
    vmlinux has just the initial kernel image, not what is actually
    in use after various code patchings by things like alternatives.
    By Adrian Hunter.

    . Add --ignore-callees=<regex> option to collapse undesired parts
    of call graphs, by Greg Price.

    . Simplify symbol filtering by doing it at machine class level,
    by Adrian Hunter.

    . Add support for callchains in the gtk UI, by Namhyung Kim.

    . Add --objdump option to 'perf top', by Sukadev Bhattiprolu.

    - 'perf kvm' enhancements:

    . Add option to print only events that exceed a specified time
    duration, by David Ahern.

    . Improve stack trace printing, by David Ahern.

    . Update documentation of the live command, by David Ahern.

    . Add perf kvm stat live mode that combines aspects of 'perf kvm
    stat' record and report, by David Ahern.

    . Add option to analyze specific VM in perf kvm stat report, by
    David Ahern.

    . Do not require /lib/modules/* on a guest, by Jason Wessel.

    - 'perf script' enhancements:

    . Fix symbol offset computation for some dsos, by David Ahern.

    . Fix named threads support, by David Ahern.

    . Don't install scripting files when perl/python support
    is disabled, by Arnaldo Carvalho de Melo.

    - 'perf test' enhancements:

    . Add various improvements and fixes to the "vmlinux matches
    kallsyms" 'perf test' entry, related to the /proc/kcore
    annotation feature. By Adrian Hunter.

    . Add sample parsing test, by Adrian Hunter.

    . Add test for reading object code, by Adrian Hunter.

    . Add attr record group sampling test, by Jiri Olsa.

    . Misc testing infrastructure improvements and other details,
    by Jiri Olsa.

    - 'perf list' enhancements:

    . Skip unsupported hardware events, by Namhyung Kim.

    . List pmu events, by Andi Kleen.

    - 'perf diff' enhancements:

    . Add support for more than two files comparison, by Jiri Olsa.

    - 'perf sched' enhancements:

    . Various improvements, including removing reliance on some
    scheduler tracepoints that provide the same information as the
    PERF_RECORD_{FORK,EXIT} events. By David Ahern.

    . Remove odd build stall by moving a large struct initialization
    from a local variable to a global one, by Namhyung Kim.

    - 'perf stat' enhancements:

    . Add --initial-delay option to skip measuring for a defined
    startup phase, by Andi Kleen.

    - Generic perf tooling infrastructure/plumbing changes:

    . Tidy up sample parsing validation, by Adrian Hunter.

    . Fix up jobserver setup in libtraceevent Makefile,
    by Arnaldo Carvalho de Melo.

    . Debug improvements, by Adrian Hunter.

    . Fix correlation of samples coming after PERF_RECORD_EXIT event,
    by David Ahern.

    . Improve robustness of the topology parsing code,
    by Stephane Eranian.

    . Add group leader sampling, which allows just one event in a group
    to sample while the other events have just their values read,
    by Jiri Olsa.

    . Add support for a new modifier "D", which requests that the
    event, or group of events, be pinned to the PMU.
    By Michael Ellerman.

    . Support callchain sorting based on addresses, by Andi Kleen.

    . Prep work for multi perf data file storage, by Jiri Olsa.

    . libtraceevent cleanups, by Namhyung Kim.

    And lots and lots of other fixes and code reorganizations that did not
    make it into the list, see the shortlog, diffstat and the Git log for
    details!"

    [ Also merge a leftover from the 3.11 cycle ]

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf: Prevent race in unthrottling code

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (237 commits)
    perf trace: Tell arg formatters the arg index
    perf trace: Add beautifier for open's flags arg
    perf trace: Add beautifier for lseek's whence arg
    perf tools: Fix symbol offset computation for some dsos
    perf list: Skip unsupported events
    perf tests: Add 'keep tracking' test
    perf tools: Add support for PERF_COUNT_SW_DUMMY
    perf: Add a dummy software event to keep tracking
    perf trace: Add beautifier for futex 'operation' parm
    perf trace: Allow syscall arg formatters to mask args
    perf: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node()
    perf: Export struct perf_branch_entry to userspace
    perf: Add attr->mmap2 attribute to an event
    perf/x86: Add Silvermont (22nm Atom) support
    perf/x86: use INTEL_UEVENT_EXTRA_REG to define MSR_OFFCORE_RSP_X
    perf trace: Handle missing HUGEPAGE defines
    perf trace: Honor target pid / tid options when analyzing a file
    perf trace: Add option to analyze events in a file versus live
    perf evlist: Add tracepoint lookup by name
    perf tests: Add a sample parsing test
    ...

    Linus Torvalds
     
  • Pull pstore changes from Tony Luck:
    "A big part of this is the addition of compression to the generic
    pstore layer so that all backends can use the pitiful amounts of
    storage they control more effectively. Three other small
    fixes/cleanups too."

    * tag 'please-pull-pstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
    pstore/ram: (really) fix undefined usage of rounddown_pow_of_two
    pstore/ram: Read and write to the 'compressed' flag of pstore
    efi-pstore: Read and write to the 'compressed' flag of pstore
    erst: Read and write to the 'compressed' flag of pstore
    powerpc/pseries: Read and write to the 'compressed' flag of pstore
    pstore: Add file extension to pstore file if compressed
    pstore: Add decompression support to pstore
    pstore: Introduce new argument 'compressed' in the read callback
    pstore: Add compression support to pstore
    pstore/Kconfig: Select ZLIB_DEFLATE and ZLIB_INFLATE when PSTORE is selected
    pstore: Add new argument 'compressed' in pstore write callback
    powerpc/pseries: Remove (de)compression in nvram with pstore enabled
    pstore: d_alloc_name() doesn't return an ERR_PTR
    acpi/apei/erst: Add missing iounmap() on error in erst_exec_move_data()

    Linus Torvalds
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • same story as with oprofilefs_mkdir()

    Signed-off-by: Al Viro

    Al Viro
     
  • it's always equal to ->d_sb of the second argument (parent dentry),
    due to either being literally that, or ->d_sb of parent's parent.

    Signed-off-by: Al Viro

    Al Viro
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • Pull PCI changes from Bjorn Helgaas:

    PCI device hotplug:
    - Use PCIe native hotplug, not ACPI hotplug, when possible (Neil Horman)
    - Assign resources on per-host bridge basis (Yinghai Lu)

    MPS (Max Payload Size):
    - Allow larger MPS settings below hotplug-capable Root Port (Yijing Wang)
    - Add warnings about unsafe MPS settings (Yijing Wang)
    - Simplify interface and messages (Bjorn Helgaas)

    SR-IOV:
    - Return -ENOSYS on non-SR-IOV devices (Stefan Assmann)
    - Update NumVFs register when disabling SR-IOV (Yijing Wang)

    Virtualization:
    - Add bus and slot reset support (Alex Williamson)
    - Fix ACS (Access Control Services) issues (Alex Williamson)

    Miscellaneous:
    - Simplify PCIe Capability accessors (Bjorn Helgaas)
    - Add pcibios_pm_ops for arch-specific hibernate stuff (Sebastian Ott)
    - Disable decoding during BAR sizing only when necessary (Zoltan Kiss)
    - Delay enabling bridges until they're needed (Yinghai Lu)
    - Split Designware support into Synopsys and Exynos parts (Jingoo Han)
    - Convert class code to use dev_groups (Greg Kroah-Hartman)
    - Cleanup Designware and Exynos I/O access wrappers (Seungwon Jeon)
    - Fix bridge I/O window alignment (Bjorn Helgaas)
    - Add pci_wait_for_pending_transaction() (Casey Leedom)
    - Use devm_ioremap_resource() in Marvell driver (Tushar Behera)

    * tag 'pci-v3.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (63 commits)
    PCI/ACPI: Fix _OSC ordering to allow PCIe hotplug use when available
    PCI: exynos: Add I/O access wrappers
    PCI: designware: Drop "addr" arg from dw_pcie_readl_rc()/dw_pcie_writel_rc()
    PCI: Remove pcie_cap_has_devctl()
    PCI: Support PCIe Capability Slot registers only for ports with slots
    PCI: Remove PCIe Capability version checks
    PCI: Allow PCIe Capability link-related register access for switches
    PCI: Add offsets of PCIe capability registers
    PCI: Tidy bitmasks and spacing of PCIe capability definitions
    PCI: Remove obsolete comment reference to pci_pcie_cap2()
    PCI: Clarify PCI_EXP_TYPE_PCI_BRIDGE comment
    PCI: Rename PCIe capability definitions to follow convention
    PCI: Warn if unsafe MPS settings detected
    PCI: Fix MPS peer-to-peer DMA comment syntax
    PCI: Disable decoding for BAR sizing only when it was actually enabled
    PCI: Add comment about needing pci_msi_off() even when CONFIG_PCI_MSI=n
    PCI: Add pcibios_pm_ops for optional arch-specific hibernate functionality
    PCI: Don't restrict MPS for slots below Root Ports
    PCI: Simplify MPS test for Downstream Port
    PCI: Remove unnecessary check for pcie_get_mps() failure
    ...

    Linus Torvalds
     
  • Pull ACPI and power management updates from Rafael Wysocki:

    1) ACPI-based PCI hotplug (ACPIPHP) subsystem rework and introduction
    of Intel Thunderbolt support on systems that use ACPI for signalling
    Thunderbolt hotplug events. This also should make ACPIPHP work in
    some cases in which it was known to have problems. From
    Rafael J Wysocki, Mika Westerberg and Kirill A Shutemov.

    2) ACPI core code cleanups and dock station support cleanups from
    Jiang Liu and Rafael J Wysocki.

    3) Fixes for locking problems related to ACPI device hotplug from
    Rafael J Wysocki.

    4) ACPICA update to version 20130725 including fixes, cleanups, support
    for more than 256 GPEs per GPE block and a change to make the ACPI
    PM Timer optional (we've seen systems without the PM Timer in the
    field already). One of the fixes, related to the DeRefOf operator,
    is necessary to prevent some Windows 8 oriented AML from causing
    problems. From Bob Moore, Lv Zheng, and Jung-uk Kim.

    5) Removal of the old and long deprecated /proc/acpi/event interface
    and related driver changes from Thomas Renninger.

    6) ACPI and Xen changes to make the reduced hardware sleep work with
    the latter from Ben Guthro.

    7) ACPI video driver cleanups and a blacklist of systems that should
    not tell the BIOS that they are compatible with Windows 8 (or ACPI
    backlight and possibly other things will not work on them). From
    Felipe Contreras.

    8) Assorted ACPI fixes and cleanups from Aaron Lu, Hanjun Guo,
    Kuppuswamy Sathyanarayanan, Lan Tianyu, Sachin Kamat, Tang Chen,
    Toshi Kani, and Wei Yongjun.

    9) cpufreq ondemand governor target frequency selection change to
    reduce oscillations between min and max frequencies (essentially,
    it causes the governor to choose target frequencies proportional
    to load) from Stratos Karafotis.

    10) cpufreq fixes allowing sysfs attributes file permissions to be
    preserved over suspend/resume cycles from Srivatsa S Bhat.

    11) Removal of Device Tree parsing for CPU device nodes from multiple
    cpufreq drivers that required some changes related to
    of_get_cpu_node() to be made in a few architectures and in the
    driver core. From Sudeep KarkadaNagesha.

    12) cpufreq core fixes and cleanups related to mutual exclusion and
    driver module references from Viresh Kumar, Lukasz Majewski and
    Rafael J Wysocki.

    13) Assorted cpufreq fixes and cleanups from Amit Daniel Kachhap,
    Bartlomiej Zolnierkiewicz, Hanjun Guo, Jingoo Han, Joseph Lo,
    Julia Lawall, Li Zhong, Mark Brown, Sascha Hauer, Stephen Boyd,
    Stratos Karafotis, and Viresh Kumar.

    14) Fixes for race conditions in coupled cpuidle from
    Colin Cross.

    15) cpuidle core fixes and cleanups from Daniel Lezcano and
    Tuukka Tikkanen.

    16) Assorted cpuidle fixes and cleanups from Daniel Lezcano,
    Geert Uytterhoeven, Jingoo Han, Julia Lawall, Linus Walleij,
    and Sahara.

    17) System sleep tracing changes from Todd E Brandt and Shuah Khan.

    18) PNP subsystem conversion to using struct dev_pm_ops for power
    management from Shuah Khan.
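
    As an illustration of item 9 above, the sketch below picks a target
    frequency proportional to load. The summary only says the targets
    are "proportional to load"; the linear formula, the function name
    ondemand_target_freq() and its parameters are assumptions made for
    this example, not the governor's actual code, and any thresholds or
    clamping to a minimum frequency are left out.

```c
#include <assert.h>

/*
 * Hypothetical helper: scale the maximum frequency linearly by the
 * measured load (0..100 percent). Frequencies are in kHz.
 */
unsigned int ondemand_target_freq(unsigned int load_pct,
                                  unsigned int max_freq_khz)
{
    assert(load_pct <= 100);
    /* Widen before multiplying so large frequencies cannot overflow. */
    return (unsigned int)((unsigned long long)load_pct * max_freq_khz / 100);
}
```

    With this model a half-loaded CPU with a 2 GHz maximum is steered
    towards 1 GHz instead of bouncing between the minimum and maximum
    frequencies.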

    * tag 'pm+acpi-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (217 commits)
    cpufreq: Don't use smp_processor_id() in preemptible context
    cpuidle: coupled: fix race condition between pokes and safe state
    cpuidle: coupled: abort idle if pokes are pending
    cpuidle: coupled: disable interrupts after entering safe state
    ACPI / hotplug: Remove containers synchronously
    driver core / ACPI: Avoid device hot remove locking issues
    cpufreq: governor: Fix typos in comments
    cpufreq: governors: Remove duplicate check of target freq in supported range
    cpufreq: Fix timer/workqueue corruption due to double queueing
    ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT
    ACPI / thermal: Add check of "_TZD" availability and evaluating result
    cpufreq: imx6q: Fix clock enable balance
    ACPI: blacklist win8 OSI for buggy laptops
    cpufreq: tegra: fix the wrong clock name
    cpuidle: Change struct menu_device field types
    cpuidle: Add a comment warning about possible overflow
    cpuidle: Fix variable domains in get_typical_interval()
    cpuidle: Fix menu_device->intervals type
    cpuidle: CodingStyle: Break up multiple assignments on single line
    cpuidle: Check called function parameter in get_typical_interval()
    ...

    Linus Torvalds
     
  • Pull driver core patches from Greg KH:
    "Here's the big driver core pull request for 3.12-rc1.

    Lots of tiny changes here fixing up the way sysfs attributes are
    created, to try to make drivers simpler, and fix a whole class of race
    conditions with creation of device attributes after the device was
    announced to userspace.

    All the various pieces are acked by the different subsystem
    maintainers"

    * tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (119 commits)
    firmware loader: fix pending_fw_head list corruption
    drivers/base/memory.c: introduce help macro to_memory_block
    dynamic debug: line queries failing due to uninitialized local variable
    sysfs: sysfs_create_groups returns a value.
    debugfs: provide debugfs_create_x64() when disabled
    rbd: convert bus code to use bus_groups
    firmware: dcdbas: use binary attribute groups
    sysfs: add sysfs_create/remove_groups for when SYSFS is not enabled
    driver core: add #include to core files.
    HID: convert bus code to use dev_groups
    Input: serio: convert bus code to use drv_groups
    Input: gameport: convert bus code to use drv_groups
    driver core: firmware: use __ATTR_RW()
    driver core: core: use DEVICE_ATTR_RO
    driver core: bus: use DRIVER_ATTR_WO()
    driver core: create write-only attribute macros for devices and drivers
    sysfs: create __ATTR_WO()
    driver-core: platform: convert bus code to use dev_groups
    workqueue: convert bus code to use dev_groups
    MEI: convert bus code to use dev_groups
    ...

    Linus Torvalds
     

30 Aug, 2013

1 commit

  • * 'kvm-ppc-next' of git://github.com/agraf/linux-2.6:
    KVM: PPC: Book3S PR: Rework kvmppc_mmu_book3s_64_xlate()
    KVM: PPC: Book3S PR: Make instruction fetch fallback work for system calls
    KVM: PPC: Book3S PR: Don't corrupt guest state when kernel uses VMX
    KVM: PPC: Book3S: Fix compile error in XICS emulation
    KVM: PPC: Book3S PR: return appropriate error when allocation fails
    arch: powerpc: kvm: add signed type cast for comparation
    powerpc/kvm: Copy the pvr value after memset
    KVM: PPC: Book3S PR: Load up SPRG3 register with guest value on guest entry
    kvm/ppc/booke: Don't call kvm_guest_enter twice
    kvm/ppc: Call trace_hardirqs_on before entry
    KVM: PPC: Book3S HV: Allow negative offsets to real-mode hcall handlers
    KVM: PPC: Book3S HV: Correct tlbie usage
    powerpc/kvm: Use 256K chunk to track both RMA and hash page table allocation.
    powerpc/kvm: Contiguous memory allocator based RMA allocation
    powerpc/kvm: Contiguous memory allocator based hash page table allocation
    KVM: PPC: Book3S: Ignore DABR register
    mm/cma: Move dma contiguous changes into a seperate config

    Gleb Natapov
     

29 Aug, 2013

5 commits


28 Aug, 2013

2 commits

  • It turns out that if we exit the guest due to an hcall instruction (sc 1),
    and the loading of the instruction in the guest exit path fails for any
    reason, the call to kvmppc_ld() in kvmppc_get_last_inst() fetches the
    instruction after the hcall instruction rather than the hcall itself.
    This in turn means that the instruction doesn't get recognized as an
    hcall in kvmppc_handle_exit_pr() but gets passed to the guest kernel
    as an sc instruction. That usually results in the guest kernel getting
    a return code of 38 (ENOSYS) from an hcall, which often triggers a
    BUG_ON() or other failure.

    This fixes the problem by adding a new variant of kvmppc_get_last_inst()
    called kvmppc_get_last_sc(), which fetches the instruction if necessary
    from pc - 4 rather than pc.
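
    The off-by-one-instruction fetch described above can be modelled
    outside the kernel. In the sketch below guest memory is a plain
    array, and get_last_inst(), get_last_sc() and fetch_at() are
    hypothetical stand-ins for the kernel functions, not their real
    signatures. It assumes that after an sc instruction the saved pc
    already points at the following instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SC1_INSN 0x44000022u /* encoding of "sc 1", used here as the hcall */
#define NOP_INSN 0x60000000u /* "ori 0,0,0", used as filler */

/* Hypothetical model of guest memory: fetch one 4-byte instruction. */
static uint32_t fetch_at(const uint32_t *mem, size_t n_insns, uint64_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < n_insns);
    return mem[addr / 4];
}

/* Fetches at pc: after an sc this is the instruction following it. */
uint32_t get_last_inst(const uint32_t *mem, size_t n_insns, uint64_t pc)
{
    return fetch_at(mem, n_insns, pc);
}

/* Fetches at pc - 4: the sc instruction that caused the exit. */
uint32_t get_last_sc(const uint32_t *mem, size_t n_insns, uint64_t pc)
{
    assert(pc >= 4);
    return fetch_at(mem, n_insns, pc - 4);
}
```

    For a three-instruction program { nop, sc 1, nop } with the saved
    pc at byte offset 8, get_last_inst() returns the trailing nop while
    get_last_sc() returns the sc 1 encoding, which is what the exit
    handler needs in order to recognize the hcall.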

    Signed-off-by: Paul Mackerras
    Signed-off-by: Alexander Graf

    Paul Mackerras
     
  • Currently the code assumes that once we load up guest FP/VSX or VMX
    state into the CPU, it stays valid in the CPU registers until we
    explicitly flush it to the thread_struct. However, on POWER7,
    copy_page() and memcpy() can use VMX. These functions do flush the
    VMX state to the thread_struct before using VMX instructions, but if
    this happens while we have guest state in the VMX registers, and we
    then re-enter the guest, we don't reload the VMX state from the
    thread_struct, leading to guest corruption. This has been observed
    to cause guest processes to segfault.

    To fix this, we check before re-entering the guest that all of the
    bits corresponding to facilities owned by the guest, as expressed
    in vcpu->arch.guest_owned_ext, are set in current->thread.regs->msr.
    Any bits that have been cleared correspond to facilities that have
    been used by kernel code and thus flushed to the thread_struct, so
    for them we reload the state from the thread_struct.

    We also need to check current->thread.regs->msr before calling
    giveup_fpu() or giveup_altivec(), since if the relevant bit is
    clear, the state has already been flushed to the thread_struct and
    to flush it again would corrupt it.
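
    The two checks described above come down to bit tests on the MSR.
    The sketch below is a standalone model: lost_ext() and
    should_giveup() are hypothetical helper names, and the bit
    positions are meant to mirror the powerpc MSR_FP, MSR_VSX and
    MSR_VEC bits but should be read as illustrative constants.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_FP  (1ull << 13) /* floating point available */
#define MSR_VSX (1ull << 23) /* VSX available */
#define MSR_VEC (1ull << 25) /* VMX/AltiVec available */

/*
 * Facilities the guest owns whose MSR bit has been cleared. These were
 * used by kernel code and flushed to the thread_struct, so their state
 * must be reloaded before re-entering the guest.
 */
uint64_t lost_ext(uint64_t guest_owned_ext, uint64_t thread_msr)
{
    return guest_owned_ext & ~thread_msr;
}

/*
 * Flush a facility only while its MSR bit is still set; if the bit is
 * clear the state was already flushed and flushing again would corrupt it.
 */
int should_giveup(uint64_t thread_msr, uint64_t facility)
{
    assert(facility != 0);
    return (thread_msr & facility) != 0;
}
```

    If the guest owns FP and VMX and the kernel has since used VMX
    (clearing MSR_VEC in the thread's msr), lost_ext() yields MSR_VEC,
    so only the VMX state is reloaded, and should_giveup() reports that
    VMX must not be flushed a second time.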

    Signed-off-by: Paul Mackerras
    Signed-off-by: Alexander Graf

    Paul Mackerras