26 Jul, 2011

1 commit


25 Jul, 2011

9 commits

  • * 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (237 commits)
    ARM: 7004/1: fix traps.h compile warnings
    ARM: 6998/2: kernel: use proper memory barriers for bitops
    ARM: 6997/1: ep93xx: increase NR_BANKS to 16 for support of 128MB RAM
    ARM: Fix build errors caused by adding generic macros
    ARM: CPU hotplug: ensure we migrate all IRQs off a downed CPU
    ARM: CPU hotplug: pass in proper affinity mask on IRQ migration
    ARM: GIC: avoid routing interrupts to offline CPUs
    ARM: CPU hotplug: fix abuse of irqdesc->node
    ARM: 6981/2: mmci: adjust calculation of f_min
    ARM: 7000/1: LPAE: Use long long printk format for displaying the pud
    ARM: 6999/1: head, zImage: Always Enter the kernel in ARM state
    ARM: btc: avoid invalidating the branch target cache on kernel TLB maintanence
    ARM: ARM_DMA_ZONE_SIZE is no more
    ARM: mach-shark: move ARM_DMA_ZONE_SIZE to mdesc->dma_zone_size
    ARM: mach-sa1100: move ARM_DMA_ZONE_SIZE to mdesc->dma_zone_size
    ARM: mach-realview: move from ARM_DMA_ZONE_SIZE to mdesc->dma_zone_size
    ARM: mach-pxa: move from ARM_DMA_ZONE_SIZE to mdesc->dma_zone_size
    ARM: mach-ixp4xx: move from ARM_DMA_ZONE_SIZE to mdesc->dma_zone_size
    ARM: mach-h720x: move from ARM_DMA_ZONE_SIZE to mdesc->dma_zone_size
    ARM: mach-davinci: move from ARM_DMA_ZONE_SIZE to mdesc->dma_zone_size
    ...

    Linus Torvalds
     
  • Current documentation referred to the old method of handling augmented
    trees. Update documentation to correspond with the changes done in
    commit b945d6b2554d ("rbtree: Undo augmented trees performance damage
    and regression").

    Cc: Pekka Enberg
    Cc: David Woodhouse
    Cc: Andrew Morton
    Acked-by: Ingo Molnar
    Acked-by: Peter Zijlstra
    Signed-off-by: Sasha Levin
    Signed-off-by: Linus Torvalds

    Sasha Levin
     
  • is needed for min_t. The old version
    happened to work on x86 because
    indirectly includes , but it didn't
    work on ARM.

    includes so it's
    not necessary to include it explicitly anymore.

    Signed-off-by: Lasse Collin
    Cc: stable
    Signed-off-by: Linus Torvalds

    Lasse Collin
     
  • * 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (21 commits)
    [S390] use siginfo for sigtrap signals
    [S390] dasd: add enhanced DASD statistics interface
    [S390] kvm: make sigp emerg smp capable
    [S390] disable cpu measurement alerts on a dying cpu
    [S390] initial cr0 bits
    [S390] iucv cr0 enablement bit
    [S390] race safe external interrupt registration
    [S390] remove tape block docu
    [S390] ap: toleration support for ap device type 10
    [S390] cleanup program check handler prototypes
    [S390] remove kvm mmu reload on s390
    [S390] Use gmap translation for accessing guest memory
    [S390] use gmap address spaces for kvm guest images
    [S390] kvm guest address space mapping
    [S390] fix s390 assembler code alignments
    [S390] move sie code to entry.S
    [S390] kvm: handle tprot intercepts
    [S390] qdio: clear shared DSCI before scheduling the queue handler
    [S390] reference bit testing for unmapped pages
    [S390] irqs: Do not trace arch_local_{*,irq_*} functions
    ...

    Linus Torvalds
     
  • * 'for-upstream' of git://openrisc.net/jonas/linux: (24 commits)
    OpenRISC: Add MAINTAINERS entry
    OpenRISC: Miscellaneous
    OpenRISC: Library routines
    OpenRISC: Headers
    OpenRISC: Traps
    OpenRISC: Module support
    OpenRISC: GPIO
    OpenRISC: Scheduling/Process management
    OpenRISC: Idle/Power management
    OpenRISC: System calls
    OpenRISC: IRQ
    OpenRISC: Timekeeping
    OpenRISC: DMA
    OpenRISC: PTrace
    OpenRISC: Build infrastructure
    OpenRISC: Signal handling
    OpenRISC: Memory management
    OpenRISC: Device tree
    OpenRISC: Boot code
    iomap: make IOPORT/PCI mapping functions conditional
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
    modpost: Fix modpost's license checking V3
    module: add /sys/module//uevent files
    module: change attr callbacks to take struct module_kobject
    modules: make arch's use default loader hooks
    modules: add default loader hook implementations
    param: fix return value handling in param_set_*

    Linus Torvalds
     
  • * 'kvm-updates/3.1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (143 commits)
    KVM: IOMMU: Disable device assignment without interrupt remapping
    KVM: MMU: trace mmio page fault
    KVM: MMU: mmio page fault support
    KVM: MMU: reorganize struct kvm_shadow_walk_iterator
    KVM: MMU: lockless walking shadow page table
    KVM: MMU: do not need atomicly to set/clear spte
    KVM: MMU: introduce the rules to modify shadow page table
    KVM: MMU: abstract some functions to handle fault pfn
    KVM: MMU: filter out the mmio pfn from the fault pfn
    KVM: MMU: remove bypass_guest_pf
    KVM: MMU: split kvm_mmu_free_page
    KVM: MMU: count used shadow pages on prepareing path
    KVM: MMU: rename 'pt_write' to 'emulate'
    KVM: MMU: cleanup for FNAME(fetch)
    KVM: MMU: optimize to handle dirty bit
    KVM: MMU: cache mmio info on page fault path
    KVM: x86: introduce vcpu_mmio_gva_to_gpa to cleanup the code
    KVM: MMU: do not update slot bitmap if spte is nonpresent
    KVM: MMU: fix walking shadow page table
    KVM guest: KVM Steal time registration
    ...

    Linus Torvalds
     
  • * 'upstream/xen-tracing2' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
    xen/trace: use class for multicall trace
    xen/trace: convert mmu events to use DECLARE_EVENT_CLASS()/DEFINE_EVENT()
    xen/multicall: move *idx fields to start of mc_buffer
    xen/multicall: special-case singleton hypercalls
    xen/multicalls: add unlikely around slowpath in __xen_mc_entry()
    xen/multicalls: disable MC_DEBUG
    xen/mmu: tune pgtable alloc/release
    xen/mmu: use extend_args for more mmuext updates
    xen/trace: add tlb flush tracepoints
    xen/trace: add segment desc tracing
    xen/trace: add xen_pgd_(un)pin tracepoints
    xen/trace: add ptpage alloc/release tracepoints
    xen/trace: add mmu tracepoints
    xen/trace: add multicall tracing
    xen/trace: set up tracepoint skeleton
    xen/multicalls: remove debugfs stats
    trace/xen: add skeleton for Xen trace events

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (34 commits)
    crypto: caam - ablkcipher support
    crypto: caam - faster aead implementation
    crypto: caam - structure renaming
    crypto: caam - shorter names
    crypto: talitos - don't bad_key in ablkcipher setkey
    crypto: talitos - remove unused giv from ablkcipher methods
    crypto: talitos - don't set done notification in hot path
    crypto: talitos - ensure request ordering within a single tfm
    crypto: gf128mul - fix call to memset()
    crypto: s390 - support hardware accelerated SHA-224
    crypto: algif_hash - Handle initial af_alg_make_sg error correctly
    crypto: sha1_generic - use SHA1_BLOCK_SIZE
    hwrng: ppc4xx - add support for ppc4xx TRNG
    crypto: crypto4xx - Perform read/modify/write on device control register
    crypto: caam - fix build warning when DEBUG_FS not configured
    crypto: arc4 - Fixed coding style issues
    crypto: crc32c - Fixed coding style issue
    crypto: omap-sham - do not schedule tasklet if there is no active requests
    crypto: omap-sham - clear device flags when finishing request
    crypto: omap-sham - irq handler must not clear error code
    ...

    Linus Torvalds
     

24 Jul, 2011

30 commits

  • Commit f02e8a6 sorts symbols by placing each of them in its own ELF section.
    The linker then sorts them and merges them into the canonical sections.
    Unfortunately, to generate the Module.symvers file, modpost parses vmlinux
    (already linked) and all module object files (which aren't linked yet),
    so the module sections haven't been sanitized by the linker. As a result,
    modpost can't detect the license of modules properly. This patch makes
    modpost aware of the new exported symbols structure.

    Thanks to Arnaud Lacombe and Anders Kaseorg
    for providing useful suggestions about code.

    This work was supported by a hardware donation from the CE Linux Forum.

    Reported-by: Jan Beulich
    Signed-off-by: Alessio Igor Bogani
    Signed-off-by: Rusty Russell

    Alessio Igor Bogani
     
  • Userspace wants to manage module parameters with udev rules.
    This currently only works for loaded modules, but not for
    built-in ones.

    To allow access to the built-in modules we need to
    re-trigger all module load events that happened before any
    userspace was running. We already do the same thing for all
    devices, subsystems(buses) and drivers.

    This adds the currently missing /sys/module//uevent files
    to all module entries.

    Signed-off-by: Kay Sievers
    Signed-off-by: Rusty Russell (split & trivial fix)

    Kay Sievers
     
  • This simplifies the next patch, where we have an attribute on a
    builtin module (ie. module == NULL).

    Signed-off-by: Kay Sievers
    Signed-off-by: Rusty Russell (split into 2)

    Kay Sievers
     
  • This patch removes all the module loader hook implementations in the
    architecture specific code where the functionality is the same as that
    now provided by the recently added default hooks.

    Signed-off-by: Jonas Bonn
    Acked-by: Mike Frysinger
    Acked-by: Geert Uytterhoeven
    Tested-by: Michal Simek
    Signed-off-by: Rusty Russell

    Jonas Bonn
     
  • The module loader code allows architectures to hook into the code by
    providing a small number of entry points that each arch must implement.
    This patch provides __weakly linked generic implementations of these
    entry points for architectures that don't need to do anything special.
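
    The weak-linkage approach described above can be sketched in plain C as
    follows. This is only an illustration: the hook name
    arch_prepare_module_sketch and its signature are hypothetical and are not
    the module loader's real entry points. It relies on the GCC/Clang
    "weak" attribute.

```c
#include <stddef.h>

/*
 * Hypothetical default hook (name and signature are assumptions made for
 * this sketch). Because it is declared weak, an architecture that needs
 * special handling can provide a normal (strong) definition with the same
 * name, and the linker will pick that one instead.
 */
__attribute__((weak)) int arch_prepare_module_sketch(const char *name)
{
    (void)name;
    return 0; /* generic default: nothing special to do */
}

/* Generic code calls the hook without knowing which definition was linked. */
int load_module_sketch(const char *name)
{
    int err;

    if (name == NULL)
        return -1;

    err = arch_prepare_module_sketch(name);
    if (err)
        return err;

    return 0;
}
```

    With no strong definition linked in, load_module_sketch("demo") goes
    through the weak default and returns 0.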

    Signed-off-by: Jonas Bonn
    Signed-off-by: Rusty Russell

    Jonas Bonn
     
  • In STANDARD_PARAM_DEF, param_set_* handles the case in which strtolfn
    returns -EINVAL, but strtolfn may also return -ERANGE. If it returns
    -ERANGE, param_set_* may set an uninitialized value to the parameter. We
    should handle both cases.

    One of the cases in which strtolfn() returns -ERANGE is the following:

    * The type of the module parameter is long
    * The parameter is set to a value greater than LONG_MAX
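
    A userspace sketch of the same idea, assuming the standard C strtol() in
    place of the kernel's strtolfn helpers. The function name
    param_set_long_sketch is hypothetical. The point is that the output is
    written only after every error path, including the range error, has been
    ruled out.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a param_set_* helper. Returns 0 on success,
 * -EINVAL for malformed input and -ERANGE for out-of-range input. *out is
 * left untouched on any error, so no uninitialized value is ever stored.
 */
int param_set_long_sketch(const char *val, long *out)
{
    char *end;
    long tmp;

    if (val == NULL || *val == '\0')
        return -EINVAL;

    errno = 0;
    tmp = strtol(val, &end, 0);
    if (errno == ERANGE)
        return -ERANGE;   /* e.g. a value greater than LONG_MAX */
    if (end == val || *end != '\0')
        return -EINVAL;   /* no digits, or trailing garbage */

    *out = tmp;
    return 0;
}
```

    A caller that only checked for -EINVAL would treat the out-of-range case
    as success, which is the situation the commit message describes.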

    Signed-off-by: Satoru Moriya
    Signed-off-by: Rusty Russell

    Satoru Moriya
     
  • IOMMU interrupt remapping support provides a further layer of
    isolation for device assignment by preventing arbitrary interrupt
    block DMA writes by a malicious guest from reaching the host. By
    default, we should require that the platform provides interrupt
    remapping support, with an opt-in mechanism for existing behavior.

    Both AMD IOMMU and Intel VT-d2 hardware support interrupt
    remapping, however we currently only have software support on
    the Intel side. Users wishing to re-enable device assignment
    when interrupt remapping is not supported on the platform can
    use the "allow_unsafe_assigned_interrupts=1" module option.

    [avi: break long lines]

    Signed-off-by: Alex Williamson
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Alex Williamson
     
  • Add tracepoints to trace mmio page fault

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • The idea is from Avi:

    | We could cache the result of a miss in an spte by using a reserved bit, and
    | checking the page fault error code (or seeing if we get an ept violation or
    | ept misconfiguration), so if we get repeated mmio on a page, we don't need to
    | search the slot list/tree.
    | (https://lkml.org/lkml/2011/2/22/221)

    When a page fault is caused by mmio, we cache the info in the shadow page
    table and also set the reserved bits there, so if the same mmio access
    faults again, we can quickly identify it and emulate it directly.

    Searching for an mmio gfn in the memslots is expensive since we need to
    walk all memslots. This feature reduces that cost and also avoids walking
    the guest page table for soft mmu.
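
    The caching idea can be sketched as a pair of encode/decode helpers. The
    bit layout below (marker mask, page shift, access bits) is purely an
    assumption for illustration and is not the layout used by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout, chosen only for this sketch. */
#define SKETCH_MMIO_MASK    (3ULL << 61)  /* stands in for "reserved bits" */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_ACCESS_MASK  0x7ULL

/* Build an spte value that marks the entry as mmio and caches gfn/access. */
uint64_t make_mmio_spte_sketch(uint64_t gfn, unsigned int access)
{
    return SKETCH_MMIO_MASK | (gfn << SKETCH_PAGE_SHIFT) |
           (access & SKETCH_ACCESS_MASK);
}

/* A later fault can recognise the marker without searching the memslots. */
bool is_mmio_spte_sketch(uint64_t spte)
{
    return (spte & SKETCH_MMIO_MASK) == SKETCH_MMIO_MASK;
}

/* Recover the cached gfn from a marked entry. */
uint64_t mmio_spte_gfn_sketch(uint64_t spte)
{
    return (spte & ~SKETCH_MMIO_MASK) >> SKETCH_PAGE_SHIFT;
}

/* Recover the cached access bits from a marked entry. */
unsigned int mmio_spte_access_sketch(uint64_t spte)
{
    return (unsigned int)(spte & SKETCH_ACCESS_MASK);
}
```

    On a repeated fault, checking the marker and decoding the gfn replaces
    the memslot search.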

    [jan: fix operator precedence issue]

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Jan Kiszka
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Reorganize it to make good use of the cache

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Use RCU to protect shadow page tables from being freed while they are
    walked, so we can walk them safely. The walk should be fast, and it is
    needed by the mmio page fault path

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Now, the spte only changes from nonpresent to present or from present to
    nonpresent, so we can use a trick to set/clear the spte non-atomically, as
    the Linux kernel does

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Introduce some interfaces to modify the spte as the Linux kernel does:
    - mmu_spte_clear_track_bits: sets the spte from present to nonpresent and
    tracks the stat bits (accessed/dirty) of the spte
    - mmu_spte_clear_no_track: the same as mmu_spte_clear_track_bits, except
    that it does not track the stat bits
    - mmu_spte_set: sets the spte from nonpresent to present
    - mmu_spte_update: only updates the stat bits

    Setting an spte from present to present is no longer allowed. Later, we
    can drop the atomic operation for X86_32 hosts; this is the preparatory
    work to get the spte on X86_32 hosts out of the mmu lock

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Introduce handle_abnormal_pfn to handle the fault pfn on the page fault
    path, and introduce mmu_invalid_pfn to handle the fault pfn on the
    prefetch path.

    This is preparatory work for mmio page fault support

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • If the page fault is caused by mmio, the gfn cannot be found in the
    memslots, and 'bad_pfn' is returned on the gfn_to_hva path, so we can use
    'bad_pfn' to identify the mmio page fault.
    Also, to clarify the meaning of the mmio pfn, we return the fault page
    instead of the bad page when the gfn is not allowed to be prefetched

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • The idea is from Avi:
    | Maybe it's time to kill off bypass_guest_pf=1. It's not as effective as
    | it used to be, since unsync pages always use shadow_trap_nonpresent_pte,
    | and since we convert between the two nonpresent_ptes during sync and unsync.

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Split kvm_mmu_free_page into kvm_mmu_isolate_page and
    kvm_mmu_free_page

    One is used to remove the page from the cache under the mmu lock, and the
    other is used to free the page table outside the mmu lock

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Move the counting of used shadow pages from the committing path to the
    preparing path to reduce TLB flushes on some paths

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • If 'pt_write' is true, we need to emulate the fault. In a later patch, we
    need to emulate the fault even though it is not a pt_write event, so rename
    it to better fit the meaning

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • gw->pte_access is the final access permission, since it is unified with
    gw->pt_access when we walked guest page table:

    FNAME(walk_addr_generic):
    pte_access = pt_access & FNAME(gpte_access)(vcpu, pte, true);

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • If the dirty bit is not set, we can make the pte access read-only to avoid
    handling the dirty bit everywhere

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • If the page fault is caused by mmio, we can cache the mmio info. Later,
    when we emulate the mmio instruction, we do not need to walk the guest page
    table and can quickly tell that it is an mmio fault

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Introduce vcpu_mmio_gva_to_gpa to translate the gva to a gpa; we can use it
    to clean up the code between read emulation and write emulation

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Set slot bitmap only if the spte is present

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • Properly check the last mapping, and do not walk to the next level if last spte
    is met

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Avi Kivity

    Xiao Guangrong
     
  • This patch implements the kvm bits of the steal time infrastructure.
    The most important part of it is the steal time clock. It is a
    continuous clock that shows the accumulated amount of steal time
    since vcpu creation. It is supposed to survive cpu offlining/onlining.
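
    As a rough model of what "accumulated amount of steal time" means, the
    sketch below keeps a running total of the time a vcpu spent descheduled.
    The structure and function names are hypothetical and do not reflect the
    data structures or the host/guest interface used by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping for one vcpu; not the kernel's structure. */
struct steal_clock_sketch {
    uint64_t steal_ns;        /* total steal time since vcpu creation */
    uint64_t descheduled_at;  /* timestamp when the vcpu last lost the cpu */
    bool descheduled;
};

/* Record the moment the vcpu stops running although it wanted to run. */
void steal_clock_deschedule(struct steal_clock_sketch *c, uint64_t now_ns)
{
    if (!c->descheduled) {
        c->descheduled_at = now_ns;
        c->descheduled = true;
    }
}

/* On resume, add the elapsed interval to the running total. */
void steal_clock_resume(struct steal_clock_sketch *c, uint64_t now_ns)
{
    if (c->descheduled) {
        c->steal_ns += now_ns - c->descheduled_at;
        c->descheduled = false;
    }
}

/* The total only ever grows; nothing in this sketch resets it. */
uint64_t steal_clock_read(const struct steal_clock_sketch *c)
{
    return c->steal_ns;
}
```

    Because the total is never reset, a reader always sees a continuous,
    non-decreasing value.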

    [marcelo: fix build with CONFIG_KVM_GUEST=n]

    Signed-off-by: Glauber Costa
    Acked-by: Rik van Riel
    Tested-by: Eric B Munson
    CC: Jeremy Fitzhardinge
    CC: Peter Zijlstra
    CC: Avi Kivity
    CC: Anthony Liguori
    Signed-off-by: Avi Kivity
    Signed-off-by: Marcelo Tosatti

    Glauber Costa
     
  • Provide additional information on SIGTRAP by using a sig_info signal.
    Use TRAP_BRKPT for breakpoints via illegal operation and TRAP_HWBKPT
    for breakpoints via program event recording. Provide the address of
    the instruction that caused the breakpoint via si_addr.
    While we are at it get rid of tracehook_consider_fatal_signal.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • This patch extends the DASD statistics to allow for a more detailed
    analysis of DASD I/O operations. In particular we want the statistics
    to provide answers to the following questions:
    - How many requests used a PAV alias?
    - How many requests used High Performance FICON?
    - How do read requests perform versus write requests?

    The existing DASD statistics interface has several shortcomings:
    - The interface for global data is a formatted text table in procfs
    (/proc/dasd/statistics). The layout is meant for human readers and
    is not too easy to parse. If values get too large for the table
    layout, they get scaled down.
    - The statistics which are collected per block device can be
    accessed via an ioctl interface, which can only be extended by
    defining a new ioctl.
    - There is no statistics interface for individual PAV base and alias
    devices.

    To overcome these shortcomings we create a new DASD statistics
    interface in debugfs. This interface will contain one entry for global
    data, one per DASD block device, and one per DASD base and alias
    device. Each file contains the statistic data in easy to parse
    name/value and name/array pairs. The existing interfaces will remain
    functional, but they will not be extended.
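
    To illustrate why name/value and name/array pairs are easy to consume,
    here is a small parser sketch. The exact file format is an assumption
    (one whitespace-separated "name v0 v1 ..." record per line), and the
    function name is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Parse one line of the assumed form "name v0 v1 ...".
 * Copies the name into 'name' and up to 'max_vals' numbers into 'vals'.
 * Returns the number of values found, or -1 if the name is missing or
 * does not fit into the buffer.
 */
int parse_stat_line_sketch(const char *line, char *name, size_t name_len,
                           unsigned long *vals, int max_vals)
{
    size_t n = strcspn(line, " \t\n");
    const char *p;
    int count = 0;

    if (n == 0 || n >= name_len)
        return -1;
    memcpy(name, line, n);
    name[n] = '\0';

    p = line + n;
    while (count < max_vals) {
        char *end;
        unsigned long v = strtoul(p, &end, 10);

        if (end == p)
            break;          /* no further number on this line */
        vals[count++] = v;
        p = end;
    }
    return count;
}
```

    A single value and an array are handled by the same loop, so no
    table-layout knowledge is needed on the reader's side.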

    Signed-off-by: Stefan Weinhuber
    Signed-off-by: Martin Schwidefsky

    Stefan Weinhuber
     
  • SIGP emerg needs to pass the source vcpu address into __LC_CPU_ADDRESS of the
    target guest.

    Signed-off-by: Christian Ehrhardt
    Signed-off-by: Martin Schwidefsky

    Christian Ehrhardt
     
  • The cpu measurement alerts that are used for instance by oprofile
    for hardware sampling are not turned off on a cpu that is going
    offline. Add the appropriate control register bit that should be
    disabled to the list.

    Signed-off-by: Jan Glauber
    Signed-off-by: Martin Schwidefsky

    Jan Glauber