11 Feb, 2015

1 commit

  • When a request is made to unbind a device from a vfio bus driver,
    we need to wait for the device to become unused, i.e. for userspace
    to release the device. However, we have a long standing TODO in
    the code to do something proactive to make that happen. To enable
    this, we add a request callback on the vfio bus driver struct,
    which is intended to signal the user through the vfio device
    interface to release the device. Instead of passively waiting for
    the device to become unused, we can now pester the user to give
    it up.

    Signed-off-by: Alex Williamson

    Alex Williamson
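
As a rough illustration of the idea, and not the actual vfio code, the sketch below models a driver ops struct with an optional request callback that the unbind path calls on every pass until the user drops its reference. All names here (`demo_dev_ops`, `demo_device`, `demo_user_request`, `demo_wait_for_release`) are hypothetical, and a simple loop counter stands in for real sleeping and wakeups.

```c
#include <assert.h>
#include <stddef.h>

struct demo_device;

/* Hypothetical stand-in for a bus driver ops struct with an optional callback. */
struct demo_dev_ops {
    /* Ask the user to release the device; count is the number of earlier requests. */
    void (*request)(struct demo_device *dev, unsigned int count);
};

struct demo_device {
    const struct demo_dev_ops *ops;
    int refcount;             /* references held by userspace */
    unsigned int release_at;  /* toy user: gives up on this request number */
};

/* Toy "user" that releases the device once it has been asked often enough. */
static void demo_user_request(struct demo_device *dev, unsigned int count)
{
    if (count >= dev->release_at && dev->refcount > 0)
        dev->refcount--;
}

/*
 * Unbind path: instead of only waiting, call ->request() on every pass.
 * Returns the number of passes made, or -1 if max_tries passes were not enough.
 */
static int demo_wait_for_release(struct demo_device *dev, unsigned int max_tries)
{
    unsigned int count = 0;

    assert(dev != NULL);
    while (dev->refcount > 0) {
        if (count == max_tries)
            return -1;
        if (dev->ops && dev->ops->request)
            dev->ops->request(dev, count);
        count++;
        /* A real implementation would sleep here until woken or timed out. */
    }
    return (int)count;
}
```

A driver that does not provide the callback simply falls back to the old passive wait, which in this toy model shows up as the loop running out of tries.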
     

07 Feb, 2015

5 commits

  • Move the iommu_group reference from the device to the vfio_group.
    This ensures that the iommu_group persists as long as the vfio_group
    remains. This can be important if all of the devices from an
    iommu_group are removed, but we still have an outstanding vfio_group
    reference; we can still walk the empty list of devices.

    Signed-off-by: Alex Williamson

    Alex Williamson
     
  • There's a small window between the vfio bus driver calling
    vfio_del_group_dev() and the device being completely unbound where
    the vfio group appears to be non-viable. This creates a race for
    users like QEMU/KVM where the kvm-vfio module tries to get an
    external reference to the group in order to match and release an
    existing reference, while the device is potentially being removed
    from the vfio bus driver. If the group is momentarily non-viable,
    kvm-vfio may not be able to release the group reference until VM
    shutdown, making the group unusable until that point.

    Bridge the gap between device removal from the group and completion
    of the driver unbind by tracking it in a list. The device is added
    to the list before the bus driver reference is released and removed
    using the existing unbind notifier.

    Signed-off-by: Alex Williamson

    Alex Williamson
     
  • IOMMU operations can be expensive and it's not very difficult for a
    user to give us a lot of work to do for a map or unmap operation.
    Killing a large VM with vfio assigned devices can result in soft
    lockups and IOMMU tracing shows that we can easily spend 80% of our
    time with need-resched set. A sprinkling of cond_resched() calls
    after map and unmap calls has a very tiny effect on performance
    while resulting in traces with

    Alex Williamson
     
  • We currently map invalid and reserved pages, such as often occur from
    mapping MMIO regions of a VM through the IOMMU, using single pages.
    There's really no reason we can't instead follow the methodology we
    use for normal pages and find the largest possible physically
    contiguous chunk for mapping. The only difference is that we don't
    do locked memory accounting for these since they're not backed by RAM.

    In most applications this will be a very minor improvement, but when
    graphics and GPGPU devices are in play, MMIO BARs become non-trivial.

    Signed-off-by: Alex Williamson

    Alex Williamson
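
The chunking and accounting rule described in this commit can be sketched in user space as follows. This is a toy model under stated assumptions, not the vfio/type1 code: `toy_page` and `toy_pin_chunk` are hypothetical names, and a boolean flag stands in for the kernel's notion of an invalid or reserved page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy page descriptor: a page frame number plus a "reserved/invalid" flag. */
struct toy_page {
    uint64_t pfn;
    bool reserved;   /* true for MMIO-like pages that are not backed by RAM */
};

/*
 * Return the length of the physically contiguous run starting at `start`.
 * The run stops where the pfn sequence breaks or the page type changes.
 * Only RAM-backed runs are added to the locked-memory count.
 */
static size_t toy_pin_chunk(const struct toy_page *pages, size_t npages,
                            size_t start, size_t *locked)
{
    size_t n = 1;

    assert(start < npages);
    while (start + n < npages &&
           pages[start + n].pfn == pages[start].pfn + n &&
           pages[start + n].reserved == pages[start].reserved)
        n++;
    if (!pages[start].reserved)
        *locked += n;   /* reserved runs are mapped but never accounted */
    return n;
}
```

With this shape, a large MMIO BAR becomes one long run instead of one mapping per page, while the locked-memory counter only ever grows for RAM.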
     
  • When unmapping DMA entries we try to rely on the IOMMU API behavior
    that allows the IOMMU to unmap a larger area than requested, up to
    the size of the original mapping. This works great when the IOMMU
    supports superpages *and* they're in use. Otherwise, each PAGE_SIZE
    increment is unmapped separately, resulting in poor performance.

    Instead we can use the IOVA-to-physical-address translation provided
    by the IOMMU API and unmap using the largest contiguous physical
    memory chunk available, which is also how vfio/type1 would have
    mapped the region. For a synthetic 1TB guest VM mapping and shutdown
    test on Intel VT-d (2M IOMMU pagesize support), this achieves about
    a 30% overall improvement mapping standard 4K pages, regardless of
    IOMMU superpage enabling, and about a 40% improvement mapping 2M
    hugetlbfs pages when IOMMU superpages are not available. Hugetlbfs
    with IOMMU superpages enabled is effectively unchanged.

    Unfortunately the same algorithm does not work well on IOMMUs with
    fine-grained superpages, like AMD-Vi, costing about 25% extra since
    the IOMMU will automatically unmap any power-of-two contiguous
    mapping we've provided it. We add a routine and a domain flag to
    detect this feature, leaving AMD-Vi unaffected by this unmap
    optimization.

    Signed-off-by: Alex Williamson

    Alex Williamson
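
A minimal sketch of the unmap strategy described above, assuming a toy translation table in place of a real IOMMU: `toy_iova_to_phys` is a hypothetical stand-in for the IOMMU API's IOVA-to-physical lookup, and the other names are likewise invented for illustration. The code only counts how many unmap calls the chunked approach would issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u

/* Toy page table: phys[i] is the physical address backing IOVA page i. */
static uint64_t toy_iova_to_phys(const uint64_t *phys, size_t page)
{
    return phys[page];
}

/*
 * Starting at page `start`, return how many pages are physically contiguous,
 * so that they could be unmapped with a single call.
 */
static size_t toy_contig_pages(const uint64_t *phys, size_t npages, size_t start)
{
    size_t n = 1;
    uint64_t base;

    assert(start < npages);
    base = toy_iova_to_phys(phys, start);
    while (start + n < npages &&
           toy_iova_to_phys(phys, start + n) == base + (uint64_t)n * TOY_PAGE_SIZE)
        n++;
    return n;
}

/* Count the unmap calls needed when unmapping chunk by chunk. */
static size_t toy_count_unmap_calls(const uint64_t *phys, size_t npages)
{
    size_t calls = 0, page = 0;

    while (page < npages) {
        page += toy_contig_pages(phys, npages, page);
        calls++;
    }
    return calls;
}
```

For six pages laid out as runs of three, two, and one, this issues three unmap calls rather than six. The extra translation lookups are the cost that the commit reports as not paying off on IOMMUs with fine-grained superpages.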
     

02 Feb, 2015

6 commits

  • Linus Torvalds
     
  • Pull ARM SoC fixes from Olof Johansson:
    "One more week's worth of fixes. Worth pointing out here are:

    - A patch fixing detaching of iommu registrations when a device is
    removed -- earlier the ops pointer wasn't managed properly
    - Another set of Renesas boards get the same GIC setup fixup as
    others have in previous -rcs
    - Serial port aliases fixups for sunxi. We did the same to tegra but
    we caught that in time before the merge window due to more machines
    being affected. Here it took longer for anyone to notice.
    - A couple more DT tweaks on sunxi
    - A follow-up patch for the mvebu coherency disabling in last -rc
    batch"

    * tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
    arm: dma-mapping: Set DMA IOMMU ops in arm_iommu_attach_device()
    ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
    ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds
    ARM: mvebu: don't set the PL310 in I/O coherency mode when I/O coherency is disabled
    ARM: sunxi: dt: Fix aliases
    ARM: dts: sun4i: Add simplefb node with de_fe0-de_be0-lcd0-hdmi pipeline
    ARM: dts: sun6i: ippo-q8h-v5: Fix serial0 alias
    ARM: dts: sunxi: Fix usb-phy support for sun4i/sun5i

    Linus Torvalds
     
  • Pull input layer updates from Dmitry Torokhov:
    "Just a few quirks for PS/2 this time"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: elantech - add more Fujtisu notebooks to force crc_enabled
    Input: i8042 - add noloop quirk for Medion Akoya E7225 (MD98857)
    Input: synaptics - adjust min/max for Lenovo ThinkPad X1 Carbon 2nd

    Linus Torvalds
     
  • Commit 8eb23b9f35aa ("sched: Debug nested sleeps") added code to report
    on nested sleep conditions, which we generally want to avoid because the
    inner sleeping operation can re-set the thread state to TASK_RUNNING,
    but that will then cause the outer sleep loop to not actually sleep when it
    calls schedule.

    However, that's actually valid traditional behavior, with the inner
    sleep being some fairly rare case (like taking a sleeping lock that
    normally doesn't actually need to sleep).

    And the debug code would actually change the state of the task to
    TASK_RUNNING internally, which makes that kind of traditional and
    working code not work at all, because now the nested sleep doesn't just
    sometimes cause the outer one to not block, but will cause it to happen
    every time.

    In particular, it will cause the cardbus kernel daemon (pccardd) to
    basically busy-loop doing scheduling, converting a laptop into a heater,
    as reported by Bruno Prémont. But there may be other legacy uses of
    that nested sleep model in other drivers that are also likely to never
    get converted to the new model.

    This fixes both cases:

    - don't set TASK_RUNNING when the nested condition happens (note: even
    if WARN_ONCE() only _warns_ once, the return value isn't whether the
    warning happened, but whether the condition for the warning was true.
    So despite the warning only happening once, the "if (WARN_ON(..))"
    would trigger for every nested sleep).

    - in the cases where we knowingly disable the warning by using
    "sched_annotate_sleep()", don't change the task state (that is used
    for all core scheduling decisions), instead use '->task_state_change'
    that is used for the debugging decision itself.

    (Credit for the second part of the fix goes to Oleg Nesterov: "Can't we
    avoid this subtle change in behaviour DEBUG_ATOMIC_SLEEP adds?" with the
    suggested change to use 'task_state_change' as part of the test)

    Reported-and-bisected-by: Bruno Prémont
    Tested-by: Rafael J Wysocki
    Acked-by: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Ilya Dryomov
    Cc: Mike Galbraith
    Cc: Ingo Molnar
    Cc: Peter Hurley
    Cc: Davidlohr Bueso
    Signed-off-by: Linus Torvalds

    Linus Torvalds
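
The point about the return value of WARN_ONCE() can be shown with a small user-space imitation. This is only a sketch: `toy_warn_once` and `toy_branch_hits` are hypothetical names, and the function mimics the semantics described in the commit message (print at most once, return the condition) rather than reproducing the kernel macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Imitation of WARN_ONCE(): prints at most once, but returns the condition. */
static bool toy_warn_once(bool cond, bool *warned, int *printed)
{
    if (cond && !*warned) {
        *warned = true;
        (*printed)++;
        fprintf(stderr, "warning: nested sleep\n");
    }
    return cond;   /* "was the condition true?", not "did we print?" */
}

/* Count how often the body of "if (WARN_ONCE(...))" runs over n nested sleeps. */
static int toy_branch_hits(int n, int *printed)
{
    bool warned = false;
    int hits = 0;

    for (int i = 0; i < n; i++)
        if (toy_warn_once(true, &warned, printed))
            hits++;   /* the old debug code reset the task state here */
    return hits;
}
```

Over five nested sleeps the warning is printed once, yet the branch body runs five times, which is why a state change placed in that branch took effect on every nested sleep.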
     
  • Add two more Fujitsu LIFEBOOK models that also ship with the Elantech
    touchpad and don't work with crc_disabled to the quirk list.

    Signed-off-by: Rainer Koenig
    Cc: stable@vger.kernel.org
    Signed-off-by: Dmitry Torokhov

    Rainer Koenig
     
  • …ernel/git/horms/renesas into fixes

    Merge "Third Round of Renesas ARM Based SoC Fixes for v3.19" from Simon Horman:

    * Instantiate GIC from C board code in legacy builds on r8a7790 and r8a73a4

    * tag 'renesas-soc-fixes3-for-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
    ARM: shmobile: r8a7790: Instantiate GIC from C board code in legacy builds
    ARM: shmobile: r8a73a4: Instantiate GIC from C board code in legacy builds

    Signed-off-by: Olof Johansson <olof@lixom.net>

    Olof Johansson
     

01 Feb, 2015

1 commit

  • Pull i2c fixes from Wolfram Sang:
    "i2c driver bugfixes (s3c2410, slave-eeprom, sh_mobile), size
    regression "bugfix" (i2c slave), documentation bugfix (st).

    Also, one documentation update (da9063), so some devicetrees can now
    be verified"

    * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    i2c: sh_mobile: terminate DMA reads properly
    i2c: Only include slave support if selected
    i2c: s3c2410: fix ABBA deadlock by keeping clock prepared
    i2c: slave-eeprom: fix boundary check when using sysfs
    i2c: st: Rename clock reference to something that exists
    DT: i2c: Add devices handled by the da9063 MFD driver

    Linus Torvalds
     

31 Jan, 2015

11 commits

  • Pull char/misc driver fixes from Greg KH:
    "Here are two tiny patches, one fixing up the drivers/Kconfig file, and
    one adding a MAINTAINERS entry for the UIO git tree"

    * tag 'char-misc-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
    drivers/Kconfig: remove duplicate entry for soc
    MAINTAINERS: add git url entry for UIO

    Linus Torvalds
     
  • Pull staging tree fixes from Greg KH:
    "Here are two tiny staging tree fixes. One for the nvec driver to
    resolve a reported problem, and one to add a MAINTAINERS entry for the
    Android drivers"

    * tag 'staging-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
    MAINTAINERS: add Android driver entries
    staging: nvec: specify a platform-device base id

    Linus Torvalds
     
  • Pull USB fixes from Greg KH:
    "Here are some small USB fixes and quirk additions for 3.19-rc7.

    All have been in linux-next for a while with no reported problems"

    * tag 'usb-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
    USB: Add OTG PET device to TPL
    usb-storage/SCSI: blacklist FUA on JMicron 152d:2566 USB-SATA controller
    uas: Add no-report-opcodes quirk for Simpletech devices with id 4971:8017
    storage: Revise/fix quirk for 04E6:000F SCM USB-SCSI converter
    usb: phy: never defer probe in non-OF case
    usb: dwc2: call dwc2_is_controller_alive() under spinlock

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "Mostly tooling fixes, but also an event groups fix, two PMU driver
    fixes and a CPU model variant addition"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf: Tighten (and fix) the grouping condition
    perf/x86/intel: Add model number for Airmont
    perf/rapl: Fix crash in rapl_scale()
    perf/x86/intel/uncore: Move uncore_box_init() out of driver initialization
    perf probe: Fix probing kretprobes
    perf symbols: Introduce 'for' method to iterate over the symbols with a given name
    perf probe: Do not rely on map__load() filter to find symbols
    perf symbols: Introduce method to iterate symbols ordered by name
    perf symbols: Return the first entry with a given name in find_by_name method
    perf annotate: Fix memory leaks in LOCK handling
    perf annotate: Handle ins parsing failures
    perf scripting perl: Force to use stdbool
    perf evlist: Remove extraneous 'was' on error message

    Linus Torvalds
     
  • Pull btrfs fix from Chris Mason:
    "We have one more fix for btrfs in my for-linus branch - this was a bug
    in the new raid5/6 scrubbing support"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    btrfs: fix raid56 scrub failed in xfstests btrfs/072

    Linus Torvalds
     
  • Pull quota and UDF fix from Jan Kara:
    "A fix for UDF to properly free preallocated blocks and a fix for quota
    so that Q_GETQUOTA quotactl reports correct numbers for XFS filesystem
    (and similarly Q_XGETQUOTA quotactl works properly for other
    filesystems)"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    quota: Switch ->get_dqblk() and ->set_dqblk() to use bytes as space units
    udf: Release preallocation on last writeable close

    Linus Torvalds
     
  • Pull KVM fixes from Paolo Bonzini:
    "The ARM changes are largish, but not too scary. And a simple fix for
    x86 (bug introduced in 3.19)"

    (Paolo says these are the "Final" fixes. We'll see).

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: x86: check LAPIC presence when building apic_map
    arm/arm64: KVM: Use kernel mapping to perform invalidation on page fault
    arm/arm64: KVM: Invalidate data cache on unmap
    arm/arm64: KVM: Use set/way op trapping to track the state of the caches

    Linus Torvalds
     
  • Pull IOMMU fixes from Joerg Roedel:
    "Two small fixes for the Tegra GART IOMMU driver:

    - provide a .map_sg function for iommu_ops
    - do not register Tegra GART driver as a workaround because of issues
    with it when used from DRM code"

    * tag 'iommu-fixes-v3.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
    iommu/tegra: gart: Provide default ->map_sg() callback
    iommu/tegra: gart: Do not register with bus

    Linus Torvalds
     
  • Pull intel and dp mst drm fixes from Dave Airlie:
    "Intel had a few more fixes lined up and no point me sitting on them,
    along with a DP MST fix from Rob for a race at undock + vt switch"

    * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
    drm: fix fb-helper vs MST dangling connector ptrs (v2)
    drm/i915: BDW Fix Halo PCI IDs marked as ULT.
    drm/i915: Fix and clean BDW PCH identification
    drm/i915: Only fence tiled region of object.
    drm/i915: fix inconsistent brightness after resume
    drm/i915: Init PPGTT before context enable

    Linus Torvalds
     
  • Fix misspelled define.

    Fixes: 33692f27597f ("vm: add VM_FAULT_SIGSEGV handling support")
    Signed-off-by: Guenter Roeck
    Signed-off-by: Linus Torvalds

    Guenter Roeck
     
  • DMA read requests could miss proper termination, so two more bytes would
    have been read via PIO overwriting the end of the buffer with wrong
    data. Make DMA stop handling more readable while we are here.

    Signed-off-by: Wolfram Sang
    Signed-off-by: Wolfram Sang

    Wolfram Sang
     

30 Jan, 2015

15 commits

  • We forgot to re-check LAPIC after splitting the loop in commit
    173beedc1601 (KVM: x86: Software disabled APIC should still deliver
    NMIs, 2014-11-02).

    Signed-off-by: Radim Krčmář
    Fixes: 173beedc1601f51dae9d579aa7a414c5aa8f700b
    Signed-off-by: Paolo Bonzini

    Radim Krčmář
     
  • …t/kvmarm/kvmarm into kvm-master

    Second round of fixes for KVM/ARM for 3.19.

    Fixes memory corruption issues on APM platforms and swapping issues on
    DMA-coherent systems.

    Paolo Bonzini
     
  • misc i915 fixes, mostly all stable material as well.

    * tag 'drm-intel-fixes-2015-01-29' of git://anongit.freedesktop.org/drm-intel:
    drm/i915: BDW Fix Halo PCI IDs marked as ULT.
    drm/i915: Fix and clean BDW PCH identification
    drm/i915: Only fence tiled region of object.
    drm/i915: fix inconsistent brightness after resume
    drm/i915: Init PPGTT before context enable

    Dave Airlie
     
  • VT switch back/forth from console to xserver (for example) has potential
    to go horribly wrong if a dynamic DP MST connector ends up in the saved
    modeset that is restored when switching back to fbcon.

    When removing a dynamic connector, don't forget to clean up the saved
    state.

    v1: original
    v2: null out set->fb if no more connectors to avoid making i915 cranky

    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1184968
    Cc: stable@vger.kernel.org #v3.17+
    Signed-off-by: Rob Clark
    Signed-off-by: Dave Airlie

    Rob Clark
     
  • Pull device mapper fixes from Mike Snitzer:
    "One stable fix for a dm-cache 3.19-rc6 regression and one stable fix
    for dm-thin:

    - fix DM cache metadata open/lookup error paths to properly use
    ERR_PTR and IS_ERR (fixes: 3.19-rc6 "stable" commit 9b1cc9f251)

    - fix DM thin-provisioning to disallow userspace from sending
    messages to the thin-pool if the pool is in READ_ONLY or FAIL mode
    since no metadata changes are allowed in these modes"

    * tag 'dm-3.19-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
    dm thin: don't allow messages to be sent to a pool target in READ_ONLY or FAIL mode
    dm cache: fix missing ERR_PTR returns and handling

    Linus Torvalds
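
For readers unfamiliar with the ERR_PTR and IS_ERR convention mentioned in the dm-cache fix, here is a user-space sketch of the idea. The `toy_` helpers are hypothetical re-implementations written for illustration, assuming the usual scheme where small negative errno values are encoded in the top of the pointer range; `toy_lookup` and `toy_get` are invented callers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *toy_err_ptr(long err) { return (void *)(intptr_t)err; }

/* Recover the errno value from an error pointer. */
static long toy_ptr_err(const void *p) { return (long)(intptr_t)p; }

/* True if p lies in the last TOY_MAX_ERRNO bytes of the address space. */
static bool toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-TOY_MAX_ERRNO;
}

static int toy_table[4] = { 10, 20, 30, 40 };

/* Lookup that reports failure with an error pointer instead of NULL. */
static int *toy_lookup(int idx)
{
    if (idx < 0 || idx >= 4)
        return toy_err_ptr(-EINVAL);
    return &toy_table[idx];
}

/* Caller that checks with toy_is_err(); a plain NULL check would miss the error. */
static int toy_get(int idx, int *out)
{
    int *p = toy_lookup(idx);

    if (toy_is_err(p))
        return (int)toy_ptr_err(p);
    *out = *p;
    return 0;
}
```

The failure mode this guards against is mixing conventions: if the callee returns an error pointer while the caller only tests for NULL, the error value is dereferenced as if it were a valid address.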
     
  • Pull NFS client bugfixes from Trond Myklebust:
    "Highlights include:

    - Stable fix for a NFSv4.1 Oops on mount
    - Stable fix for an O_DIRECT deadlock condition
    - Fix an issue with submounted volumes and fake duplicate inode
    numbers"

    * tag 'nfs-for-3.19-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
    NFS: Fix use of nfs_attr_use_mounted_on_fileid()
    NFSv4.1: Fix an Oops in nfs41_walk_client_list
    nfs: fix dio deadlock when O_DIRECT flag is flipped

    Linus Torvalds
     
  • Pull Ceph fixes from Sage Weil:
    "These patches from Ilya finally squash a race condition with layered
    images that he's been chasing for a while"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    rbd: drop parent_ref in rbd_dev_unprobe() unconditionally
    rbd: fix rbd_dev_parent_get() when parent_overlap == 0

    Linus Torvalds
     
  • When handling a fault in stage-2, we need to resync I$ and D$, just
    to be sure we don't leave any old cache line behind.

    That's very good, except that we do so using the *user* address.
    Under heavy load (swapping like crazy), we may end up in a situation
    where the page gets mapped in stage-2 while being unmapped from
    userspace by another CPU.

    At that point, the DC/IC instructions can generate a fault, which
    we handle with kvm->mmu_lock held. The box quickly deadlocks, user
    is unhappy.

    Instead, perform this invalidation through the kernel mapping,
    which is guaranteed to be present. The box is much happier, and so
    am I.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • Let's assume a guest has created an uncached mapping, and written
    to that page. Let's also assume that the host uses a cache-coherent
    IO subsystem. Let's finally assume that the host is under memory
    pressure and starts to swap things out.

    Before this "uncached" page is evicted, we need to make sure
    we invalidate potential speculated, clean cache lines that are
    sitting there, or the IO subsystem is going to swap out the
    cached view, losing the data that has been written directly
    into memory.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • Trying to emulate the behaviour of set/way cache ops is fairly
    pointless, as there are too many ways we can end up missing stuff.
    Also, there are some system caches out there that simply ignore
    set/way operations.

    So instead of trying to implement them, let's convert it to VA ops,
    and use them as a way to re-enable the trapping of VM ops. That way,
    we can detect the point when the MMU/caches are turned off, and do
    a full VM flush (which is what the guest was trying to do anyway).

    This allows a 32bit zImage to boot on the APM thingy, and will
    probably help bootloaders in general.

    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     
  • Pull sound fixes from Takashi Iwai:
    "This batch ended up being larger than wished, but there is nothing to
    worry too much there.

    Most of commits are for ASoC, a compress NULL dereference fix, a fix
    for probe error handling, and the rest are device-specific fixes. In
    addition, we have a fix for a long-standing bug of seq-dummy driver,
    which just cuts off the buggy part in the end"

    * tag 'sound-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ALSA: seq-dummy: remove deadlock-causing events on close
    ASoC: omap-mcbsp: Correct CBM_CFS dai format configuration
    ASoC: soc-compress.c: fix NULL dereference
    ASoC: rt286: set the same format for dac and adc
    ASoC: wm8904: fix runtime warning
    ASoC: simple-card: Fix crash in asoc_simple_card_unref()
    ASoC: fsl: imx-wm8962: Set the card owner field
    ASoC: pcm512x: Fix DSP program selection
    ASoC: rt5677: Modify the behavior that updates the PLL parameter.
    ASoC: fsl_ssi: Fix irq error check
    ASoC: rockchip: i2s: applys rate symmetry for CPU DAI
    ASoC: Intel: Add NULL checks for the stream pointer
    ASoC: wm8960: Fix capture sample rate from 11250 to 11025
    ASoC: adi: Add missing return statement.
    ASoC: Intel: Don't change offset of block allocator during fixed allocate
    ASoC: ts3a227e: Check and report jack status at probe
    ASoC: fsl_esai: Fix incorrect xDC field width of xCCR registers

    Linus Torvalds
     
  • Pull final pin control fix from Linus Walleij:
    "A late pin control fix for the v3.19 series: The AT91 gpio controller
    would miss wakeup events, this single fix makes it work properly"

    [ "Final"? Yeah, I'll believe that once I've actually released 3.19 ;) - Linus ]

    * tag 'pinctrl-v3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
    pinctrl: at91: allow to have disabled gpio bank

    Linus Torvalds
     
  • The stack guard page error case has long incorrectly caused a SIGBUS
    rather than a SIGSEGV, but nobody actually noticed until commit
    fee7e49d4514 ("mm: propagate error from stack expansion even for guard
    page") because that error case was never actually triggered in any
    normal situations.

    Now that we actually report the error, people noticed the wrong signal
    that resulted. So far, only the test suite of libsigsegv seems to have
    actually cared, but there are real applications that use libsigsegv, so
    let's not wait for any of those to break.

    Reported-and-tested-by: Takashi Iwai
    Tested-by: Jan Engelhardt
    Acked-by: Heiko Carstens # "s390 still compiles and boots"
    Cc: linux-arch@vger.kernel.org
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Commit 4bb25789ed28228a ("arm: dma-mapping: plumb our iommu mapping ops
    into arch_setup_dma_ops") moved the setting of the DMA operations from
    arm_iommu_attach_device() to arch_setup_dma_ops() where the DMA
    operations to be used are selected based on whether the device is
    connected to an IOMMU. However, the IOMMU detection scheme requires the
    IOMMU driver to be ported to the new IOMMU of_xlate API. As no driver
    has been ported yet, this effectively breaks all IOMMU ARM users that
    depend on the IOMMU being handled transparently by the DMA mapping API.

    Fix this by restoring the setting of DMA IOMMU ops in
    arm_iommu_attach_device() and splitting the rest of the function into a
    new internal __arm_iommu_attach_device() function, called by
    arch_setup_dma_ops().

    Signed-off-by: Laurent Pinchart
    Acked-by: Will Deacon
    Tested-by: Heiko Stuebner
    Signed-off-by: Olof Johansson

    Laurent Pinchart
     
  • The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
    "you should SIGSEGV" error, because the SIGSEGV case was generally
    handled by the caller - usually the architecture fault handler.

    That results in lots of duplication - all the architecture fault
    handlers end up doing very similar "look up vma, check permissions, do
    retries etc" - but it generally works. However, there are cases where
    the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

    In particular, when accessing the stack guard page, libsigsegv expects a
    SIGSEGV. And it usually got one, because the stack growth is handled by
    that duplicated architecture fault handler.

    However, when the generic VM layer started propagating the error return
    from the stack expansion in commit fee7e49d4514 ("mm: propagate error
    from stack expansion even for guard page"), that now exposed the
    existing VM_FAULT_SIGBUS result to user space. And user space really
    expected SIGSEGV, not SIGBUS.

    To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
    duplicate architecture fault handlers about it. They all already have
    the code to handle SIGSEGV, so it's about just tying that new return
    value to the existing code, but it's all a bit annoying.

    This is the mindless minimal patch to do this. A more extensive patch
    would be to try to gather up the mostly shared fault handling logic into
    one generic helper routine, and long-term we really should do that
    cleanup.

    Just from this patch, you can generally see that most architectures just
    copied (directly or indirectly) the old x86 way of doing things, but in
    the meantime that original x86 model has been improved to hold the VM
    semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
    "newer" things, so it would be a good idea to bring all those
    improvements to the generic case and teach other architectures about
    them too.

    Reported-and-tested-by: Takashi Iwai
    Tested-by: Jan Engelhardt
    Acked-by: Heiko Carstens # "s390 still compiles and boots"
    Cc: linux-arch@vger.kernel.org
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
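
The per-architecture change described above amounts to tying a new fault code to the existing SIGSEGV path. The sketch below shows that shape in user space. It is not kernel code: the `TOY_FAULT_*` values are made up for illustration and are not the kernel's real flag values, and `toy_fault_to_signal` is a hypothetical helper.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical flag values for illustration only. */
#define TOY_FAULT_OOM     0x01u
#define TOY_FAULT_SIGBUS  0x02u
#define TOY_FAULT_SIGSEGV 0x04u

/*
 * Decide what an architecture fault handler should do with the generic
 * fault result: returns the signal to deliver, 0 for no error, or -1 for
 * the out-of-memory path.
 */
static int toy_fault_to_signal(unsigned int fault)
{
    if (fault & TOY_FAULT_OOM)
        return -1;
    if (fault & TOY_FAULT_SIGSEGV)
        return SIGSEGV;   /* new: reuse the handler's existing SIGSEGV code */
    if (fault & TOY_FAULT_SIGBUS)
        return SIGBUS;
    return 0;
}
```

Before the change, a handler with only the OOM and SIGBUS branches had no way to turn a generic-layer failure such as the stack guard page case into a SIGSEGV, which is how user space ended up seeing SIGBUS.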
     

29 Jan, 2015

1 commit

  • As of commit 9a1091ef0017c40a ("irqchip: gic: Support hierarchy irq
    domain."), the Lager legacy board support is known to be broken.

    The IRQ numbers of the GIC are now virtual, and no longer match the
    hardcoded hardware IRQ numbers in the legacy platform board code.

    To fix this issue specific to non-multiplatform r8a7790 and Lager:
    1) Instantiate the GIC from platform board code and also
    2) Skip over the DT arch timer as well as
    3) Force delay setup based on DT CPU frequency

    With these 3 fixes in place interrupts on Lager are now unbroken.

    Partially based on legacy GIC fix by Geert Uytterhoeven, thanks to
    him for the initial work.

    Signed-off-by: Magnus Damm
    Acked-by: Geert Uytterhoeven
    Signed-off-by: Simon Horman

    Magnus Damm