22 Jul, 2015

6 commits

  • commit f9bb48825a6b5d02f4cabcc78967c75db903dcdc upstream.

    This allows for better documentation in the code and
    it allows for a simpler and fully correct version of
    fs_fully_visible to be written.

    The mount points converted and their filesystems are:
    /sys/hypervisor/s390/ s390_hypfs
    /sys/kernel/config/ configfs
    /sys/kernel/debug/ debugfs
    /sys/firmware/efi/efivars/ efivarfs
    /sys/fs/fuse/connections/ fusectl
    /sys/fs/pstore/ pstore
    /sys/kernel/tracing/ tracefs
    /sys/fs/cgroup/ cgroup
    /sys/kernel/security/ securityfs
    /sys/fs/selinux/ selinuxfs
    /sys/fs/smackfs/ smackfs

    Acked-by: Greg Kroah-Hartman
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 87d2846fcf88113fae2341da1ca9a71f0d916f2c upstream.

    Add two functions sysfs_create_mount_point and
    sysfs_remove_mount_point that hang a permanently empty directory off
    of a kobject or remove a permanently empty directory hanging from a
    kobject. Export these new functions so modular filesystems can use
    them.

    Acked-by: Greg Kroah-Hartman
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit ea015218f2f7ace2dad9cedd21ed95bdba2886d7 upstream.

    Add a new function kernfs_create_empty_dir that can be used to create
    a directory that cannot be modified.

    Update the code to use make_empty_dir_inode when reporting a
    permanently empty directory to the vfs.

    Update the code to not allow adding to permanently empty directories.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit eb6d38d5427b3ad42f5268da0f1dd31bb0af1264 upstream.

    Add a new function proc_create_mount_point that creates a
    directory that cannot be added to.

    Add a new function is_empty_pde to test if a proc directory entry is
    a mount point.

    Update the code to use make_empty_dir_inode when reporting
    a permanently empty directory to the vfs.

    Update the code to not allow adding to permanently empty directories.

    Update /proc/openprom and /proc/fs/nfsd to be permanently empty directories.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit f9bd6733d3f11e24f3949becf277507d422ee1eb upstream.

    Add a magic sysctl table sysctl_mount_point that when used to
    create a directory forces that directory to be permanently empty.

    Update the code to use make_empty_dir_inode when accessing permanently
    empty directories.

    Update the code to not allow adding to permanently empty directories.

    Update /proc/sys/fs/binfmt_misc to be a permanently empty directory.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit fbabfd0f4ee2e8847bf56edf481249ad1bb8c44d upstream.

    To ensure it is safe to mount proc and sysfs I need to check if
    filesystems that are mounted on top of them are mounted on truly empty
    directories. Given that some directories can gain entries over time,
    knowing that a directory is empty right now is insufficient.

    Therefore add supporting infrastructure for permanently empty
    directories that proc and sysfs can use when they create mount points
    for filesystems and fs_fully_visible can use to test for permanently
    empty directories to ensure that nothing will be gained by mounting a
    fresh copy of proc or sysfs.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
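
The idea behind permanently empty directories can be modelled in a few lines of userspace C. This is only a sketch of the concept, not kernel code: `struct model_dir`, `make_mount_point`, `model_add_entry`, `safe_to_ignore_mount_on` and the `-1` error value are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a directory; not a kernel structure. */
struct model_dir {
    bool permanently_empty; /* decided at creation time, never cleared */
    size_t nr_entries;
};

/* Create a directory that exists only to be mounted on. */
static struct model_dir make_mount_point(void)
{
    struct model_dir d = { .permanently_empty = true, .nr_entries = 0 };
    return d;
}

/* Refuse to add entries to a permanently empty directory.
 * The -1 return value is a stand-in, not the kernel's error code. */
static int model_add_entry(struct model_dir *d)
{
    assert(d != NULL);
    if (d->permanently_empty)
        return -1;
    d->nr_entries++;
    return 0;
}

/* "Empty right now" is not enough; only the permanent flag is trusted. */
static bool safe_to_ignore_mount_on(const struct model_dir *d)
{
    return d->permanently_empty;
}
```

An ordinary directory with zero entries fails `safe_to_ignore_mount_on` in this model, because it could gain entries later.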
     

11 Jul, 2015

34 commits

  • Greg Kroah-Hartman
     
  • commit e4f95517f18271b1da36cfc5d700e46844396d6e upstream.

    Add last missing line in commit "cdd9eefdf905"
    ("fs/ufs: restore s_lock mutex")

    Signed-off-by: Fabian Frederick
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Fabian Frederick
     
  • commit 514d748f69c97a51a2645eb198ac5c6218f22ff9 upstream.

    Commit e4502c63f56aeca88 (ufs: deal with nfsd/iget races) made ufs
    create inodes with I_NEW flag set. However ufs_mkdir() never cleared
    this flag. Thus if someone ever tried to look up the directory by inode
    number, he would deadlock waiting for I_NEW to be cleared. Luckily this
    mostly happens only if the filesystem is exported over NFS since
    otherwise we have the inode attached to dentry and don't look it up by
    inode number. In rare cases dentry can get freed without inode being
    freed and then we'd hit the deadlock even without NFS export.

    Fix the problem by clearing I_NEW before instantiating new directory
    inode.

    Fixes: e4502c63f56aeca887ced37f24e0def1ef11cec8
    Reported-by: Fabian Frederick
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit 12ecbb4b1d765a5076920999298d9625439dbe58 upstream.

    Commit e4502c63f56aeca88 (ufs: deal with nfsd/iget races) introduced
    unlock_new_inode() call into ufs_add_nondir(). However that function
    gets called also from ufs_link() which hands it already initialized
    inode and thus unlock_new_inode() complains. The problem is harmless but
    annoying.

    Fix the problem by open-coding the necessary steps in ufs_link().

    Fixes: e4502c63f56aeca887ced37f24e0def1ef11cec8
    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit ceeb0e5d39fcdf4dca2c997bf225c7fc49200b37 upstream.

    Limit the mounts fs_fully_visible considers to locked mounts.
    Unlocked mounts can always be unmounted, so considering them adds
    hassle but no security benefit.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 93e3bce6287e1fb3e60d3324ed08555b5bbafa89 upstream.

    The warning message in prepend_path is unclear and outdated. It was
    added as a warning that the mechanism for generating names of pseudo
    files had been removed from prepend_path and d_dname should be used
    instead. Unfortunately the warning reads like a general warning,
    making it unclear what to do with it.

    Remove the warning. The transition it was added to warn about is long
    over, and I added code several years ago which in rare cases causes
    the warning to fire on legitimate code, and the warning is now firing
    and scaring people for no good reason.

    Reported-by: Ivan Delalande
    Reported-by: Omar Sandoval
    Fixes: f48cfddc6729e ("vfs: In d_path don't call d_dname on a mount point")
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit cdd9eefdf905e92e7fc6cc393314efe68dc6ff66 upstream.

    Commit 0244756edc4b98c ("ufs: sb mutex merge + mutex_destroy") generated
    deadlocks in read/write mode on mkdir.

    This patch partially reverts it, keeping the fixes by Andrew Morton and
    mutex_destroy().

    [AV: fixed a missing bit in ufs_remount()]

    Signed-off-by: Fabian Frederick
    Reported-by: Ian Campbell
    Suggested-by: Jan Kara
    Cc: Ian Campbell
    Cc: Evgeniy Dushistov
    Cc: Alexey Khoroshilov
    Cc: Roger Pau Monne
    Cc: Ian Jackson
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Fabian Frederick
     
  • commit 13b987ea275840d74d9df9a44326632fab1894da upstream.

    This reverts commit 9ef7db7f38d0 ("ufs: fix deadlocks introduced by sb
    mutex merge"). That patch tried to fix commit 0244756edc4b98c ("ufs: sb
    mutex merge + mutex_destroy"), which is itself partially reverted due to
    multiple deadlocks.

    Signed-off-by: Fabian Frederick
    Suggested-by: Jan Kara
    Cc: Ian Campbell
    Cc: Evgeniy Dushistov
    Cc: Alexey Khoroshilov
    Cc: Roger Pau Monne
    Cc: Ian Jackson
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Fabian Frederick
     
  • commit 2426f3910069ed47c0cc58559a6d088af7920201 upstream.

    file_remove_suid() could mistakenly set the S_NOSEC inode bit when root
    was modifying the file. As a result, subsequent writes to the file by an
    ordinary user would not clear the suid or sgid bits.

    Fix the bug by checking actual mode bits before setting S_NOSEC.

    Signed-off-by: Jan Kara
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit 42720138b06301cc8a7ee8a495a6d021c4b6a9bc upstream.

    Writes were a bit racy, but hard to turn into a bug at the same time.
    (Particularly because modern Linux doesn't use this feature anymore.)

    Signed-off-by: Radim Krčmář
    [Actually the next patch makes it much, much easier to trigger the race
    so I'm including this one for stable@ as well. - Paolo]
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Radim Krčmář
     
  • commit db1385624c686fe99fe2d1b61a36e1537b915d08 upstream.

    Legacy NMI watchdog didn't work after migration/resume, because
    vapics_in_nmi_mode was left at 0.

    Signed-off-by: Radim Krčmář
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Radim Krčmář
     
  • commit 4839ddc27b7212ec58874f62c97da7400c8523be upstream.

    Commit fd1d0ddf2ae9 (KVM: arm/arm64: check IRQ number on userland
    injection) rightly limited the range of interrupts userspace can
    inject in a guest, but failed to consider the (unlikely) case where
    a guest is configured with 1024 interrupts.

    In this case, interrupts ranging from 1020 to 1023 are unusable,
    as they have a special meaning for the GIC CPU interface.

    Make sure that these numbers cannot be used as an IRQ. Also delete
    a redundant (and similarly buggy) check in kvm_set_irq.

    Reported-by: Peter Maydell
    Cc: Andre Przywara
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Marc Zyngier
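
The range check described above might look roughly like this in isolation. The sketch assumes shared interrupts start at ID 32; the macro and function names are made up for illustration and are not the identifiers used in the KVM sources.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_FIRST_SPI      32u   /* assumed: IDs below this are per-CPU */
#define MODEL_FIRST_RESERVED 1020u /* 1020..1023 are special on the CPU interface */

/* Accept a shared interrupt number only if it is below both the
 * guest's configured interrupt count and the reserved range. */
static bool model_spi_is_injectable(unsigned int irq, unsigned int nr_irqs)
{
    unsigned int limit = nr_irqs < MODEL_FIRST_RESERVED
                             ? nr_irqs : MODEL_FIRST_RESERVED;

    return irq >= MODEL_FIRST_SPI && irq < limit;
}
```

With `nr_irqs` set to 1024, the function still rejects 1020 through 1023, which is the corner case the commit addresses.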
     
  • commit 431dae778aea4eed31bd12e5ee82edc571cd4d70 upstream.

    Eric noticed problems with vhost-scsi and virtio-ccw: vhost-scsi
    complained about overwriting values in the config space, which
    was triggered by a broken implementation of virtio-ccw's config
    get/set routines. It was probably sheer luck that we did not hit
    this before.

    When writing a value to the config space, the WRITE_CONF ccw will
    always write from the beginning of the config space up to and
    including the value to be set. If the config space up to the value
    has not yet been retrieved from the device, however, we'll end up
    overwriting values. Keep track of the known config space and update
    if needed to avoid this.

    Moreover, READ_CONF will only read the number of bytes it has been
    instructed to retrieve, so we must not copy more than that to the
    buffer, or we might overwrite trailing values.

    Reported-by: Eric Farman
    Signed-off-by: Cornelia Huck
    Reviewed-by: Eric Farman
    Tested-by: Eric Farman
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Cornelia Huck
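
The "keep track of the known config space" idea can be illustrated with a small userspace model. Everything here (`struct model_dev`, `CFG_SIZE`, the `model_*` helpers) is hypothetical; it only mimics the two transfer behaviours the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFG_SIZE 16

/* Hypothetical stand-in for the device plus the driver's cached view. */
struct model_dev {
    uint8_t device_cfg[CFG_SIZE]; /* what the device really holds */
    uint8_t cache[CFG_SIZE];      /* driver's copy of the config space */
    size_t known;                 /* leading bytes of cache known valid */
};

/* Read-style transfer: copies exactly len leading bytes, no more. */
static void model_read_conf(struct model_dev *d, size_t len)
{
    memcpy(d->cache, d->device_cfg, len);
    if (len > d->known)
        d->known = len;
}

/* Write-style transfer: always sends bytes [0, len) of the cache. */
static void model_write_conf(struct model_dev *d, size_t len)
{
    memcpy(d->device_cfg, d->cache, len);
}

/* Set len bytes at offset off without clobbering the unseen prefix. */
static void model_set(struct model_dev *d, size_t off,
                      const void *buf, size_t len)
{
    assert(off + len <= CFG_SIZE);
    if (d->known < off)          /* prefix not retrieved yet: fetch it */
        model_read_conf(d, off);
    memcpy(d->cache + off, buf, len);
    if (off + len > d->known)
        d->known = off + len;
    model_write_conf(d, off + len);
}
```

Without the refresh step in `model_set`, the write would push stale zeroes from the cache over the device's first `off` bytes.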
     
  • commit 3c8e5105e759e7b2d88ea8a85b1285e535bc7500 upstream.

    The REGSET_VX_LOW ELF notes should contain the lower 64-bit halves of the
    first sixteen 128-bit vector registers. Unfortunately, we currently copy
    the upper halves.

    Fix this and correctly copy the lower halves.

    Fixes: a62bc0739253 ("s390/kdump: add support for vector extension")
    Signed-off-by: Michael Holzheu
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Michael Holzheu
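
As an illustration of which half is meant, here is a small sketch with a hypothetical register layout. `struct model_vreg` and `model_copy_vx_low` are invented for this example and do not reflect the kernel's actual data structures.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_LOW_VREGS 16

/* Hypothetical 128-bit register split into two 64-bit halves. */
struct model_vreg {
    uint64_t high; /* most significant 64 bits */
    uint64_t low;  /* least significant 64 bits */
};

/* Fill the note payload with the lower half of each of the first
 * sixteen registers (the bug described above copied the upper half). */
static void model_copy_vx_low(uint64_t out[NUM_LOW_VREGS],
                              const struct model_vreg regs[NUM_LOW_VREGS])
{
    for (int i = 0; i < NUM_LOW_VREGS; i++)
        out[i] = regs[i].low;
}
```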
     
  • commit b035b60ded132592055c0f9bd1cc280259c7de4b upstream.

    Currently all backward jumps crash for JITed s390x eBPF programs
    with an illegal instruction program check and kernel panic. For
    negative values, the opcode of the jump instruction is overridden
    by the negative branch offset, so the JIT generates an illegal
    instruction:

    000003ff802da378: c01100000002 lgfi %r1,2
    000003ff802da37e: fffffff52065 unknown
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Michael Holzheu
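
The commit message is cut off before it describes the fix, so the following is only a guess at the kind of change involved: masking the relative offset to its field width before OR-ing it into the instruction word. The 16-bit field width, the example opcode word and both function names are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Unmasked variant: a negative offset sign-extends to 32 bits and
 * the OR overwrites the opcode bits in the upper half of the word. */
static uint32_t model_emit_unmasked(uint32_t op, int32_t rel)
{
    return op | (uint32_t)rel;
}

/* Masked variant: only the low 16 bits carry the offset, so the
 * opcode in the upper 16 bits survives a negative offset. */
static uint32_t model_emit_masked(uint32_t op, int32_t rel)
{
    return op | ((uint32_t)rel & 0xffffu);
}
```

For an arbitrary opcode word `0xa7f40000` and an offset of -11, the unmasked variant yields `0xfffffff5` while the masked one yields `0xa7f4fff5`.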
     
  • commit f2ae45edbca7ba5324eef01719ede0151dc5cead upstream.

    commit 6d3da24141 ("KVM: s390: deliver floating interrupts in order
    of priority") introduced a regression for the reset handling.

    We don't clear the bitmap of pending floating interrupts
    and interrupt parameters. This could result in stale interrupts
    even after a reset. Let's fix this by clearing the pending bitmap
    and the parameters for service and machine check interrupts.

    Signed-off-by: Jens Freimann
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Jens Freimann
     
  • commit b938eacea0b6881f2116a061e6da3ec840e75137 upstream.

    Commit ea5f49692575 ("KVM: s390: only one external call may be pending
    at a time") introduced a bug on machines that don't have SIGP
    interpretation facility installed.
    The injection of an external call will now always fail with -EBUSY
    (if none is already pending).

    This leads to the following symptoms:
    - An external call will be injected but with the wrong "src cpu id",
    as this id will not be remembered.
    - The target vcpu will not be woken up, therefore the guest will hang if
    it cannot deal with unexpected failures of the SIGP EXTERNAL CALL
    instruction.
    - If an external call is already pending, -EBUSY will not be reported.

    Reviewed-by: Christian Borntraeger
    Reviewed-by: Jens Freimann
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Greg Kroah-Hartman

    David Hildenbrand
     
  • commit 8e748c8d09a9314eedb5c6367d9acfaacddcdc88 upstream.

    KVM guest kernels for trap & emulate run in user mode, with a modified
    set of kernel memory segments. However the fixmap address is still in
    the normal KSeg3 region at 0xfffe0000 regardless, causing problems when
    cache alias handling makes use of them when handling copy on write.

    Therefore define FIXADDR_TOP as 0x7ffe0000 in the guest kernel mapped
    region when CONFIG_KVM_GUEST is defined.

    Signed-off-by: James Hogan
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/9887/
    Signed-off-by: Ralf Baechle
    Signed-off-by: Greg Kroah-Hartman

    James Hogan
     
  • commit 69a1220060c1523fd0515216eaa29e22f133b894 upstream.

    The argument to KVM_GET_DIRTY_LOG is a memslot id; it may not match the
    position in the memslots array, which is sorted by gfn.

    Reviewed-by: James Hogan
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
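
The distinction between a slot's id and its array position can be sketched as follows. `struct model_memslot` and `model_id_to_memslot` are hypothetical names; the real KVM code uses its own lookup helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down memslot. */
struct model_memslot {
    int id;
    uint64_t base_gfn;
};

/* Find a slot by id. Indexing the array with the id would be wrong,
 * because the array order is determined by gfn, not by id. */
static const struct model_memslot *
model_id_to_memslot(const struct model_memslot *slots, size_t n, int id)
{
    for (size_t i = 0; i < n; i++)
        if (slots[i].id == id)
            return &slots[i];
    return NULL;
}
```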
     
  • commit 1dace0116d0b05c967d94644fc4dfe96be2ecd3d upstream.

    The Foxconn K8M890-8237A has two PCI host bridges, and we can't assign
    resources correctly without the information from _CRS that tells us which
    address ranges are claimed by which bridge. In the bugs mentioned below,
    we incorrectly assign a sound card address (this example is from 1033299):

    bus: 00 index 2 [mem 0x80000000-0xfcffffffff]
    ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
    pci_root PNP0A08:00: host bridge window [mem 0x80000000-0xbfefffff] (ignored)
    pci_root PNP0A08:00: host bridge window [mem 0xc0000000-0xdfffffff] (ignored)
    pci_root PNP0A08:00: host bridge window [mem 0xf0000000-0xfebfffff] (ignored)
    ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff])
    pci_root PNP0A08:01: host bridge window [mem 0xbff00000-0xbfffffff] (ignored)
    pci 0000:80:01.0: [1106:3288] type 0 class 0x000403
    pci 0000:80:01.0: reg 10: [mem 0xbfffc000-0xbfffffff 64bit]
    pci 0000:80:01.0: address space collision: [mem 0xbfffc000-0xbfffffff 64bit] conflicts with PCI Bus #00 [mem 0x80000000-0xfcffffffff]
    pci 0000:80:01.0: BAR 0: assigned [mem 0xfd00000000-0xfd00003fff 64bit]
    BUG: unable to handle kernel paging request at ffffc90000378000
    IP: [] azx_create+0x37c/0x822 [snd_hda_intel]

    We assigned 0xfd_0000_0000, but that is not in any of the host bridge
    windows, and the sound card doesn't work.

    Turn on pci=use_crs automatically for this system.

    Link: https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/931368
    Link: https://bugs.launchpad.net/ubuntu/+source/alsa-driver/+bug/1033299
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Greg Kroah-Hartman

    Bjorn Helgaas
     
  • commit 3d9fecf6bfb8b12bc2f9a4c7109895a2a2bb9436 upstream.

    We enable _CRS on all systems from 2008 and later. On older systems, we
    ignore _CRS and assume the whole physical address space (excluding RAM and
    other devices) is available for PCI devices, but on systems that support
    physical address spaces larger than 4GB, it's doubtful that the area above
    4GB is really available for PCI.

    After d56dbf5bab8c ("PCI: Allocate 64-bit BARs above 4G when possible"), we
    try to use that space above 4GB *first*, so we're more likely to put a
    device there.

    On Juan's Toshiba Satellite Pro U200, BIOS left the graphics, sound, 1394,
    and card reader devices unassigned (but only after Windows had been
    booted). Only the sound device had a 64-bit BAR, so it was the only device
    placed above 4GB, and hence the only device that didn't work.

    Keep _CRS enabled even on pre-2008 systems if they support physical address
    space larger than 4GB.

    Fixes: d56dbf5bab8c ("PCI: Allocate 64-bit BARs above 4G when possible")
    Reported-and-tested-by: Juan Dayer
    Reported-and-tested-by: Alan Horsfield
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=99221
    Link: https://bugzilla.opensuse.org/show_bug.cgi?id=907092
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Greg Kroah-Hartman

    Bjorn Helgaas
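
The resulting default could be summarised by a predicate like the one below. This is a simplified sketch: `model_use_crs` and its parameters are hypothetical, and it ignores cases such as an unknown BIOS date or per-machine quirks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether _CRS is honoured by default.
 * bios_year:     year of the BIOS date
 * max_phys_addr: highest physical address the system supports */
static bool model_use_crs(int bios_year, uint64_t max_phys_addr)
{
    if (bios_year >= 2008)
        return true;
    /* Older systems keep _CRS only if they can address more than 4GB. */
    return max_phys_addr > 0xffffffffULL;
}
```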
     
  • commit 72e349f1124a114435e599479c9b8d14bfd1ebcd upstream.

    When we take a PMU exception or a software event we call
    perf_read_regs(). This overloads regs->result with a boolean that
    describes if we should use the sampled instruction address register
    (SIAR) or the regs.

    If the exception is in kernel, we start with the kernel regs and
    backtrace through the kernel stack. At this point we switch to the
    userspace regs and backtrace the user stack with perf_callchain_user().

    Unfortunately these regs have not got the perf_read_regs() treatment,
    so regs->result could be anything. If it is non zero,
    perf_instruction_pointer() decides to use the SIAR, and we get issues
    like this:

    0.11% qemu-system-ppc [kernel.kallsyms] [k] _raw_spin_lock_irqsave
    |
    ---_raw_spin_lock_irqsave
    |
    |--52.35%-- 0
    | |
    | |--46.39%-- __hrtimer_start_range_ns
    | | kvmppc_run_core
    | | kvmppc_vcpu_run_hv
    | | kvmppc_vcpu_run
    | | kvm_arch_vcpu_ioctl_run
    | | kvm_vcpu_ioctl
    | | do_vfs_ioctl
    | | sys_ioctl
    | | system_call
    | | |
    | | |--67.08%-- _raw_spin_lock_irqsave
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Anton Blanchard
     
  • commit cc5a2f7b8f39e7db559778f7913a2410257b3e50 upstream.

    On some archs, the local clockevent device stops in deep cpuidle states.
    The broadcast framework is used to wakeup cpus in these idle states, in
    which either an external clockevent device is used to send wakeup ipis
    or the hrtimer broadcast framework kicks in in the absence of such a
    device. One cpu is nominated as the broadcast cpu and this cpu sends
    wakeup ipis to sleeping cpus at the appropriate time. This is the
    implementation in the oneshot mode of broadcast.

    In periodic mode of broadcast however, the presence of such cpuidle
    states results in the cpuidle driver calling tick_broadcast_enable()
    which shuts down the local clockevent devices of all the cpus and
    appoints the tick broadcast device as the clockevent device for each of
    them. This works on those archs where the tick broadcast device is a
    real clockevent device. But on archs which depend on the hrtimer mode
    of broadcast, the tick broadcast device happens to be a pseudo device.
    The consequence is that the local clockevent devices of all cpus are
    shut down and the kernel hangs at boot time in periodic mode.

    Let us thus not register the cpuidle states which have
    CPUIDLE_FLAG_TIMER_STOP flag set, on archs which depend on the hrtimer
    mode of broadcast in periodic mode. This patch takes care of doing this
    on powerpc. The cpus would not have entered into such deep cpuidle
    states in periodic mode on powerpc anyway. So there is no loss here.

    Signed-off-by: Preeti U Murthy
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    preeti
     
  • commit 2f5bc307be2480ba89e4c5d118f406f04a4a7299 upstream.

    The current Armada XP suspend to RAM implementation, as added in
    commit 27432825ae19f ("ARM: mvebu: Armada XP GP specific
    suspend/resume code") does not handle big-endian configurations
    properly: the small bit of assembly code putting the DRAM in
    self-refresh and toggling the GPIOs to turn off power forgets to
    convert the values to little-endian.

    This commit fixes that by making sure the two values we will write to
    the DRAM controller register and GPIO register are already in
    little-endian before entering the critical assembly code.

    Signed-off-by: Thomas Petazzoni
    Fixes: 27432825ae19f ("ARM: mvebu: Armada XP GP specific suspend/resume code")
    Signed-off-by: Greg Kroah-Hartman

    Thomas Petazzoni
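
Converting a value to little-endian ahead of time, independent of the CPU's byte order, can be done portably as sketched here. `model_cpu_to_le32` is a hypothetical userspace stand-in for the kind of conversion the commit describes, not the kernel helper itself.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return a 32-bit word whose in-memory byte layout is little-endian,
 * whatever the byte order of the machine running this code. */
static uint32_t model_cpu_to_le32(uint32_t v)
{
    uint8_t bytes[4] = {
        (uint8_t)(v & 0xffu),
        (uint8_t)((v >> 8) & 0xffu),
        (uint8_t)((v >> 16) & 0xffu),
        (uint8_t)((v >> 24) & 0xffu),
    };
    uint32_t out;

    memcpy(&out, bytes, sizeof out);
    return out;
}
```

Doing the conversion in C before entering a critical assembly sequence keeps that sequence free of endianness logic.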
     
  • commit 4d48edb3c3e1234d6b3fcdfb9ac24d7c6de449cb upstream.

    Commit 7232398abc6a ("ARM: tegra: Convert PMC to a driver") changed tegra_resume()
    location storing from late to early and, as a result, broke suspend on Tegra20.
    PMC scratch register 41 is used by the tegra LP1 resume code to retrieve the
    stored physical memory address of the common resume function and is at the same
    time used by tegra20_cpu_shutdown() (shared by the Tegra20 cpuidle driver and
    platform SMP code), which stores the CPU1 "resettable" status there. This implies
    a strict order of scratch register usage; otherwise the resume function address is
    lost on Tegra20 after disabling non-boot CPUs on suspend. Fix it by storing the "resettable" status in
    IRAM instead of PMC scratch register.

    Signed-off-by: Dmitry Osipenko
    Fixes: 7232398abc6a (ARM: tegra: Convert PMC to a driver)
    Signed-off-by: Thierry Reding
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Osipenko
     
  • commit e2d997366dc5b6c9d14035867f73957f93e7578c upstream.

    According to the PSCI specification and the SMC/HVC calling
    convention, PSCI function_ids that are not implemented must
    return NOT_SUPPORTED as return value.

    Current KVM implementation takes an unhandled PSCI function_id
    as an error and injects an undefined instruction into the guest
    if PSCI implementation is called with a function_id that is not
    handled by the resident PSCI version (ie it is not implemented),
    which is not the behaviour expected by a guest when calling a
    PSCI function_id that is not implemented.

    This patch fixes this issue by returning NOT_SUPPORTED whenever
    the kvm PSCI call is executed for a function_id that is not
    implemented by the PSCI kvm layer.

    Cc: Christoffer Dall
    Acked-by: Sudeep Holla
    Signed-off-by: Lorenzo Pieralisi
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Lorenzo Pieralisi
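
The behaviour after the fix amounts to a dispatcher with a well-defined default. The sketch below uses the PSCI 0.2 `PSCI_VERSION` function ID (0x84000000) and the `NOT_SUPPORTED` value (-1) as I understand them from the specification; `model_psci_call` and the macro names are hypothetical, and only one function is modelled.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PSCI_FN_VERSION    0x84000000u
#define MODEL_PSCI_NOT_SUPPORTED (-1L)

/* Handle a PSCI call. Unknown function IDs are not treated as an
 * error that faults the guest; they simply return NOT_SUPPORTED. */
static long model_psci_call(uint32_t function_id)
{
    switch (function_id) {
    case MODEL_PSCI_FN_VERSION:
        return 2; /* version 0.2: major in bits 31:16, minor in 15:0 */
    default:
        return MODEL_PSCI_NOT_SUPPORTED;
    }
}
```

A guest can then probe for optional functions and fall back gracefully instead of taking an undefined-instruction exception.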
     
  • commit 85e84ba31039595995dae80b277378213602891b upstream.

    On VM entry, we disable access to the VFP registers in order to
    perform a lazy save/restore of these registers.

    On VM exit, we restore access, test if we did enable them before,
    and save/restore the guest/host registers if necessary. In this
    sequence, the FPEXC register is always accessed, irrespective
    of the trapping configuration.

    If the guest didn't touch the VFP registers, then the HCPTR access
    has now enabled such access, but we're missing a barrier to ensure
    architectural execution of the new HCPTR configuration. If the HCPTR
    access has been delayed/reordered, the subsequent access to FPEXC
    will cause a trap, which we aren't prepared to handle at all.

    The same condition exists when trapping to enable VFP for the guest.

    The fix is to introduce a barrier after enabling VFP access. In the
    vmexit case, it can be relaxed to only take place if the guest hasn't
    accessed its view of the VFP registers, making the access to FPEXC safe.

    The set_hcptr macro is modified to deal with both vmenter/vmexit and
    vmtrap operations, and now takes an optional label that is branched to
    when the guest hasn't touched the VFP registers.

    Reported-by: Vikram Sethi
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Marc Zyngier
     
  • commit 9fc2b4b436cff7d8403034676014f1be9d534942 upstream.

    Before calling into the filesystem, vfs_setxattr calls
    security_inode_setxattr, which ends up calling selinux_inode_setxattr in
    our case. That returns -EOPNOTSUPP whenever SBLABEL_MNT is not set.
    SBLABEL_MNT was supposed to be set by sb_finish_set_opts, which sets it
    only if selinux_is_sblabel_mnt returns true.

    The selinux_is_sblabel_mnt logic was broken by eadcabc697e9 "SELinux: do
    all flags twiddling in one place", which didn't take into account
    the SECURITY_FS_USE_NATIVE behavior that had been introduced for nfs
    with eb9ae686507b "SELinux: Add new labeling type native labels".

    This caused setxattr's of security labels over NFSv4.2 to fail.

    Cc: Eric Paris
    Cc: David Quigley
    Reported-by: Richard Chan
    Signed-off-by: J. Bruce Fields
    Acked-by: Stephen Smalley
    [PM: added the stable dependency]
    Signed-off-by: Paul Moore
    Signed-off-by: Greg Kroah-Hartman

    J. Bruce Fields
     
  • commit 0dd23f94251f49da99a6cbfb22418b2d757d77d6 upstream.

    Commit 007bea098b86 (intel_pstate: Add setting voltage value for
    baytrail P states.) introduced byt_set_pstate() with the assumption that
    it would always be run by the CPU whose MSR is to be written by it. It
    turns out, however, that is not always the case in practice, so modify
    byt_set_pstate() to enforce the MSR write done by it to always happen on
    the right CPU.

    Fixes: 007bea098b86 (intel_pstate: Add setting voltage value for baytrail P states.)
    Signed-off-by: Joe Konno
    Acked-by: Kristen Carlson Accardi
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Joe Konno
     
  • commit 62a7f368ffbc13d9aedfdd7aeae711b177db69ac upstream.

    When dma mapping (dma_map_sg) fails in sdhci_pre_dma_transfer, -EINVAL
    is returned. There are 3 callers of sdhci_pre_dma_transfer:
    * sdhci_pre_req and sdhci_adma_table_pre: handle negative return
    * sdhci_prepare_data: handles 0 (error) and "else" (good) only

    sdhci_prepare_data is therefore broken. When it receives -EINVAL from
    sdhci_pre_dma_transfer, it assumes 1 sg mapping was mapped. Later,
    this non-existent mapping with address 0 is kmap'ped and written to:
    Corrupted low memory at ffff880000001000 (1000 phys) = 22b7d67df2f6d1cf
    Corrupted low memory at ffff880000001008 (1008 phys) = 63848a5216b7dd95
    Corrupted low memory at ffff880000001010 (1010 phys) = 330eb7ddef39e427
    Corrupted low memory at ffff880000001018 (1018 phys) = 8017ac7295039bda
    Corrupted low memory at ffff880000001020 (1020 phys) = 8ce039eac119074f
    ...

    So teach sdhci_prepare_data to understand negative return values from
    sdhci_pre_dma_transfer and disable DMA in that case, as well as for
    zero.

    The bug was introduced in 348487cb28e66b032bae1b38424d81bf5b444408 (mmc:
    sdhci: use pipeline mmc requests to improve performance). That commit
    also looks suspicious in that it assigns host->sg_count both in
    sdhci_pre_dma_transfer and in sdhci_adma_table_pre.

    Signed-off-by: Jiri Slaby
    Fixes: 348487cb28e6
    Cc: Ulf Hansson
    Cc: Haibo Chen
    Signed-off-by: Ulf Hansson
    Signed-off-by: Greg Kroah-Hartman

    Jiri Slaby
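
The difference between the two ways of interpreting the return value can be shown in isolation. Both helpers below are hypothetical; they only contrast a "non-zero means success" test with a "positive means success" test.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Flawed interpretation: anything other than 0 counts as success,
 * so a negative errno value is mistaken for a mapping count. */
static bool model_dma_ok_nonzero(int sg_cnt)
{
    return sg_cnt != 0;
}

/* Safer interpretation: only a positive number of mapped entries
 * allows DMA; zero and negative values both disable it. */
static bool model_dma_ok_positive(int sg_cnt)
{
    return sg_cnt > 0;
}
```

Feeding `-EINVAL` into the first helper reports success, which corresponds to the broken caller described in the commit message.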
     
  • commit 0b3fff54bc01e8e6064d222a33e6fa7adabd94cd upstream.

    Make sure that we are skipping over large PTEs while walking
    the page-table tree.

    Fixes: 5c34c403b723 ("iommu/amd: Fix memory leak in free_pagetable")
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Joerg Roedel
     
  • commit d38f0ff9ab35414644995bae187d015c31aae19c upstream.

    Commit 83a60ed8f0b5 ("iommu/arm-smmu: fix ARM_SMMU_FEAT_TRANS_OPS
    condition") accidentally negated the ID0_ATOSNS predicate in the ATOS
    feature check, causing the driver to attempt ATOS requests on SMMUv2
    hardware without the ATOS feature implemented.

    This patch restores the predicate to the correct value.

    Reported-by: Varun Sethi
    Signed-off-by: Will Deacon
    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Will Deacon
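
A negated predicate of this kind is easy to get wrong because the flag name itself is negative. The sketch below assumes that a set ID0_ATOSNS bit means the ATOS feature is not implemented and that the bit sits at position 26; both are assumptions for illustration, as are the function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ID0_ATOSNS (1u << 26) /* assumed bit position */

/* Intended check: the feature is usable when the
 * "not supported" bit is clear. */
static bool model_has_atos(uint32_t id0)
{
    return !(id0 & MODEL_ID0_ATOSNS);
}

/* Accidentally negated check: reports the feature exactly on
 * hardware that lacks it. */
static bool model_has_atos_negated(uint32_t id0)
{
    return (id0 & MODEL_ID0_ATOSNS) != 0;
}
```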
     
  • commit 69d9cd8c592f1abce820dbce7181bbbf6812cfbd upstream.

    This reverts commit 7291a932c6e27d9768e374e9d648086636daf61c.

    The conversion to be16_add_cpu() is incorrect in case cryptlen is
    negative due to premature (i.e. before addition / subtraction)
    implicit conversion of cryptlen (int -> u16) leading to sign loss.

    Cc: Wei Yongjun
    Signed-off-by: Horia Geanta
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Horia Geantă
     
  • commit 5fa7dadc898567ce14d6d6d427e7bd8ce6eb5d39 upstream.

    Fixes: 1d11911a8c57 ("crypto: talitos - fix warning: 'alg' may be used uninitialized in this function")
    Signed-off-by: Horia Geanta
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Horia Geantă