07 Feb, 2018

5 commits


06 Feb, 2018

10 commits

  • Pull more xfs updates from Darrick Wong:
    "As promised, here's a (much smaller) second pull request for the
    second week of the merge cycle. This time around we have a couple
    patches shutting off unsupported fs configurations, and a couple of
    cleanups.

    Last, we turn off EXPERIMENTAL for the reverse mapping btree, since
    the primary downstream user of that information (online fsck) is now
    upstream and I haven't seen any major failures in a few kernel
    releases.

    Summary:

    - Print scrub build status in the xfs build info.

    - Explicitly call out the remaining two scenarios where we don't
    support reflink and never have.

    - Remove EXPERIMENTAL tag from reverse mapping btree!"

    * tag 'xfs-4.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
    xfs: remove experimental tag for reverse mapping
    xfs: don't allow reflink + realtime filesystems
    xfs: don't allow DAX on reflink filesystems
    xfs: add scrub to XFS_BUILD_OPTIONS
    xfs: fix u32 type usage in sb validation function

    Linus Torvalds
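
The two unsupported reflink combinations called out above amount to a validation step that rejects the configuration up front. A minimal userspace sketch of that idea follows; `struct fs_config`, `validate_fs_config()` and the choice of `-EINVAL` are hypothetical illustrations, not the actual XFS structures or code paths.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical feature summary; not the real XFS superblock or mount data. */
struct fs_config {
    bool reflink;   /* filesystem was created with reflink enabled */
    bool realtime;  /* filesystem has a realtime device */
    bool dax;       /* DAX was requested for this mount */
};

/* Return 0 if the combination is supported, -EINVAL (assumed) otherwise. */
static int validate_fs_config(const struct fs_config *cfg)
{
    /* reflink + realtime has never been supported */
    if (cfg->reflink && cfg->realtime)
        return -EINVAL;

    /* DAX on a reflink filesystem has never been supported either */
    if (cfg->reflink && cfg->dax)
        return -EINVAL;

    return 0;
}
```

Rejecting the combination early gives the user a clear failure instead of undefined behaviour later on.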
     
  • Pull overlayfs updates from Miklos Szeredi:
    "This work from Amir adds NFS export capability to overlayfs. NFS
    exporting an overlay filesystem is a challenge because we want to keep
    track of any copy-up of a file or directory between encoding the file
    handle and decoding it.

    This is achieved by indexing copied up objects by lower layer file
    handle. The index is already used for hard links; this patchset
    extends the use to NFS file handle decoding"

    * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (51 commits)
    ovl: check ERR_PTR() return value from ovl_encode_fh()
    ovl: fix regression in fsnotify of overlay merge dir
    ovl: wire up NFS export operations
    ovl: lookup indexed ancestor of lower dir
    ovl: lookup connected ancestor of dir in inode cache
    ovl: hash non-indexed dir by upper inode for NFS export
    ovl: decode pure lower dir file handles
    ovl: decode indexed dir file handles
    ovl: decode lower file handles of unlinked but open files
    ovl: decode indexed non-dir file handles
    ovl: decode lower non-dir file handles
    ovl: encode lower file handles
    ovl: copy up before encoding non-connectable dir file handle
    ovl: encode non-indexed upper file handles
    ovl: decode connected upper dir file handles
    ovl: decode pure upper file handles
    ovl: encode pure upper file handles
    ovl: document NFS export
    vfs: factor out helpers d_instantiate_anon() and d_alloc_anon()
    ovl: store 'has_upper' and 'opaque' as bit flags
    ...

    Linus Torvalds
     
  • Pull remoteproc updates from Bjorn Andersson:
    "This contains a few bug fixes and a cleanup of the resource-table
    handling in the framework, which removes the need for drivers with no
    resource table to provide a fake one"

    * tag 'rproc-v4.16' of git://github.com/andersson/remoteproc:
    remoteproc: Reset table_ptr on stop
    remoteproc: Drop dangling find_rsc_table dummies
    remoteproc: Move resource table load logic to find
    remoteproc: Don't handle empty resource table
    remoteproc: Merge rproc_ops and rproc_fw_ops
    remoteproc: Clone rproc_ops in rproc_alloc()
    remoteproc: Cache resource table size
    remoteproc: Remove depricated crash completion
    virtio_remoteproc: correct put_device virtio_device.dev

    Linus Torvalds
     
  • Pull rpmsg updates from Bjorn Andersson:
    "This fixes a few issues found in the SMD and GLINK drivers and
    corrects the handling of SMD channels that are found in a
    (previously) unexpected state"

    * tag 'rpmsg-v4.16' of git://github.com/andersson/remoteproc:
    rpmsg: smd: Fix double unlock in __qcom_smd_send()
    rpmsg: glink: Fix missing mutex_init() in qcom_glink_alloc_channel()
    rpmsg: smd: Don't hold the tx lock during wait
    rpmsg: smd: Fail send on a closed channel
    rpmsg: smd: Wake up all waiters
    rpmsg: smd: Create device for all channels
    rpmsg: smd: Perform handshake during open
    rpmsg: glink: smem: Ensure ordering during tx
    drivers: rpmsg: remove duplicate includes
    remoteproc: qcom: Use PTR_ERR_OR_ZERO() in glink prob

    Linus Torvalds
     
  • Pull MMC host fixes from Ulf Hansson:

    - renesas_sdhi: Fix build error in case NO_DMA=y

    - sdhci: Implement a bounce buffer to address throughput regressions

    * tag 'mmc-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
    mmc: MMC_SDHI_{SYS,INTERNAL}_DMAC should depend on HAS_DMA
    mmc: sdhci: Implement an SDHCI-specific bounce buffer

    Linus Torvalds
     
  • Pull pwm updates from Thierry Reding:
    "The Meson PWM controller driver gains support for the AXG series and a
    minor bug is fixed for the STMPE driver.

    To round things off, the class is now set for PWM channels exported
    via sysfs which allows non-root access, provided that the system has
    been configured accordingly"

    * tag 'pwm/for-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
    pwm: meson: Add clock source configuration for Meson-AXG
    dt-bindings: pwm: Update bindings for the Meson-AXG
    pwm: stmpe: Fix wrong register offset for hwpwm=2 case
    pwm: Set class for exported channels in sysfs

    Linus Torvalds
     
  • The Mediatek ethernet driver fails to build after commit 23c35f48f5fb
    ("pinctrl: remove include file from <linux/device.h>") because it relies
    on the pinctrl/consumer.h and pinctrl/devinfo.h being pulled in by the
    device.h header implicitly.

    Include these headers explicitly to avoid the build failure.

    Cc: Linus Walleij
    Signed-off-by: Thierry Reding
    Signed-off-by: Linus Torvalds

    Thierry Reding
     
  • The Meson GX MMC driver fails to build after commit 23c35f48f5fb
    ("pinctrl: remove include file from <linux/device.h>") because it relies
    on the pinctrl/consumer.h being pulled in by the device.h header
    implicitly.

    Include the header explicitly to avoid the build failure.

    Cc: Linus Walleij
    Signed-off-by: Thierry Reding
    Signed-off-by: Linus Torvalds

    Thierry Reding
     
  • The Rockchip LVDS driver fails to build after commit 23c35f48f5fb
    ("pinctrl: remove include file from <linux/device.h>") because it relies
    on the pinctrl/consumer.h and pinctrl/devinfo.h being pulled in by the
    device.h header implicitly.

    Include these headers explicitly to avoid the build failure.

    Cc: Linus Walleij
    Signed-off-by: Thierry Reding
    Signed-off-by: Linus Torvalds

    Thierry Reding
     
  • Fixes: 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>")
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     

05 Feb, 2018

10 commits

  • Another fix for an issue reported by 0-day robot.

    Reported-by: Dan Carpenter
    Fixes: 8ed5eec9d6c4 ("ovl: encode pure upper file handles")
    Signed-off-by: Amir Goldstein
    Signed-off-by: Miklos Szeredi

    Amir Goldstein
     
  • A re-factoring patch in NFS export series has passed the wrong argument
    to ovl_get_inode() causing a regression in the very recent fix to
    fsnotify of overlay merge dir.

    The regression has caused merge directory inodes to be hashed by upper
    instead of lower real inode, when NFS export and directory indexing is
    disabled. That caused an inotify watch to become obsolete after directory
    copy up and drop caches.

    LTP test inotify07 was improved to catch this regression.
    The regression also caused multiple redirect dirs to same origin not to
    be detected on lookup with NFS export disabled. An xfstest was added to
    cover this case.

    Fixes: 0aceb53e73be ("ovl: do not pass overlay dentry to ovl_get_inode()")
    Signed-off-by: Amir Goldstein
    Signed-off-by: Miklos Szeredi

    Amir Goldstein
     
  • Pull spectre/meltdown updates from Thomas Gleixner:
    "The next round of updates related to melted spectrum:

    - The initial set of spectre V1 mitigations:

    - Array index speculation blocker and its usage for syscall,
    fdtable and the nl80211 driver.

    - Speculation barrier and its usage in user access functions

    - Make indirect calls in KVM speculation safe

    - Blacklisting of known to be broken microcodes so IBPB/IBRS are not
    touched.

    - The initial IBPB support and its usage in context switch

    - The exposure of the new speculation MSRs to KVM guests.

    - A fix for a regression in x86/32 related to the cpu entry area

    - Proper whitelisting for known to be safe CPUs from the mitigations.

    - objtool fixes to deal properly with retpolines and alternatives

    - Exclude __init functions from retpolines which speeds up the boot
    process.

    - Removal of the syscall64 fast path and related cleanups and
    simplifications

    - Removal of the unpatched paravirt mode which is yet another source
    of indirect unprotected calls.

    - A new and undisputed version of the module mismatch warning

    - A couple of cleanup and correctness fixes all over the place

    Yet another step towards full mitigation. There are a few things still
    missing like the RSB underflow mitigation for Skylake and other small
    details, but that's being worked on.

    That said, I'm taking a belated Christmas vacation for a week and hope
    that everything is magically solved when I'm back on Feb 12th"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
    KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL
    KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL
    KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES
    KVM/x86: Add IBPB support
    KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX
    x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL
    x86/pti: Mark constant arrays as __initconst
    x86/spectre: Simplify spectre_v2 command line parsing
    x86/retpoline: Avoid retpolines for built-in __init functions
    x86/kvm: Update spectre-v1 mitigation
    KVM: VMX: make MSR bitmaps per-VCPU
    x86/paravirt: Remove 'noreplace-paravirt' cmdline option
    x86/speculation: Use Indirect Branch Prediction Barrier in context switch
    x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel
    x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable"
    x86/spectre: Report get_user mitigation for spectre_v1
    nl80211: Sanitize array index in parse_txq_params
    vfs, fdtable: Prevent bounds-check bypass via speculative execution
    x86/syscall: Sanitize syscall table de-references under speculation
    x86/get_user: Use pointer masking to limit speculation
    ...

    Linus Torvalds
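
The "array index speculation blocker" mentioned above clamps an index with a branch-free mask, so that even a mispredicted bounds check cannot feed an out-of-range index to a dependent load. A userspace sketch of the masking arithmetic follows; the function names are hypothetical, and the expression assumes that right-shifting a negative `long` is an arithmetic shift, which holds for gcc and clang but is implementation-defined in ISO C.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Return all ones when index < size and zero otherwise, without a branch.
 * Assumption: '>>' on a negative long is an arithmetic shift (gcc/clang).
 */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
    long v = ~(long)(index | (size - 1UL - index));

    return (unsigned long)(v >> (sizeof(long) * CHAR_BIT - 1));
}

/* Clamp index to 0 when it is out of bounds; in-bounds values pass through. */
static unsigned long clamp_index_nospec(unsigned long index, unsigned long size)
{
    return index & index_mask_nospec(index, size);
}
```

A caller would still perform the ordinary bounds check first and then use the clamped value for the array access, so the architectural behaviour is unchanged while the speculative path can only reach element 0.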
     
  • Pull x86 fixes from Thomas Gleixner:
    "A small set of changes:

    - a fixup for kexec related to 5-level paging mode. That covers most
    of the cases except kexec from a 5-level kernel to a 4-level
    kernel. The latter needs more work and is going to come in 4.17

    - two trivial fixes for build warnings triggered by LTO and gcc-8"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/power: Fix swsusp_arch_resume prototype
    x86/dumpstack: Avoid uninitlized variable
    x86/kexec: Make kexec (mostly) work in 5-level paging mode

    Linus Torvalds
     
  • Pull irq fixes from Thomas Gleixner:
    "Two small changes:

    - a fix for an interrupt regression caused by the vector management
    changes in 4.15 affecting museum pieces which rely on interrupt
    probing for legacy (e.g. parallel port) devices.

    One of the startup calls in the autoprobe code was not changed to
    the new activate_and_startup() function resulting in a warning and
    as a consequence failing to discover the device interrupt.

    - a trivial update to the copyright/license header of the STM32 irq
    chip driver"

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    genirq: Make legacy autoprobing work again
    irqchip/stm32: Fix copyright

    Linus Torvalds
     
  • Pull more block updates from Jens Axboe:
    "Most of this is fixes and not new code/features:

    - skd fix from Arnd, fixing a build error dependent on slab allocator
    type.

    - blk-mq scheduler discard merging fixes, one from me and one from
    Keith. This fixes a segment miscalculation for blk-mq-sched, where
    we mistakenly think two segments are physically contiguous even
    though the request isn't carrying real data. Also fixes a bio-to-rq
    merge case.

    - Don't re-set a bit on the buffer_head flags, if it's already set.
    This can cause scalability concerns on bigger machines and
    workloads. From Kemi Wang.

    - Add BLK_STS_DEV_RESOURCE return value to blk-mq, allowing us to
    distinguish between a local (device related) resource starvation
    and a global one. The latter might happen without IO being in
    flight, so it has to be handled a bit differently. From Ming"

    * tag 'for-linus-20180204' of git://git.kernel.dk/linux-block:
    block: skd: fix incorrect linux/slab_def.h inclusion
    buffer: Avoid setting buffer bits that are already set
    blk-mq-sched: Enable merging discard bio into request
    blk-mq: fix discard merge with scheduler attached
    blk-mq: introduce BLK_STS_DEV_RESOURCE

    Linus Torvalds
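
The buffer_head change above follows a common pattern: read the flag first and perform the atomic write only when the bit is actually clear, so that an already-set bit costs a shared read rather than an exclusive cache-line acquisition. A minimal C11 sketch of that pattern follows; `set_flag_if_clear()` is a hypothetical helper, not the kernel's buffer_head API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Set bit 'bit' in *flags unless it is already set.
 * Returns true if this call performed the atomic write, false if the
 * bit was already set and the write was skipped.
 */
static bool set_flag_if_clear(atomic_ulong *flags, unsigned int bit)
{
    unsigned long mask = 1UL << bit;

    /* Cheap read first: no write to the cache line if nothing changes. */
    if (atomic_load_explicit(flags, memory_order_relaxed) & mask)
        return false;

    atomic_fetch_or(flags, mask);
    return true;
}
```

The check is only an optimisation: if two threads race past the read, both perform the atomic OR and the result is still correct.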
     
  • Pull NTB updates from Jon Mason:
    "Bug fixes galore, removal of the ntb atom driver, and updates to the
    ntb tools and tests to support the multi-port interface"

    * tag 'ntb-4.16' of git://github.com/jonmason/ntb: (37 commits)
    NTB: ntb_perf: fix cast to restricted __le32
    ntb_perf: Fix an error code in perf_copy_chunk()
    ntb_hw_switchtec: Make function switchtec_ntb_remove() static
    NTB: ntb_tool: fix memory leak on 'buf' on error exit path
    NTB: ntb_perf: fix printing of resource_size_t
    NTB: ntb_hw_idt: Set NTB_TOPO_SWITCH topology
    NTB: ntb_test: Update ntb_perf tests
    NTB: ntb_test: Update ntb_tool MW tests
    NTB: ntb_test: Add ntb_tool Message tests
    NTB: ntb_test: Update ntb_tool Scratchpad tests
    NTB: ntb_test: Update ntb_tool DB tests
    NTB: ntb_test: Update ntb_tool link tests
    NTB: ntb_test: Add ntb_tool port tests
    NTB: ntb_test: Safely use paths with whitespace
    NTB: ntb_perf: Add full multi-port NTB API support
    NTB: ntb_tool: Add full multi-port NTB API support
    NTB: ntb_pp: Add full multi-port NTB API support
    NTB: Fix UB/bug in ntb_mw_get_align()
    NTB: Set dma mask and dma coherent mask to NTB devices
    NTB: Rename NTB messaging API methods
    ...

    Linus Torvalds
     
  • Pull mailbox updates from Jassi Brar:
    "Misc driver changes only:

    - TI-MsgMgr: Fix print format for a printk

    - TI-MsgMgr: SPDX license switch for the driver

    - QCOM-IPC: Convert driver to use regmap

    - QCOM-IPC: Spawn sibling clock device from mailbox driver"

    * tag 'mailbox-v4.16' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
    dt-bindings: mailbox: qcom: Document the APCS clock binding
    mailbox: qcom: Create APCS child device for clock controller
    mailbox: qcom: Convert APCS IPC driver to use regmap
    mailbox: ti-msgmgr: Use %zu for size_t print format
    mailbox: ti-msgmgr: Switch to SPDX Licensing

    Linus Torvalds
     
  • Pull i2c updates from Wolfram Sang:
    "I2C has the following changes for you:

    - new flag to mark DMA safe buffers in i2c_msg. Also, some
    infrastructure around it. And docs.

    - huge refactoring of the at24 driver led by the new maintainer
    Bartosz

    - update I2C bus recovery to send STOP after recovery

    - conversion from gpio to gpiod for I2C bus recovery

    - adding a fault-injector to the i2c-gpio driver

    - lots of small driver improvements, and bigger ones to
    i2c-sh_mobile"

    * 'i2c/for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (99 commits)
    i2c: mv64xxx: Add myself as maintainer for this driver
    i2c: mv64xxx: Fix clock resource by adding an optional bus clock
    i2c: mv64xxx: Remove useless test before clk_disable_unprepare
    i2c: mxs: use true and false for boolean values
    i2c: meson: update doc description to fix build warnings
    i2c: meson: add configurable divider factors
    dt-bindings: i2c: update documentation for the Meson-AXG
    i2c: imx-lpi2c: add runtime pm support
    i2c: rcar: fix some trivial typos in comments
    i2c: davinci: fix the cpufreq transition
    i2c: rk3x: add proper kerneldoc header
    i2c: rk3x: account for const type of of_device_id.data
    i2c: acorn: remove outdated path from file header
    i2c: acorn: add MODULE_LICENSE tag
    i2c: rcar: implement bus recovery
    i2c: send STOP after successful bus recovery
    i2c: ensure SDA is released in recovery if SDA is controllable
    i2c: add 'set_sda' to bus_recovery_info
    i2c: add identifier in declarations for i2c_bus_recovery
    i2c: make kerneldoc about bus recovery more precise
    ...

    Linus Torvalds
     
  • Pull fscrypt updates from Ted Ts'o:
    "Refactor support for encrypted symlinks to move common code to fscrypt"

    Ted also points out about the merge:
    "This makes the f2fs symlink code use the fscrypt_encrypt_symlink()
    from the fscrypt tree. This will end up dropping the kzalloc() ->
    f2fs_kzalloc() change, which means the fscrypt-specific allocation
    won't get tested by f2fs's kmalloc error injection system; which is
    fine"

    * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: (26 commits)
    fscrypt: fix build with pre-4.6 gcc versions
    fscrypt: remove 'ci' parameter from fscrypt_put_encryption_info()
    fscrypt: document symlink length restriction
    fscrypt: fix up fscrypt_fname_encrypted_size() for internal use
    fscrypt: define fscrypt_fname_alloc_buffer() to be for presented names
    fscrypt: calculate NUL-padding length in one place only
    fscrypt: move fscrypt_symlink_data to fscrypt_private.h
    fscrypt: remove fscrypt_fname_usr_to_disk()
    ubifs: switch to fscrypt_get_symlink()
    ubifs: switch to fscrypt ->symlink() helper functions
    ubifs: free the encrypted symlink target
    f2fs: switch to fscrypt_get_symlink()
    f2fs: switch to fscrypt ->symlink() helper functions
    ext4: switch to fscrypt_get_symlink()
    ext4: switch to fscrypt ->symlink() helper functions
    fscrypt: new helper function - fscrypt_get_symlink()
    fscrypt: new helper functions for ->symlink()
    fscrypt: trim down fscrypt.h includes
    fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.c
    fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.h
    ...

    Linus Torvalds
     

04 Feb, 2018

15 commits

  • Update the binding documentation for APCS to mention that the APCS
    hardware block also exposes clock controller functionality.

    The APCS clock controller is a mux and half-integer divider. It has the
    main CPU PLL as an input and provides the clock for the application CPU.

    Signed-off-by: Georgi Djakov
    Reviewed-by: Rob Herring
    Acked-by: Bjorn Andersson
    Signed-off-by: Jassi Brar

    Georgi Djakov
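
A half-integer divider can divide its input by 1, 1.5, 2, 2.5 and so on. One way to picture the arithmetic is sketched here; the register encoding (a field value f selecting a divider of (f + 1) / 2) and the function name are assumptions made for illustration, not taken from the binding document.

```c
#include <stdint.h>

/*
 * Output rate of a half-integer divider.
 * Assumed encoding: div_field = 2 * divider - 1, so
 *   field 1 -> divide by 1, field 2 -> divide by 1.5, field 3 -> divide by 2.
 */
static uint64_t half_int_div_rate(uint64_t parent_hz, uint32_t div_field)
{
    /* rate = parent / ((field + 1) / 2), kept in integer arithmetic */
    return parent_hz * 2 / ((uint64_t)div_field + 1);
}
```

With an 800 MHz parent, field values 1, 2 and 3 give 800 MHz, roughly 533.3 MHz and 400 MHz respectively under this assumed encoding.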
     
  • There is a clock controller functionality provided by the APCS hardware
    block of msm8916 devices. The device-tree would represent an APCS node
    with both mailbox and clock provider properties.
    Create a platform child device for the clock controller functionality so
    the driver can probe and use APCS as parent.

    Signed-off-by: Georgi Djakov
    Acked-by: Bjorn Andersson
    Signed-off-by: Jassi Brar

    Georgi Djakov
     
  • This hardware block provides more functionality than just IPC. Convert
    it to regmap to allow other child platform devices to use the same regmap.

    Signed-off-by: Georgi Djakov
    Acked-by: Bjorn Andersson
    Signed-off-by: Jassi Brar

    Georgi Djakov
     
  • Pull hardened usercopy whitelisting from Kees Cook:
    "Currently, hardened usercopy performs dynamic bounds checking on slab
    cache objects. This is good, but still leaves a lot of kernel memory
    available to be copied to/from userspace in the face of bugs.

    To further restrict what memory is available for copying, this creates
    a way to whitelist specific areas of a given slab cache object for
    copying to/from userspace, allowing much finer granularity of access
    control.

    Slab caches that are never exposed to userspace can declare no
    whitelist for their objects, thereby keeping them unavailable to
    userspace via dynamic copy operations. (Note, an implicit form of
    whitelisting is the use of constant sizes in usercopy operations and
    get_user()/put_user(); these bypass all hardened usercopy checks since
    these sizes cannot change at runtime.)

    This new check is WARN-by-default, so any mistakes can be found over
    the next several releases without breaking anyone's system.

    The series has roughly the following sections:
    - remove %p and improve reporting with offset
    - prepare infrastructure and whitelist kmalloc
    - update VFS subsystem with whitelists
    - update SCSI subsystem with whitelists
    - update network subsystem with whitelists
    - update process memory with whitelists
    - update per-architecture thread_struct with whitelists
    - update KVM with whitelists and fix ioctl bug
    - mark all other allocations as not whitelisted
    - update lkdtm for more sensible test coverage"

    * tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (38 commits)
    lkdtm: Update usercopy tests for whitelisting
    usercopy: Restrict non-usercopy caches to size 0
    kvm: x86: fix KVM_XEN_HVM_CONFIG ioctl
    kvm: whitelist struct kvm_vcpu_arch
    arm: Implement thread_struct whitelist for hardened usercopy
    arm64: Implement thread_struct whitelist for hardened usercopy
    x86: Implement thread_struct whitelist for hardened usercopy
    fork: Provide usercopy whitelisting for task_struct
    fork: Define usercopy region in thread_stack slab caches
    fork: Define usercopy region in mm_struct slab caches
    net: Restrict unwhitelisted proto caches to size 0
    sctp: Copy struct sctp_sock.autoclose to userspace using put_user()
    sctp: Define usercopy region in SCTP proto slab cache
    caif: Define usercopy region in caif proto slab cache
    ip: Define usercopy region in IP proto slab cache
    net: Define usercopy region in struct proto slab cache
    scsi: Define usercopy region in scsi_sense_cache slab cache
    cifs: Define usercopy region in cifs_request slab cache
    vxfs: Define usercopy region in vxfs_inode slab cache
    ufs: Define usercopy region in ufs_inode_cache slab cache
    ...

    Linus Torvalds
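
The whitelist described above can be thought of as an (offset, size) window inside each slab object: a dynamically sized copy to or from userspace is allowed only if it falls entirely inside that window. A simplified userspace model of the bounds check follows; `struct cache_model` and `usercopy_allowed()` are hypothetical names, and the real check lives in the slab allocators.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a slab cache with a usercopy window. */
struct cache_model {
    size_t object_size;  /* size of each object in the cache */
    size_t useroffset;   /* start of the whitelisted region */
    size_t usersize;     /* length of the whitelisted region (0 = none) */
};

/* May 'len' bytes starting at 'offset' within an object be copied? */
static bool usercopy_allowed(const struct cache_model *c,
                             size_t offset, size_t len)
{
    if (offset < c->useroffset)
        return false;                    /* starts before the window */
    if (offset - c->useroffset > c->usersize)
        return false;                    /* starts past the window */
    /* remaining room in the window, computed without overflow */
    return len <= c->usersize - (offset - c->useroffset);
}
```

A cache with `usersize` of 0 models the "not whitelisted" case: every non-empty copy is rejected.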
     
  • [ Based on a patch from Paolo Bonzini ]

    ... basically doing exactly what we do for VMX:

    - Passthrough SPEC_CTRL to guests (if enabled in guest CPUID)
    - Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest
    actually used it.

    Signed-off-by: KarimAllah Ahmed
    Signed-off-by: David Woodhouse
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Darren Kenny
    Reviewed-by: Konrad Rzeszutek Wilk
    Cc: Andrea Arcangeli
    Cc: Andi Kleen
    Cc: Jun Nakajima
    Cc: kvm@vger.kernel.org
    Cc: Dave Hansen
    Cc: Tim Chen
    Cc: Andy Lutomirski
    Cc: Asit Mallick
    Cc: Arjan Van De Ven
    Cc: Greg KH
    Cc: Paolo Bonzini
    Cc: Dan Williams
    Cc: Linus Torvalds
    Cc: Ashok Raj
    Link: https://lkml.kernel.org/r/1517669783-20732-1-git-send-email-karahmed@amazon.de

    KarimAllah Ahmed
     
  • [ Based on a patch from Ashok Raj ]

    Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
    guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
    be using a retpoline+IBPB based approach.

    To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for
    guests that do not actually use the MSR, only start saving and restoring
    when a non-zero value is written to it.

    No attempt is made to handle STIBP here, intentionally. Filtering STIBP
    may be added in a future patch, which may require trapping all writes
    if we don't want to pass it through directly to the guest.

    [dwmw2: Clean up CPUID bits, save/restore manually, handle reset]

    Signed-off-by: KarimAllah Ahmed
    Signed-off-by: David Woodhouse
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Darren Kenny
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Jim Mattson
    Cc: Andrea Arcangeli
    Cc: Andi Kleen
    Cc: Jun Nakajima
    Cc: kvm@vger.kernel.org
    Cc: Dave Hansen
    Cc: Tim Chen
    Cc: Andy Lutomirski
    Cc: Asit Mallick
    Cc: Arjan Van De Ven
    Cc: Greg KH
    Cc: Paolo Bonzini
    Cc: Dan Williams
    Cc: Linus Torvalds
    Cc: Ashok Raj
    Link: https://lkml.kernel.org/r/1517522386-18410-5-git-send-email-karahmed@amazon.de

    KarimAllah Ahmed
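
The lazy save/restore policy described above can be modelled in userspace as a flag that is armed by the first non-zero write. The sketch below is a deliberately simplified model: the structure, the helper names and the access counting are all hypothetical, and a plain variable stands in for the hardware MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-vCPU state; not the real KVM structures. */
struct vcpu_model {
    uint64_t spec_ctrl;   /* guest's view of the MSR */
    bool save_restore;    /* armed once the guest writes a non-zero value */
};

static uint64_t hw_spec_ctrl;          /* stand-in for the physical MSR */
static unsigned long hw_msr_accesses;  /* how often the "MSR" was touched */

static void guest_write_spec_ctrl(struct vcpu_model *v, uint64_t val)
{
    v->spec_ctrl = val;
    if (val != 0)
        v->save_restore = true;  /* start paying the cost only now */
}

static void vm_entry(struct vcpu_model *v)
{
    if (!v->save_restore)
        return;
    hw_spec_ctrl = v->spec_ctrl;  /* load the guest value */
    hw_msr_accesses++;
}

static void vm_exit(struct vcpu_model *v)
{
    if (!v->save_restore)
        return;
    v->spec_ctrl = hw_spec_ctrl;  /* save the guest value */
    hw_msr_accesses++;
    hw_spec_ctrl = 0;             /* restore the host value (assumed 0) */
    hw_msr_accesses++;
}

/* Run 'round_trips' entry/exit pairs after the guest wrote 'guest_value'. */
static unsigned long count_msr_accesses(uint64_t guest_value,
                                        unsigned int round_trips)
{
    struct vcpu_model v = { 0 };
    unsigned int i;

    hw_spec_ctrl = 0;
    hw_msr_accesses = 0;
    guest_write_spec_ctrl(&v, guest_value);
    for (i = 0; i < round_trips; i++) {
        vm_entry(&v);
        vm_exit(&v);
    }
    return hw_msr_accesses;
}
```

In this model a guest that never writes a non-zero value causes no MSR traffic at all, which is the overhead the commit message aims to avoid.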
     
  • Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO
    (bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the
    contents will come directly from the hardware, but user-space can still
    override it.

    [dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional]

    Signed-off-by: KarimAllah Ahmed
    Signed-off-by: David Woodhouse
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Paolo Bonzini
    Reviewed-by: Darren Kenny
    Reviewed-by: Jim Mattson
    Reviewed-by: Konrad Rzeszutek Wilk
    Cc: Andrea Arcangeli
    Cc: Andi Kleen
    Cc: Jun Nakajima
    Cc: kvm@vger.kernel.org
    Cc: Dave Hansen
    Cc: Linus Torvalds
    Cc: Andy Lutomirski
    Cc: Asit Mallick
    Cc: Arjan Van De Ven
    Cc: Greg KH
    Cc: Dan Williams
    Cc: Tim Chen
    Cc: Ashok Raj
    Link: https://lkml.kernel.org/r/1517522386-18410-4-git-send-email-karahmed@amazon.de

    KarimAllah Ahmed
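
Decoding the two bits named above is a matter of simple masking. A small sketch follows, assuming the MSR value has already been read into a 64-bit integer; the helper names are hypothetical, while the bit positions are those given in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define ARCH_CAP_RDCL_NO  (1ULL << 0)  /* bit 0, per the commit message */
#define ARCH_CAP_IBRS_ALL (1ULL << 1)  /* bit 1, per the commit message */

/* True if the capability value reports RDCL_NO. */
static bool arch_cap_rdcl_no(uint64_t arch_caps)
{
    return (arch_caps & ARCH_CAP_RDCL_NO) != 0;
}

/* True if the capability value reports IBRS_ALL. */
static bool arch_cap_ibrs_all(uint64_t arch_caps)
{
    return (arch_caps & ARCH_CAP_IBRS_ALL) != 0;
}
```

Because user-space can override the value a guest sees, the same decoding applies whether the contents come from hardware or from the override.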
     
  • The Indirect Branch Predictor Barrier (IBPB) is an indirect branch
    control mechanism. It keeps earlier branches from influencing
    later ones.

    Unlike IBRS and STIBP, IBPB does not define a new mode of operation.
    It's a command that ensures predicted branch targets aren't used after
    the barrier. Although IBRS and IBPB are enumerated by the same CPUID
    enumeration, IBPB is very different.

    IBPB helps mitigate against three potential attacks:

    * Mitigate guests from being attacked by other guests.
    - This is addressed by issuing IBPB when we do a guest switch.

    * Mitigate attacks from guest/ring3->host/ring3.
    These would require an IBPB during context switch in host, or after
    VMEXIT. The host process has two ways to mitigate
    - Either it can be compiled with retpoline
    - If it's going through context switch, and has set !dumpable then
    there is an IBPB in that path.
    (Tim's patch: https://patchwork.kernel.org/patch/10192871)
    - The case where after a VMEXIT you return back to Qemu might make
    Qemu attackable from guest when Qemu isn't compiled with retpoline.
    There are issues reported when doing IBPB on every VMEXIT that resulted
    in some tsc calibration woes in guest.

    * Mitigate guest/ring0->host/ring0 attacks.
    When host kernel is using retpoline it is safe against these attacks.
    If host kernel isn't using retpoline we might need to do an IBPB flush on
    every VMEXIT.

    Even when using retpoline for indirect calls, in certain conditions 'ret'
    can use the BTB on Skylake-era CPUs. There are other mitigations
    available like RSB stuffing/clearing.

    * IBPB is issued only for SVM during svm_free_vcpu().
    VMX has a vmclear and SVM doesn't. Follow discussion here:
    https://lkml.org/lkml/2018/1/15/146

    Please refer to the following spec for more details on the enumeration
    and control.

    Refer here to get documentation about mitigations.

    https://software.intel.com/en-us/side-channel-security-support

    [peterz: rebase and changelog rewrite]
    [karahmed: - rebase
    - vmx: expose PRED_CMD if guest has it in CPUID
    - svm: only pass through IBPB if guest has it in CPUID
    - vmx: support !cpu_has_vmx_msr_bitmap()
    - vmx: support nested]
    [dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS)
    PRED_CMD is a write-only MSR]

    Signed-off-by: Ashok Raj
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: David Woodhouse
    Signed-off-by: KarimAllah Ahmed
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Konrad Rzeszutek Wilk
    Cc: Andrea Arcangeli
    Cc: Andi Kleen
    Cc: kvm@vger.kernel.org
    Cc: Asit Mallick
    Cc: Linus Torvalds
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Arjan Van De Ven
    Cc: Greg KH
    Cc: Jun Nakajima
    Cc: Paolo Bonzini
    Cc: Dan Williams
    Cc: Tim Chen
    Link: http://lkml.kernel.org/r/1515720739-43819-6-git-send-email-ashok.raj@intel.com
    Link: https://lkml.kernel.org/r/1517522386-18410-3-git-send-email-karahmed@amazon.de

    Ashok Raj
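
The guest-to-guest case above ("issuing IBPB when we do a guest switch") can be pictured as remembering which guest last ran on a CPU and issuing the barrier only when a different one is loaded. The following userspace model counts barriers instead of writing an MSR; `struct cpu_model`, `load_guest()` and `count_barriers()` are hypothetical names, not KVM code.

```c
#include <stddef.h>

/* Hypothetical per-CPU state: which guest last ran here. */
struct cpu_model {
    int last_guest;          /* -1 means no guest has run yet */
    unsigned long barriers;  /* number of IBPB commands "issued" */
};

/* Load a guest on this CPU, issuing a barrier only on a real switch. */
static void load_guest(struct cpu_model *cpu, int guest_id)
{
    if (cpu->last_guest == guest_id)
        return;              /* same guest again: no barrier needed */

    cpu->barriers++;         /* stand-in for the IBPB command */
    cpu->last_guest = guest_id;
}

/* Count the barriers issued for a sequence of guest loads on one CPU. */
static unsigned long count_barriers(const int *guest_ids, size_t n)
{
    struct cpu_model cpu = { .last_guest = -1, .barriers = 0 };
    size_t i;

    for (i = 0; i < n; i++)
        load_guest(&cpu, guest_ids[i]);
    return cpu.barriers;
}
```

Because IBPB is a one-shot command rather than a mode, skipping it when the same guest is reloaded avoids cost without leaving another guest's branch predictions in place.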
     
  • [dwmw2: Stop using KF() for bits in it, too]
    Signed-off-by: KarimAllah Ahmed
    Signed-off-by: David Woodhouse
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Paolo Bonzini
    Reviewed-by: Konrad Rzeszutek Wilk
    Reviewed-by: Jim Mattson
    Cc: kvm@vger.kernel.org
    Cc: Radim Krčmář
    Link: https://lkml.kernel.org/r/1517522386-18410-2-git-send-email-karahmed@amazon.de

    KarimAllah Ahmed
     
  • Pull pstore update from Kees Cook:
    "Only a header cleanup this release; nice and quiet. :)

    - clean up hardirq header usage (Yang Shi)"

    * tag 'pstore-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    fs: pstore: remove unused hardirq.h

    Linus Torvalds
     
  • Pull ext4 updates from Ted Ts'o:
    "Only miscellaneous cleanups and bug fixes for ext4 this cycle"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: create ext4_kset dynamically
    ext4: create ext4_feat kobject dynamically
    ext4: release kobject/kset even when init/register fail
    ext4: fix incorrect indentation of if statement
    ext4: correct documentation for grpid mount option
    ext4: use 'sbi' instead of 'EXT4_SB(sb)'
    ext4: save error to disk in __ext4_grp_locked_error()
    jbd2: fix sphinx kernel-doc build warnings
    ext4: fix a race in the ext4 shutdown path
    mbcache: make sure c_entry_count is not decremented past zero
    ext4: no need flush workqueue before destroying it
    ext4: fixed alignment and minor code cleanup in ext4.h
    ext4: fix ENOSPC handling in DAX page fault handler
    dax: pass detailed error code from dax_iomap_fault()
    mbcache: revert "fs/mbcache.c: make count_objects() more robust"
    mbcache: initialize entry->e_referenced in mb_cache_entry_create()
    ext4: fix up remaining files with SPDX cleanups

    Linus Torvalds
     
  • Pull dmi subsystem updates/fixes from Jean Delvare.

    * 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
    firmware: dmi: handle missing DMI data gracefully
    firmware: dmi_scan: Fix handling of empty DMI strings
    firmware: dmi_scan: Drop dmi_initialized
    firmware: dmi: Optimize dmi_matches

    Linus Torvalds
     
  • Pull integrity fixes from James Morris:

    - add James Bottomley as a Trusted Keys maintainer.

    - IMA: re-initialize iint->atomic_flags on iint_free(), from Mimi.

    * 'fixes-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    ima: re-initialize iint->atomic_flags
    maintainers: update trusted keys

    Linus Torvalds
     
  • Pull the KVM prerequisites so the IBPB patches apply.

    Thomas Gleixner
     
  • Pull networking fixes from David Miller:

    1) The bnx2x can hang if you give it a GSO packet with a segment size
    which is too big for the hardware, detect and drop in this case.
    From Daniel Axtens.

    2) Fix some overflows and pointer leaks in xtables, from Dmitry Vyukov.

    3) Missing RCU locking in igmp, from Eric Dumazet.

    4) Fix RX checksum handling on r8152, it can only checksum UDP and TCP
    packets. From Hayes Wang.

    5) Minor pacing tweak to TCP BBR congestion control, from Neal
    Cardwell.

    6) Missing RCU annotations in cls_u32, from Paolo Abeni.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
    Revert "defer call to mem_cgroup_sk_alloc()"
    soreuseport: fix mem leak in reuseport_add_sock()
    net: qlge: use memmove instead of skb_copy_to_linear_data
    net: qed: use correct strncpy() size
    net: cxgb4: avoid memcpy beyond end of source buffer
    cls_u32: add missing RCU annotation.
    r8152: set rx mode early when linking on
    r8152: fix wrong checksum status for received IPv4 packets
    nfp: fix TLV offset calculation
    net: pxa168_eth: add netconsole support
    net: igmp: add a missing rcu locking section
    ibmvnic: fix firmware version when no firmware level has been provided by the VIOS server
    vmxnet3: remove redundant initialization of pointer 'rq'
    lan78xx: remove redundant initialization of pointer 'phydev'
    net: jme: remove unused initialization of 'rxdesc'
    rtnetlink: remove check for IFLA_IF_NETNSID
    rocker: fix possible null pointer dereference in rocker_router_fib_event_work
    inet: Avoid unitialized variable warning in inet_unhash()
    net: bridge: Fix uninitialized error in br_fdb_sync_static()
    openvswitch: Remove padding from packet before L3+ conntrack processing
    ...

    Linus Torvalds