18 Aug, 2018

21 commits

  • commit 785a19f9d1dd8a4ab2d0633be4656653bd3de1fc upstream.

    The following kernel panic was observed on ARM64 platform due to a stale
    TLB entry.

    1. ioremap with 4K size, a valid pte page table is set.
    2. iounmap it, its pte entry is set to 0.
    3. ioremap the same address with 2M size, update its pmd entry with
    a new value.
    4. CPU may hit an exception because the old pmd entry is still in TLB,
    which leads to a kernel panic.

    Commit b6bdb7517c3d ("mm/vmalloc: add interfaces to free unmapped page
    table") has addressed this panic by falling to pte mappings in the above
    case on ARM64.

    To support pmd mappings in all cases, TLB purge needs to be performed
    in this case on ARM64.

    Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page()
    so that a TLB purge can be added later in separate patches.
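
    For reference, the resulting prototypes look roughly like this (a sketch
    based on the description above, not the full diff):

    int pud_free_pmd_page(pud_t *pud, unsigned long addr);
    int pmd_free_pte_page(pmd_t *pmd, unsigned long addr);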

    [toshi.kani@hpe.com: merge changes, rewrite patch description]
    Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
    Signed-off-by: Chintan Pandya
    Signed-off-by: Toshi Kani
    Signed-off-by: Thomas Gleixner
    Cc: mhocko@suse.com
    Cc: akpm@linux-foundation.org
    Cc: hpa@zytor.com
    Cc: linux-mm@kvack.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: Will Deacon
    Cc: Joerg Roedel
    Cc: stable@vger.kernel.org
    Cc: Andrew Morton
    Cc: Michal Hocko
    Cc: "H. Peter Anvin"
    Cc:
    Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com
    Signed-off-by: Greg Kroah-Hartman

    Chintan Pandya
     
  • commit 7992c18810e568b95c869b227137a2215702a805 upstream.

    CVE-2018-9363

    The buffer length is unsigned at all layers, but gets cast to int and
    checked in hidp_process_report(), which can lead to a buffer overflow.
    Switch the len parameter to unsigned int to resolve the issue.

    This affects 3.18 and newer kernels.
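
    Illustrative snippet (a hypothetical function, not the hidp code itself)
    of why a signed length is dangerous here: once the unsigned length is
    cast to int, a value above INT_MAX turns negative, slips past the
    upper-bound check, and is converted back to a huge size_t by memcpy():

    void process_report(unsigned char *buf, const unsigned char *data, int len)
    {
        if (len > HID_MAX_BUFFER_SIZE)      /* a negative len is not clamped */
            len = HID_MAX_BUFFER_SIZE;
        memcpy(buf, data, len);             /* negative len becomes ~SIZE_MAX */
    }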

    Signed-off-by: Mark Salyzyn
    Fixes: a4b1b5877b514b276f0f31efe02388a9c2836728 ("HID: Bluetooth: hidp: make sure input buffers are big enough")
    Cc: Marcel Holtmann
    Cc: Johan Hedberg
    Cc: "David S. Miller"
    Cc: Kees Cook
    Cc: Benjamin Tissoires
    Cc: linux-bluetooth@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: security@kernel.org
    Cc: kernel-team@android.com
    Acked-by: Kees Cook
    Signed-off-by: Marcel Holtmann
    Signed-off-by: Greg Kroah-Hartman

    Mark Salyzyn
     
  • commit 3bbda5a38601f7675a214be2044e41d7749e6c7b upstream.

    If the ts3a227e audio accessory detection hardware is present and its
    driver probed, the jack needs to be created before enabling jack
    detection in the ts3a227e driver. With this patch, the jack is
    instantiated in the max98090 headset init function if the ts3a227e is
    present. This fixes a null pointer dereference as the jack detection
    enabling function in the ts3a driver was called before the jack was
    created.

    [minor correction to keep error handling on jack creation the same
    as before by Pierre Bossart]

    Signed-off-by: Thierry Escande
    Signed-off-by: Pierre-Louis Bossart
    Acked-By: Vinod Koul
    Signed-off-by: Mark Brown
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Thierry Escande
     
  • commit f53ee247ad546183fc13739adafc5579b9f0ebc0 upstream.

    The kcontrols for the third input (rxN_mix1_inp3) of both the RX2
    and RX3 mixers are not using the correct control register. This simple
    patch fixes that.

    Signed-off-by: Jean-François Têtu
    Signed-off-by: Mark Brown
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Jean-François Têtu
     
  • commit 4baa8bb13f41307f3eb62fe91f93a1a798ebef53 upstream.

    This commit fixes a bug that causes bfq to fail to guarantee a high
    responsiveness on some drives, if there is heavy random read+write I/O
    in the background. More precisely, such a failure allowed this bug to
    be found [1], but the bug may well cause other yet unreported
    anomalies.

    BFQ raises the weight of the bfq_queues associated with soft real-time
    applications, to privilege the I/O, and thus reduce latency, for these
    applications. This mechanism is named soft-real-time weight raising in
    BFQ. A soft real-time period may happen to be nested into an
    interactive weight raising period, i.e., it may happen that, when a
    bfq_queue switches to a soft real-time weight-raised state, the
    bfq_queue is already being weight-raised because deemed interactive
    too. In this case, BFQ saves in a special variable
    wr_start_at_switch_to_srt, the time instant when the interactive
    weight-raising period started for the bfq_queue, i.e., the time
    instant when BFQ started to deem the bfq_queue interactive. This value
    is then used to check whether the interactive weight-raising period
    would still be in progress when the soft real-time weight-raising
    period ends. If so, interactive weight raising is restored for the
    bfq_queue. This restore is useful, in particular, because it prevents
    bfq_queues from losing their interactive weight raising prematurely,
    as a consequence of spurious, short-lived soft real-time
    weight-raising periods caused by wrong detections as soft real-time.

    If, instead, a bfq_queue switches to soft-real-time weight raising
    while it *is not* already in an interactive weight-raising period,
    then the variable wr_start_at_switch_to_srt has no meaning during the
    following soft real-time weight-raising period. Unfortunately, BFQ
    handles this case incorrectly: not only is the variable not flagged as
    meaningless, it is also set to the time when the switch to soft
    real-time weight-raising occurs. This may cause an
    interactive weight-raising period to be considered mistakenly as still
    in progress, and thus a spurious interactive weight-raising period to
    start for the bfq_queue, at the end of the soft-real-time
    weight-raising period. In particular the spurious interactive
    weight-raising period will be considered as still in progress, if the
    soft-real-time weight-raising period does not last very long. The
    bfq_queue will then be wrongly privileged and, if I/O bound, will
    unjustly steal bandwidth from truly interactive or soft real-time
    bfq_queues, harming responsiveness and low latency.

    This commit fixes this issue by just setting wr_start_at_switch_to_srt
    to minus infinity (farthest past time instant according to jiffies
    macros): when the soft-real-time weight-raising period ends, certainly
    no interactive weight-raising period will be considered as still in
    progress.
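
    A minimal sketch of the fix (illustrative, not the exact bfq-iosched.c
    code): record the farthest past instant expressible with the jiffies
    macros, so no interactive period can later appear to be in progress:

    bfqq->wr_start_at_switch_to_srt = jiffies - MAX_JIFFY_OFFSET;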

    [1] Background I/O Type: Random - Background I/O mix: Reads and writes
    - Application to start: LibreOffice Writer in
    http://www.phoronix.com/scan.php?page=news_item&px=Linux-4.13-IO-Laptop

    Signed-off-by: Paolo Valente
    Signed-off-by: Angelo Ruocco
    Tested-by: Oleksandr Natalenko
    Tested-by: Lee Tibbert
    Tested-by: Mirko Montanari
    Signed-off-by: Jens Axboe
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Paolo Valente
     
  • commit a894990ac994a53bc5a0cc694eb12f3c064c18c5 upstream.

    When using cpufreq-dt with a default governor other than "performance",
    the system freezes while booting.
    Adding CLK_SET_RATE_PARENT | CLK_IS_CRITICAL to clk_cpu fixes the
    problem.

    Tested on Cubietruck (A20).

    Fixes: c84f5683f6E ("clk: sunxi-ng: Add sun4i/sun7i CCU driver")
    Acked-by: Chen-Yu Tsai
    Signed-off-by: Alexander Syring
    Signed-off-by: Maxime Ripard
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Alexander Syring
     
  • commit b7165d26bf730567ab081bb9383aff82cd43d9ea upstream.

    The current ADG driver is over-writing flags. This patch fixes it.

    Reported-by: Hiroyuki Yokoyama
    Signed-off-by: Kuninori Morimoto
    Signed-off-by: Mark Brown
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Kuninori Morimoto
     
  • commit 23f1b8d938c861ee0bbb786162f7ce0685f722ec upstream.

    On driver remove(), all objects created during probe() should be
    removed, but the sysfs qemu_fw_cfg/rev file was left behind. Also reorder
    the functions to match the probe() error cleanup code.

    Cc: stable@vger.kernel.org
    Signed-off-by: Marc-André Lureau
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Marc-André Lureau
     
  • commit 3f5fe9fef5b2da06b6319fab8123056da5217c3f upstream.

    The recent conversion of the task state recording to use task_state_index()
    broke the sched_switch tracepoint task state output.

    task_state_index() surprisingly returns an index (0-7), which is then
    printed with __print_flags() applying bitmasks. That does not really work
    and results in weird states like 'prev_state=t' instead of 'prev_state=I'.

    Use TASK_REPORT_MAX instead of TASK_STATE_MAX to report preemption. Build a
    bitmask from the return value of task_state_index() and store it in
    entry->prev_state, which makes __print_flags() work as expected.
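
    Sketch of the decoding described above (illustrative, not the exact
    tracepoint code), given that task_state_index() maps TASK_RUNNING to 0
    and the other reportable states to indices 1..N:

    unsigned int idx = task_state_index(prev);

    entry->prev_state = preempt ? TASK_REPORT_MAX
                                : (idx ? (1 << (idx - 1)) : 0);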

    Signed-off-by: Thomas Gleixner
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: stable@vger.kernel.org
    Fixes: efb40f588b43 ("sched/tracing: Fix trace_sched_switch task-state printing")
    Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1711221304180.1751@nanos
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit 520e18a5080d2c444a03280d99c8a35cb667d321 upstream.

    Now that nothing is using the ghes_ioremap_area pages, rip them out.

    Signed-off-by: James Morse
    Reviewed-by: Borislav Petkov
    Tested-by: Tyler Baicar
    Signed-off-by: Rafael J. Wysocki
    Cc: All applicable
    Signed-off-by: Sudip Mukherjee
    Signed-off-by: Greg Kroah-Hartman

    James Morse
     
  • commit 8088d3dd4d7c6933a65aa169393b5d88d8065672 upstream.

    scatterwalk_done() is only meant to be called after a nonzero number of
    bytes have been processed, since scatterwalk_pagedone() will flush the
    dcache of the *previous* page. But in the error case of
    skcipher_walk_done(), e.g. if the input wasn't an integer number of
    blocks, scatterwalk_done() was actually called after advancing 0 bytes.
    This caused a crash ("BUG: unable to handle kernel paging request")
    during '!PageSlab(page)' on architectures like arm and arm64 that define
    ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
    page-aligned as in that case walk->offset == 0.

    Fix it by reorganizing skcipher_walk_done() to skip the
    scatterwalk_advance() and scatterwalk_done() if an error has occurred.
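
    Roughly, the reorganized logic looks like this (a sketch of the idea,
    not the actual diff):

    if (unlikely(err < 0))
        goto finish;    /* 0 bytes processed: skip the advance/flush below */

    scatterwalk_advance(&walk->in, n);
    scatterwalk_done(&walk->in, 0, walk->total);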

    This bug was found by syzkaller fuzzing.

    Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:

    #include <linux/if_alg.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main()
    {
        struct sockaddr_alg addr = {
            .salg_type = "skcipher",
            .salg_name = "cbc(aes-generic)",
        };
        char buffer[4096] __attribute__((aligned(4096))) = { 0 };
        int fd;

        fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
        bind(fd, (void *)&addr, sizeof(addr));
        setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
        fd = accept(fd, NULL, NULL);
        write(fd, buffer, 15);
        read(fd, buffer, 15);
    }

    Reported-by: Liu Chao
    Fixes: b286d8b1a690 ("crypto: skcipher - Add skcipher walk interface")
    Cc: # v4.10+
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 0567fc9e90b9b1c8dbce8a5468758e6206744d4a upstream.

    The ALIGN() macro needs to be passed the alignment, not the alignmask
    (which is the alignment minus 1).
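
    For illustration (hypothetical variables): ALIGN(x, a) rounds x up to a
    multiple of a, so an alignmask first has to be turned back into an
    alignment:

    addr = ALIGN(addr, alignmask + 1);    /* correct: pass the alignment   */
    addr = ALIGN(addr, alignmask);        /* the bug: alignment minus one  */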

    Fixes: b286d8b1a690 ("crypto: skcipher - Add skcipher walk interface")
    Cc: # v4.10+
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 318abdfbe708aaaa652c79fb500e9bd60521f9dc upstream.

    Like the skcipher_walk and blkcipher_walk cases:

    scatterwalk_done() is only meant to be called after a nonzero number of
    bytes have been processed, since scatterwalk_pagedone() will flush the
    dcache of the *previous* page. But in the error case of
    ablkcipher_walk_done(), e.g. if the input wasn't an integer number of
    blocks, scatterwalk_done() was actually called after advancing 0 bytes.
    This caused a crash ("BUG: unable to handle kernel paging request")
    during '!PageSlab(page)' on architectures like arm and arm64 that define
    ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
    page-aligned as in that case walk->offset == 0.

    Fix it by reorganizing ablkcipher_walk_done() to skip the
    scatterwalk_advance() and scatterwalk_done() if an error has occurred.

    Reported-by: Liu Chao
    Fixes: bf06099db18a ("crypto: skcipher - Add ablkcipher_walk interfaces")
    Cc: # v2.6.35+
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 0868def3e4100591e7a1fdbf3eed1439cc8f7ca3 upstream.

    Like the skcipher_walk case:

    scatterwalk_done() is only meant to be called after a nonzero number of
    bytes have been processed, since scatterwalk_pagedone() will flush the
    dcache of the *previous* page. But in the error case of
    blkcipher_walk_done(), e.g. if the input wasn't an integer number of
    blocks, scatterwalk_done() was actually called after advancing 0 bytes.
    This caused a crash ("BUG: unable to handle kernel paging request")
    during '!PageSlab(page)' on architectures like arm and arm64 that define
    ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
    page-aligned as in that case walk->offset == 0.

    Fix it by reorganizing blkcipher_walk_done() to skip the
    scatterwalk_advance() and scatterwalk_done() if an error has occurred.

    This bug was found by syzkaller fuzzing.

    Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:

    #include <linux/if_alg.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main()
    {
        struct sockaddr_alg addr = {
            .salg_type = "skcipher",
            .salg_name = "ecb(aes-generic)",
        };
        char buffer[4096] __attribute__((aligned(4096))) = { 0 };
        int fd;

        fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
        bind(fd, (void *)&addr, sizeof(addr));
        setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
        fd = accept(fd, NULL, NULL);
        write(fd, buffer, 15);
        read(fd, buffer, 15);
    }

    Reported-by: Liu Chao
    Fixes: 5cde0af2a982 ("[CRYPTO] cipher: Added block cipher type")
    Cc: # v2.6.19+
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit bb29648102335586e9a66289a1d98a0cb392b6e5 upstream.

    syzbot reported a crash in vmac_final() when multiple threads
    concurrently use the same "vmac(aes)" transform through AF_ALG. The bug
    is pretty fundamental: the VMAC template doesn't separate per-request
    state from per-tfm (per-key) state like the other hash algorithms do,
    but rather stores it all in the tfm context. That's wrong.

    Also, vmac_final() incorrectly zeroes most of the state including the
    derived keys and cached pseudorandom pad. Therefore, only the first
    VMAC invocation with a given key calculates the correct digest.

    Fix these bugs by splitting the per-tfm state from the per-request state
    and using the proper init/update/final sequencing for requests.
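
    The shape of the resulting split is roughly the following (field names
    are illustrative placeholders, not the exact vmac.c layout):

    struct vmac_tfm_ctx {                     /* per-key state, set in setkey() */
        struct crypto_cipher *cipher;         /* underlying AES instance */
        u64 nhkey[VMAC_NHBYTES / 8];          /* derived key material */
        u64 polykey[2];
        u64 l3key[2];
    };

    struct vmac_desc_ctx {                    /* per-request message state */
        u8 partial[VMAC_NHBYTES];             /* buffered partial NH block */
        unsigned int partial_size;
        u64 polytmp[2];                       /* running polynomial hash */
    };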

    Reproducer for the crash:

    #include <linux/if_alg.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main()
    {
        int fd;
        struct sockaddr_alg addr = {
            .salg_type = "hash",
            .salg_name = "vmac(aes)",
        };
        char buf[256] = { 0 };

        fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
        bind(fd, (void *)&addr, sizeof(addr));
        setsockopt(fd, SOL_ALG, ALG_SET_KEY, buf, 16);
        fork();
        fd = accept(fd, NULL, NULL);
        for (;;)
            write(fd, buf, 256);
    }

    The immediate cause of the crash is that vmac_ctx_t.partial_size exceeds
    VMAC_NHBYTES, causing vmac_final() to memset() a negative length.

    Reported-by: syzbot+264bca3a6e8d645550d3@syzkaller.appspotmail.com
    Fixes: f1939f7c5645 ("crypto: vmac - New hash algorithm for intel_txt support")
    Cc: # v2.6.32+
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 73bf20ef3df262026c3470241ae4ac8196943ffa upstream.

    The VMAC template assumes the block cipher has a 128-bit block size, but
    it failed to check for that. Thus it was possible to instantiate it
    using a 64-bit block size cipher, e.g. "vmac(cast5)", causing
    uninitialized memory to be used.

    Add the needed check when instantiating the template.
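
    The added check is conceptually along these lines (a sketch, not the
    exact diff):

    if (alg->cra_blocksize != 16)    /* VMAC is only defined for 128-bit block ciphers */
        return -EINVAL;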

    Fixes: f1939f7c5645 ("crypto: vmac - New hash algorithm for intel_txt support")
    Cc: # v2.6.32+
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit af839b4e546613aed1fbd64def73956aa98631e7 upstream.

    There is a copy-paste error where sha256_mb_mgr_get_comp_job_avx2()
    copies the SHA-256 digest state from sha256_mb_mgr::args::digest to
    job_sha256::result_digest. Consequently, the sha256_mb algorithm
    sometimes calculates the wrong digest. Fix it.

    Reproducer using AF_ALG:

    #include <assert.h>
    #include <linux/if_alg.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static const __u8 expected[32] =
        "\xad\x7f\xac\xb2\x58\x6f\xc6\xe9\x66\xc0\x04\xd7\xd1\xd1\x6b\x02"
        "\x4f\x58\x05\xff\x7c\xb4\x7c\x7a\x85\xda\xbd\x8b\x48\x89\x2c\xa7";

    int main()
    {
        int fd;
        struct sockaddr_alg addr = {
            .salg_type = "hash",
            .salg_name = "sha256_mb",
        };
        __u8 data[4096] = { 0 };
        __u8 digest[32];
        int ret;
        int i;

        fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
        bind(fd, (void *)&addr, sizeof(addr));
        fork();
        fd = accept(fd, 0, 0);
        do {
            ret = write(fd, data, 4096);
            assert(ret == 4096);
            ret = read(fd, digest, 32);
            assert(ret == 32);
        } while (memcmp(digest, expected, 32) == 0);

        printf("wrong digest: ");
        for (i = 0; i < 32; i++)
            printf("%02x", digest[i]);
        printf("\n");
    }

    Output was:

    wrong digest: ad7facb2000000000000000000000000ffffffef7cb47c7a85dabd8b48892ca7

    Fixes: 172b1d6b5a93 ("crypto: sha256-mb - fix ctx pointer and digest copy")
    Cc: # v4.8+
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 934193a654c1f4d0643ddbf4b2529b508cae926e upstream.

    Verify that 'depmod' ($DEPMOD) is installed.
    This is a partial revert of commit 620c231c7a7f
    ("kbuild: do not check for ancient modutils tools").

    Also update Documentation/process/changes.rst to refer to
    kmod instead of module-init-tools.

    Fixes kernel bugzilla #198965:
    https://bugzilla.kernel.org/show_bug.cgi?id=198965

    Signed-off-by: Randy Dunlap
    Cc: Lucas De Marchi
    Cc: Lucas De Marchi
    Cc: Michal Marek
    Cc: Jessica Yu
    Cc: Chih-Wei Huang
    Cc: stable@vger.kernel.org # any kernel since 2012
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Greg Kroah-Hartman

    Randy Dunlap
     
  • commit f967db0b9ed44ec3057a28f3b28efc51df51b835 upstream.

    ioremap() supports pmd mappings on x86-PAE. However, the kernel's pmd
    tables are not shared among processes on x86-PAE. Therefore, any
    update to sync'd pmd entries needs re-syncing. Freeing a pte page
    also leads to a vmalloc fault and hits the BUG_ON in vmalloc_sync_one().

    Disable free page handling on x86-PAE. pud_free_pmd_page() and
    pmd_free_pte_page() simply return 0 if a given pud/pmd entry is present.
    This assures that ioremap() does not update sync'd pmd entries at the
    cost of falling back to pte mappings.
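
    Conceptually, the x86-PAE variants now behave as below (a sketch using
    the post-series prototypes, not the verbatim code):

    int pud_free_pmd_page(pud_t *pud, unsigned long addr)
    {
        return pud_none(*pud);    /* refuse to free while a pmd table is present */
    }

    int pmd_free_pte_page(pmd_t *pmd, unsigned long addr)
    {
        return pmd_none(*pmd);    /* likewise while a pte page is present */
    }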

    Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
    Reported-by: Joerg Roedel
    Signed-off-by: Toshi Kani
    Signed-off-by: Thomas Gleixner
    Cc: mhocko@suse.com
    Cc: akpm@linux-foundation.org
    Cc: hpa@zytor.com
    Cc: cpandya@codeaurora.org
    Cc: linux-mm@kvack.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: stable@vger.kernel.org
    Cc: Andrew Morton
    Cc: Michal Hocko
    Cc: "H. Peter Anvin"
    Cc:
    Link: https://lkml.kernel.org/r/20180627141348.21777-2-toshi.kani@hpe.com
    Signed-off-by: Greg Kroah-Hartman

    Toshi Kani
     
  • commit 0a957467c5fd46142bc9c52758ffc552d4c5e2f7 upstream.

    i8259.h uses inb/outb and thus needs to include asm/io.h to avoid the
    following build error, as seen with x86_64:defconfig and CONFIG_SMP=n.

    In file included from drivers/rtc/rtc-cmos.c:45:0:
    arch/x86/include/asm/i8259.h: In function 'inb_pic':
    arch/x86/include/asm/i8259.h:32:24: error:
    implicit declaration of function 'inb'

    arch/x86/include/asm/i8259.h: In function 'outb_pic':
    arch/x86/include/asm/i8259.h:45:2: error:
    implicit declaration of function 'outb'
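
    The fix essentially amounts to adding the missing include to
    arch/x86/include/asm/i8259.h:

    #include <asm/io.h>    /* provides inb()/outb() */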

    Reported-by: Sebastian Gottschall
    Suggested-by: Sebastian Gottschall
    Fixes: 447ae3166702 ("x86: Don't include linux/irq.h from asm/hardirq.h")
    Signed-off-by: Guenter Roeck
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Guenter Roeck
     
  • commit 1eb46908b35dfbac0ec1848d4b1e39667e0187e9 upstream.

    allmodconfig with CONFIG_KVM_INTEL=n results in the following build error.

    ERROR: "l1tf_vmx_mitigation" [arch/x86/kvm/kvm.ko] undefined!

    Fixes: 5b76a3cff011 ("KVM: VMX: Tell the nested hypervisor to skip L1D flush on vmentry")
    Reported-by: Meelis Roos
    Cc: Meelis Roos
    Cc: Paolo Bonzini
    Cc: Thomas Gleixner
    Signed-off-by: Guenter Roeck
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Guenter Roeck
     

16 Aug, 2018

19 commits

  • Greg Kroah-Hartman
     
  • commit f8b64d08dde2714c62751d18ba77f4aeceb161d3 upstream.

    Move smp_num_siblings and cpu_llc_id to cpu/common.c so that they're
    always present as symbols and not only in the CONFIG_SMP case. Then,
    other code using them doesn't need ugly ifdeffery anymore. Get rid of
    some ifdeffery.

    Signed-off-by: Borislav Petkov
    Signed-off-by: Suravee Suthikulpanit
    Signed-off-by: Borislav Petkov
    Signed-off-by: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1524864877-111962-2-git-send-email-suravee.suthikulpanit@amd.com
    Signed-off-by: Guenter Roeck
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit 6c26fcd2abfe0a56bbd95271fce02df2896cfd24 upstream.

    pfn_modify_allowed() and arch_has_pfn_modify_check() are outside of the
    !__ASSEMBLY__ section in include/asm-generic/pgtable.h, which confuses the
    assembler on archs that don't have __HAVE_ARCH_PFN_MODIFY_ALLOWED (e.g.
    ia64) and breaks the build:

    include/asm-generic/pgtable.h: Assembler messages:
    include/asm-generic/pgtable.h:538: Error: Unknown opcode `static inline bool pfn_modify_allowed(unsigned long pfn,pgprot_t prot)'
    include/asm-generic/pgtable.h:540: Error: Unknown opcode `return true'
    include/asm-generic/pgtable.h:543: Error: Unknown opcode `static inline bool arch_has_pfn_modify_check(void)'
    include/asm-generic/pgtable.h:545: Error: Unknown opcode `return false'
    arch/ia64/kernel/entry.S:69: Error: `mov' does not fit into bundle

    Move those two static inlines into the !__ASSEMBLY__ section so that they
    don't confuse the asm build pass.
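
    The intended placement, sketched here with the function bodies taken from
    the error output above:

    #ifndef __ASSEMBLY__
    #ifndef __HAVE_ARCH_PFN_MODIFY_ALLOWED
    static inline bool pfn_modify_allowed(unsigned long pfn, pgprot_t prot)
    {
        return true;
    }

    static inline bool arch_has_pfn_modify_check(void)
    {
        return false;
    }
    #endif
    #endif /* !__ASSEMBLY__ */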

    Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
    Signed-off-by: Jiri Kosina
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Guenter Roeck
    Signed-off-by: Greg Kroah-Hartman

    Jiri Kosina
     
  • commit 792adb90fa724ce07c0171cbc96b9215af4b1045 upstream.

    The introduction of generic_max_swapfile_size and arch-specific versions has
    broken linking on x86 with CONFIG_SWAP=n due to undefined reference to
    'generic_max_swapfile_size'. Fix it by compiling the x86-specific
    max_swapfile_size() only with CONFIG_SWAP=y.
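
    A sketch of the build fix (illustrative): compile the x86 override only
    when swap support, and with it generic_max_swapfile_size(), exists:

    #ifdef CONFIG_SWAP
    unsigned long max_swapfile_size(void)
    {
        unsigned long pages = generic_max_swapfile_size();

        /* the x86/L1TF clamping of the limit stays in here */
        return pages;
    }
    #endif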

    Reported-by: Tomas Pruzina
    Fixes: 377eeaa8e11f ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2")
    Signed-off-by: Vlastimil Babka
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Vlastimil Babka
     
  • commit 269777aa530f3438ec1781586cdac0b5fe47b061 upstream.

    Commit 0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
    breaks non-SMP builds.

    [ I suspect the 'bool' fields should just be made to be bitfields and be
    exposed regardless of configuration, but that's a separate cleanup
    that I'll leave to the owners of this file for later. - Linus ]

    Fixes: 0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
    Cc: Dave Hansen
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Signed-off-by: Abel Vesa
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Abel Vesa
     
  • commit d0055f351e647f33f3b0329bff022213bf8aa085 upstream.

    The function has an inline "return false;" definition with CONFIG_SMP=n,
    but the "real" definition is also visible, leading to a "redefinition of
    ‘apic_id_is_primary_thread’" compiler error.

    Guard it with #ifdef CONFIG_SMP.
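
    Illustrative sketch of the guard (not the exact diff) — the CONFIG_SMP=n
    stub and the real version must not both be visible at once:

    #ifdef CONFIG_SMP
    bool apic_id_is_primary_thread(unsigned int apicid);   /* real definition elsewhere */
    #else
    static inline bool apic_id_is_primary_thread(unsigned int apicid)
    {
        return false;
    }
    #endif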

    Signed-off-by: Vlastimil Babka
    Fixes: 6a4d2657e048 ("x86/smp: Provide topology_is_primary_thread()")
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Vlastimil Babka
     
  • commit 07d981ad4cf1e78361c6db1c28ee5ba105f96cc1 upstream

    The kernel unnecessarily prevents late microcode loading when SMT is
    disabled. It should be safe to allow it if all the primary threads are
    online.

    Signed-off-by: Josh Poimboeuf
    Acked-by: Borislav Petkov
    Signed-off-by: David Woodhouse
    Signed-off-by: Greg Kroah-Hartman

    Josh Poimboeuf
     
  • commit e24f14b0ff985f3e09e573ba1134bfdf42987e05 upstream

    Signed-off-by: David Woodhouse
    Signed-off-by: Greg Kroah-Hartman

    David Woodhouse
     
  • commit 1063711b57393c1999248cccb57bebfaf16739e7 upstream

    The mmio tracer sets io mapping PTEs and PMDs to non present when enabled
    without inverting the address bits, which makes the PTE entry vulnerable
    to L1TF.

    Make it use the right low level macros to actually invert the address bits
    to protect against L1TF.

    In principle this could be avoided because MMIO tracing is not likely to be
    enabled on production machines, but the fix is straightforward and for
    consistency's sake it's better to get rid of the open-coded PTE manipulation.

    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Andi Kleen
     
  • commit 958f79b9ee55dfaf00c8106ed1c22a2919e0028b upstream

    set_memory_np() is used to mark kernel mappings not present, but it has
    its own open-coded mechanism which does not have the L1TF protection of
    inverting the address bits.

    Replace the open coded PTE manipulation with the L1TF protecting low level
    PTE routines.

    Passes the CPA self test.

    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Andi Kleen
     
  • commit 0768f91530ff46683e0b372df14fd79fe8d156e5 upstream

    Some cases in THP like:
    - MADV_FREE
    - mprotect
    - split

    mark the PMD non-present temporarily to prevent races. The window for
    an L1TF attack in these contexts is very small, but it wants to be fixed
    for correctness' sake.

    Use the proper low level functions for pmd/pud_mknotpresent() to address
    this.

    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Andi Kleen
     
  • commit f22cc87f6c1f771b57c407555cfefd811cdd9507 upstream

    For kernel mappings, PAGE_PROTNONE is not necessarily set for a non-present
    mapping, but the inversion logic explicitly checks for !PRESENT and
    PROT_NONE.

    Remove the PROT_NONE check and make the inversion unconditional for all
    non-present mappings.

    Signed-off-by: Andi Kleen
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Andi Kleen
     
  • commit bc2d8d262cba5736332cbc866acb11b1c5748aa9 upstream

    Josh reported that the late SMT evaluation in cpu_smt_state_init() sets
    cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied
    on the kernel command line as it cannot differentiate between SMT disabled
    by BIOS and SMT soft disable via 'nosmt'. That wreckages the state and
    makes the sysfs interface unusable.

    Rework this so that during bringup of the non-boot CPUs the availability of
    SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a
    'primary' thread then set the local cpu_smt_available marker and evaluate
    this explicitly right after the initial SMP bringup has finished.

    SMT evaluation on x86 is a trainwreck as the firmware has all the
    information _before_ booting the kernel, but there is no interface to query
    it.

    Fixes: 73d5e2b47264 ("cpu/hotplug: detect SMT disabled by BIOS")
    Reported-by: Josh Poimboeuf
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit 5b76a3cff011df2dcb6186c965a2e4d809a05ad4 upstream

    When nested virtualization is in use, VMENTER operations from the nested
    hypervisor into the nested guest will always be processed by the bare metal
    hypervisor, and KVM's "conditional cache flushes" mode in particular does a
    flush on nested vmentry. Therefore, include the "skip L1D flush on
    vmentry" bit in KVM's suggested ARCH_CAPABILITIES setting.

    Add the relevant Documentation.

    Signed-off-by: Paolo Bonzini
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
     
  • commit 8e0b2b916662e09dd4d09e5271cdf214c6b80e62 upstream

    Bit 3 of ARCH_CAPABILITIES tells a hypervisor that L1D flush on vmentry is
    not needed. Add a new value to enum vmx_l1d_flush_state, which is used
    either if there is no L1TF bug at all, or if bit 3 is set in ARCH_CAPABILITIES.

    Signed-off-by: Paolo Bonzini
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
     
  • commit ea156d192f5257a5bf393d33910d3b481bf8a401 upstream

    Three changes to the content of the sysfs file:

    - If EPT is disabled, L1TF cannot be exploited even across threads on the
    same core, and SMT is irrelevant.

    - If mitigation is completely disabled, and SMT is enabled, print "vulnerable"
    instead of "vulnerable, SMT vulnerable"

    - Reorder the two parts so that the main vulnerability state comes first
    and the detail on SMT is second.

    Signed-off-by: Paolo Bonzini
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
     
  • commit cd28325249a1ca0d771557ce823e0308ad629f98 upstream

    This lets userspace read the MSR_IA32_ARCH_CAPABILITIES and check that all
    requested features are available on the host.

    Signed-off-by: Paolo Bonzini
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
     
  • commit 518e7b94817abed94becfe6a44f1ece0d4745afe upstream

    Linux (among others) has checks to make sure that certain features
    aren't enabled on a certain family/model/stepping if the microcode version
    isn't greater than or equal to a known good version.

    By exposing the real microcode version, we're preventing buggy guests that
    don't check that they are running virtualized (i.e., they should trust the
    hypervisor) from disabling features that are effectively not buggy.

    Suggested-by: Filippo Sironi
    Signed-off-by: Wanpeng Li
    Signed-off-by: Radim Krčmář
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Paolo Bonzini
    Cc: Liran Alon
    Cc: Nadav Amit
    Cc: Borislav Petkov
    Cc: Tom Lendacky
    Signed-off-by: Greg Kroah-Hartman

    Wanpeng Li
     
  • commit 66421c1ec340096b291af763ed5721314cdd9c5c upstream

    Introduce kvm_get_msr_feature() to handle the MSRs which are supported
    by different vendors and share the same emulation logic.

    Signed-off-by: Wanpeng Li
    Signed-off-by: Radim Krčmář
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Paolo Bonzini
    Cc: Liran Alon
    Cc: Nadav Amit
    Cc: Borislav Petkov
    Cc: Tom Lendacky
    Signed-off-by: Greg Kroah-Hartman

    Wanpeng Li