12 Feb, 2019

6 commits


31 Jan, 2019

1 commit


13 Jan, 2019

1 commit

  • commit fdd669684655c07dacbdb0d753fd13833de69a33 upstream.

    Calling the test program genwqe_cksum with the default buffer size of
    2MB triggers the following kernel warning on s390:

    WARNING: CPU: 30 PID: 9311 at mm/page_alloc.c:3189 __alloc_pages_nodemask+0x45c/0xbe0
    CPU: 30 PID: 9311 Comm: genwqe_cksum Kdump: loaded Not tainted 3.10.0-957.el7.s390x #1
    task: 00000005e5d13980 ti: 00000005e7c6c000 task.ti: 00000005e7c6c000
    Krnl PSW : 0704c00180000000 00000000002780ac (__alloc_pages_nodemask+0x45c/0xbe0)
    R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
    Krnl GPRS: 00000000002932b8 0000000000b73d7c 0000000000000010 0000000000000009
    0000000000000041 00000005e7c6f9b8 0000000000000001 00000000000080d0
    0000000000000000 0000000000b70500 0000000000000001 0000000000000000
    0000000000b70528 00000000007682c0 0000000000277df2 00000005e7c6f9a0
    Krnl Code: 000000000027809e: de7195001000 ed 1280(114,%r9),0(%r1)
    00000000002780a4: a774fead brc 7,277dfe
    #00000000002780a8: a7f40001 brc 15,2780aa
    >00000000002780ac: 92011000 mvi 0(%r1),1
    00000000002780b0: a7f4fea7 brc 15,277dfe
    00000000002780b4: 9101c6b6 tm 1718(%r12),1
    00000000002780b8: a784ff3a brc 8,277f2c
    00000000002780bc: a7f4fe2e brc 15,277d18
    Call Trace:
    ([] __alloc_pages_nodemask+0x1a2/0xbe0)
    [] s390_dma_alloc+0xfe/0x310
    [] __genwqe_alloc_consistent+0xfa/0x148 [genwqe_card]
    [] genwqe_mmap+0xca/0x248 [genwqe_card]
    [] mmap_region+0x4e2/0x778
    [] do_mmap+0x2ac/0x3e0
    [] vm_mmap_pgoff+0xd6/0x118
    [] SyS_mmap_pgoff+0xdc/0x268
    [] SyS_old_mmap+0x8c/0xb0
    [] sysc_tracego+0x14/0x1e
    [] 0x3ffacf87dc6

    It turns out that the check in __genwqe_alloc_consistent uses "> MAX_ORDER"
    while the mm code uses ">= MAX_ORDER". Fix genwqe.

    Cc: stable@vger.kernel.org
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Frank Haverkamp
    Signed-off-by: Greg Kroah-Hartman

    Christian Borntraeger
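
    The off-by-one described in this commit can be illustrated in plain C.
    This is a minimal userspace sketch, not the driver code: the constants
    PAGE_SHIFT_SKETCH and MAX_ORDER_SKETCH and all three functions are
    hypothetical, and the value 9 is only an assumption, chosen so that a
    2MB request with 4KB pages lands exactly on the limit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values for illustration only; real kernels define their own. */
#define PAGE_SHIFT_SKETCH 12   /* 4 KB pages (assumption) */
#define MAX_ORDER_SKETCH   9   /* assumed limit; valid orders are 0..8 */

/* Smallest order such that (1 << order) pages cover `size` bytes. */
static unsigned int order_for_size(size_t size)
{
	size_t pages = (size + ((size_t)1 << PAGE_SHIFT_SKETCH) - 1) >> PAGE_SHIFT_SKETCH;
	unsigned int order = 0;

	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* Buggy check: lets order == MAX_ORDER through to the allocator. */
static bool size_ok_buggy(size_t size)
{
	return !(order_for_size(size) > MAX_ORDER_SKETCH);
}

/* Fixed check: rejects the same range the allocator rejects. */
static bool size_ok_fixed(size_t size)
{
	return !(order_for_size(size) >= MAX_ORDER_SKETCH);
}
```

    Under these assumptions a 2MB buffer maps to order 9, which the buggy
    check accepts and the fixed check rejects.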
     

06 Dec, 2018

1 commit

  • commit 6484a677294aa5d08c0210f2f387ebb9be646115 upstream.

    gcc '-Wunused-but-set-variable' warning:

    drivers/misc/mic/scif/scif_rma.c: In function 'scif_create_remote_lookup':
    drivers/misc/mic/scif/scif_rma.c:373:25: warning:
    variable 'vmalloc_num_pages' set but not used [-Wunused-but-set-variable]

    'vmalloc_num_pages' should be used to determine if the address is
    within the vmalloc range.

    Fixes: ba612aa8b487 ("misc: mic: SCIF memory registration and unregistration")
    Signed-off-by: YueHaibing
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    YueHaibing
     

27 Nov, 2018

2 commits

  • commit fee05f455ceb5c670cbe48e2f9454ebc4a388554 upstream.

    req.gid can be indirectly controlled by user-space, hence leading to
    a potential exploitation of the Spectre variant 1 vulnerability.

    This issue was detected with the help of Smatch:

    vers/misc/sgi-gru/grukdump.c:200 gru_dump_chiplet_request() warn:
    potential spectre issue 'gru_base' [w]

    Fix this by sanitizing req.gid before calling macro GID_TO_GRU, which
    uses it to index gru_base.

    Notice that given that speculation windows are large, the policy is
    to kill the speculation on the first load and not worry if it can be
    completed with a dependent load/store [1].

    [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

    Cc: stable@vger.kernel.org
    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Greg Kroah-Hartman

    Gustavo A. R. Silva
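
    In the kernel, this kind of sanitizing is typically done with
    array_index_nospec() from <linux/nospec.h>. The following is a userspace
    approximation of the branch-free masking idea behind it, not the kernel
    implementation: index_mask_nospec(), sanitize_index() and lookup() are
    hypothetical names, and the sketch assumes that right-shifting a
    negative long is an arithmetic shift (true for GCC and Clang).

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG_SKETCH (sizeof(long) * CHAR_BIT)

/*
 * All-ones when index < size, zero otherwise, computed without a branch.
 * Assumes size <= LONG_MAX and an arithmetic right shift of negative
 * values (implementation-defined in ISO C).
 */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (BITS_PER_LONG_SKETCH - 1));
}

/* Clamp an out-of-range index to 0 so that a mispredicted bounds check
 * cannot steer a speculative load outside the table. */
static unsigned long sanitize_index(unsigned long index, unsigned long size)
{
	return index & index_mask_nospec(index, size);
}

static int lookup(const int *table, unsigned long size, unsigned long index)
{
	if (index >= size)
		return -1;                      /* architectural bounds check */
	index = sanitize_index(index, size);    /* speculative-path clamp */
	return table[index];
}
```

    The clamp is applied after the ordinary bounds check and before the
    index is used, which matches the "kill the speculation on the first
    load" policy quoted in the commit message.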
     
  • commit 7c97301285b62a41d6bceded7d964085fc8cc50f upstream.

    After building the kernel with Clang, the following section mismatch
    warning appears:

    WARNING: vmlinux.o(.text+0x3bf19a6): Section mismatch in reference from
    the function ssc_probe() to the function
    .init.text:atmel_ssc_get_driver_data()
    The function ssc_probe() references
    the function __init atmel_ssc_get_driver_data().
    This is often because ssc_probe lacks a __init
    annotation or the annotation of atmel_ssc_get_driver_data is wrong.

    Remove __init from atmel_ssc_get_driver_data to get rid of the mismatch.

    Signed-off-by: Nathan Chancellor
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Nathan Chancellor
     

14 Nov, 2018

2 commits

  • commit 0ab93e9c99f8208c0a1a7b7170c827936268c996 upstream.

    By caching current without reference counting, genwqe_add_file and
    genwqe_del_file embed the assumption that a file descriptor will never
    be passed from one process to another. They even embed the assumption
    that the thread that opened the file will still exist when the process
    terminates. Neither is guaranteed to be true.

    Therefore replace caching the task_struct of the opener with caching
    the pid of the opener's thread group. The only thing the knowledge of
    the opener is used for is as the target of SIGKILL, and a SIGKILL
    will kill the entire process group.

    Rename genwqe_force_sig to genwqe_terminate, remove its unnecessary
    signal argument, update its only caller, and use kill_pid
    instead of force_sig.

    The work force_sig does in changing signal handling state is not
    relevant to SIGKILL sent as SEND_SIG_PRIV. The exact same processes
    will be killed, just with less work and less confusion. The work done
    by force_sig is really only needed for handling synchronous
    exceptions.

    It will still be possible to cause genwqe_device_remove to wait
    8 seconds by passing a file descriptor to another process, but
    the possible use after free is fixed.

    Fixes: eaf4722d4645 ("GenWQE Character device and DDCB queue")
    Cc: stable@vger.kernel.org
    Cc: Greg Kroah-Hartman
    Cc: Frank Haverkamp
    Cc: Joerg-Stephan Vogt
    Cc: Michael Jung
    Cc: Michael Ruettger
    Cc: Kleber Sacilotto de Souza
    Cc: Sebastian Ott
    Cc: Eberhard S. Amann
    Cc: Gabriel Krisman Bertazi
    Cc: Guilherme G. Piccoli
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • [ Upstream commit 11924ba5e671d6caef1516923e2bd8c72929a3fe ]

    When adding a VMCI resource, the check for an existing entry
    would ignore that the new entry could be a wildcard. This could
    result in multiple resource entries that would match a given
    handle. One disastrous outcome of this is that the
    refcounting used to ensure that delayed callbacks for VMCI
    datagrams have run before the datagram is destroyed can be
    wrong, since the refcount could be increased on the duplicate
    entry. This in turn leads to a use after free bug. This issue
    was discovered by Hangbin Liu using KASAN and syzkaller.

    Fixes: bc63dedb7d46 ("VMCI: resource object implementation")
    Reported-by: Hangbin Liu
    Reviewed-by: Adit Ranadive
    Reviewed-by: Vishnu Dasa
    Signed-off-by: Jorgen Hansen
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jorgen Hansen
     

10 Nov, 2018

1 commit

  • [ Upstream commit a2b3bf4846e5eed62ea6abb096af2c950961033c ]

    Provide a flexible way to determine the addressing bits of eeprom.
    Pass the addressing bits to driver through address-width property.

    Signed-off-by: Alan Chiang
    Signed-off-by: Andy Yeh
    Signed-off-by: Bartosz Golaszewski
    Signed-off-by: Sasha Levin

    Alan Chiang
     

04 Oct, 2018

3 commits

  • [ Upstream commit d5b9653dd2bb7a2b1c8cc783c5d3b607bbb6b271 ]

    Make sure to enable the clock before registering regions and exporting
    partitions to user space at which point we must be prepared for I/O.

    Fixes: ee895ccdf776 ("misc: sram: fix enabled clock leak on error path")
    Signed-off-by: Johan Hovold
    Reviewed-by: Vladimir Zapolskiy
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • [ Upstream commit 7fb2fd4e25fc1fb10dcb30b5519de257cfeae84c ]

    The problem is that if get_user_pages_fast() fails and returns a
    negative error code, it gets type promoted to a high positive value and
    treated as a success.

    Fixes: 06164d2b72aa ("VMCI: queue pairs implementation.")
    Signed-off-by: Dan Carpenter
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
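
    The type promotion described in this commit is easy to reproduce in
    userspace. In this minimal sketch, pin_pages_sketch() is a hypothetical
    stand-in for a call that returns either a page count or a negative
    error code; the explicit cast in the buggy variant spells out the
    conversion that the compiler applies implicitly when an int is compared
    with a size_t.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in: returns the number of pages pinned, or a
 * negative error code on failure. */
static int pin_pages_sketch(bool fail, int want)
{
	return fail ? -14 : want;   /* -14 mirrors -EFAULT */
}

/* Buggy: the negative result is converted to size_t, becomes a huge
 * positive value, and the "too few pages" test never fires. */
static bool pinned_all_buggy(bool fail, size_t num_pages)
{
	int got = pin_pages_sketch(fail, (int)num_pages);

	return !((size_t)got < num_pages);
}

/* Fixed: check for a negative return before any unsigned comparison. */
static bool pinned_all_fixed(bool fail, size_t num_pages)
{
	int got = pin_pages_sketch(fail, (int)num_pages);

	if (got < 0)
		return false;
	return (size_t)got >= num_pages;
}
```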
     
  • [ Upstream commit ce054546cc2c26891cefa2f284d90d93b52205de ]

    The ADC channel 0 photodiode detects both infrared and visible light,
    but ADC channel 1 detects only infrared. However, the latter is a bit
    more sensitive in that range, so complete darkness or low light causes
    an error condition in which chan0 - chan1 is negative, which
    results in -EAGAIN.

    This patch changes the resulting lux1_input sysfs attribute message from
    "Resource temporarily unavailable" to a user-grokable lux value of 0.

    Cc: Arnd Bergmann
    Cc: Greg Kroah-Hartman
    Signed-off-by: Matt Ranostay
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Matt Ranostay
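
    The behaviour change can be sketched as a subtraction floored at zero.
    This is only an illustration of the idea: visible_counts() is a
    hypothetical helper, and the real driver converts the channel readings
    to lux with its own formula.

```c
#include <stdint.h>

/* Hypothetical helper: difference of two ADC readings, floored at zero. */
static unsigned int visible_counts(uint16_t chan0, uint16_t chan1)
{
	/* chan0: visible + infrared, chan1: infrared only. In darkness
	 * chan1 can exceed chan0, so report 0 instead of an error. */
	if (chan1 > chan0)
		return 0;
	return (unsigned int)(chan0 - chan1);
}
```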
     

29 Sep, 2018

1 commit

  • commit a3b92ee6fc171d7c9d9b6b829b7fef169210440c upstream.

    Fix a build error due to missing virt_to_phys()

    Reported-by: kbuild test robot
    Fixes: f0a1bf29d821b ("vmw_balloon: fix inflation with batching")
    Cc: stable@vger.kernel.org
    Cc: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     

26 Sep, 2018

4 commits

  • commit b40b3e9358fbafff6a4ba0f4b9658f6617146f9c upstream.

    We accidentally removed the check for negative returns
    without considering the issue of type promotion.
    The "if_version_length" variable is type size_t so if __mei_cl_recv()
    returns a negative then "bytes_recv" is type promoted
    to a high positive value and treated as success.

    Cc:
    Fixes: 582ab27a063a ("mei: bus: fix received data size check in NFC fixup")
    Signed-off-by: Dan Carpenter
    Signed-off-by: Tomas Winkler
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     
  • commit 34f1166afd67f9f48a08c52f36180048908506a4 upstream.

    In case a client fails to connect in mei_cldev_enable(), the
    caller won't call mei_cldev_disable(), leaving the client
    in a linked state. Upon driver unload the client structure
    will be freed in mei_cl_bus_dev_release(), leaving a stale pointer
    on a fail_list. This will eventually end up in a crash
    during the power down flow in mei_cl_set_disconnected().

    RIP: mei_cl_set_disconnected+0x5/0x260[mei]
    Call trace:
    mei_cl_all_disconnect+0x22/0x30
    mei_reset+0x194/0x250
    __synchronize_hardirq+0x43/0x50
    _cond_resched+0x15/0x30
    mei_me_intr_clear+0x20/0x100
    mei_stop+0x76/0xb0
    mei_me_shutdown+0x3f/0x80
    pci_device_shutdown+0x34/0x60
    kernel_restart+0x0e/0x30

    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200455
    Fixes: 'c110cdb17148 ("mei: bus: make a client pointer always available")'
    Cc: 4.10+
    Tested-by: Georg Müller
    Signed-off-by: Tomas Winkler
    Signed-off-by: Greg Kroah-Hartman

    Tomas Winkler
     
  • commit 8d2d8935d30cc2acc57a3196dc10dfa8d5cbcdab upstream.

    Some of the ME clients are available only for BIOS operation and are
    removed during hand off to an OS. However, the removal is not instant.
    A client may be visible on the client list when the mei driver requests
    enumeration, while the subsequent request for properties will be
    answered with a client-not-found error value. The default behavior
    for an error is to perform client reset, but this error is harmless and
    the link reset should be prevented. This issue started to be visible due to
    suspend/resume timing changes. It has so far been reported only on
    Haswell-based systems.

    Fixes:
    [33.564957] mei_me 0000:00:16.0: hbm: properties response: wrong status = 1 CLIENT_NOT_FOUND
    [33.564978] mei_me 0000:00:16.0: mei_irq_read_handler ret = -71.
    [33.565270] mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS fw status = 1E000255 60002306 00000200 00004401 00000000 00000010

    Cc:
    Reported-by: Heiner Kallweit
    Signed-off-by: Alexander Usyskin
    Signed-off-by: Tomas Winkler
    Signed-off-by: Greg Kroah-Hartman

    Alexander Usyskin
     
  • commit de916736aaaadddbd6061472969f667b14204aa9 upstream.

    val is indirectly controlled by user-space, hence leading to a
    potential exploitation of the Spectre variant 1 vulnerability.

    This issue was detected with the help of Smatch:

    drivers/misc/hmc6352.c:54 compass_store() warn: potential spectre issue
    'map' [r]

    Fix this by sanitizing val before using it to index map.

    Notice that given that speculation windows are large, the policy is
    to kill the speculation on the first load and not worry if it can be
    completed with a dependent load/store [1].

    [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

    Cc: stable@vger.kernel.org
    Signed-off-by: Gustavo A. R. Silva
    Signed-off-by: Greg Kroah-Hartman

    Gustavo A. R. Silva
     

20 Sep, 2018

2 commits

  • [ Upstream commit 81ae962d7f180c0092859440c82996cccb254976 ]

    Free resources instead of returning the error code directly if
    kim_probe fails.

    Found by Linux Driver Verification project (linuxtesting.org).

    Signed-off-by: Anton Vasilyev
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Anton Vasilyev
     
  • [ Upstream commit a39284ae9d2ad09975c8ae33f1bd0f05fbfbf6ee ]

    There are only 2 callers of scif_get_new_port() and both appear to get
    the error handling wrong. Both treat zero returns as error, but it
    actually returns negative error codes and >= 0 on success.

    Fixes: e9089f43c9a7 ("misc: mic: SCIF open close bind and listen APIs")
    Signed-off-by: Dan Carpenter
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
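
    The return-value convention at issue can be shown with a small sketch.
    get_new_port_sketch() is a hypothetical allocator that follows the
    convention stated in the commit message (negative error code on
    failure, a value >= 0 on success); it is not the SCIF function itself.

```c
#include <stdbool.h>

/* Hypothetical allocator: negative error code on failure, port number
 * (>= 0) on success. */
static int get_new_port_sketch(bool exhausted, int next_free)
{
	return exhausted ? -28 : next_free;   /* -28 mirrors -ENOSPC */
}

/* Buggy caller: treats 0 as failure and lets negative errors through. */
static bool bind_ok_buggy(bool exhausted, int next_free)
{
	int port = get_new_port_sketch(exhausted, next_free);

	return port != 0;
}

/* Fixed caller: only negative values are errors. */
static bool bind_ok_fixed(bool exhausted, int next_free)
{
	int port = get_new_port_sketch(exhausted, next_free);

	return port >= 0;
}
```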
     

10 Sep, 2018

5 commits

  • commit c3cc1b0fc27508da53fe955a3b23d03964410682 upstream.

    Currently, when all modules, including VMCI and VMware balloon are built
    into the kernel, the initialization of the balloon happens before the
    VMCI is probed. As a result, the balloon fails to initialize the VMCI
    doorbell, which it uses to get asynchronous requests for balloon size
    changes.

    The problem can be seen in the logs, in the form of the following
    message:
    "vmw_balloon: failed to initialize vmci doorbell"

    The driver would work correctly but slightly less efficiently, probing
    for requests periodically. This patch changes the balloon to be
    initialized using late_initcall() instead of module_init() to address
    this issue. It does not address a situation in which VMCI is built as a
    module and the balloon is built into the kernel.

    Fixes: 48e3d668b790 ("VMware balloon: Enable notification via VMCI")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • commit ce664331b2487a5d244a51cbdd8cb54f866fbe5d upstream.

    When vmballoon_vmci_init() sets a doorbell using VMCI_DOORBELL_SET, for
    some reason it does not consider the status and looks at the result.
    However, the hypervisor does not update the result - it updates the
    status. This might cause VMCI doorbell not to be enabled, resulting in
    degraded performance.

    Fixes: 48e3d668b790 ("VMware balloon: Enable notification via VMCI")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • commit 5081efd112560d3febb328e627176235b250d59d upstream.

    If the hypervisor sets 2MB batching on while batching is cleared,
    the balloon code breaks. In this case the legacy mechanism is used with
    a 2MB page. The VM would report that a 2MB page is ballooned, and the
    hypervisor would only take the first 4KB.

    While the hypervisor should not report such settings, make the code more
    robust by not enabling 2MB support without batching.

    Fixes: 365bd7ef7ec8e ("VMware balloon: Support 2m page ballooning.")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
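
    The hardening described above amounts to masking one capability bit
    unless another is present. A minimal sketch, assuming two hypothetical
    capability flags (the driver's real names and bit values may differ):

```c
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define CAP_BATCHED_SKETCH     (1u << 0)
#define CAP_BATCHED_2M_SKETCH  (1u << 1)

/* Drop 2MB support unless plain batching was negotiated as well. */
static uint32_t sanitize_caps(uint32_t caps)
{
	if (!(caps & CAP_BATCHED_SKETCH))
		caps &= ~CAP_BATCHED_2M_SKETCH;
	return caps;
}
```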
     
  • commit 09755690c6b7c1eabdc4651eb3b276f8feb1e447 upstream.

    When balloon batching is not supported by the hypervisor, the guest
    frame number (GFN) must fit in 32-bit. However, due to a bug, this check
    was mistakenly ignored. In practice, when total RAM is greater than
    16TB, the balloon does not work currently, making this bug unlikely to
    happen.

    Fixes: ef0f8f112984 ("VMware balloon: partially inline vmballoon_reserve_page.")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
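
    The 16TB figure follows from the 32-bit limit: with 4KB pages, 2^32
    frame numbers cover 2^32 * 2^12 = 2^44 bytes. A minimal sketch of the
    kind of check involved, using hypothetical helper names and assuming
    4KB pages:

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12   /* 4 KB pages (assumption) */

/* True when a page frame number survives truncation to 32 bits. */
static bool pfn_fits_32bit(uint64_t pfn)
{
	return (uint64_t)(uint32_t)pfn == pfn;
}

/* First byte address whose frame number no longer fits in 32 bits. */
static uint64_t first_unrepresentable_address(void)
{
	return ((uint64_t)1 << 32) << PAGE_SHIFT_SKETCH;
}
```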
     
  • commit ef6cb5f1a048fdf91ccee6d63d2bfa293338502d upstream.

    Function atomic_inc_unless_negative() returns a bool to indicate
    success/failure. However, cxl_adapter_context_get() wrongly compares
    the return value against '>=0', which will always be true. The patch
    fixes this comparison to '==0', thereby also fixing this compile time
    warning:

    drivers/misc/cxl/main.c:290 cxl_adapter_context_get()
    warn: 'atomic_inc_unless_negative(&adapter->contexts_num)' is unsigned

    Fixes: 70b565bbdb91 ("cxl: Prevent adapter reset if an active context exists")
    Cc: stable@vger.kernel.org # v4.9+
    Reported-by: Dan Carpenter
    Signed-off-by: Vaibhav Jain
    Acked-by: Andrew Donnellan
    Acked-by: Frederic Barrat
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Vaibhav Jain
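
    Why a '>= 0' test on a boolean result can never fail is shown by the
    following userspace model. It uses C11 atomics rather than the kernel's
    atomic_t, and inc_unless_negative(), context_get_buggy() and
    context_get_fixed() are hypothetical names for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace model of "increment unless negative": returns true if the
 * counter was incremented, false if it was negative and left untouched. */
static bool inc_unless_negative(atomic_int *v)
{
	int cur = atomic_load(v);

	while (cur >= 0) {
		/* On failure, cur is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &cur, cur + 1))
			return true;
	}
	return false;
}

/* Buggy: the result is only ever 0 or 1, so this always reports success. */
static int context_get_buggy(atomic_int *contexts)
{
	int rc = inc_unless_negative(contexts);

	return rc >= 0 ? 0 : -16;   /* -16 mirrors -EBUSY */
}

/* Fixed: treat the result as the boolean it is. */
static int context_get_fixed(atomic_int *contexts)
{
	return inc_unless_negative(contexts) ? 0 : -16;
}
```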
     

05 Sep, 2018

1 commit

  • commit a103af1b64d74853a5e08ca6c86aeb0e5c6ca4f1 upstream.

    MEI enables writes of complete messages only,
    while reads can be performed in parts. Hence
    write should not update the file offset, so as
    not to break interleaving of partial reads with writes.

    Cc:
    Signed-off-by: Alexander Usyskin
    Signed-off-by: Tomas Winkler
    Signed-off-by: Greg Kroah-Hartman

    Alexander Usyskin
     

22 Aug, 2018

1 commit

  • commit f294d00961d1d869ecffa60e280eeeee1ccf9a49 upstream.

    Make sure to disable clocks and deregister any exported partitions
    before returning on late probe errors.

    Note that since commit ee895ccdf776 ("misc: sram: fix enabled clock leak
    on error path"), partitions are deliberately exported before enabling
    the clock so we stick to that logic here. A follow up patch will address
    this.

    Cc: stable # 4.9
    Cc: Alexandre Belloni
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     

25 Jul, 2018

1 commit


17 Jul, 2018

2 commits

  • commit 90d72ce079791399ac255c75728f3c9e747b093d upstream.

    Embarrassingly, the recent fix introduced a worse problem than it solved,
    causing the balloon not to inflate. The VM informed the hypervisor that
    the pages for lock/unlock are sitting in the wrong address, as it used
    the uninitialized page variable.

    Fixes: b23220fe054e9 ("vmw_balloon: fixing double free when batching mode is off")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • commit a0341fc1981a950c1e902ab901e98f60e0e243f3 upstream.

    This read handler had a lot of custom logic and wrote outside the bounds of
    the provided buffer. This could lead to kernel and userspace memory
    corruption. Just use simple_read_from_buffer() with a stack buffer.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jann Horn
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn
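
    The kernel helper named in this commit, simple_read_from_buffer(),
    clamps a read to the bytes that remain in the source buffer and
    advances the file position. The following is a userspace model of that
    behaviour, not the kernel function: read_from_buffer_sketch() is a
    hypothetical name, memcpy() stands in for the copy to user memory, and
    long long stands in for loff_t.

```c
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/*
 * Copy at most `count` bytes of `from` (which holds `available` bytes),
 * starting at *ppos, then advance *ppos by the number of bytes copied.
 */
static ssize_t read_from_buffer_sketch(void *to, size_t count, long long *ppos,
				       const void *from, size_t available)
{
	long long pos = *ppos;

	if (pos < 0)
		return -22;                 /* mirrors -EINVAL */
	if ((unsigned long long)pos >= available || count == 0)
		return 0;                   /* nothing left to read */
	if (count > available - (size_t)pos)
		count = available - (size_t)pos;
	memcpy(to, (const char *)from + pos, count);
	*ppos = pos + (long long)count;
	return (ssize_t)count;
}
```

    Because the copy length is derived from both the caller's count and
    the remaining source bytes, the destination can never be written past
    the size the caller asked for.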
     

03 Jul, 2018

1 commit

  • commit b6c84ba22ff3a198eb8d5552cf9b8fda1d792e54 upstream.

    Currently we see a kernel-oops reported on Power-9 while attaching a
    context to an AFU, with radix-mode and sysfs attr 'prefault_mode' set
    to anything other than 'none'. The backtrace of the oops is of this
    form:

    Unable to handle kernel paging request for data at address 0x00000080
    Faulting instruction address: 0xc00800000bcf3b20
    cpu 0x1: Vector: 300 (Data Access) at [c00000037f003800]
    pc: c00800000bcf3b20: cxl_load_segment+0x178/0x290 [cxl]
    lr: c00800000bcf39f0: cxl_load_segment+0x48/0x290 [cxl]
    sp: c00000037f003a80
    msr: 9000000000009033
    dar: 80
    dsisr: 40000000
    current = 0xc00000037f280000
    paca = 0xc0000003ffffe600 softe: 3 irq_happened: 0x01
    pid = 3529, comm = afp_no_int

    cxl_prefault+0xfc/0x248 [cxl]
    process_element_entry_psl9+0xd8/0x1a0 [cxl]
    cxl_attach_dedicated_process_psl9+0x44/0x130 [cxl]
    native_attach_process+0xc0/0x130 [cxl]
    afu_ioctl+0x3f4/0x5e0 [cxl]
    do_vfs_ioctl+0xdc/0x890
    ksys_ioctl+0x68/0xf0
    sys_ioctl+0x40/0xa0
    system_call+0x58/0x6c

    The issue arises because on Power-8 the AFU attr 'prefault_mode' was
    used to improve initial storage fault performance by prefaulting
    process segments. However, on Power-9 with radix mode we don't have
    Storage-Segments that we can prefault. Also, prefaulting process Pages
    would be too costly and fine-grained.

    Hence, since the prefaulting mechanism doesn't make sense for
    radix mode, this patch updates prefault_mode_store() to not allow any
    other value apart from CXL_PREFAULT_NONE when radix mode is enabled.

    Fixes: f24be42aab37 ("cxl: Add psl9 specific code")
    Cc: stable@vger.kernel.org # v4.12+
    Signed-off-by: Vaibhav Jain
    Acked-by: Frederic Barrat
    Acked-by: Andrew Donnellan
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Vaibhav Jain
     

16 Jun, 2018

1 commit

  • commit b23220fe054e92f616b82450fae8cd3ab176cc60 upstream.

    The balloon.page field is used for two different purposes if batching is
    on or off. If batching is on, the field points to the page which is used
    to communicate with the hypervisor. If it is off, balloon.page
    points to the page that is about to be (un)locked.

    Unfortunately, this dual-purpose of the field introduced a bug: when the
    balloon is popped (e.g., when the machine is reset or the balloon driver
    is explicitly removed), the balloon driver frees, unconditionally, the
    page that is held in balloon.page. As a result, if batching is
    disabled, this leads to double freeing the last page that is sent to the
    hypervisor.

    The following error occurs during rmmod when kernel checkers are on, and
    the balloon is not empty:

    [ 42.307653] ------------[ cut here ]------------
    [ 42.307657] Kernel BUG at ffffffffba1e4b28 [verbose debug info unavailable]
    [ 42.307720] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
    [ 42.312512] Modules linked in: vmw_vsock_vmci_transport vsock ppdev joydev vmw_balloon(-) input_leds serio_raw vmw_vmci parport_pc shpchp parport i2c_piix4 nfit mac_hid autofs4 vmwgfx drm_kms_helper hid_generic syscopyarea sysfillrect usbhid sysimgblt fb_sys_fops hid ttm mptspi scsi_transport_spi ahci mptscsih drm psmouse vmxnet3 libahci mptbase pata_acpi
    [ 42.312766] CPU: 10 PID: 1527 Comm: rmmod Not tainted 4.12.0+ #5
    [ 42.312803] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2016
    [ 42.313042] task: ffff9bf9680f8000 task.stack: ffffbfefc1638000
    [ 42.313290] RIP: 0010:__free_pages+0x38/0x40
    [ 42.313510] RSP: 0018:ffffbfefc163be98 EFLAGS: 00010246
    [ 42.313731] RAX: 000000000000003e RBX: ffffffffc02b9720 RCX: 0000000000000006
    [ 42.313972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9bf97e08e0a0
    [ 42.314201] RBP: ffffbfefc163be98 R08: 0000000000000000 R09: 0000000000000000
    [ 42.314435] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc02b97e4
    [ 42.314505] R13: ffffffffc02b9748 R14: ffffffffc02b9728 R15: 0000000000000200
    [ 42.314550] FS: 00007f3af5fec700(0000) GS:ffff9bf97e080000(0000) knlGS:0000000000000000
    [ 42.314599] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 42.314635] CR2: 00007f44f6f4ab24 CR3: 00000003a7d12000 CR4: 00000000000006e0
    [ 42.314864] Call Trace:
    [ 42.315774] vmballoon_pop+0x102/0x130 [vmw_balloon]
    [ 42.315816] vmballoon_exit+0x42/0xd64 [vmw_balloon]
    [ 42.315853] SyS_delete_module+0x1e2/0x250
    [ 42.315891] entry_SYSCALL_64_fastpath+0x23/0xc2
    [ 42.315924] RIP: 0033:0x7f3af5b0e8e7
    [ 42.315949] RSP: 002b:00007fffe6ce0148 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
    [ 42.315996] RAX: ffffffffffffffda RBX: 000055be676401e0 RCX: 00007f3af5b0e8e7
    [ 42.316951] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055be67640248
    [ 42.317887] RBP: 0000000000000003 R08: 0000000000000000 R09: 1999999999999999
    [ 42.318845] R10: 0000000000000883 R11: 0000000000000206 R12: 00007fffe6cdf130
    [ 42.319755] R13: 0000000000000000 R14: 0000000000000000 R15: 000055be676401e0
    [ 42.320606] Code: c0 74 1c f0 ff 4f 1c 74 02 5d c3 85 f6 74 07 e8 0f d8 ff ff 5d c3 31 f6 e8 c6 fb ff ff 5d c3 48 c7 c6 c8 0f c5 ba e8 58 be 02 00 0b 66 0f 1f 44 00 00 66 66 66 66 90 48 85 ff 75 01 c3 55 48
    [ 42.323462] RIP: __free_pages+0x38/0x40 RSP: ffffbfefc163be98
    [ 42.325735] ---[ end trace 872e008e33f81508 ]---

    To solve the bug, we eliminate the dual purpose of balloon.page.

    Fixes: f220a80f0c2e ("VMware balloon: add batching to the vmw_balloon.")
    Cc: stable@vger.kernel.org
    Reported-by: Oleksandr Natalenko
    Signed-off-by: Gil Kupfer
    Signed-off-by: Nadav Amit
    Reviewed-by: Xavier Deguillard
    Tested-by: Oleksandr Natalenko
    Signed-off-by: Greg Kroah-Hartman

    Gil Kupfer
     

30 May, 2018

1 commit

  • [ Upstream commit 94322ed8e857e3b2a33cf75118051af9baaa110f ]

    PSL9D doesn't have a data-cache that needs to be flushed before
    resetting the card. However, when cxl tries to flush the data-cache on
    such a card, it times out as the PSL_Control register never indicates
    that the flush operation is complete, due to the missing data-cache.
    This is usually indicated in the kernel logs with this message:

    "WARNING: cache flush timed out"

    To fix this the patch checks PSL_Debug register CDC-Field(BIT:27)
    which indicates the absence of a data-cache and sets a flag
    'no_data_cache' in 'struct cxl_native' to indicate this. When
    cxl_data_cache_flush() is called it checks the flag and if set bails
    out early without requesting a data-cache flush operation to the PSL.

    Signed-off-by: Vaibhav Jain
    Acked-by: Andrew Donnellan
    Acked-by: Frederic Barrat
    Signed-off-by: Michael Ellerman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Vaibhav Jain
     

24 Apr, 2018

1 commit

  • commit ad7b4e8022b9864c075fe71e1328b1d25cad82f6 upstream.

    cxllib_handle_fault() is called by an external driver when it needs to
    have the host resolve page faults for a buffer. The buffer can cover
    several pages and VMAs. The function iterates over all the pages used
    by the buffer, based on the page size of the VMA.

    To ensure some stability while processing the faults, the thread T1
    grabs the mm->mmap_sem semaphore with read access (R1). However, when
    processing a page fault for a single page, one of the underlying
    functions, copro_handle_mm_fault(), also grabs the same semaphore with
    read access (R2). So the thread T1 takes the semaphore twice.

    If another thread T2 tries to access the semaphore in write mode W1
    (say, because it wants to allocate memory and calls 'brk'), then that
    thread T2 will have to wait because there's a reader (R1). If the
    thread T1 is processing a new page at that time, it won't get an
    automatic grant at R2, because there's now a writer thread
    waiting (T2). And we have a deadlock.

    The timeline is:
    1. thread T1 owns the semaphore with read access R1
    2. thread T2 requests write access W1 and waits
    3. thread T1 requests read access R2 and waits

    The fix is for the thread T1 to release the semaphore R1 once it has
    the information it needs from the current VMA. The address space/VMAs
    could evolve while T1 iterates over the full buffer, but in the
    unlikely case where T1 misses a page, the external driver will raise a
    new page fault when retrying the memory access.

    Fixes: 3ced8d730063 ("cxl: Export library to support IBM XSL")
    Cc: stable@vger.kernel.org # 4.13+
    Signed-off-by: Frederic Barrat
    Signed-off-by: Michael Ellerman
    Signed-off-by: Greg Kroah-Hartman

    Frederic Barrat
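
    The three-step timeline in this commit can be replayed with a toy,
    single-threaded model of a fair reader/writer lock. This is a sketch
    under simplifying assumptions, not the kernel's rwsem: struct rw_model
    and all functions are hypothetical, and "waiting" is modelled as a
    request that returns false.

```c
#include <stdbool.h>

/* Toy model of a fair reader/writer lock: once a writer is queued, new
 * read requests are refused so that the writer cannot starve. */
struct rw_model {
	int readers;          /* read holds currently granted */
	bool writer_waiting;  /* a writer is queued behind the readers */
	bool writer_active;   /* a writer currently holds the lock */
};

/* Returns true if the read hold is granted immediately. */
static bool model_read_try(struct rw_model *l)
{
	if (l->writer_waiting || l->writer_active)
		return false;
	l->readers++;
	return true;
}

static void model_read_release(struct rw_model *l)
{
	l->readers--;
}

/* Returns true if the write hold is granted; otherwise the writer queues. */
static bool model_write_try(struct rw_model *l)
{
	if (l->readers > 0) {
		l->writer_waiting = true;
		return false;
	}
	l->writer_active = true;
	return true;
}

/* Replays the timeline; returns true if it ends in deadlock. */
static bool nested_read_deadlocks(bool release_before_nested)
{
	struct rw_model l = { 0, false, false };
	bool r1, w1, r2;

	r1 = model_read_try(&l);            /* 1. T1 takes R1 */
	if (release_before_nested)
		model_read_release(&l);     /* fix: drop R1 before the fault path */
	w1 = model_write_try(&l);           /* 2. T2 requests W1 */
	r2 = model_read_try(&l);            /* 3. T1 requests R2 */

	/* Deadlock: T2 waits for R1 while T1, still holding R1, waits
	 * behind T2. If R1 was released, T2 simply runs first. */
	return r1 && !w1 && !r2;
}
```

    With the release in place, T2 obtains the write hold and T1's nested
    read merely waits for T2 to finish, so the cycle is broken.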
     

08 Apr, 2018

1 commit

  • commit bb0829a741792b56c908d7745bc0b2b540293bcc upstream.

    Currently the driver spams the kernel log on unsupported ioctls which is
    unnecessary as the ioctl returns -ENOIOCTLCMD to indicate this anyway.
    I suspect this was originally for debugging purposes but it really is not
    required so remove it.

    Signed-off-by: Colin Ian King
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Colin Ian King
     

25 Feb, 2018

1 commit