24 Mar, 2018

4 commits

  • [ Upstream commit 67b8fbead4685b36d290a0ef91c6ddffc4920ec9 ]

    In case of hci send frame failure, skb is still owned
    by the caller (hci_core) and then should not be freed.

    This fixes crash on dragonboard-410c when sending SCO
    packet. skb is freed by both btqcomsmd and hci_core.

    Fixes: 1511cc750c3d ("Bluetooth: Introduce Qualcomm WCNSS SMD based HCI driver")
    Signed-off-by: Loic Poulain
    Signed-off-by: Marcel Holtmann
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Loic Poulain
     
  • [ Upstream commit ba8f3597900291a93604643017fff66a14546015 ]

    Assuming that the original code idea was to enable in-band sleeping
    only if the setup_rome method returns success and run in 'standard'
    mode otherwise, we should not return setup_rome return value which
    makes qca_setup fail if no rampatch/nvm file found.

    This fixes BT issue on the dragonboard-820C p4 which includes the
    following QCA controller:
    hci0: Product:0x00000008
    hci0: Patch :0x00000111
    hci0: ROM :0x00000302
    hci0: SOC :0x00000044

    Since there is no rampatch for this controller revision, just make
    it work as is.

    Signed-off-by: Loic Poulain
    Signed-off-by: Marcel Holtmann
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Loic Poulain
     
  • commit 740a5759bf222332fbb5eda42f89aa25ba38f9b2 upstream.

    ashmem_mutex may create a chain of dependencies like:

    CPU0                                  CPU1
    mmap syscall                          ioctl syscall
    -> mmap_sem (acquired)                -> ashmem_ioctl
    -> ashmem_mmap                        -> ashmem_mutex (acquired)
    -> ashmem_mutex (try to acquire)      -> copy_from_user
                                          -> mmap_sem (try to acquire)

    There is a lock ordering problem between mmap_sem and ashmem_mutex causing
    a lockdep splat[1] during a syzkaller test. This patch fixes the problem
    by moving copy_from_user out of ashmem_mutex.

    [1] https://www.spinics.net/lists/kernel/msg2733200.html
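
The reordering described above can be illustrated with a small userspace model. This is only a sketch: `toy_lock`, `toy_copy_from_user`, and `struct toy_pin` are hypothetical stand-ins rather than the ashmem driver's code. The point is that the user copy happens before the mutex is taken, so the copy can never wait on mmap_sem while the mutex is held.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical argument block passed to a pin-style ioctl. */
struct toy_pin {
    unsigned int offset;
    unsigned int len;
};

static bool toy_mutex_held;       /* stands in for ashmem_mutex */
static int toy_copies_under_lock; /* counts occurrences of the risky pattern */

static void toy_lock(void)
{
    assert(!toy_mutex_held);
    toy_mutex_held = true;
}

static void toy_unlock(void)
{
    assert(toy_mutex_held);
    toy_mutex_held = false;
}

/* Stand-in for copy_from_user(): in the kernel it may fault and need mmap_sem. */
static void toy_copy_from_user(struct toy_pin *dst, const struct toy_pin *src)
{
    if (toy_mutex_held)
        toy_copies_under_lock++;
    memcpy(dst, src, sizeof(*dst));
}

/* Old ordering: the copy runs while the mutex is held. */
static unsigned int toy_ioctl_before(const struct toy_pin *uarg)
{
    struct toy_pin pin;
    unsigned int end;

    toy_lock();
    toy_copy_from_user(&pin, uarg);
    end = pin.offset + pin.len;
    toy_unlock();
    return end;
}

/* New ordering: copy first, then take the mutex for the real work. */
static unsigned int toy_ioctl_after(const struct toy_pin *uarg)
{
    struct toy_pin pin;
    unsigned int end;

    toy_copy_from_user(&pin, uarg);
    toy_lock();
    end = pin.offset + pin.len;
    toy_unlock();
    return end;
}
```

Both variants compute the same result; only the second keeps the copy outside the critical section.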

    Fixes: ce8a3a9e76d0 (staging: android: ashmem: Fix a race condition in pin ioctls)
    Reported-by: syzbot+d7a918a7a8e1c952bc36@syzkaller.appspotmail.com
    Signed-off-by: Yisheng Xie
    Cc: "Joel Fernandes (Google)"
    Signed-off-by: Greg Kroah-Hartman

    Yisheng Xie
     
  • commit 9ff97fa8db94caeab59a3c5401e975df468b4d8e upstream.

    Problem Statement: Sending I/O through 32-bit descriptors to Ventura series
    controllers results in I/O timeouts under certain conditions.

    This error only occurs on systems with high I/O activity on Ventura series
    controllers.

    The changes in this patch prevent the driver from using 32-bit descriptors;
    it uses 64-bit descriptors instead.

    Cc:
    Signed-off-by: Kashyap Desai
    Signed-off-by: Shivasharan S
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Tomas Henzl
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Shivasharan S
     

21 Mar, 2018

36 commits

  • Greg Kroah-Hartman
     
  • commit b16ea8b9492e99e03b1269fe93ebdbf8e4eabf8a upstream.

    The FIFO/Queue type values are incorrect. Correct them according to
    DWC_usb3 programming guide section 1.2.27 (or DWC_usb31 section 1.2.25).

    Additionally, this patch includes ProtocolStatusQ and AuxEventQ types.

    Fixes: cf6d867d3b57 ("usb: dwc3: core: add fifo space helper")
    Signed-off-by: Thinh Nguyen
    Signed-off-by: Felipe Balbi
    Signed-off-by: Greg Kroah-Hartman

    Thinh Nguyen
     
  • commit 8874ae5f15f3feef3b4a415b9aed51edcf449aa1 upstream.

    Add the missing platform_device_put() before return from bdc_pci_probe()
    in the platform_device_add_resources() error handling case.

    Fixes: efed421a94e6 ("usb: gadget: Add UDC driver for Broadcom USB3.0 device controller IP BDC")
    Signed-off-by: Wei Yongjun
    Signed-off-by: Felipe Balbi
    Signed-off-by: Greg Kroah-Hartman

    Wei Yongjun
     
  • commit 6a2cf8d3663e13e19af636c2a8d92e766261dc45 upstream.

    Because of the shifting around of code in qla2x00_probe_one recently,
    failures during adapter initialization can lead to problems, i.e. NULL
    pointer crashes and doubly freed data structures which cause eventual
    panics.

    This V2 version makes the relevant memory free routines idempotent, so
    repeat calls won't cause any harm. I also removed the problematic
    probe_init_failed exit point as it is not needed.
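
What "idempotent" means for a free routine can be shown with a minimal sketch. The structure and field names below (`toy_adapter`, `req_ring`, `rsp_ring`) are hypothetical and are not taken from the qla2xxx driver; the pattern is simply to clear each pointer after freeing it so that a second call does nothing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical adapter state with two dynamically allocated resources. */
struct toy_adapter {
    int *req_ring;
    int *rsp_ring;
};

static int toy_mem_alloc(struct toy_adapter *ha)
{
    assert(ha != NULL);
    ha->req_ring = calloc(16, sizeof(*ha->req_ring));
    ha->rsp_ring = calloc(16, sizeof(*ha->rsp_ring));
    return (ha->req_ring && ha->rsp_ring) ? 0 : -1;
}

/* Safe to call repeatedly: every pointer is reset to NULL once released. */
static void toy_mem_free(struct toy_adapter *ha)
{
    assert(ha != NULL);
    free(ha->req_ring); /* free(NULL) is a no-op */
    ha->req_ring = NULL;
    free(ha->rsp_ring);
    ha->rsp_ring = NULL;
}
```

With this shape, several error paths can all call the same cleanup routine without tracking which one ran first.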

    Fixes: d64d6c5671db ("scsi: qla2xxx: Fix NULL pointer crash due to probe failure")
    Signed-off-by: Bill Kuzeja
    Acked-by: Himanshu Madhani
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Bill Kuzeja
     
  • commit a2390348c19d0819d525d375414a7cfdacb51a68 upstream.

    Commit 3515832cc614 ("scsi: qla2xxx: Reset the logo flag, after target
    re-login.") fixed the target re-login after session relogin is complete,
    but missed out the qlt_free_session_done() path.

    This patch clears send_els_logo flag in qlt_free_session_done()
    callback.

    [mkp: checkpatch]

    Fixes: 3515832cc614 ("scsi: qla2xxx: Reset the logo flag, after target re-login.")
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Himanshu Madhani
     
  • commit 5c25d451163cab9be80744cbc5448d6b95ab8d1a upstream.

    When processing iocb in a timeout case, driver was trying to log messages
    without verifying if the fcport structure could have valid data. This
    results in a NULL pointer access.

    Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery")
    Signed-off-by: Quinn Tran
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Quinn Tran
     
  • commit 62aa281470fdb7c0796d63a1cc918a8c1f02dde2 upstream.

    This patch fixes following warnings reported by smatch:

    drivers/scsi/qla2xxx/qla_mid.c:586 qla25xx_delete_req_que()
    error: we previously assumed 'req' could be null (see line 580)

    drivers/scsi/qla2xxx/qla_mid.c:602 qla25xx_delete_rsp_que()
    error: we previously assumed 'rsp' could be null (see line 596)

    Fixes: 7867b98dceb7 ("scsi: qla2xxx: Fix memory leak in dual/target mode")
    Reported-by: Dan Carpenter
    Signed-off-by: Himanshu Madhani
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Himanshu Madhani
     
  • commit 9deae9689231964972a94bb56a79b669f9d47ac1 upstream.

    Commit addc3fa74e5b ("Btrfs: Fix the problem that the dirty flag of dev
    stats is cleared") reworked the way device stats changes are tracked. A
    new atomic dev_stats_ccnt counter was introduced which is incremented
    every time any of the device stats counters are changed. This serves as
    a flag whether there are any pending stats changes. However, this patch
    only partially implemented the correct memory barriers necessary:

    - It only ordered the stores to the counters but not the reads e.g.
    btrfs_run_dev_stats
    - It completely omitted any comments documenting the intended design and
    how the memory barriers pair with each other

    This patch provides the necessary comments as well as adds a missing
    smp_rmb in btrfs_run_dev_stats. Furthermore since dev_stats_ccnt is only
    a snapshot at best there was no point in reading the counter twice -
    once in btrfs_dev_stats_dirty and then again when assigning stats_cnt.
    Just collapse both reads into 1.
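
The barrier pairing can be sketched in portable C11, with a release fence playing the role of the write-side barrier and an acquire fence playing the role of smp_rmb. This is a userspace analogy under stated assumptions, not the btrfs code: `toy_dev`, `toy_stat_inc`, and `toy_run_dev_stats` are hypothetical names, and a single counter stands in for the whole set of device stats.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_dev {
    atomic_int stat; /* one device-stats counter */
    atomic_int ccnt; /* bumped after every stat change; nonzero means dirty */
};

/* Writer: update the stat, then publish the change via the counter. */
static void toy_stat_inc(struct toy_dev *d)
{
    atomic_fetch_add_explicit(&d->stat, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release); /* orders stat before ccnt */
    atomic_fetch_add_explicit(&d->ccnt, 1, memory_order_relaxed);
}

/*
 * Reader: take ONE snapshot of the change counter, and only then read the
 * stat. Returns 1 and stores the stat in *out if there was anything to flush.
 */
static int toy_run_dev_stats(struct toy_dev *d, int *out)
{
    int snap;

    assert(d != NULL && out != NULL);
    snap = atomic_load_explicit(&d->ccnt, memory_order_relaxed);
    if (snap == 0)
        return 0;
    atomic_thread_fence(memory_order_acquire); /* pairs with the release fence */
    *out = atomic_load_explicit(&d->stat, memory_order_relaxed);
    /* Subtract only what was observed; later increments stay pending. */
    atomic_fetch_sub_explicit(&d->ccnt, snap, memory_order_relaxed);
    return 1;
}
```

Reading the counter once and reusing that snapshot avoids acting on two different values of a number that may change between reads.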

    Fixes: addc3fa74e5b ("Btrfs: Fix the problem that the dirty flag of dev stats is cleared")
    Signed-off-by: Nikolay Borisov
    Reviewed-by: Mathieu Desnoyers
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Borisov
     
  • commit c8195a7b1ad5648857ce20ba24f384faed8512bc upstream.

    Until v4.14, this warning was very infrequent:

    WARNING: CPU: 3 PID: 18172 at fs/btrfs/backref.c:1391 find_parent_nodes+0xc41/0x14e0
    Modules linked in: [...]
    CPU: 3 PID: 18172 Comm: bees Tainted: G D W L 4.11.9-zb64+ #1
    Hardware name: System manufacturer System Product Name/M5A78L-M/USB3, BIOS 2101 12/02/2014
    Call Trace:
    dump_stack+0x85/0xc2
    __warn+0xd1/0xf0
    warn_slowpath_null+0x1d/0x20
    find_parent_nodes+0xc41/0x14e0
    __btrfs_find_all_roots+0xad/0x120
    ? extent_same_check_offsets+0x70/0x70
    iterate_extent_inodes+0x168/0x300
    iterate_inodes_from_logical+0x87/0xb0
    ? iterate_inodes_from_logical+0x87/0xb0
    ? extent_same_check_offsets+0x70/0x70
    btrfs_ioctl+0x8ac/0x2820
    ? lock_acquire+0xc2/0x200
    do_vfs_ioctl+0x91/0x700
    ? __fget+0x112/0x200
    SyS_ioctl+0x79/0x90
    entry_SYSCALL_64_fastpath+0x23/0xc6
    ? trace_hardirqs_off_caller+0x1f/0x140

    Starting with v4.14 (specifically 86d5f9944252 ("btrfs: convert prelimary
    reference tracking to use rbtrees")) the WARN_ON occurs three orders of
    magnitude more frequently--almost once per second while running workloads
    like bees.

    Replace the WARN_ON() with a comment rationale for its removal.
    The rationale is paraphrased from an explanation by Edmund Nadolski
    on the linux-btrfs mailing list.

    Fixes: 8da6d5815c59 ("Btrfs: added btrfs_find_all_roots()")
    Signed-off-by: Zygo Blaxell
    Reviewed-by: Lu Fengqi
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Zygo Blaxell
     
  • commit fd649f10c3d21ee9d7542c609f29978bdf73ab94 upstream.

    Commit 4fde46f0cc71 ("Btrfs: free the stale device") introduced
    btrfs_free_stale_device which iterates the device lists for all
    registered btrfs filesystems and deletes those devices which aren't
    mounted. If a btrfs_fs_devices structure has only 1 device attached to it
    and it is unused then btrfs_free_stale_devices will proceed to also free
    the btrfs_fs_devices struct itself. Currently this leads to a use after
    free since list_for_each_entry will try to perform a check on the
    already freed memory to see if it has to terminate the loop.

    The fix is to use 'break' when we know we are freeing the current
    fs_devs.
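
The hazard and the fix can be modelled with a plain singly linked list. The sketch below is an illustration only: the `toy_` types and functions are hypothetical and much simpler than the btrfs structures. What it shows is that once the container is freed inside the loop, the loop must exit before its termination check touches the container again.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_dev {
    int stale;
    struct toy_dev *next;
};

struct toy_fs_devs {
    int num_devices;
    struct toy_dev *devices; /* list head lives inside the container */
};

static struct toy_fs_devs *toy_fs_new(void)
{
    return calloc(1, sizeof(struct toy_fs_devs));
}

static void toy_add_dev(struct toy_fs_devs *fs, int stale)
{
    struct toy_dev *dev = calloc(1, sizeof(*dev));

    assert(fs != NULL && dev != NULL);
    dev->stale = stale;
    dev->next = fs->devices;
    fs->devices = dev;
    fs->num_devices++;
}

/* Drops stale devices. Returns NULL if the container itself was freed. */
static struct toy_fs_devs *toy_free_stale(struct toy_fs_devs *fs)
{
    struct toy_dev **link = &fs->devices;

    while (*link) {
        struct toy_dev *dev = *link;

        if (!dev->stale) {
            link = &dev->next;
            continue;
        }
        if (fs->num_devices == 1) {
            free(dev);
            free(fs);
            fs = NULL;
            break; /* the loop condition would otherwise read freed memory */
        }
        *link = dev->next;
        free(dev);
        fs->num_devices--;
    }
    return fs;
}
```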

    Fixes: 4fde46f0cc71 ("Btrfs: free the stale device")
    Signed-off-by: Nikolay Borisov
    Reviewed-by: Anand Jain
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Borisov
     
  • commit 92e222df7b8f05c565009c7383321b593eca488b upstream.

    In case of using DUP, we search for enough unallocated disk space on a
    device to hold two stripes.

    The devices_info[ndevs-1].max_avail that holds the amount of unallocated
    space found is directly assigned to stripe_size, while it's actually
    twice the stripe size.

    Later on in the code, an unconditional division of stripe_size by
    dev_stripes corrects the value, but in the meantime there's a check to
    see if the stripe_size does not exceed max_chunk_size. Since during this
    check stripe_size is twice the amount as intended, the check will reduce
    the stripe_size to max_chunk_size if the correct stripe_size is more
    than half of max_chunk_size.

    The unconditional division later tries to correct stripe_size, but will
    actually make sure we can't allocate more than half the max_chunk_size.

    Fix this by moving the division by dev_stripes before the max chunk size
    check, so it always contains the right value, instead of putting a duct
    tape division in further on to get it fixed again.

    Since in all other cases than DUP, dev_stripes is 1, this change only
    affects DUP.
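
The arithmetic can be checked with a simplified model of the two orderings. The functions below are a sketch, not the allocator code: they keep only the clamp against max_chunk_size and the division by dev_stripes, and they ignore alignment and every other adjustment the real allocator performs. The `toy_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Old order: clamp first (on a doubled value for DUP), divide afterwards. */
static uint64_t toy_stripe_size_before(uint64_t max_avail, uint64_t dev_stripes,
                                       uint64_t data_stripes,
                                       uint64_t max_chunk_size)
{
    uint64_t stripe_size = max_avail; /* for DUP this is really two stripes */

    assert(dev_stripes > 0 && data_stripes > 0);
    if (stripe_size * data_stripes > max_chunk_size)
        stripe_size = max_chunk_size / data_stripes;
    return stripe_size / dev_stripes; /* the late division */
}

/* New order: divide first, so the clamp sees the real stripe size. */
static uint64_t toy_stripe_size_after(uint64_t max_avail, uint64_t dev_stripes,
                                      uint64_t data_stripes,
                                      uint64_t max_chunk_size)
{
    uint64_t stripe_size;

    assert(dev_stripes > 0 && data_stripes > 0);
    stripe_size = max_avail / dev_stripes;
    if (stripe_size * data_stripes > max_chunk_size)
        stripe_size = max_chunk_size / data_stripes;
    return stripe_size;
}
```

Taking units of MiB with DUP (dev_stripes = 2, one data stripe), 1500 available and a 1024 limit, the old order yields 512 while the new order yields 750. With dev_stripes = 1 both orders agree, which matches the statement that only DUP is affected.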

    Other attempts in the past were made to fix this:
    * 37db63a400 "Btrfs: fix max chunk size check in chunk allocator" tried
    to fix the same problem, but still resulted in part of the code acting
    on a wrongly doubled stripe_size value.
    * 86db25785a "Btrfs: fix max chunk size on raid5/6" unintentionally
    broke this fix again.

    The real problem was already introduced with the rest of the code in
    73c5de0051.

    The user visible result however will be that the max chunk size for DUP
    will suddenly double, while it's actually acting according to the limits
    in the code again like it was 5 years ago.

    Reported-by: Naohiro Aota
    Link: https://www.spinics.net/lists/linux-btrfs/msg69752.html
    Fixes: 73c5de0051 ("btrfs: quasi-round-robin for chunk allocation")
    Fixes: 86db25785a ("Btrfs: fix max chunk size on raid5/6")
    Signed-off-by: Hans van Kranenburg
    Reviewed-by: David Sterba
    [ update comment ]
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Hans van Kranenburg
     
  • commit 18bf591ba9753e3e5ba91f38f756a800693408f4 upstream.

    This patch addresses an issue that causes fiemap to falsely
    report a shared extent. The test case is as follows:

    xfs_io -f -d -c "pwrite -b 16k 0 64k" -c "fiemap -v" /media/scratch/file5
    sync
    xfs_io -c "fiemap -v" /media/scratch/file5

    which gives the resulting output:

    wrote 65536/65536 bytes at offset 0
    64 KiB, 4 ops; 0.0000 sec (121.359 MiB/sec and 7766.9903 ops/sec)
    /media/scratch/file5:
    EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
    0: [0..127]: 24576..24703 128 0x2001
    /media/scratch/file5:
    EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
    0: [0..127]: 24576..24703 128 0x1

    This is because btrfs_check_shared calls find_parent_nodes
    repeatedly in a loop, passing a share_check struct to report
    the count of shared extent. But btrfs_check_shared does not
    re-initialize the count value to zero for subsequent calls
    from the loop, resulting in a false share count value. This
    is a regression from 4.13.

    With proper re-initialization the test result is as follows:

    wrote 65536/65536 bytes at offset 0
    64 KiB, 4 ops; 0.0000 sec (110.035 MiB/sec and 7042.2535 ops/sec)
    /media/scratch/file5:
    EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
    0: [0..127]: 24576..24703 128 0x1
    /media/scratch/file5:
    EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
    0: [0..127]: 24576..24703 128 0x1

    which corrects the regression.
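
The stale-counter effect is easy to reproduce in a toy model. In the sketch below, `toy_find_parent_nodes` and `toy_check_shared` are hypothetical stand-ins that only accumulate a count; the `reinit` parameter is an illustration device for comparing the two behaviours and does not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>

struct toy_share_check {
    int share_count;
};

/* Stand-in for find_parent_nodes(): adds the references seen for one node. */
static void toy_find_parent_nodes(struct toy_share_check *sc, int refs)
{
    sc->share_count += refs;
}

/* Returns 1 if any single lookup sees more than one reference. */
static int toy_check_shared(const int *refs_per_call, size_t calls, int reinit)
{
    struct toy_share_check sc = { 0 };

    assert(refs_per_call != NULL);
    for (size_t i = 0; i < calls; i++) {
        if (reinit)
            sc.share_count = 0; /* the fix: start each lookup from zero */
        toy_find_parent_nodes(&sc, refs_per_call[i]);
        if (sc.share_count > 1)
            return 1;
    }
    return 0;
}
```

Without the reset, three lookups that each find a single reference add up to a count above one, and an unshared extent is reported as shared.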

    Fixes: 3ec4d3238ab ("btrfs: allow backref search checks for shared extents")
    Signed-off-by: Edmund Nadolski
    [ add text from cover letter to changelog ]
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Edmund Nadolski
     
  • commit 047fdea6341966a0898e3b16c51f54d4f5ba030a upstream.

    When detaching a disk that is part of a RAID6 filesystem, the
    following kernel OOPS may happen:

    [63122.680461] BTRFS error (device sdo): bdev /dev/sdo errs: wr 0, rd 0, flush 1, corrupt 0, gen 0
    [63122.719584] BTRFS warning (device sdo): lost page write due to IO error on /dev/sdo
    [63122.719587] BTRFS error (device sdo): bdev /dev/sdo errs: wr 1, rd 0, flush 1, corrupt 0, gen 0
    [63122.803516] BTRFS warning (device sdo): lost page write due to IO error on /dev/sdo
    [63122.803519] BTRFS error (device sdo): bdev /dev/sdo errs: wr 2, rd 0, flush 1, corrupt 0, gen 0
    [63122.863902] BTRFS critical (device sdo): fatal error on device /dev/sdo
    [63122.935338] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
    [63122.946554] IP: fail_bio_stripe+0x58/0xa0 [btrfs]
    [63122.958185] PGD 9ecda067 P4D 9ecda067 PUD b2b37067 PMD 0
    [63122.971202] Oops: 0000 [#1] SMP
    [63123.006760] CPU: 0 PID: 3979 Comm: kworker/u8:9 Tainted: G W 4.14.2-16-scst34x+ #8
    [63123.007091] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
    [63123.007402] Workqueue: btrfs-worker btrfs_worker_helper [btrfs]
    [63123.007595] task: ffff880036ea4040 task.stack: ffffc90006384000
    [63123.007796] RIP: 0010:fail_bio_stripe+0x58/0xa0 [btrfs]
    [63123.007968] RSP: 0018:ffffc90006387ad8 EFLAGS: 00010287
    [63123.008140] RAX: 0000000000000002 RBX: ffff88004beaa0b8 RCX: ffff8800b2bd5690
    [63123.008359] RDX: 0000000000000000 RSI: ffff88007bb43500 RDI: ffff88004beaa000
    [63123.008621] RBP: ffffc90006387ae8 R08: 0000000099100000 R09: ffff8800b2bd5600
    [63123.008840] R10: 0000000000000004 R11: 0000000000010000 R12: ffff88007bb43500
    [63123.009059] R13: 00000000fffffffb R14: ffff880036fc5180 R15: 0000000000000004
    [63123.009278] FS: 0000000000000000(0000) GS:ffff8800b7000000(0000) knlGS:0000000000000000
    [63123.009564] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [63123.009748] CR2: 0000000000000080 CR3: 00000000b0866000 CR4: 00000000000406f0
    [63123.009969] Call Trace:
    [63123.010085] raid_write_end_io+0x7e/0x80 [btrfs]
    [63123.010251] bio_endio+0xa1/0x120
    [63123.010378] generic_make_request+0x218/0x270
    [63123.010921] submit_bio+0x66/0x130
    [63123.011073] finish_rmw+0x3fc/0x5b0 [btrfs]
    [63123.011245] full_stripe_write+0x96/0xc0 [btrfs]
    [63123.011428] raid56_parity_write+0x117/0x170 [btrfs]
    [63123.011604] btrfs_map_bio+0x2ec/0x320 [btrfs]
    [63123.011759] ? ___cache_free+0x1c5/0x300
    [63123.011909] __btrfs_submit_bio_done+0x26/0x50 [btrfs]
    [63123.012087] run_one_async_done+0x9c/0xc0 [btrfs]
    [63123.012257] normal_work_helper+0x19e/0x300 [btrfs]
    [63123.012429] btrfs_worker_helper+0x12/0x20 [btrfs]
    [63123.012656] process_one_work+0x14d/0x350
    [63123.012888] worker_thread+0x4d/0x3a0
    [63123.013026] ? _raw_spin_unlock_irqrestore+0x15/0x20
    [63123.013192] kthread+0x109/0x140
    [63123.013315] ? process_scheduled_works+0x40/0x40
    [63123.013472] ? kthread_stop+0x110/0x110
    [63123.013610] ret_from_fork+0x25/0x30
    [63123.014469] RIP: fail_bio_stripe+0x58/0xa0 [btrfs] RSP: ffffc90006387ad8
    [63123.014678] CR2: 0000000000000080
    [63123.016590] ---[ end trace a295ea7259c17880 ]---

    This is reproducible in a cycle, where a series of writes is followed by
    a SCSI device delete command. The test may take up to a few minutes.

    Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")
    [ no signed-off-by provided ]
    Author: Dmitriy Gorokh
    Reviewed-by: Liu Bo
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Dmitriy Gorokh
     
  • commit 4f2c7583e33eb08dc09dd2e25574b80175ba7d93 upstream.

    When struct its_device instances are created, the nr_ites member
    will be set to a power of 2 that equals or exceeds the requested
    number of MSIs passed to the msi_prepare() callback. At the same
    time, the LPI map is allocated to be some multiple of 32 in size,
    where the allocated size may be less than the requested size
    depending on whether a contiguous range of sufficient size is
    available in the global LPI bitmap.

    This may result in the situation where the nr_ites < nr_lpis, and
    since nr_ites is what we program into the hardware when we map the
    device, the additional LPIs will be non-functional.

    For bog standard hardware, this does not really matter. However,
    in cases where ITS device IDs are shared between different PCIe
    devices, we may end up allocating these additional LPIs without
    taking into account that they don't actually work.

    So let's make nr_ites at least 32. This ensures that all allocated
    LPIs are 'live', and that its_alloc_device_irq() will fail when
    attempts are made to allocate MSIs beyond what was allocated in
    the first place.
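
The sizing rule can be written out as a short sketch. The names `toy_nr_ites`, `toy_roundup_pow_of_two`, and `TOY_IRQS_PER_CHUNK` are hypothetical; only the two facts from the text are used, namely that the value is a power of two at least as large as the request and that it is now never below 32.

```c
#include <assert.h>

#define TOY_IRQS_PER_CHUNK 32u /* LPI map granularity named in the text */

/* Smallest power of two that is >= n. */
static unsigned int toy_roundup_pow_of_two(unsigned int n)
{
    unsigned int p = 1;

    assert(n > 0 && n <= (1u << 31));
    while (p < n)
        p <<= 1;
    return p;
}

/* Number of ITEs to program for a request of nvecs MSIs. */
static unsigned int toy_nr_ites(unsigned int nvecs)
{
    unsigned int n = toy_roundup_pow_of_two(nvecs);

    return n < TOY_IRQS_PER_CHUNK ? TOY_IRQS_PER_CHUNK : n;
}
```

A request for a single MSI therefore maps 32 entries, so every LPI in the first allocated chunk is backed by hardware state.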

    Signed-off-by: Ard Biesheuvel
    [maz: updated comment]
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Ard Biesheuvel
     
  • commit 74b44bbe80b4c62113ac1501482ea1ee40eb9d67 upstream.

    rvt_mregion uses percpu_ref for reference counting and RCU to protect
    accesses from lkey_table. When a rvt_mregion needs to be freed, it
    first gets unregistered from lkey_table and then rvt_check_refs() is
    called to wait for in-flight usages before the rvt_mregion is freed.

    rvt_check_refs() seems to have a couple issues.

    * It has a fast exit path which tests percpu_ref_is_zero(). However,
    a percpu_ref reading zero doesn't mean that the object can be
    released. In fact, the ->release() callback might not even have
    started executing yet. Proceeding with freeing can lead to
    use-after-free.

    * lkey_table is RCU protected but there is no RCU grace period in the
    free path. percpu_ref uses RCU internally but it's sched-RCU whose
    grace periods are different from regular RCU. Also, it generally
    isn't a good idea to depend on internal behaviors like this.

    To address the above issues, this patch removes the fast exit and adds
    an explicit synchronize_rcu().

    Signed-off-by: Tejun Heo
    Acked-by: Dennis Dalessandro
    Cc: Mike Marciniszyn
    Cc: linux-rdma@vger.kernel.org
    Cc: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • commit d0264c01e7587001a8c4608a5d1818dba9a4c11a upstream.

    While converting ioctx index from a list to a table, db446a08c23d
    ("aio: convert the ioctx list to table lookup v3") missed tagging
    kioctx_table->table[] as an array of RCU pointers and using the
    appropriate RCU accessors. This introduces a small window in the
    lookup path where init and access may race.

    Mark kioctx_table->table[] with __rcu and use the appropriate RCU
    accessors when using the field.

    Signed-off-by: Tejun Heo
    Reported-by: Jann Horn
    Fixes: db446a08c23d ("aio: convert the ioctx list to table lookup v3")
    Cc: Benjamin LaHaise
    Cc: Linus Torvalds
    Cc: stable@vger.kernel.org # v3.12+
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • commit a6d7cff472eea87d96899a20fa718d2bab7109f3 upstream.

    While fixing refcounting, e34ecee2ae79 ("aio: Fix a trinity splat")
    incorrectly removed explicit RCU grace period before freeing kioctx.
    The intention seems to be depending on the internal RCU grace periods
    of percpu_ref; however, percpu_ref uses a different flavor of RCU,
    sched-RCU. This can lead to kioctx being freed while RCU read
    protected dereferences are still in progress.

    Fix it by updating free_ioctx() to go through call_rcu() explicitly.

    v2: Comment added to explain double bouncing.

    Signed-off-by: Tejun Heo
    Reported-by: Jann Horn
    Fixes: e34ecee2ae79 ("aio: Fix a trinity splat")
    Cc: Kent Overstreet
    Cc: Linus Torvalds
    Cc: stable@vger.kernel.org # v3.13+
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • commit 3b821409632ab778d46e807516b457dfa72736ed upstream.

    In case when dentry passed to lock_parent() is protected from freeing only
    by the fact that it's on a shrink list and trylock of parent fails, we
    could get hit by __dentry_kill() (and subsequent dentry_kill(parent))
    between unlocking dentry and locking presumed parent. We need to recheck
    that dentry is alive once we lock both it and parent *and* postpone
    rcu_read_unlock() until after that point. Otherwise we could return
    a pointer to struct dentry that already is rcu-scheduled for freeing, with
    ->d_lock held on it; caller's subsequent attempt to unlock it can end
    up with memory corruption.

    Cc: stable@vger.kernel.org # 3.12+, counting backports
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Al Viro
     
  • commit 16ca6a607d84bef0129698d8d808f501afd08d43 upstream.

    The vgic code is trying to be clever when injecting GICv2 SGIs,
    and will happily populate LRs with the same interrupt number if
    they come from multiple vcpus (after all, they are distinct
    interrupt sources).

    Unfortunately, this is against the letter of the architecture,
    and the GICv2 architecture spec says "Each valid interrupt stored
    in the List registers must have a unique VirtualID for that
    virtual CPU interface.". GICv3 has similar (although slightly
    ambiguous) restrictions.

    This results in guests locking up when using GICv2-on-GICv3, for
    example. The obvious fix is to stop trying so hard, and inject
    a single vcpu per SGI per guest entry. After all, pending SGIs
    with multiple source vcpus are pretty rare, and are mostly seen
    in scenarios where the physical CPUs are severely overcommitted.

    But as we now only inject a single instance of a multi-source SGI per
    vcpu entry, we may delay those interrupts for longer than strictly
    necessary, and run the risk of injecting lower priority interrupts
    in the meantime.

    In order to address this, we adopt a three stage strategy:
    - If we encounter a multi-source SGI in the AP list while computing
    its depth, we force the list to be sorted
    - When populating the LRs, we prevent the injection of any interrupt
    of lower priority than that of the first multi-source SGI we've
    injected.
    - Finally, the injection of a multi-source SGI triggers the request
    of a maintenance interrupt when there will be no pending interrupt
    in the LRs (HCR_NPIE).

    At the point where the last pending interrupt in the LRs switches
    from Pending to Active, the maintenance interrupt will be delivered,
    allowing us to add the remaining SGIs using the same process.
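
The second and third stages can be sketched as a loop over an already sorted pending list. Everything here is a simplified model with hypothetical names (`toy_irq`, `toy_lrs`, `toy_flush_lr_state`); it assumes that a lower priority value means higher priority, that the list was sorted in stage one, and that four list registers are available.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_LR 4

struct toy_irq {
    int intid;
    int priority;        /* lower value = higher priority */
    unsigned int source; /* bitmask of source vcpus; 0 for non-SGIs */
};

struct toy_lrs {
    int intid[TOY_NR_LR];
    int src[TOY_NR_LR]; /* source vcpu for SGIs, -1 otherwise */
    int count;
    bool npie;          /* request a "no pending" maintenance interrupt */
};

static int toy_lowest_bit(unsigned int mask)
{
    int bit = 0;

    assert(mask != 0);
    while (!(mask & 1u)) {
        mask >>= 1;
        bit++;
    }
    return bit;
}

/* ap[] is assumed to be sorted by priority already (stage one). */
static void toy_flush_lr_state(struct toy_irq *ap, int n, struct toy_lrs *lrs)
{
    bool multi_sgi = false;
    int prio = 0;

    lrs->count = 0;
    for (int i = 0; i < n && lrs->count < TOY_NR_LR; i++) {
        /* Stage two: nothing below the first multi-source SGI's priority. */
        if (multi_sgi && ap[i].priority > prio)
            break;

        lrs->intid[lrs->count] = ap[i].intid;
        lrs->src[lrs->count] = -1;
        if (ap[i].source) {
            int src = toy_lowest_bit(ap[i].source);

            lrs->src[lrs->count] = src;
            ap[i].source &= ~(1u << src); /* one source per guest entry */
            if (ap[i].source && !multi_sgi) {
                multi_sgi = true;
                prio = ap[i].priority;
            }
        }
        lrs->count++;
    }
    /* Stage three: ask to be notified once the LRs hold nothing pending. */
    lrs->npie = multi_sgi;
}
```

In this model the remaining source bits stay in the pending list, so a later call injects the next source of the same SGI.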

    Cc: stable@vger.kernel.org
    Fixes: 0919e84c0fc1 ("KVM: arm/arm64: vgic-new: Add IRQ sync/flush framework")
    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Marc Zyngier
     
  • commit 27e91ad1e746e341ca2312f29bccb9736be7b476 upstream.

    On guest exit, and when using GICv2 on GICv3, we use a dsb(st) to
    force synchronization between the memory-mapped guest view and
    the system-register view that the hypervisor uses.

    This is incorrect, as the spec calls out the need for "a DSB whose
    required access type is both loads and stores with any Shareability
    attribute", while we're only synchronizing stores.

    We also lack an isb after the dsb to ensure that the latter has
    actually been executed before we start reading stuff from the sysregs.

    The fix is pretty easy: turn dsb(st) into dsb(sy), and slap an isb()
    just after.

    Cc: stable@vger.kernel.org
    Fixes: f68d2b1b73cc ("arm64: KVM: Implement vgic-v3 save/restore")
    Acked-by: Christoffer Dall
    Reviewed-by: Andre Przywara
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Marc Zyngier
     
  • commit 76600428c3677659e3c3633bb4f2ea302220a275 upstream.

    On my GICv3 system, the following is printed to the kernel log at boot:

    kvm [1]: 8-bit VMID
    kvm [1]: IDMAP page: d20e35000
    kvm [1]: HYP VA range: 800000000000:ffffffffffff
    kvm [1]: vgic-v2@2c020000
    kvm [1]: GIC system register CPU interface enabled
    kvm [1]: vgic interrupt IRQ1
    kvm [1]: virtual timer IRQ4
    kvm [1]: Hyp mode initialized successfully

    The KVM IDMAP is a mapping of a statically allocated kernel structure,
    and so printing its physical address leaks the physical placement of
    the kernel when physical KASLR is in effect. So change the kvm_info() to
    kvm_debug() to remove it from the log output.

    While at it, trim the output a bit more: IRQ numbers can be found in
    /proc/interrupts, and the HYP VA and vgic-v2 lines are not highly
    informational either.

    Cc:
    Acked-by: Will Deacon
    Acked-by: Christoffer Dall
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Ard Biesheuvel
     
  • commit 95dd77580ccd66a0da96e6d4696945b8cea39431 upstream.

    On nfsv2 and nfsv3 the nfs server can export subsets of the same
    filesystem and report the same filesystem identifier, so that the nfs
    client can know they are the same filesystem. The subsets can be from
    disjoint directory trees. The nfsv2 and nfsv3 filesystems provide no
    way to find the common root of all directory trees exported from the
    server with the same filesystem identifier.

    The practical result is that, for nfs, s_root in struct super_block is
    not necessarily the root of the filesystem. The nfs mount code sets
    s_root to the root of the first subset of the nfs filesystem that the
    kernel mounts.

    This affects the dcache invalidation code in generic_shutdown_super,
    currently called shrink_dcache_for_umount, and that code for years
    has gone through an additional list of dentries that might be dentry
    trees that need to be freed to accommodate nfs.

    When I wrote path_connected I did not realize nfs was so special, and
    its heuristic for avoiding calling is_subdir can fail.

    The practical case where this fails is when there is a move of a
    directory from the subtree exposed by one nfs mount to the subtree
    exposed by another nfs mount. This move can happen either locally or
    remotely. The remote case requires that the moved directory be cached
    before the move, and that after the move someone walks the path
    to where the moved directory now exists and in so doing causes the
    already cached directory to be moved in the dcache through the magic
    of d_splice_alias.

    If someone whose working directory is in the moved directory or a
    subdirectory now starts calling .. from the initial mount of nfs
    (where s_root == mnt_root), then path_connected as a heuristic will
    not bother with the is_subdir check. As s_root really is not the root
    of the nfs filesystem this heuristic is wrong, and the path may
    actually not be connected and path_connected can fail.
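
The failure of the shortcut can be reproduced with a toy dentry tree. The sketch below is not the VFS code: all `toy_` names are hypothetical, and the `multiroot` flag in particular is an assumption about how a filesystem could opt out of the shortcut, since the text above does not name the mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dentry {
    struct toy_dentry *parent; /* NULL at the top of a tree */
};

struct toy_sb {
    struct toy_dentry *s_root;
    bool multiroot; /* hypothetical: s_root is not the only root */
};

struct toy_path {
    struct toy_sb *sb;
    struct toy_dentry *mnt_root;
    struct toy_dentry *dentry;
};

/* Walks parent pointers; true if d is root or lies below it. */
static bool toy_is_subdir(const struct toy_dentry *d,
                          const struct toy_dentry *root)
{
    for (; d; d = d->parent)
        if (d == root)
            return true;
    return false;
}

static bool toy_path_connected(const struct toy_path *p)
{
    assert(p != NULL && p->sb != NULL);
    /* Shortcut is only valid when s_root really is the single root. */
    if (!p->sb->multiroot && p->mnt_root == p->sb->s_root)
        return true;
    return toy_is_subdir(p->dentry, p->mnt_root);
}
```

With two exported trees sharing one superblock, a directory moved from the first tree into the second is reported as connected by the shortcut even though walking up from it never reaches the mount root.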

    The is_subdir function might be cheap enough that we can call it
    unconditionally. Verifying that will take some benchmarking and
    the result may not be the same on all kernels this fix needs
    to be backported to. So I am avoiding that for now.

    Filesystems with snapshots such as nilfs and btrfs do something
    similar. But as the directory tree of the snapshots are disjoint
    from one another and from the main directory tree rename won't move
    things between them and this problem will not occur.

    Cc: stable@vger.kernel.org
    Reported-by: Al Viro
    Fixes: 397d425dc26d ("vfs: Test for and handle paths that are unreachable from their mnt_root")
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 7d617264eb22b18d979eac6e85877a141253034e upstream.

    Turning off the sink in this case causes various issues, because
    userspace expects it to stay on until it turns it off explicitly.

    Instead, turn the sink off and back on when a display is connected
    again. This dance seems necessary for link training to work correctly.

    Bugzilla: https://bugs.freedesktop.org/105308
    Cc: stable@vger.kernel.org
    Reviewed-by: Alex Deucher
    Signed-off-by: Michel Dänzer
    Signed-off-by: Alex Deucher
    Signed-off-by: Greg Kroah-Hartman

    Michel Dänzer
     
  • commit 0f4f715bc6bed3bf14c5cd7d5fe88d443e756b14 upstream.

    We unmapped imported DMA-bufs when the GEM handle was dropped, not when the
    hardware was done with the buffer.

    Signed-off-by: Christian König
    Reviewed-by: Michel Dänzer
    CC: stable@vger.kernel.org
    Signed-off-by: Alex Deucher
    Signed-off-by: Greg Kroah-Hartman

    Christian König
     
  • commit 342038d92403b3efa1138a8599666b9f026279d6 upstream.

    We unmapped imported DMA-bufs when the GEM handle was dropped, not when the
    hardware was done with the buffer.

    Signed-off-by: Christian König
    Reviewed-by: Michel Dänzer
    CC: stable@vger.kernel.org
    Signed-off-by: Alex Deucher
    Signed-off-by: Greg Kroah-Hartman

    Christian König
     
  • commit 76f2e2bc627f7d08360ac731b6277d744d4eb599 upstream.

    Unbinding nouveau on a dual GPU MacBook Pro oopses because we iterate
    over the bl_connectors list in nouveau_backlight_exit() but skipped
    initializing it in nouveau_backlight_init(). Stacktrace for posterity:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: nouveau_backlight_exit+0x2b/0x70 [nouveau]
    nouveau_display_destroy+0x29/0x80 [nouveau]
    nouveau_drm_unload+0x65/0xe0 [nouveau]
    drm_dev_unregister+0x3c/0xe0 [drm]
    drm_put_dev+0x2e/0x60 [drm]
    nouveau_drm_device_remove+0x47/0x70 [nouveau]
    pci_device_remove+0x36/0xb0
    device_release_driver_internal+0x157/0x220
    driver_detach+0x39/0x70
    bus_remove_driver+0x51/0xd0
    pci_unregister_driver+0x2a/0xa0
    nouveau_drm_exit+0x15/0xfb0 [nouveau]
    SyS_delete_module+0x18c/0x290
    system_call_fast_compare_end+0xc/0x6f

    Fixes: b53ac1ee12a3 ("drm/nouveau/bl: Do not register interface if Apple GMUX detected")
    Cc: stable@vger.kernel.org # v4.10+
    Cc: Pierre Moreau
    Signed-off-by: Lukas Wunner
    Signed-off-by: Ben Skeggs
    Signed-off-by: Greg Kroah-Hartman

    Lukas Wunner
     
  • commit a2ff19f7b70118ced291a28d5313469914de451b upstream.

    When releasing a client, we need to clear the clienttab[] entry
    first, then call snd_seq_queue_client_leave(). Otherwise, the
    in-flight cell in the queue might be picked up by the timer interrupt
    via snd_seq_check_queue() before calling snd_seq_queue_client_leave(),
    and it's delivered to another queue while the client is clearing
    queues. This may eventually result in an uncleared cell remaining in
    a queue, and the later snd_seq_pool_delete() may need to wait for a
    long time until the event gets really processed.

    By moving the clienttab[] clearance to the beginning of release, any
    event delivery of a cell belonging to this client will fail at a later
    point, since snd_seq_client_ptr() returns NULL. Thus the cell that
    was picked up by the timer interrupt will be returned immediately
    without further delivery, and the long stall of snd_seq_pool_delete()
    can be avoided, too.

    Cc:
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
     
  • commit d0f833065221cbfcbadf19fd4102bcfa9330006a upstream.

    Although we've covered the races between concurrent write() and
    ioctl() in the previous patch series, there is still a possible UAF in
    the following scenario:

    A: user client closed               B: timer irq
      -> snd_seq_release()                -> snd_seq_timer_interrupt()
        -> snd_seq_free_client()            -> snd_seq_check_queue()
                                              -> cell = snd_seq_prioq_cell_peek()
          -> snd_seq_prioq_leave()
             .... removing all cells
          -> snd_seq_pool_done()
             .... vfree()
                                              -> snd_seq_compare_tick_time(cell)
                                                 ... Oops

    So the problem is that a cell is peeked and accessed without any
    protection until it's retrieved from the queue again via
    snd_seq_prioq_cell_out().

    This patch tries to address it and also cleans up the code with a slight
    refactoring. snd_seq_prioq_cell_out() now receives an extra pointer
    argument. When it's non-NULL, the function checks the event timestamp
    with the given pointer. The caller needs to pass the right reference
    either to snd_seq_tick or snd_seq_realtime depending on the event
    timestamp type.

    The good news is that the above change allows us to remove
    snd_seq_prioq_cell_peek(), too, thus the patch actually reduces the
    code size.
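
    As a rough illustration of the idea, the following user-space sketch
    folds the "is the head due yet?" check into the dequeue itself, so the
    check and the unlink happen under a single lock. The names struct cell,
    struct prioq and prioq_cell_out() are hypothetical stand-ins, the
    mutex replaces the kernel spinlock, and the plain integer comparison
    ignores the tick/realtime distinction described above. It is not the
    ALSA implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the ALSA sequencer structures. */
struct cell {
    unsigned int tick;      /* time at which the event becomes due */
    struct cell *next;
};

struct prioq {
    pthread_mutex_t lock;
    struct cell *head;      /* kept sorted by ascending tick */
};

/*
 * Dequeue the head cell. When current_tick is non-NULL, dequeue only if the
 * head is already due. The timestamp check and the unlink are done under one
 * lock, so a caller never holds a pointer to a cell that is still queued.
 */
static struct cell *prioq_cell_out(struct prioq *q,
                                   const unsigned int *current_tick)
{
    struct cell *c;

    assert(q != NULL);
    pthread_mutex_lock(&q->lock);
    c = q->head;
    if (c != NULL && current_tick != NULL && c->tick > *current_tick)
        c = NULL;           /* head is not due yet: leave it queued */
    if (c != NULL) {
        q->head = c->next;  /* unlink while still holding the lock */
        c->next = NULL;
    }
    pthread_mutex_unlock(&q->lock);
    return c;
}
```

    A caller that wants the old unconditional behavior passes NULL as the
    second argument; a timer-style caller passes the current time and
    simply stops when NULL comes back.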

    Reviewed-by: Nicolai Stange
    Cc:
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
     
  • commit 40088dc4e1ead7df31728c73f5b51d71da18831d upstream.

    With the commit 1ba8f9d30817 ("ALSA: hda: Add a power_save
    blacklist"), we changed the default value of power_save option to -1
    for processing the power-save blacklist.
    Unfortunately, this seems to break user-space applications that
    read the power_save parameter value via sysfs and judge or adjust
    the power-saving status from it. They interpret the value -1 as
    power-save being turned off, although the actual value is taken from
    CONFIG_SND_HDA_POWER_SAVE_DEFAULT and can be positive.

    So, overall, passing -1 there was not a good idea. Let's partially
    revert it: the default value of the power_save option is restored
    to CONFIG_SND_HDA_POWER_SAVE_DEFAULT. Meanwhile, this patch keeps
    the blacklist behavior and makes it adjustable via the new
    option, pm_blacklist.

    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199073
    Fixes: 1ba8f9d30817 ("ALSA: hda: Add a power_save blacklist")
    Acked-by: Hans de Goede
    Cc:
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
     
  • commit 01c0b4265cc16bc1f43f475c5944c55c10d5768f upstream.

    snd_pcm_oss_get_formats() has an obvious use-after-free around
    snd_mask_test() calls, as spotted by syzbot. The passed format_mask
    argument is a pointer to the hw_params object that is freed before the
    loop. What a surprise that it has been present since the original
    code of decades ago...
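
    One common way to avoid this class of bug is to copy the needed value
    out of the object while it is still valid and to loop over the copy.
    The sketch below shows that pattern in plain user-space C. struct
    params, its single 32-bit format_mask field and collect_formats() are
    hypothetical simplifications, not the ALSA types or the actual patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the hw_params object; not the ALSA structure. */
struct params {
    unsigned int format_mask;   /* one bit per supported format */
};

/*
 * Build a result mask from a temporary params object. The mask is copied by
 * value before the object is freed, so the loop never reads freed memory.
 */
static unsigned int collect_formats(unsigned int hw_mask)
{
    struct params *p = malloc(sizeof(*p));
    unsigned int mask, formats = 0;
    int bit;

    if (p == NULL)
        return 0;
    p->format_mask = hw_mask;   /* pretend the hardware query filled this in */

    mask = p->format_mask;      /* copy while the object is still valid */
    free(p);                    /* from here on, only the copy is used */

    for (bit = 0; bit < 32; bit++) {
        if (mask & (1u << bit))
            formats |= 1u << bit;
    }
    return formats;
}
```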

    Reported-by: syzbot+4090700a4f13fccaf648@syzkaller.appspotmail.com
    Cc:
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
     
  • commit 9ef0f88fe5466c2ca1d2975549ba6be502c464c1 upstream.

    Just when I had decided that flush_cache_range() was always called with
    a valid context, Helge reported two cases where the
    "BUG_ON(!vma->vm_mm->context);" was hit on the phantom buildd:

    kernel BUG at /mnt/sdb6/linux/linux-4.15.4/arch/parisc/kernel/cache.c:587!
    CPU: 1 PID: 3254 Comm: kworker/1:2 Tainted: G D 4.15.0-1-parisc64-smp #1 Debian 4.15.4-1+b1
    Workqueue: events free_ioctx
      IAOQ[0]: flush_cache_range+0x164/0x168
      IAOQ[1]: flush_cache_page+0x0/0x1c8
      RP(r2): unmap_page_range+0xae8/0xb88
    Backtrace:
      unmap_page_range+0xae8/0xb88
      unmap_single_vma+0xc0/0x188
      zap_page_range_single+0x134/0x1f8
      unmap_mapping_range+0x1cc/0x208
      truncate_pagecache+0x98/0x108
      truncate_setsize+0x9c/0xb8
      put_aio_ring_file+0x80/0x100
      aio_free_ring+0x8c/0x290
      free_ioctx+0x80/0x180
      process_one_work+0x21c/0x668
      worker_thread+0x20c/0x778
      kthread+0x2d4/0x2e0
      end_fault_vector+0x20/0xc0

    This indicates that we need to handle the no context case in
    flush_cache_range() as we do in flush_cache_mm().

    In thinking about this, I realized that we don't need to flush the TLB
    when there is no context. So, I added context checks to the large flush
    cases in flush_cache_mm() and flush_cache_range(). The large flush case
    occurs frequently in flush_cache_mm() and the change should improve fork
    performance.

    The v2 version of this change removes the BUG_ON from flush_cache_page()
    by skipping the TLB flush when there is no context.  I also added code
    to flush the TLB in flush_cache_mm() and flush_cache_range() when we
    have a context that's not current.  Now all three routines handle TLB
    flushes in a similar manner.

    Signed-off-by: John David Anglin
    Cc: stable@vger.kernel.org # 4.9+
    Signed-off-by: Helge Deller
    Signed-off-by: Greg Kroah-Hartman

    John David Anglin
     
  • commit 18a955219bf7d9008ce480d4451b6b8bf4483a22 upstream.

    Gratian Crisan reported that vmalloc_fault() crashes when CONFIG_HUGETLBFS
    is not set since the function inadvertently uses pXn_huge(), which always
    returns 0 in this case. ioremap() does not depend on CONFIG_HUGETLBFS.

    Fix vmalloc_fault() to call pXd_large() instead.

    Fixes: f4eafd8bcd52 ("x86/mm: Fix vmalloc_fault() to handle large pages properly")
    Reported-by: Gratian Crisan
    Signed-off-by: Toshi Kani
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Cc: linux-mm@kvack.org
    Cc: Borislav Petkov
    Cc: Andy Lutomirski
    Link: https://lkml.kernel.org/r/20180313170347.3829-2-toshi.kani@hpe.com
    Signed-off-by: Greg Kroah-Hartman

    Toshi Kani
     
  • commit daaf216c06fba4ee4dc3f62715667da929d68774 upstream.

    When using device passthrough with SME active, the MMIO range that is
    mapped for the device should not be mapped encrypted. Add a check in
    set_spte() to ensure that a page is not mapped encrypted if that page
    is a device MMIO page as indicated by kvm_is_mmio_pfn().

    Cc: # 4.14.x-
    Signed-off-by: Tom Lendacky
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Tom Lendacky
     
  • commit e3b3121fa8da94cb20f9e0c64ab7981ae47fd085 upstream.

    In accordance with Intel's microcode revision guidance from March 6, MCU
    rev 0xc2 is cleared on both Skylake H/S and Skylake Xeon E3 processors
    that share CPUID 506E3.

    Signed-off-by: Alexander Sergeyev
    Signed-off-by: Thomas Gleixner
    Cc: Jia Zhang
    Cc: Greg Kroah-Hartman
    Cc: Kyle Huey
    Cc: David Woodhouse
    Link: https://lkml.kernel.org/r/20180313193856.GA8580@localhost.localdomain
    Signed-off-by: Greg Kroah-Hartman

    Alexander Sergeyev
     
  • commit a14bff131108faf50cc0cf864589fd71ee216c96 upstream.

    In the following commit:

    9e0e3c5130e9 ("x86/speculation, objtool: Annotate indirect calls/jumps for objtool")

    ... we added annotations for CALL_NOSPEC/JMP_NOSPEC on 64-bit x86 kernels,
    but we did not annotate the 32-bit path.

    Annotate it similarly.

    Signed-off-by: Andy Whitcroft
    Acked-by: Peter Zijlstra (Intel)
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Woodhouse
    Cc: David Woodhouse
    Cc: Greg Kroah-Hartman
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20180314112427.22351-1-apw@canonical.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Whitcroft
     
  • commit b5069782453459f6ec1fdeb495d9901a4545fcb5 upstream.

    POPF would trap if VIP was set regardless of whether IF was set. Fix it.
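
    The intended condition can be sketched as a small predicate: trap only
    when an interrupt is pending and the guest also has interrupts enabled.
    The sketch assumes the guest's interrupt flag is tracked in the virtual
    VIF bit (bit 19 of EFLAGS) alongside VIP (bit 20). The macro names and
    popf_should_trap() are illustrative, not the kernel's identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EFLAGS bit positions as defined by the x86 architecture. */
#define EFLAGS_VIF (UINT32_C(1) << 19)  /* virtual interrupt flag */
#define EFLAGS_VIP (UINT32_C(1) << 20)  /* virtual interrupt pending */

/*
 * Hypothetical helper: after an emulated POPF, trap to the monitor only if
 * an interrupt is pending (VIP) AND the guest has interrupts enabled (VIF).
 * Checking VIP alone reproduces the behavior described as the bug.
 */
static bool popf_should_trap(uint32_t veflags)
{
    const uint32_t both = EFLAGS_VIP | EFLAGS_VIF;

    return (veflags & both) == both;
}
```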

    Suggested-by: Stas Sergeev
    Reported-by: Bart Oldeman
    Signed-off-by: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Fixes: 5ed92a8ab71f ("x86/vm86: Use the normal pt_regs area for vm86")
    Link: http://lkml.kernel.org/r/ce95f40556e7b2178b6bc06ee9557827ff94bd28.1521003603.git.luto@kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski