26 Apr, 2018

40 commits

  • [ Upstream commit 6b136a24b05c81a24e0b648a4bd938bcd0c4f69e ]

    Attributes that only implement .seq_ops are read-only, so any write to
    them should be rejected. Currently, however, the kernel crashes when
    writing to such debugfs entries, e.g.

    chmod +w /sys/kernel/debug/block//requeue_list
    echo 0 > /sys/kernel/debug/block//requeue_list
    chmod -w /sys/kernel/debug/block//requeue_list

    Fix it by returning -EPERM in blk_mq_debugfs_write() when writing to
    such attributes.
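
    A minimal user-space sketch of the check described above. The names
    sketch_attr and sketch_debugfs_write are hypothetical stand-ins, not the
    kernel's definitions; the only behavior taken from this message is that
    an attribute without a write handler gets -EPERM.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a debugfs attribute: either handler may be NULL. */
struct sketch_attr {
	int (*show)(void *data);
	long (*write)(void *data, const char *buf, size_t count);
};

/* Reject writes to attributes that provide no write handler. */
static long sketch_debugfs_write(const struct sketch_attr *attr, void *data,
				 const char *buf, size_t count)
{
	if (!attr->write)
		return -EPERM;	/* read-only attribute: refuse the write */

	return attr->write(data, buf, count);
}
```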

    Cc: Ming Lei
    Signed-off-by: Eryu Guan
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Eryu Guan
     
  • [ Upstream commit b3ecd4aa8632a86428605ab73393d14779019d82 ]

    Another VCPU might try to modify the SCB while we are creating the
    shadow SCB. In general this is no problem - unless the compiler decides
    to not load values once, but e.g. twice.

    For us, this is only relevant when checking/working with such values.
    E.g. the prefix value, the mso, state of transactional execution and
    addresses of satellite blocks.

    E.g. if we blindly forward values (e.g. general purpose registers or
    execution controls after masking), we don't care.

    Leaving unpin_blocks() untouched for now, will handle it separately.

    The worst thing right now that I can see would be a missed prefix
    un/remap (mso, prefix, tx) or using wrong guest addresses. Nothing
    critical, but let's try to avoid unpredictable behavior.
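
    A small sketch of the "load once, then check and use the same value"
    idea. It uses a volatile read to force a single load, which is an
    assumption for illustration; this message does not name the mechanism.
    The function names and the range check are hypothetical.

```c
#include <stdint.h>

/* Force exactly one load; the compiler may not re-read *p afterwards. */
static inline uint32_t sketch_read_once_u32(const uint32_t *p)
{
	return *(const volatile uint32_t *)p;
}

/*
 * Hypothetical check: validate a shared value and return the very same
 * snapshot, so a concurrent writer cannot change it between the two steps.
 * Returns the accepted value, or 0 if the snapshot is out of range.
 */
static uint32_t sketch_checked_value(const uint32_t *shared, uint32_t limit)
{
	uint32_t snap = sketch_read_once_u32(shared);

	if (snap >= limit)
		return 0;
	return snap;	/* the same value that was validated */
}
```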

    Signed-off-by: David Hildenbrand
    Message-Id:
    Reviewed-by: Christian Borntraeger
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    David Hildenbrand
     
  • [ Upstream commit 587d8628fb71c3bfae29fb2bbe84c1478c59bac8 ]

    This patch prevents the thinkpad_acpi driver from warning about 2 event
    codes returned for keyboard palm-detection. No behavioral changes,
    other than suppressing the warning in the kernel log. The events are
    still forwarded via acpi-netlink channels.

    We could, optionally, decide to forward the event through an
    input-switch on the tpacpi input device. However, so far no suitable
    input-code exists, and no similar drivers report such events. Hence,
    leave it as an acpi event for now.

    Note that the event-codes are named based on empirical studies. On the
    ThinkPad X1 5th Gen the sensor can be found underneath the arrow key.

    Cc: Matthew Thode
    Signed-off-by: David Herrmann
    Acked-by: Henrique de Moraes Holschuh
    Signed-off-by: Andy Shevchenko
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    David Herrmann
     
  • [ Upstream commit e0346f9fcb6c636d2f870e6666de8781413f34ea ]

    If we receive the link status message from PF with link up before queues
    are actually enabled, it will trigger a TX hang. This fixes the issue
    by ignoring a link up message if the VF state is not yet in RUNNING
    state.

    Signed-off-by: Alan Brady
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Alan Brady
     
  • [ Upstream commit 06aa040f039404a0039a5158cd12f41187487a1f ]

    When a host disables and enables a PF device, all the associated
    VFs are removed and added back in. It also generates a PFR which in turn
    resets all the connected VFs. This behaviour is different from that of
    Linux guest on Linux host. Hence we end up in a situation where there's
    a PFR and a device removal at the same time. The watchdog is unaware of
    this and schedules a reset_task. This patch adds code to signal to
    reset_task that the device is currently being removed.

    Signed-off-by: Avinash Dayanand
    Tested-by: Andrew Bowers
    Signed-off-by: Jeff Kirsher
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Avinash Dayanand
     
  • [ Upstream commit 783687810e986a15ffbf86c516a1a48ff37f38f7 ]

    Bug: BPF programs and maps related to sockmaps test exist
    in memory even after test_maps ends.

    This patch fixes it as a short term workaround (sockmap
    kernel side needs real fixing) by emptying sockmaps when
    the test ends.

    Fixes: 6f6d33f3b3d0f ("bpf: selftests add sockmap tests")
    Signed-off-by: Prashant Bhole
    [ daniel: Note on workaround. ]
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Prashant Bhole
     
  • [ Upstream commit 20d59023c5ec4426284af492808bcea1f39787ef ]

    We inadvertently set it again on the source bio, but we need
    to set it on the new split bio instead.

    Fixes: fbbaf700e7b1 ("block: trace completion of all bios.")
    Signed-off-by: Goldwyn Rodrigues
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Goldwyn Rodrigues
     
  • [ Upstream commit e58decc9c51eb61697aba35ba8eda33f4b80552d ]

    Fix to return error code -EINVAL instead of 0 when num_vfs is above
    limit_vfs, as done elsewhere in this function.
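
    The pattern can be sketched as follows. The helper name and its
    parameters are hypothetical; only the -EINVAL result for num_vfs above
    limit_vfs comes from this message.

```c
#include <errno.h>

/* Hypothetical validation helper: 0 on success, negative errno on failure. */
static int sketch_check_num_vfs(unsigned int num_vfs, unsigned int limit_vfs)
{
	int err = 0;

	if (num_vfs > limit_vfs) {
		err = -EINVAL;	/* without this assignment the caller sees 0 */
		goto out;
	}
	/* ... further setup would go here ... */
out:
	return err;
}
```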

    Fixes: 0dc786219186 ("nfp: handle SR-IOV already enabled when driver is probing")
    Signed-off-by: Wei Yongjun
    Acked-by: Jakub Kicinski
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Wei Yongjun
     
  • [ Upstream commit 7ad81482cad67cbe1ec808490d1ddfc420c42008 ]

    We get the "new_profile_index" value from the mouse device when we're
    handling raw events. Smatch taints it as untrusted data and complains
    that we need a bounds check. This seems like a reasonable warning;
    otherwise there is a small read beyond the end of the array.
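
    A bounds check of this kind might look like the sketch below. The
    structure, the array size and the function name are hypothetical and
    only illustrate rejecting an out-of-range index received from a device.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_PROFILES 5	/* hypothetical array size */

struct sketch_device {
	int profile_settings[SKETCH_NUM_PROFILES];
	size_t actual_profile;
};

/* Returns 0 if the index was accepted, -1 if it was out of range. */
static int sketch_profile_activated(struct sketch_device *dev,
				    uint8_t new_profile_index)
{
	if (new_profile_index >= SKETCH_NUM_PROFILES)
		return -1;	/* untrusted value from the device: ignore it */

	dev->actual_profile = new_profile_index;
	return 0;
}
```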

    Fixes: 0e70f97f257e ("HID: roccat: Add support for Kova[+] mouse")
    Signed-off-by: Dan Carpenter
    Acked-by: Silvan Jegen
    Signed-off-by: Jiri Kosina
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     
  • [ Upstream commit cba04cdf437d745fac85220d1d692a9ae23d7004 ]

    The interrupt is requested before the device is powered on, and
    its value cannot be relied upon in some cases. On some devices
    an interrupt is generated as soon as it is requested, before
    there is a chance to disable the irq.

    Set the irq flag as IRQ_NOAUTOEN before requesting it.

    This patch mutes the error:

    stmfts 2-0049: failed to read events: -11

    received sometimes during boot time.

    Signed-off-by: Andi Shyti
    Signed-off-by: Dmitry Torokhov
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Andi Shyti
     
  • [ Upstream commit 96d5eaa9bb74d299508d811d865c2c41b38b0301 ]

    While testing with the ARM specific memset() macro removed, I ran into a
    compiler warning that shows an old bug:

    drivers/scsi/arm/fas216.c: In function 'fas216_rq_sns_done':
    drivers/scsi/arm/fas216.c:2014:40: error: argument to 'sizeof' in 'memset' call is the same expression as the destination; did you mean to provide an explicit length? [-Werror=sizeof-pointer-memaccess]

    It turns out that the definition of the scsi_cmd structure changed back
    in linux-2.6.25, so now we clear only four bytes (sizeof(pointer))
    instead of 96 (SCSI_SENSE_BUFFERSIZE). I did not check whether we
    actually need to initialize the buffer here, but it's clear that if we
    do it, we should use the correct size.
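
    A user-space sketch of the corrected call. The structure is a
    hypothetical reduction in which the sense buffer is a pointer, and the
    size constant uses the 96 bytes quoted above.

```c
#include <string.h>

#define SKETCH_SENSE_BUFFERSIZE 96	/* size quoted in the commit message */

/* Hypothetical command structure: the sense buffer is a pointer, not an array. */
struct sketch_cmd {
	unsigned char *sense_buffer;
};

static void sketch_clear_sense(struct sketch_cmd *cmd)
{
	/*
	 * sizeof(cmd->sense_buffer) would be the size of the pointer here,
	 * so the length is given explicitly.
	 */
	memset(cmd->sense_buffer, 0, SKETCH_SENSE_BUFFERSIZE);
}
```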

    Fixes: de25deb18016 ("[SCSI] use dynamically allocated sense buffer")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • [ Upstream commit 3f884a0a8bdf28cfd1e9987d54d83350096cdd46 ]

    Replace "" with NULL for product revision level, and merge TEXEL
    duplicate entries.

    Cc: Hannes Reinecke
    Cc: Martin K. Petersen
    Cc: James E.J. Bottomley
    Cc: SCSI ML
    Signed-off-by: Xose Vazquez Perez
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Xose Vazquez Perez
     
  • [ Upstream commit a9d572c7550044d5b217b5287d99a2e6d34b97b0 ]

    When io_bits is set, GCing an encrypted block may hit the following hung
    task. Since io_bits requires an aligned block address,
    f2fs_submit_page_write may return -EAGAIN if new_blkaddr does not satisfy
    the io_bits alignment. As a result, the encrypted page will never be
    written back.

    This patch makes move_data_block aware of the -EAGAIN error so that it
    cancels the writeback.

    [ 246.751371] INFO: task kworker/u4:4:797 blocked for more than 90 seconds.
    [ 246.752423] Not tainted 4.15.0-rc4+ #11
    [ 246.754176] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [ 246.755336] kworker/u4:4 D25448 797 2 0x80000000
    [ 246.755597] Workqueue: writeback wb_workfn (flush-7:0)
    [ 246.755616] Call Trace:
    [ 246.755695] ? __schedule+0x322/0xa90
    [ 246.755761] ? blk_init_request_from_bio+0x120/0x120
    [ 246.755773] ? pci_mmcfg_check_reserved+0xb0/0xb0
    [ 246.755801] ? __radix_tree_create+0x19e/0x200
    [ 246.755813] ? delete_node+0x136/0x370
    [ 246.755838] schedule+0x43/0xc0
    [ 246.755904] io_schedule+0x17/0x40
    [ 246.755939] wait_on_page_bit_common+0x17b/0x240
    [ 246.755950] ? wake_page_function+0xa0/0xa0
    [ 246.755961] ? add_to_page_cache_lru+0x160/0x160
    [ 246.755972] ? page_cache_tree_insert+0x170/0x170
    [ 246.755983] ? __lru_cache_add+0x96/0xb0
    [ 246.756086] __filemap_fdatawait_range+0x14f/0x1c0
    [ 246.756097] ? wait_on_page_bit_common+0x240/0x240
    [ 246.756120] ? __wake_up_locked_key_bookmark+0x20/0x20
    [ 246.756167] ? wait_on_all_pages_writeback+0xc9/0x100
    [ 246.756179] ? __remove_ino_entry+0x120/0x120
    [ 246.756192] ? wait_woken+0x100/0x100
    [ 246.756204] filemap_fdatawait_range+0x9/0x20
    [ 246.756216] write_checkpoint+0x18a1/0x1f00
    [ 246.756254] ? blk_get_request+0x10/0x10
    [ 246.756265] ? cpumask_next_and+0x43/0x60
    [ 246.756279] ? f2fs_sync_inode_meta+0x160/0x160
    [ 246.756289] ? remove_element.isra.4+0xa0/0xa0
    [ 246.756300] ? __put_compound_page+0x40/0x40
    [ 246.756310] ? f2fs_sync_fs+0xec/0x1c0
    [ 246.756320] ? f2fs_sync_fs+0x120/0x1c0
    [ 246.756329] f2fs_sync_fs+0x120/0x1c0
    [ 246.756357] ? trace_event_raw_event_f2fs__page+0x260/0x260
    [ 246.756393] ? ata_build_rw_tf+0x173/0x410
    [ 246.756397] f2fs_balance_fs_bg+0x198/0x390
    [ 246.756405] ? drop_inmem_page+0x230/0x230
    [ 246.756415] ? ahci_qc_prep+0x1bb/0x2e0
    [ 246.756418] ? ahci_qc_issue+0x1df/0x290
    [ 246.756422] ? __accumulate_pelt_segments+0x42/0xd0
    [ 246.756426] ? f2fs_write_node_pages+0xd1/0x380
    [ 246.756429] f2fs_write_node_pages+0xd1/0x380
    [ 246.756437] ? sync_node_pages+0x8f0/0x8f0
    [ 246.756440] ? update_curr+0x53/0x220
    [ 246.756444] ? __accumulate_pelt_segments+0xa2/0xd0
    [ 246.756448] ? __update_load_avg_se.isra.39+0x349/0x360
    [ 246.756452] ? do_writepages+0x2a/0xa0
    [ 246.756456] do_writepages+0x2a/0xa0
    [ 246.756460] __writeback_single_inode+0x70/0x490
    [ 246.756463] ? check_preempt_wakeup+0x199/0x310
    [ 246.756467] writeback_sb_inodes+0x2a2/0x660
    [ 246.756471] ? is_empty_dir_inode+0x40/0x40
    [ 246.756474] ? __writeback_single_inode+0x490/0x490
    [ 246.756477] ? string+0xbf/0xf0
    [ 246.756480] ? down_read_trylock+0x35/0x60
    [ 246.756484] __writeback_inodes_wb+0x9f/0xf0
    [ 246.756488] wb_writeback+0x41d/0x4b0
    [ 246.756492] ? writeback_inodes_wb.constprop.55+0x150/0x150
    [ 246.756498] ? set_worker_desc+0xf7/0x130
    [ 246.756502] ? current_is_workqueue_rescuer+0x60/0x60
    [ 246.756511] ? _find_next_bit+0x2c/0xa0
    [ 246.756514] ? wb_workfn+0x400/0x5d0
    [ 246.756518] wb_workfn+0x400/0x5d0
    [ 246.756521] ? finish_task_switch+0xdf/0x2a0
    [ 246.756525] ? inode_wait_for_writeback+0x30/0x30
    [ 246.756529] process_one_work+0x3a7/0x6f0
    [ 246.756533] worker_thread+0x82/0x750
    [ 246.756537] kthread+0x16f/0x1c0
    [ 246.756541] ? trace_event_raw_event_workqueue_work+0x110/0x110
    [ 246.756544] ? kthread_create_worker_on_cpu+0xb0/0xb0
    [ 246.756548] ret_from_fork+0x1f/0x30

    Signed-off-by: Sheng Yong
    Reviewed-by: Chao Yu
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sheng Yong
     
  • [ Upstream commit 00db63c128dd3daf38f481371976c24d32678142 ]

    If valid netdevice is not found for RoCE, GID table should not be
    searched with NULL netdevice.

    Doing so causes the search routines to ignore the netdev argument and may
    match the wrong GID table entry if the netdev is deleted.

    Fixes: abae1b71dd37 ("IB/cma: cma_validate_port should verify the port and netdevice")
    Signed-off-by: Parav Pandit
    Reviewed-by: Mark Bloch
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Parav Pandit
     
  • [ Upstream commit 7583d8d088ff2c323b1d4f15b191ca2c23d32558 ]

    Before rbio_orig_end_io() goes to free rbio, rbio may get merged with
    more bios from other rbios and rbio->bio_list becomes non-empty,
    in that case, these newly merged bios don't end properly.

    Once unlock_stripe() is done, rbio->bio_list will not be updated any
    more and we can call bio_endio() on all queued bios.

    It should only happen in error-out cases, the normal path of recover
    and full stripe write have already set RBIO_RMW_LOCKED_BIT to disable
    merge before doing IO, so rbio_orig_end_io() called by them doesn't
    have the above issue.

    Reported-by: Jérôme Carretero
    Signed-off-by: Liu Bo
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • [ Upstream commit 18e83ac75bfe67009c4ddcdd581bba8eb16f4030 ]

    This fixes a corner case that is caused by a race of dio write vs dio
    read/write.

    Here is how the race could happen.

    Suppose that no extent map has been loaded into memory yet.
    There is a file extent [0, 32K), two jobs are running concurrently
    against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio
    read from [0, 4K) or [4K, 8K).

    t1 goes ahead of t2 and splits em [0, 32K) to em [0K, 8K) and [8K, 32K).

    ------------------------------------------------------
        t1                                  t2
    btrfs_get_blocks_direct()           btrfs_get_blocks_direct()
     -> btrfs_get_extent()               -> btrfs_get_extent()
         -> lookup_extent_mapping()
         -> add_extent_mapping()             -> lookup_extent_mapping()
                                                # load [0, 32K)
     -> btrfs_new_extent_direct()
         -> btrfs_drop_extent_cache()
            # split [0, 32K) and
            # drop [8K, 32K)
         -> add_extent_mapping()
            # add [8K, 32K)
                                             -> add_extent_mapping()
                                                # handle -EEXIST when adding
                                                # [0, 32K)
    ------------------------------------------------------
    About how t2(dio read/write) runs into -EEXIST:

    a) add_extent_mapping() gets -EEXIST for adding em [0, 32k),

    b) search_extent_mapping() then returns [0, 8k) as the existing em,
    even though start == existing->start, em is [0, 32k) so that
    extent_map_end(em) > extent_map_end(existing), i.e. 32k > 8k,

    c) then it goes thru merge_extent_mapping() which tries to add a [8k, 8k)
    (with a length 0) and returns -EEXIST as [8k, 32k) is already in tree,

    d) so btrfs_get_extent() ends up returning -EEXIST to dio read/write,
    which is confusing applications.

    Here are all the possible situations:
    1) start < existing->start

                    +-----------+em+-----------+
    +--prev---+ |     +-------------+      |
    |         | |     |             |      |
    +---------+ +     +---+existing++     ++
                       +
                       |
                       +
                     start

    2) start == existing->start

    +------------em------------+
    |     +-------------+      |
    |     |             |      |
    +     +----existing-+      +
          +
          |
          +
        start

    3) start > existing->start && start < (existing->start + existing->len)

    +------------em------------+
    |     +-------------+      |
    |     |             |      |
    +     +----existing-+      +
                 +
                 |
                 +
               start

    4) start >= (existing->start + existing->len)

    +-----------+em+-----------+
    |     +-------------+      | +--next---+
    |     |             |      | |         |
    +     +---+existing++      + +---------+
                            +
                            |
                            +
                          start

    As we can see, it turns out that if start is within existing em (front
    inclusive), then the existing em should be returned as is, otherwise,
    we try our best to merge candidate em with sibling ems to form a
    larger em (in order to reduce the total number of em).
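
    The deciding test (is start within the existing em, front inclusive?)
    can be sketched like this. The reduced structure and the helper name
    are hypothetical; the real extent map carries more fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced extent map: only the range matters here. */
struct sketch_em {
	uint64_t start;
	uint64_t len;
};

/* True if start lies in [existing->start, existing->start + existing->len). */
static bool sketch_start_in_existing(uint64_t start,
				     const struct sketch_em *existing)
{
	return start >= existing->start &&
	       start < existing->start + existing->len;
}
```

    With this helper, situations 2) and 3) return true (use the existing em
    as is), while 1) and 4) return false (try to merge with sibling ems).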

    Reported-by: David Vallender
    Signed-off-by: Liu Bo
    Reviewed-by: Josef Bacik
    Signed-off-by: David Sterba

    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • [ Upstream commit 6f794e3c5c8f8fdd3b5bb20d9ded894e685b5bbe ]

    It appears from the original commit [1] that there isn't any design
    specific reason not to fail the mount instead of just warning. This
    patch will change it to fail.

    [1]
    commit 319e4d0661e5323c9f9945f0f8fb5905e5fe74c3
    btrfs: Enhance super validation check

    Fixes: 319e4d0661e5323 ("btrfs: Enhance super validation check")
    Signed-off-by: Anand Jain
    Reviewed-by: Qu Wenruo
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Anand Jain
     
  • [ Upstream commit 762221f095e3932669093466aaf4b85ed9ad2ac1 ]

    The raid6 corruption is that,
    suppose that all disks can be read without problems and if the content
    that was read out doesn't match its checksum, currently for raid6
    btrfs at most retries twice,

    - the 1st retry is to rebuild with all other stripes, it'll eventually
    be a raid5 xor rebuild,
    - if the 1st fails, the 2nd retry will deliberately fail parity p so
    that it will do raid6 style rebuild,

    however, the chances are that another non-parity stripe content also
    has something corrupted, so that the above retries are not able to
    return correct content.

    We've fixed normal reads to rebuild raid6 correctly with more retries
    in Patch "Btrfs: make raid6 rebuild retry more"[1], this is to fix
    scrub to do exactly the same rebuild process.

    [1]: https://patchwork.kernel.org/patch/10091755/

    Signed-off-by: Liu Bo
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • [ Upstream commit 9ea2c7c9da13c9073e371c046cbbc45481ecb459 ]

    When modifying a tree where the root is at BTRFS_MAX_LEVEL - 1 then
    the level variable is going to be 7 (this is the max height of the
    tree). On the other hand btrfs_cow_block is always called with
    "level + 1" as an index into the nodes and slots arrays. This leads to
    an out of bounds access. Admittedly this will be benign since an OOB
    access of the nodes array will likely read the 0th element from the
    slots array, which in this case is going to be 0 (since we start CoW at
    the top of the tree). The OOB access into the slots array in turn will
    read the 0th and 1st values of the locks array, which would both be 0
    at the time. However, this benign behavior relies on the fact that the
    path being passed hasn't been initialised, if it has already been used to
    query a btree then it could potentially have populated the nodes/slots arrays.

    Fix it by explicitly checking if we are at level 7 (the maximum allowed
    index in nodes/slots arrays) and explicitly call the CoW routine with
    NULL for parent's node/slot.
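
    A sketch of the guard, assuming arrays of 8 entries (indices 0..7) as
    implied above. The structure and helper name are hypothetical; the
    point is only that index level + 1 is never read at the top level.

```c
#include <stddef.h>

#define SKETCH_MAX_LEVEL 8	/* nodes[] and slots[] have indices 0..7 */

struct sketch_path {
	void *nodes[SKETCH_MAX_LEVEL];
	int slots[SKETCH_MAX_LEVEL];
};

/*
 * Return the parent node for a given level, or NULL when the level is
 * already the top one, so that nodes[level + 1] is never read out of bounds.
 */
static void *sketch_parent_node(const struct sketch_path *p, int level)
{
	if (level + 1 < SKETCH_MAX_LEVEL)
		return p->nodes[level + 1];
	return NULL;
}
```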

    Signed-off-by: Nikolay Borisov
    Fixes-coverity-id: 711515
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Borisov
     
  • [ Upstream commit 343e4fc1c60971b0734de26dbbd475d433950982 ]

    Setting plug can merge adjacent IOs before dispatching IOs to the disk
    driver.

    Without plug, it would not be a problem for single-disk use cases, but
    for multiple disks using a raid profile, a large IO can be split into
    several IOs of stripe length, and plug can help bring them together
    for each disk so that we can save several disk accesses.

    Moreover, fsync issues synchronous writes, so plug can really take
    effect.

    Signed-off-by: Liu Bo
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Liu Bo
     
  • [ Upstream commit e749d328b0b450aa78d562fa26a0cd8872325dd9 ]

    Fix to return a negative error code from the request_irq() error
    handling case instead of 0, as done elsewhere in this function.

    Fixes: dce143c3381c ("ipmi/powernv: Convert to irq event interface")
    Signed-off-by: Wei Yongjun
    Reviewed-by: Alexey Kardashevskiy
    Signed-off-by: Corey Minyard
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Wei Yongjun
     
  • [ Upstream commit 0ddcff49b672239dda94d70d0fcf50317a9f4b51 ]

    'hwname' is malloced in hwsim_new_radio_nl() and should be freed
    before leaving from the error handling cases, otherwise it will cause
    a memory leak.

    Fixes: ff4dd73dd2b4 ("mac80211_hwsim: check HWSIM_ATTR_RADIO_NAME length")
    Signed-off-by: Wei Yongjun
    Reviewed-by: Ben Hutchings
    Signed-off-by: Johannes Berg
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    weiyongjun (A)
     
  • [ Upstream commit 5b1374b3b3c2fc4f63a398adfa446fb8eff791a4 ]

    Only the E_NOT operand and not the E_NOT node itself was freed, due to
    accidentally returning too early in expr_free(). Outline of leak:

    switch (e->type) {
    ...
    case E_NOT:
            expr_free(e->left.expr);
            return;
    ...
    }
    *Never reached, 'e' leaked*
    free(e);

    Fix by changing the 'return' to a 'break'.
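
    A self-contained sketch of the fixed shape. The types, the allocation
    helper and the live-node counter are hypothetical additions that make
    the effect observable; they are not the kconfig definitions.

```c
#include <stdlib.h>

enum sketch_type { SK_SYMBOL, SK_NOT };

struct sketch_expr {
	enum sketch_type type;
	struct sketch_expr *left;
};

static int sketch_live;	/* number of nodes allocated and not yet freed */

static struct sketch_expr *sketch_alloc(enum sketch_type type,
					struct sketch_expr *left)
{
	struct sketch_expr *e = calloc(1, sizeof(*e));

	if (!e)
		abort();
	e->type = type;
	e->left = left;
	sketch_live++;
	return e;
}

static void sketch_free(struct sketch_expr *e)
{
	if (!e)
		return;

	switch (e->type) {
	case SK_SYMBOL:
		break;
	case SK_NOT:
		sketch_free(e->left);
		break;	/* a 'return' here would skip the free() below */
	}
	free(e);
	sketch_live--;
}
```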

    Summary from Valgrind on 'menuconfig' (ARCH=x86) before the fix:

    LEAK SUMMARY:
    definitely lost: 44,448 bytes in 1,852 blocks
    ...

    Summary after the fix:

    LEAK SUMMARY:
    definitely lost: 1,608 bytes in 67 blocks
    ...

    Signed-off-by: Ulf Magnusson
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Ulf Magnusson
     
  • [ Upstream commit ae7440ef0c8013d68c00dad6900e7cce5311bb1c ]

    expr_trans_compare() always allocates and returns a new expression,
    giving the following leak outline:

    ...
    *Allocate*
    basedep = expr_trans_compare(basedep, E_UNEQUAL, &symbol_no);
    ...
    for (menu = parent->next; menu; menu = menu->next) {
            ...
            *Copy*
            dep2 = expr_copy(basedep);
            ...
            *Free copy*
            expr_free(dep2);
    }
    *basedep lost!*

    Fix by freeing 'basedep' after the loop.

    Summary from Valgrind on 'menuconfig' (ARCH=x86) before the fix:

    LEAK SUMMARY:
    definitely lost: 344,376 bytes in 14,349 blocks
    ...

    Summary after the fix:

    LEAK SUMMARY:
    definitely lost: 44,448 bytes in 1,852 blocks
    ...

    Signed-off-by: Ulf Magnusson
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Ulf Magnusson
     
  • [ Upstream commit 0724a7c32a54e3e50d28e19e30c59014f61d4e2c ]

    If a 'mainmenu' entry appeared in the Kconfig files, two things would
    leak:

    - The 'struct property' allocated for the default "Linux Kernel
    Configuration" prompt.

    - The string for the T_WORD/T_WORD_QUOTE prompt after the
    T_MAINMENU token, allocated on the heap in zconf.l.

    To fix it, introduce a new 'no_mainmenu_stmt' nonterminal that matches
    if there's no 'mainmenu' and adds the default prompt. That means the
    prompt only gets allocated once regardless of whether there's a
    'mainmenu' statement or not, and managing it becomes simple.

    Summary from Valgrind on 'menuconfig' (ARCH=x86) before the fix:

    LEAK SUMMARY:
    definitely lost: 344,568 bytes in 14,352 blocks
    ...

    Summary after the fix:

    LEAK SUMMARY:
    definitely lost: 344,440 bytes in 14,350 blocks
    ...

    Signed-off-by: Ulf Magnusson
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Ulf Magnusson
     
  • [ Upstream commit f541c09ebfc61697b586b38c9ebaf4b70defb278 ]

    According to all published information, the watchdog disable bit for SB800
    compatible controllers is bit 1 of PM register 0x48, not bit 2. For the
    most part that doesn't matter in practice, since the bit has to be cleared
    to enable watchdog address decoding, which is the default setting, but it
    still needs to be fixed.

    Cc: Zoltán Böszörményi
    Signed-off-by: Guenter Roeck
    Signed-off-by: Wim Van Sebroeck
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Guenter Roeck
     
  • [ Upstream commit 80db6f08b7af93eddc9487535e6150b220262637 ]

    Some hardware can operate in either "host" or "endpoint" mode, which means
    there can be both a host bridge driver and an endpoint driver for the same
    device. Those drivers share a lot of code, so sometimes they live in the
    same source file.

    The host bridge driver requires CONFIG_PCI=y because it enumerates PCI
    devices below the bridge using the PCI core. The endpoint driver does not
    require CONFIG_PCI=y because it runs in an embedded kernel on the other
    side of the device, e.g., on an adapter card.

    pci-dra7xx.c contains both host and endpoint drivers. If we select only
    the endpoint driver (CONFIG_PCI=n and CONFIG_PCI_DRA7XX_EP=y), the unneeded
    host driver is still compiled. It references pci_irqd_intx_xlate(), which
    is not present when CONFIG_PCI=n, which causes this error:

    drivers/pci/dwc/pci-dra7xx.c:229:11: error: 'pci_irqd_intx_xlate' undeclared here (not in a function)

    Add a dummy pci_irqd_intx_xlate() for the CONFIG_PCI=n case.
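
    The general shape of such a dummy can be sketched as below. The macro
    SKETCH_CONFIG_PCI, the function name and the reduced parameter list are
    hypothetical, and the -EINVAL return value is an assumption; the sketch
    only shows a stub that exists when the feature is compiled out.

```c
#include <errno.h>

/* Define SKETCH_CONFIG_PCI to model a build with the full implementation. */
#ifdef SKETCH_CONFIG_PCI
int sketch_intx_xlate(unsigned int intx, unsigned long *out_hwirq);
#else
/* Dummy used when the feature is compiled out: it only has to exist. */
static inline int sketch_intx_xlate(unsigned int intx, unsigned long *out_hwirq)
{
	(void)intx;
	(void)out_hwirq;
	return -EINVAL;
}
#endif
```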

    [bhelgaas: changelog]
    Signed-off-by: Niklas Cassel
    Signed-off-by: Bjorn Helgaas
    Acked-by: Arnd Bergmann
    Acked-by: Lorenzo Pieralisi
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Niklas Cassel
     
  • [ Upstream commit 5f2483eb2423152445b39f2db59d372f523e664e ]

    Make doesn't expand shell style "vmlinuz.{32,ecoff,bin,srec}" to the 4
    separate files, so none of these files get cleaned up by make clean.
    List the files separately instead.

    Fixes: ec3352925b74 ("MIPS: Remove all generated vmlinuz* files on "make clean"")
    Signed-off-by: James Hogan
    Cc: Ralf Baechle
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/18491/
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    James Hogan
     
  • [ Upstream commit cbebc6ef4fc830f4040d4140bf53484812d5d5d9 ]

    Since commit 57e62324e469 ("NFS: Store the legacy idmapper result in the
    keyring") nfs_idmap_cache_timeout changed units from jiffies to seconds.
    Unfortunately sysctl interface was not updated accordingly.

    As a result, writing some value to /proc/sys/fs/nfs/idmap_cache_timeout
    incorrectly multiplies this value by HZ.
    Likewise, reading /proc/sys/fs/nfs/idmap_cache_timeout shows the real
    value divided by HZ.
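
    The effect can be illustrated with plain arithmetic. The tick rate of
    250 is an assumed example value, and both helper names are hypothetical;
    they only model a handler that converts between seconds and jiffies.

```c
#define SKETCH_HZ 250	/* assumed tick rate; the real HZ is a build option */

/* What a jiffies-converting handler stores for a value given in seconds. */
static long sketch_store_as_jiffies(long seconds)
{
	return seconds * SKETCH_HZ;
}

/* What such a handler reports back when reading the stored value. */
static long sketch_show_from_jiffies(long stored)
{
	return stored / SKETCH_HZ;
}
```

    Under these assumptions, writing 600 stores 150000, which is then
    treated as seconds, and a real timeout of 600 seconds reads back as 2.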

    Fixes: 57e62324e469 ("NFS: Store the legacy idmapper result in the keyring")
    Signed-off-by: Jan Chochol
    Signed-off-by: Trond Myklebust
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jan Chochol
     
  • [ Upstream commit 246d8b184c100e8eb6b4e8c88f232c2ed2a4e672 ]

    Polling the completion queue directly does not interfere
    with the existing polling logic, hence drop the requirement.
    Be aware that running ib_process_cq_direct with a non-IB_POLL_DIRECT
    CQ may trigger concurrent CQ processing.

    This can be used for polling mode ULPs.

    Cc: Bart Van Assche
    Reported-by: Steve Wise
    Signed-off-by: Sagi Grimberg
    [maxg: added wcs array argument to __ib_process_cq]
    Signed-off-by: Max Gurtovoy
    Signed-off-by: Doug Ledford
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Sagi Grimberg
     
  • [ Upstream commit 44a5f423e70374e5b42cecd85e78f2d79334e0f2 ]

    When performing a read using FIFO mode, the spi controller shifts out
    the last 2 bytes that were written in a previous transfer on MOSI.

    This undocumented behaviour can cause devices to misinterpret the
    transfer, so we explicitly clear the WFIFO before each read.

    This behaviour was noticed on EspressoBin.

    Signed-off-by: Maxime Chevallier
    Signed-off-by: Mark Brown
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Maxime Chevallier
     
  • [ Upstream commit fb7d38a70e1d8ffd54f7a7464dcc4889d7e490ad ]

    On Meson8b the only valid input clock is MPLL2. The bootloader
    configures that to run at 500002394Hz which cannot be divided evenly
    down to 125MHz using the m250_div clock. Currently the common clock
    framework chooses a m250_div of 2 - with the internal fixed
    "divide by 10" this results in a RGMII TX clock of 125001197Hz (1197Hz
    above the requested 125MHz).

    Letting the common clock framework propagate the rate changes up to the
    parent of m250_mux allows us to get the best possible clock rate. With
    this patch the common clock framework calculates a rate of
    very-close-to-250MHz (249999701Hz to be exact) for the MPLL2 clock
    (which is the mux input). Dividing that by 2 (which is an internal,
    fixed divider for the RGMII TX clock) gives us an RGMII TX clock of
    124999850Hz (which is only 150Hz off the requested 125MHz, compared to
    1197Hz based on the MPLL2 rate set by u-boot and the Amlogic GPL kernel
    sources).

    SoCs from the Meson GX series are not affected by this change because
    the input clock is FCLK_DIV2 whose rate cannot be changed (which is fine
    since it's running at 1GHz, so it's already a multiple of 250MHz and
    125MHz).

    Fixes: 566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
    Suggested-by: Jerome Brunet
    Signed-off-by: Martin Blumenstingl
    Reviewed-by: Jerome Brunet
    Tested-by: Jerome Brunet
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Martin Blumenstingl
     
  • [ Upstream commit 433c6cab9d298687c097f6ee82e49157044dc7c6 ]

    Meson8b only supports MPLL2 as clock input. The rate of the MPLL2 clock
    set by Odroid-C1's u-boot is close to (but not exactly) 500MHz. The
    exact rate is 500002394Hz, which is calculated in
    drivers/clk/meson/clk-mpll.c using the following formula:
    DIV_ROUND_UP_ULL((u64)parent_rate * SDM_DEN, (SDM_DEN * n2) + sdm)
    Odroid-C1's u-boot configures MPLL2 with the following values:
    - SDM_DEN = 16384
    - SDM = 1638
    - N2 = 5

    The 250MHz clock (m250_div) inside dwmac-meson8b driver is derived from
    the MPLL2 clock. Due to MPLL2 running slightly faster than 500MHz the
    common clock framework chooses a divider which is too big to generate
    the 250MHz clock (a divider of 2 would be needed, but this is rounded up
    to a divider of 3). This breaks the RTL8211F RGMII PHY on Odroid-C1
    because it requires a (close to) 125MHz RGMII TX clock (on Gbit speeds,
    the IP block internally divides that down to 25MHz on 100Mbit/s
    connections and 2.5MHz on 10Mbit/s connections - we don't need any
    special configuration for that).

    Round the divider to the closest value to prevent this issue on Meson8b.
    This means we'll now end up with a clock rate for the RGMII TX clock of
    125001197Hz (= 125MHz plus 1197Hz), which is close-enough to 125MHz.
    This has no effect on the Meson GX SoCs since there fclk_div2 is used as
    input clock, which has a rate of 1000MHz (and thus is divisible cleanly
    to 250MHz and 125MHz).
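
    The effect of the two rounding modes on the 250MHz divider can be
    illustrated with a few lines of arithmetic. This is a simplified sketch
    and not the clock framework's actual code: div_round_up() and
    div_round_closest() are hypothetical helpers, and the resulting rate is
    computed with plain integer division.

```python
# Simplified illustration of divider selection for the 250 MHz clock.
# The helper names are hypothetical; the input rate is the MPLL2 rate
# quoted in the commit message.
MPLL2_HZ = 500_002_394
TARGET_HZ = 250_000_000


def div_round_up(numerator: int, denominator: int) -> int:
    """Smallest integer divider that keeps the output at or below target."""
    return (numerator + denominator - 1) // denominator


def div_round_closest(numerator: int, denominator: int) -> int:
    """Integer divider nearest to the exact ratio."""
    return (numerator + denominator // 2) // denominator


divider_up = div_round_up(MPLL2_HZ, TARGET_HZ)            # 3
divider_closest = div_round_closest(MPLL2_HZ, TARGET_HZ)  # 2

rate_up = MPLL2_HZ // divider_up            # roughly 166.67 MHz, far too slow
rate_closest = MPLL2_HZ // divider_closest  # 250001197, i.e. 250 MHz + 1197 Hz

print(divider_up, rate_up)
print(divider_closest, rate_closest)
```

    With a 1000MHz input, as on the Meson GX SoCs, both helpers return a
    divider of 4, which is why the change makes no difference there.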

    Fixes: 566e8251625304 ("net: stmmac: add a glue driver for the Amlogic Meson 8b / GXBB DWMAC")
    Reported-by: Emiliano Ingrassia
    Signed-off-by: Martin Blumenstingl
    Reviewed-by: Jerome Brunet
    Tested-by: Jerome Brunet
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Martin Blumenstingl
     
  • [ Upstream commit c877154d307f4a91e0b5b85b75535713dab945ae ]

    fs/ubifs/tnc.c: In function ‘search_dh_cookie’:
    fs/ubifs/tnc.c:1893: warning: ‘err’ is used uninitialized in this function

    Indeed, err is always used uninitialized.

    According to an original review comment from Hyunchul, acknowledged by
    Richard, err should be initialized to -ENOENT to avoid the first call to
    tnc_next(). But we can achieve the same by reordering the code.

    Fixes: 781f675e2d7e ("ubifs: Fix unlink code wrt. double hash lookups")
    Reported-by: Hyunchul Lee
    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Richard Weinberger
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Geert Uytterhoeven
     
  • [ Upstream commit 7df938fbc4ee641e70e05002ac67c24b19e86e74 ]

    We know this WARN_ON is harmless and in reality it may be triggered,
    so convert it to printk() and dump_stack() to avoid confusing
    people.

    Also add a comment about two related races here.

    Cc: Christian Borntraeger
    Cc: Stefan Haberland
    Cc: Christoph Hellwig
    Cc: Thomas Gleixner
    Cc: "jianchao.wang"
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Ming Lei
     
  • [ Upstream commit 050af08ffb1b62af69196d61c22a0755f9a3cdbd ]

    blk-mq will rerun the queue via RESTART or a dispatch wake after a
    request is completed, so it is not necessary to wait an arbitrary time
    before requeuing; we should trust blk-mq to do it.

    More importantly, we need to return BLK_STS_RESOURCE to blk-mq so that
    dequeuing from the I/O scheduler can be stopped; this results in
    improved I/O merging.

    Signed-off-by: Ming Lei
    Signed-off-by: Mike Snitzer
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Ming Lei
     
  • [ Upstream commit 9b28a1102efc75d81298198166ead87d643a29ce ]

    Fixes:
    1. The use of "exceeds" where its opposite, "falls below", was meant.
    2. Properly speaking, a table cannot exceed a threshold.

    It emphasizes the important point, which is that it is the userspace
    daemon's responsibility to check for low free space when a device
    is resumed, since it won't get a special event indicating low free
    space in that situation.

    Signed-off-by: mulhern
    Signed-off-by: Mike Snitzer
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    mulhern
     
  • [ Upstream commit 9d2e6505f6d6934e681aed502f566198cb25c74a ]

    After commit a1ddcbe93010 ("iommu/vt-d: Pass dmar_domain directly into
    iommu_flush_iotlb_psi", 2015-08-12), we have the domain pointer as a
    parameter to iommu_flush_iotlb_psi(), so there is no need to fetch it
    from the cache again.

    More importantly, a NULL pointer dereference bug is reported on RHEL7
    (and it can be reproduced on some old upstream kernels too, e.g., v4.13)
    by unplugging a 40G NIC from a VM (unplug is hard to test on a real
    host, but it should behave the same):

    https://bugzilla.redhat.com/show_bug.cgi?id=1531367

    [ 24.391863] pciehp 0000:00:03.0:pcie004: Slot(0): Attention button pressed
    [ 24.393442] pciehp 0000:00:03.0:pcie004: Slot(0): Powering off due to button press
    [ 29.721068] i40evf 0000:01:00.0: Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err OK
    [ 29.783557] iommu: Removing device 0000:01:00.0 from group 3
    [ 29.784662] BUG: unable to handle kernel NULL pointer dereference at 0000000000000304
    [ 29.785817] IP: iommu_flush_iotlb_psi+0xcf/0x120
    [ 29.786486] PGD 0
    [ 29.786487] P4D 0
    [ 29.786812]
    [ 29.787390] Oops: 0000 [#1] SMP
    [ 29.787876] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_ng
    [ 29.795371] CPU: 0 PID: 156 Comm: kworker/0:2 Not tainted 4.13.0 #14
    [ 29.796366] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-1.el7 04/01/2014
    [ 29.797593] Workqueue: pciehp-0 pciehp_power_thread
    [ 29.798328] task: ffff94f5745b4a00 task.stack: ffffb326805ac000
    [ 29.799178] RIP: 0010:iommu_flush_iotlb_psi+0xcf/0x120
    [ 29.799919] RSP: 0018:ffffb326805afbd0 EFLAGS: 00010086
    [ 29.800666] RAX: ffff94f5bc56e800 RBX: 0000000000000000 RCX: 0000000200000025
    [ 29.801667] RDX: ffff94f5bc56e000 RSI: 0000000000000082 RDI: 0000000000000000
    [ 29.802755] RBP: ffffb326805afbf8 R08: 0000000000000000 R09: ffff94f5bc86bbf0
    [ 29.803772] R10: ffffb326805afba8 R11: 00000000000ffdc4 R12: ffff94f5bc86a400
    [ 29.804789] R13: 0000000000000000 R14: 00000000ffdc4000 R15: 0000000000000000
    [ 29.805792] FS: 0000000000000000(0000) GS:ffff94f5bfc00000(0000) knlGS:0000000000000000
    [ 29.806923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 29.807736] CR2: 0000000000000304 CR3: 000000003499d000 CR4: 00000000000006f0
    [ 29.808747] Call Trace:
    [ 29.809156] flush_unmaps_timeout+0x126/0x1c0
    [ 29.809800] domain_exit+0xd6/0x100
    [ 29.810322] device_notifier+0x6b/0x70
    [ 29.810902] notifier_call_chain+0x4a/0x70
    [ 29.812822] __blocking_notifier_call_chain+0x47/0x60
    [ 29.814499] blocking_notifier_call_chain+0x16/0x20
    [ 29.816137] device_del+0x233/0x320
    [ 29.817588] pci_remove_bus_device+0x6f/0x110
    [ 29.819133] pci_stop_and_remove_bus_device+0x1a/0x20
    [ 29.820817] pciehp_unconfigure_device+0x7a/0x1d0
    [ 29.822434] pciehp_disable_slot+0x52/0xe0
    [ 29.823931] pciehp_power_thread+0x8a/0xa0
    [ 29.825411] process_one_work+0x18c/0x3a0
    [ 29.826875] worker_thread+0x4e/0x3b0
    [ 29.828263] kthread+0x109/0x140
    [ 29.829564] ? process_one_work+0x3a0/0x3a0
    [ 29.831081] ? kthread_park+0x60/0x60
    [ 29.832464] ret_from_fork+0x25/0x30
    [ 29.833794] Code: 85 ed 74 0b 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 8b 54 24 60 44 89 f8 0f b6 c4 48 8b 04 c2 48 85 c0 74 49 45 0f b6 ff 4a 8b 3c f8 bf
    [ 29.838514] RIP: iommu_flush_iotlb_psi+0xcf/0x120 RSP: ffffb326805afbd0
    [ 29.840362] CR2: 0000000000000304
    [ 29.841716] ---[ end trace b10ec0d6900868d3 ]---

    This patch fixes that problem if applied to a v4.13 kernel.

    The bug does not exist on the latest upstream kernel since it was fixed
    as a side effect of commit 13cf01744608 ("iommu/vt-d: Make use of iova
    deferred flushing", 2017-08-15). But IMHO it's still good to have this
    patch upstream.

    CC: Alex Williamson
    Signed-off-by: Peter Xu
    Fixes: a1ddcbe93010 ("iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi")
    Reviewed-by: Alex Williamson
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Peter Xu
     
  • [ Upstream commit 4552d128c26e0f0f27a5bd2fadc24092b8f6c1d7 ]

    The die() oops path contains a serializing lock to prevent oops
    messages from being interleaved. In the case of a system reset
    initiated oops (e.g., qemu nmi command), __die was being called
    which lacks that synchronisation and oops reports could be
    interleaved across CPUs.

    A recent patch, 4388c9b3a6ee7 ("powerpc: Do not send system reset
    request through the oops path"), changed this to __die to avoid
    the debugger() call, but there is no real harm in calling it twice
    if the first time fell through. So go back to using die() here.
    This was observed to fix the problem.

    Fixes: 4388c9b3a6ee7 ("powerpc: Do not send system reset request through the oops path")
    Signed-off-by: Nicholas Piggin
    Reviewed-by: David Gibson
    Signed-off-by: Michael Ellerman
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Nicholas Piggin
     
  • [ Upstream commit dc98b8480d8a68c2ce9aa28b9f0d714fd258bc0b ]

    Removing the early device registration hook overlooked the fact that
    it only ran conditionally on a compatible device being present in the
    DT. With exynos_iommu_init() now running as an unconditional initcall,
    problems arise on non-Exynos systems when other IOMMU drivers find
    themselves unable to install their ops on the platform bus, or at worst
    the Exynos ops get called with someone else's domain and all hell breaks
    loose.

    The global ops/cache setup could probably all now be triggered from the
    first IOMMU probe, as with dma_dev assignment, but for the time being the
    simplest fix is to resurrect the logic from commit a7b67cd5d9af
    ("iommu/exynos: Play nice in multi-platform builds") to explicitly check
    the DT for the presence of an Exynos IOMMU before trying anything.

    Fixes: 928055a01b3f ("iommu/exynos: Remove custom platform device registration code")
    Signed-off-by: Robin Murphy
    Acked-by: Marek Szyprowski
    Signed-off-by: Joerg Roedel
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Robin Murphy