05 Jan, 2020

40 commits

  • [ Upstream commit 3c1c24d91ffd536de0a64688a9df7f49e58fadbc ]

    A while ago Andy noticed
    (http://lkml.kernel.org/r/CALCETrWY+5ynDct7eU_nDUqx=okQvjm=Y5wJvA4ahBja=CQXGw@mail.gmail.com)
    that UFFD_FEATURE_EVENT_FORK used by an unprivileged user may have
    security implications.

    As the first step of the solution the following patch limits the availability
    of UFFD_FEATURE_EVENT_FORK to callers that have CAP_SYS_PTRACE.

    The usage of CAP_SYS_PTRACE ensures compatibility with CRIU.

    Yet, if there are other users of non-cooperative userfaultfd that run
    without CAP_SYS_PTRACE, they would be broken :(

    Current implementation of UFFD_FEATURE_EVENT_FORK modifies the file
    descriptor table from the read() implementation of uffd, which may have
    security implications for unprivileged use of the userfaultfd.

    Limit availability of UFFD_FEATURE_EVENT_FORK only for callers that have
    CAP_SYS_PTRACE.

    Link: http://lkml.kernel.org/r/1572967777-8812-2-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrea Arcangeli
    Cc: Daniel Colascione
    Cc: Jann Horn
    Cc: Lokesh Gidra
    Cc: Nick Kralevich
    Cc: Nosh Minwalla
    Cc: Pavel Emelyanov
    Cc: Tim Murray
    Cc: Aleksa Sarai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Mike Rapoport
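
The capability check described above can be modelled in user space. This is a minimal sketch, not the kernel code: the function name, the `SKETCH_` flag value and the `has_cap_sys_ptrace` parameter (a stand-in for the kernel's `capable(CAP_SYS_PTRACE)`) are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag value; the real definition lives in <linux/userfaultfd.h>. */
#define SKETCH_UFFD_FEATURE_EVENT_FORK (UINT64_C(1) << 1)

/*
 * Hypothetical model of the feature check performed when a caller
 * negotiates userfaultfd features. has_cap_sys_ptrace stands in for
 * the kernel's capable(CAP_SYS_PTRACE).
 * Returns 0 if the requested features are allowed, -EPERM otherwise.
 */
static int sketch_check_uffd_features(uint64_t requested, bool has_cap_sys_ptrace)
{
    if ((requested & SKETCH_UFFD_FEATURE_EVENT_FORK) && !has_cap_sys_ptrace)
        return -EPERM;
    return 0;
}
```

In this model, only a request that includes the fork-event feature is affected; callers that do not ask for it behave as before regardless of privilege.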
     
  • [ Upstream commit 204cb79ad42f015312a5bbd7012d09c93d9b46fb ]

    Currently, the drop_caches proc file and sysctl read back the last value
    written, suggesting this is somehow a stateful setting instead of a
    one-time command. Make it write-only, like e.g. compact_memory.

    While mitigating a VM problem at scale in our fleet, there was confusion
    about whether writing to this file will permanently switch the kernel into
    a non-caching mode. This influences the decision making in a tense
    situation, where tens of people are trying to fix tens of thousands of
    affected machines: Do we need a rollback strategy? What are the
    performance implications of operating in a non-caching state for several
    days? It also caused confusion when the kernel team said we may need to
    write the file several times to make sure it's effective ("But it already
    reads back 3?").

    Link: http://lkml.kernel.org/r/20191031221602.9375-1-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Acked-by: Chris Down
    Acked-by: Vlastimil Babka
    Acked-by: David Hildenbrand
    Acked-by: Michal Hocko
    Acked-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Johannes Weiner
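
The difference between a stateful setting and a one-time command can be sketched as follows. This is only an illustrative model: the struct, the function names and the permission constants are hypothetical and do not mirror the sysctl implementation.

```c
#include <errno.h>

#define SKETCH_MODE_READ  0400u
#define SKETCH_MODE_WRITE 0200u

/* Hypothetical command-style control file. */
struct sketch_cmd_file {
    unsigned int mode;  /* permission bits; 0200 means write-only */
    unsigned int runs;  /* how many times the command has been executed */
};

/* A write triggers the action once; no value is latched for later reads. */
static int sketch_cmd_write(struct sketch_cmd_file *f)
{
    if (!(f->mode & SKETCH_MODE_WRITE))
        return -EACCES;
    f->runs++;
    return 0;
}

/* A read is refused unless the file actually exposes readable state. */
static int sketch_cmd_read(const struct sketch_cmd_file *f)
{
    return (f->mode & SKETCH_MODE_READ) ? 0 : -EACCES;
}
```

With the mode set to 0200, every write runs the command again and there is nothing to read back, so the file can no longer be mistaken for a persistent mode switch.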
     
  • [ Upstream commit 8fc312b32b25c6b0a8b46fab4df8c68df5af1223 ]

    It is assumed that the hugetlbfs_vfsmount[] array will contain either a
    valid vfsmount pointer or NULL for each hstate after initialization.
    Changes made while converting to use fs_context broke this assumption.

    While fixing the hugetlbfs_vfsmount issue, it was discovered that
    init_hugetlbfs_fs never did correctly clean up when encountering a vfs
    mount error.

    It was found during code inspection. A small memory allocation failure
    would be the most likely cause of taking an error path with the bug.
    This is unlikely to happen as this is early init code.

    Link: http://lkml.kernel.org/r/94b6244d-2c24-e269-b12c-e3ba694b242d@oracle.com
    Reported-by: Chengguang Xu
    Fixes: 32021982a324 ("hugetlbfs: Convert to fs_context")
    Signed-off-by: Mike Kravetz
    Cc: David Howells
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Mike Kravetz
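
The invariant "every slot is either a valid pointer or NULL" can be illustrated with a generic unwind-on-failure loop. This is a simplified sketch under assumptions: it treats every mount failure as fatal, which may differ from what the kernel does, and all names (including the fake mount helpers used to exercise it) are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NR_SLOTS 4

static int sketch_fail_at = -1;  /* index at which the fake mount fails */
static int sketch_live;          /* number of handles currently held */

/* Fake mount: returns a handle, or NULL at the configured failing index. */
static void *sketch_fake_mount(int idx)
{
    void *p;

    if (idx == sketch_fail_at)
        return NULL;
    p = malloc(1);
    if (p)
        sketch_live++;
    return p;
}

static void sketch_fake_unmount(void *handle)
{
    free(handle);
    sketch_live--;
}

/*
 * Fill slots[] so that each entry is a valid handle or NULL.
 * On failure, release everything acquired so far and reset the entries.
 * Returns 0 on success, -1 on failure.
 */
static int sketch_init_slots(void *slots[SKETCH_NR_SLOTS],
                             void *(*mount)(int), void (*unmount)(void *))
{
    int i;

    for (i = 0; i < SKETCH_NR_SLOTS; i++)
        slots[i] = NULL;

    for (i = 0; i < SKETCH_NR_SLOTS; i++) {
        slots[i] = mount(i);
        if (!slots[i])
            goto unwind;
    }
    return 0;

unwind:
    while (--i >= 0) {
        unmount(slots[i]);
        slots[i] = NULL;
    }
    return -1;
}
```

After a failed call no handle remains allocated and no slot holds a stale pointer, which is the property later users of the array rely on.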
     
  • [ Upstream commit 746dd4012d215b53152f0001a48856e41ea31730 ]

    When running test_vmalloc.sh smoke the following print out states that
    the fragment is missing.

    # ./test_vmalloc.sh: You must have the following enabled in your kernel:
    # CONFIG_TEST_VMALLOC=m

    Rework to add the fragment 'CONFIG_TEST_VMALLOC=m' to the config file.

    Link: http://lkml.kernel.org/r/20190916095217.19665-1-anders.roxell@linaro.org
    Fixes: a05ef00c9790 ("selftests/vm: add script helper for CONFIG_TEST_VMALLOC_MODULE")
    Signed-off-by: Anders Roxell
    Cc: Shuah Khan
    Cc: "Uladzislau Rezki (Sony)"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Anders Roxell
     
  • [ Upstream commit 7f28dad395243c5026d649136823bbc40029a828 ]

    Make sure preemption is disabled when temporarily switching to the nodat
    stack with the CALL_ON_STACK helper, because the nodat stack is per-CPU.

    Reviewed-by: Heiko Carstens
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Vasily Gorbik
     
  • [ Upstream commit bf159d151a0b844be28882f39e316b5800acaa2b ]

    The Tx doorbell is handled by txdb_tasklet and doesn't
    have an associated IRQ.

    However, imx_mu_shutdown ignores this and tries to
    free an IRQ that wasn't requested for the Tx DB, resulting
    in the following warning:

    [ 1.967644] Trying to free already-free IRQ 26
    [ 1.972108] WARNING: CPU: 2 PID: 157 at kernel/irq/manage.c:1708 __free_irq+0xc0/0x358
    [ 1.980024] Modules linked in:
    [ 1.983088] CPU: 2 PID: 157 Comm: kworker/2:1 Tainted: G
    [ 1.993524] Hardware name: Freescale i.MX8QXP MEK (DT)
    [ 1.998668] Workqueue: events deferred_probe_work_func
    [ 2.003812] pstate: 60000085 (nZCv daIf -PAN -UAO)
    [ 2.008607] pc : __free_irq+0xc0/0x358
    [ 2.012364] lr : __free_irq+0xc0/0x358
    [ 2.016111] sp : ffff00001179b7e0
    [ 2.019422] x29: ffff00001179b7e0 x28: 0000000000000018
    [ 2.024736] x27: ffff000011233000 x26: 0000000000000004
    [ 2.030053] x25: 000000000000001a x24: ffff80083bec74d4
    [ 2.035369] x23: 0000000000000000 x22: ffff80083bec7588
    [ 2.040686] x21: ffff80083b1fe8d8 x20: ffff80083bec7400
    [ 2.046003] x19: 0000000000000000 x18: ffffffffffffffff
    [ 2.051320] x17: 0000000000000000 x16: 0000000000000000
    [ 2.056637] x15: ffff0000111296c8 x14: ffff00009179b517
    [ 2.061953] x13: ffff00001179b525 x12: ffff000011142000
    [ 2.067270] x11: ffff000011129f20 x10: ffff0000105da970
    [ 2.072587] x9 : 00000000ffffffd0 x8 : 0000000000000194
    [ 2.077903] x7 : 612065657266206f x6 : ffff0000111e7b09
    [ 2.083220] x5 : 0000000000000003 x4 : 0000000000000000
    [ 2.088537] x3 : 0000000000000000 x2 : 00000000ffffffff
    [ 2.093854] x1 : 28b70f0a2b60a500 x0 : 0000000000000000
    [ 2.099173] Call trace:
    [ 2.101618] __free_irq+0xc0/0x358
    [ 2.105021] free_irq+0x38/0x98
    [ 2.108170] imx_mu_shutdown+0x90/0xb0
    [ 2.111921] mbox_free_channel.part.2+0x24/0xb8
    [ 2.116453] mbox_free_channel+0x18/0x28

    This bug has been present from the very beginning.

    Cc: Oleksij Rempel
    Signed-off-by: Daniel Baluta
    Signed-off-by: Richard Zhu
    Reviewed-by: Dong Aisheng
    Signed-off-by: Jassi Brar
    Signed-off-by: Sasha Levin

    Daniel Baluta
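
The shape of the fix, freeing an IRQ only for channel types that requested one, can be sketched in plain C. The enum, struct and counters below are hypothetical stand-ins and not the driver's actual types.

```c
#include <stdbool.h>

enum sketch_chan_type { SKETCH_TX, SKETCH_RX, SKETCH_TXDB, SKETCH_RXDB };

struct sketch_chan {
    enum sketch_chan_type type;
    bool irq_requested;
};

static int sketch_irq_frees;  /* counts calls to the stand-in for free_irq() */

static void sketch_free_irq(struct sketch_chan *c)
{
    c->irq_requested = false;
    sketch_irq_frees++;
}

static void sketch_startup(struct sketch_chan *c)
{
    /* The Tx doorbell is served by a tasklet, so no IRQ is requested for it. */
    c->irq_requested = (c->type != SKETCH_TXDB);
}

static void sketch_shutdown(struct sketch_chan *c)
{
    if (c->type == SKETCH_TXDB)
        return;  /* nothing was requested, so there is nothing to free */
    sketch_free_irq(c);
}
```

Keeping startup and shutdown symmetric per channel type is what avoids the "already-free IRQ" warning shown in the trace.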
     
  • [ Upstream commit 188c523e1c271d537f3c9f55b6b65bf4476de32f ]

    Fix a static code checker warning:
    fs/ocfs2/acl.c:331
    ocfs2_acl_chmod() warn: passing zero to 'PTR_ERR'

    Link: http://lkml.kernel.org/r/1dee278b-6c96-eec2-ce76-fe6e07c6e20f@linux.alibaba.com
    Fixes: 5ee0fbd50fd ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")
    Signed-off-by: Ding Xiang
    Reviewed-by: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Changwei Ge
    Cc: Gang He
    Cc: Jun Piao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Ding Xiang
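
The checker warning concerns calling PTR_ERR() on a pointer that may be NULL, which yields 0 and reads as success. A user-space model of the error-pointer helpers shows the distinction; the `sketch_` names below are illustrative re-implementations, and the commit text does not say which helper the fix uses.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Decode the errno again; only meaningful if sketch_is_err() is true. */
static inline long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Error pointers occupy the top SKETCH_MAX_ERRNO addresses. */
static inline bool sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Explicitly map "not an error" (including NULL) to 0. */
static inline long sketch_ptr_err_or_zero(const void *p)
{
    return sketch_is_err(p) ? sketch_ptr_err(p) : 0;
}
```

Making the NULL case explicit documents that returning 0 is intended rather than an accident of decoding a NULL pointer.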
     
  • [ Upstream commit 247f265fa502e7b17a0cb0cc330e055a36aafce4 ]

    Each SDBT is located at a 4KB page and contains 512 entries.
    Each entry of an SDBT points to an SDB, a 4KB page containing
    sampled data. The last entry is a link to another SDBT page.

    When an event is created the function sequence executed is:

    __hw_perf_event_init()
    +--> allocate_buffers()
    +--> realloc_sampling_buffers()
    +---> alloc_sample_data_block()

    Both functions realloc_sampling_buffers() and
    alloc_sample_data_block() allocate pages and the allocation
    can fail. This is handled correctly and all allocated
    pages are freed and error -ENOMEM is returned to the
    top calling function. Finally the event is not created.

    Once the event has been created, the number of initially
    allocated SDBTs and SDBs can be too low. This is detected
    during measurement interrupt handling, where the number
    of lost samples is calculated. If the number of lost samples
    is too high considering sampling frequency and already allocated
    SDBs, the number of SDBs is enlarged during the next execution
    of cpumsf_pmu_enable().

    If more SDBs need to be allocated, functions

    realloc_sampling_buffers()
    +---> alloc_sample_data_block()

    are called to allocate more pages. Page allocation may fail
    and the returned error is ignored. An SDBT and SDB setup
    already exists.

    However, the modified SDBTs and SDBs might end up in a situation
    where the first entry of an SDBT does not point to an SDB
    but to another SDBT, basically an SDBT without payload.
    This cannot be handled by the interrupt handler, where an SDBT
    must have at least one entry pointing to an SDB.

    Add a check to avoid SDBTs without payload (SDBs) when enlarging
    the buffer setup.

    Signed-off-by: Thomas Richter
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Thomas Richter
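
The "SDBT without payload" condition can be expressed as a small predicate. This sketch assumes 8-byte table entries (4 KB / 512) and that link entries are marked by a low tag bit; the tag bit and all names are assumptions for illustration, not taken from the commit.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE        4096u
#define SKETCH_ENTRIES_PER_SDBT (SKETCH_PAGE_SIZE / sizeof(uint64_t))  /* 512 */
#define SKETCH_LINK_BIT         UINT64_C(1)  /* assumed marker for link entries */

/* True if the entry links to another table instead of a data block. */
static bool sketch_is_link_entry(uint64_t entry)
{
    return (entry & SKETCH_LINK_BIT) != 0;
}

/*
 * A table carries payload only if its first entry is populated and
 * points to a data block rather than to the next table.
 */
static bool sketch_sdbt_has_payload(const uint64_t *sdbt)
{
    return sdbt[0] != 0 && !sketch_is_link_entry(sdbt[0]);
}
```

A check of this kind, applied while enlarging the buffer, would reject a table whose first entry is already a link.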
     
  • [ Upstream commit bf018ee644897d7982e1b8dd8b15e97db6e1a4da ]

    Currently the unwinder unconditionally returns %r14 from the first frame
    pointed to by %r15 from pt_regs. A task could be interrupted when a function
    has already allocated this frame (if it needs it) for its callees or to
    store local variables. In that case this frame would contain random
    values from the stack or values stored there by a callee. As we are only
    interested in %r14 to get a potential return address, skip bogus return
    addresses which don't belong to kernel text.

    This helps to avoid duplicating filtering logic in unwinder users, most
    of which use unwind_get_return_address() and would otherwise choke on
    the bogus 0 address returned by it.

    Reviewed-by: Heiko Carstens
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Vasily Gorbik
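
Filtering candidate return addresses against the text range can be sketched as below. This is a simplification: it filters every candidate rather than only the first frame, and the range struct and function names are hypothetical (the kernel has its own text-address helpers).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_text_range {
    uintptr_t start;  /* first address of text */
    uintptr_t end;    /* one past the last address of text */
};

static bool sketch_is_text_address(const struct sketch_text_range *text,
                                   uintptr_t addr)
{
    return addr >= text->start && addr < text->end;
}

/*
 * Walk candidate return addresses (innermost frame first) and keep only
 * those that lie inside text. Returns the number written to out[].
 */
static size_t sketch_filter_return_addresses(const struct sketch_text_range *text,
                                             const uintptr_t *candidates,
                                             size_t n, uintptr_t *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
        if (sketch_is_text_address(text, candidates[i]))
            out[kept++] = candidates[i];
    return kept;
}
```

Doing the filtering once inside the unwinder means its users never see a 0 or otherwise bogus address.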
     
  • [ Upstream commit a8de1304b7df30e3a14f2a8b9709bb4ff31a0385 ]

    The DTC v1.5.1 added references to (U)INT32_MAX.

    This is no problem for user-space programs since <stdint.h> defines
    (U)INT32_MAX along with (u)int32_t.

    For the kernel space, libfdt_env.h needs to be adjusted before we
    pull in the changes.

    In the kernel, we usually use s/u32 instead of (u)int32_t for the
    fixed-width types.

    Accordingly, we already have S/U32_MAX for their max values.
    So, we should not add (U)INT32_MAX to <linux/limits.h> any more.

    Instead, add them to the in-kernel libfdt_env.h to compile the
    latest libfdt.

    Signed-off-by: Masahiro Yamada
    Signed-off-by: Rob Herring
    Signed-off-by: Sasha Levin

    Masahiro Yamada
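
The approach of mapping the standard names onto existing kernel-style limits in a private environment header might look roughly like this. The `SKETCH_` macros stand in for the kernel's S32_MAX and U32_MAX, and the sketch assumes a 32-bit `int`; it is not the content of the real header.

```c
/* Stand-ins for kernel-style limits; assumes int is 32 bits wide. */
#define SKETCH_U32_MAX (~0U)
#define SKETCH_S32_MAX ((int)(SKETCH_U32_MAX >> 1))

/*
 * Provide the names the library code expects, but only in this private
 * environment header and only if nothing else has defined them already.
 */
#ifndef INT32_MAX
#define INT32_MAX SKETCH_S32_MAX
#endif

#ifndef UINT32_MAX
#define UINT32_MAX SKETCH_U32_MAX
#endif
```

Scoping the definitions to the library's environment header keeps the standard names out of the general kernel headers while still letting the imported code compile unchanged.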
     
  • [ Upstream commit 5f0af07e89199ac51cdd4f25bc303bdc703f4e9c ]

    Make sure to only clear enabled interrupts, taking the
    connection type into account.

    Suggested-by: Oleksij Rempel
    Signed-off-by: Daniel Baluta
    Signed-off-by: Richard Zhu
    Reviewed-by: Dong Aisheng
    Signed-off-by: Jassi Brar
    Signed-off-by: Sasha Levin

    Daniel Baluta
     
  • [ Upstream commit 6733775a92eacd612ac88afa0fd922e4ffeb2bc7 ]

    This patch introduces support for a new architectured reply
    code 0x8B indicating that a hypervisor layer (if any) has
    rejected an ap message.

    Linux may run as a guest on top of a hypervisor like z/VM
    or KVM. The crypto hardware seen by the ap bus may therefore
    be restricted by the hypervisor; for example, only a subset
    such as clear key crypto requests may be supported. Other
    requests will be filtered out, that is, rejected by the hypervisor.
    The new reply code 0x8B will appear in such cases and needs
    to get recognized by the ap bus and zcrypt device driver zoo.

    Signed-off-by: Harald Freudenberger
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Harald Freudenberger
     
  • [ Upstream commit 5b596e0ff0e1852197d4c82d3314db5e43126bf7 ]

    To avoid breaking the build on arches where this is not wired up, at
    least all the other features should be made available and when using
    this specific routine, the "unknown" should point the user/developer to
    the need to wire this up on this particular hardware architecture.

    Detected in a container mipsel debian cross build environment, where it
    shows up as:

    In file included from /usr/mipsel-linux-gnu/include/stdio.h:867,
    from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
    from util/session.c:13:
    In function 'printf',
    inlined from 'regs_dump__printf' at util/session.c:1103:3,
    inlined from 'regs__printf' at util/session.c:1131:2:
    /usr/mipsel-linux-gnu/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
    107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
    | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    cross compiler details:

    mipsel-linux-gnu-gcc (Debian 9.2.1-8) 9.2.1 20190909

    Also on mips64:

    In file included from /usr/mips64-linux-gnuabi64/include/stdio.h:867,
    from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
    from util/session.c:13:
    In function 'printf',
    inlined from 'regs_dump__printf' at util/session.c:1103:3,
    inlined from 'regs__printf' at util/session.c:1131:2,
    inlined from 'regs_user__printf' at util/session.c:1139:3,
    inlined from 'dump_sample' at util/session.c:1246:3,
    inlined from 'machines__deliver_event' at util/session.c:1421:3:
    /usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
    107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
    | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    In function 'printf',
    inlined from 'regs_dump__printf' at util/session.c:1103:3,
    inlined from 'regs__printf' at util/session.c:1131:2,
    inlined from 'regs_intr__printf' at util/session.c:1147:3,
    inlined from 'dump_sample' at util/session.c:1249:3,
    inlined from 'machines__deliver_event' at util/session.c:1421:3:
    /usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
    107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
    | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    cross compiler details:

    mips64-linux-gnuabi64-gcc (Debian 9.2.1-8) 9.2.1 20190909

    Fixes: 2bcd355b71da ("perf tools: Add interface to arch registers sets")
    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-95wjyv4o65nuaeweq31t7l1s@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin

    Arnaldo Carvalho de Melo
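
The "unknown" fallback described above can be sketched as a wrapper around an architecture hook that may return NULL. The hook, the register table and the output format below are hypothetical and simplified for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical arch hook: NULL means the register name is not wired up. */
static const char *sketch_arch_reg_name(int id)
{
    static const char *const names[] = { "AX", "BX", "CX" };

    if (id < 0 || id >= (int)(sizeof(names) / sizeof(names[0])))
        return NULL;
    return names[id];
}

/* Never hand NULL to a "%s" directive; fall back to a visible marker. */
static const char *sketch_reg_name(int id)
{
    const char *name = sketch_arch_reg_name(id);

    return name ? name : "unknown";
}

/* Format one register line into buf; returns what snprintf() returns. */
static int sketch_format_reg(char *buf, size_t len, int id,
                             unsigned long long val)
{
    return snprintf(buf, len, "%-5s 0x%llx", sketch_reg_name(id), val);
}
```

Because the wrapper can never return NULL, the compiler's format-overflow check has nothing to complain about, and the "unknown" label points developers at the missing per-architecture support.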
     
  • [ Upstream commit 0cd032d3b5fcebf5454315400ab310746a81ca53 ]

    brstackinsn must be allowed to be set by the user when AUX area data has
    been captured because, in that case, the branch stack might be
    synthesized on the fly. This fixes the following error:

    Before:

    $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
    [ perf record: Woken up 19 times to write data ]
    [ perf record: Captured and wrote 2.274 MB perf.data ]
    $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
    Display of branch stack assembler requested, but non all-branch filter set
    Hint: run 'perf record -b ...'

    After:

    $ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
    [ perf record: Woken up 19 times to write data ]
    [ perf record: Captured and wrote 2.274 MB perf.data ]
    $ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
    grep 13759 [002] 8091.310257: 1862 instructions:uH: 5641d58069eb bmexec+0x86b (/bin/grep)
    bmexec+2485:
    00005641d5806b35 jnz 0x5641d5806bd0 # MISPRED
    00005641d5806bd0 movzxb (%r13,%rdx,1), %eax
    00005641d5806bd6 add %rdi, %rax
    00005641d5806bd9 movzxb -0x1(%rax), %edx
    00005641d5806bdd cmp %rax, %r14
    00005641d5806be0 jnb 0x5641d58069c0 # MISPRED
    mismatch of LBR data and executable
    00005641d58069c0 movzxb (%r13,%rdx,1), %edi

    Fixes: 48d02a1d5c13 ("perf script: Add 'brstackinsn' for branch stacks")
    Reported-by: Andi Kleen
    Signed-off-by: Adrian Hunter
    Cc: Jiri Olsa
    Link: http://lore.kernel.org/lkml/20191127095322.15417-1-adrian.hunter@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin

    Adrian Hunter
     
  • [ Upstream commit 98e93245113d0f5c279ef77f4a9e7d097323ad71 ]

    To fix these build errors on a debian mipsel cross build environment:

    builtin-diff.c: In function 'block_cycles_diff_cmp':
    builtin-diff.c:550:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
    550 | l = labs(left->diff.cycles);
    | ^~~~
    builtin-diff.c:551:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
    551 | r = labs(right->diff.cycles);
    | ^~~~

    Fixes: 99150a1faab2 ("perf diff: Use hists to manage basic blocks per symbol")
    Cc: Jin Yao
    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-pn7szy5uw384ntjgk6zckh6a@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Sasha Levin

    Arnaldo Carvalho de Melo
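
The truncation risk arises because `labs()` takes a `long`, which is 32 bits wide on targets such as mipsel, while the value is a 64-bit quantity. A minimal sketch using `llabs()` instead; the helper and comparison function names are hypothetical and only illustrate the idea.

```c
#include <stdint.h>
#include <stdlib.h>

/* Absolute value of a signed 64-bit cycle delta, independent of sizeof(long). */
static int64_t sketch_abs_cycles(int64_t cycles)
{
    return (int64_t)llabs((long long)cycles);
}

/*
 * Order two deltas by magnitude, larger first: negative if left sorts
 * before right, positive if after, 0 if their magnitudes are equal.
 */
static int sketch_cmp_cycles_desc(int64_t left, int64_t right)
{
    int64_t l = sketch_abs_cycles(left);
    int64_t r = sketch_abs_cycles(right);

    return (l < r) - (l > r);
}
```

Returning only the sign of the comparison also avoids narrowing a 64-bit difference into an `int` result.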
     
  • [ Upstream commit 32546a9586aa4565035bb557e191648e022b29e8 ]

    This patch moves the final part of the cifsFileInfo_put() logic where we
    need a write lock on lock_sem to be processed in a separate thread that
    holds no other locks.
    This is to prevent deadlocks like the one below:

    > there are 6 processes looping to while trying to down_write
    > cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
    > cifs_new_fileinfo
    >
    > and there are 5 other processes which are blocked, several of them
    > waiting on either PG_writeback or PG_locked (which are both set), all
    > for the same page of the file
    >
    > 2 inode_lock() (inode->i_rwsem) for the file
    > 1 wait_on_page_writeback() for the page
    > 1 down_read(inode->i_rwsem) for the inode of the directory
    > 1 inode_lock()(inode->i_rwsem) for the inode of the directory
    > 1 __lock_page
    >
    >
    > so processes are blocked waiting on:
    > page flags PG_locked and PG_writeback for one specific page
    > inode->i_rwsem for the directory
    > inode->i_rwsem for the file
    > cifsInodeInfo->lock_sem
    >
    >
    >
    > here are the more gory details (let me know if I need to provide
    > anything more/better):
    >
    > [0 00:48:22.765] [UN] PID: 8863 TASK: ffff8c691547c5c0 CPU: 3
    > COMMAND: "reopen_file"
    > #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
    > #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
    > #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
    > #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
    > #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
    > #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
    > #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
    > #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
    > * (I think legitimize_path is bogus)
    >
    > in path_openat
    > } else {
    > const char *s = path_init(nd, flags);
    > while (!(error = link_path_walk(s, nd)) &&
    > (error = do_last(nd, file, op)) > 0) { <<<<
    >
    > do_last:
    > if (open_flag & O_CREAT)
    > inode_lock(dir->d_inode); <<<<
    > else
    > so it's trying to take inode->i_rwsem for the directory
    >
    > DENTRY INODE SUPERBLK TYPE PATH
    > ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR /mnt/vm1_smb/
    > inode.i_rwsem is ffff8c691158efc0
    >
    > :
    > owner: (UN - 8856 -
    > reopen_file), counter: 0x0000000000000003
    > waitlist: 2
    > 0xffff9965007e3c90 8863 reopen_file UN 0 1:29:22.926
    > RWSEM_WAITING_FOR_WRITE
    > 0xffff996500393e00 9802 ls UN 0 1:17:26.700
    > RWSEM_WAITING_FOR_READ
    >
    >
    > the owner of the inode.i_rwsem of the directory is:
    >
    > [0 00:00:00.109] [UN] PID: 8856 TASK: ffff8c6914275d00 CPU: 3
    > COMMAND: "reopen_file"
    > #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
    > #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
    > #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
    > #3 [ffff99650065b940] msleep at ffffffff9af573a9
    > #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
    > #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
    > #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
    > #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
    > #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
    > #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
    > #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
    > #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
    > #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
    > #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
    > #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
    >
    > cifs_launder_page is for page 0xffffd1e2c07d2480
    >
    > crash> page.index,mapping,flags 0xffffd1e2c07d2480
    > index = 0x8
    > mapping = 0xffff8c68f3cd0db0
    > flags = 0xfffffc0008095
    >
    > PAGE-FLAG BIT VALUE
    > PG_locked 0 0000001
    > PG_uptodate 2 0000004
    > PG_lru 4 0000010
    > PG_waiters 7 0000080
    > PG_writeback 15 0008000
    >
    >
    > inode is ffff8c68f3cd0c40
    > inode.i_rwsem is ffff8c68f3cd0ce0
    > DENTRY INODE SUPERBLK TYPE PATH
    > ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
    > /mnt/vm1_smb/testfile.8853
    >
    >
    > this process holds the inode->i_rwsem for the parent directory, is
    > laundering a page attached to the inode of the file it's opening, and in
    > _cifsFileInfo_put is trying to down_write the cifsInodeInfo->lock_sem
    > for the file itself.
    >
    >
    > :
    > owner: (UN - 8854 -
    > reopen_file), counter: 0x0000000000000003
    > waitlist: 1
    > 0xffff9965005dfd80 8855 reopen_file UN 0 1:29:22.912
    > RWSEM_WAITING_FOR_WRITE
    >
    > this is the inode.i_rwsem for the file
    >
    > the owner:
    >
    > [0 00:48:22.739] [UN] PID: 8854 TASK: ffff8c6914272e80 CPU: 2
    > COMMAND: "reopen_file"
    > #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
    > #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
    > #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
    > #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
    > #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
    > #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
    > #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
    > #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
    > #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
    > #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
    > #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
    > #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
    > #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
    > #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
    >
    > the process holds the inode->i_rwsem for the file to which it's writing,
    > and is trying to __lock_page for the same page as in the other processes
    >
    >
    > the other tasks:
    > [0 00:00:00.028] [UN] PID: 8859 TASK: ffff8c6915479740 CPU: 2
    > COMMAND: "reopen_file"
    > #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
    > #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
    > #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
    > #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
    > #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
    > #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
    > #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
    > #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
    > #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
    > #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
    > #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
    >
    > this is opening the file, and is trying to down_write cinode->lock_sem
    >
    >
    > [0 00:00:00.041] [UN] PID: 8860 TASK: ffff8c691547ae80 CPU: 2
    > COMMAND: "reopen_file"
    > [0 00:00:00.057] [UN] PID: 8861 TASK: ffff8c6915478000 CPU: 3
    > COMMAND: "reopen_file"
    > [0 00:00:00.059] [UN] PID: 8858 TASK: ffff8c6914271740 CPU: 2
    > COMMAND: "reopen_file"
    > [0 00:00:00.109] [UN] PID: 8862 TASK: ffff8c691547dd00 CPU: 6
    > COMMAND: "reopen_file"
    > #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
    > #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
    > #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
    > #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
    > #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
    > #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
    > #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
    > #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
    > #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
    > #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
    >
    > closing the file, and trying to down_write cifsi->lock_sem
    >
    >
    > [0 00:48:22.839] [UN] PID: 8857 TASK: ffff8c6914270000 CPU: 7
    > COMMAND: "reopen_file"
    > #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
    > #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
    > #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
    > #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
    > #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
    > #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
    > #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
    > #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
    > #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
    > #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
    >
    > in __filemap_fdatawait_range
    > wait_on_page_writeback(page);
    > for the same page of the file
    >
    >
    >
    > [0 00:48:22.718] [UN] PID: 8855 TASK: ffff8c69142745c0 CPU: 7
    > COMMAND: "reopen_file"
    > #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
    > #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
    > #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
    > #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
    > #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
    > #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
    > #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
    > #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
    >
    > inode_lock(inode);
    >
    >
    > and one 'ls' later on, to see whether the rest of the mount is available
    > (the test file is in the root, so we get blocked up on the directory
    > ->i_rwsem), so the entire mount is unavailable
    >
    > [0 00:36:26.473] [UN] PID: 9802 TASK: ffff8c691436ae80 CPU: 4
    > COMMAND: "ls"
    > #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
    > #1 [ffff996500393db8] schedule at ffffffff9b6e64df
    > #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
    > #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
    > #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
    > #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
    > #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
    > #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
    >
    > in iterate_dir:
    > if (shared)
    > res = down_read_killable(&inode->i_rwsem); <<<<
    > else
    > res = down_write_killable(&inode->i_rwsem);
    >

    Reported-by: Frank Sorenson
    Reviewed-by: Pavel Shilovsky
    Signed-off-by: Ronnie Sahlberg
    Signed-off-by: Steve French
    Signed-off-by: Sasha Levin

    Ronnie Sahlberg
     
  • [ Upstream commit 366ba7c71ef77c08d06b18ad61b26e2df7352338 ]

    Reading the TOC only works if the device can play audio; otherwise
    these commands fail (and possibly bring the device to an unhealthy
    state).

    Similarly, cdrom_mmc3_profile() should only be called if the device
    supports generic packet commands.

    To: Jens Axboe
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-scsi@vger.kernel.org
    Signed-off-by: Diego Elio Pettenò
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Diego Elio Pettenò
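
The capability gating can be sketched as a function that decides which probes to issue. The capability bits and struct below are hypothetical stand-ins for the driver's capability mask.

```c
#include <stdbool.h>

/* Hypothetical capability bits standing in for the driver's mask. */
#define SKETCH_CAP_PLAY_AUDIO     (1u << 0)
#define SKETCH_CAP_GENERIC_PACKET (1u << 1)

struct sketch_probe_plan {
    bool read_toc;            /* issue the TOC read commands */
    bool query_mmc3_profile;  /* query the MMC-3 profile */
};

/* Decide which probes are safe for a device with the given capabilities. */
static struct sketch_probe_plan sketch_plan_probe(unsigned int caps)
{
    struct sketch_probe_plan plan;

    plan.read_toc = (caps & SKETCH_CAP_PLAY_AUDIO) != 0;
    plan.query_mmc3_profile = (caps & SKETCH_CAP_GENERIC_PACKET) != 0;
    return plan;
}
```

A device that reports neither capability is then never sent commands it cannot handle.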
     
  • [ Upstream commit 2aacace6dbbb6b6ce4e177e6c7ea901f389c0472 ]

    In attach_node_and_children memory is allocated for full_name via
    kasprintf. If the condition of the 1st if is not met the function
    returns early without freeing the memory. Add a kfree() to fix that.

    This has been detected with kmemleak:
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=205327

    It looks like the leak was introduced by this commit:
    Fixes: 5babefb7f7ab ("of: unittest: allow base devicetree to have symbol metadata")

    Signed-off-by: Erhard Furtner
    Reviewed-by: Michael Ellerman
    Reviewed-by: Tyrel Datwyler
    Signed-off-by: Rob Herring
    Signed-off-by: Sasha Levin

    Erhard Furtner
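
The leak pattern and its fix can be reproduced in user space. In this sketch `malloc()` and `snprintf()` stand in for kasprintf(), and the function names, the `skip` flag and the allocation counter are hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

static int sketch_live_allocs;  /* names allocated and not yet freed */

/* Build "parent/name" in freshly allocated memory; NULL on failure. */
static char *sketch_make_full_name(const char *parent, const char *name)
{
    int n = snprintf(NULL, 0, "%s/%s", parent, name);
    char *s;

    if (n < 0)
        return NULL;
    s = malloc((size_t)n + 1);
    if (!s)
        return NULL;
    snprintf(s, (size_t)n + 1, "%s/%s", parent, name);
    sketch_live_allocs++;
    return s;
}

/*
 * If skip is set the node is not attached and the function returns early.
 * The name must be released on that path too, otherwise it leaks.
 */
static int sketch_attach(const char *parent, const char *name, int skip,
                         char **out)
{
    char *full_name = sketch_make_full_name(parent, name);

    if (!full_name)
        return -ENOMEM;
    if (skip) {
        free(full_name);
        sketch_live_allocs--;
        return 0;
    }
    *out = full_name;  /* ownership passes to the caller */
    return 0;
}
```

Every exit path either frees the string or hands ownership to the caller, which is the property a leak detector checks for.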
     
  • [ Upstream commit eb065d301e8c83643367bdb0898becc364046bda ]

    We currently rely on the ring destroy to clean things up in case of
    failure, but io_allocate_scq_urings() can leave things half initialized
    if only parts of it fail.

    Be nice and return with either everything set up on success, or return an
    error with things nicely cleaned up.

    Reported-by: syzbot+0d818c0d39399188f393@syzkaller.appspotmail.com
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Jens Axboe
     
  • [ Upstream commit 7e60746005573a06149cdee7acedf428906f3a59 ]

    When we get an interrupt from the socket getting readable,
    and start reading, there's a possibility for a race. This
    depends on the implementation of the device, but e.g. with
    qemu's libvhost-user, we can see:

    device virtio_uml
    ---------------------------------------
    write header
    get interrupt
    read header
    read body -> returns -EAGAIN
    write body

    The -EAGAIN return is because the socket is non-blocking,
    and then this leads us to abandon this message.

    In fact, we've already read the header, so when we get
    another signal/interrupt for the body, we again read it
    as though it's a new message header, and also abandon it
    for the same reason (wrong size etc.).

    This essentially breaks things, and if that message was
    one that required a response, it leads to a deadlock as
    the device is waiting for the response but we'll never
    reply.

    Fix this by spinning on -EAGAIN as well when we read the
    message body. We need to handle -EAGAIN as "no message"
    while reading the header, since we share an interrupt.
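The two read modes described above can be sketched in userspace C on a non-blocking socket. This is an assumption-laden illustration rather than the virtio_uml code: `full_read_sketch()`, its `abortable` flag and the `make_nonblocking_pair()` helper are hypothetical names, and giving up on -EAGAIN only before any byte has arrived is a choice made for this sketch.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Read exactly len bytes from a non-blocking fd.
 * abortable != 0 (header): EAGAIN before any byte arrived means "no message".
 * abortable == 0 (body):   EAGAIN is retried until the data shows up.
 */
int full_read_sketch(int fd, void *buf, size_t len, int abortable)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);

        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n == 0)
            return -ECONNRESET;     /* peer closed the socket */
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            if (abortable && done == 0)
                return -EAGAIN;     /* header: nothing pending */
            continue;               /* body or partial header: spin */
        }
        return -errno;
    }
    return 0;
}

/* Helper for trying the sketch: a socketpair whose sv[0] end is non-blocking. */
int make_nonblocking_pair(int sv[2])
{
    int flags;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -errno;
    flags = fcntl(sv[0], F_GETFL);
    if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0)
        return -errno;
    return 0;
}
```

A caller would read the header with `abortable` set and, once a header has arrived, read the body with it cleared, so a body that is written slightly later is waited for instead of being mistaken for a new header.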

    Note that this situation is highly unlikely to occur in
    normal usage, since there will be very few messages and
    only in the startup phase. With the inband call feature
    this does tend to happen (eventually) though.

    Signed-off-by: Johannes Berg
    Signed-off-by: Richard Weinberger
    Signed-off-by: Sasha Levin

    Johannes Berg
     
  • [ Upstream commit 8354d88efdab72b4da32fc4f032448fcef22dab4 ]

    Ensure we grab an active reference in cifs superblock while doing
    failover to prevent automounts (DFS links) from expiring and then
    destroying the superblock pointer.

    This patch fixes the following KASAN report:

    [ 464.301462] BUG: KASAN: use-after-free in
    cifs_reconnect+0x6ab/0x1350
    [ 464.303052] Read of size 8 at addr ffff888155e580d0 by task
    cifsd/1107

    [ 464.304682] CPU: 3 PID: 1107 Comm: cifsd Not tainted 5.4.0-rc4+ #13
    [ 464.305552] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
    BIOS rel-1.12.1-0-ga5cab58-rebuilt.opensuse.org 04/01/2014
    [ 464.307146] Call Trace:
    [ 464.307875] dump_stack+0x5b/0x90
    [ 464.308631] print_address_description.constprop.0+0x16/0x200
    [ 464.309478] ? cifs_reconnect+0x6ab/0x1350
    [ 464.310253] ? cifs_reconnect+0x6ab/0x1350
    [ 464.311040] __kasan_report.cold+0x1a/0x41
    [ 464.311811] ? cifs_reconnect+0x6ab/0x1350
    [ 464.312563] kasan_report+0xe/0x20
    [ 464.313300] cifs_reconnect+0x6ab/0x1350
    [ 464.314062] ? extract_hostname.part.0+0x90/0x90
    [ 464.314829] ? printk+0xad/0xde
    [ 464.315525] ? _raw_spin_lock+0x7c/0xd0
    [ 464.316252] ? _raw_read_lock_irq+0x40/0x40
    [ 464.316961] ? ___ratelimit+0xed/0x182
    [ 464.317655] cifs_readv_from_socket+0x289/0x3b0
    [ 464.318386] cifs_read_from_socket+0x98/0xd0
    [ 464.319078] ? cifs_readv_from_socket+0x3b0/0x3b0
    [ 464.319782] ? try_to_wake_up+0x43c/0xa90
    [ 464.320463] ? cifs_small_buf_get+0x4b/0x60
    [ 464.321173] ? allocate_buffers+0x98/0x1a0
    [ 464.321856] cifs_demultiplex_thread+0x218/0x14a0
    [ 464.322558] ? cifs_handle_standard+0x270/0x270
    [ 464.323237] ? __switch_to_asm+0x40/0x70
    [ 464.323893] ? __switch_to_asm+0x34/0x70
    [ 464.324554] ? __switch_to_asm+0x40/0x70
    [ 464.325226] ? __switch_to_asm+0x40/0x70
    [ 464.325863] ? __switch_to_asm+0x34/0x70
    [ 464.326505] ? __switch_to_asm+0x40/0x70
    [ 464.327161] ? __switch_to_asm+0x34/0x70
    [ 464.327784] ? finish_task_switch+0xa1/0x330
    [ 464.328414] ? __switch_to+0x363/0x640
    [ 464.329044] ? __schedule+0x575/0xaf0
    [ 464.329655] ? _raw_spin_lock_irqsave+0x82/0xe0
    [ 464.330301] kthread+0x1a3/0x1f0
    [ 464.330884] ? cifs_handle_standard+0x270/0x270
    [ 464.331624] ? kthread_create_on_node+0xd0/0xd0
    [ 464.332347] ret_from_fork+0x35/0x40

    [ 464.333577] Allocated by task 1110:
    [ 464.334381] save_stack+0x1b/0x80
    [ 464.335123] __kasan_kmalloc.constprop.0+0xc2/0xd0
    [ 464.335848] cifs_smb3_do_mount+0xd4/0xb00
    [ 464.336619] legacy_get_tree+0x6b/0xa0
    [ 464.337235] vfs_get_tree+0x41/0x110
    [ 464.337975] fc_mount+0xa/0x40
    [ 464.338557] vfs_kern_mount.part.0+0x6c/0x80
    [ 464.339227] cifs_dfs_d_automount+0x336/0xd29
    [ 464.339846] follow_managed+0x1b1/0x450
    [ 464.340449] lookup_fast+0x231/0x4a0
    [ 464.341039] path_openat+0x240/0x1fd0
    [ 464.341634] do_filp_open+0x126/0x1c0
    [ 464.342277] do_sys_open+0x1eb/0x2c0
    [ 464.342957] do_syscall_64+0x5e/0x190
    [ 464.343555] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    [ 464.344772] Freed by task 0:
    [ 464.345347] save_stack+0x1b/0x80
    [ 464.345966] __kasan_slab_free+0x12c/0x170
    [ 464.346576] kfree+0xa6/0x270
    [ 464.347211] rcu_core+0x39c/0xc80
    [ 464.347800] __do_softirq+0x10d/0x3da

    [ 464.348919] The buggy address belongs to the object at
    ffff888155e58000
    which belongs to the cache kmalloc-256 of size 256
    [ 464.350222] The buggy address is located 208 bytes inside of
    256-byte region [ffff888155e58000, ffff888155e58100)
    [ 464.351575] The buggy address belongs to the page:
    [ 464.352333] page:ffffea0005579600 refcount:1 mapcount:0
    mapping:ffff88815a803400 index:0x0 compound_mapcount: 0
    [ 464.353583] flags: 0x200000000010200(slab|head)
    [ 464.354209] raw: 0200000000010200 ffffea0005576200 0000000400000004
    ffff88815a803400
    [ 464.355353] raw: 0000000000000000 0000000080100010 00000001ffffffff
    0000000000000000
    [ 464.356458] page dumped because: kasan: bad access detected

    [ 464.367005] Memory state around the buggy address:
    [ 464.367787] ffff888155e57f80: fc fc fc fc fc fc fc fc fc fc fc fc
    fc fc fc fc
    [ 464.368877] ffff888155e58000: fb fb fb fb fb fb fb fb fb fb fb fb
    fb fb fb fb
    [ 464.369967] >ffff888155e58080: fb fb fb fb fb fb fb fb fb fb fb fb
    fb fb fb fb
    [ 464.371111] ^
    [ 464.371775] ffff888155e58100: fc fc fc fc fc fc fc fc fc fc fc fc
    fc fc fc fc
    [ 464.372893] ffff888155e58180: fc fc fc fc fc fc fc fc fc fc fc fc
    fc fc fc fc
    [ 464.373983] ==================================================================

    Signed-off-by: Paulo Alcantara (SUSE)
    Reviewed-by: Aurelien Aptel
    Signed-off-by: Steve French
    Signed-off-by: Sasha Levin

    Paulo Alcantara (SUSE)
     
  • [ Upstream commit 465bfd9c44dea6b55962b5788a23ac87a467c923 ]

    When building pseries_defconfig, building vdso32 errors out:

    error: unknown target ABI 'elfv1'

    This happens because -m32 in clang changes the target to 32-bit,
    which does not allow the ABI to be changed.

    Commit 4dc831aa8813 ("powerpc: Fix compiling a BE kernel with a
    powerpc64le toolchain") added these flags to fix building big endian
    kernels with a little endian GCC.

    Clang doesn't need -mabi because the target triple controls the
    default value. -mlittle-endian and -mbig-endian manipulate the triple
    into either powerpc64-* or powerpc64le-*, which properly sets the
    default ABI.

    Adding a debug print out in the PPC64TargetInfo constructor after line
    383 above shows this:

    $ echo | ./clang -E --target=powerpc64-linux -mbig-endian -o /dev/null -
    Default ABI: elfv1

    $ echo | ./clang -E --target=powerpc64-linux -mlittle-endian -o /dev/null -
    Default ABI: elfv2

    $ echo | ./clang -E --target=powerpc64le-linux -mbig-endian -o /dev/null -
    Default ABI: elfv1

    $ echo | ./clang -E --target=powerpc64le-linux -mlittle-endian -o /dev/null -
    Default ABI: elfv2

    Don't specify -mabi when building with clang to avoid the build error
    with -m32 and not change any code generation.

    -mcall-aixdesc is not an implemented flag in clang so it can be safely
    excluded as well, see commit 238abecde8ad ("powerpc: Don't use gcc
    specific options on clang").

    pseries_defconfig successfully builds after this patch and
    powernv_defconfig and ppc44x_defconfig don't regress.

    Reviewed-by: Daniel Axtens
    Signed-off-by: Nathan Chancellor
    [mpe: Trim clang links in change log]
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/20191119045712.39633-2-natechancellor@gmail.com
    Signed-off-by: Sasha Levin

    Nathan Chancellor
     
  • [ Upstream commit 21915eca088dc271c970e8351290e83d938114ac ]

    build_initial_tok_table() overwrites unused sym_entry to shrink the
    table size. Before the entry is overwritten, table[i].sym must be freed
    since it is malloc'ed data.
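The leak pattern and its fix can be sketched as a small in-place table compaction. The names here (`struct sym_entry_sketch`, `make_entry()`, `compact_table()`, the `keep` flag) are hypothetical and stand in for whatever criterion decides that an entry is unused; this is not the kallsyms source.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table entry owning a malloc'ed name. */
struct sym_entry_sketch {
    char *sym;
    int keep;
};

/* Build an entry with a heap-allocated copy of name. */
struct sym_entry_sketch make_entry(const char *name, int keep)
{
    struct sym_entry_sketch e;
    size_t n = strlen(name) + 1;

    e.sym = malloc(n);
    if (e.sym)
        memcpy(e.sym, name, n);
    e.keep = keep;
    return e;
}

/*
 * Shrink the table in place, keeping only entries with keep set.
 * A dropped entry's sym is freed before its slot can be overwritten;
 * without that free() the allocation would be lost.
 * Returns the new entry count; only slots [0, count) remain owned.
 */
size_t compact_table(struct sym_entry_sketch *table, size_t cnt)
{
    size_t i, pos = 0;

    for (i = 0; i < cnt; i++) {
        if (table[i].keep) {
            if (pos != i)
                table[pos] = table[i];
            pos++;
        } else {
            free(table[i].sym);
            table[i].sym = NULL;
        }
    }
    return pos;
}
```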

    This fixes the 'definitely lost' report from valgrind. I ran valgrind
    against x86_64_defconfig of v5.4-rc8 kernel, and here is the summary:

    [Before the fix]

    LEAK SUMMARY:
    definitely lost: 53,184 bytes in 2,874 blocks

    [After the fix]

    LEAK SUMMARY:
    definitely lost: 0 bytes in 0 blocks

    Signed-off-by: Masahiro Yamada
    Signed-off-by: Sasha Levin

    Masahiro Yamada
     
  • [ Upstream commit a9ae8731e6e52829a935d81a65d7f925cb95dbac ]

    find_vma() must be called under the mmap_sem, reorganize this code to
    do the vma check after entering the lock.

    Further, fix the unlocked use of struct task_struct's mm, instead use
    the mm from hmm_mirror which has an active mm_grab. Also the mm_grab
    must be converted to a mm_get before acquiring mmap_sem or calling
    find_vma().

    Fixes: 66c45500bfdc ("drm/amdgpu: use new HMM APIs and helpers")
    Fixes: 0919195f2b0d ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads")
    Link: https://lore.kernel.org/r/20191112202231.3856-11-jgg@ziepe.ca
    Acked-by: Christian König
    Reviewed-by: Felix Kuehling
    Reviewed-by: Philip Yang
    Tested-by: Philip Yang
    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Sasha Levin

    Jason Gunthorpe
     
  • [ Upstream commit 00e0590dbaec6f1bcaa36a85467d7e3497ced522 ]

    The sanity check in macro update_for_len checks to see if len
    is less than zero, however, len is a size_t so it can never be
    less than zero, so this sanity check is a no-op. Fix this by
    making len a ssize_t so the comparison will work and add ulen
    that is a size_t copy of len so that the min() macro won't
    throw warnings about comparing different types.
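A minimal userspace sketch of the signed/unsigned split described above. The function names are hypothetical and the body only mirrors the idea in the text (a signed `len` for the check, an unsigned `ulen` copy for the clamp); it is not the apparmor macro itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* What an unsigned length sees when handed a negative result. */
size_t as_size_t(int ret)
{
    return (size_t)ret;   /* -1 becomes SIZE_MAX, never "less than zero" */
}

/*
 * Account for len bytes written into a buffer with *size bytes left.
 * len stays signed so the error check is meaningful; ulen is the
 * unsigned copy used for the clamp, so both sides of the comparison
 * have the same type.
 */
int advance_for_len(size_t *total, size_t *size, const char **str, ssize_t len)
{
    size_t ulen;

    if (len < 0)
        return -1;

    ulen = (size_t)len;
    *total += ulen;
    if (ulen > *size)
        ulen = *size;
    *size -= ulen;
    *str += ulen;
    return 0;
}
```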

    Addresses-Coverity: ("Macro compares unsigned to 0")
    Fixes: f1bd904175e8 ("apparmor: add the base fns() for domain labels")
    Signed-off-by: Colin Ian King
    Signed-off-by: John Johansen
    Signed-off-by: Sasha Levin

    Colin Ian King
     
  • [ Upstream commit 7a1323b5dfe44a9013a2cc56ef2973034a00bf88 ]

    The crash handler calls hv_synic_cleanup() to shut down the
    Hyper-V synthetic interrupt controller. But if the CPU
    that calls hv_synic_cleanup() has a VMbus channel interrupt
    assigned to it (which is likely the case in smaller VM sizes),
    hv_synic_cleanup() returns an error and the synthetic
    interrupt controller isn't shut down. While the lack of
    a shutdown hasn't caused a known problem, it still
    should be fixed for highest reliability.

    So directly call hv_synic_disable_regs() instead of
    hv_synic_cleanup(), which ensures that the synic is always
    shut down.

    Signed-off-by: Michael Kelley
    Reviewed-by: Vitaly Kuznetsov
    Reviewed-by: Dexuan Cui
    Signed-off-by: Sasha Levin

    Michael Kelley
     
  • [ Upstream commit 20183ccd3e4d01d23b0a01fe9f3ee73fbae312fa ]

    It is possible that certain config levels are not available, even
    if the max level includes the level. There can be missing levels on
    some platforms. So ignore the missing level when called for an information
    dump of all levels, and fail if specifically asked for the missing level.

    The change here is to continue reading information about other levels
    even if we fail to get information for the current level, and to use the
    "processed" flag to indicate the failure. When the "processed" flag is
    not set, don't dump information about that level.
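The "keep going, but remember which levels worked" idea can be sketched as follows. Everything here is hypothetical (`struct level_info`, `collect_levels()`, the `probe` callback and `sample_probe()`); it only illustrates the control flow described above, not the intel-speed-select tool's code.

```c
#include <stddef.h>

/* Hypothetical per-level record; processed marks a successful read. */
struct level_info {
    int processed;
    int value;
};

/* Example probe for trying the sketch: level 1 is missing. */
int sample_probe(int level, int *value)
{
    if (level == 1)
        return -1;
    *value = level * 100;
    return 0;
}

/*
 * Read every level from 0 to max_level. A failing level is skipped
 * rather than aborting the whole scan; it is left with processed == 0.
 * Returns the number of levels that were read successfully.
 */
int collect_levels(struct level_info *info, int max_level,
                   int (*probe)(int level, int *value))
{
    int level, found = 0;

    for (level = 0; level <= max_level; level++) {
        info[level].processed = 0;
        if (probe(level, &info[level].value) != 0)
            continue;
        info[level].processed = 1;
        found++;
    }
    return found;
}
```

A dump routine would then print only entries whose `processed` flag is set, while a request for one specific level can report an error when that flag is clear.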

    Signed-off-by: Srinivas Pandruvada
    Signed-off-by: Andy Shevchenko
    Signed-off-by: Sasha Levin

    Srinivas Pandruvada
     
  • [ Upstream commit e272f7ec070d212b9301d5a465bc8952f8dcf908 ]

    When commit 75e99bf5ed8f ("gpio: lynxpoint: set default handler to be
    handle_bad_irq()") switched default handler to be handle_bad_irq() the
    lp_irq_type() function remained untouched. It means that even request_irq()
    can't change the handler and we are not able to handle IRQs properly anymore.
    Fix it by setting correct handlers in the lp_irq_type() callback.

    Fixes: 75e99bf5ed8f ("gpio: lynxpoint: set default handler to be handle_bad_irq()")
    Signed-off-by: Andy Shevchenko
    Link: https://lore.kernel.org/r/20191118180251.31439-1-andriy.shevchenko@linux.intel.com
    Signed-off-by: Linus Walleij
    Signed-off-by: Sasha Levin

    Andy Shevchenko
     
  • [ Upstream commit 4e50573f39229d5e9c985fa3b4923a8b29619ade ]

    The per-SoC devtype structures can contain their own callbacks that
    overwrite mpc8xxx_gpio_devtype_default.

    The clear intention is that mpc8xxx_irq_set_type is used in case the SoC
    does not specify a more specific callback. But what happens is that if
    the SoC doesn't specify one, its .irq_set_type is de-facto NULL, and
    this overwrites mpc8xxx_irq_set_type to a no-op. This means that the
    following SoCs are affected:

    - fsl,mpc8572-gpio
    - fsl,ls1028a-gpio
    - fsl,ls1088a-gpio

    On these boards, the irq_set_type does exactly nothing, and the GPIO
    controller keeps its GPICR register in the hardware-default state. On
    the LS1028A, that is ACTIVE_BOTH, which means 2 interrupts are raised
    even if the IRQ client requests LEVEL_HIGH. Another implication is that
    the IRQs are not checked (e.g. level-triggered interrupts are not
    rejected, although they are not supported).
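The failure mode above, an unset per-SoC callback clobbering the default, and the guard that avoids it can be sketched like this. The struct and function names are hypothetical simplifications, not the gpio-mpc8xxx definitions.

```c
#include <stddef.h>

typedef int (*set_type_fn)(unsigned int type);

/* Hypothetical per-SoC description; a NULL callback means "not specified". */
struct devtype_sketch {
    set_type_fn irq_set_type;
};

/* Hypothetical chip whose callback starts out as the generic default. */
struct irq_chip_sketch {
    set_type_fn irq_set_type;
};

/* Generic default handler. */
int default_set_type(unsigned int type)
{
    (void)type;
    return 0;
}

/* Example SoC-specific handler. */
int soc_set_type(unsigned int type)
{
    (void)type;
    return 1;
}

/*
 * Apply the per-SoC override only when one was provided. Assigning
 * unconditionally would replace the default with NULL for SoCs that
 * do not specify a callback.
 */
void apply_devtype(struct irq_chip_sketch *chip,
                   const struct devtype_sketch *devtype)
{
    if (devtype->irq_set_type)
        chip->irq_set_type = devtype->irq_set_type;
}
```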

    Fixes: 82e39b0d8566 ("gpio: mpc8xxx: handle differences between incarnations at a single place")
    Signed-off-by: Vladimir Oltean
    Link: https://lore.kernel.org/r/20191115125551.31061-1-olteanv@gmail.com
    Tested-by: Michael Walle
    Signed-off-by: Linus Walleij
    Signed-off-by: Sasha Levin

    Vladimir Oltean
     
  • [ Upstream commit 5406327d43edd9a171bd260f49c752d148727eaf ]

    Add Comet Lake to the list of the platforms that intel_pmc_core driver
    supports for pmc_core device.

    Just like Ice Lake, Comet Lake can also reuse all the Cannon Lake PCH
    IPs. No additional effort is needed to enable them beyond simply reusing them.

    Cc: Mario Limonciello
    Cc: Peter Zijlstra
    Cc: Srinivas Pandruvada
    Cc: Andy Shevchenko
    Cc: Kan Liang
    Cc: David E. Box
    Cc: Rajneesh Bhardwaj
    Cc: Tony Luck
    Signed-off-by: Gayatri Kammela
    Signed-off-by: Andy Shevchenko
    Signed-off-by: Sasha Levin

    Gayatri Kammela
     
  • [ Upstream commit 43e82d8aa92503d264309fb648b251b2d85caf1a ]

    Intel's SoCs follow a naming convention which spells out the SoC name as
    two words instead of one word (E.g: Cannon Lake vs Cannonlake). Thus fix
    the naming inconsistency across the intel_pmc_core driver, so future
    SoCs can follow the naming consistency as below.

    Cometlake -> Comet Lake
    Tigerlake -> Tiger Lake
    Elkhartlake -> Elkhart Lake

    Cc: Mario Limonciello
    Cc: Peter Zijlstra
    Cc: Srinivas Pandruvada
    Cc: Andy Shevchenko
    Cc: Kan Liang
    Cc: David E. Box
    Cc: Rajneesh Bhardwaj
    Cc: Tony Luck
    Suggested-by: Andy Shevchenko
    Signed-off-by: Gayatri Kammela
    Signed-off-by: Andy Shevchenko
    Signed-off-by: Sasha Levin

    Gayatri Kammela
     
  • [ Upstream commit 787b64a43f7acacf8099329ea08872e663f1e74f ]

    Qoriq requires the IBE register to be set to enable GPIO inputs to be
    read. Set it.

    Signed-off-by: Russell King
    Link: https://lore.kernel.org/r/E1iX3HC-00069N-0T@rmk-PC.armlinux.org.uk
    Signed-off-by: Linus Walleij
    Signed-off-by: Sasha Levin

    Russell King
     
  • [ Upstream commit 71c5e55e7c077fa17c42fbda91a8d14322825c44 ]

    Reduce context close time by skipping the VA block free list update in
    order to avoid hard reset with open contexts.
    Reset with open contexts can potentially lead to a kernel crash as the
    generic pool of the MMU hops is destroyed while it is not empty because
    some unmap operations are not done.
    This commit mainly affects runs on the simulator.

    Signed-off-by: Omer Shpigelman
    Reviewed-by: Oded Gabbay
    Signed-off-by: Oded Gabbay
    Signed-off-by: Sasha Levin

    Omer Shpigelman
     
  • [ Upstream commit 677017d196ba2a4cfff13626b951cc9a206b8c7c ]

    The FS got stuck in the below stack when the storage is in an almost
    full/dirty condition (when FG_GC is being done).

    schedule_timeout
    io_schedule_timeout
    congestion_wait
    f2fs_drop_inmem_pages_all
    f2fs_gc
    f2fs_balance_fs
    __write_node_page
    f2fs_fsync_node_pages
    f2fs_do_sync_file
    f2fs_ioctl

    The root cause for this issue is a potential infinite loop
    in f2fs_drop_inmem_pages_all() for the case where gc_failure is true
    and there is an inode whose i_gc_failures[GC_FAILURE_ATOMIC] is
    not set. Fix this by keeping track of the total atomic files
    currently opened and using that to exit from this condition.

    Fix-suggested-by: Chao Yu
    Signed-off-by: Chao Yu
    Signed-off-by: Sahitya Tummala
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Sasha Levin

    Sahitya Tummala
     
  • [ Upstream commit e9d3009cb936bd0faf0719f68d98ad8afb1e613b ]

    The iSCSI target driver is the only target driver that does not wait for
    ongoing commands to finish before freeing a session. Make the iSCSI target
    driver wait for ongoing commands to finish before freeing a session. This
    patch fixes the following KASAN complaint:

    BUG: KASAN: use-after-free in __lock_acquire+0xb1a/0x2710
    Read of size 8 at addr ffff8881154eca70 by task kworker/0:2/247

    CPU: 0 PID: 247 Comm: kworker/0:2 Not tainted 5.4.0-rc1-dbg+ #6
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
    Workqueue: target_completion target_complete_ok_work [target_core_mod]
    Call Trace:
    dump_stack+0x8a/0xd6
    print_address_description.constprop.0+0x40/0x60
    __kasan_report.cold+0x1b/0x33
    kasan_report+0x16/0x20
    __asan_load8+0x58/0x90
    __lock_acquire+0xb1a/0x2710
    lock_acquire+0xd3/0x200
    _raw_spin_lock_irqsave+0x43/0x60
    target_release_cmd_kref+0x162/0x7f0 [target_core_mod]
    target_put_sess_cmd+0x2e/0x40 [target_core_mod]
    lio_check_stop_free+0x12/0x20 [iscsi_target_mod]
    transport_cmd_check_stop_to_fabric+0xd8/0xe0 [target_core_mod]
    target_complete_ok_work+0x1b0/0x790 [target_core_mod]
    process_one_work+0x549/0xa40
    worker_thread+0x7a/0x5d0
    kthread+0x1bc/0x210
    ret_from_fork+0x24/0x30

    Allocated by task 889:
    save_stack+0x23/0x90
    __kasan_kmalloc.constprop.0+0xcf/0xe0
    kasan_slab_alloc+0x12/0x20
    kmem_cache_alloc+0xf6/0x360
    transport_alloc_session+0x29/0x80 [target_core_mod]
    iscsi_target_login_thread+0xcd6/0x18f0 [iscsi_target_mod]
    kthread+0x1bc/0x210
    ret_from_fork+0x24/0x30

    Freed by task 1025:
    save_stack+0x23/0x90
    __kasan_slab_free+0x13a/0x190
    kasan_slab_free+0x12/0x20
    kmem_cache_free+0x146/0x400
    transport_free_session+0x179/0x2f0 [target_core_mod]
    transport_deregister_session+0x130/0x180 [target_core_mod]
    iscsit_close_session+0x12c/0x350 [iscsi_target_mod]
    iscsit_logout_post_handler+0x136/0x380 [iscsi_target_mod]
    iscsit_response_queue+0x8de/0xbe0 [iscsi_target_mod]
    iscsi_target_tx_thread+0x27f/0x370 [iscsi_target_mod]
    kthread+0x1bc/0x210
    ret_from_fork+0x24/0x30

    The buggy address belongs to the object at ffff8881154ec9c0
    which belongs to the cache se_sess_cache of size 352
    The buggy address is located 176 bytes inside of
    352-byte region [ffff8881154ec9c0, ffff8881154ecb20)
    The buggy address belongs to the page:
    page:ffffea0004553b00 refcount:1 mapcount:0 mapping:ffff888101755400 index:0x0 compound_mapcount: 0
    flags: 0x2fff000000010200(slab|head)
    raw: 2fff000000010200 dead000000000100 dead000000000122 ffff888101755400
    raw: 0000000000000000 0000000080130013 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff8881154ec900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
    ffff8881154ec980: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
    >ffff8881154eca00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ^
    ffff8881154eca80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ffff8881154ecb00: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc

    Cc: Mike Christie
    Link: https://lore.kernel.org/r/20191113220508.198257-3-bvanassche@acm.org
    Reviewed-by: Roman Bolshakov
    Signed-off-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Bart Van Assche
     
  • [ Upstream commit 238191d65d7217982d69e21c1d623616da34b281 ]

    If a faulty initiator fails to bind the socket to the iSCSI connection
    before emitting a command, for instance, a subsequent send_pdu, it will
    crash the kernel due to a null pointer dereference in sock_sendmsg(), as
    shown in the log below. This patch makes sure the bind succeeded before
    trying to use the socket.

    BUG: kernel NULL pointer dereference, address: 0000000000000018
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    PGD 0 P4D 0
    Oops: 0000 [#1] SMP PTI
    CPU: 3 PID: 7 Comm: kworker/u8:0 Not tainted 5.4.0-rc2.iscsi+ #13
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
    [ 24.158246] Workqueue: iscsi_q_0 iscsi_xmitworker
    [ 24.158883] RIP: 0010:apparmor_socket_sendmsg+0x5/0x20
    [...]
    [ 24.161739] RSP: 0018:ffffab6440043ca0 EFLAGS: 00010282
    [ 24.162400] RAX: ffffffff891c1c00 RBX: ffffffff89d53968 RCX: 0000000000000001
    [ 24.163253] RDX: 0000000000000030 RSI: ffffab6440043d00 RDI: 0000000000000000
    [ 24.164104] RBP: 0000000000000030 R08: 0000000000000030 R09: 0000000000000030
    [ 24.165166] R10: ffffffff893e66a0 R11: 0000000000000018 R12: ffffab6440043d00
    [ 24.166038] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9d5575a62e90
    [ 24.166919] FS: 0000000000000000(0000) GS:ffff9d557db80000(0000) knlGS:0000000000000000
    [ 24.167890] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 24.168587] CR2: 0000000000000018 CR3: 000000007a838000 CR4: 00000000000006e0
    [ 24.169451] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 24.170320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 24.171214] Call Trace:
    [ 24.171537] security_socket_sendmsg+0x3a/0x50
    [ 24.172079] sock_sendmsg+0x16/0x60
    [ 24.172506] iscsi_sw_tcp_xmit_segment+0x77/0x120
    [ 24.173076] iscsi_sw_tcp_pdu_xmit+0x58/0x170
    [ 24.173604] ? iscsi_dbg_trace+0x63/0x80
    [ 24.174087] iscsi_tcp_task_xmit+0x101/0x280
    [ 24.174666] iscsi_xmit_task+0x83/0x110
    [ 24.175206] iscsi_xmitworker+0x57/0x380
    [ 24.175757] ? __schedule+0x2a2/0x700
    [ 24.176273] process_one_work+0x1b5/0x360
    [ 24.176837] worker_thread+0x50/0x3c0
    [ 24.177353] kthread+0xf9/0x130
    [ 24.177799] ? process_one_work+0x360/0x360
    [ 24.178401] ? kthread_park+0x90/0x90
    [ 24.178915] ret_from_fork+0x35/0x40
    [ 24.179421] Modules linked in:
    [ 24.179856] CR2: 0000000000000018
    [ 24.180327] ---[ end trace b4b7674b6df5f480 ]---

    Signed-off-by: Anatol Pomazau
    Co-developed-by: Frank Mayhar
    Signed-off-by: Frank Mayhar
    Co-developed-by: Bharath Ravi
    Signed-off-by: Bharath Ravi
    Co-developed-by: Khazhimsel Kumykov
    Signed-off-by: Khazhimsel Kumykov
    Co-developed-by: Gabriel Krisman Bertazi
    Signed-off-by: Gabriel Krisman Bertazi
    Reviewed-by: Lee Duncan
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Anatol Pomazau
     
  • [ Upstream commit 71d848b8d97ec0f8e993d63cf9de6ac8b3f7c43d ]

    Fix up possible unclocked register access to auto hibern8 register in
    resume path and through sysfs entry. Meanwhile, enable auto hibern8 only
    after device is fully initialized in probe path.

    Link: https://lore.kernel.org/r/1573798172-20534-4-git-send-email-cang@codeaurora.org
    Reviewed-by: Stanley Chu
    Signed-off-by: Can Guo
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Can Guo
     
  • [ Upstream commit 80647a89eaf3f2549741648f3230cd6ff68c23b4 ]

    The SCSI specs require releasing SPC-2 reservations when a session is
    closed. Make sure that the target core does this.

    Running the libiscsi tests triggers the KASAN complaint shown below. This
    patch fixes that use-after-free.

    BUG: KASAN: use-after-free in target_check_reservation+0x171/0x980 [target_core_mod]
    Read of size 8 at addr ffff88802ecd1878 by task iscsi_trx/17200

    CPU: 0 PID: 17200 Comm: iscsi_trx Not tainted 5.4.0-rc1-dbg+ #1
    Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
    Call Trace:
    dump_stack+0x8a/0xd6
    print_address_description.constprop.0+0x40/0x60
    __kasan_report.cold+0x1b/0x34
    kasan_report+0x16/0x20
    __asan_load8+0x58/0x90
    target_check_reservation+0x171/0x980 [target_core_mod]
    __target_execute_cmd+0xb1/0xf0 [target_core_mod]
    target_execute_cmd+0x22d/0x4d0 [target_core_mod]
    transport_generic_new_cmd+0x31f/0x5b0 [target_core_mod]
    transport_handle_cdb_direct+0x6f/0x90 [target_core_mod]
    iscsit_execute_cmd+0x381/0x3f0 [iscsi_target_mod]
    iscsit_sequence_cmd+0x13b/0x1f0 [iscsi_target_mod]
    iscsit_process_scsi_cmd+0x4c/0x130 [iscsi_target_mod]
    iscsit_get_rx_pdu+0x8e8/0x15f0 [iscsi_target_mod]
    iscsi_target_rx_thread+0x105/0x1b0 [iscsi_target_mod]
    kthread+0x1bc/0x210
    ret_from_fork+0x24/0x30

    Allocated by task 1079:
    save_stack+0x23/0x90
    __kasan_kmalloc.constprop.0+0xcf/0xe0
    kasan_slab_alloc+0x12/0x20
    kmem_cache_alloc+0xfe/0x3a0
    transport_alloc_session+0x29/0x80 [target_core_mod]
    iscsi_target_login_thread+0xceb/0x1920 [iscsi_target_mod]
    kthread+0x1bc/0x210
    ret_from_fork+0x24/0x30

    Freed by task 17193:
    save_stack+0x23/0x90
    __kasan_slab_free+0x13a/0x190
    kasan_slab_free+0x12/0x20
    kmem_cache_free+0xc8/0x3e0
    transport_free_session+0x179/0x2f0 [target_core_mod]
    transport_deregister_session+0x121/0x170 [target_core_mod]
    iscsit_close_session+0x12c/0x350 [iscsi_target_mod]
    iscsit_logout_post_handler+0x136/0x380 [iscsi_target_mod]
    iscsit_response_queue+0x8fa/0xc00 [iscsi_target_mod]
    iscsi_target_tx_thread+0x28e/0x390 [iscsi_target_mod]
    kthread+0x1bc/0x210
    ret_from_fork+0x24/0x30

    The buggy address belongs to the object at ffff88802ecd1860
    which belongs to the cache se_sess_cache of size 352
    The buggy address is located 24 bytes inside of
    352-byte region [ffff88802ecd1860, ffff88802ecd19c0)
    The buggy address belongs to the page:
    page:ffffea0000bb3400 refcount:1 mapcount:0 mapping:ffff8880bef2ed00 index:0x0 compound_mapcount: 0
    flags: 0x1000000000010200(slab|head)
    raw: 1000000000010200 dead000000000100 dead000000000122 ffff8880bef2ed00
    raw: 0000000000000000 0000000080270027 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
    ffff88802ecd1700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ffff88802ecd1780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    >ffff88802ecd1800: fb fb fb fb fc fc fc fc fc fc fc fc fb fb fb fb
    ^
    ffff88802ecd1880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
    ffff88802ecd1900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

    Cc: Mike Christie
    Link: https://lore.kernel.org/r/20191113220508.198257-2-bvanassche@acm.org
    Reviewed-by: Roman Bolshakov
    Signed-off-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Bart Van Assche
     
  • [ Upstream commit 0b7a223552d455bcfba6fb9cfc5eef2b5fce1491 ]

    Add a module parameter to inhibit disconnect/reselect for individual
    targets. This gains compatibility with Aztec PowerMonster SCSI/SATA
    adapters with buggy firmware. (No fix is available from the vendor.)

    Apparently these adapters pass through the product/vendor of the attached
    SATA device. Since they can't be identified from the response to an INQUIRY
    command, a device blacklist flag won't work.

    Cc: Michael Schmitz
    Link: https://lore.kernel.org/r/993b17545990f31f9fa5a98202b51102a68e7594.1573875417.git.fthain@telegraphics.com.au
    Reviewed-and-tested-by: Michael Schmitz
    Signed-off-by: Finn Thain
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Finn Thain
     
  • [ Upstream commit aa5334c4f3014940f11bf876e919c956abef4089 ]

    Passing the parameter "num_tgts=-1" will start an infinite loop that
    exhausts the system memory.

    Link: https://lore.kernel.org/r/20191115163727.24626-1-mlombard@redhat.com
    Signed-off-by: Maurizio Lombardi
    Acked-by: Douglas Gilbert
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Maurizio Lombardi