05 Jan, 2020
40 commits
-
[ Upstream commit 3c1c24d91ffd536de0a64688a9df7f49e58fadbc ]
A while ago Andy noticed
(http://lkml.kernel.org/r/CALCETrWY+5ynDct7eU_nDUqx=okQvjm=Y5wJvA4ahBja=CQXGw@mail.gmail.com)
that UFFD_FEATURE_EVENT_FORK used by an unprivileged user may have
security implications.As the first step of the solution the following patch limits the availably
of UFFD_FEATURE_EVENT_FORK only for those having CAP_SYS_PTRACE.The usage of CAP_SYS_PTRACE ensures compatibility with CRIU.
Yet, if there are other users of non-cooperative userfaultfd that run
without CAP_SYS_PTRACE, they would be broken :(Current implementation of UFFD_FEATURE_EVENT_FORK modifies the file
descriptor table from the read() implementation of uffd, which may have
security implications for unprivileged use of the userfaultfd.Limit availability of UFFD_FEATURE_EVENT_FORK only for callers that have
CAP_SYS_PTRACE.Link: http://lkml.kernel.org/r/1572967777-8812-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport
Reviewed-by: Andrea Arcangeli
Cc: Daniel Colascione
Cc: Jann Horn
Cc: Lokesh Gidra
Cc: Nick Kralevich
Cc: Nosh Minwalla
Cc: Pavel Emelyanov
Cc: Tim Murray
Cc: Aleksa Sarai
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin -
[ Upstream commit 204cb79ad42f015312a5bbd7012d09c93d9b46fb ]
Currently, the drop_caches proc file and sysctl read back the last value
written, suggesting this is somehow a stateful setting instead of a
one-time command. Make it write-only, like e.g. compact_memory.While mitigating a VM problem at scale in our fleet, there was confusion
about whether writing to this file will permanently switch the kernel into
a non-caching mode. This influences the decision making in a tense
situation, where tens of people are trying to fix tens of thousands of
affected machines: Do we need a rollback strategy? What are the
performance implications of operating in a non-caching state for several
days? It also caused confusion when the kernel team said we may need to
write the file several times to make sure it's effective ("But it already
reads back 3?").Link: http://lkml.kernel.org/r/20191031221602.9375-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner
Acked-by: Chris Down
Acked-by: Vlastimil Babka
Acked-by: David Hildenbrand
Acked-by: Michal Hocko
Acked-by: Alexey Dobriyan
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin -
[ Upstream commit 8fc312b32b25c6b0a8b46fab4df8c68df5af1223 ]
It is assumed that the hugetlbfs_vfsmount[] array will contain either a
valid vfsmount pointer or NULL for each hstate after initialization.
Changes made while converting to use fs_context broke this assumption.While fixing the hugetlbfs_vfsmount issue, it was discovered that
init_hugetlbfs_fs never did correctly clean up when encountering a vfs
mount error.It was found during code inspection. A small memory allocation failure
would be the most likely cause of taking a error path with the bug.
This is unlikely to happen as this is early init code.Link: http://lkml.kernel.org/r/94b6244d-2c24-e269-b12c-e3ba694b242d@oracle.com
Reported-by: Chengguang Xu
Fixes: 32021982a324 ("hugetlbfs: Convert to fs_context")
Signed-off-by: Mike Kravetz
Cc: David Howells
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin -
[ Upstream commit 746dd4012d215b53152f0001a48856e41ea31730 ]
When running test_vmalloc.sh smoke the following print out states that
the fragment is missing.# ./test_vmalloc.sh: You must have the following enabled in your kernel:
# CONFIG_TEST_VMALLOC=mRework to add the fragment 'CONFIG_TEST_VMALLOC=m' to the config file.
Link: http://lkml.kernel.org/r/20190916095217.19665-1-anders.roxell@linaro.org
Fixes: a05ef00c9790 ("selftests/vm: add script helper for CONFIG_TEST_VMALLOC_MODULE")
Signed-off-by: Anders Roxell
Cc: Shuah Khan
Cc: "Uladzislau Rezki (Sony)"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin -
[ Upstream commit 7f28dad395243c5026d649136823bbc40029a828 ]
Make sure preemption is disabled when temporary switching to nodat
stack with CALL_ON_STACK helper, because nodat stack is per cpu.Reviewed-by: Heiko Carstens
Signed-off-by: Vasily Gorbik
Signed-off-by: Sasha Levin -
[ Upstream commit bf159d151a0b844be28882f39e316b5800acaa2b ]
Tx doorbell is handled by txdb_tasklet and doesn't
have an associated IRQ.Anyhow, imx_mu_shutdown ignores this and tries to
free an IRQ that wasn't requested for Tx DB resulting
in the following warning:[ 1.967644] Trying to free already-free IRQ 26
[ 1.972108] WARNING: CPU: 2 PID: 157 at kernel/irq/manage.c:1708 __free_irq+0xc0/0x358
[ 1.980024] Modules linked in:
[ 1.983088] CPU: 2 PID: 157 Comm: kworker/2:1 Tainted: G
[ 1.993524] Hardware name: Freescale i.MX8QXP MEK (DT)
[ 1.998668] Workqueue: events deferred_probe_work_func
[ 2.003812] pstate: 60000085 (nZCv daIf -PAN -UAO)
[ 2.008607] pc : __free_irq+0xc0/0x358
[ 2.012364] lr : __free_irq+0xc0/0x358
[ 2.016111] sp : ffff00001179b7e0
[ 2.019422] x29: ffff00001179b7e0 x28: 0000000000000018
[ 2.024736] x27: ffff000011233000 x26: 0000000000000004
[ 2.030053] x25: 000000000000001a x24: ffff80083bec74d4
[ 2.035369] x23: 0000000000000000 x22: ffff80083bec7588
[ 2.040686] x21: ffff80083b1fe8d8 x20: ffff80083bec7400
[ 2.046003] x19: 0000000000000000 x18: ffffffffffffffff
[ 2.051320] x17: 0000000000000000 x16: 0000000000000000
[ 2.056637] x15: ffff0000111296c8 x14: ffff00009179b517
[ 2.061953] x13: ffff00001179b525 x12: ffff000011142000
[ 2.067270] x11: ffff000011129f20 x10: ffff0000105da970
[ 2.072587] x9 : 00000000ffffffd0 x8 : 0000000000000194
[ 2.077903] x7 : 612065657266206f x6 : ffff0000111e7b09
[ 2.083220] x5 : 0000000000000003 x4 : 0000000000000000
[ 2.088537] x3 : 0000000000000000 x2 : 00000000ffffffff
[ 2.093854] x1 : 28b70f0a2b60a500 x0 : 0000000000000000
[ 2.099173] Call trace:
[ 2.101618] __free_irq+0xc0/0x358
[ 2.105021] free_irq+0x38/0x98
[ 2.108170] imx_mu_shutdown+0x90/0xb0
[ 2.111921] mbox_free_channel.part.2+0x24/0xb8
[ 2.116453] mbox_free_channel+0x18/0x28This bug is present from the beginning of times.
Cc: Oleksij Rempel
Signed-off-by: Daniel Baluta
Signed-off-by: Richard Zhu
Reviewed-by: Dong Aisheng
Signed-off-by: Jassi Brar
Signed-off-by: Sasha Levin -
[ Upstream commit 188c523e1c271d537f3c9f55b6b65bf4476de32f ]
Fix a static code checker warning:
fs/ocfs2/acl.c:331
ocfs2_acl_chmod() warn: passing zero to 'PTR_ERR'Link: http://lkml.kernel.org/r/1dee278b-6c96-eec2-ce76-fe6e07c6e20f@linux.alibaba.com
Fixes: 5ee0fbd50fd ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")
Signed-off-by: Ding Xiang
Reviewed-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Junxiao Bi
Cc: Changwei Ge
Cc: Gang He
Cc: Jun Piao
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin -
[ Upstream commit 247f265fa502e7b17a0cb0cc330e055a36aafce4 ]
Each SBDT is located at a 4KB page and contains 512 entries.
Each entry of a SDBT points to a SDB, a 4KB page containing
sampled data. The last entry is a link to another SDBT page.When an event is created the function sequence executed is:
__hw_perf_event_init()
+--> allocate_buffers()
+--> realloc_sampling_buffers()
+---> alloc_sample_data_block()Both functions realloc_sampling_buffers() and
alloc_sample_data_block() allocate pages and the allocation
can fail. This is handled correctly and all allocated
pages are freed and error -ENOMEM is returned to the
top calling function. Finally the event is not created.Once the event has been created, the amount of initially
allocated SDBT and SDB can be too low. This is detected
during measurement interrupt handling, where the amount
of lost samples is calculated. If the number of lost samples
is too high considering sampling frequency and already allocated
SBDs, the number of SDBs is enlarged during the next execution
of cpumsf_pmu_enable().If more SBDs need to be allocated, functions
realloc_sampling_buffers()
+---> alloc-sample_data_block()are called to allocate more pages. Page allocation may fail
and the returned error is ignored. A SDBT and SDB setup
already exists.However the modified SDBTs and SDBs might end up in a situation
where the first entry of an SDBT does not point to an SDB,
but another SDBT, basicly an SBDT without payload.
This can not be handled by the interrupt handler, where an SDBT
must have at least one entry pointing to an SBD.Add a check to avoid SDBTs with out payload (SDBs) when enlarging
the buffer setup.Signed-off-by: Thomas Richter
Signed-off-by: Vasily Gorbik
Signed-off-by: Sasha Levin -
[ Upstream commit bf018ee644897d7982e1b8dd8b15e97db6e1a4da ]
Currently unwinder unconditionally returns %r14 from the first frame
pointed by %r15 from pt_regs. A task could be interrupted when a function
already allocated this frame (if it needs it) for its callees or to
store local variables. In that case this frame would contain random
values from stack or values stored there by a callee. As we are only
interested in %r14 to get potential return address, skip bogus return
addresses which doesn't belong to kernel text.This helps to avoid duplicating filtering logic in unwider users, most
of which use unwind_get_return_address() and would choke on bogus 0
address returned by it otherwise.Reviewed-by: Heiko Carstens
Signed-off-by: Vasily Gorbik
Signed-off-by: Sasha Levin -
[ Upstream commit a8de1304b7df30e3a14f2a8b9709bb4ff31a0385 ]
The DTC v1.5.1 added references to (U)INT32_MAX.
This is no problem for user-space programs since defines
(U)INT32_MAX along with (u)int32_t.For the kernel space, libfdt_env.h needs to be adjusted before we
pull in the changes.In the kernel, we usually use s/u32 instead of (u)int32_t for the
fixed-width types.Accordingly, we already have S/U32_MAX for their max values.
So, we should not add (U)INT32_MAX to any more.Instead, add them to the in-kernel libfdt_env.h to compile the
latest libfdt.Signed-off-by: Masahiro Yamada
Signed-off-by: Rob Herring
Signed-off-by: Sasha Levin -
[ Upstream commit 5f0af07e89199ac51cdd4f25bc303bdc703f4e9c ]
Make sure to only clear enabled interrupts keeping count
of the connection type.Suggested-by: Oleksij Rempel
Signed-off-by: Daniel Baluta
Signed-off-by: Richard Zhu
Reviewed-by: Dong Aisheng
Signed-off-by: Jassi Brar
Signed-off-by: Sasha Levin -
[ Upstream commit 6733775a92eacd612ac88afa0fd922e4ffeb2bc7 ]
This patch introduces support for a new architectured reply
code 0x8B indicating that a hypervisor layer (if any) has
rejected an ap message.Linux may run as a guest on top of a hypervisor like zVM
or KVM. So the crypto hardware seen by the ap bus may be
restricted by the hypervisor for example only a subset like
only clear key crypto requests may be supported. Other
requests will be filtered out - rejected by the hypervisor.
The new reply code 0x8B will appear in such cases and needs
to get recognized by the ap bus and zcrypt device driver zoo.Signed-off-by: Harald Freudenberger
Signed-off-by: Vasily Gorbik
Signed-off-by: Sasha Levin -
[ Upstream commit 5b596e0ff0e1852197d4c82d3314db5e43126bf7 ]
To avoid breaking the build on arches where this is not wired up, at
least all the other features should be made available and when using
this specific routine, the "unknown" should point the user/developer to
the need to wire this up on this particular hardware architecture.Detected in a container mipsel debian cross build environment, where it
shows up as:In file included from /usr/mipsel-linux-gnu/include/stdio.h:867,
from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
from util/session.c:13:
In function 'printf',
inlined from 'regs_dump__printf' at util/session.c:1103:3,
inlined from 'regs__printf' at util/session.c:1131:2:
/usr/mipsel-linux-gnu/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cross compiler details:
mipsel-linux-gnu-gcc (Debian 9.2.1-8) 9.2.1 20190909
Also on mips64:
In file included from /usr/mips64-linux-gnuabi64/include/stdio.h:867,
from /git/linux/tools/perf/lib/include/perf/cpumap.h:6,
from util/session.c:13:
In function 'printf',
inlined from 'regs_dump__printf' at util/session.c:1103:3,
inlined from 'regs__printf' at util/session.c:1131:2,
inlined from 'regs_user__printf' at util/session.c:1139:3,
inlined from 'dump_sample' at util/session.c:1246:3,
inlined from 'machines__deliver_event' at util/session.c:1421:3:
/usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function 'printf',
inlined from 'regs_dump__printf' at util/session.c:1103:3,
inlined from 'regs__printf' at util/session.c:1131:2,
inlined from 'regs_intr__printf' at util/session.c:1147:3,
inlined from 'dump_sample' at util/session.c:1249:3,
inlined from 'machines__deliver_event' at util/session.c:1421:3:
/usr/mips64-linux-gnuabi64/include/bits/stdio2.h:107:10: error: '%-5s' directive argument is null [-Werror=format-overflow=]
107 | return __printf_chk (__USE_FORTIFY_LEVEL - 1, __fmt, __va_arg_pack ());
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~cross compiler details:
mips64-linux-gnuabi64-gcc (Debian 9.2.1-8) 9.2.1 20190909
Fixes: 2bcd355b71da ("perf tools: Add interface to arch registers sets")
Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-95wjyv4o65nuaeweq31t7l1s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin -
[ Upstream commit 0cd032d3b5fcebf5454315400ab310746a81ca53 ]
brstackinsn must be allowed to be set by the user when AUX area data has
been captured because, in that case, the branch stack might be
synthesized on the fly. This fixes the following error:Before:
$ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
[ perf record: Woken up 19 times to write data ]
[ perf record: Captured and wrote 2.274 MB perf.data ]
$ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
Display of branch stack assembler requested, but non all-branch filter set
Hint: run 'perf record -b ...'After:
$ perf record -e '{intel_pt//,cpu/mem_inst_retired.all_loads,aux-sample-size=8192/pp}:u' grep -rqs jhgjhg /boot
[ perf record: Woken up 19 times to write data ]
[ perf record: Captured and wrote 2.274 MB perf.data ]
$ perf script -F +brstackinsn --xed --itrace=i1usl100 | head
grep 13759 [002] 8091.310257: 1862 instructions:uH: 5641d58069eb bmexec+0x86b (/bin/grep)
bmexec+2485:
00005641d5806b35 jnz 0x5641d5806bd0 # MISPRED
00005641d5806bd0 movzxb (%r13,%rdx,1), %eax
00005641d5806bd6 add %rdi, %rax
00005641d5806bd9 movzxb -0x1(%rax), %edx
00005641d5806bdd cmp %rax, %r14
00005641d5806be0 jnb 0x5641d58069c0 # MISPRED
mismatch of LBR data and executable
00005641d58069c0 movzxb (%r13,%rdx,1), %ediFixes: 48d02a1d5c13 ("perf script: Add 'brstackinsn' for branch stacks")
Reported-by: Andi Kleen
Signed-off-by: Adrian Hunter
Cc: Jiri Olsa
Link: http://lore.kernel.org/lkml/20191127095322.15417-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin -
[ Upstream commit 98e93245113d0f5c279ef77f4a9e7d097323ad71 ]
To fix these build errors on a debian mipsel cross build environment:
builtin-diff.c: In function 'block_cycles_diff_cmp':
builtin-diff.c:550:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
550 | l = labs(left->diff.cycles);
| ^~~~
builtin-diff.c:551:6: error: absolute value function 'labs' given an argument of type 's64' {aka 'long long int'} but has parameter of type 'long int' which may cause truncation of value [-Werror=absolute-value]
551 | r = labs(right->diff.cycles);
| ^~~~Fixes: 99150a1faab2 ("perf diff: Use hists to manage basic blocks per symbol")
Cc: Jin Yao
Cc: Adrian Hunter
Cc: Jiri Olsa
Cc: Namhyung Kim
Link: https://lkml.kernel.org/n/tip-pn7szy5uw384ntjgk6zckh6a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
Signed-off-by: Sasha Levin -
[ Upstream commit 32546a9586aa4565035bb557e191648e022b29e8 ]
This patch moves the final part of the cifsFileInfo_put() logic where we
need a write lock on lock_sem to be processed in a separate thread that
holds no other locks.
This is to prevent deadlocks like the one below:> there are 6 processes looping to while trying to down_write
> cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
> cifs_new_fileinfo
>
> and there are 5 other processes which are blocked, several of them
> waiting on either PG_writeback or PG_locked (which are both set), all
> for the same page of the file
>
> 2 inode_lock() (inode->i_rwsem) for the file
> 1 wait_on_page_writeback() for the page
> 1 down_read(inode->i_rwsem) for the inode of the directory
> 1 inode_lock()(inode->i_rwsem) for the inode of the directory
> 1 __lock_page
>
>
> so processes are blocked waiting on:
> page flags PG_locked and PG_writeback for one specific page
> inode->i_rwsem for the directory
> inode->i_rwsem for the file
> cifsInodeInflock_sem
>
>
>
> here are the more gory details (let me know if I need to provide
> anything more/better):
>
> [0 00:48:22.765] [UN] PID: 8863 TASK: ffff8c691547c5c0 CPU: 3
> COMMAND: "reopen_file"
> #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
> #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
> #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
> #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
> #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
> #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
> #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
> #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
> * (I think legitimize_path is bogus)
>
> in path_openat
> } else {
> const char *s = path_init(nd, flags);
> while (!(error = link_path_walk(s, nd)) &&
> (error = do_last(nd, file, op)) > 0) { <<<<
>
> do_last:
> if (open_flag & O_CREAT)
> inode_lock(dir->d_inode); <<<<
> else
> so it's trying to take inode->i_rwsem for the directory
>
> DENTRY INODE SUPERBLK TYPE PATH
> ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR /mnt/vm1_smb/
> inode.i_rwsem is ffff8c691158efc0
>
> :
> owner: (UN - 8856 -
> reopen_file), counter: 0x0000000000000003
> waitlist: 2
> 0xffff9965007e3c90 8863 reopen_file UN 0 1:29:22.926
> RWSEM_WAITING_FOR_WRITE
> 0xffff996500393e00 9802 ls UN 0 1:17:26.700
> RWSEM_WAITING_FOR_READ
>
>
> the owner of the inode.i_rwsem of the directory is:
>
> [0 00:00:00.109] [UN] PID: 8856 TASK: ffff8c6914275d00 CPU: 3
> COMMAND: "reopen_file"
> #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
> #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
> #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff99650065b940] msleep at ffffffff9af573a9
> #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
> #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
> #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
> #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
> #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
> #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
> #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
> #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
> #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
> #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
> #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
>
> cifs_launder_page is for page 0xffffd1e2c07d2480
>
> crash> page.index,mapping,flags 0xffffd1e2c07d2480
> index = 0x8
> mapping = 0xffff8c68f3cd0db0
> flags = 0xfffffc0008095
>
> PAGE-FLAG BIT VALUE
> PG_locked 0 0000001
> PG_uptodate 2 0000004
> PG_lru 4 0000010
> PG_waiters 7 0000080
> PG_writeback 15 0008000
>
>
> inode is ffff8c68f3cd0c40
> inode.i_rwsem is ffff8c68f3cd0ce0
> DENTRY INODE SUPERBLK TYPE PATH
> ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
> /mnt/vm1_smb/testfile.8853
>
>
> this process holds the inode->i_rwsem for the parent directory, is
> laundering a page attached to the inode of the file it's opening, and in
> _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem
> for the file itself.
>
>
> :
> owner: (UN - 8854 -
> reopen_file), counter: 0x0000000000000003
> waitlist: 1
> 0xffff9965005dfd80 8855 reopen_file UN 0 1:29:22.912
> RWSEM_WAITING_FOR_WRITE
>
> this is the inode.i_rwsem for the file
>
> the owner:
>
> [0 00:48:22.739] [UN] PID: 8854 TASK: ffff8c6914272e80 CPU: 2
> COMMAND: "reopen_file"
> #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
> #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
> #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
> #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
> #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
> #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
> #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
> #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
> #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
> #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
> #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
> #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
> #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
> #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
>
> the process holds the inode->i_rwsem for the file to which it's writing,
> and is trying to __lock_page for the same page as in the other processes
>
>
> the other tasks:
> [0 00:00:00.028] [UN] PID: 8859 TASK: ffff8c6915479740 CPU: 2
> COMMAND: "reopen_file"
> #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
> #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
> #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
> #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
> #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
> #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
> #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
> #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
> #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
> #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
>
> this is opening the file, and is trying to down_write cinode->lock_sem
>
>
> [0 00:00:00.041] [UN] PID: 8860 TASK: ffff8c691547ae80 CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.057] [UN] PID: 8861 TASK: ffff8c6915478000 CPU: 3
> COMMAND: "reopen_file"
> [0 00:00:00.059] [UN] PID: 8858 TASK: ffff8c6914271740 CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.109] [UN] PID: 8862 TASK: ffff8c691547dd00 CPU: 6
> COMMAND: "reopen_file"
> #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
> #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
> #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
> #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
> #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
> #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
> #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
> #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
> #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
> #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
>
> closing the file, and trying to down_write cifsi->lock_sem
>
>
> [0 00:48:22.839] [UN] PID: 8857 TASK: ffff8c6914270000 CPU: 7
> COMMAND: "reopen_file"
> #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
> #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
> #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
> #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
> #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
> #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
> #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
> #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
> #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
> #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
>
> in __filemap_fdatawait_range
> wait_on_page_writeback(page);
> for the same page of the file
>
>
>
> [0 00:48:22.718] [UN] PID: 8855 TASK: ffff8c69142745c0 CPU: 7
> COMMAND: "reopen_file"
> #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
> #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
> #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
> #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
> #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
> #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
> #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
> #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
>
> inode_lock(inode);
>
>
> and one 'ls' later on, to see whether the rest of the mount is available
> (the test file is in the root, so we get blocked up on the directory
> ->i_rwsem), so the entire mount is unavailable
>
> [0 00:36:26.473] [UN] PID: 9802 TASK: ffff8c691436ae80 CPU: 4
> COMMAND: "ls"
> #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
> #1 [ffff996500393db8] schedule at ffffffff9b6e64df
> #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
> #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
> #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
> #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
> #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
> #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
>
> in iterate_dir:
> if (shared)
> res = down_read_killable(&inode->i_rwsem); <<<<
> else
> res = down_write_killable(&inode->i_rwsem);
>Reported-by: Frank Sorenson
Reviewed-by: Pavel Shilovsky
Signed-off-by: Ronnie Sahlberg
Signed-off-by: Steve French
Signed-off-by: Sasha Levin -
[ Upstream commit 366ba7c71ef77c08d06b18ad61b26e2df7352338 ]
Reading the TOC only works if the device can play audio, otherwise
these commands fail (and possibly bring the device to an unhealthy
state.)Similarly, cdrom_mmc3_profile() should only be called if the device
supports generic packet commands.To: Jens Axboe
Cc: linux-kernel@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Diego Elio Pettenò
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin -
[ Upstream commit 2aacace6dbbb6b6ce4e177e6c7ea901f389c0472 ]
In attach_node_and_children memory is allocated for full_name via
kasprintf. If the condition of the 1st if is not met the function
returns early without freeing the memory. Add a kfree() to fix that.This has been detected with kmemleak:
Link: https://bugzilla.kernel.org/show_bug.cgi?id=205327It looks like the leak was introduced by this commit:
Fixes: 5babefb7f7ab ("of: unittest: allow base devicetree to have symbol metadata")Signed-off-by: Erhard Furtner
Reviewed-by: Michael Ellerman
Reviewed-by: Tyrel Datwyler
Signed-off-by: Rob Herring
Signed-off-by: Sasha Levin -
[ Upstream commit eb065d301e8c83643367bdb0898becc364046bda ]
We currently rely on the ring destroy on cleaning things up in case of
failure, but io_allocate_scq_urings() can leave things half initialized
if only parts of it fails.Be nice and return with either everything setup in success, or return an
error with things nicely cleaned up.Reported-by: syzbot+0d818c0d39399188f393@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe
Signed-off-by: Sasha Levin -
[ Upstream commit 7e60746005573a06149cdee7acedf428906f3a59 ]
When we get an interrupt from the socket getting readable,
and start reading, there's a possibility for a race. This
depends on the implementation of the device, but e.g. with
qemu's libvhost-user, we can see:device virtio_uml
---------------------------------------
write header
get interrupt
read header
read body -> returns -EAGAIN
write bodyThe -EAGAIN return is because the socket is non-blocking,
and then this leads us to abandon this message.In fact, we've already read the header, so when the get
another signal/interrupt for the body, we again read it
as though it's a new message header, and also abandon it
for the same reason (wrong size etc.)This essentially breaks things, and if that message was
one that required a response, it leads to a deadlock as
the device is waiting for the response but we'll never
reply.Fix this by spinning on -EAGAIN as well when we read the
message body. We need to handle -EAGAIN as "no message"
while reading the header, since we share an interrupt.Note that this situation is highly unlikely to occur in
normal usage, since there will be very few messages and
only in the startup phase. With the inband call feature
this does tend to happen (eventually) though.Signed-off-by: Johannes Berg
Signed-off-by: Richard Weinberger
Signed-off-by: Sasha Levin -
[ Upstream commit 8354d88efdab72b4da32fc4f032448fcef22dab4 ]
Ensure we grab an active reference in cifs superblock while doing
failover to prevent automounts (DFS links) of expiring and then
destroying the superblock pointer.This patch fixes the following KASAN report:
[ 464.301462] BUG: KASAN: use-after-free in
cifs_reconnect+0x6ab/0x1350
[ 464.303052] Read of size 8 at addr ffff888155e580d0 by task
cifsd/1107[ 464.304682] CPU: 3 PID: 1107 Comm: cifsd Not tainted 5.4.0-rc4+ #13
[ 464.305552] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS rel-1.12.1-0-ga5cab58-rebuilt.opensuse.org 04/01/2014
[ 464.307146] Call Trace:
[ 464.307875] dump_stack+0x5b/0x90
[ 464.308631] print_address_description.constprop.0+0x16/0x200
[ 464.309478] ? cifs_reconnect+0x6ab/0x1350
[ 464.310253] ? cifs_reconnect+0x6ab/0x1350
[ 464.311040] __kasan_report.cold+0x1a/0x41
[ 464.311811] ? cifs_reconnect+0x6ab/0x1350
[ 464.312563] kasan_report+0xe/0x20
[ 464.313300] cifs_reconnect+0x6ab/0x1350
[ 464.314062] ? extract_hostname.part.0+0x90/0x90
[ 464.314829] ? printk+0xad/0xde
[ 464.315525] ? _raw_spin_lock+0x7c/0xd0
[ 464.316252] ? _raw_read_lock_irq+0x40/0x40
[ 464.316961] ? ___ratelimit+0xed/0x182
[ 464.317655] cifs_readv_from_socket+0x289/0x3b0
[ 464.318386] cifs_read_from_socket+0x98/0xd0
[ 464.319078] ? cifs_readv_from_socket+0x3b0/0x3b0
[ 464.319782] ? try_to_wake_up+0x43c/0xa90
[ 464.320463] ? cifs_small_buf_get+0x4b/0x60
[ 464.321173] ? allocate_buffers+0x98/0x1a0
[ 464.321856] cifs_demultiplex_thread+0x218/0x14a0
[ 464.322558] ? cifs_handle_standard+0x270/0x270
[ 464.323237] ? __switch_to_asm+0x40/0x70
[ 464.323893] ? __switch_to_asm+0x34/0x70
[ 464.324554] ? __switch_to_asm+0x40/0x70
[ 464.325226] ? __switch_to_asm+0x40/0x70
[ 464.325863] ? __switch_to_asm+0x34/0x70
[ 464.326505] ? __switch_to_asm+0x40/0x70
[ 464.327161] ? __switch_to_asm+0x34/0x70
[ 464.327784] ? finish_task_switch+0xa1/0x330
[ 464.328414] ? __switch_to+0x363/0x640
[ 464.329044] ? __schedule+0x575/0xaf0
[ 464.329655] ? _raw_spin_lock_irqsave+0x82/0xe0
[ 464.330301] kthread+0x1a3/0x1f0
[ 464.330884] ? cifs_handle_standard+0x270/0x270
[ 464.331624] ? kthread_create_on_node+0xd0/0xd0
[ 464.332347] ret_from_fork+0x35/0x40[ 464.333577] Allocated by task 1110:
[ 464.334381] save_stack+0x1b/0x80
[ 464.335123] __kasan_kmalloc.constprop.0+0xc2/0xd0
[ 464.335848] cifs_smb3_do_mount+0xd4/0xb00
[ 464.336619] legacy_get_tree+0x6b/0xa0
[ 464.337235] vfs_get_tree+0x41/0x110
[ 464.337975] fc_mount+0xa/0x40
[ 464.338557] vfs_kern_mount.part.0+0x6c/0x80
[ 464.339227] cifs_dfs_d_automount+0x336/0xd29
[ 464.339846] follow_managed+0x1b1/0x450
[ 464.340449] lookup_fast+0x231/0x4a0
[ 464.341039] path_openat+0x240/0x1fd0
[ 464.341634] do_filp_open+0x126/0x1c0
[ 464.342277] do_sys_open+0x1eb/0x2c0
[ 464.342957] do_syscall_64+0x5e/0x190
[ 464.343555] entry_SYSCALL_64_after_hwframe+0x44/0xa9[ 464.344772] Freed by task 0:
[ 464.345347] save_stack+0x1b/0x80
[ 464.345966] __kasan_slab_free+0x12c/0x170
[ 464.346576] kfree+0xa6/0x270
[ 464.347211] rcu_core+0x39c/0xc80
[ 464.347800] __do_softirq+0x10d/0x3da[ 464.348919] The buggy address belongs to the object at
ffff888155e58000
which belongs to the cache kmalloc-256 of size 256
[ 464.350222] The buggy address is located 208 bytes inside of
256-byte region [ffff888155e58000, ffff888155e58100)
[ 464.351575] The buggy address belongs to the page:
[ 464.352333] page:ffffea0005579600 refcount:1 mapcount:0
mapping:ffff88815a803400 index:0x0 compound_mapcount: 0
[ 464.353583] flags: 0x200000000010200(slab|head)
[ 464.354209] raw: 0200000000010200 ffffea0005576200 0000000400000004
ffff88815a803400
[ 464.355353] raw: 0000000000000000 0000000080100010 00000001ffffffff
0000000000000000
[ 464.356458] page dumped because: kasan: bad access detected[ 464.367005] Memory state around the buggy address:
[ 464.367787] ffff888155e57f80: fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc fc
[ 464.368877] ffff888155e58000: fb fb fb fb fb fb fb fb fb fb fb fb
fb fb fb fb
[ 464.369967] >ffff888155e58080: fb fb fb fb fb fb fb fb fb fb fb fb
fb fb fb fb
[ 464.371111] ^
[ 464.371775] ffff888155e58100: fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc fc
[ 464.372893] ffff888155e58180: fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc fc
[ 464.373983] ==================================================================Signed-off-by: Paulo Alcantara (SUSE)
Reviewed-by: Aurelien Aptel
Signed-off-by: Steve French
Signed-off-by: Sasha Levin -
[ Upstream commit 465bfd9c44dea6b55962b5788a23ac87a467c923 ]
When building pseries_defconfig, building vdso32 errors out:
error: unknown target ABI 'elfv1'
This happens because -m32 in clang changes the target to 32-bit,
which does not allow the ABI to be changed.Commit 4dc831aa8813 ("powerpc: Fix compiling a BE kernel with a
powerpc64le toolchain") added these flags to fix building big endian
kernels with a little endian GCC.Clang doesn't need -mabi because the target triple controls the
default value. -mlittle-endian and -mbig-endian manipulate the triple
into either powerpc64-* or powerpc64le-*, which properly sets the
default ABI.Adding a debug print out in the PPC64TargetInfo constructor after line
383 above shows this:$ echo | ./clang -E --target=powerpc64-linux -mbig-endian -o /dev/null -
Default ABI: elfv1$ echo | ./clang -E --target=powerpc64-linux -mlittle-endian -o /dev/null -
Default ABI: elfv2$ echo | ./clang -E --target=powerpc64le-linux -mbig-endian -o /dev/null -
Default ABI: elfv1$ echo | ./clang -E --target=powerpc64le-linux -mlittle-endian -o /dev/null -
Default ABI: elfv2Don't specify -mabi when building with clang to avoid the build error
with -m32 and not change any code generation.-mcall-aixdesc is not an implemented flag in clang so it can be safely
excluded as well, see commit 238abecde8ad ("powerpc: Don't use gcc
specific options on clang").pseries_defconfig successfully builds after this patch and
powernv_defconfig and ppc44x_defconfig don't regress.Reviewed-by: Daniel Axtens
Signed-off-by: Nathan Chancellor
[mpe: Trim clang links in change log]
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20191119045712.39633-2-natechancellor@gmail.com
Signed-off-by: Sasha Levin -
[ Upstream commit 21915eca088dc271c970e8351290e83d938114ac ]
build_initial_tok_table() overwrites unused sym_entry to shrink the
table size. Before the entry is overwritten, table[i].sym must be freed
since it is malloc'ed data.This fixes the 'definitely lost' report from valgrind. I ran valgrind
against x86_64_defconfig of v5.4-rc8 kernel, and here is the summary:[Before the fix]
LEAK SUMMARY:
definitely lost: 53,184 bytes in 2,874 blocks[After the fix]
LEAK SUMMARY:
definitely lost: 0 bytes in 0 blocksSigned-off-by: Masahiro Yamada
Signed-off-by: Sasha Levin -
[ Upstream commit a9ae8731e6e52829a935d81a65d7f925cb95dbac ]
find_vma() must be called under the mmap_sem, reorganize this code to
do the vma check after entering the lock.Further, fix the unlocked use of struct task_struct's mm, instead use
the mm from hmm_mirror which has an active mm_grab. Also the mm_grab
must be converted to a mm_get before acquiring mmap_sem or calling
find_vma().Fixes: 66c45500bfdc ("drm/amdgpu: use new HMM APIs and helpers")
Fixes: 0919195f2b0d ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads")
Link: https://lore.kernel.org/r/20191112202231.3856-11-jgg@ziepe.ca
Acked-by: Christian König
Reviewed-by: Felix Kuehling
Reviewed-by: Philip Yang
Tested-by: Philip Yang
Signed-off-by: Jason Gunthorpe
Signed-off-by: Sasha Levin -
[ Upstream commit 00e0590dbaec6f1bcaa36a85467d7e3497ced522 ]
The sanity check in macro update_for_len checks to see if len
is less than zero, however, len is a size_t so it can never be
less than zero, so this sanity check is a no-op. Fix this by
making len a ssize_t so the comparison will work and add ulen
that is a size_t copy of len so that the min() macro won't
throw warnings about comparing different types.Addresses-Coverity: ("Macro compares unsigned to 0")
Fixes: f1bd904175e8 ("apparmor: add the base fns() for domain labels")
Signed-off-by: Colin Ian King
Signed-off-by: John Johansen
Signed-off-by: Sasha Levin -
[ Upstream commit 7a1323b5dfe44a9013a2cc56ef2973034a00bf88 ]
The crash handler calls hv_synic_cleanup() to shutdown the
Hyper-V synthetic interrupt controller. But if the CPU
that calls hv_synic_cleanup() has a VMbus channel interrupt
assigned to it (which is likely the case in smaller VM sizes),
hv_synic_cleanup() returns an error and the synthetic
interrupt controller isn't shutdown. While the lack of
being shutdown hasn't caused a known problem, it still
should be fixed for highest reliability.So directly call hv_synic_disable_regs() instead of
hv_synic_cleanup(), which ensures that the synic is always
shutdown.Signed-off-by: Michael Kelley
Reviewed-by: Vitaly Kuznetsov
Reviewed-by: Dexuan Cui
Signed-off-by: Sasha Levin -
[ Upstream commit 20183ccd3e4d01d23b0a01fe9f3ee73fbae312fa ]
It is possible that certain config levels are not available, even
if the max level includes the level. There can be missing levels in
some platforms. So ignore the level when called for information dump
for all levels and fail if specifically ask for the missing level.Here the changes is to continue reading information about other levels
even if we fail to get information for the current level. But use the
"processed" flag to indicate the failure. When the "processed" flag is
not set, don't dump information about that level.Signed-off-by: Srinivas Pandruvada
Signed-off-by: Andy Shevchenko
Signed-off-by: Sasha Levin -
[ Upstream commit e272f7ec070d212b9301d5a465bc8952f8dcf908 ]
When commit 75e99bf5ed8f ("gpio: lynxpoint: set default handler to be
handle_bad_irq()") switched default handler to be handle_bad_irq() the
lp_irq_type() function remained untouched. It means that even request_irq()
can't change the handler and we are not able to handle IRQs properly anymore.
Fix it by setting correct handlers in the lp_irq_type() callback.Fixes: 75e99bf5ed8f ("gpio: lynxpoint: set default handler to be handle_bad_irq()")
Signed-off-by: Andy Shevchenko
Link: https://lore.kernel.org/r/20191118180251.31439-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Linus Walleij
Signed-off-by: Sasha Levin -
[ Upstream commit 4e50573f39229d5e9c985fa3b4923a8b29619ade ]
The per-SoC devtype structures can contain their own callbacks that
overwrite mpc8xxx_gpio_devtype_default.The clear intention is that mpc8xxx_irq_set_type is used in case the SoC
does not specify a more specific callback. But what happens is that if
the SoC doesn't specify one, its .irq_set_type is de-facto NULL, and
this overwrites mpc8xxx_irq_set_type to a no-op. This means that the
following SoCs are affected:- fsl,mpc8572-gpio
- fsl,ls1028a-gpio
- fsl,ls1088a-gpioOn these boards, the irq_set_type does exactly nothing, and the GPIO
controller keeps its GPICR register in the hardware-default state. On
the LS1028A, that is ACTIVE_BOTH, which means 2 interrupts are raised
even if the IRQ client requests LEVEL_HIGH. Another implication is that
the IRQs are not checked (e.g. level-triggered interrupts are not
rejected, although they are not supported).Fixes: 82e39b0d8566 ("gpio: mpc8xxx: handle differences between incarnations at a single place")
Signed-off-by: Vladimir Oltean
Link: https://lore.kernel.org/r/20191115125551.31061-1-olteanv@gmail.com
Tested-by: Michael Walle
Signed-off-by: Linus Walleij
Signed-off-by: Sasha Levin -
[ Upstream commit 5406327d43edd9a171bd260f49c752d148727eaf ]
Add Comet Lake to the list of the platforms that intel_pmc_core driver
supports for pmc_core device.Just like Ice Lake, Comet Lake can also reuse all the Cannon Lake PCH
IPs. No additional effort is needed to enable but to simply reuse them.Cc: Mario Limonciello
Cc: Peter Zijlstra
Cc: Srinivas Pandruvada
Cc: Andy Shevchenko
Cc: Kan Liang
Cc: David E. Box
Cc: Rajneesh Bhardwaj
Cc: Tony Luck
Signed-off-by: Gayatri Kammela
Signed-off-by: Andy Shevchenko
Signed-off-by: Sasha Levin -
[ Upstream commit 43e82d8aa92503d264309fb648b251b2d85caf1a ]
Intel's SoCs follow a naming convention which spells out the SoC name as
two words instead of one word (E.g: Cannon Lake vs Cannonlake). Thus fix
the naming inconsistency across the intel_pmc_core driver, so future
SoCs can follow the naming consistency as below.Cometlake -> Comet Lake
Tigerlake -> Tiger Lake
Elkhartlake -> Elkhart LakeCc: Mario Limonciello
Cc: Peter Zijlstra
Cc: Srinivas Pandruvada
Cc: Andy Shevchenko
Cc: Kan Liang
Cc: David E. Box
Cc: Rajneesh Bhardwaj
Cc: Tony Luck
Suggested-by: Andy Shevchenko
Signed-off-by: Gayatri Kammela
Signed-off-by: Andy Shevchenko
Signed-off-by: Sasha Levin -
[ Upstream commit 787b64a43f7acacf8099329ea08872e663f1e74f ]
Qoriq requires the IBE register to be set to enable GPIO inputs to be
read. Set it.Signed-off-by: Russell King
Link: https://lore.kernel.org/r/E1iX3HC-00069N-0T@rmk-PC.armlinux.org.uk
Signed-off-by: Linus Walleij
Signed-off-by: Sasha Levin -
[ Upstream commit 71c5e55e7c077fa17c42fbda91a8d14322825c44 ]
Reduce context close time by skipping the VA block free list update in
order to avoid hard reset with open contexts.
Reset with open contexts can potentially lead to a kernel crash as the
generic pool of the MMU hops is destroyed while it is not empty because
some unmap operations are not done.
The commit affect mainly when running on simulator.Signed-off-by: Omer Shpigelman
Reviewed-by: Oded Gabbay
Signed-off-by: Oded Gabbay
Signed-off-by: Sasha Levin -
[ Upstream commit 677017d196ba2a4cfff13626b951cc9a206b8c7c ]
The FS got stuck in the below stack when the storage is almost
full/dirty condition (when FG_GC is being done).schedule_timeout
io_schedule_timeout
congestion_wait
f2fs_drop_inmem_pages_all
f2fs_gc
f2fs_balance_fs
__write_node_page
f2fs_fsync_node_pages
f2fs_do_sync_file
f2fs_ioctlThe root cause for this issue is there is a potential infinite loop
in f2fs_drop_inmem_pages_all() for the case where gc_failure is true
and when there an inode whose i_gc_failures[GC_FAILURE_ATOMIC] is
not set. Fix this by keeping track of the total atomic files
currently opened and using that to exit from this condition.Fix-suggested-by: Chao Yu
Signed-off-by: Chao Yu
Signed-off-by: Sahitya Tummala
Signed-off-by: Jaegeuk Kim
Signed-off-by: Sasha Levin -
[ Upstream commit e9d3009cb936bd0faf0719f68d98ad8afb1e613b ]
The iSCSI target driver is the only target driver that does not wait for
ongoing commands to finish before freeing a session. Make the iSCSI target
driver wait for ongoing commands to finish before freeing a session. This
patch fixes the following KASAN complaint:BUG: KASAN: use-after-free in __lock_acquire+0xb1a/0x2710
Read of size 8 at addr ffff8881154eca70 by task kworker/0:2/247CPU: 0 PID: 247 Comm: kworker/0:2 Not tainted 5.4.0-rc1-dbg+ #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
Workqueue: target_completion target_complete_ok_work [target_core_mod]
Call Trace:
dump_stack+0x8a/0xd6
print_address_description.constprop.0+0x40/0x60
__kasan_report.cold+0x1b/0x33
kasan_report+0x16/0x20
__asan_load8+0x58/0x90
__lock_acquire+0xb1a/0x2710
lock_acquire+0xd3/0x200
_raw_spin_lock_irqsave+0x43/0x60
target_release_cmd_kref+0x162/0x7f0 [target_core_mod]
target_put_sess_cmd+0x2e/0x40 [target_core_mod]
lio_check_stop_free+0x12/0x20 [iscsi_target_mod]
transport_cmd_check_stop_to_fabric+0xd8/0xe0 [target_core_mod]
target_complete_ok_work+0x1b0/0x790 [target_core_mod]
process_one_work+0x549/0xa40
worker_thread+0x7a/0x5d0
kthread+0x1bc/0x210
ret_from_fork+0x24/0x30Allocated by task 889:
save_stack+0x23/0x90
__kasan_kmalloc.constprop.0+0xcf/0xe0
kasan_slab_alloc+0x12/0x20
kmem_cache_alloc+0xf6/0x360
transport_alloc_session+0x29/0x80 [target_core_mod]
iscsi_target_login_thread+0xcd6/0x18f0 [iscsi_target_mod]
kthread+0x1bc/0x210
ret_from_fork+0x24/0x30Freed by task 1025:
save_stack+0x23/0x90
__kasan_slab_free+0x13a/0x190
kasan_slab_free+0x12/0x20
kmem_cache_free+0x146/0x400
transport_free_session+0x179/0x2f0 [target_core_mod]
transport_deregister_session+0x130/0x180 [target_core_mod]
iscsit_close_session+0x12c/0x350 [iscsi_target_mod]
iscsit_logout_post_handler+0x136/0x380 [iscsi_target_mod]
iscsit_response_queue+0x8de/0xbe0 [iscsi_target_mod]
iscsi_target_tx_thread+0x27f/0x370 [iscsi_target_mod]
kthread+0x1bc/0x210
ret_from_fork+0x24/0x30The buggy address belongs to the object at ffff8881154ec9c0
which belongs to the cache se_sess_cache of size 352
The buggy address is located 176 bytes inside of
352-byte region [ffff8881154ec9c0, ffff8881154ecb20)
The buggy address belongs to the page:
page:ffffea0004553b00 refcount:1 mapcount:0 mapping:ffff888101755400 index:0x0 compound_mapcount: 0
flags: 0x2fff000000010200(slab|head)
raw: 2fff000000010200 dead000000000100 dead000000000122 ffff888101755400
raw: 0000000000000000 0000000080130013 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detectedMemory state around the buggy address:
ffff8881154ec900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8881154ec980: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
>ffff8881154eca00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8881154eca80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8881154ecb00: fb fb fb fb fc fc fc fc fc fc fc fc fc fc fc fcCc: Mike Christie
Link: https://lore.kernel.org/r/20191113220508.198257-3-bvanassche@acm.org
Reviewed-by: Roman Bolshakov
Signed-off-by: Bart Van Assche
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin -
[ Upstream commit 238191d65d7217982d69e21c1d623616da34b281 ]
If a faulty initiator fails to bind the socket to the iSCSI connection
before emitting a command, for instance, a subsequent send_pdu, it will
crash the kernel due to a null pointer dereference in sock_sendmsg(), as
shown in the log below. This patch makes sure the bind succeeded before
trying to use the socket.BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 7 Comm: kworker/u8:0 Not tainted 5.4.0-rc2.iscsi+ #13
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 24.158246] Workqueue: iscsi_q_0 iscsi_xmitworker
[ 24.158883] RIP: 0010:apparmor_socket_sendmsg+0x5/0x20
[...]
[ 24.161739] RSP: 0018:ffffab6440043ca0 EFLAGS: 00010282
[ 24.162400] RAX: ffffffff891c1c00 RBX: ffffffff89d53968 RCX: 0000000000000001
[ 24.163253] RDX: 0000000000000030 RSI: ffffab6440043d00 RDI: 0000000000000000
[ 24.164104] RBP: 0000000000000030 R08: 0000000000000030 R09: 0000000000000030
[ 24.165166] R10: ffffffff893e66a0 R11: 0000000000000018 R12: ffffab6440043d00
[ 24.166038] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9d5575a62e90
[ 24.166919] FS: 0000000000000000(0000) GS:ffff9d557db80000(0000) knlGS:0000000000000000
[ 24.167890] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 24.168587] CR2: 0000000000000018 CR3: 000000007a838000 CR4: 00000000000006e0
[ 24.169451] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 24.170320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 24.171214] Call Trace:
[ 24.171537] security_socket_sendmsg+0x3a/0x50
[ 24.172079] sock_sendmsg+0x16/0x60
[ 24.172506] iscsi_sw_tcp_xmit_segment+0x77/0x120
[ 24.173076] iscsi_sw_tcp_pdu_xmit+0x58/0x170
[ 24.173604] ? iscsi_dbg_trace+0x63/0x80
[ 24.174087] iscsi_tcp_task_xmit+0x101/0x280
[ 24.174666] iscsi_xmit_task+0x83/0x110
[ 24.175206] iscsi_xmitworker+0x57/0x380
[ 24.175757] ? __schedule+0x2a2/0x700
[ 24.176273] process_one_work+0x1b5/0x360
[ 24.176837] worker_thread+0x50/0x3c0
[ 24.177353] kthread+0xf9/0x130
[ 24.177799] ? process_one_work+0x360/0x360
[ 24.178401] ? kthread_park+0x90/0x90
[ 24.178915] ret_from_fork+0x35/0x40
[ 24.179421] Modules linked in:
[ 24.179856] CR2: 0000000000000018
[ 24.180327] ---[ end trace b4b7674b6df5f480 ]---Signed-off-by: Anatol Pomazau
Co-developed-by: Frank Mayhar
Signed-off-by: Frank Mayhar
Co-developed-by: Bharath Ravi
Signed-off-by: Bharath Ravi
Co-developed-by: Khazhimsel Kumykov
Signed-off-by: Khazhimsel Kumykov
Co-developed-by: Gabriel Krisman Bertazi
Signed-off-by: Gabriel Krisman Bertazi
Reviewed-by: Lee Duncan
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin -
[ Upstream commit 71d848b8d97ec0f8e993d63cf9de6ac8b3f7c43d ]
Fix up possible unclocked register access to auto hibern8 register in
resume path and through sysfs entry. Meanwhile, enable auto hibern8 only
after device is fully initialized in probe path.Link: https://lore.kernel.org/r/1573798172-20534-4-git-send-email-cang@codeaurora.org
Reviewed-by: Stanley Chu
Signed-off-by: Can Guo
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin -
[ Upstream commit 80647a89eaf3f2549741648f3230cd6ff68c23b4 ]
The SCSI specs require releasing SPC-2 reservations when a session is
closed. Make sure that the target core does this.Running the libiscsi tests triggers the KASAN complaint shown below. This
patch fixes that use-after-free.BUG: KASAN: use-after-free in target_check_reservation+0x171/0x980 [target_core_mod]
Read of size 8 at addr ffff88802ecd1878 by task iscsi_trx/17200CPU: 0 PID: 17200 Comm: iscsi_trx Not tainted 5.4.0-rc1-dbg+ #1
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x8a/0xd6
print_address_description.constprop.0+0x40/0x60
__kasan_report.cold+0x1b/0x34
kasan_report+0x16/0x20
__asan_load8+0x58/0x90
target_check_reservation+0x171/0x980 [target_core_mod]
__target_execute_cmd+0xb1/0xf0 [target_core_mod]
target_execute_cmd+0x22d/0x4d0 [target_core_mod]
transport_generic_new_cmd+0x31f/0x5b0 [target_core_mod]
transport_handle_cdb_direct+0x6f/0x90 [target_core_mod]
iscsit_execute_cmd+0x381/0x3f0 [iscsi_target_mod]
iscsit_sequence_cmd+0x13b/0x1f0 [iscsi_target_mod]
iscsit_process_scsi_cmd+0x4c/0x130 [iscsi_target_mod]
iscsit_get_rx_pdu+0x8e8/0x15f0 [iscsi_target_mod]
iscsi_target_rx_thread+0x105/0x1b0 [iscsi_target_mod]
kthread+0x1bc/0x210
ret_from_fork+0x24/0x30Allocated by task 1079:
save_stack+0x23/0x90
__kasan_kmalloc.constprop.0+0xcf/0xe0
kasan_slab_alloc+0x12/0x20
kmem_cache_alloc+0xfe/0x3a0
transport_alloc_session+0x29/0x80 [target_core_mod]
iscsi_target_login_thread+0xceb/0x1920 [iscsi_target_mod]
kthread+0x1bc/0x210
ret_from_fork+0x24/0x30Freed by task 17193:
save_stack+0x23/0x90
__kasan_slab_free+0x13a/0x190
kasan_slab_free+0x12/0x20
kmem_cache_free+0xc8/0x3e0
transport_free_session+0x179/0x2f0 [target_core_mod]
transport_deregister_session+0x121/0x170 [target_core_mod]
iscsit_close_session+0x12c/0x350 [iscsi_target_mod]
iscsit_logout_post_handler+0x136/0x380 [iscsi_target_mod]
iscsit_response_queue+0x8fa/0xc00 [iscsi_target_mod]
iscsi_target_tx_thread+0x28e/0x390 [iscsi_target_mod]
kthread+0x1bc/0x210
ret_from_fork+0x24/0x30The buggy address belongs to the object at ffff88802ecd1860
which belongs to the cache se_sess_cache of size 352
The buggy address is located 24 bytes inside of
352-byte region [ffff88802ecd1860, ffff88802ecd19c0)
The buggy address belongs to the page:
page:ffffea0000bb3400 refcount:1 mapcount:0 mapping:ffff8880bef2ed00 index:0x0 compound_mapcount: 0
flags: 0x1000000000010200(slab|head)
raw: 1000000000010200 dead000000000100 dead000000000122 ffff8880bef2ed00
raw: 0000000000000000 0000000080270027 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detectedMemory state around the buggy address:
ffff88802ecd1700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88802ecd1780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88802ecd1800: fb fb fb fb fc fc fc fc fc fc fc fc fb fb fb fb
^
ffff88802ecd1880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88802ecd1900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fbCc: Mike Christie
Link: https://lore.kernel.org/r/20191113220508.198257-2-bvanassche@acm.org
Reviewed-by: Roman Bolshakov
Signed-off-by: Bart Van Assche
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin -
[ Upstream commit 0b7a223552d455bcfba6fb9cfc5eef2b5fce1491 ]
Add a module parameter to inhibit disconnect/reselect for individual
targets. This gains compatibility with Aztec PowerMonster SCSI/SATA
adapters with buggy firmware. (No fix is available from the vendor.)Apparently these adapters pass-through the product/vendor of the attached
SATA device. Since they can't be identified from the response to an INQUIRY
command, a device blacklist flag won't work.Cc: Michael Schmitz
Link: https://lore.kernel.org/r/993b17545990f31f9fa5a98202b51102a68e7594.1573875417.git.fthain@telegraphics.com.au
Reviewed-and-tested-by: Michael Schmitz
Signed-off-by: Finn Thain
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin -
[ Upstream commit aa5334c4f3014940f11bf876e919c956abef4089 ]
Passing the parameter "num_tgts=-1" will start an infinite loop that
exhausts the system memoryLink: https://lore.kernel.org/r/20191115163727.24626-1-mlombard@redhat.com
Signed-off-by: Maurizio Lombardi
Acked-by: Douglas Gilbert
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin