22 Jan, 2021
1 commit
-
This reverts commit 175fe6289fcb7e13cb5add2b80cdc0e5e049fb95.
This change isn't needed as the new MU binding is efficient and
can handle messages of any sizes.Signed-off-by: Nitin Garg
Reviewed-by: Dong Aisheng
20 Jan, 2021
10 commits
-
This is the 5.10.9 stable release
* tag 'v5.10.9': (153 commits)
Linux 5.10.9
netfilter: nf_nat: Fix memleak in nf_nat_init
netfilter: conntrack: fix reading nf_conntrack_buckets
...Signed-off-by: Jason Liu
-
This is the 5.10.8 stable release
* tag 'v5.10.8': (104 commits)
Linux 5.10.8
tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
drm/panfrost: Remove unused variables in panfrost_job_close()
...Signed-off-by: Jason Liu
-
This is the 5.10.7 stable release
* tag 'v5.10.7': (144 commits)
Linux 5.10.7
scsi: target: Fix XCOPY NAA identifier lookup
rtlwifi: rise completion at the last step of firmware callback
...Signed-off-by: Jason Liu
-
This is the 5.10.6 stable release
* tag 'v5.10.6': (21 commits)
Linux 5.10.6
mwifiex: Fix possible buffer overflows in mwifiex_cmd_802_11_ad_hoc_start
exec: Transform exec_update_mutex into a rw_semaphore
...Signed-off-by: Jason Liu
Conflicts:
drivers/rtc/rtc-pcf2127.c -
This is the 5.10.5 stable release
* tag 'v5.10.5': (63 commits)
Linux 5.10.5
device-dax: Fix range release
ext4: avoid s_mb_prefetch to be zero in individual scenarios
...Signed-off-by: Jason Liu
-
[ Upstream commit 1b04fa9900263b4e217ca2509fd778b32c2b4eb2 ]
PowerPC testing encountered boot failures due to RCU Tasks not being
fully initialized until core_initcall() time. This commit therefore
initializes RCU Tasks (along with Rude RCU and RCU Tasks Trace) just
before early_initcall() time, thus allowing waiting on RCU Tasks grace
periods from early_initcall() handlers.Link: https://lore.kernel.org/rcu/87eekfh80a.fsf@dja-thinkpad.axtens.net/
Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
Tested-by: Daniel Axtens
Signed-off-by: Uladzislau Rezki (Sony)
Signed-off-by: Paul E. McKenney
Signed-off-by: Sasha Levin -
[ Upstream commit ee61cfd955a64a58ed35cbcfc54068fcbd486945 ]
It adds a stub acpi_create_platform_device() for !CONFIG_ACPI build, so
that caller doesn't have to deal with !CONFIG_ACPI build issue.Reported-by: kernel test robot
Signed-off-by: Shawn Guo
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Sasha Levin -
commit 9b5948267adc9e689da609eb61cf7ed49cae5fa8 upstream.
With external metadata device, flush requests are not passed down to the
data device.Fix this by submitting the flush request in dm_integrity_flush_buffers. In
order to not degrade performance, we overlap the data device flush with
the metadata device flush.Reported-by: Lukas Straub
Signed-off-by: Mikulas Patocka
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer
Signed-off-by: Greg Kroah-Hartman -
commit dca5244d2f5b94f1809f0c02a549edf41ccd5493 upstream.
GCC versions >= 4.9 and < 5.1 have been shown to emit memory references
beyond the stack pointer, resulting in memory corruption if an interrupt
is taken after the stack pointer has been adjusted but before the
reference has been executed. This leads to subtle, infrequent data
corruption such as the EXT4 problems reported by Russell King at the
link below.Life is too short for buggy compilers, so raise the minimum GCC version
required by arm64 to 5.1.Reported-by: Russell King
Suggested-by: Arnd Bergmann
Signed-off-by: Will Deacon
Tested-by: Nathan Chancellor
Reviewed-by: Nick Desaulniers
Reviewed-by: Nathan Chancellor
Acked-by: Linus Torvalds
Cc:
Cc: Theodore Ts'o
Cc: Florian Weimer
Cc: Peter Zijlstra
Cc: Nick Desaulniers
Link: https://lore.kernel.org/r/20210105154726.GD1551@shell.armlinux.org.uk
Link: https://lore.kernel.org/r/20210112224832.10980-1-will@kernel.org
Signed-off-by: Catalin Marinas
Signed-off-by: Greg Kroah-Hartman -
The implementation was limiting the size of a message which can be
received to 4 but soem response can be bigger. For example the
response of the 'sc_seco_secvio_config' API is 6 words.This patch removes this limitation relying on the count of word
received instead of the index of the chan.
It does so by duplicating imx_scu_call_rpc as imx_scu_call_big_rpc
in order to cahnge the RX method using imx_scu_big_rx_callback
instead of imx_scu_rx_callback.Signed-off-by: Franck LENORMAND
17 Jan, 2021
1 commit
-
commit 2ca408d9c749c32288bc28725f9f12ba30299e8f upstream.
Commit
121b32a58a3a ("x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments")
converted native x86-32 which take 64-bit arguments to use the
compat handlers to allow conversion to passing args via pt_regs.
sys_fanotify_mark() was however missed, as it has a general compat
handler. Add a config option that will use the syscall wrapper that
takes the split args for native 32-bit.[ bp: Fix typo in Kconfig help text. ]
Fixes: 121b32a58a3a ("x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments")
Reported-by: Paweł Jasiak
Signed-off-by: Brian Gerst
Signed-off-by: Borislav Petkov
Acked-by: Jan Kara
Acked-by: Andy Lutomirski
Link: https://lkml.kernel.org/r/20201130223059.101286-1-brgerst@gmail.com
Signed-off-by: Greg Kroah-Hartman
13 Jan, 2021
5 commits
-
To unify the driver interface on 8q and 8m,
we need to add a new header file to define custom interfacesSigned-off-by: Ming Qian
Reviewed-by: Shijie Qin
(cherry picked from commit 47c536bc846dfec04cf5eb502b895a65ecd5376d) -
commit 9ad9f45b3b91162b33abfe175ae75ab65718dbf5 upstream.
'struct intel_svm' is shared by all devices bound to a give process,
but records only a single pointer to a 'struct intel_iommu'. Consequently,
cache invalidations may only be applied to a single DMAR unit, and are
erroneously skipped for the other devices.In preparation for fixing this, rework the structures so that the iommu
pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm'
to track them in its device list.Fixes: 1c4f88b7f1f9 ("iommu/vt-d: Shared virtual address in scalable mode")
Cc: Lu Baolu
Cc: Jacob Pan
Cc: Raj Ashok
Cc: David Woodhouse
Reported-by: Guo Kaijie
Reported-by: Xin Zeng
Signed-off-by: Guo Kaijie
Signed-off-by: Xin Zeng
Signed-off-by: Liu Yi L
Tested-by: Guo Kaijie
Cc: stable@vger.kernel.org # v5.0+
Acked-by: Lu Baolu
Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.com
Signed-off-by: Will Deacon
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 52abca64fd9410ea6c9a3a74eab25663b403d7da ]
blk_queue_enter() accepts BLK_MQ_REQ_PM requests independent of the runtime
power management state. Now that SCSI domain validation no longer depends
on this behavior, modify the behavior of blk_queue_enter() as follows:- Do not accept any requests while suspended.
- Only process power management requests while suspending or resuming.
Submitting BLK_MQ_REQ_PM requests to a device that is runtime suspended
causes runtime-suspended devices not to resume as they should. The request
which should cause a runtime resume instead gets issued directly, without
resuming the device first. Of course the device can't handle it properly,
the I/O fails, and the device remains suspended.The problem is fixed by checking that the queue's runtime-PM status isn't
RPM_SUSPENDED before allowing a request to be issued, and queuing a
runtime-resume request if it is. In particular, the inline
blk_pm_request_resume() routine is renamed blk_pm_resume_queue() and the
code is unified by merging the surrounding checks into the routine. If the
queue isn't set up for runtime PM, or there currently is no restriction on
allowed requests, the request is allowed. Likewise if the BLK_MQ_REQ_PM
flag is set and the status isn't RPM_SUSPENDED. Otherwise a runtime resume
is queued and the request is blocked until conditions are more suitable.[ bvanassche: modified commit message and removed Cc: stable because
without the previous patches from this series this patch would break
parallel SCSI domain validation + introduced queue_rpm_status() ]Link: https://lore.kernel.org/r/20201209052951.16136-9-bvanassche@acm.org
Cc: Jens Axboe
Cc: Christoph Hellwig
Cc: Hannes Reinecke
Cc: Can Guo
Cc: Stanley Chu
Cc: Ming Lei
Cc: Rafael J. Wysocki
Reported-and-tested-by: Martin Kepplinger
Reviewed-by: Hannes Reinecke
Reviewed-by: Can Guo
Signed-off-by: Alan Stern
Signed-off-by: Bart Van Assche
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin -
[ Upstream commit a4d34da715e3cb7e0741fe603dcd511bed067e00 ]
Remove flag RQF_PREEMPT and BLK_MQ_REQ_PREEMPT since these are no longer
used by any kernel code.Link: https://lore.kernel.org/r/20201209052951.16136-8-bvanassche@acm.org
Cc: Can Guo
Cc: Stanley Chu
Cc: Alan Stern
Cc: Ming Lei
Cc: Rafael J. Wysocki
Cc: Martin Kepplinger
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Jens Axboe
Reviewed-by: Can Guo
Signed-off-by: Bart Van Assche
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin -
[ Upstream commit 0854bcdcdec26aecdc92c303816f349ee1fba2bc ]
Introduce the BLK_MQ_REQ_PM flag. This flag makes the request allocation
functions set RQF_PM. This is the first step towards removing
BLK_MQ_REQ_PREEMPT.Link: https://lore.kernel.org/r/20201209052951.16136-3-bvanassche@acm.org
Cc: Alan Stern
Cc: Stanley Chu
Cc: Ming Lei
Cc: Rafael J. Wysocki
Cc: Can Guo
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: Jens Axboe
Reviewed-by: Can Guo
Signed-off-by: Bart Van Assche
Signed-off-by: Martin K. Petersen
Signed-off-by: Sasha Levin
09 Jan, 2021
4 commits
-
[ Upstream commit f7cfd871ae0c5008d94b6f66834e7845caa93c15 ]
Recently syzbot reported[0] that there is a deadlock amongst the users
of exec_update_mutex. The problematic lock ordering found by lockdep
was:perf_event_open (exec_update_mutex -> ovl_i_mutex)
chown (ovl_i_mutex -> sb_writes)
sendfile (sb_writes -> p->lock)
by reading from a proc file and writing to overlayfs
proc_pid_syscall (p->lock -> exec_update_mutex)While looking at possible solutions it occured to me that all of the
users and possible users involved only wanted to state of the given
process to remain the same. They are all readers. The only writer is
exec.There is no reason for readers to block on each other. So fix
this deadlock by transforming exec_update_mutex into a rw_semaphore
named exec_update_lock that only exec takes for writing.Cc: Jann Horn
Cc: Vasiliy Kulikov
Cc: Al Viro
Cc: Bernd Edlinger
Cc: Oleg Nesterov
Cc: Christopher Yeoh
Cc: Cyrill Gorcunov
Cc: Sargun Dhillon
Cc: Christian Brauner
Cc: Arnd Bergmann
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Arnaldo Carvalho de Melo
Fixes: eea9673250db ("exec: Add exec_update_mutex to replace cred_guard_mutex")
[0] https://lkml.kernel.org/r/00000000000063640c05ade8e3de@google.com
Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/87ft4mbqen.fsf@x220.int.ebiederm.org
Signed-off-by: Eric W. Biederman
Signed-off-by: Sasha Levin -
[ Upstream commit 31784cff7ee073b34d6eddabb95e3be2880a425c ]
In preparation for converting exec_update_mutex to a rwsem so that
multiple readers can execute in parallel and not deadlock, add
down_read_interruptible. This is needed for perf_event_open to be
converted (with no semantic changes) from working on a mutex to
wroking on a rwsem.Signed-off-by: Eric W. Biederman
Signed-off-by: Peter Zijlstra (Intel)
Link: https://lkml.kernel.org/r/87k0tybqfy.fsf@x220.int.ebiederm.org
Signed-off-by: Sasha Levin -
[ Upstream commit 0f9368b5bf6db0c04afc5454b1be79022a681615 ]
In preparation for converting exec_update_mutex to a rwsem so that
multiple readers can execute in parallel and not deadlock, add
down_read_killable_nested. This is needed so that kcmp_lock
can be converted from working on a mutexes to working on rw_semaphores.Signed-off-by: Eric W. Biederman
Signed-off-by: Peter Zijlstra (Intel)
Link: https://lkml.kernel.org/r/87o8jabqh3.fsf@x220.int.ebiederm.org
Signed-off-by: Sasha Levin -
commit aa8c7db494d0a83ecae583aa193f1134ef25d506 upstream.
Silly GCC doesn't always inline these trivial functions.
Fixes the following warning:
arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled
Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com
Signed-off-by: Josh Poimboeuf
Reported-by: Randy Dunlap
Acked-by: Randy Dunlap [build-tested]
Cc: Peter Zijlstra
Cc: Greg Kroah-Hartman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
06 Jan, 2021
1 commit
-
commit dc2da7b45ffe954a0090f5d0310ed7b0b37d2bd2 upstream.
VMware observed a performance regression during memmap init on their
platform, and bisected to commit 73a6e474cb376 ("mm: memmap_init:
iterate over memblock regions rather that check each PFN") causing it.Before the commit:
[0.033176] Normal zone: 1445888 pages used for memmap
[0.033176] Normal zone: 89391104 pages, LIFO batch:63
[0.035851] ACPI: PM-Timer IO Port: 0x448With commit
[0.026874] Normal zone: 1445888 pages used for memmap
[0.026875] Normal zone: 89391104 pages, LIFO batch:63
[2.028450] ACPI: PM-Timer IO Port: 0x448The root cause is the current memmap defer init doesn't work as expected.
Before, memmap_init_zone() was used to do memmap init of one whole zone,
to initialize all low zones of one numa node, but defer memmap init of
the last zone in that numa node. However, since commit 73a6e474cb376,
function memmap_init() is adapted to iterater over memblock regions
inside one zone, then call memmap_init_zone() to do memmap init for each
region.E.g, on VMware's system, the memory layout is as below, there are two
memory regions in node 2. The current code will mistakenly initialize the
whole 1st region [mem 0xab00000000-0xfcffffffff], then do memmap defer to
iniatialize only one memmory section on the 2nd region [mem
0x10000000000-0x1033fffffff]. In fact, we only expect to see that there's
only one memory section's memmap initialized. That's why more time is
costed at the time.[ 0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
[ 0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff]
[ 0.008843] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x55ffffffff]
[ 0.008844] ACPI: SRAT: Node 1 PXM 1 [mem 0x5600000000-0xaaffffffff]
[ 0.008844] ACPI: SRAT: Node 2 PXM 2 [mem 0xab00000000-0xfcffffffff]
[ 0.008845] ACPI: SRAT: Node 2 PXM 2 [mem 0x10000000000-0x1033fffffff]Now, let's add a parameter 'zone_end_pfn' to memmap_init_zone() to pass
down the real zone end pfn so that defer_init() can use it to judge
whether defer need be taken in zone wide.Link: https://lkml.kernel.org/r/20201223080811.16211-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20201223080811.16211-2-bhe@redhat.com
Fixes: commit 73a6e474cb376 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
Signed-off-by: Baoquan He
Reported-by: Rahul Gopakumar
Reviewed-by: Mike Rapoport
Cc: David Hildenbrand
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
04 Jan, 2021
3 commits
-
This is the 5.10.4 stable release
* tag 'v5.10.4': (717 commits)
Linux 5.10.4
x86/CPU/AMD: Save AMD NodeId as cpu_die_id
drm/edid: fix objtool warning in drm_cvt_modes()
...Signed-off-by: Jason Liu
Conflicts:
drivers/gpu/drm/imx/dcss/dcss-plane.c
drivers/media/i2c/ov5640.c -
This is the 5.10.3 stable release
* tag 'v5.10.3': (41 commits)
Linux 5.10.3
md: fix a warning caused by a race between concurrent md_ioctl()s
nl80211: validate key indexes for cfg80211_registered_device
...Signed-off-by: Jason Liu
-
This is the 5.10.2 stable release
* tag 'v5.10.2': (17 commits)
Linux 5.10.2
serial: 8250_omap: Avoid FIFO corruption caused by MDR1 access
ALSA: pcm: oss: Fix potential out-of-bounds shift
...Signed-off-by: Jason Liu
Conflicts:
drivers/usb/host/xhci-hub.c
drivers/usb/host/xhci.h
30 Dec, 2020
10 commits
-
commit 5812b32e01c6d86ba7a84110702b46d8a8531fe9 upstream.
Specify type alignment when declaring linker-section match-table entries
to prevent gcc from increasing alignment and corrupting the various
tables with padding (e.g. timers, irqchips, clocks, reserved memory).This is specifically needed on x86 where gcc (typically) aligns larger
objects like struct of_device_id with static extent on 32-byte
boundaries which at best prevents matching on anything but the first
entry. Specifying alignment when declaring variables suppresses this
optimisation.Here's a 64-bit example where all entries are corrupt as 16 bytes of
padding has been inserted before the first entry:ffffffff8266b4b0 D __clk_of_table
ffffffff8266b4c0 d __of_table_fixed_factor_clk
ffffffff8266b5a0 d __of_table_fixed_clk
ffffffff8266b680 d __clk_of_table_sentinelAnd here's a 32-bit example where the 8-byte-aligned table happens to be
placed on a 32-byte boundary so that all but the first entry are corrupt
due to the 28 bytes of padding inserted between entries:812b3ec0 D __irqchip_of_table
812b3ec0 d __of_table_irqchip1
812b3fa0 d __of_table_irqchip2
812b4080 d __of_table_irqchip3
812b4160 d irqchip_of_match_endVerified on x86 using gcc-9.3 and gcc-4.9 (which uses 64-byte
alignment), and on arm using gcc-7.2.Note that there are no in-tree users of these tables on x86 currently
(even if they are included in the image).Fixes: 54196ccbe0ba ("of: consolidate linker section OF match table declarations")
Fixes: f6e916b82022 ("irqchip: add basic infrastructure")
Cc: stable # 3.9
Signed-off-by: Johan Hovold
Link: https://lore.kernel.org/r/20201123102319.8090-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman -
commit 0fb6ee8d0b5e90b72f870f76debc8bd31a742014 upstream.
Use a heap allocated memory for the SPI transfer buffer. Using stack memory
can corrupt stack memory when using DMA on some systems.This change moves the buffer from the stack of the trigger handler call to
the heap of the buffer of the state struct. The size increases takes into
account the alignment for the timestamp, which is 8 bytes.The 'data' buffer is split into 'tx_buf' and 'rx_buf', to make a clearer
separation of which part of the buffer should be used for TX & RX.Fixes: af3008485ea03 ("iio:adc: Add common code for ADI Sigma Delta devices")
Signed-off-by: Lars-Peter Clausen
Signed-off-by: Alexandru Ardelean
Link: https://lore.kernel.org/r/20201124123807.19717-1-alexandru.ardelean@analog.com
Cc:
Signed-off-by: Jonathan Cameron
Signed-off-by: Greg Kroah-Hartman -
commit fecc4559780d52d174ea05e3bf543669165389c3 upstream.
fsnotify_parent() used to send two separate events to backends when a
parent inode is watching children and the child inode is also watching.
In an attempt to avoid duplicate events in fanotify, we unified the two
backend callbacks to a single callback and handled the reporting of the
two separate events for the relevant backends (inotify and dnotify).
However the handling is buggy and can result in inotify and dnotify
listeners receiving events of the type they never asked for or spurious
events.The problem is the unified event callback with two inode marks (parent and
child) is called when any of the parent and child inodes are watched and
interested in the event, but the parent inode's mark that is interested
in the event on the child is not necessarily the one we are currently
reporting to (it could belong to a different group).So before reporting the parent or child event flavor to backend we need
to check that the mark is really interested in that event flavor.The semantics of INODE and CHILD marks were hard to follow and made the
logic more complicated than it should have been. Replace it with INODE
and PARENT marks semantics to hopefully make the logic more clear.Thanks to Hugh Dickins for spotting a bug in the earlier version of this
patch.Fixes: 497b0c5a7c06 ("fsnotify: send event to parent and child with single callback")
CC: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201202120713.702387-4-amir73il@gmail.com
Reported-by: Hugh Dickins
Signed-off-by: Amir Goldstein
Signed-off-by: Jan Kara
Signed-off-by: Greg Kroah-Hartman -
commit 950cc0d2bef078e1f6459900ca4d4b2a2e0e3c37 upstream.
The handle_inode_event() interface was added as (quoting comment):
"a simple variant of handle_event() for groups that only have inode
marks and don't have ignore mask".In other words, all backends except fanotify. The inotify backend
also falls under this category, but because it required extra arguments
it was left out of the initial pass of backends conversion to the
simple interface.This results in code duplication between the generic helper
fsnotify_handle_event() and the inotify_handle_event() callback
which also happen to be buggy code.Generalize the handle_inode_event() arguments and add the check for
FS_EXCL_UNLINK flag to the generic helper, so inotify backend could
be converted to use the simple interface.Link: https://lore.kernel.org/r/20201202120713.702387-2-amir73il@gmail.com
CC: stable@vger.kernel.org
Fixes: b9a1b9772509 ("fsnotify: create method handle_inode_event() in fsnotify_operations")
Signed-off-by: Amir Goldstein
Signed-off-by: Jan Kara
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit c6c75deda81344c3a95d1d1f606d5cee109e5d54 ]
Commit 1fde6f21d90f ("proc: fix /proc/net/* after setns(2)") only forced
revalidation of regular files under /proc/net/However, /proc/net/ is unusual in the sense of /proc/net/foo handlers
take netns pointer from parent directory which is old netns.Steps to reproduce:
(void)open("/proc/net/sctp/snmp", O_RDONLY);
unshare(CLONE_NEWNET);int fd = open("/proc/net/sctp/snmp", O_RDONLY);
read(fd, &c, 1);Read will read wrong data from original netns.
Patch forces lookup on every directory under /proc/net .
Link: https://lkml.kernel.org/r/20201205160916.GA109739@localhost.localdomain
Fixes: 1da4d377f943 ("proc: revalidate misc dentries")
Signed-off-by: Alexey Dobriyan
Reported-by: "Rantala, Tommi T. (Nokia - FI/Espoo)"
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin -
[ Upstream commit 013339df116c2ee0d796dd8bfb8f293a2030c063 ]
Since commit 369ea8242c0f ("mm/rmap: update to new mmu_notifier semantic
v2"), the code to check the secondary MMU's page table access bit is
broken for !(TTU_IGNORE_ACCESS) because the page is unmapped from the
secondary MMU's page table before the check. More specifically for those
secondary MMUs which unmap the memory in
mmu_notifier_invalidate_range_start() like kvm.However memory reclaim is the only user of !(TTU_IGNORE_ACCESS) or the
absence of TTU_IGNORE_ACCESS and it explicitly performs the page table
access check before trying to unmap the page. So, at worst the reclaim
will miss accesses in a very short window if we remove page table access
check in unmapping code.There is an unintented consequence of !(TTU_IGNORE_ACCESS) for the memcg
reclaim. From memcg reclaim the page_referenced() only account the
accesses from the processes which are in the same memcg of the target page
but the unmapping code is considering accesses from all the processes, so,
decreasing the effectiveness of memcg reclaim.The simplest solution is to always assume TTU_IGNORE_ACCESS in unmapping
code.Link: https://lkml.kernel.org/r/20201104231928.1494083-1-shakeelb@google.com
Fixes: 369ea8242c0f ("mm/rmap: update to new mmu_notifier semantic v2")
Signed-off-by: Shakeel Butt
Acked-by: Johannes Weiner
Cc: Hugh Dickins
Cc: Jerome Glisse
Cc: Vlastimil Babka
Cc: Michal Hocko
Cc: Andrea Arcangeli
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin -
[ Upstream commit 57efa1fe5957694fa541c9062de0a127f0b9acb0 ]
Since commit 70e806e4e645 ("mm: Do early cow for pinned pages during
fork() for ptes") pages under a FOLL_PIN will not be write protected
during COW for fork. This means that pages returned from
pin_user_pages(FOLL_WRITE) should not become write protected while the pin
is active.However, there is a small race where get_user_pages_fast(FOLL_PIN) can
establish a FOLL_PIN at the same time copy_present_page() is write
protecting it:CPU 0 CPU 1
get_user_pages_fast()
internal_get_user_pages_fast()
copy_page_range()
pte_alloc_map_lock()
copy_present_page()
atomic_read(has_pinned) == 0
page_maybe_dma_pinned() == false
atomic_set(has_pinned, 1);
gup_pgd_range()
gup_pte_range()
pte_t pte = gup_get_pte(ptep)
pte_access_permitted(pte)
try_grab_compound_head()
pte = pte_wrprotect(pte)
set_pte_at();
pte_unmap_unlock()
// GUP now returns with a write protected pageThe first attempt to resolve this by using the write protect caused
problems (and was missing a barrrier), see commit f3c64eda3e50 ("mm: avoid
early COW write protect games during fork()")Instead wrap copy_p4d_range() with the write side of a seqcount and check
the read side around gup_pgd_range(). If there is a collision then
get_user_pages_fast() fails and falls back to slow GUP.Slow GUP is safe against this race because copy_page_range() is only
called while holding the exclusive side of the mmap_lock on the src
mm_struct.[akpm@linux-foundation.org: coding style fixes]
Link: https://lore.kernel.org/r/CAHk-=wi=iCnYCARbPGjkVJu9eyYeZ13N64tZYLdOB8CP5Q_PLw@mail.gmail.comLink: https://lkml.kernel.org/r/2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com
Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()")
Signed-off-by: Jason Gunthorpe
Suggested-by: Linus Torvalds
Reviewed-by: John Hubbard
Reviewed-by: Jan Kara
Reviewed-by: Peter Xu
Acked-by: "Ahmed S. Darwish" [seqcount_t parts]
Cc: Andrea Arcangeli
Cc: "Aneesh Kumar K.V"
Cc: Christoph Hellwig
Cc: Hugh Dickins
Cc: Jann Horn
Cc: Kirill Shutemov
Cc: Kirill Tkhai
Cc: Leon Romanovsky
Cc: Michal Hocko
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin -
[ Upstream commit 88149082bb8ef31b289673669e080ec6a00c2e59 ]
If generic_drop_inode() returns true, it means iput_final() can evict
this inode regardless of whether it is dirty or not. If we check
I_DONTCACHE in generic_drop_inode(), any inode with this bit set will be
evicted unconditionally. This is not the desired behavior because
I_DONTCACHE only means the inode shouldn't be cached on the LRU list.
As for whether we need to evict this inode, this is what
generic_drop_inode() should do. This patch corrects the usage of
I_DONTCACHE.This patch was proposed in [1].
[1]: https://lore.kernel.org/linux-fsdevel/20200831003407.GE12096@dread.disaster.area/
Fixes: dae2f8ed7992 ("fs: Lift XFS_IDONTCACHE to the VFS layer")
Signed-off-by: Hao Li
Reviewed-by: Dave Chinner
Reviewed-by: Ira Weiny
Signed-off-by: Al Viro
Signed-off-by: Sasha Levin -
[ Upstream commit d9a9280a0d0ae51dc1d4142138b99242b7ec8ac6 ]
Building with W=2 prints a number of warnings for one function that
has a pointer type mismatch:linux/seq_buf.h: In function 'seq_buf_init':
linux/seq_buf.h:35:12: warning: pointer targets in assignment from 'unsigned char *' to 'char *' differ in signedness [-Wpointer-sign]Change the type in the function prototype according to the type in
the structure.Link: https://lkml.kernel.org/r/20201026161108.3707783-1-arnd@kernel.org
Fixes: 9a7777935c34 ("tracing: Convert seq_buf fields to be like seq_file fields")
Reviewed-by: Cezary Rojewski
Signed-off-by: Arnd Bergmann
Signed-off-by: Steven Rostedt (VMware)
Signed-off-by: Sasha Levin -
[ Upstream commit d5aa6b22e2258f05317313ecc02efbb988ed6d38 ]
According to RFC5666, the correct netid for an IPv6 addressed RDMA
transport is "rdma6", which we've supported as a mount option since
Linux-4.7. The problem is when we try to load the module "xprtrdma6",
that will fail, since there is no modulealias of that name.Fixes: 181342c5ebe8 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6")
Signed-off-by: Trond Myklebust
Signed-off-by: Sasha Levin
26 Dec, 2020
1 commit
-
commit 159e1de201b6fca10bfec50405a3b53a561096a8 upstream.
It's possible to create a duplicate filename in an encrypted directory
by creating a file concurrently with adding the encryption key.Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or
sys_symlink()) can lookup the target filename while the directory's
encryption key hasn't been added yet, resulting in a negative no-key
dentry. The VFS then calls ->create() (or ->mkdir(), ->mknod(), or
->symlink()) because the dentry is negative. Normally, ->create() would
return -ENOKEY due to the directory's key being unavailable. However,
if the key was added between the dentry lookup and ->create(), then the
filesystem will go ahead and try to create the file.If the target filename happens to already exist as a normal name (not a
no-key name), a duplicate filename may be added to the directory.In order to fix this, we need to fix the filesystems to prevent
->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names.
(->rename() and ->link() need it too, but those are already handled
correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().)In preparation for this, add a helper function fscrypt_is_nokey_name()
that filesystems can use to do this check. Use this helper function for
the existing checks that fs/crypto/ does for rename and link.Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201118075609.120337-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers
Signed-off-by: Greg Kroah-Hartman
21 Dec, 2020
1 commit
-
commit 8010622c86ca5bb44bc98492f5968726fc7c7a21 upstream.
UAS does not share the pessimistic assumption storage is making that
devices cannot deal with WRITE_SAME. A few devices supported by UAS,
are reported to not deal well with WRITE_SAME. Those need a quirk.Add it to the device that needs it.
Reported-by: David C. Partridge
Signed-off-by: Oliver Neukum
Cc: stable
Link: https://lore.kernel.org/r/20201209152639.9195-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman
18 Dec, 2020
3 commits
-
* usb/next: (86 commits)
LF-2482 usb: typec: tcpm: fix uninitialized value ret
LF-2345-12 usb: typec: tcpm: use vbus_present for power supply online
LF-2345-11 usb: typec: tcpm: add BC charger types if power type is usb
LF-2345-10 usb: typec: tcpci: handle fault event
LF-2345-9 usb: typec: tcpm: remove logically dead code
... -
* thermal/next: (9 commits)
LF-2630 thermal: imx_sc: Omit getting temperature error message for powered off resource
LF-2402 thermal: imx: Correct run_measurement check method
LF-1486 thermal: imx8mm: Make sure no out-of-bounds access in test_bit
LF-1228-3 thermal: imx: Add device cooling support
LF-1228-2 thermal: imx8mm: Add device cooling support
... -
* sdhc/next: (14 commits)
mmc: handle voltage parsing failure
mmc: sdhci-of-esdhc: support ACPI
mmc: use generic device properties in mmc_of_parse_voltage
mmc: host: imx: validate pinctrl before use it
MLK-24515 mmc: sdio: add a delay to call sdio_irq_work when sdio bus resume
...