Eric Lee / smarc-fsl-linux-kernel

05 Aug, 2020

1 commit

99ea1521a Merge tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull uninitialized_var() macro removal from Kees Cook:
"This is long overdue, and has hidden too many bugs over the years. The
series has several "by hand" fixes, and then a trivial treewide
replacement.

- Clean up non-trivial uses of uninitialized_var()

- Update documentation and checkpatch for uninitialized_var() removal

- Treewide removal of uninitialized_var()"

* tag 'uninit-macro-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
compiler: Remove uninitialized_var() macro
treewide: Remove uninitialized_var() usage
checkpatch: Remove awareness of uninitialized_var() macro
mm/debug_vm_pgtable: Remove uninitialized_var() usage
f2fs: Eliminate usage of uninitialized_var() macro
media: sur40: Remove uninitialized_var() usage
KVM: PPC: Book3S PR: Remove uninitialized_var() usage
clk: spear: Remove uninitialized_var() usage
clk: st: Remove uninitialized_var() usage
spi: davinci: Remove uninitialized_var() usage
ide: Remove uninitialized_var() usage
rtlwifi: rtl8192cu: Remove uninitialized_var() usage
b43: Remove uninitialized_var() usage
drbd: Remove uninitialized_var() usage
x86/mm/numa: Remove uninitialized_var() usage
docs: deprecated.rst: Add uninitialized_var()

Linus Torvalds
2020-08-05 04:49:43 +0800

17 Jul, 2020

1 commit

3f649ab72 treewide: Remove uninitialized_var() usage

Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.

In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:

git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
	xargs perl -pi -e \
		's/\buninitialized_var\(([^\)]+)\)/\1/g;
		 s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'

drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.

No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.

[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/

Reviewed-by: Leon Romanovsky # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe # IB
Acked-by: Kalle Valo # wireless drivers
Reviewed-by: Chao Yu # erofs
Signed-off-by: Kees Cook

Kees Cook
2020-07-17 03:35:15 +0800
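For readers who have not met the macro that the commit above removes: the GCC definition was a self-assignment, which silences the "may be used uninitialized" warning without ever giving the variable a value. The sketch below quotes that definition in a comment only and shows the plain explicit initialization the commit message recommends instead. The function `find_key` and its parameters are hypothetical, chosen purely for illustration.

```c
#include <stddef.h>

/*
 * Historical GCC definition, shown for reference only:
 *
 *     #define uninitialized_var(x) x = x
 *
 * so "int uninitialized_var(ret);" expanded to "int ret = ret;".
 * The warning goes away, but ret still holds no defined value.
 */

/* Hypothetical helper: index of key in keys[0..n), or -1 if absent. */
int find_key(const int *keys, size_t n, int key)
{
	int ret = -1;	/* explicit initialization instead of uninitialized_var(ret) */
	size_t i;

	for (i = 0; i < n; i++) {
		if (keys[i] == key) {
			ret = (int)i;
			break;
		}
	}
	return ret;
}
```

With the explicit initializer, the "not found" path returns a defined value, and the compiler remains free to warn about other problems with `ret`.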

08 Jul, 2020

1 commit

20f829999 gfs2: Rework read and page fault locking

So far, gfs2 has taken the inode glocks inside the ->readpage and
->readahead address space operations. Since commit d4388340ae0b ("fs:
convert mpage_readpages to mpage_readahead"), gfs2_readahead is passed
the pages to read ahead locked. With that, the current holder of the
inode glock may be trying to lock one of those pages while
gfs2_readahead is trying to take the inode glock, resulting in a
deadlock.

Fix that by moving the lock taking to the higher-level ->read_iter file
and ->fault vm operations. This also gets rid of an ugly lock inversion
workaround in gfs2_readpage.

The cache consistency model of filesystems like gfs2 is such that if
data is found in the page cache, the data is up to date and can be used
without taking any filesystem locks. If a page is not cached,
filesystem locks must be taken before populating the page cache.

To avoid taking the inode glock when the data is already cached,
gfs2_file_read_iter first tries to read the data with the IOCB_NOIO flag
set. If that fails, the inode glock is taken and the operation is
retried with the IOCB_NOIO flag cleared.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-07-08 05:40:12 +0800
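The two-pass read described in the commit above can be sketched in userspace C. This is a toy model under stated assumptions, not the gfs2 code: `toy_inode`, `toy_generic_read` and `toy_file_read` are hypothetical names, the page cache is reduced to a single flag, and a cache miss under the no-I/O flag is assumed to fail with -EAGAIN.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the page cache and the inode glock. */
struct toy_inode {
	bool cached;		/* is the data already in the page cache? */
	int glock_holds;	/* how many times the glock was taken */
};

/* Read attempt; with noio set it must not start I/O and fails on a miss. */
static int toy_generic_read(struct toy_inode *ip, bool noio)
{
	if (ip->cached)
		return 0;
	if (noio)
		return -EAGAIN;
	ip->cached = true;	/* the "I/O" populates the cache */
	return 0;
}

/* Two-pass read: lockless fast path first, locked slow path on a miss. */
int toy_file_read(struct toy_inode *ip)
{
	int ret = toy_generic_read(ip, true);	/* pass 1: no-I/O flag set */

	if (ret != -EAGAIN)
		return ret;

	ip->glock_holds++;			/* take the inode glock */
	ret = toy_generic_read(ip, false);	/* pass 2: no-I/O flag cleared */
	/* the glock would be released here */
	return ret;
}
```

A read of cached data never touches `glock_holds`; only the first read of uncached data takes the lock once.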

03 Jul, 2020

5 commits

c860f8ffb gfs2: The freeze glock should never be frozen

Before this patch, some gfs2 code locked the freeze glock with the
LM_FLAG_NOEXP (do not freeze) flag, and some did not. We never want to
freeze the freeze glock, so this patch makes it use LM_FLAG_NOEXP consistently.

Signed-off-by: Bob Peterson

Bob Peterson
2020-07-03 18:05:35 +0800
623ba664b gfs2: When freezing gfs2, use GL_EXACT and not GL_NOCACHE

Before this patch, the freeze code in gfs2 specified GL_NOCACHE in
several places. That's wrong because we always want to know the state
of whether the file system is frozen.

There was also a problem with freeze/thaw transitioning the glock from
frozen (EX) to thawed (SH) because gfs2 will normally grant glocks in EX
to processes that request it in SH mode, unless GL_EXACT is specified.
Therefore, the freeze/thaw code, which tried to reacquire the glock in
SH mode would get the glock in EX mode, and miss the transition from EX
to SH. That made it think the thaw had completed normally, but since the
glock was still cached in EX, other nodes could not freeze again.

This patch removes the GL_NOCACHE flag to allow the freeze glock to be
cached. It also adds the GL_EXACT flag so the glock is fully transitioned
from EX to SH, thereby allowing future freeze operations.

Signed-off-by: Bob Peterson

Bob Peterson
2020-07-03 18:05:35 +0800
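The GL_EXACT behaviour described in the commit above comes down to one compatibility rule: a glock cached in EX normally satisfies an SH request, unless the requester insists on the exact state. A minimal sketch of that rule follows; the enum and function names are hypothetical, and the real grant logic handles more states and flags than this.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of glock states. */
enum toy_state { ST_UNLOCKED, ST_SHARED, ST_EXCLUSIVE };

/*
 * Can a request for "wanted" be satisfied by a glock currently cached in
 * "held"? With exact set (modelling GL_EXACT), only an identical state will
 * do, which forces a real EX to SH transition.
 */
bool toy_cached_state_satisfies(enum toy_state held, enum toy_state wanted,
				bool exact)
{
	if (held == wanted)
		return true;
	if (exact)
		return false;
	return held == ST_EXCLUSIVE && wanted == ST_SHARED;
}
```

Without `exact`, the thaw path's SH request is answered from the cached EX state, which is the missed transition the commit message describes.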
b780cc615 gfs2: read-only mounts should grab the sd_freeze_gl glock

Before this patch, only read-write mounts would grab the freeze
glock in read-only mode, as part of gfs2_make_fs_rw. So, on read-only
mounts, the freeze glock was never initialized. That meant requests to
freeze, which request the glock in EX, were granted without any state
transition, and you could mount a gfs2 file system that is currently
frozen on a different cluster node in read-only mode.

This patch makes read-only mounts lock the freeze glock in SH mode,
which will block for file systems that are frozen on another node.

Signed-off-by: Bob Peterson

Bob Peterson
2020-07-03 18:05:35 +0800
541656d3a gfs2: freeze should work on read-only mounts

Before this patch, function freeze_go_sync, called when promoting
the freeze glock, was testing for the SDF_JOURNAL_LIVE superblock flag.
That's only set for read-write mounts. Read-only mounts don't use a
journal, so the bit is never set and the freeze never happened.

This patch removes the check for SDF_JOURNAL_LIVE for freeze requests
but still checks it when deciding whether to flush a journal.

Signed-off-by: Bob Peterson

Bob Peterson
2020-07-03 18:05:35 +0800
7542486b8 gfs2: eliminate GIF_ORDERED in favor of list_empty

In several places, we used the GIF_ORDERED inode flag to determine
if an inode was on the ordered writes list. However, since we always
held the sd_ordered_lock spin_lock during the manipulation, we can
just as easily check list_empty(&ip->i_ordered) instead.
This allows us to keep more than one ordered writes list to make
journal writing improvements.

This patch eliminates GIF_ORDERED in favor of checking list_empty.

Signed-off-by: Bob Peterson

Bob Peterson
2020-07-03 18:05:34 +0800
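The idea in the commit above, reading list membership from the list node itself instead of from a separate flag, can be sketched with a minimal userspace copy of the kernel's circular list. The `toy_inode` type and `toy_is_ordered` are hypothetical, and the spin lock that protects the real list is left out. The check only works if a node is initialized before use and re-initialized on removal, which is why the sketch uses a `list_del_init`-style removal.

```c
#include <stdbool.h>

/* Minimal userspace copy of a circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert entry right after head. */
void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink and re-initialize, so list_empty(entry) is true afterwards. */
void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	init_list_head(entry);
}

/* Hypothetical inode: membership is read from the node, no flag needed. */
struct toy_inode {
	struct list_head i_ordered;
};

bool toy_is_ordered(const struct toy_inode *ip)
{
	return !list_empty(&ip->i_ordered);
}
```

Because the answer no longer depends on a single per-inode flag, the same node could sit on any one of several lists and the check would still hold.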

30 Jun, 2020

3 commits

34244d711 gfs2: Don't sleep during glock hash walk

In flush_delete_work, instead of flushing each individual pending
delayed work item, cancel and re-queue them for immediate execution.
The waiting isn't needed here because we're already waiting for all
queued work items to complete in gfs2_flush_delete_work. This makes the
code more efficient, but more importantly, it avoids sleeping during a
rhashtable walk, inside rcu_read_lock().

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-30 19:04:45 +0800
58e08e8d8 gfs2: fix trans slab error when withdraw occurs inside log_flush

Log flush operations (gfs2_log_flush()) can target a specific transaction.
But if the function encounters errors (e.g. I/O errors) and withdraws,
the transaction was only freed if it was queued to one of the ail lists.
If the withdraw occurred before the transaction was queued to the ail1
list, function ail_drain never freed it. The result was:

BUG gfs2_trans: Objects remaining in gfs2_trans on __kmem_cache_shutdown()

This patch makes log_flush() add the targeted transaction to the ail1
list so that function ail_drain() will find and free it properly.

Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-30 19:04:45 +0800
5902f4dd6 gfs2: Don't return NULL from gfs2_inode_lookup

Callers expect gfs2_inode_lookup to return an inode pointer or ERR_PTR(error).
Commit b66648ad6dcf caused it to return NULL instead of ERR_PTR(-ESTALE) in
some cases. Fix that.

Reported-by: Dan Carpenter
Fixes: b66648ad6dcf ("gfs2: Move inode generation number check into gfs2_inode_lookup")
Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-30 19:04:45 +0800
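Why a NULL return is a bug here is easier to see with the error-pointer convention written out. The sketch below re-creates ERR_PTR, PTR_ERR and IS_ERR in userspace (the kernel provides its own versions) and adds a hypothetical `toy_inode_lookup`, in which a generation of zero is assumed to mean "any generation". It is an illustration of the convention, not the gfs2 function.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct toy_inode {
	unsigned long generation;
};

/*
 * Hypothetical lookup: a generation mismatch yields ERR_PTR(-ESTALE).
 * Returning NULL instead would slip past a caller's IS_ERR() check.
 */
struct toy_inode *toy_inode_lookup(struct toy_inode *ip, unsigned long want_gen)
{
	if (want_gen && ip->generation != want_gen)
		return ERR_PTR(-ESTALE);
	return ip;
}
```

Note that `IS_ERR(NULL)` is false, so a caller that checks only `IS_ERR()` would go on to dereference a NULL result.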

09 Jun, 2020

1 commit

ca687877e Merge tag 'gfs2-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 updates from Andreas Gruenbacher:

- An iopen glock locking scheme rework that speeds up deletes of inodes
accessed from multiple nodes

- Various bug fixes and debugging improvements

- Convert gfs2-glocks.txt to ReST

* tag 'gfs2-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: fix use-after-free on transaction ail lists
gfs2: new slab for transactions
gfs2: initialize transaction tr_ailX_lists earlier
gfs2: Smarter iopen glock waiting
gfs2: Wake up when setting GLF_DEMOTE
gfs2: Check inode generation number in delete_work_func
gfs2: Move inode generation number check into gfs2_inode_lookup
gfs2: Minor gfs2_lookup_by_inum cleanup
gfs2: Try harder to delete inodes locally
gfs2: Give up the iopen glock on contention
gfs2: Turn gl_delete into a delayed work
gfs2: Keep track of deleted inode generations in LVBs
gfs2: Allow ASPACE glocks to also have an lvb
gfs2: instrumentation wrt log_flush stuck
gfs2: introduce new gfs2_glock_assert_withdraw
gfs2: print mapping->nrpages in glock dump for address space glocks
gfs2: Only do glock put in gfs2_create_inode for free inodes
gfs2: Allow lock_nolock mount to specify jid=X
gfs2: Don't ignore inode write errors during inode_go_sync
docs: filesystems: convert gfs2-glocks.txt to ReST

Linus Torvalds
2020-06-09 03:47:09 +0800

06 Jun, 2020

16 commits

0b166a57e Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
"A lot of bug fixes and cleanups for ext4, including:

- Fix performance problems found in dioread_nolock now that it is the
default, caused by transaction leaks.

- Clean up fiemap handling in ext4

- Clean up and refactor multiple block allocator (mballoc) code

- Fix a problem with mballoc with smaller file systems running out
of blocks because they couldn't properly use blocks that had been
reserved by inode preallocation.

- Fix a race in ext4_sync_parent() versus rename()

- Simplify the error handling in the extent manipulation code

- Make sure all metadata I/O errors are reflected to
ext4_ext_dirty()'s and ext4_mark_inode_dirty()'s callers.

- Avoid passing an error pointer to brelse in ext4_xattr_set()

- Fix race which could result in freeing an inode on the dirty list
in data=journal mode.

- Fix refcount handling if ext4_iget() fails

- Fix a crash in generic/019 caused by a corrupted extent node"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits)
ext4: avoid unnecessary transaction starts during writeback
ext4: don't block for O_DIRECT if IOCB_NOWAIT is set
ext4: remove the access_ok() check in ext4_ioctl_get_es_cache
fs: remove the access_ok() check in ioctl_fiemap
fs: handle FIEMAP_FLAG_SYNC in fiemap_prep
fs: move fiemap range validation into the file systems instances
iomap: fix the iomap_fiemap prototype
fs: move the fiemap definitions out of fs.h
fs: mark __generic_block_fiemap static
ext4: remove the call to fiemap_check_flags in ext4_fiemap
ext4: split _ext4_fiemap
ext4: fix fiemap size checks for bitmap files
ext4: fix EXT4_MAX_LOGICAL_BLOCK macro
add comment for ext4_dir_entry_2 file_type member
jbd2: avoid leaking transaction credits when unreserving handle
ext4: drop ext4_journal_free_reserved()
ext4: mballoc: use lock for checking free blocks while retrying
ext4: mballoc: refactor ext4_mb_good_group()
ext4: mballoc: introduce pcpu seqcnt for freeing PA to improve ENOSPC handling
ext4: mballoc: refactor ext4_mb_discard_preallocations()
...

Linus Torvalds
2020-06-06 07:19:28 +0800
300e549b6 Merge branch 'gfs2-iopen' into for-next

Andreas Gruenbacher
2020-06-06 03:25:36 +0800
83d060ca8 gfs2: fix use-after-free on transaction ail lists

Before this patch, transactions could be merged into the system
transaction by function gfs2_merge_trans(), but the transaction ail
lists were never merged. Because the ail flushing mechanism can run
separately, bd elements can be attached to the transaction's buffer
list during the transaction (trans_add_meta, etc) but quickly moved
to its ail lists. Later, the transaction can be freed (by
gfs2_trans_end) while it still has bd elements
queued to its ail lists, which can cause it to either lose track of
the bd elements altogether (memory leak) or worse, reference the bd
elements after the parent transaction has been freed.

Although I've not seen any serious consequences, the problem becomes
apparent with the previous patch's addition of:

gfs2_assert_warn(sdp, list_empty(&tr->tr_ail1_list));

to function gfs2_trans_free().

This patch adds logic into gfs2_merge_trans() to move the merged
transaction's ail lists to the sdp transaction. This prevents the
use-after-free. To do this properly, we need to hold the ail lock,
so we pass sdp into the function instead of the transaction itself.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-06 03:24:25 +0800
b839dadae gfs2: new slab for transactions

This patch adds a new slab for gfs2 transactions. That allows us to
reduce kernel memory fragmentation and better organize data
for analysis of vmcore dumps. A new centralized function is added to
free the slab objects, and it exposes use-after-free by giving
warnings if a transaction is freed while it still has bd elements
attached to its buffers or ail lists. We make sure to initialize
those transaction ail lists so we can check their integrity when freeing.

At a later time, we should add a slab initialization function to
make it more efficient, but for this initial patch I wanted to
minimize the impact.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-06 03:24:25 +0800
cbcc89b63 gfs2: initialize transaction tr_ailX_lists earlier

Since transactions may be freed shortly after they're created, before
a log_flush occurs, we need to initialize their ail1 and ail2 lists
earlier. Before this patch, the ail1 list was initialized in gfs2_log_flush().
This moves the initialization to the point when the transaction is first
created.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-06 03:24:25 +0800
9e8990dea gfs2: Smarter iopen glock waiting

When trying to upgrade the iopen glock from a shared to an exclusive lock in
gfs2_evict_inode, abort the wait if there is contention on the corresponding
inode glock: in that case, the inode must still be in active use on another
node, and we're not guaranteed to get the iopen glock anytime soon.

To make this work even better, when we notice contention on the iopen glock and
we can't evict the corresponding inode and release the iopen glock immediately,
poke the inode glock. The other node(s) trying to acquire the lock can then
abort instead of timing out.

Thanks to Heinz Mauelshagen for pointing out a locking bug in a previous
version of this patch.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-06 02:19:21 +0800
35b6f8fbc gfs2: Wake up when setting GLF_DEMOTE

Wake up the sdp->sd_async_glock_wait wait queue when setting the GLF_DEMOTE
flag.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-06 02:19:21 +0800
b0dcffd8d gfs2: Check inode generation number in delete_work_func

In delete_work_func, if the iopen glock still has an inode attached,
limit the inode lookup to that specific generation number: in the likely
case that the inode was deleted on the node on which the inode's link
count dropped to zero, we can skip verifying the on-disk block type and
reading in the inode. The same applies if another node that had the
inode open managed to delete the inode before us.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-06 02:19:21 +0800
b66648ad6 gfs2: Move inode generation number check into gfs2_inode_lookup

Move the inode generation number check from gfs2_lookup_by_inum into
gfs2_inode_lookup: gfs2_inode_lookup may be able to decide that an inode with
the given inode generation number cannot exist without having to verify the
block type or reading the inode from disk.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-06 02:19:21 +0800
6bdcadea7 gfs2: Minor gfs2_lookup_by_inum cleanup

Use a zero no_formal_ino instead of a NULL pointer to indicate that any inode
generation number will qualify: a valid inode never has a zero no_formal_ino.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-06 02:19:21 +0800
9e73330f2 gfs2: Try harder to delete inodes locally

When an inode's link count drops to zero and the inode is cached on
other nodes, the current behavior of gfs2 is to immediately give up and
to rely on the other node(s) to delete the inode if there is iopen glock
contention. This leads to resource group glock bouncing and the loss of
caching. With the previous patches in place, we can fix that by not
giving up immediately.

When the inode is still open on other nodes, those nodes won't be able
to evict the inode and give up the iopen glock. In that case, our lock
conversion request will time out. The unlink system call will block for
the duration of the iopen lock conversion request. We're also holding
the inode glock in EX mode for an extended duration, so other nodes
won't be able to make progress on the inode, either.

This is worse than what we had before, but we can prevent other nodes
from getting stuck by aborting our iopen locking request if there is
contention on the inode glock. This will be the subject of a future
patch.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-06 02:19:21 +0800
8c7b9262a gfs2: Give up the iopen glock on contention

When there's contention on the iopen glock, it means that the link count
of the corresponding inode has dropped to zero on a remote node which is
now trying to delete the inode. In that case, try to evict the inode so
that the iopen glock will be released, which will allow the remote node
to do its job.

When the inode is still open locally, the inode's reference count won't
drop to zero and so we'll keep holding the inode and its iopen glock.
The remote node will time out its request to grab the iopen glock, and
when the inode is finally closed locally, we'll try to delete it
ourselves.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-06 02:19:21 +0800
a0e3cc65f gfs2: Turn gl_delete into a delayed work

This requires flushing delayed work items in gfs2_make_fs_ro (which is called
before unmounting a filesystem).

When inodes are deleted and then recreated, pending gl_delete work items would
have no effect because the inode generations will have changed, so we can
cancel any pending gl_delete works before reusing iopen glocks.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-06 02:19:21 +0800
f286d627e gfs2: Keep track of deleted inode generations in LVBs

When deleting an inode, keep track of the generation of the deleted inode in
the inode glock Lock Value Block (LVB). When trying to delete an inode
remotely, check the last-known inode generation against the deleted inode
generation to skip duplicate remote deletes. This avoids taking the resource
group glock in order to verify the block type.

Signed-off-by: Andreas Gruenbacher

Andreas Gruenbacher
2020-06-06 02:19:20 +0800
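The duplicate-delete check from the commit above can be modelled as a single remembered value. In this sketch the LVB is reduced to one field, the names `toy_lvb`, `toy_mark_deleted` and `toy_already_deleted` are hypothetical, and the comparison assumes that generation numbers only grow each time an inode block is reused. That assumption is made for the sketch and is not taken from the commit text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the inode glock's Lock Value Block. */
struct toy_lvb {
	uint64_t deleted_generation;	/* newest generation known deleted; 0 = none */
};

/* Record a delete; only ever move the remembered generation forward. */
void toy_mark_deleted(struct toy_lvb *lvb, uint64_t generation)
{
	if (lvb->deleted_generation < generation)
		lvb->deleted_generation = generation;
}

/*
 * True if a remote delete request for this generation is a duplicate and
 * can be skipped without checking the on-disk block type.
 */
bool toy_already_deleted(const struct toy_lvb *lvb, uint64_t generation)
{
	return generation <= lvb->deleted_generation;
}
```

A request for a newer generation than the one recorded is not a duplicate and still has to go through the full delete path.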
15f2547b4 gfs2: Allow ASPACE glocks to also have an lvb

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-06 02:18:59 +0800
d5dc3d967 gfs2: instrumentation wrt log_flush stuck

This adds checks for gfs2_log_flush being stuck, similar to the check
in gfs2_ail1_flush. To facilitate this and make the strings easy to grep,
we move the ail1 emptying to its own function, empty_ail1_list.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-06 01:35:54 +0800

05 Jun, 2020

2 commits

ea4e61c7f gfs2: introduce new gfs2_glock_assert_withdraw

Before this patch, asserts based on glocks did not print the glock with
the error. This patch introduces a new macro, gfs2_glock_assert_withdraw,
which first prints the glock, then takes the assert.

This also changes a few glock asserts to the new macro.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-05 22:44:29 +0800
7e901d6e9 gfs2: print mapping->nrpages in glock dump for address space glocks

This patch makes the glock dumps in debugfs print the number of pages
(nrpages) for address space glocks. This will aid in debugging.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-05 20:58:23 +0800

04 Jun, 2020

1 commit

10c5db286 fs: move the fiemap definitions out of fs.h

No need to pull the fiemap definitions into almost every file in the
kernel build.

Signed-off-by: Christoph Hellwig
Reviewed-by: Ritesh Harjani
Reviewed-by: Darrick J. Wong
Link: https://lore.kernel.org/r/20200523073016.2944131-5-hch@lst.de
Signed-off-by: Theodore Ts'o

Christoph Hellwig
2020-06-04 11:16:55 +0800

03 Jun, 2020

5 commits

1a0b00d15 gfs2: Only do glock put in gfs2_create_inode for free inodes

Before this patch, the error path of function gfs2_create_inode would
always call gfs2_glock_put for the inode glock. That's good for inodes
that are free. But after they've been added to the vfs inodes, errors
will cause the inode to be evicted, and the evict will do the glock
put for us. If we do a glock put again, we can try to free the glock
while there are still references to it, e.g. revokes pending for
the transaction that created it.

This patch adds a check: if (free_vfs_inode) before the put, thus
solving the problem.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-03 03:23:55 +0800
88dca4ca5 mm: remove the pgprot argument to __vmalloc

The pgprot argument to __vmalloc is always PAGE_KERNEL now, so remove it.

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Reviewed-by: Michael Kelley [hyperv]
Acked-by: Gao Xiang [erofs]
Acked-by: Peter Zijlstra (Intel)
Acked-by: Wei Liu
Cc: Christian Borntraeger
Cc: Christophe Leroy
Cc: Daniel Vetter
Cc: David Airlie
Cc: Greg Kroah-Hartman
Cc: Haiyang Zhang
Cc: Johannes Weiner
Cc: "K. Y. Srinivasan"
Cc: Laura Abbott
Cc: Mark Rutland
Cc: Minchan Kim
Cc: Nitin Gupta
Cc: Robin Murphy
Cc: Sakari Ailus
Cc: Stephen Hemminger
Cc: Sumit Semwal
Cc: Benjamin Herrenschmidt
Cc: Catalin Marinas
Cc: Heiko Carstens
Cc: Paul Mackerras
Cc: Vasily Gorbik
Cc: Will Deacon
Link: http://lkml.kernel.org/r/20200414131348.444715-22-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-03 01:59:11 +0800
d4388340a fs: convert mpage_readpages to mpage_readahead

Implement the new readahead aop and convert all callers (block_dev,
exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6,
reiserfs & udf).

The callers are all trivial except for GFS2 & OCFS2.

Signed-off-by: Matthew Wilcox (Oracle)
Signed-off-by: Andrew Morton
Reviewed-by: Junxiao Bi # ocfs2
Reviewed-by: Joseph Qi # ocfs2
Reviewed-by: Dave Chinner
Reviewed-by: John Hubbard
Reviewed-by: Christoph Hellwig
Reviewed-by: William Kucharski
Cc: Chao Yu
Cc: Cong Wang
Cc: Darrick J. Wong
Cc: Eric Biggers
Cc: Gao Xiang
Cc: Jaegeuk Kim
Cc: Michal Hocko
Cc: Zi Yan
Cc: Johannes Thumshirn
Cc: Miklos Szeredi
Link: http://lkml.kernel.org/r/20200414150233.24495-17-willy@infradead.org
Signed-off-by: Linus Torvalds

Matthew Wilcox (Oracle)
2020-06-03 01:59:07 +0800
ea22eee4e gfs2: Allow lock_nolock mount to specify jid=X

Before this patch, a simple typo accidentally added \n to the jid=
string for lock_nolock mounts. This made it impossible to mount a
gfs2 file system with a journal other than journal0. Thus:

mount -tgfs2 -o hostdata="jid=1"

Resulted in:
mount: wrong fs type, bad option, bad superblock on

In most cases this is not a problem. However, for debugging and
testing purposes we sometimes want to test the integrity of other
journals. This patch removes the unnecessary \n and thus allows
lock_nolock users to specify an alternate journal.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-03 01:45:05 +0800
bbae10fac gfs2: Don't ignore inode write errors during inode_go_sync

Before this patch, function inode_go_sync ignored I/O errors from
the inode write, overwriting them with the result of the metadata write:

		error = filemap_fdatawait(mapping);
		mapping_set_error(mapping, error);
	}
	error = filemap_fdatawait(metamapping);
	...
	return error;

So any errors returned by the inode write would be forgotten if the
metadata write succeeded. This patch still does both writes, but
only sets error if it's still zero. That way, any errors will be
reported to the caller, do_xmote, which will take appropriate
action and report the error.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-06-03 01:45:05 +0800
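The "only set error if it's still zero" rule from the commit above is a first-error-wins pattern. The sketch below shows it with hypothetical stand-ins: `toy_mapping`, `toy_fdatawait` and `toy_set_error` are illustrative names that simply report preset results, not the kernel's filemap functions.

```c
/* Hypothetical stand-in: each "write" simply reports a preset result. */
struct toy_mapping {
	int wait_result;	/* what waiting on this mapping returns */
	int recorded_error;	/* error remembered on the mapping, if any */
};

int toy_fdatawait(struct toy_mapping *m)
{
	return m->wait_result;
}

void toy_set_error(struct toy_mapping *m, int error)
{
	if (error)
		m->recorded_error = error;
}

/* Wait on both mappings, but report the first error encountered. */
int toy_go_sync(struct toy_mapping *data, struct toy_mapping *meta)
{
	int error, ret;

	error = toy_fdatawait(data);
	toy_set_error(data, error);

	ret = toy_fdatawait(meta);	/* still performed after a data error */
	if (!error)
		error = ret;		/* keep the first error, if any */
	return error;
}
```

Both waits always run; the only change from the overwriting version is that a successful metadata wait can no longer mask an earlier data error.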

29 May, 2020

1 commit

20be493b7 gfs2: Even more gfs2_find_jhead fixes

Fix several issues in the previous gfs2_find_jhead fix:
* When updating @blocks_submitted, @block refers to the first block not
submitted yet, not the last block submitted, so fix an off-by-one error.
* We want to ensure that @blocks_submitted is far enough ahead of @blocks_read
to guarantee that there is in-flight I/O. Otherwise, we'll eventually end up
waiting for pages that haven't been submitted yet.
* It's much easier to compare the number of blocks added with the number of
blocks submitted to limit the maximum bio size.
* Even with bio chaining, we can keep adding blocks until we reach the maximum
bio size, as long as we stop at a page boundary. This simplifies the logic.

Signed-off-by: Andreas Gruenbacher
Reviewed-by: Bob Peterson

Andreas Gruenbacher
2020-05-29 23:00:24 +0800

09 May, 2020

3 commits

b14c94908 Revert "gfs2: Don't demote a glock until its revokes are written"

This reverts commit df5db5f9ee112e76b5202fbc331f990a0fc316d6.

This patch fixes a regression: patch df5db5f9ee112 allowed function
run_queue() to bypass its call to do_xmote() if revokes were queued for
the glock. That's wrong because its call to do_xmote() is what is
responsible for calling the go_sync() glops functions to sync both
the ail list and any revokes queued for it. By bypassing the call,
gfs2 could get into a stand-off where the glock could not be demoted
until its revokes are written back, but the revokes would not be
written back because do_xmote() was never called.

It "sort of" works, however, because there are other mechanisms like
the log flush daemon (logd) that can sync the ail items and revokes,
if it deems it necessary. The problem is: without file system pressure,
it might never deem it necessary.

Signed-off-by: Bob Peterson

Bob Peterson
2020-05-09 04:01:25 +0800
b11e1a84f gfs2: If go_sync returns error, withdraw but skip invalidate

Before this patch, if the go_sync operation returned an error during
the do_xmote process (such as unable to sync metadata to the journal)
the code did goto out. That kept the glock locked, so it could not be
given away, which correctly avoids file system corruption. However,
it never set the withdraw bit or requeued the glock work. So it would
hang forever, unable to ever demote the glock.

This patch changes the goto to target a new label, skip_inval, so that errors
from go_sync are treated the same way as errors from go_inval:
The delayed withdraw bit is set and the work is requeued. That way,
the logd should eventually figure out there's a problem and withdraw
properly there.

Signed-off-by: Bob Peterson
Signed-off-by: Andreas Gruenbacher

Bob Peterson
2020-05-09 04:00:07 +0800
f4e2f5e1a gfs2: Grab glock reference sooner in gfs2_add_revoke

This patch rearranges gfs2_add_revoke so that the extra glock
reference is added earlier on in the function to avoid races in which
the glock is freed before the new reference is taken.

Signed-off-by: Andreas Gruenbacher
Signed-off-by: Bob Peterson

Andreas Gruenbacher
2020-05-09 00:49:04 +0800