Eric Lee / smarc-fsl-linux-kernel

12 Oct, 2020

6 commits

28e1581c3 libceph: clear con->out_msg on Policy::stateful_server faults

con->out_msg must be cleared on Policy::stateful_server
(!CEPH_MSG_CONNECT_LOSSY) faults. Not doing so botches the
reconnection attempt, because after writing the banner the
messenger moves on to writing the data section of that message
(either from where it got interrupted by the connection reset or
from the beginning) instead of writing struct ceph_msg_connect.
This results in a bizarre error message because the server
sends CEPH_MSGR_TAG_BADPROTOVER but we think we wrote struct
ceph_msg_connect:

libceph: mds0 (1)172.21.15.45:6828 socket error on write
ceph: mds0 reconnect start
libceph: mds0 (1)172.21.15.45:6829 socket closed (con state OPEN)
libceph: mds0 (1)172.21.15.45:6829 protocol version mismatch, my 32 != server's 32
libceph: mds0 (1)172.21.15.45:6829 protocol version mismatch

AFAICT this bug goes back to the dawn of the kernel client.
The reason it survived for so long is that only MDS sessions
are stateful and only two MDS messages have a data section:
CEPH_MSG_CLIENT_RECONNECT (always, but reconnecting is rare)
and CEPH_MSG_CLIENT_REQUEST (only when xattrs are involved).
The connection has to get reset precisely when such a message
is being sent -- in this case it was the former.

Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/47723
Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-10-12 21:29:27 +0800
a9dfe31e5 libceph: format ceph_entity_addr nonces as unsigned

Match the server side logs.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2020-10-12 21:29:27 +0800
5a5036c89 libceph: move a dout in queue_con_delay()

The queued con->work can start executing (and therefore logging)
before we get to this "con->work has been queued" message, making
the logs confusing. Move it up, with the meaning of "con->work
is about to be queued".

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2020-10-12 21:29:27 +0800
1b05fae7f libceph: switch to the new "osd blocklist add" command

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2020-10-12 21:29:26 +0800
0b98acd61 libceph, rbd, ceph: "blacklist" -> "blocklist"

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2020-10-12 21:29:26 +0800
3986f9a42 libceph: multiple workspaces for CRUSH computations

Replace a global map->crush_workspace (protected by a global mutex)
with a list of workspaces, up to the number of CPUs + 1.
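
The shift from one mutex-protected workspace to a bounded list can be modelled outside the kernel. The following is a minimal Python sketch, not the libceph code (which is C): the `WorkspacePool` class, its method names and the `bytearray` stand-in for a workspace are all assumptions made for illustration. Only the "number of CPUs + 1" cap comes from the commit message.

```python
import os
import threading


class WorkspacePool:
    """Bounded pool of scratch workspaces (hypothetical model, not kernel code)."""

    def __init__(self, max_workspaces=None):
        # Mirrors the "number of CPUs + 1" cap from the commit message.
        self.max_workspaces = max_workspaces or (os.cpu_count() or 1) + 1
        self._idle = []        # workspaces ready for reuse
        self._allocated = 0    # workspaces created so far
        self._cond = threading.Condition()

    def get(self):
        """Return an idle workspace, allocate one, or wait if at the cap."""
        with self._cond:
            while True:
                if self._idle:
                    return self._idle.pop()
                if self._allocated < self.max_workspaces:
                    self._allocated += 1
                    return bytearray(64)  # stand-in for a CRUSH workspace
                # Pool exhausted: wait until another caller returns one.
                self._cond.wait()

    def put(self, workspace):
        """Hand a workspace back and wake one waiter, if any."""
        with self._cond:
            self._idle.append(workspace)
            self._cond.notify()
```

Callers only serialize once every workspace is in use, which is where a single global mutex would have made them wait on every computation.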

This is based on a patch from Robin Geuze.
Robin and his team have observed a 10-20% increase in IOPS on all
queue depths and lower CPU usage as well on a high-end all-NVMe
100GbE cluster.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2020-10-12 21:29:26 +0800

03 Oct, 2020

1 commit

40efc4dc7 libceph: use sendpage_ok() in ceph_tcp_sendpage()

In libceph, ceph_tcp_sendpage() performs the following check before
handing the page to the network layer's zero-copy sendpage method:
if (page_count(page) >= 1 && !PageSlab(page))

This check is exactly what sendpage_ok() does. This patch replaces the
open-coded check with sendpage_ok() as a code cleanup.

Signed-off-by: Coly Li
Acked-by: Jeff Layton
Cc: Ilya Dryomov
Signed-off-by: David S. Miller

Coly Li
2020-10-03 06:27:08 +0800

24 Aug, 2020

1 commit

df561f668 treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and their variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings where applicable.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva

Gustavo A. R. Silva
2020-08-24 06:36:59 +0800

03 Aug, 2020

4 commits

94f17c00d libceph: replace HTTP links with HTTPS ones

Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
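
The algorithm above can be sketched in Python using the three regular expressions quoted in the commit message. This is an illustration only, not the script the author ran: the function name `upgrade_links` and the `same_content_over_https` callback are hypothetical, and the callback stands in for the "both versions return 200 OK and serve the same content" check so that the sketch does no network I/O.

```python
import re

# Patterns copied from the commit message.
LINK_RE = re.compile(r"\bhttp://[^# \t\r\n]*(?:\w|/)")
XMLNS_RE = re.compile(r"\bxmlns\b")
EXCLUDED_RE = re.compile(r"\bgnu\.org/license|\bmozilla\.org/MPL\b")


def upgrade_links(filename, text, same_content_over_https):
    """Return text with eligible http:// links rewritten to https://.

    same_content_over_https is a caller-supplied predicate (hypothetical)
    that takes the original link and reports whether the HTTP and HTTPS
    versions both succeed and serve the same content.
    """
    if filename.endswith(".svg"):
        return text

    def replace(match):
        link = match.group(0)
        if EXCLUDED_RE.search(link):
            return link  # licence URLs are left untouched
        if not same_content_over_https(link):
            return link
        return "https://" + link[len("http://"):]

    lines = text.split("\n")
    for i, line in enumerate(lines):
        if XMLNS_RE.search(line):
            continue  # XML namespace identifiers are not real links
        lines[i] = LINK_RE.sub(replace, line)
    return "\n".join(lines)
```

Passing the check in as a function keeps the text-rewriting rules testable on their own; a real run would supply a predicate that actually fetches both URLs.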

[ idryomov: Do the same for the CRUSH paper and replace
ceph.newdream.net with ceph.io. ]

Signed-off-by: Alexander A. Klimov
Reviewed-by: Ilya Dryomov
Signed-off-by: Ilya Dryomov

Alexander A. Klimov
2020-08-03 17:05:26 +0800
042f64981 libceph: just have osd_req_op_init() return a pointer

The caller can just ignore the return. No need for this wrapper that
just casts the other function to void.

[ idryomov: argument alignment ]

Signed-off-by: Jeff Layton
Reviewed-by: Ilya Dryomov
Signed-off-by: Ilya Dryomov

Jeff Layton
2020-08-03 17:05:25 +0800
6e6f0f011 libceph: dump class and method names on method calls

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2020-08-03 17:03:01 +0800
5133ba8f1 libceph: use target_copy() in send_linger()

Instead of copying just oloc, oid and flags, copy the entire
linger target. This is more for consistency than anything else,
as send_linger() -> submit_request() -> __submit_request() sends
the request regardless of what calc_target() says (i.e. both on
CALC_TARGET_NO_ACTION and CALC_TARGET_NEED_RESEND).

Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-08-03 17:03:01 +0800

16 Jun, 2020

3 commits

7ed286f3e libceph: don't omit used_replica in target_copy()

Currently target_copy() is used only for sending linger pings, so
this doesn't come up, but generally omitting used_replica can hang
the client as we wouldn't notice the acting set change (legacy_change
in calc_target()) or trigger a warning in handle_reply().

Fixes: 117d96a04f00 ("libceph: support for balanced and localized reads")
Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-06-16 22:02:08 +0800
2f3fead62 libceph: don't omit recovery_deletes in target_copy()

Currently target_copy() is used only for sending linger pings, so
this doesn't come up, but generally omitting recovery_deletes can
result in unneeded resends (force_resend in calc_target()).

Fixes: ae78dd8139ce ("libceph: make RECOVERY_DELETES feature create a new interval")
Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-06-16 22:02:04 +0800
22d2cfdff libceph: move away from global osd_req_flags

osd_req_flags is overly general and doesn't suit its only user
(read_from_replica option) well:

- applying osd_req_flags in account_request() affects all OSD
requests, including linger (i.e. watch and notify). However,
linger requests should always go to the primary even though
some of them are reads (e.g. notify has side effects but it
is a read because it doesn't result in mutation on the OSDs).

- calls to class methods that are reads are allowed to go to
the replica, but most such calls issued for "rbd map" and/or
exclusive lock transitions are requested to be resent to the
primary via EAGAIN, doubling the latency.

Get rid of global osd_req_flags and set read_from_replica flag
only on specific OSD requests instead.

Fixes: 8ad44d5e0d1e ("libceph: read_from_replica option")
Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-06-16 22:01:53 +0800

09 Jun, 2020

1 commit

95288a9b3 Merge tag 'ceph-for-5.8-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
"The highlights are:

- OSD/MDS latency and caps cache metrics infrastructure for the
filesystem (Xiubo Li). Currently available through debugfs and will
be periodically sent to the MDS in the future.

- support for replica reads (balanced and localized reads) for rbd
and the filesystem (myself). The default remains to always read
from primary, users can opt-in with the new crush_location and
read_from_replica options. Note that reading from replica is safe
for general use only since Octopus.

- support for RADOS allocation hint flags (myself). Currently used by
rbd to propagate the compressible/incompressible hint given with
the new compression_hint map option and ready for passing on more
advanced hints, e.g. based on fadvise() from the filesystem.

- support for efficient cross-quota-realm renames (Luis Henriques)

- assorted cap handling improvements and cleanups, particularly
untangling some of the locking (Jeff Layton)"

* tag 'ceph-for-5.8-rc1' of git://github.com/ceph/ceph-client: (29 commits)
rbd: compression_hint option
libceph: support for alloc hint flags
libceph: read_from_replica option
libceph: support for balanced and localized reads
libceph: crush_location infrastructure
libceph: decode CRUSH device/bucket types and names
libceph: add non-asserting rbtree insertion helper
ceph: skip checking caps when session reconnecting and releasing reqs
ceph: make sure mdsc->mutex is nested in s->s_mutex to fix dead lock
ceph: don't return -ESTALE if there's still an open file
libceph, rbd: replace zero-length array with flexible-array
ceph: allow rename operation under different quota realms
ceph: normalize 'delta' parameter usage in check_quota_exceeded
ceph: ceph_kick_flushing_caps needs the s_mutex
ceph: request expedited service on session's last cap flush
ceph: convert mdsc->cap_dirty to a per-session list
ceph: reset i_requested_max_size if file write is not wanted
ceph: throw a warning if we destroy session with mutex still locked
ceph: fix potential race in ceph_check_caps
ceph: document what protects i_dirty_item and i_flushing_item
...

Linus Torvalds
2020-06-09 03:49:18 +0800

04 Jun, 2020

1 commit

cb8e59cc8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from David Miller:

1) Allow setting bluetooth L2CAP modes via socket option, from Luiz
Augusto von Dentz.

2) Add GSO partial support to igc, from Sasha Neftin.

3) Several cleanups and improvements to r8169 from Heiner Kallweit.

4) Add IF_OPER_TESTING link state and use it when ethtool triggers a
device self-test. From Andrew Lunn.

5) Start moving away from custom driver versions, use the globally
defined kernel version instead, from Leon Romanovsky.

6) Support GRO via gro_cells in DSA layer, from Alexander Lobakin.

7) Allow hard IRQ deferral during NAPI, from Eric Dumazet.

8) Add sriov and vf support to hinic, from Luo bin.

9) Support Media Redundancy Protocol (MRP) in the bridging code, from
Horatiu Vultur.

10) Support netmap in the nft_nat code, from Pablo Neira Ayuso.

11) Allow UDPv6 encapsulation of ESP in the ipsec code, from Sabrina
Dubroca. Also add ipv6 support for espintcp.

12) Lots of ReST conversions of the networking documentation, from Mauro
Carvalho Chehab.

13) Support configuration of ethtool rxnfc flows in bcmgenet driver,
from Doug Berger.

14) Allow to dump cgroup id and filter by it in inet_diag code, from
Dmitry Yakunin.

15) Add infrastructure to export netlink attribute policies to
userspace, from Johannes Berg.

16) Several optimizations to sch_fq scheduler, from Eric Dumazet.

17) Fallback to the default qdisc if qdisc init fails because otherwise
a packet scheduler init failure will make a device inoperative. From
Jesper Dangaard Brouer.

18) Several RISCV bpf jit optimizations, from Luke Nelson.

19) Correct the return type of the ->ndo_start_xmit() method in several
drivers, it's netdev_tx_t but many drivers were using
'int'. From Yunjian Wang.

20) Add an ethtool interface for PHY master/slave config, from Oleksij
Rempel.

21) Add BPF iterators, from Yonghang Song.

22) Add cable test infrastructure, including ethtool interfaces, from
Andrew Lunn. Marvell PHY driver is the first to support this
facility.

23) Remove zero-length arrays all over, from Gustavo A. R. Silva.

24) Calculate and maintain an explicit frame size in XDP, from Jesper
Dangaard Brouer.

25) Add CAP_BPF, from Alexei Starovoitov.

26) Support terse dumps in the packet scheduler, from Vlad Buslov.

27) Support XDP_TX bulking in dpaa2 driver, from Ioana Ciornei.

28) Add devm_register_netdev(), from Bartosz Golaszewski.

29) Minimize qdisc resets, from Cong Wang.

30) Get rid of kernel_getsockopt and kernel_setsockopt in order to
eliminate set_fs/get_fs calls. From Christoph Hellwig.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2517 commits)
selftests: net: ip_defrag: ignore EPERM
net_failover: fixed rollback in net_failover_open()
Revert "tipc: Fix potential tipc_aead refcnt leak in tipc_crypto_rcv"
Revert "tipc: Fix potential tipc_node refcnt leak in tipc_rcv"
vmxnet3: allow rx flow hash ops only when rss is enabled
hinic: add set_channels ethtool_ops support
selftests/bpf: Add a default $(CXX) value
tools/bpf: Don't use $(COMPILE.c)
bpf, selftests: Use bpf_probe_read_kernel
s390/bpf: Use bcr 0,%0 as tail call nop filler
s390/bpf: Maintain 8-byte stack alignment
selftests/bpf: Fix verifier test
selftests/bpf: Fix sample_cnt shared between two threads
bpf, selftests: Adapt cls_redirect to call csum_level helper
bpf: Add csum_level helper for fixing up csum levels
bpf: Fix up bpf_skb_adjust_room helper's skb csum setting
sfc: add missing annotation for efx_ef10_try_update_nic_stats_vf()
crypto/chtls: IPv6 support for inline TLS
Crypto/chcr: Fixes a coccinile check error
Crypto/chcr: Fixes compilations warnings
...

Linus Torvalds
2020-06-04 07:27:18 +0800

03 Jun, 2020

1 commit

ed1f324c5 mm: remove map_vm_range

Switch all callers to map_kernel_range, which is symmetric to the unmap side
(as well as the _noflush versions).

Signed-off-by: Christoph Hellwig
Signed-off-by: Andrew Morton
Acked-by: Peter Zijlstra (Intel)
Cc: Christian Borntraeger
Cc: Christophe Leroy
Cc: Daniel Vetter
Cc: David Airlie
Cc: Gao Xiang
Cc: Greg Kroah-Hartman
Cc: Haiyang Zhang
Cc: Johannes Weiner
Cc: "K. Y. Srinivasan"
Cc: Laura Abbott
Cc: Mark Rutland
Cc: Michael Kelley
Cc: Minchan Kim
Cc: Nitin Gupta
Cc: Robin Murphy
Cc: Sakari Ailus
Cc: Stephen Hemminger
Cc: Sumit Semwal
Cc: Wei Liu
Cc: Benjamin Herrenschmidt
Cc: Catalin Marinas
Cc: Heiko Carstens
Cc: Paul Mackerras
Cc: Vasily Gorbik
Cc: Will Deacon
Link: http://lkml.kernel.org/r/20200414131348.444715-17-hch@lst.de
Signed-off-by: Linus Torvalds

Christoph Hellwig
2020-06-03 01:59:11 +0800

02 Jun, 2020

1 commit

d3798acc0 libceph: support for alloc hint flags

Allow indicating future I/O pattern via flags. This is supported since
Kraken (and bluestore persists flags together with expected_object_size
and expected_write_size).

Signed-off-by: Ilya Dryomov
Reviewed-by: Jason Dillaman

Ilya Dryomov
2020-06-02 05:32:35 +0800

01 Jun, 2020

7 commits

8ad44d5e0 libceph: read_from_replica option

Expose replica reads through read_from_replica=balance and
read_from_replica=localize. The default is to read from primary
(read_from_replica=no).

Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-06-01 19:22:53 +0800
117d96a04 libceph: support for balanced and localized reads

OSD-side issues with reads from replica have been resolved in
Octopus. Reading from replica should be safe wrt. unstable or
uncommitted state now, so add support for balanced and localized
reads.

There are two cases when a read from replica can't be served:

- OSD may silently drop the request, expecting the client to
notice that the acting set has changed and resend via the usual
means (handled with t->used_replica)

- OSD may return EAGAIN, expecting the client to resend to the
primary, ignoring replica read flags (see handle_reply())
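
The two cases can be modelled with a small Python sketch. It is a simplified illustration of the behaviour described above, not the kernel logic: the `Target` class, the function names and the flag bit values are placeholders invented for this example.

```python
import errno
from dataclasses import dataclass

# Placeholder bit values chosen for this sketch only.
BALANCE_READS = 0x1
LOCALIZE_READS = 0x2
REPLICA_READ_FLAGS = BALANCE_READS | LOCALIZE_READS


@dataclass
class Target:
    flags: int = 0
    used_replica: bool = False


def acting_set_changed(old, new, used_replica):
    """Decide whether an acting-set update requires a resend.

    old and new are lists of OSD ids with the primary first.
    """
    if old[:1] != new[:1]:
        return True  # primary changed: always resend
    # A request sent to a replica may have been dropped silently, so any
    # change to the set counts, not just a change of primary.
    return bool(used_replica) and old != new


def handle_reply(target, result):
    """Return True if the request must be resent to the primary."""
    if result == -errno.EAGAIN and target.flags & REPLICA_READ_FLAGS:
        target.flags &= ~REPLICA_READ_FLAGS  # ignore replica read flags
        return True
    return False
```

In this model a request that went to the primary is only resent when the primary changes, while a request that went to a replica is resent on any change to the acting set.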

Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-06-01 19:22:53 +0800
45e6aa9f5 libceph: crush_location infrastructure

Allow expressing client's location in terms of CRUSH hierarchy as
a set of (bucket type name, bucket name) pairs. The userspace syntax
"crush_location = key1=value1 key2=value2" is incompatible with mount
options and needed adaptation. Key-value pairs are separated by '|'
and we use ':' instead of '=' to separate keys from values. So for:

crush_location = host=foo rack=bar

one would write:

crush_location=host:foo|rack:bar

As in userspace, "multipath" locations are supported, so indicating
locality for parallel hierarchies is possible:

crush_location=rack:foo1|rack:foo2|datacenter:bar
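
As an illustration of the adapted syntax, here is a minimal Python sketch that parses the mount-option form and converts the userspace form into it. The function names are hypothetical and the kernel's own parser is written in C; only the '|' and ':' separators are taken from the commit message.

```python
def parse_crush_location(value):
    """Parse 'type:name|type:name' into a list of (type, name) pairs."""
    pairs = []
    for item in value.split("|"):
        bucket_type, sep, bucket_name = item.partition(":")
        if not sep or not bucket_type or not bucket_name:
            raise ValueError(f"malformed crush_location item: {item!r}")
        pairs.append((bucket_type, bucket_name))
    return pairs


def from_userspace_syntax(value):
    """Convert 'host=foo rack=bar' into the mount-option form."""
    # Whitespace separates pairs in userspace; '=' separates key and value.
    return "|".join(item.replace("=", ":", 1) for item in value.split())
```

Repeated bucket types are kept as separate pairs, which is what allows the "multipath" form with two rack entries.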

Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-06-01 19:22:53 +0800
86403a92c libceph: decode CRUSH device/bucket types and names

These would be matched with the provided client location to calculate
the locality value.

Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-06-01 19:22:53 +0800
8a4b863c8 libceph: add non-asserting rbtree insertion helper

Needed for the next commit and useful for ceph_pg_pool_info tree as
well. I'm leaving the asserting helper in for now, but we should look
at getting rid of it in the future.

Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-06-01 19:22:53 +0800
97e27aaa9 ceph: add read/write latency metric support

Calculate the latency for OSD read requests. Add a new r_end_stamp
field to struct ceph_osd_request that will hold the time at which
the reply was received. Use that to calculate the RTT for each call,
and divide the sum of those by the number of calls to get the average RTT.

Keep a tally of RTT for OSD writes and number of calls to track average
latency of OSD writes.
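
The bookkeeping amounts to a running sum and a call counter per direction. A minimal Python sketch follows, assuming a hypothetical `LatencyMetric` class; the kernel keeps equivalent counters in C and its field names differ.

```python
from dataclasses import dataclass


@dataclass
class LatencyMetric:
    """Running tally for one direction (reads or writes)."""

    total_calls: int = 0
    latency_sum: float = 0.0

    def update(self, start_stamp, end_stamp):
        # RTT of one call: reply-received time minus request start time.
        self.total_calls += 1
        self.latency_sum += end_stamp - start_stamp

    def average(self):
        # Average RTT is the sum of per-call RTTs over the number of calls.
        if self.total_calls == 0:
            return 0.0
        return self.latency_sum / self.total_calls
```

One instance would be kept for reads and another for writes, each updated when a reply arrives.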

URL: https://tracker.ceph.com/issues/43215
Signed-off-by: Xiubo Li
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov

Xiubo Li
2020-06-01 19:22:51 +0800
1806c13dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

xdp_umem.c had overlapping changes between the 64-bit math fix
for the calculation of npgs and the removal of the zerocopy
memory type which got rid of the chunk_size_nohdr member.

The mlx5 Kconfig conflict is a case where we just take the
net-next copy of the Kconfig entry dependency as it takes on
the ESWITCH dependency by one level of indirection which is
what the 'net' conflicting change is trying to ensure.

Signed-off-by: David S. Miller

David S. Miller
2020-06-01 08:48:46 +0800

29 May, 2020

1 commit

12abc5ee7 tcp: add tcp_sock_set_nodelay

Add a helper to directly set the TCP_NODELAY sockopt from kernel space
without going through a fake uaccess. Cleanup the callers to avoid
pointless wrappers now that this is a simple function call.

Signed-off-by: Christoph Hellwig
Acked-by: Sagi Grimberg
Acked-by: Jason Gunthorpe
Signed-off-by: David S. Miller

Christoph Hellwig
2020-05-29 02:11:45 +0800

27 May, 2020

1 commit

890bd0f89 libceph: ignore pool overlay and cache logic on redirects

The OSD client should ignore the cache/overlay flag if it gets a redirect reply.
Otherwise, the client hangs when the cache tier is in forward mode.

[ idryomov: Redirects are effectively deprecated and no longer
used or tested. The original tiering modes based on redirects
are inherently flawed because redirects can race and reorder,
potentially resulting in data corruption. The new proxy and
readproxy tiering modes should be used instead of forward and
readforward. Still marking for stable as obviously correct,
though. ]

Cc: stable@vger.kernel.org
URL: https://tracker.ceph.com/issues/23296
URL: https://tracker.ceph.com/issues/36406
Signed-off-by: Jerry Lee
Reviewed-by: Ilya Dryomov
Signed-off-by: Ilya Dryomov

Jerry Lee
2020-05-27 18:43:35 +0800

29 Apr, 2020

1 commit

9dfe13612 docs: networking: convert dns_resolver.txt to ReST

- add SPDX header;
- adjust titles and chapters, adding proper markups;
- comment out text-only TOC from html/pdf output;
- mark code blocks and literals as such;
- adjust indentation, whitespaces and blank lines;
- add to networking/index.rst.

Signed-off-by: Mauro Carvalho Chehab
Signed-off-by: David S. Miller

Mauro Carvalho Chehab
2020-04-29 05:39:46 +0800

30 Mar, 2020

4 commits

bb0e681dd libceph: directly skip to the end of redirect reply

Coverity complains about a double write to *p. Don't bother with
osd_instructions and directly skip to the end of redirect reply.

Reported-by: Colin Ian King
Signed-off-by: Ilya Dryomov

Ilya Dryomov
2020-03-30 18:42:41 +0800
4d8b8fb49 libceph: simplify ceph_monc_handle_map()

ceph_monc_handle_map() confuses static checkers which report a
false use-after-free on monc->monmap, missing that monc->monmap and
client->monc.monmap are the same pointer.

Use monc->monmap consistently and get rid of "old", which is redundant.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2020-03-30 18:42:41 +0800
5107d7d50 ceph: move ceph_osdc_{read,write}pages to ceph.ko

Since these helpers are only used by ceph.ko, move them there and
rename them with _sync_ qualifiers.

Signed-off-by: Xiubo Li
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov

Xiubo Li
2020-03-30 18:42:40 +0800
072eaf3c0 libceph: drop CEPH_DEFINE_SHOW_FUNC

Although CEPH_DEFINE_SHOW_FUNC is much older, it now duplicates
DEFINE_SHOW_ATTRIBUTE from linux/seq_file.h.

Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2020-03-30 18:42:40 +0800

23 Mar, 2020

2 commits

e88627403 libceph: fix alloc_msg_with_page_vector() memory leaks

Make it so that CEPH_MSG_DATA_PAGES data item can own pages,
fixing a bunch of memory leaks for a page vector allocated in
alloc_msg_with_page_vector(). Currently, only watch-notify
messages trigger this allocation, and normally the page vector
is freed either in handle_watch_notify() or by the caller of
ceph_osdc_notify(). But if the message is freed before that
(e.g. if the session faults while reading in the message or
if the notify is stale), we leak the page vector.

This was supposed to be fixed by switching to a message-owned
pagelist, but that never happened.

Fixes: 1907920324f1 ("libceph: support for sending notifies")
Reported-by: Roman Penyaev
Signed-off-by: Ilya Dryomov
Reviewed-by: Roman Penyaev

Ilya Dryomov
2020-03-23 20:07:08 +0800
761420973 ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL

CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult
per-pool flags as well. Unfortunately the backwards compatibility here
is lacking:

- the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but
was guarded by require_osd_release >= RELEASE_LUMINOUS
- it was subsequently backported to luminous in v12.2.2, but that makes
no difference to clients that only check OSDMAP_FULL/NEARFULL because
require_osd_release is not client-facing -- it is for OSDs

Since all kernels are affected, the best we can do here is just start
checking both map flags and pool flags and send that to stable.

These checks are best effort, so take osdc->lock and look up pool flags
just once. Remove the FIXME, since filesystem quotas are checked above
and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches
its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set.
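
The combined check can be sketched as follows. This is a Python model for illustration: the `OsdMap` class and the helper names are hypothetical, and the bit values are placeholders rather than the constants from the kernel's Ceph headers.

```python
from dataclasses import dataclass, field

# Placeholder bit values for this sketch only.
OSDMAP_NEARFULL = 1 << 0
OSDMAP_FULL = 1 << 1
POOL_FLAG_FULL = 1 << 1
POOL_FLAG_NEARFULL = 1 << 11


@dataclass
class OsdMap:
    flags: int = 0                                   # map-wide flags
    pool_flags: dict = field(default_factory=dict)   # pool id -> pool flags


def pool_is_full(osdmap, pool_id):
    """True if either the map-wide flag or the per-pool flag says full."""
    pool = osdmap.pool_flags.get(pool_id, 0)
    return bool(osdmap.flags & OSDMAP_FULL or pool & POOL_FLAG_FULL)


def pool_is_nearfull(osdmap, pool_id):
    """True if either the map-wide flag or the per-pool flag says nearfull."""
    pool = osdmap.pool_flags.get(pool_id, 0)
    return bool(osdmap.flags & OSDMAP_NEARFULL or pool & POOL_FLAG_NEARFULL)
```

Checking both sources means a cluster that only sets per-pool flags and an older one that only sets map flags are treated the same way.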

Cc: stable@vger.kernel.org
Reported-by: Yanhu Cao
Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton
Acked-by: Sage Weil

Ilya Dryomov
2020-03-23 20:07:08 +0800

09 Feb, 2020

1 commit

c9d35ee04 Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs file system parameter updates from Al Viro:
"Saner fs_parser.c guts and data structures. The system-wide registry
of syntax types (string/enum/int32/oct32/.../etc.) is gone and so is
the horror switch() in fs_parse() that would have to grow another case
every time something got added to that system-wide registry.

New syntax types can be added by filesystems easily now, and their
namespace is that of functions - not of system-wide enum members. IOW,
they can be shared or kept private and if some turn out to be widely
useful, we can make them common library helpers, etc., without having
to do anything whatsoever to fs_parse() itself.

And we already get that kind of requests - the thing that finally
pushed me into doing that was "oh, and let's add one for timeouts -
things like 15s or 2h". If some filesystem really wants that, let them
do it. Without somebody having to play gatekeeper for the variants
blessed by direct support in fs_parse(), TYVM.

Quite a bit of boilerplate is gone. And IMO the data structures make a
lot more sense now. -200LoC, while we are at it"

* 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (25 commits)
tmpfs: switch to use of invalfc()
cgroup1: switch to use of errorfc() et.al.
procfs: switch to use of invalfc()
hugetlbfs: switch to use of invalfc()
cramfs: switch to use of errofc() et.al.
gfs2: switch to use of errorfc() et.al.
fuse: switch to use errorfc() et.al.
ceph: use errorfc() and friends instead of spelling the prefix out
prefix-handling analogues of errorf() and friends
turn fs_param_is_... into functions
fs_parse: handle optional arguments sanely
fs_parse: fold fs_parameter_desc/fs_parameter_spec
fs_parser: remove fs_parameter_description name field
add prefix to fs_context->log
ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log
new primitive: __fs_parse()
switch rbd and libceph to p_log-based primitives
struct p_log, variants of warnf() et.al. taking that one instead
teach logfc() to handle prefices, give it saner calling conventions
get rid of cg_invalf()
...

Linus Torvalds
2020-02-09 05:26:41 +0800

08 Feb, 2020

4 commits

d7167b149 fs_parse: fold fs_parameter_desc/fs_parameter_spec

The former contains nothing but a pointer to an array of the latter...

Signed-off-by: Al Viro

Al Viro
2020-02-08 03:48:37 +0800
96cafb9cc fs_parser: remove fs_parameter_description name field

Unused now.

Signed-off-by: Eric Sandeen
Acked-by: David Howells
Signed-off-by: Al Viro

Eric Sandeen
2020-02-08 03:48:36 +0800
c80c98f0d ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log

... and now errorf() et.al. are never called with NULL fs_context,
so we can get rid of conditional in those.

Signed-off-by: Al Viro

Al Viro
2020-02-08 03:48:34 +0800
7f5d38141 new primitive: __fs_parse()

fs_parse() analogue taking p_log instead of fs_context.
fs_parse() turned into a wrapper, callers in ceph_common and rbd
switched to __fs_parse().

As a result, fs_parse() never gets NULL fs_context and neither
do fs_context-based logging primitives.

Signed-off-by: Al Viro

Al Viro
2020-02-08 03:48:34 +0800