Eric Lee / smarc-fsl-linux-kernel

23 Mar, 2020

2 commits

e88627403 libceph: fix alloc_msg_with_page_vector() memory leaks

Make it so that CEPH_MSG_DATA_PAGES data item can own pages,
fixing a bunch of memory leaks for a page vector allocated in
alloc_msg_with_page_vector(). Currently, only watch-notify
messages trigger this allocation, and normally the page vector
is freed either in handle_watch_notify() or by the caller of
ceph_osdc_notify(). But if the message is freed before that
(e.g. if the session faults while reading in the message or
if the notify is stale), we leak the page vector.

This was supposed to be fixed by switching to a message-owned
pagelist, but that never happened.

Fixes: 1907920324f1 ("libceph: support for sending notifies")
Reported-by: Roman Penyaev
Signed-off-by: Ilya Dryomov
Reviewed-by: Roman Penyaev
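The ownership idea in this message can be sketched in userspace C. This is only an illustration under stated assumptions: `struct demo_pages_item`, its `own_pages` flag, and the `demo_*` helpers are hypothetical names, not the kernel's definitions. The point it shows is that when the data item owns its page vector, every destroy path frees it, including the early ones (session fault, stale notify).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE 4096

/* Hypothetical stand-in for a CEPH_MSG_DATA_PAGES data item. */
struct demo_pages_item {
    void **pages;      /* page vector */
    size_t num_pages;
    bool own_pages;    /* true: destroying the message frees the vector */
};

int demo_live_pages;   /* outstanding pages, tracked for the demo only */

void demo_release_page_vector(void **pages, size_t num_pages)
{
    for (size_t i = 0; i < num_pages; i++) {
        if (pages[i]) {
            free(pages[i]);
            demo_live_pages--;
        }
    }
    free(pages);
}

void **demo_alloc_page_vector(size_t num_pages)
{
    void **pages = calloc(num_pages, sizeof(*pages));

    if (!pages)
        return NULL;
    for (size_t i = 0; i < num_pages; i++) {
        pages[i] = malloc(DEMO_PAGE_SIZE);
        if (!pages[i]) {
            demo_release_page_vector(pages, num_pages);
            return NULL;
        }
        demo_live_pages++;
    }
    return pages;
}

/* Runs on every message-destroy path, early or late. */
void demo_pages_item_destroy(struct demo_pages_item *item)
{
    assert(item != NULL);
    if (item->own_pages && item->pages)
        demo_release_page_vector(item->pages, item->num_pages);
    item->pages = NULL;
    item->num_pages = 0;
}
```

With `own_pages` false the caller keeps responsibility for the vector, which matches the pre-fix behaviour described above.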

Ilya Dryomov
2020-03-23 20:07:08 +0800
761420973 ceph: check POOL_FLAG_FULL/NEARFULL in addition to OSDMAP_FULL/NEARFULL

CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult
per-pool flags as well. Unfortunately the backwards compatibility here
is lacking:

- the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but
was guarded by require_osd_release >= RELEASE_LUMINOUS
- it was subsequently backported to luminous in v12.2.2, but that makes
no difference to clients that only check OSDMAP_FULL/NEARFULL because
require_osd_release is not client-facing -- it is for OSDs

Since all kernels are affected, the best we can do here is just start
checking both map flags and pool flags and send that to stable.

These checks are best effort, so take osdc->lock and look up pool flags
just once. Remove the FIXME, since filesystem quotas are checked above
and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches
its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set.

Cc: stable@vger.kernel.org
Reported-by: Yanhu Cao
Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton
Acked-by: Sage Weil
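A minimal sketch of the combined check, assuming both flag words were read once under the lock. The `DEMO_*` bit values and function names are illustrative, not the real Ceph constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; the real values live in the Ceph headers. */
#define DEMO_OSDMAP_NEARFULL      (1u << 0)
#define DEMO_OSDMAP_FULL          (1u << 1)
#define DEMO_POOL_FLAG_FULL       (1u << 1)
#define DEMO_POOL_FLAG_FULL_QUOTA (1u << 10)
#define DEMO_POOL_FLAG_NEARFULL   (1u << 11)

/* Full if either the map-wide flag or the per-pool flag says so. */
bool demo_is_full(uint32_t map_flags, uint64_t pool_flags)
{
    return (map_flags & DEMO_OSDMAP_FULL) ||
           (pool_flags & DEMO_POOL_FLAG_FULL);
}

bool demo_is_nearfull(uint32_t map_flags, uint64_t pool_flags)
{
    return (map_flags & DEMO_OSDMAP_NEARFULL) ||
           (pool_flags & DEMO_POOL_FLAG_NEARFULL);
}
```

Because a pool that hits its quota has both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA set, testing the FULL bit alone also covers the quota case.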

Ilya Dryomov
2020-03-23 20:07:08 +0800

09 Feb, 2020

1 commit

c9d35ee04 Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs file system parameter updates from Al Viro:
"Saner fs_parser.c guts and data structures. The system-wide registry
of syntax types (string/enum/int32/oct32/.../etc.) is gone and so is
the horror switch() in fs_parse() that would have to grow another case
every time something got added to that system-wide registry.

New syntax types can be added by filesystems easily now, and their
namespace is that of functions - not of system-wide enum members. IOW,
they can be shared or kept private and if some turn out to be widely
useful, we can make them common library helpers, etc., without having
to do anything whatsoever to fs_parse() itself.

And we already get that kind of requests - the thing that finally
pushed me into doing that was "oh, and let's add one for timeouts -
things like 15s or 2h". If some filesystem really wants that, let them
do it. Without somebody having to play gatekeeper for the variants
blessed by direct support in fs_parse(), TYVM.

Quite a bit of boilerplate is gone. And IMO the data structures make a
lot more sense now. -200LoC, while we are at it"
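The "syntax types are functions" design can be illustrated with a userspace sketch. Everything here (`demo_param_spec`, `demo_parse()`, the `demo_param_is_*` helpers) is a hypothetical simplification, not the kernel's fs_parser API. It shows a filesystem adding a private "15s or 2h" timeout type without the generic parser changing.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct demo_result { unsigned long long value; };

/* A syntax type is just a function: 0 on success, -EINVAL on bad input. */
typedef int demo_param_type(const char *arg, struct demo_result *out);

struct demo_param_spec {
    const char *name;
    demo_param_type *type;
    int opt;               /* option number returned to the filesystem */
};

/* A "common library" type: unsigned 32-bit decimal. */
int demo_param_is_u32(const char *arg, struct demo_result *out)
{
    char *end;
    unsigned long v;

    if (arg[0] < '0' || arg[0] > '9')
        return -EINVAL;
    errno = 0;
    v = strtoul(arg, &end, 10);
    if (errno || *end != '\0' || v > 0xffffffffUL)
        return -EINVAL;
    out->value = v;
    return 0;
}

/* A filesystem-private type: "15s", "10m" or "2h", stored in seconds. */
int demo_param_is_timeout(const char *arg, struct demo_result *out)
{
    char *end;
    unsigned long v, mult;

    if (arg[0] < '0' || arg[0] > '9')
        return -EINVAL;
    errno = 0;
    v = strtoul(arg, &end, 10);
    if (errno || v > 0xffffffffUL)
        return -EINVAL;
    switch (*end) {
    case 's': mult = 1; break;
    case 'm': mult = 60; break;
    case 'h': mult = 3600; break;
    default: return -EINVAL;
    }
    if (end[1] != '\0')
        return -EINVAL;
    out->value = (unsigned long long)v * mult;
    return 0;
}

/* Generic parser: no switch over types, it only calls through the table. */
int demo_parse(const struct demo_param_spec *specs, const char *key,
               const char *arg, struct demo_result *out)
{
    assert(specs != NULL);
    for (const struct demo_param_spec *p = specs; p->name; p++) {
        if (strcmp(p->name, key) == 0)
            return p->type(arg, out) ? -EINVAL : p->opt;
    }
    return -ENOENT;
}
```

A new type only needs a new function and a table entry pointing at it; `demo_parse()` stays untouched, which is the property the pull message is after.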

* 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (25 commits)
tmpfs: switch to use of invalfc()
cgroup1: switch to use of errorfc() et.al.
procfs: switch to use of invalfc()
hugetlbfs: switch to use of invalfc()
cramfs: switch to use of errofc() et.al.
gfs2: switch to use of errorfc() et.al.
fuse: switch to use errorfc() et.al.
ceph: use errorfc() and friends instead of spelling the prefix out
prefix-handling analogues of errorf() and friends
turn fs_param_is_... into functions
fs_parse: handle optional arguments sanely
fs_parse: fold fs_parameter_desc/fs_parameter_spec
fs_parser: remove fs_parameter_description name field
add prefix to fs_context->log
ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log
new primitive: __fs_parse()
switch rbd and libceph to p_log-based primitives
struct p_log, variants of warnf() et.al. taking that one instead
teach logfc() to handle prefices, give it saner calling conventions
get rid of cg_invalf()
...

Linus Torvalds
2020-02-09 05:26:41 +0800

08 Feb, 2020

5 commits

d7167b149 fs_parse: fold fs_parameter_desc/fs_parameter_spec

The former contains nothing but a pointer to an array of the latter...

Signed-off-by: Al Viro

Al Viro
2020-02-08 03:48:37 +0800
96cafb9cc fs_parser: remove fs_parameter_description name field

Unused now.

Signed-off-by: Eric Sandeen
Acked-by: David Howells
Signed-off-by: Al Viro

Eric Sandeen
2020-02-08 03:48:36 +0800
c80c98f0d ceph_parse_param(), ceph_parse_mon_ips(): switch to passing fc_log

... and now errorf() et.al. are never called with NULL fs_context,
so we can get rid of conditional in those.

Signed-off-by: Al Viro

Al Viro
2020-02-08 03:48:34 +0800
7f5d38141 new primitive: __fs_parse()

fs_parse() analogue taking p_log instead of fs_context.
fs_parse() turned into a wrapper, callers in ceph_common and rbd
switched to __fs_parse().

As a result, fs_parse() never gets a NULL fs_context, and neither
do the fs_context-based logging primitives.

Signed-off-by: Al Viro

Al Viro
2020-02-08 03:48:34 +0800
2c3f3dc31 switch rbd and libceph to p_log-based primitives

Signed-off-by: Al Viro

Al Viro
2020-02-08 03:48:33 +0800

07 Feb, 2020

1 commit

2710c957a fs_parse: get rid of ->enums

Don't do a single array; attach them to fsparam_enum() entry
instead. And don't bother trying to embed the names into those -
it actually loses memory, with no real speedup worth mentioning.

Simplifies validation as well.

Signed-off-by: Al Viro

Al Viro
2020-02-07 13:12:50 +0800

27 Jan, 2020

2 commits

24604f7e2 ceph: move net/ceph/ceph_fs.c to fs/ceph/util.c

All of these functions are only called from CephFS, so move them into
ceph.ko, and drop the exports.

Signed-off-by: Jeff Layton
Reviewed-by: Ilya Dryomov
Signed-off-by: Ilya Dryomov

Jeff Layton
2020-01-27 23:53:40 +0800
78beb0ff2 ceph: use copy-from2 op in copy_file_range

Instead of using the copy-from operation, switch copy_file_range to the
new copy-from2 operation, which allows sending the truncate_seq and
truncate_size parameters.

If an OSD does not support the copy-from2 operation it will return
-EOPNOTSUPP. In that case, the kernel client will stop trying to do
remote object copies for this fs client and will always use the generic
VFS copy_file_range.

Signed-off-by: Luis Henriques
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov
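The fallback described above amounts to a sticky per-client flag. A rough userspace sketch, assuming hypothetical names (`demo_fs_client`, `demo_copy_file_range()`, and the two stand-in "OSD" callbacks); the real client's control flow is more involved.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_fs_client {
    bool copyfrom2_unsupported;  /* sticky: set after the first -EOPNOTSUPP */
    int remote_attempts;         /* demo counters */
    int generic_copies;
};

/* Stand-in for an OSD that lacks copy-from2. */
int demo_remote_copy_unsupported(void) { return -EOPNOTSUPP; }

/* Stand-in for an OSD that supports copy-from2. */
int demo_remote_copy_ok(void) { return 0; }

int demo_copy_file_range(struct demo_fs_client *fsc, int (*remote_copy)(void))
{
    assert(fsc != NULL && remote_copy != NULL);

    if (!fsc->copyfrom2_unsupported) {
        int ret;

        fsc->remote_attempts++;
        ret = remote_copy();
        if (ret != -EOPNOTSUPP)
            return ret;
        /* Never try remote object copies again for this fs client. */
        fsc->copyfrom2_unsupported = true;
    }

    /* Generic path: plain read plus write through the VFS. */
    fsc->generic_copies++;
    return 0;
}
```

After one -EOPNOTSUPP the client pays no further round trips for an operation the cluster cannot serve.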

Luis Henriques
2020-01-27 23:53:40 +0800

28 Nov, 2019

1 commit

82995cc6c libceph, rbd, ceph: convert to use the new mount API

Convert the ceph filesystem to the new internal mount API as the old
one will be obsoleted and removed. This allows greater flexibility in
communication of mount parameters between userspace, the VFS and the
filesystem.

See Documentation/filesystems/mount_api.txt for more information.

[ Numerous string handling, leak and regression fixes; rbd conversion
was particularly broken and had to be redone almost from scratch. ]

Signed-off-by: David Howells
Signed-off-by: Jeff Layton
Signed-off-by: Ilya Dryomov

David Howells
2019-11-28 05:28:37 +0800

25 Nov, 2019

1 commit

d8f544c30 libceph: drop unnecessary check from dispatch() in mon_client.c

con->private is set in ceph_con_init() and is never cleared.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2019-11-25 18:44:02 +0800

16 Sep, 2019

6 commits

cf73d882c libceph: use ceph_kvmalloc() for osdmap arrays

osdmap has a bunch of arrays that grow linearly with the number of
OSDs. osd_state, osd_weight and osd_primary_affinity take 4 bytes per
OSD. osd_addr takes 136 bytes per OSD because of sockaddr_storage.
The CRUSH workspace area also grows linearly with the number of OSDs.

Normally these arrays are allocated at client startup. The osdmap is
usually updated in small incrementals, but once in a while a full map
may need to be processed. For a cluster with 10000 OSDs, this means
a bunch of 40K allocations followed by a 1.3M allocation, all of which
are currently required to be physically contiguous. This results in
sporadic ENOMEM errors, hanging the client.

Go back to manually (re)allocating arrays and use ceph_kvmalloc() to
fall back to non-contiguous allocation when necessary.

Link: https://tracker.ceph.com/issues/40481
Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton
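The sizes quoted above follow directly from the per-OSD figures. The sketch below reproduces the arithmetic and, assuming 4 KiB pages and power-of-two rounding of contiguous allocations, estimates the allocation order each array would need. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Per-OSD sizes quoted in the commit message. */
#define DEMO_SMALL_ENTRY_BYTES 4u    /* osd_state, osd_weight, osd_primary_affinity */
#define DEMO_ADDR_ENTRY_BYTES  136u  /* osd_addr */

size_t demo_array_bytes(size_t num_osds, size_t bytes_per_osd)
{
    return num_osds * bytes_per_osd;
}

/* Smallest order such that (page_size << order) covers `bytes`. */
unsigned int demo_alloc_order(size_t bytes, size_t page_size)
{
    unsigned int order = 0;
    size_t span = page_size;

    assert(page_size > 0);
    while (span < bytes) {
        span <<= 1;
        order++;
    }
    return order;
}
```

For 10000 OSDs this gives 40000 bytes (order 4) for each small array and 1360000 bytes (order 9) for osd_addr, which is why a physically contiguous allocation can fail on a fragmented system.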

Ilya Dryomov
2019-09-16 18:06:25 +0800
10c12851a libceph: avoid a __vmalloc() deadlock in ceph_kvmalloc()

The vmalloc allocator doesn't fully respect the specified gfp mask:
while the actual pages are allocated as requested, the page table pages
are always allocated with GFP_KERNEL. ceph_kvmalloc() may be called
with GFP_NOFS and GFP_NOIO (for ceph and rbd respectively), so this may
result in a deadlock.

There is no real reason for the current PAGE_ALLOC_COSTLY_ORDER logic,
it's just something that seemed sensible at the time (ceph_kvmalloc()
predates kvmalloc()). kvmalloc() is smarter: in an attempt to reduce
long term fragmentation, it first tries to kmalloc non-disruptively.

Switch to kvmalloc() and set the respective PF_MEMALLOC_* flag using
the scope API to avoid the deadlock. Note that kvmalloc() needs to be
passed GFP_KERNEL to enable the fallback.

Signed-off-by: Ilya Dryomov
Reviewed-by: Jeff Layton

Ilya Dryomov
2019-09-16 18:06:25 +0800
8edf84ba4 libceph: drop unused con parameter of calc_target()

This bit was omitted from a561372405cf ("libceph: fix PG split vs OSD
(re)connect race") to avoid backport conflicts.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2019-09-16 18:06:25 +0800
4766815b1 libceph: handle OSD op ceph_pagelist_append() errors

osd_req_op_cls_init() and osd_req_op_xattr_init() currently propagate
ceph_pagelist_alloc() ENOMEM errors but ignore ceph_pagelist_append()
memory allocation failures. Add these checks and cleanup on error.

Signed-off-by: David Disseldorp
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov
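The pattern being added (check every append, release the list on failure) can be sketched as follows. This is a userspace toy with hypothetical names (`demo_pagelist`, `demo_op_cls_init()`) and a fault-injection counter; it is not the libceph pagelist implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct demo_pagelist {
    char *buf;
    size_t len;
};

int demo_live_lists;                  /* outstanding lists, for the demo */
int demo_appends_until_failure = -1;  /* fault injection; -1 means never fail */

struct demo_pagelist *demo_pagelist_alloc(void)
{
    struct demo_pagelist *pl = calloc(1, sizeof(*pl));

    if (pl)
        demo_live_lists++;
    return pl;
}

void demo_pagelist_release(struct demo_pagelist *pl)
{
    assert(pl != NULL);
    free(pl->buf);
    free(pl);
    demo_live_lists--;
}

int demo_pagelist_append(struct demo_pagelist *pl, const void *data, size_t len)
{
    char *nbuf;

    if (demo_appends_until_failure == 0)
        return -ENOMEM;               /* simulated allocation failure */
    if (demo_appends_until_failure > 0)
        demo_appends_until_failure--;

    nbuf = realloc(pl->buf, pl->len + len + 1);
    if (!nbuf)
        return -ENOMEM;
    memcpy(nbuf + pl->len, data, len);
    pl->buf = nbuf;
    pl->len += len;
    return 0;
}

/* Both allocation and append errors propagate; nothing leaks on failure. */
int demo_op_cls_init(const char *class_name, const char *method,
                     struct demo_pagelist **out)
{
    struct demo_pagelist *pl = demo_pagelist_alloc();
    int ret;

    if (!pl)
        return -ENOMEM;
    ret = demo_pagelist_append(pl, class_name, strlen(class_name));
    if (ret)
        goto err;
    ret = demo_pagelist_append(pl, method, strlen(method));
    if (ret)
        goto err;
    *out = pl;
    return 0;
err:
    demo_pagelist_release(pl);
    return ret;
}
```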

David Disseldorp
2019-09-16 18:06:25 +0800
2cef0ba80 libceph: add function that clears osd client's abort_err

Signed-off-by: "Yan, Zheng"
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov

Yan, Zheng
2019-09-16 18:06:23 +0800
120a75ea9 libceph: add function that reset client's entity addr

This function also re-opens connections to OSD/MON, and re-sends in-flight
OSD requests after re-opening connections to OSD.

Signed-off-by: "Yan, Zheng"
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov

Yan, Zheng
2019-09-16 18:06:23 +0800

28 Aug, 2019

1 commit

e8c99200b libceph: don't call crypto_free_sync_skcipher() on a NULL tfm

In set_secret(), key->tfm is assigned to NULL on line 55, and then
ceph_crypto_key_destroy(key) is executed.

ceph_crypto_key_destroy(key)
crypto_free_sync_skcipher(key->tfm)
crypto_free_skcipher(&tfm->base);

This happens to work because crypto_sync_skcipher is a trivial wrapper
around crypto_skcipher: &tfm->base is still 0 and crypto_free_skcipher()
handles that. Let's not rely on the layout of crypto_sync_skcipher.

This bug is found by a static analysis tool STCheck written by us.

Fixes: 69d6302b65a8 ("libceph: Remove VLA usage of skcipher").
Signed-off-by: Jia-Ju Bai
Reviewed-by: Ilya Dryomov
Signed-off-by: Ilya Dryomov
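The safer shape is an explicit NULL check before the free, rather than relying on the wrapper's member sitting at offset 0. A minimal userspace sketch, assuming hypothetical `demo_*` types in place of the crypto API:

```c
#include <assert.h>
#include <stdlib.h>

struct demo_cipher { int state; };

struct demo_key {
    void *key;                /* raw key material */
    struct demo_cipher *tfm;  /* may legitimately be NULL */
};

int demo_cipher_free_calls;   /* demo counter */

/* Stand-in for a free routine that should not be handed NULL. */
void demo_cipher_free(struct demo_cipher *tfm)
{
    assert(tfm != NULL);
    demo_cipher_free_calls++;
    free(tfm);
}

void demo_key_destroy(struct demo_key *key)
{
    if (!key)
        return;
    free(key->key);
    key->key = NULL;
    if (key->tfm) {           /* guard: skip the free for a NULL tfm */
        demo_cipher_free(key->tfm);
        key->tfm = NULL;
    }
}
```

Clearing the pointer after the free also makes a second destroy call harmless.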

Jia-Ju Bai
2019-08-28 18:33:46 +0800

22 Aug, 2019

1 commit

a56137240 libceph: fix PG split vs OSD (re)connect race

We can't rely on ->peer_features in calc_target() because it may be
called both when the OSD session is established and open and when it's
not. ->peer_features is not valid unless the OSD session is open. If
this happens on a PG split (pg_num increase), that could mean we don't
resend a request that should have been resent, hanging the client
indefinitely.

In userspace this was fixed by looking at require_osd_release and
get_xinfo[osd].features fields of the osdmap. However these fields
belong to the OSD section of the osdmap, which the kernel doesn't
decode (only the client section is decoded).

Instead, let's drop this feature check. It effectively checks for
luminous, so only pre-luminous OSDs would be affected in that on a PG
split the kernel might resend a request that should not have been
resent. Duplicates can occur in other scenarios, so both sides should
already be prepared for them: see dup/replay logic on the OSD side and
retry_attempt check on the client side.

Cc: stable@vger.kernel.org
Fixes: 7de030d6b10a ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT")
Link: https://tracker.ceph.com/issues/41162
Reported-by: Jerry Lee
Signed-off-by: Ilya Dryomov
Tested-by: Jerry Lee
Reviewed-by: Jeff Layton

Ilya Dryomov
2019-08-22 16:47:41 +0800

19 Jul, 2019

1 commit

d9b9c8930 Merge tag 'ceph-for-5.3-rc1' of git://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
"Lots of exciting things this time!

- support for rbd object-map and fast-diff features (myself). This
will speed up reads, discards and things like snap diffs on sparse
images.

- ceph.snap.btime vxattr to expose snapshot creation time (David
Disseldorp). This will be used to integrate with "Restore Previous
Versions" feature added in Windows 7 for folks who reexport ceph
through SMB.

- security xattrs for ceph (Zheng Yan). Only selinux is supported for
now due to the limitations of ->dentry_init_security().

- support for MSG_ADDR2, FS_BTIME and FS_CHANGE_ATTR features (Jeff
Layton). This is actually a single feature bit which was missing
because of the filesystem pieces. With this in, the kernel client
will finally be reported as "luminous" by "ceph features" -- it is
still being reported as "jewel" even though all required Luminous
features were implemented in 4.13.

- stop NULL-terminating ceph vxattrs (Jeff Layton). The convention
with xattrs is to not terminate and this was causing
inconsistencies with ceph-fuse.

- change filesystem time granularity from 1 us to 1 ns, again fixing
an inconsistency with ceph-fuse (Luis Henriques).

On top of this there are some additional dentry name handling and cap
flushing fixes from Zheng. Finally, Jeff is formally taking over for
Zheng as the filesystem maintainer"

* tag 'ceph-for-5.3-rc1' of git://github.com/ceph/ceph-client: (71 commits)
ceph: fix end offset in truncate_inode_pages_range call
ceph: use generic_delete_inode() for ->drop_inode
ceph: use ceph_evict_inode to cleanup inode's resource
ceph: initialize superblock s_time_gran to 1
MAINTAINERS: take over for Zheng as CephFS kernel client maintainer
rbd: setallochint only if object doesn't exist
rbd: support for object-map and fast-diff
rbd: call rbd_dev_mapping_set() from rbd_dev_image_probe()
libceph: export osd_req_op_data() macro
libceph: change ceph_osdc_call() to take page vector for response
libceph: bump CEPH_MSG_MAX_DATA_LEN (again)
rbd: new exclusive lock wait/wake code
rbd: quiescing lock should wait for image requests
rbd: lock should be quiesced on reacquire
rbd: introduce copyup state machine
rbd: rename rbd_obj_setup_*() to rbd_obj_init_*()
rbd: move OSD request allocation into object request state machines
rbd: factor out __rbd_osd_setup_discard_ops()
rbd: factor out rbd_osd_setup_copyup()
rbd: introduce obj_req->osd_reqs list
...

Linus Torvalds
2019-07-19 02:05:25 +0800

13 Jul, 2019

1 commit

f632a8170 Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core and debugfs updates from Greg KH:
"Here is the "big" driver core and debugfs changes for 5.3-rc1

It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups.

Other than the debugfs cleanups, in this set of changes we have:

- bus iteration function cleanups

- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way

- cleanups to Documentation/ABI/ entries to make them parse easier
due to typos and other minor things

- default_attrs use for some ktype users

- driver model documentation file conversions to .rst

- compressed firmware file loading

- deferred probe fixes

All of these have been in linux-next for a while, with a bunch of
merge issues that Stephen has been patient with me for"

* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
debugfs: make error message a bit more verbose
orangefs: fix build warning from debugfs cleanup patch
ubifs: fix build warning after debugfs cleanup patch
driver: core: Allow subsystems to continue deferring probe
drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
arch_topology: Remove error messages on out-of-memory conditions
lib: notifier-error-inject: no need to check return value of debugfs_create functions
swiotlb: no need to check return value of debugfs_create functions
ceph: no need to check return value of debugfs_create functions
sunrpc: no need to check return value of debugfs_create functions
ubifs: no need to check return value of debugfs_create functions
orangefs: no need to check return value of debugfs_create functions
nfsd: no need to check return value of debugfs_create functions
lib: 842: no need to check return value of debugfs_create functions
debugfs: provide pr_fmt() macro
debugfs: log errors when something goes wrong
drivers: s390/cio: Fix compilation warning about const qualifiers
drivers: Add generic helper to match by of_node
driver_find_device: Unify the match function with class_find_device()
bus_find_device: Unify the match callback with class_find_device
...

Linus Torvalds
2019-07-13 03:24:03 +0800

11 Jul, 2019

1 commit

028db3e29 Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs"

This reverts merge 0f75ef6a9cff49ff612f7ce0578bced9d0b38325 (and thus
effectively commits

7a1ade847596 ("keys: Provide KEYCTL_GRANT_PERMISSION")
2e12256b9a76 ("keys: Replace uid/gid/perm permissions checking with an ACL")

that the merge brought in).

It turns out that it breaks booting with an encrypted volume, and Eric
Biggers reports that it also breaks the fscrypt tests [1] and loading of
in-kernel X.509 certificates [2].

The root cause of all the breakage is likely the same, but David Howells
is off email so rather than try to work it out it's getting reverted in
order to not impact the rest of the merge window.

[1] https://lore.kernel.org/lkml/20190710011559.GA7973@sol.localdomain/
[2] https://lore.kernel.org/lkml/20190710013225.GB7973@sol.localdomain/

Link: https://lore.kernel.org/lkml/CAHk-=wjxoeMJfeBahnWH=9zShKp2bsVy527vo3_y8HfOdhwAAw@mail.gmail.com/
Reported-by: Eric Biggers <ebiggers@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Linus Torvalds
2019-07-11 09:43:43 +0800

09 Jul, 2019

2 commits

0f75ef6a9 Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull keyring ACL support from David Howells:
"This changes the permissions model used by keys and keyrings to be
based on an internal ACL by the following means:

- Replace the permissions mask internally with an ACL that contains a
list of ACEs, each with a specific subject with a permissions mask.
Potted default ACLs are available for new keys and keyrings.

ACE subjects can be macroised to indicate the UID and GID specified
on the key (which remain). Future commits will be able to add
additional subject types, such as specific UIDs or domain
tags/namespaces.

Also split a number of permissions to give finer control. Examples
include splitting the revocation permit from the change-attributes
permit, thereby allowing someone to be granted permission to revoke
a key without allowing them to change the owner; also the ability
to join a keyring is split from the ability to link to it, thereby
stopping a process accessing a keyring by joining it and thus
acquiring use of possessor permits.

- Provide a keyctl to allow the granting or denial of one or more
permits to a specific subject. Direct access to the ACL is not
granted, and the ACL cannot be viewed"

* tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
keys: Provide KEYCTL_GRANT_PERMISSION
keys: Replace uid/gid/perm permissions checking with an ACL

Linus Torvalds
2019-07-09 10:56:57 +0800
c84ca912b Merge tag 'keys-namespace-20190627' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull keyring namespacing from David Howells:
"These patches help make keys and keyrings more namespace aware.

Firstly some miscellaneous patches to make the process easier:

- Simplify key index_key handling so that the word-sized chunks
assoc_array requires don't have to be shifted about, making it
easier to add more bits into the key.

- Cache the hash value in the key so that we don't have to calculate
on every key we examine during a search (it involves a bunch of
multiplications).

- Allow keyring_search() to search non-recursively.

Then the main patches:

- Make it so that keyring names are per-user_namespace from the point
of view of KEYCTL_JOIN_SESSION_KEYRING so that they're not
accessible cross-user_namespace.

keyctl_capabilities() shows KEYCTL_CAPS1_NS_KEYRING_NAME for this.

- Move the user and user-session keyrings to the user_namespace
rather than the user_struct. This prevents them propagating
directly across user_namespaces boundaries (ie. the KEY_SPEC_*
flags will only pick from the current user_namespace).

- Make it possible to include the target namespace in which the key
shall operate in the index_key. This will allow the possibility of
multiple keys with the same description, but different target
domains to be held in the same keyring.

keyctl_capabilities() shows KEYCTL_CAPS1_NS_KEY_TAG for this.

- Make it so that keys are implicitly invalidated by removal of a
domain tag, causing them to be garbage collected.

- Institute a network namespace domain tag that allows keys to be
differentiated by the network namespace in which they operate. New
keys that are of a type marked 'KEY_TYPE_NET_DOMAIN' are assigned
the network domain in force when they are created.

- Make it so that the desired network namespace can be handed down
into the request_key() mechanism. This allows AFS, NFS, etc. to
request keys specific to the network namespace of the superblock.

This also means that the keys in the DNS record cache are
thenceforth namespaced, provided network filesystems pass the
appropriate network namespace down into dns_query().

For DNS, AFS and NFS are good, whilst CIFS and Ceph are not. Other
cache keyrings, such as idmapper keyrings, also need to set the
domain tag - for which they need access to the network namespace of
the superblock"

* tag 'keys-namespace-20190627' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
keys: Pass the network namespace into request_key mechanism
keys: Network namespace domain tag
keys: Garbage collect keys for which the domain has been removed
keys: Include target namespace in match criteria
keys: Move the user and user-session keyrings to the user_namespace
keys: Namespace keyring names
keys: Add a 'recurse' flag for keyring searches
keys: Cache the hash value to avoid lots of recalculation
keys: Simplify key description management

Linus Torvalds
2019-07-09 10:36:47 +0800

08 Jul, 2019

14 commits

22e8bd51b rbd: support for object-map and fast-diff

Speed up reads, discards and zeroouts through RBD_OBJ_FLAG_MAY_EXIST
and RBD_OBJ_FLAG_NOOP_FOR_NONEXISTENT based on object map.

Invalid object maps are not trusted, but still updated. Note that we
never iterate, resize or invalidate object maps. If object-map feature
is enabled but object map fails to load, we just fail the requester
(either "rbd map" or I/O, by way of post-acquire action).

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2019-07-08 20:01:45 +0800
4cf3e6dff libceph: export osd_req_op_data() macro

We already have one exported wrapper around it for extent.osd_data and
rbd_object_map_update_finish() needs another one for cls.request_data.

Signed-off-by: Ilya Dryomov
Reviewed-by: Dongsheng Yang
Reviewed-by: Jeff Layton

Ilya Dryomov
2019-07-08 20:01:45 +0800
68ada915e libceph: change ceph_osdc_call() to take page vector for response

This will be used for loading object map. rbd_obj_read_sync() isn't
suitable because object map must be accessed through class methods.

Signed-off-by: Ilya Dryomov
Reviewed-by: Dongsheng Yang
Reviewed-by: Jeff Layton

Ilya Dryomov
2019-07-08 20:01:45 +0800
94e857718 libceph: rename r_unsafe_item to r_private_item

This list item remained from when we had safe and unsafe replies
(commit vs ack). It has since become a private list item for use by
clients.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2019-07-08 20:01:44 +0800
2c66de560 libceph: rename ceph_encode_addr to ceph_encode_banner_addr

...ditto for the decode function. We only use these functions to fix
up banner addresses now, so let's name them more appropriately.

Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov

Jeff Layton
2019-07-08 20:01:43 +0800
d3c3c0a84 libceph: use TYPE_LEGACY for entity addrs instead of TYPE_NONE

Going forward, we'll have different address types so let's use
the addr2 TYPE_LEGACY for internal tracking rather than TYPE_NONE.

Also, make ceph_pr_addr print the address type value as well.

Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov

Jeff Layton
2019-07-08 20:01:43 +0800
2f9800c89 ceph: fix decode_locker to use ceph_decode_entity_addr

Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov

Jeff Layton
2019-07-08 20:01:43 +0800
8cb5f2b4f libceph: correctly decode ADDR2 addresses in incremental OSD maps

Given the new format, we have to decode the addresses twice. Once to
skip past the new_up_client field, and a second time to collect the
addresses.

Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov

Jeff Layton
2019-07-08 20:01:43 +0800
51fc7ab44 libceph: fix watch_item_t decoding to use ceph_decode_entity_addr

While we're in there, let's also fix up the decoder to do proper
bounds checking.

Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov

Jeff Layton
2019-07-08 20:01:43 +0800
dcbc919a5 libceph: switch osdmap decoding to use ceph_decode_entity_addr

Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov

Jeff Layton
2019-07-08 20:01:43 +0800
0bfb0f288 libceph: ADDR2 support for monmap

Switch the MonMap decoder to use the new decoding routine for
entity_addr_t's.

Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov

Jeff Layton
2019-07-08 20:01:43 +0800
6c37f0e64 libceph: add ceph_decode_entity_addr

Add a function for decoding an entity_addr_t. Once
CEPH_FEATURE_MSG_ADDR2 is enabled, the server daemons will start
encoding entity_addr_t differently.

Add a new helper function that can handle either format.

Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov
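A decoder that accepts either encoding typically dispatches on a leading marker and bounds-checks before every read. The sketch below uses a made-up toy wire format (one marker byte, then little-endian fields); the field layout, `DEMO_TYPE_LEGACY` value, and function names are assumptions for illustration and do not reproduce the real entity_addr_t encoding.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_TYPE_LEGACY 1u  /* illustrative value */

struct demo_addr {
    uint32_t type;
    uint32_t nonce;
    uint16_t port;
};

static uint16_t demo_get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t demo_get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Toy format:
 *   legacy: marker 0, nonce (le32), port (le16)
 *   new:    marker 1, type (le32), nonce (le32), port (le16)
 * On success *p advances past the address; on error *p is left unchanged.
 */
int demo_decode_addr(const uint8_t **p, const uint8_t *end,
                     struct demo_addr *out)
{
    const uint8_t *cur = *p;
    size_t need;
    uint8_t marker;

    if (end - cur < 1)
        return -EINVAL;
    marker = *cur++;

    if (marker == 1)
        need = 4 + 4 + 2;
    else if (marker == 0)
        need = 4 + 2;
    else
        return -EINVAL;
    if ((size_t)(end - cur) < need)
        return -EINVAL;       /* truncated input */

    if (marker == 1) {
        out->type = demo_get_le32(cur);
        cur += 4;
    } else {
        out->type = DEMO_TYPE_LEGACY;
    }
    out->nonce = demo_get_le32(cur);
    cur += 4;
    out->port = demo_get_le16(cur);
    cur += 2;

    *p = cur;
    return 0;
}
```

Because the cursor only advances on success, a caller can also use the same routine to skip over an address it does not need.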

Jeff Layton
2019-07-08 20:01:43 +0800
bc07532cc libceph: fix sa_family just after reading address

It doesn't make sense to leave it undecoded until later.

Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov

Jeff Layton
2019-07-08 20:01:43 +0800
97a385e55 libceph: remove ceph_get_direct_page_vector()

This function is entirely unused.

Signed-off-by: Christoph Hellwig
Reviewed-by: Ilya Dryomov
Signed-off-by: Ilya Dryomov

Christoph Hellwig
2019-07-08 20:01:40 +0800