12 Oct, 2020

1 commit

  • Replace a global map->crush_workspace (protected by a global mutex)
    with a list of workspaces, up to the number of CPUs + 1.

    This is based on a patch from Robin Geuze.
    Robin and his team have observed a 10-20% increase in IOPS across all
    queue depths, as well as lower CPU usage, on a high-end all-NVMe
    100GbE cluster.
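
    A minimal sketch of the idea, assuming hypothetical names (the actual
    fields and helpers in the patch may differ): idle workspaces sit on a
    lock-protected list and a mapping operation grabs one, instead of
    every mapping serializing on a single global workspace.

        struct crush_ws {                        /* hypothetical wrapper */
            struct list_head node;
            struct crush_work *work;             /* the actual CRUSH scratch space */
        };

        struct workspace_manager {               /* hypothetical */
            struct list_head idle_ws;            /* unused workspaces */
            spinlock_t ws_lock;
            int total_ws;                        /* allocated so far, capped */
        };

        static struct crush_ws *get_workspace(struct workspace_manager *wsm)
        {
            struct crush_ws *ws = NULL;

            spin_lock(&wsm->ws_lock);
            if (!list_empty(&wsm->idle_ws)) {
                ws = list_first_entry(&wsm->idle_ws, struct crush_ws, node);
                list_del(&ws->node);
            } else if (wsm->total_ws < num_possible_cpus() + 1) {
                wsm->total_ws++;                 /* reserve a slot for a new one */
            }
            spin_unlock(&wsm->ws_lock);
            /* if ws is NULL: either allocate the reserved workspace outside
             * the lock, or sleep until another user puts one back */
            return ws;
        }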

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     

01 Jun, 2020

4 commits

  • OSD-side issues with reads from replica have been resolved in
    Octopus. Reading from replica should be safe wrt. unstable or
    uncommitted state now, so add support for balanced and localized
    reads.

    There are two cases when a read from replica can't be served:

    - OSD may silently drop the request, expecting the client to
    notice that the acting set has changed and resend via the usual
    means (handled with t->used_replica)

    - OSD may return EAGAIN, expecting the client to resend to the
    primary, ignoring replica read flags (see handle_reply())
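
    A rough sketch of how the second case could look in the reply path
    (resend_to_primary() is a hypothetical helper; the real handle_reply()
    logic is more involved):

        /* On EAGAIN for a replica read, drop the replica-read flags and
         * resend the request to the primary OSD. */
        if (result == -EAGAIN && req->r_t.used_replica) {
            req->r_flags &= ~(CEPH_OSD_FLAG_BALANCE_READS |
                              CEPH_OSD_FLAG_LOCALIZE_READS);
            resend_to_primary(req);              /* hypothetical helper */
            return;
        }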

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Jeff Layton

    Ilya Dryomov
     
  • Allow expressing client's location in terms of CRUSH hierarchy as
    a set of (bucket type name, bucket name) pairs. The userspace syntax
    "crush_location = key1=value1 key2=value2" is incompatible with mount
    options and needed adaptation. Key-value pairs are separated by '|'
    and we use ':' instead of '=' to separate keys from values. So for:

    crush_location = host=foo rack=bar

    one would write:

    crush_location=host:foo|rack:bar

    As in userspace, "multipath" locations are supported, so indicating
    locality for parallel hierarchies is possible:

    crush_location=rack:foo1|rack:foo2|datacenter:bar
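
    A self-contained userspace sketch of parsing the adapted syntax
    (illustrative only, not the kernel's actual option parser; the
    function name is made up):

        #define _GNU_SOURCE
        #include <stdio.h>
        #include <string.h>

        /* Split "host:foo|rack:bar" into (bucket type, bucket name) pairs.
         * The buffer is modified in place, so pass a writable copy. */
        static void parse_crush_location(char *s)
        {
            char *pair;

            while ((pair = strsep(&s, "|")) != NULL) {
                char *name = strchr(pair, ':');

                if (!name)
                    continue;                    /* malformed pair, skip */
                *name++ = '\0';
                printf("bucket type '%s', bucket name '%s'\n", pair, name);
            }
        }

    For example, parse_crush_location(strdup("host:foo|rack:bar")) would
    yield the pairs (host, foo) and (rack, bar).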

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Jeff Layton

    Ilya Dryomov
     
  • These would be matched with the provided client location to calculate
    the locality value.

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Jeff Layton

    Ilya Dryomov
     
  • Needed for the next commit and useful for ceph_pg_pool_info tree as
    well. I'm leaving the asserting helper in for now, but we should look
    at getting rid of it in the future.

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Jeff Layton

    Ilya Dryomov
     

23 Mar, 2020

1 commit

  • CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult
    per-pool flags as well. Unfortunately the backwards compatibility here
    is lacking:

    - the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but
    was guarded by require_osd_release >= RELEASE_LUMINOUS
    - it was subsequently backported to luminous in v12.2.2, but that makes
    no difference to clients that only check OSDMAP_FULL/NEARFULL because
    require_osd_release is not client-facing -- it is for OSDs

    Since all kernels are affected, the best we can do here is just start
    checking both map flags and pool flags and send that to stable.

    These checks are best effort, so take osdc->lock and look up pool flags
    just once. Remove the FIXME, since filesystem quotas are checked above
    and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches
    its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set.
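
    A condensed sketch of the resulting check (field and helper names
    approximate the kernel ones; NEARFULL and the surrounding locking are
    omitted):

        /* The target counts as full if either the legacy map-wide flag or
         * the per-pool flag is set. */
        static bool target_pool_full(struct ceph_osd_client *osdc,
                                     struct ceph_pg_pool_info *pi)
        {
            return ceph_osdmap_flag(osdc, CEPH_OSDMAP_FULL) ||
                   (pi->flags & CEPH_POOL_FLAG_FULL);
        }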

    Cc: stable@vger.kernel.org
    Reported-by: Yanhu Cao
    Signed-off-by: Ilya Dryomov
    Reviewed-by: Jeff Layton
    Acked-by: Sage Weil

    Ilya Dryomov
     

16 Sep, 2019

1 commit

  • osdmap has a bunch of arrays that grow linearly with the number of
    OSDs. osd_state, osd_weight and osd_primary_affinity take 4 bytes per
    OSD. osd_addr takes 136 bytes per OSD because of sockaddr_storage.
    The CRUSH workspace area also grows linearly with the number of OSDs.

    Normally these arrays are allocated at client startup. The osdmap is
    usually updated in small incrementals, but once in a while a full map
    may need to be processed. For a cluster with 10000 OSDs, this means
    a bunch of 40K allocations followed by a 1.3M allocation, all of which
    are currently required to be physically contiguous. This results in
    sporadic ENOMEM errors, hanging the client.

    Go back to manually (re)allocating arrays and use ceph_kvmalloc() to
    fall back to non-contiguous allocation when necessary.
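
    A sketch of the reallocation pattern, assuming a hypothetical helper
    (the actual code reallocates each osdmap array explicitly): the point
    is that ceph_kvmalloc() can fall back to vmalloc, so a large per-OSD
    array no longer has to be physically contiguous.

        /* Grow (or shrink) a per-OSD array without requiring physically
         * contiguous memory. */
        static void *resize_osd_array(void *old, u32 old_count, u32 new_count,
                                      size_t elem_size)
        {
            void *p = ceph_kvmalloc(array_size(new_count, elem_size), GFP_NOIO);

            if (!p)
                return NULL;

            if (old) {
                memcpy(p, old, (size_t)min(old_count, new_count) * elem_size);
                kvfree(old);
            }
            return p;
        }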

    Link: https://tracker.ceph.com/issues/40481
    Signed-off-by: Ilya Dryomov
    Reviewed-by: Jeff Layton

    Ilya Dryomov
     

08 Jul, 2019

2 commits


06 Mar, 2019

1 commit

  • One of the more common cases of allocation size calculations is finding
    the size of a structure that has a zero-sized array at the end, along
    with memory for some number of elements for that array. For example:

    struct foo {
            int stuff;
            struct boo entry[];
    };

    instance = kmalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);

    Instead of leaving these open-coded and prone to type mistakes, we can
    now use the new struct_size() helper:

    instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);
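
    The helper computes essentially the same size, but with overflow
    checking: on overflow it saturates to SIZE_MAX so the allocation fails
    instead of being undersized. Schematically:

        struct_size(instance, entry, count)
            == sizeof(*instance) + count * sizeof(*instance->entry)   /* absent overflow */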

    This code was detected with the help of Coccinelle.

    Signed-off-by: Gustavo A. R. Silva
    Reviewed-by: Ilya Dryomov
    Signed-off-by: Ilya Dryomov

    Gustavo A. R. Silva
     

15 Jun, 2018

1 commit

  • Pull ceph updates from Ilya Dryomov:
    "The main piece is a set of libceph changes that revamps how OSD
    requests are aborted, improving CephFS ENOSPC handling and making
    "umount -f" actually work (Zheng and myself).

    The rest is mostly mount option handling cleanups from Chengguang and
    assorted fixes from Zheng, Luis and Dongsheng.

    * tag 'ceph-for-4.18-rc1' of git://github.com/ceph/ceph-client: (31 commits)
    rbd: flush rbd_dev->watch_dwork after watch is unregistered
    ceph: update description of some mount options
    ceph: show ino32 if the value is different with default
    ceph: strengthen rsize/wsize/readdir_max_bytes validation
    ceph: fix alignment of rasize
    ceph: fix use-after-free in ceph_statfs()
    ceph: prevent i_version from going back
    ceph: fix wrong check for the case of updating link count
    libceph: allocate the locator string with GFP_NOFAIL
    libceph: make abort_on_full a per-osdc setting
    libceph: don't abort reads in ceph_osdc_abort_on_full()
    libceph: avoid a use-after-free during map check
    libceph: don't warn if req->r_abort_on_full is set
    libceph: use for_each_request() in ceph_osdc_abort_on_full()
    libceph: defer __complete_request() to a workqueue
    libceph: move more code into __complete_request()
    libceph: no need to call flush_workqueue() before destruction
    ceph: flush pending works before shutdown super
    ceph: abort osd requests on force umount
    libceph: introduce ceph_osdc_abort_requests()
    ...

    Linus Torvalds
     

13 Jun, 2018

1 commit

  • The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
    patch replaces cases of:

    kmalloc(a * b, gfp)

    with:
    kmalloc_array(a, b, gfp)

    as well as handling cases of:

    kmalloc(a * b * c, gfp)

    with:

    kmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The tools/ directory was manually excluded, since it has its own
    implementation of kmalloc().
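
    A minimal illustration of the hazard the 2-factor form guards against
    (hypothetical values): the open-coded multiplication can wrap around,
    while kmalloc_array() detects the overflow and fails cleanly.

        size_t count = SIZE_MAX / 2 + 2, size = 2;
        void *buf;

        buf = kmalloc(count * size, GFP_KERNEL);       /* product wraps to 2 bytes */
        buf = kmalloc_array(count, size, GFP_KERNEL);  /* overflow detected, returns NULL */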

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kmalloc
    + kmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kmalloc(sizeof(THING) * C2, ...)
    |
    kmalloc(sizeof(TYPE) * C2, ...)
    |
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(C1 * C2, ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

05 Jun, 2018

1 commit

  • calc_target() isn't supposed to fail with anything but POOL_DNE, in
    which case we report that the pool doesn't exist and fail the request
    with -ENOENT. Doing this for -ENOMEM is at the very least confusing
    and also harmful -- as the preceding requests complete, a short-lived
    locator string allocation is likely to succeed after a wait.

    (We used to call ceph_object_locator_to_pg() for a pi lookup. In
    theory that could fail with -ENOENT, hence the "ret != -ENOENT" warning
    being removed.)

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     

02 Apr, 2018

3 commits


02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to apply to
    a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing
    SPDX tag:value files created by Philippe Ombredanne. Philippe
    prepared the base worksheet, and did an initial spot review of a few
    thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

20 Sep, 2017

1 commit

  • This reverts most of commit f53b7665c8ce ("libceph: upmap semantic
    changes").

    We need to prevent duplicates in the final result. For example, we
    can currently take

    [1,2,3] and apply [(1,2)] and get [2,2,3]

    or

    [1,2,3] and apply [(3,2)] and get [1,2,2]

    The rest of the system is not prepared to handle duplicates in the
    result set like this.

    The reverted piece was intended to allow

    [1,2,3] and [(1,2),(2,1)] to get [2,1,3]

    to reorder primaries. First, this bidirectional swap is hard to
    implement in a way that also prevents dups. For example, [1,2,3] and
    [(1,4),(2,3),(3,4)] would give [4,3,4], but if we just drop the last
    step we'd have [4,3,3], which is also invalid, etc. Simpler to just not
    handle bidirectional swaps. In practice, they are not needed: if you
    just want to choose a different primary then use primary_affinity, or
    pg_upmap (not pg_upmap_items).
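
    A rough sketch of the restored, unidirectional semantics (hypothetical
    helper operating on a plain array; the real apply_upmap() walks the
    osdmap's pg_upmap_items mappings): a (from, to) item is applied only
    if "from" is present and "to" is not already in the set, which is what
    rules out duplicates.

        static void apply_upmap_items(int *osds, int len,
                                      const int (*pairs)[2], int npairs)
        {
            int i, j;

            for (i = 0; i < npairs; i++) {
                int from = pairs[i][0], to = pairs[i][1];
                int pos = -1;
                bool dup = false;

                for (j = 0; j < len; j++) {
                    if (osds[j] == to)
                        dup = true;              /* would create a duplicate */
                    if (osds[j] == from)
                        pos = j;
                }
                if (pos >= 0 && !dup)
                    osds[pos] = to;
            }
        }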

    Cc: stable@vger.kernel.org # 4.13
    Link: http://tracker.ceph.com/issues/21410
    Signed-off-by: Ilya Dryomov
    Reviewed-by: Sage Weil

    Ilya Dryomov
     

01 Aug, 2017

4 commits


17 Jul, 2017

3 commits


07 Jul, 2017

15 commits

  • Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • If there is no crush_choose_arg_map for a given pool, a NULL pointer is
    passed to preserve existing crush_do_rule() behavior.

    Reflects ceph.git commits 55fb91d64071552ea1bc65ab4ea84d3c8b73ab4b,
    dbe36e08be00c6519a8c89718dd47b0219c20516.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • bucket_straw2_choose needs to use weights that may be different from
    item_weights, for instance to compensate for an uneven distribution
    caused by a low number of values, or to fix the probability bias
    introduced by conditional probabilities (see
    http://tracker.ceph.com/issues/15653 for more information).

    We introduce a weight_set for each straw2 bucket to set the desired
    weight for a given item at a given position. The weight of a given
    item when picking the first replica (first position) may be different
    from its weight when picking the second replica (second position).
    For instance, the weight matrix for a given bucket containing items
    3, 7 and 13 could be as follows:

                   position 0   position 1

    item  3          0x10000     0x100000
    item  7          0x40000      0x10000
    item 13          0x40000      0x10000

    When crush_do_rule picks the first of two replicas (position 0),
    items 7 and 13 are four times more likely to be chosen by
    bucket_straw2_choose than item 3. When choosing the second replica
    (position 1), item 3 is sixteen times more likely to be chosen than
    items 7 and 13.

    By default the weight_set of each bucket exactly matches the content of
    item_weights for each position to ensure backward compatibility.

    bucket_straw2_choose compares items by using their id. The same ids are
    also used to index buckets and they must be unique. For each item in a
    bucket an array of ids can be provided for placement purposes and they
    are used instead of the ids. If no replacement ids are provided, the
    legacy behavior is preserved.
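
    A simplified sketch of how the override is consulted (field names
    approximate the new crush_choose_arg / weight_set structures; the
    real lookup also handles the id remapping mentioned above):

        static __u32 item_weight(const struct crush_bucket_straw2 *b,
                                 const struct crush_choose_arg *arg,
                                 int item_index, int position)
        {
            if (arg && arg->weight_set && position < arg->weight_set_size)
                return arg->weight_set[position].weights[item_index];
            return b->item_weights[item_index];  /* legacy behaviour */
        }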

    Reflects ceph.git commit 19537a450fd5c5a0bb8b7830947507a76db2ceca.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Previously, pg_to_raw_osds() didn't filter for existent OSDs because
    raw_to_up_osds() would filter for "up" ("up" is predicated on "exists")
    and raw_to_up_osds() was called directly after pg_to_raw_osds(). Now,
    with the apply_upmap() call in there, nonexistent OSDs in pg_to_raw_osds()
    output can affect apply_upmap(). Introduce remove_nonexistent_osds()
    to deal with that.
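
    Roughly, the new step compacts the raw OSD vector in place, dropping
    entries for OSDs that no longer exist (a sketch; the actual helper
    also keeps the remaining entries in order, as done here):

        static void remove_nonexistent_osds(struct ceph_osdmap *map,
                                            struct ceph_osds *raw)
        {
            int i, len = 0;

            for (i = 0; i < raw->size; i++) {
                if (ceph_osd_exists(map, raw->osds[i]))
                    raw->osds[len++] = raw->osds[i];
            }
            raw->size = len;
        }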

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Move raw_pg_to_pg() call out of get_temp_osds() and into
    ceph_pg_to_up_acting_osds(), for upcoming apply_upmap().

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • pg_temp and pg_upmap encodings are the same (PG -> array of osds),
    except for the incremental remove: it's an empty mapping in new_pg_temp
    for pg_temp and a separate old_pg_upmap set for pg_upmap. (This isn't
    to allow for empty pg_upmap mappings -- apparently, pg_temp just wasn't
    looked at as an example for pg_upmap encoding.)

    Reuse __decode_pg_temp() for decoding pg_upmap and new_pg_upmap.
    __decode_pg_temp() stores into pg_temp union member, but since pg_upmap
    union member is identical, reading through pg_upmap later is OK.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Some of these won't be as efficient as they could be (e.g.
    ceph_decode_skip_set(... 32 ...) could advance by len * sizeof(u32)
    once instead of advancing by sizeof(u32) len times), but that's fine
    and not worth a bunch of extra macro code.

    Replace skip_name_map() with ceph_decode_skip_map as an example.
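
    For illustration, the per-element pattern looks roughly like this
    (simplified, with a hypothetical decode of the element count; the
    real macros also bounds-check against the end of the buffer):

        u32 i, n = decode_count(&p);             /* hypothetical; p is the decode cursor */

        for (i = 0; i < n; i++)
            p += sizeof(u32);                    /* a single p += n * sizeof(u32) would do */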

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Switch to DEFINE_RB_FUNCS2-generated {insert,lookup,erase}_pg_mapping().

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Make __{lookup,remove}_pg_mapping() look like their ceph_spg_mapping
    counterparts: take const struct ceph_pg *.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Note that ceph_osd_request_target fields are updated regardless of
    RESEND_ON_SPLIT.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Store both raw pgid and actual spgid in ceph_osd_request_target.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • The old (v15) pi->last_force_request_resend has been repurposed to
    make pre-RESEND_ON_SPLIT clients that don't check for PG splits but do
    obey pi->last_force_request_resend resend on splits. See ceph.git
    commit 189ca7ec6420 ("mon/OSDMonitor: make pre-luminous clients resend
    ops on split").

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov