Eric Lee / smarc-fsl-linux-kernel

28 Jul, 2016

4 commits

774a6a118 ceph: reduce i_nr_by_mode array size

Track the usage count for each individual fmode bit. This can reduce the
array size by half.
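
A minimal sketch of the per-bit counting idea, assuming four hypothetical mode bits. The names MODE_*, open_counts, count_fmode() and modes_in_use() are illustrative and are not taken from the kernel source. A table indexed by mode combination needs one slot per combination, whereas this variant needs one slot per bit:

```c
#include <stddef.h>

/* Hypothetical mode bits, chosen for illustration only. */
#define MODE_PIN  0x1u
#define MODE_RD   0x2u
#define MODE_WR   0x4u
#define MODE_LAZY 0x8u
#define MODE_BITS 4

struct open_counts {
	int nr_by_bit[MODE_BITS];	/* one counter per bit, not per combination */
};

/* Record one open (delta = +1) or close (delta = -1) with the given mode mask. */
static void count_fmode(struct open_counts *c, unsigned int fmode, int delta)
{
	for (size_t i = 0; i < MODE_BITS; i++)
		if (fmode & (1u << i))
			c->nr_by_bit[i] += delta;
}

/* Union of all mode bits that at least one open file still uses. */
static unsigned int modes_in_use(const struct open_counts *c)
{
	unsigned int mask = 0;

	for (size_t i = 0; i < MODE_BITS; i++)
		if (c->nr_by_bit[i] > 0)
			mask |= 1u << i;
	return mask;
}
```

The set of modes still in use can be rebuilt from the per-bit counters alone, which is why the per-combination table is not needed in this sketch.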

Signed-off-by: Yan, Zheng

Yan, Zheng
9 years ago
779fe0fb8 ceph: rados pool namespace support

This patch adds code that decodes pool namespace information in
cap messages and request replies. The pool namespace is saved in i_layout
and is passed to libceph when doing reads and writes.

Signed-off-by: Yan, Zheng

Yan, Zheng
9 years ago
7627151ea libceph: define new ceph_file_layout structure

Define new ceph_file_layout structure and rename old ceph_file_layout
to ceph_file_layout_legacy. This is preparation for adding namespace
to ceph_file_layout structure.

Signed-off-by: Yan, Zheng

Yan, Zheng
9 years ago
281dbe5db libceph: add an ONSTACK initializer for oids

An on-stack oid in ceph_ioctl_get_dataloc() is not initialized,
resulting in a WARN and a NULL pointer dereference later on. We will
have more of these on-stack in the future, so fix it with a convenience
macro.
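
A sketch of what such a convenience macro could look like, assuming an oid whose name pointer must refer to its own inline buffer before first use. The struct layout, OID_INLINE_LEN, DEFINE_OID_ONSTACK and oid_is_empty() are hypothetical names for illustration, not the definitions from the patch:

```c
#include <stddef.h>

#define OID_INLINE_LEN 52	/* hypothetical inline buffer size */

struct object_id {
	char *name;			/* points at inline_name or an external string */
	char inline_name[OID_INLINE_LEN];
	int name_len;
};

/*
 * Declare an on-stack oid whose name pointer is valid from the start.
 * Members not named in the initializer are zero-initialized.
 */
#define DEFINE_OID_ONSTACK(oid) \
	struct object_id oid = { .name = oid.inline_name, .name_len = 0 }

/* True if the oid is in its freshly initialized state. */
static int oid_is_empty(const struct object_id *oid)
{
	return oid->name == oid->inline_name && oid->name_len == 0;
}
```

A plain `struct object_id oid;` on the stack would leave the name pointer indeterminate, which is the kind of bug the commit message describes.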

Fixes: d30291b985d1 ("libceph: variable-sized ceph_object_id")
Signed-off-by: Ilya Dryomov

Ilya Dryomov
9 years ago

26 May, 2016

4 commits

5aea3dcd5 libceph: a major OSD client update

This is a major sync up, up to ~Jewel. The highlights are:

- per-session request trees (vs a global per-client tree)
- per-session locking (vs a global per-client rwlock)
- homeless OSD session
- no ad-hoc global per-client lists
- support for pool quotas
- foundation for watch/notify v2 support
- foundation for map check (pool deletion detection) support

The switchover is incomplete: lingering requests can be set up and
torn down but aren't ever reestablished. This functionality is
restored with the introduction of the new lingering infrastructure
(ceph_osd_linger_request, linger_work, etc) in a later commit.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
9 years ago
f81f16339 libceph: rename ceph_calc_pg_primary()

Rename ceph_calc_pg_primary() to ceph_pg_to_acting_primary() to
emphasise that it returns the acting primary.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
9 years ago
d9591f5e2 libceph: rename ceph_oloc_oid_to_pg()

Rename ceph_oloc_oid_to_pg() to ceph_object_locator_to_pg(). Emphasise
that the returned PG is raw, and return -ENOENT instead of -EIO if the pool
doesn't exist.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
9 years ago
d30291b98 libceph: variable-sized ceph_object_id

Currently ceph_object_id can hold object names of up to 100
(CEPH_MAX_OID_NAME_LEN) characters. This is enough for all use cases,
except one - long rbd image names:

- a format 1 header is named "<imgname>.rbd"
- an object that points to a format 2 header is named "rbd_id.<imgname>"

We operate on these potentially long-named objects during rbd map, and,
for format 1 images, during header refresh. (A format 2 header name is
a small system-generated string.)

Lift this 100 character limit by making ceph_object_id be able to point
to an externally-allocated string. Apart from being able to work with
almost arbitrarily-long named objects, this allows us to reduce the
size of ceph_object_id from >100 bytes to 64 bytes.
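
The inline-or-external naming scheme described above can be sketched as follows. This is an illustration under assumptions, not the kernel implementation: var_oid, NAME_INLINE_LEN and the var_oid_*() helpers are hypothetical, and userspace malloc()/free() stand in for kernel allocation:

```c
#include <stdlib.h>
#include <string.h>

#define NAME_INLINE_LEN 52	/* hypothetical inline buffer size */

struct var_oid {
	char *name;			/* inline_name or a heap-allocated string */
	char inline_name[NAME_INLINE_LEN];
	size_t name_len;
};

static void var_oid_init(struct var_oid *oid)
{
	oid->name = oid->inline_name;
	oid->inline_name[0] = '\0';
	oid->name_len = 0;
}

/* Store a name: short names stay inline, long names are allocated. */
static int var_oid_set(struct var_oid *oid, const char *s)
{
	size_t len = strlen(s);
	char *old = oid->name;
	char *buf = oid->inline_name;

	if (len + 1 > sizeof(oid->inline_name)) {
		buf = malloc(len + 1);
		if (!buf)
			return -1;
	}
	memmove(buf, s, len + 1);	/* copy before releasing the old name */
	if (old != oid->inline_name)
		free(old);
	oid->name = buf;
	oid->name_len = len;
	return 0;
}

/* Release any external string and return to the empty state. */
static void var_oid_destroy(struct var_oid *oid)
{
	if (oid->name != oid->inline_name)
		free(oid->name);
	var_oid_init(oid);
}
```

The struct stays small because only short names are stored inline; anything longer lives in a separately allocated string that the name pointer refers to.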

Signed-off-by: Ilya Dryomov

Ilya Dryomov
9 years ago

15 Oct, 2014

2 commits

0bc62284e ceph: fix divide-by-zero in __validate_layout()

The 'stripe_unit' field is 64 bits; casting it to 32 bits can result in zero.
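
A small illustration of the truncation hazard, assuming a check that tests the full 64-bit value before using it as a divisor. The helper names are hypothetical and this is not the code from the patch:

```c
#include <stdint.h>

/* What a 32-bit cast does to a 64-bit value: only the low 32 bits survive. */
static uint32_t truncate_to_u32(uint64_t v)
{
	return (uint32_t)v;
}

/*
 * Return 1 if stripe_unit evenly divides object_size, 0 otherwise.
 * The zero test is done on the full 64-bit value, so a value such as
 * 1 << 32 is not mistaken for zero and zero is never used as a divisor.
 */
static int stripe_unit_divides(uint64_t object_size, uint64_t stripe_unit)
{
	if (stripe_unit == 0)
		return 0;
	return object_size % stripe_unit == 0;
}
```

For example, a stripe unit of 2^32 is nonzero as a 64-bit value but becomes zero once cast to 32 bits, so any modulo performed on the cast value would divide by zero.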

Signed-off-by: Yan, Zheng

Yan, Zheng
11 years ago
508b32d86 ceph: request xattrs if xattr_version is zero

The following sequence of events can happen:
- Client releases an inode and queues a cap release message.
- A 'lookup' reply brings the same inode back, but the reply
doesn't contain xattrs because the MDS didn't receive the cap release
message and thought the client already had up-to-date xattrs.

The fix is to force sending a getattr request to the MDS if xattrs_version
is 0. The getattr mask is set to CEPH_STAT_CAP_XATTR, so the MDS knows the
client does not have xattrs.

Signed-off-by: Yan, Zheng

Yan, Zheng
11 years ago

06 May, 2014

1 commit

5575eeb7b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull Ceph fixes from Sage Weil:
"First, there is a critical fix for the new primary-affinity function
that went into -rc1.

The second batch of patches from Zheng fix a range of problems with
directory fragmentation, readdir, and a few odds and ends for cephfs"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: reserve caps for file layout/lock MDS requests
ceph: avoid releasing caps that are being used
ceph: clear directory's completeness when creating file
libceph: fix non-default values check in apply_primary_affinity()
ceph: use fpos_cmp() to compare dentry positions
ceph: check directory's completeness before emitting directory entry

Linus Torvalds
11 years ago

29 Apr, 2014

1 commit

3bd58143b ceph: reserve caps for file layout/lock MDS requests

Signed-off-by: Yan, Zheng
Reviewed-by: Sage Weil

Yan, Zheng
11 years ago

13 Apr, 2014

1 commit

96c57ade7 ceph: fix pr_fmt() redefinition

The vfs merge caused a latent bug to show up:

    In file included from fs/ceph/super.h:4:0,
                     from fs/ceph/ioctl.c:3:
    include/linux/ceph/ceph_debug.h:4:0: warning: "pr_fmt" redefined [enabled by default]
     #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
     ^
    In file included from include/linux/kernel.h:13:0,
                     from include/linux/uio.h:12,
                     from include/linux/socket.h:7,
                     from include/uapi/linux/in.h:22,
                     from include/linux/in.h:23,
                     from fs/ceph/ioctl.c:1:
    include/linux/printk.h:214:0: note: this is the location of the previous definition
     #define pr_fmt(fmt) fmt
     ^

where the reason is that <linux/ceph/ceph_debug.h> is included much too late
for the "pr_fmt()" define.

The include of <linux/ceph/ceph_debug.h> needs to be the first include in the
file, but fs/ceph/ioctl.c had for some reason missed that, and it wasn't
noticeable until some unrelated header file changes brought in an
indirect earlier include of <linux/kernel.h>.
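
The ordering requirement can be reproduced in a single file. In this sketch the `#ifndef` block stands in for the fallback definition that the generic header supplies, and pr_to_buf() is a hypothetical helper used only so the result can be inspected in userspace:

```c
#include <stdio.h>

/* The module's prefix must be defined before the generic fallback is seen. */
#define pr_fmt(fmt) "ceph: " fmt

/* Stand-in for the fallback a generic header provides when nothing is set. */
#ifndef pr_fmt
#define pr_fmt(fmt) fmt
#endif

/* Hypothetical stand-in for a pr_*() helper: formats into a caller buffer. */
#define pr_to_buf(buf, size, fmt, ...) \
	snprintf(buf, size, pr_fmt(fmt), __VA_ARGS__)
```

If the two definitions were swapped, the fallback would be defined first and the module's own `#define` would then redefine it, which is what the warning above reports.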

Signed-off-by: Linus Torvalds

Linus Torvalds
11 years ago

03 Apr, 2014

1 commit

752c8bdcf ceph: do not chain inode updates to parent fsync

The fsync(dirfd) only covers namespace operations, not inode updates.
We do not need to cover setattr variants or O_TRUNC.

Reported-by: Al Viro
Signed-off-by: Sage Weil
Reviewed-by: Yan, Zheng

Sage Weil
11 years ago

28 Jan, 2014

1 commit

7c13cb643 libceph: replace ceph_calc_ceph_pg() with ceph_oloc_oid_to_pg()

Switch ceph_calc_ceph_pg() to new oloc and oid abstractions and rename
it to ceph_oloc_oid_to_pg() to make its purpose more clear.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
12 years ago

10 Aug, 2013

2 commits

2fbcbff1d ceph: Add check returned value on func ceph_calc_ceph_pg.

ceph_calc_ceph_pg() may fail, so add a check for its return value.

Signed-off-by: Jianpeng Ma
Reviewed-by: Sage Weil
Signed-off-by: Sage Weil

majianpeng
12 years ago
494ddd11b ceph: Don't forget the 'up_read(&osdc->map_sem)' if met error.

CC: stable@vger.kernel.org
Signed-off-by: Jianpeng Ma
Reviewed-by: Sage Weil

majianpeng
12 years ago

02 May, 2013

1 commit

41766f87f libceph: rename ceph_calc_object_layout()

The purpose of ceph_calc_object_layout() is to fill in the pool
number and seed for a ceph_pg structure provided, based on a given
osd map and target object id.

Currently that function takes a file layout parameter, but the only
thing used out of that is its pool number.

Change the function so it takes a pool number rather than the full
file layout structure. Only update the ceph_pg if the pool is found
in the osd map. Get rid of a few useless lines of code from the
function while there.

Since the function now very clearly just fills in the ceph_pg
structure it's provided, rename it ceph_calc_ceph_pg().

Signed-off-by: Alex Elder
Reviewed-by: Josh Durgin

Alex Elder
12 years ago

01 Mar, 2013

1 commit

1cf0209c4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull Ceph updates from Sage Weil:
"A few groups of patches here. Alex has been hard at work improving
the RBD code, laying groundwork for understanding the new formats and
doing layering. Most of the infrastructure is now in place for the
final bits that will come with the next window.

There are a few changes to the data layout. Jim Schutt's patch fixes
some non-ideal CRUSH behavior, and a set of patches from me updates
the client to speak a newer version of the protocol and implement an
improved hashing strategy across storage nodes (when the server side
supports it too).

A pair of patches from Sam Lang fix the atomicity of open+create
operations. Several patches from Yan, Zheng fix various mds/client
issues that turned up during multi-mds torture tests.

A final set of patches expose file layouts via virtual xattrs, and
allow the policies to be set on directories via xattrs as well
(avoiding the awkward ioctl interface and providing a consistent
interface for both kernel mount and ceph-fuse users)."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (143 commits)
libceph: add support for HASHPSPOOL pool flag
libceph: update osd request/reply encoding
libceph: calculate placement based on the internal data types
ceph: update support for PGID64, PGPOOL3, OSDENC protocol features
ceph: update "ceph_features.h"
libceph: decode into cpu-native ceph_pg type
libceph: rename ceph_pg -> ceph_pg_v1
rbd: pass length, not op for osd completions
rbd: move rbd_osd_trivial_callback()
libceph: use a do..while loop in con_work()
libceph: use a flag to indicate a fault has occurred
libceph: separate non-locked fault handling
libceph: encapsulate connection backoff
libceph: eliminate sparse warnings
ceph: eliminate sparse warnings in fs code
rbd: eliminate sparse warnings
libceph: define connection flag helpers
rbd: normalize dout() calls
rbd: barriers are hard
rbd: ignore zero-length requests
...

Linus Torvalds
12 years ago

27 Feb, 2013

3 commits

2169aea64 libceph: calculate placement based on the internal data types

Instead of using the old ceph_object_layout struct, update our internal
ceph_calc_object_layout method to use the ceph_pg type. This allows us to
pass the full 32-bit precision of the pgid.seed to the callers. It also
allows some callers to avoid reaching into the request structures for the
struct ceph_object_layout fields.

Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
12 years ago
5b191d991 libceph: decode into cpu-native ceph_pg type

Always decode data into our cpu-native ceph_pg type that has the correct
field widths. Limit any remaining uses of ceph_pg_v1 to dealing with the
legacy protocol.

Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
12 years ago
12979354a libceph: rename ceph_pg -> ceph_pg_v1

Rename the old version of this type to distinguish it from the new version.

Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
12 years ago

23 Feb, 2013

1 commit

496ad9aa8 new helper: file_inode(file)

Signed-off-by: Al Viro

Al Viro
12 years ago

18 Jan, 2013

1 commit

e8afad656 libceph: pass length to ceph_calc_file_object_mapping()

ceph_calc_file_object_mapping() takes (among other things) a "file"
offset and length, and based on the layout, determines the object
number ("bno") backing the affected portion of the file's data and
the offset into that object where the desired range begins. It also
computes the size that should be used for the request--either the
amount requested or something less if that would exceed the end of
the object.

This patch changes the input length parameter in this function so it
is used only for input. That is, the argument will be passed by
value rather than by address, so the value provided won't get
updated by the function.

The value would only get updated if the length would surpass the
current object, and in that case the value it got updated to would
be exactly that returned in *oxlen.

Only one of the two callers is affected by this change. Update
ceph_calc_raw_layout() so it records any updated value.
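
A sketch of this kind of mapping, with the length passed by value and the clipped length returned through a separate output. The arithmetic assumes a RAID-0-like striping scheme (stripe units distributed round-robin over stripe_count objects per object set) and clips at the end of the current stripe unit, which never extends past the end of the object. The struct and function names are hypothetical and this is not the kernel function itself:

```c
#include <stdint.h>

struct layout {
	uint32_t stripe_unit;	/* bytes per stripe unit */
	uint32_t stripe_count;	/* objects striped across per object set */
	uint32_t object_size;	/* bytes per object, multiple of stripe_unit */
};

struct extent {
	uint64_t objno;		/* object number backing the file offset */
	uint64_t objoff;	/* offset into that object */
	uint64_t objlen;	/* bytes usable in this request */
};

/* Map (off, len) in the file to an object extent; len is input only. */
static int map_file_extent(const struct layout *l, uint64_t off, uint64_t len,
			   struct extent *out)
{
	if (!l->stripe_unit || !l->stripe_count ||
	    l->object_size < l->stripe_unit ||
	    l->object_size % l->stripe_unit)
		return -1;	/* invalid layout */

	uint64_t su = l->stripe_unit;
	uint64_t su_per_object = l->object_size / su;
	uint64_t blockno = off / su;		/* which stripe unit in the file */
	uint64_t su_offset = off % su;		/* offset inside that unit */
	uint64_t stripeno = blockno / l->stripe_count;
	uint64_t stripepos = blockno % l->stripe_count;
	uint64_t objsetno = stripeno / su_per_object;

	out->objno = objsetno * l->stripe_count + stripepos;
	out->objoff = (stripeno % su_per_object) * su + su_offset;
	out->objlen = len < su - su_offset ? len : su - su_offset;
	return 0;
}
```

Because the caller's length is never modified, a caller that needs the shortened length reads it from the output extent, mirroring the by-value change described in the commit message.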

Signed-off-by: Alex Elder
Reviewed-by: Josh Durgin

Alex Elder
13 years ago

03 Oct, 2012

1 commit

457712a0b ceph: return EIO on invalid layout on GET_DATALOC ioctl

If the user calls GET_DATALOC on a file with an invalid (e.g.,
zeroed) layout, return EIO to userland.

Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
13 years ago

22 Aug, 2012

1 commit

45f2e081f ceph: avoid divide by zero in __validate_layout()

If "l->stripe_unit" is zero the mod on the next line will cause a
divide by zero bug. This comes from the copy_from_user() in
ceph_ioctl_set_layout_policy(). Passing 0 is valid, though (it means
"do not change") so avoid the % check in that case.

Reported-by: Dan Carpenter
Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
13 years ago

17 May, 2012

2 commits

c047be093 ceph: ignore preferred_osd field

Old users may not expect EINVAL, and there is no clear user-visible
behavior change now that we ignore it.

Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
13 years ago
702aeb1f8 ceph: fully initialize new layout

When we are setting a new layout, fully initialize the structure:
- zero it out
- always set preferred_osd to -1

Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
13 years ago

08 May, 2012

2 commits

e49bf4c51 ceph: refactor SETLAYOUT and SETDIRLAYOUT ioctl checks into common helper

Both of these methods perform similar checks; move that code to a helper
so that we can ensure the checks are consistent.

Reviewed-by: Alex Elder
Signed-off-by: Sage Weil

Sage Weil
13 years ago
3469ac1aa ceph: drop support for preferred_osd pgs

This was an ill-conceived feature that has been removed from Ceph. Do
this gracefully:

- reject attempts to specify a preferred_osd via the ioctl
- stop exposing this information via virtual xattrs
- always fill in -1 for requests, in case we talk to an older server
- don't calculate preferred_osd placements/pgids

Reviewed-by: Alex Elder
Signed-off-by: Sage Weil

Sage Weil
13 years ago

08 Dec, 2011

1 commit

be655596b ceph: use i_ceph_lock instead of i_lock

We have been using i_lock to protect all kinds of data structures in the
ceph_inode_info struct, including lists of inodes that we need to iterate
over while avoiding races with inode destruction. That requires grabbing
a reference to the inode with the list lock protected, but igrab() now
takes i_lock to check the inode flags.

Changing the list lock ordering would be a painful process.

However, using a ceph-specific i_ceph_lock in the ceph inode instead of
i_lock is a simple mechanical change and avoids the ordering constraints
imposed by igrab().

Reported-by: Amon Ott
Signed-off-by: Sage Weil

Sage Weil
14 years ago

26 Oct, 2011

1 commit

a35eca958 ceph: let the set_layout ioctl set single traits

Previously we were validating the passed-in stripe unit, object size,
and stripe count against each other (and not testing most other stuff).
Instead, make sure that the composed previous layout and new values are valid,
and only send the new values to the MDS. This lets users change the
pool without setting the whole layout, for instance.
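
The compose-then-validate approach can be sketched like this. It assumes a zero field in the request means "keep the current value", including for the pool, which is an assumption of this sketch. The struct and helpers are hypothetical, and the validity rules shown are a simplified subset:

```c
#include <stdint.h>

struct file_layout {
	uint64_t stripe_unit;
	uint64_t stripe_count;
	uint64_t object_size;
	uint64_t pool;
};

/* Simplified validity check applied to the fully composed layout. */
static int layout_valid(const struct file_layout *l)
{
	if (!l->stripe_unit || !l->stripe_count || !l->object_size)
		return 0;
	return l->object_size % l->stripe_unit == 0;
}

/*
 * Overlay the nonzero fields of req on top of cur and validate the result.
 * Returns 0 and fills *out on success, -1 if the combination is invalid.
 */
static int compose_layout(const struct file_layout *cur,
			  const struct file_layout *req,
			  struct file_layout *out)
{
	out->stripe_unit = req->stripe_unit ? req->stripe_unit : cur->stripe_unit;
	out->stripe_count = req->stripe_count ? req->stripe_count : cur->stripe_count;
	out->object_size = req->object_size ? req->object_size : cur->object_size;
	out->pool = req->pool ? req->pool : cur->pool;

	return layout_valid(out) ? 0 : -1;
}
```

With this shape a caller can change only the pool by leaving the striping fields at zero, while a new stripe unit that does not divide the existing object size is still rejected.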

Signed-off-by: Greg Farnum

Greg Farnum
14 years ago

27 Jul, 2011

2 commits

5f21c96dd ceph: protect access to d_parent

d_parent is protected by d_lock: use it when looking up a dentry's parent
directory inode. Also take a reference and drop it in the caller to avoid
a use-after-free.

Reported-by: Al Viro
Reviewed-by: Yehuda Sadeh
Signed-off-by: Sage Weil

Sage Weil
14 years ago
4918b6d14 ceph: add F_SYNC file flag to force sync (non-O_DIRECT) io

This allows us to force IO through the sync path which you normally only
get when multiple clients are reading/writing to the same file or by
mounting with -o sync. Among other things, this lets test programs verify
correctness with a single mount.

Reviewed-by: Yehuda Sadeh
Signed-off-by: Sage Weil

Sage Weil
14 years ago

08 Jun, 2011

1 commit

70b666c3b ceph: use ihold when we already have an inode ref

We should use ihold whenever we already have a stable inode ref, even
when we aren't holding i_lock. This avoids adding new and unnecessary
locking dependencies.

Signed-off-by: Sage Weil

Sage Weil
14 years ago

21 Oct, 2010

2 commits

571dba52a ceph: add CEPH_MDS_OP_SETDIRLAYOUT and associated ioctl.

Signed-off-by: Sage Weil

Greg Farnum
15 years ago
3d14c5d2b ceph: factor out libceph from Ceph file system

This factors out protocol and low-level storage parts of ceph into a
separate libceph module living in net/ceph and include/linux/ceph. This
is mostly a matter of moving files around. However, a few key pieces
of the interface change as well:

- ceph_client becomes ceph_fs_client and ceph_client, where the latter
captures the mon and osd clients, and the fs_client gets the mds client
and file system specific pieces.
- Mount option parsing and debugfs setup is correspondingly broken into
two pieces.
- The mon client gets a generic handler callback for otherwise unknown
messages (mds map, in this case).
- The basic supported/required feature bits can be expanded (and are by
ceph_fs_client).

No functional change, aside from some subtle error handling cases that got
cleaned up in the refactoring process.

Signed-off-by: Sage Weil

Yehuda Sadeh
15 years ago

02 Aug, 2010

1 commit

8c6e9229f ceph: add LAZYIO ioctl to mark a file description for lazy consistency

Allow an application to mark a file descriptor for lazy file consistency
semantics, allowing buffered reads and writes when multiple clients are
accessing the same file.

Signed-off-by: Sage Weil

Sage Weil
15 years ago

18 May, 2010

1 commit

640ef79d2 ceph: use ceph_sb_to_client instead of ceph_client

ceph_sb_to_client() and ceph_client() are identical, so one of them
should go. The function name ceph_client() is easily confused with
"struct ceph_client", while ceph_sb_to_client() says what it does, so
switch all callers to ceph_sb_to_client().

-static inline struct ceph_client *ceph_client(struct super_block *sb)
-{
- return sb->s_fs_info;
-}

Signed-off-by: Cheng Renquan
Signed-off-by: Sage Weil

Cheng Renquan
15 years ago

04 Dec, 2009

1 commit

33d4909cc ceph: allow preferred osd to be get/set via layout ioctl

There is certainly no reason not to report this.

The only real downside to allowing the user to set it is that you don't
get default values by zeroing the layout struct (the default is -1).

Signed-off-by: Sage Weil

Sage Weil
16 years ago