Eric Lee / smarc-fsl-linux-kernel

26 Apr, 2016

1 commit

6c1ea260f libceph: make authorizer destruction independent of ceph_auth_client

Starting the kernel client with cephx disabled and then enabling cephx
and restarting userspace daemons can result in a crash:

[262671.478162] BUG: unable to handle kernel paging request at ffffebe000000000
[262671.531460] IP: [] kfree+0x5a/0x130
[262671.584334] PGD 0
[262671.635847] Oops: 0000 [#1] SMP
[262672.055841] CPU: 22 PID: 2961272 Comm: kworker/22:2 Not tainted 4.2.0-34-generic #39~14.04.1-Ubuntu
[262672.162338] Hardware name: Dell Inc. PowerEdge R720/068CDY, BIOS 2.4.3 07/09/2014
[262672.268937] Workqueue: ceph-msgr con_work [libceph]
[262672.322290] task: ffff88081c2d0dc0 ti: ffff880149ae8000 task.ti: ffff880149ae8000
[262672.428330] RIP: 0010:[] [] kfree+0x5a/0x130
[262672.535880] RSP: 0018:ffff880149aeba58 EFLAGS: 00010286
[262672.589486] RAX: 000001e000000000 RBX: 0000000000000012 RCX: ffff8807e7461018
[262672.695980] RDX: 000077ff80000000 RSI: ffff88081af2be04 RDI: 0000000000000012
[262672.803668] RBP: ffff880149aeba78 R08: 0000000000000000 R09: 0000000000000000
[262672.912299] R10: ffffebe000000000 R11: ffff880819a60e78 R12: ffff8800aec8df40
[262673.021769] R13: ffffffffc035f70f R14: ffff8807e5b138e0 R15: ffff880da9785840
[262673.131722] FS: 0000000000000000(0000) GS:ffff88081fac0000(0000) knlGS:0000000000000000
[262673.245377] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[262673.303281] CR2: ffffebe000000000 CR3: 0000000001c0d000 CR4: 00000000001406e0
[262673.417556] Stack:
[262673.472943] ffff880149aeba88 ffff88081af2be04 ffff8800aec8df40 ffff88081af2be04
[262673.583767] ffff880149aeba98 ffffffffc035f70f ffff880149aebac8 ffff8800aec8df00
[262673.694546] ffff880149aebac8 ffffffffc035c89e ffff8807e5b138e0 ffff8805b047f800
[262673.805230] Call Trace:
[262673.859116] [] ceph_x_destroy_authorizer+0x1f/0x50 [libceph]
[262673.968705] [] ceph_auth_destroy_authorizer+0x3e/0x60 [libceph]
[262674.078852] [] put_osd+0x45/0x80 [libceph]
[262674.134249] [] remove_osd+0xae/0x140 [libceph]
[262674.189124] [] __reset_osd+0x103/0x150 [libceph]
[262674.243749] [] kick_requests+0x223/0x460 [libceph]
[262674.297485] [] ceph_osdc_handle_map+0x282/0x5e0 [libceph]
[262674.350813] [] dispatch+0x4e/0x720 [libceph]
[262674.403312] [] try_read+0x3d1/0x1090 [libceph]
[262674.454712] [] ? dequeue_entity+0x152/0x690
[262674.505096] [] con_work+0xcb/0x1300 [libceph]
[262674.555104] [] process_one_work+0x14e/0x3d0
[262674.604072] [] worker_thread+0x11a/0x470
[262674.652187] [] ? rescuer_thread+0x310/0x310
[262674.699022] [] kthread+0xd2/0xf0
[262674.744494] [] ? kthread_create_on_node+0x1c0/0x1c0
[262674.789543] [] ret_from_fork+0x3f/0x70
[262674.834094] [] ? kthread_create_on_node+0x1c0/0x1c0

What happens is the following:

(1) new MON session is established
(2) old "none" ac is destroyed
(3) new "cephx" ac is constructed
...
(4) old OSD session (w/ "none" authorizer) is put
ceph_auth_destroy_authorizer(ac, osd->o_auth.authorizer)

osd->o_auth.authorizer in the "none" case is just a bare pointer into
ac, which contains a single static copy for all services. By the time
we get to (4), "none" ac, freed in (2), is long gone. On top of that,
a new vtable installed in (3) points us at ceph_x_destroy_authorizer(),
so we end up trying to destroy a "none" authorizer with a "cephx"
destructor operating on invalid memory!

To fix this, decouple authorizer destruction from ac and do away with
a single static "none" authorizer by making a copy for each OSD or MDS
session. Authorizers themselves are independent of ac and so there is
no reason for destroy_authorizer() to be an ac op. Make it an op on
the authorizer itself by turning ceph_authorizer into a real struct.
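
The shape of the fix can be sketched in userspace C. This is only an
illustration under stated assumptions: the struct and function names
below are hypothetical and are not the kernel's definitions. Each
session gets its own heap copy of a "none" authorizer, and the
destructor is an op stored in the authorizer rather than in ac:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical model: the destructor lives in the authorizer itself. */
struct authorizer {
	void (*destroy)(struct authorizer *a);
};

/* Hypothetical "none" authorizer; base must stay the first member. */
struct none_authorizer {
	struct authorizer base;
	char buf[128];
};

/* Counter used only to make the example observable. */
static int live_authorizers;

static void none_destroy(struct authorizer *a)
{
	/* base is the first member, so a is also the start of the allocation */
	free(a);
	live_authorizers--;
}

/* One private copy per session instead of a pointer into shared state. */
static struct authorizer *none_create_authorizer(void)
{
	struct none_authorizer *au = calloc(1, sizeof(*au));

	if (!au)
		return NULL;
	au->base.destroy = none_destroy;
	live_authorizers++;
	return &au->base;
}

/* No auth client argument: the authorizer knows how to destroy itself. */
static void destroy_authorizer(struct authorizer *a)
{
	assert(a && a->destroy);
	a->destroy(a);
}
```

With this layout, tearing down or replacing the auth client cannot
change which destructor runs for an authorizer created earlier.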

Fixes: http://tracker.ceph.com/issues/15447

Reported-by: Alan Zhang
Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2016-04-26 02:54:13 +0800

22 Jan, 2016

3 commits

f6cdb2928 libceph: kill off ceph_x_ticket_handler::validity

With it gone, no need to preserve ceph_timespec in process_one_ticket()
either.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2016-01-22 02:36:09 +0800

187d131dd libceph: invalidate AUTH in addition to a service ticket

If we fault due to authentication, we invalidate the service ticket we
have and request a new one - the idea being that if a service rejected
our authorizer, it must have expired, despite mon_client's attempts at
periodic renewal. (The other possibility is that our ticket is too new
and the service hasn't gotten it yet, in which case invalidating isn't
necessary but doesn't hurt.)

Invalidating just the service ticket is not enough, though. If we
assume a failure on mon_client's part to renew a service ticket, we
have to assume the same for the AUTH ticket. If our AUTH ticket is
bad, we won't get any service tickets no matter how hard we try, so
invalidate AUTH ticket along with the service ticket.
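
A minimal sketch of that rule, assuming a hypothetical ticket table
indexed by service (the enum, the struct and the valid flag are made up
for illustration and do not mirror the kernel's data structures):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical service indices. */
enum { SVC_AUTH, SVC_OSD, SVC_MDS, SVC_COUNT };

struct ticket {
	bool valid;
};

struct ticket_table {
	struct ticket t[SVC_COUNT];
};

/*
 * On an authentication fault, drop the ticket of the service that
 * rejected us and the AUTH ticket as well, since a stale AUTH ticket
 * would prevent us from getting any new service ticket.
 */
static void invalidate_on_auth_fault(struct ticket_table *tb, int service)
{
	assert(tb && service >= 0 && service < SVC_COUNT);
	tb->t[service].valid = false;
	tb->t[SVC_AUTH].valid = false;
}
```

Tickets of unrelated services are left alone in this sketch.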

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2016-01-22 02:36:09 +0800

6abe097db libceph: fix authorizer invalidation, take 2

Back in 2013, commit 4b8e8b5d78b8 ("libceph: fix authorizer
invalidation") tried to fix authorizer invalidation issues by clearing
the validity field. However, nothing ever consults this field, so
clearing it doesn't force us to request any new secrets and therefore
we never get out of the exponential backoff mode:

[ 129.973812] libceph: osd2 192.168.122.1:6810 connect authorization failure
[ 130.706785] libceph: osd2 192.168.122.1:6810 connect authorization failure
[ 131.710088] libceph: osd2 192.168.122.1:6810 connect authorization failure
[ 133.708321] libceph: osd2 192.168.122.1:6810 connect authorization failure
[ 137.706598] libceph: osd2 192.168.122.1:6810 connect authorization failure
...

AFAICT this was the case at the time 4b8e8b5d78b8 was merged, too.

Using timespec solely as a bool isn't nice, so introduce a new have_key
flag, specifically for this purpose.
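
The idea of an explicit flag can be sketched as follows. Only the
have_key name comes from this commit message; the struct layout, the
renew_after field and both function names are assumptions made for the
example:

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical ticket handler with an explicit "do we hold a key" flag. */
struct ticket_handler {
	bool have_key;		/* flag named in the commit message */
	time_t renew_after;	/* hypothetical renewal deadline */
};

/* Invalidation clears the flag instead of zeroing a timestamp. */
static void invalidate_ticket(struct ticket_handler *th)
{
	assert(th);
	th->have_key = false;
}

/* A new key is needed if we hold none or the current one is due for renewal. */
static bool need_key(const struct ticket_handler *th, time_t now)
{
	assert(th);
	if (!th->have_key)
		return true;
	return now >= th->renew_after;
}
```

Because need_key() consults the flag, an invalidated ticket triggers a
request for a new one regardless of what the timestamps say.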

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2016-01-22 02:36:08 +0800

03 Nov, 2015

3 commits

a51983e4d libceph: add nocephx_sign_messages option

Support for message signing was merged into 3.19, along with the
nocephx_require_signatures option. However, all that option does is
allow the kernel client to talk to clusters that don't support the
MSG_AUTH feature bit. That's pretty useless, given that it's been
supported since bobtail.

Meanwhile, if one disables message signing on the server side with
"cephx sign messages = false", it becomes impossible to use the kernel
client since it expects messages to be signed if MSG_AUTH was
negotiated. Add nocephx_sign_messages option to support this use case.
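
As a rough illustration of the decision this option influences, the
following sketch assumes a hypothetical options struct and helper; only
the option name is taken from the commit message, and the real client
logic may differ:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical option holder; only the option name comes from the commit. */
struct client_options {
	bool nocephx_sign_messages;
};

/*
 * Sign (and expect signatures) only when MSG_AUTH was negotiated and
 * signing has not been turned off by the option.
 */
static bool should_sign_messages(const struct client_options *opt,
				 bool msg_auth_negotiated)
{
	assert(opt);
	return msg_auth_negotiated && !opt->nocephx_sign_messages;
}
```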

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2015-11-03 06:37:46 +0800

4199b8eec libceph: drop authorizer check from cephx msg signing routines

I don't see a way for auth->authorizer to be NULL in
ceph_x_sign_message() or ceph_x_check_message_signature().

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2015-11-03 06:37:46 +0800

cbf99a11f libceph: introduce ceph_x_authorizer_cleanup()

Commit ae385eaf24dc ("libceph: store session key in cephx authorizer")
introduced ceph_x_authorizer::session_key, but didn't update all the
exit/error paths. Introduce ceph_x_authorizer_cleanup() to encapsulate
ceph_x_authorizer cleanup and switch to it. This fixes ceph_x_destroy(),
which currently always leaks the key, as well as the
ceph_x_build_authorizer() error paths.
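
The pattern of funnelling every exit path through one cleanup helper
can be sketched like this. The types and function names are
hypothetical stand-ins for illustration, not the kernel code, and the
buf_len == 0 check merely simulates a late failure:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for a key and an authorizer that owns one. */
struct key {
	void *data;
	size_t len;
};

struct x_authorizer {
	struct key session_key;
	void *buf;
};

/* Single place that releases everything the authorizer owns. */
static void x_authorizer_cleanup(struct x_authorizer *au)
{
	assert(au);
	free(au->session_key.data);
	au->session_key.data = NULL;
	au->session_key.len = 0;
	free(au->buf);
	au->buf = NULL;
}

/* Every error path goes through the same cleanup helper. */
static int x_build_authorizer(struct x_authorizer *au, const void *key,
			      size_t key_len, size_t buf_len)
{
	assert(au && key && key_len > 0);
	au->session_key.data = NULL;
	au->session_key.len = 0;
	au->buf = NULL;

	au->session_key.data = malloc(key_len);
	if (!au->session_key.data)
		goto fail;
	memcpy(au->session_key.data, key, key_len);
	au->session_key.len = key_len;

	if (buf_len == 0)	/* stand-in for a later step that fails */
		goto fail;
	au->buf = malloc(buf_len);
	if (!au->buf)
		goto fail;
	return 0;

fail:
	x_authorizer_cleanup(au);
	return -1;
}
```

A field added to the authorizer later only has to be released in the
cleanup helper to be covered on all paths.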

Signed-off-by: Ilya Dryomov
Reviewed-by: Yan, Zheng

Ilya Dryomov
2015-11-03 06:36:48 +0800

09 Jan, 2015

1 commit

d7d5a007b libceph: fix sparse endianness warnings

The only real issue is the one in auth_x.c and it came with
3.19-rc1 merge.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2015-01-09 01:36:57 +0800

18 Dec, 2014

2 commits

33d073379 libceph: message signature support

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:50 +0800

ae385eaf2 libceph: store session key in cephx authorizer

The session key is required when calculating a message signature. Save
the session key in the authorizer; this avoids looking up the ticket
handler for each message.

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:50 +0800

01 Nov, 2014

1 commit

e9226d7c9 libceph: eliminate unnecessary allocation in process_one_ticket()

Commit c27a3e4d667f ("libceph: do not hard code max auth ticket len"),
while fixing a buffer overflow, tried to keep as much of the
surrounding code unchanged as possible and introduced an unnecessary
kmalloc() in the unencrypted ticket path. It is likely to fail on huge
tickets, so get rid of it.

Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2014-11-01 04:43:08 +0800

11 Sep, 2014

2 commits

c27a3e4d6 libceph: do not hard code max auth ticket len

We hard code cephx auth ticket buffer size to 256 bytes. This isn't
enough for any moderate setups and, in case tickets themselves are not
encrypted, leads to buffer overflows (ceph_x_decrypt() errors out, but
ceph_decode_copy() doesn't - it's just a memcpy() wrapper). Since the
buffer is allocated dynamically anyway, allocate it a bit later, at
the point where we know how much is going to be needed.
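
The general technique, sizing the buffer only after the length is known
and checking that length against the remaining input, might look like
this in userspace C. The wire format (a 32-bit little-endian length
followed by the blob) and the function name are assumptions made for
the example:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical decoder: read the length first, validate it against the
 * bytes actually available, and only then allocate and copy.
 */
static int decode_blob(const uint8_t *p, const uint8_t *end,
		       uint8_t **out, size_t *out_len)
{
	uint32_t len;

	assert(p && end && p <= end && out && out_len);
	if ((size_t)(end - p) < 4)
		return -1;
	len = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	      (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
	p += 4;

	/* Reject a claimed length that exceeds the remaining input. */
	if ((size_t)(end - p) < len)
		return -1;

	/* Allocate exactly what is needed (at least 1 byte for malloc). */
	*out = malloc(len ? len : 1);
	if (!*out)
		return -1;
	memcpy(*out, p, len);
	*out_len = len;
	return 0;
}
```

The copy can never run past either buffer, because the destination is
sized from the same validated length that bounds the source.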

Fixes: http://tracker.ceph.com/issues/8979

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2014-09-11 00:08:36 +0800

597cda357 libceph: add process_one_ticket() helper

Add a helper for processing individual cephx auth tickets. Needed for
the next commit, which deals with allocating ticket buffers. (Most of
the diff here is whitespace - view with git diff -b).

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2014-09-11 00:08:35 +0800

02 May, 2013

3 commits

27859f977 libceph: wrap auth ops in wrapper functions

Use wrapper functions that check whether the auth op exists so that callers
do not need a bunch of conditional checks. Simplifies the external
interface.
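
A sketch of what such wrappers can look like, assuming a hypothetical
ops table with optional entries (all names below are illustrative and
not the kernel's):

```c
#include <assert.h>
#include <stddef.h>

struct auth_client;

/* Hypothetical ops table; any entry may be NULL for a given protocol. */
struct auth_ops {
	int (*update_authorizer)(struct auth_client *ac);
	void (*invalidate_authorizer)(struct auth_client *ac);
};

struct auth_client {
	const struct auth_ops *ops;
	int invalidated;	/* demo state only */
};

/* Wrappers hide the "is this op implemented?" check from callers. */
static int auth_update_authorizer(struct auth_client *ac)
{
	assert(ac && ac->ops);
	if (ac->ops->update_authorizer)
		return ac->ops->update_authorizer(ac);
	return 0;
}

static void auth_invalidate_authorizer(struct auth_client *ac)
{
	assert(ac && ac->ops);
	if (ac->ops->invalidate_authorizer)
		ac->ops->invalidate_authorizer(ac);
}

/* Example op implementation for a protocol that supports invalidation. */
static void x_invalidate(struct auth_client *ac)
{
	ac->invalidated = 1;
}
```

Callers invoke the wrappers unconditionally; a protocol that leaves an
op unset simply gets a no-op.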

Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
2013-05-02 12:17:14 +0800

0bed9b5c5 libceph: add update_authorizer auth method

Currently the messenger calls out to a get_authorizer con op, which will
create a new authorizer if it doesn't yet have one. In the meantime, when
we rotate our service keys, the authorizer doesn't get updated. Eventually
it will be rejected by the server on a new connection attempt and get
invalidated, and we will then rebuild a new authorizer, but this is not
ideal.

Instead, if we do have an authorizer, call a new update_authorizer op that
will verify that the current authorizer is using the latest secret. If it
is not, we will build a new one that does. This avoids the transient
failure.
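
In outline, the check could look like the sketch below. It assumes the
authorizer remembers the id of the secret it was built from; the
structs, the rebuilds counter and the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ticket holding the id of the latest service secret. */
struct ticket {
	uint64_t secret_id;
};

/* Hypothetical authorizer remembering which secret it was built from. */
struct authorizer {
	uint64_t secret_id;
	int rebuilds;		/* demo counter only */
};

static void build_authorizer(struct authorizer *au, const struct ticket *th)
{
	au->secret_id = th->secret_id;
	au->rebuilds++;
}

/* Rebuild only if the authorizer was built from an older secret. */
static void update_authorizer(struct authorizer *au, const struct ticket *th)
{
	assert(au && th);
	if (au->secret_id != th->secret_id)
		build_authorizer(au, th);
}
```

Calling update_authorizer() before each connection attempt keeps the
authorizer in step with key rotation without rebuilding it needlessly.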

This fixes one of the sorry sequences of events for bug

http://tracker.ceph.com/issues/4282

Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
2013-05-02 12:17:13 +0800

4b8e8b5d7 libceph: fix authorizer invalidation

We were invalidating the authorizer by removing the ticket handler
entirely. This was effective in inducing us to request a new authorizer,
but in the meantime it meant that any authorizer we generated would get a
new and initialized handler with secret_id=0, which would always be
rejected by the server side with a confusing error message:

auth: could not find secret_id=0
cephx: verify_authorizer could not get service secret for service osd secret_id=0

Instead, simply clear the validity field. This will still induce the auth
code to request a new secret, but will let us continue to use the old
ticket in the meantime. The messenger code will probably continue to fail,
but the exponential backoff will kick in, and eventually we will get a
new (hopefully more valid) ticket from the mon and be able to continue.

Signed-off-by: Sage Weil
Reviewed-by: Alex Elder

Sage Weil
2013-05-02 12:17:12 +0800

17 May, 2012

1 commit

74f1869f7 ceph: messenger: reduce args to create_authorizer

Make use of the new ceph_auth_handshake structure in order to reduce
the number of arguments passed to the create_authorizer method in
ceph_auth_client_ops. Use a local variable of that type as a
shorthand in the get_authorizer method definitions.

Signed-off-by: Alex Elder
Reviewed-by: Sage Weil

Alex Elder
2012-05-17 21:18:12 +0800

30 Mar, 2011

1 commit

8323c3aa7 ceph: Move secret key parsing earlier.

This makes the base64 logic be contained in mount option parsing,
and prepares us for replacing the homebrew key management with the
kernel key retention service.

Signed-off-by: Tommi Virtanen
Signed-off-by: Sage Weil

Tommi Virtanen
2011-03-30 03:11:16 +0800

21 Oct, 2010

1 commit

3d14c5d2b ceph: factor out libceph from Ceph file system

This factors out protocol and low-level storage parts of ceph into a
separate libceph module living in net/ceph and include/linux/ceph. This
is mostly a matter of moving files around. However, a few key pieces
of the interface change as well:

- ceph_client becomes ceph_fs_client and ceph_client, where the latter
  captures the mon and osd clients, and the fs_client gets the mds client
  and file system specific pieces.
- Mount option parsing and debugfs setup are correspondingly broken into
  two pieces.
- The mon client gets a generic handler callback for otherwise unknown
  messages (mds map, in this case).
- The basic supported/required feature bits can be expanded (and are by
  ceph_fs_client).

No functional change, aside from some subtle error handling cases that got
cleaned up in the refactoring process.

Signed-off-by: Sage Weil

Yehuda Sadeh
2010-10-21 06:37:28 +0800