12 Oct, 2020
1 commit
-
This will help simplify the code.
[ jlayton: fix minor merge conflict in quota.c ]
Signed-off-by: Xiubo Li
Signed-off-by: Jeff Layton
Signed-off-by: Ilya Dryomov
30 Mar, 2020
1 commit
-
When a process exits, the kernel closes its files. locks_remove_file()
is called to remove file locks on these files. locks_remove_file()
tries unlocking files even if there is no file lock.
Signed-off-by: "Yan, Zheng"
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov
16 Sep, 2019
1 commit
-
After the MDS evicts a session, file locks are lost silently. It's not safe to
let programs continue to do read/write.
Signed-off-by: "Yan, Zheng"
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov
22 Aug, 2019
1 commit
-
When ceph_mdsc_do_request returns an error, we can't assume that the
filelock_reply pointer will be set. Only try to fetch fields out of
the r_reply_info when it returns success.
Cc: stable@vger.kernel.org
Reported-by: Hector Martin
Signed-off-by: Jeff Layton
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov
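The guard described in this commit can be illustrated with a small userspace sketch. This is not the kernel code: struct request, struct filelock_reply, struct lock_info, do_request() and get_lock() are hypothetical stand-ins, and the only point is that reply fields are read solely when the request returned success.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the request and reply structures;
 * the real kernel types carry many more fields. */
struct filelock_reply {
    int type;
    unsigned long long start;
    unsigned long long length;
};

struct request {
    struct filelock_reply *filelock_reply; /* NULL unless a reply was parsed */
};

struct lock_info {
    int type;
    unsigned long long start;
    unsigned long long length;
};

/* Simulated request: on failure the reply pointer is left unset. */
int do_request(struct request *req, struct filelock_reply *reply, int fail)
{
    if (fail) {
        req->filelock_reply = NULL;
        return -EIO;
    }
    req->filelock_reply = reply;
    return 0;
}

/* Copy reply fields into *out only when the request succeeded. */
int get_lock(struct request *req, struct filelock_reply *reply,
             int fail, struct lock_info *out)
{
    int err = do_request(req, reply, fail);

    if (!err) {
        out->type = req->filelock_reply->type;
        out->start = req->filelock_reply->start;
        out->length = req->filelock_reply->length;
    }
    return err;
}
```

On the failure path the caller's lock_info is left untouched and the error is simply propagated, so the NULL reply pointer is never dereferenced.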
08 May, 2019
1 commit
-
Inode i_filelock_ref is increased in ceph_lock or ceph_flock, but it is
increased again in ceph_lock_message. As a result, this ref never
drops to zero. If the CEPH_I_ERROR_FILELOCK flag is set once in
remove_session_caps, it can't be cleared even after the client is
back to normal, so further file locks return EIO.
Signed-off-by: Zhi Zhang
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov
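A rough userspace model of the imbalance described in this commit. The struct and function names are hypothetical, and the model assumes that the error flag is cleared when the last lock reference is dropped and that eviction only flags inodes that currently track locks; it is a sketch of the reasoning, not the kernel implementation.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model of the per-inode lock state. */
struct inode_model {
    int filelock_ref;     /* number of file locks tracked on the inode */
    bool error_filelock;  /* set when the session's caps are removed */
};

/* Take one reference per lock; extra_ref models the double increment. */
int model_lock(struct inode_model *ci, bool extra_ref)
{
    if (ci->error_filelock)
        return -EIO;
    ci->filelock_ref++;
    if (extra_ref)
        ci->filelock_ref++;   /* the bug: counted a second time */
    return 0;
}

/* Drop one reference; the last release clears the error flag. */
void model_release(struct inode_model *ci)
{
    if (--ci->filelock_ref == 0)
        ci->error_filelock = false;
}

/* Session eviction: locks are lost remotely, so flag the inode. */
void model_evict(struct inode_model *ci)
{
    if (ci->filelock_ref > 0)
        ci->error_filelock = true;
}
```

With a balanced count, lock, evict, release leaves the inode usable again. With the extra increment, the count stays at 1 after the release, the flag is never cleared, and every later lock attempt fails with -EIO.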
02 Apr, 2018
1 commit
-
Some dout format strings do not end with a newline.
Fix this for the files in the fs/ceph and net/ceph directories,
and change printk to dout for printing debug info in super.c.
Signed-off-by: Chengguang Xu
Reviewed-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov
13 Nov, 2017
4 commits
-
When a session gets evicted, all file locks associated with the session
are released remotely by the MDS. File locks tracked by the kernel become
stale. In this situation, set an error flag on the inode. The flag makes
further file locks return -EIO.

Another option for handling this situation is to clean up the file locks
tracked by the kernel. I did not choose it because it is inconvenient to
notify user programs about the error.
Signed-off-by: "Yan, Zheng"
Acked-by: Jeff Layton
Signed-off-by: Ilya Dryomov
-
Don't malloc if there is no flock.
Signed-off-by: "Yan, Zheng"
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov
-
Signed-off-by: "Yan, Zheng"
Reviewed-by: Jeff Layton
Signed-off-by: Ilya Dryomov
-
File locks are tracked by the inode's auth MDS. Dropping auth caps
is equivalent to releasing all file locks.
Signed-off-by: "Yan, Zheng"
Acked-by: Jeff Layton
Signed-off-by: Ilya Dryomov
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
16 Jul, 2017
1 commit
-
Since commit c69899a17ca4 "NFSv4: Update of VFS byte range lock must be
atomic with the stateid update", NFSv4 has been inserting locks in rpciod
worker context. The result is that the file_lock's fl_nspid is the
kworker's pid instead of the original userspace pid.

The fl_nspid is only used to represent the namespaced virtual pid number
when displaying locks or returning from F_GETLK. There's no reason to set
it for every inserted lock, since we can usually just look it up from
fl_pid. So, instead of looking up and holding struct pid for every lock,
let's just look up the virtual pid number from fl_pid when it is needed.
That means we can remove fl_nspid entirely.

The translation and presentation of fl_pid should handle the following four
cases:

1 - F_GETLK on a remote file with a remote lock:
In this case, the filesystem should determine the l_pid to return here.
Filesystems should indicate that the fl_pid represents a non-local pid
value that should not be translated by returning an fl_pid
Signed-off-by: Jeff Layton
07 Jul, 2017
1 commit
-
Don't re-send an interrupted flock request in cases of MDS failover
or when a request forward is received. Because the corresponding 'lock intr'
request may already have finished, it won't get re-sent.
Link: http://tracker.ceph.com/issues/20170
Signed-off-by: "Yan, Zheng"
Signed-off-by: Ilya Dryomov
03 Oct, 2016
1 commit
-
Signed-off-by: Yan, Zheng
23 Oct, 2015
1 commit
-
Instead of having users check for FL_POSIX or FL_FLOCK to call the correct
locks API function, use the check within locks_lock_inode_wait(). This
allows for some later cleanup.
Signed-off-by: Benjamin Coddington
Signed-off-by: Jeff Layton
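The idea of moving the lock-type check into one wrapper can be sketched in userspace C. Everything here is hypothetical: the SK_FL_* flag values, struct sk_file_lock, and the two stand-in lock paths (which just return 1 and 2 so the dispatch is observable) are assumptions made for illustration, not the kernel definitions.

```c
#include <errno.h>

/* Hypothetical flag values for this sketch; the kernel defines its own. */
#define SK_FL_POSIX 0x1u
#define SK_FL_FLOCK 0x2u

struct sk_file_lock {
    unsigned int fl_flags;
};

/* Stand-ins for the two type-specific lock paths. */
int sk_posix_lock_wait(struct sk_file_lock *fl) { (void)fl; return 1; }
int sk_flock_lock_wait(struct sk_file_lock *fl) { (void)fl; return 2; }

/* Single entry point: callers no longer test the lock type themselves. */
int sk_lock_inode_wait(struct sk_file_lock *fl)
{
    switch (fl->fl_flags & (SK_FL_POSIX | SK_FL_FLOCK)) {
    case SK_FL_POSIX:
        return sk_posix_lock_wait(fl);
    case SK_FL_FLOCK:
        return sk_flock_lock_wait(fl);
    default:
        return -EINVAL; /* neither flag, or both, is set */
    }
}
```

A filesystem caller would then invoke the wrapper unconditionally instead of branching on the lock type at every call site, which is what makes the later cleanup possible.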
31 Jul, 2015
1 commit
-
posix locks should be in ctx->flc_posix list
Signed-off-by: Yan, Zheng
Signed-off-by: Ilya Dryomov
17 Feb, 2015
1 commit
-
This reverts commit 9bd0f45b7037fcfa8b575c7e27d0431d6e6dc3bb.
Linus rightly pointed out that I failed to initialize the counters
when adding them, so they don't work as expected. Just revert this
patch for now.
Reported-by: Linus Torvalds
Signed-off-by: Jeff Layton
17 Jan, 2015
5 commits
-
This makes things a bit more efficient in the cifs and ceph lock
pushing code.
Signed-off-by: Jeff Layton
Acked-by: Christoph Hellwig
-
We can now add a dedicated spinlock without expanding struct inode.
Change to using that to protect the various i_flctx lists.
Signed-off-by: Jeff Layton
Acked-by: Christoph Hellwig
-
Signed-off-by: Jeff Layton
Acked-by: Christoph Hellwig
-
Signed-off-by: Jeff Layton
Acked-by: Christoph Hellwig
-
There is only a single call site for each of these functions, and the
caller takes the i_lock prior to calling them and drops it just
afterward. Move the spinlocking into the functions instead.
Signed-off-by: Jeff Layton
Acked-by: Christoph Hellwig
18 Dec, 2014
1 commit
-
When a lock operation is interrupted, the current code sends an unlock request to
the MDS to undo the lock operation. This method does not work as expected because
the unlock request can drop locks that have already been acquired.

The fix is to use the newly introduced CEPH_LOCK_FCNTL_INTR/CEPH_LOCK_FLOCK_INTR
requests to interrupt blocked file lock requests. These requests do not drop
locks that have already been acquired; they only interrupt blocked file lock
requests.
Signed-off-by: Yan, Zheng
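The difference between the two ways of undoing an interrupted request can be shown with a toy model. The struct and function names below are hypothetical, and the two booleans are a deliberate simplification of whatever state the MDS really keeps for a client.

```c
#include <stdbool.h>

/* Hypothetical model of one client's lock state on the MDS. */
struct mds_lock_state {
    bool held;     /* a lock this client already acquired */
    bool pending;  /* a blocked lock request still waiting */
};

/* Old behaviour: undo an interrupted request with a plain unlock. */
void undo_with_unlock(struct mds_lock_state *s)
{
    s->pending = false;
    s->held = false;   /* side effect: an acquired lock is dropped too */
}

/* New behaviour: an interrupt request only cancels the blocked request. */
void undo_with_intr(struct mds_lock_state *s)
{
    s->pending = false;
}
```

Starting from a state with both a held lock and a pending request, the unlock path loses the held lock, while the interrupt path leaves it in place.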
02 Jun, 2014
1 commit
-
Currently, the fl_owner isn't set for flock locks. Some filesystems use
byte-range locks to simulate flock locks and there is a common idiom in
those that does:

    fl->fl_owner = (fl_owner_t)filp;
    fl->fl_start = 0;
    fl->fl_end = OFFSET_MAX;

Since flock locks are generally "owned" by the open file description,
move this into the common flock lock setup code. The fl_start and fl_end
fields are already set appropriately, so remove the unneeded setting of
that in flock ops in those filesystems as well.

Finally, the lease code also sets the fl_owner as if they were owned by
the process and not the open file description. This is incorrect as
leases have the same ownership semantics as flock locks. Set them the
same way. The lease code doesn't actually use the fl_owner value for
anything, so this is more for consistency's sake than a bugfix.
Reported-by: Trond Myklebust
Signed-off-by: Jeff Layton
Acked-by: Greg Kroah-Hartman (Staging portion)
Acked-by: J. Bruce Fields
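A userspace sketch of what moving the idiom into common setup code looks like. The types sk_owner_t, struct sk_file, struct sk_file_lock, the SK_OFFSET_MAX constant and the sk_flock_init() helper are all hypothetical stand-ins chosen for this illustration; only the three assignments mirror the idiom quoted in the commit message.

```c
#include <limits.h>

typedef void *sk_owner_t;           /* stand-in for fl_owner_t */
#define SK_OFFSET_MAX LLONG_MAX     /* stand-in for OFFSET_MAX */

struct sk_file { int unused; };     /* stand-in for an open file */

struct sk_file_lock {
    sk_owner_t fl_owner;
    struct sk_file *fl_file;
    long long fl_start;
    long long fl_end;
};

/* Common flock setup: the owner is the open file, the range is the
 * whole file, so individual filesystems need not repeat this. */
void sk_flock_init(struct sk_file_lock *fl, struct sk_file *filp)
{
    fl->fl_owner = (sk_owner_t)filp;
    fl->fl_file = filp;
    fl->fl_start = 0;
    fl->fl_end = SK_OFFSET_MAX;
}
```

Two locks initialised from the same open file compare equal by owner, while locks from different open files do not, which matches the "owned by the open file description" semantics the message describes.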
29 Apr, 2014
1 commit
-
Signed-off-by: Yan, Zheng
Reviewed-by: Sage Weil
05 Apr, 2014
3 commits
-
flock and posix lock should use fl->fl_file instead of process ID
as owner identifier. (posix lock uses fl->fl_owner. fl->fl_owner
is usually equal to fl->fl_file, but it also can be a customized
value). The process ID of who holds the lock is just for F_GETLK
fcntl(2).

The fix is to rename the 'pid' fields of struct ceph_mds_request_args
and struct ceph_filelock to 'owner', and rename the 'pid_namespace' fields
to 'pid'. Assign fl->fl_file to the 'owner' field of lock messages.
We also set the most significant bit of the 'owner' field. MDS can
use that bit to distinguish between old and new clients.

The MDS counterpart of this patch modifies the flock code to not
take the 'pid_namespace' into consideration when checking conflict
locks.
Signed-off-by: Yan, Zheng
Reviewed-by: Sage Weil
-
Signed-off-by: Yan, Zheng
-
VFS does not pass flock's operation code directly to the filesystem's
flock callback. It translates the operation code into the form in which
posix lock parameters are presented.
Signed-off-by: Yan, Zheng
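The translation mentioned in the commit above can be sketched in userspace C as a mapping from flock(2) operation codes to fcntl-style lock types. The function name sk_flock_to_posix_type() is hypothetical, and the sketch only illustrates the general shape of the mapping, not the kernel's own helper.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>

/* Map a flock(2) operation to the fcntl-style lock type.
 * LOCK_NB only affects blocking behaviour, so it is masked off. */
int sk_flock_to_posix_type(int op)
{
    switch (op & ~LOCK_NB) {
    case LOCK_SH:
        return F_RDLCK;
    case LOCK_EX:
        return F_WRLCK;
    case LOCK_UN:
        return F_UNLCK;
    default:
        return -EINVAL;
    }
}
```

A filesystem flock callback that receives the already translated value therefore has to compare against F_RDLCK, F_WRLCK and F_UNLCK rather than the LOCK_* constants.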
10 Jul, 2013
1 commit
-
Pull Ceph updates from Sage Weil:
"There is some follow-on RBD cleanup after the last window's code drop,
a series from Yan fixing multi-mds behavior in cephfs, and then a
sprinkling of bug fixes all around. Some warnings, sleeping while
atomic, a null dereference, and cleanups"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits)
libceph: fix invalid unsigned->signed conversion for timespec encoding
libceph: call r_unsafe_callback when unsafe reply is received
ceph: fix race between cap issue and revoke
ceph: fix cap revoke race
ceph: fix pending vmtruncate race
ceph: avoid accessing invalid memory
libceph: Fix NULL pointer dereference in auth client code
ceph: Reconstruct the func ceph_reserve_caps.
ceph: Free mdsc if alloc mdsc->mdsmap failed.
ceph: remove sb_start/end_write in ceph_aio_write.
ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.
ceph: fix sleeping function called from invalid context.
ceph: move inode to proper flushing list when auth MDS changes
rbd: fix a couple warnings
ceph: clear migrate seq when MDS restarts
ceph: check migrate seq before changing auth cap
ceph: fix race between page writeback and truncate
ceph: reset iov_len when discarding cap release messages
ceph: fix cap release race
libceph: fix truncate size calculation
...
02 Jul, 2013
1 commit
-
Signed-off-by: Jim Schutt
Reviewed-by: Alex Elder
29 Jun, 2013
1 commit
-
Having a global lock that protects all of this code is a clear
scalability problem. Instead of doing that, move most of the code to be
protected by the i_lock instead. The exceptions are the global lists
that the ->fl_link sits on, and the ->fl_block list.

->fl_link is what connects these structures to the
global lists, so we must ensure that we hold those locks when iterating
over or updating these lists.

Furthermore, sound deadlock detection requires that we hold the
blocked_list state steady while checking for loops. We also must ensure
that the search and update to the list are atomic.

For the checking and insertion side of the blocked_list, push the
acquisition of the global lock into __posix_lock_file and ensure that
checking and update of the blocked_list is done without dropping the
lock in between.

On the removal side, when waking up blocked lock waiters, take the
global lock before walking the blocked list and dequeue the waiters from
the global list prior to removal from the fl_block list.

With this, deadlock detection should be race free while we minimize
excessive file_lock_lock thrashing.

Finally, in order to avoid a lock inversion problem when handling
/proc/locks output we must ensure that manipulations of the fl_block
list are also protected by the file_lock_lock.
Signed-off-by: Jeff Layton
Signed-off-by: Al Viro
18 May, 2013
2 commits
-
Ceph's encode_caps_cb() worked hard to not call __page_cache_alloc()
while holding a lock, but it's spoiled because ceph_pagelist_addpage()
always calls kmap(), which might sleep. Here's the result:

[13439.295457] ceph: mds0 reconnect start
[13439.300572] BUG: sleeping function called from invalid context at include/linux/highmem.h:58
[13439.309243] in_atomic(): 1, irqs_disabled(): 0, pid: 12059, name: kworker/1:1
. . .
[13439.376225] Call Trace:
[13439.378757] [] __might_sleep+0xfc/0x110
[13439.384353] [] ceph_pagelist_append+0x120/0x1b0 [libceph]
[13439.391491] [] ceph_encode_locks+0x89/0x190 [ceph]
[13439.398035] [] ? _raw_spin_lock+0x49/0x50
[13439.403775] [] ? lock_flocks+0x15/0x20
[13439.409277] [] encode_caps_cb+0x41f/0x4a0 [ceph]
[13439.415622] [] ? igrab+0x28/0x70
[13439.420610] [] ? iterate_session_caps+0xe8/0x250 [ceph]
[13439.427584] [] iterate_session_caps+0x115/0x250 [ceph]
[13439.434499] [] ? set_request_path_attr+0x2d0/0x2d0 [ceph]
[13439.441646] [] send_mds_reconnect+0x238/0x450 [ceph]
[13439.448363] [] ? ceph_mdsmap_decode+0x5e2/0x770 [ceph]
[13439.455250] [] check_new_map+0x352/0x500 [ceph]
[13439.461534] [] ceph_mdsc_handle_map+0x1bd/0x260 [ceph]
[13439.468432] [] ? mutex_unlock+0xe/0x10
[13439.473934] [] extra_mon_dispatch+0x22/0x30 [ceph]
[13439.480464] [] dispatch+0xbc/0x110 [libceph]
[13439.486492] [] process_message+0x1ad/0x1d0 [libceph]
[13439.493190] [] ? read_partial_message+0x3e8/0x520 [libceph]
. . .
[13439.587132] ceph: mds0 reconnect success
[13490.720032] ceph: mds0 caps stale
[13501.235257] ceph: mds0 recovery completed
[13501.300419] ceph: mds0 caps renewed

Fix it up by encoding locks into a buffer first, and when the number
of encoded locks is stable, copy that into a ceph_pagelist.

[elder@inktank.com: abbreviated the stack info a bit.]
Cc: stable@vger.kernel.org # 3.4+
Signed-off-by: Jim Schutt
Reviewed-by: Alex Elder
-
In his review, Alex Elder mentioned that he hadn't checked that
num_fcntl_locks and num_flock_locks were properly decoded on the
server side, from a le32 over-the-wire type to a cpu type.
I checked, and AFAICS it is done; those interested can consult
Locker::_do_cap_update()
in src/mds/Locker.cc and src/include/encoding.h in the Ceph server
code (git://github.com/ceph/ceph).

I also checked the server side for flock_len decoding, and I believe
that also happens correctly, by virtue of having been declared
__le32 in struct ceph_mds_cap_reconnect, in src/include/ceph_fs.h.
Cc: stable@vger.kernel.org # 3.4+
Signed-off-by: Jim Schutt
Reviewed-by: Alex Elder
23 Feb, 2013
1 commit
-
Signed-off-by: Al Viro
08 Jun, 2011
2 commits
-
If we request a lock and then abort (e.g., ^C), we need to send a matching
unlock request to the MDS to unwind our lock attempt to avoid indefinitely
blocking other clients.
Reported-by: Brian Chrisman
Signed-off-by: Sage Weil
-
We should use ihold whenever we already have a stable inode ref, even
when we aren't holding i_lock. This avoids adding new and unnecessary
locking dependencies.
Signed-off-by: Sage Weil
02 Dec, 2010
2 commits
-
Fill in the local lock with response data if appropriate,
and don't call posix_lock_file when reading locks.
Signed-off-by: Herb Shiu
Acked-by: Greg Farnum
Signed-off-by: Sage Weil
-
Signed-off-by: Herb Shiu
Acked-by: Greg Farnum
Signed-off-by: Sage Weil
21 Oct, 2010
2 commits
-
When the lock_kernel() turns into lock_flocks() and a spinlock, we won't
be able to do allocations with the lock held. Preallocate space without
the lock, and retry if the lock state changes out from underneath us.
Signed-off-by: Greg Farnum
Signed-off-by: Sage Weil
-
This factors out protocol and low-level storage parts of ceph into a
separate libceph module living in net/ceph and include/linux/ceph. This
is mostly a matter of moving files around. However, a few key pieces
of the interface change as well:

- ceph_client becomes ceph_fs_client and ceph_client, where the latter
captures the mon and osd clients, and the fs_client gets the mds client
and file system specific pieces.
- Mount option parsing and debugfs setup is correspondingly broken into
two pieces.
- The mon client gets a generic handler callback for otherwise unknown
messages (mds map, in this case).
- The basic supported/required feature bits can be expanded (and are by
ceph_fs_client).

No functional change, aside from some subtle error handling cases that got
cleaned up in the refactoring process.
Signed-off-by: Sage Weil