12 Oct, 2020

1 commit


30 Mar, 2020

1 commit


16 Sep, 2019

1 commit


22 Aug, 2019

1 commit


08 May, 2019

1 commit

  • Inode i_filelock_ref is increased in ceph_lock or ceph_flock, but it is
    increased again in ceph_lock_message. As a result, the ref never drops
    back to zero. If the CEPH_I_ERROR_FILELOCK flag is set once in
    remove_session_caps, it can't be cleared even after the client is
    back to normal, so further file lock requests return EIO.

    Signed-off-by: Zhi Zhang
    Reviewed-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    Zhi Zhang
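
The effect of the double increment can be illustrated with a small userspace model. This is only a sketch: the struct, the helper names, and the rule that the error flag is cleared when the count reaches zero are assumptions drawn from the commit text, not the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; field names echo the commit text only. */
struct model_inode {
    int  filelock_ref;    /* stands in for i_filelock_ref */
    bool error_filelock;  /* stands in for CEPH_I_ERROR_FILELOCK */
};

/* Take the reference(s) for one acquired lock. */
static void model_lock_acquire(struct model_inode *in, bool double_count)
{
    in->filelock_ref++;          /* counted once in ceph_lock/ceph_flock */
    if (double_count)
        in->filelock_ref++;      /* the extra increment described above */
}

/* Drop one reference; the error flag is cleared only when none remain. */
static void model_lock_release(struct model_inode *in)
{
    assert(in->filelock_ref > 0);
    if (--in->filelock_ref == 0)
        in->error_filelock = false;
}
```

With double_count set, one acquire followed by one release leaves the count at 1, so in this model the flag stays set, which matches the persistent EIO described above.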
     

02 Apr, 2018

1 commit


13 Nov, 2017

4 commits


02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

16 Jul, 2017

1 commit

  • Since commit c69899a17ca4 "NFSv4: Update of VFS byte range lock must be
    atomic with the stateid update", NFSv4 has been inserting locks in rpciod
    worker context. The result is that the file_lock's fl_nspid is the
    kworker's pid instead of the original userspace pid.

    The fl_nspid is only used to represent the namespaced virtual pid number
    when displaying locks or returning from F_GETLK. There's no reason to set
    it for every inserted lock, since we can usually just look it up from
    fl_pid. So, instead of looking up and holding struct pid for every lock,
    let's just look up the virtual pid number from fl_pid when it is needed.
    That means we can remove fl_nspid entirely.

    The translation and presentation of fl_pid should handle the following four
    cases:

    1 - F_GETLK on a remote file with a remote lock:
    In this case, the filesystem should determine the l_pid to return here.
    Filesystems should indicate that the fl_pid represents a non-local pid
    value that should not be translated by returning an fl_pid
    Signed-off-by: Jeff Layton

    Benjamin Coddington
     

07 Jul, 2017

1 commit

  • Don't re-send an interrupted flock request on MDS failover or when a
    request forward is received. The corresponding 'lock intr' request
    may already have finished, and it won't be re-sent.

    Link: http://tracker.ceph.com/issues/20170
    Signed-off-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    Yan, Zheng
     

03 Oct, 2016

1 commit


23 Oct, 2015

1 commit


31 Jul, 2015

1 commit


17 Feb, 2015

1 commit


17 Jan, 2015

5 commits


18 Dec, 2014

1 commit

  • When a lock operation is interrupted, current code sends an unlock request to
    MDS to undo the lock operation. This method does not work as expected because
    the unlock request can drop locks that have already been acquired.

    The fix is to use the newly introduced CEPH_LOCK_FCNTL_INTR/CEPH_LOCK_FLOCK_INTR
    requests to interrupt blocked file lock requests. These requests do not drop
    locks that have already been acquired; they only interrupt blocked file lock
    requests.

    Signed-off-by: Yan, Zheng

    Yan, Zheng
     

02 Jun, 2014

1 commit

  • Currently, the fl_owner isn't set for flock locks. Some filesystems use
    byte-range locks to simulate flock locks and there is a common idiom in
    those that does:

    fl->fl_owner = (fl_owner_t)filp;
    fl->fl_start = 0;
    fl->fl_end = OFFSET_MAX;

    Since flock locks are generally "owned" by the open file description,
    move this into the common flock lock setup code. The fl_start and fl_end
    fields are already set appropriately, so remove the unneeded setting of
    that in flock ops in those filesystems as well.

    Finally, the lease code also sets the fl_owner as if they were owned by
    the process and not the open file description. This is incorrect as
    leases have the same ownership semantics as flock locks. Set them the
    same way. The lease code doesn't actually use the fl_owner value for
    anything, so this is more for consistency's sake than a bugfix.

    Reported-by: Trond Myklebust
    Signed-off-by: Jeff Layton
    Acked-by: Greg Kroah-Hartman (Staging portion)
    Acked-by: J. Bruce Fields

    Jeff Layton
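
The ownership rule described above (flock locks belong to the open file description, not the process) can be observed from userspace. The following is a minimal sketch, assuming a Linux system with a local filesystem under /tmp; the function name and temporary path are illustrative only.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Returns 0 when a dup()'d descriptor shares the flock lock while a second
 * open() of the same file conflicts, 1 when the behaviour differs, and -1
 * when the setup itself fails.
 */
static int flock_owner_demo(void)
{
    char path[] = "/tmp/flock-demo-XXXXXX";
    int result = -1;
    int fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;

    int fd_dup = dup(fd1);          /* same open file description as fd1 */
    int fd2 = open(path, O_RDWR);   /* a separate open file description */

    if (fd_dup >= 0 && fd2 >= 0 && flock(fd1, LOCK_EX) == 0) {
        int same_ofd = flock(fd_dup, LOCK_EX | LOCK_NB);
        int other_ofd = flock(fd2, LOCK_EX | LOCK_NB);
        int other_errno = errno;    /* meaningful only if other_ofd == -1 */

        result = (same_ofd == 0 && other_ofd == -1 &&
                  other_errno == EWOULDBLOCK) ? 0 : 1;
    }

    if (fd2 >= 0)
        close(fd2);
    if (fd_dup >= 0)
        close(fd_dup);
    close(fd1);
    unlink(path);
    return result;
}
```

Both descriptors live in the same process, so the conflict on fd2 shows that the process is not the owner; the open file description is.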
     

29 Apr, 2014

1 commit


05 Apr, 2014

3 commits

  • flock and posix lock should use fl->fl_file instead of process ID
    as owner identifier. (posix lock uses fl->fl_owner. fl->fl_owner
    is usually equal to fl->fl_file, but it also can be a customized
    value). The process ID of the lock holder is only needed for F_GETLK
    fcntl(2).

    The fix is to rename the 'pid' fields of struct ceph_mds_request_args
    and struct ceph_filelock to 'owner', and rename the 'pid_namespace' fields
    to 'pid'. Assign fl->fl_file to the 'owner' field of lock messages.
    We also set the most significant bit of the 'owner' field. MDS can
    use that bit to distinguish between old and new clients.

    The MDS counterpart of this patch modifies the flock code to not
    take the 'pid_namespace' into consideration when checking for
    conflicting locks.

    Signed-off-by: Yan, Zheng
    Reviewed-by: Sage Weil

    Yan, Zheng
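
The owner tagging described above can be sketched as follows. The macro and function names are hypothetical, and treating the owner as a 64-bit value derived from a pointer is an assumption for illustration; the commit text only states that the most significant bit of the 'owner' field is set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: marks an owner value produced by a new-style client. */
#define OWNER_NEW_CLIENT_BIT (UINT64_C(1) << 63)

/* Build the 64-bit owner field from an opaque per-file pointer. */
static uint64_t encode_lock_owner(const void *file)
{
    return (uint64_t)(uintptr_t)file | OWNER_NEW_CLIENT_BIT;
}

/* What a receiver could check to tell old and new clients apart. */
static bool owner_from_new_client(uint64_t owner)
{
    return (owner & OWNER_NEW_CLIENT_BIT) != 0;
}
```

The scheme relies on old clients never producing a value with the top bit set, which holds in this sketch only if their pid-based values stay below 2^63.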
     
  • Signed-off-by: Yan, Zheng

    Yan, Zheng
     
  • VFS does not pass flock's operation code directly to the filesystem's
    flock callback. It translates the operation code into the form in which
    posix lock parameters are presented.

    Signed-off-by: Yan, Zheng

    Yan, Zheng
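
As a rough userspace illustration of that translation, the flock(2) operation codes map onto the lock types used for POSIX locks. The function name is hypothetical and this is not the kernel's own helper; it only mirrors the mapping a filesystem's flock callback ends up seeing.

```c
#include <fcntl.h>
#include <sys/file.h>

/*
 * Map a flock(2) operation (LOCK_SH, LOCK_EX, LOCK_UN, optionally OR'ed
 * with LOCK_NB) to the matching POSIX lock type, or -1 if unrecognised.
 */
static int flock_op_to_lock_type(int op)
{
    switch (op & ~LOCK_NB) {
    case LOCK_SH:
        return F_RDLCK;
    case LOCK_EX:
        return F_WRLCK;
    case LOCK_UN:
        return F_UNLCK;
    default:
        return -1;
    }
}
```

A callback that compared its argument against LOCK_SH or LOCK_EX directly would therefore be testing the wrong set of constants.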
     

10 Jul, 2013

1 commit

  • Pull Ceph updates from Sage Weil:
    "There is some follow-on RBD cleanup after the last window's code drop,
    a series from Yan fixing multi-mds behavior in cephfs, and then a
    sprinkling of bug fixes all around. Some warnings, sleeping while
    atomic, a null dereference, and cleanups"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits)
    libceph: fix invalid unsigned->signed conversion for timespec encoding
    libceph: call r_unsafe_callback when unsafe reply is received
    ceph: fix race between cap issue and revoke
    ceph: fix cap revoke race
    ceph: fix pending vmtruncate race
    ceph: avoid accessing invalid memory
    libceph: Fix NULL pointer dereference in auth client code
    ceph: Reconstruct the func ceph_reserve_caps.
    ceph: Free mdsc if alloc mdsc->mdsmap failed.
    ceph: remove sb_start/end_write in ceph_aio_write.
    ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.
    ceph: fix sleeping function called from invalid context.
    ceph: move inode to proper flushing list when auth MDS changes
    rbd: fix a couple warnings
    ceph: clear migrate seq when MDS restarts
    ceph: check migrate seq before changing auth cap
    ceph: fix race between page writeback and truncate
    ceph: reset iov_len when discarding cap release messages
    ceph: fix cap release race
    libceph: fix truncate size calculation
    ...

    Linus Torvalds
     

02 Jul, 2013

1 commit


29 Jun, 2013

1 commit

  • Having a global lock that protects all of this code is a clear
    scalability problem. Instead of doing that, move most of the code to be
    protected by the i_lock instead. The exceptions are the global lists
    that the ->fl_link sits on, and the ->fl_block list.

    ->fl_link is what connects these structures to the
    global lists, so we must ensure that we hold those locks when iterating
    over or updating these lists.

    Furthermore, sound deadlock detection requires that we hold the
    blocked_list state steady while checking for loops. We also must ensure
    that the search and update to the list are atomic.

    For the checking and insertion side of the blocked_list, push the
    acquisition of the global lock into __posix_lock_file and ensure that
    checking and update of the blocked_list is done without dropping the
    lock in between.

    On the removal side, when waking up blocked lock waiters, take the
    global lock before walking the blocked list and dequeue the waiters from
    the global list prior to removal from the fl_block list.

    With this, deadlock detection should be race free while we minimize
    excessive file_lock_lock thrashing.

    Finally, in order to avoid a lock inversion problem when handling
    /proc/locks output we must ensure that manipulations of the fl_block
    list are also protected by the file_lock_lock.

    Signed-off-by: Jeff Layton
    Signed-off-by: Al Viro

    Jeff Layton
     

18 May, 2013

2 commits

  • Ceph's encode_caps_cb() worked hard to not call __page_cache_alloc()
    while holding a lock, but it's spoiled because ceph_pagelist_addpage()
    always calls kmap(), which might sleep. Here's the result:

    [13439.295457] ceph: mds0 reconnect start
    [13439.300572] BUG: sleeping function called from invalid context at include/linux/highmem.h:58
    [13439.309243] in_atomic(): 1, irqs_disabled(): 0, pid: 12059, name: kworker/1:1
    . . .
    [13439.376225] Call Trace:
    [13439.378757] __might_sleep+0xfc/0x110
    [13439.384353] ceph_pagelist_append+0x120/0x1b0 [libceph]
    [13439.391491] ceph_encode_locks+0x89/0x190 [ceph]
    [13439.398035] ? _raw_spin_lock+0x49/0x50
    [13439.403775] ? lock_flocks+0x15/0x20
    [13439.409277] encode_caps_cb+0x41f/0x4a0 [ceph]
    [13439.415622] ? igrab+0x28/0x70
    [13439.420610] ? iterate_session_caps+0xe8/0x250 [ceph]
    [13439.427584] iterate_session_caps+0x115/0x250 [ceph]
    [13439.434499] ? set_request_path_attr+0x2d0/0x2d0 [ceph]
    [13439.441646] send_mds_reconnect+0x238/0x450 [ceph]
    [13439.448363] ? ceph_mdsmap_decode+0x5e2/0x770 [ceph]
    [13439.455250] check_new_map+0x352/0x500 [ceph]
    [13439.461534] ceph_mdsc_handle_map+0x1bd/0x260 [ceph]
    [13439.468432] ? mutex_unlock+0xe/0x10
    [13439.473934] extra_mon_dispatch+0x22/0x30 [ceph]
    [13439.480464] dispatch+0xbc/0x110 [libceph]
    [13439.486492] process_message+0x1ad/0x1d0 [libceph]
    [13439.493190] ? read_partial_message+0x3e8/0x520 [libceph]
    . . .
    [13439.587132] ceph: mds0 reconnect success
    [13490.720032] ceph: mds0 caps stale
    [13501.235257] ceph: mds0 recovery completed
    [13501.300419] ceph: mds0 caps renewed

    Fix it up by encoding locks into a buffer first, and when the number
    of encoded locks is stable, copy that into a ceph_pagelist.

    [elder@inktank.com: abbreviated the stack info a bit.]

    Cc: stable@vger.kernel.org # 3.4+
    Signed-off-by: Jim Schutt
    Reviewed-by: Alex Elder

    Jim Schutt
     
  • In his review, Alex Elder mentioned that he hadn't checked that
    num_fcntl_locks and num_flock_locks were properly decoded on the
    server side, from a le32 over-the-wire type to a cpu type.
    I checked, and AFAICS it is done; those interested can consult
    Locker::_do_cap_update() in src/mds/Locker.cc and
    src/include/encoding.h in the Ceph server code
    (git://github.com/ceph/ceph).

    I also checked the server side for flock_len decoding, and I believe
    that also happens correctly, by virtue of having been declared
    __le32 in struct ceph_mds_cap_reconnect, in src/include/ceph_fs.h.

    Cc: stable@vger.kernel.org # 3.4+
    Signed-off-by: Jim Schutt
    Reviewed-by: Alex Elder

    Jim Schutt
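
For readers unfamiliar with the le32-to-cpu conversion discussed above, here is a minimal portable sketch of decoding a little-endian 32-bit wire value. The function name is hypothetical; neither the kernel nor the Ceph server uses this exact helper.

```c
#include <stdint.h>

/*
 * Decode a 32-bit little-endian value from a byte buffer. Building the
 * result byte by byte gives the same answer on any host endianness.
 */
static uint32_t le32_decode(const unsigned char *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

A count such as num_fcntl_locks that is read without this step would be misinterpreted on a big-endian host.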
     

23 Feb, 2013

1 commit


08 Jun, 2011

2 commits


02 Dec, 2010

2 commits


21 Oct, 2010

2 commits

  • When the lock_kernel() turns into lock_flocks() and a spinlock, we won't
    be able to do allocations with the lock held. Preallocate space without
    the lock, and retry if the lock state changes out from underneath us.

    Signed-off-by: Greg Farnum
    Signed-off-by: Sage Weil

    Greg Farnum
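
The preallocate-and-retry pattern described above can be sketched in userspace with a pthread mutex standing in for the spinlock. All names here are hypothetical, and the table of ints is a stand-in for the real lock records.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table of lock records protected by a mutex. */
struct lock_table {
    pthread_mutex_t mu;
    int *items;
    size_t count;
};

/*
 * Copy the table into a freshly allocated buffer. The allocation happens
 * with the lock dropped; if the table grew in the meantime, start over.
 */
static int *snapshot_locks(struct lock_table *t, size_t *out_count)
{
    for (;;) {
        pthread_mutex_lock(&t->mu);
        size_t want = t->count;              /* size seen under the lock */
        pthread_mutex_unlock(&t->mu);

        int *buf = malloc((want ? want : 1) * sizeof(*buf));
        if (!buf)
            return NULL;

        pthread_mutex_lock(&t->mu);
        if (t->count <= want) {              /* the buffer is still big enough */
            if (t->count)
                memcpy(buf, t->items, t->count * sizeof(*buf));
            *out_count = t->count;
            pthread_mutex_unlock(&t->mu);
            return buf;
        }
        pthread_mutex_unlock(&t->mu);
        free(buf);                           /* state changed underneath us: retry */
    }
}
```

The caller owns the returned buffer and frees it. The loop only repeats when the table grows between the two critical sections, so no allocation is ever made while the lock is held.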
     
  • This factors out protocol and low-level storage parts of ceph into a
    separate libceph module living in net/ceph and include/linux/ceph. This
    is mostly a matter of moving files around. However, a few key pieces
    of the interface change as well:

    - ceph_client becomes ceph_fs_client and ceph_client, where the latter
    captures the mon and osd clients, and the fs_client gets the mds client
    and file system specific pieces.
    - Mount option parsing and debugfs setup are correspondingly broken into
    two pieces.
    - The mon client gets a generic handler callback for otherwise unknown
    messages (mds map, in this case).
    - The basic supported/required feature bits can be expanded (and are by
    ceph_fs_client).

    No functional change, aside from some subtle error handling cases that got
    cleaned up in the refactoring process.

    Signed-off-by: Sage Weil

    Yehuda Sadeh