26 Oct, 2020

1 commit


12 Oct, 2020

1 commit


13 Aug, 2020

1 commit


03 Aug, 2020

1 commit


24 Jun, 2020

1 commit


01 Jun, 2020

1 commit

  • Count hits and misses in the caps cache. If the client has all of
    the necessary caps when a task needs references, then it's counted
    as a hit. Any other situation is a miss.

    URL: https://tracker.ceph.com/issues/43215
    Signed-off-by: Xiubo Li
    Reviewed-by: Jeff Layton
    Signed-off-by: Ilya Dryomov

    Xiubo Li
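
The hit/miss rule in this commit can be sketched in userspace C. This is a minimal model, not the kernel code: `caps_metric` and `caps_account` are hypothetical names, and caps are reduced to a plain bitmask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counters; the names are illustrative, not the kernel's. */
struct caps_metric {
    uint64_t hit;
    uint64_t mis;
};

/*
 * Record one lookup. It is a hit only if every bit in `need` is already
 * present in `issued`; any other situation is counted as a miss.
 */
static bool caps_account(struct caps_metric *m, unsigned issued, unsigned need)
{
    bool hit = (issued & need) == need;

    if (hit)
        m->hit++;
    else
        m->mis++;
    return hit;
}
```

A caller holding caps `0x7` that needs `0x3` would record a hit; holding only `0x1` would record a miss.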
     

10 Feb, 2020

1 commit


27 Jan, 2020

2 commits


06 Nov, 2019

1 commit

  • Add a flags argument to the xattr get method so that the
    XATTR_NOSECURITY bit flag can be passed to it. XATTR_NOSECURITY is
    then generally set in the __vfs_getxattr path when it is called by
    the security infrastructure.

    This handles the case of a union filesystem driver that the security
    layer asks to report back the xattr data, for the use case where
    access is to be blocked by the security layer.

    The path then could be security(dentry) ->
    __vfs_getxattr(dentry...XATTR_NOSECURITY) ->
    handler->get(dentry...XATTR_NOSECURITY) ->
    __vfs_getxattr(lower_dentry...XATTR_NOSECURITY) ->
    lower_handler->get(lower_dentry...XATTR_NOSECURITY)
    which would report data and success back through the chain as
    expected. The logging security layer at the top would then have the
    data to determine the access permissions and report back the target
    context that was blocked.

    Without the get handler flag, the path on a union filesystem would be
    the errant security(dentry) -> __vfs_getxattr(dentry) ->
    handler->get(dentry) -> vfs_getxattr(lower_dentry) -> nested ->
    security(lower_dentry, log off) -> lower_handler->get(lower_dentry)
    which would report back through the chain no data, and -EACCES.

    For SELinux, both cases translate to a correctly determined blocked
    access. In the first case, with this change, a correct avc log would
    be reported. In the second (legacy) case, an incorrect avc log would
    be reported against an uninitialized u:object_r:unlabeled:s0 context,
    making the logs cosmetically useless for audit2allow.

    This patch series is inert and is the wide-spread addition of the
    flags option for xattr functions, and a replacement of __vfs_getxattr
    with __vfs_getxattr(...XATTR_NOSECURITY).

    Signed-off-by: Mark Salyzyn
    Reviewed-by: Jan Kara
    Acked-by: Jan Kara
    Acked-by: Jeff Layton
    Acked-by: David Sterba
    Acked-by: Darrick J. Wong
    Acked-by: Mike Marshall
    Cc: Stephen Smalley
    Cc: linux-kernel@vger.kernel.org
    Cc: kernel-team@android.com
    Cc: linux-security-module@vger.kernel.org

    (cherry picked from (rejected from archive because of too many recipients))
    Signed-off-by: Mark Salyzyn
    Bug: 133515582
    Bug: 136124883
    Bug: 129319403
    Change-Id: Iabbb8771939d5f66667a26bb23ddf4c562c349a1

    Mark Salyzyn
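
The two call paths described in this commit can be modeled in userspace C. This is a toy sketch under stated assumptions: `struct dentry`, `security_check`, `checked_getxattr` and `raw_getxattr` are hypothetical stand-ins for the kernel's dentry, security hook, vfs_getxattr() and __vfs_getxattr(), and the flag value is assumed.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define XATTR_NOSECURITY 0x4 /* flag value assumed for this model */

/* Toy dentry: a leaf holds a label; a union dentry points at a lower one. */
struct dentry {
    const char *label;
    struct dentry *lower;
};

static int raw_getxattr(const struct dentry *d, char *buf, size_t size,
                        int flags);

/* Stand-in for the security layer: this model denies every access. */
static int security_check(const struct dentry *d)
{
    (void)d;
    return -EACCES;
}

/* Models vfs_getxattr(): permission check first, then the raw read. */
static int checked_getxattr(const struct dentry *d, char *buf, size_t size)
{
    int err = security_check(d);

    return err ? err : raw_getxattr(d, buf, size, 0);
}

/* Models __vfs_getxattr() plus the handler->get() it dispatches to. */
static int raw_getxattr(const struct dentry *d, char *buf, size_t size,
                        int flags)
{
    size_t len;

    if (d->lower) {
        /* Union handler: forward the flag so the lower read stays raw. */
        if (flags & XATTR_NOSECURITY)
            return raw_getxattr(d->lower, buf, size, flags);
        /* Legacy path: re-enters the security check on the lower dentry. */
        return checked_getxattr(d->lower, buf, size);
    }
    if (!d->label)
        return -ENODATA;
    len = strlen(d->label);
    if (len > size)
        return -ERANGE;
    memcpy(buf, d->label, len);
    return (int)len;
}
```

In this model, a raw read on the union dentry with XATTR_NOSECURITY returns the lower label, while the same read without the flag comes back as -EACCES with no data, which mirrors the "errant" legacy path above.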
     

16 Sep, 2019

4 commits

  • Most filesystems don't limit what security.* xattrs can be set or
    fetched. I see no reason that we need to limit that on cephfs either.

    Drop the special xattr handler for "security." xattrs, and allow the
    "other" xattr handler to handle security xattrs as well.

    In addition to fixing xfstest generic/093, this allows us to support
    per-file capabilities (à la setcap(8)).

    Link: https://tracker.ceph.com/issues/41135
    Signed-off-by: Jeff Layton
    Signed-off-by: Ilya Dryomov

    Jeff Layton
     
  • __ceph_getxattr will set the CEPH_I_SEC_INITED flag whenever it gets
    any xattr that starts with "security.". We only want to set that flag
    when fetching the MAC label for the currently-active LSM, however.

    Signed-off-by: Jeff Layton
    Signed-off-by: Ilya Dryomov

    Jeff Layton
     
  • No need to do an extra jump here. Also add some comments on the endifs.

    Signed-off-by: Jeff Layton
    Signed-off-by: Ilya Dryomov

    Jeff Layton
     
  • Most filesystems that provide virtual xattrs (e.g. CIFS) don't display
    them via listxattr(). Ceph does, and that causes some of the tests in
    xfstests to fail.

    Have cephfs stop listing vxattrs in listxattr. Userspace can always
    query them directly when the name is known.

    Signed-off-by: Jeff Layton
    Acked-by: David Disseldorp
    Signed-off-by: Ilya Dryomov

    Jeff Layton
     

22 Aug, 2019

2 commits

  • Calling ceph_buffer_put() in __ceph_build_xattrs_blob() may result in
    freeing the i_xattrs.blob buffer while holding the i_ceph_lock. This can
    be fixed by having this function return the old blob buffer and having
    its callers free it once the lock is released.

    The following backtrace was triggered by fstests generic/117.

    BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
    in_atomic(): 1, irqs_disabled(): 0, pid: 649, name: fsstress
    4 locks held by fsstress/649:
    #0: 00000000a7478e7e (&type->s_umount_key#19){++++}, at: iterate_supers+0x77/0xf0
    #1: 00000000f8de1423 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: ceph_check_caps+0x7b/0xc60
    #2: 00000000562f2b27 (&s->s_mutex){+.+.}, at: ceph_check_caps+0x3bd/0xc60
    #3: 00000000f83ce16a (&mdsc->snap_rwsem){++++}, at: ceph_check_caps+0x3ed/0xc60
    CPU: 1 PID: 649 Comm: fsstress Not tainted 5.2.0+ #439
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
    Call Trace:
    dump_stack+0x67/0x90
    ___might_sleep.cold+0x9f/0xb1
    vfree+0x4b/0x60
    ceph_buffer_release+0x1b/0x60
    __ceph_build_xattrs_blob+0x12b/0x170
    __send_cap+0x302/0x540
    ? __lock_acquire+0x23c/0x1e40
    ? __mark_caps_flushing+0x15c/0x280
    ? _raw_spin_unlock+0x24/0x30
    ceph_check_caps+0x5f0/0xc60
    ceph_flush_dirty_caps+0x7c/0x150
    ? __ia32_sys_fdatasync+0x20/0x20
    ceph_sync_fs+0x5a/0x130
    iterate_supers+0x8f/0xf0
    ksys_sync+0x4f/0xb0
    __ia32_sys_sync+0xa/0x10
    do_syscall_64+0x50/0x1c0
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x7fc6409ab617

    Signed-off-by: Luis Henriques
    Reviewed-by: Jeff Layton
    Signed-off-by: Ilya Dryomov

    Luis Henriques
     
  • Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the
    i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be
    fixed by postponing the call until later, when the lock is released.

    The following backtrace was triggered by fstests generic/117.

    BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
    in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress
    3 locks held by fsstress/650:
    #0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50
    #1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0
    #2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810
    CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
    Call Trace:
    dump_stack+0x67/0x90
    ___might_sleep.cold+0x9f/0xb1
    vfree+0x4b/0x60
    ceph_buffer_release+0x1b/0x60
    __ceph_setxattr+0x2b4/0x810
    __vfs_setxattr+0x66/0x80
    __vfs_setxattr_noperm+0x59/0xf0
    vfs_setxattr+0x81/0xa0
    setxattr+0x115/0x230
    ? filename_lookup+0xc9/0x140
    ? rcu_read_lock_sched_held+0x74/0x80
    ? rcu_sync_lockdep_assert+0x2e/0x60
    ? __sb_start_write+0x142/0x1a0
    ? mnt_want_write+0x20/0x50
    path_setxattr+0xba/0xd0
    __x64_sys_lsetxattr+0x24/0x30
    do_syscall_64+0x50/0x1c0
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x7ff23514359a

    Signed-off-by: Luis Henriques
    Reviewed-by: Jeff Layton
    Signed-off-by: Ilya Dryomov

    Luis Henriques
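
Both fixes in this group follow the same pattern: detach the old buffer while holding the lock, and release it only after unlocking. A minimal userspace sketch of that pattern follows; `inode_info`, `swap_blob_locked` and `update_blob` are hypothetical names, and a pthread mutex stands in for the kernel spinlock.

```c
#include <pthread.h>
#include <stdlib.h>

struct inode_info {
    pthread_mutex_t lock; /* stands in for i_ceph_lock */
    char *blob;           /* stands in for i_xattrs.blob */
};

/*
 * Must be called with ii->lock held. Installs the new blob and hands the
 * old one back instead of freeing it, so nothing is released under the lock.
 */
static char *swap_blob_locked(struct inode_info *ii, char *new_blob)
{
    char *old = ii->blob;

    ii->blob = new_blob;
    return old;
}

static void update_blob(struct inode_info *ii, char *new_blob)
{
    char *old;

    pthread_mutex_lock(&ii->lock);
    old = swap_blob_locked(ii, new_blob);
    pthread_mutex_unlock(&ii->lock);

    /* The potentially sleeping release happens outside the lock. */
    free(old);
}
```

In the kernel the concern is that the release may sleep (the backtraces show vfree()) while a spinlock is held; the userspace model only illustrates the ordering.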
     

08 Jul, 2019

11 commits

  • The convention with xattrs is to not store the termination with string
    data, given that it returns the length. This is how setfattr/getfattr
    operate.

    Most of ceph's virtual xattr routines use snprintf to write the string
    directly into the destination buffer, but snprintf always
    NUL-terminates the string. This means that if we send the kernel a
    buffer that is exactly the length needed to hold the string, the value
    ends up truncated.

    Add a ceph_fmt_xattr helper function to format the string into an
    on-stack buffer that should always be large enough to hold the whole
    thing and then memcpy the result into the destination buffer. If it does
    turn out that the formatted string won't fit in the on-stack buffer,
    then return -E2BIG and do a WARN_ONCE().

    Change over most of the virtual xattr routines to use the new helper. A
    couple of the xattrs are sourced from strings however, and it's
    difficult to know how long they'll be. Just have those memcpy the result
    in place after verifying the length.

    Signed-off-by: Jeff Layton
    Reviewed-by: "Yan, Zheng"
    Acked-by: Ilya Dryomov
    Signed-off-by: Ilya Dryomov

    Jeff Layton
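
A userspace sketch of the helper described in this commit, assuming a hypothetical name (`fmt_xattr`) and an assumed scratch-buffer size; the kernel version additionally issues a WARN_ONCE() on overflow, which is omitted here.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define FMT_BUF_SIZE 64 /* assumed size of the on-stack scratch buffer */

/*
 * Format into a scratch buffer, then copy only the payload bytes (no
 * trailing NUL) into dst. Returns the value length, or -E2BIG if the
 * formatted string does not fit in the scratch buffer.
 */
static int fmt_xattr(char *dst, size_t size, const char *fmt, ...)
{
    char tmp[FMT_BUF_SIZE];
    va_list ap;
    int len;

    va_start(ap, fmt);
    len = vsnprintf(tmp, sizeof(tmp), fmt, ap);
    va_end(ap);

    if (len < 0)
        return -EINVAL;
    if ((size_t)len >= sizeof(tmp))
        return -E2BIG;
    /* Copy only when the caller's buffer can hold the whole value. */
    if (len > 0 && (size_t)len <= size)
        memcpy(dst, tmp, (size_t)len);
    return len;
}
```

With this shape, a two-byte destination buffer is enough to receive the two-character value "42", which plain snprintf() would have truncated to "4" plus a NUL. Deciding what to do when the returned length exceeds `size` is left to the caller.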
     
  • The getxattr manpage states that we should return ERANGE if the
    destination buffer size is too small to hold the value.
    ceph_vxattrcb_layout does this internally, but we should be doing
    this for all vxattrs.

    Fix the only caller of getxattr_cb to check the returned size
    against the buffer length and return -ERANGE if it doesn't fit.
    Drop the same check in ceph_vxattrcb_layout and just rely on the
    caller to handle it.

    Signed-off-by: Jeff Layton
    Reviewed-by: "Yan, Zheng"
    Acked-by: Ilya Dryomov
    Signed-off-by: Ilya Dryomov

    Jeff Layton
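
The centralized check can be sketched as a thin wrapper around a callback. This is an illustrative model: `getxattr_cb_t`, `example_cb` and `get_vxattr` are hypothetical names, and a zero-length buffer is treated as a length probe, as getxattr(2) documents.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical callback type: returns the value length or a negative errno. */
typedef ssize_t (*getxattr_cb_t)(char *val, size_t size);

/* Example callback producing the fixed value "1024". */
static ssize_t example_cb(char *val, size_t size)
{
    static const char v[] = "1024";
    size_t len = sizeof(v) - 1;

    if (len <= size)
        memcpy(val, v, len);
    return (ssize_t)len;
}

/* Single caller-side check, so individual callbacks need not repeat it. */
static ssize_t get_vxattr(getxattr_cb_t cb, char *val, size_t size)
{
    ssize_t ret = cb(val, size);

    /* size == 0 is a length probe; otherwise the value must fit. */
    if (ret >= 0 && size != 0 && (size_t)ret > size)
        return -ERANGE;
    return ret;
}
```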
     
  • The getxattr_cb functions return size_t, which is unsigned. That value
    is then cast to int and then to ssize_t before being returned. While
    all of this works, it relies on implicit casting rules for
    signed/unsigned conversions.

    Change getxattr_cb to return ssize_t to better conform with what the
    caller actually wants. Also, remove some suspicious casts.

    Signed-off-by: Jeff Layton
    Reviewed-by: "Yan, Zheng"
    Acked-by: Ilya Dryomov
    Signed-off-by: Ilya Dryomov

    Jeff Layton
     
  • When creating a new file or directory, use
    security_dentry_init_security() to prepare the SELinux context for the
    new inode, then send the openc/mkdir request to the MDS together with
    the SELinux xattr.

    security_dentry_init_security() only supports a single security module,
    and only SELinux has a dentry_init_security hook, so only SELinux is
    supported for now. We can add support for other security modules once
    the kernel has a generic version of dentry_init_security().

    Signed-off-by: "Yan, Zheng"
    Reviewed-by: Jeff Layton
    Signed-off-by: Ilya Dryomov

    Yan, Zheng
     
  • Also rename ceph_release_acls_info() to ceph_release_acl_sec_ctx().
    And move their definitions to different files. This is preparation
    for security label support.

    Signed-off-by: "Yan, Zheng"
    Reviewed-by: Jeff Layton
    Signed-off-by: Ilya Dryomov

    Yan, Zheng
     
  • name is not '\0' terminated.

    Signed-off-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    Yan, Zheng
     
  • The vxattr value incorrectly prefixes the nanoseconds field with a
    literal "09" instead of supplying it as a zero-pad width specifier
    after '%'.

    Fixes: 3489b42a72a4 ("ceph: fix three bugs, two in ceph_vxattrcb_file_layout()")
    Link: https://tracker.ceph.com/issues/39943
    Signed-off-by: David Disseldorp
    Reviewed-by: Ilya Dryomov
    Signed-off-by: Ilya Dryomov

    David Disseldorp
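
The difference between the two format strings is easy to see in isolation. The function names below are hypothetical, and the exact kernel format strings are assumed from the description in this commit.

```c
#include <stddef.h>
#include <stdio.h>

/* Buggy form: "09" sits before '%', so it is emitted as literal text. */
static int fmt_time_buggy(char *buf, size_t size, long long sec, long nsec)
{
    return snprintf(buf, size, "%lld.09%ld", sec, nsec);
}

/* Fixed form: "09" follows '%', so it zero-pads the field to 9 digits. */
static int fmt_time_fixed(char *buf, size_t size, long long sec, long nsec)
{
    return snprintf(buf, size, "%lld.%09ld", sec, nsec);
}
```

For 5 seconds and 42 nanoseconds, the buggy form yields "5.0942" while the fixed form yields "5.000000042".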
     
  • ceph_listxattr() now calculates the length of vxattrs dynamically, so
    these helpers, which incorrectly ignore vxattr.exists_cb(), can be
    removed.

    Signed-off-by: David Disseldorp
    Reviewed-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    David Disseldorp
     
  • ceph_listxattr() incorrectly returns a length based on the static
    ceph_vxattrs_name_size() value, which only takes into account whether
    vxattrs are hidden, ignoring vxattr.exists_cb().

    When filling the xattr buffer ceph_listxattr() checks VXATTR_FLAG_HIDDEN
    and vxattr.exists_cb(). If both are false, we return an incorrect
    (oversize) length.

    Fix this behaviour by always calculating the vxattrs length at runtime,
    taking both vxattr.hidden and vxattr.exists_cb() into account.

    This bug is only exposed with the new "ceph.snap.btime" vxattr, as all
    other vxattrs with a non-null exists_cb also carry VXATTR_FLAG_HIDDEN.

    Signed-off-by: David Disseldorp
    Reviewed-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    David Disseldorp
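
A sketch of the runtime length calculation, taking both the hidden flag and exists_cb into account. The struct layout, the flag value, and the argument-less callback are simplifications assumed for illustration; the kernel callback takes an inode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define VXATTR_FLAG_HIDDEN 0x1 /* flag value assumed for this sketch */

struct vxattr {
    const char *name;
    bool (*exists_cb)(void); /* NULL means the vxattr always exists */
    unsigned flags;
};

/* Example callback for a vxattr that is absent on this inode. */
static bool absent_cb(void)
{
    return false;
}

/* Sum the listxattr space needed by the vxattrs that would be emitted. */
static size_t vxattrs_names_len(const struct vxattr *v, size_t n)
{
    size_t len = 0;

    for (size_t i = 0; i < n; i++) {
        if (v[i].flags & VXATTR_FLAG_HIDDEN)
            continue;
        if (v[i].exists_cb && !v[i].exists_cb())
            continue;
        len += strlen(v[i].name) + 1; /* names are NUL-separated */
    }
    return len;
}
```

Because the same two conditions decide both the reported length and what is copied into the buffer, the two can no longer disagree.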
     
  • The ceph.snap.btime virtual xattr provides the snapshot creation (birth)
    time in $secs.$nsecs format.

    Link: https://tracker.ceph.com/issues/38838
    Signed-off-by: David Disseldorp
    Reviewed-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    David Disseldorp
     
  • .name_size should use the same string as .name.

    Signed-off-by: David Disseldorp
    Reviewed-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    David Disseldorp
     

06 Mar, 2019

1 commit


22 Oct, 2018

1 commit


03 Aug, 2018

1 commit

  • Since the vfs structures are all using timespec64, we can now
    change the internal representation, using ceph_encode_timespec64 and
    ceph_decode_timespec64.

    In case of ceph_aux_inode however, we need to avoid doing a memcmp()
    on uninitialized padding data, so the members of the i_mtime field get
    copied individually into 64-bit integers.

    Signed-off-by: Arnd Bergmann
    Reviewed-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    Arnd Bergmann
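
The padding concern can be illustrated in userspace: copy the timestamp members one by one into fixed-width fields of a zeroed key, then compare keys with memcmp(). `aux_key` and its helpers are hypothetical names, and `struct timespec` stands in for the kernel's timespec64.

```c
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical padding-free key built from two 64-bit integers. */
struct aux_key {
    uint64_t mtime_sec;
    uint64_t mtime_nsec;
};

/* Copy the members individually so no padding bytes are carried over. */
static void aux_key_fill(struct aux_key *k, const struct timespec *ts)
{
    memset(k, 0, sizeof(*k));
    k->mtime_sec = (uint64_t)ts->tv_sec;
    k->mtime_nsec = (uint64_t)ts->tv_nsec;
}

/* Byte-wise comparison is safe because every byte of the key is defined. */
static int aux_key_equal(const struct aux_key *a, const struct aux_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```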
     

05 Jun, 2018

2 commits


23 Apr, 2018

1 commit


02 Apr, 2018

1 commit

  • This patch adds the infrastructure required to support cephfs quotas as it
    is currently implemented in the ceph fuse client. Cephfs quotas can be
    set on any directory, and can restrict the number of bytes or the number
    of files stored beneath that point in the directory hierarchy.

    Quotas are set using the extended attributes 'ceph.quota.max_files' and
    'ceph.quota.max_bytes', and can be removed by setting these attributes to
    '0'.

    Link: http://tracker.ceph.com/issues/22372
    Signed-off-by: Luis Henriques
    Reviewed-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    Luis Henriques
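
From userspace, setting a quota amounts to a setxattr(2) call on the directory. The sketch below assumes Linux; `set_quota_bytes` is a hypothetical helper, and the value is sent as a decimal string without a trailing NUL.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Set the byte quota on a cephfs directory. Passing 0 removes the quota,
 * as described in the commit. Returns 0 on success, -1 on error with
 * errno set by setxattr(2).
 */
static int set_quota_bytes(const char *dir, unsigned long long max_bytes)
{
    char val[32];
    int len = snprintf(val, sizeof(val), "%llu", max_bytes);

    if (len < 0 || (size_t)len >= sizeof(val))
        return -1;
    return setxattr(dir, "ceph.quota.max_bytes", val, (size_t)len, 0);
}
```

The file-count limit would be set the same way using the 'ceph.quota.max_files' attribute.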
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be applied
    to a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file-by-file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    should be applied to each file. She confirmed any determination that was
    not immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

07 Sep, 2017

1 commit


07 Jul, 2017

1 commit

  • Previously we were returning values for quota and layout
    xattrs without any kind of update -- the user just got
    whatever happened to be in our cache.

    Clearly this extra round trip has a cost, but reads of
    these xattrs are fairly rare, happening on admin
    intervention rather than in normal operation.

    Link: http://tracker.ceph.com/issues/17939
    Signed-off-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    Yan, Zheng
     

04 May, 2017

1 commit

  • The ceph_inode_xattr needs to be released when removing an xattr. Easily
    reproducible running the 'generic/020' test from xfstests or simply by
    doing:

    attr -s attr0 -V 0 /mnt/test && attr -r attr0 /mnt/test

    While there, also fix the error path.

    Here's the kmemleak splat:

    unreferenced object 0xffff88001f86fbc0 (size 64):
    comm "attr", pid 244, jiffies 4294904246 (age 98.464s)
    hex dump (first 32 bytes):
    40 fa 86 1f 00 88 ff ff 80 32 38 1f 00 88 ff ff @........28.....
    00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de ................
    backtrace:
    kmemleak_alloc+0x49/0xa0
    kmem_cache_alloc+0x9b/0xf0
    __ceph_setxattr+0x17e/0x820
    ceph_set_xattr_handler+0x37/0x40
    __vfs_removexattr+0x4b/0x60
    vfs_removexattr+0x77/0xd0
    removexattr+0x41/0x60
    path_removexattr+0x75/0xa0
    SyS_lremovexattr+0xb/0x10
    entry_SYSCALL_64_fastpath+0x13/0x94
    0xffffffffffffffff

    Cc: stable@vger.kernel.org
    Signed-off-by: Luis Henriques
    Reviewed-by: "Yan, Zheng"
    Signed-off-by: Ilya Dryomov

    Luis Henriques
     

18 Oct, 2016

1 commit


28 Sep, 2016

1 commit

  • current_fs_time() uses struct super_block* as an argument.
    As per Linus's suggestion, this is changed to take struct
    inode* as a parameter instead. This is because the function
    is primarily meant for vfs inode timestamps.
    Also the function was renamed as per Arnd's suggestion.

    Change all calls to current_fs_time() to use the new
    current_time() function instead. current_fs_time() will be
    deleted.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Al Viro

    Deepa Dinamani