26 Jun, 2018
2 commits
-
commit 7f54910fa8dfe504f2e1563f4f6ddc3294dfbf3a upstream.
OrangeFS formerly failed to set attributes_mask with the result that
software could not see immutable and append flags present in the
filesystem.Reported-by: Becky Ligon
Signed-off-by: Martin Brandenburg
Fixes: 68a24a6cc4a6 ("orangefs: implement statx")
Cc: stable@vger.kernel.org
Cc: hubcap@omnibond.com
Signed-off-by: Mike Marshall
Signed-off-by: Greg Kroah-Hartman -
commit f6a4b4c9d07dda90c7c29dae96d6119ac6425dca upstream.
As long as a symlink inode remains in-core, the destination (and
therefore size) will not be re-fetched from the server, as it cannot
change. The original implementation of the attribute cache assumed that
setting the expiry time in the past was sufficient to cause a re-fetch
of all attributes on the next getattr. That does not work in this case.The bug manifested itself as follows. When the command sequence
touch foo; ln -s foo bar; ls -l bar
is run, the output was
lrwxrwxrwx. 1 fedora fedora 4906 Apr 24 19:10 bar -> foo
However, after a re-mount, ls -l bar produces
lrwxrwxrwx. 1 fedora fedora 3 Apr 24 19:10 bar -> foo
After this commit, even before a re-mount, the output is
lrwxrwxrwx. 1 fedora fedora 3 Apr 24 19:10 bar -> foo
Reported-by: Becky Ligon
Signed-off-by: Martin Brandenburg
Fixes: 71680c18c8f2 ("orangefs: Cache getattr results.")
Cc: stable@vger.kernel.org
Cc: hubcap@omnibond.com
Signed-off-by: Mike Marshall
Signed-off-by: Greg Kroah-Hartman
30 May, 2018
1 commit
-
commit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream.
For anything NFS-exported we do _not_ want to unlock new inode
before it has grown an alias; original set of fixes got the
ordering right, but missed the nasty complication in case of
lockdep being enabled - unlock_new_inode() does
lockdep_annotate_inode_mutex_key(inode)
which can only be done before anyone gets a chance to touch
->i_mutex. Unfortunately, flipping the order and doing
unlock_new_inode() before d_instantiate() opens a window when
mkdir can race with open-by-fhandle on a guessed fhandle, leading
to multiple aliases for a directory inode and all the breakage
that follows from that.Correct solution: a new primitive (d_instantiate_new())
combining these two in the right order - lockdep annotate, then
d_instantiate(), then the rest of unlock_new_inode(). All
combinations of d_instantiate() with unlock_new_inode() should
be converted to that.Cc: stable@kernel.org # 2.6.29 and later
Tested-by: Mike Marshall
Reviewed-by: Andreas Dilger
Signed-off-by: Al Viro
Signed-off-by: Greg Kroah-Hartman
24 Apr, 2018
1 commit
-
commit 659038428cb43a66e3eff71e2c845c9de3611a98 upstream.
orangefs_fill_sb() might've failed to allocate ORANGEFS_SB(s); don't
oops in that case.Cc: stable@kernel.org
Signed-off-by: Al Viro
Signed-off-by: Greg Kroah-Hartman
31 Jan, 2018
3 commits
-
commit 6793f1c450b1533a5e9c2493490de771d38b24f9 upstream.
After do_readv_writev, the inode cache is invalidated anyway, so i_size
will never be read. It will be fetched from the server which will also
know about updates from other machines.Fixes deadlock on 32-bit SMP.
See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2
Signed-off-by: Martin Brandenburg
Cc: Al Viro
Cc: Mike Marshall
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman -
commit a0ec1ded22e6a6bc41981fae22406835b006a66e upstream.
In orangefs_devreq_read, there is a loop which picks an op off the list
of pending ops. If the loop fails to find an op, there is nothing to
read, and it returns EAGAIN. If the op has been given up on, the loop
is restarted via a goto. The bug is that the variable which the found
op is written to is not reinitialized, so if there are no more eligible
ops on the list, the code runs again on the already handled op.This is triggered by interrupting a process while the op is being copied
to the client-core. It's a fairly small window, but it's there.Signed-off-by: Martin Brandenburg
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman -
commit 0afc0decf247f65b7aba666a76a0a68adf4bc435 upstream.
set_op_state_purged can delete the op.
Signed-off-by: Martin Brandenburg
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
15 Sep, 2017
8 commits
-
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bitThe script “checkpatch.pl” pointed information out like the following.
Comparison to NULL could be written !…
Thus fix affected source code places.
Signed-off-by: Markus Elfring
Signed-off-by: Mike Marshall -
* A multiplication for the size determination of a memory allocation
indicated that an array data structure should be processed.
Thus use the corresponding function "kcalloc".This issue was detected by using the Coccinelle software.
* Replace the specification of a data structure by a pointer dereference
to make the corresponding size determination a bit safer according to
the Linux coding style convention.Signed-off-by: Markus Elfring
Signed-off-by: Mike Marshall -
Omit an extra message for a memory allocation failure in these functions.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring
Signed-off-by: Mike Marshall -
The xattr_handler structure is only stored in an array of const
structures. Thus the xattr_handler structure itself can be
const.Signed-off-by: Julia Lawall
Signed-off-by: Mike Marshall -
Orangefs doesn't do buffered writes yet, so there's no point in
initiating and waiting for writeback.Signed-off-by: Jeff Layton
Signed-off-by: Mike Marshall -
A previous patch which claimed to remove off by ones actually introduced
them.strlen() returns the length of the string not including the NUL
character. We are using strcpy() to copy "name" into a buffer which is
ORANGEFS_MAX_XATTR_NAMELEN characters long. We should make sure to
leave space for the NUL, otherwise we're writing one character beyond
the end of the buffer.Fixes: e675c5ec51fe ("orangefs: clean up oversize xattr validation")
Signed-off-by: Dan Carpenter
Signed-off-by: Mike Marshall -
posix_acl_update_mode checks to see if the permissions
described by the ACL can be encoded into the
object's mode. If so, it sets "acl" to NULL
and "mode" to the new desired value. Prior to this patch
we failed to actually propagate the new mode back to the
server.Signed-off-by: Mike Marshall
-
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.Fix the problem by creating __orangefs_set_acl() function that does not
call posix_acl_update_mode() and use it when inheriting ACLs. That
prevents SGID bit clearing and the mode has been properly set by
posix_acl_create() anyway.Fixes: 073931017b49d9458aa351605b43a7e34598caef
CC: stable@vger.kernel.org
CC: Mike Marshall
CC: pvfs2-developers@beowulf-underground.org
Signed-off-by: Jan Kara
Signed-off-by: Mike Marshall
16 Jul, 2017
1 commit
-
Pull ->s_options removal from Al Viro:
"Preparations for fsmount/fsopen stuff (coming next cycle). Everything
gets moved to explicit ->show_options(), killing ->s_options off +
some cosmetic bits around fs/namespace.c and friends. Basically, the
stuff needed to work with fsmount series with minimum of conflicts
with other work.It's not strictly required for this merge window, but it would reduce
the PITA during the coming cycle, so it would be nice to have those
bits and pieces out of the way"* 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
isofs: Fix isofs_show_options()
VFS: Kill off s_options and helpers
orangefs: Implement show_options
9p: Implement show_options
isofs: Implement show_options
afs: Implement show_options
affs: Implement show_options
befs: Implement show_options
spufs: Implement show_options
bpf: Implement show_options
ramfs: Implement show_options
pstore: Implement show_options
omfs: Implement show_options
hugetlbfs: Implement show_options
VFS: Don't use save/replace_mount_options if not using generic_show_options
VFS: Provide empty name qstr
VFS: Make get_filesystem() return the affected filesystem
VFS: Clean up whitespace in fs/namespace.c and fs/super.c
Provide a function to create a NUL-terminated string from unterminated data
11 Jul, 2017
1 commit
-
Implement the show_options superblock op for orangefs as part of a bid to
rid of s_options and generic_show_options() to make it easier to implement
a context-based mount where the mount options can be passed individually
over a file descriptor.Signed-off-by: David Howells
cc: Mike Marshall
cc: pvfs2-developers@beowulf-underground.org
Signed-off-by: Al Viro
20 Jun, 2017
2 commits
-
So I've noticed a number of instances where it was not obvious from the
code whether ->task_list was for a wait-queue head or a wait-queue entry.Furthermore, there's a number of wait-queue users where the lists are
not for 'tasks' but other entities (poll tables, etc.), in which case
the 'task_list' name is actively confusing.To clear this all up, name the wait-queue head and entry list structure
fields unambiguously:struct wait_queue_head::task_list => ::head
struct wait_queue_entry::task_list => ::entryFor example, this code:
rqw->wait.task_list.next != &wait->task_list
... is was pretty unclear (to me) what it's doing, while now it's written this way:
rqw->wait.head.next != &wait->entry
... which makes it pretty clear that we are iterating a list until we see the head.
Other examples are:
list_for_each_entry_safe(pos, next, &x->task_list, task_list) {
list_for_each_entry(wq, &fence->wait.task_list, task_list) {... where it's unclear (to me) what we are iterating, and during review it's
hard to tell whether it's trying to walk a wait-queue entry (which would be
a bug), while now it's written as:list_for_each_entry_safe(pos, next, &x->head, entry) {
list_for_each_entry(wq, &fence->wait.head, entry) {Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar -
Rename:
wait_queue_t => wait_queue_entry_t
'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue",
but in reality it's a queue *entry*. The 'real' queue is the wait queue head,
which had to carry the name.Start sorting this out by renaming it to 'wait_queue_entry_t'.
This also allows the real structure name 'struct __wait_queue' to
lose its double underscore and become 'struct wait_queue_entry',
which is the more canonical nomenclature for such data types.Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar
06 May, 2017
1 commit
-
Pull orangefs updates from Mike Marshall:
"Orangefs cleanups, fixes and statx support.Some cleanups:
- remove unused get_fsid_from_ino
- fix bounds check for listxattr
- clean up oversize xattr validation
- do not set getattr_time on orangefs_lookup
- return from orangefs_devreq_read quickly if possible
- do not wait for timeout if umounting
- handle zero size write in debugfsBug fixes:
- do not check possibly stale size on truncate
- ensure the userspace component is unmounted if mount fails
- total reimplementation of dir.cNew feature:
- implement statx
The new implementation of dir.c is kind of a big deal, all new code.
It has been posted to fs-devel during the previous rc period, we
didn't get much review or feedback from there, but it has been
reviewed very heavily here, so much so that we have two entire
versions of the reimplementation.Not only does the new implementation fix some xfstests, but it passes
all the new tests we made here that involve seeking and rewinding and
giant directories and long file names. The new dir code has three
patches itself:- skip forward to the next directory entry if seek is short
- invalidate stored directory on seek
- count directory pieces correctly"* tag 'for-linus-4.12-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
orangefs: count directory pieces correctly
orangefs: invalidate stored directory on seek
orangefs: skip forward to the next directory entry if seek is short
orangefs: handle zero size write in debugfs
orangefs: do not wait for timeout if umounting
orangefs: return from orangefs_devreq_read quickly if possible
orangefs: ensure the userspace component is unmounted if mount fails
orangefs: do not check possibly stale size on truncate
orangefs: implement statx
orangefs: remove ORANGEFS_READDIR macros
orangefs: support very large directories
orangefs: support llseek on directories
orangefs: rewrite readdir to fix several bugs
orangefs: do not set getattr_time on orangefs_lookup
orangefs: clean up oversize xattr validation
orangefs: fix bounds check for listxattr
orangefs: remove unused get_fsid_from_ino
05 May, 2017
3 commits
-
A large directory full of differently sized file names triggered this.
Most directories, even very large directories with shorter names, would
be lucky enough to fit in one server response.Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
If an application seeks to a position before the point which has been
read, it must want updates which have been made to the directory. So
delete the copy stored in the kernel so it will be fetched again.Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
If userspace seeks to a position in the stream which is not correct, it
would have returned EIO because the data in the buffer at that offset
would be incorrect. This and the userspace daemon returning a corrupt
directory are indistinguishable.Now if the data does not look right, skip forward to the next chunk and
try again. The motivation is that if the directory changes, an
application may seek to a position that was valid and no longer is valid.It is not yet possible for a directory to change.
Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall
27 Apr, 2017
14 commits
-
If we write zero bytes to this debugfs file, then it will cause an
underflow when we do copy_from_user(buf, ubuf, count - 1). Debugfs can
normally only be written to by root so the impact of this is low.Signed-off-by: Dan Carpenter
Signed-off-by: Mike Marshall -
When the computer is turned off, all the processes are killed and then
all the filesystems are umounted. OrangeFS should not wait for the
userspace daemon to come back in that case.This only works for plain umount(2). To actually take advantage of this
interactively, `umount -f' is needed; otherwise umount will issue a
statfs first, which will wait for the userspace daemon to come back.Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
It is not necessary to take the lock and search through the request list
if the list is empty.Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
If the mount is aborted after userspace has been asked to mount,
userspace must be told to unmount.Ordinarily orangefs_kill_sb does the unmount. However it cannot be
called if the superblock has not been set up. This is a very narrow
window.The NULL fs_id is not unmounted.
Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
Let the server figure this out because our size might be out of date or
not present.The bug was that
xfs_io -f -t -c "pread -v 0 100" /mnt/foo
echo "Test" > /mnt/foo
xfs_io -f -t -c "pread -v 0 100" /mnt/foofails because the second truncate did not happen if nothing had
requested the size after the write in echo. Thus i_size was zero (not
present) and the orangefs_setattr though i_size was zero and there was
nothing to do.Signed-off-by: Martin Brandenburg
Cc: stable@vger.kernel.org
Signed-off-by: Mike Marshall -
Fortunately OrangeFS has had a getattr request mask for a long time.
The server basically has two difficulty levels for attributes. Fetching
any attribute except size requires communicating with the metadata
server for that handle. Since all the attributes are right there, it
makes sense to return them all. Fetching the size requires
communicating with every I/O server (that the file is distributed
across). Therefore if asked for anything except size, get everything
except size, and if asked for size, get everything.Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
They are clones of the ORANGEFS_ITERATE macros in use elsewhere. Delete
ORANGEFS_ITERATE_NEXT which is a hack previously used by readdir.Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
This works by maintaining a linked list of pages which the directory
has been read into rather than one giant fixed-size buffer.This replaces code which limits the total directory size to the total
amount that could be returned in one server request. Since filenames
are usually considerably shorter than the maximum, the old code could
usually handle several server requests before running out of space.Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
This and the previous commit fix xfstests generic/257.
Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
In the past, readdir assumed that the user buffer will be large enough
that all entries from the server will fit. If this was not true,
entries would be skipped.Since it works now, request 512 entries rather than 96 per server
operation.Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall -
Since orangefs_lookup calls orangefs_iget which calls
orangefs_inode_getattr, getattr_time will get set.Signed-off-by: Martin Brandenburg
Cc: stable@vger.kernel.org
Signed-off-by: Mike Marshall -
Also don't check flags as this has been validated by the VFS already.
Fix an off-by-one error in the max size checking.
Stop logging just because userspace wants to write attributes which do
not fit.This and the previous commit fix xfstests generic/020.
Signed-off-by: Martin Brandenburg
Cc: stable@vger.kernel.org
Signed-off-by: Mike Marshall -
Signed-off-by: Martin Brandenburg
Cc: stable@vger.kernel.org
Signed-off-by: Mike Marshall -
Signed-off-by: Martin Brandenburg
Signed-off-by: Mike Marshall
22 Apr, 2017
1 commit
-
Signed-off-by: Al Viro
18 Apr, 2017
1 commit
-
short copy here should mean instant EFAULT, not "move to the
next page and hope it fails there, this time with nothing
copied"Signed-off-by: Al Viro