Eric Lee / smarc-fsl-linux-kernel

05 Apr, 2016

1 commit

09cbfeaf1 mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros ... Browse Code »

PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
ago with promise that one day it will be possible to implement page
cache with bigger chunks than PAGE_SIZE.

This promise never materialized. And unlikely will.

We have many places where PAGE_CACHE_SIZE assumed to be equal to
PAGE_SIZE. And it's constant source of confusion on whether
PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
especially on the border between fs and mm.

Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
breakage to be doable.

Let's stop pretending that pages in page cache are special. They are
not.

The changes are pretty straight-forward:

- << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ;

- >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> ;

- PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

- page_cache_get() -> get_page();

- page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using
script below. For some reason, coccinelle doesn't patch header files.
I've called spatch for them manually.

The only adjustment after coccinelle is revert of changes to
PAGE_CAHCE_ALIGN definition: we are going to drop it later.

There are few places in the code where coccinelle didn't reach. I'll
fix them manually in a separate patch. Comments and documentation also
will be addressed with the separate patch.

virtual patch

@@
expression E;
@@
- E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
expression E;
@@
- E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
+ E

@@
@@
- PAGE_CACHE_SHIFT
+ PAGE_SHIFT

@@
@@
- PAGE_CACHE_SIZE
+ PAGE_SIZE

@@
@@
- PAGE_CACHE_MASK
+ PAGE_MASK

@@
expression E;
@@
- PAGE_CACHE_ALIGN(E)
+ PAGE_ALIGN(E)

@@
expression E;
@@
- page_cache_get(E)
+ get_page(E)

@@
expression E;
@@
- page_cache_release(E)
+ put_page(E)

Signed-off-by: Kirill A. Shutemov
Acked-by: Michal Hocko
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2016-04-05 01:41:08 +0800

23 Jan, 2016

1 commit

5955102c9 wrappers for ->i_mutex access ... Browse Code »

parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro

Al Viro
2016-01-23 07:04:28 +0800

15 Jan, 2016

1 commit

5d097056c kmemcg: account certain kmem allocations to memcg ... Browse Code »

Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg. For the list, see below:

- threadinfo
- task_struct
- task_delay_info
- pid
- cred
- mm_struct
- vm_area_struct and vm_region (nommu)
- anon_vma and anon_vma_chain
- signal_struct
- sighand_struct
- fs_struct
- files_struct
- fdtable and fdtable->full_fds_bits
- dentry and external_name
- inode for all filesystems. This is the most tedious part, because
most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov
Acked-by: Johannes Weiner
Acked-by: Michal Hocko
Cc: Tejun Heo
Cc: Greg Thelen
Cc: Christoph Lameter
Cc: Pekka Enberg
Cc: David Rientjes
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vladimir Davydov
2016-01-15 08:00:49 +0800

12 Jan, 2016

1 commit

ddf1d6238 Merge branch 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs xattr updates from Al Viro:
"Andreas' xattr cleanup series.

It's a followup to his xattr work that went in last cycle; -0.5KLoC"

* 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
xattr handlers: Simplify list operation
ocfs2: Replace list xattr handler operations
nfs: Move call to security_inode_listsecurity into nfs_listxattr
xfs: Change how listxattr generates synthetic attributes
tmpfs: listxattr should include POSIX ACL xattrs
tmpfs: Use xattr handler infrastructure
btrfs: Use xattr handler infrastructure
vfs: Distinguish between full xattr names and proper prefixes
posix acls: Remove duplicate xattr name definitions
gfs2: Remove gfs2_xattr_acl_chmod
vfs: Remove vfs_xattr_cmp

Linus Torvalds
2016-01-12 05:32:10 +0800

09 Dec, 2015

1 commit

21fc61c73 don't put symlink bodies in pagecache into highmem ... Browse Code »

kmap() in page_follow_link_light() needed to go - allowing to hold
an arbitrary number of kmaps for long is a great way to deadlocking
the system.

new helper (inode_nohighmem(inode)) needs to be used for pagecache
symlinks inodes; done for all in-tree cases. page_follow_link_light()
instrumented to yell about anything missed.

Signed-off-by: Al Viro

Al Viro
2015-12-09 11:41:36 +0800

07 Dec, 2015

2 commits

98e9cb571 vfs: Distinguish between full xattr names and proper prefixes ... Browse Code »

Add an additional "name" field to struct xattr_handler. When the name
is set, the handler matches attributes with exactly that name. When the
prefix is set instead, the handler matches attributes with the given
prefix and with a non-empty suffix.

This patch should avoid bugs like the one fixed in commit c361016a in
the future.

Signed-off-by: Andreas Gruenbacher
Reviewed-by: James Morris
Signed-off-by: Al Viro

Andreas Gruenbacher
2015-12-07 10:33:52 +0800
97d792992 posix acls: Remove duplicate xattr name definitions ... Browse Code »

Remove POSIX_ACL_XATTR_{ACCESS,DEFAULT} and GFS2_POSIX_ACL_{ACCESS,DEFAULT}
and replace them with the definitions in .

Signed-off-by: Andreas Gruenbacher
Reviewed-by: James Morris
Signed-off-by: Al Viro

Andreas Gruenbacher
2015-12-07 10:25:17 +0800

14 Nov, 2015

2 commits

d9a82a040 xattr handlers: Pass handler to operations instead of flags ... Browse Code »

The xattr_handler operations are currently all passed a file system
specific flags value which the operations can use to disambiguate between
different handlers; some file systems use that to distinguish the xattr
namespace, for example. In some oprations, it would be useful to also have
access to the handler prefix. To allow that, pass a pointer to the handler
to operations instead of the flags value alone.

Signed-off-by: Andreas Gruenbacher
Reviewed-by: Christoph Hellwig
Signed-off-by: Al Viro

Andreas Gruenbacher
2015-11-14 09:34:32 +0800
e282fb7f3 hfsplus: Remove unused xattr handler list operations ... Browse Code »

The list operations can never be called; they are even documented to be
unused.

Signed-off-by: Andreas Gruenbacher
Reviewed-by: Christoph Hellwig
Signed-off-by: Al Viro

Andreas Gruenbacher
2015-11-14 09:34:29 +0800

11 Sep, 2015

1 commit

7cb74be6f hfs,hfsplus: cache pages correctly between bnode_create and bnode_free ... Browse Code »

Pages looked up by __hfs_bnode_create() (called by hfs_bnode_create() and
hfs_bnode_find() for finding or creating pages corresponding to an inode)
are immediately kmap()'ed and used (both read and write) and kunmap()'ed,
and should not be page_cache_release()'ed until hfs_bnode_free().

This patch fixes a problem I first saw in July 2012: merely running "du"
on a large hfsplus-mounted directory a few times on a reasonably loaded
system would get the hfsplus driver all confused and complaining about
B-tree inconsistencies, and generates a "BUG: Bad page state". Most
recently, I can generate this problem on up-to-date Fedora 22 with shipped
kernel 4.0.5, by running "du /" (="/" + "/home" + "/mnt" + other smaller
mounts) and "du /mnt" simultaneously on two windows, where /mnt is a
lightly-used QEMU VM image of the full Mac OS X 10.9:

$ df -i / /home /mnt
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/fedora-root 3276800 551665 2725135 17% /
/dev/mapper/fedora-home 52879360 716221 52163139 2% /home
/dev/nbd0p2 4294967295 1387818 4293579477 1% /mnt

After applying the patch, I was able to run "du /" (60+ times) and "du
/mnt" (150+ times) continuously and simultaneously for 6+ hours.

There are many reports of the hfsplus driver getting confused under load
and generating "BUG: Bad page state" or other similar issues over the
years. [1]

The unpatched code [2] has always been wrong since it entered the kernel
tree. The only reason why it gets away with it is that the
kmap/memcpy/kunmap follow very quickly after the page_cache_release() so
the kernel has not had a chance to reuse the memory for something else,
most of the time.

The current RW driver appears to have followed the design and development
of the earlier read-only hfsplus driver [3], where-by version 0.1 (Dec
2001) had a B-tree node-centric approach to
read_cache_page()/page_cache_release() per bnode_get()/bnode_put(),
migrating towards version 0.2 (June 2002) of caching and releasing pages
per inode extents. When the current RW code first entered the kernel [2]
in 2005, there was an REF_PAGES conditional (and "//" commented out code)
to switch between B-node centric paging to inode-centric paging. There
was a mistake with the direction of one of the REF_PAGES conditionals in
__hfs_bnode_create(). In a subsequent "remove debug code" commit [4], the
read_cache_page()/page_cache_release() per bnode_get()/bnode_put() were
removed, but a page_cache_release() was mistakenly left in (propagating
the "REF_PAGES !REF_PAGE" mistake), and the commented-out
page_cache_release() in bnode_release() (which should be spanned by
!REF_PAGES) was never enabled.

References:
[1]:
Michael Fox, Apr 2013
http://www.spinics.net/lists/linux-fsdevel/msg63807.html
("hfsplus volume suddenly inaccessable after 'hfs: recoff %d too large'")

Sasha Levin, Feb 2015
http://lkml.org/lkml/2015/2/20/85 ("use after free")

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/740814
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1027887
https://bugzilla.kernel.org/show_bug.cgi?id=42342
https://bugzilla.kernel.org/show_bug.cgi?id=63841
https://bugzilla.kernel.org/show_bug.cgi?id=78761

[2]:
http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
fs/hfs/bnode.c?id=d1081202f1d0ee35ab0beb490da4b65d4bc763db
commit d1081202f1d0ee35ab0beb490da4b65d4bc763db
Author: Andrew Morton
Date: Wed Feb 25 16:17:36 2004 -0800

[PATCH] HFS rewrite

http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
fs/hfsplus/bnode.c?id=91556682e0bf004d98a529bf829d339abb98bbbd

commit 91556682e0bf004d98a529bf829d339abb98bbbd
Author: Andrew Morton
Date: Wed Feb 25 16:17:48 2004 -0800

[PATCH] HFS+ support

[3]:
http://sourceforge.net/projects/linux-hfsplus/

http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/
http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/

http://linux-hfsplus.cvs.sourceforge.net/viewvc/linux-hfsplus/linux/\
fs/hfsplus/bnode.c?r1=1.4&r2=1.5

Date: Thu Jun 6 09:45:14 2002 +0000
Use buffer cache instead of page cache in bnode.c. Cache inode extents.

[4]:
http://git.kernel.org/cgit/linux/kernel/git/\
stable/linux-stable.git/commit/?id=a5e3985fa014029eb6795664c704953720cc7f7d

commit a5e3985fa014029eb6795664c704953720cc7f7d
Author: Roman Zippel
Date: Tue Sep 6 15:18:47 2005 -0700

[PATCH] hfs: remove debug code

Signed-off-by: Hin-Tak Leung
Signed-off-by: Sergei Antonov
Reviewed-by: Anton Altaparmakov
Reported-by: Sasha Levin
Cc: Al Viro
Cc: Christoph Hellwig
Cc: Vyacheslav Dubeyko
Cc: Sougata Santra
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Hin-Tak Leung
2015-09-11 04:29:01 +0800

05 Sep, 2015

1 commit

a068acf2e fs: create and use seq_show_option for escaping ... Browse Code »

Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g. new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else. This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.

Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use
of "sudo" is something more sneaky:

$ BASE="ovl"
$ MNT="$BASE/mnt"
$ LOW="$BASE/lower"
$ UP="$BASE/upper"
$ WORK="$BASE/work/ 0 0
none /proc fuse.pwn user_id=1000"
$ mkdir -p "$LOW" "$UP" "$WORK"
$ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
$ cat /proc/mounts
none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
none /proc fuse.pwn user_id=1000 0 0
$ fusermount -u /proc
$ cat /proc/mounts
cat: /proc/mounts: No such file or directory

This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed. Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.

[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters]
Signed-off-by: Kees Cook
Acked-by: Serge Hallyn
Acked-by: Jan Kara
Acked-by: Paul Moore
Cc: J. R. Okajima
Signed-off-by: Kees Cook
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kees Cook
2015-09-05 07:54:41 +0800

05 Jul, 2015

1 commit

1dc51b828 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull more vfs updates from Al Viro:
"Assorted VFS fixes and related cleanups (IMO the most interesting in
that part are f_path-related things and Eric's descriptor-related
stuff). UFS regression fixes (it got broken last cycle). 9P fixes.
fs-cache series, DAX patches, Jan's file_remove_suid() work"

[ I'd say this is much more than "fixes and related cleanups". The
file_table locking rule change by Eric Dumazet is a rather big and
fundamental update even if the patch isn't huge. - Linus ]

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
9p: cope with bogus responses from server in p9_client_{read,write}
p9_client_write(): avoid double p9_free_req()
9p: forgetting to cancel request on interrupted zero-copy RPC
dax: bdev_direct_access() may sleep
block: Add support for DAX reads/writes to block devices
dax: Use copy_from_iter_nocache
dax: Add block size note to documentation
fs/file.c: __fget() and dup2() atomicity rules
fs/file.c: don't acquire files->file_lock in fd_install()
fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
vfs: avoid creation of inode number 0 in get_next_ino
namei: make set_root_rcu() return void
make simple_positive() public
ufs: use dir_pages instead of ufs_dir_pages()
pagemap.h: move dir_pages() over there
remove the pointless include of lglock.h
fs: cleanup slight list_entry abuse
xfs: Correctly lock inode when removing suid and file capabilities
fs: Call security_ops->inode_killpriv on truncate
fs: Provide function telling whether file_remove_privs() will do anything
...

Linus Torvalds
2015-07-05 10:36:06 +0800

24 Jun, 2015

1 commit

db6172c41 fs: cleanup slight list_entry abuse ... Browse Code »

list_entry is just a wrapper for container_of, but it is arguably
wrong (and slightly confusing) to use it when the pointed-to struct
member is not a struct list_head. Use container_of directly instead.

Signed-off-by: Rasmus Villemoes
Signed-off-by: Al Viro

Rasmus Villemoes
2015-06-24 06:01:59 +0800

02 Jun, 2015

1 commit

66114cad6 writeback: separate out include/linux/backing-dev-defs.h ... Browse Code »

With the planned cgroup writeback support, backing-dev related
declarations will be more widely used across block and cgroup;
unfortunately, including backing-dev.h from include/linux/blkdev.h
makes cyclic include dependency quite likely.

This patch separates out backing-dev-defs.h which only has the
essential definitions and updates blkdev.h to include it. c files
which need access to more backing-dev details now include
backing-dev.h directly. This takes backing-dev.h off the common
include dependency chain making it a lot easier to use it across block
and cgroup.

v2: fs/fat build failure fixed.

Signed-off-by: Tejun Heo
Reviewed-by: Jan Kara
Cc: Jens Axboe
Signed-off-by: Jens Axboe

Tejun Heo
2015-06-02 22:33:34 +0800

27 Apr, 2015

1 commit

9ec3a646f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull fourth vfs update from Al Viro:
"d_inode() annotations from David Howells (sat in for-next since before
the beginning of merge window) + four assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
RCU pathwalk breakage when running into a symlink overmounting something
fix I_DIO_WAKEUP definition
direct-io: only inc/dec inode->i_dio_count for file systems
fs/9p: fix readdir()
VFS: assorted d_backing_inode() annotations
VFS: fs/inode.c helpers: d_inode() annotations
VFS: fs/cachefiles: d_backing_inode() annotations
VFS: fs library helpers: d_inode() annotations
VFS: assorted weird filesystems: d_inode() annotations
VFS: normal filesystems (and lustre): d_inode() annotations
VFS: security/: d_inode() annotations
VFS: security/: d_backing_inode() annotations
VFS: net/: d_inode() annotations
VFS: net/unix: d_backing_inode() annotations
VFS: kernel/: d_inode() annotations
VFS: audit: d_backing_inode() annotations
VFS: Fix up some ->d_inode accesses in the chelsio driver
VFS: Cachefiles should perform fs modifications on the top layer only
VFS: AF_UNIX sockets should call mknod on the top layer only

Linus Torvalds
2015-04-27 08:22:07 +0800

17 Apr, 2015

10 commits

54e514b91 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge third patchbomb from Andrew Morton:

- various misc things

- a couple of lib/ optimisations

- provide DIV_ROUND_CLOSEST_ULL()

- checkpatch updates

- rtc tree

- befs, nilfs2, hfs, hfsplus, fatfs, adfs, affs, bfs

- ptrace fixes

- fork() fixes

- seccomp cleanups

- more mmap_sem hold time reductions from Davidlohr

* emailed patches from Andrew Morton : (138 commits)
proc: show locks in /proc/pid/fdinfo/X
docs: add missing and new /proc/PID/status file entries, fix typos
drivers/rtc/rtc-at91rm9200.c: make IO endian agnostic
Documentation/spi/spidev_test.c: fix warning
drivers/rtc/rtc-s5m.c: allow usage on device type different than main MFD type
.gitignore: ignore *.tar
MAINTAINERS: add Mediatek SoC mailing list
tomoyo: reduce mmap_sem hold for mm->exe_file
powerpc/oprofile: reduce mmap_sem hold for exe_file
oprofile: reduce mmap_sem hold for mm->exe_file
mips: ip32: add platform data hooks to use DS1685 driver
lib/Kconfig: fix up HAVE_ARCH_BITREVERSE help text
x86: switch to using asm-generic for seccomp.h
sparc: switch to using asm-generic for seccomp.h
powerpc: switch to using asm-generic for seccomp.h
parisc: switch to using asm-generic for seccomp.h
mips: switch to using asm-generic for seccomp.h
microblaze: use asm-generic for seccomp.h
arm: use asm-generic for seccomp.h
seccomp: allow COMPAT sigreturn overrides
...

Linus Torvalds
2015-04-17 21:04:38 +0800
db579e76f hfsplus: don't store special "osx" xattr prefix on-disk ... Browse Code »

On Mac OS X, HFS+ extended attributes are not namespaced. Since we want
to be compatible with OS X filesystems and yet still support the Linux
namespacing system, the hfsplus driver implements a special "osx"
namespace that is reported for any attribute that is not namespaced
on-disk. However, the current code for getting and setting these
unprefixed attributes is broken.

hfsplus_osx_setattr() and hfsplus_osx_getattr() are passed names that have
already had their "osx." prefixes stripped by the generic functions. The
functions first, quite correctly, check those names to make sure that they
aren't prefixed with a known namespace, which would allow namespace access
restrictions to be bypassed. However, the functions then prepend "osx."
to the name they're given before passing it on to hfsplus_getattr() and
hfsplus_setattr(). Not only does this cause the "osx." prefix to be
stored on-disk, defeating its purpose, it also breaks the check for the
special "com.apple.FinderInfo" attribute, which is reported for all files,
and as a consequence makes some userspace applications (e.g. GNU patch)
fail even when extended attributes are not otherwise in use.

There are five commits which have touched this particular code:

127e5f5ae51e ("hfsplus: rework functionality of getting, setting and deleting of extended attributes")
b168fff72109 ("hfsplus: use xattr handlers for removexattr")
bf29e886b242 ("hfsplus: correct usage of HFSPLUS_ATTR_MAX_STRLEN for non-English attributes")
fcacbd95e121 ("fs/hfsplus: move xattr_name allocation in hfsplus_getxattr()")
ec1bbd346f18 ("fs/hfsplus: move xattr_name allocation in hfsplus_setxattr()")

The first commit creates the functions to begin with. The namespace is
prepended by the original code, which I believe was correct at the time,
since hfsplus_?etattr() stripped the prefix if found. The second commit
removes this behavior from hfsplus_?etattr() and appears to have been
intended to also remove the prefixing from hfsplus_osx_?etattr().
However, what it actually does is remove a necessary strncpy() call
completely, breaking the osx namespace entirely. The third commit re-adds
the strncpy() call as it was originally, but doesn't mention it in its
commit message. The final two commits refactor the code and don't affect
its functionality.

This commit does what b168fff attempted to do (prevent the prefix from
being added), but does it properly, instead of passing in an empty buffer
(which is what b168fff actually did).

Fixes: b168fff72109 ("hfsplus: use xattr handlers for removexattr")
Signed-off-by: Thomas Hebb
Cc: Hin-Tak Leung
Cc: Sergei Antonov
Cc: Anton Altaparmakov
Cc: Fabian Frederick
Cc: Christian Kujau
Cc: Christoph Hellwig
Cc: Al Viro
Cc: Viacheslav Dubeyko
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Thomas Hebb
2015-04-17 21:04:05 +0800
059a704c4 hfsplus: fix expand when not enough available space ... Browse Code »

Fix a bug which is reproduced as follows. Create a file:

echo abc > test_file

Try to expand the file beyond available space:

truncate --size= test_file

Since HFS+ does not support file size > allocated size, truncate should
fail. However, it ends successfully. The driver returns success despite
having been unable to allocate the requested space for the file. Also
filesystem check finds an error:

Checking catalog file.
Incorrect size for file test_file
(It should be 469094400 instead of 1000000000)

Add a piece of code analogous to code in the fat driver. Now a proper
error is returned and filesystem remains consistent.

Signed-off-by: Sergei Antonov
Cc: Vyacheslav Dubeyko
Cc: Hin-Tak Leung
Reviewed-by: Anton Altaparmakov
Cc: Al Viro
Cc: Christoph Hellwig
Cc: Sougata Santra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sergei Antonov
2015-04-17 21:04:05 +0800
27a4e3884 hfsplus: incorrect return value ... Browse Code »

In case of memory allocation error, the return should be -ENOMEM, instead
of -ENOSPC.

Signed-off-by: Chengyu Song
Reviewed-by: Sergei Antonov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Chengyu Song
2015-04-17 21:04:05 +0800
7ce844a20 fs/hfsplus: replace if/BUG by BUG_ON ... Browse Code »

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2015-04-17 21:04:05 +0800
1ad8d63d6 fs/hfsplus: use bool instead of int for is_known_namespace() return value ... Browse Code »

is_known_namespace() only returns true/false. Also remove inline and let
compiler decide what to do with static functions.

Signed-off-by: Fabian Frederick
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2015-04-17 21:04:05 +0800
73d28d571 fs/hfsplus: atomically set inode->i_flags ... Browse Code »

According to commit 5f16f3225b06 ("ext4: atomically set inode->i_flags in
ext4_set_inode_flags()").

Signed-off-by: Fabian Frederick
Cc: "Theodore Ts'o"
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2015-04-17 21:04:05 +0800
5e61473ea fs/hfsplus: move xattr_name allocation in hfsplus_setxattr() ... Browse Code »

security/trusted/user/osx setxattr did the same
xattr_name initialization. Move that operation in hfsplus_setxattr().

Tested with security/trusted/user getfattr/setfattr

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2015-04-17 21:04:05 +0800
a3cef4cd6 fs/hfsplus: move xattr_name allocation in hfsplus_getxattr() ... Browse Code »

security/trusted/user/osx getxattr did the same
xattr_name initialization. Move that operation in hfsplus_getxattr().

Tested with security/trusted/user getfattr/setfattr

Signed-off-by: Fabian Frederick
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2015-04-17 21:04:04 +0800
f01fa5fb3 hfsplus: add missing curly braces in hfsplus_delete_cat() ... Browse Code »

This doesn't change how the code works, but clearly the curly braces were
intended.

Signed-off-by: Dan Carpenter
Cc: Vyacheslav Dubeyko
Cc: Sougata Santra
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dan Carpenter
2015-04-17 21:04:04 +0800

16 Apr, 2015

1 commit

2b0143b5c VFS: normal filesystems (and lustre): d_inode() annotations ... Browse Code »

that's the bulk of filesystem drivers dealing with inodes of their own

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2015-04-16 03:06:57 +0800

12 Apr, 2015

4 commits

22c6186ec direct_IO: remove rw from a_ops->direct_IO() ... Browse Code »

Now that no one is using rw, remove it completely.

Signed-off-by: Omar Sandoval
Signed-off-by: Al Viro

Omar Sandoval
2015-04-12 10:29:45 +0800
6f6737631 direct_IO: use iov_iter_rw() instead of rw everywhere ... Browse Code »

The rw parameter to direct_IO is redundant with iov_iter->type, and
treated slightly differently just about everywhere it's used: some users
do rw & WRITE, and others do rw == WRITE where they should be doing a
bitwise check. Simplify this with the new iov_iter_rw() helper, which
always returns either READ or WRITE.

Signed-off-by: Omar Sandoval
Signed-off-by: Al Viro

Omar Sandoval
2015-04-12 10:29:45 +0800
17f8c842d Remove rw from {,__,do_}blockdev_direct_IO() ... Browse Code »

Most filesystems call through to these at some point, so we'll start
here.

Signed-off-by: Omar Sandoval
Signed-off-by: Al Viro

Omar Sandoval
2015-04-12 10:29:44 +0800
5d5d56897 make new_sync_{read,write}() static ... Browse Code »

All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.

Signed-off-by: Al Viro

Al Viro
2015-04-12 10:29:40 +0800

09 Apr, 2015

1 commit

237dae889 Merge branch 'iocb' into for-davem ... Browse Code »

trivial conflict in net/socket.c and non-trivial one in crypto -
that one had evaded aio_complete() removal.

Signed-off-by: Al Viro

Al Viro
2015-04-09 12:01:38 +0800

26 Mar, 2015

2 commits

e2e40f2c1 fs: move struct kiocb to fs.h ... Browse Code »

struct kiocb now is a generic I/O container, so move it to fs.h.
Also do a #include diet for aio.h while we're at it.

Signed-off-by: Christoph Hellwig
Signed-off-by: Al Viro

Christoph Hellwig
2015-03-26 08:28:11 +0800
98cf21c61 hfsplus: fix B-tree corruption after insertion at position 0 ... Browse Code »

Fix B-tree corruption when a new record is inserted at position 0 in the
node in hfs_brec_insert(). In this case a hfs_brec_update_parent() is
called to update the parent index node (if exists) and it is passed
hfs_find_data with a search_key containing a newly inserted key instead
of the key to be updated. This results in an inconsistent index node.
The bug reproduces on my machine after an extents overflow record for
the catalog file (CNID=4) is inserted into the extents overflow B-tree.
Because of a low (reserved) value of CNID=4, it has to become the first
record in the first leaf node.

The resulting first leaf node is correct:

----------------------------------------------------
| key0.CNID=4 | key1.CNID=123 | key2.CNID=456, ... |
----------------------------------------------------

But the parent index key0 still contains the previous key CNID=123:

-----------------------
| key0.CNID=123 | ... |
-----------------------

A change in hfs_brec_insert() makes hfs_brec_update_parent() work
correctly by preventing it from getting fd->record=-1 value from
__hfs_brec_find().

Along the way, I removed duplicate code with unification of the if
condition. The resulting code is equivalent to the original code
because node is never 0.

Also hfs_brec_update_parent() will now return an error after getting a
negative fd->record value. However, the return value of
hfs_brec_update_parent() is not checked anywhere in the file and I'm
leaving it unchanged by this patch. brec.c lacks error checking after
some other calls too, but this issue is of less importance than the one
being fixed by this patch.

Signed-off-by: Sergei Antonov
Cc: Joe Perches
Reviewed-by: Vyacheslav Dubeyko
Acked-by: Hin-Tak Leung
Cc: Anton Altaparmakov
Cc: Al Viro
Cc: Christoph Hellwig
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sergei Antonov
2015-03-26 07:20:31 +0800

23 Feb, 2015

1 commit

e36cb0b89 VFS: (Scripted) Convert S_ISLNK/DIR/REG(dentry->d_inode) to d_is_*(dentry) ... Browse Code »

Convert the following where appropriate:

(1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

(2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

(3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
complicated than it appears as some calls should be converted to
d_can_lookup() instead. The difference is whether the directory in
question is a real dir with a ->lookup op or whether it's a fake dir with
a ->d_automount op.

In some circumstances, we can subsume checks for dentry->d_inode not being
NULL into this, provided we the code isn't in a filesystem that expects
d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
use d_inode() rather than d_backing_inode() to get the inode pointer).

Note that the dentry type field may be set to something other than
DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
manages the fall-through from a negative dentry to a lower layer. In such a
case, the dentry type of the negative union dentry is set to the same as the
type of the lower dentry.

However, if you know d_inode is not NULL at the call site, then you can use
the d_is_xxx() functions even in a filesystem.

There is one further complication: a 0,0 chardev dentry may be labelled
DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
intended for special directory entry types that don't have attached inodes.

The following perl+coccinelle script was used:

use strict;

my @callers;
open($fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
die "Can't grep for S_ISDIR and co. callers";
@callers = ;
close($fd);
unless (@callers) {
print "No matches\n";
exit(0);
}

my @cocci = (
'@@',
'expression E;',
'@@',
'',
'- S_ISLNK(E->d_inode->i_mode)',
'+ d_is_symlink(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISDIR(E->d_inode->i_mode)',
'+ d_is_dir(E)',
'',
'@@',
'expression E;',
'@@',
'',
'- S_ISREG(E->d_inode->i_mode)',
'+ d_is_reg(E)' );

my $coccifile = "tmp.sp.cocci";
open($fd, ">$coccifile") || die $coccifile;
print($fd "$_\n") || die $coccifile foreach (@cocci);
close($fd);

foreach my $file (@callers) {
chomp $file;
print "Processing ", $file, "\n";
system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
die "spatch failed";
}

[AV: overlayfs parts skipped]

Signed-off-by: David Howells
Signed-off-by: Al Viro

David Howells
2015-02-23 00:38:41 +0800

19 Dec, 2014

1 commit

89ac9b4d3 hfsplus: fix longname handling ... Browse Code »

Longname is not correctly handled by hfsplus driver. If an attempt to
create a longname(>255) file/directory is made, it succeeds by creating a
file/directory with HFSPLUS_MAX_STRLEN and incorrect catalog key. Thus
leaving the volume in an inconsistent state. This patch fixes this issue.

Although lookup is always called first to create a negative entry, so just
doing a check in lookup would probably fix this issue. I choose to
propagate error to other iops as well.

Please NOTE: I have factored out hfsplus_cat_build_key_with_cnid from
hfsplus_cat_build_key, to avoid unncessary branching.

Thanks a lot.

TEST:
------
dir="TEST_DIR"
cdir=`pwd`
name255="_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_1234"
name256="${name255}5"

mkdir $dir
cd $dir
touch $name255
rm -f $name255
touch $name256
ls -la
cd $cdir
rm -rf $dir

RESULT:
-------
[sougata@ultrabook tmp]$ cdir=`pwd`
[sougata@ultrabook tmp]$
name255="_123456789_123456789_123456789_123456789_123456789_123456789\
> _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
> _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
> _123456789_123456789_123456789_123456789_123456789_1234"
[sougata@ultrabook tmp]$ name256="${name255}5"
[sougata@ultrabook tmp]$
[sougata@ultrabook tmp]$ mkdir $dir
[sougata@ultrabook tmp]$ cd $dir
[sougata@ultrabook TEST_DIR]$ touch $name255
[sougata@ultrabook TEST_DIR]$ rm -f $name255
[sougata@ultrabook TEST_DIR]$ touch $name256
[sougata@ultrabook TEST_DIR]$ ls -la
ls: cannot access
_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234:
No such file or directory
total 0
drwxrwxr-x 1 sougata sougata 3 Feb 20 19:56 .
drwxrwxrwx 1 root root 6 Feb 20 19:56 ..
-????????? ? ? ? ? ?
_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234
[sougata@ultrabook TEST_DIR]$ cd $cdir
[sougata@ultrabook tmp]$ rm -rf $dir
rm: cannot remove `TEST_DIR': Directory not empty

-ENAMETOOLONG returned from hfsplus_asc2uni was not propaged to iops.
This allowed hfsplus to create files/directories with HFSPLUS_MAX_STRLEN
and incorrect keys, leaving the FS in an inconsistent state. This patch
fixes this issue.

Signed-off-by: Sougata Santra
Reviewed-by: Christoph Hellwig
Cc: Vyacheslav Dubeyko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sougata Santra
2014-12-19 11:08:10 +0800

13 Jun, 2014

1 commit

16b905780 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs updates from Al Viro:
"This the bunch that sat in -next + lock_parent() fix. This is the
minimal set; there's more pending stuff.

In particular, I really hope to get acct.c fixes merged this cycle -
we need that to deal sanely with delayed-mntput stuff. In the next
pile, hopefully - that series is fairly short and localized
(kernel/acct.c, fs/super.c and fs/namespace.c). In this pile: more
iov_iter work. Most of prereqs for ->splice_write with sane locking
order are there and Kent's dio rewrite would also fit nicely on top of
this pile"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
lock_parent: don't step on stale ->d_parent of all-but-freed one
kill generic_file_splice_write()
ceph: switch to iter_file_splice_write()
shmem: switch to iter_file_splice_write()
nfs: switch to iter_splice_write_file()
fs/splice.c: remove unneeded exports
ocfs2: switch to iter_file_splice_write()
->splice_write() via ->write_iter()
bio_vec-backed iov_iter
optimize copy_page_{to,from}_iter()
bury generic_file_aio_{read,write}
lustre: get rid of messing with iovecs
ceph: switch to ->write_iter()
ceph_sync_direct_write: stop poking into iov_iter guts
ceph_sync_read: stop poking into iov_iter guts
new helper: copy_page_from_iter()
fuse: switch to ->write_iter()
btrfs: switch to ->write_iter()
ocfs2: switch to ->write_iter()
xfs: switch to ->write_iter()
...

Linus Torvalds
2014-06-13 01:30:18 +0800

07 Jun, 2014

4 commits

df3d4e7a2 hfsplus: fix compiler warning on PowerPC ... Browse Code »

Commit a99b7069aab8 ("hfsplus: Fix undefined __divdi3 in
hfsplus_init_header_node()") introduced do_div() to xattr.c and the
warning below too.

As Geert remarked: "tmp" is "loff_t" which is "__kernel_loff_t", which
is "long long", i.e. signed, while include/asm-generic/div64.h compares
its type with "uint64_t". As inode sizes are positive, it should be
safe to change the type of "tmp" to "u64".

In file included from
arch/powerpc/include/asm/div64.h:1:0,
from include/linux/kernel.h:124,
from include/asm-generic/bug.h:13,
from arch/powerpc/include/asm/bug.h:127,
from include/linux/bug.h:4,
from include/linux/thread_info.h:11,
from include/asm-generic/preempt.h:4,
from arch/powerpc/include/generated/asm/preempt.h:1,
from include/linux/preempt.h:18,
from include/linux/spinlock.h:50,
from include/linux/wait.h:8,
from include/linux/fs.h:6,
from fs/hfsplus/hfsplus_fs.h:19,
from fs/hfsplus/xattr.c:9:
fs/hfsplus/xattr.c: In function 'hfsplus_init_header_node':
include/asm-generic/div64.h:43:28: warning: comparison of distinct pointer types lacks a cast [enabled by default]
(void)(((typeof((n)) *)0) == ((uint64_t *)0)); \
^
fs/hfsplus/xattr.c:86:2: note: in expansion of macro 'do_div'
do_div(tmp, node_size);
^

Signed-off-by: Christian Kujau
Signed-off-by: Geert Uytterhoeven
Acked-by: Sergei Antonov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christian Kujau
2014-06-07 07:08:10 +0800
b73f3d0e7 fs/hfsplus: fix pr_foo() and hfs_dbg formats ... Browse Code »

Signed-off-by: Fabian Frederick
Suggested-By: Vyacheslav Dubeyko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-06-07 07:08:10 +0800
ffbc06716 hfsplus: coding style fix for declarations in hfsplus_fs.h ... Browse Code »

Some function declarations in hfsplus_fs.h were with argument names,
some without, and some were mixed. This patch adds argument names
everywhere, sorts function in order they go in .c files, and moves
hfs_part_find() to a proper section.

Auto-formatting and sorting was done with:
cfunctions *.c | indent -linux | sed "s| \* | \*|"

Signed-off-by: Sergei Antonov
Cc: Vyacheslav Dubeyko
Cc: Hin-Tak Leung
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sergei Antonov
2014-06-07 07:08:10 +0800
297cc2720 fs/hfsplus/wrapper.c: replace shift loop by ilog2 ... Browse Code »

Replace while blocksize;shift by ilog2

Signed-off-by: Fabian Frederick
Cc: Vyacheslav Dubeyko
Cc: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Fabian Frederick
2014-06-07 07:08:10 +0800