Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

10 Jan, 2015

3 commits

dc9319f5a Merge branch 'for-3.19' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull two nfsd bugfixes from Bruce Fields.

* 'for-3.19' of git://linux-nfs.org/~bfields/linux:
rpc: fix xdr_truncate_encode to handle buffer ending on page boundary
nfsd: fix fi_delegees leak when fi_had_conflict returns true

Linus Torvalds
2015-01-10 10:10:48 +0800
20ebb3452 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client ... Browse Code »

Pull two Ceph fixes from Sage Weil:
"These are both pretty trivial: a sparse warning fix and size_t printk
thing"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: fix sparse endianness warnings
ceph: use %zu for len in ceph_fill_inline_data()

Linus Torvalds
2015-01-10 09:55:00 +0800
03c751a5e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull btrfs fixes from Chris Mason:
"None of these are huge, but my commit does fix a regression from 3.18
that could cause lost files during log replay.

This also adds Dave Sterba to the list of Btrfs maintainers. It
doesn't mean we're doing things differently, but Dave has really been
helping with the maintainer workload for years"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: don't delay inode ref updates during log replay
Btrfs: correctly get tree level in tree_backref_for_extent
Btrfs: call inode_dec_link_count() on mkdir error path
Btrfs: abort transaction if we don't find the block group
Btrfs, scrub: uninitialized variable in scrub_extent_for_parity()
Btrfs: add more maintainers

Linus Torvalds
2015-01-10 09:46:07 +0800

09 Jan, 2015

4 commits

75069f2b5 vfs: renumber FMODE_NONOTIFY and add to uniqueness check ... Browse Code »

Fix clashing values for O_PATH and FMODE_NONOTIFY on sparc. The
clashing O_PATH value was added in commit 5229645bdc35 ("vfs: add
nonconflicting values for O_PATH") but this can't be changed as it is
user-visible.

FMODE_NONOTIFY is only used internally in the kernel, but it is in the
same numbering space as the other O_* flags, as indicated by the comment
at the top of include/uapi/asm-generic/fcntl.h (and its use in
fs/notify/fanotify/fanotify_user.c). So renumber it to avoid the clash.

All of this has happened before (commit 12ed2e36c98a: "fanotify:
FMODE_NONOTIFY and __O_SYNC in sparc conflict"), and all of this will
happen again -- so update the uniqueness check in fcntl_init() to
include __FMODE_NONOTIFY.

Signed-off-by: David Drysdale
Acked-by: David S. Miller
Acked-by: Jan Kara
Cc: Heinrich Schuchardt
Cc: Alexander Viro
Cc: Arnd Bergmann
Cc: Stephen Rothwell
Cc: Eric Paris
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Drysdale
2015-01-09 07:10:52 +0800
53dc20b9a ocfs2: fix the wrong directory passed to ocfs2_lookup_ino_from_name() when link file ... Browse Code »
3

In ocfs2_link(), the parent directory inode passed to function
ocfs2_lookup_ino_from_name() is wrong. Parameter dir is the parent of
new_dentry not old_dentry. We should get old_dir from old_dentry and
lookup old_dentry in old_dir in case another node remove the old dentry.

With this change, hard linking works again, when paths are relative with
at least one subdirectory. This is how the problem was reproducable:

# mkdir a
# mkdir b
# touch a/test
# ln a/test b/test
ln: failed to create hard link `b/test' => `a/test': No such file or directory

However when creating links in the same dir, it worked well.

Now the link gets created.

Fixes: 0e048316ff57 ("ocfs2: check existence of old dentry in ocfs2_link()")
Signed-off-by: joyce.xue
Reported-by: Szabo Aron - UBIT
Cc: Mark Fasheh
Cc: Joel Becker
Tested-by: Aron Szabo
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Xue jiufei
2015-01-09 07:10:51 +0800
eb4f73b4c ocfs2: remove bogus check in dlm_process_recovery_data ... Browse Code »

In dlm_process_recovery_data, only when dlm_new_lock failed the ret will
be set to -ENOMEM. And in this case, newlock is definitely NULL. So
test newlock is meaningless, remove it.

Signed-off-by: Joseph Qi
Reviewed-by: Alex Chen
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2015-01-09 07:10:51 +0800
0668ff52e ceph: use %zu for len in ceph_fill_inline_data() ... Browse Code »

len is size_t, should be printed with %zu.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2015-01-09 01:36:56 +0800

08 Jan, 2015

1 commit

94ae1db22 nfsd: fix fi_delegees leak when fi_had_conflict returns true ... Browse Code »

Currently, nfs4_set_delegation takes a reference to an existing
delegation and then checks to see if there is a conflict. If there is
one, then it doesn't release that reference.

Change the code to take the reference after the check and only if there
is no conflict.

Signed-off-by: Jeff Layton
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

Jeff Layton
2015-01-08 02:38:21 +0800

07 Jan, 2015

1 commit

3b421b80b Merge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 ... Browse Code »

Pull ext4 bugfixes from Ted Ts'o:
"Revert a potential seek_data/hole regression which shows up when using
ext4 to handle ext3 file systems, plus two minor bug fixes"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: remove spurious KERN_INFO from ext4_warning call
Revert "ext4: fix suboptimal seek_{data,hole} extents traversial"
ext4: prevent online resize with backup superblock

Linus Torvalds
2015-01-07 06:05:40 +0800

03 Jan, 2015

7 commits

363307e6e ext4: remove spurious KERN_INFO from ext4_warning call ... Browse Code »

Signed-off-by: Jakub Wilk
Signed-off-by: Theodore Ts'o

Jakub Wilk
2015-01-03 04:31:14 +0800
ad7fefb10 Revert "ext4: fix suboptimal seek_{data,hole} extents traversial" ... Browse Code »

This reverts commit 14516bb7bb6ffbd49f35389f9ece3b2045ba5815.

This was causing regression test failures with generic/285 with an ext3
filesystem using CONFIG_EXT4_USE_FOR_EXT23.

Signed-off-by: Theodore Ts'o

Theodore Ts'o
2015-01-03 04:16:00 +0800
6f8960541 Btrfs: don't delay inode ref updates during log replay ... Browse Code »
3

Commit 1d52c78afbb (Btrfs: try not to ENOSPC on log replay) added a
check to skip delayed inode updates during log replay because it
confuses the enospc code. But the delayed processing will end up
ignoring delayed refs from log replay because the inode itself wasn't
put through the delayed code.

This can end up triggering a warning at commit time:

WARNING: CPU: 2 PID: 778 at fs/btrfs/delayed-inode.c:1410 btrfs_assert_delayed_root_empty+0x32/0x34()

Which is repeated for each commit because we never process the delayed
inode ref update.

The fix used here is to change btrfs_delayed_delete_inode_ref to return
an error if we're currently in log replay. The caller will do the ref
deletion immediately and everything will work properly.

Signed-off-by: Chris Mason
cc: stable@vger.kernel.org # v3.18 and any stable series that picked 1d52c78afbbf80b58299e076a159617d6b42fe3c

Chris Mason
2015-01-03 03:47:56 +0800
a1317f455 Btrfs: correctly get tree level in tree_backref_for_extent ... Browse Code »

If we are using skinny metadata, the block's tree level is in the offset
of the key and not in a btrfs_tree_block_info structure following the
extent item (it doesn't exist). Therefore fix it.

Besides returning the correct level in the tree, this also prevents reading
past the leaf's end in the case where the extent item is the last item in
the leaf (eb) and it has only 1 inline reference - this is because
sizeof(struct btrfs_tree_block_info) is greater than
sizeof(struct btrfs_extent_inline_ref).

Got it while running a scrub which produced the following warning:

BTRFS: checksum error at logical 42123264 on dev /dev/sde, sector 15840: metadata node (level 24) in tree 5

Signed-off-by: Filipe Manana
Reviewed-by: Satoru Takeuchi
Signed-off-by: Chris Mason

Filipe Manana
2015-01-03 03:47:56 +0800
c7cfb8a54 Btrfs: call inode_dec_link_count() on mkdir error path ... Browse Code »

In btrfs_mkdir(), if it fails to create dir, we should
clean up existed items, setting inode's link properly
to make sure it could be cleaned up properly.

Signed-off-by: Wang Shilong
Signed-off-by: Chris Mason

Wang Shilong
2015-01-03 03:47:55 +0800
df95e7f0d Btrfs: abort transaction if we don't find the block group ... Browse Code »

We shouldn't BUG_ON() if there is corruption. I hit this while testing my block
group patch and the abort worked properly. Thanks,

Signed-off-by: Josef Bacik
Signed-off-by: Chris Mason

Josef Bacik
2015-01-03 03:47:55 +0800
6b6d24b38 Btrfs, scrub: uninitialized variable in scrub_extent_for_parity() ... Browse Code »

The only way that "ret" is set is when we call scrub_pages_for_parity()
so the skip to "if (ret) " test doesn't make sense and causes a static
checker warning.

Signed-off-by: Dan Carpenter
Signed-off-by: Chris Mason

Dan Carpenter
2015-01-03 03:47:55 +0800

30 Dec, 2014

2 commits

5faa0154f Merge branch 'for-linus' of git://git.samba.org/sfrench/cifs-2.6 ... Browse Code »

Pull CIFS fixes from Steve French:
"A set of three minor cifs fixes"

* 'for-linus' of git://git.samba.org/sfrench/cifs-2.6:
cifs: make new inode cache when file type is different
Fix signed/unsigned pointer warning
Convert MessageID in smb2_hdr to LE

Linus Torvalds
2014-12-30 13:09:57 +0800
b9d4a35f0 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs ... Browse Code »

Pull UDF & isofs fixes from Jan Kara:
"A couple of UDF fixes of handling of corrupted media and one iso9660
fix of the same"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
udf: Reduce repeated dereferences
udf: Check component length before reading it
udf: Check path length when reading symlink
udf: Verify symlink size before loading it
udf: Verify i_size when loading inode
isofs: Fix unchecked printing of ER records

Linus Torvalds
2014-12-30 12:43:10 +0800

27 Dec, 2014

1 commit

011fa9940 ext4: prevent online resize with backup superblock ... Browse Code »

Prevent BUG or corrupted file systems after the following:

mkfs.ext4 /dev/vdc 100M
mount -t ext4 -o sb=40961 /dev/vdc /vdc
resize2fs /dev/vdc

We previously prevented online resizing using the old resize ioctl.
Move the code to ext4_resize_begin(), so the check applies for all of
the resize ioctl's.

Reported-by: Maxim Malkov
Signed-off-by: Theodore Ts'o

Theodore Ts'o
2014-12-27 12:58:21 +0800

23 Dec, 2014

1 commit

9e6d722f3 cifs: make new inode cache when file type is different ... Browse Code »

In spite of different file type,
if file is same name and same inode number, old inode cache is used.
This causes that you can not cd directory, can not cat SymbolicLink.
So this patch is that if file type is different, return error.

Reproducible sample :
1. create file 'a' at cifs client.
2. repeat rm and mkdir 'a' 4 times at server, then direcotry 'a' having same inode number is created.
(Repeat 4 times, then same inode number is recycled.)
(When server is under RHEL 6.6, 1 time is O.K. Always same inode number is recycled.)
3. ls -li at client, then you can not cd directory, can not remove directory.

SymbolicLink has same problem.

Bug link:
https://bugzilla.kernel.org/show_bug.cgi?id=90011

Signed-off-by: Nakajima Akira
Acked-by: Jeff Layton
Signed-off-by: Steve French

Nakajima Akira
2014-12-23 04:16:21 +0800

22 Dec, 2014

2 commits

3ee3039c5 udf: Reduce repeated dereferences ... Browse Code »

Replace repeated dereferences like dir->i_sb by storing superblock
pointer in a variable and using that.

Signed-off-by: Jan Kara

Jan Kara
2014-12-22 05:42:37 +0800
e237ec37e udf: Check component length before reading it ... Browse Code »

Check that length specified in a component of a symlink fits in the
input buffer we are reading. Also properly ignore component length for
component types that do not use it. Otherwise we read memory after end
of buffer for corrupted udf image.

Reported-by: Carl Henrik Lunde
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara

Jan Kara
2014-12-22 05:42:19 +0800

20 Dec, 2014

4 commits

ecb5ec044 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs pile #3 from Al Viro:
"Assorted fixes and patches from the last cycle"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
[regression] chunk lost from bd9b51
vfs: make mounts and mountstats honor root dir like mountinfo does
vfs: cleanup show_mountinfo
init: fix read-write root mount
unfuck binfmt_misc.c (broken by commit e6084d4)
vm_area_operations: kill ->migrate()
new helper: iter_is_iovec()
move_extent_per_page(): get rid of unused w_flags
lustre: get rid of playing with ->fs
btrfs: filp_open() returns ERR_PTR() on failure, not NULL...

Linus Torvalds
2014-12-20 10:19:19 +0800
298647e31 Merge tag 'ecryptfs-3.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel… ... Browse Code »

…/git/tyhicks/ecryptfs

Pull eCryptfs fixes from Tyler Hicks:
"Fixes for filename decryption and encrypted view plus a cleanup

- The filename decryption routines were, at times, writing a zero
byte one character past the end of the filename buffer

- The encrypted view feature attempted, and failed, to roll its own
form of enforcing a read-only mount instead of letting the VFS
enforce it"

* tag 'ecryptfs-3.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
eCryptfs: Remove buggy and unnecessary write in file name decode routine
eCryptfs: Remove unnecessary casts when parsing packet lengths
eCryptfs: Force RO mount when encrypted view is enabled

Linus Torvalds
2014-12-20 10:15:12 +0800
5c68eac68 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs ... Browse Code »

Pull more btrfs updates from Chris Mason:
"This is part two of our merge window patches.

These are all from Filipe, and fix some really hard to find races that
can cause corruptions. Most of them involved block group removal
(balance) or discard"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: remove non-sense btrfs_error_discard_extent() function
Btrfs: fix fs corruption on transaction abort if device supports discard
Btrfs: always clear a block group node when removing it from the tree
Btrfs: ensure deletion from pinned_chunks list is protected

Linus Torvalds
2014-12-20 10:10:42 +0800
ac88ee3b6 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull irq core fix from Thomas Gleixner:
"A single fix plugging a long standing race between proc/stat and
proc/interrupts access and freeing of interrupt descriptors"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Prevent proc race against freeing of irq descriptors

Linus Torvalds
2014-12-20 05:26:08 +0800

19 Dec, 2014

11 commits

0e5cc9a40 udf: Check path length when reading symlink ... Browse Code »

Symlink reading code does not check whether the resulting path fits into
the page provided by the generic code. This isn't as easy as just
checking the symlink size because of various encoding conversions we
perform on path. So we have to check whether there is still enough space
in the buffer on the fly.

CC: stable@vger.kernel.org
Reported-by: Carl Henrik Lunde
Signed-off-by: Jan Kara

Jan Kara
2014-12-19 21:12:08 +0800
a1d47b262 udf: Verify symlink size before loading it ... Browse Code »
3

UDF specification allows arbitrarily large symlinks. However we support
only symlinks at most one block large. Check the length of the symlink
so that we don't access memory beyond end of the symlink block.

CC: stable@vger.kernel.org
Reported-by: Carl Henrik Lunde
Signed-off-by: Jan Kara

Jan Kara
2014-12-19 20:17:10 +0800
e159332b9 udf: Verify i_size when loading inode ... Browse Code »

Verify that inode size is sane when loading inode with data stored in
ICB. Otherwise we may get confused later when working with the inode and
inode size is too big.

CC: stable@vger.kernel.org
Reported-by: Carl Henrik Lunde
Signed-off-by: Jan Kara

Jan Kara
2014-12-19 20:13:05 +0800
4e2024624 isofs: Fix unchecked printing of ER records ... Browse Code »
3

We didn't check length of rock ridge ER records before printing them.
Thus corrupted isofs image can cause us to access and print some memory
behind the buffer with obvious consequences.

Reported-and-tested-by: Carl Henrik Lunde
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara

Jan Kara
2014-12-19 18:29:24 +0800
018cb13eb Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge misc patches from Andrew Morton:
"A few stragglers"

* emailed patches from Andrew Morton :
tools/testing/selftests/Makefile: alphasort the TARGETS list
mm/zsmalloc: adjust order of functions
ocfs2: fix journal commit deadlock
ocfs2/dlm: fix race between dispatched_work and dlm_lockres_grab_inflight_worker
ocfs2: reflink: fix slow unlink for refcounted file
mm/memory.c:do_shared_fault(): add comment
.mailmap: Santosh Shilimkar has moved
.mailmap: update akpm@osdl.org
lib/show_mem.c: add cma reserved information
fs/proc/meminfo.c: include cma info in proc/meminfo
mm: cma: split cma-reserved in dmesg log
hfsplus: fix longname handling
mm/mempolicy.c: remove unnecessary is_valid_nodemask()

Linus Torvalds
2014-12-19 11:08:25 +0800
136f49b91 ocfs2: fix journal commit deadlock ... Browse Code »
16

For buffer write, page lock will be got in write_begin and released in
write_end, in ocfs2_write_end_nolock(), before it unlock the page in
ocfs2_free_write_ctxt(), it calls ocfs2_run_deallocs(), this will ask
for the read lock of journal->j_trans_barrier. Holding page lock and
ask for journal->j_trans_barrier breaks the locking order.

This will cause a deadlock with journal commit threads, ocfs2cmt will
get write lock of journal->j_trans_barrier first, then it wakes up
kjournald2 to do the commit work, at last it waits until done. To
commit journal, kjournald2 needs flushing data first, it needs get the
cache page lock.

Since some ocfs2 cluster locks are holding by write process, this
deadlock may hung the whole cluster.

unlock pages before ocfs2_run_deallocs() can fix the locking order, also
put unlock before ocfs2_commit_trans() to make page lock is unlocked
before j_trans_barrier to preserve unlocking order.

Signed-off-by: Junxiao Bi
Reviewed-by: Wengang Wang
Cc:
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2014-12-19 11:08:11 +0800
1e5895816 ocfs2/dlm: fix race between dispatched_work and dlm_lockres_grab_inflight_worker ... Browse Code »

Commit ac4fef4d23ed ("ocfs2/dlm: do not purge lockres that is queued for
assert master") may have the following possible race case:

dlm_dispatch_assert_master dlm_wq
========================================================================
queue_work(dlm->quedlm_worker,
&dlm->dispatched_work);
dispatch work,
dlm_lockres_drop_inflight_worker
*BUG_ON(res->inflight_assert_workers == 0)*
dlm_lockres_grab_inflight_worker
inflight_assert_workers++

So ensure inflight_assert_workers to be increased first.

Signed-off-by: Joseph Qi
Signed-off-by: Xue jiufei
Cc: Joel Becker
Reviewed-by: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2014-12-19 11:08:11 +0800
f62f12b3a ocfs2: reflink: fix slow unlink for refcounted file ... Browse Code »

When running ocfs2 test suite multiple nodes reflink stress test, for a
4 nodes cluster, every unlink() for refcounted file needs about 700s.

The slow unlink is caused by the contention of refcount tree lock since
all nodes are unlink files using the same refcount tree. When the
unlinking file have many extents(over 1600 in our test), most of the
extents has refcounted flag set. In ocfs2_commit_truncate(), it will
execute the following call trace for every extents. This means it needs
get and released refcount tree lock about 1600 times. And when several
nodes are do this at the same time, the performance will be very low.

ocfs2_remove_btree_range()
-- ocfs2_lock_refcount_tree()
---- ocfs2_refcount_lock()
------ __ocfs2_cluster_lock()

ocfs2_refcount_lock() is costly, move it to ocfs2_commit_truncate() to
do lock/unlock once can improve a lot performance.

Signed-off-by: Junxiao Bi
Cc: Wengang
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2014-12-19 11:08:11 +0800
47f8f9297 fs/proc/meminfo.c: include cma info in proc/meminfo ... Browse Code »

This patch include CMA info (CMATotal, CMAFree) in /proc/meminfo.
Currently, in a CMA enabled system, if somebody wants to know the total
CMA size declared, there is no way to tell, other than the dmesg or
/var/log/messages logs.

With this patch we are showing the CMA info as part of meminfo, so that it
can be determined at any point of time. This will be populated only when
CMA is enabled.

Below is the sample output from a ARM based device with RAM:512MB and CMA:16MB.

MemTotal: 471172 kB
MemFree: 111712 kB
MemAvailable: 271172 kB
.
.
.
CmaTotal: 16384 kB
CmaFree: 6144 kB

This patch also fix below checkpatch errors that were found during these changes.

ERROR: space required after that ',' (ctx:ExV)
199: FILE: fs/proc/meminfo.c:199:
+ ,atomic_long_read(&num_poisoned_pages) << (PAGE_SHIFT - 10)
^

ERROR: space required after that ',' (ctx:ExV)
202: FILE: fs/proc/meminfo.c:202:
+ ,K(global_page_state(NR_ANON_TRANSPARENT_HUGEPAGES) *
^

ERROR: space required after that ',' (ctx:ExV)
206: FILE: fs/proc/meminfo.c:206:
+ ,K(totalcma_pages)
^

total: 3 errors, 0 warnings, 2 checks, 236 lines checked

Signed-off-by: Pintu Kumar
Signed-off-by: Vishnu Pratap Singh
Acked-by: Michal Nazarewicz
Cc: Rafael Aquini
Cc: Jerome Marchand
Cc: Marek Szyprowski
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pintu Kumar
2014-12-19 11:08:10 +0800
89ac9b4d3 hfsplus: fix longname handling ... Browse Code »

Longname is not correctly handled by hfsplus driver. If an attempt to
create a longname(>255) file/directory is made, it succeeds by creating a
file/directory with HFSPLUS_MAX_STRLEN and incorrect catalog key. Thus
leaving the volume in an inconsistent state. This patch fixes this issue.

Although lookup is always called first to create a negative entry, so just
doing a check in lookup would probably fix this issue. I choose to
propagate error to other iops as well.

Please NOTE: I have factored out hfsplus_cat_build_key_with_cnid from
hfsplus_cat_build_key, to avoid unncessary branching.

Thanks a lot.

TEST:
------
dir="TEST_DIR"
cdir=`pwd`
name255="_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_1234"
name256="${name255}5"

mkdir $dir
cd $dir
touch $name255
rm -f $name255
touch $name256
ls -la
cd $cdir
rm -rf $dir

RESULT:
-------
[sougata@ultrabook tmp]$ cdir=`pwd`
[sougata@ultrabook tmp]$
name255="_123456789_123456789_123456789_123456789_123456789_123456789\
> _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
> _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
> _123456789_123456789_123456789_123456789_123456789_1234"
[sougata@ultrabook tmp]$ name256="${name255}5"
[sougata@ultrabook tmp]$
[sougata@ultrabook tmp]$ mkdir $dir
[sougata@ultrabook tmp]$ cd $dir
[sougata@ultrabook TEST_DIR]$ touch $name255
[sougata@ultrabook TEST_DIR]$ rm -f $name255
[sougata@ultrabook TEST_DIR]$ touch $name256
[sougata@ultrabook TEST_DIR]$ ls -la
ls: cannot access
_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234:
No such file or directory
total 0
drwxrwxr-x 1 sougata sougata 3 Feb 20 19:56 .
drwxrwxrwx 1 root root 6 Feb 20 19:56 ..
-????????? ? ? ? ? ?
_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234
[sougata@ultrabook TEST_DIR]$ cd $cdir
[sougata@ultrabook tmp]$ rm -rf $dir
rm: cannot remove `TEST_DIR': Directory not empty

-ENAMETOOLONG returned from hfsplus_asc2uni was not propaged to iops.
This allowed hfsplus to create files/directories with HFSPLUS_MAX_STRLEN
and incorrect keys, leaving the FS in an inconsistent state. This patch
fixes this issue.

Signed-off-by: Sougata Santra
Reviewed-by: Christoph Hellwig
Cc: Vyacheslav Dubeyko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sougata Santra
2014-12-19 11:08:10 +0800
c297abfdf mnt: Fix a memory stomp in umount ... Browse Code »
3

While reviewing the code of umount_tree I realized that when we append
to a preexisting unmounted list we do not change pprev of the former
first item in the list.

Which means later in namespace_unlock hlist_del_init(&mnt->mnt_hash) on
the former first item of the list will stomp unmounted.first leaving
it set to some random mount point which we are likely to free soon.

This isn't likely to hit, but if it does I don't know how anyone could
track it down.

[ This happened because we don't have all the same operations for
hlist's as we do for normal doubly-linked lists. In particular,
list_splice() is easy on our standard doubly-linked lists, while
hlist_splice() doesn't exist and needs both start/end entries of the
hlist. And commit 38129a13e6e7 incorrectly open-coded that missing
hlist_splice().

We should think about making these kinds of "mindless" conversions
easier to get right by adding the missing hlist helpers - Linus ]

Fixes: 38129a13e6e71f666e0468e99fdd932a687b4d7e switch mnt_hash to hlist
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman"
Signed-off-by: Linus Torvalds

Eric W. Biederman
2014-12-19 03:22:02 +0800

18 Dec, 2014

3 commits

44e8967d5 Ceph: remove left-over reject file ... Browse Code »

Neither Sage nor I noticed that Zheng Yan had mistakenly committed
fs/ceph/super.h.rej as part of commit 31c542a199d7 ("ceph: add inline
data to pagecache").

Remove it.

Requested-by: Yan, Zheng
Cc: Sage Weil
Signed-off-by: Linus Torvalds

Linus Torvalds
2014-12-18 10:47:01 +0800
57666509b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client ... Browse Code »

Pull ceph updates from Sage Weil:
"The big item here is support for inline data for CephFS and for
message signatures from Zheng. There are also several bug fixes,
including interrupted flock request handling, 0-length xattrs, mksnap,
cached readdir results, and a message version compat field. Finally
there are several cleanups from Ilya, Dan, and Markus.

Note that there is another series coming soon that fixes some bugs in
the RBD 'lingering' requests, but it isn't quite ready yet"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (27 commits)
ceph: fix setting empty extended attribute
ceph: fix mksnap crash
ceph: do_sync is never initialized
libceph: fixup includes in pagelist.h
ceph: support inline data feature
ceph: flush inline version
ceph: convert inline data to normal data before data write
ceph: sync read inline data
ceph: fetch inline data when getting Fcr cap refs
ceph: use getattr request to fetch inline data
ceph: add inline data to pagecache
ceph: parse inline data in MClientReply and MClientCaps
libceph: specify position of extent operation
libceph: add CREATE osd operation support
libceph: add SETXATTR/CMPXATTR osd operations support
rbd: don't treat CEPH_OSD_OP_DELETE as extent op
ceph: remove unused stringification macros
libceph: require cephx message signature by default
ceph: introduce global empty snap context
ceph: message versioning fixes
...

Linus Torvalds
2014-12-18 08:03:12 +0800
87c31b39a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull user namespace related fixes from Eric Biederman:
"As these are bug fixes almost all of thes changes are marked for
backporting to stable.

The first change (implicitly adding MNT_NODEV on remount) addresses a
regression that was created when security issues with unprivileged
remount were closed. I go on to update the remount test to make it
easy to detect if this issue reoccurs.

Then there are a handful of mount and umount related fixes.

Then half of the changes deal with the a recently discovered design
bug in the permission checks of gid_map. Unix since the beginning has
allowed setting group permissions on files to less than the user and
other permissions (aka ---rwx---rwx). As the unix permission checks
stop as soon as a group matches, and setgroups allows setting groups
that can not later be dropped, results in a situtation where it is
possible to legitimately use a group to assign fewer privileges to a
process. Which means dropping a group can increase a processes
privileges.

The fix I have adopted is that gid_map is now no longer writable
without privilege unless the new file /proc/self/setgroups has been
set to permanently disable setgroups.

The bulk of user namespace using applications even the applications
using applications using user namespaces without privilege remain
unaffected by this change. Unfortunately this ix breaks a couple user
space applications, that were relying on the problematic behavior (one
of which was tools/selftests/mount/unprivileged-remount-test.c).

To hopefully prevent needing a regression fix on top of my security
fix I rounded folks who work with the container implementations mostly
like to be affected and encouraged them to test the changes.

> So far nothing broke on my libvirt-lxc test bed. :-)
> Tested with openSUSE 13.2 and libvirt 1.2.9.
> Tested-by: Richard Weinberger

> Tested on Fedora20 with libvirt 1.2.11, works fine.
> Tested-by: Chen Hanxiao

> Ok, thanks - yes, unprivileged lxc is working fine with your kernels.
> Just to be sure I was testing the right thing I also tested using
> my unprivileged nsexec testcases, and they failed on setgroup/setgid
> as now expected, and succeeded there without your patches.
> Tested-by: Serge Hallyn

> I tested this with Sandstorm. It breaks as is and it works if I add
> the setgroups thing.
> Tested-by: Andy Lutomirski # breaks things as designed :("

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Unbreak the unprivileged remount tests
userns; Correct the comment in map_write
userns: Allow setting gid_maps without privilege when setgroups is disabled
userns: Add a knob to disable setgroups on a per user namespace basis
userns: Rename id_map_mutex to userns_state_mutex
userns: Only allow the creator of the userns unprivileged mappings
userns: Check euid no fsuid when establishing an unprivileged uid mapping
userns: Don't allow unprivileged creation of gid mappings
userns: Don't allow setgroups until a gid mapping has been setablished
userns: Document what the invariant required for safe unprivileged mappings.
groups: Consolidate the setgroups permission checks
mnt: Clear mnt_expire during pivot_root
mnt: Carefully set CL_UNPRIVILEGED in clone_mnt
mnt: Move the clear of MNT_LOCKED from copy_tree to it's callers.
umount: Do not allow unmounting rootfs.
umount: Disallow unprivileged mount force
mnt: Update unprivileged remount test
mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount

Linus Torvalds
2014-12-18 04:31:40 +0800