Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

20 Dec, 2014

1 commit

ac88ee3b6 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull irq core fix from Thomas Gleixner:
"A single fix plugging a long standing race between proc/stat and
proc/interrupts access and freeing of interrupt descriptors"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Prevent proc race against freeing of irq descriptors

Linus Torvalds
2014-12-20 05:26:08 +0800

19 Dec, 2014

7 commits

018cb13eb Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge misc patches from Andrew Morton:
"A few stragglers"

* emailed patches from Andrew Morton :
tools/testing/selftests/Makefile: alphasort the TARGETS list
mm/zsmalloc: adjust order of functions
ocfs2: fix journal commit deadlock
ocfs2/dlm: fix race between dispatched_work and dlm_lockres_grab_inflight_worker
ocfs2: reflink: fix slow unlink for refcounted file
mm/memory.c:do_shared_fault(): add comment
.mailmap: Santosh Shilimkar has moved
.mailmap: update akpm@osdl.org
lib/show_mem.c: add cma reserved information
fs/proc/meminfo.c: include cma info in proc/meminfo
mm: cma: split cma-reserved in dmesg log
hfsplus: fix longname handling
mm/mempolicy.c: remove unnecessary is_valid_nodemask()

Linus Torvalds
2014-12-19 11:08:25 +0800
136f49b91 ocfs2: fix journal commit deadlock ... Browse Code »
16

For buffer write, page lock will be got in write_begin and released in
write_end, in ocfs2_write_end_nolock(), before it unlock the page in
ocfs2_free_write_ctxt(), it calls ocfs2_run_deallocs(), this will ask
for the read lock of journal->j_trans_barrier. Holding page lock and
ask for journal->j_trans_barrier breaks the locking order.

This will cause a deadlock with journal commit threads, ocfs2cmt will
get write lock of journal->j_trans_barrier first, then it wakes up
kjournald2 to do the commit work, at last it waits until done. To
commit journal, kjournald2 needs flushing data first, it needs get the
cache page lock.

Since some ocfs2 cluster locks are holding by write process, this
deadlock may hung the whole cluster.

unlock pages before ocfs2_run_deallocs() can fix the locking order, also
put unlock before ocfs2_commit_trans() to make page lock is unlocked
before j_trans_barrier to preserve unlocking order.

Signed-off-by: Junxiao Bi
Reviewed-by: Wengang Wang
Cc:
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2014-12-19 11:08:11 +0800
1e5895816 ocfs2/dlm: fix race between dispatched_work and dlm_lockres_grab_inflight_worker ... Browse Code »

Commit ac4fef4d23ed ("ocfs2/dlm: do not purge lockres that is queued for
assert master") may have the following possible race case:

dlm_dispatch_assert_master dlm_wq
========================================================================
queue_work(dlm->quedlm_worker,
&dlm->dispatched_work);
dispatch work,
dlm_lockres_drop_inflight_worker
*BUG_ON(res->inflight_assert_workers == 0)*
dlm_lockres_grab_inflight_worker
inflight_assert_workers++

So ensure inflight_assert_workers to be increased first.

Signed-off-by: Joseph Qi
Signed-off-by: Xue jiufei
Cc: Joel Becker
Reviewed-by: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joseph Qi
2014-12-19 11:08:11 +0800
f62f12b3a ocfs2: reflink: fix slow unlink for refcounted file ... Browse Code »

When running ocfs2 test suite multiple nodes reflink stress test, for a
4 nodes cluster, every unlink() for refcounted file needs about 700s.

The slow unlink is caused by the contention of refcount tree lock since
all nodes are unlink files using the same refcount tree. When the
unlinking file have many extents(over 1600 in our test), most of the
extents has refcounted flag set. In ocfs2_commit_truncate(), it will
execute the following call trace for every extents. This means it needs
get and released refcount tree lock about 1600 times. And when several
nodes are do this at the same time, the performance will be very low.

ocfs2_remove_btree_range()
-- ocfs2_lock_refcount_tree()
---- ocfs2_refcount_lock()
------ __ocfs2_cluster_lock()

ocfs2_refcount_lock() is costly, move it to ocfs2_commit_truncate() to
do lock/unlock once can improve a lot performance.

Signed-off-by: Junxiao Bi
Cc: Wengang
Reviewed-by: Mark Fasheh
Cc: Joel Becker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Junxiao Bi
2014-12-19 11:08:11 +0800
47f8f9297 fs/proc/meminfo.c: include cma info in proc/meminfo ... Browse Code »

This patch include CMA info (CMATotal, CMAFree) in /proc/meminfo.
Currently, in a CMA enabled system, if somebody wants to know the total
CMA size declared, there is no way to tell, other than the dmesg or
/var/log/messages logs.

With this patch we are showing the CMA info as part of meminfo, so that it
can be determined at any point of time. This will be populated only when
CMA is enabled.

Below is the sample output from a ARM based device with RAM:512MB and CMA:16MB.

MemTotal: 471172 kB
MemFree: 111712 kB
MemAvailable: 271172 kB
.
.
.
CmaTotal: 16384 kB
CmaFree: 6144 kB

This patch also fix below checkpatch errors that were found during these changes.

ERROR: space required after that ',' (ctx:ExV)
199: FILE: fs/proc/meminfo.c:199:
+ ,atomic_long_read(&num_poisoned_pages) << (PAGE_SHIFT - 10)
^

ERROR: space required after that ',' (ctx:ExV)
202: FILE: fs/proc/meminfo.c:202:
+ ,K(global_page_state(NR_ANON_TRANSPARENT_HUGEPAGES) *
^

ERROR: space required after that ',' (ctx:ExV)
206: FILE: fs/proc/meminfo.c:206:
+ ,K(totalcma_pages)
^

total: 3 errors, 0 warnings, 2 checks, 236 lines checked

Signed-off-by: Pintu Kumar
Signed-off-by: Vishnu Pratap Singh
Acked-by: Michal Nazarewicz
Cc: Rafael Aquini
Cc: Jerome Marchand
Cc: Marek Szyprowski
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Pintu Kumar
2014-12-19 11:08:10 +0800
89ac9b4d3 hfsplus: fix longname handling ... Browse Code »

Longname is not correctly handled by hfsplus driver. If an attempt to
create a longname(>255) file/directory is made, it succeeds by creating a
file/directory with HFSPLUS_MAX_STRLEN and incorrect catalog key. Thus
leaving the volume in an inconsistent state. This patch fixes this issue.

Although lookup is always called first to create a negative entry, so just
doing a check in lookup would probably fix this issue. I choose to
propagate error to other iops as well.

Please NOTE: I have factored out hfsplus_cat_build_key_with_cnid from
hfsplus_cat_build_key, to avoid unncessary branching.

Thanks a lot.

TEST:
------
dir="TEST_DIR"
cdir=`pwd`
name255="_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_123456789_123456789\
_123456789_123456789_123456789_123456789_123456789_1234"
name256="${name255}5"

mkdir $dir
cd $dir
touch $name255
rm -f $name255
touch $name256
ls -la
cd $cdir
rm -rf $dir

RESULT:
-------
[sougata@ultrabook tmp]$ cdir=`pwd`
[sougata@ultrabook tmp]$
name255="_123456789_123456789_123456789_123456789_123456789_123456789\
> _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
> _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
> _123456789_123456789_123456789_123456789_123456789_1234"
[sougata@ultrabook tmp]$ name256="${name255}5"
[sougata@ultrabook tmp]$
[sougata@ultrabook tmp]$ mkdir $dir
[sougata@ultrabook tmp]$ cd $dir
[sougata@ultrabook TEST_DIR]$ touch $name255
[sougata@ultrabook TEST_DIR]$ rm -f $name255
[sougata@ultrabook TEST_DIR]$ touch $name256
[sougata@ultrabook TEST_DIR]$ ls -la
ls: cannot access
_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234:
No such file or directory
total 0
drwxrwxr-x 1 sougata sougata 3 Feb 20 19:56 .
drwxrwxrwx 1 root root 6 Feb 20 19:56 ..
-????????? ? ? ? ? ?
_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234
[sougata@ultrabook TEST_DIR]$ cd $cdir
[sougata@ultrabook tmp]$ rm -rf $dir
rm: cannot remove `TEST_DIR': Directory not empty

-ENAMETOOLONG returned from hfsplus_asc2uni was not propaged to iops.
This allowed hfsplus to create files/directories with HFSPLUS_MAX_STRLEN
and incorrect keys, leaving the FS in an inconsistent state. This patch
fixes this issue.

Signed-off-by: Sougata Santra
Reviewed-by: Christoph Hellwig
Cc: Vyacheslav Dubeyko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sougata Santra
2014-12-19 11:08:10 +0800
c297abfdf mnt: Fix a memory stomp in umount ... Browse Code »
3

While reviewing the code of umount_tree I realized that when we append
to a preexisting unmounted list we do not change pprev of the former
first item in the list.

Which means later in namespace_unlock hlist_del_init(&mnt->mnt_hash) on
the former first item of the list will stomp unmounted.first leaving
it set to some random mount point which we are likely to free soon.

This isn't likely to hit, but if it does I don't know how anyone could
track it down.

[ This happened because we don't have all the same operations for
hlist's as we do for normal doubly-linked lists. In particular,
list_splice() is easy on our standard doubly-linked lists, while
hlist_splice() doesn't exist and needs both start/end entries of the
hlist. And commit 38129a13e6e7 incorrectly open-coded that missing
hlist_splice().

We should think about making these kinds of "mindless" conversions
easier to get right by adding the missing hlist helpers - Linus ]

Fixes: 38129a13e6e71f666e0468e99fdd932a687b4d7e switch mnt_hash to hlist
Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman"
Signed-off-by: Linus Torvalds

Eric W. Biederman
2014-12-19 03:22:02 +0800

18 Dec, 2014

24 commits

44e8967d5 Ceph: remove left-over reject file ... Browse Code »

Neither Sage nor I noticed that Zheng Yan had mistakenly committed
fs/ceph/super.h.rej as part of commit 31c542a199d7 ("ceph: add inline
data to pagecache").

Remove it.

Requested-by: Yan, Zheng
Cc: Sage Weil
Signed-off-by: Linus Torvalds

Linus Torvalds
2014-12-18 10:47:01 +0800
57666509b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client ... Browse Code »

Pull ceph updates from Sage Weil:
"The big item here is support for inline data for CephFS and for
message signatures from Zheng. There are also several bug fixes,
including interrupted flock request handling, 0-length xattrs, mksnap,
cached readdir results, and a message version compat field. Finally
there are several cleanups from Ilya, Dan, and Markus.

Note that there is another series coming soon that fixes some bugs in
the RBD 'lingering' requests, but it isn't quite ready yet"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (27 commits)
ceph: fix setting empty extended attribute
ceph: fix mksnap crash
ceph: do_sync is never initialized
libceph: fixup includes in pagelist.h
ceph: support inline data feature
ceph: flush inline version
ceph: convert inline data to normal data before data write
ceph: sync read inline data
ceph: fetch inline data when getting Fcr cap refs
ceph: use getattr request to fetch inline data
ceph: add inline data to pagecache
ceph: parse inline data in MClientReply and MClientCaps
libceph: specify position of extent operation
libceph: add CREATE osd operation support
libceph: add SETXATTR/CMPXATTR osd operations support
rbd: don't treat CEPH_OSD_OP_DELETE as extent op
ceph: remove unused stringification macros
libceph: require cephx message signature by default
ceph: introduce global empty snap context
ceph: message versioning fixes
...

Linus Torvalds
2014-12-18 08:03:12 +0800
87c31b39a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull user namespace related fixes from Eric Biederman:
"As these are bug fixes almost all of thes changes are marked for
backporting to stable.

The first change (implicitly adding MNT_NODEV on remount) addresses a
regression that was created when security issues with unprivileged
remount were closed. I go on to update the remount test to make it
easy to detect if this issue reoccurs.

Then there are a handful of mount and umount related fixes.

Then half of the changes deal with the a recently discovered design
bug in the permission checks of gid_map. Unix since the beginning has
allowed setting group permissions on files to less than the user and
other permissions (aka ---rwx---rwx). As the unix permission checks
stop as soon as a group matches, and setgroups allows setting groups
that can not later be dropped, results in a situtation where it is
possible to legitimately use a group to assign fewer privileges to a
process. Which means dropping a group can increase a processes
privileges.

The fix I have adopted is that gid_map is now no longer writable
without privilege unless the new file /proc/self/setgroups has been
set to permanently disable setgroups.

The bulk of user namespace using applications even the applications
using applications using user namespaces without privilege remain
unaffected by this change. Unfortunately this ix breaks a couple user
space applications, that were relying on the problematic behavior (one
of which was tools/selftests/mount/unprivileged-remount-test.c).

To hopefully prevent needing a regression fix on top of my security
fix I rounded folks who work with the container implementations mostly
like to be affected and encouraged them to test the changes.

> So far nothing broke on my libvirt-lxc test bed. :-)
> Tested with openSUSE 13.2 and libvirt 1.2.9.
> Tested-by: Richard Weinberger

> Tested on Fedora20 with libvirt 1.2.11, works fine.
> Tested-by: Chen Hanxiao

> Ok, thanks - yes, unprivileged lxc is working fine with your kernels.
> Just to be sure I was testing the right thing I also tested using
> my unprivileged nsexec testcases, and they failed on setgroup/setgid
> as now expected, and succeeded there without your patches.
> Tested-by: Serge Hallyn

> I tested this with Sandstorm. It breaks as is and it works if I add
> the setgroups thing.
> Tested-by: Andy Lutomirski # breaks things as designed :("

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Unbreak the unprivileged remount tests
userns; Correct the comment in map_write
userns: Allow setting gid_maps without privilege when setgroups is disabled
userns: Add a knob to disable setgroups on a per user namespace basis
userns: Rename id_map_mutex to userns_state_mutex
userns: Only allow the creator of the userns unprivileged mappings
userns: Check euid no fsuid when establishing an unprivileged uid mapping
userns: Don't allow unprivileged creation of gid mappings
userns: Don't allow setgroups until a gid mapping has been setablished
userns: Document what the invariant required for safe unprivileged mappings.
groups: Consolidate the setgroups permission checks
mnt: Clear mnt_expire during pivot_root
mnt: Carefully set CL_UNPRIVILEGED in clone_mnt
mnt: Move the clear of MNT_LOCKED from copy_tree to it's callers.
umount: Do not allow unmounting rootfs.
umount: Disallow unprivileged mount force
mnt: Update unprivileged remount test
mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount

Linus Torvalds
2014-12-18 04:31:40 +0800
d6666be6f Merge tag 'for-linus-20141215' of git://git.infradead.org/linux-mtd ... Browse Code »

Pull MTD updates from Brian Norris:
"Summary:
- Add device tree support for DoC3

- SPI NOR:
Refactoring, for better layering between spi-nor.c and its
driver users (e.g., m25p80.c)

New flash device support

Support 6-byte ID strings

- NAND:
New NAND driver for Allwinner SoC's (sunxi)

GPMI NAND: add support for raw (no ECC) access, for testing
purposes

Add ATO manufacturer ID

A few odd driver fixes

- MTD tests:
Allow testers to compensate for OOB bitflips in oobtest

Fix a torturetest regression

- nandsim: Support longer ID byte strings

And more"

* tag 'for-linus-20141215' of git://git.infradead.org/linux-mtd: (63 commits)
mtd: tests: abort torturetest on erase errors
mtd: physmap_of: fix potential NULL dereference
mtd: spi-nor: allow NULL as chip name and try to auto detect it
mtd: nand: gpmi: add raw oob access functions
mtd: nand: gpmi: add proper raw access support
mtd: nand: gpmi: add gpmi_copy_bits function
mtd: spi-nor: factor out write_enable() for erase commands
mtd: spi-nor: add support for s25fl128s
mtd: spi-nor: remove the jedec_id/ext_id
mtd: spi-nor: add id/id_len for flash_info{}
mtd: nand: correct the comment of function nand_block_isreserved()
jffs2: Drop bogus if in comment
mtd: atmel_nand: replace memcpy32_toio/memcpy32_fromio with memcpy
mtd: cafe_nand: drop duplicate .write_page implementation
mtd: m25p80: Add support for serial flash Spansion S25FL132K
MTD: m25p80: fix inconsistency in m25p_ids compared to spi_nor_ids
mtd: spi-nor: improve wait-till-ready timeout loop
mtd: delete unnecessary checks before two function calls
mtd: nand: omap: Fix NAND enumeration on 3430 LDP
mtd: nand: add ATO manufacturer info
...

Linus Torvalds
2014-12-18 01:59:26 +0800
c103b21c2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse ... Browse Code »

Pull fuse update from Miklos Szeredi:
"The first part makes sure we don't hold up umount with pending async
requests. In addition to being a cleanup, this is a small behavioral
change (for the better) and unlikely to break anything.

The second part prepares for a cleanup of the fuse device I/O code by
adding a helper for simple request submission, with some savings in
line numbers already realized"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: use file_inode() in fuse_file_fallocate()
fuse: introduce fuse_simple_request() helper
fuse: reduce max out args
fuse: hold inode instead of path after release
fuse: flush requests on umount
fuse: don't wake up reserved req in fuse_conn_kill()

Linus Torvalds
2014-12-18 01:41:32 +0800
0aeff37ab ceph: fix setting empty extended attribute ... Browse Code »

make sure 'value' is not null. otherwise __ceph_setxattr will remove
the extended attribute.

Signed-off-by: Yan, Zheng
Reviewed-by: Sage Weil

Yan, Zheng
2014-12-18 01:18:49 +0800
275dd19ea ceph: fix mksnap crash ... Browse Code »

mksnap reply only contain 'target', does not contain 'dentry'. So
it's wrong to use req->r_reply_info.head->is_dentry to detect traceless
reply.

Signed-off-by: Yan, Zheng
Reviewed-by: Sage Weil

Yan, Zheng
2014-12-18 01:09:53 +0800
021b77bee ceph: do_sync is never initialized ... Browse Code »
3

Probably this code was syncing a lot more often then intended because
the do_sync variable wasn't set to zero.

Cc: stable@vger.kernel.org # v3.11+
Fixes: c62988ec0910 ('ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.')
Signed-off-by: Dan Carpenter
Signed-off-by: Ilya Dryomov

Dan Carpenter
2014-12-18 01:09:53 +0800
65a22662b ceph: support inline data feature ... Browse Code »

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:53 +0800
e20d258d7 ceph: flush inline version ... Browse Code »

After converting inline data to normal data, client need to flush
the new i_inline_version (CEPH_INLINE_NONE) to MDS. This commit makes
cap messages (sent to MDS) contain inline_version and inline_data.
Client always converts inline data to normal data before data write,
so the inline data length part is always zero.

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:53 +0800
28127bdd2 ceph: convert inline data to normal data before data write ... Browse Code »

Before any data write, convert inline data to normal data and set
i_inline_version to CEPH_INLINE_NONE. The OSD request that saves
inline data to object contains 3 operations (CMPXATTR, WRITE and
SETXATTR). It compares a xattr named 'inline_version' to prevent
old data overwrites newer data.

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:52 +0800
83701246a ceph: sync read inline data ... Browse Code »

we can't use getattr to fetch inline data while holding Fr cap,
because it can cause deadlock. If we need to sync read inline data,
drop cap refs first, then use getattr to fetch inline data.

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:52 +0800
3738daa68 ceph: fetch inline data when getting Fcr cap refs ... Browse Code »

we can't use getattr to fetch inline data after getting Fcr caps,
because it can cause deadlock. The solution is try bringing inline
data to page cache when not holding any cap, and hope the inline
data page is still there after getting the Fcr caps. If the page
is still there, pin it in page cache for later IO.

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:52 +0800
01deead04 ceph: use getattr request to fetch inline data ... Browse Code »

Add a new parameter 'locked_page' to ceph_do_getattr(). If inline data
in getattr reply will be copied to the page.

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:52 +0800
31c542a19 ceph: add inline data to pagecache ... Browse Code »
13

Request reply and cap message can contain inline data. add inline data
to the page cache if there is Fc cap.

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:52 +0800
fb01d1f8b ceph: parse inline data in MClientReply and MClientCaps ... Browse Code »

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:52 +0800
715e4cd40 libceph: specify position of extent operation ... Browse Code »

allow specifying position of extent operation in multi-operations
osd request. This is required for cephfs to convert inline data to
normal data (compare xattr, then write object).

Signed-off-by: Yan, Zheng
Reviewed-by: Ilya Dryomov

Yan, Zheng
2014-12-18 01:09:52 +0800
ca3995ad1 ceph: remove unused stringification macros ... Browse Code »

These were used to report git versions a long time ago.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2014-12-18 01:09:51 +0800
97c85a828 ceph: introduce global empty snap context ... Browse Code »

Current snaphost code does not properly handle moving inode from one
empty snap realm to another empty snap realm. After changing inode's
snap realm, some dirty pages' snap context can be not equal to inode's
i_head_snap. This can trigger BUG() in ceph_put_wrbuffer_cap_refs()

The fix is introduce a global empty snap context for all empty snap
realm. This avoids triggering the BUG() for filesystem with no snapshot.

Fixes: http://tracker.ceph.com/issues/9928

Signed-off-by: Yan, Zheng
Reviewed-by: Ilya Dryomov

Yan, Zheng
2014-12-18 01:09:51 +0800
7cfa0313d ceph: message versioning fixes ... Browse Code »

There were two places we were assigning version in host byte order
instead of network byte order.

Also in MSG_CLIENT_SESSION we weren't setting compat_version in the
header to reflect continued compatability with older MDSs.

Fixes: http://tracker.ceph.com/issues/9945

Signed-off-by: John Spray
Reviewed-by: Sage Weil

John Spray
2014-12-18 01:09:51 +0800
33d073379 libceph: message signature support ... Browse Code »

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:50 +0800
e96a650a8 ceph, rbd: delete unnecessary checks before two function calls ... Browse Code »

The functions ceph_put_snap_context() and iput() test whether their
argument is NULL and then return immediately. Thus the test around the
call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring
[idryomov@redhat.com: squashed rbd.c hunk, changelog]
Signed-off-by: Ilya Dryomov

SF Markus Elfring
2014-12-18 01:09:50 +0800
70db4f362 ceph: introduce a new inode flag indicating if cached dentries are ordered ... Browse Code »

After creating/deleting/renaming file, offsets of sibling dentries may
change. So we can not use cached dentries to satisfy readdir. But we can
still use the cached dentries to conclude -ENOENT for lookup.

This patch introduces a new inode flag indicating if child dentries are
ordered. The flag is set at the same time marking a directory complete.
After creating/deleting/renaming file, we clear the flag on directory
inode. This prevents ceph_readdir() from using cached dentries to satisfy
readdir syscall.

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:50 +0800
9280be24d ceph: fix file lock interruption ... Browse Code »

When a lock operation is interrupted, current code sends a unlock request to
MDS to undo the lock operation. This method does not work as expected because
the unlock request can drop locks that have already been acquired.

The fix is use the newly introduced CEPH_LOCK_FCNTL_INTR/CEPH_LOCK_FLOCK_INTR
requests to interrupt blocked file lock request. These requests do not drop
locks that have alread been acquired, they only interrupt blocked file lock
request.

Signed-off-by: Yan, Zheng

Yan, Zheng
2014-12-18 01:09:49 +0800

17 Dec, 2014

3 commits

603ba7e41 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs pile #2 from Al Viro:
"Next pile (and there'll be one or two more).

The large piece in this one is getting rid of /proc/*/ns/* weirdness;
among other things, it allows to (finally) make nameidata completely
opaque outside of fs/namei.c, making for easier further cleanups in
there"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
coda_venus_readdir(): use file_inode()
fs/namei.c: fold link_path_walk() call into path_init()
path_init(): don't bother with LOOKUP_PARENT in argument
fs/namei.c: new helper (path_cleanup())
path_init(): store the "base" pointer to file in nameidata itself
make default ->i_fop have ->open() fail with ENXIO
make nameidata completely opaque outside of fs/namei.c
kill proc_ns completely
take the targets of /proc/*/ns/* symlinks to separate fs
bury struct proc_ns in fs/proc
copy address of proc_ns_ops into ns_common
new helpers: ns_alloc_inum/ns_free_inum
make proc_ns_operations work with struct ns_common * instead of void *
switch the rest of proc_ns_operations to working with &...->ns
netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
common object embedded into various struct ....ns

Linus Torvalds
2014-12-17 07:53:03 +0800
31f48fc8f Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs ... Browse Code »

Pull isofs and reiserfs fixes from Jan Kara:
"A reiserfs and an isofs fix. They arrived after I sent you my first
pull request and I don't want to delay them unnecessarily till rc2"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
isofs: Fix infinite looping over CE entries
reiserfs: destroy allocated commit workqueue

Linus Torvalds
2014-12-17 07:46:01 +0800
0b233b7c7 Merge branch 'for-3.19' of git://linux-nfs.org/~bfields/linux ... Browse Code »

Pull nfsd updates from Bruce Fields:
"A comparatively quieter cycle for nfsd this time, but still with two
larger changes:

- RPC server scalability improvements from Jeff Layton (using RCU
instead of a spinlock to find idle threads).

- server-side NFSv4.2 ALLOCATE/DEALLOCATE support from Anna
Schumaker, enabling fallocate on new clients"

* 'for-3.19' of git://linux-nfs.org/~bfields/linux: (32 commits)
nfsd4: fix xdr4 count of server in fs_location4
nfsd4: fix xdr4 inclusion of escaped char
sunrpc/cache: convert to use string_escape_str()
sunrpc: only call test_bit once in svc_xprt_received
fs: nfsd: Fix signedness bug in compare_blob
sunrpc: add some tracepoints around enqueue and dequeue of svc_xprt
sunrpc: convert to lockless lookup of queued server threads
sunrpc: fix potential races in pool_stats collection
sunrpc: add a rcu_head to svc_rqst and use kfree_rcu to free it
sunrpc: require svc_create callers to pass in meaningful shutdown routine
sunrpc: have svc_wake_up only deal with pool 0
sunrpc: convert sp_task_pending flag to use atomic bitops
sunrpc: move rq_cachetype field to better optimize space
sunrpc: move rq_splice_ok flag into rq_flags
sunrpc: move rq_dropme flag into rq_flags
sunrpc: move rq_usedeferral flag to rq_flags
sunrpc: move rq_local field to rq_flags
sunrpc: add a generic rq_flags field to svc_rqst and move rq_secure to it
nfsd: minor off by one checks in __write_versions()
sunrpc: release svc_pool_map reference when serv allocation fails
...

Linus Torvalds
2014-12-17 07:25:31 +0800

15 Dec, 2014

5 commits

f54e18f1b isofs: Fix infinite looping over CE entries ... Browse Code »
3

Rock Ridge extensions define so called Continuation Entries (CE) which
define where is further space with Rock Ridge data. Corrupted isofs
image can contain arbitrarily long chain of these, including a one
containing loop and thus causing kernel to end in an infinite loop when
traversing these entries.

Limit the traversal to 32 entries which should be more than enough space
to store all the Rock Ridge data.

Reported-by: P J P
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara

Jan Kara
2014-12-15 22:53:26 +0800
67e2c3883 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security ... Browse Code »

Pull security layer updates from James Morris:
"In terms of changes, there's general maintenance to the Smack,
SELinux, and integrity code.

The IMA code adds a new kconfig option, IMA_APPRAISE_SIGNED_INIT,
which allows IMA appraisal to require signatures. Support for reading
keys from rootfs before init is call is also added"

* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (23 commits)
selinux: Remove security_ops extern
security: smack: fix out-of-bounds access in smk_parse_smack()
VFS: refactor vfs_read()
ima: require signature based appraisal
integrity: provide a hook to load keys when rootfs is ready
ima: load x509 certificate from the kernel
integrity: provide a function to load x509 certificate from the kernel
integrity: define a new function integrity_read_file()
Security: smack: replace kzalloc with kmem_cache for inode_smack
Smack: Lock mode for the floor and hat labels
ima: added support for new kernel cmdline parameter ima_template_fmt
ima: allocate field pointers array on demand in template_desc_init_fields()
ima: don't allocate a copy of template_fmt in template_desc_init_fields()
ima: display template format in meas. list if template name length is zero
ima: added error messages to template-related functions
ima: use atomic bit operations to protect policy update interface
ima: ignore empty and with whitespaces policy lines
ima: no need to allocate entry for comment
ima: report policy load status
ima: use path names cache
...

Linus Torvalds
2014-12-15 12:36:37 +0800
e6b5be2be Merge tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core ... Browse Code »

Pull driver core update from Greg KH:
"Here's the set of driver core patches for 3.19-rc1.

They are dominated by the removal of the .owner field in platform
drivers. They touch a lot of files, but they are "simple" changes,
just removing a line in a structure.

Other than that, a few minor driver core and debugfs changes. There
are some ath9k patches coming in through this tree that have been
acked by the wireless maintainers as they relied on the debugfs
changes.

Everything has been in linux-next for a while"

* tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (324 commits)
Revert "ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries"
fs: debugfs: add forward declaration for struct device type
firmware class: Deletion of an unnecessary check before the function call "vunmap"
firmware loader: fix hung task warning dump
devcoredump: provide a one-way disable function
device: Add dev__once variants
ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries
ath: use seq_file api for ath9k debugfs files
debugfs: add helper function to create device related seq_file
drivers/base: cacheinfo: remove noisy error boot message
Revert "core: platform: add warning if driver has no owner"
drivers: base: support cpu cache information interface to userspace via sysfs
drivers: base: add cpu_device_create to support per-cpu devices
topology: replace custom attribute macros with standard DEVICE_ATTR*
cpumask: factor out show_cpumap into separate helper function
driver core: Fix unbalanced device reference in drivers_probe
driver core: fix race with userland in device_add()
sysfs/kernfs: make read requests on pre-alloc files use the buffer.
sysfs/kernfs: allow attributes to request write buffer be pre-allocated.
fs: sysfs: return EGBIG on write if offset is larger than file size
...

Linus Torvalds
2014-12-15 08:10:09 +0800
7a02d0896 Merge tag 'squashfs-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next ... Browse Code »

Pull squashfs update from Phillip Lougher:
"These patches optionally add LZ4 compression support to Squashfs.

LZ4 is a lightweight compression algorithm which can be used on
embedded systems to reduce CPU and memory overhead (in comparison to
the standard zlib compression).

These patches add the wrapper code to allow Squashfs to use the
existing LZ4 decompression code, and the necessary configuration
option"

* tag 'squashfs-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next:
Squashfs: Add LZ4 compression configuration option
Squashfs: add LZ4 compression support

Linus Torvalds
2014-12-15 06:42:53 +0800
7d22286ff Merge git://git.kvack.org/~bcrl/aio-next ... Browse Code »

Pull aio updates from Benjamin LaHaise.

* git://git.kvack.org/~bcrl/aio-next:
aio: Skip timer for io_getevents if timeout=0
aio: Make it possible to remap aio ring

Linus Torvalds
2014-12-15 05:36:57 +0800