Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

22 Jun, 2014

3 commits

2dfded821 Merge tag 'locks-v3.16-2' of git://git.samba.org/jlayton/linux

Pull file locking fixes from Jeff Layton:
"File locking related bugfixes

Nothing too earth-shattering here. A fix for a potential regression
due to a patch in pile #1, and the addition of a memory barrier to
prevent a race condition between break_deleg and generic_add_lease"

* tag 'locks-v3.16-2' of git://git.samba.org/jlayton/linux:
locks: set fl_owner for leases back to current->files
locks: add missing memory barrier in break_deleg

Linus Torvalds
2014-06-22 10:40:30 +0800
e13d100be Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull btrfs fixes from Chris Mason:
"This fixes some lockups in btrfs reported with rc1. It probably has
some performance impact because it is backing off our spinning locks
more often and switching to a blocking lock. I'll be able to nail
that down next week, but for now I want to get the lockups taken care
of.

Otherwise some more stack reduction and assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix wrong error handle when the device is missing or is not writeable
Btrfs: fix deadlock when mounting a degraded fs
Btrfs: use bio_endio_nodec instead of open code
Btrfs: fix NULL pointer crash when running balance and scrub concurrently
btrfs: Skip scrubbing removed chunks to avoid -ENOENT.
Btrfs: fix broken free space cache after the system crashed
Btrfs: make free space cache write out functions more readable
Btrfs: remove unused wait queue in struct extent_buffer
Btrfs: fix deadlocks with trylock on tree nodes

Linus Torvalds
2014-06-22 08:21:43 +0800
147f1404d Merge branch 'for-3.16' of git://linux-nfs.org/~bfields/linux

Pull nfsd bugfixes from Bruce Fields:
"Fixes for a new regression from the xdr encoding rewrite, and a
delegation problem we've had for a while (made somewhat more annoying
by the vfs delegation support added in 3.13)"

* 'for-3.16' of git://linux-nfs.org/~bfields/linux:
NFSD: fix bug for readdir of pseudofs
NFSD: Don't hand out delegations for 30 seconds after recalling them.

Linus Torvalds
2014-06-22 08:20:38 +0800

20 Jun, 2014

9 commits

8408c716d Btrfs: fix wrong error handle when the device is missing or is not writeable

The original bio might be submitted, so we should increase bi_remaining to
account for it when we deal with the error that the device is missing or
is not writeable, or we would skip the endio handler.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-06-20 05:20:56 +0800
c55f13964 Btrfs: fix deadlock when mounting a degraded fs

The deadlock happened when mounting a degraded filesystem. The steps to
reproduce are as follows:
# mkfs.btrfs -f -m raid1 -d raid1
# echo 1 > /sys/block/`basename `/device/delete
# mount -o degraded

The reason was that the counter -- bi_remaining was wrong. If the missing
or unwriteable device was the last device in the mapping array, we would
not submit the original bio, so we shouldn't increase bi_remaining of it
in btrfs_end_bio(), or we would skip the final endio handle.

Fix this problem by adding a flag to the btrfs bio structure. If we submit
the original bio, we set the flag and increase the bi_remaining counter;
otherwise we do neither.

There is another way to fix it -- decrease the bi_remaining counter of the
original bio once we are sure the original bio was not submitted -- but that
method needs more checks and is error-prone.

Signed-off-by: Miao Xie
Reviewed-by: Liu Bo
Signed-off-by: Chris Mason

Miao Xie
2014-06-20 05:20:56 +0800
e990f1676 Btrfs: use bio_endio_nodec instead of open code

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-06-20 05:20:55 +0800
298a8f9cf Btrfs: fix NULL pointer crash when running balance and scrub concurrently

While running balance, scrub, fsstress concurrently we hit the
following kernel crash:

[56561.448845] BTRFS info (device sde): relocating block group 11005853696 flags 132
[56561.524077] BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[56561.524237] IP: [] scrub_chunk.isra.12+0xdd/0x130 [btrfs]
[56561.524297] PGD 9be28067 PUD 7f3dd067 PMD 0
[56561.524325] Oops: 0000 [#1] SMP
[....]
[56561.527237] Call Trace:
[56561.527309] [] scrub_enumerate_chunks+0x24e/0x490 [btrfs]
[56561.527392] [] ? abort_exclusive_wait+0x50/0xb0
[56561.527476] [] btrfs_scrub_dev+0x1a4/0x530 [btrfs]
[56561.527561] [] btrfs_ioctl+0x13f7/0x2a90 [btrfs]
[56561.527639] [] do_vfs_ioctl+0x2e0/0x4c0
[56561.527712] [] ? vtime_account_user+0x54/0x60
[56561.527788] [] ? __audit_syscall_entry+0x9c/0xf0
[56561.527870] [] SyS_ioctl+0x81/0xa0
[56561.527941] [] tracesys+0xdd/0xe2
[...]
[56561.528304] RIP [] scrub_chunk.isra.12+0xdd/0x130 [btrfs]
[56561.528395] RSP
[56561.528454] CR2: 0000000000000078

This is because in btrfs_relocate_chunk(), we will free @bdev directly while
scrub may still hold extent mapping, and may access freed memory.

Fix this problem by wrapping freeing @bdev work into free_extent_map() which
is based on reference count.

Reported-by: Qu Wenruo
Signed-off-by: Wang Shilong
Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Wang Shilong
2014-06-20 05:20:55 +0800
ced96edc4 btrfs: Skip scrubbing removed chunks to avoid -ENOENT.

When running scrub together with balance, -ENOENT is sometimes returned:
scrub_enumerate_chunks() searches for the dev_extent in *COMMIT_ROOT*, but
btrfs_lookup_block_group() searches for the block group in *MEMORY*, so if a
chunk has been removed but the removal is not yet committed, -ENOENT is returned.

However, there is no need to stop scrubbing since other chunks may be
scrubbed without problem.

So this patch changes the behavior to skip removed chunks and continue
to scrub the rest.
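
The skip-and-continue behaviour described above can be sketched in plain userspace C. This is only an illustration, not the btrfs code: `scrub_one_chunk()` and `scrub_all()` are hypothetical names, and a negative chunk id is an assumed stand-in for a chunk that has been removed but not yet committed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-chunk scrub: a negative id models a removed chunk. */
int scrub_one_chunk(int chunk)
{
    return chunk < 0 ? -ENOENT : 0;
}

/* Scrub every chunk; returns 0 and stores the number scrubbed in *scrubbed. */
int scrub_all(const int *chunks, size_t n, size_t *scrubbed)
{
    assert(chunks != NULL || n == 0);
    *scrubbed = 0;
    for (size_t i = 0; i < n; i++) {
        int ret = scrub_one_chunk(chunks[i]);
        if (ret == -ENOENT)
            continue;   /* chunk vanished: skip it and keep going */
        if (ret < 0)
            return ret; /* any other error still aborts the scrub */
        (*scrubbed)++;
    }
    return 0;
}
```

The only error treated as non-fatal here is -ENOENT; everything else still stops the loop, which matches the intent stated in the commit message.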

Signed-off-by: Qu Wenruo
Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Qu Wenruo
2014-06-20 05:20:54 +0800
e570fd27f Btrfs: fix broken free space cache after the system crashed

When we mounted the filesystem after the crash, we got the following
message:
BTRFS error (device xxx): block group xxxx has wrong amount of free space
BTRFS error (device xxx): failed to load free space cache for block group xxx

It is because we didn't update the metadata of the allocated space (in the extent
tree) until the file data was written to disk. During this time, there was
no information about the allocated space in either the extent tree or the
free space cache. When we wrote out the free space cache at this time (commit
transaction), that space was lost. In fact, only the free space that is
used to store file data had this problem; the rest didn't, because
its metadata is updated in the same transaction context.

There are several methods which can fix the above problem:
- track the allocated space, and write it out when we write out the free
  space cache
- account the size of the allocated space that is used to store the file
  data; if the size is not zero, don't write out the free space cache.

The first one is complex and may hurt performance.
This patch chose the second method: we use a per-block-group variable to
account the size of that allocated space. Besides that, we also introduce
a per-block-group read-write semaphore to avoid the race between
the allocation and the free space cache write out.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-06-20 05:20:54 +0800
5349d6c3f Btrfs: make free space cache write out functions more readable

This patch makes the free space cache write out functions more readable,
and besides that, it also reduces the stack space that the function --
__btrfs_write_out_cache uses from 194 bytes to 144 bytes.

Signed-off-by: Miao Xie
Signed-off-by: Chris Mason

Miao Xie
2014-06-20 05:20:54 +0800
46fefe41b Btrfs: remove unused wait queue in struct extent_buffer

The lock_wq wait queue is not used anywhere, therefore just remove it.
On an x86_64 system, this reduced sizeof(struct extent_buffer) from 320
bytes down to 296 bytes, which means a 4 KB page can now be used for
13 extent buffers instead of 12.
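
The 12-versus-13 figure follows from integer division of the page size by the structure size. A minimal check, assuming a 4096-byte page and ignoring any per-page allocator metadata (`objects_per_page()` is a hypothetical helper, not a kernel function):

```c
#include <assert.h>
#include <stddef.h>

/* How many objects of obj_size bytes fit in one page of page_size bytes. */
size_t objects_per_page(size_t page_size, size_t obj_size)
{
    assert(obj_size > 0);
    return page_size / obj_size;
}
```

With these assumptions, `objects_per_page(4096, 320)` gives 12 and `objects_per_page(4096, 296)` gives 13.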

Signed-off-by: Filipe David Borba Manana
Signed-off-by: Chris Mason

Filipe Manana
2014-06-20 05:20:28 +0800
ea4ebde02 Btrfs: fix deadlocks with trylock on tree nodes

The Btrfs tree trylock function is poorly named. It always takes
the spinlock and backs off if the blocking lock is held. This
can lead to surprising lockups because people expect it to really be a
trylock.

This commit makes it a pure trylock, both for the spinlock and the
blocking lock. It also reworks the nested lock handling slightly to
avoid taking the read lock while a spinning write lock might be held.
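
As an illustration of what "pure trylock" means here, the sketch below never waits on either part of the lock. It is a userspace model using C11 atomics, not the btrfs locking code; `struct tree_lock`, its fields, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct tree_lock {
    atomic_int  blocking_holders; /* holders that switched to blocking mode */
    atomic_flag spin;             /* models the spinning lock */
};

void tree_lock_init(struct tree_lock *l)
{
    assert(l != NULL);
    atomic_init(&l->blocking_holders, 0);
    atomic_flag_clear(&l->spin);
}

/* Returns true only if the lock was taken without waiting on anything. */
bool tree_try_lock(struct tree_lock *l)
{
    if (atomic_load(&l->blocking_holders) != 0)
        return false;               /* blocking lock held: back off */
    if (atomic_flag_test_and_set_explicit(&l->spin, memory_order_acquire))
        return false;               /* spinlock held: do not spin on it */
    if (atomic_load(&l->blocking_holders) != 0) {
        /* A blocking holder appeared in between: undo and back off. */
        atomic_flag_clear_explicit(&l->spin, memory_order_release);
        return false;
    }
    return true;
}

void tree_unlock(struct tree_lock *l)
{
    atomic_flag_clear_explicit(&l->spin, memory_order_release);
}
```

The behaviour the commit message complains about corresponds to replacing the second check with an unconditional spin, which is what makes a function named "trylock" able to stall.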

Signed-off-by: Chris Mason

Chris Mason
2014-06-20 05:19:55 +0800

18 Jun, 2014

2 commits

f41c5ad2f NFSD: fix bug for readdir of pseudofs

Commit 561f0ed498ca (nfsd4: allow large readdirs) introduced a bug
in readdir of the root of pseudofs.

Call xdr_truncate_encode() to revert the encoded name when skipping.

Signed-off-by: Kinglong Mee
Signed-off-by: J. Bruce Fields

Kinglong Mee
2014-06-18 04:42:48 +0800
6282cd565 NFSD: Don't hand out delegations for 30 seconds after recalling them.

If nfsd needs to recall a delegation for some reason it implies that there is
contention on the file, so further delegations should not be handed out.

The current code fails to do so, and the result is effectively a
live-lock under some workloads: a client attempting a conflicting
operation on a read-delegated file receives NFS4ERR_DELAY and retries
the operation, but by the time it retries the server may already have
given out another delegation.

We could simply avoid delegations for (say) 30 seconds after any recall, but
this is probably too heavy handed.

We could keep a list of inodes (or inode numbers or filehandles) for recalled
delegations, but that requires memory allocation and searching.

The approach taken here is to use a bloom filter to record the filehandles
which are currently blocked from delegation, and to accept the cost of a few
false positives.

We have 2 bloom filters, each of which is valid for 30 seconds. When a
delegation is recalled the filehandle is added to one filter and will remain
disabled for between 30 and 60 seconds.

We keep a count of the number of filehandles that have been added, so when
that count is zero we can bypass all other tests.

The bloom filters have 256 bits and 3 hash functions. This should allow a
couple of dozen blocked filehandles with minimal false positives. If many
more filehandles are all blocked at once, behaviour will degrade towards
rejecting all delegations for between 30 and 60 seconds, then resetting and
allowing new delegations.
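
The two-filter scheme described above can be sketched in userspace C as follows. This is a simplified model under stated assumptions, not the nfsd implementation: the struct layout, the function names, and the seeded FNV-1a hash are all hypothetical stand-ins, time is passed in explicitly so the behaviour is easy to follow, and the entry count is kept per filter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define BLOOM_BITS   256 /* bits per filter, as in the commit message */
#define BLOOM_HASHES 3   /* hash functions per filter */
#define BLOOM_WINDOW 30  /* seconds each filter accepts new entries */

struct blocked_delegations {
    uint8_t  bits[2][BLOOM_BITS / 8];
    unsigned entries[2]; /* filehandles added to each filter */
    int      current;    /* filter receiving new entries */
    time_t   swap_time;  /* when `current` last changed */
};

/* Seeded FNV-1a; an assumed stand-in for whatever hash the kernel uses. */
static uint32_t fh_hash(const void *fh, size_t len, uint32_t seed)
{
    const uint8_t *p = fh;
    uint32_t h = 2166136261u ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h % BLOOM_BITS;
}

/* After 30 s, wipe the older filter and make it the one taking new entries. */
static void bd_rotate(struct blocked_delegations *bd, time_t now)
{
    if (now - bd->swap_time < BLOOM_WINDOW)
        return;
    bd->current = 1 - bd->current;
    memset(bd->bits[bd->current], 0, sizeof(bd->bits[0]));
    bd->entries[bd->current] = 0;
    bd->swap_time = now;
}

/* Record that delegations on this filehandle were just recalled. */
void bd_block(struct blocked_delegations *bd, const void *fh, size_t len, time_t now)
{
    assert(fh != NULL);
    bd_rotate(bd, now);
    for (uint32_t k = 0; k < BLOOM_HASHES; k++) {
        uint32_t bit = fh_hash(fh, len, k);
        bd->bits[bd->current][bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
    bd->entries[bd->current]++;
}

/* True if the filehandle may be blocked (false positives are possible). */
bool bd_is_blocked(struct blocked_delegations *bd, const void *fh, size_t len, time_t now)
{
    if (bd->entries[0] + bd->entries[1] == 0)
        return false; /* nothing recorded: bypass all other tests */
    bd_rotate(bd, now);
    for (int f = 0; f < 2; f++) {
        bool hit = true;
        for (uint32_t k = 0; k < BLOOM_HASHES; k++) {
            uint32_t bit = fh_hash(fh, len, k);
            if (!(bd->bits[f][bit / 8] & (1u << (bit % 8))))
                hit = false;
        }
        if (hit)
            return true;
    }
    return false;
}
```

A filehandle added to the current filter survives one rotation (it is then in the "old" filter) and disappears at the next one, which is where the "between 30 and 60 seconds" window comes from when lookups arrive regularly. Rotation in this sketch is lazy, so with no lookups at all an entry can linger longer.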

Signed-off-by: NeilBrown
Signed-off-by: J. Bruce Fields

NeilBrown
2014-06-18 04:42:47 +0800

17 Jun, 2014

1 commit

ebe06187b epoll: fix use-after-free in eventpoll_release_file

This fixes use-after-free of epi->fllink.next inside list loop macro.
This loop actually releases elements in the body. The list is
rcu-protected but here we cannot hold rcu_read_lock because we need to
lock mutex inside.

The obvious solution is to use list_for_each_entry_safe(). RCU-ness
isn't essential because nobody can change this list under us, it's final
fput for this file.

The bug was introduced by ae10b2b4eb01 ("epoll: optimize EPOLL_CTL_DEL
using rcu")
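
The pattern behind list_for_each_entry_safe() is to read the next pointer before the loop body can free the current element. A minimal userspace sketch with a hand-rolled singly linked list (the `struct node`, `push()` and `release_all()` names are hypothetical; this does not use the kernel list macros):

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Push a new node at the head and return the new head. */
struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->value = value;
    n->next = head;
    return n;
}

/* Free every node and return how many were released. */
size_t release_all(struct node *head)
{
    size_t freed = 0;
    struct node *pos, *next;

    /*
     * The unsafe form, "for (pos = head; pos; pos = pos->next) free(pos);",
     * reads pos->next after pos has already been freed.
     */
    for (pos = head; pos != NULL; pos = next) {
        next = pos->next; /* saved before the body releases pos */
        free(pos);
        freed++;
    }
    return freed;
}
```

As the commit message notes, this is only safe because nothing else can modify the list during the walk; the saved `next` pointer gives no protection against concurrent removal.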

Signed-off-by: Konstantin Khlebnikov
Reported-by: Cyrill Gorcunov
Cc: Stable # 3.13+
Cc: Sasha Levin
Cc: Jason Baron
Signed-off-by: Linus Torvalds

Konstantin Khlebnikov
2014-06-17 11:21:59 +0800

15 Jun, 2014

2 commits

16d52ef7c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs

Pull more btrfs updates from Chris Mason:
"This has a few fixes since our last pull and a new ioctl for doing
btree searches from userland. It's very similar to the existing
ioctl, but lets us return larger items back down to the app"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix error handling in create_pending_snapshot
btrfs: fix use of uninit "ret" in end_extent_writepage()
btrfs: free ulist in qgroup_shared_accounting() error path
Btrfs: fix qgroups sanity test crash or hang
btrfs: prevent RCU warning when dereferencing radix tree slot
Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting
btrfs: new ioctl TREE_SEARCH_V2
btrfs: tree_search, search_ioctl: direct copy to userspace
btrfs: new function read_extent_buffer_to_user
btrfs: tree_search, copy_to_sk: return needed size on EOVERFLOW
btrfs: tree_search, copy_to_sk: return EOVERFLOW for too small buffer
btrfs: tree_search, search_ioctl: accept varying buffer
btrfs: tree_search: eliminate redundant nr_items check

Linus Torvalds
2014-06-15 08:48:43 +0800
a311c4803 Merge git://git.kvack.org/~bcrl/aio-next

Pull aio fix and cleanups from Ben LaHaise:
"This consists of a couple of code cleanups plus a minor bug fix"

* git://git.kvack.org/~bcrl/aio-next:
aio: cleanup: flatten kill_ioctx()
aio: report error from io_destroy() when threads race in io_destroy()
fs/aio.c: Remove ctx parameter in kiocb_cancel

Linus Torvalds
2014-06-15 08:43:27 +0800

14 Jun, 2014

7 commits

47a306a74 btrfs: fix error handling in create_pending_snapshot

fcebe456 cut and pasted some code to a later point
in create_pending_snapshot(), but didn't switch
to the appropriate error handling for this stage
of the function.

Signed-off-by: Eric Sandeen
Signed-off-by: Chris Mason

Eric Sandeen
2014-06-14 00:52:30 +0800
3e2426bd0 btrfs: fix use of uninit "ret" in end_extent_writepage()

If this condition in end_extent_writepage() is false:

if (tree->ops && tree->ops->writepage_end_io_hook)

we will then test an uninitialized "ret" at:

ret = ret < 0 ? ret : -EIO;

The test for ret is for the case where ->writepage_end_io_hook
failed, and we'd choose that ret as the error; but if
there is no ->writepage_end_io_hook, nothing sets ret.

Initializing ret to 0 should be sufficient; if
writepage_end_io_hook wasn't set, (!uptodate) means
non-zero err was passed in, so we choose -EIO in that case.
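
The control flow discussed above can be reduced to a small standalone function. This is a simplified sketch, not the body of end_extent_writepage(): `end_writepage_error()`, `end_io_hook_t` and `failing_hook()` are hypothetical names, and the hook takes only the incoming error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*end_io_hook_t)(int err);

/* Example hook that always reports a failure of its own. */
int failing_hook(int err)
{
    (void)err;
    return -ENOSPC;
}

/* Returns the error to record for the page, or 0 if the write succeeded. */
int end_writepage_error(end_io_hook_t hook, int err)
{
    assert(err <= 0);       /* err is 0 or a negative errno */
    int uptodate = (err == 0);
    int ret = 0;            /* the fix: defined even when no hook is installed */

    if (hook) {
        ret = hook(err);
        if (ret)
            uptodate = 0;
    }
    if (!uptodate)
        ret = ret < 0 ? ret : -EIO; /* prefer the hook's error, else -EIO */
    return ret;
}
```

Without the initializer, the call with no hook and a non-zero `err` would evaluate `ret < 0` on an indeterminate value.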

Signed-off-by: Eric Sandeen

Signed-off-by: Chris Mason

Eric Sandeen
2014-06-14 00:52:28 +0800
d73727809 btrfs: free ulist in qgroup_shared_accounting() error path

If tmp = ulist_alloc(GFP_NOFS) fails, we return without
freeing the previously allocated qgroups = ulist_alloc(GFP_NOFS)
and cause a memory leak.
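
The shape of this fix, releasing the first allocation when the second one fails, can be shown with a small userspace model. The allocator below is a hypothetical test double (`allocs_left` and `live` are demo knobs, not kernel state), and `shared_accounting()` only mirrors the two-allocation structure described in the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

int allocs_left = 2; /* how many allocations may still succeed (demo knob) */
int live = 0;        /* allocations not yet freed */

static void *demo_alloc(void)
{
    void *p;

    if (allocs_left <= 0)
        return NULL;
    p = malloc(16);
    if (p == NULL)
        return NULL;
    allocs_left--;
    live++;
    return p;
}

static void demo_free(void *p)
{
    if (p != NULL) {
        free(p);
        live--;
    }
}

int shared_accounting(void)
{
    void *qgroups, *tmp;

    qgroups = demo_alloc();
    if (qgroups == NULL)
        return -ENOMEM;

    tmp = demo_alloc();
    if (tmp == NULL) {
        demo_free(qgroups); /* the fix: release the first list before bailing out */
        return -ENOMEM;
    }

    /* ... accounting work would go here ... */

    demo_free(tmp);
    demo_free(qgroups);
    return 0;
}
```

Setting `allocs_left` to 1 before the call forces the second allocation to fail; `live` returning to 0 afterwards shows that nothing leaked.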

Signed-off-by: Eric Sandeen
Signed-off-by: Chris Mason

Eric Sandeen
2014-06-14 00:52:26 +0800
b050f9f6d Btrfs: fix qgroups sanity test crash or hang

Often when running the qgroups sanity test, a crash or a hang happened.
This is because the extent buffer the test uses for the root node doesn't
have a header level explicitly set, making it have a random level value.
This is a problem when it's not zero for the btrfs_search_slot() calls
the test ends up doing, resulting in crashes or hangs such as the following:

[ 6454.127192] Btrfs loaded, debug=on, assert=on, integrity-checker=on
(...)
[ 6454.127760] BTRFS: selftest: Running qgroup tests
[ 6454.127964] BTRFS: selftest: Running test_test_no_shared_qgroup
[ 6454.127966] BTRFS: selftest: Qgroup basic add
[ 6480.152005] BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:5383]
[ 6480.152005] Modules linked in: btrfs(+) xor raid6_pq binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc i2c_piix4 i2c_core pcspkr evbug psmouse serio_raw e1000 [last unloaded: btrfs]
[ 6480.152005] irq event stamp: 188448
[ 6480.152005] hardirqs last enabled at (188447): [] restore_args+0x0/0x30
[ 6480.152005] hardirqs last disabled at (188448): [] apic_timer_interrupt+0x6a/0x80
[ 6480.152005] softirqs last enabled at (188446): [] __do_softirq+0x1cf/0x450
[ 6480.152005] softirqs last disabled at (188441): [] irq_exit+0xb5/0xc0
[ 6480.152005] CPU: 0 PID: 5383 Comm: modprobe Not tainted 3.15.0-rc8-fdm-btrfs-next-33+ #4
[ 6480.152005] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 6480.152005] task: ffff8802146125a0 ti: ffff8800d0d00000 task.ti: ffff8800d0d00000
[ 6480.152005] RIP: 0010:[] [] __write_lock_failed+0x13/0x20
[ 6480.152005] RSP: 0018:ffff8800d0d038e8 EFLAGS: 00000287
[ 6480.152005] RAX: 0000000000000000 RBX: ffffffff8168ef5c RCX: 000005deb8525852
[ 6480.152005] RDX: 0000000000000000 RSI: 0000000000001d45 RDI: ffff8802105000b8
[ 6480.152005] RBP: ffff8800d0d038e8 R08: fffffe12710f63db R09: ffffffffa03196fb
[ 6480.152005] R10: ffff8802146125a0 R11: ffff880214612e28 R12: ffff8800d0d03858
[ 6480.152005] R13: 0000000000000000 R14: ffff8800d0d00000 R15: ffff8802146125a0
[ 6480.152005] FS: 00007f14ff804700(0000) GS:ffff880215e00000(0000) knlGS:0000000000000000
[ 6480.152005] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 6480.152005] CR2: 00007fff4df0dac8 CR3: 00000000d1796000 CR4: 00000000000006f0
[ 6480.152005] Stack:
[ 6480.152005] ffff8800d0d03908 ffffffff810ae967 0000000000000001 ffff8802105000b8
[ 6480.152005] ffff8800d0d03938 ffffffff8168e57e ffffffffa0319c16 0000000000000007
[ 6480.152005] ffff880210500000 ffff880210500100 ffff8800d0d039b8 ffffffffa0319c16
[ 6480.152005] Call Trace:
[ 6480.152005] [] do_raw_write_lock+0x47/0xa0
[ 6480.152005] [] _raw_write_lock+0x5e/0x80
[ 6480.152005] [] ? btrfs_tree_lock+0x116/0x270 [btrfs]
[ 6480.152005] [] btrfs_tree_lock+0x116/0x270 [btrfs]
[ 6480.152005] [] btrfs_lock_root_node+0x3b/0x50 [btrfs]
[ 6480.152005] [] btrfs_search_slot+0x916/0xa20 [btrfs]
[ 6480.152005] [] ? create_object+0x23f/0x300
[ 6480.152005] [] btrfs_insert_empty_items+0x78/0xd0 [btrfs]
[ 6480.152005] [] insert_normal_tree_ref.constprop.4+0xa2/0x19a [btrfs]
[ 6480.152005] [] test_no_shared_qgroup+0xb1/0x1ca [btrfs]
[ 6480.152005] [] ? local_clock+0x16/0x30
[ 6480.152005] [] btrfs_test_qgroups+0x1ae/0x1d7 [btrfs]
[ 6480.152005] [] ? ftrace_define_fields_btrfs_space_reservation+0xfd/0xfd [btrfs]
[ 6480.152005] [] init_btrfs_fs+0xb4/0x153 [btrfs]
[ 6480.152005] [] do_one_initcall+0x102/0x150
[ 6480.152005] [] ? set_memory_nx+0x43/0x50
[ 6480.152005] [] ? set_section_ro_nx+0x6d/0x74
[ 6480.152005] [] load_module+0x1cdc/0x2630
(...)

Therefore initialize the extent buffer as an empty leaf (level 0).

Issue easy to reproduce when btrfs is built as a module via:

$ for ((i = 1; i
Signed-off-by: Chris Mason

Filipe Manana
2014-06-14 00:52:24 +0800
f1e3c2894 btrfs: prevent RCU warning when dereferencing radix tree slot

Mark the dereference as protected by lock. Not doing so triggers
an RCU warning since the radix tree assumed that RCU is in use.

Signed-off-by: Sasha Levin
Signed-off-by: Chris Mason

Sasha Levin
2014-06-14 00:52:22 +0800
5fbc7c59f Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting

Steps to reproduce:

# mkfs.btrfs -f /dev/sd[b-f] -m raid5 -d raid5
# mkfs.ext4 /dev/sdc --->corrupt one of btrfs device
# mount /dev/sdb /mnt -o degraded
# btrfs scrub start -BRd /mnt

This is because readahead skips a missing device. That is not valid
for RAID5/6, because REQ_GET_READ_MIRRORS returns 1 for a RAID5/6 block
mapping. If the expected data is located on the missing device, the readahead
thread does not call __readahead_hook(), which makes the event @rc->elems=0
wait forever.

Fix this problem by checking the return value of btrfs_map_block(); we
can only skip a missing device safely if there are several mirrors.

Signed-off-by: Wang Shilong
Signed-off-by: Chris Mason

Wang Shilong
2014-06-14 00:52:21 +0800
cc68a8a5a btrfs: new ioctl TREE_SEARCH_V2

This new ioctl call allows the user to supply a buffer of varying size in which
a tree search can store its results. This is much more flexible if you want to
receive items which are larger than the current fixed buffer of 3992 bytes or
if you want to fetch more items at once. Some items of type EXTENT_CSUM, for
example, are larger than this buffer.

Signed-off-by: Gerhard Heift
Signed-off-by: Chris Mason
Acked-by: David Sterba

Gerhard Heift
2014-06-14 00:52:19 +0800

13 Jun, 2014

10 commits

4bdeb3120 Merge tag 'dlm-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm fix from David Teigland:
"This contains one small fix related to resending SCTP messages"

* tag 'dlm-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: keep listening connection alive with sctp mode

Linus Torvalds
2014-06-13 22:41:57 +0800
6d87c225f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull Ceph updates from Sage Weil:
"This has a mix of bug fixes and cleanups.

Alex's patch fixes a rare race in RBD. Ilya's patches fix an ENOENT
check when a second rbd image is mapped and a couple memory leaks.
Zheng fixes several issues with fragmented directories and multiple
MDSs. Josh fixes a spin/sleep issue, and Josh and Guangliang's
patches fix setting and unsetting RBD images read-only.

Naturally there are several other cleanups mixed in for good measure"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (23 commits)
rbd: only set disk to read-only once
rbd: move calls that may sleep out of spin lock range
rbd: add ioctl for rbd
ceph: use truncate_pagecache() instead of truncate_inode_pages()
ceph: include time stamp in every MDS request
rbd: fix ida/idr memory leak
rbd: use reference counts for image requests
rbd: fix osd_request memory leak in __rbd_dev_header_watch_sync()
rbd: make sure we have latest osdmap on 'rbd map'
libceph: add ceph_monc_wait_osdmap()
libceph: mon_get_version request infrastructure
libceph: recognize poolop requests in debugfs
ceph: refactor readpage_nounlock() to make the logic clearer
mds: check cap ID when handling cap export message
ceph: remember subtree root dirfrag's auth MDS
ceph: introduce ceph_fill_fragtree()
ceph: handle cap import atomically
ceph: pre-allocate ceph_cap struct for ceph_add_cap()
ceph: update inode fields according to issued caps
rbd: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
...

Linus Torvalds
2014-06-13 14:06:23 +0800
3737a1276 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull more perf updates from Ingo Molnar:
"A second round of perf updates:

- wide reaching kprobes sanitization and robustization, with the hope
of fixing all 'probe this function crashes the kernel' bugs, by
Masami Hiramatsu.

- uprobes updates from Oleg Nesterov: tmpfs support, corner case
fixes and robustization work.

- perf tooling updates and fixes from Jiri Olsa, Namhyung Kim, Arnaldo
et al:
* Add support to accumulate hist periods (Namhyung Kim)
* various fixes, refactorings and enhancements"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits)
perf: Differentiate exec() and non-exec() comm events
perf: Fix perf_event_comm() vs. exec() assumption
uprobes/x86: Rename arch_uprobe->def to ->defparam, minor comment updates
perf/documentation: Add description for conditional branch filter
perf/x86: Add conditional branch filtering support
perf/tool: Add conditional branch filter 'cond' to perf record
perf: Add new conditional branch filter 'PERF_SAMPLE_BRANCH_COND'
uprobes: Teach copy_insn() to support tmpfs
uprobes: Shift ->readpage check from __copy_insn() to uprobe_register()
perf/x86: Use common PMU interrupt disabled code
perf/ARM: Use common PMU interrupt disabled code
perf: Disable sampled events if no PMU interrupt
perf: Fix use after free in perf_remove_from_context()
perf tools: Fix 'make help' message error
perf record: Fix poll return value propagation
perf tools: Move elide bool into perf_hpp_fmt struct
perf tools: Remove elide setup for SORT_MODE__MEMORY mode
perf tools: Fix "==" into "=" in ui_browser__warning assignment
perf tools: Allow overriding sysfs and proc finding with env var
perf tools: Consider header files outside perf directory in tags target
...

Linus Torvalds
2014-06-13 10:18:49 +0800
ba346b357 btrfs: tree_search, search_ioctl: direct copy to userspace

By copying each found item separately to userspace, we do not need extra
buffer in the kernel.

Signed-off-by: Gerhard Heift
Signed-off-by: Chris Mason
Acked-by: David Sterba

Gerhard Heift
2014-06-13 09:22:05 +0800
550ac1d85 btrfs: new function read_extent_buffer_to_user

This new function reads the content of an extent directly to user memory.

Signed-off-by: Gerhard Heift
Signed-off-by: Chris Mason
Acked-by: David Sterba

Gerhard Heift
2014-06-13 09:21:56 +0800
9b6e817d0 btrfs: tree_search, copy_to_sk: return needed size on EOVERFLOW

If an item in tree_search is too large to be stored in the given buffer, return
the needed size (including the header).
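
The "report the needed size on overflow" contract can be sketched in userspace C. This is an illustration only: `struct item_header` and `copy_item()` are hypothetical, and the real copy_to_sk works on btrfs search headers and user memory rather than a plain char buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct item_header {
    size_t len; /* payload length; hypothetical stand-in for the search header */
};

/*
 * Copy one item (header + payload) into buf. On success returns 0 and
 * advances *used. If the item does not fit, returns -EOVERFLOW and stores
 * the size the caller would need (header included) in *needed.
 */
int copy_item(char *buf, size_t buf_size, size_t *used,
              const void *payload, size_t payload_len, size_t *needed)
{
    struct item_header hdr = { .len = payload_len };
    size_t total = sizeof(hdr) + payload_len;

    assert(*used <= buf_size);
    if (total > buf_size - *used) {
        *needed = total;
        return -EOVERFLOW;
    }
    memcpy(buf + *used, &hdr, sizeof(hdr));
    memcpy(buf + *used + sizeof(hdr), payload, payload_len);
    *used += total;
    return 0;
}
```

A caller that receives -EOVERFLOW can allocate at least `*needed` bytes and retry, instead of guessing a larger buffer size.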

Signed-off-by: Gerhard Heift
Signed-off-by: Chris Mason
Acked-by: David Sterba

Gerhard Heift
2014-06-13 09:21:47 +0800
8f5f6178f btrfs: tree_search, copy_to_sk: return EOVERFLOW for too small buffer

In copy_to_sk, if an item is too large for the given buffer, it now returns
-EOVERFLOW instead of copying a search_header with len = 0. For backward
compatibility for the first item it still copies such a header to the buffer,
but not any other following items, which could have fitted.

tree_search changes -EOVERFLOW back to 0 to behave similar to the way it
behaved before this patch.

Signed-off-by: Gerhard Heift
Signed-off-by: Chris Mason
Acked-by: David Sterba

Gerhard Heift
2014-06-13 09:21:39 +0800
125444428 btrfs: tree_search, search_ioctl: accept varying buffer

rewrite search_ioctl to accept a buffer with varying size

Signed-off-by: Gerhard Heift
Signed-off-by: Chris Mason
Acked-by: David Sterba

Gerhard Heift
2014-06-13 09:21:26 +0800
25c9bc2e2 btrfs: tree_search: eliminate redundant nr_items check

If the number of items reached the given limit of nr_items, we can leave
copy_to_sk without updating the key. Also by returning 1 we leave the loop in
search_ioctl without rechecking if we reached the given limit.

Signed-off-by: Gerhard Heift
Signed-off-by: Chris Mason
Acked-by: David Sterba

Gerhard Heift
2014-06-13 09:20:39 +0800
16b905780 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs updates from Al Viro:
"This is the bunch that sat in -next + lock_parent() fix. This is the
minimal set; there's more pending stuff.

In particular, I really hope to get acct.c fixes merged this cycle -
we need that to deal sanely with delayed-mntput stuff. In the next
pile, hopefully - that series is fairly short and localized
(kernel/acct.c, fs/super.c and fs/namespace.c). In this pile: more
iov_iter work. Most of prereqs for ->splice_write with sane locking
order are there and Kent's dio rewrite would also fit nicely on top of
this pile"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
lock_parent: don't step on stale ->d_parent of all-but-freed one
kill generic_file_splice_write()
ceph: switch to iter_file_splice_write()
shmem: switch to iter_file_splice_write()
nfs: switch to iter_splice_write_file()
fs/splice.c: remove unneeded exports
ocfs2: switch to iter_file_splice_write()
->splice_write() via ->write_iter()
bio_vec-backed iov_iter
optimize copy_page_{to,from}_iter()
bury generic_file_aio_{read,write}
lustre: get rid of messing with iovecs
ceph: switch to ->write_iter()
ceph_sync_direct_write: stop poking into iov_iter guts
ceph_sync_read: stop poking into iov_iter guts
new helper: copy_page_from_iter()
fuse: switch to ->write_iter()
btrfs: switch to ->write_iter()
ocfs2: switch to ->write_iter()
xfs: switch to ->write_iter()
...

Linus Torvalds
2014-06-13 01:30:18 +0800

12 Jun, 2014

6 commits

883854c54 dlm: keep listening connection alive with sctp mode

The connection struct with nodeid 0 is the listening socket,
not a connection to another node. The sctp resend function
was not checking that the nodeid was valid (non-zero), so it
would mistakenly get and resend on the listening connection
when nodeid was zero.

Signed-off-by: Lidong Zhong
Signed-off-by: David Teigland

Lidong Zhong
2014-06-12 23:26:14 +0800
c2338f2dc lock_parent: don't step on stale ->d_parent of all-but-freed one

Dentry that had been through (or into) __dentry_kill() might be seen
by shrink_dentry_list(); that's normal, it'll be taken off the shrink
list and freed if __dentry_kill() has already finished. The problem
is, its ->d_parent might be pointing to already freed dentry, so
lock_parent() needs to be careful.

We need to check that dentry hasn't already gone into __dentry_kill()
*and* grab rcu_read_lock() before dropping ->d_lock - the latter makes
sure that whatever we see in ->d_parent after dropping ->d_lock it
won't be freed until we drop rcu_read_lock().

Signed-off-by: Al Viro

Al Viro
2014-06-12 12:29:13 +0800
9c1d5284c Merge commit '9f12600fe425bc28f0ccba034a77783c09c15af4' into for-linus

Backmerge of dcache.c changes from mainline. It's that, or complete
rebase...

Conflicts:
fs/splice.c

Signed-off-by: Al Viro

Al Viro
2014-06-12 12:28:09 +0800
5f0738506 kill generic_file_splice_write()

no callers left

Signed-off-by: Al Viro

Al Viro
2014-06-12 12:21:13 +0800
3551dd79a ceph: switch to iter_file_splice_write()

Signed-off-by: Al Viro

Al Viro
2014-06-12 12:21:12 +0800
4da54c218 nfs: switch to iter_splice_write_file()

Signed-off-by: Al Viro

Al Viro
2014-06-12 12:21:11 +0800