Eric Lee / smarc-fsl-linux-kernel

26 Dec, 2016

2 commits

629060270 mm: add PageWaiters indicating tasks are waiting for a page bit ... Browse Code »

Add a new page flag, PageWaiters, to indicate the page waitqueue has
tasks waiting. This can be tested rather than testing waitqueue_active
which requires another cacheline load.

This bit is always set when the page has tasks on page_waitqueue(page),
and is set and cleared under the waitqueue lock. It may be set when
there are no tasks on the waitqueue, which will cause a harmless extra
wakeup check that will clears the bit.

The generic bit-waitqueue infrastructure is no longer used for pages.
Instead, waitqueues are used directly with a custom key type. The
generic code was not flexible enough to have PageWaiters manipulation
under the waitqueue lock (which simplifies concurrency).

This improves the performance of page lock intensive microbenchmarks by
2-3%.

Putting two bits in the same word opens the opportunity to remove the
memory barrier between clearing the lock bit and testing the waiters
bit, after some work on the arch primitives (e.g., ensuring memory
operand widths match and cover both bits).

Signed-off-by: Nicholas Piggin
Cc: Dave Hansen
Cc: Bob Peterson
Cc: Steven Whitehouse
Cc: Andrew Lutomirski
Cc: Andreas Gruenbacher
Cc: Peter Zijlstra
Cc: Mel Gorman
Signed-off-by: Linus Torvalds

Nicholas Piggin
2016-12-26 03:54:48 +0800
6326fec11 mm: Use owner_priv bit for PageSwapCache, valid when PageSwapBacked ... Browse Code »

A page is not added to the swap cache without being swap backed,
so PageSwapBacked mappings can use PG_owner_priv_1 for PageSwapCache.

Signed-off-by: Nicholas Piggin
Acked-by: Hugh Dickins
Cc: Dave Hansen
Cc: Bob Peterson
Cc: Steven Whitehouse
Cc: Andrew Lutomirski
Cc: Andreas Gruenbacher
Cc: Peter Zijlstra
Cc: Mel Gorman
Signed-off-by: Linus Torvalds

Nicholas Piggin
2016-12-26 03:54:48 +0800

25 Dec, 2016

1 commit

7c0f6ba68 Replace <asm/uaccess.h> with <linux/uaccess.h> globally ... Browse Code »

This was entirely automated, using the script by Al:

PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*'
sed -i -e "s!$PATT!#include !" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro
Signed-off-by: Linus Torvalds

Linus Torvalds
2016-12-25 03:46:01 +0800

21 Dec, 2016

1 commit

4dd72b4a4 mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED ... Browse Code »

When FADV_DONTNEED cannot drop all pages in the range, it observes that
some pages might still be on per-cpu LRU caches after recent
instantiation and so initiates remote calls to all CPUs to flush their
local caches. However, in most cases, the fadvise happens from the same
context that instantiated the pages, and any pre-LRU pages in the
specified range are most likely sitting on the local CPU's LRU cache,
and so in many cases this results in unnecessary remote calls, which, in
a loaded system, can hold up the fadvise() call significantly.

[ I didn't record it in the extreme case we observed at Facebook,
unfortunately. We had a slow-to-respond system and noticed it
lru_add_drain_all() leading the profile during fadvise calls. This
patch came out of thinking about the code and how we commonly call
FADV_DONTNEED.

FWIW, I wrote a silly directory tree walker/searcher that recurses
through /usr to read and FADV_DONTNEED each file it finds. On a 2
socket 40 ht machine, over 1% is spent in lru_add_drain_all(). With
the patch, that cost is gone; the local drain cost shows at 0.09%. ]

Try to avoid the remote call by flushing the local LRU cache before even
attempting to invalidate anything. It's a cheap operation, and the
local LRU cache is the most likely to hold any pre-LRU pages in the
specified fadvise range.

Link: http://lkml.kernel.org/r/20161214210017.GA1465@cmpxchg.org
Signed-off-by: Johannes Weiner
Acked-by: Vlastimil Babka
Acked-by: Mel Gorman
Acked-by: Hillf Danton
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Johannes Weiner
2016-12-21 01:48:46 +0800

18 Dec, 2016

1 commit

231753ef7 Merge uncontroversial parts of branch 'readlink' of git://git.kernel.org/pub/scm… ... Browse Code »

…/linux/kernel/git/mszeredi/vfs

Pull partial readlink cleanups from Miklos Szeredi.

This is the uncontroversial part of the readlink cleanup patch-set that
simplifies the default readlink handling.

Miklos and Al are still discussing the rest of the series.

* git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
vfs: make generic_readlink() static
vfs: remove ".readlink = generic_readlink" assignments
vfs: default to generic_readlink()
vfs: replace calling i_op->readlink with vfs_readlink()
proc/self: use generic_readlink
ecryptfs: use vfs_get_link()
bad_inode: add missing i_op initializers

Linus Torvalds
2016-12-18 11:16:12 +0800

15 Dec, 2016

29 commits

a57cb1c1d Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge more updates from Andrew Morton:

- a few misc things

- kexec updates

- DMA-mapping updates to better support networking DMA operations

- IPC updates

- various MM changes to improve DAX fault handling

- lots of radix-tree changes, mainly to the test suite. All leading up
to reimplementing the IDA/IDR code to be a wrapper layer over the
radix-tree. However the final trigger-pulling patch is held off for
4.11.

* emailed patches from Andrew Morton : (114 commits)
radix tree test suite: delete unused rcupdate.c
radix tree test suite: add new tag check
radix-tree: ensure counts are initialised
radix tree test suite: cache recently freed objects
radix tree test suite: add some more functionality
idr: reduce the number of bits per level from 8 to 6
rxrpc: abstract away knowledge of IDR internals
tpm: use idr_find(), not idr_find_slowpath()
idr: add ida_is_empty
radix tree test suite: check multiorder iteration
radix-tree: fix replacement for multiorder entries
radix-tree: add radix_tree_split_preload()
radix-tree: add radix_tree_split
radix-tree: add radix_tree_join
radix-tree: delete radix_tree_range_tag_if_tagged()
radix-tree: delete radix_tree_locate_item()
radix-tree: improve multiorder iterators
btrfs: fix race in btrfs_free_dummy_fs_info()
radix-tree: improve dump output
radix-tree: make radix_tree_find_next_bit more useful
...

Linus Torvalds
2016-12-15 09:25:18 +0800
268f42de7 radix-tree: delete radix_tree_range_tag_if_tagged() ... Browse Code »

This is an exceptionally complicated function with just one caller
(tag_pages_for_writeback). We devote a large portion of the runtime of
the test suite to testing this one function which has one caller. By
introducing the new function radix_tree_iter_tag_set(), we can eliminate
all of the complexity while keeping the performance. The caller can now
use a fairly standard radix_tree_for_each() loop, and it doesn't need to
worry about tricksy things like 'start' wrapping.

The test suite continues to spend a large amount of time investigating
this function, but now it's testing the underlying primitives such as
radix_tree_iter_resume() and the radix_tree_for_each_tagged() iterator
which are also used by other parts of the kernel.

Link: http://lkml.kernel.org/r/1480369871-5271-57-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox
Tested-by: Kirill A. Shutemov
Cc: Konstantin Khlebnikov
Cc: Ross Zwisler
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-12-15 08:04:10 +0800
478922e2b radix-tree: delete radix_tree_locate_item() ... Browse Code »

This rather complicated function can be better implemented as an
iterator. It has only one caller, so move the functionality to the only
place that needs it. Update the test suite to follow the same pattern.

Link: http://lkml.kernel.org/r/1480369871-5271-56-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox
Acked-by: Konstantin Khlebnikov
Tested-by: Kirill A. Shutemov
Cc: Ross Zwisler
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-12-15 08:04:10 +0800
148deab22 radix-tree: improve multiorder iterators ... Browse Code »

This fixes several interlinked problems with the iterators in the
presence of multiorder entries.

1. radix_tree_iter_next() would only advance by one slot, which would
result in the iterators returning the same entry more than once if
there were sibling entries.

2. radix_tree_next_slot() could return an internal pointer instead of
a user pointer if a tagged multiorder entry was immediately followed by
an entry of lower order.

3. radix_tree_next_slot() expanded to a lot more code than it used to
when multiorder support was compiled in. And I wasn't comfortable with
entry_to_node() being in a header file.

Fixing radix_tree_iter_next() for the presence of sibling entries
necessarily involves examining the contents of the radix tree, so we now
need to pass 'slot' to radix_tree_iter_next(), and we need to change the
calling convention so it is called *before* dropping the lock which
protects the tree. Also rename it to radix_tree_iter_resume(), as some
people thought it was necessary to call radix_tree_iter_next() each time
around the loop.

radix_tree_next_slot() becomes closer to how it looked before multiorder
support was introduced. It only checks to see if the next entry in the
chunk is a sibling entry or a pointer to a node; this should be rare
enough that handling this case out of line is not a performance impact
(and such impact is amortised by the fact that the entry we just
processed was a multiorder entry). Also, radix_tree_next_slot() used to
force a new chunk lookup for untagged entries, which is more expensive
than the out of line sibling entry skipping.

Link: http://lkml.kernel.org/r/1480369871-5271-55-git-send-email-mawilcox@linuxonhyperv.com
Signed-off-by: Matthew Wilcox
Tested-by: Kirill A. Shutemov
Cc: Konstantin Khlebnikov
Cc: Ross Zwisler
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Matthew Wilcox
2016-12-15 08:04:10 +0800
2f89dc12a dax: protect PTE modification on WP fault by radix tree entry lock ... Browse Code »

Currently PTE gets updated in wp_pfn_shared() after dax_pfn_mkwrite()
has released corresponding radix tree entry lock. When we want to
writeprotect PTE on cache flush, we need PTE modification to happen
under radix tree entry lock to ensure consistent updates of PTE and
radix tree (standard faults use page lock to ensure this consistency).
So move update of PTE bit into dax_pfn_mkwrite().

Link: http://lkml.kernel.org/r/1479460644-25076-20-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Cc: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
cae124025 mm: export follow_pte() ... Browse Code »

DAX will need to implement its own version of page_check_address(). To
avoid duplicating page table walking code, export follow_pte() which
does what we need.

Link: http://lkml.kernel.org/r/1479460644-25076-18-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Cc: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
a19e25536 mm: change return values of finish_mkwrite_fault() ... Browse Code »

Currently finish_mkwrite_fault() returns 0 when PTE got changed before
we acquired PTE lock and VM_FAULT_WRITE when we succeeded in modifying
the PTE. This is somewhat confusing since 0 generally means success, it
is also inconsistent with finish_fault() which returns 0 on success.
Change finish_mkwrite_fault() to return 0 on success and VM_FAULT_NOPAGE
when PTE changed. Practically, there should be no behavioral difference
since we bail out from the fault the same way regardless whether we
return 0, VM_FAULT_NOPAGE, or VM_FAULT_WRITE. Also note that
VM_FAULT_WRITE has no effect for shared mappings since the only two
places that check it - KSM and GUP - care about private mappings only.
Generally the meaning of VM_FAULT_WRITE for shared mappings is not well
defined and we should probably clean that up.

Link: http://lkml.kernel.org/r/1479460644-25076-17-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Acked-by: Kirill A. Shutemov
Cc: Ross Zwisler
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
66a6197c1 mm: provide helper for finishing mkwrite faults ... Browse Code »

Provide a helper function for finishing write faults due to PTE being
read-only. The helper will be used by DAX to avoid the need of
complicating generic MM code with DAX locking specifics.

Link: http://lkml.kernel.org/r/1479460644-25076-16-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
997dd98dd mm: move part of wp_page_reuse() into the single call site ... Browse Code »

wp_page_reuse() handles write shared faults which is needed only in
wp_page_shared(). Move the handling only into that location to make
wp_page_reuse() simpler and avoid a strange situation when we sometimes
pass in locked page, sometimes unlocked etc.

Link: http://lkml.kernel.org/r/1479460644-25076-15-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
a41b70d6d mm: use vmf->page during WP faults ... Browse Code »

So far we set vmf->page during WP faults only when we needed to pass it
to the ->page_mkwrite handler. Set it in all the cases now and use that
instead of passing page pointer explicitly around.

Link: http://lkml.kernel.org/r/1479460644-25076-14-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
38b8cb7fb mm: pass vm_fault structure into do_page_mkwrite() ... Browse Code »

We will need more information in the ->page_mkwrite() helper for DAX to
be able to fully finish faults there. Pass vm_fault structure to
do_page_mkwrite() and use it there so that information propagates
properly from upper layers.

Link: http://lkml.kernel.org/r/1479460644-25076-13-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
97ba0c2b4 mm: factor out common parts of write fault handling ... Browse Code »

Currently we duplicate handling of shared write faults in
wp_page_reuse() and do_shared_fault(). Factor them out into a common
function.

Link: http://lkml.kernel.org/r/1479460644-25076-12-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
b1aa812b2 mm: move handling of COW faults into DAX code ... Browse Code »

Move final handling of COW faults from generic code into DAX fault
handler. That way generic code doesn't have to be aware of
peculiarities of DAX locking so remove that knowledge and make locking
functions private to fs/dax.c.

Link: http://lkml.kernel.org/r/1479460644-25076-11-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Acked-by: Kirill A. Shutemov
Reviewed-by: Ross Zwisler
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
9118c0cbd mm: factor out functionality to finish page faults ... Browse Code »

Introduce finish_fault() as a helper function for finishing page faults.
It is rather thin wrapper around alloc_set_pte() but since we'd want to
call this from DAX code or filesystems, it is still useful to avoid some
boilerplate code.

Link: http://lkml.kernel.org/r/1479460644-25076-10-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
3917048d4 mm: allow full handling of COW faults in ->fault handlers ... Browse Code »

Patch series "dax: Clear dirty bits after flushing caches", v5.

Patchset to clear dirty bits from radix tree of DAX inodes when caches
for corresponding pfns have been flushed. In principle, these patches
enable handlers to easily update PTEs and do other work necessary to
finish the fault without duplicating the functionality present in the
generic code. I'd like to thank Kirill and Ross for reviews of the
series!

This patch (of 20):

To allow full handling of COW faults add memcg field to struct vm_fault
and a return value of ->fault() handler meaning that COW fault is fully
handled and memcg charge must not be canceled. This will allow us to
remove knowledge about special DAX locking from the generic fault code.

Link: http://lkml.kernel.org/r/1479460644-25076-9-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
2994302bc mm: add orig_pte field into vm_fault ... Browse Code »

Add orig_pte field to vm_fault structure to allow ->page_mkwrite
handlers to fully handle the fault.

This also allows us to save some passing of extra arguments around.

Link: http://lkml.kernel.org/r/1479460644-25076-8-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
fe82221f5 mm: use passed vm_fault structure for in wp_pfn_shared() ... Browse Code »

Instead of creating another vm_fault structure, use the one passed to
wp_pfn_shared() for passing arguments into pfn_mkwrite handler.

Link: http://lkml.kernel.org/r/1479460644-25076-7-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
936ca80d3 mm: trim __do_fault() arguments ... Browse Code »

Use vm_fault structure to pass cow_page, page, and entry in and out of
the function.

That reduces number of __do_fault() arguments from 4 to 1.

Link: http://lkml.kernel.org/r/1479460644-25076-6-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
667240e0f mm: use passed vm_fault structure in __do_fault() ... Browse Code »

Instead of creating another vm_fault structure, use the one passed to
__do_fault() for passing arguments into fault handler.

Link: http://lkml.kernel.org/r/1479460644-25076-5-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
0721ec8bc mm: use pgoff in struct vm_fault instead of passing it separately ... Browse Code »

struct vm_fault has already pgoff entry. Use it instead of passing
pgoff as a separate argument and then assigning it later.

Link: http://lkml.kernel.org/r/1479460644-25076-4-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Reviewed-by: Ross Zwisler
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
1a29d85eb mm: use vmf->address instead of of vmf->virtual_address ... Browse Code »

Every single user of vmf->virtual_address typed that entry to unsigned
long before doing anything with it so the type of virtual_address does
not really provide us any additional safety. Just use masked
vmf->address which already has the appropriate type.

Link: http://lkml.kernel.org/r/1479460644-25076-3-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Acked-by: Kirill A. Shutemov
Cc: Dan Williams
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
82b0f8c39 mm: join struct fault_env and vm_fault ... Browse Code »

Currently we have two different structures for passing fault information
around - struct vm_fault and struct fault_env. DAX will need more
information in struct vm_fault to handle its faults so the content of
that structure would become event closer to fault_env. Furthermore it
would need to generate struct fault_env to be able to call some of the
generic functions. So at this point I don't think there's much use in
keeping these two structures separate. Just embed into struct vm_fault
all that is needed to use it for both purposes.

Link: http://lkml.kernel.org/r/1479460644-25076-2-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara
Acked-by: Kirill A. Shutemov
Cc: Ross Zwisler
Cc: Dan Williams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2016-12-15 08:04:09 +0800
8b7457ef9 mm: unexport __get_user_pages_unlocked() ... Browse Code »

Unexport the low-level __get_user_pages_unlocked() function and replaces
invocations with calls to more appropriate higher-level functions.

In hva_to_pfn_slow() we are able to replace __get_user_pages_unlocked()
with get_user_pages_unlocked() since we can now pass gup_flags.

In async_pf_execute() and process_vm_rw_single_vec() we need to pass
different tsk, mm arguments so get_user_pages_remote() is the sane
replacement in these cases (having added manual acquisition and release
of mmap_sem.)

Additionally get_user_pages_remote() reintroduces use of the FOLL_TOUCH
flag. However, this flag was originally silently dropped by commit
1e9877902dc7 ("mm/gup: Introduce get_user_pages_remote()"), so this
appears to have been unintentional and reintroducing it is therefore not
an issue.

[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20161027095141.2569-3-lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes
Acked-by: Michal Hocko
Cc: Jan Kara
Cc: Hugh Dickins
Cc: Dave Hansen
Cc: Rik van Riel
Cc: Mel Gorman
Cc: Paolo Bonzini
Cc: Radim Krcmar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lorenzo Stoakes
2016-12-15 08:04:09 +0800
5b56d49fc mm: add locked parameter to get_user_pages_remote() ... Browse Code »

Patch series "mm: unexport __get_user_pages_unlocked()".

This patch series continues the cleanup of get_user_pages*() functions
taking advantage of the fact we can now pass gup_flags as we please.

It firstly adds an additional 'locked' parameter to
get_user_pages_remote() to allow for its callers to utilise
VM_FAULT_RETRY functionality. This is necessary as the invocation of
__get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of
this and no other existing higher level function would allow it to do
so.

Secondly existing callers of __get_user_pages_unlocked() are replaced
with the appropriate higher-level replacement -
get_user_pages_unlocked() if the current task and memory descriptor are
referenced, or get_user_pages_remote() if other task/memory descriptors
are referenced (having acquiring mmap_sem.)

This patch (of 2):

Add a int *locked parameter to get_user_pages_remote() to allow
VM_FAULT_RETRY faulting behaviour similar to get_user_pages_[un]locked().

Taking into account the previous adjustments to get_user_pages*()
functions allowing for the passing of gup_flags, we are now in a
position where __get_user_pages_unlocked() need only be exported for his
ability to allow VM_FAULT_RETRY behaviour, this adjustment allows us to
subsequently unexport __get_user_pages_unlocked() as well as allowing
for future flexibility in the use of get_user_pages_remote().

[sfr@canb.auug.org.au: merge fix for get_user_pages_remote API change]
Link: http://lkml.kernel.org/r/20161122210511.024ec341@canb.auug.org.au
Link: http://lkml.kernel.org/r/20161027095141.2569-2-lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes
Acked-by: Michal Hocko
Cc: Jan Kara
Cc: Hugh Dickins
Cc: Dave Hansen
Cc: Rik van Riel
Cc: Mel Gorman
Cc: Paolo Bonzini
Cc: Radim Krcmar
Signed-off-by: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Lorenzo Stoakes
2016-12-15 08:04:08 +0800
44fdffd70 mm: add support for releasing multiple instances of a page ... Browse Code »

Add a function that allows us to batch free a page that has multiple
references outstanding. Specifically this function can be used to drop
a page being used in the page frag alloc cache. With this drivers can
make use of functionality similar to the page frag alloc cache without
having to do any workarounds for the fact that there is no function that
frees multiple references.

Link: http://lkml.kernel.org/r/20161110113606.76501.70752.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck
Cc: "David S. Miller"
Cc: "James E.J. Bottomley"
Cc: Chris Metcalf
Cc: David Howells
Cc: Geert Uytterhoeven
Cc: Hans-Christian Noren Egtvedt
Cc: Helge Deller
Cc: James Hogan
Cc: Jeff Kirsher
Cc: Jonas Bonn
Cc: Keguang Zhang
Cc: Ley Foon Tan
Cc: Mark Salter
Cc: Max Filippov
Cc: Michael Ellerman
Cc: Michal Simek
Cc: Ralf Baechle
Cc: Rich Felker
Cc: Richard Kuo
Cc: Russell King
Cc: Steven Miao
Cc: Tobias Klauser
Cc: Vineet Gupta
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Alexander Duyck
2016-12-15 08:04:08 +0800
73e64c51a mm, compaction: allow compaction for GFP_NOFS requests ... Browse Code »

compaction has been disabled for GFP_NOFS and GFP_NOIO requests since
the direct compaction was introduced by commit 56de7263fcf3 ("mm:
compaction: direct compact when a high-order allocation fails"). The
main reason is that the migration of page cache pages might recurse back
to fs/io layer and we could potentially deadlock. This is overly
conservative because all the anonymous memory is migrateable in the
GFP_NOFS context just fine. This might be a large portion of the memory
in many/most workkloads.

Remove the GFP_NOFS restriction and make sure that we skip all fs pages
(those with a mapping) while isolating pages to be migrated. We cannot
consider clean fs pages because they might need a metadata update so
only isolate pages without any mapping for nofs requests.

The effect of this patch will be probably very limited in many/most
workloads because higher order GFP_NOFS requests are quite rare,
although different configurations might lead to very different results.
David Chinner has mentioned a heavy metadata workload with 64kB block
which to quote him:

: Unfortunately, there was an era of cargo cult configuration tweaks in the
: Ceph community that has resulted in a large number of production machines
: with XFS filesystems configured this way. And a lot of them store large
: numbers of small files and run under significant sustained memory
: pressure.
:
: I slowly working towards getting rid of these high order allocations and
: replacing them with the equivalent number of single page allocations, but
: I haven't got that (complex) change working yet.

We can do the following to simulate that workload:
$ mkfs.xfs -f -n size=64k
$ mount /mnt/scratch
$ time ./fs_mark -D 10000 -S0 -n 100000 -s 0 -L 32 \
-d /mnt/scratch/0 -d /mnt/scratch/1 \
-d /mnt/scratch/2 -d /mnt/scratch/3 \
-d /mnt/scratch/4 -d /mnt/scratch/5 \
-d /mnt/scratch/6 -d /mnt/scratch/7 \
-d /mnt/scratch/8 -d /mnt/scratch/9 \
-d /mnt/scratch/10 -d /mnt/scratch/11 \
-d /mnt/scratch/12 -d /mnt/scratch/13 \
-d /mnt/scratch/14 -d /mnt/scratch/15

and indeed is hammers the system with many high order GFP_NOFS requests as
per a simle tracepoint during the load:
$ echo '!(gfp_flags & 0x80) && (gfp_flags &0x400000)' > $TRACE_MNT/events/kmem/mm_page_alloc/filter
I am getting
5287609 order=0
37 order=1
1594905 order=2
3048439 order=3
6699207 order=4
66645 order=5

My testing was done in a kvm guest so performance numbers should be
taken with a grain of salt but there seems to be a difference when the
patch is applied:

* Original kernel
FSUse% Count Size Files/sec App Overhead
1 1600000 0 4300.1 20745838
3 3200000 0 4239.9 23849857
5 4800000 0 4243.4 25939543
6 6400000 0 4248.4 19514050
8 8000000 0 4262.1 20796169
9 9600000 0 4257.6 21288675
11 11200000 0 4259.7 19375120
13 12800000 0 4220.7 22734141
14 14400000 0 4238.5 31936458
16 16000000 0 4231.5 23409901
18 17600000 0 4045.3 23577700
19 19200000 0 2783.4 58299526
21 20800000 0 2678.2 40616302
23 22400000 0 2693.5 83973996

and xfs complaining about memory allocation not making progress
[ 2304.372647] XFS: fs_mark(3289) possible memory allocation deadlock size 65624 in kmem_alloc (mode:0x2408240)
[ 2304.443323] XFS: fs_mark(3285) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x2408240)
[ 4796.772477] XFS: fs_mark(3424) possible memory allocation deadlock size 46936 in kmem_alloc (mode:0x2408240)
[ 4796.775329] XFS: fs_mark(3423) possible memory allocation deadlock size 51416 in kmem_alloc (mode:0x2408240)
[ 4797.388808] XFS: fs_mark(3424) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x2408240)

* Patched kernel
FSUse% Count Size Files/sec App Overhead
1 1600000 0 4289.1 19243934
3 3200000 0 4241.6 32828865
5 4800000 0 4248.7 32884693
6 6400000 0 4314.4 19608921
8 8000000 0 4269.9 24953292
9 9600000 0 4270.7 33235572
11 11200000 0 4346.4 40817101
13 12800000 0 4285.3 29972397
14 14400000 0 4297.2 20539765
16 16000000 0 4219.6 18596767
18 17600000 0 4273.8 49611187
19 19200000 0 4300.4 27944451
21 20800000 0 4270.6 22324585
22 22400000 0 4317.6 22650382
24 24000000 0 4065.2 22297964

So the dropdown at Count 19200000 didn't happen and there was only a
single warning about allocation not making progress
[ 3063.815003] XFS: fs_mark(3272) possible memory allocation deadlock size 65624 in kmem_alloc (mode:0x2408240)

This suggests that the patch has helped even though there is not all that
much of anonymous memory as the workload mostly generates fs metadata. I
assume the success rate would be higher with more anonymous memory which
should be the case in many workloads.

[akpm@linux-foundation.org: fix comment]
Link: http://lkml.kernel.org/r/20161012114721.31853-1-mhocko@kernel.org
Signed-off-by: Michal Hocko
Acked-by: Vlastimil Babka
Cc: Mel Gorman
Cc: Joonsoo Kim
Cc: Dave Chinner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2016-12-15 08:04:07 +0800
412ac77a9 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull namespace updates from Eric Biederman:
"After a lot of discussion and work we have finally reachanged a basic
understanding of what is necessary to make unprivileged mounts safe in
the presence of EVM and IMA xattrs which the last commit in this
series reflects. While technically it is a revert the comments it adds
are important for people not getting confused in the future. Clearing
up that confusion allows us to seriously work on unprivileged mounts
of fuse in the next development cycle.

The rest of the fixes in this set are in the intersection of user
namespaces, ptrace, and exec. I started with the first fix which
started a feedback cycle of finding additional issues during review
and fixing them. Culiminating in a fix for a bug that has been present
since at least Linux v1.0.

Potentially these fixes were candidates for being merged during the rc
cycle, and are certainly backport candidates but enough little things
turned up during review and testing that I decided they should be
handled as part of the normal development process just to be certain
there were not any great surprises when it came time to backport some
of these fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
Revert "evm: Translate user/group ids relative to s_user_ns when computing HMAC"
exec: Ensure mm->user_ns contains the execed files
ptrace: Don't allow accessing an undumpable mm
ptrace: Capture the ptracer's creds not PT_PTRACE_CAP
mm: Add a user_ns owner to mm_struct and fix ptrace permission checks

Linus Torvalds
2016-12-15 06:09:48 +0800
d05c5f7ba vfs,mm: fix return value of read() at s_maxbytes ... Browse Code »

We truncated the possible read iterator to s_maxbytes in commit
c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()"),
but our end condition handling was wrong: it's not an error to try to
read at the end of the file.

Reading past the end should return EOF (0), not EINVAL.

See for example

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1649342
http://lists.gnu.org/archive/html/bug-coreutils/2016-12/msg00008.html

where a md5sum of a maximally sized file fails because the final read is
exactly at s_maxbytes.

Fixes: c2a9737f45e2 ("vfs,mm: fix a dead loop in truncate_inode_pages_range()")
Reported-by: Joseph Salisbury
Cc: Wei Fang
Cc: Christoph Hellwig
Cc: Dave Chinner
Cc: Al Viro
Cc: Andrew Morton
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds

Linus Torvalds
2016-12-15 04:45:25 +0800
5084fdf08 Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 ... Browse Code »

Pull ext4 updates from Ted Ts'o:
"This merge request includes the dax-4.0-iomap-pmd branch which is
needed for both ext4 and xfs dax changes to use iomap for DAX. It also
includes the fscrypt branch which is needed for ubifs encryption work
as well as ext4 encryption and fscrypt cleanups.

Lots of cleanups and bug fixes, especially making sure ext4 is robust
against maliciously corrupted file systems --- especially maliciously
corrupted xattr blocks and a maliciously corrupted superblock. Also
fix ext4 support for 64k block sizes so it works well on ppcle. Fixed
mbcache so we don't miss some common xattr blocks that can be merged"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits)
dax: Fix sleep in atomic contex in grab_mapping_entry()
fscrypt: Rename FS_WRITE_PATH_FL to FS_CTX_HAS_BOUNCE_BUFFER_FL
fscrypt: Delay bounce page pool allocation until needed
fscrypt: Cleanup page locking requirements for fscrypt_{decrypt,encrypt}_page()
fscrypt: Cleanup fscrypt_{decrypt,encrypt}_page()
fscrypt: Never allocate fscrypt_ctx on in-place encryption
fscrypt: Use correct index in decrypt path.
fscrypt: move the policy flags and encryption mode definitions to uapi header
fscrypt: move non-public structures and constants to fscrypt_private.h
fscrypt: unexport fscrypt_initialize()
fscrypt: rename get_crypt_info() to fscrypt_get_crypt_info()
fscrypto: move ioctl processing more fully into common code
fscrypto: remove unneeded Kconfig dependencies
MAINTAINERS: fscrypto: recommend linux-fsdevel for fscrypto patches
ext4: do not perform data journaling when data is encrypted
ext4: return -ENOMEM instead of success
ext4: reject inodes with negative size
ext4: remove another test in ext4_alloc_file_blocks()
Documentation: fix description of ext4's block_validity mount option
ext4: fix checks for data=ordered and journal_async_commit options
...

Linus Torvalds
2016-12-15 01:17:42 +0800

14 Dec, 2016

5 commits

c11a6cfb0 Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq ... Browse Code »

Pull workqueue updates from Tejun Heo:
"Mostly patches to initialize workqueue subsystem earlier and get rid
of keventd_up().

The patches were headed for the last merge cycle but got delayed due
to a bug found late minute, which is fixed now.

Also, to help debugging, destroy_workqueue() is more chatty now on a
sanity check failure."

* 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: move wq_numa_init() to workqueue_init()
workqueue: remove keventd_up()
debugobj, workqueue: remove keventd_up() usage
slab, workqueue: remove keventd_up() usage
power, workqueue: remove keventd_up() usage
tty, workqueue: remove keventd_up() usage
mce, workqueue: remove keventd_up() usage
workqueue: make workqueue available early during boot
workqueue: dump workqueue state on sanity check failures in destroy_workqueue()

Linus Torvalds
2016-12-14 04:59:57 +0800
e6efef726 Merge branch 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu ... Browse Code »

Pull percpu update from Tejun Heo:
"This includes just one patch to reject non-power-of-2 alignments and
trigger warning. Interestingly, this actually caught a bug in XEN
ARM64"

* 'for-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: ensure the requested alignment is power of two

Linus Torvalds
2016-12-14 04:34:47 +0800
b78b499a6 Merge tag 'char-misc-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc ... Browse Code »

Pull char/misc driver updates from Greg KH:
"Here's the big char/misc driver patches for 4.10-rc1. Lots of tiny
changes over lots of "minor" driver subsystems, the largest being some
new FPGA drivers. Other than that, a few other new drivers, but no new
driver subsystems added for this kernel cycle, a nice change.

All of these have been in linux-next with no reported issues"

* tag 'char-misc-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (107 commits)
uio-hv-generic: store physical addresses instead of virtual
Tools: hv: kvp: configurable external scripts path
uio-hv-generic: new userspace i/o driver for VMBus
vmbus: add support for dynamic device id's
hv: change clockevents unbind tactics
hv: acquire vmbus_connection.channel_mutex in vmbus_free_channels()
hyperv: Fix spelling of HV_UNKOWN
mei: bus: enable non-blocking RX
mei: fix the back to back interrupt handling
mei: synchronize irq before initiating a reset.
VME: Remove shutdown entry from vme_driver
auxdisplay: ht16k33: select framebuffer helper modules
MAINTAINERS: add git url for fpga
fpga: Clarify how write_init works streaming modes
fpga zynq: Fix incorrect ISR state on bootup
fpga zynq: Remove priv->dev
fpga zynq: Add missing \n to messages
fpga: Add COMPILE_TEST to all drivers
uio: pruss: add clk_disable()
char/pcmcia: add some error checking in scr24x_read()
...

Linus Torvalds
2016-12-14 04:11:01 +0800
7b9dc3f75 Merge tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull power management updates from Rafael Wysocki:
"Again, cpufreq gets more changes than the other parts this time (one
new driver, one old driver less, a bunch of enhancements of the
existing code, new CPU IDs, fixes, cleanups)

There also are some changes in cpuidle (idle injection rework, a
couple of new CPU IDs, online/offline rework in intel_idle, fixes and
cleanups), in the generic power domains framework (mostly related to
supporting power domains containing CPUs), and in the Operating
Performance Points (OPP) library (mostly related to supporting devices
with multiple voltage regulators)

In addition to that, the system sleep state selection interface is
modified to make it easier for distributions with unchanged user space
to support suspend-to-idle as the default system suspend method, some
issues are fixed in the PM core, the latency tolerance PM QoS
framework is improved a bit, the Intel RAPL power capping driver is
cleaned up and there are some fixes and cleanups in the devfreq
subsystem

Specifics:

- New cpufreq driver for Broadcom STB SoCs and a Device Tree binding
for it (Markus Mayer)

- Support for ARM Integrator/AP and Integrator/CP in the generic DT
cpufreq driver and elimination of the old Integrator cpufreq driver
(Linus Walleij)

- Support for the zx296718, r8a7743 and r8a7745, Socionext UniPhier,
and PXA SoCs in the the generic DT cpufreq driver (Baoyou Xie,
Geert Uytterhoeven, Masahiro Yamada, Robert Jarzmik)

- cpufreq core fix to eliminate races that may lead to using inactive
policy objects and related cleanups (Rafael Wysocki)

- cpufreq schedutil governor update to make it use SCHED_FIFO kernel
threads (instead of regular workqueues) for doing delayed work (to
reduce the response latency in some cases) and related cleanups
(Viresh Kumar)

- New cpufreq sysfs attribute for resetting statistics (Markus Mayer)

- cpufreq governors fixes and cleanups (Chen Yu, Stratos Karafotis,
Viresh Kumar)

- Support for using generic cpufreq governors in the intel_pstate
driver (Rafael Wysocki)

- Support for per-logical-CPU P-state limits and the EPP/EPB (Energy
Performance Preference/Energy Performance Bias) knobs in the
intel_pstate driver (Srinivas Pandruvada)

- New CPU ID for Knights Mill in intel_pstate (Piotr Luc)

- intel_pstate driver modification to use the P-state selection
algorithm based on CPU load on platforms with the system profile in
the ACPI tables set to "mobile" (Srinivas Pandruvada)

- intel_pstate driver cleanups (Arnd Bergmann, Rafael Wysocki,
Srinivas Pandruvada)

- cpufreq powernv driver updates including fast switching support
(for the schedutil governor), fixes and cleanus (Akshay Adiga,
Andrew Donnellan, Denis Kirjanov)

- acpi-cpufreq driver rework to switch it over to the new CPU
offline/online state machine (Sebastian Andrzej Siewior)

- Assorted cleanups in cpufreq drivers (Wei Yongjun, Prashanth
Prakash)

- Idle injection rework (to make it use the regular idle path instead
of a home-grown custom one) and related powerclamp thermal driver
updates (Peter Zijlstra, Jacob Pan, Petr Mladek, Sebastian Andrzej
Siewior)

- New CPU IDs for Atom Z34xx and Knights Mill in intel_idle (Andy
Shevchenko, Piotr Luc)

- intel_idle driver cleanups and switch over to using the new CPU
offline/online state machine (Anna-Maria Gleixner, Sebastian
Andrzej Siewior)

- cpuidle DT driver update to support suspend-to-idle properly
(Sudeep Holla)

- cpuidle core cleanups and misc updates (Daniel Lezcano, Pan Bian,
Rafael Wysocki)

- Preliminary support for power domains including CPUs in the generic
power domains (genpd) framework and related DT bindings (Lina Iyer)

- Assorted fixes and cleanups in the generic power domains (genpd)
framework (Colin Ian King, Dan Carpenter, Geert Uytterhoeven)

- Preliminary support for devices with multiple voltage regulators
and related fixes and cleanups in the Operating Performance Points
(OPP) library (Viresh Kumar, Masahiro Yamada, Stephen Boyd)

- System sleep state selection interface rework to make it easier to
support suspend-to-idle as the default system suspend method
(Rafael Wysocki)

- PM core fixes and cleanups, mostly related to the interactions
between the system suspend and runtime PM frameworks (Ulf Hansson,
Sahitya Tummala, Tony Lindgren)

- Latency tolerance PM QoS framework imorovements (Andrew Lutomirski)

- New Knights Mill CPU ID for the Intel RAPL power capping driver
(Piotr Luc)

- Intel RAPL power capping driver fixes, cleanups and switch over to
using the new CPU offline/online state machine (Jacob Pan, Thomas
Gleixner, Sebastian Andrzej Siewior)

- Fixes and cleanups in the exynos-ppmu, exynos-nocp, rk3399_dmc,
rockchip-dfi devfreq drivers and the devfreq core (Axel Lin,
Chanwoo Choi, Javier Martinez Canillas, MyungJoo Ham, Viresh Kumar)

- Fix for false-positive KASAN warnings during resume from ACPI S3
(suspend-to-RAM) on x86 (Josh Poimboeuf)

- Memory map verification during resume from hibernation on x86 to
ensure a consistent address space layout (Chen Yu)

- Wakeup sources debugging enhancement (Xing Wei)

- rockchip-io AVS driver cleanup (Shawn Lin)"

* tag 'pm-4.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (127 commits)
devfreq: rk3399_dmc: Don't use OPP structures outside of RCU locks
devfreq: rk3399_dmc: Remove dangling rcu_read_unlock()
devfreq: exynos: Don't use OPP structures outside of RCU locks
Documentation: intel_pstate: Document HWP energy/performance hints
cpufreq: intel_pstate: Support for energy performance hints with HWP
cpufreq: intel_pstate: Add locking around HWP requests
PM / sleep: Print active wakeup sources when blocking on wakeup_count reads
PM / core: Fix bug in the error handling of async suspend
PM / wakeirq: Fix dedicated wakeirq for drivers not using autosuspend
PM / Domains: Fix compatible for domain idle state
PM / OPP: Don't WARN on multiple calls to dev_pm_opp_set_regulators()
PM / OPP: Allow platform specific custom set_opp() callbacks
PM / OPP: Separate out _generic_set_opp()
PM / OPP: Add infrastructure to manage multiple regulators
PM / OPP: Pass struct dev_pm_opp_supply to _set_opp_voltage()
PM / OPP: Manage supply's voltage/current in a separate structure
PM / OPP: Don't use OPP structure outside of rcu protected section
PM / OPP: Reword binding supporting multiple regulators per device
PM / OPP: Fix incorrect cpu-supply property in binding
cpuidle: Add a kerneldoc comment to cpuidle_use_deepest_state()
..

Linus Torvalds
2016-12-14 02:41:53 +0800
36869cb93 Merge branch 'for-4.10/block' of git://git.kernel.dk/linux-block ... Browse Code »

Pull block layer updates from Jens Axboe:
"This is the main block pull request this series. Contrary to previous
release, I've kept the core and driver changes in the same branch. We
always ended up having dependencies between the two for obvious
reasons, so makes more sense to keep them together. That said, I'll
probably try and keep more topical branches going forward, especially
for cycles that end up being as busy as this one.

The major parts of this pull request is:

- Improved support for O_DIRECT on block devices, with a small
private implementation instead of using the pig that is
fs/direct-io.c. From Christoph.

- Request completion tracking in a scalable fashion. This is utilized
by two components in this pull, the new hybrid polling and the
writeback queue throttling code.

- Improved support for polling with O_DIRECT, adding a hybrid mode
that combines pure polling with an initial sleep. From me.

- Support for automatic throttling of writeback queues on the block
side. This uses feedback from the device completion latencies to
scale the queue on the block side up or down. From me.

- Support from SMR drives in the block layer and for SD. From Hannes
and Shaun.

- Multi-connection support for nbd. From Josef.

- Cleanup of request and bio flags, so we have a clear split between
which are bio (or rq) private, and which ones are shared. From
Christoph.

- A set of patches from Bart, that improve how we handle queue
stopping and starting in blk-mq.

- Support for WRITE_ZEROES from Chaitanya.

- Lightnvm updates from Javier/Matias.

- Supoort for FC for the nvme-over-fabrics code. From James Smart.

- A bunch of fixes from a whole slew of people, too many to name
here"

* 'for-4.10/block' of git://git.kernel.dk/linux-block: (182 commits)
blk-stat: fix a few cases of missing batch flushing
blk-flush: run the queue when inserting blk-mq flush
elevator: make the rqhash helpers exported
blk-mq: abstract out blk_mq_dispatch_rq_list() helper
blk-mq: add blk_mq_start_stopped_hw_queue()
block: improve handling of the magic discard payload
blk-wbt: don't throttle discard or write zeroes
nbd: use dev_err_ratelimited in io path
nbd: reset the setup task for NBD_CLEAR_SOCK
nvme-fabrics: Add FC LLDD loopback driver to test FC-NVME
nvme-fabrics: Add target support for FC transport
nvme-fabrics: Add host support for FC transport
nvme-fabrics: Add FC transport LLDD api definitions
nvme-fabrics: Add FC transport FC-NVME definitions
nvme-fabrics: Add FC transport error codes to nvme.h
Add type 0x28 NVME type code to scsi fc headers
nvme-fabrics: patch target code in prep for FC transport support
nvme-fabrics: set sqe.command_id in core not transports
parser: add u64 number parser
nvme-rdma: align to generic ib_event logging helper
...

Linus Torvalds
2016-12-14 02:19:16 +0800

13 Dec, 2016

1 commit

e34bac726 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge updates from Andrew Morton:

- various misc bits

- most of MM (quite a lot of MM material is awaiting the merge of
linux-next dependencies)

- kasan

- printk updates

- procfs updates

- MAINTAINERS

- /lib updates

- checkpatch updates

* emailed patches from Andrew Morton : (123 commits)
init: reduce rootwait polling interval time to 5ms
binfmt_elf: use vmalloc() for allocation of vma_filesz
checkpatch: don't emit unified-diff error for rename-only patches
checkpatch: don't check c99 types like uint8_t under tools
checkpatch: avoid multiple line dereferences
checkpatch: don't check .pl files, improve absolute path commit log test
scripts/checkpatch.pl: fix spelling
checkpatch: don't try to get maintained status when --no-tree is given
lib/ida: document locking requirements a bit better
lib/rbtree.c: fix typo in comment of ____rb_erase_color
lib/Kconfig.debug: make CONFIG_STRICT_DEVMEM depend on CONFIG_DEVMEM
MAINTAINERS: add drm and drm/i915 irc channels
MAINTAINERS: add "C:" for URI for chat where developers hang out
MAINTAINERS: add drm and drm/i915 bug filing info
MAINTAINERS: add "B:" for URI where to file bugs
get_maintainer: look for arbitrary letter prefixes in sections
printk: add Kconfig option to set default console loglevel
printk/sound: handle more message headers
printk/btrfs: handle more message headers
printk/kdb: handle more message headers
...

Linus Torvalds
2016-12-13 12:50:02 +0800