Eric Lee / smarc-fsl-linux-kernel

15 Jan, 2020

1 commit

3981d85a9 xtensa: Implement copy_thread_tls ... Browse Code »

commit c346b94f8c5d1b7d637522c908209de93305a8eb upstream.

This is required for clone3 which passes the TLS value through a
struct rather than a register.

Signed-off-by: Amanieu d'Antras
Cc: linux-xtensa@linux-xtensa.org
Cc: # 5.3.x
Link: https://lore.kernel.org/r/20200102172413.654385-7-amanieu@gmail.com
Signed-off-by: Christian Brauner
Signed-off-by: Greg Kroah-Hartman

Amanieu d'Antras
2020-01-15 03:08:35 +0800

21 Dec, 2019

3 commits

bae1e4713 xtensa: fix syscall_set_return_value ... Browse Code »

commit c2d9aa3b6e56de56c7f1ed9026ca6ec7cfbeef19 upstream.

syscall return value is in the register a2, not a0.

Cc: stable@vger.kernel.org # v5.0+
Fixes: 9f24f3c1067c ("xtensa: implement tracehook functions and enable HAVE_ARCH_TRACEHOOK")
Signed-off-by: Max Filippov
Signed-off-by: Greg Kroah-Hartman

Max Filippov
2019-12-21 18:04:35 +0800
147128e77 xtensa: fix TLB sanity checker ... Browse Code »

commit 36de10c4788efc6efe6ff9aa10d38cb7eea4c818 upstream.

Virtual and translated addresses retrieved by the xtensa TLB sanity
checker must be consistent, i.e. correspond to the same state of the
checked TLB entry. KASAN shadow memory is mapped dynamically using
auto-refill TLB entries and thus may change TLB state between the
virtual and translated address retrieval, resulting in false TLB
insanity report.
Move read_xtlb_translation close to read_xtlb_virtual to make sure that
read values are consistent.

Cc: stable@vger.kernel.org
Fixes: a99e07ee5e88 ("xtensa: check TLB sanity on return to userspace")
Signed-off-by: Max Filippov
Signed-off-by: Greg Kroah-Hartman

Max Filippov
2019-12-21 18:04:34 +0800
1948e76af xtensa: use MEMBLOCK_ALLOC_ANYWHERE for KASAN shadow map ... Browse Code »

commit e64681b487c897ec871465083bf0874087d47b66 upstream.

KASAN shadow map doesn't need to be accessible through the linear kernel
mapping, allocate its pages with MEMBLOCK_ALLOC_ANYWHERE so that high
memory can be used. This frees up to ~100MB of low memory on xtensa
configurations with KASAN and high memory.

Cc: stable@vger.kernel.org # v5.1+
Fixes: f240ec09bb8a ("memblock: replace memblock_alloc_base(ANYWHERE) with memblock_phys_alloc")
Reviewed-by: Mike Rapoport
Signed-off-by: Max Filippov
Signed-off-by: Greg Kroah-Hartman

Max Filippov
2019-12-21 18:04:33 +0800

16 Oct, 2019

2 commits

775fd6bfe xtensa: fix change_bit in exclusive access option ... Browse Code »

change_bit implementation for XCHAL_HAVE_EXCLUSIVE case changes all bits
except the one required due to copy-paste error from clear_bit.

Cc: stable@vger.kernel.org # v5.2+
Fixes: f7c34874f04a ("xtensa: add exclusive atomics support")
Signed-off-by: Max Filippov

Max Filippov
2019-10-16 15:14:33 +0800
0c401fdf2 xtensa: virt: fix PCI IO ports mapping ... Browse Code »

virt device tree incorrectly uses 0xf0000000 on both sides of PCI IO
ports address space mapping. This results in incorrect port address
assignment in PCI IO BARs and subsequent crash on attempt to access
them. Use 0 as base address in PCI IO ports address space.

Signed-off-by: Max Filippov

Max Filippov
2019-10-16 05:29:53 +0800

15 Oct, 2019

4 commits

8b39da985 xtensa: drop EXPORT_SYMBOL for outs*/ins* ... Browse Code »

Custom outs*/ins* implementations are long gone from the xtensa port,
remove matching EXPORT_SYMBOLs.
This fixes the following build warnings issued by modpost since commit
15bfc2348d54 ("modpost: check for static EXPORT_SYMBOL* functions"):

WARNING: "insb" [vmlinux] is a static EXPORT_SYMBOL
WARNING: "insw" [vmlinux] is a static EXPORT_SYMBOL
WARNING: "insl" [vmlinux] is a static EXPORT_SYMBOL
WARNING: "outsb" [vmlinux] is a static EXPORT_SYMBOL
WARNING: "outsw" [vmlinux] is a static EXPORT_SYMBOL
WARNING: "outsl" [vmlinux] is a static EXPORT_SYMBOL

Cc: stable@vger.kernel.org
Fixes: d38efc1f150f ("xtensa: adopt generic io routines")
Signed-off-by: Max Filippov

Max Filippov
2019-10-15 07:02:04 +0800
c9c63f3c7 xtensa: fix type conversion in __get_user_[no]check ... Browse Code »

__get_user_[no]check uses temporary buffer of type long to store result
of __get_user_size and do sign extension on it when necessary. This
doesn't work correctly for 64-bit data. Fix it by moving temporary
buffer/sign extension logic to __get_user_asm.

Don't do assignment of __get_user_bad result to (x) as it may not always
be integer-compatible now and issue warning even when it's going to be
optimized. Instead do (x) = 0; and call __get_user_bad separately.

Zero initialize __x in __get_user_asm and use '+' constraint for its
assembly argument, so that its value is preserved in error cases. This
may add at most 1 cycle to the fast path, but saves an instruction and
two padding bytes in the fixup section for each use of this macro and
works for both misaligned store and store exception.

Signed-off-by: Max Filippov

Max Filippov
2019-10-15 05:14:21 +0800
c04376429 xtensa: clean up assembly arguments in uaccess macros ... Browse Code »

Numeric assembly arguments are hard to understand and assembly code that
uses them is hard to modify. Use named arguments in __check_align_*,
__get_user_asm and __put_user_asm. Modify macro parameter names so that
they don't affect argument names. Use '+' constraint for the [err]
argument instead of having it as both input and output.

Signed-off-by: Max Filippov

Max Filippov
2019-10-15 03:58:06 +0800
6595d144d xtensa: fix {get,put}_user() for 64bit values ... Browse Code »

First of all, on short copies __copy_{to,from}_user() return the amount
of bytes left uncopied, *not* -EFAULT. get_user() and put_user() are
expected to return -EFAULT on failure.

Another problem is get_user(v32, (__u64 __user *)p); that should
fetch 64bit value and the assign it to v32, truncating it in process.
Current code, OTOH, reads 8 bytes of data and stores them at the
address of v32, stomping on the 4 bytes that follow v32 itself.

Signed-off-by: Al Viro
Signed-off-by: Max Filippov

Al Viro
2019-10-15 02:39:50 +0800

27 Sep, 2019

1 commit

b4ed71f55 mm: treewide: clarify pgtable_page_{ctor,dtor}() naming ... Browse Code »

The naming of pgtable_page_{ctor,dtor}() seems to have confused a few
people, and until recently arm64 used these erroneously/pointlessly for
other levels of page table.

To make it incredibly clear that these only apply to the PTE level, and to
align with the naming of pgtable_pmd_page_{ctor,dtor}(), let's rename them
to pgtable_pte_page_{ctor,dtor}().

These changes were generated with the following shell script:

----
git grep -lw 'pgtable_page_.tor' | while read FILE; do
sed -i '{s/pgtable_page_ctor/pgtable_pte_page_ctor/}' $FILE;
sed -i '{s/pgtable_page_dtor/pgtable_pte_page_dtor/}' $FILE;
done
----

... with the documentation re-flowed to remain under 80 columns, and
whitespace fixed up in macros to keep backslashes aligned.

There should be no functional change as a result of this patch.

Link: http://lkml.kernel.org/r/20190722141133.3116-1-mark.rutland@arm.com
Signed-off-by: Mark Rutland
Reviewed-by: Mike Rapoport
Acked-by: Geert Uytterhoeven [m68k]
Cc: Anshuman Khandual
Cc: Matthew Wilcox
Cc: Michal Hocko
Cc: Yu Zhao
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mark Rutland
2019-09-27 01:10:44 +0800

26 Sep, 2019

2 commits

1a4e58cce mm: introduce MADV_PAGEOUT ... Browse Code »

When a process expects no accesses to a certain memory range for a long
time, it could hint kernel that the pages can be reclaimed instantly but
data should be preserved for future use. This could reduce workingset
eviction so it ends up increasing performance.

This patch introduces the new MADV_PAGEOUT hint to madvise(2) syscall.
MADV_PAGEOUT can be used by a process to mark a memory range as not
expected to be used for a long time so that kernel reclaims *any LRU*
pages instantly. The hint can help kernel in deciding which pages to
evict proactively.

A note: It doesn't apply SWAP_CLUSTER_MAX LRU page isolation limit
intentionally because it's automatically bounded by PMD size. If PMD
size(e.g., 256) makes some trouble, we could fix it later by limit it to
SWAP_CLUSTER_MAX[1].

- man-page material

MADV_PAGEOUT (since Linux x.x)

Do not expect access in the near future so pages in the specified
regions could be reclaimed instantly regardless of memory pressure.
Thus, access in the range after successful operation could cause
major page fault but never lose the up-to-date contents unlike
MADV_DONTNEED. Pages belonging to a shared mapping are only processed
if a write access is allowed for the calling process.

MADV_PAGEOUT cannot be applied to locked pages, Huge TLB pages, or
VM_PFNMAP pages.

[1] https://lore.kernel.org/lkml/20190710194719.GS29695@dhcp22.suse.cz/

[minchan@kernel.org: clear PG_active on MADV_PAGEOUT]
Link: http://lkml.kernel.org/r/20190802200643.GA181880@google.com
[akpm@linux-foundation.org: resolve conflicts with hmm.git]
Link: http://lkml.kernel.org/r/20190726023435.214162-5-minchan@kernel.org
Signed-off-by: Minchan Kim
Reported-by: kbuild test robot
Acked-by: Michal Hocko
Cc: James E.J. Bottomley
Cc: Richard Henderson
Cc: Ralf Baechle
Cc: Chris Zankel
Cc: Daniel Colascione
Cc: Dave Hansen
Cc: Hillf Danton
Cc: Joel Fernandes (Google)
Cc: Johannes Weiner
Cc: Kirill A. Shutemov
Cc: Oleksandr Natalenko
Cc: Shakeel Butt
Cc: Sonny Rao
Cc: Suren Baghdasaryan
Cc: Tim Murray
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Minchan Kim
2019-09-26 08:51:41 +0800
9c276cc65 mm: introduce MADV_COLD ... Browse Code »

Patch series "Introduce MADV_COLD and MADV_PAGEOUT", v7.

- Background

The Android terminology used for forking a new process and starting an app
from scratch is a cold start, while resuming an existing app is a hot
start. While we continually try to improve the performance of cold
starts, hot starts will always be significantly less power hungry as well
as faster so we are trying to make hot start more likely than cold start.

To increase hot start, Android userspace manages the order that apps
should be killed in a process called ActivityManagerService.
ActivityManagerService tracks every Android app or service that the user
could be interacting with at any time and translates that into a ranked
list for lmkd(low memory killer daemon). They are likely to be killed by
lmkd if the system has to reclaim memory. In that sense they are similar
to entries in any other cache. Those apps are kept alive for
opportunistic performance improvements but those performance improvements
will vary based on the memory requirements of individual workloads.

- Problem

Naturally, cached apps were dominant consumers of memory on the system.
However, they were not significant consumers of swap even though they are
good candidate for swap. Under investigation, swapping out only begins
once the low zone watermark is hit and kswapd wakes up, but the overall
allocation rate in the system might trip lmkd thresholds and cause a
cached process to be killed(we measured performance swapping out vs.
zapping the memory by killing a process. Unsurprisingly, zapping is 10x
times faster even though we use zram which is much faster than real
storage) so kill from lmkd will often satisfy the high zone watermark,
resulting in very few pages actually being moved to swap.

- Approach

The approach we chose was to use a new interface to allow userspace to
proactively reclaim entire processes by leveraging platform information.
This allowed us to bypass the inaccuracy of the kernel’s LRUs for pages
that are known to be cold from userspace and to avoid races with lmkd by
reclaiming apps as soon as they entered the cached state. Additionally,
it could provide many chances for platform to use much information to
optimize memory efficiency.

To achieve the goal, the patchset introduce two new options for madvise.
One is MADV_COLD which will deactivate activated pages and the other is
MADV_PAGEOUT which will reclaim private pages instantly. These new
options complement MADV_DONTNEED and MADV_FREE by adding non-destructive
ways to gain some free memory space. MADV_PAGEOUT is similar to
MADV_DONTNEED in a way that it hints the kernel that memory region is not
currently needed and should be reclaimed immediately; MADV_COLD is similar
to MADV_FREE in a way that it hints the kernel that memory region is not
currently needed and should be reclaimed when memory pressure rises.

This patch (of 5):

When a process expects no accesses to a certain memory range, it could
give a hint to kernel that the pages can be reclaimed when memory pressure
happens but data should be preserved for future use. This could reduce
workingset eviction so it ends up increasing performance.

This patch introduces the new MADV_COLD hint to madvise(2) syscall.
MADV_COLD can be used by a process to mark a memory range as not expected
to be used in the near future. The hint can help kernel in deciding which
pages to evict early during memory pressure.

It works for every LRU pages like MADV_[DONTNEED|FREE]. IOW, It moves

active file page -> inactive file LRU
active anon page -> inacdtive anon LRU

Unlike MADV_FREE, it doesn't move active anonymous pages to inactive file
LRU's head because MADV_COLD is a little bit different symantic.
MADV_FREE means it's okay to discard when the memory pressure because the
content of the page is *garbage* so freeing such pages is almost zero
overhead since we don't need to swap out and access afterward causes just
minor fault. Thus, it would make sense to put those freeable pages in
inactive file LRU to compete other used-once pages. It makes sense for
implmentaion point of view, too because it's not swapbacked memory any
longer until it would be re-dirtied. Even, it could give a bonus to make
them be reclaimed on swapless system. However, MADV_COLD doesn't mean
garbage so reclaiming them requires swap-out/in in the end so it's bigger
cost. Since we have designed VM LRU aging based on cost-model, anonymous
cold pages would be better to position inactive anon's LRU list, not file
LRU. Furthermore, it would help to avoid unnecessary scanning if system
doesn't have a swap device. Let's start simpler way without adding
complexity at this moment. However, keep in mind, too that it's a caveat
that workloads with a lot of pages cache are likely to ignore MADV_COLD on
anonymous memory because we rarely age anonymous LRU lists.

* man-page material

MADV_COLD (since Linux x.x)

Pages in the specified regions will be treated as less-recently-accessed
compared to pages in the system with similar access frequencies. In
contrast to MADV_FREE, the contents of the region are preserved regardless
of subsequent writes to pages.

MADV_COLD cannot be applied to locked pages, Huge TLB pages, or VM_PFNMAP
pages.

[akpm@linux-foundation.org: resolve conflicts with hmm.git]
Link: http://lkml.kernel.org/r/20190726023435.214162-2-minchan@kernel.org
Signed-off-by: Minchan Kim
Reported-by: kbuild test robot
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
Cc: James E.J. Bottomley
Cc: Richard Henderson
Cc: Ralf Baechle
Cc: Chris Zankel
Cc: Johannes Weiner
Cc: Daniel Colascione
Cc: Dave Hansen
Cc: Hillf Danton
Cc: Joel Fernandes (Google)
Cc: Kirill A. Shutemov
Cc: Oleksandr Natalenko
Cc: Shakeel Butt
Cc: Sonny Rao
Cc: Suren Baghdasaryan
Cc: Tim Murray
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Minchan Kim
2019-09-26 08:51:41 +0800

25 Sep, 2019

2 commits

782de70c4 mm: consolidate pgtable_cache_init() and pgd_cache_init() ... Browse Code »

Both pgtable_cache_init() and pgd_cache_init() are used to initialize kmem
cache for page table allocations on several architectures that do not use
PAGE_SIZE tables for one or more levels of the page table hierarchy.

Most architectures do not implement these functions and use __weak default
NOP implementation of pgd_cache_init(). Since there is no such default
for pgtable_cache_init(), its empty stub is duplicated among most
architectures.

Rename the definitions of pgd_cache_init() to pgtable_cache_init() and
drop empty stubs of pgtable_cache_init().

Link: http://lkml.kernel.org/r/1566457046-22637-1-git-send-email-rppt@linux.ibm.com
Signed-off-by: Mike Rapoport
Acked-by: Will Deacon [arm64]
Acked-by: Thomas Gleixner [x86]
Cc: Catalin Marinas
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mike Rapoport
2019-09-25 06:54:09 +0800
13224794c mm: remove quicklist page table caches ... Browse Code »

Patch series "mm: remove quicklist page table caches".

A while ago Nicholas proposed to remove quicklist page table caches [1].

I've rebased his patch on the curren upstream and switched ia64 and sh to
use generic versions of PTE allocation.

[1] https://lore.kernel.org/linux-mm/20190711030339.20892-1-npiggin@gmail.com

This patch (of 3):

Remove page table allocator "quicklists". These have been around for a
long time, but have not got much traction in the last decade and are only
used on ia64 and sh architectures.

The numbers in the initial commit look interesting but probably don't
apply anymore. If anybody wants to resurrect this it's in the git
history, but it's unhelpful to have this code and divergent allocator
behaviour for minor archs.

Also it might be better to instead make more general improvements to page
allocator if this is still so slow.

Link: http://lkml.kernel.org/r/1565250728-21721-2-git-send-email-rppt@linux.ibm.com
Signed-off-by: Nicholas Piggin
Signed-off-by: Mike Rapoport
Cc: Tony Luck
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nicholas Piggin
2019-09-25 06:54:09 +0800

20 Sep, 2019

1 commit

671df1895 Merge tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping ... Browse Code »

Pull dma-mapping updates from Christoph Hellwig:

- add dma-mapping and block layer helpers to take care of IOMMU merging
for mmc plus subsequent fixups (Yoshihiro Shimoda)

- rework handling of the pgprot bits for remapping (me)

- take care of the dma direct infrastructure for swiotlb-xen (me)

- improve the dma noncoherent remapping infrastructure (me)

- better defaults for ->mmap, ->get_sgtable and ->get_required_mask
(me)

- cleanup mmaping of coherent DMA allocations (me)

- various misc cleanups (Andy Shevchenko, me)

* tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits)
mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE
mmc: queue: Fix bigger segments usage
arm64: use asm-generic/dma-mapping.h
swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page
swiotlb-xen: simplify cache maintainance
swiotlb-xen: use the same foreign page check everywhere
swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable
xen: remove the exports for xen_{create,destroy}_contiguous_region
xen/arm: remove xen_dma_ops
xen/arm: simplify dma_cache_maint
xen/arm: use dev_is_dma_coherent
xen/arm: consolidate page-coherent.h
xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance
arm: remove wrappers for the generic dma remap helpers
dma-mapping: introduce a dma_common_find_pages helper
dma-mapping: always use VM_DMA_COHERENT for generic DMA remap
vmalloc: lift the arm flag for coherent mappings to common code
dma-mapping: provide a better default ->get_required_mask
dma-mapping: remove the dma_declare_coherent_memory export
remoteproc: don't allow modular build
...

Linus Torvalds
2019-09-20 04:27:23 +0800

04 Sep, 2019

2 commits

512317401 dma-mapping: always use VM_DMA_COHERENT for generic DMA remap ... Browse Code »

Currently the generic dma remap allocator gets a vm_flags passed by
the caller that is a little confusing. We just introduced a generic
vmalloc-level flag to identify the dma coherent allocations, so use
that everywhere and remove the now pointless argument.

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2019-09-04 17:13:20 +0800
62fcee9a3 dma-mapping: remove CONFIG_ARCH_NO_COHERENT_DMA_MMAP ... Browse Code »

CONFIG_ARCH_NO_COHERENT_DMA_MMAP is now functionally identical to
!CONFIG_MMU, so remove the separate symbol. The only difference is that
arm did not set it for !CONFIG_MMU, but arm uses a separate dma mapping
implementation including its own mmap method, which is handled by moving
the CONFIG_MMU check in dma_can_mmap so that is only applies to the
dma-direct case, just as the other ifdefs for it.

Signed-off-by: Christoph Hellwig
Acked-by: Geert Uytterhoeven # m68k

Christoph Hellwig
2019-09-04 17:13:18 +0800

02 Sep, 2019

3 commits

982792f45 xtensa: virt: move PCI root complex to KIO range ... Browse Code »

Move PCI configuration space, MMIO and memory to the KIO range to free
vmalloc area and use static TLB to access them. Move MMIO to the
beginning of KIO and define PCI_IOBASE as XCHAL_KIO_BYPASS_VADDR to
match it. Reduce number of supported PCI buses to 0x3f so that ECAM
window fits into first 64MB of the KIO. Reduce size of the PCI memory
window to 128MB so that it fits into KIO.

Signed-off-by: Max Filippov

Max Filippov
2019-09-02 15:09:30 +0800
09f8a6db2 xtensa: add support for call0 ABI in userspace ... Browse Code »

Provide a Kconfig choice to select whether only the default ABI, only
call0 ABI or both are supported. The default for XEA2 is windowed, but
it may change for XEA3. Call0 only runs userspace with PS.WOE disabled.
Supporting both windowed and call0 ABIs is tricky, as there's no
indication in the ELF binaries which ABI they use. So it is done by
probing: each process is started with PS.WOE disabled, but the handler
of an illegal instruction exception taken with PS.WOE retries faulting
instruction after enabling PS.WOE. It must happen before any signal is
delivered to the process, otherwise it may be delivered incorrectly.

Signed-off-by: Max Filippov

Max Filippov
2019-09-02 04:11:57 +0800
9e1e41c44 xtensa: clean up PS_WOE_BIT usage ... Browse Code »

PS_WOE_BIT is mainly used to generate PS.WOE mask in the code. Introduce
PS_WOE_MASK macro and use it instead.

Signed-off-by: Max Filippov

Max Filippov
2019-09-02 04:11:57 +0800

27 Aug, 2019

1 commit

f348f5c23 xtensa: remove free_initrd_mem ... Browse Code »

The xtensa free_initrd_mem() verifies that initrd is mapped and then
frees its memory using free_reserved_area().

The initrd is considered mapped when its memory was successfully reserved
with mem_reserve().

Resetting initrd_start to 0 in case of mem_reserve() failure allows to
switch to generic free_initrd_mem() implementation.

Signed-off-by: Mike Rapoport
Message-Id:
Signed-off-by: Max Filippov

Mike Rapoport
2019-08-27 09:19:04 +0800

13 Aug, 2019

1 commit

cd8869f4c xtensa: add missing isync to the cpu_reset TLB code ... Browse Code »

ITLB entry modifications must be followed by the isync instruction
before the new entries are possibly used. cpu_reset lacks one isync
between ITLB way 6 initialization and jump to the identity mapping.
Add missing isync to xtensa cpu_reset.

Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov

Max Filippov
2019-08-13 06:05:48 +0800

25 Jul, 2019

1 commit

e3cacb73e xtensa: fix build for cores with coprocessors ... Browse Code »

Assembly entry/return abstraction change didn't add asmmacro.h include
statement to coprocessor.S, resulting in references to undefined macros
abi_entry and abi_ret on cores that define XTENSA_HAVE_COPROCESSORS.
Fix that by including asm/asmmacro.h from the coprocessor.S.

Signed-off-by: Max Filippov

Max Filippov
2019-07-25 08:44:42 +0800

17 Jul, 2019

4 commits

57a8ec387 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge more updates from Andrew Morton:
"VM:
- z3fold fixes and enhancements by Henry Burns and Vitaly Wool

- more accurate reclaimed slab caches calculations by Yafang Shao

- fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by
Christoph Hellwig

- !CONFIG_MMU fixes by Christoph Hellwig

- new novmcoredd parameter to omit device dumps from vmcore, by
Kairui Song

- new test_meminit module for testing heap and pagealloc
initialization, by Alexander Potapenko

- ioremap improvements for huge mappings, by Anshuman Khandual

- generalize kprobe page fault handling, by Anshuman Khandual

- device-dax hotplug fixes and improvements, by Pavel Tatashin

- enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V

- add pte_devmap() support for arm64, by Robin Murphy

- unify locked_vm accounting with a helper, by Daniel Jordan

- several misc fixes

core/lib:
- new typeof_member() macro including some users, by Alexey Dobriyan

- make BIT() and GENMASK() available in asm, by Masahiro Yamada

- changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better
code generation, by Alexey Dobriyan

- rbtree code size optimizations, by Michel Lespinasse

- convert struct pid count to refcount_t, by Joel Fernandes

get_maintainer.pl:
- add --no-moderated switch to skip moderated ML's, by Joe Perches

misc:
- ptrace PTRACE_GET_SYSCALL_INFO interface

- coda updates

- gdb scripts, various"

[ Using merge message suggestion from Vlastimil Babka, with some editing - Linus ]

* emailed patches from Andrew Morton : (100 commits)
fs/select.c: use struct_size() in kmalloc()
mm: add account_locked_vm utility function
arm64: mm: implement pte_devmap support
mm: introduce ARCH_HAS_PTE_DEVMAP
mm: clean up is_device_*_page() definitions
mm/mmap: move common defines to mman-common.h
mm: move MAP_SYNC to asm-generic/mman-common.h
device-dax: "Hotremove" persistent memory that is used like normal RAM
mm/hotplug: make remove_memory() interface usable
device-dax: fix memory and resource leak if hotplug fails
include/linux/lz4.h: fix spelling and copy-paste errors in documentation
ipc/mqueue.c: only perform resource calculation if user valid
include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures
scripts/gdb: add helpers to find and list devices
scripts/gdb: add lx-genpd-summary command
drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
kernel/pid.c: convert struct pid count to refcount_t
drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining()
select: change do_poll() to return -ERESTARTNOHAND rather than -EINTR
...

Linus Torvalds
2019-07-17 23:58:04 +0800
0bf5f9492 mm: fix the MAP_UNINITIALIZED flag ... Browse Code »

We can't expose UAPI symbols differently based on CONFIG_ symbols, as
userspace won't have them available. Instead always define the flag,
but only respect it based on the config option.

Link: http://lkml.kernel.org/r/20190703122359.18200-2-hch@lst.de
Signed-off-by: Christoph Hellwig
Reviewed-by: Vladimir Murzin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Christoph Hellwig
2019-07-17 10:23:21 +0800
c309b6f24 Merge tag 'docs/v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media ... Browse Code »

Pull rst conversion of docs from Mauro Carvalho Chehab:
"As agreed with Jon, I'm sending this big series directly to you, c/c
him, as this series required a special care, in order to avoid
conflicts with other trees"

* tag 'docs/v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (77 commits)
docs: kbuild: fix build with pdf and fix some minor issues
docs: block: fix pdf output
docs: arm: fix a breakage with pdf output
docs: don't use nested tables
docs: gpio: add sysfs interface to the admin-guide
docs: locking: add it to the main index
docs: add some directories to the main documentation index
docs: add SPDX tags to new index files
docs: add a memory-devices subdir to driver-api
docs: phy: place documentation under driver-api
docs: serial: move it to the driver-api
docs: driver-api: add remaining converted dirs to it
docs: driver-api: add xilinx driver API documentation
docs: driver-api: add a series of orphaned documents
docs: admin-guide: add a series of orphaned documents
docs: cgroup-v1: add it to the admin-guide book
docs: aoe: add it to the driver-api book
docs: add some documentation dirs to the driver-api book
docs: driver-model: move it to the driver-api book
docs: lp855x-driver.rst: add it to the driver-api book
...

Linus Torvalds
2019-07-17 03:21:41 +0800
3e859477a Merge tag 'xtensa-20190715' of git://github.com/jcmvbkbc/linux-xtensa ... Browse Code »

Pull Xtensa updates from Max Filippov:

- clean up PCI support code

- add defconfig and DTS for the 'virt' board

- abstract 'entry' and 'retw' uses in xtensa assembly in preparation
for XEA3/NX pipeline support

- random small cleanups

* tag 'xtensa-20190715' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: virt: add defconfig and DTS
xtensa: abstract 'entry' and 'retw' in assembly code
xtensa: One function call less in bootmem_init()
xtensa: remove arch/xtensa/include/asm/types.h
xtensa: use generic pcibios_set_master and pcibios_enable_device
xtensa: drop dead PCI support code
xtensa/PCI: Remove unused variable

Linus Torvalds
2019-07-17 03:17:07 +0800

15 Jul, 2019

1 commit

8ea0afa3b docs: xtensa: convert to ReST ... Browse Code »

Rename the xtensa documentation files to ReST, add an
index for them and adjust in order to produce a nice html
output via the Sphinx build system.

At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.

Signed-off-by: Mauro Carvalho Chehab

Mauro Carvalho Chehab
2019-07-15 20:20:26 +0800

13 Jul, 2019

1 commit

9e3a25dc9 Merge tag 'dma-mapping-5.3' of git://git.infradead.org/users/hch/dma-mapping ... Browse Code »

Pull dma-mapping updates from Christoph Hellwig:

- move the USB special case that bounced DMA through a device bar into
the USB code instead of handling it in the common DMA code (Laurentiu
Tudor and Fredrik Noring)

- don't dip into the global CMA pool for single page allocations
(Nicolin Chen)

- fix a crash when allocating memory for the atomic pool failed during
boot (Florian Fainelli)

- move support for MIPS-style uncached segments to the common code and
use that for MIPS and nios2 (me)

- make support for DMA_ATTR_NON_CONSISTENT and
DMA_ATTR_NO_KERNEL_MAPPING generic (me)

- convert nds32 to the generic remapping allocator (me)

* tag 'dma-mapping-5.3' of git://git.infradead.org/users/hch/dma-mapping: (29 commits)
dma-mapping: mark dma_alloc_need_uncached as __always_inline
MIPS: only select ARCH_HAS_UNCACHED_SEGMENT for non-coherent platforms
usb: host: Fix excessive alignment restriction for local memory allocations
lib/genalloc.c: Add algorithm, align and zeroed family of DMA allocators
nios2: use the generic uncached segment support in dma-direct
nds32: use the generic remapping allocator for coherent DMA allocations
arc: use the generic remapping allocator for coherent DMA allocations
dma-direct: handle DMA_ATTR_NO_KERNEL_MAPPING in common code
dma-direct: handle DMA_ATTR_NON_CONSISTENT in common code
dma-mapping: add a dma_alloc_need_uncached helper
openrisc: remove the partial DMA_ATTR_NON_CONSISTENT support
arc: remove the partial DMA_ATTR_NON_CONSISTENT support
arm-nommu: remove the partial DMA_ATTR_NON_CONSISTENT support
ARM: dma-mapping: allow larger DMA mask than supported
dma-mapping: truncate dma masks to what dma_addr_t can hold
iommu/dma: Apply dma_{alloc,free}_contiguous functions
dma-remap: Avoid de-referencing NULL atomic_pool
MIPS: use the generic uncached segment support in dma-direct
dma-direct: provide generic support for uncached kernel segments
au1100fb: fix DMA API abuse
...

Linus Torvalds
2019-07-13 06:13:55 +0800

12 Jul, 2019

1 commit

8f6ccf615 Merge tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux ... Browse Code »

Pull clone3 system call from Christian Brauner:
"This adds the clone3 syscall which is an extensible successor to clone
after we snagged the last flag with CLONE_PIDFD during the 5.2 merge
window for clone(). It cleanly supports all of the flags from clone()
and thus all legacy workloads.

There are few user visible differences between clone3 and clone.
First, CLONE_DETACHED will cause EINVAL with clone3 so we can reuse
this flag. Second, the CSIGNAL flag is deprecated and will cause
EINVAL to be reported. It is superseeded by a dedicated "exit_signal"
argument in struct clone_args thus freeing up even more flags. And
third, clone3 gives CLONE_PIDFD a dedicated return argument in struct
clone_args instead of abusing CLONE_PARENT_SETTID's parent_tidptr
argument.

The clone3 uapi is designed to be easy to handle on 32- and 64 bit:

/* uapi */
struct clone_args {
__aligned_u64 flags;
__aligned_u64 pidfd;
__aligned_u64 child_tid;
__aligned_u64 parent_tid;
__aligned_u64 exit_signal;
__aligned_u64 stack;
__aligned_u64 stack_size;
__aligned_u64 tls;
};

and a separate kernel struct is used that uses proper kernel typing:

/* kernel internal */
struct kernel_clone_args {
u64 flags;
int __user *pidfd;
int __user *child_tid;
int __user *parent_tid;
int exit_signal;
unsigned long stack;
unsigned long stack_size;
unsigned long tls;
};

The system call comes with a size argument which enables the kernel to
detect what version of clone_args userspace is passing in. clone3
validates that any additional bytes a given kernel does not know about
are set to zero and that the size never exceeds a page.

A nice feature is that this patchset allowed us to cleanup and
simplify various core kernel codepaths in kernel/fork.c by making the
internal _do_fork() function take struct kernel_clone_args even for
legacy clone().

This patch also unblocks the time namespace patchset which wants to
introduce a new CLONE_TIMENS flag.

Note, that clone3 has only been wired up for x86{_32,64}, arm{64}, and
xtensa. These were the architectures that did not require special
massaging.

Other architectures treat fork-like system calls individually and
after some back and forth neither Arnd nor I felt confident that we
dared to add clone3 unconditionally to all architectures. We agreed to
leave this up to individual architecture maintainers. This is why
there's an additional patch that introduces __ARCH_WANT_SYS_CLONE3
which any architecture can set once it has implemented support for
clone3. The patch also adds a cond_syscall(clone3) for architectures
such as nios2 or h8300 that generate their syscall table by simply
including asm-generic/unistd.h. The hope is to get rid of
__ARCH_WANT_SYS_CLONE3 and cond_syscall() rather soon"

* tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
arch: handle arches who do not yet define clone3
arch: wire-up clone3() syscall
fork: add clone3

Linus Torvalds
2019-07-12 01:09:44 +0800

11 Jul, 2019

2 commits

5450e8a31 Merge tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux ... Browse Code »

Pull pidfd updates from Christian Brauner:
"This adds two main features.

- First, it adds polling support for pidfds. This allows process
managers to know when a (non-parent) process dies in a race-free
way.

The notification mechanism used follows the same logic that is
currently used when the parent of a task is notified of a child's
death. With this patchset it is possible to put pidfds in an
{e}poll loop and get reliable notifications for process (i.e.
thread-group) exit.

- The second feature compliments the first one by making it possible
to retrieve pollable pidfds for processes that were not created
using CLONE_PIDFD.

A lot of processes get created with traditional PID-based calls
such as fork() or clone() (without CLONE_PIDFD). For these
processes a caller can currently not create a pollable pidfd. This
is a problem for Android's low memory killer (LMK) and service
managers such as systemd.

Both patchsets are accompanied by selftests.

It's perhaps worth noting that the work done so far and the work done
in this branch for pidfd_open() and polling support do already see
some adoption:

- Android is in the process of backporting this work to all their LTS
kernels [1]

- Service managers make use of pidfd_send_signal but will need to
wait until we enable waiting on pidfds for full adoption.

- And projects I maintain make use of both pidfd_send_signal and
CLONE_PIDFD [2] and will use polling support and pidfd_open() too"

[1] https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.9+backport%22
https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.14+backport%22
https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.19+backport%22

[2] https://github.com/lxc/lxc/blob/aab6e3eb73c343231cdde775db938994fc6f2803/src/lxc/start.c#L1753

* tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
tests: add pidfd_open() tests
arch: wire-up pidfd_open()
pid: add pidfd_open()
pidfd: add polling selftests
pidfd: add polling support

Linus Torvalds
2019-07-11 13:17:21 +0800
398364a35 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu ... Browse Code »

Pull m68nommu updates from Greg Ungerer:
"A series of cleanups for the FLAT format binary loader, binfmt_flat,
from Christoph.

The end goal is to support no-MMU on RISC-V, and the last patch
enables that"

* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
riscv: add binfmt_flat support
binfmt_flat: don't offset the data start
binfmt_flat: move the MAX_SHARED_LIBS definition to binfmt_flat.c
binfmt_flat: remove the persistent argument from flat_get_addr_from_rp
binfmt_flat: provide an asm-generic/flat.h
binfmt_flat: make support for old format binaries optional
binfmt_flat: add a ARCH_HAS_BINFMT_FLAT option
binfmt_flat: add endianess annotations
binfmt_flat: use fixed size type for the on-disk format
binfmt_flat: consolidate two version of flat_v2_reloc_t
binfmt_flat: remove the unused OLD_FLAT_FLAG_RAM definition
binfmt_flat: remove the uapi header
binfmt_flat: replace flat_argvp_envp_on_stack with a Kconfig variable
binfmt_flat: remove flat_old_ram_flag
binfmt_flat: provide a default version of flat_get_relocate_addr
binfmt_flat: remove flat_set_persistent
binfmt_flat: remove flat_reloc_valid

Linus Torvalds
2019-07-11 12:42:03 +0800

09 Jul, 2019

3 commits

5ad18b2e6 Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/eb… ... Browse Code »

…iederm/user-namespace

Pull force_sig() argument change from Eric Biederman:
"A source of error over the years has been that force_sig has taken a
task parameter when it is only safe to use force_sig with the current
task.

The force_sig function is built for delivering synchronous signals
such as SIGSEGV where the userspace application caused a synchronous
fault (such as a page fault) and the kernel responded with a signal.

Because the name force_sig does not make this clear, and because the
force_sig takes a task parameter the function force_sig has been
abused for sending other kinds of signals over the years. Slowly those
have been fixed when the oopses have been tracked down.

This set of changes fixes the remaining abusers of force_sig and
carefully rips out the task parameter from force_sig and friends
making this kind of error almost impossible in the future"

* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus
signal: Remove the signal number and task parameters from force_sig_info
signal: Factor force_sig_info_to_task out of force_sig_info
signal: Generate the siginfo in force_sig
signal: Move the computation of force into send_signal and correct it.
signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal
signal: Remove the task parameter from force_sig_fault
signal: Use force_sig_fault_to_task for the two calls that don't deliver to current
signal: Explicitly call force_sig_fault on current
signal/unicore32: Remove tsk parameter from __do_user_fault
signal/arm: Remove tsk parameter from __do_user_fault
signal/arm: Remove tsk parameter from ptrace_break
signal/nds32: Remove tsk parameter from send_sigtrap
signal/riscv: Remove tsk parameter from do_trap
signal/sh: Remove tsk parameter from force_sig_info_fault
signal/um: Remove task parameter from send_sigtrap
signal/x86: Remove task parameter from send_sigtrap
signal: Remove task parameter from force_sig_mceerr
signal: Remove task parameter from force_sig
signal: Remove task parameter from force_sigsegv
...

Linus Torvalds
2019-07-09 12:48:15 +0800
775f1f7ea xtensa: virt: add defconfig and DTS ... Browse Code »

Add defconfig and DTS for a virt board. Defconfig enables PCIe host and
a number of virtio devices. DTS routes legacy PCI IRQs to the first four
level-triggered external IRQ lines. CPU core with edge-triggered IRQs
among the first four may need a custom DTS to work correctly.

Signed-off-by: Max Filippov

Max Filippov
2019-07-09 05:32:06 +0800
d6d5f19e2 xtensa: abstract 'entry' and 'retw' in assembly code ... Browse Code »

Provide abi_entry, abi_entry_default, abi_ret and abi_ret_default macros
that allocate aligned stack frame in windowed and call0 ABIs.
Provide XTENSA_SPILL_STACK_RESERVE macro that specifies required stack
frame size when register spilling is involved.
Replace all uses of 'entry' and 'retw' with the above macros.
This makes most of the xtensa assembly code ready for XEA3 and call0 ABI.

Signed-off-by: Max Filippov

Max Filippov
2019-07-09 01:04:48 +0800

06 Jul, 2019

1 commit

831c4f3da xtensa: One function call less in bootmem_init() ... Browse Code »

Avoid an extra function call by using a ternary operator instead of
a conditional statement for a setting selection.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring
Message-Id:
Signed-off-by: Max Filippov

Markus Elfring
2019-07-06 03:02:37 +0800

28 Jun, 2019

2 commits

7615d9e17 arch: wire-up pidfd_open() ... Browse Code »

This wires up the pidfd_open() syscall into all arches at once.

Signed-off-by: Christian Brauner
Reviewed-by: David Howells
Reviewed-by: Oleg Nesterov
Acked-by: Arnd Bergmann
Cc: "Eric W. Biederman"
Cc: Kees Cook
Cc: Joel Fernandes (Google)
Cc: Thomas Gleixner
Cc: Jann Horn
Cc: Andy Lutomirsky
Cc: Andrew Morton
Cc: Aleksa Sarai
Cc: Linus Torvalds
Cc: Al Viro
Cc: linux-api@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org

Christian Brauner
2019-06-28 18:17:55 +0800
7d5bdc0cf xtensa: remove arch/xtensa/include/asm/types.h ... Browse Code »

Xtensa does not define CONFIG_64BIT. The generic definition of
BITS_PER_LONG in include/asm-generic/bitsperlong.h should work.
With that definition removed from arch/xtensa/include/asm/types.h
it does nothing but including arch/xtensa/include/uapi/asm/types.h
Remove the arch/xtensa/include/asm/types.h header.

Signed-off-by: Max Filippov

Max Filippov
2019-06-28 09:12:53 +0800

25 Jun, 2019

1 commit

d98849aff dma-direct: handle DMA_ATTR_NO_KERNEL_MAPPING in common code ... Browse Code »

DMA_ATTR_NO_KERNEL_MAPPING is generally implemented by allocating
normal cacheable pages or CMA memory, and then returning the page
pointer as the opaque handle. Lift that code from the xtensa and
generic dma remapping implementations into the generic dma-direct
code so that we don't even call arch_dma_alloc for these allocations.

Signed-off-by: Christoph Hellwig

Christoph Hellwig
2019-06-25 20:28:05 +0800