Eric Lee / smarc-fsl-linux-kernel

30 Aug, 2019

4 commits

6e792fdbd ANDROID: refactor build.config files to remove duplication

The build.config.* files largely contain duplicate information by their
nature. Reorganize them to reduce duplication and to allow adding new
configurations without copying the definitions again.

Bug: 140224784
Change-Id: I6a3810a125b0ed48591690ca33bb5c02be58218a
Signed-off-by: Matthias Maennich

Matthias Maennich
2019-08-30 21:55:29 +0800
299e4ae14 ANDROID: update ABI dump

ABI test is complaining. I can't repro locally. Maybe updating ABI will
help.

Change-Id: I02d87a6b409f92371fadbf6377371f475c1aa61b
Signed-off-by: Tri Vo

Tri Vo
2019-08-30 06:58:56 +0800
c2c956d39 ANDROID: gki_defconfig: enable CONFIG_QCOM_{COMMAND_DB,RPMH,PDC}

CONFIG_ARCH_QCOM is a dependency of the above and selects
CONFIG_{PINCTRL, REGULATOR, TMPFS}.

Bug: 133441279
Bug: 133441092
Bug: 133440650
Change-Id: I22c37946ec3a62ccbd3fa65bbc09076964d86475
Signed-off-by: Tri Vo

Tri Vo
2019-08-30 05:05:10 +0800
c0c20b218 ANDROID: gki_defconfig enable CONFIG_SPARSEMEM_VMEMMAP

The legacy Ion driver combined with SPARSEMEM for carveout regions results
in invalid page structures, breaking page_to_pfn(). This can
be temporarily resolved with SPARSEMEM_VMEMMAP until the Ion
driver is refactored and the issue can be reinvestigated.

At that time we can check whether the issue is solved, or perhaps correct
it using fewer resources than SPARSEMEM_VMEMMAP requires. The
ABI does not change, so we have the flexibility to adjust this
configuration.

Signed-off-by: Mark Salyzyn
Bug: 138851285
Bug: 138149732
Test: ABI_DEFINITION=common/abi_gki_aarch64.xml \
BUILD_CONFIG=common/build.config.gki.aarch64 ./build/build_abi.sh
Change-Id: I25cc8ebe9e25260b9869c5e8d8667b280f83ca51

Mark Salyzyn
2019-08-30 03:40:56 +0800

29 Aug, 2019

2 commits

4d088b8d1 ANDROID: update ABI for EFI

Change-Id: I52daf2ed3450726409baeecbf90abec16f8d719b
Signed-off-by: Alistair Delva

Alistair Delva
2019-08-29 09:03:11 +0800
ac63b4fbf ANDROID: gki_defconfig: Minimally enable EFI

HiKey/HiKey960 need UEFI support to boot but don't need much of
the other options that default on when enabling EFI.

Bug: 140204135
Signed-off-by: John Stultz
Change-Id: I5c2e63701ae93277fcc3ddb36a39637237c65194

John Stultz
2019-08-29 08:55:09 +0800

28 Aug, 2019

1 commit

2a8322aa8 ANDROID: sdcardfs: fix fall through in param parsing

Fixes: commit bafafd3663c2e ("ANDROID: sdcardfs: Add sdcardfs filesystem")
Change-Id: I936ac03b999095d46810c0ca55a7a29cab52d82a
Signed-off-by: Daniel Rosenberg

Daniel Rosenberg
2019-08-28 07:58:32 +0800

27 Aug, 2019

4 commits

dba13606c Revert "um: remove uses of variable length arrays"

This reverts commit 0d4e5ac7e78035950d564e65c38ce148cb9af681.

Reason: Broke UML used by kernel_tests

Bug: 139897923
Change-Id: Ibf57c1f535e60caaef32dd14c4abbe253d8e185d
Signed-off-by: Alistair Delva

Alistair Delva
2019-08-27 04:56:31 +0800
ea22c3486 Revert "um: irq: don't set the chip for all irqs"

This reverts commit 1987b1b8f9f17a06255877e7917d0bb5b5377774.

Reason: Broke UML used by kernel_tests

Bug: 139897923
Change-Id: If3541721fdca7cf6d77410309ae5b503b5a848d0
Signed-off-by: Alistair Delva

Alistair Delva
2019-08-27 04:56:22 +0800
cdecefc7c ANDROID: update ABI for CONFIG_NR_CPUS=32

Leaf changes summary: 39 artifacts changed
Changed leaf types summary: 19 leaf types changed
Removed/Changed/Added functions summary: 0 Removed, 10 Changed, 0 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

10 functions with some sub-type change:

. . .

Signed-off-by: Mark Salyzyn
Test: ABI_DEFINITION=common/abi_gki_aarch64.xml \
BUILD_CONFIG=common/build.config.gki.aarch64 ./build/build_abi.sh
Bug: 139693734
Bug: 139406736
Bug: 139692860
Change-Id: Ie5f37566e748cc43ce247fe8f9f1f8931f6fc579

Mark Salyzyn
2019-08-27 04:02:28 +0800
5ba28cfbf ANDROID: gki_defconfig: set CONFIG_NR_CPUS=32

Consensus is that CONFIG_NR_CPUS of 32 will deal with the future
products with a moderate engineering margin.

Signed-off-by: Mark Salyzyn
Test: confirm value propagates to .config
Bug: 139693734
Bug: 139406736
Bug: 139692860
Change-Id: I9687d37da254a612947398a45ae56ab01e676562

Mark Salyzyn
2019-08-27 03:42:18 +0800

26 Aug, 2019

13 commits

a63d3d432 ABI file updates for 5.3-rc5 and 5.3-rc6

Abridged summary:

Leaf changes summary: 7 artifacts changed
Changed leaf types summary: 4 leaf types changed
Removed/Changed/Added functions summary: 0 Removed, 2 Changed, 0 Added function
Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 0 Added variable

2 functions with some sub-type change:

[C]'function void inet_frag_reasm_finish(inet_frag_queue*, sk_buff*, void*)' at inet_fragment.c:477:1 has some sub-type changes:
parameter 4 of type 'typedef bool' was added

'struct hci_dev at hci_core.h:215:1' changed:
type size hasn't changed
1 data member insertion:
'__u8 hci_dev::min_enc_key_size', at offset 6208 (in bits) at hci_core.h:281:1
there are data member changes:
'__u8 hci_dev::ssp_debug_mode' offset changed from 6208 to 6216 (in bits) (by +8 bits)
'__u8 hci_dev::hw_error_code' offset changed from 6216 to 6224 (in bits) (by +8 bits)

'struct net at net_namespace.h:54:1' changed:
type size hasn't changed
1 data member deletion:
'atomic64_t net::cookie_gen', at offset 128 (in bits) at net_namespace.h:64:1

'struct ring_buffer at internal.h:13:1' changed:
type size changed from 1920 to 1664 (in bits)
24 data member deletions:
4 data member insertions:
there are data member changes:

'struct zs_pool at zsmalloc.c:251:1' changed:
type size changed from 17472 to 17792 (in bits)
3 data member insertions:
'wait_queue_head zs_pool::migration_wait', at offset 17472 (in bits) at zsmalloc.c:273:1
'atomic_long_t zs_pool::isolated_pages', at offset 17664 (in bits) at zsmalloc.c:274:1
'bool zs_pool::destroying', at offset 17728 (in bits) at zsmalloc.c:275:1
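
The offset shifts reported for struct hci_dev above can be reproduced in miniature. The following is a minimal sketch, assuming two hypothetical structs (`dev_before`, `dev_after`, with an invented `next_field` member) that are not the real hci_dev layout: inserting one byte-sized member moves each following byte-sized member by 8 bits, while existing padding absorbs the insertion so the type size stays the same on common ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; NOT the real struct hci_dev. */
struct dev_before {
    uint8_t  ssp_debug_mode;
    uint8_t  hw_error_code;
    uint32_t next_field;        /* invented member that forces padding */
};

struct dev_after {
    uint8_t  min_enc_key_size;  /* newly inserted member */
    uint8_t  ssp_debug_mode;
    uint8_t  hw_error_code;
    uint32_t next_field;
};

/* Holds on common ABIs where uint32_t is 4-byte aligned (assumption). */
static_assert(sizeof(struct dev_before) == sizeof(struct dev_after),
              "padding absorbs the inserted byte");

/* Offset change in bits, the unit used by the report above. */
static inline long shift_bits(size_t before, size_t after)
{
    return ((long)after - (long)before) * 8;
}
```

Calling `shift_bits()` with the `offsetof()` values of `ssp_debug_mode` in both structs gives the same "+8 bits" delta that the report lists for the real members.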

Signed-off-by: Greg Kroah-Hartman
Change-Id: I5328b37cf9f43d732d8b5768a662362a061afe8c

Greg Kroah-Hartman
2019-08-26 23:06:08 +0800
ad455d87e Merge 5.3-rc6 into android-mainline

Linux 5.3-rc6

Signed-off-by: Greg Kroah-Hartman
Change-Id: Id10580d48d56054408b3efe0bd1866d67aba2a3d

Greg Kroah-Hartman
2019-08-26 22:45:30 +0800
a5bd47ef3 Merge 5.3-rc5 into android-mainline

Linux 5.3-rc5

Signed-off-by: Greg Kroah-Hartman
Change-Id: Ibfaea1b9aca9f04a59def096f327c2afbd0cb296

Greg Kroah-Hartman
2019-08-26 22:43:17 +0800
a55aa89aa Linux 5.3-rc6

Linus Torvalds
2019-08-26 03:01:23 +0800
c749088f2 Merge tag 'auxdisplay-for-linus-v5.3-rc7' of git://github.com/ojeda/linux

Pull auxdisplay cleanup from Miguel Ojeda:
"Make ht16k33_fb_fix and ht16k33_fb_var constant (Nishka Dasgupta)"

* tag 'auxdisplay-for-linus-v5.3-rc7' of git://github.com/ojeda/linux:
auxdisplay: ht16k33: Make ht16k33_fb_fix and ht16k33_fb_var constant

Linus Torvalds
2019-08-26 02:43:17 +0800
32ae83ffe Merge tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml

Pull UML fix from Richard Weinberger:
"Fix time travel mode"

* tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: fix time travel mode

Linus Torvalds
2019-08-26 02:40:24 +0800
94a76d9b5 Merge tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull UBIFS and JFFS2 fixes from Richard Weinberger:
"UBIFS:
- Don't block too long in writeback_inodes_sb()
- Fix for a possible overrun of the log head
- Fix double unlock in orphan_delete()

JFFS2:
- Remove C++ style from UAPI header and unbreak picky toolchains"

* tag 'for-linus-5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
ubifs: Limit the number of pages in shrink_liability
ubifs: Correctly initialize c->min_log_bytes
ubifs: Fix double unlock around orphan_delete()
jffs2: Remove C++ style comments from uapi header

Linus Torvalds
2019-08-26 02:29:27 +0800
146c3d322 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"A few fixes for x86:

- Fix a boot regression caused by the recent bootparam sanitizing
change, which escaped the attention of all people who reviewed that
code.

- Address a boot problem on machines with broken E820 tables caused
by an underflow which ended up placing the trampoline start at
physical address 0.

- Handle machines which do not advertise a legacy timer of any form,
but need calibration of the local APIC timer gracefully by making
the calibration routine independent from the tick interrupt. Marked
for stable as well as there seems to be quite some new laptops
rolled out which expose this.

- Clear the RDRAND CPUID bit on AMD family 15h and 16h CPUs which are
affected by broken firmware which does not initialize RDRAND
correctly after resume. Add a command line parameter to override
this for machines which either do not use suspend/resume or have a
fixed BIOS. Unfortunately there is no way to detect this on boot,
so the only safe decision is to turn it off by default.

- Prevent RFLAGS from being clobbered in CALL_NOSPEC on 32bit which
caused fast KVM instruction emulation to break.

- Explain the Intel CPU model naming convention so that the repeating
discussions come to an end"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
x86/boot: Fix boot regression caused by bootparam sanitizing
x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
x86/boot/compressed/64: Fix boot on machines with broken E820 table
x86/apic: Handle missing global clockevent gracefully
x86/cpu: Explain Intel model naming convention

Linus Torvalds
2019-08-26 01:10:15 +0800
5a13fc3d8 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timekeeping fix from Thomas Gleixner:
"A single fix for a regression caused by the generic VDSO
implementation where a math overflow causes CLOCK_BOOTTIME to become a
random number generator"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping/vsyscall: Prevent math overflow in BOOTTIME update

Linus Torvalds
2019-08-26 01:08:01 +0800
8a04c2ee6 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Thomas Gleixner:
"Handle the worker management in situations where a task is scheduled
out on a PI lock contention correctly and schedule a new worker if
possible"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Schedule new worker even if PI-blocked

Linus Torvalds
2019-08-26 01:06:12 +0800
05bbb9360 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
"Two small fixes for kprobes and perf:

- Prevent a deadlock in kprobe_optimizer() caused by reverse lock
ordering

- Fix a comment typo"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kprobes: Fix potential deadlock in kprobe_optimizer()
perf/x86: Fix typo in comment

Linus Torvalds
2019-08-26 01:03:32 +0800
44c471e43 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fix from Thomas Gleixner:
"A single fix for an imbalanced kobject operation in the irq descriptor
code which was unearthed by the new warnings in the kobject code"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Properly pair kobject_del() with kobject_add()

Linus Torvalds
2019-08-26 01:00:21 +0800
f47edb59b Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"11 fixes"

Mostly VM fixes, one psi polling fix, and one parisc build fix.

* emailed patches from Andrew Morton:
mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y
mm/zsmalloc.c: fix race condition in zs_destroy_pool
mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
mm, page_owner: handle THP splits correctly
userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
psi: get poll_work to run when calling poll syscall next time
mm: memcontrol: flush percpu vmevents before releasing memcg
mm: memcontrol: flush percpu vmstats before releasing memcg
parisc: fix compilation errrors
mm, page_alloc: move_freepages should not examine struct page of reserved memory
mm/z3fold.c: fix race between migration and destruction

Linus Torvalds
2019-08-26 00:56:27 +0800

25 Aug, 2019

16 commits

e67095fd2 Merge tag 'dma-mapping-5.3-5' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:
"Two fixes for regressions in this merge window:

- select the Kconfig symbols for the noncoherent dma arch helpers on
arm if swiotlb is selected, not just for LPAE to not break the Xen
build, that uses swiotlb indirectly through swiotlb-xen

- fix the page allocator fallback in dma_alloc_contiguous if the CMA
allocation fails"

* tag 'dma-mapping-5.3-5' of git://git.infradead.org/users/hch/dma-mapping:
dma-direct: fix zone selection after an unaddressable CMA allocation
arm: select the dma-noncoherent symbols for all swiotlb builds

Linus Torvalds
2019-08-25 11:00:11 +0800
00fb24a42 mm/kasan: fix false positive invalid-free reports with CONFIG_KASAN_SW_TAGS=y

Code like this:

    ptr = kmalloc(size, GFP_KERNEL);
    page = virt_to_page(ptr);
    offset = offset_in_page(ptr);
    kfree(page_address(page) + offset);

may produce false-positive invalid-free reports on the kernel with
CONFIG_KASAN_SW_TAGS=y.

In the example above we lose the original tag assigned to 'ptr', so
kfree() gets the pointer with the 0xFF tag. kfree() then finds that the 0xFF
tag differs from the tag in the shadow and prints a false report.

Instead of just comparing tags, do the following:

1) Check that shadow doesn't contain KASAN_TAG_INVALID. Otherwise it's
double-free and it doesn't matter what tag the pointer has.

2) If pointer tag is different from 0xFF, make sure that tag in the
shadow is the same as in the pointer.
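
The two rules above can be sketched as a small userspace predicate. This is a minimal illustration, not the kernel's code: the function name `shadow_invalid()` is hypothetical, and the numeric value used for KASAN_TAG_INVALID is an assumption (only the 0xFF match-all tag is stated in this message).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define KASAN_TAG_KERNEL  0xFF  /* pointer that lost its tag, per the text above */
#define KASAN_TAG_INVALID 0xFE  /* shadow marker for freed memory (value assumed) */

static_assert(KASAN_TAG_KERNEL != KASAN_TAG_INVALID, "tags must be distinct");

/* Hypothetical helper: returns true when a free of this pointer should be reported. */
static bool shadow_invalid(uint8_t ptr_tag, uint8_t shadow_tag)
{
    /* 1) The shadow says the object is already freed: a double-free,
     *    whatever tag the pointer carries. */
    if (shadow_tag == KASAN_TAG_INVALID)
        return true;

    /* 2) Only pointers that still carry a real tag are compared with the shadow. */
    if (ptr_tag != KASAN_TAG_KERNEL && ptr_tag != shadow_tag)
        return true;

    return false;
}
```

With this ordering, a pointer rebuilt through page_address() (tag 0xFF) is accepted for a live object but still reported when the shadow marks the object as freed.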

Link: http://lkml.kernel.org/r/20190819172540.19581-1-aryabinin@virtuozzo.com
Fixes: 7f94ffbc4c6a ("kasan: add hooks implementation for tag-based mode")
Signed-off-by: Andrey Ryabinin
Reported-by: Walter Wu
Reported-by: Mark Rutland
Reviewed-by: Andrey Konovalov
Cc: Alexander Potapenko
Cc: Dmitry Vyukov
Cc: Catalin Marinas
Cc: Will Deacon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrey Ryabinin
2019-08-25 10:48:42 +0800
701d67859 mm/zsmalloc.c: fix race condition in zs_destroy_pool

In zs_destroy_pool() we call flush_work(&pool->free_work). However, we
have no guarantee that migration isn't happening in the background at
that time.

Since migration can't directly free pages, it relies on free_work being
scheduled to free the pages. But there's nothing preventing an
in-progress migrate from queuing the work *after*
zs_unregister_migration() has called flush_work(), which would leave
pages still pointing at the inode when we free it.

Since we know at destroy time all objects should be free, no new
migrations can come in (since zs_page_isolate() fails for fully-free
zspages). This means it is sufficient to track a "# isolated zspages"
count by class, and have the destroy logic ensure all such pages have
drained before proceeding. Keeping that state under the class spinlock
keeps the logic straightforward.

In this case a memory leak could lead to an eventual crash if compaction
hits the leaked page. This crash would only occur if people are
changing their zswap backend at runtime (which eventually starts
destruction).

Link: http://lkml.kernel.org/r/20190809181751.219326-2-henryburns@google.com
Fixes: 48b4800a1c6a ("zsmalloc: page migration support")
Signed-off-by: Henry Burns
Reviewed-by: Sergey Senozhatsky
Cc: Henry Burns
Cc: Minchan Kim
Cc: Shakeel Butt
Cc: Jonathan Adams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Henry Burns
2019-08-25 10:48:42 +0800
1a87aa035 mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely

In zs_page_migrate() we call putback_zspage() after we have finished
migrating all pages in this zspage. However, the return value is
ignored. If a zs_free() races in between zs_page_isolate() and
zs_page_migrate(), freeing the last object in the zspage,
putback_zspage() will leave the page in ZS_EMPTY for potentially an
unbounded amount of time.

To fix this, we need to do the same thing as zs_page_putback() does:
schedule free_work to occur.

To avoid duplicated code, move the sequence to a new
putback_zspage_deferred() function which both zs_page_migrate() and
zs_page_putback() call.

Link: http://lkml.kernel.org/r/20190809181751.219326-1-henryburns@google.com
Fixes: 48b4800a1c6a ("zsmalloc: page migration support")
Signed-off-by: Henry Burns
Reviewed-by: Sergey Senozhatsky
Cc: Henry Burns
Cc: Minchan Kim
Cc: Shakeel Butt
Cc: Jonathan Adams
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Henry Burns
2019-08-25 10:48:42 +0800
f7da677bc mm, page_owner: handle THP splits correctly

The THP splitting path is missing the split_page_owner() call that
split_page() has.

As a result, split THP pages are wrongly reported in the page_owner file
as order-9 pages. Furthermore when the former head page is freed, the
remaining former tail pages are not listed in the page_owner file at
all. This patch fixes that by adding the split_page_owner() call into
__split_huge_page().

Link: http://lkml.kernel.org/r/20190820131828.22684-2-vbabka@suse.cz
Fixes: a9627bc5e34e ("mm/page_owner: introduce split_page_owner and replace manual handling")
Reported-by: Kirill A. Shutemov
Signed-off-by: Vlastimil Babka
Cc: Michal Hocko
Cc: Mel Gorman
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vlastimil Babka
2019-08-25 10:48:42 +0800
46d0b24c5 userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx

userfaultfd_release() should clear vm_flags/vm_userfaultfd_ctx even if
mm->core_state != NULL.

Otherwise a page fault can see userfaultfd_missing() == T and use an
already freed userfaultfd_ctx.

Link: http://lkml.kernel.org/r/20190820160237.GB4983@redhat.com
Fixes: 04f5866e41fb ("coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping")
Signed-off-by: Oleg Nesterov
Reported-by: Kefeng Wang
Reviewed-by: Andrea Arcangeli
Tested-by: Kefeng Wang
Cc: Peter Xu
Cc: Mike Rapoport
Cc: Jann Horn
Cc: Jason Gunthorpe
Cc: Michal Hocko
Cc: Tetsuo Handa
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2019-08-25 10:48:42 +0800
7b2b55da1 psi: get poll_work to run when calling poll syscall next time

Only the first call to the poll syscall lets the user receive
POLLPRI correctly. After that, the user always fails to acquire the event
signal.

Reproduce case:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event triggered.
3. Kill and restart the process.

The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag.
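
The stuck flag described above can be modelled in a few lines. This is a simplified, single-threaded sketch under stated assumptions: `struct group`, `schedule_poll()` and `destroy_trigger()` are hypothetical stand-ins rather than the kernel's functions, and the worker itself is reduced to a boolean.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the per-group polling state. */
struct group {
    atomic_int poll_scheduled;  /* 1 while work is believed to be queued */
    bool       has_worker;      /* stands in for the poll kworker */
};

static void group_init(struct group *g)
{
    assert(g);
    atomic_init(&g->poll_scheduled, 0);
    g->has_worker = true;
}

/* Scheduling side: only the 0 -> 1 transition queues work.
 * Returns true when work was queued. */
static bool schedule_poll(struct group *g)
{
    int expected = 0;

    return atomic_compare_exchange_strong(&g->poll_scheduled, &expected, 1);
}

/* Teardown: cancel the work and destroy the worker. Without the reset,
 * poll_scheduled stays at 1 and no later work is ever queued. */
static void destroy_trigger(struct group *g, bool reset_flag)
{
    g->has_worker = false;
    if (reset_flag)
        atomic_store(&g->poll_scheduled, 0);  /* the pairing the message asks for */
}
```

In this model, calling `destroy_trigger(g, false)` after a successful `schedule_poll()` makes every later `schedule_poll()` return false, which mirrors the restart failure in the reproduce case.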

Link: http://lkml.kernel.org/r/1566357985-97781-1-git-send-email-joseph.qi@linux.alibaba.com
Signed-off-by: Jason Xing
Signed-off-by: Joseph Qi
Reviewed-by: Caspar Zhang
Reviewed-by: Suren Baghdasaryan
Acked-by: Johannes Weiner
Cc: Ingo Molnar
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jason Xing
2019-08-25 10:48:42 +0800
bb65f89b7 mm: memcontrol: flush percpu vmevents before releasing memcg

Similar to vmstats, percpu caching of local vmevents leads to an
accumulation of errors on non-leaf levels. This happens because some
leftovers may remain in percpu caches, so that they are never propagated
up the cgroup tree and simply disappear when the memory cgroup is
released.

To fix this issue let's accumulate and propagate percpu vmevents values
before releasing the memory cgroup similar to what we're doing with
vmstats.

Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate
only over online cpus.

Link: http://lkml.kernel.org/r/20190819202338.363363-4-guro@fb.com
Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Roman Gushchin
Acked-by: Michal Hocko
Cc: Johannes Weiner
Cc: Vladimir Davydov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Roman Gushchin
2019-08-25 10:48:42 +0800
c350a99ea mm: memcontrol: flush percpu vmstats before releasing memcg

Percpu caching of local vmstats with the conditional propagation by the
cgroup tree leads to an accumulation of errors on non-leaf levels.

Let's imagine two nested memory cgroups A and A/B. Say, a process
belonging to A/B allocates 100 pagecache pages on the CPU 0. The percpu
cache will spill 3 times, so that 32*3=96 pages will be accounted to A/B
and A atomic vmstat counters, 4 pages will remain in the percpu cache.

Imagine A/B is close to memory.max, so that every following allocation
triggers a direct reclaim on the local CPU. Say, each such attempt will
free 16 pages on a new cpu. That means every percpu cache will have -16
pages, except the first one, which will have 4 - 16 = -12. A/B and A
atomic counters will not be touched at all.

Now a user removes A/B. All percpu caches are freed and corresponding
vmstat numbers are forgotten. A has 96 pages more than expected.

As memory cgroups are created and destroyed, errors do accumulate. Even
1-2 pages differences can accumulate into large numbers.

To fix this issue let's accumulate and propagate percpu vmstat values
before releasing the memory cgroup. At this point these numbers are
stable and cannot be changed.

Since on cpu hotplug we do flush percpu vmstats anyway, we can iterate
only over online cpus.
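
The arithmetic of the A/B example can be replayed with a toy counter. This is a minimal sketch under stated assumptions: a single counter stands in for the A and A/B atomic counters, the CPU count of 4 is arbitrary, the cache spills when the cached delta reaches 32 (a simplification chosen to match the 32*3=96 figure above), and all names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 4   /* arbitrary CPU count for the sketch */
#define BATCH   32  /* spill threshold matching the 32*3=96 arithmetic */

struct counter {
    long total;            /* stands in for the atomic vmstat counter */
    long percpu[NR_CPUS];  /* per-CPU cached deltas */
};

/* Add val on one CPU; spill the cached delta once it reaches BATCH. */
static void mod_state(struct counter *c, int cpu, long val)
{
    long x;

    assert(cpu >= 0 && cpu < NR_CPUS);
    x = c->percpu[cpu] + val;
    if (labs(x) >= BATCH) {
        c->total += x;
        x = 0;
    }
    c->percpu[cpu] = x;
}

/* The fix: fold every per-CPU remainder into the total before release. */
static void flush_percpu(struct counter *c)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        c->total += c->percpu[cpu];
        c->percpu[cpu] = 0;
    }
}

/* Replay the scenario and return the total visible at release time. */
static long replay(int flush_before_release)
{
    struct counter c = { 0 };

    for (int i = 0; i < 100; i++)
        mod_state(&c, 0, 1);      /* 100 pages on CPU 0: 96 spilled, 4 cached */
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        mod_state(&c, cpu, -16);  /* reclaim frees 16 pages on each CPU */
    if (flush_before_release)
        flush_percpu(&c);
    return c.total;
}
```

Without the flush the parent is left with 96 pages on record; with the flush it sees 36, which is the 100 allocated pages minus the 64 freed across the four CPUs.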

Link: http://lkml.kernel.org/r/20190819202338.363363-2-guro@fb.com
Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Roman Gushchin
Acked-by: Michal Hocko
Cc: Johannes Weiner
Cc: Vladimir Davydov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Roman Gushchin
2019-08-25 10:48:42 +0800
bbcb03a97 parisc: fix compilation errrors

Commit 0cfaee2af3a0 ("include/asm-generic/5level-fixup.h: fix variable
'p4d' set but not used") converted a few functions from macros to static
inline, which causes parisc to complain,

In file included from include/asm-generic/4level-fixup.h:38:0,
from arch/parisc/include/asm/pgtable.h:5,
from arch/parisc/include/asm/io.h:6,
from include/linux/io.h:13,
from sound/core/memory.c:9:
include/asm-generic/5level-fixup.h:14:18: error: unknown type name 'pgd_t'; did you mean 'pid_t'?
#define p4d_t pgd_t
^
include/asm-generic/5level-fixup.h:24:28: note: in expansion of macro 'p4d_t'
static inline int p4d_none(p4d_t p4d)
^~~~~

It is because "4level-fixup.h" is included before "asm/page.h" where
"pgd_t" is defined.

Link: http://lkml.kernel.org/r/20190815205305.1382-1-cai@lca.pw
Fixes: 0cfaee2af3a0 ("include/asm-generic/5level-fixup.h: fix variable 'p4d' set but not used")
Signed-off-by: Qian Cai
Reported-by: Guenter Roeck
Tested-by: Guenter Roeck
Cc: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Qian Cai
2019-08-25 10:48:42 +0800
cd9610383 mm, page_alloc: move_freepages should not examine struct page of reserved memory

After commit 907ec5fca3dc ("mm: zero remaining unavailable struct
pages"), struct page of reserved memory is zeroed. This causes
page->flags to be 0 and fixes issues related to reading
/proc/kpageflags, for example, of reserved memory.

The VM_BUG_ON() in move_freepages_block(), however, assumes that
page_zone() is meaningful even for reserved memory. That assumption is
no longer true after the aforementioned commit.

There's no reason why move_freepages_block() should be testing the
legitimacy of page_zone() for reserved memory; its scope is limited only
to pages on the zone's freelist.

Note that pfn_valid() can be true for reserved memory: there is a
backing struct page. The check for page_to_nid(page) is also buggy but
reserved memory normally only appears on node 0 so the zeroing doesn't
affect this.

Move the debug checks to after verifying PageBuddy is true. This
isolates the scope of the checks to only be for buddy pages which are on
the zone's freelist which move_freepages_block() is operating on. In
this case, an incorrect node or zone is a bug worthy of being warned
about (and the examination of struct page is acceptable because this
memory is not reserved).

Why does move_freepages_block() get called on reserved memory? It's
simply math after finding a valid free page from the per-zone free area
to use as fallback. We find the beginning and end of the pageblock of
the valid page and that can bring us into memory that was reserved per
the e820. pfn_valid() is still true (it's backed by a struct page), but
since it's zero'd we shouldn't make any inferences here about comparing
its node or zone. The current node check just happens to succeed most
of the time by luck because reserved memory typically appears on node 0.

The fix here is to validate that we actually have buddy pages before
testing if there's any type of zone or node strangeness going on.
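
The reordering can be illustrated with a userspace model. This is a minimal sketch, assuming a hypothetical `struct page_model` whose fields stand in for PageBuddy(), page_zone() and page_to_nid(); it is not the kernel's move_freepages() and ignores page order and migratetype lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the struct page fields that matter here. */
struct page_model {
    bool buddy;  /* stands in for PageBuddy() */
    int  zone;   /* stands in for page_zone() */
    int  nid;    /* stands in for page_to_nid() */
};

/* Count the free (buddy) pages in a range. The zone and node sanity checks
 * run only after the page is known to be a buddy page, so a zeroed struct
 * page belonging to reserved memory is skipped without being examined. */
static size_t count_free_pages(const struct page_model *pages, size_t n,
                               int zone, int nid)
{
    size_t moved = 0;

    for (size_t i = 0; i < n; i++) {
        if (!pages[i].buddy)
            continue;  /* reserved or in use: make no inferences */

        assert(pages[i].zone == zone);
        assert(pages[i].nid == nid);
        moved++;
    }
    return moved;
}
```

A zeroed entry (buddy false, zone 0, node 0) in the middle of a range for zone 1 passes through untouched, whereas running the checks before the buddy test would trip the assertion.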

We noticed it almost immediately after bringing 907ec5fca3dc in on
CONFIG_DEBUG_VM builds. It depends on finding specific free pages in
the per-zone free area where the math in move_freepages() will bring the
start or end pfn into reserved memory and wanting to claim that entire
pageblock as a new migratetype. So the path will be rare, require
CONFIG_DEBUG_VM, and require fallback to a different migratetype.

Some struct pages were already zeroed from reserve pages before
907ec5fca3dc so it theoretically could trigger before this commit. I
think it's rare enough under a config option that most people don't run
that others may not have noticed. I wouldn't argue against a stable tag
and the backport should be easy enough, but probably wouldn't single out
a commit that this is fixing.

Mel said:

: The overhead of the debugging check is higher with this patch although
: it'll only affect debug builds and the path is not particularly hot.
: If this was a concern, I think it would be reasonable to simply remove
: the debugging check as the zone boundaries are checked in
: move_freepages_block and we never expect a zone/node to be smaller than
: a pageblock and stuck in the middle of another zone.

Link: http://lkml.kernel.org/r/alpine.DEB.2.21.1908122036560.10779@chino.kir.corp.google.com
Signed-off-by: David Rientjes
Acked-by: Mel Gorman
Cc: Naoya Horiguchi
Cc: Masayoshi Mizuma
Cc: Oscar Salvador
Cc: Pavel Tatashin
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Rientjes
2019-08-25 10:48:42 +0800
d776aaa98 mm/z3fold.c: fix race between migration and destruction

In z3fold_destroy_pool() we call destroy_workqueue(&pool->compact_wq).
However, we have no guarantee that migration isn't happening in the
background at that time.

Migration directly calls queue_work_on(pool->compact_wq); if destruction
wins that race, we are using a destroyed workqueue.

Link: http://lkml.kernel.org/r/20190809213828.202833-1-henryburns@google.com
Signed-off-by: Henry Burns
Cc: Vitaly Wool
Cc: Shakeel Butt
Cc: Jonathan Adams
Cc: Henry Burns
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Henry Burns
2019-08-25 10:48:42 +0800
083f0f2cd Merge tag 'gpio-v5.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
"Here is a (hopefully last) set of GPIO fixes for the v5.3 kernel
cycle. Two are pretty core:

- Fix not reporting open drain/source lines to userspace as "input"

- Fix a minor build error found in randconfigs

- Fix a chip select quirk on the Freescale SPI

- Fix the irqchip initialization semantic order to reflect what it
was using the old API"

* tag 'gpio-v5.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: Fix irqchip initialization order
gpio: of: fix Freescale SPI CS quirk handling
gpio: Fix build error of function redefinition
gpiolib: never report open-drain/source lines as 'input' to user-space

Linus Torvalds
2019-08-25 05:45:33 +0800
361469211 Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux

Pull Hyper-V fixes from Sasha Levin:

- Fix for panics and network failures on PAE guests by Dexuan Cui.

- Fix of a memory leak (and related cleanups) in the hyper-v keyboard
driver by Dexuan Cui.

- Code cleanups for hyper-v clocksource driver during the merge window
by Dexuan Cui.

- Fix for a false positive warning in the userspace hyper-v KVP store
by Vitaly Kuznetsov.

* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE
Tools: hv: kvp: eliminate 'may be used uninitialized' warning
Input: hyperv-keyboard: Use in-place iterator API in the channel callback
Drivers: hv: vmbus: Remove the unused "tsc_page" from struct hv_context

Linus Torvalds
2019-08-25 02:42:06 +0800
0a022eccf Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
"Two KVM/arm fixes for MMIO emulation and UBSAN.

Unusually, we're routing them via the arm64 tree as per Paolo's
request on the list:

https://lore.kernel.org/kvm/21ae69a2-2546-29d0-bff6-2ea825e3d968@redhat.com/

We don't actually have any other arm64 fixes pending at the moment
(touch wood), so I've pulled from Marc, written a merge commit, tagged
the result and run it through my build/boot/bisect scripts"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
KVM: arm/arm64: Only skip MMIO insn once

Linus Torvalds
2019-08-25 02:35:25 +0800
17d0fbf47 Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Four fixes, three for edge conditions which don't occur very often.
The lpfc fix mitigates memory exhaustion for some high CPU systems"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: lpfc: Mitigate high memory pre-allocation by SCSI-MQ
scsi: ufs: Fix NULL pointer dereference in ufshcd_config_vreg_hpm()
scsi: target: tcmu: avoid use-after-free after command timeout
scsi: qla2xxx: Fix gnl.l memory leak on adapter init failure

Linus Torvalds
2019-08-25 02:26:51 +0800