04 Oct, 2017
37 commits
-
To ensure that load_misc_binary() can't use the partially destroyed
Node, see also the next patch.The current logic looks wrong in any case, once we close interp_file it
doesn't make any sense to delay kfree(inode->i_private), this Node is no
longer valid. Even if the MISC_FMT_OPEN_FILE/interp_file checks were
not racy (they are), load_misc_binary() should not try to reopen
->interpreter if MISC_FMT_OPEN_FILE is set but ->interp_file is NULL.And I can't understand why do we use filp_close(), not fput().
Link: http://lkml.kernel.org/r/20170922143644.GA17216@redhat.com
Signed-off-by: Oleg Nesterov
Acked-by: Kees Cook
Cc: Al Viro
Cc: Ben Woodard
Cc: James Bottomley
Cc: Jim Foraker
Cc:
Cc: Travis Gummels
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
kill_node() nullifies/checks Node->dentry to avoid double free. This
complicates the next changes and this is very confusing:- we do not need to check dentry != NULL under entries_lock,
kill_node() is always called under inode_lock(d_inode(root)) and we
rely on this inode_lock() anyway, without this lock the
MISC_FMT_OPEN_FILE cleanup could race with itself.- if kill_inode() was already called and ->dentry == NULL we should not
even try to close e->interp_file.We can change bm_entry_write() to simply check !list_empty(list) before
kill_node. Again, we rely on inode_lock(), in particular it saves us
from the race with bm_status_write(), another caller of kill_node().Link: http://lkml.kernel.org/r/20170922143641.GA17210@redhat.com
Signed-off-by: Oleg Nesterov
Acked-by: Kees Cook
Cc: Al Viro
Cc: Ben Woodard
Cc: James Bottomley
Cc: Jim Foraker
Cc:
Cc: Travis Gummels
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Patch series "exec: binfmt_misc: fix use-after-free, kill
iname[BINPRM_BUF_SIZE]".It looks like this code was always wrong, then commit 948b701a607f
("binfmt_misc: add persistent opened binary handler for containers")
added more problems.This patch (of 6):
load_script() can simply use i_name instead, it points into bprm->buf[]
and nobody can change this memory until we call prepare_binprm().The only complication is that we need to also change the signature of
bprm_change_interp() but this change looks good too.While at it, do whitespace/style cleanups.
NOTE: the real motivation for this change is that people want to
increase BINPRM_BUF_SIZE, we need to change load_misc_binary() too but
this looks more complicated because afaics it is very buggy.Link: http://lkml.kernel.org/r/20170918163446.GA26793@redhat.com
Signed-off-by: Oleg Nesterov
Acked-by: Kees Cook
Cc: Travis Gummels
Cc: Ben Woodard
Cc: Jim Foraker
Cc:
Cc: Al Viro
Cc: James Bottomley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When reading the event from the uffd, we put it on a temporary
fork_event list to detect if we can still access it after releasing and
retaking the event_wqh.lock.If fork aborts and removes the event from the fork_event all is fine as
long as we're still in the userfault read context and fork_event head is
still alive.We've to put the event allocated in the fork kernel stack, back from
fork_event list-head to the event_wqh head, before returning from
userfaultfd_ctx_read, because the fork_event head lifetime is limited to
the userfaultfd_ctx_read stack lifetime.Forgetting to move the event back to its event_wqh place then results in
__remove_wait_queue(&ctx->event_wqh, &ewq->wq); in
userfaultfd_event_wait_completion to remove it from a head that has been
already freed from the reader stack.This could only happen if resolve_userfault_fork failed (for example if
there are no file descriptors available to allocate the fork uffd). If
it succeeded it was put back correctly.Furthermore, after find_userfault_evt receives a fork event, the forked
userfault context in fork_nctx and uwq->msg.arg.reserved.reserved1 can
be released by the fork thread as soon as the event_wqh.lock is
released. Taking a reference on the fork_nctx before dropping the lock
prevents an use after free in resolve_userfault_fork().If the fork side aborted and it already released everything, we still
try to succeed resolve_userfault_fork(), if possible.Fixes: 893e26e61d04eac9 ("userfaultfd: non-cooperative: Add fork() event")
Link: http://lkml.kernel.org/r/20170920180413.26713-1-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli
Reported-by: Mark Rutland
Tested-by: Mark Rutland
Cc: Pavel Emelyanov
Cc: Mike Rapoport
Cc: "Dr. David Alan Gilbert"
Cc: Mike Kravetz
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
With device public pages at the end of my memory space, I'm getting
output from _vm_normal_page():BUG: Bad page map in process migrate_pages pte:c0800001ffff0d06 pmd:f95d3000
addr:00007fff89330000 vm_flags:00100073 anon_vma:c0000000fa899320 mapping: (null) index:7fff8933
file: (null) fault: (null) mmap: (null) readpage: (null)
CPU: 0 PID: 13963 Comm: migrate_pages Tainted: P B OE 4.14.0-rc1-wip #155
Call Trace:
dump_stack+0xb0/0xf4 (unreliable)
print_bad_pte+0x28c/0x340
_vm_normal_page+0xc0/0x140
zap_pte_range+0x664/0xc10
unmap_page_range+0x318/0x670
unmap_vmas+0x74/0xe0
exit_mmap+0xe8/0x1f0
mmput+0xac/0x1f0
do_exit+0x348/0xcd0
do_group_exit+0x5c/0xf0
SyS_exit_group+0x1c/0x20
system_call+0x58/0x6cThe pfn causing this is the very last one. Correct the bounds check
accordingly.Fixes: df6ad69838fc ("mm/device-public-memory: device memory cache coherent with CPU")
Link: http://lkml.kernel.org/r/1506092178-20351-1-git-send-email-arbab@linux.vnet.ibm.com
Signed-off-by: Reza Arbab
Reviewed-by: Jérôme Glisse
Reviewed-by: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
MADV_FREE clears pte dirty bit and then marks the page lazyfree (clear
SwapBacked). There is no lock to prevent the page is added to swap
cache between these two steps by page reclaim. If page reclaim finds
such page, it will simply add the page to swap cache without pageout the
page to swap because the page is marked as clean. Next time, page fault
will read data from the swap slot which doesn't have the original data,
so we have a data corruption. To fix issue, we mark the page dirty and
pageout the page.However, we shouldn't dirty all pages which is clean and in swap cache.
swapin page is swap cache and clean too. So we only dirty page which is
added into swap cache in page reclaim, which shouldn't be swapin page.
As Minchan suggested, simply dirty the page in add_to_swap can do the
job.Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
Link: http://lkml.kernel.org/r/08c84256b007bf3f63c91d94383bd9eb6fee2daa.1506446061.git.shli@fb.com
Signed-off-by: Shaohua Li
Reported-by: Artem Savkov
Acked-by: Michal Hocko
Acked-by: Minchan Kim
Cc: Johannes Weiner
Cc: Hillf Danton
Cc: Hugh Dickins
Cc: Rik van Riel
Cc: Mel Gorman
Cc: [4.12+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
MADV_FREE clears pte dirty bit and then marks the page lazyfree (clear
SwapBacked). There is no lock to prevent the page is added to swap
cache between these two steps by page reclaim. Page reclaim could add
the page to swap cache and unmap the page. After page reclaim, the page
is added back to lru. At that time, we probably start draining per-cpu
pagevec and mark the page lazyfree. So the page could be in a state
with SwapBacked cleared and PG_swapcache set. Next time there is a
refault in the virtual address, do_swap_page can find the page from swap
cache but the page has PageSwapCache false because SwapBacked isn't set,
so do_swap_page will bail out and do nothing. The task will keep
running into fault handler.Fixes: 802a3a92ad7a ("mm: reclaim MADV_FREE pages")
Link: http://lkml.kernel.org/r/6537ef3814398c0073630b03f176263bc81f0902.1506446061.git.shli@fb.com
Signed-off-by: Shaohua Li
Reported-by: Artem Savkov
Tested-by: Artem Savkov
Reviewed-by: Rik van Riel
Acked-by: Johannes Weiner
Acked-by: Michal Hocko
Acked-by: Minchan Kim
Cc: Hillf Danton
Cc: Hugh Dickins
Cc: Mel Gorman
Cc: [4.12+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Eryu noticed that he could sometimes get a leftover error reported when
it shouldn't be on fsync with ext2 and non-journalled ext4.The problem is that writeback_single_inode still uses filemap_fdatawait.
That picks up a previously set AS_EIO flag, which would ordinarily have
been cleared before.Since we're mostly using this function as a replacement for
filemap_check_errors, have filemap_check_and_advance_wb_err clear AS_EIO
and AS_ENOSPC when reporting an error. That should allow the new
function to better emulate the behavior of the old with respect to these
flags.Link: http://lkml.kernel.org/r/20170922133331.28812-1-jlayton@kernel.org
Signed-off-by: Jeff Layton
Reported-by: Eryu Guan
Reviewed-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The build of m32r allmodconfig is giving lots of build warnings about:
include/linux/byteorder/big_endian.h:7:2:
warning: #warning inconsistent configuration,
needs CONFIG_CPU_BIG_ENDIAN [-Wcpp]
#warning inconsistent configuration, needs CONFIG_CPU_BIG_ENDIANDefine CPU_BIG_ENDIAN like the way CPU_LITTLE_ENDIAN is defined.
Link: http://lkml.kernel.org/r/1505678083-10320-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In testing I found handle passed to zs_map_object in __zram_bvec_read is
NULL so eh kernel goes oops in pin_object().The reason is there is no routine to check the slot's freeing after
getting the slot's lock. This patch fixes it.[minchan@kernel.org: v2]
Link: http://lkml.kernel.org/r/1505887347-10881-1-git-send-email-minchan@kernel.org
Link: http://lkml.kernel.org/r/1505788488-26723-1-git-send-email-minchan@kernel.org
Fixes: 1f7319c74275 ("zram: partial IO refactoring")
Signed-off-by: Minchan Kim
Reviewed-by: Sergey Senozhatsky
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
On powerpc, RODATA_TEST fails with message the following messages:
Freeing unused kernel memory: 528K
rodata_test: test data was not read onlyThis is because GCC allocates it to .data section:
c0695034 g O .data 00000004 rodata_test_data
Since commit 056b9d8a7692 ("mm: remove rodata_test_data export, add
pr_fmt"), rodata_test_data is used only inside rodata_test.c By
declaring it static, it gets properly allocated into .rodata section
instead of .data:c04df710 l O .rodata 00000004 rodata_test_data
Fixes: 056b9d8a7692 ("mm: remove rodata_test_data export, add pr_fmt")
Link: http://lkml.kernel.org/r/20170921093729.1080368AC1@po15668-vm-win7.idsi0.si.c-s.fr
Signed-off-by: Christophe Leroy
Cc: Kees Cook
Cc: Jinbum Park
Cc: Segher Boessenkool
Cc: David Laight
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Locking of config and doorbell operations should be done only if the
underlying hardware requires it.This patch removes the global spinlocks from the rapidio subsystem and
moves them to the mport drivers (fsl_rio and tsi721), only to the
necessary places. For example, local config space read and write
operations (lcread/lcwrite) are atomic in all existing drivers, so there
should be no need for locking, while the cread/cwrite operations which
generate maintenance transactions need to be synchronized with a lock.Later, each driver could chose to use a per-port lock instead of a
global one, or even more granular locking.Link: http://lkml.kernel.org/r/20170824113023.GD50104@nokia.com
Signed-off-by: Ioan Nicu
Signed-off-by: Frank Kunz
Acked-by: Alexandre Bounine
Cc: Matt Porter
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Nicholas Piggin
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The function is called from __meminit context and calls other __meminit
functions but isn't it self mark as such today:WARNING: vmlinux.o(.text.unlikely+0x4516): Section mismatch in reference from the function init_reserved_page() to the function .meminit.text:early_pfn_to_nid()
The function init_reserved_page() references the function __meminit early_pfn_to_nid().
This is often because init_reserved_page lacks a __meminit annotation or the annotation of early_pfn_to_nid is wrong.On most compilers, we don't notice this because the function gets
inlined all the time. Adding __meminit here fixes the harmless warning
for the old versions and is generally the correct annotation.Link: http://lkml.kernel.org/r/20170915193149.901180-1-arnd@arndb.de
Fixes: 7e18adb4f80b ("mm: meminit: initialise remaining struct pages in parallel with kswapd")
Signed-off-by: Arnd Bergmann
Acked-by: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix the situation when clear_bit() is called for page->private before
the page pointer is actually assigned. While at it, remove work_busy()
check because it is costly and does not give 100% guarantee anyway.Signed-off-by: Vitaly Wool
Cc: Dan Streetman
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Andrea brought to my attention that the L->{L,S} guarantees are
completely bogus for this case. I was looking at the diagram, from the
offending commit, when that _is_ the race, we had the load reordered
already.What we need is at least S->L semantics, thus simply use
wq_has_sleeper() to serialize the call for good.Link: http://lkml.kernel.org/r/20170914175313.GB811@linux-80c1.suse
Fixes: 46acef048a6 (mm,compaction: serialize waitqueue_active() checks)
Signed-off-by: Davidlohr Bueso
Reported-by: Andrea Parri
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Drop the global lru lock in isolate callback before calling
zap_page_range which calls cond_resched, and re-acquire the global lru
lock before returning. Also change return code to LRU_REMOVED_RETRY.Use mmput_async when fail to acquire mmap sem in an atomic context.
Fix "BUG: sleeping function called from invalid context"
errors when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.Also restore mmput_async, which was initially introduced in commit
ec8d7c14ea14 ("mm, oom_reaper: do not mmput synchronously from the oom
reaper context"), and was removed in commit 212925802454 ("mm: oom: let
oom_reap_task and exit_mmap run concurrently").Link: http://lkml.kernel.org/r/20170914182231.90908-1-sherryy@android.com
Fixes: f2517eb76f1f2 ("android: binder: Add global lru shrinker to binder")
Signed-off-by: Sherry Yang
Signed-off-by: Greg Kroah-Hartman
Reported-by: Kyle Yan
Acked-by: Arve Hjønnevåg
Acked-by: Michal Hocko
Cc: Martijn Coenen
Cc: Todd Kjos
Cc: Riley Andrews
Cc: Ingo Molnar
Cc: Vlastimil Babka
Cc: Hillf Danton
Cc: Peter Zijlstra
Cc: Andrea Arcangeli
Cc: Thomas Gleixner
Cc: Andy Lutomirski
Cc: Oleg Nesterov
Cc: Hoeun Ryu
Cc: Christopher Lameter
Cc: Vegard Nossum
Cc: Frederic Weisbecker
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix for 4.14, zone device page always have an elevated refcount of one
and thus page count sanity check in uncharge_page() is inappropriate for
them.[mhocko@suse.com: nano-optimize VM_BUG_ON in uncharge_page]
Link: http://lkml.kernel.org/r/20170914190011.5217-1-jglisse@redhat.com
Signed-off-by: Jérôme Glisse
Signed-off-by: Michal Hocko
Reported-by: Evgeny Baskakov
Acked-by: Michal Hocko
Cc: Johannes Weiner
Cc: Vladimir Davydov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The following lockdep splat has been noticed during LTP testing
======================================================
WARNING: possible circular locking dependency detected
4.13.0-rc3-next-20170807 #12 Not tainted
------------------------------------------------------
a.out/4771 is trying to acquire lock:
(cpu_hotplug_lock.rw_sem){++++++}, at: [] drain_all_stock.part.35+0x18/0x140but task is already holding lock:
(&mm->mmap_sem){++++++}, at: [] __do_page_fault+0x175/0x530which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&mm->mmap_sem){++++++}:
lock_acquire+0xc9/0x230
__might_fault+0x70/0xa0
_copy_to_user+0x23/0x70
filldir+0xa7/0x110
xfs_dir2_sf_getdents.isra.10+0x20c/0x2c0 [xfs]
xfs_readdir+0x1fa/0x2c0 [xfs]
xfs_file_readdir+0x30/0x40 [xfs]
iterate_dir+0x17a/0x1a0
SyS_getdents+0xb0/0x160
entry_SYSCALL_64_fastpath+0x1f/0xbe-> #2 (&type->i_mutex_dir_key#3){++++++}:
lock_acquire+0xc9/0x230
down_read+0x51/0xb0
lookup_slow+0xde/0x210
walk_component+0x160/0x250
link_path_walk+0x1a6/0x610
path_openat+0xe4/0xd50
do_filp_open+0x91/0x100
file_open_name+0xf5/0x130
filp_open+0x33/0x50
kernel_read_file_from_path+0x39/0x80
_request_firmware+0x39f/0x880
request_firmware_direct+0x37/0x50
request_microcode_fw+0x64/0xe0
reload_store+0xf7/0x180
dev_attr_store+0x18/0x30
sysfs_kf_write+0x44/0x60
kernfs_fop_write+0x113/0x1a0
__vfs_write+0x37/0x170
vfs_write+0xc7/0x1c0
SyS_write+0x58/0xc0
do_syscall_64+0x6c/0x1f0
return_from_SYSCALL_64+0x0/0x7a-> #1 (microcode_mutex){+.+.+.}:
lock_acquire+0xc9/0x230
__mutex_lock+0x88/0x960
mutex_lock_nested+0x1b/0x20
microcode_init+0xbb/0x208
do_one_initcall+0x51/0x1a9
kernel_init_freeable+0x208/0x2a7
kernel_init+0xe/0x104
ret_from_fork+0x2a/0x40-> #0 (cpu_hotplug_lock.rw_sem){++++++}:
__lock_acquire+0x153c/0x1550
lock_acquire+0xc9/0x230
cpus_read_lock+0x4b/0x90
drain_all_stock.part.35+0x18/0x140
try_charge+0x3ab/0x6e0
mem_cgroup_try_charge+0x7f/0x2c0
shmem_getpage_gfp+0x25f/0x1050
shmem_fault+0x96/0x200
__do_fault+0x1e/0xa0
__handle_mm_fault+0x9c3/0xe00
handle_mm_fault+0x16e/0x380
__do_page_fault+0x24a/0x530
do_page_fault+0x30/0x80
page_fault+0x28/0x30other info that might help us debug this:
Chain exists of:
cpu_hotplug_lock.rw_sem --> &type->i_mutex_dir_key#3 --> &mm->mmap_semPossible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&mm->mmap_sem);
lock(&type->i_mutex_dir_key#3);
lock(&mm->mmap_sem);
lock(cpu_hotplug_lock.rw_sem);*** DEADLOCK ***
2 locks held by a.out/4771:
#0: (&mm->mmap_sem){++++++}, at: [] __do_page_fault+0x175/0x530
#1: (percpu_charge_mutex){+.+...}, at: [] try_charge+0x397/0x6e0The problem is very similar to the one fixed by commit a459eeb7b852
("mm, page_alloc: do not depend on cpu hotplug locks inside the
allocator"). We are taking hotplug locks while we can be sitting on top
of basically arbitrary locks. This just calls for problems.We can get rid of {get,put}_online_cpus, fortunately. We do not have to
be worried about races with memory hotplug because drain_local_stock,
which is called from both the WQ draining and the memory hotplug
contexts, is always operating on the local cpu stock with IRQs disabled.The only thing to be careful about is that the target memcg doesn't
vanish while we are still in drain_all_stock so take a reference on it.Link: http://lkml.kernel.org/r/20170913090023.28322-1-mhocko@kernel.org
Signed-off-by: Michal Hocko
Reported-by: Artem Savkov
Tested-by: Artem Savkov
Cc: Johannes Weiner
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Andrea has noticed that the oom_reaper doesn't invalidate the range via
mmu notifiers (mmu_notifier_invalidate_range_start/end) and that can
corrupt the memory of the kvm guest for example.tlb_flush_mmu_tlbonly already invokes mmu notifiers but that is not
sufficient as per Andrea:"mmu_notifier_invalidate_range cannot be used in replacement of
mmu_notifier_invalidate_range_start/end. For KVM
mmu_notifier_invalidate_range is a noop and rightfully so. A MMU
notifier implementation has to implement either ->invalidate_range
method or the invalidate_range_start/end methods, not both. And if you
implement invalidate_range_start/end like KVM is forced to do, calling
mmu_notifier_invalidate_range in common code is a noop for KVM.For those MMU notifiers that can get away only implementing
->invalidate_range, the ->invalidate_range is implicitly called by
mmu_notifier_invalidate_range_end(). And only those secondary MMUs
that share the same pagetable with the primary MMU (like AMD iommuv2)
can get away only implementing ->invalidate_range"As the callback is allowed to sleep and the implementation is out of
hand of the MM it is safer to simply bail out if there is an mmu
notifier registered. In order to not fail too early make the
mm_has_notifiers check under the oom_lock and have a little nap before
failing to give the current oom victim some more time to exit.[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170913113427.2291-1-mhocko@kernel.org
Fixes: aac453635549 ("mm, oom: introduce oom reaper")
Signed-off-by: Michal Hocko
Reported-by: Andrea Arcangeli
Reviewed-by: Andrea Arcangeli
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
It is possible that on a (partially) unsuccessful page reclaim,
kref_put() called in z3fold_reclaim_page() does not yield page release,
but the page is released shortly afterwards by another thread. Then
z3fold_reclaim_page() would try to list_add() that (released) page again
which is obviously a bug.To avoid that, spin_lock() has to be taken earlier, before the
kref_put() call mentioned earlier.Link: http://lkml.kernel.org/r/20170913162937.bfff21c7d12b12a5f47639fd@gmail.com
Signed-off-by: Vitaly Wool
Cc: Dan Streetman
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated
array initializers, where the GPIO_* enums serve as indices. If enum
values are defined, but never used, pinmux_pins[] contains (zero-filled)
holes. Such entries are treated as pin zero, which was registered
before, thus leading to pinctrl registration failures, as seen on
sh7722:sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22Remove GPIO_PH[0-7] from the enum to fix this.
Link: http://lkml.kernel.org/r/1505205657-18012-5-git-send-email-geert+renesas@glider.be
Fixes: ef0fa5331a73e479 ("sh: Add pinmux for sh7269")
Signed-off-by: Geert Uytterhoeven
Reviewed-by: Laurent Pinchart
Cc: Yoshinori Sato
Cc: Rich Felker
Cc: Magnus Damm
Cc: Yoshihiro Shimoda
Cc: Jacopo Mondi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Pinmux_pins[] is initialized through PINMUX_GPIO(), using designated
array initializers, where the GPIO_* enums serve as indices. If enum
values are defined, but never used, pinmux_pins[] contains (zero-filled)
holes. Such entries are treated as pin zero, which was registered
before, thus leading to pinctrl registration failures, as seen on
sh7722:sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22Remove GPIO_PH[0-7] from the enum to fix this.
Link: http://lkml.kernel.org/r/1505205657-18012-4-git-send-email-geert+renesas@glider.be
Fixes: 41797f75486d8ca3 ("sh: Add pinmux for sh7264")
Signed-off-by: Geert Uytterhoeven
Reviewed-by: Laurent Pinchart
Cc: Jacopo Mondi
Cc: Magnus Damm
Cc: Rich Felker
Cc: Yoshihiro Shimoda
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Commit 3810e96056ff ("sh: modify pinmux for SH7757 2nd cut") renamed
GPIO_PT[JLNQ]7 to GPIO_PT[JLNQ]7_RESV, and removed the existing users
from the pinmux_pins[] array.However, pinmux_pins[] is initialized through PINMUX_GPIO(), using
designated array initializers, where the GPIO_* enums serve as indices.
Hence entries were not really removed, but replaced by (zero-filled)
holes. Such entries are treated as pin zero, which was registered
before, thus leading to pinctrl registration failures, as seen on
sh7722:sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22Remove GPIO_PT[JLNQ]7_RESV from the enum to fix this.
Link: http://lkml.kernel.org/r/1505205657-18012-3-git-send-email-geert+renesas@glider.be
Fixes: 3810e96056ffddf6 ("sh: modify pinmux for SH7757 2nd cut")
Signed-off-by: Geert Uytterhoeven
Reviewed-by: Laurent Pinchart
Cc: Jacopo Mondi
Cc: Magnus Damm
Cc: Rich Felker
Cc: Yoshihiro Shimoda
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Patch series "sh: sh7722/sh7757i/sh7264/sh7269: Fix pinctrl registration",
v2.Magnus Damm reported that on sh7722/Migo-R, pinctrl registration fails
with:sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22pinmux_pins[] is initialized through PINMUX_GPIO(), using designated
array initializers, where the GPIO_* enums serve as indices. Apparently
GPIO_PTQ7 was defined in the enum, but never used. If enum values are
defined, but never used, pinmux_pins[] contains (zero-filled) holes.
Hence such entries are treated as pin zero, which was registered before,
and pinctrl registration fails.I can't see how this ever worked, as at the time of commit f5e25ae52fef
("sh-pfc: Add sh7722 pinmux support"), pinmux_gpios[] in
drivers/pinctrl/sh-pfc/pfc-sh7722.c already had the hole, and
drivers/pinctrl/core.c already had the check.Some scripting revealed a few more broken drivers:
- sh7757 has four holes, due to nonexistent GPIO_PT[JLNQ]7_RESV.
- sh7264 and sh7269 define GPIO_PH[0-7], but don't use it with
PINMUX_GPIO().Patch 1 fixes the issue on sh7722, and was tested. Patches 3-4 should
fix the issue on the other 3 SoCs, but was untested due to lack of
hardware.This patch (of 4):
On sh7722/Migo-R, pinctrl registration fails with:
sh-pfc pfc-sh7722: pin 0 already registered
sh-pfc pfc-sh7722: error during pin registration
sh-pfc pfc-sh7722: could not register: -22
sh-pfc: probe of pfc-sh7722 failed with error -22pinmux_pins[] is initialized through PINMUX_GPIO(), using designated array
initializers, where the GPIO_* enums serve as indices. As GPIO_PTQ7 is
defined in the enum, but never used, pinmux_pins[] contains a
(zero-filled) hole. Hence this entry is treated as pin zero, which was
registered before, and pinctrl registration fails.According to the datasheet, port PTQ7 does not exist. Hence remove
GPIO_PTQ7 from the enum to fix this.Link: http://lkml.kernel.org/r/1505205657-18012-2-git-send-email-geert+renesas@glider.be
Fixes: 8d7b5b0af7e070b9 ("sh: Add sh7722 pinmux code")
Signed-off-by: Geert Uytterhoeven
Reported-by: Magnus Damm
Reviewed-by: Laurent Pinchart
Tested-by: Jacopo Mondi
Cc: Rich Felker
Cc: Yoshihiro Shimoda
Cc: Yoshinori Sato
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This fixes a bug in madvise() where if you'd try to soft offline a
hugepage via madvise(), while walking the address range you'd end up,
using the wrong page offset due to attempting to get the compound order
of a former but presently not compound page, due to dissolving the huge
page (since commit c3114a84f7f9: "mm: hugetlb: soft-offline: dissolve
source hugepage after successful migration").As a result I ended up with all my free pages except one being offlined.
Link: http://lkml.kernel.org/r/20170912204306.GA12053@gmail.com
Fixes: c3114a84f7f9 ("mm: hugetlb: soft-offline: dissolve source hugepage after successful migration")
Signed-off-by: Alexandru Moise
Cc: Anshuman Khandual
Cc: Michal Hocko
Cc: Andrea Arcangeli
Cc: Minchan Kim
Cc: Hillf Danton
Cc: Shaohua Li
Cc: Mike Rapoport
Cc: "Kirill A. Shutemov"
Cc: Mel Gorman
Cc: David Rientjes
Cc: Rik van Riel
Cc: Naoya Horiguchi
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
In this place mm is unlocked, so vmas or list may change. Down read
mmap_sem to protect them from modifications.Link: http://lkml.kernel.org/r/150512788393.10691.8868381099691121308.stgit@localhost.localdomain
Fixes: e86c59b1b12d ("mm/ksm: improve deduplication of zero pages with colouring")
Signed-off-by: Kirill Tkhai
Acked-by: Michal Hocko
Reviewed-by: Andrea Arcangeli
Cc: Minchan Kim
Cc: zhong jiang
Cc: Ingo Molnar
Cc: Claudio Imbrenda
Cc: "Kirill A. Shutemov"
Cc: Hugh Dickins
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
There's a typo in recent change of VM_MPX definition. We want it to be
VM_HIGH_ARCH_4, not VM_HIGH_ARCH_BIT_4.This bug does cause visible regressions. In arch_vma_name the vmflags
are tested against VM_MPX. With the incorrect value of VM_MPX, a number
of vmas (such as the stack) test positive and end up being marked as
"[mpx]" in /proc/N/maps instead of their correct names.This confuses tools like rr which expect to be able to find familiar
vmas.Fixes: df3735c5b40f ("x86,mpx: make mpx depend on x86-64 to free up VMA flag")
Link: http://lkml.kernel.org/r/20170918140253.36856-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov
Reviewed-by: Rik van Riel
Cc: Dave Hansen
Cc: Kyle Huey
Cc: [4.14+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Here are some of the more spelling mistakes and typos that I've found
while fixing up spelling mistakes in kernel error message text over the
past eight weeks.[akpm@linux-foundation.org: s/|/||/, per Joe]
Link: http://lkml.kernel.org/r/20170919090818.5989-1-colin.king@canonical.com
Signed-off-by: Colin Ian King
Acked-by: Kees Cook
Cc: Masahiro Yamada
Cc: Stephen Boyd
Cc: Joe Perches
Cc: Ross Zwisler
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This parameter is named kp, so the documentation should use that.
Fixes: 9b473de87209 ("param: Fix duplicate module prefixes")
Link: http://lkml.kernel.org/r/20170919142656.64aea59e@endymion
Signed-off-by: Jean Delvare
Acked-by: Rusty Russell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The build of alpha allmodconfig is giving error:
arch/alpha/include/asm/mmu_context.h: In function 'ev5_switch_mm':
arch/alpha/include/asm/mmu_context.h:160:2: error:
implicit declaration of function 'task_thread_info';
did you mean 'init_thread_info'? [-Werror=implicit-function-declaration]The file 'mmu_context.h' needed an extra header file.
Link: http://lkml.kernel.org/r/1505668810-7497-1-git-send-email-sudipm.mukherjee@gmail.com
Signed-off-by: Sudip Mukherjee
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Matt Turner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Pull workqueue fixlet from Tejun Heo:
"Minor documentation update"* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
Documentation: core-api: minor workqueue.rst cleanups -
Pull cgroup fix from Tejun Heo:
"The recent migration code updates assumed that migrations always
execute from the top to the bottom once and didn't clean up internal
states after each migration round; however, cgroup_transfer_tasks()
repeats the inner steps multiple times and the garbage internal states
from the previous iteration led to OOPS.Waiman fixed the bug by reinitializing the relevant states at the end
of each migration round"* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Reinit cgroup_taskset structure before cgroup_migrate_execute() returns -
Pull percpu fixes from Tejun Heo:
"Rather important fixes this time.- The new percpu area allocator had a subtle bug in how it iterates
the memory regions and could skip viable areas, which led to
allocation failures for module static percpu variables. Dennis
fixed the bug and another non-critical one in stat calculation.- Mark noticed that the generic implementations of percpu local
atomic reads aren't properly protected against irqs and there's a
(slim) chance for split reads on some 32bit systems. Generic
implementations are updated to disable irq when read size is larger
than ulong size. This may have made some 32bit archs which can do
atomic local 64bit accesses generate sub-optimal code. We need to
find them out and implement arch-specific overrides"* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: fix iteration to prevent skipping over block
percpu: fix starting offset for chunk statistics traversal
percpu: make this_cpu_generic_read() atomic w.r.t. interrupts -
Pull libata fixes from Tejun Heo:
"Nothing too interesting.Arnd's gcc-7 warning fixes that slipped through the cracks for two
release cycles (my bad), and two minor low level driver updates"* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ahci: don't ignore result code of ahci_reset_controller()
ata_piix: Add Fujitsu-Siemens Lifebook S6120 to short cable IDs
ata: avoid gcc-7 warning in ata_timing_quantize -
Pull USB fixes from Greg KH:
"Here are a number of USB fixes for 4.14-rc4 to resolved reported
issues.There's a bunch of stuff in here based on the great work Andrey
Konovalov is doing in fuzzing the USB stack. Lots of bug fixes when
dealing with corrupted USB descriptors that we've never seen in
"normal" operation, but is now ensuring the stack is much more
hardened overall.There's also the usual XHCI and gadget driver fixes as well, and a
build error fix, and a few other minor things, full details in the
shortlog.All of these have been in linux-next with no reported issues"
* tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (38 commits)
usb: dwc3: of-simple: Add compatible for Spreadtrum SC9860 platform
usb: gadget: udc: atmel: set vbus irqflags explicitly
usb: gadget: ffs: handle I/O completion in-order
usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction
usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe
usb: gadget: udc: renesas_usb3: Fix return value of usb3_write_pipe()
usb: gadget: udc: renesas_usb3: fix Pn_RAMMAP.Pn_MPKT value
usb: gadget: udc: renesas_usb3: fix for no-data control transfer
USB: dummy-hcd: Fix erroneous synchronization change
USB: dummy-hcd: fix infinite-loop resubmission bug
USB: dummy-hcd: fix connection failures (wrong speed)
USB: cdc-wdm: ignore -EPIPE from GetEncapsulatedResponse
USB: devio: Don't corrupt user memory
USB: devio: Prevent integer overflow in proc_do_submiturb()
USB: g_mass_storage: Fix deadlock when driver is unbound
USB: gadgetfs: Fix crash caused by inadequate synchronization
USB: gadgetfs: fix copy_to_user while holding spinlock
USB: uas: fix bug in handling of alternate settings
usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives
usb-storage: fix bogus hardware error messages for ATA pass-thru devices
... -
Pull tty/serial fixes from Greg KH:
"Here are a small number (5) of patches for some reported TTY and
serial issues. Nothing major, a documentation update, timing fix,
error handling fix, name reporting fix, and a timeout issue resolved.All of these have been in linux-next for a while with no reported
issues"* tag 'tty-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: sccnxp: Fix error handling in sccnxp_probe()
tty: serial: lpuart: avoid report NULL interrupt
serial: bcm63xx: fix timing issue.
mxser: fix timeout calculation for low rates
serial: sh-sci: document R8A77970 bindings -
Pull staging/IIO fixes from Greg KH:
"Here are some small staging/IIO driver fixes for 4.14-rc4Most of these have been in my tree for a while due to travels, sorry
for the delay. They resolve a number of small issues reported by
people, mostly for the iio drivers. Nothing major in here, full
details are in the shortlog.All have been linux-next for a few weeks with no reported issues"
* tag 'staging-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (23 commits)
staging: iio: ad7192: Fix - use the dedicated reset function avoiding dma from stack.
iio: core: Return error for failed read_reg
iio: ad7793: Fix the serial interface reset
iio: ad_sigma_delta: Implement a dedicated reset function
IIO: BME280: Updates to Humidity readings need ctrl_reg write!
iio: adc: mcp320x: Fix readout of negative voltages
iio: adc: mcp320x: Fix oops on module unload
iio: adc: stm32: fix bad error check on max_channels
iio: trigger: stm32-timer: fix a corner case to write preset
iio: trigger: stm32-timer: preset shouldn't be buffered
iio: adc: twl4030: Return an error if we can not enable the vusb3v1 regulator in 'twl4030_madc_probe()'
iio: adc: twl4030: Disable the vusb3v1 rugulator in the error handling path of 'twl4030_madc_probe()'
iio: adc: twl4030: Fix an error handling path in 'twl4030_madc_probe()'
staging: rtl8723bs: avoid null pointer dereference on pmlmepriv
staging: rtl8723bs: add missing range check on id
staging: vchiq_2835_arm: Fix NULL ptr dereference in free_pagelist
staging: speakup: fix speakup-r empty line lockup
staging: pi433: Move limit check to switch default to kill warning
staging: r8822be: fix null pointer dereferences with a null driver_adapter
staging: mt29f_spinand: Enable the read ECC before program the page
...
03 Oct, 2017
3 commits
-
Pull driver core fixes from Greg KH:
"Here are a few small fixes for 4.14-rc4.The removal of DRIVER_ATTR() was almost completed by 4.14-rc1, but one
straggler made it in through some other tree (odds are, one of
mine...) So there's a simple removal of the last user, and then
finally the macro is removed from the tree.There's a fix for old crazy udev instances that insist on reloading a
module when it is removed from the kernel due to the new uevents for
bind/unbind. This fixes the reported regression, hopefully some year
in the future we can drop the workaround, once users update to the
latest version, but I'm not holding my breath.And then there's a build fix for a linker warning, and a buffer
overflow fix to match the PCI fixes you took through the PCI tree in
the same area.All of these have been in linux-next for a few weeks while I've been
traveling, sorry for the delay"* tag 'driver-core-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: remove DRIVER_ATTR
fpga: altera-cvp: remove DRIVER_ATTR() usage
driver core: platform: Don't read past the end of "driver_override" buffer
base: arch_topology: fix section mismatch build warnings
driver core: suppress sending MODALIAS in UNBIND uevents -
Pull char/misc fixes from Greg KH:
"Here are a handful of char/misc driver fixes for 4.14-rc4.Nothing major, some binder fixups, hyperv fixes, and other tiny
things.All of these have been sitting in my tree for way too long, sorry for
the delay in getting them to you. All have been in linux-next for a
few weeks, and despite some people's feeling about if linux-next
actually tests things, I think it's a good "soak test" for patches"* tag 'char-misc-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: fcopy: restore correct transfer length
vmbus: don't acquire the mutex in vmbus_hvsock_device_unregister()
intel_th: pci: Add Lewisburg PCH support
intel_th: pci: Add Cedar Fork PCH support
stm class: Fix a use-after-free
nvmem: add missing of_node_put() in of_nvmem_cell_get()
nvmem: core: return EFBIG on out-of-range write
auxdisplay: charlcd: properly restore atomic counter on error path
binder: fix memory corruption in binder_transaction binder
binder: fix an ret value override
android: binder: fix type mismatch warning -
ahci_pci_reset_controller() calls ahci_reset_controller(), which may
fail, but ignores the result code and always returns success. This
may result in failures like belowahci 0000:02:00.0: version 3.0
ahci 0000:02:00.0: enabling device (0000 -> 0003)
ahci 0000:02:00.0: SSS flag set, parallel bus scan disabled
ahci 0000:02:00.0: controller reset failed (0xffffffff)
ahci 0000:02:00.0: failed to stop engine (-5)
... repeated many times ...
ahci 0000:02:00.0: failed to stop engine (-5)
Unable to handle kernel paging request at virtual address ffff0000093f9018
...
PC is at ahci_stop_engine+0x5c/0xd8 [libahci]
LR is at ahci_deinit_port.constprop.12+0x1c/0xc0 [libahci]
...
[] ahci_stop_engine+0x5c/0xd8 [libahci]
[] ahci_deinit_port.constprop.12+0x1c/0xc0 [libahci]
[] ahci_init_controller+0x80/0x168 [libahci]
[] ahci_pci_init_controller+0x60/0x68 [ahci]
[] ahci_init_one+0x75c/0xd88 [ahci]
[] local_pci_probe+0x3c/0xb8
[] pci_device_probe+0x138/0x170
[] driver_probe_device+0x2dc/0x458
[] __driver_attach+0x114/0x118
[] bus_for_each_dev+0x60/0xa0
[] driver_attach+0x20/0x28
[] bus_add_driver+0x1f0/0x2a8
[] driver_register+0x60/0xf8
[] __pci_register_driver+0x3c/0x48
[] ahci_pci_driver_init+0x1c/0x1000 [ahci]
[] do_one_initcall+0x38/0x120where an obvious hardware level failure results in an unnecessary 15 second
delay and a subsequent crash.So record the result code of ahci_reset_controller() and relay it, rather
than ignoring it.Signed-off-by: Ard Biesheuvel
Signed-off-by: Tejun Heo