04 Oct, 2018

1 commit

  • [ Upstream commit 624fa7790f80575a4ec28fbdb2034097dc18d051 ]

    In the scsi_transport_srp implementation, iterating over a klist from
    atomic context cannot be avoided when the legacy block layer is used
    instead of blk-mq. Hence this patch, which makes it safe to use klists
    in atomic context and stops lockdep from reporting the following:

    WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
    Possible interrupt unsafe locking scenario:

    CPU0                                CPU1
    ----                                ----
    lock(&(&k->k_lock)->rlock);
                                        local_irq_disable();
                                        lock(&(&q->__queue_lock)->rlock);
                                        lock(&(&k->k_lock)->rlock);
    <Interrupt>
      lock(&(&q->__queue_lock)->rlock);

    stack backtrace:
    Workqueue: kblockd blk_timeout_work
    Call Trace:
    dump_stack+0xa4/0xf5
    check_usage+0x6e6/0x700
    __lock_acquire+0x185d/0x1b50
    lock_acquire+0xd2/0x260
    _raw_spin_lock+0x32/0x50
    klist_next+0x47/0x190
    device_for_each_child+0x8e/0x100
    srp_timed_out+0xaf/0x1d0 [scsi_transport_srp]
    scsi_times_out+0xd4/0x410 [scsi_mod]
    blk_rq_timed_out+0x36/0x70
    blk_timeout_work+0x1b5/0x220
    process_one_work+0x4fe/0xad0
    worker_thread+0x63/0x5a0
    kthread+0x1c1/0x1e0
    ret_from_fork+0x24/0x30

    See also commit c9ddf73476ff ("scsi: scsi_transport_srp: Fix shost to
    rport translation").

    Signed-off-by: Bart Van Assche
    Cc: Martin K. Petersen
    Cc: James Bottomley
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     

20 Sep, 2018

1 commit

  • Rehashing and destroying a large hash table takes a lot of time
    and happens in process context. It is safe to add cond_resched()
    in rhashtable_rehash_table() and rhashtable_free_and_destroy().
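
    A hedged sketch of where the calls land, with the loop shape and helper
    name assumed from lib/rhashtable.c:

        #include <linux/rhashtable.h>
        #include <linux/sched.h>

        static int rehash_table_sketch(struct rhashtable *ht,
                                       struct bucket_table *old_tbl)
        {
                unsigned int old_hash;
                int err;

                for (old_hash = 0; old_hash < old_tbl->size; old_hash++) {
                        err = rhashtable_rehash_chain(ht, old_hash);
                        if (err)
                                return err;
                        cond_resched();         /* yield once per bucket chain */
                }
                return 0;
        }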

    Signed-off-by: Eric Dumazet
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller
    (cherry picked from commit ae6da1f503abb5a5081f9f6c4a6881de97830f3e)
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

15 Sep, 2018

1 commit

  • commit fc91a3c4c27acdca0bc13af6fbb68c35cfd519f2 upstream.

    While debugging an issue, debugobject tracking warned about an annotation
    issue of an object on the stack. It turned out that the object in question
    was on a different stack, which was due to another issue.

    Thomas suggested printing the pointers and the location of the stack for
    the currently running task. This helped to figure out that the object was
    on the wrong stack.

    As this is generally useful information for debugging similar issues, make
    the error message more informative by printing the pointers.

    [ tglx: Massaged changelog ]
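
    A hedged sketch of what the more informative message can look like
    (format and helper shape assumed, not quoted from the patch):

        #include <linux/printk.h>
        #include <linux/sched/task_stack.h>

        static void report_stack_mismatch(void *addr, bool is_on_stack)
        {
                if (is_on_stack)
                        pr_warn("object %p is on stack %p, but NOT annotated\n",
                                addr, task_stack_page(current));
                else
                        pr_warn("object %p is NOT on stack %p, but annotated\n",
                                addr, task_stack_page(current));
        }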

    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Thomas Gleixner
    Acked-by: Waiman Long
    Acked-by: Yang Shi
    Cc: kernel-team@android.com
    Cc: Arnd Bergmann
    Cc: astrachan@google.com
    Link: https://lkml.kernel.org/r/20180723212531.202328-1-joel@joelfernandes.org
    Signed-off-by: Greg Kroah-Hartman

    Joel Fernandes (Google)
     

05 Sep, 2018

1 commit

  • commit 03fc7f9c99c1e7ae2925d459e8487f1a6f199f79 upstream.

    The commit 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI
    when logbuf_lock is available") brought back the possible deadlocks
    in printk() and NMI.

    The check of logbuf_lock is done only in printk_nmi_enter() to prevent
    mixed output. But another CPU might take the lock later, enter NMI, and:

    + Both NMIs might be serialized by yet another lock, for example,
    the one in nmi_cpu_backtrace().

    + The other CPU might get stopped in NMI, see smp_send_stop()
    in panic().

    The only safe solution is to use trylock when storing the message
    into the main log buffer. It might cause reordering when some lines
    go to the main log buffer directly and others are delayed via
    the per-CPU buffer. This means it is not useful in general.

    This patch replaces the problematic NMI deferred context with an NMI
    direct context. It can be used to mark code that might produce
    many messages in NMI, where the risk of losing them is more critical
    than possible reordering.

    The context is then used when dumping trace buffers on oops. That was
    the primary motivation for the original fix. Also, the reordering is
    even less of an issue there because some traces have their own timestamps.

    Finally, nmi_cpu_backtrace() no longer needs to be serialized because
    it will always use the per-CPU buffers again.
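
    A hedged usage sketch of the new context (function names assumed from the
    upstream patch; the real call site sits inside the trace-dump path itself):

        #include <linux/kernel.h>
        #include <linux/printk.h>

        static void dump_traces_from_nmi(void)
        {
                printk_nmi_direct_enter();      /* try the main log buffer, trylock only */
                ftrace_dump(DUMP_ALL);          /* many lines; losing them beats reordering */
                printk_nmi_direct_exit();
        }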

    Fixes: 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI when logbuf_lock is available")
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20180627142028.11259-1-pmladek@suse.com
    To: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Tetsuo Handa
    Cc: Sergey Senozhatsky
    Cc: linux-kernel@vger.kernel.org
    Cc: stable@vger.kernel.org
    Acked-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Signed-off-by: Greg Kroah-Hartman

    Petr Mladek
     

18 Aug, 2018

1 commit

  • commit 785a19f9d1dd8a4ab2d0633be4656653bd3de1fc upstream.

    The following kernel panic was observed on ARM64 platform due to a stale
    TLB entry.

    1. ioremap with 4K size, a valid pte page table is set.
    2. iounmap it, its pte entry is set to 0.
    3. ioremap the same address with 2M size, update its pmd entry with
    a new value.
    4. CPU may hit an exception because the old pmd entry is still in TLB,
    which leads to a kernel panic.

    Commit b6bdb7517c3d ("mm/vmalloc: add interfaces to free unmapped page
    table") has addressed this panic by falling to pte mappings in the above
    case on ARM64.

    To support pmd mappings in all cases, TLB purge needs to be performed
    in this case on ARM64.

    Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page()
    so that TLB purge can be added later in separate patches.
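
    The interface change described above, as declarations only (the TLB purge
    that will use 'addr' arrives in the follow-up patches):

        int pud_free_pmd_page(pud_t *pud, unsigned long addr);
        int pmd_free_pte_page(pmd_t *pmd, unsigned long addr);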

    [toshi.kani@hpe.com: merge changes, rewrite patch description]
    Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
    Signed-off-by: Chintan Pandya
    Signed-off-by: Toshi Kani
    Signed-off-by: Thomas Gleixner
    Cc: mhocko@suse.com
    Cc: akpm@linux-foundation.org
    Cc: hpa@zytor.com
    Cc: linux-mm@kvack.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: Will Deacon
    Cc: Joerg Roedel
    Cc: stable@vger.kernel.org
    Cc: Andrew Morton
    Cc: Michal Hocko
    Cc: "H. Peter Anvin"
    Cc:
    Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com
    Signed-off-by: Greg Kroah-Hartman

    Chintan Pandya
     

25 Jul, 2018

1 commit

  • [ Upstream commit 107d01f5ba10f4162c38109496607eb197059064 ]

    rhashtable_init() currently does not take into account the user-passed
    min_size parameter unless param->nelem_hint is set as well. As such,
    the default size (number of buckets) will always be HASH_DEFAULT_SIZE
    even if the smallest allowed size is larger than that. Remediate this
    by unconditionally calling into rounded_hashtable_size() and handling
    things accordingly.
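
    A hedged sketch of the resulting behaviour (body and constant assumed from
    lib/rhashtable.c): the initial bucket count now honours min_size even when
    no nelem_hint is given.

        #include <linux/kernel.h>
        #include <linux/log2.h>
        #include <linux/rhashtable.h>

        #define HASH_DEFAULT_SIZE 64UL  /* as in lib/rhashtable.c */

        static size_t rounded_hashtable_size(const struct rhashtable_params *params)
        {
                size_t retsize;

                if (params->nelem_hint)
                        retsize = max(roundup_pow_of_two(params->nelem_hint * 4 / 3),
                                      (unsigned long)params->min_size);
                else
                        retsize = max(HASH_DEFAULT_SIZE,
                                      (unsigned long)params->min_size);

                return retsize;
        }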

    Signed-off-by: Davidlohr Bueso
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davidlohr Bueso
     

03 Jul, 2018

1 commit

  • commit 666902e42fd8344b923c02dc5b0f37948ff4f225 upstream.

    "%pCr" formats the current rate of a clock, and calls clk_get_rate().
    The latter obtains a mutex, hence it must not be called from atomic
    context.

    Remove support for this rarely-used format, as vsprintf() (and e.g.
    printk()) must be callable from any context.

    Any remaining out-of-tree users will start seeing the clock's name
    printed instead of its rate.
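
    A hypothetical caller, for illustration only (not from the patch): after
    the removal, "%pCr" falls back to printing the clock's name, and the rate
    has to be fetched explicitly from sleepable context.

        #include <linux/clk.h>
        #include <linux/printk.h>

        static void report_clk(struct clk *clk)
        {
                pr_info("uart clock: %pC\n", clk);                        /* name */
                pr_info("uart clock rate: %lu Hz\n", clk_get_rate(clk));  /* may sleep */
        }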

    Reported-by: Jia-Ju Bai
    Fixes: 900cca2944254edd ("lib/vsprintf: add %pC{,n,r} format specifiers for clocks")
    Link: http://lkml.kernel.org/r/1527845302-12159-5-git-send-email-geert+renesas@glider.be
    To: Jia-Ju Bai
    To: Jonathan Corbet
    To: Michael Turquette
    To: Stephen Boyd
    To: Zhang Rui
    To: Eduardo Valentin
    To: Eric Anholt
    To: Stefan Wahren
    To: Greg Kroah-Hartman
    Cc: Sergey Senozhatsky
    Cc: Petr Mladek
    Cc: Linus Torvalds
    Cc: Steven Rostedt
    Cc: linux-doc@vger.kernel.org
    Cc: linux-clk@vger.kernel.org
    Cc: linux-pm@vger.kernel.org
    Cc: linux-serial@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-renesas-soc@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: Geert Uytterhoeven
    Cc: stable@vger.kernel.org # 4.1+
    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Petr Mladek
    Signed-off-by: Greg Kroah-Hartman

    Geert Uytterhoeven
     

30 May, 2018

2 commits

  • [ Upstream commit ac68b1b3b9c73e652dc7ce0585672e23c5a2dca4 ]

    As reported by Dan, the parenthesis is in the wrong place, and since
    the unlikely() call returns either 0 or 1 it is never less than zero. The
    second issue is that signed integer overflows like "INT_MAX + 1" are
    undefined behavior.

    Since num_test_devs represents the number of devices, we want to stop
    prior to hitting the max and not rely on the wrap-around at all. So
    just cap at num_test_devs + 1, prior to assigning a new device.
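
    A hedged sketch of the described bound check (surrounding code assumed
    from lib/test_kmod.c):

        #include <linux/kernel.h>

        /* Stop before the device count can reach INT_MAX instead of relying
         * on signed wrap-around. */
        static bool can_register_test_dev(int num_test_devs)
        {
                return num_test_devs + 1 < INT_MAX;
        }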

    Link: http://lkml.kernel.org/r/20180224030046.24238-1-mcgrof@kernel.org
    Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader")
    Reported-by: Dan Carpenter
    Signed-off-by: Luis R. Rodriguez
    Acked-by: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Luis R. Rodriguez
     
  • commit 7a4deea1aa8bddfed4ef1b35fc2b6732563d8ad5 upstream.

    If the radix tree underlying the IDR happens to be full and we attempt
    to remove an id which is larger than any id in the IDR, we will call
    __radix_tree_delete() with an uninitialised 'slot' pointer, at which
    point anything could happen. This was easiest to hit with a single
    entry at id 0 and attempting to remove a non-0 id, but it could have
    happened with 64 entries and attempting to remove an id >= 64.

    Roman said:

    The syzcaller test boils down to opening /dev/kvm, creating an
    eventfd, and calling a couple of KVM ioctls. None of this requires
    superuser. And the result is dereferencing an uninitialized pointer
    which is likely a crash. The specific path caught by syzbot is via
    KVM_HYPERV_EVENTFD ioctl, which is new in 4.17. But I guess there are
    other user-triggerable paths, so cc:stable is probably justified.

    Matthew added:

    We have around 250 calls to idr_remove() in the kernel today. Many of
    them pass an ID which is embedded in the object they're removing, so
    they're safe. Picking a few likely candidates:

    drivers/firewire/core-cdev.c looks unsafe; the ID comes from an ioctl.
    drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c is similar
    drivers/atm/nicstar.c could be taken down by a handcrafted packet
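
    A minimal kernel-side reproducer along the lines described above
    (illustration only):

        #include <linux/gfp.h>
        #include <linux/idr.h>

        static DEFINE_IDR(repro_idr);

        static void idr_remove_repro(void)
        {
                static int dummy;

                /* one entry at id 0 ... */
                if (idr_alloc(&repro_idr, &dummy, 0, 1, GFP_KERNEL) < 0)
                        return;
                /* ... then remove an absent id larger than any allocated one;
                 * before the fix this dereferenced an uninitialised slot. */
                idr_remove(&repro_idr, 10);
        }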

    Link: http://lkml.kernel.org/r/20180518175025.GD6361@bombadil.infradead.org
    Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree")
    Reported-by:
    Debugged-by: Roman Kagan
    Signed-off-by: Matthew Wilcox
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox
     

23 May, 2018

2 commits

  • commit 9f418224e8114156d995b98fa4e0f4fd21f685fe upstream.

    Fix a race in the multi-order iteration code which causes the kernel to
    hit a GP fault. This was first seen with a production v4.15 based
    kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used
    order 9 PMD DAX entries.

    The race has to do with how we tear down multi-order sibling entries
    when we are removing an item from the tree. Remember for example that
    an order 2 entry looks like this:

    struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]

    where 'entry' is in some slot in the struct radix_tree_node, and the
    three slots following 'entry' contain sibling pointers which point back
    to 'entry.'

    When we delete 'entry' from the tree, we call :

    radix_tree_delete()
    radix_tree_delete_item()
    __radix_tree_delete()
    replace_slot()

    replace_slot() first removes the siblings in order from the first to the
    last, and then at the end replaces 'entry' with NULL. This means that for a
    brief period of time we end up with one or more of the siblings removed,
    so:

    struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]

    This causes an issue if you have a reader iterating over the slots in
    the tree via radix_tree_for_each_slot() while only under
    rcu_read_lock()/rcu_read_unlock() protection. This is a common case in
    mm/filemap.c.

    The issue is that when __radix_tree_next_slot() => skip_siblings() tries
    to skip over the sibling entries in the slots, it currently does so with
    an exact match on the slot directly preceding our current slot.
    Normally this works:

    V preceding slot
    struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
    ^ current slot

    This lets you find the first sibling, and you skip them all in order.

    But in the case where one of the siblings is NULL, that slot is skipped
    and then our sibling detection is interrupted:

    V preceding slot
    struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
    ^ current slot

    This means that the sibling pointers aren't recognized since they point
    all the way back to 'entry', so we think that they are normal internal
    radix tree pointers. This causes us to think we need to walk down to a
    struct radix_tree_node starting at the address of 'entry'.

    In a real running kernel this will crash the thread with a GP fault when
    you try and dereference the slots in your broken node starting at
    'entry'.

    We fix this race by fixing the way that skip_siblings() detects sibling
    nodes. Instead of testing against the preceding slot we instead look
    for siblings via is_sibling_entry() which compares against the position
    of the struct radix_tree_node.slots[] array. This ensures that sibling
    entries are properly identified, even if they are no longer contiguous
    with the 'entry' they point to.
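
    A hedged sketch of the detection the fix relies on (helper shape assumed
    from lib/radix-tree.c): a sibling entry points back into the parent node's
    own slots[] array, so test the pointer against that range rather than
    comparing it with whatever happens to sit in the preceding slot.

        #include <linux/radix-tree.h>

        static bool points_into_slots(struct radix_tree_node *parent, void *entry)
        {
                /* strip the internal-entry tag bits before comparing */
                void **p = (void **)((unsigned long)entry & ~3UL);

                return p >= (void **)parent->slots &&
                       p <  (void **)parent->slots + RADIX_TREE_MAP_SIZE;
        }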

    Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
    Fixes: 148deab223b2 ("radix-tree: improve multiorder iterators")
    Signed-off-by: Ross Zwisler
    Reported-by: CR, Sapthagirish
    Reviewed-by: Jan Kara
    Cc: Matthew Wilcox
    Cc: Christoph Hellwig
    Cc: Dan Williams
    Cc: Dave Chinner
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Ross Zwisler
     
  • commit 1e3054b98c5415d5cb5f8824fc33b548ae5644c3 upstream.

    I had neglected to increment the error counter when the tests failed,
    which made the tests noisy when they fail but did not actually make
    them return an error code.
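
    A hedged sketch of the missing step (counter and comparison names assumed
    from lib/test_bitmap.c):

        if (!bitmap_equal(result, expected, nbits)) {
                pr_err("bitmap comparison failed\n");
                failed_tests++;         /* previously missing, so init returned 0 */
        }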

    Link: http://lkml.kernel.org/r/20180509114328.9887-1-mpe@ellerman.id.au
    Fixes: 3cc78125a081 ("lib/test_bitmap.c: add optimisation tests")
    Signed-off-by: Matthew Wilcox
    Signed-off-by: Michael Ellerman
    Reported-by: Michael Ellerman
    Tested-by: Michael Ellerman
    Reviewed-by: Kees Cook
    Cc: Yury Norov
    Cc: Andy Shevchenko
    Cc: Geert Uytterhoeven
    Cc: [4.13+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox
     

09 May, 2018

1 commit

  • commit b4678df184b314a2bd47d2329feca2c2534aa12b upstream.

    The errseq_t infrastructure assumes that errors which occurred before
    the file descriptor was opened are of no interest to the application.
    This turns out to be a regression for some applications, notably Postgres.

    Before errseq_t, a writeback error would be reported exactly once (as
    long as the inode remained in memory), so Postgres could open a file,
    call fsync() and find out whether there had been a writeback error on
    that file from another process.

    This patch changes the errseq infrastructure to report errors to all
    file descriptors which are opened after the error occurred, but before
    it was reported to any file descriptor. This restores the user-visible
    behaviour.
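
    A userspace illustration of the restored behaviour (hypothetical paths,
    not from the patch):

        #include <errno.h>
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        static int check_datafile(const char *path)
        {
                int fd = open(path, O_RDONLY);

                if (fd < 0)
                        return -1;
                /* An unreported writeback error that happened before this open()
                 * is again visible to the first fsync() on the new descriptor. */
                if (fsync(fd) < 0 && errno == EIO)
                        fprintf(stderr, "writeback error on %s\n", path);
                close(fd);
                return 0;
        }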

    Cc: stable@vger.kernel.org
    Fixes: 5660e13d2fd6 ("fs: new infrastructure for writeback error handling and reporting")
    Signed-off-by: Matthew Wilcox
    Reviewed-by: Jeff Layton
    Signed-off-by: Jeff Layton
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox
     

02 May, 2018

1 commit

  • commit 3e14c6abbfb5c94506edda9d8e2c145d79375798 upstream.

    This WARNING proved to be noisy. The function still returns an error
    and callers should handle it. That's how most kernel code works.
    Downgrade the WARNING to pr_err() and leave WARNINGs for kernel bugs.

    Signed-off-by: Dmitry Vyukov
    Reported-by: syzbot+209c0f67f99fec8eb14b@syzkaller.appspotmail.com
    Reported-by: syzbot+7fb6d9525a4528104e05@syzkaller.appspotmail.com
    Reported-by: syzbot+2e63711063e2d8f9ea27@syzkaller.appspotmail.com
    Reported-by: syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Vyukov
     

26 Apr, 2018

1 commit

  • [ Upstream commit 09584b406742413ac4c8d7e030374d4daa045b69 ]

    With CONFIG_BPF_JIT_ALWAYS_ON defined in the config file,
    tools/testing/selftests/bpf/test_kmod.sh fails as shown below:
    [root@localhost bpf]# ./test_kmod.sh
    sysctl: setting key "net.core.bpf_jit_enable": Invalid argument
    [ JIT enabled:0 hardened:0 ]
    [ 132.175681] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
    [ 132.458834] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
    [ JIT enabled:1 hardened:0 ]
    [ 133.456025] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
    [ 133.730935] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
    [ JIT enabled:1 hardened:1 ]
    [ 134.769730] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
    [ 135.050864] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
    [ JIT enabled:1 hardened:2 ]
    [ 136.442882] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
    [ 136.821810] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
    [root@localhost bpf]#

    The test_kmod.sh script loads/removes test_bpf.ko multiple times with
    different settings for the sysctls net.core.bpf_jit_{enable,harden}. The
    failing test #297 of test_bpf.ko is designed such that JIT always fails.

    Commit 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config)
    introduced the following tightening logic:
        ...
        if (!bpf_prog_is_dev_bound(fp->aux)) {
                fp = bpf_int_jit_compile(fp);
        #ifdef CONFIG_BPF_JIT_ALWAYS_ON
                if (!fp->jited) {
                        *err = -ENOTSUPP;
                        return fp;
                }
        #endif
        ...
    With this logic, Test #297 always gets return value -ENOTSUPP
    when CONFIG_BPF_JIT_ALWAYS_ON is defined, causing the test failure.

    This patch fixes the failure by marking Test #297 as an expected failure
    when CONFIG_BPF_JIT_ALWAYS_ON is defined.
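
    A hedged sketch of the harness-side handling (struct and field names
    assumed from lib/test_bpf.c):

        /* A creation failure with exactly the recorded error code, e.g.
         * .expected_errcode = -ENOTSUPP for test #297, counts as a pass. */
        static bool creation_failure_expected(int create_err, int expected_errcode)
        {
                return create_err && create_err == expected_errcode;
        }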

    Fixes: 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config)
    Signed-off-by: Yonghong Song
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Yonghong Song
     

19 Apr, 2018

1 commit

  • commit 8351760ff5b2042039554b4948ddabaac644a976 upstream.

    syzbot is catching stalls at __bitmap_parselist()
    (https://syzkaller.appspot.com/bug?id=ad7e0351fbc90535558514a71cd3edc11681997a).
    The trigger is

    unsigned long v = 0;
    bitmap_parselist("7:,", &v, BITS_PER_LONG);

    which results in hitting an infinite loop at the 'while (a <= b)'
    loop in __bitmap_parselist().
    Reported-by: Tetsuo Handa
    Reported-by: syzbot
    Cc: Noam Camus
    Cc: Rasmus Villemoes
    Cc: Matthew Wilcox
    Cc: Mauro Carvalho Chehab
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Yury Norov
     

01 Apr, 2018

1 commit

  • [ Upstream commit d3dcf8eb615537526bd42ff27a081d46d337816e ]

    When inserting duplicate objects (those with the same key), the
    current rhlist implementation messes up the chain pointers by
    updating the bucket pointer instead of the previous element's next
    pointer to the newly inserted node. This causes missing elements on
    removal and traversal.

    Fix that by properly updating the pprev pointer to point to
    the correct rhash_head next pointer.
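
    A generic illustration of the bug class, not the rhashtable code itself:
    when scanning a chain with a pprev cursor, pprev must be advanced past
    non-matching nodes, otherwise a later insert writes through the head
    pointer and detaches the nodes that were skipped.

        struct node {
                struct node *next;
                int key;
        };

        static void insert_before_match(struct node **head, struct node *new, int key)
        {
                struct node **pprev = head;
                struct node *cur;

                for (cur = *head; cur; cur = cur->next) {
                        if (cur->key != key) {
                                pprev = &cur->next;     /* the step the fix adds */
                                continue;
                        }
                        new->next = cur;
                        *pprev = new;   /* without the step above, this clobbers *head */
                        return;
                }
                new->next = NULL;
                *pprev = new;
        }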

    Issue: 1241076
    Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7
    Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
    Signed-off-by: Paul Blakey
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paul Blakey
     

29 Mar, 2018

1 commit

  • commit b6bdb7517c3d3f41f20e5c2948d6bc3f8897394e upstream.

    On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
    create pud/pmd mappings. A kernel panic was observed on arm64 systems
    with Cortex-A75 in the following steps as described by Hanjun Guo.

    1. ioremap a 4K size, valid page table will build,
    2. iounmap it, pte0 will set to 0;
    3. ioremap the same address with 2M size, pgd/pmd is unchanged,
    then set the a new value for pmd;
    4. pte0 is leaked;
    5. CPU may meet exception because the old pmd is still in TLB,
    which will lead to kernel panic.

    This panic is not reproducible on x86: INVLPG, called from iounmap,
    purges all levels of entries associated with the purged address on x86.
    x86 still has a memory leak, however.

    The patch changes the ioremap path to free unmapped page table(s) since
    doing so in the unmap path has the following issues:

    - The iounmap() path is shared with vunmap(). Since vmap() only
    supports pte mappings, making vunmap() to free a pte page is an
    overhead for regular vmap users as they do not need a pte page freed
    up.

    - Checking if all entries in a pte page are cleared in the unmap path
    is racy, and serializing this check is expensive.

    - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
    Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB
    purge.

    Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
    clear a given pud/pmd entry and free up a page for the lower level
    entries.

    This patch implements their stub functions on x86 and arm64, which work
    as a workaround.
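
    A hedged sketch of the consumer side (condition list assumed from
    lib/ioremap.c): a huge pmd mapping is only attempted once any stale pte
    page under that pmd has been freed.

        if (ioremap_pmd_enabled() &&
            ((next - addr) == PMD_SIZE) &&
            IS_ALIGNED(phys_addr + addr, PMD_SIZE) &&
            pmd_free_pte_page(pmd)) {
                if (pmd_set_huge(pmd, phys_addr + addr, prot))
                        continue;
        }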

    [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub]
    Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
    Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings")
    Reported-by: Lei Li
    Signed-off-by: Toshi Kani
    Cc: Catalin Marinas
    Cc: Wang Xuefeng
    Cc: Will Deacon
    Cc: Hanjun Guo
    Cc: Michal Hocko
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Borislav Petkov
    Cc: Matthew Wilcox
    Cc: Chintan Pandya
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Toshi Kani
     

19 Mar, 2018

1 commit

  • [ Upstream commit a0e94598e6b6c0d1df6a5fa14eb7c767ca817a20 ]

    Destination is a kernel pointer and source - a userland one
    in _copy_from_user(); _copy_to_user() is the other way round.

    Fixes: d597580d37377 ("generic ...copy_..._user primitives")
    Signed-off-by: Christophe Leroy
    Signed-off-by: Al Viro
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     

15 Mar, 2018

1 commit

  • commit 1b4cfe3c0a30dde968fb43c577a8d7e262a145ee upstream.

    Commit b8347c219649 ("x86/debug: Handle warnings before the notifier
    chain, to fix KGDB crash") changed the ordering of fixups, and did not
    take into account the case of x86 processing non-WARN() and non-BUG()
    exceptions. This would lead to output of a false BUG line with no other
    information.

    In the case of a refcount exception, it would be immediately followed by
    the refcount WARN(), producing very strange double-"cut here":

    lkdtm: attempting bad refcount_inc() overflow
    ------------[ cut here ]------------
    Kernel BUG at 0000000065f29de5 [verbose debug info unavailable]
    ------------[ cut here ]------------
    refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0
    WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4
    ...

    In the prior ordering, exceptions were searched first:

        do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
        ...
                if (fixup_exception(regs, trapnr))
                        return 0;

        -       if (fixup_bug(regs, trapnr))
        -               return 0;
        -

    As a result, fixup_bug()'s is_valid_bugaddr() didn't take into account
    needing to search the exception list first, since that had already
    happened.

    So, instead of searching the exception list twice (once in
    is_valid_bugaddr() and then again in fixup_exception()), just add a
    simple sanity check to report_bug() that will immediately bail out if a
    BUG() (or WARN()) entry is not found.
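
    A hedged sketch of that sanity check (placement in lib/bug.c's report_bug()
    assumed): with no matching BUG/WARN table entry, report nothing and let the
    caller fall through to the usual exception handling.

        bug = find_bug(bugaddr);
        if (!bug)
                return BUG_TRAP_TYPE_NONE;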

    Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast
    Fixes: b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash")
    Signed-off-by: Kees Cook
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Richard Weinberger
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Kees Cook
     

03 Mar, 2018

1 commit

  • [ Upstream commit bbc25bee37d2b32cf3a1fab9195b6da3a185614a ]

    Current MIPS64r6 toolchains aren't able to generate efficient
    DMULU/DMUHU based code for the C implementation of umul_ppmm(), which
    performs an unsigned 64 x 64 bit multiply and returns the upper and
    lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit
    inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128
    x 128 multiply. This is both inefficient, and it results in a link error
    since we don't include __multi3 in MIPS linux.

    For example commit 90a53e4432b1 ("cfg80211: implement regdb signature
    checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and
    64r6el_defconfig builds by indirectly selecting MPILIB. The same build
    errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA:

    lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1':
    lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3'
    lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1':
    lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3'
    lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1':
    lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3'
    lib/mpi/mpih-div.o: In function `mpihelp_divrem':
    lib/mpi/mpih-div.c:205: undefined reference to `__multi3'
    lib/mpi/mpih-div.c:142: undefined reference to `__multi3'

    Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using
    inline assembly and the DMULU/DMUHU instructions, to prevent __multi3
    calls being emitted.
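
    A hedged sketch of the kind of definition added (operand constraints and
    the UDItype typedef assumed from the existing MPI longlong header):

        #define umul_ppmm(w1, w0, u, v)                                 \
        do {                                                            \
                __asm__ ("dmulu %0, %1, %2"                             \
                         : "=d" ((UDItype)(w0))                         \
                         : "d" ((UDItype)(u)), "d" ((UDItype)(v)));     \
                __asm__ ("dmuhu %0, %1, %2"                             \
                         : "=d" ((UDItype)(w1))                         \
                         : "d" ((UDItype)(u)), "d" ((UDItype)(v)));     \
        } while (0)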

    Fixes: 7fd08ca58ae6 ("MIPS: Add build support for the MIPS R6 ISA")
    Signed-off-by: James Hogan
    Cc: Ralf Baechle
    Cc: Herbert Xu
    Cc: "David S. Miller"
    Cc: linux-mips@linux-mips.org
    Cc: linux-crypto@vger.kernel.org
    Signed-off-by: Herbert Xu
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    James Hogan
     

25 Feb, 2018

1 commit

  • [ Upstream commit 8dfd2f22d3bf3ab7714f7495ad5d897b8845e8c1 ]

    Callers of sprint_oid() do not check its return value before printing
    the result. In the case where the OID is zero-length, -EBADMSG was
    being returned without anything being written to the buffer, resulting
    in uninitialized stack memory being printed. Fix this by writing
    "(bad)" to the buffer in the cases where -EBADMSG is returned.

    Fixes: 4f73175d0375 ("X.509: Add utility functions to render OIDs as strings")
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     

22 Feb, 2018

2 commits

  • commit 4675ff05de2d76d167336b368bd07f3fef6ed5a6 upstream.

    Fix up makefiles, remove references, and git rm kmemcheck.

    Link: http://lkml.kernel.org/r/20171007030159.22241-4-alexander.levin@verizon.com
    Signed-off-by: Sasha Levin
    Cc: Steven Rostedt
    Cc: Vegard Nossum
    Cc: Pekka Enberg
    Cc: Michal Hocko
    Cc: Eric W. Biederman
    Cc: Alexander Potapenko
    Cc: Tim Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Levin, Alexander (Sasha Levin)
     
  • commit d0bc0c2a31c95002d37c3cc511ffdcab851b3256 upstream.

    TTM tries to allocate coherent memory in chunks of 2MB first to improve
    TLB efficiency and falls back to allocating 4K pages if that fails.

    Suppress the warning when the 2MB allocation fails since there is a
    valid fallback path.

    Signed-off-by: Christian König
    Reported-by: Mike Galbraith
    Acked-by: Konrad Rzeszutek Wilk
    Bug: https://bugs.freedesktop.org/show_bug.cgi?id=104082
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Greg Kroah-Hartman

    Christian König
     

17 Feb, 2018

3 commits

  • commit 42440c1f9911b4b7b8ba3dc4e90c1197bc561211 upstream.

    UBSAN=y fails to build with new GCC/clang:

    arch/x86/kernel/head64.o: In function `sanitize_boot_params':
    arch/x86/include/asm/bootparam_utils.h:37: undefined reference to `__ubsan_handle_type_mismatch_v1'

    because Clang and GCC 8 slightly changed the ABI for 'type mismatch' errors.
    The compiler now uses a new __ubsan_handle_type_mismatch_v1() function with a
    slightly modified 'struct type_mismatch_data'.

    Let's add a new 'struct type_mismatch_data_common' which is independent of the
    compiler's layout of 'struct type_mismatch_data', and make the
    __ubsan_handle_type_mismatch[_v1]() functions transform the compiler-dependent
    type mismatch data to our internal representation. This way, we can
    support both old and new compilers with a minimal amount of change.
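
    A hedged sketch of the compiler-independent representation (field names
    assumed; the location and type descriptor types come from lib/ubsan.h):

        struct type_mismatch_data_common {
                struct source_location *location;
                struct type_descriptor *type;
                unsigned long alignment;
                unsigned char type_check_kind;
        };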

    Link: http://lkml.kernel.org/r/20180119152853.16806-1-aryabinin@virtuozzo.com
    Signed-off-by: Andrey Ryabinin
    Reported-by: Sodagudi Prasad
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Andrey Ryabinin
     
  • commit b8fe1120b4ba342b4f156d24e952d6e686b20298 upstream.

    A visit from the spelling fairy.

    Cc: David Laight
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Andrew Morton
     
  • commit e7c52b84fb18f08ce49b6067ae6285aca79084a8 upstream.

    We get a lot of very large stack frames using gcc-7.0.1 with the default
    -fsanitize-address-use-after-scope --param asan-stack=1 options, which can
    easily cause an overflow of the kernel stack, e.g.

    drivers/gpu/drm/i915/gvt/handlers.c:2434:1: warning: the frame size of 46176 bytes is larger than 3072 bytes
    drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: warning: the frame size of 23632 bytes is larger than 3072 bytes
    lib/atomic64_test.c:250:1: warning: the frame size of 11200 bytes is larger than 3072 bytes
    drivers/gpu/drm/i915/gvt/handlers.c:2621:1: warning: the frame size of 9208 bytes is larger than 3072 bytes
    drivers/media/dvb-frontends/stv090x.c:3431:1: warning: the frame size of 6816 bytes is larger than 3072 bytes
    fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than 3072 bytes

    To reduce this risk, -fsanitize-address-use-after-scope is now split out
    into a separate CONFIG_KASAN_EXTRA Kconfig option, leading to stack
    frames that are smaller than 2 kilobytes most of the time on x86_64. An
    earlier version of this patch also prevented combining KASAN_EXTRA with
    KASAN_INLINE, but that is no longer necessary with gcc-7.0.1.

    All patches to get the frame size below 2048 bytes with CONFIG_KASAN=y
    and CONFIG_KASAN_EXTRA=n have been merged by maintainers now, so we can
    bring back that default now. KASAN_EXTRA=y still causes lots of
    warnings but now defaults to !COMPILE_TEST to disable it in
    allmodconfig, and it remains disabled in all other defconfigs since it
    is a new option. I arbitrarily raise the warning limit for KASAN_EXTRA
    to 3072 to reduce the noise, but an allmodconfig kernel still has around
    50 warnings on gcc-7.

    I experimented a bit more with smaller stack frames and have another
    follow-up series that reduces the warning limit for 64-bit architectures
    to 1280 bytes (without CONFIG_KASAN).

    With earlier versions of this patch series, I also had patches to address
    the warnings we get with KASAN and/or KASAN_EXTRA, using a
    "noinline_if_stackbloat" annotation.

    That annotation now got replaced with a gcc-8 bugfix (see
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715) and a workaround for
    older compilers, which means that KASAN_EXTRA is now just as bad as
    before and will lead to an instant stack overflow in a few extreme
    cases.

    This reverts parts of commit 3f181b4d8652 ("lib/Kconfig.debug: disable
    -Wframe-larger-than warnings with KASAN=y"). Two patches in linux-next
    should be merged first to avoid introducing warnings in an allmodconfig
    build:
    3cd890dbe2a4 ("media: dvb-frontends: fix i2c access helpers for KASAN")
    16c3ada89cff ("media: r820t: fix r820t_write_reg for KASAN")

    Do we really need to backport this?

    I think we do: without this patch, enabling KASAN will lead to
    unavoidable kernel stack overflow in certain device drivers when built
    with gcc-7 or higher on linux-4.10+ or any version that contains a
    backport of commit c5caf21ab0cf8. Most people are probably still on
    older compilers, but it will get worse over time as they upgrade their
    distros.

    The warnings we get on kernels older than this should all be for code
    that uses dangerously large stack frames, though most of them do not
    cause an actual stack overflow by themselves. The asan-stack option was
    added in linux-4.0, and commit 3f181b4d8652 ("lib/Kconfig.debug:
    disable -Wframe-larger-than warnings with KASAN=y") effectively turned
    off the warning for allmodconfig kernels, so I would like to see this
    fix backported to any kernels later than 4.0.

    I have done dozens of fixes for individual functions with stack frames
    larger than 2048 bytes with asan-stack, and I plan to make sure that
    all those fixes make it into the stable kernels as well (most are
    already there).

    Part of the complication here is that asan-stack (from 4.0) was
    originally assumed to always require much larger stacks, but that
    turned out to be a combination of multiple gcc bugs that we have now
    worked around and fixed, but sanitize-address-use-after-scope (from
    v4.10) has a much higher inherent stack usage and also suffers from at
    least three other problems that we have analyzed but not yet fixed
    upstream, each of them makes the stack usage more severe than it should
    be.

    Link: http://lkml.kernel.org/r/20171221134744.2295529-1-arnd@arndb.de
    Signed-off-by: Arnd Bergmann
    Acked-by: Andrey Ryabinin
    Cc: Mauro Carvalho Chehab
    Cc: Andrey Ryabinin
    Cc: Alexander Potapenko
    Cc: Dmitry Vyukov
    Cc: Andrey Konovalov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     

04 Feb, 2018

1 commit


31 Jan, 2018

1 commit

  • [ upstream commit 290af86629b25ffd1ed6232c4e9107da031705cb ]

    The BPF interpreter has been used as part of the Spectre 2 attack CVE-2017-5715.

    A quote from the Google Project Zero blog:
    "At this point, it would normally be necessary to locate gadgets in
    the host kernel code that can be used to actually leak data by reading
    from an attacker-controlled location, shifting and masking the result
    appropriately and then using the result of that as offset to an
    attacker-controlled address for a load. But piecing gadgets together
    and figuring out which ones work in a speculation context seems annoying.
    So instead, we decided to use the eBPF interpreter, which is built into
    the host kernel - while there is no legitimate way to invoke it from inside
    a VM, the presence of the code in the host kernel's text section is sufficient
    to make it usable for the attack, just like with ordinary ROP gadgets."

    To make the attacker's job harder, introduce a BPF_JIT_ALWAYS_ON config
    option that removes the interpreter from the kernel in favor of JIT-only mode.
    So far eBPF JIT is supported by:
    x64, arm64, arm32, sparc64, s390, powerpc64, mips64

    The start of the JITed program is randomized and the code page is marked as
    read-only. In addition, "constant blinding" can be turned on with
    net.core.bpf_jit_harden.

    v2->v3:
    - move __bpf_prog_ret0 under ifdef (Daniel)

    v1->v2:
    - fix init order, test_bpf and cBPF (Daniel's feedback)
    - fix offloaded bpf (Jakub's feedback)
    - add 'return 0' dummy in case something can invoke prog->bpf_func
    - retarget bpf tree. For bpf-next the patch would need one extra hunk.
    It will be sent when the trees are merged back to net-next

    Considered doing:
    int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
    but it seems better to land the patch as-is and in bpf-next remove
    bpf_jit_enable global variable from all JITs, consolidate in one place
    and remove this jit_init() function.
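
    A hedged sketch of the "return 0" dummy mentioned above (naming assumed;
    signature as for prog->bpf_func):

        #include <linux/filter.h>

        #ifdef CONFIG_BPF_JIT_ALWAYS_ON
        /* Installed when the JIT fails, so a program can never fall back into
         * the removed interpreter. */
        static unsigned int __bpf_prog_ret0(const void *ctx,
                                            const struct bpf_insn *insn)
        {
                return 0;
        }
        #endif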

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

    Alexei Starovoitov
     

25 Dec, 2017

1 commit

  • commit 11af847446ed0d131cf24d16a7ef3d5ea7a49554 upstream.

    Rename the unwinder config options from:

    CONFIG_ORC_UNWINDER
    CONFIG_FRAME_POINTER_UNWINDER
    CONFIG_GUESS_UNWINDER

    to:

    CONFIG_UNWINDER_ORC
    CONFIG_UNWINDER_FRAME_POINTER
    CONFIG_UNWINDER_GUESS

    ... in order to give them a more logical config namespace.

    Suggested-by: Ingo Molnar
    Signed-off-by: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Josh Poimboeuf
     

14 Dec, 2017

4 commits

  • [ Upstream commit 1f3c790bd5989fcfec9e53ad8fa09f5b740c958f ]

    line-range is supposed to treat "1-" as "1-endoffile", so
    handle the special case by setting last_lineno to UINT_MAX.

    Fixes this error:

    dynamic_debug:ddebug_parse_query: last-line:0 < 1st-line:1
    dynamic_debug:ddebug_exec_query: query parse failed
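
    A hedged sketch of the special case (variable names assumed from
    ddebug_parse_query() in lib/dynamic_debug.c):

        if (*last == '\0')
                query->last_lineno = UINT_MAX;  /* "N-" means N to end of file */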

    Link: http://lkml.kernel.org/r/10a6a101-e2be-209f-1f41-54637824788e@infradead.org
    Signed-off-by: Randy Dunlap
    Acked-by: Jason Baron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Randy Dunlap
     
  • [ Upstream commit 36a3d1dd4e16bcd0d2ddfb4a2ec7092f0ae0d931 ]

    If the amount of resources allocated to a gen_pool exceeds 2^32 then the
    avail atomic overflows and this causes problems when clients try and
    borrow resources from the pool. This is only expected to be an issue on
    64 bit systems.

    Add the header that provides the atomic_long* operations, so that 32-bit
    systems continue to use 32-bit atomics while 64-bit systems can use
    atomic64_t.
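
    A small self-contained illustration of the type change, not the genalloc
    hunk itself: atomic_long_t is 64 bits wide on 64-bit systems, so totals
    above 4GB of tracked resources stay correct.

        #include <linux/atomic.h>

        static atomic_long_t avail = ATOMIC_LONG_INIT(0);

        static void chunk_added(unsigned long nbytes)
        {
                atomic_long_add(nbytes, &avail);        /* a 32-bit atomic_t would overflow */
        }

        static unsigned long pool_avail(void)
        {
                return atomic_long_read(&avail);
        }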

    Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.com
    Signed-off-by: Stephen Bates
    Reviewed-by: Logan Gunthorpe
    Reviewed-by: Mathieu Desnoyers
    Reviewed-by: Daniel Mentz
    Cc: Jonathan Corbet
    Cc: Andrew Morton
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Stephen Bates
     
  • commit 81a7be2cd69b412ab6aeacfe5ebf1bb6e5bce955 upstream.

    asn1_ber_decoder() was ignoring errors from actions associated with the
    opcodes ASN1_OP_END_SEQ_ACT, ASN1_OP_END_SET_ACT,
    ASN1_OP_END_SEQ_OF_ACT, and ASN1_OP_END_SET_OF_ACT. In practice, this
    meant the pkcs7_note_signed_info() action (since that was the only user
    of those opcodes). Fix it by checking for the error, just like the
    decoder does for actions associated with the other opcodes.

    This bug allowed users to leak slab memory by repeatedly trying to add a
    specially crafted "pkcs7_test" key (requires CONFIG_PKCS7_TEST_KEY).

    In theory, this bug could also be used to bypass module signature
    verification, by providing a PKCS#7 message that is misparsed such that
    a signature's ->authattrs do not contain its ->msgdigest. But it
    doesn't seem practical in normal cases, due to restrictions on the
    format of the ->authattrs.
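
    A hedged sketch of the fix (decoder-internal names and the action's
    argument list assumed from lib/asn1_decoder.c):

        ret = actions[machine[pc + 1]](context, hdr, 0, data + tdp, len);
        if (ret < 0)
                return ret;     /* previously the return value was discarded */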

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Reviewed-by: James Morris
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit e0058f3a874ebb48b25be7ff79bc3b4e59929f90 upstream.

    In asn1_ber_decoder(), indefinitely-sized ASN.1 items were being passed
    to the action functions before their lengths had been computed, using
    the bogus length of 0x80 (ASN1_INDEFINITE_LENGTH). This resulted in
    reading data past the end of the input buffer, when given a specially
    crafted message.

    Fix it by rearranging the code so that the indefinite length is resolved
    before the action is called.

    This bug was originally found by fuzzing the X.509 parser in userspace
    using libFuzzer from the LLVM project.

    KASAN report (cleaned up slightly):

    BUG: KASAN: slab-out-of-bounds in memcpy ./include/linux/string.h:341 [inline]
    BUG: KASAN: slab-out-of-bounds in x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
    Read of size 128 at addr ffff880035dd9eaf by task keyctl/195

    CPU: 1 PID: 195 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bbc6e9 #26
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0xd1/0x175 lib/dump_stack.c:53
    print_address_description+0x78/0x260 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x23f/0x350 mm/kasan/report.c:409
    memcpy+0x1f/0x50 mm/kasan/kasan.c:302
    memcpy ./include/linux/string.h:341 [inline]
    x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
    asn1_ber_decoder+0xb4a/0x1fd0 lib/asn1_decoder.c:447
    x509_cert_parse+0x1c7/0x620 crypto/asymmetric_keys/x509_cert_parser.c:89
    x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
    asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
    key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
    SYSC_add_key security/keys/keyctl.c:122 [inline]
    SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0x96

    Allocated by task 195:
    __do_kmalloc_node mm/slab.c:3675 [inline]
    __kmalloc_node+0x47/0x60 mm/slab.c:3682
    kvmalloc ./include/linux/mm.h:540 [inline]
    SYSC_add_key security/keys/keyctl.c:104 [inline]
    SyS_add_key+0x19e/0x290 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0x96

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Reported-by: Alexander Potapenko
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     

30 Nov, 2017

1 commit

  • commit 1d9ddde12e3c9bab7f3d3484eb9446315e3571ca upstream.

    On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the
    largest permitted inputs (16384 bits), the kernel spends 10+ seconds
    doing modular exponentiation in mpi_powm() without rescheduling. If all
    threads do it, it locks up the system. Moreover, it can cause
    rcu_sched-stall warnings.

    Notwithstanding the insanity of doing this calculation in kernel mode
    rather than in userspace, fix it by calling cond_resched() as each bit
    from the exponent is processed. It's still noninterruptible, but at
    least it's preemptible now.

    Do the cond_resched() once per bit rather than once per MPI limb because
    each limb might still easily take 100+ milliseconds on slow CPUs.
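
    A toy square-and-multiply loop, using machine-word operands rather than MPI
    limbs, to illustrate where the per-bit cond_resched() sits:

        #include <linux/sched.h>

        static unsigned long modexp_sketch(unsigned long base, unsigned long exp,
                                           unsigned long mod)
        {
                unsigned long result = 1;

                base %= mod;
                while (exp) {
                        if (exp & 1)
                                result = (result * base) % mod; /* toy: may overflow for big mod */
                        base = (base * base) % mod;
                        exp >>= 1;
                        cond_resched();         /* once per exponent bit, as in the fix */
                }
                return result;
        }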

    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     

08 Nov, 2017

1 commit

  • syzkaller reported a NULL pointer dereference in asn1_ber_decoder(). It
    can be reproduced by the following command, assuming
    CONFIG_PKCS7_TEST_KEY=y:

    keyctl add pkcs7_test desc '' @s

    The bug is that if the data buffer is empty, an integer underflow occurs
    in the following check:

        if (unlikely(dp >= datalen - 1))
                goto data_overrun_error;

    This results in the NULL data pointer being dereferenced.

    Fix it by checking for 'datalen - dp < 2' instead.

    Also fix the similar check for 'dp >= datalen - n' later in the same
    function. That one possibly could result in a buffer overread.
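
    The corrected form of the check quoted above, per the description (the old
    form computed datalen - 1, which wraps around when datalen is 0):

        if (unlikely(datalen - dp < 2))
                goto data_overrun_error;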

    The NULL pointer dereference was reproducible using the "pkcs7_test" key
    type but not the "asymmetric" key type because the "asymmetric" key type
    checks for a 0-length payload before calling into the ASN.1 decoder but
    the "pkcs7_test" key type does not.

    The bug report was:

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    PGD 7b708067 P4D 7b708067 PUD 7b6ee067 PMD 0
    Oops: 0000 [#1] SMP
    Modules linked in:
    CPU: 0 PID: 522 Comm: syz-executor1 Not tainted 4.14.0-rc8 #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
    task: ffff9b6b3798c040 task.stack: ffff9b6b37970000
    RIP: 0010:asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    RSP: 0018:ffff9b6b37973c78 EFLAGS: 00010216
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000021c
    RDX: ffffffff814a04ed RSI: ffffb1524066e000 RDI: ffffffff910759e0
    RBP: ffff9b6b37973d60 R08: 0000000000000001 R09: ffff9b6b3caa4180
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    FS: 00007f10ed1f2700(0000) GS:ffff9b6b3ea00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000007b6f3000 CR4: 00000000000006f0
    Call Trace:
    pkcs7_parse_message+0xee/0x240 crypto/asymmetric_keys/pkcs7_parser.c:139
    verify_pkcs7_signature+0x33/0x180 certs/system_keyring.c:216
    pkcs7_preparse+0x41/0x70 crypto/asymmetric_keys/pkcs7_key_type.c:63
    key_create_or_update+0x180/0x530 security/keys/key.c:855
    SYSC_add_key security/keys/keyctl.c:122 [inline]
    SyS_add_key+0xbf/0x250 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x4585c9
    RSP: 002b:00007f10ed1f1bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000f8
    RAX: ffffffffffffffda RBX: 00007f10ed1f2700 RCX: 00000000004585c9
    RDX: 0000000020000000 RSI: 0000000020008ffb RDI: 0000000020008000
    RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff1b2260ae
    R13: 00007fff1b2260af R14: 00007f10ed1f2700 R15: 0000000000000000
    Code: dd ca ff 48 8b 45 88 48 83 e8 01 4c 39 f0 0f 86 a8 07 00 00 e8 53 dd ca ff 49 8d 46 01 48 89 85 58 ff ff ff 48 8b 85 60 ff ff ff 0f b6 0c 30 89 c8 88 8d 75 ff ff ff 83 e0 1f 89 8d 28 ff ff
    RIP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 RSP: ffff9b6b37973c78
    CR2: 0000000000000000

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Reported-by: syzbot
    Cc: # v3.7+
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Signed-off-by: James Morris

    Eric Biggers
     

03 Nov, 2017

1 commit

  • Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

    Pull initial SPDX identifiers from Greg KH:
    "License cleanup: add SPDX license identifiers to some files

    Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the
    'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
    binding shorthand, which can be used instead of the full boiler plate
    text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart
    and Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset
    of the use cases:

    - file had no licensing information in it.

    - file was a */uapi/* one with no licensing information in it,

    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to
    license had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied
    to a file was done in a spreadsheet of side by side results from of
    the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files created by Philippe Ombredanne.
    Philippe prepared the base worksheet, and did an initial spot review
    of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537
    files assessed. Kate Stewart did a file by file comparison of the
    scanner results in the spreadsheet to determine which SPDX license
    identifier(s) to be applied to the file. She confirmed any
    determination that was not immediately clear with lawyers working with
    the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:

    - Files considered eligible had to be source code files.

    - Make and config files were included as candidates if they contained
    >5 lines of source

    - File already had some variant of a license header in it (even if <5
    lines).

    All documentation files were explicitly excluded.

    The following heuristics were used to determine which SPDX license
    identifiers to apply.

    - when both scanners couldn't find any license traces, file was
    considered to have no license information in it, and the top level
    COPYING file license applied.

    For non */uapi/* files that summary was:

    SPDX license identifier # files
    ---------------------------------------------------|-------
    GPL-2.0 11139

    and resulted in the first patch in this series.

    If that file was a */uapi/* path one, it was "GPL-2.0 WITH
    Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
    was:

    SPDX license identifier # files
    ---------------------------------------------------|-------
    GPL-2.0 WITH Linux-syscall-note 930

    and resulted in the second patch in this series.

    - if a file had some form of licensing information in it, and was one
    of the */uapi/* ones, it was denoted with the Linux-syscall-note if
    any GPL family license was found in the file or had no licensing in
    it (per prior point). Results summary:

    SPDX license identifier # files
    ---------------------------------------------------|------
    GPL-2.0 WITH Linux-syscall-note 270
    GPL-2.0+ WITH Linux-syscall-note 169
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
    LGPL-2.1+ WITH Linux-syscall-note 15
    GPL-1.0+ WITH Linux-syscall-note 14
    ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
    LGPL-2.0+ WITH Linux-syscall-note 4
    LGPL-2.1 WITH Linux-syscall-note 3
    ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
    ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

    and that resulted in the third patch in this series.

    - when the two scanners agreed on the detected license(s), that
    became the concluded license(s).

    - when there was disagreement between the two scanners (one detected
    a license but the other didn't, or they both detected different
    licenses) a manual inspection of the file occurred.

    - In most cases a manual inspection of the information in the file
    resulted in a clear resolution of the license that should apply
    (and which scanner probably needed to revisit its heuristics).

    - When it was not immediately clear, the license identifier was
    confirmed with lawyers working with the Linux Foundation.

    - If there was any question as to the appropriate license identifier,
    the file was flagged for further research and to be revisited later
    in time.

    In total, over 70 hours of logged manual review was done on the
    spreadsheet to determine the SPDX license identifiers to apply to the
    source files by Kate, Philippe, Thomas and, in some cases,
    confirmation by lawyers working with the Linux Foundation.

    Kate also obtained a third independent scan of the 4.13 code base from
    FOSSology, and compared selected files where the other two scanners
    disagreed against that SPDX file, to see if there was new insights.
    The Windriver scanner is based on an older version of FOSSology in
    part, so they are related.

    Thomas did random spot checks in about 500 files from the spreadsheets
    for the uapi headers and agreed with SPDX license identifier in the
    files he inspected. For the non-uapi files Thomas did random spot
    checks in about 15000 files.

    In initial set of patches against 4.14-rc6, 3 files were found to have
    copy/paste license identifier errors, and have been fixed to reflect
    the correct identifier.

    Additionally Philippe spent 10 hours this week doing a detailed manual
    inspection and review of the 12,461 patched files from the initial
    patch version early this week with:

    - a full scancode scan run, collecting the matched texts, detected
    license ids and scores

    - reviewing anything where there was a license detected (about 500+
    files) to ensure that the applied SPDX license was correct

    - reviewing anything where there was no detection but the patch
    license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
    applied SPDX license was correct

    This produced a worksheet with 20 files needing minor correction. This
    worksheet was then exported into 3 different .csv files for the
    different types of files to be modified.

    These .csv files were then reviewed by Greg. Thomas wrote a script to
    parse the csv files and add the proper SPDX tag to the file, in the
    format that the file expected. This script was further refined by Greg
    based on the output to detect more types of files automatically and to
    distinguish between header and source .c files (which need different
    comment types.) Finally Greg ran the script using the .csv files to
    generate the patches.

    Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
    Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
    Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

    * tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    License cleanup: add SPDX license identifier to uapi header files with a license
    License cleanup: add SPDX license identifier to uapi header files with no license
    License cleanup: add SPDX GPL-2.0 license identifier to files with no license

    Linus Torvalds
     

02 Nov, 2017

2 commits

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be
    applied to a file was done in a spreadsheet of side-by-side results
    from the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files, created by Philippe Ombredanne.
    Philippe prepared the base worksheet and did an initial spot review of
    a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source.
    - File already had some variant of a license header in it (even if <5
    lines).
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • syzkaller with KASAN reported an out-of-bounds read in
    asn1_ber_decoder(). It can be reproduced by the following command,
    assuming CONFIG_X509_CERTIFICATE_PARSER=y and CONFIG_KASAN=y:

    keyctl add asymmetric desc $'\x30\x30' @s

    The bug is that the length of an ASN.1 data value isn't validated in the
    case where it is encoded using the short form, causing the decoder to
    read past the end of the input buffer. Fix it by validating the length.
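
    A hedged sketch of the kind of check involved (a simplified standalone
    function, not the actual lib/asn1_decoder.c code; the name and
    signature are made up for illustration): a short-form length octet
    must still be compared against the bytes actually remaining in the
    buffer before the content is consumed:

        #include <errno.h>
        #include <stddef.h>

        /* dp is the current offset into data[0..datalen). */
        static int decode_short_form_len(const unsigned char *data,
                                         size_t datalen, size_t dp,
                                         size_t *len)
        {
                unsigned char lb;

                if (dp >= datalen)
                        return -EBADMSG;   /* no length octet at all */
                lb = data[dp++];
                if (lb & 0x80)
                        return -EBADMSG;   /* long form, handled elsewhere */
                *len = lb;
                if (*len > datalen - dp)
                        return -EBADMSG;   /* value would run past the end */
                return 0;
        }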

    The bug report was:

    BUG: KASAN: slab-out-of-bounds in asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233
    Read of size 1 at addr ffff88003cccfa02 by task syz-executor0/6818

    CPU: 1 PID: 6818 Comm: syz-executor0 Not tainted 4.14.0-rc7-00008-g5f479447d983 #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0xb3/0x10b lib/dump_stack.c:52
    print_address_description+0x79/0x2a0 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x236/0x340 mm/kasan/report.c:409
    __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427
    asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233
    x509_cert_parse+0x1db/0x650 crypto/asymmetric_keys/x509_cert_parser.c:89
    x509_key_preparse+0x64/0x7a0 crypto/asymmetric_keys/x509_public_key.c:174
    asymmetric_key_preparse+0xcb/0x1a0 crypto/asymmetric_keys/asymmetric_type.c:388
    key_create_or_update+0x347/0xb20 security/keys/key.c:855
    SYSC_add_key security/keys/keyctl.c:122 [inline]
    SyS_add_key+0x1cd/0x340 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x447c89
    RSP: 002b:00007fca7a5d3bd8 EFLAGS: 00000246 ORIG_RAX: 00000000000000f8
    RAX: ffffffffffffffda RBX: 00007fca7a5d46cc RCX: 0000000000447c89
    RDX: 0000000020006f4a RSI: 0000000020006000 RDI: 0000000020001ff5
    RBP: 0000000000000046 R08: fffffffffffffffd R09: 0000000000000000
    R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000000000 R14: 00007fca7a5d49c0 R15: 00007fca7a5d4700

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Cc: stable@vger.kernel.org # v3.7+
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Signed-off-by: James Morris

    Eric Biggers
     

31 Oct, 2017

1 commit

  • It turns out that some drivers seem to think it's ok to remap page
    ranges from within interrupts and even NMI's. That is definitely not
    the case, since the page table build-up is simply not interrupt-safe.

    This showed up in the zero-day robot that reported it for the ACPI APEI
    GHES ("Generic Hardware Error Source") driver. Normally it had been
    hidden by the fact that no page table operations had been needed because
    the vmalloc area had been set up by other things.

    Apparently due to a recent change to the GHES driver, commit
    77b246b32b2c ("acpi: apei: check for pending errors when probing GHES
    entries"), 0day actually caught a case during bootup when the ioremap
    called down to page allocation. But that recent change only showed the
    symptom; it wasn't the root cause of the problem.

    Hopefully it is limited to just that one driver.

    If you need to access random physical memory, you either need to ioremap
    in process context, or you need to use the FIXMAP facility to set one
    particular fixmap entry to the required mapping - that can be done safely.
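
    A minimal sketch of the fixmap option (assuming a dedicated slot, here
    called FIX_EXAMPLE_SLOT, has been reserved in the architecture's
    fixmap enum; the helper names are made up, while set_fixmap_nocache(),
    fix_to_virt() and clear_fixmap() are existing fixmap helpers on
    architectures that provide them):

        #include <linux/io.h>
        #include <asm/fixmap.h>

        /* Map one physical page through a pre-reserved fixmap slot; no
         * page tables are built at call time, so this is usable from
         * atomic context, unlike ioremap(). */
        static void __iomem *example_map_atomic(phys_addr_t paddr)
        {
                set_fixmap_nocache(FIX_EXAMPLE_SLOT, paddr & PAGE_MASK);
                return (void __iomem *)(fix_to_virt(FIX_EXAMPLE_SLOT) +
                                        (paddr & ~PAGE_MASK));
        }

        static void example_unmap_atomic(void)
        {
                clear_fixmap(FIX_EXAMPLE_SLOT);
        }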

    Cc: Borislav Petkov
    Cc: Len Brown
    Cc: Tony Luck
    Cc: Fengguang Wu
    Cc: Tyler Baicar
    Cc: Will Deacon
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

29 Oct, 2017

1 commit

  • This fixes CVE-2017-12193.

    Fix a case in the assoc_array implementation in which a new leaf is
    added that needs to go into a node that happens to be full, where the
    existing leaves in that node cluster together at that level to the
    exclusion of the new leaf.

    What needs to happen is that the existing leaves get moved out to a new
    node, N1, at level + 1 and the existing node needs replacing with one,
    N0, that has pointers to the new leaf and to N1.

    The code that tries to do this gets this wrong in two ways:

    (1) The pointer that should've pointed from N0 to N1 is set to point
    recursively to N0 instead.

    (2) The backpointer from N0 needs to be set correctly in the case that
    N0 is either the root node or is reached through a shortcut.

    Fix this by removing this path and using the split_node path instead,
    which achieves the same end, but in a more general way (thanks to Eric
    Biggers for spotting the redundancy).
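
    A toy illustration of the intended wiring (a made-up structure, not
    the kernel's struct assoc_array_node): after the split, N0 must point
    at the new leaf and at N1, and N0's own backpointer must be whatever
    previously pointed at the old node (NULL when it was the root):

        struct toy_node {
                struct toy_node *back;  /* backpointer to the parent level */
                void *slots[16];        /* leaves or child nodes */
        };

        static void wire_replacement(struct toy_node *n0, struct toy_node *n1,
                                     void *new_leaf, struct toy_node *parent)
        {
                n0->slots[0] = new_leaf;
                n0->slots[1] = n1;      /* bug (1) pointed this back at n0 */
                n1->back = n0;
                n0->back = parent;      /* bug (2): missed when n0 is the root
                                         * or sits behind a shortcut */
        }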

    The problem manifests itself as:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: assoc_array_apply_edit+0x59/0xe5

    Fixes: 3cb989501c26 ("Add a generic associative array implementation.")
    Reported-and-tested-by: WU Fan
    Signed-off-by: David Howells
    Cc: stable@vger.kernel.org [v3.13-rc1+]
    Signed-off-by: Linus Torvalds

    David Howells