04 Oct, 2018

1 commit

  • [ Upstream commit 624fa7790f80575a4ec28fbdb2034097dc18d051 ]

    In the scsi_transport_srp implementation, iterating over a klist from
    atomic context cannot be avoided when the legacy block layer is used
    instead of blk-mq. Hence this patch, which makes it safe to use klists
    in atomic context and stops lockdep from reporting the following:

    WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
    Possible interrupt unsafe locking scenario:

    CPU0                                CPU1
    ----                                ----
    lock(&(&k->k_lock)->rlock);
                                        local_irq_disable();
                                        lock(&(&q->__queue_lock)->rlock);
                                        lock(&(&k->k_lock)->rlock);
    <Interrupt>
      lock(&(&q->__queue_lock)->rlock);

    stack backtrace:
    Workqueue: kblockd blk_timeout_work
    Call Trace:
    dump_stack+0xa4/0xf5
    check_usage+0x6e6/0x700
    __lock_acquire+0x185d/0x1b50
    lock_acquire+0xd2/0x260
    _raw_spin_lock+0x32/0x50
    klist_next+0x47/0x190
    device_for_each_child+0x8e/0x100
    srp_timed_out+0xaf/0x1d0 [scsi_transport_srp]
    scsi_times_out+0xd4/0x410 [scsi_mod]
    blk_rq_timed_out+0x36/0x70
    blk_timeout_work+0x1b5/0x220
    process_one_work+0x4fe/0xad0
    worker_thread+0x63/0x5a0
    kthread+0x1c1/0x1e0
    ret_from_fork+0x24/0x30

    See also commit c9ddf73476ff ("scsi: scsi_transport_srp: Fix shost to
    rport translation").

    Signed-off-by: Bart Van Assche
    Cc: Martin K. Petersen
    Cc: James Bottomley
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     

20 Sep, 2018

1 commit

  • Rehashing and destroying a large hash table takes a lot of time
    and happens in process context. It is safe to add cond_resched()
    in rhashtable_rehash_table() and rhashtable_free_and_destroy().
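
    A hedged sketch of where the calls land, with the loop shape and helper
    name assumed from lib/rhashtable.c:

        #include <linux/rhashtable.h>
        #include <linux/sched.h>

        static int rehash_table_sketch(struct rhashtable *ht,
                                       struct bucket_table *old_tbl)
        {
                unsigned int old_hash;
                int err;

                for (old_hash = 0; old_hash < old_tbl->size; old_hash++) {
                        err = rhashtable_rehash_chain(ht, old_hash);
                        if (err)
                                return err;
                        cond_resched();         /* yield once per bucket chain */
                }
                return 0;
        }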

    Signed-off-by: Eric Dumazet
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller
    (cherry picked from commit ae6da1f503abb5a5081f9f6c4a6881de97830f3e)
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

15 Sep, 2018

1 commit

  • commit fc91a3c4c27acdca0bc13af6fbb68c35cfd519f2 upstream.

    While debugging an issue, debugobject tracking warned about an annotation
    issue of an object on the stack. It turned out that the object in question
    was on a different stack, which was due to another issue.

    Thomas suggested printing the pointers and the location of the stack for
    the currently running task. This helped to figure out that the object was
    on the wrong stack.

    As this is generally useful information for debugging similar issues, make
    the error message more informative by printing the pointers.

    [ tglx: Massaged changelog ]
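
    A hedged sketch of what the more informative message can look like
    (format and helper shape assumed, not quoted from the patch):

        #include <linux/printk.h>
        #include <linux/sched/task_stack.h>

        static void report_stack_mismatch(void *addr, bool is_on_stack)
        {
                if (is_on_stack)
                        pr_warn("object %p is on stack %p, but NOT annotated\n",
                                addr, task_stack_page(current));
                else
                        pr_warn("object %p is NOT on stack %p, but annotated\n",
                                addr, task_stack_page(current));
        }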

    Signed-off-by: Joel Fernandes (Google)
    Signed-off-by: Thomas Gleixner
    Acked-by: Waiman Long
    Acked-by: Yang Shi
    Cc: kernel-team@android.com
    Cc: Arnd Bergmann
    Cc: astrachan@google.com
    Link: https://lkml.kernel.org/r/20180723212531.202328-1-joel@joelfernandes.org
    Signed-off-by: Greg Kroah-Hartman

    Joel Fernandes (Google)
     

05 Sep, 2018

1 commit

  • commit 03fc7f9c99c1e7ae2925d459e8487f1a6f199f79 upstream.

    The commit 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI
    when logbuf_lock is available") brought back the possible deadlocks
    in printk() and NMI.

    The check of logbuf_lock is done only in printk_nmi_enter() to prevent
    mixed output. But another CPU might take the lock later, enter NMI, and:

    + Both NMIs might be serialized by yet another lock, for example,
    the one in nmi_cpu_backtrace().

    + The other CPU might get stopped in NMI, see smp_send_stop()
    in panic().

    The only safe solution is to use trylock when storing the message
    into the main log buffer. It might cause reordering when some lines
    go to the main log buffer directly and others are delayed via
    the per-CPU buffer. This means it is not useful in general.

    This patch replaces the problematic NMI deferred context with an NMI
    direct context. It can be used to mark code that might produce
    many messages in NMI, where the risk of losing them is more critical
    than possible reordering.

    The context is then used when dumping trace buffers on oops. That was
    the primary motivation for the original fix. Also, the reordering is
    even less of an issue there because some traces have their own timestamps.

    Finally, nmi_cpu_backtrace() no longer needs to be serialized because
    it will always use the per-CPU buffers again.
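
    A hedged usage sketch of the new context (function names assumed from the
    upstream patch; the real call site sits inside the trace-dump path itself):

        #include <linux/kernel.h>
        #include <linux/printk.h>

        static void dump_traces_from_nmi(void)
        {
                printk_nmi_direct_enter();      /* try the main log buffer, trylock only */
                ftrace_dump(DUMP_ALL);          /* many lines; losing them beats reordering */
                printk_nmi_direct_exit();
        }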

    Fixes: 719f6a7040f1bdaf96 ("printk: Use the main logbuf in NMI when logbuf_lock is available")
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/20180627142028.11259-1-pmladek@suse.com
    To: Steven Rostedt
    Cc: Peter Zijlstra
    Cc: Tetsuo Handa
    Cc: Sergey Senozhatsky
    Cc: linux-kernel@vger.kernel.org
    Cc: stable@vger.kernel.org
    Acked-by: Sergey Senozhatsky
    Signed-off-by: Petr Mladek
    Signed-off-by: Greg Kroah-Hartman

    Petr Mladek
     

18 Aug, 2018

1 commit

  • commit 785a19f9d1dd8a4ab2d0633be4656653bd3de1fc upstream.

    The following kernel panic was observed on ARM64 platform due to a stale
    TLB entry.

    1. ioremap with 4K size, a valid pte page table is set.
    2. iounmap it, its pte entry is set to 0.
    3. ioremap the same address with 2M size, update its pmd entry with
    a new value.
    4. CPU may hit an exception because the old pmd entry is still in TLB,
    which leads to a kernel panic.

    Commit b6bdb7517c3d ("mm/vmalloc: add interfaces to free unmapped page
    table") has addressed this panic by falling to pte mappings in the above
    case on ARM64.

    To support pmd mappings in all cases, TLB purge needs to be performed
    in this case on ARM64.

    Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page()
    so that TLB purge can be added later in separate patches.
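
    The interface change described above, as declarations only (the TLB purge
    that will use 'addr' arrives in the follow-up patches):

        int pud_free_pmd_page(pud_t *pud, unsigned long addr);
        int pmd_free_pte_page(pmd_t *pmd, unsigned long addr);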

    [toshi.kani@hpe.com: merge changes, rewrite patch description]
    Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
    Signed-off-by: Chintan Pandya
    Signed-off-by: Toshi Kani
    Signed-off-by: Thomas Gleixner
    Cc: mhocko@suse.com
    Cc: akpm@linux-foundation.org
    Cc: hpa@zytor.com
    Cc: linux-mm@kvack.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: Will Deacon
    Cc: Joerg Roedel
    Cc: stable@vger.kernel.org
    Cc: Andrew Morton
    Cc: Michal Hocko
    Cc: "H. Peter Anvin"
    Cc:
    Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com
    Signed-off-by: Greg Kroah-Hartman

    Chintan Pandya
     

25 Jul, 2018

1 commit

  • [ Upstream commit 107d01f5ba10f4162c38109496607eb197059064 ]

    rhashtable_init() currently does not take into account the user-passed
    min_size parameter unless param->nelem_hint is set as well. As such,
    the default size (number of buckets) will always be HASH_DEFAULT_SIZE
    even if the smallest allowed size is larger than that. Remediate this
    by unconditionally calling into rounded_hashtable_size() and handling
    things accordingly.
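
    A hedged sketch of the resulting behaviour (body and constant assumed from
    lib/rhashtable.c): the initial bucket count now honours min_size even when
    no nelem_hint is given.

        #include <linux/kernel.h>
        #include <linux/log2.h>
        #include <linux/rhashtable.h>

        #define HASH_DEFAULT_SIZE 64UL  /* as in lib/rhashtable.c */

        static size_t rounded_hashtable_size(const struct rhashtable_params *params)
        {
                size_t retsize;

                if (params->nelem_hint)
                        retsize = max(roundup_pow_of_two(params->nelem_hint * 4 / 3),
                                      (unsigned long)params->min_size);
                else
                        retsize = max(HASH_DEFAULT_SIZE,
                                      (unsigned long)params->min_size);

                return retsize;
        }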

    Signed-off-by: Davidlohr Bueso
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Davidlohr Bueso
     

03 Jul, 2018

1 commit

  • commit 666902e42fd8344b923c02dc5b0f37948ff4f225 upstream.

    "%pCr" formats the current rate of a clock, and calls clk_get_rate().
    The latter obtains a mutex, hence it must not be called from atomic
    context.

    Remove support for this rarely-used format, as vsprintf() (and e.g.
    printk()) must be callable from any context.

    Any remaining out-of-tree users will start seeing the clock's name
    printed instead of its rate.
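
    A hypothetical caller, for illustration only (not from the patch): after
    the removal, "%pCr" falls back to printing the clock's name, and the rate
    has to be fetched explicitly from sleepable context.

        #include <linux/clk.h>
        #include <linux/printk.h>

        static void report_clk(struct clk *clk)
        {
                pr_info("uart clock: %pC\n", clk);                        /* name */
                pr_info("uart clock rate: %lu Hz\n", clk_get_rate(clk));  /* may sleep */
        }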

    Reported-by: Jia-Ju Bai
    Fixes: 900cca2944254edd ("lib/vsprintf: add %pC{,n,r} format specifiers for clocks")
    Link: http://lkml.kernel.org/r/1527845302-12159-5-git-send-email-geert+renesas@glider.be
    To: Jia-Ju Bai
    To: Jonathan Corbet
    To: Michael Turquette
    To: Stephen Boyd
    To: Zhang Rui
    To: Eduardo Valentin
    To: Eric Anholt
    To: Stefan Wahren
    To: Greg Kroah-Hartman
    Cc: Sergey Senozhatsky
    Cc: Petr Mladek
    Cc: Linus Torvalds
    Cc: Steven Rostedt
    Cc: linux-doc@vger.kernel.org
    Cc: linux-clk@vger.kernel.org
    Cc: linux-pm@vger.kernel.org
    Cc: linux-serial@vger.kernel.org
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-renesas-soc@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: Geert Uytterhoeven
    Cc: stable@vger.kernel.org # 4.1+
    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Petr Mladek
    Signed-off-by: Greg Kroah-Hartman

    Geert Uytterhoeven
     

30 May, 2018

2 commits

  • [ Upstream commit ac68b1b3b9c73e652dc7ce0585672e23c5a2dca4 ]

    As reported by Dan, the parenthesis is in the wrong place, and since
    the unlikely() call returns either 0 or 1 it is never less than zero. The
    second issue is that signed integer overflows like "INT_MAX + 1" are
    undefined behavior.

    Since num_test_devs represents the number of devices, we want to stop
    prior to hitting the max and not rely on the wrap-around at all. So
    just cap at num_test_devs + 1, prior to assigning a new device.
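
    A hedged sketch of the described bound check (surrounding code assumed
    from lib/test_kmod.c):

        #include <linux/kernel.h>

        /* Stop before the device count can reach INT_MAX instead of relying
         * on signed wrap-around. */
        static bool can_register_test_dev(int num_test_devs)
        {
                return num_test_devs + 1 < INT_MAX;
        }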

    Link: http://lkml.kernel.org/r/20180224030046.24238-1-mcgrof@kernel.org
    Fixes: d9c6a72d6fa2 ("kmod: add test driver to stress test the module loader")
    Reported-by: Dan Carpenter
    Signed-off-by: Luis R. Rodriguez
    Acked-by: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Luis R. Rodriguez
     
  • commit 7a4deea1aa8bddfed4ef1b35fc2b6732563d8ad5 upstream.

    If the radix tree underlying the IDR happens to be full and we attempt
    to remove an id which is larger than any id in the IDR, we will call
    __radix_tree_delete() with an uninitialised 'slot' pointer, at which
    point anything could happen. This was easiest to hit with a single
    entry at id 0 and attempting to remove a non-0 id, but it could have
    happened with 64 entries and attempting to remove an id >= 64.

    Roman said:

    The syzcaller test boils down to opening /dev/kvm, creating an
    eventfd, and calling a couple of KVM ioctls. None of this requires
    superuser. And the result is dereferencing an uninitialized pointer
    which is likely a crash. The specific path caught by syzbot is via
    KVM_HYPERV_EVENTFD ioctl, which is new in 4.17. But I guess there are
    other user-triggerable paths, so cc:stable is probably justified.

    Matthew added:

    We have around 250 calls to idr_remove() in the kernel today. Many of
    them pass an ID which is embedded in the object they're removing, so
    they're safe. Picking a few likely candidates:

    drivers/firewire/core-cdev.c looks unsafe; the ID comes from an ioctl.
    drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c is similar
    drivers/atm/nicstar.c could be taken down by a handcrafted packet
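
    A minimal kernel-side reproducer along the lines described above
    (illustration only):

        #include <linux/gfp.h>
        #include <linux/idr.h>

        static DEFINE_IDR(repro_idr);

        static void idr_remove_repro(void)
        {
                static int dummy;

                /* one entry at id 0 ... */
                if (idr_alloc(&repro_idr, &dummy, 0, 1, GFP_KERNEL) < 0)
                        return;
                /* ... then remove an absent id larger than any allocated one;
                 * before the fix this dereferenced an uninitialised slot. */
                idr_remove(&repro_idr, 10);
        }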

    Link: http://lkml.kernel.org/r/20180518175025.GD6361@bombadil.infradead.org
    Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree")
    Reported-by:
    Debugged-by: Roman Kagan
    Signed-off-by: Matthew Wilcox
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox
     

23 May, 2018

2 commits

  • commit 9f418224e8114156d995b98fa4e0f4fd21f685fe upstream.

    Fix a race in the multi-order iteration code which causes the kernel to
    hit a GP fault. This was first seen with a production v4.15 based
    kernel (4.15.6-300.fc27.x86_64) utilizing a DAX workload which used
    order 9 PMD DAX entries.

    The race has to do with how we tear down multi-order sibling entries
    when we are removing an item from the tree. Remember for example that
    an order 2 entry looks like this:

    struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]

    where 'entry' is in some slot in the struct radix_tree_node, and the
    three slots following 'entry' contain sibling pointers which point back
    to 'entry.'

    When we delete 'entry' from the tree, we call :

    radix_tree_delete()
    radix_tree_delete_item()
    __radix_tree_delete()
    replace_slot()

    replace_slot() first removes the siblings in order from the first to the
    last, and then at the end replaces 'entry' with NULL. This means that for a
    brief period of time we end up with one or more of the siblings removed,
    so:

    struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]

    This causes an issue if you have a reader iterating over the slots in
    the tree via radix_tree_for_each_slot() while only under
    rcu_read_lock()/rcu_read_unlock() protection. This is a common case in
    mm/filemap.c.

    The issue is that when __radix_tree_next_slot() => skip_siblings() tries
    to skip over the sibling entries in the slots, it currently does so with
    an exact match on the slot directly preceding our current slot.
    Normally this works:

    V preceding slot
    struct radix_tree_node.slots[] = [entry][sibling][sibling][sibling]
    ^ current slot

    This lets you find the first sibling, and you skip them all in order.

    But in the case where one of the siblings is NULL, that slot is skipped
    and then our sibling detection is interrupted:

    V preceding slot
    struct radix_tree_node.slots[] = [entry][NULL][sibling][sibling]
    ^ current slot

    This means that the sibling pointers aren't recognized since they point
    all the way back to 'entry', so we think that they are normal internal
    radix tree pointers. This causes us to think we need to walk down to a
    struct radix_tree_node starting at the address of 'entry'.

    In a real running kernel this will crash the thread with a GP fault when
    you try and dereference the slots in your broken node starting at
    'entry'.

    We fix this race by fixing the way that skip_siblings() detects sibling
    nodes. Instead of testing against the preceding slot we instead look
    for siblings via is_sibling_entry() which compares against the position
    of the struct radix_tree_node.slots[] array. This ensures that sibling
    entries are properly identified, even if they are no longer contiguous
    with the 'entry' they point to.
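
    A hedged sketch of the detection the fix relies on (helper shape assumed
    from lib/radix-tree.c): a sibling entry points back into the parent node's
    own slots[] array, so test the pointer against that range rather than
    comparing it with whatever happens to sit in the preceding slot.

        #include <linux/radix-tree.h>

        static bool points_into_slots(struct radix_tree_node *parent, void *entry)
        {
                /* strip the internal-entry tag bits before comparing */
                void **p = (void **)((unsigned long)entry & ~3UL);

                return p >= (void **)parent->slots &&
                       p <  (void **)parent->slots + RADIX_TREE_MAP_SIZE;
        }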

    Link: http://lkml.kernel.org/r/20180503192430.7582-6-ross.zwisler@linux.intel.com
    Fixes: 148deab223b2 ("radix-tree: improve multiorder iterators")
    Signed-off-by: Ross Zwisler
    Reported-by: CR, Sapthagirish
    Reviewed-by: Jan Kara
    Cc: Matthew Wilcox
    Cc: Christoph Hellwig
    Cc: Dan Williams
    Cc: Dave Chinner
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Ross Zwisler
     
  • commit 1e3054b98c5415d5cb5f8824fc33b548ae5644c3 upstream.

    I had neglected to increment the error counter when the tests failed,
    which made the tests noisy when they fail but did not actually make
    them return an error code.
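
    A hedged sketch of the missing step (counter and comparison names assumed
    from lib/test_bitmap.c):

        if (!bitmap_equal(result, expected, nbits)) {
                pr_err("bitmap comparison failed\n");
                failed_tests++;         /* previously missing, so init returned 0 */
        }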

    Link: http://lkml.kernel.org/r/20180509114328.9887-1-mpe@ellerman.id.au
    Fixes: 3cc78125a081 ("lib/test_bitmap.c: add optimisation tests")
    Signed-off-by: Matthew Wilcox
    Signed-off-by: Michael Ellerman
    Reported-by: Michael Ellerman
    Tested-by: Michael Ellerman
    Reviewed-by: Kees Cook
    Cc: Yury Norov
    Cc: Andy Shevchenko
    Cc: Geert Uytterhoeven
    Cc: [4.13+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox
     

09 May, 2018

1 commit

  • commit b4678df184b314a2bd47d2329feca2c2534aa12b upstream.

    The errseq_t infrastructure assumes that errors which occurred before
    the file descriptor was opened are of no interest to the application.
    This turns out to be a regression for some applications, notably Postgres.

    Before errseq_t, a writeback error would be reported exactly once (as
    long as the inode remained in memory), so Postgres could open a file,
    call fsync() and find out whether there had been a writeback error on
    that file from another process.

    This patch changes the errseq infrastructure to report errors to all
    file descriptors which are opened after the error occurred, but before
    it was reported to any file descriptor. This restores the user-visible
    behaviour.
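
    A userspace illustration of the restored behaviour (hypothetical paths,
    not from the patch):

        #include <errno.h>
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        static int check_datafile(const char *path)
        {
                int fd = open(path, O_RDONLY);

                if (fd < 0)
                        return -1;
                /* An unreported writeback error that happened before this open()
                 * is again visible to the first fsync() on the new descriptor. */
                if (fsync(fd) < 0 && errno == EIO)
                        fprintf(stderr, "writeback error on %s\n", path);
                close(fd);
                return 0;
        }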

    Cc: stable@vger.kernel.org
    Fixes: 5660e13d2fd6 ("fs: new infrastructure for writeback error handling and reporting")
    Signed-off-by: Matthew Wilcox
    Reviewed-by: Jeff Layton
    Signed-off-by: Jeff Layton
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox
     

02 May, 2018

1 commit

  • commit 3e14c6abbfb5c94506edda9d8e2c145d79375798 upstream.

    This WARNING proved to be noisy. The function still returns an error
    and callers should handle it. That's how most kernel code works.
    Downgrade the WARNING to pr_err() and leave WARNINGs for kernel bugs.

    Signed-off-by: Dmitry Vyukov
    Reported-by: syzbot+209c0f67f99fec8eb14b@syzkaller.appspotmail.com
    Reported-by: syzbot+7fb6d9525a4528104e05@syzkaller.appspotmail.com
    Reported-by: syzbot+2e63711063e2d8f9ea27@syzkaller.appspotmail.com
    Reported-by: syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Vyukov
     

26 Apr, 2018

1 commit

  • [ Upstream commit 09584b406742413ac4c8d7e030374d4daa045b69 ]

    With CONFIG_BPF_JIT_ALWAYS_ON defined in the config file,
    tools/testing/selftests/bpf/test_kmod.sh fails as shown below:
    [root@localhost bpf]# ./test_kmod.sh
    sysctl: setting key "net.core.bpf_jit_enable": Invalid argument
    [ JIT enabled:0 hardened:0 ]
    [ 132.175681] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
    [ 132.458834] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
    [ JIT enabled:1 hardened:0 ]
    [ 133.456025] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
    [ 133.730935] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
    [ JIT enabled:1 hardened:1 ]
    [ 134.769730] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
    [ 135.050864] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
    [ JIT enabled:1 hardened:2 ]
    [ 136.442882] test_bpf: #297 BPF_MAXINSNS: Jump, gap, jump, ... FAIL to prog_create err=-524 len=4096
    [ 136.821810] test_bpf: Summary: 348 PASSED, 1 FAILED, [340/340 JIT'ed]
    [root@localhost bpf]#

    The test_kmod.sh script loads/removes test_bpf.ko multiple times with
    different settings for the sysctls net.core.bpf_jit_{enable,harden}. The
    failing test #297 of test_bpf.ko is designed such that JIT always fails.

    Commit 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config)
    introduced the following tightening logic:
        ...
        if (!bpf_prog_is_dev_bound(fp->aux)) {
                fp = bpf_int_jit_compile(fp);
        #ifdef CONFIG_BPF_JIT_ALWAYS_ON
                if (!fp->jited) {
                        *err = -ENOTSUPP;
                        return fp;
                }
        #endif
        ...
    With this logic, Test #297 always gets return value -ENOTSUPP
    when CONFIG_BPF_JIT_ALWAYS_ON is defined, causing the test failure.

    This patch fixes the failure by marking Test #297 as an expected failure
    when CONFIG_BPF_JIT_ALWAYS_ON is defined.
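
    A hedged sketch of the harness-side handling (struct and field names
    assumed from lib/test_bpf.c):

        /* A creation failure with exactly the recorded error code, e.g.
         * .expected_errcode = -ENOTSUPP for test #297, counts as a pass. */
        static bool creation_failure_expected(int create_err, int expected_errcode)
        {
                return create_err && create_err == expected_errcode;
        }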

    Fixes: 290af86629b2 (bpf: introduce BPF_JIT_ALWAYS_ON config)
    Signed-off-by: Yonghong Song
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Yonghong Song
     

19 Apr, 2018

1 commit

  • commit 8351760ff5b2042039554b4948ddabaac644a976 upstream.

    syzbot is catching stalls at __bitmap_parselist()
    (https://syzkaller.appspot.com/bug?id=ad7e0351fbc90535558514a71cd3edc11681997a).
    The trigger is

    unsigned long v = 0;
    bitmap_parselist("7:,", &v, BITS_PER_LONG);

    which results in hitting an infinite loop at the 'while (a <= b)'
    loop in __bitmap_parselist().
    Reported-by: Tetsuo Handa
    Reported-by: syzbot
    Cc: Noam Camus
    Cc: Rasmus Villemoes
    Cc: Matthew Wilcox
    Cc: Mauro Carvalho Chehab
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Yury Norov
     

01 Apr, 2018

1 commit

  • [ Upstream commit d3dcf8eb615537526bd42ff27a081d46d337816e ]

    When inserting duplicate objects (those with the same key), the
    current rhlist implementation messes up the chain pointers by
    updating the bucket pointer instead of the previous element's next
    pointer to the newly inserted node. This causes missing elements on
    removal and traversal.

    Fix that by properly updating the pprev pointer to point to
    the correct rhash_head next pointer.
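
    A generic illustration of the bug class, not the rhashtable code itself:
    when scanning a chain with a pprev cursor, pprev must be advanced past
    non-matching nodes, otherwise a later insert writes through the head
    pointer and detaches the nodes that were skipped.

        struct node {
                struct node *next;
                int key;
        };

        static void insert_before_match(struct node **head, struct node *new, int key)
        {
                struct node **pprev = head;
                struct node *cur;

                for (cur = *head; cur; cur = cur->next) {
                        if (cur->key != key) {
                                pprev = &cur->next;     /* the step the fix adds */
                                continue;
                        }
                        new->next = cur;
                        *pprev = new;   /* without the step above, this clobbers *head */
                        return;
                }
                new->next = NULL;
                *pprev = new;
        }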

    Issue: 1241076
    Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7
    Fixes: ca26893f05e8 ('rhashtable: Add rhlist interface')
    Signed-off-by: Paul Blakey
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Paul Blakey
     

29 Mar, 2018

1 commit

  • commit b6bdb7517c3d3f41f20e5c2948d6bc3f8897394e upstream.

    On architectures with CONFIG_HAVE_ARCH_HUGE_VMAP set, ioremap() may
    create pud/pmd mappings. A kernel panic was observed on arm64 systems
    with Cortex-A75 in the following steps as described by Hanjun Guo.

    1. ioremap a 4K size, valid page table will build,
    2. iounmap it, pte0 will set to 0;
    3. ioremap the same address with 2M size, pgd/pmd is unchanged,
    then set the a new value for pmd;
    4. pte0 is leaked;
    5. CPU may meet exception because the old pmd is still in TLB,
    which will lead to kernel panic.

    This panic is not reproducible on x86: INVLPG, called from iounmap,
    purges all levels of entries associated with the purged address on x86.
    x86 still has a memory leak, however.

    The patch changes the ioremap path to free unmapped page table(s) since
    doing so in the unmap path has the following issues:

    - The iounmap() path is shared with vunmap(). Since vmap() only
    supports pte mappings, making vunmap() to free a pte page is an
    overhead for regular vmap users as they do not need a pte page freed
    up.

    - Checking if all entries in a pte page are cleared in the unmap path
    is racy, and serializing this check is expensive.

    - The unmap path calls free_vmap_area_noflush() to do lazy TLB purges.
    Clearing a pud/pmd entry before the lazy TLB purges needs extra TLB
    purge.

    Add two interfaces, pud_free_pmd_page() and pmd_free_pte_page(), which
    clear a given pud/pmd entry and free up a page for the lower level
    entries.

    This patch implements their stub functions on x86 and arm64, which work
    as a workaround.
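
    A hedged sketch of the consumer side (condition list assumed from
    lib/ioremap.c): a huge pmd mapping is only attempted once any stale pte
    page under that pmd has been freed.

        if (ioremap_pmd_enabled() &&
            ((next - addr) == PMD_SIZE) &&
            IS_ALIGNED(phys_addr + addr, PMD_SIZE) &&
            pmd_free_pte_page(pmd)) {
                if (pmd_set_huge(pmd, phys_addr + addr, prot))
                        continue;
        }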

    [akpm@linux-foundation.org: fix typo in pmd_free_pte_page() stub]
    Link: http://lkml.kernel.org/r/20180314180155.19492-2-toshi.kani@hpe.com
    Fixes: e61ce6ade404e ("mm: change ioremap to set up huge I/O mappings")
    Reported-by: Lei Li
    Signed-off-by: Toshi Kani
    Cc: Catalin Marinas
    Cc: Wang Xuefeng
    Cc: Will Deacon
    Cc: Hanjun Guo
    Cc: Michal Hocko
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Borislav Petkov
    Cc: Matthew Wilcox
    Cc: Chintan Pandya
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Toshi Kani
     

19 Mar, 2018

1 commit

  • [ Upstream commit a0e94598e6b6c0d1df6a5fa14eb7c767ca817a20 ]

    Destination is a kernel pointer and source - a userland one
    in _copy_from_user(); _copy_to_user() is the other way round.

    Fixes: d597580d37377 ("generic ...copy_..._user primitives")
    Signed-off-by: Christophe Leroy
    Signed-off-by: Al Viro
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Christophe Leroy
     

15 Mar, 2018

1 commit

  • commit 1b4cfe3c0a30dde968fb43c577a8d7e262a145ee upstream.

    Commit b8347c219649 ("x86/debug: Handle warnings before the notifier
    chain, to fix KGDB crash") changed the ordering of fixups, and did not
    take into account the case of x86 processing non-WARN() and non-BUG()
    exceptions. This would lead to output of a false BUG line with no other
    information.

    In the case of a refcount exception, it would be immediately followed by
    the refcount WARN(), producing very strange double-"cut here":

    lkdtm: attempting bad refcount_inc() overflow
    ------------[ cut here ]------------
    Kernel BUG at 0000000065f29de5 [verbose debug info unavailable]
    ------------[ cut here ]------------
    refcount_t overflow at lkdtm_REFCOUNT_INC_OVERFLOW+0x6b/0x90 in cat[3065], uid/euid: 0/0
    WARNING: CPU: 0 PID: 3065 at kernel/panic.c:657 refcount_error_report+0x9a/0xa4
    ...

    In the prior ordering, exceptions were searched first:

        do_trap_no_signal(struct task_struct *tsk, int trapnr, char *str,
        ...
                if (fixup_exception(regs, trapnr))
                        return 0;

        -       if (fixup_bug(regs, trapnr))
        -               return 0;
        -

    As a result, fixup_bug()'s is_valid_bugaddr() didn't take into account
    needing to search the exception list first, since that had already
    happened.

    So, instead of searching the exception list twice (once in
    is_valid_bugaddr() and then again in fixup_exception()), just add a
    simple sanity check to report_bug() that will immediately bail out if a
    BUG() (or WARN()) entry is not found.
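
    A hedged sketch of that sanity check (placement in lib/bug.c's report_bug()
    assumed): with no matching BUG/WARN table entry, report nothing and let the
    caller fall through to the usual exception handling.

        bug = find_bug(bugaddr);
        if (!bug)
                return BUG_TRAP_TYPE_NONE;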

    Link: http://lkml.kernel.org/r/20180301225934.GA34350@beast
    Fixes: b8347c219649 ("x86/debug: Handle warnings before the notifier chain, to fix KGDB crash")
    Signed-off-by: Kees Cook
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Richard Weinberger
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Kees Cook
     

03 Mar, 2018

1 commit

  • [ Upstream commit bbc25bee37d2b32cf3a1fab9195b6da3a185614a ]

    Current MIPS64r6 toolchains aren't able to generate efficient
    DMULU/DMUHU based code for the C implementation of umul_ppmm(), which
    performs an unsigned 64 x 64 bit multiply and returns the upper and
    lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit
    inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128
    x 128 multiply. This is both inefficient, and it results in a link error
    since we don't include __multi3 in MIPS linux.

    For example commit 90a53e4432b1 ("cfg80211: implement regdb signature
    checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and
    64r6el_defconfig builds by indirectly selecting MPILIB. The same build
    errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA:

    lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1':
    lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3'
    lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1':
    lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3'
    lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1':
    lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3'
    lib/mpi/mpih-div.o: In function `mpihelp_divrem':
    lib/mpi/mpih-div.c:205: undefined reference to `__multi3'
    lib/mpi/mpih-div.c:142: undefined reference to `__multi3'

    Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using
    inline assembly and the DMULU/DMUHU instructions, to prevent __multi3
    calls being emitted.
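
    A hedged sketch of the kind of definition added (operand constraints and
    the UDItype typedef assumed from the existing MPI longlong header):

        #define umul_ppmm(w1, w0, u, v)                                 \
        do {                                                            \
                __asm__ ("dmulu %0, %1, %2"                             \
                         : "=d" ((UDItype)(w0))                         \
                         : "d" ((UDItype)(u)), "d" ((UDItype)(v)));     \
                __asm__ ("dmuhu %0, %1, %2"                             \
                         : "=d" ((UDItype)(w1))                         \
                         : "d" ((UDItype)(u)), "d" ((UDItype)(v)));     \
        } while (0)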

    Fixes: 7fd08ca58ae6 ("MIPS: Add build support for the MIPS R6 ISA")
    Signed-off-by: James Hogan
    Cc: Ralf Baechle
    Cc: Herbert Xu
    Cc: "David S. Miller"
    Cc: linux-mips@linux-mips.org
    Cc: linux-crypto@vger.kernel.org
    Signed-off-by: Herbert Xu
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    James Hogan
     

25 Feb, 2018

1 commit

  • [ Upstream commit 8dfd2f22d3bf3ab7714f7495ad5d897b8845e8c1 ]

    Callers of sprint_oid() do not check its return value before printing
    the result. In the case where the OID is zero-length, -EBADMSG was
    being returned without anything being written to the buffer, resulting
    in uninitialized stack memory being printed. Fix this by writing
    "(bad)" to the buffer in the cases where -EBADMSG is returned.

    Fixes: 4f73175d0375 ("X.509: Add utility functions to render OIDs as strings")
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     

22 Feb, 2018

2 commits

  • commit 4675ff05de2d76d167336b368bd07f3fef6ed5a6 upstream.

    Fix up makefiles, remove references, and git rm kmemcheck.

    Link: http://lkml.kernel.org/r/20171007030159.22241-4-alexander.levin@verizon.com
    Signed-off-by: Sasha Levin
    Cc: Steven Rostedt
    Cc: Vegard Nossum
    Cc: Pekka Enberg
    Cc: Michal Hocko
    Cc: Eric W. Biederman
    Cc: Alexander Potapenko
    Cc: Tim Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Levin, Alexander (Sasha Levin)
     
  • commit d0bc0c2a31c95002d37c3cc511ffdcab851b3256 upstream.

    TTM tries to allocate coherent memory in chunks of 2MB first to improve
    TLB efficiency and falls back to allocating 4K pages if that fails.

    Suppress the warning when the 2MB allocation fails since there is a
    valid fallback path.

    Signed-off-by: Christian König
    Reported-by: Mike Galbraith
    Acked-by: Konrad Rzeszutek Wilk
    Bug: https://bugs.freedesktop.org/show_bug.cgi?id=104082
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Greg Kroah-Hartman

    Christian König
     

17 Feb, 2018

3 commits

  • commit 42440c1f9911b4b7b8ba3dc4e90c1197bc561211 upstream.

    UBSAN=y fails to build with new GCC/clang:

    arch/x86/kernel/head64.o: In function `sanitize_boot_params':
    arch/x86/include/asm/bootparam_utils.h:37: undefined reference to `__ubsan_handle_type_mismatch_v1'

    because Clang and GCC 8 slightly changed the ABI for 'type mismatch' errors.
    The compiler now uses a new __ubsan_handle_type_mismatch_v1() function with a
    slightly modified 'struct type_mismatch_data'.

    Let's add a new 'struct type_mismatch_data_common' which is independent of the
    compiler's layout of 'struct type_mismatch_data', and make the
    __ubsan_handle_type_mismatch[_v1]() functions transform the compiler-dependent
    type mismatch data to our internal representation. This way, we can
    support both old and new compilers with a minimal amount of change.
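
    A hedged sketch of the compiler-independent representation (field names
    assumed; the location and type descriptor types come from lib/ubsan.h):

        struct type_mismatch_data_common {
                struct source_location *location;
                struct type_descriptor *type;
                unsigned long alignment;
                unsigned char type_check_kind;
        };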

    Link: http://lkml.kernel.org/r/20180119152853.16806-1-aryabinin@virtuozzo.com
    Signed-off-by: Andrey Ryabinin
    Reported-by: Sodagudi Prasad
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Andrey Ryabinin
     
  • commit b8fe1120b4ba342b4f156d24e952d6e686b20298 upstream.

    A visit from the spelling fairy.

    Cc: David Laight
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Andrew Morton
     
  • commit e7c52b84fb18f08ce49b6067ae6285aca79084a8 upstream.

    We get a lot of very large stack frames using gcc-7.0.1 with the default
    -fsanitize-address-use-after-scope --param asan-stack=1 options, which can
    easily cause an overflow of the kernel stack, e.g.

    drivers/gpu/drm/i915/gvt/handlers.c:2434:1: warning: the frame size of 46176 bytes is larger than 3072 bytes
    drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: warning: the frame size of 23632 bytes is larger than 3072 bytes
    lib/atomic64_test.c:250:1: warning: the frame size of 11200 bytes is larger than 3072 bytes
    drivers/gpu/drm/i915/gvt/handlers.c:2621:1: warning: the frame size of 9208 bytes is larger than 3072 bytes
    drivers/media/dvb-frontends/stv090x.c:3431:1: warning: the frame size of 6816 bytes is larger than 3072 bytes
    fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than 3072 bytes

    To reduce this risk, -fsanitize-address-use-after-scope is now split out
    into a separate CONFIG_KASAN_EXTRA Kconfig option, leading to stack
    frames that are smaller than 2 kilobytes most of the time on x86_64. An
    earlier version of this patch also prevented combining KASAN_EXTRA with
    KASAN_INLINE, but that is no longer necessary with gcc-7.0.1.

    All patches to get the frame size below 2048 bytes with CONFIG_KASAN=y
    and CONFIG_KASAN_EXTRA=n have been merged by maintainers now, so we can
    bring back that default now. KASAN_EXTRA=y still causes lots of
    warnings but now defaults to !COMPILE_TEST to disable it in
    allmodconfig, and it remains disabled in all other defconfigs since it
    is a new option. I arbitrarily raise the warning limit for KASAN_EXTRA
    to 3072 to reduce the noise, but an allmodconfig kernel still has around
    50 warnings on gcc-7.

    I experimented a bit more with smaller stack frames and have another
    follow-up series that reduces the warning limit for 64-bit architectures
    to 1280 bytes (without CONFIG_KASAN).

    With earlier versions of this patch series, I also had patches to address
    the warnings we get with KASAN and/or KASAN_EXTRA, using a
    "noinline_if_stackbloat" annotation.

    That annotation now got replaced with a gcc-8 bugfix (see
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715) and a workaround for
    older compilers, which means that KASAN_EXTRA is now just as bad as
    before and will lead to an instant stack overflow in a few extreme
    cases.

    This reverts parts of commit 3f181b4d8652 ("lib/Kconfig.debug: disable
    -Wframe-larger-than warnings with KASAN=y"). Two patches in linux-next
    should be merged first to avoid introducing warnings in an allmodconfig
    build:
    3cd890dbe2a4 ("media: dvb-frontends: fix i2c access helpers for KASAN")
    16c3ada89cff ("media: r820t: fix r820t_write_reg for KASAN")

    Do we really need to backport this?

    I think we do: without this patch, enabling KASAN will lead to
    unavoidable kernel stack overflow in certain device drivers when built
    with gcc-7 or higher on linux-4.10+ or any version that contains a
    backport of commit c5caf21ab0cf8. Most people are probably still on
    older compilers, but it will get worse over time as they upgrade their
    distros.

    The warnings we get on kernels older than this should all be for code
    that uses dangerously large stack frames, though most of them do not
    cause an actual stack overflow by themselves. The asan-stack option was
    added in linux-4.0, and commit 3f181b4d8652 ("lib/Kconfig.debug:
    disable -Wframe-larger-than warnings with KASAN=y") effectively turned
    off the warning for allmodconfig kernels, so I would like to see this
    fix backported to any kernels later than 4.0.

    I have done dozens of fixes for individual functions with stack frames
    larger than 2048 bytes with asan-stack, and I plan to make sure that
    all those fixes make it into the stable kernels as well (most are
    already there).

    Part of the complication here is that asan-stack (from 4.0) was
    originally assumed to always require much larger stacks, but that
    turned out to be a combination of multiple gcc bugs that we have now
    worked around and fixed, but sanitize-address-use-after-scope (from
    v4.10) has a much higher inherent stack usage and also suffers from at
    least three other problems that we have analyzed but not yet fixed
    upstream, each of them makes the stack usage more severe than it should
    be.

    Link: http://lkml.kernel.org/r/20171221134744.2295529-1-arnd@arndb.de
    Signed-off-by: Arnd Bergmann
    Acked-by: Andrey Ryabinin
    Cc: Mauro Carvalho Chehab
    Cc: Andrey Ryabinin
    Cc: Alexander Potapenko
    Cc: Dmitry Vyukov
    Cc: Andrey Konovalov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     

04 Feb, 2018

1 commit


31 Jan, 2018

1 commit

  • [ upstream commit 290af86629b25ffd1ed6232c4e9107da031705cb ]

    The BPF interpreter has been used as part of the Spectre 2 attack CVE-2017-5715.

    A quote from the Google Project Zero blog:
    "At this point, it would normally be necessary to locate gadgets in
    the host kernel code that can be used to actually leak data by reading
    from an attacker-controlled location, shifting and masking the result
    appropriately and then using the result of that as offset to an
    attacker-controlled address for a load. But piecing gadgets together
    and figuring out which ones work in a speculation context seems annoying.
    So instead, we decided to use the eBPF interpreter, which is built into
    the host kernel - while there is no legitimate way to invoke it from inside
    a VM, the presence of the code in the host kernel's text section is sufficient
    to make it usable for the attack, just like with ordinary ROP gadgets."

    To make the attacker's job harder, introduce a BPF_JIT_ALWAYS_ON config
    option that removes the interpreter from the kernel in favor of JIT-only mode.
    So far eBPF JIT is supported by:
    x64, arm64, arm32, sparc64, s390, powerpc64, mips64

    The start of the JITed program is randomized and the code page is marked as
    read-only. In addition, "constant blinding" can be turned on with
    net.core.bpf_jit_harden.

    v2->v3:
    - move __bpf_prog_ret0 under ifdef (Daniel)

    v1->v2:
    - fix init order, test_bpf and cBPF (Daniel's feedback)
    - fix offloaded bpf (Jakub's feedback)
    - add 'return 0' dummy in case something can invoke prog->bpf_func
    - retarget bpf tree. For bpf-next the patch would need one extra hunk.
    It will be sent when the trees are merged back to net-next

    Considered doing:
    int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT;
    but it seems better to land the patch as-is and in bpf-next remove
    bpf_jit_enable global variable from all JITs, consolidate in one place
    and remove this jit_init() function.
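
    A hedged sketch of the "return 0" dummy mentioned above (naming assumed;
    signature as for prog->bpf_func):

        #include <linux/filter.h>

        #ifdef CONFIG_BPF_JIT_ALWAYS_ON
        /* Installed when the JIT fails, so a program can never fall back into
         * the removed interpreter. */
        static unsigned int __bpf_prog_ret0(const void *ctx,
                                            const struct bpf_insn *insn)
        {
                return 0;
        }
        #endif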

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

    Alexei Starovoitov
     

25 Dec, 2017

1 commit

  • commit 11af847446ed0d131cf24d16a7ef3d5ea7a49554 upstream.

    Rename the unwinder config options from:

    CONFIG_ORC_UNWINDER
    CONFIG_FRAME_POINTER_UNWINDER
    CONFIG_GUESS_UNWINDER

    to:

    CONFIG_UNWINDER_ORC
    CONFIG_UNWINDER_FRAME_POINTER
    CONFIG_UNWINDER_GUESS

    ... in order to give them a more logical config namespace.

    Suggested-by: Ingo Molnar
    Signed-off-by: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Josh Poimboeuf
     

14 Dec, 2017

4 commits

  • [ Upstream commit 1f3c790bd5989fcfec9e53ad8fa09f5b740c958f ]

    line-range is supposed to treat "1-" as "1-endoffile", so
    handle the special case by setting last_lineno to UINT_MAX.

    Fixes this error:

    dynamic_debug:ddebug_parse_query: last-line:0 < 1st-line:1
    dynamic_debug:ddebug_exec_query: query parse failed
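
    A hedged sketch of the special case (variable names assumed from
    ddebug_parse_query() in lib/dynamic_debug.c):

        if (*last == '\0')
                query->last_lineno = UINT_MAX;  /* "N-" means N to end of file */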

    Link: http://lkml.kernel.org/r/10a6a101-e2be-209f-1f41-54637824788e@infradead.org
    Signed-off-by: Randy Dunlap
    Acked-by: Jason Baron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Randy Dunlap
     
  • [ Upstream commit 36a3d1dd4e16bcd0d2ddfb4a2ec7092f0ae0d931 ]

    If the amount of resources allocated to a gen_pool exceeds 2^32 then the
    avail atomic overflows and this causes problems when clients try and
    borrow resources from the pool. This is only expected to be an issue on
    64 bit systems.

    Add the header that provides the atomic_long* operations, so that 32-bit
    systems continue to use 32-bit atomics while 64-bit systems can use
    atomic64_t.
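
    A small self-contained illustration of the type change, not the genalloc
    hunk itself: atomic_long_t is 64 bits wide on 64-bit systems, so totals
    above 4GB of tracked resources stay correct.

        #include <linux/atomic.h>

        static atomic_long_t avail = ATOMIC_LONG_INIT(0);

        static void chunk_added(unsigned long nbytes)
        {
                atomic_long_add(nbytes, &avail);        /* a 32-bit atomic_t would overflow */
        }

        static unsigned long pool_avail(void)
        {
                return atomic_long_read(&avail);
        }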

    Link: http://lkml.kernel.org/r/1509033843-25667-1-git-send-email-sbates@raithlin.com
    Signed-off-by: Stephen Bates
    Reviewed-by: Logan Gunthorpe
    Reviewed-by: Mathieu Desnoyers
    Reviewed-by: Daniel Mentz
    Cc: Jonathan Corbet
    Cc: Andrew Morton
    Cc: Will Deacon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Stephen Bates
     
  • commit 81a7be2cd69b412ab6aeacfe5ebf1bb6e5bce955 upstream.

    asn1_ber_decoder() was ignoring errors from actions associated with the
    opcodes ASN1_OP_END_SEQ_ACT, ASN1_OP_END_SET_ACT,
    ASN1_OP_END_SEQ_OF_ACT, and ASN1_OP_END_SET_OF_ACT. In practice, this
    meant the pkcs7_note_signed_info() action (since that was the only user
    of those opcodes). Fix it by checking for the error, just like the
    decoder does for actions associated with the other opcodes.

    This bug allowed users to leak slab memory by repeatedly trying to add a
    specially crafted "pkcs7_test" key (requires CONFIG_PKCS7_TEST_KEY).

    In theory, this bug could also be used to bypass module signature
    verification, by providing a PKCS#7 message that is misparsed such that
    a signature's ->authattrs do not contain its ->msgdigest. But it
    doesn't seem practical in normal cases, due to restrictions on the
    format of the ->authattrs.
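
    A hedged sketch of the fix (decoder-internal names and the action's
    argument list assumed from lib/asn1_decoder.c):

        ret = actions[machine[pc + 1]](context, hdr, 0, data + tdp, len);
        if (ret < 0)
                return ret;     /* previously the return value was discarded */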

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Reviewed-by: James Morris
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit e0058f3a874ebb48b25be7ff79bc3b4e59929f90 upstream.

    In asn1_ber_decoder(), indefinitely-sized ASN.1 items were being passed
    to the action functions before their lengths had been computed, using
    the bogus length of 0x80 (ASN1_INDEFINITE_LENGTH). This resulted in
    reading data past the end of the input buffer, when given a specially
    crafted message.

    Fix it by rearranging the code so that the indefinite length is resolved
    before the action is called.

    This bug was originally found by fuzzing the X.509 parser in userspace
    using libFuzzer from the LLVM project.

    KASAN report (cleaned up slightly):

    BUG: KASAN: slab-out-of-bounds in memcpy ./include/linux/string.h:341 [inline]
    BUG: KASAN: slab-out-of-bounds in x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
    Read of size 128 at addr ffff880035dd9eaf by task keyctl/195

    CPU: 1 PID: 195 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bbc6e9 #26
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
    Call Trace:
    __dump_stack lib/dump_stack.c:17 [inline]
    dump_stack+0xd1/0x175 lib/dump_stack.c:53
    print_address_description+0x78/0x260 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x23f/0x350 mm/kasan/report.c:409
    memcpy+0x1f/0x50 mm/kasan/kasan.c:302
    memcpy ./include/linux/string.h:341 [inline]
    x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
    asn1_ber_decoder+0xb4a/0x1fd0 lib/asn1_decoder.c:447
    x509_cert_parse+0x1c7/0x620 crypto/asymmetric_keys/x509_cert_parser.c:89
    x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
    asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
    key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
    SYSC_add_key security/keys/keyctl.c:122 [inline]
    SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0x96

    Allocated by task 195:
    __do_kmalloc_node mm/slab.c:3675 [inline]
    __kmalloc_node+0x47/0x60 mm/slab.c:3682
    kvmalloc ./include/linux/mm.h:540 [inline]
    SYSC_add_key security/keys/keyctl.c:104 [inline]
    SyS_add_key+0x19e/0x290 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0x96

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Reported-by: Alexander Potapenko
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     

30 Nov, 2017

1 commit

  • commit 1d9ddde12e3c9bab7f3d3484eb9446315e3571ca upstream.

    On a non-preemptible kernel, if KEYCTL_DH_COMPUTE is called with the
    largest permitted inputs (16384 bits), the kernel spends 10+ seconds
    doing modular exponentiation in mpi_powm() without rescheduling. If all
    threads do it, it locks up the system. Moreover, it can cause
    rcu_sched-stall warnings.

    Notwithstanding the insanity of doing this calculation in kernel mode
    rather than in userspace, fix it by calling cond_resched() as each bit
    from the exponent is processed. It's still noninterruptible, but at
    least it's preemptible now.

    Do the cond_resched() once per bit rather than once per MPI limb because
    each limb might still easily take 100+ milliseconds on slow CPUs.
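
    A toy square-and-multiply loop, using machine-word operands rather than MPI
    limbs, to illustrate where the per-bit cond_resched() sits:

        #include <linux/sched.h>

        static unsigned long modexp_sketch(unsigned long base, unsigned long exp,
                                           unsigned long mod)
        {
                unsigned long result = 1;

                base %= mod;
                while (exp) {
                        if (exp & 1)
                                result = (result * base) % mod; /* toy: may overflow for big mod */
                        base = (base * base) % mod;
                        exp >>= 1;
                        cond_resched();         /* once per exponent bit, as in the fix */
                }
                return result;
        }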

    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     

08 Nov, 2017

1 commit

  • syzkaller reported a NULL pointer dereference in asn1_ber_decoder(). It
    can be reproduced by the following command, assuming
    CONFIG_PKCS7_TEST_KEY=y:

    keyctl add pkcs7_test desc '' @s

    The bug is that if the data buffer is empty, an integer underflow occurs
    in the following check:

        if (unlikely(dp >= datalen - 1))
                goto data_overrun_error;

    This results in the NULL data pointer being dereferenced.

    Fix it by checking for 'datalen - dp < 2' instead.

    Also fix the similar check for 'dp >= datalen - n' later in the same
    function. That one possibly could result in a buffer overread.
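
    The corrected form of the check quoted above, per the description (the old
    form computed datalen - 1, which wraps around when datalen is 0):

        if (unlikely(datalen - dp < 2))
                goto data_overrun_error;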

    The NULL pointer dereference was reproducible using the "pkcs7_test" key
    type but not the "asymmetric" key type because the "asymmetric" key type
    checks for a 0-length payload before calling into the ASN.1 decoder but
    the "pkcs7_test" key type does not.

    The bug report was:

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    PGD 7b708067 P4D 7b708067 PUD 7b6ee067 PMD 0
    Oops: 0000 [#1] SMP
    Modules linked in:
    CPU: 0 PID: 522 Comm: syz-executor1 Not tainted 4.14.0-rc8 #7
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.3-20171021_125229-anatol 04/01/2014
    task: ffff9b6b3798c040 task.stack: ffff9b6b37970000
    RIP: 0010:asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233
    RSP: 0018:ffff9b6b37973c78 EFLAGS: 00010216
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000000021c
    RDX: ffffffff814a04ed RSI: ffffb1524066e000 RDI: ffffffff910759e0
    RBP: ffff9b6b37973d60 R08: 0000000000000001 R09: ffff9b6b3caa4180
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
    R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
    FS: 00007f10ed1f2700(0000) GS:ffff9b6b3ea00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 000000007b6f3000 CR4: 00000000000006f0
    Call Trace:
    pkcs7_parse_message+0xee/0x240 crypto/asymmetric_keys/pkcs7_parser.c:139
    verify_pkcs7_signature+0x33/0x180 certs/system_keyring.c:216
    pkcs7_preparse+0x41/0x70 crypto/asymmetric_keys/pkcs7_key_type.c:63
    key_create_or_update+0x180/0x530 security/keys/key.c:855
    SYSC_add_key security/keys/keyctl.c:122 [inline]
    SyS_add_key+0xbf/0x250 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x4585c9
    RSP: 002b:00007f10ed1f1bd8 EFLAGS: 00000216 ORIG_RAX: 00000000000000f8
    RAX: ffffffffffffffda RBX: 00007f10ed1f2700 RCX: 00000000004585c9
    RDX: 0000000020000000 RSI: 0000000020008ffb RDI: 0000000020008000
    RBP: 0000000000000000 R08: ffffffffffffffff R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000216 R12: 00007fff1b2260ae
    R13: 00007fff1b2260af R14: 00007f10ed1f2700 R15: 0000000000000000
    Code: dd ca ff 48 8b 45 88 48 83 e8 01 4c 39 f0 0f 86 a8 07 00 00 e8 53 dd ca ff 49 8d 46 01 48 89 85 58 ff ff ff 48 8b 85 60 ff ff ff 0f b6 0c 30 89 c8 88 8d 75 ff ff ff 83 e0 1f 89 8d 28 ff ff
    RIP: asn1_ber_decoder+0x17f/0xe60 lib/asn1_decoder.c:233 RSP: ffff9b6b37973c78
    CR2: 0000000000000000

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Reported-by: syzbot
    Cc: # v3.7+
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Signed-off-by: James Morris

    Eric Biggers
     

03 Nov, 2017

1 commit

  • Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

    Pull initial SPDX identifiers from Greg KH:
    "License cleanup: add SPDX license identifiers to some files

    Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the
    'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
    binding shorthand, which can be used instead of the full boiler plate
    text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart
    and Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset
    of the use cases:

    - file had no licensing information in it.

    - file was a */uapi/* one with no licensing information in it,

    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to
    license had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied
    to a file was done in a spreadsheet of side by side results from of
    the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files created by Philippe Ombredanne.
    Philippe prepared the base worksheet, and did an initial spot review
    of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537
    files assessed. Kate Stewart did a file by file comparison of the
    scanner results in the spreadsheet to determine which SPDX license
    identifier(s) to be applied to the file. She confirmed any
    determination that was not immediately clear with lawyers working with
    the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:

    - Files considered eligible had to be source code files.

    - Make and config files were included as candidates if they contained
    >5 lines of source

    - File already had some variant of a license header in it (even if <5
    lines).

    All documentation files were explicitly excluded.

    The following heuristics were used to determine which SPDX license
    identifiers to apply.

    - when both scanners couldn't find any license traces, file was
    considered to have no license information in it, and the top level
    COPYING file license applied.

    For non */uapi/* files that summary was:

    SPDX license identifier # files
    ---------------------------------------------------|-------
    GPL-2.0 11139

    and resulted in the first patch in this series.

    If that file was a */uapi/* path one, it was "GPL-2.0 WITH
    Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
    was:

    SPDX license identifier # files
    ---------------------------------------------------|-------
    GPL-2.0 WITH Linux-syscall-note 930

    and resulted in the second patch in this series.

    - if a file had some form of licensing information in it, and was one
    of the */uapi/* ones, it was denoted with the Linux-syscall-note if
    any GPL family license was found in the file or had no licensing in
    it (per prior point). Results summary:

    SPDX license identifier # files
    ---------------------------------------------------|------
    GPL-2.0 WITH Linux-syscall-note 270
    GPL-2.0+ WITH Linux-syscall-note 169
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
    LGPL-2.1+ WITH Linux-syscall-note 15
    GPL-1.0+ WITH Linux-syscall-note 14
    ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
    LGPL-2.0+ WITH Linux-syscall-note 4
    LGPL-2.1 WITH Linux-syscall-note 3
    ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
    ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1

    and that resulted in the third patch in this series.

    - when the two scanners agreed on the detected license(s), that
    became the concluded license(s).

    - when there was disagreement between the two scanners (one detected
    a license but the other didn't, or they both detected different
    licenses) a manual inspection of the file occurred.

    - In most cases a manual inspection of the information in the file
    resulted in a clear resolution of the license that should apply
    (and which scanner probably needed to revisit its heuristics).

    - When it was not immediately clear, the license identifier was
    confirmed with lawyers working with the Linux Foundation.

    - If there was any question as to the appropriate license identifier,
    the file was flagged for further research and to be revisited later
    in time.

    In total, over 70 hours of logged manual review was done on the
    spreadsheet to determine the SPDX license identifiers to apply to the
    source files by Kate, Philippe, Thomas and, in some cases,
    confirmation by lawyers working with the Linux Foundation.

    Kate also obtained a third independent scan of the 4.13 code base from
    FOSSology, and compared selected files where the other two scanners
    disagreed against that SPDX file, to see if there was new insights.
    The Windriver scanner is based on an older version of FOSSology in
    part, so they are related.

    Thomas did random spot checks in about 500 files from the spreadsheets
    for the uapi headers and agreed with SPDX license identifier in the
    files he inspected. For the non-uapi files Thomas did random spot
    checks in about 15000 files.

    In initial set of patches against 4.14-rc6, 3 files were found to have
    copy/paste license identifier errors, and have been fixed to reflect
    the correct identifier.

    Additionally Philippe spent 10 hours this week doing a detailed manual
    inspection and review of the 12,461 patched files from the initial
    patch version early this week with:

    - a full scancode scan run, collecting the matched texts, detected
    license ids and scores

    - reviewing anything where there was a license detected (about 500+
    files) to ensure that the applied SPDX license was correct

    - reviewing anything where there was no detection but the patch
    license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
    applied SPDX license was correct

    This produced a worksheet with 20 files needing minor correction. This
    worksheet was then exported into 3 different .csv files for the
    different types of files to be modified.

    These .csv files were then reviewed by Greg. Thomas wrote a script to
    parse the csv files and add the proper SPDX tag to the file, in the
    format that the file expected. This script was further refined by Greg
    based on the output to detect more types of files automatically and to
    distinguish between header and source .c files (which need different
    comment types.) Finally Greg ran the script using the .csv files to
    generate the patches.

    Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
    Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
    Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

    * tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    License cleanup: add SPDX license identifier to uapi header files with a license
    License cleanup: add SPDX license identifier to uapi header files with no license
    License cleanup: add SPDX GPL-2.0 license identifier to files with no license

    Linus Torvalds
     

02 Nov, 2017

2 commits

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be
    applied to a file was done in a spreadsheet of side-by-side results
    from the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files, created by Philippe Ombredanne.
    Philippe prepared the base worksheet and did an initial spot review of
    a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source.
    - File already had some variant of a license header in it (even if <5
    lines).
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • syzkaller with KASAN reported an out-of-bounds read in
    asn1_ber_decoder(). It can be reproduced by the following command,
    assuming CONFIG_X509_CERTIFICATE_PARSER=y and CONFIG_KASAN=y:

    keyctl add asymmetric desc $'\x30\x30' @s

    The bug is that the length of an ASN.1 data value isn't validated in the
    case where it is encoded using the short form, causing the decoder to
    read past the end of the input buffer. Fix it by validating the length.
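
    A hedged sketch of the kind of check involved (a simplified standalone
    function, not the actual lib/asn1_decoder.c code; the name and
    signature are made up for illustration): a short-form length octet
    must still be compared against the bytes actually remaining in the
    buffer before the content is consumed:

        #include <errno.h>
        #include <stddef.h>

        /* dp is the current offset into data[0..datalen). */
        static int decode_short_form_len(const unsigned char *data,
                                         size_t datalen, size_t dp,
                                         size_t *len)
        {
                unsigned char lb;

                if (dp >= datalen)
                        return -EBADMSG;   /* no length octet at all */
                lb = data[dp++];
                if (lb & 0x80)
                        return -EBADMSG;   /* long form, handled elsewhere */
                *len = lb;
                if (*len > datalen - dp)
                        return -EBADMSG;   /* value would run past the end */
                return 0;
        }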

    The bug report was:

    BUG: KASAN: slab-out-of-bounds in asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233
    Read of size 1 at addr ffff88003cccfa02 by task syz-executor0/6818

    CPU: 1 PID: 6818 Comm: syz-executor0 Not tainted 4.14.0-rc7-00008-g5f479447d983 #2
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0xb3/0x10b lib/dump_stack.c:52
    print_address_description+0x79/0x2a0 mm/kasan/report.c:252
    kasan_report_error mm/kasan/report.c:351 [inline]
    kasan_report+0x236/0x340 mm/kasan/report.c:409
    __asan_report_load1_noabort+0x14/0x20 mm/kasan/report.c:427
    asn1_ber_decoder+0x10cb/0x1730 lib/asn1_decoder.c:233
    x509_cert_parse+0x1db/0x650 crypto/asymmetric_keys/x509_cert_parser.c:89
    x509_key_preparse+0x64/0x7a0 crypto/asymmetric_keys/x509_public_key.c:174
    asymmetric_key_preparse+0xcb/0x1a0 crypto/asymmetric_keys/asymmetric_type.c:388
    key_create_or_update+0x347/0xb20 security/keys/key.c:855
    SYSC_add_key security/keys/keyctl.c:122 [inline]
    SyS_add_key+0x1cd/0x340 security/keys/keyctl.c:62
    entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x447c89
    RSP: 002b:00007fca7a5d3bd8 EFLAGS: 00000246 ORIG_RAX: 00000000000000f8
    RAX: ffffffffffffffda RBX: 00007fca7a5d46cc RCX: 0000000000447c89
    RDX: 0000000020006f4a RSI: 0000000020006000 RDI: 0000000020001ff5
    RBP: 0000000000000046 R08: fffffffffffffffd R09: 0000000000000000
    R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000000
    R13: 0000000000000000 R14: 00007fca7a5d49c0 R15: 00007fca7a5d4700

    Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
    Cc: stable@vger.kernel.org # v3.7+
    Signed-off-by: Eric Biggers
    Signed-off-by: David Howells
    Signed-off-by: James Morris

    Eric Biggers
     

31 Oct, 2017

1 commit

  • It turns out that some drivers seem to think it's ok to remap page
    ranges from within interrupts and even NMI's. That is definitely not
    the case, since the page table build-up is simply not interrupt-safe.

    This showed up in the zero-day robot that reported it for the ACPI APEI
    GHES ("Generic Hardware Error Source") driver. Normally it had been
    hidden by the fact that no page table operations had been needed because
    the vmalloc area had been set up by other things.

    Apparently due to a recent change to the GHES driver, commit
    77b246b32b2c ("acpi: apei: check for pending errors when probing GHES
    entries"), 0day actually caught a case during bootup when the ioremap
    called down to page allocation. But that recent change only showed the
    symptom; it wasn't the root cause of the problem.

    Hopefully it is limited to just that one driver.

    If you need to access random physical memory, you either need to ioremap
    in process context, or you need to use the FIXMAP facility to set one
    particular fixmap entry to the required mapping - that can be done safely.
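
    A minimal sketch of the fixmap option (assuming a dedicated slot, here
    called FIX_EXAMPLE_SLOT, has been reserved in the architecture's
    fixmap enum; the helper names are made up, while set_fixmap_nocache(),
    fix_to_virt() and clear_fixmap() are existing fixmap helpers on
    architectures that provide them):

        #include <linux/io.h>
        #include <asm/fixmap.h>

        /* Map one physical page through a pre-reserved fixmap slot; no
         * page tables are built at call time, so this is usable from
         * atomic context, unlike ioremap(). */
        static void __iomem *example_map_atomic(phys_addr_t paddr)
        {
                set_fixmap_nocache(FIX_EXAMPLE_SLOT, paddr & PAGE_MASK);
                return (void __iomem *)(fix_to_virt(FIX_EXAMPLE_SLOT) +
                                        (paddr & ~PAGE_MASK));
        }

        static void example_unmap_atomic(void)
        {
                clear_fixmap(FIX_EXAMPLE_SLOT);
        }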

    Cc: Borislav Petkov
    Cc: Len Brown
    Cc: Tony Luck
    Cc: Fengguang Wu
    Cc: Tyler Baicar
    Cc: Will Deacon
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

29 Oct, 2017

1 commit

  • This fixes CVE-2017-12193.

    Fix a case in the assoc_array implementation in which a new leaf is
    added that needs to go into a node that happens to be full, where the
    existing leaves in that node cluster together at that level to the
    exclusion of the new leaf.

    What needs to happen is that the existing leaves get moved out to a new
    node, N1, at level + 1 and the existing node needs replacing with one,
    N0, that has pointers to the new leaf and to N1.

    The code that tries to do this gets this wrong in two ways:

    (1) The pointer that should've pointed from N0 to N1 is set to point
    recursively to N0 instead.

    (2) The backpointer from N0 needs to be set correctly in the case that
    N0 is either the root node or is reached through a shortcut.

    Fix this by removing this path and using the split_node path instead,
    which achieves the same end, but in a more general way (thanks to Eric
    Biggers for spotting the redundancy).
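
    A toy illustration of the intended wiring (a made-up structure, not
    the kernel's struct assoc_array_node): after the split, N0 must point
    at the new leaf and at N1, and N0's own backpointer must be whatever
    previously pointed at the old node (NULL when it was the root):

        struct toy_node {
                struct toy_node *back;  /* backpointer to the parent level */
                void *slots[16];        /* leaves or child nodes */
        };

        static void wire_replacement(struct toy_node *n0, struct toy_node *n1,
                                     void *new_leaf, struct toy_node *parent)
        {
                n0->slots[0] = new_leaf;
                n0->slots[1] = n1;      /* bug (1) pointed this back at n0 */
                n1->back = n0;
                n0->back = parent;      /* bug (2): missed when n0 is the root
                                         * or sits behind a shortcut */
        }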

    The problem manifests itself as:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
    IP: assoc_array_apply_edit+0x59/0xe5

    Fixes: 3cb989501c26 ("Add a generic associative array implementation.")
    Reported-and-tested-by: WU Fan
    Signed-off-by: David Howells
    Cc: stable@vger.kernel.org [v3.13-rc1+]
    Signed-off-by: Linus Torvalds

    David Howells