09 Jan, 2020

40 commits

  • commit cc976614f59bd8e45de8ce988a6bcb5de711d994 upstream.

    Prior to commit 858805b336be ("kbuild: add $(BASH) to run scripts with
    bash-extension"), this shell script was almost always run by bash since
    bash is usually installed on the system by default.

    Now, this script is run by sh, which might be a symlink to dash. On such
    distributions, the following code emits an error:

    local dev=`LC_ALL=C ls -l "${location}"`

    You can reproduce the build error, for example by setting
    CONFIG_INITRAMFS_SOURCE="/dev".

    GEN usr/initramfs_data.cpio.gz
    ./usr/gen_initramfs_list.sh: 131: local: 1: bad variable name
    make[1]: *** [usr/Makefile:61: usr/initramfs_data.cpio.gz] Error 2

    This is because the output of `LC_ALL=C ls -l "${location}"` contains
    spaces, which dash splits into separate arguments to 'local'.
    Surrounding the command substitution with double quotes fixes the error.

    Fixes: 858805b336be ("kbuild: add $(BASH) to run scripts with bash-extension")
    Reported-by: Jory A. Pratt
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Greg Kroah-Hartman

    Masahiro Yamada
     
  • commit 24461d9792c2c706092805ff1b067628933441bd upstream.

    vchan_vdesc_fini() is freeing up 'vd' so the access to vd->tx_result is
    via already freed up memory.

    Move the vchan_vdesc_fini() after invoking the callback to avoid this.
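
    The ordering issue can be illustrated outside the kernel. The following
    is a minimal userspace sketch, not the virt-dma code: 'struct desc',
    desc_fini(), complete_desc() and run_example() are hypothetical
    stand-ins for the descriptor, vchan_vdesc_fini() and the completion path.

    ```c
    #include <stdlib.h>

    /* Hypothetical stand-in for a DMA descriptor that carries a result. */
    struct desc {
        int tx_result;
    };

    typedef void (*done_cb)(const int *result, void *ctx);

    /* Stand-in for vchan_vdesc_fini(): releases the descriptor. */
    static void desc_fini(struct desc *d)
    {
        free(d);
    }

    /* Invoke the callback while the descriptor is still valid, then free it. */
    static void complete_desc(struct desc *d, done_cb cb, void *ctx)
    {
        if (cb)
            cb(&d->tx_result, ctx); /* d is still alive here */
        desc_fini(d);               /* free only after the last access */
    }

    /* Example callback: copies the result out for the caller. */
    static void record_result(const int *result, void *ctx)
    {
        *(int *)ctx = *result;
    }

    /* Allocates a descriptor, completes it, and returns what the callback saw. */
    static int run_example(int value)
    {
        struct desc *d = malloc(sizeof(*d));
        int seen = -1;

        if (!d)
            return -1;
        d->tx_result = value;
        complete_desc(d, record_result, &seen);
        return seen;
    }
    ```

    Swapping the two statements in complete_desc() would hand the callback a
    pointer into freed memory, which is the pattern the commit removes.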

    Fixes: 09d5b702b0f97 ("dmaengine: virt-dma: store result on dma descriptor")
    Signed-off-by: Peter Ujfalusi
    Reviewed-by: Alexandru Ardelean
    Link: https://lore.kernel.org/r/20191220131100.21804-1-peter.ujfalusi@ti.com
    Signed-off-by: Vinod Koul
    Signed-off-by: Greg Kroah-Hartman

    Peter Ujfalusi
     
  • commit 8c62ed27a12c00e3db1c9f04bc0f272bdbb06734 upstream.

    aa_xattrs_match() is unfortunately calling vfs_getxattr_alloc() from a
    context protected by an rcu_read_lock. This cannot be done as
    vfs_getxattr_alloc() may sleep regardless of the gfp_t value being
    passed to it.

    Fix this by breaking the rcu_read_lock on the policy search when the
    xattr match feature is requested and restarting the search if a policy
    change occurs.

    Fixes: 8e51f9087f40 ("apparmor: Add support for attaching profiles via xattr, presence and value")
    Reported-by: Jia-Ju Bai
    Reported-by: Al Viro
    Signed-off-by: John Johansen
    Signed-off-by: Greg Kroah-Hartman

    John Johansen
     
  • commit a7c46c0c0e3d62f2764cd08b90934cd2aaaf8545 upstream.

    In the implementation of __gup_benchmark_ioctl() the allocated pages
    should be released before returning in case of an invalid cmd. Release
    pages via kvfree().
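
    As a rough illustration of the pattern (a userspace sketch, not the
    gup_benchmark code), the function below funnels every path, including
    the invalid-command one, through a single release. run_cmd(), CMD_A and
    CMD_B are hypothetical names, and free() stands in for kvfree().

    ```c
    #include <errno.h>
    #include <stdlib.h>

    enum { CMD_A = 1, CMD_B = 2 }; /* hypothetical command numbers */

    /* Allocates a page array, dispatches on cmd, and always frees the array. */
    static int run_cmd(unsigned int cmd, size_t nr_pages)
    {
        int ret = 0;
        void **pages = calloc(nr_pages, sizeof(*pages));

        if (!pages)
            return -ENOMEM;

        switch (cmd) {
        case CMD_A:
        case CMD_B:
            /* the real work would go here */
            break;
        default:
            ret = -EINVAL; /* invalid cmd: still falls through to free() */
            break;
        }

        free(pages);
        return ret;
    }
    ```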

    [akpm@linux-foundation.org: rework code flow, return -EINVAL rather than -1]
    Link: http://lkml.kernel.org/r/20191211174653.4102-1-navid.emamdoost@gmail.com
    Fixes: 714a3a1ebafe ("mm/gup_benchmark.c: add additional pinning methods")
    Signed-off-by: Navid Emamdoost
    Reviewed-by: Andrew Morton
    Reviewed-by: Ira Weiny
    Reviewed-by: John Hubbard
    Cc: Keith Busch
    Cc: Kirill A. Shutemov
    Cc: Dave Hansen
    Cc: Dan Williams
    Cc: David Hildenbrand
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Navid Emamdoost
     
  • commit 0b8c0ec7eedcd8f9f1a1f238d87f9b512b09e71a upstream.

    syzbot reports:

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 0 PID: 9217 Comm: io_uring-sq Not tainted 5.4.0-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
    Google 01/01/2011
    RIP: 0010:creds_are_invalid kernel/cred.c:792 [inline]
    RIP: 0010:__validate_creds include/linux/cred.h:187 [inline]
    RIP: 0010:override_creds+0x9f/0x170 kernel/cred.c:550
    Code: ac 25 00 81 fb 64 65 73 43 0f 85 a3 37 00 00 e8 17 ab 25 00 49 8d 7c
    24 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 b6 04 02 84
    c0 74 08 3c 03 0f 8e 96 00 00 00 41 8b 5c 24 10 bf
    RSP: 0018:ffff88809c45fda0 EFLAGS: 00010202
    RAX: dffffc0000000000 RBX: 0000000043736564 RCX: ffffffff814f3318
    RDX: 0000000000000002 RSI: ffffffff814f3329 RDI: 0000000000000010
    RBP: ffff88809c45fdb8 R08: ffff8880a3aac240 R09: ffffed1014755849
    R10: ffffed1014755848 R11: ffff8880a3aac247 R12: 0000000000000000
    R13: ffff888098ab1600 R14: 0000000000000000 R15: 0000000000000000
    FS: 0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007ffd51c40664 CR3: 0000000092641000 CR4: 00000000001406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    io_sq_thread+0x1c7/0xa20 fs/io_uring.c:3274
    kthread+0x361/0x430 kernel/kthread.c:255
    ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
    Modules linked in:
    ---[ end trace f2e1a4307fbe2245 ]---
    RIP: 0010:creds_are_invalid kernel/cred.c:792 [inline]
    RIP: 0010:__validate_creds include/linux/cred.h:187 [inline]
    RIP: 0010:override_creds+0x9f/0x170 kernel/cred.c:550
    Code: ac 25 00 81 fb 64 65 73 43 0f 85 a3 37 00 00 e8 17 ab 25 00 49 8d 7c
    24 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 b6 04 02 84
    c0 74 08 3c 03 0f 8e 96 00 00 00 41 8b 5c 24 10 bf
    RSP: 0018:ffff88809c45fda0 EFLAGS: 00010202
    RAX: dffffc0000000000 RBX: 0000000043736564 RCX: ffffffff814f3318
    RDX: 0000000000000002 RSI: ffffffff814f3329 RDI: 0000000000000010
    RBP: ffff88809c45fdb8 R08: ffff8880a3aac240 R09: ffffed1014755849
    R10: ffffed1014755848 R11: ffff8880a3aac247 R12: 0000000000000000
    R13: ffff888098ab1600 R14: 0000000000000000 R15: 0000000000000000
    FS: 0000000000000000(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007ffd51c40664 CR3: 0000000092641000 CR4: 00000000001406f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

    which is caused by slab fault injection triggering a failure in
    prepare_creds(). We don't actually need to create a copy of the creds
    as we're not modifying it, we just need a reference on the current task
    creds. This avoids the failure case as well, and propagates the const
    throughout the stack.

    Fixes: 181e448d8709 ("io_uring: async workers should inherit the user creds")
    Reported-by: syzbot+5320383e16029ba057ff@syzkaller.appspotmail.com
    Signed-off-by: Jens Axboe
    [ only use the io_uring.c portion of the patch - gregkh]
    Signed-off-by: Greg Kroah-Hartman

    Jens Axboe
     
  • commit 01f36a554e3ef32f9fc4b81a4437cf08fd0e4742 upstream.

    trace_printk schedules work via irq_work_queue(), but doesn't
    wait until it has been processed. The kprobe_module.tc testcase does:

    :;: "Load module again, which means the event1 should be recorded";:
    modprobe trace-printk
    grep "event1:" trace

    so the grep which checks the trace file might run before the irq work
    was processed. Fix this by adding an irq_work_sync().

    Link: http://lore.kernel.org/linux-trace-devel/20191218074427.96184-3-svens@linux.ibm.com

    Cc: stable@vger.kernel.org
    Fixes: af2a0750f3749 ("selftests/ftrace: Improve kprobe on module testcase to load/unload module")
    Signed-off-by: Sven Schnelle
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Greg Kroah-Hartman

    Sven Schnelle
     
  • commit fe6e096a5bbf73a142f09c72e7aa2835026eb1a3 upstream.

    At least on PA-RISC and s390, synthetic histogram triggers are failing
    selftests because trace_event_raw_event_synth() always writes 64-bit
    values, but the reader expects a field->size sized value. On little endian
    machines this doesn't hurt, but on big endian this makes the reader always
    read zero values.
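
    The effect can be reproduced on any host by laying the bytes out in
    big-endian order by hand. This is only a sketch: store_be(), load_be(),
    mismatched_read() and matched_read() are hypothetical helpers, and
    making the writer use the field size is one way to make both sides
    agree, not necessarily how the kernel patch does it.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Store val into buf as `size` bytes, most significant byte first. */
    static void store_be(uint8_t *buf, uint64_t val, size_t size)
    {
        for (size_t i = 0; i < size; i++)
            buf[i] = (uint8_t)(val >> (8 * (size - 1 - i)));
    }

    /* Read `size` bytes from buf, most significant byte first. */
    static uint64_t load_be(const uint8_t *buf, size_t size)
    {
        uint64_t val = 0;

        for (size_t i = 0; i < size; i++)
            val = (val << 8) | buf[i];
        return val;
    }

    /* Writer always emits 8 bytes; reader consumes only field_size bytes. */
    static uint64_t mismatched_read(uint64_t val, size_t field_size)
    {
        uint8_t buf[8];

        store_be(buf, val, sizeof(buf));
        return load_be(buf, field_size);
    }

    /* Writer and reader both use field_size bytes. */
    static uint64_t matched_read(uint64_t val, size_t field_size)
    {
        uint8_t buf[8] = { 0 };

        store_be(buf, val, field_size);
        return load_be(buf, field_size);
    }
    ```

    With a 4-byte field and the value 42, the mismatched reader sees only
    the four leading zero bytes of the 64-bit value.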

    Link: http://lore.kernel.org/linux-trace-devel/20191218074427.96184-4-svens@linux.ibm.com

    Cc: stable@vger.kernel.org
    Fixes: 4b147936fa509 ("tracing: Add support for 'synthetic' events")
    Acked-by: Tom Zanussi
    Signed-off-by: Sven Schnelle
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Greg Kroah-Hartman

    Sven Schnelle
     
  • commit 106f41f5a302cb1f36c7543fae6a05de12e96fa4 upstream.

    The compare functions of the histogram code would be specific for the size
    of the value being compared (byte, short, int, long long). It would
    reference the value from the array via the type of the compare, but the
    value was stored in a 64 bit number. This is fine for little endian
    machines, but for big endian machines, it would end up comparing zeros or
    all ones (depending on the sign) for anything but 64 bit numbers.

    To fix this, first dereference the value as a u64, then convert it to the
    type being compared.
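
    A sketch of that approach for a 32-bit signed comparison follows. It is
    not the tracing_map code: load_slot(), cmp_s32_slots() and
    example_compare() are hypothetical names, and the narrowing cast assumes
    the usual two's-complement wrap that GCC and Clang provide.

    ```c
    #include <stdint.h>
    #include <string.h>

    /* Load the full 64-bit slot first; this does not depend on endianness. */
    static uint64_t load_slot(const void *slot)
    {
        uint64_t v;

        memcpy(&v, slot, sizeof(v));
        return v;
    }

    /* Compare two 64-bit slots that hold 32-bit signed values. */
    static int cmp_s32_slots(const void *a, const void *b)
    {
        int32_t x = (int32_t)load_slot(a); /* narrow after the 64-bit load */
        int32_t y = (int32_t)load_slot(b);

        return (x > y) - (x < y);
    }

    /* Stores both values in 64-bit slots, then compares them. */
    static int example_compare(int32_t lhs, int32_t rhs)
    {
        uint64_t a = (uint64_t)(int64_t)lhs;
        uint64_t b = (uint64_t)(int64_t)rhs;

        return cmp_s32_slots(&a, &b);
    }
    ```

    Casting the slot pointer to 'int32_t *' instead would read the low half
    on little endian but the high half on big endian.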

    Link: http://lkml.kernel.org/r/20191211103557.7bed6928@gandalf.local.home

    Cc: stable@vger.kernel.org
    Fixes: 08d43a5fa063e ("tracing: Add lock-free tracing_map")
    Acked-by: Tom Zanussi
    Reported-by: Sven Schnelle
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (VMware)
     
  • commit 79e65c27f09683fbb50c33acab395d0ddf5302d2 upstream.

    When failing in the allocation of filter_item, process_system_preds()
    goes to fail_mem, where the allocated filter is freed.

    However, this leads to a memory leak of filter->filter_string and
    filter->prog, which are allocated before and in process_preds().
    This bug has been detected by kmemleak as well.

    Fix this by changing kfree to __free_filter.

    unreferenced object 0xffff8880658007c0 (size 32):
    comm "bash", pid 579, jiffies 4295096372 (age 17.752s)
    hex dump (first 32 bytes):
    63 6f 6d 6d 6f 6e 5f 70 69 64 20 20 3e 20 31 30 common_pid > 10
    00 00 00 00 00 00 00 00 65 73 00 00 00 00 00 00 ........es......
    backtrace:
    [] kstrdup+0x2d/0x60
    [] apply_subsystem_event_filter+0x378/0x932
    [] subsystem_filter_write+0x5a/0x90
    [] vfs_write+0xe1/0x240
    [] ksys_write+0xb4/0x150
    [] do_syscall_64+0x6d/0x1e0
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    unreferenced object 0xffff888060c22d00 (size 64):
    comm "bash", pid 579, jiffies 4295096372 (age 17.752s)
    hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 e8 d7 41 80 88 ff ff ...........A....
    01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] process_preds+0x243/0x1820
    [] apply_subsystem_event_filter+0x3be/0x932
    [] subsystem_filter_write+0x5a/0x90
    [] vfs_write+0xe1/0x240
    [] ksys_write+0xb4/0x150
    [] do_syscall_64+0x6d/0x1e0
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    unreferenced object 0xffff888041d7e800 (size 512):
    comm "bash", pid 579, jiffies 4295096372 (age 17.752s)
    hex dump (first 32 bytes):
    70 bc 85 97 ff ff ff ff 0a 00 00 00 00 00 00 00 p...............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    backtrace:
    [] process_preds+0x71a/0x1820
    [] apply_subsystem_event_filter+0x3be/0x932
    [] subsystem_filter_write+0x5a/0x90
    [] vfs_write+0xe1/0x240
    [] ksys_write+0xb4/0x150
    [] do_syscall_64+0x6d/0x1e0
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9
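
    The difference between freeing only the container and freeing what it
    owns can be sketched in userspace as follows. 'struct filter' here is a
    simplified stand-in with just the two members named above, and
    counted_alloc(), counted_free(), free_filter() and make_filter() are
    hypothetical helpers.

    ```c
    #include <stdlib.h>
    #include <string.h>

    /* Simplified stand-in: a container that owns two allocations. */
    struct filter {
        char *filter_string;
        int *prog;
    };

    static int live_allocs; /* number of allocations not yet freed */

    static void *counted_alloc(size_t n)
    {
        void *p = malloc(n);

        if (p)
            live_allocs++;
        return p;
    }

    static void counted_free(void *p)
    {
        if (p) {
            live_allocs--;
            free(p);
        }
    }

    /* Frees the owned members first, then the container itself. */
    static void free_filter(struct filter *f)
    {
        if (!f)
            return;
        counted_free(f->prog);
        counted_free(f->filter_string);
        counted_free(f);
    }

    /* Builds a filter with both members allocated, or returns NULL. */
    static struct filter *make_filter(const char *text)
    {
        struct filter *f = counted_alloc(sizeof(*f));

        if (!f)
            return NULL;
        f->filter_string = counted_alloc(strlen(text) + 1);
        f->prog = counted_alloc(4 * sizeof(*f->prog));
        if (!f->filter_string || !f->prog) {
            free_filter(f);
            return NULL;
        }
        strcpy(f->filter_string, text);
        return f;
    }
    ```

    Calling counted_free(f) alone on the error path would leave two
    allocations outstanding, which matches the shape of the kmemleak report.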

    Link: http://lkml.kernel.org/r/20191211091258.11310-1-keitasuzuki.park@sslab.ics.keio.ac.jp

    Cc: Ingo Molnar
    Cc: stable@vger.kernel.org
    Fixes: 404a3add43c9c ("tracing: Only add filter list when needed")
    Signed-off-by: Keita Suzuki
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Greg Kroah-Hartman

    Keita Suzuki
     
  • commit 3a53acf1d9bea11b57c1f6205e3fe73f9d8a3688 upstream.

    Task T2                                 Task T3
    trace_options_core_write()              subsystem_open()

      mutex_lock(trace_types_lock)            mutex_lock(event_mutex)

      set_tracer_flag()

        trace_event_enable_tgid_record()      mutex_lock(trace_types_lock)

          mutex_lock(event_mutex)

    This gives a circular dependency deadlock between trace_types_lock and
    event_mutex. To fix this, invert the usage of trace_types_lock and
    event_mutex in trace_options_core_write(). This keeps the sequence of
    lock usage consistent.
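
    A consistent order can be sketched with POSIX mutexes. This is a
    userspace illustration only: the two mutexes borrow the kernel lock
    names, while options_write_path(), subsystem_open_path() and the
    counters are hypothetical.

    ```c
    #include <pthread.h>

    static pthread_mutex_t event_mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t trace_types_lock = PTHREAD_MUTEX_INITIALIZER;
    static int options_written;
    static int subsystems_opened;

    /* Takes event_mutex first, then trace_types_lock. */
    static void options_write_path(void)
    {
        pthread_mutex_lock(&event_mutex);
        pthread_mutex_lock(&trace_types_lock);
        options_written++;
        pthread_mutex_unlock(&trace_types_lock);
        pthread_mutex_unlock(&event_mutex);
    }

    /* Uses the same order, so neither path can hold one lock and wait on the other. */
    static void subsystem_open_path(void)
    {
        pthread_mutex_lock(&event_mutex);
        pthread_mutex_lock(&trace_types_lock);
        subsystems_opened++;
        pthread_mutex_unlock(&trace_types_lock);
        pthread_mutex_unlock(&event_mutex);
    }
    ```

    Because both paths acquire the locks in the same order, the circular
    wait shown in the T2/T3 diagram cannot form.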

    Link: http://lkml.kernel.org/r/0101016eef175e38-8ca71caf-a4eb-480d-a1e6-6f0bbc015495-000000@us-west-2.amazonses.com

    Cc: stable@vger.kernel.org
    Fixes: d914ba37d7145 ("tracing: Add support for recording tgid of tasks")
    Signed-off-by: Prateek Sood
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Greg Kroah-Hartman

    Prateek Sood
     
  • commit 8df34c56321479bfa1ec732c675b686c2b4df412 upstream.

    glibc 2.30 introduces gettid() in public headers, which clashes with
    the internal static definition within rseq selftests.

    Rename gettid() to rseq_gettid() to eliminate this symbol name clash.
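
    A prefixed wrapper of this kind could look as follows on Linux. This is
    a sketch assuming the raw syscall interface; the selftest's actual body
    may differ.

    ```c
    #include <sys/syscall.h>
    #include <sys/types.h>
    #include <unistd.h>

    /* Prefixed name, so it cannot clash with a gettid() declared by libc. */
    static pid_t rseq_gettid(void)
    {
        return (pid_t)syscall(SYS_gettid);
    }
    ```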

    Reported-by: Tommi T. Rantala
    Signed-off-by: Mathieu Desnoyers
    Cc: Shuah Khan
    Cc: Tommi T. Rantala
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra (Intel)
    Cc: "Paul E. McKenney"
    Cc: Boqun Feng
    Cc: "H . Peter Anvin"
    Cc: Paul Turner
    Cc: Dmitry Vyukov
    Cc: # v4.18+
    Signed-off-by: Shuah Khan
    Signed-off-by: Greg Kroah-Hartman

    Mathieu Desnoyers
     
  • commit 1d8f65798240b6577d8c44d20c8ea8f1d429e495 upstream.

    The condition should use a logical NOT so that the hook address is
    assigned to the parent address, because function_graph_enter() returns 0
    upon success.
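
    The corrected condition can be sketched with a stub. graph_enter_stub()
    and hook_parent() are hypothetical; the stub only mimics the "0 means
    success" convention described above.

    ```c
    /* Hypothetical stub: returns 0 on success, non-zero on failure. */
    static int graph_enter_stub(int fail)
    {
        return fail ? -1 : 0;
    }

    /* Redirects *parent to the hook only when entering the graph succeeded. */
    static unsigned long hook_parent(unsigned long *parent, unsigned long hook,
                                     int fail)
    {
        if (!graph_enter_stub(fail)) /* logical NOT: 0 is the success value */
            *parent = hook;
        return *parent;
    }
    ```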

    Fixes: e949b6db51dc (riscv/function_graph: Simplify with function_graph_enter())
    Signed-off-by: Zong Li
    Reviewed-by: Steven Rostedt (VMware)
    Cc: stable@vger.kernel.org
    Signed-off-by: Paul Walmsley
    Signed-off-by: Greg Kroah-Hartman

    Zong Li
     
  • commit 9d05c18e8d7de566ff68f221fcae65e78708dd1d upstream.

    When enabling ftrace graph tracer, it gets the tracing clock in
    ftrace_push_return_trace(). Eventually, it invokes riscv_sched_clock()
    to get the clock value. If riscv_sched_clock() isn't marked with
    'notrace', it will call ftrace_push_return_trace() and cause an infinite
    loop.

    The resulting failure is as follows:

    command: echo function_graph >current_tracer
    [ 46.176787] Unable to handle kernel paging request at virtual address ffffffe04fb38c48
    [ 46.177309] Oops [#1]
    [ 46.177478] Modules linked in:
    [ 46.177770] CPU: 0 PID: 256 Comm: $d Not tainted 5.5.0-rc1 #47
    [ 46.177981] epc: ffffffe00035e59a ra : ffffffe00035e57e sp : ffffffe03a7569b0
    [ 46.178216] gp : ffffffe000d29b90 tp : ffffffe03a756180 t0 : ffffffe03a756968
    [ 46.178430] t1 : ffffffe00087f408 t2 : ffffffe03a7569a0 s0 : ffffffe03a7569f0
    [ 46.178643] s1 : ffffffe00087f408 a0 : 0000000ac054cda4 a1 : 000000000087f411
    [ 46.178856] a2 : 0000000ac054cda4 a3 : 0000000000373ca0 a4 : ffffffe04fb38c48
    [ 46.179099] a5 : 00000000153e22a8 a6 : 00000000005522ff a7 : 0000000000000005
    [ 46.179338] s2 : ffffffe03a756a90 s3 : ffffffe00032811c s4 : ffffffe03a756a58
    [ 46.179570] s5 : ffffffe000d29fe0 s6 : 0000000000000001 s7 : 0000000000000003
    [ 46.179809] s8 : 0000000000000003 s9 : 0000000000000002 s10: 0000000000000004
    [ 46.180053] s11: 0000000000000000 t3 : 0000003fc815749c t4 : 00000000000efc90
    [ 46.180293] t5 : ffffffe000d29658 t6 : 0000000000040000
    [ 46.180482] status: 0000000000000100 badaddr: ffffffe04fb38c48 cause: 000000000000000f

    Signed-off-by: Zong Li
    Reviewed-by: Steven Rostedt (VMware)
    [paul.walmsley@sifive.com: cleaned up patch description]
    Fixes: 92e0d143fdef ("clocksource/drivers/riscv_timer: Provide the sched_clock")
    Cc: stable@vger.kernel.org
    Signed-off-by: Paul Walmsley
    Signed-off-by: Greg Kroah-Hartman

    Zong Li
     
  • commit 256efaea1fdc4e38970489197409a26125ee0aaa upstream.

    gpiolib has a corner case with open drain outputs that are emulated.
    When such outputs are outputting a logic 1, emulation will set the
    hardware to input mode, which will cause gpiod_get_direction() to
    report that it is in input mode. This is different from the behaviour
    with a true open-drain output.

    Unify the semantics here.

    Cc:
    Suggested-by: Linus Walleij
    Signed-off-by: Russell King
    Signed-off-by: Bartosz Golaszewski
    Signed-off-by: Greg Kroah-Hartman

    Russell King
     
  • commit 634f0348fe336fce8f6cab1933139115e983ed2f upstream.

    Commit cad6fade6e78 ("xtensa: clean up WSR*/RSR*/get_sr/set_sr") removed
    {RSR,WSR}_CPENABLE from xtensa code, but did not fix up all users,
    breaking gpio-xtensa driver build. Update gpio-xtensa to use
    new xtensa_{get,set}_sr API.

    Cc: stable@vger.kernel.org # v5.0+
    Fixes: cad6fade6e78 ("xtensa: clean up WSR*/RSR*/get_sr/set_sr")
    Signed-off-by: Max Filippov
    Signed-off-by: Bartosz Golaszewski
    Signed-off-by: Greg Kroah-Hartman

    Max Filippov
     
  • commit 8385d756e114f2df8568e508902d5f9850817ffb upstream.

    ata_qc_complete_multiple() is called with a mask of the still active
    tags.

    mv_sata doesn't have this information directly and instead calculates
    the still active tags from the started tags (ap->qc_active) and the
    finished tags as (ap->qc_active ^ done_mask).

    Since 28361c40368 the hw_tag and tag are no longer the same and the
    equation is no longer valid. In ata_exec_internal_sg() ap->qc_active is
    initialized as 1ULL << ATA_TAG_INTERNAL, but in hardware tag 0 is
    started and this will be in done_mask on completion. ap->qc_active ^
    done_mask becomes 0x100000000 ^ 0x1 = 0x100000001 and thus tag 0 used as
    the internal tag will never be reported as completed.

    This is fixed by introducing ata_qc_get_active() which returns the
    active hardware tags and calling it where appropriate.
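
    The arithmetic above can be checked with a small sketch. TAG_INTERNAL is
    set to 32 to match the 0x100000000 value in the text, and
    remaining_naive() and active_hw_tags() are hypothetical helpers that
    assume the internal tag is issued to hardware as tag 0, as described.

    ```c
    #include <stdint.h>

    #define TAG_INTERNAL 32 /* 1ULL << 32 == 0x100000000, as in the text */

    /* The old calculation: software tags XOR hardware done mask. */
    static uint64_t remaining_naive(uint64_t qc_active, uint64_t done_mask)
    {
        return qc_active ^ done_mask;
    }

    /* Translates the software view into hardware tags before comparing. */
    static uint64_t active_hw_tags(uint64_t qc_active)
    {
        if (qc_active & (1ULL << TAG_INTERNAL)) {
            qc_active |= 1ULL;                    /* internal cmd runs as hw tag 0 */
            qc_active &= ~(1ULL << TAG_INTERNAL); /* drop the software-only bit */
        }
        return qc_active;
    }
    ```

    With the translated mask, XOR-ing against a done_mask of 0x1 yields 0,
    so the internal command is seen as completed.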

    This is tested on mv_sata, but sata_fsl and sata_nv suffer from the same
    problem. There is another case in sata_nv that most likely needs fixing
    as well, but this looks a little different, so I wasn't confident enough
    to change that.

    Fixes: 28361c403683 ("libata: add extra internal command")
    Cc: stable@vger.kernel.org
    Tested-by: Pali Rohár
    Signed-off-by: Sascha Hauer
    Signed-off-by: Greg Kroah-Hartman

    Add missing export of ata_qc_get_active(), as per Pali.

    Signed-off-by: Jens Axboe

    Sascha Hauer
     
  • commit 1a3d78cb6e20779a19388315bd8efefbd8d4a656 upstream.

    Set AHCI_HFLAG_DELAY_ENGINE for the BCM7425 AHCI controller so that it
    conforms to the 'strict' AHCI implementation which this controller
    is based on.

    This solves long link establishment with specific hard drives (e.g.:
    Seagate ST1000VM002-9ZL1 SC12) that would otherwise have to complete the
    error recovery handling before finally establishing a successful SATA
    link at the desired speed.

    We re-order the hpriv->flags assignment to also remove the NONCQ quirk
    since we can set the flag directly.

    Fixes: 9586114cf1e9 ("ata: ahci_brcmstb: add support MIPS-based platforms")
    Fixes: 423be77daabe ("ata: ahci_brcmstb: add quirk for broken ncq")
    Cc: stable@vger.kernel.org
    Reviewed-by: Hans de Goede
    Signed-off-by: Florian Fainelli
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Florian Fainelli
     
  • commit bf0e5013bc2dcac205417e1252205dca39dfc005 upstream.

    The downstream implementation of ahci_brcm.c did contain clock
    management recovery, but until recently, did that outside of the
    libahci_platform helpers and this was unintentionally stripped out while
    forward porting the patch upstream.

    Add the missing clock management during recovery and sleep for 10
    milliseconds per the design team recommendations to ensure the SATA PHY
    controller and AFE have been fully quiesced.

    Fixes: eb73390ae241 ("ata: ahci_brcm: Recover from failures to identify devices")
    Cc: stable@vger.kernel.org
    Reviewed-by: Hans de Goede
    Signed-off-by: Florian Fainelli
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Florian Fainelli
     
  • commit c0cdf2ac4b5bf3e5ef2451ea29fb4104278cdabc upstream.

    The AHCI resources management within ahci_brcm.c is a little
    convoluted, largely because it historically had a dedicated clock that
    was managed within this file in the downstream tree. Once brought
    upstream though, the clock was left to be managed by libahci_platform.c
    which is entirely appropriate.

    This patch series ensures that the AHCI resources are fetched and
    enabled before any register access is done, thus avoiding bus errors on
    platforms which clock gate the controller by default.

    As a result we need to re-arrange the suspend() and resume() functions
    in order to avoid accessing registers after the clocks have been turned
    off or, respectively, before the clocks have been turned on. Finally, we can
    refactor brcm_ahci_get_portmask() in order to fetch the number of ports
    from hpriv->mmio which is now accessible without jumping through hoops
    like we used to do.

    The commit pointed to in the Fixes tag is both old and new enough not to
    require major headaches for backporting of this patch.

    Fixes: eba68f829794 ("ata: ahci_brcmstb: rename to support across Broadcom SoC's")
    Cc: stable@vger.kernel.org
    Reviewed-by: Hans de Goede
    Signed-off-by: Florian Fainelli
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Florian Fainelli
     
  • commit 84b032dbfdf1c139cd2b864e43959510646975f8 upstream.

    This reverts commit 6bb86fefa086faba7b60bb452300b76a47cde1a5
    ("libahci_platform: Staticize ahci_platform_<en/dis>able_phys()"). We are
    going to need ahci_platform_{enable,disable}_phys() in a subsequent
    commit for ahci_brcm.c in order to properly control the PHY
    initialization order.

    Also make sure the function prototypes are declared in
    include/linux/ahci_platform.h as a result.

    Cc: stable@vger.kernel.org
    Reviewed-by: Hans de Goede
    Signed-off-by: Florian Fainelli
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Florian Fainelli
     
  • commit f54c7898ed1c3c9331376c0337a5049c38f66497 upstream.

    Anatoly has been fuzzing with kBdysch harness and reported a hang in one
    of the outcomes. Upon closer analysis, it turns out that precise scalar
    value tracking is missing a few precision markings for unknown scalars:

    0: R1=ctx(id=0,off=0,imm=0) R10=fp0
    0: (b7) r0 = 0
    1: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
    1: (35) if r0 >= 0xf72e goto pc+0
    --> only follow fallthrough
    2: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
    2: (35) if r0 >= 0x80fe0000 goto pc+0
    --> only follow fallthrough
    3: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
    3: (14) w0 -= -536870912
    4: R0_w=invP536870912 R1=ctx(id=0,off=0,imm=0) R10=fp0
    4: (0f) r1 += r0
    5: R0_w=invP536870912 R1_w=inv(id=0) R10=fp0
    5: (55) if r1 != 0x104c1500 goto pc+0
    --> push other branch for later analysis
    R0_w=invP536870912 R1_w=inv273421568 R10=fp0
    6: R0_w=invP536870912 R1_w=inv273421568 R10=fp0
    6: (b7) r0 = 0
    7: R0=invP0 R1=inv273421568 R10=fp0
    7: (76) if w1 s>= 0xffffff00 goto pc+3
    --> only follow goto
    11: R0=invP0 R1=inv273421568 R10=fp0
    11: (95) exit
    6: R0_w=invP536870912 R1_w=inv(id=0) R10=fp0
    6: (b7) r0 = 0
    propagating r0
    7: safe
    processed 11 insns [...]

    In the analysis of the second path coming after the successful exit above,
    the path is being pruned at line 7. Pruning analysis found that both r0 are
    precise P0 and both R1 are non-precise scalars and given prior path with
    R1 as non-precise scalar succeeded, this one is therefore safe as well.

    However, the problem is that, given the condition at insn 7 in the first run,
    we only followed the goto and didn't push the other branch for later analysis.
    We never walked the few insns in there, and therefore dead-code sanitation
    rewrites them as goto pc-1, causing the hang depending on the skb address
    hitting these conditions. The issue is that R1 should have been marked as
    precise as well, such that pruning enforces a range check and concludes that
    the new R1 is not in range of the old R1. In insn 4, we mark R1 (skb) as
    unknown scalar via __mark_reg_unbounded() but not mark_reg_unbounded() and
    therefore regs->precise remains as false.

    Back in b5dc0163d8fd ("bpf: precise scalar_value tracking"), this was not
    the case since marking out of __mark_reg_unbounded() had this covered as well.
    Once both are set as precise in insn 4, as they should have been, we conclude
    that given R1 was 0x104c1500 in the prior fall-through path and now is
    completely unknown, the check at insn 7 requires us to continue walking.
    Analysis after the fix:

    0: R1=ctx(id=0,off=0,imm=0) R10=fp0
    0: (b7) r0 = 0
    1: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
    1: (35) if r0 >= 0xf72e goto pc+0
    2: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
    2: (35) if r0 >= 0x80fe0000 goto pc+0
    3: R0_w=invP0 R1=ctx(id=0,off=0,imm=0) R10=fp0
    3: (14) w0 -= -536870912
    4: R0_w=invP536870912 R1=ctx(id=0,off=0,imm=0) R10=fp0
    4: (0f) r1 += r0
    5: R0_w=invP536870912 R1_w=invP(id=0) R10=fp0
    5: (55) if r1 != 0x104c1500 goto pc+0
    R0_w=invP536870912 R1_w=invP273421568 R10=fp0
    6: R0_w=invP536870912 R1_w=invP273421568 R10=fp0
    6: (b7) r0 = 0
    7: R0=invP0 R1=invP273421568 R10=fp0
    7: (76) if w1 s>= 0xffffff00 goto pc+3
    11: R0=invP0 R1=invP273421568 R10=fp0
    11: (95) exit
    6: R0_w=invP536870912 R1_w=invP(id=0) R10=fp0
    6: (b7) r0 = 0
    7: R0_w=invP0 R1_w=invP(id=0) R10=fp0
    7: (76) if w1 s>= 0xffffff00 goto pc+3
    R0_w=invP0 R1_w=invP(id=0) R10=fp0
    8: R0_w=invP0 R1_w=invP(id=0) R10=fp0
    8: (a5) if r0 < 0x2007002a goto pc+0
    9: R0_w=invP0 R1_w=invP(id=0) R10=fp0
    9: (57) r0 &= -16316416
    10: R0_w=invP0 R1_w=invP(id=0) R10=fp0
    10: (a6) if w0 < 0x1201 goto pc+0
    11: R0_w=invP0 R1_w=invP(id=0) R10=fp0
    11: (95) exit
    11: R0=invP0 R1=invP(id=0) R10=fp0
    11: (95) exit
    processed 16 insns [...]

    Fixes: 6754172c208d ("bpf: fix precision tracking in presence of bpf2bpf calls")
    Reported-by: Anatoly Trosinenko
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov
    Link: https://lore.kernel.org/bpf/20191222223740.25297-1-daniel@iogearbox.net
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
     
  • commit 21d37340912d74b1222d43c11aa9dd0687162573 upstream.

    These were added to blkdev_ioctl() in v4.20 but not blkdev_compat_ioctl,
    so add them now.

    Cc: # v4.20+
    Fixes: 72cd87576d1d ("block: Introduce BLKGETZONESZ ioctl")
    Fixes: 65e4e3eee83d ("block: Introduce BLKGETNRZONES ioctl")
    Reviewed-by: Damien Le Moal
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit 673bdf8ce0a387ef585c13b69a2676096c6edfe9 upstream.

    These were added to blkdev_ioctl() but not blkdev_compat_ioctl,
    so add them now.

    Cc: # v4.10+
    Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls")
    Reviewed-by: Damien Le Moal
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit b2c0fcd28772f99236d261509bcd242135677965 upstream.

    These were added to blkdev_ioctl() in linux-5.5 but not
    blkdev_compat_ioctl, so add them now.

    Cc: # v4.4+
    Fixes: bbd3e064362e ("block: add an API for Persistent Reservations")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Greg Kroah-Hartman

    Fold in followup patch from Arnd with missing pr.h header include.

    Signed-off-by: Jens Axboe

    Arnd Bergmann
     
  • commit de7999afedff02c6631feab3ea726a0e8f8c3d40 upstream.

    When starting writeback for a range that covers part of a preallocated
    extent, due to a race with writeback for another range that also covers
    another part of the same preallocated extent, we can end up in an infinite
    loop.

    Consider the following example where for inode 280 we have two dirty
    ranges:

    range A, from 294912 to 303103, 8192 bytes
    range B, from 348160 to 438271, 90112 bytes

    and we have the following file extent item layout for our inode:

    leaf 38895616 gen 24544 total ptrs 29 free space 13820 owner 5
    (...)
    item 27 key (280 108 200704) itemoff 14598 itemsize 53
    extent data disk bytenr 0 nr 0 type 1 (regular)
    extent data offset 0 nr 94208 ram 94208
    item 28 key (280 108 294912) itemoff 14545 itemsize 53
    extent data disk bytenr 10433052672 nr 81920 type 2 (prealloc)
    extent data offset 0 nr 81920 ram 81920

    Then the following happens:

    1) Writeback starts for range B (from 348160 to 438271), execution of
    run_delalloc_nocow() starts;

    2) The first iteration of run_delalloc_nocow()'s while loop leaves us at
    the extent item at slot 28, pointing to the prealloc extent item
    covering the range from 294912 to 376831. This extent covers part of
    our range;

    3) An ordered extent is created against that extent, covering the file
    range from 348160 to 376831 (28672 bytes);

    4) We adjust 'cur_offset' to 376832 and move on to the next iteration of
    the while loop;

    5) The call to btrfs_lookup_file_extent() leaves us at the same leaf,
    pointing to slot 29, 1 slot after the last item (the extent item
    we processed in the previous iteration);

    6) Because we are a slot beyond the last item, we call btrfs_next_leaf(),
    which releases the search path before doing another search for the
    last key of the leaf (280 108 294912);

    7) Right after btrfs_next_leaf() released the path, and before it did
    another search for the last key of the leaf, writeback for the range
    A (from 294912 to 303103) completes (it was previously started at
    some point);

    8) Upon completion of the ordered extent for range A, the prealloc extent
    we previously found got split into two extent items, one covering the
    range from 294912 to 303103 (8192 bytes), with a type of regular extent
    (and no longer prealloc) and another covering the range from 303104 to
    376831 (73728 bytes), with a type of prealloc and an offset of 8192
    bytes. So our leaf now has the following layout:

    leaf 38895616 gen 24544 total ptrs 31 free space 13664 owner 5
    (...)
    item 27 key (280 108 200704) itemoff 14598 itemsize 53
    extent data disk bytenr 0 nr 0 type 1
    extent data offset 0 nr 8192 ram 94208
    item 28 key (280 108 208896) itemoff 14545 itemsize 53
    extent data disk bytenr 10433142784 nr 86016 type 1
    extent data offset 0 nr 86016 ram 86016
    item 29 key (280 108 294912) itemoff 14492 itemsize 53
    extent data disk bytenr 10433052672 nr 81920 type 1
    extent data offset 0 nr 8192 ram 81920
    item 30 key (280 108 303104) itemoff 14439 itemsize 53
    extent data disk bytenr 10433052672 nr 81920 type 2
    extent data offset 8192 nr 73728 ram 81920

    9) After btrfs_next_leaf() returns, we have our path pointing to that same
    leaf and at slot 30, since it has a key we didn't have before and it's
    the first key greater than the key that was previously the last key of
    the leaf (key (280 108 294912));

    10) The extent item at slot 30 covers the range from 303104 to 376831
    which is in our target range, so we process it, despite having already
    created an ordered extent against this extent for the file range from
    348160 to 376831. This is because we skip to the next extent item only
    if its end is less than or equal to the start of our delalloc range,
    and not less than or equal to the current offset ('cur_offset');

    11) As a result we compute 'num_bytes' as:

    num_bytes = min(end + 1, extent_end) - cur_offset;
              = min(438271 + 1, 376832) - 376832 = 0

    12) We then call create_io_em() for a 0 bytes range starting at offset
    376832;

    13) Then create_io_em() enters an infinite loop because its calls to
    btrfs_drop_extent_cache() do nothing due to the 0 length range
    passed to it. So no existing extent maps that cover the offset
    376832 get removed, and therefore calls to add_extent_mapping()
    return -EEXIST, resulting in an infinite loop. This loop from
    create_io_em() is the following:

    do {
    btrfs_drop_extent_cache(BTRFS_I(inode), em->start,
    em->start + em->len - 1, 0);
    write_lock(&em_tree->lock);
    ret = add_extent_mapping(em_tree, em, 1);
    write_unlock(&em_tree->lock);
    /*
    * The caller has taken lock_extent(), who could race with us
    * to add em?
    */
    } while (ret == -EEXIST);

    Also, each call to btrfs_drop_extent_cache() triggers a warning because
    the start offset passed to it (376832) is greater than the end offset
    passed to it (376832 - 1) by 1, due to the 0 length:

    [258532.052621] ------------[ cut here ]------------
    [258532.052643] WARNING: CPU: 0 PID: 9987 at fs/btrfs/file.c:602 btrfs_drop_extent_cache+0x3f4/0x590 [btrfs]
    (...)
    [258532.052672] CPU: 0 PID: 9987 Comm: fsx Tainted: G W 5.4.0-rc7-btrfs-next-64 #1
    [258532.052673] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c8995f-prebuilt.qemu.org 04/01/2014
    [258532.052691] RIP: 0010:btrfs_drop_extent_cache+0x3f4/0x590 [btrfs]
    (...)
    [258532.052695] RSP: 0018:ffffb4be0153f860 EFLAGS: 00010287
    [258532.052700] RAX: ffff975b445ee360 RBX: ffff975b44eb3e08 RCX: 0000000000000000
    [258532.052700] RDX: 0000000000038fff RSI: 0000000000039000 RDI: ffff975b445ee308
    [258532.052700] RBP: 0000000000038fff R08: 0000000000000000 R09: 0000000000000001
    [258532.052701] R10: ffff975b513c5c10 R11: 00000000e3c0cfa9 R12: 0000000000039000
    [258532.052703] R13: ffff975b445ee360 R14: 00000000ffffffef R15: ffff975b445ee308
    [258532.052705] FS: 00007f86a821de80(0000) GS:ffff975b76a00000(0000) knlGS:0000000000000000
    [258532.052707] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [258532.052708] CR2: 00007fdacf0f3ab4 CR3: 00000001f9d26002 CR4: 00000000003606f0
    [258532.052712] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [258532.052717] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [258532.052717] Call Trace:
    [258532.052718] ? preempt_schedule_common+0x32/0x70
    [258532.052722] ? ___preempt_schedule+0x16/0x20
    [258532.052741] create_io_em+0xff/0x180 [btrfs]
    [258532.052767] run_delalloc_nocow+0x942/0xb10 [btrfs]
    [258532.052791] btrfs_run_delalloc_range+0x30b/0x520 [btrfs]
    [258532.052812] ? find_lock_delalloc_range+0x221/0x250 [btrfs]
    [258532.052834] writepage_delalloc+0xe4/0x140 [btrfs]
    [258532.052855] __extent_writepage+0x110/0x4e0 [btrfs]
    [258532.052876] extent_write_cache_pages+0x21c/0x480 [btrfs]
    [258532.052906] extent_writepages+0x52/0xb0 [btrfs]
    [258532.052911] do_writepages+0x23/0x80
    [258532.052915] __filemap_fdatawrite_range+0xd2/0x110
    [258532.052938] btrfs_fdatawrite_range+0x1b/0x50 [btrfs]
    [258532.052954] start_ordered_ops+0x57/0xa0 [btrfs]
    [258532.052973] ? btrfs_sync_file+0x225/0x490 [btrfs]
    [258532.052988] btrfs_sync_file+0x225/0x490 [btrfs]
    [258532.052997] __x64_sys_msync+0x199/0x200
    [258532.053004] do_syscall_64+0x5c/0x250
    [258532.053007] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [258532.053010] RIP: 0033:0x7f86a7dfd760
    (...)
    [258532.053014] RSP: 002b:00007ffd99af0368 EFLAGS: 00000246 ORIG_RAX: 000000000000001a
    [258532.053016] RAX: ffffffffffffffda RBX: 0000000000000ec9 RCX: 00007f86a7dfd760
    [258532.053017] RDX: 0000000000000004 RSI: 000000000000836c RDI: 00007f86a8221000
    [258532.053019] RBP: 0000000000021ec9 R08: 0000000000000003 R09: 00007f86a812037c
    [258532.053020] R10: 0000000000000001 R11: 0000000000000246 R12: 00000000000074a3
    [258532.053021] R13: 00007f86a8221000 R14: 000000000000836c R15: 0000000000000001
    [258532.053032] irq event stamp: 1653450494
    [258532.053035] hardirqs last enabled at (1653450493): [] _raw_spin_unlock_irq+0x29/0x50
    [258532.053037] hardirqs last disabled at (1653450494): [] trace_hardirqs_off_thunk+0x1a/0x20
    [258532.053039] softirqs last enabled at (1653449852): [] __do_softirq+0x466/0x6bd
    [258532.053042] softirqs last disabled at (1653449845): [] irq_exit+0xec/0x120
    [258532.053043] ---[ end trace 8476fce13d9ce20a ]---

    Which results in flooding dmesg/syslog since btrfs_drop_extent_cache()
    uses WARN_ON() and not WARN_ON_ONCE().

    So fix this issue by changing run_delalloc_nocow()'s loop to move to the
    next extent item when the current extent item ends at an offset less than
    or equal to the current offset instead of the start offset.
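
    The arithmetic from step 11 and the difference between the two skip
    conditions can be checked with a small standalone sketch. The function
    names are hypothetical and are not the kernel's code, and the delalloc
    start offset of 348160 used in the note below is an assumption made for
    illustration; the other values come from the steps above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old check: skip the extent item only if it ends at or before the
 * start of the delalloc range. */
bool skip_extent_old(uint64_t extent_end, uint64_t start)
{
	return extent_end <= start;
}

/* Fixed check: skip the extent item if it ends at or before the
 * current offset, which may already be past 'start'. */
bool skip_extent_fixed(uint64_t extent_end, uint64_t cur_offset)
{
	return extent_end <= cur_offset;
}

/* num_bytes = min(end + 1, extent_end) - cur_offset, as in step 11. */
uint64_t nocow_num_bytes(uint64_t end, uint64_t extent_end,
			 uint64_t cur_offset)
{
	uint64_t limit = (end + 1 < extent_end) ? end + 1 : extent_end;

	return limit - cur_offset;
}
```

    With extent_end and cur_offset both at 376832, skip_extent_old(376832,
    348160) is false, so the item is processed and nocow_num_bytes(438271,
    376832, 376832) yields 0. skip_extent_fixed(376832, 376832) is true, so
    the item is skipped and the 0 length range never reaches create_io_em().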

    Fixes: 80ff385665b7fc ("Btrfs: update nodatacow code v2")
    CC: stable@vger.kernel.org # 4.4+
    Reviewed-by: Josef Bacik
    Signed-off-by: Filipe Manana
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • commit a40c94be2336f3002563c9ae16572143ae3422e2 upstream.

    It turns out that the JZ4725B displays the same buggy behaviour as the
    JZ4740 that was described in commit f4c255f1a747 ("dmaengine: dma-jz4780:
    Break descriptor chains on JZ4740").

    Work around it by using the same workaround previously used for the
    JZ4740.

    Fixes commit f4c255f1a747 ("dmaengine: dma-jz4780: Break descriptor
    chains on JZ4740")

    Cc:
    Signed-off-by: Paul Cercueil
    Link: https://lore.kernel.org/r/20191210165545.59690-1-paul@crapouillou.net
    Signed-off-by: Vinod Koul
    Signed-off-by: Greg Kroah-Hartman

    Paul Cercueil
     
  • commit 53a256a9b925b47c7e67fc1f16ca41561a7b877c upstream.

    dmaengine_desc_set_reuse() allocates a struct dma_slave_caps on the
    stack, populates it using dma_get_slave_caps() and then accesses one
    of its members.

    However, dma_get_slave_caps() may fail and this isn't accounted for,
    leading to a legitimate warning from gcc-4.9 (but not newer versions):

    In file included from drivers/spi/spi-bcm2835.c:19:0:
    drivers/spi/spi-bcm2835.c: In function 'dmaengine_desc_set_reuse':
    >> include/linux/dmaengine.h:1370:10: warning: 'caps.descriptor_reuse' is used uninitialized in this function [-Wuninitialized]
    if (caps.descriptor_reuse) {

    Fix it, thereby also silencing the gcc-4.9 warning.

    The issue has been present for 4 years but surfaces only now that
    the first caller of dmaengine_desc_set_reuse() has been added in
    spi-bcm2835.c. Another user of reusable DMA descriptors has existed
    for a while in pxa_camera.c, but it sets the DMA_CTRL_REUSE flag
    directly instead of calling dmaengine_desc_set_reuse(). Nevertheless,
    tag this commit for stable in case there are out-of-tree users.
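
    The pattern being fixed can be sketched in user space as follows. All
    names here (caps_sketch, get_caps_sketch, set_reuse_sketch,
    REUSE_FLAG_SKETCH) are hypothetical stand-ins, and the error codes are
    assumptions; the point is only that the stack-allocated struct is read
    after the getter has reported success.

```c
#include <errno.h>
#include <stdbool.h>

#define REUSE_FLAG_SKETCH 0x1u	/* stand-in for DMA_CTRL_REUSE */

struct caps_sketch {
	bool descriptor_reuse;
};

/* Stand-in for dma_get_slave_caps(): fills *caps only on success. */
int get_caps_sketch(bool supported, bool reuse, struct caps_sketch *caps)
{
	if (!supported)
		return -ENXIO;	/* assumed error code; *caps stays untouched */
	caps->descriptor_reuse = reuse;
	return 0;
}

/* Stand-in for dmaengine_desc_set_reuse() with the return value checked. */
int set_reuse_sketch(bool supported, bool reuse, unsigned int *flags)
{
	struct caps_sketch caps;	/* uninitialized, as in the report */
	int ret = get_caps_sketch(supported, reuse, &caps);

	if (ret)
		return ret;	/* do not read caps when the getter failed */
	if (!caps.descriptor_reuse)
		return -EPERM;	/* assumed error code */
	*flags |= REUSE_FLAG_SKETCH;
	return 0;
}
```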

    Fixes: 272420214d26 ("dmaengine: Add DMA_CTRL_REUSE")
    Reported-by: kbuild test robot
    Signed-off-by: Lukas Wunner
    Cc: stable@vger.kernel.org # v4.3+
    Link: https://lore.kernel.org/r/ca92998ccc054b4f2bfd60ef3adbab2913171eac.1575546234.git.lukas@wunner.de
    Signed-off-by: Vinod Koul
    Signed-off-by: Greg Kroah-Hartman

    Lukas Wunner
     
  • commit e4ab5ccc357b978999328fadae164e098c26fa40 upstream.

    This adds logic to the user_notification_basic test to set a member
    of struct seccomp_notif to an invalid value to ensure that the kernel
    returns EINVAL if any of the struct seccomp_notif members are set to
    invalid values.

    Signed-off-by: Sargun Dhillon
    Suggested-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191230203811.4996-1-sargun@sargun.me
    Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
    Cc: stable@vger.kernel.org
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Sargun Dhillon
     
  • commit 771b894f2f3dfedc2ba5561731fffa0e39b1bbb6 upstream.

    The sizes by which seccomp_notif and seccomp_notif_resp are allocated are
    based on the SECCOMP_GET_NOTIF_SIZES ioctl. This allows for graceful
    extension of these datastructures. If userspace zeroes out the
    datastructure based on its version, and it is lagging behind the kernel's
    version, it will end up sending trailing garbage. On the other hand,
    if it is ahead of the kernel version, it will write extra zero space,
    and potentially cause corruption.
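
    A minimal sketch of the idea, assuming the buffer was allocated with the
    size the kernel reported: clear exactly that many bytes rather than
    sizeof() of the structure the program was compiled against. The helper
    name and the kernel_size parameter are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Zero a notification buffer using the size reported by the kernel
 * (hypothetical parameter kernel_size), not the compile-time size of
 * the userspace struct definition.
 */
void zero_notif_sketch(void *buf, size_t kernel_size)
{
	memset(buf, 0, kernel_size);
}
```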

    Signed-off-by: Sargun Dhillon
    Suggested-by: Tycho Andersen
    Link: https://lore.kernel.org/r/20191230203503.4925-1-sargun@sargun.me
    Fixes: fec7b6690541 ("samples: add an example of seccomp user trap")
    Cc: stable@vger.kernel.org
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Sargun Dhillon
     
  • commit 2882d53c9c6f3b8311d225062522f03772cf0179 upstream.

    This patch is a small change in enforcement of the uapi for
    SECCOMP_IOCTL_NOTIF_RECV ioctl. Specifically, the datastructure which
    is passed (seccomp_notif) must be zeroed out. Previously any of its
    members could be set to nonsense values, and we would ignore it.

    This ensures all fields are set to their zero value.
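
    The kind of check described here can be sketched in plain C as below.
    This is a user-space illustration with a hypothetical helper name, not
    the kernel's implementation: it rejects a buffer with -EINVAL as soon as
    any byte is non-zero.

```c
#include <errno.h>
#include <stddef.h>

/* Return 0 if every byte of buf is zero, -EINVAL otherwise. */
int check_zeroed_sketch(const void *buf, size_t len)
{
	const unsigned char *p = buf;

	for (size_t i = 0; i < len; i++) {
		if (p[i] != 0)
			return -EINVAL;
	}
	return 0;
}
```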

    Signed-off-by: Sargun Dhillon
    Reviewed-by: Christian Brauner
    Reviewed-by: Aleksa Sarai
    Acked-by: Tycho Andersen
    Link: https://lore.kernel.org/r/20191229062451.9467-2-sargun@sargun.me
    Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
    Cc: stable@vger.kernel.org
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Sargun Dhillon
     
  • commit 88c13f8bd71472fbab5338b01d99122908c77e53 upstream.

    The seccomp_notif structure should be zeroed out prior to calling the
    SECCOMP_IOCTL_NOTIF_RECV ioctl. Previously, the kernel did not check
    whether these structures were zeroed out or not, so these worked.

    This patch zeroes out the seccomp_notif data structure prior to calling
    the ioctl.

    Signed-off-by: Sargun Dhillon
    Reviewed-by: Tycho Andersen
    Reviewed-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191229062451.9467-1-sargun@sargun.me
    Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
    Cc: stable@vger.kernel.org
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Sargun Dhillon
     
  • commit 98ca480a8f22fdbd768e3dad07024c8d4856576c upstream.

    An ino is unsigned, so display it as such in /proc/locks.
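
    The visible effect of the format change can be illustrated with a small
    sketch. The helper names are hypothetical; the signed variant casts
    explicitly to mimic printing an unsigned value with a signed conversion,
    so an inode number with the top bit set shows up as negative.

```c
#include <stdio.h>

/* Old behaviour: the inode number is printed as a signed value. */
int format_ino_signed(char *buf, size_t len, unsigned long ino)
{
	return snprintf(buf, len, "%ld", (long)ino);
}

/* Fixed behaviour: the inode number is printed as an unsigned value. */
int format_ino_unsigned(char *buf, size_t len, unsigned long ino)
{
	return snprintf(buf, len, "%lu", ino);
}
```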

    Cc: stable@vger.kernel.org
    Signed-off-by: Amir Goldstein
    Signed-off-by: Jeff Layton
    Signed-off-by: Greg Kroah-Hartman

    Amir Goldstein
     
  • commit a5b0dc5a46c221725c43bd9b01570239a4cd78b1 upstream.

    I noticed that randconfig builds with gcc no longer produce a lot of
    ccache hits, unlike with clang, and traced this back to plugins
    now being enabled unconditionally if they are supported.

    I am now working around this by adding

    export CCACHE_COMPILERCHECK=/usr/bin/size -A %compiler%

    to my top-level Makefile. This changes the heuristic that ccache uses
    to determine whether the plugins are the same after a 'make clean'.

    However, it also seems that being able to just turn off the plugins is
    generally useful: at least for build testing, they add noticeable overhead
    but do not find a lot of additional bugs. Turning them off may also be
    easier for ccache users than my workaround.

    Fixes: 9f671e58159a ("security: Create "kernel hardening" config area")
    Signed-off-by: Arnd Bergmann
    Acked-by: Ard Biesheuvel
    Reviewed-by: Masahiro Yamada
    Link: https://lore.kernel.org/r/20191211133951.401933-1-arnd@arndb.de
    Cc: stable@vger.kernel.org
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit 8df955a32a73315055e0cd187cbb1cea5820394b upstream.

    For callers that allocated a label for persistent_ram_new(), if the call
    fails, they must clean up the allocation.

    Suggested-by: Navid Emamdoost
    Fixes: 1227daa43bce ("pstore/ram: Clarify resource reservation labels")
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/lkml/20191211191353.14385-1-navid.emamdoost@gmail.com
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Kees Cook
     
  • commit 9e5f1c19800b808a37fb9815a26d382132c26c3d upstream.

    The ram_core.c routines treat przs as circular buffers. When writing a
    new crash dump, the old buffer needs to be cleared so that the new dump
    doesn't end up in the wrong place (i.e. at the end).

    The solution to this problem is to reset the circular buffer state before
    writing a new Oops dump.
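
    The effect of resetting the circular buffer state can be sketched with a
    toy ring buffer. The struct, its fields, and the function names are
    hypothetical and much simpler than the ram_core.c routines; the sketch
    only shows that after a reset the next write starts at the beginning
    again.

```c
#include <stddef.h>

#define ZONE_CAP 16

struct zone_sketch {
	char data[ZONE_CAP];
	size_t start;	/* next write position */
	size_t size;	/* number of valid bytes */
};

/* Reset the circular buffer state so the next write begins at offset 0. */
void zone_zap(struct zone_sketch *z)
{
	z->start = 0;
	z->size = 0;
}

/* Append n bytes, wrapping around at the end of the buffer. */
void zone_write(struct zone_sketch *z, const char *s, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		z->data[z->start] = s[i];
		z->start = (z->start + 1) % ZONE_CAP;
	}
	z->size = (z->size + n > ZONE_CAP) ? ZONE_CAP : z->size + n;
}
```

    Without the zone_zap() call between two dumps, the second dump would be
    appended after the first one instead of replacing it.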

    Signed-off-by: Aleksandr Yashkin
    Signed-off-by: Nikolay Merinov
    Signed-off-by: Ariel Gilman
    Link: https://lore.kernel.org/r/20191223133816.28155-1-n.merinov@inango-systems.com
    Fixes: 896fc1f0c4c6 ("pstore/ram: Switch to persistent_ram routines")
    Cc: stable@vger.kernel.org
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Aleksandr Yashkin
     
  • commit b73eba2a867e10b9b4477738677341f3307c07bb upstream.

    Because ocfs2_get_dlm_debug() is called one time too few here, the ocfs2
    file system will trigger a system crash, usually after the ocfs2 file
    system is unmounted.

    This system crash is caused by generic memory corruption, so the crash
    backtraces are not always the same. For example:

    ocfs2: Unmounting device (253,16) on (node 172167785)
    general protection fault: 0000 [#1] SMP PTI
    CPU: 3 PID: 14107 Comm: fence_legacy Kdump:
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
    RIP: 0010:__kmalloc+0xa5/0x2a0
    Code: 00 00 4d 8b 07 65 4d 8b
    RSP: 0018:ffffaa1fc094bbe8 EFLAGS: 00010286
    RAX: 0000000000000000 RBX: d310a8800d7a3faf RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000dc0 RDI: ffff96e68fc036c0
    RBP: d310a8800d7a3faf R08: ffff96e6ffdb10a0 R09: 00000000752e7079
    R10: 000000000001c513 R11: 0000000004091041 R12: 0000000000000dc0
    R13: 0000000000000039 R14: ffff96e68fc036c0 R15: ffff96e68fc036c0
    FS: 00007f699dfba540(0000) GS:ffff96e6ffd80000(0000) knlGS:00000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000055f3a9d9b768 CR3: 000000002cd1c000 CR4: 00000000000006e0
    Call Trace:
    ext4_htree_store_dirent+0x35/0x100 [ext4]
    htree_dirblock_to_tree+0xea/0x290 [ext4]
    ext4_htree_fill_tree+0x1c1/0x2d0 [ext4]
    ext4_readdir+0x67c/0x9d0 [ext4]
    iterate_dir+0x8d/0x1a0
    __x64_sys_getdents+0xab/0x130
    do_syscall_64+0x60/0x1f0
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x7f699d33a9fb

    This regression problem was introduced by commit e581595ea29c ("ocfs: no
    need to check return value of debugfs_create functions").

    Link: http://lkml.kernel.org/r/20191225061501.13587-1-ghe@suse.com
    Fixes: e581595ea29c ("ocfs: no need to check return value of debugfs_create functions")
    Signed-off-by: Gang He
    Acked-by: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Changwei Ge
    Cc: Gang He
    Cc: Jun Piao
    Cc: [5.3+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Gang He
     
  • commit 941f762bcb276259a78e7931674668874ccbda59 upstream.

    pr_err() expects kB, but mm_pgtables_bytes() returns the number of bytes.
    As everything else is printed in kB, I chose to fix the value rather than
    the string.

    Before:

    [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
    ...
    [ 1878] 1000 1878 217253 151144 1269760 0 0 python
    ...
    Out of memory: Killed process 1878 (python) total-vm:869012kB, anon-rss:604572kB, file-rss:4kB, shmem-rss:0kB, UID:1000 pgtables:1269760kB oom_score_adj:0

    After:

    [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
    ...
    [ 1436] 1000 1436 217253 151890 1294336 0 0 python
    ...
    Out of memory: Killed process 1436 (python) total-vm:869012kB, anon-rss:607516kB, file-rss:44kB, shmem-rss:0kB, UID:1000 pgtables:1264kB oom_score_adj:0
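
    The conversion behind the "After" output can be checked with a one-line
    sketch. The helper name is hypothetical; it simply divides a byte count
    by 1024 using a shift.

```c
/* Convert a byte count to kB by shifting right by 10 bits. */
unsigned long bytes_to_kb(unsigned long bytes)
{
	return bytes >> 10;
}
```

    For the "After" example, bytes_to_kb(1294336) gives 1264, which matches
    the pgtables:1264kB field in the kill message.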

    Link: http://lkml.kernel.org/r/20191211202830.1600-1-idryomov@gmail.com
    Fixes: 70cb6d267790 ("mm/oom: add oom_score_adj and pgtables to Killed process message")
    Signed-off-by: Ilya Dryomov
    Reviewed-by: Andrew Morton
    Acked-by: David Rientjes
    Acked-by: Michal Hocko
    Cc: Edward Chron
    Cc: David Rientjes
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Ilya Dryomov
     
  • commit e0153fc2c7606f101392b682e720a7a456d6c766 upstream.

    Felix Abecassis reports that move_pages() returns random status values
    if the pages are already on the target node, as shown by the test
    program below:

    int main(void)
    {
    const long node_id = 1;
    const long page_size = sysconf(_SC_PAGESIZE);
    const int64_t num_pages = 8;

    unsigned long nodemask = 1 << node_id;
    long ret = set_mempolicy(MPOL_BIND, &nodemask, sizeof(nodemask));
    if (ret < 0)
    return (EXIT_FAILURE);

    void **pages = malloc(sizeof(void*) * num_pages);
    for (int i = 0; i < num_pages; ++i) {
    pages[i] = mmap(NULL, page_size, PROT_WRITE | PROT_READ,
    MAP_PRIVATE | MAP_POPULATE | MAP_ANONYMOUS,
    -1, 0);
    if (pages[i] == MAP_FAILED)
    return (EXIT_FAILURE);
    }

    ret = set_mempolicy(MPOL_DEFAULT, NULL, 0);
    if (ret < 0)
    return (EXIT_FAILURE);

    int *nodes = malloc(sizeof(int) * num_pages);
    int *status = malloc(sizeof(int) * num_pages);
    for (int i = 0; i < num_pages; ++i) {
    nodes[i] = node_id;
    status[i] = 0xd0; /* simulate garbage values */
    }

    ret = move_pages(0, num_pages, pages, nodes, status, MPOL_MF_MOVE);
    printf("move_pages: %ld\n", ret);
    for (int i = 0; i < num_pages; ++i)
    printf("status[%d] = %d\n", i, status[i]);
    }

    Then running the program would return nonsense status values:

    $ ./move_pages_bug
    move_pages: 0
    status[0] = 208
    status[1] = 208
    status[2] = 208
    status[3] = 208
    status[4] = 208
    status[5] = 208
    status[6] = 208
    status[7] = 208

    This is because the status is not set if the page is already on the
    target node, but move_pages() should return valid status as long as it
    succeeds. The valid status may be errno or node id.

    We can't simply initialize the status array to zero since the pages may
    not be on node 0. Fix it by updating the status with the id of the node
    that the page is already on.
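
    The behaviour before and after the fix can be modelled in user space as
    below. This is a simplified sketch with hypothetical names, not the
    kernel's do_pages_move(): migration is assumed to always succeed, and
    the 'fixed' flag selects whether pages that are already on the target
    node get their status written.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Fill status[] for n pages. current_node[i] is the node page i is on,
 * target is the requested node.
 */
void fill_status_sketch(const int *current_node, int target, int *status,
			size_t n, bool fixed)
{
	for (size_t i = 0; i < n; i++) {
		if (current_node[i] == target) {
			/* Page is already in place: nothing to migrate. */
			if (fixed)
				status[i] = current_node[i];
			continue;	/* old code left status[i] untouched */
		}
		/* Stand-in for a successful migration to the target node. */
		status[i] = target;
	}
}
```

    With status pre-filled with 0xd0 as in the test program, the old
    behaviour leaves 208 in the entries of pages that were already on the
    target node, while the fixed behaviour reports the node id.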

    Link: http://lkml.kernel.org/r/1575584353-125392-1-git-send-email-yang.shi@linux.alibaba.com
    Fixes: a49bd4d71637 ("mm, numa: rework do_pages_move")
    Signed-off-by: Yang Shi
    Reported-by: Felix Abecassis
    Tested-by: Felix Abecassis
    Suggested-by: Michal Hocko
    Reviewed-by: John Hubbard
    Acked-by: Christoph Lameter
    Acked-by: Michal Hocko
    Reviewed-by: Vlastimil Babka
    Cc: Mel Gorman
    Cc: [4.17+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Yang Shi
     
  • commit 84029fd04c201a4c7e0b07ba262664900f47c6f5 upstream.

    The cred_jar kmem_cache is already memcg accounted in the current kernel
    but cred->security is not. Account cred->security to kmemcg.

    Recently we saw high root slab usage in our production environment and,
    on further inspection, found a buggy application leaking processes.
    Although that buggy application was contained within its memcg, we
    observed much more system memory overhead, a couple of GiBs, during that
    period. This overhead can adversely impact the isolation on the system.

    One source of high overhead we found was cred->security objects, which
    have a lifetime of at least the life of the process which allocated
    them.

    Link: http://lkml.kernel.org/r/20191205223721.40034-1-shakeelb@google.com
    Signed-off-by: Shakeel Butt
    Acked-by: Chris Down
    Reviewed-by: Roman Gushchin
    Acked-by: Michal Hocko
    Cc: Johannes Weiner
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Shakeel Butt
     
  • commit ac8f05da5174c560de122c499ce5dfb5d0dfbee5 upstream.

    When a zspage is migrated to another zone, the zone page state should be
    updated as well; otherwise the NR_ZSPAGE count for each zone is wrong,
    including in /proc/zoneinfo.
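
    The accounting described above can be sketched with plain per-zone
    counters. The struct and function names are hypothetical and the two
    zones are an arbitrary choice for illustration; the sketch only shows
    that a cross-zone migration must decrement the old zone and increment
    the new one.

```c
enum { NZONES_SKETCH = 2 };

struct zone_counts_sketch {
	long nr_zspages[NZONES_SKETCH];
};

/* Move one page's worth of accounting from old_zone to new_zone. */
void account_migration_sketch(struct zone_counts_sketch *c,
			      int old_zone, int new_zone)
{
	if (old_zone != new_zone) {
		c->nr_zspages[old_zone]--;
		c->nr_zspages[new_zone]++;
	}
}
```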

    Link: http://lkml.kernel.org/r/1575434841-48009-1-git-send-email-chanho.min@lge.com
    Fixes: 91537fee0013 ("mm: add NR_ZSMALLOC to vmstat")
    Signed-off-by: Chanho Min
    Signed-off-by: Jinsuk Choi
    Reviewed-by: Sergey Senozhatsky
    Acked-by: Minchan Kim
    Cc: [4.9+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Chanho Min