16 Oct, 2019

1 commit

  • On a machine with a 64K PAGE_SIZE, the nested for loops in
    test_check_nonzero_user() can lead to soft lockups, eg:

    watchdog: BUG: soft lockup - CPU#4 stuck for 22s! [modprobe:611]
    Modules linked in: test_user_copy(+) vmx_crypto gf128mul crc32c_vpmsum virtio_balloon ip_tables x_tables autofs4
    CPU: 4 PID: 611 Comm: modprobe Tainted: G L 5.4.0-rc1-gcc-8.2.0-00001-gf5a1a536fa14-dirty #1151
    ...
    NIP __might_sleep+0x20/0xc0
    LR __might_fault+0x40/0x60
    Call Trace:
    check_zeroed_user+0x12c/0x200
    test_user_copy_init+0x67c/0x1210 [test_user_copy]
    do_one_initcall+0x60/0x340
    do_init_module+0x7c/0x2f0
    load_module+0x2d94/0x30e0
    __do_sys_finit_module+0xc8/0x150
    system_call+0x5c/0x68

    Even with a 4K PAGE_SIZE the test takes multiple seconds. Instead
    tweak it to only scan a 1024 byte region, but make it cross the
    page boundary.

    Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
    Suggested-by: Aleksa Sarai
    Signed-off-by: Michael Ellerman
    Reviewed-by: Aleksa Sarai
    Acked-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191016122732.13467-1-mpe@ellerman.id.au
    Signed-off-by: Christian Brauner

    Michael Ellerman
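
    A minimal userspace sketch of the idea in this fix, assuming the
    1024-byte region is centred on the page boundary and that the nested
    loops visit every (zero_start, zero_end) pair within the region. The
    helper names region_start(), crosses_page_boundary() and
    count_range_checks() are hypothetical and are not taken from
    lib/test_user_copy.c:

```c
#include <assert.h>
#include <stddef.h>

/* Size of the scanned region named in the commit message. */
#define SCAN_SIZE 1024u

/*
 * Pick a start offset so that a SCAN_SIZE-byte region is centred on the
 * boundary between the first and second page (assumed placement).
 */
size_t region_start(size_t page_size)
{
    assert(page_size >= SCAN_SIZE / 2);
    return page_size - SCAN_SIZE / 2;
}

/* True if [start, start + size) has bytes on both sides of page_size. */
int crosses_page_boundary(size_t start, size_t size, size_t page_size)
{
    return start < page_size && start + size > page_size;
}

/*
 * Number of (zero_start, zero_end) pairs visited by loops of the
 * assumed form
 *   for (zero_start = 0; zero_start <= size; zero_start++)
 *       for (zero_end = zero_start; zero_end <= size; zero_end++)
 * The count grows quadratically with size.
 */
unsigned long long count_range_checks(unsigned long long size)
{
    return (size + 1) * (size + 2) / 2;
}
```

    Under these assumptions, count_range_checks(1024) is about half a
    million pairs, while count_range_checks(65536) is roughly 2.1 billion,
    which is consistent with the soft lockup being tied to the page size.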
     

07 Oct, 2019

4 commits

  • While writing the tests for copy_struct_from_user(), I used a construct
    that Linus doesn't appear to be too fond of:

    On 2019-10-04, Linus Torvalds wrote:
    > Hmm. That code is ugly, both before and after the fix.
    >
    > This just doesn't make sense for so many reasons:
    >
    > if ((ret |= test(umem_src == NULL, "kmalloc failed")))
    >
    > where the insanity comes from
    >
    > - why "|=" when you know that "ret" was zero before (and it had to
    > be, for the test to make sense)
    >
    > - why do this as a single line anyway?
    >
    > - don't do the stupid "double parenthesis" to hide a warning. Make it
    > use an actual comparison if you add a layer of parentheses.

    So instead, use a bog-standard check that isn't nearly as ugly.

    Fixes: 341115822f88 ("usercopy: Add parentheses around assignment in test_copy_struct_from_user")
    Fixes: f5a1a536fa14 ("lib: introduce copy_struct_from_user() helper")
    Signed-off-by: Aleksa Sarai
    Reviewed-by: Nathan Chancellor
    Reviewed-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191005233028.18566-1-cyphar@cyphar.com
    Signed-off-by: Christian Brauner

    Aleksa Sarai
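
    The "bog-standard check" can be sketched in userspace as follows. This
    is only an illustration: test_cond() is a hypothetical stand-in for the
    test() helper quoted above, and check_buffer() is an invented caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the quoted test() helper: report a failure
 * and return non-zero when the condition is true.
 */
static int test_cond(int condition, const char *msg)
{
    assert(msg != NULL);
    if (condition)
        fprintf(stderr, "FAIL: %s\n", msg);
    return condition ? 1 : 0;
}

/*
 * Plain form of the check: assign on one line, compare on the next.
 * No "|=" on a value known to be zero, no doubled parentheses.
 */
int check_buffer(const void *buf)
{
    int ret;

    ret = test_cond(buf == NULL, "allocation failed");
    if (ret)
        return ret;

    /* ... further checks on buf would go here ... */
    return 0;
}
```

    Splitting the assignment from the comparison also removes the need for
    the extra parentheses that only existed to silence a compiler warning.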
     
  • Linus Torvalds
     
  • In commit 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map") we
    changed elf to use MAP_FIXED_NOREPLACE instead of MAP_FIXED for the
    executable mappings.

    Then, people reported that it broke some binaries that had overlapping
    segments from the same file, and commit ad55eac74f20 ("elf: enforce
    MAP_FIXED on overlaying elf segments") re-instated MAP_FIXED for some
    overlaying elf segment cases. But only some - despite the summary line
    of that commit, it only did it when it also does a temporary brk vma for
    one obvious overlapping case.

    Now Russell King reports another overlapping case with old 32-bit x86
    binaries, which doesn't trigger that limited case. End result: we had
    better just drop MAP_FIXED_NOREPLACE entirely, and go back to MAP_FIXED.

    Yes, it's a sign of old binaries generated with old tool-chains, but we
    do pride ourselves on not breaking existing setups.

    This still leaves MAP_FIXED_NOREPLACE in place for the load_elf_interp()
    and the old load_elf_library() use-cases, because nobody has reported
    breakage for those. Yet.

    Note that in all the cases seen so far, the overlapping elf sections
    seem to be just re-mapping of the same executable with different section
    attributes. We could possibly introduce a new MAP_FIXED_NOFILECHANGE
    flag or similar, which acts like NOREPLACE, but allows just remapping
    the same executable file using different protection flags.

    It's not clear that would make a huge difference to anything, but if
    people really hate that "elf remaps over previous maps" behavior, maybe
    at least a more limited form of remapping would alleviate some concerns.

    Alternatively, we should take a look at our elf_map() logic to see if we
    end up not mapping things properly the first time.

    In the meantime, this is the minimal "don't do that then" patch while
    people hopefully think about it more.

    Reported-by: Russell King
    Fixes: 4ed28639519c ("fs, elf: drop MAP_FIXED usage from elf_map")
    Fixes: ad55eac74f20 ("elf: enforce MAP_FIXED on overlaying elf segments")
    Cc: Michal Hocko
    Cc: Kees Cook
    Signed-off-by: Linus Torvalds

    Linus Torvalds
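
    The difference between the two flags can be observed from userspace
    with mmap(). The sketch below is not the elf_map() code; it assumes a
    Linux system, the helper name overlay_lands_on_existing() is
    hypothetical, and the fallback value for MAP_FIXED_NOREPLACE is the
    asm-generic one, defined only when the libc headers lack it:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000 /* asm-generic value, for older libc headers */
#endif

/*
 * Map one anonymous page, then try to map a second page at the same
 * address using the given placement flag.
 * Returns 1 if the second mapping replaced the first, 0 if it was
 * refused or placed elsewhere, -1 if the first mapping could not be made.
 */
int overlay_lands_on_existing(int fixed_flag)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t len;
    void *first, *second;
    int result;

    assert(page > 0);
    len = (size_t)page;

    first = mmap(NULL, len, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (first == MAP_FAILED)
        return -1;

    second = mmap(first, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | fixed_flag, -1, 0);
    if (second == MAP_FAILED) {
        result = 0;              /* refused: the range is already in use */
    } else if (second == first) {
        result = 1;              /* the old mapping was silently replaced */
    } else {
        munmap(second, len);     /* kernel ignored the flag and used a hint */
        result = 0;
    }

    munmap(first, len);
    return result;
}
```

    With MAP_FIXED the overlay succeeds, which is the behaviour overlapping
    ELF segments depend on. With MAP_FIXED_NOREPLACE the same request is
    refused, which is what broke the binaries described above.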
     
  • Pull dma-mapping regression fix from Christoph Hellwig:
    "Revert an incorrect hunk from a patch that caused problems on various
    arm boards (Andrey Smirnov)"

    * tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping:
    dma-mapping: fix false positive warnings in dma_common_free_remap()

    Linus Torvalds
     

06 Oct, 2019

6 commits

  • Pull ARM SoC fixes from Olof Johansson:
    "A few fixes this time around:

    - Fixup of some clock specifications for DRA7 (device-tree fix)

    - Removal of some dead/legacy CPU OPP/PM code for OMAP that throws
    warnings at boot

    - A few more minor fixups for OMAPs, most around display

    - Enable STM32 QSPI as =y since their rootfs sometimes comes from
    there

    - Switch CONFIG_REMOTEPROC to =y since it went from tristate to bool

    - Fix of thermal zone definition for ux500 (5.4 regression)"

    * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
    ARM: multi_v7_defconfig: Fix SPI_STM32_QSPI support
    ARM: dts: ux500: Fix up the CPU thermal zone
    arm64/ARM: configs: Change CONFIG_REMOTEPROC from m to y
    ARM: dts: am4372: Set memory bandwidth limit for DISPC
    ARM: OMAP2+: Fix warnings with broken omap2_set_init_voltage()
    ARM: OMAP2+: Add missing LCDC midlemode for am335x
    ARM: OMAP2+: Fix missing reset done flag for am3 and am43
    ARM: dts: Fix gpio0 flags for am335x-icev2
    ARM: omap2plus_defconfig: Enable more droid4 devices as loadable modules
    ARM: omap2plus_defconfig: Enable DRM_TI_TFP410
    DTS: ARM: gta04: introduce legacy spi-cs-high to make display work again
    ARM: dts: Fix wrong clocks for dra7 mcasp
    clk: ti: dra7: Fix mcasp8 clock bits

    Linus Torvalds
     
  • …asahiroy/linux-kbuild

    Pull Kbuild fixes from Masahiro Yamada:

    - remove unneeded ar-option and KBUILD_ARFLAGS

    - remove long-deprecated SUBDIRS

    - fix modpost to suppress false-positive warnings for UML builds

    - fix namespace.pl to handle relative paths to ${objtree}, ${srctree}

    - make setlocalversion work for /bin/sh

    - make header archive reproducible

    - fix some Makefiles and documents

    * tag 'kbuild-fixes-v5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    kheaders: make headers archive reproducible
    kbuild: update compile-test header list for v5.4-rc2
    kbuild: two minor updates for Documentation/kbuild/modules.rst
    scripts/setlocalversion: clear local variable to make it work for sh
    namespace: fix namespace.pl script to support relative paths
    video/logo: do not generate unneeded logo C files
    video/logo: remove unneeded *.o pattern from clean-files
    integrity: remove pointless subdir-$(CONFIG_...)
    integrity: remove unneeded, broken attempt to add -fshort-wchar
    modpost: fix static EXPORT_SYMBOL warnings for UML build
    kbuild: correct formatting of header in kbuild module docs
    kbuild: remove SUBDIRS support
    kbuild: remove ar-option and KBUILD_ARFLAGS

    Linus Torvalds
     
  • Pull SCSI fixes from James Bottomley:
    "Twelve patches mostly small but obvious fixes or cosmetic but small
    updates"

    * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
    scsi: qla2xxx: Fix Nport ID display value
    scsi: qla2xxx: Fix N2N link up fail
    scsi: qla2xxx: Fix N2N link reset
    scsi: qla2xxx: Optimize NPIV tear down process
    scsi: qla2xxx: Fix stale mem access on driver unload
    scsi: qla2xxx: Fix unbound sleep in fcport delete path.
    scsi: qla2xxx: Silence fwdump template message
    scsi: hisi_sas: Make three functions static
    scsi: megaraid: disable device when probe failed after enabled device
    scsi: storvsc: setup 1:1 mapping between hardware queue and CPU queue
    scsi: qedf: Remove always false 'tmp_prio < 0' statement
    scsi: ufs: skip shutdown if hba is not powered
    scsi: bnx2fc: Handle scope bits when array returns BUSY or TSF

    Linus Torvalds
     
  • This makes getdents() and getdents64() do sanity checking on the
    pathname that they give to user space. And to mitigate the performance
    impact of that, it first cleans up the way it does the user copying, so
    that the code avoids doing the SMAP/PAN updates between each part of the
    dirent structure write.

    I really wanted to do this during the merge window, but didn't have
    time. The conversion of filldir to unsafe_put_user() is something I've
    had around for years now in a private branch, but the extra pathname
    checking finally made me clean it up to the point where it is mergeable.

    It's worth noting that the filename validity checking really should be a
    bit smarter: it would be much better to delay the error reporting until
    the end of the readdir, so that non-corrupted filenames are still
    returned. But that involves bigger changes, so let's see if anybody
    actually hits the corrupt directory entry case before worrying about it
    further.

    * branch 'readdir':
    Make filldir[64]() verify the directory entry filename is valid
    Convert filldir[64]() from __put_user() to unsafe_put_user()

    Linus Torvalds
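
    A userspace sketch of the kind of filename check described above. The
    function name verify_name(), the strict parameter and the -EIO return
    value are assumptions made for this illustration, not details stated
    in the message:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 0 if the directory entry name is acceptable, -EIO otherwise.
 * An empty name or a name containing '/' is always rejected.
 * strict != 0 additionally rejects embedded NUL bytes.
 */
int verify_name(const char *name, size_t len, int strict)
{
    assert(name != NULL);

    if (len == 0)
        return -EIO;
    if (memchr(name, '/', len))
        return -EIO;
    if (strict && memchr(name, '\0', len))
        return -EIO;
    return 0;
}
```

    Each memchr() call scans for a single byte value, so the strict variant
    walks the name twice.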
     
  • This has been discussed several times, and now filesystem people are
    talking about doing it individually at the filesystem layer, so head
    that off at the pass and just do it in getdents{64}().

    This is partially based on a patch by Jann Horn, but checks for NUL
    bytes as well, and somewhat simplified.

    There's also commentary about how it might be better if invalid names
    due to filesystem corruption don't cause an immediate failure, but only
    an error at the end of the readdir(), so that people can still see the
    filenames that are ok.

    There's also been discussion about just how much POSIX strictly speaking
    requires this since it's about filesystem corruption. It's really more
    "protect user space from bad behavior" as pointed out by Jann. But
    since Eric Biederman looked up the POSIX wording, here it is for context:

    "From readdir:

    The readdir() function shall return a pointer to a structure
    representing the directory entry at the current position in the
    directory stream specified by the argument dirp, and position the
    directory stream at the next entry. It shall return a null pointer
    upon reaching the end of the directory stream. The structure dirent
    defined in the header describes a directory entry.

    From definitions:

    3.129 Directory Entry (or Link)

    An object that associates a filename with a file. Several directory
    entries can associate names with the same file.

    ...

    3.169 Filename

    A name consisting of 1 to {NAME_MAX} bytes used to name a file. The
    characters composing the name may be selected from the set of all
    character values excluding the slash character and the null byte. The
    filenames dot and dot-dot have special meaning. A filename is
    sometimes referred to as a 'pathname component'."

    Note that I didn't bother adding the checks to any legacy interfaces
    that nobody uses.

    Also note that if this ends up being noticeable as a performance
    regression, we can fix that to do a much more optimized model that
    checks for both NUL and '/' at the same time one word at a time.

    We haven't really tended to optimize 'memchr()', and it only checks for
    one pattern at a time anyway, and we really _should_ check for NUL too
    (but see the comment about "soft errors" in the code about why it
    currently only checks for '/')

    See the CONFIG_DCACHE_WORD_ACCESS case of hash_name() for how the name
    lookup code looks for pathname terminating characters in parallel.

    Link: https://lore.kernel.org/lkml/20190118161440.220134-2-jannh@google.com/
    Cc: Alexander Viro
    Cc: Jann Horn
    Cc: Eric W. Biederman
    Signed-off-by: Linus Torvalds

    Linus Torvalds
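
    The "one word at a time" idea mentioned above can be sketched with the
    classic zero-byte bit trick. This is a portable userspace illustration
    under the assumption of 64-bit words; it is not the hash_name() code,
    and the function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/* Non-zero if any of the eight bytes in w is zero. */
static int word_has_zero_byte(uint64_t w)
{
    return ((w - ONES) & ~w & HIGHS) != 0;
}

/* Return 1 if name[0..len) contains a NUL byte or a '/', else 0. */
int name_has_nul_or_slash(const char *name, size_t len)
{
    size_t i = 0;

    assert(name != NULL || len == 0);

    /* Check eight bytes per step. XOR turns every '/' into a zero byte. */
    for (; i + 8 <= len; i += 8) {
        uint64_t w;

        memcpy(&w, name + i, sizeof(w)); /* safe for unaligned input */
        if (word_has_zero_byte(w) || word_has_zero_byte(w ^ (ONES * '/')))
            return 1;
    }

    /* Handle the remaining tail byte by byte. */
    for (; i < len; i++) {
        if (name[i] == '\0' || name[i] == '/')
            return 1;
    }
    return 0;
}
```

    Both patterns are tested on the same loaded word, so the name is read
    only once, unlike two separate memchr() passes.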
     
  • We really should avoid the "__{get,put}_user()" functions entirely,
    because they can easily be mis-used and the original intent of being
    used for simple direct user accesses no longer holds in a post-SMAP/PAN
    world.

    Manually optimizing away the user access range check makes no sense any
    more, when the range check is generally much cheaper than the "enable
    user accesses" code that the __{get,put}_user() functions still need.

    So instead of __put_user(), use the unsafe_put_user() interface with
    user_access_{begin,end}() that really does generate better code these
    days, and which is generally a nicer interface. Under some loads, the
    multiple user writes that filldir() does are actually quite noticeable.

    This also makes the dirent name copy use unsafe_put_user() with a couple
    of macros. We do not want to make function calls with SMAP/PAN
    disabled, and the code this generates is quite good when the
    architecture uses "asm goto" for unsafe_put_user() like x86 does.

    Note that this doesn't bother with the legacy cases. Nobody should use
    them anyway, so performance doesn't really matter there.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

05 Oct, 2019

29 commits

  • Pull networking fixes from David Miller:

    1) Fix ieee802154 atusb driver use-after-free, from Johan Hovold.

    2) Need to validate TCA_CBQ_WRROPT netlink attributes, from Eric
    Dumazet.

    3) txq null deref in mac80211, from Miaoqing Pan.

    4) ionic driver needs to select NET_DEVLINK, from Arnd Bergmann.

    5) Need to disable bh during nft_connlimit GC, from Pablo Neira Ayuso.

    6) Avoid division by zero in taprio scheduler, from Vladimir Oltean.

    7) Various xgmac fixes in stmmac driver from Jose Abreu.

    8) Avoid 64-bit division in mlx5 leading to link errors on 32-bit from
    Michal Kubecek.

    9) Fix bad VLAN check in rtl8366 DSA driver, from Linus Walleij.

    10) Fix sleep while atomic in sja1105, from Vladimir Oltean.

    11) Suspend/resume deadlock in stmmac, from Thierry Reding.

    12) Various UDP GSO fixes from Josh Hunt.

    13) Fix slab out of bounds access in tcp_zerocopy_receive(), from Eric
    Dumazet.

    14) Fix OOPS in __ipv6_ifa_notify(), from David Ahern.

    15) Memory leak in NFC's llcp_sock_bind, from Eric Dumazet.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
    selftests/net: add nettest to .gitignore
    net: qlogic: Fix memory leak in ql_alloc_large_buffers
    nfc: fix memory leak in llcp_sock_bind()
    sch_dsmark: fix potential NULL deref in dsmark_init()
    net: phy: at803x: use operating parameters from PHY-specific status
    net: phy: extract pause mode
    net: phy: extract link partner advertisement reading
    net: phy: fix write to mii-ctrl1000 register
    ipv6: Handle missing host route in __ipv6_ifa_notify
    net: phy: allow for reset line to be tied to a sleepy GPIO controller
    net: ipv4: avoid mixed n_redirects and rate_tokens usage
    r8152: Set macpassthru in reset_resume callback
    cxgb4:Fix out-of-bounds MSI-X info array access
    Revert "ipv6: Handle race in addrconf_dad_work"
    net: make sock_prot_memory_pressure() return "const char *"
    rxrpc: Fix rxrpc_recvmsg tracepoint
    qmi_wwan: add support for Cinterion CLS8 devices
    tcp: fix slab-out-of-bounds in tcp_zerocopy_receive()
    lib: textsearch: fix escapes in example code
    udp: only do GSO if # of segs > 1
    ...

    Linus Torvalds
     
  • Pull s390 fixes from Vasily Gorbik:

    - defconfig updates

    - Fix build errors with CC_OPTIMIZE_FOR_SIZE due to usage of "i"
    constraint for function arguments. Two kvm changes acked-by Christian
    Borntraeger.

    - Fix -Wunused-but-set-variable warnings in mm code.

    - Avoid a constant misuse in qdio.

    - Handle a case when cpumf is temporarily unavailable.

    * tag 's390-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    KVM: s390: mark __insn32_query() as __always_inline
    KVM: s390: fix __insn32_query() inline assembly
    s390: update defconfigs
    s390/pci: mark function(s) __always_inline
    s390/mm: mark function(s) __always_inline
    s390/jump_label: mark function(s) __always_inline
    s390/cpu_mf: mark function(s) __always_inline
    s390/atomic,bitops: mark function(s) __always_inline
    s390/mm: fix -Wunused-but-set-variable warnings
    s390: mark __cpacf_query() as __always_inline
    s390/qdio: clarify size of the QIB parm area
    s390/cpumf: Fix indentation in sampling device driver
    s390/cpumsf: Check for CPU Measurement sampling
    s390/cpumf: Use consistant debug print format

    Linus Torvalds
     
  • __insn32_query() will not compile if the compiler decides to not
    inline it, since it contains an inline assembly with an "i" constraint
    with variable contents.

    Acked-by: Christian Borntraeger
    Signed-off-by: Heiko Carstens
    Signed-off-by: Vasily Gorbik

    Heiko Carstens
     
  • The inline assembly constraints of __insn32_query() tell the compiler
    that only the first byte of "query" is being written to. Intended was
    probably that 32 bytes are written to.

    Fix and simplify the code and just use a "memory" clobber.

    Fixes: d668139718a9 ("KVM: s390: provide query function for instructions returning 32 byte")
    Cc: stable@vger.kernel.org # v5.2+
    Acked-by: Christian Borntraeger
    Signed-off-by: Heiko Carstens
    Signed-off-by: Vasily Gorbik

    Heiko Carstens
     
  • Commit 5cf4537975bb ("dma-mapping: introduce a dma_common_find_pages
    helper") changed invalid input check in dma_common_free_remap() from:

    if (!area || area->flags != VM_DMA_COHERENT)

    to

    if (!area || area->flags != VM_DMA_COHERENT || !area->pages)

    which seems to produce false positives for memory obtained via
    dma_common_contiguous_remap().

    This triggers the following warning message when doing "reboot" on ZII
    VF610 Dev Board Rev B:

    WARNING: CPU: 0 PID: 1 at kernel/dma/remap.c:112 dma_common_free_remap+0x88/0x8c
    trying to free invalid coherent area: 9ef82980
    Modules linked in:
    CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 5.3.0-rc6-next-20190820 #119
    Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
    Backtrace:
    [] (dump_backtrace) from [] (show_stack+0x20/0x24)
    r7:8015ed78 r6:00000009 r5:00000000 r4:9f4d9b14
    [] (show_stack) from [] (dump_stack+0x24/0x28)
    [] (dump_stack) from [] (__warn.part.3+0xcc/0xe4)
    [] (__warn.part.3) from [] (warn_slowpath_fmt+0x78/0x94)
    r6:00000070 r5:808e540c r4:81c03048
    [] (warn_slowpath_fmt) from [] (dma_common_free_remap+0x88/0x8c)
    r3:9ef82980 r2:808e53e0
    r7:00001000 r6:a0b1e000 r5:a0b1e000 r4:00001000
    [] (dma_common_free_remap) from [] (remap_allocator_free+0x60/0x68)
    r5:81c03048 r4:9f4d9b78
    [] (remap_allocator_free) from [] (__arm_dma_free.constprop.3+0xf8/0x148)
    r5:81c03048 r4:9ef82900
    [] (__arm_dma_free.constprop.3) from [] (arm_dma_free+0x24/0x2c)
    r5:9f563410 r4:80110120
    [] (arm_dma_free) from [] (dma_free_attrs+0xa0/0xdc)
    [] (dma_free_attrs) from [] (dma_pool_destroy+0xc0/0x154)
    r8:9efa8860 r7:808f02f0 r6:808f02d0 r5:9ef82880 r4:9ef82780
    [] (dma_pool_destroy) from [] (ehci_mem_cleanup+0x6c/0x150)
    r7:9f563410 r6:9efa8810 r5:00000000 r4:9efd0148
    [] (ehci_mem_cleanup) from [] (ehci_stop+0xac/0xc0)
    r5:9efd0148 r4:9efd0000
    [] (ehci_stop) from [] (usb_remove_hcd+0xf4/0x1b0)
    r7:9f563410 r6:9efd0074 r5:81c03048 r4:9efd0000
    [] (usb_remove_hcd) from [] (host_stop+0x48/0xb8)
    r7:9f563410 r6:9efd0000 r5:9f5f4040 r4:9f5f5040
    [] (host_stop) from [] (ci_hdrc_host_destroy+0x34/0x38)
    r7:9f563410 r6:9f5f5040 r5:9efa8800 r4:9f5f4040
    [] (ci_hdrc_host_destroy) from [] (ci_hdrc_remove+0x50/0x10c)
    [] (ci_hdrc_remove) from [] (platform_drv_remove+0x34/0x4c)
    r7:9f563410 r6:81c4f99c r5:9efa8810 r4:9efa8810
    [] (platform_drv_remove) from [] (device_release_driver_internal+0xec/0x19c)
    r5:00000000 r4:9efa8810
    [] (device_release_driver_internal) from [] (device_release_driver+0x20/0x24)
    r7:9f563410 r6:81c41ed0 r5:9efa8810 r4:9f4a1dac
    [] (device_release_driver) from [] (bus_remove_device+0xdc/0x108)
    [] (bus_remove_device) from [] (device_del+0x150/0x36c)
    r7:9f563410 r6:81c03048 r5:9efa8854 r4:9efa8810
    [] (device_del) from [] (platform_device_del.part.2+0x20/0x84)
    r10:9f563414 r9:809177e0 r8:81cb07dc r7:81c78320 r6:9f563454 r5:9efa8800
    r4:9efa8800
    [] (platform_device_del.part.2) from [] (platform_device_unregister+0x28/0x34)
    r5:9f563400 r4:9efa8800
    [] (platform_device_unregister) from [] (ci_hdrc_remove_device+0x1c/0x30)
    r5:9f563400 r4:00000001
    [] (ci_hdrc_remove_device) from [] (ci_hdrc_imx_remove+0x38/0x118)
    r7:81c78320 r6:9f563454 r5:9f563410 r4:9f541010
    [] (ci_hdrc_imx_shutdown) from [] (platform_drv_shutdown+0x2c/0x30)
    [] (platform_drv_shutdown) from [] (device_shutdown+0x158/0x1f0)
    [] (device_shutdown) from [] (kernel_restart_prepare+0x44/0x48)
    r10:00000058 r9:9f4d8000 r8:fee1dead r7:379ce700 r6:81c0b280 r5:81c03048
    r4:00000000
    [] (kernel_restart_prepare) from [] (kernel_restart+0x1c/0x60)
    [] (kernel_restart) from [] (__do_sys_reboot+0xe0/0x1d8)
    r5:81c03048 r4:00000000
    [] (__do_sys_reboot) from [] (sys_reboot+0x18/0x1c)
    r8:80101204 r7:00000058 r6:00000000 r5:00000000 r4:00000000
    [] (sys_reboot) from [] (ret_fast_syscall+0x0/0x54)
    Exception stack(0x9f4d9fa8 to 0x9f4d9ff0)
    9fa0: 00000000 00000000 fee1dead 28121969 01234567 379ce700
    9fc0: 00000000 00000000 00000000 00000058 00000000 00000000 00000000 00016d04
    9fe0: 00028e0c 7ec87c64 000135ec 76c1f410

    Restore original invalid input check in dma_common_free_remap() to
    avoid this problem.

    Fixes: 5cf4537975bb ("dma-mapping: introduce a dma_common_find_pages helper")
    Signed-off-by: Andrey Smirnov
    [hch: just revert the offending hunk instead of creating a new helper]
    Signed-off-by: Christoph Hellwig

    Andrey Smirnov
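
    A cut-down userspace model of the two checks. It assumes that an area
    created by a contiguous remap carries no page array, which would
    explain the false positive. The struct area_model type, the
    placeholder flag value and the function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

#define VM_DMA_COHERENT 0x10u /* placeholder value for this sketch */

/* Hypothetical, reduced model of a vmalloc area descriptor. */
struct area_model {
    unsigned int flags;
    void **pages; /* assumed to be NULL for contiguous remaps */
};

/* Restored check: reject missing areas and areas with the wrong flags. */
int is_invalid_original(const struct area_model *area)
{
    return !area || area->flags != VM_DMA_COHERENT;
}

/* Check from the reverted hunk: also reject areas without a page array. */
int is_invalid_strict(const struct area_model *area)
{
    return !area || area->flags != VM_DMA_COHERENT || !area->pages;
}

/* Usage: a contiguous remap (no page array) passes the original check only. */
void contiguous_remap_example(void)
{
    struct area_model contiguous = { VM_DMA_COHERENT, NULL };

    assert(!is_invalid_original(&contiguous));
    assert(is_invalid_strict(&contiguous));
}
```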
     
  • In commit 43d8ce9d65a5 ("Provide in-kernel headers to make
    extending kernel easier") a new mechanism was introduced, for kernels
    >=5.2, which embeds the kernel headers in the kernel image or a module
    and exposes them in procfs for use by userland tools.

    The archive containing the header files has nondeterminism caused by
    header files metadata. This patch normalizes the metadata and utilizes
    KBUILD_BUILD_TIMESTAMP if provided and otherwise falls back to the
    default behaviour.

    In commit f7b101d33046 ("kheaders: Move from proc to sysfs") it was
    modified to use sysfs and the script for generation of the archive was
    renamed to what is being patched.

    Signed-off-by: Dmitry Goldin
    Reviewed-by: Greg Kroah-Hartman
    Reviewed-by: Joel Fernandes (Google)
    Signed-off-by: Masahiro Yamada

    Dmitry Goldin
     
  • Commit 6dc280ebeed2 ("coda: remove uapi/linux/coda_psdev.h") removed
    the header in question. Some more build errors were fixed. Add more
    headers into the test coverage.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • Capitalize the first word in the sentence.

    Use obj-m instead of obj-y. obj-y still works, but we have no built-in
    objects in external module builds. So, obj-m is better IMHO.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • Geert Uytterhoeven reports a strange side-effect of commit 858805b336be
    ("kbuild: add $(BASH) to run scripts with bash-extension"), which
    inserts the contents of a localversion file in the build directory twice.

    [Steps to Reproduce]
    $ echo bar > localversion
    $ mkdir build
    $ cd build/
    $ echo foo > localversion
    $ make -s -f ../Makefile defconfig include/config/kernel.release
    $ cat include/config/kernel.release
    5.4.0-rc1foofoobar

    This comes down to the behavior change of local variables.

    The 'man sh' on my Ubuntu machine, where sh is an alias to dash,
    explains as follows:
    When a variable is made local, it inherits the initial value and
    exported and readonly flags from the variable with the same name
    in the surrounding scope, if there is one. Otherwise, the variable
    is initially unset.

    [Test Code]

    foo ()
    {
    local res
    echo "res: $res"
    }

    res=1
    foo

    [Result]

    $ sh test.sh
    res: 1
    $ bash test.sh
    res:

    So, scripts/setlocalversion correctly works only for bash in spite of
    its hashbang being #!/bin/sh. Nobody had noticed it before because
    CONFIG_SHELL was previously set to bash almost all the time.

    Now that CONFIG_SHELL is set to sh, we must write portable and correct
    code. I gave the Fixes tag to the commit that uncovered the issue.

    Clear the variable 'res' in collect_files() to make it work for sh
    (and it also works on distributions where sh is an alias to bash).

    Fixes: 858805b336be ("kbuild: add $(BASH) to run scripts with bash-extension")
    Reported-by: Geert Uytterhoeven
    Signed-off-by: Masahiro Yamada
    Tested-by: Geert Uytterhoeven

    Masahiro Yamada
     
  • The namespace.pl script does not work properly if objtree is not set to
    an absolute path. The do_nm function is run from within the find
    function, which changes directories.

    Because of this, appending objtree, $File::Find::dir, and $source, will
    return a path which is not valid from the current directory.

    This used to work when objtree was set to an absolute path when using
    "make namespacecheck". It appears to have not worked when calling
    ./scripts/namespace.pl directly.

    This behavior was changed in 7e1c04779efd ("kbuild: Use relative path
    for $(objtree)", 2014-05-14)

    Rather than fixing the Makefile to set objtree to an absolute path, just
    fix namespace.pl to work when srctree and objtree are relative. Also fix
    the script to use an absolute path for these by default.

    Use the File::Spec module for this purpose. It's been part of perl
    5 since 5.005.

    The curdir() function is used to get the current directory when the
    objtree and srctree aren't set in the environment.

    rel2abs() is used to convert possibly relative objtree and srctree
    environment variables to absolute paths.

    Finally, the catfile() function is used instead of string appending
    paths together, since this is more robust when joining paths together.

    Signed-off-by: Jacob Keller
    Acked-by: Randy Dunlap
    Tested-by: Randy Dunlap
    Signed-off-by: Masahiro Yamada

    Jacob Keller
     
  • Currently, all the logo C files are generated irrespective of the
    CONFIG options. Adding them to extra-y is wrong. What we need to do
    here is to add them to 'targets' so that if_changed works properly.

    Files listed in 'targets' are cleaned, so clean-files is unneeded.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • The pattern *.o is cleaned up globally by the top Makefile.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • The ima/ and evm/ sub-directories contain built-in objects, so
    obj-$(CONFIG_...) is the correct way to descend into them.

    subdir-$(CONFIG_...) is redundant.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • I guess commit 15ea0e1e3e18 ("efi: Import certificates from UEFI Secure
    Boot") attempted to add -fshort-wchar for building load_uefi.o, but it
    has never worked as intended.

    load_uefi.o is created in the platform_certs/ sub-directory. If you
    really want to add -fshort-wchar, the correct code is:

    $(obj)/platform_certs/load_uefi.o: KBUILD_CFLAGS += -fshort-wchar

    But, you do not need to fix it.

    Commit 8c97023cf051 ("Kbuild: use -fshort-wchar globally") had already
    added -fshort-wchar globally. This code was unneeded in the first place.

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     
  • nettest is missing from gitignore.

    Fixes: acda655fefae ("selftests: Add nettest")
    Signed-off-by: Jakub Kicinski
    Signed-off-by: David S. Miller

    Jakub Kicinski
     
  • In ql_alloc_large_buffers, a new skb is allocated via netdev_alloc_skb.
    This skb should be released if pci_dma_mapping_error fails.

    Fixes: 0f8ab89e825f ("qla3xxx: Check return code from pci_map_single() in ql_release_to_lrg_buf_free_list(), ql_populate_free_queue(), ql_alloc_large_buffers(), and ql3xxx_send()")
    Signed-off-by: Navid Emamdoost
    Signed-off-by: David S. Miller

    Navid Emamdoost
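
    The shape of this fix, releasing the freshly allocated buffer on the
    mapping-error path, can be modelled in userspace. Everything here is
    hypothetical scaffolding: malloc() stands in for the skb allocation and
    the map_fails argument simulates the mapping error:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int live_buffers; /* buffers allocated but not yet freed */

static void *buf_alloc(size_t len)
{
    void *p = malloc(len);

    if (p)
        live_buffers++;
    return p;
}

static void buf_free(void *p)
{
    if (p) {
        live_buffers--;
        free(p);
    }
}

int live_buffer_count(void)
{
    return live_buffers;
}

/*
 * Allocate a buffer, then "map" it. map_fails simulates the error branch.
 * Returns 0 and stores the buffer in *out on success, -1 on failure.
 */
int fill_one_buffer(size_t len, int map_fails, void **out)
{
    void *buf;

    assert(out != NULL);

    buf = buf_alloc(len);
    if (!buf)
        return -1;

    if (map_fails) {
        buf_free(buf); /* the fix: release the buffer before bailing out */
        return -1;
    }

    *out = buf;
    return 0;
}
```

    Without the buf_free() call in the error branch, every simulated
    mapping failure would leave live_buffer_count() one higher.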
     
  • syzbot reported a memory leak after a bind() has failed.

    While we are at it, abort the operation if kmemdup() has failed.

    BUG: memory leak
    unreferenced object 0xffff888105d83ec0 (size 32):
    comm "syz-executor067", pid 7207, jiffies 4294956228 (age 19.430s)
    hex dump (first 32 bytes):
    00 69 6c 65 20 72 65 61 64 00 6e 65 74 3a 5b 34 .ile read.net:[4
    30 32 36 35 33 33 30 39 37 5d 00 00 00 00 00 00 026533097]......
    backtrace:
    [] kmemleak_alloc_recursive /./include/linux/kmemleak.h:43 [inline]
    [] slab_post_alloc_hook /mm/slab.h:522 [inline]
    [] slab_alloc /mm/slab.c:3319 [inline]
    [] __do_kmalloc /mm/slab.c:3653 [inline]
    [] __kmalloc_track_caller+0x169/0x2d0 /mm/slab.c:3670
    [] kmemdup+0x27/0x60 /mm/util.c:120
    [] kmemdup /./include/linux/string.h:432 [inline]
    [] llcp_sock_bind+0x1b3/0x230 /net/nfc/llcp_sock.c:107
    [] __sys_bind+0x11c/0x140 /net/socket.c:1647
    [] __do_sys_bind /net/socket.c:1658 [inline]
    [] __se_sys_bind /net/socket.c:1656 [inline]
    [] __x64_sys_bind+0x1e/0x30 /net/socket.c:1656
    [] do_syscall_64+0x76/0x1a0 /arch/x86/entry/common.c:296
    [] entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Fixes: 30cc4587659e ("NFC: Move LLCP code to the NFC top level diirectory")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Make sure TCA_DSMARK_INDICES was provided by the user.

    syzbot reported :

    kasan: CONFIG_KASAN_INLINE enabled
    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 1 PID: 8799 Comm: syz-executor235 Not tainted 5.3.0+ #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    RIP: 0010:nla_get_u16 include/net/netlink.h:1501 [inline]
    RIP: 0010:dsmark_init net/sched/sch_dsmark.c:364 [inline]
    RIP: 0010:dsmark_init+0x193/0x640 net/sched/sch_dsmark.c:339
    Code: 85 db 58 0f 88 7d 03 00 00 e8 e9 1a ac fb 48 8b 9d 70 ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 04 48 89 fa 48 c1 ea 03 b6 14 02 48 89 f8 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 ca
    RSP: 0018:ffff88809426f3b8 EFLAGS: 00010247
    RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff85c6eb09
    RDX: 0000000000000000 RSI: ffffffff85c6eb17 RDI: 0000000000000004
    RBP: ffff88809426f4b0 R08: ffff88808c4085c0 R09: ffffed1015d26159
    R10: ffffed1015d26158 R11: ffff8880ae930ac7 R12: ffff8880a7e96940
    R13: dffffc0000000000 R14: ffff88809426f8c0 R15: 0000000000000000
    FS: 0000000001292880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000020000080 CR3: 000000008ca1b000 CR4: 00000000001406e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    qdisc_create+0x4ee/0x1210 net/sched/sch_api.c:1237
    tc_modify_qdisc+0x524/0x1c50 net/sched/sch_api.c:1653
    rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5223
    netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
    rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5241
    netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
    netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
    netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
    sock_sendmsg_nosec net/socket.c:637 [inline]
    sock_sendmsg+0xd7/0x130 net/socket.c:657
    ___sys_sendmsg+0x803/0x920 net/socket.c:2311
    __sys_sendmsg+0x105/0x1d0 net/socket.c:2356
    __do_sys_sendmsg net/socket.c:2365 [inline]
    __se_sys_sendmsg net/socket.c:2363 [inline]
    __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
    do_syscall_64+0xfa/0x760 arch/x86/entry/common.c:290
    entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x440369
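
    The missing check can be illustrated outside the kernel. The sketch
    below is a minimal userspace model, not the actual patch: struct attr,
    the ATTR_* constants and init_from_attrs() are hypothetical stand-ins
    for the parsed netlink attribute table, which holds NULL for every
    attribute the user did not send.

    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical stand-ins for the kernel's netlink attribute types. */
    enum { ATTR_INDICES = 1, ATTR_DEFAULT_INDEX = 2, ATTR_MAX = 2 };

    struct attr {
        uint16_t value;
    };

    /*
     * Read the mandatory attribute only after confirming it is present.
     * Dereferencing tb[ATTR_INDICES] without this check is the kind of
     * NULL pointer access shown in the report above.
     */
    static int init_from_attrs(struct attr *const tb[ATTR_MAX + 1],
                               uint16_t *indices)
    {
        assert(indices != NULL);

        if (!tb[ATTR_INDICES])
            return -EINVAL;     /* attribute was not provided by the user */

        *indices = tb[ATTR_INDICES]->value;
        return 0;
    }
    ```

    With this shape, a request that omits the attribute fails with -EINVAL
    instead of faulting.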

    Fixes: 758cc43c6d73 ("[PKT_SCHED]: Fix dsmark to apply changes consistent")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Russell King says:

    ====================
    Fix regression with AR8035 speed downgrade

    The following series attempts to address an issue spotted by tinywrkb
    with the AR8035 on the Cubox-i2 in a situation where the PHY downgrades
    the negotiated link.

    This is version 2, not much has changed other than rebasing on the
    current net tree. Changes have happened to patch 2 due to conflicts,
    so I dropped Andrew's reviewed-by. Minor context changes to patch 4
    which I don't consider important enough to warrant dropping the
    reviewed-by.

    Before commit 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in
    genphy_read_status"), we would read not only the link partner's
    advertisement, but also our own advertisement from the PHY registers,
    and use both to derive the PHY's current link mode. This works when the
    AR8035 downgrades the speed, because it appears that the AR8035 clears
    link mode bits in the advertisement registers as part of the downgrade.

    Commentary: what is not yet known is whether the AR8035 restores the
    advertisement register to the previous state when the link goes down.

    However, since the above referenced commit, we no longer use the PHY's
    advertisement registers, instead converting the link partner's
    advertisement to the ethtool link mode array, and combine that with
    phylib's cached version of our advertisement - which is not updated on
    speed downgrade.

    This results in phylib disagreeing with the actual operating mode of
    the PHY.
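
    The disagreement described above can be modelled in a few lines. This
    is a simplified sketch, not phylib code: the MODE_* bits and
    resolve_speed() are hypothetical, and only three full-duplex modes are
    considered. It resolves the link speed as the fastest mode common to
    both advertisements, once using what the PHY really advertises after a
    downgrade and once using a stale cached copy.

    ```c
    #include <assert.h>

    /* Hypothetical link mode bits; the kernel uses an ethtool bitmap. */
    #define MODE_10_FULL   (1u << 0)
    #define MODE_100_FULL  (1u << 1)
    #define MODE_1000_FULL (1u << 2)
    #define MODE_ALL       (MODE_10_FULL | MODE_100_FULL | MODE_1000_FULL)

    /* Fastest speed in Mb/s common to both advertisements, 0 if none. */
    static int resolve_speed(unsigned int local_adv, unsigned int lp_adv)
    {
        unsigned int common = local_adv & lp_adv;

        if (common & MODE_1000_FULL)
            return 1000;
        if (common & MODE_100_FULL)
            return 100;
        if (common & MODE_10_FULL)
            return 10;
        return 0;
    }

    /* Shows the mismatch after a downgrade from 1000 to 100 Mb/s. */
    static void downgrade_example(void)
    {
        unsigned int cached_adv = MODE_ALL;                /* software copy */
        unsigned int phy_adv = MODE_ALL & ~MODE_1000_FULL; /* PHY cleared 1000 */
        unsigned int lp_adv = MODE_ALL;                    /* link partner */

        assert(resolve_speed(phy_adv, lp_adv) == 100);     /* matches the PHY */
        assert(resolve_speed(cached_adv, lp_adv) == 1000); /* stale answer */
    }
    ```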

    Commentary: I wonder how many more PHY drivers are broken by this
    commit, but have yet to be discovered.

    The obvious way to address this would be to disable the downgrade
    feature, and indeed this does fix the problem in tinywrkb's case - his
    link partner instead downgrades the speed by reducing its
    advertisement, resulting in phylib correctly evaluating a slower speed.

    However, it has a serious drawback - the gigabit control register (MII
    register 9) appears to become read only. It seems the only way to
    update the register is to re-enable the downgrade feature, reset the
    PHY, changing register 9, disable the downgrade feature, and reset the
    PHY again.

    This series attempts to address the problem using a different approach,
    similar to the approach taken with Marvell PHYs. The AR8031, AR8033
    and AR8035 have a PHY-Specific Status register which reports the
    actual operating mode of the PHY - both speed and duplex. This
    register correctly reports the operating mode irrespective of whether
    autoneg is enabled or not. We use this register to fill in phylib's
    speed and duplex parameters.

    In detail:

    Patch 1 fixes a bug where writing to register 9 does not update
    phylib's advertisement mask in the same way that writing register 4
    does; this looks like an omission from when gigabit PHY support came
    into being.

    Patch 2 separates the generic phylib code which reads the link partner's
    advertisement from the PHY, so that we can re-use this in the Atheros
    PHY driver.

    Patch 3 separates the generic phylib pause mode; phylib provides no
    help for MAC drivers to ascertain the negotiated pause mode, it merely
    copies the link partner's pause mode bits into its own variables.

    Commentary: Both the aforementioned Atheros PHYs and Marvell PHYs
    provide the resolved pause modes in terms of whether we should
    transmit pause frames, or whether we should allow reception of pause
    frames. Surely the resolution of this should be in phylib?

    Patch 4 provides the Atheros PHY driver with a private "read_status"
    implementation that fills in phylib's speed and duplex settings
    depending on the PHY-Specific status register. This ensures that
    phylib and the MAC driver match the operating mode that the PHY has
    decided to use. Since the register also gives us MDIX status, we
    can trivially fill that status in as well.

    Note that, although the bits mentioned in this patch for this register
    match those in the Marvell PHY driver, and it is located at the same
    address, the meaning of other register bits varies between the PHYs.
    Therefore, I do not feel that it would be appropriate to make this some
    kind of generic function.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Read the PHY-specific status register for the current operating mode
    (speed and duplex) of the PHY. This register reflects the actual
    mode that the PHY has resolved depending on either the advertisements
    if autoneg is enabled, or the forced mode if autoneg is disabled.

    This ensures that phylib's software state always tracks the hardware
    state.

    It seems both AR8033 (which uses the AR8031 ID) and AR8035 support
    this status register. Whether AR8030 does is not known at the present time.

    This patch depends on "net: phy: extract pause mode" and "net: phy:
    extract link partner advertisement reading".
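
    As an illustration of reading the operating mode from such a status
    register, here is a userspace sketch. The bit positions are an
    assumption made for the example (speed in bits 15:14, duplex in bit
    13, a "resolved" flag in bit 11) and should be checked against the
    datasheet; decode_status() and struct link_state are hypothetical
    names, not the driver's API.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Assumed layout of the PHY-specific status register. */
    #define SS_SPEED_MASK   (3u << 14)
    #define SS_SPEED_1000   (2u << 14)
    #define SS_SPEED_100    (1u << 14)
    #define SS_SPEED_10     (0u << 14)
    #define SS_DUPLEX_FULL  (1u << 13)
    #define SS_RESOLVED     (1u << 11)

    struct link_state {
        int speed;        /* Mb/s, or -1 when unknown */
        int full_duplex;  /* 1 = full, 0 = half, -1 = unknown */
    };

    /* Translate a raw status value into speed and duplex. */
    static struct link_state decode_status(uint16_t ss)
    {
        struct link_state st = { .speed = -1, .full_duplex = -1 };

        if (!(ss & SS_RESOLVED))
            return st;          /* the PHY has not settled on a mode yet */

        switch (ss & SS_SPEED_MASK) {
        case SS_SPEED_1000:
            st.speed = 1000;
            break;
        case SS_SPEED_100:
            st.speed = 100;
            break;
        case SS_SPEED_10:
            st.speed = 10;
            break;
        default:
            break;              /* reserved encoding: leave speed unknown */
        }
        st.full_duplex = (ss & SS_DUPLEX_FULL) ? 1 : 0;
        return st;
    }

    /* A resolved 100 Mb/s full-duplex link under the assumed layout. */
    static void decode_example(void)
    {
        struct link_state st = decode_status(SS_RESOLVED | SS_SPEED_100 |
                                             SS_DUPLEX_FULL);

        assert(st.speed == 100 && st.full_duplex == 1);
    }
    ```

    Because the value comes from the PHY itself, it stays correct after a
    speed downgrade, which is the property the patch relies on.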

    Reported-by: tinywrkb
    Reviewed-by: Andrew Lunn
    Tested-by: tinywrkb
    Fixes: 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status")
    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Extract the update of phylib's software pause mode state from
    genphy_read_status(), so that we can re-use this functionality with
    PHYs that have alternative ways to read the negotiation results.
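
    The extracted code only records the link partner's pause bits; turning
    them into transmit and receive decisions is left to the MAC driver. A
    sketch of that resolution, following the usual Pause/Asym Pause
    priority rules from IEEE 802.3 autonegotiation, might look like the
    following. The struct and function names are hypothetical and this is
    not code from the patch.

    ```c
    #include <assert.h>
    #include <stdbool.h>

    struct pause_adv {
        bool pause;   /* symmetric Pause bit */
        bool asym;    /* Asym Pause bit */
    };

    struct pause_result {
        bool tx;      /* we may transmit pause frames */
        bool rx;      /* we should honour received pause frames */
    };

    /* Resolve pause from the local and link partner advertisements. */
    static struct pause_result resolve_pause(struct pause_adv local,
                                             struct pause_adv lp)
    {
        struct pause_result res = { false, false };

        if (local.pause && lp.pause) {
            res.tx = true;          /* both sides symmetric capable */
            res.rx = true;
        } else if (local.asym && lp.asym) {
            if (local.pause)
                res.rx = true;      /* partner sends, we receive */
            else if (lp.pause)
                res.tx = true;      /* we send, partner receives */
        }
        return res;
    }

    /* Both ends advertising symmetric pause yields pause both ways. */
    static void pause_example(void)
    {
        struct pause_adv sym = { true, false };
        struct pause_result r = resolve_pause(sym, sym);

        assert(r.tx && r.rx);
    }
    ```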

    Tested-by: tinywrkb
    Reviewed-by: Andrew Lunn
    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Move reading the link partner advertisement out of genphy_read_status()
    into its own separate function. This will allow re-use of this code by
    PHY drivers that are able to read the resolved status from the PHY.

    Tested-by: tinywrkb
    Signed-off-by: Russell King
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Russell King
     
  • When userspace writes to the MII_ADVERTISE register, we update phylib's
    advertising mask and trigger a renegotiation. However, writing to the
    MII_CTRL1000 register, which contains the gigabit advertisement, does
    neither. This can lead to phylib's copy of the advertisement becoming
    de-synced with the values in the PHY register set, which can result in
    incorrect negotiation resolution.
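
    A userspace sketch of the idea follows. The register numbers and
    ADVERTISE_1000* bits follow the standard MII definitions, while
    struct phy_sw_state, the SW_ADV_* bits and on_mii_write() are
    hypothetical stand-ins for phylib's state, not the code in the patch.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define MII_CTRL1000        0x09    /* 1000BASE-T control register */
    #define ADVERTISE_1000HALF  0x0100
    #define ADVERTISE_1000FULL  0x0200

    /* Hypothetical software-side advertisement bits. */
    #define SW_ADV_1000HALF (1u << 0)
    #define SW_ADV_1000FULL (1u << 1)

    struct phy_sw_state {
        unsigned int advertising;   /* cached copy of what we advertise */
        bool restart_aneg;          /* renegotiation requested */
    };

    /* Mirror a userspace register write into the cached state. */
    static void on_mii_write(struct phy_sw_state *phy, int regnum,
                             uint16_t val)
    {
        assert(phy != NULL);

        if (regnum != MII_CTRL1000)
            return;

        /* Replace the cached gigabit bits with what was written. */
        phy->advertising &= ~(SW_ADV_1000HALF | SW_ADV_1000FULL);
        if (val & ADVERTISE_1000HALF)
            phy->advertising |= SW_ADV_1000HALF;
        if (val & ADVERTISE_1000FULL)
            phy->advertising |= SW_ADV_1000FULL;

        phy->restart_aneg = true;   /* new advertisement needs autoneg */
    }
    ```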

    Fixes: 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in genphy_read_status")
    Reviewed-by: Andrew Lunn
    Signed-off-by: Russell King
    Signed-off-by: David S. Miller

    Russell King
     
  • Rajendra reported a kernel panic when a link was taken down:

    [ 6870.263084] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8
    [ 6870.271856] IP: [] __ipv6_ifa_notify+0x154/0x290

    [ 6870.570501] Call Trace:
    [ 6870.573238] [] ? ipv6_ifa_notify+0x26/0x40
    [ 6870.579665] [] ? addrconf_dad_completed+0x4c/0x2c0
    [ 6870.586869] [] ? ipv6_dev_mc_inc+0x196/0x260
    [ 6870.593491] [] ? addrconf_dad_work+0x10a/0x430
    [ 6870.600305] [] ? __switch_to_asm+0x34/0x70
    [ 6870.606732] [] ? process_one_work+0x18a/0x430
    [ 6870.613449] [] ? worker_thread+0x4d/0x490
    [ 6870.619778] [] ? process_one_work+0x430/0x430
    [ 6870.626495] [] ? kthread+0xd9/0xf0
    [ 6870.632145] [] ? __switch_to_asm+0x34/0x70
    [ 6870.638573] [] ? kthread_park+0x60/0x60
    [ 6870.644707] [] ? ret_from_fork+0x57/0x70
    [ 6870.650936] Code: 31 c0 31 d2 41 b9 20 00 08 02 b9 09 00 00 0

    addrconf_dad_work is kicked to be scheduled when a device is brought
    up. There is a race between addrconf_dad_work getting scheduled and
    taking the rtnl lock and a process taking the link down (under rtnl).
    The latter removes the host route from the inet6_addr as part of
    addrconf_ifdown which is run for NETDEV_DOWN. The former attempts
    to use the host route in __ipv6_ifa_notify. If the down event removes
    the host route due to the race to the rtnl, then the BUG listed above
    occurs.

    Since the DAD sequence can not be aborted, add a check for the missing
    host route in __ipv6_ifa_notify. The only way this should happen is due
    to the previously mentioned race. The host route is created when the
    address is added to an interface; it is only removed on a down event
    where the address is kept. Add a warning if the host route is missing
    AND the device is up; this is a situation that should never happen.
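
    The decision described above reduces to a three-way check. The sketch
    below models it in userspace with hypothetical types (struct
    addr_entry, enum notify_result); it is not the kernel code, which
    works on inet6_ifaddr and inserts the route into the FIB.

    ```c
    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    struct host_route {
        int id;
    };

    struct addr_entry {
        struct host_route *rt;  /* NULL once a down event removed it */
        bool dev_up;
    };

    enum notify_result {
        ROUTE_INSERTED,         /* normal path: host route is present */
        SKIPPED_DEVICE_DOWN,    /* lost the race with ifdown: do nothing */
        WARNED_MISSING_ROUTE,   /* device up but no route: should not happen */
    };

    /* Decide what to do when a new address is being announced. */
    static enum notify_result notify_new_addr(const struct addr_entry *ifp)
    {
        assert(ifp != NULL);

        if (ifp->rt)
            return ROUTE_INSERTED;
        if (ifp->dev_up)
            return WARNED_MISSING_ROUTE;
        return SKIPPED_DEVICE_DOWN;
    }
    ```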

    Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional")
    Reported-by: Rajendra Dendukuri
    Signed-off-by: David Ahern
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    David Ahern
     
  • mdio_device_reset() makes use of the atomic-pretending API flavor for
    handling the PHY reset GPIO line.

    I found no hint that mdio_device_reset() is called from atomic context,
    and indeed it has used usleep_range() for a long time, so I assume that
    it is OK to sleep there.

    This patch switches to gpiod_set_value_cansleep() in mdio_device_reset().
    This is relevant if e.g. the PHY reset line is tied to an I2C GPIO
    controller.

    This has been tested on a ZynqMP board running an upstream 4.19 kernel and
    then hand-ported to the current kernel tree.

    Signed-off-by: Andrea Merello
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrea Merello
     
  • Since commit c09551c6ff7f ("net: ipv4: use a dedicated counter
    for icmp_v4 redirect packets") we use 'n_redirects' to account
    for redirect packets, but we still use 'rate_tokens' to compute
    the redirect packets exponential backoff.

    If the device sent any ICMP error packet to the relevant peer after
    sending a redirect, it will also update 'rate_tokens' according to
    the leaky bucket scheme; typically 'rate_tokens' will rise above
    BITS_PER_LONG and the redirect packets backoff algorithm will produce
    undefined behavior.

    Fix the issue using 'n_redirects' to compute the exponential backoff
    in ip_rt_send_redirect().

    Note that we still clear rate_tokens after a redirect silence period,
    to avoid changing an established behaviour.

    The root cause predates git history; before the mentioned commit, in
    the critical scenario the kernel stopped sending redirects, while
    after the mentioned commit the behavior is more random.
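
    To see why the shift count matters, consider this userspace sketch of
    an exponential backoff keyed on a bounded redirect counter. The
    function and parameter names are hypothetical and time is a plain
    counter rather than jiffies. Shifting by a count of BITS_PER_LONG or
    more is undefined in C, so the counter must be capped before it is
    used as a shift.

    ```c
    #include <assert.h>
    #include <limits.h>
    #include <stdbool.h>

    #define BITS_PER_LONG ((unsigned int)(sizeof(long) * CHAR_BIT))

    /*
     * Return true if another redirect may be sent now.
     * The delay doubles with each redirect already sent: load << n.
     */
    static bool redirect_allowed(unsigned long now, unsigned long rate_last,
                                 unsigned long load, unsigned int n_redirects,
                                 unsigned int max_redirects)
    {
        /* The cap keeps every shift below the width of the type. */
        assert(max_redirects < BITS_PER_LONG);

        if (n_redirects >= max_redirects)
            return false;       /* give up until the counter is reset */
        if (n_redirects == 0)
            return true;        /* first redirect is sent immediately */

        return now - rate_last > (load << n_redirects);
    }
    ```

    A counter that only counts redirects stays within the cap, whereas a
    token count that other ICMP errors also update has no such bound.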

    Reported-by: Xiumei Mu
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Fixes: c09551c6ff7f ("net: ipv4: use a dedicated counter for icmp_v4 redirect packets")
    Signed-off-by: Paolo Abeni
    Acked-by: Lorenzo Bianconi
    Signed-off-by: David S. Miller

    Paolo Abeni
     
  • r8152 may fail to establish network connection after resume from system
    suspend.

    If the USB port that the r8152 is connected to loses power during
    system suspend, the MAC address that was written earlier is lost,
    because it is not written again in the reset_resume callback.

    So let's set the MAC address again in the reset_resume callback. Also
    remove the unnecessary lock, as no other locking attempt will happen
    during reset_resume.

    Signed-off-by: Kai-Heng Feng
    Signed-off-by: David S. Miller

    Kai-Heng Feng
     
  • When fetching free MSI-X vectors for ULDs, check for the error code
    before accessing MSI-X info array. Otherwise, an out-of-bounds access is
    attempted, which results in kernel panic.
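
    The pattern is to test the returned index before using it. Below is a
    minimal userspace sketch with hypothetical names (struct msix_info,
    get_free_msix_idx(), claim_vector()); it is not the cxgb4 code.

    ```c
    #include <assert.h>
    #include <errno.h>
    #include <stddef.h>

    struct msix_info {
        int in_use;
    };

    /* Return the index of a free entry, or -ENOSPC when none is left. */
    static int get_free_msix_idx(const struct msix_info tbl[], int n)
    {
        for (int i = 0; i < n; i++) {
            if (!tbl[i].in_use)
                return i;
        }
        return -ENOSPC;
    }

    /* Claim a free vector; a negative return is an error code. */
    static int claim_vector(struct msix_info tbl[], int n)
    {
        int idx;

        assert(tbl != NULL);

        idx = get_free_msix_idx(tbl, n);
        if (idx < 0)
            return idx;         /* check first: tbl[idx] would be out of bounds */

        tbl[idx].in_use = 1;
        return idx;
    }
    ```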

    Fixes: 94cdb8bb993a ("cxgb4: Add support for dynamic allocation of resources for ULD")
    Signed-off-by: Shahjada Abul Husain
    Signed-off-by: Vishal Kulkarni
    Signed-off-by: David S. Miller

    Vishal Kulkarni
     
  • This reverts commit a3ce2a21bb8969ae27917281244fa91bf5f286d7.

    Eric reported test failures with this commit. After digging into it,
    the bottom line is that the DAD sequence is not to be messed with.
    There are too many cases that are expected to proceed regardless
    of whether a device is up.

    Revert the patch and I will send a different solution for the
    problem Rajendra reported.

    Signed-off-by: David Ahern
    Cc: Eric Dumazet
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    David Ahern