05 Jan, 2020

3 commits

  • Make the layout of kcov_remote_arg the same for 32-bit and 64-bit code.
    This makes it more convenient to write userspace apps that can be
    compiled into 32-bit or 64-bit binaries and still work with the same
    64-bit kernel.

    Also use proper __u32 types in uapi headers instead of unsigned ints.
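
    A minimal userspace sketch of the layout idea follows. The struct name is
    hypothetical and the field names are recalled from the kcov uapi header,
    so treat them as assumptions. The point is only that fixed-width fields
    plus an explicitly 8-byte-aligned 64-bit member give the same offsets in
    32-bit and 64-bit builds.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical mirror of a remote-coverage argument block.
 * Without the explicit alignment, a 32-bit x86 build would place the
 * 64-bit member at offset 12, while a 64-bit build places it at 16.
 */
struct remote_arg_sketch {
	uint32_t trace_mode;
	uint32_t area_size;
	uint32_t num_handles;
	alignas(8) uint64_t common_handle;	/* always at offset 16 */
	alignas(8) uint64_t handles[];		/* always at offset 24 */
};

/* Compile-time checks that hold for both -m32 and -m64 builds. */
static_assert(offsetof(struct remote_arg_sketch, common_handle) == 16,
	      "common_handle must sit on an 8-byte boundary");
static_assert(offsetof(struct remote_arg_sketch, handles) == 24,
	      "handles must follow common_handle directly");
static_assert(sizeof(struct remote_arg_sketch) == 24,
	      "fixed part must have the same size on 32-bit and 64-bit");
```

    Building the same file with -m32 and -m64 should pass all three checks.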

    Link: http://lkml.kernel.org/r/9e91020876029cfefc9211ff747685eba9536426.1575638983.git.andreyknvl@google.com
    Fixes: eec028c9386ed1a ("kcov: remote coverage support")
    Signed-off-by: Andrey Konovalov
    Acked-by: Marco Elver
    Cc: Greg Kroah-Hartman
    Cc: Alan Stern
    Cc: Felipe Balbi
    Cc: Chunfeng Yun
    Cc: "Jacky . Cao @ sony . com"
    Cc: Dmitry Vyukov
    Cc: Alexander Potapenko
    Cc: Marco Elver
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Konovalov
     
  • We currently try to shrink a single zone when removing memory. We use
    the zone of the first page of the memory we are removing. If that
    memmap was never initialized (e.g., memory was never onlined), we will
    read garbage and can trigger kernel BUGs (due to a stale pointer):

    BUG: unable to handle page fault for address: 000000000000353d
    #PF: supervisor write access in kernel mode
    #PF: error_code(0x0002) - not-present page
    PGD 0 P4D 0
    Oops: 0002 [#1] SMP PTI
    CPU: 1 PID: 7 Comm: kworker/u8:0 Not tainted 5.3.0-rc5-next-20190820+ #317
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.4
    Workqueue: kacpi_hotplug acpi_hotplug_work_fn
    RIP: 0010:clear_zone_contiguous+0x5/0x10
    Code: 48 89 c6 48 89 c3 e8 2a fe ff ff 48 85 c0 75 cf 5b 5d c3 c6 85 fd 05 00 00 01 5b 5d c3 0f 1f 840
    RSP: 0018:ffffad2400043c98 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000200000000 RCX: 0000000000000000
    RDX: 0000000000200000 RSI: 0000000000140000 RDI: 0000000000002f40
    RBP: 0000000140000000 R08: 0000000000000000 R09: 0000000000000001
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000140000
    R13: 0000000000140000 R14: 0000000000002f40 R15: ffff9e3e7aff3680
    FS: 0000000000000000(0000) GS:ffff9e3e7bb00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000000000000353d CR3: 0000000058610000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
    __remove_pages+0x4b/0x640
    arch_remove_memory+0x63/0x8d
    try_remove_memory+0xdb/0x130
    __remove_memory+0xa/0x11
    acpi_memory_device_remove+0x70/0x100
    acpi_bus_trim+0x55/0x90
    acpi_device_hotplug+0x227/0x3a0
    acpi_hotplug_work_fn+0x1a/0x30
    process_one_work+0x221/0x550
    worker_thread+0x50/0x3b0
    kthread+0x105/0x140
    ret_from_fork+0x3a/0x50
    Modules linked in:
    CR2: 000000000000353d

    Instead, shrink the zones when offlining memory or when onlining failed.
    Introduce and use remove_pfn_range_from_zone() for that. We now
    properly shrink the zones, even if we have DIMMs whereby

    - Some memory blocks fall into no zone (never onlined)

    - Some memory blocks fall into multiple zones (offlined+re-onlined)

    - Multiple memory blocks that fall into different zones

    Drop the zone parameter (with a potentially dubious value) from
    __remove_pages() and __remove_section().

    Link: http://lkml.kernel.org/r/20191006085646.5768-6-david@redhat.com
    Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b319]
    Signed-off-by: David Hildenbrand
    Reviewed-by: Oscar Salvador
    Cc: Michal Hocko
    Cc: "Matthew Wilcox (Oracle)"
    Cc: "Aneesh Kumar K.V"
    Cc: Pavel Tatashin
    Cc: Greg Kroah-Hartman
    Cc: Dan Williams
    Cc: Logan Gunthorpe
    Cc: [5.0+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     
  • Pull dmaengine fixes from Vinod Koul:
    "A bunch of fixes for:

    - uninitialized dma_slave_caps access

    - virt-dma use after free in vchan_complete()

    - driver fixes for ioat, k3dma and jz4780"

    * tag 'dmaengine-fix-5.5-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
    ioat: ioat_alloc_ring() failure handling.
    dmaengine: virt-dma: Fix access after free in vchan_complete()
    dmaengine: k3dma: Avoid null pointer traversal
    dmaengine: dma-jz4780: Also break descriptor chains on JZ4725B
    dmaengine: Fix access to uninitialized dma_slave_caps

    Linus Torvalds
     

04 Jan, 2020

1 commit

  • Pull block fixes from Jens Axboe:
    "Three fixes in here:

    - Fix for a missing split on default memory boundary mask (4G) (Ming)

    - Fix for multi-page read bio truncate (Ming)

    - Fix for null_blk zone close request handling (Damien)"

    * tag 'block-5.5-20200103' of git://git.kernel.dk/linux-block:
    null_blk: Fix REQ_OP_ZONE_CLOSE handling
    block: fix splitting segments on boundary masks
    block: add bio_truncate to fix guard_bio_eod

    Linus Torvalds
     

03 Jan, 2020

2 commits

  • Pull final sizeof_field conversion from Kees Cook:
    "Remove now unused FIELD_SIZEOF() macro (Kees Cook)"

    * tag 'sizeof_field-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    kernel.h: Remove unused FIELD_SIZEOF()

    Linus Torvalds
     
  • This reverts commit 8243186f0cc7 ("fs: remove ksys_dup()") and the
    subsequent fix for it in commit 2d3145f8d280 ("early init: fix error
    handling when opening /dev/console").

    Trying to use filp_open() and f_dupfd() instead of pseudo-syscalls
    caused more trouble than it is worth: it requires accessing vfs
    internals and it turns out there were other bugs in it too.

    In particular, the file reference counting was wrong - because unlike
    the original "open+2*dup" sequence it used "filp_open+3*f_dupfd" and
    thus had an extra leaked file reference.
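
    The counting can be illustrated with a toy model. This is not the kernel
    code: the toy_* names are hypothetical, and the model only assumes that
    filp_open() returns one reference without installing a descriptor while
    each f_dupfd() installs a descriptor holding its own reference.

```c
#include <assert.h>

/* Toy stand-in for struct file: only the reference count matters here. */
struct toy_file {
	int refs;
	int fds;	/* descriptors currently pointing at the file */
};

/* open(): one reference, owned by the newly installed descriptor. */
static void toy_open(struct toy_file *f) { f->refs = 1; f->fds = 1; }

/* dup() and f_dupfd(): a new descriptor that takes its own reference. */
static void toy_dup(struct toy_file *f) { f->refs++; f->fds++; }

/* filp_open(): one reference held by the caller, no descriptor. */
static void toy_filp_open(struct toy_file *f) { f->refs = 1; f->fds = 0; }

/* Closing every descriptor drops exactly one reference per descriptor. */
static int toy_close_all(struct toy_file *f)
{
	f->refs -= f->fds;
	f->fds = 0;
	return f->refs;	/* references that outlive all descriptors */
}

/* Original sequence: open + 2 * dup. */
int leaked_refs_open_dup(void)
{
	struct toy_file f;

	toy_open(&f);
	toy_dup(&f);
	toy_dup(&f);
	return toy_close_all(&f);
}

/* Replacement sequence: filp_open + 3 * f_dupfd, without a final put. */
int leaked_refs_filp_open_dupfd(void)
{
	struct toy_file f;

	toy_filp_open(&f);
	toy_dup(&f);
	toy_dup(&f);
	toy_dup(&f);
	return toy_close_all(&f);
}
```

    In this model the first sequence ends with no references left, while the
    second keeps the caller's reference from the initial open alive.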

    That in turn then caused odd problems with Androidx86 long after boot
    because of how the extra reference to the console kept the session active
    even after all file descriptors had been closed.

    Reported-by: youling 257
    Cc: Arvind Sankar
    Cc: Al Viro
    Signed-off-by: Dominik Brodowski
    Signed-off-by: Linus Torvalds

    Dominik Brodowski
     

01 Jan, 2020

1 commit

  • Pull networking fixes from David Miller:

    1) Fix big endian overflow in nf_flow_table, from Arnd Bergmann.

    2) Fix port selection on big endian in nft_tproxy, from Phil Sutter.

    3) Fix precision tracking for unbound scalars in bpf verifier, from
    Daniel Borkmann.

    4) Fix integer overflow in socket rcvbuf check in UDP, from Antonio
    Messina.

    5) Do not perform a neigh confirmation during a pmtu update over a
    tunnel, from Hangbin Liu.

    6) Fix DMA mapping leak in dpaa_eth driver, from Madalin Bucur.

    7) Various PTP fixes for sja1105 dsa driver, from Vladimir Oltean.

    8) Add missing inline to dummy definition of of_mdiobus_child_is_phy(),
    from Geert Uytterhoeven.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
    hsr: fix slab-out-of-bounds Read in hsr_debugfs_rename()
    net/sched: add delete_empty() to filters and use it in cls_flower
    tcp: Fix highest_sack and highest_sack_seq
    ptp: fix the race between the release of ptp_clock and cdev
    net: dsa: sja1105: Reconcile the meaning of TPID and TPID2 for E/T and P/Q/R/S
    Documentation: net: dsa: sja1105: Remove text about taprio base-time limitation
    net: dsa: sja1105: Remove restriction of zero base-time for taprio offload
    net: dsa: sja1105: Really make the PTP command read-write
    net: dsa: sja1105: Take PTP egress timestamp by port, not mgmt slot
    cxgb4/cxgb4vf: fix flow control display for auto negotiation
    mlxsw: spectrum: Use dedicated policer for VRRP packets
    mlxsw: spectrum_router: Skip loopback RIFs during MAC validation
    net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs
    net/sched: act_mirred: Pull mac prior redir to non mac_header_xmit device
    net_sched: sch_fq: properly set sk->sk_pacing_status
    bnx2x: Fix accounting of vlan resources among the PFs
    bnx2x: Use appropriate define for vlan credit
    of: mdio: Add missing inline to of_mdiobus_child_is_phy() dummy
    net: phy: aquantia: add suspend / resume ops for AQR105
    dpaa_eth: fix DMA mapping leak
    ...

    Linus Torvalds
     

31 Dec, 2019

3 commits

  • Revert "net/sched: cls_u32: fix refcount leak in the error path of
    u32_change()", and fix the u32 refcount leak in a more generic way that
    preserves the semantic of rule dumping.
    On tc filters that don't support lockless insertion/removal, there is no
    need to guard against concurrent insertion when a removal is in progress.
    Therefore, for most of them we can avoid a full walk() when deleting, and
    just decrease the refcount, like it was done on older Linux kernels.
    This fixes situations where walk() was wrongly detecting a non-empty
    filter, like it happened with cls_u32 in the error path of change(), thus
    leading to failures in the following tdc selftests:

    6aa7: (filter, u32) Add/Replace u32 with source match and invalid indev
    6658: (filter, u32) Add/Replace u32 with custom hash table and invalid handle
    74c2: (filter, u32) Add/Replace u32 filter with invalid hash table id

    On cls_flower, and on (future) lockless filters, this check is necessary:
    move all the check_empty() logic in a callback so that each filter
    can have its own implementation. For cls_flower, it's sufficient to check
    if no IDRs have been allocated.

    This reverts commit 275c44aa194b7159d1191817b20e076f55f0e620.

    Changes since v1:
    - document the need for delete_empty() when TCF_PROTO_OPS_DOIT_UNLOCKED
    is used, thanks to Vlad Buslov
    - implement delete_empty() without doing fl_walk(), thanks to Vlad Buslov
    - squash revert and new fix in a single patch, to be nice with bisect
    tests that run tdc on u32 filter, thanks to Dave Miller

    Fixes: 275c44aa194b ("net/sched: cls_u32: fix refcount leak in the error path of u32_change()")
    Fixes: 6676d5e416ee ("net: sched: set dedicated tcf_walker flag when tp is empty")
    Suggested-by: Jamal Hadi Salim
    Suggested-by: Vlad Buslov
    Signed-off-by: Davide Caratti
    Reviewed-by: Vlad Buslov
    Tested-by: Jamal Hadi Salim
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Davide Caratti
     
  • In a case when a ptp chardev (like /dev/ptp0) is open but an underlying
    device is removed, closing this file leads to a race. This reproduces
    easily in a kvm virtual machine:

    ts# cat openptp0.c
    int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); }
    ts# uname -r
    5.5.0-rc3-46cf053e
    ts# cat /proc/cmdline
    ... slub_debug=FZP
    ts# modprobe ptp_kvm
    ts# ./openptp0 &
    [1] 670
    opened /dev/ptp0, sleeping 10s...
    ts# rmmod ptp_kvm
    ts# ls /dev/ptp*
    ls: cannot access '/dev/ptp*': No such file or directory
    ts# ...woken up
    [ 48.010809] general protection fault: 0000 [#1] SMP
    [ 48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e #25
    [ 48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
    [ 48.016270] RIP: 0010:module_put.part.0+0x7/0x80
    [ 48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202
    [ 48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0
    [ 48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b
    [ 48.019470] ... ^^^ a slub poison
    [ 48.023854] Call Trace:
    [ 48.024050] __fput+0x21f/0x240
    [ 48.024288] task_work_run+0x79/0x90
    [ 48.024555] do_exit+0x2af/0xab0
    [ 48.024799] ? vfs_write+0x16a/0x190
    [ 48.025082] do_group_exit+0x35/0x90
    [ 48.025387] __x64_sys_exit_group+0xf/0x10
    [ 48.025737] do_syscall_64+0x3d/0x130
    [ 48.026056] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 48.026479] RIP: 0033:0x7f53b12082f6
    [ 48.026792] ...
    [ 48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm]
    [ 48.045001] Fixing recursive fault but reboot is needed!

    This happens in:

    static void __fput(struct file *file)
    { ...
    if (file->f_op->release)
    file->f_op->release(inode, file); <<< cdev is kfree'd here
    if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
    !(mode & FMODE_PATH))) {
    cdev_put(inode->i_cdev); <<< cdev fields are accessed here

    Namely:

    __fput()
    posix_clock_release()
    kref_put(&clk->kref, delete_clock) <<< the last reference
    delete_clock()
    delete_ptp_clock()
    kfree(ptp) <<< cdev is embedded in ptp
    cdev_put
    module_put(p->owner) <<< *p is kfree'd, bang!

    Here cdev is embedded in posix_clock which is embedded in ptp_clock.
    The race happens because ptp_clock's lifetime is controlled by two
    refcounts: kref and cdev.kobj in posix_clock. This is wrong.

    Make ptp_clock's sysfs device a parent of cdev with cdev_device_add()
    created especially for such cases. This way the parent device with its
    ptp_clock is not released until all references to the cdev are released.
    This adds a requirement that an initialized but not exposed struct
    device should be provided to posix_clock_register() by a caller instead
    of a simple dev_t.

    This approach was adopted from the commit 72139dfa2464 ("watchdog: Fix
    the race between the release of watchdog_core_data and cdev"). See
    details of the implementation in the commit 233ed09d7fda ("chardev: add
    helper function to register char devs with a struct device").

    Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.com/T/#u
    Analyzed-by: Stephen Johnston
    Analyzed-by: Vern Lovejoy
    Signed-off-by: Vladis Dronov
    Acked-by: Richard Cochran
    Signed-off-by: David S. Miller

    Vladis Dronov
     
  • Now that all callers of FIELD_SIZEOF() have been converted to
    sizeof_field(), remove the unused prior macro.
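
    For readers unfamiliar with the surviving macro, a small userspace sketch
    follows. The macro body is written from memory of the kernel definition
    and the struct is purely illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of a struct member without needing an instance of the struct. */
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Illustrative structure, not taken from the kernel. */
struct packet_hdr {
	uint8_t  version;
	uint16_t length;
	uint8_t  payload[12];
};

/* sizeof does not evaluate its operand, so the null pointer is never read. */
size_t payload_size(void)
{
	return sizeof_field(struct packet_hdr, payload);
}
```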

    Signed-off-by: Kees Cook

    Kees Cook
     

29 Dec, 2019

1 commit

  • Some filesystems, such as vfat, may send a bio that crosses the device
    boundary, and worse, an IO request starting within the device boundaries
    can contain more than one segment past EOD.

    Commit dce30ca9e3b6 ("fs: fix guard_bio_eod to check for real EOD errors")
    tries to fix this issue by returning -EIO for this situation. However,
    fs user code then loses the chance to handle -EIO, and sync_inodes_sb()
    may hang forever.

    Also, the current truncation of the last segment is dangerous because it
    updates the last bvec: the bvec table is then no longer immutable, and fs
    bio users may fail to retrieve the truncated pages via
    bio_for_each_segment_all() in their .end_io callback.

    Fix this issue by supporting multi-segment truncation. The approach is
    also simpler:

    - just update the bio size, since the block layer can build a correct
    bvec from the updated bio size. The bvec table then stays truly immutable.

    - zero all truncated segments for read bio
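
    The two steps can be sketched in userspace as follows. This is a toy
    model with hypothetical toy_* types, not the block layer code: segment
    descriptors are never modified, only the total size changes, and for
    reads every byte past the new size is zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy segment: a buffer and its length (stands in for a bvec). */
struct toy_seg {
	unsigned char *data;
	size_t len;
};

/* Toy bio: a segment table, the total size, and the direction. */
struct toy_bio {
	struct toy_seg *segs;
	size_t nr_segs;
	size_t size;
	bool is_read;
};

/* Truncate to new_size without touching the segment table. */
void toy_bio_truncate(struct toy_bio *bio, size_t new_size)
{
	size_t done = 0;

	if (new_size >= bio->size)
		return;

	if (bio->is_read) {
		for (size_t i = 0; i < bio->nr_segs; i++) {
			struct toy_seg *seg = &bio->segs[i];

			if (done + seg->len > new_size) {
				/* Bytes of this segment that stay valid. */
				size_t keep = done < new_size ? new_size - done : 0;

				memset(seg->data + keep, 0, seg->len - keep);
			}
			done += seg->len;
		}
	}

	bio->size = new_size;	/* the only field that is updated */
}
```

    Truncating an 8-byte read made of two 4-byte segments to 3 bytes keeps
    the first 3 bytes, zeroes the remaining 5, and leaves both segment
    lengths at 4.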

    Cc: Carlos Maiolino
    Cc: linux-fsdevel@vger.kernel.org
    Fixed-by: dce30ca9e3b6 ("fs: fix guard_bio_eod to check for real EOD errors")
    Reported-by: syzbot+2b9e54155c8c25d8d165@syzkaller.appspotmail.com
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

27 Dec, 2019

1 commit

  • If CONFIG_OF_MDIO=n:

    drivers/net/phy/mdio_bus.c:23:
    include/linux/of_mdio.h:58:13: warning: ‘of_mdiobus_child_is_phy’ defined but not used [-Wunused-function]
    static bool of_mdiobus_child_is_phy(struct device_node *child)
    ^~~~~~~~~~~~~~~~~~~~~~~

    Fix this by adding the missing "inline" keyword.
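
    A minimal sketch of the pattern, using a hypothetical stub name: a plain
    static function defined in a header is reported by -Wunused-function in
    every file that includes the header without calling it, while a static
    inline function is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct device_node;	/* opaque in this sketch */

/*
 * Hypothetical stub for a build where the real implementation is
 * configured out. Dropping "inline" here would make GCC warn in any
 * translation unit that includes this definition but never calls it.
 */
static inline bool child_is_phy_stub(struct device_node *child)
{
	(void)child;	/* nothing to inspect without the real code */
	return false;
}
```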

    Fixes: 0aa4d016c043d16a ("of: mdio: export of_mdiobus_child_is_phy")
    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Andrew Lunn
    Acked-by: Borislav Petkov
    Signed-off-by: David S. Miller

    Geert Uytterhoeven
     

26 Dec, 2019

2 commits

  • This reverts commit 6bb86fefa086faba7b60bb452300b76a47cde1a5
    ("libahci_platform: Staticize ahci_platform_able_phys()"). We are
    going to need ahci_platform_{enable,disable}_phys() in a subsequent
    commit for ahci_brcm.c in order to properly control the PHY
    initialization order.

    Also make sure the function prototypes are declared in
    include/linux/ahci_platform.h as a result.

    Cc: stable@vger.kernel.org
    Reviewed-by: Hans de Goede
    Signed-off-by: Florian Fainelli
    Signed-off-by: Jens Axboe

    Florian Fainelli
     
  • ata_qc_complete_multiple() is called with a mask of the still active
    tags.

    mv_sata doesn't have this information directly and instead calculates
    the still active tags from the started tags (ap->qc_active) and the
    finished tags as (ap->qc_active ^ done_mask)

    Since 28361c40368 the hw_tag and tag are no longer the same and the
    equation is no longer valid. In ata_exec_internal_sg() ap->qc_active is
    initialized as 1ULL << ATA_TAG_INTERNAL, but in hardware tag 0 is
    started and this will be in done_mask on completion. ap->qc_active ^
    done_mask becomes 0x100000000 ^ 0x1 = 0x100000001 and thus tag 0 used as
    the internal tag will never be reported as completed.

    This is fixed by introducing ata_qc_get_active() which returns the
    active hardware tags and calling it where appropriate.
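
    The arithmetic above can be checked with a small userspace sketch. The
    toy_* names are hypothetical, and the value 32 for the internal tag is
    inferred from the 0x100000000 mask quoted above. The helper only shows
    the idea of reporting the internal command under hardware tag 0.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_TAG_INTERNAL 32	/* assumed: matches the 0x100000000 mask */

/* Old driver-side computation of the still active tags. */
uint64_t still_active_xor(uint64_t qc_active, uint64_t done_mask)
{
	return qc_active ^ done_mask;
}

/* Translate the software view of active tags into hardware tags. */
uint64_t toy_qc_get_active(uint64_t qc_active)
{
	/* The internal command is issued to the hardware as tag 0. */
	if (qc_active & (1ULL << TOY_TAG_INTERNAL)) {
		qc_active |= 1ULL << 0;
		qc_active &= ~(1ULL << TOY_TAG_INTERNAL);
	}
	return qc_active;
}
```

    With the translated mask, XOR-ing against a done_mask of 0x1 yields 0,
    so the internal command is seen as completed.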

    This is tested on mv_sata, but sata_fsl and sata_nv suffer from the same
    problem. There is another case in sata_nv that most likely needs fixing
    as well, but this looks a little different, so I wasn't confident enough
    to change that.

    Fixes: 28361c403683 ("libata: add extra internal command")
    Cc: stable@vger.kernel.org
    Tested-by: Pali Rohár
    Signed-off-by: Sascha Hauer

    Add missing export of ata_qc_get_active(), as per Pali.

    Signed-off-by: Jens Axboe

    Sascha Hauer
     

25 Dec, 2019

3 commits

  • When we do an IPv6 tunnel PMTU update and call __ip6_rt_update_pmtu() in
    the end, we should not call dst_confirm_neigh() as there is no two-way
    communication.

    So disable the neigh confirm for vxlan and geneve pmtu update.

    v5: No change.
    v4: No change.
    v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
    dst_ops.update_pmtu to control whether we should do neighbor confirm.
    Also split the big patch to small ones for each area.
    v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

    Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
    Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
    Reviewed-by: Guillaume Nault
    Tested-by: Guillaume Nault
    Acked-by: David Ahern
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
     
  • Add a new function skb_dst_update_pmtu_no_confirm() for callers that need
    to update the PMTU but should not do a neighbor confirm.

    v5: No change.
    v4: No change.
    v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
    dst_ops.update_pmtu to control whether we should do neighbor confirm.
    Also split the big patch to small ones for each area.
    v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

    Reviewed-by: Guillaume Nault
    Acked-by: David Ahern
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
     
  • The MTU update code is supposed to be invoked in response to real
    networking events that update the PMTU. In IPv6 PMTU update function
    __ip6_rt_update_pmtu() we called dst_confirm_neigh() to update neighbor
    confirmed time.

    But for tunnel code, it will call pmtu before xmit, like:
    - tnl_update_pmtu()
    - skb_dst_update_pmtu()
    - ip6_rt_update_pmtu()
    - __ip6_rt_update_pmtu()
    - dst_confirm_neigh()

    If the tunnel remote dst mac address changed and we still do the neigh
    confirm, we will not be able to update the neigh cache and ping6 to the
    remote will fail.

    So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
    should not be invoking dst_confirm_neigh() as we have no evidence
    of successful two-way communication at this point.

    On the other hand it is also important to keep the neigh reachability fresh
    for TCP flows, so we cannot remove this dst_confirm_neigh() call.

    To fix the issue, we have to add a new bool parameter for dst_ops.update_pmtu
    to choose whether we should do neigh update or not. I will add the parameter
    in this patch and set all the callers to true to comply with the previous
    way, and fix the tunnel code one by one on later patches.
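
    The shape of the change can be sketched with a toy callback table. All
    toy_* names are hypothetical and the real update_pmtu callback takes more
    arguments; the sketch only shows a bool parameter deciding whether the
    neighbor confirmation happens alongside the MTU update.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_dst;

/* Toy callback table with the extra bool parameter. */
struct toy_dst_ops {
	void (*update_pmtu)(struct toy_dst *dst, unsigned int mtu,
			    bool confirm_neigh);
};

struct toy_dst {
	const struct toy_dst_ops *ops;
	unsigned int mtu;
	unsigned int neigh_confirms;	/* counts confirmations for the demo */
};

/* Toy implementation: confirm the neighbor only when asked to. */
static void toy_ip6_update_pmtu(struct toy_dst *dst, unsigned int mtu,
				bool confirm_neigh)
{
	if (confirm_neigh)
		dst->neigh_confirms++;
	dst->mtu = mtu;
}

const struct toy_dst_ops toy_ip6_ops = {
	.update_pmtu = toy_ip6_update_pmtu,
};

/* Existing callers keep the previous behaviour by passing true. */
void toy_update_pmtu(struct toy_dst *dst, unsigned int mtu)
{
	dst->ops->update_pmtu(dst, mtu, true);
}

/* A tunnel-style caller without proof of two-way traffic passes false. */
void toy_update_pmtu_no_confirm(struct toy_dst *dst, unsigned int mtu)
{
	dst->ops->update_pmtu(dst, mtu, false);
}
```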

    v5: No change.
    v4: No change.
    v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
    dst_ops.update_pmtu to control whether we should do neighbor confirm.
    Also split the big patch to small ones for each area.
    v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.

    Suggested-by: David Miller
    Reviewed-by: Guillaume Nault
    Acked-by: David Ahern
    Signed-off-by: Hangbin Liu
    Signed-off-by: David S. Miller

    Hangbin Liu
     

23 Dec, 2019

2 commits

  • Pull ext4 bug fixes from Ted Ts'o:
    "Ext4 bug fixes, including a regression fix"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: clarify impact of 'commit' mount option
    ext4: fix unused-but-set-variable warning in ext4_add_entry()
    jbd2: fix kernel-doc notation warning
    ext4: use RCU API in debug_print_tree
    ext4: validate the debug_want_extra_isize mount option at parse time
    ext4: reserve revoke credits in __ext4_new_inode
    ext4: unlock on error in ext4_expand_extra_isize()
    ext4: optimize __ext4_check_dir_entry()
    ext4: check for directory entries too close to block end
    ext4: fix ext4_empty_dir() for directories with holes

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Several nf_flow_table_offload fixes from Pablo Neira Ayuso,
    including adding a missing ipv6 match description.

    2) Several heap overflow fixes in mwifiex from qize wang and Ganapathi
    Bhat.

    3) Fix uninit value in bond_neigh_init(), from Eric Dumazet.

    4) Fix non-ACPI probing of nxp-nci, from Stephan Gerhold.

    5) Fix use after free in tipc_disc_rcv(), from Tuong Lien.

    6) Enforce limit of 33 tail calls in mips and riscv JIT, from Paul
    Chaignon.

    7) Multicast MAC limit test is off by one in qede, from Manish Chopra.

    8) Fix established socket lookup race when socket goes from
    TCP_ESTABLISHED to TCP_LISTEN, because an intervening RCU grace
    period is missing. From Eric Dumazet.

    9) Don't send empty SKBs from tcp_write_xmit(), also from Eric Dumazet.

    10) Fix active backup transition after link failure in bonding, from
    Mahesh Bandewar.

    11) Avoid zero sized hash table in gtp driver, from Taehee Yoo.

    12) Fix wrong interface passed to ->mac_link_up(), from Russell King.

    13) Fix DSA egress flooding settings in b53, from Florian Fainelli.

    14) Memory leak in gmac_setup_txqs(), from Navid Emamdoost.

    15) Fix double free in dpaa2-ptp code, from Ioana Ciornei.

    16) Reject invalid MTU values in stmmac, from Jose Abreu.

    17) Fix refcount leak in error path of u32 classifier, from Davide
    Caratti.

    18) Fix regression causing iwlwifi firmware crashes on boot, from Anders
    Kaseorg.

    19) Fix inverted return value logic in llc2 code, from Chan Shu Tak.

    20) Disable hardware GRO when XDP is attached to qede, from Manish
    Chopra.

    21) Since we encode state in the low pointer bits, dst metrics must be
    at least 4 byte aligned, which is not necessarily true on m68k. Add
    annotations to fix this, from Geert Uytterhoeven.

    * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (160 commits)
    sfc: Include XDP packet headroom in buffer step size.
    sfc: fix channel allocation with brute force
    net: dst: Force 4-byte alignment of dst_metrics
    selftests: pmtu: fix init mtu value in description
    hv_netvsc: Fix unwanted rx_table reset
    net: phy: ensure that phy IDs are correctly typed
    mod_devicetable: fix PHY module format
    qede: Disable hardware gro when xdp prog is installed
    net: ena: fix issues in setting interrupt moderation params in ethtool
    net: ena: fix default tx interrupt moderation interval
    net/smc: unregister ib devices in reboot_event
    net: stmmac: platform: Fix MDIO init for platforms without PHY
    llc2: Fix return statement of llc_stat_ev_rx_null_dsap_xid_c (and _test_c)
    net: hisilicon: Fix a BUG trigered by wrong bytes_compl
    net: dsa: ksz: use common define for tag len
    s390/qeth: don't return -ENOTSUPP to userspace
    s390/qeth: fix promiscuous mode after reset
    s390/qeth: handle error due to unsupported transport mode
    cxgb4: fix refcount init for TC-MQPRIO offload
    tc-testing: initial tdc selftests for cls_u32
    ...

    Linus Torvalds
     

21 Dec, 2019

4 commits

  • Pull xen fixes from Juergen Gross:
    "This contains two cleanup patches and a small series for supporting
    reloading the Xen block backend driver"

    * tag 'for-linus-5.5b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/grant-table: remove multiple BUG_ON on gnttab_interface
    xen-blkback: support dynamic unbind/bind
    xen/interface: re-define FRONT/BACK_RING_ATTACH()
    xenbus: limit when state is forced to closed
    xenbus: move xenbus_dev_shutdown() into frontend code...
    xen/blkfront: Adjust indentation in xlvbd_alloc_gendisk

    Linus Torvalds
     
  • When storing a pointer to a dst_metrics structure in dst_entry._metrics,
    two flags are added in the least significant bits of the pointer value.
    Hence this assumes all pointers to dst_metrics structures have at least
    4-byte alignment.

    However, on m68k, the minimum alignment of 32-bit values is 2 bytes, not
    4 bytes. Hence in some kernel builds, dst_default_metrics may be only
    2-byte aligned, leading to obscure boot warnings like:

    WARNING: CPU: 0 PID: 7 at lib/refcount.c:28 refcount_warn_saturate+0x44/0x9a
    refcount_t: underflow; use-after-free.
    Modules linked in:
    CPU: 0 PID: 7 Comm: ksoftirqd/0 Tainted: G W 5.5.0-rc2-atari-01448-g114a1a1038af891d-dirty #261
    Stack from 10835e6c:
    10835e6c 0038134f 00023fa6 00394b0f 0000001c 00000009 00321560 00023fea
    00394b0f 0000001c 001a70f8 00000009 00000000 10835eb4 00000001 00000000
    04208040 0000000a 00394b4a 10835ed4 00043aa8 001a70f8 00394b0f 0000001c
    00000009 00394b4a 0026aba8 003215a4 00000003 00000000 0026d5a8 00000001
    003215a4 003a4361 003238d6 000001f0 00000000 003215a4 10aa3b00 00025e84
    003ddb00 10834000 002416a8 10aa3b00 00000000 00000080 000aa038 0004854a
    Call Trace: [] __warn+0xb2/0xb4
    [] warn_slowpath_fmt+0x42/0x64
    [] refcount_warn_saturate+0x44/0x9a
    [] printk+0x0/0x18
    [] refcount_warn_saturate+0x44/0x9a
    [] refcount_sub_and_test.constprop.73+0x38/0x3e
    [] ipv4_dst_destroy+0x5e/0x7e
    [] __local_bh_enable_ip+0x0/0x8e
    [] dst_destroy+0x40/0xae

    Fix this by forcing 4-byte alignment of all dst_metrics structures.
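
    Why the alignment matters can be sketched in userspace. The toy_* names
    and flag values are illustrative, not the kernel definitions. Two flag
    bits are stored in the low bits of a pointer, which only works if those
    bits are zero in every valid pointer, that is, if the pointee is at least
    4-byte aligned. With 2-byte alignment, bit 1 of the address itself could
    be read back as a flag.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define TOY_FLAG_READ_ONLY	((uintptr_t)0x1)
#define TOY_FLAG_REFCOUNTED	((uintptr_t)0x2)
#define TOY_FLAGS_MASK		((uintptr_t)0x3)

/* Forcing 4-byte alignment keeps the two low address bits at zero. */
struct toy_metrics {
	alignas(4) uint32_t metrics[4];
};

/* Store the flags in the low bits of the pointer value. */
uintptr_t toy_pack(const struct toy_metrics *m, uintptr_t flags)
{
	return (uintptr_t)m | (flags & TOY_FLAGS_MASK);
}

/* Recover the pointer by masking the flag bits off again. */
const struct toy_metrics *toy_ptr(uintptr_t packed)
{
	return (const struct toy_metrics *)(packed & ~TOY_FLAGS_MASK);
}

/* Recover the flags; only correct if the address had both bits clear. */
uintptr_t toy_flags(uintptr_t packed)
{
	return packed & TOY_FLAGS_MASK;
}
```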

    Fixes: e5fd387ad5b30ca3 ("ipv6: do not overwrite inetpeer metrics prematurely")
    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: David S. Miller

    Geert Uytterhoeven
     
  • PHY IDs are 32-bit unsigned quantities. Ensure that they are always
    treated as such, and not passed around as "int"s.

    Fixes: 13d0ab6750b2 ("net: phy: check return code when requesting PHY driver module")
    Signed-off-by: Russell King
    Reviewed-by: Florian Fainelli
    Reviewed-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Russell King
     
  • When a PHY is probed, if the top bit is set, we end up requesting a
    module with the string "mdio:-10101110000000100101000101010001" -
    the top bit is printed as a signed -1 value. This leads to the module
    not being loaded.

    Fix the module format string and the macro generating the values for
    it to ensure that we only print unsigned types and the top bit is
    always 0/1. We correctly end up with
    "mdio:10101110000000100101000101010001".

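    A minimal sketch of the corrected formatting, with a hypothetical helper
    name: the ID is kept in an unsigned 32-bit type and every bit is masked
    to 0 or 1 before printing. With a signed int, shifting a negative value
    right by 31 is implementation-defined in C and yields -1 on common
    compilers, which would explain the stray minus sign.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* "mdio:" (5 chars) + 32 binary digits + terminating NUL. */
#define TOY_MODALIAS_LEN 38

/* Format a PHY ID as "mdio:" followed by 32 binary digits. */
void toy_mdio_modalias(uint32_t id, char out[TOY_MODALIAS_LEN])
{
	int n = snprintf(out, TOY_MODALIAS_LEN, "mdio:");

	for (int bit = 31; bit >= 0; bit--) {
		/* Unsigned shift plus mask: each digit is exactly 0 or 1. */
		unsigned int digit = (unsigned int)((id >> bit) & 1);

		n += snprintf(out + n, (size_t)(TOY_MODALIAS_LEN - n),
			      "%u", digit);
	}
}
```

    Passing 0xAE025151 produces the corrected string quoted above.
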
    Fixes: 8626d3b43280 ("phylib: Support phy module autoloading")
    Reviewed-by: Andrew Lunn
    Signed-off-by: Russell King
    Reviewed-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Russell King
     

20 Dec, 2019

6 commits

  • Currently these macros are defined to re-initialize a front/back ring
    (respectively) to values read from the shared ring in such a way that any
    requests/responses that are added to the shared ring whilst the front/back
    is detached will be skipped over. This, in general, is not a desirable
    semantic since most frontend implementations will eventually block waiting
    for a response which would either never appear or never be processed.

    Since the macros are currently unused, take this opportunity to re-define
    them to re-initialize a front/back ring using specified values. This also
    allows FRONT/BACK_RING_INIT() to be re-defined in terms of
    FRONT/BACK_RING_ATTACH() using a specified value of 0.
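
    The relationship between the two macro families can be sketched with a
    simplified front ring. The TOY_* macros and toy_* structs are
    hypothetical and carry far fewer fields than the real ring structures.
    They only show attach taking an explicit index and init being attach
    with index 0.

```c
#include <assert.h>

/* Simplified shared ring and private front ring. */
struct toy_sring {
	unsigned int req_prod;
	unsigned int rsp_prod;
};

struct toy_front_ring {
	unsigned int req_prod_pvt;
	unsigned int rsp_cons;
	unsigned int nr_ents;
	struct toy_sring *sring;
};

/* Re-initialize a front ring using a caller-specified index. */
#define TOY_FRONT_RING_ATTACH(_r, _s, _i, _n) do {	\
	(_r)->req_prod_pvt = (_i);			\
	(_r)->rsp_cons = (_i);				\
	(_r)->nr_ents = (_n);				\
	(_r)->sring = (_s);				\
} while (0)

/* Fresh initialization is attach with a starting index of 0. */
#define TOY_FRONT_RING_INIT(_r, _s, _n) \
	TOY_FRONT_RING_ATTACH(_r, _s, 0, _n)
```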

    NOTE: BACK_RING_ATTACH() will be used directly in a subsequent patch.

    Signed-off-by: Paul Durrant
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross

    Paul Durrant
     
  • If a driver probe() fails then leave the xenstore state alone. There is no
    reason to modify it as the failure may be due to transient resource
    allocation issues and hence a subsequent probe() may succeed.

    If the driver supports re-binding then force state to closed during
    remove() only in the case when the toolstack may need to clean up. This can
    be detected by checking whether the state in xenstore has been set to
    closing prior to device removal.

    NOTE: Re-bind support is indicated by new boolean in struct xenbus_driver,
    which defaults to false. Subsequent patches will add support to
    some backend drivers.

    Signed-off-by: Paul Durrant
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross

    Paul Durrant
     
  • This patch exports of_mdiobus_child_is_phy, allowing callers to check
    whether a child node is a network PHY.

    Signed-off-by: Antoine Tenart
    Signed-off-by: David S. Miller

    Antoine Tenart
     
  • Daniel Borkmann says:

    ====================
    pull-request: bpf 2019-12-19

    The following pull-request contains BPF updates for your *net* tree.

    We've added 10 non-merge commits during the last 8 day(s) which contain
    a total of 21 files changed, 269 insertions(+), 108 deletions(-).

    The main changes are:

    1) Fix lack of synchronization between xsk wakeup and destroying resources
    used by xsk wakeup, from Maxim Mikityanskiy.

    2) Fix pruning with tail call patching, untrack programs in case of verifier
    error and fix a cgroup local storage tracking bug, from Daniel Borkmann.

    3) Fix clearing skb->tstamp in bpf_redirect() when going from ingress to
    egress which otherwise cause issues e.g. on fq qdisc, from Lorenz Bauer.

    4) Fix compile warning of unused proc_dointvec_minmax_bpf_restricted() when
    only cBPF is present, from Alexander Lobakin.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Merge fixes from Andrew Morton:
    "6 fixes"

    * emailed patches from Andrew Morton:
    lib/Kconfig.debug: fix some messed up configurations
    mm: vmscan: protect shrinker idr replace with CONFIG_MEMCG
    kasan: don't assume percpu shadow allocations will succeed
    kasan: use apply_to_existing_page_range() for releasing vmalloc shadow
    mm/memory.c: add apply_to_existing_page_range() helper
    kasan: fix crashes on access to memory mapped by vm_map_ram()

    Linus Torvalds
     
  • Pull power management fix from Rafael Wysocki:
    "Fix a problem related to CPU offline/online and cpufreq governors that
    in some system configurations may lead to a system-wide deadlock
    during CPU online"

    * tag 'pm-5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    cpufreq: Avoid leaving stale IRQ work items during CPU offline

    Linus Torvalds
     

19 Dec, 2019

3 commits

  • * pm-cpufreq:
    cpufreq: Avoid leaving stale IRQ work items during CPU offline

    Rafael J. Wysocki
     
  • Pull tpm fixes from Jarkko Sakkinen:
    "Bunch of fixes for rc3"

    * tag 'tpmdd-next-20191219' of git://git.infradead.org/users/jjs/linux-tpmdd:
    tpm/tpm_ftpm_tee: add shutdown call back
    tpm: selftest: cleanup after unseal with wrong auth/policy test
    tpm: selftest: add test covering async mode
    tpm: fix invalid locking in NONBLOCKING mode
    security: keys: trusted: fix lost handle flush
    tpm_tis: reserve chip for duration of tpm_tis_core_init
    KEYS: asymmetric: return ENOMEM if akcipher_request_alloc() fails
    KEYS: remove CONFIG_KEYS_COMPAT

    Linus Torvalds
     
  • Pull sound fixes from Takashi Iwai:
    "A slightly high number of changes this time, but all good and small fixes:

    - A PCM core fix that initializes the buffer properly for avoiding
    information leaks; it is a long-standing minor problem, but good to
    fix better now

    - A few ASoC core fixes for the init / cleanup ordering issues that
    surfaced after the recent refactoring

    - Lots of SOF and topology-related fixes went in, as usual as such
    hot topics

    - Several ASoC codec and platform-specific small fixes: wm89xx,
    realtek, and max98090, AMD, Intel-SST

    - A fix for the previous incomplete regression of HD-audio, now
    hitting Nvidia HDMI

    - A few HD-audio CA0132 codec fixes"

    * tag 'sound-5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (27 commits)
    ALSA: hda - Downgrade error message for single-cmd fallback
    ASoC: wm8962: fix lambda value
    ALSA: hda: Fix regression by strip mask fix
    ALSA: hda/ca0132 - Fix work handling in delayed HP detection
    ALSA: hda/ca0132 - Avoid endless loop
    ALSA: hda/ca0132 - Keep power on during processing DSP response
    ALSA: pcm: Avoid possible info leaks from PCM stream buffers
    ASoC: Intel: common: work-around incorrect ACPI HID for CML boards
    ASoC: SOF: Intel: split cht and byt debug window sizes
    ASoC: SOF: loader: fix snd_sof_fw_parse_ext_data
    ASoC: SOF: loader: snd_sof_fw_parse_ext_data log warning on unknown header
    ASoC: simple-card: Don't create separate link when platform is present
    ASoC: topology: Check return value for soc_tplg_pcm_create()
    ASoC: topology: Check return value for snd_soc_add_dai_link()
    ASoC: core: only flush inited work during free
    ASoC: Intel: bytcr_rt5640: Update quirk for Teclast X89
    ASoC: core: Init pcm runtime work early to avoid warnings
    ASoC: Intel: sst: Add missing include
    ASoC: max98090: fix possible race conditions
    ASoC: max98090: exit workaround earlier if PLL is locked
    ...

    Linus Torvalds
     

18 Dec, 2019

6 commits

  • Fix missing '*' kernel-doc notation that causes this warning:

    ../include/linux/netdevice.h:1779: warning: bad line: spinlock

    Fixes: ab92d68fc22f ("net: core: add generic lockdep keys")
    Signed-off-by: Randy Dunlap
    Cc: Taehee Yoo
    Signed-off-by: David S. Miller

    Randy Dunlap
     
  • sk->sk_pacing_shift can be read and written without lock
    synchronization. This patch adds annotations to
    document this fact and avoid future syzbot complaints.

    This might also avoid unexpected false sharing
    in sk_pacing_shift_update(), as the compiler
    could remove the conditional check and always
    write over sk->sk_pacing_shift:

    if (sk->sk_pacing_shift != val)
            sk->sk_pacing_shift = val;
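
    As a rough illustration of the annotated pattern, here is a userspace
    sketch. The READ_ONCE()/WRITE_ONCE() macros below are simplified
    volatile-cast stand-ins for the kernel ones, and struct toy_sock and
    toy_pacing_shift_update() are hypothetical names, not the kernel code:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros (assumes GCC/Clang __typeof__). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical miniature of struct sock holding only the field of interest. */
struct toy_sock {
	unsigned char sk_pacing_shift;
};

/*
 * Store the new shift only when it differs from the current one.
 * The volatile accesses keep the compiler from merging the check and
 * the store into an unconditional write.
 * Returns 1 if a store was issued, 0 otherwise.
 */
static int toy_pacing_shift_update(struct toy_sock *sk, int val)
{
	assert(val >= 0 && val <= 255);

	if (sk == NULL || READ_ONCE(sk->sk_pacing_shift) == val)
		return 0;

	WRITE_ONCE(sk->sk_pacing_shift, (unsigned char)val);
	return 1;
}
```

    Calling toy_pacing_shift_update() with the value already stored issues
    no write at all, which is the behaviour the annotations aim to preserve.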

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • apply_to_page_range() takes an address range, and if any parts of it are
    not covered by the existing page table hierarchy, it allocates memory to
    fill them in.

    In some use cases, this is not what we want - we want to be able to
    operate exclusively on PTEs that are already in the tables.

    Add apply_to_existing_page_range() for this. Adjust the walker
    functions for apply_to_page_range to take 'create', which switches them
    between the old and new modes.

    This will be used in KASAN vmalloc.
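
    The effect of the 'create' switch can be sketched in userspace with a
    toy two-level table. Everything below (struct toy_table, toy_walk(),
    the toy_apply_* wrappers) is a hypothetical model for illustration and
    does not mirror the real page table walkers:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_ENTRIES 4	/* slots per level in this toy two-level table */

/* Callback applied to each leaf slot, loosely modelled on pte_fn_t. */
typedef int (*toy_pte_fn_t)(int *pte, void *data);

struct toy_table {
	int *leaf[TOY_ENTRIES];	/* NULL means the lower level is absent */
};

/* Shared walker: 'create' selects allocate-on-demand or existing-only mode. */
static int toy_walk(struct toy_table *t, unsigned int start, unsigned int end,
		    toy_pte_fn_t fn, void *data, bool create)
{
	int err = 0;

	assert(start <= end && end <= TOY_ENTRIES * TOY_ENTRIES);

	for (unsigned int i = start; i < end; i++) {
		unsigned int top = i / TOY_ENTRIES;

		if (!t->leaf[top]) {
			if (!create)
				continue;	/* existing-only: skip the hole */
			t->leaf[top] = calloc(TOY_ENTRIES, sizeof(int));
			if (!t->leaf[top])
				return -ENOMEM;
		}
		err = fn(&t->leaf[top][i % TOY_ENTRIES], data);
		if (err)
			break;
	}
	return err;
}

/* Old behaviour: missing lower levels are allocated while walking. */
static int toy_apply_to_range(struct toy_table *t, unsigned int start,
			      unsigned int end, toy_pte_fn_t fn, void *data)
{
	return toy_walk(t, start, end, fn, data, true);
}

/* New behaviour: only slots that are already backed are visited. */
static int toy_apply_to_existing_range(struct toy_table *t, unsigned int start,
				       unsigned int end, toy_pte_fn_t fn,
				       void *data)
{
	return toy_walk(t, start, end, fn, data, false);
}

/* Example callback: count how many slots were visited. */
static int toy_count(int *pte, void *data)
{
	(void)pte;
	(*(int *)data)++;
	return 0;
}
```

    On an empty table, toy_apply_to_existing_range() visits nothing and
    allocates nothing, while toy_apply_to_range() fills in the missing
    level before invoking the callback.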

    [akpm@linux-foundation.org: reduce code duplication]
    [akpm@linux-foundation.org: s/apply_to_existing_pages/apply_to_existing_page_range/]
    [akpm@linux-foundation.org: initialize __apply_to_page_range::err]
    Link: http://lkml.kernel.org/r/20191205140407.1874-1-dja@axtens.net
    Signed-off-by: Daniel Axtens
    Cc: Dmitry Vyukov
    Cc: Uladzislau Rezki (Sony)
    Cc: Alexander Potapenko
    Cc: Daniel Axtens
    Cc: Qian Cai
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Axtens
     
  • With CONFIG_KASAN_VMALLOC=y any use of memory obtained via vm_map_ram()
    will crash because there is no shadow backing that memory.

    Instead of sprinkling additional kasan_populate_vmalloc() calls all over
    the vmalloc code, move it into alloc_vmap_area(). This will fix
    vm_map_ram() and simplify the code a bit.

    [aryabinin@virtuozzo.com: v2]
    Link: http://lkml.kernel.org/r/20191205095942.1761-1-aryabinin@virtuozzo.com
    Link: http://lkml.kernel.org/r/20191204204534.32202-1-aryabinin@virtuozzo.com
    Fixes: 3c5c3cfb9ef4 ("kasan: support backing vmalloc space with real shadow memory")
    Signed-off-by: Andrey Ryabinin
    Reported-by: Dmitry Vyukov
    Reviewed-by: Uladzislau Rezki (Sony)
    Cc: Daniel Axtens
    Cc: Alexander Potapenko
    Cc: Daniel Axtens
    Cc: Qian Cai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     
  • Pull EFI fixes from Ingo Molnar:
    "Protect persistent EFI memory reservations from kexec, fix EFIFB early
    console, EFI stub graphics output fixes and other misc fixes."

    * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    efi: Don't attempt to map RCI2 config table if it doesn't exist
    efi/earlycon: Remap entire framebuffer after page initialization
    efi: Fix efi_loaded_image_t::unload type
    efi/gop: Fix memory leak in __gop_query32/64()
    efi/gop: Return EFI_SUCCESS if a usable GOP was found
    efi/gop: Return EFI_NOT_FOUND if there are no usable GOPs
    efi/memreserve: Register reservations as 'reserved' in /proc/iomem

    Linus Torvalds
     
  • Recently noticed that we're tracking programs related to local storage maps
    through their prog pointer. This is a wrong assumption since the prog pointer
    can still change throughout the verification process, for example, whenever
    bpf_patch_insn_single() is called.

    Therefore, the prog pointer that was assigned via bpf_cgroup_storage_assign()
    is not guaranteed to be the same as the one we pass to bpf_cgroup_storage_release()
    and the map would therefore remain in busy state forever. Fix this by using
    the prog's aux pointer which is stable throughout verification and beyond.
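
    A small userspace model may help show why keying on the aux pointer
    works. All names below (struct toy_prog, toy_patch(), toy_storage_assign(),
    toy_storage_release()) are hypothetical; the sketch only assumes that
    patching can move the program to a new allocation while its aux data
    stays where it is:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct toy_aux {
	int id;			/* stable side data, allocated once */
};

struct toy_prog {
	struct toy_aux *aux;	/* survives any reallocation of the prog */
	size_t len;
	int insns[];
};

struct toy_map {
	const struct toy_aux *owner;	/* NULL while the map is free */
};

static struct toy_prog *toy_prog_alloc(size_t len, struct toy_aux *aux)
{
	struct toy_prog *p = calloc(1, sizeof(*p) + len * sizeof(int));

	if (p) {
		p->aux = aux;
		p->len = len;
	}
	return p;
}

/* Grow the program by one insn; the result is always a new allocation. */
static struct toy_prog *toy_patch(struct toy_prog *old)
{
	struct toy_prog *p = toy_prog_alloc(old->len + 1, old->aux);

	if (!p)
		return NULL;
	memcpy(p->insns, old->insns, old->len * sizeof(int));
	free(old);
	return p;
}

/* Track ownership by aux, so a moved prog still matches on release. */
static bool toy_storage_assign(struct toy_map *map, const struct toy_prog *prog)
{
	assert(map != NULL && prog != NULL);
	if (map->owner && map->owner != prog->aux)
		return false;	/* busy: owned by another program */
	map->owner = prog->aux;
	return true;
}

static bool toy_storage_release(struct toy_map *map, const struct toy_prog *prog)
{
	assert(map != NULL && prog != NULL);
	if (map->owner != prog->aux)
		return false;	/* not the owner, map stays busy */
	map->owner = NULL;
	return true;
}
```

    Had the map stored the toy_prog pointer instead, the release after
    toy_patch() would compare against a stale address and the map would
    stay busy.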

    Fixes: de9cbbaadba5 ("bpf: introduce cgroup storage maps")
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov
    Cc: Roman Gushchin
    Cc: Martin KaFai Lau
    Link: https://lore.kernel.org/bpf/1471c69eca3022218666f909bc927a92388fd09e.1576580332.git.daniel@iogearbox.net

    Daniel Borkmann
     

17 Dec, 2019

2 commits

  • …/broonie/sound into for-linus

    ASoC: Fixes for v5.5

    A collection of fixes since the merge window, mostly driver specific but
    there's a few in the core that clean up fallout from the refactorings
    done in the last cycle.

    Takashi Iwai
     
  • The original code, before it was moved into security/keys/trusted-keys,
    had a flush after the blob unseal. Without that flush, the volatile
    handles increase in the TPM until it becomes unusable and the system
    either has to be rebooted or the TPM volatile area manually flushed.
    Fix by adding back the lost flush, which we now have to export because
    the relocation of the trusted key code may cause the consumer to be
    modular.

    Signed-off-by: James Bottomley
    Fixes: 2e19e10131a0 ("KEYS: trusted: Move TPM2 trusted keys code")
    Reviewed-by: Jerry Snitselaar
    Reviewed-by: Jarkko Sakkinen
    Signed-off-by: Jarkko Sakkinen

    James Bottomley