22 Jul, 2020

10 commits

  • commit a50ca29523b18baea548bdf5df9b4b923c2bb4f6 upstream.

    This adds more hardware IDs for Elan touchpads found in various Lenovo
    laptops.

    Signed-off-by: Dave Wang
    Link: https://lore.kernel.org/r/000201d5a8bd$9fead3f0$dfc07bd0$@emc.com.tw
    Cc: stable@vger.kernel.org
    Signed-off-by: Dmitry Torokhov
    Signed-off-by: Greg Kroah-Hartman

    Dave Wang
     
  • commit f794db6841e5480208f0c3a3ac1df445a96b079e upstream.

    Until this commit the mainline kernel version (this version) of the
    vboxguest module contained a bug where it defined
    VBGL_IOCTL_VMMDEV_REQUEST_BIG and VBGL_IOCTL_LOG using
    _IOC(_IOC_READ | _IOC_WRITE, 'V', ...) instead of
    _IO('V', ...) as the out-of-tree VirtualBox upstream version does.

    Since the VirtualBox userspace bits are always built against VirtualBox
    upstream's headers, this means that so far the mainline kernel version
    of the vboxguest module has been failing these 2 ioctls with -ENOTTY.
    I guess that VBGL_IOCTL_VMMDEV_REQUEST_BIG is never used, which is why
    we have not hit that one, and so far the vboxguest driver has failed to
    actually log any messages passed to it through VBGL_IOCTL_LOG.

    This commit changes the VBGL_IOCTL_VMMDEV_REQUEST_BIG and VBGL_IOCTL_LOG
    defines to match the out-of-tree VirtualBox upstream vboxguest version,
    while keeping compatibility with the old, wrong request defines so as
    not to break the kernel ABI in case someone has been using them.

    Fixes: f6ddd094f579 ("virt: Add vboxguest driver for Virtual Box Guest integration UAPI")
    Cc: stable@vger.kernel.org
    Acked-by: Arnd Bergmann
    Reviewed-by: Arnd Bergmann
    Signed-off-by: Hans de Goede
    Link: https://lore.kernel.org/r/20200709120858.63928-2-hdegoede@redhat.com
    Signed-off-by: Greg Kroah-Hartman

    Hans de Goede
     
  • [ Upstream commit e8639e1c986a8a9d0f94549170f6db579376c3ae ]

    The RTC modules on am3 and am4 need quirk handling to unlock and lock
    them for reset so let's add the quirk handling based on what we already
    have for legacy platform data. In later patches we will simply drop the
    RTC related platform data and the old quirk handling.

    Signed-off-by: Tony Lindgren
    Signed-off-by: Sasha Levin

    Tony Lindgren
     
  • [ Upstream commit bfe373f608cf81b7626dfeb904001b0e867c5110 ]

    Otherwise there may be magic numbers in /sys/kernel/debug/block/*/state.

    Signed-off-by: Hou Tao
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Hou Tao
     
  • [ Upstream commit 14b032b8f8fce03a546dcf365454bec8c4a58d7d ]

    In order for no_refcnt and is_data to be the lowest-order two
    bits in 'val', we have to pad out the bitfield of the u8.

    Fixes: ad0f75e5f57c ("cgroup: fix cgroup_sk_alloc() for sk_clone_lock()")
    Reported-by: Guenter Roeck
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit ad0f75e5f57ccbceec13274e1e242f2b5a6397ed ]

    When we clone a socket in sk_clone_lock(), its sk_cgrp_data is
    copied, so the cgroup refcnt must be taken too. And, unlike the
    sk_alloc() path, sock_update_netprioidx() is not called here.
    Therefore, it is safe and necessary to grab the cgroup refcnt
    even when cgroup_sk_alloc is disabled.

    sk_clone_lock() is in BH context anyway, and the in_interrupt()
    check would terminate this function early if it were called there. And
    for sk_alloc(), skcd->val is always zero. So it's safe to factor out
    the code to make it more readable.

    The global variable 'cgroup_sk_alloc_disabled' is used to determine
    whether to take these reference counts. It is impossible to make
    the reference counting correct unless we save this bit of information
    in skcd->val. So, add a new bit there to record whether the socket
    has already taken the reference counts. This obviously relies on
    kmalloc() aligning cgroup pointers to at least 4 bytes;
    ARCH_KMALLOC_MINALIGN is certainly larger than that.

    This bug seems to have been present since the beginning; commit
    d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
    tried to fix it but not completely. It seems not easy to trigger until
    the recent commit 090e28b229af
    ("netprio_cgroup: Fix unlimited memory leak of v2 cgroups") was merged.

    Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
    Reported-by: Cameron Berkenpas
    Reported-by: Peter Geis
    Reported-by: Lu Fengqi
    Reported-by: Daniël Sonck
    Reported-by: Zhang Qiang
    Tested-by: Cameron Berkenpas
    Tested-by: Peter Geis
    Tested-by: Thomas Lamprecht
    Cc: Daniel Borkmann
    Cc: Zefan Li
    Cc: Tejun Heo
    Cc: Roman Gushchin
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Cong Wang
     
  • [ Upstream commit 469aceddfa3ed16e17ee30533fae45e90f62efd8 ]

    Toshiaki pointed out that we now have two very similar functions to extract
    the L3 protocol number in the presence of VLAN tags. And Daniel pointed out
    that the unbounded parsing loop makes it possible for maliciously crafted
    packets to loop through potentially hundreds of tags.

    Fix both of these issues by consolidating the two parsing functions and
    limiting the VLAN tag parsing to a max depth of 8 tags. As part of this,
    switch over __vlan_get_protocol() to use skb_header_pointer() instead of
    pskb_may_pull(), to avoid the possible side effects of the latter and keep
    the skb pointer 'const' through all the parsing functions.

    v2:
    - Use limit of 8 tags instead of 32 (matching XMIT_RECURSION_LIMIT)

    Reported-by: Toshiaki Makita
    Reported-by: Daniel Borkmann
    Fixes: d7bf2ebebc2b ("sched: consistently handle layer3 header accesses in the presence of VLANs")
    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Toke Høiland-Jørgensen
     
  • [ Upstream commit d7bf2ebebc2bd61ab95e2a8e33541ef282f303d4 ]

    There are a couple of places in net/sched/ that check skb->protocol and act
    on the value there. However, in the presence of VLAN tags, the value stored
    in skb->protocol can be inconsistent based on whether VLAN acceleration is
    enabled. The commit quoted in the Fixes tag below fixed the users of
    skb->protocol to use a helper that will always see the VLAN ethertype.

    However, most of the callers don't actually handle the VLAN ethertype, but
    expect to find the IP header type in the protocol field. This means that
    things like changing the ECN field, or parsing diffserv values, stops
    working if there's a VLAN tag, or if there are multiple nested VLAN
    tags (QinQ).

    To fix this, change the helper to take an argument that indicates whether
    the caller wants to skip the VLAN tags or not. When skipping VLAN tags, we
    make sure to skip all of them, so behaviour is consistent even in QinQ
    mode.

    To make the helper usable from the ECN code, move it to if_vlan.h instead
    of pkt_sched.h.

    v3:
    - Remove empty lines
    - Move vlan variable definitions inside loop in skb_protocol()
    - Also use skb_protocol() helper in IP{,6}_ECN_decapsulate() and
    bpf_skb_ecn_set_ce()

    v2:
    - Use eth_type_vlan() helper in skb_protocol()
    - Also fix code that reads skb->protocol directly
    - Change a couple of 'if/else if' statements to switch constructs to avoid
    calling the helper twice

    Reported-by: Ilya Ponetayev
    Fixes: d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan path")
    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Toke Høiland-Jørgensen
     
  • [ Upstream commit 394de110a73395de2ca4516b0de435e91b11b604 ]

    Packets from tunnel devices (e.g. bareudp) may have only
    metadata in the dst pointer of the skb. Hence a pointer check of
    neigh_lookup is needed in dst_neigh_lookup_skb.

    The kernel crashes when packets from a bareudp device are processed in
    the kernel neighbour subsystem.

    [ 133.384484] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [ 133.385240] #PF: supervisor instruction fetch in kernel mode
    [ 133.385828] #PF: error_code(0x0010) - not-present page
    [ 133.386603] PGD 0 P4D 0
    [ 133.386875] Oops: 0010 [#1] SMP PTI
    [ 133.387275] CPU: 0 PID: 5045 Comm: ping Tainted: G W 5.8.0-rc2+ #15
    [ 133.388052] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
    [ 133.391076] RIP: 0010:0x0
    [ 133.392401] Code: Bad RIP value.
    [ 133.394029] RSP: 0018:ffffb79980003d50 EFLAGS: 00010246
    [ 133.396656] RAX: 0000000080000102 RBX: ffff9de2fe0d6600 RCX: ffff9de2fe5e9d00
    [ 133.399018] RDX: 0000000000000000 RSI: ffff9de2fe5e9d00 RDI: ffff9de2fc21b400
    [ 133.399685] RBP: ffff9de2fe5e9d00 R08: 0000000000000000 R09: 0000000000000000
    [ 133.400350] R10: ffff9de2fbc6be22 R11: ffff9de2fe0d6600 R12: ffff9de2fc21b400
    [ 133.401010] R13: ffff9de2fe0d6628 R14: 0000000000000001 R15: 0000000000000003
    [ 133.401667] FS: 00007fe014918740(0000) GS:ffff9de2fec00000(0000) knlGS:0000000000000000
    [ 133.402412] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 133.402948] CR2: ffffffffffffffd6 CR3: 000000003bb72000 CR4: 00000000000006f0
    [ 133.403611] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 133.404270] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 133.404933] Call Trace:
    [ 133.405169]
    [ 133.405367] __neigh_update+0x5a4/0x8f0
    [ 133.405734] arp_process+0x294/0x820
    [ 133.406076] ? __netif_receive_skb_core+0x866/0xe70
    [ 133.406557] arp_rcv+0x129/0x1c0
    [ 133.406882] __netif_receive_skb_one_core+0x95/0xb0
    [ 133.407340] process_backlog+0xa7/0x150
    [ 133.407705] net_rx_action+0x2af/0x420
    [ 133.408457] __do_softirq+0xda/0x2a8
    [ 133.408813] asm_call_on_stack+0x12/0x20
    [ 133.409290]
    [ 133.409519] do_softirq_own_stack+0x39/0x50
    [ 133.410036] do_softirq+0x50/0x60
    [ 133.410401] __local_bh_enable_ip+0x50/0x60
    [ 133.410871] ip_finish_output2+0x195/0x530
    [ 133.411288] ip_output+0x72/0xf0
    [ 133.411673] ? __ip_finish_output+0x1f0/0x1f0
    [ 133.412122] ip_send_skb+0x15/0x40
    [ 133.412471] raw_sendmsg+0x853/0xab0
    [ 133.412855] ? insert_pfn+0xfe/0x270
    [ 133.413827] ? vvar_fault+0xec/0x190
    [ 133.414772] sock_sendmsg+0x57/0x80
    [ 133.415685] __sys_sendto+0xdc/0x160
    [ 133.416605] ? syscall_trace_enter+0x1d4/0x2b0
    [ 133.417679] ? __audit_syscall_exit+0x1d9/0x280
    [ 133.418753] ? __prepare_exit_to_usermode+0x5d/0x1a0
    [ 133.419819] __x64_sys_sendto+0x24/0x30
    [ 133.420848] do_syscall_64+0x4d/0x90
    [ 133.421768] entry_SYSCALL_64_after_hwframe+0x44/0xa9
    [ 133.422833] RIP: 0033:0x7fe013689c03
    [ 133.423749] Code: Bad RIP value.
    [ 133.424624] RSP: 002b:00007ffc7288f418 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
    [ 133.425940] RAX: ffffffffffffffda RBX: 000056151fc63720 RCX: 00007fe013689c03
    [ 133.427225] RDX: 0000000000000040 RSI: 000056151fc63720 RDI: 0000000000000003
    [ 133.428481] RBP: 00007ffc72890b30 R08: 000056151fc60500 R09: 0000000000000010
    [ 133.429757] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
    [ 133.431041] R13: 000056151fc636e0 R14: 000056151fc616bc R15: 0000000000000080
    [ 133.432481] Modules linked in: mpls_iptunnel act_mirred act_tunnel_key cls_flower sch_ingress veth mpls_router ip_tunnel bareudp ip6_udp_tunnel udp_tunnel macsec udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc xt_MASQUERADE iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc ebtable_filter ebtables overlay ip6table_filter ip6_tables iptable_filter sunrpc ext4 mbcache jbd2 pcspkr i2c_piix4 virtio_balloon joydev ip_tables xfs libcrc32c ata_generic qxl pata_acpi drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ata_piix libata virtio_net net_failover virtio_console failover virtio_blk i2c_core virtio_pci virtio_ring serio_raw floppy virtio dm_mirror dm_region_hash dm_log dm_mod
    [ 133.444045] CR2: 0000000000000000
    [ 133.445082] ---[ end trace f4aeee1958fd1638 ]---
    [ 133.446236] RIP: 0010:0x0
    [ 133.447180] Code: Bad RIP value.
    [ 133.448152] RSP: 0018:ffffb79980003d50 EFLAGS: 00010246
    [ 133.449363] RAX: 0000000080000102 RBX: ffff9de2fe0d6600 RCX: ffff9de2fe5e9d00
    [ 133.450835] RDX: 0000000000000000 RSI: ffff9de2fe5e9d00 RDI: ffff9de2fc21b400
    [ 133.452237] RBP: ffff9de2fe5e9d00 R08: 0000000000000000 R09: 0000000000000000
    [ 133.453722] R10: ffff9de2fbc6be22 R11: ffff9de2fe0d6600 R12: ffff9de2fc21b400
    [ 133.455149] R13: ffff9de2fe0d6628 R14: 0000000000000001 R15: 0000000000000003
    [ 133.456520] FS: 00007fe014918740(0000) GS:ffff9de2fec00000(0000) knlGS:0000000000000000
    [ 133.458046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 133.459342] CR2: ffffffffffffffd6 CR3: 000000003bb72000 CR4: 00000000000006f0
    [ 133.460782] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 133.462240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 133.463697] Kernel panic - not syncing: Fatal exception in interrupt
    [ 133.465226] Kernel Offset: 0xfa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
    [ 133.467025] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

    Fixes: aaa0c23cb901 ("Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug")
    Signed-off-by: Martin Varghese
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Martin Varghese
     
  • [ Upstream commit 1e82a62fec613844da9e558f3493540a5b7a7b67 ]

    A potential deadlock can occur during registering or unregistering a
    new generic netlink family between the main nl_table_lock and the
    cb_lock where each thread wants the lock held by the other, as
    demonstrated below.

    1) Thread 1 is performing a netlink_bind() operation on a socket. As part
    of this call, it will call netlink_lock_table(), incrementing the
    nl_table_users count to 1.
    2) Thread 2 is registering (or unregistering) a genl_family via the
    genl_(un)register_family() API. The cb_lock semaphore will be taken for
    writing.
    3) Thread 1 will call genl_bind() as part of the bind operation to handle
    subscribing to GENL multicast groups at the request of the user. It will
    attempt to take the cb_lock semaphore for reading, but it will fail and
    be scheduled away, waiting for Thread 2 to finish the write.
    4) Thread 2 will call netlink_table_grab() during the (un)registration
    call. However, as Thread 1 has incremented nl_table_users, it will not
    be able to proceed, and both threads will be stuck waiting for the
    other.

    genl_bind() is a no-op unless a genl_family implements the mcast_bind()
    function to handle setting up family-specific multicast operations. Since
    no one in-tree uses this functionality, as Cong pointed out, simply removing
    the genl_bind() function will remove the possibility of deadlock, as there
    is no attempt by Thread 1 above to take the cb_lock semaphore.

    Fixes: c380d9a7afff ("genetlink: pass multicast bind/unbind to families")
    Suggested-by: Cong Wang
    Acked-by: Johannes Berg
    Reported-by: kernel test robot
    Signed-off-by: Sean Tranchetti
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Sean Tranchetti
     

16 Jul, 2020

3 commits

  • commit 63960260457a02af2a6cb35d75e6bdb17299c882 upstream.

    When evaluating access control over kallsyms visibility, credentials at
    open() time need to be used, not the "current" creds (though in BPF's
    case, this has likely always been the same). Plumb access to the associated
    file->f_cred down through bpf_dump_raw_ok() and its callers now that
    kallsyms_show_value() has been refactored to take struct cred.

    Cc: Alexei Starovoitov
    Cc: Daniel Borkmann
    Cc: bpf@vger.kernel.org
    Cc: stable@vger.kernel.org
    Fixes: 7105e828c087 ("bpf: allow for correlation of maps and helpers in dump")
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Kees Cook
     
  • commit 160251842cd35a75edfb0a1d76afa3eb674ff40a upstream.

    In order to perform future tests against the cred saved during open(),
    switch kallsyms_show_value() to operate on a cred, and have all current
    callers pass current_cred(). This makes it very obvious where callers
    are checking the wrong credential in their "read" contexts. These will
    be fixed in the coming patches.

    Additionally switch return value to bool, since it is always used as a
    direct permission check, not a 0-on-success, negative-on-error style
    function return.

    Cc: stable@vger.kernel.org
    Signed-off-by: Kees Cook
    Signed-off-by: Greg Kroah-Hartman

    Kees Cook
     
  • [ Upstream commit f79a732a8325dfbd570d87f1435019d7e5501c6d ]

    On partial_drain completion we should be in the SNDRV_PCM_STATE_RUNNING
    state, so set that for partially draining streams in
    snd_compr_drain_notify(), and use a flag to mark partially draining streams.

    While at it, add locks for stream state change in
    snd_compr_drain_notify() as well.

    Fixes: f44f2a5417b2 ("ALSA: compress: fix drain calls blocking other compress functions (v6)")
    Reviewed-by: Srinivas Kandagatla
    Tested-by: Srinivas Kandagatla
    Reviewed-by: Charles Keepax
    Tested-by: Charles Keepax
    Signed-off-by: Vinod Koul
    Link: https://lore.kernel.org/r/20200629134737.105993-4-vkoul@kernel.org
    Signed-off-by: Takashi Iwai
    Signed-off-by: Sasha Levin

    Vinod Koul
     

09 Jul, 2020

1 commit

  • commit 34c86f4c4a7be3b3e35aa48bd18299d4c756064d upstream.

    The locking in af_alg_release_parent is broken as the BH socket
    lock can only be taken if there is a code-path to handle the case
    where the lock is owned by process-context. Instead of adding
    such handling, we can fix this by changing the ref counts to
    atomic_t.

    This patch also modifies the main refcnt to include both normal
    and nokey sockets. This way we don't have to fudge the nokey
    ref count when a socket changes from nokey to normal.

    Credits go to Mauricio Faria de Oliveira who diagnosed this bug
    and sent a patch for it:

    https://lore.kernel.org/linux-crypto/20200605161657.535043-1-mfo@canonical.com/

    Reported-by: Brian Moyles
    Reported-by: Mauricio Faria de Oliveira
    Fixes: 37f96694cf73 ("crypto: af_alg - Use bh_lock_sock in...")
    Cc:
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Herbert Xu
     

01 Jul, 2020

6 commits

  • [ Upstream commit 97dd1abd026ae4e6a82fa68645928404ad483409 ]

    qed_chain_get_element_left{,_u32} returned 0 when the difference
    between producer and consumer page count was equal to the total
    page count.
    Fix this by conditionally expanding the producer value (vs
    unconditionally). This allows eliminating the normalization against the
    total page count, which was the cause of this bug.

    Misc: replace open-coded constants with common defines.

    Fixes: a91eb52abb50 ("qed: Revisit chain implementation")
    Signed-off-by: Alexander Lobakin
    Signed-off-by: Igor Russkikh
    Signed-off-by: Michal Kalderon
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Alexander Lobakin
     
  • [ Upstream commit 7dfc06a0f25b593a9f51992f540c0f80a57f3629 ]

    It is possible that the first event in the event log is not actually a
    log header at all, but rather a normal event. This leads to the cast in
    __calc_tpm2_event_size being an invalid conversion, which means that
    the values read are effectively garbage. Depending on the first event's
    contents, this leads either to apparently normal behaviour, a crash or
    a freeze.

    While this behaviour of the firmware is not in accordance with the
    TCG Client EFI Specification, this happens on a Dell Precision 5510
    with the TPM enabled but hidden from the OS ("TPM On" disabled, state
    otherwise untouched). The EFI firmware claims that the TPM is present
    and active and that it supports the TCG 2.0 event log format.

    Fortunately, this can be worked around by simply checking the header
    of the first event and the event log header signature itself.

    Commit b4f1874c6216 ("tpm: check event log version before reading final
    events") addressed a similar issue also found on Dell models.

    Fixes: 6b0326190205 ("efi: Attempt to get the TCG2 event log in the boot stub")
    Signed-off-by: Fabian Vogt
    Link: https://lore.kernel.org/r/1927248.evlx2EsYKh@linux-e202.suse.de
    Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1165773
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Sasha Levin

    Fabian Vogt
     
  • [ Upstream commit 94579ac3f6d0820adc83b5dc5358ead0158101e9 ]

    During IPsec performance testing, we see bad ICMP checksums. The error
    packet has a duplicated ESP trailer due to double validate_xmit_xfrm
    calls. The first call is from ip_output, but the packet cannot be sent
    because netif_xmit_frozen_or_stopped is true and the packet is requeued
    via dev_requeue_skb. The second call is from the NET_TX softirq. However,
    after the first call, the packet already has the ESP trailer.

    Fix by marking the skb with XFRM_XMIT bit after the packet is handled by
    validate_xmit_xfrm to avoid duplicate ESP trailer insertion.

    Fixes: f6e27114a60a ("net: Add a xfrm validate function to validate_xmit_skb")
    Signed-off-by: Huy Nguyen
    Reviewed-by: Boris Pismenny
    Reviewed-by: Raed Salem
    Reviewed-by: Saeed Mahameed
    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin

    Huy Nguyen
     
  • [ Upstream commit 471e39df96b9a4c4ba88a2da9e25a126624d7a9c ]

    If a socket is set ipv6only, it will still send IPv4 addresses in the
    INIT and INIT_ACK packets. This potentially misleads the peer into using
    them, which then would cause association termination.

    The fix is to not add IPv4 addresses to ipv6only sockets.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Reported-by: Corey Minyard
    Signed-off-by: Marcelo Ricardo Leitner
    Tested-by: Corey Minyard
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Marcelo Ricardo Leitner
     
  • [ Upstream commit 41b14fb8724d5a4b382a63cb4a1a61880347ccb8 ]

    Clearing the sock TX queue in sk_set_socket() might cause unexpected
    out-of-order transmit when called from sock_orphan(), as outstanding
    packets can pick a different TX queue and bypass the ones already queued.

    This is undesired in general. More specifically, it breaks the in-order
    scheduling property guarantee for device-offloaded TLS sockets.

    Remove the call to sk_tx_queue_clear() in sk_set_socket(), and add it
    explicitly only where needed.

    Fixes: e022f0b4a03f ("net: Introduce sk_tx_queue_mapping")
    Signed-off-by: Tariq Toukan
    Reviewed-by: Boris Pismenny
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Tariq Toukan
     
  • [ Upstream commit fb7861d14c8d7edac65b2fcb6e8031cb138457b2 ]

    In the current code, ->ndo_start_xmit() can be executed recursively only
    10 times because of stack memory.
    But in the case of vxlan, a recursion limit of 10 results in
    a stack overflow.
    In the current code, interface nesting is limited to a depth of 8.
    There is no critical reason that the recursion limit should
    be 10.
    So it would be good to use the same value as the limit on
    nested interface depth.

    Test commands:
    ip link add vxlan10 type vxlan vni 10 dstport 4789 srcport 4789 4789
    ip link set vxlan10 up
    ip a a 192.168.10.1/24 dev vxlan10
    ip n a 192.168.10.2 dev vxlan10 lladdr fc:22:33:44:55:66 nud permanent

    for i in {9..0}
    do
    let A=$i+1
    ip link add vxlan$i type vxlan vni $i dstport 4789 srcport 4789 4789
    ip link set vxlan$i up
    ip a a 192.168.$i.1/24 dev vxlan$i
    ip n a 192.168.$i.2 dev vxlan$i lladdr fc:22:33:44:55:66 nud permanent
    bridge fdb add fc:22:33:44:55:66 dev vxlan$A dst 192.168.$i.2 self
    done
    hping3 192.168.10.2 -2 -d 60000

    Splat looks like:
    [ 103.814237][ T1127] =============================================================================
    [ 103.871955][ T1127] BUG kmalloc-2k (Tainted: G B ): Padding overwritten. 0x00000000897a2e4f-0x000
    [ 103.873187][ T1127] -----------------------------------------------------------------------------
    [ 103.873187][ T1127]
    [ 103.874252][ T1127] INFO: Slab 0x000000005cccc724 objects=5 used=5 fp=0x0000000000000000 flags=0x10000000001020
    [ 103.881323][ T1127] CPU: 3 PID: 1127 Comm: hping3 Tainted: G B 5.7.0+ #575
    [ 103.882131][ T1127] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
    [ 103.883006][ T1127] Call Trace:
    [ 103.883324][ T1127] dump_stack+0x96/0xdb
    [ 103.883716][ T1127] slab_err+0xad/0xd0
    [ 103.884106][ T1127] ? _raw_spin_unlock+0x1f/0x30
    [ 103.884620][ T1127] ? get_partial_node.isra.78+0x140/0x360
    [ 103.885214][ T1127] slab_pad_check.part.53+0xf7/0x160
    [ 103.885769][ T1127] ? pskb_expand_head+0x110/0xe10
    [ 103.886316][ T1127] check_slab+0x97/0xb0
    [ 103.886763][ T1127] alloc_debug_processing+0x84/0x1a0
    [ 103.887308][ T1127] ___slab_alloc+0x5a5/0x630
    [ 103.887765][ T1127] ? pskb_expand_head+0x110/0xe10
    [ 103.888265][ T1127] ? lock_downgrade+0x730/0x730
    [ 103.888762][ T1127] ? pskb_expand_head+0x110/0xe10
    [ 103.889244][ T1127] ? __slab_alloc+0x3e/0x80
    [ 103.889675][ T1127] __slab_alloc+0x3e/0x80
    [ 103.890108][ T1127] __kmalloc_node_track_caller+0xc7/0x420
    [ ... ]

    Fixes: 11a766ce915f ("net: Increase xmit RECURSION_LIMIT to 10.")
    Signed-off-by: Taehee Yoo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Taehee Yoo
     

24 Jun, 2020

11 commits

  • commit 9b38cc704e844e41d9cf74e647bff1d249512cb3 upstream.

    Ziqian reported a lockup when adding a retprobe on _raw_spin_lock_irqsave.
    My test was also able to trigger lockdep output:

    ============================================
    WARNING: possible recursive locking detected
    5.6.0-rc6+ #6 Not tainted
    --------------------------------------------
    sched-messaging/2767 is trying to acquire lock:
    ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0

    but task is already holding lock:
    ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50

    other info that might help us debug this:
    Possible unsafe locking scenario:

    CPU0
    ----
    lock(&(kretprobe_table_locks[i].lock));
    lock(&(kretprobe_table_locks[i].lock));

    *** DEADLOCK ***

    May be due to missing lock nesting notation

    1 lock held by sched-messaging/2767:
    #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50

    stack backtrace:
    CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
    Call Trace:
    dump_stack+0x96/0xe0
    __lock_acquire.cold.57+0x173/0x2b7
    ? native_queued_spin_lock_slowpath+0x42b/0x9e0
    ? lockdep_hardirqs_on+0x590/0x590
    ? __lock_acquire+0xf63/0x4030
    lock_acquire+0x15a/0x3d0
    ? kretprobe_hash_lock+0x52/0xa0
    _raw_spin_lock_irqsave+0x36/0x70
    ? kretprobe_hash_lock+0x52/0xa0
    kretprobe_hash_lock+0x52/0xa0
    trampoline_handler+0xf8/0x940
    ? kprobe_fault_handler+0x380/0x380
    ? find_held_lock+0x3a/0x1c0
    kretprobe_trampoline+0x25/0x50
    ? lock_acquired+0x392/0xbc0
    ? _raw_spin_lock_irqsave+0x50/0x70
    ? __get_valid_kprobe+0x1f0/0x1f0
    ? _raw_spin_unlock_irqrestore+0x3b/0x40
    ? finish_task_switch+0x4b9/0x6d0
    ? __switch_to_asm+0x34/0x70
    ? __switch_to_asm+0x40/0x70

    The code within the kretprobe handler checks for probe reentrancy,
    so we won't trigger any _raw_spin_lock_irqsave probe in there.

    The problem is outside of that handler, in kprobe_flush_task, where we call:

    kprobe_flush_task
    kretprobe_table_lock
    raw_spin_lock_irqsave
    _raw_spin_lock_irqsave

    where _raw_spin_lock_irqsave triggers the kretprobe and installs
    kretprobe_trampoline handler on _raw_spin_lock_irqsave return.

    The kretprobe_trampoline handler is then executed with
    kretprobe_table_locks already locked, and the first thing it does is
    lock kretprobe_table_locks ;-) The whole lockup path looks like:

    kprobe_flush_task
    kretprobe_table_lock
    raw_spin_lock_irqsave
    _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed

    ---> kretprobe_table_locks locked

    kretprobe_trampoline
    trampoline_handler
    kretprobe_hash_lock(current, &head, &flags);
    Cc: "Gustavo A . R . Silva"
    Cc: Anders Roxell
    Cc: "Naveen N . Rao"
    Cc: Anil S Keshavamurthy
    Cc: David Miller
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: stable@vger.kernel.org
    Reported-by: "Ziqian SUN (Zamir)"
    Acked-by: Masami Hiramatsu
    Signed-off-by: Jiri Olsa
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Greg Kroah-Hartman

    Jiri Olsa
     
  • [ Upstream commit 15b81ce5abdc4b502aa31dff2d415b79d2349d2f ]

    For optimized block readers not holding a mutex, the "number of sectors"
    64-bit value is protected from tearing on 32-bit architectures by a
    sequence counter.

    Disable preemption before entering that sequence counter's write side
    critical section. Otherwise, the read side can preempt the write side
    section and spin for the entire scheduler tick. If the reader belongs to
    a real-time scheduling class, it can spin forever and the kernel will
    livelock.

    Fixes: c83f6bf98dc1 ("block: add partition resize function to blkpg ioctl")
    Cc:
    Signed-off-by: Ahmed S. Darwish
    Reviewed-by: Sebastian Andrzej Siewior
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Ahmed S. Darwish
     
  • [ Upstream commit 7f6225e446cc8dfa4c3c7959a4de3dd03ec277bf ]

    __jbd2_journal_abort_hard() is no longer used, so now we can merge
    these two functions, __jbd2_journal_abort_hard() and
    __journal_abort_soft(), into jbd2_journal_abort() and remove them.

    Signed-off-by: zhangyi (F)
    Reviewed-by: Jan Kara
    Link: https://lore.kernel.org/r/20191204124614.45424-5-yi.zhang@huawei.com
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Sasha Levin

    zhangyi (F)
     
  • [ Upstream commit b5292111de9bb70cba3489075970889765302136 ]

    Commit 130f4caf145c ("libata: Ensure ata_port probe has completed before
    detach") may cause system freeze during suspend.

    Using async_synchronize_full() in PM callbacks is wrong, since async
    callbacks that are already scheduled may wait for not-yet-scheduled
    callbacks, causing a circular dependency.

    Instead of using a big hammer like async_synchronize_full(), use an async
    cookie to make sure port probes are synced, without affecting other
    scheduled PM callbacks.

    Fixes: 130f4caf145c ("libata: Ensure ata_port probe has completed before detach")
    Suggested-by: John Garry
    Signed-off-by: Kai-Heng Feng
    Tested-by: John Garry
    BugLink: https://bugs.launchpad.net/bugs/1867983
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Kai-Heng Feng
     
  • [ Upstream commit cc7eac1e4afdd151085be4d0341a155760388653 ]

    EHCI/OHCI controllers on R-Car Gen3 SoCs can very rarely get
    stuck after a full/low-speed USB device is disconnected. To
    detect and recover from such a situation, the controllers
    require a special procedure that polls the EHCI PORTSC register and
    changes the OHCI functional state.

    So, this patch adds a polling timer into the ehci-platform driver,
    and if the ehci driver detects the issue by the EHCI PORTSC register,
    the ehci driver removes a companion device (= the OHCI controller)
    to change the OHCI functional state to USB Reset once. And then,
    the ehci driver adds the companion device again.

    Signed-off-by: Yoshihiro Shimoda
    Acked-by: Alan Stern
    Link: https://lore.kernel.org/r/1580114262-25029-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

    Yoshihiro Shimoda
     
  • [ Upstream commit 3a39e778690500066b31fe982d18e2e394d3bce2 ]

    Use the following commands to test nfsv4 (the size of file1M is 1MB):

        mount -t nfs -o vers=4.0,actimeo=60 127.0.0.1:/dir1 /mnt
        cp file1M /mnt
        du -h /mnt/file1M    --> 0 within 60s, then 1M

    When the write is done (cp file1M /mnt), this call chain runs:

        nfs_writeback_done
          nfs4_write_done
            nfs4_write_done_cb
              nfs_writeback_update_inode
                nfs_post_op_update_inode_force_wcc(change, ctime, mtime)
                  nfs_post_op_update_inode_force_wcc_locked
                    nfs_set_cache_invalid
                    nfs_refresh_inode_locked
                    nfs_update_inode

    The nfsd write response contains change, ctime and mtime, so the flag
    will be cleared after nfs_update_inode. However, the write response
    does not contain space_used; the previous open response contained
    space_used with value 0, so inode->i_blocks is still 0.

    nfs_getattr   --> called by "du -h"
        do_update |= force_sync || nfs_attribute_cache_expired;  --> false within 60s
        cache_validity = READ_ONCE(NFS_I(inode)->cache_validity);
        do_update |= cache_validity & NFS_INO_INVALID_ATTR;      --> false
        if (do_update) {
            __nfs_revalidate_inode();
        }

    Within 60s no getattr request is sent to nfsd, so "du -h /mnt/file1M"
    reports 0.

    Add a NFS_INO_INVALID_BLOCKS flag, set it when nfsv4 write is done.
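    The cache_validity logic above can be sketched as the following toy
    model (flag values and helper names are illustrative, not the
    kernel's): the write reply refreshes change/ctime/mtime and clears the
    attribute flag, but space_used was never refreshed, so a dedicated
    blocks flag is needed to force the getattr revalidation.

```c
/* Toy model of the cache_validity bitmask (illustrative values). */
enum {
    NFS_INO_INVALID_ATTR   = 1 << 0,
    NFS_INO_INVALID_BLOCKS = 1 << 1,   /* the new flag */
};

static unsigned long cache_validity;

static void v4_write_done(void)
{
    /* write completion: attrs will be refreshed, i_blocks will not */
    cache_validity |= NFS_INO_INVALID_ATTR | NFS_INO_INVALID_BLOCKS;
}

static void update_inode_from_write_reply(void)
{
    /* the reply carried change/ctime/mtime but no space_used */
    cache_validity &= ~NFS_INO_INVALID_ATTR;
}

static int getattr_needs_revalidate(void)
{
    return (cache_validity &
            (NFS_INO_INVALID_ATTR | NFS_INO_INVALID_BLOCKS)) != 0;
}
```

    Without the separate blocks bit, getattr_needs_revalidate() would
    return 0 inside the actimeo window and "du -h" would keep reporting 0.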

    Fixes: 16e143751727 ("NFS: More fine grained attribute tracking")
    Signed-off-by: Zheng Bin
    Signed-off-by: Anna Schumaker
    Signed-off-by: Sasha Levin

    Zheng Bin
     
  • [ Upstream commit bd93f003b7462ae39a43c531abca37fe7073b866 ]

    Clang normally does not warn about certain issues in inline functions when
    it only happens in an eliminated code path. However if something else
    goes wrong, it does tend to complain about the definition of hweight_long()
    on 32-bit targets:

    include/linux/bitops.h:75:41: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
    return sizeof(w) == 4 ? hweight32(w) : hweight64(w);
    ^~~~~~~~~~~~
    include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64'
    #define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))
    ^~~~~~~~~~~~~~~~~~~~
    include/asm-generic/bitops/const_hweight.h:21:76: note: expanded from macro '__const_hweight64'
    #define __const_hweight64(w) (__const_hweight32(w) + __const_hweight32((w) >> 32))
    ^ ~~
    include/asm-generic/bitops/const_hweight.h:20:49: note: expanded from macro '__const_hweight32'
    #define __const_hweight32(w) (__const_hweight16(w) + __const_hweight16((w) >> 16))
    ^
    include/asm-generic/bitops/const_hweight.h:19:72: note: expanded from macro '__const_hweight16'
    #define __const_hweight16(w) (__const_hweight8(w) + __const_hweight8((w) >> 8 ))
    ^
    include/asm-generic/bitops/const_hweight.h:12:9: note: expanded from macro '__const_hweight8'
    (!!((w) & (1ULL << 2))) + \

    Adding an explicit cast to __u64 avoids that warning and makes the
    rest of the compiler output easier to read.
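    A minimal sketch of the idea, assuming a 32-bit `unsigned long`: the
    64-bit branch of hweight_long() is dead code there but still gets
    type-checked, and a `(w) >> 32` on a 32-bit operand is what trips
    -Wshift-count-overflow. Widening the operand first, as the patch's
    (__u64) cast does, makes the shift well-defined (helper names here are
    illustrative, not the kernel's).

```c
/* Illustrative stand-ins for hweight64()/hweight_long(). */
static inline unsigned int my_hweight64(unsigned long long w)
{
    return (unsigned int)__builtin_popcountll(w);
}

static inline unsigned int my_hweight_long(unsigned long w)
{
    /* the explicit widening cast mirrors the patch's fix */
    return my_hweight64((unsigned long long)w);
}
```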

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Acked-by: Christian Brauner
    Cc: Andy Shevchenko
    Cc: Rasmus Villemoes
    Cc: Josh Poimboeuf
    Cc: Nick Desaulniers
    Link: http://lkml.kernel.org/r/20200505135513.65265-1-arnd@arndb.de
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Arnd Bergmann
     
  • [ Upstream commit 3234ac664a870e6ea69ae3a57d824cd7edbeacc5 ]

    Close the hole of holding a mapping over kernel driver takeover event of
    a given address range.

    Commit 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
    introduced CONFIG_IO_STRICT_DEVMEM with the goal of protecting the
    kernel against scenarios where a /dev/mem user tramples memory that a
    kernel driver owns. However, this protection only prevents *new* read(),
    write() and mmap() requests. Established mappings prior to the driver
    calling request_mem_region() are left alone.

    Especially with persistent memory, and the core kernel metadata that is
    stored there, there are plentiful scenarios for a /dev/mem user to
    violate the expectations of the driver and cause amplified damage.

    Teach request_mem_region() to find and shoot down active /dev/mem
    mappings that it believes it has successfully claimed for the exclusive
    use of the driver. Effectively a driver call to request_mem_region()
    becomes a hole-punch on the /dev/mem device.

    The typical usage of unmap_mapping_range() is part of
    truncate_pagecache() to punch a hole in a file, but in this case the
    implementation is only doing the "first half" of a hole punch. Namely it
    is just evacuating current established mappings of the "hole", and it
    relies on the fact that /dev/mem establishes mappings in terms of
    absolute physical address offsets. Once existing mmap users are
    invalidated they can attempt to re-establish the mapping, or attempt to
    continue issuing read(2) / write(2) to the invalidated extent, but they
    will then be subject to the CONFIG_IO_STRICT_DEVMEM checking that can
    block those subsequent accesses.
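    A toy model of the "evacuation half" of the hole punch described above
    (structures and names are hypothetical, not the kernel's): established
    /dev/mem mappings are extents of absolute physical offsets, and a
    driver claim on [start, start+len) revokes every overlapping extent;
    re-faulting afterwards is what then runs into the STRICT_DEVMEM checks.

```c
#include <stddef.h>

/* One established /dev/mem mapping, as an extent of physical offsets. */
struct mapping { unsigned long start, len; int valid; };

/* Revoke every mapping overlapping [start, start+len) -- a stand-in for
 * the unmap_mapping_range() call made on the driver's behalf. */
static void revoke_range(struct mapping *maps, size_t n,
                         unsigned long start, unsigned long len)
{
    for (size_t i = 0; i < n; i++)
        if (maps[i].valid &&
            maps[i].start < start + len &&
            start < maps[i].start + maps[i].len)
            maps[i].valid = 0;
}
```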

    Cc: Arnd Bergmann
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: Matthew Wilcox
    Cc: Russell King
    Cc: Andrew Morton
    Cc: Greg Kroah-Hartman
    Fixes: 90a545e98126 ("restrict /dev/mem to idle io memory ranges")
    Signed-off-by: Dan Williams
    Reviewed-by: Kees Cook
    Link: https://lore.kernel.org/r/159009507306.847224.8502634072429766747.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

    Dan Williams
     
  • [ Upstream commit 97eda5dcc2cde5dcc778bef7a9344db3b6bf8ef5 ]

    When the STMFX supply is stopped, spurious interrupts can occur. To
    avoid that, disable the interrupt in suspend before disabling the
    regulator, and re-enable it at the end of resume.

    Fixes: 06252ade9156 ("mfd: Add ST Multi-Function eXpander (STMFX) core driver")
    Signed-off-by: Amelie Delaunay
    Signed-off-by: Lee Jones
    Signed-off-by: Sasha Levin

    Amelie Delaunay
     
  • [ Upstream commit 5d363120aa548ba52d58907a295eee25f8207ed2 ]

    This patch adds a new config_ep_by_speed_and_alt function, which
    extends config_ep_by_speed with an alt parameter. This additional
    parameter makes it possible to find the proper
    usb_ss_ep_comp_descriptor.

    The problem appeared while testing the f_tcm (BOT/UAS) driver
    function.

    For SS, the f_tcm function uses a single array of headers for both the
    BOT and UAS alternate settings:

    static struct usb_descriptor_header *uasp_ss_function_desc[] = {
    (struct usb_descriptor_header *) &bot_intf_desc,
    (struct usb_descriptor_header *) &uasp_ss_bi_desc,
    (struct usb_descriptor_header *) &bot_bi_ep_comp_desc,
    (struct usb_descriptor_header *) &uasp_ss_bo_desc,
    (struct usb_descriptor_header *) &bot_bo_ep_comp_desc,

    (struct usb_descriptor_header *) &uasp_intf_desc,
    (struct usb_descriptor_header *) &uasp_ss_bi_desc,
    (struct usb_descriptor_header *) &uasp_bi_ep_comp_desc,
    (struct usb_descriptor_header *) &uasp_bi_pipe_desc,
    (struct usb_descriptor_header *) &uasp_ss_bo_desc,
    (struct usb_descriptor_header *) &uasp_bo_ep_comp_desc,
    (struct usb_descriptor_header *) &uasp_bo_pipe_desc,
    (struct usb_descriptor_header *) &uasp_ss_status_desc,
    (struct usb_descriptor_header *) &uasp_status_in_ep_comp_desc,
    (struct usb_descriptor_header *) &uasp_status_pipe_desc,
    (struct usb_descriptor_header *) &uasp_ss_cmd_desc,
    (struct usb_descriptor_header *) &uasp_cmd_comp_desc,
    (struct usb_descriptor_header *) &uasp_cmd_pipe_desc,
    NULL,
    };

    The first 5 descriptors are associated with the BOT alternate setting,
    and the others with UAS.

    While handling the UAS alternate setting, the f_tcm driver invokes
    config_ep_by_speed, and this function sets an incorrect companion
    endpoint descriptor in the usb_ep object.

    Instead setting ep->comp_desc to uasp_bi_ep_comp_desc function in this
    case set ep->comp_desc to uasp_ss_bi_desc.

    This is because it searches for the endpoint based only on the
    endpoint address:

        for_each_ep_desc(speed_desc, d_spd) {
            chosen_desc = (struct usb_endpoint_descriptor *)*d_spd;
            if (chosen_desc->bEndpointAddress == _ep->address)
                goto ep_found;
        }

    As a result, it uses the descriptor from the BOT alternate setting
    instead of the UAS one.

    In consequence, while enabling endpoints, the controller driver
    detects that the just-enabled endpoint belongs to BOT.
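    The failure mode can be sketched with simplified structures (not the
    gadget core's): searching a flat descriptor array by endpoint address
    alone returns the FIRST hit, i.e. the BOT alternate setting's
    descriptor, even while the UAS alternate setting is being configured.
    Matching the alternate setting as well, which is what the new
    config_ep_by_speed_and_alt() enables, picks the right one.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified descriptor: which alt setting it belongs to, and its
 * endpoint address. Two alt settings reuse the same address. */
struct ep_desc { int alt; uint8_t bEndpointAddress; };

/* Find a descriptor by address; alt < 0 means "address only", which is
 * the old config_ep_by_speed() behaviour. */
static const struct ep_desc *find_ep(const struct ep_desc *d, size_t n,
                                     uint8_t addr, int alt)
{
    for (size_t i = 0; i < n; i++)
        if (d[i].bEndpointAddress == addr &&
            (alt < 0 || d[i].alt == alt))
            return &d[i];
    return NULL;
}
```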

    Signed-off-by: Jayshri Pawar
    Signed-off-by: Pawel Laszczak
    Signed-off-by: Felipe Balbi
    Signed-off-by: Sasha Levin

    Pawel Laszczak
     
  • [ Upstream commit 3c73bc52195def14165c3a7d91bdbb33b51725f5 ]

    The threaded interrupt handler may still be called after
    usb_gadget_disconnect is called, which can cause the structures used
    by the interrupt handler, e.g. the usb_request, to be freed before
    they are used. This issue usually occurs when we remove the udc
    function during a transfer. Below is an example from a stress test of
    the android switch function: the EP0 request is freed by .unbind
    (configfs_composite_unbind -> composite_dev_cleanup), but the threaded
    handler accesses this request while handling the setup packet request.

    In fact, there is no protection between unbinding the udc and udc
    interrupt handling, so we have to prevent the interrupt handler from
    running or being scheduled during the .unbind flow.

    init: Sending signal 9 to service 'adbd' (pid 18077) process group...
    android_work: did not send uevent (0 0 000000007bec2039)
    libprocessgroup: Successfully killed process cgroup uid 0 pid 18077 in 6ms
    init: Service 'adbd' (pid 18077) received signal 9
    init: Sending signal 9 to service 'adbd' (pid 18077) process group...
    libprocessgroup: Successfully killed process cgroup uid 0 pid 18077 in 0ms
    init: processing action (init.svc.adbd=stopped) from (/init.usb.configfs.rc:14)
    init: Received control message 'start' for 'adbd' from pid: 399 (/vendor/bin/hw/android.hardware.usb@1.

    init: starting service 'adbd'...
    read descriptors
    read strings
    Unable to handle kernel read from unreadable memory at virtual address 000000000000002a
    android_work: sent uevent USB_STATE=CONNECTED
    Mem abort info:
    ESR = 0x96000004
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    Data abort info:
    ISV = 0, ISS = 0x00000004
    CM = 0, WnR = 0
    user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e97f1000
    using random self ethernet address
    [000000000000002a] pgd=0000000000000000
    Internal error: Oops: 96000004 [#1] PREEMPT SMP
    Modules linked in:
    CPU: 0 PID: 232 Comm: irq/68-5b110000 Not tainted 5.4.24-06075-g94a6b52b5815 #92
    Hardware name: Freescale i.MX8QXP MEK (DT)
    pstate: 00400085 (nzcv daIf +PAN -UAO)
    using random host ethernet address
    pc : composite_setup+0x5c/0x1730
    lr : android_setup+0xc0/0x148
    sp : ffff80001349bba0
    x29: ffff80001349bba0 x28: ffff00083a50da00
    x27: ffff8000124e6000 x26: ffff800010177950
    x25: 0000000000000040 x24: ffff000834e18010
    x23: 0000000000000000 x22: 0000000000000000
    x21: ffff00083a50da00 x20: ffff00082e75ec40
    x19: 0000000000000000 x18: 0000000000000000
    x17: 0000000000000000 x16: 0000000000000000
    x15: 0000000000000000 x14: 0000000000000000
    x13: 0000000000000000 x12: 0000000000000001
    x11: ffff80001180fb58 x10: 0000000000000040
    x9 : ffff8000120fc980 x8 : 0000000000000000
    x7 : ffff00083f98df50 x6 : 0000000000000100
    x5 : 00000307e8978431 x4 : ffff800011386788
    x3 : 0000000000000000 x2 : ffff800012342000
    x1 : 0000000000000000 x0 : ffff800010c6d3a0
    Call trace:
    composite_setup+0x5c/0x1730
    android_setup+0xc0/0x148
    cdns3_ep0_delegate_req+0x64/0x90
    cdns3_check_ep0_interrupt_proceed+0x384/0x738
    cdns3_device_thread_irq_handler+0x124/0x6e0
    cdns3_thread_irq+0x94/0xa0
    irq_thread_fn+0x30/0xa0
    irq_thread+0x150/0x248
    kthread+0xfc/0x128
    ret_from_fork+0x10/0x18
    Code: 910e8000 f9400693 12001ed7 79400f79 (3940aa61)
    ---[ end trace c685db37f8773fba ]---
    Kernel panic - not syncing: Fatal exception
    SMP: stopping secondary CPUs
    Kernel Offset: disabled
    CPU features: 0x0002,20002008
    Memory Limit: none
    Rebooting in 5 seconds..

    Reviewed-by: Jun Li
    Signed-off-by: Peter Chen
    Signed-off-by: Felipe Balbi
    Signed-off-by: Sasha Levin

    Peter Chen
     

22 Jun, 2020

9 commits

  • commit 24c5efe41c29ee3e55bcf5a1c9f61ca8709622e8 upstream.

    gss_mech_register() calls svcauth_gss_register_pseudoflavor() for each
    flavour, but gss_mech_unregister() does not call auth_domain_put().
    This is unbalanced and makes it impossible to reload the module.

    Change svcauth_gss_register_pseudoflavor() to return the registered
    auth_domain, and save it for later release.
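    A minimal sketch of the shape of this fix, with hypothetical types
    standing in for the SUNRPC ones: registration takes a reference and
    now returns the auth_domain so the caller can save it, and
    unregistration drops that saved reference, restoring balance so the
    module can be unloaded and reloaded.

```c
/* Hypothetical ref-counted stand-in for the SUNRPC auth_domain. */
struct auth_domain { int refcount; };

static struct auth_domain *register_pseudoflavor(struct auth_domain *d)
{
    d->refcount++;   /* reference taken at registration */
    return d;        /* now returned so the mech can save it */
}

static void auth_domain_put(struct auth_domain *d)
{
    d->refcount--;   /* the previously-missing release on unregister */
}
```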

    Cc: stable@vger.kernel.org (v2.6.12+)
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=206651
    Signed-off-by: NeilBrown
    Signed-off-by: J. Bruce Fields
    Signed-off-by: Greg Kroah-Hartman

    NeilBrown
     
  • [ Upstream commit a4e91825d7e1252f7cba005f1451e5464b23c15d ]

    Add PCI IDs for AMD Renoir (4000-series Ryzen CPUs). This is necessary
    to enable support for temperature sensors via the k10temp module.

    Signed-off-by: Alexander Monakov
    Signed-off-by: Borislav Petkov
    Acked-by: Yazen Ghannam
    Acked-by: Guenter Roeck
    Link: https://lkml.kernel.org/r/20200510204842.2603-2-amonakov@ispras.ru
    Signed-off-by: Sasha Levin

    Alexander Monakov
     
  • [ Upstream commit 62a7f3009a460001eb46984395280dd900bc4ef4 ]

    Move the IDs to pci_ids.h so they can be used by the next patch.

    Link: https://lore.kernel.org/r/20200508065343.32751-1-kai.heng.feng@canonical.com
    Signed-off-by: Kai-Heng Feng
    Signed-off-by: Bjorn Helgaas
    Acked-by: Greg Kroah-Hartman
    Cc: stable@vger.kernel.org
    Signed-off-by: Sasha Levin

    Kai-Heng Feng
     
  • [ Upstream commit 9acb9fe18d863aacc99948963f8d5d447dc311be ]

    Add the Loongson vendor ID to pci_ids.h to be used by the controller
    driver in the future.

    The Loongson vendor ID can be found at the following link:
    https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/tree/pci.ids

    Signed-off-by: Tiezhu Yang
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Tiezhu Yang
     
  • [ Upstream commit b3f79ae45904ae987a7c06a9e8d6084d7b73e67f ]

    Add the new PCI Device 18h IDs for AMD Family 19h systems. Note that
    Family 19h systems will not have a new PCI root device ID.

    Signed-off-by: Yazen Ghannam
    Signed-off-by: Borislav Petkov
    Link: https://lkml.kernel.org/r/20200110015651.14887-4-Yazen.Ghannam@amd.com
    Signed-off-by: Sasha Levin

    Yazen Ghannam
     
  • [ Upstream commit ec11e5c213cc20cac5e8310728b06793448b9f6d ]

    This patch adds support for the VMD device that supports the bus
    restriction mode.

    Signed-off-by: Jon Derrick
    Signed-off-by: Lorenzo Pieralisi
    Signed-off-by: Sasha Levin

    Jon Derrick
     
  • commit 3d060856adfc59afb9d029c233141334cfaba418 upstream.

    Initializing struct pages is a long task and keeping interrupts disabled
    for the duration of this operation introduces a number of problems.

    1. jiffies are not updated for a long period of time, and thus
    incorrect time is reported. See proposed solution and discussion here:
    lkml/20200311123848.118638-1-shile.zhang@linux.alibaba.com
    2. It prevents further improving deferred page initialization by
    allowing intra-node multi-threading.

    We are keeping interrupts disabled to solve a rather theoretical
    problem that was never observed in the real world (see 3a2d7fa8a3d5).

    Let's keep interrupts enabled. In case we ever encounter a scenario
    where an interrupt thread wants to allocate a large amount of memory
    this early in boot, we can deal with that by growing the zone (see
    deferred_grow_zone()) by the needed amount before starting the
    deferred_init_memmap() threads.

    Before:
    [ 1.232459] node 0 initialised, 12058412 pages in 1ms

    After:
    [ 1.632580] node 0 initialised, 12051227 pages in 436ms

    Fixes: 3a2d7fa8a3d5 ("mm: disable interrupts while initializing deferred pages")
    Reported-by: Shile Zhang
    Signed-off-by: Pavel Tatashin
    Signed-off-by: Andrew Morton
    Reviewed-by: Daniel Jordan
    Reviewed-by: David Hildenbrand
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Dan Williams
    Cc: James Morris
    Cc: Kirill Tkhai
    Cc: Sasha Levin
    Cc: Yiqian Wei
    Cc: [4.17+]
    Link: http://lkml.kernel.org/r/20200403140952.17177-3-pasha.tatashin@soleen.com
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Pavel Tatashin
     
  • [ Upstream commit 47227d27e2fcb01a9e8f5958d8997cf47a820afc ]

    The memcmp KASAN self-test fails on a kernel with both KASAN and
    FORTIFY_SOURCE.

    When FORTIFY_SOURCE is on, a number of functions are replaced with
    fortified versions, which attempt to check the sizes of the operands.
    However, these functions often directly invoke __builtin_foo() once
    they have performed the fortify check. Using __builtins may bypass
    KASAN checks if the compiler decides to inline its own implementation
    as a sequence of instructions, rather than emit a function call that
    goes out to a KASAN-instrumented implementation.

    Why is only memcmp affected?
    ============================

    Of the string and string-like functions that kasan_test tests, only memcmp
    is replaced by an inline sequence of instructions in my testing on x86
    with gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu2).

    I believe this is due to compiler heuristics. For example, if I annotate
    kmalloc calls with the alloc_size annotation (and disable some fortify
    compile-time checking!), the compiler will replace every memset except the
    one in kmalloc_uaf_memset with inline instructions. (I have some WIP
    patches to add this annotation.)

    Does this affect other functions in string.h?
    =============================================

    Yes. Anything that uses __builtin_* rather than __real_* could be
    affected. This looks like:

    - strncpy
    - strcat
    - strlen
    - strlcpy maybe, under some circumstances?
    - strncat under some circumstances
    - memset
    - memcpy
    - memmove
    - memcmp (as noted)
    - memchr
    - strcpy

    Whether a function call is emitted always depends on the compiler. Most
    bugs should get caught by FORTIFY_SOURCE, but the missed memcmp test shows
    that this is not always the case.

    Isn't FORTIFY_SOURCE disabled with KASAN?
    =========================================

    The string headers on all arches supporting KASAN disable fortify with
    kasan, but only when address sanitisation is _also_ disabled. For example
    from x86:

    #if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
    /*
    * For files that are not instrumented (e.g. mm/slub.c) we
    * should use not instrumented version of mem* functions.
    */
    #define memcpy(dst, src, len) __memcpy(dst, src, len)
    #define memmove(dst, src, len) __memmove(dst, src, len)
    #define memset(s, c, n) __memset(s, c, n)

    #ifndef __NO_FORTIFY
    #define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
    #endif

    #endif

    This comes from commit 6974f0c4555e ("include/linux/string.h: add the
    option of fortified string.h functions"), and doesn't work when KASAN is
    enabled and the file is supposed to be sanitised - as with test_kasan.c

    I'm pretty sure this is not wrong, but not as expansive as it should be:

    * we shouldn't use __builtin_memcpy etc in files where we don't have
    instrumentation - it could devolve into a function call to memcpy,
    which will be instrumented. Rather, we should use __memcpy which
    by convention is not instrumented.

    * we also shouldn't be using __builtin_memcpy when we have a KASAN
    instrumented file, because it could be replaced with inline asm
    that will not be instrumented.

    What is correct behaviour?
    ==========================

    Firstly, there is some overlap between fortification and KASAN: both
    provide some level of _runtime_ checking. Only fortify provides
    compile-time checking.

    KASAN and fortify can pick up different things at runtime:

    - Some fortify functions, notably the string functions, could easily be
    modified to consider sub-object sizes (e.g. members within a struct),
    and I have some WIP patches to do this. KASAN cannot detect these
    because it cannot insert poison between members of a struct.

    - KASAN can detect many over-reads/over-writes when the sizes of both
    operands are unknown, which fortify cannot.

    So there are a couple of options:

    1) Flip the test: disable fortify in sanitised files and enable it in
    unsanitised files. This at least stops us missing KASAN checking, but
    we lose the fortify checking.

    2) Make the fortify code always call out to real versions. Do this only
    for KASAN, for fear of losing the inlining opportunities we get from
    __builtin_*.

    (We can't use kasan_check_{read,write}: because the fortify functions are
    _extern inline_, you can't include _static_ inline functions without a
    compiler warning. kasan_check_{read,write} are static inline so we can't
    use them even when they would otherwise be suitable.)

    Take approach 2 and call out to real versions when KASAN is enabled.

    Use __underlying_foo to distinguish from __real_foo: __real_foo always
    refers to the kernel's implementation of foo, __underlying_foo could be
    either the kernel implementation or the __builtin_foo implementation.

    This is sometimes enough to make the memcmp test succeed with
    FORTIFY_SOURCE enabled. It is at least enough to get the function call
    into the module. One more fix is needed to make it reliable: see the next
    patch.
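    A hedged sketch of approach 2 (the __underlying_ macro naming mirrors
    the patch; the wrapper itself is illustrative, not the kernel's
    string.h): when KASAN is enabled, the fortified wrapper calls a real
    out-of-line function so the instrumented implementation runs;
    otherwise it keeps the __builtin_ form and its inlining opportunities.

```c
#include <string.h>
#include <assert.h>

#ifdef CONFIG_KASAN
#define __underlying_memcpy memcpy           /* real, instrumented call */
#else
#define __underlying_memcpy __builtin_memcpy /* may inline, unchecked */
#endif

/* Illustrative fortified wrapper: check the destination size, then
 * defer to the underlying implementation. */
static inline void *fortified_memcpy(void *dst, const void *src,
                                     size_t n, size_t dst_size)
{
    assert(n <= dst_size);   /* stand-in for the fortify size check */
    return __underlying_memcpy(dst, src, n);
}
```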

    Fixes: 6974f0c4555e ("include/linux/string.h: add the option of fortified string.h functions")
    Signed-off-by: Daniel Axtens
    Signed-off-by: Andrew Morton
    Tested-by: David Gow
    Reviewed-by: Dmitry Vyukov
    Cc: Daniel Micay
    Cc: Andrey Ryabinin
    Cc: Alexander Potapenko
    Link: http://lkml.kernel.org/r/20200423154503.5103-3-dja@axtens.net
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Daniel Axtens
     
  • [ Upstream commit e91de6afa81c10e9f855c5695eb9a53168d96b73 ]

    KTLS uses a stream parser to collect TLS messages and send them to
    the upper layer tls receive handler. This ensures the tls receiver
    has a full TLS header to parse when it is run. However, when a
    socket has BPF_SK_SKB_STREAM_VERDICT program attached before KTLS
    is enabled we end up with two stream parsers running on the same
    socket.

    The result is that both try to run on the same socket. First the KTLS
    stream parser runs and calls read_sock(), which calls tcp_read_sock(),
    which in turn calls tcp_rcv_skb(). This dequeues the skb from the
    sk_receive_queue. When this is done, the KTLS code invokes the
    data_ready() callback which, because we stacked KTLS on top of the bpf
    stream verdict program, has been replaced with sk_psock_start_strp().
    This in turn kicks the stream parser again and eventually does the
    same thing KTLS did above, calling into tcp_rcv_skb() and dequeuing
    an skb from the sk_receive_queue.

    At this point the data stream is broken. Part of the stream was
    handled by the KTLS side and some other bytes may have been handled
    by the BPF side. Generally this results in either missing data or,
    more likely, a "Bad Message" complaint from the kTLS receive handler,
    as the BPF program steals some bytes meant to be in a TLS header
    and/or the TLS header length is no longer correct.

    We've already broken the idealized model where we can stack ULPs in
    any order with generic callbacks on the TX side to handle this. So in
    this patch we do the same thing, but for the RX side. We add
    a sk_psock_strp_enabled() helper so TLS can learn a BPF verdict
    program is running and add a tls_sw_has_ctx_rx() helper so BPF
    side can learn there is a TLS ULP on the socket.

    Then on BPF side we omit calling our stream parser to avoid
    breaking the data stream for the KTLS receiver. Then on the
    KTLS side we call BPF_SK_SKB_STREAM_VERDICT once the KTLS
    receiver is done with the packet but before it posts the
    msg to userspace. This gives us symmetry between the TX and RX halves
    and IMO makes it usable again. On the TX side we process packets in
    the order BPF -> TLS -> TCP, and on the receive side in the reverse
    order TCP -> TLS -> BPF.
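    The corruption and the fix can be sketched as a toy model (no real
    socket code; names are illustrative): two stream parsers dequeueing
    from one receive queue each see an interleaved, broken byte stream,
    while giving KTLS sole ownership and running the BPF verdict on the
    parsed message afterwards keeps the stream intact.

```c
#include <stddef.h>

/* Shared "receive queue": a stream of TLS record bytes. */
#define STREAM "TLS-record-bytes"

static size_t head;
static char pop(void) { return STREAM[head] ? STREAM[head++] : 0; }

/* Buggy: KTLS and the BPF parser alternate dequeues from one queue,
 * so each sees only every other byte of the stream. */
static void broken(char *ktls, char *bpf)
{
    size_t k = 0, b = 0;
    for (char c; (c = pop()); ) {
        ktls[k++] = c;
        if ((c = pop()))
            bpf[b++] = c;
    }
}

/* Fixed: KTLS alone drains the queue; the BPF verdict then runs on
 * the complete parsed message. */
static void fixed(char *msg)
{
    size_t m = 0;
    for (char c; (c = pop()); )
        msg[m++] = c;
}
```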

    Discovered while testing OpenSSL 3.0 Alpha2.0 release.

    Fixes: d829e9c4112b5 ("tls: convert to generic sk_msg interface")
    Signed-off-by: John Fastabend
    Signed-off-by: Alexei Starovoitov
    Link: https://lore.kernel.org/bpf/159079361946.5745.605854335665044485.stgit@john-Precision-5820-Tower
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Sasha Levin

    John Fastabend