Eric Lee / smarc-fsl-linux-kernel

01 Apr, 2020

37 commits

a6ce82deb tcp: repair: fix TCP_QUEUE_SEQ implementation ... Browse Code »

[ Upstream commit 6cd6cbf593bfa3ae6fc3ed34ac21da4d35045425 ]

When application uses TCP_QUEUE_SEQ socket option to
change tp->rcv_next, we must also update tp->copied_seq.

Otherwise, stuff relying on tcp_inq() being precise can
eventually be confused.

For example, tcp_zerocopy_receive() might crash because
it does not expect tcp_recv_skb() to return NULL.

We could add tests in various places to fix the issue,
or simply make sure tcp_inq() wont return a random value,
and leave fast path as it is.

Note that this fixes ioctl(fd, SIOCINQ, &val) at the same
time.

Fixes: ee9952831cfd ("tcp: Initial repair mode")
Fixes: 05255b823a61 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2020-04-01 17:01:39 +0800
27cf5410a tcp: ensure skb->dev is NULL before leaving TCP stack ... Browse Code »

[ Upstream commit b738a185beaab8728943acdb3e67371b8a88185e ]

skb->rbnode is sharing three skb fields : next, prev, dev

When a packet is sent, TCP keeps the original skb (master)
in a rtx queue, which was converted to rbtree a while back.

__tcp_transmit_skb() is responsible to clone the master skb,
and add the TCP header to the clone before sending it
to network layer.

skb_clone() already clears skb->next and skb->prev, but copies
the master oskb->dev into the clone.

We need to clear skb->dev, otherwise lower layers could interpret
the value as a pointer to a netdev.

This old bug surfaced recently when commit 28f8bfd1ac94
("netfilter: Support iif matches in POSTROUTING") was merged.

Before this netfilter commit, skb->dev value was ignored and
changed before reaching dev_queue_xmit()

Fixes: 75c119afe14f ("tcp: implement rb-tree based retransmit queue")
Fixes: 28f8bfd1ac94 ("netfilter: Support iif matches in POSTROUTING")
Signed-off-by: Eric Dumazet
Reported-by: Martin Zaharinov
Cc: Florian Westphal
Cc: Pablo Neira Ayuso
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric Dumazet
2020-04-01 17:01:39 +0800
c94b94626 tcp: also NULL skb->dev when copy was needed ... Browse Code »

[ Upstream commit 07f8e4d0fddbf2f87e4cefb551278abc38db8cdd ]

In rare cases retransmit logic will make a full skb copy, which will not
trigger the zeroing added in recent change
b738a185beaa ("tcp: ensure skb->dev is NULL before leaving TCP stack").

Cc: Eric Dumazet
Fixes: 75c119afe14f ("tcp: implement rb-tree based retransmit queue")
Fixes: 28f8bfd1ac94 ("netfilter: Support iif matches in POSTROUTING")
Signed-off-by: Florian Westphal
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Florian Westphal
2020-04-01 17:01:39 +0800
49d2333f9 slcan: not call free_netdev before rtnl_unlock in slcan_open ... Browse Code »

[ Upstream commit 2091a3d42b4f339eaeed11228e0cbe9d4f92f558 ]

As the description before netdev_run_todo, we cannot call free_netdev
before rtnl_unlock, fix it by reorder the code.

This patch is a 1:1 copy of upstream slip.c commit f596c87005f7
("slip: not call free_netdev before rtnl_unlock in slip_open").

Reported-by: yangerkun
Signed-off-by: Oliver Hartkopp
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Oliver Hartkopp
2020-04-01 17:01:38 +0800
4cc2498b7 r8169: re-enable MSI on RTL8168c ... Browse Code »

[ Upstream commit f13bc68131b0c0d67a77fb43444e109828a983bf ]

The original change fixed an issue on RTL8168b by mimicking the vendor
driver behavior to disable MSI on chip versions before RTL8168d.
This however now caused an issue on a system with RTL8168c, see [0].
Therefore leave MSI disabled on RTL8168b, but re-enable it on RTL8168c.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1792839

Fixes: 003bd5b4a7b4 ("r8169: don't use MSI before RTL8168d")
Signed-off-by: Heiner Kallweit
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Heiner Kallweit
2020-04-01 17:01:38 +0800
3428faf70 NFC: fdp: Fix a signedness bug in fdp_nci_send_patch() ... Browse Code »

[ Upstream commit 0dcdf9f64028ec3b75db6b691560f8286f3898bf ]

The nci_conn_max_data_pkt_payload_size() function sometimes returns
-EPROTO so "max_size" needs to be signed for the error handling to
work. We can make "payload_size" an int as well.

Fixes: a06347c04c13 ("NFC: Add Intel Fields Peak NFC solution driver")
Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Dan Carpenter
2020-04-01 17:01:38 +0800
3d9cc478a net: stmmac: dwmac-rk: fix error path in rk_gmac_probe ... Browse Code »

[ Upstream commit 9de9aa487daff7a5c73434c24269b44ed6a428e6 ]

Make sure we clean up devicetree related configuration
also when clock init fails.

Fixes: fecd4d7eef8b ("net: stmmac: dwmac-rk: Add integrated PHY support")
Signed-off-by: Emil Renner Berthing
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Emil Renner Berthing
2020-04-01 17:01:37 +0800
d23faf32e net_sched: keep alloc_hash updated after hash allocation ... Browse Code »

[ Upstream commit 0d1c3530e1bd38382edef72591b78e877e0edcd3 ]

In commit 599be01ee567 ("net_sched: fix an OOB access in cls_tcindex")
I moved cp->hash calculation before the first
tcindex_alloc_perfect_hash(), but cp->alloc_hash is left untouched.
This difference could lead to another out of bound access.

cp->alloc_hash should always be the size allocated, we should
update it after this tcindex_alloc_perfect_hash().

Reported-and-tested-by: syzbot+dcc34d54d68ef7d2d53d@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+c72da7b9ed57cde6fca2@syzkaller.appspotmail.com
Fixes: 599be01ee567 ("net_sched: fix an OOB access in cls_tcindex")
Cc: Jamal Hadi Salim
Cc: Jiri Pirko
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Cong Wang
2020-04-01 17:01:37 +0800
5317abb87 net_sched: hold rtnl lock in tcindex_partial_destroy_work() ... Browse Code »

[ Upstream commit b1be2e8cd290f620777bfdb8aa00890cd2fa02b5 ]

syzbot reported a use-after-free in tcindex_dump(). This is due to
the lack of RTNL in the deferred rcu work. We queue this work with
RTNL in tcindex_change(), later, tcindex_dump() is called:

fh = tp->ops->get(tp, t->tcm_handle);
...
err = tp->ops->change(..., &fh, ...);
tfilter_notify(..., fh, ...);

but there is nothing to serialize the pending
tcindex_partial_destroy_work() with tcindex_dump().

Fix this by simply holding RTNL in tcindex_partial_destroy_work(),
so that it won't be called until RTNL is released after
tc_new_tfilter() is completed.

Reported-and-tested-by: syzbot+653090db2562495901dc@syzkaller.appspotmail.com
Fixes: 3d210534cc93 ("net_sched: fix a race condition in tcindex_destroy()")
Cc: Jamal Hadi Salim
Cc: Jiri Pirko
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Cong Wang
2020-04-01 17:01:37 +0800
ff28c6195 net_sched: cls_route: remove the right filter from hashtable ... Browse Code »

[ Upstream commit ef299cc3fa1a9e1288665a9fdc8bff55629fd359 ]

route4_change() allocates a new filter and copies values from
the old one. After the new filter is inserted into the hash
table, the old filter should be removed and freed, as the final
step of the update.

However, the current code mistakenly removes the new one. This
looks apparently wrong to me, and it causes double "free" and
use-after-free too, as reported by syzbot.

Reported-and-tested-by: syzbot+f9b32aaacd60305d9687@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+2f8c233f131943d6056d@syzkaller.appspotmail.com
Reported-and-tested-by: syzbot+9c2df9fd5e9445b74e01@syzkaller.appspotmail.com
Fixes: 1109c00547fc ("net: sched: RCU cls_route")
Cc: Jamal Hadi Salim
Cc: Jiri Pirko
Cc: John Fastabend
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Cong Wang
2020-04-01 17:01:36 +0800
a631b9668 net/sched: act_ct: Fix leak of ct zone template on replace ... Browse Code »

[ Upstream commit dd2af10402684cb5840a127caec9e7cdcff6d167 ]

Currently, on replace, the previous action instance params
is swapped with a newly allocated params. The old params is
only freed (via kfree_rcu), without releasing the allocated
ct zone template related to it.

Call tcf_ct_params_free (via call_rcu) for the old params,
so it will release it.

Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
Signed-off-by: Paul Blakey
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Paul Blakey
2020-04-01 17:01:36 +0800
312805c93 net: qmi_wwan: add support for ASKEY WWHC050 ... Browse Code »

[ Upstream commit 12a5ba5a1994568d4ceaff9e78c6b0329d953386 ]

ASKEY WWHC050 is a mcie LTE modem.
The oem configuration states:

T: Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=480 MxCh= 0
D: Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=1690 ProdID=7588 Rev=ff.ff
S: Manufacturer=Android
S: Product=Android
S: SerialNumber=813f0eef6e6e
C:* #Ifs= 6 Cfg#= 1 Atr=80 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=32ms
E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 5 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=(none)
E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=125us

Tested on openwrt distribution.

Signed-off-by: Cezary Jackiewicz
Signed-off-by: Pawel Dembicki
Acked-by: Bjørn Mork
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Pawel Dembicki
2020-04-01 17:01:36 +0800
522d2dc17 net: phy: mdio-mux-bcm-iproc: check clk_prepare_enable() return value ... Browse Code »

[ Upstream commit 872307abbd0d9afd72171929806c2fa33dc34179 ]

Check clk_prepare_enable() return value.

Fixes: 2c7230446bc9 ("net: phy: Add pm support to Broadcom iProc mdio mux driver")
Signed-off-by: Rayagonda Kokatanur
Reviewed-by: Andrew Lunn
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Rayagonda Kokatanur
2020-04-01 17:01:35 +0800
f806b9e84 net: phy: mdio-bcm-unimac: Fix clock handling ... Browse Code »

[ Upstream commit c312c7818b86b663d32ec5d4b512abf06b23899a ]

The DT binding for this PHY describes an *optional* clock property.
Due to a bug in the error handling logic, we are actually ignoring this
clock *all* of the time so far.

Fix this by using devm_clk_get_optional() to handle this clock properly.

Fixes: b78ac6ecd1b6b ("net: phy: mdio-bcm-unimac: Allow configuring MDIO clock divider")
Signed-off-by: Andre Przywara
Reviewed-by: Andrew Lunn
Acked-by: Florian Fainelli
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Andre Przywara
2020-04-01 17:01:35 +0800
9fe154ee3 net: phy: dp83867: w/a for fld detect threshold bootstrapping issue ... Browse Code »

[ Upstream commit 749f6f6843115b424680f1aada3c0dd613ad807c ]

When the DP83867 PHY is strapped to enable Fast Link Drop (FLD) feature
STRAP_STS2.STRAP_ FLD (reg 0x006F bit 10), the Energy Lost Threshold for
FLD Energy Lost Mode FLD_THR_CFG.ENERGY_LOST_FLD_THR (reg 0x002e bits 2:0)
will be defaulted to 0x2. This may cause the phy link to be unstable. The
new DP83867 DM recommends to always restore ENERGY_LOST_FLD_THR to 0x1.

Hence, restore default value of FLD_THR_CFG.ENERGY_LOST_FLD_THR to 0x1 when
FLD is enabled by bootstrapping as recommended by DM.

Signed-off-by: Grygorii Strashko
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Grygorii Strashko
2020-04-01 17:01:35 +0800
86137342f net/packet: tpacket_rcv: avoid a producer race condition ... Browse Code »

[ Upstream commit 61fad6816fc10fb8793a925d5c1256d1c3db0cd2 ]

PACKET_RX_RING can cause multiple writers to access the same slot if a
fast writer wraps the ring while a slow writer is still copying. This
is particularly likely with few, large, slots (e.g., GSO packets).

Synchronize kernel thread ownership of rx ring slots with a bitmap.

Writers acquire a slot race-free by testing tp_status TP_STATUS_KERNEL
while holding the sk receive queue lock. They release this lock before
copying and set tp_status to TP_STATUS_USER to release to userspace
when done. During copying, another writer may take the lock, also see
TP_STATUS_KERNEL, and start writing to the same slot.

Introduce a new rx_owner_map bitmap with a bit per slot. To acquire a
slot, test and set with the lock held. To release race-free, update
tp_status and owner bit as a transaction, so take the lock again.

This is the one of a variety of discussed options (see Link below):

* instead of a shadow ring, embed the data in the slot itself, such as
in tp_padding. But any test for this field may match a value left by
userspace, causing deadlock.

* avoid the lock on release. This leaves a small race if releasing the
shadow slot before setting TP_STATUS_USER. The below reproducer showed
that this race is not academic. If releasing the slot after tp_status,
the race is more subtle. See the first link for details.

* add a new tp_status TP_KERNEL_OWNED to avoid the transactional store
of two fields. But, legacy applications may interpret all non-zero
tp_status as owned by the user. As libpcap does. So this is possible
only opt-in by newer processes. It can be added as an optional mode.

* embed the struct at the tail of pg_vec to avoid extra allocation.
The implementation proved no less complex than a separate field.

The additional locking cost on release adds contention, no different
than scaling on multicore or multiqueue h/w. In practice, below
reproducer nor small packet tcpdump showed a noticeable change in
perf report in cycles spent in spinlock. Where contention is
problematic, packet sockets support mitigation through PACKET_FANOUT.
And we can consider adding opt-in state TP_KERNEL_OWNED.

Easy to reproduce by running multiple netperf or similar TCP_STREAM
flows concurrently with `tcpdump -B 129 -n greater 60000`.

Based on an earlier patchset by Jon Rosen. See links below.

I believe this issue goes back to the introduction of tpacket_rcv,
which predates git history.

Link: https://www.mail-archive.com/netdev@vger.kernel.org/msg237222.html
Suggested-by: Jon Rosen
Signed-off-by: Willem de Bruijn
Signed-off-by: Jon Rosen
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Willem de Bruijn
2020-04-01 17:01:35 +0800
bb8c787be net: mvneta: Fix the case where the last poll did not process all rx ... Browse Code »

[ Upstream commit 065fd83e1be2e1ba0d446a257fd86a3cc7bddb51 ]

For the case where the last mvneta_poll did not process all
RX packets, we need to xor the pp->cause_rx_tx or port->cause_rx_tx
before claculating the rx_queue.

Fixes: 2dcf75e2793c ("net: mvneta: Associate RX queues with each CPU")
Signed-off-by: Jisheng Zhang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Jisheng Zhang
2020-04-01 17:01:34 +0800
a2a3baa29 net: ena: Add PCI shutdown handler to allow safe kexec ... Browse Code »

[ Upstream commit 428c491332bca498c8eb2127669af51506c346c7 ]

Currently ENA only provides the PCI remove() handler, used during rmmod
for example. This is not called on shutdown/kexec path; we are potentially
creating a failure scenario on kexec:

(a) Kexec is triggered, no shutdown() / remove() handler is called for ENA;
instead pci_device_shutdown() clears the master bit of the PCI device,
stopping all DMA transactions;

(b) Kexec reboot happens and the device gets enabled again, likely having
its FW with that DMA transaction buffered; then it may trigger the (now
invalid) memory operation in the new kernel, corrupting kernel memory area.

This patch aims to prevent this, by implementing a shutdown() handler
quite similar to the remove() one - the difference being the handling
of the netdev, which is unregistered on remove(), but following the
convention observed in other drivers, it's only detached on shutdown().

This prevents an odd issue in AWS Nitro instances, in which after the 2nd
kexec the next one will fail with an initrd corruption, caused by a wild
DMA write to invalid kernel memory. The lspci output for the adapter
present in my instance is:

00:05.0 Ethernet controller [0200]: Amazon.com, Inc. Elastic Network
Adapter (ENA) [1d0f:ec20]

Suggested-by: Gavin Shan
Signed-off-by: Guilherme G. Piccoli
Acked-by: Sameeh Jubran
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Guilherme G. Piccoli
2020-04-01 17:01:34 +0800
e586427a0 net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop ... Browse Code »

[ Upstream commit e80f40cbe4dd51371818e967d40da8fe305db5e4 ]

Not only did this wheel did not need reinventing, but there is also
an issue with it: It doesn't remove the VLAN header in a way that
preserves the L2 payload checksum when that is being provided by the DSA
master hw. It should recalculate checksum both for the push, before
removing the header, and for the pull afterwards. But the current
implementation is quite dizzying, with pulls followed immediately
afterwards by pushes, the memmove is done before the push, etc. This
makes a DSA master with RX checksumming offload to print stack traces
with the infamous 'hw csum failure' message.

So remove the dsa_8021q_remove_header function and replace it with
something that actually works with inet checksumming.

Fixes: d461933638ae ("net: dsa: tag_8021q: Create helper function for removing VLAN header")
Signed-off-by: Vladimir Oltean
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Vladimir Oltean
2020-04-01 17:01:34 +0800
0ec037c13 net: dsa: mt7530: Change the LINK bit to reflect the link status ... Browse Code »

[ Upstream commit 22259471b51925353bd7b16f864c79fdd76e425e ]

Andrew reported:

After a number of network port link up/down changes, sometimes the switch
port gets stuck in a state where it thinks it is still transmitting packets
but the cpu port is not actually transmitting anymore. In this state you
will see a message on the console
"mtk_soc_eth 1e100000.ethernet eth0: transmit timed out" and the Tx counter
in ifconfig will be incrementing on virtual port, but not incrementing on
cpu port.

The issue is that MAC TX/RX status has no impact on the link status or
queue manager of the switch. So the queue manager just queues up packets
of a disabled port and sends out pause frames when the queue is full.

Change the LINK bit to reflect the link status.

Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Reported-by: Andrew Smith
Signed-off-by: René van Dorst
Reviewed-by: Vivien Didelot
Reviewed-by: Florian Fainelli
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

René van Dorst
2020-04-01 17:01:33 +0800
60e975088 net: dsa: Fix duplicate frames flooded by learning ... Browse Code »

[ Upstream commit 0e62f543bed03a64495bd2651d4fe1aa4bcb7fe5 ]

When both the switch and the bridge are learning about new addresses,
switch ports attached to the bridge would see duplicate ARP frames
because both entities would attempt to send them.

Fixes: 5037d532b83d ("net: dsa: add Broadcom tag RX/TX handler")
Reported-by: Maxime Bizon
Signed-off-by: Florian Fainelli
Reviewed-by: Vivien Didelot
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Florian Fainelli
2020-04-01 17:01:33 +0800
7c6fe9b2a net: cbs: Fix software cbs to consider packet sending time ... Browse Code »

[ Upstream commit 961d0e5b32946703125964f9f5b6321d60f4d706 ]

Currently the software CBS does not consider the packet sending time
when depleting the credits. It caused the throughput to be
Idleslope[kbps] * (Port transmit rate[kbps] / |Sendslope[kbps]|) where
Idleslope * (Port transmit rate / (Idleslope + |Sendslope|)) = Idleslope
is expected. In order to fix the issue above, this patch takes the time
when the packet sending completes into account by moving the anchor time
variable "last" ahead to the send completion time upon transmission and
adding wait when the next dequeue request comes before the send
completion time of the previous packet.

changelog:
V2->V3:
- remove unnecessary whitespace cleanup
- add the checks if port_rate is 0 before division

V1->V2:
- combine variable "send_completed" into "last"
- add the comment for estimate of the packet sending

Fixes: 585d763af09c ("net/sched: Introduce Credit Based Shaper (CBS) qdisc")
Signed-off-by: Zh-yuan Ye
Reviewed-by: Vinicius Costa Gomes
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Zh-yuan Ye
2020-04-01 17:01:33 +0800
712c39d93 net/bpfilter: fix dprintf usage for /dev/kmsg ... Browse Code »

[ Upstream commit 13d0f7b814d9b4c67e60d8c2820c86ea181e7d99 ]

The bpfilter UMH code was recently changed to log its informative messages to
/dev/kmsg, however this interface doesn't support SEEK_CUR yet, used by
dprintf(). As result dprintf() returns -EINVAL and doesn't log anything.

However there already had some discussions about supporting SEEK_CUR into
/dev/kmsg interface in the past it wasn't concluded. Since the only user of
that from userspace perspective inside the kernel is the bpfilter UMH
(userspace) module it's better to correct it here instead waiting a conclusion
on the interface.

Fixes: 36c4357c63f3 ("net: bpfilter: print umh messages to /dev/kmsg")
Signed-off-by: Bruno Meneguele
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Bruno Meneguele
2020-04-01 17:01:33 +0800
856750641 mlxsw: spectrum_mr: Fix list iteration in error path ... Browse Code »

[ Upstream commit f6bf1bafdc2152bb22aff3a4e947f2441a1d49e2 ]

list_for_each_entry_from_reverse() iterates backwards over the list from
the current position, but in the error path we should start from the
previous position.

Fix this by using list_for_each_entry_continue_reverse() instead.

This suppresses the following error from coccinelle:

drivers/net/ethernet/mellanox/mlxsw//spectrum_mr.c:655:34-38: ERROR:
invalid reference to the index variable of the iterator on line 636

Fixes: c011ec1bbfd6 ("mlxsw: spectrum: Add the multicast routing offloading logic")
Signed-off-by: Ido Schimmel
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Ido Schimmel
2020-04-01 17:01:32 +0800
5a1a00f6a mlxsw: pci: Only issue reset when system is ready ... Browse Code »

[ Upstream commit 6002059d7882c3512e6ac52fa82424272ddfcd5c ]

During initialization the driver issues a software reset command and
then waits for the system status to change back to "ready" state.

However, before issuing the reset command the driver does not check that
the system is actually in "ready" state. On Spectrum-{1,2} systems this
was always the case as the hardware initialization time is very short.
On Spectrum-3 systems this is no longer the case. This results in the
software reset command timing-out and the driver failing to load:

[ 6.347591] mlxsw_spectrum3 0000:06:00.0: Cmd exec timed-out (opcode=40(ACCESS_REG),opcode_mod=0,in_mod=0)
[ 6.358382] mlxsw_spectrum3 0000:06:00.0: Reg cmd access failed (reg_id=9023(mrsr),type=write)
[ 6.368028] mlxsw_spectrum3 0000:06:00.0: cannot register bus device
[ 6.375274] mlxsw_spectrum3: probe of 0000:06:00.0 failed with error -110

Fix this by waiting for the system to become ready both before issuing
the reset command and afterwards. In case of failure, print the last
system status to aid in debugging.

Fixes: da382875c616 ("mlxsw: spectrum: Extend to support Spectrum-3 ASIC")
Signed-off-by: Ido Schimmel
Reviewed-by: Jiri Pirko
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Ido Schimmel
2020-04-01 17:01:32 +0800
6e75284e2 macsec: restrict to ethernet devices ... Browse Code »

[ Upstream commit b06d072ccc4b1acd0147b17914b7ad1caa1818bb ]

Only attach macsec to ethernet devices.

Syzbot was able to trigger a KMSAN warning in macsec_handle_frame
by attaching to a phonet device.

Macvlan has a similar check in macvlan_port_create.

v1->v2
- fix commit message typo

Reported-by: syzbot
Signed-off-by: Willem de Bruijn
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Willem de Bruijn
2020-04-01 17:01:32 +0800
51db2db8f ipv4: fix a RCU-list lock in inet_dump_fib() ... Browse Code »

[ Upstream commit dddeb30bfc43926620f954266fd12c65a7206f07 ]

There is a place,

inet_dump_fib()
fib_table_dump
fn_trie_dump_leaf()
hlist_for_each_entry_rcu()

without rcu_read_lock() will trigger a warning,

WARNING: suspicious RCU usage
-----------------------------
net/ipv4/fib_trie.c:2216 RCU-list traversed in non-reader section!!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ip/1923:
#0: ffffffff8ce76e40 (rtnl_mutex){+.+.}, at: netlink_dump+0xd6/0x840

Call Trace:
dump_stack+0xa1/0xea
lockdep_rcu_suspicious+0x103/0x10d
fn_trie_dump_leaf+0x581/0x590
fib_table_dump+0x15f/0x220
inet_dump_fib+0x4ad/0x5d0
netlink_dump+0x350/0x840
__netlink_dump_start+0x315/0x3e0
rtnetlink_rcv_msg+0x4d1/0x720
netlink_rcv_skb+0xf0/0x220
rtnetlink_rcv+0x15/0x20
netlink_unicast+0x306/0x460
netlink_sendmsg+0x44b/0x770
__sys_sendto+0x259/0x270
__x64_sys_sendto+0x80/0xa0
do_syscall_64+0x69/0xf4
entry_SYSCALL_64_after_hwframe+0x49/0xb3

Fixes: 18a8021a7be3 ("net/ipv4: Plumb support for filtering route dumps")
Signed-off-by: Qian Cai
Reviewed-by: David Ahern
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Qian Cai
2020-04-01 17:01:31 +0800
b67aa57f4 hsr: fix general protection fault in hsr_addr_is_self() ... Browse Code »

[ Upstream commit 3a303cfdd28d5f930a307c82e8a9d996394d5ebd ]

The port->hsr is used in the hsr_handle_frame(), which is a
callback of rx_handler.
hsr master and slaves are initialized in hsr_add_port().
This function initializes several pointers, which includes port->hsr after
registering rx_handler.
So, in the rx_handler routine, un-initialized pointer would be used.
In order to fix this, pointers should be initialized before
registering rx_handler.

Test commands:
ip netns del left
ip netns del right
modprobe -rv veth
modprobe -rv hsr
killall ping
modprobe hsr
ip netns add left
ip netns add right
ip link add veth0 type veth peer name veth1
ip link add veth2 type veth peer name veth3
ip link add veth4 type veth peer name veth5
ip link set veth1 netns left
ip link set veth3 netns right
ip link set veth4 netns left
ip link set veth5 netns right
ip link set veth0 up
ip link set veth2 up
ip link set veth0 address fc:00:00:00:00:01
ip link set veth2 address fc:00:00:00:00:02
ip netns exec left ip link set veth1 up
ip netns exec left ip link set veth4 up
ip netns exec right ip link set veth3 up
ip netns exec right ip link set veth5 up
ip link add hsr0 type hsr slave1 veth0 slave2 veth2
ip a a 192.168.100.1/24 dev hsr0
ip link set hsr0 up
ip netns exec left ip link add hsr1 type hsr slave1 veth1 slave2 veth4
ip netns exec left ip a a 192.168.100.2/24 dev hsr1
ip netns exec left ip link set hsr1 up
ip netns exec left ip n a 192.168.100.1 dev hsr1 lladdr \
fc:00:00:00:00:01 nud permanent
ip netns exec left ip n r 192.168.100.1 dev hsr1 lladdr \
fc:00:00:00:00:01 nud permanent
for i in {1..100}
do
ip netns exec left ping 192.168.100.1 &
done
ip netns exec left hping3 192.168.100.1 -2 --flood &
ip netns exec right ip link add hsr2 type hsr slave1 veth3 slave2 veth5
ip netns exec right ip a a 192.168.100.3/24 dev hsr2
ip netns exec right ip link set hsr2 up
ip netns exec right ip n a 192.168.100.1 dev hsr2 lladdr \
fc:00:00:00:00:02 nud permanent
ip netns exec right ip n r 192.168.100.1 dev hsr2 lladdr \
fc:00:00:00:00:02 nud permanent
for i in {1..100}
do
ip netns exec right ping 192.168.100.1 &
done
ip netns exec right hping3 192.168.100.1 -2 --flood &
while :
do
ip link add hsr0 type hsr slave1 veth0 slave2 veth2
ip a a 192.168.100.1/24 dev hsr0
ip link set hsr0 up
ip link del hsr0
done

Splat looks like:
[ 120.954938][ C0] general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1]I
[ 120.957761][ C0] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
[ 120.959064][ C0] CPU: 0 PID: 1511 Comm: hping3 Not tainted 5.6.0-rc5+ #460
[ 120.960054][ C0] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 120.962261][ C0] RIP: 0010:hsr_addr_is_self+0x65/0x2a0 [hsr]
[ 120.963149][ C0] Code: 44 24 18 70 73 2f c0 48 c1 eb 03 48 8d 04 13 c7 00 f1 f1 f1 f1 c7 40 04 00 f2 f2 f2 4
[ 120.966277][ C0] RSP: 0018:ffff8880d9c09af0 EFLAGS: 00010206
[ 120.967293][ C0] RAX: 0000000000000006 RBX: 1ffff1101b38135f RCX: 0000000000000000
[ 120.968516][ C0] RDX: dffffc0000000000 RSI: ffff8880d17cb208 RDI: 0000000000000000
[ 120.969718][ C0] RBP: 0000000000000030 R08: ffffed101b3c0e3c R09: 0000000000000001
[ 120.972203][ C0] R10: 0000000000000001 R11: ffffed101b3c0e3b R12: 0000000000000000
[ 120.973379][ C0] R13: ffff8880aaf80100 R14: ffff8880aaf800f2 R15: ffff8880aaf80040
[ 120.974410][ C0] FS: 00007f58e693f740(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
[ 120.979794][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 120.980773][ C0] CR2: 00007ffcb8b38f29 CR3: 00000000afe8e001 CR4: 00000000000606f0
[ 120.981945][ C0] Call Trace:
[ 120.982411][ C0]
[ 120.982848][ C0] ? hsr_add_node+0x8c0/0x8c0 [hsr]
[ 120.983522][ C0] ? rcu_read_lock_held+0x90/0xa0
[ 120.984159][ C0] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 120.984944][ C0] hsr_handle_frame+0x1db/0x4e0 [hsr]
[ 120.985597][ C0] ? hsr_nl_nodedown+0x2b0/0x2b0 [hsr]
[ 120.986289][ C0] __netif_receive_skb_core+0x6bf/0x3170
[ 120.992513][ C0] ? check_chain_key+0x236/0x5d0
[ 120.993223][ C0] ? do_xdp_generic+0x1460/0x1460
[ 120.993875][ C0] ? register_lock_class+0x14d0/0x14d0
[ 120.994609][ C0] ? __netif_receive_skb_one_core+0x8d/0x160
[ 120.995377][ C0] __netif_receive_skb_one_core+0x8d/0x160
[ 120.996204][ C0] ? __netif_receive_skb_core+0x3170/0x3170
[ ... ]

Reported-by: syzbot+fcf5dd39282ceb27108d@syzkaller.appspotmail.com
Fixes: c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
Signed-off-by: Taehee Yoo
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Taehee Yoo
2020-04-01 17:01:31 +0800
6fe31c7ce geneve: move debug check after netdev unregister ... Browse Code »

[ Upstream commit 0fda7600c2e174fe27e9cf02e78e345226e441fa ]

The debug check must be done after unregister_netdevice_many() call --
the list_del() for this is done inside .ndo_stop.

Fixes: 2843a25348f8 ("geneve: speedup geneve tunnels dismantle")
Reported-and-tested-by:
Cc: Haishuang Yan
Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Florian Westphal
2020-04-01 17:01:31 +0800
b5c9652ad cxgb4: fix Txq restart check during backpressure ... Browse Code »

[ Upstream commit f1f20a8666c55cb534b8f3fc1130eebf01a06155 ]

Driver reclaims descriptors in much smaller batches, even if hardware
indicates more to reclaim, during backpressure. So, fix the check to
restart the Txq during backpressure, by looking at how many
descriptors hardware had indicated to reclaim, and not on how many
descriptors that driver had actually reclaimed. Once the Txq is
restarted, driver will reclaim even more descriptors when Tx path
is entered again.

Fixes: d429005fdf2c ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer")
Signed-off-by: Rahul Lakkireddy
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Rahul Lakkireddy
2020-04-01 17:01:30 +0800
e92a0e7fb cxgb4: fix throughput drop during Tx backpressure ... Browse Code »

[ Upstream commit 7affd80802afb6ca92dba47d768632fbde365241 ]

commit 7c3bebc3d868 ("cxgb4: request the TX CIDX updates to status page")
reverted back to getting Tx CIDX updates via DMA, instead of interrupts,
introduced by commit d429005fdf2c ("cxgb4/cxgb4vf: Add support for SGE
doorbell queue timer")

However, it missed reverting back several code changes where Tx CIDX
updates are not explicitly requested during backpressure when using
interrupt mode. These missed changes cause slow recovery during
backpressure because the corresponding interrupt no longer comes and
hence results in Tx throughput drop.

So, revert back these missed code changes, as well, which will allow
explicitly requesting Tx CIDX updates when backpressure happens.
This enables the corresponding interrupt with Tx CIDX update message
to get generated and hence speed up recovery and restore back
throughput.

Fixes: 7c3bebc3d868 ("cxgb4: request the TX CIDX updates to status page")
Fixes: d429005fdf2c ("cxgb4/cxgb4vf: Add support for SGE doorbell queue timer")
Signed-off-by: Rahul Lakkireddy
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Rahul Lakkireddy
2020-04-01 17:01:30 +0800
b0ab87002 ACPI: PM: s2idle: Rework ACPI events synchronization ... Browse Code »

commit 024aa8732acb7d2503eae43c3fe3504d0a8646d0 upstream.

Note that the EC GPE processing need not be synchronized in
acpi_s2idle_wake() after invoking acpi_ec_dispatch_gpe(), because
that function checks the GPE status and dispatches its handler if
need be and the SCI action handler is not going to run anyway at
that point.

Moreover, it is better to drain all of the pending ACPI events
before restoring the working-state configuration of GPEs in
acpi_s2idle_restore(), because those events are likely to be related
to system wakeup, in which case they will not be relevant going
forward.

Rework the code to take these observations into account.

Tested-by: Kenneth R. Crudup
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman

Rafael J. Wysocki
2020-04-01 17:01:29 +0800
127882d10 mmc: sdhci-tegra: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY ... Browse Code »

[ Upstream commit d2f8bfa4bff5028bc40ed56b4497c32e05b0178f ]

It has turned out that the sdhci-tegra controller requires the R1B response,
for commands that has this response associated with them. So, converting
from an R1B to an R1 response for a CMD6 for example, leads to problems
with the HW busy detection support.

Fix this by informing the mmc core about the requirement, via setting the
host cap, MMC_CAP_NEED_RSP_BUSY.

Reported-by: Bitan Biswas
Reported-by: Peter Geis
Suggested-by: Sowjanya Komatineni
Cc:
Tested-by: Sowjanya Komatineni
Tested-By: Peter Geis
Signed-off-by: Ulf Hansson
Signed-off-by: Sasha Levin

Ulf Hansson
2020-04-01 17:01:29 +0800
71d89344a mmc: sdhci-omap: Fix busy detection by enabling MMC_CAP_NEED_RSP_BUSY ... Browse Code »

[ Upstream commit 055e04830d4544c57f2a5192a26c9e25915c29c0 ]

It has turned out that the sdhci-omap controller requires the R1B response,
for commands that has this response associated with them. So, converting
from an R1B to an R1 response for a CMD6 for example, leads to problems
with the HW busy detection support.

Fix this by informing the mmc core about the requirement, via setting the
host cap, MMC_CAP_NEED_RSP_BUSY.

Reported-by: Naresh Kamboju
Reported-by: Anders Roxell
Reported-by: Faiz Abbas
Cc:
Tested-by: Anders Roxell
Tested-by: Faiz Abbas
Signed-off-by: Ulf Hansson
Signed-off-by: Sasha Levin

Ulf Hansson
2020-04-01 17:01:28 +0800
bf8b920f4 mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for eMMC sleep command ... Browse Code »

[ Upstream commit 18d200460cd73636d4f20674085c39e32b4e0097 ]

The busy timeout for the CMD5 to put the eMMC into sleep state, is specific
to the card. Potentially the timeout may exceed the host->max_busy_timeout.
If that becomes the case, mmc_sleep() converts from using an R1B response
to an R1 response, as to prevent the host from doing HW busy detection.

However, it has turned out that some hosts requires an R1B response no
matter what, so let's respect that via checking MMC_CAP_NEED_RSP_BUSY. Note
that, if the R1B gets enforced, the host becomes fully responsible of
managing the needed busy timeout, in one way or the other.

Suggested-by: Sowjanya Komatineni
Cc:
Link: https://lore.kernel.org/r/20200311092036.16084-1-ulf.hansson@linaro.org
Signed-off-by: Ulf Hansson
Signed-off-by: Sasha Levin

Ulf Hansson
2020-04-01 17:01:28 +0800
3b9b71adb mmc: core: Respect MMC_CAP_NEED_RSP_BUSY for erase/trim/discard ... Browse Code »

[ Upstream commit 43cc64e5221cc6741252b64bc4531dd1eefb733d ]

The busy timeout that is computed for each erase/trim/discard operation,
can become quite long and may thus exceed the host->max_busy_timeout. If
that becomes the case, mmc_do_erase() converts from using an R1B response
to an R1 response, as to prevent the host from doing HW busy detection.

However, it has turned out that some hosts requires an R1B response no
matter what, so let's respect that via checking MMC_CAP_NEED_RSP_BUSY. Note
that, if the R1B gets enforced, the host becomes fully responsible of
managing the needed busy timeout, in one way or the other.

Suggested-by: Sowjanya Komatineni
Cc:
Tested-by: Anders Roxell
Tested-by: Sowjanya Komatineni
Tested-by: Faiz Abbas
Tested-By: Peter Geis
Signed-off-by: Ulf Hansson
Signed-off-by: Sasha Levin

Ulf Hansson
2020-04-01 17:01:27 +0800
d9c4f387e mmc: core: Allow host controllers to require R1B for CMD6 ... Browse Code »

[ Upstream commit 1292e3efb149ee21d8d33d725eeed4e6b1ade963 ]

It has turned out that some host controllers can't use R1B for CMD6 and
other commands that have R1B associated with them. Therefore invent a new
host cap, MMC_CAP_NEED_RSP_BUSY to let them specify this.

In __mmc_switch(), let's check the flag and use it to prevent R1B responses
from being converted into R1. Note that, this also means that the host are
on its own, when it comes to manage the busy timeout.

Suggested-by: Sowjanya Komatineni
Cc:
Tested-by: Anders Roxell
Tested-by: Sowjanya Komatineni
Tested-by: Faiz Abbas
Tested-By: Peter Geis
Signed-off-by: Ulf Hansson
Signed-off-by: Sasha Levin

Ulf Hansson
2020-04-01 17:01:27 +0800

25 Mar, 2020

3 commits

462afcd6e Linux 5.4.28 Browse Code »

Greg Kroah-Hartman
2020-03-25 15:26:00 +0800
7b2cdbd67 staging: greybus: loopback_test: fix potential path truncations ... Browse Code »

commit ae62cf5eb2792d9a818c2d93728ed92119357017 upstream.

Newer GCC warns about possible truncations of two generated path names as
we're concatenating the configurable sysfs and debugfs path prefixes
with a filename and placing the results in buffers of the same size as
the maximum length of the prefixes.

snprintf(d->name, MAX_STR_LEN, "gb_loopback%u", dev_id);

snprintf(d->sysfs_entry, MAX_SYSFS_PATH, "%s%s/",
t->sysfs_prefix, d->name);

snprintf(d->debugfs_entry, MAX_SYSFS_PATH, "%sraw_latency_%s",
t->debugfs_prefix, d->name);

Fix this by separating the maximum path length from the maximum prefix
length and reducing the latter enough to fit the generated strings.

Note that we also need to reduce the device-name buffer size as GCC
isn't smart enough to figure out that we ever only used MAX_STR_LEN
bytes of it.

Fixes: 6b0658f68786 ("greybus: tools: Add tools directory to greybus repo and add loopback")
Signed-off-by: Johan Hovold
Link: https://lore.kernel.org/r/20200312110151.22028-4-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman

Johan Hovold
2020-03-25 15:25:59 +0800
8e79f440e staging: greybus: loopback_test: fix potential path truncation ... Browse Code »

commit f16023834863932f95dfad13fac3fc47f77d2f29 upstream.

Newer GCC warns about a possible truncation of a generated sysfs path
name as we're concatenating a directory path with a file name and
placing the result in a buffer that is half the size of the maximum
length of the directory path (which is user controlled).

loopback_test.c: In function 'open_poll_files':
loopback_test.c:651:31: warning: '%s' directive output may be truncated writing up to 511 bytes into a region of size 255 [-Wformat-truncation=]
651 | snprintf(buf, sizeof(buf), "%s%s", dev->sysfs_entry, "iteration_count");
| ^~
loopback_test.c:651:3: note: 'snprintf' output between 16 and 527 bytes into a destination of size 255
651 | snprintf(buf, sizeof(buf), "%s%s", dev->sysfs_entry, "iteration_count");
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix this by making sure the buffer is large enough the concatenated
strings.

Fixes: 6b0658f68786 ("greybus: tools: Add tools directory to greybus repo and add loopback")
Fixes: 9250c0ee2626 ("greybus: Loopback_test: use poll instead of inotify")
Signed-off-by: Johan Hovold
Link: https://lore.kernel.org/r/20200312110151.22028-3-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman

Johan Hovold
2020-03-25 15:25:59 +0800