Eric Lee / smarc-fsl-linux-kernel

01 Sep, 2019

5 commits

1c6c09a0a net/sched: cbs: Set default link speed to 10 Mbps in cbs_set_port_rate ... Browse Code »

The discussion to be made is absolutely the same as in the case of
previous patch ("taprio: Set default link speed to 10 Mbps in
taprio_set_picos_per_byte"). Nothing is lost when setting a default.

Cc: Leandro Dorileo
Fixes: e0a7683d30e9 ("net/sched: cbs: fix port_rate miscalculation")
Acked-by: Vinicius Costa Gomes
Signed-off-by: Vladimir Oltean
Signed-off-by: David S. Miller

Vladimir Oltean
2019-09-01 09:45:35 +0800
f04b514c0 taprio: Set default link speed to 10 Mbps in taprio_set_picos_per_byte ... Browse Code »

The taprio budget needs to be adapted at runtime according to interface
link speed. But that handling is problematic.

For one thing, installing a qdisc on an interface that doesn't have
carrier is not illegal. But taprio prints the following stack trace:

[ 31.851373] ------------[ cut here ]------------
[ 31.856024] WARNING: CPU: 1 PID: 207 at net/sched/sch_taprio.c:481 taprio_dequeue+0x1a8/0x2d4
[ 31.864566] taprio: dequeue() called with unknown picos per byte.
[ 31.864570] Modules linked in:
[ 31.873701] CPU: 1 PID: 207 Comm: tc Not tainted 5.3.0-rc5-01199-g8838fe023cd6 #1689
[ 31.881398] Hardware name: Freescale LS1021A
[ 31.885661] [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
[ 31.893368] [] (show_stack) from [] (dump_stack+0xb4/0xc8)
[ 31.900555] [] (dump_stack) from [] (__warn+0xe0/0xf8)
[ 31.907395] [] (__warn) from [] (warn_slowpath_fmt+0x48/0x6c)
[ 31.914841] [] (warn_slowpath_fmt) from [] (taprio_dequeue+0x1a8/0x2d4)
[ 31.923150] [] (taprio_dequeue) from [] (__qdisc_run+0x90/0x61c)
[ 31.930856] [] (__qdisc_run) from [] (net_tx_action+0x12c/0x2bc)
[ 31.938560] [] (net_tx_action) from [] (__do_softirq+0x130/0x3c8)
[ 31.946350] [] (__do_softirq) from [] (irq_exit+0xbc/0xd8)
[ 31.953536] [] (irq_exit) from [] (__handle_domain_irq+0x60/0xb4)
[ 31.961328] [] (__handle_domain_irq) from [] (gic_handle_irq+0x58/0x9c)
[ 31.969638] [] (gic_handle_irq) from [] (__irq_svc+0x6c/0x90)
[ 31.977076] Exception stack(0xe8167b20 to 0xe8167b68)
[ 31.982100] 7b20: e9d4bd80 00000cc0 000000cf 00000000 e9d4bd80 c1f38958 00000cc0 c1f38960
[ 31.990234] 7b40: 00000001 000000cf 00000004 e9dc0800 00000000 e8167b70 c0f478ec c0f46d94
[ 31.998363] 7b60: 60070013 ffffffff
[ 32.001833] [] (__irq_svc) from [] (netlink_trim+0x18/0xd8)
[ 32.009104] [] (netlink_trim) from [] (netlink_broadcast_filtered+0x34/0x414)
[ 32.017930] [] (netlink_broadcast_filtered) from [] (netlink_broadcast+0x20/0x28)
[ 32.027102] [] (netlink_broadcast) from [] (rtnetlink_send+0x34/0x88)
[ 32.035238] [] (rtnetlink_send) from [] (notify_and_destroy+0x2c/0x44)
[ 32.043461] [] (notify_and_destroy) from [] (qdisc_graft+0x398/0x470)
[ 32.051595] [] (qdisc_graft) from [] (tc_modify_qdisc+0x3a4/0x724)
[ 32.059470] [] (tc_modify_qdisc) from [] (rtnetlink_rcv_msg+0x260/0x2ec)
[ 32.067864] [] (rtnetlink_rcv_msg) from [] (netlink_rcv_skb+0xb8/0x110)
[ 32.076172] [] (netlink_rcv_skb) from [] (netlink_unicast+0x1b4/0x22c)
[ 32.084392] [] (netlink_unicast) from [] (netlink_sendmsg+0x33c/0x380)
[ 32.092614] [] (netlink_sendmsg) from [] (sock_sendmsg+0x14/0x24)
[ 32.100403] [] (sock_sendmsg) from [] (___sys_sendmsg+0x214/0x228)
[ 32.108279] [] (___sys_sendmsg) from [] (__sys_sendmsg+0x50/0x8c)
[ 32.116068] [] (__sys_sendmsg) from [] (ret_fast_syscall+0x0/0x54)
[ 32.123938] Exception stack(0xe8167fa8 to 0xe8167ff0)
[ 32.128960] 7fa0: b6fa68c8 000000f8 00000003 bea142d0 00000000 00000000
[ 32.137093] 7fc0: b6fa68c8 000000f8 0052154c 00000128 5d6468a2 00000000 00000028 00558c9c
[ 32.145224] 7fe0: 00000070 bea14278 00530d64 b6e17e64
[ 32.150659] ---[ end trace 2139c9827c3e5177 ]---

This happens because the qdisc ->dequeue callback gets called. Which
again is not illegal, the qdisc will dequeue even when the interface is
up but doesn't have carrier (and hence SPEED_UNKNOWN), and the frames
will be dropped further down the stack in dev_direct_xmit().

And, at the end of the day, for what? For calculating the initial budget
of an interface which is non-operational at the moment and where frames
will get dropped anyway.

So if we can't figure out the link speed, default to SPEED_10 and move
along. We can also remove the runtime check now.

Cc: Leandro Dorileo
Fixes: 7b9eba7ba0c1 ("net/sched: taprio: fix picos_per_byte miscalculation")
Acked-by: Vinicius Costa Gomes
Signed-off-by: Vladimir Oltean
Signed-off-by: David S. Miller

Vladimir Oltean
2019-09-01 09:45:34 +0800
efb55222d taprio: Fix kernel panic in taprio_destroy ... Browse Code »

taprio_init may fail earlier than this line:

list_add(&q->taprio_list, &taprio_list);

i.e. due to the net device not being multi queue.

Attempting to remove q from the global taprio_list when it is not part
of it will result in a kernel panic.

Fix it by matching list_add and list_del better to one another in the
order of operations. This way we can keep the deletion unconditional
and with lower complexity - O(1).

Cc: Leandro Dorileo
Fixes: 7b9eba7ba0c1 ("net/sched: taprio: fix picos_per_byte miscalculation")
Signed-off-by: Vladimir Oltean
Acked-by: Vinicius Costa Gomes
Signed-off-by: David S. Miller

Vladimir Oltean
2019-09-01 09:45:34 +0800
5f81d5455 net: dsa: microchip: fill regmap_config name ... Browse Code »

Use the register value width as the regmap_config name to prevent the
following error when the second and third regmap_configs are
initialized.
"debugfs: Directory '${bus-id}' with parent 'regmap' already present!"

Signed-off-by: George McCollister
Reviewed-by: Marek Vasut
Signed-off-by: David S. Miller

George McCollister
2019-09-01 04:19:07 +0800
5b161002b Merge tag 'batadv-net-for-davem-20190830' of git://git.open-mesh.org/linux-merge ... Browse Code »

Simon Wunderlich says:

====================
Here are two batman-adv bugfixes:

- Fix OGM and OGMv2 header read boundary check,
by Sven Eckelmann (2 patches)
====================

Signed-off-by: David S. Miller

David S. Miller
2019-09-01 04:16:07 +0800

31 Aug, 2019

6 commits

c3d7a089f Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf ... Browse Code »

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Spurious warning when loading rules using the physdev match,
from Todd Seidelmann.

2) Fix FTP conntrack helper debugging output, from Thomas Jarosch.

3) Restore per-netns nf_conntrack_{acct,helper,timeout} sysctl knobs,
from Florian Westphal.

4) Clear skbuff timestamp from the flowtable datapath, also from Florian.

5) Fix incorrect byteorder of NFT_META_BRI_IIFVPROTO, from wenxu.
====================

Signed-off-by: David S. Miller

David S. Miller
2019-08-31 08:50:10 +0800
94880a5b2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf ... Browse Code »

Daniel Borkmann says:

====================
pull-request: bpf 2019-08-31

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix 32-bit zero-extension during constant blinding which
has been causing a regression on ppc64, from Naveen.

2) Fix a latency bug in nfp driver when updating stack index
register, from Jiong.
====================

Signed-off-by: David S. Miller

David S. Miller
2019-08-31 08:39:37 +0800
d12040b69 rxrpc: Fix lack of conn cleanup when local endpoint is cleaned up [ver #2] ... Browse Code »

When a local endpoint is ceases to be in use, such as when the kafs module
is unloaded, the kernel will emit an assertion failure if there are any
outstanding client connections:

rxrpc: Assertion failed
------------[ cut here ]------------
kernel BUG at net/rxrpc/local_object.c:433!

and even beyond that, will evince other oopses if there are service
connections still present.

Fix this by:

(1) Removing the triggering of connection reaping when an rxrpc socket is
released. These don't actually clean up the connections anyway - and
further, the local endpoint may still be in use through another
socket.

(2) Mark the local endpoint as dead when we start the process of tearing
it down.

(3) When destroying a local endpoint, strip all of its client connections
from the idle list and discard the ref on each that the list was
holding.

(4) When destroying a local endpoint, call the service connection reaper
directly (rather than through a workqueue) to immediately kill off all
outstanding service connections.

(5) Make the service connection reaper reap connections for which the
local endpoint is marked dead.

Only after destroying the connections can we close the socket lest we get
an oops in a workqueue that's looking at a connection or a peer.

Fixes: 3d18cbb7fd0c ("rxrpc: Fix conn expiry timers")
Signed-off-by: David Howells
Tested-by: Marc Dionne
Signed-off-by: David S. Miller

David Howells
2019-08-31 06:06:52 +0800
a285c1fa3 Merge tag 'rxrpc-fixes-20190827' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs ... Browse Code »

David Howells says:

====================
rxrpc: Fix use of skb_cow_data()

Here's a series of patches that replaces the use of skb_cow_data() in rxrpc
with skb_unshare() early on in the input process. The problem that is
being seen is that skb_cow_data() indirectly requires that the maximum
usage count on an sk_buff be 1, and it may generate an assertion failure in
pskb_expand_head() if not.

This can occur because rxrpc_input_data() may be still holding a ref when
it has just attached the sk_buff to the rx ring and given that attachment
its own ref. If recvmsg happens fast enough, skb_cow_data() can see the
ref still held by the softirq handler.

Further, a packet may contain multiple subpackets, each of which gets its
own attachment to the ring and its own ref - also making skb_cow_data() go
bang.

Fix this by:

(1) The DATA packet is currently parsed for subpackets twice by the input
routines. Parse it just once instead and make notes in the sk_buff
private data.

(2) Use the notes from (1) when attaching the packet to the ring multiple
times. Once the packet is attached to the ring, recvmsg can see it
and start modifying it, so the softirq handler is not permitted to
look inside it from that point.

(3) Pass the ref from the input code to the ring rather than getting an
extra ref. rxrpc_input_data() uses a ref on the second refcount to
prevent the packet from evaporating under it.

(4) Call skb_unshare() on secured DATA packets in rxrpc_input_packet()
before we take call->input_lock. Other sorts of packets don't get
modified and so can be left.

A trace is emitted if skb_unshare() eats the skb. Note that
skb_share() for our accounting in this regard as we can't see the
parameters in the packet to log in a trace line if it releases it.

(5) Remove the calls to skb_cow_data(). These are then no longer
necessary.

There are also patches to improve the rxrpc_skb tracepoint to make sure
that Tx-derived buffers are identified separately from Rx-derived buffers
in the trace.
====================

Signed-off-by: David S. Miller

David S. Miller
2019-08-31 05:54:41 +0800
3b25528e1 net: stmmac: dwmac-rk: Don't fail if phy regulator is absent ... Browse Code »

The devicetree binding lists the phy phy as optional. As such, the
driver should not bail out if it can't find a regulator. Instead it
should just skip the remaining regulator related code and continue
on normally.

Skip the remainder of phy_power_on() if a regulator supply isn't
available. This also gets rid of the bogus return code.

Fixes: 2e12f536635f ("net: stmmac: dwmac-rk: Use standard devicetree property for phy regulator")
Signed-off-by: Chen-Yu Tsai
Signed-off-by: David S. Miller

Chen-Yu Tsai
2019-08-31 05:16:26 +0800
b6b4dc4c1 amd-xgbe: Fix error path in xgbe_mod_init() ... Browse Code »

In xgbe_mod_init(), we should do cleanup if some error occurs

Reported-by: Hulk Robot
Fixes: efbaa828330a ("amd-xgbe: Add support to handle device renaming")
Fixes: 47f164deab22 ("amd-xgbe: Add PCI device support")
Signed-off-by: YueHaibing
Signed-off-by: David S. Miller

YueHaibing
2019-08-31 05:15:31 +0800

30 Aug, 2019

2 commits

daf1de907 netfilter: nft_meta_bridge: Fix get NFT_META_BRI_IIFVPROTO in network byteorder ... Browse Code »

Get the vlan_proto of ingress bridge in network byteorder as userspace
expects. Otherwise this is inconsistent with NFT_META_PROTOCOL.

Fixes: 2a3a93ef0ba5 ("netfilter: nft_meta_bridge: Add NFT_META_BRI_IIFVPROTO support")
Signed-off-by: wenxu
Signed-off-by: Pablo Neira Ayuso

wenxu
2019-08-30 08:49:04 +0800
869326532 Merge tag 'mac80211-for-davem-2019-08-29' of git://git.kernel.org/pub/scm/linux/… ... Browse Code »

…kernel/git/jberg/mac80211

Johannes Berg says:

====================
We have
* one fix for a driver as I'm covering for Kalle while he's on vacation
* two fixes for eapol-over-nl80211 work
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

David S. Miller
2019-08-30 07:44:15 +0800

29 Aug, 2019

22 commits

f8b43c5cf mac80211: Correctly set noencrypt for PAE frames ... Browse Code »

The noencrypt flag was intended to be set if the "frame was received
unencrypted" according to include/uapi/linux/nl80211.h. However, the
current behavior is opposite of this.

Cc: stable@vger.kernel.org
Fixes: 018f6fbf540d ("mac80211: Send control port frames over nl80211")
Signed-off-by: Denis Kenzior
Link: https://lore.kernel.org/r/20190827224120.14545-3-denkenz@gmail.com
Signed-off-by: Johannes Berg

Denis Kenzior
2019-08-29 22:40:00 +0800
c8a41c6af mac80211: Don't memset RXCB prior to PAE intercept ... Browse Code »

In ieee80211_deliver_skb_to_local_stack intercepts EAPoL frames if
mac80211 is configured to do so and forwards the contents over nl80211.
During this process some additional data is also forwarded, including
whether the frame was received encrypted or not. Unfortunately just
prior to the call to ieee80211_deliver_skb_to_local_stack, skb->cb is
cleared, resulting in incorrect data being exposed over nl80211.

Fixes: 018f6fbf540d ("mac80211: Send control port frames over nl80211")
Cc: stable@vger.kernel.org
Signed-off-by: Denis Kenzior
Link: https://lore.kernel.org/r/20190827224120.14545-2-denkenz@gmail.com
Signed-off-by: Johannes Berg

Denis Kenzior
2019-08-29 22:38:36 +0800
b9500577d iwlwifi: pcie: handle switching killer Qu B0 NICs to C0 ... Browse Code »

We need to use a different firmware for C0 versions of killer Qu NICs.
Add structures for them and handle them in the if block that detects
C0 revisions.

Additionally, instead of having an inclusive check for QnJ devices,
make the selection exclusive, so that switching to QnJ is the
exception, not the default. This prevents us from having to add all
the non-QnJ cards to an exclusion list. To do so, only go into the
QnJ block if the device has an RF ID type HR and HW revision QnJ.

Cc: stable@vger.kernel.org # 5.2
Signed-off-by: Luca Coelho
Link: https://lore.kernel.org/r/20190821171732.2266-1-luca@coelho.fi
Signed-off-by: Johannes Berg

Luca Coelho
2019-08-29 22:38:34 +0800
de20900fb netfilter: nf_flow_table: clear skb tstamp before xmit ... Browse Code »

If 'fq' qdisc is used and a program has requested timestamps,
skb->tstamp needs to be cleared, else fq will treat these as
'transmit time'.

Signed-off-by: Florian Westphal
Signed-off-by: Pablo Neira Ayuso

Florian Westphal
2019-08-29 22:38:05 +0800
189308d58 sky2: Disable MSI on yet another ASUS boards (P6Xxxx) ... Browse Code »

A similar workaround for the suspend/resume problem is needed for yet
another ASUS machines, P6X models. Like the previous fix, the BIOS
doesn't provide the standard DMI_SYS_* entry, so again DMI_BOARD_*
entries are used instead.

Reported-and-tested-by: SteveM
Signed-off-by: Takashi Iwai
Signed-off-by: David S. Miller

Takashi Iwai
2019-08-29 07:09:02 +0800
807e32999 Merge branch 'nfp-flower-fix-bugs-in-merge-tunnel-encap-code' ... Browse Code »

Jakub Kicinski says:

====================
nfp: flower: fix bugs in merge tunnel encap code

John says:

There are few bugs in the merge encap code that have come to light with
recent driver changes. Effectively, flow bind callbacks were being
registered twice when using internal ports (new 'busy' code triggers
this). There was also an issue with neighbour notifier messages being
ignored for internal ports.
====================

Signed-off-by: David S. Miller

David S. Miller
2019-08-29 07:06:49 +0800
e8024cb48 nfp: flower: handle neighbour events on internal ports ... Browse Code »

Recent code changes to NFP allowed the offload of neighbour entries to FW
when the next hop device was an internal port. This allows for offload of
tunnel encap when the end-point IP address is applied to such a port.

Unfortunately, the neighbour event handler still rejects events that are
not associated with a repr dev and so the firmware neighbour table may get
out of sync for internal ports.

Fix this by allowing internal port neighbour events to be correctly
processed.

Fixes: 45756dfedab5 ("nfp: flower: allow tunnels to output to internal port")
Signed-off-by: John Hurley
Reviewed-by: Simon Horman
Reviewed-by: Jakub Kicinski
Signed-off-by: Jakub Kicinski
Signed-off-by: David S. Miller

John Hurley
2019-08-29 07:06:49 +0800
739d7c575 nfp: flower: prevent ingress block binds on internal ports ... Browse Code »

Internal port TC offload is implemented through user-space applications
(such as OvS) by adding filters at egress via TC clsact qdiscs. Indirect
block offload support in the NFP driver accepts both ingress qdisc binds
and egress binds if the device is an internal port. However, clsact sends
bind notification for both ingress and egress block binds which can lead
to the driver registering multiple callbacks and receiving multiple
notifications of new filters.

Fix this by rejecting ingress block bind callbacks when the port is
internal and only adding filter callbacks for egress binds.

Fixes: 4d12ba42787b ("nfp: flower: allow offloading of matches on 'internal' ports")
Signed-off-by: John Hurley
Reviewed-by: Jakub Kicinski
Signed-off-by: Jakub Kicinski
Signed-off-by: David S. Miller

John Hurley
2019-08-29 07:06:49 +0800
80a6a5d62 Merge branch 'r8152-fix-side-effect' ... Browse Code »

Hayes Wang says:

====================
r8152: fix side effect

v3:
Update the commit message for patch #1.

v2:
Replace patch #2 with "r8152: remove calling netif_napi_del".

v1:
The commit 0ee1f4734967 ("r8152: napi hangup fix after disconnect")
add a check to avoid using napi_disable after netif_napi_del. However,
the commit ffa9fec30ca0 ("r8152: set RTL8152_UNPLUG only for real
disconnection") let the check useless.

Therefore, I revert commit 0ee1f4734967 ("r8152: napi hangup fix
after disconnect") first, and add another patch to fix it.
====================

Signed-off-by: David S. Miller

David S. Miller
2019-08-29 07:02:32 +0800
973dc6cfc r8152: remove calling netif_napi_del ... Browse Code »

Remove unnecessary use of netif_napi_del. This also avoids to call
napi_disable() after netif_napi_del().

Signed-off-by: Hayes Wang
Signed-off-by: David S. Miller

Hayes Wang
2019-08-29 07:02:32 +0800
49d4b1411 Revert "r8152: napi hangup fix after disconnect" ... Browse Code »

This reverts commit 0ee1f4734967af8321ecebaf9c74221ace34f2d5.

The commit 0ee1f4734967 ("r8152: napi hangup fix after
disconnect") adds a check about RTL8152_UNPLUG to determine
if calling napi_disable() is invalid in rtl8152_close(),
when rtl8152_disconnect() is called. This avoids to use
napi_disable() after calling netif_napi_del().

Howver, commit ffa9fec30ca0 ("r8152: set RTL8152_UNPLUG
only for real disconnection") causes that RTL8152_UNPLUG
is not always set when calling rtl8152_disconnect().
Therefore, I have to revert commit 0ee1f4734967 ("r8152:
napi hangup fix after disconnect"), first. And submit
another patch to fix it.

Signed-off-by: Hayes Wang
Signed-off-by: David S. Miller

Hayes Wang
2019-08-29 07:02:32 +0800
092e22e58 net/sched: pfifo_fast: fix wrong dereference in pfifo_fast_enqueue ... Browse Code »

Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
'TCQ_F_NOLOCK' bit in the parent qdisc, we can't assume anymore that
per-cpu counters are there in the error path of skb_array_produce().
Otherwise, the following splat can be seen:

Unable to handle kernel paging request at virtual address 0000600dea430008
Mem abort info:
ESR = 0x96000005
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000005
CM = 0, WnR = 0
user pgtable: 64k pages, 48-bit VAs, pgdp = 000000007b97530e
[0000600dea430008] pgd=0000000000000000, pud=0000000000000000
Internal error: Oops: 96000005 [#1] SMP
[...]
pstate: 10000005 (nzcV daif -PAN -UAO)
pc : pfifo_fast_enqueue+0x524/0x6e8
lr : pfifo_fast_enqueue+0x46c/0x6e8
sp : ffff800d39376fe0
x29: ffff800d39376fe0 x28: 1ffff001a07d1e40
x27: ffff800d03e8f188 x26: ffff800d03e8f200
x25: 0000000000000062 x24: ffff800d393772f0
x23: 0000000000000000 x22: 0000000000000403
x21: ffff800cca569a00 x20: ffff800d03e8ee00
x19: ffff800cca569a10 x18: 00000000000000bf
x17: 0000000000000000 x16: 0000000000000000
x15: 0000000000000000 x14: ffff1001a726edd0
x13: 1fffe4000276a9a4 x12: 0000000000000000
x11: dfff200000000000 x10: ffff800d03e8f1a0
x9 : 0000000000000003 x8 : 0000000000000000
x7 : 00000000f1f1f1f1 x6 : ffff1001a726edea
x5 : ffff800cca56a53c x4 : 1ffff001bf9a8003
x3 : 1ffff001bf9a8003 x2 : 1ffff001a07d1dcb
x1 : 0000600dea430000 x0 : 0000600dea430008
Process ping (pid: 6067, stack limit = 0x00000000dc0aa557)
Call trace:
pfifo_fast_enqueue+0x524/0x6e8
htb_enqueue+0x660/0x10e0 [sch_htb]
__dev_queue_xmit+0x123c/0x2de0
dev_queue_xmit+0x24/0x30
ip_finish_output2+0xc48/0x1720
ip_finish_output+0x548/0x9d8
ip_output+0x334/0x788
ip_local_out+0x90/0x138
ip_send_skb+0x44/0x1d0
ip_push_pending_frames+0x5c/0x78
raw_sendmsg+0xed8/0x28d0
inet_sendmsg+0xc4/0x5c0
sock_sendmsg+0xac/0x108
__sys_sendto+0x1ac/0x2a0
__arm64_sys_sendto+0xc4/0x138
el0_svc_handler+0x13c/0x298
el0_svc+0x8/0xc
Code: f9402e80 d538d081 91002000 8b010000 (885f7c03)

Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
before dereferencing 'qdisc->cpu_qstats'.

Fixes: 8a53e616de29 ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
CC: Paolo Abeni
CC: Stefano Brivio
Reported-by: Li Shuang
Signed-off-by: Davide Caratti
Acked-by: Paolo Abeni
Signed-off-by: David S. Miller

Davide Caratti
2019-08-29 06:57:38 +0800
888a5c53c tcp: inherit timestamp on mtu probe ... Browse Code »

TCP associates tx timestamp requests with a byte in the bytestream.
If merging skbs in tcp_mtu_probe, migrate the tstamp request.

Similar to MSG_EOR, do not allow moving a timestamp from any segment
in the probe but the last. This to avoid merging multiple timestamps.

Tested with the packetdrill script at
https://github.com/wdebruij/packetdrill/commits/mtu_probe-1

Link: http://patchwork.ozlabs.org/patch/1143278/#2232897
Fixes: 4ed2d765dfac ("net-timestamp: TCP timestamping")
Signed-off-by: Willem de Bruijn
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Willem de Bruijn
2019-08-29 06:56:28 +0800
dbf47a2a0 net: sched: act_sample: fix psample group handling on overwrite ... Browse Code »

Action sample doesn't properly handle psample_group pointer in overwrite
case. Following issues need to be fixed:

- In tcf_sample_init() function RCU_INIT_POINTER() is used to set
s->psample_group, even though we neither setting the pointer to NULL, nor
preventing concurrent readers from accessing the pointer in some way.
Use rcu_swap_protected() instead to safely reset the pointer.

- Old value of s->psample_group is not released or deallocated in any way,
which results resource leak. Use psample_group_put() on non-NULL value
obtained with rcu_swap_protected().

- The function psample_group_put() that released reference to struct
psample_group pointed by rcu-pointer s->psample_group doesn't respect rcu
grace period when deallocating it. Extend struct psample_group with rcu
head and use kfree_rcu when freeing it.

Fixes: 5c5670fae430 ("net/sched: Introduce sample tc action")
Signed-off-by: Vlad Buslov
Signed-off-by: David S. Miller

Vlad Buslov
2019-08-29 06:53:51 +0800
36f1031c5 ibmvnic: Do not process reset during or after device removal ... Browse Code »

Currently, the ibmvnic driver will not schedule device resets
if the device is being removed, but does not check the device
state before the reset is actually processed. This leads to a race
where a reset is scheduled with a valid device state but is
processed after the driver has been removed, resulting in an oops.

Fix this by checking the device state before processing a queued
reset event.

Reported-by: Abdul Haleem
Tested-by: Abdul Haleem
Signed-off-by: Thomas Falcon
Signed-off-by: David S. Miller

Thomas Falcon
2019-08-29 06:45:40 +0800
0754b4e8c openvswitch: Clear the L4 portion of the key for "later" fragments. ... Browse Code »

Only the first fragment in a datagram contains the L4 headers. When the
Open vSwitch module parses a packet, it always sets the IP protocol
field in the key, but can only set the L4 fields on the first fragment.
The original behavior would not clear the L4 portion of the key, so
garbage values would be sent in the key for "later" fragments. This
patch clears the L4 fields in that circumstance to prevent sending those
garbage values as part of the upcall.

Signed-off-by: Justin Pettit
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Justin Pettit
2019-08-29 05:53:51 +0800
ad06a566e openvswitch: Properly set L4 keys on "later" IP fragments ... Browse Code »

When IP fragments are reassembled before being sent to conntrack, the
key from the last fragment is used. Unless there are reordering
issues, the last fragment received will not contain the L4 ports, so the
key for the reassembled datagram won't contain them. This patch updates
the key once we have a reassembled datagram.

The handle_fragments() function works on L3 headers so we pull the L3/L4
flow key update code from key_extract into a new function
'key_extract_l3l4'. Then we add a another new function
ovs_flow_key_update_l3l4() and export it so that it is accessible by
handle_fragments() for conntrack packet reassembly.

Co-authored-by: Justin Pettit
Signed-off-by: Greg Rose
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Greg Rose
2019-08-29 05:53:51 +0800
a84d01647 mld: fix memory leak in mld_del_delrec() ... Browse Code »

Similar to the fix done for IPv4 in commit e5b1c6c6277d
("igmp: fix memory leak in igmpv3_del_delrec()"), we need to
make sure mca_tomb and mca_sources are not blindly overwritten.

Using swap() then a call to ip6_mc_clear_src() will take care
of the missing free.

BUG: memory leak
unreferenced object 0xffff888117d9db00 (size 64):
comm "syz-executor247", pid 6918, jiffies 4294943989 (age 25.350s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 fe 88 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline]
[] slab_post_alloc_hook mm/slab.h:522 [inline]
[] slab_alloc mm/slab.c:3319 [inline]
[] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3548
[] kmalloc include/linux/slab.h:552 [inline]
[] kzalloc include/linux/slab.h:748 [inline]
[] ip6_mc_add1_src net/ipv6/mcast.c:2236 [inline]
[] ip6_mc_add_src+0x31f/0x420 net/ipv6/mcast.c:2356
[] ip6_mc_source+0x4a8/0x600 net/ipv6/mcast.c:449
[] do_ipv6_setsockopt.isra.0+0x1b92/0x1dd0 net/ipv6/ipv6_sockglue.c:748
[] ipv6_setsockopt+0x89/0xd0 net/ipv6/ipv6_sockglue.c:944
[] udpv6_setsockopt+0x4e/0x90 net/ipv6/udp.c:1558
[] sock_common_setsockopt+0x38/0x50 net/core/sock.c:3139
[] __sys_setsockopt+0x10f/0x220 net/socket.c:2084
[] __do_sys_setsockopt net/socket.c:2100 [inline]
[] __se_sys_setsockopt net/socket.c:2097 [inline]
[] __x64_sys_setsockopt+0x26/0x30 net/socket.c:2097
[] do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:296
[] entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down")
Fixes: 9c8bb163ae78 ("igmp, mld: Fix memory leak in igmpv3/mld_del_delrec()")
Signed-off-by: Eric Dumazet
Reported-by: syzbot
Signed-off-by: David S. Miller

Eric Dumazet
2019-08-29 05:47:35 +0800
04d37cf46 net/sched: pfifo_fast: fix wrong dereference when qdisc is reset ... Browse Code »

Now that 'TCQ_F_CPUSTATS' bit can be cleared, depending on the value of
'TCQ_F_NOLOCK' bit in the parent qdisc, we need to be sure that per-cpu
counters are present when 'reset()' is called for pfifo_fast qdiscs.
Otherwise, the following script:

# tc q a dev lo handle 1: root htb default 100
# tc c a dev lo parent 1: classid 1:100 htb \
> rate 95Mbit ceil 100Mbit burst 64k
[...]
# tc f a dev lo parent 1: protocol arp basic classid 1:100
[...]
# tc q a dev lo parent 1:100 handle 100: pfifo_fast
[...]
# tc q d dev lo root

can generate the following splat:

Unable to handle kernel paging request at virtual address dfff2c01bd148000
Mem abort info:
ESR = 0x96000004
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
[dfff2c01bd148000] address between user and kernel address ranges
Internal error: Oops: 96000004 [#1] SMP
[...]
pstate: 80000005 (Nzcv daif -PAN -UAO)
pc : pfifo_fast_reset+0x280/0x4d8
lr : pfifo_fast_reset+0x21c/0x4d8
sp : ffff800d09676fa0
x29: ffff800d09676fa0 x28: ffff200012ee22e4
x27: dfff200000000000 x26: 0000000000000000
x25: ffff800ca0799958 x24: ffff1001940f332b
x23: 0000000000000007 x22: ffff200012ee1ab8
x21: 0000600de8a40000 x20: 0000000000000000
x19: ffff800ca0799900 x18: 0000000000000000
x17: 0000000000000002 x16: 0000000000000000
x15: 0000000000000000 x14: 0000000000000000
x13: 0000000000000000 x12: ffff1001b922e6e2
x11: 1ffff001b922e6e1 x10: 0000000000000000
x9 : 1ffff001b922e6e1 x8 : dfff200000000000
x7 : 0000000000000000 x6 : 0000000000000000
x5 : 1fffe400025dc45c x4 : 1fffe400025dc357
x3 : 00000c01bd148000 x2 : 0000600de8a40000
x1 : 0000000000000007 x0 : 0000600de8a40004
Call trace:
pfifo_fast_reset+0x280/0x4d8
qdisc_reset+0x6c/0x370
htb_reset+0x150/0x3b8 [sch_htb]
qdisc_reset+0x6c/0x370
dev_deactivate_queue.constprop.5+0xe0/0x1a8
dev_deactivate_many+0xd8/0x908
dev_deactivate+0xe4/0x190
qdisc_graft+0x88c/0xbd0
tc_get_qdisc+0x418/0x8a8
rtnetlink_rcv_msg+0x3a8/0xa78
netlink_rcv_skb+0x18c/0x328
rtnetlink_rcv+0x28/0x38
netlink_unicast+0x3c4/0x538
netlink_sendmsg+0x538/0x9a0
sock_sendmsg+0xac/0xf8
___sys_sendmsg+0x53c/0x658
__sys_sendmsg+0xc8/0x140
__arm64_sys_sendmsg+0x74/0xa8
el0_svc_handler+0x164/0x468
el0_svc+0x10/0x14
Code: 910012a0 92400801 d343fc03 11000c21 (38fb6863)

Fix this by testing the value of 'TCQ_F_CPUSTATS' bit in 'qdisc->flags',
before dereferencing 'qdisc->cpu_qstats'.

Changes since v1:
- coding style improvements, thanks to Stefano Brivio

Fixes: 8a53e616de29 ("net: sched: when clearing NOLOCK, clear TCQ_F_CPUSTATS, too")
CC: Paolo Abeni
Reported-by: Li Shuang
Signed-off-by: Davide Caratti
Acked-by: Paolo Abeni
Reviewed-by: Stefano Brivio
Signed-off-by: David S. Miller

Davide Caratti
2019-08-29 05:45:46 +0800
2965daa33 Merge branch 'macb-Update-ethernet-compatible-string-for-SiFive-FU540' ... Browse Code »

Yash Shah says:

====================
macb: Update ethernet compatible string for SiFive FU540

This patch series renames the compatible property to a more appropriate
string. The patchset is based on Linux-5.3-rc6 and tested on SiFive
Unleashed board

Change history:
Since v1:
- Dropped PATCH3 because it's already merged
- Change the reference url in the patch descriptions to point to a
'lore.kernel.org' link instead of 'lkml.org'
====================

Signed-off-by: David S. Miller

David S. Miller
2019-08-29 05:05:48 +0800
6342ea886 macb: Update compatibility string for SiFive FU540-C000 ... Browse Code »

Update the compatibility string for SiFive FU540-C000 as per the new
string updated in the binding doc.
Reference:
https://lore.kernel.org/netdev/CAJ2_jOFEVZQat0Yprg4hem4jRrqkB72FKSeQj4p8P5KA-+rgww@mail.gmail.com/

Signed-off-by: Yash Shah
Acked-by: Nicolas Ferre
Reviewed-by: Paul Walmsley
Tested-by: Paul Walmsley
Signed-off-by: David S. Miller

Yash Shah
2019-08-29 05:05:48 +0800
abecec415 macb: bindings doc: update sifive fu540-c000 binding ... Browse Code »

As per the discussion with Nicolas Ferre[0], rename the compatible property
to a more appropriate and specific string.

[0] https://lore.kernel.org/netdev/CAJ2_jOFEVZQat0Yprg4hem4jRrqkB72FKSeQj4p8P5KA-+rgww@mail.gmail.com/

Signed-off-by: Yash Shah
Acked-by: Nicolas Ferre
Reviewed-by: Paul Walmsley
Reviewed-by: Rob Herring
Signed-off-by: David S. Miller

Yash Shah
2019-08-29 05:05:48 +0800

28 Aug, 2019

5 commits

fdfc5c859 tcp: remove empty skb from write queue in error cases ... Browse Code »

Vladimir Rutsky reported stuck TCP sessions after memory pressure
events. Edge Trigger epoll() user would never receive an EPOLLOUT
notification allowing them to retry a sendmsg().

Jason tested the case of sk_stream_alloc_skb() returning NULL,
but there are other paths that could lead both sendmsg() and sendpage()
to return -1 (EAGAIN), with an empty skb queued on the write queue.

This patch makes sure we remove this empty skb so that
Jason code can detect that the queue is empty, and
call sk->sk_write_space(sk) accordingly.

Fixes: ce5ec440994b ("tcp: ensure epoll edge trigger wakeup when write queue is empty")
Signed-off-by: Eric Dumazet
Cc: Jason Baron
Reported-by: Vladimir Rutsky
Cc: Soheil Hassas Yeganeh
Cc: Neal Cardwell
Acked-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller

Eric Dumazet
2019-08-28 11:57:43 +0800
7d0a06586 net/rds: Fix info leak in rds6_inc_info_copy() ... Browse Code »

The rds6_inc_info_copy() function has a couple struct members which
are leaking stack information. The ->tos field should hold actual
information and the ->flags field needs to be zeroed out.

Fixes: 3eb450367d08 ("rds: add type of service(tos) infrastructure")
Fixes: b7ff8b1036f0 ("rds: Extend RDS API for IPv6 support")
Reported-by: 黄ID蝴蝶
Signed-off-by: Dan Carpenter
Signed-off-by: Ka-Cheong Poon
Acked-by: Santosh Shilimkar
Signed-off-by: David S. Miller

Ka-Cheong Poon
2019-08-28 11:56:06 +0800
2c1644cf6 net: fix skb use after free in netpoll ... Browse Code »

After commit baeababb5b85d5c4e6c917efe2a1504179438d3b
("tun: return NET_XMIT_DROP for dropped packets"),
when tun_net_xmit drop packets, it will free skb and return NET_XMIT_DROP,
netpoll_send_skb_on_dev will run into following use after free cases:
1. retry netpoll_start_xmit with freed skb;
2. queue freed skb in npinfo->txq.
queue_process will also run into use after free case.

hit netpoll_send_skb_on_dev first case with following kernel log:

[ 117.864773] kernel BUG at mm/slub.c:306!
[ 117.864773] invalid opcode: 0000 [#1] SMP PTI
[ 117.864774] CPU: 3 PID: 2627 Comm: loop_printmsg Kdump: loaded Tainted: P OE 5.3.0-050300rc5-generic #201908182231
[ 117.864775] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[ 117.864775] RIP: 0010:kmem_cache_free+0x28d/0x2b0
[ 117.864781] Call Trace:
[ 117.864781] ? tun_net_xmit+0x21c/0x460
[ 117.864781] kfree_skbmem+0x4e/0x60
[ 117.864782] kfree_skb+0x3a/0xa0
[ 117.864782] tun_net_xmit+0x21c/0x460
[ 117.864782] netpoll_start_xmit+0x11d/0x1b0
[ 117.864788] netpoll_send_skb_on_dev+0x1b8/0x200
[ 117.864789] __br_forward+0x1b9/0x1e0 [bridge]
[ 117.864789] ? skb_clone+0x53/0xd0
[ 117.864790] ? __skb_clone+0x2e/0x120
[ 117.864790] deliver_clone+0x37/0x50 [bridge]
[ 117.864790] maybe_deliver+0x89/0xc0 [bridge]
[ 117.864791] br_flood+0x6c/0x130 [bridge]
[ 117.864791] br_dev_xmit+0x315/0x3c0 [bridge]
[ 117.864792] netpoll_start_xmit+0x11d/0x1b0
[ 117.864792] netpoll_send_skb_on_dev+0x1b8/0x200
[ 117.864792] netpoll_send_udp+0x2c6/0x3e8
[ 117.864793] write_msg+0xd9/0xf0 [netconsole]
[ 117.864793] console_unlock+0x386/0x4e0
[ 117.864793] vprintk_emit+0x17e/0x280
[ 117.864794] vprintk_default+0x29/0x50
[ 117.864794] vprintk_func+0x4c/0xbc
[ 117.864794] printk+0x58/0x6f
[ 117.864795] loop_fun+0x24/0x41 [printmsg_loop]
[ 117.864795] kthread+0x104/0x140
[ 117.864795] ? 0xffffffffc05b1000
[ 117.864796] ? kthread_park+0x80/0x80
[ 117.864796] ret_from_fork+0x35/0x40

Signed-off-by: Feng Sun
Signed-off-by: Xiaojun Zhao
Signed-off-by: David S. Miller

Feng Sun
2019-08-28 11:52:02 +0800
bcccb0a53 net: dsa: tag_8021q: Future-proof the reserved fields in the custom VID ... Browse Code »

After witnessing the discussion in https://lkml.org/lkml/2019/8/14/151
w.r.t. ioctl extensibility, it became clear that such an issue might
prevent that the 3 RSV bits inside the DSA 802.1Q tag might also suffer
the same fate and be useless for further extension.

So clearly specify that the reserved bits should currently be
transmitted as zero and ignored on receive. The DSA tagger already does
this (and has always did), and is the only known user so far (no
Wireshark dissection plugin, etc). So there should be no incompatibility
to speak of.

Fixes: 0471dd429cea ("net: dsa: tag_8021q: Create a stable binary format")
Signed-off-by: Vladimir Oltean
Reviewed-by: Florian Fainelli
Signed-off-by: David S. Miller

Vladimir Oltean
2019-08-28 11:31:12 +0800
94acaeb50 Add genphy_c45_config_aneg() function to phy-c45.c ... Browse Code »

Commit 34786005eca3 ("net: phy: prevent PHYs w/o Clause 22 regs from calling
genphy_config_aneg") introduced a check that aborts phy_config_aneg()
if the phy is a C45 phy.
This causes phy_state_machine() to call phy_error() so that the phy
ends up in PHY_HALTED state.

Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg()
(analogous to the C22 case) so that the state machine can run
correctly.

genphy_c45_config_aneg() closely resembles mv3310_config_aneg()
in drivers/net/phy/marvell10g.c, excluding vendor specific
configurations for 1000BaseT.

Fixes: 22b56e827093 ("net: phy: replace genphy_10g_driver with genphy_c45_driver")

Signed-off-by: Marco Hartmann
Reviewed-by: Andrew Lunn
Signed-off-by: David S. Miller

Marco Hartmann
2019-08-28 11:21:15 +0800