26 Nov, 2016
5 commits
-
In commit 10724cc7bb78 ("tipc: redesign connection-level flow control")
we replaced the previous message based flow control with one based on
1k blocks. In order to ensure backwards compatibility the mechanism
falls back to using message as base unit when it senses that the peer
doesn't support the new algorithm. The default flow control window,
i.e., how many units can be sent before the sender blocks and waits
for an acknowledge (aka advertisement) is 512. This was tested against
the previous version, which uses an acknowledge frequency of on ack per
256 received message, and found to work fine.However, we missed the fact that versions older than Linux 3.15 use an
acknowledge frequency of 512, which is exactly the limit where a 4.6+
sender will stop and wait for acknowledge. This would also work fine if
it weren't for the fact that if the first sent message on a 4.6+ server
side is an empty SYNACK, this one is also is counted as a sent message,
while it is not counted as a received message on a legacy 3.15-receiver.
This leads to the sender always being one step ahead of the receiver, a
scenario causing the sender to block after 512 sent messages, while the
receiver only has registered 511 read messages. Hence, the legacy
receiver is not trigged to send an acknowledge, with a permanently
blocked sender as result.We solve this deadlock by simply allowing the sender to send one more
message before it blocks, i.e., by a making minimal change to the
condition used for determining connection congestion.Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
The ETHTOOL_GLINKSETTINGS command is deprecating the ETHTOOL_GSET
command and likewise it shouldn't require the CAP_NET_ADMIN capability.Signed-off-by: Miroslav Lichvar
Signed-off-by: David S. Miller -
In commit 35c55c9877f8 ("tipc: add neighbor monitoring framework") we
added a data area to the link monitor STATE messages under the
assumption that previous versions did not use any such data area.For versions older than Linux 4.3 this assumption is not correct. In
those version, all STATE messages sent out from a node inadvertently
contain a 16 byte data area containing a string; -a leftover from
previous RESET messages which were using this during the setup phase.
This string serves no purpose in STATE messages, and should no be there.Unfortunately, this data area is delivered to the link monitor
framework, where a sanity check catches that it is not a correct domain
record, and drops it. It also issues a rate limited warning about the
event.Since such events occur much more frequently than anticipated, we now
choose to remove the warning in order to not fill the kernel log with
useless contents. We also make the sanity check stricter, to further
reduce the risk that such data is inavertently admitted.Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
commit 817298102b0b ("tipc: fix link priority propagation") introduced a
compatibility problem between TIPC versions newer than Linux 4.6 and
those older than Linux 4.4. In versions later than 4.4, link STATE
messages only contain a non-zero link priority value when the sender
wants the receiver to change its priority. This has the effect that the
receiver resets itself in order to apply the new priority. This works
well, and is consistent with the said commit.However, in versions older than 4.4 a valid link priority is present in
all sent link STATE messages, leading to cyclic link establishment and
reset on the 4.6+ node.We fix this by adding a test that the received value should not only
be valid, but also differ from the current value in order to cause the
receiving link endpoint to reset.Reported-by: Amar Nv
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
…ux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2016-11-23this is a pull request for net/master.
The patch by Oliver Hartkopp for the broadcast manager (bcm) fixes the
CAN-FD support, which may cause an out-of-bounds access otherwise.
====================Signed-off-by: David S. Miller <davem@davemloft.net>
25 Nov, 2016
4 commits
-
Johan Hedberg says:
====================
pull request: bluetooth 2016-11-23Sorry about the late pull request for 4.9, but we have one more
important Bluetooth patch that should make it to the release. It fixes
connection creation for Bluetooth LE controllers that do not have a
public address (only a random one).Please let me know if there are any issues pulling. Thanks.
====================Signed-off-by: David S. Miller
-
Should pass valid filter handle, not the netlink flags.
Fixes: 30a391a13ab92 ("net sched filters: pass netlink message flags in event notification")
Signed-off-by: Roman Mashak
Signed-off-by: Jamal Hadi Salim
Reported-by: Cong Wang
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller -
In commits 93821778def10 ("udp: Fix rcv socket locking") and
f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into
__udpv6_queue_rcv_skb") UDP backlog handlers were renamed, but UDPlite
was forgotten.This leads to crashes if UDPlite header is pulled twice, which happens
starting from commit e6afc8ace6dd ("udp: remove headers from UDP packets
before queueing")Bug found by syzkaller team, thanks a lot guys !
Note that backlog use in UDP/UDPlite is scheduled to be removed starting
from linux-4.10, so this patch is only needed up to linux-4.9Fixes: 93821778def1 ("udp: Fix rcv socket locking")
Fixes: f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb")
Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing")
Signed-off-by: Eric Dumazet
Reported-by: Andrey Konovalov
Cc: Benjamin LaHaise
Cc: Herbert Xu
Signed-off-by: David S. Miller -
When an ipv6 address has the tentative flag set, it can't be
used as source for egress traffic, while the associated route,
if any, can be looked up and even stored into some dst_cache.In the latter scenario, the source ipv6 address selected and
stored in the cache is most probably wrong (e.g. with
link-local scope) and the entity using the dst_cache will
experience lack of ipv6 connectivity until said cache is
cleared or invalidated.Overall this may cause lack of connectivity over most IPv6 tunnels
(comprising geneve and vxlan), if the first egress packet reaches
the tunnel before the DaD is completed for the used ipv6
address.This patch bumps a new genid after that the IFA_F_TENTATIVE flag
is cleared, so that dst_cache will be invalidated on
next lookup and ipv6 connectivity restored.Fixes: 0c1d70af924b ("net: use dst_cache for vxlan device")
Fixes: 468dfffcd762 ("geneve: add dst caching support")
Acked-by: Hannes Frederic Sowa
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller
24 Nov, 2016
2 commits
-
This reverts commit 7c6ae610a1f0, because l2tp_xmit_skb() never
returns NET_XMIT_CN, it ignores the return value of l2tp_xmit_core().Cc: Gao Feng
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller -
For RT netlink, calcit() function should return the minimal size for
netlink dump message. This will make sure that dump message for every
network device can be stored.Currently, rtnl_calcit() function doesn't account the size of header of
netlink message, this patch will fix it.Signed-off-by: Zhang Shengju
Signed-off-by: David S. Miller
23 Nov, 2016
3 commits
-
Since commit 6f3b911d5f29b98 ("can: bcm: add support for CAN FD frames") the
CAN broadcast manager supports CAN and CAN FD data frames.As these data frames are embedded in struct can[fd]_frames which have a
different length the access to the provided array of CAN frames became
dependend of op->cfsiz. By using a struct canfd_frame pointer for the array of
CAN frames the new offset calculation based on op->cfsiz was accidently applied
to CAN FD frame element lengths.This fix makes the pointer to the arrays of the different CAN frame types a
void pointer so that the offset calculation in bytes accesses the correct CAN
frame elements.Reference: http://marc.info/?l=linux-netdev&m=147980658909653
Reported-by: Andrey Konovalov
Signed-off-by: Oliver Hartkopp
Tested-by: Andrey Konovalov
Cc: linux-stable
Signed-off-by: Marc Kleine-Budde -
The hci_get_route() API is used to look up local HCI devices, however
so far it has been incapable of dealing with anything else than the
public address of HCI devices. This completely breaks with LE-only HCI
devices that do not come with a public address, but use a static
random address instead.This patch exteds the hci_get_route() API with a src_type parameter
that's used for comparing with the right address of each HCI device.Signed-off-by: Johan Hedberg
Signed-off-by: Marcel Holtmann -
Andre Noll reported panics after my recent fix (commit 34fad54c2537
"net: __skb_flow_dissect() must cap its return value")After some more headaches, Alexander root caused the problem to
init_default_flow_dissectors() being called too late, in case
a network driver like IGB is not a module and receives DHCP message
very early.Fix is to call init_default_flow_dissectors() much earlier,
as it is a core infrastructure and does not depend on another
kernel service.Fixes: 06635a35d13d4 ("flow_dissect: use programable dissector in skb_flow_dissect and friends")
Signed-off-by: Eric Dumazet
Reported-by: Andre Noll
Diagnosed-by: Alexander Duyck
Signed-off-by: David S. Miller
22 Nov, 2016
3 commits
-
Pull networking fixes from David Miller:
1) Clear congestion control state when changing algorithms on an
existing socket, from Florian Westphal.2) Fix register bit values in altr_tse_pcs portion of stmmac driver,
from Jia Jie Ho.3) Fix PTP handling in stammc driver for GMAC4, from Giuseppe
CAVALLARO.4) Fix udplite multicast delivery handling, it ignores the udp_table
parameter passed into the lookups, from Pablo Neira Ayuso.5) Synchronize the space estimated by rtnl_vfinfo_size and the space
actually used by rtnl_fill_vfinfo. From Sabrina Dubroca.6) Fix memory leak in fib_info when splitting nodes, from Alexander
Duyck.7) If a driver does a napi_hash_del() explicitily and not via
netif_napi_del(), it must perform RCU synchronization as needed. Fix
this in virtio-net and bnxt drivers, from Eric Dumazet.8) Likewise, it is not necessary to invoke napi_hash_del() is we are
also doing neif_napi_del() in the same code path. Remove such calls
from be2net and cxgb4 drivers, also from Eric Dumazet.9) Don't allocate an ID in peernet2id_alloc() if the netns is dead,
from WANG Cong.10) Fix OF node and device struct leaks in of_mdio, from Johan Hovold.
11) We cannot cache routes in ip6_tunnel when using inherited traffic
classes, from Paolo Abeni.12) Fix several crashes and leaks in cpsw driver, from Johan Hovold.
13) Splice operations cannot use freezable blocking calls in AF_UNIX,
from WANG Cong.14) Link dump filtering by master device and kind support added an error
in loop index updates during the dump if we actually do filter, fix
from Zhang Shengju.* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits)
tcp: zero ca_priv area when switching cc algorithms
net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit
ethernet: stmmac: make DWMAC_STM32 depend on it's associated SoC
tipc: eliminate obsolete socket locking policy description
rtnl: fix the loop index update error in rtnl_dump_ifinfo()
l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind()
net: macb: add check for dma mapping error in start_xmit()
rtnetlink: fix FDB size computation
netns: fix get_net_ns_by_fd(int pid) typo
af_unix: conditionally use freezable blocking calls in read
net: ethernet: ti: cpsw: fix fixed-link phy probe deferral
net: ethernet: ti: cpsw: add missing sanity check
net: ethernet: ti: cpsw: fix secondary-emac probe error path
net: ethernet: ti: cpsw: fix of_node and phydev leaks
net: ethernet: ti: cpsw: fix deferred probe
net: ethernet: ti: cpsw: fix mdio device reference leak
net: ethernet: ti: cpsw: fix bad register access in probe error path
net: sky2: Fix shutdown crash
cfg80211: limit scan results cache size
net sched filters: pass netlink message flags in event notification
... -
We need to zero out the private data area when application switches
connection to different algorithm (TCP_CONGESTION setsockopt).When congestion ops get assigned at connect time everything is already
zeroed because sk_alloc uses GFP_ZERO flag. But in the setsockopt case
this contains whatever previous cc placed there.Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller -
The tc could return NET_XMIT_CN as one congestion notification, but
it does not mean the packe is lost. Other modules like ipvlan,
macvlan, and others treat NET_XMIT_CN as success too.
So l2tp_eth_dev_xmit should add the NET_XMIT_CN check.Signed-off-by: Gao Feng
Signed-off-by: David S. Miller
20 Nov, 2016
4 commits
-
The comment block in socket.c describing the locking policy is
obsolete, and does not reflect current reality. We remove it in this
commit.Since the current locking policy is much simpler and follows a
mainstream approach, we see no need to add a new description.Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller -
If the link is filtered out, loop index should also be updated. If not,
loop index will not be correct.Fixes: dc599f76c22b0 ("net: Add support for filtering link dump by master device and kind")
Signed-off-by: Zhang Shengju
Acked-by: David Ahern
Signed-off-by: David S. Miller -
Lock socket before checking the SOCK_ZAPPED flag in l2tp_ip6_bind().
Without lock, a concurrent call could modify the socket flags between
the sock_flag(sk, SOCK_ZAPPED) test and the lock_sock() call. This way,
a socket could be inserted twice in l2tp_ip6_bind_table. Releasing it
would then leave a stale pointer there, generating use-after-free
errors when walking through the list or modifying adjacent entries.BUG: KASAN: use-after-free in l2tp_ip6_close+0x22e/0x290 at addr ffff8800081b0ed8
Write of size 8 by task syz-executor/10987
CPU: 0 PID: 10987 Comm: syz-executor Not tainted 4.8.0+ #39
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
ffff880031d97838 ffffffff829f835b ffff88001b5a1640 ffff8800081b0ec0
ffff8800081b15a0 ffff8800081b6d20 ffff880031d97860 ffffffff8174d3cc
ffff880031d978f0 ffff8800081b0e80 ffff88001b5a1640 ffff880031d978e0
Call Trace:
[] dump_stack+0xb3/0x118 lib/dump_stack.c:15
[] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156
[< inline >] print_address_description mm/kasan/report.c:194
[] kasan_report_error+0x1f6/0x4d0 mm/kasan/report.c:283
[< inline >] kasan_report mm/kasan/report.c:303
[] __asan_report_store8_noabort+0x3e/0x40 mm/kasan/report.c:329
[< inline >] __write_once_size ./include/linux/compiler.h:249
[< inline >] __hlist_del ./include/linux/list.h:622
[< inline >] hlist_del_init ./include/linux/list.h:637
[] l2tp_ip6_close+0x22e/0x290 net/l2tp/l2tp_ip6.c:239
[] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415
[] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422
[] sock_release+0x8d/0x1d0 net/socket.c:570
[] sock_close+0x16/0x20 net/socket.c:1017
[] __fput+0x28c/0x780 fs/file_table.c:208
[] ____fput+0x15/0x20 fs/file_table.c:244
[] task_work_run+0xf9/0x170
[] do_exit+0x85e/0x2a00
[] do_group_exit+0x108/0x330
[] get_signal+0x617/0x17a0 kernel/signal.c:2307
[] do_signal+0x7f/0x18f0
[] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259
[] entry_SYSCALL_64_fastpath+0xc4/0xc6
Object at ffff8800081b0ec0, in cache L2TP/IPv6 size: 1448
Allocated:
PID = 10987
[ 1116.897025] [] save_stack_trace+0x16/0x20
[ 1116.897025] [] save_stack+0x46/0xd0
[ 1116.897025] [] kasan_kmalloc+0xad/0xe0
[ 1116.897025] [] kasan_slab_alloc+0x12/0x20
[ 1116.897025] [< inline >] slab_post_alloc_hook mm/slab.h:417
[ 1116.897025] [< inline >] slab_alloc_node mm/slub.c:2708
[ 1116.897025] [< inline >] slab_alloc mm/slub.c:2716
[ 1116.897025] [] kmem_cache_alloc+0xc8/0x2b0 mm/slub.c:2721
[ 1116.897025] [] sk_prot_alloc+0x69/0x2b0 net/core/sock.c:1326
[ 1116.897025] [] sk_alloc+0x38/0xae0 net/core/sock.c:1388
[ 1116.897025] [] inet6_create+0x2d7/0x1000 net/ipv6/af_inet6.c:182
[ 1116.897025] [] __sock_create+0x37b/0x640 net/socket.c:1153
[ 1116.897025] [< inline >] sock_create net/socket.c:1193
[ 1116.897025] [< inline >] SYSC_socket net/socket.c:1223
[ 1116.897025] [] SyS_socket+0xef/0x1b0 net/socket.c:1203
[ 1116.897025] [] entry_SYSCALL_64_fastpath+0x23/0xc6
Freed:
PID = 10987
[ 1116.897025] [] save_stack_trace+0x16/0x20
[ 1116.897025] [] save_stack+0x46/0xd0
[ 1116.897025] [] kasan_slab_free+0x71/0xb0
[ 1116.897025] [< inline >] slab_free_hook mm/slub.c:1352
[ 1116.897025] [< inline >] slab_free_freelist_hook mm/slub.c:1374
[ 1116.897025] [< inline >] slab_free mm/slub.c:2951
[ 1116.897025] [] kmem_cache_free+0xc8/0x330 mm/slub.c:2973
[ 1116.897025] [< inline >] sk_prot_free net/core/sock.c:1369
[ 1116.897025] [] __sk_destruct+0x32b/0x4f0 net/core/sock.c:1444
[ 1116.897025] [] sk_destruct+0x44/0x80 net/core/sock.c:1452
[ 1116.897025] [] __sk_free+0x53/0x220 net/core/sock.c:1460
[ 1116.897025] [] sk_free+0x23/0x30 net/core/sock.c:1471
[ 1116.897025] [] sk_common_release+0x28c/0x3e0 ./include/net/sock.h:1589
[ 1116.897025] [] l2tp_ip6_close+0x1fe/0x290 net/l2tp/l2tp_ip6.c:243
[ 1116.897025] [] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415
[ 1116.897025] [] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422
[ 1116.897025] [] sock_release+0x8d/0x1d0 net/socket.c:570
[ 1116.897025] [] sock_close+0x16/0x20 net/socket.c:1017
[ 1116.897025] [] __fput+0x28c/0x780 fs/file_table.c:208
[ 1116.897025] [] ____fput+0x15/0x20 fs/file_table.c:244
[ 1116.897025] [] task_work_run+0xf9/0x170
[ 1116.897025] [] do_exit+0x85e/0x2a00
[ 1116.897025] [] do_group_exit+0x108/0x330
[ 1116.897025] [] get_signal+0x617/0x17a0 kernel/signal.c:2307
[ 1116.897025] [] do_signal+0x7f/0x18f0
[ 1116.897025] [] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156
[ 1116.897025] [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[ 1116.897025] [] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259
[ 1116.897025] [] entry_SYSCALL_64_fastpath+0xc4/0xc6
Memory state around the buggy address:
ffff8800081b0d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8800081b0e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8800081b0e80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
^
ffff8800081b0f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8800081b0f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb==================================================================
The same issue exists with l2tp_ip_bind() and l2tp_ip_bind_table.
Fixes: c51ce49735c1 ("l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case")
Reported-by: Baozeng Ding
Reported-by: Andrey Konovalov
Tested-by: Baozeng Ding
Signed-off-by: Guillaume Nault
Signed-off-by: David S. Miller -
Simon Wunderlich says:
====================
Here are two batman-adv bugfix patches:- Revert a splat on disabling interface which created another problem,
by Sven Eckelmann- Fix error handling when the primary interface disappears during a
throughput meter test, by Sven Eckelmann
====================Signed-off-by: David S. Miller
19 Nov, 2016
4 commits
-
Pull nfsd bugfix from Bruce Fields:
"Just one fix for an NFS/RDMA crash"* tag 'nfsd-4.9-2' of git://linux-nfs.org/~bfields/linux:
sunrpc: svc_age_temp_xprts_now should not call setsockopt non-tcp transports -
Add missing NDA_VLAN attribute's size.
Fixes: 1e53d5bb8878 ("net: Pass VLAN ID to rtnl_fdb_notify.")
Signed-off-by: Sabrina Dubroca
Signed-off-by: David S. Miller -
…kernel/git/jberg/mac80211
Johannes Berg says:
====================
A few more bugfixes:
* limit # of scan results stored in memory - this is a long-standing bug
Jouni and I only noticed while discussing other things in Santa Fe
* revert AP_LINK_PS patch that was causing issues (Felix)
* various A-MSDU/A-MPDU fixes for TXQ code (Felix)
* interoperability workaround for peers with broken VHT capabilities
(Filip Matusiak)
* add bitrate definition for a VHT MCS that's supposed to be invalid
but gets used by some hardware anyway (Thomas Pedersen)
* beacon timer fix in hwsim (Benjamin Beichler)
====================Signed-off-by: David S. Miller <davem@davemloft.net>
-
Commit 2b15af6f95 ("af_unix: use freezable blocking calls in read")
converts schedule_timeout() to its freezable version, it was probably
correct at that time, but later, commit 2b514574f7e8
("net: af_unix: implement splice for stream af_unix sockets") breaks
the strong requirement for a freezable sleep, according to
commit 0f9548ca1091:We shouldn't try_to_freeze if locks are held. Holding a lock can cause a
deadlock if the lock is later acquired in the suspend or hibernate path
(e.g. by dpm). Holding a lock can also cause a deadlock in the case of
cgroup_freezer if a lock is held inside a frozen cgroup that is later
acquired by a process outside that group.The pipe_lock is still held at that point.
So use freezable version only for the recvmsg call path, avoid impact for
Android.Fixes: 2b514574f7e8 ("net: af_unix: implement splice for stream af_unix sockets")
Reported-by: Dmitry Vyukov
Cc: Tejun Heo
Cc: Colin Cross
Cc: Rafael J. Wysocki
Cc: Hannes Frederic Sowa
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
18 Nov, 2016
4 commits
-
It's possible to make scanning consume almost arbitrary amounts
of memory, e.g. by sending beacon frames with random BSSIDs at
high rates while somebody is scanning.Limit the number of BSS table entries we're willing to cache to
1000, limiting maximum memory usage to maybe 4-5MB, but lower
in practice - that would be the case for having both full-sized
beacon and probe response frames for each entry; this seems not
possible in practice, so a limit of 1000 entries will likely be
closer to 0.5 MB.Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg -
Userland client should be able to read an event, and reflect it back to
the kernel, therefore it needs to extract complete set of netlink flags.For example, this will allow "tc monitor" to distinguish Add and Replace
operations.Signed-off-by: Roman Mashak
Signed-off-by: Jamal Hadi Salim
Signed-off-by: David S. Miller -
If an ip6 tunnel is configured to inherit the traffic class from
the inner header, the dst_cache must be disabled or it will foul
the policy routing.The issue is apprently there since at leat Linux-2.6.12-rc2.
Reported-by: Liam McBirnie
Cc: Liam McBirnie
Acked-by: Hannes Frederic Sowa
Signed-off-by: Paolo Abeni
Signed-off-by: David S. Miller -
Andrei reports we still allocate netns ID from idr after we destroy
it in cleanup_net().cleanup_net():
...
idr_destroy(&net->netns_ids);
...
list_for_each_entry_reverse(ops, &pernet_list, list)
ops_exit_list(ops, &net_exit_list);
-> rollback_registered_many()
-> rtmsg_ifinfo_build_skb()
-> rtnl_fill_ifinfo()
-> peernet2id_alloc()After that point we should not even access net->netns_ids, we
should check the death of the current netns as early as we can in
peernet2id_alloc().For net-next we can consider to avoid sending rtmsg totally,
it is a good optimization for netns teardown path.Fixes: 0c7aecd4bde4 ("netns: add rtnl cmd to add and get peer netns ids")
Reported-by: Andrei Vagin
Cc: Nicolas Dichtel
Signed-off-by: Cong Wang
Acked-by: Andrei Vagin
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
17 Nov, 2016
3 commits
-
The IOP_XATTR flag is set on sockfs because sockfs supports getting the
"system.sockprotoname" xattr. Since commit 6c6ef9f2, this flag is checked for
setxattr support as well. This is wrong on sockfs because security xattr
support there is supposed to be provided by security_inode_setsecurity. The
smack security module relies on socket labels (xattrs).Fix this by adding a security xattr handler on sockfs that returns
-EAGAIN, and by checking for -EAGAIN in setxattr.We cannot simply check for -EOPNOTSUPP in setxattr because there are
filesystems that neither have direct security xattr support nor support
via security_inode_setsecurity. A more proper fix might be to move the
call to security_inode_setsecurity into sockfs, but it's not clear to me
if that is safe: we would end up calling security_inode_post_setxattr after
that as well.Signed-off-by: Andreas Gruenbacher
Signed-off-by: Al Viro -
Fix a small memory leak that can occur where we leak a fib_alias in the
event of us not being able to insert it into the local table.Fixes: 0ddcf43d5d4a0 ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Eric Dumazet
Signed-off-by: Alexander Duyck
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller -
The patch that removed the FIB offload infrastructure was a bit too
aggressive and also removed code needed to clean up us splitting the table
if additional rules were added. Specifically the function
fib_trie_flush_external was called at the end of a new rule being added to
flush the foreign trie entries from the main trie.I updated the code so that we only call fib_trie_flush_external on the main
table so that we flush the entries for local from main. This way we don't
call it for every rule change which is what was happening previously.Fixes: 347e3b28c1ba2 ("switchdev: remove FIB offload infrastructure")
Reported-by: Eric Dumazet
Cc: Jiri Pirko
Signed-off-by: Alexander Duyck
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
16 Nov, 2016
4 commits
-
rtnl_xdp_size() only considers the size of the actual payload attribute,
and misses the space taken by the attribute used for nesting (IFLA_XDP).Fixes: d1fdd9138682 ("rtnl: add option for setting link xdp prog")
Signed-off-by: Sabrina Dubroca
Reviewed-by: Brenden Blanco
Signed-off-by: David S. Miller -
The size reported by rtnl_vfinfo_size doesn't match the space used by
rtnl_fill_vfinfo.rtnl_vfinfo_size currently doesn't account for the nest attributes
used by statistics (added in commit 3b766cd83232), nor for struct
ifla_vf_tx_rate (since commit ed616689a3d9, which added ifla_vf_rate
to the dump without removing ifla_vf_tx_rate, but replaced
ifla_vf_tx_rate with ifla_vf_rate in the size computation).Fixes: 3b766cd83232 ("net/core: Add reading VF statistics through the PF netdevice")
Fixes: ed616689a3d9 ("net-next:v4: Add support to configure SR-IOV VF minimum and maximum Tx rate through ip tool")
Signed-off-by: Sabrina Dubroca
Signed-off-by: David S. Miller -
Honor udptable parameter that is passed to __udp*_lib_mcast_deliver(),
otherwise udplite broadcast/multicast use the wrong table and it breaks.Fixes: 2dc41cff7545 ("udp: Use hash2 for long hash1 chains in __udp*_lib_mcast_deliver.")
Signed-off-by: Pablo Neira Ayuso
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller -
In commit 24cf3af3fed5 ("igmp: call ip_mc_clear_src..."), we forgot to remove
igmpv3_clear_delrec() in ip_mc_down(), which also called ip_mc_clear_src().
This make us clear all IGMPv3 source filter info after NETDEV_DOWN.
Move igmpv3_clear_delrec() to ip_mc_destroy_dev() and then no need
ip_mc_clear_src() in ip_mc_destroy_dev().On the other hand, we should restore back instead of free all source filter
info in igmpv3_del_delrec(). Or we will not able to restore IGMPv3 source
filter info after NETDEV_UP and NETDEV_POST_TYPE_CHANGE.Fixes: 24cf3af3fed5 ("igmp: call ip_mc_clear_src() only when ...")
Signed-off-by: Hangbin Liu
Signed-off-by: David S. Miller
15 Nov, 2016
4 commits
-
A-MSDU aggregation alters the QoS header after a frame has been
enqueued, so it needs to be ready before enqueue and not overwritten
again afterwardsFixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue")
Signed-off-by: Felix Fietkau
Acked-by: Toke Høiland-Jørgensen
Signed-off-by: Johannes Berg -
The call to ieee80211_txq_enqueue overwrites the vif pointer with the
codel enqueue time, so setting it just before that call makes no sense.Signed-off-by: Felix Fietkau
Acked-by: Toke Høiland-Jørgensen
Signed-off-by: Johannes Berg -
The sequence number counter is used to derive the starting sequence
number. Since that counter is updated on tx dequeue, the A-MPDU flag
needs to be up to date at the tme of dequeue as well.This patch prevents sending more A-MPDU frames after the session has
been terminated and also ensures that aggregation starts right after the
session has been establishedFixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue")
Signed-off-by: Felix Fietkau
Acked-by: Toke Høiland-Jørgensen
Signed-off-by: Johannes Berg -
Some drivers (ath10k) report MCS 9 @ 20MHz, which
technically isn't defined. To get more meaningful value
than 0 out of this however, just extrapolate a bitrate
from ratio of MCS 7 and 9 in channels where it is allowed.Signed-off-by: Thomas Pedersen
[add a comment about it in the code]
Signed-off-by: Johannes Berg