20 Oct, 2015
1 commit
-
Pull networking fixes from David Miller:
1) Account for extra headroom in ath9k driver, from Felix Fietkau.
2) Fix OOPS in pppoe driver due to incorrect socket state transition,
from Guillaume Nault.3) Kill memory leak in amd-xgbe debugfx, from Geliang Tang.
4) Power management fixes for iwlwifi, from Johannes Berg.
5) Fix races in reqsk_queue_unlink(), from Eric Dumazet.
6) Fix dst_entry usage in ARP replies, from Jiri Benc.
7) Cure OOPSes with SO_GET_FILTER, from Daniel Borkmann.
8) Missing allocation failure check in amd-xgbe, from Tom Lendacky.
9) Various resource allocation/freeing cures in DSA< from Neil
Armstrong.10) A series of bug fixes in the openvswitch conntrack support, from
Joe Stringer.11) Fix two cases (BPF and act_mirred) where we have to clean the sender
cpu stored in the SKB before transmitting. From WANG Cong and
Alexei Starovoitov.12) Disable VLAN filtering in promiscuous mode in mlx5 driver, from
Achiad Shochat.13) Older bnx2x chips cannot do 4-tuple UDP hashing, so prevent this
configuration via ethtool. From Yuval Mintz.14) Don't call rt6_uncached_list_flush_dev() from rt6_ifdown() when
'dev' is NULL, from Eric Biederman.15) Prevent stalled link synchronization in tipc, from Jon Paul Maloy.
16) kcalloc() gstrings ethtool buffer before having driver fill it in,
in order to prevent kernel memory leaking. From Joe Perches.17) Fix mixxing rt6_info initialization for blackhole routes, from
Martin KaFai Lau.18) Kill VLAN regression in via-rhine, from Andrej Ota.
19) Missing pfmemalloc check in sk_add_backlog(), from Eric Dumazet.
20) Fix spurious MSG_TRUNC signalling in netlink dumps, from Ronen Arad.
21) Scrube SKBs when pushing them between namespaces in openvswitch,
from Joe Stringer.22) bcmgenet enables link interrupts too early, fix from Florian
Fainelli.* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits)
net: bcmgenet: Fix early link interrupt enabling
tunnels: Don't require remote endpoint or ID during creation.
openvswitch: Scrub skb between namespaces
xen-netback: correctly check failed allocation
net: asix: add support for the Billionton GUSB2AM-1G-B USB adapter
netlink: Trim skb to alloc size to avoid MSG_TRUNC
net: add pfmemalloc check in sk_add_backlog()
via-rhine: fix VLAN receive handling regression.
ipv6: Initialize rt6_info properly in ip6_blackhole_route()
ipv6: Move common init code for rt6_info to a new function rt6_info_init()
Bluetooth: Fix initializing conn_params in scan phase
Bluetooth: Fix conn_params list update in hci_connect_le_scan_cleanup
Bluetooth: Fix remove_device behavior for explicit connects
Bluetooth: Fix LE reconnection logic
Bluetooth: Fix reference counting for LE-scan based connections
Bluetooth: Fix double scan updates
mlxsw: core: Fix race condition in __mlxsw_emad_transmit
tipc: move fragment importance field to new header position
ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings
tipc: eliminate risk of stalled link synchronization
...
19 Oct, 2015
3 commits
-
If OVS receives a packet from another namespace, then the packet should
be scrubbed. However, people have already begun to rely on the behaviour
that skb->mark is preserved across namespaces, so retain this one field.This is mainly to address information leakage between namespaces when
using OVS internal ports, but by placing it in ovs_vport_receive() it is
more generally applicable, meaning it should not be overlooked if other
port types are allowed to be moved into namespaces in future.Signed-off-by: Joe Stringer
Acked-by: Pravin B Shelar
Acked-by: Thomas Graf
Signed-off-by: David S. Miller -
Johan Hedberg says:
====================
pull request: bluetooth 2015-10-16First of all, sorry for the late set of patches for the 4.3 cycle. We
just finished an intensive week of testing at the Bluetooth UnPlugFest
and discovered (and fixed) issues there. Unfortunately a few issues
affect 4.3-rc5 in a way that they break existing Bluetooth LE mouse and
keyboard support.The regressions result from supporting LE privacy in conjunction with
scanning for Resolvable Private Addresses before connecting. A feature
that has been tested heavily (including automated unit tests), but sadly
some regressions slipped in. The UnPlugFest with its multitude of test
platforms is a good battle testing ground for uncovering every corner
case.The patches in this pull request focus only on fixing the regressions in
4.3-rc5. The patches look a bit larger since we also added comments in
the critical sections of the fixes to improve clarity.I would appreciate if we can get these regression fixes to Linus
quickly. Please let me know if there are any issues pulling. Thanks.
====================Signed-off-by: David S. Miller
-
netlink_dump() allocates skb based on the calculated min_dump_alloc or
a per socket max_recvmsg_len.
min_alloc_size is maximum space required for any single netdev
attributes as calculated by rtnl_calcit().
max_recvmsg_len tracks the user provided buffer to netlink_recvmsg.
It is capped at 16KiB.
The intention is to avoid small allocations and to minimize the number
of calls required to obtain dump information for all net devices.netlink_dump packs as many small messages as could fit within an skb
that was sized for the largest single netdev information. The actual
space available within an skb is larger than what is requested. It could
be much larger and up to near 2x with align to next power of 2 approach.Allowing netlink_dump to use all the space available within the
allocated skb increases the buffer size a user has to provide to avoid
truncaion (i.e. MSG_TRUNG flag set).It was observed that with many VLANs configured on at least one netdev,
a larger buffer of near 64KiB was necessary to avoid "Message truncated"
error in "ip link" or "bridge [-c[ompressvlans]] vlan show" when
min_alloc_size was only little over 32KiB.This patch trims skb to allocated size in order to allow the user to
avoid truncation with more reasonable buffer size.Signed-off-by: Ronen Arad
Signed-off-by: David S. Miller
17 Oct, 2015
1 commit
-
Pull Ceph fixes from Sage Weil:
"Just two small items from Ilya:The first patch fixes the RBD readahead to grab full objects. The
second fixes the write ops to prevent undue promotion when a cache
tier is configured on the server side"* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: use writefull op for object size writes
rbd: set max_sectors explicitly
16 Oct, 2015
10 commits
-
This covers only the simplest case - an object size sized write, but
it's still useful in tiering setups when EC is used for the base tier
as writefull op can be proxied, saving an object promotion.Even though updating ceph_osdc_new_request() to allow writefull should
just be a matter of fixing an assert, I didn't do it because its only
user is cephfs. All other sites were updated.Reflects ceph.git commit 7bfb7f9025a8ee0d2305f49bf0336d2424da5b5b.
Signed-off-by: Ilya Dryomov
Reviewed-by: Alex Elder -
ip6_blackhole_route() does not initialize the newly allocated
rt6_info properly. This patch:
1. Call rt6_info_init() to initialize rt6i_siblings and rt6i_uncached2. The current rt->dst._metrics init code is incorrect:
- 'rt->dst._metrics = ort->dst._metris' is not always safe
- Not sure what dst_copy_metrics() is trying to do here
considering ip6_rt_blackhole_cow_metrics() always returns
NULLFix:
- Always do dst_copy_metrics()
- Replace ip6_rt_blackhole_cow_metrics() with
dst_cow_metrics_generic()3. Mask out the RTF_PCPU bit from the newly allocated blackhole route.
This bug triggers an oops (reported by Phil Sutter) in rt6_get_cookie().
It is because RTF_PCPU is set while rt->dst.from is NULL.Fixes: d52d3997f843 ("ipv6: Create percpu rt6_info")
Signed-off-by: Martin KaFai Lau
Reported-by: Phil Sutter
Tested-by: Phil Sutter
Cc: Hannes Frederic Sowa
Cc: Julian Anastasov
Cc: Phil Sutter
Cc: Steffen Klassert
Signed-off-by: David S. Miller -
Introduce rt6_info_init() to do the common init work for
'struct rt6_info' (after calling dst_alloc).It is a prep work to fix the rt6_info init logic in the
ip6_blackhole_route().Signed-off-by: Martin KaFai Lau
Cc: Hannes Frederic Sowa
Cc: Julian Anastasov
Cc: Phil Sutter
Cc: Steffen Klassert
Signed-off-by: David S. Miller -
This patch makes sure that conn_params that were created just for
explicit_connect, will get properly deleted during cleanup.Signed-off-by: Jakub Pawlowski
Acked-by: Johan Hedberg
Signed-off-by: Marcel Holtmann -
After clearing the params->explicit_connect variable the parameters
may need to be either added back to the right list or potentially left
absent from both the le_reports and the le_conns lists.Signed-off-by: Johan Hedberg
Signed-off-by: Marcel Holtmann -
Devices undergoing an explicit connect should not have their
conn_params struct removed by the mgmt Remove Device command. This
patch fixes the necessary checks in the command handler to correct the
behavior.Signed-off-by: Johan Hedberg
Signed-off-by: Marcel Holtmann -
We can't use hci_explicit_connect_lookup() since that would only cover
explicit connections, leaving normal reconnections completely
untouched. Not using it in turn means leaving out entries in
pend_le_reports.To fix this and simplify the logic move conn params from the reports
list to the pend_le_conns list for the duration of an explicit
connect. Once the connect is complete move the params back to the
pend_le_reports list. This also means that the explicit connect lookup
function only needs to look into the pend_le_conns list.Signed-off-by: Johan Hedberg
Signed-off-by: Marcel Holtmann -
The code should never directly call hci_conn_hash_del since many
cleanup & reference counting updates would be lost. Normally
hci_conn_del is the right thing to do, but in the case of a connection
doing LE scanning this could cause a deadlock due to doing a
cancel_delayed_work_sync() on the same work callback that we were
called from.Connections in the LE scanning state actually need very little cleanup
- just a small subset of hci_conn_del. To solve the issue, refactor
out these essential pieces into a new hci_conn_cleanup() function and
call that from the two necessary places.Signed-off-by: Johan Hedberg
Signed-off-by: Marcel Holtmann -
When disable/enable scan command is issued twice, some controllers
will return an error for the second request, i.e. requests with this
command will fail on some controllers, and succeed on others.This patch makes sure that unnecessary scan disable/enable commands
are not issued.When adding device to the auto connect whitelist when there is pending
connect attempt, there is no need to update scan.hci_connect_le_scan_cleanup is conditionally executing
hci_conn_params_del, that is calling hci_update_background_scan. Make
the other case also update scan, and remove reduntand call from
hci_connect_le_scan_remove.When stopping interleaved discovery the state should be set to stopped
only when both LE scanning and discovery has stopped.Signed-off-by: Jakub Pawlowski
Acked-by: Johan Hedberg
Signed-off-by: Marcel Holtmann -
Pull rdma updates from Doug Ledford:
"We have four batched up patches for the current rc kernel.Two of them are small fixes that are obvious.
One of them is larger than I would like for a late stage rc pull, but
we found an issue in the namespace lookup code related to RoCE and
this works around the issue for now (we allow a lookup with a
namespace to succeed on RoCE since RoCE namespaces aren't implemented
yet). This will go away in 4.4 when we put in support for namespaces
in RoCE devices.The last one is large in terms of lines, but is all legal and no
functional changes. Cisco needed to update their files to be more
specific about their license. They had intended the files to be dual
licensed as GPL/BSD all along, and specified that in their module
license tag, but their file headers were not up to par. They
contacted all of the contributors to get agreement and then submitted
a patch to update the license headers in the files.Summary:
- Work around connection namespace lookup bug related to RoCE
- Change usnic license to Dual GPL/BSD (was intended to be that way
all along, but wasn't clear, permission from contributors was
chased down)- Fix an issue between NFSoRDMA and mlx5 that could cause an oops
- Fix leak of sendonly multicast groups"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/ipoib: For sendonly join free the multicast group on leave
IB/cma: Accept connection without a valid netdev on RoCE
xprtrdma: Don't require LOCAL_DMA_LKEY support for fastreg
usnic: add missing clauses to BSD license
15 Oct, 2015
3 commits
-
In commit e3eea1eb47a ("tipc: clean up handling of message priorities")
we introduced a field in the packet header for keeping track of the
priority of fragments, since this value is not present in the specified
protocol header. Since the value so far only is used at the transmitting
end of the link, we have not yet officially defined it as part of the
protocol.Unfortunately, the field we use for keeping this value, bits 13-15 in
in word 5, has turned out to be a poor choice; it is already used by the
broadcast protocol for carrying the 'network id' field of the sending
node. Since packet fragments also need to be transported across the
broadcast protocol, the risk of conflict is obvious, and we see this
happen when we use network identities larger than 2^13-1. This has
escaped our testing because we have so far only been using small network
id values.We now move this field to bits 0-2 in word 9, a field that is guaranteed
to be unused by all involved protocols.Fixes: e3eea1eb47a ("tipc: clean up handling of message priorities")
Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller -
It seems that kernel memory can leak into userspace by a
kmalloc, ethtool_get_strings, then copy_to_user sequence.Avoid this by using kcalloc to zero fill the copied buffer.
Signed-off-by: Joe Perches
Acked-by: Ben Hutchings
Signed-off-by: David S. Miller -
…kernel/git/jberg/mac80211
Johannes Berg says:
====================
Like last time, we have two small fixes:
* fast-xmit was not doing powersave filter clearing correctly,
disable fast-xmit while any such operations are still pending
* a debugfs file was broken due to some infrastructure changes
====================Signed-off-by: David S. Miller <davem@davemloft.net>
14 Oct, 2015
2 commits
-
In commit 6e498158a827 ("tipc: move link synch and failover to link aggregation level")
we introduced a new mechanism for performing link failover and
synchronization. We have now detected a bug in this mechanism.During link synchronization we use the arrival of any packet on
the tunnel link to trig a check for whether it has reached the
synchronization point or not. This has turned out to be too
permissive, since it may cause an arriving non-last SYNCH packet to
end the synch state, just to see the next SYNCH packet initiate a
new synch state with a new, higher synch point. This is not fatal,
but should be avoided, because it may significantly extend the
synchronization period, while at the same time we are not allowed
to send NACKs if packets are lost. In the worst case, a low-traffic
user may see its traffic stall until a LINK_PROTOCOL state message
trigs the link to leave synchronization state.At the same time, LINK_PROTOCOL packets which happen to have a (non-
valid) sequence number lower than the tunnel link's rcv_nxt value will
be consistently dropped, and will never be able to resolve the situation
described above.We fix this by exempting LINK_PROTOCOL packets from the sequence number
check, as they should be. We also reduce (but don't completely
eliminate) the risk of entering multiple synchronization states by only
allowing the (logically) first SYNCH packet to initiate a synchronization
state. This works independently of actual packet arrival order.Fixes: commit 6e498158a827 ("tipc: move link synch and failover to link aggregation level")
Signed-off-by: Jon Maloy
Acked-by: Ying Xue
Signed-off-by: David S. Miller -
Pull nfsd fixes from Bruce Fields:
"Two nfsd fixes, one for an RDMA crash, one for a pnfs/block protocol
bug"* tag 'nfsd-4.3-2' of git://linux-nfs.org/~bfields/linux:
svcrdma: Fix NFS server crash triggered by 1MB NFS WRITE
nfsd/blocklayout: accept any minlength
13 Oct, 2015
3 commits
-
As originally written rt6_uncached_list_flush_dev makes no sense when
called with dev == NULL as it attempts to flush all uncached routes
regardless of network namespace when dev == NULL. Which is simply
incorrect behavior.Furthermore at the point rt6_ifdown is called with dev == NULL no more
network devices exist in the network namespace so even if the code in
rt6_uncached_list_flush_dev were to attempt something sensible it
would be meaningless.Therefore remove support in rt6_uncached_list_flush_dev for handling
network devices where dev == NULL, and only call rt6_uncached_list_flush_dev
when rt6_ifdown is called with a network device.Fixes: 8d0b94afdca8 ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister")
Signed-off-by: "Eric W. Biederman"
Reviewed-by: Martin KaFai Lau
Tested-by: Martin KaFai Lau
Signed-off-by: David S. Miller -
VLANs 0 and 4095 are reserved and shouldn't be used, add checks to
switchdev similar to the bridge. Also make sure ids above 4095 cannot
be passed either.Fixes: 47f8328bb1a4 ("switchdev: add new switchdev bridge setlink")
Signed-off-by: Nikolay Aleksandrov
Acked-by: Scott Feldman
Signed-off-by: David S. Miller -
Commit 30686bf7f5b3 ("mac80211: convert HW flags to unsigned long
bitmap") accidentally removed the newline delimiter from the hwflags
debugfs file. Fix this by adding back the newline between the HW flags.Cc: stable@vger.kernel.org [4.2]
Signed-off-by: Mohammed Shafi Shajakhan
[fix commit log]
Signed-off-by: Jouni Malinen
Signed-off-by: Johannes Berg
12 Oct, 2015
1 commit
-
Now that the NFS server advertises a maximum payload size of 1MB
for RPC/RDMA again, it crashes in svc_process_common() when NFS
client sends a 1MB NFS WRITE on an NFS/RDMA mount.The server has set up a 259 element array of struct page pointers
in rq_pages[] for each incoming request. The last element of the
array is NULL.When an incoming request has been completely received,
rdma_read_complete() attempts to set the starting page of the
incoming page vector:rqstp->rq_arg.pages = &rqstp->rq_pages[head->hdr_count];
and the page to use for the reply:
rqstp->rq_respages = &rqstp->rq_arg.pages[page_no];
But the value of page_no has already accounted for head->hdr_count.
Thus rq_respages now points past the end of the incoming pages.For NFS WRITE operations smaller than the maximum, this is harmless.
But when the NFS WRITE operation is as large as the server's max
payload size, rq_respages now points at the last entry in rq_pages,
which is NULL.Fixes: cc9a903d915c ('svcrdma: Change maximum server payload . . .')
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=270
Signed-off-by: Chuck Lever
Reviewed-by: Sagi Grimberg
Reviewed-by: Steve Wise
Reviewed-by: Shirley Ma
Signed-off-by: J. Bruce Fields
11 Oct, 2015
3 commits
-
This is a clone of commit 2ab957492d13b ("ip_forward: Drop frames with
attached skb->sk") for ipv6.This commit has exactly the same reasons as the above mentioned commit,
namely to prevent panics during netfilter reload or a misconfigured stack.Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller -
GRE point-to-point interfaces should also support ipv6 multicast. Setting
up default multicast routes on interface creation was forgotten. Add it.Bugzilla:
Cc: Julien Muchembled
Cc: Eric Dumazet
Cc: Nicolas Dumazet
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller -
Similar to commit c0afd9ce4d6a ("fq_codel: fix return value of fq_codel_drop()")
->drop() is supposed to return the number of bytes it dropped,
but hhf_drop () returns the id of the bucket where it drops
a packet from.Cc: Jamal Hadi Salim
Cc: Terry Lam
Signed-off-by: Cong Wang
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
10 Oct, 2015
1 commit
-
Pull nfsd bugfix from Bruce Fields:
"Just one RDMA bugfix"* tag 'nfsd-4.3-1' of git://linux-nfs.org/~bfields/linux:
svcrdma: handle rdma read with a non-zero initial page offset
08 Oct, 2015
2 commits
-
Similar to commit c29390c6dfee ("xps: must clear sender_cpu before forwarding")
the skb->sender_cpu needs to be cleared before xmit.Fixes: 3896d655f4d4 ("bpf: introduce bpf_clone_redirect() helper")
Signed-off-by: Alexei Starovoitov
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller -
Similar to commit c29390c6dfee ("xps: must clear sender_cpu before forwarding")
the skb->sender_cpu needs to be cleared when moving from Rx
Tx, otherwise kernel could crash.Fixes: 2bd82484bb4c ("xps: fix xps for stacked devices")
Cc: Eric Dumazet
Cc: Jamal Hadi Salim
Signed-off-by: Cong Wang
Signed-off-by: Cong Wang
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
07 Oct, 2015
10 commits
-
Previously, the CT_ATTR_FLAGS attribute, when nested under the
OVS_ACTION_ATTR_CT, encoded a 32-bit bitmask of flags that modify the
semantics of the ct action. It's more extensible to just represent each
flag as a nested attribute, and this requires no additional error
checking to reject flags that aren't currently supported.Suggested-by: Ben Pfaff
Signed-off-by: Joe Stringer
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller -
The ct_state field was initially added as an 8-bit field, however six of
the bits are already being used and use cases are already starting to
appear that may push the limits of this field. This patch extends the
field to 32 bits while retaining the internal representation of 8 bits.
This should cover forward compatibility of the ABI for the foreseeable
future.This patch also reorders the OVS_CS_F_* bits to be sequential.
Suggested-by: Jarno Rajahalme
Signed-off-by: Joe Stringer
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller -
Previously, if userspace specified ct_state bits in the flow key which
are currently undefined (and therefore unsupported), then they would be
ignored. This could cause unexpected behaviour in future if userspace is
extended to support additional bits but attempts to communicate with the
current version of the kernel. This patch rectifies the situation by
rejecting such ct_state bits.Fixes: 7f8a436eaa2c "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller -
The ct action uses parts of the flow key, so we need to ensure that it
is valid before executing that action.Fixes: 7f8a436eaa2c "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller -
If ovs_fragment() was unable to fragment the skb due to an L2 header
that exceeds the supported length, skbs would be leaked. Fix the bug.Fixes: 7f8a436eaa2c "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller -
If no switch were found in dsa_setup_dst, return -ENODEV and
exit the dsa_probe cleanly.Signed-off-by: Neil Armstrong
Signed-off-by: David S. Miller -
Now the kfree calls exists in the the remove functions, remove them in all
places except the of_probe functions and replace allocation calls
with their devm_ counterparts.Reviewed-by: Florian Fainelli
Signed-off-by: Neil Armstrong
Signed-off-by: David S. Miller -
When unbinding dsa, complete the dsa_switch_destroy to unregister the
fixed link phy then cleanly unregister and destroy the net devices.Reviewed-by: Florian Fainelli
Signed-off-by: Neil Armstrong
Signed-off-by: David S. Miller -
To prevent memory leakage on unbinding, add missing mdiobus unregister
and unallocation calls.Reviewed-by: Florian Fainelli
Signed-off-by: Neil Armstrong
Signed-off-by: David S. Miller -
To prevent memory leakage on unbinding, add missing kfree calls.
Includes minor cosmetic change to make patch clean.Signed-off-by: Neil Armstrong
Signed-off-by: David S. Miller