22 Nov, 2016
28 commits
-
The statistics unit on the mv88e6390 needs the histogram mode to be
configured in a different register compared to other devices. Add an
ops to do this.Signed-off-by: Andrew Lunn
v2:
Rename to mv88e6390_g1_stats_set_histogram
Move into global1.c
Signed-off-by: David S. Miller -
The MV88E6390 has a control register for what the histogram statistics
actually contain. This means the stat_snapshot method should not set
this information. So implement the 6390 stats_snapshot function without
these bits.Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller -
Knowing the family of device belongs to helps with picking the ops
implementation which is appropriate to the device. So add a comment to
each structure of ops.Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller -
Taking a stats snapshot differs between same families. Abstract this
into an ops member. At the same time, move the code into global1.[ch],
since the registers are in the global1 range.Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller -
With the devices added to the tables, the probe will recognize the
switch. This however is not sufficient to make it work properly, other
changes are needed because of incompatibilities.Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller -
_mv88e6xxx_stats_wait() did not check the return value from
mv88e6xxx_g1_read(), so the compiler complained about set but unused
err.Signed-off-by: Andrew Lunn
Reviewed-by: Vivien Didelot
Signed-off-by: David S. Miller -
The switch needs to be taken out of reset before we can read its ID
register on the MDIO bus.Signed-off-by: Andrew Lunn
Reviewed-by: Vivien Didelot
Signed-off-by: David S. Miller -
Declare the structure ieee802154_ops as const as it is only passed as an
argument to the function ieee802154_alloc_hw. This argument is of type
const struct ieee802154_ops *, so ieee80254_ops structures having this
property can be declared as const.
Done using Coccinelle:@r1 disable optional_qualifier @
identifier i;
position p;
@@
static struct ieee802154_ops i@p = {...};@ok1@
identifier r1.i;
position p;
expression e1;
@@
ieee802154_alloc_hw(e1,&i@p)@bad@
position p!={r1.p,ok1.p};
identifier r1.i;
@@
i@p@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
static
+const
struct ieee802154_ops i={...};@depends on !bad disable optional_qualifier@
identifier r1.i;
@@
+const
struct ieee802154_ops i;The before and after size details of the affected files are:
text data bss dec hex filename
8669 1176 16 9861 2685 drivers/net/ieee802154/adf7242.o
8805 1048 16 9869 268d drivers/net/ieee802154/adf7242.otext data bss dec hex filename
7211 2296 32 9539 2543 drivers/net/ieee802154/atusb.o
7339 2160 32 9531 253b drivers/net/ieee802154/atusb.oSigned-off-by: Bhumika Goyal
Acked-by: Stefan Schmidt
Signed-off-by: David S. Miller -
Pravin B Shelar says:
====================
geneve: Use LWT more effectively.Following patch series make use of geneve LWT code path for
geneve netdev type of device.
This allows us to simplify geneve module without changing any
functionality.v2-v3:
Rebase against latest net-next.v1-v2:
Fix warning reported by kbuild test robot.
====================Signed-off-by: David S. Miller
-
Rather than comparing 64-bit tunnel-id, compare tunnel vni
which is 24-bit id. This also save conversion from vni
to tunnel id on each tunnel packet receive.Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller -
Geneve already has check for device socket in route
lookup function. So no need to check it in xmit
function.Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller -
There are minimal difference in building Geneve header
between ipv4 and ipv6 geneve tunnels. Following patch
refactors code to unify it.Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller -
Current geneve implementation has two separate cases to handle.
1. netdev xmit
2. LWT xmit.In case of netdev, geneve configuration is stored in various
struct geneve_dev members. For example geneve_addr, ttl, tos,
label, flags, dst_cache, etc. For LWT ip_tunnel_info is passed
to the device in ip_tunnel_info.Following patch uses ip_tunnel_info struct to store almost all
of configuration of a geneve netdevice. This allows us to unify
most of geneve driver code around ip_tunnel_info struct.
This dramatically simplify geneve code, since it does not
need to handle two different configuration cases. Removes
duplicate code, single code path can handle either type
of geneve devices.Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller -
Florian Westphal says:
====================
tcp: make undo_cwnd mandatory for congestion moduleshighspeed, illinois, scalable, veno and yeah congestion control algorithms
don't provide a 'cwnd_undo' function. This makes the stack default to a
'reno undo' which doubles cwnd. However, the ssthresh implementation of
these algorithms do not halve the slowstart threshold. This causes similar
issue as the one fixed for dctcp in ce6dd23329b1e ("dctcp: avoid bogus
doubling of cwnd after loss").In light of this it seems better to remove the fallback and make undo_cwnd
mandatory.First patch fixes those spots where reno undo seems incorrect by providing
.cwnd_undo functions, second patch removes the fallback.
====================Signed-off-by: David S. Miller
-
The undo_cwnd fallback in the stack doubles cwnd based on ssthresh,
which un-does reno halving behaviour.It seems more appropriate to let congctl algorithms pair .ssthresh
and .undo_cwnd properly. Add a 'tcp_reno_undo_cwnd' function and wire it
up for all congestion algorithms that used to rely on the fallback.Cc: Eric Dumazet
Cc: Yuchung Cheng
Cc: Neal Cardwell
Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller -
congestion control algorithms that do not halve cwnd in their .ssthresh
should provide a .cwnd_undo rather than rely on current fallback which
assumes reno halving (and thus doubles the cwnd).All of these do 'something else' in their .ssthresh implementation, thus
store the cwnd on loss and provide .undo_cwnd to restore it again.A followup patch will remove the fallback and all algorithms will
need to provide a .cwnd_undo function.Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller -
Nikolay Aleksandrov says:
====================
bridge: add support for IGMPv3 and MLDv2 querierThis patch-set adds support for IGMPv3 and MLDv2 querier in the bridge.
Two new options which can be toggled via netlink and sysfs are added that
control the version per-bridge:
multicast_igmp_version - default 2, can be set to 3
multicast_mld_version - default 1, can be set to 2 (this option is
disabled if CONFIG_IPV6=n)Note that the names do not include "querier", I think that these options
can be re-used later as more IGMPv3 support is added to the bridge so we
can avoid adding more options to switch between v2 and v3 behaviour.The set uses the already existing br_ip{4,6}_multicast_alloc_query
functions and adds the appropriate header based on the chosen version.For the initial support I have removed the compatibility implementation
(RFC3376 sec 7.3.1, 7.3.2; RFC3810 sec 8.3.1, 8.3.2), because there are
some details that we need to sort out.
====================Signed-off-by: David S. Miller
-
This patch adds basic support for MLDv2 queries, the default is MLDv1
as before. A new multicast option - multicast_mld_version, adds the
ability to change it between 1 and 2 via netlink and sysfs.
The MLD option is disabled if CONFIG_IPV6 is disabled.Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller -
This patch adds basic support for IGMPv3 queries, the default is IGMPv2
as before. A new multicast option - multicast_igmp_version, adds the
ability to change it between 2 and 3 via netlink and sysfs. The option
struct member is in a 4 byte hole in net_bridge.There also a few minor style adjustments in br_multicast_new_group and
br_multicast_add_group.Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller -
The function macvlan_forward_source_one has already checked the flag
IFF_UP, so needn't check it outside in macvlan_forward_source too.Signed-off-by: Gao Feng
Signed-off-by: David S. Miller -
While stressing a 40Gbit mlx4 NIC with busy polling, I found false
sharing in mlx4 driver that can be easily avoided.This patch brings an additional 7 % performance improvement in UDP_RR
workload.1) If we received no frame during one mlx4_en_process_rx_cq()
invocation, no need to call mlx4_cq_set_ci() and/or dirty ring->cons2) Do not refill rx buffers if we have plenty of them.
This avoids false sharing and allows some bulk/batch optimizations.
Page allocator and its locks will thank us.Finally, mlx4_en_poll_rx_cq() should not return 0 if it determined
cpu handling NIC IRQ should be changed. We should return budget-1
instead, to not fool net_rx_action() and its netdev_budget.v2: keep AVG_PERF_COUNTER(... polled) even if polled is 0
Signed-off-by: Eric Dumazet
Cc: Tariq Toukan
Reviewed-by: Tariq Toukan
Signed-off-by: David S. Miller -
barrier() is a big hammer compared to READ_ONCE(),
and requires comments explaining what is protected.READ_ONCE() is more precise and compiler should generate
better overall code.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
UDP_SKB_CB(skb)->partial_cov is located at offset 66 in skb,
requesting a cold cache line being read in cpu cache.We can avoid this cache line miss for UDP sockets,
as partial_cov has a meaning only for UDPLite.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Daniel Borkmann says:
====================
Couple of BPF refcount fixes for mlx5Various mlx5 bugs on eBPF refcount handling found during review.
Last patch in series adds a __must_check to BPF helpers to make
sure we won't run into it again w/o compiler complaining first.v2 -> v3:
- Just reworked patch 2/4 so we don't need bpf_prog_sub().
- Rebased, rest as is.v1 -> v2:
- After discussion with Alexei, we agreed upon rebasing the
patches against net-next.
- Since net-next, I've also added the __must_check to enforce
future users to check for errors.
- Fixed up commit message #2.
- Simplify assignment from patch #1 based on Saeed's feedback
on previous set.
====================Signed-off-by: David S. Miller
-
Helpers like bpf_prog_add(), bpf_prog_inc(), bpf_map_inc() can fail
with an error, so make sure the caller properly checks their return
value and not just ignores it, which could worst-case lead to use
after free.Signed-off-by: Daniel Borkmann
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller -
mlx5e_xdp_set() is currently the only place where we drop reference on the
prog sitting in priv->xdp_prog when it's exchanged by a new one. We also
need to make sure that we eventually release that reference, for example,
in case the netdev is dismantled, otherwise we leak the program.Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann
Acked-by: Saeed Mahameed
Signed-off-by: David S. Miller -
There are multiple issues in mlx5e_xdp_set():
1) The batched bpf_prog_add() is currently not checked for errors. When
doing so, it should be done at an earlier point in time to makes sure
that we cannot fail anymore at the time we want to set the program for
each channel. The batched refs short-cut can only be performed when we
don't need to perform a reset for changing the rq type and the device
was in opened state. In case the device was not in opened state, then
the next mlx5e_open_locked() will aquire the refs from the control prog
via mlx5e_create_rq(), same when we need to perform a reset.2) When swapping the priv->xdp_prog, then no extra reference count must be
taken since we got that from call path via dev_change_xdp_fd() already.
Otherwise, we'd never be able to release the program. Also, bpf_prog_add()
without checking the return code could fail.Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann
Acked-by: Saeed Mahameed
Signed-off-by: David S. Miller -
In mlx5e_create_rq(), when creating a new queue, we call bpf_prog_add() but
without checking the return value. bpf_prog_add() can fail since 92117d8443bc
("bpf: fix refcnt overflow"), so we really must check it. Take the reference
right when we assign it to the rq from priv->xdp_prog, and just drop the
reference on error path. Destruction in mlx5e_destroy_rq() looks good, though.Fixes: 86994156c736 ("net/mlx5e: XDP fast RX drop bpf programs support")
Signed-off-by: Daniel Borkmann
Acked-by: Saeed Mahameed
Signed-off-by: David S. Miller
21 Nov, 2016
7 commits
-
Andrew Lunn says:
====================
Fixes for the MV88e6xxx interrupt codeThe interrupt code was never tested with a board who's probing
resulted in an -EPROBE_DEFFERED. So the clean up paths never got
tested. I now do have -EPROBE_DEFFERED, and things break badly during
cleanup. These are the fixes.This is fixing code in net-next.
v2:
Fix typo pointed out by David Miller
====================Signed-off-by: David S. Miller
-
Freeing interrupts requires switch register access to mask the
interrupts. Hence we must hold the register mutex.Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller -
It is not possible to use devm_request_threaded_irq() because we have
two stacked interrupt controllers in one device. The lower interrupt
controller cannot be removed until the upper is fully removed. This
happens too late with the devm API, resulting in error messages about
removing a domain while there is still an active interrupt. Swap to
using request_threaded_irq() and manage the release of the interrupt
manually.Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller -
On error, remask the interrupts, release all maps, and remove the
domain. This cannot be done using the mv88e6xxx_g1_irq_free() because
some of these actions are not idempotent.Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller -
Fix the g1 interrupt free code such that is masks any further
interrupts, and then releases the interrupt.Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller -
Trying to remove an IRQ domain that was not created results in an
Opps. Add the necessary checks that the irqs were created before
freeing them.Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller -
Simple typos, s/g2/g1/
Signed-off-by: Andrew Lunn
Signed-off-by: David S. Miller
20 Nov, 2016
5 commits
-
1) cast to "int" is unnecessary:
u8 will be promoted to int before decrementing,
small positive numbers fit into "int", so their values won't be changed
during promotion.Once everything is int including loop counters, signedness doesn't
matter: 32-bit operations will stay 32-bit operations.But! Someone tried to make this loop smart by making everything of
the same type apparently in an attempt to optimise it.
Do the optimization, just differently.
Do the cast where it matters. :^)2) frag size is unsigned entity and sum of fragments sizes is also
unsigned.Make everything unsigned, leave no MOVSX instruction behind.
add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-4 (-4)
function old new delta
skb_cow_data 835 834 -1
ip_do_fragment 2549 2548 -1
ip6_fragment 3130 3128 -2
Total: Before=154865032, After=154865028, chg -0.00%Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller -
Length of a netlink attribute may be u16 but lengths of basic attributes
are much smaller, so small we can save 16 bytes of .rodata and pocket
change inside .text.16-bit is worse on x86-64 than 8-bit because of operand size override prefix.
add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-19 (-19)
function old new delta
validate_nla 418 417 -1
nla_policy_len 66 64 -2
nla_attr_minlen 32 16 -16
Total: Before=154865051, After=154865032, chg -0.00%Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller -
->nla_len is unsigned entity (it's length after all) and u16,
thus it can't overflow when being aligned into int/unsigned int.(nlmsg_next has the same code, but I didn't yet convince myself
it is correct to do so).There is pointer arithmetic in this function and offset being
unsigned is better:add/remove: 0/0 grow/shrink: 1/64 up/down: 5/-309 (-304)
function old new delta
nl80211_set_wiphy 1444 1449 +5
team_nl_cmd_options_set 997 995 -2
tcf_em_tree_validate 872 870 -2
switchdev_port_bridge_setlink 352 350 -2
switchdev_port_br_afspec 312 310 -2
rtm_to_fib_config 428 426 -2
qla4xxx_sysfs_ddb_set_param 2193 2191 -2
qla4xxx_iface_set_param 4470 4468 -2
ovs_nla_free_flow_actions 152 150 -2
output_userspace 518 516 -2
...
nl80211_set_reg 654 649 -5
validate_scan_freqs 148 142 -6
validate_linkmsg 288 282 -6
nl80211_parse_connkeys 489 483 -6
nlattr_set 231 224 -7
nf_tables_delsetelem 267 260 -7
do_setlink 3416 3408 -8
netlbl_cipsov4_add_std 1672 1659 -13
nl80211_parse_sched_scan 2902 2888 -14
nl80211_trigger_scan 1738 1720 -18
do_execute_actions 2821 2738 -83
Total: Before=154865355, After=154865051, chg -0.00%Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller -
size_t is way too much for an integer not exceeding 64.
Space savings: 10 bytes!
add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-10 (-10)
function old new delta
napi_consume_skb 165 163 -2
__kfree_skb_flush 56 53 -3
__kfree_skb_defer 97 92 -5
Total: Before=154865639, After=154865629, chg -0.00%Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller -
Simon Wunderlich says:
====================
This feature patchset includes the following changes:- 6 patches adding functionality to detect a WiFi interface under
other virtual interfaces, like VLANs. They introduce a cache for
the detected the WiFi configuration to avoid RTNL locking in
critical sections. Patches have been prepared by Marek Lindner
and Sven Eckelmann- Enable automatic module loading for genl requests, by Sven Eckelmann
- Fix a potential race condition on interface removal. This is not
happening very often in practice, but requires bigger changes to fix,
so we are sending this to net-next. By Linus Luessing
====================Signed-off-by: David S. Miller