13 Dec, 2016

14 commits

  • Could be useful for debugging memory consumption issues,
    and perhaps power-save as well.

    Signed-off-by: Ben Greear
    Signed-off-by: Johannes Berg

    Ben Greear
     
  • Previously, the kernel sent a NEW_PEER_CANDIDATE event to userland even if
    the peer it found had no room to accept another peer. This caused
    continuous connection attempts.

    Signed-off-by: Masashi Honma
    Signed-off-by: Johannes Berg

    Masashi Honma
     
  • Add the ability for an AP (and associated VLANs) to perform
    multicast-to-unicast conversion for ARP, IPv4 and IPv6 frames
    (possibly within 802.1Q). If enabled, such frames are to be sent
    to each station separately, with the DA replaced by their own
    MAC address rather than the group address.

    Note that this may break certain expectations of the receiver,
    such as the ability to drop unicast IP packets received within
    multicast L2 frames, or the ability to not send ICMP destination
    unreachable messages for packets received in L2 multicast (which
    is required, but the receiver can't tell the difference if this
    new option is enabled.)

    This also doesn't implement the 802.11 DMS (directed multicast
    service).

    Signed-off-by: Michael Braun
    [use true/false, rename label to the correct "multicast",
    use __be16 for ethertype and network order for constants]
    Signed-off-by: Johannes Berg

    Michael Braun
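
The conversion described above can be illustrated with a small Python model. This is only a sketch of the idea, not the mac80211 code: the `Frame` class and the function names are hypothetical, and skipping the sending station is an assumption made for the example.

```python
from dataclasses import dataclass, replace
from typing import List

# EtherTypes named in the commit message (802.1Q is the VLAN tag).
ETH_P_IP = 0x0800
ETH_P_ARP = 0x0806
ETH_P_8021Q = 0x8100
ETH_P_IPV6 = 0x86DD
CONVERTIBLE = {ETH_P_IP, ETH_P_ARP, ETH_P_IPV6}


@dataclass(frozen=True)
class Frame:  # hypothetical model of an Ethernet frame, not a kernel type
    dst: bytes
    src: bytes
    ethertype: int
    inner_ethertype: int = 0  # only meaningful when ethertype is 802.1Q
    payload: bytes = b""


def is_group_address(mac: bytes) -> bool:
    # The least significant bit of the first octet marks a group address.
    return bool(mac[0] & 0x01)


def multicast_to_unicast(frame: Frame, stations: List[bytes]) -> List[Frame]:
    """Return the frames to transmit: one unicast copy per station if eligible."""
    if frame.ethertype == ETH_P_8021Q:
        proto = frame.inner_ethertype  # look inside the VLAN tag
    else:
        proto = frame.ethertype
    if not is_group_address(frame.dst) or proto not in CONVERTIBLE:
        return [frame]  # not eligible: leave the frame untouched
    # Assumption: the frame is not sent back to the station it came from.
    return [replace(frame, dst=sta) for sta in stations if sta != frame.src]
```

Each copy keeps the original payload; only the destination address changes, which is why the receiver can no longer tell that the packet was originally an L2 multicast.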
     
  • Commit 4a733ef1bea7 (mac80211: remove PM-QoS listener) removed all use
    of 'beaconint_us' from ieee80211_recalc_ps() but left the variable
    intact. Compiling with W=1 gives the following warning, fix it.
    net/mac80211/mlme.c: In function ‘ieee80211_recalc_ps’:
    net/mac80211/mlme.c:1481:7: warning: variable ‘beaconint_us’ set but not used [-Wunused-but-set-variable]

    ieee80211_tu_to_usec() has no side effects, so the call is safe to remove.

    Fixes: 4a733ef1bea7 ("mac80211: remove PM-QoS listener")
    Cc: Johannes Berg
    Signed-off-by: Kirtika Ruchandani
    Signed-off-by: Johannes Berg

    Kirtika Ruchandani
     
  • Commit b1bce14a7954 (mac80211: update opmode when adding new station)
    refactored ieee80211_vht_handle_opmode into __ieee80211_vht_handle_opmode
    and ieee80211_vht_handle_opmode leaving a set but unused variable
    (sband) in the former. Compiling with W=1 gives the following warning,
    fix it.

    net/mac80211/vht.c: In function ‘__ieee80211_vht_handle_opmode’:
    net/mac80211/vht.c:424:35: warning: variable ‘sband’ set but not used [-Wunused-but-set-variable]

    Remove 'struct ieee80211_local* local' as well, it was only used to
    set sband.

    This is a harmless warning, and is only being fixed to reduce the
    noise with W=1 in the kernel.

    Fixes: b1bce14a7954 ("mac80211: update opmode when adding new station")
    Cc: Marek Kwaczynski
    Cc: Johannes Berg
    Signed-off-by: Kirtika Ruchandani
    Signed-off-by: Johannes Berg

    Kirtika Ruchandani
     
  • Commit 633e27132625 (mac80211: split sched scan IEs) introduced the
    len variable to keep track of the return value of
    ieee80211_build_preq_ies() but did not use it. Compiling with W=1
    gives the following warning, fix it.

    net/mac80211/scan.c: In function ‘__ieee80211_request_sched_scan_start’:
    net/mac80211/scan.c:1123:9: warning: variable ‘len’ set but not used [-Wunused-but-set-variable]

    This is a harmless warning and is only being fixed to reduce the noise
    with W=1 in the kernel.

    Fixes: 633e27132625 ("mac80211: split sched scan IEs")
    Cc: David Spinadel
    Cc: Alexander Bondar
    Cc: Johannes Berg
    Signed-off-by: Kirtika Ruchandani
    Signed-off-by: Johannes Berg

    Kirtika Ruchandani
     
  • Commit 5bcae31d9 (mac80211: implement multi-vif in-place reservations)
    introduced ieee80211_vif_use_reserved_switch() with a counter variable
    'i' that is set but not used. Compiling with W=1 gives the following
    warning, fix it.
    net/mac80211/chan.c: In function ‘ieee80211_vif_use_reserved_switch’:
    net/mac80211/chan.c:1273:6: warning: variable ‘i’ set but not used [-Wunused-but-set-variable]

    This is a harmless warning, and is only being fixed to reduce the
    noise obtained with W=1 in the kernel.

    Fixes: 5bcae31d9 ("mac80211: implement multi-vif in-place reservations")
    Cc: Michal Kazior
    Cc: Johannes Berg
    Signed-off-by: Kirtika Ruchandani
    Signed-off-by: Johannes Berg

    Kirtika Ruchandani
     
  • Commit 3b17fbf87d5d introduced sta_get_expected_throughput()
    leaving variable 'struct rate_control_ref* ref' set but unused.
    Compiling with W=1 gives the following warning, fix it.

    net/mac80211/sta_info.c: In function ‘sta_set_sinfo’:
    net/mac80211/sta_info.c:2052:27: warning: variable ‘ref’ set but not used [-Wunused-but-set-variable]

    Fixes: 3b17fbf87d5d ("mac80211: mesh: Add support for HW RC implementation")
    Cc: Johannes Berg
    Cc: Maxim Altshul
    Signed-off-by: Kirtika Ruchandani
    Signed-off-by: Johannes Berg

    Kirtika Ruchandani
     
  • Commit f027c2aca0cf introduced 'rates_idx' in
    ieee80211_tx_status_noskb but did not use it. Compiling with W=1
    gives the following warning, fix it.

    mac80211/status.c: In function ‘ieee80211_tx_status_noskb’:
    mac80211/status.c:636:6: warning: variable ‘rates_idx’ set but not used [-Wunused-but-set-variable]

    This is a harmless warning, and is only being fixed to reduce the
    noise generated with W=1.

    Fixes: f027c2aca0cf ("mac80211: add ieee80211_tx_status_noskb")
    Cc: Johannes Berg
    Cc: Felix Fietkau
    Signed-off-by: Kirtika Ruchandani
    Signed-off-by: Johannes Berg

    Kirtika Ruchandani
     
  • Commit 554891e63a29 introduced 'struct ieee80211_rx_status' in
    ieee80211_rx_h_defragment but did not use it. Compiling with W=1
    gives the following warning, fix it.

    net/mac80211/rx.c: In function ‘ieee80211_rx_h_defragment’:
    net/mac80211/rx.c:1911:30: warning: variable ‘status’ set but not used [-Wunused-but-set-variable]

    Fixes: 554891e63a29 ("mac80211: move packet flags into packet")
    Cc: Johannes Berg
    Cc: John W. Linville
    Signed-off-by: Kirtika Ruchandani
    Signed-off-by: Johannes Berg

    Kirtika Ruchandani
     
  • There is no need to prevent toggling multicast_to_unicast while
    interface is already up. This change simplifies reconfiguration
    from hostapd.

    Signed-off-by: Michael Braun
    Signed-off-by: Johannes Berg

    Michael Braun
     
  • The presence of the NL80211_ATTR_SCHED_SCAN_INTERVAL attribute was
    checked in both nl80211_parse_sched_scan() and
    nl80211_parse_sched_scan_plans(), which is redundant, so remove
    one of the checks.

    Signed-off-by: Arend van Spriel
    Signed-off-by: Johannes Berg

    Arend Van Spriel
     
  • The comment on the name indirection suggested an issue, but that turned
    out to be untrue. Digging through older kernel versions showed an issue
    with ipw2x00, but that is no longer the case, so get rid of the name
    indirection.

    Signed-off-by: Arend van Spriel
    Signed-off-by: Johannes Berg

    Arend Van Spriel
     
  • Simplify the two conditions gating the schedule_work() into
    a single one and get rid of the additional exit point from
    the function in doing so.

    Signed-off-by: Johannes Berg

    Johannes Berg
     

11 Dec, 2016

9 commits

  • Dump and reset doesn't work unless cmpxchg64() is used both from packet
    and control plane paths. This approach is going to be slow though.
    Instead, use a percpu seqcount to fetch counters consistently, then
    subtract bytes and packets in case a reset was requested.

    The CPU running the reset code is guaranteed to own these stats
    exclusively. We have to turn the counters into signed 64-bit values,
    though, so that stats updates after a reset do not go wrong on underflow.

    This patch is based on original sketch from Eric Dumazet.

    Fixes: 43da04a593d8 ("netfilter: nf_tables: atomic dump and reset for stateful objects")
    Suggested-by: Eric Dumazet
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: David S. Miller

    Pablo Neira
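
A single-threaded Python model of the scheme may help in following the message. It is a sketch under stated assumptions: `CpuCounter`, `fetch` and `dump_and_reset` are hypothetical names, real seqcounts also need memory barriers, and Python integers do not wrap, so the signed 64-bit aspect is only mimicked by allowing negative values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CpuCounter:  # hypothetical per-CPU slot, not a kernel structure
    seq: int = 0         # even: stable, odd: an update is in progress
    packets: int = 0     # signed: may go negative on the CPU that ran a reset
    byte_count: int = 0

    def add(self, nbytes: int) -> None:
        # Packet path: bump the sequence around the update.
        self.seq += 1
        self.packets += 1
        self.byte_count += nbytes
        self.seq += 1

    def read(self) -> Tuple[int, int]:
        # Reader: retry until a snapshot was taken with a stable, even sequence.
        while True:
            start = self.seq
            snapshot = (self.packets, self.byte_count)
            if start % 2 == 0 and start == self.seq:
                return snapshot


def fetch(cpus: List[CpuCounter]) -> Tuple[int, int]:
    """Sum the per-CPU slots into one (packets, bytes) pair."""
    packets = total = 0
    for cpu in cpus:
        p, b = cpu.read()
        packets += p
        total += b
    return packets, total


def dump_and_reset(cpus: List[CpuCounter], this_cpu: int) -> Tuple[int, int]:
    """Report the totals, then subtract them from the local CPU's slot only."""
    packets, total = fetch(cpus)
    slot = cpus[this_cpu]
    slot.seq += 1
    slot.packets -= packets
    slot.byte_count -= total
    slot.seq += 1
    return packets, total
```

After a reset the local slot can hold a negative value while the sum over all slots is zero, which is the reason the counters have to be signed.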
     
  • Signed-off-by: Asbjoern Sloth Toennesen
    Signed-off-by: David S. Miller

    Asbjørn Sloth Tønnesen
     
  • Move the L2TP_MSG_* definitions to UAPI, as it is part of
    the netlink API.

    Signed-off-by: Asbjoern Sloth Toennesen
    Signed-off-by: David S. Miller

    Asbjørn Sloth Tønnesen
     
  • 802.1D [1] specifies that the bridges must use a short value to age out
    dynamic entries in the Filtering Database for a period, once a topology
    change has been communicated by the root bridge.

    Add a bridge_ageing_time member in the net_bridge structure to store the
    bridge ageing time value configured by the user (ioctl/netlink/sysfs).

    If we are using in-kernel STP, shorten the ageing time value to twice
    the forward delay used by the topology when the topology change flag is
    set. When the flag is cleared, restore the configured ageing time.

    [1] "8.3.5 Notifying topology changes ",
    http://profesores.elo.utfsm.cl/~agv/elo309/doc/802.1D-1998.pdf

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
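
The rule in the third paragraph can be sketched as follows. This is a simplified Python model, not the bridge code: the `Bridge` class and `set_topology_change` are hypothetical, and times are plain seconds rather than jiffies.

```python
from dataclasses import dataclass


@dataclass
class Bridge:  # hypothetical model; field names loosely follow the commit text
    bridge_ageing_time: int        # value configured by the user, in seconds
    forward_delay: int             # forward delay used by the topology, in seconds
    stp_kernel: bool = True        # True when in-kernel STP is in use
    topology_change: bool = False
    ageing_time: int = 0           # value actually used to age out entries

    def __post_init__(self) -> None:
        self.ageing_time = self.bridge_ageing_time


def set_topology_change(br: Bridge, val: bool) -> None:
    """Update the flag and, with in-kernel STP, the effective ageing time."""
    if br.stp_kernel and br.topology_change != val:
        if val:
            br.ageing_time = 2 * br.forward_delay   # shortened while flag is set
        else:
            br.ageing_time = br.bridge_ageing_time  # restore configured value
    br.topology_change = val
```

With a configured ageing time of 300 s and a forward delay of 15 s, entries age out after 30 s while the topology change flag is set.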
     
  • Add a __br_set_topology_change helper to set the topology change value.

    This can be later extended to add actions when the topology change flag
    is set or cleared.

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • The SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME switchdev attr is actually set
    when initializing a bridge port, and when configuring the bridge ageing
    time from ioctl/netlink/sysfs.

    Add a __set_ageing_time helper to offload the ageing time to physical
    switches, and add the SWITCHDEV_F_DEFER flag since it can be called
    under bridge lock.

    Signed-off-by: Vivien Didelot
    Signed-off-by: David S. Miller

    Vivien Didelot
     
  • This patch removes a stray newline that was added
    to socket.c in net-next.

    Signed-off-by: Amit Kushwaha
    Signed-off-by: David S. Miller

    Amit Kushwaha
     
  • netlink_chain is called in ->release(), which is apparently
    a process context, so we don't have to use an atomic notifier
    here.

    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    WANG Cong
     
  • David S. Miller
     

10 Dec, 2016

6 commits

  • It seems attackers can also send UDP packets with no payload at all.

    skb_condense() can still be a win in this case.

    It will be possible to replace the custom code in tcp_add_backlog()
    to get full benefit from skb_condense()

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • …inux/kernel/git/jberg/mac80211-next

    Johannes Berg says:

    ====================
    Three fixes:
    * fix a logic bug introduced by a previous cleanup
    * fix nl80211 attribute confusion (trying to use
    a single attribute for two purposes)
    * fix a long-standing BSS leak that happens when an
    association attempt is abandoned
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • In flood situations, keeping sk_rmem_alloc at a high value
    prevents producers from touching the socket.

    It makes sense to lower sk_rmem_alloc only at the end
    of udp_rmem_release() after the thread draining receive
    queue in udp_recvmsg() finished the writes to sk_forward_alloc.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • If udp_recvmsg() constantly releases sk_rmem_alloc
    for every read packet, it gives producers the opportunity
    to immediately grab spinlocks and desperately
    try adding another packet, causing false sharing.

    We can add a simple heuristic to give the signal
    by batches of ~25 % of the queue capacity.

    This patch considerably increases performance under
    flood by about 50 %, since the thread draining the queue
    is no longer slowed by false sharing.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
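
The batching heuristic can be modelled in a few lines of Python. This is a sketch only: the `Socket` class, the `forward_deficit` field name and the exact threshold test are assumptions for illustration, not a copy of the UDP code.

```python
class Socket:  # hypothetical model of the receive-side accounting
    def __init__(self, rcvbuf: int, rmem_alloc: int) -> None:
        self.rcvbuf = rcvbuf            # queue capacity in bytes
        self.rmem_alloc = rmem_alloc    # memory currently charged to the socket
        self.forward_deficit = 0        # bytes read but not yet released
        self.releases = 0               # how often producers were signalled


def rmem_release(sk: Socket, size: int) -> None:
    """Account for one dequeued packet, releasing memory only in batches."""
    sk.forward_deficit += size
    if sk.forward_deficit < sk.rcvbuf // 4:
        return  # keep rmem_alloc high so producers stay away for now
    sk.rmem_alloc -= sk.forward_deficit
    sk.forward_deficit = 0
    sk.releases += 1
```

Reading ten 400-byte packets from a 4000-byte queue updates `rmem_alloc` three times instead of ten, so the shared counter is written far less often.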
     
  • In UDP RX handler, we currently clear skb->dev before skb
    is added to receive queue, because device pointer is no longer
    available once we exit from RCU section.

    Since this first cache line is always hot, let's reuse this space
    to store skb->truesize and thus avoid a cache line miss at
    udp_recvmsg()/udp_skb_destructor time while receive queue
    spinlock is held.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • The idea of busylocks is to let producers grab an extra spinlock
    to relieve pressure on the receive_queue spinlock shared by the consumer.

    This behavior is requested only once socket receive queue is above
    half occupancy.

    Under flood, this means that only one producer can be in line
    trying to acquire the receive_queue spinlock.

    These busylocks can be allocated per CPU instead of
    per socket (which would consume a cache line per socket).

    This patch considerably improves UDP behavior under stress,
    depending on number of NIC RX queues and/or RPS spread.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

09 Dec, 2016

11 commits

  • When mac80211 abandons an association attempt, it may free
    all the data structures, but inform cfg80211 and userspace
    about it only by sending the deauth frame it received, in
    which case cfg80211 has no link to the BSS struct that was
    used and will not cfg80211_unhold_bss() it.

    Fix this by providing a way to inform cfg80211 of this with
    the BSS entry passed, so that it can clean up properly, and
    use this ability in the appropriate places in mac80211.

    This isn't ideal: some code is more or less duplicated and
    tracing is missing. However, it's a fairly small change and
    it's thus easier to backport - cleanups can come later.

    Cc: stable@vger.kernel.org
    Signed-off-by: Johannes Berg

    Johannes Berg
     
  • NL80211_ATTR_MAC was used to set both the specific BSSID to be scanned
    and the random MAC address to be used when privacy is enabled. When both
    features were enabled, the BSSID and the local MAC address got the
    same value, causing Probe Request frames to go out with an unintended
    DA. Fix this by using a different attribute, NL80211_ATTR_BSSID,
    to set the specific BSSID (which was the more recent addition
    in cfg80211) for a scan.

    Backwards compatibility with old userspace software is maintained to
    some extent by allowing NL80211_ATTR_MAC to be used to set the specific
    BSSID when scanning without enabling random MAC address use.

    Scanning with random source MAC address was introduced by commit
    ad2b26abc157 ("cfg80211: allow drivers to support random MAC addresses
    for scan") and the issue was introduced with the addition of the second
    user for the same attribute in commit 818965d39177 ("cfg80211: Allow a
    scan request for a specific BSSID").

    Fixes: 818965d39177 ("cfg80211: Allow a scan request for a specific BSSID")
    Signed-off-by: Vamsi Krishna
    Signed-off-by: Jouni Malinen
    Signed-off-by: Johannes Berg

    Vamsi Krishna
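
The attribute selection described above, including the backwards-compatibility case, can be sketched as follows. This is a Python model with netlink attributes represented as a plain dictionary; `scan_bssid` is a hypothetical helper, and the wildcard fallback is an assumption about what happens when no BSSID is given.

```python
from typing import Dict

BROADCAST = bytes([0xFF] * 6)  # wildcard BSSID: scan all networks


def scan_bssid(attrs: Dict[str, bytes], random_addr: bool) -> bytes:
    """Pick the BSSID for a scan request from the supplied attributes."""
    if "NL80211_ATTR_BSSID" in attrs:
        # New, unambiguous attribute always wins.
        return attrs["NL80211_ATTR_BSSID"]
    if not random_addr and "NL80211_ATTR_MAC" in attrs:
        # Old userspace: NL80211_ATTR_MAC still means "BSSID" as long as
        # random MAC address use is not enabled.
        return attrs["NL80211_ATTR_MAC"]
    # With random addressing, NL80211_ATTR_MAC is the local address only.
    return BROADCAST
```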
     
  • Arend inadvertently inverted the logic while converting to
    wdev_running(), fix that.

    Fixes: 73c7da3dae1e ("cfg80211: add generic helper to check interface is running")
    Signed-off-by: Johannes Berg

    Johannes Berg
     
  • This patch cleans up the following checkpatch.pl warning:
    WARNING: __aligned(size) is preferred over __attribute__((aligned(size)))

    Signed-off-by: Amit Kushwaha
    Signed-off-by: David S. Miller

    Amit Kushwaha
     
  • …etooth/bluetooth-next

    Johan Hedberg says:

    ====================
    pull request: bluetooth-next 2016-12-08

    I didn't miss your "net-next is closed" email, but it did come as a bit
    of a surprise, and due to time-zone differences I didn't have a chance
    to react to it until now. We would have had a couple of patches in
    bluetooth-next that we'd still have wanted to get to 4.10.

    Out of these the most critical one is the H7/CT2 patch for Bluetooth
    Security Manager Protocol, something that couldn't be published before
    the Bluetooth 5.0 specification went public (yesterday). If these really
    can't go to net-next we'll likely be sending at least this patch through
    bluetooth.git to net.git for rc1 inclusion.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • This patch allows XDP prog to extend/remove the packet
    data at the head (like adding or removing header). It is
    done by adding a new XDP helper bpf_xdp_adjust_head().

    It also renames bpf_helper_changes_skb_data() to
    bpf_helper_changes_pkt_data() to better reflect
    that XDP prog does not work on skb.

    This patch adds one "xdp_adjust_head" bit to bpf_prog for the
    XDP-capable driver to check if the XDP prog requires
    bpf_xdp_adjust_head() support. The driver can then decide
    to error out during XDP_SETUP_PROG.

    Signed-off-by: Martin KaFai Lau
    Acked-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Acked-by: John Fastabend
    Signed-off-by: David S. Miller

    Martin KaFai Lau
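
What moving the packet head means can be shown with a userspace model. This is a Python sketch, not the helper itself: pointers are modelled as integer offsets, `XdpBuff` is a hypothetical class, and the bounds check (stay within the headroom and leave at least an Ethernet header) is an assumption about the helper's behaviour.

```python
from dataclasses import dataclass

ETH_HLEN = 14   # length of an Ethernet header
EINVAL = 22


@dataclass
class XdpBuff:  # hypothetical model; offsets stand in for pointers
    data_hard_start: int  # start of the buffer, including headroom
    data: int             # current start of packet data
    data_end: int         # end of packet data


def xdp_adjust_head(xdp: XdpBuff, offset: int) -> int:
    """Move the packet start: negative offsets grow the head, positive shrink it."""
    data = xdp.data + offset
    if data < xdp.data_hard_start or data > xdp.data_end - ETH_HLEN:
        return -EINVAL  # would leave the buffer or drop below a minimal frame
    xdp.data = data
    return 0
```

A program that pushes a 20-byte encapsulation header would call this with an offset of -20 and then write the new header at the new start of the data.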
     
  • Under UDP flood, many softirq producers try to add packets to
    UDP receive queue, and one user thread is burning one cpu trying
    to dequeue packets as fast as possible.

    Two parts of the per packet cost are :
    - copying payload from kernel space to user space,
    - freeing memory pieces associated with skb.

    If socket is under pressure, softirq handler(s) can try to pull in
    skb->head the payload of the packet if it fits.

    Meaning the softirq handler(s) can free/reuse the page fragment
    immediately, instead of letting udp_recvmsg() do this hundreds of usec
    later, possibly from another node.

    Additional gains :
    - We reduce skb->truesize and thus can store more packets per SO_RCVBUF
    - We avoid cache line misses at copyout() time and consume_skb() time,
    and avoid one put_page() with potential alien freeing on NUMA hosts.

    This comes at the cost of a copy, bounded to available tail room, which
    is usually small. (We might have to fix GRO_MAX_HEAD which looks bigger
    than necessary)

    This patch gave me about 5 % increase in throughput in my tests.

    The skb_condense() helper could probably be used in other contexts.

    Signed-off-by: Eric Dumazet
    Cc: Paolo Abeni
    Signed-off-by: David S. Miller

    Eric Dumazet
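
The trade-off described above (one bounded copy in exchange for freeing the page fragment early and a smaller truesize) can be modelled as follows. This is a Python sketch under loud assumptions: `Skb`, its fields and the `SKB_OVERHEAD` constant are hypothetical, headroom is ignored, and the clone check is assumed for the example.

```python
from dataclasses import dataclass

SKB_OVERHEAD = 576  # placeholder for fixed per-skb bookkeeping cost (assumption)


@dataclass
class Skb:  # hypothetical stand-in for struct sk_buff
    head_size: int        # bytes allocated for the linear head
    linear_len: int       # payload bytes already in the head
    frag_len: int         # payload bytes held in page fragments
    frag_truesize: int    # memory charged for those fragments
    cloned: bool = False

    @property
    def tailroom(self) -> int:
        return self.head_size - self.linear_len

    @property
    def truesize(self) -> int:
        return SKB_OVERHEAD + self.head_size + self.frag_truesize


def condense(skb: Skb) -> bool:
    """Pull fragment payload into the head when it fits; report whether it did."""
    if skb.frag_len == 0:
        return False  # nothing to pull
    if skb.cloned or skb.frag_len > skb.tailroom:
        return False  # cannot modify, or payload does not fit in the tail room
    skb.linear_len += skb.frag_len  # the bounded copy
    skb.frag_len = 0
    skb.frag_truesize = 0           # page fragment can be freed right away
    return True
```

In this model a 100-byte payload sitting in a 4096-byte fragment drops the accounted size from 4928 to 832 bytes, which is why more packets fit in the same SO_RCVBUF.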
     
  • RFS is not commonly used, so add a jump label to avoid some conditionals
    in fast path.

    Signed-off-by: Eric Dumazet
    Cc: Paolo Abeni
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Support matching on ICMP type and code.

    Example usage:

    tc qdisc add dev eth0 ingress

    tc filter add dev eth0 protocol ip parent ffff: flower \
    indev eth0 ip_proto icmp type 8 code 0 action drop

    tc filter add dev eth0 protocol ipv6 parent ffff: flower \
    indev eth0 ip_proto icmpv6 type 128 code 0 action drop

    Signed-off-by: Simon Horman
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Simon Horman
     
  • Allow dissection of ICMP(V6) type and code. This should only occur
    if a packet is ICMP(V6) and the dissector has FLOW_DISSECTOR_KEY_ICMP set.

    There are currently no users of FLOW_DISSECTOR_KEY_ICMP.
    A follow-up patch will allow FLOW_DISSECTOR_KEY_ICMP to be used by
    the flower classifier.

    Signed-off-by: Simon Horman
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Simon Horman
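
As an illustration of what dissecting the ICMP type and code involves: both fields are the first two bytes of the ICMP or ICMPv6 header. The Python sketch below uses a hypothetical `dissect_icmp` helper and a boolean `key_icmp` parameter to stand in for FLOW_DISSECTOR_KEY_ICMP being set.

```python
from typing import Optional, Tuple

IPPROTO_ICMP = 1
IPPROTO_ICMPV6 = 58


def dissect_icmp(ip_proto: int, l4: bytes, key_icmp: bool) -> Optional[Tuple[int, int]]:
    """Return (type, code) for ICMP/ICMPv6 packets when the key is requested."""
    if not key_icmp or ip_proto not in (IPPROTO_ICMP, IPPROTO_ICMPV6):
        return None
    if len(l4) < 2:
        return None  # truncated header
    return l4[0], l4[1]  # byte 0 is the type, byte 1 is the code
```

An IPv4 echo request yields (8, 0) and an ICMPv6 echo request yields (128, 0).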
     
  • Add UAPI to provide set of flags for matching, where the flags
    provided from user-space are mapped to flow-dissector flags.

    The first flag allows matching on whether the packet is an
    IP fragment and corresponds to the FLOW_DIS_IS_FRAGMENT flag.

    Signed-off-by: Or Gerlitz
    Reviewed-by: Paul Blakey
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Or Gerlitz
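
For IPv4, "is an IP fragment" can be decided from the flags and fragment offset field of the header. The Python sketch below shows that test only; `is_ipv4_fragment` is a hypothetical helper, and how the flower classifier maps the user-space flag to FLOW_DIS_IS_FRAGMENT is not modelled.

```python
IP_MF = 0x2000      # "more fragments" bit
IP_OFFSET = 0x1FFF  # fragment offset, in units of 8 bytes


def is_ipv4_fragment(frag_off: int) -> bool:
    """True for any fragment: first, middle or last piece of a datagram."""
    return bool(frag_off & (IP_MF | IP_OFFSET))
```

The "don't fragment" bit (0x4000) alone does not make a packet a fragment, so a value of 0x4000 yields False.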