30 Sep, 2016

13 commits

  • Add memory limit, usage and overlimit counter to per-PHY 'aqm' debugfs
    file.

    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: Johannes Berg

    Toke Høiland-Jørgensen
     
  • The reusable fairness queueing implementation (fq.h) lacks the memory
    usage limit that the fq_codel qdisc has. This means that small
    devices (e.g. WiFi routers) can run out of memory when flooded with a
    large number of packets. This ports the memory limit feature from
    fq_codel to fq.h.

    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: Johannes Berg

    Toke Høiland-Jørgensen
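    The memory-limit port described above can be sketched in userspace C. The
    struct fields, the drop-the-incoming-packet policy and the counter names
    below are illustrative assumptions, not the actual fq.h code (the kernel
    version trims the fattest flow instead):

```c
#include <assert.h>
#include <stddef.h>

struct fq {
    size_t memory_usage;     /* bytes currently queued across all flows */
    size_t memory_limit;     /* cap ported from fq_codel */
    unsigned int overmemory; /* overlimit counter exposed via debugfs */
};

/* Returns 1 if the packet was accepted, 0 if dropped over the limit. */
static int fq_enqueue(struct fq *fq, size_t pkt_len)
{
    if (fq->memory_usage + pkt_len > fq->memory_limit) {
        fq->overmemory++;
        return 0; /* simplified: the real code drops from the fattest flow */
    }
    fq->memory_usage += pkt_len;
    return 1;
}
```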
     
  • Provide an API to report a NAN function match. mac80211 will look up the
    corresponding cookie and report the match to cfg80211.

    Signed-off-by: Andrei Otcheretianski
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Luca Coelho
    Signed-off-by: Johannes Berg

    Ayala Beker
     
  • Implement add/rm_nan_func functions and handle NAN function
    termination notifications. Handle instance_id allocation for
    NAN functions and implement the reconfig flow.

    Signed-off-by: Andrei Otcheretianski
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Luca Coelho
    Signed-off-by: Johannes Berg

    Ayala Beker
     
  • Implement the nan_change_conf callback, which allows changing the current
    NAN configuration (master preference and dual band operation).
    Store the current NAN configuration in sdata, so it can be used
    both to provide the driver with the updated configuration and the
    changes, and also in hw reconfig flows in next patches.

    Signed-off-by: Andrei Otcheretianski
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Luca Coelho
    Signed-off-by: Johannes Berg

    Ayala Beker
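    The change flow described above can be sketched as a struct plus a
    changed-fields bitmask handed to the driver. All names here are
    hypothetical and only mirror the description, not the actual mac80211
    structures:

```c
#include <assert.h>
#include <stdint.h>

#define NAN_CONF_CHANGED_PREF  (1u << 0)
#define NAN_CONF_CHANGED_BANDS (1u << 1)

struct nan_conf {
    uint8_t master_pref; /* master preference */
    uint8_t bands;       /* dual band operation flags */
};

/* Store the new config (for later hw reconfig flows) and return a bitmask
 * telling the driver which fields actually changed. */
static uint32_t nan_change_conf(struct nan_conf *cur,
                                const struct nan_conf *new_conf)
{
    uint32_t changes = 0;

    if (cur->master_pref != new_conf->master_pref)
        changes |= NAN_CONF_CHANGED_PREF;
    if (cur->bands != new_conf->bands)
        changes |= NAN_CONF_CHANGED_BANDS;
    *cur = *new_conf; /* keep the current config around, as in sdata */
    return changes;
}
```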
     
  • Provide a function that reports NAN DE function termination. The function
    may be terminated for one of the following reasons: user request,
    TTL expiration or failure.
    If the NAN instance is tied to the owner, the notification will be
    sent only to the socket that started the NAN interface.

    Signed-off-by: Andrei Otcheretianski
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Luca Coelho
    Signed-off-by: Johannes Berg

    Ayala Beker
     
  • Provide a function the driver can call to report a match.
    This will send the event to user space.
    If the NAN instance is tied to the owner, the notifications will be
    sent only to the socket that started the NAN interface.

    Signed-off-by: Andrei Otcheretianski
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Luca Coelho
    Signed-off-by: Johannes Berg

    Ayala Beker
     
  • Some NAN configuration parameters may change during the operation of
    the NAN device. For example, a user may want to update the master
    preference value when the device is plugged into or unplugged from
    power. Add an API that allows doing so.

    Signed-off-by: Andrei Otcheretianski
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Luca Coelho
    Signed-off-by: Johannes Berg

    Ayala Beker
     
  • A NAN function can be either publish, subscribe or follow
    up. Perform all the necessary verifications and just pass the
    request to the driver.
    Allow the user space application that starts NAN to
    forbid any other socket from adding or removing functions.

    Signed-off-by: Andrei Otcheretianski
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Ayala Beker
    Signed-off-by: Luca Coelho
    Signed-off-by: Johannes Berg

    Ayala Beker
     
  • This code doesn't do much besides allowing the vif to be started
    and stopped.

    Signed-off-by: Andrei Otcheretianski
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Ayala Beker
    Signed-off-by: Luca Coelho
    Signed-off-by: Johannes Berg

    Ayala Beker
     
  • This allows user space to start/stop a NAN interface.
    A NAN interface is like a P2P device in a few aspects: for
    example, it doesn't have a netdev associated with it.
    Add the new interface type and prevent operations that
    can't be executed on a NAN interface, such as scan.

    Define several attributes that may be configured by user space
    when starting NAN functionality (master preference and dual
    band operation).

    Signed-off-by: Andrei Otcheretianski
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Luca Coelho
    Signed-off-by: Johannes Berg

    Ayala Beker
     
  • Add support for drivers that implement static WEP internally, i.e.
    expose connection keys to the driver in connect flow and don't
    upload the keys after the connection.

    Signed-off-by: David Spinadel
    Signed-off-by: Johannes Berg

    David Spinadel
     
  • The TXQ path restructure requires ieee80211_tx_dequeue() to call TX
    handlers and parts of the xmit_fast path. Move the function to later in
    tx.c in preparation for this.

    Signed-off-by: Toke Høiland-Jørgensen
    Signed-off-by: Johannes Berg

    Toke Høiland-Jørgensen
     

29 Sep, 2016

1 commit

  • Jouni reported that during (repeated) wext_pmf test runs (from the
    wpa_supplicant hwsim test suite) the kernel crashes. The reason is
    that after the key is set, the wext code still unnecessarily stores
    it into the key cache. Despite smatch pointing out an overflow, I
    failed to identify the possibility for this in the code and missed
    it during development of the earlier patch series.

    In order to fix this, simply check that we never store anything but
    WEP keys into the cache, adding a comment as to why that's enough.

    Also, since the cache is still allocated early even if it won't be
    used in many cases, add a comment explaining why - otherwise we'd
    have to roll back key settings to the driver in case of allocation
    failures, which is far more difficult.

    Fixes: 89b706fb28e4 ("cfg80211: reduce connect key caching struct size")
    Reported-by: Jouni Malinen
    Bisected-by: Jouni Malinen
    Signed-off-by: Johannes Berg

    Johannes Berg
     

26 Sep, 2016

2 commits


19 Sep, 2016

12 commits

  • …inux/kernel/git/jberg/mac80211-next

    Johannes Berg says:

    ====================
    This time we have various things - all across the board:
    * MU-MIMO sniffer support in mac80211
    * a create_singlethread_workqueue() cleanup
    * interface dump filtering that was documented but not implemented
    * support for the new radiotap timestamp field
    * send delBA in two unexpected conditions (as required by the spec)
    * connect keys cleanups - allow only WEP with index 0-3
    * per-station aggregation limit to work around broken APs
    * debugfs improvement for the integrated codel algorithm
    and various other small improvements and cleanups.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • A couple of dev_err messages span two lines and the literal
    string is missing a white space between words. Add the white
    space and join the two lines into one.

    Signed-off-by: Colin Ian King
    Acked-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Colin Ian King
     
  • When fq is used on 32bit kernels, we need to lock the qdisc before
    copying 64bit fields.

    Otherwise "tc -s qdisc ..." might report bogus values.

    Fixes: afe4fd062416 ("pkt_sched: fq: Fair Queue packet scheduler")
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
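    The torn-read problem the fix addresses can be sketched in userspace: on a
    32-bit CPU a 64-bit counter spans two words, so the dump path must take the
    qdisc lock (or a seqcount retry loop, shown here) before copying the
    stats. The names below are illustrative, not the kernel's actual
    u64_stats API:

```c
#include <assert.h>
#include <stdint.h>

struct stats {
    unsigned int seq; /* bumped to odd while a writer is mid-update */
    uint64_t bytes;   /* 64-bit field: two separate words on 32-bit CPUs */
};

static void stats_add(struct stats *s, uint64_t n)
{
    s->seq++;        /* odd: update in progress */
    s->bytes += n;
    s->seq++;        /* even: update complete */
}

static uint64_t stats_read(const struct stats *s)
{
    unsigned int start;
    uint64_t v;

    do {
        start = s->seq;
        v = s->bytes;
    } while ((start & 1) || start != s->seq); /* retry on a torn read */
    return v;
}
```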
     
  • Instead of using flow stats per NUMA node, use it per CPU. When using
    megaflows, the stats lock can be a bottleneck in scalability.

    On a E5-2690 12-core system, usual throughput went from ~4Mpps to
    ~15Mpps when forwarding between two 40GbE ports with a single flow
    configured on the datapath.

    This has been tested on a system with possible CPUs 0-7,16-23. After
    module removal, there were no corruption on the slab cache.

    Signed-off-by: Thadeu Lima de Souza Cascardo
    Cc: pravin shelar
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Thadeu Lima de Souza Cascardo
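    The per-CPU scheme described above can be sketched as follows (names and
    the fixed CPU count are assumptions for illustration): each CPU owns its
    own counter, so the hot path needs no shared lock, and readers sum over
    all possible CPUs:

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4

struct flow_stats {
    uint64_t packets[NR_CPUS]; /* one slot per CPU, no shared lock */
};

static void flow_stats_update(struct flow_stats *s, int cpu)
{
    s->packets[cpu]++; /* this CPU is the only writer of its slot */
}

static uint64_t flow_stats_sum(const struct flow_stats *s)
{
    uint64_t total = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        total += s->packets[cpu];
    return total;
}
```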
     
  • On a system with only node 1 as possible, all statistics are going to be
    accounted on node 0, as it will have a single writer.

    However, when getting and clearing the statistics, node 0 is not going
    to be considered, as it's not a possible node.

    Tested that statistics are not zero on a system with only node 1
    possible. Also compile-tested with CONFIG_NUMA off.

    Signed-off-by: Thadeu Lima de Souza Cascardo
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Thadeu Lima de Souza Cascardo
     
  • Xin Long says:

    ====================
    sctp: fix the transmit err process

    This patchset is to improve the transmit err process and also fix some
    issues.

    After this patchset, once the chunks are enqueued successfully, even
    if the chunks fail to send out, whether because of nodst or nomem,
    no err returns back to users any more. Instead, they are taken care
    of by retransmit.

    v1->v2:
    - add more details to the changelog in patch 1/6
    - add Fixes: tag in patch 2/6, 3/6
    - also revert 69b5777f2e57 in patch 3/6
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • As David and Marcelo suggested, an ENOMEM err shouldn't be returned back
    to the user in the transmit path. Instead, sctp's retransmit would take
    care of the chunks that fail to send because of ENOMEM.

    This patch is only to do some release job when alloc_skb fails, not to
    return ENOMEM back any more.

    Besides, it also cleans up sctp_packet_transmit's err path, and fixes
    some issues in err path:

    - It didn't free the head skb in nomem: path.
    - No need to check nskb in no_route: path.
    - It should goto err: path if alloc_skb fails for head.
    - Not all the NOMEMs should free nskb.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • sctp_outq_flush's return value is meaningless now. This patch makes
    sctp_outq_flush return void, as well as sctp_outq_fail
    and sctp_outq_uncork.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • Every time sctp calls sctp_outq_flush, it sends out the chunks of the
    control queue, retransmit queue and data queue. Even if some chunks
    fail to transmit, it still has to flush all the transports, as it's
    the only chance to clean that transmit_list.

    So the latest transmit error here should be returned back. This transmit
    error is an internal error of sctp stack.

    I checked all the places where it uses the transmit error (the return
    value of sctp_outq_flush), most of them are actually just save it to
    sk_err.

    Except for sctp_assoc/endpoint_bh_rcv: they will drop the chunk if
    they fail to send a REPLY, which is actually incorrect, as we can't
    be sure the error that sctp_outq_flush returns is from sending that
    REPLY.

    So it's meaningless for sctp_outq_flush to return error back.

    This patch is to save transmit error to sk_err in sctp_outq_flush, the
    new error can update the old value. Eventually, sctp_wait_for_* would
    check for it.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • Last patch "sctp: do not return the transmit err back to sctp_sendmsg"
    made sctp_primitive_SEND return err only when asoc state is unavailable.
    In this case, chunks are not enqueued, they have no chance to be freed if
    we don't take care of them later.

    This patch is actually to revert commit 1cd4d5c4326a ("sctp: remove the
    unused sctp_datamsg_free()"), commit 69b5777f2e57 ("sctp: hold the chunks
    only after the chunk is enqueued in outq") and commit 8b570dc9f7b6 ("sctp:
    only drop the reference on the datamsg after sending a msg"), to use
    sctp_datamsg_free to free the chunks of current msg.

    Fixes: 8b570dc9f7b6 ("sctp: only drop the reference on the datamsg after sending a msg")
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • Once a chunk is enqueued successfully, the sctp queues can take care of
    it. Even if it fails to transmit (for example because of nomem), it
    should be put into the retransmit queue.

    If sctp reports this error to users, it confuses them; they may resend
    that msg, but the kernel sctp stack is actually already in charge of
    retransmitting it.

    Besides, this error probably is not from the failure of transmitting the
    current msg, but from transmitting or retransmitting another msg's
    chunks, as sctp_outq_flush just tries to send out all transports' chunks.

    This patch is to make sctp_cmd_send_msg return void, and not return the
    transmit err back to sctp_sendmsg.

    Fixes: 8b570dc9f7b6 ("sctp: only drop the reference on the datamsg after sending a msg")
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • Data chunks are only sent by sctp_primitive_SEND, in which sctp checks
    the asoc's state through the statetable before calling sctp_outq_tail. So
    there's no need to check the asoc's state again in sctp_outq_tail.

    Besides, sctp_do_sm is protected by lock_sock; even if sending a msg is
    interrupted by timer events, the events' processing still needs to
    acquire lock_sock first. This means no other commands can be enqueued
    into the side effect list before CMD_SEND_MSG to change asoc->state, so
    it's safe to remove the check.

    This patch is to remove redundant asoc->state check from sctp_outq_tail.

    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

17 Sep, 2016

12 commits

  • Alexei Starovoitov says:

    ====================
    ip_tunnel: add collect_md mode to IPv4/IPv6 tunnels

    Similar to geneve, vxlan, gre tunnels implement 'collect metadata' mode
    in ipip, ipip6, ip6ip6 tunnels.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The test creates 3 namespaces with veths connected via a bridge.
    The first two namespaces simulate two different hosts with the same
    IPv4 and IPv6 addresses configured on the tunnel interface, and they
    communicate with the outside world via standard tunnels.
    The third namespace creates a collect_md tunnel that is driven by a BPF
    program which selects a different remote host (either the first or
    second namespace) based on the tcp dest port number while the tcp dst
    ip is the same.
    This scenario is a rough approximation of a load balancer use case.
    The tests check both traditional tunnel configuration and collect_md mode.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov
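    The selection logic the BPF program performs can be approximated in plain
    C. The port parity rule and the endpoint addresses below are made up for
    illustration; the actual test picks endpoints via its own port mapping:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pick a remote tunnel endpoint based on the TCP destination port while the
 * destination IP stays the same, roughly like the collect_md BPF program
 * described above. Addresses are hypothetical. */
static const char *select_remote(uint16_t tcp_dport)
{
    /* e.g. odd ports go to the first "host", even ports to the second */
    return (tcp_dport % 2) ? "10.1.1.1" : "10.1.1.2";
}
```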
     
  • Extend the existing tests for vxlan, geneve and gre to include the IPIP
    tunnel. It tests both traditional tunnel configuration and dynamic
    configuration via bpf helpers.

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     
  • Similar to gre, vxlan and geneve tunnels, allow IPIP6 and IP6IP6 tunnels
    to operate in 'collect metadata' mode.
    Unlike ipv4 code here it's possible to reuse ip6_tnl_xmit() function
    for both collect_md and traditional tunnels.
    bpf_skb_[gs]et_tunnel_key() helpers and ovs (in the future) are the users.

    Signed-off-by: Alexei Starovoitov
    Acked-by: Thomas Graf
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     
  • Similar to gre, vxlan and geneve tunnels, allow IPIP tunnels to
    operate in 'collect metadata' mode.
    bpf_skb_[gs]et_tunnel_key() helpers can make use of it right away.
    ovs can use it as well in the future (once appropriate ovs-vport
    abstractions and user apis are added).
    Note that just like in other tunnels we cannot cache the dst,
    since tunnel_info metadata can be different for every packet.

    Signed-off-by: Alexei Starovoitov
    Acked-by: Thomas Graf
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     
  • Check for net_device_ops structures that are only stored in the netdev_ops
    field of a net_device structure. This field is declared const, so
    net_device_ops structures that have this property can be declared as const
    also.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @r disable optional_qualifier@
    identifier i;
    position p;
    @@
    static struct net_device_ops i@p = { ... };

    @ok@
    identifier r.i;
    struct net_device e;
    position p;
    @@
    e.netdev_ops = &i@p;

    @bad@
    position p != {r.p,ok.p};
    identifier r.i;
    struct net_device_ops e;
    @@
    e@i@p

    @depends on !bad disable optional_qualifier@
    identifier r.i;
    @@
    static
    +const
    struct net_device_ops i = { ... };
    //

    The result of size on this file before the change is:
    text data bss dec hex filename
    3401 931 44 4376 1118 net/l2tp/l2tp_eth.o

    and after the change it is:
    text data bss dec hex filename
    3993 347 44 4384 1120 net/l2tp/l2tp_eth.o

    Signed-off-by: Julia Lawall
    Signed-off-by: David S. Miller

    Julia Lawall
     
  • Check for net_device_ops structures that are only stored in the netdev_ops
    field of a net_device structure. This field is declared const, so
    net_device_ops structures that have this property can be declared as const
    also.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @r disable optional_qualifier@
    identifier i;
    position p;
    @@
    static struct net_device_ops i@p = { ... };

    @ok@
    identifier r.i;
    struct net_device e;
    position p;
    @@
    e.netdev_ops = &i@p;

    @bad@
    position p != {r.p,ok.p};
    identifier r.i;
    struct net_device_ops e;
    @@
    e@i@p

    @depends on !bad disable optional_qualifier@
    identifier r.i;
    @@
    static
    +const
    struct net_device_ops i = { ... };
    //

    The result of size on this file before the change is:
    text data bss dec hex filename
    21623 1316 40 22979 59c3
    drivers/net/ethernet/synopsys/dwc_eth_qos.o

    and after the change it is:
    text data bss dec hex filename
    22199 724 40 22963 59b3
    drivers/net/ethernet/synopsys/dwc_eth_qos.o

    Signed-off-by: Julia Lawall
    Signed-off-by: David S. Miller

    Julia Lawall
     
  • Check for net_device_ops structures that are only stored in the netdev_ops
    field of a net_device structure. This field is declared const, so
    net_device_ops structures that have this property can be declared as const
    also.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @r disable optional_qualifier@
    identifier i;
    position p;
    @@
    static struct net_device_ops i@p = { ... };

    @ok@
    identifier r.i;
    struct net_device e;
    position p;
    @@
    e.netdev_ops = &i@p;

    @bad@
    position p != {r.p,ok.p};
    identifier r.i;
    struct net_device_ops e;
    @@
    e@i@p

    @depends on !bad disable optional_qualifier@
    identifier r.i;
    @@
    static
    +const
    struct net_device_ops i = { ... };
    //

    The result of size on this file before the change is:

    text data bss dec hex filename
    7995 848 8 8851 2293
    drivers/net/ethernet/hisilicon/hip04_eth.o

    and after the change it is:

    text data bss dec hex filename
    8571 256 8 8835 2283
    drivers/net/ethernet/hisilicon/hip04_eth.o

    Signed-off-by: Julia Lawall
    Signed-off-by: David S. Miller

    Julia Lawall
     
  • (As asked by Dave in February)

    Signed-off-by: Alan Cox
    Signed-off-by: David S. Miller

    Alan Cox
     
  • No longer used after e0d56fdd73422 ("net: l3mdev: remove redundant calls")

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • No longer used after d66f6c0a8f3c0 ("net: ipv4: Remove l3mdev_get_saddr")

    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     
  • With large BDP TCP flows and lossy networks, it is very important
    to keep a low number of skbs in the write queue.

    RACK and SACK processing can perform a linear scan of it.

    We should avoid putting any payload in skb->head, so that SACK
    shifting can be done if needed.

    With this patch, we allow packing ~0.5 MB per skb instead of
    the 64KB initially cooked at tcp_sendmsg() time.

    This reduces the number of skbs in the write queue by a factor of eight.
    tcp_rack_detect_loss() likes this.

    We still allow payload in skb->head for first skb put in the queue,
    to not impact RPC workloads.

    Signed-off-by: Eric Dumazet
    Cc: Yuchung Cheng
    Acked-by: Yuchung Cheng
    Signed-off-by: David S. Miller

    Eric Dumazet
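    The factor-of-eight claim above follows directly from the per-skb payload
    sizes; a quick back-of-the-envelope check (queue size chosen arbitrarily):

```c
#include <assert.h>

/* Number of skbs needed to hold payload_kb of data when each skb can pack
 * per_skb_kb, rounding up. */
static int skbs_needed(int payload_kb, int per_skb_kb)
{
    return (payload_kb + per_skb_kb - 1) / per_skb_kb;
}
```

    For a 4 MB write queue: 4096/64 = 64 skbs before, 4096/512 = 8 skbs
    after, i.e. eight times fewer.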