07 Dec, 2013

16 commits

  • fix some typos

    Acked-by: Neil Horman
    Signed-off-by: Wang Weidong
    Signed-off-by: David S. Miller

    wangweidong
     
  • sctp_peer_needs_update only returns 0 or 1.

    Acked-by: Neil Horman
    Signed-off-by: Wang Weidong
    Signed-off-by: David S. Miller

    wangweidong
     
  • Simplify the code.

    Acked-by: Neil Horman
    Suggested-by: Joe Perches
    Signed-off-by: Wang Weidong
    Signed-off-by: David S. Miller

    wangweidong
     
  • kzalloc already initializes the allocated memory to zero. Therefore,
    remove the explicit initialization with 0 and the memset.

    Acked-by: Neil Horman
    Signed-off-by: Wang Weidong
    Signed-off-by: David S. Miller

    wangweidong
     
  • John W. Linville says:

    ====================
    Please pull this batch of updates intended for the 3.14 stream...

    For the mac80211 bits, Johannes says:

    "I have various improvements/cleanups/fixes all over, but the shortlog
    shows that Luis's regulatory work and mesh work from the cozybit folks
    are the biggest ones, along with the CSA fixes."

    Along with that, we have big batches of updates to brcmfmac, rtlwifi,
    and ath9k. There are updates to wcn36xx, rt2x00, and a handful of
    others as well.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • With the introduction of TCP Small Queues, TSO auto sizing, and TCP
    pacing, we can implement Automatic Corking in the kernel, to help
    applications doing small write()/sendmsg() to TCP sockets.

    The idea is to change tcp_push() to check whether the current skb
    payload is under the optimal skb size (a multiple of MSS bytes).

    If under 'size_goal', and at least one packet is still in Qdisc or
    NIC TX queues, set the TCP Small Queue Throttled bit, so that the push
    will be delayed up to TX completion time.

    This delay might allow the application to coalesce more bytes
    in the skb in following write()/sendmsg()/sendfile() system calls.

    The exact duration of the delay depends on the dynamics
    of the system, and might be zero if no packet for this flow
    is actually held in the Qdisc or NIC TX ring.

    Using FQ/pacing is a way to increase the probability of
    autocorking being triggered.

    Add a new sysctl (/proc/sys/net/ipv4/tcp_autocorking) to control
    this feature and default it to 1 (enabled).

    Add a new SNMP counter: nstat -a | grep TcpExtTCPAutoCorking
    This counter is incremented every time we detect that an skb was
    underused and its flush was deferred.

    Tested:

    Interesting effects when using line buffered commands under ssh.

    Excellent performance results in terms of CPU usage and total throughput.

    lpq83:~# echo 1 >/proc/sys/net/ipv4/tcp_autocorking
    lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128
    9410.39

    Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128':

    35209.439626 task-clock # 2.901 CPUs utilized
    2,294 context-switches # 0.065 K/sec
    101 CPU-migrations # 0.003 K/sec
    4,079 page-faults # 0.116 K/sec
    97,923,241,298 cycles # 2.781 GHz [83.31%]
    51,832,908,236 stalled-cycles-frontend # 52.93% frontend cycles idle [83.30%]
    25,697,986,603 stalled-cycles-backend # 26.24% backend cycles idle [66.70%]
    102,225,978,536 instructions # 1.04 insns per cycle
    # 0.51 stalled cycles per insn [83.38%]
    18,657,696,819 branches # 529.906 M/sec [83.29%]
    91,679,646 branch-misses # 0.49% of all branches [83.40%]

    12.136204899 seconds time elapsed

    lpq83:~# echo 0 >/proc/sys/net/ipv4/tcp_autocorking
    lpq83:~# perf stat ./super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128
    6624.89

    Performance counter stats for './super_netperf 4 -t TCP_STREAM -H lpq84 -- -m 128':
    40045.864494 task-clock # 3.301 CPUs utilized
    171 context-switches # 0.004 K/sec
    53 CPU-migrations # 0.001 K/sec
    4,080 page-faults # 0.102 K/sec
    111,340,458,645 cycles # 2.780 GHz [83.34%]
    61,778,039,277 stalled-cycles-frontend # 55.49% frontend cycles idle [83.31%]
    29,295,522,759 stalled-cycles-backend # 26.31% backend cycles idle [66.67%]
    108,654,349,355 instructions # 0.98 insns per cycle
    # 0.57 stalled cycles per insn [83.34%]
    19,552,170,748 branches # 488.244 M/sec [83.34%]
    157,875,417 branch-misses # 0.81% of all branches [83.34%]

    12.130267788 seconds time elapsed
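
As a rough reading of the two runs above, the throughput and task-clock figures can be compared directly. The Python sketch below only redoes the arithmetic on the numbers quoted in this message; the variable names are illustrative, and the unit of the super_netperf result is not stated here.

```python
# Figures copied from the two perf runs quoted above.
throughput_on = 9410.39        # super_netperf result with tcp_autocorking=1
throughput_off = 6624.89       # super_netperf result with tcp_autocorking=0
task_clock_on = 35209.439626   # task-clock with autocorking enabled
task_clock_off = 40045.864494  # task-clock with autocorking disabled

# Relative throughput gain and relative reduction in CPU time.
throughput_gain = throughput_on / throughput_off - 1.0
cpu_time_saving = 1.0 - task_clock_on / task_clock_off

print(f"throughput gain: {throughput_gain:.1%}")
print(f"task-clock reduction: {cpu_time_saving:.1%}")
```

On these figures, throughput is roughly 42% higher and task-clock roughly 12% lower with autocorking enabled, over nearly identical elapsed times.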

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
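
The deferral rule described in this message can be sketched as a small predicate. This is a simplified Python model under stated assumptions, not the kernel implementation: `should_autocork` and its `packets_queued_below` parameter are hypothetical names, standing in for whatever check the kernel uses to learn that a packet of the flow is still in the Qdisc or NIC TX queues.

```python
def should_autocork(skb_len, size_goal, autocorking_enabled, packets_queued_below):
    """Return True if the push of a partially filled skb should be deferred.

    skb_len: bytes currently in the skb at the tail of the write queue.
    size_goal: the optimal skb size (a multiple of MSS bytes).
    autocorking_enabled: value of the tcp_autocorking sysctl (0 or 1).
    packets_queued_below: hypothetical count of this flow's packets still
        held in the Qdisc or NIC TX queues.
    """
    return (
        bool(autocorking_enabled)
        and skb_len < size_goal
        and packets_queued_below > 0
    )
```

With nothing queued below the socket the predicate is false, which matches the remark that the delay might be zero.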
     
  • The compiler doesn't know that the skb_shinfo(skb) pointer is usually
    constant.

    By using a temporary variable, we help it generate smaller code.

    For example, tcp_init_nondata_skb() is inlined after this patch.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Remove one useless conditional branch:
    napi->skb is NULL, so nothing bad can happen.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • ip6_forward() runs in softirq context, so we can use the SNMP macros
    that assume this.

    Use same indentation for all IP6_INC_STATS_BH() calls.

    Signed-off-by: Eric Dumazet
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Signed-off-by: Duan Jiong
    Signed-off-by: David S. Miller

    Duan Jiong
     
  • Several files refer to an old address for the Free Software Foundation
    in the file header comment. Resolve by replacing the address with
    the URL so that we do not have to keep
    updating the header comments anytime the address changes.

    CC: John Fastabend
    CC: Alex Duyck
    CC: Marcel Holtmann
    CC: Gustavo Padovan
    CC: Johan Hedberg
    CC: Jamal Hadi Salim
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     
  • Several files refer to an old address for the Free Software Foundation
    in the file header comment. Resolve by replacing the address with
    the URL so that we do not have to keep
    updating the header comments anytime the address changes.

    CC: Samuel Ortiz
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     
  • Several files refer to an old address for the Free Software Foundation
    in the file header comment. Resolve by replacing the address with
    the URL so that we do not have to keep
    updating the header comments anytime the address changes.

    CC: netfilter@vger.kernel.org
    CC: Pablo Neira Ayuso
    CC: Patrick McHardy
    CC: Jozsef Kadlecsik
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     
  • Several files refer to an old address for the Free Software Foundation
    in the file header comment. Resolve by replacing the address with
    the URL so that we do not have to keep
    updating the header comments anytime the address changes.

    CC: Paul Moore
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     
  • Several files refer to an old address for the Free Software Foundation
    in the file header comment. Resolve by replacing the address with
    the URL so that we do not have to keep
    updating the header comments anytime the address changes.

    CC: Alexey Kuznetsov
    CC: James Morris
    CC: Hideaki YOSHIFUJI
    CC: Patrick McHardy
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     
  • Several files refer to an old address for the Free Software Foundation
    in the file header comment. Resolve by replacing the address with
    the URL so that we do not have to keep
    updating the header comments anytime the address changes.

    CC: Vlad Yasevich
    CC: Neil Horman
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Jeff Kirsher
     

06 Dec, 2013

3 commits


05 Dec, 2013

1 commit


04 Dec, 2013

1 commit

  • After congestion update on a local connection, when rds_ib_xmit returns
    fewer bytes than there are in the message, rds_send_xmit calls
    back rds_ib_xmit with an offset that causes BUG_ON(off & RDS_FRAG_SIZE)
    to trigger.

    For a 4Kb PAGE_SIZE rds_ib_xmit returns min(8240,4096)=4096 when actually
    the message contains 8240 bytes. rds_send_xmit thinks there is more to send
    and calls rds_ib_xmit again with a data offset "off" of 4096-48(rds header)
    =4048 bytes thus hitting the BUG_ON(off & RDS_FRAG_SIZE) [RDS_FRAG_SIZE=4k].

    The commit 6094628bfd94323fc1cea05ec2c6affd98c18f7f
    "rds: prevent BUG_ON triggering on congestion map updates" introduced
    this regression. That change was addressing the triggering of a different
    BUG_ON in rds_send_xmit() on PowerPC architecture with 64Kbytes PAGE_SIZE:
        BUG_ON(ret != 0 &&
               conn->c_xmit_sg == rm->data.op_nents);
    This was the sequence it was going through:
    (rds_ib_xmit)
        /* Do not send cong updates to IB loopback */
        if (conn->c_loopback
            && rm->m_inc.i_hdr.h_flags & RDS_FLAG_CONG_BITMAP) {
                rds_cong_map_updated(conn->c_fcong, ~(u64) 0);
                return sizeof(struct rds_header) + RDS_CONG_MAP_BYTES;
        }
    rds_ib_xmit returns 8240
    rds_send_xmit:
        c_xmit_data_off = 0 + 8240 - 48 (rds header accounted only the first time)
                        = 8192
        c_xmit_data_off < 65536 (sg->length), so calls rds_ib_xmit again
    rds_ib_xmit returns 8240
    rds_send_xmit:
        c_xmit_data_off = 8192 + 8240 = 16432, calls rds_ib_xmit again
        and so on (c_xmit_data_off 24672,32912,41152,49392,57632)
    rds_ib_xmit returns 8240
    On this iteration this sequence causes the BUG_ON in rds_send_xmit:
        while (ret) {
                tmp = min_t(int, ret, sg->length - conn->c_xmit_data_off);
                [tmp = 65536 - 57632 = 7904]
                conn->c_xmit_data_off += tmp;
                [c_xmit_data_off = 57632 + 7904 = 65536]
                ret -= tmp;
                [ret = 8240 - 7904 = 336]
                if (conn->c_xmit_data_off == sg->length) {
                        conn->c_xmit_data_off = 0;
                        sg++;
                        conn->c_xmit_sg++;
                        BUG_ON(ret != 0 &&
                               conn->c_xmit_sg == rm->data.op_nents);
                        [c_xmit_sg = 1, rm->data.op_nents = 1]

    What the current fix does:
    Since the congestion update over loopback is not actually transmitted
    as a message, all that rds_ib_xmit needs to do is let the caller think
    the full message has been transmitted and not return partial bytes.
    It will return 8240 (RDS_CONG_MAP_BYTES+48) when PAGE_SIZE is 4Kb,
    and 64Kb+48 when PAGE_SIZE is 64Kb.

    Reported-by: Josh Hunt
    Tested-by: Honggang Li
    Acked-by: Bang Nguyen
    Signed-off-by: Venkat Venkatsubra
    Signed-off-by: David S. Miller

    Venkat Venkatsubra
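
The offset walk in this message can be replayed with a short Python model of the accounting loop in rds_send_xmit. This is a sketch under stated assumptions rather than the kernel code: `simulate_send_xmit` is a hypothetical helper, it models only the header and scatterlist bookkeeping quoted above, and the two 4096-byte scatterlist entries used for the 4Kb case are an assumption derived from the 8240-byte message size.

```python
RDS_HEADER_BYTES = 48  # size of the rds header, as stated in the message


def simulate_send_xmit(sg_lengths, bytes_per_xmit):
    """Model the offset accounting in rds_send_xmit for one message.

    sg_lengths: lengths of the message's scatterlist entries.
    bytes_per_xmit: what each rds_ib_xmit call reports as sent.
    Returns (outcome, offsets), where outcome is "sent" or "BUG_ON" and
    offsets lists c_xmit_data_off after each call that left it non-zero.
    """
    hdr_off = 0
    data_off = 0
    xmit_sg = 0
    offsets = []
    while xmit_sg < len(sg_lengths):
        ret = bytes_per_xmit
        # The header is accounted for only on the first call.
        if hdr_off < RDS_HEADER_BYTES:
            tmp = min(ret, RDS_HEADER_BYTES - hdr_off)
            hdr_off += tmp
            ret -= tmp
        while ret:
            tmp = min(ret, sg_lengths[xmit_sg] - data_off)
            data_off += tmp
            ret -= tmp
            if data_off == sg_lengths[xmit_sg]:
                data_off = 0
                xmit_sg += 1
                # BUG_ON(ret != 0 && c_xmit_sg == op_nents)
                if ret != 0 and xmit_sg == len(sg_lengths):
                    return "BUG_ON", offsets
        if data_off:
            offsets.append(data_off)
    return "sent", offsets


# 64Kb page, rds_ib_xmit reporting 8240 bytes per call (the old behaviour).
print(simulate_send_xmit([65536], 8240))
# 64Kb page, rds_ib_xmit reporting the full message (the current fix).
print(simulate_send_xmit([65536], 65536 + RDS_HEADER_BYTES))
```

The first call reproduces the offsets listed in the message and ends in the BUG_ON; the second completes in a single call because the full message length is reported.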
     

03 Dec, 2013

6 commits

  • The behaviour of blackhole and prohibit routes has been corrected by setting
    the input and output pointers of the dst variable appropriately. For
    blackhole routes, both are set to dst_discard; for prohibit routes, they
    are set to ip6_pkt_prohibit and ip6_pkt_prohibit_out respectively.

    ipv6: ip6_pkt_prohibit(_out) should not depend on
    CONFIG_IPV6_MULTIPLE_TABLES

    We need ip6_pkt_prohibit(_out) available without
    CONFIG_IPV6_MULTIPLE_TABLES

    Signed-off-by: Kamala R
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Kamala R
     
  • Signed-off-by: Francois-Xavier Le Bail
    Signed-off-by: David S. Miller

    François-Xavier Le Bail
     
  • When dealing with an RA message, if accept_ra_defrtr is false,
    the kernel will not add the default route, but it still goes on to
    process the route information options that follow. Unfortunately,
    those options may contain a default route, so check
    accept_ra_defrtr before calling rt6_route_rcv.

    Signed-off-by: Duan Jiong
    Signed-off-by: David S. Miller

    Duan Jiong
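
The check described in this message can be sketched as a filter over the route information options. This is a minimal Python model, assuming that a prefix length of 0 marks a default route; the function name and the tuple representation of an option are hypothetical and do not mirror the kernel's data structures.

```python
def route_info_options_to_process(options, accept_ra_defrtr):
    """Filter route information options from a Router Advertisement.

    options: list of (prefix, prefix_len) tuples, a hypothetical
        simplification of the options carried in the RA.
    accept_ra_defrtr: whether default routes learned from RAs are accepted.
    A prefix length of 0 denotes a default route.
    """
    accepted = []
    for prefix, prefix_len in options:
        if prefix_len == 0 and not accept_ra_defrtr:
            # Skip default routes before they would reach rt6_route_rcv.
            continue
        accepted.append((prefix, prefix_len))
    return accepted
```

Only the default-route option is dropped when accept_ra_defrtr is false; more specific prefixes are still passed on.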
     
  • John W. Linville
     
  • John W. Linville
     
  • Pull networking updates from David Miller:
    "Here is a pile of bug fixes that accumulated while I was in Europe"

    1) In fixing kernel leaks to userspace during copying of socket
    addresses, we broke a case that used to work, namely the user
    providing a buffer larger than the in-kernel generic socket address
    structure. This broke Ruby amongst other things. Fix from Dan
    Carpenter.

    2) Fix regression added by byte queue limit support in 8139cp driver,
    from Yang Yingliang.

    3) The addition of MSG_SENDPAGE_NOTLAST buggered up a few sendpage
    implementations, they should just treat it the same as MSG_MORE.
    Fix from Richard Weinberger and Shawn Landden.

    4) Handle icmpv4 errors received on ipv6 SIT tunnels correctly, from
    Oussama Ghorbel. In particular we should send an ICMPv6 unreachable
    in such situations.

    5) Fix some regressions in the recent genetlink fixes, in particular
    get the pmcraid driver to use the new safer interfaces correctly.
    From Johannes Berg.

    6) macvtap was converted to use a per-cpu set of statistics, but some
    code was still bumping tx_dropped elsewhere. From Jason Wang.

    7) Fix build failure of xen-netback due to missing include on some
    architectures, from Andy Whitecroft.

    8) macvtap double counts received packets in statistics, fix from Vlad
    Yasevich.

    9) Fix various cases of using *_STATS_BH() when *_STATS() is more
    appropriate. From Eric Dumazet and Hannes Frederic Sowa.

    10) Pktgen ipsec mode doesn't update the ipv4 header length and checksum
    properly after encapsulation. Fix from Fan Du.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
    net/mlx4_en: Remove selftest TX queues empty condition
    {pktgen, xfrm} Update IPv4 header total len and checksum after tranformation
    virtio_net: make all RX paths handle erors consistently
    virtio_net: fix error handling for mergeable buffers
    virtio_net: Fixed a trivial typo (fitler --> filter)
    netem: fix gemodel loss generator
    netem: fix loss 4 state model
    netem: missing break in ge loss generator
    net/hsr: Support iproute print_opt ('ip -details ...')
    net/hsr: Very small fix of comment style.
    MAINTAINERS: Added net/hsr/ maintainer
    ipv6: fix possible seqlock deadlock in ip6_finish_output2
    ixgbe: Make ixgbe_identify_qsfp_module_generic static
    ixgbe: turn NETIF_F_HW_L2FW_DOFFLOAD off by default
    ixgbe: ixgbe_fwd_ring_down needs to be static
    e1000: fix possible reset_task running after adapter down
    e1000: fix lockdep warning in e1000_reset_task
    e1000: prevent oops when adapter is being closed and reset simultaneously
    igb: Fixed Wake On LAN support
    inet: fix possible seqlock deadlocks
    ...

    Linus Torvalds
     

02 Dec, 2013

13 commits

  • Drivers with hardware rate control were given
    sta->rx_nss set to 0, because the rx_nss
    calculation was guarded by the hw/sw rate
    control check.

    Signed-off-by: Michal Kazior
    Signed-off-by: Johannes Berg

    Michal Kazior
     
  • When external CSA IEs are received (beacons or action messages), a
    channel switch is triggered as well. This should only be allowed on
    devices which actually support channel switches, otherwise disconnect.
    (For the corresponding userspace invocation, the wiphy flag is checked
    in nl80211).

    Signed-off-by: Simon Wunderlich
    Signed-off-by: Johannes Berg

    Simon Wunderlich
     
  • The channel switch announcement code has some major locking problems
    which can cause a deadlock in the worst case. A series of fixes has been
    proposed, but these are non-trivial and need to be tested first.
    Therefore disable CSA completely for 3.13.

    Signed-off-by: Simon Wunderlich
    Signed-off-by: Johannes Berg

    Simon Wunderlich
     
  • Signed-off-by: Simon Wunderlich
    Signed-off-by: Johannes Berg

    Simon Wunderlich
     
  • The current channel switch code has a potential deadlock:
    1) * cfg80211_stop_ap acquires wdev-lock
    * ieee80211_stop_ap calls cancel_work_sync for the csa_finalize_work,
    which acquires the associated worker-lock
    2) * ieee80211_csa_finalize_work holds the worker-lock when run
    * it calls cfg80211_ch_switch_notify which will claim the wdev-lock,
    and also needs to claim the sdata-lock (which is the same as the
    wdev-lock) to modify the beacons.

    It is sufficient to just set the channel switch active flag to false. If
    the worker runs later, it will find that the channel switch is no longer
    active and return immediately without changing anything.

    Canceling the worker is done anyway when the interface goes down
    (ieee80211_do_stop).

    Reported-by: Johannes Berg
    Signed-off-by: Simon Wunderlich
    Signed-off-by: Johannes Berg

    Simon Wunderlich
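
The flag-based approach described in this message can be illustrated with a toy model. This Python sketch is single-threaded and uses no real locks; the class and method names are hypothetical and only mirror the roles of stop_ap and the csa finalize worker.

```python
class ApInterface:
    """Toy model of an AP interface with a pending channel switch."""

    def __init__(self):
        self.csa_active = False
        self.channel = 1
        self.pending_channel = None

    def start_channel_switch(self, new_channel):
        self.csa_active = True
        self.pending_channel = new_channel

    def stop_ap(self):
        # Instead of cancelling the worker synchronously (which would need
        # the worker lock while the wdev lock is held), just clear the flag.
        self.csa_active = False

    def csa_finalize_work(self):
        # A worker that runs late sees the flag cleared and does nothing.
        if not self.csa_active:
            return False
        self.channel = self.pending_channel
        self.csa_active = False
        return True
```

Because stop_ap never waits for the worker, the lock ordering problem listed above cannot arise in this model; a late worker simply becomes a no-op.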
     
  • The channel switch notification should be sent under the
    wdev/sdata-lock, preferably at the same moment as the channel change
    happens, to avoid races with other callers (e.g. start/stop_ap).
    This also adds the previously missing sdata_lock protection in
    csa_finalize_work.

    Reported-by: Johannes Berg
    Signed-off-by: Simon Wunderlich
    Signed-off-by: Johannes Berg

    Simon Wunderlich
     
  • The csa finalize worker needs to change the beacon information (for
    different modes). These are normally protected under rtnl lock, but the
    csa finalize worker is called by drivers and should not acquire the RTNL
    lock. Therefore change access protection for beacons to sdata/wdev lock.

    Reported-by: Johannes Berg
    Signed-off-by: Simon Wunderlich
    [fix sdata_dereference]
    Signed-off-by: Johannes Berg

    Simon Wunderlich
     
  • To avoid race conditions in functions which modify the beacon
    information, lock these using the wdev lock. This is especially required
    to avoid problems for csa handling functions which modify beacons but
    can not be called under rtnl lock.

    Reported-by: Johannes Berg
    Signed-off-by: Simon Wunderlich
    Signed-off-by: Johannes Berg

    Simon Wunderlich
     
  • The local TSF timer is used to compute the timing offset between
    mesh peers on beacon reception. However, asking the device for
    the TSF is not very accurate, so we prefer to use rx->mactime
    if available. In the latter case, calling drv_get_tsf() just
    adds more delay into the RX path, so skip it if we can.

    Signed-off-by: Bob Copeland
    Signed-off-by: Johannes Berg

    Bob Copeland
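
The preference described in this message amounts to a simple fallback. The Python sketch below is a model only: `beacon_rx_time` is a hypothetical name, `None` stands in for "no mactime available", and the callable stands in for drv_get_tsf().

```python
def beacon_rx_time(rx_mactime, read_tsf_from_device):
    """Pick the timestamp used to compute the mesh peer timing offset.

    rx_mactime: timestamp recorded for the received frame, or None if the
        driver did not provide one.
    read_tsf_from_device: callable standing in for drv_get_tsf(); it is
        only invoked when no mactime is available.
    """
    if rx_mactime is not None:
        return rx_mactime
    return read_tsf_from_device()
```

Passing the device read as a callable makes the saving explicit: when a mactime is present, the slower device query is never issued.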
     
  • Change cfg80211 and mac80211 to use the cfg80211_mgmt_tx_params
    struct to aggregate parameters for mgmt_tx functions.
    This makes the functions' signatures less clumsy and makes
    extending the parameters less painful.

    Signed-off-by: Andrei Otcheretianski
    [fix all other drivers]
    Signed-off-by: Johannes Berg

    Andrei Otcheretianski
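
The benefit of aggregating arguments into one struct can be shown with a parameter object. This Python sketch is an analogy only: `MgmtTxParams` and its field names are illustrative assumptions and have not been checked against the members of cfg80211_mgmt_tx_params.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MgmtTxParams:
    """Illustrative stand-in for a struct that bundles mgmt_tx arguments."""
    buf: bytes
    chan: Optional[int] = None       # hypothetical: channel to transmit on
    offchan: bool = False            # hypothetical: allow off-channel TX
    wait: int = 0                    # hypothetical: wait time
    no_cck: bool = False             # hypothetical flag
    dont_wait_for_ack: bool = False  # hypothetical flag


def mgmt_tx(params: MgmtTxParams) -> int:
    """Pretend to transmit a management frame; returns the frame length."""
    return len(params.buf)
```

Adding a new field with a default value leaves every existing caller of `mgmt_tx` untouched, which is the kind of extension the message refers to.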
     
  • There's a bug in tracking HT opmode changes in mac80211: it
    fails to update the driver when the channel parameters don't
    change.

    Move the code to do the HT opmode checking independently of
    the channel/bandwidth tracking.

    Signed-off-by: Avri Altman
    [edit commit message]
    Signed-off-by: Johannes Berg

    Avri Altman
     
  • All interface types now properly clean up their stations
    using some form of sta_info_flush() themselves, so there's
    no need to try it again at teardown. Remove the call to
    get rid of the extra delay from the synchronize_net() and
    rcu_barrier() calls.

    Reported-by: Moshe Benji
    Signed-off-by: Johannes Berg

    Johannes Berg
     
  • Measure TX latency and jitter statistics per station per TID.
    These measurements are disabled by default and can be enabled
    via debugfs.

    Features included for each station's TID:

    1. Keep count of the maximum and average latency of Tx frames.
    2. Keep track of how many frames arrived in a specific time range
    (this must be enabled through debugfs, and the bin ranges configured)

    Signed-off-by: Matti Gottlieb
    Signed-off-by: Johannes Berg

    Matti Gottlieb
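
The two features listed in this message (maximum/average latency and per-range frame counts) can be sketched as a small accumulator. This is a toy Python model, not the mac80211 data structure: the class name, the millisecond unit, and the convention that each bin is identified by its upper bound are all assumptions.

```python
import bisect


class TidLatencyStats:
    """Per-station, per-TID TX latency statistics (toy model)."""

    def __init__(self, bin_upper_bounds_ms=()):
        # Sorted upper bounds of the bins; one extra bin catches the rest.
        self.bounds = sorted(bin_upper_bounds_ms)
        self.bins = [0] * (len(self.bounds) + 1)
        self.max_ms = 0
        self.total_ms = 0
        self.count = 0

    def record(self, latency_ms):
        self.max_ms = max(self.max_ms, latency_ms)
        self.total_ms += latency_ms
        self.count += 1
        # Index of the first bin whose upper bound is >= latency_ms.
        self.bins[bisect.bisect_left(self.bounds, latency_ms)] += 1

    def average_ms(self):
        return self.total_ms / self.count if self.count else 0.0
```

Keeping a running total and count gives the average without storing individual samples, and the bin lookup is a binary search over the configured bounds.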