18 Oct, 2014

6 commits

  • pskb_may_pull() may change skb->data and make the eth pointer obsolete,
    so eth needs to be reloaded.

    Fixes: 91269e390d062 ("vxlan: using pskb_may_pull as early as possible")
    Cc: Eric Dumazet
    Signed-off-by: Li RongQing
    Signed-off-by: David S. Miller

    Li RongQing
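
    The hazard in this commit can be sketched in userspace. This is only an
    illustration under stated assumptions: struct fake_skb and fake_may_pull()
    are hypothetical stand-ins, not kernel APIs, and the stand-in zero-fills
    the new space where the real helper pulls data from fragments. The point
    is that a header pointer must be derived after the pull, because the pull
    may move the buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for an skb: only the linear data area. */
struct fake_skb {
    unsigned char *data;
    size_t len;                 /* bytes currently valid at data */
};

/*
 * Stand-in for pskb_may_pull(): guarantees at least `need` bytes at
 * skb->data, possibly by moving the buffer. Returns 1 on success, 0 on
 * allocation failure.
 */
static int fake_may_pull(struct fake_skb *skb, size_t need)
{
    unsigned char *bigger;

    if (need <= skb->len)
        return 1;
    bigger = malloc(need);
    if (!bigger)
        return 0;
    memcpy(bigger, skb->data, skb->len);
    memset(bigger + skb->len, 0, need - skb->len);
    free(skb->data);            /* pointers taken before this now dangle */
    skb->data = bigger;
    skb->len = need;
    return 1;
}

/* Reads the first header byte safely: the pointer is taken after the pull. */
static int first_byte_after_pull(struct fake_skb *skb, size_t need)
{
    const unsigned char *hdr;

    if (!fake_may_pull(skb, need))
        return -1;
    hdr = skb->data;            /* reload; never reuse an earlier pointer */
    return hdr[0];
}
```

    A caller that had cached a pointer into skb->data before calling
    fake_may_pull() would be reading freed memory afterwards, which is the
    class of bug this commit addresses.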
     
  • pskb_may_pull(), called by arphdr_ok(), can change skb->data, so move the arp
    assignment after arphdr_ok() to avoid using freed memory.

    Fixes: 0714812134d7d ("openvswitch: Eliminate memset() from flow_extract.")
    Cc: Jesse Gross
    Cc: Eric Dumazet
    Signed-off-by: Li RongQing
    Acked-by: Jesse Gross
    Signed-off-by: David S. Miller

    Li RongQing
     
  • ip_setup_cork(), called inside ip_append_data(), steals the dst entry from rt for the cork,
    and if __ip_append_data() then fails, nobody frees the stolen dst entry.

    Fixes: 2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()")
    Signed-off-by: Vasily Averin
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Vasily Averin
     
  • We can retrieve opt from skb, no need to pass it as a parameter.
    And opt should always be non-NULL, no need to check.

    Cc: Krzysztof Kolasa
    Cc: Eric Dumazet
    Tested-by: Krzysztof Kolasa
    Signed-off-by: Cong Wang
    Signed-off-by: Cong Wang
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Cong Wang
     
  • cookie_v4_check() allocates ip_options_rcu in the same way
    as tcp_v4_save_options(), so we can just make it a helper function.

    Cc: Krzysztof Kolasa
    Cc: Eric Dumazet
    Signed-off-by: Cong Wang
    Signed-off-by: Cong Wang
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Cong Wang
     
  • commit 971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
    missed that cookie_v4_check() still calls ip_options_echo() which uses
    IPCB(). It should use TCPCB() at TCP layer, so call __ip_options_echo()
    instead.

    Fixes: commit 971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
    Cc: Krzysztof Kolasa
    Cc: Eric Dumazet
    Reported-by: Krzysztof Kolasa
    Tested-by: Krzysztof Kolasa
    Signed-off-by: Cong Wang
    Signed-off-by: Cong Wang
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Cong Wang
     

17 Oct, 2014

3 commits

  • This simplifies the lanai.c driver by using
    the module_pci_driver() macro, at the expense
    of losing only debugging messages.

    Signed-off-by: Michael Opdenacker
    Signed-off-by: David S. Miller

    Michael Opdenacker
     
  • Avoid confusion between pid and portid.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • Jeff Kirsher says:

    ====================
    Intel Wired LAN Driver Updates 2014-10-16

    This series contains updates to fm10k and ixgbe.

    Matthew provides two fixes for fm10k, first sets the flag to fetch the
    host state before kicking off the service task that reads the host
    state when bringing the interface up. The second makes sure that we
    release the mailbox lock after detecting an error and before we return
    the error code.

    Andy Zhou provides a compile fix for fm10k, when the driver is compiled
    into the kernel and the VXLAN driver is compiled as a module.

    Emil provides a fix for ixgbe to prevent against a panic by trying
    to dereference a NULL pointer in ixgbe_ndo_set_vf_spoofchk().
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

16 Oct, 2014

13 commits

  • The check for vfinfo is not sufficient because it does not protect
    against specifying a VF index that is outside the sriov_num_vfs range.
    All of the ndo functions have a check for it except for
    ixgbe_ndo_set_vf_spoofchk().

    The following patch is all we need to protect against this panic:

    ip link set p96p1 vf 0 spoofchk off
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000052
    IP: []
    ixgbe_ndo_set_vf_spoofchk+0x51/0x150 [ixgbe]

    Reported-by: Thierry Herbelot
    Signed-off-by: Emil Tantilov
    Acked-by: Thierry Herbelot
    Signed-off-by: Jeff Kirsher

    Emil Tantilov
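
    A minimal sketch of the kind of range check described above. The struct
    and field names here are hypothetical and simplified; they are not the
    ixgbe driver's actual definitions. The check rejects any VF index that
    is not below the number of allocated VFs before the per-VF array is
    touched.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down adapter state; names are illustrative only. */
struct fake_adapter {
    unsigned int num_vfs;       /* number of VFs actually allocated */
    int *spoofchk_enabled;      /* per-VF array with num_vfs entries, or NULL */
};

/* Rejects out-of-range VF indexes before indexing the per-VF array. */
static int set_vf_spoofchk(struct fake_adapter *adapter, int vf, int setting)
{
    if (vf < 0 || (unsigned int)vf >= adapter->num_vfs)
        return -EINVAL;
    adapter->spoofchk_enabled[vf] = setting ? 1 : 0;
    return 0;
}
```

    With no VFs allocated (num_vfs == 0 and a NULL array), a request for
    "vf 0" returns -EINVAL instead of dereferencing the NULL array.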
     
  • Compiling with CONFIG_FM10K=y and VXLAN=m results in a linking error:

    drivers/built-in.o: In function `fm10k_open':
    (.text+0x1f9d7a): undefined reference to `vxlan_get_rx_port'
    make: *** [vmlinux] Error 1

    The fix follows the same strategy as I40E.

    Signed-off-by: Andy Zhou
    Acked-by: Alexander Duyck
    Tested-by: Aaron Brown
    Signed-off-by: Jeff Kirsher

    Andy Zhou
     
  • After grabbing the mailbox lock and detecting an error, the lock must be
    released before the error code can be returned.

    Signed-off-by: Matthew Vick
    Tested-by: Aaron Brown
    Signed-off-by: Jeff Kirsher

    Matthew Vick
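
    The usual way to keep error paths from leaking a lock is a single unlock
    label that every exit passes through. The sketch below assumes a
    hypothetical struct fake_mbx with a plain flag standing in for the real
    lock; it shows the pattern only, not the fm10k code.

```c
#include <errno.h>

/* Hypothetical mailbox; a flag stands in for a real lock. */
struct fake_mbx {
    int locked;
    int hw_error;   /* non-zero simulates a failure detected under the lock */
    int sent;
};

static void mbx_lock(struct fake_mbx *m)   { m->locked = 1; }
static void mbx_unlock(struct fake_mbx *m) { m->locked = 0; }

/* Every exit taken after mbx_lock() goes through the unlock label. */
static int mbx_send(struct fake_mbx *m)
{
    int err = 0;

    mbx_lock(m);
    if (m->hw_error) {
        err = -EIO;
        goto unlock;            /* do not return with the lock held */
    }
    m->sent++;
unlock:
    mbx_unlock(m);
    return err;
}
```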
     
  • Set the flag to fetch the host state before kicking off the service task
    that reads the host state when bringing the interface back up.

    Signed-off-by: Matthew Vick
    Tested-by: Aaron Brown
    Signed-off-by: Jeff Kirsher

    Matthew Vick
     
  • pskb_may_pull() should be used to check whether skb->data has enough space;
    skb->len cannot ensure that.

    Cc: Cong Wang
    Signed-off-by: Li RongQing
    Signed-off-by: David S. Miller

    Li RongQing
     
  • Once netif_rx() returns, the skb it handled may already have been freed
    and must not be used.

    Signed-off-by: Li RongQing
    Signed-off-by: David S. Miller

    Li RongQing
     
  • All functions used struct vport *vport except
    ovs_vport_find_upcall_portid.

    This fixes 1 kerneldoc warning

    Signed-off-by: Fabian Frederick
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Fabian Frederick
     
  • s/sock/gs

    Signed-off-by: Fabian Frederick
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Fabian Frederick
     
  • For each Rx frame the eTSEC writes its FCS (Frame Check Sequence)
    to the Rx buffer.

    The eTSEC h/w manual states in the "Receive Buffer Descriptor Field
    Descriptions" table:
    "Data length is the number of octets written by the eTSEC into this BD's
    data buffer if L is cleared (the value is equal to MRBLR), or, if L is
    set, the length of the frame including *CRC*, FCB (if RCTRL[PRSDEP] > 00),
    preamble (if MACCFG2[PreAmRxEn]=1), time stamp (if RCTRL[TS] = 1) and
    any padding (RCTRL[PAL])."

    Though the FCS bytes are removed by the driver before passing the skb
    to the net stack, the Rx buffer size computation does not currently
    take into account the FCS bytes (4 bytes).
    Because the Rx buffer size is a multiple of 512 bytes, leaving out the
    FCS is not a problem for the default MTU of 1500, as the Rx buffer size
    is 1536 in this case. However, for custom MTUs, where the difference
    between the MTU size and the Rx buffer size is smaller, this can be a
    problem, as the computed Rx buffer size won't be enough to accommodate
    the FCS for a received frame that is big enough (close to MTU size).
    In such a case the received frame is considered to be incomplete (L flag
    not set in the RxBD status) and is silently dropped.

    Note that the driver does not currently support S/G on Rx, so it has to
    compute its Rx buffer size based on the MTU of the device.

    Reported-by: Kristian Otnes
    Signed-off-by: Claudiu Manoil
    Signed-off-by: David S. Miller

    Claudiu Manoil
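
    The arithmetic above can be checked with a small sketch. Only the
    512-byte step, the 4-byte FCS and the 1500/1536 figures come from the
    commit text; the helper names are hypothetical, the 14-byte header is the
    plain Ethernet header, and the real driver also accounts for items such
    as the FCB and padding that are left out here.

```c
/* Simplified constants; see the assumptions stated above. */
#define RX_BUF_STEP 512u
#define ETH_HDR_LEN 14u
#define FCS_LEN     4u

/* Rounds len up to the next multiple of RX_BUF_STEP. */
static unsigned int round_up_step(unsigned int len)
{
    return (len + RX_BUF_STEP - 1) / RX_BUF_STEP * RX_BUF_STEP;
}

/* Buffer size when the FCS is ignored (the behaviour being fixed). */
static unsigned int rx_buf_size_no_fcs(unsigned int mtu)
{
    return round_up_step(mtu + ETH_HDR_LEN);
}

/* Buffer size when the FCS written by the hardware is counted. */
static unsigned int rx_buf_size_with_fcs(unsigned int mtu)
{
    return round_up_step(mtu + ETH_HDR_LEN + FCS_LEN);
}

/* Largest frame the hardware writes for a given MTU, FCS included. */
static unsigned int max_rx_frame(unsigned int mtu)
{
    return mtu + ETH_HDR_LEN + FCS_LEN;
}
```

    For MTU 1500 the largest frame is 1518 bytes and fits the 1536-byte
    buffer either way. For a custom MTU of 1520 the buffer computed without
    the FCS is still 1536 bytes, but the largest frame is 1538 bytes, so it
    no longer fits; counting the FCS moves the buffer to 2048 bytes.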
     
  • commit 0b725a2ca61bedc33a2a63d0451d528b268cf975
    net: Remove ndo_xmit_flush netdev operation, use signalling instead.

    added code that looks at skb->xmit_more after the skb has
    been put in TX VQ. Since some paths process the ring and free the skb
    immediately, this can cause use after free.

    Fix by storing xmit_more in a local variable.

    Cc: David S. Miller
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
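
    A userspace sketch of the fix's idea: read the flag into a local before
    ownership of the packet is handed over. struct tx_pkt and ring_enqueue()
    are hypothetical stand-ins; here the "ring" frees the packet immediately
    to model the worst case described above.

```c
#include <stdlib.h>

/* Hypothetical packet; only the flag that matters here. */
struct tx_pkt {
    int xmit_more;
};

/* Stand-in for queuing to the TX ring; the ring may free the packet at once. */
static void ring_enqueue(struct tx_pkt *pkt)
{
    free(pkt);
}

/* Returns 1 if the device should be kicked now, 0 if more packets follow. */
static int start_xmit(struct tx_pkt *pkt)
{
    int more = pkt->xmit_more;  /* read before ownership is handed over */

    ring_enqueue(pkt);          /* pkt must not be touched after this */
    return !more;
}
```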
     
  • The iMX6SX IEEE 1588 module has a hardware issue in capturing the ATVR register.
    The current software flow is:
    ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK;
    ts_counter_ns = ENET0->ATVR;
    The ATVR value read this way is not the expected one, which prevents the LinuxPTP stack from converging.

    The ENET Block Guide chapter for the iMX6SX (PELE) addresses the issue:
    after setting ENET_ATCR[Capture], some clock cycles are needed before the counter
    value is captured in the register clock domain. The wait is at least
    6 clock cycles of the slower of the register clock and the 1588 clock.
    So something like the following is needed:
    ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK;
    wait();
    ts_counter_ns = ENET0->ATVR;

    For iMX6SX, the 1588 ts_clk is fixed at 25 MHz and the register clock is 66 MHz, so the
    wait must be longer than 240 ns (40 ns * 6). The patch adds a 1 us delay
    before the CPU reads the ATVR register.

    Changes V2:
    Modify the commit/comments log to describe the issue clearly.

    Signed-off-by: Fugang Duan
    Acked-by: Richard Cochran
    Signed-off-by: David S. Miller

    Nimrod Andy
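
    The 240 ns figure follows directly from the clock rates quoted in the
    commit. The helper names below are hypothetical; the sketch just
    reproduces the arithmetic (six periods of the slower clock).

```c
/* Period in nanoseconds of a clock given in Hz (integer division). */
static unsigned long period_ns(unsigned long hz)
{
    return 1000000000ul / hz;
}

/* Minimum wait: `cycles` periods of the slower of the two clocks. */
static unsigned long min_capture_wait_ns(unsigned long reg_clk_hz,
                                         unsigned long ts_clk_hz,
                                         unsigned int cycles)
{
    unsigned long slower_hz = reg_clk_hz < ts_clk_hz ? reg_clk_hz : ts_clk_hz;

    return period_ns(slower_hz) * cycles;
}
```

    With a 66 MHz register clock and a 25 MHz ts_clk, the slower clock has a
    40 ns period, so six cycles give 240 ns, comfortably below the 1 us
    delay chosen by the patch.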
     
  • Identified by kbuild test robot. csk family is always set to be AF_INET or
    AF_INET6, so skb will always be initialized to some value but there is no harm
    in silencing the warning anyways.

    Signed-off-by: Anish Bhatt
    Fixes : f42bb57c61fd ('cxgb4i : Fix -Wunused-function warning')
    Signed-off-by: David S. Miller

    Anish Bhatt
     
  • Add ndo_gso_check, which a device can define to indicate whether it
    is capable of doing GSO on a packet. This function would be called from
    the stack to determine whether software GSO needs to be done. A
    driver should populate this function if it advertises GSO types for
    which there are combinations that it wouldn't be able to handle. For
    instance, a device that performs UDP tunneling might only implement
    support for transparent Ethernet bridging type of inner packets
    or might have limitations on lengths of inner headers.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
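
    How such a per-device check might be consulted can be sketched as
    follows. Everything here is hypothetical: the types, the gso_check field,
    the 64-byte header limit and the helper names are invented for
    illustration and do not reflect the kernel's actual ndo_gso_check
    signature.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced packet and device types. */
struct fake_pkt {
    int inner_is_ethernet;          /* inner payload is bridged Ethernet */
    unsigned int inner_hdr_len;     /* total length of the inner headers */
};

struct fake_dev {
    /* Optional callback: non-zero means the device can do GSO on pkt. */
    int (*gso_check)(const struct fake_pkt *pkt);
};

/* Example driver policy: Ethernet-in-UDP only, inner headers up to 64 bytes. */
static int tunnel_dev_gso_check(const struct fake_pkt *pkt)
{
    return pkt->inner_is_ethernet && pkt->inner_hdr_len <= 64;
}

/* Stack side: fall back to software GSO when the device declines. */
static int needs_software_gso(const struct fake_dev *dev,
                              const struct fake_pkt *pkt)
{
    if (dev->gso_check == NULL)
        return 0;   /* no callback: trust the advertised GSO features */
    return !dev->gso_check(pkt);
}
```

    A device without the callback keeps its current behaviour, while a
    tunneling device can decline individual packets it advertises a GSO type
    for but cannot actually segment.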
     

15 Oct, 2014

18 commits

  • This patch fixes the stmmac data compatibilities for
    all the SoCs inside the platform file.

    Reported-by: Stephen Rothwell
    Signed-off-by: Giuseppe Cavallaro
    Signed-off-by: David S. Miller

    Giuseppe CAVALLARO
     
  • Anish Bhatt says:

    ====================
    ipv6 and related cleanup for cxgb4/cxgb4i

    This patch set removes some duplicated/extraneous code from cxgb4i, guards
    cxgb4 against compilation failure based on the ipv6 tristate, stops enabling
    ipv6-related code by default irrespective of the ipv6 tristate, and fixes
    a refcnt issue.
    -Anish

    v2 : Provide more detailed commit messages, make subject more concise as
    recommended by Dave Miller.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • There is an extra call to dst_neigh_lookup() leftover in cxgb4i that can cause
    an unreleased refcnt issue. Remove extraneous call.

    Signed-off-by: Anish Bhatt

    Fixes : 759a0cc5a3e1b ('cxgb4i: Add ipv6 code to driver, call into libcxgbi ipv6 api')
    Signed-off-by: David S. Miller

    Anish Bhatt
     
  • A bunch of ipv6 related code is left on by default. While this causes no
    compilation issues, there is no need to have this enabled by default. Guard
    with an ipv6 check, which also takes care of a -Wunused-function warning.

    Signed-off-by: Anish Bhatt
    Signed-off-by: David S. Miller

    Anish Bhatt
     
  • cxgb4 ipv6 does not guard against ipv6 being disabled, or the standard
    ipv6 module vs inbuilt tri-state issue. This was fixed for cxgb4i & iw_cxgb4
    but missed for cxgb4.

    Signed-off-by: Anish Bhatt
    Signed-off-by: David S. Miller

    Anish Bhatt
     
  • cxgb4 already handles CLIP updates from a previous changeset for iw_cxgb4,
    there is no need to have this functionality in cxgb4i. Remove duplicated code

    Signed-off-by: Anish Bhatt
    Signed-off-by: David S. Miller

    Anish Bhatt
     
  • TCP Small queues tries to keep number of packets in qdisc
    as small as possible, and depends on a tasklet to feed following
    packets at TX completion time.
    The choice of a tasklet was driven by latency requirements.

    Then, TCP stack tries to avoid reorders, by locking flows with
    outstanding packets in qdisc in a given TX queue.

    What can happen is that many flows get attracted by a low performing
    TX queue, and cpu servicing TX completion has to feed packets for all of
    them, making this cpu 100% busy in softirq mode.

    This became particularly visible with the latest skb->xmit_more support.

    The strategy adopted in this patch is to detect when tcp_wfree() is called
    from ksoftirqd and let the outstanding queue for this flow be drained
    before feeding additional packets, so that skb->ooo_okay can be set
    to allow select_queue() to select the optimal queue.

    Incoming ACKS are normally handled by different cpus, so this patch
    gives more chance for these cpus to take over the burden of feeding
    qdisc with future packets.

    Tested:

    lpaa23:~# ./super_netperf 1400 --google-pacing-rate 3028000 -H lpaa24 -l 3600 &

    lpaa23:~# sar -n DEV 1 10 | grep eth1
    06:16:18 AM eth1 595448.00 1190564.00 38381.09 1760253.12 0.00 0.00 1.00
    06:16:19 AM eth1 594858.00 1189686.00 38340.76 1758952.72 0.00 0.00 0.00
    06:16:20 AM eth1 597017.00 1194019.00 38480.79 1765370.29 0.00 0.00 1.00
    06:16:21 AM eth1 595450.00 1190936.00 38380.19 1760805.05 0.00 0.00 0.00
    06:16:22 AM eth1 596385.00 1193096.00 38442.56 1763976.29 0.00 0.00 1.00
    06:16:23 AM eth1 598155.00 1195978.00 38552.97 1768264.60 0.00 0.00 0.00
    06:16:24 AM eth1 594405.00 1188643.00 38312.57 1757414.89 0.00 0.00 1.00
    06:16:25 AM eth1 593366.00 1187154.00 38252.16 1755195.83 0.00 0.00 0.00
    06:16:26 AM eth1 593188.00 1186118.00 38232.88 1753682.57 0.00 0.00 1.00
    06:16:27 AM eth1 596301.00 1192241.00 38440.94 1762733.09 0.00 0.00 0.00
    Average: eth1 595457.30 1190843.50 38381.69 1760664.84 0.00 0.00 0.50
    lpaa23:~# ./tc -s -d qd sh dev eth1 | grep backlog
    backlog 7606336b 2513p requeues 167982
    backlog 224072b 74p requeues 566
    backlog 581376b 192p requeues 5598
    backlog 181680b 60p requeues 1070
    backlog 5305056b 1753p requeues 110166 // Here, this TX queue is attracting flows
    backlog 157456b 52p requeues 1758
    backlog 672216b 222p requeues 3025
    backlog 60560b 20p requeues 24541
    backlog 448144b 148p requeues 21258

    lpaa23:~# echo 1 >/proc/sys/net/ipv4/tcp_tsq_enable_tcp_wfree_ksoftirqd_detect

    Immediate jump to full bandwidth, and traffic is properly
    sharded across all TX queues.

    lpaa23:~# sar -n DEV 1 10 | grep eth1
    06:16:46 AM eth1 1397632.00 2795397.00 90081.87 4133031.26 0.00 0.00 1.00
    06:16:47 AM eth1 1396874.00 2793614.00 90032.99 4130385.46 0.00 0.00 0.00
    06:16:48 AM eth1 1395842.00 2791600.00 89966.46 4127409.67 0.00 0.00 1.00
    06:16:49 AM eth1 1395528.00 2791017.00 89946.17 4126551.24 0.00 0.00 0.00
    06:16:50 AM eth1 1397891.00 2795716.00 90098.74 4133497.39 0.00 0.00 1.00
    06:16:51 AM eth1 1394951.00 2789984.00 89908.96 4125022.51 0.00 0.00 0.00
    06:16:52 AM eth1 1394608.00 2789190.00 89886.90 4123851.36 0.00 0.00 1.00
    06:16:53 AM eth1 1395314.00 2790653.00 89934.33 4125983.09 0.00 0.00 0.00
    06:16:54 AM eth1 1396115.00 2792276.00 89984.25 4128411.21 0.00 0.00 1.00
    06:16:55 AM eth1 1396829.00 2793523.00 90030.19 4130250.28 0.00 0.00 0.00
    Average: eth1 1396158.40 2792297.00 89987.09 4128439.35 0.00 0.00 0.50

    lpaa23:~# tc -s -d qd sh dev eth1 | grep backlog
    backlog 7900052b 2609p requeues 173287
    backlog 878120b 290p requeues 589
    backlog 1068884b 354p requeues 5621
    backlog 996212b 329p requeues 1088
    backlog 984100b 325p requeues 115316
    backlog 956848b 316p requeues 1781
    backlog 1080996b 357p requeues 3047
    backlog 975016b 322p requeues 24571
    backlog 990156b 327p requeues 21274

    (All 8 TX queues get a fair share of the traffic)

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Rajesh Borundia says:

    ====================
    qlcnic: Bug fixes

    This series fixes following issues.

    * We were programming maximum number of arguments supported by
    adapter instead of required in a command.
    * Destroy tx command requires three arguments instead of two.

    Please apply these patches to net.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • o Number of arguments taken by destroy tx command is three
    instead of two.

    Signed-off-by: Rajesh Borundia
    Signed-off-by: David S. Miller

    Rajesh Borundia
     
  • o Initially we were programming maximum number of arguments.
    Instead we should program number of arguments required in
    a command.
    o Maximum number of arguments for 82xx adapter is four. Fix it
    for GET_ESWITCH_STATS command.

    Signed-off-by: Rajesh Borundia
    Signed-off-by: David S. Miller

    Rajesh Borundia
     
  • Resolve "logical 'and' applied to non-boolean constant" warnings
    that appear in W=2 builds by adding !! to a bit test.

    Signed-off-by: Mark Rustad
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Mark Rustad
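
    As a small illustration of the idiom, !! turns the result of a mask
    test (zero or the mask value) into a clean 0 or 1. The flag name and
    helpers below are hypothetical and are not taken from the driver.

```c
#define FLAG_RX_READY 0x10u     /* hypothetical flag bit */

/* Without !!, the masked value is 0 or 0x10; with !!, it is 0 or 1. */
static int rx_ready(unsigned int flags)
{
    return !!(flags & FLAG_RX_READY);
}

/* The normalized result combines cleanly with other boolean conditions. */
static int can_poll(unsigned int flags, int link_up)
{
    return link_up && rx_ready(flags);
}
```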
     
  • Unlike normal kfree(), it is never right to call sock_kfree_s() with
    a NULL pointer, because sock_kfree_s() also has the side effect of
    discharging the memory from the socket's quota.

    Signed-off-by: David S. Miller

    David S. Miller
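
    Why a NULL pointer is harmful here can be shown with a userspace model.
    struct fake_sock and fake_sock_kfree_s() are hypothetical stand-ins that
    mimic the "free and discharge" behaviour described above; the guard in
    release_opt() is one way a caller can avoid discharging memory that was
    never charged.

```c
#include <stdlib.h>

/* Hypothetical socket with only the option-memory counter. */
struct fake_sock {
    long optmem_charged;
};

/* Stand-in for sock_kfree_s(): frees mem and always discharges size. */
static void fake_sock_kfree_s(struct fake_sock *sk, void *mem, long size)
{
    free(mem);                  /* free(NULL) is harmless ... */
    sk->optmem_charged -= size; /* ... but this discharge is not */
}

/* Caller-side guard: only discharge what was actually allocated. */
static void release_opt(struct fake_sock *sk, void *opt, long size)
{
    if (opt)
        fake_sock_kfree_s(sk, opt, size);
}
```

    Calling fake_sock_kfree_s() directly with a NULL pointer would still
    subtract size from the counter, leaving the accounting wrong even though
    nothing was freed.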
     
  • It is okay to free a NULL pointer but not okay to mischarge the socket optmem
    accounting. Compile test only.

    Reported-by: rucsoftsec@gmail.com
    Cc: Chien Yen
    Cc: Stephen Hemminger
    Signed-off-by: Cong Wang
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Use t4_fw_upgrade instead of t4_load_fw to write firmware into FLASH, since
    t4_load_fw doesn't co-ordinate with the firmware and the adapter can get hosed
    enough to require a power cycle of the system.

    Based on original work by Casey Leedom

    Signed-off-by: Hariprasad Shenai
    Signed-off-by: David S. Miller

    Hariprasad Shenai
     
  • Giuseppe Cavallaro says:

    ====================
    stmmac: review and fix the dwmac-sti glue-logic

    This patch reviews the whole glue logic adopted on STi SoCs, which
    was buggy.
    In the old glue logic there was a lot of confusion when setting up the
    retiming, especially for STiD127 where, for example, bits 6 and 7
    (in the GMAC control register) have a different meaning from what is
    used for STiH4xx SoCs. So we cannot adopt the same glue for all these
    SoCs.
    Moreover, Gigabit on STiD127 didn't work and, for all the SoCs, RGMII
    couldn't run when the speed was 10 Mbps (because the clock was not properly
    managed).
    Note that the phy clock needs to be provided by the platform as well, as
    documented in the related binding file (updated as a consequence).

    The old code supported too many configurations that were never adopted or validated.
    This made the code very complex to maintain and debug in case of issues.

    The patch simplifies all the configurations as commented in the tables
    inside the file and obviously it has been tested on all the boards
    based on the SoCs mentioned.

    With this patch, the dwmac-sti is also ready to support new configurations that
    will be available on next SoC generations.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This patch reviews the whole glue logic adopted on STi SoCs, which
    was buggy.

    In the old glue logic there was a lot of confusion when setting up the
    retiming, especially for STiD127 where, for example, bits 6 and 7
    (in the GMAC control register) have a different meaning from what is
    used for STiH4xx SoCs. So we cannot adopt the same glue for all these
    SoCs.
    Moreover, Gigabit on STiD127 didn't work and, for all the SoCs, RGMII
    couldn't run when the speed was 10 Mbps (because the clock was not properly
    managed).
    Note that the phy clock needs to be provided by the platform as well, as
    documented in the related binding file (updated as a consequence).

    The old code supported too many configurations that were never adopted or validated.
    This made the code very complex to maintain and debug in case of issues.

    The patch simplifies all the configurations as commented in the tables
    inside the file and obviously it has been tested on all the boards
    based on the SoCs mentioned.

    With this patch, the dwmac-sti is also ready to support new configurations that
    will be available on next SoC generations.

    Signed-off-by: Giuseppe Cavallaro
    Cc: Srinivas Kandagatla
    Signed-off-by: David S. Miller

    Giuseppe CAVALLARO
     
  • This adds the missing compatibility to the STiH407 SoC.

    Signed-off-by: Giuseppe Cavallaro
    Signed-off-by: David S. Miller

    Giuseppe CAVALLARO
     
  • On several STi platforms (e.g. stihxxx-b2120), an Ethernet switch is
    embedded and connected to the stmmac via RGMII mode. This is managed
    by using the FIXED_PHY. In that case, the support in the platform needs
    to be fixed to allow the stmmac to talk to the switch via fixed-link
    by using the phy_bus_name property.

    Signed-off-by: Giuseppe Cavallaro
    Signed-off-by: David S. Miller

    Giuseppe CAVALLARO