05 Jan, 2013
2 commits
-
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
-
I slipped in a new sysctl without proper documentation. I would like to
make up for this now.
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
11 Dec, 2012
1 commit
-
The description for tcp_fin_timeout should be tighter and more clear.
In addition to being tighter, we should make the spelling of the
state name consistent with what utilities report, remove the now
dated reference to 2.2 and put the default in the consistent place.
Signed-off-by: Rick Jones
Signed-off-by: David S. Miller
08 Dec, 2012
1 commit
-
'secluded' is used to describe places, not suitable here.
Suggested-by: Ben Hutchings
Signed-off-by: Shan Wei
Signed-off-by: David S. Miller
06 Dec, 2012
1 commit
-
Signed-off-by: Shan Wei
Signed-off-by: David S. Miller
30 Nov, 2012
1 commit
-
Make the description of how tcp_ecn works a bit more explicit and clear.
Signed-off-by: Rick Jones
Signed-off-by: David S. Miller
27 Nov, 2012
1 commit
-
This patch updates the stmmac.txt adding some information
about the new rx/tx mitigation schema adopted in the driver.
Signed-off-by: Giuseppe Cavallaro
Signed-off-by: David S. Miller
26 Nov, 2012
1 commit
-
Conflicts:
drivers/net/wireless/iwlwifi/pcie/tx.c

Minor iwlwifi conflict in TX queue disabling between 'net', which
removed a bogus warning, and 'net-next' which added some status
register poking code.
Signed-off-by: David S. Miller
24 Nov, 2012
1 commit
-
Some commands in the example documentation do not work. This patch fixes them.
Signed-off-by: Zhi Yong Wu
Signed-off-by: David S. Miller
18 Nov, 2012
1 commit
-
Minor line offset auto-merges.
Signed-off-by: David S. Miller
14 Nov, 2012
1 commit
-
Signed-off-by: Kirill Smelkov
Signed-off-by: David S. Miller
10 Nov, 2012
1 commit
-
This improves the packet_mmap.txt document in the following ways:
* Add initial information about different TPACKET versions
* Add initial information about packet fanout
* Add pointer to BPF document (since this also could be of interest)
* 'Fix' minor, rather cosmetic things

Information partially taken from related commit messages.
Reported-by: Ronny Meeus
Signed-off-by: Daniel Borkmann
Cc: Ulisses Alonso Camaró
Cc: Johann Baudy
Signed-off-by: David S. Miller
08 Nov, 2012
3 commits
-
Included changes:
- minimal fixes to the packet layout to avoid the __packed attribute when not
needed
- new packet type called UNICAST_4ADDR: in this packet it is possible to find
both source and destination node (in the classic UNICAST header only the
destination field exists).
- a new feature: Distributed ARP Table (D.A.T.). It aims to reduce ARP lookups
latency by means of a simil-DHT approach.
-
The tx data offset of packet mmap tx ring used to be:
(TPACKET2_HDRLEN - sizeof(struct sockaddr_ll))

The problem is that, with a SOCK_RAW socket, the payload (14 bytes after
the beginning of the user data) is misaligned.

This patch lets the user give an offset for its tx data if desired.

Set sock option PACKET_TX_HAS_OFF to 1, then specify in each frame of
your tx ring tp_net for SOCK_DGRAM, or tp_mac for SOCK_RAW.
Signed-off-by: Paul Chavent
Signed-off-by: David S. Miller
-
A new log level has been added to concentrate messages regarding DAT: ARP
snooping, requests, response and DHT related messages.
The new log level is named BATADV_DBG_DAT
Signed-off-by: Antonio Quartulli
26 Oct, 2012
1 commit
-
Currently sctp allows for the optional use of md5 or sha1 hmac algorithms to
generate cookie values when establishing new connections via two build time
config options. There's no real reason to make this a static selection. We can
add a sysctl that allows for the dynamic selection of these algorithms at run
time, with the default value determined by the corresponding crypto library
availability.
This comes in handy when, for example, running a system in FIPS mode, where use
of md5 is disallowed, but SHA1 is permitted.

Note: This new sysctl has no corresponding socket option to select the cookie
hmac algorithm. I chose not to implement that intentionally, as RFC 6458
contains no option for this value, and I opted not to pollute the socket option
namespace.

Change notes:
v2)
* Updated subject to have the proper sctp prefix as per Dave M.
* Replaced default selection options with new options that allow
developers to explicitly select available hmac algs at build time
as per suggestion by Vlad Y.
Signed-off-by: Neil Horman
CC: Vlad Yasevich
CC: "David S. Miller"
CC: netdev@vger.kernel.org
Acked-by: Vlad Yasevich
Signed-off-by: David S. Miller
02 Oct, 2012
1 commit
-
This is an implementation of Virtual eXtensible Local Area Network
as described in draft RFC:
http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-02

The driver integrates a Virtual Tunnel Endpoint (VTEP) functionality
that learns MAC to IP address mapping.

This implementation has only been tested against the Linux
userspace implementation using TAP, not against other vendors'
equipment.
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
01 Sep, 2012
3 commits
-
This patch adds all the necessary data structure and support
functions to implement TFO server side. It also documents a number
of flags for the sysctl_tcp_fastopen knob, and adds a few Linux
extension MIBs.

In addition, it includes the following:
1. a new TCP_FASTOPEN socket option an application must call to
supply a max backlog allowed in order to enable TFO on its listener.
2. A number of key data structures:
"fastopen_rsk" in tcp_sock - for a big socket to access its
request_sock for retransmission and ack processing purpose. It is
non-NULL iff 3WHS not completed.
"fastopenq" in request_sock_queue - points to a per Fast Open
listener data structure "fastopen_queue" to keep track of qlen (# of
outstanding Fast Open requests) and max_qlen, among other things.
"listener" in tcp_request_sock - to point to the original listener
for book-keeping purpose, i.e., to maintain qlen against max_qlen
as part of defense against IP spoofing attack.
3. various data structure and functions, many in tcp_fastopen.c, to
support server side Fast Open cookie operations, including
/proc/sys/net/ipv4/tcp_fastopen_key to allow manual rekeying.
Signed-off-by: H.K. Jerry Chu
Cc: Yuchung Cheng
Cc: Neal Cardwell
Cc: Eric Dumazet
Cc: Tom Herbert
Signed-off-by: David S. Miller
-
This patch removes bus_id from mdio platform data. The reason to remove
bus_id is that the stmmac mdio bus_id is always the same as the stmmac bus-id,
so there is no point in passing it in a different variable.
Also, the stmmac ethernet driver connects to the phy with the bus_id passed in
its platform data.
So, having a single bus-id is much simpler.
Signed-off-by: Srinivas Kandagatla
Signed-off-by: David S. Miller
-
Commit 9ad7c049 ("tcp: RFC2988bis + taking RTT sample from 3WHS for
the passive open side") changed the initRTO from 3secs to 1sec in
accordance to RFC6298 (former RFC2988bis). This reduced the time till
the last SYN retransmission packet gets sent from 93secs to 31secs.

RFC1122 states that the retransmission should be done for at least 3
minutes, but this seems to be quite high.

"However, the values of R1 and R2 may be different for SYN
and data segments. In particular, R2 for a SYN segment MUST
be set large enough to provide retransmission of the segment
for at least 3 minutes. The application can close the
connection (i.e., give up on the open attempt) sooner, of
course."

This patch increases the value of TCP_SYN_RETRIES to the value of 6,
providing a retransmission window of 63secs.

The comments for SYN and SYNACK retries have also been updated to
describe the current settings. The same goes for the documentation file
"Documentation/networking/ip-sysctl.txt".
Signed-off-by: Alexander Bergmann
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
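The timing figures quoted in the commit above follow from exponential backoff of the SYN retransmission timer. A minimal sketch of that arithmetic, assuming the timeout simply doubles after each retransmission and is never capped (the function name is made up for illustration and is not a kernel symbol):

```python
def last_syn_retransmit_time(init_rto: int, retries: int) -> int:
    """Seconds from the first SYN until the last retransmission is sent.

    Assumption: the timeout doubles after every retransmission and no
    upper bound on the RTO is reached for these small retry counts.
    """
    elapsed = 0
    rto = init_rto
    for _ in range(retries):
        elapsed += rto  # the timer expires and the SYN is sent again
        rto *= 2        # exponential backoff for the next attempt
    return elapsed


if __name__ == "__main__":
    print(last_syn_retransmit_time(3, 5))  # initRTO 3 s, 5 retries -> 93
    print(last_syn_retransmit_time(1, 5))  # initRTO 1 s, 5 retries -> 31
    print(last_syn_retransmit_time(1, 6))  # initRTO 1 s, 6 retries -> 63
```

With five retries the sketch reproduces the 93 s and 31 s figures, and a sixth retry gives the 63 s window the patch mentions.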
23 Aug, 2012
2 commits
-
This is especially useful if there are no claims yet, but we still want
to know which gateways are using bridge loop avoidance in the network.
Signed-off-by: Simon Wunderlich
Signed-off-by: Antonio Quartulli
-
Currently the "bonding" driver does not support load balancing outgoing
traffic in LACP mode for IPv6 traffic. IPv4 (and TCP or UDP over IPv4)
are currently supported; this patch adds transmit hashing for IPv6 (and
TCP or UDP over IPv6), bringing IPv6 up to par with IPv4 support in the
bonding driver. In addition, bounds checking has been added to all
transmit hashing functions.

The algorithm chosen (xor'ing the bottom three quads of the source and
destination addresses together, then xor'ing each byte of that result into
the bottom byte, finally xor'ing with the last bytes of the MAC addresses)
was selected after testing almost 400,000 unique IPv6 addresses harvested
from server logs. This algorithm had the most even distribution for both
big- and little-endian architectures while still using few instructions. Its
behavior also attempts to closely match that of the IPv4 algorithm.

The IPv6 flow label was intentionally not included in the hash as it appears
to be unset in the vast majority of IPv6 traffic sampled, and the current
algorithm not using the flow label already offers a very even distribution.

Fragmented IPv6 packets are handled the same way as fragmented IPv4 packets,
i.e., they are not balanced based on layer 4 information. Additionally,
IPv6 packets with intermediate headers are not balanced based on layer
4 information. In practice these intermediate headers are not common and
this should not cause any problems, and the alternative (a packet-parsing
loop and look-up table) seemed slow and complicated for little gain.
Tested-by: John Eaglesham
Signed-off-by: John Eaglesham
Signed-off-by: David S. Miller
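The hashing steps described in the commit above can be sketched as follows. This is an illustration of the prose only, not the driver's code: the function name, the masking to a single byte, and the final reduction modulo the number of slaves are assumptions made for the example.

```python
import ipaddress
import struct


def ipv6_xmit_hash(src_ip: str, dst_ip: str,
                   src_mac: bytes, dst_mac: bytes, slave_count: int) -> int:
    """Pick a slave index from IPv6 addresses and MAC addresses (sketch)."""
    # Split each 128-bit address into four 32-bit quads, most significant first.
    s = struct.unpack("!4I", ipaddress.IPv6Address(src_ip).packed)
    d = struct.unpack("!4I", ipaddress.IPv6Address(dst_ip).packed)

    # XOR the bottom three quads of source and destination together.
    h = 0
    for i in (1, 2, 3):
        h ^= s[i] ^ d[i]

    # Fold every byte of the 32-bit result into the bottom byte.
    folded = (h ^ (h >> 8) ^ (h >> 16) ^ (h >> 24)) & 0xFF

    # Mix in the last byte of each MAC address.
    folded ^= src_mac[5] ^ dst_mac[5]

    # Assumption: the slave is selected by reducing modulo the slave count.
    return folded % slave_count
```

Because every step is an XOR, swapping source and destination yields the same index, so both directions of a flow map to the same slave in this sketch.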
15 Aug, 2012
1 commit
-
There are at least 4 implementations of netcat, with the BSD-based one
being the only one that has to be used without the -p switch to
specify the listening port.

Jan Engelhardt suggested adding an example for socat(1).
Signed-off-by: Dirk Gouders
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller
31 Jul, 2012
1 commit
-
After IP route cache removal, rt_cache_rebuild_count is no longer
used.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
23 Jul, 2012
2 commits
-
The section titled "Configuring Bonding for Maximum Throughput" is
actually section twelve not thirteen, and there are a couple of words
spelled incorrectly.
Signed-off-by: Rick Jones
Reviewed-by: Nicolas de Pesloüan
Signed-off-by: Jay Vosburgh
Signed-off-by: David S. Miller
-
I've seen several attempts recently made to do quick failover of sctp transports
by reducing various retransmit timers and counters. While it's possible to
implement a faster failover on multihomed sctp associations, it's not
particularly robust, in that it can lead to unneeded retransmits, as well as
false connection failures due to intermittent latency on a network.

Instead, let's implement the new ietf quick failover draft found here:
http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05

This will let the sctp stack identify transports that have had a small number of
errors, and avoid using them quickly until their reliability can be
re-established. I've tested this out on two virt guests connected via multiple
isolated virt networks and believe it's in compliance with the above draft and
works well.
Signed-off-by: Neil Horman
CC: Vlad Yasevich
CC: Sridhar Samudrala
CC: "David S. Miller"
CC: linux-sctp@vger.kernel.org
CC: joe@perches.com
Acked-by: Vlad Yasevich
Signed-off-by: David S. Miller
21 Jul, 2012
2 commits
-
Jesse Gross says:
====================
A few bug fixes and small enhancements for net-next/3.6.
...
Ansis Atteka (1):
openvswitch: Do not send notification if ovs_vport_set_options() failed

Ben Pfaff (1):
openvswitch: Check gso_type for correct sk_buff in queue_gso_packets().

Jesse Gross (2):
openvswitch: Enable retrieval of TCP flags from IPv6 traffic.
openvswitch: Reset upper layer protocol info on internal devices.

Leo Alterman (1):
openvswitch: Fix typo in documentation.

Pravin B Shelar (1):
openvswitch: Check currect return value from skb_gso_segment()

Raju Subramanian (1):
openvswitch: Replace Nicira Networks.
====================
Signed-off-by: David S. Miller
-
Signed-off-by: Leo Alterman
Signed-off-by: Jesse Gross
20 Jul, 2012
3 commits
-
In trusted networks, e.g., intranet, data-center, the client does not
need to use Fast Open cookie to mitigate DoS attacks. In cookie-less
mode, sendmsg() with MSG_FASTOPEN flag will send SYN-data regardless
of cookie availability.
Signed-off-by: Yuchung Cheng
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
sendmsg() (or sendto()) with MSG_FASTOPEN is a combo of connect(2)
and write(2). The application should replace connect() with it to
send data in the opening SYN packet.

For a blocking socket, sendmsg() blocks until all the data are buffered
locally and the handshake is completed, like a connect() call. It
returns an errno similar to connect() if the TCP handshake fails.

For a non-blocking socket, it returns the number of bytes queued (and
transmitted in the SYN-data packet) if a cookie is available. If a cookie
is not available, it transmits a data-less SYN packet with a Fast Open
cookie request option and returns -EINPROGRESS like connect().

Using MSG_FASTOPEN on a connecting or connected socket will result in
an errno similar to repeating connect() calls. Therefore the application
should only use this flag on new sockets.

The buffer size of sendmsg() is independent of the MSS of the connection.
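The blocking case described above might look like this from user space. It is a minimal sketch, assuming a Linux host; the function name, host and port are hypothetical, and whether data actually travels in the SYN depends on the kernel's Fast Open settings and on the server.

```python
import socket

# Linux value of MSG_FASTOPEN, used as a fallback if this Python build
# does not export the constant.
MSG_FASTOPEN = getattr(socket, "MSG_FASTOPEN", 0x20000000)


def fastopen_request(host: str, port: int, payload: bytes) -> bytes:
    """Open a TCP connection with sendto(MSG_FASTOPEN) and return one reply chunk."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # No connect() call: on a new, blocking socket, sendto() with
        # MSG_FASTOPEN performs the handshake and queues the data.
        sent = sock.sendto(payload, MSG_FASTOPEN, (host, port))
        # Push any bytes that were not accepted by the first call.
        if sent < len(payload):
            sock.sendall(payload[sent:])
        return sock.recv(4096)
```

The socket is deliberately left in blocking mode; a non-blocking socket would need to handle the EINPROGRESS case described in the commit message.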
Signed-off-by: Yuchung Cheng
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
-
Update the references to bridge utilities and web pages
to current locations.
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
17 Jul, 2012
1 commit
-
Implement the RFC 5961 mitigation against Blind
Reset attack using RST bit.

Idea is to validate incoming RST sequence,
to match RCV.NXT value, instead of previously accepted
window : (RCV.NXT <= SEG.SEQ < RCV.NXT+RCV.WND)

If sequence is in window but not an exact match, send
a "challenge ACK", so that the other side can resend an
RST with the appropriate sequence.

Add a new sysctl, tcp_challenge_ack_limit, to limit
number of challenge ACK sent per second.

Add a new SNMP counter to count number of challenge acks sent.
(netstat -s | grep TCPChallengeACK)
Signed-off-by: Eric Dumazet
Cc: Kiran Kumar Kella
Signed-off-by: David S. Miller
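The RST validation rule described in the commit above can be sketched as a small decision function. This is an illustration under stated assumptions: the function name and return labels are made up, sequence arithmetic is modelled modulo 2**32, and the per-second rate limit on challenge ACKs is not modelled.

```python
SEQ_SPACE = 1 << 32  # TCP sequence numbers wrap at 2**32


def classify_rst(seg_seq: int, rcv_nxt: int, rcv_wnd: int) -> str:
    """Decide how to treat an incoming RST segment (sketch)."""
    # Distance of the segment's sequence number past RCV.NXT, with wraparound.
    offset = (seg_seq - rcv_nxt) % SEQ_SPACE
    if offset == 0:
        return "reset"          # exact match: accept the RST
    if offset < rcv_wnd:
        return "challenge_ack"  # in window but not exact: ask the peer to confirm
    return "discard"            # outside the receive window


if __name__ == "__main__":
    print(classify_rst(1000, 1000, 500))
    print(classify_rst(1200, 1000, 500))
    print(classify_rst(2000, 1000, 500))
```

A blind attacker therefore has to guess RCV.NXT exactly rather than land anywhere inside the window.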
12 Jul, 2012
1 commit
-
This introduces TSQ (TCP Small Queues).

TSQ goal is to reduce number of TCP packets in xmit queues (qdisc &
device queues), to reduce RTT and cwnd bias, part of the bufferbloat
problem.

sk->sk_wmem_alloc not allowed to grow above a given limit,
allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a
given time.

TSO packets are sized/capped to half the limit, so that we have two
TSO packets in flight, allowing better bandwidth use.

As a side effect, setting the limit to 40000 automatically reduces the
standard gso max limit (65536) to 40000/2 : It can help to reduce
latencies of high prio packets, having smaller TSO packets.

This means we divert sock_wfree() to a tcp_wfree() handler, to
queue/send following frames when skb_orphan() [2] is called for the
already queued skbs.

Results on my dev machines (tg3/ixgbe nics) are really impressive,
using standard pfifo_fast, and with or without TSO/GSO.

Without reduction of nominal bandwidth, we have reduction of buffering
per bulk sender :
< 1ms on Gbit (instead of 50ms with TSO)
< 8ms on 100Mbit (instead of 132 ms)

I no longer have 4 MBytes backlogged in qdisc by a single netperf
session, and both side socket autotuning no longer use 4 Mbytes.

As skb destructor cannot restart xmit itself ( as qdisc lock might be
taken at this point ), we delegate the work to a tasklet. We use one
tasklet per cpu for performance reasons.

If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag.
This flag is tested in a new protocol method called from release_sock(),
to eventually send new segments.

[1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
[2] skb_orphan() is usually called at TX completion time,
but some drivers call it in their start_xmit() handler.
These drivers should at least use BQL, or else a single TCP
session can still fill the whole NIC TX ring, since TSQ will
have no effect.
Signed-off-by: Eric Dumazet
Cc: Dave Taht
Cc: Tom Herbert
Cc: Matt Mathis
Cc: Yuchung Cheng
Cc: Nandita Dukkipati
Signed-off-by: David S. Miller
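The relation between the TSQ limit and the TSO packet size stated in the commit above is simple arithmetic. A minimal sketch, assuming the cap is the smaller of the standard GSO maximum and half the limit, and assuming the "~128KB" default corresponds to 131072 bytes (the function name is hypothetical):

```python
GSO_MAX_SIZE = 65536  # standard gso max limit quoted in the commit message


def tso_size_cap(limit_output_bytes: int, gso_max: int = GSO_MAX_SIZE) -> int:
    """Largest TSO packet size when packets are capped to half the TSQ limit."""
    return min(gso_max, limit_output_bytes // 2)


if __name__ == "__main__":
    print(tso_size_cap(40000))   # the example limit from the commit message
    print(tso_size_cap(131072))  # assumed byte value of the ~128KB default
```

Under these assumptions the default leaves the standard GSO size untouched, while a limit of 40000 halves to 20000, matching the side effect the commit describes.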
11 Jul, 2012
1 commit
-
URLs to neterion.com and s2io.com no longer resolve. Remove all references to
these URLs in the driver source and documentation.
Signed-off-by: Jon Mason
Signed-off-by: David S. Miller
01 Jul, 2012
2 commits
-
This patch updates the stmmac's documentation adding
some missing files in the section used to describe the
internal driver's structure.

Also the patch adds a new section to describe the EEE support.
Signed-off-by: Giuseppe Cavallaro
Signed-off-by: David S. Miller
-
Signed-off-by: David S. Miller
26 Jun, 2012
1 commit
-
Update drawing and remove description of old features.
Add HSI and USB link layers to the drawing.
Reported-by: Joerg Reisenweber
Signed-off-by: Sjur Brændeland
Signed-off-by: David S. Miller
20 Jun, 2012
1 commit
-
Signed-off-by: Oliver Hartkopp
Signed-off-by: Marc Kleine-Budde
19 Jun, 2012
1 commit
-
Added additional counters in a bat_stats structure, which are exported
through the ethtool api. The counters are specific to batman-adv and
include:
forwarded packets and bytes
management packets and bytes (aggregated OGMs at this point)
translation table packets

New counters are added by extending "enum bat_counters" in types.h and
adding corresponding descriptive string(s) to bat_counters_strings in
soft-iface.c.

Counters are increased by calling batadv_add_counter() and incremented
by one by calling batadv_inc_counter().
Signed-off-by: Martin Hundebøll
Signed-off-by: Sven Eckelmann
13 Jun, 2012
1 commit
-
Routing of 127/8 is traditionally forbidden, we consider
packets from that address block martian when routing and do
not process corresponding ARP requests.

This is a sane default but renders a huge address space
practically unusable.

The RFC states that no address within the 127/8 block should
ever appear on any network anywhere but it does not forbid
the use of such addresses outside of the loopback device in
particular. For example to address a pool of virtual guests
behind a load balancer.

This patch adds a new interface option 'route_localnet'
enabling routing of the 127/8 address block and processing
of ARP requests on a specific interface.

Note that for the feature to work, the default local route
covering 127/8 dev lo needs to be removed.

Example:
$ sysctl -w net.ipv4.conf.eth0.route_localnet=1
$ ip route del 127.0.0.0/8 dev lo table local
$ ip addr add 127.1.0.1/16 dev eth0
$ ip route flush cache

V2: Fix invalid check to auto flush cache (thanks davem)
Signed-off-by: Thomas Graf
Acked-by: Neil Horman
Signed-off-by: David S. Miller