28 Apr, 2016
2 commits
-
Rename NET_INC_STATS_BH() to __NET_INC_STATS()
and NET_ADD_STATS_BH() to __NET_ADD_STATS()Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Rename DCCP_INC_STATS_BH() to __DCCP_INC_STATS()
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
21 Mar, 2015
1 commit
-
One of the major issue for TCP is the SYNACK rtx handling,
done by inet_csk_reqsk_queue_prune(), fired by the keepalive
timer of a TCP_LISTEN socket.This function runs for awful long times, with socket lock held,
meaning that other cpus needing this lock have to spin for hundred of ms.SYNACK are sent in huge bursts, likely to cause severe drops anyway.
This model was OK 15 years ago when memory was very tight.
We now can afford to have a timer per request sock.
Timer invocations no longer need to lock the listener,
and can be run from all cpus in parallel.With following patch increasing somaxconn width to 32 bits,
I tested a listener with more than 4 million active request sockets,
and a steady SYNFLOOD of ~200,000 SYN per second.
Host was sending ~830,000 SYNACK per second.This is ~100 times more what we could achieve before this patch.
Later, we will get rid of the listener hash and use ehash instead.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
23 May, 2014
1 commit
-
'dccp_timestamp_seed' is initialized once by ktime_get_real() in
dccp_timestamping_init(). It is always less than ktime_get_real()
in dccp_timestamp().Then, ktime_us_delta() in dccp_timestamp() will always return positive
number. So can use manual type cast to let compiler and do_div() know
about it to avoid warning.The related warning (with allmodconfig under unicore32):
CC [M] net/dccp/timer.o
net/dccp/timer.c: In function ‘dccp_timestamp’:
net/dccp/timer.c:285: warning: comparison of distinct pointer types lacks a castSigned-off-by: Chen Gang
Signed-off-by: David S. Miller
01 Nov, 2011
1 commit
-
These files are non modular, but need to export symbols using
the macros now living in export.h -- call out the include so
that things won't break when we remove the implicit presence
of module.h from everywhere.Signed-off-by: Paul Gortmaker
29 Oct, 2010
2 commits
-
This extends the existing wait-for-ccid routine so that it may be used with
different types of CCID, addressing the following problems:1) The queue-drain mechanism only works with rate-based CCIDs. If CCID-2 for
example has a full TX queue and becomes network-limited just as the
application wants to close, then waiting for CCID-2 to become unblocked
could lead to an indefinite delay (i.e., application "hangs").
2) Since each TX CCID in turn uses a feedback mechanism, there may be changes
in its sending policy while the queue is being drained. This can lead to
further delays during which the application will not be able to terminate.
3) The minimum wait time for CCID-3/4 can be expected to be the queue length
times the current inter-packet delay. For example if tx_qlen=100 and a delay
of 15 ms is used for each packet, then the application would have to wait
for a minimum of 1.5 seconds before being allowed to exit.
4) There is no way for the user/application to control this behaviour. It would
be good to use the timeout argument of dccp_close() as an upper bound. Then
the maximum time that an application is willing to wait for its CCIDs to can
be set via the SO_LINGER option.These problems are addressed by giving the CCID a grace period of up to the
`timeout' value.The wait-for-ccid function is, as before, used when the application
(a) has read all the data in its receive buffer and
(b) if SO_LINGER was set with a non-zero linger time, or
(c) the socket is either in the OPEN (active close) or in the PASSIVE_CLOSEREQ
state (client application closes after receiving CloseReq).In addition, there is a catch-all case of __skb_queue_purge() after waiting for
the CCID. This is necessary since the write queue may still have data when
(a) the host has been passively-closed,
(b) abnormal termination (unread data, zero linger time),
(c) wait-for-ccid could not finish within the given time limit.Signed-off-by: Gerrit Renker
Signed-off-by: David S. Miller -
This extends the packet dequeuing interface of dccp_write_xmit() to allow
1. CCIDs to take care of timing when the next packet may be sent;
2. delayed sending (as before, with an inter-packet gap up to 65.535 seconds).The main purpose is to take CCID-2 out of its polling mode (when it is network-
limited, it tries every millisecond to send, without interruption).The mode of operation for (2) is as follows:
* new packet is enqueued via dccp_sendmsg() => dccp_write_xmit(),
* ccid_hc_tx_send_packet() detects that it may not send (e.g. window full),
* it signals this condition via `CCID_PACKET_WILL_DEQUEUE_LATER',
* dccp_write_xmit() returns without further action;
* after some time the wait-condition for CCID becomes true,
* that CCID schedules the tasklet,
* tasklet function calls ccid_hc_tx_send_packet() via dccp_write_xmit(),
* since the wait-condition is now true, ccid_hc_tx_packet() returns "send now",
* packet is sent, and possibly more (since dccp_write_xmit() loops).Code reuse: the taskled function calls dccp_write_xmit(), the timer function
reduces to a wrapper around the same code.Signed-off-by: Gerrit Renker
Signed-off-by: David S. Miller
13 Apr, 2010
1 commit
-
With latest CONFIG_PROVE_RCU stuff, I felt more comfortable to make this
work.sk->sk_dst_cache is currently protected by a rwlock (sk_dst_lock)
This rwlock is readlocked for a very small amount of time, and dst
entries are already freed after RCU grace period. This calls for RCU
again :)This patch converts sk_dst_lock to a spinlock, and use RCU for readers.
__sk_dst_get() is supposed to be called with rcu_read_lock() or if
socket locked by user, so use appropriate rcu_dereference_check()
condition (rcu_read_lock_held() || sock_owned_by_user(sk))This patch avoids two atomic ops per tx packet on UDP connected sockets,
for example, and permits sk_dst_lock to be much less dirtied.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
21 Oct, 2009
1 commit
-
dst_negative_advice() should check for changed dst and reset
sk_tx_queue_mapping accordingly. Pass sock to the callers of
dst_negative_advice.(sk_reset_txq is defined just for use by dst_negative_advice. The
only way I could find to get around this is to move dst_negative_()
from dst.h to dst.c, include sock.h in dst.c, etc)Signed-off-by: Krishna Kumar
Signed-off-by: David S. Miller