02 Sep, 2010
1 commit
-
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller
28 Aug, 2010
1 commit
-
The string clone is only used as a temporary copy of the argument val
within the while loop, and so it should be freed before leaving the
function. The call to strsep, however, modifies clone, so a pointer to the
front of the string is kept in saved_clone, to make it possible to free it.

The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)

//
@r exists@
local idexpression x;
expression E;
identifier l;
statement S;
@@

*x= \(kasprintf\|kstrdup\)(...);
...
if (x == NULL) S
... when != kfree(x)
    when != E = x
if (...) {
* return ...;
}
//

Signed-off-by: Julia Lawall
Signed-off-by: David S. Miller
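
The saved-pointer idiom described in this commit can be illustrated in userspace. This is a minimal sketch, not the patched kernel function: it uses strdup()/free() in place of kstrdup()/kfree(), and the helper name count_tokens() is hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes strsep() and strdup() on glibc */
#endif
#include <stdlib.h>
#include <string.h>

/* Count the non-empty comma-separated tokens in val without modifying val. */
static int count_tokens(const char *val)
{
	char *clone = strdup(val);    /* temporary, writable copy */
	char *saved_clone = clone;    /* strsep() advances clone, so keep the start */
	char *tok;
	int n = 0;

	if (clone == NULL)
		return -1;

	while ((tok = strsep(&clone, ",")) != NULL) {
		if (*tok != '\0')
			n++;
	}

	/* clone is NULL once strsep() runs out of tokens; free the saved start. */
	free(saved_clone);
	return n;
}
```

Calling free(clone) after the loop would free nothing, because strsep() has already set clone to NULL; that is why the copy leaks unless the original pointer is saved.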
26 Aug, 2010
2 commits
-
This issue comes from the Ruby language community. The test program below
hangs only when run on Linux.

% uname -mrsv
Linux 2.6.26-2-486 #1 Sat Dec 26 08:37:39 UTC 2009 i686
% ruby -rsocket -ve '
BasicSocket.do_not_reverse_lookup = true
serv = TCPServer.open("127.0.0.1", 0)
s1 = TCPSocket.open("127.0.0.1", serv.addr[1])
s2 = serv.accept
s2.close
s1.write("a") rescue p $!
s1.write("a") rescue p $!
Thread.new {
s1.write("a")
}.join'
ruby 1.9.3dev (2010-07-06 trunk 28554) [i686-linux]
#
[Hang Here]

FreeBSD, Solaris and Mac don't hang. Linux hangs because Ruby's write()
method calls select() internally, and tcp_poll() has a bug.

SUS defines 'ready for writing' for select() as follows:
| A descriptor shall be considered ready for writing when a call to an output
| function with O_NONBLOCK clear would not block, whether or not the function
| would transfer data successfully.

That said, the EPIPE situation is clearly one of 'ready for writing'.
We don't have a read-side issue because tcp_poll() already takes care of
read-side shutdown:

| if (sk->sk_shutdown & RCV_SHUTDOWN)
|         mask |= POLLIN | POLLRDNORM | POLLRDHUP;

So, let's insert the same logic on the write side.
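
The rule argued for above can be modelled outside the kernel. The following is a simplified sketch, not the tcp_poll() code: struct sock_model, the RCV_SHUT/SEND_SHUT bits and the EV_* bits are hypothetical stand-ins for the kernel's socket state and poll flags.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's shutdown and poll bits. */
enum { RCV_SHUT = 1, SEND_SHUT = 2 };
enum { EV_IN = 1, EV_OUT = 4, EV_RDHUP = 0x2000 };

struct sock_model {
	unsigned shutdown;    /* RCV_SHUT | SEND_SHUT bits */
	bool     has_data;    /* receive queue is non-empty */
	bool     has_wspace;  /* send buffer has free space */
};

static unsigned poll_mask(const struct sock_model *sk)
{
	unsigned mask = 0;

	/* Read side: after a receive shutdown, a read would not block. */
	if (sk->shutdown & RCV_SHUT)
		mask |= EV_IN | EV_RDHUP;
	if (sk->has_data)
		mask |= EV_IN;

	/* Write side: free space, or a write that fails at once (EPIPE),
	 * both count as "would not block". */
	if (sk->has_wspace || (sk->shutdown & SEND_SHUT))
		mask |= EV_OUT;

	return mask;
}
```

Without the SEND_SHUT term, a socket whose send side is shut down and whose buffer is full never reports writable, which is the hang the Ruby test program runs into.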
- reference url
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31065
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/31068

Signed-off-by: KOSAKI Motohiro
Signed-off-by: David S. Miller
-
As discovered by Anton Blanchard, the current code to autotune
tcp_death_row.sysctl_max_tw_buckets, sysctl_tcp_max_orphans and
sysctl_max_syn_backlog makes little sense.

The bigger a page is, the smaller tcp_max_orphans is: 4096 on a 512GB
machine in Anton's case.

(tcp_hashinfo.bhash_size * sizeof(struct inet_bind_hashbucket))
is much bigger if spinlock debugging is on. It's wrong to select bigger
limits in this case (where kernel structures are also bigger).

bhash_size max is 65536, and we get this value even for small machines.

A better ground is to use the size of the ehash table; this also makes the
code shorter and more obvious.

Based on a patch from Anton, and another from David.
Reported-and-tested-by: Anton Blanchard
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
25 Aug, 2010
1 commit
-
As reported by Anton Blanchard when we use
percpu_counter_read_positive() to make our orphan socket limit checks,
the check can be off by up to num_cpus_online() * batch (which is 32
by default) which on a 128 cpu machine can be as large as the default
orphan limit itself.

Fix this by doing the full expensive sum check if the optimized check
triggers.

Reported-by: Anton Blanchard
Signed-off-by: David S. Miller
Acked-by: Eric Dumazet
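
The "cheap check first, exact sum only if it triggers" pattern can be sketched with a toy batched counter. This is a userspace model, not the kernel's percpu_counter: struct pcpu_counter, counter_add() and too_many() are hypothetical names, and NCPUS/BATCH simply reuse the 128 and 32 quoted above.

```c
#include <stdbool.h>

#define NCPUS 128   /* machine size mentioned above */
#define BATCH 32    /* default batch quoted above */

struct pcpu_counter {
	long count;          /* shared value, updated only in batches */
	long local[NCPUS];   /* per-CPU deltas not yet folded in */
};

static void counter_add(struct pcpu_counter *c, int cpu, long delta)
{
	long v = c->local[cpu] + delta;

	if (v >= BATCH || v <= -BATCH) {   /* fold a full batch into count */
		c->count += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Cheap read: ignores per-CPU deltas, so it can be off by NCPUS * BATCH. */
static long counter_read_positive(const struct pcpu_counter *c)
{
	return c->count > 0 ? c->count : 0;
}

/* Exact read: walks every CPU. */
static long counter_sum_positive(const struct pcpu_counter *c)
{
	long sum = c->count;

	for (int i = 0; i < NCPUS; i++)
		sum += c->local[i];
	return sum > 0 ? sum : 0;
}

/* Only pay for the exact sum when the cheap read says the limit is hit. */
static bool too_many(const struct pcpu_counter *c, long limit)
{
	if (counter_read_positive(c) <= limit)
		return false;
	return counter_sum_positive(c) > limit;
}
```

The common case stays cheap, and a false "limit exceeded" caused by stale per-CPU deltas is filtered out by the second, exact check.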
24 Aug, 2010
1 commit
-
commit f3c5c1bfd430858d3a05436f82c51e53104feb6b
(netfilter: xtables: make ip_tables reentrant) forgot to
also compute the jumpstack size in the compat handlers.

Result is that "iptables -I INPUT -j userchain" turns into -j DROP.

Reported by Sebastian Roesner on #netfilter, closes
http://bugzilla.netfilter.org/show_bug.cgi?id=669.

Note: arptables change is compile-tested only.
Signed-off-by: Florian Westphal
Acked-by: Eric Dumazet
Tested-by: Mikael Pettersson
Signed-off-by: David S. Miller
18 Aug, 2010
1 commit
-
After commit 24b36f019 (netfilter: {ip,ip6,arp}_tables: dont block
bottom half more than necessary), lockdep can raise a warning
because we attempt to lock a spinlock with BH enabled, while
the same lock is usually locked by another cpu in a softirq context.

Disable BH again to avoid these lockdep warnings.
Reported-by: Linus Torvalds
Diagnosed-by: David S. Miller
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
08 Aug, 2010
1 commit
-
tcp_parse_md5sig_option doesn't check the md5sig option (TCPOPT_MD5SIG)
length, but tcp_v[46]_inbound_md5_hash assumes that it's at least 16
bytes long.

Signed-off-by: Dmitry Popov
Signed-off-by: David S. Miller
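
A length-checked option walk of the kind this fix calls for might look as follows. This is a standalone sketch, not the kernel's tcp_parse_md5sig_option(): find_md5sig() is a hypothetical helper, and the constants are the option kind and total length from RFC 2385.

```c
#include <stddef.h>
#include <stdint.h>

#define TCPOPT_EOL      0
#define TCPOPT_NOP      1
#define TCPOPT_MD5SIG   19   /* RFC 2385 option kind */
#define TCPOLEN_MD5SIG  18   /* kind + length + 16-byte digest */

/* Return a pointer to the 16-byte digest, or NULL if absent or malformed. */
static const uint8_t *find_md5sig(const uint8_t *opt, size_t len)
{
	while (len > 0) {
		uint8_t kind = opt[0];
		uint8_t size;

		if (kind == TCPOPT_EOL)
			return NULL;
		if (kind == TCPOPT_NOP) {      /* single-byte padding option */
			opt++;
			len--;
			continue;
		}
		if (len < 2)
			return NULL;
		size = opt[1];
		if (size < 2 || size > len)    /* silly or truncated option */
			return NULL;
		if (kind == TCPOPT_MD5SIG)     /* only accept the exact length */
			return size == TCPOLEN_MD5SIG ? opt + 2 : NULL;
		opt += size;
		len -= size;
	}
	return NULL;
}
```

A caller that receives a non-NULL pointer can then read 16 digest bytes safely, which is the assumption the inbound hash check relies on.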
03 Aug, 2010
4 commits
-
Conflicts:
drivers/net/e1000e/hw.h
net/bridge/br_device.c
net/bridge/br_input.c
-
6c79bf0f2440fd250c8fce8d9b82fcf03d4e8350 subtracts PPPOE_SES_HLEN from mtu at
the front of ip_fragment(). So the later subtraction should be removed. The
MTU of 802.1q is also 1500, so MTU should not be changed.

Signed-off-by: Changli Gao
Signed-off-by: Bart De Schuymer
----
net/ipv4/ip_output.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
Signed-off-by: Bart De Schuymer
Signed-off-by: David S. Miller
-
Initial TCP thin-stream commit did not add getsockopt support for the new
socket options: TCP_THIN_LINEAR_TIMEOUTS and TCP_THIN_DUPACK. This adds support
for them.

Signed-off-by: Josh Hunt
Tested-by: Andreas Petlund
Acked-by: Andreas Petlund
Signed-off-by: David S. Miller
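
From userspace, the added getsockopt support means a value set with setsockopt() can be read back. The sketch below assumes a Linux system; thin_roundtrip() is a hypothetical helper, and the fallback option numbers are the values from the kernel's linux/tcp.h for C libraries whose headers predate them.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Values from the kernel's linux/tcp.h, for libcs that predate them. */
#ifndef TCP_THIN_LINEAR_TIMEOUTS
#define TCP_THIN_LINEAR_TIMEOUTS 16
#endif
#ifndef TCP_THIN_DUPACK
#define TCP_THIN_DUPACK 17
#endif

/* Set one thin-stream option and read it back.
 * Returns the value reported by getsockopt(), or -1 on any failure. */
static int thin_roundtrip(int optname, int on)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int val = -1;
	socklen_t len = sizeof(val);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, optname, &on, sizeof(on)) < 0 ||
	    getsockopt(fd, IPPROTO_TCP, optname, &val, &len) < 0)
		val = -1;
	close(fd);
	return val;
}
```

On a kernel that does not implement the getsockopt side, or in an environment where sockets cannot be created, the helper returns -1.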
02 Aug, 2010
4 commits
-
The tuple obtained from unique_tuple() doesn't need to be really unique, so
the check for the unique tuple isn't necessary when there isn't any other
choice. Eliminating the unnecessary nf_nat_used_tuple() can save some CPU
cycles too.

Signed-off-by: Changli Gao
Signed-off-by: Patrick McHardy
-
The only user of unique_tuple(), get_unique_tuple(), doesn't care about the
return value of unique_tuple(), so make unique_tuple() return void (nothing).

Signed-off-by: Changli Gao
Signed-off-by: Patrick McHardy
-
Use local variable hdrlen instead of ip_hdrlen(skb).
Signed-off-by: Changli Gao
Signed-off-by: Patrick McHardy
-
We currently disable BH for the whole duration of get_counters().
On machines with a lot of cpus and large tables, this might be too long.
We can disable preemption during the whole function, and disable BH only
while fetching counters for the current cpu.

Signed-off-by: Eric Dumazet
Signed-off-by: Patrick McHardy
31 Jul, 2010
1 commit
-
There is a bug in do_tcp_setsockopt() (net/ipv4/tcp.c), in the
TCP_COOKIE_TRANSACTIONS case.
In some cases (when tp->cookie_values == NULL) a new tcp_cookie_values
structure can be allocated (at cvp), but not bound to
tp->cookie_values. So a memory leak occurs.

Signed-off-by: Dmitry Popov
Signed-off-by: David S. Miller
23 Jul, 2010
4 commits
-
Use skb->len for accounting as xt_quota does.
Signed-off-by: Changli Gao
Signed-off-by: Patrick McHardy
-
use arp_hdr_len().
Signed-off-by: Changli Gao
Signed-off-by: Patrick McHardy
-
proto->unique_tuple() will be called in the end if the previous calls fail.
This patch checks the false condition of
(range->flags & IP_NAT_RANGE_PROTO_RANDOM) instead, to avoid a duplicate
line of code: proto->unique_tuple().

Signed-off-by: Changli Gao
Signed-off-by: Patrick McHardy
-
Add a new rt attribute, RTA_MARK, and use it in
rt_fill_info()/inet_rtm_getroute() to support the following commands:

ip route get 192.168.20.110 mark NUMBER
ip route get 192.168.20.108 from 192.168.20.110 iif eth1 mark NUMBER
ip route list cache [192.168.20.110] mark NUMBER

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
22 Jul, 2010
1 commit
-
Network code uses the __packed macro instead of __attribute__((packed)).
Signed-off-by: Gustavo F. Padovan
Signed-off-by: David S. Miller
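
For readers unfamiliar with the macro, the sketch below shows what the shorthand expands to and what it changes. It is a userspace illustration with hypothetical struct names; the macro is defined locally because it comes from kernel headers, and it assumes a GCC-compatible compiler.

```c
#include <stdint.h>

/* Same spelling as the kernel's shorthand; defined here for userspace. */
#ifndef __packed
#define __packed __attribute__((packed))
#endif

struct hdr_padded {
	uint8_t  type;
	uint32_t seq;   /* compiler normally inserts padding before this field */
};

struct hdr_packed {
	uint8_t  type;
	uint32_t seq;   /* no padding: 1 + 4 bytes, as on the wire */
} __packed;
```

With the attribute, sizeof(struct hdr_packed) is 5, while the unpacked layout is typically 8 on common ABIs.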
21 Jul, 2010
1 commit
-
Conflicts:
drivers/vhost/net.c
net/bridge/br_device.c

Fix merge conflict in drivers/vhost/net.c with guidance from
Stephen Rothwell.

Revert the effects of net-2.6 commit 573201f36fd9c7c6d5218cdcd9948cee700b277d
since net-next-2.6 has fixes that make bridge netpoll work properly thus
we don't need it disabled.

Signed-off-by: David S. Miller
20 Jul, 2010
1 commit
-
It can happen that there are no packets in queue while calling
tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
NULL and that gets deref'ed to get sacked into a local var.

There is no work to do if no packets are outstanding so we just
exit early.

This oops was introduced by 08ebd1721ab8fd (tcp: remove tp->lost_out
guard to make joining diff nicer).

Signed-off-by: Ilpo Järvinen
Reported-by: Lennart Schulte
Tested-by: Lennart Schulte
Signed-off-by: David S. Miller
16 Jul, 2010
1 commit
-
This was detected using two mcast router tables. The
pimreg for the second interface did not have a specific
mrule, so packets received by it were handled by the
default table, which had nothing configured.

This caused the ipmr_fib_lookup to fail, causing
the memory leak.

Signed-off-by: Ben Greear
Signed-off-by: David S. Miller
15 Jul, 2010
1 commit
-
rfs: call sock_rps_record_flow() in tcp_splice_read()
Call sock_rps_record_flow() in tcp_splice_read(), so the applications using
splice(2) or sendfile(2) can utilize RFS.

Signed-off-by: Changli Gao
----
net/ipv4/tcp.c | 1 +
1 file changed, 1 insertion(+)
Signed-off-by: David S. Miller
13 Jul, 2010
2 commits
-
A new boolean flag no_autobind is added to structure proto to avoid the autobind
calls when the protocol is TCP. Then sock_rps_record_flow() is called in
TCP's sendmsg() and sendpage() paths.

Signed-off-by: Changli Gao
----
include/net/inet_common.h | 4 ++++
include/net/sock.h | 1 +
include/net/tcp.h | 8 ++++----
net/ipv4/af_inet.c | 15 +++++++++------
net/ipv4/tcp.c | 11 +++++------
net/ipv4/tcp_ipv4.c | 3 +++
net/ipv6/af_inet6.c | 8 ++++----
net/ipv6/tcp_ipv6.c | 3 +++
8 files changed, 33 insertions(+), 20 deletions(-)
Signed-off-by: David S. Miller
-
CodingStyle cleanups
EXPORT_SYMBOL should immediately follow the symbol declaration.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
09 Jul, 2010
1 commit
-
This patch makes IPV6 over IPv4 GRE tunnel propagate the transport
class field from the underlying IPV6 header to the IPV4 Type Of Service
field. Without the patch, all IPV6 packets in tunnel look the same to QoS.

This assumes that IPV6 transport class is exactly the same
as IPv4 TOS. Not sure if that is always the case? Maybe need
to mask off some bits.

The mask and shift to get tclass is copied from ipv6/datagram.c
Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller
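
The mask and shift mentioned above follow from the IPv6 header layout: the first 32 bits hold the version (4 bits), the Traffic Class (8 bits, the field this commit calls transport class) and the flow label (20 bits). A minimal sketch, with hypothetical helper names and not the kernel code itself:

```c
#include <stdint.h>

/* First 32 bits of an IPv6 header, already converted to host byte order:
 * version (4 bits) | traffic class (8 bits) | flow label (20 bits). */
static uint8_t ipv6_tclass(uint32_t first_word_host_order)
{
	return (first_word_host_order >> 20) & 0xff;
}

/* Same extraction straight from the wire bytes, no byte swapping needed. */
static uint8_t ipv6_tclass_from_bytes(const uint8_t *hdr)
{
	return (uint8_t)((hdr[0] << 4) | (hdr[1] >> 4));
}
```

The resulting byte is what gets copied into the outer IPv4 TOS field; whether some bits should be masked off first is the open question raised in the commit message.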
08 Jul, 2010
2 commits
-
Removal of unused integer variable in ip_fragment().
Signed-off-by: George Kadianakis
Signed-off-by: David S. Miller
06 Jul, 2010
1 commit
-
Avoid touching dst refcount in ip_fragment().
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
05 Jul, 2010
3 commits
-
We can avoid a pair of atomic ops in ipt_REJECT send_reset()
Signed-off-by: Eric Dumazet
Signed-off-by: Patrick McHardy
-
Postpone the checksum calculation; then, if the output NIC supports checksum
offloading, we can utilize it. And even if the output NIC doesn't support
checksum offloading, when we mangle this packet this frees us from
updating the checksum, as the checksum calculation occurs later.

Signed-off-by: Changli Gao
Signed-off-by: Patrick McHardy
-
While using xfrm by MARK feature in
2.6.34 - 2.6.35 kernels, the mark
is always cleared in flowi structure via memset in
_decode_session4 (net/ipv4/xfrm4_policy.c), so
the policy lookup fails.
IPv6 code is affected by this bug too.

Signed-off-by: Peter Kosyh
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
03 Jul, 2010
1 commit
01 Jul, 2010
2 commits
-
add fast path for in-order fragments
As the fragments are sent in order by most OSes, such as Windows, Darwin and
FreeBSD, it is likely the new fragments are at the end of the inet_frag_queue.
In the fast path, we check if the skb at the end of the inet_frag_queue is the
prev we expect.

Signed-off-by: Changli Gao
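
The fast path described in this commit can be sketched on a plain singly linked list with a tail pointer. This is a simplified model under stated assumptions: struct frag, struct frag_queue and frag_insert() are hypothetical names, and overlap handling, which real reassembly needs, is left out.

```c
#include <stddef.h>

struct frag {
	int offset;           /* byte offset of this fragment in the datagram */
	struct frag *next;
};

struct frag_queue {
	struct frag *head;
	struct frag *tail;    /* last fragment, kept for the fast path */
};

/* Insert f in offset order; returns 1 if the fast path was taken, else 0. */
static int frag_insert(struct frag_queue *q, struct frag *f)
{
	struct frag *prev = q->tail, *next = NULL;
	int fast = 1;

	/* Fast path: empty queue, or f belongs after the current tail. */
	if (prev != NULL && prev->offset >= f->offset) {
		/* Slow path: walk from the head to find the insertion point. */
		fast = 0;
		prev = NULL;
		for (next = q->head; next != NULL; next = next->next) {
			if (next->offset >= f->offset)
				break;
			prev = next;
		}
	}

	f->next = next;
	if (next == NULL)
		q->tail = f;      /* f is the new last fragment */
	if (prev != NULL)
		prev->next = f;
	else
		q->head = f;
	return fast;
}
```

For senders that emit fragments in order, every insert hits the tail check and the list is never walked.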
----
include/net/inet_frag.h | 1 +
net/ipv4/ip_fragment.c | 12 ++++++++++++
net/ipv6/reassembly.c | 11 +++++++++++
3 files changed, 24 insertions(+)
Signed-off-by: David S. Miller
-
/proc/net/snmp and /proc/net/netstat expose SNMP counters.
Width of these counters is either 32 or 64 bits, depending on the size
of "unsigned long" in kernel.

This means user program parsing these files must already be prepared to
deal with 64bit values, regardless of user program being 32 or 64 bit.

This patch introduces 64bit snmp values for IPSTAT mib, where some
counters can wrap pretty fast if they are 32bit wide.

# netstat -s|egrep "InOctets|OutOctets"
InOctets: 244068329096
OutOctets: 244069348848

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
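
To see why 32-bit octet counters are a problem, it helps to compute what such a counter would have reported for the totals above. The helpers below are a hypothetical illustration of the arithmetic, not code from the patch.

```c
#include <stdint.h>

/* What a 32-bit wide counter would report for a true 64-bit total. */
static uint32_t as_32bit_counter(uint64_t total)
{
	return (uint32_t)total;          /* keeps total modulo 2^32 */
}

/* How many times a 32-bit counter wrapped before reaching total. */
static uint32_t wrap_count(uint64_t total)
{
	return (uint32_t)(total >> 32);
}
```

For the InOctets value shown above, 244068329096, a 32-bit counter would have wrapped 56 times and would display 3550160520, so a monitoring tool could not recover the real total.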
29 Jun, 2010
2 commits
-
We can pass a gfp argument to tso_fragment() and avoid GFP_ATOMIC
allocations sometimes.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
-
use this_cpu_ptr(p) instead of per_cpu_ptr(p, smp_processor_id())
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller