20 Jul, 2010
1 commit
-
It can happen that there are no packets in queue while calling
tcp_xmit_retransmit_queue(). tcp_write_queue_head() then returns
NULL and that gets deref'ed to get sacked into a local var.There is no work to do if no packets are outstanding so we just
exit early.This oops was introduced by 08ebd1721ab8fd (tcp: remove tp->lost_out
guard to make joining diff nicer).Signed-off-by: Ilpo Järvinen
Reported-by: Lennart Schulte
Tested-by: Lennart Schulte
Signed-off-by: David S. Miller
16 Jul, 2010
1 commit
-
This was detected using two mcast router tables. The
pimreg for the second interface did not have a specific
mrule, so packets received by it were handled by the
default table, which had nothing configured.This caused the ipmr_fib_lookup to fail, causing
the memory leak.Signed-off-by: Ben Greear
Signed-off-by: David S. Miller
15 Jul, 2010
1 commit
-
rfs: call sock_rps_record_flow() in tcp_splice_read()
call sock_rps_record_flow() in tcp_splice_read(), so the applications using
splice(2) or sendfile(2) can utilize RFS.Signed-off-by: Changli Gao
----
net/ipv4/tcp.c | 1 +
1 file changed, 1 insertion(+)
Signed-off-by: David S. Miller
05 Jul, 2010
1 commit
-
While using xfrm by MARK feature in
2.6.34 - 2.6.35 kernels, the mark
is always cleared in flowi structure via memset in
_decode_session4 (net/ipv4/xfrm4_policy.c), so
the policy lookup fails.
IPv6 code is affected by this bug too.Signed-off-by: Peter Kosyh
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
22 Jun, 2010
1 commit
-
It has been reported that the new UFO software fallback path
fails under certain conditions with NFS. I tracked the problem
down to the generation of UFO packets that are smaller than the
MTU. The software fallback path simply discards these packets.This patch fixes the problem by not generating such packets on
the UFO path.Signed-off-by: Herbert Xu
Reviewed-by: Michael S. Tsirkin
Signed-off-by: David S. Miller
07 Jun, 2010
1 commit
-
ipmr_rules_exit() and ip6mr_rules_exit() free a list of items, but
forget to properly remove these items from list. List head is not
changed and still points to freed memory.This can trigger a fault later when icmpv6_sk_exit() is called.
Fix is to either reinit list, or use list_del() to properly remove items
from list before freeing them.bugzilla report : https://bugzilla.kernel.org/show_bug.cgi?id=16120
Introduced by commit d1db275dd3f6e4 (ipv6: ip6mr: support multiple
tables) and commit f0ad0860d01e (ipv4: ipmr: support multiple tables)Reported-by: Alex Zhavnerchik
Signed-off-by: Eric Dumazet
CC: Patrick McHardy
Signed-off-by: David S. Miller
05 Jun, 2010
3 commits
-
Its better to make a route lookup in appropriate namespace.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
I believe a moderate SYN flood attack can corrupt RFS flow table
(rps_sock_flow_table), making RPS/RFS much less effective.Even in a normal situation, server handling short lived sessions suffer
from bad steering for the first data packet of a session, if another SYN
packet is received for another session.We do following action in tcp_v4_rcv() :
sock_rps_save_rxhash(sk, skb->rxhash);
We should _not_ do this if sk is a LISTEN socket, as about each
packet received on a LISTEN socket has a different rxhash than
previous one.
-> RPS_NO_CPU markers are spread all over rps_sock_flow_table.Also, it makes sense to protect sk->rxhash field changes with socket
lock (We currently can change it even if user thread owns the lock
and might use rxhash)This patch moves sock_rps_save_rxhash() to a sock locked section,
and only for non LISTEN sockets.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
syncookies default to on since
e994b7c901ded7200b525a707c6da71f2cf6d4bb
(tcp: Don't make syn cookies initial setting depend on CONFIG_SYSCTL).Signed-off-by: Florian Westphal
Signed-off-by: David S. Miller
02 Jun, 2010
1 commit
-
For large values of rtt, 2^rho operation may overflow u32. Clamp down the increment to 2^16.
Signed-off-by: Daniele Lacamera
Signed-off-by: David S. Miller
01 Jun, 2010
3 commits
-
Commit: c720c7e8383aff1cb219bddf474ed89d850336e3 missed these.
Signed-off-by: Joe Perches
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller -
Correct sk_forward_alloc handling for error_queue would need to use a
backlog of frames that softirq handler could not deliver because socket
is owned by user thread. Or extend backlog processing to be able to
process normal and error packets.Another possibility is to not use mem charge for error queue, this is
what I implemented in this patch.Note: this reverts commit 29030374
(net: fix sk_forward_alloc corruptions), since we dont need to lock
socket anymore.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
31 May, 2010
2 commits
-
commit f3c5c1bfd4 (netfilter: xtables: make ip_tables reentrant)
introduced a performance regression, because stackptr array is shared by
all cpus, adding cache line ping pongs. (16 cpus share a 64 bytes cache
line)Fix this using alloc_percpu()
Signed-off-by: Eric Dumazet
Acked-By: Jan Engelhardt
Signed-off-by: Patrick McHardy
29 May, 2010
2 commits
-
As David found out, sock_queue_err_skb() should be called with socket
lock hold, or we risk sk_forward_alloc corruption, since we use non
atomic operations to update this field.This patch adds bh_lock_sock()/bh_unlock_sock() pair to three spots.
(BH already disabled)1) skb_tstamp_tx()
2) Before calling ip_icmp_error(), in __udp4_lib_err()
3) Before calling ipv6_icmp_error(), in __udp6_lib_err()Reported-by: Anton Blanchard
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
netlink: bug fix: wrong size was calculated for vfinfo list blob
netlink: bug fix: don't overrun skbs on vf_port dump
xt_tee: use skb_dst_drop()
netdev/fec: fix ifconfig eth0 down hang issue
cnic: Fix context memory init. on 5709.
drivers/net: Eliminate a NULL pointer dereference
drivers/net/hamradio: Eliminate a NULL pointer dereference
be2net: Patch removes redundant while statement in loop.
ipv6: Add GSO support on forwarding path
net: fix __neigh_event_send()
vhost: fix the memory leak which will happen when memory_access_ok fails
vhost-net: fix to check the return value of copy_to/from_user() correctly
vhost: fix to check the return value of copy_to/from_user() correctly
vhost: Fix host panic if ioctl called with wrong index
net: fix lock_sock_bh/unlock_sock_bh
net/iucv: Add missing spin_unlock
net: ll_temac: fix checksum offload logic
net: ll_temac: fix interrupt bug when interrupt 0 is used
sctp: dubious bitfields in sctp_transport
ipmr: off by one in __ipmr_fill_mroute()
...
27 May, 2010
1 commit
-
This new sock lock primitive was introduced to speedup some user context
socket manipulation. But it is unsafe to protect two threads, one using
regular lock_sock/release_sock, one using lock_sock_bh/unlock_sock_bhThis patch changes lock_sock_bh to be careful against 'owned' state.
If owned is found to be set, we must take the slow path.
lock_sock_bh() now returns a boolean to say if the slow path was taken,
and this boolean is used at unlock_sock_bh time to call the appropriate
unlock function.After this change, BH are either disabled or enabled during the
lock_sock_bh/unlock_sock_bh protected section. This might be misleading,
so we rename these functions to lock_sock_fast()/unlock_sock_fast().Reported-by: Anton Blanchard
Signed-off-by: Eric Dumazet
Tested-by: Anton Blanchard
Signed-off-by: David S. Miller
26 May, 2010
1 commit
-
This fixes a smatch warning:
net/ipv4/ipmr.c +1917 __ipmr_fill_mroute(12) error: buffer overflow
'(mrt)->vif_table' 32
Signed-off-by: David S. Miller
25 May, 2010
1 commit
-
- C99 knows about USHRT_MAX/SHRT_MAX/SHRT_MIN, not
USHORT_MAX/SHORT_MAX/SHORT_MIN.- Make SHRT_MIN of type s16, not int, for consistency.
[akpm@linux-foundation.org: fix drivers/dma/timb_dma.c]
[akpm@linux-foundation.org: fix security/keys/keyring.c]
Signed-off-by: Alexey Dobriyan
Acked-by: WANG Cong
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
21 May, 2010
2 commits
-
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1674 commits)
qlcnic: adding co maintainer
ixgbe: add support for active DA cables
ixgbe: dcb, do not tag tc_prio_control frames
ixgbe: fix ixgbe_tx_is_paused logic
ixgbe: always enable vlan strip/insert when DCB is enabled
ixgbe: remove some redundant code in setting FCoE FIP filter
ixgbe: fix wrong offset to fc_frame_header in ixgbe_fcoe_ddp
ixgbe: fix header len when unsplit packet overflows to data buffer
ipv6: Never schedule DAD timer on dead address
ipv6: Use POSTDAD state
ipv6: Use state_lock to protect ifa state
ipv6: Replace inet6_ifaddr->dead with state
cxgb4: notify upper drivers if the device is already up when they load
cxgb4: keep interrupts available when the ports are brought down
cxgb4: fix initial addition of MAC address
cnic: Return SPQ credit to bnx2x after ring setup and shutdown.
cnic: Convert cnic_local_flags to atomic ops.
can: Fix SJA1000 command register writes on SMP systems
bridge: fix build for CONFIG_SYSFS disabled
ARCNET: Limit com20020 PCI ID matches for SOHARD cards
...Fix up various conflicts with pcmcia tree drivers/net/
{pcmcia/3c589_cs.c, wireless/orinoco/orinoco_cs.c and
wireless/orinoco/spectrum_cs.c} and feature removal
(Documentation/feature-removal-schedule.txt).Also fix a non-content conflict due to pm_qos_requirement getting
renamed in the PM tree (now pm_qos_request) in net/mac80211/scan.c -
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits)
vlynq: make whole Kconfig-menu dependant on architecture
add descriptive comment for TIF_MEMDIE task flag declaration.
EEPROM: max6875: Header file cleanup
EEPROM: 93cx6: Header file cleanup
EEPROM: Header file cleanup
agp: use NULL instead of 0 when pointer is needed
rtc-v3020: make bitfield unsigned
PCI: make bitfield unsigned
jbd2: use NULL instead of 0 when pointer is needed
cciss: fix shadows sparse warning
doc: inode uses a mutex instead of a semaphore.
uml: i386: Avoid redefinition of NR_syscalls
fix "seperate" typos in comments
cocbalt_lcdfb: correct sections
doc: Change urls for sparse
Powerpc: wii: Fix typo in comment
i2o: cleanup some exit paths
Documentation/: it's -> its where appropriate
UML: Fix compiler warning due to missing task_struct declaration
UML: add kernel.h include to signal.c
...
18 May, 2010
8 commits
-
This patch removes from net/ (but not any netfilter files)
all the unnecessary return; statements that precede the
last closing brace of void functions.It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'Signed-off-by: Joe Perches
Signed-off-by: David S. Miller -
skb rxhash should be cleared when a skb is handled by a tunnel before
being delivered again, so that correct packet steering can take place.There are other cleanups and accounting that we can factorize in a new
helper, skb_tunnel_rx()Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Commit 33ad798c924b4a (tcp: options clean up) introduced a problem
if MD5+SACK+timestamps were used in initial SYN message.Some stacks (old linux for example) try to negotiate MD5+SACK+TSTAMP
sessions, but since 40 bytes of tcp options space are not enough to
store all the bits needed, we chose to disable timestamps in this case.We send a SYN-ACK _without_ timestamp option, but socket has timestamps
enabled and all further outgoing messages contain a TS block, all with
the initial timestamp of the remote peer.Fix is to really disable timestamps option for the whole session.
Reported-by: Bijay Singh
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Also added an explicit break; to avoid
a fallthrough in net/ipv4/tcp_input.cSigned-off-by: Joe Perches
Signed-off-by: David S. Miller -
TCP outgoing packets can avoid two atomic ops, and dirtying
of previously higly contended cache line using new refdst
infrastructure.Note 1: loopback device excluded because of !IFF_XMIT_DST_RELEASE
Note 2: UDP packets dsts are built before ip_queue_xmit().Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Use ip_route_input_noref() in ip fast path, to avoid two atomic ops per
incoming packet.Note: loopback is excluded from this optimization in ip_rcv_finish()
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
ip_route_input() is the version returning a refcounted dst, while
ip_route_input_noref() returns a non refcounted one.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
Use low order bit of skb->_skb_dst to tell dst is not refcounted.
Change _skb_dst to _skb_refdst to make sure all uses are catched.
skb_dst() returns the dst, regardless of noref bit set or not, but
with a lockdep check to make sure a noref dst is not given if current
user is not rcu protected.New skb_dst_set_noref() helper to set an notrefcounted dst on a skb.
(with lockdep check)skb_dst_drop() drops a reference only if skb dst was refcounted.
skb_dst_force() helper is used to force a refcount on dst, when skb
is queued and not anymore RCU protected.Use skb_dst_force() in __sk_add_backlog(), __dev_xmit_skb() if
!IFF_XMIT_DST_RELEASE or skb enqueued on qdisc queue, in
sock_queue_rcv_skb(), in __nf_queue().Use skb_dst_force() in dev_requeue_skb().
Note: dst_use_noref() still dirties dst, we might transform it
later to do one dirtying per jiffies.Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
17 May, 2010
1 commit
-
Conflicts:
include/linux/if_link.h
16 May, 2010
3 commits
-
TCP-MD5 sessions have intermittent failures, when route cache is
invalidated. ip_queue_xmit() has to find a new route, calls
sk_setup_caps(sk, &rt->u.dst), destroying thesk->sk_route_caps &= ~NETIF_F_GSO_MASK
that MD5 desperately try to make all over its way (from
tcp_transmit_skb() for example)So we send few bad packets, and everything is fine when
tcp_transmit_skb() is called again for this socket.Since ip_queue_xmit() is at a lower level than TCP-MD5, I chose to use a
socket field, sk_route_nocaps, containing bits to mask on sk_route_caps.Reported-by: Bhaskar Dutta
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
TCP MD5 support uses percpu data for temporary storage. It currently
disables preemption so that same storage cannot be reclaimed by another
thread on same cpu.We also have to make sure a softirq handler wont try to use also same
context. Various bug reports demonstrated corruptions.Fix is to disable preemption and BH.
Reported-by: Bhaskar Dutta
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
(Dropped the infiniband part, because Tetsuo modified the related code,
I will send a separate patch for it once this is accepted.)This patch introduces /proc/sys/net/ipv4/ip_local_reserved_ports which
allows users to reserve ports for third-party applications.The reserved ports will not be used by automatic port assignments
(e.g. when calling connect() or bind() with port number 0). Explicit
port allocation behavior is unchanged.Signed-off-by: Octavian Purdila
Signed-off-by: WANG Cong
Cc: Neil Horman
Cc: Eric Dumazet
Cc: Eric W. Biederman
Signed-off-by: David S. Miller
14 May, 2010
1 commit
13 May, 2010
3 commits
-
This patch removes from net/ netfilter files
all the unnecessary return; statements that precede the
last closing brace of void functions.It does not remove the returns that are immediately
preceded by a label as gcc doesn't like that.Done via:
$ grep -rP --include=*.[ch] -l "return;\n}" net/ | \
xargs perl -i -e 'local $/ ; while (<>) { s/\n[ \t\n]+return;\n}/\n}/g; print; }'Signed-off-by: Joe Perches
[Patrick: changed to keep return statements in otherwise empty function bodies]
Signed-off-by: Patrick McHardy -
Make sure all printk messages have a severity level.
Signed-off-by: Stephen Hemminger
Signed-off-by: Patrick McHardy -
Change netfilter asserts to standard WARN_ON. This has the
benefit of backtrace info and also causes netfilter errors
to show up on kerneloops.org.Signed-off-by: Stephen Hemminger
Signed-off-by: Patrick McHardy
12 May, 2010
2 commits
-
Conflicts:
Documentation/feature-removal-schedule.txt
drivers/net/wireless/ath/ar9170/usb.c
drivers/scsi/iscsi_tcp.c
net/ipv4/ipmr.c