Eric Lee / smarc-fsl-linux-kernel

27 Sep, 2010

1 commit

1d6400c7c net/9p: fix memory handling/allocation in rdma_request()

Return -ENOMEM when erroring on kmalloc and fix memory leaks when returning on error.
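
The pattern described above (report -ENOMEM on allocation failure and release whatever was already allocated) can be sketched in userspace C. This is only an illustration: demo_req, demo_alloc(), demo_free() and the failure counter are hypothetical names, not the code of rdma_request().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical allocator hooks so that a failure can be forced. */
int demo_fail_countdown = -1;   /* -1 means never fail */
int demo_live_allocs;           /* allocations not yet freed */

void *demo_alloc(size_t n)
{
    void *p;

    if (demo_fail_countdown == 0)
        return NULL;            /* simulated allocation failure */
    if (demo_fail_countdown > 0)
        demo_fail_countdown--;
    p = malloc(n);
    if (p)
        demo_live_allocs++;
    return p;
}

void demo_free(void *p)
{
    if (p) {
        demo_live_allocs--;
        free(p);
    }
}

/* Hypothetical request with two separately allocated parts. */
struct demo_req {
    void *context;
    void *reply;
};

/* Returns 0 on success or -ENOMEM; on failure nothing stays allocated. */
int demo_request_init(struct demo_req *req, size_t ctx_len, size_t reply_len)
{
    assert(ctx_len > 0 && reply_len > 0);

    req->context = demo_alloc(ctx_len);
    if (!req->context)
        return -ENOMEM;

    req->reply = demo_alloc(reply_len);
    if (!req->reply) {
        demo_free(req->context);    /* undo the first allocation */
        req->context = NULL;
        return -ENOMEM;
    }
    return 0;
}

void demo_request_destroy(struct demo_req *req)
{
    demo_free(req->reply);
    demo_free(req->context);
    req->reply = NULL;
    req->context = NULL;
}
```

Setting demo_fail_countdown to 1 lets the first allocation succeed and the second fail, which exercises the cleanup path.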

Signed-off-by: Davidlohr Bueso
Reviewed-by: Aneesh Kumar K.V
Signed-off-by: Eric Van Hensbergen

Davidlohr Bueso
2010-09-27 20:52:50 +0800

20 Sep, 2010

1 commit

7d7dee96e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
dca: disable dca on IOAT ver.3.0 multiple-IOH platforms
netpoll: Disable IRQ around RCU dereference in netpoll_rx
sctp: Do not reset the packet during sctp_packet_config().
net/llc: storing negative error codes in unsigned short
MAINTAINERS: move atlx discussions to netdev
drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory
drivers/net/eql.c: prevent reading uninitialized stack memory
drivers/net/usb/hso.c: prevent reading uninitialized memory
xfrm: dont assume rcu_read_lock in xfrm_output_one()
r8169: Handle rxfifo errors on 8168 chips
3c59x: Remove atomic context inside vortex_{set|get}_wol
tcp: Prevent overzealous packetization by SWS logic.
net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS
phylib: fix PAL state machine restart on resume
net: use rcu_barrier() in rollback_registered_many
bonding: correctly process non-linear skbs
ipv4: enable getsockopt() for IP_NODEFRAG
ipv4: force_igmp_version ignored when a IGMPv3 query received
ppp: potential NULL dereference in ppp_mp_explode()
net/llc: make opt unsigned in llc_ui_setsockopt()
...

Linus Torvalds
2010-09-20 02:05:50 +0800

18 Sep, 2010

1 commit

4bdab4332 sctp: Do not reset the packet during sctp_packet_config().

sctp_packet_config() is called when getting the packet ready
for appending of chunks. The function should not touch the
current state, since it's possible to ping-pong between two
transports when sending, and that can result in packet corruption
followed by an skb overflow crash.

Reported-by: Thomas Dreibholz
Signed-off-by: Vlad Yasevich
Signed-off-by: David S. Miller

Vlad Yasevich
2010-09-18 07:47:56 +0800

17 Sep, 2010

2 commits

2507136f7 net/llc: storing negative error codes in unsigned short

If the alloc_skb() fails then we return 65431 instead of -ENOBUFS
(-105).
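
The value 65431 follows from converting -105 to a 16-bit unsigned type (65536 - 105). A minimal sketch, assuming a 16-bit unsigned short; the function names are made up for illustration and are not taken from net/llc:

```c
#include <assert.h>

/* Buggy shape: the error code passes through an unsigned short. */
int demo_rc_via_ushort(int err)
{
    unsigned short rc;

    assert(err <= 0);               /* negative errno-style value */
    rc = (unsigned short)err;       /* -105 wraps to 65536 - 105 */
    return rc;
}

/* Fixed shape: the error code stays in a signed int. */
int demo_rc_via_int(int err)
{
    int rc;

    assert(err <= 0);
    rc = err;
    return rc;
}
```

A caller that tests for a negative return value never sees the failure in the first variant, because 65431 is positive.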

Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller

Dan Carpenter
2010-09-17 13:38:23 +0800
e71895a1b xfrm: dont assume rcu_read_lock in xfrm_output_one()

ip_local_out() is called with rcu_read_lock() held from ip_queue_xmit()
but not from other call sites.

Reported-and-bisected-by: Nick Bowler
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-09-17 12:46:15 +0800

15 Sep, 2010

3 commits

6dcbc1229 net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS

You cannot invoke __smp_call_function_single() unless the
architecture sets this symbol.

Reported-by: Daniel Hellstrom
Signed-off-by: David S. Miller

David S. Miller
2010-09-15 12:42:22 +0800
de8d4f5d7 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6

* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies
statfs() gives ESTALE error
NFS: Fix a typo in nfs_sockaddr_match_ipaddr6
sunrpc: increase MAX_HASHTABLE_BITS to 14
gss:spkm3 miss returning error to caller when import security context
gss:krb5 miss returning error to caller when import security context
Remove incorrect do_vfs_lock message
SUNRPC: cleanup state-machine ordering
SUNRPC: Fix a race in rpc_info_open
SUNRPC: Fix race corrupting rpc upcall
Fix null dereference in call_allocate

Linus Torvalds
2010-09-15 08:04:48 +0800
ef885afbf net: use rcu_barrier() in rollback_registered_many

netdev_wait_allrefs() waits until all references to a device vanish.

It currently uses a _very_ pessimistic 250 ms delay between each probe.
Some users reported that no more than 4 devices can be dismantled per
second, this is a pretty serious problem for some setups.

Most of the time, a refcount is about to be released by an RCU callback,
that is still in flight because rollback_registered_many() uses a
synchronize_rcu() call instead of rcu_barrier(). Problem is visible if
number of online cpus is one, because synchronize_rcu() is then a no op.

time to remove 50 ipip tunnels on a UP machine :

before patch : real 11.910s
after patch : real 1.250s

Reported-by: Nicolas Dichtel
Reported-by: Octavian Purdila
Reported-by: Benjamin LaHaise
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-09-15 05:27:29 +0800

14 Sep, 2010

3 commits

a89b47639 ipv4: enable getsockopt() for IP_NODEFRAG

While integrating your man-pages patch for IP_NODEFRAG, I noticed
that this option is settable by setsockopt(), but not gettable by
getsockopt(). I suppose this is not intended. The (untested,
trivial) patch below adds getsockopt() support.

Signed-off-by: Michael Kerrisk
Acked-by: Jiri Olsa
Signed-off-by: David S. Miller

Michael Kerrisk
2010-09-14 10:57:23 +0800
799815634 ipv4: force_igmp_version ignored when a IGMPv3 query received

After all these years, it turns out that the
/proc/sys/net/ipv4/conf/*/force_igmp_version
parameter isn't fully implemented.

*Symptom*:
When force_igmp_version is set to 2, the kernel should only perform
multicast IGMPv2 operations (IETF rfc2236). A host-initiated Join message
will be sent as an IGMPv2 Join message. But if an IGMPv3 query message is
received, the host responds with an IGMPv3 Join message. Per rfc3376 and
rfc2236, an IGMPv2 host should treat an IGMPv3 query as an IGMPv2 query and
respond with an IGMPv2 Join message.

*Consequences*:
This is an issue when an IGMPv3 capable switch is the querier and will only
issue IGMPv3 queries (which double as IGMPv2 queries) and there's an
intermediate switch that is only IGMPv2 capable. The intermediate switch
processes the initial v2 Join, but fails to recognize the IGMPv3 Join responses
to the Query, resulting in a dropped connection when the intermediate v2-only
switch times it out.

*Identifying issue in the kernel source*:
The issue is in this section of code (in net/ipv4/igmp.c), which is called when
an IGMP query is received (from mainline 2.6.36-rc3 gitweb):
...
An IGMPv3 query has a length >= 12 and no sources. This routine will exit after
line 880, setting the general query timer (random timeout between 0 and query
response time). This calls igmp_gq_timer_expire():
...
.. which only sends a v3 response. So if a v3 query is received, the kernel
always sends a v3 response.

IGMP queries happen once every 60 sec (per vlan), so the traffic is low. An
IGMPv3 query *is* a strict superset of an IGMPv2 query, so this patch properly
short-circuits the v3 behaviour.

One issue is that this does not address force_igmp_version=1. Then again, I've
never seen any IGMPv1 multicast equipment in the wild. However there is a lot
of v2-only equipment. If it's necessary to support the IGMPv1 case as well:

837 if (len == 8 || IGMP_V2_SEEN(in_dev) || IGMP_V1_SEEN(in_dev)) {

Signed-off-by: David S. Miller

Bob Arendt
2010-09-14 03:56:51 +0800
339db11b2 net/llc: make opt unsigned in llc_ui_setsockopt()

The members of struct llc_sock are unsigned so if we pass a negative
value for "opt" it can cause a sign bug. Also it can cause an integer
overflow when we multiply "opt * HZ".
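
The sign bug can be illustrated with an upper-bound check that only compares against a maximum. This is a sketch under assumptions: DEMO_MAX_RETRY and both function names are hypothetical and do not reproduce the checks in llc_ui_setsockopt().

```c
#include <assert.h>

#define DEMO_MAX_RETRY 100   /* illustrative limit, not taken from the kernel */

/* Signed variant: a negative value slips past the "too large" check. */
int demo_accepts_signed(int opt)
{
    return !(opt > DEMO_MAX_RETRY);
}

/* Unsigned variant: the same bit pattern is a huge value and is rejected. */
int demo_accepts_unsigned(unsigned int opt)
{
    return !(opt > DEMO_MAX_RETRY);
}

/* What an unsigned 8-bit field would end up holding after the bad store. */
unsigned char demo_store_in_u8(int opt)
{
    assert(opt >= -128 && opt <= 255);
    return (unsigned char)opt;
}
```

With opt == -1 the signed check passes, and an unsigned 8-bit field would then hold 255. Making opt unsigned turns the same input into a value that the existing range check rejects.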

CC: stable@kernel.org
Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller

Dan Carpenter
2010-09-14 03:44:10 +0800

13 Sep, 2010

9 commits

62b2be591 fs/9p, net/9p: memory leak fixes

Four memory leak fixes in the 9P code.

Signed-off-by: Latchesar Ionkov
Signed-off-by: Eric Van Hensbergen

Latchesar Ionkov
2010-09-13 21:13:02 +0800
db5fe2654 sunrpc: increase MAX_HASHTABLE_BITS to 14

The maximum size of the authcache is now set to 1024 (10 bits),
but on our server we need at least 4096 (12 bits). Increase
MAX_HASHTABLE_BITS to 14. This is a maximum of 16384 entries,
each containing a pointer (8 bytes on x86_64). This is
exactly the limit of kmalloc() (128K).
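
The sizes quoted above can be checked with a little arithmetic: a table with 2^bits pointer slots takes 2^bits times the pointer size in bytes. The helper below is a hypothetical illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for a hash table of 2^bits pointer-sized buckets. */
size_t demo_table_bytes(unsigned int bits, size_t ptr_size)
{
    assert(bits < 32);
    return ((size_t)1 << bits) * ptr_size;
}
```

With 8-byte pointers, 14 bits give 16384 * 8 = 131072 bytes (128K), while 10 and 12 bits give 8K and 32K respectively.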

Signed-off-by: Miquel van Smoorenburg
Signed-off-by: Trond Myklebust

Miquel van Smoorenburg
2010-09-13 07:55:26 +0800
651b2933b gss:spkm3 miss returning error to caller when import security context

spkm3 fails to return an error to the upper layer when importing a security
context; it may return OK even though the import has failed.

Signed-off-by: Bian Naimeng
Signed-off-by: Trond Myklebust

Bian Naimeng
2010-09-13 07:55:26 +0800
ce8477e11 gss:krb5 miss returning error to caller when import security context

krb5 fails to return an error to the upper layer when importing a security
context; it may return OK even though the import has failed.

Signed-off-by: Bian Naimeng
Signed-off-by: Trond Myklebust

Bian Naimeng
2010-09-13 07:55:25 +0800
55576244e SUNRPC: cleanup state-machine ordering

This is just a minor cleanup: net/sunrpc/clnt.c clarifies the rpc client
state machine by commenting each state and by laying out the functions
implementing each state in the order that each state is normally
executed (in the absence of errors).

The previous patch "Fix null dereference in call_allocate" changed the
order of the states. Move the functions and update the comments to
reflect the change.

Signed-off-by: J. Bruce Fields
Signed-off-by: Trond Myklebust

J. Bruce Fields
2010-09-13 07:55:25 +0800
006abe887 SUNRPC: Fix a race in rpc_info_open

There is a race between rpc_info_open and rpc_release_client()
in that nothing stops a process from opening the file after
the clnt->cl_kref goes to zero.

Fix this by using atomic_inc_unless_zero()...
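
The idea of "increment unless zero" is that a new reference may only be taken while at least one reference still exists. A userspace sketch with C11 atomics follows; demo_inc_not_zero() is a hypothetical stand-in, and the kernel primitive with this behaviour is, to my knowledge, atomic_inc_not_zero().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Take a reference only if the count is still non-zero.
 * Returns true when the reference was taken.
 */
bool demo_inc_not_zero(atomic_int *ref)
{
    int old;

    assert(ref != NULL);
    old = atomic_load(ref);
    while (old != 0) {
        /* On failure the CAS reloads 'old' with the current value. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;   /* object is already on its way to being freed */
}
```

An opener that gets false back must treat the object as gone instead of resurrecting a count that has already dropped to zero.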

Reported-by: J. Bruce Fields
Signed-off-by: Trond Myklebust
Cc: stable@kernel.org

Trond Myklebust
2010-09-13 07:55:25 +0800
5a67657a2 SUNRPC: Fix race corrupting rpc upcall

If rpc_queue_upcall() adds a new upcall to the rpci->pipe list just
after rpc_pipe_release calls rpc_purge_list(), but before it calls
gss_pipe_release (as rpci->ops->release_pipe(inode)), then the latter
will free a message without deleting it from the rpci->pipe list.

We will be left with a freed object on the rpc->pipe list. Most
frequent symptoms are kernel crashes in rpc.gssd system calls on the
pipe in question.

Reported-by: J. Bruce Fields
Signed-off-by: Trond Myklebust
Cc: stable@kernel.org

Trond Myklebust
2010-09-13 07:55:25 +0800
f2d47d02f Fix null dereference in call_allocate

In call_allocate we need to reach the auth in order to factor au_cslack
into the allocation.

As of a17c2153d2e271b0cbacae9bed83b0eaa41db7e1 "SUNRPC: Move the bound
cred to struct rpc_rqst", call_allocate attempts to do this by
dereferencing tk_client->cl_auth, however this is not guaranteed to be
defined--cl_auth can be zero in the case of gss context destruction (see
rpc_free_auth).

Reorder the client state machine to bind credentials before allocating,
so that we can instead reach the auth through the cred.

Signed-off-by: J. Bruce Fields
Signed-off-by: Trond Myklebust
Cc: stable@kernel.org

J. Bruce Fields
2010-09-13 07:55:25 +0800
a505b3b30 sch_atm: Fix potential NULL deref.

The list_head conversion unearthed an unnecessary flow
check. Since flow is always NULL here we don't need to
see if a matching flow exists already.

Reported-by: Jiri Slaby
Signed-off-by: David S. Miller

David S. Miller
2010-09-13 02:56:44 +0800

10 Sep, 2010

1 commit

123031c0e sctp: fix test for end of loop

Add a list_has_sctp_addr function to simplify loop

Based on patches by Dan Carpenter and David Miller

Signed-off-by: Joe Perches
Acked-by: Vlad Yasevich
Signed-off-by: David S. Miller

Joe Perches
2010-09-10 06:00:29 +0800

09 Sep, 2010

6 commits

e199e6136 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6

David S. Miller
2010-09-09 14:49:04 +0800
719f83585 udp: add rehash on connect()

commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
added a secondary hash on UDP, hashed on (local addr, local port).

Problem is that following sequence :

fd = socket(...)
connect(fd, &remote, ...)

not only selects remote end point (address and port), but also sets
local address, while UDP stack stored in secondary hash table the socket
while its local address was INADDR_ANY (or ipv6 equivalent)

Sequence is :
- autobind() : choose a random local port, insert socket in hash tables
[while local address is INADDR_ANY]
- connect() : set remote address and port, change local address to IP
given by a route lookup.

When an incoming UDP frame comes, if more than 10 sockets are found in
primary hash table, we switch to secondary table, and fail to find
socket because its local address changed.

One solution to this problem is to rehash datagram socket if needed.

We add a new rehash(struct socket *) method in "struct proto", and
implement this method for UDP v4 & v6, using a common helper.

This rehashing only takes care of secondary hash table, since primary
hash (based on local port only) is not changed.
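
The failure mode can be reproduced with a toy model of a secondary hash keyed on (local address, local port). Everything here is hypothetical (names, hash function, one socket per slot) and only mirrors the sequence described above, not the UDP implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SLOTS 16u

struct demo_sock {
    uint32_t laddr;     /* local address, 0 stands for INADDR_ANY */
    uint16_t lport;     /* local port */
    unsigned int slot;  /* slot the socket currently sits in */
};

struct demo_table {
    struct demo_sock *slot[DEMO_SLOTS];   /* toy: one socket per slot */
};

/* Toy secondary hash over (local address, local port). */
unsigned int demo_hash2(uint32_t addr, uint16_t port)
{
    return ((addr * 2654435761u) ^ port) % DEMO_SLOTS;
}

void demo_insert(struct demo_table *t, struct demo_sock *sk)
{
    sk->slot = demo_hash2(sk->laddr, sk->lport);
    t->slot[sk->slot] = sk;
}

struct demo_sock *demo_lookup(const struct demo_table *t,
                              uint32_t addr, uint16_t port)
{
    struct demo_sock *sk = t->slot[demo_hash2(addr, port)];

    if (sk && sk->laddr == addr && sk->lport == port)
        return sk;
    return NULL;
}

/* Move the socket if its key changed, as connect() may change laddr. */
void demo_rehash(struct demo_table *t, struct demo_sock *sk)
{
    unsigned int new_slot = demo_hash2(sk->laddr, sk->lport);

    assert(t->slot[sk->slot] == sk);
    if (new_slot != sk->slot) {
        t->slot[sk->slot] = NULL;
        demo_insert(t, sk);
    }
}
```

Inserting with laddr == 0, then changing laddr without calling demo_rehash(), makes a lookup on the new address miss; after demo_rehash() the socket is found again.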

Reported-by: Krzysztof Piotr Oledzki
Signed-off-by: Eric Dumazet
Tested-by: Krzysztof Piotr Oledzki
Signed-off-by: David S. Miller

Eric Dumazet
2010-09-09 12:45:01 +0800
ae2688d59 net: blackhole route should always be recalculated

Blackhole routes are used when xfrm_lookup() returns -EREMOTE (error
triggered by IKE for example), hence this kind of route is always
temporary and so we should check if a better route exists for next
packets.
Bug has been introduced by commit d11a4dc18bf41719c9f0d7ed494d295dd2973b92.

Signed-off-by: Jianzhao Wang
Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller

Jianzhao Wang
2010-09-09 05:35:43 +0800
f6b085b69 ipv4: Suppress lockdep-RCU false positive in FIB trie (3)

Hi,
Here is one more of these warnings and a patch below:

Sep 5 23:52:33 del kernel: [46044.244833] ===================================================
Sep 5 23:52:33 del kernel: [46044.269681] [ INFO: suspicious rcu_dereference_check() usage. ]
Sep 5 23:52:33 del kernel: [46044.277000] ---------------------------------------------------
Sep 5 23:52:33 del kernel: [46044.285185] net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection!
Sep 5 23:52:33 del kernel: [46044.293627]
Sep 5 23:52:33 del kernel: [46044.293632] other info that might help us debug this:
Sep 5 23:52:33 del kernel: [46044.293634]
Sep 5 23:52:33 del kernel: [46044.325333]
Sep 5 23:52:33 del kernel: [46044.325335] rcu_scheduler_active = 1, debug_locks = 0
Sep 5 23:52:33 del kernel: [46044.348013] 1 lock held by pppd/1717:
Sep 5 23:52:33 del kernel: [46044.357548] #0: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0xf/0x20
Sep 5 23:52:33 del kernel: [46044.367647]
Sep 5 23:52:33 del kernel: [46044.367652] stack backtrace:
Sep 5 23:52:33 del kernel: [46044.387429] Pid: 1717, comm: pppd Not tainted 2.6.35.4.4a #3
Sep 5 23:52:33 del kernel: [46044.398764] Call Trace:
Sep 5 23:52:33 del kernel: [46044.409596] [] ? printk+0x18/0x1e
Sep 5 23:52:33 del kernel: [46044.420761] [] lockdep_rcu_dereference+0xa9/0xb0
Sep 5 23:52:33 del kernel: [46044.432229] [] trie_firstleaf+0x65/0x70
Sep 5 23:52:33 del kernel: [46044.443941] [] fib_table_flush+0x14/0x170
Sep 5 23:52:33 del kernel: [46044.455823] [] ? local_bh_enable_ip+0x62/0xd0
Sep 5 23:52:33 del kernel: [46044.467995] [] ? _raw_spin_unlock_bh+0x2f/0x40
Sep 5 23:52:33 del kernel: [46044.480404] [] ? fib_sync_down_dev+0x120/0x180
Sep 5 23:52:33 del kernel: [46044.493025] [] fib_flush+0x2d/0x60
Sep 5 23:52:33 del kernel: [46044.505796] [] fib_disable_ip+0x25/0x50
Sep 5 23:52:33 del kernel: [46044.518772] [] fib_netdev_event+0x73/0xd0
Sep 5 23:52:33 del kernel: [46044.531918] [] notifier_call_chain+0x2d/0x70
Sep 5 23:52:33 del kernel: [46044.545358] [] raw_notifier_call_chain+0x1a/0x20
Sep 5 23:52:33 del kernel: [46044.559092] [] call_netdevice_notifiers+0x27/0x60
Sep 5 23:52:33 del kernel: [46044.573037] [] __dev_notify_flags+0x5c/0x80
Sep 5 23:52:33 del kernel: [46044.586489] [] dev_change_flags+0x37/0x60
Sep 5 23:52:33 del kernel: [46044.599394] [] devinet_ioctl+0x54d/0x630
Sep 5 23:52:33 del kernel: [46044.612277] [] inet_ioctl+0x97/0xc0
Sep 5 23:52:34 del kernel: [46044.625208] [] sock_ioctl+0x6f/0x270
Sep 5 23:52:34 del kernel: [46044.638046] [] ? handle_mm_fault+0x420/0x6c0
Sep 5 23:52:34 del kernel: [46044.650968] [] ? sock_ioctl+0x0/0x270
Sep 5 23:52:34 del kernel: [46044.663865] [] vfs_ioctl+0x28/0xa0
Sep 5 23:52:34 del kernel: [46044.676556] [] do_vfs_ioctl+0x6a/0x5c0
Sep 5 23:52:34 del kernel: [46044.688989] [] ? up_read+0x16/0x30
Sep 5 23:52:34 del kernel: [46044.701411] [] ? do_page_fault+0x1d6/0x3a0
Sep 5 23:52:34 del kernel: [46044.714223] [] ? fget_light+0xf8/0x2f0
Sep 5 23:52:34 del kernel: [46044.726601] [] ? sys_socketcall+0x208/0x2c0
Sep 5 23:52:34 del kernel: [46044.739140] [] sys_ioctl+0x63/0x70
Sep 5 23:52:34 del kernel: [46044.751967] [] syscall_call+0x7/0xb
Sep 5 23:52:34 del kernel: [46044.764734] [] ? cookie_v6_check+0x3d0/0x630

-------------->

This patch fixes the warning:
===================================================
[ INFO: suspicious rcu_dereference_check() usage. ]
---------------------------------------------------
net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by pppd/1717:
#0: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0xf/0x20

stack backtrace:
Pid: 1717, comm: pppd Not tainted 2.6.35.4a #3
Call Trace:
[] ? printk+0x18/0x1e
[] lockdep_rcu_dereference+0xa9/0xb0
[] trie_firstleaf+0x65/0x70
[] fib_table_flush+0x14/0x170
...

Allow trie_firstleaf() to be called either under rcu_read_lock()
protection or with RTNL held. The same annotation is added to
node_parent_rcu() to prevent a similar warning a bit later.

Followup of commits 634a4b20 and 4eaa0e3c.

Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2010-09-09 05:14:20 +0800
6523ce152 ipvs: fix active FTP

- Do not create expectation when forwarding the PORT
command to avoid blocking the connection. The problem is that
nf_conntrack_ftp.c:help() tries to create the same expectation later in
POST_ROUTING and drops the packet with "dropping packet" message after
failure in nf_ct_expect_related.

- Change ip_vs_update_conntrack to alter the conntrack
for related connections from real server. If we do not alter the reply in
this direction the next packet from client sent to vport 20 comes as NEW
connection. We alter it, but a collision may then happen between the two
conntracks, and the second conntrack gets destroyed immediately. The
connection gets stuck too.

Signed-off-by: Julian Anastasov
Signed-off-by: Simon Horman
Signed-off-by: David S. Miller

Julian Anastasov
2010-09-09 01:39:57 +0800
64289c8e6 gro: Re-fix different skb headrooms

The patch: "gro: fix different skb headrooms" in its part:
"2) allocate a minimal skb for head of frag_list" is buggy. The copied
skb has p->data set at the ip header at the moment, and skb_gro_offset
is the length of ip + tcp headers. So, after the change the length of
mac header is skipped. Later skb_set_mac_header() sets it into the
NET_SKB_PAD area (if it's long enough) and ip header is misaligned at
NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the
original skb was wrongly allocated, so let's copy it as it was.

bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57

Reported-by: Plamen Petrov
Signed-off-by: Jarek Poplawski
CC: Eric Dumazet
Acked-by: Eric Dumazet
Tested-by: Plamen Petrov
Signed-off-by: David S. Miller

Jarek Poplawski
2010-09-09 01:32:15 +0800

08 Sep, 2010

7 commits

608307e6d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits)
pkt_sched: Fix lockdep warning on est_tree_lock in gen_estimator
ipvs: avoid oops for passive FTP
Revert "sky2: don't do GRO on second port"
gro: fix different skb headrooms
bridge: Clear INET control block of SKBs passed into ip_fragment().
3c59x: Remove incorrect locking; correct documented lock hierarchy
sky2: don't do GRO on second port
ipv4: minor fix about RPF in help of Kconfig
xfrm_user: avoid a warning with some compiler
net/sched/sch_hfsc.c: initialize parent's cl_cfmin properly in init_vf()
pxa168_eth: fix a mdiobus leak
net sched: fix kernel leak in act_police
vhost: stop worker only if created
MAINTAINERS: Add ehea driver as Supported
ath9k_hw: fix parsing of HT40 5 GHz CTLs
ath9k_hw: Fix EEPROM uncompress block reading on AR9003
wireless: register wiphy rfkill w/o holding cfg80211_mutex
netlink: Make NETLINK_USERSOCK work again.
irda: Correctly clean up self->ias_obj on irda_bind() failure.
wireless extensions: fix kernel heap content leak
...

Linus Torvalds
2010-09-08 05:06:10 +0800
6f86b3251 ipv4: Fix reverse path filtering with multipath routing.

Actually iterate over the next-hops to make sure we have
a device match. Otherwise RP filtering is always elided
when the route matched has multiple next-hops.
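
Iterating over the next-hops for a device match can be sketched as follows. The structures and the function are hypothetical simplifications, not the FIB code: a route passes the check only if one of its next-hops leaves through the interface the packet arrived on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_nexthop {
    int oif;    /* outgoing interface index of this next-hop */
};

struct demo_route {
    const struct demo_nexthop *nh;
    size_t nhs; /* number of next-hops, more than one for multipath */
};

/* True if any next-hop of the route uses the receiving interface. */
bool demo_rpf_dev_match(const struct demo_route *rt, int in_ifindex)
{
    size_t i;

    assert(rt != NULL);
    for (i = 0; i < rt->nhs; i++) {
        if (rt->nh[i].oif == in_ifindex)
            return true;
    }
    return false;
}
```

Looking only at the first next-hop, or skipping the check when nhs is greater than one, would make the filter depend on next-hop order or disable it entirely.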

Reported-by: Igor M Podlesny
Signed-off-by: David S. Miller

David S. Miller
2010-09-08 04:57:24 +0800
8df73ff90 UNIX: Do not loop forever at unix_autobind().

We assumed that unix_autobind() never fails if kzalloc() succeeded.
But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is
larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can
consume all names using fork()/socket()/bind().

If all names are in use, those who call bind() with addr_len == sizeof(short)
or connect()/sendmsg() with setsockopt(SO_PASSCRED) will continue

while (1)
yield();

loop at unix_autobind() till a name becomes available.
This patch adds a loop counter in order to give up after 1048576 attempts.

Calling yield() once per 256 attempts may not be sufficient when many names
are already in use, for __unix_find_socket_byname() can take a long time under
such circumstances. Therefore, this patch also adds a cond_resched() call.
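
A bounded retry loop of the kind described can be sketched with a tiny name space. This is a toy: DEMO_NAMES stands in for the 1048576 names mentioned above, and the -ENOSPC return value is an assumption of this sketch, not something stated in the message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NAMES 16u   /* stand-in for the 1048576 autobind names */

/*
 * Pick the next free name, giving up after one full pass.
 * Returns the name index, or -ENOSPC when every name is taken.
 */
int demo_autobind(bool in_use[DEMO_NAMES], unsigned int *next)
{
    unsigned int tries = 0;

    assert(in_use != NULL && next != NULL);
    for (;;) {
        unsigned int name = *next;

        *next = (*next + 1) % DEMO_NAMES;
        if (!in_use[name]) {
            in_use[name] = true;
            return (int)name;
        }
        if (++tries == DEMO_NAMES)
            return -ENOSPC;   /* bounded: no endless spinning */
    }
}
```

Without the tries counter the loop would spin until some other task released a name, which is the behaviour the patch removes.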

Note that currently a local user can consume 2GB of kernel memory if the user
is allowed to create and autobind 1048576 UNIX domain sockets. We should
consider adding some restriction for autobind operation.

Signed-off-by: Tetsuo Handa
Signed-off-by: David S. Miller

Tetsuo Handa
2010-09-08 04:57:23 +0800
cf9b94f88 irda: off by one

This is an off by one. We would go past the end when we NUL terminate
the "value" string at end of the function. The "value" buffer is
allocated in irlan_client_parse_response() or
irlan_provider_parse_command().
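
The general shape of this kind of off-by-one is a length check that forgets the byte needed for the terminator. The helper below is a hypothetical illustration and does not reproduce the irlan code or its buffer sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes into dst and NUL-terminate.
 * Returns len, or -1 if dst has no room for the data plus terminator.
 */
int demo_copy_value(char *dst, size_t dst_size, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    /* The off-by-one form would be "len > dst_size". */
    if (len >= dst_size)
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';    /* index len is in bounds only if len < dst_size */
    return (int)len;
}
```

With a 4-byte buffer, a 4-byte value must be rejected, because the terminator would land at index 4, one past the end.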

CC: stable@kernel.org
Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller

Dan Carpenter
2010-09-08 04:57:22 +0800
1ee89bd0f netfilter: discard overlapping IPv6 fragment

RFC5722 prohibits reassembling IPv6 fragments when some data overlaps.
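
An overlap test between an incoming fragment and the fragments already queued can be sketched as an interval intersection check. The types and names are hypothetical and say nothing about how the reassembly queue is stored in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_frag {
    unsigned int offset;   /* byte offset within the original packet */
    unsigned int len;      /* payload length of this fragment */
};

/* True if 'incoming' shares at least one byte with a queued fragment. */
bool demo_frag_overlaps(const struct demo_frag *queued, size_t n,
                        struct demo_frag incoming)
{
    unsigned int in_end = incoming.offset + incoming.len;
    size_t i;

    assert(queued != NULL || n == 0);
    for (i = 0; i < n; i++) {
        unsigned int q_end = queued[i].offset + queued[i].len;

        /* Half-open ranges [offset, end) intersect. */
        if (incoming.offset < q_end && queued[i].offset < in_end)
            return true;
    }
    return false;
}
```

Fragments that merely touch, for example bytes 0-7 followed by bytes 8-15, do not count as overlapping. When an overlap is found, RFC 5722 calls for discarding the whole datagram rather than only the offending fragment.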

Bug spotted by Zhang Zuotao.

Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller

Nicolas Dichtel
2010-09-08 04:57:21 +0800
70789d705 ipv6: discard overlapping fragment

RFC5722 prohibits reassembling fragments when some data overlaps.

Bug spotted by Zhang Zuotao.

Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller

Nicolas Dichtel
2010-09-08 04:57:21 +0800
deabc772f net: fix tx queue selection for bridged devices implementing select_queue

When a net device is implementing the select_queue callback and is part of
a bridge, frames coming from the bridge already have a tx queue associated
to the socket (introduced in commit a4ee3ce3293dc931fab19beb472a8bde1295aebe,
"net: Use sk_tx_queue_mapping for connected sockets"). The call to
sk_tx_queue_get will then return the tx queue used by the bridge instead
of calling the select_queue callback.

In case of mac80211 this broke QoS which is implemented by using the
select_queue callback. Furthermore it introduced problems with rt2x00
because frames with the same TID and RA sometimes appeared on different
tx queues which the hw cannot handle correctly.

Fix this by always calling select_queue first if it is available and only
afterwards use the socket tx queue mapping.
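
The ordering described in the fix (driver callback first, cached socket mapping second) can be sketched like this. All names are hypothetical, and the final fallback to queue 0 is a simplification of this sketch.

```c
#include <assert.h>
#include <stddef.h>

struct demo_frame {
    int priority;   /* e.g. a QoS class carried by the frame */
};

struct demo_dev {
    /* Optional driver callback; NULL when the driver has none. */
    int (*select_queue)(const struct demo_frame *f);
};

/* Example callback: map the frame priority straight to a queue. */
int demo_qos_select(const struct demo_frame *f)
{
    return f->priority;
}

/* sk_queue < 0 means the socket has no cached tx queue. */
int demo_pick_tx_queue(const struct demo_dev *dev,
                       const struct demo_frame *f, int sk_queue)
{
    assert(dev != NULL && f != NULL);
    if (dev->select_queue)
        return dev->select_queue(f);   /* driver decides first */
    if (sk_queue >= 0)
        return sk_queue;               /* then the socket mapping */
    return 0;                          /* simplified fallback */
}
```

With the checks in the opposite order, a queue cached by another device (the bridge in the report) would win over the driver's own QoS mapping.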

Signed-off-by: Helmut Schaa
Signed-off-by: David S. Miller

Helmut Schaa
2010-09-08 04:57:20 +0800

03 Sep, 2010

2 commits

0b5d404e3 pkt_sched: Fix lockdep warning on est_tree_lock in gen_estimator

This patch fixes a lockdep warning:

[ 516.287584] =========================================================
[ 516.288386] [ INFO: possible irq lock inversion dependency detected ]
[ 516.288386] 2.6.35b #7
[ 516.288386] ---------------------------------------------------------
[ 516.288386] swapper/0 just changed the state of lock:
[ 516.288386] (&qdisc_tx_lock){+.-...}, at: [] est_timer+0x62/0x1b4
[ 516.288386] but this lock took another, SOFTIRQ-unsafe lock in the past:
[ 516.288386] (est_tree_lock){+.+...}
[ 516.288386]
[ 516.288386] and interrupts could create inverse lock ordering between them.
...

So, est_tree_lock needs BH protection because it's taken by
qdisc_tx_lock, which is used both in BH and process contexts.
(Full warning with this patch at netdev, 02 Sep 2010.)

Fixes commit: ae638c47dc040b8def16d05dc6acdd527628f231
("pkt_sched: gen_estimator: add a new lock")

Signed-off-by: Jarek Poplawski
Signed-off-by: David S. Miller

Jarek Poplawski
2010-09-03 04:22:11 +0800
7bcbf81a2 ipvs: avoid oops for passive FTP

Fix Passive FTP problem in ip_vs_ftp:

- Do not oops in nf_nat_set_seq_adjust (adjust_tcp_sequence) when
iptable_nat module is not loaded

Signed-off-by: Julian Anastasov
Signed-off-by: Simon Horman
Signed-off-by: David S. Miller

Julian Anastasov
2010-09-03 01:05:00 +0800

02 Sep, 2010

4 commits

3d3be4333 gro: fix different skb headrooms

Packets entering GRO might have different headrooms, even for a given
flow (because of implementation details in drivers, like copybreak).
We can't force drivers to deliver packets with a fixed headroom.

1) fix skb_segment()

skb_segment() wrongly assumes that the headrooms of fragments are the same
as that of the head. When CHECKSUM_PARTIAL is used, this can give csum_start
errors, and crash later in skb_copy_and_csum_dev()

2) allocate a minimal skb for head of frag_list

skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to
allocate a fresh skb. This adds NET_SKB_PAD to a padding already
provided by netdevice, depending on various things, like copybreak.

Use alloc_skb() to allocate an exact padding, to reduce cache line
needs:
NET_SKB_PAD + NET_IP_ALIGN

bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626

Many thanks to Plamen Petrov, testing many debugging patches !
With help of Jarek Poplawski.

Reported-by: Plamen Petrov
Signed-off-by: Eric Dumazet
CC: Jarek Poplawski
Signed-off-by: David S. Miller

Eric Dumazet
2010-09-02 10:17:35 +0800
87f94b4e9 bridge: Clear INET control block of SKBs passed into ip_fragment().

In a similar vein to commit 17762060c25590bfddd68cc1131f28ec720f405f
("bridge: Clear IPCB before possible entry into IP stack")

Any time we call into the IP stack we have to make sure the state
there is as expected by the ipv4 code.

With help from Eric Dumazet and Herbert Xu.

Reported-by: Bandan Das
Signed-off-by: David S. Miller

David S. Miller
2010-09-02 10:17:34 +0800
750e9fad8 ipv4: minor fix about RPF in help of Kconfig

Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller

Nicolas Dichtel
2010-09-02 05:29:36 +0800
928497f02 xfrm_user: avoid a warning with some compiler

Attached is a small patch to remove a warning ("warning: ISO C90 forbids
mixed declarations and code" with gcc 4.3.2).
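
The warning is triggered by a declaration that follows a statement inside a block. The function below is a made-up example (not the xfrm_user code) written so that it is valid C90, with the rejected shape shown in a comment.

```c
#include <assert.h>

/*
 * Shape that C90 forbids:
 *
 *     total = 0;
 *     int i;          (declaration after a statement)
 *
 * C90-clean shape: every declaration sits at the top of the block.
 */
int demo_sum_to(int n)
{
    int total;
    int i;

    assert(n >= 0);
    total = 0;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Moving the declarations up changes nothing at run time; it only keeps compilers that enforce the C90 rule quiet.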

Signed-off-by: Nicolas Dichtel
Signed-off-by: David S. Miller

Nicolas Dichtel
2010-09-02 05:29:35 +0800