Eric Lee / smarc-fsl-linux-kernel

15 Mar, 2011

1 commit

5f40d4209 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 ... Browse Code »

* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
NFS: NFSROOT should default to "proto=udp"
nfs4: remove duplicated #include
NFSv4: nfs4_state_mark_reclaim_nograce() should be static
NFSv4: Fix the setlk error handler
NFSv4.1: Fix the handling of the SEQUENCE status bits
NFSv4/4.1: Fix nfs4_schedule_state_recovery abuses
NFSv4.1 reclaim complete must wait for completion
NFSv4: remove duplicate clientid in struct nfs_client
NFSv4.1: Retry CREATE_SESSION on NFS4ERR_DELAY
sunrpc: Propagate errors from xs_bind() through xs_create_sock()
(try3-resend) Fix nfs_compat_user_ino64 so it doesn't cause problems if bit 31 or 63 are set in fileid
nfs: fix compilation warning
nfs: add kmalloc return value check in decode_and_add_ds
SUNRPC: Remove resource leak in svc_rdma_send_error()
nfs: close NFSv4 COMMIT vs. CLOSE race
SUNRPC: Close a race in __rpc_wait_for_completion_task()

Linus Torvalds
2011-03-15 02:19:50 +0800

11 Mar, 2011

6 commits

6dfbd87a2 ip6ip6: autoload ip6 tunnel ... Browse Code »

Add necessary alias to autoload ip6ip6 tunnel module.

Signed-off-by: Stephen Hemminger
Signed-off-by: David S. Miller

stephen hemminger
2011-03-11 06:18:48 +0800
bef6e7e76 Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ Browse Code »

David S. Miller
2011-03-11 06:00:44 +0800
dcbcdf22f net: bridge builtin vs. ipv6 modular ... Browse Code »

When configs BRIDGE=y and IPV6=m, this build error occurs:

br_multicast.c:(.text+0xa3341): undefined reference to `ipv6_dev_get_saddr'

BRIDGE_IGMP_SNOOPING is boolean; if it were tristate, then adding
depends on IPV6 || IPV6=n
to BRIDGE_IGMP_SNOOPING would be a good fix. As it is currently,
making BRIDGE depend on the IPV6 config works.

Reported-by: Patrick Schaaf
Signed-off-by: Randy Dunlap
Signed-off-by: David S. Miller

Randy Dunlap
2011-03-11 05:45:57 +0800
4cea288aa sunrpc: Propagate errors from xs_bind() through xs_create_sock() ... Browse Code »

xs_create_sock() is supposed to return a pointer or an ERR_PTR-encoded
error, but it currently returns 0 if xs_bind() fails.

Signed-off-by: Ben Hutchings
Cc: stable@kernel.org [v2.6.37]
Signed-off-by: Trond Myklebust

Ben Hutchings
2011-03-11 04:04:58 +0800
a5e502681 SUNRPC: Remove resource leak in svc_rdma_send_error() ... Browse Code »

We leak the memory allocated to 'ctxt' when we return after
'ib_dma_mapping_error()' returns !=0.

Signed-off-by: Jesper Juhl
Signed-off-by: Trond Myklebust

Jesper Juhl
2011-03-11 04:04:54 +0800
bf294b41c SUNRPC: Close a race in __rpc_wait_for_completion_task() ... Browse Code »

Although they run as rpciod background tasks, under normal operation
(i.e. no SIGKILL), functions like nfs_sillyrename(), nfs4_proc_unlck()
and nfs4_do_close() want to be fully synchronous. This means that when we
exit, we want all references to the rpc_task to be gone, and we want
any dentry references etc. held by that task to be released.

For this reason these functions call __rpc_wait_for_completion_task(),
followed by rpc_put_task() in the expectation that the latter will be
releasing the last reference to the rpc_task, and thus ensuring that the
callback_ops->rpc_release() has been called synchronously.

This patch fixes a race which exists due to the fact that
rpciod calls rpc_complete_task() (in order to wake up the callers of
__rpc_wait_for_completion_task()) and then subsequently calls
rpc_put_task() without ensuring that these two steps are done atomically.

In order to avoid adding new spin locks, the patch uses the existing
waitqueue spin lock to order the rpc_task reference count releases between
the waiting process and rpciod.
The common case where nobody is waiting for completion is optimised for by
checking if the RPC_TASK_ASYNC flag is cleared and/or if the rpc_task
reference count is 1: in those cases we drop trying to grab the spin lock,
and immediately free up the rpc_task.

Those few processes that need to put the rpc_task from inside an
asynchronous context and that do not care about ordering are given a new
helper: rpc_put_task_async().

Signed-off-by: Trond Myklebust

Trond Myklebust
2011-03-11 04:04:52 +0800

10 Mar, 2011

4 commits

7343ff31e ipv6: Don't create clones of host routes. ... Browse Code »

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=29252
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=30462

In commit d80bc0fd262ef840ed4e82593ad6416fa1ba3fc4 ("ipv6: Always
clone offlink routes.") we forced the kernel to always clone offlink
routes.

The reason we do that is to make sure we never bind an inetpeer to a
prefixed route.

The logic turned on here has existed in the tree for many years,
but was always off due to a protecting CPP define. So perhaps
it's no surprise that there is a logic bug here.

The problem is that we canot clone a route that is already a
host route (ie. has DST_HOST set). Because if we do, an identical
entry already exists in the routing tree and therefore the
ip6_rt_ins() call is going to fail.

This sets off a series of failures and high cpu usage, because when
ip6_rt_ins() fails we loop retrying this operation a few times in
order to handle a race between two threads trying to clone and insert
the same host route at the same time.

Fix this by simply using the route as-is when DST_HOST is set.

Reported-by: slash@ac.auone-net.jp
Reported-by: Ernst Sjöstrand
Signed-off-by: David S. Miller

David S. Miller
2011-03-10 11:55:25 +0800
8909c9ad8 net: don't allow CAP_NET_ADMIN to load non-netdev kernel modules ... Browse Code »

Since a8f80e8ff94ecba629542d9b4b5f5a8ee3eb565c any process with
CAP_NET_ADMIN may load any module from /lib/modules/. This doesn't mean
that CAP_NET_ADMIN is a superset of CAP_SYS_MODULE as modules are
limited to /lib/modules/**. However, CAP_NET_ADMIN capability shouldn't
allow anybody load any module not related to networking.

This patch restricts an ability of autoloading modules to netdev modules
with explicit aliases. This fixes CVE-2011-1019.

Arnd Bergmann suggested to leave untouched the old pre-v2.6.32 behavior
of loading netdev modules by name (without any prefix) for processes
with CAP_SYS_MODULE to maintain the compatibility with network scripts
that use autoloading netdev modules by aliases like "eth0", "wlan0".

Currently there are only three users of the feature in the upstream
kernel: ipip, ip_gre and sit.

root@albatros:~# capsh --drop=$(seq -s, 0 11),$(seq -s, 13 34) --
root@albatros:~# grep Cap /proc/$$/status
CapInh: 0000000000000000
CapPrm: fffffff800001000
CapEff: fffffff800001000
CapBnd: fffffff800001000
root@albatros:~# modprobe xfs
FATAL: Error inserting xfs
(/lib/modules/2.6.38-rc6-00001-g2bf4ca3/kernel/fs/xfs/xfs.ko): Operation not permitted
root@albatros:~# lsmod | grep xfs
root@albatros:~# ifconfig xfs
xfs: error fetching interface information: Device not found
root@albatros:~# lsmod | grep xfs
root@albatros:~# lsmod | grep sit
root@albatros:~# ifconfig sit
sit: error fetching interface information: Device not found
root@albatros:~# lsmod | grep sit
root@albatros:~# ifconfig sit0
sit0 Link encap:IPv6-in-IPv4
NOARP MTU:1480 Metric:1

root@albatros:~# lsmod | grep sit
sit 10457 0
tunnel4 2957 1 sit

For CAP_SYS_MODULE module loading is still relaxed:

root@albatros:~# grep Cap /proc/$$/status
CapInh: 0000000000000000
CapPrm: ffffffffffffffff
CapEff: ffffffffffffffff
CapBnd: ffffffffffffffff
root@albatros:~# ifconfig xfs
xfs: error fetching interface information: Device not found
root@albatros:~# lsmod | grep xfs
xfs 745319 0

Reference: https://lkml.org/lkml/2011/2/24/203

Signed-off-by: Vasiliy Kulikov
Signed-off-by: Michael Tokarev
Acked-by: David S. Miller
Acked-by: Kees Cook
Signed-off-by: James Morris

Vasiliy Kulikov
2011-03-10 07:25:19 +0800
03a14ab13 pktgen: fix errata in show results ... Browse Code »

The units in show_results in pktgen were not correct.
The results are in usec but it was displayed nsec.

Reported-by: Jong-won Lee
Signed-off-by: Daniel Turull
Signed-off-by: David S. Miller

Daniel Turull
2011-03-10 06:11:00 +0800
6c91afe1a ipv4: Fix erroneous uses of ifa_address. ... Browse Code »

In usual cases ifa_address == ifa_local, but in the case where
SIOCSIFDSTADDR sets the destination address on a point-to-point
link, ifa_address gets set to that destination address.

Therefore we should use ifa_local when we want the local interface
address.

There were two cases where the selection was done incorrectly:

1) When devinet_ioctl() does matching, it checks ifa_address even
though gifconf correct reported ifa_local to the user

2) IN_DEV_ARP_NOTIFY handling sends a gratuitous ARP using
ifa_address instead of ifa_local.

Reported-by: Julian Anastasov
Signed-off-by: David S. Miller

David S. Miller
2011-03-10 05:27:16 +0800

09 Mar, 2011

1 commit

6094628bf rds: prevent BUG_ON triggering on congestion map updates ... Browse Code »

Recently had this bug halt reported to me:

kernel BUG at net/rds/send.c:329!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=1024 NUMA pSeries
Modules linked in: rds sunrpc ipv6 dm_mirror dm_region_hash dm_log ibmveth sg
ext4 jbd2 mbcache sd_mod crc_t10dif ibmvscsic scsi_transport_srp scsi_tgt
dm_mod [last unloaded: scsi_wait_scan]
NIP: d000000003ca68f4 LR: d000000003ca67fc CTR: d000000003ca8770
REGS: c000000175cab980 TRAP: 0700 Not tainted (2.6.32-118.el6.ppc64)
MSR: 8000000000029032 CR: 44000022 XER: 00000000
TASK = c00000017586ec90[1896] 'krdsd' THREAD: c000000175ca8000 CPU: 0
GPR00: 0000000000000150 c000000175cabc00 d000000003cb7340 0000000000002030
GPR04: ffffffffffffffff 0000000000000030 0000000000000000 0000000000000030
GPR08: 0000000000000001 0000000000000001 c0000001756b1e30 0000000000010000
GPR12: d000000003caac90 c000000000fa2500 c0000001742b2858 c0000001742b2a00
GPR16: c0000001742b2a08 c0000001742b2820 0000000000000001 0000000000000001
GPR20: 0000000000000040 c0000001742b2814 c000000175cabc70 0800000000000000
GPR24: 0000000000000004 0200000000000000 0000000000000000 c0000001742b2860
GPR28: 0000000000000000 c0000001756b1c80 d000000003cb68e8 c0000001742b27b8
NIP [d000000003ca68f4] .rds_send_xmit+0x4c4/0x8a0 [rds]
LR [d000000003ca67fc] .rds_send_xmit+0x3cc/0x8a0 [rds]
Call Trace:
[c000000175cabc00] [d000000003ca67fc] .rds_send_xmit+0x3cc/0x8a0 [rds]
(unreliable)
[c000000175cabd30] [d000000003ca7e64] .rds_send_worker+0x54/0x100 [rds]
[c000000175cabdb0] [c0000000000b475c] .worker_thread+0x1dc/0x3c0
[c000000175cabed0] [c0000000000baa9c] .kthread+0xbc/0xd0
[c000000175cabf90] [c000000000032114] .kernel_thread+0x54/0x70
Instruction dump:
4bfffd50 60000000 60000000 39080001 935f004c f91f0040 41820024 813d017c
7d094a78 7d290074 7929d182 394a0020 40e2ff68 4bffffa4 39200000
Kernel panic - not syncing: Fatal exception
Call Trace:
[c000000175cab560] [c000000000012e04] .show_stack+0x74/0x1c0 (unreliable)
[c000000175cab610] [c0000000005a365c] .panic+0x80/0x1b4
[c000000175cab6a0] [c00000000002fbcc] .die+0x21c/0x2a0
[c000000175cab750] [c000000000030000] ._exception+0x110/0x220
[c000000175cab910] [c000000000004b9c] program_check_common+0x11c/0x180

Signed-off-by: David S. Miller

Neil Horman
2011-03-09 03:22:43 +0800

08 Mar, 2011

2 commits

b3ca9b02b net: fix multithreaded signal handling in unix recv routines ... Browse Code »

The unix_dgram_recvmsg and unix_stream_recvmsg routines in
net/af_unix.c utilize mutex_lock(&u->readlock) calls in order to
serialize read operations of multiple threads on a single socket. This
implies that, if all n threads of a process block in an AF_UNIX recv
call trying to read data from the same socket, one of these threads
will be sleeping in state TASK_INTERRUPTIBLE and all others in state
TASK_UNINTERRUPTIBLE. Provided that a particular signal is supposed to
be handled by a signal handler defined by the process and that none of
this threads is blocking the signal, the complete_signal routine in
kernel/signal.c will select the 'first' such thread it happens to
encounter when deciding which thread to notify that a signal is
supposed to be handled and if this is one of the TASK_UNINTERRUPTIBLE
threads, the signal won't be handled until the one thread not blocking
on the u->readlock mutex is woken up because some data to process has
arrived (if this ever happens). The included patch fixes this by
changing mutex_lock to mutex_lock_interruptible and handling possible
error returns in the same way interruptions are handled by the actual
receive-code.

Signed-off-by: Rainer Weikusat
Signed-off-by: David S. Miller

Rainer Weikusat
2011-03-08 07:31:16 +0800
2ea6d8c44 net: Enter net/ipv6/ even if CONFIG_IPV6=n ... Browse Code »

exthdrs_core.c and addrconf_core.c in net/ipv6/ contain bits which
must be made available even if IPv6 is disabled.

net/ipv6/Makefile already correctly includes them if CONFIG_IPV6=n
but net/Makefile prevents entering the subdirectory.

Signed-off-by: Thomas Graf
Acked-by: Randy Dunlap
Signed-off-by: David S. Miller

Thomas Graf
2011-03-08 04:50:52 +0800

06 Mar, 2011

1 commit

fb62c00a6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client ... Browse Code »

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: no .snap inside of snapped namespace
libceph: fix msgr standby handling
libceph: fix msgr keepalive flag
libceph: fix msgr backoff
libceph: retry after authorization failure
libceph: fix handling of short returns from get_user_pages
ceph: do not clear I_COMPLETE from d_release
ceph: do not set I_COMPLETE
Revert "ceph: keep reference to parent inode on ceph_dentry"

Linus Torvalds
2011-03-06 02:43:22 +0800

05 Mar, 2011

3 commits

e00de341f libceph: fix msgr standby handling ... Browse Code »

The standby logic used to be pretty dependent on the work requeueing
behavior that changed when we switched to WQ_NON_REENTRANT. It was also
very fragile.

Restructure things so that:
- We clear WRITE_PENDING when we set STANDBY. This ensures we will
requeue work when we wake up later.
- con_work backs off if STANDBY is set. There is nothing to do if we are
in standby.
- clear_standby() helper is called by both con_send() and con_keepalive(),
the two actions that can wake us up again. Move the connect_seq++
logic here.

Signed-off-by: Sage Weil

Sage Weil
2011-03-05 04:25:05 +0800
e76661d0a libceph: fix msgr keepalive flag ... Browse Code »

There was some broken keepalive code using a dead variable. Shift to using
the proper bit flag.

Signed-off-by: Sage Weil

Sage Weil
2011-03-05 04:24:31 +0800
60bf8bf88 libceph: fix msgr backoff ... Browse Code »

With commit f363e45f we replaced a bunch of hacky workqueue mutual
exclusion logic with the WQ_NON_REENTRANT flag. One pieces of fallout is
that the exponential backoff breaks in certain cases:

* con_work attempts to connect.
* we get an immediate failure, and the socket state change handler queues
immediate work.
* con_work calls con_fault, we decide to back off, but can't queue delayed
work.

In this case, we add a BACKOFF bit to make con_work reschedule delayed work
next time it runs (which should be immediately).

Signed-off-by: Sage Weil

Sage Weil
2011-03-05 04:24:28 +0800

04 Mar, 2011

5 commits

b65a0e0c8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorri… ... Browse Code »

…s/security-testing-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
DNS: Fix a NULL pointer deref when trying to read an error key [CVE-2011-1076]

Linus Torvalds
2011-03-04 07:48:01 +0800
4438a02fc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
MAINTAINERS: Add Andy Gospodarek as co-maintainer.
r8169: disable ASPM
RxRPC: Fix v1 keys
AF_RXRPC: Handle receiving ACKALL packets
cnic: Fix lost interrupt on bnx2x
cnic: Prevent status block race conditions with hardware
net: dcbnl: check correct ops in dcbnl_ieee_set()
e1000e: disable broken PHY wakeup for ICH10 LOMs, use MAC wakeup instead
igb: fix sparse warning
e1000: fix sparse warning
netfilter: nf_log: avoid oops in (un)bind with invalid nfproto values
dccp: fix oops on Reset after close
ipvs: fix dst_lock locking on dest update
davinci_emac: Add Carrier Link OK check in Davinci RX Handler
bnx2x: update driver version to 1.62.00-6
bnx2x: properly calculate lro_mss
bnx2x: perform statistics "action" before state transition.
bnx2x: properly configure coefficients for MinBW algorithm (NPAR mode).
bnx2x: Fix ethtool -t link test for MF (non-pmf) devices.
bnx2x: Fix nvram test for single port devices.
...

Linus Torvalds
2011-03-04 07:43:15 +0800
1362fa078 DNS: Fix a NULL pointer deref when trying to read an error key [CVE-2011-1076] ... Browse Code »

When a DNS resolver key is instantiated with an error indication, attempts to
read that key will result in an oops because user_read() is expecting there to
be a payload - and there isn't one [CVE-2011-1076].

Give the DNS resolver key its own read handler that returns the error cached in
key->type_data.x[0] as an error rather than crashing.

Also make the kenter() at the beginning of dns_resolver_instantiate() limit the
amount of data it prints, since the data is not necessarily NUL-terminated.

The buggy code was added in:

commit 4a2d789267e00b5a1175ecd2ddefcc78b83fbf09
Author: Wang Lei
Date: Wed Aug 11 09:37:58 2010 +0100
Subject: DNS: If the DNS server returns an error, allow that to be cached [ver #2]

This can trivially be reproduced by any user with the following program
compiled with -lkeyutils:

#include
#include
#include
static char payload[] = "#dnserror=6";
int main()
{
key_serial_t key;
key = add_key("dns_resolver", "a", payload, sizeof(payload),
KEY_SPEC_SESSION_KEYRING);
if (key == -1)
err(1, "add_key");
if (keyctl_read(key, NULL, 0) == -1)
err(1, "read_key");
return 0;
}

What should happen is that keyctl_read() reports error 6 (ENXIO) to the user:

dns-break: read_key: No such device or address

but instead the kernel oopses.

This cannot be reproduced with the 'keyutils add' or 'keyutils padd' commands
as both of those cut the data down below the NUL termination that must be
included in the data. Without this dns_resolver_instantiate() will return
-EINVAL and the key will not be instantiated such that it can be read.

The oops looks like:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [] user_read+0x4f/0x8f
PGD 3bdf8067 PUD 385b9067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:19.0/irq
CPU 0
Modules linked in:

Pid: 2150, comm: dns-break Not tainted 2.6.38-rc7-cachefs+ #468 /DG965RY
RIP: 0010:[] [] user_read+0x4f/0x8f
RSP: 0018:ffff88003bf47f08 EFLAGS: 00010246
RAX: 0000000000000001 RBX: ffff88003b5ea378 RCX: ffffffff81972368
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003b5ea378
RBP: ffff88003bf47f28 R08: ffff88003be56620 R09: 0000000000000000
R10: 0000000000000395 R11: 0000000000000002 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: ffffffffffffffa1
FS: 00007feab5751700(0000) GS:ffff88003e000000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000003de40000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process dns-break (pid: 2150, threadinfo ffff88003bf46000, task ffff88003be56090)
Stack:
ffff88003b5ea378 ffff88003b5ea3a0 0000000000000000 0000000000000000
ffff88003bf47f68 ffffffff811b708e ffff88003c442bc8 0000000000000000
00000000004005a0 00007fffba368060 0000000000000000 0000000000000000
Call Trace:
[] keyctl_read_key+0xac/0xcf
[] sys_keyctl+0x75/0xb6
[] system_call_fastpath+0x16/0x1b
Code: 75 1f 48 83 7b 28 00 75 18 c6 05 58 2b fb 00 01 be bb 00 00 00 48 c7 c7 76 1c 75 81 e8 13 c2 e9 ff 4c 8b b3 e0 00 00 00 4d 85 ed 0f b7 5e 10 74 2d 4d 85 e4 74 28 e8 98 79 ee ff 49 39 dd 48
RIP [] user_read+0x4f/0x8f
RSP
CR2: 0000000000000010

Signed-off-by: David Howells
Acked-by: Jeff Layton
cc: Wang Lei
Signed-off-by: James Morris

David Howells
2011-03-04 06:56:19 +0800
692d20f57 libceph: retry after authorization failure ... Browse Code »

If we mark the connection CLOSED we will give up trying to reconnect to
this server instance. That is appropriate for things like a protocol
version mismatch that won't change until the server is restarted, at which
point we'll get a new addr and reconnect. An authorization failure like
this is probably due to the server not properly rotating it's secret keys,
however, and should be treated as transient so that the normal backoff and
retry behavior kicks in.

Signed-off-by: Sage Weil

Sage Weil
2011-03-04 05:47:40 +0800
38815b780 libceph: fix handling of short returns from get_user_pages ... Browse Code »

get_user_pages() can return fewer pages than we ask for. We were returning
a bogus pointer/error code in that case. Instead, loop until we get all
the pages we want or get an error we can return to the caller.

Signed-off-by: Sage Weil

Sage Weil
2011-03-04 05:47:39 +0800

03 Mar, 2011

3 commits

100034534 AF_RXRPC: Handle receiving ACKALL packets ... Browse Code »

The OpenAFS server is now sending ACKALL packets, so we need to handle them.
Otherwise we report a protocol error and abort.

Signed-off-by: David Howells
Signed-off-by: David S. Miller

David Howells
2011-03-03 14:18:52 +0800
f3d7bc57c net: dcbnl: check correct ops in dcbnl_ieee_set() ... Browse Code »

The incorrect ops routine was being tested for in
DCB_ATTR_IEEE_PFC attributes. This patch corrects
it.

Currently, every driver implementing ieee_setets also
implements ieee_setpfc so this bug is not actualized
yet.

Signed-off-by: John Fastabend
Signed-off-by: David S. Miller

John Fastabend
2011-03-03 07:04:33 +0800
88d2d28b1 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 Browse Code »

David S. Miller
2011-03-03 03:29:31 +0800

02 Mar, 2011

3 commits

9ef0298a8 netfilter: nf_log: avoid oops in (un)bind with invalid nfproto values ... Browse Code »

Like many other places, we have to check that the array index is
within allowed limits, or otherwise, a kernel oops and other nastiness
can ensue when we access memory beyond the end of the array.

[ 5954.115381] BUG: unable to handle kernel paging request at 0000004000000000
[ 5954.120014] IP: __find_logger+0x6f/0xa0
[ 5954.123979] nf_log_bind_pf+0x2b/0x70
[ 5954.123979] nfulnl_recv_config+0xc0/0x4a0 [nfnetlink_log]
[ 5954.123979] nfnetlink_rcv_msg+0x12c/0x1b0 [nfnetlink]
...

The problem goes back to v2.6.30-rc1~1372~1342~31 where nf_log_bind
was decoupled from nf_log_register.

Reported-by: Miguel Di Ciurcio Filho ,
via irc.freenode.net/#netfilter
Signed-off-by: Jan Engelhardt
Signed-off-by: Patrick McHardy

Jan Engelhardt
2011-03-02 19:10:13 +0800
720dc34bb dccp: fix oops on Reset after close ... Browse Code »

This fixes a bug in the order of dccp_rcv_state_process() that still permitted
reception even after closing the socket. A Reset after close thus causes a NULL
pointer dereference by not preventing operations on an already torn-down socket.

dccp_v4_do_rcv()
|
| state other than OPEN
v
dccp_rcv_state_process()
|
| DCCP_PKT_RESET
v
dccp_rcv_reset()
|
v
dccp_time_wait()

WARNING: at net/ipv4/inet_timewait_sock.c:141 __inet_twsk_hashdance+0x48/0x128()
Modules linked in: arc4 ecb carl9170 rt2870sta(C) mac80211 r8712u(C) crc_ccitt ah
[] (unwind_backtrace+0x0/0xec) from [] (warn_slowpath_common)
[] (warn_slowpath_common+0x4c/0x64) from [] (warn_slowpath_n)
[] (warn_slowpath_null+0x1c/0x24) from [] (__inet_twsk_hashd)
[] (__inet_twsk_hashdance+0x48/0x128) from [] (dccp_time_wai)
[] (dccp_time_wait+0x40/0xc8) from [] (dccp_rcv_state_proces)
[] (dccp_rcv_state_process+0x120/0x538) from [] (dccp_v4_do_)
[] (dccp_v4_do_rcv+0x11c/0x14c) from [] (release_sock+0xac/0)
[] (release_sock+0xac/0x110) from [] (dccp_close+0x28c/0x380)
[] (dccp_close+0x28c/0x380) from [] (inet_release+0x64/0x70)

The fix is by testing the socket state first. Receiving a packet in Closed state
now also produces the required "No connection" Reset reply of RFC 4340, 8.3.1.

Reported-and-tested-by: Johan Hovold
Cc: stable@kernel.org
Signed-off-by: Gerrit Renker
Signed-off-by: David S. Miller

Gerrit Renker
2011-03-02 15:02:07 +0800
ff75f40f4 ipvs: fix dst_lock locking on dest update ... Browse Code »

Fix dst_lock usage in __ip_vs_update_dest. We need
_bh locking because destination is updated in user context.
Can cause lockups on frequent destination updates.
Problem reported by Simon Kirby. Bug was introduced
in 2.6.37 from the "ipvs: changes for local real server"
change.

Signed-off-by: Julian Anastasov
Signed-off-by: Hans Schillstrom
Signed-off-by: Simon Horman

Julian Anastasov
2011-03-02 06:54:41 +0800

01 Mar, 2011

1 commit

b44d211e1 netlink: handle errors from netlink_dump() ... Browse Code »

netlink_dump() may failed, but nobody handle its error.
It generates output data, when a previous portion has been returned to
user space. This mechanism works when all data isn't go in skb. If we
enter in netlink_recvmsg() and skb is absent in the recv queue, the
netlink_dump() will not been executed. So if netlink_dump() is failed
one time, the new data never appear and the reader will sleep forever.

netlink_dump() is called from two places:

1. from netlink_sendmsg->...->netlink_dump_start().
In this place we can report error directly and it will be returned
by sendmsg().

2. from netlink_recvmsg
There we can't report error directly, because we have a portion of
valid output data and call netlink_dump() for prepare the next portion.
If netlink_dump() is failed, the socket will be mark as error and the
next recvmsg will be failed.

Signed-off-by: Andrey Vagin
Signed-off-by: David S. Miller

Andrey Vagin
2011-03-01 04:18:12 +0800

26 Feb, 2011

3 commits

5aca1a9e8 net: handle addr_type of 0 properly ... Browse Code »

addr_type of 0 means that the type should be adopted from from_dev and
not from __hw_addr_del_multiple(). Unfortunately it isn't so and
addr_type will always be considered. Fix this by implementing the
considered and documented behavior.

Signed-off-by: Hagen Paul Pfeifer
Signed-off-by: David S. Miller

Hagen Paul Pfeifer
2011-02-26 05:58:54 +0800
0a93ea2e8 RxRPC: Allocate tokens with kzalloc to avoid oops in rxrpc_destroy ... Browse Code »

With slab poisoning enabled, I see the following oops:

Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6b73
...
NIP [c0000000006bc61c] .rxrpc_destroy+0x44/0x104
LR [c0000000006bc618] .rxrpc_destroy+0x40/0x104
Call Trace:
[c0000000feb2bc00] [c0000000006bc618] .rxrpc_destroy+0x40/0x104 (unreliable)
[c0000000feb2bc90] [c000000000349b2c] .key_cleanup+0x1a8/0x20c
[c0000000feb2bd40] [c0000000000a2920] .process_one_work+0x2f4/0x4d0
[c0000000feb2be00] [c0000000000a2d50] .worker_thread+0x254/0x468
[c0000000feb2bec0] [c0000000000a868c] .kthread+0xbc/0xc8
[c0000000feb2bf90] [c000000000020e00] .kernel_thread+0x54/0x70

We aren't initialising token->next, but the code in destroy_context relies
on the list being NULL terminated. Use kzalloc to zero out all the fields.

Signed-off-by: Anton Blanchard
Signed-off-by: David Howells
Signed-off-by: Linus Torvalds

Anton Blanchard
2011-02-26 03:12:37 +0800
c486da343 sysctl: ipv6: use correct net in ipv6_sysctl_rtcache_flush ... Browse Code »

Before this patch issuing these commands:

fd = open("/proc/sys/net/ipv6/route/flush")
unshare(CLONE_NEWNET)
write(fd, "stuff")

would flush the newly created net, not the original one.

The equivalent ipv4 code is correct (stores the net inside ->extra1).
Acked-by: Daniel Lezcano

Signed-off-by: David S. Miller

Lucian Adrian Grijincu
2011-02-26 03:01:56 +0800

24 Feb, 2011

1 commit

ef3242859 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 ... Browse Code »

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (33 commits)
Added support for usb ethernet (0x0fe6, 0x9700)
r8169: fix RTL8168DP power off issue.
r8169: correct settings of rtl8102e.
r8169: fix incorrect args to oob notify.
DM9000B: Fix PHY power for network down/up
DM9000B: Fix reg_save after spin_lock in dm9000_timeout
net_sched: long word align struct qdisc_skb_cb data
sfc: lower stack usage in efx_ethtool_self_test
bridge: Use IPv6 link-local address for multicast listener queries
bridge: Fix MLD queries' ethernet source address
bridge: Allow mcast snooping for transient link local addresses too
ipv6: Add IPv6 multicast address flag defines
bridge: Add missing ntohs()s for MLDv2 report parsing
bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report
bridge: Fix IPv6 multicast snooping by storing correct protocol type
p54pci: update receive dma buffers before and after processing
fix cfg80211_wext_siwfreq lock ordering...
rt2x00: Fix WPA TKIP Michael MIC failures.
ath5k: Fix fast channel switching
tcp: undo_retrans counter fixes
...

Linus Torvalds
2011-02-24 08:02:00 +0800

23 Feb, 2011

6 commits

d3bd1b4c8 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 Browse Code »

David S. Miller
2011-02-23 03:53:05 +0800
fe29ec41a bridge: Use IPv6 link-local address for multicast listener queries ... Browse Code »

Currently the bridge multicast snooping feature periodically issues
IPv6 general multicast listener queries to sense the absence of a
listener.

For this, it uses :: as its source address - however RFC 2710 requires:
"To be valid, the Query message MUST come from a link-local IPv6 Source
Address". Current Linux kernel versions seem to follow this requirement
and ignore our bogus MLD queries.

With this commit a link local address from the bridge interface is being
used to issue the MLD query, resulting in other Linux devices which are
multicast listeners in the network to respond with a MLD response (which
was not the case before).

Signed-off-by: Linus Lüssing
Signed-off-by: David S. Miller

Linus Lüssing
2011-02-23 02:07:29 +0800
36cff5a10 bridge: Fix MLD queries' ethernet source address ... Browse Code »

Map the IPv6 header's destination multicast address to an ethernet
source address instead of the MLD queries multicast address.

For instance for a general MLD query (multicast address in the MLD query
set to ::), this would wrongly be mapped to 33:33:00:00:00:00, although
an MLD queries destination MAC should always be 33:33:00:00:00:01 which
matches the IPv6 header's multicast destination ff02::1.

Signed-off-by: Linus Lüssing
Signed-off-by: David S. Miller

Linus Lüssing
2011-02-23 02:07:28 +0800
e4de9f9e8 bridge: Allow mcast snooping for transient link local addresses too ... Browse Code »

Currently the multicast bridge snooping support is not active for
link local multicast. I assume this has been done to leave
important multicast data untouched, like IPv6 Neighborhood Discovery.

In larger, bridged, local networks it could however be desirable to
optimize for instance local multicast audio/video streaming too.

With the transient flag in IPv6 multicast addresses we have an easy
way to optimize such multimedia traffic without tempering with the
high priority multicast data from well-known addresses.

This patch alters the multicast bridge snooping for IPv6, to take
effect for transient multicast addresses instead of non-link-local
addresses.

Signed-off-by: Linus Lüssing
Signed-off-by: David S. Miller

Linus Lüssing
2011-02-23 02:07:28 +0800
d41db9f3f bridge: Add missing ntohs()s for MLDv2 report parsing ... Browse Code »

The nsrcs number is 2 Byte wide, therefore we need to call ntohs()
before using it.

Signed-off-by: Linus Lüssing
Signed-off-by: David S. Miller

Linus Lüssing
2011-02-23 02:07:27 +0800
649e984d0 bridge: Fix IPv6 multicast snooping by correcting offset in MLDv2 report ... Browse Code »

We actually want a pointer to the grec_nsrcr and not the following
field. Otherwise we can get very high values for *nsrcs as the first two
bytes of the IPv6 multicast address are being used instead, leading to
a failing pskb_may_pull() which results in MLDv2 reports not being
parsed.

Signed-off-by: Linus Lüssing
Signed-off-by: David S. Miller

Linus Lüssing
2011-02-23 02:07:26 +0800