Eric Lee / smarc-fsl-linux-kernel

29 Sep, 2011

1 commit

16e572626 af_unix: dont send SCM_CREDENTIALS by default

Since commit 7361c36c5224 (af_unix: Allow credentials to work across
user and pid namespaces) af_unix performance dropped a lot.

This is because we now take a reference on pid and cred in each write(),
and release them in read(), usually done from another process,
eventually from another cpu. This triggers false sharing.

# Events: 154K cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... .................. .........................
#
10.40% hackbench [kernel.kallsyms] [k] put_pid
8.60% hackbench [kernel.kallsyms] [k] unix_stream_recvmsg
7.87% hackbench [kernel.kallsyms] [k] unix_stream_sendmsg
6.11% hackbench [kernel.kallsyms] [k] do_raw_spin_lock
4.95% hackbench [kernel.kallsyms] [k] unix_scm_to_skb
4.87% hackbench [kernel.kallsyms] [k] pid_nr_ns
4.34% hackbench [kernel.kallsyms] [k] cred_to_ucred
2.39% hackbench [kernel.kallsyms] [k] unix_destruct_scm
2.24% hackbench [kernel.kallsyms] [k] sub_preempt_count
1.75% hackbench [kernel.kallsyms] [k] fget_light
1.51% hackbench [kernel.kallsyms] [k] __mutex_lock_interruptible_slowpath
1.42% hackbench [kernel.kallsyms] [k] sock_alloc_send_pskb

This patch includes SCM_CREDENTIALS information in an af_unix message/skb
only if requested by the sender (see man 7 unix for details on how to
include ancillary data using the sendmsg() system call).

Note: This might break buggy applications that expected SCM_CREDENTIALS
from an unaware write() system call, with the receiver not using the
SO_PASSCRED socket option.

If SOCK_PASSCRED is set on source or destination socket, we still
include credentials for mere write() syscalls.
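
A minimal userspace sketch of what "requested by the sender" can look like,
assuming glibc on Linux. The helper names send_with_creds() and recv_creds()
and the union cred_ctrl are hypothetical and are not part of this patch. The
receiver is assumed to have enabled SO_PASSCRED on its socket before
calling recv_creds().

```c
#define _GNU_SOURCE /* struct ucred and SCM_CREDENTIALS need this with glibc */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for one struct ucred; the union keeps it aligned. */
union cred_ctrl {
    char buf[CMSG_SPACE(sizeof(struct ucred))];
    struct cmsghdr align;
};

/* Send one byte with our own credentials attached explicitly as
 * SCM_CREDENTIALS ancillary data. Returns the sendmsg() result. */
static ssize_t send_with_creds(int fd)
{
    char byte = 'x';
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    struct ucred cred = { .pid = getpid(), .uid = getuid(), .gid = getgid() };
    union cred_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_CREDENTIALS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(cred));
    memcpy(CMSG_DATA(cmsg), &cred, sizeof(cred));

    return sendmsg(fd, &msg, 0);
}

/* Receive one byte and copy out the sender's credentials if present.
 * Returns 0 when an SCM_CREDENTIALS message was found, -1 otherwise. */
static int recv_creds(int fd, struct ucred *out)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union cred_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(fd, &msg, 0) < 0)
        return -1;

    for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
        if (cmsg->cmsg_level == SOL_SOCKET &&
            cmsg->cmsg_type == SCM_CREDENTIALS) {
            memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
            return 0;
        }
    }
    return -1;
}
```

A caller would typically create the pair with socketpair(AF_UNIX, SOCK_STREAM,
0, sv), set SO_PASSCRED on the receiving end with setsockopt(), and then call
send_with_creds(sv[0]) followed by recv_creds(sv[1], &cred).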

Performance boost in hackbench : more than 50% gain on a 16 thread
machine (2 quad-core cpus, 2 threads per core)

hackbench 20 thread 2000

4.228 sec instead of 9.102 sec

Signed-off-by: Eric Dumazet
Acked-by: Tim Chen
Signed-off-by: David S. Miller

Eric Dumazet
2011-09-29 01:29:50 +0800

12 Aug, 2011

1 commit

33d480ce6 net: cleanup some rcu_dereference_raw

The RCU API has been completed, and rcu_access_pointer() or
rcu_dereference_protected() are better choices than the generic
rcu_dereference_raw().

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-08-12 17:55:28 +0800

25 Jun, 2011

1 commit

36099365c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem

Conflicts:
drivers/net/wireless/rtlwifi/pci.c
include/linux/netlink.h

John W. Linville
2011-06-25 03:25:51 +0800

23 Jun, 2011

1 commit

670dc2833 netlink: advertise incomplete dumps

Consider the following situation:
* a dump that would show 8 entries, four in the first
round, and four in the second
* between the first and second rounds, 6 entries are
removed
* now the second round will not show any entry, and
even if there is a sequence/generation counter the
application will not know

To solve this problem, add a new flag NLM_F_DUMP_INTR
to the netlink header that indicates the dump wasn't
consistent. This flag can also be set on the MSG_DONE
message that terminates the dump, so the situation
above can be detected.
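
On the receiving side, an application could check for the flag roughly as
follows. This is a sketch only: buffer_has_interrupted_dump() is a
hypothetical helper name, and the fallback definition of NLM_F_DUMP_INTR
is there only for headers that predate this change.

```c
#include <assert.h>
#include <linux/netlink.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef NLM_F_DUMP_INTR
#define NLM_F_DUMP_INTR 0x10 /* fallback for headers older than this patch */
#endif

/* Walk every netlink message in a receive buffer and report whether any of
 * them, including the terminating one, carries NLM_F_DUMP_INTR. */
static bool buffer_has_interrupted_dump(const void *buf, size_t len)
{
    const struct nlmsghdr *nlh = buf;
    int remaining = (int)len;

    assert(buf != NULL);
    for (; NLMSG_OK(nlh, remaining); nlh = NLMSG_NEXT(nlh, remaining)) {
        if (nlh->nlmsg_flags & NLM_F_DUMP_INTR)
            return true; /* dump was inconsistent; caller should retry */
    }
    return false;
}
```

An application that sees true here would normally discard the partial
results and restart the dump request.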

To achieve this, add a sequence counter to the netlink
callback struct. Of course, netlink code still needs
to use this new functionality. The correct way to do
that is to always set cb->seq when a dumpit callback
is invoked and call nl_dump_check_consistent() for
each new message. The core code will also call this
function for the final MSG_DONE message.

To make it usable with generic netlink, a new function
genlmsg_nlhdr() is needed to obtain the netlink header
from the genetlink user header.

Signed-off-by: Johannes Berg
Acked-by: David S. Miller
Signed-off-by: John W. Linville

Johannes Berg
2011-06-23 04:09:45 +0800

17 Jun, 2011

1 commit

c63d6ea30 rtnetlink: unlock on error path in netlink_dump()

In c7ac8679bec939 "rtnetlink: Compute and store minimum ifinfo dump
size", we moved the allocation under the lock so we need to unlock
on error path.

Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller

Dan Carpenter
2011-06-17 11:51:35 +0800

10 Jun, 2011

1 commit

c7ac8679b rtnetlink: Compute and store minimum ifinfo dump size

The message size allocated for rtnl ifinfo dumps was limited to
a single page. This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.

Signed-off-by: Greg Rose
Signed-off-by: Jeff Kirsher

Greg Rose
2011-06-10 11:38:07 +0800

24 May, 2011

1 commit

71338aa7d net: convert %p usage to %pK

The %pK format specifier is designed to hide exposed kernel pointers,
specifically via /proc interfaces. Exposing these pointers provides an
easy target for kernel write vulnerabilities, since they reveal the
locations of writable structures containing easily triggerable function
pointers. The behavior of %pK depends on the kptr_restrict sysctl.

If kptr_restrict is set to 0, no deviation from the standard %p behavior
occurs. If kptr_restrict is set to 1, the default, if the current user
(intended to be a reader via seq_printf(), etc.) does not have CAP_SYSLOG
(currently in the LSM tree), kernel pointers using %pK are printed as 0's.
If kptr_restrict is set to 2, kernel pointers using %pK are printed as
0's regardless of privileges. Replacing with 0's was chosen over the
default "(null)", which cannot be parsed by userland %p, which expects
"(nil)".

The supporting code for kptr_restrict and %pK are currently in the -mm
tree. This patch converts users of %p in net/ to %pK. Cases of printing
pointers to the syslog are not covered, since this would eliminate useful
information for postmortem debugging and the reading of the syslog is
already optionally protected by the dmesg_restrict sysctl.
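
The kptr_restrict rules described above can be modelled in userspace as
below. This is an illustration of the policy only, not the kernel's
vsprintf code: should_hide_pointer() and format_pK() are hypothetical names,
and the has_cap_syslog parameter stands in for the capability check on the
reading process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Decide whether a %pK pointer should be replaced by zeros.
 *   0: never hide
 *   1: hide unless the reader has CAP_SYSLOG
 *   2: always hide */
static bool should_hide_pointer(int kptr_restrict, bool has_cap_syslog)
{
    if (kptr_restrict >= 2)
        return true;
    return kptr_restrict == 1 && !has_cap_syslog;
}

/* Format a pointer as zero-padded hex, or as all zeros when hidden.
 * Returns the snprintf() result. */
static int format_pK(char *buf, size_t size, const void *ptr,
                     int kptr_restrict, bool has_cap_syslog)
{
    uintptr_t value;
    int width = (int)(2 * sizeof(void *));

    assert(buf != NULL && size > 0);
    value = should_hide_pointer(kptr_restrict, has_cap_syslog)
                ? 0
                : (uintptr_t)ptr;
    return snprintf(buf, size, "%0*lx", width, (unsigned long)value);
}
```

Printing zeros rather than a string such as "(null)" keeps the field
parseable as a hexadecimal number, which matches the reasoning given in the
commit message.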

Signed-off-by: Dan Rosenberg
Cc: James Morris
Cc: Eric Dumazet
Cc: Thomas Graf
Cc: Eugene Teo
Cc: Kees Cook
Cc: Ingo Molnar
Cc: David S. Miller
Cc: Peter Zijlstra
Cc: Eric Paris
Signed-off-by: Andrew Morton
Signed-off-by: David S. Miller

Dan Rosenberg
2011-05-24 13:13:12 +0800

08 May, 2011

1 commit

37b6b935e net,rcu: convert call_rcu(listeners_free_rcu) to kfree_rcu()

The rcu callback listeners_free_rcu() just calls a kfree(),
so we use kfree_rcu() instead of the call_rcu(listeners_free_rcu).

Signed-off-by: Lai Jiangshan
Acked-by: David S. Miller
Signed-off-by: Paul E. McKenney
Reviewed-by: Josh Triplett

Lai Jiangshan
2011-05-08 13:50:51 +0800

04 Mar, 2011

3 commits

0a0e9ae1b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/bnx2x/bnx2x.h

David S. Miller
2011-03-04 13:27:42 +0800

01a16b21d netlink: kill eff_cap from struct netlink_skb_parms

Netlink message processing in the kernel is synchronous these days, so
capabilities can be checked directly in security_netlink_recv() from
the current process.

Signed-off-by: Patrick McHardy
Reviewed-by: James Morris
[chrisw: update to include pohmelfs and uvesafb]
Signed-off-by: Chris Wright
Signed-off-by: David S. Miller

Patrick McHardy
2011-03-04 05:32:07 +0800

c53fa1ed9 netlink: kill loginuid/sessionid/sid members from struct netlink_skb_parms

Netlink message processing in the kernel is synchronous these days, so the
session information can be collected when needed.

Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller

Patrick McHardy
2011-03-04 02:55:40 +0800

01 Mar, 2011

1 commit

b44d211e1 netlink: handle errors from netlink_dump()

netlink_dump() may fail, but nobody handles its error.
It generates output data when a previous portion has been returned to
user space. This mechanism is used when all the data does not fit in one
skb. If we enter netlink_recvmsg() and no skb is present in the receive
queue, netlink_dump() will not be executed. So if netlink_dump() fails
once, new data never appears and the reader sleeps forever.

netlink_dump() is called from two places:

1. from netlink_sendmsg->...->netlink_dump_start().
Here we can report the error directly and it will be returned
by sendmsg().

2. from netlink_recvmsg
There we can't report the error directly, because we have a portion of
valid output data and call netlink_dump() to prepare the next portion.
If netlink_dump() fails, the socket will be marked as being in error and
the next recvmsg will fail.

Signed-off-by: Andrey Vagin
Signed-off-by: David S. Miller

Andrey Vagin
2011-03-01 04:18:12 +0800

20 Jan, 2011

1 commit

b8f3ab429 Revert "netlink: test for all flags of the NLM_F_DUMP composite"

This reverts commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf.

It breaks several things including the avahi daemon.

Signed-off-by: David S. Miller

David S. Miller
2011-01-20 05:34:20 +0800

10 Jan, 2011

1 commit

0ab03c2b1 netlink: test for all flags of the NLM_F_DUMP composite

Because NLM_F_DUMP is composed of two bits, NLM_F_ROOT | NLM_F_MATCH,
doing "if (x & NLM_F_DUMP)" tests for _either_ of the bits
being set. Because NLM_F_MATCH's value overlaps with NLM_F_EXCL,
non-dump requests with NLM_F_EXCL set are mistaken for dump requests.

Substitute the condition to test for _all_ bits being set.
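
The difference between the two tests can be shown with the flag values from
linux/netlink.h. The function names below are hypothetical and only
illustrate the bit arithmetic; they are not taken from the patch.

```c
#include <assert.h>
#include <linux/netlink.h>
#include <stdbool.h>

/* The old test: true if EITHER NLM_F_ROOT or NLM_F_MATCH is set. */
static bool is_dump_any_bit(unsigned int flags)
{
    return (flags & NLM_F_DUMP) != 0;
}

/* The substituted test: true only if BOTH bits are set. */
static bool is_dump_all_bits(unsigned int flags)
{
    return (flags & NLM_F_DUMP) == NLM_F_DUMP;
}

/* Check the premises stated in the commit message against the headers. */
static void check_flag_overlap(void)
{
    assert(NLM_F_DUMP == (NLM_F_ROOT | NLM_F_MATCH));
    assert(NLM_F_MATCH == NLM_F_EXCL);
}
```

With flags set to NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL, the first
function reports a dump request and the second does not. Note that the log
above records this change as reverted ten days later because it broke
existing userspace, including the avahi daemon.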

Signed-off-by: Jan Engelhardt
Acked-by: Pablo Neira Ayuso
Signed-off-by: David S. Miller

Jan Engelhardt
2011-01-10 08:25:03 +0800

25 Oct, 2010

1 commit

5c398dc8f netlink: fix netlink_change_ngroups()

commit 6c04bb18ddd633 (netlink: use call_rcu for netlink_change_ngroups)
used a somewhat convoluted and racy way to perform call_rcu().

The old block of memory is freed after a grace period, but the rcu_head
used to track it is located in the new block.

This can clash if we call netlink_change_ngroups() two or more times,
and one block is freed before another. call_rcu() called on different cpus
makes no guarantee about the order of callbacks.

Fix this using a more standard way of handling it: each block of
memory contains its own rcu_head, so that no 'use after free' can
happen.

Signed-off-by: Eric Dumazet
CC: Johannes Berg
CC: Paul E. McKenney
Signed-off-by: David S. Miller

Eric Dumazet
2010-10-25 07:25:39 +0800

09 Oct, 2010

1 commit

e9a68707d Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem

Conflicts:
Documentation/feature-removal-schedule.txt
drivers/net/wireless/ipw2x00/ipw2200.c

John W. Linville
2010-10-09 03:39:28 +0800

06 Oct, 2010

1 commit

ff4c92d85 genetlink: introduce pre_doit/post_doit hooks

Each family may have some amount of boilerplate
locking code that applies to most, or even all,
commands.

This allows a family to handle such things in
a more generic way, by allowing it to
a) include private flags in each operation
b) specify a pre_doit hook that is called,
before an operation's doit() callback and
may return an error directly,
c) specify a post_doit hook that can undo
locking or similar things done by pre_doit,
and finally
d) include two private pointers in each info
struct passed between all these operations
including doit(). (It's two because I'll
need two in nl80211 -- can be extended.)
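
The call order implied by points a) to d) can be sketched as a small
userspace model. Everything prefixed with demo_ or DEMO_ is hypothetical and
simplified: the real genetlink structures and callbacks take additional
arguments, and the counter below merely stands in for a real lock.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FLAG_NEED_LOCK 0x1 /* hypothetical private flag, point a) */

struct demo_info {
    void *user_ptr[2]; /* two private pointers, point d) */
};

struct demo_ops {
    unsigned int internal_flags;
    int (*doit)(struct demo_info *info);
};

struct demo_family {
    int (*pre_doit)(const struct demo_ops *ops, struct demo_info *info);
    void (*post_doit)(const struct demo_ops *ops, struct demo_info *info);
};

static int demo_lock_depth; /* stands in for a real lock */

static int demo_pre_doit(const struct demo_ops *ops, struct demo_info *info)
{
    (void)info;
    if (ops->internal_flags & DEMO_FLAG_NEED_LOCK)
        demo_lock_depth++;
    return 0;
}

static void demo_post_doit(const struct demo_ops *ops, struct demo_info *info)
{
    (void)info;
    if (ops->internal_flags & DEMO_FLAG_NEED_LOCK)
        demo_lock_depth--;
}

/* Example command: reports how deep the "lock" is while it runs. */
static int demo_doit(struct demo_info *info)
{
    (void)info;
    return demo_lock_depth;
}

/* Run one command with the family's optional hooks around it. */
static int demo_dispatch(const struct demo_family *family,
                         const struct demo_ops *ops, struct demo_info *info)
{
    int err;

    assert(ops != NULL && ops->doit != NULL);
    if (family->pre_doit) {
        err = family->pre_doit(ops, info);
        if (err)
            return err; /* point b): pre_doit may fail directly */
    }
    err = ops->doit(info);
    if (family->post_doit)
        family->post_doit(ops, info); /* point c): undo pre_doit's work */
    return err;
}
```

The benefit is that each command's doit() no longer repeats the locking
boilerplate; the per-operation flags tell the shared hooks what to set up.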

Signed-off-by: Johannes Berg
Acked-by: David S. Miller
Signed-off-by: John W. Linville

Johannes Berg
2010-10-06 01:35:30 +0800

01 Sep, 2010

1 commit

b963ea89f netlink: Make NETLINK_USERSOCK work again.

Once we started enforcing that an nl_table[] entry exists for
a protocol, NETLINK_USERSOCK stopped working. Add a dummy
table entry so that it works again.

Reported-by: Thomas Voegtle
Tested-by: Thomas Voegtle
Signed-off-by: David S. Miller

David S. Miller
2010-09-01 00:51:37 +0800

19 Aug, 2010

1 commit

68d6ac6d2 netlink: fix compat recvmsg

Since
commit 1dacc76d0014a034b8aca14237c127d7c19d7726
Author: Johannes Berg
Date: Wed Jul 1 11:26:02 2009 +0000

net/compat/wext: send different messages to compat tasks

we had a race condition when setting and then
restoring frag_list. Eric attempted to fix it,
but the fix created even worse problems.

However, the original motivation I had when I
added the code that turned out to be racy is
no longer clear to me, since we only copy up
to skb->len to userspace, which doesn't include
the frag_list length. As a result, not doing
any frag_list clearing and restoring avoids
the race condition, while not introducing any
other problems.

Additionally, while preparing this patch I found
that since none of the remaining netlink code is
really aware of the frag_list, we need to use the
original skb's information for packet information
and credentials. This fixes, for example, the
group information received by compat tasks.

Cc: Eric Dumazet
Cc: stable@kernel.org [2.6.31+, for 2.6.35 revert 1235f504aa]
Signed-off-by: Johannes Berg
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Johannes Berg
2010-08-19 14:35:58 +0800

16 Aug, 2010

1 commit

daa3766e7 Revert "netlink: netlink_recvmsg() fix"

This reverts commit 1235f504aaba2ebeabc863fdb3ceac764a317d47.

It causes regressions worse than the problem it was trying
to fix. Eric will try to solve the problem another way.

Signed-off-by: David S. Miller

David S. Miller
2010-08-16 14:21:50 +0800

27 Jul, 2010

3 commits

652c67174 genetlink: use genl_register_family_with_ops()

Signed-off-by: Changli Gao
Signed-off-by: David S. Miller

Changli Gao
2010-07-27 12:00:10 +0800

416c2f9cf genetlink: cleanup code according to CodingStyle

If the function is exported, the EXPORT* macro for it should follow immediately
after the closing function brace line.

Signed-off-by: Changli Gao
----
net/netlink/genetlink.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
Signed-off-by: David S. Miller

Changli Gao
2010-07-27 11:53:49 +0800

1235f504a netlink: netlink_recvmsg() fix

commit 1dacc76d0014
(net/compat/wext: send different messages to compat tasks)
introduced a race condition on netlink, in case MSG_PEEK is used.

An skb given by skb_recv_datagram() might be shared, we must copy it
before any modification, or risk fatal corruption.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2010-07-27 04:09:16 +0800

21 Jul, 2010

1 commit

70d4bf6d4 drop_monitor: convert some kfree_skb call sites to consume_skb

Convert a few calls from kfree_skb to consume_skb.

While working on dropwatch, I noticed that I was detecting lots of internal
skb drops in several places. While some are legitimate, several were not:
they freed skbs that were at the end of their life, rather than being
discarded due to an error. This patch converts those call sites from using
kfree_skb to consume_skb, which stops the in-kernel drop_monitor code from
detecting them as drops. Tested successfully by myself.

Signed-off-by: Neil Horman
Signed-off-by: David S. Miller

Neil Horman
2010-07-21 04:28:05 +0800

17 Jun, 2010

1 commit

b47030c71 af_netlink: Add needed scm_destroy after scm_send.

scm_send occasionally allocates state in the scm_cookie, so I have
modified netlink_sendmsg to guarantee that when scm_send succeeds
scm_destroy will be called to free that state.

Signed-off-by: Eric W. Biederman
Reviewed-by: Daniel Lezcano
Acked-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Eric W. Biederman
2010-06-17 05:55:56 +0800

22 May, 2010

1 commit

910a7e905 netlink: Implement netlink_broadcast_filtered

When netlink sockets are used to convey data that is in a namespace
we need a way to select a subset of the listening sockets to deliver
the packet to. For the network namespace we have been doing this
by only transmitting packets in the correct network namespace.

For data belonging to other namespaces netlink_broadcast_filtered
provides a mechanism that allows us to examine the destination
socket and to decide if we should transmit the specified packet
to it.

Signed-off-by: Eric W. Biederman
Acked-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Eric W. Biederman
2010-05-22 00:37:32 +0800

12 Apr, 2010

1 commit

871039f02 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/stmmac/stmmac_main.c
drivers/net/wireless/wl12xx/wl1271_cmd.c
drivers/net/wireless/wl12xx/wl1271_main.c
drivers/net/wireless/wl12xx/wl1271_spi.c
net/core/ethtool.c
net/mac80211/scan.c

David S. Miller
2010-04-12 05:53:53 +0800

07 Apr, 2010

1 commit

4a35ecf8b Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/bonding/bond_main.c
drivers/net/via-velocity.c
drivers/net/wireless/iwlwifi/iwl-agn.c

David S. Miller
2010-04-07 14:53:30 +0800

06 Apr, 2010

1 commit

cb4361c1d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
smc91c92_cs: fix the problem of "Unable to find hardware address"
r8169: clean up my printk uglyness
net: Hook up cxgb4 to Kconfig and Makefile
cxgb4: Add main driver file and driver Makefile
cxgb4: Add remaining driver headers and L2T management
cxgb4: Add packet queues and packet DMA code
cxgb4: Add HW and FW support code
cxgb4: Add register, message, and FW definitions
netlabel: Fix several rcu_dereference() calls used without RCU read locks
bonding: fix potential deadlock in bond_uninit()
net: check the length of the socket address passed to connect(2)
stmmac: add documentation for the driver.
stmmac: fix kconfig for crc32 build error
be2net: fix bug in vlan rx path for big endian architecture
be2net: fix flashing on big endian architectures
be2net: fix a bug in flashing the redboot section
bonding: bond_xmit_roundrobin() fix
drivers/net: Add missing unlock
net: gianfar - align BD ring size console messages
net: gianfar - initialize per-queue statistics
...

Linus Torvalds
2010-04-06 23:34:06 +0800

04 Apr, 2010

1 commit

f408e0ce4 netlink: Export genl_lock() API for use by modules

This lets kernel modules which use genl netlink APIs serialize netlink
processing.

Signed-off-by: James Chapman
Reviewed-by: Randy Dunlap
Signed-off-by: David S. Miller

James Chapman
2010-04-04 05:56:05 +0800

02 Apr, 2010

1 commit

6503d9616 net: check the length of the socket address passed to connect(2)

Check the length of the socket address passed to connect(2). If the
length is invalid, -EINVAL will be returned.

Signed-off-by: Changli Gao
----
net/bluetooth/l2cap.c | 3 ++-
net/bluetooth/rfcomm/sock.c | 3 ++-
net/bluetooth/sco.c | 3 ++-
net/can/bcm.c | 3 +++
net/ieee802154/af_ieee802154.c | 3 +++
net/ipv4/af_inet.c | 5 +++++
net/netlink/af_netlink.c | 3 +++
7 files changed, 20 insertions(+), 3 deletions(-)
Signed-off-by: David S. Miller

Changli Gao
2010-04-02 08:26:01 +0800

30 Mar, 2010

1 commit

5a0e3ad6a include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h

percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.

http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the following.

* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.

2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.

6. percpu.h was updated not to include slab.h.

7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).

* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

Tejun Heo
2010-03-30 21:02:32 +0800

27 Mar, 2010

1 commit

66aa4a55f netlink: use the appropriate namespace pid

This was included in OpenVZ kernels but wasn't integrated upstream.
From git://git.openvz.org/pub/linux-2.6.24-openvz:

commit 5c69402f18adf7276352e051ece2cf31feefab02
Author: Alexey Dobriyan
Date: Mon Dec 24 14:37:45 2007 +0300

netlink: fixup ->tgid to work in multiple PID namespaces

Signed-off-by: Tom Goff
Acked-by: Alexey Dobriyan
Signed-off-by: David S. Miller

Tom Goff
2010-03-27 11:13:58 +0800

21 Mar, 2010

1 commit

1a50307ba netlink: fix NETLINK_RECV_NO_ENOBUFS in netlink_set_err()

Currently, ENOBUFS errors are reported to the socket via
netlink_set_err() even if NETLINK_RECV_NO_ENOBUFS is set. However,
that should not happen. This patch fixes the problem and changes the
prototype of netlink_set_err() to return the number of sockets that
have set the NETLINK_RECV_NO_ENOBUFS socket option. This return
value is used in the next patch in this bugfix series.

Signed-off-by: Pablo Neira Ayuso
Signed-off-by: David S. Miller

Pablo Neira Ayuso
2010-03-21 05:29:03 +0800

28 Feb, 2010

1 commit

cf0aa4e07 netlink: Adding inode field to /proc/net/netlink

The Inode field in /proc/net/{tcp,udp,packet,raw,...} is useful for finding out the types of
file descriptors associated with a process. In fact, the lsof utility uses this field.
Unfortunately, unlike /proc/net/{tcp,udp,packet,raw,...}, /proc/net/netlink doesn't have the field.
This patch adds the field to /proc/net/netlink.

Signed-off-by: Masatake YAMATO
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Masatake YAMATO
2010-02-28 17:29:49 +0800

04 Feb, 2010

2 commits

9c119ba54 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

David S. Miller
2010-02-04 11:38:22 +0800

974c37e9d netlink: fix for too early rmmod

Netlink code autoloads a module if the protocol userspace is asking for is
not ready. However, the module can disappear right after it was autoloaded.
Example: modprobe/rmmod stress-testing and xfrm_user.ko providing NETLINK_XFRM.

netlink_create() in such a situation _will_ create the userspace socket and
_will_not_ pin the module. Now if the module was removed, we are going to call
->netlink_rcv into nothing:

BUG: unable to handle kernel paging request at ffffffffa02f842a
^^^^^^^^^^^^^^^^
modules are loaded near these addresses here

IP: [] 0xffffffffa02f842a
PGD 161f067 PUD 1623063 PMD baa12067 PTE 0
Oops: 0010 [#1] PREEMPT SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host0/target0:0:0/0:0:0:0/block/sda/uevent
CPU 1
Pid: 11515, comm: ip Not tainted 2.6.33-rc5-netns-00594-gaaa5728-dirty #6 P5E/P5E
RIP: 0010:[] [] 0xffffffffa02f842a
RSP: 0018:ffff8800baa3db48 EFLAGS: 00010292
RAX: ffff8800baa3dfd8 RBX: ffff8800be353640 RCX: 0000000000000000
RDX: ffffffff81959380 RSI: ffff8800bab7f130 RDI: 0000000000000001
RBP: ffff8800baa3db58 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000011
R13: ffff8800be353640 R14: ffff8800bcdec240 R15: ffff8800bd488010
FS: 00007f93749656f0(0000) GS:ffff880002300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffffffa02f842a CR3: 00000000ba82b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ip (pid: 11515, threadinfo ffff8800baa3c000, task ffff8800bab7eb30)
Stack:
ffffffff813637c0 ffff8800bd488000 ffff8800baa3dba8 ffffffff8136397d
0000000000000000 ffffffff81344adc 7fffffffffffffff 0000000000000000
ffff8800baa3ded8 ffff8800be353640 ffff8800bcdec240 0000000000000000
Call Trace:
[] ? netlink_unicast+0x100/0x2d0
[] netlink_unicast+0x2bd/0x2d0

netlink_unicast_kernel:
nlk->netlink_rcv(skb);

[] ? memcpy_fromiovec+0x6c/0x90
[] netlink_sendmsg+0x1d3/0x2d0
[] sock_sendmsg+0xbb/0xf0
[] ? __lock_acquire+0x27b/0xa60
[] ? might_fault+0x73/0xd0
[] ? might_fault+0x73/0xd0
[] ? __lock_release+0x82/0x170
[] ? might_fault+0xbe/0xd0
[] ? might_fault+0x73/0xd0
[] ? verify_iovec+0x47/0xd0
[] sys_sendmsg+0x1a9/0x360
[] ? _raw_spin_unlock_irqrestore+0x65/0x70
[] ? trace_hardirqs_on+0xd/0x10
[] ? _raw_spin_unlock_irqrestore+0x42/0x70
[] ? __up_read+0x84/0xb0
[] ? trace_hardirqs_on_caller+0x145/0x190
[] ? trace_hardirqs_on_thunk+0x3a/0x3f
[] system_call_fastpath+0x16/0x1b
Code: Bad RIP value.
RIP [] 0xffffffffa02f842a
RSP
CR2: ffffffffa02f842a

Return -EPROTONOSUPPORT if the module was quickly removed after autoloading.

Signed-off-by: Alexey Dobriyan
Signed-off-by: David S. Miller

Alexey Dobriyan
2010-02-04 10:13:43 +0800

14 Jan, 2010

1 commit

e1d5a0107 genetlink: optimize ctrl_dumpfamily()

There is an unnecessary test which can be replaced by a good initialization in
the 'for' statement.

Noticed by Serge E. Hallyn

Signed-off-by: Samir Bellabes
Signed-off-by: David S. Miller

Samir Bellabes
2010-01-14 12:37:45 +0800

26 Nov, 2009

1 commit

09ad9bc75 net: use net_eq to compare nets

Generated with the following semantic patch

@@
struct net *n1;
struct net *n2;
@@
- n1 == n2
+ net_eq(n1, n2)

@@
struct net *n1;
struct net *n2;
@@
- n1 != n2
+ !net_eq(n1, n2)

applied over {include,net,drivers/net}.

Signed-off-by: Octavian Purdila
Signed-off-by: David S. Miller

Octavian Purdila
2009-11-26 07:14:13 +0800

17 Nov, 2009

1 commit

649300b92 netlink: remove subscriptions check on notifier

The netlink URELEASE notifier doesn't notify for
sockets that have been used to receive multicast
but it should be called for such sockets as well
since they might _also_ be used for sending and
not solely for receiving multicast. We will need
that for nl80211 (generic netlink sockets) in the
future.

Signed-off-by: Johannes Berg
Cc: Patrick McHardy
Signed-off-by: David S. Miller

Johannes Berg
2009-11-17 20:08:49 +0800