11 Feb, 2016
1 commit
-
In order to support fast reuseport lookups in TCP, the hash function
defined in struct proto must be capable of returning an error code.
This patch changes the function signature of all related hash functions
to return an integer and handles or propagates this return value at
all call sites.Signed-off-by: Craig Gallek
Signed-off-by: David S. Miller
13 Jan, 2016
1 commit
-
Ivaylo Dimitrov reported a regression caused by commit 7866a621043f
("dev: add per net_device packet type chains").skb->dev becomes NULL and we crash in __netif_receive_skb_core().
Before above commit, different kind of bugs or corruptions could happen
without major crash.But the root cause is that phonet_rcv() can queue skb without checking
if skb is shared or not.Many thanks to Ivaylo Dimitrov for his help, diagnosis and tests.
Reported-by: Ivaylo Dimitrov
Tested-by: Ivaylo Dimitrov
Signed-off-by: Eric Dumazet
Cc: Remi Denis-Courmont
Signed-off-by: David S. Miller
11 May, 2015
1 commit
-
In preparation for changing how struct net is refcounted
on kernel sockets pass the knowledge that we are creating
a kernel socket from sock_create_kern through to sk_alloc.Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
03 Mar, 2015
1 commit
-
After TIPC doesn't depend on iocb argument in its internal
implementations of sendmsg() and recvmsg() hooks defined in proto
structure, no any user is using iocb argument in them at all now.
Then we can drop the redundant iocb argument completely from kinds of
implementations of both sendmsg() and recvmsg() in the entire
networking stack.Cc: Christoph Hellwig
Suggested-by: Al Viro
Signed-off-by: Ying Xue
Signed-off-by: David S. Miller
20 Jan, 2015
1 commit
-
My previous patch to this file changed the code to be bug-compatible
towards userspace. Unless userspace (which I wasn't able to find)
implements the dump reader by hand in a wrong way, this isn't needed.
If it uses libnl or similar code putting multiple messages into a
single SKB is far more efficient.Change the code to do this. While at it, also clean it up and don't
use so many variables - just store the address in the callback args
directly.Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller
18 Jan, 2015
1 commit
-
Contrary to common expectations for an "int" return, these functions
return only a positive value -- if used correctly they cannot even
return 0 because the message header will necessarily be in the skb.This makes the very common pattern of
if (genlmsg_end(...) < 0) { ... }
be a whole bunch of dead code. Many places also simply do
return nlmsg_end(...);
and the caller is expected to deal with it.
This also commonly (at least for me) causes errors, because it is very
common to writeif (my_function(...))
/* error condition */and if my_function() does "return nlmsg_end()" this is of course wrong.
Additionally, there's not a single place in the kernel that actually
needs the message length returned, and if anyone needs it later then
it'll be very easy to just use skb->len there.Remove this, and make the functions void. This removes a bunch of dead
code as described above. The patch adds lines because I did- return nlmsg_end(...);
+ nlmsg_end(...);
+ return 0;I could have preserved all the function's return values by returning
skb->len, but instead I've audited all the places calling the affected
functions and found that none cared. A few places actually compared
the return value with < 0 with no change in behaviour, so I opted for the more
efficient version.One instance of the error I've made numerous times now is also present
in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
check for
Signed-off-by: David S. Miller
24 Nov, 2014
1 commit
-
Signed-off-by: Al Viro
12 Nov, 2014
1 commit
-
Use the more common dynamic_debug capable net_dbg_ratelimited
and remove the LIMIT_NETDEBUG macro.All messages are still ratelimited.
Some KERN_ uses are changed to KERN_DEBUG.
This may have some negative impact on messages that were
emitted at KERN_INFO that are not not enabled at all unless
DEBUG is defined or dynamic_debug is enabled. Even so,
these messages are now _not_ emitted by default.This also eliminates the use of the net_msg_warn sysctl
"/proc/sys/net/core/warnings". For backward compatibility,
the sysctl is not removed, but it has no function. The extern
declaration of net_msg_warn is removed from sock.h and made
static in net/core/sysctl_net_core.cMiscellanea:
o Update the sysctl documentation
o Remove the embedded uses of pr_fmt
o Coalesce format fragments
o Realign argumentsSigned-off-by: Joe Perches
Signed-off-by: David S. Miller
06 Nov, 2014
1 commit
-
This encapsulates all of the skb_copy_datagram_iovec() callers
with call argument signature "skb, offset, msghdr->msg_iov, length".When we move to iov_iters in the networking, the iov_iter object will
sit in the msghdr.Having a helper like this means there will be less places to touch
during that transformation.Based upon descriptions and patch from Al Viro.
Signed-off-by: David S. Miller
07 Oct, 2014
1 commit
-
-Add __rcu annotation on table to fix sparse warnings:
net/phonet/pn_dev.c:279:25: warning: incorrect type in assignment (different address spaces)
net/phonet/pn_dev.c:279:25: expected struct net_device *
net/phonet/pn_dev.c:279:25: got void [noderef] *
net/phonet/pn_dev.c:376:17: warning: incorrect type in assignment (different address spaces)
net/phonet/pn_dev.c:376:17: expected struct net_device *volatile
net/phonet/pn_dev.c:376:17: got struct net_device [noderef] *
net/phonet/pn_dev.c:392:17: warning: incorrect type in assignment (different address spaces)
net/phonet/pn_dev.c:392:17: expected struct net_device *
net/phonet/pn_dev.c:392:17: got void [noderef] *-Access table with rcu_access_pointer (fixes the following sparse errors):
net/phonet/pn_dev.c:278:25: error: incompatible types in comparison expression (different address spaces)
net/phonet/pn_dev.c:391:17: error: incompatible types in comparison expression (different address spaces)Suggested-by: Eric Dumazet
Signed-off-by: Fabian Frederick
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller
16 Jul, 2014
1 commit
-
Extend alloc_netdev{,_mq{,s}}() to take name_assign_type as argument, and convert
all users to pass NET_NAME_UNKNOWN.Coccinelle patch:
@@
expression sizeof_priv, name, setup, txqs, rxqs, count;
@@(
-alloc_netdev_mqs(sizeof_priv, name, setup, txqs, rxqs)
+alloc_netdev_mqs(sizeof_priv, name, NET_NAME_UNKNOWN, setup, txqs, rxqs)
|
-alloc_netdev_mq(sizeof_priv, name, setup, count)
+alloc_netdev_mq(sizeof_priv, name, NET_NAME_UNKNOWN, setup, count)
|
-alloc_netdev(sizeof_priv, name, setup)
+alloc_netdev(sizeof_priv, name, NET_NAME_UNKNOWN, setup)
)v9: move comments here from the wrong commit
Signed-off-by: Tom Gundersen
Reviewed-by: David Herrmann
Signed-off-by: David S. Miller
25 Apr, 2014
1 commit
-
It is possible by passing a netlink socket to a more privileged
executable and then to fool that executable into writing to the socket
data that happens to be valid netlink message to do something that
privileged executable did not intend to do.To keep this from happening replace bare capable and ns_capable calls
with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
Which act the same as the previous calls except they verify that the
opener of the socket had the desired permissions as well.Reported-by: Andy Lutomirski
Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
12 Apr, 2014
1 commit
-
Several spots in the kernel perform a sequence like:
skb_queue_tail(&sk->s_receive_queue, skb);
sk->sk_data_ready(sk, skb->len);But at the moment we place the SKB onto the socket receive queue it
can be consumed and freed up. So this skb->len access is potentially
to freed up memory.Furthermore, the skb->len can be modified by the consumer so it is
possible that the value isn't accurate.And finally, no actual implementation of this callback actually uses
the length argument. And since nobody actually cared about it's
value, lots of call sites pass arbitrary values in such as '0' and
even '1'.So just remove the length argument from the callback, that way there
is no confusion whatsoever and all of these use-after-free cases get
fixed as a side effect.Based upon a patch by Eric Dumazet and his suggestion to audit this
issue tree-wide.Signed-off-by: David S. Miller
19 Jan, 2014
1 commit
-
This is a follow-up patch to f3d3342602f8bc ("net: rework recvmsg
handler msg_name and msg_namelen logic").DECLARE_SOCKADDR validates that the structure we use for writing the
name information to is not larger than the buffer which is reserved
for msg->msg_name (which is 128 bytes). Also use DECLARE_SOCKADDR
consistently in sendmsg code paths.Signed-off-by: Steffen Hurrle
Suggested-by: Hannes Frederic Sowa
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
20 Nov, 2013
1 commit
-
Pull networking fixes from David Miller:
"Mostly these are fixes for fallout due to merge window changes, as
well as cures for problems that have been with us for a much longer
period of time"1) Johannes Berg noticed two major deficiencies in our genetlink
registration. Some genetlink protocols we passing in constant
counts for their ops array rather than something like
ARRAY_SIZE(ops) or similar. Also, some genetlink protocols were
using fixed IDs for their multicast groups.We have to retain these fixed IDs to keep existing userland tools
working, but reserve them so that other multicast groups used by
other protocols can not possibly conflict.In dealing with these two problems, we actually now use less state
management for genetlink operations and multicast groups.2) When configuring interface hardware timestamping, fix several
drivers that simply do not validate that the hwtstamp_config value
is one the driver actually supports. From Ben Hutchings.3) Invalid memory references in mwifiex driver, from Amitkumar Karwar.
4) In dev_forward_skb(), set the skb->protocol in the right order
relative to skb_scrub_packet(). From Alexei Starovoitov.5) Bridge erroneously fails to use the proper wrapper functions to make
calls to netdev_ops->ndo_vlan_rx_{add,kill}_vid. Fix from Toshiaki
Makita.6) When detaching a bridge port, make sure to flush all VLAN IDs to
prevent them from leaking, also from Toshiaki Makita.7) Put in a compromise for TCP Small Queues so that deep queued devices
that delay TX reclaim non-trivially don't have such a performance
decrease. One particularly problematic area is 802.11 AMPDU in
wireless. From Eric Dumazet.8) Fix crashes in tcp_fastopen_cache_get(), we can see NULL socket dsts
here. Fix from Eric Dumzaet, reported by Dave Jones.9) Fix use after free in ipv6 SIT driver, from Willem de Bruijn.
10) When computing mergeable buffer sizes, virtio-net fails to take the
virtio-net header into account. From Michael Dalton.11) Fix seqlock deadlock in ip4_datagram_connect() wrt. statistic
bumping, this one has been with us for a while. From Eric Dumazet.12) Fix NULL deref in the new TIPC fragmentation handling, from Erik
Hugne.13) 6lowpan bit used for traffic classification was wrong, from Jukka
Rissanen.14) macvlan has the same issue as normal vlans did wrt. propagating LRO
disabling down to the real device, fix it the same way. From Michal
Kubecek.15) CPSW driver needs to soft reset all slaves during suspend, from
Daniel Mack.16) Fix small frame pacing in FQ packet scheduler, from Eric Dumazet.
17) The xen-netfront RX buffer refill timer isn't properly scheduled on
partial RX allocation success, from Ma JieYue.18) When ipv6 ping protocol support was added, the AF_INET6 protocol
initialization cleanup path on failure was borked a little. Fix
from Vlad Yasevich.19) If a socket disconnects during a read/recvmsg/recvfrom/etc that
blocks we can do the wrong thing with the msg_name we write back to
userspace. From Hannes Frederic Sowa. There is another fix in the
works from Hannes which will prevent future problems of this nature.20) Fix route leak in VTI tunnel transmit, from Fan Du.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (106 commits)
genetlink: make multicast groups const, prevent abuse
genetlink: pass family to functions using groups
genetlink: add and use genl_set_err()
genetlink: remove family pointer from genl_multicast_group
genetlink: remove genl_unregister_mc_group()
hsr: don't call genl_unregister_mc_group()
quota/genetlink: use proper genetlink multicast APIs
drop_monitor/genetlink: use proper genetlink multicast APIs
genetlink: only pass array to genl_register_family_with_ops()
tcp: don't update snd_nxt, when a socket is switched from repair mode
atm: idt77252: fix dev refcnt leak
xfrm: Release dst if this dst is improper for vti tunnel
netlink: fix documentation typo in netlink_set_err()
be2net: Delete secondary unicast MAC addresses during be_close
be2net: Fix unconditional enabling of Rx interface options
net, virtio_net: replace the magic value
ping: prevent NULL pointer dereference on write to msg_name
bnx2x: Prevent "timeout waiting for state X"
bnx2x: prevent CFC attention
bnx2x: Prevent panic during DMAE timeout
...
19 Nov, 2013
1 commit
-
Only update *addr_len when we actually fill in sockaddr, otherwise we
can return uninitialized memory from the stack to the caller in the
recvfrom, recvmmsg and recvmsg syscalls. Drop the the (addr_len == NULL)
checks because we only get called with a valid addr_len pointer either
from sock_common_recvmsg or inet_recvmsg.If a blocking read waits on a socket which is concurrently shut down we
now return zero and set msg_msgnamelen to 0.Reported-by: mpb
Suggested-by: Eric Dumazet
Signed-off-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
15 Nov, 2013
1 commit
-
All seq_printf() users are using "%n" for calculating padding size,
convert them to use seq_setwidth() / seq_pad() pair.Signed-off-by: Tetsuo Handa
Signed-off-by: Kees Cook
Cc: Joe Perches
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
16 Aug, 2013
1 commit
-
UIDs are printed in the proc_fs as signed int, whereas
they are unsigned int.Signed-off-by: Francesco Fusco
Signed-off-by: David S. Miller
13 Jun, 2013
1 commit
-
Reduce the uses of this unnecessary typedef.
Done via perl script:
$ git grep --name-only -w ctl_table net | \
xargs perl -p -i -e '\
sub trim { my ($local) = @_; $local =~ s/(^\s+|\s+$)//g; return $local; } \
s/\b(?<!struct\s)ctl_table\b(\s*\*\s*|\s+\w+)/"struct ctl_table " . trim($1)/ge'Reflow the modified lines that now exceed 80 columns.
Signed-off-by: Joe Perches
Signed-off-by: David S. Miller
29 May, 2013
1 commit
-
So far, only net_device * could be passed along with netdevice notifier
event. This patch provides a possibility to pass custom structure
able to provide info that event listener needs to know.Signed-off-by: Jiri Pirko
v2->v3: fix typo on simeth
shortened dev_getter
shortened notifier_info struct name
v1->v2: fix notifier_call parameter in call_netdevice_notifier()
Signed-off-by: David S. Miller
22 Mar, 2013
1 commit
-
With decnet converted, we can finally get rid of rta_buf and its
computations around it. It also gets rid of the minimal header
length verification since all message handlers do that explicitly
anyway.Signed-off-by: Thomas Graf
Signed-off-by: David S. Miller
28 Feb, 2013
1 commit
-
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;type T;
expression a,c,d,e;
identifier b;
statement S;
@@-T b;
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin
Acked-by: Paul E. McKenney
Signed-off-by: Sasha Levin
Cc: Wu Fengguang
Cc: Marcelo Tosatti
Cc: Gleb Natapov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
19 Feb, 2013
2 commits
-
proc_net_remove is only used to remove proc entries
that under /proc/net,it's not a general function for
removing proc entries of netns. if we want to remove
some proc entries which under /proc/net/stat/, we still
need to call remove_proc_entry.this patch use remove_proc_entry to replace proc_net_remove.
we can remove proc_net_remove after this patch.Signed-off-by: Gao feng
Signed-off-by: David S. Miller -
Right now, some modules such as bonding use proc_create
to create proc entries under /proc/net/, and other modules
such as ipv4 use proc_net_fops_create.It looks a little chaos.this patch changes all of
proc_net_fops_create to proc_create. we can remove
proc_net_fops_create after this patch.Signed-off-by: Gao feng
Signed-off-by: David S. Miller
19 Nov, 2012
1 commit
-
- In rtnetlink_rcv_msg convert the capable(CAP_NET_ADMIN) check
to ns_capable(net->user-ns, CAP_NET_ADMIN). Allowing unprivileged
users to make netlink calls to modify their local network
namespace.- In the rtnetlink doit methods add capable(CAP_NET_ADMIN) so
that calls that are not safe for unprivileged users are still
protected.Later patches will remove the extra capable calls from methods
that are safe for unprivilged users.Acked-by: Serge Hallyn
Signed-off-by: "Eric W. Biederman"
Signed-off-by: David S. Miller
11 Sep, 2012
1 commit
-
It is a frequent mistake to confuse the netlink port identifier with a
process identifier. Try to reduce this confusion by renaming fields
that hold port identifiers portid instead of pid.I have carefully avoided changing the structures exported to
userspace to avoid changing the userspace API.I have successfully built an allyesconfig kernel with this change.
Signed-off-by: "Eric W. Biederman"
Acked-by: Stephen Hemminger
Signed-off-by: David S. Miller
15 Aug, 2012
1 commit
-
Cc: Alexey Kuznetsov
Cc: James Morris
Cc: Hideaki YOSHIFUJI
Cc: Patrick McHardy
Cc: Arnaldo Carvalho de Melo
Cc: Sridhar Samudrala
Acked-by: Vlad Yasevich
Acked-by: David S. Miller
Acked-by: Serge Hallyn
Signed-off-by: Eric W. Biederman
18 Jun, 2012
1 commit
-
Signed-off-by: Rémi Denis-Courmont
Cc: Sakari Ailus
Signed-off-by: David S. Miller
21 Apr, 2012
2 commits
-
This results in code with less boiler plate that is a bit easier
to read.Additionally stops us from using compatibility code in the sysctl
core, hastening the day when the compatibility code can be removed.Signed-off-by: Eric W. Biederman
Acked-by: Pavel Emelyanov
Signed-off-by: David S. Miller -
This makes it clearer which sysctls are relative to your current network
namespace.This makes it a little less error prone by not exposing sysctls for the
initial network namespace in other namespaces.This is the same way we handle all of our other network interfaces to
userspace and I can't honestly remember why we didn't do this for
sysctls right from the start.Signed-off-by: Eric W. Biederman
Acked-by: Pavel Emelyanov
Signed-off-by: David S. Miller
16 Apr, 2012
1 commit
-
Use of "unsigned int" is preferred to bare "unsigned" in net tree.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller
14 Apr, 2012
2 commits
-
Signed-off-by: Rémi Denis-Courmont
Signed-off-by: David S. Miller -
Signed-off-by: Rémi Denis-Courmont
Signed-off-by: David S. Miller
13 Apr, 2012
2 commits
-
Pull in the 'net' tree to get CAIF bug fixes upon which
the following set of CAIF feature patches depend.Signed-off-by: David S. Miller
-
Recently an oops was reported in phonet if there was a failure during
network namespace creation.[ 163.733755] ------------[ cut here ]------------
[ 163.734501] kernel BUG at include/net/netns/generic.h:45!
[ 163.734501] invalid opcode: 0000 [#1] PREEMPT SMP
[ 163.734501] CPU 2
[ 163.734501] Pid: 19145, comm: trinity Tainted: G W 3.4.0-rc1-next-20120405-sasha-dirty #57
[ 163.734501] RIP: 0010:[] [] phonet_pernet+0x182/0x1a0
[ 163.734501] RSP: 0018:ffff8800674d5ca8 EFLAGS: 00010246
[ 163.734501] RAX: 000000003fffffff RBX: 0000000000000000 RCX: ffff8800678c88d8
[ 163.734501] RDX: 00000000003f4000 RSI: ffff8800678c8910 RDI: 0000000000000282
[ 163.734501] RBP: ffff8800674d5cc8 R08: 0000000000000000 R09: 0000000000000000
[ 163.734501] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880068bec920
[ 163.734501] R13: ffffffff836b90c0 R14: 0000000000000000 R15: 0000000000000000
[ 163.734501] FS: 00007f055e8de700(0000) GS:ffff88007d000000(0000) knlGS:0000000000000000
[ 163.734501] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 163.734501] CR2: 00007f055e6bb518 CR3: 0000000070c16000 CR4: 00000000000406e0
[ 163.734501] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 163.734501] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 163.734501] Process trinity (pid: 19145, threadinfo ffff8800674d4000, task ffff8800678c8000)
[ 163.734501] Stack:
[ 163.734501] ffffffff824d5f00 ffffffff810e2ec1 ffff880067ae0000 00000000ffffffd4
[ 163.734501] ffff8800674d5cf8 ffffffff824d667a ffff880067ae0000 00000000ffffffd4
[ 163.734501] ffffffff836b90c0 0000000000000000 ffff8800674d5d18 ffffffff824d707d
[ 163.734501] Call Trace:
[ 163.734501] [] ? phonet_pernet+0x20/0x1a0
[ 163.734501] [] ? get_parent_ip+0x11/0x50
[ 163.734501] [] phonet_device_destroy+0x1a/0x100
[ 163.734501] [] phonet_device_notify+0x3d/0x50
[ 163.734501] [] notifier_call_chain+0xee/0x130
[ 163.734501] [] raw_notifier_call_chain+0x11/0x20
[ 163.734501] [] call_netdevice_notifiers+0x52/0x60
[ 163.734501] [] rollback_registered_many+0x185/0x270
[ 163.734501] [] unregister_netdevice_many+0x14/0x60
[ 163.734501] [] ipip_exit_net+0x1b3/0x1d0
[ 163.734501] [] ? ipip_rcv+0x420/0x420
[ 163.734501] [] ops_exit_list+0x35/0x70
[ 163.734501] [] setup_net+0xab/0xe0
[ 163.734501] [] copy_net_ns+0x76/0x100
[ 163.734501] [] create_new_namespaces+0xfb/0x190
[ 163.734501] [] unshare_nsproxy_namespaces+0x61/0x80
[ 163.734501] [] sys_unshare+0xff/0x290
[ 163.734501] [] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 163.734501] [] system_call_fastpath+0x16/0x1b
[ 163.734501] Code: e0 c3 fe 66 0f 1f 44 00 00 48 c7 c2 40 60 4d 82 be 01 00 00 00 48 c7 c7 80 d1 23 83 e8 48 2a c4 fe e8 73 06 c8 fe 48 85 db 75 0e 0b 0f 1f 40 00 eb fe 66 0f 1f 44 00 00 48 83 c4 10 48 89 d8
[ 163.734501] RIP [] phonet_pernet+0x182/0x1a0
[ 163.734501] RSP
[ 163.861289] ---[ end trace fb5615826c548066 ]---After investigation it turns out there were two issues.
1) Phonet was not implementing network devices but was using register_pernet_device
instead of register_pernet_subsys.This was allowing there to be cases when phonenet was not initialized and
the phonet net_generic was not set for a network namespace when network
device events were being reported on the netdevice_notifier for a network
namespace leading to the oops above.2) phonet_exit_net was implementing a confusing and special case of handling all
network devices from going away that it was hard to see was correct, and would
only occur when the phonet module was removed.Now that unregister_netdevice_notifier has been modified to synthesize unregistration
events for the network devices that are extant when called this confusing special
case in phonet_exit_net is no longer needed.Signed-off-by: Eric W. Biederman
Acked-by: Rémi Denis-Courmont
Signed-off-by: David S. Miller
11 Apr, 2012
1 commit
06 Apr, 2012
1 commit
-
A phonet packet is limited to USHRT_MAX bytes, this is never checked during
tx which means that the user can specify any size he wishes, and the kernel
will attempt to allocate that size.In the good case, it'll lead to the following warning, but it may also cause
the kernel to kick in the OOM and kill a random task on the server.[ 8921.744094] WARNING: at mm/page_alloc.c:2255 __alloc_pages_slowpath+0x65/0x730()
[ 8921.749770] Pid: 5081, comm: trinity Tainted: G W 3.4.0-rc1-next-20120402-sasha #46
[ 8921.756672] Call Trace:
[ 8921.758185] [] warn_slowpath_common+0x87/0xb0
[ 8921.762868] [] warn_slowpath_null+0x15/0x20
[ 8921.765399] [] __alloc_pages_slowpath+0x65/0x730
[ 8921.769226] [] ? zone_watermark_ok+0x1a/0x20
[ 8921.771686] [] ? get_page_from_freelist+0x625/0x660
[ 8921.773919] [] __alloc_pages_nodemask+0x1f8/0x240
[ 8921.776248] [] kmalloc_large_node+0x70/0xc0
[ 8921.778294] [] __kmalloc_node_track_caller+0x34/0x1c0
[ 8921.780847] [] ? sock_alloc_send_pskb+0xbc/0x260
[ 8921.783179] [] __alloc_skb+0x75/0x170
[ 8921.784971] [] sock_alloc_send_pskb+0xbc/0x260
[ 8921.787111] [] ? release_sock+0x7e/0x90
[ 8921.788973] [] sock_alloc_send_skb+0x10/0x20
[ 8921.791052] [] pep_sendmsg+0x60/0x380
[ 8921.792931] [] ? pn_socket_bind+0x156/0x180
[ 8921.794917] [] ? pn_socket_autobind+0x3f/0x90
[ 8921.797053] [] pn_socket_sendmsg+0x4f/0x70
[ 8921.798992] [] sock_aio_write+0x187/0x1b0
[ 8921.801395] [] ? sub_preempt_count+0xae/0xf0
[ 8921.803501] [] ? __lock_acquire+0x42c/0x4b0
[ 8921.805505] [] ? __sock_recv_ts_and_drops+0x140/0x140
[ 8921.807860] [] do_sync_readv_writev+0xbc/0x110
[ 8921.809986] [] ? might_fault+0x97/0xa0
[ 8921.811998] [] ? security_file_permission+0x1e/0x90
[ 8921.814595] [] do_readv_writev+0xe2/0x1e0
[ 8921.816702] [] ? do_setitimer+0x1ac/0x200
[ 8921.818819] [] ? get_parent_ip+0x11/0x50
[ 8921.820863] [] ? sub_preempt_count+0xae/0xf0
[ 8921.823318] [] vfs_writev+0x46/0x60
[ 8921.825219] [] sys_writev+0x4f/0xb0
[ 8921.827127] [] system_call_fastpath+0x16/0x1b
[ 8921.829384] ---[ end trace dffe390f30db9eb7 ]---Signed-off-by: Sasha Levin
Acked-by: Rémi Denis-Courmont
Signed-off-by: David S. Miller
02 Apr, 2012
1 commit
-
These macros contain a hidden goto, and are thus extremely error
prone and make code hard to audit.Signed-off-by: David S. Miller
13 Jan, 2012
1 commit
-
commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to
RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
y).We miss needed barriers, even on x86, when y is not NULL.
Signed-off-by: Eric Dumazet
CC: Stephen Hemminger
CC: Paul E. McKenney
Signed-off-by: David S. Miller
19 Nov, 2011
1 commit
-
This provides flexibility to set the pipe handle
using setsockopt. The pipe can be enabled (if disabled) later
using ioctl.Signed-off-by: Hemant Ramdasi
Signed-off-by: Dinesh Kumar Sharma
Acked-by: Rémi Denis-Courmont
Signed-off-by: David S. Miller