Eric Lee / smarc-fsl-linux-kernel

17 Dec, 2011

4 commits

22931d3b9 unix_diag: Basic module skeleton

Includes basic module_init/_exit functionality, dump/get_exact stubs
and declares the basic API structures for request and response.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-17 02:48:27 +0800
fa7ff56f7 af_unix: Export stuff required for diag module

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-17 02:48:27 +0800
f65c1b534 sock_diag: Generalize requests cookies managements

The sk address is used as a cookie between dump/get_exact calls.
It will be required for unix socket dumping, so move it from
inet_diag to sock_diag.
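
A minimal sketch of the cookie idea, not the kernel code: the socket address is split into two 32-bit words when a socket is dumped, and a later get_exact request is accepted only if its cookie matches. The function names, the sample address, and the NO_COOKIE sentinel are assumptions made for illustration.

```python
NO_COOKIE = 0xFFFFFFFF  # assumed sentinel meaning "do not check the cookie"


def save_cookie(sk_addr):
    """Split a 64-bit socket address into two 32-bit cookie words."""
    return (sk_addr & 0xFFFFFFFF, (sk_addr >> 32) & 0xFFFFFFFF)


def check_cookie(sk_addr, cookie):
    """Return True if the request cookie is unset or matches the socket."""
    if cookie == (NO_COOKIE, NO_COOKIE):
        return True
    return cookie == save_cookie(sk_addr)


addr = 0xFFFF880012345678           # hypothetical socket address
cookie = save_cookie(addr)          # handed out by a dump
print(check_cookie(addr, cookie))          # same socket: accepted
print(check_cookie(addr + 0x400, cookie))  # different socket: rejected
```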

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-17 02:48:27 +0800
aec8dc62f sock_diag: Fix module netlink aliases

I've made a mistake when fixing the sock_/inet_diag aliases :(

1. The sock_diag layer should request the family-based alias,
not just the IPPROTO_IP one;
2. The inet_diag layer should request the AF_INET+protocol alias,
not just the protocol one.

Thus fix this.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-17 02:48:27 +0800

16 Dec, 2011

2 commits

b26e478f8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Conflicts:
drivers/net/ethernet/freescale/fsl_pq_mdio.c
net/batman-adv/translation-table.c
net/ipv6/route.c

David S. Miller
2011-12-16 15:11:14 +0800
c48e074c7 tcp_memcontrol: fix reversed if condition

We should only dereference the pointer if it's valid, not the other way
round.

Signed-off-by: Dan Carpenter
Signed-off-by: David S. Miller

Dan Carpenter
2011-12-16 00:59:44 +0800

15 Dec, 2011

2 commits

e6560d4df net: ping: remove some sparse errors

net/ipv4/sysctl_net_ipv4.c:78:6: warning: symbol 'inet_get_ping_group_range_table'
was not declared. Should it be static?

net/ipv4/sysctl_net_ipv4.c:119:31: warning: incorrect type in argument 2
(different signedness)
net/ipv4/sysctl_net_ipv4.c:119:31: expected int *range
net/ipv4/sysctl_net_ipv4.c:119:31: got unsigned int *

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-12-15 02:34:55 +0800
3a53943b5 cls_flow: remove one dynamic array

It's better to use a predefined size for this small automatic variable.

Removes a sparse error as well:

net/sched/cls_flow.c:288:13: error: bad constant expression

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-12-15 02:34:55 +0800

14 Dec, 2011

6 commits

de93cb2ea vlan: static functions

commit 6d4cdf47d2 (vlan: add 802.1q netpoll support) forgot to declare
as static some private functions.

Signed-off-by: Eric Dumazet
CC: Benjamin LaHaise
Signed-off-by: David S. Miller

Eric Dumazet
2011-12-14 15:39:30 +0800
f9586f79b vlan: add rtnl_dereference() annotations

The original code generates a Sparse warning:
net/8021q/vlan_core.c:336:9:
error: incompatible types in comparison expression (different address spaces)

It's ok to dereference __rcu pointers here because we are holding the
RTNL lock. I've added some calls to rtnl_dereference() to silence the
warning.

Signed-off-by: Dan Carpenter
Acked-by: Eric Dumazet
Acked-by: Jiri Pirko
Signed-off-by: David S. Miller

Dan Carpenter
2011-12-14 15:39:30 +0800
c63044f0d rtnetlink: rtnl_link_register() sanity test

Before adding a struct rtnl_link_ops into link_ops list, check it doesn't
clash with a prior one.

Based on a previous patch from Alexander Smirnov
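
A minimal sketch of the sanity test, assuming that a clash means two link types with the same kind string and that the clash is reported as -EEXIST. The Python list stands in for the kernel's link_ops list; this is a toy model, not the kernel implementation.

```python
import errno

link_ops = []  # registered kinds, standing in for the kernel's link_ops list


def rtnl_link_register(kind):
    """Register a link type; refuse a kind that is already present."""
    if kind in link_ops:
        return -errno.EEXIST
    link_ops.append(kind)
    return 0


print(rtnl_link_register("vlan"))  # first registration is accepted
print(rtnl_link_register("vlan"))  # duplicate kind is rejected
```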

Signed-off-by: Eric Dumazet
CC: Alexander Smirnov
Signed-off-by: David S. Miller

Eric Dumazet
2011-12-14 15:39:29 +0800
bb3c36863 ipv6: Check dest prefix length on original route not copied one in rt6_alloc_cow().

After commit 8e2ec639173f325977818c45011ee176ef2b11f6 ("ipv6: don't
use inetpeer to store metrics for routes.") the test in rt6_alloc_cow()
for setting the ANYCAST flag is now wrong.

'rt' will always now have a plen of 128, because it is set explicitly
to 128 by ip6_rt_copy.

So to restore the semantics of the test, check the destination prefix
length of 'ort'.

Signed-off-by: David S. Miller

David S. Miller
2011-12-14 06:35:06 +0800
b43faac69 ipv6: If neigh lookup fails during icmp6 dst allocation, propagate error.

Don't just succeed with a route that has a NULL neighbour attached.
This follows the behavior of addrconf_dst_alloc().

Allowing this kind of route to end up with a NULL neigh attached will
result in packet drops on output until the route is somehow
invalidated, since nothing will meanwhile try to lookup the neigh
again.

A statistic is bumped for the case where we see a neigh-less route on
output, but the resulting packet drop is otherwise silent in nature,
and frankly it's a hard error for this to happen and ipv6 should do
what ipv4 does which is say something in the kernel logs.

Signed-off-by: David S. Miller

David S. Miller
2011-12-14 05:51:51 +0800
5c3ddec73 net: Remove unused neighbour layer ops.

It's simpler to just keep these things out until there is a real user
of them, so we can see what the needs actually are, rather than keep
these things around as useless overhead.

Signed-off-by: David S. Miller

David S. Miller
2011-12-14 05:44:22 +0800

13 Dec, 2011

14 commits

90b41a1cd netem: add cell concept to simulate special MAC behavior

This extension can be used to simulate special link layer
characteristics. Simulate because packet data is not modified, only the
calculation base is changed to delay a packet based on the original
packet size and artificial cell information.

packet_overhead can be used to simulate a link layer header compression
scheme (e.g. set packet_overhead to -20) or with a positive
packet_overhead value an additional MAC header can be simulated. It is
also possible to "replace" the 14 byte Ethernet header with something
else.

cell_size and cell_overhead can be used to simulate link layer schemes,
based on cells, like some TDMA schemes. Another application area is MAC
schemes using a link layer fragmentation with a (small) header each.
Cell size is the maximum amount of data bytes within one cell. Cell
overhead is an additional variable to change the per-cell-overhead
(e.g. 5 byte header per fragment).

Example (5 kbit/s, 20 byte per packet overhead, cell-size 100 byte, per
cell overhead 5 byte):

tc qdisc add dev eth0 root netem rate 5kbit 20 100 5
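
The arithmetic behind this example can be sketched as follows. This is an illustration under stated assumptions, not the kernel code: the function names are made up, a partially filled cell is assumed to count as a full cell, and 5kbit is taken as 5000 bit/s.

```python
def effective_length(pkt_len, packet_overhead=0, cell_size=0, cell_overhead=0):
    """Length in bytes used for the delay calculation (the packet is untouched)."""
    length = pkt_len + packet_overhead
    if cell_size > 0:
        cells = -(-length // cell_size)          # round up to whole cells
        length = cells * (cell_size + cell_overhead)
    return length


def transmit_delay(pkt_len, rate_bps, **overheads):
    """Seconds needed to send the effective length at rate_bps bits per second."""
    return effective_length(pkt_len, **overheads) * 8 / rate_bps


# A 1000 byte packet with the settings from the tc example above:
# 1000 + 20 = 1020 bytes, 11 cells of (100 + 5) bytes = 1155 bytes.
length = effective_length(1000, packet_overhead=20, cell_size=100, cell_overhead=5)
delay = transmit_delay(1000, 5000, packet_overhead=20, cell_size=100, cell_overhead=5)
print(length, delay)
```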

Signed-off-by: Hagen Paul Pfeifer
Signed-off-by: Florian Westphal
Acked-by: Stephen Hemminger
Signed-off-by: David S. Miller

Hagen Paul Pfeifer
2011-12-13 08:44:48 +0800
c7c6575f2 Merge branch 'batman-adv/next' of git://git.open-mesh.org/linux-merge

David S. Miller
2011-12-13 08:26:07 +0800
3f1e6d3fd sch_gred: should not use GFP_KERNEL while holding a spinlock

gred_change_vq() is called under sch_tree_lock(sch).

This means a spinlock is held, and we are not allowed to sleep in this
context.

We might pre-allocate memory using GFP_KERNEL before taking spinlock,
but this is not suitable for stable material.

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-12-13 08:08:54 +0800
0850f0f5c Display maximum tcp memory allocation in kmem cgroup

This patch introduces kmem.tcp.max_usage_in_bytes file, living in the
kmem_cgroup filesystem. The root cgroup will display a value equal
to RESOURCE_MAX. This is to avoid introducing any locking schemes in
the network paths when cgroups are not being actively used.

All others will see the maximum memory ever used by this cgroup.

Signed-off-by: Glauber Costa
Reviewed-by: Hiroyouki Kamezawa
CC: David S. Miller
CC: Eric W. Biederman
Signed-off-by: David S. Miller

Glauber Costa
2011-12-13 08:04:11 +0800
ffea59e50 Display current tcp failcnt in kmem cgroup

This patch introduces kmem.tcp.failcnt file, living in the
kmem_cgroup filesystem. Following the pattern in the other
memcg resources, this file keeps a counter of how many times
allocation failed due to limits being hit in this cgroup.
The root cgroup will always show a failcnt of 0.
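
A toy model of how a limit, a usage value, a maximum-usage watermark and a failure counter relate for a non-root cgroup. The class and method names are illustrative assumptions, and the code does not reproduce the kernel's counter implementation or its locking.

```python
class TcpMemCounter:
    """Per-cgroup counter with a limit, a high watermark and a failure count."""

    def __init__(self, limit):
        self.limit = limit
        self.usage = 0
        self.max_usage = 0
        self.failcnt = 0

    def charge(self, nbytes):
        """Account nbytes; refuse and count a failure if the limit would be exceeded."""
        if self.usage + nbytes > self.limit:
            self.failcnt += 1
            return False
        self.usage += nbytes
        self.max_usage = max(self.max_usage, self.usage)
        return True

    def uncharge(self, nbytes):
        """Release nbytes; the high watermark is left untouched."""
        self.usage = max(0, self.usage - nbytes)


counter = TcpMemCounter(limit=4096)
counter.charge(3000)    # fits under the limit
counter.charge(2000)    # would exceed the limit, so failcnt is bumped
counter.uncharge(1000)
print(counter.usage, counter.max_usage, counter.failcnt)
```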

Signed-off-by: Glauber Costa
Reviewed-by: Hiroyouki Kamezawa
CC: David S. Miller
CC: Eric W. Biederman
Signed-off-by: David S. Miller

Glauber Costa
2011-12-13 08:04:11 +0800
5a6dd3437 Display current tcp memory allocation in kmem cgroup

This patch introduces kmem.tcp.usage_in_bytes file, living in the
kmem_cgroup filesystem. It is a simple read-only file that displays the
amount of kernel memory currently consumed by the cgroup.

Signed-off-by: Glauber Costa
Reviewed-by: Hiroyouki Kamezawa
CC: David S. Miller
CC: Eric W. Biederman
Signed-off-by: David S. Miller

Glauber Costa
2011-12-13 08:04:11 +0800
3aaabe234 tcp buffer limitation: per-cgroup limit

This patch uses the "tcp.limit_in_bytes" field of the kmem_cgroup to
effectively control the amount of kernel memory pinned by a cgroup.

This value is ignored in the root cgroup, and in all others,
caps the value specified by the admin in the net namespaces'
view of tcp_sysctl_mem.

If namespaces are being used, the admin is allowed to set a
value bigger than cgroup's maximum, the same way it is allowed
to set pretty much unlimited values in a real box.

Signed-off-by: Glauber Costa
Reviewed-by: Hiroyouki Kamezawa
CC: David S. Miller
CC: Eric W. Biederman
Signed-off-by: David S. Miller

Glauber Costa
2011-12-13 08:04:11 +0800
3dc43e3e4 per-netns ipv4 sysctl_tcp_mem

This patch allows each namespace to independently set up
its levels for tcp memory pressure thresholds. This patch
alone does not buy much: we need to make these values
per group of processes somehow. This is achieved in the
patches that follow in this patchset.

Signed-off-by: Glauber Costa
Reviewed-by: KAMEZAWA Hiroyuki
CC: David S. Miller
CC: Eric W. Biederman
Signed-off-by: David S. Miller

Glauber Costa
2011-12-13 08:04:11 +0800
d1a4c0b37 tcp memory pressure controls

This patch introduces memory pressure controls for the tcp
protocol. It uses the generic socket memory pressure code
introduced in earlier patches, and fills in the
necessary data in cg_proto struct.

Signed-off-by: Glauber Costa
Reviewed-by: KAMEZAWA Hiroyuki
CC: Eric W. Biederman
Signed-off-by: David S. Miller

Glauber Costa
2011-12-13 08:04:10 +0800
e1aab161e socket: initial cgroup code.

The goal of this work is to move the memory pressure tcp
controls to a cgroup, instead of just relying on global
conditions.

To avoid excessive overhead in the network fast paths,
the code that accounts allocated memory to a cgroup is
hidden inside a static_branch(). This branch is patched out
until the first non-root cgroup is created. So when nobody
is using cgroups, even if it is mounted, no significant performance
penalty should be seen.

This patch handles the generic part of the code, and has nothing
tcp-specific.

Signed-off-by: Glauber Costa
Reviewed-by: KAMEZAWA Hiroyuki
CC: Kirill A. Shutemov
CC: David S. Miller
CC: Eric W. Biederman
CC: Eric Dumazet
Signed-off-by: David S. Miller

Glauber Costa
2011-12-13 08:04:10 +0800
180d8cd94 foundations of per-cgroup memory pressure controlling.

This patch replaces all uses of struct sock fields' memory_pressure,
memory_allocated, sockets_allocated, and sysctl_mem with accessor
macros. Those macros can either receive a socket argument, or a mem_cgroup
argument, depending on the context they live in.

Since we're only doing a macro wrapping here, no performance impact at all is
expected in the case where we don't have cgroups disabled.

Signed-off-by: Glauber Costa
Reviewed-by: Hiroyouki Kamezawa
CC: David S. Miller
CC: Eric W. Biederman
CC: Eric Dumazet
Signed-off-by: David S. Miller

Glauber Costa
2011-12-13 08:04:10 +0800
72b36015b ipip, sit: copy parms.name after register_netdevice

Same fix as 731abb9cb2 for ipip and sit tunnel.
Commit 1c5cae815d removed an explicit call to dev_alloc_name in
ipip_tunnel_locate and ipip6_tunnel_locate, because register_netdevice
will now create a valid name, however the tunnel keeps a copy of the
name in the private parms structure. Fix this by copying the name back
after register_netdevice has successfully returned.

This shows up if you do a simple tunnel add, followed by a tunnel show:

$ sudo ip tunnel add mode ipip remote 10.2.20.211
$ ip tunnel
tunl0: ip/ip remote any local any ttl inherit nopmtudisc
tunl%d: ip/ip remote 10.2.20.211 local any ttl inherit
$ sudo ip tunnel add mode sit remote 10.2.20.212
$ ip tunnel
sit0: ipv6/ip remote any local any ttl 64 nopmtudisc 6rd-prefix 2002::/16
sit%d: ioctl 89f8 failed: No such device
sit%d: ipv6/ip remote 10.2.20.212 local any ttl inherit
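
A toy model of the stale-name problem shown above, with the fix applied as a copy-back step. All names here (Tunnel, alloc_name, register) are hypothetical helpers for illustration; assigning the lowest unused index is an assumption about how the "%d" template is resolved.

```python
def alloc_name(template, used):
    """Resolve a 'name%d' template to the lowest index not already in use."""
    if "%d" not in template:
        return template
    index = 0
    while template.replace("%d", str(index)) in used:
        index += 1
    return template.replace("%d", str(index))


class Tunnel:
    def __init__(self, template):
        self.dev_name = template    # name held by the net device
        self.parms_name = template  # private copy kept in the tunnel parms


def register(tunnel, used, copy_back):
    """Registration resolves the device name; the fix copies it back to parms."""
    tunnel.dev_name = alloc_name(tunnel.dev_name, used)
    used.add(tunnel.dev_name)
    if copy_back:
        tunnel.parms_name = tunnel.dev_name


used = {"tunl0"}
before_fix = Tunnel("tunl%d")
register(before_fix, used, copy_back=False)
after_fix = Tunnel("tunl%d")
register(after_fix, used, copy_back=True)
print(before_fix.dev_name, before_fix.parms_name)  # parms still holds the template
print(after_fix.dev_name, after_fix.parms_name)    # both hold the resolved name
```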

Cc: stable@vger.kernel.org
Signed-off-by: Ted Feng
Signed-off-by: David S. Miller

Ted Feng
2011-12-13 07:50:51 +0800
4af04aba9 ipv6: Fix for adding multicast route for loopback device automatically.

There is no obvious reason to add a default multicast route for loopback
devices; otherwise there would be a route entry whose dst.error is set to
-ENETUNREACH, which would block all multicast packets.

====================

[ more detailed explanation ]

The problem is that the resulting routing table depends on the order in
which interfaces are initialized, and in some situations that would block all
multicast packets. Suppose there are two interfaces on my computer
(lo and eth0). If we initialize 'lo' before 'eth0', the resulting routing
table (for multicast) would be

# ip -6 route show | grep ff00::
unreachable ff00::/8 dev lo metric 256 error -101
ff00::/8 dev eth0 metric 256

When sending multicast packets, the routing subsystem will return the first
route entry, which has an error set to -101 (ENETUNREACH).
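
A simplified sketch of why the order matters. It assumes a plain first-match scan over entries with equal prefix and metric, which is only a stand-in for the real routing lookup; the function name and the tuple layout are made up for illustration.

```python
import ipaddress


def lookup(routes, dst):
    """Return (device, error) of the first route whose prefix contains dst."""
    addr = ipaddress.ip_address(dst)
    for prefix, dev, error in routes:
        if addr in ipaddress.ip_network(prefix):
            return dev, error
    return None


ENETUNREACH = 101

# 'lo' initialized before 'eth0': the unreachable entry comes first.
lo_first = [("ff00::/8", "lo", -ENETUNREACH), ("ff00::/8", "eth0", 0)]
# Without the loopback multicast route only the usable entry remains.
eth0_only = [("ff00::/8", "eth0", 0)]

print(lookup(lo_first, "ff02::1"))   # hits the 'lo' entry and its error
print(lookup(eth0_only, "ff02::1"))  # hits the 'eth0' entry
```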

I know the kernel will set the default ipv6 address for 'lo' when it is up
and won't set the default multicast route for it, but there is no reason to
stop 'init' program from setting address for 'lo', and that is exactly what
systemd did.

I am sure something is wrong with either the kernel or systemd; currently I
believe the kernel causes this problem.

====================

Signed-off-by: Li Wei
Signed-off-by: David S. Miller

Li Wei
2011-12-13 07:48:18 +0800
f2abba492 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem

John W. Linville
2011-12-13 03:19:43 +0800

12 Dec, 2011

4 commits

b5a1eeef0 batman-adv: Only write requested number of byte to user buffer

Don't write more than the requested number of bytes of a batman-adv icmp
packet to the userspace buffer. Otherwise unrelated userspace memory might get
overwritten by the kernel.
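
The idea can be sketched as a bounded copy. This is a minimal illustration, assuming the read is clamped to the smaller of the requested count and the packet length; socket_read is a hypothetical name and the byte strings stand in for kernel and user buffers.

```python
def socket_read(packet, count):
    """Return at most count bytes of packet, as would be copied to the caller."""
    copy_len = min(count, len(packet))
    return packet[:copy_len]


packet = bytes(range(100))            # stand-in for a queued icmp packet
print(len(socket_read(packet, 20)))   # caller asked for fewer bytes than queued
print(len(socket_read(packet, 500)))  # caller asked for more bytes than queued
```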

Signed-off-by: Sven Eckelmann
Signed-off-by: Marek Lindner

Sven Eckelmann
2011-12-12 19:11:07 +0800
d18eb4533 batman-adv: Directly check read of icmp packet in copy_from_user

The access_ok read check can be directly done in copy_from_user since a failure
of access_ok is handled the same way as an error in __copy_from_user.

Signed-off-by: Sven Eckelmann
Signed-off-by: Marek Lindner

Sven Eckelmann
2011-12-12 19:11:06 +0800
c00b6856f batman-adv: bat_socket_read missing checks

Writing an icmp_packet_rr and then reading icmp_packet can lead to kernel
memory corruption, if __user *buf is just below TASK_SIZE.

Signed-off-by: Paul Kot
[sven@narfation.org: made it checkpatch clean]
Signed-off-by: Sven Eckelmann
Signed-off-by: Marek Lindner

Paul Kot
2011-12-12 19:11:06 +0800
dfd56b8b3 net: use IS_ENABLED(CONFIG_IPV6)

Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)

Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2011-12-12 07:25:16 +0800

11 Dec, 2011

2 commits

86e62ad6b udp_diag: Fix the !ipv6 case

Wrap the udp6 lookup into the proper ifdef-s.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-11 02:14:59 +0800
b872a2371 udp_diag: Make it module when ipv6 is a module

Eric Dumazet reported that when inet_diag is built-in, udp_diag is also
built-in, and when ipv6 is a module the udp6 lookup symbol is not found.

LD .tmp_vmlinux1
net/built-in.o: In function `udp_dump_one':
udp_diag.c:(.text+0xa2b40): undefined reference to `__udp6_lib_lookup'
make: *** [.tmp_vmlinux1] Erreur 1

Fix this by making udp diag build mode depend on both -- inet diag and ipv6.

Reported-by: Eric Dumazet
Signed-off-by: Pavel Emelyanov
Acked-by: Eric Dumazet
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-11 02:14:59 +0800

10 Dec, 2011

6 commits

507dd7961 udp_diag: Wire the udp_diag module into kbuild

Copy-s/tcp/udp/-paste from TCP bits.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-10 03:15:00 +0800
b6d640c22 udp_diag: Implement the dump-all functionality

Do the same as TCP does -- iterate the given udp_table, filter
sockets with bytecode and dump sockets into reply message.

The same filtering as for TCP applies, though only some of the
state bits really matter.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-10 03:15:00 +0800
a925aa00a udp_diag: Implement the get_exact dumping functionality

Do the same as TCP does -- lookup a socket in the given udp_table,
check cookie, fill the reply message with existing inet socket dumping
helper and send one back.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-10 03:15:00 +0800
52b7c59bc udp_diag: Basic skeleton

Introduce the transport level diag handler module for UDP (and UDP-lite)
sockets and register (empty for now) callbacks in the inet_diag module.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-10 03:15:00 +0800
fce823381 udp: Export code sk lookup routines

The UDP diag get_exact handler will require them to find a
socket by provided net, [sd]addr-s, [sd]ports and device.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-10 03:14:08 +0800
1942c518c inet_diag: Generalize inet_diag dump and get_exact calls

Introduce two callbacks in inet_diag_handler -- one for dumping all
sockets (with filters) and the other one for dumping a single sk.

Replace direct calls to icsk handlers with indirect calls to callbacks
provided by handlers.

Make existing TCP and DCCP handlers use provided helpers for icsk-s.

The UDP diag module will provide its own.

Signed-off-by: Pavel Emelyanov
Signed-off-by: David S. Miller

Pavel Emelyanov
2011-12-10 03:14:08 +0800