Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

03 Jun, 2014

5 commits

418c96ac1 net: filter: fix possible memory leak in __sk_prepare_filter()

__sk_prepare_filter() was reworked in commit bd4cf0ed3 (net: filter:
rework/optimize internal BPF interpreter's instruction set) so that it should
have uncharged memory once things went wrong. However that work isn't complete.
Error is handled only in __sk_migrate_filter() while memory can still leak in
the error path right after sk_chk_filter().
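The cleanup pattern at issue can be illustrated with a small userspace sketch. It is only an illustration under stated assumptions: account_sketch, charge_sketch(), uncharge_sketch() and prepare_sketch() are hypothetical names, not kernel functions, and the two flags merely simulate the validation and migration stages succeeding or failing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accounting object standing in for a socket's memory charge. */
struct account_sketch {
    size_t charged;
};

static void charge_sketch(struct account_sketch *a, size_t n)
{
    a->charged += n;
}

static void uncharge_sketch(struct account_sketch *a, size_t n)
{
    a->charged -= n;
}

/*
 * Two-stage preparation. check_ok and migrate_ok simulate the outcome of
 * the validation stage and the migration stage. Every failure leaves
 * through the same label, so no stage can return with the charge held.
 */
static int prepare_sketch(struct account_sketch *a, size_t size,
                          int check_ok, int migrate_ok)
{
    assert(a != NULL);
    charge_sketch(a, size);

    if (!check_ok)
        goto err_uncharge;  /* failure right after validation */
    if (!migrate_ok)
        goto err_uncharge;  /* failure during migration */
    return 0;

err_uncharge:
    uncharge_sketch(a, size);
    return -1;
}
```

Routing both failures through one label is a design choice that makes it harder for a newly added stage to skip the uncharge.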

Fixes: bd4cf0ed331a ("net: filter: rework/optimize internal BPF interpreter's instruction set")
Signed-off-by: Leon Yu
Acked-by: Alexei Starovoitov
Tested-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Leon Yu
2014-06-03 08:49:45 +0800
0cfa5c07d tcp: fix cwnd undo on DSACK in F-RTO

This bug was discovered through a recent F-RTO issue on the tcpm list
https://www.ietf.org/mail-archive/web/tcpm/current/msg08794.html

The bug is that currently F-RTO does not use DSACK to undo cwnd in
certain cases: upon receiving an ACK after the RTO retransmission in
F-RTO, and the ACK has DSACK indicating the retransmission is spurious,
the sender only calls tcp_try_undo_loss() if some never retransmitted
data is sacked (FLAG_ORIG_DATA_SACKED).

The correct behavior is to unconditionally call tcp_try_undo_loss so
the DSACK information is used properly to undo the cwnd reduction.

Signed-off-by: Yuchung Cheng
Signed-off-by: Neal Cardwell
Signed-off-by: David S. Miller

Yuchung Cheng
2014-06-03 07:50:49 +0800
2d7a85f4b netlink: Only check file credentials for implicit destinations

It was possible to get a setuid root or setcap executable to write to
its stdout or stderr (which had been made a netlink socket) and
inadvertently reconfigure the networking stack.

To prevent this we check that both the creator of the socket and
the current application have permission to reconfigure the network
stack.

Unfortunately this breaks Zebra, which always uses sendto/sendmsg
and creates its socket without any privileges.

To keep Zebra working don't bother checking if the creator of the
socket has privilege when a destination address is specified. Instead
rely exclusively on the privileges of the sender of the socket.

Note from Andy: This is exactly Eric's code except for some comment
clarifications and formatting fixes. Neither I nor, I think, anyone
else is thrilled with this approach, but I'm hesitant to wait on a
better fix since 3.15 is almost here.

Note to stable maintainers: This is a mess. An earlier series of
patches in 3.15 fix a rather serious security issue (CVE-2014-0181),
but they did so in a way that breaks Zebra. The offending series
includes:

commit aa4cf9452f469f16cea8c96283b641b4576d4a7b
Author: Eric W. Biederman
Date: Wed Apr 23 14:28:03 2014 -0700

net: Add variants of capable for use on netlink messages

If a given kernel version is missing that series of fixes, it's
probably worth backporting it and this patch. If that series is
present, then this fix is critical if you care about Zebra.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman"
Signed-off-by: Andy Lutomirski
Signed-off-by: David S. Miller

Eric W. Biederman
2014-06-03 07:34:09 +0800
39c36094d net: fix inet_getid() and ipv6_select_ident() bugs

I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery
is disabled.
Note how GSO/TSO packets do not have monotonically incrementing ID.

06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396)
06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212)
06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972)
06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292)
06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764)

It appears I introduced this bug in linux-3.1.

inet_getid() must return the old value of peer->ip_id_count,
not the new one.

Let's revert this part, and remove the prevention of
a null identification field in IPv6 Fragment Extension Header,
which is dubious and not even done properly.
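The "return the old value" requirement can be shown with a userspace sketch built on C11 atomics. This is an illustration, not the kernel's inet_getid(): peer_sketch and getid_sketch() are hypothetical names, and the counter layout is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-peer ID counter. */
struct peer_sketch {
    atomic_uint ip_id_count;
};

/*
 * Reserve (more + 1) consecutive IDs and return the first of them.
 * atomic_fetch_add() yields the value held *before* the addition, so
 * the caller gets the start of its own range rather than the start of
 * the next caller's range.
 */
static uint16_t getid_sketch(struct peer_sketch *p, unsigned int more)
{
    assert(p != NULL);
    unsigned int old = atomic_fetch_add(&p->ip_id_count, more + 1);
    return (uint16_t)old;
}
```

With a counter starting at 100, a single-segment packet would get ID 100, a following 45-segment GSO packet would get 101 and consume 101 through 145, and the next packet would get 146, which keeps the sequence monotonic.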

Fixes: 87c48fa3b463 ("ipv6: make fragment identifications less predictable")
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller

Eric Dumazet
2014-06-03 05:09:28 +0800
e0d7968ab bridge: Prevent insertion of FDB entry with disallowed vlan

br_handle_local_finish() is allowing us to insert an FDB entry with
disallowed vlan. For example, when port 1 and 2 are communicating in
vlan 10, and even if vlan 10 is disallowed on port 3, port 3 can
interfere with their communication by spoofed src mac address with
vlan id 10.

Note: Even if it is judged that a frame should not be learned, it should
not be dropped, because it is destined not for the forwarding layer but for
a higher layer. See IEEE 802.1Q-2011 8.13.10.

Signed-off-by: Toshiaki Makita
Acked-by: Vlad Yasevich
Signed-off-by: David S. Miller

Toshiaki Makita
2014-06-03 04:38:23 +0800

02 Jun, 2014

2 commits

c65c7a306 bridge: notify user space after fdb update

There have been a number of incidents recently where customers running KVM have
reported that VM hosts on different Hypervisors are unreachable. Based on
pcap traces we found that the bridge was broadcasting the ARP request out
onto the network. However, some NICs have an inbuilt switch which on occasion
broadcast the VM's ARP request back through the physical NIC on the
Hypervisor. This resulted in the bridge changing ports and incorrectly learning
that the VM's MAC address was external. As a result the ARP reply was directed
back onto the external network and the VM never updated its ARP cache. This patch
notifies the bridge command after an fdb has been updated, so that such
port toggling can be identified.

Signed-off-by: Jon Maxwell
Reviewed-by: Jiri Pirko
Acked-by: Toshiaki Makita
Acked-by: Stephen Hemminger
Signed-off-by: David S. Miller

Jon Maxwell
2014-06-02 13:14:50 +0800
4b9b1cdf8 net: fix wrong mac_len calculation for vlans

After 1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol") skb->mac_len is used as a start of the
calculation in skb_network_protocol() but that is not always correct. If
skb->protocol == 8021Q/AD, usually the vlan header is already inserted
in the skb (i.e. vlan reorder hdr == 0). Usually when the packet enters
dev_hard_xmit it has mac_len == 0 so we take 2 bytes from the
destination mac address (skb->data + VLAN_HLEN) as a type in
skb_network_protocol() and return vlan_depth == 4. In the case where TSO is
off, then the mac_len is set but it's == 18 (ETH_HLEN + VLAN_HLEN), so
skb_network_protocol() returns a type from inside the packet and
offset == 22. Also make vlan_depth unsigned as suggested before.
As suggested by Eric Dumazet, move the while() loop in the if() so we
can avoid additional testing in fast path.
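The offset arithmetic described above can be sketched in userspace C over a raw frame buffer. This follows the description in this message only and is not the kernel's skb_network_protocol(): every name ending in _sketch or _SKETCH is hypothetical, and proto is assumed to be passed in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN_SKETCH     14u     /* dst MAC + src MAC + EtherType */
#define VLAN_HLEN_SKETCH    4u      /* TCI + encapsulated EtherType */
#define ETH_P_8021Q_SKETCH  0x8100u
#define ETH_P_8021AD_SKETCH 0x88A8u

static int is_vlan_proto_sketch(uint16_t type)
{
    return type == ETH_P_8021Q_SKETCH || type == ETH_P_8021AD_SKETCH;
}

/*
 * frame:   start of the Ethernet header
 * len:     number of valid bytes in frame
 * proto:   outer protocol, host byte order
 * mac_len: recorded link-layer header length, or 0 if not set
 * depth:   out, offset of the network header
 * Returns the network-layer EtherType, or 0 if the frame is too short.
 */
static uint16_t network_protocol_sketch(const uint8_t *frame, size_t len,
                                        uint16_t proto, size_t mac_len,
                                        size_t *depth)
{
    size_t vlan_depth = mac_len;
    uint16_t type = proto;

    assert(frame != NULL && depth != NULL);

    /* Only walk VLAN headers when the outer protocol says one is present. */
    if (is_vlan_proto_sketch(type)) {
        if (vlan_depth) {
            /* mac_len already counts the VLAN header: step back over it. */
            if (vlan_depth < VLAN_HLEN_SKETCH)
                return 0;
            vlan_depth -= VLAN_HLEN_SKETCH;
        } else {
            /* mac_len not set: the VLAN header follows the Ethernet header. */
            vlan_depth = ETH_HLEN_SKETCH;
        }
        do {
            if (len < vlan_depth + VLAN_HLEN_SKETCH)
                return 0;
            /* Encapsulated EtherType sits after the 2-byte TCI. */
            type = (uint16_t)((frame[vlan_depth + 2] << 8) |
                              frame[vlan_depth + 3]);
            vlan_depth += VLAN_HLEN_SKETCH;
        } while (is_vlan_proto_sketch(type));
    }

    *depth = vlan_depth;
    return type;
}
```

For a single-tagged IPv4 frame this yields type 0x800 at depth 18 whether mac_len arrives as 0 or as 18, and the untagged fast path never enters the loop.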

Here are a few netperf tests + debug printk's to illustrate:
cat netperf.tso-on.reorder-on.bugged
- Vlan -> device (reorder on, default, this case is okay)
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

87380 16384 16384 10.00 7111.54
[ 81.605435] skb->len 65226 skb->gso_size 1448 skb->proto 0x800
skb->mac_len 0 vlan_depth 0 type 0x800

- Vlan -> device (reorder off, bad)
cat netperf.tso-on.reorder-off.bugged
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

87380 16384 16384 10.00 241.35
[ 204.578332] skb->len 1518 skb->gso_size 0 skb->proto 0x8100
skb->mac_len 0 vlan_depth 4 type 0x5301
0x5301 are the last two bytes of the destination mac.

And if we stop TSO, we may get even the following:
[ 83.343156] skb->len 2966 skb->gso_size 1448 skb->proto 0x8100
skb->mac_len 18 vlan_depth 22 type 0xb84
Because mac_len already accounts for VLAN_HLEN.

After the fix:
cat netperf.tso-on.reorder-off.fixed
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
192.168.3.1 () port 0 AF_INET
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^6bits/sec

87380 16384 16384 10.01 5001.46
[ 81.888489] skb->len 65230 skb->gso_size 1448 skb->proto 0x8100
skb->mac_len 0 vlan_depth 18 type 0x800

CC: Vlad Yasevich
CC: Eric Dumazet
CC: Daniel Borkman
CC: David S. Miller

Fixes: 1e785f48d29a ("net: Start with correct mac_len in
skb_network_protocol")
Signed-off-by: Nikolay Aleksandrov
Signed-off-by: David S. Miller

Nikolay Aleksandrov
2014-06-02 10:39:13 +0800

01 Jun, 2014

1 commit

6ce995c6f Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- prevent NULL dereference in multicast code

Antonio Quartulli says:

====================
pull request net: batman-adv 20140527

here you have another very small fix intended for net/linux-3.15.
It prevents some multicast functions from dereferencing a NULL pointer.
(Actually it was nothing more than a typo)
I hope it is not too late for such a small patch.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-06-01 11:01:47 +0800

31 May, 2014

2 commits

af0a171c0 batman-adv: fix NULL pointer dereferences

Was introduced with 4c8755d69cbde2ec464a39c932aed0a83f9ff89f
("batman-adv: Send multicast packets to nodes with a WANT_ALL flag")

Reported-by: Sven Eckelmann
Signed-off-by: Marek Lindner
Acked-by: Antonio Quartulli
Signed-off-by: Linus Lüssing
Signed-off-by: Antonio Quartulli

Marek Lindner
2014-05-31 16:07:14 +0800
dbfc4b698 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
The following patchset contains a late fix for IPVS:

* Fix crash when trying to remove the transport header with non-linear
skbuffs, this was introduced in 3.6-rc. Patch from Peter Christensen
via the IPVS folks.

I'll pass this to -stable once this hits mainstream.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-05-31 08:56:09 +0800

26 May, 2014

1 commit

f44a5f45f ipvs: Fix panic due to non-linear skb

Receiving an ICMP response to an IPIP packet in a non-linear skb could
cause a kernel panic in __skb_pull.

The problem was introduced in
commit f2edb9f7706dcb2c0d9a362b2ba849efe3a97f5e ("ipvs: implement
passive PMTUD for IPIP packets").

Signed-off-by: Peter Christensen
Acked-by: Julian Anastasov
Signed-off-by: Simon Horman

Peter Christensen
2014-05-26 09:22:46 +0800

25 May, 2014

1 commit

8646224cd Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

John W. Linville says:

====================
pull request: wireless 2014-05-23

I have two more fixes intended for the 3.15 stream...

For the iwlwifi one, Emmanuel says:

"A race has been discovered in the beacon filtering code. Since the
fix is too big for 3.15, I disable here the feature."

For the bluetooth one, Gustavo says:

"This pull request contains a very important fix for 3.15. Here we fix the
permissions of a debugfs file that would otherwise allow unauthorized users
to write content to it."

Please let me know if there are problems!
====================

Signed-off-by: David S. Miller

David S. Miller
2014-05-25 02:06:19 +0800

24 May, 2014

1 commit

5fa6a683c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"It looks like a sizable collection but this is nearly 3 weeks of bug
fixing while you were away.

1) Fix crashes over IPSEC tunnels with NAT, the latter can reroute
the packet through a non-IPSEC protected path and the code has to
be able to handle SKBs attached to routes lacking an attached xfrm
state. From Steffen Klassert.

2) Fix OOPSs in ipv4 and ipv6 ipsec layers for unsupported
sub-protocols, also from Steffen Klassert.

3) Set local_df on fragmented netfilter skbs otherwise we won't be
able to forward successfully, from Florian Westphal.

4) cdc_mbim ipv6 neighbour code does __vlan_find_dev_deep without
holding RCU lock, from Bjorn Mork.

5) local_df test in ip_may_fragment is inverted, from Florian
Westphal.

6) jme driver doesn't check for DMA mapping failures, from Neil
Horman.

7) qlogic driver doesn't calculate number of TX queues properly, from
Shahed Shaikh.

8) fib_info_cnt can drift irreversibly positive if we fail to
allocate the fi->fib_metrics array, from Sergey Popovich.

9) Fix use after free in ip6_route_me_harder(), also from Sergey
Popovich.

10) When SYSCTL is disabled, we don't handle local_port_range and
ping_group_range defaults properly at all, from Cong Wang.

11) Unaccelerated VLAN tagged frames improperly handled by cdc_mbim
driver, fix from Bjorn Mork.

12) cassini driver needs nested lock annotations for TX locking, from
Emil Goode.

13) On init error ipv6 VTI driver can unregister pernet ops twice,
oops. Fix from Mathias Krause.

14) If macvlan device is down, don't propagate IFF_ALLMULTI changes,
from Peter Christensen.

15) Missing NULL pointer check while parsing netlink config options in
ip6_tnl_validate(). From Susant Sahani.

16) Fix handling of neighbour entries during ipv6 router reachability
probing, from Duan Jiong.

17) x86 and s390 JIT address randomization has some address
calculation bugs leading to crashes, from Alexei Starovoitov and
Heiko Carstens.

18) Clear up those uglies with nop patching and net_get_random_once(),
from Hannes Frederic Sowa.

19) Option length miscalculated in ip6_append_data(), fix also from
Hannes Frederic Sowa.

20) A while ago we fixed a race during device unregistry when a
namespace went down, turns out there is a second place that needs
similar protection. From Cong Wang.

21) In the new Altera TSE driver multicast filtering isn't working,
disable it and just use promisc mode until the cause is found.
From Vince Bridgers.

22) When we disable router enabling in ipv6 we have to flush the
cached routes explicitly, from Duan Jiong.

23) NBMA tunnels should not cache routes on the tunnel object because
the key is variable, from Timo Teräs.

24) With stacked devices GRO information in skb->cb[] can be not setup
properly, make sure it is in all code paths. From Eric Dumazet.

25) Really fix stacked vlan locking, multiple levels of nesting with
intervening non-vlan devices are possible. From Vlad Yasevich.

26) Fallback ipip tunnel device's mtu is not setup properly, from
Steffen Klassert.

27) The packet scheduler's tcindex filter can crash because we
structure copy objects with list_head's inside, oops. From Cong
Wang.

28) Fix CHECKSUM_COMPLETE handling for ipv6 GRE tunnels, from Eric
Dumazet.

29) In some configurations 'itag' in __mkroute_input() can end up
being used uninitialized because of how fib_validate_source()
works. Fix it by explicitly initializing itag to zero like all the
other fib_validate_source() callers do, from Li RongQing"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
batman: fix a bogus warning from batadv_is_on_batman_iface()
ipv4: initialise the itag variable in __mkroute_input
bonding: Send ALB learning packets using the right source
bonding: Don't assume 802.1Q when sending alb learning packets.
net: doc: Update references to skb->rxhash
stmmac: Remove unbalanced clk_disable call
ipv6: gro: fix CHECKSUM_COMPLETE support
net_sched: fix an oops in tcindex filter
can: peak_pci: prevent use after free at netdev removal
ip_tunnel: Initialize the fallback device properly
vlan: Fix build error wth vlan_get_encap_level()
can: c_can: remove obsolete STRICT_FRAME_ORDERING Kconfig option
MAINTAINERS: Pravin Shelar is Open vSwitch maintainer.
bnx2x: Convert return 0 to return rc
bonding: Fix alb mode to only use first level vlans.
bonding: Fix stacked device detection in arp monitoring
macvlan: Fix lockdep warnings with stacked macvlan devices
vlan: Fix lockdep warning with stacked vlan devices.
net: Allow for more then a single subclass for netif_addr_lock
net: Find the nesting level of a given device by type.
...

Linus Torvalds
2014-05-24 06:29:43 +0800

23 May, 2014

3 commits

5ca2504ea Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/…

…wireless into for-davem

John W. Linville
2014-05-23 22:55:58 +0800
b6ed54986 batman: fix a bogus warning from batadv_is_on_batman_iface()

batman tries to search dev->iflink to check if it's a batman interface,
but ->iflink could be 0, which is not a valid ifindex. It should just
avoid the iflink == 0 case.

Reported-by: Jet Chen
Tested-by: Jet Chen
Cc: David S. Miller
Cc: Steffen Klassert
Cc: Antonio Quartulli
Cc: Marek Lindner
Signed-off-by: Cong Wang
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

Cong Wang
2014-05-23 05:23:00 +0800
fbdc0ad09 ipv4: initialise the itag variable in __mkroute_input

The value of itag is a random value from the stack, and may not be initialised
by fib_validate_source(), which calls fib_combine_itag(), if
CONFIG_IP_ROUTE_CLASSID is not set.

This makes the cached dst uncertain.

Signed-off-by: Li RongQing
Acked-by: Alexei Starovoitov
Signed-off-by: David S. Miller

Li RongQing
2014-05-23 03:57:36 +0800

22 May, 2014

2 commits

4de462ab6 ipv6: gro: fix CHECKSUM_COMPLETE support

When GRE support was added in linux-3.14, CHECKSUM_COMPLETE handling
broke on GRE+IPv6 because we did not update/use the appropriate csum :

GRO layer is supposed to use/update NAPI_GRO_CB(skb)->csum instead of
skb->csum

Tested using a GRE tunnel and IPv6 traffic. GRO aggregation now happens
at the first level (ethernet device) instead of being done in gre
tunnel. Native IPv6+TCP is still properly aggregated.

Fixes: bf5a755f5e918 ("net-gre-gro: Add GRE support to the GRO stack")
Signed-off-by: Eric Dumazet
Cc: Jerry Chu
Signed-off-by: David S. Miller

Eric Dumazet
2014-05-22 05:18:47 +0800
bf63ac73b net_sched: fix an oops in tcindex filter

Kelly reported the following crash:

IP: [] tcf_action_exec+0x46/0x90
PGD 3009067 PUD 300c067 PMD 11ff30067 PTE 800000011634b060
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
CPU: 1 PID: 639 Comm: dhclient Not tainted 3.15.0-rc4+ #342
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
task: ffff8801169ecd00 ti: ffff8800d21b8000 task.ti: ffff8800d21b8000
RIP: 0010:[] [] tcf_action_exec+0x46/0x90
RSP: 0018:ffff8800d21b9b90 EFLAGS: 00010283
RAX: 00000000ffffffff RBX: ffff88011634b8e8 RCX: ffff8800cf7133d8
RDX: ffff88011634b900 RSI: ffff8800cf7133e0 RDI: ffff8800d210f840
RBP: ffff8800d21b9bb0 R08: ffffffff8287bf60 R09: 0000000000000001
R10: ffff8800d2b22b24 R11: 0000000000000001 R12: ffff8800d210f840
R13: ffff8800d21b9c50 R14: ffff8800cf7133e0 R15: ffff8800cad433d8
FS: 00007f49723e1840(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff88011634b8f0 CR3: 00000000ce469000 CR4: 00000000000006e0
Stack:
ffff8800d2170188 ffff8800d210f840 ffff8800d2171b90 0000000000000000
ffff8800d21b9be8 ffffffff817c55bb ffff8800d21b9c50 ffff8800d2171b90
ffff8800d210f840 ffff8800d21b0300 ffff8800d21b9c50 ffff8800d21b9c18
Call Trace:
[] tcindex_classify+0x88/0x9b
[] tc_classify_compat+0x3e/0x7b
[] tc_classify+0x25/0x9f
[] htb_enqueue+0x55/0x27a
[] dsmark_enqueue+0x165/0x1a4
[] __dev_queue_xmit+0x35e/0x536
[] dev_queue_xmit+0x10/0x12
[] packet_sendmsg+0xb26/0xb9a
[] ? __lock_acquire+0x3ae/0xdf3
[] __sock_sendmsg_nosec+0x25/0x27
[] sock_aio_write+0xd0/0xe7
[] do_sync_write+0x59/0x78
[] vfs_write+0xb5/0x10a
[] SyS_write+0x49/0x7f
[] system_call_fastpath+0x16/0x1b

This is because we memcpy struct tcindex_filter_result which contains
struct tcf_exts, obviously struct list_head can not be simply copied.
This is a regression introduced by commit 33be627159913b094bb578
(net_sched: act: use standard struct list_head).

It's not very easy to fix it as the code is a mess:

	if (old_r)
		memcpy(&cr, r, sizeof(cr));
	else {
		memset(&cr, 0, sizeof(cr));
		tcf_exts_init(&cr.exts, TCA_TCINDEX_ACT, TCA_TCINDEX_POLICE);
	}
	...
	tcf_exts_change(tp, &cr.exts, &e);
	...
	memcpy(r, &cr, sizeof(cr));

The above code should be equivalent to:

	tcindex_filter_result_init(&cr);
	if (old_r)
		cr.res = r->res;
	...
	if (old_r)
		tcf_exts_change(tp, &r->exts, &e);
	else
		tcf_exts_change(tp, &cr.exts, &e);
	...
	r->res = cr.res;

after this change, since there is no need to copy struct tcf_exts.

It also fixes other places that zero structs containing struct tcf_exts.
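Why a struct holding a list head cannot simply be copied can be seen in a userspace sketch. The types below are modelled loosely on the kernel's circular struct list_head but are hypothetical, as are result_sketch and both copy helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal circular doubly linked list head. */
struct list_head_sketch {
    struct list_head_sketch *next, *prev;
};

static void list_init_sketch(struct list_head_sketch *h)
{
    h->next = h;
    h->prev = h;
}

/* An empty circular list is one whose head points back at itself. */
static int list_empty_sketch(const struct list_head_sketch *h)
{
    return h->next == h;
}

/* Hypothetical container: a plain value plus an embedded list head. */
struct result_sketch {
    int classid;
    struct list_head_sketch actions;
};

/*
 * Byte-wise copy. The copied head keeps pointing at the source's head,
 * so the copy no longer looks empty and walking it leaves the copy.
 */
static void copy_bytes_sketch(struct result_sketch *dst,
                              const struct result_sketch *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(*dst));
}

/* Field-wise copy: initialise the head in place, copy only plain data. */
static void copy_fields_sketch(struct result_sketch *dst,
                               const struct result_sketch *src)
{
    assert(dst != NULL && src != NULL);
    list_init_sketch(&dst->actions);
    dst->classid = src->classid;
}
```

After copy_bytes_sketch() the destination's actions.next still holds the address of the source's head. The field-wise variant avoids this by copying only the plain members and leaving the list head initialised in place.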

Fixes: commit 33be627159913b0 (net_sched: act: use standard struct list_head)
Reported-by: Kelly Anderson
Tested-by: Kelly Anderson
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

Cong Wang
2014-05-22 04:47:13 +0800

21 May, 2014

2 commits

78ff4be45 ip_tunnel: Initialize the fallback device properly

We need to initialize the fallback device to have a correct mtu
set on this device. Otherwise the mtu is set to null and the device
is unusable.

Fixes: fd58156e456d ("IPIP: Use ip-tunneling code.")
Cc: Pravin B Shelar
Signed-off-by: Steffen Klassert
Signed-off-by: David S. Miller

Steffen Klassert
2014-05-21 14:08:32 +0800
d050de607 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
Netfilter/nftables fixes for net

The following patchset contains nftables fixes for your net tree, they
are:

1) Fix crash when using the goto action in a rule by making sure that
we always fall back on the base chain. Otherwise, this may try to
access the counter memory area of non-base chains, which does not
exist.

2) Fix several aspects of the rule tracing that are currently broken:

* Reset rule number counter after goto/jump action, otherwise the
tracing reports a bogus rule number.
* Fix tracing of the goto action.
* Fix bogus rule number counter after goto.
* Fix missing return trace after finishing the walk through the
non-base chain.
* Fix missing trace when matching non-terminal rule.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-05-21 13:24:19 +0800

20 May, 2014

1 commit

20b4f9c73 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

John W. Linville
2014-05-20 04:34:27 +0800

17 May, 2014

12 commits

44a408553 bonding: Fix stacked device detection in arp monitoring

Prior to commit fbd929f2dce460456807a51e18d623db3db9f077
bonding: support QinQ for bond arp interval

the arp monitoring code allowed for proper detection of devices
stacked on top of vlans. Since the above commit, the
code can still detect a device stacked on top of a single
vlan, but not a device stacked on top of a Q-in-Q configuration.
The search will only set the inner vlan tag if the route
device is the vlan device. However, this is not always the
case, as it is possible to extend the stacked configuration.

With this patch it is possible to provision devices on
top of a Q-in-Q vlan configuration that should be used as
a source of ARP monitoring information.

For example:
ip link add link bond0 vlan10 type vlan proto 802.1q id 10
ip link add link vlan10 vlan100 type vlan proto 802.1q id 100
ip link add link vlan100 type macvlan

Note: This patch limits the number of stacked VLANs to 2,
just like before. The original, however, had another issue
in that if we had more than 2 levels of VLANs, we would end
up generating incorrectly tagged traffic. This is no longer
possible.

Fixes: fbd929f2dce460456807a51e18d623db3db9f077 (bonding: support QinQ for bond arp interval)
CC: Jay Vosburgh
CC: Veaceslav Falico
CC: Andy Gospodarek
CC: Ding Tianhong
CC: Patric McHardy
Signed-off-by: Vlad Yasevich
Signed-off-by: David S. Miller

Vlad Yasevich
2014-05-17 10:29:05 +0800
d38569ab2 vlan: Fix lockdep warning with stacked vlan devices.

This reverts commit dc8eaaa006350d24030502a4521542e74b5cb39f.
vlan: Fix lockdep warning when vlan dev handle notification

Instead we use the new API to find the lock subclass of
our vlan device. This way we can support configurations where
vlans are interspersed with other devices:
bond -> vlan -> macvlan -> vlan

Signed-off-by: Vlad Yasevich
Signed-off-by: David S. Miller

Vlad Yasevich
2014-05-17 10:14:49 +0800
4085ebe8c net: Find the nesting level of a given device by type.

Multiple devices in the kernel can be stacked/nested and they
need to know their nesting level for the purposes of lockdep.
This patch provides a generic function that determines a nesting
level of a particular device by its type (ex: vlan, macvlan, etc).
We only care about nesting of the same type of devices.

For example:
eth0
Signed-off-by: David S. Miller

Vlad Yasevich
2014-05-17 10:14:49 +0800
29e982427 net: gro: make sure skb->cb[] initial content has not to be zero

Starting from linux-3.13, GRO attempts to build full size skbs.

Problem is the commit assumed one particular field in skb->cb[]
was clean, but it is not the case on some stacked devices.

Timo reported a crash in case traffic is decrypted before
reaching a GRE device.

Fix this by initializing NAPI_GRO_CB(skb)->last at the right place,
this also removes one conditional.

Thanks a lot to Timo for providing full reports and bisecting this.

Fixes: 8a29111c7ca6 ("net: gro: allow to build full sized skb")
Bisected-by: Timo Teras
Signed-off-by: Eric Dumazet
Tested-by: Timo Teräs
Signed-off-by: David S. Miller

Eric Dumazet
2014-05-17 05:24:54 +0800
22fb22eae ipv4: ip_tunnels: disable cache for nbma gre tunnels

The connected check fails to check for ip_gre nbma mode tunnels
properly. ip_gre creates temporary tnl_params with daddr specified
to pass-in the actual target on per-packet basis from neighbor
layer. Detect these tunnels by inspecting the actual tunnel
configuration.

Minimal test case:
ip route add 192.168.1.1/32 via 10.0.0.1
ip route add 192.168.1.2/32 via 10.0.0.2
ip tunnel add nbma0 mode gre key 1 tos c0
ip addr add 172.17.0.0/16 dev nbma0
ip link set nbma0 up
ip neigh add 172.17.0.1 lladdr 192.168.1.1 dev nbma0
ip neigh add 172.17.0.2 lladdr 192.168.1.2 dev nbma0
ping 172.17.0.1
ping 172.17.0.2

The second ping should be going to 192.168.1.2 via 10.0.0.2;
but cached gre tunnel level route is used and it's actually going
to 192.168.1.1 via 10.0.0.1.

The lladdr's need to go to separate dst for the bug to trigger.
Test case uses separate route entries, but this can also happen
when the route entry is same: if there is a nexthop exception or
the GRE tunnel is IPsec'ed in which case the dst points to xfrm
bundle unique to the gre lladdr.

Fixes: 7d442fab0a67 ("ipv4: Cache dst in tunnels")
Signed-off-by: Timo Teräs
Cc: Tom Herbert
Cc: Eric Dumazet
Signed-off-by: David S. Miller

Timo Teräs
2014-05-17 04:58:41 +0800
d1c0b471b net/dsa/dsa.c: increment chip_index during of_node handling on dsa_of_probe()

Adding more than one chip on device-tree currently causes the probing
routine to always use the first chip's data pointer.

Signed-off-by: Fabian Godehardt
Acked-by: Florian Fainelli
Signed-off-by: David S. Miller

Fabian Godehardt
2014-05-17 04:56:33 +0800
2e47b2919 net: ipv6: make "ip -6 route get mark xyz" work.

Currently, "ip -6 route get mark xyz" ignores the mark passed in
by userspace. Make it honour the mark, just like IPv4 does.

Signed-off-by: Lorenzo Colitti
Signed-off-by: David S. Miller

Lorenzo Colitti
2014-05-17 04:50:30 +0800
2f67cc87d Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- fix NULL dereference in batadv_orig_hardif_seq_print_text()
- fix reference counting imbalance when using fragmentation
- avoid access to orig_node objects after they have been free'd
- fix local TT check for outgoing arp requests in DAT

David S. Miller
2014-05-17 04:28:53 +0800
202630b44 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless

John W. Linville says:

====================
pull request: wireless 2014-05-15

Please pull this batch of fixes for the 3.15 stream...

For the mac80211 bits, Johannes says:

"One fix is to get better VHT performance and the other fixes tracing
garbage or other potential issues with the interface name tracing."

And...

"This has a fix from Emmanuel for a problem I failed to fix - when
association is in progress then it needs to be cancelled while
suspending (I had fixed the same for authentication). Also included a
fix from myself for a userspace API problem that hit the iw tool and a
fix to the remain-on-channel framework."

For the iwlwifi bits, Emmanuel says:

"Alex fixes the scan by disabling the fragmented scan. David prevents
scan offload while associated, the firmware seems not to like it. I
fix a stupid bug I made in BT Coex, and fix a bad #ifdef clause in rate
scaling. Along with that there is a fix for a NULL pointer exception
that can happen if we load the driver and our ISR gets called because
the interrupt line is shared. The fix has been tested by the reporter."

And...

"We have here a fix from David Spinadel that makes a previous fix more
complete, and an off-by-one issue fixed by Eliad in the same area.
I fix the monitor that broke on the way."

Beyond that...

Daniel Kim's one-liner fixes a brcmfmac regression caused by a typo
in an earlier commit.

Rajkumar Manoharan fixes an ath9k oops reported by David Herrmann.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-05-17 03:45:56 +0800
fde0133b9 af_rxrpc: Fix XDR length check in rxrpc key demarshalling.

There may be padding on the ticket contained in the key payload, so just ensure
that the claimed token length is large enough, rather than exactly the right
size.
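XDR pads variable-length opaque data to a multiple of four bytes, which is why an exact-match length test can reject a valid token. The sketch below contrasts the two checks. It is an illustration only: the function names are hypothetical, and the split into fixed_len and ticket_len is an assumption rather than the actual rxrpc token layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round n up to the next multiple of 4, as XDR does for opaque data. */
static size_t xdr_pad4_sketch(size_t n)
{
    return (n + 3u) & ~(size_t)3u;
}

/* Exact-match check: fails whenever the ticket carries padding. */
static int toklen_exact_sketch(size_t toklen, size_t fixed_len,
                               size_t ticket_len)
{
    return toklen == fixed_len + ticket_len;
}

/* Lower-bound check: the claimed length only has to be large enough. */
static int toklen_at_least_sketch(size_t toklen, size_t fixed_len,
                                  size_t ticket_len)
{
    /* Guard the addition against wrap-around. */
    assert(fixed_len <= SIZE_MAX - ticket_len);
    return toklen >= fixed_len + ticket_len;
}
```

Taking a hypothetical 32-byte fixed part and a 10-byte ticket, the ticket is padded to 12 bytes and the token claims 44 bytes. The exact check expects 42 and rejects it, while the lower-bound check accepts it and still rejects a token that claims only 40.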

Signed-off-by: Nathaniel Wesley Filardo
Signed-off-by: David Howells
Signed-off-by: David S. Miller

Nathaniel W Filardo
2014-05-17 03:24:47 +0800
f140662f3 crush: decode and initialize chooseleaf_vary_r

Commit e2b149cc4ba0 ("crush: add chooseleaf_vary_r tunable") added the
crush_map::chooseleaf_vary_r field but missed the decode part. This
led to misdirected requests caused by incorrect raw crush mapping
sets.

Fixes: http://tracker.ceph.com/issues/8226

Reported-and-Tested-by: Dmitry Smirnov
Signed-off-by: Ilya Dryomov
Reviewed-by: Sage Weil

Ilya Dryomov
2014-05-17 01:29:55 +0800
178eda29c libceph: fix corruption when using page_count 0 page in rbd

It has been reported that using ZFSonLinux on rbd will result in memory
corruption. The bug report can be found here:

https://github.com/zfsonlinux/spl/issues/241
http://tracker.ceph.com/issues/7790

The reason is that ZFS will send pages with page_count 0 into rbd, which in
turn sends them to tcp_sendpage. However, tcp_sendpage cannot deal with
page_count 0, as it will do get_page and put_page, and erroneously free the
page.

This type of issue has been noted before, and handled in iscsi, drbd,
etc. So, rbd should also handle this. This fix addresses the issue by falling
back to the slower sendmsg when a page with page_count 0 is detected.

Cc: Sage Weil
Cc: Yehuda Sadeh
Cc: stable@vger.kernel.org
Signed-off-by: Chunwei Chen
Reviewed-by: Ilya Dryomov

Chunwei Chen
2014-05-17 01:29:26 +0800

16 May, 2014

7 commits

be7a010d6 ipv6: update Destination Cache entries when gateway turn into host

RFC 4861 states in 7.2.5:

The IsRouter flag in the cache entry MUST be set based on the
Router flag in the received advertisement. In those cases
where the IsRouter flag changes from TRUE to FALSE as a result
of this update, the node MUST remove that router from the
Default Router List and update the Destination Cache entries
for all destinations using that neighbor as a router as
specified in Section 7.3.3. This is needed to detect when a
node that is used as a router stops forwarding packets due to
being configured as a host.

Currently, when handling an NA message whose IsRouter flag changes from
TRUE to FALSE, the kernel only removes the router from the Default Router
List and doesn't update the Destination Cache entries.

To update those Destination Cache entries, I introduce the function
rt6_clean_tohost().
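
The idea can be sketched in userspace C as a pass over a route table that
drops every entry still using the former router as its next hop. Everything
here (toy_route, TOY_RTF_GATEWAY, toy_clean_tohost, the flat array) is a
hypothetical simplification and not the kernel's rt6_clean_tohost()
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_RTF_GATEWAY 0x1u  /* hypothetical flag: route goes via a gateway */

struct toy_route {
    unsigned int flags;
    unsigned char gateway[16];  /* IPv6 address of the next hop */
};

/*
 * Remove, in place, every route that uses `gateway` as its next hop.
 * Returns the number of routes that remain.
 */
static size_t toy_clean_tohost(struct toy_route *routes, size_t n,
                               const unsigned char gateway[16])
{
    size_t kept = 0;

    assert(routes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int via_gw = (routes[i].flags & TOY_RTF_GATEWAY) &&
                     memcmp(routes[i].gateway, gateway, 16) == 0;

        if (!via_gw)
            routes[kept++] = routes[i];  /* keep unrelated routes */
    }
    return kept;
}
```

Routes through other gateways, and routes that do not use a gateway at all,
are left untouched; only destinations that depended on the neighbor acting as
a router are flushed.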

Signed-off-by: Duan Jiong
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller

Duan Jiong
2014-05-16 11:26:27 +0800
f895f0cfb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Conflicts:
net/ipv4/ip_vti.c

Steffen Klassert says:

====================
pull request (net): ipsec 2014-05-15

This pull request has a merge conflict in net/ipv4/ip_vti.c
between commit 8d89dcdf80d8 ("vti: don't allow to add the same
tunnel twice") and commit a32452366b72 ("vti4:Don't count header
length twice"). It can be solved like it is done in linux-next.

1) Fix an ipv6 xfrm output crash when a packet is rerouted
by netfilter to not use IPsec.

2) vti4 counts some header lengths twice leading to an incorrect
device mtu. Fix this by counting these headers only once.

3) We don't catch the case where an unsupported protocol is submitted
to the xfrm protocol handlers; this can lead to NULL pointer
dereferences. Fix this by adding the appropriate checks.

4) vti6 may unregister pernet ops twice on init errors.
Fix this by removing one of the calls to do it only once.
From Mathias Krause.

5) Set the vti tunnel mark before doing a lookup in the error
handlers. Otherwise we don't find the correct xfrm state.
====================

The conflict in ip_vti.c was simple: 'net' had a commit
removing a line from vti_tunnel_init(), and the tree
being merged had a commit adding a line at the same
location.

Signed-off-by: David S. Miller

David S. Miller
2014-05-16 11:23:48 +0800
200b916f3 rtnetlink: wait for unregistering devices in rtnl_link_unregister()

From: Cong Wang

Commit 50624c934db18ab90 (net: Delay default_device_exit_batch until no
devices are unregistering) introduced rtnl_lock_unregistering() for
default_device_exit_batch(). The same race can happen when we rmmod a
driver that calls rtnl_link_unregister(), as we call dev->destructor
without the rtnl lock.

In the long term, I think we should clean up the mess of netdev_run_todo()
and the net namespace exit code.

Cc: Eric W. Biederman
Cc: David S. Miller
Signed-off-by: Cong Wang
Signed-off-by: Cong Wang
Signed-off-by: David S. Miller

Cong Wang
2014-05-16 03:30:33 +0800
cc2f33860 batman-adv: fix local TT check for outgoing arp requests in DAT

Change introduced by 88e48d7b3340ef07b108eb8a8b3813dd093cc7f7
("batman-adv: make DAT drop ARP requests targeting local clients")
implements a check that prevents DAT from using the caching
mechanism when the client that is supposed to provide a reply
to an arp request is local.

However, the change brought by be1db4f6615b5e6156c807ea8985171c215c2d57
("batman-adv: make the Distributed ARP Table vlan aware")
did not convert the above check into its vlan aware version,
thus making it useless when the local client is behind a vlan.

Fix the behaviour by properly specifying the vlan when
checking for a client being local or not.

Reported-by: Simon Wunderlich
Signed-off-by: Antonio Quartulli
Signed-off-by: Marek Lindner

Antonio Quartulli
2014-05-16 02:23:47 +0800
377fe0f96 batman-adv: increase orig refcount when storing ref in gw_node

A pointer to the orig_node representing a bat-gateway is
stored in the gw_node->orig_node member, but the refcount
for such orig_node is never increased.
This leads to memory faults when gw_node->orig_node is accessed
and the originator has already been freed.

Fix this by increasing the refcount on gw_node creation
and decreasing it on gw_node free.
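
The rule being applied is that a stored pointer must own a reference. A
userspace sketch, with hypothetical toy_* names and a plain int counter in
place of the kernel's atomic refcount, could look like this:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the originator and gateway structures. */
struct toy_orig_node {
    int refcount;
};

struct toy_gw_node {
    struct toy_orig_node *orig_node;
};

/* Drop one reference; free the originator when the last one goes away. */
static void toy_orig_put(struct toy_orig_node *orig)
{
    assert(orig->refcount > 0);
    if (--orig->refcount == 0)
        free(orig);
}

/* Creating a gw_node stores a pointer, so it takes its own reference. */
static struct toy_gw_node *toy_gw_node_new(struct toy_orig_node *orig)
{
    struct toy_gw_node *gw = malloc(sizeof(*gw));

    if (!gw)
        return NULL;
    orig->refcount++;
    gw->orig_node = orig;
    return gw;
}

/* Freeing the gw_node gives that reference back. */
static void toy_gw_node_free(struct toy_gw_node *gw)
{
    toy_orig_put(gw->orig_node);
    free(gw);
}
```

With the increment in place, the originator stays valid for as long as the
gw_node points at it, even after every other holder has dropped its own
reference.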

Signed-off-by: Antonio Quartulli
Signed-off-by: Marek Lindner

Antonio Quartulli
2014-05-16 02:03:17 +0800
be181015a batman-adv: fix reference counting imbalance while sending fragment

In the new fragmentation code the batadv_frag_send_packet()
function obtains a reference to the primary_if, but it does
not release it upon return.

This reference imbalance prevents the primary_if (and thus
the related netdevice) from being properly released on shutdown.

Fix this by releasing the primary_if in batadv_frag_send_packet().
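
One common way to keep such a reference balanced is a single exit label that
releases whatever was acquired. The sketch below is a userspace illustration
with hypothetical toy_* names and a made-up size limit; it is not the actual
batadv_frag_send_packet() code.

```c
#include <assert.h>
#include <stddef.h>

struct toy_hard_iface {
    int refcount;
};

/* Take a reference on the interface, if there is one. */
static struct toy_hard_iface *toy_primary_if_get(struct toy_hard_iface *iface)
{
    if (iface)
        iface->refcount++;
    return iface;
}

/* Drop a reference (a real implementation would free at zero). */
static void toy_hardif_put(struct toy_hard_iface *iface)
{
    assert(iface->refcount > 0);
    iface->refcount--;
}

/* Returns 1 if the packet would be fragmented and sent, 0 otherwise. */
static int toy_frag_send(struct toy_hard_iface *candidate,
                         size_t len, size_t mtu)
{
    struct toy_hard_iface *primary_if;
    int ret = 0;

    primary_if = toy_primary_if_get(candidate);
    if (!primary_if)
        goto out;

    if (mtu == 0 || len > 16 * mtu)  /* hypothetical fragment limit */
        goto out;

    ret = 1;  /* fragments would be built and queued here */

out:
    /* Every return path passes through here, so the count stays balanced. */
    if (primary_if)
        toy_hardif_put(primary_if);
    return ret;
}
```

Whether the call succeeds or bails out early, the interface's reference count
is the same after the call as before it.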

Introduced by ee75ed88879af88558818a5c6609d85f60ff0df4
("batman-adv: Fragment and send skbs larger than mtu")

Cc: Martin Hundebøll
Signed-off-by: Antonio Quartulli
Signed-off-by: Marek Lindner
Acked-by: Martin Hundebøll

Antonio Quartulli
2014-05-16 02:03:17 +0800
16a414236 batman-adv: fix indirect hard_iface NULL dereference

If hard_iface is NULL and the code jumps to the out label,
batadv_hardif_free_ref() dereferences it to reach the refcount
without first checking for NULL.

Introduced in cb1c92ec37fb70543d133a1fa7d9b54d6f8a1ecd
("batman-adv: add debugfs support to view multiif tables").

Reported-by: Sven Eckelmann
Signed-off-by: Marek Lindner
Acked-by: Antonio Quartulli
Signed-off-by: Antonio Quartulli

Marek Lindner
2014-05-16 02:03:16 +0800