Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

15 Jan, 2015

3 commits

a6391a924 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Don't use uninitialized data in IPVS, from Dan Carpenter.

2) conntrack race fixes from Pablo Neira Ayuso.

3) Fix TX hangs with i40e, from Jesse Brandeburg.

4) Fix budget return from poll calls in dnet and alx, from Eric
Dumazet.

5) Fix bogus "if (unlikely(x) < 0)" test in AF_PACKET, from Christoph
Jaeger.

6) Fix bug introduced by conversion to list_head in TIPC retransmit
code, from Jon Paul Maloy.

7) Don't use GFP_NOIO under spinlock in USB kaweth driver, from Alexey
Khoroshilov.

8) Fix bridge build with INET disabled, from Arnd Bergmann.

9) Fix netlink array overrun for PROBE attributes in openvswitch, from
Thomas Graf.

10) Don't hold spinlock across synchronize_irq() in tg3 driver, from
Prashant Sreedharan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
tg3: Release tp->lock before invoking synchronize_irq()
tg3: tg3_reset_task() needs to use rtnl_lock to synchronize
tg3: tg3_timer() should grab tp->lock before checking for tp->irq_sync
team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin
openvswitch: packet messages need their own probe attribtue
i40e: adds FCoE configure option
cxgb4vf: Fix queue allocation for 40G adapter
netdevice: Add missing parentheses in macro
bridge: only provide proxy ARP when CONFIG_INET is enabled
neighbour: fix base_reachable_time(_ms) not effective immediatly when changed
net: fec: fix MDIO bus assignement for dual fec SoC's
xen-netfront: use different locks for Rx and Tx stats
drivers: net: cpsw: fix multicast flush in dual emac mode
cxgb4vf: Initialize mdio_addr before using it
net: Corrected the comment describing the ndo operations to reflect the actual prototype for couple of operations
usb/kaweth: use GFP_ATOMIC under spin_lock in usb_start_wait_urb()
MAINTAINERS: add me as ibmveth maintainer
tipc: fix bug in broadcast retransmit code
update ip-sysctl.txt documentation (v2)
net/at91_ether: prepare and unprepare clock
...

Linus Torvalds
2015-01-15 06:17:37 +0800
1ba398041 openvswitch: packet messages need their own probe attribtue

User space is currently sending a OVS_FLOW_ATTR_PROBE for both flow
and packet messages. This leads to an out-of-bounds access in
ovs_packet_cmd_execute() because OVS_FLOW_ATTR_PROBE >
OVS_PACKET_ATTR_MAX.

Introduce a new OVS_PACKET_ATTR_PROBE with the same numeric value
as OVS_FLOW_ATTR_PROBE to grow the range of accepted packet attributes
while remaining binary compatible with existing OVS binaries.
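
A minimal sketch of the bounds problem described above, assuming a parser that stores attributes in a table with max + 1 slots indexed by attribute type. The DEMO_* names and their numeric values are hypothetical and chosen only to show the relationship; they are not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical values: the flow attribute is numerically larger than the
 * old packet attribute maximum, which is the situation the commit describes. */
enum { DEMO_PACKET_ATTR_MAX_OLD = 4, DEMO_FLOW_ATTR_PROBE = 5 };

/* The fix: a packet attribute with the same number, which raises the maximum. */
enum {
    DEMO_PACKET_ATTR_PROBE = DEMO_FLOW_ATTR_PROBE,
    DEMO_PACKET_ATTR_MAX_NEW = DEMO_PACKET_ATTR_PROBE
};

/* A table with max + 1 slots can only be indexed by types 0..max. */
static bool attr_index_in_bounds(int type, int max)
{
    return type >= 0 && type <= max;
}
```

With the old maximum the probe attribute indexes past the end of the table; with the new maximum the same numeric value is in range, so existing binaries keep working.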

Fixes: 05da589 ("openvswitch: Add support for OVS_FLOW_ATTR_PROBE.")
Reported-by: Sander Eikelenboom
Tracked-down-by: Florian Westphal
Signed-off-by: Thomas Graf
Reviewed-by: Jesse Gross
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Thomas Graf
2015-01-15 05:49:44 +0800
d92cfdbbe bridge: only provide proxy ARP when CONFIG_INET is enabled

When IPV4 support is disabled, we cannot call arp_send from
the bridge code, which would result in a kernel link error:

net/built-in.o: In function `br_handle_frame_finish':
:(.text+0x59914): undefined reference to `arp_send'
:(.text+0x59a50): undefined reference to `arp_tbl'

This makes the newly added proxy ARP support in the bridge
code depend on the CONFIG_INET symbol and lets the compiler
optimize the code out to avoid the link error.

Signed-off-by: Arnd Bergmann
Fixes: 958501163ddd ("bridge: Add support for IEEE 802.11 Proxy ARP")
Cc: Kyeyoon Park
Signed-off-by: David S. Miller

Arnd Bergmann
2015-01-15 04:08:02 +0800

14 Jan, 2015

1 commit

4bf6980dd neighbour: fix base_reachable_time(_ms) not effective immediatly when changed

When setting base_reachable_time or base_reachable_time_ms on a
specific interface through sysctl or netlink, the reachable_time
value is not updated.

This means that neighbour entries will continue to be updated using the
old value until it is recomputed in neigh_period_work (which
recomputes the value every 300*HZ).
On systems with HZ equal to 1000 for instance, it means 5mins before
the change is effective.

This patch changes this behavior by recomputing reachable_time after
each set on base_reachable_time or base_reachable_time_ms.
The new value will become effective the next time the neighbour's timer
is triggered.
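
A rough userspace sketch of the behavior described above, assuming reachable_time is derived from base_reachable_time as a randomized value between half and one and a half times the base. The struct and function names are hypothetical, and rand() stands in for the kernel's random source.

```c
#include <stdlib.h>

/* Randomize around the base: the result lies in [base/2, base/2 + base). */
static unsigned long rand_reach_time(unsigned long base)
{
    return base ? ((unsigned long)rand() % base) + (base >> 1) : 0;
}

/* Hypothetical per-interface neighbour parameters. */
struct demo_parms {
    unsigned long base_reachable_time;
    unsigned long reachable_time;
};

/* Setter that refreshes the derived value at once, instead of leaving it
 * stale until the next periodic recomputation. */
static void set_base_reachable_time(struct demo_parms *p, unsigned long base)
{
    p->base_reachable_time = base;
    p->reachable_time = rand_reach_time(base);
}
```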

Changes are made in two places: the netlink code for set and the sysctl
handling code. For sysctl, I use a proc_handler. The ipv6 network
code does provide its own handler but it already refreshes
reachable_time correctly so it's not an issue.
Any other user of neighbour which provides its own handlers must
refresh reachable_time.

Signed-off-by: Jean-Francois Remy
Signed-off-by: David S. Miller

Jean-Francois Remy
2015-01-14 13:28:00 +0800

13 Jan, 2015

1 commit

164167794 tipc: fix bug in broadcast retransmit code

In commit 58dc55f25631178ee74cd27185956a8f7dcb3e32 ("tipc: use generic
SKB list APIs to manage link transmission queue") we replace all list
traversal loops with the macros skb_queue_walk() or
skb_queue_walk_safe(). While the previous loops were based on the
assumption that the list was NULL-terminated, the standard macros
stop when the iterator reaches the list head, which is non-NULL.

In the function bclink_retransmit_pkt() this macro replacement has
led to a bug. When we receive a BCAST STATE_MSG we unconditionally
call the function bclink_retransmit_pkt(), whether there really is
anything to retransmit or not, assuming that the sequence number
comparisons will lead to the correct behavior. However, if the
transmission queue is empty, or if there are no eligible buffers in
the transmission queue, we will by mistake pass the list head pointer
to the function tipc_link_retransmit(). Since the list head is not a
valid sk_buff, this leads to a crash.

In this commit we fix this by only calling tipc_link_retransmit()
if we actually found eligible buffers in the transmission queue.
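
The difference between a NULL-terminated walk and a head-terminated walk can be sketched as follows. This is a simplified model, not the TIPC code: the node type, the singly linked circular queue, and the plain ">=" sequence comparison are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical queue node. The head is a sentinel, not a real packet. */
struct node {
    struct node *next;
    unsigned int seqno;
};

/* An empty circular queue: the head points at itself. */
static void queue_init(struct node *head)
{
    head->next = head;
    head->seqno = 0;
}

/* Append n at the tail of the circular queue. */
static void queue_add_tail(struct node *head, struct node *n)
{
    struct node *p = head;

    while (p->next != head)
        p = p->next;
    p->next = n;
    n->next = head;
}

/* Return the first node with seqno >= from, or NULL when the walk comes
 * back to the head. Handing the head itself to a consumer that expects a
 * packet is the kind of mistake the commit describes. */
static struct node *first_eligible(struct node *head, unsigned int from)
{
    struct node *p;

    for (p = head->next; p != head; p = p->next)
        if (p->seqno >= from)
            return p;
    return NULL;
}
```

A caller would retransmit only when first_eligible() returns a non-NULL node.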

Reviewed-by: Ying Xue
Signed-off-by: Jon Maloy
Signed-off-by: David S. Miller

Jon Paul Maloy
2015-01-13 05:01:59 +0800

12 Jan, 2015

2 commits

2bd822180 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf

Pablo Neira Ayuso says:

====================
netfilter/ipvs fixes for net

The following patchset contains netfilter/ipvs fixes, they are:

1) Small fix for the FTP helper in IPVS, a diff variable may be left
unset when CONFIG_IP_VS_IPV6 is set. Patch from Dan Carpenter.

2) Fix nf_tables port NAT in little endian archs, patch from leroy
christophe.

3) Fix race condition between conntrack confirmation and flush from
userspace. This is the second reincarnation to resolve this problem.

4) Make sure inner messages in the batch come with the nfnetlink header.

5) Relax strict check from nfnetlink_bind() that may break old userspace
applications using all 1s group mask.

6) Schedule removal of chains once no sets and rules refer to them in
the new nf_tables ruleset flush command. Reported by Asbjoern Sloth
Toennesen.

Note that this batch comes later than usual because of the short
winter holidays.
====================

Signed-off-by: David S. Miller

David S. Miller
2015-01-12 13:14:49 +0800
46d2cfb19 packet: bail out of packet_snd() if L2 header creation fails

Due to a misplaced parenthesis, the expression

(unlikely(offset) < 0),

which expands to

(__builtin_expect(!!(offset), 0) < 0),

never evaluates to true. Therefore, when sending packets with
PF_PACKET/SOCK_DGRAM, packet_snd() does not abort as intended
if the creation of the layer 2 header fails.
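
The effect of the misplaced parenthesis can be reproduced in a few lines of userspace C. The helper names are hypothetical; the unlikely() macro is written the way the expansion above shows it, and requires GCC or Clang for __builtin_expect.

```c
#include <stdbool.h>

/* Branch hint as expanded in the commit message above. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Buggy form: the comparison applies to the 0/1 result of unlikely(),
 * which is never negative. */
static bool header_failed_buggy(int offset)
{
    return unlikely(offset) < 0;
}

/* Intended form: the comparison sits inside the hint. */
static bool header_failed_fixed(int offset)
{
    return unlikely(offset < 0);
}
```

For a negative offset (an error code), only the second helper reports a failure.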

Spotted by Coverity - CID 1259975 ("Operands don't affect result").

Fixes: 9c7077622dd9 ("packet: make packet_snd fail on len smaller than l2 header")
Signed-off-by: Christoph Jaeger
Acked-by: Eric Dumazet
Acked-by: Willem de Bruijn
Acked-by: Daniel Borkmann
Signed-off-by: David S. Miller

Christoph Jaeger
2015-01-12 10:54:03 +0800

10 Jan, 2015

2 commits

dc9319f5a Merge branch 'for-3.19' of git://linux-nfs.org/~bfields/linux

Pull two nfsd bugfixes from Bruce Fields.

* 'for-3.19' of git://linux-nfs.org/~bfields/linux:
rpc: fix xdr_truncate_encode to handle buffer ending on page boundary
nfsd: fix fi_delegees leak when fi_had_conflict returns true

Linus Torvalds
2015-01-10 10:10:48 +0800
20ebb3452 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client

Pull two Ceph fixes from Sage Weil:
"These are both pretty trivial: a sparse warning fix and size_t printk
thing"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: fix sparse endianness warnings
ceph: use %zu for len in ceph_fill_inline_data()

Linus Torvalds
2015-01-10 09:55:00 +0800

09 Jan, 2015

1 commit

d7d5a007b libceph: fix sparse endianness warnings

The only real issue is the one in auth_x.c and it came with
3.19-rc1 merge.

Signed-off-by: Ilya Dryomov

Ilya Dryomov
2015-01-09 01:36:57 +0800

08 Jan, 2015

1 commit

49a068f82 rpc: fix xdr_truncate_encode to handle buffer ending on page boundary

A struct xdr_stream at a page boundary might point to the end of one
page or the beginning of the next, but xdr_truncate_encode isn't
prepared to handle the former.

This can cause corruption of NFSv4 READDIR replies in the case that a
readdir entry that would have exceeded the client's dircount/maxcount
limit would have ended exactly on a 4k page boundary. You're more
likely to hit this case on large directories.

Other xdr_truncate_encode callers are probably also affected.

Reported-by: Holger Hoffstätte
Tested-by: Holger Hoffstätte
Fixes: 3e19ce762b53 "rpc: xdr_truncate_encode"
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields

J. Bruce Fields
2015-01-08 03:03:58 +0800

07 Jan, 2015

7 commits

bdec41963 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"Just a pile of random fixes, including:

1) Do not apply TSO limits to non-TSO packets, fix from Herbert Xu.

2) MDI{,X} eeprom check in e100 driver is reversed, from John W.
Linville.

3) Missing error return assignments in several ethernet drivers, from
Julia Lawall.

4) Altera TSE device doesn't come back up after ifconfig down/up
sequence, fix from Kostya Belezko.

5) Add more cases to the check for whether the qmi_wwan device has a
bogus MAC address and needs to be assigned a random one. From
Kristian Evensen.

6) Fix interrupt hangs in CPSW, from Felipe Balbi.

7) Implement ndo_features_check in r8152 so that the stack doesn't
feed GSO packets which are outside of the chip's capabilities.
From Hayes Wang"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
qla3xxx: don't allow never end busy loop
xen-netback: fixing the propagation of the transmit shaper timeout
r8152: support ndo_features_check
batman-adv: fix potential TT client + orig-node memory leak
batman-adv: fix multicast counter when purging originators
batman-adv: fix counter for multicast supporting nodes
batman-adv: fix lock class for decoding hash in network-coding.c
batman-adv: fix delayed foreign originator recognition
batman-adv: fix and simplify condition when bonding should be used
Revert "mac80211: Fix accounting of the tailroom-needed counter"
net: ethernet: cpsw: fix hangs with interrupts
enic: free all rq buffs when allocation fails
qmi_wwan: Set random MAC on devices with buggy fw
openvswitch: Consistently include VLAN header in flow and port stats.
tcp: Do not apply TSO segment limit to non-TSO packets
Altera TSE: Add missing phydev
net/mlx4_core: Fix error flow in mlx4_init_hca()
net/mlx4_core: Correcly update the mtt's offset in the MR re-reg flow
qlcnic: Fix return value in qlcnic_probe()
net: axienet: fix error return code
...

Linus Torvalds
2015-01-07 09:48:14 +0800
a2f18db0c netfilter: nf_tables: fix flush ruleset chain dependencies

Jumping between chains doesn't mix well with flush ruleset. Rules
from a different chain and set elements may still refer to us.

[ 353.373791] ------------[ cut here ]------------
[ 353.373845] kernel BUG at net/netfilter/nf_tables_api.c:1159!
[ 353.373896] invalid opcode: 0000 [#1] SMP
[ 353.373942] Modules linked in: intel_powerclamp uas iwldvm iwlwifi
[ 353.374017] CPU: 0 PID: 6445 Comm: 31c3.nft Not tainted 3.18.0 #98
[ 353.374069] Hardware name: LENOVO 5129CTO/5129CTO, BIOS 6QET47WW (1.17 ) 07/14/2010
[...]
[ 353.375018] Call Trace:
[ 353.375046] [] ? nf_tables_commit+0x381/0x540
[ 353.375101] [] nfnetlink_rcv+0x3d8/0x4b0
[ 353.375150] [] netlink_unicast+0x105/0x1a0
[ 353.375200] [] netlink_sendmsg+0x32e/0x790
[ 353.375253] [] sock_sendmsg+0x8e/0xc0
[ 353.375300] [] ? move_addr_to_kernel.part.20+0x19/0x70
[ 353.375357] [] ? move_addr_to_kernel+0x19/0x30
[ 353.375410] [] ? verify_iovec+0x42/0xd0
[ 353.375459] [] ___sys_sendmsg+0x3f0/0x400
[ 353.375510] [] ? native_sched_clock+0x2a/0x90
[ 353.375563] [] ? acct_account_cputime+0x17/0x20
[ 353.375616] [] ? account_user_time+0x88/0xa0
[ 353.375667] [] __sys_sendmsg+0x3d/0x80
[ 353.375719] [] ? int_check_syscall_exit_work+0x34/0x3d
[ 353.375776] [] SyS_sendmsg+0xd/0x20
[ 353.375823] [] system_call_fastpath+0x16/0x1b

Release objects in this order: rules -> sets -> chains -> tables, to
make sure no references to chains are held anymore.

Reported-by: Asbjoern Sloth Toennesen
Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2015-01-07 05:27:48 +0800
62924af24 netfilter: nfnetlink: relax strict multicast group check from netlink_bind

Relax the checking that was introduced in 97840cb ("netfilter:
nfnetlink: fix insufficient validation in nfnetlink_bind") when the
subscription bitmask is used. Existing userspace code may request
to listen to all of the existing netlink groups by setting an all-ones
subscription group bitmask. Netlink already validates subscription via
setsockopt() for us.

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2015-01-07 05:27:47 +0800
9ea2aa8b7 netfilter: nfnetlink: validate nfnetlink header from batch

Make sure there is enough room for the nfnetlink header in the
netlink messages that are part of the batch. There is a similar
check in netlink_rcv_skb().

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2015-01-07 05:27:46 +0800
8ca3f5e97 netfilter: conntrack: fix race between confirmation and flush

Commit 5195c14c8b27c ("netfilter: conntrack: fix race in
__nf_conntrack_confirm against get_next_corpse") aimed to resolve the
race condition between the confirmation (packet path) and the flush
command (from control plane). However, it introduced a crash when
several packets race to add a new conntrack, which seems easier to
reproduce when nf_queue is in place.

Fix this race, in __nf_conntrack_confirm(), by removing the CT
from unconfirmed list before checking the DYING bit. In case a
race occurred, re-add the CT to the dying list.

This patch also changes the verdict from NF_ACCEPT to NF_DROP when
we lose the race. Basically, the confirmation happens for the first packet
that we see in a flow. If you just invoked conntrack -F once (which
should be the common case), then this is likely to be the first packet
of the flow (unless you already called flush anytime soon in the past).
This should be hard to trigger, but better drop this packet, otherwise
we leave things in inconsistent state since the destination will likely
reply to this packet, but it will find no conntrack, unless the origin
retransmits.

The change of the verdict has been discussed in:
https://www.marc.info/?l=linux-netdev&m=141588039530056&w=2

Signed-off-by: Pablo Neira Ayuso

Pablo Neira Ayuso
2015-01-07 05:27:45 +0800
627d2cc01 Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Included changes:
- ensure bonding is used (if enabled) for packets coming in the soft
interface
- fix race condition to avoid orig_nodes to be deleted right after
being added
- avoid false positive lockdep splats by assigning lockclass to
the proper hashtable lock objects
- avoid miscounting of multicast 'disabled' nodes in the network
- fix memory leak in the Global Translation Table in case of
originator interval change

Signed-off-by: David S. Miller

David S. Miller
2015-01-07 03:24:49 +0800
15ecf7a06 Merge tag 'mac80211-for-davem-2015-01-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Here's just a single fix - a revert of a patch that broke the
p54 and cw2100 drivers (arguably due to bad assumptions there.)
Since this affects kernels since 3.17, I decided to revert for
now and we'll revisit this optimisation properly for -next.

Signed-off-by: David S. Miller <davem@davemloft.net>

David S. Miller
2015-01-07 02:29:27 +0800

06 Jan, 2015

6 commits

9d31b3ce8 batman-adv: fix potential TT client + orig-node memory leak

This patch fixes a potential memory leak which can occur once an
originator times out. On timeout the corresponding global translation
table entry might not get purged correctly. Furthermore, the non-purged
TT entry will cause its orig-node to leak, too. This can additionally
prevent the new multicast optimization feature from kicking in, because
the leak leaves a counter with a bogus value.

In detail: The batadv_tt_global_entry->orig_list holds the reference to
the orig-node. Usually this reference is released after
BATADV_PURGE_TIMEOUT through: _batadv_purge_orig()->
batadv_purge_orig_node()->batadv_update_route()->_batadv_update_route()->
batadv_tt_global_del_orig() which purges this global tt entry and
releases the reference to the orig-node.

However, if between two batadv_purge_orig_node() calls the orig-node
timeout grew to 2*BATADV_PURGE_TIMEOUT then this call path isn't
reached. Instead the according orig-node is removed from the
originator hash in _batadv_purge_orig(), the batadv_update_route()
part is skipped and won't be reached anymore.

Fixing the issue by moving batadv_tt_global_del_orig() out of the rcu
callback.

Signed-off-by: Linus Lüssing
Acked-by: Antonio Quartulli
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli

Linus Lüssing
2015-01-06 18:07:01 +0800
a5164886b batman-adv: fix multicast counter when purging originators

When purging an orig_node we should only decrease the counter tracking the
number of nodes without multicast optimizations support if it was
increased through this orig_node before.

A not yet quite initialized orig_node (meaning it did not have its turn
in the mcast-tvlv handler so far) which gets purged would not adhere to
this and will lead to a counter imbalance.

Fixing this by adding a check whether the orig_node is mcast-initialized
before decreasing the counter in the mcast-orig_node-purging routine.

Introduced by 60432d756cf06e597ef9da511402dd059b112447
("batman-adv: Announce new capability via multicast TVLV")

Reported-by: Tobias Hachmer
Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli

Linus Lüssing
2015-01-06 18:06:04 +0800
e8829f007 batman-adv: fix counter for multicast supporting nodes

A miscounting of nodes having multicast optimizations enabled can lead
to multicast packet loss in the following scenario:

If the first OGM a node receives from another one has no multicast
optimizations support (no multicast tvlv) then we fail to
increase the counter. This potentially leads to the wrong assumption
that we could safely use multicast optimizations.

Fix this by increasing the counter if the initial OGM has the
multicast TVLV unset, too.

Introduced by 60432d756cf06e597ef9da511402dd059b112447
("batman-adv: Announce new capability via multicast TVLV")

Reported-by: Tobias Hachmer
Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli

Linus Lüssing
2015-01-06 18:05:42 +0800
f44d54077 batman-adv: fix lock class for decoding hash in network-coding.c

batadv_hash_set_lock_class() is called with the wrong hash table as first
argument (probably due to a copy-paste error), which leads to false
positives when running with lockdep.

Introduced-by: 612d2b4fe0a1ff2f8389462a6f8be34e54124c05
("batman-adv: network coding - save overheard and tx packets for decoding")

Signed-off-by: Martin Hundebøll
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli

Martin Hundebøll
2015-01-06 18:05:12 +0800
2c667a339 batman-adv: fix delayed foreign originator recognition

Currently it can happen that the reception of an OGM from a new
originator is not being accepted. More precisely it can happen that
an originator struct gets allocated and initialized
(batadv_orig_node_new()), even the TQ gets calculated and set correctly
(batadv_iv_ogm_calc_tq()) but still the periodic orig_node purging
thread will decide to delete it if it has a chance to jump between
these two function calls.

This is because batadv_orig_node_new() initializes the last_seen value
to zero and its caller (batadv_iv_ogm_orig_get()) makes it visible to
other threads by adding it to the hash table already.
batadv_iv_ogm_calc_tq() will set the last_seen variable to the correct,
current time a few lines later but if the purging thread jumps in between
that it will think that the orig_node timed out and will wrongly
schedule it for deletion already.

If the purging interval is the same as the originator interval (which is
the default: 1 second), then this game can continue for several rounds
until the random OGM jitter added enough difference between these
two (in tests, two to about four rounds seemed common).

Fixing this by initializing the last_seen variable of an orig_node
to the current time before adding it to the hash table.
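
A small sketch of why a zero last_seen value looks like a timeout to the purging thread. The function name and the tick-based comparison are assumptions for illustration, not the batman-adv helpers.

```c
#include <stdbool.h>

/* Hypothetical purge check: a node has timed out when it was last seen
 * more than `timeout` ticks before `now`. */
static bool orig_timed_out(unsigned long now, unsigned long last_seen,
                           unsigned long timeout)
{
    return now - last_seen > timeout;
}
```

With last_seen left at zero, any realistic value of now exceeds the timeout, so a freshly allocated node is scheduled for deletion. Initializing last_seen to now before the node becomes visible makes the same check return false.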

Signed-off-by: Linus Lüssing
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli

Linus Lüssing
2015-01-06 18:05:09 +0800
329887ad1 batman-adv: fix and simplify condition when bonding should be used

The current condition actually does NOT consider bonding when the
interface the packet came in from is the soft interface, which is the
opposite of what it should do (and the comment describes). Fix that and
slightly simplify the condition.

Reported-by: Ray Gibson
Signed-off-by: Simon Wunderlich
Signed-off-by: Marek Lindner
Signed-off-by: Antonio Quartulli

Simon Wunderlich
2015-01-06 18:05:07 +0800

05 Jan, 2015

1 commit

1e359a5de Revert "mac80211: Fix accounting of the tailroom-needed counter"

This reverts commit ca34e3b5c808385b175650605faa29e71e91991b.

It turns out that the p54 and cw2100 drivers assume that there's
tailroom even when they don't say they really need it. However,
there's currently no way for them to explicitly say they do need
it, so for now revert this.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=90331.

Cc: stable@vger.kernel.org
Fixes: ca34e3b5c808 ("mac80211: Fix accounting of the tailroom-needed counter")
Reported-by: Christopher Chavez
Bisected-by: Larry Finger
Debugged-by: Christian Lamparter
Signed-off-by: Johannes Berg

Johannes Berg
2015-01-05 17:33:46 +0800

03 Jan, 2015

2 commits

24cc59d1e openvswitch: Consistently include VLAN header in flow and port stats.

Until now, when VLAN acceleration was in use, the bytes of the VLAN header
were not included in port or flow byte counters. They were however
included when VLAN acceleration was not used. This commit corrects the
inconsistency, by always including the VLAN header in byte counters.
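
The accounting rule can be sketched as below, assuming that an accelerated VLAN tag is carried out of band and therefore is not part of the packet length. The struct, field, and constant names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_VLAN_HLEN 4 /* size of an 802.1Q tag in bytes */

/* Hypothetical packet descriptor: with acceleration the tag is stored
 * separately and len does not include it. */
struct demo_pkt {
    size_t len;
    bool vlan_tag_out_of_band;
};

/* Length to add to byte counters: always count the VLAN header. */
static size_t stats_len(const struct demo_pkt *p)
{
    return p->len + (p->vlan_tag_out_of_band ? DEMO_VLAN_HLEN : 0);
}
```

The same frame then contributes the same number of bytes whether or not acceleration is in use.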

Previous discussion at
http://openvswitch.org/pipermail/dev/2014-December/049521.html

Reported-by: Motonori Shindo
Signed-off-by: Ben Pfaff
Reviewed-by: Flavio Leitner
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Ben Pfaff
2015-01-03 05:14:20 +0800
843925f33 tcp: Do not apply TSO segment limit to non-TSO packets

Thomas Jarosch reported IPsec TCP stalls when a PMTU event occurs.

In fact the problem was completely unrelated to IPsec. The bug is
also reproducible if you just disable TSO/GSO.

The problem is that when the MSS goes down, existing queued packets
on the TX queue that have not been transmitted yet all look like
TSO packets and get treated as such.

This then triggers a bug where tcp_mss_split_point tells us to
generate a zero-sized packet on the TX queue. Once that happens
we're screwed because the zero-sized packet can never be removed
by ACKs.

Fixes: 1485348d242 ("tcp: Apply device TSO segment limit earlier")
Reported-by: Thomas Jarosch
Signed-off-by: Herbert Xu

Cheers,
Signed-off-by: David S. Miller

Herbert Xu
2015-01-03 05:13:20 +0800

31 Dec, 2014

2 commits

831a39c24 Revert "cfg80211: make WEXT compatibility unselectable"

This reverts commit 24a0aa212ee2dbe44360288684478d76a8e20a0a.

It's causing severe userspace breakage. Namely, all the utilities from
wireless-utils which are relying on CONFIG_WEXT (which means tools like
'iwconfig', 'iwlist', etc) are not working anymore. There is a 'iw'
utility in newer wireless-tools, which is supposed to be a replacement
for all the "deprecated" binaries, but it's far away from being
massively adopted.

Please see [1] for example of the userspace breakage this is causing.

In addition to that, Larry Finger reports [2] that this patch also
makes the ipw2200 driver impossible to build.

To me this clearly shows that CONFIG_WEXT is far, far away from being
"deprecated enough" to be removed.

[1] http://thread.gmane.org/gmane.linux.kernel/1857010
[2] http://thread.gmane.org/gmane.linux.network/343688

Signed-off-by: Jiri Kosina
Signed-off-by: Linus Torvalds

Jiri Kosina
2014-12-31 08:42:29 +0800
2c90331cf Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

1) Fix double SKB free in bluetooth 6lowpan layer, from Jukka Rissanen.

2) Fix receive checksum handling in enic driver, from Govindarajulu
Varadarajan.

3) Fix NAPI poll list corruption in virtio_net and caif_virtio, from
Herbert Xu. Also, add code to detect drivers that have this mistake
in the future.

4) Fix doorbell endianness handling in mlx4 driver, from Amir Vadai.

5) Don't clobber IP6CB() before xfrm6_policy_check() is called in TCP
input path, from Nicolas Dichtel.

6) Fix MPLS action validation in openvswitch, from Pravin B Shelar.

7) Fix double SKB free in vxlan driver, also from Pravin.

8) When we scrub a packet, which happens when we are switching the
context of the packet (namespace, etc.), we should reset the
secmark. From Thomas Graf.

9) ->ndo_gso_check() needs to do more than return true/false, it also
has to allow the driver to clear netdev feature bits in order for
the caller to be able to proceed properly. From Jesse Gross.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
genetlink: A genl_bind() to an out-of-range multicast group should not WARN().
netlink/genetlink: pass network namespace to bind/unbind
ne2k-pci: Add pci_disable_device in error handling
bonding: change error message to debug message in __bond_release_one()
genetlink: pass multicast bind/unbind to families
netlink: call unbind when releasing socket
netlink: update listeners directly when removing socket
genetlink: pass only network namespace to genl_has_listeners()
netlink: rename netlink_unbind() to netlink_undo_bind()
net: Generalize ndo_gso_check to ndo_features_check
net: incorrect use of init_completion fixup
neigh: remove next ptr from struct neigh_table
net: xilinx: Remove unnecessary temac_property in the driver
net: phy: micrel: use generic config_init for KSZ8021/KSZ8031
net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding
openvswitch: fix odd_ptr_err.cocci warnings
Bluetooth: Fix accepting connections when not using mgmt
Bluetooth: Fix controller configuration with HCI_QUIRK_INVALID_BDADDR
brcmfmac: Do not crash if platform data is not populated
ipw2200: select CFG80211_WEXT
...

Linus Torvalds
2014-12-31 02:45:47 +0800

30 Dec, 2014

1 commit

dc97a1a94 genetlink: A genl_bind() to an out-of-range multicast group should not WARN().

Users can request to bind to arbitrary multicast groups, so warning
when the requested group number is out of range is not appropriate.

And with the warning removed, and the 'err' variable properly given
an initial value, we can remove 'found' altogether.

Reported-by: Sedat Dilek
Signed-off-by: David S. Miller

David S. Miller
2014-12-30 05:31:49 +0800

27 Dec, 2014

9 commits

023e2cfa3 netlink/genetlink: pass network namespace to bind/unbind

Netlink families can exist in multiple namespaces, and for the most
part multicast subscriptions are per network namespace. Thus it only
makes sense to have bind/unbind notifications per network namespace.

To achieve this, pass the network namespace of a given client socket
to the bind/unbind functions.

Also do this in generic netlink, and there also make sure that any
bind for multicast groups that only exist in init_net is rejected.
This isn't really a problem if it is accepted since a client in a
different namespace will never receive any notifications from such
a group, but it can confuse the family if not rejected (it's also
possible to silently (without telling the family) accept it, but it
would also have to be ignored on unbind so families that take any
kind of action on bind/unbind won't do unnecessary work for invalid
clients like that).

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2014-12-27 16:07:50 +0800
c380d9a7a genetlink: pass multicast bind/unbind to families

In order to make the newly fixed multicast bind/unbind
functionality usable in generic netlink, pass the calls down
to the appropriate family.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2014-12-27 15:20:23 +0800
7d68536be netlink: call unbind when releasing socket

Currently, netlink_unbind() is only called when the socket
explicitly unbinds, which limits its usefulness (luckily
there are no users of it yet anyway.)

Call netlink_unbind() also when a socket is released, so it
becomes possible to track listeners with this callback and
without also implementing a netlink notifier (and checking
netlink_has_listeners() in there.)

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2014-12-27 15:20:23 +0800
b10dcb3b9 netlink: update listeners directly when removing socket

The code is now confusing to read - first in one function down
(netlink_remove) any group subscriptions are implicitly removed
by calling __sk_del_bind_node(), but the subscriber database is
only updated far later by calling netlink_update_listeners().

Move the latter call to just after removal from the list so it
is easier to follow the code.

This also enables moving the locking inside the kernel-socket
conditional, which improves the normal socket destruction path.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2014-12-27 15:20:23 +0800
f8403a2e4 genetlink: pass only network namespace to genl_has_listeners()

There's no point in forcing the caller to know about the internal
genl_sock to use inside struct net, just have them pass the network
namespace. This doesn't really change code generation since it's
an inline, but makes the caller less magic - there's never any
reason to pass another socket.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2014-12-27 15:20:23 +0800
02c81ab95 netlink: rename netlink_unbind() to netlink_undo_bind()

The new name is more expressive - this isn't a generic unbind
function but rather only a little undo helper for use only in
netlink_bind().

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2014-12-27 15:20:23 +0800
eb46e2215 Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Johan Hedberg says:

====================
Here's one more bluetooth pull request for 3.19. We've got two fixes:

- Fix for accepting connections with old user space versions of BlueZ
- Fix for Bluetooth controllers that don't have a public address

Both of these are regressions that were introduced in 3.17, so the
appropriate Cc: stable annotations are provided.

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller

David S. Miller
2014-12-27 07:23:37 +0800
5f35227ea net: Generalize ndo_gso_check to ndo_features_check

GSO isn't the only offload feature with restrictions that
potentially can't be expressed with the current features mechanism.
Checksum is another, although it's a general issue that could in
theory apply to anything. Even if it may be possible to
implement these restrictions in other ways, doing so can result in
duplicate code or inefficient per-packet behavior.

This generalizes ndo_gso_check so that drivers can remove any
features that don't make sense for a given packet, similar to
netif_skb_features(). It also converts existing driver
restrictions to the new format, completing the work that was
done to support tunnel protocols since the issues apply to
checksums as well.

By actually removing features from the set that is used to do
offloading, it also solves another problem with the existing
interface: when a driver rejected a packet, GSO would still run
with the original set of features and not do anything, because it
appeared that segmentation was not required.
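
The idea of a per-packet features filter can be sketched in plain userspace C. This is only an illustration under assumptions: the `toy_*` types, the feature bits, and the 80-byte header limit are invented here and do not correspond to any driver's real restrictions or to the kernel's actual callback prototype.

```c
#include <stdint.h>

/* Hypothetical feature bits, standing in for netdev feature flags. */
typedef uint64_t toy_features_t;
#define TOY_F_HW_CSUM	(1ULL << 0)
#define TOY_F_GSO	(1ULL << 1)

/* Assumed hardware limit on inner header length for tunnel offloads. */
#define TOY_MAX_INNER_HDR 80u

/* Hypothetical packet description. */
struct toy_pkt {
	int is_gso;			/* packet is larger than the MTU */
	int encapsulated;		/* packet carries a tunnel header */
	unsigned int inner_hdr_len;	/* length of the encapsulated headers */
};

/*
 * Driver-side check: return the subset of features that the hardware
 * can really apply to this packet, instead of a yes/no GSO answer.
 */
static toy_features_t toy_features_check(const struct toy_pkt *pkt,
					 toy_features_t features)
{
	if (pkt->encapsulated && pkt->inner_hdr_len > TOY_MAX_INNER_HDR)
		return features & ~(TOY_F_HW_CSUM | TOY_F_GSO);
	return features;
}

/*
 * Stack-side decision: because the filtered set is what gets consulted,
 * a packet the hardware cannot segment is visibly in need of software
 * segmentation, rather than being passed on with the original features.
 */
static int toy_needs_sw_segmentation(const struct toy_pkt *pkt,
				     toy_features_t dev_features)
{
	toy_features_t features = toy_features_check(pkt, dev_features);

	return pkt->is_gso && !(features & TOY_F_GSO);
}
```

The design point the sketch tries to capture is that the driver edits the feature mask itself, so checksum and segmentation restrictions share one mechanism and later code sees a consistent view.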

CC: Tom Herbert
CC: Joe Stringer
CC: Eric Dumazet
CC: Hayes Wang
Signed-off-by: Jesse Gross
Acked-by: Tom Herbert
Fixes: 04ffcb255f22 ("net: Add ndo_gso_check")
Tested-by: Hayes Wang
Signed-off-by: David S. Miller

Jesse Gross
2014-12-27 06:20:56 +0800
2c26d34bb net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding

When using VXLAN tunnels and a sky2 device, I have experienced
checksum failures of the following type:

[ 4297.761899] eth0: hw csum failure
[...]
[ 4297.765223] Call Trace:
[ 4297.765224] dump_stack+0x46/0x58
[ 4297.765235] netdev_rx_csum_fault+0x42/0x50
[ 4297.765238] ? skb_push+0x40/0x40
[ 4297.765240] __skb_checksum_complete+0xbc/0xd0
[ 4297.765243] tcp_v4_rcv+0x2e2/0x950
[ 4297.765246] ? ip_rcv_finish+0x360/0x360

These are reliably reproduced in a network topology of:

container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -> switch

When VXLAN encapsulated traffic is received from a similarly
configured peer, the above warning is generated in the receive
processing of the encapsulated packet. Note that the warning is
associated with the container eth0.

The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
because the packet is an encapsulated Ethernet frame, the checksum
generated by the hardware includes the inner protocol and Ethernet
headers.

The receive code is careful to update the skb->csum, except in
__dev_forward_skb, as called by dev_forward_skb. __dev_forward_skb
calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
to skip over the Ethernet header, but does not update skb->csum when
doing so.

This patch resolves the problem by adding a call to
skb_postpull_rcsum to update the skb->csum after the call to
eth_type_trans.
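
To see why pulling the Ethernet header without adjusting the checksum produces a "hw csum failure", the arithmetic can be sketched in userspace C. This is a simplified model, not the kernel code: the `toy_*` and `*_sketch` names are hypothetical, and the checksum is kept as a folded 16-bit value, whereas the kernel keeps an unfolded 32-bit sum in skb->csum.

```c
#include <stddef.h>
#include <stdint.h>

/* One's complement sum over big-endian 16-bit words (short buffers only). */
static uint32_t csum_partial_sketch(const uint8_t *buf, size_t len, uint32_t sum)
{
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;
	return sum;
}

/* Fold the carries back in until the sum fits in 16 bits. */
static uint16_t csum_fold_sketch(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* One's complement subtraction: a - b is a + ~b. */
static uint16_t csum_sub_sketch(uint16_t a, uint16_t b)
{
	return csum_fold_sketch((uint32_t)a + (uint16_t)~b);
}

/* Toy packet buffer with a checksum covering data[0..len). */
struct toy_skb {
	const uint8_t *data;
	size_t len;
	uint16_t csum;
};

/* Skip over a header; on its own this leaves csum stale. */
static void toy_pull(struct toy_skb *skb, size_t hlen)
{
	skb->data += hlen;
	skb->len -= hlen;
}

/* Remove the pulled bytes' contribution from the stored checksum. */
static void toy_postpull_rcsum(struct toy_skb *skb, const uint8_t *start,
			       size_t hlen)
{
	uint16_t pulled = csum_fold_sketch(csum_partial_sketch(start, hlen, 0));

	skb->csum = csum_sub_sketch(skb->csum, pulled);
}
```

In this model, calling `toy_pull(&skb, 14)` alone leaves `skb.csum` still covering the 14 header bytes, which is the analogue of the stale value described above; following it with `toy_postpull_rcsum(&skb, frame_start, 14)` makes the stored checksum match the remaining data again.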

Signed-off-by: Jay Vosburgh
Signed-off-by: David S. Miller

Jay Vosburgh
2014-12-27 05:16:51 +0800

25 Dec, 2014

1 commit

4aa611881 openvswitch: fix odd_ptr_err.cocci warnings

net/openvswitch/vport-gre.c:188:5-11: inconsistent IS_ERR and PTR_ERR, PTR_ERR on line 189

PTR_ERR should access the value just tested by IS_ERR

Semantic patch information:
There can be false positives in the patch case, where it is the call
IS_ERR that is wrong.

Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci
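
For readers unfamiliar with the pattern this warning targets, here is a small userspace sketch. The `ERR_PTR`, `PTR_ERR` and `IS_ERR` definitions below are stand-ins written for this example that mimic the kernel helpers of the same names; `toy_lookup()` and the two `report_*` functions are hypothetical and are not the code in vport-gre.c.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error codes live in the last MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

/* Hypothetical lookup that returns either an object or an encoded errno. */
static void *toy_lookup(int fail)
{
	static int object;

	return fail ? ERR_PTR(-ENOMEM) : (void *)&object;
}

/* Flagged pattern: IS_ERR tests one pointer, PTR_ERR decodes another. */
static long report_inconsistent(void *a, void *b)
{
	if (IS_ERR(a))
		return PTR_ERR(b);	/* b may be a valid pointer, not an errno */
	return 0;
}

/* Consistent pattern: decode the same value that was just tested. */
static long report_consistent(void *a, void *b)
{
	(void)b;
	if (IS_ERR(a))
		return PTR_ERR(a);
	return 0;
}
```

With `a = toy_lookup(1)` and `b = toy_lookup(0)`, `report_consistent()` returns `-ENOMEM`, while `report_inconsistent()` returns the address of a valid object reinterpreted as a number, which is meaningless as an error code. As the semantic patch note says, the checker cannot tell which of the two calls is the wrong one, so each hit still needs a human look.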

CC: Pravin B Shelar
Signed-off-by: Fengguang Wu
Acked-by: Pravin B Shelar
Signed-off-by: David S. Miller

Wu Fengguang
2014-12-25 04:18:09 +0800